Test 0582: Create QFQ with default setting
Test c9a3: Create QFQ with class weight setting
Test 8452: Create QFQ with class maxpkt setting
Test d920: Create QFQ with multiple class setting
Test 0548: Delete QFQ with handle
Test 5901: Show QFQ class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
selftests/tc-testing: add selftests for netem qdisc
Test cb28: Create NETEM with default setting
Test a089: Create NETEM with limit flag
Test 3449: Create NETEM with delay time
Test 3782: Create NETEM with distribution and corrupt flag
Test 2b82: Create NETEM with distribution and duplicate flag
Test a932: Create NETEM with distribution and loss flag
Test e01a: Create NETEM with distribution and loss state flag
Test ba29: Create NETEM with loss gemodel flag
Test 0492: Create NETEM with reorder flag
Test 7862: Create NETEM with rate limit
Test 7235: Create NETEM with multiple slot rate
Test 5439: Create NETEM with multiple slot setting
Test 5029: Change NETEM with loss state
Test 3785: Replace NETEM with delay time
Test 4502: Delete NETEM with handle
Test 0785: Show NETEM class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
selftests/tc-testing: add selftests for multiq qdisc
Test 20ba: Add multiq Qdisc to multi-queue device (8 queues)
Test 4301: List multiq Class
Test 7832: Delete nonexistent multiq Qdisc
Test 2891: Delete multiq Qdisc twice
Test 1329: Add multiq Qdisc to single-queue device
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
selftests/tc-testing: add selftests for mqprio qdisc
Test 9903: Add mqprio Qdisc to multi-queue device (8 queues)
Test 453a: Delete nonexistent mqprio Qdisc
Test 5292: Delete mqprio Qdisc twice
Test 45a9: Add mqprio Qdisc to single-queue device
Test 2ba9: Show mqprio class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Test 0904: Create HTB with default setting
Test 3906: Create HTB with default-N setting
Test 8492: Create HTB with r2q setting
Test 9502: Create HTB with direct_qlen setting
Test b924: Create HTB with class rate and burst setting
Test 4359: Create HTB with class mpu setting
Test 9048: Create HTB with class prio setting
Test 4994: Create HTB with class ceil setting
Test 9523: Create HTB with class cburst setting
Test 5353: Create HTB with class mtu setting
Test 346a: Create HTB with class quantum setting
Test 303a: Delete HTB with handle
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
selftests/tc-testing: add selftests for hfsc qdisc
Test 3254: Create HFSC with default setting
Test 0289: Create HFSC with class sc and ul rate setting
Test 846a: Create HFSC with class sc umax and dmax setting
Test 5413: Create HFSC with class rt and ls rate setting
Test 9312: Create HFSC with class rt umax and dmax setting
Test 6931: Delete HFSC with handle
Test 8436: Show HFSC class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
selftests/tc-testing: add selftests for fq_codel qdisc
Test 4957: Create FQ_CODEL with default setting
Test 7621: Create FQ_CODEL with limit setting
Test 6871: Create FQ_CODEL with memory_limit setting
Test 5636: Create FQ_CODEL with target setting
Test 630a: Create FQ_CODEL with interval setting
Test 4324: Create FQ_CODEL with quantum setting
Test b190: Create FQ_CODEL with noecn flag
Test 5381: Create FQ_CODEL with ce_threshold setting
Test c9d2: Create FQ_CODEL with drop_batch setting
Test 523b: Create FQ_CODEL with multiple setting
Test 9283: Replace FQ_CODEL with noecn setting
Test 3459: Change FQ_CODEL with limit setting
Test 0128: Delete FQ_CODEL with handle
Test 0435: Show FQ_CODEL class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
selftests/tc-testing: add selftests for dsmark qdisc
Test 6345: Create DSMARK with default setting
Test 3462: Create DSMARK with default_index setting
Test ca95: Create DSMARK with set_tc_index flag
Test a950: Create DSMARK with multiple setting
Test 4092: Delete DSMARK with handle
Test 5930: Show DSMARK class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Test 1820: Create CBS with default setting
Test 1532: Create CBS with hicredit setting
Test 2078: Create CBS with locredit setting
Test 9271: Create CBS with sendslope setting
Test 0482: Create CBS with idleslope setting
Test e8f3: Create CBS with multiple setting
Test 23c9: Replace CBS with sendslope setting
Test a07a: Change CBS with idleslope setting
Test 43b3: Delete CBS with handle
Test 9472: Show CBS class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Test 3460: Create CBQ with default setting
Test 0592: Create CBQ with mpu
Test 4684: Create CBQ with valid cell num
Test 4345: Create CBQ with invalid cell num
Test 4525: Create CBQ with valid ewma
Test 6784: Create CBQ with invalid ewma
Test 5468: Delete CBQ with handle
Test 492a: Show CBQ class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
selftests/tc-testing: add selftests for cake qdisc
Test 1212: Create CAKE with default setting
Test 3281: Create CAKE with bandwidth limit
Test c940: Create CAKE with autorate-ingress flag
Test 2310: Create CAKE with rtt time
Test 2385: Create CAKE with besteffort flag
Test a032: Create CAKE with diffserv8 flag
Test 2349: Create CAKE with diffserv4 flag
Test 8472: Create CAKE with flowblind flag
Test 2341: Create CAKE with dsthost and nat flag
Test 5134: Create CAKE with wash flag
Test 2302: Create CAKE with flowblind and no-split-gso flag
Test 0768: Create CAKE with dual-srchost and ack-filter flag
Test 0238: Create CAKE with dual-dsthost and ack-filter-aggressive flag
Test 6572: Create CAKE with memlimit and ptm flag
Test 2436: Create CAKE with fwmark and atm flag
Test 3984: Create CAKE with overhead and mpu
Test 5421: Create CAKE with conservative and ingress flag
Test 6854: Delete CAKE with conservative and ingress flag
Test 2342: Replace CAKE with mpu
Test 2313: Change CAKE with mpu
Test 4365: Show CAKE class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net/sched: sch_api: add helper for tc qdisc walker stats dump
The walk implementation of most qdisc class modules is basically the
same. That is, the values of count and skip are checked first. If
count is greater than or equal to skip, the registered fn function is
executed. Otherwise, increase the value of count. So we can reconstruct
them.
The 3 functions that want access to the taprio_list:
taprio_dev_notifier(), taprio_destroy() and taprio_init() are all called
with the rtnl_mutex held, therefore implicitly serialized with respect
to each other. A spin lock serves no purpose.
====================
Support 256 bit TLS keys with device offload
This series adds support for 256 bit TLS keys with device offload, and a
cleanup patch to remove repeating code:
- Patches #1-2 add cipher sizes descriptors which allow reducing the
amount of code duplications.
- Patch #3 allows 256 bit keys to be TX offloaded in the tls module (RX
already supported).
- Patch #4 adds 256 bit keys support to the mlx5 driver.
====================
Introduce cipher sizes descriptor. It helps reducing the amount of code
duplications and repeated switch/cases that assigns the proper sizes
according to the cipher type.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Any change to the hardware timestamps configuration triggers nic restart,
which breaks transmition and reception of network packets for a while.
But there is no need to fully restart the device because while configuring
hardware timestamps. The code for changing configuration runs after all
of the initialisation, when the NIC is actually up and running. This patch
changes the code that ioctl will only update configuration registers and
will not trigger carrier status change, but in case of timestamps for
all rx packetes it fallbacks to close()/open() sequnce because of
synchronization issues in the hardware. Tested on BCM57504.
Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20220922191038.29921-1-vfedorenko@novek.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/net/ethernet/freescale/fec.h 8aaa8ff39e9a ("Revert "fec: Restart PPS after link state change"") ad168237a332 ("net: fec: add stop mode support for imx8 platform")
https://lore.kernel.org/all/20220921105337.62b41047@canb.auug.org.au/
drivers/pinctrl/pinctrl-ocelot.c 4ebab0460deb ("pinctrl: ocelot: Fix interrupt controller") e2e1adddecf7 ("pinctrl: ocelot: add ability to be used in a non-mmio configuration")
https://lore.kernel.org/all/20220921110032.7cd28114@canb.auug.org.au/
tools/testing/selftests/drivers/net/bonding/Makefile b1c6b30eafd3 ("net: Add tests for bonding and team address list management") abd12753fab7 ("selftests/bonding: add a test for bonding lladdr target")
https://lore.kernel.org/all/20220921110437.5b7dbd82@canb.auug.org.au/
Merge tag 'net-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from wifi, netfilter and can.
A handful of awaited fixes here - revert of the FEC changes, bluetooth
fix, fixes for iwlwifi spew.
We added a warning in PHY/MDIO code which is triggering on a couple of
platforms in a false-positive-ish way. If we can't iron that out over
the week we'll drop it and re-add for 6.1.
I've added a new "follow up fixes" section for fixes to fixes in
6.0-rcs but it may actually give the false impression that those are
problematic or that more testing time would have caught them. So
likely a one time thing.
- ebtables: fix memory leak when blob is malformed
- nf_ct_ftp: fix deadlock when nat rewrite is needed
Current release - regressions:
- Revert "fec: Restart PPS after link state change" and the related
"net: fec: Use a spinlock to guard `fep->ptp_clk_on`"
- Bluetooth: fix HCIGETDEVINFO regression
- wifi: mt76: fix 5 GHz connection regression on mt76x0/mt76x2
- mptcp: fix fwd memory accounting on coalesce
- rwlock removal fall out:
- ipmr: always call ip{,6}_mr_forward() from RCU read-side
critical section
- ipv6: fix crash when IPv6 is administratively disabled
- tcp: read multiple skbs in tcp_read_skb()
- mdio_bus_phy_resume state warning fallout:
- eth: ravb: fix PHY state warning splat during system resume
- eth: sh_eth: fix PHY state warning splat during system resume
Current release - new code bugs:
- wifi: iwlwifi: don't spam logs with NSS>2 messages
- eth: mtk_eth_soc: enable XDP support just for MT7986 SoC
Previous releases - regressions:
- bonding: fix NULL deref in bond_rr_gen_slave_id
- wifi: iwlwifi: mark IWLMEI as broken
Previous releases - always broken:
- nf_conntrack helpers:
- irc: tighten matching on DCC message
- sip: fix ct_sip_walk_headers
- osf: fix possible bogus match in nf_osf_find()
- ipvlan: fix out-of-bound bugs caused by unset skb->mac_header
- core: fix flow symmetric hash
- bonding, team: unsync device addresses on ndo_stop
- phy: micrel: fix shared interrupt on LAN8814"
* tag 'net-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
selftests: forwarding: add shebang for sch_red.sh
bnxt: prevent skb UAF after handing over to PTP worker
net: marvell: Fix refcounting bugs in prestera_port_sfp_bind()
net: sched: fix possible refcount leak in tc_new_tfilter()
net: sunhme: Fix packet reception for len < RX_COPY_THRESHOLD
udp: Use WARN_ON_ONCE() in udp_read_skb()
selftests: bonding: cause oops in bond_rr_gen_slave_id
bonding: fix NULL deref in bond_rr_gen_slave_id
net: phy: micrel: fix shared interrupt on LAN8814
net/smc: Stop the CLC flow if no link to map buffers on
ice: Fix ice_xdp_xmit() when XDP TX queue number is not sufficient
net: atlantic: fix potential memory leak in aq_ndev_close()
can: gs_usb: gs_usb_set_phys_id(): return with error if identify is not supported
can: gs_usb: gs_can_open(): fix race dev->can.state condition
can: flexcan: flexcan_mailbox_read() fix return value for drop = true
net: sh_eth: Fix PHY state warning splat during system resume
net: ravb: Fix PHY state warning splat during system resume
netfilter: nf_ct_ftp: fix deadlock when nat rewrite is needed
netfilter: ebtables: fix memory leak when blob is malformed
netfilter: nf_tables: fix percpu memory leak at nf_tables_addchain()
...
Merge tag 'efi-urgent-for-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:
- Use the right variable to check for shim insecure mode
- Wipe setup_data field when booting via EFI
- Add missing error check to efibc driver
* tag 'efi-urgent-for-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi: libstub: check Shim mode using MokSBStateRT
efi: x86: Wipe setup_data on pure EFI boot
efi: efibc: Guard against allocation failure
Merge tag 'perf-tools-fixes-for-v6.0-2022-09-21' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix polling of system-wide events related to mixing per-cpu and
per-thread events.
- Do not check if /proc/modules is unchanged when copying /proc/kcore,
that doesn't get in the way of post processing analysis.
- Include program header in ELF files generated for JIT files, so that
they can be opened by tools using elfutils libraries.
- Enter namespaces when synthesizing build-ids.
- Fix some bugs related to a recent cpu_map overhaul where we should be
using an index and not the cpu number.
- Fix BPF program ELF section name, using the naming expected by libbpf
when using BPF counters in 'perf stat'.
- Add a new test for perf stat cgroup BPF counter.
- Adjust check on 'perf test wp' for older kernels, where the
PERF_EVENT_IOC_MODIFY_ATTRIBUTES ioctl isn't supported.
- Sync x86 cpufeatures with the kernel sources, no changes in tooling.
* tag 'perf-tools-fixes-for-v6.0-2022-09-21' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf tools: Honor namespace when synthesizing build-ids
tools headers cpufeatures: Sync with the kernel sources
perf kcore_copy: Do not check /proc/modules is unchanged
libperf evlist: Fix polling of system-wide events
perf record: Fix cpu mask bit setting for mixed mmaps
perf test: Skip wp modify test on old kernels
perf jit: Include program header in ELF files
perf test: Add a new test for perf stat cgroup BPF counter
perf stat: Use evsel->core.cpus to iterate cpus in BPF cgroup counters
perf stat: Fix cpu map index in bperf cgroup code
perf stat: Fix BPF program section name
Hangbin Liu [Thu, 22 Sep 2022 02:44:53 +0000 (10:44 +0800)]
selftests: forwarding: add shebang for sch_red.sh
RHEL/Fedora RPM build checks are stricter, and complain when executable
files don't have a shebang line, e.g.
*** WARNING: ./kselftests/net/forwarding/sch_red.sh is executable but has no shebang, removing executable bit
Fix it by adding shebang line.
Fixes: 326d55eb37a8 ("selftests: forwarding: Add a RED test for SW datapath") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/20220922024453.437757-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 21 Sep 2022 20:10:05 +0000 (13:10 -0700)]
bnxt: prevent skb UAF after handing over to PTP worker
When reading the timestamp is required bnxt_tx_int() hands
over the ownership of the completed skb to the PTP worker.
The skb should not be used afterwards, as the worker may
run before the rest of our code and free the skb, leading
to a use-after-free.
Since dev_kfree_skb_any() accepts NULL make the loss of
ownership more obvious and set skb to NULL.
Fixes: bdd9258cb478 ("bnxt_en: Transmit and retrieve packet timestamps") Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20220921201005.335390-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Liang He [Wed, 21 Sep 2022 13:32:45 +0000 (21:32 +0800)]
net: marvell: Fix refcounting bugs in prestera_port_sfp_bind()
In prestera_port_sfp_bind(), there are two refcounting bugs:
(1) we should call of_node_get() before of_find_node_by_name() as
it will automaitcally decrease the refcount of 'from' argument;
(2) we should call of_node_put() for the break of the iteration
for_each_child_of_node() as it will automatically increase and
decrease the 'child'.
net: sched: fix possible refcount leak in tc_new_tfilter()
tfilter_put need to be called to put the refount got by tp->ops->get to
avoid possible refcount leak when chain->tmplt_ops != NULL and
chain->tmplt_ops != tp->ops.
Fixes: f80d49a81105 ("net: sched: extend proto ops with 'put' callback") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Link: https://lore.kernel.org/r/20220921092734.31700-1-hbh25y@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sean Anderson [Tue, 20 Sep 2022 23:50:18 +0000 (19:50 -0400)]
net: sunhme: Fix packet reception for len < RX_COPY_THRESHOLD
There is a separate receive path for small packets (under 256 bytes).
Instead of allocating a new dma-capable skb to be used for the next packet,
this path allocates a skb and copies the data into it (reusing the existing
sbk for the next packet). There are two bytes of junk data at the beginning
of every packet. I believe these are inserted in order to allow aligned DMA
and IP headers. We skip over them using skb_reserve. Before copying over
the data, we must use a barrier to ensure we see the whole packet. The
current code only synchronizes len bytes, starting from the beginning of
the packet, including the junk bytes. However, this leaves off the final
two bytes in the packet. Synchronize the whole packet.
To reproduce this problem, ping a HME with a payload size between 17 and
214
$ ping -s 17 <hme_address>
which will complain rather loudly about the data mismatch. Small packets
(below 60 bytes on the wire) do not have this issue. I suspect this is
related to the padding added to increase the minimum packet size.
====================
bonding: fix NULL deref in bond_rr_gen_slave_id
Fix a NULL dereference of the struct bonding.rr_tx_counter member because
if a bond is initially created with an initial mode != zero (Round Robin)
the memory required for the counter is never created and when the mode is
changed there is never any attempt to verify the memory is allocated upon
switching modes.
====================
Jonathan Toppins [Tue, 20 Sep 2022 17:45:51 +0000 (13:45 -0400)]
selftests: bonding: cause oops in bond_rr_gen_slave_id
This bonding selftest used to cause a kernel oops on aarch64
and should be architectures agnostic.
Signed-off-by: Jonathan Toppins <jtoppins@redhat.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jonathan Toppins [Tue, 20 Sep 2022 17:45:52 +0000 (13:45 -0400)]
bonding: fix NULL deref in bond_rr_gen_slave_id
Fix a NULL dereference of the struct bonding.rr_tx_counter member because
if a bond is initially created with an initial mode != zero (Round Robin)
the memory required for the counter is never created and when the mode is
changed there is never any attempt to verify the memory is allocated upon
switching modes.
The fix is to allocate the memory in bond_open() which is guaranteed
to be called before any packets are processed.
Fixes: 1999b014bd2e ("net: bonding: Use per-cpu rr_tx_counter") CC: Jussi Maki <joamaki@gmail.com> Signed-off-by: Jonathan Toppins <jtoppins@redhat.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Michael Walle [Tue, 20 Sep 2022 14:16:19 +0000 (16:16 +0200)]
net: phy: micrel: fix shared interrupt on LAN8814
Since commit 8a86c40facf2 ("net: phy: micrel: 1588 support for LAN8814
phy") the handler always returns IRQ_HANDLED, except in an error case.
Before that commit, the interrupt status register was checked and if
it was empty, IRQ_NONE was returned. Restore that behavior to play nice
with the interrupt line being shared with others.
Fixes: 8a86c40facf2 ("net: phy: micrel: 1588 support for LAN8814 phy") Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Divya Koppera <Divya.Koppera@microchip.com> Link: https://lore.kernel.org/r/20220920141619.808117-1-michael@walle.cc Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Paolo Abeni [Thu, 22 Sep 2022 13:13:26 +0000 (15:13 +0200)]
Merge branch 'add-wed-support-for-mt7986-chipset'
Lorenzo Bianconi says:
====================
Add WED support for MT7986 chipset
Similar to MT7622, introduce Wireless Ethernet Dispatch (WED) support
for MT7986 chipset in order to offload to the hw packet engine traffic
received from LAN/WAN device to WLAN nic (MT7915E).
====================
Lorenzo Bianconi [Tue, 20 Sep 2022 10:11:20 +0000 (12:11 +0200)]
net: ethernet: mtk_eth_wed: add mtk_wed_configure_irq and mtk_wed_dma_{enable/disable}
Introduce mtk_wed_configure_irq, mtk_wed_dma_enable and mtk_wed_dma_disable
utility routines.
This is a preliminary patch to introduce mt7986 wed support.
Tested-by: Daniel Golle <daniel@makrotopia.org> Co-developed-by: Bo Jiao <Bo.Jiao@mediatek.com> Signed-off-by: Bo Jiao <Bo.Jiao@mediatek.com> Co-developed-by: Sujuan Chen <sujuan.chen@mediatek.com> Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Lorenzo Bianconi [Tue, 20 Sep 2022 10:11:13 +0000 (12:11 +0200)]
arm64: dts: mediatek: mt7986: add support for Wireless Ethernet Dispatch
Introduce wed0 and wed1 nodes in order to enable offloading forwarding
between ethernet and wireless devices on the mt7986 chipset.
Co-developed-by: Bo Jiao <Bo.Jiao@mediatek.com> Signed-off-by: Bo Jiao <Bo.Jiao@mediatek.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
====================
Separate SMC parameter settings from TCP sysctls
SMC shares some sysctls with TCP, but considering the difference
between these two protocols, it may not be very suitable for SMC
to reuse TCP parameter settings in some cases, such as keepalive
time or buffer size.
So this patch set aims to introduce some SMC specific sysctls to
independently and flexibly set the parameters that suit SMC.
====================
Tony Lu [Tue, 20 Sep 2022 09:52:22 +0000 (17:52 +0800)]
net/smc: Unbind r/w buffer size from clcsock and make them tunable
Currently, SMC uses smc->sk.sk_{rcv|snd}buf to create buffers for
send buffer and RMB. And the values of buffer size are from tcp_{w|r}mem
in clcsock.
The buffer size from TCP socket doesn't fit SMC well. Generally, buffers
are usually larger than TCP for SMC-R/-D to get higher performance, for
they are different underlay devices and paths.
So this patch unbinds buffer size from TCP, and introduces two sysctl
knobs to tune them independently. Also, these knobs are per net
namespace and work for containers.
Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
net/smc: Introduce a specific sysctl for TEST_LINK time
SMC-R tests the viability of link by sending out TEST_LINK LLC
messages over RoCE fabric when connections on link have been
idle for a time longer than keepalive interval (testlink time).
But using tcp_keepalive_time as testlink time maybe not quite
suitable because it is default no less than two hours[1], which
is too long for single link to find peer dead. The active host
will still use peer-dead link (QP) sending messages, and can't
find out until get IB_WC_RETRY_EXC_ERR error CQEs, which takes
more time than TEST_LINK timeout (SMC_LLC_WAIT_TIME) normally.
So this patch introduces a independent sysctl for SMC-R to set
link keepalive time, in order to detect link down in time. The
default value is 30 seconds.
net/smc: Stop the CLC flow if no link to map buffers on
There might be a potential race between SMC-R buffer map and
link group termination.
smc_smcr_terminate_all() | smc_connect_rdma()
--------------------------------------------------------------
| smc_conn_create()
for links in smcibdev |
schedule links down |
| smc_buf_create()
| \- smcr_buf_map_usable_links()
| \- no usable links found,
| (rmb->mr = NULL)
|
| smc_clc_send_confirm()
| \- access conn->rmb_desc->mr[]->rkey
| (panic)
During reboot and IB device module remove, all links will be set
down and no usable links remain in link groups. In such situation
smcr_buf_map_usable_links() should return an error and stop the
CLC flow accessing to uninitialized mr.
We currently check the MokSBState variable to decide whether we should
treat UEFI secure boot as being disabled, even if the firmware thinks
otherwise. This is used by shim to indicate that it is not checking
signatures on boot images. In the kernel, we use this to relax lockdown
policies.
However, in cases where shim is not even being used, we don't want this
variable to interfere with lockdown, given that the variable may be
non-volatile and therefore persist across a reboot. This means setting
it once will persistently disable lockdown checks on a given system.
So switch to the mirrored version of this variable, called MokSBStateRT,
which is supposed to be volatile, and this is something we can check.
Cc: <stable@vger.kernel.org> # v4.19+ Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Peter Jones <pjones@redhat.com>
Ard Biesheuvel [Thu, 4 Aug 2022 13:39:48 +0000 (15:39 +0200)]
efi: x86: Wipe setup_data on pure EFI boot
When booting the x86 kernel via EFI using the LoadImage/StartImage boot
services [as opposed to the deprecated EFI handover protocol], the setup
header is taken from the image directly, and given that EFI's LoadImage
has no Linux/x86 specific knowledge regarding struct bootparams or
struct setup_header, any absolute addresses in the setup header must
originate from the file and not from a prior loading stage.
Since we cannot generally predict where LoadImage() decides to load an
image (*), such absolute addresses must be treated as suspect: even if a
prior boot stage intended to make them point somewhere inside the
[signed] image, there is no way to validate that, and if they point at
an arbitrary location in memory, the setup_data nodes will not be
covered by any signatures or TPM measurements either, and could be made
to contain an arbitrary sequence of SETUP_xxx nodes, which could
interfere quite badly with the early x86 boot sequence.
(*) Note that, while LoadImage() does take a buffer/size tuple in
addition to a device path, which can be used to provide the image
contents directly, it will re-allocate such images, as the memory
footprint of an image is generally larger than the PE/COFF file
representation.
Jakub Kicinski [Thu, 22 Sep 2022 01:39:23 +0000 (18:39 -0700)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-09-20 (ice)
Michal re-sets TC configuration when changing number of queues.
Mateusz moves the check and call for link-down-on-close to the specific
path for downing/closing the interface.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: Fix interface being down after reset with link-down-on-close flag on
ice: config netdev tc before setting queues number
====================
ethernet: tundra: Drop forward declaration of static functions
Usually it's not necessary to declare static functions if the symbols are
in the right order. Moving the definition of tsi_eth_driver down in the
compilation unit allows to drop two such declarations.
Qingqing Yang [Mon, 19 Sep 2022 07:48:08 +0000 (15:48 +0800)]
flow_dissector: Do not count vlan tags inside tunnel payload
We've met the problem that when there is a vlan tag inside
GRE encapsulation, the match of num_of_vlans fails.
It is caused by the vlan tag inside GRE payload has been
counted into num_of_vlans, which is not expected.
One example packet is like this:
Ethernet II, Src: Broadcom_68:56:07 (00:10:18:68:56:07)
Dst: Broadcom_68:56:08 (00:10:18:68:56:08)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 100
Internet Protocol Version 4, Src: 192.168.1.4, Dst: 192.168.1.200
Generic Routing Encapsulation (Transparent Ethernet bridging)
Ethernet II, Src: Broadcom_68:58:07 (00:10:18:68:58:07)
Dst: Broadcom_68:58:08 (00:10:18:68:58:08)
802.1Q Virtual LAN, PRI: 0, DEI: 0, ID: 200
...
It should match the (num_of_vlans 1) rule, but it matches
the (num_of_vlans 2) rule.
The vlan tags inside the GRE or other tunnel encapsulated payload
should not be taken into num_of_vlans.
The fix is to stop counting the vlan number when the encapsulation
bit is set.
Fixes: f6a8c26d94a6 ("flow_dissector: Add number of vlan tags dissector") Signed-off-by: Qingqing Yang <qingqing.yang@broadcom.com> Reviewed-by: Boris Sukholitko <boris.sukholitko@broadcom.com> Link: https://lore.kernel.org/r/20220919074808.136640-1-qingqing.yang@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 22 Sep 2022 01:29:38 +0000 (18:29 -0700)]
Merge branch 'clean-up-ocelot_reset-routine'
Colin Foster says:
====================
clean up ocelot_reset() routine
ocelot_reset() will soon be exported to a common library to be used by
the ocelot_ext system. This will make error values from regmap calls
possible, so they must be checked. Additionally, readx_poll_timeout()
can be substituted for the custom loop, as a simple cleanup.
I don't have hardware to verify this set directly, but there shouldn't
be any functional changes.
====================
Colin Foster [Sat, 17 Sep 2022 17:51:27 +0000 (10:51 -0700)]
net: mscc: ocelot: check return values of writes during reset
The ocelot_reset() function utilizes regmap_field_write() but wasn't
checking return values. While this won't cause issues for the current MMIO
regmaps, it could be an issue for externally controlled interfaces.
Add checks for these return values.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Colin Foster [Sat, 17 Sep 2022 17:51:26 +0000 (10:51 -0700)]
net: mscc: ocelot: utilize readx_poll_timeout() for chip reset
Clean up the reset code by utilizing readx_poll_timeout instead of a custom
loop.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
net: ll_temac: Cleanup for clearing static warnings
Most static warnings are detected by Checkpatch.pl, mainly about:
(1) #1: About the comments.
(2) #2: About function name in a string.
(3) #3: About the open parenthesis.
(4) #4: About the else branch.
(6) #6: About trailing statements.
(7) #5,#7: About blank lines and spaces.
====================
ice: Fix ice_xdp_xmit() when XDP TX queue number is not sufficient
The original patch added the static branch to handle the situation,
when assigning an XDP TX queue to every CPU is not possible,
so they have to be shared.
However, in the XDP transmit handler ice_xdp_xmit(), an error was
returned in such cases even before static condition was checked,
thus making queue sharing still impossible.
Jakub Kicinski [Thu, 22 Sep 2022 00:28:35 +0000 (17:28 -0700)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-09-19 (iavf, i40e)
Norbert adds checking of buffer size for Rx buffer checks in iavf.
Michal corrects setting of max MTU in iavf to account for MTU data provided
by PF, fixes i40e to set VF max MTU, and resolves lack of rate limiting
when value was less than divisor for i40e.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
i40e: Fix set max_tx_rate when it is lower than 1 Mbps
i40e: Fix VF set max MTU size
iavf: Fix set max MTU size with port VLAN and jumbo frames
iavf: Fix bad page state
====================
net: hns3: add judge fd ability for sync and clear process of flow director
Currently, driver will always clear user defined field of flow director
in uninit process and sync flow director table in periodic task. However,
if hardware does not support flow director driver should not do those
processes, so add fd ability judgement.
The fd ability judgement in function hclge_clear_fd_rules_in_list() is
redundant, so delete it.
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hao Lan [Fri, 16 Sep 2022 02:38:02 +0000 (10:38 +0800)]
net: hns3: refactor function hclge_mbx_handler()
Currently, the function hclge_mbx_handler() has too many switch-case
statements, it makes this function too long. To improve code readability,
refactor this function and use lookup table instead.
Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net: hns3: optimize converting dscp to priority process of hns3_nic_select_queue()
Currently, when function hns3_nic_select_queue() converts dscp to priority,
it calls an indirect callback ae_algo->ops->get_dscp_prio to get priority.
However as function hns3_nic_select_queue() is in fast path, the indirect
call may cause degradation of performance. For optimization, this patch
moves dscp_app_cnt and dscp_prio from struct hclge_tm_info to struct
hnae3_knic_private_info, so they can be used in both hclge and hns3 layers.
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Yonglong Liu [Fri, 16 Sep 2022 02:38:00 +0000 (10:38 +0800)]
net: hns3: add support for external loopback test
This patch add support for external loopback test.
The successful test need the link is up with duplex full. The
driver do external loopback first, and then the whole offline
test.
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Namhyung Kim [Tue, 20 Sep 2022 22:28:21 +0000 (15:28 -0700)]
perf tools: Honor namespace when synthesizing build-ids
It needs to enter the namespace before reading a file.
Fixes: 9e041aaa2c214563 ("perf tools: Allow synthesizing the build id for kernel/modules/tasks in PERF_RECORD_MMAP2") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220920222822.2171056-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:
8029d2499af50d7b ("x86/bugs: Add "unknown" reporting for MMIO Stale Data")
This only causes these perf files to be rebuilt:
CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o
And addresses this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Link: https://lore.kernel.org/lkml/YysTRji90sNn2p5f@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Wed, 14 Sep 2022 12:24:29 +0000 (15:24 +0300)]
perf kcore_copy: Do not check /proc/modules is unchanged
/proc/kallsyms and /proc/modules are compared before and after the copy
in order to ensure no changes during the copy.
However /proc/modules also might change due to reference counts changing
even though that does not make any difference.
Any modules loaded or unloaded should be visible in changes to kallsyms,
so it is not necessary to check /proc/modules also anyway.
Remove the comparison checking that /proc/modules is unchanged.
Fixes: 222e11096bd6357f ("perf buildid-cache: Add ability to add kcore to the cache") Reported-by: Daniel Dao <dqminh@cloudflare.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Daniel Dao <dqminh@cloudflare.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220914122429.8770-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adrian Hunter [Thu, 15 Sep 2022 12:26:12 +0000 (15:26 +0300)]
libperf evlist: Fix polling of system-wide events
Originally, (refer commit 78e9100aa2d99a20 ("perf evlist: Do not poll
events that use the system_wide flag") there wasn't much reason to poll
system-wide events because:
1. The mmaps get "merged" via set-output anyway (the per-cpu case)
2. perf reads all mmaps when any event is woken
3. system-wide mmaps do not fill up as fast as the mmaps for user
selected events
But there was 1 reason not to poll which was that it prevented correct
termination due to POLLHUP on all user selected events. That issue is
now easily resolved by using fdarray_flag__nonfilterable.
With the advent of commit 613c0269421a01f8 ("libperf evlist: Allow
mixing per-thread and per-cpu mmaps"), system-wide mmaps can be used
also in the per-thread case where reason 1 does not apply.
Fix the omission of system-wide events from polling by using the
fdarray_flag__nonfilterable flag.
Adrian Hunter [Thu, 15 Sep 2022 12:26:11 +0000 (15:26 +0300)]
perf record: Fix cpu mask bit setting for mixed mmaps
With mixed per-thread and (system-wide) per-cpu maps, the "any cpu" value
-1 must be skipped when setting CPU mask bits.
Prior to commit fd95a47a005957d5 ("tools/perf: Fix out of bound access
to cpu mask array") the invalid setting went unnoticed, but since then
it causes perf record to fail with an error.
Example:
Before:
$ perf record -e intel_pt// --per-thread uname
Failed to initialize parallel data streaming masks
After:
$ perf record -e intel_pt// --per-thread uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.068 MB perf.data ]
Fixes: 613c0269421a01f8 ("libperf evlist: Allow mixing per-thread and per-cpu mmaps") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220915122612.81738-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim [Wed, 14 Sep 2022 18:33:38 +0000 (11:33 -0700)]
perf test: Skip wp modify test on old kernels
It uses PERF_EVENT_IOC_MODIFY_ATTRIBUTES ioctl. The kernel would return
ENOTTY if it's not supported. Update the skip reason in that case.
Committer notes:
On s/390 the args aren't used, so need to be marked __maybe_unused.
Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20220914183338.546357-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'for-linus-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux
Pull UML fixes from Richard Weinberger:
- Various fixes for build warnings
- Fix default kernel command line
* tag 'for-linus-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux:
arch: um: Mark the stack non-executable to fix a binutils warning
um: Prevent KASAN splats in dump_stack()
um: fix default console kernel parameter
um: Cleanup compiler warning in arch/x86/um/tls_32.c
um: Cleanup syscall_handler_t cast in syscalls_32.h
Merge tag 'mips-fixes_6.0_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
- fix missing export for Lantiq watchdog driver
- fix ethernet phy interface setup for Loongson32
* tag 'mips-fixes_6.0_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
MIPS: Loongson32: Fix PHY-mode being left unspecified
MIPS: lantiq: export clk_get_io() for lantiq_wdt.ko
* tag 'dmaengine-fix-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine:
dmaengine: zynqmp_dma: Typecast with enum to fix the coverity warning
dmaengine: ti: k3-udma-private: Fix refcount leak bug in of_xudma_dev_get()
dmaengine: xilinx_dma: Report error in case of dma_set_mask_and_coherent API failure
dmaengine: xilinx_dma: cleanup for fetching xlnx,num-fstores property
dmaengine: xilinx_dma: Fix devm_platform_ioremap_resource error handling
Merge tag 'iommu-fixes-v6.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu fixes from Joerg Roedel:
"Two fixes for Intel VT-d:
- Check the right capability bit for 5-level page table support.
- Revert a previous fix which caused a regression with Thunderbolt
devices"
* tag 'iommu-fixes-v6.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu/vt-d: Check correct capability for sagaw determination
Revert "iommu/vt-d: Fix possible recursive locking in intel_iommu_init()"
Merge tag 'sound-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A bit more changes than wished, but still manageable amount.
Most of commits are HD-audio specific device fixes / quirks, while
there is a revert for the previous fix due to regressions and a
double-free fix in ALSA core code"
* tag 'sound-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
Revert "ALSA: usb-audio: Split endpoint setups for hw_params and prepare"
ALSA: core: Fix double-free at snd_card_new()
ALSA: hda/realtek: Add a quirk for HP OMEN 16 (8902) mute LED
ALSA: hda/hdmi: Fix the converter reuse for the silent stream
ALSA: hda/realtek: Add quirk for ASUS GA503R laptop
ALSA: hda/realtek: Add pincfg for ASUS G533Z HP jack
ALSA: hda/realtek: Add pincfg for ASUS G513 HP jack
ALSA: hda/realtek: Re-arrange quirk table entries
ALSA: hda/realtek: Enable 4-speaker output Dell Precision 5530 laptop
ALSA: hda/realtek: Enable 4-speaker output Dell Precision 5570 laptop
ALSA: hda: Fix Nvidia dp infoframe
ALSA: hda/realtek: Add quirk for Huawei WRT-WX9
ALSA: hda/tegra: set depop delay for tegra
ALSA: hda: add Intel 5 Series / 3400 PCI DID
ALSA: hda: Fix hang at HD-audio codec unbinding due to refcount saturation