In this patch set, I removed all set_drvdata(NULL) calls
in ->remove() in drivers/net/dsa/.
The driver_data will be set to NULL in device_unbind_cleanup()
after calling ->remove(), so all set_drvdata(NULL) calls
in ->remove() are redundant and can be removed.
Here is the previous patch set:
https://lore.kernel.org/netdev/facfc855-d082-cc1c-a0bc-027f562a2f45@huawei.com/T/
====================
Remove the unnecessary platform_set_drvdata() call in ->remove(); the driver_data
will be set to NULL in device_unbind_cleanup() after calling ->remove().
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
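The ordering argument above can be illustrated with a small user-space model. This is only a sketch in Python, not kernel code: the `Device` class and the `unbind()` helper are hypothetical stand-ins, and only the order "->remove() first, then device_unbind_cleanup() clears driver_data" is taken from the text.

```python
class Device:
    """Toy stand-in for struct device; it only tracks driver_data."""
    def __init__(self):
        self.driver_data = None


def device_unbind_cleanup(dev):
    # Models the core clearing driver_data once ->remove() has returned.
    dev.driver_data = None


def unbind(dev, remove):
    # Hypothetical helper: run the driver's ->remove(), then the core cleanup.
    remove(dev)
    device_unbind_cleanup(dev)


def remove_without_clearing(dev):
    # The driver tears down its own state and leaves driver_data untouched.
    pass


dev = Device()
dev.driver_data = {"priv": "switch state"}
unbind(dev, remove_without_clearing)
print(dev.driver_data)
```

Even though the modelled ->remove() never clears the pointer, it is cleared by the time unbinding finishes, which is why the explicit call is redundant.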
Li Zhong [Wed, 21 Sep 2022 18:17:16 +0000 (11:17 -0700)]
ethtool: tunnels: check the return value of nla_nest_start()
Check the return value of nla_nest_start(). When starting the entry
level nested attributes, if the tailroom of socket buffer is
insufficient to store the attribute header and payload, the return value
will be NULL.
There is, however, no real bug here, since if the skb is full,
nla_put_be16() will fail as well and we'll error out.
====================
mlx5 MACsec Extended packet number and replay window offload
This is a follow-up series to the mlx5 MACsec offload [1] submitted
earlier this release cycle.
In this series we add support for MACsec Extended packet number and
replay window offloads.
First patch is a simple modification (code movements) to the core macsec code
to allow exposing the EPN related user properties to the offloading
device driver.
The rest of the patches are mlx5 specific. We start off by fixing some
trivial issues in the mlx5 MACsec code, followed by a simple refactoring to allow
additional functionality in mlx5 MACsec to support EPN and replay window
offloads.
A) Expose mkey creation functionality to MACsec
B) Expose ASO object to MACsec, to allow advanced steering operations,
ASO objects are used to modify MACsec steering objects in fastpath.
1) Support MACsec offload extended packet number (EPN)
MACsec EPN splits the packet number (PN) into two 32-bit fields,
epn_lsb (32 least significant bits (LSBs) of PN) and epn_msb (32
most significant bits (MSBs) of PN).
The epn_msb bits are managed by SW. For that, HW is required to send
an object change event of type EPN event, notifying SW to update
epn_msb. Once epn_msb is updated, SW updates HW with
the new epn_msb value so that HW can perform replay protection.
To prevent HW from stopping while handling the event, SW manages
another bit for HW called epn_overlap. HW uses it as
an indication of how to read the epn_msb value correctly
while still receiving packets.
Add EPN event handling that updates epn_overlap and epn_msb
every 2^31 packets according to the following logic:
when epn_lsb crosses 2^31 (half sequence number wraparound) upon the
relevant HW event, SW updates the epn_overlap value to OLD (value = 1).
When epn_lsb crosses 2^32 (full sequence number wraparound)
upon the relevant HW event, SW updates epn_overlap to NEW
(value = 0) and increments epn_msb.
When using MACsec EPN, a salt and a short secure channel id (ssci)
need to be provided by the user. When offloading EPN, this salt and
ssci must be passed to the HW to be used in the initialization vector (IV)
calculations.
2) Support MACsec offload replay window
Support setting replay window size for MACsec offload.
Currently, the supported window sizes are 32, 64, 128 and 256
bits. Other values are rejected as an invalid parameter.
Emeel Hakim [Wed, 21 Sep 2022 18:10:54 +0000 (11:10 -0700)]
net/mlx5e: Support MACsec offload replay window
Support setting replay window size for MACsec offload.
Currently, the supported window sizes are 32, 64, 128 and 256
bits. Other values are rejected as an invalid parameter.
Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
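The window size check described in this commit could look roughly like the following Python sketch. The function name `set_replay_window` is hypothetical, and mapping "invalid parameter" to -EINVAL is an assumption; only the four supported sizes come from the commit message.

```python
import errno

# Supported replay window sizes, in bits, as listed in the commit message.
SUPPORTED_WINDOW_SIZES = (32, 64, 128, 256)


def set_replay_window(window_size):
    """Return 0 on success, or -EINVAL (assumed) for an unsupported size."""
    if window_size not in SUPPORTED_WINDOW_SIZES:
        return -errno.EINVAL
    return 0


print(set_replay_window(64))
print(set_replay_window(100))
```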
Emeel Hakim [Wed, 21 Sep 2022 18:10:53 +0000 (11:10 -0700)]
net/mlx5e: Support MACsec offload extended packet number (EPN)
MACsec EPN splits the packet number (PN) into two 32-bit fields,
epn_lsb (32 least significant bits (LSBs) of PN) and epn_msb (32
most significant bits (MSBs) of PN).
The epn_msb bits are managed by SW. For that, HW is required to send
an object change event of type EPN event, notifying SW to update
epn_msb. Once epn_msb is updated, SW updates HW with
the new epn_msb value so that HW can perform replay protection.
To prevent HW from stopping while handling the event, SW manages
another bit for HW called epn_overlap. HW uses it as
an indication of how to read the epn_msb value correctly
while still receiving packets.
Add EPN event handling that updates epn_overlap and epn_msb
every 2^31 packets according to the following logic:
when epn_lsb crosses 2^31 (half sequence number wraparound) upon the
relevant HW event, SW updates the epn_overlap value to OLD (value = 1).
When epn_lsb crosses 2^32 (full sequence number wraparound)
upon the relevant HW event, SW updates epn_overlap to NEW
(value = 0) and increments epn_msb.
When using MACsec EPN, a salt and a short secure channel id (ssci)
need to be provided by the user. When offloading EPN, this salt and
ssci must be passed to the HW to be used in the initialization vector (IV)
calculations.
Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
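The EPN bookkeeping described in this commit can be modelled in a few lines of Python. This is a sketch of the software-side state only: the class and method names are hypothetical, and the HW events are represented by plain method calls. The OLD = 1 / NEW = 0 values and the split of the 64-bit PN into two 32-bit halves follow the commit message.

```python
EPN_OVERLAP_NEW = 0
EPN_OVERLAP_OLD = 1
MASK32 = 0xFFFFFFFF


def split_pn(pn):
    """Split a 64-bit packet number into (epn_msb, epn_lsb)."""
    return (pn >> 32) & MASK32, pn & MASK32


class EpnState:
    """Software-managed EPN state (hypothetical model)."""

    def __init__(self, epn_msb=0):
        self.epn_msb = epn_msb
        self.epn_overlap = EPN_OVERLAP_NEW

    def on_half_wraparound(self):
        # HW event: epn_lsb crossed 2^31.
        self.epn_overlap = EPN_OVERLAP_OLD

    def on_full_wraparound(self):
        # HW event: epn_lsb crossed 2^32.
        self.epn_overlap = EPN_OVERLAP_NEW
        self.epn_msb = (self.epn_msb + 1) & MASK32


state = EpnState()
state.on_half_wraparound()
state.on_full_wraparound()
print(state.epn_msb, state.epn_overlap)
```

In this model one half-wraparound event followed by one full-wraparound event advances epn_msb by one, which matches the "every 2^31 packets" cadence of the two events.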
Emeel Hakim [Wed, 21 Sep 2022 18:10:52 +0000 (11:10 -0700)]
net/mlx5e: Move MACsec initialization from profile init stage to profile enable stage
Postpone MACsec initialization to the mlx5e profile enable stage so that
user access region (UAR) pages and other resources are ready before MACsec
initializes its advanced steering operation (ASO) hardware
resources.
Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Emeel Hakim [Wed, 21 Sep 2022 18:10:51 +0000 (11:10 -0700)]
net/mlx5e: Create advanced steering operation (ASO) object for MACsec
Add support for ASO work queue entry (WQE) data to allow reading
data upon querying the ASO work queue (WQ).
Register user mode memory registration (UMR) upon ASO WQ init,
de-register UMR upon ASO WQ cleanup.
MACsec uses UMR to determine the cause of the event triggered
by the HW since different scenarios could trigger the same event.
Set up the MACsec ASO object to sync HW with SW about various MACsec
flow stateful features such as the replay window, lifetime limits, etc.
Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Emeel Hakim [Wed, 21 Sep 2022 18:10:50 +0000 (11:10 -0700)]
net/mlx5e: Expose memory key creation (mkey) function
Expose mlx5e_create_mkey function, for future patches in the
macsec series to use.
The above function creates a memory key which describes a
region in memory that can be later used by both HW and SW.
The counterpart destroy functionality is already exposed.
Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Emeel Hakim [Wed, 21 Sep 2022 18:10:49 +0000 (11:10 -0700)]
net/mlx5: Add ifc bits for MACsec extended packet number (EPN) and replay protection
Add ifc bits related to advanced steering operations (ASO) and general
object modify for macsec to use as part of offloading EPN and replay
protection features.
Reviewed-by: Raed Salem <raeds@nvidia.com> Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Emeel Hakim [Wed, 21 Sep 2022 18:10:48 +0000 (11:10 -0700)]
net/mlx5e: Fix MACsec initial packet number
Currently, when creating a MACsec object, next_pn, which represents
the initial packet number (PN), is considered only in the TX flow.
This causes a mismatch between the TX and RX initial PN, which
shows up as packet drops.
Fix by considering next_pn in the RX flow too.
Emeel Hakim [Wed, 21 Sep 2022 18:10:45 +0000 (11:10 -0700)]
net: macsec: Expose extended packet number (EPN) properties to macsec offload
Currently macsec invokes HW offload path before reading extended packet
number (EPN) related user properties i.e. salt and short secure channel
identifier (ssci), hence preventing macsec EPN HW offload.
Expose those by moving macsec EPN properties reading prior to HW offload
path.
Signed-off-by: Emeel Hakim <ehakim@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When hinic_pci_sriov_disable() calls hinic_deinit_vf_hw(), it doesn't
care about the return value of hinic_deinit_vf_hw(). Also,
hinic_deinit_vf_hw() always returns 0, so change it to return void.
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
hinic_hwdev_max_num_qpas() and hinic_msix_attr_get() are no longer called, so
remove them. The macro HINIC_MSIX_ATTR_GET is also unused, so remove
it too.
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
refactor duplicate code in the qdisc class walk function
The walk implementation of most qdisc class modules is basically the
same. That is, the values of count and skip are checked first. If count
is greater than or equal to skip, the registered fn function is
executed. Otherwise, increase the value of count. So the code can be
refactored.
The walk function is invoked during dump. Therefore, test cases related
to the tdc filter need to be added.
====================
Test 0582: Create QFQ with default setting
Test c9a3: Create QFQ with class weight setting
Test 8452: Create QFQ with class maxpkt setting
Test d920: Create QFQ with multiple class setting
Test 0548: Delete QFQ with handle
Test 5901: Show QFQ class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
selftests/tc-testing: add selftests for netem qdisc
Test cb28: Create NETEM with default setting
Test a089: Create NETEM with limit flag
Test 3449: Create NETEM with delay time
Test 3782: Create NETEM with distribution and corrupt flag
Test 2b82: Create NETEM with distribution and duplicate flag
Test a932: Create NETEM with distribution and loss flag
Test e01a: Create NETEM with distribution and loss state flag
Test ba29: Create NETEM with loss gemodel flag
Test 0492: Create NETEM with reorder flag
Test 7862: Create NETEM with rate limit
Test 7235: Create NETEM with multiple slot rate
Test 5439: Create NETEM with multiple slot setting
Test 5029: Change NETEM with loss state
Test 3785: Replace NETEM with delay time
Test 4502: Delete NETEM with handle
Test 0785: Show NETEM class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
selftests/tc-testing: add selftests for multiq qdisc
Test 20ba: Add multiq Qdisc to multi-queue device (8 queues)
Test 4301: List multiq Class
Test 7832: Delete nonexistent multiq Qdisc
Test 2891: Delete multiq Qdisc twice
Test 1329: Add multiq Qdisc to single-queue device
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
selftests/tc-testing: add selftests for mqprio qdisc
Test 9903: Add mqprio Qdisc to multi-queue device (8 queues)
Test 453a: Delete nonexistent mqprio Qdisc
Test 5292: Delete mqprio Qdisc twice
Test 45a9: Add mqprio Qdisc to single-queue device
Test 2ba9: Show mqprio class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Test 0904: Create HTB with default setting
Test 3906: Create HTB with default-N setting
Test 8492: Create HTB with r2q setting
Test 9502: Create HTB with direct_qlen setting
Test b924: Create HTB with class rate and burst setting
Test 4359: Create HTB with class mpu setting
Test 9048: Create HTB with class prio setting
Test 4994: Create HTB with class ceil setting
Test 9523: Create HTB with class cburst setting
Test 5353: Create HTB with class mtu setting
Test 346a: Create HTB with class quantum setting
Test 303a: Delete HTB with handle
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
selftests/tc-testing: add selftests for hfsc qdisc
Test 3254: Create HFSC with default setting
Test 0289: Create HFSC with class sc and ul rate setting
Test 846a: Create HFSC with class sc umax and dmax setting
Test 5413: Create HFSC with class rt and ls rate setting
Test 9312: Create HFSC with class rt umax and dmax setting
Test 6931: Delete HFSC with handle
Test 8436: Show HFSC class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
selftests/tc-testing: add selftests for fq_codel qdisc
Test 4957: Create FQ_CODEL with default setting
Test 7621: Create FQ_CODEL with limit setting
Test 6871: Create FQ_CODEL with memory_limit setting
Test 5636: Create FQ_CODEL with target setting
Test 630a: Create FQ_CODEL with interval setting
Test 4324: Create FQ_CODEL with quantum setting
Test b190: Create FQ_CODEL with noecn flag
Test 5381: Create FQ_CODEL with ce_threshold setting
Test c9d2: Create FQ_CODEL with drop_batch setting
Test 523b: Create FQ_CODEL with multiple setting
Test 9283: Replace FQ_CODEL with noecn setting
Test 3459: Change FQ_CODEL with limit setting
Test 0128: Delete FQ_CODEL with handle
Test 0435: Show FQ_CODEL class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
selftests/tc-testing: add selftests for dsmark qdisc
Test 6345: Create DSMARK with default setting
Test 3462: Create DSMARK with default_index setting
Test ca95: Create DSMARK with set_tc_index flag
Test a950: Create DSMARK with multiple setting
Test 4092: Delete DSMARK with handle
Test 5930: Show DSMARK class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Test 1820: Create CBS with default setting
Test 1532: Create CBS with hicredit setting
Test 2078: Create CBS with locredit setting
Test 9271: Create CBS with sendslope setting
Test 0482: Create CBS with idleslope setting
Test e8f3: Create CBS with multiple setting
Test 23c9: Replace CBS with sendslope setting
Test a07a: Change CBS with idleslope setting
Test 43b3: Delete CBS with handle
Test 9472: Show CBS class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Test 3460: Create CBQ with default setting
Test 0592: Create CBQ with mpu
Test 4684: Create CBQ with valid cell num
Test 4345: Create CBQ with invalid cell num
Test 4525: Create CBQ with valid ewma
Test 6784: Create CBQ with invalid ewma
Test 5468: Delete CBQ with handle
Test 492a: Show CBQ class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
selftests/tc-testing: add selftests for cake qdisc
Test 1212: Create CAKE with default setting
Test 3281: Create CAKE with bandwidth limit
Test c940: Create CAKE with autorate-ingress flag
Test 2310: Create CAKE with rtt time
Test 2385: Create CAKE with besteffort flag
Test a032: Create CAKE with diffserv8 flag
Test 2349: Create CAKE with diffserv4 flag
Test 8472: Create CAKE with flowblind flag
Test 2341: Create CAKE with dsthost and nat flag
Test 5134: Create CAKE with wash flag
Test 2302: Create CAKE with flowblind and no-split-gso flag
Test 0768: Create CAKE with dual-srchost and ack-filter flag
Test 0238: Create CAKE with dual-dsthost and ack-filter-aggressive flag
Test 6572: Create CAKE with memlimit and ptm flag
Test 2436: Create CAKE with fwmark and atm flag
Test 3984: Create CAKE with overhead and mpu
Test 5421: Create CAKE with conservative and ingress flag
Test 6854: Delete CAKE with conservative and ingress flag
Test 2342: Replace CAKE with mpu
Test 2313: Change CAKE with mpu
Test 4365: Show CAKE class
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Victor Nogueira <victor@mojatatu.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net/sched: sch_api: add helper for tc qdisc walker stats dump
The walk implementation of most qdisc class modules is basically the
same. That is, the values of count and skip are checked first. If
count is greater than or equal to skip, the registered fn function is
executed. Otherwise, increase the value of count. So the walkers can be
refactored to share a common helper.
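The shared walk logic described above can be sketched as a small Python model. The names `Walker`, `stats_dump` and `walk` are hypothetical; the skip/count comparison comes from the commit message, while the `stop` flag and the increment of count after a successful callback are assumptions based on how such walk loops are commonly written.

```python
class Walker:
    """Toy model of a qdisc walker argument (hypothetical fields)."""

    def __init__(self, skip, fn):
        self.skip = skip    # number of leading classes to skip
        self.count = 0      # classes visited so far
        self.stop = False   # set when the callback asks to stop
        self.fn = fn        # registered callback


def stats_dump(cl, arg):
    """Return False when the walk should stop, True to continue."""
    if arg.count >= arg.skip and arg.fn(cl, arg) < 0:
        arg.stop = True
        return False
    arg.count += 1
    return True


def walk(classes, arg):
    # Every per-qdisc walk loop reduces to calling the helper per class.
    for cl in classes:
        if not stats_dump(cl, arg):
            break


seen = []


def record(cl, arg):
    seen.append(cl)
    return 0


w = Walker(skip=2, fn=record)
walk(["1:1", "1:2", "1:3", "1:4"], w)
print(seen)
```

With skip=2 the callback only runs for the third and fourth classes, which is the resume-after-partial-dump behaviour the skip counter exists for.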
The 3 functions that want access to the taprio_list:
taprio_dev_notifier(), taprio_destroy() and taprio_init() are all called
with the rtnl_mutex held, therefore implicitly serialized with respect
to each other. A spin lock serves no purpose.
====================
Support 256 bit TLS keys with device offload
This series adds support for 256 bit TLS keys with device offload, and a
cleanup patch to remove repeating code:
- Patches #1-2 add cipher sizes descriptors which allow reducing the
amount of code duplications.
- Patch #3 allows 256 bit keys to be TX offloaded in the tls module (RX
already supported).
- Patch #4 adds 256 bit keys support to the mlx5 driver.
====================
Introduce a cipher sizes descriptor. It helps reduce code
duplication and the repeated switch/case statements that assign the proper sizes
according to the cipher type.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
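The idea of a per-cipher size descriptor can be sketched as a lookup table in Python. The field names and the `key_size` helper are illustrative assumptions rather than the kernel's definitions; the byte sizes are the standard AES-GCM-128 and AES-GCM-256 TLS parameters.

```python
from collections import namedtuple

# One record per cipher type replaces a switch/case per size lookup.
CipherSizeDesc = namedtuple("CipherSizeDesc", "iv key salt tag rec_seq")

CIPHER_SIZE_DESC = {
    "AES_GCM_128": CipherSizeDesc(iv=8, key=16, salt=4, tag=16, rec_seq=8),
    "AES_GCM_256": CipherSizeDesc(iv=8, key=32, salt=4, tag=16, rec_seq=8),
}


def key_size(cipher_type):
    """Look up the key size in bytes instead of switching on the type."""
    return CIPHER_SIZE_DESC[cipher_type].key


print(key_size("AES_GCM_128"), key_size("AES_GCM_256"))
```

Adding a cipher then means adding one table row, rather than touching every place that needs a size.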
Any change to the hardware timestamp configuration triggers a NIC restart,
which breaks transmission and reception of network packets for a while.
But there is no need to fully restart the device while configuring
hardware timestamps. The code for changing the configuration runs after all
of the initialisation, when the NIC is actually up and running. This patch
changes the code so that the ioctl only updates configuration registers and
does not trigger a carrier status change, but in the case of timestamps for
all rx packets it falls back to a close()/open() sequence because of
synchronization issues in the hardware. Tested on BCM57504.
Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20220922191038.29921-1-vfedorenko@novek.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
drivers/net/ethernet/freescale/fec.h 8aaa8ff39e9a ("Revert "fec: Restart PPS after link state change"") ad168237a332 ("net: fec: add stop mode support for imx8 platform")
https://lore.kernel.org/all/20220921105337.62b41047@canb.auug.org.au/
drivers/pinctrl/pinctrl-ocelot.c 4ebab0460deb ("pinctrl: ocelot: Fix interrupt controller") e2e1adddecf7 ("pinctrl: ocelot: add ability to be used in a non-mmio configuration")
https://lore.kernel.org/all/20220921110032.7cd28114@canb.auug.org.au/
tools/testing/selftests/drivers/net/bonding/Makefile b1c6b30eafd3 ("net: Add tests for bonding and team address list management") abd12753fab7 ("selftests/bonding: add a test for bonding lladdr target")
https://lore.kernel.org/all/20220921110437.5b7dbd82@canb.auug.org.au/
Merge tag 'net-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from wifi, netfilter and can.
A handful of awaited fixes here - revert of the FEC changes, bluetooth
fix, fixes for iwlwifi spew.
We added a warning in PHY/MDIO code which is triggering on a couple of
platforms in a false-positive-ish way. If we can't iron that out over
the week we'll drop it and re-add for 6.1.
I've added a new "follow up fixes" section for fixes to fixes in
6.0-rcs but it may actually give the false impression that those are
problematic or that more testing time would have caught them. So
likely a one time thing.
- ebtables: fix memory leak when blob is malformed
- nf_ct_ftp: fix deadlock when nat rewrite is needed
Current release - regressions:
- Revert "fec: Restart PPS after link state change" and the related
"net: fec: Use a spinlock to guard `fep->ptp_clk_on`"
- Bluetooth: fix HCIGETDEVINFO regression
- wifi: mt76: fix 5 GHz connection regression on mt76x0/mt76x2
- mptcp: fix fwd memory accounting on coalesce
- rwlock removal fall out:
- ipmr: always call ip{,6}_mr_forward() from RCU read-side
critical section
- ipv6: fix crash when IPv6 is administratively disabled
- tcp: read multiple skbs in tcp_read_skb()
- mdio_bus_phy_resume state warning fallout:
- eth: ravb: fix PHY state warning splat during system resume
- eth: sh_eth: fix PHY state warning splat during system resume
Current release - new code bugs:
- wifi: iwlwifi: don't spam logs with NSS>2 messages
- eth: mtk_eth_soc: enable XDP support just for MT7986 SoC
Previous releases - regressions:
- bonding: fix NULL deref in bond_rr_gen_slave_id
- wifi: iwlwifi: mark IWLMEI as broken
Previous releases - always broken:
- nf_conntrack helpers:
- irc: tighten matching on DCC message
- sip: fix ct_sip_walk_headers
- osf: fix possible bogus match in nf_osf_find()
- ipvlan: fix out-of-bound bugs caused by unset skb->mac_header
- core: fix flow symmetric hash
- bonding, team: unsync device addresses on ndo_stop
- phy: micrel: fix shared interrupt on LAN8814"
* tag 'net-6.0-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
selftests: forwarding: add shebang for sch_red.sh
bnxt: prevent skb UAF after handing over to PTP worker
net: marvell: Fix refcounting bugs in prestera_port_sfp_bind()
net: sched: fix possible refcount leak in tc_new_tfilter()
net: sunhme: Fix packet reception for len < RX_COPY_THRESHOLD
udp: Use WARN_ON_ONCE() in udp_read_skb()
selftests: bonding: cause oops in bond_rr_gen_slave_id
bonding: fix NULL deref in bond_rr_gen_slave_id
net: phy: micrel: fix shared interrupt on LAN8814
net/smc: Stop the CLC flow if no link to map buffers on
ice: Fix ice_xdp_xmit() when XDP TX queue number is not sufficient
net: atlantic: fix potential memory leak in aq_ndev_close()
can: gs_usb: gs_usb_set_phys_id(): return with error if identify is not supported
can: gs_usb: gs_can_open(): fix race dev->can.state condition
can: flexcan: flexcan_mailbox_read() fix return value for drop = true
net: sh_eth: Fix PHY state warning splat during system resume
net: ravb: Fix PHY state warning splat during system resume
netfilter: nf_ct_ftp: fix deadlock when nat rewrite is needed
netfilter: ebtables: fix memory leak when blob is malformed
netfilter: nf_tables: fix percpu memory leak at nf_tables_addchain()
...
Merge tag 'efi-urgent-for-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI fixes from Ard Biesheuvel:
- Use the right variable to check for shim insecure mode
- Wipe setup_data field when booting via EFI
- Add missing error check to efibc driver
* tag 'efi-urgent-for-v6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi: libstub: check Shim mode using MokSBStateRT
efi: x86: Wipe setup_data on pure EFI boot
efi: efibc: Guard against allocation failure
Merge tag 'perf-tools-fixes-for-v6.0-2022-09-21' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix polling of system-wide events related to mixing per-cpu and
per-thread events.
- Do not check if /proc/modules is unchanged when copying /proc/kcore,
that doesn't get in the way of post processing analysis.
- Include program header in ELF files generated for JIT files, so that
they can be opened by tools using elfutils libraries.
- Enter namespaces when synthesizing build-ids.
- Fix some bugs related to a recent cpu_map overhaul where we should be
using an index and not the cpu number.
- Fix BPF program ELF section name, using the naming expected by libbpf
when using BPF counters in 'perf stat'.
- Add a new test for perf stat cgroup BPF counter.
- Adjust check on 'perf test wp' for older kernels, where the
PERF_EVENT_IOC_MODIFY_ATTRIBUTES ioctl isn't supported.
- Sync x86 cpufeatures with the kernel sources, no changes in tooling.
* tag 'perf-tools-fixes-for-v6.0-2022-09-21' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf tools: Honor namespace when synthesizing build-ids
tools headers cpufeatures: Sync with the kernel sources
perf kcore_copy: Do not check /proc/modules is unchanged
libperf evlist: Fix polling of system-wide events
perf record: Fix cpu mask bit setting for mixed mmaps
perf test: Skip wp modify test on old kernels
perf jit: Include program header in ELF files
perf test: Add a new test for perf stat cgroup BPF counter
perf stat: Use evsel->core.cpus to iterate cpus in BPF cgroup counters
perf stat: Fix cpu map index in bperf cgroup code
perf stat: Fix BPF program section name
Hangbin Liu [Thu, 22 Sep 2022 02:44:53 +0000 (10:44 +0800)]
selftests: forwarding: add shebang for sch_red.sh
RHEL/Fedora RPM build checks are stricter, and complain when executable
files don't have a shebang line, e.g.
*** WARNING: ./kselftests/net/forwarding/sch_red.sh is executable but has no shebang, removing executable bit
Fix it by adding a shebang line.
Fixes: 326d55eb37a8 ("selftests: forwarding: Add a RED test for SW datapath") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/20220922024453.437757-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Wed, 21 Sep 2022 20:10:05 +0000 (13:10 -0700)]
bnxt: prevent skb UAF after handing over to PTP worker
When reading the timestamp is required, bnxt_tx_int() hands
over the ownership of the completed skb to the PTP worker.
The skb should not be used afterwards, as the worker may
run before the rest of our code and free the skb, leading
to a use-after-free.
Since dev_kfree_skb_any() accepts NULL, make the loss of
ownership more obvious and set skb to NULL.
Fixes: bdd9258cb478 ("bnxt_en: Transmit and retrieve packet timestamps") Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20220921201005.335390-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
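The ownership hand-over pattern in this fix can be modelled in Python as follows. This is a sketch under stated assumptions: `tx_complete`, `free_skb` and the two lists are hypothetical stand-ins for the completion path, the free routine and the PTP work queue. Only the idea of clearing the local reference after the hand-over, and of a free routine that tolerates NULL, comes from the commit message.

```python
def free_skb(skb, freed):
    """Model of a free routine that accepts None, like dev_kfree_skb_any()."""
    if skb is not None:
        freed.append(skb)


def tx_complete(skb, needs_timestamp, ptp_queue, freed):
    if needs_timestamp:
        ptp_queue.append(skb)  # ownership moves to the PTP worker
        skb = None             # make the loss of ownership explicit
    free_skb(skb, freed)       # no-op when the skb was handed over


ptp_queue, freed = [], []
tx_complete("skb0", True, ptp_queue, freed)
tx_complete("skb1", False, ptp_queue, freed)
print(ptp_queue, freed)
```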
Liang He [Wed, 21 Sep 2022 13:32:45 +0000 (21:32 +0800)]
net: marvell: Fix refcounting bugs in prestera_port_sfp_bind()
In prestera_port_sfp_bind(), there are two refcounting bugs:
(1) we should call of_node_get() before of_find_node_by_name() as
it will automatically decrease the refcount of the 'from' argument;
(2) we should call of_node_put() when breaking out of the
for_each_child_of_node() iteration, as the iterator automatically
increases and decreases the refcount of 'child'.
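Both refcounting rules above can be illustrated with a toy Python model. The `Node` class and the helper bodies are simplified stand-ins that only mirror the refcount behaviour described in the message. They are not the kernel's device tree implementation, and `find_child_balanced` is a hypothetical name.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.refcount = 1


def of_node_get(node):
    node.refcount += 1
    return node


def of_node_put(node):
    node.refcount -= 1


def of_find_node_by_name(from_node, name, nodes):
    # Model: the match gains a reference, 'from' loses one.
    found = next((n for n in nodes if n.name == name), None)
    if found is not None:
        of_node_get(found)
    if from_node is not None:
        of_node_put(from_node)
    return found


def find_child_balanced(children, wanted):
    # Model of for_each_child_of_node(): a reference is held on the current
    # child and dropped when advancing, so an early exit must drop it itself.
    for child in children:
        of_node_get(child)
        if child.name == wanted:
            of_node_put(child)  # the put needed before leaving the loop
            return child.name
        of_node_put(child)
    return None


root, sfp = Node("ports"), Node("sfp")
of_node_get(root)  # rule (1): take a reference before the lookup
match = of_find_node_by_name(root, "sfp", [root, sfp])
of_node_put(match)  # done with the lookup result
print(root.refcount, sfp.refcount)
```

In the model both nodes end with the reference count they started with. Dropping either the of_node_get() or the early-exit of_node_put() would leave a count off by one.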
net: sched: fix possible refcount leak in tc_new_tfilter()
tfilter_put() needs to be called to put the refcount obtained by tp->ops->get to
avoid a possible refcount leak when chain->tmplt_ops != NULL and
chain->tmplt_ops != tp->ops.
Fixes: f80d49a81105 ("net: sched: extend proto ops with 'put' callback") Signed-off-by: Hangyu Hua <hbh25y@gmail.com> Reviewed-by: Vlad Buslov <vladbu@nvidia.com> Link: https://lore.kernel.org/r/20220921092734.31700-1-hbh25y@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sean Anderson [Tue, 20 Sep 2022 23:50:18 +0000 (19:50 -0400)]
net: sunhme: Fix packet reception for len < RX_COPY_THRESHOLD
There is a separate receive path for small packets (under 256 bytes).
Instead of allocating a new dma-capable skb to be used for the next packet,
this path allocates an skb and copies the data into it (reusing the existing
skb for the next packet). There are two bytes of junk data at the beginning
of every packet. I believe these are inserted in order to allow aligned DMA
and IP headers. We skip over them using skb_reserve. Before copying over
the data, we must use a barrier to ensure we see the whole packet. The
current code only synchronizes len bytes, starting from the beginning of
the packet, including the junk bytes. However, this leaves off the final
two bytes in the packet. Synchronize the whole packet.
To reproduce this problem, ping an HME with a payload size between 17 and
214:
$ ping -s 17 <hme_address>
which will complain rather loudly about the data mismatch. Small packets
(below 60 bytes on the wire) do not have this issue. I suspect this is
related to the padding added to increase the minimum packet size.
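The off-by-two in the sync length can be reproduced with a toy Python model of the buffer. This is a sketch, not driver code: `dma_sync_for_cpu` and `receive_small` are hypothetical helpers, stale CPU-visible bytes are modelled as zeros, and treating "the whole packet" as len plus the two junk bytes is an interpretation of the description above.

```python
JUNK = 2  # two junk bytes precede every packet, as described above


def dma_sync_for_cpu(device_buf, cpu_view, nbytes):
    """Make the first nbytes written by the device visible to the CPU."""
    cpu_view[:nbytes] = device_buf[:nbytes]


def receive_small(device_buf, length, sync_bytes):
    cpu_view = bytearray(len(device_buf))       # stale contents: all zeros
    dma_sync_for_cpu(device_buf, cpu_view, sync_bytes)
    return bytes(cpu_view[JUNK:JUNK + length])  # skip the junk, copy the payload


payload = bytes(range(1, 21))                   # a 20-byte packet
device_buf = bytearray(b"\xaa\xaa" + payload)

old = receive_small(device_buf, len(payload), len(payload))         # syncs len bytes
new = receive_small(device_buf, len(payload), len(payload) + JUNK)  # syncs it all
print(old == payload, new == payload)
```

Syncing only len bytes from the start of the buffer covers the junk bytes but misses the last two payload bytes, so the copy ends with stale data. Syncing len + 2 bytes returns the payload intact.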
====================
bonding: fix NULL deref in bond_rr_gen_slave_id
Fix a NULL dereference of the struct bonding.rr_tx_counter member. If a
bond is created with an initial mode != zero (Round Robin),
the memory required for the counter is never allocated, and when the mode is
later changed nothing verifies that the memory is allocated upon
switching modes.
====================
Jonathan Toppins [Tue, 20 Sep 2022 17:45:51 +0000 (13:45 -0400)]
selftests: bonding: cause oops in bond_rr_gen_slave_id
This bonding selftest used to cause a kernel oops on aarch64
and should be architecture agnostic.
Signed-off-by: Jonathan Toppins <jtoppins@redhat.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jonathan Toppins [Tue, 20 Sep 2022 17:45:52 +0000 (13:45 -0400)]
bonding: fix NULL deref in bond_rr_gen_slave_id
Fix a NULL dereference of the struct bonding.rr_tx_counter member. If a
bond is created with an initial mode != zero (Round Robin),
the memory required for the counter is never allocated, and when the mode is
later changed nothing verifies that the memory is allocated upon
switching modes.
The fix is to allocate the memory in bond_open() which is guaranteed
to be called before any packets are processed.
Fixes: 1999b014bd2e ("net: bonding: Use per-cpu rr_tx_counter") CC: Jussi Maki <joamaki@gmail.com> Signed-off-by: Jonathan Toppins <jtoppins@redhat.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
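The failure and the fix can be modelled in Python as below. This is a simplified sketch: the `Bond` class is hypothetical, the per-cpu counter is reduced to a one-element list, and the slave id computation is replaced by a plain increment. Only the idea of allocating the counter in the open path when the mode is Round Robin is taken from the commit message.

```python
BOND_MODE_ROUNDROBIN = 0
BOND_MODE_ACTIVEBACKUP = 1


class Bond:
    def __init__(self, mode):
        self.mode = mode
        self.rr_tx_counter = None  # nothing is allocated at creation here

    def set_mode(self, mode):
        # Switching modes does not allocate anything by itself.
        self.mode = mode

    def open(self):
        # The fix: allocate on open, which runs before any packet is sent.
        if self.mode == BOND_MODE_ROUNDROBIN and self.rr_tx_counter is None:
            self.rr_tx_counter = [0]

    def rr_gen_slave_id(self):
        # Fails on a missing counter, like the NULL dereference.
        self.rr_tx_counter[0] += 1
        return self.rr_tx_counter[0]


bond = Bond(BOND_MODE_ACTIVEBACKUP)
bond.set_mode(BOND_MODE_ROUNDROBIN)
bond.open()
print(bond.rr_gen_slave_id())
```

Without the allocation in `open()`, the last call would fail on the missing counter, which is the model's analogue of the oops.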
Michael Walle [Tue, 20 Sep 2022 14:16:19 +0000 (16:16 +0200)]
net: phy: micrel: fix shared interrupt on LAN8814
Since commit 8a86c40facf2 ("net: phy: micrel: 1588 support for LAN8814
phy") the handler always returns IRQ_HANDLED, except in an error case.
Before that commit, the interrupt status register was checked and if
it was empty, IRQ_NONE was returned. Restore that behavior to play nice
with the interrupt line being shared with others.
Fixes: 8a86c40facf2 ("net: phy: micrel: 1588 support for LAN8814 phy") Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Divya Koppera <Divya.Koppera@microchip.com> Link: https://lore.kernel.org/r/20220920141619.808117-1-michael@walle.cc Signed-off-by: Jakub Kicinski <kuba@kernel.org>
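The restored behaviour can be sketched as a tiny Python model of a handler on a shared interrupt line. The function name and the `read_irq_status` callback are hypothetical, and returning IRQ_NONE on a read error is an assumption. The key point from the commit message is that an empty status register yields IRQ_NONE so other devices on the line get a chance to handle the interrupt.

```python
IRQ_NONE = 0
IRQ_HANDLED = 1


def handle_interrupt(read_irq_status):
    """Return IRQ_HANDLED only if this device actually raised the interrupt."""
    status = read_irq_status()
    if status < 0:
        return IRQ_NONE     # read error (assumed handling)
    if status == 0:
        return IRQ_NONE     # nothing pending here: not our interrupt
    return IRQ_HANDLED      # at least one status bit was set


print(handle_interrupt(lambda: 0), handle_interrupt(lambda: 0x4))
```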
Paolo Abeni [Thu, 22 Sep 2022 13:13:26 +0000 (15:13 +0200)]
Merge branch 'add-wed-support-for-mt7986-chipset'
Lorenzo Bianconi says:
====================
Add WED support for MT7986 chipset
Similar to MT7622, introduce Wireless Ethernet Dispatch (WED) support
for MT7986 chipset in order to offload to the hw packet engine traffic
received from LAN/WAN device to WLAN nic (MT7915E).
====================
Lorenzo Bianconi [Tue, 20 Sep 2022 10:11:20 +0000 (12:11 +0200)]
net: ethernet: mtk_eth_wed: add mtk_wed_configure_irq and mtk_wed_dma_{enable/disable}
Introduce mtk_wed_configure_irq, mtk_wed_dma_enable and mtk_wed_dma_disable
utility routines.
This is a preliminary patch to introduce mt7986 wed support.
Tested-by: Daniel Golle <daniel@makrotopia.org> Co-developed-by: Bo Jiao <Bo.Jiao@mediatek.com> Signed-off-by: Bo Jiao <Bo.Jiao@mediatek.com> Co-developed-by: Sujuan Chen <sujuan.chen@mediatek.com> Signed-off-by: Sujuan Chen <sujuan.chen@mediatek.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Lorenzo Bianconi [Tue, 20 Sep 2022 10:11:13 +0000 (12:11 +0200)]
arm64: dts: mediatek: mt7986: add support for Wireless Ethernet Dispatch
Introduce wed0 and wed1 nodes in order to enable offloading forwarding
between ethernet and wireless devices on the mt7986 chipset.
Co-developed-by: Bo Jiao <Bo.Jiao@mediatek.com> Signed-off-by: Bo Jiao <Bo.Jiao@mediatek.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
====================
Separate SMC parameter settings from TCP sysctls
SMC shares some sysctls with TCP, but considering the difference
between these two protocols, it may not be very suitable for SMC
to reuse TCP parameter settings in some cases, such as keepalive
time or buffer size.
So this patch set aims to introduce some SMC specific sysctls to
independently and flexibly set the parameters that suit SMC.
====================
Tony Lu [Tue, 20 Sep 2022 09:52:22 +0000 (17:52 +0800)]
net/smc: Unbind r/w buffer size from clcsock and make them tunable
Currently, SMC uses smc->sk.sk_{rcv|snd}buf to size the
send buffer and the RMB, and the buffer sizes come from tcp_{w|r}mem
in clcsock.
The buffer size from the TCP socket doesn't fit SMC well. SMC-R/-D generally
needs larger buffers than TCP to reach higher performance, because
the underlying devices and paths differ.
So this patch unbinds the buffer size from TCP, and introduces two sysctl
knobs to tune them independently. Also, these knobs are per net
namespace and work for containers.
Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>