Zong-Zhe Yang [Tue, 9 Aug 2022 10:49:52 +0000 (18:49 +0800)]
wifi: rtw89: early recognize FW feature to decide if chanctx
In current flow, FW is asynchronously loaded after alloc_hw(). It defers
the decision on FW feature map. It makes things difficult for us to decide
whether to hook chanctx ops, which should be decided while alloc_hw() is
calling. Still, asynchronous gets its advantages. So, we want to resolve
this without dropping them.
Based on multi-FW flag, RTW89_MFW_SIG, we can determine runtime FW is
multi-FW (MFW) or single FW (SFW). Both of them have a quite small chunk
for header at the head. The difference is that MFW doesn't describe version
code in its header while SFW does. So, we plan to extend MFW header for
version code. After that, in both cases, we can determine FW feature map by
just FW header. And, according to the map, we can decide chanctx.
So, we call request_partial_firmware_into_buf() to request a quite small
chunk before alloc_hw() to get a early FW feature map without affecting
things much and only use early map to decide whether to hook chanctx ops.
It means that if non-extended MFW is used at runtime, driver just acts
without chanctx as before. If extended MFW or SFW, which supports required
FW features, is used at runtime, driver can hook chanctx ops to mac80211 if
chip has configured support_chanctx_num > 0.
Besides, key point for now to support single one chanctx is whether HW scan
is supported at runtime. So, we configure all chip's support_chanctx_num to
1, and check if HW scan is supported at runtime via early FW feature map.
Zong-Zhe Yang [Tue, 9 Aug 2022 10:49:51 +0000 (18:49 +0800)]
wifi: rtw89: declare support for mac80211 chanctx ops by chip
Some HW features are required if we hook chanctx ops to mac80211. With it,
mac80211 would expect the HW-supported variant ops exists on some behavior,
e.g. HW scan. But, HW features may depend on chip's FW or its development.
Besides, how many chanctx can be supported also depends on chip design.
We can neither decide whether to generally support chanctx ops nor how many
chanctx can be supported.
So, support_chanctx_num is added under chip info to deal with this by chip.
For now, all chip configure support_chanctx_num as 0. We haven't really
hook chanctx ops yet. So, chip can run without mac80211 chanctx as before.
Zong-Zhe Yang [Tue, 9 Aug 2022 10:49:50 +0000 (18:49 +0800)]
wifi: rtw89: add skeleton of mac80211 chanctx ops support
Support mac80211 chanctx series ops. Still, currently support
single channel. Based on this premise, things should be similar
to before. So, we haven't dealt with relationship between vif
and chanctx in depth. Instead, we leave both ::assign_vif()
and ::unassign_vif() as noops for now.
Zong-Zhe Yang [Tue, 9 Aug 2022 10:49:49 +0000 (18:49 +0800)]
wifi: rtw89: introduce entity mode and its recalculated prototype
After supporting more than one channel, we need entity mode to decide
how to set current channel(s) on the sub-entities. This decision may
happen on set_channel() and rtw89_core_set_chip_txpwr().
For now, we support single one channel and use only first HW entry,
i.e. RTW89_SUB_ENTITY_0, RTW89_MAC_0, RTW89_PHY_0. Without something
unexpected, the entity mode should always be RTW89_ENT_MODE_SCC after
recalcated, where SCC means single channel concurrency. So, an assert
is added in set_channel() and rtw89_core_set_chip_txpwr().
Zong-Zhe Yang [Tue, 9 Aug 2022 10:49:48 +0000 (18:49 +0800)]
wifi: rtw89: initialize entity and configure default chandef
While idle, we need a default chandef to set channel for things,
such as scan. Before support of mac80211 chanctx, mac80211 would
configure a default one on ieee80211_hw::conf::chandef. And we
just queried it whenever we did set channel. However, after support
of mac80211 chanctx, the flow won't work like before.
Besides, we don't now query chandef from ieee80211_hw::conf::chandef
either. So, similar to mac80211 without using chanctx, we configure
the default chandef with ieee80211_channel of index 0 in 2GHz.
Although we have not added the support of mac80211 chanctx here,
this configuration should be compatible before that. So, we commit
this ahead.
Zong-Zhe Yang [Tue, 9 Aug 2022 10:49:47 +0000 (18:49 +0800)]
wifi: rtw89: concentrate chandef setting to stack callback
Originally, we didn't support mac80211 chanctx, so it's expected that
ieee80211_hw::conf::chandef would be filled by mac80211. And then, we
could just query it whenever we need the current chandef.
However, we are planing to support mac80211 chanctx. After that, the
above assumption would be broken. So, we adjust a bit ahead to reduce
future works about mac80211 chanctx.
After this, we don't query ieee80211_hw::conf::chandef directly, and
we add a map, entity_map, to HAL to indicate which chandef came from
stack. And it will later be used to recalcate entity mode.
Zong-Zhe Yang [Tue, 9 Aug 2022 10:49:46 +0000 (18:49 +0800)]
wifi: rtw89: concentrate parameter control for setting channel callback
For future support on multiple channels by multiple sub-entities,
we need to manage parameters of each channel instance like rtw89_chan,
rtw89_mac_idx, rtw89_phy_idx. So, we adjust related channel callback
functions and centrally conrtol these parameters in set_channel().
Zong-Zhe Yang [Tue, 9 Aug 2022 10:49:45 +0000 (18:49 +0800)]
wifi: rtw89: rfk: concentrate parameter control while set_channel()
For future support on multiple channels, there will be settings of
multiple sub-entities that we need to control. We don't want such
settings to be scattered all over the place. So, we centrally manage
controls of rtw89_phy_idx for RFK in set_channel().
Zong-Zhe Yang [Tue, 9 Aug 2022 10:49:44 +0000 (18:49 +0800)]
wifi: rtw89: txpwr: concentrate channel related control to top
For future support on multiple channels, it would be disturbing if we
still allow scattered leaf functions of TX power to query and manage
channel related control by themselves.
So, query rtw89_chan only on top functions. Then, pass it via functions
to make sure that the values coming from the same struct rtw89_chan.
Besides, fix rtw8852a_set_txpwr_offset() from rtw8852a_set_txpwr_ctrl()
to rtw8852a_set_txpwr(). TX power offset should consider current band,
so move it to chip_ops::set_txpwr() which will be called every time that
channel is set.
Zong-Zhe Yang [Tue, 9 Aug 2022 10:49:43 +0000 (18:49 +0800)]
wifi: rtw89: create rtw89_chan centrally to avoid breakage
Sometimes we need to write current rtw89_chan outside set_channel(),
e.g. during HW scan, we adjust it to align FW process through C2H.
However, we don't have full parameters to fill entire rtw89_chan.
And it will breakage if we update only part of current rtw89_chan.
That is what we don't want to see because most flows throughout
driver treat rtw89_chan as a whole.
So, we divide struct rtw89_chan to basic part and derived part. The
basic part contains the parameters which we are always able to know.
And the derived part will be calculated by the basic part. Then, a
central function, rtw89_chan_create(), is added to deal with this.
Zong-Zhe Yang [Tue, 9 Aug 2022 10:49:42 +0000 (18:49 +0800)]
wifi: rtw89: re-arrange channel related stuffs under HAL
We are planning to support mac80211 chanctx. To reduce future works,
the driver architecture is adjusted first to isolate related things.
According to chip, our HW may have multiple sub-entities to support
multiple mac80211 chanctx. Struct rtw89_chan has been introduced for
things about channel/band/subband/... Now introduce struct rtw89_chan_rcd
to record difference after assigning new one of struct rtw89_chan.
We will implement and support chanctx with single channel first, i.e.
only use entry in RTW89_SUB_ENTITY_0, before handling dual channels.
Our hierarchy in planning will become as the following.
DEV
-> HAL
---> entity (manage status across sub-entities)
-----> sub-entity[*] (support mac80211 chanctx)
where each sub-entity contains one struct rtw89_chan.
Zong-Zhe Yang [Tue, 9 Aug 2022 10:49:41 +0000 (18:49 +0800)]
wifi: rtw89: introduce rtw89_chan for channel stuffs
Introduce struct rtw89_chan ahead to encapsulate stuffs from struct
rtw89_channel_params. These stuffs have a clone in HAL and are used
throughout driver. After multiple channels support, it's expected that
each channel instance has a configuration of them. So, we refine them
with struct rtw89_chan by precise type first, and will re-arrange HAL
by struct rtw89_chan in the following as well.
Zong-Zhe Yang [Tue, 9 Aug 2022 10:49:40 +0000 (18:49 +0800)]
wifi: rtw89: rewrite decision on channel by entity state
We need to invoke the callback of the changed band at the first
set_channel() after every power-off. Originally, we forced the
channel to be 0 when doing power-off, and then determined things
by comparing channel with 0.
However, deciding on such things by channel might be confusing.
It's also confusing to use this kind of decision when we consider
multiple channels in the follow-up patches. So, another flag,
entity_active, is added ahead to HAL to deal with this.
Besides, we also need to check if entity is active when we set
TX power.
Tetsuo Handa [Tue, 16 Aug 2022 14:46:13 +0000 (23:46 +0900)]
wifi: ath9k: avoid uninit memory read in ath9k_htc_rx_msg()
syzbot is reporting uninit value at ath9k_htc_rx_msg() [1], for
ioctl(USB_RAW_IOCTL_EP_WRITE) can call ath9k_hif_usb_rx_stream() with
pkt_len = 0 but ath9k_hif_usb_rx_stream() uses
__dev_alloc_skb(pkt_len + 32, GFP_ATOMIC) based on an assumption that
pkt_len is valid. As a result, ath9k_hif_usb_rx_stream() allocates skb
with uninitialized memory and ath9k_htc_rx_msg() is reading from
uninitialized memory.
Since bytes accessed by ath9k_htc_rx_msg() is not known until
ath9k_htc_rx_msg() is called, it would be difficult to check minimal valid
pkt_len at "if (pkt_len > 2 * MAX_RX_BUF_SIZE) {" line in
ath9k_hif_usb_rx_stream().
We have two choices. One is to workaround by adding __GFP_ZERO so that
ath9k_htc_rx_msg() sees 0 if pkt_len is invalid. The other is to let
ath9k_htc_rx_msg() validate pkt_len before accessing. This patch chose
the latter.
Note that I'm not sure threshold condition is correct, for I can't find
details on possible packet length used by this protocol.
David S. Miller [Fri, 26 Aug 2022 10:56:55 +0000 (11:56 +0100)]
Merge tag 'wireless-next-2022-08-26-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes berg says:
====================
Various updates:
* rtw88: operation, locking, warning, and code style fixes
* rtw89: small updates
* cfg80211/mac80211: more EHT/MLO (802.11be, WiFi 7) work
* brcmfmac: a couple of fixes
* misc cleanups etc.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The Lenovo OneLink+ Dock contains an RTL8153 controller that behaves as
a broken CDC device by default. Add the custom Lenovo PID to the r8152
driver to support it properly.
Also, systems compatible with this dock provide a BIOS option to enable
MAC address passthrough (as per Lenovo document "ThinkPad Docking
Solutions 2017"). Add the custom PID to the MAC passthrough list too.
Tested on a ThinkPad 13 1st gen with the expected results:
passthrough disabled: Invalid header when reading pass-thru MAC addr
passthrough enabled: Using pass-thru MAC addr XX:XX:XX:XX:XX:XX
Signed-off-by: Jean-Francois Le Fillatre <jflf_kernel@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ping-Ke Shih [Mon, 15 Aug 2022 06:20:04 +0000 (14:20 +0800)]
wifi: rtw88: fix uninitialized use of primary channel index
clang reports uninitialized use:
>> drivers/net/wireless/realtek/rtw88/main.c:731:2: warning: variable
'primary_channel_idx' is used uninitialized whenever switch default is
taken [-Wsometimes-uninitialized]
default:
^~~~~~~
drivers/net/wireless/realtek/rtw88/main.c:754:39: note: uninitialized
use occurs here
hal->current_primary_channel_index = primary_channel_idx;
^~~~~~~~~~~~~~~~~~~
drivers/net/wireless/realtek/rtw88/main.c:687:24: note: initialize the
variable 'primary_channel_idx' to silence this warning
u8 primary_channel_idx;
^
= '\0'
This situation could not happen, because possible channel bandwidth
20/40/80MHz are enumerated.
Fixes: 02226ad86faa ("wifi: rtw88: add the update channel flow to support setting by parameters") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20220815062004.22920-1-pkshih@realtek.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Qingfang DENG [Wed, 24 Aug 2022 06:10:34 +0000 (14:10 +0800)]
net: phylink: allow RGMII/RTBI in-band status
As per RGMII specification v2.0, section 3.4.1, RGMII/RTBI has an
optional in-band status feature where the PHY's link status, speed and
duplex mode can be passed to the MAC.
Allow RGMII/RTBI to use in-band status.
Signed-off-by: Qingfang DENG <dqfext@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 26 Aug 2022 09:04:55 +0000 (10:04 +0100)]
Merge branch 'prestera-matchall'
Maksym Glubokiy says:
====================
net: prestera: matchall features
This patch series extracts matchall rules management out of SPAN API
implementation and adds 2 features on top of that:
- support for egress traffic (mirred egress action)
- proper rule priorities management between matchall and flower
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Maksym Glubokiy [Tue, 23 Aug 2022 11:39:58 +0000 (14:39 +0300)]
net: prestera: manage matchall and flower priorities
matchall rules can be added only to chain 0 and their priorities have
limitations:
- new matchall ingress rule's priority must be higher (lower value)
than any existing flower rule;
- new matchall egress rule's priority must be lower (higher value)
than any existing flower rule.
The opposite works for flower rule adding:
- new flower ingress rule's priority must be lower (higher value)
than any existing matchall rule;
- new flower egress rule's priority must be higher (lower value)
than any existing matchall rule.
This is a hardware limitation and thus must be properly handled in
driver by reporting errors to the user when newly added rule has such a
priority that cannot be installed into the hardware.
To achieve this, the driver must maintain both min/max matchall
priorities for every flower block when user adds/deletes a matchall
rule, as well as both min/max flower priorities for chain 0 for every
adding/deletion of flower rules for chain 0.
Cc: Serhiy Boiko <serhiy.boiko@plvision.eu> Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Serhiy Boiko <serhiy.boiko@plvision.eu> Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
Serhiy Boiko [Tue, 23 Aug 2022 11:39:56 +0000 (14:39 +0300)]
net: prestera: acl: extract matchall logic into a separate file
This commit adds more clarity to handling of TC_CLSMATCHALL_REPLACE and
TC_CLSMATCHALL_DESTROY events by calling newly added *_mall_*() handlers
instead of directly calling SPAN API.
This also extracts matchall rules management out of SPAN API since SPAN
is a hardware module which is used to implement 'matchall egress mirred'
action only.
Signed-off-by: Taras Chornyi <tchornyi@marvell.com> Signed-off-by: Serhiy Boiko <serhiy.boiko@plvision.eu> Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel [Mon, 22 Aug 2022 12:39:42 +0000 (14:39 +0200)]
net: asix: ax88772: migrate to phylink
There are some exotic ax88772 based devices which may require
functionality provide by the phylink framework. For example:
- US100A20SFP, USB 2.0 auf LWL Converter with SFP Cage
- AX88772B USB to 100Base-TX Ethernet (with RMII) demo board, where it
is possible to switch between internal PHY and external RMII based
connection.
So, convert this driver to phylink as soon as possible.
Wolfram Sang [Thu, 18 Aug 2022 21:02:23 +0000 (23:02 +0200)]
wifi: mac80211: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.
Johannes Berg [Thu, 25 Aug 2022 18:57:32 +0000 (20:57 +0200)]
wifi: mac80211: correct SMPS mode in HE 6 GHz capability
If we add 6 GHz capability in MLO, we cannot use the SMPS
mode from the deflink. Pass it separately instead since on
a second link we don't even have a link data struct yet.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Rob Herring [Thu, 25 Aug 2022 19:26:07 +0000 (14:26 -0500)]
dt-bindings: net: Add missing (unevaluated|additional)Properties on child nodes
In order to ensure only documented properties are present, node schemas
must have unevaluatedProperties or additionalProperties set to false
(typically). Add missing properties/$refs as exposed by this addition.
drivers/net/ethernet/mellanox/mlx5/core/en_fs.c c3683bad5699 ("net/mlx5e: Fix use after free in mlx5e_fs_init()") 8c2600bce6ce ("net/mlx5e: Convert ethtool_steering member of flow_steering struct to pointer")
https://lore.kernel.org/all/20220825104410.67d4709c@canb.auug.org.au/
https://lore.kernel.org/all/20220823055533.334471-1-saeed@kernel.org/
Uros Bizjak [Mon, 22 Aug 2022 14:32:43 +0000 (16:32 +0200)]
netdev: Use try_cmpxchg in napi_if_scheduled_mark_missed
Use try_cmpxchg instead of cmpxchg (*ptr, old, new) == old in
napi_if_scheduled_mark_missed. x86 CMPXCHG instruction returns
success in ZF flag, so this change saves a compare after cmpxchg
(and related move instruction in front of cmpxchg).
Also, try_cmpxchg implicitly assigns old *ptr value to "old" when cmpxchg
fails, enabling further code simplifications.
Linus Torvalds [Thu, 25 Aug 2022 21:03:58 +0000 (14:03 -0700)]
Merge tag 'net-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from ipsec and netfilter (with one broken Fixes tag).
Current release - new code bugs:
- dsa: don't dereference NULL extack in dsa_slave_changeupper()
- dpaa: fix <1G ethernet on LS1046ARDB
- neigh: don't call kfree_skb() under spin_lock_irqsave()
Previous releases - regressions:
- r8152: fix the RX FIFO settings when suspending
- dsa: microchip: keep compatibility with device tree blobs with no
phy-mode
- Revert "net: macsec: update SCI upon MAC address change."
- Revert "xfrm: update SA curlft.use_time", comply with RFC 2367
Previous releases - always broken:
- netfilter: conntrack: work around exceeded TCP receive window
- ipsec: fix a null pointer dereference of dst->dev on a metadata dst
in xfrm_lookup_with_ifid
- moxa: get rid of asymmetry in DMA mapping/unmapping
- dsa: microchip: make learning configurable and keep it off while
standalone
- ice: xsk: prohibit usage of non-balanced queue id
- rxrpc: fix locking in rxrpc's sendmsg
Misc:
- another chunk of sysctl data race silencing"
* tag 'net-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
net: lantiq_xrx200: restore buffer if memory allocation failed
net: lantiq_xrx200: fix lock under memory pressure
net: lantiq_xrx200: confirm skb is allocated before using
net: stmmac: work around sporadic tx issue on link-up
ionic: VF initial random MAC address if no assigned mac
ionic: fix up issues with handling EAGAIN on FW cmds
ionic: clear broken state on generation change
rxrpc: Fix locking in rxrpc's sendmsg
net: ethernet: mtk_eth_soc: fix hw hash reporting for MTK_NETSYS_V2
MAINTAINERS: rectify file entry in BONDING DRIVER
i40e: Fix incorrect address type for IPv6 flow rules
ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter
net: Fix a data-race around sysctl_somaxconn.
net: Fix a data-race around netdev_unregister_timeout_secs.
net: Fix a data-race around gro_normal_batch.
net: Fix data-races around sysctl_devconf_inherit_init_net.
net: Fix data-races around sysctl_fb_tunnels_only_for_init_net.
net: Fix a data-race around netdev_budget_usecs.
net: Fix data-races around sysctl_max_skb_frags.
net: Fix a data-race around netdev_budget.
...
====================
net: devlink: sync flash and dev info commands
Purpose of this patchset is to introduce consistency between two devlink
commands:
devlink dev info
Shows versions of running default flash target and components.
devlink dev flash
Flashes default flash target or component name (if specified
on cmdline).
Currently it is up to the driver what versions to expose and what flash
update component names to accept. This is inconsistent. Thankfully, only
netdevsim currently using components so it is still time
to sanitize this.
This patchset makes sure, that devlink.c calls into driver for
component flash update only in case the driver exposes the same version
name.
Example:
$ devlink dev info
netdevsim/netdevsim10:
driver netdevsim
versions:
running:
fw.mgmt 10.20.30
stored:
fw.mgmt 10.20.30
$ devlink dev flash netdevsim/netdevsim10 file somefile.bin
[fw.mgmt] Preparing to flash
[fw.mgmt] Flashing 100%
[fw.mgmt] Flash select
[fw.mgmt] Flashing done
$ devlink dev flash netdevsim/netdevsim10 file somefile.bin component fw.mgmt
[fw.mgmt] Preparing to flash
[fw.mgmt] Flashing 100%
[fw.mgmt] Flash select
[fw.mgmt] Flashing done
$ devlink dev flash netdevsim/netdevsim10 file somefile.bin component dummy
Error: selected component is not supported by this device.
====================
Jiri Pirko [Wed, 24 Aug 2022 12:20:11 +0000 (14:20 +0200)]
net: devlink: limit flash component name to match version returned by info_get()
Limit the acceptance of component name passed to cmd_flash_update() to
match one of the versions returned by info_get(), marked by version type.
This makes things clearer and enforces 1:1 mapping between exposed
version and accepted flash component.
Check VERSION_TYPE_COMPONENT version type during cmd_flash_update()
execution by calling info_get() with different "req" context.
That causes info_get() to lookup the component name instead of
filling-up the netlink message.
Remove "UPDATE_COMPONENT" flag which becomes used.
Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 24 Aug 2022 12:20:10 +0000 (14:20 +0200)]
netdevsim: add version fw.mgmt info info_get() and mark as a component
Fix the only component user which is netdevsim. It uses component named
"fw.mgmt" in selftests. So add this version to info_get() output with
version type component.
Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jiri Pirko [Wed, 24 Aug 2022 12:20:09 +0000 (14:20 +0200)]
net: devlink: extend info_get() version put to indicate a flash component
Whenever the driver is called by his info_get() op, it may put multiple
version names and values to the netlink message. Extend by additional
helper devlink_info_version_running/stored_put_ext() that allows to
specify a version type that indicates when particular version name
represents a flash component.
This is going to be used in follow-up patch calling info_get() during
flash update command checking if version with this the version type
exists.
Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
====================
net: lantiq_xrx200: fix errors under memory pressure
This series fixes issues that can occur in the driver under memory pressure.
Situations when the system cannot allocate memory are rare, so the mentioned
bugs have been fixed recently. The patches have been tested on a BT Home
router with the Lantiq xRX200 chipset.
Changelog:
v3: - removed netdev_err() log from the first patch
v2:
- the second patch has been changed, so that under memory pressure situation
the driver will not receive packets indefinitely regardless of the NAPI budget,
- the third patch has been added.
====================
net: lantiq_xrx200: restore buffer if memory allocation failed
In a situation where memory allocation fails, an invalid buffer address
is stored. When this descriptor is used again, the system panics in the
build_skb() function when accessing memory.
Fixes: aaabf04caa78 ("lantiq: net: fix duplicated skb in rx descriptor ring") Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net: lantiq_xrx200: fix lock under memory pressure
When the xrx200_hw_receive() function returns -ENOMEM, the NAPI poll
function immediately returns an error.
This is incorrect for two reasons:
* the function terminates without enabling interrupts or scheduling NAPI,
* the error code (-ENOMEM) is returned instead of the number of received
packets.
After the first memory allocation failure occurs, packet reception is
locked due to disabled interrupts from DMA..
Fixes: 462a72fd466e ("net: lantiq: Add Lantiq / Intel VRX200 Ethernet driver") Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
net: lantiq_xrx200: confirm skb is allocated before using
xrx200_hw_receive() assumes build_skb() always works and goes straight
to skb_reserve(). However, build_skb() can fail under memory pressure.
Add a check in case build_skb() failed to allocate and return NULL.
Fixes: 0edcf29e78ce ("net: lantiq_xrx200: convert to build_skb") Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Wed, 24 Aug 2022 20:34:49 +0000 (22:34 +0200)]
net: stmmac: work around sporadic tx issue on link-up
This is a follow-up to the discussion in [0]. It seems to me that
at least the IP version used on Amlogic SoC's sometimes has a problem
if register MAC_CTRL_REG is written whilst the chip is still processing
a previous write. But that's just a guess.
Adding a delay between two writes to this register helps, but we can
also simply omit the offending second write. This patch uses the second
approach and is based on a suggestion from Qi Duan.
Benefit of this approach is that we can save few register writes, also
on not affected chip versions.
Jakub Kicinski [Thu, 25 Aug 2022 19:40:29 +0000 (12:40 -0700)]
Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-08-24 (ixgbe, i40e)
This series contains updates to ixgbe and i40e drivers.
Jake stops incorrect resetting of SYSTIME registers when starting
cyclecounter for ixgbe.
Sylwester corrects a check on source IP address when validating destination
for i40e.
* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
i40e: Fix incorrect address type for IPv6 flow rules
ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter
====================
Jakub Kicinski [Thu, 25 Aug 2022 19:40:17 +0000 (12:40 -0700)]
Merge branch 'ionic-bug-fixes'
Shannon Nelson says:
====================
ionic: bug fixes
These are a couple of maintenance bug fixes for the Pensando ionic
networking driver.
Mohamed takes care of a "plays well with others" issue where the
VF spec is a bit vague on VF mac addresses, but certain customers
have come to expect behavior based on other vendor drivers.
Shannon addresses a couple of corner cases seen in internal
stress testing.
====================
R Mohamed Shah [Wed, 24 Aug 2022 16:50:51 +0000 (09:50 -0700)]
ionic: VF initial random MAC address if no assigned mac
Assign a random mac address to the VF interface station
address if it boots with a zero mac address in order to match
similar behavior seen in other VF drivers. Handle the errors
where the older firmware does not allow the VF to set its own
station address.
Newer firmware will allow the VF to set the station mac address
if it hasn't already been set administratively through the PF.
Setting it will also be allowed if the VF has trust.
Fixes: 891d6f039073 ("ionic: support sr-iov operations") Signed-off-by: R Mohamed Shah <mohamed@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Shannon Nelson [Wed, 24 Aug 2022 16:50:50 +0000 (09:50 -0700)]
ionic: fix up issues with handling EAGAIN on FW cmds
In looping on FW update tests we occasionally see the
FW_ACTIVATE_STATUS command fail while it is in its EAGAIN loop
waiting for the FW activate step to finsh inside the FW. The
firmware is complaining that the done bit is set when a new
dev_cmd is going to be processed.
Doing a clean on the cmd registers and doorbell before exiting
the wait-for-done and cleaning the done bit before the sleep
prevents this from occurring.
Fixes: 22c2f0b535b9 ("ionic: Add hardware init and device commands") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Shannon Nelson [Wed, 24 Aug 2022 16:50:49 +0000 (09:50 -0700)]
ionic: clear broken state on generation change
There is a case found in heavy testing where a link flap happens just
before a firmware Recovery event and the driver gets stuck in the
BROKEN state. This comes from the driver getting interrupted by a FW
generation change when coming back up from the link flap, and the call
to ionic_start_queues() in ionic_link_status_check() fails. This can be
addressed by having the fw_up code clear the BROKEN bit if seen, rather
than waiting for a user to manually force the interface down and then
back up.
Fixes: 1600eba726cb ("ionic: stop watchdog when in broken state") Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Howells [Wed, 24 Aug 2022 16:35:45 +0000 (17:35 +0100)]
rxrpc: Fix locking in rxrpc's sendmsg
Fix three bugs in the rxrpc's sendmsg implementation:
(1) rxrpc_new_client_call() should release the socket lock when returning
an error from rxrpc_get_call_slot().
(2) rxrpc_wait_for_tx_window_intr() will return without the call mutex
held in the event that we're interrupted by a signal whilst waiting
for tx space on the socket or relocking the call mutex afterwards.
Fix this by: (a) moving the unlock/lock of the call mutex up to
rxrpc_send_data() such that the lock is not held around all of
rxrpc_wait_for_tx_window*() and (b) indicating to higher callers
whether we're return with the lock dropped. Note that this means
recvmsg() will not block on this call whilst we're waiting.
(3) After dropping and regaining the call mutex, rxrpc_send_data() needs
to go and recheck the state of the tx_pending buffer and the
tx_total_len check in case we raced with another sendmsg() on the same
call.
Thinking on this some more, it might make sense to have different locks for
sendmsg() and recvmsg(). There's probably no need to make recvmsg() wait
for sendmsg(). It does mean that recvmsg() can return MSG_EOR indicating
that a call is dead before a sendmsg() to that call returns - but that can
currently happen anyway.
Without fix (2), something like the following can be induced:
WARNING: bad unlock balance detected!
5.16.0-rc6-syzkaller #0 Not tainted
-------------------------------------
syz-executor011/3597 is trying to release lock (&call->user_mutex) at:
[<ffffffff885163a3>] rxrpc_do_sendmsg+0xc13/0x1350 net/rxrpc/sendmsg.c:748
but there are no more locks to release!
other info that might help us debug this:
no locks held by syz-executor011/3597.
...
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_unlock_imbalance_bug include/trace/events/lock.h:58 [inline]
__lock_release kernel/locking/lockdep.c:5306 [inline]
lock_release.cold+0x49/0x4e kernel/locking/lockdep.c:5657
__mutex_unlock_slowpath+0x99/0x5e0 kernel/locking/mutex.c:900
rxrpc_do_sendmsg+0xc13/0x1350 net/rxrpc/sendmsg.c:748
rxrpc_sendmsg+0x420/0x630 net/rxrpc/af_rxrpc.c:561
sock_sendmsg_nosec net/socket.c:704 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:724
____sys_sendmsg+0x6e8/0x810 net/socket.c:2409
___sys_sendmsg+0xf3/0x170 net/socket.c:2463
__sys_sendmsg+0xe5/0x1b0 net/socket.c:2492
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
[Thanks to Hawkins Jiawei and Khalid Masum for their attempts to fix this]
Fixes: 8d0c6fe295af ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals") Reported-by: syzbot+7f0483225d0c94cb3441@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Tested-by: syzbot+7f0483225d0c94cb3441@syzkaller.appspotmail.com
cc: Hawkins Jiawei <yin31149@gmail.com>
cc: Khalid Masum <khalid.masum.92@gmail.com>
cc: Dan Carpenter <dan.carpenter@oracle.com>
cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/166135894583.600315.7170979436768124075.stgit@warthog.procyon.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Thu, 25 Aug 2022 17:52:16 +0000 (10:52 -0700)]
Merge tag 'cgroup-for-6.0-rc2-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull another cgroup fix from Tejun Heo:
"Commit baf59de3daeb ("cgroup: Fix threadgroup_rwsem <->
cpus_read_lock() deadlock") required the cgroup
core to grab cpus_read_lock() before invoking ->attach().
Unfortunately, it missed adding cpus_read_lock() in
cgroup_attach_task_all(). Fix it"
* tag 'cgroup-for-6.0-rc2-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: Add missing cpus_read_lock() to cgroup_attach_task_all()
Tetsuo Handa [Thu, 25 Aug 2022 08:38:38 +0000 (17:38 +0900)]
cgroup: Add missing cpus_read_lock() to cgroup_attach_task_all()
syzbot is hitting percpu_rwsem_assert_held(&cpu_hotplug_lock) warning at
cpuset_attach() [1], for commit baf59de3daebd355 ("cgroup: Fix
threadgroup_rwsem <-> cpus_read_lock() deadlock") missed that
cpuset_attach() is also called from cgroup_attach_task_all().
Add cpus_read_lock() like what cgroup_procs_write_start() does.
Zhengchao Shao [Wed, 24 Aug 2022 00:52:31 +0000 (08:52 +0800)]
net: sched: delete duplicate cleanup of backlog and qlen
qdisc_reset() is clearing qdisc->q.qlen and qdisc->qstats.backlog
_after_ calling qdisc->ops->reset. There is no need to clear them
again in the specific reset function.
Wenjuan Geng [Tue, 23 Aug 2022 09:01:22 +0000 (11:01 +0200)]
nfp: flower: support case of match on ct_state(0/0x3f)
is_post_ct_flow() function will process only ct_state ESTABLISHED,
then offload_pre_check() function will check FLOW_DISSECTOR_KEY_CT flag.
When config tc filter match ct_state(0/0x3f), dissector->used_keys
with FLOW_DISSECTOR_KEY_CT bit, function offload_pre_check() will
return false, so not offload. This is a special case that can be handled
safely.
Therefore, modify to let initial packet which won't go through conntrack
can be offloaded, as long as the cared ct fields are all zero.
wifi: mac80211: allow bw change during channel switch in mesh
From 'IEEE Std 802.11-2020 section 11.8.8.4.1':
The mesh channel switch may be triggered by the need to avoid
interference to a detected radar signal, or to reassign mesh STA
channels to ensure the MBSS connectivity.
A 20/40 MHz MBSS may be changed to a 20 MHz MBSS and a 20 MHz
MBSS may be changed to a 20/40 MHz MBSS.
Since the standard allows the change of bandwidth during
the channel switch in mesh, remove the bandwidth check present in
ieee80211_set_csa_beacon.
Lukas Bulwahn [Fri, 12 Aug 2022 10:31:26 +0000 (12:31 +0200)]
wifi: mac80211: clean up a needless assignment in ieee80211_sta_activate_link()
Commit 18a1b8dae51b ("wifi: mac80211: sta_info: fix link_sta insertion")
makes ieee80211_sta_activate_link() return 0 in the 'hash' label case.
Hence, setting ret in the !test_sta_flag(...) branch to zero is not needed
anymore and can be dropped.
Johannes Berg [Wed, 24 Aug 2022 10:37:32 +0000 (12:37 +0200)]
wifi: mac80211: allow link address A2 in TXQ dequeue
In ieee80211_tx_dequeue() we currently allow a control port
frame to be transmitted on a non-authorized port only if the
A2 matches the local interface address, but if that's an MLD
and the peer is a legacy peer, we need to allow link address
here. Fix that.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Wed, 24 Aug 2022 10:30:16 +0000 (12:30 +0200)]
wifi: mac80211: fix control port frame addressing
For an AP interface, when userspace specifieds the link ID to
transmit the control port frame on (in particular for the
initial 4-way-HS), due to the logic in ieee80211_build_hdr()
for a frame transmitted from/to an MLD, we currently build a
header with
A1 = DA = MLD address of the peer MLD
A2 = local link address (!)
A3 = SA = local MLD address
This clearly makes no sense, and leads to two problems:
- if the frame were encrypted (not true for the initial
4-way-HS) the AAD would be calculated incorrectly
- if iTXQs are used, the frame is dropped by logic in
ieee80211_tx_dequeue()
Fix the addressing, which fixes the first bullet, and the
second bullet for peer MLDs, I'll fix the second one for
non-MLD peers separately.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Fri, 19 Aug 2022 12:58:42 +0000 (14:58 +0200)]
wifi: mac80211_hwsim: fix link change handling
The code for determining which links to update in wmediumd
or virtio was wrong, fix it to remove the deflink only if
there were no old links, and also add the deflink if there
are no other new links.
Fixes: e7c0c32290d4 ("wifi: mac80211_hwsim: handle links for wmediumd/virtio") Signed-off-by: Johannes Berg <johannes.berg@intel.com>
For AP/non-AP the EHT MCS/NSS subfield size differs, the
4-octet subfield is only used for 20 MHz-only non-AP STA.
Pass an argument around everywhere to be able to parse it
properly.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Tue, 16 Aug 2022 09:22:46 +0000 (11:22 +0200)]
wifi: mac80211_hwsim: split iftype data into AP/non-AP
The next patch will require splitting the data here into
AP and non-AP because for EHT, the format of the MCS/NSS
support (struct ieee80211_eht_mcs_nss_supp) is different,
for AP the only_20mhz cannot be used.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Draft P802.11be_D2.1, section 35.3.17 states that the EML Capabilities
Field shouldn't be included in case the device doesn't have support for
EMLSR or EMLMR.
Fixes: ab36dd033db8 ("wifi: mac80211: support MLO authentication/association with one link") Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Wed, 17 Aug 2022 19:57:19 +0000 (21:57 +0200)]
wifi: mac80211: use link ID for MLO in queued frames
When queuing frames to an interface store the link ID we
determined (which possibly came from the driver in the
RX status in the first place) in the RX status, and use
it in the MLME code to send probe responses, beacons and
CSA frames to the right link.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
wifi: mac80211: add link information in ieee80211_rx_status
In MLO, when the address translation from link to MLD is done
in fw/hw, it is necessary to be able to have some information
on the link on which the frame has been received. Extend the
rx API to include link_id and a valid flag in ieee80211_rx_status.
Also make chanes to mac80211 rx APIs to make use of the reported
link_id after sanity checks.
wifi: cfg80211: Add link_id parameter to various key operations for MLO
Add support for various key operations on MLD by adding new parameter
link_id. Pass the link_id received from userspace to driver for add_key,
get_key, del_key, set_default_key, set_default_mgmt_key and
set_default_beacon_key to support configuring keys specific to each MLO
link. Userspace must not specify link ID for MLO pairwise key since it
is common for all the MLO links.
wifi: cfg80211: Prevent cfg80211_wext_siwencodeext() on MLD
Currently, MLO support is not added for WEXT code and WEXT handlers are
prevented on MLDs. Prevent WEXT handler cfg80211_wext_siwencodeext()
also on MLD which is missed in commit d624b47de5de ("wifi: cfg80211: do
some rework towards MLO link APIs")
Shaul Triebitz [Mon, 1 Aug 2022 11:12:29 +0000 (14:12 +0300)]
wifi: cfg80211: get correct AP link chandef
When checking for channel regulatory validity, use the
AP link chandef (and not mesh's chandef).
Fixes: d624b47de5de ("wifi: cfg80211: do some rework towards MLO link APIs") Signed-off-by: Shaul Triebitz <shaul.triebitz@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Ilan Peer [Wed, 3 Aug 2022 15:02:56 +0000 (18:02 +0300)]
wifi: cfg80211: Update RNR parsing to align with Draft P802.11be_D2.0
Based on changes in the specification the TBTT information in
the RNR can include MLD information, so update the parsing to
allow extracting the short SSID information in such a case.
Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Johannes Berg [Sat, 23 Jul 2022 20:08:49 +0000 (22:08 +0200)]
wifi: mac80211: accept STA changes without link changes
If there's no link ID, then check that there are no changes to
the link, and if so accept them, unless a new link is created.
While at it, reject creating a new link without an address.
This fixes authorizing an MLD (peer) that has no link 0.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Richard Gobert [Tue, 23 Aug 2022 07:10:49 +0000 (09:10 +0200)]
net: gro: skb_gro_header helper function
Introduce a simple helper function to replace a common pattern.
When accessing the GRO header, we fetch the pointer from frag0,
then test its validity and fetch it from the skb when necessary.
This leads to the pattern
skb_gro_header_fast -> skb_gro_header_hard -> skb_gro_header_slow
recurring many times throughout GRO code.
This patch replaces these patterns with a single inlined function
call, improving code readability.
====================
Add a second bind table hashed by port and address
Currently, there is one bind hashtable (bhash) that hashes by port only.
This patchset adds a second bind table (bhash2) that hashes by port and
address.
The motivation for adding bhash2 is to expedite bind requests in situations
where the port has many sockets in its bhash table entry (eg a large number
of sockets bound to different addresses on the same port), which makes checking
bind conflicts costly especially given that we acquire the table entry spinlock
while doing so, which can cause softirq cpu lockups and can prevent new tcp
connections.
We ran into this problem at Meta where the traffic team binds a large number
of IPs to port 443 and the bind() call took a significant amount of time
which led to cpu softirq lockups, which caused packet drops and other failures
on the machine.
When experimentally testing this on a local server for ~24k sockets bound to
the port, the results seen were:
ipv4:
before - 0.002317 seconds
with bhash2 - 0.000020 seconds
ipv6:
before - 0.002431 seconds
with bhash2 - 0.000021 seconds
The additions to the initial bhash2 submission [0] are:
* Updating bhash2 in the cases where a socket's rcv saddr changes after it has
* been bound
* Adding locks for bhash2 hashbuckets
Joanne Koong [Mon, 22 Aug 2022 18:10:23 +0000 (11:10 -0700)]
selftests/net: Add sk_bind_sendto_listen and sk_connect_zero_addr
This patch adds 2 new tests: sk_bind_sendto_listen and
sk_connect_zero_addr.
The sk_bind_sendto_listen test exercises the path where a socket's
rcv saddr changes after it has been added to the binding tables,
and then a listen() on the socket is invoked. The listen() should
succeed.
The sk_bind_sendto_listen test is copied over from one of syzbot's
tests: https://syzkaller.appspot.com/x/repro.c?x=1673a38df00000
The sk_connect_zero_addr test exercises the path where the socket was
never previously added to the binding tables and it gets assigned a
saddr upon a connect() to address 0.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Joanne Koong [Mon, 22 Aug 2022 18:10:22 +0000 (11:10 -0700)]
selftests/net: Add test for timing a bind request to a port with a populated bhash entry
This test populates the bhash table for a given port with
MAX_THREADS * MAX_CONNECTIONS sockets, and then times how long
a bind request on the port takes.
When populating the bhash table, we create the sockets and then bind
the sockets to the same address and port (SO_REUSEADDR and SO_REUSEPORT
are set). When timing how long a bind on the port takes, we bind on a
different address without SO_REUSEPORT set. We do not set SO_REUSEPORT
because we are interested in the case where the bind request does not
go through the tb->fastreuseport path, which is fragile (eg
tb->fastreuseport path does not work if binding with a different uid).
To run the script:
Usage: ./bind_bhash.sh [-6 | -4] [-p port] [-a address]
6: use ipv6
4: use ipv4
port: Port number
address: ip address
Without any arguments, ./bind_bhash.sh defaults to ipv6 using ip address
"2001:0db8:0:f101::1" on port 443.
On my local machine, I see:
ipv4:
before - 0.002317 seconds
with bhash2 - 0.000020 seconds
ipv6:
before - 0.002431 seconds
with bhash2 - 0.000021 seconds
Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Joanne Koong [Mon, 22 Aug 2022 18:10:21 +0000 (11:10 -0700)]
net: Add a bhash2 table hashed by port and address
The current bind hashtable (bhash) is hashed by port only.
In the socket bind path, we have to check for bind conflicts by
traversing the specified port's inet_bind_bucket while holding the
hashbucket's spinlock (see inet_csk_get_port() and
inet_csk_bind_conflict()). In instances where there are tons of
sockets hashed to the same port at different addresses, the bind
conflict check is time-intensive and can cause softirq cpu lockups,
as well as stops new tcp connections since __inet_inherit_port()
also contests for the spinlock.
This patch adds a second bind table, bhash2, that hashes by
port and sk->sk_rcv_saddr (ipv4) and sk->sk_v6_rcv_saddr (ipv6).
Searching the bhash2 table leads to significantly faster conflict
resolution and less time holding the hashbucket spinlock.
Please note a few things:
* There can be the case where the a socket's address changes after it
has been bound. There are two cases where this happens:
1) The case where there is a bind() call on INADDR_ANY (ipv4) or
IPV6_ADDR_ANY (ipv6) and then a connect() call. The kernel will
assign the socket an address when it handles the connect()
2) In inet_sk_reselect_saddr(), which is called when rebuilding the
sk header and a few pre-conditions are met (eg rerouting fails).
In these two cases, we need to update the bhash2 table by removing the
entry for the old address, and add a new entry reflecting the updated
address.
* The bhash2 table must have its own lock, even though concurrent
accesses on the same port are protected by the bhash lock. Bhash2 must
have its own lock to protect against cases where sockets on different
ports hash to different bhash hashbuckets but to the same bhash2
hashbucket.
This brings up a few stipulations:
1) When acquiring both the bhash and the bhash2 lock, the bhash2 lock
will always be acquired after the bhash lock and released before the
bhash lock is released.
2) There are no nested bhash2 hashbucket locks. A bhash2 lock is always
acquired+released before another bhash2 lock is acquired+released.
* The bhash table cannot be superseded by the bhash2 table because for
bind requests on INADDR_ANY (ipv4) or IPV6_ADDR_ANY (ipv6), every socket
bound to that port must be checked for a potential conflict. The bhash
table is the only source of port->socket associations.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
1) Fix crash with malformed ebtables blob which do not provide all
entry points, from Florian Westphal.
2) Fix possible TCP connection clogging up with default 5-days
timeout in conntrack, from Florian.
3) Fix crash in nf_tables tproxy with unsupported chains, also from Florian.
4) Do not allow to update implicit chains.
5) Make table handle allocation per-netns to fix data race.
6) Do not truncated payload length and offset, and checksum offset.
Instead report EINVAl.
7) Enable chain stats update via static key iff no error occurs.
8) Restrict osf expression to ip, ip6 and inet families.
9) Restrict tunnel expression to netdev family.
10) Fix crash when trying to bind again an already bound chain.
11) Flowtable garbage collector might leave behind pending work to
delete entries. This patch comes with a previous preparation patch
as dependency.
12) Allow net.netfilter.nf_conntrack_frag6_high_thresh to be lowered,
from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_defrag_ipv6: allow nf_conntrack_frag6_high_thresh increases
netfilter: flowtable: fix stuck flows on cleanup due to pending work
netfilter: flowtable: add function to invoke garbage collection immediately
netfilter: nf_tables: disallow binding to already bound chain
netfilter: nft_tunnel: restrict it to netdev family
netfilter: nft_osf: restrict osf to ipv4, ipv6 and inet families
netfilter: nf_tables: do not leave chain stats enabled on error
netfilter: nft_payload: do not truncate csum_offset and csum_type
netfilter: nft_payload: report ERANGE for too long offset and length
netfilter: nf_tables: make table handle allocation per-netns friendly
netfilter: nf_tables: disallow updates of implicit chain
netfilter: nft_tproxy: restrict to prerouting hook
netfilter: conntrack: work around exceeded receive window
netfilter: ebtables: reject blobs that don't provide all entry points
====================
Randy Dunlap [Wed, 24 Aug 2022 02:42:16 +0000 (19:42 -0700)]
net: ethernet: ti: davinci_mdio: fix build for mdio bitbang uses
davinci_mdio.c uses mdio bitbang APIs, so it should select
MDIO_BITBANG to prevent build errors.
arm-linux-gnueabi-ld: drivers/net/ethernet/ti/davinci_mdio.o: in function `davinci_mdio_remove':
drivers/net/ethernet/ti/davinci_mdio.c:649: undefined reference to `free_mdio_bitbang'
arm-linux-gnueabi-ld: drivers/net/ethernet/ti/davinci_mdio.o: in function `davinci_mdio_probe':
drivers/net/ethernet/ti/davinci_mdio.c:545: undefined reference to `alloc_mdio_bitbang'
arm-linux-gnueabi-ld: drivers/net/ethernet/ti/davinci_mdio.o: in function `davinci_mdiobb_read':
drivers/net/ethernet/ti/davinci_mdio.c:236: undefined reference to `mdiobb_read'
arm-linux-gnueabi-ld: drivers/net/ethernet/ti/davinci_mdio.o: in function `davinci_mdiobb_write':
drivers/net/ethernet/ti/davinci_mdio.c:253: undefined reference to `mdiobb_write'
Fixes: a53af268f7c6 ("net: ethernet: ti: davinci_mdio: Add workaround for errata i2329") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Ravi Gunasekaran <r-gunasekaran@ti.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Cc: Sudip Mukherjee (Codethink) <sudipm.mukherjee@gmail.com> Link: https://lore.kernel.org/r/20220824024216.4939-1-rdunlap@infradead.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
i40e: Fix incorrect address type for IPv6 flow rules
It was not possible to create 1-tuple flow director
rule for IPv6 flow type. It was caused by incorrectly
checking for source IP address when validating user provided
destination IP address.
Fix this by changing ip6src to correct ip6dst address
in destination IP address validation for IPv6 flow type.
Fixes: 06962af1fda1 ("i40e: Add flow director support for IPv6") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jacob Keller [Tue, 2 Aug 2022 00:24:19 +0000 (17:24 -0700)]
ixgbe: stop resetting SYSTIME in ixgbe_ptp_start_cyclecounter
The ixgbe_ptp_start_cyclecounter is intended to be called whenever the
cyclecounter parameters need to be changed.
Since commit 8b95512ad51a ("ixgbe: Update PTP to support X550EM_x
devices"), this function has cleared the SYSTIME registers and reset the
TSAUXC DISABLE_SYSTIME bit.
While these need to be cleared during ixgbe_ptp_reset, it is wrong to clear
them during ixgbe_ptp_start_cyclecounter. This function may be called
during both reset and link status change. When link changes, the SYSTIME
counter is still operating normally, but the cyclecounter should be updated
to account for the possibly changed parameters.
Clearing SYSTIME when link changes causes the timecounter to jump because
the cycle counter now reads zero.
Extract the SYSTIME initialization out to a new function and call this
during ixgbe_ptp_reset. This prevents the timecounter adjustment and avoids
an unnecessary reset of the current time.
This also restores the original SYSTIME clearing that occurred during
ixgbe_ptp_reset before the commit above.
Reported-by: Steve Payne <spayne@aurora.tech> Reported-by: Ilya Evenbach <ievenbach@aurora.tech> Fixes: 8b95512ad51a ("ixgbe: Update PTP to support X550EM_x devices") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>