Karol Kolacinski [Tue, 16 Nov 2021 12:07:14 +0000 (13:07 +0100)]
ice: Don't put stale timestamps in the skb
The driver has to check if it does not accidentally put the timestamp in
the SKB before previous timestamp gets overwritten.
Timestamp values in the PHY are read only and do not get cleared except
at hardware reset or when a new timestamp value is captured.
The cached_tstamp field is used to detect the case where a new timestamp
has not yet been captured, ensuring that we avoid sending stale
timestamp data to the stack.
Fixes: fd28ad3fe698 ("ice: enable transmit timestamps for E810 devices") Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Change the division in ice_ptp_adjfine from div_u64 to div64_u64.
div_u64 is used when the divisor is 32 bit but in this case incval is
64 bit and it caused incorrect calculations and incval adjustments.
Fixes: 6e97854adf8d ("ice: register 1588 PTP clock device object for E810 devices") Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Danielle Ratson [Tue, 14 Dec 2021 10:21:37 +0000 (12:21 +0200)]
selftests: mlxsw: Add a test case for MAC profiles consolidation
Add a test case to cover the bug fixed by the previous patch.
Edit the MAC address of one netdev so that it matches the MAC address of
the second netdev. Verify that the two MAC profiles were consolidated by
testing that the MAC profiles occupancy decreased by one.
Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Danielle Ratson [Tue, 14 Dec 2021 10:21:36 +0000 (12:21 +0200)]
mlxsw: spectrum_router: Consolidate MAC profiles when possible
Currently, when setting a router interface (RIF) MAC address while the
MAC profile is not shared with other RIFs, the profile is edited so that
the new MAC address is assigned to it.
This does not take into account a situation in which the new MAC address
already matches an existing MAC profile. In that situation, two MAC
profiles will be occupied even though they hold MAC addresses from the
same profile.
In order to prevent that, add a check to ensure that editing a MAC
profile takes place only when the new MAC address does not match an
existing profile.
Fixes: 9bba70d9958b6 ("mlxsw: spectrum_router: Add RIF MAC profiles support") Reported-by: Maksym Yaremchuk <maksymy@nvidia.com> Tested-by: Maksym Yaremchuk <maksymy@nvidia.com> Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 14 Dec 2021 12:47:22 +0000 (12:47 +0000)]
Merge tag 'mac80211-for-net-2021-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
A fairly large number of fixes this time:
* fix a station info memory leak on insert collisions
* a rate control fix for retransmissions
* two aggregation setup fixes
* reload current regdomain when reloading database
* a locking fix in regulatory work
* a probe request allocation size fix in mac80211
* apply TCP vs. aggregation (sk pacing) on mesh
* fix ordering of channel context update vs. station
state
* set up skb->dev for mesh forwarding properly
* track QoS data frames only for admission control to
avoid out-of-bounds read (found by syzbot)
* validate extended element ID vs. existing data to
avoid out-of-bounds read (found by syzbot)
* fix locking in mac80211 aggregation TX setup
* fix traffic stall after HW restart when TXQs are used
* fix ordering of reconfig/restart after HW restart
* fix interface type for extended aggregation capability
lookup
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Baowen Zheng [Mon, 13 Dec 2021 14:46:04 +0000 (15:46 +0100)]
flow_offload: return EOPNOTSUPP for the unsupported mpls action type
We need to return EOPNOTSUPP for the unsupported mpls action type when
setup the flow action.
In the original implement, we will return 0 for the unsupported mpls
action type, actually we do not setup it and the following actions
to the flow action entry.
Fixes: bc5052d8e47f ("net: sched: take rtnl lock in tc_setup_flow_action()") Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The above issue is due to previous incorrect implementation of
tc_del_vlan_flow(), shown below, that uses flow_cls_offload_flow_rule()
to get struct flow_rule *rule which is no longer valid for tc filter
delete operation.
So, to ensure tc_del_vlan_flow() deletes the right VLAN cls record for
earlier configured RX queue (configured by hw_tc) in tc_add_vlan_flow(),
this patch introduces stmmac_rfs_entry as driver-side flow_cls_offload
record for 'RX frame steering' tc flower, currently used for VLAN
priority. The implementation has taken consideration for future extension
to include other type RX frame steering such as EtherType based.
v2:
- Clean up overly extensive backtrace and rewrite git message to better
explain the kernel NULL pointer issue.
Fixes: d7b533788066 ("net: stmmac: add RX frame steering based on VLAN priority in tc flower") Tested-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Mon, 29 Nov 2021 13:32:40 +0000 (15:32 +0200)]
mac80211: do drv_reconfig_complete() before restarting all
When we reconfigure, the driver might do some things to complete
the reconfiguration. It's strange and could be broken in some
cases because we restart other works (e.g. remain-on-channel and
TX) before this happens, yet only start queues later.
Change this to do the reconfig complete when reconfiguration is
actually complete, not when we've already started doing other
things again.
For iwlwifi, this should fix a race where the reconfig can race
with TX, for ath10k and ath11k that also use this it won't make
a difference because they just start queues there, and mac80211
also stopped the queues and will restart them later as before.
Johannes Berg [Mon, 29 Nov 2021 13:32:39 +0000 (15:32 +0200)]
mac80211: mark TX-during-stop for TX in in_reconfig
Mark TXQs as having seen transmit while they were stopped if
we bail out of drv_wake_tx_queue() due to reconfig, so that
the queue wake after this will make them catch up. This is
particularly necessary for when TXQs are used for management
packets since those TXQs won't see a lot of traffic that'd
make them catch up later.
mac80211: update channel context before station state
Currently channel context is updated only after station got an update about
new assoc state, this results in station using the old channel context.
Fix this by moving the update channel context before updating station,
enabling the driver to immediately use the updated channel context in
the new assoc state.
Johannes Berg [Mon, 29 Nov 2021 13:32:46 +0000 (15:32 +0200)]
mac80211: fix lookup when adding AddBA extension element
We should be doing the HE capabilities lookup based on the full
interface type so if P2P doesn't have HE but client has it doesn't
get confused. Fix that.
Ilan Peer [Thu, 2 Dec 2021 13:28:54 +0000 (15:28 +0200)]
cfg80211: Acquire wiphy mutex on regulatory work
The function cfg80211_reg_can_beacon_relax() expects wiphy
mutex to be held when it is being called. However, when
reg_leave_invalid_chans() is called the mutex is not held.
Fix it by acquiring the lock before calling the function.
Johannes Berg [Thu, 2 Dec 2021 13:26:25 +0000 (15:26 +0200)]
mac80211: agg-tx: don't schedule_and_wake_txq() under sta->lock
When we call ieee80211_agg_start_txq(), that will in turn call
schedule_and_wake_txq(). Called from ieee80211_stop_tx_ba_cb()
this is done under sta->lock, which leads to certain circular
lock dependencies, as reported by Chris Murphy:
https://lore.kernel.org/r/CAJCQCtSXJ5qA4bqSPY=oLRMbv-irihVvP7A2uGutEbXQVkoNaw@mail.gmail.com
In general, ieee80211_agg_start_txq() is usually not called
with sta->lock held, only in this one place. But it's always
called with sta->ampdu_mlme.mtx held, and that's therefore
clearly sufficient.
Change ieee80211_stop_tx_ba_cb() to also call it without the
sta->lock held, by factoring it out of ieee80211_remove_tid_tx()
(which is only called in this one place).
This breaks the locking chain and makes it less likely that
we'll have similar locking chain problems in the future.
Finn Behrens [Wed, 1 Dec 2021 12:49:18 +0000 (13:49 +0100)]
nl80211: remove reload flag from regulatory_request
This removes the previously unused reload flag, which was introduced in 7c96fb8e4ff0.
The request is handled as NL80211_REGDOM_SET_BY_CORE, which is parsed
unconditionally.
Felix Fietkau [Thu, 2 Dec 2021 12:45:33 +0000 (13:45 +0100)]
mac80211: send ADDBA requests using the tid/queue of the aggregation session
Sending them out on a different queue can cause a race condition where a
number of packets in the queue may be discarded by the receiver, because
the ADDBA request is sent too early.
This affects any driver with software A-MPDU setup which does not allocate
packet seqno in hardware on tx, regardless of whether iTXQ is used or not.
The only driver I've seen that explicitly deals with this issue internally
is mwl8k.
Paolo Abeni [Sat, 11 Dec 2021 16:11:12 +0000 (17:11 +0100)]
mptcp: never allow the PM to close a listener subflow
Currently, when deleting an endpoint the netlink PM treverses
all the local MPTCP sockets, regardless of their status.
If an MPTCP listener socket is bound to the IP matching the
delete endpoint, the listener TCP socket will be closed.
That is unexpected, the PM should only affect data subflows.
Additionally, syzbot was able to trigger a NULL ptr dereference
due to the above:
Stefan Assmann [Wed, 1 Dec 2021 08:14:34 +0000 (09:14 +0100)]
iavf: do not override the adapter state in the watchdog task (again)
The watchdog task incorrectly changes the state to __IAVF_RESETTING,
instead of letting the reset task take care of that. This was already
resolved by commit a15f78ea1171 ("iavf: do not override the adapter
state in the watchdog task") but the problem was reintroduced by the
recent code refactoring in commit 5e34a07e771d ("iavf: Refactor iavf
state machine tracking").
Fixes: 5e34a07e771d ("iavf: Refactor iavf state machine tracking") Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Dan Carpenter [Wed, 10 Nov 2021 08:13:50 +0000 (11:13 +0300)]
iavf: missing unlocks in iavf_watchdog_task()
This code was re-organized and there some unlocks missing now.
Fixes: 176955e1a486 ("iavf: Combine init and watchdog state machines") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
David Wu [Mon, 13 Dec 2021 11:15:15 +0000 (19:15 +0800)]
net: stmmac: Add GFP_DMA32 for rx buffers if no 64 capability
Use page_pool_alloc_pages instead of page_pool_dev_alloc_pages, which
can give the gfp parameter, in the case of not supporting 64-bit width,
using 32-bit address memory can reduce a copy from swiotlb.
Signed-off-by: David Wu <david.wu@rock-chips.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Recently, a patch has been submitted to "fix" the refcounting for a DT
node in of_mdiobus_link_mdiodev(). This is not a leaked refcount. The
refcount is passed to the new device.
Sadly, coccicheck identifies this location as a leaked refcount, which
means we're likely to keep getting patches to "fix" this. However,
fixing this will cause breakage. Add a comment to state that the lack
of of_node_put() here is intentional.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu [Mon, 13 Dec 2021 08:36:00 +0000 (16:36 +0800)]
selftest/net/forwarding: declare NETIFS p9 p10
The recent GRE selftests defined NUM_NETIFS=10. If the users copy
forwarding.config.sample to forwarding.config directly, they will get
error "Command line is not complete" when run the GRE tests, because
create_netif_veth() failed with no interface name defined.
Fix it by extending the NETIFS with p9 and p10.
Fixes: d4d5026d1afd ("selftests: forwarding: Test multipath hashing on inner IP pkts for GRE tunnel") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Marek Behún [Sat, 11 Dec 2021 22:51:41 +0000 (23:51 +0100)]
net: dsa: mv88e6xxx: Unforce speed & duplex in mac_link_down()
Commit 429b7be81293 ("net: dsa: mv88e6xxx: configure interface settings
in mac_config") removed forcing of speed and duplex from
mv88e6xxx_mac_config(), where the link is forced down, and left it only
in mv88e6xxx_mac_link_up(), by which time link is unforced.
It seems that (at least on 88E6190) when changing cmode to 2500base-x,
if the link is not forced down, but the speed or duplex are still
forced, the forcing of new settings for speed & duplex doesn't take in
mv88e6xxx_mac_link_up().
Fix this by unforcing speed & duplex in mv88e6xxx_mac_link_down().
Fixes: 429b7be81293 ("net: dsa: mv88e6xxx: configure interface settings in mac_config") Signed-off-by: Marek Behún <kabel@kernel.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Sat, 11 Dec 2021 19:30:31 +0000 (14:30 -0500)]
selftests/net: toeplitz: fix udp option
Tiny fix. Option -u ("use udp") does not take an argument.
It can cause the next argument to silently be ignored.
Fixes: b3a4036cc56c ("selftests/net: toeplitz test") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Miaoqian Lin [Sat, 11 Dec 2021 14:01:53 +0000 (14:01 +0000)]
net: bcmgenet: Fix NULL vs IS_ERR() checking
The phy_attach() function does not return NULL. It returns error pointers.
Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
we can remove 'q->classes[i].alist' only if DRR class 'i' was part of the
active list. In the ETS scheduler DRR classes belong to that list only if
the queue length is greater than zero: we need to test for non-zero value
of 'q->classes[i].qdisc->q.qlen' before removing from the list, similarly
to what has been done elsewhere in the ETS code.
Fixes: 2de74becb249 ("net/sched: sch_ets: don't peek at classes beyond 'nbands'") Reported-by: Shuang Li <shuali@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Sat, 11 Dec 2021 18:26:16 +0000 (11:26 -0700)]
selftests: Fix IPv6 address bind tests
IPv6 allows binding a socket to a device then binding to an address
not on the device (__inet6_bind -> ipv6_chk_addr with strict flag
not set). Update the bind tests to reflect legacy behavior.
Fixes: ea3d469c7399 ("selftests: Add ipv6 address bind tests to fcnal-test") Reported-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Sat, 11 Dec 2021 17:21:08 +0000 (10:21 -0700)]
selftests: Fix raw socket bind tests with VRF
Commit referenced below added negative socket bind tests for VRF. The
socket binds should fail since the address to bind to is in a VRF yet
the socket is not bound to the VRF or a device within it. Update the
expected return code to check for 1 (bind failure) so the test passes
when the bind fails as expected. Add a 'show_hint' comment to explain
why the bind is expected to fail.
Fixes: a77a40ae320f ("selftests: Add ipv4 address bind tests to fcnal-test") Reported-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Sat, 11 Dec 2021 17:11:30 +0000 (10:11 -0700)]
selftests: Add duplicate config only for MD5 VRF tests
Commit referenced below added configuration in the default VRF that
duplicates a VRF to check MD5 passwords are properly used and fail
when expected. That config should not be added all the time as it
can cause tests to pass that should not (by matching on default VRF
setup when it should not). Move the duplicate setup to a function
that is only called for the MD5 tests and add a cleanup function
to remove it after the MD5 tests.
Fixes: 4ed407fd8b0e ("fcnal-test: Add TCP MD5 tests for VRF") Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Yufeng Mo [Fri, 10 Dec 2021 13:09:34 +0000 (21:09 +0800)]
net: hns3: fix race condition in debugfs
When multiple threads concurrently access the debugfs content, data
and pointer exceptions may occur. Therefore, mutex lock protection is
added for debugfs.
Fixes: 51b4d1974cac ("net: hns3: refactor the debugfs process") Signed-off-by: Yufeng Mo <moyufeng@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jie Wang [Fri, 10 Dec 2021 13:09:33 +0000 (21:09 +0800)]
net: hns3: fix use-after-free bug in hclgevf_send_mbx_msg
Currently, the hns3_remove function firstly uninstall client instance,
and then uninstall acceletion engine device. The netdevice is freed in
client instance uninstall process, but acceletion engine device uninstall
process still use it to trace runtime information. This causes a use after
free problem.
So fixes it by check the instance register state to avoid use after free.
Fixes: 5a80f6ec39d2 ("net: hns3: add trace event support for PF/VF mailbox") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Po-Hsu Lin [Fri, 10 Dec 2021 07:25:23 +0000 (15:25 +0800)]
selftests: icmp_redirect: pass xfail=0 to log_test()
If any sub-test in this icmp_redirect.sh is failing but not expected
to fail. The script will complain:
./icmp_redirect.sh: line 72: [: 1: unary operator expected
This is because when the sub-test is not expected to fail, we won't
pass any value for the xfail local variable in log_test() and thus
it's empty. Fix this by passing 0 as the 4th variable to log_test()
for non-xfail cases.
v2: added fixes tag
Fixes: c6a7c8bd648a ("selftests: icmp_redirect: support expected failures") Signed-off-by: Po-Hsu Lin <po-hsu.lin@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Filip Pokryvka [Fri, 10 Dec 2021 17:50:32 +0000 (18:50 +0100)]
netdevsim: don't overwrite read only ethtool parms
Ethtool ring feature has _max_pending attributes read-only.
Set only read-write attributes in nsim_set_ringparam.
This patch is useful, if netdevsim device is set-up using NetworkManager,
because NetworkManager sends 0 as MAX values, as it is pointless to
retrieve them in extra call, because they should be read-only. Then,
the device is left in incosistent state (value > MAX).
Eric Dumazet [Thu, 9 Dec 2021 18:50:58 +0000 (10:50 -0800)]
inet_diag: fix kernel-infoleak for UDP sockets
KMSAN reported a kernel-infoleak [1], that can exploited
by unpriv users.
After analysis it turned out UDP was not initializing
r->idiag_expires. Other users of inet_sk_diag_fill()
might make the same mistake in the future, so fix this
in inet_sk_diag_fill().
ping6 is failing as it should.
COMMAND: ip netns exec ns-A /bin/ping6 -c1 -w1 fe80::7c4c:bcff:fe66:a63a%red
strace of ping6 shows it is failing with '1',
so change the expected rc from 2 to 1.
Fixes: 1fdd5f0dd4ed ("selftests: Add ipv6 ping tests to fcnal-test") Reported-by: kernel test robot <lkp@intel.com> Suggested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Jie2x Zhou <jie2x.zhou@intel.com> Link: https://lore.kernel.org/r/20211209020230.37270-1-jie2x.zhou@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Linus Torvalds [Thu, 9 Dec 2021 19:26:44 +0000 (11:26 -0800)]
Merge tag 'net-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf, can and netfilter.
Current release - regressions:
- bpf, sockmap: re-evaluate proto ops when psock is removed from
sockmap
Current release - new code bugs:
- bpf: fix bpf_check_mod_kfunc_call for built-in modules
- ice: fixes for TC classifier offloads
- vrf: don't run conntrack on vrf with !dflt qdisc
Previous releases - regressions:
- bpf: fix the off-by-two error in range markings
- seg6: fix the iif in the IPv6 socket control block
- devlink: fix netns refcount leak in devlink_nl_cmd_reload()
- dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's"
- dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports
Previous releases - always broken:
- ethtool: do not perform operations on net devices being
unregistered
- udp: use datalen to cap max gso segments
- ice: fix races in stats collection
- fec: only clear interrupt of handling queue in fec_enet_rx_queue()
- m_can: pci: fix incorrect reference clock rate
- m_can: disable and ignore ELO interrupt
- mvpp2: fix XDP rx queues registering
Misc:
- treewide: add missing includes masked by cgroup -> bpf.h
dependency"
* tag 'net-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits)
net: dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports
net: wwan: iosm: fixes unable to send AT command during mbim tx
net: wwan: iosm: fixes net interface nonfunctional after fw flash
net: wwan: iosm: fixes unnecessary doorbell send
net: dsa: felix: Fix memory leak in felix_setup_mmio_filtering
MAINTAINERS: s390/net: remove myself as maintainer
net/sched: fq_pie: prevent dismantle issue
net: mana: Fix memory leak in mana_hwc_create_wq
seg6: fix the iif in the IPv6 socket control block
nfp: Fix memory leak in nfp_cpp_area_cache_add()
nfc: fix potential NULL pointer deref in nfc_genl_dump_ses_done
nfc: fix segfault in nfc_genl_dump_devices_done
udp: using datalen to cap max gso segments
net: dsa: mv88e6xxx: error handling for serdes_power functions
can: kvaser_usb: get CAN clock frequency from device
can: kvaser_pciefd: kvaser_pciefd_rx_error_frame(): increase correct stats->{rx,tx}_errors counter
net: mvpp2: fix XDP rx queues registering
vmxnet3: fix minimum vectors alloc issue
net, neigh: clear whole pneigh_entry at alloc time
net: dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's"
...
Linus Torvalds [Thu, 9 Dec 2021 19:08:19 +0000 (11:08 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- fixes for various drivers which assume that a HID device is on USB
transport, but that might not necessarily be the case, as the device
can be faked by uhid. (Greg, Benjamin Tissoires)
- fix for spurious wakeups on certain Lenovo notebooks (Thomas
Weißschuh)
- a few other device-specific quirks
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: Ignore battery for Elan touchscreen on Asus UX550VE
HID: intel-ish-hid: ipc: only enable IRQ wakeup when requested
HID: google: add eel USB id
HID: add USB_HID dependancy to hid-prodikeys
HID: add USB_HID dependancy to hid-chicony
HID: bigbenff: prevent null pointer dereference
HID: sony: fix error path in probe
HID: add USB_HID dependancy on some USB HID drivers
HID: check for valid USB device for many HID drivers
HID: wacom: fix problems when device is not a valid USB device
HID: add hid_is_usb() function to make it simpler for USB detection
HID: quirks: Add quirk for the Microsoft Surface 3 type-cover
Linus Torvalds [Thu, 9 Dec 2021 18:49:36 +0000 (10:49 -0800)]
Merge tag 'netfs-fixes-20211207' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull netfslib fixes from David Howells:
- Fix a lockdep warning and potential deadlock. This is takes the
simple approach of offloading the write-to-cache done from within a
network filesystem read to a worker thread to avoid taking the
sb_writer lock from the cache backing filesystem whilst holding the
mmap lock on an inode from the network filesystem.
Jan Kara posits a scenario whereby this can cause deadlock[1], though
it's quite complex and I think requires someone in userspace to
actually do I/O on the cache files. Matthew Wilcox isn't so certain,
though[2].
An alternative way to fix this, suggested by Darrick Wong, might be
to allow cachefiles to prevent userspace from performing I/O upon the
file - something like an exclusive open - but that's beyond the scope
of a fix here if we do want to make such a facility in the future.
- In some of the error handling paths where netfs_ops->cleanup() is
called, the arguments are transposed[3]. gcc doesn't complain because
one of the parameters is void* and one of the values is void*.
Sasha Levin [Thu, 9 Dec 2021 16:51:13 +0000 (11:51 -0500)]
tools/lib/lockdep: drop leftover liblockdep headers
Clean up remaining headers that are specific to liblockdep but lived in
the shared header directory. These are all unused after the liblockdep
code was removed in commit 1de3bdde3dfd ("tools/lib/lockdep: drop
liblockdep").
Note that there are still headers that were originally created for
liblockdep, that still have liblockdep references, but they are used by
other tools/ code at this point.
net: dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports
Martyn Welch reports that his CPU port is unable to link where it has
been necessary to use one of the switch ports with an internal PHY for
the CPU port. The reason behind this is the port control register is
left forcing the link down, preventing traffic flow.
This occurs because during initialisation, phylink expects the link to
be down, and DSA forces the link down by synthesising a call to the
DSA drivers phylink_mac_link_down() method, but we don't touch the
forced-link state when we later reconfigure the port.
Resolve this by also unforcing the link state when we are operating in
PHY mode and the PPU is set to poll the PHY to retrieve link status
information.
Reported-by: Martyn Welch <martyn.welch@collabora.com> Tested-by: Martyn Welch <martyn.welch@collabora.com> Fixes: 0da96bd3f59c ("net: dsa: Down cpu/dsa ports phylink will control") Cc: <stable@vger.kernel.org> # 5.7: 4de91503014d: net: dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's" Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/E1mvFhP-00F8Zb-Ul@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Thu, 9 Dec 2021 16:10:38 +0000 (08:10 -0800)]
Merge branch 'net-wwan-iosm-bug-fixes'
M Chetan Kumar says:
====================
net: wwan: iosm: bug fixes
This patch series brings in IOSM driver bug fixes. Patch details are
explained below.
PATCH1: stop sending unnecessary doorbell in IP tx flow.
PATCH2: Restore the IP channel configuration after fw flash.
PATCH3: Removed the unnecessary check around control port TX transfer.
====================
M Chetan Kumar [Thu, 9 Dec 2021 10:16:28 +0000 (15:46 +0530)]
net: wwan: iosm: fixes net interface nonfunctional after fw flash
Devlink initialization flow was overwriting the IP traffic
channel configuration. This was causing wwan0 network interface
to be unusable after fw flash.
When device boots to fully functional mode restore the IP channel
configuration.
Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
M Chetan Kumar [Thu, 9 Dec 2021 10:16:27 +0000 (15:46 +0530)]
net: wwan: iosm: fixes unnecessary doorbell send
In TX packet accumulation flow transport layer is
giving a doorbell to device even though there is
no pending control TX transfer that needs immediate
attention.
Introduced a new hpda_ctrl_pending variable to keep
track of pending control TX transfer. If there is a
pending control TX transfer which needs an immediate
attention only then give a doorbell to device.
Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Andrea Mayer [Wed, 8 Dec 2021 19:54:09 +0000 (20:54 +0100)]
seg6: fix the iif in the IPv6 socket control block
When an IPv4 packet is received, the ip_rcv_core(...) sets the receiving
interface index into the IPv4 socket control block (v5.16-rc4,
net/ipv4/ip_input.c line 510):
IPCB(skb)->iif = skb->skb_iif;
If that IPv4 packet is meant to be encapsulated in an outer IPv6+SRH
header, the seg6_do_srh_encap(...) performs the required encapsulation.
In this case, the seg6_do_srh_encap function clears the IPv6 socket control
block (v5.16-rc4 net/ipv6/seg6_iptunnel.c line 163):
memset(IP6CB(skb), 0, sizeof(*IP6CB(skb)));
The memset(...) was introduced in commit 8e948a9f62f4 ("ipv6: sr: clear
IP6CB(skb) on SRH ip4ip6 encapsulation") a long time ago (2019-01-29).
Since the IPv6 socket control block and the IPv4 socket control block share
the same memory area (skb->cb), the receiving interface index info is lost
(IP6CB(skb)->iif is set to zero).
As a side effect, that condition triggers a NULL pointer dereference if
commit 05017901c97a ("ipv6: When forwarding count rx stats on the orig
netdev") is applied.
To fix that issue, we set the IP6CB(skb)->iif with the index of the
receiving interface once again.
Fixes: 8e948a9f62f4 ("ipv6: sr: clear IP6CB(skb) on SRH ip4ip6 encapsulation") Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20211208195409.12169-1-andrea.mayer@uniroma2.it Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jianglei Nie [Thu, 9 Dec 2021 06:15:11 +0000 (14:15 +0800)]
nfp: Fix memory leak in nfp_cpp_area_cache_add()
In line 800 (#1), nfp_cpp_area_alloc() allocates and initializes a
CPP area structure. But in line 807 (#2), when the cache is allocated
failed, this CPP area structure is not freed, which will result in
memory leak.
We can fix it by freeing the CPP area when the cache is allocated
failed (#2).
nfc: fix potential NULL pointer deref in nfc_genl_dump_ses_done
The done() netlink callback nfc_genl_dump_ses_done() should check if
received argument is non-NULL, because its allocation could fail earlier
in dumpit() (nfc_genl_dump_ses()).
Jakub Kicinski [Thu, 9 Dec 2021 15:43:22 +0000 (07:43 -0800)]
Merge tag 'linux-can-fixes-for-5.16-20211209' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
can 2021-12-09
Both patches are by Jimmy Assarsson. The first one fixes the
incrementing of the rx/tx error counters in the Kvaser PCIe FD driver.
The second one fixes the Kvaser USB driver by using the CAN clock
frequency provided by the device instead of using a hard coded value.
* tag 'linux-can-fixes-for-5.16-20211209' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: kvaser_usb: get CAN clock frequency from device
can: kvaser_pciefd: kvaser_pciefd_rx_error_frame(): increase correct stats->{rx,tx}_errors counter
====================
Jimmy Assarsson [Wed, 8 Dec 2021 15:21:22 +0000 (16:21 +0100)]
can: kvaser_usb: get CAN clock frequency from device
The CAN clock frequency is used when calculating the CAN bittiming
parameters. When wrong clock frequency is used, the device may end up
with wrong bittiming parameters, depending on user requested bittiming
parameters.
To avoid this, get the CAN clock frequency from the device. Various
existing Kvaser Leaf products use different CAN clocks.
Fixes: d001c8c6a4d1 ("can: kvaser_usb: Add support for Kvaser CAN/USB devices") Link: https://lore.kernel.org/all/20211208152122.250852-2-extja@kvaser.com Cc: stable@vger.kernel.org Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Louis Amas [Tue, 7 Dec 2021 14:34:22 +0000 (15:34 +0100)]
net: mvpp2: fix XDP rx queues registering
The registration of XDP queue information is incorrect because the
RX queue id we use is invalid. When port->id == 0 it appears to works
as expected yet it's no longer the case when port->id != 0.
The problem arised while using a recent kernel version on the
MACCHIATOBin. This board has several ports:
* eth0 and eth1 are 10Gbps interfaces ; both ports has port->id == 0;
* eth2 is a 1Gbps interface with port->id != 0.
Code from xdp-tutorial (more specifically advanced03-AF_XDP) was used
to test packet capture and injection on all these interfaces. The XDP
kernel was simplified to:
SEC("xdp_sock")
int xdp_sock_prog(struct xdp_md *ctx)
{
int index = ctx->rx_queue_index;
/* A set entry here means that the correspnding queue_id
* has an active AF_XDP socket bound to it. */
if (bpf_map_lookup_elem(&xsks_map, &index))
return bpf_redirect_map(&xsks_map, index, 0);
return XDP_PASS;
}
Starting the program using:
./af_xdp_user -d DEV
Gives the following result:
* eth0 : ok
* eth1 : ok
* eth2 : no capture, no injection
Investigating the issue shows that XDP rx queues for eth2 are wrong:
XDP expects their id to be in the range [0..3] but we found them to be
in the range [32..35].
Trying to force rx queue ids using:
./af_xdp_user -d eth2 -Q 32
fails as expected (we shall not have more than 4 queues).
When we register the XDP rx queue information (using
xdp_rxq_info_reg() in function mvpp2_rxq_init()) we tell it to use
rxq->id as the queue id. This value is computed as:
rxq->id = port->id * max_rxq_count + queue_id
where max_rxq_count depends on the device version. In the MACCHIATOBin
case, this value is 32, meaning that rx queues on eth2 are numbered
from 32 to 35 - there are four of them.
Clearly, this is not the per-port queue id that XDP is expecting:
it wants a value in the range [0..3]. It shall directly use queue_id
which is stored in rxq->logic_rxq -- so let's use that value instead.
rxq->id is left untouched ; its value is indeed valid but it should
not be used in this context.
This is consistent with the remaining part of the code in
mvpp2_rxq_init().
With this change, packet capture is working as expected on all the
MACCHIATOBin ports.
Fixes: 887e8381f684 ("mvpp2: use page_pool allocator") Signed-off-by: Louis Amas <louis.amas@eho.link> Signed-off-by: Emmanuel Deloget <emmanuel.deloget@eho.link> Reviewed-by: Marcin Wojtas <mw@semihalf.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/r/20211207143423.916334-1-louis.amas@eho.link Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Ronak Doshi [Tue, 7 Dec 2021 08:17:37 +0000 (00:17 -0800)]
vmxnet3: fix minimum vectors alloc issue
'Commit 4f9f3aa6b852 ("vmxnet3: add support for 32 Tx/Rx queues")'
added support for 32Tx/Rx queues. Within that patch, value of
VMXNET3_LINUX_MIN_MSIX_VECT was updated.
However, there is a case (numvcpus = 2) which actually requires 3
intrs which matches VMXNET3_LINUX_MIN_MSIX_VECT which then is
treated as failure by stack to allocate more vectors. This patch
fixes this issue.
Fixes: 4f9f3aa6b852 ("vmxnet3: add support for 32 Tx/Rx queues") Signed-off-by: Ronak Doshi <doshir@vmware.com> Acked-by: Guolin Yang <gyang@vmware.com> Link: https://lore.kernel.org/r/20211207081737.14000-1-doshir@vmware.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
CPU: 1 PID: 20001 Comm: syz-executor.0 Not tainted 5.16.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Fixes: ea59d98fdd3e ("[IPV6] NDISC: Set per-entry is_router flag in Proxy NA.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Roopa Prabhu <roopa@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20211206165329.1049835-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
1) Fix bogus compilter warning in nfnetlink_queue, from Florian Westphal.
2) Don't run conntrack on vrf with !dflt qdisc, from Nicolas Dichtel.
3) Fix nft_pipapo bucket load in AVX2 lookup routine for six 8-bit
groups, from Stefano Brivio.
4) Break rule evaluation on malformed TCP options.
5) Use socat instead of nc in selftests/netfilter/nft_zones_many.sh,
also from Florian
6) Fix KCSAN data-race in conntrack timeout updates, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf:
netfilter: conntrack: annotate data-races around ct->timeout
selftests: netfilter: switch zone stress to socat
netfilter: nft_exthdr: break evaluation if setting TCP option fails
selftests: netfilter: Add correctness test for mac,net set type
nft_set_pipapo: Fix bucket load in AVX2 lookup routine for six 8-bit groups
vrf: don't run conntrack on vrf with !dflt qdisc
netfilter: nfnetlink_queue: silence bogus compiler warning
====================
We've added 12 non-merge commits during the last 22 day(s) which contain
a total of 29 files changed, 659 insertions(+), 80 deletions(-).
The main changes are:
1) Fix an off-by-two error in packet range markings and also add a batch of
new tests for coverage of these corner cases, from Maxim Mikityanskiy.
2) Fix a compilation issue on MIPS JIT for R10000 CPUs, from Johan Almbladh.
3) Fix two functional regressions and a build warning related to BTF kfunc
for modules, from Kumar Kartikeya Dwivedi.
4) Fix outdated code and docs regarding BPF's migrate_disable() use on non-
PREEMPT_RT kernels, from Sebastian Andrzej Siewior.
5) Add missing includes in order to be able to detangle cgroup vs bpf header
dependencies, from Jakub Kicinski.
6) Fix regression in BPF sockmap tests caused by missing detachment of progs
from sockets when they are removed from the map, from John Fastabend.
7) Fix a missing "no previous prototype" warning in x86 JIT caused by BPF
dispatcher, from Björn Töpel.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Add selftests to cover packet access corner cases
bpf: Fix the off-by-two error in range markings
treewide: Add missing includes masked by cgroup -> bpf dependency
tools/resolve_btfids: Skip unresolved symbol warning for empty BTF sets
bpf: Fix bpf_check_mod_kfunc_call for built-in modules
bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
mips, bpf: Fix reference to non-existing Kconfig symbol
bpf: Make sure bpf_disable_instrumentation() is safe vs preemption.
Documentation/locking/locktypes: Update migrate_disable() bits.
bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap
bpf, sockmap: Attach map progs to psock early for feature probes
bpf, x86: Fix "no previous prototype" warning
====================
net: dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's"
This commit fixes a misunderstanding in commit 06c5f7f5cb14 ("net: dsa:
mv88e6xxx: don't use PHY_DETECT on internal PHY's").
For Marvell DSA switches with the PHY_DETECT bit (for non-6250 family
devices), controls whether the PPU polls the PHY to retrieve the link,
speed, duplex and pause status to update the port configuration. This
applies for both internal and external PHYs.
For some switches such as 88E6352 and 88E6390X, PHY_DETECT has an
additional function of enabling auto-media mode between the internal
PHY and SERDES blocks depending on which first gains link.
The original intention of commit 83d5a3c13a1b (net: dsa: mv88e6xxx: use
PHY_DETECT in mac_link_up/mac_link_down) was to allow this bit to be
used to detect when this propagation is enabled, and allow software to
update the port configuration. This has found to be necessary for some
switches which do not automatically propagate status from the SERDES to
the port, which includes the 88E6390. However, commit 06c5f7f5cb14
("net: dsa: mv88e6xxx: don't use PHY_DETECT on internal PHY's") breaks
this assumption.
Maarten Zanders has confirmed that the issue he was addressing was for
an 88E6250 switch, which does not have a PHY_DETECT bit in bit 12, but
instead a link status bit. Therefore, mv88e6xxx_port_ppu_updates() does
not report correctly.
This patch resolves the above issues by reverting Maarten's change and
instead making mv88e6xxx_port_ppu_updates() indicate whether the port
is internal for the 88E6250 family of switches.
Yes, you're right, I'm targeting the 6250 family. And yes, your
suggestion would solve my case and is a better implementation for
the other devices (as far as I can see).
Fixes: 06c5f7f5cb14 ("net: dsa: mv88e6xxx: don't use PHY_DETECT on internal PHY's") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Tested-by: Maarten Zanders <maarten.zanders@mind.be> Link: https://lore.kernel.org/r/E1muXm7-00EwJB-7n@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jesse Brandeburg [Sat, 13 Nov 2021 01:06:02 +0000 (17:06 -0800)]
ice: safer stats processing
The driver was zeroing live stats that could be fetched by
ndo_get_stats64 at any time. This could result in inconsistent
statistics, and the telltale sign was when reading stats frequently from
/proc/net/dev, the stats would go backwards.
Fix by collecting stats into a local, and delaying when we write to the
structure so it's not incremental.
Fixes: 7c3e0ebc363c ("ice: Add stats and ethtool support") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Hans de Goede [Tue, 7 Dec 2021 12:10:53 +0000 (13:10 +0100)]
HID: Ignore battery for Elan touchscreen on Asus UX550VE
Battery status is reported for the Asus UX550VE touchscreen even though
it does not have a battery. Prevent it from always reporting the
battery as low.
bpf: Add selftests to cover packet access corner cases
This commit adds BPF verifier selftests that cover all corner cases by
packet boundary checks. Specifically, 8-byte packet reads are tested at
the beginning of data and at the beginning of data_meta, using all kinds
of boundary checks (all comparison operators: <, >, <=, >=; both
permutations of operands: data + length compared to end, end compared to
data + length). For each case there are three tests:
1. Length is just enough for an 8-byte read. Length is either 7 or 8,
depending on the comparison.
2. Length is increased by 1 - should still pass the verifier. These
cases are useful, because they failed before commit 31853f3d9196
("bpf: Fix the off-by-two error in range markings").
3. Length is decreased by 1 - should be rejected by the verifier.
Some existing tests are just renamed to avoid duplication.
Joakim Zhang [Mon, 6 Dec 2021 13:54:57 +0000 (21:54 +0800)]
net: fec: only clear interrupt of handling queue in fec_enet_rx_queue()
Background:
We have a customer is running a Profinet stack on the 8MM which receives and
responds PNIO packets every 4ms and PNIO-CM packets every 40ms. However, from
time to time the received PNIO-CM package is "stock" and is only handled when
receiving a new PNIO-CM or DCERPC-Ping packet (tcpdump shows the PNIO-CM and
the DCERPC-Ping packet at the same time but the PNIO-CM HW timestamp is from
the expected 40 ms and not the 2s delay of the DCERPC-Ping).
After debugging, we noticed PNIO, PNIO-CM and DCERPC-Ping packets would
be handled by different RX queues.
The root cause should be driver ack all queues' interrupt when handle a
specific queue in fec_enet_rx_queue(). The blamed patch is introduced to
receive as much packets as possible once to avoid interrupt flooding.
But it's unreasonable to clear other queues'interrupt when handling one
queue, this patch tries to fix it.
Fixes: eeae11157a75 (net: fec: clear receive interrupts before processing a packet) Cc: Russell King <rmk+kernel@arm.linux.org.uk> Reported-by: Nicolas Diaz <nicolas.diaz@nxp.com> Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com> Link: https://lore.kernel.org/r/20211206135457.15946-1-qiangqing.zhang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 17446 Comm: kworker/u4:5 Tainted: G W 5.16.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: krdsd rds_send_worker
Note: I chose an arbitrary commit for the Fixes: tag,
because I do not think we need to backport this fix to very old kernels.
Fixes: 70518a997534 ("netfilter: conntrack: avoid possible false sharing") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Stefano Brivio [Sat, 27 Nov 2021 10:33:38 +0000 (11:33 +0100)]
selftests: netfilter: Add correctness test for mac,net set type
The existing net,mac test didn't cover the issue recently reported
by Nikita Yushchenko, where MAC addresses wouldn't match if given
as first field of a concatenated set with AVX2 and 8-bit groups,
because there's a different code path covering the lookup of six
8-bit groups (MAC addresses) if that's the first field.
Add a similar mac,net test, with MAC address and IPv4 address
swapped in the set specification.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Stefano Brivio [Sat, 27 Nov 2021 10:33:37 +0000 (11:33 +0100)]
nft_set_pipapo: Fix bucket load in AVX2 lookup routine for six 8-bit groups
The sixth byte of packet data has to be looked up in the sixth group,
not in the seventh one, even if we load the bucket data into ymm6
(and not ymm5, for convenience of tracking stalls).
Without this fix, matching on a MAC address as first field of a set,
if 8-bit groups are selected (due to a small set size) would fail,
that is, the given MAC address would never match.
Nicolas Dichtel [Fri, 26 Nov 2021 14:36:12 +0000 (15:36 +0100)]
vrf: don't run conntrack on vrf with !dflt qdisc
After the below patch, the conntrack attached to skb is set to "notrack" in
the context of vrf device, for locally generated packets.
But this is true only when the default qdisc is set to the vrf device. When
changing the qdisc, notrack is not set anymore.
In fact, there is a shortcut in the vrf driver, when the default qdisc is
set, see commit 9b4a779ec95d ("net: vrf: performance improvements for
IPv4") for more details.
This patch ensures that the behavior is always the same, whatever the qdisc
is.
To demonstrate the difference, a new test is added in conntrack_vrf.sh.
Fixes: 37196bb42752 ("vrf: run conntrack only in context of lower/physdev for locally generated packets") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Florian Westphal <fw@strlen.de> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Linus Torvalds [Tue, 7 Dec 2021 23:36:45 +0000 (15:36 -0800)]
Merge tag 'perf-tools-fixes-for-v5.16-2021-12-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:
- Fix SMT detection fast read path on sysfs.
- Fix memory leaks when processing feature headers in perf.data files.
- Fix 'Simple expression parser' 'perf test' on arch without CPU die
topology info, such as s/390.
- Fix building perf with BUILD_BPF_SKEL=1.
- Fix 'perf bench' by reverting "perf bench: Fix two memory leaks
detected with ASan".
- Fix itrace space allowed for new attributes in 'perf script'.
- Fix the build feature detection fast path, that was always failing on
systems with python3 development packages, speeding up the build.
- Reset shadow counts before loading, fixing metrics using
duration_time.
- Sync more kernel headers changed by the new futex_waitv syscall: s390
and powerpc.
* tag 'perf-tools-fixes-for-v5.16-2021-12-07' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf bpf_skel: Do not use typedef to avoid error on old clang
perf bpf: Fix building perf with BUILD_BPF_SKEL=1 by default in more distros
perf header: Fix memory leaks when processing feature headers
perf test: Reset shadow counts before loading
perf test: Fix 'Simple expression parser' test on arch without CPU die topology info
tools build: Remove needless libpython-version feature check that breaks test-all fast path
perf tools: Fix SMT detection fast read path
tools headers UAPI: Sync powerpc syscall table file changed by new futex_waitv syscall
perf inject: Fix itrace space allowed for new attributes
tools headers UAPI: Sync s390 syscall table file changed by new futex_waitv syscall
Revert "perf bench: Fix two memory leaks detected with ASan"
Adding filters with the same values inside for VXLAN and Geneve causes HW
error, because it looks exactly the same. To choose between different
type of tunnels new recipe is needed. Add storing tunnel types in
creating recipes function and start checking it in finding function.
Change getting open tunnels function to return port on correct tunnel
type. This is needed to copy correct port to dummy packet.
Block user from adding enc_dst_port via tc flower, because VXLAN and
Geneve filters can be created only with destination port which was
previously opened.
Fixes: d8d7095ca94a7 ("ice: low level support for tunnels") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
In tunnels packet there can be two UDP headers:
- outer which for hw should be mark as ICE_UDP_OF
- inner which for hw should be mark as ICE_UDP_ILOS or as ICE_TCP_IL if
inner header is of TCP type
In none tunnels packet header can be:
- UDP, which for hw should be mark as ICE_UDP_ILOS
- TCP, which for hw should be mark as ICE_TCP_IL
Change incorrect ICE_UDP_OF for none tunnel packets to ICE_UDP_ILOS.
ICE_UDP_OF is incorrect for none tunnel packets and setting it leads to
error from hw while adding this kind of recipe.
In summary, for tunnel outer port type should always be set to
ICE_UDP_OF, for none tunnel outer and tunnel inner it should always be
set to ICE_UDP_ILOS.
Fixes: 0be0edf75f72 ("ice: VXLAN and Geneve TC support") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Jesse Brandeburg [Sat, 23 Oct 2021 00:28:17 +0000 (17:28 -0700)]
ice: ignore dropped packets during init
If the hardware is constantly receiving unicast or broadcast packets
during driver load, the device previously counted many GLV_RDPC (VSI
dropped packets) events during init. This causes confusing dropped
packet statistics during driver load. The dropped packets counter
incrementing does stop once the driver finishes loading.
Avoid this problem by baselining our statistics at the end of driver
open instead of the end of probe.
Fixes: 03e3b9ab0aad ("ice: Configure VSIs for Tx/Rx") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Dave Ertman [Tue, 12 Oct 2021 20:31:21 +0000 (13:31 -0700)]
ice: Fix problems with DSCP QoS implementation
The patch that implemented DSCP QoS implementation removed a
bandwidth check that was used to check for a specific condition
caused by some corner cases. This check should not of been
removed.
The same patch also added a check for when the DCBx state could
be changed in relation to DSCP, but the check was erroneously
added nested in a check for CEE mode, which made the check useless.
Fix these problems by re-adding the bandwidth check and relocating
the DSCP mode check earlier in the function that changes DCBx state
in the driver.
Fixes: c56146c4c21c ("ice: Add DSCP support") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Paul Greenwalt [Mon, 12 Jul 2021 11:54:25 +0000 (07:54 -0400)]
ice: rearm other interrupt cause register after enabling VFs
The other interrupt cause register (OICR), global interrupt 0, is
disabled when enabling VFs to prevent handling VFLR. If the OICR is
not rearmed then the VF cannot communicate with the PF.
Rearm the OICR after enabling VFs.
Fixes: fcefa27d96fe ("ice: Separate VF VSI initialization/creation from reset flow") Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Yahui Cao [Wed, 5 May 2021 21:18:00 +0000 (14:18 -0700)]
ice: fix FDIR init missing when reset VF
When VF is being reset, ice_reset_vf() will be called and FDIR
resource should be released and initialized again.
Fixes: 116686042684 ("ice: Enable FDIR Configure for AVF") Signed-off-by: Yahui Cao <yahui.cao@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>