Tejun Heo [Thu, 6 Jan 2022 21:02:29 +0000 (11:02 -1000)]
cgroup: Use open-time cgroup namespace for process migration perm checks
cgroup process migration permission checks are performed at write time as
whether a given operation is allowed or not is dependent on the content of
the write - the PID. This currently uses current's cgroup namespace which is
a potential security weakness as it may allow scenarios where a less
privileged process tricks a more privileged one into writing into a fd that
it created.
This patch makes cgroup remember the cgroup namespace at the time of open
and uses it for migration permission checks instad of current's. Note that
this only applies to cgroup2 as cgroup1 doesn't have namespace support.
This also fixes a use-after-free bug on cgroupns reported in
Tejun Heo [Thu, 6 Jan 2022 21:02:29 +0000 (11:02 -1000)]
cgroup: Allocate cgroup_file_ctx for kernfs_open_file->priv
of->priv is currently used by each interface file implementation to store
private information. This patch collects the current two private data usages
into struct cgroup_file_ctx which is allocated and freed by the common path.
This allows generic private data which applies to multiple files, which will
be used to in the following patch.
Note that cgroup_procs iterator is now embedded as procs.iter in the new
cgroup_file_ctx so that it doesn't need to be allocated and freed
separately.
v2: union dropped from cgroup_file_ctx and the procs iterator is embedded in
cgroup_file_ctx as suggested by Linus.
v3: Michal pointed out that cgroup1's procs pidlist uses of->priv too.
Converted. Didn't change to embedded allocation as cgroup1 pidlists get
stored for caching.
Tejun Heo [Thu, 6 Jan 2022 21:02:28 +0000 (11:02 -1000)]
cgroup: Use open-time credentials for process migraton perm checks
cgroup process migration permission checks are performed at write time as
whether a given operation is allowed or not is dependent on the content of
the write - the PID. This currently uses current's credentials which is a
potential security weakness as it may allow scenarios where a less
privileged process tricks a more privileged one into writing into a fd that
it created.
This patch makes both cgroup2 and cgroup1 process migration interfaces to
use the credentials saved at the time of open (file->f_cred) instead of
current's.
Reported-by: "Eric W. Biederman" <ebiederm@xmission.com> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Fixes: 025979bc8fa2 ("cgroup: require write perm on common ancestor when moving processes on the default hierarchy") Reviewed-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
Linus Torvalds [Wed, 5 Jan 2022 22:08:56 +0000 (14:08 -0800)]
Merge tag 'net-5.16-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski"
"Networking fixes, including fixes from bpf, and WiFi. One last pull
request, turns out some of the recent fixes did more harm than good.
Current release - regressions:
- Revert "xsk: Do not sleep in poll() when need_wakeup set", made the
problem worse
- Revert "net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in
__fixed_phy_register", broke EPROBE_DEFER handling
- Revert "net: usb: r8152: Add MAC pass-through support for more
Lenovo Docks", broke setups without a Lenovo dock
Current release - new code bugs:
- selftests: set amt.sh executable
Previous releases - regressions:
- batman-adv: mcast: don't send link-local multicast to mcast routers
Previous releases - always broken:
- ipv4/ipv6: check attribute length for RTA_FLOW / RTA_GATEWAY
- sctp: hold endpoint before calling cb in
sctp_transport_lookup_process
- mac80211: mesh: embed mesh_paths and mpp_paths into
ieee80211_if_mesh to avoid complicated handling of sub-object
allocation failures
- seg6: fix traceroute in the presence of SRv6
- tipc: fix a kernel-infoleak in __tipc_sendmsg()"
* tag 'net-5.16-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits)
selftests: set amt.sh executable
Revert "net: usb: r8152: Add MAC passthrough support for more Lenovo Docks"
sfc: The RX page_ring is optional
iavf: Fix limit of total number of queues to active queues of VF
i40e: Fix incorrect netdev's real number of RX/TX queues
i40e: Fix for displaying message regarding NVM version
i40e: fix use-after-free in i40e_sync_filters_subtask()
i40e: Fix to not show opcode msg on unsuccessful VF MAC change
ieee802154: atusb: fix uninit value in atusb_set_extended_addr
mac80211: mesh: embedd mesh_paths and mpp_paths into ieee80211_if_mesh
mac80211: initialize variable have_higher_than_11mbit
sch_qfq: prevent shift-out-of-bounds in qfq_init_qdisc
netrom: fix copying in user data in nr_setsockopt
udp6: Use Segment Routing Header for dest address if present
icmp: ICMPV6: Examine invoking packet for Segment Route Headers.
seg6: export get_srh() for ICMP handling
Revert "net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register"
ipv6: Do cleanup if attribute validation fails in multipath route
ipv6: Continue processing multipath route even if gateway attribute is invalid
net/fsl: Remove leftover definition in xgmac_mdio
...
Linus Torvalds [Wed, 5 Jan 2022 17:30:10 +0000 (09:30 -0800)]
Merge tag 'gpio-fixes-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:
"Here are two last fixes for this release cycle from the GPIO
subsystem:
- fix irq offset calculation in gpio-aspeed-sgpio
- update the MAINTAINERS entry for gpio-brcmstb"
* tag 'gpio-fixes-for-v5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
MAINTAINERS: update gpio-brcmstb maintainers
gpio: gpio-aspeed-sgpio: Fix wrong hwirq base in irq handler
Jakub Kicinski [Wed, 5 Jan 2022 17:00:11 +0000 (09:00 -0800)]
Merge tag 'ieee802154-for-net-2022-01-05' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan
Stefan Schmidt says:
====================
pull-request: ieee802154 for net 2022-01-05
Below I have a last minute fix for the atusb driver.
Pavel fixes a KASAN uninit report for the driver. This version is the
minimal impact fix to ease backporting. A bigger rework of the driver to
avoid potential similar problems is ongoing and will come through net-next
when ready.
* tag 'ieee802154-for-net-2022-01-05' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan:
ieee802154: atusb: fix uninit value in atusb_set_extended_addr
====================
David S. Miller [Wed, 5 Jan 2022 11:15:16 +0000 (11:15 +0000)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2022-01-04
This series contains updates to i40e and iavf drivers.
Mateusz adjusts displaying of failed VF MAC message when the failure is
expected as well as modifying an NVM info message to not confuse the user
for i40e.
Di Zhu fixes a use-after-free issue MAC filters for i40e.
Jedrzej fixes an issue with misreporting of Rx and Tx queues during
reinitialization for i40e.
Karen correct checking of channel queue configuration to occur against
active queues for iavf.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Habets [Sun, 2 Jan 2022 08:41:22 +0000 (08:41 +0000)]
sfc: The RX page_ring is optional
The RX page_ring is an optional feature that improves
performance. When allocation fails the driver can still
function, but possibly with a lower bandwidth.
Guard against dereferencing a NULL page_ring.
iavf: Fix limit of total number of queues to active queues of VF
In the absence of this validation, if the user requests to
configure queues more than the enabled queues, it results in
sending the requested number of queues to the kernel stack
(due to the asynchronous nature of VF response), in which
case the stack might pick a queue to transmit that is not
enabled and result in Tx hang. Fix this bug by
limiting the total number of queues allocated for VF to
active queues of VF.
Fixes: e3a3d208c68d ("i40evf: add ndo_setup_tc callback to i40evf") Signed-off-by: Ashwin Vijayavel <ashwin.vijayavel@intel.com> Signed-off-by: Karen Sornek <karen.sornek@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
i40e: Fix incorrect netdev's real number of RX/TX queues
There was a wrong queues representation in sysfs during
driver's reinitialization in case of online cpus number is
less than combined queues. It was caused by stopped
NetworkManager, which is responsible for calling vsi_open
function during driver's initialization.
In specific situation (ex. 12 cpus online) there were 16 queues
in /sys/class/net/<iface>/queues. In case of modifying queues with
value higher, than number of online cpus, then it caused write
errors and other errors.
Add updating of sysfs's queues representation during driver
initialization.
Fixes: 79a154c5e3e8 ("i40e: main driver core") Signed-off-by: Lukasz Cieplicki <lukaszx.cieplicki@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
i40e: Fix for displaying message regarding NVM version
When loading the i40e driver, it prints a message like: 'The driver for the
device detected a newer version of the NVM image v1.x than expected v1.y.
Please install the most recent version of the network driver.' This is
misleading as the driver is working as expected.
Fix that by removing the second part of message and changing it from
dev_info to dev_dbg.
Fixes: 8b0ab96cab2d ("i40e: The driver now prints the API version in error message") Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Di Zhu [Mon, 29 Nov 2021 13:52:01 +0000 (19:52 +0600)]
i40e: fix use-after-free in i40e_sync_filters_subtask()
Using ifconfig command to delete the ipv6 address will cause
the i40e network card driver to delete its internal mac_filter and
i40e_service_task kernel thread will concurrently access the mac_filter.
These two processes are not protected by lock
so causing the following use-after-free problems.
i40e: Fix to not show opcode msg on unsuccessful VF MAC change
Hide i40e opcode information sent during response to VF in case when
untrusted VF tried to change MAC on the VF interface.
This is implemented by adding an additional parameter 'hide' to the
response sent to VF function that hides the display of error
information, but forwards the error code to VF.
Previously it was not possible to send response with some error code
to VF without displaying opcode information.
Fixes: eb89ebf92394 ("i40e: implement virtual device interface") Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Reviewed-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Pavel Skripkin [Tue, 4 Jan 2022 18:28:06 +0000 (21:28 +0300)]
ieee802154: atusb: fix uninit value in atusb_set_extended_addr
Alexander reported a use of uninitialized value in
atusb_set_extended_addr(), that is caused by reading 0 bytes via
usb_control_msg().
Fix it by validating if the number of bytes transferred is actually
correct, since usb_control_msg() may read less bytes, than was requested
by caller.
Fail log:
BUG: KASAN: uninit-cmp in ieee802154_is_valid_extended_unicast_addr include/linux/ieee802154.h:310 [inline]
BUG: KASAN: uninit-cmp in atusb_set_extended_addr drivers/net/ieee802154/atusb.c:1000 [inline]
BUG: KASAN: uninit-cmp in atusb_probe.cold+0x29f/0x14db drivers/net/ieee802154/atusb.c:1056
Uninit value used in comparison: 311daa649a2003bd stack handle: 000000009a2003bd
ieee802154_is_valid_extended_unicast_addr include/linux/ieee802154.h:310 [inline]
atusb_set_extended_addr drivers/net/ieee802154/atusb.c:1000 [inline]
atusb_probe.cold+0x29f/0x14db drivers/net/ieee802154/atusb.c:1056
usb_probe_interface+0x314/0x7f0 drivers/usb/core/driver.c:396
Fixes: 11c754c19e5e ("ieee802154: add support for atusb transceiver") Reported-by: Alexander Potapenko <glider@google.com> Acked-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Link: https://lore.kernel.org/r/20220104182806.7188-1-paskripkin@gmail.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
Jakub Kicinski [Tue, 4 Jan 2022 15:18:27 +0000 (07:18 -0800)]
Merge tag 'mac80211-for-net-2022-01-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg says:
====================
Two more changes:
- mac80211: initialize a variable to avoid using it uninitialized
- mac80211 mesh: put some data structures into the container to
fix bugs with and not have to deal with allocation failures
* tag 'mac80211-for-net-2022-01-04' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211:
mac80211: mesh: embedd mesh_paths and mpp_paths into ieee80211_if_mesh
mac80211: initialize variable have_higher_than_11mbit
====================
Pavel Skripkin [Thu, 30 Dec 2021 19:55:47 +0000 (22:55 +0300)]
mac80211: mesh: embedd mesh_paths and mpp_paths into ieee80211_if_mesh
Syzbot hit NULL deref in rhashtable_free_and_destroy(). The problem was
in mesh_paths and mpp_paths being NULL.
mesh_pathtbl_init() could fail in case of memory allocation failure, but
nobody cared, since ieee80211_mesh_init_sdata() returns void. It led to
leaving 2 pointers as NULL. Syzbot has found null deref on exit path,
but it could happen anywhere else, because code assumes these pointers are
valid.
Since all ieee80211_*_setup_sdata functions are void and do not fail,
let's embedd mesh_paths and mpp_paths into parent struct to avoid
adding error handling on higher levels and follow the pattern of others
setup_sdata functions
Fixes: 0ef9e26daa10 ("mac80211: mesh: convert path table to rhashtable") Reported-and-tested-by: syzbot+860268315ba86ea6b96b@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Link: https://lore.kernel.org/r/20211230195547.23977-1-paskripkin@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
mlme.c:5332:7: warning: Branch condition evaluates to a
garbage value
have_higher_than_11mbit)
^~~~~~~~~~~~~~~~~~~~~~~
have_higher_than_11mbit is only set to true some of the time in
ieee80211_get_rates() but is checked all of the time. So
have_higher_than_11mbit needs to be initialized to false.
Fixes: a5ec26c7e553 ("mac80211: set basic rates earlier") Signed-off-by: Tom Rix <trix@redhat.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Link: https://lore.kernel.org/r/20211223162848.3243702-1-trix@redhat.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Eric Dumazet [Tue, 4 Jan 2022 09:45:08 +0000 (01:45 -0800)]
sch_qfq: prevent shift-out-of-bounds in qfq_init_qdisc
tx_queue_len can be set to ~0U, we need to be more
careful about overflows.
__fls(0) is undefined, as this report shows:
UBSAN: shift-out-of-bounds in net/sched/sch_qfq.c:1430:24
shift exponent 51770272 is too large for 32-bit type 'int'
CPU: 0 PID: 25574 Comm: syz-executor.0 Not tainted 5.16.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x201/0x2d8 lib/dump_stack.c:106
ubsan_epilogue lib/ubsan.c:151 [inline]
__ubsan_handle_shift_out_of_bounds+0x494/0x530 lib/ubsan.c:330
qfq_init_qdisc+0x43f/0x450 net/sched/sch_qfq.c:1430
qdisc_create+0x895/0x1430 net/sched/sch_api.c:1253
tc_modify_qdisc+0x9d9/0x1e20 net/sched/sch_api.c:1660
rtnetlink_rcv_msg+0x934/0xe60 net/core/rtnetlink.c:5571
netlink_rcv_skb+0x200/0x470 net/netlink/af_netlink.c:2496
netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
netlink_unicast+0x814/0x9f0 net/netlink/af_netlink.c:1345
netlink_sendmsg+0xaea/0xe60 net/netlink/af_netlink.c:1921
sock_sendmsg_nosec net/socket.c:704 [inline]
sock_sendmsg net/socket.c:724 [inline]
____sys_sendmsg+0x5b9/0x910 net/socket.c:2409
___sys_sendmsg net/socket.c:2463 [inline]
__sys_sendmsg+0x280/0x370 net/socket.c:2492
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
Fixes: 6a4df64a0723 ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This code used to copy in an unsigned long worth of data before
the sockptr_t conversion, so restore that.
Fixes: cf0058199c0b ("net: pass a sockptr_t into ->setsockopt") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 4 Jan 2022 12:17:35 +0000 (12:17 +0000)]
Merge branch 'srv6-traceroute'
Andrew Lunn says:
====================
Fix traceroute in the presence of SRv6
When using SRv6 the destination IP address in the IPv6 header is not
always the true destination, it can be a router along the path that
SRv6 is using.
When ICMP reports an error, e.g, time exceeded, which is what
traceroute uses, it included the packet which invoked the error into
the ICMP message body. Upon receiving such an ICMP packet, the
invoking packet is examined and an attempt is made to find the socket
which sent the packet, so the error can be reported. Lookup is
performed using the source and destination address. If the
intermediary router IP address from the IP header is used, the lookup
fails. It is necessary to dig into the header and find the true
destination address in the Segment Router header, SRH.
v2:
Play games with the skb->network_header rather than clone the skb
v3:
Move helpers into seg6.c
v4:
Move short helper into header file.
Rework getting SRH destination address
v5:
Fix comment to describe function, not caller
Patch 1 exports a helper which can find the SRH in a packet
Patch 2 does the actual examination of the invoking packet
Patch 3 makes use of the results when trying to find the socket.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Mon, 3 Jan 2022 17:11:32 +0000 (18:11 +0100)]
udp6: Use Segment Routing Header for dest address if present
When finding the socket to report an error on, if the invoking packet
is using Segment Routing, the IPv6 destination address is that of an
intermediate router, not the end destination. Extract the ultimate
destination address from the segment address.
This change allows traceroute to function in the presence of Segment
Routing.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Mon, 3 Jan 2022 17:11:31 +0000 (18:11 +0100)]
icmp: ICMPV6: Examine invoking packet for Segment Route Headers.
RFC8754 says:
ICMP error packets generated within the SR domain are sent to source
nodes within the SR domain. The invoking packet in the ICMP error
message may contain an SRH. Since the destination address of a packet
with an SRH changes as each segment is processed, it may not be the
destination used by the socket or application that generated the
invoking packet.
For the source of an invoking packet to process the ICMP error
message, the ultimate destination address of the IPv6 header may be
required. The following logic is used to determine the destination
address for use by protocol-error handlers.
* Walk all extension headers of the invoking IPv6 packet to the
routing extension header preceding the upper-layer header.
- If routing header is type 4 Segment Routing Header (SRH)
o The SID at Segment List[0] may be used as the destination
address of the invoking packet.
Mangle the skb so the network header points to the invoking packet
inside the ICMP packet. The seg6 helpers can then be used on the skb
to find any segment routing headers. If found, mark this fact in the
IPv6 control block of the skb, and store the offset into the packet of
the SRH. Then restore the skb back to its old state.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Mon, 3 Jan 2022 17:11:30 +0000 (18:11 +0100)]
seg6: export get_srh() for ICMP handling
An ICMP error message can contain in its message body part of an IPv6
packet which invoked the error. Such a packet might contain a segment
router header. Export get_srh() so the ICMP code can make use of it.
Since his changes the scope of the function from local to global, add
the seg6_ prefix to keep the namespace clean. And move it into seg6.c
so it is always available, not just when IPV6_SEG6_LWTUNNEL is
enabled.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Tue, 4 Jan 2022 03:50:16 +0000 (19:50 -0800)]
Merge tag 'batadv-net-pullrequest-20220103' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
Here is a batman-adv bugfix:
- avoid sending link-local multicast to multicast routers,
by Linus Lüssing
* tag 'batadv-net-pullrequest-20220103' of git://git.open-mesh.org/linux-merge:
batman-adv: mcast: don't send link-local multicast to mcast routers
====================
Revert "net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register"
This reverts commit 5bd649166faf66557a5f63edb0dfa88089aa7a0e ("net: phy:
fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register")
since it prevents any system that uses a fixed PHY without a GPIO
descriptor from properly working:
[ 5.971952] brcm-systemport 9300000.ethernet: failed to register fixed PHY
[ 5.978854] brcm-systemport: probe of 9300000.ethernet failed with error -22
[ 5.986047] brcm-systemport 9400000.ethernet: failed to register fixed PHY
[ 5.992947] brcm-systemport: probe of 9400000.ethernet failed with error -22
Fixes: 5bd649166faf ("net: phy: fixed_phy: Fix NULL vs IS_ERR() checking in __fixed_phy_register") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220103193453.1214961-1-f.fainelli@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
David Ahern [Mon, 3 Jan 2022 17:19:11 +0000 (10:19 -0700)]
ipv6: Continue processing multipath route even if gateway attribute is invalid
ip6_route_multipath_del loop continues processing the multipath
attribute even if delete of a nexthop path fails. For consistency,
do the same if the gateway attribute is invalid.
Fixes: 7eaa95c8311c ("ipv6: Check attribute length for RTA_GATEWAY when deleting multipath route") Signed-off-by: David Ahern <dsahern@kernel.org> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Link: https://lore.kernel.org/r/20220103171911.94739-1-dsahern@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Steven Lee [Tue, 14 Dec 2021 04:02:38 +0000 (12:02 +0800)]
gpio: gpio-aspeed-sgpio: Fix wrong hwirq base in irq handler
Each aspeed sgpio bank has 64 gpio pins(32 input pins and 32 output pins).
The hwirq base for each sgpio bank should be multiples of 64 rather than
multiples of 32.
Signed-off-by: Steven Lee <steven_lee@aspeedtech.com> Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
- Fix parsing of Intel PT VM time correlation arguments.
- Honour CPU filtering command line request of a script's switch events
in 'perf script'.
- Fix printing of switch events in Intel PT python script.
- Fix duplicate alias events list printing in 'perf list', noticed on
heterogeneous arm64 systems.
- Fix return value of ids__new(), users expect NULL for failure, not
ERR_PTR(-ENOMEM).
* tag 'perf-tools-fixes-for-v5.16-2022-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
perf top: Fix TUI exit screen refresh race condition
perf pmu: Fix alias events list
perf scripts python: intel-pt-events.py: Fix printing of switch events
perf script: Fix CPU filtering of a script's switch events
perf intel-pt: Fix parsing of VM time correlation arguments
perf expr: Fix return value of ids__new()
Markus Koch [Sun, 2 Jan 2022 16:54:08 +0000 (17:54 +0100)]
net/fsl: Remove leftover definition in xgmac_mdio
commit c740435d4913 ("net/fsl: fix a bug in xgmac_mdio") fixed a bug in
the QorIQ mdio driver but left the (now unused) incorrect bit definition
for MDIO_DATA_BSY in the code. This commit removes it.
Signed-off-by: Markus Koch <markus@notsyncing.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 2 Jan 2022 18:36:09 +0000 (10:36 -0800)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"Better input validation for compat ioctls and a documentation bugfix
for 5.16"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
Docs: Fixes link to I2C specification
i2c: validate user data in compat ioctl
Thomas Toye [Sat, 1 Jan 2022 17:22:07 +0000 (18:22 +0100)]
rndis_host: support Hytera digital radios
Hytera makes a range of digital (DMR) radios. These radios can be
programmed to a allow a computer to control them over Ethernet over USB,
either using NCM or RNDIS.
This commit adds support for RNDIS for Hytera radios. I tested with a
Hytera PD785 and a Hytera MD785G. When these radios are programmed to
set up a Radio to PC Network using RNDIS, an USB interface will be added
with class 2 (Communications), subclass 2 (Abstract Modem Control) and
an interface protocol of 255 ("vendor specific" - lsusb even hints "MSFT
RNDIS?").
This patch is similar to the solution of this StackOverflow user, but
that only works for the Hytera MD785:
https://stackoverflow.com/a/53550858
To use the "Radio to PC Network" functionality of Hytera DMR radios, the
radios need to be programmed correctly in CPS (Hytera's Customer
Programming Software). "Forward to PC" should be checked in "Network"
(under "General Setting" in "Conventional") and the "USB Network
Communication Protocol" should be set to RNDIS.
Signed-off-by: Thomas Toye <thomas@toye.io> Signed-off-by: David S. Miller <davem@davemloft.net>
When the following command is executed several times, a coredump file is
generated.
$ timeout -k 9 5 perf top -e task-clock
*******
*******
*******
0.01% [kernel] [k] __do_softirq
0.01% libpthread-2.28.so [.] __pthread_mutex_lock
0.01% [kernel] [k] __ll_sc_atomic64_sub_return
double free or corruption (!prev) perf top --sort comm,dso
timeout: the monitored command dumped core
When we terminate "perf top" using sending signal method,
SLsmg_reset_smg() called. SLsmg_reset_smg() resets the SLsmg screen
management routines by freeing all memory allocated while it was active.
However SLsmg_reinit_smg() maybe be called by another thread.
SLsmg_reinit_smg() will free the same memory accessed by
SLsmg_reset_smg(), thus it results in a double free.
SLsmg_reinit_smg() is called already protected by ui__lock, so we fix
the problem by adding pthread_mutex_trylock of ui__lock when calling
SLsmg_reset_smg().
Signed-off-by: Wenyu Liu <liuwenyu7@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: wuxu.wu@huawei.com Link: http://lore.kernel.org/lkml/a91e3943-7ddc-f5c0-a7f5-360f073c20e6@huawei.com Signed-off-by: Hewenliang <hewenliang4@huawei.com> Signed-off-by: yaowenbin <yaowenbin1@huawei.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
John Garry [Tue, 21 Dec 2021 16:11:30 +0000 (00:11 +0800)]
perf pmu: Fix alias events list
Commit 161dcabbbadd3153 ("perf list: Display hybrid PMU events with cpu
type") changes the event list for uncore PMUs or arm64 heterogeneous CPU
systems, such that duplicate aliases are incorrectly listed per PMU
(which they should not be), like:
# perf list
...
unc_cbo_cache_lookup.any_es
[Unit: uncore_cbox L3 Lookup any request that access cache and found
line in E or S-state]
unc_cbo_cache_lookup.any_es
[Unit: uncore_cbox L3 Lookup any request that access cache and found
line in E or S-state]
unc_cbo_cache_lookup.any_i
[Unit: uncore_cbox L3 Lookup any request that access cache and found
line in I-state]
unc_cbo_cache_lookup.any_i
[Unit: uncore_cbox L3 Lookup any request that access cache and found
line in I-state]
...
Notice how the events are listed twice.
The named commit changed how we remove duplicate events, in that events
for different PMUs are not treated as duplicates. I suppose this is to
handle how "Each hybrid pmu event has been assigned with a pmu name".
Fix PMU alias listing by restoring behaviour to remove duplicates for
non-hybrid PMUs.
Fixes: 161dcabbbadd3153 ("perf list: Display hybrid PMU events with cpu type") Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Zhengjun Xing <zhengjun.xing@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/1640103090-140490-1-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Xin Long [Fri, 31 Dec 2021 23:37:37 +0000 (18:37 -0500)]
sctp: hold endpoint before calling cb in sctp_transport_lookup_process
The same fix in commit 2ab2940504bc ("sctp: use call_rcu to free endpoint")
is also needed for dumping one asoc and sock after the lookup.
Fixes: 2959b44e60ae ("sctp: ensure ep is not destroyed before doing the dump") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ena: Fix error handling when calculating max IO queues number
The role of ena_calc_max_io_queue_num() is to return the number
of queues supported by the device, which means the return value
should be >=0.
The function that calls ena_calc_max_io_queue_num(), checks
the return value. If it is 0, it means the device reported
it supports 0 IO queues. This case is considered an error
and is handled by the calling function accordingly.
However the current implementation of ena_calc_max_io_queue_num()
is wrong, since when it detects the device supports 0 IO queues,
it returns -EFAULT.
In such a case the calling function doesn't detect the error,
and therefore doesn't handle it.
This commit changes ena_calc_max_io_queue_num() to return 0
in case the device reported it supports 0 queues, allowing the
calling function to properly handle the error case.
Fixes: 8fc00a835bc1 ("net: ena: make ethtool -l show correct max number of queues") Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ena: Fix wrong rx request id by resetting device
A wrong request id received from the device is a sign that
something is wrong with it, therefore trigger a device reset.
Also add some debug info to the "Page is NULL" print to make
it easier to debug.
Fixes: 42904aea461f ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ena: Fix undefined state when tx request id is out of bounds
ena_com_tx_comp_req_id_get() checks the req_id of a received completion,
and if it is out of bounds returns -EINVAL. This is a sign that
something is wrong with the device and it needs to be reset.
The current code does not reset the device in this case, which leaves
the driver in an undefined state, where this completion is not properly
handled.
This commit adds a call to handle_invalid_req_id() in ena_clean_tx_irq()
and ena_clean_xdp_irq() which resets the device to fix the issue.
This commit also removes unnecessary request id checks from
validate_tx_req_id() and validate_xdp_req_id(). This check is unneeded
because it was already performed in ena_com_tx_comp_req_id_get(), which
is called right before these functions.
Fixes: 403305c9836f ("net: ena: Implement XDP_TX action") Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Lüssing [Sat, 1 Jan 2022 05:27:13 +0000 (06:27 +0100)]
batman-adv: mcast: don't send link-local multicast to mcast routers
The addition of routable multicast TX handling introduced a
bug/regression for packets with a link-local multicast destination:
These packets would be sent to all batman-adv nodes with a multicast
router and to all batman-adv nodes with an old version without multicast
router detection.
This even disregards the batman-adv multicast fanout setting, which can
potentially lead to an unwanted, high number of unicast transmissions or
even congestion.
Fixing this by avoiding to send link-local multicast packets to nodes in
the multicast router list.
Fixes: 38dbf505ecd4 ("batman-adv: mcast: apply optimizations for routable packets, too") Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Linus Torvalds [Sat, 1 Jan 2022 18:21:49 +0000 (10:21 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
"Two small fixups for spaceball joystick driver and appletouch touchpad
driver"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: spaceball - fix parsing of movement data packets
Input: appletouch - initialize work before device registration
Jianguo Wu [Fri, 31 Dec 2021 02:01:08 +0000 (10:01 +0800)]
selftests: net: udpgro_fwd.sh: explicitly checking the available ping feature
As Paolo pointed out, the result of ping IPv6 address depends on
the running distro. So explicitly checking the available ping feature,
as e.g. do the bareudp.sh self-tests.
We've added 2 non-merge commits during the last 14 day(s) which contain
a total of 2 files changed, 3 insertions(+), 3 deletions(-).
The main changes are:
1) Revert of an earlier attempt to fix xsk's poll() behavior where it
turned out that the fix for a rare problem made it much worse in
general, from Magnus Karlsson. (Fyi, Magnus mentioned that a proper
fix is coming early next year, so the revert is mainly to avoid
slipping the behavior into 5.16.)
2) Minor misc spell fix in BPF selftests, from Colin Ian King.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf, selftests: Fix spelling mistake "tained" -> "tainted"
Revert "xsk: Do not sleep in poll() when need_wakeup set"
====================
Mel Gorman [Fri, 31 Dec 2021 21:10:09 +0000 (13:10 -0800)]
mm: vmscan: reduce throttling due to a failure to make progress -fix
Hugh Dickins reported the following
My tmpfs swapping load (tweaked to use huge pages more heavily
than in real life) is far from being a realistic load: but it was
notably slowed down by your throttling mods in 5.16-rc, and this
patch makes it well again - thanks.
But: it very quickly hit NULL pointer until I changed that last
line to
if (first_pgdat)
consider_reclaim_throttle(first_pgdat, sc);
The likely issue is that huge pages are a major component of the test
workload. When this is the case, first_pgdat may never get set if
compaction is ready to continue due to this check
Mel Gorman [Thu, 2 Dec 2021 15:06:14 +0000 (15:06 +0000)]
mm: vmscan: Reduce throttling due to a failure to make progress
Mike Galbraith, Alexey Avramov and Darrick Wong all reported similar
problems due to reclaim throttling for excessive lengths of time. In
Alexey's case, a memory hog that should go OOM quickly stalls for
several minutes before stalling. In Mike and Darrick's cases, a small
memcg environment stalled excessively even though the system had enough
memory overall.
Commit 3843f54352e0 ("mm/vmscan: throttle reclaim when no progress is
being made") introduced the problem although commit ab68f496da4f
("mm/vmscan: increase the timeout if page reclaim is not making
progress") made it worse. Systems at or near an OOM state that cannot
be recovered must reach OOM quickly and memcg should kill tasks if a
memcg is near OOM.
To address this, only stall for the first zone in the zonelist, reduce
the timeout to 1 tick for VMSCAN_THROTTLE_NOPROGRESS and only stall if
the scan control nr_reclaimed is 0, kswapd is still active and there
were excessive pages pending for writeback. If kswapd has stopped
reclaiming due to excessive failures, do not stall at all so that OOM
triggers relatively quickly. Similarly, if an LRU is simply congested,
only lightly throttle similar to NOPROGRESS.
Alexey's original case was the most straight forward
for i in {1..3}; do tail /dev/zero; done
On vanilla 5.16-rc1, this test stalled heavily, after the patch the test
completes in a few seconds similar to 5.15.
Alexey's second test case added watching a youtube video while tail runs
10 times. On 5.15, playback only jitters slightly, 5.16-rc1 stalls a
lot with lots of frames missing and numerous audio glitches. With this
patch applies, the video plays similarly to 5.15.
Linus Torvalds [Fri, 31 Dec 2021 17:22:25 +0000 (09:22 -0800)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Three fixes, all in drivers. The lpfc one doesn't look exploitable,
but nasty things could happen in string operations if mybuf ends up
with an on stack unterminated string"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: vmw_pvscsi: Set residual data length conditionally
scsi: libiscsi: Fix UAF in iscsi_conn_get_param()/iscsi_conn_teardown()
scsi: lpfc: Terminate string in lpfc_debugfs_nvmeio_trc_write()
SeongJae Park [Fri, 31 Dec 2021 04:12:34 +0000 (20:12 -0800)]
mm/damon/dbgfs: fix 'struct pid' leaks in 'dbgfs_target_ids_write()'
DAMON debugfs interface increases the reference counts of 'struct pid's
for targets from the 'target_ids' file write callback
('dbgfs_target_ids_write()'), but decreases the counts only in DAMON
monitoring termination callback ('dbgfs_before_terminate()').
Therefore, when 'target_ids' file is repeatedly written without DAMON
monitoring start/termination, the reference count is not decreased and
therefore memory for the 'struct pid' cannot be freed. This commit
fixes this issue by decreasing the reference counts when 'target_ids' is
written.
Link: https://lkml.kernel.org/r/20211229124029.23348-1-sj@kernel.org Fixes: ba056afad928 ("mm/damon: implement a debugfs-based user space interface") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> [5.15+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mike Kravetz [Fri, 31 Dec 2021 04:12:31 +0000 (20:12 -0800)]
userfaultfd/selftests: fix hugetlb area allocations
Currently, userfaultfd selftest for hugetlb as run from run_vmtests.sh
or any environment where there are 'just enough' hugetlb pages will
always fail with:
The ENOMEM error code implies there are not enough hugetlb pages.
However, there are free hugetlb pages but they are all reserved. There
is a basic problem with the way the test allocates hugetlb pages which
has existed since the test was originally written.
Due to the way 'cleanup' was done between different phases of the test,
this issue was masked until recently. The issue was uncovered by commit 83cb8ccccbe2 ("userfaultfd/selftests: reinitialize test context in each
test").
For the hugetlb test, src and dst areas are allocated as PRIVATE
mappings of a hugetlb file. This means that at mmap time, pages are
reserved for the src and dst areas. At the start of event testing (and
other tests) the src area is populated which results in allocation of
huge pages to fill the area and consumption of reserves associated with
the area. Then, a child is forked to fault in the dst area. Note that
the dst area was allocated in the parent and hence the parent owns the
reserves associated with the mapping. The child has normal access to
the dst area, but can not use the reserves created/owned by the parent.
Thus, if there are no other huge pages available allocation of a page
for the dst by the child will fail.
Fix by not creating reserves for the dst area. In this way the child
can use free (non-reserved) pages.
Also, MAP_PRIVATE of a file only makes sense if you are interested in
the contents of the file before making a COW copy. The test does not do
this. So, just use MAP_ANONYMOUS | MAP_HUGETLB to create an anonymous
hugetlb mapping. There is no need to create a hugetlb file in the
non-shared case.
Link: https://lkml.kernel.org/r/20211217172919.7861-1-mike.kravetz@oracle.com Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Peter Xu <peterx@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David S. Miller [Fri, 31 Dec 2021 14:32:00 +0000 (14:32 +0000)]
Merge branch 'mpr-len-checks'
David Ahern says:
====================
net: Length checks for attributes within multipath routes
Add length checks for attributes within a multipath route (attributes
within RTA_MULTIPATH). Motivated by the syzbot report in patch 1 and
then expanded to other attributes as noted by Ido.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
lwtunnel_valid_encap_type_attr is used to validate encap attributes
within a multipath route. Add length validation checking to the type.
lwtunnel_valid_encap_type_attr is called converting attributes to
fib{6,}_config struct which means it is used before fib_get_nhs,
ip6_route_multipath_add, and ip6_route_multipath_del - other
locations that use rtnh_ok and then nla_get_u16 on RTA_ENCAP_TYPE
attribute.
Fixes: 118ac1465fb5 ("lwtunnel: fix autoload of lwt modules") Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Fri, 31 Dec 2021 00:36:33 +0000 (17:36 -0700)]
ipv6: Check attribute length for RTA_GATEWAY in multipath route
Commit referenced in the Fixes tag used nla_memcpy for RTA_GATEWAY as
does the current nla_get_in6_addr. nla_memcpy protects against accessing
memory greater than what is in the attribute, but there is no check
requiring the attribute to have an IPv6 address. Add it.
Fixes: ae2659f1b0bd ("ipv6: add support of equal cost multipath (ECMP)") Signed-off-by: David Ahern <dsahern@kernel.org> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Fri, 31 Dec 2021 00:36:32 +0000 (17:36 -0700)]
ipv4: Check attribute length for RTA_FLOW in multipath route
Make sure RTA_FLOW is at least 4B before using.
Fixes: be8a7cf82fb5 ("[IPv4]: FIB configuration using struct fib_config") Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Add helper to validate RTA_GATEWAY length before using the attribute.
Fixes: be8a7cf82fb5 ("[IPv4]: FIB configuration using struct fib_config") Reported-by: syzbot+d4b9a2851cc3ce998741@syzkaller.appspotmail.com Signed-off-by: David Ahern <dsahern@kernel.org> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Deep Majumder [Fri, 19 Nov 2021 06:14:01 +0000 (11:44 +0530)]
Docs: Fixes link to I2C specification
The link to the I2C specification is broken. Although
"https://www.nxp.com" hosts Rev 7 (2021) of this specification, it is
behind a login-wall. Thus, an additional link has been added (which
doesn't require a login) and the NXP official docs link has been
updated.
Signed-off-by: Deep Majumder <deep@fastmail.in>
[wsa: minor updates to text and commit message] Signed-off-by: Wolfram Sang <wsa@kernel.org>
Pavel Skripkin [Thu, 30 Dec 2021 22:47:50 +0000 (01:47 +0300)]
i2c: validate user data in compat ioctl
Wrong user data may cause warning in i2c_transfer(), ex: zero msgs.
Userspace should not be able to trigger warnings, so this patch adds
validation checks for user data in compact ioctl to prevent reported
warnings
Reported-and-tested-by: syzbot+e417648b303855b91d8a@syzkaller.appspotmail.com Fixes: 4e69e2f3444f ("i2c compat ioctls: move to ->compat_ioctl()") Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
Leo L. Schwab [Fri, 31 Dec 2021 05:05:00 +0000 (21:05 -0800)]
Input: spaceball - fix parsing of movement data packets
The spaceball.c module was not properly parsing the movement reports
coming from the device. The code read axis data as signed 16-bit
little-endian values starting at offset 2.
In fact, axis data in Spaceball movement reports are signed 16-bit
big-endian values starting at offset 3. This was determined first by
visually inspecting the data packets, and later verified by consulting:
http://spacemice.org/pdf/SpaceBall_2003-3003_Protocol.pdf
If this ever worked properly, it was in the time before Git...
Pavel Skripkin [Fri, 31 Dec 2021 04:57:46 +0000 (20:57 -0800)]
Input: appletouch - initialize work before device registration
Syzbot has reported warning in __flush_work(). This warning is caused by
work->func == NULL, which means missing work initialization.
This may happen, since input_dev->close() calls
cancel_work_sync(&dev->work), but dev->work initalization happens _after_
input_register_device() call.
So this patch moves dev->work initialization before registering input
device
Linus Torvalds [Fri, 31 Dec 2021 02:25:43 +0000 (18:25 -0800)]
Merge tag 'drm-fixes-2021-12-31' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"This is a bit bigger than I'd like, however it has two weeks of amdgpu
fixes in it, since they missed last week, which was very small.
The nouveau regression is probably the biggest fix in here, and it
needs to go into 5.15 as well, two i915 fixes, and then a scattering
of amdgpu fixes. The biggest fix in there is for a fencing NULL
pointer dereference, the rest are pretty minor.
For the misc team, I've pulled the two misc fixes manually since I'm
not sure what is happening at this time of year!
The amdgpu maintainers have the outstanding runpm regression to fix
still, they are just working through the last bits of it now.
Summary:
nouveau:
- fencing regression fix
i915:
- Fix possible uninitialized variable
- Fix composite fence seqno icrement on each fence creation
* tag 'drm-fixes-2021-12-31' of git://anongit.freedesktop.org/drm/drm:
drm/amd/display: Changed pipe split policy to allow for multi-display pipe split
drm/amd/display: Fix USB4 null pointer dereference in update_psp_stream_config
drm/amd/display: Set optimize_pwr_state for DCN31
drm/amd/display: Send s0i2_rdy in stream_count == 0 optimization
drm/amd/display: Added power down for DCN10
drm/amd/display: fix B0 TMDS deepcolor no dislay issue
drm/amdgpu: no DC support for headless chips
drm/amdgpu: put SMU into proper state on runpm suspending for BOCO capable platform
drm/amdgpu: always reset the asic in suspend (v2)
drm/amd/pm: skip setting gfx cgpg in the s0ix suspend-resume
drm/i915: Increment composite fence seqno
drm/i915: Fix possible uninitialized variable in parallel extension
drm/amdgpu: fix runpm documentation
drm/nouveau: wait for the exclusive fence after the shared ones v2
drm/amdgpu: add support for IP discovery gc_info table v2
drm/amdgpu: When the VCN(1.0) block is suspended, powergating is explicitly enabled
drm/amd/pm: Fix xgmi link control on aldebaran
drm/amdgpu: introduce new amdgpu_fence object to indicate the job embedded fence
drm/amdgpu: fix dropped backing store handling in amdgpu_dma_buf_move_notify
Make sure that finish_mount_kattr() is called after mount_kattr was
succesfully built in both the success and failure case to prevent
leaking any references we took when we built it. We returned early if
path lookup failed thereby risking to leak an additional reference we
took when building mount_kattr when an idmapped mount was requested.
Linus Torvalds [Thu, 30 Dec 2021 19:12:12 +0000 (11:12 -0800)]
Merge tag 'net-5.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from.. Santa?
No regressions on our radar at this point. The igc problem fixed here
was the last one I was tracking but it was broken in previous
releases, anyway. Mostly driver fixes and a couple of largish SMC
fixes.
Current release - regressions:
- xsk: initialise xskb free_list_node, fixup for a -rc7 fix
Current release - new code bugs:
- mlx5: handful of minor fixes:
- use first online CPU instead of hard coded CPU
- fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'
- fix skb memory leak when TC classifier action offloads are disabled
- fix memory leak with rules with internal OvS port
Previous releases - regressions:
- igc: do not enable crosstimestamping for i225-V models
Previous releases - always broken:
- udp: use datalen to cap ipv6 udp max gso segments
- fix use-after-free in tw_timer_handler due to early free of stats
- smc: fix kernel panic caused by race of smc_sock
- smc: don't send CDC/LLC message if link not ready, avoid timeouts
- sctp: use call_rcu to free endpoint, avoid UAF in sock diag
- bridge: mcast: add and enforce query interval minimum
- usb: pegasus: do not drop long Ethernet frames
- mlx5e: fix ICOSQ recovery flow for XSK
- nfc: uapi: use kernel size_t to fix user-space builds"
* tag 'net-5.16-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
fsl/fman: Fix missing put_device() call in fman_port_probe
selftests: net: using ping6 for IPv6 in udpgro_fwd.sh
Documentation: fix outdated interpretation of ip_no_pmtu_disc
net/ncsi: check for error return from call to nla_put_u32
net: bridge: mcast: fix br_multicast_ctx_vlan_global_disabled helper
net: fix use-after-free in tw_timer_handler
selftests: net: Fix a typo in udpgro_fwd.sh
selftests/net: udpgso_bench_tx: fix dst ip argument
net: bridge: mcast: add and enforce startup query interval minimum
net: bridge: mcast: add and enforce query interval minimum
ipv6: raw: check passed optlen before reading
xsk: Initialise xskb free_list_node
net/mlx5e: Fix wrong features assignment in case of error
net/mlx5e: TC, Fix memory leak with rules with internal port
ionic: Initialize the 'lif->dbid_inuse' bitmap
igc: Fix TX timestamp support for non-MSI-X platforms
igc: Do not enable crosstimestamping for i225-V models
net/smc: fix kernel panic caused by race of smc_sock
net/smc: don't send CDC/LLC message if link not ready
NFC: st21nfca: Fix memory leak in device probe and remove
...
Linus Torvalds [Thu, 30 Dec 2021 17:52:32 +0000 (09:52 -0800)]
Merge tag 'char-misc-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc fixes from Greg KH:
"Here are two misc driver fixes for 5.16-final:
- binder accounting fix to resolve reported problem
- nitro_enclaves fix for mmap assert warning output
Both of these have been for over a week with no reported issues"
* tag 'char-misc-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
nitro_enclaves: Use get_user_pages_unlocked() call to handle mmap assert
binder: fix async_free_space accounting for empty parcels
Linus Torvalds [Thu, 30 Dec 2021 17:49:54 +0000 (09:49 -0800)]
Merge tag 'usb-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB driver fixes for 5.16 to resolve some reported
problems:
- mtu3 driver fixes
- typec ucsi driver fix
- xhci driver quirk added
- usb gadget f_fs fix for reported crash
All of these have been in linux-next for a while with no reported
problems"
* tag 'usb-5.16' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: typec: ucsi: Only check the contract if there is a connection
xhci: Fresco FL1100 controller should not have BROKEN_MSI quirk set.
usb: mtu3: set interval of FS intr and isoc endpoint
usb: mtu3: fix list_head check warning
usb: mtu3: add memory barrier before set GPD's HWO
usb: mtu3: fix interval value for intr and isoc
usb: gadget: f_fs: Clear ffs_eventfd in ffs_data_clear.
Miaoqian Lin [Thu, 30 Dec 2021 12:26:27 +0000 (12:26 +0000)]
fsl/fman: Fix missing put_device() call in fman_port_probe
The reference taken by 'of_find_device_by_node()' must be released when
not needed anymore.
Add the corresponding 'put_device()' in the and error handling paths.
Fixes: de0426be3bec ("fsl/fman: Add FMan Port Support") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
xu xin [Thu, 30 Dec 2021 03:28:56 +0000 (03:28 +0000)]
Documentation: fix outdated interpretation of ip_no_pmtu_disc
The updating way of pmtu has changed, but documentation is still in the
old way. So this patch updates the interpretation of ip_no_pmtu_disc and
min_pmtu.
See commit 0356e4775fa01 ("net: ipv4: don't let PMTU updates increase
route MTU")
Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: xu xin <xu.xin16@zte.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 30 Dec 2021 02:19:01 +0000 (18:19 -0800)]
Merge tag 'mlx5-fixes-2021-12-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes 2021-12-28
This series provides bug fixes to mlx5 driver.
* tag 'mlx5-fixes-2021-12-28' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5e: Fix wrong features assignment in case of error
net/mlx5e: TC, Fix memory leak with rules with internal port
====================
Jiasheng Jiang [Wed, 29 Dec 2021 03:21:18 +0000 (11:21 +0800)]
net/ncsi: check for error return from call to nla_put_u32
As we can see from the comment of the nla_put() that it could return
-EMSGSIZE if the tailroom of the skb is insufficient.
Therefore, it should be better to check the return value of the
nla_put_u32 and return the error code if error accurs.
Also, there are many other functions have the same problem, and if this
patch is correct, I will commit a new version to fix all.
We need to first check if the context is a vlan one, then we need to
check the global bridge multicast vlan snooping flag, and finally the
vlan's multicast flag, otherwise we will unnecessarily enable vlan mcast
processing (e.g. querier timers).
This issue was also reported since 2017 in the thread [1],
unfortunately, the issue was still can be reproduced after fixing
DCCP.
The ipv4_mib_exit_net is called before tcp_sk_exit_batch when a net
namespace is destroyed since tcp_sk_ops is registered befrore
ipv4_mib_ops, which means tcp_sk_ops is in the front of ipv4_mib_ops
in the list of pernet_list. There will be a use-after-free on
net->mib.net_statistics in tw_timer_handler after ipv4_mib_exit_net
if there are some inflight time-wait timers.
This bug is not introduced by commit fafe802d207c ("mib: add net to
NET_ADD_STATS_BH") since the net_statistics is a global variable
instead of dynamic allocation and freeing. Actually, commit 1a2db5afe950 ("mib: put net statistics on struct net") introduces
the bug since it put net statistics on struct net and free it when
net namespace is destroyed.
Moving init_ipv4_mibs() to the front of tcp_init() to fix this bug
and replace pr_crit() with panic() since continuing is meaningless
when init_ipv4_mibs() fails.
Fixes: 1a2db5afe950 ("mib: put net statistics on struct net") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Cc: Cong Wang <cong.wang@bytedance.com> Cc: Fam Zheng <fam.zheng@bytedance.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20211228104145.9426-1-songmuchun@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
wujianguo [Wed, 29 Dec 2021 10:58:10 +0000 (18:58 +0800)]
selftests/net: udpgso_bench_tx: fix dst ip argument
udpgso_bench_tx call setup_sockaddr() for dest address before
parsing all arguments, if we specify "-p ${dst_port}" after "-D ${dst_ip}",
then ${dst_port} will be ignored, and using default cfg_port 8000.
This will cause test case "multiple GRO socks" failed in udpgro.sh.
Setup sockaddr after parsing all arguments.
Fixes: 4d4a47023538 ("selftests: udp gso benchmark") Signed-off-by: Jianguo Wu <wujianguo@chinatelecom.cn> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/ff620d9f-5b52-06ab-5286-44b945453002@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Lukas Bulwahn [Wed, 29 Dec 2021 11:15:53 +0000 (12:15 +0100)]
x86/build: Use the proper name CONFIG_FW_LOADER
Commit in Fixes intends to add the expression regex only when FW_LOADER
is enabled - not FW_LOADER_BUILTIN. Latter is a leftover from a previous
patchset and not a valid config item.
So, adjust the condition to the actual name of the config.
====================
net: bridge: mcast: add and enforce query interval minimum
This set adds and enforces 1 second minimum value for bridge multicast
query and startup query intervals in order to avoid rearming the timers
too often which could lock and crash the host. I doubt anyone is using
such low values or anything lower than 1 second, so it seems like a good
minimum. In order to be compatible if the value is lower then it is
overwritten and a log message is emitted, since we can't return an error
at this point.
Eric, I looked for the syzbot reports in its dashboard but couldn't find
them so I've added you as the reporter.
I've prepared a global bridge igmp rate limiting patch but wasn't
sure if it's ok for -net. It adds a static limit of 32k packets per
second, I plan to send it for net-next with added drop counters for
each bridge so it can be easily debugged.
Original report can be seen at:
https://lore.kernel.org/netdev/e8b9ce41-57b9-b6e2-a46a-ff9c791cf0ba@gmail.com/
====================
net: bridge: mcast: add and enforce startup query interval minimum
As reported[1] if startup query interval is set too low in combination with
large number of startup queries and we have multiple bridges or even a
single bridge with multiple querier vlans configured we can crash the
machine. Add a 1 second minimum which must be enforced by overwriting the
value if set lower (i.e. without returning an error) to avoid breaking
user-space. If that happens a log message is emitted to let the admin know
that the startup interval has been set to the minimum. It doesn't make
sense to make the startup interval lower than the normal query interval
so use the same value of 1 second. The issue has been present since these
intervals could be user-controlled.
net: bridge: mcast: add and enforce query interval minimum
As reported[1] if query interval is set too low and we have multiple
bridges or even a single bridge with multiple querier vlans configured
we can crash the machine. Add a 1 second minimum which must be enforced
by overwriting the value if set lower (i.e. without returning an error) to
avoid breaking user-space. If that happens a log message is emitted to let
the administrator know that the interval has been set to the minimum.
The issue has been present since these intervals could be user-controlled.
Tamir Duberstein [Wed, 29 Dec 2021 20:09:47 +0000 (15:09 -0500)]
ipv6: raw: check passed optlen before reading
Add a check that the user-provided option is at least as long as the
number of bytes we intend to read. Before this patch we would blindly
read sizeof(int) bytes even in cases where the user passed
optlen<sizeof(int), which would potentially read garbage or fault.
Discovered by new tests in https://github.com/google/gvisor/pull/6957 .
The original get_user call predates history in the git repo.
Ciara Loftus [Mon, 20 Dec 2021 15:52:50 +0000 (15:52 +0000)]
xsk: Initialise xskb free_list_node
This commit initialises the xskb's free_list_node when the xskb is
allocated. This prevents a potential false negative returned from a call
to list_empty for that node, such as the one introduced in commit 01e87b77bc3b ("xsk: Fix crash on double free in buffer pool")
In my environment this issue caused packets to not be received by
the xdpsock application if the traffic was running prior to application
launch. This happened when the first batch of packets failed the xskmap
lookup and XDP_PASS was returned from the bpf program. This action is
handled in the i40e zc driver (and others) by allocating an skbuff,
freeing the xdp_buff and adding the associated xskb to the
xsk_buff_pool's free_list if it hadn't been added already. Without this
fix, the xskb is not added to the free_list because the check to determine
if it was added already returns an invalid positive result. Later, this
caused allocation errors in the driver and the failure to receive packets.
Fixes: 01e87b77bc3b ("xsk: Fix crash on double free in buffer pool") Fixes: d67be4ed6913 ("xsk: Introduce AF_XDP buffer allocation API") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20211220155250.2746-1-ciara.loftus@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Gal Pressman [Mon, 29 Nov 2021 09:08:41 +0000 (11:08 +0200)]
net/mlx5e: Fix wrong features assignment in case of error
In case of an error in mlx5e_set_features(), 'netdev->features' must be
updated with the correct state of the device to indicate which features
were updated successfully.
To do that we maintain a copy of 'netdev->features' and update it after
successful feature changes, so we can assign it to back to
'netdev->features' if needed.
However, since not all netdev features are handled by the driver (e.g.
GRO/TSO/etc), some features may not be updated correctly in case of an
error updating another feature.
For example, while requesting to disable TSO (feature which is not
handled by the driver) and enable HW-GRO, if an error occurs during
HW-GRO enable, 'oper_features' will be assigned with 'netdev->features'
and HW-GRO turned off. TSO will remain enabled in such case, which is a
bug.
To solve that, instead of using 'netdev->features' as the baseline of
'oper_features' and changing it on set feature success, use 'features'
instead and update it in case of errors.
Fixes: 4606bb368fb5 ("net/mlx5e: Don't override netdev features field unless in error flow") Signed-off-by: Gal Pressman <gal@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Roi Dayan [Wed, 22 Dec 2021 07:20:58 +0000 (09:20 +0200)]
net/mlx5e: TC, Fix memory leak with rules with internal port
Fix a memory leak with decap rule with internal port as destination
device. The driver allocates a modify hdr action but doesn't set
the flow attr modify hdr action which results in skipping releasing
the modify hdr action when releasing the flow.
Fixes: e94cae3b4c31 ("net/mlx5: Support internal port as decap route device") Signed-off-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Jakub Kicinski [Wed, 29 Dec 2021 00:19:09 +0000 (16:19 -0800)]
Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2021-12-28
This series contains updates to igc driver only.
Vinicius disables support for crosstimestamp on i225-V as lockups are being
observed.
James McLaughlin fixes Tx timestamping support on non-MSI-X platforms.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
igc: Fix TX timestamp support for non-MSI-X platforms
igc: Do not enable crosstimestamping for i225-V models
====================
[How]
That assignment occurs later depending on the ASIC version. It's only
needed on DCN31 and only after link_enc is already assigned.
Fixes: 99dc4c444e3a19 ("drm/amd/display: fix a crash on USB4 over C20 PHY") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>