]> git.baikalelectronics.ru Git - kernel.git/log
kernel.git
2 years agoMerge tag 'rcu-urgent.2022.07.21a' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 22 Jul 2022 17:01:20 +0000 (10:01 -0700)]
Merge tag 'rcu-urgent.2022.07.21a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull RCU fix from Paul McKenney:
 "This contains a pair of commits that fix 282d8998e997 ("srcu: Prevent
  expedited GPs and blocking readers from consuming CPU"), which was
  itself a fix to an SRCU expedited grace-period problem that could
  prevent kernel live patching (KLP) from completing.

  That SRCU fix for KLP introduced large (as in minutes) boot-time
  delays to embedded Linux kernels running on qemu/KVM. These delays
  were due to the emulation of certain MMIO operations controlling
  memory layout, which were emulated with one expedited grace period per
  access. Common configurations required thousands of boot-time MMIO
  accesses, and thus thousands of boot-time expedited SRCU grace
  periods.

  In these configurations, the occasional sleeps that allowed KLP to
  proceed caused excessive boot delays. These commits preserve enough
  sleeps to permit KLP to proceed, but few enough that the virtual
  embedded kernels still boot reasonably quickly.

  This represents a regression introduced in the v5.19 merge window, and
  the bug is causing significant inconvenience"

* tag 'rcu-urgent.2022.07.21a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  srcu: Make expedited RCU grace periods block even less frequently
  srcu: Block less aggressively for expedited grace periods

2 years agommu_gather: fix the CONFIG_MMU_GATHER_NO_RANGE case
Linus Torvalds [Fri, 22 Jul 2022 16:28:34 +0000 (09:28 -0700)]
mmu_gather: fix the CONFIG_MMU_GATHER_NO_RANGE case

Sudip reports that alpha doesn't build properly, with errors like

  include/asm-generic/tlb.h:401:1: error: redefinition of 'tlb_update_vma_flags'
    401 | tlb_update_vma_flags(struct mmu_gather *tlb, struct vm_area_struct *vma)
        | ^~~~~~~~~~~~~~~~~~~~
  include/asm-generic/tlb.h:372:1: note: previous definition of 'tlb_update_vma_flags' with type 'void(struct mmu_gather *, struct vm_area_struct *)'
    372 | tlb_update_vma_flags(struct mmu_gather *tlb, struct vm_area_struct *vma) { }

the cause being that We have this odd situation where some architectures
were never converted to the newer TLB flushing interfaces that have a
range for the flush.  Instead people left them alone, and we have them
select the MMU_GATHER_NO_RANGE config option to make the tlb header
files account for this.

Peter Zijlstra cleaned some of these nasty header file games up in
commits

  1e9fdf21a433 ("mmu_gather: Remove per arch tlb_{start,end}_vma()")
  18ba064e42df ("mmu_gather: Let there be one tlb_{start,end}_vma() implementation")

but tlb_update_vma_flags() was left alone, and then commit b67fbebd4cf9
("mmu_gather: Force tlb-flush VM_PFNMAP vmas") ended up removing only
_one_ of the two stale duplicate dummy inline functions.

This removes the other stale one.

Somebody braver than me should try to remove MMU_GATHER_NO_RANGE
entirely, but it requires fixing up the oddball architectures that use
it: alpha, m68k, microblaze, nios2 and openrisc.

The fixups should be fairly straightforward ("fix the build errors it
exposes by adding the appropriate range arguments"), but the reason this
wasn't done in the first place is that so few people end up working on
those architectures.  But it could be done one architecture at a time,
hint, hint.

Reported-by: Sudip Mukherjee (Codethink) <sudipm.mukherjee@gmail.com>
Fixes: b67fbebd4cf9 ("mmu_gather: Force tlb-flush VM_PFNMAP vmas")
Link: https://lore.kernel.org/all/YtpXh0QHWwaEWVAY@debian/
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'mtd/fixes-for-5.19-final' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Thu, 21 Jul 2022 18:28:26 +0000 (11:28 -0700)]
Merge tag 'mtd/fixes-for-5.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux

Pull MTD fix from Richard Weinberger:
 "A aingle NAND controller fix:

   - gpmi: Fix busy timeout setting (wrong calculation, yes again)"

* tag 'mtd/fixes-for-5.19-final' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: rawnand: gpmi: Set WAIT_FOR_READY timeout based on program/erase times

2 years agoMerge tag 'net-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Thu, 21 Jul 2022 18:08:35 +0000 (11:08 -0700)]
Merge tag 'net-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from can.

  Still no major regressions, most of the changes are still due to data
  races fixes, plus the usual bunch of drivers fixes.

  Previous releases - regressions:

   - tcp/udp: make early_demux back namespacified.

   - dsa: fix issues with vlan_filtering_is_global

  Previous releases - always broken:

   - ip: fix data-races around ipv4_net_table (round 2, 3 & 4)

   - amt: fix validation and synchronization bugs

   - can: fix detection of mcp251863

   - eth: iavf: fix handling of dummy receive descriptors

   - eth: lan966x: fix issues with MAC table

   - eth: stmmac: dwmac-mediatek: fix clock issue

  Misc:

   - dsa: update documentation"

* tag 'net-5.19-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (107 commits)
  mlxsw: spectrum_router: Fix IPv4 nexthop gateway indication
  net/sched: cls_api: Fix flow action initialization
  tcp: Fix data-races around sysctl_tcp_max_reordering.
  tcp: Fix a data-race around sysctl_tcp_abort_on_overflow.
  tcp: Fix a data-race around sysctl_tcp_rfc1337.
  tcp: Fix a data-race around sysctl_tcp_stdurg.
  tcp: Fix a data-race around sysctl_tcp_retrans_collapse.
  tcp: Fix data-races around sysctl_tcp_slow_start_after_idle.
  tcp: Fix a data-race around sysctl_tcp_thin_linear_timeouts.
  tcp: Fix data-races around sysctl_tcp_recovery.
  tcp: Fix a data-race around sysctl_tcp_early_retrans.
  tcp: Fix data-races around sysctl knobs related to SYN option.
  udp: Fix a data-race around sysctl_udp_l3mdev_accept.
  ip: Fix data-races around sysctl_ip_prot_sock.
  ipv4: Fix data-races around sysctl_fib_multipath_hash_fields.
  ipv4: Fix data-races around sysctl_fib_multipath_hash_policy.
  ipv4: Fix a data-race around sysctl_fib_multipath_use_neigh.
  can: rcar_canfd: Add missing of_node_put() in rcar_canfd_probe()
  can: mcp251xfd: fix detection of mcp251863
  Documentation: fix udp_wmem_min in ip-sysctl.rst
  ...

2 years agommu_gather: Force tlb-flush VM_PFNMAP vmas
Peter Zijlstra [Fri, 8 Jul 2022 07:18:06 +0000 (09:18 +0200)]
mmu_gather: Force tlb-flush VM_PFNMAP vmas

Jann reported a race between munmap() and unmap_mapping_range(), where
unmap_mapping_range() will no-op once unmap_vmas() has unlinked the
VMA; however munmap() will not yet have invalidated the TLBs.

Therefore unmap_mapping_range() will complete while there are still
(stale) TLB entries for the specified range.

Mitigate this by force flushing TLBs for VM_PFNMAP ranges.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agommu_gather: Let there be one tlb_{start,end}_vma() implementation
Peter Zijlstra [Fri, 8 Jul 2022 07:18:05 +0000 (09:18 +0200)]
mmu_gather: Let there be one tlb_{start,end}_vma() implementation

Now that architectures are no longer allowed to override
tlb_{start,end}_vma() re-arrange code so that there is only one
implementation for each of these functions.

This much simplifies trying to figure out what they actually do.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agocsky/tlb: Remove tlb_flush() define
Peter Zijlstra [Fri, 8 Jul 2022 07:18:04 +0000 (09:18 +0200)]
csky/tlb: Remove tlb_flush() define

The previous patch removed the tlb_flush_end() implementation which
used tlb_flush_range(). This means:

 - csky did double invalidates, a range invalidate per vma and a full
   invalidate at the end

 - csky actually has range invalidates and as such the generic
   tlb_flush implementation is more efficient for it.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Tested-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agommu_gather: Remove per arch tlb_{start,end}_vma()
Peter Zijlstra [Fri, 8 Jul 2022 07:18:03 +0000 (09:18 +0200)]
mmu_gather: Remove per arch tlb_{start,end}_vma()

Scattered across the archs are 3 basic forms of tlb_{start,end}_vma().
Provide two new MMU_GATHER_knobs to enumerate them and remove the per
arch tlb_{start,end}_vma() implementations.

 - MMU_GATHER_NO_FLUSH_CACHE indicates the arch has flush_cache_range()
   but does *NOT* want to call it for each VMA.

 - MMU_GATHER_MERGE_VMAS indicates the arch wants to merge the
   invalidate across multiple VMAs if possible.

With these it is possible to capture the three forms:

  1) empty stubs;
     select MMU_GATHER_NO_FLUSH_CACHE and MMU_GATHER_MERGE_VMAS

  2) start: flush_cache_range(), end: empty;
     select MMU_GATHER_MERGE_VMAS

  3) start: flush_cache_range(), end: flush_tlb_range();
     default

Obviously, if the architecture does not have flush_cache_range() then
it also doesn't need to select MMU_GATHER_NO_FLUSH_CACHE.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Will Deacon <will@kernel.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoscripts/gdb: Fix gdb 'lx-symbols' command
Khalid Masum [Thu, 21 Jul 2022 09:30:42 +0000 (15:30 +0600)]
scripts/gdb: Fix gdb 'lx-symbols' command

Currently the command 'lx-symbols' in gdb exits with the error`Function
"do_init_module" not defined in "kernel/module.c"`. This occurs because
the file kernel/module.c was moved to kernel/module/main.c.

Fix this breakage by changing the path to "kernel/module/main.c" in
LoadModuleBreakpoint.

Signed-off-by: Khalid Masum <khalid.masum.92@gmail.com>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Fixes: cfc1d277891e ("module: Move all into module/")
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agowatch-queue: remove spurious double semicolon
Linus Torvalds [Thu, 21 Jul 2022 17:30:14 +0000 (10:30 -0700)]
watch-queue: remove spurious double semicolon

Sedat Dilek noticed that I had an extraneous semicolon at the end of a
line in the previous patch.

It's harmless, but unintentional, and while compilers just treat it as
an extra empty statement, for all I know some other tooling might warn
about it. So clean it up before other people notice too ;)

Fixes: 353f7988dd84 ("watchqueue: make sure to serialize 'wqueue->defunct' properly")
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
2 years agowatchqueue: make sure to serialize 'wqueue->defunct' properly
Linus Torvalds [Tue, 19 Jul 2022 18:09:01 +0000 (11:09 -0700)]
watchqueue: make sure to serialize 'wqueue->defunct' properly

When the pipe is closed, we mark the associated watchqueue defunct by
calling watch_queue_clear().  However, while that is protected by the
watchqueue lock, new watchqueue entries aren't actually added under that
lock at all: they use the pipe->rd_wait.lock instead, and looking up
that pipe happens without any locking.

The watchqueue code uses the RCU read-side section to make sure that the
wqueue entry itself hasn't disappeared, but that does not protect the
pipe_info in any way.

So make sure to actually hold the wqueue lock when posting watch events,
properly serializing against the pipe being torn down.

Reported-by: Noam Rathaus <noamr@ssd-disclosure.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agolockdown: Fix kexec lockdown bypass with ima policy
Eric Snowberg [Wed, 20 Jul 2022 16:40:27 +0000 (12:40 -0400)]
lockdown: Fix kexec lockdown bypass with ima policy

The lockdown LSM is primarily used in conjunction with UEFI Secure Boot.
This LSM may also be used on machines without UEFI.  It can also be
enabled when UEFI Secure Boot is disabled.  One of lockdown's features
is to prevent kexec from loading untrusted kernels.  Lockdown can be
enabled through a bootparam or after the kernel has booted through
securityfs.

If IMA appraisal is used with the "ima_appraise=log" boot param,
lockdown can be defeated with kexec on any machine when Secure Boot is
disabled or unavailable.  IMA prevents setting "ima_appraise=log" from
the boot param when Secure Boot is enabled, but this does not cover
cases where lockdown is used without Secure Boot.

To defeat lockdown, boot without Secure Boot and add ima_appraise=log to
the kernel command line; then:

  $ echo "integrity" > /sys/kernel/security/lockdown
  $ echo "appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig" > \
    /sys/kernel/security/ima/policy
  $ kexec -ls unsigned-kernel

Add a call to verify ima appraisal is set to "enforce" whenever lockdown
is enabled.  This fixes CVE-2022-21505.

Cc: stable@vger.kernel.org
Fixes: 29d3c1c8dfe7 ("kexec: Allow kexec_file() with appropriate IMA policy when locked down")
Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Acked-by: Mimi Zohar <zohar@linux.ibm.com>
Reviewed-by: John Haxby <john.haxby@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'linux-can-fixes-for-5.19-20220720' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Wed, 20 Jul 2022 10:13:54 +0000 (11:13 +0100)]
Merge tag 'linux-can-fixes-for-5.19-20220720' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
this is a pull request of 2 patches for net/master.

The first patch is by me and fixes the detection of the mcp251863 in
the mcp251xfd driver.

The last patch is by Liang He and adds a missing of_node_put() in the
rcar_canfd driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agomlxsw: spectrum_router: Fix IPv4 nexthop gateway indication
Ido Schimmel [Tue, 19 Jul 2022 12:26:26 +0000 (15:26 +0300)]
mlxsw: spectrum_router: Fix IPv4 nexthop gateway indication

mlxsw needs to distinguish nexthops with a gateway from connected
nexthops in order to write the former to the adjacency table of the
device. The check used to rely on the fact that nexthops with a gateway
have a 'link' scope whereas connected nexthops have a 'host' scope. This
is no longer correct after commit 747c14307214 ("ip: fix dflt addr
selection for connected nexthop").

Fix that by instead checking the address family of the gateway IP. This
is a more direct way and also consistent with the IPv6 counterpart in
mlxsw_sp_rt6_is_gateway().

Cc: stable@vger.kernel.org
Fixes: 747c14307214 ("ip: fix dflt addr selection for connected nexthop")
Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agonet/sched: cls_api: Fix flow action initialization
Oz Shlomo [Tue, 19 Jul 2022 12:24:09 +0000 (15:24 +0300)]
net/sched: cls_api: Fix flow action initialization

The cited commit refactored the flow action initialization sequence to
use an interface method when translating tc action instances to flow
offload objects. The refactored version skips the initialization of the
generic flow action attributes for tc actions, such as pedit, that allocate
more than one offload entry. This can cause potential issues for drivers
mapping flow action ids.

Populate the generic flow action fields for all the flow action entries.

Fixes: c54e1d920f04 ("flow_offload: add ops to tc_action_ops for flow action setup")
Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
----
v1 -> v2:
 - coalese the generic flow action fields initialization to a single loop
Reviewed-by: Baowen Zheng <baowen.zheng@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'net-sysctl-races-round-4'
David S. Miller [Wed, 20 Jul 2022 09:14:50 +0000 (10:14 +0100)]
Merge branch 'net-sysctl-races-round-4'

Kuniyuki Iwashima says:

====================
sysctl: Fix data-races around ipv4_net_table (Round 4).

This series fixes data-races around 17 knobs after fib_multipath_use_neigh
in ipv4_net_table.

tcp_fack was skipped because it's obsolete and there's no readers.

So, round 5 will start with tcp_dsack, 2 rounds left for 27 knobs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix data-races around sysctl_tcp_max_reordering.
Kuniyuki Iwashima [Mon, 18 Jul 2022 17:26:53 +0000 (10:26 -0700)]
tcp: Fix data-races around sysctl_tcp_max_reordering.

While reading sysctl_tcp_max_reordering, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: dca145ffaa8d ("tcp: allow for bigger reordering level")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix a data-race around sysctl_tcp_abort_on_overflow.
Kuniyuki Iwashima [Mon, 18 Jul 2022 17:26:52 +0000 (10:26 -0700)]
tcp: Fix a data-race around sysctl_tcp_abort_on_overflow.

While reading sysctl_tcp_abort_on_overflow, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix a data-race around sysctl_tcp_rfc1337.
Kuniyuki Iwashima [Mon, 18 Jul 2022 17:26:51 +0000 (10:26 -0700)]
tcp: Fix a data-race around sysctl_tcp_rfc1337.

While reading sysctl_tcp_rfc1337, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix a data-race around sysctl_tcp_stdurg.
Kuniyuki Iwashima [Mon, 18 Jul 2022 17:26:50 +0000 (10:26 -0700)]
tcp: Fix a data-race around sysctl_tcp_stdurg.

While reading sysctl_tcp_stdurg, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix a data-race around sysctl_tcp_retrans_collapse.
Kuniyuki Iwashima [Mon, 18 Jul 2022 17:26:49 +0000 (10:26 -0700)]
tcp: Fix a data-race around sysctl_tcp_retrans_collapse.

While reading sysctl_tcp_retrans_collapse, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix data-races around sysctl_tcp_slow_start_after_idle.
Kuniyuki Iwashima [Mon, 18 Jul 2022 17:26:48 +0000 (10:26 -0700)]
tcp: Fix data-races around sysctl_tcp_slow_start_after_idle.

While reading sysctl_tcp_slow_start_after_idle, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: 35089bb203f4 ("[TCP]: Add tcp_slow_start_after_idle sysctl.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix a data-race around sysctl_tcp_thin_linear_timeouts.
Kuniyuki Iwashima [Mon, 18 Jul 2022 17:26:47 +0000 (10:26 -0700)]
tcp: Fix a data-race around sysctl_tcp_thin_linear_timeouts.

While reading sysctl_tcp_thin_linear_timeouts, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: 36e31b0af587 ("net: TCP thin linear timeouts")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix data-races around sysctl_tcp_recovery.
Kuniyuki Iwashima [Mon, 18 Jul 2022 17:26:46 +0000 (10:26 -0700)]
tcp: Fix data-races around sysctl_tcp_recovery.

While reading sysctl_tcp_recovery, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 4f41b1c58a32 ("tcp: use RACK to detect losses")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix a data-race around sysctl_tcp_early_retrans.
Kuniyuki Iwashima [Mon, 18 Jul 2022 17:26:45 +0000 (10:26 -0700)]
tcp: Fix a data-race around sysctl_tcp_early_retrans.

While reading sysctl_tcp_early_retrans, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: eed530b6c676 ("tcp: early retransmit")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix data-races around sysctl knobs related to SYN option.
Kuniyuki Iwashima [Mon, 18 Jul 2022 17:26:44 +0000 (10:26 -0700)]
tcp: Fix data-races around sysctl knobs related to SYN option.

While reading these knobs, they can be changed concurrently.
Thus, we need to add READ_ONCE() to their readers.

  - tcp_sack
  - tcp_window_scaling
  - tcp_timestamps

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoudp: Fix a data-race around sysctl_udp_l3mdev_accept.
Kuniyuki Iwashima [Mon, 18 Jul 2022 17:26:43 +0000 (10:26 -0700)]
udp: Fix a data-race around sysctl_udp_l3mdev_accept.

While reading sysctl_udp_l3mdev_accept, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 63a6fff353d0 ("net: Avoid receiving packets with an l3mdev on unbound UDP sockets")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoip: Fix data-races around sysctl_ip_prot_sock.
Kuniyuki Iwashima [Mon, 18 Jul 2022 17:26:42 +0000 (10:26 -0700)]
ip: Fix data-races around sysctl_ip_prot_sock.

sysctl_ip_prot_sock is accessed concurrently, and there is always a chance
of data-race.  So, all readers and writers need some basic protection to
avoid load/store-tearing.

Fixes: 4548b683b781 ("Introduce a sysctl that modifies the value of PROT_SOCK.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoipv4: Fix data-races around sysctl_fib_multipath_hash_fields.
Kuniyuki Iwashima [Mon, 18 Jul 2022 17:26:41 +0000 (10:26 -0700)]
ipv4: Fix data-races around sysctl_fib_multipath_hash_fields.

While reading sysctl_fib_multipath_hash_fields, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: ce5c9c20d364 ("ipv4: Add a sysctl to control multipath hash fields")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoipv4: Fix data-races around sysctl_fib_multipath_hash_policy.
Kuniyuki Iwashima [Mon, 18 Jul 2022 17:26:40 +0000 (10:26 -0700)]
ipv4: Fix data-races around sysctl_fib_multipath_hash_policy.

While reading sysctl_fib_multipath_hash_policy, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: bf4e0a3db97e ("net: ipv4: add support for ECMP hash policy choice")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoipv4: Fix a data-race around sysctl_fib_multipath_use_neigh.
Kuniyuki Iwashima [Mon, 18 Jul 2022 17:26:39 +0000 (10:26 -0700)]
ipv4: Fix a data-race around sysctl_fib_multipath_use_neigh.

While reading sysctl_fib_multipath_use_neigh, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its reader.

Fixes: a6db4494d218 ("net: ipv4: Consider failed nexthops in multipath routes")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
David S. Miller [Wed, 20 Jul 2022 09:11:58 +0000 (10:11 +0100)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2022-07-20

1) Fix a policy refcount imbalance in xfrm_bundle_lookup.
   From Hangyu Hua.

2) Fix some clang -Wformat warnings.
   Justin Stitt
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agocan: rcar_canfd: Add missing of_node_put() in rcar_canfd_probe()
Liang He [Tue, 12 Jul 2022 09:56:23 +0000 (17:56 +0800)]
can: rcar_canfd: Add missing of_node_put() in rcar_canfd_probe()

We should use of_node_put() for the reference returned by
of_get_child_by_name() which has increased the refcount.

Fixes: 45721c406dcf ("can: rcar_canfd: Add support for r8a779a0 SoC")
Link: https://lore.kernel.org/all/20220712095623.364287-1-windhl@126.com
Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2 years agocan: mcp251xfd: fix detection of mcp251863
Marc Kleine-Budde [Tue, 5 Jul 2022 19:30:38 +0000 (21:30 +0200)]
can: mcp251xfd: fix detection of mcp251863

In commit c6f2a617a0a8 ("can: mcp251xfd: add support for mcp251863")
support for the mcp251863 was added. However it was not taken into
account that the auto detection of the chip model cannot distinguish
between mcp2518fd and mcp251863 and would lead to a warning message if
the firmware specifies a mcp251863.

Fix auto detection: If a mcp2518fd compatible chip is found, keep the
mcp251863 if specified by firmware, use mcp2518fd instead.

Link: https://lore.kernel.org/all/20220706064835.1848864-1-mkl@pengutronix.de
Fixes: c6f2a617a0a8 ("can: mcp251xfd: add support for mcp251863")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2 years agoMerge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Jakub Kicinski [Wed, 20 Jul 2022 00:43:02 +0000 (17:43 -0700)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue

Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2022-07-18

This series contains updates to iavf driver only.

Przemyslaw fixes handling of multiple VLAN requests to account for
individual errors instead of rejecting them all. He removes incorrect
implementations of ETHTOOL_COALESCE_MAX_FRAMES and
ETHTOOL_COALESCE_MAX_FRAMES_IRQ.

He also corrects an issue with NULL pointer caused by improper handling of
dummy receive descriptors. Finally, he corrects debug prints reporting an
unknown state.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  iavf: Fix missing state logs
  iavf: Fix handling of dummy receive descriptors
  iavf: Disallow changing rx/tx-frames and rx/tx-frames-irq
  iavf: Fix VLAN_V2 addition/rejection
====================

Link: https://lore.kernel.org/r/20220718174807.4113582-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoDocumentation: fix udp_wmem_min in ip-sysctl.rst
Xin Long [Mon, 18 Jul 2022 17:56:59 +0000 (13:56 -0400)]
Documentation: fix udp_wmem_min in ip-sysctl.rst

UDP doesn't support tx memory accounting, and sysctl udp_wmem_min
is not really used anywhere. So we should fix the description in
ip-sysctl.rst accordingly.

Fixes: 95766fff6b9a ("[UDP]: Add memory accounting.")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/c880a963d9b1fb5f442ae3c9e4dfa70d45296a16.1658167019.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ethernet: mtk_ppe: fix possible NULL pointer dereference in mtk_flow_get_wdma_info
Lorenzo Bianconi [Mon, 18 Jul 2022 09:51:53 +0000 (11:51 +0200)]
net: ethernet: mtk_ppe: fix possible NULL pointer dereference in mtk_flow_get_wdma_info

odev pointer can be NULL in mtk_flow_offload_replace routine according
to the flower action rules. Fix possible NULL pointer dereference in
mtk_flow_get_wdma_info.

Fixes: a333215e10cb5 ("net: ethernet: mtk_eth_soc: implement flow offloading to WED devices")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/4e1685bc4976e21e364055f6bee86261f8f9ee93.1658137753.git.lorenzo@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agor8152: fix a WOL issue
Hayes Wang [Mon, 18 Jul 2022 08:21:20 +0000 (16:21 +0800)]
r8152: fix a WOL issue

This fixes that the platform is waked by an unexpected packet. The
size and range of FIFO is different when the device enters S3 state,
so it is necessary to correct some settings when suspending.

Regardless of jumbo frame, set RMS to 1522 and MTPS to MTPS_DEFAULT.
Besides, enable MCU_BORW_EN to update the method of calculating the
pointer of data. Then, the hardware could get the correct data.

Fixes: 195aae321c82 ("r8152: support new chips")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Link: https://lore.kernel.org/r/20220718082120.10957-391-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agosrcu: Make expedited RCU grace periods block even less frequently
Neeraj Upadhyay [Fri, 1 Jul 2022 03:15:45 +0000 (08:45 +0530)]
srcu: Make expedited RCU grace periods block even less frequently

The purpose of commit 282d8998e997 ("srcu: Prevent expedited GPs
and blocking readers from consuming CPU") was to prevent a long
series of never-blocking expedited SRCU grace periods from blocking
kernel-live-patching (KLP) progress.  Although it was successful, it also
resulted in excessive boot times on certain embedded workloads running
under qemu with the "-bios QEMU_EFI.fd" command line.  Here "excessive"
means increasing the boot time up into the three-to-four minute range.
This increase in boot time was due to the more than 6000 back-to-back
invocations of synchronize_rcu_expedited() within the KVM host OS, which
in turn resulted from qemu's emulation of a long series of MMIO accesses.

Commit 640a7d37c3f4 ("srcu: Block less aggressively for expedited grace
periods") did not significantly help this particular use case.

Zhangfei Gao and Shameerali Kolothum Thodi did experiments varying the
value of SRCU_MAX_NODELAY_PHASE with HZ=250 and with various values
of non-sleeping per phase counts on a system with preemption enabled,
and observed the following boot times:

+──────────────────────────+────────────────+
| SRCU_MAX_NODELAY_PHASE   | Boot time (s)  |
+──────────────────────────+────────────────+
| 100                      | 30.053         |
| 150                      | 25.151         |
| 200                      | 20.704         |
| 250                      | 15.748         |
| 500                      | 11.401         |
| 1000                     | 11.443         |
| 10000                    | 11.258         |
1000000                  | 11.154         |
+──────────────────────────+────────────────+

Analysis on the experiment results show additional improvements with
CPU-bound delays approaching one jiffy in duration. This improvement was
also seen when number of per-phase iterations were scaled to one jiffy.

This commit therefore scales per-grace-period phase number of non-sleeping
polls so that non-sleeping polls extend for about one jiffy. In addition,
the delay-calculation call to srcu_get_delay() in srcu_gp_end() is
replaced with a simple check for an expedited grace period.  This change
schedules callback invocation immediately after expedited grace periods
complete, which results in greatly improved boot times.  Testing done
by Marc and Zhangfei confirms that this change recovers most of the
performance degradation in boottime; for CONFIG_HZ_250 configuration,
specifically, boot times improve from 3m50s to 41s on Marc's setup;
and from 2m40s to ~9.7s on Zhangfei's setup.

In addition to the changes to default per phase delays, this
change adds 3 new kernel parameters - srcutree.srcu_max_nodelay,
srcutree.srcu_max_nodelay_phase, and srcutree.srcu_retry_check_delay.
This allows users to configure the srcu grace period scanning delays in
order to more quickly react to additional use cases.

Fixes: 640a7d37c3f4 ("srcu: Block less aggressively for expedited grace periods")
Fixes: 282d8998e997 ("srcu: Prevent expedited GPs and blocking readers from consuming CPU")
Reported-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Reported-by: yueluck <yueluck@163.com>
Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Tested-by: Marc Zyngier <maz@kernel.org>
Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Link: https://lore.kernel.org/all/20615615-0013-5adc-584f-2b1d5c03ebfc@linaro.org/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2 years agosrcu: Block less aggressively for expedited grace periods
Paul E. McKenney [Sun, 12 Jun 2022 22:00:06 +0000 (15:00 -0700)]
srcu: Block less aggressively for expedited grace periods

Commit 282d8998e997 ("srcu: Prevent expedited GPs and blocking readers
from consuming CPU") fixed a problem where a long-running expedited SRCU
grace period could block kernel live patching.  It did so by giving up
on expediting once a given SRCU expedited grace period grew too old.

Unfortunately, this added excessive delays to boots of virtual embedded
systems specifying "-bios QEMU_EFI.fd" to qemu.  This commit therefore
makes the transition away from expediting less aggressive, increasing
the per-grace-period phase number of non-sleeping polls of readers from
one to three and increasing the required grace-period age from one jiffy
(actually from zero to one jiffies) to two jiffies (actually from one
to two jiffies).

Fixes: 282d8998e997 ("srcu: Prevent expedited GPs and blocking readers from consuming CPU")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reported-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Reported-by: chenxiang (M)" <chenxiang66@hisilicon.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Link: https://lore.kernel.org/all/20615615-0013-5adc-584f-2b1d5c03ebfc@linaro.org/
2 years agoMerge branch 'amt-fix-validation-and-synchronization-bugs'
Paolo Abeni [Tue, 19 Jul 2022 10:37:04 +0000 (12:37 +0200)]
Merge branch 'amt-fix-validation-and-synchronization-bugs'

Taehee Yoo says:

====================
amt: fix validation and synchronization bugs

There are some synchronization issues in the amt module.
Especially, an amt gateway doesn't well synchronize its own variables
and status(amt->status).
It tries to use a workqueue for handles in a single thread.
A global lock is also good, but it would occur complex locking complex.

In this patchset, only the gateway uses workqueue.
The reason why only gateway interface uses workqueue is that gateway
should manage its own states and variables a little bit statefully.
But relay doesn't need to manage tunnels statefully, stateless is okay.
So, relay side message handlers are okay to be called concurrently.
But it doesn't mean that no lock is needed.

Only amt multicast data message type will not be processed by the work
queue because It contains actual multicast data.
So, it should be processed immediately.

When any amt gateway events are triggered(sending discovery message by
delayed_work, sending request message by delayed_work and receiving
messages), it stores event and skb into the event queue(amt->events[16]).
Then, workqueue processes these events one by one.

The first patch is to use the work queue.

The second patch is to remove unnecessary lock due to a previous patch.

The third patch is to use READ_ONCE() in the amt module.
Even if the amt module uses a single thread, some variables (ready4,
ready6, amt->status) can be accessed concurrently.

The fourth patch is to add missing nonce generation logic when it sends a
new request message.

The fifth patch is to drop unexpected advertisement messages.
advertisement message should be received only after the gateway sends
a discovery message first.
So, the gateway should drop advertisement messages if it has never
sent a discovery message and it also should drop duplicate advertisement
messages.
Using nonce is good to distinguish whether a received message is an
expected message or not.

The sixth patch is to drop unexpected query messages.
This is the same behavior as the fourth patch.
Query messages should be received only after the gateway sends a request
message first.
The nonce variable is used to distinguish whether it is a reply to a
previous request message or not.
amt->ready4 and amt->ready6 are used to distinguish duplicate messages.

The seventh patch is to drop unexpected multicast data.
AMT gateway should not receive multicast data message type before
establish between gateway and relay.
In order to drop unexpected multicast data messages, it checks amt->status.

The last patch is to fix a locking problem on the relay side.
amt->nr_tunnels variable is protected by amt->lock.
But amt_request_handler() doesn't protect this variable.

v2:
 - Use local_bh_disable() instead of rcu_read_lock_bh() in
   amt_membership_query_handler.
 - Fix using uninitialized variables.
 - Fix unexpectedly start the event_wq after stopping.
 - Fix possible deadlock in amt_event_work().
 - Add a limit variable in amt_event_work() to prevent infinite working.
 - Rename amt_queue_events() to amt_queue_event().
====================

Link: https://lore.kernel.org/r/20220717160910.19156-1-ap420073@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoamt: do not use amt->nr_tunnels outside of lock
Taehee Yoo [Sun, 17 Jul 2022 16:09:10 +0000 (16:09 +0000)]
amt: do not use amt->nr_tunnels outside of lock

amt->nr_tunnels is protected by amt->lock.
But, amt_request_handler() has been using this variable without the
amt->lock.
So, it expands context of amt->lock in the amt_request_handler() to
protect amt->nr_tunnels variable.

Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoamt: drop unexpected multicast data
Taehee Yoo [Sun, 17 Jul 2022 16:09:09 +0000 (16:09 +0000)]
amt: drop unexpected multicast data

AMT gateway interface should not receive unexpected multicast data.
Multicast data message type should be received after sending an update
message, which means all establishment between gateway and relay is
finished.
So, amt_multicast_data_handler() checks amt->status.

Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoamt: drop unexpected query message
Taehee Yoo [Sun, 17 Jul 2022 16:09:08 +0000 (16:09 +0000)]
amt: drop unexpected query message

AMT gateway interface should not receive unexpected query messages.
In order to drop unexpected query messages, it checks nonce.
And it also checks ready4 and ready6 variables to drop duplicated messages.

Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoamt: drop unexpected advertisement message
Taehee Yoo [Sun, 17 Jul 2022 16:09:07 +0000 (16:09 +0000)]
amt: drop unexpected advertisement message

AMT gateway interface should not receive unexpected advertisement messages.
In order to drop these packets, it should check nonce and amt->status.

Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoamt: add missing regeneration nonce logic in request logic
Taehee Yoo [Sun, 17 Jul 2022 16:09:06 +0000 (16:09 +0000)]
amt: add missing regeneration nonce logic in request logic

When AMT gateway starts sending a new request message, it should
regenerate the nonce variable.

Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoamt: use READ_ONCE() in amt module
Taehee Yoo [Sun, 17 Jul 2022 16:09:05 +0000 (16:09 +0000)]
amt: use READ_ONCE() in amt module

There are some data races in the amt module.
amt->ready4, amt->ready6, and amt->status can be accessed concurrently
without locks.
So, it uses READ_ONCE() and WRITE_ONCE().

Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoamt: remove unnecessary locks
Taehee Yoo [Sun, 17 Jul 2022 16:09:04 +0000 (16:09 +0000)]
amt: remove unnecessary locks

By the previous patch, amt gateway handlers are changed to worked by
a single thread.
So, most locks for gateway are not needed.
So, it removes.

Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agoamt: use workqueue for gateway side message handling
Taehee Yoo [Sun, 17 Jul 2022 16:09:03 +0000 (16:09 +0000)]
amt: use workqueue for gateway side message handling

There are some synchronization issues(amt->status, amt->req_cnt, etc)
if the interface is in gateway mode because gateway message handlers
are processed concurrently.
This applies a work queue for processing these messages instead of
expanding the locking context.

So, the purposes of this patch are to fix exist race conditions and to make
gateway to be able to validate a gateway status more correctly.

When the AMT gateway interface is created, it tries to establish to relay.
The establishment step looks stateless, but it should be managed well.
In order to handle messages in the gateway, it saves the current
status(i.e. AMT_STATUS_XXX).
This patch makes gateway code to be worked with a single thread.

Now, all messages except the multicast are triggered(received or
delay expired), and these messages will be stored in the event
queue(amt->events).
Then, the single worker processes stored messages asynchronously one
by one.
The multicast data message type will be still processed immediately.

Now, amt->lock is only needed to access the event queue(amt->events)
if an interface is the gateway mode.

Fixes: cbc21dc1cfe9 ("amt: add data plane of amt interface")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: dsa: vitesse-vsc73xx: silent spi_device_id warnings
Oleksij Rempel [Sun, 17 Jul 2022 13:58:31 +0000 (15:58 +0200)]
net: dsa: vitesse-vsc73xx: silent spi_device_id warnings

Add spi_device_id entries to silent SPI warnings.

Fixes: 5fa6863ba692 ("spi: Check we have a spi_device_id for each DT compatible")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20220717135831.2492844-2-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: dsa: sja1105: silent spi_device_id warnings
Oleksij Rempel [Sun, 17 Jul 2022 13:58:30 +0000 (15:58 +0200)]
net: dsa: sja1105: silent spi_device_id warnings

Add spi_device_id entries to silent following warnings:
 SPI driver sja1105 has no spi_device_id for nxp,sja1105e
 SPI driver sja1105 has no spi_device_id for nxp,sja1105t
 SPI driver sja1105 has no spi_device_id for nxp,sja1105p
 SPI driver sja1105 has no spi_device_id for nxp,sja1105q
 SPI driver sja1105 has no spi_device_id for nxp,sja1105r
 SPI driver sja1105 has no spi_device_id for nxp,sja1105s
 SPI driver sja1105 has no spi_device_id for nxp,sja1110a
 SPI driver sja1105 has no spi_device_id for nxp,sja1110b
 SPI driver sja1105 has no spi_device_id for nxp,sja1110c
 SPI driver sja1105 has no spi_device_id for nxp,sja1110d

Fixes: 5fa6863ba692 ("spi: Check we have a spi_device_id for each DT compatible")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220717135831.2492844-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agobe2net: Fix buffer overflow in be_get_module_eeprom
Hristo Venev [Sat, 16 Jul 2022 08:51:34 +0000 (11:51 +0300)]
be2net: Fix buffer overflow in be_get_module_eeprom

be_cmd_read_port_transceiver_data assumes that it is given a buffer that
is at least PAGE_DATA_LEN long, or twice that if the module supports SFF
8472. However, this is not always the case.

Fix this by passing the desired offset and length to
be_cmd_read_port_transceiver_data so that we only copy the bytes once.

Fixes: e36edd9d26cf ("be2net: add ethtool "-m" option support")
Signed-off-by: Hristo Venev <hristo@venev.name>
Link: https://lore.kernel.org/r/20220716085134.6095-1-hristo@venev.name
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2 years agonet: stmmac: remove redunctant disable xPCS EEE call
Wong Vee Khee [Fri, 15 Jul 2022 12:24:02 +0000 (20:24 +0800)]
net: stmmac: remove redunctant disable xPCS EEE call

Disable is done in stmmac_init_eee() on the event of MAC link down.
Since setting enable/disable EEE via ethtool will eventually trigger
a MAC down, removing this redunctant call in stmmac_ethtool.c to avoid
calling xpcs_config_eee() twice.

Fixes: d4aeaed80b0e ("net: stmmac: trigger PCS EEE to turn off on link down")
Signed-off-by: Wong Vee Khee <vee.khee.wong@linux.intel.com>
Link: https://lore.kernel.org/r/20220715122402.1017470-1-vee.khee.wong@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'fix-2-dsa-issues-with-vlan_filtering_is_global'
Jakub Kicinski [Tue, 19 Jul 2022 03:14:27 +0000 (20:14 -0700)]
Merge branch 'fix-2-dsa-issues-with-vlan_filtering_is_global'

Vladimir Oltean says:

====================
Fix 2 DSA issues with vlan_filtering_is_global

This patch set fixes 2 issues with vlan_filtering_is_global switches.

Both are regressions introduced by refactoring commit d0004a020bb5
("net: dsa: remove the "dsa_to_port in a loop" antipattern from the
core"), which wasn't tested on a wide enough variety of switches.

Tested on the sja1105 driver.
====================

Link: https://lore.kernel.org/r/20220715151659.780544-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: fix NULL pointer dereference in dsa_port_reset_vlan_filtering
Vladimir Oltean [Fri, 15 Jul 2022 15:16:59 +0000 (18:16 +0300)]
net: dsa: fix NULL pointer dereference in dsa_port_reset_vlan_filtering

The "ds" iterator variable used in dsa_port_reset_vlan_filtering() ->
dsa_switch_for_each_port() overwrites the "dp" received as argument,
which is later used to call dsa_port_vlan_filtering() proper.

As a result, switches which do enter that code path (the ones with
vlan_filtering_is_global=true) will dereference an invalid dp in
dsa_port_reset_vlan_filtering() after leaving a VLAN-aware bridge.

Use a dedicated "other_dp" iterator variable to avoid this from
happening.

Fixes: d0004a020bb5 ("net: dsa: remove the "dsa_to_port in a loop" antipattern from the core")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: dsa: fix dsa_port_vlan_filtering when global
Vladimir Oltean [Fri, 15 Jul 2022 15:16:58 +0000 (18:16 +0300)]
net: dsa: fix dsa_port_vlan_filtering when global

The blamed refactoring commit changed a "port" iterator with "other_dp",
but still looked at the slave_dev of the dp outside the loop, instead of
other_dp->slave from the loop.

As a result, dsa_port_vlan_filtering() would not call
dsa_slave_manage_vlan_filtering() except for the port in cause, and not
for all switch ports as expected.

Fixes: d0004a020bb5 ("net: dsa: remove the "dsa_to_port in a loop" antipattern from the core")
Reported-by: Lucian Banu <Lucian.Banu@westermo.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoixgbe: Add locking to prevent panic when setting sriov_numvfs to zero
Piotr Skajewski [Fri, 15 Jul 2022 21:44:56 +0000 (14:44 -0700)]
ixgbe: Add locking to prevent panic when setting sriov_numvfs to zero

It is possible to disable VFs while the PF driver is processing requests
from the VF driver.  This can result in a panic.

BUG: unable to handle kernel paging request at 000000000000106c
PGD 0 P4D 0
Oops: 0000 [#1] SMP NOPTI
CPU: 8 PID: 0 Comm: swapper/8 Kdump: loaded Tainted: G I      --------- -
Hardware name: Dell Inc. PowerEdge R740/06WXJT, BIOS 2.8.2 08/27/2020
RIP: 0010:ixgbe_msg_task+0x4c8/0x1690 [ixgbe]
Code: 00 00 48 8d 04 40 48 c1 e0 05 89 7c 24 24 89 fd 48 89 44 24 10 83 ff
01 0f 84 b8 04 00 00 4c 8b 64 24 10 4d 03 a5 48 22 00 00 <41> 80 7c 24 4c
00 0f 84 8a 03 00 00 0f b7 c7 83 f8 08 0f 84 8f 0a
RSP: 0018:ffffb337869f8df8 EFLAGS: 00010002
RAX: 0000000000001020 RBX: 0000000000000000 RCX: 000000000000002b
RDX: 0000000000000002 RSI: 0000000000000008 RDI: 0000000000000006
RBP: 0000000000000006 R08: 0000000000000002 R09: 0000000000029780
R10: 00006957d8f42832 R11: 0000000000000000 R12: 0000000000001020
R13: ffff8a00e8978ac0 R14: 000000000000002b R15: ffff8a00e8979c80
FS:  0000000000000000(0000) GS:ffff8a07dfd00000(0000) knlGS:00000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000106c CR3: 0000000063e10004 CR4: 00000000007726e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <IRQ>
 ? ttwu_do_wakeup+0x19/0x140
 ? try_to_wake_up+0x1cd/0x550
 ? ixgbevf_update_xcast_mode+0x71/0xc0 [ixgbevf]
 ixgbe_msix_other+0x17e/0x310 [ixgbe]
 __handle_irq_event_percpu+0x40/0x180
 handle_irq_event_percpu+0x30/0x80
 handle_irq_event+0x36/0x53
 handle_edge_irq+0x82/0x190
 handle_irq+0x1c/0x30
 do_IRQ+0x49/0xd0
 common_interrupt+0xf/0xf

This can be eventually be reproduced with the following script:

while :
do
    echo 63 > /sys/class/net/<devname>/device/sriov_numvfs
    sleep 1
    echo 0 > /sys/class/net/<devname>/device/sriov_numvfs
    sleep 1
done

Add lock when disabling SR-IOV to prevent process VF mailbox communication.

Fixes: d773d1310625 ("ixgbe: Fix memory leak when SR-IOV VFs are direct assigned")
Signed-off-by: Piotr Skajewski <piotrx.skajewski@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220715214456.2968711-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoi40e: Fix erroneous adapter reinitialization during recovery process
Dawid Lukwinski [Fri, 15 Jul 2022 21:45:41 +0000 (14:45 -0700)]
i40e: Fix erroneous adapter reinitialization during recovery process

Fix an issue when driver incorrectly detects state
of recovery process and erroneously reinitializes interrupts,
which results in a kernel error and call trace message.

The issue was caused by a combination of two factors:
1. Assuming the EMP reset issued after completing
firmware recovery means the whole recovery process is complete.
2. Erroneous reinitialization of interrupt vector after detecting
the above mentioned EMP reset.

Fixes (1) by changing how recovery state change is detected
and (2) by adjusting the conditional expression to ensure using proper
interrupt reinitialization method, depending on the situation.

Fixes: 4ff0ee1af016 ("i40e: Introduce recovery mode support")
Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220715214542.2968762-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: ethernet: mtk_eth_soc: fix off by one check of ARRAY_SIZE
Tom Rix [Sat, 16 Jul 2022 21:46:54 +0000 (17:46 -0400)]
net: ethernet: mtk_eth_soc: fix off by one check of ARRAY_SIZE

In mtk_wed_tx_ring_setup(.., int idx, ..), idx is used as an index here
  struct mtk_wed_ring *ring = &dev->tx_ring[idx];

The bounds of idx are checked here
  BUG_ON(idx > ARRAY_SIZE(dev->tx_ring));

If idx is the size of the array, it will pass this check and overflow.
So change the check to >= .

Fixes: 804775dfc288 ("net: ethernet: mtk_eth_soc: add support for Wireless Ethernet Dispatch (WED)")
Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20220716214654.1540240-1-trix@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge branch 'net-lan966x-fix-issues-with-mac-table'
Jakub Kicinski [Tue, 19 Jul 2022 03:00:05 +0000 (20:00 -0700)]
Merge branch 'net-lan966x-fix-issues-with-mac-table'

Horatiu Vultur says:

====================
net: lan966x: Fix issues with MAC table

The patch series fixes 2 issues:
- when an entry was forgotten the irq thread was holding a spin lock and then
  was talking also rtnl_lock.
- the access to the HW MAC table is indirect, so the access to the HW MAC
  table was not synchronized, which means that there could be race conditions.
====================

Link: https://lore.kernel.org/r/20220714194040.231651-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: lan966x: Fix usage of lan966x->mac_lock when used by FDB
Horatiu Vultur [Thu, 14 Jul 2022 19:40:40 +0000 (21:40 +0200)]
net: lan966x: Fix usage of lan966x->mac_lock when used by FDB

When the SW bridge was trying to add/remove entries to/from HW, the
access to HW was not protected by any lock. In this way, it was
possible to have race conditions.
Fix this by using the lan966x->mac_lock to protect parallel access to HW
for this cases.

Fixes: 25ee9561ec622 ("net: lan966x: More MAC table functionality")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: lan966x: Fix usage of lan966x->mac_lock inside lan966x_mac_irq_handler
Horatiu Vultur [Thu, 14 Jul 2022 19:40:39 +0000 (21:40 +0200)]
net: lan966x: Fix usage of lan966x->mac_lock inside lan966x_mac_irq_handler

The problem with this spin lock is that it was just protecting the list
of the MAC entries in SW and not also the access to the MAC entries in HW.
Because the access to HW is indirect, then it could happen to have race
conditions.
For example when SW introduced an entry in MAC table and the irq mac is
trying to read something from the MAC.
Update such that also the access to MAC entries in HW is protected by
this lock.

Fixes: 5ccd66e01cbef ("net: lan966x: add support for interrupts from analyzer")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: lan966x: Fix usage of lan966x->mac_lock when entry is removed
Horatiu Vultur [Thu, 14 Jul 2022 19:40:38 +0000 (21:40 +0200)]
net: lan966x: Fix usage of lan966x->mac_lock when entry is removed

To remove an entry to the MAC table, it is required first to setup the
entry and then issue a command for the MAC to forget the entry.
So if it happens for two threads to remove simultaneously an entry
in MAC table then it would be a race condition.
Fix this by using lan966x->mac_lock to protect the HW access.

Fixes: e18aba8941b40 ("net: lan966x: add mactable support")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: lan966x: Fix usage of lan966x->mac_lock when entry is added
Horatiu Vultur [Thu, 14 Jul 2022 19:40:37 +0000 (21:40 +0200)]
net: lan966x: Fix usage of lan966x->mac_lock when entry is added

To add an entry to the MAC table, it is required first to setup the
entry and then issue a command for the MAC to learn the entry.
So if it happens for two threads to add simultaneously an entry in MAC
table then it would be a race condition.
Fix this by using lan966x->mac_lock to protect the HW access.

Fixes: fc0c3fe7486f2 ("net: lan966x: Add function lan966x_mac_ip_learn()")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agonet: lan966x: Fix taking rtnl_lock while holding spin_lock
Horatiu Vultur [Thu, 14 Jul 2022 19:40:36 +0000 (21:40 +0200)]
net: lan966x: Fix taking rtnl_lock while holding spin_lock

When the HW deletes an entry in MAC table then it generates an
interrupt. The SW will go through it's own list of MAC entries and if it
is not found then it would notify the listeners about this. The problem
is that when the SW will go through it's own list it would take a spin
lock(lan966x->mac_lock) and when it notifies that the entry is deleted.
But to notify the listeners it taking the rtnl_lock which is illegal.

This is fixed by instead of notifying right away that the entry is
deleted, move the entry on a temp list and once, it checks all the
entries then just notify that the entries from temp list are deleted.

Fixes: 5ccd66e01cbe ("net: lan966x: add support for interrupts from analyzer")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Linus Torvalds [Tue, 19 Jul 2022 00:16:22 +0000 (17:16 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "Two bug fixes for irdma:

   - x722 does not support 1GB pages, trying to configure them will
     corrupt the dma mapping

   - Fix a sleep while holding a spinlock"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/irdma: Fix sleep from invalid context BUG
  RDMA/irdma: Do not advertise 1GB page size for x722

2 years agoMerge tag 'hte/for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux
Linus Torvalds [Mon, 18 Jul 2022 18:47:04 +0000 (11:47 -0700)]
Merge tag 'hte/for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux

Pull hardware timestamp fix from Thierry Reding:
 "A single fix for an out-of-sync kerneldoc comment"

* tag 'hte/for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
  gpiolib: cdev: Fix kernel doc for struct line

2 years agoiavf: Fix missing state logs
Przemyslaw Patynowski [Wed, 15 Jun 2022 17:57:20 +0000 (13:57 -0400)]
iavf: Fix missing state logs

Fix debug prints, by adding missing state prints.

Extend iavf_state_str by strings for __IAVF_INIT_EXTENDED_CAPS and
__IAVF_INIT_CONFIG_ADAPTER.

Without this patch, when enabling debug prints for iavf.h, user will
see:
iavf 0000:06:0e.0: state transition from:__IAVF_INIT_GET_RESOURCES to:__IAVF_UNKNOWN_STATE
iavf 0000:06:0e.0: state transition from:__IAVF_UNKNOWN_STATE to:__IAVF_UNKNOWN_STATE

Fixes: 605ca7c5c670 ("iavf: Fix kernel BUG in free_msi_irqs")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Jun Zhang <xuejun.zhang@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 years agoiavf: Fix handling of dummy receive descriptors
Przemyslaw Patynowski [Sat, 25 Jun 2022 00:33:01 +0000 (17:33 -0700)]
iavf: Fix handling of dummy receive descriptors

Fix memory leak caused by not handling dummy receive descriptor properly.
iavf_get_rx_buffer now sets the rx_buffer return value for dummy receive
descriptors. Without this patch, when the hardware writes a dummy
descriptor, iavf would not free the page allocated for the previous receive
buffer. This is an unlikely event but can still happen.

[Jesse: massaged commit message]

Fixes: efa14c398582 ("iavf: allow null RX descriptors")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 years agoiavf: Disallow changing rx/tx-frames and rx/tx-frames-irq
Przemyslaw Patynowski [Mon, 13 Jun 2022 23:07:42 +0000 (19:07 -0400)]
iavf: Disallow changing rx/tx-frames and rx/tx-frames-irq

Remove from supported_coalesce_params ETHTOOL_COALESCE_MAX_FRAMES
and ETHTOOL_COALESCE_MAX_FRAMES_IRQ. As tx-frames-irq allowed
user to change budget for iavf_clean_tx_irq, remove work_limit
and use define for budget.

Without this patch there would be possibility to change rx/tx-frames
and rx/tx-frames-irq, which for rx/tx-frames did nothing, while for
rx/tx-frames-irq it changed rx/tx-frames and only changed budget
for cleaning NAPI poll.

Fixes: fbb7ddfef253 ("i40evf: core ethtool functionality")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Jun Zhang <xuejun.zhang@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 years agoiavf: Fix VLAN_V2 addition/rejection
Przemyslaw Patynowski [Fri, 10 Jun 2022 12:15:54 +0000 (14:15 +0200)]
iavf: Fix VLAN_V2 addition/rejection

Fix VLAN addition, so that PF driver does not reject whole VLAN batch.
Add VLAN reject handling, so rejected VLANs, won't litter VLAN filter
list. Fix handling of active_(c/s)vlans, so it will be possible to
re-add VLAN filters for user.
Without this patch, after changing trust to off, with VLAN filters
saturated, no VLAN is added, due to PF rejecting addition.

Fixes: 92fc50859872 ("iavf: Restrict maximum VLAN filters for VIRTCHNL_VF_OFFLOAD_VLAN_V2")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2 years agoMerge branch 'dsa-docs'
David S. Miller [Mon, 18 Jul 2022 11:44:37 +0000 (12:44 +0100)]
Merge branch 'dsa-docs'

Vladimir Oltean says:

====================
Update DSA documentation

These are some updates of dsa.rst, since it hasn't kept up with
development (in some cases, even since 2017). I've added Fixes: tags as
I thought was appropriate.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodocs: net: dsa: mention that VLANs are now refcounted on shared ports
Vladimir Oltean [Sat, 16 Jul 2022 18:53:44 +0000 (21:53 +0300)]
docs: net: dsa: mention that VLANs are now refcounted on shared ports

The blamed commit updated the way in which VLANs are handled at the
cross-chip notifier layer and didn't update the documentation to say
that. Fix it.

Fixes: 134ef2388e7f ("net: dsa: add explicit support for host bridge VLANs")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodocs: net: dsa: delete misinformation about -EOPNOTSUPP for FDB/MDB/VLAN
Vladimir Oltean [Sat, 16 Jul 2022 18:53:43 +0000 (21:53 +0300)]
docs: net: dsa: delete misinformation about -EOPNOTSUPP for FDB/MDB/VLAN

Returning -EOPNOTSUPP does *NOT* mean anything special.

port_vlan_add() is actually called from 2 code paths, one is
vlan_vid_add() from 8021q module and the other is
br_switchdev_port_vlan_add() from switchdev.

The bridge has a wrapper __vlan_vid_add() which first tries via
switchdev, then if that returns -EOPNOTSUPP, tries again via the VLAN RX
filters in the 8021q module. But DSA doesn't distinguish between one
call path and the other when calling the driver's port_vlan_add(), so if
the driver returns -EOPNOTSUPP to switchdev, it also returns -EOPNOTSUPP
to the 8021q module. And the latter is a hard error.

port_fdb_add() is called from the deferred dsa_owq only, so obviously
its return code isn't propagated anywhere, and cannot be interpreted in
any way.

The return code from port_mdb_add() is propagated to the bridge, but
again, this doesn't do anything special when -EOPNOTSUPP is returned,
but rather, br_switchdev_mdb_notify() returns void.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodocs: net: dsa: re-explain what port_fdb_dump actually does
Vladimir Oltean [Sat, 16 Jul 2022 18:53:42 +0000 (21:53 +0300)]
docs: net: dsa: re-explain what port_fdb_dump actually does

Switchdev has changed radically from its initial implementation, and the
currently provided definition is incorrect and very confusing.

Rewrite it in light of what it actually does.

Fixes: 2bedde1abbef ("net: dsa: Move FDB dump implementation inside DSA")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodocs: net: dsa: add a section for address databases
Vladimir Oltean [Sat, 16 Jul 2022 18:53:41 +0000 (21:53 +0300)]
docs: net: dsa: add a section for address databases

The given definition for what VID 0 represents in the current
port_fdb_add and port_mdb_add is blatantly wrong. Delete it and explain
the concepts surrounding DSA's understanding of FDB isolation.

Fixes: c26933639b54 ("net: dsa: request drivers to perform FDB isolation")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodocs: net: dsa: delete port_mdb_dump
Vladimir Oltean [Sat, 16 Jul 2022 18:53:40 +0000 (21:53 +0300)]
docs: net: dsa: delete port_mdb_dump

This was deleted in 2017, stop documenting it.

Fixes: dc0cbff3ff9f ("net: dsa: Remove redundant MDB dump support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodocs: net: dsa: remove port_vlan_dump
Vladimir Oltean [Sat, 16 Jul 2022 18:53:39 +0000 (21:53 +0300)]
docs: net: dsa: remove port_vlan_dump

This was deleted in 2017, delete the obsolete documentation.

Fixes: c069fcd82c57 ("net: dsa: Remove support for bypass bridge port attributes/vlan set")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodocs: net: dsa: remove port_bridge_tx_fwd_offload
Vladimir Oltean [Sat, 16 Jul 2022 18:53:38 +0000 (21:53 +0300)]
docs: net: dsa: remove port_bridge_tx_fwd_offload

We've changed the API through which we can offload the bridge TX
forwarding process. Update the documentation in light of the removal of
2 DSA switch ops.

Fixes: b079922ba2ac ("net: dsa: add a "tx_fwd_offload" argument to ->port_bridge_join")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodocs: net: dsa: document port_fast_age
Vladimir Oltean [Sat, 16 Jul 2022 18:53:37 +0000 (21:53 +0300)]
docs: net: dsa: document port_fast_age

The provided information about FDB flushing is not really up to date.
The DSA core automatically calls port_fast_age() when necessary, and
drivers should just implement that rather than hooking it to
port_bridge_leave, port_stp_state_set and others.

Fixes: 732f794c1baf ("net: dsa: add port fast ageing")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodocs: net: dsa: document port_setup and port_teardown
Vladimir Oltean [Sat, 16 Jul 2022 18:53:36 +0000 (21:53 +0300)]
docs: net: dsa: document port_setup and port_teardown

These methods were added without being documented, fix that.

Fixes: fd292c189a97 ("net: dsa: tear down devlink port regions when tearing down the devlink port on error")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodocs: net: dsa: document the teardown method
Vladimir Oltean [Sat, 16 Jul 2022 18:53:35 +0000 (21:53 +0300)]
docs: net: dsa: document the teardown method

A teardown method was added to dsa_switch_ops without being documented.
Do so now.

Fixes: 5e3f847a02aa ("net: dsa: Add teardown callback for drivers")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodocs: net: dsa: document change_tag_protocol
Vladimir Oltean [Sat, 16 Jul 2022 18:53:34 +0000 (21:53 +0300)]
docs: net: dsa: document change_tag_protocol

Support for changing the tagging protocol was added without this
operation being documented; do so now.

Fixes: 53da0ebaad10 ("net: dsa: allow changing the tag protocol via the "tagging" device attribute")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodocs: net: dsa: add more info about the other arguments to get_tag_protocol
Vladimir Oltean [Sat, 16 Jul 2022 18:53:33 +0000 (21:53 +0300)]
docs: net: dsa: add more info about the other arguments to get_tag_protocol

Changes were made to the prototype of get_tag_protocol without
describing at a high level what they are about. Update the documentation
to explain that.

Fixes: 5ed4e3eb0217 ("net: dsa: Pass a port to get_tag_protocol()")
Fixes: 4d776482ecc6 ("net: dsa: Get information about stacked DSA protocol")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodocs: net: dsa: rename tag_protocol to get_tag_protocol
Vladimir Oltean [Sat, 16 Jul 2022 18:53:32 +0000 (21:53 +0300)]
docs: net: dsa: rename tag_protocol to get_tag_protocol

Since the blamed commit, the enum was turned into a function pointer and
also renamed. Update the documentation.

Fixes: 7b314362a234 ("net: dsa: Allow the DSA driver to indicate the tag protocol")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodocs: net: dsa: document the shutdown behavior
Vladimir Oltean [Sat, 16 Jul 2022 18:53:31 +0000 (21:53 +0300)]
docs: net: dsa: document the shutdown behavior

Document the changes that took place in the DSA core in the blamed
commit.

Fixes: 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agodocs: net: dsa: update probing documentation
Vladimir Oltean [Sat, 16 Jul 2022 18:53:30 +0000 (21:53 +0300)]
docs: net: dsa: update probing documentation

Since the blamed commit we don't have register_switch_driver() and
unregister_switch_driver() anymore. Additionally, the expected
dsa_register_switch() and dsa_unregister_switch() calls aren't
documented.

Update the probing section with the details of how things are currently
done.

Fixes: 93e86b3bc842 ("net: dsa: Remove legacy probing support")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoMerge branch 'net-ipv4-sysctl-races-part-3'
David S. Miller [Mon, 18 Jul 2022 11:21:54 +0000 (12:21 +0100)]
Merge branch 'net-ipv4-sysctl-races-part-3'

Kuniyuki Iwashima says:

====================
sysctl: Fix data-races around ipv4_net_table (Round 3).

This series fixes data-races around 21 knobs after
igmp_link_local_mcast_reports in ipv4_net_table.

These 4 knobs are skipped because they are safe.

  - tcp_congestion_control: Safe with RCU and xchg().
  - tcp_available_congestion_control: Read only.
  - tcp_allowed_congestion_control: Safe with RCU and spinlock().
  - tcp_fastopen_key: Safe with RCU and xchg()

So, round 4 will start with fib_multipath_use_neigh.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix data-races around sysctl_tcp_fastopen_blackhole_timeout.
Kuniyuki Iwashima [Fri, 15 Jul 2022 17:17:55 +0000 (10:17 -0700)]
tcp: Fix data-races around sysctl_tcp_fastopen_blackhole_timeout.

While reading sysctl_tcp_fastopen_blackhole_timeout, it can be changed
concurrently.  Thus, we need to add READ_ONCE() to its readers.

Fixes: cf1ef3f0719b ("net/tcp_fastopen: Disable active side TFO in certain scenarios")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix data-races around sysctl_tcp_fastopen.
Kuniyuki Iwashima [Fri, 15 Jul 2022 17:17:54 +0000 (10:17 -0700)]
tcp: Fix data-races around sysctl_tcp_fastopen.

While reading sysctl_tcp_fastopen, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 2100c8d2d9db ("net-tcp: Fast Open base")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix data-races around sysctl_max_syn_backlog.
Kuniyuki Iwashima [Fri, 15 Jul 2022 17:17:53 +0000 (10:17 -0700)]
tcp: Fix data-races around sysctl_max_syn_backlog.

While reading sysctl_max_syn_backlog, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix a data-race around sysctl_tcp_tw_reuse.
Kuniyuki Iwashima [Fri, 15 Jul 2022 17:17:52 +0000 (10:17 -0700)]
tcp: Fix a data-race around sysctl_tcp_tw_reuse.

While reading sysctl_tcp_tw_reuse, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix a data-race around sysctl_tcp_notsent_lowat.
Kuniyuki Iwashima [Fri, 15 Jul 2022 17:17:51 +0000 (10:17 -0700)]
tcp: Fix a data-race around sysctl_tcp_notsent_lowat.

While reading sysctl_tcp_notsent_lowat, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its reader.

Fixes: c9bee3b7fdec ("tcp: TCP_NOTSENT_LOWAT socket option")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix data-races around some timeout sysctl knobs.
Kuniyuki Iwashima [Fri, 15 Jul 2022 17:17:50 +0000 (10:17 -0700)]
tcp: Fix data-races around some timeout sysctl knobs.

While reading these sysctl knobs, they can be changed concurrently.
Thus, we need to add READ_ONCE() to their readers.

  - tcp_retries1
  - tcp_retries2
  - tcp_orphan_retries
  - tcp_fin_timeout

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix data-races around sysctl_tcp_reordering.
Kuniyuki Iwashima [Fri, 15 Jul 2022 17:17:49 +0000 (10:17 -0700)]
tcp: Fix data-races around sysctl_tcp_reordering.

While reading sysctl_tcp_reordering, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix data-races around sysctl_tcp_migrate_req.
Kuniyuki Iwashima [Fri, 15 Jul 2022 17:17:48 +0000 (10:17 -0700)]
tcp: Fix data-races around sysctl_tcp_migrate_req.

While reading sysctl_tcp_migrate_req, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: f9ac779f881c ("net: Introduce net.ipv4.tcp_migrate_req.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix data-races around sysctl_tcp_syncookies.
Kuniyuki Iwashima [Fri, 15 Jul 2022 17:17:47 +0000 (10:17 -0700)]
tcp: Fix data-races around sysctl_tcp_syncookies.

While reading sysctl_tcp_syncookies, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix data-races around sysctl_tcp_syn(ack)?_retries.
Kuniyuki Iwashima [Fri, 15 Jul 2022 17:17:46 +0000 (10:17 -0700)]
tcp: Fix data-races around sysctl_tcp_syn(ack)?_retries.

While reading sysctl_tcp_syn(ack)?_retries, they can be changed
concurrently.  Thus, we need to add READ_ONCE() to their readers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agotcp: Fix data-races around keepalive sysctl knobs.
Kuniyuki Iwashima [Fri, 15 Jul 2022 17:17:45 +0000 (10:17 -0700)]
tcp: Fix data-races around keepalive sysctl knobs.

While reading sysctl_tcp_keepalive_(time|probes|intvl), they can be changed
concurrently.  Thus, we need to add READ_ONCE() to their readers.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2 years agoigmp: Fix data-races around sysctl_igmp_qrv.
Kuniyuki Iwashima [Fri, 15 Jul 2022 17:17:44 +0000 (10:17 -0700)]
igmp: Fix data-races around sysctl_igmp_qrv.

While reading sysctl_igmp_qrv, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

This test can be packed into a helper, so such changes will be in the
follow-up series after net is merged into net-next.

  qrv ?: READ_ONCE(net->ipv4.sysctl_igmp_qrv);

Fixes: a9fe8e29945d ("ipv4: implement igmp_qrv sysctl to tune igmp robustness variable")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>