git.baikalelectronics.ru Git

net: add skb_[inner_]tcp_all_headers helpers

Most drivers use "skb_transport_offset(skb) + tcp_hdrlen(skb)"
to compute headers length for a TCP packet, but others
use more convoluted (but equivalent) ways.

Add skb_tcp_all_headers() and skb_inner_tcp_all_headers()
helpers to harmonize this a bit.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

qlogic/qed: fix repeated words in comments

Delete the redundant word 'a'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Link: https://lore.kernel.org/r/20220630123924.7531-1-yuanjilin@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

samsung/sxgbe: fix repeated words in comments

Delete the redundant word 'are'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Link: https://lore.kernel.org/r/20220630124639.11420-1-yuanjilin@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

stmicro/stmmac: fix repeated words in comments

Delete the redundant word 'all'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Link: https://lore.kernel.org/r/20220630125222.14357-1-yuanjilin@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ethernet/sun: fix repeated words in comments

Delete the redundant word 'the'.
Delete the redundant word 'is'.
Delete the redundant word 'start'.
Delete the redundant word 'checking'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Link: https://lore.kernel.org/r/20220630130916.21074-1-yuanjilin@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

usbnet: remove vestiges of debug macros

The driver has long since be converted to dynamic debugging.
The internal compile options for more debugging can simply be
deleted.

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Link: https://lore.kernel.org/r/20220630110741.21314-1-oneukum@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: pcs: rzn1-miic: update speed only if interface is changed

As stated by Russel King, miic_config() can be called as a result of
ethtool setting the configuration while the link is already up. Since
the speed is also set in this function, it could potentially modify
the current speed that is set. This will only happen if there is
no PHY present and we aren't using fixed-link mode.

Handle that by storing the current interface mode in the miic_port
structure and update the speed only if the interface mode is going to
be changed.

Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Link: https://lore.kernel.org/r/20220629122003.189397-1-clement.leger@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

octeontx2-af: fix operand size in bitwise operation

Made size of operands same in bitwise operations.

The patch fixes the klocwork issue, operands in a bitwise operation have
different size at line 375 and 483.

Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Link: https://lore.kernel.org/r/f4fba33fe4f89b420b4da11d51255e7cc6ea1dbf.1656586269.git.sthotton@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net/cmsg_sender: Remove a semicolon

Remove the repeated ';' from code.

Signed-off-by: Li kunyu <kunyu@nfschina.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nfp: support VF rate limit with NFDK

Support VF rate limiting with NFDK by adding ndo_set_vf_rate to the NFDK
ops structure.

NFDK is used to communicate via PCIE to NFP-3800 based NICs
while NFD3 is used for other NICs supported by the NFP driver.
The VF rate limit feature is already supported by the driver for NFD3.

Signed-off-by: Bin Chen <bin.chen@corigine.com>
Reviewed-by: Baowen Zheng <baowen.zheng@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cdc-eem: always use BIT

Either you use BIT(x) or 1 << x in the same expression.
Mixing them is ridiculous. Go to BIT()

Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cxgb4: Fix typo in string

Remove the repeated ',' from string

Signed-off-by: Li kunyu <kunyu@nfschina.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: net: fib_rule_tests: fix support for running individual tests

parsing and usage of -t got missed in the previous patch.
this patch fixes it

Fixes: 00670779b23e ("selftests: net: fib_rule_tests: add support to select a test to run")
Signed-off-by: Alaa Mohamed <eng.alaamohamedsoliman.am@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mptcp-mem-scheduling'

Mat Martineau says:

====================
mptcp: Updates for mem scheduling and SK_RECLAIM

In the "net: reduce tcp_memory_allocated inflation" series (merge commit
36a2db252d65), Eric Dumazet noted that "Removal of SK_RECLAIM_CHUNK and
SK_RECLAIM_THRESHOLD is left to MPTCP maintainers as a follow up."

Patches 1-3 align MPTCP with the above TCP changes to forward memory
allocation, reclaim, and memory scheduling.

Patch 4 removes the SK_RECLAIM_* macros as Eric requested.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: remove SK_RECLAIM_THRESHOLD and SK_RECLAIM_CHUNK

There are no more users for the mentioned macros, just
drop them.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mptcp: refine memory scheduling

Similar to commit 91efb764f795 ("net: fix sk_wmem_schedule() and
sk_rmem_schedule() errors"), let the MPTCP receive path schedule
exactly the required amount of memory.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mptcp: drop SK_RECLAIM_* macros

After commit 51d8ac882d1b ("net: keep sk->sk_forward_alloc as small as
possible"), the MPTCP protocol is the last SK_RECLAIM_CHUNK and
SK_RECLAIM_THRESHOLD users.

Update the MPTCP reclaim schema to match the core/TCP one and drop the
mentioned macros. This additionally clean the MPTCP code a bit.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mptcp: never fetch fwd memory from the subflow

The memory accounting is broken in such exceptional code
path, and after commit 51d8ac882d1b ("net: keep sk->sk_forward_alloc
as small as possible") we can't find much help there.

Drop the broken code.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/nex
t-queue

Tony Nguyen says:

====================
100GbE Intel Wired LAN Driver Updates 2022-06-30

This series contains updates to ice driver only.

Martyna adds support for VLAN related TC switchdev filters and reworks
dummy packet implementation of VLANs to enable dynamic header insertion to
allow for more rule types.

Lu Wei utilizes eth_broadcast_addr() helper over an open coded version.

Ziyang Xuan removes unneeded NULL checks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

neterion/vxge: fix repeated words in comments

Delete the redundant word 'frame'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethernet/neterion: fix repeated words in comments

Delete the redundant word 'the'.
Delete the redundant word 'a'.
Delete the redundant word 'frame'.
Delete the redundant word 'is'.
Delete the redundant word 'not'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethernet/natsemi: fix repeated words in comments

Delete the redundant word 'in'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mellanox/mlxsw: fix repeated words in comments

Delete the redundant word 'action'.
Delete the redundant word 'refer'.
Delete the redundant word 'for'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ethernet/marvell: fix repeated words in comments

Delete the redundant word 'a'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

marvell/octeontx2/af: fix repeated words in comments

Delete the redundant word 'so'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'prevent-permanently-closed-tc-taprio-gates-from-blocking-a-felix-dsa-switch-port'

Vladimir Oltean says:

====================
Prevent permanently closed tc-taprio gates from blocking a Felix DSA switch port

Richie Pearn reports that if we install a tc-taprio schedule on a Felix
switch port, and that schedule has at least one gate that never opens
(for example TC0 below):

tc qdisc add dev swp1 root taprio num_tc 8 map 0 1 2 3 4 5 6 7 \
queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
base-time 0 sched-entry S fe 1000000 flags 0x2

then packets classified to the permanently closed traffic class will not
be dequeued by the egress port. They will just remain in the queue
system, to consume resources. Frame aging does not trigger either,
because in order for that to happen, the packets need to be eligible for
egress scheduling in the first place, which they aren't. If that port is
allowed to consume the entire shared buffer of the switch (as we
configure things by default using devlink-sb), then eventually, by
sending enough packets, the entire switch will hang.

If we think enough about the problem, we realize that this is only a
special case of a more general issue, and can also be reproduced with
gates that aren't permanently closed, but are not large enough to send
an entire frame. In that sense, a permanently closed gate is simply a
case where all frames are oversized.

The ENETC has logic to reject transmitted packets that would overrun the
time window - see commit aad47a97afd1 ("net: enetc: count the tc-taprio
window drops").

The Felix switch has no such thing on a per-packet basis, but it has a
register replicated per {egress port, TC} which essentially limits the
max MTU. A packet which exceeds the per-port-TC MTU is immediately
discarded and therefore will not hang the port anymore (albeit, sadly,
this only bumps a generic drop hardware counter and we cannot really
infer the reason such as to offer a dedicated counter for these events).

This patch set calculates the max MTU per {port, TC} when the tc-taprio
config, or link speed, or port-global MTU values change. This solves the
larger "gate too small for packet" problem, but also the original issue
with the gate permanently closed that was reported by Richie.
====================

Link: https://lore.kernel.org/r/20220628145238.3247853-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

time64.h: consolidate uses of PSEC_PER_NSEC

Time-sensitive networking code needs to work with PTP times expressed in
nanoseconds, and with packet transmission times expressed in
picoseconds, since those would be fractional at higher than gigabit
speed when expressed in nanoseconds.

Convert the existing uses in tc-taprio and the ocelot/felix DSA driver
to a PSEC_PER_NSEC macro. This macro is placed in include/linux/time64.h
as opposed to its relatives (PSEC_PER_SEC etc) from include/vdso/time64.h
because the vDSO library does not (yet) need/use it.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> # for the vDSO parts
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: felix: drop oversized frames with tc-taprio instead of hanging the port

Currently, sending a packet into a time gate too small for it (or always
closed) causes the queue system to hold the frame forever. Even worse,
this frame isn't subject to aging either, because for that to happen, it
needs to be scheduled for transmission in the first place. But the frame
will consume buffer memory and frame references while it is forever held
in the queue system.

Before commit f24f9e3d8757 ("net: mscc: ocelot: initialize watermarks to
sane defaults"), this behavior was somewhat subtle, as the switch had a
more intricately tuned default watermark configuration out of reset,
which did not allow any single port and tc to consume the entire switch
buffer space. Nonetheless, the held frames are still there, and they
reduce the total backplane capacity of the switch.

However, after the aforementioned commit, the behavior can be very
clearly seen, since we deliberately allow each {port, tc} to consume the
entire shared buffer of the switch minus the reservations (and we
disable all reservations by default). That is to say, we allow a
permanently closed tc-taprio gate to hang the entire switch.

A careful inspection of the documentation shows that the QSYS:Q_MAX_SDU
per-port-tc registers serve 2 purposes: one is for guard band calculation
(when zero, this falls back to QSYS:PORT_MAX_SDU), and the other is to
enable oversized frame dropping (when non-zero).

Currently the QSYS:Q_MAX_SDU registers are all zero, so oversized frame
dropping is disabled. The goal of the change is to enable it seamlessly.
For that, we need to hook into the MTU change, tc-taprio change, and
port link speed change procedures, since we depend on these variables.

Frames are not dropped on egress due to a queue system oversize
condition, instead that egress port is simply excluded from the mask of
valid destination ports for the packet. If there are no destination
ports at all, the ingress counter that increments is the generic
"drop_tail" in ethtool -S.

The issue exists in various forms since the tc-taprio offload was introduced.

Fixes: edc15c3360cf ("net: dsa: felix: Configure Time-Aware Scheduler via taprio offload")
Reported-by: Richie Pearn <richard.pearn@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: felix: keep QSYS_TAG_CONFIG_INIT_GATE_STATE(0xFF) out of rmw

In vsc9959_tas_clock_adjust(), the INIT_GATE_STATE field is not changed,
only the ENABLE field. Similarly for the disabling of the time-aware
shaper in vsc9959_qos_port_tas_set().

To reflect this, keep the QSYS_TAG_CONFIG_INIT_GATE_STATE_M mask out of
the read-modify-write procedure to make it clearer what is the intention
of the code.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: felix: keep reference on entire tc-taprio config

In a future change we will need to remember the entire tc-taprio config
on all ports rather than just the base time, so use the
taprio_offload_get() helper function to replace ocelot_port->base_time
with ocelot_port->taprio.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: gianfar: add support for software TX timestamping

These are required by certain network profiling applications in order to
measure delays.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Link: https://lore.kernel.org/r/20220629181335.3800821-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: rzn1-a5psw: add missing of_node_put() in a5psw_pcs_get()

of_parse_phandle() will increase the refcount of 'pcs_node', so add
of_node_put() before return from a5psw_pcs_get().

Fixes: 82752fa04815 ("net: dsa: rzn1-a5psw: add Renesas RZ/N1 advanced 5 port switch driver")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220630014153.1888811-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

hisilicon/hns3/hns3vf:fix repeated words in comments

Delete the redundant word 'new'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Link: https://lore.kernel.org/r/20220629131330.16812-1-yuanjilin@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

google/gve:fix repeated words in comments

Delete the redundant word 'a'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Link: https://lore.kernel.org/r/20220629130013.3273-1-yuanjilin@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

freescale/fs_enet:fix repeated words in comments

Delete the redundant word 'a'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Link: https://lore.kernel.org/r/20220629125441.62420-1-yuanjilin@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ethernet/emulex:fix repeated words in comments

Delete the redundant word 'the'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Link: https://lore.kernel.org/r/20220629123756.48860-1-yuanjilin@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

atheros/atl1e:fix repeated words in comments

Delete the redundant word 'slot'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Link: https://lore.kernel.org/r/20220629092856.35678-1-yuanjilin@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue

Tony Nguyen says:

====================
1GbE Intel Wired LAN Driver Updates 2022-06-30

This series contains updates to misc Intel drivers.

Jesse removes unused macros from e100, e1000, e1000e, i40e, iavf, igc,
ixgb, ixgbe, and ixgbevf drivers.

Jiang Jian removes repeated words from ixgbe, fm10k, igb, and ixgbe
drivers.

Jilin Yuan removes repeated words from e1000, e1000e, fm10k, i40e, iavf,
igb, igbvf, igc, ixgbevf, and ice drivers.

* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  intel/ice:fix repeated words in comments
  intel/ixgbevf:fix repeated words in comments
  intel/igc:fix repeated words in comments
  intel/igbvf:fix repeated words in comments
  intel/igb:fix repeated words in comments
  intel/iavf:fix repeated words in comments
  intel/i40e:fix repeated words in comments
  intel/fm10k:fix repeated words in comments
  intel/e1000e:fix repeated words in comments
  intel/e1000:fix repeated words in comments
  ixgbe: drop unexpected word 'for' in comments
  igb: remove unexpected word "the"
  fm10k: remove unexpected word "the"
  ixgbe: remove unexpected word "the"
  intel: remove unused macros
====================

Link: https://lore.kernel.org/r/20220630213208.3034968-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
01f703f32b07 ("net: sparx5: mdb add/del handle non-sparx5 devices")
9597552d3cd9 ("net: sparx5: Allow mdb entries to both CPU and ports")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'net-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter.

  Current release - new code bugs:

   - clear msg_get_inq in __sys_recvfrom() and __copy_msghdr_from_user()

   - mptcp:
      - invoke MP_FAIL response only when needed
      - fix shutdown vs fallback race
      - consistent map handling on failure

   - octeon_ep: use bitwise AND

  Previous releases - regressions:

   - tipc: move bc link creation back to tipc_node_create, fix NPD

  Previous releases - always broken:

   - tcp: add a missing nf_reset_ct() in 3WHS handling to prevent socket
     buffered skbs from keeping refcount on the conntrack module

   - ipv6: take care of disable_policy when restoring routes

   - tun: make sure to always disable and unlink NAPI instances

   - phy: don't trigger state machine while in suspend

   - netfilter: nf_tables: avoid skb access on nf_stolen

   - asix: fix "can't send until first packet is send" issue

   - usb: asix: do not force pause frames support

   - nxp-nci: don't issue a zero length i2c_master_read()

  Misc:

   - ncsi: allow use of proper "mellanox" DT vendor prefix

   - act_api: add a message for user space if any actions were already
     flushed before the error was hit"

* tag 'net-5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (55 commits)
  net: dsa: felix: fix race between reading PSFP stats and port stats
  selftest: tun: add test for NAPI dismantle
  net: tun: avoid disabling NAPI twice
  net: sparx5: mdb add/del handle non-sparx5 devices
  net: sfp: fix memory leak in sfp_probe()
  mlxsw: spectrum_router: Fix rollback in tunnel next hop init
  net: rose: fix UAF bugs caused by timer handler
  net: usb: ax88179_178a: Fix packet receiving
  net: bonding: fix use-after-free after 802.3ad slave unbind
  ipv6: fix lockdep splat in in6_dump_addrs()
  net: phy: ax88772a: fix lost pause advertisement configuration
  net: phy: Don't trigger state machine while in suspend
  usbnet: fix memory allocation in helpers
  selftests net: fix kselftest net fatal error
  NFC: nxp-nci: don't print header length mismatch on i2c error
  NFC: nxp-nci: Don't issue a zero length i2c_master_read()
  net: tipc: fix possible refcount leak in tipc_sk_create()
  nfc: nfcmrvl: Fix irq_of_parse_and_map() return value
  net: ipv6: unexport __init-annotated seg6_hmac_net_init()
  ipv6/sit: fix ipip6_tunnel_get_prl return value
  ...

vfs: fix copy_file_range() regression in cross-fs copies

A regression has been reported by Nicolas Boichat, found while using the
copy_file_range syscall to copy a tracefs file.

Before commit 6c5c6aabe260 ("vfs: allow copy_file_range to copy across
devices") the kernel would return -EXDEV to userspace when trying to
copy a file across different filesystems.  After this commit, the
syscall doesn't fail anymore and instead returns zero (zero bytes
copied), as this file's content is generated on-the-fly and thus reports
a size of zero.

Another regression has been reported by He Zhe - the assertion of
WARN_ON_ONCE(ret == -EOPNOTSUPP) can be triggered from userspace when
copying from a sysfs file whose read operation may return -EOPNOTSUPP.

Since we do not have test coverage for copy_file_range() between any two
types of filesystems, the best way to avoid these sort of issues in the
future is for the kernel to be more picky about filesystems that are
allowed to do copy_file_range().

This patch restores some cross-filesystem copy restrictions that existed
prior to commit 6c5c6aabe260 ("vfs: allow copy_file_range to copy across
devices"), namely, cross-sb copy is not allowed for filesystems that do
not implement ->copy_file_range().

Filesystems that do implement ->copy_file_range() have full control of
the result - if this method returns an error, the error is returned to
the user.  Before this change this was only true for fs that did not
implement the ->remap_file_range() operation (i.e.  nfsv3).

Filesystems that do not implement ->copy_file_range() still fall-back to
the generic_copy_file_range() implementation when the copy is within the
same sb.  This helps the kernel can maintain a more consistent story
about which filesystems support copy_file_range().

nfsd and ksmbd servers are modified to fall-back to the
generic_copy_file_range() implementation in case vfs_copy_file_range()
fails with -EOPNOTSUPP or -EXDEV, which preserves behavior of
server-side-copy.

fall-back to generic_copy_file_range() is not implemented for the smb
operation FSCTL_DUPLICATE_EXTENTS_TO_FILE, which is arguably a correct
change of behavior.

Fixes: 6c5c6aabe260 ("vfs: allow copy_file_range to copy across devices")
Link: https://lore.kernel.org/linux-fsdevel/20210212044405.4120619-1-drinkcat@chromium.org/
Link: https://lore.kernel.org/linux-fsdevel/CANMq1KDZuxir2LM5jOTm0xx+BnvW=ZmpsG47CyHFJwnw7zSX6Q@mail.gmail.com/
Link: https://lore.kernel.org/linux-fsdevel/20210126135012.1.If45b7cdc3ff707bc1efa17f5366057d60603c45f@changeid/
Link: https://lore.kernel.org/linux-fsdevel/20210630161320.29006-1-lhenriques@suse.de/
Reported-by: Nicolas Boichat <drinkcat@chromium.org>
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Luis Henriques <lhenriques@suse.de>
Fixes: d7f7f30a97f7 ("vfs: no fallback for ->copy_file_range")
Link: https://lore.kernel.org/linux-fsdevel/20f17f64-88cb-4e80-07c1-85cb96c83619@windriver.com/
Reported-by: He Zhe <zhe.he@windriver.com>
Tested-by: Namjae Jeon <linkinjeon@kernel.org>
Tested-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

intel/ice:fix repeated words in comments

Delete the redundant word 'a'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ice: Remove unnecessary NULL check before dev_put

Since commit 8a32d00ea2a0 ("netdevice: add the case if dev is NULL"),
dev_put(NULL) is safe, check NULL before dev_put() is not needed.

Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ice: use eth_broadcast_addr() to set broadcast address

Use eth_broadcast_addr() to set broadcast address instead of memset().

Signed-off-by: Lu Wei <luwei32@huawei.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ice: switch: dynamically add VLAN headers to dummy packets

Enable the support of creating all kinds of declared dummy packets
with the VLAN tags by inserting VLAN headers (single VLAN and QinQ
cases) if needed.
Decrease the number of declared dummy packets and increase in the
possible packet's combinations for adding switch rules.

This change enables support of creating filters that match both on
VLAN + tunnels properties in switchdev.

Signed-off-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@intel.com>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ice: Add support for VLAN TPID filters in switchdev

Enable support for adding TC rules that filter on the VLAN tag type
in switchdev mode.

Signed-off-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ice: Add support for double VLAN in switchdev

Enable support for adding TC rules with both C-tag and S-tag that can
filter on the inner and outer VLAN in QinQ for basic packets (not
tunneled cases).

Signed-off-by: Wiktor Pilarczyk <wiktor.pilarczyk@intel.com>
Signed-off-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@intel.com>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

net: phylink: fix NULL pl->pcs dereference during phylink_pcs_poll_start

The current link mode of the phylink instance may not require an
attached PCS. However, phylink_major_config() unconditionally
dereferences this potentially NULL pointer when restarting the link poll
timer, which will panic the kernel.

Fix the problem by checking whether a PCS exists in phylink_pcs_poll_start(),
otherwise do nothing. The code prior to the blamed patch also only
looked at pcs->poll within an "if (pcs)" block.

Fixes: d4fe1494756c ("net: phylink: disable PCS polling over major configuration")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Tested-by: Michael Walle <michael@walle.cc> # on kontron-kbox-a-230-ls
Tested-by: Nicolas Ferre <nicolas.ferre@microchip.com> # on sam9x60ek
Link: https://lore.kernel.org/r/20220629193358.4007923-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: felix: fix race between reading PSFP stats and port stats

Both PSFP stats and the port stats read by ocelot_check_stats_work() are
indirectly read through the same mechanism - write to STAT_CFG:STAT_VIEW,
read from SYS:STAT:CNT[n].

It's just that for port stats, we write STAT_VIEW with the index of the
port, and for PSFP stats, we write STAT_VIEW with the filter index.

So if we allow them to run concurrently, ocelot_check_stats_work() may
change the view from vsc9959_psfp_counters_get(), and vice versa.

Fixes: fa5916fb5652 ("net: dsa: felix: support psfp filter on vsc9959")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220629183007.3808130-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftest: tun: add test for NAPI dismantle

Being lazy does not pay, add the test for various
ordering of tun queue close / detach / destroy.

Link: https://lore.kernel.org/r/20220629181911.372047-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: tun: avoid disabling NAPI twice

Eric reports that syzbot made short work out of my speculative
fix. Indeed when queue gets detached its tfile->tun remains,
so we would try to stop NAPI twice with a detach(), close()
sequence.

Alternative fix would be to move tun_napi_disable() to
tun_detach_all() and let the NAPI run after the queue
has been detached.

Fixes: 6760cd5a7546 ("net: tun: stop NAPI when detaching queues")
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220629181911.372047-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: sparx5: mdb add/del handle non-sparx5 devices

When adding/deleting mdb entries on other net_devices, eg., tap
interfaces, it should not crash.

Fixes: b1e7d545051f ("net: sparx5: Add mdb handlers")
Signed-off-by: Casper Andersson <casper.casan@gmail.com>
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Link: https://lore.kernel.org/r/20220630122226.316812-1-casper.casan@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

intel/ixgbevf:fix repeated words in comments

Delete the redundant word 'slot'.
Delete the redundant word 'we'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

intel/igc:fix repeated words in comments

Delete the redundant word 'frames'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

intel/igbvf:fix repeated words in comments

Delete the redundant word 'on'.
Delete the redundant word 'slot'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

intel/igb:fix repeated words in comments

Delete the redundant word 'frames'.
Delete the redundant word 'set'.
Delete the redundant word 'slot'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

intel/iavf:fix repeated words in comments

Delete the redundant word 'a'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

intel/i40e:fix repeated words in comments

Delete the redundant word 'a'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

intel/fm10k:fix repeated words in comments

Delete the redundant word 'by'.
Delete the redundant word 'a'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

intel/e1000e:fix repeated words in comments

Delete the redundant word 'frames'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

intel/e1000:fix repeated words in comments

Delete the redundant word 'frames'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ixgbe: drop unexpected word 'for' in comments

there is an unexpected word 'for' in the comments that need to be dropped

file - drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
line - 5164

* ixgbe_lpbthresh - calculate low water mark for for flow control

changed to:

* ixgbe_lpbthresh - calculate low water mark for flow control

Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

igb: remove unexpected word "the"

there is an unexpected word "the" in the comments that need to be removed

Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

fm10k: remove unexpected word "the"

there is an unexpected word "the" in the comments that need to be removed

Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

ixgbe: remove unexpected word "the"

there is an unexpected word "the" in the comments that need to be removed

Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
"Three minor bug fixes:

   - qedr not setting the QP timeout properly toward userspace

   - Memory leak on error path in ib_cm

   - Divide by 0 in RDMA interrupt moderation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  linux/dim: Fix divide by 0 in RDMA DIM
  RDMA/cm: Fix memory leak in ib_cm_insert_listen
  RDMA/qedr: Fix reporting QP timeout attribute

Merge tag 'fsnotify_for_v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fanotify fix from Jan Kara:
"A fix for recently added fanotify API to have stricter checks and
  refuse some invalid flag combinations to make our life easier in the
  future"

* tag 'fsnotify_for_v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fanotify: refine the validation checks on non-dir inode mask

Merge tag 'v5.19-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:
"Fix a regression that breaks the ccp driver"

* tag 'v5.19-p3' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: ccp - Fix device IRQ counting by using platform_irq_count()

intel: remove unused macros

As found by the compile option -Wunused-macros, remove these macros
that are never used by the code.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>

Merge branch 'net-neigh-introduce-interval_probe_time-for-periodic-probe'

Yuwei Wang says:

====================
net, neigh: introduce interval_probe_time for periodic probe

This series adds a new option `interval_probe_time_ms` in net, neigh
for periodic probe, and add a limitation to prevent it set to 0
====================

Link: https://lore.kernel.org/r/20220629084832.2842973-1-wangyuweihx@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net, neigh: introduce interval_probe_time_ms for periodic probe

commit 9c444ee74677 ("net, neigh: Set lower cap for neigh_managed_work rearming")
fixed a case when DELAY_PROBE_TIME is configured to 0, the processing of the
system work queue hog CPU to 100%, and further more we should introduce
a new option used by periodic probe

Signed-off-by: Yuwei Wang <wangyuweihx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

sysctl: add proc_dointvec_ms_jiffies_minmax

add proc_dointvec_ms_jiffies_minmax to fit read msecs value to jiffies
with a limited range of values

Signed-off-by: Yuwei Wang <wangyuweihx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

atheros/atl1c:fix repeated words in comments

Delete the redundant word 'slot'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Link: https://lore.kernel.org/r/20220629081632.54445-1-yuanjilin@cdjrlc.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: sfp: fix memory leak in sfp_probe()

sfp_probe() allocates a memory chunk from sfp with sfp_alloc(). When
devm_add_action() fails, sfp is not freed, which leads to a memory leak.

We should use devm_add_action_or_reset() instead of devm_add_action().

Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20220629075550.2152003-1-niejianglei2021@163.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

mlxsw: spectrum_router: Fix rollback in tunnel next hop init

In mlxsw_sp_nexthop6_init(), a next hop is always added to the router
linked list, and mlxsw_sp_nexthop_type_init() is invoked afterwards. When
that function results in an error, the next hop will not have been removed
from the linked list. As the error is propagated upwards and the caller
frees the next hop object, the linked list ends up holding an invalid
object.

A similar issue comes up with mlxsw_sp_nexthop4_init(), where rollback
block does exist, however does not include the linked list removal.

Both IPv6 and IPv4 next hops have a similar issue with next-hop counter
rollbacks. As these were introduced in the same patchset as the next hop
linked list, include the cleanup in this patch.

Fixes: 4e4a8cbe8a97 ("mlxsw: spectrum_router: Keep nexthops in a linked list")
Fixes: 57b8fd70b9c8 ("mlxsw: spectrum: Add support for setting counters on nexthops")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20220629070205.803952-1-idosch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: rose: fix UAF bugs caused by timer handler

There are UAF bugs in rose_heartbeat_expiry(), rose_timer_expiry()
and rose_idletimer_expiry(). The root cause is that del_timer()
could not stop the timer handler that is running and the refcount
of sock is not managed properly.

One of the UAF bugs is shown below:

    (thread 1)          |        (thread 2)
                        |  rose_bind
                        |  rose_connect
                        |    rose_start_heartbeat
rose_release            |    (wait a time)
  case ROSE_STATE_0     |
  rose_destroy_socket   |  rose_heartbeat_expiry
    rose_stop_heartbeat |
    sock_put(sk)        |    ...
  sock_put(sk) // FREE  |
                        |    bh_lock_sock(sk) // USE

The sock is deallocated by sock_put() in rose_release() and
then used by bh_lock_sock() in rose_heartbeat_expiry().

Although rose_destroy_socket() calls rose_stop_heartbeat(),
it could not stop the timer that is running.

The KASAN report triggered by POC is shown below:

BUG: KASAN: use-after-free in _raw_spin_lock+0x5a/0x110
Write of size 4 at addr ffff88800ae59098 by task swapper/3/0
...
Call Trace:
<IRQ>
dump_stack_lvl+0xbf/0xee
print_address_description+0x7b/0x440
print_report+0x101/0x230
? irq_work_single+0xbb/0x140
? _raw_spin_lock+0x5a/0x110
kasan_report+0xed/0x120
? _raw_spin_lock+0x5a/0x110
kasan_check_range+0x2bd/0x2e0
_raw_spin_lock+0x5a/0x110
rose_heartbeat_expiry+0x39/0x370
? rose_start_heartbeat+0xb0/0xb0
call_timer_fn+0x2d/0x1c0
? rose_start_heartbeat+0xb0/0xb0
expire_timers+0x1f3/0x320
__run_timers+0x3ff/0x4d0
run_timer_softirq+0x41/0x80
__do_softirq+0x233/0x544
irq_exit_rcu+0x41/0xa0
sysvec_apic_timer_interrupt+0x8c/0xb0
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1b/0x20
RIP: 0010:default_idle+0xb/0x10
RSP: 0018:ffffc9000012fea0 EFLAGS: 00000202
RAX: 000000000000bcae RBX: ffff888006660f00 RCX: 000000000000bcae
RDX: 0000000000000001 RSI: ffffffff843a11c0 RDI: ffffffff843a1180
RBP: dffffc0000000000 R08: dffffc0000000000 R09: ffffed100da36d46
R10: dfffe9100da36d47 R11: ffffffff83cf0950 R12: 0000000000000000
R13: 1ffff11000ccc1e0 R14: ffffffff8542af28 R15: dffffc0000000000
...
Allocated by task 146:
__kasan_kmalloc+0xc4/0xf0
sk_prot_alloc+0xdd/0x1a0
sk_alloc+0x2d/0x4e0
rose_create+0x7b/0x330
__sock_create+0x2dd/0x640
__sys_socket+0xc7/0x270
__x64_sys_socket+0x71/0x80
do_syscall_64+0x43/0x90
entry_SYSCALL_64_after_hwframe+0x46/0xb0

Freed by task 152:
kasan_set_track+0x4c/0x70
kasan_set_free_info+0x1f/0x40
____kasan_slab_free+0x124/0x190
kfree+0xd3/0x270
__sk_destruct+0x314/0x460
rose_release+0x2fa/0x3b0
sock_close+0xcb/0x230
__fput+0x2d9/0x650
task_work_run+0xd6/0x160
exit_to_user_mode_loop+0xc7/0xd0
exit_to_user_mode_prepare+0x4e/0x80
syscall_exit_to_user_mode+0x20/0x40
do_syscall_64+0x4f/0x90
entry_SYSCALL_64_after_hwframe+0x46/0xb0

This patch adds refcount of sock when we use functions
such as rose_start_heartbeat() and so on to start timer,
and decreases the refcount of sock when timer is finished
or deleted by functions such as rose_stop_heartbeat()
and so on. As a result, the UAF bugs could be mitigated.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Tested-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://lore.kernel.org/r/20220629002640.5693-1-duoming@zju.edu.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: usb: ax88179_178a: Fix packet receiving

This patch corrects packet receiving in ax88179_rx_fixup.

- problem observed:
  ifconfig shows allways a lot of 'RX Errors' while packets
  are received normally.

  This occurs because ax88179_rx_fixup does not recognise properly
  the usb urb received.
  The packets are normally processed and at the end, the code exits
  with 'return 0', generating RX Errors.
  (pkt_cnt==-2 and ptk_hdr over field rx_hdr trying to identify
   another packet there)

  This is a usb urb received by "tcpdump -i usbmon2 -X" on a
  little-endian CPU:
  0x0000:  eeee f8e3 3b19 87a0 94de 80e3 daac 0800
           ^         packet 1 start (pkt_len = 0x05ec)
           ^^^^      IP alignment pseudo header
                ^    ethernet packet start
           last byte ethernet packet   v
           padding (8-bytes aligned)     vvvv vvvv
  0x05e0:  c92d d444 1420 8a69 83dd 272f e82b 9811
  0x05f0:  eeee f8e3 3b19 87a0 94de 80e3 daac 0800
  ...      ^ packet 2
  0x0be0:  eeee f8e3 3b19 87a0 94de 80e3 daac 0800
  ...
  0x1130:  9d41 9171 8a38 0ec5 eeee f8e3 3b19 87a0
  ...
  0x1720:  8cfc 15ff 5e4c e85c eeee f8e3 3b19 87a0
  ...
  0x1d10:  ecfa 2a3a 19ab c78c eeee f8e3 3b19 87a0
  ...
  0x2070:  eeee f8e3 3b19 87a0 94de 80e3 daac 0800
  ...      ^ packet 7
  0x2120:  7c88 4ca5 5c57 7dcc 0d34 7577 f778 7e0a
  0x2130:  f032 e093 7489 0740 3008 ec05 0000 0080
                               ====1==== ====2====
           hdr_off             ^
           pkt_len = 0x05ec         ^^^^
           AX_RXHDR_*=0x00830  ^^^^   ^
           pkt_len = 0                        ^^^^
           AX_RXHDR_DROP_ERR=0x80000000  ^^^^   ^
  0x2140:  3008 ec05 0000 0080 3008 5805 0000 0080
  0x2150:  3008 ec05 0000 0080 3008 ec05 0000 0080
  0x2160:  3008 5803 0000 0080 3008 c800 0000 0080
           ===11==== ===12==== ===13==== ===14====
  0x2170:  0000 0000 0e00 3821
                     ^^^^ ^^^^ rx_hdr
                     ^^^^      pkt_cnt=14
                          ^^^^ hdr_off=0x2138
           ^^^^ ^^^^           padding

  The dump shows that pkt_cnt is the number of entrys in the
  per-packet metadata. It is "2 * packet count".
  Each packet have two entrys. The first have a valid
  value (pkt_len and AX_RXHDR_*) and the second have a
  dummy-header 0x80000000 (pkt_len=0 with AX_RXHDR_DROP_ERR).
  Why exists dummy-header for each packet?!?
  My guess is that this was done probably to align the
  entry for each packet to 64-bits and maintain compatibility
  with old firmware.
  There is also a padding (0x00000000) before the rx_hdr to
  align the end of rx_hdr to 64-bit.
  Note that packets have a alignment of 64-bits (8-bytes).

  This patch assumes that the dummy-header and the last
  padding are optional. So it preserves semantics and
  recognises the same valid packets as the current code.

  This patch was made using only the dumpfile information and
  tested with only one device:
  0b95:1790 ASIX Electronics Corp. AX88179 Gigabit Ethernet

Fixes: 49d60052a7fe ("net: usb: ax88179_178a: Fix out-of-bounds accesses in RX fixup")
Fixes: ce13f848757a ("ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver")
Signed-off-by: Jose Alonso <joalonsof@gmail.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/d6970bb04bf67598af4d316eaeb1792040b18cfd.camel@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

net: pcs-rzn1-miic: fix return value check in miic_probe()

On failure, devm_platform_ioremap_resource() returns a ERR_PTR() value
and not NULL. Fix return value checking by using IS_ERR() and return
PTR_ERR() as error value.

Fixes: d2909d765ef7 ("net: pcs: add Renesas MII converter driver")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Clément Léger <clement.leger@bootlin.com>
Link: https://lore.kernel.org/r/20220628131259.3109124-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: rzn1-a5psw: fix a NULL vs IS_ERR() check in a5psw_probe()

The devm_platform_ioremap_resource() function never returns NULL.
It returns error pointers.

Signed-off-by: Peng Wu <wupeng58@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Reviewed-by: Clément Léger <clement.leger@bootlin.com>
Link: https://lore.kernel.org/r/20220628130920.49493-1-wupeng58@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: txgbe: Add build support for txgbe

Add doc build infrastructure for txgbe driver.
Initialize PCI memory space for WangXun 10 Gigabit Ethernet devices.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Link: https://lore.kernel.org/r/20220628095530.889344-1-jiawenwu@trustnetic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: bonding: fix use-after-free after 802.3ad slave unbind

commit badbe055b92f ("bonding: fix 802.3ad aggregator reselection"),
resolve case, when there is several aggregation groups in the same bond.
bond_3ad_unbind_slave will invalidate (clear) aggregator when
__agg_active_ports return zero. So, ad_clear_agg can be executed even, when
num_of_ports!=0. Than bond_3ad_unbind_slave can be executed again for,
previously cleared aggregator. NOTE: at this time bond_3ad_unbind_slave
will not update slave ports list, because lag_ports==NULL. So, here we
got slave ports, pointing to freed aggregator memory.

Fix with checking actual number of ports in group (as was before
commit badbe055b92f ("bonding: fix 802.3ad aggregator reselection") ),
before ad_clear_agg().

The KASAN logs are as follows:

[  767.617392] ==================================================================
[  767.630776] BUG: KASAN: use-after-free in bond_3ad_state_machine_handler+0x13dc/0x1470
[  767.638764] Read of size 2 at addr ffff00011ba9d430 by task kworker/u8:7/767
[  767.647361] CPU: 3 PID: 767 Comm: kworker/u8:7 Tainted: G           O 5.15.11 #15
[  767.655329] Hardware name: DNI AmazonGo1 A7040 board (DT)
[  767.660760] Workqueue: lacp_1 bond_3ad_state_machine_handler
[  767.666468] Call trace:
[  767.668930]  dump_backtrace+0x0/0x2d0
[  767.672625]  show_stack+0x24/0x30
[  767.675965]  dump_stack_lvl+0x68/0x84
[  767.679659]  print_address_description.constprop.0+0x74/0x2b8
[  767.685451]  kasan_report+0x1f0/0x260
[  767.689148]  __asan_load2+0x94/0xd0
[  767.692667]  bond_3ad_state_machine_handler+0x13dc/0x1470

Fixes: badbe055b92f ("bonding: fix 802.3ad aggregator reselection")
Co-developed-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Signed-off-by: Yevhen Orlov <yevhen.orlov@plvision.eu>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Link: https://lore.kernel.org/r/20220629012914.361-1-yevhen.orlov@plvision.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv6: remove redundant store to value after addition

There is no need to store the result of the addition back to variable count
after the addition. The store is redundant, replace += with just +

Cleans up clang scan build warning:
warning: Although the value stored to 'count' is used in the enclosing
expression, the value is never actually read from 'count'

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220628145406.183527-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

ipv6: fix lockdep splat in in6_dump_addrs()

As reported by syzbot, we should not use rcu_dereference()
when rcu_read_lock() is not held.

WARNING: suspicious RCU usage
5.19.0-rc2-syzkaller #0 Not tainted

net/ipv6/addrconf.c:5175 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by syz-executor326/3617:
#0: ffffffff8d5848e8 (rtnl_mutex){+.+.}-{3:3}, at: netlink_dump+0xae/0xc20 net/netlink/af_netlink.c:2223

stack backtrace:
CPU: 0 PID: 3617 Comm: syz-executor326 Not tainted 5.19.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
in6_dump_addrs+0x12d1/0x1790 net/ipv6/addrconf.c:5175
inet6_dump_addr+0x9c1/0xb50 net/ipv6/addrconf.c:5300
netlink_dump+0x541/0xc20 net/netlink/af_netlink.c:2275
__netlink_dump_start+0x647/0x900 net/netlink/af_netlink.c:2380
netlink_dump_start include/linux/netlink.h:245 [inline]
rtnetlink_rcv_msg+0x73e/0xc90 net/core/rtnetlink.c:6046
netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2501
netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345
netlink_sendmsg+0x917/0xe10 net/netlink/af_netlink.c:1921
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:734
____sys_sendmsg+0x6eb/0x810 net/socket.c:2492
___sys_sendmsg+0xf3/0x170 net/socket.c:2546
__sys_sendmsg net/socket.c:2575 [inline]
__do_sys_sendmsg net/socket.c:2584 [inline]
__se_sys_sendmsg net/socket.c:2582 [inline]
__x64_sys_sendmsg+0x132/0x220 net/socket.c:2582
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x46/0xb0

Fixes: 1913a80165c5 ("mld: convert ifmcaddr6 to RCU")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20220628121248.858695-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: ax88772a: fix lost pause advertisement configuration

In case of asix_ax88772a_link_change_notify() workaround, we run soft
reset which will automatically clear MII_ADVERTISE configuration. The
PHYlib framework do not know about changed configuration state of the
PHY, so we need use phy_init_hw() to reinit PHY configuration.

Fixes: 877cfd219ba7 ("net: usb/phy: asix: add support for ax88772A/C PHYs")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220628114349.3929928-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: phy: Don't trigger state machine while in suspend

Upon system sleep, mdio_bus_phy_suspend() stops the phy_state_machine(),
but subsequent interrupts may retrigger it:

They may have been left enabled to facilitate wakeup and are not
quiesced until the ->suspend_noirq() phase.  Unwanted interrupts may
hence occur between mdio_bus_phy_suspend() and dpm_suspend_noirq(),
as well as between dpm_resume_noirq() and mdio_bus_phy_resume().

Retriggering the phy_state_machine() through an interrupt is not only
undesirable for the reason given in mdio_bus_phy_suspend() (freezing it
midway with phydev->lock held), but also because the PHY may be
inaccessible after it's suspended:  Accesses to USB-attached PHYs are
blocked once usb_suspend_both() clears the can_submit flag and PHYs on
PCI network cards may become inaccessible upon suspend as well.

Amend phy_interrupt() to avoid triggering the state machine if the PHY
is suspended.  Signal wakeup instead if the attached net_device or its
parent has been configured as a wakeup source.  (Those conditions are
identical to mdio_bus_phy_may_suspend().)  Postpone handling of the
interrupt until the PHY has resumed.

Before stopping the phy_state_machine() in mdio_bus_phy_suspend(),
wait for a concurrent phy_interrupt() to run to completion.  That is
necessary because phy_interrupt() may have checked the PHY's suspend
status before the system sleep transition commenced and it may thus
retrigger the state machine after it was stopped.

Likewise, after re-enabling interrupt handling in mdio_bus_phy_resume(),
wait for a concurrent phy_interrupt() to complete to ensure that
interrupts which it postponed are properly rerun.

The issue was exposed by commit 3e5d3f0549b3 ("usbnet: smsc95xx: Forward
PHY interrupts to PHY driver to avoid polling"), but has existed since
forever.

Fixes: 1ce274e52119 ("phylib: Fix deadlock on resume")
Link: https://lore.kernel.org/netdev/a5315a8a-32c2-962f-f696-de9a26d30091@samsung.com/
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@vger.kernel.org # v2.6.33+
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/b7f386d04e9b5b0e2738f0125743e30676f309ef.1656410895.git.lukas@wunner.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: switchdev: add reminder near struct switchdev_notifier_fdb_info

br_switchdev_fdb_notify() creates an on-stack FDB info variable, and
initializes it member by member. As such, newly added fields which are
not initialized by br_switchdev_fdb_notify() will contain junk bytes
from the stack.

Other uses of struct switchdev_notifier_fdb_info have a struct
initializer which should put zeroes in the uninitialized fields.

Add a reminder above the structure for future developers. Recently
discussed during review.

Link: https://patchwork.kernel.org/project/netdevbpf/patch/20220524152144.40527-2-schultz.hans+netdev@gmail.com/#24877698
Link: https://patchwork.kernel.org/project/netdevbpf/patch/20220524152144.40527-3-schultz.hans+netdev@gmail.com/#24912269
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20220628100831.2899434-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

usbnet: fix memory allocation in helpers

usbnet provides some helper functions that are also used in
the context of reset() operations. During a reset the other
drivers on a device are unable to operate. As that can be block
drivers, a driver for another interface cannot use paging
in its memory allocations without risking a deadlock.
Use GFP_NOIO in the helpers.

Fixes: f950829cdcc42 ("usbnet: introduce usbnet 3 command helpers")
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Link: https://lore.kernel.org/r/20220628093517.7469-1-oneukum@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge branch 'net-dsa-add-pause-stats-support'

Oleksij Rempel says:

====================
net: dsa: add pause stats support
====================

Link: https://lore.kernel.org/r/20220628085155.2591201-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: microchip: count pause packets together will all other packets

This switch is calculating tx/rx_bytes for all packets including pause.
So, include rx/tx_pause counter to rx/tx_packets to make tx/rx_bytes fit
to rx/tx_packets.

Link: https://lore.kernel.org/all/20220624220317.ckhx6z7cmzegvoqi@skbuf/
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: microchip: add pause stats support

Add support for pause specific stats.

Tested on ksz9477.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: ar9331: add support for pause stats

Add support for pause stats.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

net: dsa: add get_pause_stats support

Add support for pause stats

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

selftests net: fix kselftest net fatal error

The incorrect path is causing the following error when trying to run net
kselftests:

In file included from bpf/nat6to4.c:43:
../../../lib/bpf/bpf_helpers.h:11:10: fatal error: 'bpf_helper_defs.h' file not found
^~~~~~~~~~~~~~~~~~~
1 error generated.

Fixes: ac3475686b0f ("selftests net: fix bpf build error")
Signed-off-by: Coleman Dietsch <dietschc@csp.edu>
Link: https://lore.kernel.org/r/20220628174744.7908-1-dietschc@csp.edu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

1) Restore set counter when one of the CPU loses race to add elements
   to sets.

2) After NF_STOLEN, skb might be there no more, update nftables trace
   infra to avoid access to skb in this case. From Florian Westphal.

3) nftables bridge might register a prerouting hook with zero priority,
   br_netfilter incorrectly skips it. Also from Florian.

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: br_netfilter: do not skip all hooks with 0 priority
  netfilter: nf_tables: avoid skb access on nf_stolen
  netfilter: nft_dynset: restore set element counter when failing to update
====================

Link: https://lore.kernel.org/r/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'platform-drivers-x86-v5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Hans de Goede:

- thinkpad_acpi/ideapad-laptop: mem-leak and platform-profile fixes

- panasonic-laptop: missing hotkey presses regression fix

- some hardware-id additions

- some other small fixes

* tag 'platform-drivers-x86-v5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: hp-wmi: Ignore Sanitization Mode event
  platform/x86: thinkpad_acpi: do not use PSC mode on Intel platforms
  platform/x86: thinkpad-acpi: profile capabilities as integer
  platform/x86: panasonic-laptop: filter out duplicate volume up/down/mute keypresses
  platform/x86: panasonic-laptop: don't report duplicate brightness key-presses
  platform/x86: panasonic-laptop: revert "Resolve hotkey double trigger bug"
  platform/x86: panasonic-laptop: sort includes alphabetically
  platform/x86: panasonic-laptop: de-obfuscate button codes
  ACPI: video: Change how we determine if brightness key-presses are handled
  platform/x86: ideapad-laptop: Add Ideapad 5 15ITL05 to ideapad_dytc_v4_allow_table[]
  platform/x86: ideapad-laptop: Add allow_v4_dytc module parameter
  platform/x86: thinkpad_acpi: Fix a memory leak of EFCH MMIO resource
  platform/mellanox: nvsw-sn2201: fix error code in nvsw_sn2201_create_static_devices()
  platform/x86: intel/pmc: Add Alder Lake N support to PMC core driver

Merge tag '5.19-rc4-ksmbd-server-fixes' of git://git.samba.org/ksmbd

Pull ksmbd server fixes from Steve French:

- seek null check (don't use f_seek op directly and blindly)

- offset validation in FSCTL_SET_ZERO_DATA

- fallocate fix (relates e.g. to xfstests generic/091 and 263)

- two cleanup fixes

- fix socket settings on some arch

* tag '5.19-rc4-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: use vfs_llseek instead of dereferencing NULL
  ksmbd: check invalid FileOffset and BeyondFinalZero in FSCTL_ZERO_DATA
  ksmbd: set the range of bytes to zero without extending file size in FSCTL_ZERO_DATA
  ksmbd: remove duplicate flag set in smb2_write
  ksmbd: smbd: Remove useless license text when SPDX-License-Identifier is already used
  ksmbd: use SOCK_NONBLOCK type for kernel_accept()

NFC: nxp-nci: don't print header length mismatch on i2c error

Don't print a misleading header length mismatch error if the i2c call
returns an error. Instead just return the error code without any error
message.

Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

NFC: nxp-nci: Don't issue a zero length i2c_master_read()

There are packets which doesn't have a payload. In that case, the second
i2c_master_read() will have a zero length. But because the NFC
controller doesn't have any data left, it will NACK the I2C read and
-ENXIO will be returned. In case there is no payload, just skip the
second i2c master read.

Fixes: 276f3e999832 ("NFC: nxp-nci_i2c: Add I2C support to NXP NCI driver")
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: prestera: acl: add support for 'egress' rules

The following is now supported:

$ tc qdisc add PORT clsact
$ tc filter add dev PORT egress ...

Signed-off-by: Maksym Glubokiy <maksym.glubokiy@plvision.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: forwarding: ethtool_extended_state: Convert to busywait

Currently, this script sets up the test scenario, which is supposed to end
in an inability of the system to negotiate a link. It then waits for a bit,
and verifies that the system can diagnose why the link was not established.

The wait time for the scenario where different link speeds are forced on
the two ends of a loopback cable, was set to 4 seconds, which exactly
covered it. As of a recent mlxsw firmware update, this time gets longer,
and this test starts failing.

The time that selftests currently wait for links to be established is
currently $WAIT_TIMEOUT, or 20 seconds. It seems reasonable that if this is
the time necessary to establish and bring up a link, it should also be
enough to determine that a link cannot be established and why.

Therefore in this patch, convert the sleeps to busywaits, so that if a
failure is established sooner (as is expected), the test runs quicker. And
use $WAIT_TIMEOUT as the time to wait.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>