The wrappers in include/linux/pci-dma-compat.h should go away.
The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.
When memory is allocated in 'smsc9420_probe()', GFP_KERNEL can be used
because it is a probe function and no lock is acquired.
While at it, rewrite the size passed to 'dma_alloc_coherent()' the same way
as the one passed to 'dma_free_coherent()'. This form is less verbose:
sizeof(struct smsc9420_dma_desc) * RX_RING_SIZE +
sizeof(struct smsc9420_dma_desc) * TX_RING_SIZE,
vs
sizeof(struct smsc9420_dma_desc) * (RX_RING_SIZE + TX_RING_SIZE)
The wrappers in include/linux/pci-dma-compat.h should go away.
The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.
When memory is allocated in 'epic_init_one()', GFP_KERNEL can be used
because it is a probe function and no lock is acquired.
Wei Wang [Tue, 1 Sep 2020 22:10:08 +0000 (15:10 -0700)]
ip: expose inet sockopts through inet_diag
Expose all exisiting inet sockopt bits through inet_diag for debug purpose.
Corresponding changes in iproute2 ss will be submitted to output all
these values.
Signed-off-by: Wei Wang <weiwan@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: bcm_sf2: recalculate switch clock rate based on ports
Whenever a port gets enabled/disabled, recalcultate the required switch
clock rate to make sure it always gets set to the expected rate
targeting our switch use case. This is only done for the BCM7445 switch
as there is no clocking profile available for BCM7278.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Describe the two possible clocks feeding into the Broadcom SF2
integrated Ethernet switch. BCM7445 systems have two clocks, one for the
main switch core clock, and another for controlling the switch clock
divider whereas BCM7278 systems only have the first kind.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
It is necessary to manage the Wake-on-LAN clock to turn on the
appropriate blocks for MPD or CFP-based packet matching to work
otherwise we will not be able to reliably match packets during suspend.
Reported-by: Blair Prescott <blair.prescott@broadcom.com> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
We disable clocks shortly after probing the device to save as much power as
possible in case the interface is never used. When bcm_sysport_open() is
invoked, clocks are enabled, and disabled in bcm_sysport_stop().
A similar scheme is applied to the suspend/resume functions.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Tue, 1 Sep 2020 17:52:32 +0000 (18:52 +0100)]
ethtool: fix error handling in ethtool_phys_id
If ops->set_phys_id() returned an error, previously we would only break
out of the inner loop, which neither stopped the outer loop nor returned
the error to the user (since 'rc' would be overwritten on the next pass
through the loop).
Thus, rewrite it to use a single loop, so that the break does the right
thing. Use u64 for 'count' and 'i' to prevent overflow in case of
(unreasonably) large values of id.data and n.
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
chelsio/chtls: CHELSIO_INLINE_CRYPTO should depend on CHELSIO_T4
While CHELSIO_INLINE_CRYPTO is a guard symbol, and just enabling it does
not cause any additional code to be compiled in, all configuration
options protected by it depend on CONFIG_CHELSIO_T4. Hence it doesn't
make much sense to bother the user with the guard symbol question when
CONFIG_CHELSIO_T4 is disabled.
Fix this by moving the dependency from the individual config options to
the guard symbol.
Fixes: 84e3325023fe9100 ("chelsio/chtls: separate chelsio tls driver from crypto driver") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 3 Sep 2020 21:52:33 +0000 (14:52 -0700)]
Merge branch 'Convert-mvpp2-to-split-PCS-support'
Russell King says:
====================
Convert mvpp2 to split PCS support
This series converts the mvpp2 driver to use the split PCS support
that has been merged into phylink last time around. I've been running
this for some time here and, apart from the recent bug fix sent to
net-next, have not seen any issues on DT based systems. I have not
tested ACPI setups, although I've tried to preserve the workaround.
Patch 1 formalises the ACPI workaround.
Patch 2 moves some of mac_config() to the mac_prepare() and
mac_finish() callbacks so we can keep the ordering when we split
the PCS bits out.
Patch 3 ensures that the port is forced down while changing the
interface mode - when in in-band mode, doing this in mac_prepare()
and mac_finish().
Patch 4 moves the reset handling to mac_prepare() and mac_finish()
Patch 5 does a straight conversion to use PCS operations.
Patch 6 splits the PCS operations into a GMAC PCS operations and
XLG PCS operations, selecting the appropriate set during
mac_prepare(). This eliminates a bunch of conditionals from the
code.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 1 Sep 2020 13:48:22 +0000 (14:48 +0100)]
net: mvpp2: ensure the port is forced down while changing modes
Ensure that the port is forced down while reconfiguring, controlling
this via mac_prepare() and mac_finish() so that it is down while we
are configuring our (future) PCS.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 3 Sep 2020 19:19:04 +0000 (12:19 -0700)]
Merge branch 'l2tp-miscellaneous-cleanups'
Tom Parkin says:
====================
l2tp: miscellaneous cleanups
This series of patches makes the following cleanups and improvements to
the l2tp code:
* various API tweaks to remove unused parameters from function calls
* lightly refactor the l2tp transmission path to capture more error
conditions in the data plane statistics
* repurpose the "magic feather" validation in l2tp to check for
sk_user_data (ab)use as opposed to refcount debugging
* remove some duplicated code
====================
Reviewed-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Thu, 3 Sep 2020 08:54:52 +0000 (09:54 +0100)]
l2tp: avoid duplicated code in l2tp_tunnel_closeall
l2tp_tunnel_closeall is called as a part of tunnel shutdown in order to
close all the sessions held by the tunnel. The code it uses to close a
session duplicates what l2tp_session_delete does.
Rather than duplicating the code, have l2tp_tunnel_closeall call
l2tp_session_delete instead.
This involves a very minor change to locking in l2tp_tunnel_closeall.
Previously, l2tp_tunnel_closeall checked the session "dead" flag while
holding tunnel->hlist_lock. This allowed for the code to step to the
next session in the list without releasing the lock if the current
session happened to be in the process of closing already.
By calling l2tp_session_delete instead, l2tp_tunnel_closeall must now
drop and regain the hlist lock for each session in the tunnel list.
Given that the likelihood of a session being in the process of closing
when the tunnel is closed, it seems worth this very minor potential
loss of efficiency to avoid duplication of the session delete code.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Thu, 3 Sep 2020 08:54:51 +0000 (09:54 +0100)]
l2tp: make magic feather checks more useful
The l2tp tunnel and session structures contain a "magic feather" field
which was originally intended to help trace lifetime bugs in the code.
Since the introduction of the shared kernel refcount code in refcount.h,
and l2tp's porting to those APIs, we are covered by the refcount code's
checks and warnings. Duplicating those checks in the l2tp code isn't
useful.
However, magic feather checks are still useful to help to detect bugs
stemming from misuse/trampling of the sk_user_data pointer in struct
sock. The l2tp code makes extensive use of sk_user_data to stash
pointers to the tunnel and session structures, and if another subsystem
overwrites sk_user_data it's important to detect this.
As such, rework l2tp's magic feather checks to focus on validating the
tunnel and session data structures when they're extracted from
sk_user_data.
* Add a new accessor function l2tp_sk_to_tunnel which contains a magic
feather check, and is used by l2tp_core and l2tp_ip[6]
* Comment l2tp_udp_encap_recv which doesn't use this new accessor function
because of the specific nature of the codepath it is called in
* Drop l2tp_session_queue_purge's check on the session magic feather:
it is called from code which is walking the tunnel session list, and
hence doesn't need validation
* Drop l2tp_session_free's check on the tunnel magic feather: the
intention of this check is covered by refcount.h's reference count
sanity checking
* Add session magic validation in pppol2tp_ioctl. On failure return
-EBADF, which mirrors the approach in pppol2tp_[sg]etsockopt.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Thu, 3 Sep 2020 08:54:50 +0000 (09:54 +0100)]
l2tp: capture more tx errors in data plane stats
l2tp_xmit_skb has a number of failure paths which are not reflected in
the tunnel and session statistics because the stats are updated by
l2tp_xmit_core. Hence any errors occurring before l2tp_xmit_core is
called are missed from the statistics.
Refactor the transmit path slightly to capture all error paths.
l2tp_xmit_skb now leaves all the actual work of transmission to
l2tp_xmit_core, and updates the statistics based on l2tp_xmit_core's
return code.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tom Parkin [Thu, 3 Sep 2020 08:54:47 +0000 (09:54 +0100)]
l2tp: remove header length param from l2tp_xmit_skb
All callers pass the session structure's hdr_len field as the header
length parameter to l2tp_xmit_skb.
Since we're passing a pointer to the session structure to l2tp_xmit_skb
anyway, there's not much point breaking the header length out as a
separate argument.
Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Acked-by: Vadim Pasternak <vadimp@mellanox.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Thu, 3 Sep 2020 13:41:45 +0000 (16:41 +0300)]
mlxsw: core_hwmon: Calculate MLXSW_HWMON_ATTR_COUNT more accurately
Currently the value of MLXSW_HWMON_ATTR_COUNT is calculated not really
accurate.
Add several defines to make the calculation clearer and easier to
change.
Calculate the precise high bound of number of attributes that may be
needed.
Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Thu, 3 Sep 2020 13:41:44 +0000 (16:41 +0300)]
mlxsw: core_hwmon: Split temperature querying from show functions
mlxsw_hwmon_module_temp_show(), mlxsw_hwmon_module_temp_critical_show()
and mlxsw_hwmon_module_temp_emergency_show() query the relevant
temperature from firmware and fill the value in provided buffers.
Split the temperature querying functionality to individual get()
functions and call them from the show() functions.
The get() functions will be used by subsequent patches in the set.
Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
Minor improvements to b53 dmesg output
These changes were made while debugging the b53 driver for use on a
custom board. They've been runtime tested on a patched 4.14.y kernel
which supports this board as well as build tested with 5.9-rc3. The
changes are straightforward enough that I think this testing is
sufficient but let me know if further testing is required.
Unfortunately I don't have a board to hand which boots with a more
recent kernel and has a switch supported by the b53 driver. I'd still
like to upstream these patches if possible though.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Barker [Thu, 3 Sep 2020 11:26:21 +0000 (12:26 +0100)]
net: dsa: b53: Print err message on SW_RST timeout
This allows us to differentiate between the possible failure modes of
b53_switch_reset() by looking at the dmesg output.
Signed-off-by: Paul Barker <pbarker@konsulko.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 2 Sep 2020 22:47:01 +0000 (15:47 -0700)]
Merge branch 'ionic-struct-cleanups'
Shannon Nelson says:
====================
ionic: struct cleanups
This patchset has a few changes for better cacheline use,
to cleanup a page handling API, and to streamline the
Adminq/Notifyq queue handling. Lastly, we also have a couple
of fixes pointed out by the friendly kernel test robot.
v2: dropped change to irq coalesce default
fixed Neel's attributions to Co-developed-by
dropped the unnecessary new call to dma_sync_single_for_cpu()
added 2 patches from kernel test robot results
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Tue, 1 Sep 2020 18:20:24 +0000 (11:20 -0700)]
ionic: clarify boolean precedence
Add parenthesis to clarify a boolean usage.
Pointed out in https://lore.kernel.org/lkml/202008060413.VgrMuqLJ%25lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Tue, 1 Sep 2020 18:20:23 +0000 (11:20 -0700)]
ionic: remove unused variable
Remove a vestigial variable.
Pointed out in https://lore.kernel.org/lkml/20200806143735.GA9232@xsang-OptiPlex-9020/
Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Tue, 1 Sep 2020 18:20:22 +0000 (11:20 -0700)]
ionic: clean adminq service routine
The only thing calling ionic_napi any more is the adminq
processing, so combine and simplify.
Co-developed-by: Neel Patel <neel@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Tue, 1 Sep 2020 18:20:21 +0000 (11:20 -0700)]
ionic: clean up desc_info and cq_info structs
Remove some unnecessary struct fields and related code.
Co-developed-by: Neel Patel <neel@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
Shannon Nelson [Tue, 1 Sep 2020 18:20:19 +0000 (11:20 -0700)]
ionic: clean up page handling code
The internal page handling can be cleaned up by passing our
local page struct rather than dma addresses, and by putting
more of the mgmt code into the alloc and free routines.
Co-developed-by: Neel Patel <neel@pensando.io> Signed-off-by: Shannon Nelson <snelson@pensando.io> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 2 Sep 2020 21:13:35 +0000 (14:13 -0700)]
Merge branch 'RTL8366-stabilization'
Linus Walleij says:
====================
RTL8366 stabilization
This stabilizes the RTL8366 driver by checking validity
of the passed VLANs and refactoring the member config
(MC) code so we do not require strict call order and
de-duplicate some code.
Changes from v1: incorporate review comments on patch
2.
Changes from v2: a oneline bug fix in patch 2.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The VLANs and PVIDs on the RTL8366 utilizes a "member
configuration" (MC) which is largely unexplained in the
code.
This set-up requires a special ordering: rtl8366_set_pvid()
must be called first, followed by rtl8366_set_vlan(),
else the MC will not be properly allocated. Relax this
by factoring out the code obtaining an MC and reuse
the helper in both rtl8366_set_pvid() and
rtl8366_set_vlan() so we remove this strict ordering
requirement.
In the process, add some better comments and debug prints
so people who read the code understand what is going on.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
The rtl8366_set_vlan() and rtl8366_set_pvid() get invalid
VLANs tossed at it, especially VLAN0, something the hardware
and driver cannot handle. Check validity and bail out like
we do in the other callbacks.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
in mptcp_connect, 's' selects IPPROTO_MPTCP / IPPROTO_TCP as the value of
'protocol' in socket(), and 'm' switches between different send / receive
modes. Fix die_usage(): swap 'm' and 's' and add missing 'sendfile' mode.
Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Tue, 1 Sep 2020 02:32:57 +0000 (04:32 +0200)]
net: dsa: mv88e6xxx: Fix W=1 warning with !CONFIG_OF
When building on platforms without device tree, e.g. amd64, W=1 gives
a warning about mv88e6xxx_mdio_external_match being unused. Replace
of_match_node() with of_device_is_compatible() to prevent this
warning.
Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
dpaa2-eth: add a dpaa2_eth_ prefix to all functions
This is just a quick cleanup that aims at adding a dpaa2_eth_ prefix to
all functions within the dpaa2-eth driver even if those are static and
private to the driver. The main reason for doing this is that looking a
perf top, for example, is becoming an inconvenience because one cannot
easily determine which entries are dpaa2-eth related or not.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Mon, 31 Aug 2020 18:12:40 +0000 (21:12 +0300)]
dpaa2-eth: add a dpaa2_eth_ prefix to all functions in dpaa2-eth-dcb.c
Some static functions in the dpaa2-eth driver don't have the dpaa2_eth_
prefix and this is becoming an inconvenience when looking at, for
example, a perf top output and trying to determine easily which entries
are dpaa2-eth related. Ammend this by adding the prefix to all the
functions.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Mon, 31 Aug 2020 18:12:39 +0000 (21:12 +0300)]
dpaa2-eth: add a dpaa2_eth_ prefix to all functions in dpaa2-eth.c
Some static functions in the dpaa2-eth driver don't have the dpaa2_eth_
prefix and this is becoming an inconvenience when looking at, for
example, a perf top output and trying to determine easily which entries
are dpaa2-eth related. Ammend this by adding the prefix to all the
functions.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Mon, 31 Aug 2020 18:12:38 +0000 (21:12 +0300)]
dpaa2-eth: add a dpaa2_eth_ prefix to all functions in dpaa2-ethtool.c
Some static functions in the dpaa2-eth driver don't have the dpaa2_eth_
prefix and this is becoming an inconvenience when looking at, for
example, a perf top output and trying to determine easily which entries
are dpaa2-eth related. Ammend this by adding the prefix to all the
functions.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eelco Chaudron [Mon, 31 Aug 2020 09:57:57 +0000 (11:57 +0200)]
net: openvswitch: fixes crash if nf_conncount_init() fails
If nf_conncount_init fails currently the dispatched work is not canceled,
causing problems when the timer fires. This change fixes this by not
scheduling the work until all initialization is successful.
Fixes: 7a106103de59 ("net: openvswitch: fixes potential deadlock in dp cleanup code") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
In some cases, the device or firmware may be busy when the
driver attempts to perform the CRQ initialization handshake.
If the partner is busy, the hypervisor will return the H_CLOSED
return code. The aim of this patch is that, if the device is not
ready, to query the device a number of times, with a small wait
time in between queries. If all initialization requests fail,
the driver will remain in a dormant state, awaiting a signal
from the device that it is ready for operation.
Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The following pull-request contains BPF updates for your *net-next* tree.
There are two small conflicts when pulling, resolve as follows:
1) Merge conflict in tools/lib/bpf/libbpf.c between 551175f2183c ("libbpf: Factor
out common ELF operations and improve logging") in bpf-next and 5766050da0b7
("libbpf: Fix map index used in error message") in net-next. Resolve by taking
the hunk in bpf-next:
[...]
scn = elf_sec_by_idx(obj, obj->efile.btf_maps_shndx);
data = elf_sec_data(obj, scn);
if (!scn || !data) {
pr_warn("elf: failed to get %s map definitions for %s\n",
MAPS_ELF_SEC, obj->path);
return -EINVAL;
}
[...]
2) Merge conflict in drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.c between 61acb2f844fe ("xsk: i40e: ice: ixgbe: mlx5: Test for dma_need_sync earlier for
better performance") in bpf-next and 93970113e83e ("net/mlx5e: RX, Add a prefetch
command for small L1_CACHE_BYTES") in net-next. Resolve the two locations by retaining
net_prefetch() and taking xsk_buff_dma_sync_for_cpu() from bpf-next. Should look like:
We've added 133 non-merge commits during the last 14 day(s) which contain
a total of 246 files changed, 13832 insertions(+), 3105 deletions(-).
The main changes are:
1) Initial support for sleepable BPF programs along with bpf_copy_from_user() helper
for tracing to reliably access user memory, from Alexei Starovoitov.
2) Add BPF infra for writing and parsing TCP header options, from Martin KaFai Lau.
3) bpf_d_path() helper for returning full path for given 'struct path', from Jiri Olsa.
4) AF_XDP support for shared umems between devices and queues, from Magnus Karlsson.
5) Initial prep work for full BPF-to-BPF call support in libbpf, from Andrii Nakryiko.
6) Generalize bpf_sk_storage map & add local storage for inodes, from KP Singh.
7) Implement sockmap/hash updates from BPF context, from Lorenz Bauer.
8) BPF xor verification for scalar types & add BPF link iterator, from Yonghong Song.
9) Use target's prog type for BPF_PROG_TYPE_EXT prog verification, from Udip Pant.
10) Rework BPF tracing samples to use libbpf loader, from Daniel T. Lee.
11) Fix xdpsock sample to really cycle through all buffers, from Weqaar Janjua.
12) Improve type safety for tun/veth XDP frame handling, from Maciej Żenczykowski.
13) Various smaller cleanups and improvements all over the place.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement the getsockopt SOL_TLS TLS_RX which is currently missing. The
primary usecase is to use it in conjunction with TCP_REPAIR to
checkpoint/restore the TLS record layer state.
TLS connection state usually exists on the user space library. So
basically we can easily extract it from there, but when the TLS
connections are delegated to the kTLS, it is not the case. We need to
have a way to extract the TLS state from the kernel for both of TX and
RX side.
The new TLS_RX getsockopt copies the crypto_info to user in the same
way as TLS_TX does.
We have described use cases in our research work in Netdev 0x14
Transport Workshop [1].
Also, there is an TLS implementation called tlse [2] which supports
TLS connection migration. They have support of kTLS and their code
shows that they are expecting the future support of this option.
Signed-off-by: Yutaro Hayakawa <yhayakawa3720@gmail.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
keep_flows was introduced by [1], which used as flag to delete flows or not.
When rehashing or expanding the table instance, we will not flush the flows.
Now don't use it anymore, remove it.
[1] - https://github.com/openvswitch/ovs/commit/acd051f1761569205827dc9b037e15568a8d59f8 Cc: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Decrease table->count and ufid_count unconditionally,
because we only don't use count or ufid_count to count
when flushing the flows. To simplify the codes, we
remove the "count" argument of table_instance_flow_free.
To avoid a bug when deleting flows in the future, add
WARN_ON in flush flows function.
Cc: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Not change the logic, just improve the coding style.
Cc: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Björn Töpel [Tue, 1 Sep 2020 08:39:28 +0000 (10:39 +0200)]
bpf: {cpu,dev}map: Change various functions return type from int to void
The functions bq_enqueue(), bq_flush_to_queue(), and bq_xmit_all() in
{cpu,dev}map.c always return zero. Changing the return type from int
to void makes the code easier to follow.
Suggested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://lore.kernel.org/bpf/20200901083928.6199-1-bjorn.topel@gmail.com
bpf: Remove bpf_lsm_file_mprotect from sleepable list.
Technically the bpf programs can sleep while attached to bpf_lsm_file_mprotect,
but such programs need to access user memory. So they're in might_fault()
category. Which means they cannot be called from file_mprotect lsm hook that
takes write lock on mm->mmap_lock.
Adjust the test accordingly.
Also add might_fault() to __bpf_prog_enter_sleepable() to catch such deadlocks early.
Weqaar Janjua [Fri, 28 Aug 2020 16:17:17 +0000 (00:17 +0800)]
samples/bpf: Fix to xdpsock to avoid recycling frames
The txpush program in the xdpsock sample application is supposed
to send out all packets in the umem in a round-robin fashion.
The problem is that it only cycled through the first BATCH_SIZE
worth of packets. Fixed this so that it cycles through all buffers
in the umem as intended.
Fixes: 4cb659ee6cb1 ("samples/bpf: convert xdpsock to use libbpf for AF_XDP access") Signed-off-by: Weqaar Janjua <weqaar.a.janjua@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/20200828161717.42705-1-weqaar.a.janjua@intel.com
Magnus Karlsson [Fri, 28 Aug 2020 12:51:05 +0000 (14:51 +0200)]
samples/bpf: Optimize l2fwd performance in xdpsock
Optimize the throughput performance of the l2fwd sub-app in the
xdpsock sample application by removing a duplicate syscall and
increasing the size of the fill ring.
The latter needs some further explanation. We recommend that you set
the fill ring size >= HW RX ring size + AF_XDP RX ring size. Make sure
you fill up the fill ring with buffers at regular intervals, and you
will with this setting avoid allocation failures in the driver. These
are usually quite expensive since drivers have not been written to
assume that allocation failures are common. For regular sockets,
kernel allocated memory is used that only runs out in OOM situations
that should be rare.
These two performance optimizations together lead to a 6% percent
improvement for the l2fwd app on my machine.
Add support for the Lynx PCS as a separate module in drivers/net/phy/.
The advantage of this structure is that multiple ethernet or switch
drivers used on NXP hardware (ENETC, Seville, Felix DSA switch etc) can
share the same implementation of PCS configuration and runtime
management.
The module implements phylink_pcs_ops and exports a phylink_pcs
(incorporated into a lynx_pcs) which can be directly passed to phylink
through phylink_pcs_set.
The first 3 patches add some missing pieces in phylink and the locked
mdiobus write accessor. Next, the Lynx PCS MDIO module is added as a
standalone module. The majority of the code is extracted from the Felix
DSA driver. The last patch makes the necessary changes in the Felix and
Seville drivers in order to use the new common PCS implementation.
At the moment, USXGMII (only with in-band AN), SGMII, QSGMII (with and
without in-band AN) and 2500Base-X (only w/o in-band AN) are supported
by the Lynx PCS MDIO module since these were also supported by Felix and
no functional change is intended at this time.
Changes in v2:
* got rid of the mdio_lynx_pcs structure and directly exported the
functions without the need of an indirection
* made the necessary adjustments for this in the Felix DSA driver
* solved the broken allmodconfig build test by making the module
tristate instead of bool
* fixed a memory leakage in the Felix driver (the pcs structure was
allocated twice)
Changes in v3:
* added support for PHYLINK PCS ops in DSA (patch 5/9)
* cleanup in Felix PHYLINK operations and migrate to
phylink_mac_link_up() being the callback of choice for applying MAC
configuration (patches 6-8)
Changes in v4:
* use the newly introduced phylink PCS mechanism
* install the phylink_pcs in the phylink_mac_config DSA ops
* remove the direct implementations of the PCS ops
* do no use the SGMII_ prefix when referring to the IF_MORE register
* add a phylink helper to decode the USXGMII code word
* remove cleanup patches for Felix (these have been already accepted)
* Seville (recently introduced) now has PCS support through the same
Lynx PCS module
Changes in v5:
- move the pcs-lynx driver to drivers/net/pcs
- reword the commit message a bit in 4/5
- add error checking and error propagation in 4/5
- s/IF_MODE_DUPLEX/IF_MODE_HALF_DUPLEX in 4/5
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Sun, 30 Aug 2020 08:34:02 +0000 (11:34 +0300)]
net: dsa: ocelot: use the Lynx PCS helpers in Felix and Seville
Use the helper functions introduced by the newly added
Lynx PCS MDIO module in the Felix VSC9959 and Seville VSC9953.
Instead of representing the PCS as a phy_device, a mdio_device structure
will be passed to the Lynx module which is now actually implementing all
the PCS configuration and status reporting.
All code previously used for PCS monitoring and runtime configuration
is removed and replaced will calls to the Lynx PCS operations.
Tested on the following SERDES protocols of LS1028A: 0x7777
(2500Base-X), 0x85bb (QSGMII), 0x9999 (SGMII) and 0x13bb (USXGMII).
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Sun, 30 Aug 2020 08:34:01 +0000 (11:34 +0300)]
net: phy: add Lynx PCS module
Add a Lynx PCS module which exposes the necessary operations to drive
the PCS using phylink.
The majority of the code is extracted from the Felix DSA driver, which
will be also modified in a later patch, and exposed as a separate module
for code reusability purposes.
As such, this aims at feature and bug parity with the existing Felix DSA
driver, and thus USXGMII, SGMII, QSGMII and 2500Base-X (only w/o in-band
AN) are supported by the Lynx PCS module since these were also supported
by Felix.
The module can only be enabled by the drivers in need and not user
selectable.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Add the locked variant of the clause 45 mdiobus write accessor -
mdiobus_c45_write().
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Sun, 30 Aug 2020 08:33:59 +0000 (11:33 +0300)]
net: phylink: consider QSGMII interface mode in phylink_mii_c22_pcs_get_state
The same link partner advertisement word is used for both QSGMII and
SGMII, thus treat both interface modes using the same
phylink_decode_sgmii_word() function.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Sun, 30 Aug 2020 08:33:58 +0000 (11:33 +0300)]
net: phylink: add helper function to decode USXGMII word
With the new addition of the USXGMII link partner ability constants we
can now introduce a phylink helper that decodes the USXGMII word and
populates the appropriate fields in the phylink_link_state structure
based on them.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
sfc: clean up some W=1 build warnings
A collection of minor fixes to issues flagged up by W=1.
After this series, the only remaining warnings in the sfc driver are
some 'member missing in kerneldoc' warnings from ptp.c.
Tested by building on x86_64 and running 'ethtool -p' on an EF10 NIC;
there was no error, but I couldn't observe the actual LED as I'm
working remotely.
[ Incidentally, ethtool_phys_id()'s behaviour on an error return
looks strange — if I'm reading it right, it will break out of the
inner loop but not the outer one, and eventually return the rc
from the last run of the inner loop. Is this intended? ]
====================
Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Fri, 28 Aug 2020 17:51:04 +0000 (18:51 +0100)]
sfc: return errors from efx_mcdi_set_id_led, and de-indirect
W=1 warnings indicated that 'rc' was unused in efx_mcdi_set_id_led();
change the function to return int instead of void and plumb the rc
through the caller efx_ethtool_phys_id().
Since (post-Falcon) all sfc NICs use MCDI for this, there's no point in
indirecting through a nic_type method, so remove that and just call
efx_mcdi_set_id_led() directly.
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Fri, 28 Aug 2020 17:50:24 +0000 (18:50 +0100)]
sfc: fix unused-but-set-variable warning in efx_farch_filter_remove_safe
Thanks to some past refactor, 'spec' is not actually used in this
function; the code using it moved to the callee efx_farch_filter_remove.
Remove the variable to fix a W=1 warning.
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Fri, 28 Aug 2020 17:50:02 +0000 (18:50 +0100)]
sfc: fix W=1 warnings in efx_farch_handle_rx_not_ok
Some of these RX-event flags aren't used at all, so remove them.
Others are used only #ifdef DEBUG to log a message; suppress the
unused-var warnings #ifndef DEBUG with a void cast.
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Fri, 28 Aug 2020 13:30:55 +0000 (15:30 +0200)]
gtp: remove useless rcu_read_lock()
The rtnl lock is taken just the line above, no need to take the rcu also.
Fixes: fcdced5d8fbb ("gtp: fix use-after-free in gtp_encap_destroy()") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Fri, 28 Aug 2020 10:53:53 +0000 (11:53 +0100)]
net: phylink: avoid oops during initialisation
If we intend to use PCS operations, mac_pcs_get_state() will not be
implemented, so will be NULL. If we also intend to register the PCS
operations in mac_prepare() or mac_config(), then this leads to an
attempt to call NULL function pointer during phylink_start(). Avoid
this, but we must report the link is down.
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>