Daniel T. Lee [Tue, 7 Jul 2020 18:48:54 +0000 (03:48 +0900)]
samples: bpf: Refactor BPF map performance test with libbpf
Previously, in order to set the numa_node attribute at the time of map
creation using "libbpf", it was necessary to call bpf_create_map_node()
directly (bpf_load approach), instead of calling bpf_object_load()
that handles everything on its own, including map creation. And because
of this problem, this sample had problems with refactoring from bpf_load
to libbpf.
However, commit 1bdb6c9a1c43 ("libbpf: Add a bunch of attribute
getters/setters for map definitions") added the numa_node attribute and
allowed it to be set in the map.
By using libbpf instead of bpf_load, the inner map definition is now
explicitly declared in the BTF-defined format. The element of
ARRAY_OF_MAPS is also statically specified using the BTF format. For
this reason, some logic in fixup_map() was no longer needed and has been
changed or removed.
Daniel T. Lee [Tue, 7 Jul 2020 18:48:52 +0000 (03:48 +0900)]
samples: bpf: Fix bpf programs with kprobe/sys_connect event
Currently, BPF programs with kprobe/sys_connect do not work properly.
Commit 34745aed515c ("samples/bpf: fix kprobe attachment issue on x64")
modified the bpf_load behavior of kprobe events on the x64
architecture: if the kprobe event target starts with "sys_*", the
prefix "__x64_" is added to the front of the event name.
Prepending the "__x64_" prefix to kprobe/sys_* events was appropriate as
a solution to most of the problems caused by the commit below.
commit d5a00528b58c ("syscalls/core, syscalls/x86: Rename struct
pt_regs-based sys_*() to __x64_sys_*()")
However, there is a problem with the sys_connect kprobe event that does
not work properly. For __sys_connect event, parameters can be fetched
normally, but for __x64_sys_connect, parameters cannot be fetched.
As the assembly code for __x64_sys_connect shows, parameters should be
fetched and set into rdi, rsi, rdx registers prior to calling
__sys_connect.
Because of this problem, this commit fixes the sys_connect event by
first getting the value of the rdi register and then fetching the values
of the rdi, rsi, and rdx registers through an offset based on that value.
Fixes: 34745aed515c ("samples/bpf: fix kprobe attachment issue on x64") Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200707184855.30968-2-danieltimlee@gmail.com
Sometimes it's handy to know when the socket gets freed. In
particular, we'd like to try to use a smarter allocation of
ports for bpf_bind and explore the possibility of limiting
the number of SOCK_DGRAM sockets the process can have.
Implement BPF_CGROUP_INET_SOCK_RELEASE hook that triggers on
inet socket release. It triggers only for userspace sockets
(not in-kernel ones) and therefore has the same semantics as
the existing BPF_CGROUP_INET_SOCK_CREATE.
priv->page_pool is an array, so comparing against it will always return true.
Do a meaningful check by checking priv->page_pool[0] instead.
While at it, clear the page_pool pointers on deallocation, or when an
allocation error happens during init.
Reported-by: Colin Ian King <colin.king@canonical.com> Fixes: c2d6fe6163de ("mvpp2: XDP TX support") Signed-off-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
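The page_pool check described above can be illustrated with a small
userspace sketch. The struct, the pool count and the helper names here
are hypothetical stand-ins, not the mvpp2 driver's own code; the point
is only that an array member is never NULL, so the first slot has to be
tested instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_POOLS 4	/* hypothetical size, not taken from the driver */

struct demo_pool { int id; };

/* Hypothetical stand-in for the driver's private data. */
struct demo_priv {
	/* An array member: "priv->page_pool != NULL" would always be true. */
	struct demo_pool *page_pool[DEMO_MAX_POOLS];
};

/* Meaningful check: the first slot stays NULL until pools are allocated. */
bool demo_pools_allocated(const struct demo_priv *priv)
{
	assert(priv != NULL);
	return priv->page_pool[0] != NULL;
}

/* Clear every slot on teardown so a later check reports "not allocated". */
void demo_pools_clear(struct demo_priv *priv)
{
	assert(priv != NULL);
	for (size_t i = 0; i < DEMO_MAX_POOLS; i++)
		priv->page_pool[i] = NULL;
}
```

Clearing the slots on deallocation is what keeps the [0] check
trustworthy after an init failure or teardown.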
We can re-use the existing work queue to handle path management
instead of a dedicated work queue. Just move pm_worker to protocol.c,
call it from the mptcp worker and get rid of the msk lock (already held).
Signed-off-by: Florian Westphal <fw@strlen.de> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
In certain configurations without power management support, gcc reports
the following warning:
drivers/net/ethernet/sun/cassini.c:5206:12: warning:
'cas_resume' defined but not used [-Wunused-function]
5206 | static int cas_resume(struct device *dev_d)
| ^~~~~~~~~~
Mark cas_resume() as __maybe_unused to make it clear.
Fixes: f193f4ebde3d ("sun/cassini: use generic power management") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
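As a rough userspace illustration of the __maybe_unused fix above: the
macro and the config switch below are hypothetical stand-ins (the kernel
provides its own __maybe_unused), and the sketch assumes GCC or Clang.

```c
#include <assert.h>

/* Userspace stand-in for the kernel's __maybe_unused (GCC/Clang assumed). */
#define demo_maybe_unused __attribute__((unused))

/* Hypothetical switch modelling a build without power management support. */
#define DEMO_PM_SLEEP 0

/* Without the attribute, -Wunused-function fires when DEMO_PM_SLEEP is 0. */
static int demo_maybe_unused demo_resume(int dev_id)
{
	assert(dev_id >= 0);
	return dev_id;
}

/* Returns the resumed device id, or -1 when PM support is compiled out. */
int demo_do_resume(int dev_id)
{
#if DEMO_PM_SLEEP
	return demo_resume(dev_id);
#else
	(void)dev_id;
	return -1;
#endif
}
```

The attribute only silences the warning; the function is still compiled
and type-checked in every configuration.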
sun/niu: add __maybe_unused attribute to PM functions
The upgraded .suspend() and .resume() throw a
"defined but not used [-Wunused-function]" warning for certain
configurations.
Mark them with "__maybe_unused" attribute.
Compile-tested only.
Fixes: b0db0cc2f695 ("sun/niu: use generic power management") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes most of the Sparse and W=1 warnings in drivers/net/phy. The
Cavium code is still not fully clean, but it might actually be that the
strange code is confusing Sparse.
v2
--
Added RB, TB, AB.
s/case/cause
Reverse Christmas tree
Module soft dependencies
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
To ensure that the octeon MDIO driver has been loaded, the Cavium
ethernet drivers reference a dummy symbol in the MDIO driver. This
forces it to be loaded first. This symbol has not been cleanly
implemented, resulting in warnings when building with W=1 C=1.
Since device tree is being used, and a phandle points to the PHY on
the MDIO bus, we can make use of deferred probing. If the PHY fails to
connect, it should be because the MDIO bus driver has not loaded
yet. Return -EPROBE_DEFER so it will be tried again later.
Additionally, add a MODULE_SOFTDEP() to give user space a hint as to
what order it should load the modules.
v2:
s/octoen/octeon/
Add MODULE_SOFTDEP()
Cc: Sunil Goutham <sgoutham@marvell.com> Cc: Robert Richter <rrichter@marvell.com> Cc: Chris Packham <chris.packham@alliedtelesis.co.nz> Tested-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Tue, 7 Jul 2020 01:49:38 +0000 (03:49 +0200)]
net: phy: cavium: Improve __iomem mess
The MIPS low level register access functions seem to be missing
__iomem annotation. This causes lots of sparse warnings, when code
casts off the __iomem. Make the Cavium MDIO drivers cleaner by pushing
the casts lower down into the helpers, allowing the drivers to work as
normal, with __iomem.
bus->register_base is now a void *, rather than a u64. So forming the
mii_bus->id string cannot use %llx any more. Use %px, so this kernel
address is still exposed to user space, as it was before.
v2: s/cases/causes/g
Cc: Sunil Goutham <sgoutham@marvell.com> Cc: Robert Richter <rrichter@marvell.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Tue, 7 Jul 2020 01:49:37 +0000 (03:49 +0200)]
net: phy: dp83640: Fixup cast to restricted __be16 warning
ntohs() expects to be passed a __be16. Correct the type of the
variable holding the sequence ID.
Cc: Richard Cochran <richardcochran@gmail.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Tue, 7 Jul 2020 01:49:34 +0000 (03:49 +0200)]
net: phy: Fixup parameters in kerneldoc
Correct the kerneldoc for a few structure and function calls,
as reported by C=1 W=1.
Cc: Alexandru Ardelean <alexaundru.ardelean@analog.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Tue, 7 Jul 2020 01:49:33 +0000 (03:49 +0200)]
net: phy: at803x: Avoid comparison is always false warning
By placing the GENMASK value into an unsigned int and then passing it
to FIELD_PREP(), the type is reduced down from ULL. Given the reduced
size of the type, the range checks in FIELD_PREP() are always true, and
-Wtype-limits then gives a warning.
By skipping the intermediate variable, the warning can be avoided.
Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Jacob Keller [Mon, 6 Jul 2020 21:53:41 +0000 (14:53 -0700)]
ice: add documentation for device-caps region
The recent change by commit 8d7aab3515fa ("ice: implement snapshot for
device capabilities") to implement the device-caps region for the ice
driver forgot to document it.
Add documentation to the ice devlink documentation file describing the
new region and add some sample output to the shell commands provided as
an example.
Fixes: 8d7aab3515fa ("ice: implement snapshot for device capabilities") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: systemport: Add support for VLAN transmit acceleration
SYSTEMPORT is capable of performing VLAN transmit acceleration, support
that by configuring it appropriately, providing the VLAN ID and PCP/DEI
where necessary.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
syzkaller was able to make the kernel reach subflow_data_ready() for a
server subflow that was closed before subflow_finish_connect() completed.
In these cases we can avoid using the path for regular/fallback MPTCP
data, and just wake the main socket, to avoid the following warning:
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/39 Reported-by: Christoph Paasch <cpaasch@apple.com> Fixes: e1ff9e82e2ea ("net: mptcp: improve fallback to TCP") Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
          If both the HTTP and HTTPS versions
          return 200 OK and serve the same content:
            Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: David S. Miller <davem@davemloft.net>
This set cleans qed/qede build log under W=1 C=1 with GCC 8 and
sparse 0.6.2. The only thing left is "context imbalance -- unexpected
unlock" in one of the source files, which will be addressed later during
the refactoring cycles.
The biggest part is handling the endianness warnings. The current code
often just assumes that both host and device operate in LE, which is
obviously incorrect (despite the fact that it's true for x86 platforms),
and makes sparse {s,m}ad.
The rest of the series is mostly random non-functional fixes
here-and-there.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Flow Dissector's keys are mostly Network / Big Endian. U{16,32}_MAX are
the same in either byte order, but let's make sparse happy by
wrapping them in no-op conversions.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
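The claim above that U{16,32}_MAX read the same in either byte order can
be checked from userspace. This is only a sketch using the POSIX
htons()/htonl() helpers in place of the kernel's cpu_to_be16() and
cpu_to_be32(); the function names are made up for illustration.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>

/* Non-zero when a 16-bit value is identical in host and network order. */
int demo_same_in_both_orders16(uint16_t v)
{
	return htons(v) == v;
}

/* Non-zero when a 32-bit value is identical in host and network order. */
int demo_same_in_both_orders32(uint32_t v)
{
	return htonl(v) == v;
}

/* All-ones masks are byte-order invariant, so the conversion is a no-op. */
void demo_check_all_ones(void)
{
	assert(htons(UINT16_MAX) == UINT16_MAX);
	assert(htonl(UINT32_MAX) == UINT32_MAX);
}
```

The conversion changes nothing at run time for these values; in the
kernel its purpose is to give the constant the restricted type that
sparse expects.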
One of the function arguments was renamed some time ago, but this
wasn't reflected in its kernel-doc comment.
Also add the description for return values.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Current code assumes that both host and device operate in Little Endian
in lots of places. While this is true for x86 platforms, this doesn't mean
we should not care about this.
This commit addresses all parts of the code that were pointed out by sparse
checker. All operations with restricted (__be*/__le*) types are now
protected with explicit from/to CPU conversions, even if they're noops on
common setups.
I'm sure there are more such places, but this implies a deeper code
investigation, and is a subject for future work.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: qed: use ptr shortcuts to dedup field accessing in some parts
Use intermediate pointers instead of multiple dereferencing to
simplify and beautify parts of code that will be addressed in
the next commit.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: qed: improve indentation of some parts of code
To not mix functional and stylistic changes, correct indentation
of code that will be modified in the subsequent commits.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of the kernel-doc warnings when building with W=1+ by
rewriting the problematic doc comments according to the
recommended format and style.
Note that this only fixes problems found in C source files,
headers aren't in scope for now.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Change the prototype of qed_hw_err_notify() as follows:
* constify "fmt" argument according to printk() declarations;
* annotate it with __cold attribute to move the function out of
the line;
* annotate it with __printf() attribute;
This eliminates the following W=1+ warning:
drivers/net/ethernet/qlogic/qed/qed_hw.c: In function
‘qed_hw_err_notify’:
drivers/net/ethernet/qlogic/qed/qed_hw.c:851:3: warning: function
‘qed_hw_err_notify’ might be a candidate for ‘gnu_printf’ format
attribute [-Wsuggest-attribute=format]
len = vsnprintf(buf, QED_HW_ERR_MAX_STR_SIZE, fmt, vl);
^~~
__printf() will also be helpful with catching bad format strings
and arguments.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
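A userspace sketch of the same kind of prototype is shown here. The
function name, buffer size and argument layout are hypothetical; the
kernel's __printf() and __cold macros are assumed to correspond to the
GCC/Clang attributes spelled out below.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define DEMO_ERR_MAX_STR_SIZE 64	/* hypothetical buffer size */

/*
 * Hypothetical error-notify helper.
 * format(printf, 2, 3): argument 2 is the format string and the variadic
 * arguments start at position 3, so callers get format checking.
 * cold: hints that the function is rarely executed.
 */
__attribute__((cold, format(printf, 2, 3)))
int demo_err_notify(char *buf, const char *fmt, ...)
{
	va_list vl;
	int len;

	assert(buf != NULL && fmt != NULL);
	va_start(vl, fmt);
	len = vsnprintf(buf, DEMO_ERR_MAX_STR_SIZE, fmt, vl);
	va_end(vl);
	return len;
}
```

With the format attribute in place, a call such as
demo_err_notify(buf, "%s", 42) should draw a -Wformat diagnostic at
compile time instead of misbehaving at run time.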
Fix several sparse warnings by moving struct declarations into
the corresponding header files:
drivers/net/ethernet/qlogic/qed/qed_dcbx.c:2402:32: warning:
symbol 'qed_dcbnl_ops_pass' was not declared. Should it be static?
drivers/net/ethernet/qlogic/qed/qed_ll2.c:2754:26: warning: symbol
'qed_ll2_ops_pass' was not declared. Should it be static?
drivers/net/ethernet/qlogic/qed/qed_ptp.c:449:30: warning: symbol
'qed_ptp_ops_pass' was not declared. Should it be static?
drivers/net/ethernet/qlogic/qed/qed_sriov.c:5265:29: warning:
symbol 'qed_iov_ops_pass' was not declared. Should it be static?
(some of them were declared twice in different header files)
Also make qed_hw_err_type_descr[] const while at it.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: qed: move static iro_arr[] out of header file
Static variables (and functions, unless they're inline) should not
be declared in header files.
Move the static array iro_arr[] from "qed_hsi.h" to the sole place
where it's used, "qed_init_ops.c". This eliminates lots of warnings
(42 of them actually) against W=1+:
In file included from drivers/net/ethernet/qlogic/qed/qed.h:51:0,
from drivers/net/ethernet/qlogic/qed/qed_ooo.c:40:
drivers/net/ethernet/qlogic/qed/qed_hsi.h:4421:18: warning: 'iro_arr'
defined but not used [-Wunused-const-variable=]
static const u32 iro_arr[] = {
^~~~~~~
Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
geneve: move all configuration under struct geneve_config
This patch adds a new structure geneve_config and moves the per-device
configuration attributes to it, like we already have in VXLAN with
struct vxlan_config. This ends up being pretty invasive since those
attributes are used everywhere.
This allows us to clean up the argument lists for geneve_configure (4
arguments instead of 8) and geneve_nl2info (5 instead of 9).
This also reduces the copy-paste of code setting those attributes
between geneve_configure and geneve_changelink to a single memcpy,
which would have avoided the bug fixed in commit 56c09de347e4 ("geneve: allow changing DF behavior after creation").
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
On link down, the draining of the S/G cache should be done on all
_possible_ CPUs not just the ones that are online in that moment.
Fix this by changing the iterator.
Fixes: d70446ee1f40 ("dpaa2-eth: send a scatter-gather FD instead of realloc-ing") Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tang Bin [Mon, 6 Jul 2020 14:47:01 +0000 (22:47 +0800)]
net/amd: Remove needless assignment and the extra blank lines
The assignment 'err = -ENODEV;' in au1000_probe() is
duplicated, so remove the redundant one. Also remove the
extra blank lines in the file au1000_eth.c.
Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: Tang Bin <tangbin@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
When investigating performance issues that involve latency / loss /
reordering it is useful to have the pcap from the sender-side as it
makes it easier to infer the state of the sender's congestion-control,
loss-recovery, etc.
Allow the selftests to capture a pcap on both sender and receiver so
that this information is not lost when reproducing.
This patch also improves the file names. Instead of:
It was a connection from ns3 to ns4, better to start with ns3 then. The
port is also added, easier to find the trace we want.
Co-developed-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
ethernet: sun: use generic power management
Linux Kernel Mentee: Remove Legacy Power Management.
The purpose of this patch series is to remove legacy power management callbacks
from sun ethernet drivers.
The callbacks performing suspend() and resume() operations are still calling
pci_save_state(), pci_set_power_state(), etc. and handling the power management
themselves, which is not recommended.
The conversion requires removing those function calls, changing the
callback definitions accordingly and making use of the dev_pm_ops structure.
All patches are compile-tested only.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states. And they use PCI
helper functions to do it.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
In this driver:
gem_suspend() calls gem_do_stop() which in turn invokes
pci_disable_device(). As the PCI helper function is not called at the
end/start of the function body, breaking the function in two parts
may change its behavior.
The only other function invoking gem_do_stop() is gem_close(). Hence,
gem_close() and gem_suspend() can do the required end steps on their own.
The same case is with gem_resume(). Both gem_resume() and gem_open()
invoke gem_do_start(). Again, make the caller functions do the required
steps on their own.
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 5 Jul 2020 21:05:08 +0000 (23:05 +0200)]
net: dsa: vitesse-vsc73xx: Convert to plain comments to avoid kerneldoc warnings
The comments before struct vsc73xx_platform and struct vsc73xx_spi use
kerneldoc format, but then fail to document the members of these
structures. All the structure members are self evident, and the driver
has no other kerneldoc comments, so change these to plain comments to
avoid warnings.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 5 Jul 2020 20:55:55 +0000 (22:55 +0200)]
net: dsa: lan9303: fix variable 'res' set but not used
Since lan9303_adjust_link() is a void function, there is no option to
return an error. So just remove the variable and let any errors be
discarded.
Cc: Egil Hjelmeland <privat@egil-hjelmeland.no> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 5 Jul 2020 20:42:27 +0000 (22:42 +0200)]
net: dsa: rtl8366: Pass GENMASK() signed bits
Oddly, GENMASK() requires signed bit numbers, so that it can compare
them for < 0. If passed an unsigned type, we get warnings about the
test never being true.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
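To show why the signedness of the bit numbers matters, here is a
simplified userspace helper in the spirit of GENMASK(). It is not the
kernel macro, and its name and 32-bit width are assumptions made for the
example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified GENMASK-style helper (not the kernel macro): returns a mask
 * with bits l..h set, inclusive, for 0 <= l <= h <= 31.
 * Taking plain int keeps the range check meaningful; with unsigned
 * parameters the "l >= 0" test could never fail, which is what the
 * compiler warns about.
 */
uint32_t demo_genmask(int h, int l)
{
	assert(l >= 0 && h >= l && h <= 31);
	return (UINT32_MAX >> (31 - h)) & (UINT32_MAX << l);
}
```

For example, demo_genmask(7, 4) yields 0xf0. A caller holding the bit
number in an unsigned variable would convert it to int before the call.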
Andrew Lunn [Sun, 5 Jul 2020 20:36:25 +0000 (22:36 +0200)]
net: dsa: bcm_sf2: Pass GENMASK() signed bits
Oddly, GENMASK() requires signed bit numbers, so that it can compare
them for < 0. If passed an unsigned type, we get warnings about the
test never being true. There is no danger of overflow here, udf is
always a u8, so there is plenty of space when expanding to an int.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 5 Jul 2020 20:36:24 +0000 (22:36 +0200)]
net: dsa: bcm_sf2: Initialize __be16 with a __be16 value
A __be16 variable should be initialised with a __be16 value. So add a
htons(). In this case it is pointless, given the value being assigned
is 0xffff, but it stops sparse from warning.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 5 Jul 2020 19:38:09 +0000 (21:38 +0200)]
net: dsa: mv88e6xxx: Remove set but unused variable
We don't act on any errors reading registers while handling watchdog
interrupt. Since this is an interrupt handler, we cannot return such
errors. So just remove the variable.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 5 Jul 2020 19:38:07 +0000 (21:38 +0200)]
net: dsa: mv88e6xxx: Fix sparse warnings from GENMASK
Oddly, GENMASK() requires signed bit numbers, so that it can compare
them for < 0. If passed an unsigned type, we get warnings about the
test never being true.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 5 Jul 2020 19:30:07 +0000 (21:30 +0200)]
net: dsa: tag_mtk: Fix warnings for __be16
net/dsa/tag_mtk.c:84:13: warning: incorrect type in assignment (different base types)
net/dsa/tag_mtk.c:84:13: expected restricted __be16 [usertype] hdr
net/dsa/tag_mtk.c:84:13: got int
net/dsa/tag_mtk.c:94:17: warning: restricted __be16 degrades to integer
The result of a ntohs() is not __be16, but u16.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
Phylink integration improvements for Felix DSA driver
This is an overhaul of the Felix switch driver's phylink operations.
Patches 1, 3, 4 and 5 are cleanup, patch 2 adds a new feature, and
patch 6 is an adaptation to the new format of an existing phylink API
(mac_link_up).
Changes since v2:
- Replaced "PHYLINK" with "phylink".
- Rewrote commit message of patch 5/6.
Changes since v1:
- Now using phy_clear_bits and phy_set_bits instead of plain writes to
MII_BMCR. This combines former patches 1/7 and 6/7 into a single new
patch 1/6.
- Updated commit message of patch 5/6.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Sun, 5 Jul 2020 16:16:26 +0000 (19:16 +0300)]
net: dsa: felix: use resolved link config in mac_link_up()
Phylink now requires that parameters established through
auto-negotiation be written into the MAC at the time of the
mac_link_up() callback. In the case of felix, that means taking the port
out of reset, setting the correct timers for PAUSE frames, and
enabling/disabling TX flow control.
This patch also splits the inband and noinband configuration of the
vsc9959 PCS (currently found in a function called "init") into 2
different functions, which have a nomenclature closer to phylink:
"config", for inband setup, and "link_up", for noinband (forced) setup.
This is necessary as a preparation step for giving up control of the PCS
to phylink, which will be done in further patch series.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Phylink uses the .mac_an_restart method to offer the user an
implementation of the "ethtool -r" behavior, when the media-side auto
negotiation can be restarted by the local MAC PCS. This is the case for
fiber modes 1000Base-X and 2500Base-X (IEEE clause 37) that don't have
an Ethernet PHY connected locally, and the media is connected to the MAC
PCS directly.
On the other hand, the Cisco SGMII and USXGMII standards also have an
auto negotiation mechanism based on IEEE 802.3 clause 37 (their
respective specs require a MAC PCS and a PHY PCS to implement the same
state machine, which is described in IEEE 802.3 "Auto-Negotiation Figure
37-6"), so the ability to restart auto-negotiation is intrinsically
symmetrical (the MAC PCS can do it too).
However, it appears that not all SGMII/USXGMII PHYs have logic to
restart the MDI-side auto-negotiation process when they detect a
transition of the SGMII link from data mode to configuration mode.
Some do (VSC8234) and some don't (AR8033, MV88E1111). IEEE and/or Cisco
specification wordings do not help to prove whether propagating the "AN
restart" event from MII side ("mr_restart_an") to MDI side
("mr_restart_negotiation") is required behavior - neither of them
specifies any mandatory interaction between the clause 37 AN state
machine from Figure 37-6 and the clause 28 AN state machine from Figure
28-18.
Therefore, even if a certain behavior could be proven as being required,
real-life SGMII/USXGMII PHYs are inconsistent enough that a clause 37 AN
restart cannot be used by phylink to reliably trigger a media-side
renegotiation, when the user requests it via ethtool.
The only remaining use that the .mac_an_restart callback might possibly
have, given what we know now, is to implement some silicon quirks, but
so far that has proven to not be necessary.
So remove this code for now, since it never gets called and we don't
foresee any circumstance in which it might be, either.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Sun, 5 Jul 2020 16:16:24 +0000 (19:16 +0300)]
net: dsa: felix: set proper pause frame timers based on link speed
state->speed holds a value of 10, 100, 1000 or 2500, but
SYS_MAC_FC_CFG_FC_LINK_SPEED expects a value in the range 0, 1, 2 or 3.
So set the correct speed encoding into this register.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Sun, 5 Jul 2020 16:16:23 +0000 (19:16 +0300)]
net: dsa: felix: unconditionally configure MAC speed to 1000Mbps
In VSC9959, the PCS is the one who performs rate adaptation (symbol
duplication) to the speed negotiated by the PHY. The MAC is unaware of
that and must remain configured for gigabit. If it is configured at
OCELOT_SPEED_10 or OCELOT_SPEED_100, it'll start transmitting PAUSE
frames out of control and never recover, _even if_ we then reconfigure
it at OCELOT_SPEED_1000 afterwards.
This patch fixes a bug that luckily did not have any functional impact.
We were writing 10, 100, 1000 etc into this 2-bit field in
DEV_CLOCK_CFG, but the hardware expects values in the range 0, 1, 2, 3.
So all speed values were getting truncated to 0, which is
OCELOT_SPEED_2500, and which also appears to be fine.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Sun, 5 Jul 2020 16:16:22 +0000 (19:16 +0300)]
net: dsa: felix: support half-duplex link modes
Ping tested:
[ 11.808455] mscc_felix 0000:00:00.5 swp0: Link is Up - 1Gbps/Full - flow control rx/tx
[ 11.816497] IPv6: ADDRCONF(NETDEV_CHANGE): swp0: link becomes ready
[root@LS1028ARDB ~] # ethtool -s swp0 advertise 0x4
[ 18.844591] mscc_felix 0000:00:00.5 swp0: Link is Down
[ 22.048337] mscc_felix 0000:00:00.5 swp0: Link is Up - 100Mbps/Half - flow control off
[root@LS1028ARDB ~] # ip addr add 192.168.1.1/24 dev swp0
[root@LS1028ARDB ~] # ethtool -s swp0 advertise 0x10
[ 355.637747] mscc_felix 0000:00:00.5 swp0: Link is Down
[ 358.788034] mscc_felix 0000:00:00.5 swp0: Link is Up - 1Gbps/Half - flow control off
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Sun, 5 Jul 2020 16:16:21 +0000 (19:16 +0300)]
net: dsa: felix: clarify the intention of writes to MII_BMCR
The driver appears to write to BMCR_SPEED and BMCR_DUPLEX, fields which
are read-only, since they are actually configured through the
vendor-specific IF_MODE (0x14) register.
But the reason we're writing back the read-only values of MII_BMCR is to
alter these writable fields:
In particular, the only field which is really relevant to this driver is
BMCR_ANENABLE. Clarify that intention by spelling it out, using
phy_set_bits and phy_clear_bits.
The driver also made a few writes to BMCR_RESET and BMCR_ANRESTART which
are unnecessary and may temporarily disrupt the link to the PHY. Remove
them.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
qlogic: use generic power management
Linux Kernel Mentee: Remove Legacy Power Management.
The purpose of this patch series is to remove legacy power management callbacks
from qlogic ethernet drivers.
The callbacks performing suspend() and resume() operations are still calling
pci_save_state(), pci_set_power_state(), etc. and handling the power management
themselves, which is not recommended.
The conversion requires removing those function calls, changing the
callback definitions accordingly and making use of the dev_pm_ops structure.
All patches are compile-tested only.
V2: Fix unused variable warning in v1.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states. And they use PCI
helper functions to do it.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
.suspend() calls __qlcnic_shutdown, which then calls qlcnic_82xx_shutdown;
.resume() calls __qlcnic_resume, which then calls qlcnic_82xx_resume;
Both ...82xx..() are defined in
drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c and are used only in
drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c.
Hence upgrade them and remove the PCI function calls, like pci_save_state()
and pci_enable_wake(), inside them.
Compile-tested only.
Signed-off-by: Vaibhav Gupta <vaibhavgupta40@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
With legacy PM, drivers themselves were responsible for managing the
device's power states and taking care of register states. They used PCI
helper functions to do it.
After upgrading to the generic structure, PCI core will take care of
required tasks and drivers should do only device-specific operations.
In this driver:
netxen_nic_resume() calls netxen_nic_attach_func() which then invokes PCI
helper functions like pci_enable_device(), pci_set_power_state() and
pci_restore_state(). One other function, netxen_io_slot_reset(), also calls
netxen_nic_attach_func().
Also, netxen_io_slot_reset() returns a specific value based on the return
value of netxen_nic_attach_func() as a whole. Thus, we cannot simply move a
piece of code from netxen_nic_attach_func() into it.
Hence, define a new function netxen_nic_attach_late_func() to do the tasks
which have to be done after the PCI helper functions have done their job.
net: dsa: microchip: remove unused private members
Private structure members live_ports, on_ports, rx_ports, tx_ports are
initialized but not used anywhere. Let's remove them.
Suggested-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Codrin Ciubotariu <codrin.ciubotariu@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: microchip: split adjust_link() in phylink_mac_link_{up|down}()
The DSA subsystem moved to phylink and adjust_link() became deprecated in
the process. This patch removes adjust_link from the KSZ DSA switches and
adds phylink_mac_link_up() and phylink_mac_link_down().
Signed-off-by: Codrin Ciubotariu <codrin.ciubotariu@microchip.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
mptcp: add REUSEADDR/REUSEPORT/V6ONLY setsockopt support
Restarting an mptcp-patched sshd yields the following error:
sshd: error: Bind to port 22 on 0.0.0.0 failed: Address already in use.
sshd: error: setsockopt IPV6_V6ONLY: Operation not supported
sshd: error: Bind to port 22 on :: failed: Address already in use.
sshd: fatal: Cannot bind any address.
This series adds support for the needed setsockopts:
First patch skips the generic SOL_SOCKET handler for MPTCP:
in mptcp case, the setsockopt needs to alter the tcp socket, not the mptcp
parent socket.
Second patch adds minimal SOL_SOCKET support: REUSEPORT and REUSEADDR.
Rest is still handled by the generic SOL_SOCKET code.
Last patch adds IPV6ONLY support. This makes ipv6 work for openssh:
It creates two listening sockets; before this patch, binding the ipv6
socket would fail because the port was already bound by the ipv4 one.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Without this, Opensshd fails to open an ipv6 listening
socket:
error: setsockopt IPV6_V6ONLY: Operation not supported
error: Bind to port 22 on :: failed: Address already in use.
Opensshd opens an ipv4 and an ipv6 listening socket, but because the
IPV6_V6ONLY setsockopt fails, the port number is already in use.
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
This will e.g. make 'sshd restart' work when MPTCP is used, as we will
now set this option on the listener socket instead of only the mptcp
socket (where it has no effect).
We still need to copy the setting to the master socket so that a
subsequent getsockopt() returns the expected value.
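A minimal userspace sketch of the set-then-read-back behaviour an application relies on. It uses a plain TCP socket so that it also runs on kernels without MPTCP; on an MPTCP-enabled kernel the socket would instead be created with IPPROTO_MPTCP (an assumption about the caller, not something this commit shows). The helper names are hypothetical.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Enable SO_REUSEADDR on 'fd'; returns 0 on success, -1 on error. */
static int enable_reuseaddr(int fd)
{
	int one = 1;

	assert(fd >= 0);
	return setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
}

/* Read SO_REUSEADDR back; returns 1 if set, 0 if clear, -1 on error. */
static int reuseaddr_enabled(int fd)
{
	int val = 0;
	socklen_t len = sizeof(val);

	assert(fd >= 0);
	if (getsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &val, &len) < 0)
		return -1;
	return val != 0;
}
```

A server such as sshd would call enable_reuseaddr() before bind(); reading the value back afterwards is the getsockopt() case the commit keeps working by copying the setting to the master socket.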
Reported-by: Christoph Paasch <cpaasch@apple.com> Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
net: use mptcp setsockopt function for SOL_SOCKET on mptcp sockets
setsockopt(mptcp_fd, SOL_SOCKET, ...)... appears to work (returns 0),
but it has no effect -- this is because the MPTCP layer never has a
chance to copy the settings to the subflow socket.
Skip the generic handling for the mptcp case and call the
mptcp specific handler for SOL_SOCKET too.
Next patch adds more specific handling for SOL_SOCKET to mptcp.
Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Tanner Love [Sat, 4 Jul 2020 20:45:14 +0000 (16:45 -0400)]
selftests/net: update initializer syntax to use c99 designators
Before, clang version 9 threw errors such as: error:
use of GNU old-style field designator extension [-Werror,-Wgnu-designator]
{ tstamp: true, swtstamp: true }
^~~~~~~
.tstamp =
Fix these warnings in tools/testing/selftests/net in the same manner as
commit 121e357ac728 ("selftests/harness: Update named initializer syntax").
N.B. rxtimestamp.c is the only affected file in the directory.
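For illustration, a minimal sketch of the two initializer forms. The struct name `ts_options`, the field `other`, and the helper are hypothetical; only the field names tstamp and swtstamp come from the error message quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical struct; 'other' exists only to show an omitted field. */
struct ts_options {
	bool tstamp;
	bool swtstamp;
	bool other;
};

/*
 * GNU old-style form, rejected by clang with -Werror,-Wgnu-designator:
 *	{ tstamp: true, swtstamp: true }
 * C99 designated form, accepted by both gcc and clang:
 */
static const struct ts_options sw_only = { .tstamp = true, .swtstamp = true };

/* Count the enabled flags; fields left out of the initializer are false. */
static int ts_options_count(const struct ts_options *o)
{
	assert(o != NULL);
	return o->tstamp + o->swtstamp + o->other;
}
```

Both forms initialize the same fields; fields that are not named are zero-initialized in either case.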
Signed-off-by: Tanner Love <tannerlove@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 5 Jul 2020 00:51:07 +0000 (17:51 -0700)]
Merge branch 'bnx2x-Perform-IdleChk-dump'
Sudarsana Reddy Kalluru says:
====================
bnx2x: Perform IdleChk dump.
Idlechk test verifies that the chip is in idle state. If there are any
errors, Idlechk dump would capture the same. This data will help in
debugging the device related issues.
The patch series adds driver support for dumping IdleChk data during the
debug dump collection.
Patch (1) adds register definitions required in this implementation.
Patch (2) adds the implementation for Idlechk tests.
Patch (3) adds driver changes to invoke Idlechk implementation.
Changes from previous version:
-------------------------------
v3: Combined the test data creation and implementation to a single patch.
v2: Addressed issues reported by kernel test robot.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
bnx2x: Perform Idlechk dump during the debug collection.
The patch adds driver changes to perform Idlechk dump during the debug
data collection.
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The patch adds register definitions required for Idlechk implementation.
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The following pull-request contains BPF updates for your *net-next* tree.
We've added 73 non-merge commits during the last 17 day(s) which contain
a total of 106 files changed, 5233 insertions(+), 1283 deletions(-).
The main changes are:
1) bpftool ability to show PIDs of processes having open file descriptors
for BPF map/program/link/BTF objects, relying on BPF iterator progs
to extract this info efficiently, from Andrii Nakryiko.
2) Addition of BPF iterator progs for dumping TCP and UDP sockets to
seq_files, from Yonghong Song.
3) Support access to BPF map fields in struct bpf_map from programs
through BTF struct access, from Andrey Ignatov.
4) Add a bpf_get_task_stack() helper to be able to dump /proc/*/stack
via seq_file from BPF iterator progs, from Song Liu.
5) Make SO_KEEPALIVE and related options available to bpf_setsockopt()
helper, from Dmitry Yakunin.
6) Optimize BPF sk_storage selection of its caching index, from Martin
KaFai Lau.
7) Removal of redundant synchronize_rcu()s from BPF map destruction which
has been a historic leftover, from Alexei Starovoitov.
8) Several improvements to test_progs to make it easier to create a shell
loop that invokes each test individually which is useful for some CIs,
from Jesper Dangaard Brouer.
9) Fix bpftool prog dump segfault when compiled without skeleton code on
older clang versions, from John Fastabend.
10) Bunch of cleanups and minor improvements, from various others.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the transmit part of XDP support, which includes:
- support for XDP_TX in mvpp2_xdp()
- .ndo_xdp_xmit hook for AF_XDP and XDP_REDIRECT with mvpp2 as destination
mvpp2_xdp_submit_frame() is a generic function which is called by
mvpp2_xdp_xmit_back() when doing XDP_TX, and by mvpp2_xdp_xmit when
doing AF_XDP or XDP_REDIRECT target.
The buffer allocation has been reworked to be able to map the buffers
as DMA_FROM_DEVICE or DMA_BIDIRECTIONAL depending on whether native XDP
is in use or not.
Co-developed-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Add XDP native support.
For now, only the XDP_DROP, XDP_PASS and XDP_REDIRECT
verdicts are supported.
Co-developed-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Use the page_pool API for memory management.
This is a prerequisite for native XDP support.
Tested-by: Sven Auhagen <sven.auhagen@voleatech.de> Signed-off-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
In mvpp2_swf_bm_pool_init_percpu(), a reference to a struct
mvpp2_bm_pool is obtained by traversing multiple structs, even though a
local variable already points to the same object.
Fix it and, while at it, give the variable a meaningful name.
Signed-off-by: Matteo Croce <mcroce@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
selftests/net: add ipv6 test coverage in rxtimestamp test
Add the options --ipv4, --ipv6 to specify running over ipv4 and/or
ipv6. If neither is specified, then run both.
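The "neither means both" rule can be sketched with getopt_long() as follows. This is a minimal illustration, not the test's actual option parser; the `families` struct and `parse_families()` are hypothetical names.

```c
#include <assert.h>
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type: which address families to run. */
struct families {
	bool ipv4;
	bool ipv6;
};

/* Parse --ipv4 / --ipv6; returns 0 on success, -1 on an unknown option. */
static int parse_families(int argc, char **argv, struct families *out)
{
	static const struct option opts[] = {
		{ .name = "ipv4", .has_arg = no_argument, .flag = NULL, .val = '4' },
		{ .name = "ipv6", .has_arg = no_argument, .flag = NULL, .val = '6' },
		{ NULL, 0, NULL, 0 }
	};
	int c;

	assert(argv != NULL && out != NULL);
	out->ipv4 = false;
	out->ipv6 = false;
	optind = 1;	/* restart scanning at the first argument */

	while ((c = getopt_long(argc, argv, "", opts, NULL)) != -1) {
		switch (c) {
		case '4':
			out->ipv4 = true;
			break;
		case '6':
			out->ipv6 = true;
			break;
		default:
			return -1;
		}
	}

	/* If neither family was requested, run both. */
	if (!out->ipv4 && !out->ipv6) {
		out->ipv4 = true;
		out->ipv6 = true;
	}
	return 0;
}
```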
Signed-off-by: Tanner Love <tannerlove@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
net: ipa: fix HOLB timer register use
The function ipa_reg_init_hol_block_timer_val() generates the value
to write into the HOL_BLOCK_TIMER endpoint configuration register,
to represent a given timeout value (in microseconds). It only
supports a timer value of 0 though, in part because that's
sufficient, but mainly because there was some confusion about
how the register is formatted in newer hardware.
I got clarification about the register format, so this series fixes
ipa_reg_init_hol_block_timer_val() to work for any supported delay
value.
The delay is based on the IPA core clock, so determining the value
to write for a given period requires access to the current core
clock rate. So the first patch just creates a new function to
provide that.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Fri, 3 Jul 2020 21:23:35 +0000 (16:23 -0500)]
net: ipa: fix HOLB timer calculation
For IPA v4.2, the exact interpretation of the register that defines
the timeout for avoiding head-of-line blocking was a little unclear.
We're only assigning a 0 timeout to it right now, so that wasn't
very important. But now that I know how it's supposed to work, I'm
fixing it.
The register represents a tick counter, where each tick is equal to
128 IPA core clock cycles. For IPA v3.5.1, the register contains
a simple counter value. But for IPA v4.2, the register contains two
fields, base and scale, which approximate the tick counter as:
ticks = base << scale
The base and scale values to use for a given tick count are computed
using clever bit operations, and measures are taken to make the
resulting time period as close as possible to that requested.
There's no need for ipa_endpoint_init_hol_block_timer() to return
an error, so change its return type to void.
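A minimal sketch of the encoding described above, under stated assumptions: the 5-bit widths of the base and scale fields are hypothetical (the real widths are hardware-specific), and the rounding here is a plain round-to-nearest rather than the driver's exact bit operations. The function and type names are also hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field widths; the real widths are hardware-specific. */
#define BASE_BITS	5u
#define SCALE_BITS	5u
#define BASE_MAX	((1u << BASE_BITS) - 1u)
#define SCALE_MAX	((1u << SCALE_BITS) - 1u)

struct holb_timer {
	uint32_t base;
	uint32_t scale;
};

/* Convert microseconds to ticks; one tick is 128 core clock cycles. */
static uint64_t holb_ticks(uint32_t microseconds, uint32_t core_clock_hz)
{
	return (uint64_t)microseconds * core_clock_hz / (128u * 1000000ull);
}

/* Pick base and scale so that (base << scale) is close to 'ticks'. */
static struct holb_timer holb_encode(uint64_t ticks)
{
	struct holb_timer t = { .base = 0, .scale = 0 };
	uint64_t base = ticks;
	uint32_t scale = 0;

	assert(ticks <= ((uint64_t)BASE_MAX << SCALE_MAX));

	/* Find the smallest scale for which the truncated base fits. */
	while ((ticks >> scale) > BASE_MAX)
		scale++;

	if (scale > 0) {
		/* Round to nearest instead of truncating. */
		base = (ticks + (1ull << (scale - 1))) >> scale;
		if (base > BASE_MAX) {
			/* Rounding overflowed the base field; halve it. */
			base >>= 1;
			scale++;
		}
	}

	t.base = (uint32_t)base;
	t.scale = scale;
	return t;
}
```

With these assumed widths, a request for 63 ticks is encoded as base 16 and scale 2, that is 64 ticks, which is closer to the request than the truncated 31 << 1 = 62.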
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Song Liu [Fri, 3 Jul 2020 18:17:19 +0000 (11:17 -0700)]
selftests/bpf: Fix compilation error of bpf_iter_task_stack.c
BPF selftests show a compilation error as follows:
libbpf: invalid relo for 'entries' in special section 0xfff2; forgot to
initialize global var?..
Fix it by initializing 'entries' to zeros.
Fixes: c7568114bc56 ("selftests/bpf: Add bpf_iter test with bpf_get_task_stack()") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200703181719.3747072-1-songliubraving@fb.com
John Fastabend [Fri, 3 Jul 2020 04:31:59 +0000 (21:31 -0700)]
bpf: Fix bpftool without skeleton code enabled
Fix segfault from bpftool by adding emit_obj_refs_plain when skeleton
code is disabled.
Tested by deleting BUILD_BPF_SKELS in Makefile. We found this doing
backports for Cilium when a testing image pulled in latest bpf-next
bpftool, but kept using an older clang-7.
# ./bpftool prog show
Error: bpftool built without PID iterator support
3: cgroup_skb tag 7be49e3934a125ba gpl
loaded_at 2020-07-01T08:01:29-0700 uid 0
Segmentation fault
Fixes: d53dee3fe013 ("tools/bpftool: Show info for processes holding BPF map/prog/link/btf FDs") Reported-by: Joe Stringer <joe@wand.net.nz> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/159375071997.14984.17404504293832961401.stgit@john-XPS-13-9370
net: bcmgenet: Allow changing carrier from user-space
The GENET driver interfaces with internal MoCA interface as well as
external MoCA chips like the BCM6802/6803 through a fixed link
interface. It is desirable for the mocad user-space daemon to be able to
control the carrier state based upon out of band messages that it
receives from the MoCA chip.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 3 Jul 2020 19:33:16 +0000 (12:33 -0700)]
Merge tag 'mlx5-updates-2020-07-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2020-07-02
Rx and Tx devlink health reporters enhancements.
1) Code cleanup
2) devlink output format improvements
3) Print more useful info on devlink health diagnose output
4) TX timeout recovery, on a single SQ recover failure, stop the loop
and reset all rings (re-open netdev).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Aya Levin [Mon, 18 May 2020 09:31:38 +0000 (12:31 +0300)]
net/mlx5e: Enhance TX timeout recovery
When handling a TX timeout, if the TX reporter is not able to recover
from the error, reopen the channels. Once a reopen of the channels has been
attempted, do not keep looping over the TX queues looking for timeouts.
With that, the reporters' state and separation will better
expose the driver's state.
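The control flow described above can be modelled as a minimal sketch. The `mock_sq` struct and the function are hypothetical userspace stand-ins, not the driver's data structures: the loop stops at the first queue whose recovery fails and tells the caller to reopen the channels.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a send queue (SQ); not the driver's structure. */
struct mock_sq {
	bool timed_out;		/* queue hit a TX timeout */
	bool recoverable;	/* per-queue recovery would succeed */
};

/*
 * Walk the queues and recover each timed-out one. On the first failed
 * recovery, stop the loop and report that the channels must be reopened.
 * '*visited' receives the number of queues examined.
 */
static bool tx_timeout_needs_reopen(const struct mock_sq *sqs, size_t n,
				    size_t *visited)
{
	size_t i;

	assert(sqs != NULL && visited != NULL);

	for (i = 0; i < n; i++) {
		if (!sqs[i].timed_out)
			continue;
		if (!sqs[i].recoverable) {
			*visited = i + 1;
			return true;	/* caller reopens the channels */
		}
	}

	*visited = n;
	return false;
}
```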
Signed-off-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>