Jiri Olsa [Tue, 14 Jul 2020 10:25:34 +0000 (12:25 +0200)]
bpf: Fix cross build for CONFIG_DEBUG_INFO_BTF option
Stephen and 0-DAY CI Kernel Test Service reported broken cross build
for arm (arm-linux-gnueabi-gcc (GCC) 9.3.0), with following output:
/tmp/ccMS5uth.s: Assembler messages:
/tmp/ccMS5uth.s:69: Error: unrecognized symbol type ""
/tmp/ccMS5uth.s:82: Error: unrecognized symbol type ""
Having '@object' for .type diretive is wrong because '@' is comment
character for some architectures. Using STT_OBJECT instead that should
work everywhere.
Also using HOST* variables to build resolve_btfids so it's properly
build in crossbuilds (stolen from objtool's Makefile).
Reported-by: kernel test robot <lkp@intel.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/bpf/20200714102534.299280-2-jolsa@kernel.org
Jiri Olsa [Tue, 14 Jul 2020 10:25:33 +0000 (12:25 +0200)]
bpf: Fix build for disabled CONFIG_DEBUG_INFO_BTF option
Stephen reported following linker warnings on powerpc build:
ld: warning: orphan section `.BTF_ids' from `kernel/trace/bpf_trace.o' being placed in section `.BTF_ids'
ld: warning: orphan section `.BTF_ids' from `kernel/bpf/btf.o' being placed in section `.BTF_ids'
ld: warning: orphan section `.BTF_ids' from `kernel/bpf/stackmap.o' being placed in section `.BTF_ids'
ld: warning: orphan section `.BTF_ids' from `net/core/filter.o' being placed in section `.BTF_ids'
ld: warning: orphan section `.BTF_ids' from `kernel/trace/bpf_trace.o' being placed in section `.BTF_ids'
It's because we generated .BTF_ids section even when
CONFIG_DEBUG_INFO_BTF is not enabled. Fixing this by
generating empty btf_id arrays for this case.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/bpf/20200714102534.299280-1-jolsa@kernel.org
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: David S. Miller <davem@davemloft.net>
The wrappers in include/linux/pci-dma-compat.h should go away.
The patch has been generated with the coccinelle script below and has been
hand modified to replace GPF_ with a correct flag.
It has been compile tested.
When memory is allocated in 'pcnet32_realloc_tx_ring()' and
'pcnet32_realloc_rx_ring()', GFP_ATOMIC must be used because a spin_lock is
hold.
The call chain is:
pcnet32_set_ringparam
** spin_lock_irqsave(&lp->lock, flags);
--> pcnet32_realloc_tx_ring
--> pcnet32_realloc_rx_ring
** spin_unlock_irqrestore(&lp->lock, flags);
When memory is in 'pcnet32_probe1()' and 'pcnet32_alloc_ring()', GFP_KERNEL
can be used.
While at it, update a few comments and pr_err messages to be more in line
with the new function names.
The wrappers in include/linux/pci-dma-compat.h should go away.
The patch has been generated with the coccinelle script below and has been
hand modified to replace GPF_ with a correct flag.
It has been compile tested.
When memory is allocated in 'amd8111e_init_ring()', GFP_ATOMIC must be used
because a spin_lock is hold.
One of the call chains is:
amd8111e_open
** spin_lock_irq(&lp->lock);
--> amd8111e_restart
--> amd8111e_init_ring
** spin_unlock_irq(&lp->lock);
The rest of the patch is produced by coccinelle with a few adjustments to
please checkpatch.pl.
net: wan: cosa: Replace HTTP links with HTTPS ones
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
New DSA driver for VSC9953 Seville switch
Looking at the Felix and Ocelot drivers, Maxim asked if it would be
possible to use them as a base for a new driver for the Seville switch
inside NXP T1040. Turns out, it is! The result is that the mscc_felix
driver was extended to probe on Seville.
The biggest challenge seems to be getting register read/write API
generic enough to cover such wild bitfield variations between hardware
generations.
Currently, both felix and seville are built under the same kernel config
option (NET_DSA_MSCC_FELIX). This has both some advantages (no need to
duplicate the Lynx PCS code from felix_vsc9959.c) and some disadvantages
(Seville needs to depend on PCI and on ENETC_MDIO). This will be further
refined as time progresses.
The driver has been completely reviewed. Previous submission was here,
it wasn't accepted due to a conflict with Mark Brown's tree, very late
in the release cycle:
Vladimir Oltean [Mon, 13 Jul 2020 16:57:11 +0000 (19:57 +0300)]
docs: devicetree: add bindings for Seville DSA switch inside Felix driver
There are no non-standard bindings being used. However Felix is a PCI
device and Seville is a platform device. So give an example of device
tree for this switch and document its compatible string.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: felix: introduce support for Seville VSC9953 switch
This is another switch from Vitesse / Microsemi / Microchip, that has
10 ports (8 external, 2 internal) and is integrated into the Freescale /
NXP T1040 PowerPC SoC. It is very similar to Felix from NXP LS1028A,
except that this is a platform device and Felix is a PCI device, and it
doesn't support IEEE 1588 and TSN.
Like Felix, this driver configures its own PCS on the internal MDIO bus
using a phy_device abstraction for it (yes, it will be refactored to use
a raw mdio_device, like other phylink drivers do, but let's keep it like
that for now). But unlike Felix, the MDIO bus and the PCS are not from
the same vendor. The PCS is the same QorIQ/Layerscape PCS as found in
Felix/ENETC/DPAA*, but the internal MDIO bus that is used to access it
is actually an instantiation of drivers/net/phy/mdio-mscc-miim.c. But it
would be difficult to reuse that driver (it doesn't even use regmap, and
it's less than 200 lines of code), so we hand-roll here some internal
MDIO bus accessors within seville_vsc9953.c, which serves the purpose of
driving the PCS absolutely fine.
Also, same as Felix, the PCS doesn't support dynamic reconfiguration of
SerDes protocol, so we need to do pre-validation of PHY mode from device
tree and not let phylink change it.
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Mon, 13 Jul 2020 16:57:09 +0000 (19:57 +0300)]
net: dsa: felix: move probing to felix_vsc9959.c
Felix is not actually meant to be a DSA driver only for the switch
inside NXP LS1028A, but an umbrella for all Vitesse / Microsemi /
Microchip switches that are register-compatible with Ocelot and that are
using in DSA mode (with an NPI Ethernet port).
For the dsa_switch_ops exported by the felix driver to be generic enough
to be used by other non-PCI switches, we need to move the PCI-specific
probing to the low-level translation module felix_vsc9959.c. This way,
other switches can have their own probing functions, as platform devices
or otherwise.
This patch also removes the "Felix instance table", which did not stand
the test of time and is unnecessary at this point.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: mscc: ocelot: extend watermark encoding function
The ocelot_wm_encode function deals with setting thresholds for pause
frame start and stop. In Ocelot and Felix the register layout is the
same, but for Seville, it isn't. The easiest way to accommodate Seville
hardware configuration is to introduce a function pointer for setting
this up.
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: mscc: ocelot: convert SYS_PAUSE_CFG register access to regfield
Seville has a different bitwise layout than Ocelot and Felix.
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Mon, 13 Jul 2020 16:57:06 +0000 (19:57 +0300)]
net: mscc: ocelot: disable flow control on NPI interface
The Ocelot switches do not support flow control on Ethernet interfaces
where a DSA tag must be added. If pause frames are enabled, they will be
encapsulated in the DSA tag just like regular frames, and the DSA master
will not recognize them.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Mon, 13 Jul 2020 16:57:05 +0000 (19:57 +0300)]
net: mscc: ocelot: split writes to pause frame enable bit and to thresholds
We don't want ocelot_port_set_maxlen to enable pause frame TX, just to
adjust the pause thresholds.
Move the unconditional enabling of pause TX to ocelot_init_port. There
is no good place to put such setting because it shouldn't be
unconditional. But at the moment it is, we're not changing that.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Mon, 13 Jul 2020 16:57:04 +0000 (19:57 +0300)]
net: dsa: felix: create a template for the DSA tags on xmit
With this patch we try to kill 2 birds with 1 stone.
First of all, some switches that use tag_ocelot.c don't have the exact
same bitfield layout for the DSA tags. The destination ports field is
different for Seville VSC9953 for example. So the choices are to either
duplicate tag_ocelot.c into a new tag_seville.c (sub-optimal) or somehow
take into account a supposed ocelot->dest_ports_offset when packing this
field into the DSA injection header (again not ideal).
Secondly, tag_ocelot.c already needs to memset a 128-bit area to zero
and call some packing() functions of dubious performance in the
fastpath. And most of the values it needs to pack are pretty much
constant (BYPASS=1, SRC_PORT=CPU, DEST=port index). So it would be good
if we could improve that.
The proposed solution is to allocate a memory area per port at probe
time, initialize that with the statically defined bits as per chip
hardware revision, and just perform a simpler memcpy in the fastpath.
Other alternatives have been analyzed, such as:
- Create a separate tag_seville.c: too much code duplication for just 1
bit field difference.
- Create a separate DSA_TAG_PROTO_SEVILLE under tag_ocelot.c, just like
tag_brcm.c, which would have a separate .xmit function. Again, too
much code duplication for just 1 bit field difference.
- Allocate the template from the init function of the tag_ocelot.c
module, instead of from the driver: couldn't figure out a method of
accessing the correct port template corresponding to the correct
tagger in the .xmit function.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Mon, 13 Jul 2020 16:57:03 +0000 (19:57 +0300)]
net: mscc: ocelot: convert QSYS_SWITCH_PORT_MODE and SYS_PORT_MODE to regfields
Currently Felix and Ocelot share the same bit layout in these per-port
registers, but Seville does not. So we need reg_fields for that.
Actually since these are per-port registers, we need to also specify the
number of ports, and register size per port, and use the regmap API for
multiple ports.
There's a more subtle point to be made about the other 2 register
fields:
- QSYS_SWITCH_PORT_MODE_SCH_NEXT_CFG
- QSYS_SWITCH_PORT_MODE_INGRESS_DROP_MODE
which we are not writing any longer, for 2 reasons:
- Using the previous API (ocelot_write_rix), we were only writing 1 for
Felix and Ocelot, which was their hardware-default value, and which
there wasn't any intention in changing.
- In the case of SCH_NEXT_CFG, in fact Seville does not have this
register field at all, and therefore, if we want to have common code
we would be required to not write to it.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Add the register definitions for the MSCC MIIM MDIO controller in
preparation for seville_vsc9959.c to create its accessors for the
internal MDIO bus.
Since we've introduced elements to ocelot_regfields that are not
instantiated by felix and ocelot, we need to define the size of the
regfields arrays explicitly, otherwise ocelot_regfields_init, which
iterates up to REGFIELD_MAX, will fault on the undefined regfield
entries (if we're lucky).
Signed-off-by: Maxim Kochetkov <fido_max@inbox.ru> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vladimir Oltean [Mon, 13 Jul 2020 16:57:01 +0000 (19:57 +0300)]
net: mscc: ocelot: convert port registers to regmap
At the moment, there are some minimal register differences between
VSC7514 Ocelot and VSC9959 Felix. To be precise, the PCS1G registers are
missing from Felix because it was integrated with an NXP PCS.
But with VSC9953 Seville (not yet introduced), the register differences
are more pronounced. The MAC registers are located at different offsets
within the DEV_GMII target. So we need to refactor the driver to keep a
regmap even for per-port registers. The callers of the ocelot_port_readl
and ocelot_port_writel were kept unchanged, only the implementation is
now more generic.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
devlink: Fix use-after-free when destroying health reporters
Dereferencing the reporter after it was destroyed in order to unlock the
reporters lock results in a use-after-free [1].
Fix this by storing a pointer to the lock in a local variable before
destroying the reporter.
[1]
==================================================================
BUG: KASAN: use-after-free in devlink_health_reporter_destroy+0x15c/0x1b0 net/core/devlink.c:5476
Read of size 8 at addr ffff8880650fd020 by task syz-executor.1/904
Memory state around the buggy address: ffff8880650fcf00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8880650fcf80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff8880650fd000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff8880650fd080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8880650fd100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Fixes: 3c5584bf0a04 ("devlink: Rework devlink health reporter destructor") Fixes: 15c724b997a8 ("devlink: Add devlink health port reporters API") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
PHYLIB is not selected by mdio-mscc-miim but it uses mdio devres helpers.
Explicitly select MDIO_DEVRES in this driver's Kconfig entry.
Reported-by: kernel test robot <lkp@intel.com> Fixes: 1814cff26739 ("net: phy: add a Kconfig option for mdio_devres") Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: David S. Miller <davem@davemloft.net>
net/core/dev.c:5594:1: warning:
symbol '__pcpu_scope_flush_works' was not declared. Should it be static?
'flush_works' is not used outside of dev.c, so marks
it static.
Fixes: 41852497a9205 ("net: batch calls to flush_all_backlogs()") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
mlxsw: Add support for buffer drops mirroring
This set offloads the recently introduced qevent infrastructure in TC and
allows mlxsw to support mirroring of packets that were dropped due to
buffer related reasons (e.g., early drops) during forwarding.
Up until now mlxsw only supported mirroring that was either triggered by
per-port triggers (i.e., via matchall) or by the policy engine (i.e.,
via flower). Packets that are dropped due to buffer related reasons are
mirrored using a third type of trigger, a global trigger.
Global triggers are bound once to a mirroring (SPAN) agent and enabled
on a per-{port, TC} basis. This allows users, for example, to request
that only packets that were early dropped on a specific netdev to be
mirrored.
Patch set overview:
Patch #1 extends flow_block_offload and indirect offload structure to pass
a scheduler instead of a netdevice. That is necessary, because binding type
and netdevice are not a unique identifier of the block anymore.
Patches #2-#3 add the required registers to support above mentioned
functionality.
Patches #4-#6 gradually add support for global mirroring triggers.
Patch #7 adds support for enablement of global mirroring triggers.
Patches #8-#11 are cleanups in the flow offload code and shuffle some
code around to make the qevent offload easier.
Patch #12 implements offload of RED early_drop qevent.
Patch #13 extends the RED selftest for offloaded datapath to cover
early_drop qevent.
v2:
- Patch #1:
- In struct flow_block_indr, track both sch and dev.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Fri, 10 Jul 2020 21:55:14 +0000 (00:55 +0300)]
mlxsw: spectrum_qdisc: Offload mirroring on RED qevent early_drop
The RED qevents early_drop and mark can be offloaded under the following
fairly strict conditions:
- At most one filter is configured at the qevent block
- The protocol is "any"
- The classifier is matchall
- The action is trap, sample, or mirror with the same conditions as
with other SPAN offloads
- The hw_counters type is none
In this patchset, implement offload of mirror for early_drop qevent.
The ECN trigger is currently not implemented in the FW and therefore
the mark qevent is not supported.
The qevent notifications look exactly like regular block binding
notifications with a binder type that identifies them as qevents.
Therefore the details of processing this binding are fairly similar
to the matchall offload.
struct flow_block_offload.sch points at the qdisc in question. Use it to
figure out if the qdisc is offloaded at all and what TC it configures.
Bounce bindings on not-offloaded qdiscs.
Individual bindings are kept in a list so that several qevents can share
the same block and all binding points get configured as the configured
filters change.
Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Fri, 10 Jul 2020 21:55:13 +0000 (00:55 +0300)]
mlxsw: spectrum_flow: Promote binder-type dispatch to spectrum.c
Two RED qevents have been introduced recently. From the point of view of a
driver, qevents are simply blocks with unusual binder types. However they
need to be handled by different logic than ACL-like flows.
Thus rename mlxsw_sp_setup_tc_block() to mlxsw_sp_setup_tc_block_clsact()
and move the binder-type dispatch from there to spectrum.c into a new
function of the original name. The new dispatcher is easier to extend with
new binder types.
Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Fri, 10 Jul 2020 21:55:12 +0000 (00:55 +0300)]
mlxsw: spectrum_matchall: Publish matchall data structures
A following patch introduces offloading of filters attached to blocks bound
to the RED tail_drop qevent. The only classifier that mlxsw will permit in
this role is matchall. mlxsw currently offloads matchall filters used with
clsact qdisc. The data structures used for that offload will come handy for
the qevent offload as well. Publish them in spectrum.h.
Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_span: Add APIs to enable / disable global mirroring triggers
While the binding of global mirroring triggers to a SPAN agent is
global, packets are only mirrored if they belong to a port and TC on
which the trigger was enabled. This allows, for example, to mirror
packets that were tail-dropped on a specific netdev.
Implement the operations that allow to enable / disable a global
mirroring trigger on a specific port and TC.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_span: Add support for global mirroring triggers
Global mirroring triggers are triggers that are only keyed by their
trigger, as opposed to per-port triggers, which are keyed by their
trigger and port.
Such triggers allow mirroring packets that were tail/early dropped or
ECN marked to a SPAN agent.
Implement the previously added trigger operations for these global
triggers. Since such triggers are only supported from Spectrum-2
onwards, have the Spectrum-1 operations return an error.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_span: Prepare for global mirroring triggers
Currently, a SPAN agent can only be bound to a per-port trigger where
the trigger is either an incoming packet (INGRESS) or an outgoing packet
(EGRESS) to / from the port.
The subsequent patch will introduce the concept of global mirroring
triggers. The binding / unbinding of global triggers is different than
that of per-port triggers. Such triggers also need to be enabled /
disabled on a per-{port, TC} basis and are only supported from
Spectrum-2 onwards.
Add trigger operations that allow us to abstract these differences. Only
implement the operations for per-port triggers. Next patch will
implement the operations for global triggers.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_span: Move SPAN operations out of global file
The per-ASIC SPAN operations are relevant to the SPAN module and
therefore should be implemented there and not in the main driver file.
Move them.
These operations will be extended later on.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Fri, 10 Jul 2020 21:55:05 +0000 (00:55 +0300)]
mlxsw: reg: Add Monitoring Port Analyzer Global Register
This register is used for global port analyzer configurations.
Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This register is used to configure the mirror enable for different
mirror reasons.
Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Fri, 10 Jul 2020 21:55:03 +0000 (00:55 +0300)]
net: sched: Pass qdisc reference in struct flow_block_offload
Previously, shared blocks were only relevant for the pseudo-qdiscs ingress
and clsact. Recently, a qevent facility was introduced, which allows to
bind blocks to well-defined slots of a qdisc instance. RED in particular
got two qevents: early_drop and mark. Drivers that wish to offload these
blocks will be sent the usual notification, and need to know which qdisc it
is related to.
To that end, extend flow_block_offload with a "sch" pointer, and initialize
as appropriate. This prompts changes in the indirect block facility, which
now tracks the scheduler in addition to the netdevice. Update signatures of
several functions similarly.
Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 14 Jul 2020 00:20:40 +0000 (17:20 -0700)]
Merge branch 'net-simple-kerneldoc-fixes'
Andrew Lunn says:
====================
net simple kerneldoc fixes
This is a collection of simple kerneldoc fixes. They are all low
hanging fruit, were not real understanding of the code was needed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 12 Jul 2020 23:15:14 +0000 (01:15 +0200)]
net: tipc: kerneldoc fixes
Simple fixes which require no deep knowledge of the code.
Cc: Jon Maloy <jmaloy@redhat.com> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 12 Jul 2020 23:15:12 +0000 (01:15 +0200)]
net: socket: Move kerneldoc next to function it documents
Fix the warning "Function parameter or member 'inode' not described in
'__sock_release'' due to the kerneldoc being placed before
__sock_release() not sock_release(), which does not take an inode
parameter.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 12 Jul 2020 23:15:11 +0000 (01:15 +0200)]
net: sched: kerneldoc fixes
Simple fixes which require no deep knowledge of the code.
Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 12 Jul 2020 23:15:07 +0000 (01:15 +0200)]
net: netlabel: kerneldoc fixes
Simple fixes which require no deep knowledge of the code.
Cc: Paul Moore <paul@paul-moore.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 12 Jul 2020 23:15:06 +0000 (01:15 +0200)]
net: netfilter: kerneldoc fixes
Simple fixes which require no deep knowledge of the code.
Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Jozsef Kadlecsik <kadlec@netfilter.org> Cc: Florian Westphal <fw@strlen.de> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 12 Jul 2020 23:15:03 +0000 (01:15 +0200)]
net: ipv6: kerneldoc fixes
Simple fixes which require no deep knowledge of the code.
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 12 Jul 2020 23:15:02 +0000 (01:15 +0200)]
net: ipv4: kerneldoc fixes
Simple fixes which require no deep knowledge of the code.
Cc: Paul Moore <paul@paul-moore.com> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 12 Jul 2020 23:14:58 +0000 (01:14 +0200)]
net: can: kerneldoc fixes
Simple fixes which require no deep knowledge of the code.
Cc: Oliver Hartkopp <socketcan@hartkopp.net> Cc: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sun, 12 Jul 2020 23:14:57 +0000 (01:14 +0200)]
net: 9p: kerneldoc fixes
Simple fixes which require no deep knowledge of the code.
Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Latchesar Ionkov <lucho@ionkov.net> Cc: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Alex Elder [Mon, 13 Jul 2020 12:24:18 +0000 (07:24 -0500)]
net: ipa: fix kerneldoc comments
This commit affects comments (and in one case, whitespace) only.
Throughout the IPA code, return statements are documented using
"@Return:", whereas they should use "Return:" instead. Fix these
mistakes.
In function definitions, some parameters are missing their comment
to describe them. And in structure definitions, some fields are
missing their comment to describe them. Add these missing
descriptions.
Some arguments changed name and type along the way, but their
descriptions were not updated (an endpoint pointer is now used in
many places that previously used an endpoint ID). Fix these
incorrect parameter descriptions.
In the description for the ipa_clock structure, one field had a
semicolon instead of a colon in its description. Fix this.
Add a missing function description for ipa_gsi_endpoint_data_empty().
All of these issues were identified when building with "W=1".
Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
Fix bpftool logic of stripping away const/volatile modifiers for all global
variables during BPF skeleton generation. See patch #1 for details on when
existing logic breaks and why it's important. Support special .strip_mods=true
mode in btf_dump__emit_type_decl.
Recent example of when this has caused problems can be found in [0].
tools/bpftool: Strip away modifiers from global variables
Reliably remove all the type modifiers from read-only (.rodata) global
variable definitions, including cases of inner field const modifiers and
arrays of const values.
Also modify one of selftests to ensure that const volatile struct doesn't
prevent user-space from modifying .rodata variable.
One important use case when emitting const/volatile/restrict is undesirable is
BPF skeleton generation of DATASEC layout. These are further memory-mapped and
can be written/read from user-space directly.
For important case of .rodata variables, bpftool strips away first-level
modifiers, to make their use on user-space side simple and not requiring extra
type casts to override compiler complaining about writing to const variables.
This logic works mostly fine, but breaks in some more complicated cases. E.g.:
const volatile int params[10];
Because in BTF it's a chain of ARRAY -> CONST -> VOLATILE -> INT, bpftool
stops at ARRAY and doesn't strip CONST and VOLATILE. In skeleton this variable
will be emitted as is. So when used from user-space, compiler will complain
about writing to const array. This is problematic, as also mentioned in [0].
To solve this for arrays and other non-trivial cases (e.g., inner
const/volatile fields inside the struct), teach btf_dump to strip away any
modifier, when requested. This is done as an extra option on
btf_dump__emit_type_decl() API.
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
Steven suggested a way to resolve the appearance of the warning banner
that appears as a result of using trace_printk() in BPF [1].
Applying the patch and testing reveals all works as expected; we
can call bpf_trace_printk() and see the trace messages in
/sys/kernel/debug/tracing/trace_pipe and no banner message appears.
Also add a test prog to verify basic bpf_trace_printk() helper behaviour.
Changes since v2:
- fixed stray newline in bpf_trace_printk(), use sizeof(buf)
rather than #defined value in vsnprintf() (Daniel, patch 1)
- Daniel also pointed out that vsnprintf() returns 0 on error rather
than a negative value; also turns out that a null byte is not
appended if the length of the string written is zero, so to fix
for cases where the string to be traced is zero length we set the
null byte explicitly (Daniel, patch 1)
- switch to using getline() for retrieving lines from trace buffer
to ensure we don't read a portion of the search message in one
read() operation and then fail to find it (Andrii, patch 2)
Changes since v1:
- reorder header inclusion in bpf_trace.c (Steven, patch 1)
- trace zero-length messages also (Andrii, patch 1)
- use a raw spinlock to ensure there are no issues for PREMMPT_RT
kernels when using bpf_trace_printk() within other raw spinlocks
(Steven, patch 1)
- always enable bpf_trace_printk() tracepoint when loading programs
using bpf_trace_printk() as this will ensure that a user disabling
that tracepoint will not prevent tracing output from being logged
(Steven, patch 1)
- use "tp/raw_syscalls/sys_enter" and a usleep(1) to trigger events
in the selftest ensuring test runs faster (Andrii, patch 2)
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Alan Maguire [Mon, 13 Jul 2020 11:52:33 +0000 (12:52 +0100)]
bpf: Use dedicated bpf_trace_printk event instead of trace_printk()
The bpf helper bpf_trace_printk() uses trace_printk() under the hood.
This leads to an alarming warning message originating from trace
buffer allocation which occurs the first time a program using
bpf_trace_printk() is loaded.
We can instead create a trace event for bpf_trace_printk() and enable
it in-kernel when/if we encounter a program using the
bpf_trace_printk() helper. With this approach, trace_printk()
is not used directly and no warning message appears.
This work was started by Steven (see Link) and finished by Alan; added
Steven's Signed-off-by with his permission.
====================
This series introduces new statistics for af_xdp:
1. drops due to rx ring being full
2. drops due to fill ring being empty
3. failures pulling an item from the tx ring
These statistics should assist users debugging and troubleshooting
peformance issues and packet drops.
The statistics are made available though the getsockopt and xsk_diag
interfaces, and the ability to dump these extended statistics is made
available in the xdpsock application via the --extra-stats or -x flag.
A separate patch which will add ss/iproute2 support will follow.
====================
Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
samples: bpf: Add an option for printing extra statistics in xdpsock
Introduce the --extra-stats (or simply -x) flag to the xdpsock application
which prints additional statistics alongside the regular rx and tx
counters. The new statistics printed report error conditions eg. rx ring
full, invalid descriptors, etc.
It can be useful for the user to know the reason behind a dropped packet.
Introduce new counters which track drops on the receive path caused by:
1. rx ring being full
2. fill ring being empty
Also, on the tx path introduce a counter which tracks the number of times
we attempt pull from the tx ring when it is empty.
====================
This patchset adds:
- support to generate BTF ID lists that are resolved during
kernel linking and usable within kernel code with following
macros:
and access it in kernel code via:
extern u32 bpf_skb_output_btf_ids[];
- resolve_btfids tool that scans elf object for .BTF_ids
section and resolves its symbols with BTF ID values
- resolving of bpf_ctx_convert struct and several other
objects with BTF_ID_LIST
v7 changes:
- added more acks [Andrii]
- added some name-conflicting entries and fixed resolve_btfids
to process them properly [Andrii]
- changed bpf_get_task_stack_proto to use BTF_IDS_LIST/BTF_ID
macros [Andrii]
- fixed selftest build for resolve_btfids test
====================
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.
Deterministic algorithm:
For each file:
If not .svg:
For each line:
If doesn't contain `\bxmlns\b`:
For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
If both the HTTP and HTTPS versions
return 200 OK and serve the same content:
Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: David S. Miller <davem@davemloft.net>
kbuild test robot found that addr_to_string() is available only when
DEBUG is defined. And I found that what that function is doing is
what %pM will do. Thus, replace %s with %pM and remove thread-unsafe
addr_to_string() function.
Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
net: bridge: fix undefined br_vlan_can_enter_range in tunnel code
If bridge vlan filtering is not defined we won't have
br_vlan_can_enter_range and thus will get a compile error as was
reported by Stephen and the build bot. So let's define a stub for when
vlan filtering is not used.
Fixes: 94339443686b ("net: bridge: notify on vlan tunnel changes done via the old api") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Olsa [Sat, 11 Jul 2020 21:53:29 +0000 (23:53 +0200)]
selftests/bpf: Add test for resolve_btfids
Adding resolve_btfids test under test_progs suite.
It's possible to use btf_ids.h header and its logic in
user space application, so we can add easy test for it.
The test defines BTF_ID_LIST and checks it gets properly
resolved.
For this reason the test_progs binary (and other binaries
that use TRUNNER* macros) is processed with resolve_btfids
tool, which resolves BTF IDs in .BTF_ids section. The BTF
data are taken from btf_data.o object rceated from
progs/btf_data.c.
Jiri Olsa [Sat, 11 Jul 2020 21:53:23 +0000 (23:53 +0200)]
bpf: Add BTF_ID_LIST/BTF_ID/BTF_ID_UNUSED macros
Adding support to generate .BTF_ids section that will hold BTF
ID lists for verifier.
Adding macros that will help to define lists of BTF ID values
placed in .BTF_ids section. They are initially filled with zeros
(during compilation) and resolved later during the linking phase
by resolve_btfids tool.
The wrappers in include/linux/pci-dma-compat.h should go away.
The patch has been generated with the coccinelle script below and has been
hand modified to replace GPF_ with a correct flag.
It has been compile tested.
When memory is allocated in 'sky2_alloc_buffers()', GFP_KERNEL can be used
because some other memory allocations in the same function already use this
flag.
When memory is allocated in 'sky2_probe()', GFP_KERNEL can be used
because another memory allocations in the same function already uses this
flag.
The wrappers in include/linux/pci-dma-compat.h should go away.
The patch has been generated with the coccinelle script below and has been
hand modified to replace GPF_ with a correct flag.
It has been compile tested.
When memory is allocated in 'skge_up()', GFP_KERNEL can be used because
some other memory allocations done a few lines below in 'skge_ring_alloc()'
already use this flag.
====================
Fix MTU warnings for fec/mv886xxx combo
Since changing the MTU of dsa slave interfaces was implemented, the
fec/mv88e6xxx combo has been giving warnings:
[ 2.275925] mv88e6085 0.2:00: nonfatal error -95 setting MTU on port 9
[ 2.284306] eth1: mtu greater than device maximum
[ 2.287759] fec 400d1000.ethernet eth1: error -22 setting MTU to include DSA overhead
This patchset adds support for changing the MTU on mv88e6xxx switches,
which do support jumbo frames. And it modifies the FEC driver to
support its true MTU range, which is larger than the default Ethernet
MTU.
====================
Tested-by: Chris Healy <cphealy@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 11 Jul 2020 20:32:06 +0000 (22:32 +0200)]
net: fec: Set max MTU size to allow the MTU to be changed
The FEC allocates 2K buffers, but looses some of it due to
alignment. It can however support an MTU bigger than the default. This
is particularly interesting when used in combination with Ethernet
switches supporting DSA, which have extra headers. The DSA core will
try to increase the MTU to support these extra headers. If the max
size defaults to that of standard Ethernet we get a warning. By
setting the max to what the driver actually supports, we avoid this
warning.
Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
net: bridge: notify on vlan tunnel changes done via the old api
If someone uses the old vlan API to configure tunnel mappings we'll only
generate the old-style full port notification. That would be a problem
if we are monitoring the new vlan notifications for changes. The patch
resolves the issue by adding vlan notifications to the old tunnel netlink
code. As usual we try to compress the notifications for as many vlans
in a range as possible, thus a vlan tunnel change is considered able
to enter the "current" vlan notification range if:
1. vlan exists
2. it has actually changed (curr_change == true)
3. it passes all standard vlan notification range checks done by
br_vlan_can_enter_range() such as option equality, id continuity etc
Note that vlan tunnel changes (add/del) are considered a part of vlan
options so only RTM_NEWVLAN notification is generated with the relevant
information inside.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag '5.8-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
"Four cifs/smb3 fixes: the three for stable fix problems found recently
with change notification including a reference count leak"
* tag '5.8-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
cifs: update internal module version number
cifs: fix reference leak for tlink
smb3: fix unneeded error message on change notify
cifs: remove the retry in cifs_poxis_lock_set
smb3: fix access denied on change notify request to some servers