git.baikalelectronics.ru Git - kernel.git/log
Heiner Kallweit [Sat, 9 Jan 2021 22:01:18 +0000 (23:01 +0100)]
r8169: tweak max read request size for newer chips also in jumbo mtu mode

So far we don't increase the max read request size if we switch to
jumbo mode before bringing up the interface for the first time.
Let's change this.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Sat, 9 Jan 2021 22:00:04 +0000 (23:00 +0100)]
r8169: align RTL8168e jumbo pcie read request size with vendor driver

Align behavior with r8168 vendor driver and don't reduce max read
request size for RTL8168e in jumbo mode.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Florian Fainelli [Sat, 9 Jan 2021 05:06:22 +0000 (21:06 -0800)]
net: marvell: prestera: Correct typo

The function was incorrectly named with a trailing 'r' at the end of
prestera.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210109050622.8081-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King [Sun, 10 Jan 2021 14:54:36 +0000 (14:54 +0000)]
net: phy: at803x: use phy_modify_mmd()

Convert at803x_clk_out_config() to use phy_modify_mmd().
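
As a rough illustration of the read-modify-write pattern that a
modify-style helper wraps, here is a minimal standalone sketch. It does
not touch real PHY registers; toy_modify_bits and the values passed to
it are hypothetical and only model "clear the bits in the mask, then
set the new bits".

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel API: compute the new register
 * value by clearing the bits in 'mask' and then setting the bits in
 * 'set'.
 */
static uint16_t toy_modify_bits(uint16_t old, uint16_t mask, uint16_t set)
{
	return (uint16_t)((old & ~mask) | set);
}
```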

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/E1kyc72-0008Pq-1x@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King [Sun, 10 Jan 2021 10:58:37 +0000 (10:58 +0000)]
net: sfp: extend bitrate-derived mode for 2500BASE-X

Extend the bitrate-derived support to include 2500BASE-X for modules
that report a bitrate of 2500Mbaud.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/E1kyYQf-0004iY-Gh@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King [Sun, 10 Jan 2021 10:58:32 +0000 (10:58 +0000)]
net: sfp: cope with SFPs that set both LOS normal and LOS inverted

The SFP MSA defines two option bits in byte 65 to indicate how the
Rx_LOS signal on SFP pin 8 behaves:

bit 2 - Loss of Signal implemented, signal inverted from standard
        definition in SFP MSA (often called "Signal Detect").
bit 1 - Loss of Signal implemented, signal as defined in SFP MSA
        (often called "Rx_LOS").

Clearly, setting both bits results in a meaningless situation: it would
mean that LOS is implemented in both the normal sense (1 = signal loss)
and inverted sense (0 = signal loss).

Unfortunately, there are modules out there which set both bits. These
will initially be interpreted as "inverted" sense, and then, if the LOS
signal changes state, we will toggle between LINK_UP and WAIT_LOS
states.

Change our LOS handling to give well defined behaviour: only interpret
these bits as meaningful if exactly one is set, otherwise treat it as
if LOS is not implemented.
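
The "exactly one bit set" rule can be sketched as follows. This is a
standalone illustration, not the driver code: the macro names, the
los_mode enum and both functions are hypothetical, and only the bit
positions (bit 2 inverted, bit 1 normal) come from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical masks for the two option bits in byte 65. */
#define TOY_OPT_LOS_INVERTED	(1u << 2)
#define TOY_OPT_LOS_NORMAL	(1u << 1)

enum los_mode { LOS_NONE, LOS_NORMAL, LOS_INVERTED };

/* Trust the option bits only when exactly one of them is set. */
static enum los_mode los_mode_from_options(uint8_t options)
{
	uint8_t los = options & (TOY_OPT_LOS_INVERTED | TOY_OPT_LOS_NORMAL);

	if (los == TOY_OPT_LOS_NORMAL)
		return LOS_NORMAL;
	if (los == TOY_OPT_LOS_INVERTED)
		return LOS_INVERTED;

	return LOS_NONE;	/* neither bit or both bits set */
}

/* Returns true when the Rx_LOS pin level means "signal lost". */
static bool signal_lost(enum los_mode mode, bool pin_high)
{
	switch (mode) {
	case LOS_NORMAL:
		return pin_high;
	case LOS_INVERTED:
		return !pin_high;
	default:
		return false;	/* LOS treated as not implemented */
	}
}
```

With this shape, a module that sets both bits never reports a loss of
signal, so the pin can change state without causing a state change.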

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/E1kyYQa-0004iR-CU@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Sat, 9 Jan 2021 20:34:15 +0000 (22:34 +0200)]
net: dsa: felix: the switch does not support DMA

The code that sets the DMA mask to 64 bits is bogus; it was taken from
the enetc driver together with the rest of the PCI probing boilerplate.

Since this patch is touching the error path to delete err_dma, let's
also change the err_alloc_felix label which was incorrect. The kzalloc
failure does not need a kfree, but it doesn't hurt either, since kfree
works with NULL pointers.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210109203415.2120142-1-olteanv@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Tue, 12 Jan 2021 00:01:00 +0000 (16:01 -0800)]
Merge branch 'get-rid-of-the-switchdev-transactional-model'

Vladimir Oltean says:

====================
Get rid of the switchdev transactional model

Changes in v4:
- Fixed build error in dsa_loop and build warning in hellcreek driver.
- Scheduling the mlxsw SPAN work item regardless of the VLAN add return
  code, as per Ido's and Petr's request.

Changes in v3:
- Resolved a build warning in mv88e6xxx and tested that it actually
  works properly, which resulted in an extra patch (02/11).
- Addressed Ido's minor feedback in commit 10/11 relating to a comment.

Changes in v2:
- Got rid of the vid_begin -> vid_end range too from the switchdev API.
- Actually propagating errors from DSA MDB and VLAN notifiers.

This series comes after the late realization that the prepare/commit
separation imposed by switchdev does not help literally anybody:
https://patchwork.kernel.org/project/netdevbpf/patch/20201212203901.351331-1-vladimir.oltean@nxp.com/

We should kill it before it inflicts even more damage to the error
handling logic in drivers.

Also remove the unused VLAN ranges feature from the switchdev VLAN
objects, which simplifies all drivers by quite a bit.
====================

Link: https://lore.kernel.org/r/20210109000156.1246735-1-olteanv@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Sat, 9 Jan 2021 00:01:56 +0000 (02:01 +0200)]
net: switchdev: delete the transaction object

Now that all users of struct switchdev_trans have been modified to do
without it, we can remove this structure and the two helpers to determine
the phase.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Sat, 9 Jan 2021 00:01:55 +0000 (02:01 +0200)]
mlxsw: spectrum_switchdev: remove transactional logic for VLAN objects

As of commit f2625668e390 ("mlxsw: spectrum_switchdev: Avoid returning
errors in commit phase"), the mlxsw driver performs the VLAN object
offloading during the prepare phase. So conversion just seems to be a
matter of removing the code that was running in the commit phase.

Ido Schimmel explains that mlxsw_sp_span_respin is called
unconditionally because the bridge driver will ignore -EOPNOTSUPP and
actually add the VLAN on the bridge device - see commit de3a4f487320
("net: bridge: Notify about bridge VLANs") and commit ab756964049b
("mlxsw: spectrum_switchdev: Ignore bridge VLAN events"). Since the VLAN
was successfully added on the bridge device, mlxsw_sp_span_respin_work()
should be able to resolve the egress port for a packet that is mirrored
to a gre tap and passes through the bridge device. Therefore keep the
logic as it is.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Sat, 9 Jan 2021 00:01:54 +0000 (02:01 +0200)]
net: dsa: remove obsolete comments about switchdev transactions

Now that all port object notifiers were converted to be non-transactional,
we can remove the comments that say otherwise.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Sat, 9 Jan 2021 00:01:53 +0000 (02:01 +0200)]
net: dsa: remove the transactional logic from VLAN objects

It should be the driver's business to logically separate its VLAN
offloading into a preparation and a commit phase, and some drivers don't
need / can't do this.

So remove the transactional shim from DSA and let drivers propagate
errors directly from the .port_vlan_add callback.

It would appear that the code has worse error handling now than it had
before. DSA is the only in-kernel user of switchdev that offloads one
switchdev object to more than one port: for every VLAN object offloaded
to a user port, that VLAN is also offloaded to the CPU port. So the
"prepare for user port -> check for errors -> prepare for CPU port ->
check for errors -> commit for user port -> commit for CPU port"
sequence appears to make more sense than the one we are using now:
"offload to user port -> check for errors -> offload to CPU port ->
check for errors", but it is really a compromise. In the new way, we can
catch errors from the commit phase that we previously had to ignore.
But we have our hands tied and cannot do any rollback now: if we add a
VLAN on the CPU port and it fails, we can't do the rollback by simply
deleting it from the user port, because the switchdev API is not so nice
with us: it could have simply been there already, even with the same
flags. So we don't even attempt to rollback anything on addition error,
just leave whatever VLANs managed to get offloaded right where they are.
This should not be a problem at all in practice.
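
The new "offload to user port -> check for errors -> offload to CPU
port -> check for errors" sequence could look roughly like the sketch
below. Everything here is a toy model: the port numbers, the
toy_vlan_member table and both functions are hypothetical, and the
rule that the CPU port rejects VID 0 exists only to produce an error.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NUM_PORTS	6
#define TOY_CPU_PORT	5

/* Toy state: whether each port currently has a VLAN offloaded. */
static bool toy_vlan_member[TOY_NUM_PORTS];

/* Toy stand-in for a driver's VLAN add hook. */
static int toy_port_vlan_add(int port, uint16_t vid)
{
	if (port < 0 || port >= TOY_NUM_PORTS)
		return -EINVAL;
	if (port == TOY_CPU_PORT && vid == 0)
		return -EOPNOTSUPP;

	toy_vlan_member[port] = true;
	return 0;
}

/*
 * Offload the VLAN to the user port, then to the CPU port. Errors are
 * returned as they occur and nothing is rolled back.
 */
static int toy_vlan_add(int user_port, uint16_t vid)
{
	int err;

	err = toy_port_vlan_add(user_port, vid);
	if (err)
		return err;

	return toy_port_vlan_add(TOY_CPU_PORT, vid);
}
```

If the CPU port step fails, the user port keeps the VLAN, which matches
the "leave whatever VLANs managed to get offloaded right where they
are" behaviour described above.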

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Sat, 9 Jan 2021 00:01:52 +0000 (02:01 +0200)]
net: dsa: remove the transactional logic from MDB entries

For many drivers, the .port_mdb_prepare callback was not a good opportunity
to avoid any error condition, and they would suppress errors found during
the actual commit phase.

Where a logical separation between the prepare and the commit phase
existed, the function that used to implement the .port_mdb_prepare
callback still exists, but now it is called directly from .port_mdb_add,
which was modified to return an int code.
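
A minimal sketch of that arrangement, assuming a driver whose only
"prepare" work is a capacity check. The toy_mdb_table structure and
both function names are hypothetical and are not taken from any driver.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical multicast table with a fixed number of hardware slots. */
struct toy_mdb_table {
	int used;
	int capacity;
};

/* Formerly the "prepare" step: it only checks and changes nothing. */
static int toy_mdb_prepare(const struct toy_mdb_table *tbl)
{
	return tbl->used < tbl->capacity ? 0 : -ENOSPC;
}

/* The add hook now runs the check itself and returns an int. */
static int toy_port_mdb_add(struct toy_mdb_table *tbl)
{
	int err = toy_mdb_prepare(tbl);

	if (err)
		return err;

	tbl->used++;
	return 0;
}
```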

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
Reviewed-by: Linus Walleij <linus.walleij@linaro.org> # RTL8366
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Sat, 9 Jan 2021 00:01:51 +0000 (02:01 +0200)]
net: dsa: remove the transactional logic from ageing time notifiers

Remove the shim introduced in DSA for offloading the bridge ageing time
from switchdev, by first checking whether the ageing time is within the
range limits requested by the driver.
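
The range check can be sketched as below. The struct layout, the field
names and the convention that a limit of 0 means "no limit" are
assumptions made for this illustration only.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical per-switch limits; 0 is assumed to mean "no limit". */
struct toy_switch {
	unsigned int ageing_time_min;
	unsigned int ageing_time_max;
	unsigned int ageing_time;	/* value currently programmed */
};

/* Check the requested value against the limits before applying it. */
static int toy_set_ageing_time(struct toy_switch *ds, unsigned int msecs)
{
	if (ds->ageing_time_min && msecs < ds->ageing_time_min)
		return -ERANGE;
	if (ds->ageing_time_max && msecs > ds->ageing_time_max)
		return -ERANGE;

	ds->ageing_time = msecs;
	return 0;
}
```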

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Sat, 9 Jan 2021 00:01:50 +0000 (02:01 +0200)]
net: switchdev: remove the transaction structure from port attributes

Since the introduction of the switchdev API, port attributes were
transmitted to drivers for offloading using a two-step transactional
model, with a prepare phase that was supposed to catch all errors, and a
commit phase that was supposed to never fail.

Some classes of failures can never be avoided, like hardware access, or
memory allocation. In the latter case, merely attempting to move the
memory allocation to the preparation phase makes it impossible to avoid
memory leaks, since commit fed8b4fac460 ("switchdev: Remove unused
transaction item queue") which has removed the unused mechanism of
passing on the allocated memory between one phase and another.

It is time we admit that separating the preparation from the commit
phase is something that is best left for the driver to decide, and not
something that should be baked into the API, especially since there are
no switchdev callers that depend on this.

This patch removes the struct switchdev_trans member from switchdev port
attribute notifier structures, and converts drivers to not look at this
member.
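
To make the difference concrete, here is a toy comparison of the two
handler shapes. None of these names exist in the kernel: toy_phase
stands in for the phase information carried by the transaction, and the
"hardware state" is just an integer.

```c
#include <assert.h>
#include <errno.h>

#define TOY_ATTR_MAX	7u

enum toy_phase { TOY_PREPARE, TOY_COMMIT };

/* Two-phase shape: called twice, and the handler branches on the phase. */
static int toy_attr_set_two_phase(unsigned int *hw_state, unsigned int val,
				  enum toy_phase phase)
{
	if (val > TOY_ATTR_MAX)
		return -EINVAL;
	if (phase == TOY_PREPARE)
		return 0;	/* only validate, change nothing */

	*hw_state = val;
	return 0;
}

/* Single-call shape: validate and apply, returning any error directly. */
static int toy_attr_set(unsigned int *hw_state, unsigned int val)
{
	if (val > TOY_ATTR_MAX)
		return -EINVAL;

	*hw_state = val;
	return 0;
}
```

A driver that still wants a check-then-apply split can keep it inside
its own single-call handler.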

In part, this patch contains a revert of my previous commit 73712de3351f
("net: dsa: propagate switchdev vlan_filtering prepare phase to
drivers").

For the most part, the conversion was trivial except for:
- Rocker's world implementation based on Broadcom OF-DPA had an odd
  implementation of ofdpa_port_attr_bridge_flags_set. The conversion was
  done mechanically, by pasting the implementation twice, then only
  keeping the code that would get executed during prepare phase on top,
  then only keeping the code that gets executed during the commit phase
  on bottom, then simplifying the resulting code until this was obtained.
- DSA's offloading of STP state, bridge flags, VLAN filtering and
  multicast router could be converted right away. But the ageing time
  could not, so a shim was introduced and this was left for a further
  commit.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
Reviewed-by: Linus Walleij <linus.walleij@linaro.org> # RTL8366RB
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Sat, 9 Jan 2021 00:01:49 +0000 (02:01 +0200)]
net: switchdev: delete switchdev_port_obj_add_now

After the removal of the transactional model inside
switchdev_port_obj_add_now, it has no added value and we can just call
switchdev_port_obj_notify directly, bypassing this function. Let's
delete it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Sat, 9 Jan 2021 00:01:48 +0000 (02:01 +0200)]
net: switchdev: remove the transaction structure from port object notifiers

Since the introduction of the switchdev API, port objects were
transmitted to drivers for offloading using a two-step transactional
model, with a prepare phase that was supposed to catch all errors, and a
commit phase that was supposed to never fail.

Some classes of failures can never be avoided, like hardware access, or
memory allocation. In the latter case, merely attempting to move the
memory allocation to the preparation phase makes it impossible to avoid
memory leaks, since commit fed8b4fac460 ("switchdev: Remove unused
transaction item queue") which has removed the unused mechanism of
passing on the allocated memory between one phase and another.

It is time we admit that separating the preparation from the commit
phase is something that is best left for the driver to decide, and not
something that should be baked into the API, especially since there are
no switchdev callers that depend on this.

This patch removes the struct switchdev_trans member from switchdev port
object notifier structures, and converts drivers to not look at this
member.

Where driver conversion is trivial (like in the case of the Marvell
Prestera driver, NXP DPAA2 switch, TI CPSW, and Rocker drivers), it is
done in this patch.

Where driver conversion needs more attention (DSA, Mellanox Spectrum),
the conversion is left for subsequent patches and here we only fake the
prepare/commit phases at a lower level, just not in the switchdev
notifier itself.

Where the code has a natural structure that is best left alone as a
preparation and a commit phase (as in the case of the Ocelot switch),
that structure is left in place, just made to not depend upon the
switchdev transactional model.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Sat, 9 Jan 2021 00:01:47 +0000 (02:01 +0200)]
net: dsa: mv88e6xxx: deny vid 0 on the CPU port and DSA links too

mv88e6xxx apparently has a problem offloading VID 0, which the 8021q
module tries to install as part of commit c3892c42b0dd ("vlan_dev: VLAN
0 should be treated as "no vlan tag" (802.1p packet)"). That mv88e6xxx
restriction seems to have been introduced by the "VTU GetNext VID-1
trick to retrieve a single entry" - see commit 8233b3395d3c ("net: dsa:
mv88e6xxx: extract single VLAN retrieval").

There is one more problem. The mv88e6xxx CPU port and DSA links do not
properly report in the prepare phase which VLANs they can offload.
They'll say they can offload everything:

mv88e6xxx_port_vlan_prepare
-> mv88e6xxx_port_check_hw_vlan:

	/* DSA and CPU ports have to be members of multiple vlans */
	if (dsa_is_dsa_port(ds, port) || dsa_is_cpu_port(ds, port))
		return 0;

Except that if you actually try to commit to it, they'll error out and
print this message:

[   32.802438] mv88e6085 d0032004.mdio-mii:12: p9: failed to add VLAN 0t

which comes from:

mv88e6xxx_port_vlan_add
-> mv88e6xxx_port_vlan_join:

	if (!vid)
		return -EOPNOTSUPP;

What prevents this condition from triggering in real life? The fact that
when a DSA_NOTIFIER_VLAN_ADD is emitted, it never targets a DSA link
directly. Instead, the notifier will always target either a user port or
a CPU port. DSA links just happen to get dragged in by:

static bool dsa_switch_vlan_match(struct dsa_switch *ds, int port,
				  struct dsa_notifier_vlan_info *info)
{
	...
	if (dsa_is_dsa_port(ds, port))
		return true;
	...
}

So for every DSA VLAN notifier, during the prepare phase, it will just
so happen that there will be somebody to say "no, don't do that".

This will become a problem when the switchdev prepare/commit transactional
model goes away. Every port needs to think on its own. DSA links can no
longer bluff and rely on the fact that the prepare phase will not go
through to the end, because there will be no prepare phase any longer.

Fix this issue before it becomes a problem, by having the "vid == 0"
check earlier than the check whether we are a CPU port / DSA link or not.
Also, the "vid == 0" check becomes unnecessary in the .port_vlan_add
callback, so we can remove it.
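
The reordered check could be sketched like this. It is a simplified
stand-in rather than the driver function: the boolean parameter
replaces the dsa_is_dsa_port() / dsa_is_cpu_port() tests, and the
user-port checks are reduced to a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified version of the hardware VLAN check. */
static int toy_port_check_hw_vlan(bool is_cpu_or_dsa_port, uint16_t vid)
{
	/* Reject VID 0 first, for every kind of port. */
	if (!vid)
		return -EOPNOTSUPP;

	/* DSA and CPU ports have to be members of multiple vlans */
	if (is_cpu_or_dsa_port)
		return 0;

	/* Checks specific to user ports would go here. */
	return 0;
}
```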

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean [Sat, 9 Jan 2021 00:01:46 +0000 (02:01 +0200)]
net: switchdev: remove vid_begin -> vid_end range from VLAN objects

The call path of a switchdev VLAN addition to the bridge looks something
like this today:

        nbp_vlan_init
        |  __br_vlan_set_default_pvid
        |  |                       |
        |  |    br_afspec          |
        |  |        |              |
        |  |        v              |
        |  | br_process_vlan_info  |
        |  |        |              |
        |  |        v              |
        |  |   br_vlan_info        |
        |  |       / \            /
        |  |      /   \          /
        |  |     /     \        /
        |  |    /       \      /
        v  v   v         v    v
      nbp_vlan_add   br_vlan_add ------+
       |              ^      ^ |       |
       |             /       | |       |
       |            /       /  /       |
       \ br_vlan_get_master/  /        v
        \        ^        /  /  br_vlan_add_existing
         \       |       /  /          |
          \      |      /  /          /
           \     |     /  /          /
            \    |    /  /          /
             \   |   /  /          /
              v  |   | v          /
              __vlan_add         /
                 / |            /
                /  |           /
               v   |          /
   __vlan_vid_add  |         /
               \   |        /
                v  v        v
      br_switchdev_port_vlan_add

The ranges UAPI was introduced to the bridge in commit 0218dcb3b347
("bridge: support for multiple vlans and vlan ranges in setlink and
dellink requests") (Jan 10 2015). But the VLAN ranges (parsed in br_afspec)
have always been passed one by one, through struct bridge_vlan_info
tmp_vinfo, to br_vlan_info. So the range never went too far in depth.

Then Scott Feldman introduced the switchdev_port_bridge_setlink function
in commit 55858308ccc7 ("switchdev: add new switchdev bridge setlink").
That marked the introduction of the SWITCHDEV_OBJ_PORT_VLAN, which made
full use of the range. But switchdev_port_bridge_setlink was called like
this:

br_setlink
-> br_afspec
-> switchdev_port_bridge_setlink

Basically, the switchdev and the bridge code were not tightly integrated.
Then commit 83d63ddbe351 ("bridge: restore br_setlink back to original")
came, and switchdev drivers were required to implement
.ndo_bridge_setlink = switchdev_port_bridge_setlink for a while.

In the meantime, commits such as 0d72389c4754 ("bridge: try switchdev op
first in __vlan_vid_add/del") finally made switchdev penetrate the
br_vlan_info() barrier and start to develop the call path we have today.
But remember, br_vlan_info() still receives VLANs one by one.

Then Arkadi Sharshevsky refactored the switchdev API in 2017 in commit
6686e8d35bff ("net: switchdev: Remove bridge bypass support from
switchdev") so that drivers would not implement .ndo_bridge_setlink any
longer. The switchdev_port_bridge_setlink also got deleted.
This refactoring removed the parallel bridge_setlink implementation from
switchdev, and left the only switchdev VLAN objects to be the ones
offloaded from __vlan_vid_add (basically RX filtering) and __vlan_add
(the latter coming from commit de3a4f487320 ("net: bridge: Notify about
bridge VLANs")).

That is to say, today the switchdev VLAN object ranges are not used in
the kernel. Refactoring the above call path is a bit complicated, when
the bridge VLAN call path is already a bit complicated.

Let's go off and finish the job of commit 6686e8d35bff by deleting the
bogus iteration through the VLAN ranges from the drivers. Some aspects
of this feature never made too much sense in the first place. For
example, what is a range of VLANs all having the BRIDGE_VLAN_INFO_PVID
flag supposed to mean, when a port can obviously have a single pvid?
This particular configuration _is_ denied as of commit c82f4f916b4f
("bridge: vlan: enforce no pvid flag in vlan ranges"), but from an API
perspective, the driver still has to play pretend, and only offload the
vlan->vid_end as pvid. And the addition of a switchdev VLAN object can
modify the flags of another, completely unrelated, switchdev VLAN
object! (a VLAN that is PVID will invalidate the PVID flag from whatever
other VLAN had previously been offloaded with switchdev and had that
flag. Yet switchdev never notifies about that change, drivers are
supposed to guess).

Nonetheless, having a VLAN range in the API makes error handling look
scarier than it really is - unwinding on errors and all of that.
When in reality, no one really calls this API with more than one VLAN.
It is all unnecessary complexity.
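
The sketch below contrasts a range-style handler, which has to loop and
unwind, with a single-VID handler. It is a toy model: the table, the
toy_fail_vid knob and all type and function names are hypothetical, and
only the vid_begin / vid_end / vid field names follow the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NUM_VLANS	4096

/* Toy hardware state; toy_fail_vid lets a caller simulate a failure. */
static bool toy_vlan_table[TOY_NUM_VLANS];
static int toy_fail_vid = -1;

static int toy_hw_vlan_add(uint16_t vid)
{
	if (vid >= TOY_NUM_VLANS)
		return -EINVAL;
	if (vid == toy_fail_vid)
		return -EIO;

	toy_vlan_table[vid] = true;
	return 0;
}

static void toy_hw_vlan_del(uint16_t vid)
{
	if (vid < TOY_NUM_VLANS)
		toy_vlan_table[vid] = false;
}

/* Range-style object: the handler must iterate and unwind on error. */
struct toy_vlan_range {
	uint16_t flags;
	uint16_t vid_begin;
	uint16_t vid_end;
};

static int toy_vlan_range_add(const struct toy_vlan_range *vlan)
{
	uint16_t vid;
	int err;

	for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) {
		err = toy_hw_vlan_add(vid);
		if (err)
			goto unwind;
	}
	return 0;

unwind:
	/* Remove the VIDs that were added before the failing one. */
	while (vid > vlan->vid_begin)
		toy_hw_vlan_del(--vid);
	return err;
}

/* Single-VID object: one call, one error code, nothing to unwind. */
struct toy_vlan {
	uint16_t flags;
	uint16_t vid;
};

static int toy_vlan_add(const struct toy_vlan *vlan)
{
	return toy_hw_vlan_add(vlan->vid);
}
```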

And despite appearing pretentious (two-phase transactional model and
all), the switchdev API is really sloppy because the VLAN addition and
removal operations are not paired with one another (you can add a VLAN
100 times and delete it just once). The bridge notifies through
switchdev of a VLAN addition not only when the flags of an existing VLAN
change, but also when nothing changes. There are switchdev drivers out
there that don't like adding a VLAN that has already been added, and
those checks don't really belong at driver level. But the fact that the
API contains ranges is yet another factor that prevents this from being
addressed in the future.

Of the existing switchdev pieces of hardware, it appears that only
Mellanox Spectrum supports offloading more than one VLAN at a time,
through mlxsw_sp_port_vlan_set. I have kept that code internal to the
driver, because there is some more bookkeeping that makes use of it, but
I deleted it from the switchdev API. But since the switchdev support for
ranges has already been de facto deleted by a Mellanox employee and
nobody noticed for 4 years, I'm going to assume it's not a biggie.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com> # switchdev and mlxsw
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> # hellcreek
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Heiner Kallweit [Sun, 10 Jan 2021 12:17:52 +0000 (13:17 +0100)]
r8169: deprecate support for RTL_GIGA_MAC_VER_27

RTL8168dp is ancient anyway, and I haven't seen any trace of its early
version 27 yet. This chip version needs quite some special handling,
therefore it would facilitate driver maintenance if support for it
could be dropped. For now just disable detection of this chip version.
If nobody complains we can remove support for it in the near future.

v2:
- extend unknown chip version error message

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/ca98f018-a0e1-8762-e95c-f0ad773a0271@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rafał Miłecki [Wed, 6 Jan 2021 21:32:02 +0000 (22:32 +0100)]
net: dsa: bcm_sf2: support BCM4908's integrated switch

BCM4908 family SoCs come with an integrated Starfighter 2 switch. Its
register layout is a mix of BCM7278 and BCM7445. It has 5 integrated
PHYs and 8 ports. It also supports RGMII and SerDes.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210106213202.17459-3-zajec5@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rafał Miłecki [Wed, 6 Jan 2021 21:32:01 +0000 (22:32 +0100)]
dt-bindings: net: dsa: sf2: add BCM4908 switch binding

BCM4908 family SoCs have an integrated Starfighter 2 switch.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210106213202.17459-2-zajec5@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rafał Miłecki [Wed, 6 Jan 2021 21:32:00 +0000 (22:32 +0100)]
dt-bindings: net: convert Broadcom Starfighter 2 binding to the json-schema

This helps validate DTS files. Only the current (non-deprecated)
binding was converted.

Minor changes:
1. Dropped dsa/dsa.txt references
2. Updated node name to match dsa.yaml requirement
3. Fixed 2 typos in examples

The new binding was validated using dt_binding_check.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Link: https://lore.kernel.org/r/20210106213202.17459-1-zajec5@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jakub Kicinski [Sun, 10 Jan 2021 02:18:47 +0000 (18:18 -0800)]
Merge branch 'mptcp-add-mp_prio-support-and-rework-local-address-ids'

Mat Martineau says:

====================
MPTCP: Add MP_PRIO support and rework local address IDs

Patches 1 and 2 rework the assignment of local address IDs to allow them
to be assigned by a userspace path manager, and add corresponding self
tests.

Patches 2-8 add the ability to change subflow priority after a subflow
has been established. Each subflow in a MPTCP connection has a priority
level: "regular" or "backup". Data should only be sent on backup
subflows if no regular subflows are available. The priority level can be
set when the subflow connection is established (as was already
implemented), or during the life of the connection by sending MP_PRIO in
the TCP options (as added here). Self tests are included.
====================
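
The rule that data should only be sent on backup subflows when no
regular subflow is available can be sketched as a small selection
function. This is not the MPTCP scheduler: the toy_subflow structure,
its usable field and toy_pick_subflow are hypothetical, and a real
scheduler weighs more than these two flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_subflow {
	bool usable;	/* hypothetical: subflow can carry data right now */
	bool backup;	/* priority level: true = backup, false = regular */
};

/*
 * Return the index of the subflow to send on, or -1 if none is usable.
 * Any usable regular subflow wins over every backup subflow.
 */
static int toy_pick_subflow(const struct toy_subflow *sf, size_t n)
{
	int first_backup = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!sf[i].usable)
			continue;
		if (!sf[i].backup)
			return (int)i;
		if (first_backup < 0)
			first_backup = (int)i;
	}

	return first_backup;
}
```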

Link: https://lore.kernel.org/r/20210109004802.341602-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Geliang Tang [Sat, 9 Jan 2021 00:48:02 +0000 (16:48 -0800)]
selftests: mptcp: add the MP_PRIO testcases

This patch adds the MP_PRIO testcases:

Add a new argument bkup to run_tests and do_transfer. It can be set to
"backup" or "nobackup"; the default value is "".

Add a new function chk_prio_nr to check the MP_PRIO related MIB counters.

The output looks like this:

29 single subflow, backup      syn[ ok ] - synack[ ok ] - ack[ ok ]
                               ptx[ ok ] - prx   [ ok ]
30 single address, backup      syn[ ok ] - synack[ ok ] - ack[ ok ]
                               add[ ok ] - echo  [ ok ]
                               ptx[ ok ] - prx   [ ok ]

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Geliang Tang [Sat, 9 Jan 2021 00:48:01 +0000 (16:48 -0800)]
mptcp: add the mibs for MP_PRIO

This patch adds the MIB counters for MP_PRIO: MPTCP_MIB_MPPRIOTX for
transmission of the MP_PRIO suboption, and MPTCP_MIB_MPPRIORX for its
reception.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Geliang Tang [Sat, 9 Jan 2021 00:48:00 +0000 (16:48 -0800)]
selftests: mptcp: add set_flags command in pm_nl_ctl

This patch adds the set_flags command to pm_nl_ctl. Currently only two
flags can be set: backup and nobackup. The set_flags command can be used
like this:

 # pm_nl_ctl set 10.0.0.1 flags backup
 # pm_nl_ctl set 10.0.0.1 flags nobackup

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Geliang Tang [Sat, 9 Jan 2021 00:47:59 +0000 (16:47 -0800)]
mptcp: add set_flags command in PM netlink

This patch adds a new command, MPTCP_PM_CMD_SET_FLAGS, to PM netlink:

In mptcp_nl_cmd_set_flags, parse the input address and derive the backup
value from whether user space set the address's FLAG_BACKUP flag. Then
check whether this address has been added to the local address list. If
it has, call mptcp_nl_addr_backup to deal with this address.

In mptcp_nl_addr_backup, traverse all the existing msk sockets to find
the relevant sockets, and call mptcp_pm_nl_mp_prio_send_ack to send out
a MP_PRIO ACK packet.

Finally in mptcp_nl_cmd_set_flags, set or clear the address's FLAG_BACKUP
flag.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomptcp: add the incoming MP_PRIO support
Geliang Tang [Sat, 9 Jan 2021 00:47:58 +0000 (16:47 -0800)]
mptcp: add the incoming MP_PRIO support

This patch added the incoming MP_PRIO logic:

Added a flag named mp_prio in struct mptcp_options_received, to mark the
MP_PRIO is received, and save the priority value to struct
mptcp_options_received's backup member. Then invoke
mptcp_pm_mp_prio_received with the receiving subsocket and the backup
value.

In mptcp_pm_mp_prio_received, get the subflow context according to the input
subsocket, and set the subflow's backup to the incoming priority value.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomptcp: add the outgoing MP_PRIO support
Geliang Tang [Sat, 9 Jan 2021 00:47:57 +0000 (16:47 -0800)]
mptcp: add the outgoing MP_PRIO support

This patch added the outgoing MP_PRIO logic:

In mptcp_pm_nl_mp_prio_send_ack, find the related subflow and subsocket
according to the input parameter addr. Save the input priority value to
subflow's backup, then set subflow's send_mp_prio flag to true, and save
the input priority value to subflow's request_bkup. Finally, send out a
pure ACK on the related subsocket.

In mptcp_established_options_mp_prio, check whether the subflow's
send_mp_prio is set. If it is, this is the packet for sending MP_PRIO.
So save subflow->request_bkup value to mptcp_out_options's backup, and
change the option type to OPTION_MPTCP_PRIO.

In mptcp_write_options, clear the send_mp_prio flag and send out the
MP_PRIO suboption with mptcp_out_options's backup value.
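
The three steps above can be modelled in user space roughly as follows.
This is only a sketch: the struct layouts, the OPTION_PRIO value and the
function names are simplified stand-ins chosen for illustration, not the
kernel's actual definitions, and no packet is really sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define OPTION_PRIO 0x1	/* hypothetical stand-in for OPTION_MPTCP_PRIO */

struct subflow {		/* simplified subflow context */
	bool backup;
	bool send_mp_prio;
	bool request_bkup;
};

struct out_options {		/* simplified mptcp_out_options */
	unsigned int suboptions;
	bool backup;
};

/* Step 1: record the requested priority and arm the send flag.
 * The real code would also send a pure ACK on the subsocket here.
 */
static void prio_send_ack(struct subflow *sf, bool bkup)
{
	sf->backup = bkup;
	sf->send_mp_prio = true;
	sf->request_bkup = bkup;
}

/* Step 2: when options are built, pick up a pending MP_PRIO request. */
static bool established_options_mp_prio(const struct subflow *sf,
					struct out_options *opts)
{
	if (!sf->send_mp_prio)
		return false;
	opts->backup = sf->request_bkup;
	opts->suboptions = OPTION_PRIO;
	return true;
}

/* Step 3: when options are written, clear the flag and emit the value. */
static bool write_options(struct subflow *sf, const struct out_options *opts,
			  uint8_t *b_flag)
{
	if (!(opts->suboptions & OPTION_PRIO))
		return false;
	sf->send_mp_prio = false;
	*b_flag = opts->backup;
	return true;
}
```

Clearing send_mp_prio in the last step means the suboption goes out once
per request rather than on every subsequent packet.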

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoselftests: mptcp: add testcases for setting the address ID
Geliang Tang [Sat, 9 Jan 2021 00:47:56 +0000 (16:47 -0800)]
selftests: mptcp: add testcases for setting the address ID

Since the address ID can be set from user-space, some of the tests in
pm_netlink.sh will fail. This patch fixes the failures and adds the
testcases for setting the address ID.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agomptcp: add the address ID assignment bitmap
Geliang Tang [Sat, 9 Jan 2021 00:47:55 +0000 (16:47 -0800)]
mptcp: add the address ID assignment bitmap

Currently the address ID set by the netlink PM from user-space is
overridden by the kernel. This patch added the address ID assignment
bitmap to allow user-space to set the address ID.

Use a per netns bitmask id_bitmap (256 bits) to keep track of in-use IDs.
And use next_id to keep track of the highest ID currently in use. If the
user-space provides an ID at endpoint creation time, try to use it. If
already in use, endpoint creation fails. Otherwise pick the first ID
available after the highest currently in use, with wrap-around.
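
A minimal user-space sketch of the allocation policy described above,
assuming ID 0 means "let the allocator choose" and IDs 1..255 are
assignable. The names pm_ids and id_alloc are illustrative only; the
kernel uses its own bitmap helpers and locking, which are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_ADDR_ID	255	/* assumption: 0 is reserved for "unset" */
#define BITS_PER_WORD	64

struct pm_ids {
	uint64_t id_bitmap[(MAX_ADDR_ID + 1) / BITS_PER_WORD]; /* 256 bits */
	unsigned int next_id;	/* highest ID currently in use */
};

static bool id_test(const struct pm_ids *p, unsigned int id)
{
	return (p->id_bitmap[id / BITS_PER_WORD] >> (id % BITS_PER_WORD)) & 1;
}

static void id_set(struct pm_ids *p, unsigned int id)
{
	p->id_bitmap[id / BITS_PER_WORD] |= (uint64_t)1 << (id % BITS_PER_WORD);
}

/* Returns the assigned ID, or -1 if the requested ID is already in use
 * or no ID is free.
 */
static int id_alloc(struct pm_ids *p, unsigned int requested)
{
	unsigned int id = 0, i;

	if (requested > MAX_ADDR_ID)
		return -1;

	if (requested) {
		/* user-space supplied an ID: use it or fail */
		if (id_test(p, requested))
			return -1;
		id = requested;
	} else {
		/* first free ID after next_id, wrapping around past 255 to 1 */
		for (i = 1; i <= MAX_ADDR_ID; i++) {
			unsigned int cand = (p->next_id + i - 1) % MAX_ADDR_ID + 1;

			if (!id_test(p, cand)) {
				id = cand;
				break;
			}
		}
		if (!id)
			return -1;
	}

	id_set(p, id);
	if (id > p->next_id)
		p->next_id = id;
	return (int)id;
}
```

Starting the search after next_id rather than at 1 avoids immediately
reusing a low ID that was released a moment ago.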

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'r8169-small-improvements'
Jakub Kicinski [Sun, 10 Jan 2021 02:07:54 +0000 (18:07 -0800)]
Merge branch 'r8169-small-improvements'

Heiner Kallweit says:

====================
r8169: small improvements

This series includes a number of smaller improvements.

v2:
- return on WARN in patch 1
====================

Link: https://lore.kernel.org/r/938caef4-8a0b-bbbd-66aa-76f758ff877a@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agor8169: don't wakeup-enable device on shutdown if WOL is disabled
Heiner Kallweit [Fri, 8 Jan 2021 12:00:13 +0000 (13:00 +0100)]
r8169: don't wakeup-enable device on shutdown if WOL is disabled

If WOL isn't enabled, then there's no need to enable wakeup from D3
on system shutdown.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agor8169: improve rtl_ocp_reg_failure
Heiner Kallweit [Fri, 8 Jan 2021 11:58:54 +0000 (12:58 +0100)]
r8169: improve rtl_ocp_reg_failure

Use WARN_ONCE here to get a call trace in case of a problem.
This facilitates finding the offending code part.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agor8169: replace BUG_ON with WARN in _rtl_eri_write
Heiner Kallweit [Fri, 8 Jan 2021 11:57:57 +0000 (12:57 +0100)]
r8169: replace BUG_ON with WARN in _rtl_eri_write

Use WARN here to avoid stopping the system. In addition print the addr
and mask values that triggered the warning.

v2:
- return on WARN to avoid an invalid register write
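
The warn-and-return pattern can be sketched in user space as below.
warn_on() is a hypothetical stand-in for the kernel's WARN(), and the
condition checked (unaligned address or empty mask) is an assumption
made for the example rather than something stated in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static uint32_t last_cmd;	/* stands in for the hardware register */

/* Hypothetical stand-in for WARN(): log the values, report the condition. */
static bool warn_on(bool cond, uint32_t addr, uint32_t mask)
{
	if (cond)
		fprintf(stderr, "WARNING: addr: 0x%x, mask: 0x%08x\n",
			(unsigned int)addr, (unsigned int)mask);
	return cond;
}

/* Returns true if the write was issued, false if it was rejected. */
static bool eri_write(uint32_t addr, uint32_t mask)
{
	/* Unlike BUG_ON(), this keeps the system running, and the early
	 * return makes sure the invalid request never reaches the register.
	 */
	if (warn_on((addr & 3) || mask == 0, addr, mask))
		return false;

	last_cmd = mask | addr;
	return true;
}
```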

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dsa: dsa_legacy_fdb_{add,del} can be static
Vladimir Oltean [Fri, 8 Jan 2021 23:30:54 +0000 (01:30 +0200)]
net: dsa: dsa_legacy_fdb_{add,del} can be static

Introduced in commit e93132c94d15 ("net: dsa: Move FDB add/del
implementation inside DSA") in net/dsa/legacy.c, these functions were
moved again to slave.c as part of commit 50a215ac5772 ("net: dsa: Allow
compiling out legacy support"), before actually deleting net/dsa/slave.c
in 18f930e87d80 ("net: dsa: Remove legacy probing support"). Along with
that movement there should have been a deletion of the prototypes from
dsa_priv.h, they are not useful.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210108233054.1222278-1-olteanv@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'dpaa2-mac-various-updates'
Jakub Kicinski [Sun, 10 Jan 2021 00:21:33 +0000 (16:21 -0800)]
Merge branch 'dpaa2-mac-various-updates'

Ioana Ciornei says:

====================
dpaa2-mac: various updates

The first two patches of this series extend the MAC statistics support
to also work for network interfaces which have their link status handled
by firmware (TYPE_FIXED).

The next two patches are fixing a sporadic problem which happens when
the connected DPMAC object is not yet discovered by the fsl-mc bus, thus
the dpaa2-eth is not able to get a reference to it. A deferred probe
will be requested in this case.

Finally, the last two patches make some cosmetic changes, mostly
removing comments and unnecessary checks.

Changes in v2:
 - replaced IS_ERR_OR_NULL() by IS_ERR() in patch 4/6
 - reworded the commit message of patch 6/6
====================

Link: https://lore.kernel.org/r/20210108090727.866283-1-ciorneiioana@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agodpaa2-mac: remove a comment regarding pause settings
Ioana Ciornei [Fri, 8 Jan 2021 09:07:27 +0000 (11:07 +0200)]
dpaa2-mac: remove a comment regarding pause settings

The MC firmware takes these PAUSE/ASYM_PAUSE flags provided by the
driver, transforms them back into rx/tx pause enablement status and
applies them to hardware. We are not losing information by this
transformation, thus remove the comment.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agodpaa2-mac: remove an unnecessary check
Ioana Ciornei [Fri, 8 Jan 2021 09:07:26 +0000 (11:07 +0200)]
dpaa2-mac: remove an unnecessary check

The dpaa2-eth driver has phylink integration only if the connected dpmac
object is in TYPE_PHY (aka the PCS/PHY etc link status is managed by
Linux instead of the firmware). The check is thus unnecessary because
the code path that reaches the .mac_link_up() callback is only with
TYPE_PHY dpmac objects.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agodpaa2-eth: retry the probe when the MAC is not yet discovered on the bus
Ioana Ciornei [Fri, 8 Jan 2021 09:07:25 +0000 (11:07 +0200)]
dpaa2-eth: retry the probe when the MAC is not yet discovered on the bus

The fsl_mc_get_endpoint() function now returns -EPROBE_DEFER when the
dpmac device was not yet discovered by the fsl-mc bus. When this
happens, pass the error code up so that we can retry the probe at a
later time.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobus: fsl-mc: return -EPROBE_DEFER when a device is not yet discovered
Ioana Ciornei [Fri, 8 Jan 2021 09:07:24 +0000 (11:07 +0200)]
bus: fsl-mc: return -EPROBE_DEFER when a device is not yet discovered

The fsl_mc_get_endpoint() should return a pointer to the connected
fsl_mc device, if there is one. By interrogating the MC firmware, we
know if there is an endpoint or not so when the endpoint device is
actually searched on the fsl-mc bus and not found we are hitting the
case in which the device has not yet been discovered by the bus.

Return -EPROBE_DEFER so that callers can differentiate this case.
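
A user-space sketch of how a caller can tell the three outcomes apart.
ERR_PTR(), PTR_ERR() and IS_ERR() are reimplemented here in the style of
the kernel helpers, and get_endpoint() and probe() are hypothetical
simplifications rather than the real fsl-mc functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EPROBE_DEFER	517	/* kernel-internal errno: retry the probe */
#define MAX_ERRNO	4095

/* Minimal re-creations of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct mc_device { int id; };

/* Hypothetical lookup:
 *   NULL           - firmware reports no endpoint at all
 *   -EPROBE_DEFER  - an endpoint exists but is not on the bus yet
 *   valid pointer  - the endpoint device
 */
static struct mc_device *get_endpoint(int fw_has_endpoint,
				      struct mc_device *on_bus)
{
	if (!fw_has_endpoint)
		return NULL;
	if (!on_bus)
		return ERR_PTR(-EPROBE_DEFER);
	return on_bus;
}

/* Hypothetical caller: pass the error up so the probe can be retried. */
static int probe(int fw_has_endpoint, struct mc_device *on_bus)
{
	struct mc_device *ep = get_endpoint(fw_has_endpoint, on_bus);

	if (IS_ERR(ep))
		return (int)PTR_ERR(ep);

	/* NULL is not an error here: carry on without a connected MAC. */
	return 0;
}
```

Because NULL remains a valid "no endpoint" answer, the caller checks
IS_ERR() and not IS_ERR_OR_NULL().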

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Acked-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agodpaa2-mac: export MAC counters even when in TYPE_FIXED
Ioana Ciornei [Fri, 8 Jan 2021 09:07:23 +0000 (11:07 +0200)]
dpaa2-mac: export MAC counters even when in TYPE_FIXED

If the network interface object is connected to a MAC of TYPE_FIXED, the
link status management is handled exclusively by the firmware. This does
not mean that the driver cannot access the MAC counters and export them
in ethtool.

For this to happen, we open the attached dpmac device and keep a pointer
to it in priv->mac. Because of this, all the checks in the driver of the
following form 'if (priv->mac)' have to be updated to actually check
the dpmac attribute and not rely on the presence of a non-NULL value.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agodpaa2-mac: split up initializing the MAC object from connecting to it
Ioana Ciornei [Fri, 8 Jan 2021 09:07:22 +0000 (11:07 +0200)]
dpaa2-mac: split up initializing the MAC object from connecting to it

Split up the initialization phase of the dpmac object from actually
configuring the phylink instance, connecting to it and configuring the
MAC. This is done so that even though the dpni object is connected to a
dpmac which has link management handled by the firmware we are still
able to export the MAC counters.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'net-gro-gro_drop-deprecation'
Jakub Kicinski [Sat, 9 Jan 2021 22:24:29 +0000 (14:24 -0800)]
Merge branch 'net-gro-gro_drop-deprecation'

Eric Dumazet says:

====================
net-gro: GRO_DROP deprecation

GRO_DROP has no practical use and can be removed,
once ice driver is cleaned up.

This removes one useless conditional test in napi_gro_frags().
====================

Link: https://lore.kernel.org/r/20210108113903.3779510-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet-gro: remove GRO_DROP
Eric Dumazet [Fri, 8 Jan 2021 11:39:03 +0000 (03:39 -0800)]
net-gro: remove GRO_DROP

GRO_DROP can only be returned from napi_gro_frags()
if the skb has not been allocated by a prior napi_get_frags()

Since drivers must use napi_get_frags() and test its result
before populating the skb with metadata, we can safely remove
GRO_DROP since it offers no practical use.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoice: drop dead code in ice_receive_skb()
Eric Dumazet [Fri, 8 Jan 2021 11:39:02 +0000 (03:39 -0800)]
ice: drop dead code in ice_receive_skb()

napi_gro_receive() can never return GRO_DROP

GRO_DROP can only be returned from napi_gro_frags()
which is the other NAPI GRO entry point.

Followup patch will remove GRO_DROP, because drivers
are not supposed to call napi_gro_frags() if prior
napi_get_frags() has failed.

Note that I have left the gro_dropped variable. I leave to ice
maintainers the decision to further remove it from ethtool -S results.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: bridge: fix misspellings using codespell tool
Menglong Dong [Fri, 8 Jan 2021 02:53:32 +0000 (18:53 -0800)]
net: bridge: fix misspellings using codespell tool

Some typos are found out by codespell tool:

$ codespell ./net/bridge/
./net/bridge/br_stp.c:604: permanant  ==> permanent
./net/bridge/br_stp.c:605: persistance  ==> persistence
./net/bridge/br.c:125: underlaying  ==> underlying
./net/bridge/br_input.c:43: modue  ==> mode
./net/bridge/br_mrp.c:828: Determin  ==> Determine
./net/bridge/br_mrp.c:848: Determin  ==> Determine
./net/bridge/br_mrp.c:897: Determin  ==> Determine

Fix typos found by codespell.

Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20210108025332.52480-1-dong.menglong@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'net-ipa-support-compile_test'
Jakub Kicinski [Sat, 9 Jan 2021 21:51:39 +0000 (13:51 -0800)]
Merge branch 'net-ipa-support-compile_test'

Alex Elder says:

====================
net: ipa: support COMPILE_TEST

This series adds the IPA driver as a possible target when
the COMPILE_TEST configuration is enabled.  Two small changes to
dependent subsystems needed to be made for this to work.

Version 2 of this series adds one more patch, which adds the
declaration of struct page to "gsi_trans.h".  The Intel kernel test
robot reported that this was a problem for the alpha build.
====================

Link: https://lore.kernel.org/r/20210107233404.17030-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: support COMPILE_TEST
Alex Elder [Thu, 7 Jan 2021 23:34:04 +0000 (17:34 -0600)]
net: ipa: support COMPILE_TEST

Arrange for the IPA driver to be built when COMPILE_TEST is enabled.

Update the help text to reflect that we support two Qualcomm SoCs.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ipa: declare the page pointer type in "gsi_trans.h"
Alex Elder [Thu, 7 Jan 2021 23:34:03 +0000 (17:34 -0600)]
net: ipa: declare the page pointer type in "gsi_trans.h"

The second argument to gsi_trans_page_add() is a page pointer.
That declaration is found in header files used by "gsi_trans.h" for
(at least) arm64 and x86 builds, but apparently not for alpha
builds.

Fix this by adding a declaration of struct page to the top of
"gsi_trans.h".

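As a sketch of why a forward declaration is enough: a prototype that
only passes a pointer needs the struct's name, not its layout. The
function trans_page_add() and the struct page contents below are
invented for illustration and do not match the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Forward declaration: makes "struct page *" usable in prototypes
 * without pulling in the header that defines the structure.
 */
struct page;

/* Hypothetical prototype; the second argument is a page pointer. */
int trans_page_add(void *trans, struct page *page, unsigned int size);

/* The full definition is only needed where members are accessed. */
struct page {
	unsigned long flags;
};

int trans_page_add(void *trans, struct page *page, unsigned int size)
{
	(void)trans;
	if (!page || !size)
		return -1;
	page->flags |= 1;
	return 0;
}
```

Without the forward declaration, a compiler that has not yet seen
struct page treats the name in the prototype as a new type scoped to
that parameter list, which typically produces a warning and
incompatible-pointer errors at the call sites.
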
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agosoc: qcom: mdt_loader: define stubs for COMPILE_TEST
Alex Elder [Thu, 7 Jan 2021 23:34:02 +0000 (17:34 -0600)]
soc: qcom: mdt_loader: define stubs for COMPILE_TEST

Define stub functions for the exposed MDT functions in case
QCOM_MDT_LOADER is not configured.  This allows users of these
functions to link correctly for COMPILE_TEST builds without
QCOM_SCM enabled.
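
The stub pattern looks roughly like this in a header. CONFIG_DEMO_LOADER,
demo_get_size() and demo_user_probe() are made-up names for the sketch,
and the choice of -ENODEV as the stub's return value is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Leave CONFIG_DEMO_LOADER undefined to model the option being off. */

#ifdef CONFIG_DEMO_LOADER
/* Real implementation lives in the loader's own source file. */
long demo_get_size(const void *fw);
#else
/* Stub: lets callers compile and link when the loader is not built. */
static inline long demo_get_size(const void *fw)
{
	(void)fw;
	return -ENODEV;
}
#endif

/* A user of the API builds either way and just sees an error at run time. */
static int demo_user_probe(const void *fw)
{
	long size = demo_get_size(fw);

	return size < 0 ? (int)size : 0;
}
```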

Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoremoteproc: qcom: expose types for COMPILE_TEST
Alex Elder [Thu, 7 Jan 2021 23:34:01 +0000 (17:34 -0600)]
remoteproc: qcom: expose types for COMPILE_TEST

Stub functions are defined for SSR notifier registration in case
QCOM_RPROC_COMMON is not configured.  As a result, code that uses
these functions can link successfully even if the common remoteproc
code is not built.

Code that registers an SSR notifier function likely needs the
types defined in "qcom_rproc.h", but those are only exposed if
QCOM_RPROC_COMMON is enabled.

Rearrange the conditional definition so the qcom_ssr_notify_data
structure and qcom_ssr_notify_type enumerated type are defined
whether or not QCOM_RPROC_COMMON is enabled.

Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoibmvnic: merge do_change_param_reset into do_reset
Lijun Pan [Wed, 6 Jan 2021 21:35:14 +0000 (15:35 -0600)]
ibmvnic: merge do_change_param_reset into do_reset

Commit 5c4e7d3cf3c7 ("net/ibmvnic: unlock rtnl_lock in reset so
linkwatch_event can run") introduced the do_change_param_reset function to
solve the rtnl lock issue. The majority of the code in do_change_param_reset
duplicates do_reset. Also, we can handle the rtnl lock issue in do_reset
itself. Hence merge do_change_param_reset back into do_reset to clean up
the code.

Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Link: https://lore.kernel.org/r/20210106213514.76027-1-ljp@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoppp: clean up endianness conversions
Julian Wiedmann [Thu, 7 Jan 2021 14:39:56 +0000 (15:39 +0100)]
ppp: clean up endianness conversions

sparse complains about some harmless endianness issues:

> drivers/net/ppp/pptp.c:281:21: warning: incorrect type in assignment (different base types)
> drivers/net/ppp/pptp.c:281:21:    expected unsigned int [usertype] ack
> drivers/net/ppp/pptp.c:281:21:    got restricted __be32
> drivers/net/ppp/pptp.c:283:23: warning: cast to restricted __be32

Here 'ack' is assigned a value in network-order, and then also the
byte-swapped value in host-order. Clean this up by doing the byte-swap
as part of the assignment.

> drivers/net/ppp/pptp.c:358:26: warning: cast from restricted __be16
> drivers/net/ppp/pptp.c:358:26: warning: incorrect type in argument 1 (different base types)
> drivers/net/ppp/pptp.c:358:26:    expected unsigned short [usertype] call_id
> drivers/net/ppp/pptp.c:358:26:    got restricted __be16 [usertype]

Here we use the wrong flavour of byte-swap. Use ntohs(), which of course
gives the same result.
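
Both points can be illustrated in user space with the standard
<arpa/inet.h> helpers. read_ack() and swaps_agree() are names made up
for this sketch; the driver code itself is not reproduced here.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Convert as part of the assignment, so the host-order variable never
 * holds a network-order value.
 */
static uint32_t read_ack(const unsigned char *pkt)
{
	uint32_t wire;

	memcpy(&wire, pkt, sizeof(wire));
	return ntohl(wire);
}

/* htons() and ntohs() perform the same operation on a given machine;
 * they differ only in the direction they document (and, in the kernel,
 * in the types sparse checks).
 */
static int swaps_agree(uint16_t v)
{
	return htons(v) == ntohs(v);
}
```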

Cc: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Link: https://lore.kernel.org/r/20210107143956.25549-1-jwi@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: ip_tunnel: clean up endianness conversions
Julian Wiedmann [Thu, 7 Jan 2021 14:40:08 +0000 (15:40 +0100)]
net: ip_tunnel: clean up endianness conversions

sparse complains about some harmless endianness issues:

> net/ipv4/ip_tunnel_core.c:225:43: warning: cast to restricted __be16
> net/ipv4/ip_tunnel_core.c:225:43: warning: incorrect type in initializer (different base types)
> net/ipv4/ip_tunnel_core.c:225:43:    expected restricted __be16 [usertype] mtu
> net/ipv4/ip_tunnel_core.c:225:43:    got unsigned short [usertype]

iptunnel_pmtud_build_icmp() uses the wrong flavour of byte-order conversion
when storing the MTU into the ICMPv4 packet. Use htons(), just like
iptunnel_pmtud_build_icmpv6() does.
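
For the store direction, a small user-space sketch: a host-order MTU is
converted with htons() before being written into the packet. put_mtu()
is a hypothetical helper, not a function from ip_tunnel_core.c.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Write a host-order MTU into a 16-bit big-endian packet field. */
static void put_mtu(unsigned char *field, uint16_t mtu)
{
	uint16_t wire = htons(mtu);	/* host order to network order */

	memcpy(field, &wire, sizeof(wire));
}
```

An MTU of 1500 (0x05dc) ends up as the bytes 0x05 0xdc regardless of
the host's endianness.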

> net/ipv4/ip_tunnel_core.c:248:35: warning: cast from restricted __be16
> net/ipv4/ip_tunnel_core.c:248:35: warning: incorrect type in argument 3 (different base types)
> net/ipv4/ip_tunnel_core.c:248:35:    expected unsigned short type
> net/ipv4/ip_tunnel_core.c:248:35:    got restricted __be16 [usertype]
> net/ipv4/ip_tunnel_core.c:341:35: warning: cast from restricted __be16
> net/ipv4/ip_tunnel_core.c:341:35: warning: incorrect type in argument 3 (different base types)
> net/ipv4/ip_tunnel_core.c:341:35:    expected unsigned short type
> net/ipv4/ip_tunnel_core.c:341:35:    got restricted __be16 [usertype]

eth_header() wants the Ethertype in host-order, use the correct flavour of
byte-order conversion.

> net/ipv4/ip_tunnel_core.c:600:45: warning: restricted __be16 degrades to integer
> net/ipv4/ip_tunnel_core.c:609:30: warning: incorrect type in assignment (different base types)
> net/ipv4/ip_tunnel_core.c:609:30:    expected int type
> net/ipv4/ip_tunnel_core.c:609:30:    got restricted __be16 [usertype]
> net/ipv4/ip_tunnel_core.c:619:30: warning: incorrect type in assignment (different base types)
> net/ipv4/ip_tunnel_core.c:619:30:    expected int type
> net/ipv4/ip_tunnel_core.c:619:30:    got restricted __be16 [usertype]
> net/ipv4/ip_tunnel_core.c:629:30: warning: incorrect type in assignment (different base types)
> net/ipv4/ip_tunnel_core.c:629:30:    expected int type
> net/ipv4/ip_tunnel_core.c:629:30:    got restricted __be16 [usertype]

The TUNNEL_* types are big-endian, so adjust the type of the local
variable in ip_tun_parse_opts().

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Link: https://lore.kernel.org/r/20210107144008.25777-1-jwi@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMAINTAINERS: add bgmac section entry
Rafał Miłecki [Thu, 7 Jan 2021 18:00:51 +0000 (19:00 +0100)]
MAINTAINERS: add bgmac section entry

This driver has existed for years but was missing its MAINTAINERS entry.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210107180051.1542-3-zajec5@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: broadcom: share header defining UniMAC registers
Rafał Miłecki [Thu, 7 Jan 2021 18:00:50 +0000 (19:00 +0100)]
net: broadcom: share header defining UniMAC registers

UniMAC is integrated into multiple Broadcom Ethernet controllers, so
use a shared header file for it and avoid some code duplication.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Doug Berger <opendmb@gmail.com>
Link: https://lore.kernel.org/r/20210107180051.1542-2-zajec5@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agobgmac: add bgmac_umac_*() helpers for accessing UniMAC registers
Rafał Miłecki [Thu, 7 Jan 2021 18:00:49 +0000 (19:00 +0100)]
bgmac: add bgmac_umac_*() helpers for accessing UniMAC registers

UniMAC is a hardware block commonly used in Broadcom Ethernet controllers
that should get its own header file. Not every controller has it mapped at
the 0x800 offset so add bgmac access helpers. They will allow using
shared register defines.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210107180051.1542-1-zajec5@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'update-register-bit-definitions-in-the-etheravb-driver'
Jakub Kicinski [Sat, 9 Jan 2021 02:38:33 +0000 (18:38 -0800)]
Merge branch 'update-register-bit-definitions-in-the-etheravb-driver'

Sergey Shtylyov says:

====================
Update register/bit definitions in the EtherAVB driver

Here are 2 patches against DaveM's 'net-next' repo.
I'm updating the driver to match the recent R-Car gen2/3 manuals.
====================

Link: https://lore.kernel.org/r/6aef8856-4bf5-1512-2ad4-62af05f00cc6@omprussia.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoravb: update "undocumented" annotations
Sergey Shtylyov [Wed, 6 Jan 2021 20:32:29 +0000 (23:32 +0300)]
ravb: update "undocumented" annotations

The "undocumented" annotations in the EtherAVB driver were done against
the R-Car gen2 manuals; most of these registers/bits were then described
in the R-Car gen3 manuals -- reflect this fact in the annotations (note
that ECSIPR.LCHNGIP was documented in the recent R-Car gen2 manual)...

Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoravb: remove APSR_DM
Sergey Shtylyov [Wed, 6 Jan 2021 20:31:37 +0000 (23:31 +0300)]
ravb: remove APSR_DM

According to the R-Car Series, 3rd Generation User's Manual: Hardware,
Rev. 1.50, there's no APSR.DM field, instead there are 2 independent
RX/TX clock internal delay bits. Follow suit: remove #define APSR_DM
and rename #define's APSR_DM_{R|T}DM to APSR_{R|T}DM.

While at it, do several more things to the declaration of *enum* APSR_BIT:
- remove superfluous indentation;
- annotate APSR_MEMS as undocumented;
- annotate APSR as R-Car Gen3 only.

Signed-off-by: Sergey Shtylyov <s.shtylyov@omprussia.ru>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski [Fri, 8 Jan 2021 21:28:00 +0000 (13:28 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Trivial conflict in CAN on file rename.

Conflicts:
drivers/net/can/m_can/tcan4x5x-core.c

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge tag 'net-5.11-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds [Fri, 8 Jan 2021 20:12:30 +0000 (12:12 -0800)]
Merge tag 'net-5.11-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull more networking fixes from Jakub Kicinski:
 "Slightly lighter pull request to get back into the Thursday cadence.

  Current release - always broken:

   - can: mcp251xfd: fix Tx/Rx ring buffer driver race conditions

   - dsa: hellcreek: fix led_classdev build errors

  Previous releases - regressions:

   - ipv6: fib: flush exceptions when purging route to avoid netdev
     reference leak

   - ip_tunnels: fix pmtu check in nopmtudisc mode

   - ip: always refragment ip defragmented packets to avoid MTU issues
     when forwarding through tunnels, correct "packet too big" message
     is prohibitively tricky to generate

   - s390/qeth: fix locking for discipline setup / removal and during
     recovery to prevent both deadlocks and races

   - mlx5: Use port_num 1 instead of 0 when delete a RoCE address

  Previous releases - always broken:

   - cdc_ncm: correct overhead calculation in delayed_ndp_size to
     prevent out of bound accesses with Huawei 909s-120 LTE module

   - fix stmmac dwmac-sun8i suspend/resume:
           - PHY being left powered off
           - MAC syscon configuration being reset
           - reference to the reset controller being improperly dropped

   - qrtr: fix null-ptr-deref in qrtr_ns_remove

   - can: tcan4x5x: fix bittiming const, use common bittiming from m_can
     driver

   - mlx5e: CT: Use per flow counter when CT flow accounting is enabled

   - mlx5e: Fix SWP offsets when vlan inserted by driver

  Misc:

   - bpf: Fix a task_iter bug caused by a bpf -> net merge conflict
     resolution

  And the usual many fixes to various error paths"

* tag 'net-5.11-rc3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (69 commits)
  net: dsa: lantiq_gswip: Exclude RMII from modes that report 1 GbE
  s390/qeth: fix L2 header access in qeth_l3_osa_features_check()
  s390/qeth: fix locking for discipline setup / removal
  s390/qeth: fix deadlock during recovery
  selftests: fib_nexthops: Fix wrong mausezahn invocation
  nexthop: Bounce NHA_GATEWAY in FDB nexthop groups
  nexthop: Unlink nexthop group entry in error path
  nexthop: Fix off-by-one error in error path
  octeontx2-af: fix memory leak of lmac and lmac->name
  chtls: Fix chtls resources release sequence
  chtls: Added a check to avoid NULL pointer dereference
  chtls: Replace skb_dequeue with skb_peek
  chtls: Avoid unnecessary freeing of oreq pointer
  chtls: Fix panic when route to peer not configured
  chtls: Remove invalid set_tcb call
  chtls: Fix hardware tid leak
  net: ip: always refragment ip defragmented packets
  net: fix pmtu check in nopmtudisc mode
  selftests: netfilter: add selftest for ipip pmtu discovery with enabled connection tracking
  docs: octeontx2: tune rst markup
  ...

3 years agoMerge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Fri, 8 Jan 2021 20:05:11 +0000 (12:05 -0800)]
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "This fixes a functional bug in arm/chacha-neon as well as a potential
  buffer overflow in ecdh"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: ecdh - avoid buffer overflow in ecdh_set_secret()
  crypto: arm/chacha-neon - add missing counter increment

3 years agopoll: fix performance regression due to out-of-line __put_user()
Linus Torvalds [Thu, 7 Jan 2021 17:43:54 +0000 (09:43 -0800)]
poll: fix performance regression due to out-of-line __put_user()

The kernel test robot reported a -5.8% performance regression on the
"poll2" test of will-it-scale, and bisected it to commit 90ecb7b4941f
("x86: Make __put_user() generate an out-of-line call").

I didn't expect an out-of-line __put_user() to matter, because no normal
core code should use that non-checking legacy version of user access any
more.  But I had overlooked the very odd poll() usage, which does a
__put_user() to update the 'revents' values of the poll array.

Now, Al Viro correctly points out that instead of updating just the
'revents' field, it would be much simpler to just copy the _whole_
pollfd entry, and then we could just use "copy_to_user()" on the whole
array of entries, the same way we use "copy_from_user()" a few lines
earlier to get the original values.

But that is not what we've traditionally done, and I worry that threaded
applications might be concurrently modifying the other fields of the
pollfd array.  So while Al's suggestion is simpler - and perhaps worth
trying in the future - this instead keeps the "just update revents"
model.

To fix the performance regression, use the modern "unsafe_put_user()"
instead of __put_user(), with the proper "user_write_access_begin()"
guarding in place. This improves code generation enormously.

Link: https://lore.kernel.org/lkml/20210107134723.GA28532@xsang-OptiPlex-9020/
Reported-by: kernel test robot <oliver.sang@intel.com>
Tested-by: Oliver Sang <oliver.sang@intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Laight <David.Laight@aculab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoRevert "init/console: Use ttynull as a fallback when there is no console"
Petr Mladek [Fri, 8 Jan 2021 11:48:47 +0000 (12:48 +0100)]
Revert "init/console: Use ttynull as a fallback when there is no console"

This reverts commit d04fae0d6ff3138fd6bc71ee17f41f851cdd1409.

The commit caused ttynull to be used as the default console
on several systems[1][2][3]. As a result, the console was
blank even when a better alternative existed.

It happened when there was no console configured
on the command line and ttynull_init() was the first initcall
calling register_console().

Or it happened when /dev/ did not exist when console_on_rootfs()
was called. It was not able to open /dev/console even though
a console driver was registered. It tried to add ttynull console
but it obviously did not help. But ttynull became the preferred
console and was used by /dev/console when it was available later.

The commit tried to fix a historical problem that has been there
for ages. The primary motivation was the commit 6e2c2274f6d4661ea78
("printk/console: Allow to disable console output by using console=""
 or console=null"). It provided a clean solution for a workaround
 that was widely used and worked only by chance.

With this revert, the console="" or console=null command line
options will again work only by chance. These options cause
a particular console to be preferred, so the default (tty) ones
do not get enabled. No console will be registered at
all. As a result there won't be stdin, stdout, and stderr for
the init process. But it worked exactly this way even before.

The proper solution has to fulfill many conditions:

  + Register ttynull only when explicitly required or as
    the ultimate fallback.

  + ttynull should get associated with /dev/console but it must
    not become preferred console when used as a fallback.
    Especially, it must still be possible to replace it
    by a better console later.

Such a change requires clean up of the register_console() code.
Otherwise, it would be even harder to follow. Especially, the use
of has_preferred_console and CON_CONSDEV flag is tricky. The clean
up is risky. The ordering of consoles is not well defined. And
any changes tend to break existing user settings.

Do the revert as the least risky solution for now.

[1] https://lore.kernel.org/linux-kselftest/20201221144302.GR4077@smile.fi.intel.com/
[2] https://lore.kernel.org/lkml/d2a3b3c0-e548-7dd1-730f-59bc5c04e191@synopsys.com/
[3] https://patchwork.ozlabs.org/project/linux-um/patch/20210105120128.10854-1-thomas@m3y3r.de/

Reported-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reported-by: Vineet Gupta <vgupta@synopsys.com>
Reported-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
3 years agoMerge tag 'mlx5-fixes-2021-01-07' of git://git.kernel.org/pub/scm/linux/kernel/git...
Jakub Kicinski [Fri, 8 Jan 2021 03:13:29 +0000 (19:13 -0800)]
Merge tag 'mlx5-fixes-2021-01-07' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2021-01-07

* tag 'mlx5-fixes-2021-01-07' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: Fix memleak in mlx5e_create_l2_table_groups
  net/mlx5e: Fix two double free cases
  net/mlx5: Release devlink object if adev fails
  net/mlx5e: ethtool, Fix restriction of autoneg with 56G
  net/mlx5e: In skb build skip setting mark in switchdev mode
  net/mlx5: E-Switch, fix changing vf VLANID
  net/mlx5e: Fix SWP offsets when vlan inserted by driver
  net/mlx5e: CT: Use per flow counter when CT flow accounting is enabled
  net/mlx5: Use port_num 1 instead of 0 when delete a RoCE address
  net/mlx5e: Add missing capability check for uplink follow
  net/mlx5: Check if lag is supported before creating one
====================

Link: https://lore.kernel.org/r/20210107202845.470205-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: dsa: lantiq_gswip: Exclude RMII from modes that report 1 GbE
Aleksander Jan Bajkowski [Thu, 7 Jan 2021 19:58:18 +0000 (20:58 +0100)]
net: dsa: lantiq_gswip: Exclude RMII from modes that report 1 GbE

Exclude RMII from modes that report 1 GbE support. Reduced MII supports
up to 100 MbE.
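
As a minimal sketch of the idea, assuming a simplified mode list: the
enum and the function name below are hypothetical and are not the
driver's own identifiers. The point is only that the 100 Mbit/s
interface modes are filtered out before 1 GbE is advertised.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified set of interface modes for illustration. */
enum demo_if_mode {
    DEMO_MODE_MII,
    DEMO_MODE_RMII,
    DEMO_MODE_RGMII,
};

/* MII and RMII top out at 100 Mbit/s, so they must not report 1 GbE. */
static bool demo_mode_supports_1gbe(enum demo_if_mode mode)
{
    switch (mode) {
    case DEMO_MODE_MII:
    case DEMO_MODE_RMII:
        return false;
    default:
        return true;
    }
}
```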

Fixes: 82a04c5f5814 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200")
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20210107195818.3878-1-olek2@wp.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 's390-qeth-fixes-2021-01-07'
Jakub Kicinski [Fri, 8 Jan 2021 02:54:08 +0000 (18:54 -0800)]
Merge branch 's390-qeth-fixes-2021-01-07'

Julian Wiedmann says:

====================
s390/qeth: fixes 2021-01-07

This brings two locking fixes for the device control path.
Also one fix for a path where our .ndo_features_check() attempts to
access a non-existent L2 header.
====================

Link: https://lore.kernel.org/r/20210107172442.1737-1-jwi@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agos390/qeth: fix L2 header access in qeth_l3_osa_features_check()
Julian Wiedmann [Thu, 7 Jan 2021 17:24:42 +0000 (18:24 +0100)]
s390/qeth: fix L2 header access in qeth_l3_osa_features_check()

ip_finish_output_gso() may call .ndo_features_check() even before the
skb has a L2 header. This conflicts with qeth_get_ip_version()'s attempt
to inspect the L2 header via vlan_eth_hdr().

Switch to vlan_get_protocol(), as already used further down in the
common qeth_features_check() path.

Fixes: 22ea4cac8dc7 ("s390/qeth: run non-offload L3 traffic over common xmit path")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agos390/qeth: fix locking for discipline setup / removal
Julian Wiedmann [Thu, 7 Jan 2021 17:24:41 +0000 (18:24 +0100)]
s390/qeth: fix locking for discipline setup / removal

Due to insufficient locking, qeth_core_set_online() and
qeth_dev_layer2_store() can run in parallel, both attempting to load &
setup the discipline (and stepping on each other's toes along the way).
A similar race can also occur between qeth_core_remove_device() and
qeth_dev_layer2_store().

Access to .discipline is meant to be protected by the discipline_mutex,
so add/expand the locking in qeth_core_remove_device() and
qeth_core_set_online().
Adjust the locking in qeth_l*_remove_device() accordingly, as it's now
handled by the callers in a consistent manner.

Based on an initial patch by Ursula Braun.

Fixes: 5ade8c676c59 ("qeth: serialize sysfs-triggered device configurations")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agos390/qeth: fix deadlock during recovery
Julian Wiedmann [Thu, 7 Jan 2021 17:24:40 +0000 (18:24 +0100)]
s390/qeth: fix deadlock during recovery

When qeth_dev_layer2_store() - holding the discipline_mutex - waits
inside qeth_l*_remove_device() for a qeth_do_reset() thread to complete,
we can hit a deadlock if qeth_do_reset() concurrently calls
qeth_set_online() and thus tries to acquire the discipline_mutex.

Move the discipline_mutex locking outside of qeth_set_online() and
qeth_set_offline(), and turn the discipline into a parameter so that
callers understand the dependency.

To fix the deadlock, we can now relax the locking:
As already established, qeth_l*_remove_device() waits for
qeth_do_reset() to complete. So qeth_do_reset() itself is under no risk
of having card->discipline ripped out while it's running, and thus
doesn't need to take the discipline_mutex.
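
The locking change can be pictured with a toy, single-threaded model.
Everything prefixed demo_ is hypothetical, and the "wait for the reset
thread" step is modelled as a plain function call. The try-lock returns
false where a real non-recursive mutex would block forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy non-recursive lock: a second lock attempt while held cannot succeed. */
struct demo_mutex { bool held; };

static bool demo_trylock(struct demo_mutex *m)
{
    if (m->held)
        return false;   /* a real mutex_lock() would deadlock here */
    m->held = true;
    return true;
}

static void demo_unlock(struct demo_mutex *m) { m->held = false; }

struct demo_card {
    struct demo_mutex discipline_mutex;
    bool online;
};

/* After the change: takes no lock itself; locking is the caller's job. */
static void demo_set_online(struct demo_card *card) { card->online = true; }

/* Reset path: the remover waits for it, so it runs without the mutex. */
static void demo_do_reset(struct demo_card *card)
{
    card->online = false;
    demo_set_online(card);
}

/* Sysfs-style path: holds the mutex while waiting for the reset. */
static bool demo_layer2_store(struct demo_card *card)
{
    if (!demo_trylock(&card->discipline_mutex))
        return false;
    demo_do_reset(card);    /* stands in for "wait for the reset thread" */
    demo_unlock(&card->discipline_mutex);
    return true;
}
```

Had demo_set_online() still taken the mutex itself, the nested attempt
inside demo_layer2_store() would fail in this model, which corresponds
to the deadlock described above.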

Fixes: 5ade8c676c59 ("qeth: serialize sysfs-triggered device configurations")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'nexthop-various-fixes'
Jakub Kicinski [Fri, 8 Jan 2021 02:47:21 +0000 (18:47 -0800)]
Merge branch 'nexthop-various-fixes'

Ido Schimmel says:

====================
nexthop: Various fixes

This series contains various fixes for the nexthop code. The bugs were
uncovered during the development of resilient nexthop groups.

Patches #1-#2 fix the error path of nexthop_create_group(). I was not
able to trigger these bugs with current code, but it is possible with
the upcoming resilient nexthop groups code which adds a user
controllable memory allocation further in the function.

Patch #3 fixes wrong validation of netlink attributes.

Patch #4 fixes wrong invocation of mausezahn in a selftest.
====================

Link: https://lore.kernel.org/r/20210107144824.1135691-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoselftests: fib_nexthops: Fix wrong mausezahn invocation
Ido Schimmel [Thu, 7 Jan 2021 14:48:24 +0000 (16:48 +0200)]
selftests: fib_nexthops: Fix wrong mausezahn invocation

For IPv6 traffic, mausezahn needs to be invoked with '-6'. Otherwise an
error is returned:

 # ip netns exec me mausezahn veth1 -B 2001:db8:101::2 -A 2001:db8:91::1 -c 0 -t tcp "dp=1-1023, flags=syn"
 Failed to set source IPv4 address. Please check if source is set to a valid IPv4 address.
  Invalid command line parameters!

Fixes: 7fd6561585bc ("selftests: Add torture tests to nexthop tests")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonexthop: Bounce NHA_GATEWAY in FDB nexthop groups
Petr Machata [Thu, 7 Jan 2021 14:48:23 +0000 (16:48 +0200)]
nexthop: Bounce NHA_GATEWAY in FDB nexthop groups

The function nh_check_attr_group() is called to validate nexthop groups.
The intention of that code seems to have been to bounce all attributes
above NHA_GROUP_TYPE except for NHA_FDB. However instead it bounces all
these attributes except when NHA_FDB attribute is present--then it accepts
them.

NHA_FDB validation that takes place before, in rtm_to_nh_config(), already
bounces NHA_OIF, NHA_BLACKHOLE, NHA_ENCAP and NHA_ENCAP_TYPE. Yet further
back, NHA_GROUPS and NHA_MASTER are bounced unconditionally.

But that still leaves NHA_GATEWAY as an attribute that would be accepted in
FDB nexthop groups (with no meaning), so long as it keeps the address
family as unspecified:

 # ip nexthop add id 1 fdb via 127.0.0.1
 # ip nexthop add id 10 fdb via default group 1

The nexthop code is still relatively new and likely not used very broadly,
and the FDB bits are newer still. Even though there is a reproducer out
there, it relies on improbable gateway arguments "via default", "via
all" or "via any". Given all this, I believe it is OK to reformulate the
condition to do the right thing and bounce NHA_GATEWAY.
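
The difference between the two conditions can be sketched in user
space. The enum below is an illustrative, shortened attribute list and
both function names are hypothetical. The "old" variant models the
behaviour described above, the "new" one the reformulated condition.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative attribute numbering, shortened for the example. */
enum demo_nha {
    DEMO_NHA_UNSPEC,
    DEMO_NHA_ID,
    DEMO_NHA_GROUP,
    DEMO_NHA_GROUP_TYPE,
    DEMO_NHA_BLACKHOLE,
    DEMO_NHA_OIF,
    DEMO_NHA_GATEWAY,
    DEMO_NHA_FDB,
    DEMO_NHA_COUNT,
};

/* present[i] is true when attribute i was supplied in the request. */
static int demo_check_group_old(const bool *present)
{
    for (int i = DEMO_NHA_GROUP_TYPE + 1; i < DEMO_NHA_COUNT; i++) {
        if (!present[i])
            continue;
        if (present[DEMO_NHA_FDB])  /* bug: FDB's presence excuses all */
            continue;
        return -EINVAL;
    }
    return 0;
}

static int demo_check_group_new(const bool *present)
{
    for (int i = DEMO_NHA_GROUP_TYPE + 1; i < DEMO_NHA_COUNT; i++) {
        if (!present[i])
            continue;
        if (i == DEMO_NHA_FDB)      /* only the FDB attribute is allowed */
            continue;
        return -EINVAL;
    }
    return 0;
}
```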

Fixes: 135a256ce034 ("nexthop: support for fdb ecmp nexthops")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonexthop: Unlink nexthop group entry in error path
Ido Schimmel [Thu, 7 Jan 2021 14:48:22 +0000 (16:48 +0200)]
nexthop: Unlink nexthop group entry in error path

In case of error, remove the nexthop group entry from the list to which
it was previously added.

Fixes: 6e9d1ef54f91 ("nexthop: Add support for nexthop groups")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonexthop: Fix off-by-one error in error path
Ido Schimmel [Thu, 7 Jan 2021 14:48:21 +0000 (16:48 +0200)]
nexthop: Fix off-by-one error in error path

A reference was not taken for the current nexthop entry, so do not try
to put it in the error path.
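
A minimal sketch of the unwind rule, with hypothetical demo_ names and
a can_get flag that simulates a failing entry: when taking the
reference for entry i fails, only entries 0..i-1 are released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_entry {
    int refcnt;
    bool can_get;   /* false simulates a failure for this entry */
};

static bool demo_get(struct demo_entry *e)
{
    if (!e->can_get)
        return false;
    e->refcnt++;
    return true;
}

static void demo_put(struct demo_entry *e) { e->refcnt--; }

/* Take a reference on every entry; on failure drop only those taken. */
static int demo_create_group(struct demo_entry *entries, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (!demo_get(&entries[i]))
            goto err_unwind;
    }
    return 0;

err_unwind:
    /* Entry i never got a reference, so the unwind starts at i - 1. */
    while (i-- > 0)
        demo_put(&entries[i]);
    return -1;
}
```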

Fixes: 6e9d1ef54f91 ("nexthop: Add support for nexthop groups")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoocteontx2-af: fix memory leak of lmac and lmac->name
Colin Ian King [Thu, 7 Jan 2021 12:39:16 +0000 (12:39 +0000)]
octeontx2-af: fix memory leak of lmac and lmac->name

Currently the error return paths don't kfree lmac and lmac->name,
leading to memory leaks.  Fix this by adding two error return
paths that kfree these objects.
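
The two-label cleanup pattern might look roughly like this in user
space. malloc()/free() replace the kernel allocators, and every demo_
name, the counter and the fail_late switch are hypothetical aids for
the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct demo_lmac {
    char *name;
};

static int demo_live_allocs;    /* outstanding allocations, demo only */

static void *demo_alloc(size_t size)
{
    void *p = malloc(size);
    if (p)
        demo_live_allocs++;
    return p;
}

static void demo_free(void *p)
{
    if (p)
        demo_live_allocs--;
    free(p);
}

/* fail_late simulates a setup step failing after both allocations. */
static struct demo_lmac *demo_lmac_init(bool fail_late)
{
    struct demo_lmac *lmac = demo_alloc(sizeof(*lmac));

    if (!lmac)
        return NULL;

    lmac->name = demo_alloc(16);
    if (!lmac->name)
        goto err_lmac_free;

    if (fail_late)
        goto err_name_free;

    return lmac;

err_name_free:
    demo_free(lmac->name);
err_lmac_free:
    demo_free(lmac);
    return NULL;
}
```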

Addresses-Coverity: ("Resource leak")
Fixes: 457ca95f900f ("octeontx2-af: Add support for CGX link management")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210107123916.189748-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'bug-fixes-for-chtls-driver'
Jakub Kicinski [Fri, 8 Jan 2021 01:06:05 +0000 (17:06 -0800)]
Merge branch 'bug-fixes-for-chtls-driver'

Ayush Sawal says:

====================
Bug fixes for chtls driver

patch 1: Fix hardware tid leak.
patch 2: Remove invalid set_tcb call.
patch 3: Fix panic when route to peer not configured.
patch 4: Avoid unnecessary freeing of oreq pointer.
patch 5: Replace skb_dequeue with skb_peek.
patch 6: Added a check to avoid NULL pointer dereference.
patch 7: Fix chtls resources release sequence.
====================

Link: https://lore.kernel.org/r/20210106042912.23512-1-ayush.sawal@chelsio.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agochtls: Fix chtls resources release sequence
Ayush Sawal [Wed, 6 Jan 2021 04:29:12 +0000 (09:59 +0530)]
chtls: Fix chtls resources release sequence

CPL_ABORT_RPL is sent after the resources have been released by
chtls_release_resources(sk) and chtls_conn_done(sk), which
eventually causes a kernel panic. Fix it by releasing the
resources in the appropriate order.

Fixes: 91280cf7691a ("crypto : chtls - CPL handler definition")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agochtls: Added a check to avoid NULL pointer dereference
Ayush Sawal [Wed, 6 Jan 2021 04:29:11 +0000 (09:59 +0530)]
chtls: Added a check to avoid NULL pointer dereference

In case of server removal, lookup_stid() may return a NULL pointer, which
is then used as listen_ctx. Add a check before accessing this pointer.

Fixes: 91280cf7691a ("crypto : chtls - CPL handler definition")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agochtls: Replace skb_dequeue with skb_peek
Ayush Sawal [Wed, 6 Jan 2021 04:29:10 +0000 (09:59 +0530)]
chtls: Replace skb_dequeue with skb_peek

The skb is unlinked twice, once by __skb_dequeue() in
chtls_reset_synq() and again in cleanup_syn_rcv_conn().
Use skb_peek() instead of __skb_dequeue(), so that the
unlink is handled only in cleanup_syn_rcv_conn().
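
A toy queue illustrates why peeking avoids the double unlink. The
demo_ types and functions are hypothetical and only mimic the shape of
the skb queue helpers: the drain loop looks at the head and leaves the
single unlink to the cleanup routine.

```c
#include <assert.h>
#include <stddef.h>

struct demo_node {
    struct demo_node *next;
    int unlink_count;   /* how many times this node was unlinked */
};

struct demo_queue { struct demo_node *head; };

/* Look at the head without removing it. */
static struct demo_node *demo_peek(struct demo_queue *q) { return q->head; }

/* Remove a node that is expected to be at the head of the queue. */
static void demo_unlink(struct demo_queue *q, struct demo_node *n)
{
    assert(q->head == n);
    q->head = n->next;
    n->next = NULL;
    n->unlink_count++;
}

/* Cleanup owns the unlink, so callers must not dequeue beforehand. */
static void demo_cleanup(struct demo_queue *q, struct demo_node *n)
{
    demo_unlink(q, n);
}

/* Drain the queue: peek, then let cleanup perform the only unlink. */
static void demo_reset_queue(struct demo_queue *q)
{
    struct demo_node *n;

    while ((n = demo_peek(q)) != NULL)
        demo_cleanup(q, n);
}
```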

Fixes: 91280cf7691a ("crypto : chtls - CPL handler definition")
Signed-off-by: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agochtls: Avoid unnecessary freeing of oreq pointer
Ayush Sawal [Wed, 6 Jan 2021 04:29:09 +0000 (09:59 +0530)]
chtls: Avoid unnecessary freeing of oreq pointer

In chtls_pass_accept_request(), remove the chtls_reqsk_free()
call so that oreq is not freed twice. Here oreq is the pointer to
struct request_sock.

Fixes: 91280cf7691a ("crypto : chtls - CPL handler definition")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agochtls: Fix panic when route to peer not configured
Ayush Sawal [Wed, 6 Jan 2021 04:29:08 +0000 (09:59 +0530)]
chtls: Fix panic when route to peer not configured

If the route to the peer is not configured, dst_neigh_lookup()
might return a non-TLS device, which is invalid. Add a
check to avoid it.

Fixes: 91280cf7691a ("crypto : chtls - CPL handler definition")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agochtls: Remove invalid set_tcb call
Ayush Sawal [Wed, 6 Jan 2021 04:29:07 +0000 (09:59 +0530)]
chtls: Remove invalid set_tcb call

At the time of SYN_RECV, connection information is not yet
initialized in the FW, so updating the tcb flag on the
uninitialized connection crashes the adapter. We don't need to
update the flag during SYN_RECV state, so avoid this.

Fixes: 91280cf7691a ("crypto : chtls - CPL handler definition")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agochtls: Fix hardware tid leak
Ayush Sawal [Wed, 6 Jan 2021 04:29:06 +0000 (09:59 +0530)]
chtls: Fix hardware tid leak

send_abort_rpl() is not calculating cpl_abort_req_rss offset and
ends up sending wrong TID with abort_rpl WR causing tid leaks.
Replaced send_abort_rpl() with chtls_send_abort_rpl() as it is
redundant.

Fixes: 91280cf7691a ("crypto : chtls - CPL handler definition")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Signed-off-by: Ayush Sawal <ayush.sawal@chelsio.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoMerge branch 'generic-zcopy_-functions'
Jakub Kicinski [Fri, 8 Jan 2021 00:08:38 +0000 (16:08 -0800)]
Merge branch 'generic-zcopy_-functions'

Jonathan Lemon says:

====================
Generic zcopy_* functions

This is a set of cleanup patches for zerocopy which are intended
to allow the introduction of a different zerocopy implementation.

The top level API will use the skb_zcopy_*() functions, while
the current TCP specific zerocopy ends up using msg_zerocopy_*()
calls.

There should be no functional changes from these patches.
====================

Link: https://lore.kernel.org/r/20210106221841.1880536-1-jonathan.lemon@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoskbuff: Rename skb_zcopy_{get|put} to net_zcopy_{get|put}
Jonathan Lemon [Wed, 6 Jan 2021 22:18:41 +0000 (14:18 -0800)]
skbuff: Rename skb_zcopy_{get|put} to net_zcopy_{get|put}

Unlike the rest of the skb_zcopy_ functions, these routines
operate on a 'struct ubuf', not a skb.  Remove the 'skb_'
prefix from the naming to make things clearer.

Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agotap/tun: add skb_zcopy_init() helper for initialization.
Jonathan Lemon [Wed, 6 Jan 2021 22:18:40 +0000 (14:18 -0800)]
tap/tun: add skb_zcopy_init() helper for initialization.

Replace direct assignments with skb_zcopy_init() for zerocopy
cases where a new skb is initialized, without changing the
reference counts.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoskbuff: add flags to ubuf_info for ubuf setup
Jonathan Lemon [Wed, 6 Jan 2021 22:18:39 +0000 (14:18 -0800)]
skbuff: add flags to ubuf_info for ubuf setup

Currently, when an ubuf is attached to a new skb, the shared
flags word is initialized to a fixed value.  Instead of doing
this, set the default flags in the ubuf, and have new skbs
inherit from this default.

This is needed when setting up different zerocopy types.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agonet: group skb_shinfo zerocopy related bits together.
Jonathan Lemon [Wed, 6 Jan 2021 22:18:38 +0000 (14:18 -0800)]
net: group skb_shinfo zerocopy related bits together.

In preparation for expanded zerocopy (TX and RX), move
the zerocopy related bits out of tx_flags into their own
flag word.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoskbuff: rename sock_zerocopy_* to msg_zerocopy_*
Jonathan Lemon [Wed, 6 Jan 2021 22:18:37 +0000 (14:18 -0800)]
skbuff: rename sock_zerocopy_* to msg_zerocopy_*

At Willem's suggestion, rename the sock_zerocopy_* functions
so that they match the MSG_ZEROCOPY flag, which makes it clear
they are specific to this zerocopy implementation.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoskbuff: Call skb_zcopy_clear() before unref'ing fragments
Jonathan Lemon [Wed, 6 Jan 2021 22:18:36 +0000 (14:18 -0800)]
skbuff: Call skb_zcopy_clear() before unref'ing fragments

RX zerocopy fragment pages which are not allocated from the
system page pool require special handling.  Give the callback
in skb_zcopy_clear() a chance to process them first.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoskbuff: Call sock_zerocopy_put_abort from skb_zcopy_put_abort
Jonathan Lemon [Wed, 6 Jan 2021 22:18:35 +0000 (14:18 -0800)]
skbuff: Call sock_zerocopy_put_abort from skb_zcopy_put_abort

The sock_zerocopy_put_abort function contains logic which is
specific to the current zerocopy implementation.  Add a wrapper
which checks the callback and dispatches appropriately.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoskbuff: Add skb parameter to the ubuf zerocopy callback
Jonathan Lemon [Wed, 6 Jan 2021 22:18:34 +0000 (14:18 -0800)]
skbuff: Add skb parameter to the ubuf zerocopy callback

Add an optional skb parameter to the zerocopy callback,
which is passed down from skb_zcopy_clear().  This gives access
to the original skb, which is needed for upcoming RX zero-copy
error handling.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoskbuff: replace sock_zerocopy_get with skb_zcopy_get
Jonathan Lemon [Wed, 6 Jan 2021 22:18:33 +0000 (14:18 -0800)]
skbuff: replace sock_zerocopy_get with skb_zcopy_get

Rename the get routines for consistency.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoskbuff: replace sock_zerocopy_put() with skb_zcopy_put()
Jonathan Lemon [Wed, 6 Jan 2021 22:18:32 +0000 (14:18 -0800)]
skbuff: replace sock_zerocopy_put() with skb_zcopy_put()

Replace sock_zerocopy_put with the generic skb_zcopy_put()
function.  Pass 'true' as the success argument, as this
is identical to no change.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoskbuff: Push status and refcounts into sock_zerocopy_callback
Jonathan Lemon [Wed, 6 Jan 2021 22:18:31 +0000 (14:18 -0800)]
skbuff: Push status and refcounts into sock_zerocopy_callback

Before this change, the caller of sock_zerocopy_callback would
need to save the zerocopy status, decrement and check the refcount,
and then call the callback function - the callback was only invoked
when the refcount reached zero.

Now, the caller just passes the status into the callback function,
which saves the status and handles its own refcounts.
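
A small sketch of the new calling convention, using hypothetical demo_
names and a plain int in place of a real reference counter: callers
only report success or failure, and the callback itself records the
status and runs the completion once the last reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_ubuf {
    int refcnt;
    bool zerocopy;      /* false once any user reports a failure */
    int completions;    /* how many times completion work ran */
};

/* Completion work, run when the last reference goes away. */
static void demo_complete(struct demo_ubuf *u) { u->completions++; }

/*
 * New-style callback: the caller only passes the status.
 * Saving the status and dropping the reference both happen in here.
 */
static void demo_zerocopy_callback(struct demo_ubuf *u, bool success)
{
    u->zerocopy = u->zerocopy && success;
    if (--u->refcnt == 0)
        demo_complete(u);
}
```

With two references held, the first call only records its status and
the second call triggers the completion.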

This makes the behavior of the sock_zerocopy_callback identical
to the tpacket and vhost callbacks.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 years agoskbuff: simplify sock_zerocopy_put
Jonathan Lemon [Wed, 6 Jan 2021 22:18:30 +0000 (14:18 -0800)]
skbuff: simplify sock_zerocopy_put

All 'struct ubuf_info' users should have a callback defined
as of commit b711e82a14ab ("sock: fix zerocopy_success regression
with msg_zerocopy").

Remove the dead code path to consume_skb(), which makes
assumptions about how the structure was allocated.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>