git.baikalelectronics.ru Git - kernel.git/log
4 years ago ravb: Split delay handling in parsing and applying
Geert Uytterhoeven [Thu, 1 Oct 2020 10:10:07 +0000 (12:10 +0200)]
ravb: Split delay handling in parsing and applying

Currently, full delay handling is done in both the probe and resume
paths.  Split it in two parts, so the resume path doesn't have to redo
the parsing part over and over again.
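
The split can be pictured with a small Python model. This is only a sketch of the parse-once, apply-many pattern, not the driver's C code; the class, method, and property names are hypothetical.

```python
class DelayConfig:
    """Toy model: parse delay settings once, apply them many times."""

    def __init__(self, properties):
        # Stand-in for device tree properties (hypothetical keys).
        self.properties = properties
        self.rx_delay = False
        self.tx_delay = False
        self.parse_count = 0
        self.apply_count = 0

    def parse_delay_mode(self):
        # Parsing step: read the configuration and cache the outcome.
        self.parse_count += 1
        self.rx_delay = self.properties.get("rx-delay", 0) > 0
        self.tx_delay = self.properties.get("tx-delay", 0) > 0

    def set_delay_mode(self):
        # Applying step: uses only the cached flags, never re-parses.
        self.apply_count += 1
        return {"rx": self.rx_delay, "tx": self.tx_delay}

    def probe(self):
        # Probe path: parse, then apply.
        self.parse_delay_mode()
        return self.set_delay_mode()

    def resume(self):
        # Resume path: apply only.
        return self.set_delay_mode()
```

In this model, any number of resume() calls leaves parse_count at 1.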

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago dt-bindings: net: renesas,etheravb: Convert to json-schema
Geert Uytterhoeven [Thu, 1 Oct 2020 10:10:06 +0000 (12:10 +0200)]
dt-bindings: net: renesas,etheravb: Convert to json-schema

Convert the Renesas Ethernet AVB (EthernetAVB-IF) Device Tree binding
documentation to json-schema.

Add missing properties.
Update the example to match reality.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago dt-bindings: net: renesas,ravb: Document internal clock delay properties
Geert Uytterhoeven [Thu, 1 Oct 2020 10:10:05 +0000 (12:10 +0200)]
dt-bindings: net: renesas,ravb: Document internal clock delay properties

Some EtherAVB variants support internal clock delay configuration, which
can add larger delays than the delays that are typically supported by
the PHY (using an "rgmii-*id" PHY mode, and/or "[rt]xc-skew-ps"
properties).

Add properties for configuring the internal MAC delays.
These properties are mandatory, even when specified as zero, to
distinguish between old and new DTBs.

Update the (bogus) example accordingly.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago dt-bindings: net: ethernet-controller: Add internal delay properties
Geert Uytterhoeven [Thu, 1 Oct 2020 10:10:04 +0000 (12:10 +0200)]
dt-bindings: net: ethernet-controller: Add internal delay properties

Internal Receive and Transmit Clock Delays are a common setting for
RGMII capable devices.

While these delays are typically applied by the PHY, some MACs support
configuring internal clock delay settings, too.  Hence add standardized
properties to configure this.

This is the MAC counterpart of commit db44e3751ab7d089 ("dt-bindings:
net: Add tx and rx internal delays"), which applies to the PHY.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago Merge tag 'mlx5-updates-2020-09-30' of git://git.kernel.org/pub/scm/linux/kernel...
David S. Miller [Thu, 1 Oct 2020 19:24:52 +0000 (12:24 -0700)]
Merge tag 'mlx5-updates-2020-09-30' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-09-30

Updates and cleanups for mlx5 driver:

1) From Ariel, Dan Carpenter and Gustavo, fixes to the previous
   mlx5 Connection track series.

2) From Yevgeny, trivial cleanups for Software steering

3) From Hamdan, Support for Flow source hint in software steering and
   E-Switch

4) From Parav and Sunil, Small and trivial E-Switch updates and
   cleanups in preparation for mlx5 Sub-functions support
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net/mlx5e: Fix potential null pointer dereference
Gustavo A. R. Silva [Fri, 25 Sep 2020 16:49:13 +0000 (11:49 -0500)]
net/mlx5e: Fix potential null pointer dereference

Calls to kzalloc() and kvzalloc() should be null-checked
in order to avoid any potential failures, in this case
a potential null pointer dereference.

Fix this by adding null checks for _parse_attr_ and _flow_
right after allocation.

Addresses-Coverity-ID: 1497154 ("Dereference before null check")
Fixes: a650fb5166d6 ("net/mlx5: Refactor tc flow attributes structure")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years ago net/mlx5e: Fix a use after free on error in mlx5_tc_ct_shared_counter_get()
Dan Carpenter [Mon, 28 Sep 2020 09:05:56 +0000 (12:05 +0300)]
net/mlx5e: Fix a use after free on error in mlx5_tc_ct_shared_counter_get()

This code frees "shared_counter" and then dereferences it on the next line
to get the error code.

Fixes: da404640cb59 ("net/mlx5e: CT: Use the same counter for both directions")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years ago net/mlx5: Fix dereference on pointer attr after null check
Ariel Levkovich [Mon, 28 Sep 2020 16:34:10 +0000 (19:34 +0300)]
net/mlx5: Fix dereference on pointer attr after null check

When removing a flow from the slow path fdb, a flow attr struct is
allocated for the rule removal process. If the allocation fails, the
code prints a warning message but continues with the removal flow,
which includes dereferencing a pointer that could be null.
Fix this by exiting the function if the attr allocation failed.

Fixes: a650fb5166d6 ("net/mlx5: Refactor tc flow attributes structure")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ariel Levkovich <lariel@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years ago net/mlx5: Use dma device access helper
Parav Pandit [Wed, 9 Sep 2020 17:41:38 +0000 (20:41 +0300)]
net/mlx5: Use dma device access helper

Use the PCI device directly for DMA accesses, as non-PCI devices are
unlikely to support IOMMU and DMA mappings.
Introduce and use a helper routine to access the DMA device.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years ago net/mlx5: E-Switch, Support flow source for local vport
Hamdan Igbaria [Sun, 30 Aug 2020 09:32:37 +0000 (12:32 +0300)]
net/mlx5: E-Switch, Support flow source for local vport

Set flow source as hint for local vport.

Signed-off-by: Hamdan Igbaria <hamdani@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years ago net/mlx5: E-switch, Move devlink eswitch ports closer to eswitch
Parav Pandit [Mon, 31 Aug 2020 19:47:47 +0000 (22:47 +0300)]
net/mlx5: E-switch, Move devlink eswitch ports closer to eswitch

Currently devlink eswitch ports are registered and unregistered by the
representor layer.
However, it is better to register them at the eswitch layer so that, in the
future, user-initiated port add and delete commands can also
register/unregister devlink ports without depending on the representor layer.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years ago net/mlx5: E-switch, Use helper function to load unload representor
Parav Pandit [Mon, 31 Aug 2020 18:18:59 +0000 (21:18 +0300)]
net/mlx5: E-switch, Use helper function to load unload representor

To register and unregister devlink ports when loading/unloading
representors, refactor the code into helper functions.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years ago net/mlx5: E-switch, Add helper to check egress ACL need
Parav Pandit [Wed, 2 Sep 2020 05:59:30 +0000 (08:59 +0300)]
net/mlx5: E-switch, Add helper to check egress ACL need

Currently only VF vports need an egress ACL table.
Add a generic helper to check whether a vport needs an egress
ACL table or not.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years ago net/mlx5: E-switch, Use PF num in metadata reg c0
sunils [Thu, 10 Sep 2020 21:13:26 +0000 (00:13 +0300)]
net/mlx5: E-switch, Use PF num in metadata reg c0

Currently only 256 vports can be supported as only 8 bits are
reserved for them and 8 bits are reserved for vhca_ids in
metadata reg c0. To support more than 256 vports, replace
vhca_id with a unique shorter 4-bit PF number which covers
up to 16 PFs. Use the remaining 12 bits for vports ranging 1-4095.
This will continue to generate unique metadata even if
multiple PCI devices have same switch_id.
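
As a rough illustration of the new layout, the Python sketch below packs a 4-bit PF number and a 12-bit vport number into one 16-bit value. The bit positions (PF number in the upper bits, vport in the lower bits) and the function names are assumptions made for the example; the commit message only gives the field widths.

```python
PF_NUM_BITS = 4    # from the commit message: 4-bit PF number
VPORT_BITS = 12    # from the commit message: 12 bits for vports

PF_NUM_MASK = (1 << PF_NUM_BITS) - 1   # 0xF, up to 16 PFs
VPORT_MASK = (1 << VPORT_BITS) - 1     # 0xFFF, vports 1-4095


def pack_metadata(pf_num, vport):
    """Combine PF number and vport into one 16-bit value (assumed layout)."""
    if not 0 <= pf_num <= PF_NUM_MASK:
        raise ValueError("pf_num does not fit in 4 bits")
    if not 1 <= vport <= VPORT_MASK:
        raise ValueError("vport must be in 1-4095")
    return (pf_num << VPORT_BITS) | vport


def unpack_metadata(value):
    """Split a 16-bit value back into (pf_num, vport)."""
    return (value >> VPORT_BITS) & PF_NUM_MASK, value & VPORT_MASK
```

Two devices with different PF numbers produce different values even when their vport numbers coincide, which is the uniqueness property the commit message describes.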

Signed-off-by: sunils <sunils@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Vu Pham <vuhuong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years ago net/mlx5: DR, Add support for rule creation with flow source hint
Hamdan Igbaria [Mon, 15 Jun 2020 15:18:14 +0000 (18:18 +0300)]
net/mlx5: DR, Add support for rule creation with flow source hint

Skip the rule according to the flow arrival source: on RX, skip if the
source is the local port; on TX, skip if the source is the uplink. This
information comes from the flow source hint that upper layers pass when
creating the rule.
This is needed because, for example, an FDB table has both TX and RX
tables. When inserting a rule with an encap action, which is a TX-only
action, the rule fails on RX, so we can rely on the flow source hint
and skip RX in that case.
Until now we relied on upper layers mapping the port in metadata
regc_0, but upper layers did not always use regc_0 for port mapping.
So add support for a flow source hint, which upper layers pass to SW
steering when creating a rule.
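
The skip decision described above can be sketched in a few lines of Python. This is only a model of the rule stated in the commit message, not the driver code; the enum and function names are hypothetical.

```python
from enum import Enum


class NicType(Enum):
    RX = "rx"
    TX = "tx"


class FlowSource(Enum):
    ANY = "any"                  # no hint given
    UPLINK = "uplink"            # traffic arrives from the uplink
    LOCAL_VPORT = "local_vport"  # traffic arrives from a local port


def should_skip_rule(nic_type, flow_source):
    """Return True when the rule need not be installed on this side."""
    # Traffic from a local port is never seen on the RX side.
    if nic_type is NicType.RX and flow_source is FlowSource.LOCAL_VPORT:
        return True
    # Traffic from the uplink is never seen on the TX side.
    if nic_type is NicType.TX and flow_source is FlowSource.UPLINK:
        return True
    return False
```

For the encap example in the text, a rule hinted as coming from a local port is skipped on RX and installed only on TX.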

Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Hamdan Igbaria <hamdani@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years ago net/mlx5: DR, Call ste_builder directly with tag pointer
Yevgeny Kliteynik [Mon, 31 Aug 2020 08:56:40 +0000 (11:56 +0300)]
net/mlx5: DR, Call ste_builder directly with tag pointer

Instead of getting the tag in each function, call the builder
directly with the tag. This allows using the same function
for building the tag and the bitmask.

Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years ago net/mlx5: DR, Remove unneeded local variable
Yevgeny Kliteynik [Mon, 31 Aug 2020 08:57:28 +0000 (11:57 +0300)]
net/mlx5: DR, Remove unneeded local variable

The misc3 variable is used only once and can be dropped.

Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years ago net/mlx5: DR, Remove unneeded vlan check from L2 builder
Yevgeny Kliteynik [Mon, 31 Aug 2020 08:57:14 +0000 (11:57 +0300)]
net/mlx5: DR, Remove unneeded vlan check from L2 builder

When we create a matcher we check that all fields are consumed.
There is no need for this specific check. This keeps the STE
builder functions simple and clean.

Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years ago net/mlx5: DR, Remove unneeded check from source port builder
Yevgeny Kliteynik [Mon, 31 Aug 2020 08:56:59 +0000 (11:56 +0300)]
net/mlx5: DR, Remove unneeded check from source port builder

Mask validity for ste builders is checked by mlx5dr_ste_build_pre_check
during matcher creation.
It already checks the mask value of source_vport, so remove
this duplicate check.
Also move the check of the source_eswitch_owner_vhca_id mask there.

Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years ago net/mlx5: DR, Replace the check for valid STE entry
Yevgeny Kliteynik [Mon, 31 Aug 2020 08:56:07 +0000 (11:56 +0300)]
net/mlx5: DR, Replace the check for valid STE entry

The validity check is done by reading the next lu_type from the STE;
this check can be replaced by checking the refcount.
This makes the check independent of the internal STE structure.

Signed-off-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
4 years ago Merge branch 'drop_monitor-Convert-to-use-devlink-tracepoint'
David S. Miller [Thu, 1 Oct 2020 01:01:27 +0000 (18:01 -0700)]
Merge branch 'drop_monitor-Convert-to-use-devlink-tracepoint'

Ido Schimmel says:

====================
drop_monitor: Convert to use devlink tracepoint

Drop monitor is able to monitor both software and hardware originated
drops. Software drops are monitored by having drop monitor register its
probe on the 'kfree_skb' tracepoint. Hardware originated drops are
monitored by having devlink call into drop monitor whenever it receives
a dropped packet from the underlying hardware.

This patch set converts drop monitor to monitor both software and
hardware originated drops in the same way - by registering its probe on
the relevant tracepoint.

In addition to drop monitor being more consistent, it is now also
possible to build drop monitor as module instead of as a builtin and
still monitor hardware originated drops. Initially, CONFIG_NET_DEVLINK
implied CONFIG_NET_DROP_MONITOR, but after commit 2f5a0f65c9f7
("kconfig: allow symbols implied by y to become m") we can have
CONFIG_NET_DEVLINK=y and CONFIG_NET_DROP_MONITOR=m and hardware
originated drops will not be monitored.

Patch set overview:

Patch #1 adds a tracepoint in devlink for trap reports.

Patch #2 prepares probe functions in drop monitor for the new
tracepoint.

Patch #3 converts drop monitor to use the new tracepoint.

Patches #4-#6 perform cleanups after the conversion.

Patch #7 adds a test case for drop monitor. Both software originated
drops and hardware originated drops (using netdevsim) are tested.

Tested:

| CONFIG_NET_DEVLINK | CONFIG_NET_DROP_MONITOR | Build | SW drops | HW drops |
| -------------------|-------------------------|-------|----------|----------|
|          y         |            y            |   v   |     v    |     v    |
|          y         |            m            |   v   |     v    |     v    |
|          y         |            n            |   v   |     x    |     x    |
|          n         |            y            |   v   |     v    |     x    |
|          n         |            m            |   v   |     v    |     x    |
|          n         |            n            |   v   |     x    |     x    |
====================
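
The "Tested" matrix above can be restated as a small Python function. This is only a restatement of the table (with "v" read as working and "x" as not working), not kernel logic; the function name is hypothetical.

```python
def monitored_drops(devlink, drop_monitor):
    """Return (sw_drops, hw_drops) for the given Kconfig values.

    devlink is 'y' or 'n'; drop_monitor is 'y', 'm' or 'n'.
    """
    if devlink not in ("y", "n"):
        raise ValueError("devlink must be 'y' or 'n'")
    if drop_monitor not in ("y", "m", "n"):
        raise ValueError("drop_monitor must be 'y', 'm' or 'n'")
    # Software drops only need drop monitor itself (builtin or module).
    sw_drops = drop_monitor in ("y", "m")
    # Hardware drops additionally need devlink to provide the tracepoint.
    hw_drops = sw_drops and devlink == "y"
    return sw_drops, hw_drops
```

The row that matters for this series is devlink 'y' with drop monitor 'm', where hardware drops are now monitored as well.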

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago selftests: net: Add drop monitor test
Ido Schimmel [Tue, 29 Sep 2020 08:15:56 +0000 (11:15 +0300)]
selftests: net: Add drop monitor test

Test that drop monitor correctly captures both software and hardware
originated packet drops.

# ./drop_monitor_tests.sh

Software drops test
    TEST: Capturing active software drops                               [ OK ]
    TEST: Capturing inactive software drops                             [ OK ]

Hardware drops test
    TEST: Capturing active hardware drops                               [ OK ]
    TEST: Capturing inactive hardware drops                             [ OK ]

Tests passed:   4
Tests failed:   0

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago drop_monitor: Filter control packets in drop monitor
Ido Schimmel [Tue, 29 Sep 2020 08:15:55 +0000 (11:15 +0300)]
drop_monitor: Filter control packets in drop monitor

Previously, devlink called into drop monitor in order to report hardware
originated drops / exceptions. devlink intentionally filtered control
packets and did not pass them to drop monitor as they were not dropped
by the underlying hardware.

Now drop monitor registers its probe on a generic 'devlink_trap_report'
tracepoint and should therefore perform this filtering itself instead of
having devlink do that.

Add the trap type as metadata and have drop monitor ignore control
packets.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago drop_monitor: Remove duplicate struct
Ido Schimmel [Tue, 29 Sep 2020 08:15:54 +0000 (11:15 +0300)]
drop_monitor: Remove duplicate struct

'struct net_dm_hw_metadata' is a duplicate of 'struct
devlink_trap_metadata'.

Remove the former and simplify the code.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago drop_monitor: Remove no longer used functions
Ido Schimmel [Tue, 29 Sep 2020 08:15:53 +0000 (11:15 +0300)]
drop_monitor: Remove no longer used functions

The old probe functions that were invoked by drop monitor code are no
longer called and can thus be removed. They were replaced by actual
probe functions that are registered on the recently introduced
'devlink_trap_report' tracepoint.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago drop_monitor: Convert to using devlink tracepoint
Ido Schimmel [Tue, 29 Sep 2020 08:15:52 +0000 (11:15 +0300)]
drop_monitor: Convert to using devlink tracepoint

Convert drop monitor to use the recently introduced
'devlink_trap_report' tracepoint instead of having devlink call into
drop monitor.

This is both consistent with software originated drops ('kfree_skb'
tracepoint) and also allows drop monitor to be built as a module and
still report hardware originated drops.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago drop_monitor: Prepare probe functions for devlink tracepoint
Ido Schimmel [Tue, 29 Sep 2020 08:15:51 +0000 (11:15 +0300)]
drop_monitor: Prepare probe functions for devlink tracepoint

Drop monitor supports two alerting modes: Summary and packet. Prepare a
probe function for each, so that they could be later registered on the
devlink tracepoint by calling register_trace_devlink_trap_report(),
based on the configured alerting mode.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago devlink: Add a tracepoint for trap reports
Ido Schimmel [Tue, 29 Sep 2020 08:15:50 +0000 (11:15 +0300)]
devlink: Add a tracepoint for trap reports

Add a tracepoint for trap reports so that drop monitor could register
its probe on it. Use trace_devlink_trap_report_enabled() to avoid
wasting cycles setting the trap metadata if the tracepoint is not
enabled.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago Merge tag 'linux-can-next-for-5.10-20200930' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Wed, 30 Sep 2020 22:21:43 +0000 (15:21 -0700)]
Merge tag 'linux-can-next-for-5.10-20200930' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2020-09-30

this is a pull request of 13 patches for net-next.

The first 10 target the mcp25xxfd driver (which is renamed to mcp251xfd during
this series).

The first two patches are by Thomas Kopp; they add a reference to the
related errata and update the documentation and log messages.

Dan Carpenter's patch fixes a resource leak during ifdown.

A patch by me adds the missing initialization of a variable.

Oleksij Rempel updates the DT binding documentation as requested by Rob
Herring.

The next 5 patches are by Thomas Kopp and me. During review Geert Uytterhoeven
suggested to use "microchip,mcp251xfd" instead of "microchip,mcp25xxfd" as the
DT autodetection compatible to avoid clashes with future but incompatible
devices. We decided not only to rename the compatible but the whole driver from
"mcp25xxfd" to "mcp251xfd". This is done in several patches.

Joakim Zhang contributes three patches for the flexcan driver. The first one
adds support for the ECC feature, which is implemented on some modern IP cores,
by initializing the controller's memory during ifup. The next patch adds
support for the i.MX8MP (which supports ECC) and the last patch properly
disables the runtime PM if device registration fails.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago Merge branch 'ionic-watchdog-training'
David S. Miller [Wed, 30 Sep 2020 22:11:09 +0000 (15:11 -0700)]
Merge branch 'ionic-watchdog-training'

Shannon Nelson says:

====================
ionic watchdog training

Our link watchdog displayed a couple of unfriendly behaviors in some recent
stress testing.  These patches change the startup and stop timing in order
to be sure that expected structures are ready to be used by the watchdog.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago ionic: prevent early watchdog check
Shannon Nelson [Wed, 30 Sep 2020 17:48:28 +0000 (10:48 -0700)]
ionic: prevent early watchdog check

In one corner case scenario, the driver device lif setup can
get delayed such that the ionic_watchdog_cb() timer goes off
before the ionic->lif is set, thus causing a NULL pointer panic.
We catch the problem by checking for a NULL lif just a little
earlier in the callback.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago ionic: stop watchdog timer earlier on remove
Shannon Nelson [Wed, 30 Sep 2020 17:48:27 +0000 (10:48 -0700)]
ionic: stop watchdog timer earlier on remove

We need to be better at making sure we don't have a link check
watchdog go off while we're shutting things down, so let's stop
the timer as soon as we start the remove.

Meanwhile, since that was the only thing in
ionic_dev_teardown(), simplify and remove that function.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago Merge branch 'tcp-exponential-backoff-in-tcp_send_ack'
David S. Miller [Wed, 30 Sep 2020 21:21:30 +0000 (14:21 -0700)]
Merge branch 'tcp-exponential-backoff-in-tcp_send_ack'

Eric Dumazet says:

====================
tcp: exponential backoff in tcp_send_ack()

We had outages caused by repeated skb allocation failures in tcp_send_ack()

It is time to add exponential backoff to reduce number of attempts.
Before doing so, first patch removes icsk_ack.blocked to make
room for a new field (icsk_ack.retry)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago tcp: add exponential backoff in __tcp_send_ack()
Eric Dumazet [Wed, 30 Sep 2020 12:54:57 +0000 (05:54 -0700)]
tcp: add exponential backoff in __tcp_send_ack()

Whenever a host is under very high memory pressure,
__tcp_send_ack() skb allocation fails, and we set up
a 200 ms (TCP_DELACK_MAX) timer before retrying.

On hosts with a high number of TCP sockets, we can spend
a considerable amount of CPU cycles in these attempts and
add high pressure on various spinlocks in the mm layer,
ultimately blocking threads attempting to free space
from making any progress.

This patch adds standard exponential backoff to avoid
adding fuel to the fire.
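
A minimal Python sketch of exponential backoff for the retry timer. The 200 ms base comes from the text above; the upper bound of 120 s and the helper name are assumptions made for the example, and the code does not claim to mirror the kernel implementation.

```python
TCP_DELACK_MAX_MS = 200     # base retry delay, from the commit message
MAX_DELAY_MS = 120_000      # assumed upper bound, not stated in the text


def next_ack_retry_delay(retry):
    """Return (delay_ms, new_retry) after one failed allocation attempt.

    The delay doubles with every consecutive failure; the retry counter
    stops growing once the delay has reached the cap.
    """
    delay = TCP_DELACK_MAX_MS << retry
    if delay < MAX_DELAY_MS:
        retry += 1
    return min(delay, MAX_DELAY_MS), retry
```

Starting from retry 0, consecutive failures yield delays of 200, 400, 800, 1600 ms and so on. A real implementation would also reset the counter once an ACK is sent successfully.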

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago inet: remove icsk_ack.blocked
Eric Dumazet [Wed, 30 Sep 2020 12:54:56 +0000 (05:54 -0700)]
inet: remove icsk_ack.blocked

TCP has been using it to work around the possibility of tcp_delack_timer()
finding the socket owned by user.

After commit f02b41be228f ("tcp: improve latencies of timer triggered events")
we added TCP_DELACK_TIMER_DEFERRED atomic bit for more immediate recovery,
so we can get rid of icsk_ack.blocked

This frees space that following patch will reuse.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: macb: move pdata to private header
Alexandre Belloni [Wed, 30 Sep 2020 10:50:59 +0000 (12:50 +0200)]
net: macb: move pdata to private header

struct macb_platform_data is only used by macb_pci to register the platform
device, move its definition to cadence/macb.h and remove platform_data/macb.h

Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago Merge branch 'mlxsw-PFC-and-headroom-selftests'
David S. Miller [Wed, 30 Sep 2020 21:06:54 +0000 (14:06 -0700)]
Merge branch 'mlxsw-PFC-and-headroom-selftests'

Petr Machata says:

====================
mlxsw: PFC and headroom selftests

Recent changes in the headroom management code made it clear that an
automated way of testing this functionality is needed. This patchset brings
two tests: a synthetic headroom behavior test, which verifies the mechanics of
headroom management, and a PFC test, which verifies whether this behavior
actually translates into a working lossless configuration.

Both of these tests rely on mlnx_qos[1], a tool that interfaces with Linux
DCB API. The tool was originally written to work with Mellanox NICs, but
does not actually rely on anything Mellanox-specific, and can be used for
mlxsw as well as for any other NIC-like driver. Unlike Open LLDP it does
support buffer commands and permits a fire-and-forget approach to
configuration, which makes it very handy for writing of selftests.

Patches #1-#3 extend the selftest devlink_lib.sh in various ways. Patch #4
then adds a helper wrapper for mlnx_qos to mlxsw's qos_lib.sh.

Patch #5 adds a test for management of port headroom.

Patch #6 adds a PFC test.

[1] https://github.com/Mellanox/mlnx-tools/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago selftests: mlxsw: Add a PFC test
Petr Machata [Wed, 30 Sep 2020 10:49:12 +0000 (12:49 +0200)]
selftests: mlxsw: Add a PFC test

Add a test for PFC. Runs 10MB of traffic through a bottleneck and checks
that none of it gets lost.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago selftests: mlxsw: Add headroom handling test
Petr Machata [Wed, 30 Sep 2020 10:49:11 +0000 (12:49 +0200)]
selftests: mlxsw: Add headroom handling test

Add a test for headroom configuration. This covers projection of ETS
configuration to ingress, PFC, adjustments for MTU, the qdisc / TC
mode and the effect of egress SPAN session on buffer configuration.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago selftests: mlxsw: qos_lib: Add a wrapper for running mlnx_qos
Petr Machata [Wed, 30 Sep 2020 10:49:10 +0000 (12:49 +0200)]
selftests: mlxsw: qos_lib: Add a wrapper for running mlnx_qos

mlnx_qos is a script for configuration of DCB. Despite the name it is not
actually Mellanox-specific in any way. It is currently the only ad-hoc tool
available (in contrast to a daemon that manages an interface on an ongoing
basis). However, it is very verbose and parsing out error messages is not
really possible. Add a wrapper that makes it easier to use the tool.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago selftests: forwarding: devlink_lib: Support port-less topologies
Petr Machata [Wed, 30 Sep 2020 10:49:09 +0000 (12:49 +0200)]
selftests: forwarding: devlink_lib: Support port-less topologies

Some selftests may not need any actual ports. Technically those are not
forwarding selftests, but devlink_lib can still be handy. Fall back on
NETIF_NO_CABLE in those cases.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago selftests: forwarding: devlink_lib: Add devlink_cell_size_get()
Petr Machata [Wed, 30 Sep 2020 10:49:08 +0000 (12:49 +0200)]
selftests: forwarding: devlink_lib: Add devlink_cell_size_get()

Add a helper that returns the cell size of the devlink device.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago selftests: forwarding: devlink_lib: Split devlink_..._set() into save & set
Petr Machata [Wed, 30 Sep 2020 10:49:07 +0000 (12:49 +0200)]
selftests: forwarding: devlink_lib: Split devlink_..._set() into save & set

Changing pool type from static to dynamic causes reinterpretation of
threshold values. They therefore need to be saved before pool type is
changed, then the pool type can be changed, and then the new values need
to be set up.

For that reason, set cannot subsume save, because it would be saving the
wrong thing, with possibly a nonsensical value, and restore would then fail
to restore the nonsensical value.

Thus extract a _save() from each of the relevant _set()'s. This way it is
possible to save everything up front, then to tweak it, and then restore in
the required order.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago can: flexcan: disable runtime PM if register flexcandev failed
Joakim Zhang [Tue, 29 Sep 2020 21:15:57 +0000 (05:15 +0800)]
can: flexcan: disable runtime PM if register flexcandev failed

Disable runtime PM if registering the flexcandev fails, and balance the
usage_count reference.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Link: https://lore.kernel.org/r/20200929211557.14153-4-qiangqing.zhang@nxp.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years ago can: flexcan: add flexcan driver for i.MX8MP
Joakim Zhang [Tue, 29 Sep 2020 21:15:56 +0000 (05:15 +0800)]
can: flexcan: add flexcan driver for i.MX8MP

Add flexcan driver for i.MX8MP, which supports CAN FD and ECC.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Link: https://lore.kernel.org/r/20200929211557.14153-3-qiangqing.zhang@nxp.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years ago can: flexcan: initialize all flexcan memory for ECC function
Joakim Zhang [Tue, 29 Sep 2020 21:15:55 +0000 (05:15 +0800)]
can: flexcan: initialize all flexcan memory for ECC function

One issue was reported in a bare-metal environment used for
FPGA verification: "The first transfer will fail for extended ID
format (for both 2.0B and FD format), following frames can be transmitted
and received successfully for extended format, and standard format doesn't
have this issue. This issue occurred randomly with high possibility, when
it occurs, the transmitter will detect a BIT1 error, the receiver a CRC
error. According to the spec, a non-correctable error may cause this
transfer failure."

With the FLEXCAN_QUIRK_DISABLE_MECR quirk, the driver supports correctable
errors and disables the non-correctable error interrupt and freeze mode.
If a platform has ECC hardware support but selects this quirk, this issue
may not come to light. Initialize all FlexCAN memory before accessing it;
this at least avoids non-correctable errors caused by uninitialized memory.
The internal region can't be initialized when the hardware doesn't support
ECC.

According to IMX8MPRM, Rev.C, 04/2020. There is a NOTE at the section
11.8.3.13 Detection and correction of memory errors:
"All FlexCAN memory must be initialized before starting its operation in
order to have the parity bits in memory properly updated. CTRL2[WRMFRZ]
grants write access to all memory positions that require initialization,
ranging from 0x080 to 0xADF and from 0xF28 to 0xFFF when the CAN FD feature
is enabled. The RXMGMASK, RX14MASK, RX15MASK, and RXFGMASK registers need to
be initialized as well. MCR[RFEN] must not be set during memory initialization."

The memory range from 0x080 to 0xADF contains reserved memory (unimplemented
by hardware, e.g. when only 64 MBs are configured); this memory can be
initialized or not. This patch initializes all flexcan memory, including the
reserved memory.

This patch creates FLEXCAN_QUIRK_SUPPORT_ECC for platforms which have the ECC
feature. If you have an ECC platform at hand, please select this
quirk to initialize all flexcan memory first, then you can select
FLEXCAN_QUIRK_DISABLE_MECR to only enable correctable errors.
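
As an illustration only (this is not the driver code), the two address ranges
from the reference manual note can be zeroed as sketched below. The register
window is modelled as a plain byte array and the helper name is hypothetical;
a real driver would first have to grant write access via CTRL2[WRMFRZ] and
would use I/O accessors rather than memset().

```c
#include <stdint.h>
#include <string.h>

/* Offsets quoted from the reference manual note; both ends inclusive. */
#define SKETCH_RAM_START	0x080u
#define SKETCH_RAM_END		0xADFu
#define SKETCH_RAM_FD_START	0xF28u
#define SKETCH_RAM_FD_END	0xFFFu

/*
 * Hypothetical helper: zero both memory ranges inside a register window
 * that is modelled here as a byte array of at least 0x1000 bytes.
 */
static void sketch_ram_init(uint8_t *regs)
{
	memset(regs + SKETCH_RAM_START, 0,
	       SKETCH_RAM_END - SKETCH_RAM_START + 1);
	memset(regs + SKETCH_RAM_FD_START, 0,
	       SKETCH_RAM_FD_END - SKETCH_RAM_FD_START + 1);
}
```

Bytes outside the two ranges are left untouched, which matches the note's
wording that only those positions require initialization.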

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Link: https://lore.kernel.org/r/20200929211557.14153-2-qiangqing.zhang@nxp.com
[mkl: wrap long lines]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agocan: mcp251xfd: rename all remaining occurrence to mcp251xfd
Marc Kleine-Budde [Wed, 30 Sep 2020 08:49:00 +0000 (10:49 +0200)]
can: mcp251xfd: rename all remaining occurrence to mcp251xfd

In [1] Geert noted that the autodetect compatible for the mcp25xxfd driver,
which is "microchip,mcp25xxfd" might be too generic and overlap with upcoming,
but incompatible chips.

In the previous patch the autodetect DT compatible has been renamed to
"microchip,mcp251xfd"; this patch changes all non user facing occurrences of
"mcp25xxfd" to "mcp251xfd" and "MCP25XXFD" to "MCP251XFD".

[1] http://lore.kernel.org/r/CAMuHMdVkwGjr6dJuMyhQNqFoJqbh6Ec5V2b5LenCshwpM2SDsQ@mail.gmail.com

Link: https://lore.kernel.org/r/20200930091424.792165-10-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agocan: mcp251xfd: rename all user facing strings to mcp251xfd
Marc Kleine-Budde [Wed, 30 Sep 2020 08:49:00 +0000 (10:49 +0200)]
can: mcp251xfd: rename all user facing strings to mcp251xfd

In [1] Geert noted that the autodetect compatible for the mcp25xxfd driver,
which is "microchip,mcp25xxfd" might be too generic and overlap with upcoming,
but incompatible chips.

In the previous patch the autodetect DT compatible has been renamed to
"microchip,mcp251xfd"; this patch changes all user facing strings from
"mcp25xxfd" to "mcp251xfd" and "MCP25XXFD" to "MCP251XFD", including:
- kconfig symbols
- name of kernel module
- DT and SPI compatible

[1] http://lore.kernel.org/r/CAMuHMdVkwGjr6dJuMyhQNqFoJqbh6Ec5V2b5LenCshwpM2SDsQ@mail.gmail.com

Link: https://lore.kernel.org/r/20200930091424.792165-9-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agocan: mcp251xfd: rename driver files and subdir to mcp251xfd
Marc Kleine-Budde [Wed, 30 Sep 2020 08:49:00 +0000 (10:49 +0200)]
can: mcp251xfd: rename driver files and subdir to mcp251xfd

In [1] Geert noted that the autodetect compatible for the mcp25xxfd driver,
which is "microchip,mcp25xxfd" might be too generic and overlap with upcoming,
but incompatible chips.

In the previous patch the autodetect DT compatible has been renamed to
"microchip,mcp251xfd"; this patch changes the name of the driver subdir and the
individual files accordingly.

[1] http://lore.kernel.org/r/CAMuHMdVkwGjr6dJuMyhQNqFoJqbh6Ec5V2b5LenCshwpM2SDsQ@mail.gmail.com

Link: https://lore.kernel.org/r/20200930091424.792165-8-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agocan: mcp25xxfd: narrow down wildcards in device tree bindings to "microchip,mcp251xfd"
Thomas Kopp [Wed, 30 Sep 2020 09:14:22 +0000 (11:14 +0200)]
can: mcp25xxfd: narrow down wildcards in device tree bindings to "microchip,mcp251xfd"

The wildcard should be narrowed down to prevent existing and future devices
that are not compatible from matching. It is very unlikely that incompatible
devices will be released that do not match the wildcard.

Discussion Reference: https://lore.kernel.org/r/CAMuHMdVkwGjr6dJuMyhQNqFoJqbh6Ec5V2b5LenCshwpM2SDsQ@mail.gmail.com

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thomas Kopp <thomas.kopp@microchip.com>
Link: https://lore.kernel.org/r/20200930091423.755-1-thomas.kopp@microchip.com
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agodt-binding: can: mcp251xfd: narrow down wildcards in device tree bindings to "microch...
Thomas Kopp [Wed, 30 Sep 2020 09:14:23 +0000 (11:14 +0200)]
dt-binding: can: mcp251xfd: narrow down wildcards in device tree bindings to "microchip,mcp251xfd"

The wildcard should be narrowed down to prevent existing and future devices
that are not compatible from matching. It is very unlikely that incompatible
devices will be released that do not match the wildcard.

This is the documentation part of the commit.

Discussion Reference: https://lore.kernel.org/r/CAMuHMdVkwGjr6dJuMyhQNqFoJqbh6Ec5V2b5LenCshwpM2SDsQ@mail.gmail.com

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Thomas Kopp <thomas.kopp@microchip.com>
Link: https://lore.kernel.org/r/20200930091423.755-2-thomas.kopp@microchip.com
[mkl: rename file, too]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agodt-binding: can: mcp25xxfd: documentation fixes
Oleksij Rempel [Wed, 23 Sep 2020 12:53:01 +0000 (14:53 +0200)]
dt-binding: can: mcp25xxfd: documentation fixes

Apply following fixes:
- Use 'interrupts'. (interrupts-extended will automagically be supported
  by the tools)
- *-supply is always a single item. So, drop maxItems=1
- add "additionalProperties: false" flag to detect unneeded properties.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20200923125301.27200-1-o.rempel@pengutronix.de
Reported-by: Rob Herring <robh@kernel.org>
Reviewed-by: Rob Herring <robh@kernel.org>
Fixes: 2dffde966ad8 ("dt-binding: can: mcp25xxfd: document device tree bindings")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agocan: mcp25xxfd: mcp25xxfd_irq(): add missing initialization of variable set_normal...
Marc Kleine-Budde [Wed, 23 Sep 2020 11:44:36 +0000 (13:44 +0200)]
can: mcp25xxfd: mcp25xxfd_irq(): add missing initialization of variable set_normal_mode

This patch fixes the following warning:

    drivers/net/can/spi/mcp25xxfd/mcp25xxfd-core.c:2155 mcp25xxfd_irq()
    error: uninitialized symbol 'set_normal_mode'.

by adding the missing initialization.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Fixes: 74ed7c9bd838 ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN")
Link: https://lore.kernel.org/r/20200923114726.2704426-1-mkl@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agocan: mcp25xxfd: mcp25xxfd_ring_free(): fix memory leak during cleanup
Dan Carpenter [Wed, 23 Sep 2020 11:27:52 +0000 (14:27 +0300)]
can: mcp25xxfd: mcp25xxfd_ring_free(): fix memory leak during cleanup

This loop doesn't free the first element of the array.  The "i > 0" has
to be changed to "i >= 0".
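
A minimal userspace sketch of the off-by-one, assuming a small array of
heap pointers (the names and the array size are hypothetical, not taken from
the driver):

```c
#include <stdlib.h>

#define SKETCH_NUM_RINGS 3	/* hypothetical array size */

/*
 * Free every slot, counting down to and including index 0.
 * Returns how many slots were visited.
 */
static int sketch_ring_free(void **ring)
{
	int visited = 0;
	int i;

	/* With "i > 0" the loop would stop early and leak ring[0]. */
	for (i = SKETCH_NUM_RINGS - 1; i >= 0; i--) {
		free(ring[i]);
		ring[i] = NULL;
		visited++;
	}

	return visited;
}
```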

Fixes: 74ed7c9bd838 ("can: mcp25xxfd: add driver for Microchip MCP25xxFD SPI CAN")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20200923112752.GA1473821@mwanda
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agocan: mcp25xxfd: mcp25xxfd_probe(): add SPI clk limit related errata information
Thomas Kopp [Fri, 25 Sep 2020 06:56:06 +0000 (08:56 +0200)]
can: mcp25xxfd: mcp25xxfd_probe(): add SPI clk limit related errata information

This patch adds a reference to the recently released MCP2517FD and MCP2518FD
errata sheets and pastes the explanation.

The driver already implements the proposed fix.

Signed-off-by: Thomas Kopp <thomas.kopp@microchip.com>
Link: https://lore.kernel.org/r/20200925065606.358-1-thomas.kopp@microchip.com
[mkl: split into two patches, adjust subject and commit message]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agocan: mcp25xxfd: mcp25xxfd_handle_eccif(): add ECC related errata and update log messages
Thomas Kopp [Fri, 25 Sep 2020 06:56:06 +0000 (08:56 +0200)]
can: mcp25xxfd: mcp25xxfd_handle_eccif(): add ECC related errata and update log messages

This patch adds a reference to the recently released MCP2517FD and MCP2518FD
errata sheets and pastes the explanation.

The single error correction does not always work, so always indicate that a
single error occurred. If the location of the ECC error is outside of the
TX-RAM always use netdev_notice() to log the problem. For ECC errors in the
TX-RAM, there is a recovery procedure.

Signed-off-by: Thomas Kopp <thomas.kopp@microchip.com>
Link: https://lore.kernel.org/r/20200925065606.358-1-thomas.kopp@microchip.com
[mkl: split into two patches, adjust subject and commit message]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
4 years agoMerge branch 'HW-support-for-VCAP-IS1-and-ES0-in-mscc_ocelot'
David S. Miller [Wed, 30 Sep 2020 01:26:42 +0000 (18:26 -0700)]
Merge branch 'HW-support-for-VCAP-IS1-and-ES0-in-mscc_ocelot'

Vladimir Oltean says:

====================
HW support for VCAP IS1 and ES0 in mscc_ocelot

The patches from RFC series "Offload tc-flower to mscc_ocelot switch
using VCAP chains" have been split into 2:
https://patchwork.ozlabs.org/project/netdev/list/?series=204810&state=*

This is the boring part, that deals with the prerequisites, and not with
tc-flower integration. Apart from the initialization of some hardware
blocks, which at this point still don't do anything, no new
functionality is introduced.

- Key and action field offsets are defined for the supported switches.
- VCAP properties are added to the driver for the new TCAM blocks. But
  instead of adding them manually as was done for IS2, which is error
  prone, the driver is refactored to read these parameters from
  hardware, which is possible.
- Some improvements regarding the processing of struct ocelot_vcap_filter.
- Extending the code to be compatible with full and quarter keys.

This series was tested, along with other patches not yet submitted, on
the Felix and Seville switches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mscc: ocelot: look up the filters in flower_stats() and flower_destroy()
Vladimir Oltean [Tue, 29 Sep 2020 22:27:33 +0000 (01:27 +0300)]
net: mscc: ocelot: look up the filters in flower_stats() and flower_destroy()

Currently a new filter is created, containing just enough correct
information to be able to call ocelot_vcap_block_find_filter_by_index()
on it.

This will be limiting us in the future, when we'll have more metadata
associated with a filter, which will matter in the stats() and destroy()
callbacks, and which we can't make up on the spot. For example, we'll
start "offloading" some dummy tc filter entries for the TCAM skeleton,
but we won't actually be adding them to the hardware, or to block->rules.
So, it makes sense to avoid deleting those rules too. That's the kind of
thing which is difficult to determine unless we look up the real filter.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mscc: ocelot: add a new ocelot_vcap_block_find_filter_by_id function
Vladimir Oltean [Tue, 29 Sep 2020 22:27:32 +0000 (01:27 +0300)]
net: mscc: ocelot: add a new ocelot_vcap_block_find_filter_by_id function

And rename the existing find to ocelot_vcap_block_find_filter_by_index.
The index is the position in the TCAM, and the id is the flow cookie
given by tc.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mscc: ocelot: rename variable 'cnt' in vcap_data_offset_get()
Vladimir Oltean [Tue, 29 Sep 2020 22:27:31 +0000 (01:27 +0300)]
net: mscc: ocelot: rename variable 'cnt' in vcap_data_offset_get()

The 'cnt' variable is actually used for 2 purposes, to hold the number
of sub-words per VCAP entry, and the number of sub-words per VCAP
action.

In fact, I'm pretty sure these 2 numbers can never be different from one
another. By hardware definition, the entry (key) TCAM rows are divided
into the same number of sub-words as its associated action RAM rows.
But nonetheless, let's at least rename the variables such that
observations like this one are easier to make in the future.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mscc: ocelot: rename variable 'count' in vcap_data_offset_get()
Vladimir Oltean [Tue, 29 Sep 2020 22:27:30 +0000 (01:27 +0300)]
net: mscc: ocelot: rename variable 'count' in vcap_data_offset_get()

This gets rid of one of the 2 variables named, very generically,
"count".

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mscc: ocelot: calculate vcap offsets correctly for full and quarter entries
Xiaoliang Yang [Tue, 29 Sep 2020 22:27:29 +0000 (01:27 +0300)]
net: mscc: ocelot: calculate vcap offsets correctly for full and quarter entries

When calculating the offsets for the current entry within the row and
placing them inside struct vcap_data, the function assumes half key
entry (2 keys per row).

This patch modifies the vcap_data_offset_get() function to calculate a
correct data offset when the setting VCAP Type-Group of a key to
VCAP_TG_FULL or VCAP_TG_QUARTER.

This is needed because, for example, VCAP ES0 only supports full keys.

Also rename the 'count' variable to 'num_entries_per_row' to make the
function just one tiny bit easier to follow.
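
The dependency on the key size can be sketched roughly as follows. This is
not the driver's vcap_data_offset_get(); the enum and helpers are
hypothetical, and the only facts assumed are that a half key means 2 keys
per row and, by analogy, that a full key means 1 and a quarter key means 4.

```c
/* Hypothetical stand-ins for the VCAP_TG_* type-group values. */
enum sketch_tg {
	SKETCH_TG_FULL,
	SKETCH_TG_HALF,
	SKETCH_TG_QUARTER,
};

/* Number of entries that share one TCAM row for a given key size. */
static int sketch_entries_per_row(enum sketch_tg tg)
{
	switch (tg) {
	case SKETCH_TG_FULL:
		return 1;
	case SKETCH_TG_HALF:
		return 2;
	case SKETCH_TG_QUARTER:
		return 4;
	}
	return 0;
}

/* Row that holds entry 'ix', or -1 for an unknown key size. */
static int sketch_entry_row(int ix, enum sketch_tg tg)
{
	int n = sketch_entries_per_row(tg);

	return n ? ix / n : -1;
}

/* Column (position within the row) of entry 'ix', or -1. */
static int sketch_entry_col(int ix, enum sketch_tg tg)
{
	int n = sketch_entries_per_row(tg);

	return n ? ix % n : -1;
}
```

Hard-coding 2 entries per row gives the wrong row and column for anything
but half keys, which is the assumption this patch removes.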

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mscc: ocelot: parse flower action before key
Vladimir Oltean [Tue, 29 Sep 2020 22:27:28 +0000 (01:27 +0300)]
net: mscc: ocelot: parse flower action before key

When we'll make the switch to multiple chain offloading, we'll want to
know first what VCAP block the rule is offloaded to. This impacts what
keys are available. Since the VCAP block is determined by what actions
are used, parse the action first.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mscc: ocelot: remove unneeded VCAP parameters for IS2
Vladimir Oltean [Tue, 29 Sep 2020 22:27:27 +0000 (01:27 +0300)]
net: mscc: ocelot: remove unneeded VCAP parameters for IS2

Now that we are deriving these from the constants exposed by the
hardware, we can delete the static info we're keeping in the driver.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mscc: ocelot: automatically detect VCAP constants
Vladimir Oltean [Tue, 29 Sep 2020 22:27:26 +0000 (01:27 +0300)]
net: mscc: ocelot: automatically detect VCAP constants

The numbers in struct vcap_props are not intuitive to derive, because
they are not a straightforward copy-and-paste from the reference manual
but instead rely on a fairly detailed level of understanding of the
layout of an entry in the TCAM and in the action RAM. For this reason,
bugs are very easy to introduce here.

Ease the work of hardware porters and read from hardware the constants
that were exported for this particular purpose. Note that this implies
that struct vcap_props can no longer be const.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mscc: ocelot: add definitions for VCAP ES0 keys, actions and target
Vladimir Oltean [Tue, 29 Sep 2020 22:27:25 +0000 (01:27 +0300)]
net: mscc: ocelot: add definitions for VCAP ES0 keys, actions and target

As a preparation step for the offloading to ES0, let's create the
infrastructure for talking with this hardware block.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mscc: ocelot: add definitions for VCAP IS1 keys, actions and target
Vladimir Oltean [Tue, 29 Sep 2020 22:27:24 +0000 (01:27 +0300)]
net: mscc: ocelot: add definitions for VCAP IS1 keys, actions and target

As a preparation step for the offloading to IS1, let's create the
infrastructure for talking with this hardware block.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mscc: ocelot: generalize existing code for VCAP
Vladimir Oltean [Tue, 29 Sep 2020 22:27:23 +0000 (01:27 +0300)]
net: mscc: ocelot: generalize existing code for VCAP

In the Ocelot switches there are 3 TCAMs: VCAP ES0, IS1 and IS2, which
have the same configuration interface, but different sets of keys and
actions. The driver currently only supports VCAP IS2.

In preparation of VCAP IS1 and ES0 support, the existing code must be
generalized to work with any VCAP.

In that direction, we should move the structures that depend upon VCAP
instantiation, like vcap_is2_keys and vcap_is2_actions, out of struct
ocelot and into struct vcap_props .keys and .actions, a structure that
is replicated 3 times, once per VCAP. We'll pass that structure as an
argument to each function that does the key and action packing - only
the control logic needs to distinguish between ocelot->vcap[VCAP_IS2]
or IS1 or ES0.

Another change is to make use of the newly introduced ocelot_target_read
and ocelot_target_write API, since the 3 VCAPs have the same registers
but put at different addresses.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mscc: ocelot: return error if VCAP filter is not found
Xiaoliang Yang [Tue, 29 Sep 2020 22:27:22 +0000 (01:27 +0300)]
net: mscc: ocelot: return error if VCAP filter is not found

Although it doesn't look like it is possible to hit these conditions
from user space, there are 2 separate, but related, issues.

First, the ocelot_vcap_block_get_filter_index function, née
ocelot_ace_rule_get_index_id prior to the 5e7620b6f9b6 ("net: mscc:
ocelot: generalize the "ACE/ACL" names") rename, does not do what the
author probably intended. If the desired filter entry is not present in
the ACL block, this function returns an index equal to the total number
of filters, instead of -1, which is maybe what was intended, judging
from the curious initialization with -1, and the "++index" idioms.
Either way, none of the callers seems to expect this behavior.

Second issue, the callers don't actually check the return value at all.
So in case the filter is not found in the rule list, propagate the
return code.

So update the callers and also take the opportunity to get rid of the
odd coding idioms that appear to work but don't.
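
The intended behaviour can be sketched in plain C like this. It is an
illustration with hypothetical names, using an array instead of the driver's
rule list and -1 as the "not found" value mentioned above; it is not the
code of this patch.

```c
#include <stddef.h>

struct sketch_filter {
	unsigned long id;
};

/* Return the position of the filter with the given id, or -1 if absent. */
static int sketch_get_filter_index(const struct sketch_filter *filters,
				   size_t count, unsigned long id)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (filters[i].id == id)
			return (int)i;

	return -1;
}

/* The caller propagates a failed lookup instead of using the index. */
static int sketch_del_filter(const struct sketch_filter *filters,
			     size_t count, unsigned long id)
{
	int index = sketch_get_filter_index(filters, count, id);

	if (index < 0)
		return index;

	/* ... the entry at 'index' would be removed here ... */
	return 0;
}
```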

Signed-off-by: Xiaoliang Yang <xiaoliang.yang_1@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mscc: ocelot: introduce a new ocelot_target_{read,write} API
Vladimir Oltean [Tue, 29 Sep 2020 22:27:21 +0000 (01:27 +0300)]
net: mscc: ocelot: introduce a new ocelot_target_{read,write} API

There are some targets (register blocks) in the Ocelot switch that are
instantiated more than once. For example, the VCAP IS1, IS2 and ES0
blocks all share the same register layout for interacting with the cache
for the TCAM and the action RAM.

For the VCAPs, the procedure for servicing them is actually common. We
just need an API specifying which VCAP we are talking to, and we do that
via these raw ocelot_target_read and ocelot_target_write accessors.

In plain ocelot_read, the target is encoded into the register enum
itself:

u16 target = reg >> TARGET_OFFSET;

For the VCAPs, the registers are currently defined like this:

enum ocelot_reg {
[...]
S2_CORE_UPDATE_CTRL = S2 << TARGET_OFFSET,
S2_CORE_MV_CFG,
S2_CACHE_ENTRY_DAT,
S2_CACHE_MASK_DAT,
S2_CACHE_ACTION_DAT,
S2_CACHE_CNT_DAT,
S2_CACHE_TG_DAT,
[...]
};

which is precisely what we want to avoid, because we'd have to duplicate
the same register map for S1 and for S0, and then figure out how to pass
VCAP instance-specific registers to the ocelot_read calls (basically
another lookup table that undoes the effect of shifting with
TARGET_OFFSET).

So for some targets, propose a more raw API, similar to what is
currently done with ocelot_port_readl and ocelot_port_writel. Those
targets can only be accessed with ocelot_target_{read,write} and not
with ocelot_{read,write} after the conversion, which is fine.

The VCAP registers are not actually modified to use this new API as of
this patch. They will be modified in the next one.
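
To illustrate the idea (not the actual ocelot_target_read and
ocelot_target_write implementation, which this message does not show), a
target-indexed accessor could look roughly like the sketch below. All names
are hypothetical and each register block is modelled as an in-memory array.

```c
#include <stdint.h>

/* Hypothetical targets: three instances of the same register block. */
enum sketch_target {
	SKETCH_S0,
	SKETCH_S1,
	SKETCH_S2,
	SKETCH_TARGET_MAX,
};

/* One register layout, defined once and shared by all instances. */
enum sketch_vcap_reg {
	SKETCH_CORE_UPDATE_CTRL,
	SKETCH_CORE_MV_CFG,
	SKETCH_VCAP_REG_MAX,
};

struct sketch_switch {
	uint32_t *base[SKETCH_TARGET_MAX];	/* start of each block */
};

/* The target is an explicit argument, not encoded in the register enum. */
static uint32_t sketch_target_read(const struct sketch_switch *sw,
				   enum sketch_target target,
				   enum sketch_vcap_reg reg)
{
	return sw->base[target][reg];
}

static void sketch_target_write(struct sketch_switch *sw,
				enum sketch_target target,
				enum sketch_vcap_reg reg, uint32_t val)
{
	sw->base[target][reg] = val;
}
```

With the target passed separately, the register enum does not need to be
duplicated per instance.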

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mvneta: avoid possible cache misses in mvneta_rx_swbm
Lorenzo Bianconi [Tue, 29 Sep 2020 21:58:57 +0000 (23:58 +0200)]
net: mvneta: avoid possible cache misses in mvneta_rx_swbm

Do not use rx_desc pointers if possible since rx descriptors are stored in
uncached memory and dereferencing rx_desc pointers generates extra loads.
This patch improves XDP_DROP performance by ~110Kpps (700Kpps vs 590Kpps)
on Marvell Espressobin.

Analyzed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agolib8390: Replace panic() call with BUILD_BUG_ON
Armin Wolf [Tue, 29 Sep 2020 17:13:26 +0000 (19:13 +0200)]
lib8390: Replace panic() call with BUILD_BUG_ON

Replace panic() call in lib8390.c with BUILD_BUG_ON()
since checking the size of struct e8390_pkt_hdr should
happen at compile-time.
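
Outside the kernel, the same kind of check can be sketched with C11
_Static_assert standing in for BUILD_BUG_ON(). The struct below is a
hypothetical stand-in, and the expected size of 4 bytes is an assumption
made for this example only.

```c
#include <stdint.h>

/* Hypothetical packet header, used only for this illustration. */
struct sketch_pkt_hdr {
	uint8_t status;
	uint8_t next;
	uint16_t count;
};

/*
 * The build fails if the layout is not the assumed 4 bytes, so no
 * runtime check (or panic) is needed.
 */
_Static_assert(sizeof(struct sketch_pkt_hdr) == 4,
	       "header struct mispacked");
```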

Signed-off-by: Armin Wolf <W_Armin@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agoMerge branch 'net-in_interrupt-cleanup-and-fixes'
David S. Miller [Tue, 29 Sep 2020 21:02:55 +0000 (14:02 -0700)]
Merge branch 'net-in_interrupt-cleanup-and-fixes'

Thomas Gleixner says:

====================
net: in_interrupt() cleanup and fixes

In the discussion about preempt count consistency across kernel configurations:

  https://lore.kernel.org/r/20200914204209.256266093@linutronix.de/

Linus clearly requested that code in drivers and libraries which changes
behaviour based on execution context should either be split up so that
e.g. task context invocations and BH invocations have different interfaces
or if that's not possible the context information has to be provided by the
caller which knows in which context it is executing.

This includes conditional locking, allocation mode (GFP_*) decisions and
avoidance of code paths which might sleep.

In the long run, usage of 'preemptible, in_*irq etc.' should be banned from
driver code completely.

This is the second version of the first batch of related changes. V1 can be
found here:

     https://lore.kernel.org/r/20200927194846.045411263@linutronix.de

Changes vs. V1:

  - Rebased to net-next

  - Fixed the half done rename silliness in the ENIC patch.

  - Fixed the IONIC driver fallout.

  - Picked up the SFC fix from Edward and adjusted the GFP_KERNEL change
    accordingly.

  - Addressed the review comments vs. BRCMFMAC.

  - Collected Reviewed/Acked-by tags as appropriate.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: rtlwifi: Replace in_interrupt() for context detection
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:45 +0000 (22:25 +0200)]
net: rtlwifi: Replace in_interrupt() for context detection

rtl_lps_enter() and rtl_lps_leave() are using in_interrupt() to detect
whether it is safe to acquire a mutex or if it is required to defer to a
workqueue.

The usage of in_interrupt() in drivers is phased out and Linus clearly
requested that code which changes behaviour depending on context should
either be separated or the context be conveyed in an argument passed by the
caller, which usually knows the context.

in_interrupt() also is only partially correct because it fails to choose the
correct code path when just preemption or interrupts are disabled.

Add an argument 'may_block' to both functions and adjust the callers to
pass the context information.
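
The pattern can be sketched in userspace as follows. This is an
illustration with hypothetical names, where counters stand in for the mutex
protected path and for the deferred workqueue path; it is not the rtlwifi
code.

```c
#include <stdbool.h>

struct sketch_dev {
	int direct_calls;	/* work done immediately, blocking allowed */
	int deferred_calls;	/* work handed off to a worker */
};

/*
 * The caller states whether blocking is allowed; the function no longer
 * tries to guess the context on its own.
 */
static void sketch_lps_leave(struct sketch_dev *dev, bool may_block)
{
	if (may_block)
		dev->direct_calls++;	/* would take the mutex here */
	else
		dev->deferred_calls++;	/* would schedule a work item here */
}
```

A caller running in task context would pass true, while an interrupt
handler would pass false.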

The following call chains were analyzed to be safe to block:

    rtl_watchdog_wq_callback()
      rtl_lps_leave/enter()

    rtl_op_suspend()
      rtl_lps_leave()

    rtl_op_bss_info_changed()
      rtl_lps_leave()

    rtl_op_sw_scan_start()
      rtl_lps_leave()

The following call chains were analyzed to be unsafe to block:

    _rtl_pci_interrupt()
      _rtl_pci_rx_interrupt()
        rtl_lps_leave()

    _rtl_pci_interrupt()
      _rtl_pci_rx_interrupt()
        rtl_is_special_data()
          rtl_lps_leave()

    _rtl_pci_interrupt()
      _rtl_pci_rx_interrupt()
        rtl_is_special_data()
          setup_special_tx()
            rtl_lps_leave()

    _rtl_pci_interrupt()
      _rtl_pci_tx_isr
        rtl_lps_leave()

      halbtc_leave_lps()
        rtl_lps_leave()

This leaves four callers of rtl_lps_enter/leave() where the analysis
stopped dead in the maze of several nested pointer based callchains and
lack of rtlwifi hardware to debug this via tracing:

     halbtc_leave_lps(), halbtc_enter_lps(), halbtc_normal_lps(),
     halbtc_pre_normal_lps()

These four have been cautiously marked to be unable to block which is the
safe option, but the rtlwifi wizards should be able to clarify that.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: rtlwifi: Remove in_interrupt() from debug macro
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:44 +0000 (22:25 +0200)]
net: rtlwifi: Remove in_interrupt() from debug macro

The usage of in_interrupt() in drivers is phased out.

rtl_dbg() a printk based debug aid is using in_interrupt() in the
underlying C function _rtl_dbg_out() which is almost identical to
_rtl_dbg_print(). The only difference is the printout of in_interrupt().

The decoding of in_interrupt() as hexvalue is non-trivial and aside of
being phased out for driver usage the return value is just by chance the
masked preempt count value and not a boolean.

These home-brewed printk debug aids are tedious to work with and provide
only minimal context.  They should be replaced by trace_printk() or a debug
tracepoint which automatically records all context information.

To make progress on the in_interrupt() cleanup, make rtl_dbg() use
_rtl_dbg_print() and remove _rtl_dbg_out().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: rtlwifi: Remove void* casts related to delayed work
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:43 +0000 (22:25 +0200)]
net: rtlwifi: Remove void* casts related to delayed work

INIT_DELAYED_WORK() takes two arguments: A pointer to the delayed work and
a function reference for the callback.

The rtl code casts all function references to (void *) because the
callbacks in use are not matching the required function signature. That's
error prone and bad practice.

Some of the callback functions are also global, but only used in a single
file.

Clean the mess up by:

  - Adding the proper arguments to the callback functions and using them in
    the container_of() constructs correctly which removes the hideous
    container_of_dwork_rtl() macro as well.

  - Removing the type cast at the initializers

  - Making the unnecessary global functions static
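
A userspace sketch of the first point, with simplified stand-ins for the
kernel's work structures and container_of(). All type and function names
here are hypothetical; only the shape of the callback is the point.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's container_of(). */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Simplified stand-ins for struct work_struct and struct delayed_work. */
struct sketch_work {
	int pending;
};

struct sketch_delayed_work {
	struct sketch_work work;
	unsigned long delay;
};

struct sketch_priv {
	int watchdog_runs;
	struct sketch_delayed_work watchdog_wq;
};

/*
 * The callback takes the work item, as the work API expects, and walks
 * back to the enclosing private data without any (void *) casts.
 */
static void sketch_watchdog_cb(struct sketch_work *work)
{
	struct sketch_delayed_work *dwork =
		sketch_container_of(work, struct sketch_delayed_work, work);
	struct sketch_priv *priv =
		sketch_container_of(dwork, struct sketch_priv, watchdog_wq);

	priv->watchdog_runs++;
}
```

Because the callback has the expected signature, the compiler can check the
initializer and no cast is required.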

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: libertas: Use netif_rx_any_context()
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:42 +0000 (22:25 +0200)]
net: libertas: Use netif_rx_any_context()

The usage of in_interrupt() in non-core code is phased out. Ideally the
information of the calling context should be passed by the callers or the
functions be split as appropriate.

libertas uses in_interrupt() to select the netif_rx*() variant which matches
the calling context. The attempt to consolidate the code by passing an
argument or by disentangling it failed due to lack of knowledge about this
driver and because the call chains are hard to follow.

As a stop gap use netif_rx_any_context() which invokes the correct code
path depending on context and confines the in_interrupt() usage to core
code.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: libertas libertas_tf: Remove in_interrupt() from debug macro.
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:41 +0000 (22:25 +0200)]
net: libertas libertas_tf: Remove in_interrupt() from debug macro.

The debug macro prints (INT) when in_interrupt() returns true. The value of
this information is dubious as it does not distinguish between the various
contexts which are covered by in_interrupt().

As the usage of in_interrupt() in drivers is phased out and the same
information can be more precisely obtained with tracing, remove the
in_interrupt() conditional from this debug printk.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: mwifiex: Use netif_rx_any_context().
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:40 +0000 (22:25 +0200)]
net: mwifiex: Use netif_rx_any_context().

The usage of in_interrupt() in non-core code is phased out. Ideally the
information of the calling context should be passed by the callers or the
functions be split as appropriate.

mwifiex uses in_interrupt() to select the netif_rx*() variant which matches
the calling context. The attempt to consolidate the code by passing an
argument or by disentangling it failed due to lack of knowledge about this
driver and because the call chains are hard to follow.

As a stop gap use netif_rx_any_context() which invokes the correct code
path depending on context and confines the in_interrupt() usage to core
code.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years agonet: hostap: Remove in_interrupt() usage
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:39 +0000 (22:25 +0200)]
net: hostap: Remove in_interrupt() usage

in_interrupt() is ill defined and does not provide what the name
suggests. The usage especially in driver code is deprecated and a tree wide
effort to clean up and consolidate the (ab)usage of in_interrupt() and
related checks is happening.

hfa384x_cmd() and prism2_hw_reset() check in_interrupt() at function entry
and if true emit a printk at debug loglevel and return. This is clearly debug
code.

Both functions invoke functions which can sleep. These functions already
have appropriate debug checks which cover all invalid contexts, while
in_interrupt() fails to detect context which just has preemption or
interrupts disabled.

Remove both checks as they are incomplete, debug only and already covered
by the subsequently invoked functions properly. If called from invalid
context the resulting back trace is definitely more helpful to analyze the
problem than a printk at debug loglevel.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: iwlwifi: Remove in_interrupt() from tracing macro.
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:38 +0000 (22:25 +0200)]
net: iwlwifi: Remove in_interrupt() from tracing macro.

The usage of in_interrupt() in driver code is phased out.

The iwlwifi_dbg tracepoint records in_interrupt() separately, but that's
superfluous because the trace header already records all kinds of state and
context information like hardirq status, softirq status, preemption count
etc.

Aside from that, recording in_interrupt() as a boolean does not allow
distinguishing between the possible contexts (hard interrupt, soft interrupt,
bottom half disabled), while the trace header gives precise information.
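
To illustrate why a single boolean loses information, here is a minimal
userspace sketch of the preempt_count bit fields. The mask values and all
model_* names are assumptions of this sketch: they follow the layout in
include/linux/preempt.h but leave out the NMI bits, and this is not the
tracepoint code itself.

```c
#include <stdbool.h>

/* Simplified model of the preempt_count fields (assumed layout, no NMI). */
#define PREEMPT_MASK		0x000000ffu	/* preempt_disable() nesting */
#define SOFTIRQ_MASK		0x0000ff00u	/* softirq / BH-disabled state */
#define HARDIRQ_MASK		0x000f0000u	/* hard interrupt nesting */

#define PREEMPT_OFFSET		0x00000001u
#define SOFTIRQ_OFFSET		0x00000100u	/* serving a softirq */
#define SOFTIRQ_DISABLE_OFFSET	0x00000200u	/* local_bh_disable() */
#define HARDIRQ_OFFSET		0x00010000u

/* Model of in_interrupt(): true if any hardirq or softirq bit is set. */
static bool model_in_interrupt(unsigned int count)
{
	return (count & (HARDIRQ_MASK | SOFTIRQ_MASK)) != 0;
}

/* What the full counter can tell apart and the boolean cannot. */
static const char *model_context_name(unsigned int count)
{
	if (count & HARDIRQ_MASK)
		return "hardirq";
	if (count & SOFTIRQ_OFFSET)
		return "softirq";
	if (count & SOFTIRQ_MASK)
		return "bh-disabled";
	if (count & PREEMPT_MASK)
		return "preempt-disabled";
	return "preemptible";
}
```

In this model a bottom-half-disabled section in task context makes
model_in_interrupt() return true, while a preempt-disabled section makes it
return false, although neither may sleep. Disabled interrupts are not
tracked in this counter at all.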

Remove the duplicate information from the tracepoint and fix up the caller.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Luca Coelho <luca@coelho.fi>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: ipw2x00,iwlegacy,iwlwifi: Remove in_interrupt() from debug macros
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:37 +0000 (22:25 +0200)]
net: ipw2x00,iwlegacy,iwlwifi: Remove in_interrupt() from debug macros

The usage of in_interrupt() in non-core code is phased out.

The debugging macros in these drivers use in_interrupt() to print 'I' or
'U' depending on the return value of in_interrupt(). 'U' is confusing at
best and 'I' does not really describe the actual context (hard interrupt,
soft interrupt, bottom half disabled section). These debug macros originate
from the pre-ftrace kernel era and their value today is questionable. They
probably should be removed completely.

The macros were added initially for ipw2100 and then spread when the
driver was forked.

Remove the in_interrupt() usage at least.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: brcmfmac: Convey allocation mode as argument
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:36 +0000 (22:25 +0200)]
net: brcmfmac: Convey allocation mode as argument

The usage of in_interrupt() in drivers is phased out and Linus clearly
requested that code which changes behaviour depending on context should
either be separated or the context be conveyed in an argument passed by the
caller, which usually knows the context.

brcmf_fweh_process_event() uses in_interrupt() to select the allocation
mode GFP_KERNEL/GFP_ATOMIC. Aside from the above reasons this check is
incomplete as it cannot detect contexts which just have preemption or
interrupts disabled.

All callchains leading to brcmf_fweh_process_event() can clearly identify
the calling context. Convey a 'gfp' argument through the callchains and let
the callers hand in the appropriate GFP mode.

This also has the advantage that any change of execution context or
preemption/interrupt state in these callchains will be detected by the
memory allocator for all GFP_KERNEL allocations.
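
The pattern can be sketched in userspace as follows. Everything here is a
hypothetical stand-in (the model_* names, the enum, the counters); it is not
the brcmfmac code, only an illustration of a caller handing in the
allocation mode and of an allocator that notices a sleeping allocation
issued from atomic context.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for GFP_KERNEL and GFP_ATOMIC. */
enum model_gfp { MODEL_GFP_KERNEL, MODEL_GFP_ATOMIC };

/* Modeled execution state: non-zero means sleeping is not allowed. */
static int model_atomic_depth;
/* Counts what the kernel would report as an invalid-context warning. */
static int model_invalid_context_warnings;

static void *model_alloc(size_t size, enum model_gfp gfp)
{
	/* A sleeping allocation in atomic context is caught centrally. */
	if (gfp == MODEL_GFP_KERNEL && model_atomic_depth > 0)
		model_invalid_context_warnings++;
	return malloc(size);
}

/* The event handler no longer guesses: the caller passes the mode. */
static void *model_process_event(size_t len, enum model_gfp gfp)
{
	return model_alloc(len, gfp);
}

/* Caller in task context: may sleep. */
static void *model_task_path(size_t len)
{
	return model_process_event(len, MODEL_GFP_KERNEL);
}

/* Caller in interrupt context: must not sleep. */
static void *model_irq_path(size_t len)
{
	void *p;

	model_atomic_depth++;
	p = model_process_event(len, MODEL_GFP_ATOMIC);
	model_atomic_depth--;
	return p;
}
```

If model_task_path() is ever reached with model_atomic_depth raised, the
warning counter increments, which mirrors the detection described above.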

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: brcmfmac: Convey execution context via argument to brcmf_netif_rx()
Thomas Gleixner [Tue, 29 Sep 2020 20:25:35 +0000 (22:25 +0200)]
net: brcmfmac: Convey execution context via argument to brcmf_netif_rx()

brcmf_netif_rx() uses in_interrupt() to choose between netif_rx() and
netif_rx_ni(). in_interrupt() usage in drivers is phased out.

Convey the execution mode via an 'inirq' argument through the various
callchains leading to brcmf_netif_rx():

brcmf_pcie_isr_thread()     <- Task context
  brcmf_proto_msgbuf_rx_trigger()
    brcmf_msgbuf_process_rx()
      brcmf_msgbuf_process_msgtype()
        brcmf_msgbuf_process_rx_complete()
          brcmf_netif_mon_rx()
            brcmf_netif_rx(inirq = false)
          brcmf_netif_rx(inirq = false)

brcmf_sdio_readframes()  <- Task context sdio_claim_host() might sleep
  brcmf_rx_frame(inirq = false)

brcmf_sdio_rxglom()      <- Task context sdio_claim_host() might sleep
  brcmf_rx_frame(inirq = false)

brcmf_usb_rx_complete()  <- Interrupt context
  brcmf_rx_frame(inirq = true)

brcmf_rx_frame()
  brcmf_proto_rxreorder()
    brcmf_proto_bcdc_rxreorder()
      brcmf_fws_rxreorder()
        brcmf_netif_rx()
      brcmf_netif_rx()

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arend van Spriel <arend.vanspriel@broadcom.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: brcmfmac: Replace in_interrupt()
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:34 +0000 (22:25 +0200)]
net: brcmfmac: Replace in_interrupt()

brcmf_sdio_isr() is using in_interrupt() to distinguish if it is called
from an interrupt service routine or from a worker thread.

Passing such information from the calling context is preferred and
requested by Linus, so add an argument `in_isr' to brcmf_sdio_isr() and let
the callers pass the information about the calling context.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Arend van Spriel <arend.vanspriel@broadcom.com>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: wan/lmc: Remove lmc_trace()
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:33 +0000 (22:25 +0200)]
net: wan/lmc: Remove lmc_trace()

lmc_trace() was first introduced in commit e7a392d5158af ("Import
2.3.99pre6-5") and has not been touched since.

The reason for looking at this was to get rid of the in_interrupt() usage,
but while looking at it the following observations were made:

 - At least lmc_get_stats() (->ndo_get_stats()) is invoked with disabled
   preemption which is not detected by the in_interrupt() check, which
   would cause schedule() to be called from invalid context.

 - The code is hidden behind #ifdef LMC_TRACE which is not defined within
   the kernel and wasn't at the time it was introduced.

 - Three jiffies don't match 50ms. msleep() would be a better match which
   would also avoid the schedule() invocation. But why have it to begin
   with?

 - Nobody would do something like this today. Either netdev_dbg() or
   trace_printk() or a trace event would be used.  If only the functions
   related to this driver are interesting then ftrace can be used with
   filtering.

As it has obviously been broken for years, simply remove it.
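
The jiffies observation can be checked with a little arithmetic. The sketch
below is a simplified userspace model with hypothetical helper names; the
kernel's own conversion helpers handle rounding and overflow more carefully.
The HZ values in the note after it are common configuration choices, picked
here for illustration.

```c
/* Duration of a jiffies count in milliseconds for a given tick rate. */
static unsigned int model_jiffies_to_msecs(unsigned long j, unsigned int hz)
{
	return (unsigned int)(j * 1000UL / hz);
}

/* Jiffies needed to cover at least 'ms' milliseconds (rounds up). */
static unsigned long model_msecs_to_jiffies(unsigned int ms, unsigned int hz)
{
	return ((unsigned long)ms * hz + 999UL) / 1000UL;
}
```

With these helpers, three jiffies last 30ms at HZ=100, 12ms at HZ=250 and
3ms at HZ=1000, so none of these tick rates gives 50ms. Covering 50ms would
take 5, 13 and 50 jiffies respectively.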

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: usb: net1080: Remove in_interrupt() comment
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:32 +0000 (22:25 +0200)]
net: usb: net1080: Remove in_interrupt() comment

The comment above nc_vendor_write() suggests that the function could become
async so that it is usable in `in_interrupt()' context or that it already is
safe to be called from such a context.

Either way: The function did not become async since v2.4.9.2 (2002) and it
must not be called from `in_interrupt()' context because it sleeps on
multiple occasions.

Remove the misleading comment.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: usb: kaweth: Remove last user of kaweth_control()
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:31 +0000 (22:25 +0200)]
net: usb: kaweth: Remove last user of kaweth_control()

kaweth_async_set_rx_mode() invokes kaweth_control() and has two callers:

- kaweth_open() which is invoked from preemptible context.

- kaweth_start_xmit() which holds a spinlock and has bottom halves disabled.

If called from kaweth_start_xmit() kaweth_async_set_rx_mode() obviously
cannot block, which means it can't call kaweth_control(). This is detected
with an in_interrupt() check.

Replace the in_interrupt() check in kaweth_async_set_rx_mode() with an
argument which is set true by the caller if the context is safe to sleep,
otherwise false.

Now kaweth_control() is only called from preemptible context which means
there is no need for GFP_ATOMIC allocations anymore. Replace it with
usb_control_msg(). Clean up the code a bit while at it.

Finally remove kaweth_control() since the last user is gone.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: usb: kaweth: Replace kaweth_control() with usb_control_msg()
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:30 +0000 (22:25 +0200)]
net: usb: kaweth: Replace kaweth_control() with usb_control_msg()

kaweth_control() is almost the same as usb_control_msg() except for the
memory allocation mode (GFP_ATOMIC vs GFP_NOIO) and the in_interrupt()
check.

All the invocations of kaweth_control() are within the probe function in
fully preemptible context so there is no reason to use atomic allocations;
GFP_NOIO, which is used by usb_control_msg(), is perfectly fine.

Replace kaweth_control() invocations from probe with usb_control_msg().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: zd1211rw: Remove ZD_ASSERT(in_interrupt())
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:29 +0000 (22:25 +0200)]
net: zd1211rw: Remove ZD_ASSERT(in_interrupt())

in_interrupt() is ill defined and does not provide what the name
suggests. The usage especially in driver code is deprecated and
a tree wide effort to clean up and consolidate the (ab)usage of
in_interrupt() and related checks is happening.

handle_regs_int() is always invoked as part of the URB callback, which is
invoked from either hard or soft interrupt context.

Remove the magic assertion.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: vxge: Remove in_interrupt() conditionals
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:28 +0000 (22:25 +0200)]
net: vxge: Remove in_interrupt() conditionals

vxge_os_dma_malloc() and vxge_os_dma_malloc_async() are both called from
callchains which use GFP_KERNEL allocations unconditionally or have other
requirements to be called from fully preemptible task context.

vxge_os_dma_malloc():
  1)  __vxge_hw_blockpool_create() <- GFP_KERNEL

  2)  __vxge_hw_mempool_grow() <- vzalloc()
        __vxge_hw_blockpool_malloc()

vxge_os_dma_malloc_async():
  1)  __vxge_hw_mempool_grow() <- vzalloc()
        __vxge_hw_blockpool_malloc()
          __vxge_hw_blockpool_blocks_add()

  2)  vxge_hw_vpath_open() <- vzalloc()
        __vxge_hw_blockpool_block_allocate()

That means neither of these functions needs a conditional allocation mode.

Remove the in_interrupt() conditional and use GFP_KERNEL.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: sun3lance: Remove redundant checks in interrupt handler
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:27 +0000 (22:25 +0200)]
net: sun3lance: Remove redundant checks in interrupt handler

lance_interrupt() contains two pointless checks:

 - A check whether the 'dev_id' argument is NULL. 'dev_id' is the pointer
   which was handed in to request_irq() and the interrupt handler will
   always be invoked with that pointer as 'dev_id' argument by the core
   code.

 - A check for interrupt reentrancy. The core code already guarantees
   non-reentrancy of interrupt handlers.

Remove these checks.
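
The first point can be illustrated with a small userspace model of handler
registration. All model_* names and types are hypothetical and this is not
the kernel's interrupt core; the sketch only shows that the dispatcher hands
back exactly the pointer that was registered, so a non-NULL registration
cannot produce a NULL 'dev_id' in the handler.

```c
#include <stddef.h>

typedef int (*model_irq_handler_t)(int irq, void *dev_id);

/* One registered handler together with its cookie. */
struct model_irq_desc {
	model_irq_handler_t handler;
	void *dev_id;
};

static struct model_irq_desc model_desc;

/* Registration stores the handler and the caller's cookie. */
static int model_request_irq(int irq, model_irq_handler_t handler,
			     void *dev_id)
{
	(void)irq;
	model_desc.handler = handler;
	model_desc.dev_id = dev_id;
	return 0;
}

/* Delivery always passes the stored cookie back unchanged. */
static int model_deliver_irq(int irq)
{
	return model_desc.handler(irq, model_desc.dev_id);
}

struct model_netdev {
	int irq_count;
};

/* Handler without a NULL check on dev_id. */
static int model_lance_interrupt(int irq, void *dev_id)
{
	struct model_netdev *dev = dev_id;

	(void)irq;
	dev->irq_count++;
	return 1;	/* handled */
}
```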

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: sunbmac: Replace in_interrupt() usage
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:26 +0000 (22:25 +0200)]
net: sunbmac: Replace in_interrupt() usage

bigmac_init_rings() has an argument signaling if it is called from the
interrupt handler. This is used to decide between GFP_KERNEL and GFP_ATOMIC
for memory allocations.

But it also checks in_interrupt() to handle invocations which come from the
timer callback bigmac_timer() via bigmac_hw_init(), which is invoked with
'in_irq = 0'. While the timer callback is clearly not in hard interrupt
context it is still not sleepable context.

Rename the argument to `non_blocking' and set it to true if invoked from
the timer callback or the interrupt handler. This allows removing the
in_interrupt() check and makes the code consistent.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: sfc: Use GFP_KERNEL in efx_ef10_try_update_nic_stats()
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:25 +0000 (22:25 +0200)]
net: sfc: Use GFP_KERNEL in efx_ef10_try_update_nic_stats()

efx_ef10_try_update_nic_stats_vf() is now only invoked from thread context
and can sleep after efx::stats_lock is dropped.

Change the allocation mode from GFP_ATOMIC to GFP_KERNEL.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: sfc: Replace in_interrupt() usage
Edward Cree [Tue, 29 Sep 2020 20:25:24 +0000 (22:25 +0200)]
net: sfc: Replace in_interrupt() usage

efx_ef10_try_update_nic_stats_vf() used in_interrupt() to figure out
whether it is safe to sleep (for MCDI) or not.

The only caller from which it was not safe is efx_net_stats(), which can be
invoked under dev_base_lock from net-sysfs::netstat_show().

So add a new update_stats_atomic() method to struct efx_nic_type, and call
it from efx_net_stats(), removing the need for
efx_ef10_try_update_nic_stats_vf() to behave differently for this case
(which it wasn't doing correctly anyway).

For all nic_types other than EF10 VF, this method is NULL so the
regular update_stats() methods are invoked, which are happy with being
called from atomic contexts.
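
A minimal userspace sketch of the optional-method dispatch described above.
The struct layout and all model_* names are hypothetical, and the atomic
variant returning only cached counters is an assumption of this sketch, not
a statement about the driver.

```c
#include <stddef.h>

struct model_nic;

struct model_nic_type {
	/* Regular method. */
	unsigned long (*update_stats)(struct model_nic *nic);
	/* Optional non-sleeping variant; NULL means use update_stats. */
	unsigned long (*update_stats_atomic)(struct model_nic *nic);
};

struct model_nic {
	const struct model_nic_type *type;
	unsigned long hw_rx_packets;		/* value "in hardware" */
	unsigned long cached_rx_packets;	/* last value fetched */
};

/* Full update: refresh the cache from the modeled hardware. */
static unsigned long model_update_stats(struct model_nic *nic)
{
	nic->cached_rx_packets = nic->hw_rx_packets;
	return nic->cached_rx_packets;
}

/* Atomic variant (assumption): report the cache, never touch hardware. */
static unsigned long model_update_stats_atomic_vf(struct model_nic *nic)
{
	return nic->cached_rx_packets;
}

static const struct model_nic_type model_pf_type = {
	.update_stats = model_update_stats,
	.update_stats_atomic = NULL,
};

static const struct model_nic_type model_vf_type = {
	.update_stats = model_update_stats,
	.update_stats_atomic = model_update_stats_atomic_vf,
};

/* Atomic-context caller: prefer the atomic method when one is provided. */
static unsigned long model_net_stats(struct model_nic *nic)
{
	if (nic->type->update_stats_atomic)
		return nic->type->update_stats_atomic(nic);
	return nic->type->update_stats(nic);
}
```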

Fixes: 5be74da53936 ("sfc: don't update stats on VF when called in atomic context")
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Martin Habets <mhabets@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: natsemi: Replace in_interrupt() usage.
Thomas Gleixner [Tue, 29 Sep 2020 20:25:23 +0000 (22:25 +0200)]
net: natsemi: Replace in_interrupt() usage.

The usage of in_interrupt() in drivers is phased out and Linus clearly
requested that code which changes behaviour depending on context should
either be separated or the context be conveyed in an argument passed by the
caller, which usually knows the context.

sonic_quiesce() uses 'in_interrupt() || irqs_disabled()' to choose either
udelay() or usleep_range() in the wait loop.

In all callchains leading to it the context is well defined and known.

Add a 'may_sleep' argument and pass it through the various callchains
leading to this function.
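
A rough userspace sketch of such a wait loop. The delay helpers are stubs
that only count calls, and the poll limit, delay values and model_* names
are hypothetical; this is not the sonic driver code.

```c
#include <stdbool.h>

static unsigned int model_busy_waits;	/* calls to the udelay() stub */
static unsigned int model_sleeps;	/* calls to the usleep_range() stub */
static int model_polls_until_idle;	/* modeled hardware state */

static void model_udelay(unsigned int usecs)
{
	(void)usecs;
	model_busy_waits++;
}

static void model_usleep_range(unsigned int min, unsigned int max)
{
	(void)min;
	(void)max;
	model_sleeps++;
}

/* Reports busy until the configured number of polls has elapsed. */
static bool model_hw_busy(void)
{
	if (model_polls_until_idle > 0) {
		model_polls_until_idle--;
		return true;
	}
	return false;
}

/* The caller states whether sleeping is allowed; nothing is deduced. */
static int model_quiesce(bool may_sleep)
{
	int i;

	for (i = 0; i < 1000; i++) {
		if (!model_hw_busy())
			return 0;
		if (may_sleep)
			model_usleep_range(100, 200);
		else
			model_udelay(20);
	}
	return -1;	/* timed out */
}
```

A caller in task context would pass true, a caller holding a spinlock or
running in interrupt context would pass false.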

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: mdiobus: Remove WARN_ON_ONCE(in_interrupt())
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:22 +0000 (22:25 +0200)]
net: mdiobus: Remove WARN_ON_ONCE(in_interrupt())

in_interrupt() is ill defined and does not provide what the name
suggests. The usage especially in driver code is deprecated and a tree wide
effort to clean up and consolidate the (ab)usage of in_interrupt() and
related checks is happening.

In this case the check covers only parts of the contexts in which these
functions cannot be called. It fails to detect preemption or interrupt
disabled invocations.

As the functions which contain these warnings invoke mutex_lock() which
contains a broad variety of checks (always enabled or debug option
dependent) and therefore covers all invalid conditions already, there is no
point in having inconsistent warnings in those drivers. The conditional
return is not really valuable in practice either.

Just remove them.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: ionic: Remove WARN_ON(in_interrupt()).
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:21 +0000 (22:25 +0200)]
net: ionic: Remove WARN_ON(in_interrupt()).

in_interrupt() is ill defined and does not provide what the name
suggests. The usage especially in driver code is deprecated and a tree wide
effort to clean up and consolidate the (ab)usage of in_interrupt() and
related checks is happening.

In this case the check covers only parts of the contexts in which these
functions cannot be called. It fails to detect preemption or interrupt
disabled invocations.

As the functions which are invoked from ionic_adminq_post() and
ionic_dev_cmd_wait() contain a broad variety of checks (always enabled or
debug option dependent) which cover all invalid conditions already, there
is no point in having inconsistent warnings in those drivers.

Just remove them.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: ionic: Replace in_interrupt() usage.
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:20 +0000 (22:25 +0200)]
net: ionic: Replace in_interrupt() usage.

The in_interrupt() usage in this driver tries to figure out which context
may sleep and which context may not sleep. in_interrupt() is not really
suitable as it misses both preemption disabled and interrupt disabled
invocations from task context.

Conditionals like that in driver code are frowned upon in general because
invocations of functions from invalid contexts might not be detected
as the conditional papers over it.

ionic_lif_addr() and _ionic_lif_rx_mode() can be called from:

 1) ->ndo_set_rx_mode() which is under netif_addr_lock_bh() so it must not
    sleep.

 2) Init and setup functions which are in fully preemptible task context.

ionic_link_status_check_request() has two call paths:

 1) NAPI which obviously cannot sleep

 2) Setup which is again fully preemptible task context

Add arguments which convey the execution context to the affected functions
and let the callers provide the context instead of letting the functions
deduce it.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
4 years ago net: intel: Remove in_interrupt() warnings
Sebastian Andrzej Siewior [Tue, 29 Sep 2020 20:25:19 +0000 (22:25 +0200)]
net: intel: Remove in_interrupt() warnings

in_interrupt() is ill defined and does not provide what the name
suggests. The usage especially in driver code is deprecated and a tree wide
effort to clean up and consolidate the (ab)usage of in_interrupt() and
related checks is happening.

In this case the checks cover only parts of the contexts in which these
functions cannot be called. They fail to detect preemption or interrupt
disabled invocations.

As the functions which are invoked from the various places already contain
a broad variety of checks (always enabled or debug option dependent) which
cover all invalid conditions, there is no point in having inconsistent
warnings in those drivers.

Just remove them.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>