]> git.baikalelectronics.ru Git - kernel.git/log
kernel.git
5 years agonet: stmmac: dwmac1000: Clear unused address entries
Jose Abreu [Fri, 24 May 2019 08:20:21 +0000 (10:20 +0200)]
net: stmmac: dwmac1000: Clear unused address entries

In case we don't use a given address entry we need to clear it because
it could contain previous values that are no longer valid.

Found out while running stmmac selftests.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: dwmac1000: Fix Hash Filter
Jose Abreu [Fri, 24 May 2019 08:20:20 +0000 (10:20 +0200)]
net: stmmac: dwmac1000: Fix Hash Filter

In order for hash filter to work we need to set the HPF bit.

Found out while running stmmac selftests.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: Introduce selftests support
Jose Abreu [Fri, 24 May 2019 08:20:19 +0000 (10:20 +0200)]
net: stmmac: Introduce selftests support

We add support for selftests on stmmac driver with 9 basic sanity checks
for now:
- MAC Loopback
- PHY Loopback
- MMC Counters
- EEE
- Hash Filter Multicast
- Perfect Filter Unicast
- Multicast Filter All
- Unicast Filter All
- Flow Control

This allows for fast tracking of regressions in the driver and helps in
spotting mis-configuration of HW.

Changes from v1:
- Fix build error as module (David)
- Check for link status before running tests
Changes from RFC v2:
- Return proper error code in stmmac_test_mmc (Corentin)
- Use only 1 MMC counter in stmmac_test_mmc (Alexandre)
Changes from RFC v1:
- Change test_loopback to test_mac_loopback (Andrew)
- Change timeout to retries (Andrew)
- Add MC/UC filter tests (Andrew)
- Only test in offline mode (Andrew)
- Do not call phy_loopback twice (Alexandre)

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Corentin Labbe <clabbe.montjoie@gmail.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: dwxgmac2: Also pass control frames while in promisc mode
Jose Abreu [Fri, 24 May 2019 08:20:18 +0000 (10:20 +0200)]
net: stmmac: dwxgmac2: Also pass control frames while in promisc mode

In order for the selftests to run the Flow Control selftest we need to
also pass pause frames to the stack.

Pass this type of frames while in promiscuous mode.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: dwmac4/5: Also pass control frames while in promisc mode
Jose Abreu [Fri, 24 May 2019 08:20:17 +0000 (10:20 +0200)]
net: stmmac: dwmac4/5: Also pass control frames while in promisc mode

In order for the selftests to run the Flow Control selftest we need to
also pass pause frames to the stack.

Pass this type of frames while in promiscuous mode.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: dwmac1000: Also pass control frames while in promisc mode
Jose Abreu [Fri, 24 May 2019 08:20:16 +0000 (10:20 +0200)]
net: stmmac: dwmac1000: Also pass control frames while in promisc mode

In order for the selftests to run the Flow Control selftest we need to
also pass pause frames to the stack.

Pass this type of frames while in promiscuous mode.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: Switch MMC functions to HWIF callbacks
Jose Abreu [Fri, 24 May 2019 08:20:15 +0000 (10:20 +0200)]
net: stmmac: Switch MMC functions to HWIF callbacks

XGMAC has a different MMC module. Lets use HWIF callbacks for MMC module
so that correct callbacks are automatically selected.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ethernet: stmmac: dwmac-sun8i: Enable control of loopback
Corentin Labbe [Fri, 24 May 2019 08:20:14 +0000 (10:20 +0200)]
net: ethernet: stmmac: dwmac-sun8i: Enable control of loopback

This patch enable use of set_mac_loopback in dwmac-sun8i

Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: dwxgmac2: Add MAC loopback support
Jose Abreu [Fri, 24 May 2019 08:20:13 +0000 (10:20 +0200)]
net: stmmac: dwxgmac2: Add MAC loopback support

In preparation for the addition of stmmac selftests we implement the MAC
loopback callback in dwxgmac2 core.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: dwmac4/5: Add MAC loopback support
Jose Abreu [Fri, 24 May 2019 08:20:12 +0000 (10:20 +0200)]
net: stmmac: dwmac4/5: Add MAC loopback support

In preparation for the addition of stmmac selftests we implement the MAC
loopback callback in dwmac4/5 cores.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: dwmac1000: Add MAC loopback support
Jose Abreu [Fri, 24 May 2019 08:20:11 +0000 (10:20 +0200)]
net: stmmac: dwmac1000: Add MAC loopback support

In preparation for the addition of stmmac selftests we implement the MAC
loopback callback in dwmac1000 core.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: dwmac100: Add MAC loopback support
Jose Abreu [Fri, 24 May 2019 08:20:10 +0000 (10:20 +0200)]
net: stmmac: dwmac100: Add MAC loopback support

In preparation for the addition of stmmac selftests we implement the MAC
loopback callback in dwmac100 core.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: stmmac: Add MAC loopback callback to HWIF
Jose Abreu [Fri, 24 May 2019 08:20:09 +0000 (10:20 +0200)]
net: stmmac: Add MAC loopback callback to HWIF

In preparation for the addition of selftests support for stmmac we add a
new callback to HWIF that can be used to set the controller in loopback
mode.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Cc: Alexandre Torgue <alexandre.torgue@st.com>
Cc: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-phy-add-interface-mode-PHY_INTERFACE_MODE_USXGMII'
David S. Miller [Fri, 24 May 2019 20:39:34 +0000 (13:39 -0700)]
Merge branch 'net-phy-add-interface-mode-PHY_INTERFACE_MODE_USXGMII'

Heiner Kallweit says:

====================
net: phy: add interface mode PHY_INTERFACE_MODE_USXGMII

Add support for interface mode USXGMII.

On Freescale boards LS1043A and LS1046A a warning may pop up now
because mode xgmii should be changed to usxgmii (as the used
Aquantia PHY doesn't support XGMII).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: aquantia: add USXGMII support and warn if XGMII mode is set
Heiner Kallweit [Thu, 23 May 2019 18:09:08 +0000 (20:09 +0200)]
net: phy: aquantia: add USXGMII support and warn if XGMII mode is set

So far we didn't support mode USXGMII, and in order to not break few
boards mode XGMII was accepted for the AQR107 family even though it
doesn't support XGMII. Add USXGMII support to the Aquantia PHY driver
and warn if XGMII mode is set.

v2:
- add warning if XGMII mode is set

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodt-bindings: net: document new usxgmii phy mode
Heiner Kallweit [Thu, 23 May 2019 18:07:56 +0000 (20:07 +0200)]
dt-bindings: net: document new usxgmii phy mode

Add new interface mode USXGMII to binding documentation.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: add interface mode PHY_INTERFACE_MODE_USXGMII
Heiner Kallweit [Thu, 23 May 2019 18:06:49 +0000 (20:06 +0200)]
net: phy: add interface mode PHY_INTERFACE_MODE_USXGMII

Add support for interface mode PHY_INTERFACE_MODE_USXGMII.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests/net: SO_TXTIME with ETF and FQ
Willem de Bruijn [Thu, 23 May 2019 17:48:46 +0000 (13:48 -0400)]
selftests/net: SO_TXTIME with ETF and FQ

The SO_TXTIME API enables packet tranmission with delayed delivery.
This is currently supported by the ETF and FQ packet schedulers.

Evaluate the interface with both schedulers. Install the scheduler
and send a variety of packets streams: without delay, with one
delayed packet, with multiple ordered delays and with reordering.
Verify that packets are released by the scheduler in expected order.

The ETF qdisc requires a timestamp in the future on every packet. It
needs a delay on the qdisc else the packet is dropped on dequeue for
having a delivery time in the past. The test value is experimentally
derived. ETF requires clock_id CLOCK_TAI. It checks this base and
drops for non-conformance.

The FQ qdisc expects clock_id CLOCK_MONOTONIC, the base used by TCP
as of commit fff99772a27e ("tcp/fq: move back to CLOCK_MONOTONIC").
Within a flow there is an expecation of ordered delivery, as shown by
delivery times of test 4. The FQ qdisc does not require all packets to
have timestamps and does not drop for non-conformance.

The large (msec) delays are chosen to avoid flakiness.

Output:

SO_TXTIME ipv6 clock monotonic
payload:a delay:28 expected:0 (us)

SO_TXTIME ipv4 clock monotonic
payload:a delay:38 expected:0 (us)

SO_TXTIME ipv6 clock monotonic
payload:a delay:40 expected:0 (us)

SO_TXTIME ipv4 clock monotonic
payload:a delay:33 expected:0 (us)

SO_TXTIME ipv6 clock monotonic
payload:a delay:10120 expected:10000 (us)

SO_TXTIME ipv4 clock monotonic
payload:a delay:10102 expected:10000 (us)

[.. etc ..]

OK. All tests passed

Changes v1->v2: update commit message output

Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'ipv6-Move-exceptions-to-fib6_nh-and-make-it-optional-in-a-fib6_info'
David S. Miller [Fri, 24 May 2019 20:26:44 +0000 (13:26 -0700)]
Merge branch 'ipv6-Move-exceptions-to-fib6_nh-and-make-it-optional-in-a-fib6_info'

David Ahern says:

====================
ipv6: Move exceptions to fib6_nh and make it optional in a fib6_info

Patches 1 and 4 move pcpu and exception caches from fib6_info to fib6_nh.
With respect to the current FIB entries this is only a movement from one
struct to another contained within the first.

Patch 2 refactors the core logic of fib6_drop_pcpu_from into a helper
that is invoked per fib6_nh.

Patch 3 refactors exception handling in a similar way - creating a bunch
of helpers that can be invoked per fib6_nh with the goal of making patch
4 easier to review as well as creating the code needed for nexthop
objects.

Patch 5 makes a fib6_nh at the end of a fib6_info an array similar to
IPv4 and its fib_info. For the current fib entry model, all fib6_info
will have a fib6_nh allocated for it.

Patch 6 refactors ip6_route_del moving the code for deleting an
exception entry into a new function.

Patch 7 adds tests for redirect route exceptions. The new test was
written against 5.1 (before any of the nexthop refactoring). It and the
pmtu.sh selftest exercise the exception code paths - from creating
exceptions to cleaning them up on device delete. All tests pass without
any rcu locking or memleak warnings.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: Add redirect tests
David Ahern [Thu, 23 May 2019 03:28:01 +0000 (20:28 -0700)]
selftests: Add redirect tests

Add test for ICMP redirects and exception processing. Test is setup
for later addition of tests using nexthop objects for routing.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Refactor ip6_route_del for cached routes
David Ahern [Thu, 23 May 2019 03:28:00 +0000 (20:28 -0700)]
ipv6: Refactor ip6_route_del for cached routes

Move the removal of cached routes to a helper, ip6_del_cached_rt, that
can be invoked per nexthop. Rename the existig ip6_del_cached_rt to
__ip6_del_cached_rt since it is called by ip6_del_cached_rt.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Make fib6_nh optional at the end of fib6_info
David Ahern [Thu, 23 May 2019 03:27:59 +0000 (20:27 -0700)]
ipv6: Make fib6_nh optional at the end of fib6_info

Move fib6_nh to the end of fib6_info and make it an array of
size 0. Pass a flag to fib6_info_alloc indicating if the
allocation needs to add space for a fib6_nh.

The current code path always has a fib6_nh allocated with a
fib6_info; with nexthop objects they will be separate.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Move exception bucket to fib6_nh
David Ahern [Thu, 23 May 2019 03:27:58 +0000 (20:27 -0700)]
ipv6: Move exception bucket to fib6_nh

Similar to the pcpu routes exceptions are really per nexthop, so move
rt6i_exception_bucket from fib6_info to fib6_nh.

To avoid additional increases to the size of fib6_nh for a 1-bit flag,
use the lowest bit in the allocated memory pointer for the flushed flag.
Add helpers for retrieving the bucket pointer to mask off the flag.

The cleanup of the exception bucket is moved to fib6_nh_release.

fib6_nh_flush_exceptions can now be called from 2 contexts:
1. deleting a fib entry
2. deleting a fib6_nh

For 1., fib6_nh_flush_exceptions is called for a specific fib6_info that
is getting deleted. All exceptions in the cache using the entry are
deleted. For 2, the fib6_nh itself is getting destroyed so
fib6_nh_flush_exceptions is called for a NULL fib6_info which means
flush all entries.

The pmtu.sh selftest exercises the affected code paths - from creating
exceptions to cleaning them up on device delete. All tests pass without
any rcu locking or memleak warnings.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Refactor exception functions
David Ahern [Thu, 23 May 2019 03:27:57 +0000 (20:27 -0700)]
ipv6: Refactor exception functions

Before moving exception bucket from fib6_info to fib6_nh, refactor
rt6_flush_exceptions, rt6_remove_exception_rt, rt6_mtu_change_route,
and rt6_update_exception_stamp_rt. In all 3 cases, move the primary
logic into a new helper that starts with fib6_nh_. The latter 3
functions still take a fib6_info; this will be changed to fib6_nh
in the next patch.

In the case of rt6_mtu_change_route, move the fib6_metric_locked
out as a standalone check - no need to call the new function if
the fib entry has the mtu locked. Also, add fib6_info to
rt6_mtu_change_arg as a way of passing the fib entry to the new
helper.

No functional change intended. The goal here is to make the next
patch easier to review by moving existing lookup logic for each to
new helpers.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Refactor fib6_drop_pcpu_from
David Ahern [Thu, 23 May 2019 03:27:56 +0000 (20:27 -0700)]
ipv6: Refactor fib6_drop_pcpu_from

Move the existing pcpu walk in fib6_drop_pcpu_from to a new
helper, __fib6_drop_pcpu_from, that can be invoked per fib6_nh with a
reference to the from entries that need to be evicted. If the passed
in 'from' is non-NULL then only entries associated with that fib6_info
are removed (e.g., case where fib entry is deleted); if the 'from' is
NULL are entries are flushed (e.g., fib6_nh is deleted).

For fib6_info entries with builtin fib6_nh (ie., current code) there
is no change in behavior.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Move pcpu cached routes to fib6_nh
David Ahern [Thu, 23 May 2019 03:27:55 +0000 (20:27 -0700)]
ipv6: Move pcpu cached routes to fib6_nh

rt6_info are specific instances of a fib entry and are tied to a
device and gateway - ie., a nexthop. Before nexthop objects, IPv6 fib
entries have separate fib6_info for each nexthop in a multipath route,
so the location of the pcpu cache in the fib6_info struct worked.
However, with nexthop objects a fib6_info can point to a set of nexthops
(yet another alignment of ipv6 with ipv4). Accordingly, the pcpu
cache needs to be moved to the fib6_nh struct so the cached entries
are local to the nexthop specification used to create the rt6_info.

Initialization and free of the pcpu entries moved to fib6_nh_init and
fib6_nh_release.

Change in location only, from fib6_info down to fib6_nh; no other
functional change intended.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'ENETC-support-hardware-timestamping'
David S. Miller [Fri, 24 May 2019 20:16:33 +0000 (13:16 -0700)]
Merge branch 'ENETC-support-hardware-timestamping'

Y.b. Lu says:

====================
ENETC: support hardware timestamping

This patch-set is to support hardware timestamping for ENETC
and also to add ENETC 1588 timer device tree node for ls1028a.

Because the ENETC RX BD ring dynamic allocation has not been
supported and it is too expensive to use extended RX BDs
if timestamping is not used, a Kconfig option is used to
enable extended RX BDs in order to support hardware
timestamping. This option will be removed once RX BD
ring dynamic allocation is implemented.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoarm64: dts: fsl: ls1028a: add ENETC 1588 timer node
Y.b. Lu [Thu, 23 May 2019 02:33:41 +0000 (02:33 +0000)]
arm64: dts: fsl: ls1028a: add ENETC 1588 timer node

Add ENETC 1588 timer node which is ENETC PF 4 (Physiscal Function 4).

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodt-binding: ptp_qoriq: support ENETC PTP compatible
Y.b. Lu [Thu, 23 May 2019 02:33:37 +0000 (02:33 +0000)]
dt-binding: ptp_qoriq: support ENETC PTP compatible

Add a new compatible for ENETC PTP.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoenetc: add get_ts_info interface for ethtool
Y.b. Lu [Thu, 23 May 2019 02:33:33 +0000 (02:33 +0000)]
enetc: add get_ts_info interface for ethtool

This patch is to add get_ts_info interface for ethtool
to support getting timestamping capability.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoenetc: add hardware timestamping support
Y.b. Lu [Thu, 23 May 2019 02:33:29 +0000 (02:33 +0000)]
enetc: add hardware timestamping support

This patch is to add hardware timestamping support
for ENETC. On Rx, timestamping is enabled for all
frames. On Tx, we only instruct the hardware to
timestamp the frames marked accordingly by the stack.

Because the RX BD ring dynamic allocation has not been
supported and it is too expensive to use extended RX BDs
if timestamping is not used, a Kconfig option is used to
enable extended RX BDs in order to support hardware
timestamping. This option will be removed once RX BD
ring dynamic allocation is implemented.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ll_temac: Fix compile error
Esben Haabendal [Fri, 24 May 2019 05:24:42 +0000 (07:24 +0200)]
net: ll_temac: Fix compile error

Fixes: 8396f064d1c0 ("net: ll_temac: Cleanup multicast filter on change")
Signed-off-by: Esben Haabendal <esben@geanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Fri, 24 May 2019 00:52:43 +0000 (17:52 -0700)]
Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2019-05-23

This series contains updates to ice driver only.

Anirudh cleans up white space issues and other code formatting issues in the
driver.  Also implemented LLDP persistence across reboots and start/stop of the
LLDP agent.  Updated print statements for driver capabilities to include
if it is a device or function capability.

Bruce cleaned up variable declarations by removing unneeded assignment.

Dave fixes a potential hang due to a couple of flows that recursively
acquire the RTNL lock which results in a deadlock.

Tony updates the driver to advertise what link modes we are capable of
when the user does not request a specific link mode.

Usha fixes up the LLDP MIB change event handling by cleaning up
workarounds and print the DCB configuration changes detected.

Brett fixes the driver to handle failures in the VF reset path, which
was failing to free resources upon an error.

Richard fixed the reported of stats via ethtool to align with our other
Intel drivers.

Jesse optimizes the transmit buffer and ring structures to have more
efficient ordering to get hot cache lines to have packed data.  Also
optimized the VF structure to use less memory, since it is used hundreds
of times throughout the driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoice: Silence semantic parser warnings
Bruce Allan [Tue, 16 Apr 2019 17:24:38 +0000 (10:24 -0700)]
ice: Silence semantic parser warnings

Recent versions of sparse warn about casting pointers to/from restricted
endian types in the Linux driver.  Silence those with the compiler
attribute __force macro from the Linux kernel to force casts to/from
restricted endian types.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Fix couple of issues in ice_vsi_release
Brett Creeley [Tue, 16 Apr 2019 17:24:37 +0000 (10:24 -0700)]
ice: Fix couple of issues in ice_vsi_release

Currently the driver is calling ice_napi_del() and then
unregister_netdev(). The call to unregister_netdev() will result in a
call to ice_stop() and then ice_vsi_close(). This is where we call
napi_disable() for all the MSI-X vectors. This flow is reversed so make
the changes to ensure napi_disable() happens prior to napi_del().

Before calling napi_del() and free_netdev() make sure
unregister_netdev() was called. This is done by making sure the
__ICE_DOWN bit is set in the vsi->state for the interested VSI.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Reorganize ice_vf struct
Jesse Brandeburg [Tue, 16 Apr 2019 17:24:36 +0000 (10:24 -0700)]
ice: Reorganize ice_vf struct

The ice_vf struct can be used hundreds of times in our
driver so it pays to use less memory per struct.

ice_vf prior to this commit:
  /* size: 112, cachelines: 2, members: 25 */
  /* sum members: 101, holes: 4, sum holes: 8 */
  /* bit holes: 2, sum bit holes: 11 bits */
  /* padding: 3 */
  /* last cacheline: 48 bytes */

ice_vf after this commit:
  /* size: 104, cachelines: 2, members: 25 */
  /* sum members: 100, holes: 3, sum holes: 4 */
  /* bit holes: 1, sum bit holes: 3 bits */
  /* last cacheline: 40 bytes */

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Use bitfields when possible
Jesse Brandeburg [Tue, 16 Apr 2019 17:24:35 +0000 (10:24 -0700)]
ice: Use bitfields when possible

We can use bit fields to store boolean values and when the
bit fields are next to each other, the compiler will combine them
(as long as the size holds enough).

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Reorganize tx_buf and ring structs
Jesse Brandeburg [Tue, 16 Apr 2019 17:24:34 +0000 (10:24 -0700)]
ice: Reorganize tx_buf and ring structs

Use more efficient structure ordering by using the pahole tool
and a lot of code inspection to get hot cache lines to have
packed data (no holes if possible) and adjacent warm data.

ice_ring prior to this change:
  /* size: 192, cachelines: 3, members: 23 */
  /* sum members: 158, holes: 4, sum holes: 12 */
  /* padding: 22 */

ice_ring after this change:
  /* size: 192, cachelines: 3, members: 25 */
  /* sum members: 162, holes: 1, sum holes: 1 */
  /* padding: 29 */

ice_tx_buf prior to this change:
  /* size: 48, cachelines: 1, members: 7 */
  /* sum members: 38, holes: 2, sum holes: 6 */
  /* padding: 4 */
  /* last cacheline: 48 bytes */

ice_tx_buf after this change:
  /* size: 40, cachelines: 1, members: 7 */
  /* sum members: 38, holes: 1, sum holes: 2 */
  /* last cacheline: 40 bytes */

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Format ethtool reported stats
Richard Rodriguez [Tue, 16 Apr 2019 17:24:33 +0000 (10:24 -0700)]
ice: Format ethtool reported stats

Fixes ethtool -S reported stats in ice driver to match
format and nomenclature of the ixgbe driver.

Signed-off-by: Richard Rodriguez <richard.rodriguez@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Gracefully handle reset failure in ice_alloc_vfs()
Brett Creeley [Tue, 16 Apr 2019 17:24:32 +0000 (10:24 -0700)]
ice: Gracefully handle reset failure in ice_alloc_vfs()

Currently if ice_reset_all_vfs() fails in ice_alloc_vfs() we fail to
free some resources, reset variables, and return an error value.
Fix this by adding another unroll case to free the pf->vf array, set
the pf->num_alloc_vfs to 0, and return an error code.

Without this, if ice_reset_all_vfs() fails in ice_alloc_vfs() we will
not be able to do SRIOV without hard rebooting the system because
rmmod'ing the driver does not work.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Refactor the LLDP MIB change event handling
Usha Ketineni [Tue, 16 Apr 2019 17:24:31 +0000 (10:24 -0700)]
ice: Refactor the LLDP MIB change event handling

This patch fixes the LLDP MIB change event handling code by removing
the workarounds in the current code. Added ice_dcb_need_recfg() to
print the DCB configuration changes detected via MIB change event.

Signed-off-by: Usha Ketineni <usha.k.ketineni@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Advertise supported link modes if none requested
Tony Nguyen [Tue, 16 Apr 2019 17:24:30 +0000 (10:24 -0700)]
ice: Advertise supported link modes if none requested

User requested link modes affect what is returned as an advertised
link mode.  If no modes have been requested, we are not advertising
any link modes.  Advertise what we are capable of supporting if no
link modes have been requested.

Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Fix hang when ethtool disables FW LLDP
Dave Ertman [Tue, 16 Apr 2019 17:24:29 +0000 (10:24 -0700)]
ice: Fix hang when ethtool disables FW LLDP

When disabling and enabling VSIs, there are a couple of flows
that recursively acquire the RTNL lock which causes a deadlock.
Fix that.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Call out dev/func caps when printing
Anirudh Venkataramanan [Tue, 16 Apr 2019 17:24:28 +0000 (10:24 -0700)]
ice: Call out dev/func caps when printing

ice_parse_caps is used to parse both device and function capabilities.
Currently, capabilities are printed with a cryptic "HW caps" prefix,
which makes it difficult to distinguish whether the capabilities being
printed are device or function capabilities.

This patch makes a change to add a "func cap" prefix when printing
function capabilities, and a "dev cap" prefix when printing device
capabilities.

This patch also changes some of the capability print strings for
consistency.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Remove braces for single statement blocks
Anirudh Venkataramanan [Tue, 16 Apr 2019 17:24:27 +0000 (10:24 -0700)]
ice: Remove braces for single statement blocks

Fix checkpatch warning "WARNING:BRACES: braces {} are not necessary
for single statement blocks"

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Cleanup an unnecessary variable initialization
Bruce Allan [Tue, 16 Apr 2019 17:24:26 +0000 (10:24 -0700)]
ice: Cleanup an unnecessary variable initialization

Commit 3463688e6ced ("ice: Add more validation in ice_vc_cfg_irq_map_msg")
added an assignment of vsi making the assignment during declaration
unnecessary.

Also, cleanup the declaration and assignment of irqmap_info to not use two
lines in the variable declaration section.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Implement LLDP persistence
Anirudh Venkataramanan [Tue, 16 Apr 2019 17:24:25 +0000 (10:24 -0700)]
ice: Implement LLDP persistence

Implement LLDP persistence across reboots, start and stop of LLDP agent.
Add additional parameter to ice_aq_start_lldp and ice_aq_stop_lldp.

Also change the ethtool private flag from "disable-fw-lldp" to
"enable-fw-lldp". This change will flip the boolean logic of the
functionality of the flag (on = enable, off = disable). The change
in name and functionality is to differentiate between the
pre-persistence and post-persistence states.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agoice: Fix double spacing
Anirudh Venkataramanan [Tue, 16 Apr 2019 17:21:28 +0000 (10:21 -0700)]
ice: Fix double spacing

Fix double spacing in ice_napi_disable_all

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years agonet: qualcomm: rmnet: Move common struct definitions to include
Subash Abhinov Kasiviswanathan [Wed, 22 May 2019 20:21:07 +0000 (14:21 -0600)]
net: qualcomm: rmnet: Move common struct definitions to include

Create if_rmnet.h and move the rmnet MAP packet structs to this
common include file. To account for portablity, add little and
big endian bitfield definitions similar to the ip & tcp headers.

The definitions in the headers can now be re-used by the
upcoming ipa driver series as well as qmi_wwan.

Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoRevert "dpaa2-eth: configure the cache stashing amount on a queue"
Ioana Radulescu [Thu, 23 May 2019 14:38:22 +0000 (17:38 +0300)]
Revert "dpaa2-eth: configure the cache stashing amount on a queue"

This reverts commit 143aea80b005a0241f1035aaaebba4e2f1492e64.

The reverted change instructed the QMan hardware block to fetch
RX frame annotation and beginning of frame data to cache before
the core would read them.

It turns out that in rare cases, it's possible that a QMan
stashing transaction is delayed long enough such that, by the time
it gets executed, the frame in question had already been dequeued
by the core and software processing began on it. If the core
manages to unmap the frame buffer _before_ the stashing transaction
is executed, an SMMU exception will be raised.

Unfortunately there is no easy way to work around this while keeping
the performance advantages brought by QMan stashing, so disable
it altogether.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb4: use firmware API for validating filter spec
Raju Rangoju [Thu, 23 May 2019 13:51:21 +0000 (19:21 +0530)]
cxgb4: use firmware API for validating filter spec

Adds support for validating hardware filter spec configured in firmware
before offloading exact match flows.

Use the new fw api FW_PARAM_DEV_FILTER_MODE_MASK to read the filter mode
and mask from firmware. If the api isn't supported, then fall-back to
older way of reading just the mode from indirect register.

Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-ll_temac-Fix-and-enable-multicast-support'
David S. Miller [Thu, 23 May 2019 16:33:57 +0000 (09:33 -0700)]
Merge branch 'net-ll_temac-Fix-and-enable-multicast-support'

Esben Haabendal says:

====================
net: ll_temac: Fix and enable multicast support

This patch series makes the necessary fixes to ll_temac driver to make
multicast work, and enables support for it.so that multicast support can

The main change is the change from mutex to spinlock of the lock used to
synchronize access to the shared indirect register access.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ll_temac: Enable multicast support
Esben Haabendal [Thu, 23 May 2019 12:02:22 +0000 (14:02 +0200)]
net: ll_temac: Enable multicast support

Multicast support have been tested and is working now.

Signed-off-by: Esben Haabendal <esben@geanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ll_temac: Cleanup multicast filter on change
Esben Haabendal [Thu, 23 May 2019 12:02:21 +0000 (14:02 +0200)]
net: ll_temac: Cleanup multicast filter on change

Avoid leaving old address table entries when using multicast.  If more than
one multicast address were removed, only the first removed address would
actually be cleared.

Signed-off-by: Esben Haabendal <esben@geanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ll_temac: Prepare indirect register access for multicast support
Esben Haabendal [Thu, 23 May 2019 12:02:20 +0000 (14:02 +0200)]
net: ll_temac: Prepare indirect register access for multicast support

With .ndo_set_rx_mode/temac_set_multicast_list() being called in atomic
context (holding addr_list_lock), and temac_set_multicast_list() needing
to access temac indirect registers, the mutex used to synchronize indirect
register is a no-no.

Replace it with a spinlock, and avoid sleeping in
temac_indirect_busywait().

To avoid excessive holding of the lock, which is now a spinlock, the
temac_device_reset() function is changed to only hold the lock for short
periods.  With timeouts, it could be holding the spinlock for more than
2 seconds.

Signed-off-by: Esben Haabendal <esben@geanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ll_temac: Do not make promiscuous mode sticky on multicast
Esben Haabendal [Thu, 23 May 2019 12:02:19 +0000 (14:02 +0200)]
net: ll_temac: Do not make promiscuous mode sticky on multicast

When user has requested IFF_ALLMULTI or have set more than 4 multicast
addresses, we should just use promiscuous mode, but not set it in flags,
as it causes the interface to stay in promiscuous mode even when the
non-IFF_PROMISC condition that caused promiscuous mode to be enabled
has gone away.

Signed-off-by: Esben Haabendal <esben@geanix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: lxt: Add suspend/resume support to LXT971 and LXT973.
Christophe Leroy [Thu, 23 May 2019 08:55:32 +0000 (08:55 +0000)]
net: phy: lxt: Add suspend/resume support to LXT971 and LXT973.

All LXT PHYs implement the standard "power down" bit 11 of
BMCR, so this patch adds support using the generic
genphy_{suspend,resume} functions added by
commit 6370fe8e2204 ("phy: power management support").

LXT970 is left aside because all registers get cleared upon
"power down" exit.

Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodevlink: add warning in case driver does not set port type
Jiri Pirko [Thu, 23 May 2019 08:43:35 +0000 (10:43 +0200)]
devlink: add warning in case driver does not set port type

Prevent misbehavior of drivers who would not set port type for longer
period of time. Drivers should always set port type. Do WARN if that
happens.

Note that it is perfectly fine to temporarily not have the type set,
during initialization and port type change.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agohv_sock: perf: loop in send() to maximize bandwidth
Sunil Muthuswamy [Wed, 22 May 2019 23:10:44 +0000 (23:10 +0000)]
hv_sock: perf: loop in send() to maximize bandwidth

Currently, the hv_sock send() iterates once over the buffer, puts data into
the VMBUS channel and returns. It doesn't maximize on the case when there
is a simultaneous reader draining data from the channel. In such a case,
the send() can maximize the bandwidth (and consequently minimize the cpu
cycles) by iterating until the channel is found to be full.

Perf data:
Total Data Transfer: 10GB/iteration
Single threaded reader/writer, Linux hvsocket writer with Windows hvsocket
reader
Packet size: 64KB
CPU sys time was captured using the 'time' command for the writer to send
10GB of data.
'Send Buffer Loop' is with the patch applied.
The values below are over 10 iterations.

|--------------------------------------------------------|
|        |        Current        |   Send Buffer Loop    |
|--------------------------------------------------------|
|        | Throughput | CPU sys  | Throughput | CPU sys  |
|        | (MB/s)     | time (s) | (MB/s)     | time (s) |
|--------------------------------------------------------|
| Min    |     407    |   7.048  |    401     |  5.958   |
|--------------------------------------------------------|
| Max    |     455    |   7.563  |    542     |  6.993   |
|--------------------------------------------------------|
| Avg    |     440    |   7.411  |    451     |  6.639   |
|--------------------------------------------------------|
| Median |     446    |   7.417  |    447     |  6.761   |
|--------------------------------------------------------|

Observation:
1. The avg throughput doesn't really change much with this change for this
scenario. This is most probably because the bottleneck on throughput is
somewhere else.
2. The average system (or kernel) cpu time goes down by 10%+ with this
change, for the same amount of data transfer.

Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agohv_sock: perf: Allow the socket buffer size options to influence the actual socket...
Sunil Muthuswamy [Wed, 22 May 2019 22:56:07 +0000 (22:56 +0000)]
hv_sock: perf: Allow the socket buffer size options to influence the actual socket buffers

Currently, the hv_sock buffer size is static and can't scale to the
bandwidth requirements of the application. This change allows the
applications to influence the socket buffer sizes using the SO_SNDBUF and
the SO_RCVBUF socket options.

Few interesting points to note:
1. Since the VMBUS does not allow a resize operation of the ring size, the
socket buffer size option should be set prior to establishing the
connection for it to take effect.
2. Setting the socket option comes with the cost of that much memory being
reserved/allocated by the kernel, for the lifetime of the connection.

Perf data:
Total Data Transfer: 1GB
Single threaded reader/writer
Results below are summarized over 10 iterations.

Linux hvsocket writer + Windows hvsocket reader:
|---------------------------------------------------------------------------------------------|
|Packet size ->   |      128B       |       1KB       |       4KB       |        64KB         |
|---------------------------------------------------------------------------------------------|
|SO_SNDBUF size | |                 Throughput in MB/s (min/max/avg/median):                  |
|               v |                                                                           |
|---------------------------------------------------------------------------------------------|
|      Default    | 109/118/114/116 | 636/774/701/700 | 435/507/480/476 |   410/491/462/470   |
|      16KB       | 110/116/112/111 | 575/705/662/671 | 749/900/854/869 |   592/824/692/676   |
|      32KB       | 108/120/115/115 | 703/823/767/772 | 718/878/850/866 | 1593/2124/2000/2085 |
|      64KB       | 108/119/114/114 | 592/732/683/688 | 805/934/903/911 | 1784/1943/1862/1843 |
|---------------------------------------------------------------------------------------------|

Windows hvsocket writer + Linux hvsocket reader:
|---------------------------------------------------------------------------------------------|
|Packet size ->   |     128B    |      1KB        |          4KB        |        64KB         |
|---------------------------------------------------------------------------------------------|
|SO_RCVBUF size | |               Throughput in MB/s (min/max/avg/median):                    |
|               v |                                                                           |
|---------------------------------------------------------------------------------------------|
|      Default    | 69/82/75/73 | 313/343/333/336 |   418/477/446/445   |   659/701/676/678   |
|      16KB       | 69/83/76/77 | 350/401/375/382 |   506/548/517/516   |   602/624/615/615   |
|      32KB       | 62/83/73/73 | 471/529/496/494 |   830/1046/935/939  | 944/1180/1070/1100  |
|      64KB       | 64/70/68/69 | 467/533/501/497 | 1260/1590/1430/1431 | 1605/1819/1670/1660 |
|---------------------------------------------------------------------------------------------|

Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4/igmp: shrink struct ip_sf_list
Eric Dumazet [Wed, 22 May 2019 22:00:25 +0000 (15:00 -0700)]
ipv4/igmp: shrink struct ip_sf_list

Removing two 4 bytes holes allows to use kmalloc-32
kmem cache instead of kmalloc-64 on 64bit kernels.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoneighbor: Add tracepoint to __neigh_create
David Ahern [Wed, 22 May 2019 19:22:21 +0000 (12:22 -0700)]
neighbor: Add tracepoint to __neigh_create

Add tracepoint to __neigh_create to enable debugging of new entries.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: pmtu: Simplify cleanup and namespace names
David Ahern [Wed, 22 May 2019 19:11:06 +0000 (12:11 -0700)]
selftests: pmtu: Simplify cleanup and namespace names

The point of the pause-on-fail argument is to leave the setup as is after
a test fails to allow a user to debug why it failed. Move the cleanup
after posting the result to the user to make it so.

Random names for the namespaces are not user friendly when trying to
debug a failure. Make them simpler and more direct for the tests. Run
cleanup at the beginning to ensure they are cleaned up if they already
exist.

Remove cleanup_done. There is no harm in doing cleanup twice; just
ignore any errors related to not existing - which is already done.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: fib-onlink: Make quiet by default
David Ahern [Wed, 22 May 2019 19:09:16 +0000 (12:09 -0700)]
selftests: fib-onlink: Make quiet by default

Add VERBOSE argument to fib-onlink-tests.sh and make output quiet by
default. Add getopt parsing of inputs and support for -v (verbose) and
-p (pause on fail).

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: Set strict_start_type for routes and rules
David Ahern [Wed, 22 May 2019 19:07:43 +0000 (12:07 -0700)]
net: Set strict_start_type for routes and rules

New userspace on an older kernel can send unknown and unsupported
attributes resulting in an incompelete config which is almost
always wrong for routing (few exceptions are passthrough settings
like the protocol that installed the route).

Set strict_start_type in the policies for IPv4 and IPv6 routes and
rules to detect new, unsupported attributes and fail the route add.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-Export-functions-for-nexthop-code'
David S. Miller [Thu, 23 May 2019 00:48:44 +0000 (17:48 -0700)]
Merge branch 'net-Export-functions-for-nexthop-code'

David Ahern says:

====================
net: Export functions for nexthop code

This set exports ipv4 and ipv6 fib functions for use by the nexthop
code. It also adds new ones to send route notifications if a nexthop
configuration changes.

v2
- repost of patches dropped at the end of the last dev window
  added patch 8 which exports nh_update_mtu since it is inline with
  the other patches
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4: Rename and export nh_update_mtu
David Ahern [Wed, 22 May 2019 19:04:46 +0000 (12:04 -0700)]
ipv4: Rename and export nh_update_mtu

Rename nh_update_mtu to fib_nhc_update_mtu and export for use by the
nexthop code.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4: export fib_info_update_nh_saddr
David Ahern [Wed, 22 May 2019 19:04:45 +0000 (12:04 -0700)]
ipv4: export fib_info_update_nh_saddr

Add scope as input argument versus relying on fib_info reference in
fib_nh, and export fib_info_update_nh_saddr.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4: export fib_flush
David Ahern [Wed, 22 May 2019 19:04:44 +0000 (12:04 -0700)]
ipv4: export fib_flush

As nexthops are deleted, fib entries referencing it are marked dead.
Export fib_flush so those entries can be removed in a timely manner.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4: export fib_check_nh
David Ahern [Wed, 22 May 2019 19:04:43 +0000 (12:04 -0700)]
ipv4: export fib_check_nh

Change fib_check_nh to take net, table and scope as input arguments
over struct fib_config and export for use by nexthop code.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv4: Add function to send route updates
David Ahern [Wed, 22 May 2019 19:04:42 +0000 (12:04 -0700)]
ipv4: Add function to send route updates

Add fib_info_notify_update to walk the fib and send RTM_NEWROUTE
notifications with NLM_F_REPLACE set for entries linked to a fib_info
that have nh_updated flag set. This helper will be used by the nexthop
code to notify userspace of routes that are impacted when a nexthop
config is updated via replace. The new function and its helper are
similar to how fib_flush and fib_table_flush work for address delete
and link down events.

This notification is needed for legacy apps that do not understand
the new nexthop object. Apps that are nexthop aware can use the
RTA_NH_ID attribute in the route notification to just ignore it.

In the future this should be wrapped in a sysctl to allow OS'es that
are fully updated to avoid the notificaton storm.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: export function to send route updates
David Ahern [Wed, 22 May 2019 19:04:41 +0000 (12:04 -0700)]
ipv6: export function to send route updates

Add fib6_rt_update to send RTM_NEWROUTE with NLM_F_REPLACE set. This
helper will be used by the nexthop code to notify userspace of routes
that are impacted when a nexthop config is updated via replace.

This notification is needed for legacy apps that do not understand
the new nexthop object. Apps that are nexthop aware can use the
RTA_NH_ID attribute in the route notification to just ignore it.

In the future this should be wrapped in a sysctl to allow OS'es that
are fully updated to avoid the notificaton storm.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Add hook to bump sernum for a route to stubs
David Ahern [Wed, 22 May 2019 19:04:40 +0000 (12:04 -0700)]
ipv6: Add hook to bump sernum for a route to stubs

Add hook to ipv6 stub to bump the sernum up to the root node for a
route. This is needed by the nexthop code when a nexthop config changes.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Add delete route hook to stubs
David Ahern [Wed, 22 May 2019 19:04:39 +0000 (12:04 -0700)]
ipv6: Add delete route hook to stubs

Add ip6_del_rt to the IPv6 stub. The hook is needed by the nexthop
code to remove entries linked to a nexthop that is getting deleted.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'net-phy-T1-support'
David S. Miller [Thu, 23 May 2019 00:46:28 +0000 (17:46 -0700)]
Merge branch 'net-phy-T1-support'

Andrew Lunn says:

====================
net: phy: T1 support

T1 PHYs make use of a single twisted pair, rather than the traditional
2 pair for 100BaseT or 4 pair for 1000BaseT. This patchset adds link
modes for 100BaseT1 and 1000BaseT1, and them makes use of 100BaseT1 in
the list of PHY features used by current T1 drivers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: Make phy_basic_t1_features use base100t1.
Andrew Lunn [Wed, 22 May 2019 18:47:04 +0000 (20:47 +0200)]
net: phy: Make phy_basic_t1_features use base100t1.

Now that there is a link mode for 100BaseT1, use it in
phy_basic_t1_features so T1 PHY drivers will indicate this mode via
the Ethtool API.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: Add support for 100BaseT1 and 1000BaseT1
Andrew Lunn [Wed, 22 May 2019 18:47:03 +0000 (20:47 +0200)]
net: phy: Add support for 100BaseT1 and 1000BaseT1

Add link modes for 100Mbps and 1Gbps over a single pair.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: dp83867: Allocate state struct in probe
Trent Piepho [Wed, 22 May 2019 18:43:27 +0000 (18:43 +0000)]
net: phy: dp83867: Allocate state struct in probe

This was being done in config the first time the phy was configured.
Should be in the probe method.

Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: dp83867: Validate FIFO depth property
Trent Piepho [Wed, 22 May 2019 18:43:26 +0000 (18:43 +0000)]
net: phy: dp83867: Validate FIFO depth property

Insure property is in valid range and fail when reading DT if it is not.
Also add error message for existing failure if required property is not
present.

Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: dp83867: IO impedance is not dependent on RGMII delay
Trent Piepho [Wed, 22 May 2019 18:43:25 +0000 (18:43 +0000)]
net: phy: dp83867: IO impedance is not dependent on RGMII delay

The driver would only set the IO impedance value when RGMII internal
delays were enabled.  There is no reason for this.  Move the IO
impedance block out of the RGMII delay block.

Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: dp83867: Use unsigned variables to store unsigned properties
Trent Piepho [Wed, 22 May 2019 18:43:24 +0000 (18:43 +0000)]
net: phy: dp83867: Use unsigned variables to store unsigned properties

The variables used to store u32 DT properties were signed ints.  This
doesn't work properly if the value of the property were to overflow.
Use unsigned variables so this doesn't happen.

Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: dp83867: Rework delay rgmii delay handling
Trent Piepho [Wed, 22 May 2019 18:43:23 +0000 (18:43 +0000)]
net: phy: dp83867: Rework delay rgmii delay handling

The code was assuming the reset default of the delay control register
was to have delay disabled.  This is what the datasheet shows as the
register's initial value.  However, that's not actually true: the
default is controlled by the PHY's pin strapping.

If the interface mode is selected as RX or TX delay only, insure the
other direction's delay is disabled.

If the interface mode is just "rgmii", with neither TX or RX internal
delay, one might expect that the driver should disable both delays.  But
this is not what the driver does.  It leaves the setting at the PHY's
strapping's default.  And that default, for no pins with strapping
resistors, is to have delay enabled and 2.00 ns.

Rather than change this behavior, I've kept it the same and documented
it.  No delay will most likely not work and will break ethernet on any
board using "rgmii" mode.  If the board is strapped to have a delay and
is configured to use "rgmii" mode a warning is generated that "rgmii-id"
should have been used.

Also validate the delay values and fail if they are not in range.

Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: dp83867: Add ability to disable output clock
Trent Piepho [Wed, 22 May 2019 18:43:22 +0000 (18:43 +0000)]
net: phy: dp83867: Add ability to disable output clock

Generally, the output clock pin is only used for testing and only serves
as a source of RF noise after this.  It could be used to daisy-chain
PHYs, but this is uncommon.  Since the PHY can disable the output, make
doing so an option.  I do this by adding another enumeration to the
allowed values of ti,clk-output-sel.

The code was not using the value DP83867_CLK_O_SEL_REF_CLK as one might
expect: to select the REF_CLK as the output.  Rather it meant "keep
clock output setting as is", which, depending on PHY strapping, might
not be outputting REF_CLK.

Change this so DP83867_CLK_O_SEL_REF_CLK means enable REF_CLK output.
Omitting the property will leave the setting as is (which was the
previous behavior in this case).

Out of range values were silently converted into
DP83867_CLK_O_SEL_REF_CLK.  Change this so they generate an error.

Cc: Andrew Lunn <andrew@lunn.ch>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodt-bindings: phy: dp83867: Add documentation for disabling clock output
Trent Piepho [Wed, 22 May 2019 18:43:21 +0000 (18:43 +0000)]
dt-bindings: phy: dp83867: Add documentation for disabling clock output

The clock output is generally only used for testing and development and
not used to daisy-chain PHYs.  It's just a source of RF noise afterward.

Add a mux value for "off".  I've added it as another enumeration to the
output property.  In the actual PHY, the mux and the output enable are
independently controllable.  However, it doesn't seem useful to be able
to describe the mux setting when the output is disabled.

Document that PHY's default setting will be left as is if the property
is omitted.

Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodt-bindings: phy: dp83867: Describe how driver behaves w.r.t rgmii delay
Trent Piepho [Wed, 22 May 2019 18:43:19 +0000 (18:43 +0000)]
dt-bindings: phy: dp83867: Describe how driver behaves w.r.t rgmii delay

Add a note to make it more clear how the driver behaves when "rgmii" vs
"rgmii-id", "rgmii-idrx", or "rgmii-idtx" interface modes are selected.

Cc: Rob Herring <robh+dt@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb4: Enable hash filter with offload
Vishal Kulkarni [Wed, 22 May 2019 16:16:12 +0000 (21:46 +0530)]
cxgb4: Enable hash filter with offload

Hash (exact-match) filters used for offloading flows share the
same active region resources on the chip with upper layer drivers,
like iw_cxgb4, chcr, etc. Currently, only either Hash filters
or ULDs can use the active region resources, but not both. Hence,
use the new firmware configuration parameters (when available)
to allow both the Hash filters and ULDs to share the
active region simultaneously.

Signed-off-by: Vishal Kulkarni <vishal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: fec: remove redundant ipg clock disable
Baruch Siach [Wed, 22 May 2019 13:56:15 +0000 (16:56 +0300)]
net: fec: remove redundant ipg clock disable

Don't disable the ipg clock in the regulator error path. The clock is
disable unconditionally two lines below the failed_regulator label.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: Add UNIX_DIAG_UID to Netlink UNIX socket diagnostics.
Felipe Gasper [Tue, 21 May 2019 00:43:51 +0000 (19:43 -0500)]
net: Add UNIX_DIAG_UID to Netlink UNIX socket diagnostics.

This adds the ability for Netlink to report a socket's UID along with the
other UNIX diagnostic information that is already available. This will
allow diagnostic tools greater insight into which users control which
socket.

To test this, do the following as a non-root user:

    unshare -U -r bash
    nc -l -U user.socket.$$ &

.. and verify from within that same session that Netlink UNIX socket
diagnostics report the socket's UID as 0. Also verify that Netlink UNIX
socket diagnostics report the socket's UID as the user's UID from an
unprivileged process in a different session. Verify the same from
a root process.

Signed-off-by: Felipe Gasper <felipe@felipegasper.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Wed, 22 May 2019 15:36:16 +0000 (08:36 -0700)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:

 - Fix SPE probe failure when backing auxbuf with high-order pages

 - Fix handling of DMA allocations from outside of the vmalloc area

 - Fix generation of build-id ELF section for vDSO object

 - Disable huge I/O mappings if kernel page table dumping is enabled

 - A few other minor fixes (comments, kconfig etc)

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: vdso: Explicitly add build-id option
  arm64/mm: Inhibit huge-vmap with ptdump
  arm64: Print physical address of page table base in show_pte()
  arm64: don't trash config with compat symbol if COMPAT is disabled
  arm64: assembler: Update comment above cond_yield_neon() macro
  drivers/perf: arm_spe: Don't error on high-order pages for aux buf
  arm64/iommu: handle non-remapped addresses in ->mmap and ->get_sgtable

5 years agoMerge tag 'gfs2-5.1.fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2...
Linus Torvalds [Wed, 22 May 2019 15:31:09 +0000 (08:31 -0700)]
Merge tag 'gfs2-5.1.fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 fix from Andreas Gruenbacher:
 "Fix a gfs2 sign extension bug introduced in v4.3"

* tag 'gfs2-5.1.fixes2' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: Fix sign extension bug in gfs2_update_stats

5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Wed, 22 May 2019 15:28:16 +0000 (08:28 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Clear up some recent tipc regressions because of registration
    ordering. Fix from Junwei Hu.

 2) tipc's TLV_SET() can read past the end of the supplied buffer during
    the copy. From Chris Packham.

 3) ptp example program doesn't match the kernel, from Richard Cochran.

 4) Outgoing message type fix in qrtr, from Bjorn Andersson.

 5) Flow control regression in stmmac, from Tan Tee Min.

 6) Fix inband autonegotiation in phylink, from Russell King.

 7) Fix sk_bound_dev_if handling in rawv6_bind(), from Mike Manning.

 8) Fix usbnet crash after disconnect, from Kloetzke Jan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (21 commits)
  usbnet: fix kernel crash after disconnect
  selftests: fib_rule_tests: use pre-defined DEV_ADDR
  net-next: net: Fix typos in ip-sysctl.txt
  ipv6: Consider sk_bound_dev_if when binding a raw socket to an address
  net: phylink: ensure inband AN works correctly
  usbnet: ipheth: fix racing condition
  net: stmmac: dma channel control register need to be init first
  net: stmmac: fix ethtool flow control not able to get/set
  net: qrtr: Fix message type of outgoing packets
  networking: : fix typos in code comments
  ptp: Fix example program to match kernel.
  fddi: fix typos in code comments
  selftests: fib_rule_tests: enable forwarding before ipv4 from/iif test
  selftests: fib_rule_tests: fix local IPv4 address typo
  tipc: Avoid copying bytes beyond the supplied data
  2/2] net: xilinx_emaclite: use readx_poll_timeout() in mdio wait function
  1/2] net: axienet: use readx_poll_timeout() in mdio wait function
  vlan: Mark expected switch fall-through
  macvlan: Mark expected switch fall-through
  net/mlx4_en: ethtool, Remove unsupported SFP EEPROM high pages query
  ...

5 years agoMerge tag 'for-5.2/dm-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device...
Linus Torvalds [Wed, 22 May 2019 15:10:35 +0000 (08:10 -0700)]
Merge tag 'for-5.2/dm-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fix from Mike Snitzer:
 "Fix a particularly glaring oversight in a DM core commit from 5.1 that
  doesn't properly trim special IOs (e.g. discards) relative to
  corresponding target's max_io_len_target_boundary()"

* tag 'for-5.2/dm-fix-1' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm: make sure to obey max_io_len_target_boundary

5 years agogfs2: Fix sign extension bug in gfs2_update_stats
Andreas Gruenbacher [Fri, 17 May 2019 18:18:43 +0000 (19:18 +0100)]
gfs2: Fix sign extension bug in gfs2_update_stats

Commit 27908daf3d7e changed the types of the statistic values in struct
gfs2_lkstats from s64 to u64.  Because of that, what should be a signed
value in gfs2_update_stats turned into an unsigned value.  When shifted
right, we end up with a large positive value instead of a small negative
value, which results in an incorrect variance estimate.

Fixes: 27908daf3d7e ("gfs2: Make statistics unsigned, suitable for use with do_div()")
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Cc: stable@vger.kernel.org # v4.4+
5 years agodm: make sure to obey max_io_len_target_boundary
Michael Lass [Tue, 21 May 2019 19:58:07 +0000 (21:58 +0200)]
dm: make sure to obey max_io_len_target_boundary

Commit a0b8e6c11bbb ("dm: eliminate 'split_discard_bios' flag from DM
target interface") incorrectly removed code from
__send_changing_extent_only() that is required to impose a per-target IO
boundary on IO that exceeds max_io_len_target_boundary().  Otherwise
"special" IO (e.g. DISCARD, WRITE SAME, WRITE ZEROES) can write beyond
where allowed.

Fix this by restoring the max_io_len_target_boundary() limit in
__send_changing_extent_only()

Fixes: a0b8e6c11bbb ("dm: eliminate 'split_discard_bios' flag from DM target interface")
Cc: stable@vger.kernel.org # 5.1+
Signed-off-by: Michael Lass <bevan@bi-co.net>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
5 years agousbnet: fix kernel crash after disconnect
Kloetzke Jan [Tue, 21 May 2019 13:18:40 +0000 (13:18 +0000)]
usbnet: fix kernel crash after disconnect

When disconnecting cdc_ncm the kernel sporadically crashes shortly
after the disconnect:

  [   57.868812] Unable to handle kernel NULL pointer dereference at virtual address 00000000
  ...
  [   58.006653] PC is at 0x0
  [   58.009202] LR is at call_timer_fn+0xec/0x1b4
  [   58.013567] pc : [<0000000000000000>] lr : [<ffffff80080f5130>] pstate: 00000145
  [   58.020976] sp : ffffff8008003da0
  [   58.024295] x29: ffffff8008003da0 x28: 0000000000000001
  [   58.029618] x27: 000000000000000a x26: 0000000000000100
  [   58.034941] x25: 0000000000000000 x24: ffffff8008003e68
  [   58.040263] x23: 0000000000000000 x22: 0000000000000000
  [   58.045587] x21: 0000000000000000 x20: ffffffc68fac1808
  [   58.050910] x19: 0000000000000100 x18: 0000000000000000
  [   58.056232] x17: 0000007f885aff8c x16: 0000007f883a9f10
  [   58.061556] x15: 0000000000000001 x14: 000000000000006e
  [   58.066878] x13: 0000000000000000 x12: 00000000000000ba
  [   58.072201] x11: ffffffc69ff1db30 x10: 0000000000000020
  [   58.077524] x9 : 8000100008001000 x8 : 0000000000000001
  [   58.082847] x7 : 0000000000000800 x6 : ffffff8008003e70
  [   58.088169] x5 : ffffffc69ff17a28 x4 : 00000000ffff138b
  [   58.093492] x3 : 0000000000000000 x2 : 0000000000000000
  [   58.098814] x1 : 0000000000000000 x0 : 0000000000000000
  ...
  [   58.205800] [<          (null)>]           (null)
  [   58.210521] [<ffffff80080f5298>] expire_timers+0xa0/0x14c
  [   58.215937] [<ffffff80080f542c>] run_timer_softirq+0xe8/0x128
  [   58.221702] [<ffffff8008081120>] __do_softirq+0x298/0x348
  [   58.227118] [<ffffff80080a6304>] irq_exit+0x74/0xbc
  [   58.232009] [<ffffff80080e17dc>] __handle_domain_irq+0x78/0xac
  [   58.237857] [<ffffff8008080cf4>] gic_handle_irq+0x80/0xac
  ...

The crash happens roughly 125..130ms after the disconnect. This
correlates with the 'delay' timer that is started on certain USB tx/rx
errors in the URB completion handler.

The problem is a race of usbnet_stop() with usbnet_start_xmit(). In
usbnet_stop() we call usbnet_terminate_urbs() to cancel all URBs in
flight. This only makes sense if no new URBs are submitted
concurrently, though. But the usbnet_start_xmit() can run at the same
time on another CPU which almost unconditionally submits an URB. The
error callback of the new URB will then schedule the timer after it was
already stopped.

The fix adds a check if the tx queue is stopped after the tx list lock
has been taken. This should reliably prevent the submission of new URBs
while usbnet_terminate_urbs() does its job. The same thing is done on
the rx side even though it might be safe due to other flags that are
checked there.

Signed-off-by: Jan Klötzke <Jan.Kloetzke@preh.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: fib_rule_tests: use pre-defined DEV_ADDR
Hangbin Liu [Tue, 21 May 2019 06:40:47 +0000 (14:40 +0800)]
selftests: fib_rule_tests: use pre-defined DEV_ADDR

DEV_ADDR is defined but not used. Use it in address setting.
Do the same with IPv6 for consistency.

Reported-by: David Ahern <dsahern@gmail.com>
Fixes: 5ee583f884fe ("selftests: fib_rule_tests: fix local IPv4 address typo")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet-next: net: Fix typos in ip-sysctl.txt
Masanari Iida [Tue, 21 May 2019 03:41:15 +0000 (12:41 +0900)]
net-next: net: Fix typos in ip-sysctl.txt

This patch fixes some spelling typos found in ip-sysctl.txt

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: Consider sk_bound_dev_if when binding a raw socket to an address
Mike Manning [Mon, 20 May 2019 18:57:17 +0000 (19:57 +0100)]
ipv6: Consider sk_bound_dev_if when binding a raw socket to an address

IPv6 does not consider if the socket is bound to a device when binding
to an address. The result is that a socket can be bound to eth0 and
then bound to the address of eth1. If the device is a VRF, the result
is that a socket can only be bound to an address in the default VRF.

Resolve by considering the device if sk_bound_dev_if is set.

Signed-off-by: Mike Manning <mmanning@vyatta.att-mail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Tested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phylink: ensure inband AN works correctly
Russell King [Mon, 20 May 2019 15:07:20 +0000 (16:07 +0100)]
net: phylink: ensure inband AN works correctly

Do not update the link interface mode while the link is down to avoid
spurious link interface changes.

Always call mac_config if we have a PHY to propagate the pause mode
settings to the MAC.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agousbnet: ipheth: fix racing condition
Bernd Eckstein [Mon, 20 May 2019 15:31:09 +0000 (17:31 +0200)]
usbnet: ipheth: fix racing condition

Fix a racing condition in ipheth.c that can lead to slow performance.

Bug: In ipheth_tx(), netif_wake_queue() may be called on the callback
ipheth_sndbulk_callback(), _before_ netif_stop_queue() is called.
When this happens, the queue is stopped longer than it needs to be,
thus reducing network performance.

Fix: Move netif_stop_queue() in front of usb_submit_urb(). Now the order
is always correct. In case, usb_submit_urb() fails, the queue is woken up
again as callback will not fire.

Testing: This racing condition is usually not noticeable, as it has to
occur very frequently to slowdown the network. The callback from the USB
is usually triggered slow enough, so the situation does not appear.
However, on a Ubuntu Linux on VMWare Workstation, running on Windows 10,
the we loose the race quite often and the following speedup can be noticed:

Without this patch: Download:  4.10 Mbit/s, Upload:  4.01 Mbit/s
With this patch:    Download: 36.23 Mbit/s, Upload: 17.61 Mbit/s

Signed-off-by: Oliver Zweigle <Oliver.Zweigle@faro.com>
Signed-off-by: Bernd Eckstein <3ernd.Eckstein@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>