net: phy: don't abuse devres in devm_mdiobus_register()
We currently have two managed helpers for mdiobus - devm_mdiobus_alloc()
and devm_mdiobus_register(). The idea behind devres is that the release
callback releases whatever resource the devm function allocates. In the
mdiobus case however there's no devres associated with the device by
devm_mdiobus_register(). Instead the release callback for
devm_mdiobus_alloc(): _devm_mdiobus_free() unregisters the device if
it is marked as managed.
This all seems wrong. The managed structure shouldn't need to know or
care about whether it's managed or not - and this is the case now for
struct mii_bus. The devres wrapper should be opaque to the managed
resource.
This changeset makes devm_mdiobus_alloc() and devm_mdiobus_register()
conform to common devres standards: devm_mdiobus_alloc() allocates a
devres structure and registers a callback that will call mdiobus_free().
__devm_mdiobus_register() allocated another devres and registers a
callback that will unregister the bus.
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
phy: mdio: add kerneldoc for __devm_mdiobus_register()
This function is not documented. Add a short kerneldoc description.
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Functions should only be static inline if they're very short. This
devres helper is already over 10 lines and it will grow soon as we'll
be improving upon its approach. Pull it into mdio_devres.c.
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: devres: rename the release callback of devm_register_netdev()
Make it an explicit counterpart to devm_register_netdev() just like we
do with devm_free_netdev() for better clarity.
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The idea behind devres is that the release callbacks are called if
probe fails. As we now check the return value of ixgbe_mii_bus_init(),
we can drop the call devm_mdiobus_free() in error path as the release
callback will be called automatically.
Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Find ports accessible by the VF, based on the index of the
mac address stored for the VF in the adapter. If no mac address
is stored for the VF, use the port mask provided by firmware.
Signed-off-by: Nirranjan Kirubaharan <nirranjan@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
QLogic QED drivers source code is dual licensed under
GPL-2.0/BSD-3-Clause.
Correct already existing but wrong SPDX tags to match the actual
license.
Remove the license boilerplates and replace them with the correct
SPDX tag.
Update copyright years in all source files.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Set the actual copyright holder and years in all qede source files.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
QLogic QED drivers source code is dual licensed under
GPL-2.0/BSD-3-Clause.
Remove all the boilerplates in the existing code and replace it with the
correct SPDX tag.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
QLogic QED drivers source code is dual licensed under
GPL-2.0/BSD-3-Clause.
Correct already existing but wrong SPDX tags to match the actual
license.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Set the actual copyright holder and years in all qed source files.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
QLogic QED drivers source code is dual licensed under
GPL-2.0/BSD-3-Clause.
Remove all the boilerplates in the existing code and replace it with the
correct SPDX tag.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
QLogic QED drivers source code is dual licensed under
GPL-2.0/BSD-3-Clause.
Correct already existing but wrong SPDX tags to match the actual
license.
Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yousuk Seung [Tue, 30 Jun 2020 16:49:33 +0000 (09:49 -0700)]
tcp: call tcp_ack_tstamp() when not fully acked
When skb is coalesced tcp_ack_tstamp() still needs to be called when not
fully acked in tcp_clean_rtx_queue(), otherwise SCM_TSTAMP_ACK
timestamps may never be fired. Since the original patch series had
dependent commits, this patch fixes the issue instead of reverting by
restoring calls to tcp_ack_tstamp() when skb is not fully acked.
Fixes: c6ce83aadfc9 ("tcp: stamp SCM_TSTAMP_ACK later in tcp_clean_rtx_queue()") Signed-off-by: Yousuk Seung <ysseung@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 30 Jun 2020 15:16:46 +0000 (16:16 +0100)]
net/mlx5e: fix memory leak of tls
The error return path when create_singlethread_workqueue fails currently
does not kfree tls and leads to a memory leak. Fix this by kfree'ing
tls before returning -ENOMEM.
Addresses-Coverity: ("Resource leak") Fixes: 063cdfb23e45 ("net/mlx5e: kTLS, Add kTLS RX HW offload support") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Tue, 30 Jun 2020 14:38:26 +0000 (16:38 +0200)]
mptcp: do nonce initialization at subflow creation time
This clean-up the code a bit, reduces the number of
used hooks and indirect call requested, and allow
better error reporting from __mptcp_subflow_connect()
Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Tue, 30 Jun 2020 14:27:46 +0000 (15:27 +0100)]
net/tls: fix sign extension issue when left shifting u16 value
Left shifting the u16 value promotes it to a int and then it
gets sign extended to a u64. If len << 16 is greater than 0x7fffffff
then the upper bits get set to 1 because of the implicit sign extension.
Fix this by casting len to u64 before shifting it.
Addresses-Coverity: ("integer handling issues") Fixes: 9e2c929d854f ("net/tls: Add asynchronous resync") Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Tue, 30 Jun 2020 12:12:49 +0000 (13:12 +0100)]
sfc: remove duplicate declaration of efx_enqueue_skb_tso()
Define it in nic_common.h, even though the ef100 driver will have a
different implementation backing it (actually a WARN_ON_ONCE as it
should never get called by ef100. But it needs to still exist because
common TX path code references it).
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Tue, 30 Jun 2020 12:02:24 +0000 (13:02 +0100)]
sfc: move NIC-specific mcdi_port declarations out of common header
These functions are implemented in mcdi_port.c, which will not be linked
into the EF100 driver; thus their prototypes should not be visible in
common header files.
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
Convert Broadcom SF2 to mac_link_up() resolved state
Convert Broadcom SF2 DSA support to use the newly provided resolved
link state via mac_link_up() rather than using the state in
mac_config().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 30 Jun 2020 10:28:18 +0000 (11:28 +0100)]
net: dsa/bcm_sf2: move pause mode setting into mac_link_up()
bcm_sf2 only appears to support pause modes on RGMII interfaces (the
enable bits are in the RGMII control register.) Setup the pause modes
for RGMII connections.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 30 Jun 2020 10:28:13 +0000 (11:28 +0100)]
net: dsa/bcm_sf2: move speed/duplex forcing to mac_link_up()
Convert the bcm_sf2 to use the finalised speed and duplex in its
mac_link_up() call rather than the parameters in mac_config().
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 30 Jun 2020 10:28:08 +0000 (11:28 +0100)]
net: dsa/bcm_sf2: fix incorrect usage of state->link
state->link has never been valid in mac_config() implementations -
while it may be correct in some calls, it is not true that it can be
relied upon.
Fix bcm_sf2 to use the correct method of handling forced link status.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
Convert Broadcom B53 to mac_link_up() resolved state
These two patches update the Broadcom B53 DSA support to use the newly
provided resolved link state via mac_link_up() rather than using the
state in mac_config().
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Russell King [Tue, 30 Jun 2020 10:25:06 +0000 (11:25 +0100)]
net: dsa/b53: use resolved link config in mac_link_up()
Convert the B53 driver to use the finalised link parameters in
mac_link_up() rather than the parameters in mac_config(). This is
just a matter of moving the call to b53_force_port_config().
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the b53_force_port_config() pause argument, which is based on
phylink's MLO_PAUSE_* definitions, to use a pair of booleans. This
will allow us to move b53_force_port_config() from
b53_phylink_mac_config() to b53_phylink_mac_link_up().
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Tue, 30 Jun 2020 04:43:13 +0000 (21:43 -0700)]
net: dsa: Improve subordinate PHY error message
It is not very informative to know the DSA master device when a
subordinate network device fails to get its PHY setup. Provide the
device name and capitalize PHY while we are it.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 30 Jun 2020 19:34:35 +0000 (12:34 -0700)]
Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:
====================
1GbE Intel Wired LAN Driver Updates 2020-06-29
This series contains updates to only the igc driver.
Sasha added Energy Efficient Ethernet (EEE) support and Latency Tolerance
Reporting (LTR) support for the igc driver. Added Low Power Idle (LPI)
counters and cleaned up unused TCP segmentation counters. Removed
igc_power_down_link() and call igc_power_down_phy_copper_base()
directly. Removed unneeded copper media check.
Andre cleaned up timestamping by removing un-supported features and
duplicate code for i225. Fixed the timestamp check on the proper flag
instead of the skb for pending transmit timestamps. Refactored
igc_ptp_set_timestamp_mode() to simply the flow.
v2: Removed the log message in patch 1 as suggested by David Miller.
Note: The locking issue Jakub Kicinski saw in patch 5, currently
exists in the current net-next tree, so Andre will resolve the
locking issue in a follow-on patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Sasha Neftin [Mon, 8 Jun 2020 15:49:39 +0000 (18:49 +0300)]
igc: Refactor the igc_power_down_link()
Currently the implementation of igc_power_down_link()
method was just calling igc_power_down_phy_copper_base()
method.
We can just call igc_power_down_phy_copper_base()
method directly.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Sasha Neftin [Thu, 4 Jun 2020 11:25:16 +0000 (14:25 +0300)]
igc: Add LPI counters
Add EEE TX LPI and EEE RX LPI counters. A EEE TX LPI event
occurs when the transmitter enters EEE (IEEE 802.3az) LPI
state. A EEE RX LPI event occurs when the receiver detect
link partner entry into EEE(IEEE 802.3az) LPI state.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Andre Guedes [Thu, 4 Jun 2020 00:01:05 +0000 (17:01 -0700)]
igc: Fix Rx timestamp disabling
When Rx timestamping is enabled, we set the timestamp bit in SRRCTL
register for each queue, but we don't clear it when disabling. This
patch fixes igc_ptp_disable_rx_timestamp() accordingly.
Also, this patch gets rid of igc_ptp_enable_tstamp_rxqueue() and
igc_ptp_enable_tstamp_all_rxqueues() and move their logic into
igc_ptp_enable_rx_timestamp() to keep the enable and disable
helpers symmetric.
Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Andre Guedes [Thu, 4 Jun 2020 00:01:04 +0000 (17:01 -0700)]
igc: Refactor igc_ptp_set_timestamp_mode()
Current igc_ptp_set_timestamp_mode() logic is a bit tangled since it
handles many different hardware configurations in one single place,
making it harder to follow. This patch untangles that code by breaking
it into helper functions.
Quick note about the hw->mac.type check which was removed in this
refactoring: this check it not really needed since igc_i225 is the only
type supported by the IGC driver.
Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Andre Guedes [Thu, 4 Jun 2020 00:01:03 +0000 (17:01 -0700)]
igc: Remove UDP filter setup in PTP code
As implemented in igc_ethtool_get_ts_info(), igc only supports HWTSTAMP_
FILTER_ALL so any HWTSTAMP_FILTER_* option the user may set falls back to
HWTSTAMP_FILTER_ALL.
HWTSTAMP_FILTER_ALL is implemented via Rx Time Sync Control (TSYNCRXCTL)
configuration which timestamps all incoming packets. Configuring a
UDP filter, in addition to TSYNCRXCTL, doesn't add much so this patch
removes that code. It also takes this opportunity to remove some
non-applicable comments.
Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Andre Guedes [Thu, 4 Jun 2020 00:01:02 +0000 (17:01 -0700)]
igc: Check __IGC_PTP_TX_IN_PROGRESS instead of ptp_tx_skb
The __IGC_PTP_TX_IN_PROGRESS flag indicates we have a pending Tx
timestamp. In some places, instead of checking that flag, we check
adapter->ptp_tx_skb. This patch fixes those places to use the flag.
Quick note about igc_ptp_tx_hwtstamp() change: when that function is
called, adapter->ptp_tx_skb is expected to be valid always so we
WARN_ON_ONCE() in case it is not.
Quick note about igc_ptp_suspend() change: when suspending, we don't
really need to check if there is a pending timestamp. We can simply
clear it unconditionally.
Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Andre Guedes [Thu, 4 Jun 2020 00:01:01 +0000 (17:01 -0700)]
igc: Remove duplicate code in Tx timestamp handling
The functions igc_ptp_tx_hang() and igc_ptp_tx_work() have duplicate
code which handles Tx timestamp timeouts. This patch does a trivial
refactoring by moving that code to its own function and reusing it.
Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Andre Guedes [Thu, 4 Jun 2020 00:01:00 +0000 (17:01 -0700)]
igc: Clean up Rx timestamping logic
Differently from I210, I225 doesn't report Rx timestamps via the TS bit
Rx descriptor + RXSTMPL/RXSTMPH registers mechanism. Rx timestamps are
reported in the packet buffer only, which is implemented by igc_ptp_rx_
pktstamp(). So this patch removes igc_ptp_rx_rgtstamp() and all code
related to it, copied from igb driver.
Signed-off-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Sasha Neftin [Tue, 2 Jun 2020 07:50:47 +0000 (10:50 +0300)]
igc: Add initial LTR support
The LTR message on the PCIe inform the requested latency
on which the PCIe must become active to the downstream
PCIe port of the system.
This patch provide recommended LTR parameters by i225
specification.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
David S. Miller [Tue, 30 Jun 2020 00:45:02 +0000 (17:45 -0700)]
Merge branch 'Add-ethtool-extended-link-state'
Ido Schimmel says:
====================
Add ethtool extended link state
Amit says:
Currently, device drivers can only indicate to user space if the network
link is up or down, without additional information.
This patch set provides an infrastructure that allows these drivers to
expose more information to user space about the link state. The
information can save users' time when trying to understand why a link is
not operationally up, for example.
The above is achieved by extending the existing ethtool LINKSTATE_GET
command with attributes that carry the extended state.
For example, no link due to missing cable:
$ ethtool ethX
...
Link detected: no (No cable)
Beside the general extended state, drivers can pass additional
information about the link state using the sub-state field. For example:
$ ethtool ethX
...
Link detected: no (Autoneg, No partner detected)
In the future the infrastructure can be extended - for example - to
allow PHY drivers to report whether a downshift to a lower speed
occurred. Something like:
$ ethtool ethX
...
Link detected: yes (downshifted)
Patch set overview:
Patches #1-#3 move mlxsw ethtool code to a separate file
Patches #4-#5 add the ethtool infrastructure for extended link state
Patches #6-#7 add support of extended link state in the mlxsw driver
Patches #8-#10 add test cases
Changes since v1:
* In documentation, show ETHTOOL_LINK_EXT_STATE_* and
ETHTOOL_LINK_EXT_SUBSTATE_* constants instead of user-space strings
* Add `_CI_` to cable_issue substates to be consistent with
other substates
* Keep the commit messages within 75 columns
* Use u8 variable for __link_ext_substate
* Document the meaning of -ENODATA in get_link_ext_state() callback
description
* Do not zero data->link_ext_state_provided after getting an error
* Use `ret` variable for error value
Changes since RFC:
* Move documentation patch before ethtool patch
* Add nla_total_size() instead of sizeof() directly
* Return an error code from linkstate_get_ext_state()
* Remove SHORTED_CABLE, add CABLE_TEST_FAILURE instead
* Check if the interface is administratively up before setting ext_state
* Document all sub-states
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 29 Jun 2020 20:46:21 +0000 (23:46 +0300)]
selftests: forwarding: Add tests for ethtool extended state
Add tests to check ethtool report about extended state.
The tests configure several states and verify that the correct extended
state is reported by ethtool.
Check extended state with substate (Autoneg) and extended state without
substate (No cable).
Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 29 Jun 2020 20:46:20 +0000 (23:46 +0300)]
selftests: forwarding: forwarding.config.sample: Add port with no cable connected
Add NETIF_NO_CABLE port to tests topology.
The port can also be declared as an environment variable and tests can be
run like that:
NETIF_NO_CABLE=eth9 ./test.sh eth{1..8}
The NETIF_NO_CABLE port will be used by ethtool_extended_state test.
Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 29 Jun 2020 20:46:19 +0000 (23:46 +0300)]
selftests: forwarding: ethtool: Move different_speeds_get() to ethtool_lib
Currently different_speeds_get() is used only by ethtool.sh tests.
The function can be useful for another tests that check ethtool
configurations.
Move the function to ethtool_lib in order to allow other tests to use
it.
Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 29 Jun 2020 20:46:18 +0000 (23:46 +0300)]
mlxsw: spectrum_ethtool: Add link extended state
Implement .get_down_ext_state() as part of ethtool_ops.
Query link down reason from PDDR register and convert it to ethtool
link_ext_state.
In case that more information than common link_ext_state is provided,
fill link_ext_substate also with the appropriate value.
Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 29 Jun 2020 20:46:17 +0000 (23:46 +0300)]
mlxsw: reg: Port Diagnostics Database Register
The PDDR register enables to read the Phy debug database.
Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 29 Jun 2020 20:46:16 +0000 (23:46 +0300)]
ethtool: Add link extended state
Currently, drivers can only tell whether the link is up/down using
LINKSTATE_GET, but no additional information is given.
Add attributes to LINKSTATE_GET command in order to allow drivers
to expose the user more information in addition to link state to ease
the debug process, for example, reason for link down state.
Extended state consists of two attributes - link_ext_state and
link_ext_substate. The idea is to avoid 'vendor specific' states in order
to prevent drivers to use specific link_ext_state that can be in the future
common link_ext_state.
The substates allows drivers to add more information to the common
link_ext_state. For example, vendor can expose 'Autoneg' as link_ext_state
and add 'No partner detected during force mode' as link_ext_substate.
If a driver cannot pinpoint the extended state with the substate
accuracy, it is free to expose only the extended state and omit the
substate attribute.
Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 29 Jun 2020 20:46:15 +0000 (23:46 +0300)]
Documentation: networking: ethtool-netlink: Add link extended state
Add link extended state attributes.
Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Move mlxsw_sp1_port_type_speed_ops and mlxsw_sp2_port_type_speed_ops
with the relevant code from spectrum.c to spectrum_ethtool.c.
Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Amit Cohen [Mon, 29 Jun 2020 20:46:13 +0000 (23:46 +0300)]
mlxsw: Move ethtool_ops to spectrum_ethtool.c
Add spectrum_ethtool.c file for ethtool code.
Move ethtool_ops and the relevant code from spectrum.c to
spectrum_ethtool.c.
Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw_sp_port_headroom_set() is defined twice - in spectrum.c and in
spectrum_dcb.c, with different arguments and different implementation
but the name is same.
Rename mlxsw_sp_port_headroom_set() to mlxsw_sp_port_headroom_ets_set()
in order to allow using the second function in several files, and not
only as static function in spectrum.c.
Signed-off-by: Amit Cohen <amitc@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sasha Neftin [Wed, 27 May 2020 20:51:32 +0000 (13:51 -0700)]
igc: Add initial EEE support
IEEE802.3az-2010 Energy Efficient Ethernet has been
approved as standard (September 2010) and the driver
can enable and disable it via ethtool.
Disable the feature by default on parts which support it.
Add enable/disable eee options.
tx-lpi, tx-timer and advertise not supported yet.
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Reviewed-by: Andre Guedes <andre.guedes@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
====================
dpaa2-eth: send a scatter-gather FD instead of realloc-ing
This patch set changes the behaviour in case the Tx path is confroted
with an SKB with insufficient headroom for our hardware necessities (SW
annotation area). In the first patch, instead of realloc-ing the SKB we
now send a S/G frames descriptor while the second one adds a new
software held counter to account for for these types of frames.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Mon, 29 Jun 2020 18:47:12 +0000 (21:47 +0300)]
dpaa2-eth: add software counter for Tx frames converted to S/G
With the previous commit, in case of insufficient SKB headroom on the Tx
path instead of reallocing the SKB we now send a S/G frame descriptor.
Export the number of occurences of this case as a per CPU counter (in
debugfs) and a total number in the ethtool statistics - "tx converted sg
frames'.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei [Mon, 29 Jun 2020 18:47:11 +0000 (21:47 +0300)]
dpaa2-eth: send a scatter-gather FD instead of realloc-ing
Instead of realloc-ing the skb on the Tx path when the provided headroom
is smaller than the HW requirements, create a Scatter/Gather frame
descriptor with only one entry.
Remove the '[drv] tx realloc frames' counter exposed previously through
ethtool since it is no longer used.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
sfc: prerequisites for EF100 driver, part 1
This continues the work started by Alex Maftei <amaftei@solarflare.com>
in the series "sfc: code refactoring", "sfc: more code refactoring",
"sfc: even more code refactoring" and "sfc: refactor mcdi filtering
code", to prepare for a new driver which will share much of the code
to support the new EF100 family of Solarflare/Xilinx NICs.
After this series, there will be approximately two more of these
'prerequisites' series, followed by the sfc_ef100 driver itself.
v2: fix reverse xmas tree in patch 5. (Left the cases in patches 7,
9 and 14 alone as those are all in pure movement of existing code.)
====================
Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Mon, 29 Jun 2020 13:36:56 +0000 (14:36 +0100)]
sfc: extend common GRO interface to support CHECKSUM_COMPLETE
EF100 will use CHECKSUM_COMPLETE, but will also make use of
efx_rx_packet_gro(), thus needs to be able to pass the checksum value
into that function.
Drivers for older NICs pass in a csum of 0 to get the old semantics (use
the RX flags for CHECKSUM_UNNECESSARY marking).
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Mon, 29 Jun 2020 13:34:39 +0000 (14:34 +0100)]
sfc: split up nic.h
The new nic_common.h contains the inlines for NIC-type function dispatch,
declarations for NIC-generic functions in nic.c, and other similar NIC-
generic functionality. Retained in nic.h are NIC-specific declarations
such as the siena and ef10 nic_data structs and various farch functions.
The EF100 driver will thus include nic_common.h but not nic.h.
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Mon, 29 Jun 2020 13:32:46 +0000 (14:32 +0100)]
sfc: determine flag word automatically in efx_has_cap()
Now that we have an _OFST definition for each individual flag bit,
callers of efx_has_cap() don't need to specify which flag word it's
in; we can just use the flag name directly in MCDI_CAPABILITY_OFST.
Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Po Liu [Mon, 29 Jun 2020 06:54:16 +0000 (14:54 +0800)]
net:qos: police action offloading parameter 'burst' change to the original value
Since 'tcfp_burst' with TICK factor, driver side always need to recover
it to the original value, this patch moves the generic calculation and
recover to the 'burst' original value before offloading to device driver.
Signed-off-by: Po Liu <po.liu@nxp.com> Acked-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 30 Jun 2020 00:29:38 +0000 (17:29 -0700)]
Merge branch 'MPTCP-improve-fallback-to-TCP'
Davide Caratti says:
====================
MPTCP: improve fallback to TCP
there are situations where MPTCP sockets should fall-back to regular TCP:
this series reworks the fallback code to pursue the following goals:
1) cleanup the non fallback code, removing most of 'if (<fallback>)' in
the data path
2) improve performance for non-fallback sockets, avoiding locks in poll()
further work will also leverage on this changes to achieve:
a) more consistent behavior of gestockopt()/setsockopt() on passive sockets
after fallback
b) support for "infinite maps" as per RFC8684, section 3.7
the series is made of the following items:
- patch 1 lets sendmsg() / recvmsg() / poll() use the main socket also
after fallback
- patch 2 fixes 'simultaneous connect' scenario after fallback. The
problem was present also before the rework, but the fix is much easier
to implement after patch 1
- patch 3, 4, 5 are clean-ups for code that is no more needed after the
fallback rework
- patch 6 fixes a race condition between close() and poll(). The problem
was theoretically present before the rework, but it became almost
systematic after patch 1
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Mon, 29 Jun 2020 20:26:25 +0000 (22:26 +0200)]
mptcp: close poll() races
mptcp_poll always return POLLOUT for unblocking
connect(), ensure that the socket is a suitable
state.
The MPTCP_DATA_READY bit is never cleared on accept:
ensure we don't leave mptcp_accept() with an empty
accept queue and such bit set.
Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Mon, 29 Jun 2020 20:26:24 +0000 (22:26 +0200)]
mptcp: __mptcp_tcp_fallback() returns a struct sock
Currently __mptcp_tcp_fallback() always return NULL
on incoming connections, because MPTCP does not create
the additional socket for the first subflow.
Since the previous commit no __mptcp_tcp_fallback()
caller needs a struct socket, so let __mptcp_tcp_fallback()
return the first subflow sock and cope correctly even with
incoming connections.
Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Mon, 29 Jun 2020 20:26:23 +0000 (22:26 +0200)]
mptcp: create first subflow at msk creation time
This cleans the code a bit and makes the behavior more consistent.
Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Mon, 29 Jun 2020 20:26:22 +0000 (22:26 +0200)]
mptcp: check for plain TCP sock at accept time
This cleanup the code a bit and avoid corrupted states
on weird syscall sequence (accept(), connect()).
Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Mon, 29 Jun 2020 20:26:21 +0000 (22:26 +0200)]
mptcp: fallback in case of simultaneous connect
when a MPTCP client tries to connect to itself, tcp_finish_connect() is
never reached. Because of this, depending on the socket current state,
multiple faulty behaviours can be observed:
1) a WARN_ON() in subflow_data_ready() is hit
WARNING: CPU: 2 PID: 882 at net/mptcp/subflow.c:911 subflow_data_ready+0x18b/0x230
[...]
CPU: 2 PID: 882 Comm: gh35 Not tainted 5.7.0+ #187
[...]
RIP: 0010:subflow_data_ready+0x18b/0x230
[...]
Call Trace:
tcp_data_queue+0xd2f/0x4250
tcp_rcv_state_process+0xb1c/0x49d3
tcp_v4_do_rcv+0x2bc/0x790
__release_sock+0x153/0x2d0
release_sock+0x4f/0x170
mptcp_shutdown+0x167/0x4e0
__sys_shutdown+0xe6/0x180
__x64_sys_shutdown+0x50/0x70
do_syscall_64+0x9a/0x370
entry_SYSCALL_64_after_hwframe+0x44/0xa9
2) client is stuck forever in mptcp_sendmsg() because the socket is not
TCP_ESTABLISHED
3) tcpdump captures show that DSS is exchanged even when MP_CAPABLE handshake
didn't complete.
$ tcpdump -tnnr bad.pcap
IP 127.0.0.1.20000 > 127.0.0.1.20000: Flags [S], seq 3208913911, win 65483, options [mss 65495,sackOK,TS val 3291706876 ecr 3291694721,nop,wscale 7,mptcp capable v1], length 0
IP 127.0.0.1.20000 > 127.0.0.1.20000: Flags [S.], seq 3208913911, ack 3208913912, win 65483, options [mss 65495,sackOK,TS val 3291706876 ecr 3291706876,nop,wscale 7,mptcp capable v1], length 0
IP 127.0.0.1.20000 > 127.0.0.1.20000: Flags [.], ack 1, win 512, options [nop,nop,TS val 3291706876 ecr 3291706876], length 0
IP 127.0.0.1.20000 > 127.0.0.1.20000: Flags [F.], seq 1, ack 1, win 512, options [nop,nop,TS val 3291707876 ecr 3291706876,mptcp dss fin seq 0 subseq 0 len 1,nop,nop], length 0
IP 127.0.0.1.20000 > 127.0.0.1.20000: Flags [.], ack 2, win 512, options [nop,nop,TS val 3291707876 ecr 3291707876], length 0
force a fallback to TCP in these cases, and adjust the main socket
state to avoid hanging in mptcp_sendmsg().
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/35 Reported-by: Christoph Paasch <cpaasch@apple.com> Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Davide Caratti [Mon, 29 Jun 2020 20:26:20 +0000 (22:26 +0200)]
net: mptcp: improve fallback to TCP
Keep using MPTCP sockets and a use "dummy mapping" in case of fallback
to regular TCP. When fallback is triggered, skip addition of the MPTCP
option on send.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/11 Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/22 Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Baruch Siach [Sun, 28 Jun 2020 07:04:51 +0000 (10:04 +0300)]
net: phy: marvell10g: support XFI rate matching mode
When the hardware MACTYPE hardware configuration pins are set to "XFI
with Rate Matching" the PHY interface operate at fixed 10Gbps speed. The
MAC buffer packets in both directions to match various wire speeds.
Read the MAC Type field in the Port Control register, and set the MAC
interface speed accordingly.
Signed-off-by: Baruch Siach <baruch@tkos.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 30 Jun 2020 00:18:40 +0000 (17:18 -0700)]
Merge tag 'mlx5-tls-2020-06-26' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-tls-2020-06-26
1) Improve hardware layouts and structure for kTLS support
2) Generalize ICOSQ (Internal Channel Operations Send Queue)
Due to the asynchronous nature of adding new kTLS flows and handling
HW asynchronous kTLS resync requests, the XSK ICOSQ was extended to
support generic async operations, such as kTLS add flow and resync, in
addition to the existing XSK usages.
3) kTLS hardware flow steering and classification:
The driver already has the means to classify TCP ipv4/6 flows to send them
to the corresponding RSS HW engine, as reflected in patches 3 through 5,
the series will add a steering layer that will hook to the driver's TCP
classifiers and will match on well known kTLS connection, in case of a
match traffic will be redirected to the kTLS decryption engine, otherwise
traffic will continue flowing normally to the TCP RSS engine.
3) kTLS add flow RX HW offload support
New offload contexts post their static/progress params WQEs
(Work Queue Element) to communicate the newly added kTLS contexts
over the per-channel async ICOSQ.
The Channel/RQ is selected according to the socket's rxq index.
A new TLS-RX workqueue is used to allow asynchronous addition of
steering rules, out of the NAPI context.
It will be also used in a downstream patch in the resync procedure.
Feature is OFF by default. Can be turned on by:
$ ethtool -K <if> tls-hw-rx-offload on
4) Added mlx5 kTLS sw stats and new counters are documented in
Documentation/networking/tls-offload.rst
rx_tls_ctx - number of TLS RX HW offload contexts added to device for
decryption.
rx_tls_ooo - number of RX packets which were part of a TLS stream
but did not arrive in the expected order and triggered the resync
procedure.
rx_tls_del - number of TLS RX HW offload contexts deleted from device
(connection has finished).
rx_tls_err - number of RX packets which were part of a TLS stream
but were not decrypted due to unexpected error in the state machine.
5) Asynchronous RX resync
a. The NIC driver indicates that it would like to resync on some TLS
record within the received packet (P), but the driver does not
know (yet) which of the TLS records within the packet.
At this stage, the NIC driver will query the device to find the exact
TCP sequence for resync (tcpsn), however, the driver does not wait
for the device to provide the response.
b. Eventually, the device responds, and the driver provides the tcpsn
within the resync packet to KTLS. Now, KTLS can check the tcpsn against
any processed TLS records within packet P, and also against any record
that is processed in the future within packet P.
The asynchronous resync path simplifies the device driver, as it can
save bits on the packet completion (32-bit TCP sequence), and pass this
information on an asynchronous command instead.
Performance:
CPU: Intel(R) Xeon(R) CPU E5-2687W v4 @ 3.00GHz, 24 cores, HT off
NIC: ConnectX-6 Dx 100GbE dual port
David S. Miller [Tue, 30 Jun 2020 00:08:28 +0000 (17:08 -0700)]
Merge branch 'TC-Introduce-qevents'
Petr Machata says:
====================
TC: Introduce qevents
The Spectrum hardware allows execution of one of several actions as a
result of queue management decisions: tail-dropping, early-dropping,
marking a packet, or passing a configured latency threshold or buffer
size. Such packets can be mirrored, trapped, or sampled.
Modeling the action to be taken as simply a TC action is very attractive,
but it is not obvious where to put these actions. At least with ECN marking
one could imagine a tree of qdiscs and classifiers that effectively
accomplishes this task, albeit in an impractically complex manner. But
there is just no way to match on dropped-ness of a packet, let alone
dropped-ness due to a particular reason.
To allow configuring user-defined actions as a result of inner workings of
a qdisc, this patch set introduces a concept of qevents. Those are attach
points for TC blocks, where filters can be put that are executed as the
packet hits well-defined points in the qdisc algorithms. The attached
blocks can be shared, in a manner similar to clsact ingress and egress
blocks, arbitrary classifiers with arbitrary actions can be put on them,
etc.
For example:
red limit 500K avpkt 1K qevent early_drop block 10
matchall action mirred egress mirror dev eth1
The central patch #2 introduces several helpers to allow easy and uniform
addition of qevents to qdiscs: initialization, destruction, qevent block
number change validation, and qevent handling, i.e. dispatch of the filters
attached to the block bound to a qevent.
Patch #1 adds root_lock argument to qdisc enqueue op. The problem this is
tackling is that if a qevent filter pushes packets to the same qdisc tree
that holds the qevent in the first place, attempt to take qdisc root lock
for the second time will lead to a deadlock. To solve the issue, qevent
handler needs to unlock and relock the root lock around the filter
processing. Passing root_lock around makes it possible to get the lock
where it is needed, and visibly so, such that it is obvious the lock will
be used when invoking a qevent.
The following two patches, #3 and #4, then add two qevents to the RED
qdisc: "early_drop" qevent fires when a packet is early-dropped; "mark"
qevent, when it is ECN-marked.
Patch #5 contains a selftest. I have mentioned this test when pushing the
RED ECN nodrop mode and said that "I have no confidence in its portability
to [...] different configurations". That still holds. The backlog and
packet size are tuned to make the test deterministic. But it is better than
nothing, and on the boxes that I ran it on it does work and shows that
qevents work the way they are supposed to, and that their addition has not
broken the other tested features.
This patch set does not deal with offloading. The idea there is that a
driver will be able to figure out that a given block is used in qevent
context by looking at binder type. A future patch-set will add a qdisc
pointer to struct flow_block_offload, which a driver will be able to
consult to glean the TC or other relevant attributes.
Changes from RFC to v1:
- Move a "q = qdisc_priv(sch)" from patch #3 to patch #4
- Fix deadlock caused by mirroring packet back to the same qdisc tree.
- Rename "tail" qevent to "tail_drop".
- Adapt to the new 100-column standard.
- Add a selftest
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Petr Machata [Fri, 26 Jun 2020 22:45:29 +0000 (01:45 +0300)]
selftests: forwarding: Add a RED test for SW datapath
This test is inspired by the mlxsw RED selftest. It is much simpler to set
up (also because there is no point in testing PRIO / RED encapsulation). It
tests bare RED, ECN and ECN+nodrop modes of operation. On top of that it
tests RED early_drop and mark qevents.
Signed-off-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>