git.baikalelectronics.ru Git

net: dsa: mv88e6xxx: Allow PCS registers to be retrieved via ethtool

ethtool provides a generic mechanism for a driver to return the
registers of an ethernet device. DSA uses this to give the port
registers associated with an interfaces. Extend this to allow PCS
registers to also be returned, if the port has a PCS associated to it.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
100GbE Intel Wired LAN Driver Updates 2020-02-15

This series contains updates to ice driver only.

Brett adds support for "Queue in Queue" (QinQ) support, by supporting
S-tag & C-tag VLAN traffic by disabling pruning when there are no 0x8100
VLAN interfaces currently on top of the PF.  Also refactored the port
VLAN configuration to re-use the common code for enabling and disabling
a port VLAN in single function.  Added a helper function to determine if
the VF link is up.  Fixed how the port VLAN configures the priority bits
for a VF interface.  Fixed the port VLAN to only see its own broadcast
and multicast traffic.  Added support to enable and disable all receive
queues, by refactoring adding a new function to do the necessary steps
to enable/disable a queue with the necessary read flush.  Fixed how we
set the mapping mode for transmit and receive queues.  Added support for
VF queues to handle LAN overflow events.  Fixed and refactored how
receive queues get disabled for VFs, which was being handled one queue
at at time, so improve it to handle when the VF is requesting more than
one queue to be disabled.  Fixed how the virtchnl_queue_select bitmap is
validated.

Finally a patch not authored by Brett, Bruce cleans up "fallthrough"
comments which are unnecessary.  Also replaces the "fallthough" comments
with the GCC reserved word fallthrough, along with other GCC compiler
fixes.  Add missing function header comment regarding a function
argument that was missing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'sonic-next'

Finn Thain says:

====================
Improvements for SONIC ethernet drivers

Now that the necessary sonic driver fixes have been merged, and the merge
window has closed again, I'm sending the remainder of my sonic driver
patch queue.

A couple of these patches will have to be applied in sequence to avoid
'git am' rejects. The others are independent and could have been submitted
individually. Please let me know if I should do that.

The complete sonic driver patch queue was tested on National Semiconductor
hardware (macsonic), qemu-system-m68k (macsonic) and qemu-system-mips64el
(jazzsonic).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/macsonic: Remove interrupt handler wrapper

On m68k, local irqs remain enabled while interrupt handlers execute.
Therefore the macsonic driver has had to disable interrupts to avoid
re-entering sonic_interrupt().

As of commit 5fe8de7dfd40 ("net/sonic: Add mutual exclusion for accessing
shared state"), sonic_interrupt() became re-entrant, and its wrapper
became redundant.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sonic: Start packet transmission immediately

Give the transmit command as soon as the transmit descriptor is ready.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sonic: Remove explicit memory barriers

The explicit memory barriers are redundant now that proper locking and
MMIO accessors have been employed.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sonic: Remove redundant netif_start_queue() call

The transmit queue must be running already otherwise sonic_send_packet()
would not have been called. If the queue was stopped by the interrupt
handler, the interrupt handler will restart it again.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sonic: Remove redundant next_tx variable

The eol_tx variable is the one that matters to the tx algorithm because
packets are always placed at the end of the list. The next_tx variable
just confuses things so remove it.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sonic: Refactor duplicated code

No functional change.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sonic: Remove obsolete comment

The comment is meaningless since mark_bh() was removed a long time ago.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'sh_eth-get-rid-of-the-dedicated-regiseter-mapping-for-RZ-A1-R7S72100'

Sergei Shtylyov says:

====================
sh_eth: get rid of the dedicated regiseter mapping for RZ/A1 (R7S72100)

Here's a set of 5 patches against DaveM's 'net-next.git' repo.

I changed my mind about the RZ/A1 SoC needing its own register
map -- now that we don't depend on the register map array in order
to determine whether a given register exists any more, we can add
a new flag to determine if the GECMR exists (this register is
present only on true GEther chips, not RZ/A1). We also need to
add the sh_eth_cpu_data::* flag checks where they were missing
so far: in the ethtool API for the register dump.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: use Gigabit register map for R7S72100

The register maps for the Gigabit controllers and the Ether one used on
RZ/A1  (AKA R7S72100) are identical except for GECMR which is only present
on the true GEther controllers.  We no longer use the register map arrays
to determine if a given register exists,  and have added the GECMR flag to
the 'struct sh_eth_cpu_data' in the previous patch, so we're ready to drop
the R7S72100 specific register map -- this saves 216 bytes of object code
(ARM gcc 4.8.5).

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Tested-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: add sh_eth_cpu_data::gecmr flag

Not all Ether controllers having the Gigabit register layout have GECMR --
RZ/A1 (AKA R7S72100) actually has the same layout but no Gigabit speed
support and hence no GECMR. In the past, the new register map table was
added for this SoC, now I think we should have used the existing Gigabit
table with the differences (such as GECMR) covered by the mere flags in
the 'struct sh_eth_cpu_data'. Add such flag for GECMR -- and then we can
get rid of the R7S72100 specific layout in the next patch...

Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Tested-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: check sh_eth_cpu_data::no_xdfar when dumping registers

When adding the sh_eth_cpu_data::no_xdfar flag I forgot to add the flag
check to __sh_eth_get_regs(), causing the non-existing RDFAR/TDFAR to be
considered for dumping on the R-Car gen1/2 SoCs (the register offset check
has the final say here)...

Fixes: fb394fe3b99 ("sh_eth: add sh_eth_cpu_data::cexcr flag")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Tested-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: check sh_eth_cpu_data::cexcr when dumping registers

When adding the sh_eth_cpu_data::cexcr flag I forgot to add the flag
check to __sh_eth_get_regs(), causing the non-existing RX packet counter
registers to be considered for dumping on the R7S72100 SoC (the register
offset sanity check has the final say here)...

Fixes: fb394fe3b99 ("sh_eth: add sh_eth_cpu_data::cexcr flag")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Tested-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sh_eth: check sh_eth_cpu_data::no_tx_cntrs when dumping registers

When adding the sh_eth_cpu_data::no_tx_cntrs flag I forgot to add the
flag check to __sh_eth_get_regs(), causing the non-existing TX counter
registers to be considered for dumping on the R7S72100 SoC (the register
offset sanity check has the final say here)...

Fixes: c38d425a376d ("sh_eth: add sh_eth_cpu_data::no_tx_cntrs flag")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Tested-by: Chris Brandt <chris.brandt@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'Pause-updates-for-phylib-and-phylink'

Russell King says:

====================
Pause updates for phylib and phylink

Currently, phylib resolves the speed and duplex settings, which MAC
drivers use directly. phylib also extracts the "Pause" and "AsymPause"
bits from the link partner's advertisement, and stores them in struct
phy_device's pause and asym_pause members with no further processing.
It is left up to each MAC driver to implement decoding for this
information.

phylink converted drivers are able to take advantage of code therein
which resolves the pause advertisements for the MAC driver, but this
does nothing for unconverted drivers. It also does not allow us to
make use of hardware-resolved pause states offered by several PHYs.

This series aims to address this by:

1. Providing a generic implementation, linkmode_resolve_pause(), that
   takes the ethtool linkmode arrays for the link partner and local
   advertisements, decoding the result to whether pause frames should
   be allowed to be transmitted or received and acted upon.  I call
   this the pause enablement state.

2. Providing a phylib implementation, phy_get_pause(), which allows
   MAC drivers to request the pause enablement state from phylib.

3. Providing a generic linkmode_set_pause() for setting the pause
   advertisement according to the ethtool tx/rx flags - note that this
   design has some shortcomings, see the comments in the kerneldoc for
   this function.

4. Remove the ability in phylink to set the pause states for fixed
   links, which brings them into line with how we deal with the speed
   and duplex parameters; we can reintroduce this later if anyone
   requires it.  This could be a userspace-visible change.

5. Split application of manual pause enablement state from phylink's
   resolution of the same to allow use of phylib's new phy_get_pause()
   interface by phylink, and make that switch.

6. Resolve the fixed-link pause enablement state using the generic
   linkmode_resolve_pause() helper introduced earlier. This, in
   connection with the previous commits, allows us to kill the
   MLO_PAUSE_SYM and MLO_PAUSE_ASYM flags.

7. make phylink's ethtool pause setting implementation update the
   pause advertisement in the same way that phylib does, with the
   same caveats that are present there (as mentioned above w.r.t
   linkmode_set_pause()).

8. create a more accurate initial configuration for MACs, used when
   phy_start() is called or a SFP is detected. In particular, this
   ensures that the pause bits seen by MAC drivers in state->pause
   are accurate for SGMII.

9. finally, update the kerneldoc descriptions for mac_config() for
   the above changes.

This series has been build-tested against net-next; the boot tested
patches are in my "phy" branch against v5.5 plus the queued phylink
changes that were merged for 5.6.

The next series will introduce the ability for phylib drivers to
provide hardware resolved pause enablement state.  These patches can
be found in my "phy" branch.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: phylink: clarify flow control settings in documentation

Clarify the expected flow control settings operation in the phylink
documentation for each negotiation mode.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phylink: improve initial mac configuration

Improve the initial MAC configuration so we get a configuration which
more represents the final operating mode, in particular with respect
to the flow control settings.

We do this by:
1) more fully initialising our phy state, so we can use this as the
   initial state for PHY based connections.
2) reading the fixed link state.
3) ensuring that in-band mode has sane pause settings for SGMII vs
   802.3z negotiation modes.

In all three cases, we ensure that state->link is false, just in case
any MAC drivers have other ideas by mis-using this member, and we also
take account of manual pause mode configuration at this point.

This avoids MLO_PAUSE_AN being seen in mac_config() when operating in
PHY, fixed mode or inband SGMII mode, thereby giving cleaner semantics
to the pause flags.  As a result of this, the pause flags now indicate
in a mode-independent way what is required from a mac_config()
implementation.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phylink: allow ethtool -A to change flow control advertisement

When ethtool -A is used to change the pause modes, the pause
advertisement is not being changed, but the documentation in
uapi/linux/ethtool.h says we should be. Add that capability to
phylink.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phylink: resolve fixed link flow control

Resolve the fixed link flow control using the recently introduced
linkmode_resolve_pause() helper, which we use in
phylink_get_fixed_state() only when operating in full duplex mode.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phylink: use phylib resolved flow control modes

Use the new phy_get_pause() helper to get the resolved pause modes for
a PHY rather than resolving the pause modes ourselves. We temporarily
retain our pause mode resolution for causes where there is no PHY
attached, e.g. for fixed-link modes.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phylink: ensure manual flow control is selected appropriately

Split the application of manually controlled flow control modes from
phylink_resolve_flow(), so that we can use alternative providers of
flow control resolution.

We also want to clear the MLO_PAUSE_AN flag when autoneg is disabled,
since flow control can't be negotiated in this circumstance.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phylink: remove pause mode ethtool setting for fixed links

Remove the ability for ethtool -A to change the pause settings for
fixed links; if this is really required, we can reinstate it later.

Andrew Lunn agrees: "So I think it is safe to not implement ethtool
-A, at least until somebody has a real use case for it."

Lets avoid making things too complex for use cases that aren't being
used.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add linkmode helper for setting flow control advertisement

Add a linkmode helper to set the flow control advertisement in an
ethtool linkmode mask according to the tx/rx capabilities. This
implementation is moved from phylib, and documented with an
analysis of its shortcomings.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: add helpers to resolve negotiated flow control

Add a couple of helpers to resolve negotiated flow control. Two helpers
are provided:

- linkmode_resolve_pause() which takes the link partner and local
  advertisements, and decodes whether we should enable TX or RX pause
  at the MAC. This is useful outside of phylib, e.g. in phylink.
- phy_get_pause(), which returns the TX/RX enablement status for the
  current negotiation results of the PHY.

This allows us to centralise the flow control resolution, rather than
spreading it around.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: linkmode: make linkmode_test_bit() take const pointer

linkmode_test_bit() does not modify the address; test_bit() is also
declared const volatile for the same reason. There's no need for
linkmode_test_bit() to be any different, and allows implementation of
helpers that take a const linkmode pointer.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'r8169-series-with-further-smaller-improvements'

Heiner Kallweit says:

====================
r8169: series with further smaller improvements

Nothing too exciting. This series includes further smaller
improvements.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: improve statistics of missed rx packets

Register RxMissed exists on few early chip versions only, however all
chip versions have the number of missed RX packets in the hardware
counters. Therefore remove using RxMissed and get the number of missed
RX packets from the hardware stats.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: improve rtl_jumbo_config

Merge enabling and disabling jumbo packets to one function to make
the code a little simpler.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: improve rtl8169_get_mac_version

Currently code snippet (RTL_R32(tp, TxConfig) >> 20) & 0xfcf is used
in few places to extract the chip XID. Change the code to do the XID
extraction only once.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: add helper rtl_pci_commit

In few places we do a PCI commit by reading an arbitrary chip register.
It's not always obvious that the read is meant to be a PCI commit,
therefore add a helper for it.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: simplify setting netdev features

Setting dev->features a few lines later allows to simplify the code.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: remove setting PCI_CACHE_LINE_SIZE in rtl_hw_start_8169

This is done for all RTL8169 chip versions in rtl8169_init_phy already.
Therefore we can remove it here.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: remove unneeded check from rtl_link_chg_patch

rtl_link_chg_patch() can be called from rtl_open() to rtl8169_close()
only. And in rtl8169_close() phy_stop() ensures that this function
isn't called afterwards. So we don't need this check.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

openvswitch: add TTL decrement action

New action to decrement TTL instead of setting it to a fixed value.
This action will decrement the TTL and, in case of expired TTL, drop it
or execute an action passed via a nested attribute.
The default TTL expired action is to drop the packet.

Supports both IPv4 and IPv6 via the ttl and hop_limit fields, respectively.

Tested with a corresponding change in the userspace:

    # ovs-dpctl dump-flows
    in_port(2),eth(),eth_type(0x0800), packets:0, bytes:0, used:never, actions:dec_ttl{ttl<=1 action:(drop)},1
    in_port(1),eth(),eth_type(0x0800), packets:0, bytes:0, used:never, actions:dec_ttl{ttl<=1 action:(drop)},2
    in_port(1),eth(),eth_type(0x0806), packets:0, bytes:0, used:never, actions:2
    in_port(2),eth(),eth_type(0x0806), packets:0, bytes:0, used:never, actions:1

    # ping -c1 192.168.0.2 -t 42
    IP (tos 0x0, ttl 41, id 61647, offset 0, flags [DF], proto ICMP (1), length 84)
        192.168.0.1 > 192.168.0.2: ICMP echo request, id 386, seq 1, length 64
    # ping -c1 192.168.0.2 -t 120
    IP (tos 0x0, ttl 119, id 62070, offset 0, flags [DF], proto ICMP (1), length 84)
        192.168.0.1 > 192.168.0.2: ICMP echo request, id 388, seq 1, length 64
    # ping -c1 192.168.0.2 -t 1
    #

Co-developed-by: Bindiya Kurle <bindiyakurle@gmail.com>
Signed-off-by: Bindiya Kurle <bindiyakurle@gmail.com>
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: bcm_sf2: Also configure Port 5 for 2Gb/sec on 7278

Either port 5 or port 8 can be used on a 7278 device, make sure that
port 5 also gets configured properly for 2Gb/sec in that case.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp-zerocopy: Return sk_err (if set) along with tcp receive zerocopy.

This patchset is intended to reduce the number of extra system calls
imposed by TCP receive zerocopy. For ping-pong RPC style workloads,
this patchset has demonstrated a system call reduction of about 30%
when coupled with userspace changes.

For applications using epoll, returning sk_err along with the result
of tcp receive zerocopy could remove the need to call
recvmsg()=-EAGAIN after a spurious wakeup.

Consider a multi-threaded application using epoll. A thread may awaken
with EPOLLIN but another thread may already be reading. The
spuriously-awoken thread does not necessarily know that another thread
'won'; rather, it may be possible that it was woken up due to the
presence of an error if there is no data. A zerocopy read receiving 0
bytes thus would need to be followed up by recvmsg to be sure.

Instead, we return sk_err directly with zerocopy, so the application
can avoid this extra system call.

Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp-zerocopy: Return inq along with tcp receive zerocopy.

This patchset is intended to reduce the number of extra system calls
imposed by TCP receive zerocopy. For ping-pong RPC style workloads,
this patchset has demonstrated a system call reduction of about 30%
when coupled with userspace changes.

For applications using edge-triggered epoll, returning inq along with
the result of tcp receive zerocopy could remove the need to call
recvmsg()=-EAGAIN after a successful zerocopy. Generally speaking,
since normally we would need to perform a recvmsg() call for every
successful small RPC read via TCP receive zerocopy, returning inq can
reduce the number of system calls performed by approximately half.

Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'Enhance-virtio-vsock-connection-semantics'

Sebastien Boeuf says:

====================
Enhance virtio-vsock connection semantics

This series improves the semantics behind the way virtio-vsock server
accepts connections coming from the client. Whenever the server
receives a connection request from the client, if it is bound to the
socket but not yet listening, it will answer with a RST packet. The
point is to ensure each request from the client is quickly processed
so that the client can decide about the strategy of retrying or not.

The series includes along with the improvement patch a new test to
ensure the behavior is consistent across all hypervisors drivers.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

tools: testing: vsock: Test when server is bound but not listening

Whenever the server side of vsock is binding to the socket, but not
listening yet, we expect the behavior from the client to be identical to
what happens when the server is not even started.

This new test runs the server side so that it binds to the socket
without ever listening to it. The client side will try to connect and
should receive an ECONNRESET error.

This new test provides a way to validate the previously introduced patch
for making sure the server side will always answer with a RST packet in
case the client requested a new connection.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: virtio_vsock: Enhance connection semantics

Whenever the vsock backend on the host sends a packet through the RX
queue, it expects an answer on the TX queue. Unfortunately, there is one
case where the host side will hang waiting for the answer and might
effectively never recover if no timeout mechanism was implemented.

This issue happens when the guest side starts binding to the socket,
which insert a new bound socket into the list of already bound sockets.
At this time, we expect the guest to also start listening, which will
trigger the sk_state to move from TCP_CLOSE to TCP_LISTEN. The problem
occurs if the host side queued a RX packet and triggered an interrupt
right between the end of the binding process and the beginning of the
listening process. In this specific case, the function processing the
packet virtio_transport_recv_pkt() will find a bound socket, which means
it will hit the switch statement checking for the sk_state, but the
state won't be changed into TCP_LISTEN yet, which leads the code to pick
the default statement. This default statement will only free the buffer,
while it should also respond to the host side, by sending a packet on
its TX queue.

In order to simply fix this unfortunate chain of events, it is important
that in case the default statement is entered, and because at this stage
we know the host side is waiting for an answer, we must send back a
packet containing the operation VIRTIO_VSOCK_OP_RST.

One could say that a proper timeout mechanism on the host side will be
enough to avoid the backend to hang. But the point of this patch is to
ensure the normal use case will be provided with proper responsiveness
when it comes to establishing the connection.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mac80211-next-for-net-next-2020-02-14' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
A few big new things:
* 802.11 frame encapsulation offload support
* more HE (802.11ax) support, including some for 6 GHz band
* powersave in hwsim, for better testing

Of course as usual there are various cleanups and small fixes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: x25: convert to list_for_each_entry_safe()

Use list_for_each_entry_safe() instead of list_for_each_safe()
to simplify the code.

Signed-off-by: chenqiwu <chenqiwu@xiaomi.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

lib: objagg: Replace zero-length arrays with flexible-array member

The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
int stuff;
struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertenly introduced[3] to the codebase from now on.

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 4287773bdd7a ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ptp_qoriq: drop the code of alarm

The alarm function hadn't been supported by PTP clock driver.
The recommended solution PHC + phc2sys + nanosleep provides
best performance. So drop the code of alarm in ptp_qoriq driver.

Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ice: use true/false for bool types

Subject says it all.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ice: add function argument description to function header comment

Commit f86cf94969d4 ("netdev: pass the stuck queue to the timeout handler")
introduced a new argument to the function but missed adding the description
of the argument to the function header comment. Add it now.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ice: use proper format for function pointer as a function parameter

Compiling with gcc-9.2.1 with W=1 points out warnings about the improper
function parameter list. Fix it.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ice: replace "fallthrough" comments with fallthrough reserved word

"fallthrough" comments are used in switch case statements to explicitly
indicate the code is intended to fall through to the following statement.
Different variants of "fallthough" are acceptable, e.g. "fall through",
"fallthrough", "Fall-through". The GCC compiler has an optional warning
(-Wimplicit-fallthrough[=n]) to warn when such a comment is not present;
the default version of which is enabled when compiling the Linux kernel.

There have been recent discussions in kernel mailing lists regarding
replacing non-standardized "fallthrough" comments with the pseudo-reserved
word 'fallthrough' which will be defined as __attribute__ ((fallthrough))
for versions of gcc that support it (i.e. gcc 7 and newer) or as a nop
for versions that do not. Replace "fallthrough" comments with fallthrough
reserved word.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ice: remove unnecessary fallthrough comments

Fallthrough comments are used to explicitly indicate the code is intended
to flow from one case statement to the next in a switch statement rather
than break out of the switch statement. They are only needed when a case
has one or more statements to execute before falling through to the next
case, not when there is a list of cases for which the same statement(s)
should be executed.

Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ice: Fix virtchnl_queue_select bitmap validation

Currently in ice_vc_ena_qs_msg() we are incorrectly validating the
virtchnl queue select bitmaps. The virtchnl_queue_select rx_queues and
tx_queue bitmap is being compared against ICE_MAX_BASE_QS_PER_VF, but
the problem is that these bitmaps can have a value greater than
ICE_MAX_BASE_QS_PER_VF. Fix this by comparing the bitmaps against
BIT(ICE_MAX_BASE_QS_PER_VF).

Also, add the function ice_vc_validate_vqs_bitmaps() that checks to see
if both virtchnl_queue_select bitmaps are empty along with checking that
the bitmaps only have valid bits set. This function can then be used in
both the queue enable and disable flows.

Arkady Gilinksky's patch on the intel-wired-lan mailing list
("i40e/iavf: Fix msg interface between VF and PF") made me
aware of this issue.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ice: Fix and refactor Rx queue disable for VFs

Currently when a VF driver sends the PF a request to disable Rx queues
we will disable them one at a time, even if the VF driver sent us a
batch of queues to disable. This is causing issues where the Rx queue
disable times out with LFC enabled. This can be improved by detecting
when the VF is trying to disable all of its queues.

Also remove the variable num_qs_ena from the ice_vf structure as it was
only used to see if there were no Rx and no Tx queues active. Instead
add a function that checks if both the vf->rxq_ena and vf->txq_ena
bitmaps are empty.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ice: Handle LAN overflow event for VF queues

Currently we are not handling LAN overflow events. There can be cases
where LAN overflow events occur on VF queues, especially with Link Flow
Control (LFC) enabled on the controlling PF. In order to recover from
the LAN overflow event caused by a VF we need to determine if the queue
belongs to a VF and reset that VF accordingly.

The struct ice_aqc_event_lan_overflow returns a copy of the GLDCB_RTCTQ
register, which tells us what the queue index is in the global/device
space. The global queue index needs to first be converted to a PF space
queue index and then it can be used to find if a VF owns it.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ice: Fix implicit queue mapping mode in ice_vsi_get_qs

Currently in ice_vsi_get_qs() we set the mapping_mode for Tx and Rx to
vsi->[tx|rx]_mapping_mode, but the problem is vsi->[tx|rx]_mapping_mode
have not been set yet. This was working because ICE_VSI_MAP_CONTIG is
defined to 0. Fix this by being explicit with our mapping mode by
initializing the Tx and Rx structure's mapping_mode to
ICE_VSI_MAP_CONTIG and then setting the vsi->[tx|rx]_mapping_mode to the
[tx|rx]_qs_cfg.mapping_mode values.

Also, only assign the vsi->[tx|rx]_mapping_mode when the queues are
successfully mapped to the VSI. With this change there was no longer a
need to initialize the ret variable to 0 so remove that.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ice: Add support to enable/disable all Rx queues before waiting

Currently when we enable/disable all Rx queues we do the following
sequence for each Rx queue and then move to the next queue.

1. Enable/Disable the Rx queue via register write.
2. Read the configuration register to determine if the Rx queue was
enabled/disabled successfully.

In some cases enabling/disabling queue 0 fails because of step 2 above.
Fix this by doing step 1 for all of the Rx queues and then step 2 for
all of the Rx queues.

Also, there are cases where we enable/disable a single queue (i.e.
SR-IOV and XDP) so add a new function that does step 1 and 2 above with
a read flush in between.

This change also required a single Rx queue to be enabled/disabled with
and without waiting for the change to propagate through hardware. Fix
this by adding a boolean wait flag to the necessary functions.

Also, add the keywords "one" and "all" to distinguish between
enabling/disabling a single Rx queue and all Rx queues respectively.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ice: Only allow tagged bcast/mcast traffic for VF in port VLAN

Currently the VF can see other's broadcast and multicast traffic because
it always has a VLAN filter for VLAN 0. Fix this by removing/adding the
VF's VLAN 0 filter when a port VLAN is added/removed respectively.

This required a few changes.

1. Move where we add VLAN 0 by default for the VF into
ice_alloc_vsi_res() because this is when we determine if a port VLAN is
present for load and reset.

2. Moved where we kill the old port VLAN filter in
ice_set_vf_port_vlan() to the very end of the function because it allows
us to save the old port VLAN configuration upon any failure case.

3. During adding/removing of a port VLAN via ice_set_vf_port_vlan() we
also need to remove/add the VLAN 0 filter rule respectively.

4. Improve log messages.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ice: Fix Port VLAN priority bits

Currently when configuring a port VLAN for a VF we are only shifting the
QoS bits by 12. This is incorrect. Fix this by getting rid of the ICE
specific VLAN defines and use the kernel VLAN defines instead.

Also, don't assign a value to vlanprio until the VLAN ID and QoS
parameters have been validated.

Also, there are many places we do (le16_to_cpu(vsi->info.pvid) &
VLAN_VID_MASK). Instead do (vf->port_vlan_info & VLAN_VID_MASK) because
we always save what's stored in vsi->info.pvid to vf->port_vlan_info in
the CPU's endianness.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ice: Add helper to determine if VF link is up

The check for vf->link_up is incorrect because this field is only valid if
vf->link_forced is true. Fix this by adding the helper ice_is_vf_link_up()
to determine if the VF's link is up.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ice: Refactor port vlan configuration for the VF

Currently ice_vsi_manage_pvid() calls
ice_vsi_[set|kill]_pvid_fill_ctxt() when enabling/disabling a port VLAN
on a VSI respectively. These two functions have some duplication so just
move their unique pieces inline in ice_vsi_manage_pvid() and then the
duplicate code can be reused for both the enabling/disabling paths.

Before this patch the info.pvid field was not being written
correctly via ice_vsi_kill_pvid_fill_ctxt() so it was being hard coded
to 0 in ice_set_vf_port_vlan(). Fix this by setting the info.pvid field
to 0 before calling ice_vsi_update() in ice_vsi_manage_pvid().

We currently use vf->port_vlan_id to keep track of the port VLAN
ID and QoS, which is a bit misleading. Fix this by renaming it to
vf->port_vlan_info. Also change the name of the argument for
ice_vsi_manage_pvid() from vid to pvid_info.

In ice_vsi_manage_pvid() only save the fields that were modified
in the VSI properties structure on success instead of the entire thing.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

ice: Add initial support for QinQ

Allow support for S-Tag + C-Tag VLAN traffic by disabling pruning when
there are no 0x8100 VLAN interfaces currently created on top of the PF.
When an 0x8100 VLAN interface is configured, enable pruning and only
support single and double C-Tag VLAN traffic. If all of the 0x8100
interfaces that were created on top of the PF are removed via
ethtool -K <iface> rx-vlan-filter off or via ip tools, then disable
pruning and allow S-Tag + C-Tag traffic again.

Add VLAN 0 filter by default for the PF. This is because a bridge
sets the default_pvid to 1, sends the request down to
ice_vlan_rx_add_vid(), and we never get the request to add VLAN 0 via
the 8021q module which causes all untagged traffic to be dropped.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from David Miller:

1) Fix interrupt name truncation in mv88e6xxx dsa driver, from Andrew
    Lunn.

2) Process generic XDP even if SKB is cloned, from Toke Høiland-Jørgensen.

3) Fix leak of kernel memory to userspace in smc, from Eric Dumazet.

4) Add some missing netlink attribute validation to matchall and
    flower, from Davide Caratti.

5) Send icmp responses properly when NAT has been applied to the frame
    before we get to the tunnel emitting the icmp, from Jason Donenfeld.

6) Make sure there is enough SKB headroom when adding dsa tags for qca
    and ar9331. From Per Forlin.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (62 commits)
  netdevice.h: fix all kernel-doc and Sphinx warnings
  net: dsa: tag_ar9331: Make sure there is headroom for tag
  net: dsa: tag_qca: Make sure there is headroom for tag
  net, ip6_tunnel: enhance tunnel locate with link check
  net/smc: no peer ID in CLC decline for SMCD
  net/smc: transfer fasync_list in case of fallback
  net: hns3: fix a copying IPv6 address error in hclge_fd_get_flow_tuples()
  net: hns3: fix VF bandwidth does not take effect in some case
  net: hns3: add management table after IMP reset
  mac80211: fix wrong 160/80+80 MHz setting
  cfg80211: add missing policy for NL80211_ATTR_STATUS_CODE
  xfrm: interface: use icmp_ndo_send helper
  wireguard: device: use icmp_ndo_send helper
  sunvnet: use icmp_ndo_send helper
  gtp: use icmp_ndo_send helper
  icmp: introduce helper for nat'd source address in network device context
  net/sched: flower: add missing validation of TCA_FLOWER_FLAGS
  net/sched: matchall: add missing validation of TCA_MATCHALL_FLAGS
  net/flow_dissector: remove unexist field description
  page_pool: refill page when alloc.count of pool is zero
  ...

Merge tag 'pm-5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
"Fix three issues related to the handling of wakeup events signaled
  through the ACPI SCI while suspended to idle (Rafael Wysocki) and
  unexport an internal cpufreq variable (Yangtao Li)"

* tag 'pm-5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: PM: s2idle: Prevent spurious SCIs from waking up the system
  ACPICA: Introduce acpi_any_gpe_status_set()
  ACPI: PM: s2idle: Avoid possible race related to the EC GPE
  ACPI: EC: Fix flushing of pending work
  cpufreq: Make cpufreq_global_kobject static

Merge tag 'sound-5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"The only common change is the regression fix of the previous PCM fix
  patch for managed buffers while the rest are usual suspects, USB-audio
  and HD-audio device-specific quirks.

  The change for UAC2 clock validation workaround became a bit big, but
  the changes are fairly straightforward"

* tag 'sound-5.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: pcm: Fix double hw_free calls
  ALSA: usb-audio: Add clock validity quirk for Denon MC7000/MCX8000
  ALSA: hda/realtek - Fix silent output on MSI-GL73
  ALSA: hda/realtek - Add more codec supported Headset Button
  ALSA: usb-audio: Apply sample rate quirk for Audioengine D1
  ALSA: usb-audio: Fix UAC2/3 effect unit parsing
  ALSA: usb-audio: Apply 48kHz fixed rate playback for Jabra Evolve 65 headset

Merge tag 'drm-fixes-2020-02-14' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
"The core has a build fix for edid code on certain compilers/arches/,
  one MST fix and one vgem fix. Regular amdgpu fixes, and a couple of
  small driver fixes.

  The i915 fixes are bit larger than normal for this stage, but they
  were having CI issues last week, and they hadn't sent any fixes last
  week due to this.

  core:
   - edid build fix

  mst:
   - fix NULL ptr deref

  vgem:
   - fix close after free

  msm:
   - better dma-api usage

  sun4i:
   - disable allow_fb_modifiers

  amdgpu:
   - Additional OD fixes for navi
   - Misc display fixes
   - VCN 2.5 DPG fix
   - Prevent build errors on PowerPC on some configs
   - GDS EDC fix

  i915:
   - dsi/acpi fixes
   - gvt locking and allocation fixes
   - gem/gt fixes
   - bios timing parameters fix"

* tag 'drm-fixes-2020-02-14' of git://anongit.freedesktop.org/drm/drm: (50 commits)
  drm/i915: Mark the removal of the i915_request from the sched.link
  drm/i915/execlists: Reclaim the hanging virtual request
  drm/i915/execlists: Take a reference while capturing the guilty request
  drm/i915/execlists: Offline error capture
  drm/i915/gt: Allow temporary suspension of inflight requests
  drm/i915: Keep track of request among the scheduling lists
  drm/i915/gem: Tighten checks and acquiring the mmap object
  drm/i915: Fix preallocated barrier list append
  drm/i915/gt: Acquire ce->active before ce->pin_count/ce->pin_mutex
  drm/i915: Tighten atomicity of i915_active_acquire vs i915_active_release
  drm/i915: Stub out i915_gpu_coredump_put
  drm/amdgpu:/navi10: use the ODCAP enum to index the caps array
  drm/amdgpu: update smu_v11_0_pptable.h
  drm/amdgpu: correct comment to clear up the confusion
  drm/amd/display: DCN2.x Do not program DPPCLK if same value
  drm/amd/display: Don't map ATOM_ENABLE to ATOM_INIT
  drm/amdgpu/vcn2.5: fix warning
  drm/amdgpu: limit GDS clearing workaround in cold boot sequence
  drm/amdgpu: fix amdgpu pmu to use hwc->config instead of hwc->conf
  amdgpu: Prevent build errors regarding soft/hard-float FP ABI tags
  ...

netdevice.h: fix all kernel-doc and Sphinx warnings

Eliminate all kernel-doc and Sphinx warnings in
<linux/netdevice.h>. Fixes these warnings:

../include/linux/netdevice.h:2100: warning: Function parameter or member 'gso_partial_features' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'l3mdev_ops' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'xfrmdev_ops' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'tlsdev_ops' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'name_assign_type' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'ieee802154_ptr' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'mpls_ptr' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'xdp_prog' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'gro_flush_timeout' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'xdp_bulkq' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'xps_cpus_map' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'xps_rxqs_map' not described in 'net_device'
../include/linux/netdevice.h:2100: warning: Function parameter or member 'qdisc_hash' not described in 'net_device'
../include/linux/netdevice.h:3552: WARNING: Inline emphasis start-string without end-string.
../include/linux/netdevice.h:3552: WARNING: Inline emphasis start-string without end-string.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'dsa-headroom'

Per Forlin says:

====================
net: dsa: Make sure there is headroom for tag

Sorry for re-posting yet another time....
I manage to include multiple email-senders and forgot to include cover-letter.
Let's hope everyhthing is in order this time.

Fix two tag drivers to make sure there is headroom for the tag data.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: tag_ar9331: Make sure there is headroom for tag

Passing tag size to skb_cow_head will make sure
there is enough headroom for the tag data.
This change does not introduce any overhead in case there
is already available headroom for tag.

Signed-off-by: Per Forlin <perfn@axis.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: tag_qca: Make sure there is headroom for tag

Passing tag size to skb_cow_head will make sure
there is enough headroom for the tag data.
This change does not introduce any overhead in case there
is already available headroom for tag.

Signed-off-by: Per Forlin <perfn@axis.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net, ip6_tunnel: enhance tunnel locate with link check

With ipip, it is possible to create an extra interface explicitly
attached to a given physical interface:

  # ip link show tunl0
  4: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
  # ip link add tunl1 type ipip dev eth0
  # ip link show tunl1
  6: tunl1@eth0: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0

But it is not possible with ip6tnl:

  # ip link show ip6tnl0
  5: ip6tnl0@NONE: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group default qlen 1000
      link/tunnel6 :: brd ::
  # ip link add ip6tnl1 type ip6tnl dev eth0
  RTNETLINK answers: File exists

This patch aims to make it possible by adding link comparaison in both
tunnel locate and lookup functions; we also modify mtu calculation when
attached to an interface with a lower mtu.

This permits to make use of x-netns communication by moving the newly
created tunnel in a given netns.

Signed-off-by: William Dauchy <w.dauchy@criteo.com>
Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mac80211-for-net-2020-02-14' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
Just a few fixes:
* avoid running out of tracking space for frames that need
   to be reported to userspace by using more bits
* fix beacon handling suppression by adding some relevant
   elements to the CRC calculation
* fix quiet mode in action frames
* fix crash in ethtool for virt_wifi and similar
* add a missing policy entry
* fix 160 & 80+80 bandwidth to take local capabilities into
   account
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'smc-fixes'

Karsten Graul says:

====================
net/smc: fixes for -net

Fix a syzbot finding and a problem with the CLC handshake content.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: no peer ID in CLC decline for SMCD

Just SMCR requires a CLC Peer ID, but not SMCD. The field should be
zero for SMCD.

Fixes: 00b917c03235 ("net/smc: add SMC-D support in CLC messages")
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/smc: transfer fasync_list in case of fallback

SMC does not work together with FASTOPEN. If sendmsg() is called with
flag MSG_FASTOPEN in SMC_INIT state, the SMC-socket switches to
fallback mode. To handle the previous ioctl FIOASYNC call correctly
in this case, it is necessary to transfer the socket wait queue
fasync_list to the internal TCP socket.

Reported-by: syzbot+4b1fe8105f8044a26162@syzkaller.appspotmail.com
Fixes: 8d716f2265c89 ("net/smc: handle sockopts forcing fallback")
Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'hns3-fixes'

Huazhong Tan says:

====================
net: hns3: fixes for -net

This series includes three bugfixes for the HNS3 ethernet driver.

[patch 1] fixes a management table lost issue after IMP reset.
[patch 2] fixes a VF bandwidth configuration not work problem.
[patch 3] fixes a problem related to IPv6 address copying.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: fix a copying IPv6 address error in hclge_fd_get_flow_tuples()

The IPv6 address defined in struct in6_addr is specified as
big endian, but there is no specified endian in struct
hclge_fd_rule_tuples, so it will cause a problem if directly
use memcpy() to copy ipv6 address between these two structures
since this field in struct hclge_fd_rule_tuples is little endian.

This patch fixes this problem by using be32_to_cpu() to convert
endian of IPv6 address of struct in6_addr before copying.

Fixes: 311b0277f0f2 ("net: hns3: add aRFS support for PF")
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: fix VF bandwidth does not take effect in some case

When enabling 4 TC after setting the bandwidth of VF, the bandwidth
of VF will resume to default value, because of the qset resources
changed in this case.

This patch fixes it by using a fixed VF's qset resources according to
HNAE3_MAX_TC macro.

Fixes: 959fbccef807 ("net: hns3: add support for configuring bandwidth of VF on the host")
Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: add management table after IMP reset

In the current process, the management table is missing after the
IMP reset. This patch adds the management table to the reset process.

Fixes: a36633ff8784 ("net: hns3: add manager table initialization for hardware")
Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'pm-cpufreq'

* pm-cpufreq:
cpufreq: Make cpufreq_global_kobject static

mac80211: allow setting queue_len for drivers not using wake_tx_queue

Currently a mac80211 driver can only set the txq_limit when using
wake_tx_queue. Not all drivers use wake_tx_queue. This patch adds a new
element to wiphy allowing a driver to set a custom tx_queue_len and the
code that will apply it in case it is set. The current default is
1000 which is too low for ath11k when doing HE rates.

Signed-off-by: John Crispin <john@phrozen.org>
Link: https://lore.kernel.org/r/20200211122605.13002-1-john@phrozen.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

ieee80211: add WPA3 OWE AKM suite selector

Add the definition for Opportunistic Wireless Encryption AKM selector.

Signed-off-by: Sergey Matyukevich <sergey.matyukevich.os@quantenna.com>
Link: https://lore.kernel.org/r/20200213131608.10541-3-sergey.matyukevich.os@quantenna.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: Fix setting txpower to zero

With multiple VIFS ath10k, and probably others, tries to find the
minimum txpower for all vifs and uses that when setting txpower in
the firmware.

If a second vif is added and starts to scan, it's txpower is not
initialized yet and it set to zero.

ath10k had a patch to ignore zero values, but then it is impossible
to actually set txpower to zero.

So, instead initialize the txpower to INT_MIN in mac80211, and let
drivers know that means the power has not been set and so should
be ignored.

This should fix regression in:

commit 333ca4a35e1bf77880d00e93ab7b28de800c1a7f
Author: Ryan Hsu <ryanhsu@qca.qualcomm.com>
Date: Tue Dec 13 14:55:19 2016 -0800

ath10k: fix incorrect txpower set by P2P_DEVICE interface

Tested on ath10k 9984 with ath10k-ct firmware.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Link: https://lore.kernel.org/r/20191217183057.24586-1-greearb@candelatech.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: fix wrong 160/80+80 MHz setting

Before this patch, STA's would set new width of 160/80+80 MHz based on AP capability only.
This is wrong because STA may not support > 80MHz BW.
Fix is to verify STA has 160/80+80 MHz capability before increasing its width to > 80MHz.

The "support_80_80" and "support_160" setting is based on:
"Table 9-272 — Setting of the Supported Channel Width Set subfield and Extended NSS BW
Support subfield at a STA transmitting the VHT Capabilities Information field"
From "Draft P802.11REVmd_D3.0.pdf"

Signed-off-by: Aviad Brikman <aviad.brikman@celeno.com>
Signed-off-by: Shay Bar <shay.bar@celeno.com>
Link: https://lore.kernel.org/r/20200210130728.23674-1-shay.bar@celeno.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

cfg80211: add missing policy for NL80211_ATTR_STATUS_CODE

The nl80211_policy is missing for NL80211_ATTR_STATUS_CODE attribute.
As a result, for strictly validated commands, it's assumed to not be
supported.

Signed-off-by: Sergey Matyukevich <sergey.matyukevich.os@quantenna.com>
Link: https://lore.kernel.org/r/20200213131608.10541-2-sergey.matyukevich.os@quantenna.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

Merge tag 'drm-intel-next-fixes-2020-02-13' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

drm/i915 fixes for v5.6-rc2

Most of these were aimed at a "next fixes" pull already during the merge
window, but there were issues with the baseline I used, which resulted
in a lot of issues in CI. I've regenerated this stuff piecemeal now,
adding gradually to it, and it seems healthy now.

Due to the issues this is much bigger than I'd like. But it was
obviously necessary to take the time to ensure it's not garbage...

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/878sl6yfrn.fsf@intel.com

Merge tag 'amd-drm-fixes-5.6-2020-02-12' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

amd-drm-fixes-5.6-2020-02-12:

amdgpu:
- Additional OD fixes for navi
- Misc display fixes
- VCN 2.5 DPG fix
- Prevent build errors on PowerPC on some configs
- GDS EDC fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200212224746.3992-1-alexander.deucher@amd.com

Merge tag 'drm-misc-next-fixes-2020-02-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

drm-misc-next fixes for v5.6:
- Fix build error in drm/edid.
- Plug close-after-free race in vgem_gem_create.
- Handle CONFIG_DMA_API_DEBUG_SG better in drm/msm.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/551b6183-a581-9d12-10a9-24cd929de425@linux.intel.com

Merge tag 'drm-misc-fixes-2020-02-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Fixes for v5.6:
- Revert allow_fb_modifiers in sun4i, as it causes a regression for DE2 and DE3.
- Fix null pointer deref in drm_dp_mst_process_up_req().

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/672810c3-4212-0a46-337b-2cb855573fd2@linux.intel.com

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
"Summary below, but it's all reasonably straightforward. There are some
  more fixes on the horizon, but nothing disastrous yet.

  Summary:

   - Fix build when KASLR is enabled but CONFIG_ARCH_RANDOM is not set

   - Fix context-switching of SSBS state on systems that implement it

   - Fix spinlock compiler warning introduced during the merge window

   - Fix incorrect header inclusion (linux/clk-provider.h)

   - Use SYSCTL_{ZERO,ONE} instead of rolling our own static variables

   - Don't scream if optional SMMUv3 PMU irq is missing

   - Remove some unused function prototypes"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: time: Replace <linux/clk-provider.h> by <linux/of_clk.h>
  arm64: Fix CONFIG_ARCH_RANDOM=n build
  perf/smmuv3: Use platform_get_irq_optional() for wired interrupt
  arm64/spinlock: fix a -Wunused-function warning
  arm64: ssbs: Fix context-switch when SSBS is present on all CPUs
  arm64: use shared sysctl constants
  arm64: Drop do_el0_ia_bp_hardening() & do_sp_pc_abort() declarations

Merge tag 'gpio-v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:

- Revert two patches to gpio_do_set_config() and implement the proper
   solution that works, also drop an unecessary call in set_config()

- Fix up the lockdep class for hierarchical IRQ domains.

- Remove some bridge code for line directions.

- Fix a register access bug in the Xilinx driver.

* tag 'gpio-v5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: sifive: fix static checker warning
  spmi: pmic-arb: Set lockdep class for hierarchical irq domains
  gpio: xilinx: Fix bug where the wrong GPIO register is written to
  gpiolib: remove unnecessary argument from set_config call
  gpio: bd71828: Remove unneeded defines for GPIO_LINE_DIRECTION_IN/OUT
  MAINTAINERS: Sort entries in database for GPIO
  gpiolib: fix gpio_do_set_config()
  Revert "gpiolib: remove set but not used variable 'config'"
  Revert "gpiolib: Remove duplicated function gpio_do_set_config()"

Merge branch 'icmp-account-for-NAT-when-sending-icmps-from-ndo-layer'

Jason A. Donenfeld says:

====================
icmp: account for NAT when sending icmps from ndo layer

The ICMP routines use the source address for two reasons:

1. Rate-limiting ICMP transmissions based on source address, so
   that one source address cannot provoke a flood of replies. If
   the source address is wrong, the rate limiting will be
   incorrectly applied.

2. Choosing the interface and hence new source address of the
   generated ICMP packet. If the original packet source address
   is wrong, ICMP replies will be sent from the wrong source
   address, resulting in either a misdelivery, infoleak, or just
   general network admin confusion.

Most of the time, the icmp_send and icmpv6_send routines can just reach
down into the skb's IP header to determine the saddr. However, if
icmp_send or icmpv6_send is being called from a network device driver --
there are a few in the tree -- then it's possible that by the time
icmp_send or icmpv6_send looks at the packet, the packet's source
address has already been transformed by SNAT or MASQUERADE or some other
transformation that CONNTRACK knows about. In this case, the packet's
source address is most certainly the *wrong* source address to be used
for the purpose of ICMP replies.

Rather, the source address we want to use for ICMP replies is the
original one, from before the transformation occurred.

Fortunately, it's very easy to just ask CONNTRACK if it knows about this
packet, and if so, how to fix it up. The saddr is the only field in the
header we need to fix up, for the purposes of the subsequent processing
in the icmp_send and icmpv6_send functions, so we do the lookup very
early on, so that the rest of the ICMP machinery can progress as usual.

Changes v3->v4:
- Add back the skb_shared checking, since the previous assumption isn't
  actually true [Eric]. This implies dropping the additional patches v3 had
  for removing skb_share_check from various drivers. We can revisit that
  general set of ideas later, but that's probably better suited as a net-next
  patchset rather than this stable one which is geared at fixing bugs. So,
  this implements things in the safe conservative way.

Changes v2->v3:
- Add selftest to ensure this actually does what we want and never regresses.
- Check the size of the skb header before operating on it.
- Use skb_ensure_writable to ensure we can modify the cloned skb [Florian].
- Conditionalize this on IPS_SRC_NAT so we don't do anything unnecessarily
  [Florian].
- It turns out that since we're calling these from the xmit path,
  skb_share_check isn't required, so remove that [Florian]. This simplifes the
  code a bit too. **The supposition here is that skbs passed to ndo_start_xmit
  are _never_ shared. If this is not correct NOW IS THE TIME TO PIPE UP, for
  doom awaits us later.**
- While investigating the shared skb business, several drivers appeared to be
  calling it incorrectly in the xmit path, so this series also removes those
  unnecessary calls, based on the supposition mentioned in the previous point.

Changes v1->v2:
- icmpv6 takes subtly different types than icmpv4, like u32 instead of be32,
  u8 instead of int.
- Since we're technically writing to the skb, we need to make sure it's not
  a shared one [Dave, 2017].
- Restore the original skb data after icmp_send returns. All current users
  are freeing the packet right after, so it doesn't matter, but future users
  might not.
- Remove superfluous route lookup in sunvnet [Dave].
- Use NF_NAT instead of NF_CONNTRACK for condition [Florian].
- Include this cover letter [Dave].
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

xfrm: interface: use icmp_ndo_send helper

Because xfrmi is calling icmp from network device context, it should use
the ndo helper so that the rate limiting applies correctly.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

wireguard: device: use icmp_ndo_send helper

Because wireguard is calling icmp from network device context, it should
use the ndo helper so that the rate limiting applies correctly. This
commit adds a small test to the wireguard test suite to ensure that the
new functions continue doing the right thing in the context of
wireguard. It does this by setting up a condition that will definately
evoke an icmp error message from the driver, but along a nat'd path.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sunvnet: use icmp_ndo_send helper

Because sunvnet is calling icmp from network device context, it should use
the ndo helper so that the rate limiting applies correctly. While we're
at it, doing the additional route lookup before calling icmp_ndo_send is
superfluous, since this is the job of the icmp code in the first place.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

gtp: use icmp_ndo_send helper

Because gtp is calling icmp from network device context, it should use
the ndo helper so that the rate limiting applies correctly.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Harald Welte <laforge@gnumonks.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

icmp: introduce helper for nat'd source address in network device context

This introduces a helper function to be called only by network drivers
that wraps calls to icmp[v6]_send in a conntrack transformation, in case
NAT has been used. We don't want to pollute the non-driver path, though,
so we introduce this as a helper to be called by places that actually
make use of this, as suggested by Florian.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:
"This fixes a Kconfig anomaly when lib/crypto is enabled without Crypto
API"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: Kconfig - allow tests to be disabled when manager is disabled

Merge branch 'skip_sw-skip_hw-validation'

Davide Caratti says:

====================
add missing validation of 'skip_hw/skip_sw'

ensure that all classifiers currently supporting HW offload
validate the 'flags' parameter provided by user:

- patch 1/2 fixes cls_matchall
- patch 2/2 fixes cls_flower
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net/sched: flower: add missing validation of TCA_FLOWER_FLAGS

unlike other classifiers that can be offloaded (i.e. users can set flags
like 'skip_hw' and 'skip_sw'), 'cls_flower' doesn't validate the size of
netlink attribute 'TCA_FLOWER_FLAGS' provided by user: add a proper entry
to fl_policy.

Fixes: 609f1261c1c5 ("net/flower: Introduce hardware offload support")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sched: matchall: add missing validation of TCA_MATCHALL_FLAGS

unlike other classifiers that can be offloaded (i.e. users can set flags
like 'skip_hw' and 'skip_sw'), 'cls_matchall' doesn't validate the size
of netlink attribute 'TCA_MATCHALL_FLAGS' provided by user: add a proper
entry to mall_policy.

Fixes: 2385c6d25b98 ("net/sched: Add match-all classifier hw offloading.")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>