The descriptor list is a shared resource across all of the transmit queues, and
the locking mechanism used today only protects concurrency across a given
transmit queue between the transmit and reclaiming. This creates an opportunity
for the SYSTEMPORT hardware to work on corrupted descriptors if we have
multiple producers at once which is the case when using multiple transmit
queues.
This was particularly noticeable when using multiple flows/transmit queues and
it showed up in interesting ways in that UDP packets would get a correct UDP
header checksum being calculated over an incorrect packet length. Similarly TCP
packets would get an equally correct checksum computed by the hardware over an
incorrect packet length.
The SYSTEMPORT hardware maintains an internal descriptor list that it re-arranges
when the driver produces a new descriptor anytime it writes to the
WRITE_PORT_{HI,LO} registers, there is however some delay in the hardware to
re-organize its descriptors and it is possible that concurrent TX queues
eventually break this internal allocation scheme to the point where the
length/status part of the descriptor gets used for an incorrect data buffer.
The fix is to impose a global serialization for all TX queues in the short
section where we are writing to the WRITE_PORT_{HI,LO} registers which solves
the corruption even with multiple concurrent TX queues being used.
Fixes: 2912eb7400ef ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20211215202450.4086240-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
struct bcm_sysport_priv *priv = netdev_priv(dev);
struct device *kdev = &priv->pdev->dev;
struct bcm_sysport_tx_ring *ring;
+ unsigned long flags, desc_flags;
struct bcm_sysport_cb *cb;
struct netdev_queue *txq;
u32 len_status, addr_lo;
unsigned int skb_len;
- unsigned long flags;
dma_addr_t mapping;
u16 queue;
int ret;
ring->desc_count--;
/* Ports are latched, so write upper address first */
+ spin_lock_irqsave(&priv->desc_lock, desc_flags);
tdma_writel(priv, len_status, TDMA_WRITE_PORT_HI(ring->index));
tdma_writel(priv, addr_lo, TDMA_WRITE_PORT_LO(ring->index));
+ spin_unlock_irqrestore(&priv->desc_lock, desc_flags);
/* Check ring space and update SW control flow */
if (ring->desc_count == 0)
}
/* Initialize both hardware and software ring */
+ spin_lock_init(&priv->desc_lock);
for (i = 0; i < dev->num_tx_queues; i++) {
ret = bcm_sysport_init_tx_ring(priv, i);
if (ret) {
int wol_irq;
/* Transmit rings */
+ spinlock_t desc_lock;
struct bcm_sysport_tx_ring *tx_rings;
/* Receive queue */