]> git.baikalelectronics.ru Git - kernel.git/commit
tcp: use an RB tree for ooo receive queue
authorYaogong Wang <wygivan@google.com>
Wed, 7 Sep 2016 21:49:28 +0000 (14:49 -0700)
committerDavid S. Miller <davem@davemloft.net>
Fri, 9 Sep 2016 00:25:58 +0000 (17:25 -0700)
commit3e1b2177091c796b5da579a2c910336d08758ff7
treef434343314c30020025c7a84507107c4aad60fd4
parent7e086aff72e59965a5dd2d769641751dc62833d3
tcp: use an RB tree for ooo receive queue

Over the years, TCP BDP has increased by several orders of magnitude,
and some people are considering to reach the 2 Gbytes limit.

Even with current window scale limit of 14, ~1 Gbytes maps to ~740,000
MSS.

In presence of packet losses (or reorders), TCP stores incoming packets
into an out of order queue, and number of skbs sitting there waiting for
the missing packets to be received can be in the 10^5 range.

Most packets are appended to the tail of this queue, and when
packets can finally be transferred to receive queue, we scan the queue
from its head.

However, in presence of heavy losses, we might have to find an arbitrary
point in this queue, involving a linear scan for every incoming packet,
throwing away cpu caches.

This patch converts it to a RB tree, to get bounded latencies.

Yaogong wrote a preliminary patch about 2 years ago.
Eric did the rebase, added ofo_last_skb cache, polishing and tests.

Tested with network dropping between 1 and 10 % packets, with good
success (about 30 % increase of throughput in stress tests)

Next step would be to also use an RB tree for the write queue at sender
side ;)

Signed-off-by: Yaogong Wang <wygivan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Acked-By: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
include/linux/skbuff.h
include/linux/tcp.h
include/net/tcp.h
net/core/skbuff.c
net/ipv4/tcp.c
net/ipv4/tcp_input.c
net/ipv4/tcp_ipv4.c
net/ipv4/tcp_minisocks.c