git.baikalelectronics.ru Git - kernel.git/commit

virtio-net: auto-tune mergeable rx buffer size for improved performance

Commit de4826a7cfe2 ("virtio_net: migrate mergeable rx buffers to page frag
allocators") changed the mergeable receive buffer size from PAGE_SIZE to
MTU-size, introducing a single-stream regression for benchmarks with large
average packet size. There is no single optimal buffer size for all
workloads. For workloads with packet size <= MTU bytes, MTU + virtio-net
header-sized buffers are preferred as larger buffers reduce the TCP window
due to SKB truesize. However, single-stream workloads with large average
packet sizes have higher throughput if larger (e.g., PAGE_SIZE) buffers
are used.

This commit auto-tunes the mergeable receiver buffer packet size by
choosing the packet buffer size based on an EWMA of the recent packet
sizes for the receive queue. Packet buffer sizes range from MTU_SIZE +
virtio-net header len to PAGE_SIZE. This improves throughput for
large packet workloads, as any workload with average packet size >=
PAGE_SIZE will use PAGE_SIZE buffers.

These optimizations interact positively with recent commit
931b92cc96a8 ("virtio-net: coalesce rx frags when possible during rx"),
which coalesces adjacent RX SKB fragments in virtio_net. The coalescing
optimizations benefit buffers of any size.

Benchmarks taken from an average of 5 netperf 30-second TCP_STREAM runs
between two QEMU VMs on a single physical machine. Each VM has two VCPUs
with all offloads & vhost enabled. All VMs and vhost threads run in a
single 4 CPU cgroup cpuset, using cgroups to ensure that other processes
in the system will not be scheduled on the benchmark CPUs. Trunk includes
SKB rx frag coalescing.

net-next w/ virtio_net before de4826a7cfe2 (PAGE_SIZE bufs): 14642.85Gb/s
net-next (MTU-size bufs): 13170.01Gb/s
net-next + auto-tune: 14555.94Gb/s

Jason Wang also reported a throughput increase on mlx4 from 22Gb/s
using MTU-sized buffers to about 26Gb/s using auto-tuning.

Signed-off-by: Michael Dalton <mwdalton@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

author	Michael Dalton <mwdalton@google.com>
	Fri, 17 Jan 2014 06:23:27 +0000 (22:23 -0800)
committer	David S. Miller <davem@davemloft.net>
	Fri, 17 Jan 2014 07:46:06 +0000 (23:46 -0800)
commit	656857ae5c2821af90fcb48c8dc8669f2ead421f
tree	19468488a66a57873836cc28edba7ec6f43652cf	tree \| snapshot
parent	c3c03f4ab709dea2ac4799be4dbe3351561a5c41	commit \| diff