skbuff: introduce {,__}napi_build_skb() which reuses NAPI cache heads
Instead of just bulk-flushing skbuff_heads queued up through
napi_consume_skb() or __kfree_skb_defer(), try to reuse them
on allocation path.
If the cache is empty on allocation, bulk-allocate the first
16 elements, which is more efficient than per-skb allocation.
If the cache is full on freeing, bulk-wipe the second half of
the cache (32 elements).
This also includes custom KASAN poisoning/unpoisoning to be
double sure there are no use-after-free cases.
To not change current behaviour, introduce a new function,
napi_build_skb(), to optionally use a new approach later
in drivers.
Note on selected bulk size, 16:
- this equals to XDP_BULK_QUEUE_SIZE, DEV_MAP_BULK_SIZE
and especially VETH_XDP_BATCH, which is also used to
bulk-allocate skbuff_heads and was tested on powerful
setups;
- this also showed the best performance in the actual
test series (from the array of {8, 16, 32}).
Suggested-by: Edward Cree <ecree.xilinx@gmail.com> # Divide on two halves Suggested-by: Eric Dumazet <edumazet@google.com> # KASAN poisoning Cc: Dmitry Vyukov <dvyukov@google.com> # Help with KASAN Cc: Paolo Abeni <pabeni@redhat.com> # Reduced batch size Signed-off-by: Alexander Lobakin <alobakin@pm.me> Signed-off-by: David S. Miller <davem@davemloft.net>