Daniel Borkmann [Fri, 26 Jan 2018 22:33:37 +0000 (23:33 +0100)]
bpf: improve dead code sanitizing
Given we recently had c131187db2d3 ("bpf: fix branch pruning
logic") and 95a762e2c8c9 ("bpf: fix incorrect sign extension in
check_alu_op()") in particular where before verifier skipped
verification of the wrongly assumed dead branch, we should not
just replace the dead code parts with nops (mov r0,r0). If there
is a bug such as fixed in 95a762e2c8c9 in future again, where
runtime could execute those insns, then one of the potential
issues with the current setting would be that given the nops
would be at the end of the program, we could execute out of
bounds at some point.
The best in such case would be to just exit the BPF program
altogether and return an exception code. However, given this
would require two instructions, and such a dead code gap could
just be a single insn long, we would need to place 'r0 = X; ret'
snippet at the very end after the user program or at the start
before the program (where we'd skip that region on prog entry),
and then place unconditional ja's into the dead code gap.
While more complex but possible, there is still another roadblock
that currently prevents this, namely BPF to BPF calls. The issue
here is that such an exception could be returned from a callee,
but the caller would not know that it is an exception that needs
to be propagated further down.
An alternative with little complexity is to just use a ja-1
code for now, which will trap the execution here instead of
silently doing bad things if we ever get there due to bugs.
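The ja-1 idea can be illustrated with a small userspace sketch. It assumes a simplified stand-in for struct bpf_insn and a hypothetical seen[] array that marks which instructions were visited during verification; the function name and layout are illustrative and not the patch's actual code. A jump target is computed relative to the next instruction, so an offset of -1 makes the instruction jump to itself:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct bpf_insn. */
struct insn {
	uint8_t code;        /* opcode */
	uint8_t dst_reg : 4; /* destination register */
	uint8_t src_reg : 4; /* source register */
	int16_t off;         /* signed jump offset */
	int32_t imm;         /* signed immediate */
};

#define OP_JMP 0x05 /* value of BPF_JMP (jump instruction class) */
#define OP_JA  0x00 /* value of BPF_JA (unconditional jump) */

/*
 * Overwrite every instruction that was never visited with "ja -1".
 * seen[] is a hypothetical per-instruction flag array.
 * Returns the number of instructions patched.
 */
static size_t sanitize_dead_code(struct insn *prog, const bool *seen, size_t len)
{
	const struct insn trap = { .code = OP_JMP | OP_JA, .off = -1 };
	size_t patched = 0;

	assert(prog != NULL && seen != NULL);
	for (size_t i = 0; i < len; i++) {
		if (seen[i])
			continue;
		prog[i] = trap; /* jumps to itself instead of falling through */
		patched++;
	}
	return patched;
}
```

Unlike a nop, the trap never falls through to the following instruction, so a dead gap at the end of the program cannot run out of bounds.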
====================
This patchset adds support for:
- direct R or R/W access to many tcp_sock fields
- passing up to 4 arguments to sock_ops BPF functions
- tcp_sock field bpf_sock_ops_cb_flags for controlling callbacks
- optionally calling sock_ops BPF program when RTO fires
- optionally calling sock_ops BPF program when packet is retransmitted
- optionally calling sock_ops BPF program when TCP state changes
- access to tclass and sk_txhash
- new selftest
v2: Fixed commit message 0/11. The commit is to "bpf-next" but the patch
below used "bpf" and Patchwork didn't work correctly.
v3: Cleaned RTO callback as per Yuchung's comment
Added BPF enum for TCP states as per Alexei's comment
v4: Fixed compile warnings related to detecting changes between TCP
internal states and the BPF defined states.
v5: Fixed comment issues in some selftest files
Fixed access issue with u64 fields in bpf_sock_ops struct
v6: Made fixes based on comments from Eric Dumazet:
The field bpf_sock_ops_cb_flags was added in a hole on 64bit kernels
Field bpf_sock_ops_cb_flags is now set through a helper function
which returns an error when a BPF program tries to set bits for
callbacks that are not supported in the current kernel.
Added a comment indicating that when adding fields to bpf_sock_ops_kern
they should be added before the field named "temp" if they need to be
cleared before calling the BPF function.
v7: Enforced that fields "op" and "replylong[1] .. replylong[3]" are not
writable, based on comments from Eric Dumazet and Alexei Starovoitov.
Filled 32 bit hole in bpf_sock_ops struct with sk_txhash based on
comments from Daniel Borkmann.
Removed unused functions (tcp_call_bpf_1arg, tcp_call_bpf_4arg) based
on comments from Daniel Borkmann.
v8: Add commit message 00/12
Add Acked-by as appropriate
v9: Moved the bug fix to the front of the patchset
Changed RETRANS_CB so it is always called (before it was only called if
the retransmit succeeded). It is now called with an extra argument, the
return value of tcp_transmit_skb (0 => success). Based on comments
from Yuchung Cheng.
Added support for reading 2 new fields, sacked_out and lost_out, based on
comments from Yuchung Cheng.
v10: Moved the callback flags from include/uapi/linux/tcp.h to
include/uapi/linux/bpf.h
Cleaned up the test in selftest. Added a timeout so it always completes,
even if the client is not communicating with the server. Made it faster
by removing the sleeps. Made sure it works even when called back-to-back
20 times.
Consists of the following patches:
[PATCH bpf-next v10 01/12] bpf: Only reply field should be writeable
[PATCH bpf-next v10 02/12] bpf: Make SOCK_OPS_GET_TCP size
[PATCH bpf-next v10 03/12] bpf: Make SOCK_OPS_GET_TCP struct
[PATCH bpf-next v10 04/12] bpf: Add write access to tcp_sock and sock
[PATCH bpf-next v10 05/12] bpf: Support passing args to sock_ops bpf
[PATCH bpf-next v10 06/12] bpf: Adds field bpf_sock_ops_cb_flags to
[PATCH bpf-next v10 07/12] bpf: Add sock_ops RTO callback
[PATCH bpf-next v10 08/12] bpf: Add support for reading sk_state and
[PATCH bpf-next v10 09/12] bpf: Add sock_ops R/W access to tclass
[PATCH bpf-next v10 10/12] bpf: Add BPF_SOCK_OPS_RETRANS_CB
[PATCH bpf-next v10 11/12] bpf: Add BPF_SOCK_OPS_STATE_CB
[PATCH bpf-next v10 12/12] bpf: add selftest for tcpbpf
====================
Lawrence Brakmo [Fri, 26 Jan 2018 00:14:16 +0000 (16:14 -0800)]
bpf: add selftest for tcpbpf
Added a selftest for tcpbpf (sock_ops) that checks that the appropriate
callbacks occurred, that it can access tcp_sock fields, and that their
values are correct.
Run with command: ./test_tcpbpf_user
Adding the flag "-d" will show why it did not pass.
Lawrence Brakmo [Fri, 26 Jan 2018 00:14:15 +0000 (16:14 -0800)]
bpf: Add BPF_SOCK_OPS_STATE_CB
Adds support for calling sock_ops BPF program when there is a TCP state
change. Two arguments are used; one for the old state and another for
the new state.
There is a new enum in include/uapi/linux/bpf.h that exports the TCP
states that prepends BPF_ to the current TCP state names. If it is ever
necessary to change the internal TCP state values (other than adding
more to the end), then it will become necessary to convert from the
internal TCP state value to the BPF value before calling the BPF
sock_ops function. There are a set of compile checks added in tcp.c
to detect if the internal and BPF values differ so we can make the
necessary fixes.
New op: BPF_SOCK_OPS_STATE_CB.
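The compile checks mentioned above could look roughly like the following sketch. Both enums are local stand-ins (the BPF_ names follow the prefixing rule described in the message), and CHECK_STATE and bpf_state_from_tcp are hypothetical names used only for illustration:

```c
#include <assert.h>

/* Stand-in for the kernel-internal TCP states. */
enum {
	TCP_ESTABLISHED = 1,
	TCP_SYN_SENT,
	TCP_SYN_RECV,
	TCP_FIN_WAIT1,
	TCP_FIN_WAIT2,
	TCP_TIME_WAIT,
	TCP_CLOSE,
	TCP_CLOSE_WAIT,
	TCP_LAST_ACK,
	TCP_LISTEN,
	TCP_CLOSING,
};

/* Stand-in for the exported enum: same names with a BPF_ prefix. */
enum {
	BPF_TCP_ESTABLISHED = 1,
	BPF_TCP_SYN_SENT,
	BPF_TCP_SYN_RECV,
	BPF_TCP_FIN_WAIT1,
	BPF_TCP_FIN_WAIT2,
	BPF_TCP_TIME_WAIT,
	BPF_TCP_CLOSE,
	BPF_TCP_CLOSE_WAIT,
	BPF_TCP_LAST_ACK,
	BPF_TCP_LISTEN,
	BPF_TCP_CLOSING,
};

/* Fail the build if an internal value drifts away from its BPF twin. */
#define CHECK_STATE(name) \
	_Static_assert((int)BPF_##name == (int)name, \
		       "BPF_" #name " differs from " #name)

CHECK_STATE(TCP_ESTABLISHED);
CHECK_STATE(TCP_SYN_SENT);
CHECK_STATE(TCP_SYN_RECV);
CHECK_STATE(TCP_FIN_WAIT1);
CHECK_STATE(TCP_FIN_WAIT2);
CHECK_STATE(TCP_TIME_WAIT);
CHECK_STATE(TCP_CLOSE);
CHECK_STATE(TCP_CLOSE_WAIT);
CHECK_STATE(TCP_LAST_ACK);
CHECK_STATE(TCP_LISTEN);
CHECK_STATE(TCP_CLOSING);

/* With the checks in place, the conversion can stay an identity mapping. */
static int bpf_state_from_tcp(int tcp_state)
{
	assert(tcp_state >= TCP_ESTABLISHED && tcp_state <= TCP_CLOSING);
	return tcp_state;
}
```

If a check ever fails, the identity mapping would have to be replaced by an explicit translation before the sock_ops program is called.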
Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Lawrence Brakmo [Fri, 26 Jan 2018 00:14:14 +0000 (16:14 -0800)]
bpf: Add BPF_SOCK_OPS_RETRANS_CB
Adds support for calling sock_ops BPF program when there is a
retransmission. Three arguments are used; one for the sequence number,
another for the number of segments retransmitted, and the last one for
the return value of tcp_transmit_skb (0 => success).
Does not include syn-ack retransmissions.
New op: BPF_SOCK_OPS_RETRANS_CB.
Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Lawrence Brakmo [Fri, 26 Jan 2018 00:14:11 +0000 (16:14 -0800)]
bpf: Add sock_ops RTO callback
Adds an optional call to sock_ops BPF program based on whether the
BPF_SOCK_OPS_RTO_CB_FLAG is set in bpf_sock_ops_cb_flags.
The BPF program is passed 2 arguments: icsk_retransmits and whether the
RTO has expired.
Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Lawrence Brakmo [Fri, 26 Jan 2018 00:14:10 +0000 (16:14 -0800)]
bpf: Adds field bpf_sock_ops_cb_flags to tcp_sock
Adds field bpf_sock_ops_cb_flags to tcp_sock and bpf_sock_ops. Its primary
use is to determine if there should be calls to sock_ops bpf program at
various points in the TCP code. The field is initialized to zero,
disabling the calls. A sock_ops BPF program can set it, per connection and
as necessary, when the connection is established.
It also adds support for reading and writing the field within a
sock_ops BPF program. Reading is done by accessing the field directly.
However, writing is done through the helper function
bpf_sock_ops_cb_flags_set, in order to return an error if a BPF program
is trying to set a callback that is not supported in the current kernel
(i.e. running an older kernel). The helper function returns 0 if it was
able to set all of the bits set in the argument, a positive number
containing the bits that could not be set, or -EINVAL if the socket is
not a full TCP socket.
Examples of where one could call the bpf program:
1) When RTO fires
2) When a packet is retransmitted
3) When the connection terminates
4) When a packet is sent
5) When a packet is received
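The return convention of the helper described above can be sketched in userspace as follows. The flag names, struct conn, and cb_flags_set are hypothetical stand-ins, and the choice to store only the supported bits is an assumption of this sketch rather than something the message states:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback flags known to "this kernel". */
#define CB_FLAG_RTO     (1 << 0)
#define CB_FLAG_RETRANS (1 << 1)
#define CB_FLAG_STATE   (1 << 2)
#define CB_FLAGS_ALL    (CB_FLAG_RTO | CB_FLAG_RETRANS | CB_FLAG_STATE)

/* Stand-in for the per-connection state. */
struct conn {
	bool full_tcp_sock; /* false for e.g. request or timewait sockets */
	int cb_flags;       /* 0 means all optional callbacks are disabled */
};

/*
 * Returns 0 if every requested bit is supported, a positive mask of the
 * unsupported bits otherwise, or -EINVAL if this is not a full TCP socket.
 */
static int cb_flags_set(struct conn *c, int argval)
{
	assert(c != NULL);
	if (!c->full_tcp_sock)
		return -EINVAL;

	c->cb_flags = argval & CB_FLAGS_ALL; /* assumption: keep supported bits */
	return argval & ~CB_FLAGS_ALL;       /* report what could not be set */
}
```

A program written against a newer header can therefore detect, from the positive return value, which callbacks an older kernel will never deliver.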
Lawrence Brakmo [Fri, 26 Jan 2018 00:14:09 +0000 (16:14 -0800)]
bpf: Support passing args to sock_ops bpf function
Adds support for passing up to 4 arguments to sock_ops bpf functions. It
reuses the reply union, so the bpf_sock_ops structures are not
increased in size.
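A sketch of why reusing the union keeps the size unchanged. The two structs are trimmed stand-ins for the context before and after the change, and sock_ops_set_args is a hypothetical helper for illustration only:

```c
#include <assert.h>
#include <stdint.h>

/* Trimmed stand-in for the context before the change. */
struct sock_ops_ctx_old {
	uint32_t op;
	union {
		uint32_t reply;
		uint32_t replylong[4];
	};
};

/* After the change: args[] shares storage with the reply fields. */
struct sock_ops_ctx {
	uint32_t op;
	union {
		uint32_t args[4];
		uint32_t reply;
		uint32_t replylong[4];
	};
};

_Static_assert(sizeof(struct sock_ops_ctx) == sizeof(struct sock_ops_ctx_old),
	       "adding args[] must not grow the struct");

/* Hypothetical helper: fill op and up to 4 arguments, zero the rest. */
static void sock_ops_set_args(struct sock_ops_ctx *ctx, uint32_t op,
			      const uint32_t *args, unsigned int nargs)
{
	assert(nargs <= 4);
	ctx->op = op;
	for (unsigned int i = 0; i < 4; i++)
		ctx->args[i] = i < nargs ? args[i] : 0;
}
```

The trade-off is that arguments and reply alias each other: writing reply overwrites args[0].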
Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Lawrence Brakmo [Fri, 26 Jan 2018 00:14:08 +0000 (16:14 -0800)]
bpf: Add write access to tcp_sock and sock fields
This patch adds a macro, SOCK_OPS_SET_FIELD, for writing to
struct tcp_sock or struct sock fields. This required adding a new
field "temp" to struct bpf_sock_ops_kern for temporary storage that
is used by sock_ops_convert_ctx_access. It is used to store and recover
the contents of a register, so the register can be used to store the
address of the sk. Since we cannot overwrite the dst_reg because it
contains the pointer to ctx, nor the src_reg since it contains the value
we want to store, we need an extra register to contain the address
of the sk.
Also adds the macro SOCK_OPS_GET_OR_SET_FIELD that calls one of the
GET or SET macros depending on the value of the TYPE field.
Lawrence Brakmo [Fri, 26 Jan 2018 00:14:07 +0000 (16:14 -0800)]
bpf: Make SOCK_OPS_GET_TCP struct independent
Changed SOCK_OPS_GET_TCP to SOCK_OPS_GET_FIELD and added 2
arguments so now it can also work with struct sock fields.
The first argument, BPF_FIELD, is the name of the field in the
bpf_sock_ops struct; the 2nd argument, OBJ_FIELD, is the name of the
field in the OBJ struct, where OBJ is either "struct tcp_sock" or
"struct sock" (without quotation).
Although the field names are currently the same, the kernel struct names
could change in the future and this change makes it easier to support
that.
Note that adding access to tcp_sock fields in sock_ops programs does
not preclude the tcp_sock fields from being removed as long as we are
willing to do one of the following:
1) Return a fixed value (e.g. 0 or 0xffffffff), or
2) Make the verifier fail if that field is accessed (i.e. program
fails to load) so the user will know that field is no longer
supported.
Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Lawrence Brakmo [Fri, 26 Jan 2018 00:14:05 +0000 (16:14 -0800)]
bpf: Only reply field should be writeable
Currently, a sock_ops BPF program can write the op field and all the
reply fields (reply and replylong). This is a bug. The op field should
not have been writeable and there is currently no way to use replylong
field for indices >= 1. This patch enforces that only the reply field
(which equals replylong[0]) is writeable.
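The rule can be sketched as an offset check. The struct is a trimmed stand-in for the program-visible context, and the function name, the 4-byte access restriction, and the access_type enum are assumptions made for this illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed stand-in for the context seen by sock_ops programs. */
struct sock_ops_ctx {
	uint32_t op;
	union {
		uint32_t reply;
		uint32_t replylong[4];
	};
	uint32_t family;
};

enum access_type { ACCESS_READ, ACCESS_WRITE };

/* Accept aligned 4-byte accesses; writes are limited to the reply field. */
static bool sock_ops_is_valid_access(size_t off, size_t size,
				     enum access_type type)
{
	assert(type == ACCESS_READ || type == ACCESS_WRITE);
	if (size != sizeof(uint32_t) || off % sizeof(uint32_t) != 0)
		return false;
	if (off >= sizeof(struct sock_ops_ctx))
		return false;
	if (type == ACCESS_WRITE)
		return off == offsetof(struct sock_ops_ctx, reply);
	return true;
}
```

Because reply and replylong[0] share an offset, allowing writes at that single offset covers both names while rejecting op and replylong[1] through replylong[3].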
Fixes: 40304b2a1567 ("bpf: BPF support for sock_ops") Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Daniel Borkmann [Wed, 24 Jan 2018 09:46:59 +0000 (10:46 +0100)]
Merge branch 'bpf-samples-sockmap-improvements'
John Fastabend says:
====================
The sockmap sample is pretty simple at the moment. All it does is open
a few sockets, attach BPF programs/sockmaps, and send a few packets.
However, for testing and debugging I wanted to have more control over
the sendmsg format and data than provided by tools like iperf3/netperf,
etc. The reason is that for testing BPF programs and the stream parser
it is helpful to be able to submit multiple sendmsg calls with different
msg layouts. For example, lots of 1B iovs or a single large MB of data, etc.
Additionally, my current test setup requires an entire orchestration
layer (cilium) to run, as well as lighttpd and http traffic generators
or, for kafka testing, brokers and clients. This makes it a bit more
difficult when doing performance optimizations to incrementally test
small changes and come up with performance deltas and perf numbers.
By adding a few more options and an additional few tests the sockmap
sample program can show a more complete example and do some of the
above. Because the sample program is self contained it doesn't require
additional infrastructure to run either.
This series, although still fairly crude, does provide some nice
additions. They are
- a new sendmsg test with sender and recv threads
- a new base test so we can get metrics/data without BPF
- multiple GBps of throughput on base and sendmsg tests
- automatically set rlimit and common variables
That said the UI is still primitive, more features could be added,
more tests might be useful, the reporting is bare bones, etc. But,
IMO let's push this now rather than sit on it for weeks until I get
time to do the above improvements. Additional patches can address
the other limitations/issues. Another thing I am considering is
moving this into selftests, after a few more fixes so we avoid
false failures, so that we get more sockmap testing.
v2: removed bogus file added by patch 3/7
v3: 1/7 replace goto out with returns, remove sighandler update,
2/7 free iov in error cases
3/7 fix bogus makefile change, bail out early on errors
v4: add Martin's "nits" and ACKs along with fixes to 2/7 iov free
also pointed out by Martin.
Thanks Daniel and Martin for the reviews!
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
John Fastabend [Mon, 22 Jan 2018 18:37:11 +0000 (10:37 -0800)]
bpf: sockmap set rlimit
Avoid extra step of setting limit from cmdline and do it directly in
the program.
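A minimal sketch of setting a resource limit from inside the program instead of relying on the shell. It assumes the limit in question is RLIMIT_MEMLOCK, which BPF maps and programs are commonly charged against; the commit message does not name the limit, and raise_memlock_limit is a hypothetical helper. The sketch only raises the soft limit to the existing hard limit, which needs no extra privileges:

```c
#include <stdio.h>
#include <sys/resource.h>

/*
 * Raise the soft RLIMIT_MEMLOCK to the current hard limit.
 * Returns 0 on success, -1 on failure.
 */
static int raise_memlock_limit(void)
{
	struct rlimit r;

	if (getrlimit(RLIMIT_MEMLOCK, &r)) {
		perror("getrlimit(RLIMIT_MEMLOCK)");
		return -1;
	}
	r.rlim_cur = r.rlim_max; /* soft limit may always go up to the hard limit */
	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
		perror("setrlimit(RLIMIT_MEMLOCK)");
		return -1;
	}
	return 0;
}
```

A sample running as root could instead request a larger hard limit in the same setrlimit() call.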
Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
John Fastabend [Mon, 22 Jan 2018 18:36:53 +0000 (10:36 -0800)]
bpf: sockmap put client sockets in blocking mode
Put client sockets in blocking mode, otherwise with sendmsg tests
it's easy to overrun the socket buffers, which results in the test
being aborted.
The original non-blocking mode was added to handle listen/accept with
a single thread; the client/accepted sockets do not need to be
non-blocking.
Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
John Fastabend [Mon, 22 Jan 2018 18:36:36 +0000 (10:36 -0800)]
bpf: sockmap sample add base test without any BPF for comparison
Add a base test that does not use BPF hooks to test baseline case.
Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
John Fastabend [Mon, 22 Jan 2018 18:36:19 +0000 (10:36 -0800)]
bpf: sockmap sample, report bytes/sec
Report bytes/sec sent as well as total bytes. Useful to get rough
idea how different configurations and usage patterns perform with
sockmap.
Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
John Fastabend [Mon, 22 Jan 2018 18:36:02 +0000 (10:36 -0800)]
bpf: sockmap sample, use fork() for send and recv
Currently for SENDMSG tests first send completes then recv runs. This
does not work well for large data sizes and/or many iterations. So
fork the recv and send handler so that we run both send and recv. In
the future we can add a parameter to do more than a single fork of
tx/rx.
With this we can get many GBps of data which helps exercise the
sockmap code.
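The idea of running send and recv concurrently can be sketched as below. This is a simplified illustration, not the sample's code: it uses a socketpair() with plain read()/write() instead of sockmap sockets and sendmsg(), forks only the receiver, and all function names are hypothetical:

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Receiver: drain the socket until EOF; returns bytes read or -1. */
static long rx_loop(int fd)
{
	char buf[4096];
	long total = 0;
	ssize_t n;

	while ((n = read(fd, buf, sizeof(buf))) > 0)
		total += n;
	return n < 0 ? -1 : total;
}

/* Sender: write `iters` buffers of `len` bytes; returns 0 or -1. */
static int tx_loop(int fd, size_t len, int iters)
{
	char buf[4096];

	if (len > sizeof(buf))
		return -1;
	memset(buf, 'x', len);
	for (int i = 0; i < iters; i++) {
		size_t off = 0;

		while (off < len) {
			ssize_t n = write(fd, buf + off, len - off);

			if (n < 0)
				return -1;
			off += (size_t)n;
		}
	}
	return 0;
}

/* Fork the receiver so rx drains the socket while tx is still sending. */
static int run_send_recv(size_t len, int iters)
{
	int sv[2], status, err;
	pid_t rx;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv))
		return -1;
	rx = fork();
	if (rx < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	if (rx == 0) { /* child: receive everything, report via exit code */
		close(sv[0]);
		_exit(rx_loop(sv[1]) == (long)(len * (size_t)iters) ? 0 : 1);
	}
	close(sv[1]);
	err = tx_loop(sv[0], len, iters);
	close(sv[0]); /* EOF lets the receiver finish */
	if (waitpid(rx, &status, 0) < 0 || err)
		return -1;
	return WIFEXITED(status) && WEXITSTATUS(status) == 0 ? 0 : -1;
}
```

With a sequential send-then-recv, the sender would block once the socket buffers fill up; with the receiver in its own process the total amount of data is no longer bounded by buffer size.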
Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
John Fastabend [Mon, 22 Jan 2018 18:35:45 +0000 (10:35 -0800)]
bpf: add sendmsg option for testing BPF programs
When testing BPF programs using sockmap I often want to have more
control over how sendmsg is exercised. This becomes even more useful
as new sockmap program types are added.
This adds a test type option to select type of test to run. Currently,
only "ping" and "sendmsg" are supported, but more can be added as
needed.
John Fastabend [Mon, 22 Jan 2018 18:35:27 +0000 (10:35 -0800)]
bpf: refactor sockmap sample program update for arg parsing
The sockmap sample program takes arguments from the cmd line but it reads
them in using offsets into the array. Because we want to add more arguments
in the future, let's do proper argument handling.
Also refactor code to pull apart sock init and ping/pong test. This
allows us to add new tests in the future.
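A sketch of what "proper argument handling" can look like with getopt_long(), as opposed to reading argv[] at fixed offsets. The option names and struct test_opts are hypothetical and are not taken from the sample:

```c
#include <getopt.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical option set for illustration. */
struct test_opts {
	int rate;
	int verbose;
	const char *cgroup;
};

/* Parse options in any order; returns 0 on success, -1 on a bad option. */
static int parse_args(int argc, char **argv, struct test_opts *o)
{
	static const struct option long_opts[] = {
		{ "rate",    required_argument, NULL, 'r' },
		{ "cgroup",  required_argument, NULL, 'c' },
		{ "verbose", no_argument,       NULL, 'v' },
		{ 0, 0, 0, 0 },
	};
	int opt;

	o->rate = 1;
	o->verbose = 0;
	o->cgroup = NULL;
	optind = 1; /* restart scanning in case getopt was used before */

	while ((opt = getopt_long(argc, argv, "r:c:v", long_opts, NULL)) != -1) {
		switch (opt) {
		case 'r':
			o->rate = atoi(optarg);
			break;
		case 'c':
			o->cgroup = optarg;
			break;
		case 'v':
			o->verbose = 1;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

Adding a new argument then means adding one table entry and one case, without shifting the position of any existing argument.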
Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
selftests/bpf: make 'dubious pointer arithmetic' test useful
Mostly revert the previous workaround and make the
'dubious pointer arithmetic' test useful again.
Use (ptr - ptr) << const instead of ptr << const to generate large scalar.
The rest stays as before commit 2b36047e7889.
The test was incorrectly doing
mkdir /mnt/cgroup-test-work-dirtest-bpf-based-device-cgroup
instead of
mkdir /mnt/cgroup-test-work-dir/test-bpf-based-device-cgroup
Somehow such a mkdir succeeds and a new directory appears:
/mnt/cgroup-test-work-dir/cgroup-test-work-dirtest-bpf-based-device-cgroup
Later cleanup via nftw("/mnt/cgroup-test-work-dir", ...);
doesn't walk this directory.
"rmdir /mnt/cgroup-test-work-dir" succeeds, but the bpf program and
the dangling cgroup stay in memory.
That's a separate issue on the cgroup side.
For now fix the test.
Fixes: 37f1ba0909df ("selftests/bpf: add a test for device cgroup controller") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
test_hashmap_walk takes a very long time on a debug kernel with KASAN on.
Reduce the number of iterations in this test without sacrificing
test coverage.
Also add printfs as a progress indicator.
Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Yonghong Song [Tue, 23 Jan 2018 06:10:59 +0000 (22:10 -0800)]
tools/bpf: fix a test failure in selftests prog test_verifier
Commit 111e6b45315c ("selftests/bpf: make test_verifier run most programs")
enables tools/testing/selftests/bpf/test_verifier unit cases to run
via bpf_prog_test_run command. With the latest code base,
test_verifier had one test case failure:
The test case does not set return value in the test
structure and hence the return value from the prog run
is assumed to be 0. However, the actual return value is 1.
As a result, the test failed. The fix is to correctly set
the return value in the test structure.
Fixes: 111e6b45315c ("selftests/bpf: make test_verifier run most programs") Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Yonghong Song [Tue, 23 Jan 2018 06:53:51 +0000 (22:53 -0800)]
bpf: fix incorrect kmalloc usage in lpm_trie MAP_GET_NEXT_KEY rcu region
In commit b471f2f1de8b ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map"),
the implemented MAP_GET_NEXT_KEY callback function is guarded with rcu read lock.
In the function body, "kmalloc(size, GFP_USER | __GFP_NOWARN)" is used which may
sleep and violate rcu read lock region requirements. This patch fixed the issue
by using GFP_ATOMIC instead to avoid blocking kmalloc. Tested with
CONFIG_DEBUG_ATOMIC_SLEEP=y as suggested by Eric Dumazet.
Fixes: b471f2f1de8b ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE map") Signed-off-by: Yonghong Song <yhs@fb.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Wei Yongjun [Tue, 23 Jan 2018 02:10:38 +0000 (02:10 +0000)]
net: aquantia: make symbol hw_atl_boards static
Fixes the following sparse warning:
drivers/net/ethernet/aquantia/atlantic/aq_pci_func.c:50:34: warning:
symbol 'hw_atl_boards' was not declared. Should it be static?
Fixes: 4948293ff963 ("net: aquantia: Introduce new AQC devices and capabilities") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wei Yongjun [Tue, 23 Jan 2018 02:10:27 +0000 (02:10 +0000)]
nfp: fix error return code in nfp_pci_probe()
Fix to return error code -EINVAL instead of 0 when num_vfs is above
limit_vfs, as done elsewhere in this function.
Fixes: 0dc786219186 ("nfp: handle SR-IOV already enabled when driver is probing") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Carl Heymann [Tue, 23 Jan 2018 01:29:43 +0000 (17:29 -0800)]
nfp: fix fw dump handling of absolute rtsym size
Fix bug that causes _absolute_ rtsym sizes of > 8 bytes (as per symbol
table) to result in incorrect space used during a TLV-based debug dump.
Detail: The size calculation stage calculates the correct size (size of
the rtsym address field == 8), while the dump uses the size in the table
to calculate the TLV size to reserve. Symbols with size <= 8 are handled
OK due to aligning sizes to 8, but including any absolute symbol with
listed size > 8 leads to an ENOSPC error during the dump.
Fixes: da762863edd9 ("nfp: fix absolute rtsym handling in debug dump") Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Mon, 22 Jan 2018 21:49:27 +0000 (13:49 -0800)]
tun: avoid calling xdp_rxq_info_unreg() twice
Similarly to tx ring, xdp_rxq_info is only registered
when !tfile->detached, so we need to avoid calling
xdp_rxq_info_unreg() twice too. The helper tun_cleanup_tx_ring()
already checks for this properly, so it is correct to put
xdp_rxq_info_unreg() just inside there.
Reported-by: syzbot+1c788d7ce0f0888f1d7f@syzkaller.appspotmail.com Fixes: 8565d26bcb2f ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net") Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
net: sched: add extack support for cls offloads
I've dropped the tests from the series because test_offloads.py changes
will conflict with bpf-next patches. I will send four more patches with
tests once bpf-next is merged back, hopefully still making it into 4.16 :)
v4:
- rebase on top of Alex's changes.
---
Quentin says:
This series tries to improve user experience when eBPF hardware offload
hits error paths at load time. In particular, it introduces netlink
extended ack support in the nfp driver.
To that aim, transmission of the pointer to the extack object is piped
through the `change()` operation of the existing classifiers (patch 1 to
6). Then it is used for TC offload in the nfp driver (patch 8) and in
netdevsim (patch 9, selftest in patch 10). Patch 7 adds a helper to handle
extack messages in the core when TC offload is disabled on the net device.
For completeness extack is propagated for classifiers other than cls_bpf,
but it's up to the drivers to make use of it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Monnet [Sat, 20 Jan 2018 01:44:50 +0000 (17:44 -0800)]
nfp: bpf: use extack support to improve debugging
Use the recently added extack support for eBPF offload in the driver.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Monnet [Sat, 20 Jan 2018 01:44:49 +0000 (17:44 -0800)]
nfp: bpf: plumb extack into functions related to XDP offload
Pass a pointer to an extack object to nfp_app_xdp_offload() in order to
prepare for extack usage in the nfp driver. Next step will be to forward
this extack pointer to nfp_net_bpf_offload(), once this function is able
to use it for printing error messages.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Create a wrapper around tc_can_offload() that takes an additional
extack pointer argument in order to output an error message if TC
offload is disabled on the device.
In this way, the error message is handled by the core and can be the
same for all drivers.
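The wrapper pattern described above can be sketched in userspace like this. struct ext_ack and struct net_dev are minimal stand-ins for the kernel's extack and net device objects, and the message text is an assumption of this sketch:

```c
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel objects involved. */
struct ext_ack {
	const char *msg; /* error string reported back to user space */
};

struct net_dev {
	bool hw_tc_offload; /* whether TC offload is enabled on the device */
};

static bool tc_can_offload(const struct net_dev *dev)
{
	return dev->hw_tc_offload;
}

/* Same check as above, but the core fills in the error message once. */
static bool tc_can_offload_extack(const struct net_dev *dev,
				  struct ext_ack *extack)
{
	bool can = tc_can_offload(dev);

	if (!can && extack != NULL)
		extack->msg = "TC offload is disabled on net device";
	return can;
}
```

Drivers that call the wrapper get an identical message for free, and callers without an extack object can still pass NULL.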
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Monnet [Sat, 20 Jan 2018 01:44:47 +0000 (17:44 -0800)]
net: sched: add extack support for offload via tc_cls_common_offload
Add extack support for hardware offload of classifiers. In order
to achieve this, a pointer to a struct netlink_ext_ack is added to the
struct tc_cls_common_offload that is passed to the callback for setting
up the classifier. Function tc_cls_common_offload_init() is updated to
support initialization of this new attribute.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Monnet [Sat, 20 Jan 2018 01:44:46 +0000 (17:44 -0800)]
net: sched: cls_bpf: plumb extack support in filter for hardware offload
Pass the extack pointer obtained in the `->change()` filter operation to
cls_bpf_offload() and then to cls_bpf_offload_cmd(). This makes it
possible to use this extack pointer in drivers offloading BPF programs
in a future patch.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Monnet [Sat, 20 Jan 2018 01:44:45 +0000 (17:44 -0800)]
net: sched: cls_u32: propagate extack support for filter offload
Propagate the extack pointer from the `->change()` classifier operation
to the function used for filter replacement in cls_u32. This makes it
possible to use netlink extack messages in the future at replacement
time for this filter, although it is not used at this point.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Monnet [Sat, 20 Jan 2018 01:44:44 +0000 (17:44 -0800)]
net: sched: cls_matchall: propagate extack support for filter offload
Propagate the extack pointer from the `->change()` classifier operation
to the function used for filter replacement in cls_matchall. This makes
it possible to use netlink extack messages in the future at replacement
time for this filter, although it is not used at this point.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Quentin Monnet [Sat, 20 Jan 2018 01:44:43 +0000 (17:44 -0800)]
net: sched: cls_flower: propagate extack support for filter offload
Propagate the extack pointer from the `->change()` classifier operation
to the function used for filter replacement in cls_flower. This makes it
possible to use netlink extack messages in the future at replacement
time for this filter, although it is not used at this point.
Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Haiyang Zhang [Fri, 19 Jan 2018 20:26:43 +0000 (13:26 -0700)]
hv_netvsc: Use the num_online_cpus() for channel limit
Since we no longer localize channel/CPU affiliation within one NUMA
node, num_online_cpus() is used as the cap on the number of channels,
instead of the number of processors in a NUMA node.
This patch allows a bigger range for tuning the number of channels.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Salil Mehta [Fri, 19 Jan 2018 15:20:53 +0000 (15:20 +0000)]
net: hns3: converting spaces into tabs to avoid checkpatch.pl warning
Spaces were mistakenly used instead of tabs in some of the code related
to reset functionality, which caused checkpatch.pl errors. These were
missed earlier so fixing them now.
Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arjun Vynipadath [Fri, 19 Jan 2018 09:41:48 +0000 (15:11 +0530)]
cxgb3: assign port id to net_device->dev_port
T3 devices have different ports on the same PCI function,
so use dev_port to identify ports.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
bridge: return boolean instead of integer in br_multicast_is_router
Return statements in functions returning bool should use
true/false instead of 1/0.
This issue was detected with the help of Coccinelle.
Fixes: 85b352693264 ("bridge: Fix build error when IGMP_SNOOPING is not enabled") Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Thu, 18 Jan 2018 23:12:21 +0000 (15:12 -0800)]
net: stmmac: Fix reception of Broadcom switches tags
Broadcom tags inserted by Broadcom switches put a 4 byte header after
the MAC SA and before the EtherType, which may look like some sort of 0
length LLC/SNAP packet (tcpdump and wireshark do think that way). With
ACS enabled in stmmac the packets were truncated to 8 bytes on
reception, whereas clearing this bit allowed normal reception to occur.
In order to make that possible, we need to pass a net_device argument to
the different core_init() functions and we are dependent on the Broadcom
tagger padding packets correctly (which it now does). To be as little
invasive as possible, this is only done for gmac1000 when the network
device is DSA-enabled (netdev_uses_dsa() returns true).
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 22 Jan 2018 21:05:50 +0000 (16:05 -0500)]
Merge branch 'hns3-new-features'
Peng Li says:
====================
add some features to hns3 driver
This patchset adds some features to the hns3 driver, including support
for ethtool commands -d and -p and support for the manager table.
[Patch 1/4] adds support for ethtool command -d; its op is get_regs.
The driver sends a command to the command queue, and gets the regs number
and regs values from the command queue.
[Patch 2/4] adds manager table initialization for hardware.
[Patch 3/4] adds support for ethtool command -p. For fiber ports, the driver
sends a command to the command queue, and IMP will write SGPIO regs to
control LEDs.
[Patch 4/4] adds support for the net status LED for fiber ports. Net status
includes port speed, total rx/tx packets and link status. The driver sends
the status to the command queue, and IMP will write SGPIO to control LEDs.
---
Change log:
V1 -> V2:
1, fix comments from Andrew Lunn, remove the patch "net: hns3: add
ethtool -p support for phy device".
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fuyun Liang [Fri, 19 Jan 2018 06:41:10 +0000 (14:41 +0800)]
net: hns3: add manager table initialization for hardware
The manager table is empty by default. If it is not initialized,
management packets like LLDP will be dropped by hardware. Default entries
need to be added to the manager table.
Signed-off-by: Fuyun Liang <liangfuyun1@huawei.com> Signed-off-by: Peng Li <lipeng321@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Decotigny [Thu, 18 Jan 2018 17:59:13 +0000 (09:59 -0800)]
net: core: Expose number of link up/down transitions
Expose the number of times the link has gone UP or DOWN, and
update the "carrier_changes" counter to be the sum of these two events.
While at it, also update the sysfs-class-net documentation to cover:
carrier_changes (3.15), carrier_up_count (4.16) and carrier_down_count
(4.16)
Signed-off-by: David Decotigny <decot@googlers.com>
[Florian:
* rebase
* add documentation
* merge carrier_changes with up/down counters] Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
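The relationship between the three counters described in this commit can be sketched in userspace as follows. This is a minimal model only: struct link_stats and both helpers are hypothetical names for illustration, not the kernel's net_device fields or API.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the counters; not struct net_device. */
struct link_stats {
    unsigned int carrier_up_count;
    unsigned int carrier_down_count;
    bool carrier_ok;
};

/* Record a link transition; repeated reports of the same state are ignored. */
static void link_set_carrier(struct link_stats *s, bool up)
{
    if (s->carrier_ok == up)
        return;
    s->carrier_ok = up;
    if (up)
        s->carrier_up_count++;
    else
        s->carrier_down_count++;
}

/* carrier_changes is derived as the sum of the two directional counters. */
static unsigned int link_carrier_changes(const struct link_stats *s)
{
    return s->carrier_up_count + s->carrier_down_count;
}
```

Deriving the total from the two directional counters means the three values cannot drift apart.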
Sabrina Dubroca [Thu, 18 Jan 2018 16:48:18 +0000 (17:48 +0100)]
macsec: restore uAPI after addition of GCM-AES-256
Commit ccfdec908922 ("macsec: Add support for GCM-AES-256 cipher suite")
changed a few values in the uapi headers for MACsec.
Because of existing userspace implementations, we need to preserve the
value of MACSEC_DEFAULT_CIPHER_ID. Not doing that resulted in
wpa_supplicant segfaults when a secure channel was created using the
default cipher. Thus, swap MACSEC_DEFAULT_CIPHER_{ID,ALT} back to their
original values.
Changing the maximum length of the MACSEC_SA_ATTR_KEY attribute is
unnecessary, as the previous value (MACSEC_MAX_KEY_LEN, which was 128B)
is large enough to carry 32-byte keys. This patch reverts
MACSEC_MAX_KEY_LEN to 128B and restores the old length check on
MACSEC_SA_ATTR_KEY.
Fixes: ccfdec908922 ("macsec: Add support for GCM-AES-256 cipher suite") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan [Thu, 18 Jan 2018 02:37:34 +0000 (10:37 +0800)]
net: hns: Fix for variable may be used uninitialized warnings
When !CONFIG_REGMAP hns throws compiler warnings since
dsaf_read_syscon ignores the return result from regmap_read,
which allows val to be uninitialized.
Fixes: 86897c960b49 ("net: hns: add syscon operation for dsaf") Reported-by: Jason Gunthorpe <jgg@ziepe.ca> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
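The class of warning described in this commit can be reproduced in a small userspace sketch. fake_regmap_read() is a hypothetical stand-in for a read helper that can fail without writing its output, and read_syscon_checked() shows one possible way to keep the result defined; this log does not say which fix the driver actually took.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for a register read that fails without touching
 * *val, as a stubbed-out read helper might when its subsystem is disabled.
 */
static int fake_regmap_read(int available, unsigned int reg, unsigned int *val)
{
    if (!available)
        return -EINVAL;
    *val = reg * 2u; /* arbitrary placeholder register contents */
    return 0;
}

/*
 * Ignoring the return value above would let the caller use an
 * uninitialized val. Here val has a defined fallback and the error
 * is checked before the value is used.
 */
static unsigned int read_syscon_checked(int available, unsigned int reg)
{
    unsigned int val = 0;

    if (fake_regmap_read(available, reg, &val))
        return 0;
    return val;
}
```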
This change converts existing per-cpu stats structure into per-queue one.
This should not impact performance since each queue counter is not
updated concurrently by multiple cpus.
Performance numbers:
- Guest has 2 vcpus and 2 queues
- Guest runs netserver
- Host runs 100-flow super_netperf
====================
Armada 7k/8k PP2 ACPI support
I am quickly resending the series thanks to Antoine Tenart's remark;
he spotted a !CONFIG_ACPI compilation issue after the introduction of
the new fwnode_irq_get() routine. Please see the details in the changelog
below and the 3/7 commit log.
mvpp2 driver can work with the ACPI representation, as exposed
on a public branch:
https://github.com/MarvellEmbeddedProcessors/edk2-open-platform/commits/marvell-armada-wip
It was compiled together with the most recent Tianocore EDK2 revision.
Please refer to the firmware build instruction on MacchiatoBin board:
http://wiki.macchiatobin.net/tiki-index.php?page=Build+from+source+-+UEFI+EDK+II
ACPI representation of PP2 controllers (without PHY support) can
be viewed on GitHub:
* MacchiatoBin:
https://github.com/MarvellEmbeddedProcessors/edk2-open-platform/blob/71ae395da1661374b0f07d1602afb1eee56e9794/Platforms/Marvell/Armada/AcpiTables/Armada80x0McBin/Dsdt.asl#L201
* Armada 7040 DB:
https://github.com/MarvellEmbeddedProcessors/edk2-open-platform/blob/71ae395da1661374b0f07d1602afb1eee56e9794/Platforms/Marvell/Armada/AcpiTables/Armada70x0/Dsdt.asl#L131
I will appreciate any comments or remarks.
Best regards,
Marcin
Changelog:
v3 -> v4:
* 3/7
- add new macro (ACPI_HANDLE_FWNODE) and fix
compilation with !CONFIG_ACPI
- extend commit log and mention usability of fwnode_irq_get
for the child nodes as well
v1 -> v2:
* Remove MDIO patches
* Use PP2 ports only with link interrupts
* Release second region resources in mvpp2 driver (code moved from
mvmdio), as explained in detail in 5/5 commit message.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcin Wojtas [Thu, 18 Jan 2018 12:31:44 +0000 (13:31 +0100)]
net: mvpp2: enable ACPI support in the driver
This patch introduces an alternative way of obtaining resources - via
ACPI tables provided by firmware. Enabling coexistence with the DT
support, in addition to the OF_*->device_*/fwnode_* API replacement,
required the following steps to be taken:
* Add mvpp2_acpi_match table
* Omit clock configuration and obtain tclk from the property - in ACPI
world, the firmware is responsible for clock maintenance.
* Disable comphy and syscon handling as they are not available for ACPI.
* Modify the way of obtaining interrupts - use the newly introduced
fwnode_irq_get() routine
* Until proper MDIO bus and PHY handling with ACPI is established in the
kernel, use only link interrupts feature in the driver. For the RGMII
port it results in depending on GMAC settings done during firmware
stage.
* When booting with ACPI, MVPP2_QDIST_MULTI_MODE is picked by
default, as there is no need to keep any kind of backward
compatibility.
Moreover, a memory region used by mvmdio driver is usually placed in
the middle of the address space of the PP2 network controller.
The MDIO base address is obtained without requesting memory region
(by devm_ioremap() call) in mvmdio.c, later overlapping resources are
requested by the network driver, which is responsible for avoiding
a concurrent access.
In case the MDIO memory region is declared in the ACPI, it can
already appear as 'in-use' in the OS. Because it is overlapped by second
region of the network controller, make sure it is released, before
requesting it again. The care is taken by mvpp2 driver to avoid
concurrent access to this memory region.
Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Marcin Wojtas [Thu, 18 Jan 2018 12:31:43 +0000 (13:31 +0100)]
net: mvpp2: use device_*/fwnode_* APIs instead of of_*
OF functions can be used only by drivers that use DT.
As a preparation for introducing ACPI support in the mvpp2
driver, use struct fwnode_handle in order to obtain
properties from the hardware description.
This patch replaces of_* functions with device_*/fwnode_*
where possible in mvpp2.
Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Marcin Wojtas [Thu, 18 Jan 2018 12:31:42 +0000 (13:31 +0100)]
net: mvpp2: simplify maintaining enabled ports' list
The 'port_count' field of the mvpp2 structure holds the overall number
of available ports, based on DT node status. In order to be prepared
to support other HW descriptions, obtain the value by incrementing it
upon each successful port initialization. This allowed for simplifying
port indexing in the controller's private array, whose size is no longer
dynamically allocated, but fixed to MVPP2_MAX_PORTS.
This patch simplifies creating and filling list of enabled ports and
is a part of the preparation for adding ACPI support in the mvpp2 driver.
Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
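The counting scheme described in this commit can be sketched as below. This is an illustration under assumptions: toy_priv, toy_port, toy_port_probe() and TOY_MAX_PORTS are hypothetical names, and the init_ok flag stands in for whatever the real port initialization does.

```c
#include <stddef.h>

#define TOY_MAX_PORTS 4 /* hypothetical fixed array size */

struct toy_port {
    int id;
};

struct toy_priv {
    struct toy_port *port_list[TOY_MAX_PORTS]; /* fixed size, no allocation */
    int port_count;                            /* grows only on success */
};

/* Add a port to the list only if its initialization succeeded. */
static int toy_port_probe(struct toy_priv *priv, struct toy_port *port,
                          int init_ok)
{
    if (priv->port_count >= TOY_MAX_PORTS || !init_ok)
        return -1;
    priv->port_list[priv->port_count++] = port;
    return 0;
}
```

Because the index comes from the running count, enabled ports are stored contiguously regardless of how the hardware description numbers them.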
Marcin Wojtas [Thu, 18 Jan 2018 12:31:41 +0000 (13:31 +0100)]
device property: Allow iterating over available child fwnodes
Implement a new helper function, fwnode_get_next_available_child_node(),
which returns the next enabled child fwnode and
works on a similar basis to OF's of_get_next_available_child().
This commit also introduces a macro that makes it
possible to iterate over the available fwnodes using the
new function described above.
Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
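The iteration pattern this commit describes (a "next available child" helper plus a for-each macro built on it) can be sketched on a toy tree. Everything here is hypothetical: struct toy_node is not struct fwnode_handle, and the sketch omits any reference counting a real firmware-node iterator may need.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy node type used only for this illustration. */
struct toy_node {
    const char *name;
    bool available;
    struct toy_node *first_child;
    struct toy_node *next_sibling;
};

/* Return the next available child after 'child', or the first if NULL. */
static struct toy_node *toy_next_available_child(const struct toy_node *parent,
                                                 struct toy_node *child)
{
    struct toy_node *next = child ? child->next_sibling : parent->first_child;

    while (next && !next->available)
        next = next->next_sibling;
    return next;
}

/* Iterate over available children only, skipping disabled ones. */
#define toy_for_each_available_child(parent, child)               \
    for ((child) = toy_next_available_child((parent), NULL);      \
         (child);                                                 \
         (child) = toy_next_available_child((parent), (child)))

/* Example user of the macro: count the enabled children of a node. */
static int toy_count_available(const struct toy_node *parent)
{
    struct toy_node *child;
    int n = 0;

    toy_for_each_available_child(parent, child)
        n++;
    return n;
}
```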
Marcin Wojtas [Thu, 18 Jan 2018 12:31:40 +0000 (13:31 +0100)]
device property: Introduce fwnode_irq_get()
Until now there were two very similar functions for getting
the Linux IRQ number from an ACPI handle (acpi_irq_get())
and from an OF node (of_irq_get()). The first one appeared to be used
only as a subroutine of platform_irq_get(), which (in the generic
code) limited obtaining IRQs from the _CRS method to nodes
associated with the kernel's struct platform_device.
This patch introduces a new helper routine, fwnode_irq_get(),
which gets the IRQ number directly from the fwnode and
is common to the OF and ACPI worlds. It is usable not
only for parent fwnodes, but also for child nodes
that have their own _CRS methods with interrupt descriptions.
In order to satisfy compilation with !CONFIG_ACPI
and also simplify the new code, introduce a helper macro
(ACPI_HANDLE_FWNODE), with which it is possible to reach
an ACPI handle directly from its fwnode.
Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Marcin Wojtas [Thu, 18 Jan 2018 12:31:39 +0000 (13:31 +0100)]
device property: Introduce fwnode_get_phy_mode()
Until now there were two almost identical functions for
obtaining network PHY mode - of_get_phy_mode() and,
more generic, device_get_phy_mode(). However, it is not uncommon
for the network interface to be represented as a child
of the actual controller, in which case it is not associated
directly with any struct device, as required by the latter
routine.
This commit allows for getting the PHY mode for
child nodes in the ACPI world by introducing a new function -
fwnode_get_phy_mode(). This commit also changes
device_get_phy_mode() routine to be its wrapper, in order
to prevent unnecessary duplication.
Signed-off-by: Marcin Wojtas <mw@semihalf.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Until now there were two almost identical functions for
obtaining MAC address - of_get_mac_address() and, more generic,
device_get_mac_address(). However, it is not uncommon
for the network interface to be represented as a child
of the actual controller, in which case it is not associated
directly with any struct device, as required by the latter
routine.
This commit allows for getting the MAC address for
child nodes in the ACPI world by introducing a new function -
fwnode_get_mac_address(). This commit also changes
device_get_mac_address() routine to be its wrapper, in order
to prevent unnecessary duplication.
Signed-off-by: Marcin Wojtas <mw@semihalf.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
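The wrapper arrangement used by the PHY mode and MAC address commits above (a node-level lookup, with the device-level routine reduced to a thin wrapper) can be sketched like this. The toy_* types, the mode table and the -1 error value are all assumptions for illustration, not the kernel's definitions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical firmware node carrying a "phy-mode"-like string property. */
struct toy_fwnode {
    const char *phy_mode;
};

/* Hypothetical device that may or may not have a node attached. */
struct toy_device {
    const struct toy_fwnode *fwnode;
};

static const char *const toy_phy_modes[] = { "mii", "rgmii", "sgmii" };

/* Node-level lookup: works for child nodes that have no struct device. */
static int toy_fwnode_get_phy_mode(const struct toy_fwnode *fwnode)
{
    size_t i;

    if (!fwnode || !fwnode->phy_mode)
        return -1;
    for (i = 0; i < sizeof(toy_phy_modes) / sizeof(toy_phy_modes[0]); i++)
        if (strcmp(fwnode->phy_mode, toy_phy_modes[i]) == 0)
            return (int)i;
    return -1;
}

/* Device-level routine becomes a wrapper, avoiding duplicated parsing. */
static int toy_device_get_phy_mode(const struct toy_device *dev)
{
    return toy_fwnode_get_phy_mode(dev ? dev->fwnode : NULL);
}
```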
Ganesh Goudar [Mon, 22 Jan 2018 13:18:26 +0000 (18:48 +0530)]
cxgb4: add geneve offload support for T6
Add geneve segmentation offload support for T6 cards.
Original work by: Santosh Rastapur <santosh@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 22 Jan 2018 14:36:37 +0000 (09:36 -0500)]
Merge tag 'mac80211-next-for-davem-2018-01-22' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
Less than a handful of changes:
* possible memory leak fix in hwsim
* speed up hwsim
* add hwsim userspace rate control API
* code cleanups
====================
A conflict was resolved in mac80211_hwsim.c, mostly of
the simple overlapping changes category. One adding
a rhashtable and another adding a workqueue.
Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Mon, 22 Jan 2018 10:31:19 +0000 (10:31 +0000)]
devlink: fix memory leak on 'resource'
Currently, if the call to devlink_resource_find returns null then
the error exit path does not free the devlink_resource 'resource'
and a memory leak occurs. Fix this by kfree'ing resource on the
error exit path.
Detected by CoverityScan, CID#1464184 ("Resource leak")
Fixes: d9f9b9a4d05f ("devlink: Add support for resource abstraction") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
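The shape of this fix (free the freshly allocated object on the error exit path) can be sketched in userspace. The names are hypothetical, parent_found stands in for the result of the lookup mentioned above, and toy_live_allocs exists only so the example can show that nothing is leaked.

```c
#include <errno.h>
#include <stdlib.h>

struct toy_resource {
    unsigned long id;
};

/* Demo-only counter of outstanding allocations. */
static int toy_live_allocs;

/* Allocate a resource, then fail if the (hypothetical) parent lookup failed. */
static int toy_resource_register(int parent_found, struct toy_resource **out)
{
    struct toy_resource *resource = calloc(1, sizeof(*resource));

    if (!resource)
        return -ENOMEM;
    toy_live_allocs++;

    if (!parent_found) {
        /* Error exit path: release the object before returning. */
        free(resource);
        toy_live_allocs--;
        return -EINVAL;
    }

    *out = resource;
    return 0;
}
```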
====================
mlxsw: spectrum_router: Optimize LPM trees
Ido says:
This set tries to optimize the structure of the LPM trees used for route
lookup by avoiding lookups that are guaranteed not to return a result.
This is done by making sure only used prefix lengths are present in the
tree.
First two patches are small preparatory steps towards the actual change
in the last patch.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Mon, 22 Jan 2018 08:17:42 +0000 (09:17 +0100)]
mlxsw: spectrum_router: Remove unnecessary prefix lengths from LPM tree
In commit fc922bb0dd94 ("mlxsw: spectrum_router: Use one LPM tree for
all virtual routers") I tried to make sure only used prefix lengths are
present in the LPM tree shared between all virtual routers.
However, this optimization had to be removed in commit a69518cf0b4c
("mlxsw: spectrum_router: Avoid expensive lookup during route removal"),
since determining the used prefix lengths required us to traverse all
the active virtual routers, which could result in a hung task depending
on the number of VRFs and whether routes were removed due to abort or
not.
Re-introduce the optimization by moving the prefix usage accounting from
the virtual routers to the LPM tree, as this accounting is only used in
order to determine the tree's structure.
To make the sharing of the trees more explicit, the two trees (for IPv4
and IPv6) are stored in the shared router struct and upon the creation
of a virtual router it is immediately bound to both.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
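Keeping the prefix usage accounting in the tree itself, as described above, can be sketched with a per-prefix-length reference count. This is a simplified model under assumptions: toy_lpm_tree and its helpers are hypothetical, and only IPv4 prefix lengths are modelled.

```c
#include <stdbool.h>

#define TOY_PREFIX_LEN_COUNT 33 /* IPv4 prefix lengths /0 through /32 */

/* Hypothetical model: the shared tree owns the prefix-length usage counts. */
struct toy_lpm_tree {
    unsigned int prefix_ref_count[TOY_PREFIX_LEN_COUNT];
};

/* Take a reference; returns true if the prefix length just became used. */
static bool toy_lpm_tree_prefix_get(struct toy_lpm_tree *tree, unsigned int len)
{
    if (len >= TOY_PREFIX_LEN_COUNT)
        return false;
    return tree->prefix_ref_count[len]++ == 0;
}

/* Drop a reference; returns true if the prefix length just became unused. */
static bool toy_lpm_tree_prefix_put(struct toy_lpm_tree *tree, unsigned int len)
{
    if (len >= TOY_PREFIX_LEN_COUNT || tree->prefix_ref_count[len] == 0)
        return false;
    return --tree->prefix_ref_count[len] == 0;
}

/* A prefix length belongs in the tree structure only while it is referenced. */
static bool toy_lpm_tree_prefix_used(const struct toy_lpm_tree *tree,
                                     unsigned int len)
{
    return len < TOY_PREFIX_LEN_COUNT && tree->prefix_ref_count[len] > 0;
}
```

With the counts held by the shared tree, deciding whether a prefix length is still needed is a single array lookup rather than a walk over every virtual router.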
Ido Schimmel [Mon, 22 Jan 2018 08:17:40 +0000 (09:17 +0100)]
mlxsw: spectrum_router: Use the nodes list as indication for empty FIB
Currently, each FIB (IPv4 / IPv6) in a virtual router holds a prefix
usage that is used to choose a matching LPM tree, but also to check if
the FIB is empty, so that the LPM tree could be unbound.
Next patches will remove the reliance on the per-FIB prefix usage for
LPM tree matching. Keeping it only to check if the FIB is empty is a
waste, since we can use the nodes ({Prefix, Length}) list instead.
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Mon, 22 Jan 2018 02:55:38 +0000 (10:55 +0800)]
tun: add missing rcu annotation
This patch fixes the following sparse warnings:
drivers/net/tun.c:2241:15: error: incompatible types in comparison expression (different address spaces)
Fixes: cd5681d7d890 ("tuntap: rename struct tun_steering_prog to struct tun_prog") Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
mlxsw: Add support for mirror action with flower
Arkadi says:
Add support for mirror action with flower classifier. The first 3 patches
introduce a generic per-block resource infra. The last 4 patches add
support for flow based span.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_acl: Add support for mirror action
Add support for mirror action. Only one mirror action can be set per rule.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum: Extend mlxsw_afa_ops for counter index and implement for Spectrum
Introduce extension of mlxsw_afa_ops in order to add/del mirroring and
implement the ops for Spectrum.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Extend the SPAN API for the ACL case. In the case of ACL triggering, the MPAR
register shouldn't be configured. This patch also exports those helpers for
ACL usage.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
mlxsw: spectrum_acl: Add support for mirroring action
The patch extends the trap action for mirroring.
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 19 Jan 2018 08:24:48 +0000 (09:24 +0100)]
mlxsw: core: Make counter index allocated inside the action append
So far, the caller of mlxsw_afa_block_append_counter needed to allocate
counter index by hand. Benefit from the previously introduced resource
infra and counter_index_get/put callbacks, and allocate the counter
index in place where it is needed, inside the action append function.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Pirko [Fri, 19 Jan 2018 08:24:47 +0000 (09:24 +0100)]
mlxsw: core: Convert fwd_entry_ref list to be generic per-block resource list
Since the resource list also needs to be used for entries other than
fwd_entry_ref, make the list generic. For that purpose, introduce a
resource structure with a couple of helpers that any code which needs to
store a per-block resource should use.
Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
Aquantia atlantic driver new devices support
This patchset introduces support for new Aquantia hardware:
AQC11x family with updated hardware (B1) and firmware (2.x and 3.x branches).
For that, a number of improvements in the overall driver model were done:
- Firmware specific ops tables. Firmware 2.x and 3.x series support
functions are now in separate fw2x module.
- PCI module cleanup and simplification done.
- Verified and tested hardware reset process.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Fri, 19 Jan 2018 14:03:26 +0000 (17:03 +0300)]
net: aquantia: Introduce global AQC hardware reset sequence
The detailed reset sequence ensures all HW components are in aligned
state before NIC startup. It also supports cards with signed firmware (RBL)
and checks if their FW is valid.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Fri, 19 Jan 2018 14:03:24 +0000 (17:03 +0300)]
net: aquantia: Introduce firmware ops callbacks
New AQC cards will have an updated firmware with a new binary interface.
This patch extracts firmware specific operations into a separate table
and prepares for the introduction of new fw 2.x and 3.x.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
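A firmware ops table of the kind described here can be sketched as a struct of function pointers selected by firmware major version. All names and return values below are hypothetical placeholders; the driver's real callbacks are not shown in this log.

```c
#include <stddef.h>

/* Hypothetical ops table; one instance per firmware interface generation. */
struct toy_fw_ops {
    const char *name;
    int (*set_link_speed)(unsigned int mbps);
};

static int toy_fw1x_set_link_speed(unsigned int mbps)
{
    (void)mbps;
    return 1; /* placeholder: marks that the 1.x path was taken */
}

static int toy_fw2x_set_link_speed(unsigned int mbps)
{
    (void)mbps;
    return 2; /* placeholder: marks that the 2.x/3.x path was taken */
}

static const struct toy_fw_ops toy_fw1x_ops = { "fw1x", toy_fw1x_set_link_speed };
static const struct toy_fw_ops toy_fw2x_ops = { "fw2x", toy_fw2x_set_link_speed };

/* Pick a table by firmware major version; 2.x and 3.x share one table here. */
static const struct toy_fw_ops *toy_fw_ops_select(unsigned int major)
{
    switch (major) {
    case 1:
        return &toy_fw1x_ops;
    case 2:
    case 3:
        return &toy_fw2x_ops;
    default:
        return NULL; /* unknown firmware: caller must handle this */
    }
}
```

Callers then go through the selected table and never branch on the firmware version themselves.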
Igor Russkikh [Fri, 19 Jan 2018 14:03:21 +0000 (17:03 +0300)]
net: aquantia: Cleanup pci functions module
The driver contained dead code for maintaining multiple pci port instances.
That code will never be used, since a separate NIC instance is created
for each pci function.
Simplify this, making the pci module responsible only for pci resource
management.
NIC initialization is also simplified accordingly.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Igor Russkikh [Fri, 19 Jan 2018 14:03:19 +0000 (17:03 +0300)]
net: aquantia: Introduce new AQC devices and capabilities
A number of new AQC devices are going to be released. To support more
flexible capabilities management, a number of static caps instances are now
declared. Devices now differ mainly by supported speeds, but in the future
more parameters will be customized. A set of AQC100 devices have
fibre media, not twisted pair - this is also reflected in the
new capabilities definitions.
HW level also now directly exports hw_ops for each of A0/B0 hardware.
PCI configuration now uses a device configuration table where each
device ID is explicitly mapped with hardware OPs and capabilities
structures.
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
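A device configuration table that maps each device ID to its hardware ops and capabilities might look roughly like the sketch below. The device IDs, speeds and structure names are invented for illustration and do not correspond to real Aquantia identifiers.

```c
#include <stddef.h>

enum toy_media { TOY_MEDIA_TWISTED_PAIR, TOY_MEDIA_FIBRE };

/* Hypothetical static capabilities instance. */
struct toy_caps {
    enum toy_media media;
    unsigned int max_speed_mbps;
};

/* Hypothetical per-revision hardware ops. */
struct toy_hw_ops {
    const char *revision;
};

/* One table row: device ID explicitly mapped to ops and caps. */
struct toy_board {
    unsigned short device_id;
    const struct toy_hw_ops *ops;
    const struct toy_caps *caps;
};

static const struct toy_hw_ops toy_ops_a0 = { "A0" };
static const struct toy_hw_ops toy_ops_b0 = { "B0" };
static const struct toy_caps toy_caps_tp_10g = { TOY_MEDIA_TWISTED_PAIR, 10000 };
static const struct toy_caps toy_caps_fibre_10g = { TOY_MEDIA_FIBRE, 10000 };

/* Invented IDs; a real table would list actual PCI device IDs. */
static const struct toy_board toy_boards[] = {
    { 0x0001, &toy_ops_a0, &toy_caps_tp_10g },
    { 0x0002, &toy_ops_b0, &toy_caps_tp_10g },
    { 0x0003, &toy_ops_b0, &toy_caps_fibre_10g },
};

/* Linear lookup by device ID; returns NULL for unsupported devices. */
static const struct toy_board *toy_board_find(unsigned short device_id)
{
    size_t i;

    for (i = 0; i < sizeof(toy_boards) / sizeof(toy_boards[0]); i++)
        if (toy_boards[i].device_id == device_id)
            return &toy_boards[i];
    return NULL;
}
```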