Björn Töpel [Wed, 13 Mar 2019 14:15:49 +0000 (15:15 +0100)]
xsk: fix umem memory leak on cleanup
When the umem is cleaned up, the task that created it might already be
gone. If the task was gone, the xdp_umem_release function did not free
the pages member of struct xdp_umem.
It turned out that the task lookup was not needed at all; The code was
a left-over when we moved from task accounting to user accounting [1].
This patch fixes the memory leak by removing the task lookup logic
completely.
====================
Hi,
This set is an update to the documentation for the BPF helper functions in
the UAPI header bpf.h, used to generate the bpf-helpers(7) man page.
First patch contains fixes to the current documentation. The second patch
adds documentation for the two helpers bpf_spin_lock() and
bpf_spin_unlock(). The last patch simply reports the changes to the version
of that header in tools/.
====================
Quentin Monnet [Thu, 14 Mar 2019 12:38:40 +0000 (12:38 +0000)]
bpf: add documentation for helpers bpf_spin_lock(), bpf_spin_unlock()
Add documentation for the BPF spinlock-related helpers to the doc in
bpf.h. I added the constraints and restrictions coming with the use of
spinlocks for BPF: not all of it is directly related to the use of the
helper, but I thought it would be nice for users to find them in the man
page.
This list of restrictions is nearly a verbatim copy of the list in
Alexei's commit log for those helpers.
Quentin Monnet [Thu, 14 Mar 2019 12:38:39 +0000 (12:38 +0000)]
bpf: fix documentation for eBPF helpers
Another round of minor fixes for the documentation of the BPF helpers
located in the UAPI bpf.h header file. Changes include:
- Moving around description of some helpers, to keep the descriptions in
the same order as helpers are declared (bpf_map_push_elem(), leftover
from commit fb8e2eee2cd8 ("bpf: fix documentation for eBPF helpers"),
bpf_rc_keydown(), and bpf_skb_ancestor_cgroup_id()).
- Fixing typos ("contex" -> "context").
- Harmonising return types ("void* " -> "void *", "uint64_t" -> "u64").
- Addition of the "bpf_" prefix to bpf_get_storage().
- Light additions of RST markup on some keywords.
- Empty line deletion between description and return value for
bpf_tcp_sock().
- Edit for the description for bpf_skb_ecn_set_ce() (capital letters,
acronym expansion, no effect if ECT not set, more details on return
value).
====================
This patchset adds ability to resolve forward-declared enums into their
corresponding full enum definition types during type deduplication,
eliminating one of the reasons for having duplicated graphs of types.
====================
Andrii Nakryiko [Mon, 11 Mar 2019 00:44:09 +0000 (17:44 -0700)]
btf: resolve enum fwds in btf_dedup
GCC and clang support enum forward declarations as an extension. Such
forward-declared enums will be represented as normal BTF_KIND_ENUM types with
vlen=0. This patch adds ability to resolve such enums to their corresponding
fully defined enums. This helps to avoid duplicated BTF type graphs which only
differ by some types referencing forward-declared enum vs full enum.
One such example in kernel is enum irqchip_irq_state, defined in
include/linux/interrupt.h and forward-declared in include/linux/irq.h. This
causes entire struct task_struct and all referenced types to be duplicated in
btf_dedup output. This patch eliminates such duplication cases.
====================
This set addresses issue about accessing invalid
ptr returned from bpf_tcp_sock() and bpf_sk_fullsock()
after bpf_sk_release().
v4:
- Tried the one "id" approach. It does not work well and the reason is in
the Patch 1 commit message.
- Rename refcount_id to ref_obj_id.
- With ref_obj_id, resetting reg->id to 0 is fine in mark_ptr_or_null_reg()
because ref_obj_id is passed to release_reference() instead of reg->id.
- Also reset reg->ref_obj_id in mark_ptr_or_null_reg() when is_null == true
- sk_to_full_sk() is removed from bpf_sk_fullsock() and bpf_tcp_sock().
- bpf_get_listener_sock() is added to do sk_to_full_sk() in Patch 2.
- If tp is from bpf_tcp_sock(sk) and sk is a refcounted ptr,
bpf_sk_release(tp) is also allowed.
v3:
- reset reg->refcount_id for the is_null case in mark_ptr_or_null_reg()
v2:
- Remove refcount_id arg from release_reference() because
id == refcount_id
- Add a WARN_ON_ONCE to mark_ptr_or_null_regs() to catch
an internal verifier bug.
====================
Martin KaFai Lau [Tue, 12 Mar 2019 17:23:09 +0000 (10:23 -0700)]
bpf: Test ref release issue in bpf_tcp_sock and bpf_sk_fullsock
Adding verifier tests to ensure the ptr returned from bpf_tcp_sock() and
bpf_sk_fullsock() cannot be accessed after bpf_sk_release() is called.
A few of the tests are derived from a reproducer test by Lorenz Bauer.
Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add a new helper "struct bpf_sock *bpf_get_listener_sock(struct bpf_sock *sk)"
which returns a bpf_sock in TCP_LISTEN state. It will trace back to
the listener sk from a request_sock if possible. It returns NULL
for all other cases.
No reference is taken because the helper ensures the sk is
in SOCK_RCU_FREE (where the TCP_LISTEN sock should be in).
Hence, bpf_sk_release() is unnecessary and the verifier does not
allow bpf_sk_release(listen_sk) to be called either.
The following is also allowed because the bpf_prog is run under
rcu_read_lock():
sk = bpf_sk_lookup_tcp();
/* if (!sk) { ... } */
listen_sk = bpf_get_listener_sock(sk);
/* if (!listen_sk) { ... } */
bpf_sk_release(sk);
src_port = listen_sk->src_port; /* Allowed */
Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Martin KaFai Lau [Tue, 12 Mar 2019 17:23:02 +0000 (10:23 -0700)]
bpf: Fix bpf_tcp_sock and bpf_sk_fullsock issue related to bpf_sk_release
Lorenz Bauer [thanks!] reported that a ptr returned by bpf_tcp_sock(sk)
can still be accessed after bpf_sk_release(sk).
Both bpf_tcp_sock() and bpf_sk_fullsock() have the same issue.
This patch addresses them together.
A simple reproducer looks like this:
sk = bpf_sk_lookup_tcp();
/* if (!sk) ... */
tp = bpf_tcp_sock(sk);
/* if (!tp) ... */
bpf_sk_release(sk);
snd_cwnd = tp->snd_cwnd; /* oops! The verifier does not complain. */
The problem is the verifier did not scrub the register's states of
the tcp_sock ptr (tp) after bpf_sk_release(sk).
[ Note that when calling bpf_tcp_sock(sk), the sk is not always
refcount-acquired. e.g. bpf_tcp_sock(skb->sk). The verifier works
fine for this case. ]
Currently, the verifier does not track if a helper's return ptr (in REG_0)
is "carry"-ing one of its argument's refcount status. To carry this info,
the reg1->id needs to be stored in reg0.
One approach was tried, like "reg0->id = reg1->id", when calling
"bpf_tcp_sock()". The main idea was to avoid adding another "ref_obj_id"
for the same reg. However, overlapping the NULL marking and ref
tracking purpose in one "id" does not work well:
ref_sk = bpf_sk_lookup_tcp();
fullsock = bpf_sk_fullsock(ref_sk);
tp = bpf_tcp_sock(ref_sk);
if (!fullsock) {
bpf_sk_release(ref_sk);
return 0;
}
/* fullsock_reg->id is marked for NOT-NULL.
* Same for tp_reg->id because they have the same id.
*/
/* oops. verifier did not complain about the missing !tp check */
snd_cwnd = tp->snd_cwnd;
Hence, a new "ref_obj_id" is needed in "struct bpf_reg_state".
With a new ref_obj_id, when bpf_sk_release(sk) is called, the verifier can
scrub all reg states which has a ref_obj_id match. It is done with the
changes in release_reg_references() in this patch.
While fixing it, sk_to_full_sk() is removed from bpf_tcp_sock() and
bpf_sk_fullsock() to avoid these helpers from returning
another ptr. It will make bpf_sk_release(tp) possible:
sk = bpf_sk_lookup_tcp();
/* if (!sk) ... */
tp = bpf_tcp_sock(sk);
/* if (!tp) ... */
bpf_sk_release(tp);
A separate helper "bpf_get_listener_sock()" will be added in a later
patch to do sk_to_full_sk().
Misc change notes:
- To allow bpf_sk_release(tp), the arg of bpf_sk_release() is changed
from ARG_PTR_TO_SOCKET to ARG_PTR_TO_SOCK_COMMON. ARG_PTR_TO_SOCKET
is removed from bpf.h since no helper is using it.
- arg_type_is_refcounted() is renamed to arg_type_may_be_refcounted()
because ARG_PTR_TO_SOCK_COMMON is the only one and skb->sk is not
refcounted. All bpf_sk_release(), bpf_sk_fullsock() and bpf_tcp_sock()
take ARG_PTR_TO_SOCK_COMMON.
- check_refcount_ok() ensures is_acquire_function() cannot take
arg_type_may_be_refcounted() as its argument.
- The check_func_arg() can only allow one refcount-ed arg. It is
guaranteed by check_refcount_ok() which ensures at most one arg can be
refcounted. Hence, it is a verifier internal error if >1 refcount arg
found in check_func_arg().
- In release_reference(), release_reference_state() is called
first to ensure a match on "reg->ref_obj_id" can be found before
scrubbing the reg states with release_reg_references().
- reg_is_refcounted() is no longer needed.
1. In mark_ptr_or_null_regs(), its usage is replaced by
"ref_obj_id && ref_obj_id == id" because,
when is_null == true, release_reference_state() should only be
called on the ref_obj_id obtained by a acquire helper (i.e.
is_acquire_function() == true). Otherwise, the following
would happen:
sk = bpf_sk_lookup_tcp();
/* if (!sk) { ... } */
fullsock = bpf_sk_fullsock(sk);
if (!fullsock) {
/*
* release_reference_state(fullsock_reg->ref_obj_id)
* where fullsock_reg->ref_obj_id == sk_reg->ref_obj_id.
*
* Hence, the following bpf_sk_release(sk) will fail
* because the ref state has already been released in the
* earlier release_reference_state(fullsock_reg->ref_obj_id).
*/
bpf_sk_release(sk);
}
2. In release_reg_references(), the current reg_is_refcounted() call
is unnecessary because the id check is enough.
- The type_is_refcounted() and type_is_refcounted_or_null()
are no longer needed also because reg_is_refcounted() is removed.
Fixes: b9d53812598c ("bpf: Add struct bpf_tcp_sock and BPF_FUNC_tcp_sock") Reported-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Magnus Karlsson [Tue, 12 Mar 2019 08:59:45 +0000 (09:59 +0100)]
libbpf: fix to reject unknown flags in xsk_socket__create()
In xsk_socket__create(), the libbpf_flags field was not checked for
setting currently unused/unknown flags. This patch fixes that by
returning -EINVAL if the user has set any flag that is not in use at
this point in time.
Fixes: 1630fbcf169a ("libbpf: add support for using AF_XDP sockets") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Reviewed-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Yonghong Song [Tue, 12 Mar 2019 05:21:09 +0000 (22:21 -0700)]
selftests/bpf: fix segfault of test_progs when prog loading failed
The test_progs subtests, test_spin_lock() and test_map_lock(),
requires BTF present to run successfully.
Currently, when BTF failed to load, test_progs will segfault,
$ ./test_progs
...
12: (bf) r1 = r8
13: (85) call bpf_spin_lock#93
map 'hash_map' has to have BTF in order to use bpf_spin_lock
libbpf: -- END LOG --
libbpf: failed to load program 'map_lock_demo'
libbpf: failed to load object './test_map_lock.o'
test_map_lock:bpf_prog_load errno 13
Segmentation fault
The segfault is caused by uninitialized variable "obj", which
is used in bpf_object__close(obj), when bpf prog failed to load.
Initializing variable "obj" to NULL in two occasions fixed the problem.
$ ./test_progs
...
Summary: 219 PASSED, 2 FAILED
Fixes: 5d1ecdb9db4d ("selftests/bpf: add bpf_spin_lock verifier tests") Fixes: 366693ab15e9 ("selftests/bpf: test for BPF_F_LOCK") Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Andrii Nakryiko [Fri, 8 Mar 2019 23:58:20 +0000 (15:58 -0800)]
libbpf: handle BTF parsing and loading properly
This patch splits and cleans up error handling logic for loading BTF data.
Previously, if BTF data was parsed successfully, but failed to load into
kernel, we'd report nonsensical error code, instead of error returned from
btf__load(). Now btf__new() and btf__load() are handled separately with proper
cleanup and warning reporting.
Fixes: 4b942d80d175 ("btf: separate btf creation and loading") Reported-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Fixes: 75239b3ef531 ("net: add gro_cells infrastructure") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 10 Mar 2019 17:36:40 +0000 (10:36 -0700)]
vxlan: test dev->flags & IFF_UP before calling gro_cells_receive()
Same reasons than the ones explained in commit bf63bc03073e
("vxlan: test dev->flags & IFF_UP before calling netif_rx()")
netif_rx() or gro_cells_receive() must be called under a strict contract.
At device dismantle phase, core networking clears IFF_UP
and flush_all_backlogs() is called after rcu grace period
to make sure no incoming packet might be in a cpu backlog
and still referencing the device.
A similar protocol is used for gro_cells infrastructure, as
gro_cells_destroy() will be called only after a full rcu
grace period is observed after IFF_UP has been cleared.
Most drivers call netif_rx() from their interrupt handler,
and since the interrupts are disabled at device dismantle,
netif_rx() does not have to check dev->flags & IFF_UP
Virtual drivers do not have this guarantee, and must
therefore make the check themselves.
Otherwise we risk use-after-free and/or crashes.
Fixes: c036b03fe2ab ("vxlan: virtual extensible lan") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sun, 10 Mar 2019 16:07:14 +0000 (09:07 -0700)]
net/x25: fix use-after-free in x25_device_event()
In case of failure x25_connect() does a x25_neigh_put(x25->neighbour)
but forgets to clear x25->neighbour pointer, thus triggering use-after-free.
Since the socket is visible in x25_list, we need to hold x25_list_lock
to protect the operation.
syzbot report :
BUG: KASAN: use-after-free in x25_kill_by_device net/x25/af_x25.c:217 [inline]
BUG: KASAN: use-after-free in x25_device_event+0x296/0x2b0 net/x25/af_x25.c:252
Read of size 8 at addr ffff8880a030edd0 by task syz-executor003/7854
Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot+04babcefcd396fabec37@syzkaller.appspotmail.com Cc: andrew hendry <andrew.hendry@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Shiju Jose [Sun, 10 Mar 2019 06:47:51 +0000 (14:47 +0800)]
net: hns3: fix to stop multiple HNS reset due to the AER changes
The commit 03504cf5b152
("PCI/ERR: Run error recovery callbacks for all affected devices")
affected the non-fatal error recovery logic for the HNS and RDMA devices.
This is because each HNS PF under PCIe bus receive callbacks
from the AER driver when an error is reported for one of the PF.
This causes unwanted PF resets because
the HNS decides which PF to reset based on the reset type set.
The HNS error handling code sets the reset type based on the hw error
type detected.
This patch provides fix for the above issue for the recovery of
the hw errors in the HNS and RDMA devices.
This patch needs backporting to the kernel v5.0+
Fixes: 9482eb3b06c2 ("net: hns3: add handling of hw ras errors using new set of commands") Reported-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Sat, 9 Mar 2019 22:43:38 +0000 (14:43 -0800)]
ip: fix ip_mc_may_pull() return value
ip_mc_may_pull() must return 0 if there is a problem, not an errno.
syzbot reported :
BUG: KASAN: use-after-free in br_ip4_multicast_igmp3_report net/bridge/br_multicast.c:947 [inline]
BUG: KASAN: use-after-free in br_multicast_ipv4_rcv net/bridge/br_multicast.c:1631 [inline]
BUG: KASAN: use-after-free in br_multicast_rcv+0x3cd8/0x4440 net/bridge/br_multicast.c:1741
Read of size 4 at addr ffff88820a4084ee by task syz-executor.2/11183
Fixes: 40a8efcec8af ("bridge: simplify ip_mc_check_igmp() and ipv6_mc_check_mld() calls") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Sat, 9 Mar 2019 09:26:53 +0000 (10:26 +0100)]
net: keep refcount warning in reqsk_free()
As Eric Dumazet said, "We do not have a way to tell if the req was ever
inserted in a hash table, so better play safe.".
Let's remove this comment, so that nobody will be tempted to drop the
WARN_ON_ONCE() line.
Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: stmmac: Avoid one more sometimes uninitialized Clang warning
When building with -Wsometimes-uninitialized, Clang warns:
drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c:111:2: error: variable
'ns' is used uninitialized whenever 'if' condition is false
[-Werror,-Wsometimes-uninitialized]
drivers/net/ethernet/stmicro/stmmac/stmmac_ptp.c:111:2: error: variable
'ns' is used uninitialized whenever '&&' condition is false
[-Werror,-Wsometimes-uninitialized]
Clang is concerned with the use of stmmac_do_void_callback (which
stmmac_get_systime wraps), as it may fail to initialize these values if
the if condition was ever false (meaning the callback doesn't exist).
It's not wrong because the callback is what initializes ns. While it's
unlikely that the callback is going to disappear at some point and make
that condition false, we can easily avoid this warning by zero
initializing the variable.
Link: https://github.com/ClangBuiltLinux/linux/issues/384 Fixes: 9031125cb033 ("net: stmmac: Avoid sometimes uninitialized Clang warnings") Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Fri, 8 Mar 2019 00:21:27 +0000 (01:21 +0100)]
net: dsa: mv88e6xxx: Set correct interface mode for CPU/DSA ports
By default, the switch driver is expected to configure CPU and DSA
ports to their maximum speed. For the 6341 and 6390 families, the
ports interface mode has to be configured as well. The 6390X range
support 10G ports using XAUI, while the 6341 and 6390 supports
2500BaseX, as their maximum speed.
Fixes: 911fb79960a5 ("net: dsa: mv88e6xxx: Default ports 9/10 6390X CMODE to 1000BaseX") Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Sat, 9 Mar 2019 00:29:58 +0000 (00:29 +0000)]
rxrpc: Fix client call queueing, waiting for channel
rxrpc_get_client_conn() adds a new call to the front of the waiting_calls
queue if the connection it's going to use already exists. This is bad as
it allows calls to get starved out.
Fix this by adding to the tail instead.
Also change the other enqueue point in the same function to put it on the
front (ie. when we have a new connection). This makes the point that in
the case of a new connection the new call goes at the front (though it
doesn't actually matter since the queue should be unoccupied).
Fixes: 3a2a7fa418e4 ("rxrpc: Improve management and caching of client connection objects") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The following pull-request contains BPF updates for your *net* tree.
The main changes are:
1) Fix a crash in AF_XDP's xsk_diag_put_ring() which was passing
wrong queue argument, from Eric.
2) Fix a regression due to wrong test for TCP GSO packets used in
various BPF helpers like NAT64, from Willem.
3) Fix a sk_msg strparser warning which asserts that strparser must
be stopped first, from Jakub.
4) Fix rejection of invalid options/bind flags in AF_XDP, from Björn.
5) Fix GSO in bpf_lwt_push_ip_encap() which must properly set inner
headers and inner protocol, from Peter.
6) Fix a libbpf leak when kernel does not support BTF, from Nikita.
7) Various BPF selftest and libbpf build fixes to make out-of-tree
compilation work and to properly resolve dependencies via fixdep
target, from Stanislav.
8) Fix rejection of invalid ldimm64 imm field, from Daniel.
9) Fix bpf stats sysctl compile warning of unused helper function
proc_dointvec_minmax_bpf_stats() under some configs, from Arnd.
10) Fix couple of warnings about using plain integer as NULL, from Bo.
11) Fix some BPF sample spelling mistakes, from Colin.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Guillaume Nault [Fri, 8 Mar 2019 21:09:47 +0000 (22:09 +0100)]
tcp: handle inet_csk_reqsk_queue_add() failures
Commit 665d17d333ab ("tcp/dccp: fix another race at listener
dismantle") let inet_csk_reqsk_queue_add() fail, and adjusted
{tcp,dccp}_check_req() accordingly. However, TFO and syncookies
weren't modified, thus leaking allocated resources on error.
Contrary to tcp_check_req(), in both syncookies and TFO cases,
we need to drop the request socket. Also, since the child socket is
created with inet_csk_clone_lock(), we have to unlock it and drop an
extra reference (->sk_refcount is initially set to 2 and
inet_csk_reqsk_queue_add() drops only one ref).
For TFO, we also need to revert the work done by tcp_try_fastopen()
(with reqsk_fastopen_remove()).
Fixes: 665d17d333ab ("tcp/dccp: fix another race at listener dismantle") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethernet: sun: Zero initialize class in default case in niu_add_ethtool_tcam_entry
When building with -Wsometimes-uninitialized, Clang warns:
drivers/net/ethernet/sun/niu.c:7466:5: warning: variable 'class' is used
uninitialized whenever switch default is taken
[-Wsometimes-uninitialized]
The default case can never happen because i can only be 0 to 3
(NIU_L3_PROG_CLS is defined as 4). To make this clear to Clang,
just zero initialize class in the default case (use the macro
CLASS_CODE_UNRECOG to make it clear this shouldn't happen).
Link: https://github.com/ClangBuiltLinux/linux/issues/403 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: c243293632a4 ("fou, fou6: do not assume linear skbs") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Stefano Brivio <sbrivio@redhat.com> Cc: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Wed, 6 Mar 2019 15:50:43 +0000 (17:50 +0200)]
net: sched: fix potential use-after-free in __tcf_chain_put()
When used with unlocked classifier that have filters attached to actions
with goto chain, __tcf_chain_put() for last non action reference can race
with calls to same function from action cleanup code that releases last
action reference. In this case action cleanup handler could free the chain
if it executes after all references to chain were released, but before all
concurrent users finished using it. Modify __tcf_chain_put() to only access
tcf_chain fields when holding block->lock. Remove local variables that were
used to cache some tcf_chain fields and are no longer needed because their
values can now be obtained directly from chain under block->lock
protection.
Fixes: b488ed6b1fec ("net: sched: prevent insertion of new classifiers during chain flush") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Wed, 6 Mar 2019 11:05:49 +0000 (12:05 +0100)]
vhost: silence an unused-variable warning
On some architectures, the MMU can be disabled, leading to access_ok()
becoming an empty macro that does not evaluate its size argument,
which in turn produces an unused-variable warning:
Adalbert Lazăr [Wed, 6 Mar 2019 10:13:53 +0000 (12:13 +0200)]
vsock/virtio: fix kernel panic from virtio_transport_reset_no_sock
Previous to commit e12dfe207f62 ("vsock/virtio: fix kernel panic
after device hot-unplug"), vsock_core_init() was called from
virtio_vsock_probe(). Now, virtio_transport_reset_no_sock() can be called
before vsock_core_init() has the chance to run.
[Wed Feb 27 14:17:09 2019] BUG: unable to handle kernel NULL pointer dereference at 0000000000000110
[Wed Feb 27 14:17:09 2019] #PF error: [normal kernel read fault]
[Wed Feb 27 14:17:09 2019] PGD 0 P4D 0
[Wed Feb 27 14:17:09 2019] Oops: 0000 [#1] SMP PTI
[Wed Feb 27 14:17:09 2019] CPU: 3 PID: 59 Comm: kworker/3:1 Not tainted 5.0.0-rc7-390-generic-hvi #390
[Wed Feb 27 14:17:09 2019] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[Wed Feb 27 14:17:09 2019] Workqueue: virtio_vsock virtio_transport_rx_work [vmw_vsock_virtio_transport]
[Wed Feb 27 14:17:09 2019] RIP: 0010:virtio_transport_reset_no_sock+0x8c/0xc0 [vmw_vsock_virtio_transport_common]
[Wed Feb 27 14:17:09 2019] Code: 35 8b 4f 14 48 8b 57 08 31 f6 44 8b 4f 10 44 8b 07 48 8d 7d c8 e8 84 f8 ff ff 48 85 c0 48 89 c3 74 2a e8 f7 31 03 00 48 89 df <48> 8b 80 10 01 00 00 e8 68 fb 69 ed 48 8b 75 f0 65 48 33 34 25 28
[Wed Feb 27 14:17:09 2019] RSP: 0018:ffffb42701ab7d40 EFLAGS: 00010282
[Wed Feb 27 14:17:09 2019] RAX: 0000000000000000 RBX: ffff9d79637ee080 RCX: 0000000000000003
[Wed Feb 27 14:17:09 2019] RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffff9d79637ee080
[Wed Feb 27 14:17:09 2019] RBP: ffffb42701ab7d78 R08: ffff9d796fae70e0 R09: ffff9d796f403500
[Wed Feb 27 14:17:09 2019] R10: ffffb42701ab7d90 R11: 0000000000000000 R12: ffff9d7969d09240
[Wed Feb 27 14:17:09 2019] R13: ffff9d79624e6840 R14: ffff9d7969d09318 R15: ffff9d796d48ff80
[Wed Feb 27 14:17:09 2019] FS: 0000000000000000(0000) GS:ffff9d796fac0000(0000) knlGS:0000000000000000
[Wed Feb 27 14:17:09 2019] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Wed Feb 27 14:17:09 2019] CR2: 0000000000000110 CR3: 0000000427f22000 CR4: 00000000000006e0
[Wed Feb 27 14:17:09 2019] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[Wed Feb 27 14:17:09 2019] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[Wed Feb 27 14:17:09 2019] Call Trace:
[Wed Feb 27 14:17:09 2019] virtio_transport_recv_pkt+0x63/0x820 [vmw_vsock_virtio_transport_common]
[Wed Feb 27 14:17:09 2019] ? kfree+0x17e/0x190
[Wed Feb 27 14:17:09 2019] ? detach_buf_split+0x145/0x160
[Wed Feb 27 14:17:09 2019] ? __switch_to_asm+0x40/0x70
[Wed Feb 27 14:17:09 2019] virtio_transport_rx_work+0xa0/0x106 [vmw_vsock_virtio_transport]
[Wed Feb 27 14:17:09 2019] NET: Registered protocol family 40
[Wed Feb 27 14:17:09 2019] process_one_work+0x167/0x410
[Wed Feb 27 14:17:09 2019] worker_thread+0x4d/0x460
[Wed Feb 27 14:17:09 2019] kthread+0x105/0x140
[Wed Feb 27 14:17:09 2019] ? rescuer_thread+0x360/0x360
[Wed Feb 27 14:17:09 2019] ? kthread_destroy_worker+0x50/0x50
[Wed Feb 27 14:17:09 2019] ret_from_fork+0x35/0x40
[Wed Feb 27 14:17:09 2019] Modules linked in: vmw_vsock_virtio_transport vmw_vsock_virtio_transport_common input_leds vsock serio_raw i2c_piix4 mac_hid qemu_fw_cfg autofs4 cirrus ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops virtio_net psmouse drm net_failover pata_acpi virtio_blk failover floppy
Fixes: e12dfe207f62 ("vsock/virtio: fix kernel panic after device hot-unplug") Reported-by: Alexandru Herghelegiu <aherghelegiu@bitdefender.com> Signed-off-by: Adalbert Lazăr <alazar@bitdefender.com> Co-developed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Li RongQing [Wed, 6 Mar 2019 06:46:27 +0000 (14:46 +0800)]
connector: fix unsafe usage of ->real_parent
proc_exit_connector() uses ->real_parent lockless. This is not
safe that its parent can go away at any moment, so use RCU to
protect it, and ensure that this task is not released.
Fixes: 30bed0ac530ef09 ("connector: add parent pid and tgid to coredump and exit events") Signed-off-by: Zhang Yu <zhangyu31@baidu.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Jian Shen [Wed, 6 Mar 2019 03:26:37 +0000 (11:26 +0800)]
net: hns3: add dma_rmb() for rx description
HW can not guarantee complete write desc->rx.size, even though
HNS3_RXD_VLD_B has been set. Driver needs to add dma_rmb()
instruction to make sure desc->rx.size is always valid.
Fixes: a42418f9a0a1 ("net: hns3: Add handling of GRO Pkts not fully RX'ed in NAPI poll") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pedro Tammela [Tue, 5 Mar 2019 14:35:54 +0000 (11:35 -0300)]
net: add missing documentation in linux/skbuff.h
This patch adds missing documentation for some inline functions on
linux/skbuff.h. The patch is incomplete and a lot more can be added,
just wondering if it's of interest of the netdev developers.
Also fixed some whitespaces.
Signed-off-by: Pedro Tammela <pctammela@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bo YU [Fri, 8 Mar 2019 06:45:51 +0000 (01:45 -0500)]
bpf: fix warning about using plain integer as NULL
Sparse warning below:
sudo make C=2 CF=-D__CHECK_ENDIAN__ M=net/bpf/
CHECK net/bpf//test_run.c
net/bpf//test_run.c:19:77: warning: Using plain integer as NULL pointer
./include/linux/bpf-cgroup.h:295:77: warning: Using plain integer as NULL pointer
Fixes: 6f317e33a148 ("bpf: extend cgroup bpf core to allow multiple cgroup storage types") Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Bo YU <tsu.yubo@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Björn Töpel [Fri, 8 Mar 2019 07:57:27 +0000 (08:57 +0100)]
xsk: fix to reject invalid options in Tx descriptor
Passing a non-existing option in the options member of struct
xdp_desc was, incorrectly, silently ignored. This patch addresses
that behavior, and drops any Tx descriptor with non-existing options.
We have examined existing user space code, and to our best knowledge,
no one is relying on the current incorrect behavior. AF_XDP is still
in its infancy, so from our perspective, the risk of breakage is very
low, and addressing this problem now is important.
Fixes: b053263a3f77 ("xsk: support for Tx") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Björn Töpel [Fri, 8 Mar 2019 07:57:26 +0000 (08:57 +0100)]
xsk: fix to reject invalid flags in xsk_bind
Passing a non-existing flag in the sxdp_flags member of struct
sockaddr_xdp was, incorrectly, silently ignored. This patch addresses
that behavior, and rejects any non-existing flags.
We have examined existing user space code, and to our best knowledge,
no one is relying on the current incorrect behavior. AF_XDP is still
in its infancy, so from our perspective, the risk of breakage is very
low, and addressing this problem now is important.
Fixes: f19f2dc69f6e ("xsk: add support for bind for Rx") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
bpf, libbpf: fixing leak when kernel does not support btf
We could end up in situation when we have object file w/ all btf
info, but kernel does not support btf yet. In this situation
currently libbpf just set obj->btf to NULL w/o freeing it first.
This patch is fixing it by making sure to run btf__free first.
Fixes: 4b942d80d175 ("btf: separate btf creation and loading") Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
David S. Miller [Fri, 8 Mar 2019 19:48:20 +0000 (11:48 -0800)]
Merge branch 'stmmac-add-some-fixes-for-stm32'
Christophe Roullier says:
====================
stmmac: add some fixes for stm32
For common stmmac:
- Add support to set CSR Clock range selection in DT
For stm32mpu:
- Glue codes to support magic packet
- Glue codes to support all PHY config :
PHY_MODE (MII,GMII, RMII, RGMII) and in normal,
PHY wo crystal (25Mhz),
PHY wo crystal (50Mhz), No 125Mhz from PHY config
For stm32mcu:
- Add Ethernet support for stm32h7
Changes in V3:
- Reverse for syscfg management because it is manage by these patches
https://lkml.org/lkml/2018/12/12/133
https://lkml.org/lkml/2018/12/12/134
https://lkml.org/lkml/2018/12/12/131
https://lkml.org/lkml/2018/12/12/132
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethernet: stmmac: add management of clk_csr property
In Documentation stmmac.txt there is possibility to
fixed CSR Clock range selection with property clk_csr.
This patch add the management of this property
For example to use it, add in your ethernet node DT:
clk_csr = <3>;
Signed-off-by: Christophe Roullier <christophe.roullier@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Add properties to support all Phy config
PHY_MODE (MII,GMII, RMII, RGMII) and in normal, PHY wo crystal (25Mhz),
PHY wo crystal (50Mhz), No 125Mhz from PHY config.
Signed-off-by: Christophe Roullier <christophe.roullier@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethernet: stmmac: update to support all PHY config for stm32mp157c.
Update glue codes to support all PHY config on stm32mp157c
PHY_MODE (MII,GMII, RMII, RGMII) and in normal, PHY wo crystal (25Mhz),
PHY wo crystal (50Mhz), No 125Mhz from PHY config.
Signed-off-by: Christophe Roullier <christophe.roullier@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
====================
sctp: process the error returned from sctp_sock_migrate()
This patchset is to process the errs returned by sctp_auth_init_hmacs()
and sctp_bind_addr_dup() from sctp_sock_migrate(). And also fix a panic
caused by new ep->auth_hmacs was not set due to net->sctp.auth_enable
changed by sysctl before accepting an assoc.
====================
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Sun, 3 Mar 2019 09:54:55 +0000 (17:54 +0800)]
sctp: call sctp_auth_init_hmacs() in sctp_sock_migrate()
New ep's auth_hmacs should be set if old ep's is set, in case that
net->sctp.auth_enable has been changed to 0 by users and new ep's
auth_hmacs couldn't be set in sctp_endpoint_init().
It can even crash kernel by doing:
1. on server: sysctl -w net.sctp.auth_enable=1,
sysctl -w net.sctp.addip_enable=1,
sysctl -w net.sctp.addip_noauth_enable=0,
listen() on server,
sysctl -w net.sctp.auth_enable=0.
2. on client: connect() to server.
3. on server: accept() the asoc,
sysctl -w net.sctp.auth_enable=1.
4. on client: send() asconf packet to server.
It's because the old ep->auth_hmacs wasn't copied to the new ep while
ep->auth_hmacs is used in sctp_auth_calculate_hmac() when processing
the incoming auth chunks, and it should have been done when migrating
sock.
Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Sun, 3 Mar 2019 09:54:54 +0000 (17:54 +0800)]
sctp: move up sctp_auth_init_hmacs() in sctp_endpoint_init()
sctp_auth_init_hmacs() is called only when ep->auth_enable is set.
It better to move up sctp_auth_init_hmacs() and remove auth_enable
check in it and check auth_enable only once in sctp_endpoint_init().
Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio [Fri, 8 Mar 2019 15:40:57 +0000 (16:40 +0100)]
vxlan: Fix GRO cells race condition between receive and link delete
If we receive a packet while deleting a VXLAN device, there's a chance
vxlan_rcv() is called at the same time as vxlan_dellink(). This is fine,
except that vxlan_dellink() should never ever touch stuff that's still in
use, such as the GRO cells list.
Otherwise, vxlan_rcv() crashes while queueing packets via
gro_cells_receive().
Move the gro_cells_destroy() to vxlan_uninit(), which runs after the RCU
grace period is elapsed and nothing needs the gro_cells anymore.
This is now done in the same way as commit b3d97d45a14a ("geneve: Use GRO
cells infrastructure.") originally implemented for GENEVE.
Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: 2a52e0b4b15d ("vxlan: GRO support at tunnel layer") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Fri, 8 Mar 2019 12:48:39 +0000 (12:48 +0000)]
rxrpc: Fix client call connect/disconnect race
rxrpc_disconnect_client_call() reads the call's connection ID protocol
value (call->cid) as part of that function's variable declarations. This
is bad because it's not inside the locked section and so may race with
someone granting use of the channel to the call.
This manifests as an assertion failure (see below) where the call in the
presumed channel (0 because call->cid wasn't set when we read it) doesn't
match the call attached to the channel we were actually granted (if 1, 2 or
3).
Fix this by moving the read and dependent calculations inside of the
channel_lock section. Also, only set the channel number and pointer
variables if cid is not zero (ie. unset).
This problem can be induced by injecting an occasional error in
rxrpc_wait_for_channel() before the call to schedule().
Make two further changes also:
(1) Add a trace for wait failure in rxrpc_connect_call().
(2) Drop channel_lock before BUG'ing in the case of the assertion failure.
Fixes: 3a2a7fa418e4 ("rxrpc: Improve management and caching of client connection objects") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Fri, 8 Mar 2019 07:49:16 +0000 (15:49 +0800)]
sctp: remove sched init from sctp_stream_init
syzbot reported a NULL-ptr deref caused by that sched->init() in
sctp_stream_init() set stream->rr_next = NULL.
kasan: GPF could be caused by NULL-ptr deref or user memory access
RIP: 0010:sctp_sched_rr_dequeue+0xd3/0x170 net/sctp/stream_sched_rr.c:141
Call Trace:
sctp_outq_dequeue_data net/sctp/outqueue.c:90 [inline]
sctp_outq_flush_data net/sctp/outqueue.c:1079 [inline]
sctp_outq_flush+0xba2/0x2790 net/sctp/outqueue.c:1205
All sched info is saved in sout->ext now, in sctp_stream_init()
sctp_stream_alloc_out() will not change it, there's no need to
call sched->init() again, since sctp_outq_init() has already
done it.
Fixes: 4ec45eebba0b ("sctp: introduce stream scheduler foundations") Reported-by: syzbot+4c9934f20522c0efd657@syzkaller.appspotmail.com Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Fri, 8 Mar 2019 06:50:54 +0000 (14:50 +0800)]
route: set the deleted fnhe fnhe_daddr to 0 in ip_del_fnhe to fix a race
The race occurs in __mkroute_output() when 2 threads lookup a dst:
CPU A CPU B
find_exception()
find_exception() [fnhe expires]
ip_del_fnhe() [fnhe is deleted]
rt_bind_exception()
In rt_bind_exception() it will bind a deleted fnhe with the new dst, and
this dst will get no chance to be freed. It causes a dev defcnt leak and
consecutive dmesg warnings:
unregister_netdevice: waiting for ethX to become free. Usage count = 1
Especially thanks Jon to identify the issue.
This patch fixes it by setting fnhe_daddr to 0 in ip_del_fnhe() to stop
binding the deleted fnhe with a new dst when checking fnhe's fnhe_daddr
and daddr in rt_bind_exception().
It works as both ip_del_fnhe() and rt_bind_exception() are protected by
fnhe_lock and the fhne is freed by kfree_rcu().
Fixes: 792e31150fe8 ("route: check and remove route cache when we get route") Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: f8baa9c29c38 ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Arvid Brodin <arvid.brodin@alten.se> Signed-off-by: David S. Miller <davem@davemloft.net>
The simple vNIC mailbox length should be 12 decimal and not 0x12.
Using a decimal also makes it clear this is a length value and not
another field within the simple mailbox defines.
Found by code inspection, there are no known firmware configurations
where this would cause issues.
Fixes: bd635c253910 ("nfp: read mailbox address from TLV caps") Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: atm: Add another IS_ENABLED(CONFIG_COMPAT) in atm_dev_ioctl
I removed compat's universal assignment to 0, which allows this if
statement to fall through when compat is passed with a value other
than 0.
Fixes: 392935addfb5 ("net: atm: Use IS_ENABLED in atm_dev_ioctl") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: stmmac: Avoid sometimes uninitialized Clang warnings
When building with -Wsometimes-uninitialized, Clang warns:
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:495:3: warning: variable 'ns' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:495:3: warning: variable 'ns' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized]
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:532:3: warning: variable 'ns' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:532:3: warning: variable 'ns' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized]
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:741:3: warning: variable 'sec_inc' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:741:3: warning: variable 'sec_inc' is used uninitialized whenever '&&' condition is false [-Wsometimes-uninitialized]
Clang is concerned with the use of stmmac_do_void_callback (which
stmmac_get_timestamp and stmmac_config_sub_second_increment wrap),
as it may fail to initialize these values if the if condition was ever
false (meaning the callbacks don't exist). It's not wrong because the
callbacks (get_timestamp and config_sub_second_increment respectively)
are the ones that initialize the variables. While it's unlikely that the
callbacks are ever going to disappear and make that condition false, we
can easily avoid this warning by zero initialize the variables.
Link: https://github.com/ClangBuiltLinux/linux/issues/384 Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
When building with -Wsometimes-uninitialized, Clang warns:
net/atm/resources.c:256:6: warning: variable 'number' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
net/atm/resources.c:212:7: warning: variable 'iobuf_len' is used uninitialized whenever 'if' condition is true [-Wsometimes-uninitialized]
Clang won't realize that compat is 0 when CONFIG_COMPAT is not set until
the constant folding stage, which happens after this semantic analysis.
Use IS_ENABLED instead so that the zero is present at the semantic
analysis stage, which eliminates this warning.
qede: Fix internal loopback failure with jumbo mtu configuration
Driver uses port-mtu as packet-size for the loopback traffic. This patch
limits the max packet size to 1.5K to avoid data being split over multiple
buffer descriptors (BDs) in cases where MTU > PAGE_SIZE.
Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Thu, 7 Mar 2019 15:52:24 +0000 (16:52 +0100)]
enic: fix build warning without CONFIG_CPUMASK_OFFSTACK
The enic driver relies on the CONFIG_CPUMASK_OFFSTACK feature to
dynamically allocate a struct member, but this is normally intended for
local variables.
Building with clang, I get a warning for a few locations that check the
address of the cpumask_var_t:
drivers/net/ethernet/cisco/enic/enic_main.c:122:22: error: address of array 'enic->msix[i].affinity_mask' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
As far as I can tell, the code is still correct, as the truth value of
the pointer is what we need in this configuration. To get rid of
the warning, use cpumask_available() instead of checking the
pointer directly.
Fixes: d09d0b936d49 ("enic: assign affinity hint to interrupts") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Thu, 7 Mar 2019 10:31:55 +0000 (11:31 +0100)]
peak_usb: fix clang build warning
Clang points out undefined behavior when building the pcan_usb_pro driver:
drivers/net/can/usb/peak_usb/pcan_usb_pro.c:136:15: error: passing an object that undergoes default argument promotion to 'va_start' has undefined behavior [-Werror,-Wvarargs]
Changing the function prototype to avoid argument promotion in the
varargs call avoids the warning, and should make this well-defined.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Masaru Nagai [Thu, 7 Mar 2019 10:24:47 +0000 (11:24 +0100)]
ravb: Decrease TxFIFO depth of Q3 and Q2 to one
Hardware has the CBS (Credit Based Shaper) which affects only Q3
and Q2. When updating the CBS settings, even if the driver does so
after waiting for Tx DMA finished, there is a possibility that frame
data still remains in TxFIFO.
To avoid this, decrease TxFIFO depth of Q3 and Q2 to one.
This patch has been exercised this using netperf TCP_MAERTS, TCP_STREAM
and UDP_STREAM tests run on an Ebisu board. No performance change was
detected, outside of noise in the tests, both in terms of throughput and
CPU utilisation.
Fixes: bfafa9ee6042 ("Renesas Ethernet AVB driver proper") Signed-off-by: Masaru Nagai <masaru.nagai.vx@renesas.com> Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
[simon: updated changelog] Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Thu, 7 Mar 2019 09:32:07 +0000 (10:32 +0100)]
isdn: isdnloop: fix pointer dereference bug
clang has spotted an ancient code bug and warns about it with:
drivers/isdn/isdnloop/isdnloop.c:573:12: error: address of array 'card->rcard' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
This is an array of pointers, so we should check if a specific
pointer exists in the array before using it, not whether the
array itself exists.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Thu, 7 Mar 2019 09:31:20 +0000 (10:31 +0100)]
davinci_emac: always build in CONFIG_OF code
clang warns about what seems to be an unintended use of an obscure C
language feature where a forward declaration of an array remains usable
when the final definition is never seen:
drivers/net/ethernet/ti/davinci_emac.c:1694:34: error: tentative array definition assumed to have one element [-Werror]
static const struct of_device_id davinci_emac_of_match[];
There is no harm in always enabling the device tree matching code here,
and it makes the code behave in a more conventional way aside from
avoiding the warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Mon, 4 Mar 2019 20:08:53 +0000 (21:08 +0100)]
bpf: fix replace_map_fd_with_map_ptr's ldimm64 second imm field
Non-zero imm value in the second part of the ldimm64 instruction for
BPF_PSEUDO_MAP_FD is invalid, and thus must be rejected. The map fd
only ever sits in the first instructions' imm field. None of the BPF
loaders known to us are using it, so risk of regression is minimal.
For clarity and consistency, the few insn->{src_reg,imm} occurrences
are rewritten into insn[0].{src_reg,imm}. Add a test case to the BPF
selftest suite as well.
Fixes: 2aadee160a84 ("bpf: handle pseudo BPF_LD_IMM64 insn") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Jakub Sitnicki [Thu, 7 Mar 2019 10:35:43 +0000 (11:35 +0100)]
bpf: Stop the psock parser before canceling its work
We might have never enabled (started) the psock's parser, in which case it
will not get stopped when destroying the psock. This leads to a warning
when trying to cancel parser's work from psock's deferred destructor:
Stop psock's parser just before canceling its work.
Fixes: 3ae66b8b2e02 ("sk_msg: Always cancel strp work before freeing the psock") Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
selftests: bpf: test_progs: initialize duration in singal_pending test
CHECK macro implicitly uses duration. We call CHECK() a couple of times
before duration is initialized from bpf_prog_test_run().
Explicitly set duration to 0 to avoid compiler warnings.
Fixes: a41d5304470c ("selftests/bpf: make sure signal interrupts BPF_PROG_TEST_RUN") Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
libbpf: force fixdep compilation at the start of the build
libbpf targets don't explicitly depend on fixdep target, so when
we do 'make -j$(nproc)', there is a high probability, that some
objects will be built before fixdep binary is available.
Fix this by running sub-make; this makes sure that fixdep dependency
is properly accounted for.
For the same issue in perf, see commit 358acc70013c ("perf tools: Force
fixdep compilation at the start of the build").
Auto-detecting system features:
... libelf: [ on ]
... bpf: [ on ]
HOSTCC /tmp/bld/fixdep.o
CC /tmp/bld/libbpf.o
CC /tmp/bld/bpf.o
CC /tmp/bld/btf.o
CC /tmp/bld/nlattr.o
CC /tmp/bld/libbpf_errno.o
CC /tmp/bld/str_error.o
CC /tmp/bld/netlink.o
CC /tmp/bld/bpf_prog_linfo.o
CC /tmp/bld/libbpf_probes.o
CC /tmp/bld/xsk.o
HOSTLD /tmp/bld/fixdep-in.o
LINK /tmp/bld/fixdep
LD /tmp/bld/libbpf-in.o
LINK /tmp/bld/libbpf.a
LINK /tmp/bld/libbpf.so
LINK /tmp/bld/test_libbpf
$ head /tmp/bld/.libbpf.o.cmd
# cannot find fixdep (/usr/local/google/home/sdf/src/linux/xxx//fixdep)
# using basic dep data
Auto-detecting system features:
... libelf: [ on ]
... bpf: [ on ]
HOSTCC /tmp/bld/fixdep.o
HOSTLD /tmp/bld/fixdep-in.o
LINK /tmp/bld/fixdep
CC /tmp/bld/libbpf.o
CC /tmp/bld/bpf.o
CC /tmp/bld/nlattr.o
CC /tmp/bld/btf.o
CC /tmp/bld/libbpf_errno.o
CC /tmp/bld/str_error.o
CC /tmp/bld/netlink.o
CC /tmp/bld/bpf_prog_linfo.o
CC /tmp/bld/libbpf_probes.o
CC /tmp/bld/xsk.o
LD /tmp/bld/libbpf-in.o
LINK /tmp/bld/libbpf.a
LINK /tmp/bld/libbpf.so
LINK /tmp/bld/test_libbpf
Fixes: 430ef4dda010 ("tools build: Build fixdep helper from perf and basic libs") Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
selftests: bpf: fix compilation with out-of-tree $(OUTPUT)
A bunch of related changes lumped together:
* Create prog_tests and verifier output directories; these don't exist with
out-of-tree $(OUTPUT)
* Add missing -I (via separate TEST_{PROGS,VERIFIER}_CFLAGS) for the main tree
($(PWD) != $(OUTPUT) for out-of-tree)
* Add libbpf.a dependency for test_progs_32 (parallel make fails otherwise)
* Add missing "; \" after "cd" when generating test.h headers
Tested by:
$ alias m="make -s -j$(nproc)"
$ m -C tools/testing/selftests/bpf/ clean
$ m -C tools/lib/bpf/ clean
$ rm -rf xxx; mkdir xxx; m -C tools/testing/selftests/bpf/ OUTPUT=$PWD/xxx
$ m -C tools/testing/selftests/bpf/
Fixes: 7341118fc0a9 ("selftests: bpf: break up test_progs - preparations") Fixes: 65815c88ad9c ("selftests: bpf: prepare for break up of verifier tests") Fixes: 32298c4635c3 ("selftests: bpf: makefile support sub-register code-gen test mode") Signed-off-by: Stanislav Fomichev <sdf@google.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Peter Oskolkov [Tue, 5 Mar 2019 00:27:09 +0000 (16:27 -0800)]
selftests/bpf: test that GSO works in lwt_ip_encap
Add a test on egress that a large TCP packet successfully goes through
the lwt+bpf encap tunnel.
Although there is no direct evidence that GSO worked, as opposed to
e.g. TCP segmentation or IP fragmentation (maybe a kernel stats counter
should be added to track the number of failed GSO attempts?), without
the previous patch in the patchset this test fails, and printk-debugging
showed that software-based GSO succeeded here (veth is not compatible with
SKB_GSO_DODGY, so GSO happens in the software stack).
Also removed an unnecessary nodad and added a missed failed flag.
Signed-off-by: Peter Oskolkov <posk@google.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Peter Oskolkov [Tue, 5 Mar 2019 00:27:08 +0000 (16:27 -0800)]
net: fix GSO in bpf_lwt_push_ip_encap
GSO needs inner headers and inner protocol set properly to work.
skb->inner_mac_header: skb_reset_inner_headers() assigns the current
mac header value to inner_mac_header; but it is not set at the point,
so we need to call skb_reset_inner_mac_header, otherwise gre_gso_segment
fails: it does
int tnl_hlen = skb_inner_mac_header(skb) - skb_transport_header(skb);
...
if (unlikely(!pskb_may_pull(skb, tnl_hlen)))
...
skb->inner_protocol should also be correctly set.
Fixes: 14856db5267f ("bpf: handle GSO in bpf_lwt_push_encap") Signed-off-by: Peter Oskolkov <posk@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Fixes: b584411a5755 ("xsk: add sock_diag interface for AF_XDP") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Björn Töpel <bjorn.topel@intel.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
BPF can adjust gso only for tcp bytestreams. Fail on other gso types.
But only on gso packets. It does not touch this field if !gso_size.
Fixes: 4274a89dd3cb ("bpf: only adjust gso_size on bytestream protocols") Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Arnd Bergmann [Mon, 4 Mar 2019 20:34:12 +0000 (21:34 +0100)]
bpf: fix sysctl.c warning
When CONFIG_BPF_SYSCALL or CONFIG_SYSCTL is disabled, we get
a warning about an unused function:
kernel/sysctl.c:3331:12: error: 'proc_dointvec_minmax_bpf_stats' defined but not used [-Werror=unused-function]
static int proc_dointvec_minmax_bpf_stats(struct ctl_table *table, int write,
The CONFIG_BPF_SYSCALL check was already handled, but the SYSCTL check
is needed on top.
Fixes: 5dceb13db0ac ("bpf: enable program stats") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Christian Brauner <christian@brauner.io> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
tcp: do not report TCP_CM_INQ of 0 for closed connections
Returning 0 as inq to userspace indicates there is no more data to
read, and the application needs to wait for EPOLLIN. For a connection
that has received FIN from the remote peer, however, the application
must continue reading until getting EOF (return value of 0
from tcp_recvmsg) or an error, if edge-triggered epoll (EPOLLET) is
being used. Otherwise, the application will never receive a new
EPOLLIN, since there is no epoll edge after the FIN.
Return 1 when there is no data left on the queue but the
connection has received FIN, so that the applications continue
reading.
Fixes: 8bef561cb4324 (tcp: send in-queue bytes in cmsg upon read) Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Mao Wenan [Wed, 6 Mar 2019 14:45:01 +0000 (22:45 +0800)]
net: hsr: fix memory leak in hsr_dev_finalize()
If hsr_add_port(hsr, hsr_dev, HSR_PT_MASTER) failed to
add port, it directly returns res and forgets to free the node
that allocated in hsr_create_self_node(), and forgets to delete
the node->mac_list linked in hsr->self_node_db.
Fixes: aefd34382d1c ("net/hsr: Use list_head (and rcu) instead of array for slave devices.") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Mao Wenan <maowenan@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov [Wed, 6 Mar 2019 14:22:12 +0000 (16:22 +0200)]
net: sched: flower: insert new filter to idr after setting its mask
When adding new filter to flower classifier, fl_change() inserts it to
handle_idr before initializing filter extensions and assigning it a mask.
Normally this ordering doesn't matter because all flower classifier ops
callbacks assume rtnl lock protection. However, when filter has an action
that doesn't have its kernel module loaded, rtnl lock is released before
call to request_module(). During this time the filter can be accessed bu
concurrent task before its initialization is completed, which can lead to a
crash.
Example case of NULL pointer dereference in concurrent dump:
Extension initialization and mask assignment don't depend on fnew->handle
that is allocated by idr_alloc_u32(). Move idr allocation code after action
creation and mask assignment in fl_change() to prevent concurrent access
to not fully initialized filter when rtnl lock is released to load action
module.
Fixes: 1b01bde0f5c0 ("net: sched: refactor flower walk to iterate over idr") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Vasily Averin [Wed, 6 Mar 2019 11:10:22 +0000 (14:10 +0300)]
tcp: detecting the misuse of .sendpage for Slab objects
sendpage was not designed for processing of the Slab pages,
in some situations it can trigger BUG_ON on receiving side.
Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Wed, 6 Mar 2019 10:52:36 +0000 (11:52 +0100)]
appletalk: Fix compile regression
A bugfix just broke compilation of appletalk when CONFIG_SYSCTL
is disabled:
In file included from net/appletalk/ddp.c:65:
net/appletalk/ddp.c: In function 'atalk_init':
include/linux/atalk.h:164:34: error: expected expression before 'do'
#define atalk_register_sysctl() do { } while(0)
^~
net/appletalk/ddp.c:1934:7: note: in expansion of macro 'atalk_register_sysctl'
rc = atalk_register_sysctl();
This is easier to avoid by using conventional inline functions
as stubs rather than macros. The header already has inline
functions for other purposes, so I'm changing over all the
macros for consistency.
Fixes: 27a2e8d36bc2 ("appletalk: Fix use-after-free in atalk_proc_exit") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
I confirmed that the dst entry also has dst->input set to
dst_md_discard, so it looks like it's an entry that's been
initialized via __metadata_dst_init alright.
I think the fix here is to use skb_valid_dst(skb) - it checks
for DST_METADATA also, and with that fix in place, the
problem - which was previously 100% reproducible - disappears.
The below patch resolves the panic and all bpf tunnel tests pass
without incident.
Fixes: 07215f8312e4 ("ip_tunnel: Add tnl_update_pmtu in ip_md_tunnel_xmit") Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Anders Roxell <anders.roxell@linaro.org> Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Tested-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Wed, 6 Mar 2019 09:42:53 +0000 (10:42 +0100)]
ipv4/route: fail early when inet dev is missing
If a non local multicast packet reaches ip_route_input_rcu() while
the ingress device IPv4 private data (in_dev) is NULL, we end up
doing a NULL pointer dereference in IN_DEV_MFORWARD().
Since the later call to ip_route_input_mc() is going to fail if
!in_dev, we can fail early in such scenario and avoid the dangerous
code path.
v1 -> v2:
- clarified the commit message, no code changes
Reported-by: Tianhao Zhao <tizhao@redhat.com> Fixes: 00ba4bc64bbc ("net: Enable support for VRF with ipv4 multicast") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Wed, 6 Mar 2019 08:12:34 +0000 (11:12 +0300)]
net: hns3: Fix a logical vs bitwise typo
There were a couple logical ORs accidentally mixed in with the bitwise
ORs.
Fixes: f6aacac4b250 ("net: hns3: remove hnae3_get_bit in data path") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
It should not call dst_cache_destroy before dst_release
Fixes: 907dfb4efbbf ("net/sched: act_tunnel_key: Add dst_cache support") Signed-off-by: wenxu <wenxu@ucloud.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
Erik Hugne [Mon, 4 Mar 2019 22:26:10 +0000 (23:26 +0100)]
tipc: fix RDM/DGRAM connect() regression
Fix regression bug introduced in
commit ca92a64e5fff ("tipc: reduce risk of user starvation during link
congestion")
Only signal -EDESTADDRREQ for RDM/DGRAM if we don't have a cached
sockaddr.
Fixes: ca92a64e5fff ("tipc: reduce risk of user starvation during link congestion") Signed-off-by: Erik Hugne <erik.hugne@gmail.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 5 Mar 2019 19:28:25 +0000 (11:28 -0800)]
Merge tag 'mips_5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS updates from Paul Burton:
- Support for the MIPSr6 MemoryMapID register & Global INValidate TLB
(GINVT) instructions, allowing for more efficient TLB maintenance
when running on a CPU such as the I6500 that supports these.
- Enable huge page support for MIPS64r6.
- Optimize post-DMA cache sync by removing that code entirely for
kernel configurations in which we know it won't be needed.
- The number of pages allocated for interrupt stacks is now calculated
correctly, where before we would wastefully allocate too much memory
in some configurations.
- The ath79 platform migrates to devicetree.
- The bcm47xx platform sees fixes for the Buffalo WHR-G54S board.
- The ingenic/jz4740 platform gains support for appended devicetrees.
- The cavium_octeon, lantiq, loongson32 & sgi-ip27 platforms all see
cleanups as do various pieces of core architecture code.
* tag 'mips_5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (66 commits)
MIPS: lantiq: Remove separate GPHY Firmware loader
MIPS: ingenic: Add support for appended devicetree
MIPS: SGI-IP27: rework HUB interrupts
MIPS: SGI-IP27: do boot CPU init later
MIPS: SGI-IP27: do xtalk scanning later
MIPS: SGI-IP27: use pr_info/pr_emerg and pr_cont to fix output
MIPS: SGI-IP27: clean up bridge access and header files
MIPS: SGI-IP27: get rid of volatile and hubreg_t
MIPS: irq: Allocate accurate order pages for irq stack
MIPS: dma-noncoherent: Remove bogus condition in dma_sync_phys()
MIPS: eBPF: Remove REG_32BIT_ZERO_EX
MIPS: eBPF: Always return sign extended 32b values
MIPS: CM: Fix indentation
MIPS: BCM47XX: Fix/improve Buffalo WHR-G54S support
MIPS: OCTEON: program rx/tx-delay always from DT
MIPS: OCTEON: delete board-specific link status
MIPS: OCTEON: don't lie about interface type of CN3005 board
MIPS: OCTEON: warn if deprecated link status is being used
MIPS: OCTEON: add fixed-link nodes to in-kernel device tree
MIPS: Delete unused flush_cache_sigtramp()
...
Linus Torvalds [Tue, 5 Mar 2019 19:17:23 +0000 (11:17 -0800)]
Merge branch 'parisc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
"The most important changes in this patch set are:
- DMA-related cleanups for parisc with the aim to move anything not
required by drivers out of <asm/dma-mapping.h>, by Christoph
Hellwig
- Switch to memblock_alloc(), by Mike Rapoport
- Makefile cleanups by Masahiro Yamada
- Switch to bust_spinlocks(), by Sergey Senozhatsky
- Improved initial SMP affinity selection for IRQs
- Added IPI- and rescheduling interrupts in /proc/interrupts output"
* 'parisc-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: (21 commits)
parisc: use memblock_alloc() instead of custom get_memblock()
parisc: Add constants for various PDC firmware calls
parisc: Add constant for PDC_PAT_COMPLEX firmware call
parisc: Show machine product number during boot
parisc: Add constants for PDC_RELOCATE PDC call
parisc: Add PDC_CRASH_PREP PDC function number
parisc: Use F_EXTEND() macro in iosapic code
parisc: remove the HBA_DATA macro
parisc/lba_pci: use container_of in LBA_DEV
parisc/dino: use container_of in DINO_DEV
parisc: properly type the return value of parisc_walk_tree
parisc: properly type the iommu field in struct pci_hba_data
parisc: turn GET_IOC into an inline function
parisc: move internal implementation details out of <asm/dma-mapping.h>
parisc: don't include <asm/cacheflush.h> in <asm/dma-mapping.h>
parisc: remove meaningless ccflags-y in arch/parisc/boot/Makefile
parisc: replace oops_in_progress manipulation with bust_spinlocks()
parisc: Improve initial IRQ to CPU assignment
parisc: Count IPI function call interrupts
parisc: Show rescheduling interrupts on SMP machines only
...
Linus Torvalds [Tue, 5 Mar 2019 19:13:10 +0000 (11:13 -0800)]
Merge tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
- A copy of Arnds compat wrapper generation series
- Pass information about the KVM guest to the host in form the control
program code and the control program version code
- Map IOV resources to support PCI physical functions on s390
- Add vector load and store alignment hints to improve performance
- Use the "jdd" constraint with gcc 9 to make jump labels working again
- Remove amode workaround for old z/VM releases from the DCSS code
- Add support for in-kernel performance measurements using the CPU
measurement counter facility
- Introduce a new PMU device cpum_cf_diag to capture counters and store
thenn as event raw data.
- Bug fixes and cleanups
* tag 's390-5.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (54 commits)
Revert "s390/cpum_cf: Add kernel message exaplanations"
s390/dasd: fix read device characteristic with CONFIG_VMAP_STACK=y
s390/suspend: fix prefix register reset in swsusp_arch_resume
s390: warn about clearing als implied facilities
s390: allow overriding facilities via command line
s390: clean up redundant facilities list setup
s390/als: remove duplicated in-place implementation of stfle
s390/cio: Use cpa range elsewhere within vfio-ccw
s390/cio: Fix vfio-ccw handling of recursive TICs
s390: vfio_ap: link the vfio_ap devices to the vfio_ap bus subsystem
s390/cpum_cf: Handle EBUSY return code from CPU counter facility reservation
s390/cpum_cf: Add kernel message exaplanations
s390/cpum_cf_diag: Add support for s390 counter facility diagnostic trace
s390/cpum_cf: add ctr_stcctm() function
s390/cpum_cf: move common functions into a separate file
s390/cpum_cf: introduce kernel_cpumcf_avail() function
s390/cpu_mf: replace stcctm5() with the stcctm() function
s390/cpu_mf: add store cpu counter multiple instruction support
s390/cpum_cf: Add minimal in-kernel interface for counter measurements
s390/cpum_cf: introduce kernel_cpumcf_alert() to obtain measurement alerts
...
Linus Torvalds [Tue, 5 Mar 2019 19:02:12 +0000 (11:02 -0800)]
Merge tag 'm68k-for-v5.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k updates from Geert Uytterhoeven:
- VLA removal
- gcc-8.x build fixes
- small improvements and cleanups
- defconfig updates
* tag 'm68k-for-v5.1-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Add -ffreestanding to CFLAGS
m68k/apollo: Fix comment in Makefile
dio: Fix buffer overflow in case of unknown board
m68k/defconfig: Update defconfigs for v5.0-rc1
m68k/atari: Avoid VLA use in atari_switches_setup()
m68k: Avoid VLA use in mangle_kernel_stack()
m68k/mac: Use '030 reset method on SE/30
m68k/mac: Remove obsolete comment
m68k/mac: Skip VIA port setup unless RTC is connected
m68k/mac: Clean up unused timer definitions
m68k/defconfig: Drop NET_VENDOR_<FOO>=n
Borislav Petkov [Tue, 5 Mar 2019 14:47:51 +0000 (15:47 +0100)]
x86: Deprecate a.out support
Linux supports ELF binaries for ~25 years now. a.out coredumping has
bitrotten quite significantly and would need some fixing to get it into
shape again but considering how even the toolchains cannot create a.out
executables in its default configuration, let's deprecate a.out support
and remove it a couple of releases later, instead.
Signed-off-by: Borislav Petkov <bp@suse.de> Acked-by: Richard Weinberger <richard@nod.at> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Jann Horn <jannh@google.com> Cc: <linux-api@vger.kernel.org> Cc: <linux-fsdevel@vger.kernel.org> Cc: lkml <linux-kernel@vger.kernel.org> Cc: Matthew Wilcox <willy@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <x86@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 5 Mar 2019 18:00:35 +0000 (10:00 -0800)]
a.out: remove core dumping support
We're (finally) phasing out a.out support for good. As Borislav Petkov
points out, we've supported ELF binaries for about 25 years by now, and
coredumping in particular has bitrotted over the years.
None of the tool chains even support generating a.out binaries any more,
and the plan is to deprecate a.out support entirely for the kernel. But
I want to start with just removing the core dumping code, because I can
still imagine that somebody actually might want to support a.out as a
simpler biinary format.
Particularly if you generate some random binaries on the fly, ELF is a
much more complicated format (admittedly ELF also does have a lot of
toolchain support, mitigating that complexity a lot and you really
should have moved over in the last 25 years).
So it's at least somewhat possible that somebody out there has some
workflow that still involves generating and running a.out executables.
In contrast, it's very unlikely that anybody depends on debugging any
legacy a.out core files. But regardless, I want this phase-out to be
done in two steps, so that we can resurrect a.out support (if needed)
without having to resurrect the core file dumping that is almost
certainly not needed.
Jann Horn pointed to the <asm/a.out-core.h> file that my first trivial
cut at this had missed.
And Alan Cox points out that the a.out binary loader _could_ be done in
user space if somebody wants to, but we might keep just the loader in
the kernel if somebody really wants it, since the loader isn't that big
and has no really odd special cases like the core dumping does.
Acked-by: Borislav Petkov <bp@alien8.de> Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk> Cc: Jann Horn <jannh@google.com> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Tue, 5 Mar 2019 17:09:55 +0000 (09:09 -0800)]
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
"API:
- Add helper for simple skcipher modes.
- Add helper to register multiple templates.
- Set CRYPTO_TFM_NEED_KEY when setkey fails.
- Require neither or both of export/import in shash.
- AEAD decryption test vectors are now generated from encryption
ones.
- New option CONFIG_CRYPTO_MANAGER_EXTRA_TESTS that includes random
fuzzing.
Algorithms:
- Conversions to skcipher and helper for many templates.
- Add more test vectors for nhpoly1305 and adiantum.
Drivers:
- Add crypto4xx prng support.
- Add xcbc/cmac/ecb support in caam.
- Add AES support for Exynos5433 in s5p.
- Remove sha384/sha512 from artpec7 as hardware cannot do partial
hash"
[ There is a merge of the Freescale SoC tree in order to pull in changes
required by patches to the caam/qi2 driver. ]
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (174 commits)
crypto: s5p - add AES support for Exynos5433
dt-bindings: crypto: document Exynos5433 SlimSSS
crypto: crypto4xx - add missing of_node_put after of_device_is_available
crypto: cavium/zip - fix collision with generic cra_driver_name
crypto: af_alg - use struct_size() in sock_kfree_s()
crypto: caam - remove redundant likely/unlikely annotation
crypto: s5p - update iv after AES-CBC op end
crypto: x86/poly1305 - Clear key material from stack in SSE2 variant
crypto: caam - generate hash keys in-place
crypto: caam - fix DMA mapping xcbc key twice
crypto: caam - fix hash context DMA unmap size
hwrng: bcm2835 - fix probe as platform device
crypto: s5p-sss - Use AES_BLOCK_SIZE define instead of number
crypto: stm32 - drop pointless static qualifier in stm32_hash_remove()
crypto: chelsio - Fixed Traffic Stall
crypto: marvell - Remove set but not used variable 'ivsize'
crypto: ccp - Update driver messages to remove some confusion
crypto: adiantum - add 1536 and 4096-byte test vectors
crypto: nhpoly1305 - add a test vector with len % 16 != 0
crypto: arm/aes-ce - update IV after partial final CTR block
...
Pull networking updates from David Miller:
"Here we go, another merge window full of networking and #ebpf changes:
1) Snoop DHCPACKS in batman-adv to learn MAC/IP pairs in the DHCP
range without dealing with floods of ARP traffic, from Linus
Lüssing.
2) Throttle buffered multicast packet transmission in mt76, from
Felix Fietkau.
3) Support adaptive interrupt moderation in ice, from Brett Creeley.
4) A lot of struct_size conversions, from Gustavo A. R. Silva.
5) Add peek/push/pop commands to bpftool, as well as bash completion,
from Stanislav Fomichev.
6) Optimize sk_msg_clone(), from Vakul Garg.
7) Add SO_BINDTOIFINDEX, from David Herrmann.
8) Be more conservative with local resends due to local congestion,
from Yuchung Cheng.
9) Allow vetoing of unsupported VXLAN FDBs, from Petr Machata.
10) Add health buffer support to devlink, from Eran Ben Elisha.
11) Add TXQ scheduling API to mac80211, from Toke Høiland-Jørgensen.
12) Add statistics to basic packet scheduler filter, from Cong Wang.
13) Add GRE tunnel support for mlxsw Spectrum-2, from Nir Dotan.
14) Lots of new IP tunneling forwarding tests, also from Nir Dotan.
15) Add 3ad stats to bonding, from Nikolay Aleksandrov.
16) Lots of probing improvements for bpftool, from Quentin Monnet.
17) Various nfp drive #ebpf JIT improvements from Jakub Kicinski.
18) Allow #ebpf programs to access gso_segs from skb shared info, from
Eric Dumazet.
19) Add sock_diag support for AF_XDP sockets, from Björn Töpel.
20) Support 22260 iwlwifi devices, from Luca Coelho.
21) Use rbtree for ipv6 defragmentation, from Peter Oskolkov.
22) Add JMP32 instruction class support to #ebpf, from Jiong Wang.
23) Add spinlock support to #ebpf, from Alexei Starovoitov.
24) Support 256-bit keys and TLS 1.3 in ktls, from Dave Watson.
25) Add device infomation API to devlink, from Jakub Kicinski.
26) Add new timestamping socket options which are y2038 safe, from
Deepa Dinamani.
27) Add RX checksum offloading for various sh_eth chips, from Sergei
Shtylyov.
28) Flow offload infrastructure, from Pablo Neira Ayuso.
29) Numerous cleanups, improvements, and bug fixes to the PHY layer
and many drivers from Heiner Kallweit.
30) Lots of changes to try and make packet scheduler classifiers run
lockless as much as possible, from Vlad Buslov.
31) Support BCM957504 chip in bnxt_en driver, from Erik Burrows.
32) Add concurrency tests to tc-tests infrastructure, from Vlad
Buslov.
33) Add hwmon support to aquantia, from Heiner Kallweit.
34) Allow 64-bit values for SO_MAX_PACING_RATE, from Eric Dumazet.
And I would be remiss if I didn't thank the various major networking
subsystem maintainers for integrating much of this work before I even
saw it. Alexei Starovoitov, Daniel Borkmann, Pablo Neira Ayuso,
Johannes Berg, Kalle Valo, and many others. Thank you!"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2207 commits)
net/sched: avoid unused-label warning
net: ignore sysctl_devconf_inherit_init_net without SYSCTL
phy: mdio-mux: fix Kconfig dependencies
net: phy: use phy_modify_mmd_changed in genphy_c45_an_config_aneg
net: dsa: mv88e6xxx: add call to mv88e6xxx_ports_cmode_init to probe for new DSA framework
selftest/net: Remove duplicate header
sky2: Disable MSI on Dell Inspiron 1545 and Gateway P-79
net/mlx5e: Update tx reporter status in case channels were successfully opened
devlink: Add support for direct reporter health state update
devlink: Update reporter state to error even if recover aborted
sctp: call iov_iter_revert() after sending ABORT
team: Free BPF filter when unregistering netdev
ip6mr: Do not call __IP6_INC_STATS() from preemptible context
isdn: mISDN: Fix potential NULL pointer dereference of kzalloc
net: dsa: mv88e6xxx: support in-band signalling on SGMII ports with external PHYs
cxgb4/chtls: Prefix adapter flags with CXGB4
net-sysfs: Switch to bitmap_zalloc()
mellanox: Switch to bitmap_zalloc()
bpf: add test cases for non-pointer sanitiation logic
mlxsw: i2c: Extend initialization by querying resources data
...