1) Out of bounds access in xfrm IPSEC policy unlink, from Yue Haibing.
2) Missing length check for esp4 UDP encap, from Sabrina Dubroca.
3) Fix byte order of RX STBC access in mac80211, from Johannes Berg.
4) Inifnite loop in bpftool map create, from Alban Crequy.
5) Register mark fix in ebpf verifier after pkt/null checks, from Paul
Chaignon.
6) Properly use rcu_dereference_sk_user_data in L2TP code, from Eric
Dumazet.
7) Buffer overrun in marvell phy driver, from Andrew Lunn.
8) Several crash and statistics handling fixes to bnxt_en driver, from
Michael Chan and Vasundhara Volam.
9) Several fixes to the TLS layer from Jakub Kicinski (copying negative
amounts of data in reencrypt, reencrypt frag copying, blind nskb->sk
NULL deref, etc).
10) Several UDP GRO fixes, from Paolo Abeni and Eric Dumazet.
11) PID/UID checks on ipv6 flow labels are inverted, from Willem de
Bruijn.
12) Use after free in l2tp, from Eric Dumazet.
13) IPV6 route destroy races, also from Eric Dumazet.
14) SCTP state machine can erroneously run recursively, fix from Xin
Long.
15) Adjust AF_PACKET msg_name length checks, add padding bytes if
necessary. From Willem de Bruijn.
16) Preserve skb_iif, so that forwarded packets have consistent values
even if fragmentation is involved. From Shmulik Ladkani.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (69 commits)
udp: fix GRO packet of death
ipv6: A few fixes on dereferencing rt->from
rds: ib: force endiannes annotation
selftests: fib_rule_tests: print the result and return 1 if any tests failed
ipv4: ip_do_fragment: Preserve skb_iif during fragmentation
net/tls: avoid NULL pointer deref on nskb->sk in fallback
selftests: fib_rule_tests: Fix icmp proto with ipv6
packet: validate msg_namelen in send directly
packet: in recvmsg msg_name return at least sizeof sockaddr_ll
sctp: avoid running the sctp state machine recursively
stmmac: pci: Fix typo in IOT2000 comment
Documentation: fix netdev-FAQ.rst markup warning
ipv6: fix races in ip6_dst_destroy()
l2ip: fix possible use-after-free
appletalk: Set error code if register_snap_client failed
net: dsa: bcm_sf2: fix buffer overflow doing set_rxnfc
rxrpc: Fix net namespace cleanup
ipv6/flowlabel: wait rcu grace period before put_pid()
vrf: Use orig netdev to count Ip6InNoRoutes and a fresh route lookup when sending dest unreach
tcp: add sanity tests in tcp_add_backlog()
...
Linus Torvalds [Thu, 2 May 2019 16:55:04 +0000 (09:55 -0700)]
Merge tag 'for-linus-20190502' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"This is mostly io_uring fixes/tweaks. Most of these were actually done
in time for the last -rc, but I wanted to ensure that everything
tested out great before including them. The code delta looks larger
than it really is, as it's mostly just comment additions/changes.
Outside of the comment additions/changes, this is mostly removal of
unnecessary barriers. In all, this pull request contains:
- Tweak to how we handle errors at submission time. We now post a
completion event if the error occurs on behalf of an sqe, instead
of returning it through the system call. If the error happens
outside of a specific sqe, we return the error through the system
call. This makes it nicer to use and makes the "normal" use case
behave the same as the offload cases. (me)
- Fix for a missing req reference drop from async context (me)
- If an sqe is submitted with RWF_NOWAIT, don't punt it to async
context. Return -EAGAIN directly, instead of using it as a hint to
do async punt. (Stefan)
- Fix notes on barriers (Stefan)
- Remove unnecessary barriers (Stefan)
- Fix potential double free of memory in setup error (Mark)
- Further improve sq poll CPU validation (Mark)
- Fix page allocation warning and leak on buffer registration error
(Mark)
- Fix iov_iter_type() for new no-ref flag (Ming)
- Fix a case where dio doesn't honor bio no-page-ref (Ming)"
* tag 'for-linus-20190502' of git://git.kernel.dk/linux-block:
io_uring: avoid page allocation warnings
iov_iter: fix iov_iter_type
block: fix handling for BIO_NO_PAGE_REF
io_uring: drop req submit reference always in async punt
io_uring: free allocated io_memory once
io_uring: fix SQPOLL cpu validation
io_uring: have submission side sqe errors post a cqe
io_uring: remove unnecessary barrier after unsetting IORING_SQ_NEED_WAKEUP
io_uring: remove unnecessary barrier after incrementing dropped counter
io_uring: remove unnecessary barrier before reading SQ tail
io_uring: remove unnecessary barrier after updating SQ head
io_uring: remove unnecessary barrier before reading cq head
io_uring: remove unnecessary barrier before wq_has_sleeper
io_uring: fix notes on barriers
io_uring: fix handling SQEs requesting NOWAIT
Linus Torvalds [Thu, 2 May 2019 15:29:24 +0000 (08:29 -0700)]
Merge tag 'pci-v5.1-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"I apologize for sending these so late in the cycle. We went back and
forth about how to deal with the unexpected logging of intentional
link state changes and finally decided to just config them off by
default.
- Use shared MSI/MSI-X vector for Link Bandwidth Management (Alex
Williamson)
- Add Kconfig option for Link Bandwidth notification messages (Keith
Busch)"
* tag 'pci-v5.1-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI/LINK: Add Kconfig option (default off)
PCI/portdrv: Use shared MSI/MSI-X vector for Bandwidth Management
PCI: Fix issue with "pci=disable_acs_redir" parameter being ignored
Linus Torvalds [Thu, 2 May 2019 15:27:39 +0000 (08:27 -0700)]
Merge tag 'mtd/fixes-for-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD fix from Richard Weinberger:
"A single regression fix for the marvell nand driver"
* tag 'mtd/fixes-for-5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: rawnand: marvell: Clean the controller state before each operation
Keith Busch [Wed, 1 May 2019 14:29:42 +0000 (08:29 -0600)]
PCI/LINK: Add Kconfig option (default off)
e8303bb7a75c ("PCI/LINK: Report degraded links via link bandwidth
notification") added dmesg logging whenever a link changes speed or width
to a state that is considered degraded. Unfortunately, it cannot
differentiate signal integrity-related link changes from those
intentionally initiated by an endpoint driver, including drivers that may
live in userspace or VMs when making use of vfio-pci. Some GPU drivers
actively manage the link state to save power, which generates a stream of
messages like this:
vfio-pci 0000:07:00.0: 32.000 Gb/s available PCIe bandwidth, limited by 2.5 GT/s x16 link at 0000:00:02.0 (capable of 64.000 Gb/s with 5 GT/s x16 link)
Since we can't distinguish the intentional changes from the signal
integrity issues, leave the reporting turned off by default. Add a Kconfig
option to turn it on if desired.
Eric Dumazet [Thu, 2 May 2019 01:56:28 +0000 (18:56 -0700)]
udp: fix GRO packet of death
syzbot was able to crash host by sending UDP packets with a 0 payload.
TCP does not have this issue since we do not aggregate packets without
payload.
Since dev_gro_receive() sets gso_size based on skb_gro_len(skb)
it seems not worth trying to cope with padded packets.
BUG: KASAN: slab-out-of-bounds in skb_gro_receive+0xf5f/0x10e0 net/core/skbuff.c:3826
Read of size 16 at addr ffff88808893fff0 by task syz-executor612/7889
Memory state around the buggy address: ffff88808893fe80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88808893ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88808893ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^ ffff888088940000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff888088940080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 1 May 2019 21:57:23 +0000 (14:57 -0700)]
Merge tag 'for-v5.1-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull power supply fixes from Sebastian Reichel:
"Two more fixes for the 5.1 cycle.
One division by zero fix in a specific driver and one core workaround
for bad userspace behaviour from systemd regarding uevents. IMHO this
can be considered to be a userspace bug, but the debug messages are
useless anyways
- cpcap-battery: fix a division by zero
- core: fix systemd issue due to log messages produced by uevent"
* tag 'for-v5.1-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
power: supply: sysfs: prevent endless uevent loop with CONFIG_POWER_SUPPLY_DEBUG
power: supply: cpcap-battery: Fix division by zero
Martin KaFai Lau [Tue, 30 Apr 2019 17:45:12 +0000 (10:45 -0700)]
ipv6: A few fixes on dereferencing rt->from
It is a followup after the fix in
commit 9c69a1320515 ("route: Avoid crash from dereferencing NULL rt->from")
rt6_do_redirect():
1. NULL checking is needed on rt->from because a parallel
fib6_info delete could happen that sets rt->from to NULL.
(e.g. rt6_remove_exception() and fib6_drop_pcpu_from()).
2. fib6_info_hold() is not enough. Same reason as (1).
Meaning, holding dst->__refcnt cannot ensure
rt->from is not NULL or rt->from->fib6_ref is not 0.
Instead of using fib6_info_hold_safe() which ip6_rt_cache_alloc()
is already doing, this patch chooses to extend the rcu section
to keep "from" dereference-able after checking for NULL.
inet6_rtm_getroute():
1. NULL checking is also needed on rt->from for a similar reason.
Note that inet6_rtm_getroute() is using RTNL_FLAG_DOIT_UNLOCKED.
Fixes: a68886a69180 ("net/ipv6: Make from in rt6_info rcu protected") Signed-off-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Wei Wang <weiwan@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
While the endiannes is being handled correctly as indicated by the comment
above the offending line - sparse was unhappy with the missing annotation
as be64_to_cpu() expects a __be64 argument. To mitigate this annotation
all involved variables are changed to a consistent __le64 and the
conversion to uint64_t delayed to the call to rds_cong_map_updated().
Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 1 May 2019 20:40:30 +0000 (13:40 -0700)]
Merge tag 'arc-5.1-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"A few minor fixes for ARC.
- regression in memset if line size !64
- avoid panic if PAE and IOC"
* tag 'arc-5.1-final' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: memset: fix build with L1_CACHE_SHIFT != 6
ARC: [hsdk] Make it easier to add PAE40 region to DTB
ARC: PAE40: don't panic and instead turn off hw ioc
Alex Williamson [Mon, 22 Apr 2019 22:43:30 +0000 (16:43 -0600)]
PCI/portdrv: Use shared MSI/MSI-X vector for Bandwidth Management
The Interrupt Message Number in the PCIe Capabilities register (PCIe r4.0,
sec 7.5.3.2) indicates which MSI/MSI-X vector is shared by interrupts
related to the PCIe Capability, including Link Bandwidth Management and
Link Autonomous Bandwidth Interrupts (Link Control, 7.5.3.7), Command
Completed and Hot-Plug Interrupts (Slot Control, 7.5.3.10), and the PME
Interrupt (Root Control, 7.5.3.12).
pcie_message_numbers() checked whether we want to enable PME or Hot-Plug
interrupts but neglected to check for Link Bandwidth Management, so if we
only wanted the Bandwidth Management interrupts, it decided we didn't need
any vectors at all. Then pcie_port_enable_irq_vec() tried to reallocate
zero vectors, which failed, resulting in fallback to INTx.
On some systems, e.g., an X79-based workstation, that INTx seems broken or
not handled correctly, so we got spurious IRQ16 interrupts for Bandwidth
Management events.
Change pcie_message_numbers() so that if we want Link Bandwidth Management
interrupts, we use the shared MSI/MSI-X vector from the PCIe Capabilities
register.
Linus Torvalds [Wed, 1 May 2019 20:03:39 +0000 (13:03 -0700)]
Merge tag 'acpi-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Revert a recent ACPICA change that caused initialization to fail on
systems with Thunderbolt docking stations connected at the init time"
* tag 'acpi-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPICA: Clear status of GPEs before enabling them"
Linus Torvalds [Wed, 1 May 2019 19:19:20 +0000 (12:19 -0700)]
gcc-9: don't warn about uninitialized btrfs extent_type variable
The 'extent_type' variable does seem to be reliably initialized, but
it's _very_ non-obvious, since there's a "goto next" case that jumps
over the normal initialization. That will then always trigger the
"start >= extent_end" test, which will end up never falling through to
the use of that variable.
But the code is certainly not obvious, and the compiler warning looks
reasonable. Make 'extent_type' an int, and initialize it to an invalid
negative value, which seems to be the common pattern in other places.
Hangbin Liu [Tue, 30 Apr 2019 02:46:10 +0000 (10:46 +0800)]
selftests: fib_rule_tests: print the result and return 1 if any tests failed
Fixes: 65b2b4939a64 ("selftests: net: initial fib rule tests") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 1 May 2019 18:07:40 +0000 (11:07 -0700)]
gcc-9: don't warn about uninitialized variable
I'm not sure what made gcc warn about this code now. The 'ret' variable
does end up initialized in all cases, but it's definitely not obvious,
so the compiler is quite reasonable to warn about this.
So just add initialization to make it all much more obvious both to
compilers and to humans.
ipv4: ip_do_fragment: Preserve skb_iif during fragmentation
Previously, during fragmentation after forwarding, skb->skb_iif isn't
preserved, i.e. 'ip_copy_metadata' does not copy skb_iif from given
'from' skb.
As a result, ip_do_fragment's creates fragments with zero skb_iif,
leading to inconsistent behavior.
Assume for example an eBPF program attached at tc egress (post
forwarding) that examines __sk_buff->ingress_ifindex:
- the correct iif is observed if forwarding path does not involve
fragmentation/refragmentation
- a bogus iif is observed if forwarding path involves
fragmentation/refragmentatiom
Fix, by preserving skb_iif during 'ip_copy_metadata'.
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Mark Rutland [Wed, 1 May 2019 15:59:16 +0000 (16:59 +0100)]
io_uring: avoid page allocation warnings
In io_sqe_buffer_register() we allocate a number of arrays based on the
iov_len from the user-provided iov. While we limit iov_len to SZ_1G,
we can still attempt to allocate arrays exceeding MAX_ORDER.
On a 64-bit system with 4KiB pages, for an iov where iov_base = 0x10 and
iov_len = SZ_1G, we'll calculate that nr_pages = 262145. When we try to
allocate a corresponding array of (16-byte) bio_vecs, requiring 4194320
bytes, which is greater than 4MiB. This results in SLUB warning that
we're trying to allocate greater than MAX_ORDER, and failing the
allocation.
Avoid this by using kvmalloc() for allocations dependent on the
user-provided iov_len. At the same time, fix a leak of imu->bvec when
registration fails.
Jakub Kicinski [Mon, 29 Apr 2019 19:19:12 +0000 (12:19 -0700)]
net/tls: avoid NULL pointer deref on nskb->sk in fallback
update_chksum() accesses nskb->sk before it has been set
by complete_skb(), move the init up.
Fixes: e8f69799810c ("net/tls: Add generic NIC offload infrastructure") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Mon, 29 Apr 2019 17:30:09 +0000 (10:30 -0700)]
selftests: fib_rule_tests: Fix icmp proto with ipv6
A recent commit returns an error if icmp is used as the ip-proto for
IPv6 fib rules. Update fib_rule_tests to send ipv6-icmp instead of icmp.
Fixes: 5e1a99eae8499 ("ipv4: Add ICMPv6 support when parse route ipproto") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Mon, 29 Apr 2019 15:53:18 +0000 (11:53 -0400)]
packet: validate msg_namelen in send directly
Packet sockets in datagram mode take a destination address. Verify its
length before passing to dev_hard_header.
Prior to 2.6.14-rc3, the send code ignored sll_halen. This is
established behavior. Directly compare msg_namelen to dev->addr_len.
Change v1->v2: initialize addr in all paths
Fixes: 6b8d95f1795c4 ("packet: validate address length if non-zero") Suggested-by: David Laight <David.Laight@aculab.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Mon, 29 Apr 2019 15:46:55 +0000 (11:46 -0400)]
packet: in recvmsg msg_name return at least sizeof sockaddr_ll
Packet send checks that msg_name is at least sizeof sockaddr_ll.
Packet recv must return at least this length, so that its output
can be passed unmodified to packet send.
This ceased to be true since adding support for lladdr longer than
sll_addr. Since, the return value uses true address length.
Always return at least sizeof sockaddr_ll, even if address length
is shorter. Zero the padding bytes.
Change v1->v2: do not overwrite zeroed padding again. use copy_len.
Fixes: 0fb375fb9b93 ("[AF_PACKET]: Allow for > 8 byte hardware addresses.") Suggested-by: David Laight <David.Laight@aculab.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ming Lei [Fri, 26 Apr 2019 10:45:21 +0000 (18:45 +0800)]
iov_iter: fix iov_iter_type
Commit 875f1d0769cd ("iov_iter: add ITER_BVEC_FLAG_NO_REF flag")
introduces one extra flag of ITER_BVEC_FLAG_NO_REF, and this flag
is stored into iter->type.
However, iov_iter_type() doesn't consider the new added flag, fix
it by masking this flag in iov_iter_type().
Fixes: 875f1d0769cd ("iov_iter: add ITER_BVEC_FLAG_NO_REF flag") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Ming Lei [Fri, 26 Apr 2019 10:45:20 +0000 (18:45 +0800)]
block: fix handling for BIO_NO_PAGE_REF
Commit 399254aaf489211 ("block: add BIO_NO_PAGE_REF flag") introduces
BIO_NO_PAGE_REF, and once this flag is set for one bio, all pages
in the bio won't be get/put during IO.
However, if one bio is submitted via __blkdev_direct_IO_simple(),
even though BIO_NO_PAGE_REF is set, pages still may be put.
Fixes this issue by avoiding to put pages if BIO_NO_PAGE_REF is
set.
Fixes: 399254aaf489211 ("block: add BIO_NO_PAGE_REF flag") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_uring: drop req submit reference always in async punt
If we don't end up actually calling submit in io_sq_wq_submit_work(),
we still need to drop the submit reference to the request. If we
don't, then we can leak the request. This can happen if we race
with ring shutdown while flushing the workqueue for requests that
require use of the mm_struct.
Mark Rutland [Tue, 30 Apr 2019 16:30:21 +0000 (17:30 +0100)]
io_uring: free allocated io_memory once
If io_allocate_scq_urings() fails to allocate an sq_* region, it will
call io_mem_free() for any previously allocated regions, but leave
dangling pointers to these regions in the ctx. Any regions which have
not yet been allocated are left NULL. Note that when returning
-EOVERFLOW, the previously allocated sq_ring is not freed, which appears
to be an unintentional leak.
When io_allocate_scq_urings() fails, io_uring_create() will call
io_ring_ctx_wait_and_kill(), which calls io_mem_free() on all the sq_*
regions, assuming the pointers are valid and not NULL.
This can result in pages being freed multiple times, which has been
observed to corrupt the page state, leading to subsequent fun. This can
also result in virt_to_page() on NULL, resulting in the use of bogus
page addresses, and yet more subsequent fun. The latter can be detected
with CONFIG_DEBUG_VIRTUAL on arm64.
Adding a cleanup path to io_allocate_scq_urings() complicates the logic,
so let's leave it to io_ring_ctx_free() to consistently free these
pointers, and simplify the io_allocate_scq_urings() error paths.
Full splats from before this patch below. Note that the pointer logged
by the DEBUG_VIRTUAL "non-linear address" warning has been hashed, and
is actually NULL.
Mark Rutland [Tue, 30 Apr 2019 12:34:51 +0000 (13:34 +0100)]
io_uring: fix SQPOLL cpu validation
In io_sq_offload_start(), we call cpu_possible() on an unbounded cpu
value from userspace. On v5.1-rc7 on arm64 with
CONFIG_DEBUG_PER_CPU_MAPS, this results in a splat:
WARNING: CPU: 1 PID: 27601 at include/linux/cpumask.h:121 cpu_max_bits_warn include/linux/cpumask.h:121 [inline]
There was an attempt to fix this in commit:
917257daa0fea7a0 ("io_uring: only test SQPOLL cpu after we've verified it")
... by adding a check after the cpu value had been limited to NR_CPU_IDS
using array_index_nospec(). However, this left an unbound check at the
start of the function, for which the warning still fires.
Let's fix this correctly by checking that the cpu value is bound by
nr_cpu_ids before passing it to cpu_possible(). Note that only
nr_cpu_ids of a cpumask are guaranteed to exist at runtime, and
nr_cpu_ids can be significantly smaller than NR_CPUs. For example, an
arm64 defconfig has NR_CPUS=256, while my test VM has 4 vCPUs.
Following the intent from the commit message for 917257daa0fea7a0, the
check is moved under the SQ_AFF branch, which is the only branch where
the cpu values is consumed. The check is performed before bounding the
value with array_index_nospec() so that we don't silently accept bogus
cpu values from userspace, where array_index_nospec() would force these
values to 0.
I suspect we can remove the array_index_nospec() call entirely, but I've
conservatively left that in place, updated to use nr_cpu_ids to match
the prior check.
Fixes: 917257daa0fea7a0 ("io_uring: only test SQPOLL cpu after we've verified it") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: linux-block@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org
Simplied the logic
As it shows, the first sctp_do_sm() running under atomic context (NET_RX
softirq) invoked sctp_primitive_ASCONF() that uses GFP_KERNEL flag later,
and this flag is supposed to be used in non-atomic context only. Besides,
sctp_do_sm() was called recursively, which is not expected.
Vlad tried to fix this recursive call in Commit c0786693404c ("sctp: Fix
oops when sending queued ASCONF chunks") by introducing a new command
SCTP_CMD_SEND_NEXT_ASCONF. But it didn't work as this command is still
used in the first sctp_do_sm() call, and sctp_primitive_ASCONF() will
be called in this command again.
To avoid calling sctp_do_sm() recursively, we send the next queued ASCONF
not by sctp_primitive_ASCONF(), but by sctp_sf_do_prm_asconf() in the 1st
sctp_do_sm() directly.
Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jan Kiszka [Mon, 29 Apr 2019 05:51:44 +0000 (07:51 +0200)]
stmmac: pci: Fix typo in IOT2000 comment
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Randy Dunlap [Mon, 29 Apr 2019 01:10:39 +0000 (18:10 -0700)]
Documentation: fix netdev-FAQ.rst markup warning
Fix ReST underline warning:
./Documentation/networking/netdev-FAQ.rst:135: WARNING: Title underline too short.
Q: I made changes to only a few patches in a patch series should I resend only those changed?
--------------------------------------------------------------------------------------------
Fixes: ffa91253739c ("Documentation: networking: Update netdev-FAQ regarding patches") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
io_uring: have submission side sqe errors post a cqe
Currently we only post a cqe if we get an error OUTSIDE of submission.
For submission, we return the error directly through io_uring_enter().
This is a bit awkward for applications, and it makes more sense to
always post a cqe with an error, if the error happens on behalf of an
sqe.
This changes submission behavior a bit. io_uring_enter() returns -ERROR
for an error, and > 0 for number of sqes submitted. Before this change,
if you wanted to submit 8 entries and had an error on the 5th entry,
io_uring_enter() would return 4 (for number of entries successfully
submitted) and rewind the sqring. The application would then have to
peek at the sqring and figure out what was wrong with the head sqe, and
then skip it itself. With this change, we'll return 5 since we did
consume 5 sqes, and the last sqe (with the error) will result in a cqe
being posted with the error.
This makes the logic easier to handle in the application, and it cleans
up the submission part.
Suggested-by: Stefan Bühler <source@stbuehler.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fixes: a68886a69180 ("net/ipv6: Make from in rt6_info rcu protected") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: David Ahern <dsahern@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Wei Wang <weiwan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Revert "ACPICA: Clear status of GPEs before enabling them"
Revert commit c8b1917c8987 ("ACPICA: Clear status of GPEs before
enabling them") that causes problems with Thunderbolt controllers
to occur if a dock device is connected at init time (the xhci_hcd
and thunderbolt modules crash which prevents peripherals connected
through them from working).
Commit c8b1917c8987 effectively causes commit ecc1165b8b74 ("ACPICA:
Dispatch active GPEs at init time") to get undone, so the problem
addressed by commit ecc1165b8b74 appears again as a result of it.
David S. Miller [Tue, 30 Apr 2019 15:52:17 +0000 (11:52 -0400)]
Merge tag 'wireless-drivers-for-davem-2019-04-30' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 5.1
Third set of fixes for 5.1.
iwlwifi
* fix an oops when creating debugfs entries
* fix bug when trying to capture debugging info while in rfkill
* prevent potential uninitialized memory dumps into debugging logs
* fix some initialization parameters for AX210 devices
* fix an oops with non-MSIX devices
* fix an oops when we receive a packet with bogus lengths
* fix a bug that prevented 5350 devices from working
* fix a small merge damage from the previous series
mwifiex
* fig regression with resume on SDIO
ath10k
* fix locking problem with crashdump
* fix warnings during suspend and resume
Also note that this pull conflicts with net-next. And I want to emphasie
that it's really net-next, so when you pull this to net tree it should
go without conflicts. Stephen reported the conflict here:
In iwlwifi oddly commit 154d4899e411 adds the IS_ERR_OR_NULL() in
wireless-drivers but commit c9af7528c331 removes the whole check in
wireless-drivers-next. The fix is easy, just drop the whole check for
mvmvif->dbgfs_dir in iwlwifi/mvm/debugfs-vif.c, it's unneeded anyway.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'usb-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes for a bunch of warnings/errors that the
syzbot has been finding with it's new-found ability to stress-test the
USB layer.
All of these are tiny, but fix real issues, and are marked for stable
as well. All of these have had lots of testing in linux-next as well"
* tag 'usb-5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: w1 ds2490: Fix bug caused by improper use of altsetting array
USB: yurex: Fix protection fault after device removal
usb: usbip: fix isoc packet num validation in get_pipe
USB: core: Fix bug caused by duplicate interface PM usage counter
USB: dummy-hcd: Fix failure to give back unlinked URBs
USB: core: Fix unterminated string returned by usb_string()
Stefan Bühler [Wed, 24 Apr 2019 21:54:16 +0000 (23:54 +0200)]
io_uring: fix notes on barriers
The application reading the CQ ring needs a barrier to pair with the
smp_store_release in io_commit_cqring, not the barrier after it.
Also a write barrier *after* writing something (but not *before*
writing anything interesting) doesn't order anything, so an smp_wmb()
after writing SQ tail is not needed.
Additionally consider reading SQ head and writing CQ tail in the notes.
Also add some clarifications how the various other fields in the ring
buffers are used.
Signed-off-by: Stefan Bühler <source@stbuehler.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Stefan Bühler [Sat, 27 Apr 2019 18:34:19 +0000 (20:34 +0200)]
io_uring: fix handling SQEs requesting NOWAIT
Not all request types set REQ_F_FORCE_NONBLOCK when they needed async
punting; reverse logic instead and set REQ_F_NOWAIT if request mustn't
be punted.
Signed-off-by: Stefan Bühler <source@stbuehler.de>
Merged with my previous patch for this.
Merge tag 'selinux-pr-20190429' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux
Pull selinux fix from Paul Moore:
"One small patch for the stable folks to fix a problem when building
against the latest glibc.
I'll be honest and say that I'm not really thrilled with the idea of
sending this up right now, but Greg is a little annoyed so here I
figured I would at least send this"
* tag 'selinux-pr-20190429' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
selinux: use kernel linux/socket.h for genheaders and mdp
Fixes: 54652eb12c1b ("l2tp: hold tunnel while looking up sessions in l2tp_netlink") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
appletalk: Set error code if register_snap_client failed
If register_snap_client fails in atalk_init,
error code should be set, otherwise it will
triggers NULL pointer dereference while unloading
module.
Fixes: 9804501fa122 ("appletalk: Fix potential NULL pointer dereference in unregister_snap_client") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The "fs->location" is a u32 that comes from the user in ethtool_set_rxnfc().
We can't pass unclamped values to test_bit() or it results in an out of
bounds access beyond the end of the bitmap.
Fixes: 7318166cacad ("net: dsa: bcm_sf2: Add support for ethtool::rxnfc") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Tue, 30 Apr 2019 07:34:08 +0000 (08:34 +0100)]
rxrpc: Fix net namespace cleanup
In rxrpc_destroy_all_calls(), there are two phases: (1) make sure the
->calls list is empty, emitting error messages if not, and (2) wait for the
RCU cleanup to happen on outstanding calls (ie. ->nr_calls becomes 0).
To avoid taking the call_lock, the function prechecks ->calls and if empty,
it returns to avoid taking the lock - this is wrong, however: it still
needs to go and do the second phase and wait for ->nr_calls to become 0.
Without this, the rxrpc_net struct may get deallocated before we get to the
RCU cleanup for the last calls. This can lead to:
Note the "61" at offset 0x58. This corresponds to the ->nr_calls member of
struct rxrpc_net (which is >9k in size, and thus allocated out of the 16k
slab).
Fix this by flipping the condition on the if-statement, putting the locked
section inside the if-body and dropping the return from there. The
function will then always go on to wait for the RCU cleanup on outstanding
calls.
Fixes: 2baec2c3f854 ("rxrpc: Support network namespacing") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
1) Fix an out-of-bound array accesses in __xfrm_policy_unlink.
From YueHaibing.
2) Reset the secpath on failure in the ESP GRO handlers
to avoid dereferencing an invalid pointer on error.
From Myungho Jung.
3) Add and revert a patch that tried to add rcu annotations
to netns_xfrm. From Su Yanjun.
4) Wait for rcu callbacks before freeing xfrm6_tunnel_spi_kmem.
From Su Yanjun.
5) Fix forgotten vti4 ipip tunnel deregistration.
From Jeremy Sowden:
6) Remove some duplicated log messages in vti4.
From Jeremy Sowden.
7) Don't use IPSEC_PROTO_ANY when flushing states because
this will flush only IPsec portocol speciffic states.
IPPROTO_ROUTING states may remain in the lists when
doing net exit. Fix this by replacing IPSEC_PROTO_ANY
with zero. From Cong Wang.
8) Add length check for UDP encapsulation to fix "Oversized IP packet"
warnings on receive side. From Sabrina Dubroca.
9) Fix xfrm interface lookup when the interface is associated to
a vrf layer 3 master device. From Martin Willi.
10) Reload header pointers after pskb_may_pull() in _decode_session4(),
otherwise we may read from uninitialized memory.
11) Update the documentation about xfrm[46]_gc_thresh, it
is not used anymore after the flowcache removal.
From Nicolas Dichtel.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Memory state around the buggy address: ffff888094012900: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc ffff888094012980: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
>ffff888094012a00: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
^ ffff888094012a80: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc ffff888094012b00: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
Fixes: 4f82f45730c6 ("net ip6 flowlabel: Make owner a union of struct pid * and kuid_t") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
vrf: Use orig netdev to count Ip6InNoRoutes and a fresh route lookup when sending dest unreach
When there is no route to an IPv6 dest addr, skb_dst(skb) points
to loopback dev in the case of that the IP6CB(skb)->iif is
enslaved to a vrf. This causes Ip6InNoRoutes to be incremented on the
loopback dev. This also causes the lookup to fail on icmpv6_send() and
the dest unreachable to not sent and Ip6OutNoRoutes gets incremented on
the loopback dev.
To reproduce:
* Gateway configuration:
ip link add dev vrf_258 type vrf table 258
ip link set dev enp0s9 master vrf_258
ip addr add 66:1/64 dev enp0s9
ip -6 route add unreachable default metric 8192 table 258
sysctl -w net.ipv6.conf.all.forwarding=1
sysctl -w net.ipv6.conf.enp0s9.forwarding=1
* Sender configuration:
ip addr add 66::2/64 dev enp0s9
ip -6 route add default via 66::1
and ping 67::1 for example from the sender.
Fix this by counting on the original netdev and reset the skb dst to
force a fresh lookup.
v2: Fix typo of destination address in the repro steps.
v3: Simplify the loopback check (per David Ahern) and use reverse
Christmas tree format (per David Miller).
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 26 Apr 2019 17:10:05 +0000 (10:10 -0700)]
tcp: add sanity tests in tcp_add_backlog()
Richard and Bruno both reported that my commit added a bug,
and Bruno was able to determine the problem came when a segment
wih a FIN packet was coalesced to a prior one in tcp backlog queue.
It turns out the header prediction in tcp_rcv_established()
looks back to TCP headers in the packet, not in the metadata
(aka TCP_SKB_CB(skb)->tcp_flags)
The fast path in tcp_rcv_established() is not supposed to
handle a FIN flag (it does not call tcp_fin())
Therefore we need to make sure to propagate the FIN flag,
so that the coalesced packet does not go through the fast path,
the same than a GRO packet carrying a FIN flag.
While we are at it, make sure we do not coalesce packets with
RST or SYN, or if they do not have ACK set.
Many thanks to Richard and Bruno for pinpointing the bad commit,
and to Richard for providing a first version of the fix.
Fixes: 4f693b55c3d2 ("tcp: implement coalescing on backlog queue") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Richard Purdie <richard.purdie@linuxfoundation.org> Reported-by: Bruno Prémont <bonbons@sysophe.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Thu, 25 Apr 2019 16:06:54 +0000 (12:06 -0400)]
ipv6: invert flowlabel sharing check in process and user mode
A request for a flowlabel fails in process or user exclusive mode must
fail if the caller pid or uid does not match. Invert the test.
Previously, the test was unsafe wrt PID recycling, but indeed tested
for inequality: fl1->owner != fl->owner
Fixes: 4f82f45730c68 ("net ip6 flowlabel: Make owner a union of struct pid* and kuid_t") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'seccomp-v5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp fixes from Kees Cook:
"Syzbot found a use-after-free bug in seccomp due to flags that should
not be allowed to be used together.
Tycho fixed this, I updated the self-tests, and the syzkaller PoC has
been running for several days without triggering KASan (before this
fix, it would reproduce). These patches have also been in -next for
almost a week, just to be sure.
- Add logic for making some seccomp flags exclusive (Tycho)
- Update selftests for exclusivity testing (Kees)"
* tag 'seccomp-v5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
seccomp: Make NEW_LISTENER and TSYNC flags exclusive
selftests/seccomp: Prepare for exclusive seccomp flags
This doesn't really do anything, but at least we now parse teh
ZERO_PAGE() address argument so that we'll catch the most obvious errors
in usage next time they'll happen.
See commit 6a5c5d26c4c6 ("rdma: fix build errors on s390 and MIPS due to
bad ZERO_PAGE use") what happens when we don't have any use of the macro
argument at all.
rdma: fix build errors on s390 and MIPS due to bad ZERO_PAGE use
The parameter to ZERO_PAGE() was wrong, but since all architectures
except for MIPS and s390 ignore it, it wasn't noticed until 0-day
reported the build error.
Fixes: 67f269b37f9b ("RDMA/ucontext: Fix regression with disassociate") Cc: stable@vger.kernel.org Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Paulo Alcantara [Mon, 25 Feb 2019 00:55:28 +0000 (21:55 -0300)]
selinux: use kernel linux/socket.h for genheaders and mdp
When compiling genheaders and mdp from a newer host kernel, the
following error happens:
In file included from scripts/selinux/genheaders/genheaders.c:18:
./security/selinux/include/classmap.h:238:2: error: #error New
address family defined, please update secclass_map. #error New
address family defined, please update secclass_map. ^~~~~
make[3]: *** [scripts/Makefile.host:107:
scripts/selinux/genheaders/genheaders] Error 1 make[2]: ***
[scripts/Makefile.build:599: scripts/selinux/genheaders] Error 2
make[1]: *** [scripts/Makefile.build:599: scripts/selinux] Error 2
make[1]: *** Waiting for unfinished jobs....
Instead of relying on the host definition, include linux/socket.h in
classmap.h to have PF_MAX.
Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara <paulo@paulo.ac> Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: manually merge in mdp.c, subject line tweaks] Signed-off-by: Paul Moore <paul@paul-moore.com>
ath10k: Drop WARN_ON()s that always trigger during system resume
ath10k_mac_vif_chan() always returns an error for the given vif
during system-wide resume which reliably triggers two WARN_ON()s
in ath10k_bss_info_changed() and they are not particularly
useful in that code path, so drop them.
Tested: QCA6174 hw3.2 PCI with WLAN.RM.2.0-00180-QCARMSWPZ-1
Tested: QCA6174 hw3.2 SDIO with WLAN.RMH.4.4.1-00007-QCARMSWP-1
Fixes: cd93b83ad927 ("ath10k: support for multicast rate control") Fixes: f279294e9ee2 ("ath10k: add support for configuring management packet rate") Cc: stable@vger.kernel.org Reviewed-by: Brian Norris <briannorris@chromium.org> Tested-by: Brian Norris <briannorris@chromium.org> Tested-by: Claire Chang <tientzu@chromium.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Brian Norris [Tue, 26 Mar 2019 20:57:28 +0000 (13:57 -0700)]
ath10k: perform crash dump collection in workqueue
Commit 25733c4e67df ("ath10k: pci: use mutex for diagnostic window CE
polling") introduced a regression where we try to sleep (grab a mutex)
in an atomic context:
(NB: simulated firmware crashes, via debugfs, don't trigger firmware
dumps.)
The fix is to move the crash-dump into a workqueue context, and avoid
relying on 'data_lock' for most mutual exclusion. We only keep using it
here for protecting 'fw_crash_counter', while the rest of the coredump
buffers are protected by a new 'dump_mutex'.
I've tested the above with simulated firmware crashes (debugfs 'reset'
file), real firmware crashes (the 'dd' command above), and a variety of
reboot and suspend/resume configurations on QCA6174A.
Fixes: 25733c4e67df ("ath10k: pci: use mutex for diagnostic window CE polling") Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Nicholas Piggin [Fri, 12 Apr 2019 04:26:13 +0000 (14:26 +1000)]
sched/nohz: Run NOHZ idle load balancer on HK_FLAG_MISC CPUs
The NOHZ idle balancer runs on the lowest idle CPU. This can
interfere with isolated CPUs, so confine it to HK_FLAG_MISC
housekeeping CPUs.
HK_FLAG_SCHED is not used for this because it is not set anywhere
at the moment. This could be folded into HK_FLAG_SCHED once that
option is fixed.
The problem was observed with increased jitter on an application
running on CPU0, caused by NOHZ idle load balancing being run on
CPU1 (an SMT sibling).
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190412042613.28930-1-npiggin@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Jan Kara [Wed, 24 Apr 2019 16:39:57 +0000 (18:39 +0200)]
fsnotify: Fix NULL ptr deref in fanotify_get_fsid()
fanotify_get_fsid() is reading mark->connector->fsid under srcu. It can
happen that it sees mark not fully initialized or mark that is already
detached from the object list. In these cases mark->connector
can be NULL leading to NULL ptr dereference. Fix the problem by
being careful when reading mark->connector and check it for being NULL.
Also use WRITE_ONCE when writing the mark just to prevent compiler from
doing something stupid.
Reported-by: syzbot+15927486a4f1bfcbaf91@syzkaller.appspotmail.com Fixes: 77115225acc6 ("fanotify: cache fsid in fsnotify_mark_connector") Signed-off-by: Jan Kara <jack@suse.cz>
- Fix EFI decompressor entry (ensuring barrier instructions are
enabled prior to use)"
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 8857/1: efi: enable CP15 DMB instructions before cleaning the cache
ARM: 8856/1: NOMMU: Fix CCR register faulty initialization when MPU is disabled
ARM: fix function graph tracer and unwinder dependencies
Merge tag 'powerpc-5.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
"A one-liner to make our Radix MMU support depend on HUGETLB_PAGE. We
use some of the hugetlb inlines (eg. pud_huge()) when operating on the
linear mapping and if they're compiled into empty wrappers we can
corrupt memory.
Then two fixes to our VFIO IOMMU code. The first is not a regression
but fixes the locking to avoid a user-triggerable deadlock.
The second does fix a regression since rc1, and depends on the first
fix. It makes it possible to run guests with large amounts of memory
again (~256GB).
Thanks to Alexey Kardashevskiy"
* tag 'powerpc-5.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm_iommu: Allow pinning large regions
powerpc/mm_iommu: Fix potential deadlock
powerpc/mm/radix: Make Radix require HUGETLB_PAGE
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"One core bug fix and a few driver ones
- FRWR memory registration for hfi1/qib didn't work with with some
iovas causing a NFSoRDMA failure regression due to a fix in the NFS
side
- A command flow error in mlx5 allowed user space to send a corrupt
command (and also smash the kernel stack we've since learned)
- Fix a regression and some bugs with device hot unplug that was
discovered while reviewing Andrea's patches
- hns has a failure if the user asks for certain QP configurations"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/hns: Bugfix for mapping user db
RDMA/ucontext: Fix regression with disassociate
RDMA/mlx5: Use rdma_user_map_io for mapping BAR pages
RDMA/mlx5: Do not allow the user to write to the clock page
IB/mlx5: Fix scatter to CQE in DCT QP creation
IB/rdmavt: Fix frwr memory registration
Kalle Valo [Sun, 28 Apr 2019 11:25:33 +0000 (14:25 +0300)]
Merge tag 'iwlwifi-for-kalle-2019-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes
Fourth batch of patches intended for v5.1
* Fix an oops when we receive a packet with bogus lengths;
* Fix a bug that prevented 5350 devices from working;
* Fix a small merge damage from the previous series;
We introduced a bug that prevented this old device from
working. The driver would simply not be able to complete
the INIT flow while spewing this warning:
iwlwifi: mvm: check for length correctness in iwl_mvm_create_skb()
We don't check for the validity of the lengths in the packet received
from the firmware. If the MPDU length received in the rx descriptor
is too short to contain the header length and the crypt length
together, we may end up trying to copy a negative number of bytes
(headlen - hdrlen < 0) which will underflow and cause us to try to
copy a huge amount of data. This causes oopses such as this one:
Paolo Abeni [Fri, 26 Apr 2019 10:50:44 +0000 (12:50 +0200)]
udp: fix GRO reception in case of length mismatch
Currently, the UDP GRO code path does bad things on some edge
conditions - Aggregation can happen even on packet with different
lengths.
Fix the above by rewriting the 'complete' condition for GRO
packets. While at it, note explicitly that we allow merging the
first packet per burst below gso_size.
Reported-by: Sean Tong <seantong114@gmail.com> Fixes: e20cf8d3f1f7 ("udp: implement GRO for plain UDP sockets.") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 26 Apr 2019 00:35:10 +0000 (17:35 -0700)]
net/tls: fix copy to fragments in reencrypt
Fragments may contain data from other records so we have to account
for that when we calculate the destination and max length of copy we
can perform. Note that 'offset' is the offset within the message,
so it can't be passed as offset within the frag..
Here skb_store_bits() would have realised the call is wrong and
simply not copy data.
Fixes: 4799ac81e52a ("tls: Add rx inline crypto offload") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Fri, 26 Apr 2019 00:35:09 +0000 (17:35 -0700)]
net/tls: don't copy negative amounts of data in reencrypt
There is no guarantee the record starts before the skb frags.
If we don't check for this condition copy amount will get
negative, leading to reads and writes to random memory locations.
Familiar hilarity ensues.
Fixes: 4799ac81e52a ("tls: Add rx inline crypto offload") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
"Just a couple of fixups for Synaptics RMI4 driver and allowing
snvs_pwrkey to be selected on more boards"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: synaptics-rmi4 - write config register values to the right offset
Input: synaptics-rmi4 - fix possible double free
Input: snvs_pwrkey - make it depend on ARCH_MXC
David S. Miller [Sat, 27 Apr 2019 21:00:19 +0000 (17:00 -0400)]
Merge branch 'bnxt_en-Misc-bug-fixes'
Michael Chan says:
====================
bnxt_en: Misc. bug fixes.
6 miscellaneous bug fixes covering several issues in error code paths,
a setup issue for statistics DMA, and an improvement for setting up
multicast address filters.
Please queue these for stable as well.
Patch #5 (bnxt_en: Fix statistics context reservation logic) is for the
most recent 5.0 stable only. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 26 Apr 2019 02:31:55 +0000 (22:31 -0400)]
bnxt_en: Fix uninitialized variable usage in bnxt_rx_pkt().
In bnxt_rx_pkt(), if the driver encounters BD errors, it will recycle
the buffers and jump to the end where the uninitailized variable "len"
is referenced. Fix it by adding a new jump label that will skip
the length update. This is the most correct fix since the length
may not be valid when we get this type of error.
Fixes: 6a8788f25625 ("bnxt_en: add support for software dynamic interrupt moderation") Reported-by: Nathan Chancellor <natechancellor@gmail.com> Cc: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
In an earlier commit that fixes the number of stats contexts to
reserve for the RDMA driver, we added a function parameter to pass in
the number of stats contexts to all the relevant functions. The passed
in parameter should have been used to set the enables field of the
firmware message.
Fixes: 780baad44f0f ("bnxt_en: Reserve 1 stat_ctx for RDMA driver.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 26 Apr 2019 02:31:53 +0000 (22:31 -0400)]
bnxt_en: Pass correct extended TX port statistics size to firmware.
If driver determines that extended TX port statistics are not supported
or allocation of the data structure fails, make sure to pass 0 TX stats
size to firmware to disable it. The firmware returned TX stats size should
also be set to 0 for consistency. This will prevent
bnxt_get_ethtool_stats() from accessing the NULL TX stats pointer in
case there is mismatch between firmware and driver.
Fixes: 36e53349b60b ("bnxt_en: Add additional extended port statistics.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 26 Apr 2019 02:31:52 +0000 (22:31 -0400)]
bnxt_en: Fix possible crash in bnxt_hwrm_ring_free() under error conditions.
If we encounter errors during open and proceed to clean up,
bnxt_hwrm_ring_free() may crash if the rings we try to free have never
been allocated. bnxt_cp_ring_for_rx() or bnxt_cp_ring_for_tx()
may reference pointers that have not been allocated.
Fix it by checking for valid fw_ring_id first before calling
bnxt_cp_ring_for_rx() or bnxt_cp_ring_for_tx().
Fixes: 2c61d2117ecb ("bnxt_en: Add helper functions to get firmware CP ring ID.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
bnxt_en: Free short FW command HWRM memory in error path in bnxt_init_one()
In the bnxt_init_one() error path, short FW command request memory
is not freed. This patch fixes it.
Fixes: e605db801bde ("bnxt_en: Support for Short Firmware Message") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Fri, 26 Apr 2019 02:31:50 +0000 (22:31 -0400)]
bnxt_en: Improve multicast address setup logic.
The driver builds a list of multicast addresses and sends it to the
firmware when the driver's ndo_set_rx_mode() is called. In rare
cases, the firmware can fail this call if internal resources to
add multicast addresses are exhausted. In that case, we should
try the call again by setting the ALL_MCAST flag which is more
guaranteed to succeed.
Fixes: c0c050c58d84 ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
- Fix an early boot crash in the RSDP parsing code by effectively
turning off the parsing call - we ran out of time but want to fix the
regression. The more involved fix is being worked on.
- Fix a crash that can trigger in the kmemlek code.
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix a crash with kmemleak_scan()
x86/boot: Disable RSDP parsing temporarily
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fix from Ingo Molnar:
"A cstate event enumeration fix for Kaby/Coffee Lake CPUs"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Update KBL Package C-state events to also include PC8/PC9/PC10 counters
for the error handling path, and rather than complicate that code, just
make it ok to always free what was returned by the init function.
That's what the code used to do before commit 4ab42d78e37a ("ppp, slip:
Validate VJ compression slot parameters completely") when slhc_init()
just returned NULL for the error case, with no actual indication of the
details of the error.
Reported-by: syzbot+45474c076a4927533d2e@syzkaller.appspotmail.com Fixes: 4ab42d78e37a ("ppp, slip: Validate VJ compression slot parameters completely") Acked-by: Ben Hutchings <ben@decadent.org.uk> Cc: David Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
fs/proc/proc_sysctl.c: Fix a NULL pointer dereference
mm/page_alloc.c: fix never set ALLOC_NOFRAGMENT flag
mm/page_alloc.c: avoid potential NULL pointer dereference
mm, page_alloc: always use a captured page regardless of compaction result
mm: do not boost watermarks to avoid fragmentation for the DISCONTIG memory model
lib/test_vmalloc.c: do not create cpumask_t variable on stack
lib/Kconfig.debug: fix build error without CONFIG_BLOCK
zram: pass down the bvec we need to read into in the work struct
mm/memory_hotplug.c: drop memory device reference after find_memory_block()
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- keep the tail of an unaligned initrd reserved
- adjust ftrace_make_call() to deal with the relative nature of PLTs
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64/module: ftrace: deal with place relative nature of PLTs
arm64: mm: Ensure tail of unaligned initrd is reserved
Merge tag 'trace-v5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Three tracing fixes:
- Use "nosteal" for ring buffer splice pages
- Memory leak fix in error path of trace_pid_write()
- Fix preempt_enable_no_resched() (use preempt_enable()) in ring
buffer code"
* tag 'trace-v5.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
trace: Fix preempt_enable_no_resched() abuse
tracing: Fix a memory leak by early error exit in trace_pid_write()
tracing: Fix buffer_ref pipe ops
Merge tag 'gpio-v5.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"Not much to say about them, regular fixes:
- Fix a bug on the errorpath of gpiochip_add_data_with_key()
- IRQ type setting on the spreadtrum GPIO driver"
* tag 'gpio-v5.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: Fix gpiochip_add_data_with_key() error path
gpio: eic: sprd: Fix incorrect irq type setting for the sync EIC
Merge tag 'drm-fixes-2019-04-26' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Regular drm fixes, nothing too outstanding, I'm guessing Easter was
slowing people down.
i915:
- FEC enable fix
- BXT display lanes fix
ttm:
- fix reinit for reloading drivers regression
imx:
- DP CSC fix
sun4i:
- module unload/load fix
vc4:
- memory leak fix
- compile fix
dw-hdmi:
- rockchip scdc overflow fix
sched:
- docs fix
vmwgfx:
- dma api layering fix"
* tag 'drm-fixes-2019-04-26' of git://anongit.freedesktop.org/drm/drm:
drm/bridge: dw-hdmi: fix SCDC configuration for ddc-i2c-bus
drm/vmwgfx: Fix dma API layer violation
drm/vc4: Fix compilation error reported by kbuild test bot
drm/sun4i: Unbind components before releasing DRM and memory
drm/vc4: Fix memory leak during gpu reset.
drm/sched: Fix description of drm_sched_stop
drm/imx: don't skip DP channel disable for background plane
gpu: ipu-v3: dp: fix CSC handling
drm/ttm: fix re-init of global structures
drm/sun4i: Fix component unbinding and component master deletion
drm/sun4i: Set device driver data at bind time for use in unbind
drm/sun4i: Add missing drm_atomic_helper_shutdown at driver unbind
drm/i915: Restore correct bxt_ddi_phy_calc_lane_lat_optim_mask() calculation
drm/i915: Do not enable FEC without DSC
drm: bridge: dw-hdmi: Fix overflow workaround for Rockchip SoCs
Merge tag 'for-5.1-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
"One patch to fix a crash in io submission path, due to memory
allocation errors.
In short, the multipage bio work that landed in 5.1 caused larger bios
that in turn require larger temporary memory for checksums. The patch
is a workaround, we're going to rework the allocation so it does not
require the vmalloc fallback.
It took a while to identify that it's caused by patches in 5.1 and not
a patchset that did some changes in error handling in the code. I've
tested it on various memory/cpu combinations, it could hit OOM but
does not crash.
The timestamp of the patch is less than a day due to updates in the
changelog, tests were running meanwhile"
* tag 'for-5.1-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: Switch memory allocations in async csum calculation path to kvmalloc