]> git.baikalelectronics.ru Git - kernel.git/log
kernel.git
7 years agoMerge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 7 Dec 2016 19:27:33 +0000 (11:27 -0800)]
Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fixes from Ingo Molnar:
 "Two rtmutex race fixes (which miraculously never triggered, that we
  know of), plus two lockdep printk formatting regression fixes"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lockdep: Fix report formatting
  locking/rtmutex: Use READ_ONCE() in rt_mutex_owner()
  locking/rtmutex: Prevent dequeue vs. unlock race
  locking/selftest: Fix output since KERN_CONT changes

7 years agoMerge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 7 Dec 2016 18:56:00 +0000 (10:56 -0800)]
Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fix from Ingo Molnar:
 "A single late breaking fix for objtool"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Fix bytes check of lea's rex_prefix

7 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi...
Linus Torvalds [Wed, 7 Dec 2016 16:45:23 +0000 (08:45 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse fix from Miklos Szeredi:
 "Fix a regression spotted by Jeff Layton"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: fix clearing suid, sgid for chown()

7 years agoRevert "default exported asm symbols to zero"
Linus Torvalds [Wed, 7 Dec 2016 16:39:00 +0000 (08:39 -0800)]
Revert "default exported asm symbols to zero"

This reverts commit 3ca51e1c81acafc1f80943b387460233684fa50d.

I loved that commit because of how it explained what the problem with
newer versions of binutils were, but the actual patch itself turns out
to not work very well.

It has two problems:

 - a zero CRC value isn't actually right.  It happens to work for the
   case where both sides of the equation fail at giving the symbol a
   crc, but there are cases where the users of the exported symbol get
   the right crc (due to seeing the C declarations), but the actual
   exporting itself does not (due to the whole weak asm symbol issue).

   So then the module load fails after all - we did have a crc for the
   symbol, but we couldn't match it with the loaded module.

 - it seems that the alpha assembler has special semantics for the
   '.set' directive, and on alpha it doesn't actually set the value of
   the specified symbol at all, it is instead used to set various
   assembly modes (eg ".set noat" and ".set noreorder").

   So using ".set" to set the symbol value would just cause build
   failures on alpha.

I'm sure we'll find some other workaround for these issues (hopefully
that involves getting rid of modversions entirely some day, but people
are also talking about just using smarter tools).  But for now we'll
just fall back on commit cbef5e600a58 ("Re-enable CONFIG_MODVERSIONS in
a slightly weaker form") that just let's a missing crc through.

Reported-by: Jan Stancek <jstancek@redhat.com>
Reported-by: Philip Müller <philm@manjaro.org>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoDon't feed anything but regular iovec's to blk_rq_map_user_iov
Linus Torvalds [Wed, 7 Dec 2016 00:18:14 +0000 (16:18 -0800)]
Don't feed anything but regular iovec's to blk_rq_map_user_iov

In theory we could map other things, but there's a reason that function
is called "user_iov".  Using anything else (like splice can do) just
confuses it.

Reported-and-tested-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Linus Torvalds [Tue, 6 Dec 2016 17:24:11 +0000 (09:24 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc

Pull sparc fix from David Miller:
 "A use-before-NULL-check from Dan Carpenter"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  dbri: move dereference after check for NULL

7 years agodbri: move dereference after check for NULL
Dan Carpenter [Thu, 1 Dec 2016 05:48:30 +0000 (08:48 +0300)]
dbri: move dereference after check for NULL

We accidentally introduced a dereference before the NULL check in
xmit_descs() as part of silencing a GCC warning.

Fixes: c0f6f9c95032 ("dbri: Fix compiler warning")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Tue, 6 Dec 2016 17:06:51 +0000 (09:06 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) When dcbnl_cee_fill() fails to be able to push a new netlink
    attribute, it return 0 instead of an error code. From Pan Bian.

 2) Two suffix handling fixes to FIB trie code, from Alexander Duyck.

 3) bnxt_hwrm_stat_ctx_alloc() goes through all the trouble of setting
    and maintaining a return code 'rc' but fails to actually return it.
    Also from Pan Bian.

 4) ping socket ICMP handler needs to validate ICMP header length, from
    Kees Cook.

 5) caif_sktinit_module() has this interesting logic:

        int err = sock_register(...);
        if (!err)
                return err;
        return 0;

    Just return sock_register()'s return value directly which is the
    only possible correct thing to do.

 6) Two bnx2x driver fixes from Yuval Mintz, return a reasonable
    estimate from get_ringparam() ethtool op when interface is down and
    avoid trying to use UDP port based tunneling on 577xx chips.

 7) Fix ep93xx_eth crash on module unload from Florian Fainelli.

 8) Missing uapi exports, from Stephen Hemminger.

 9) Don't schedule work from sk_destruct(), because the socket will be
    freed upon return from that function. From Herbert Xu.

10) Buggy drivers, of which we know there is at least one, can send a
    huge packet into the TCP stack but forget to set the gso_size in the
    SKB, which causes all kinds of problems.

    Correct this when it happens, and emit a one-time warning with the
    device name included so that it can be diagnosed more easily.

    From Marcelo Ricardo Leitner.

11) virtio-net does DMA off the stack causes hiccups with VMAP_STACK,
    fix from Andy Lutomirski.

12) Fix fec driver compilation with CONFIG_M5272, from Nikita
    Yushchenko.

13) mlx5 fixes from Kamal Heib, Saeed Mahameed, and Mohamad Haj Yahia.
    (erroneously flushing queues on error, module parameter validation,
    etc)

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (34 commits)
  net/mlx5e: Change the SQ/RQ operational state to positive logic
  net/mlx5e: Don't flush SQ on error
  net/mlx5e: Don't notify HW when filling the edge of ICO SQ
  net/mlx5: Fix query ISSI flow
  net/mlx5: Remove duplicate pci dev name print
  net/mlx5: Verify module parameters
  net: fec: fix compile with CONFIG_M5272
  be2net: Add DEVSEC privilege to SET_HSW_CONFIG command.
  virtio-net: Fix DMA-from-the-stack in virtnet_set_mac_address()
  tcp: warn on bogus MSS and try to amend it
  uapi glibc compat: fix outer guard of net device flags enum
  net: stmmac: clear reset value of snps, wr_osr_lmt/snps, rd_osr_lmt before writing
  netlink: Do not schedule work from sk_destruct
  uapi: export nf_log.h
  uapi: export tc_skbmod.h
  net: ep93xx_eth: Do not crash unloading module
  bnx2x: Prevent tunnel config for 577xx
  bnx2x: Correct ringparam estimate when DOWN
  isdn: hisax: set error code on failure
  net: bnx2x: fix improper return value
  ...

7 years agoshmem: fix shm fallocate() list corruption
Linus Torvalds [Mon, 5 Dec 2016 20:10:29 +0000 (12:10 -0800)]
shmem: fix shm fallocate() list corruption

The shmem hole punching with fallocate(FALLOC_FL_PUNCH_HOLE) does not
want to race with generating new pages by faulting them in.

However, the wait-queue used to delay the page faulting has a serious
problem: the wait queue head (in shmem_fallocate()) is allocated on the
stack, and the code expects that "wake_up_all()" will make sure that all
the queue entries are gone before the stack frame is de-allocated.

And that is not at all necessarily the case.

Yes, a normal wake-up sequence will remove the wait-queue entry that
caused the wakeup (see "autoremove_wake_function()"), but the key
wording there is "that caused the wakeup".  When there are multiple
possible wakeup sources, the wait queue entry may well stay around.

And _particularly_ in a page fault path, we may be faulting in new pages
from user space while we also have other things going on, and there may
well be other pending wakeups.

So despite the "wake_up_all()", it's not at all guaranteed that all list
entries are removed from the wait queue head on the stack.

Fix this by introducing a new wakeup function that removes the list
entry unconditionally, even if the target process had already woken up
for other reasons.  Use that "synchronous" function to set up the
waiters in shmem_fault().

This problem has never been seen in the wild afaik, but Dave Jones has
reported it on and off while running trinity.  We thought we fixed the
stack corruption with the blk-mq rq_list locking fix (commit
963d5da64165: "blk-mq: update hardware and software queues for sleeping
alloc"), but it turns out there was _another_ stack corruptor hiding
in the trinity runs.

Vegard Nossum (also running trinity) was able to trigger this one fairly
consistently, and made us look once again at the shmem code due to the
faults often being in that area.

Reported-and-tested-by: Vegard Nossum <vegard.nossum@oracle.com>.
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoMerge branch 'mlx5-fixes'
David S. Miller [Tue, 6 Dec 2016 16:44:45 +0000 (11:44 -0500)]
Merge branch 'mlx5-fixes'

Saeed Mahameed says:

====================
Mellanox 100G mlx5 fixes 2016-12-04

Some bug fixes for mlx5 core and mlx5e driver.

v1->v2:
 - replace "uint" with "unsigned int"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx5e: Change the SQ/RQ operational state to positive logic
Mohamad Haj Yahia [Tue, 6 Dec 2016 15:32:48 +0000 (17:32 +0200)]
net/mlx5e: Change the SQ/RQ operational state to positive logic

When using the negative logic (i.e. FLUSH state), after the RQ/SQ reopen
we will have a time interval that the RQ/SQ is not really ready and the
state indicates that its not in FLUSH state because the initial SQ/RQ struct
memory starts as zeros.
Now we changed the state to indicate if the SQ/RQ is opened and we will
set the READY state after finishing preparing all the SQ/RQ resources.

Fixes: 8189241cde13 ("net/mlx5e: Don't wait for SQ completions on close")
Fixes: 33269c3ee0ca ("net/mlx5e: Don't wait for RQ completions on close")
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx5e: Don't flush SQ on error
Saeed Mahameed [Tue, 6 Dec 2016 15:32:47 +0000 (17:32 +0200)]
net/mlx5e: Don't flush SQ on error

We are doing SQ descriptors cleanup in driver.

Fixes: 8189241cde13 ("net/mlx5e: Don't wait for SQ completions on close")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx5e: Don't notify HW when filling the edge of ICO SQ
Saeed Mahameed [Tue, 6 Dec 2016 15:32:46 +0000 (17:32 +0200)]
net/mlx5e: Don't notify HW when filling the edge of ICO SQ

We are going to do this a couple of steps ahead anyway.

Fixes: 7721bd489cb7 ("net/mlx5e: Added ICO SQs")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx5: Fix query ISSI flow
Kamal Heib [Tue, 6 Dec 2016 15:32:45 +0000 (17:32 +0200)]
net/mlx5: Fix query ISSI flow

In old FWs query ISSI command is not supported and for some of those FWs
it might fail with status other than "MLX5_CMD_STAT_BAD_OP_ERR".

In such case instead of failing the driver load, we will treat any FW
status other than 0 for Query ISSI FW command as ISSI not supported and
assume ISSI=0 (most basic driver/FW interface).

In case of driver syndrom (query ISSI failure by driver) we will fail
driver load.

Fixes: 9df41abb3a64 ('net/mlx5: Extend mlx5_core to support ConnectX-4
Ethernet functionality')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx5: Remove duplicate pci dev name print
Kamal Heib [Tue, 6 Dec 2016 15:32:44 +0000 (17:32 +0200)]
net/mlx5: Remove duplicate pci dev name print

Remove duplicate pci dev name printing from mlx5_core_warn/dbg.

Fixes: 78dbced42f82 ('net/mlx5_core: Improve mlx5 messages')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx5: Verify module parameters
Kamal Heib [Tue, 6 Dec 2016 15:32:43 +0000 (17:32 +0200)]
net/mlx5: Verify module parameters

Verify the mlx5_core module parameters by making sure that they are in
the expected range and if they aren't restore them to their default
values.

Fixes: a76686a4b7e5 ('mlx5: Move pci device handling from mlx5_ib to mlx5_core')
Signed-off-by: Kamal Heib <kamalh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: fec: fix compile with CONFIG_M5272
Nikita Yushchenko [Tue, 6 Dec 2016 06:26:53 +0000 (09:26 +0300)]
net: fec: fix compile with CONFIG_M5272

Commit e944d20a145e ("net: fec: cache statistics while device is down")
introduced unconditional statistics-related actions.

However, when driver is compiled with CONFIG_M5272, staticsics-related
definitions do not exist, which results into build errors.

Fix that by adding explicit handling of !defined(CONFIG_M5272) case.

Fixes: e944d20a145e ("net: fec: cache statistics while device is down")
Signed-off-by: Nikita Yushchenko <nikita.yoush@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobe2net: Add DEVSEC privilege to SET_HSW_CONFIG command.
Venkat Duvvuru [Tue, 6 Dec 2016 05:33:50 +0000 (00:33 -0500)]
be2net: Add DEVSEC privilege to SET_HSW_CONFIG command.

OPCODE_COMMON_GET_FN_PRIVILEGES is returning only DEVSEC
privilege (Unrestricted Administrative Privilege) for Lancer NIC functions.
So, driver is failing SET_HSW_CONFIG command, as DEVSEC privilege was not
set in the privilege bitmap. This patch fixes the problem by setting DEVSEC
privilege in SET_HSW_CONFIG’s privilege bitmap.

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Suresh Reddy <suresh.reddy@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agovirtio-net: Fix DMA-from-the-stack in virtnet_set_mac_address()
Andy Lutomirski [Tue, 6 Dec 2016 02:10:58 +0000 (18:10 -0800)]
virtio-net: Fix DMA-from-the-stack in virtnet_set_mac_address()

With CONFIG_VMAP_STACK=y, virtnet_set_mac_address() can be passed a
pointer to the stack and it will OOPS.  Copy the address to the heap
to prevent the crash.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Laura Abbott <labbott@redhat.com>
Reported-by: zbyszek@in.waw.pl
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agotcp: warn on bogus MSS and try to amend it
Marcelo Ricardo Leitner [Mon, 5 Dec 2016 20:37:13 +0000 (18:37 -0200)]
tcp: warn on bogus MSS and try to amend it

There have been some reports lately about TCP connection stalls caused
by NIC drivers that aren't setting gso_size on aggregated packets on rx
path. This causes TCP to assume that the MSS is actually the size of the
aggregated packet, which is invalid.

Although the proper fix is to be done at each driver, it's often hard
and cumbersome for one to debug, come to such root cause and report/fix
it.

This patch amends this situation in two ways. First, it adds a warning
on when this situation occurs, so it gives a hint to those trying to
debug this. It also limit the maximum probed MSS to the adverised MSS,
as it should never be any higher than that.

The result is that the connection may not have the best performance ever
but it shouldn't stall, and the admin will have a hint on what to look
for.

Tested with virtio by forcing gso_size to 0.

v2: updated msg per David's suggestion
v3: use skb_iif to find the interface and also log its name, per Eric
    Dumazet's suggestion. As the skb may be backlogged and the interface
    gone by then, we need to check if the number still has a meaning.
v4: use helper tcp_gro_dev_warn() and avoid pr_warn_once inside __once, per
    David's suggestion

Cc: Jonathan Maxwell <jmaxwell37@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agouapi glibc compat: fix outer guard of net device flags enum
Jonas Gorski [Sat, 3 Dec 2016 16:31:45 +0000 (17:31 +0100)]
uapi glibc compat: fix outer guard of net device flags enum

Fix a wrong condition preventing the higher net device flags
IFF_LOWER_UP etc to be defined if net/if.h is included before
linux/if.h.

The comment makes it clear the intention was to allow partial
definition with either parts.

This fixes compilation of userspace programs trying to use
IFF_LOWER_UP, IFF_DORMANT or IFF_ECHO.

Fixes: e4a9abe110f0 ("uapi glibc compat: fix compile errors when glibc net/if.h included before linux/if.h")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: stmmac: clear reset value of snps, wr_osr_lmt/snps, rd_osr_lmt before writing
Niklas Cassel [Mon, 5 Dec 2016 17:12:54 +0000 (18:12 +0100)]
net: stmmac: clear reset value of snps, wr_osr_lmt/snps, rd_osr_lmt before writing

WR_OSR_LMT and RD_OSR_LMT have a reset value of 1.
Since the reset value wasn't cleared before writing, the value in the
register would be incorrect if specifying an uneven value for
snps,wr_osr_lmt/snps,rd_osr_lmt.

Zero is a valid value for the properties, since the databook specifies:
maximum outstanding requests = WR_OSR_LMT + 1.

We do not want to change the behavior for existing users when the
property is missing. Therefore, default to 1 if the property is missing,
since that is the same as the reset value.

Signed-off-by: Niklas Cassel <niklas.cassel@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agofuse: fix clearing suid, sgid for chown()
Miklos Szeredi [Tue, 6 Dec 2016 15:18:45 +0000 (16:18 +0100)]
fuse: fix clearing suid, sgid for chown()

Basically, the pjdfstests set the ownership of a file to 06555, and then
chowns it (as root) to a new uid/gid. Prior to commit 19d3437c41f9 ("fuse:
fix killing s[ug]id in setattr"), fuse would send down a setattr with both
the uid/gid change and a new mode.  Now, it just sends down the uid/gid
change.

Technically this is NOTABUG, since POSIX doesn't _require_ that we clear
these bits for a privileged process, but Linux (wisely) has done that and I
think we don't want to change that behavior here.

This is caused by the use of should_remove_suid(), which will always return
0 when the process has CAP_FSETID.

In fact we really don't need to be calling should_remove_suid() at all,
since we've already been indicated that we should remove the suid, we just
don't want to use a (very) stale mode for that.

This patch should fix the above as well as simplify the logic.

Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 19d3437c41f9 ("fuse: fix killing s[ug]id in setattr")
Cc: <stable@vger.kernel.org>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
7 years agolockdep: Fix report formatting
Dmitry Vyukov [Mon, 28 Nov 2016 14:24:43 +0000 (15:24 +0100)]
lockdep: Fix report formatting

Since commit:

  3c758df09457 ("printk: reinstate KERN_CONT for printing continuation lines")

printk() requires KERN_CONT to continue log messages. Lots of printk()
in lockdep.c and print_ip_sym() don't have it. As the result lockdep
reports are completely messed up.

Add missing KERN_CONT and inline print_ip_sym() where necessary.

Example of a messed up report:

  0-rc5+ #41 Not tainted
  -------------------------------------------------------
  syz-executor0/5036 is trying to acquire lock:
   (
  rtnl_mutex
  ){+.+.+.}
  , at:
  [<ffffffff86b3d6ac>] rtnl_lock+0x1c/0x20
  but task is already holding lock:
   (
  &net->packet.sklist_lock
  ){+.+...}
  , at:
  [<ffffffff873541a6>] packet_diag_dump+0x1a6/0x1920
  which lock already depends on the new lock.
  the existing dependency chain (in reverse order) is:
  -> #3
   (
  &net->packet.sklist_lock
  +.+...}
  ...

Without this patch all scripts that parse kernel bug reports are broken.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: andreyknvl@google.com
Cc: aryabinin@virtuozzo.com
Cc: joe@perches.com
Cc: syzkaller@googlegroups.com
Link: http://lkml.kernel.org/r/1480343083-48731-1-git-send-email-dvyukov@google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agoobjtool: Fix bytes check of lea's rex_prefix
Jiri Slaby [Mon, 5 Dec 2016 10:55:51 +0000 (11:55 +0100)]
objtool: Fix bytes check of lea's rex_prefix

For the "lea %(rsp), %rbp" case, we check if there is a rex_prefix.
But we check 'bytes' which is insn_byte_t[4] in rex_prefix (insn_field
structure). Therefore, the check is always true.

Instead, check 'nbytes' which is the right one.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161205105551.25917-1-jslaby@suse.cz
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agonetlink: Do not schedule work from sk_destruct
Herbert Xu [Mon, 5 Dec 2016 07:28:21 +0000 (15:28 +0800)]
netlink: Do not schedule work from sk_destruct

It is wrong to schedule a work from sk_destruct using the socket
as the memory reserve because the socket will be freed immediately
after the return from sk_destruct.

Instead we should do the deferral prior to sk_free.

This patch does just that.

Fixes: 982ca8249994 ("netlink: Call cb->done from a worker thread")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agouapi: export nf_log.h
stephen hemminger [Fri, 2 Dec 2016 22:54:00 +0000 (14:54 -0800)]
uapi: export nf_log.h

File is in uapi directory but not being copied on
 make install_headers

Fixes commit 4ec9c8fbbc22 ("netfilter: nft_log: complete
NFTA_LOG_FLAGS attr support").

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agouapi: export tc_skbmod.h
stephen hemminger [Fri, 2 Dec 2016 22:53:59 +0000 (14:53 -0800)]
uapi: export tc_skbmod.h

Fixes commit 735cffe5d800 ("net_sched: Introduce skbmod action")
Not used by iproute2 but maybe in future.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ep93xx_eth: Do not crash unloading module
Florian Fainelli [Mon, 5 Dec 2016 03:22:05 +0000 (19:22 -0800)]
net: ep93xx_eth: Do not crash unloading module

When we unload the ep93xx_eth, whether we have opened the network
interface or not, we will either hit a kernel paging request error, or a
simple NULL pointer de-reference because:

- if ep93xx_open has been called, we have created a valid DMA mapping
  for ep->descs, when we call ep93xx_stop, we also call
  ep93xx_free_buffers, ep->descs now has a stale value

- if ep93xx_open has not been called, we have a NULL pointer for
  ep->descs, so performing any operation against that address just won't
  work

Fix this by adding a NULL pointer check for ep->descs which means that
ep93xx_free_buffers() was able to successfully tear down the descriptors
and free the DMA cookie as well.

Fixes: 03abbbff1577 ("[PATCH] Cirrus Logic ep93xx ethernet driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'bnx2x-fixes'
David S. Miller [Mon, 5 Dec 2016 20:08:40 +0000 (15:08 -0500)]
Merge branch 'bnx2x-fixes'

Yuval Mintz says:

====================
bnx2x: fixes series

Two unrelated fixes for bnx2x - the first one is nice-to-have,
while the other fixes fatal behaviour in older adapters.

Please consider applying them to `net'.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobnx2x: Prevent tunnel config for 577xx
Mintz, Yuval [Sun, 4 Dec 2016 13:30:18 +0000 (15:30 +0200)]
bnx2x: Prevent tunnel config for 577xx

Only the 578xx adapters are capable of configuring UDP ports for
the purpose of tunnelling - doing the same on 577xx might lead to
a firmware assertion.
We're already not claiming support for any related feature for such
devices, but we also need to prevent the configuration of the UDP
ports to the device in this case.

Fixes: b13c15158caa ("bnx2x: Add vxlan RSS support")
Reported-by: Anikina Anna <anikina@gmail.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobnx2x: Correct ringparam estimate when DOWN
Mintz, Yuval [Sun, 4 Dec 2016 13:30:17 +0000 (15:30 +0200)]
bnx2x: Correct ringparam estimate when DOWN

Until interface is up [and assuming ringparams weren't explicitly
configured] when queried for the size of its rings bnx2x would
claim they're the maximal size by default.
That is incorrect as by default the maximal number of buffers would
be equally divided between the various rx rings.

This prevents the user from actually setting the number of elements
on each rx ring to be of maximal size prior to transitioning the
interface into up state.

To fix this, make a rough estimation about the number of buffers.
It wouldn't always be accurate, but it would be much better than
current estimation and would allow users to increase number of
buffers during early initialization of the interface.

Reported-by: Seymour, Shane <shane.seymour@hpe.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoisdn: hisax: set error code on failure
Pan Bian [Sun, 4 Dec 2016 10:43:31 +0000 (18:43 +0800)]
isdn: hisax: set error code on failure

In function hfc4s8s_probe(), the value of return variable err should be
negative on failures. However, when the call to request_region() returns
NULL, the value of err is 0. This patch fixes the bug, assigning
"-EBUSY" to err on the path that request_region() fails.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188931

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: bnx2x: fix improper return value
Pan Bian [Sun, 4 Dec 2016 10:46:03 +0000 (18:46 +0800)]
net: bnx2x: fix improper return value

Macro BNX2X_ALLOC_AND_SET(arr, lbl, func) calls kmalloc() to allocate
memory, and jumps to label "lbl" if the allocation fails. Label "lbl"
first cleans memory and then returns variable rc. Before calling the
macro, the value of variable rc is 0. Because 0 means no error, the
callers of bnx2x_init_firmware() may be misled. This patch fixes the bug,
assigning "-ENOMEM" to rc before calling macro NX2X_ALLOC_AND_SET().

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189141

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: qlogic: set error code on failure
Pan Bian [Sun, 4 Dec 2016 05:53:53 +0000 (13:53 +0800)]
net: ethernet: qlogic: set error code on failure

When calling dma_mapping_error(), the value of return variable rc is 0.
And when the call returns an unexpected value, rc is not set to a
negative errno. Thus, it will return 0 on the error path, and its
callers cannot detect the bug. This patch fixes the bug, assigning
"-ENOMEM" to err.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189041

Signed-off-by: Pan Bian <bianpan2016@163.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoatm: fix improper return value
Pan Bian [Sun, 4 Dec 2016 05:45:15 +0000 (13:45 +0800)]
atm: fix improper return value

It returns variable "error" when ioremap_nocache() returns a NULL
pointer. The value of "error" is 0 then, which will mislead the callers
to believe that there is no error. This patch fixes the bug, returning
"-ENOMEM".

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=189021

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: irda: set error code on failures
Pan Bian [Sun, 4 Dec 2016 05:27:40 +0000 (13:27 +0800)]
net: irda: set error code on failures

When the calls to kzalloc() fail, the value of return variable ret may
be 0. 0 means success in this context. This patch fixes the bug,
assigning "-ENOMEM" to ret before calling kzalloc().

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188971

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: caif: remove ineffective check
Pan Bian [Sun, 4 Dec 2016 04:15:44 +0000 (12:15 +0800)]
net: caif: remove ineffective check

The check of the return value of sock_register() is ineffective.
"if(!err)" seems to be a typo. It is better to propagate the error code
to the callers of caif_sktinit_module(). This patch removes the check
statment and directly returns the result of sock_register().

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188751
Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ping: check minimum size on ICMP header length
Kees Cook [Mon, 5 Dec 2016 18:34:38 +0000 (10:34 -0800)]
net: ping: check minimum size on ICMP header length

Prior to commit 478f199281e7 ("put iov_iter into msghdr") in v3.19, there
was no check that the iovec contained enough bytes for an ICMP header,
and the read loop would walk across neighboring stack contents. Since the
iov_iter conversion, bad arguments are noticed, but the returned error is
EFAULT. Returning EINVAL is a clearer error and also solves the problem
prior to v3.19.

This was found using trinity with KASAN on v3.18:

BUG: KASAN: stack-out-of-bounds in memcpy_fromiovec+0x60/0x114 at addr ffffffc071077da0
Read of size 8 by task trinity-c2/9623
page:ffffffbe034b9a08 count:0 mapcount:0 mapping:          (null) index:0x0
flags: 0x0()
page dumped because: kasan: bad access detected
CPU: 0 PID: 9623 Comm: trinity-c2 Tainted: G    BU         3.18.0-dirty #15
Hardware name: Google Tegra210 Smaug Rev 1,3+ (DT)
Call trace:
[<ffffffc000209c98>] dump_backtrace+0x0/0x1ac arch/arm64/kernel/traps.c:90
[<ffffffc000209e54>] show_stack+0x10/0x1c arch/arm64/kernel/traps.c:171
[<     inline     >] __dump_stack lib/dump_stack.c:15
[<ffffffc000f18dc4>] dump_stack+0x7c/0xd0 lib/dump_stack.c:50
[<     inline     >] print_address_description mm/kasan/report.c:147
[<     inline     >] kasan_report_error mm/kasan/report.c:236
[<ffffffc000373dcc>] kasan_report+0x380/0x4b8 mm/kasan/report.c:259
[<     inline     >] check_memory_region mm/kasan/kasan.c:264
[<ffffffc00037352c>] __asan_load8+0x20/0x70 mm/kasan/kasan.c:507
[<ffffffc0005b9624>] memcpy_fromiovec+0x5c/0x114 lib/iovec.c:15
[<     inline     >] memcpy_from_msg include/linux/skbuff.h:2667
[<ffffffc000ddeba0>] ping_common_sendmsg+0x50/0x108 net/ipv4/ping.c:674
[<ffffffc000dded30>] ping_v4_sendmsg+0xd8/0x698 net/ipv4/ping.c:714
[<ffffffc000dc91dc>] inet_sendmsg+0xe0/0x12c net/ipv4/af_inet.c:749
[<     inline     >] __sock_sendmsg_nosec net/socket.c:624
[<     inline     >] __sock_sendmsg net/socket.c:632
[<ffffffc000cab61c>] sock_sendmsg+0x124/0x164 net/socket.c:643
[<     inline     >] SYSC_sendto net/socket.c:1797
[<ffffffc000cad270>] SyS_sendto+0x178/0x1d8 net/socket.c:1761

CVE-2016-8399

Reported-by: Qidan He <i@flanker017.me>
Fixes: cfce3383c7be ("net: ipv4: add IPPROTO_ICMP socket kind")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge tag 'powerpc-4.9-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc...
Linus Torvalds [Mon, 5 Dec 2016 18:30:12 +0000 (10:30 -0800)]
Merge tag 'powerpc-4.9-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
 "Four fixes, the first for code we merged this cycle and three that are
  also going to stable:

   - On 64-bit Book3E we were not placing the .text section where we
     said we would in the asm.

   - We broke building the boot wrapper on some 32-bit toolchains.

   - Lazy icache flushing was broken on pre-POWER5 machines.

   - One of the error paths in our EEH code would lead to a deadlock.

  Thanks to: Andrew Donnellan, Ben Hutchings, Benjamin Herrenschmidt,
  Nicholas Piggin"

* tag 'powerpc-4.9-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/64: Fix placement of .text to be immediately following .head.text
  powerpc/eeh: Fix deadlock when PE frozen state can't be cleared
  powerpc/mm: Fix lazy icache flush on pre-POWER5
  powerpc/boot: Fix build failure in 32-bit boot wrapper

7 years agoatm: lanai: set error code when ioremap fails
Pan Bian [Sat, 3 Dec 2016 12:25:45 +0000 (20:25 +0800)]
atm: lanai: set error code when ioremap fails

In function lanai_dev_open(), when the call to ioremap() fails, the
value of return variable result is 0. 0 means no error in this context.
This patch fixes the bug, assigning "-ENOMEM" to result when ioremap()
returns a NULL pointer.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188791

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: usb: set error code when usb_alloc_urb fails
Pan Bian [Sat, 3 Dec 2016 11:24:48 +0000 (19:24 +0800)]
net: usb: set error code when usb_alloc_urb fails

In function lan78xx_probe(), variable ret takes the errno code on
failures. However, when the call to usb_alloc_urb() fails, its value
will keeps 0. 0 indicates success in the context, which is inconsistent
with the execution result. This patch fixes the bug, assigning
"-ENOMEM" to ret when usb_alloc_urb() returns a NULL pointer.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188771

Signed-off-by: Pan Bian <bianpan2016@163.com>
Acked-by: Woojung Huh <woojung.huh@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: bridge: set error code on failure
Pan Bian [Sat, 3 Dec 2016 11:33:23 +0000 (19:33 +0800)]
net: bridge: set error code on failure

Function br_sysfs_addbr() does not set error code when the call
kobject_create_and_add() returns a NULL pointer. It may be better to
return "-ENOMEM" when kobject_create_and_add() fails.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188781

Signed-off-by: Pan Bian <bianpan2016@163.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: af_mpls.c add space before open parenthesis
Suraj Deshmukh [Sat, 3 Dec 2016 07:59:26 +0000 (07:59 +0000)]
net: af_mpls.c add space before open parenthesis

Adding space after switch keyword before open
parenthesis for readability purpose.

This patch fixes the checkpatch.pl warning:
space required before the open parenthesis '('

Signed-off-by: Suraj Deshmukh <surajssd009005@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonetdev: broadcom: propagate error code
Pan Bian [Sat, 3 Dec 2016 09:56:17 +0000 (17:56 +0800)]
netdev: broadcom: propagate error code

Function bnxt_hwrm_stat_ctx_alloc() always returns 0, even if the call
to _hwrm_send_message() fails. It may be better to propagate the errors
to the caller of bnxt_hwrm_stat_ctx_alloc().

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188661

Signed-off-by: Pan Bian <bianpan2016@163.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'fib-suffix-length-fixes'
David S. Miller [Mon, 5 Dec 2016 18:15:59 +0000 (13:15 -0500)]
Merge branch 'fib-suffix-length-fixes'

Alexander Duyck says:

====================
IPv4 FIB suffix length fixes

In reviewing the patch from Robert Shearman and looking over the code I
realized there were a few different bugs we were still carrying in the IPv4
FIB lookup code.

These two patches are based off of Robert's original patch, but take things
one step further by splitting them up to address two additional issues I
found.

So first have Robert's original patch which was addressing the fact that
us calling update_suffix in resize is expensive when it is called per add.
To address that I incorporated the core bit of the patch which was us
dropping the update_suffix call from resize.

The first patch in the series does a rename and fix on the push_suffix and
pull_suffix code.  Specifically we drop the need to pass a leaf and
secondly we fix things so we pull the suffix as long as the value of the
suffix in the node is dropping.

The second patch addresses the original issue reported as well as
optimizing the code for the fact that update_suffix is only really meant to
go through and clean things up when we are decreasing a suffix.  I had
originally added code for it to somehow cause an increase, but if we push
the suffix when a new leaf is added we only ever have to handle pulling
down the suffix with update_suffix so I updated the code to reflect that.

As far as side effects the only ones I think that will be obvious should be
the fact that some routes may be able to be found earlier since before we
relied on resize to update the suffix lengths, and now we are updating them
before we add or remove the leaf.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoipv4: Drop suffix update from resize code
Alexander Duyck [Thu, 1 Dec 2016 12:27:57 +0000 (07:27 -0500)]
ipv4: Drop suffix update from resize code

It has been reported that update_suffix can be expensive when it is called
on a large node in which most of the suffix lengths are the same.  The time
required to add 200K entries had increased from around 3 seconds to almost
49 seconds.

In order to address this we need to move the code for updating the suffix
out of resize and instead just have it handled in the cases where we are
pushing a node that increases the suffix length, or will decrease the
suffix length.

Fixes: bea8e562f31b ("fib_trie: Add tracking value for suffix length")
Reported-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Reviewed-by: Robert Shearman <rshearma@brocade.com>
Tested-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoipv4: Drop leaf from suffix pull/push functions
Alexander Duyck [Thu, 1 Dec 2016 12:27:52 +0000 (07:27 -0500)]
ipv4: Drop leaf from suffix pull/push functions

It wasn't necessary to pass a leaf in when doing the suffix updates so just
drop it.  Instead just pass the suffix and work with that.

Since we dropped the leaf there is no need to include that in the name so
the names are updated to node_push_suffix and node_pull_suffix.

Finally I noticed that the logic for pulling the suffix length back
actually had some issues.  Specifically it would stop prematurely if there
was a longer suffix, but it was not as long as the original suffix.  I
updated the code to address that in node_pull_suffix.

Fixes: bea8e562f31b ("fib_trie: Add tracking value for suffix length")
Suggested-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Reviewed-by: Robert Shearman <rshearma@brocade.com>
Tested-by: Robert Shearman <rshearma@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Mon, 5 Dec 2016 17:16:10 +0000 (09:16 -0800)]
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "This fixes the following issues:

   - Intermittent build failure in RSA

   - Memory corruption in chelsio crypto driver

   - Regression in DRBG due to vmalloced stack"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: rsa - Add Makefile dependencies to fix parallel builds
  crypto: chcr - Fix memory corruption
  crypto: drbg - prevent invalid SG mappings

7 years agoLinux 4.9-rc8
Linus Torvalds [Sun, 4 Dec 2016 20:50:51 +0000 (12:50 -0800)]
Linux 4.9-rc8

7 years agonet: dcb: set error code on failures
Pan Bian [Sat, 3 Dec 2016 13:49:08 +0000 (21:49 +0800)]
net: dcb: set error code on failures

In function dcbnl_cee_fill(), returns the value of variable err on
errors. However, on some error paths (e.g. nla put fails), its value may
be 0. It may be better to explicitly set a negative errno to variable
err before returning.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=188881

Signed-off-by: Pan Bian <bianpan2016@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge tag 'drm-fixes-for-v4.9-rc8' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Sun, 4 Dec 2016 00:40:21 +0000 (16:40 -0800)]
Merge tag 'drm-fixes-for-v4.9-rc8' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "A pretty small pull request: a couple of AMD powerxpress regression
  fixes and a power management fix, a couple of i915 fixes and one hdlcd
  fix, along with one core don't oops because of incorrect API usage fix"

* tag 'drm-fixes-for-v4.9-rc8' of git://people.freedesktop.org/~airlied/linux:
  drm/i915: drop the struct_mutex when wedged or trying to reset
  drm/i915: Don't touch NULL sg on i915_gem_object_get_pages_gtt() error
  drm: Don't call drm_for_each_crtc with a non-KMS driver
  drm/radeon: fix check for port PM availability
  drm/amdgpu: fix check for port PM availability
  drm/amd/powerplay: initialize the soft_regs offset in struct smu7_hwmgr
  drm: hdlcd: Fix cleanup order

7 years agoMerge tag 'batadv-net-for-davem-20161202' of git://git.open-mesh.org/linux-merge
David S. Miller [Sat, 3 Dec 2016 21:13:27 +0000 (16:13 -0500)]
Merge tag 'batadv-net-for-davem-20161202' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
Here is another batman-adv bugfix:

 - fix checking for failed allocation of TVLV blocks in TT local data,
   by Sven Eckelmann
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge tag 'drm-intel-fixes-2016-12-01' of git://anongit.freedesktop.org/git/drm-intel...
Dave Airlie [Sat, 3 Dec 2016 20:31:26 +0000 (06:31 +1000)]
Merge tag 'drm-intel-fixes-2016-12-01' of git://anongit.freedesktop.org/git/drm-intel into drm-fixes

2 intel fixes.

* tag 'drm-intel-fixes-2016-12-01' of git://anongit.freedesktop.org/git/drm-intel:
  drm/i915: drop the struct_mutex when wedged or trying to reset
  drm/i915: Don't touch NULL sg on i915_gem_object_get_pages_gtt() error

7 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sat, 3 Dec 2016 02:48:11 +0000 (18:48 -0800)]
Merge branch 'akpm' (patches from Andrew)

Merge more fixes from Andrew Morton:
 "2 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm, vmscan: add cond_resched() into shrink_node_memcg()
  mm: workingset: fix NULL ptr in count_shadow_nodes

7 years agomm, vmscan: add cond_resched() into shrink_node_memcg()
Michal Hocko [Sat, 3 Dec 2016 01:26:48 +0000 (17:26 -0800)]
mm, vmscan: add cond_resched() into shrink_node_memcg()

Boris Zhmurov has reported RCU stalls during the kswapd reclaim:

  INFO: rcu_sched detected stalls on CPUs/tasks:
   23-...: (22 ticks this GP) idle=92f/140000000000000/0 softirq=2638404/2638404 fqs=23
   (detected by 4, t=6389 jiffies, g=786259, c=786258, q=42115)
  Task dump for CPU 23:
  kswapd1         R  running task        0   148      2 0x00000008
  Call Trace:
    shrink_node+0xd2/0x2f0
    kswapd+0x2cb/0x6a0
    mem_cgroup_shrink_node+0x160/0x160
    kthread+0xbd/0xe0
    __switch_to+0x1fa/0x5c0
    ret_from_fork+0x1f/0x40
    kthread_create_on_node+0x180/0x180

a closer code inspection has shown that we might indeed miss all the
scheduling points in the reclaim path if no pages can be isolated from
the LRU list.  This is a pathological case but other reports from Donald
Buczek have shown that we might indeed hit such a path:

        clusterd-989   [009] .... 118023.654491: mm_vmscan_direct_reclaim_end: nr_reclaimed=193
         kswapd1-86    [001] dN.. 118023.987475: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239830 nr_taken=0 file=1
         kswapd1-86    [001] dN.. 118024.320968: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239844 nr_taken=0 file=1
         kswapd1-86    [001] dN.. 118024.654375: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239858 nr_taken=0 file=1
         kswapd1-86    [001] dN.. 118024.987036: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239872 nr_taken=0 file=1
         kswapd1-86    [001] dN.. 118025.319651: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239886 nr_taken=0 file=1
         kswapd1-86    [001] dN.. 118025.652248: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239900 nr_taken=0 file=1
         kswapd1-86    [001] dN.. 118025.984870: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4239914 nr_taken=0 file=1
  [...]
         kswapd1-86    [001] dN.. 118084.274403: mm_vmscan_lru_isolate: isolate_mode=0 classzone=0 order=0 nr_requested=32 nr_scanned=4241133 nr_taken=0 file=1

this is minute long snapshot which didn't take a single page from the
LRU.  It is not entirely clear why only 1303 pages have been scanned
during that time (maybe there was a heavy IRQ activity interfering).

In any case it looks like we can really hit long periods without
scheduling on non preemptive kernels so an explicit cond_resched() in
shrink_node_memcg which is independent on the reclaim operation is due.

Link: http://lkml.kernel.org/r/20161202095841.16648-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Boris Zhmurov <bb@kernelpanic.ru>
Tested-by: Boris Zhmurov <bb@kernelpanic.ru>
Reported-by: Donald Buczek <buczek@molgen.mpg.de>
Reported-by: "Christopher S. Aker" <caker@theshore.net>
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agomm: workingset: fix NULL ptr in count_shadow_nodes
Michal Hocko [Sat, 3 Dec 2016 01:26:45 +0000 (17:26 -0800)]
mm: workingset: fix NULL ptr in count_shadow_nodes

Commit 735ed5b72dfe ("mm: workingset: make shadow node shrinker memcg
aware") has made the workingset shadow nodes shrinker memcg aware.  The
implementation is not correct though because memcg_kmem_enabled() might
become true while we are doing a global reclaim when the sc->memcg might
be NULL which is exactly what Marek has seen:

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000400
  IP: [<ffffffff8122d520>] mem_cgroup_node_nr_lru_pages+0x20/0x40
  PGD 0
  Oops: 0000 [#1] SMP
  CPU: 0 PID: 60 Comm: kswapd0 Tainted: G           O   4.8.10-12.pvops.qubes.x86_64 #1
  task: ffff880011863b00 task.stack: ffff880011868000
  RIP: mem_cgroup_node_nr_lru_pages+0x20/0x40
  RSP: e02b:ffff88001186bc70  EFLAGS: 00010293
  RAX: 0000000000000000 RBX: ffff88001186bd20 RCX: 0000000000000002
  RDX: 000000000000000c RSI: 0000000000000000 RDI: 0000000000000000
  RBP: ffff88001186bc70 R08: 28f5c28f5c28f5c3 R09: 0000000000000000
  R10: 0000000000006c34 R11: 0000000000000333 R12: 00000000000001f6
  R13: ffffffff81c6f6a0 R14: 0000000000000000 R15: 0000000000000000
  FS:  0000000000000000(0000) GS:ffff880013c00000(0000) knlGS:ffff880013d00000
  CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000400 CR3: 00000000122f2000 CR4: 0000000000042660
  Call Trace:
    count_shadow_nodes+0x9a/0xa0
    shrink_slab.part.42+0x119/0x3e0
    shrink_node+0x22c/0x320
    kswapd+0x32c/0x700
    kthread+0xd8/0xf0
    ret_from_fork+0x1f/0x40
  Code: 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 3b 35 dd eb b1 00 55 48 89 e5 73 2c 89 d2 31 c9 31 c0 4c 63 ce 48 0f a3 ca 73 13 <4a> 8b b4 cf 00 04 00 00 41 89 c8 4a 03 84 c6 80 00 00 00 83 c1
  RIP  mem_cgroup_node_nr_lru_pages+0x20/0x40
   RSP <ffff88001186bc70>
  CR2: 0000000000000400
  ---[ end trace 100494b9edbdfc4d ]---

This patch fixes the issue by checking sc->memcg rather than
memcg_kmem_enabled() which is sufficient because shrink_slab makes sure
that only memcg aware shrinkers will get non-NULL memcgs and only if
memcg_kmem_enabled is true.

Fixes: 735ed5b72dfe ("mm: workingset: make shadow node shrinker memcg aware")
Link: http://lkml.kernel.org/r/20161201132156.21450-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Marek Marczykowski-Górecki <marmarek@mimuw.edu.pl>
Tested-by: Marek Marczykowski-Górecki <marmarek@mimuw.edu.pl>
Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Cc: <stable@vger.kernel.org> [4.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agokbuild: fix building bzImage with CONFIG_TRIM_UNUSED_KSYMS enabled
Nicolas Pitre [Fri, 2 Dec 2016 20:11:50 +0000 (15:11 -0500)]
kbuild: fix building bzImage with CONFIG_TRIM_UNUSED_KSYMS enabled

When building a specific target such as bzImage, modules aren't normally
built.  However if CONFIG_TRIM_UNUSED_KSYMS is enabled, no built modules
means none of the exported symbols are used and therefore they will all
be trimmed away from the final kernel.  A subsequent "make modules" will
fail because modpost cannot find the needed symbols for those modules in
the kernel binary.

Let's make sure modules are also built whenever CONFIG_TRIM_UNUSED_KSYMS
is enabled and that the kernel binary is properly rebuilt accordingly.

Signed-off-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoMerge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm...
Linus Torvalds [Fri, 2 Dec 2016 21:34:37 +0000 (13:34 -0800)]
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
 "This should be the last set of bugfixes for arm-soc in v4.9. None of
  these are critical regressions, but it would be nice to still get them
  merged.

   - On the Juno platform, the idle latency was described wrong, leading
     to suboptimal cpuidle tuning.

   - Also on the same platform, PCI I/O space was set up incorrectly and
     could not work.

   - On the sti platform, a syntactically incorrect DT entry caused
     warnings.

   - The newly added 'gr8' platform has somewhat confusing file names,
     which we rename for consistency"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  arm64: dts: juno: fix cluster sleep state entry latency on all SoC versions
  arm64: dts: juno: Correct PCI IO window
  ARM: dts: STiH407-family: fix i2c nodes
  ARM: gr8: Rename the DTSI and relevant DTS

7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Fri, 2 Dec 2016 19:45:27 +0000 (11:45 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Lots more phydev and probe error path leaks in various drivers by
    Johan Hovold.

 2) Fix race in packet_set_ring(), from Philip Pettersson.

 3) Use after free in dccp_invalid_packet(), from Eric Dumazet.

 4) Signnedness overflow in SO_{SND,RCV}BUFFORCE, also from Eric
    Dumazet.

 5) When tunneling between ipv4 and ipv6 we can be left with the wrong
    skb->protocol value as we enter the IPSEC engine and this causes all
    kinds of problems. Set it before the output path does any
    dst_output() calls, from Eli Cooper.

 6) bcmgenet uses wrong device struct pointer in DMA API calls, fix from
    Florian Fainelli.

 7) Various netfilter nat bug fixes from FLorian Westphal.

 8) Fix memory leak in ipvlan_link_new(), from Gao Feng.

 9) Locking fixes, particularly wrt. socket lookups, in l2tp from
    Guillaume Nault.

10) Avoid invoking rhash teardowns in atomic context by moving netlink
    cb->done() dump completion from a worker thread. Fix from Herbert
    Xu.

11) Buffer refcount problems in tun and macvtap on errors, from Jason
    Wang.

12) We don't set Kconfig symbol DEFAULT_TCP_CONG properly when the user
    selects BBR. Fix from Julian Wollrath.

13) Fix deadlock in transmit path on altera TSE driver, from Lino
    Sanfilippo.

14) Fix unbalanced reference counting in dsa_switch_tree, from Nikita
    Yushchenko.

15) tc_tunnel_key needs to be properly exported to userspace via uapi,
    fix from Roi Dayan.

16) rds_tcp_init_net() doesn't unregister notifier in error path, fix
    from Sowmini Varadhan.

17) Stale packet header pointer access after pskb_expand_head() in
    genenve driver, fix from Sabrina Dubroca.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (103 commits)
  net: avoid signed overflows for SO_{SND|RCV}BUFFORCE
  geneve: avoid use-after-free of skb->data
  tipc: check minimum bearer MTU
  net: renesas: ravb: unintialized return value
  sh_eth: remove unchecked interrupts for RZ/A1
  net: bcmgenet: Utilize correct struct device for all DMA operations
  NET: usb: qmi_wwan: add support for Telit LE922A PID 0x1040
  cdc_ether: Fix handling connection notification
  ip6_offload: check segs for NULL in ipv6_gso_segment.
  RDS: TCP: unregister_netdevice_notifier() in error path of rds_tcp_init_net
  Revert: "ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()"
  ipv6: Set skb->protocol properly for local output
  ipv4: Set skb->protocol properly for local output
  packet: fix race condition in packet_set_ring
  net: ethernet: altera: TSE: do not use tx queue lock in tx completion handler
  net: ethernet: altera: TSE: Remove unneeded dma sync for tx buffers
  net: ethernet: stmmac: fix of-node and fixed-link-phydev leaks
  net: ethernet: stmmac: platform: fix outdated function header
  net: ethernet: stmmac: dwmac-meson8b: fix probe error path
  net: ethernet: stmmac: dwmac-generic: fix probe error path
  ...

7 years agonet: avoid signed overflows for SO_{SND|RCV}BUFFORCE
Eric Dumazet [Fri, 2 Dec 2016 17:44:53 +0000 (09:44 -0800)]
net: avoid signed overflows for SO_{SND|RCV}BUFFORCE

CAP_NET_ADMIN users should not be allowed to set negative
sk_sndbuf or sk_rcvbuf values, as it can lead to various memory
corruptions, crashes, OOM...

Note that before commit 98f292d34928 ("net: cleanups in
sock_setsockopt()"), the bug was even more serious, since SO_SNDBUF
and SO_RCVBUF were vulnerable.

This needs to be backported to all known linux kernels.

Again, many thanks to syzkaller team for discovering this gem.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agogeneve: avoid use-after-free of skb->data
Sabrina Dubroca [Fri, 2 Dec 2016 15:49:29 +0000 (16:49 +0100)]
geneve: avoid use-after-free of skb->data

geneve{,6}_build_skb can end up doing a pskb_expand_head(), which
makes the ip_hdr(skb) reference we stashed earlier stale. Since it's
only needed as an argument to ip_tunnel_ecn_encap(), move this
directly in the function call.

Fixes: 7a30eea2658a ("geneve: ensure ECN info is handled properly in all tx/rx paths")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agotipc: check minimum bearer MTU
Michal Kubeček [Fri, 2 Dec 2016 08:33:41 +0000 (09:33 +0100)]
tipc: check minimum bearer MTU

Qian Zhang (张谦) reported a potential socket buffer overflow in
tipc_msg_build() which is also known as CVE-2016-8632: due to
insufficient checks, a buffer overflow can occur if MTU is too short for
even tipc headers. As anyone can set device MTU in a user/net namespace,
this issue can be abused by a regular user.

As agreed in the discussion on Ben Hutchings' original patch, we should
check the MTU at the moment a bearer is attached rather than for each
processed packet. We also need to repeat the check when bearer MTU is
adjusted to new device MTU. UDP case also needs a check to avoid
overflow when calculating bearer MTU.

Fixes: 082dc377e86b ("[TIPC] Initial merge")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reported-by: Qian Zhang (张谦) <zhangqian-c@360.cn>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge tag 'linux-can-fixes-for-4.9-20161201' of git://git.kernel.org/pub/scm/linux...
David S. Miller [Fri, 2 Dec 2016 19:02:13 +0000 (14:02 -0500)]
Merge tag 'linux-can-fixes-for-4.9-20161201' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2016-12-02

this is a pull request for net/master.

There are two patches by Stephane Grosjean, who adds support for the new
PCAN-USB X6 USB interface to the pcan_usb driver.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: renesas: ravb: unintialized return value
Dan Carpenter [Thu, 1 Dec 2016 20:57:44 +0000 (23:57 +0300)]
net: renesas: ravb: unintialized return value

We want to set the other "err" variable here so that we can return it
later.  My version of GCC misses this issue but I caught it with a
static checker.

Fixes: f644407aa394 ("net: ethernet: renesas: ravb: fix fixed-link phydev leaks")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosh_eth: remove unchecked interrupts for RZ/A1
Chris Brandt [Thu, 1 Dec 2016 18:32:14 +0000 (13:32 -0500)]
sh_eth: remove unchecked interrupts for RZ/A1

When streaming a lot of data and the RZ/A1 can't keep up, some status bits
will get set that are not being checked or cleared which cause the
following messages and the Ethernet driver to stop working. This
patch fixes that issue.

irq 21: nobody cared (try booting with the "irqpoll" option)
handlers:
[<c036b71c>] sh_eth_interrupt
Disabling IRQ #21

Fixes: 38804f532eb57a30 ("sh_eth: Add support for r7s72100")
Signed-off-by: Chris Brandt <chris.brandt@renesas.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: bcmgenet: Utilize correct struct device for all DMA operations
Florian Fainelli [Thu, 1 Dec 2016 17:45:45 +0000 (09:45 -0800)]
net: bcmgenet: Utilize correct struct device for all DMA operations

__bcmgenet_tx_reclaim() and bcmgenet_free_rx_buffers() are not using the
same struct device during unmap that was used for the map operation,
which makes DMA-API debugging warn about it. Fix this by always using
&priv->pdev->dev throughout the driver, using an identical device
reference for all map/unmap calls.

Fixes: 83c070ed8c89 ("net: bcmgenet: add main driver file")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoFix up a couple of field names in the CREDITS file
Linus Torvalds [Fri, 2 Dec 2016 18:48:50 +0000 (10:48 -0800)]
Fix up a couple of field names in the CREDITS file

Ozgur Karatas reported that the very first entry in the CREDITS file had
the wrong tag for name (M: instead of N: - it happened when moving the
entry from the MAINTAINERS file, where 'M:' stands for "Maintainer").

And when I went looking, I found a couple of other cases of wrong
tagging too.

Reported-by: Ozgur Karatas <mueddib@yandex.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoNET: usb: qmi_wwan: add support for Telit LE922A PID 0x1040
Daniele Palmas [Thu, 1 Dec 2016 15:52:05 +0000 (16:52 +0100)]
NET: usb: qmi_wwan: add support for Telit LE922A PID 0x1040

This patch adds support for PID 0x1040 of Telit LE922A.

The qmi adapter requires to have DTR set for proper working,
so QMI_WWAN_QUIRK_DTR has been enabled.

Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agocdc_ether: Fix handling connection notification
Kristian Evensen [Thu, 1 Dec 2016 13:23:17 +0000 (14:23 +0100)]
cdc_ether: Fix handling connection notification

Commit f13692a70834 ("cdc_ether: Improve ZTE MF823/831/910 handling")
introduced a work-around in usbnet_cdc_status() for devices that exported
cdc carrier on twice on connect. Before the commit, this behavior caused
the link state to be incorrect. It was assumed that all CDC Ethernet
devices would either export this behavior, or send one off and then one on
notification (which seems to be the default behavior).

Unfortunately, it turns out multiple devices sends a connection
notification multiple times per second (via an interrupt), even when
connection state does not change. This has been observed with several
different USB LAN dongles (at least), for example 13b1:0041 (Linksys).
After f13692a70834, the link state has been set as down and then up for
each notification. This has caused a flood of Netlink NEWLINK messages and
syslog to be flooded with messages similar to:

cdc_ether 2-1:2.0 eth1: kevent 12 may have been dropped

This commit fixes the behavior by reverting usbnet_cdc_status() to how it
was before f13692a70834. The work-around has been moved to a separate
status-function which is only called when a known, affect device is
detected.

v1->v2:

* Do not open-code netif_carrier_ok() (thanks Henning Schild).
* Call netif_carrier_off() instead of usb_link_change(). This prevents
calling schedule_work() twice without giving the work queue a chance to be
processed (thanks Bjørn Mork).

Fixes: f13692a70834 ("cdc_ether: Improve ZTE MF823/831/910 handling")
Reported-by: Henning Schild <henning.schild@siemens.com>
Signed-off-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoip6_offload: check segs for NULL in ipv6_gso_segment.
Artem Savkov [Thu, 1 Dec 2016 13:06:04 +0000 (14:06 +0100)]
ip6_offload: check segs for NULL in ipv6_gso_segment.

segs needs to be checked for being NULL in ipv6_gso_segment() before calling
skb_shinfo(segs), otherwise kernel can run into a NULL-pointer dereference:

[   97.811262] BUG: unable to handle kernel NULL pointer dereference at 00000000000000cc
[   97.819112] IP: [<ffffffff816e52f9>] ipv6_gso_segment+0x119/0x2f0
[   97.825214] PGD 0 [   97.827047]
[   97.828540] Oops: 0000 [#1] SMP
[   97.831678] Modules linked in: vhost_net vhost macvtap macvlan nfsv3 rpcsec_gss_krb5
nfsv4 dns_resolver nfs fscache xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4
iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack
ipt_REJECT nf_reject_ipv4 tun ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter
bridge stp llc snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_intel
snd_hda_codec edac_mce_amd snd_hda_core edac_core snd_hwdep kvm_amd snd_seq kvm snd_seq_device
snd_pcm irqbypass snd_timer ppdev parport_serial snd parport_pc k10temp pcspkr soundcore parport
sp5100_tco shpchp sg wmi i2c_piix4 acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc
ip_tables xfs libcrc32c sr_mod cdrom sd_mod ata_generic pata_acpi amdkfd amd_iommu_v2 radeon
broadcom bcm_phy_lib i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
ttm ahci serio_raw tg3 firewire_ohci libahci pata_atiixp drm ptp libata firewire_core pps_core
i2c_core crc_itu_t fjes dm_mirror dm_region_hash dm_log dm_mod
[   97.927721] CPU: 1 PID: 3504 Comm: vhost-3495 Not tainted 4.9.0-7.el7.test.x86_64 #1
[   97.935457] Hardware name: AMD Snook/Snook, BIOS ESK0726A 07/26/2010
[   97.941806] task: ffff880129a1c080 task.stack: ffffc90001bcc000
[   97.947720] RIP: 0010:[<ffffffff816e52f9>]  [<ffffffff816e52f9>] ipv6_gso_segment+0x119/0x2f0
[   97.956251] RSP: 0018:ffff88012fc43a10  EFLAGS: 00010207
[   97.961557] RAX: 0000000000000000 RBX: ffff8801292c8700 RCX: 0000000000000594
[   97.968687] RDX: 0000000000000593 RSI: ffff880129a846c0 RDI: 0000000000240000
[   97.975814] RBP: ffff88012fc43a68 R08: ffff880129a8404e R09: 0000000000000000
[   97.982942] R10: 0000000000000000 R11: ffff880129a84076 R12: 00000020002949b3
[   97.990070] R13: ffff88012a580000 R14: 0000000000000000 R15: ffff88012a580000
[   97.997198] FS:  0000000000000000(0000) GS:ffff88012fc40000(0000) knlGS:0000000000000000
[   98.005280] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   98.011021] CR2: 00000000000000cc CR3: 0000000126c5d000 CR4: 00000000000006e0
[   98.018149] Stack:
[   98.020157]  00000000ffffffff ffff88012fc43ac8 ffffffffa017ad0a 000000000000000e
[   98.027584]  0000001300000000 0000000077d59998 ffff8801292c8700 00000020002949b3
[   98.035010]  ffff88012a580000 0000000000000000 ffff88012a580000 ffff88012fc43a98
[   98.042437] Call Trace:
[   98.044879]  <IRQ> [   98.046803]  [<ffffffffa017ad0a>] ? tg3_start_xmit+0x84a/0xd60 [tg3]
[   98.053156]  [<ffffffff815eeee0>] skb_mac_gso_segment+0xb0/0x130
[   98.059158]  [<ffffffff815eefd3>] __skb_gso_segment+0x73/0x110
[   98.064985]  [<ffffffff815ef40d>] validate_xmit_skb+0x12d/0x2b0
[   98.070899]  [<ffffffff815ef5d2>] validate_xmit_skb_list+0x42/0x70
[   98.077073]  [<ffffffff81618560>] sch_direct_xmit+0xd0/0x1b0
[   98.082726]  [<ffffffff815efd86>] __dev_queue_xmit+0x486/0x690
[   98.088554]  [<ffffffff8135c135>] ? cpumask_next_and+0x35/0x50
[   98.094380]  [<ffffffff815effa0>] dev_queue_xmit+0x10/0x20
[   98.099863]  [<ffffffffa09ce057>] br_dev_queue_push_xmit+0xa7/0x170 [bridge]
[   98.106907]  [<ffffffffa09ce161>] br_forward_finish+0x41/0xc0 [bridge]
[   98.113430]  [<ffffffff81627cf2>] ? nf_iterate+0x52/0x60
[   98.118735]  [<ffffffff81627d6b>] ? nf_hook_slow+0x6b/0xc0
[   98.124216]  [<ffffffffa09ce32c>] __br_forward+0x14c/0x1e0 [bridge]
[   98.130480]  [<ffffffffa09ce120>] ? br_dev_queue_push_xmit+0x170/0x170 [bridge]
[   98.137785]  [<ffffffffa09ce4bd>] br_forward+0x9d/0xb0 [bridge]
[   98.143701]  [<ffffffffa09cfbb7>] br_handle_frame_finish+0x267/0x560 [bridge]
[   98.150834]  [<ffffffffa09d0064>] br_handle_frame+0x174/0x2f0 [bridge]
[   98.157355]  [<ffffffff8102fb89>] ? sched_clock+0x9/0x10
[   98.162662]  [<ffffffff810b63b2>] ? sched_clock_cpu+0x72/0xa0
[   98.168403]  [<ffffffff815eccf5>] __netif_receive_skb_core+0x1e5/0xa20
[   98.174926]  [<ffffffff813659f9>] ? timerqueue_add+0x59/0xb0
[   98.180580]  [<ffffffff815ed548>] __netif_receive_skb+0x18/0x60
[   98.186494]  [<ffffffff815ee625>] process_backlog+0x95/0x140
[   98.192145]  [<ffffffff815edccd>] net_rx_action+0x16d/0x380
[   98.197713]  [<ffffffff8170cff1>] __do_softirq+0xd1/0x283
[   98.203106]  [<ffffffff8170b2bc>] do_softirq_own_stack+0x1c/0x30
[   98.209107]  <EOI> [   98.211029]  [<ffffffff8108a5c0>] do_softirq+0x50/0x60
[   98.216166]  [<ffffffff815ec853>] netif_rx_ni+0x33/0x80
[   98.221386]  [<ffffffffa09eeff7>] tun_get_user+0x487/0x7f0 [tun]
[   98.227388]  [<ffffffffa09ef3ab>] tun_sendmsg+0x4b/0x60 [tun]
[   98.233129]  [<ffffffffa0b68932>] handle_tx+0x282/0x540 [vhost_net]
[   98.239392]  [<ffffffffa0b68c25>] handle_tx_kick+0x15/0x20 [vhost_net]
[   98.245916]  [<ffffffffa0abacfe>] vhost_worker+0x9e/0xf0 [vhost]
[   98.251919]  [<ffffffffa0abac60>] ? vhost_umem_alloc+0x40/0x40 [vhost]
[   98.258440]  [<ffffffff81003a47>] ? do_syscall_64+0x67/0x180
[   98.264094]  [<ffffffff810a44d9>] kthread+0xd9/0xf0
[   98.268965]  [<ffffffff810a4400>] ? kthread_park+0x60/0x60
[   98.274444]  [<ffffffff8170a4d5>] ret_from_fork+0x25/0x30
[   98.279836] Code: 8b 93 d8 00 00 00 48 2b 93 d0 00 00 00 4c 89 e6 48 89 df 66 89 93 c2 00 00 00 ff 10 48 3d 00 f0 ff ff 49 89 c2 0f 87 52 01 00 00 <41> 8b 92 cc 00 00 00 48 8b 80 d0 00 00 00 44 0f b7 74 10 06 66
[   98.299425] RIP  [<ffffffff816e52f9>] ipv6_gso_segment+0x119/0x2f0
[   98.305612]  RSP <ffff88012fc43a10>
[   98.309094] CR2: 00000000000000cc
[   98.312406] ---[ end trace 726a2c7a2d2d78d0 ]---

Signed-off-by: Artem Savkov <asavkov@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoRDS: TCP: unregister_netdevice_notifier() in error path of rds_tcp_init_net
Sowmini Varadhan [Thu, 1 Dec 2016 12:44:43 +0000 (04:44 -0800)]
RDS: TCP: unregister_netdevice_notifier() in error path of rds_tcp_init_net

If some error is encountered in rds_tcp_init_net, make sure to
unregister_netdevice_notifier(), else we could trigger a panic
later on, when the modprobe from a netns fails.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoRevert: "ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()"
Eli Cooper [Thu, 1 Dec 2016 02:05:12 +0000 (10:05 +0800)]
Revert: "ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()"

This reverts commit 97048225dce46415c8a2dd6c55999dea29917578
("ip6_tunnel: Update skb->protocol to ETH_P_IPV6 in ip6_tnl_xmit()").

skb->protocol is now set in __ip_local_out() and __ip6_local_out() before
dst_output() is called. It is no longer necessary to do it for each tunnel.

Cc: stable@vger.kernel.org
Signed-off-by: Eli Cooper <elicooper@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoipv6: Set skb->protocol properly for local output
Eli Cooper [Thu, 1 Dec 2016 02:05:11 +0000 (10:05 +0800)]
ipv6: Set skb->protocol properly for local output

When xfrm is applied to TSO/GSO packets, it follows this path:

    xfrm_output() -> xfrm_output_gso() -> skb_gso_segment()

where skb_gso_segment() relies on skb->protocol to function properly.

This patch sets skb->protocol to ETH_P_IPV6 before dst_output() is called,
fixing a bug where GSO packets sent through an ipip6 tunnel are dropped
when xfrm is involved.

Cc: stable@vger.kernel.org
Signed-off-by: Eli Cooper <elicooper@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoipv4: Set skb->protocol properly for local output
Eli Cooper [Thu, 1 Dec 2016 02:05:10 +0000 (10:05 +0800)]
ipv4: Set skb->protocol properly for local output

When xfrm is applied to TSO/GSO packets, it follows this path:

    xfrm_output() -> xfrm_output_gso() -> skb_gso_segment()

where skb_gso_segment() relies on skb->protocol to function properly.

This patch sets skb->protocol to ETH_P_IP before dst_output() is called,
fixing a bug where GSO packets sent through a sit tunnel are dropped
when xfrm is involved.

Cc: stable@vger.kernel.org
Signed-off-by: Eli Cooper <elicooper@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agopacket: fix race condition in packet_set_ring
Philip Pettersson [Wed, 30 Nov 2016 22:55:36 +0000 (14:55 -0800)]
packet: fix race condition in packet_set_ring

When packet_set_ring creates a ring buffer it will initialize a
struct timer_list if the packet version is TPACKET_V3. This value
can then be raced by a different thread calling setsockopt to
set the version to TPACKET_V1 before packet_set_ring has finished.

This leads to a use-after-free on a function pointer in the
struct timer_list when the socket is closed as the previously
initialized timer will not be deleted.

The bug is fixed by taking lock_sock(sk) in packet_setsockopt when
changing the packet version while also taking the lock at the start
of packet_set_ring.

Fixes: 7fa5f1dbc2ee ("af-packet: TPACKET_V3 flexible buffer implementation.")
Signed-off-by: Philip Pettersson <philip.pettersson@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Fri, 2 Dec 2016 17:15:26 +0000 (09:15 -0800)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Radim Krčmář:
 "All architectures avoid memory corruption in an error path. ARM
  prevents bogus acknowledgement of interrupts"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: use after free in kvm_ioctl_create_device()
  KVM: arm/arm64: vgic: Don't notify EOI for non-SPIs

7 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Fri, 2 Dec 2016 17:12:44 +0000 (09:12 -0800)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fix from Wolfram Sang:
 "Here is the revert for the regression of the i2c-octeon driver I
  mentioned last time. I wished for a bit more feedback, but all people
  working actively on it are in need of this patch, so here it goes"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  Revert "i2c: octeon: thunderx: Limit register access retries"

7 years agonet: ethernet: altera: TSE: do not use tx queue lock in tx completion handler
Lino Sanfilippo [Wed, 30 Nov 2016 22:48:32 +0000 (23:48 +0100)]
net: ethernet: altera: TSE: do not use tx queue lock in tx completion handler

The driver already uses its private lock for synchronization between xmit
and xmit completion handler making the additional use of the xmit_lock
unnecessary.
Furthermore the driver does not set NETIF_F_LLTX resulting in xmit to be
called with the xmit_lock held and then taking the private lock while xmit
completion handler does the reverse, first take the private lock, then the
xmit_lock.
Fix these issues by not taking the xmit_lock in the tx completion handler.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: altera: TSE: Remove unneeded dma sync for tx buffers
Lino Sanfilippo [Wed, 30 Nov 2016 22:48:31 +0000 (23:48 +0100)]
net: ethernet: altera: TSE: Remove unneeded dma sync for tx buffers

An explicit dma sync for device directly after mapping as well as an
explicit dma sync for cpu directly before unmapping is unnecessary and
costly on the hotpath. So remove these calls.

Signed-off-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agodefault exported asm symbols to zero
Arnd Bergmann [Fri, 2 Dec 2016 12:40:27 +0000 (13:40 +0100)]
default exported asm symbols to zero

With binutils-2.26 and before, a weak missing symbol was kept during the
final link, and a missing CRC for an export would lead to that CRC being
treated as zero implicitly.  With binutils-2.27, the crc symbol gets
dropped, and any module trying to use it will fail to load.

This sets the weak CRC symbol to zero explicitly, making it defined in
vmlinux, which in turn lets us load the modules referring to that CRC.

The comment above the __CRC_SYMBOL macro suggests that this was always
the intention, although it also seems that all symbols defined in C have
a correct CRC these days, and only the exports that are now done in
assembly need this.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Adam Borowski <kilobyte@angband.pl>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoarm64: dts: juno: fix cluster sleep state entry latency on all SoC versions
Sudeep Holla [Wed, 16 Nov 2016 17:31:31 +0000 (17:31 +0000)]
arm64: dts: juno: fix cluster sleep state entry latency on all SoC versions

The core and the cluster sleep state entry latencies can't be same as
cluster sleep involves more work compared to core level e.g. shared
cache maintenance.

Experiments have shown on an average about 100us more latency for the
cluster sleep state compared to the core level sleep. This patch fixes
the entry latency for the cluster sleep state.

Fixes: 33f9bb3f48b6 ("arm64: dts: juno: Add idle-states to device tree")
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: "Jon Medhurst (Tixy)" <tixy@linaro.org>
Reviewed-by: Liviu Dudau <Liviu.Dudau@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
7 years agoMerge branch 'stmmac-probe-error-handling-and-phydev-leaks'
David S. Miller [Fri, 2 Dec 2016 15:42:47 +0000 (10:42 -0500)]
Merge branch 'stmmac-probe-error-handling-and-phydev-leaks'

Johan Hovold says:

====================
net: stmmac: fix probe error handling and phydev leaks

This series fixes a number of issues with the stmmac-driver probe error
handling, which for example left clocks enabled after probe failures.

The final patch fixes a failure to deregister and free any fixed-link
PHYs that were registered during probe on probe errors and on driver
unbind. It also fixes a related of-node leak on late probe errors.

This series depends on the of_phy_deregister_fixed_link() helper that
was just merged to net.

As mentioned earlier, one staging driver also suffers from a similar
leak and can be fixed up once the above mentioned helper hits mainline.

Note that these patches have only been compile tested.
====================

Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: stmmac: fix of-node and fixed-link-phydev leaks
Johan Hovold [Wed, 30 Nov 2016 14:29:55 +0000 (15:29 +0100)]
net: ethernet: stmmac: fix of-node and fixed-link-phydev leaks

Make sure to deregister and free any fixed-link phy registered during
probe on probe errors and on driver unbind by adding a new glue helper
function.

Drop the of-node reference taken in the same path also on late probe
errors (and not just on driver unbind) by moving the put from
stmmac_dvr_remove() to the new helper.

Fixes: 9dc3644e2bc8 ("stmmac: add fixed-link device-tree support")
Fixes: 3f00850be5e7 ("ethernet: stmicro: stmmac: add missing of_node_put
after calling of_parse_phandle")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: stmmac: platform: fix outdated function header
Johan Hovold [Wed, 30 Nov 2016 14:29:54 +0000 (15:29 +0100)]
net: ethernet: stmmac: platform: fix outdated function header

Fix the OF-helper function header to reflect that the function no longer
has a platform-data parameter.

Fixes: 10c736ab6ba3 ("stmmac: make stmmac_probe_config_dt return the
platform data struct")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: stmmac: dwmac-meson8b: fix probe error path
Johan Hovold [Wed, 30 Nov 2016 14:29:53 +0000 (15:29 +0100)]
net: ethernet: stmmac: dwmac-meson8b: fix probe error path

Make sure to disable clocks before returning on late probe errors.

Fixes: 5b97865dfed5 ("net: stmmac: add a glue driver for the Amlogic
Meson 8b / GXBB DWMAC")
Signed-off-by: Johan Hovold <johan@kernel.org>
Acked-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: stmmac: dwmac-generic: fix probe error path
Johan Hovold [Wed, 30 Nov 2016 14:29:52 +0000 (15:29 +0100)]
net: ethernet: stmmac: dwmac-generic: fix probe error path

Make sure to call any exit() callback to undo the effect of init()
before returning on late probe errors.

Fixes: 6f4675c34620 ("stmmac: move hw init in the probe (v2)")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: stmmac: dwmac-rk: fix probe error path
Johan Hovold [Wed, 30 Nov 2016 14:29:51 +0000 (15:29 +0100)]
net: ethernet: stmmac: dwmac-rk: fix probe error path

Make sure to disable runtime PM, power down the PHY, and disable clocks
before returning on late probe errors.

Fixes: 58638061fc8b ("stmmac: dwmac-rk: create a new probe function")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: stmmac: dwmac-sti: fix probe error path
Johan Hovold [Wed, 30 Nov 2016 14:29:50 +0000 (15:29 +0100)]
net: ethernet: stmmac: dwmac-sti: fix probe error path

Make sure to disable clocks before returning on late probe errors.

Fixes: ecb756a36a0c ("stmmac: dwmac-sti: turn setup callback into a
probe function")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: ethernet: stmmac: dwmac-socfpga: fix use-after-free on probe errors
Johan Hovold [Wed, 30 Nov 2016 14:29:49 +0000 (15:29 +0100)]
net: ethernet: stmmac: dwmac-socfpga: fix use-after-free on probe errors

Make sure to call stmmac_dvr_remove() before returning on late probe
errors so that memory is freed, clocks are disabled, and the netdev is
deregistered before its resources go away.

Fixes: f2886db2f5aa ("net: stmmac: socfpga: Remove re-registration of
reset controller")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/rtnetlink: fix attribute name in nlmsg_size() comments
Tobias Klauser [Wed, 30 Nov 2016 13:30:37 +0000 (14:30 +0100)]
net/rtnetlink: fix attribute name in nlmsg_size() comments

Use the correct attribute constant names IFLA_GSO_MAX_{SEGS,SIZE}
instead of IFLA_MAX_GSO_{SEGS,SIZE} for the comments int nlmsg_size().

Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agolocking/rtmutex: Use READ_ONCE() in rt_mutex_owner()
Thomas Gleixner [Wed, 30 Nov 2016 21:04:42 +0000 (21:04 +0000)]
locking/rtmutex: Use READ_ONCE() in rt_mutex_owner()

While debugging the rtmutex unlock vs. dequeue race Will suggested to use
READ_ONCE() in rt_mutex_owner() as it might race against the
cmpxchg_release() in unlock_rt_mutex_safe().

Will: "It's a minor thing which will most likely not matter in practice"

Careful search did not unearth an actual problem in todays code, but it's
better to be safe than surprised.

Suggested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: David Daney <ddaney@caviumnetworks.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20161130210030.431379999@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agolocking/rtmutex: Prevent dequeue vs. unlock race
Thomas Gleixner [Wed, 30 Nov 2016 21:04:41 +0000 (21:04 +0000)]
locking/rtmutex: Prevent dequeue vs. unlock race

David reported a futex/rtmutex state corruption. It's caused by the
following problem:

CPU0 CPU1 CPU2

l->owner=T1
rt_mutex_lock(l)
lock(l->wait_lock)
l->owner = T1 | HAS_WAITERS;
enqueue(T2)
boost()
  unlock(l->wait_lock)
schedule()

rt_mutex_lock(l)
lock(l->wait_lock)
l->owner = T1 | HAS_WAITERS;
enqueue(T3)
boost()
  unlock(l->wait_lock)
schedule()
signal(->T2) signal(->T3)
lock(l->wait_lock)
dequeue(T2)
deboost()
  unlock(l->wait_lock)
lock(l->wait_lock)
dequeue(T3)
  ===> wait list is now empty
deboost()
 unlock(l->wait_lock)
lock(l->wait_lock)
fixup_rt_mutex_waiters()
  if (wait_list_empty(l)) {
    owner = l->owner & ~HAS_WAITERS;
    l->owner = owner
     ==> l->owner = T1
  }

lock(l->wait_lock)
rt_mutex_unlock(l) fixup_rt_mutex_waiters()
  if (wait_list_empty(l)) {
    owner = l->owner & ~HAS_WAITERS;
cmpxchg(l->owner, T1, NULL)
 ===> Success (l->owner = NULL)
    l->owner = owner
     ==> l->owner = T1
  }

That means the problem is caused by fixup_rt_mutex_waiters() which does the
RMW to clear the waiters bit unconditionally when there are no waiters in
the rtmutexes rbtree.

This can be fatal: A concurrent unlock can release the rtmutex in the
fastpath because the waiters bit is not set. If the cmpxchg() gets in the
middle of the RMW operation then the previous owner, which just unlocked
the rtmutex is set as the owner again when the write takes place after the
successfull cmpxchg().

The solution is rather trivial: verify that the owner member of the rtmutex
has the waiters bit set before clearing it. This does not require a
cmpxchg() or other atomic operations because the waiters bit can only be
set and cleared with the rtmutex wait_lock held. It's also safe against the
fast path unlock attempt. The unlock attempt via cmpxchg() will either see
the bit set and take the slowpath or see the bit cleared and release it
atomically in the fastpath.

It's remarkable that the test program provided by David triggers on ARM64
and MIPS64 really quick, but it refuses to reproduce on x86-64, while the
problem exists there as well. That refusal might explain that this got not
discovered earlier despite the bug existing from day one of the rtmutex
implementation more than 10 years ago.

Thanks to David for meticulously instrumenting the code and providing the
information which allowed to decode this subtle problem.

Reported-by: David Daney <ddaney@caviumnetworks.com>
Tested-by: David Daney <david.daney@cavium.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: stable@vger.kernel.org
Fixes: 133790fe1d3d ("[PATCH] pi-futex: rt mutex core")
Link: http://lkml.kernel.org/r/20161130210030.351136722@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agobatman-adv: Check for alloc errors when preparing TT local data
Sven Eckelmann [Wed, 30 Nov 2016 20:47:09 +0000 (21:47 +0100)]
batman-adv: Check for alloc errors when preparing TT local data

batadv_tt_prepare_tvlv_local_data can fail to allocate the memory for the
new TVLV block. The caller is informed about this problem with the returned
length of 0. Not checking this value results in an invalid memory access
when either tt_data or tt_change is accessed.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 2c95804667d7 ("batman-adv: make the TT CRC logic VLAN specific")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
7 years agoMerge tag 'pci-v4.9-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Linus Torvalds [Fri, 2 Dec 2016 00:44:42 +0000 (16:44 -0800)]
Merge tag 'pci-v4.9-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:
 "PCI fixes:

   - Fix Read Completion Boundary setting, which fixes a boot failure on
     IBM x3850 with Mellanox MT27500 ConnectX-3

   - Update some MAINTAINERS entries and email addresses"

* tag 'pci-v4.9-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Set Read Completion Boundary to 128 iff Root Port supports it (_HPX)
  PCI: Export pcie_find_root_port
  PCI: designware-plat: Update author email
  PCI: designware: Change maintainer to Joao Pinto
  MAINTAINERS: Add devicetree binding to PCI i.MX6 entry
  MAINTAINERS: Update Richard Zhu's email address

7 years agoixgbe/ixgbevf: Don't use lco_csum to compute IPv4 checksum
Alexander Duyck [Mon, 28 Nov 2016 15:42:29 +0000 (10:42 -0500)]
ixgbe/ixgbevf: Don't use lco_csum to compute IPv4 checksum

In the case of IPIP and SIT tunnel frames the outer transport header
offset is actually set to the same offset as the inner transport header.
This results in the lco_csum call not doing any checksum computation over
the inner IPv4/v6 header data.

In order to account for that I am updating the code so that we determine
the location to start the checksum ourselves based on the location of the
IPv4 header and the length.

Fixes: 0f48a278078b ("ixgbe/ixgbevf: Add support for GSO partial")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoigb/igbvf: Don't use lco_csum to compute IPv4 checksum
Alexander Duyck [Mon, 28 Nov 2016 15:42:23 +0000 (10:42 -0500)]
igb/igbvf: Don't use lco_csum to compute IPv4 checksum

In the case of IPIP and SIT tunnel frames the outer transport header
offset is actually set to the same offset as the inner transport header.
This results in the lco_csum call not doing any checksum computation over
the inner IPv4/v6 header data.

In order to account for that I am updating the code so that we determine
the location to start the checksum ourselves based on the location of the
IPv4 header and the length.

Fixes: 40d8dc3ebe29 ("igb/igbvf: Add support for GSO partial")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: asix: Fix AX88772_suspend() USB vendor commands failure issues
allan [Wed, 30 Nov 2016 08:29:08 +0000 (16:29 +0800)]
net: asix: Fix AX88772_suspend() USB vendor commands failure issues

The change fixes AX88772_suspend() USB vendor commands failure issues.

Signed-off-by: Allan Chou <allan@asix.com.tw>
Tested-by: Allan Chou <allan@asix.com.tw>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszer...
Linus Torvalds [Thu, 1 Dec 2016 18:31:53 +0000 (10:31 -0800)]
Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs fix from Miklos Szeredi:
 "This fixes a regression introduced in 4.8"

* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: fix d_real() for stacked fs

7 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Thu, 1 Dec 2016 18:29:41 +0000 (10:29 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov: "We are disabling automatic
  probing of BYD touchpads as it results in too many false positives,
  and the hardware is not terribly popular and having the protocol
  support does not result in significantly improved user experience.

  We also change keycode for KEY_DATA to avoid clashing with
  KEY_FASTREVERSE. Luckily this newish code is used by CEC framework
  that is still in staging, so it is extremely unlikely that someone has
  already started using this keycode"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: change KEY_DATA from 0x275 to 0x277
  Input: psmouse - disable automatic probing of BYD touchpads