]> git.baikalelectronics.ru Git - kernel.git/log
kernel.git
7 years agobpf: add test cases for new BPF_J{LT, LE, SLT, SLE} instructions
Daniel Borkmann [Wed, 9 Aug 2017 23:40:03 +0000 (01:40 +0200)]
bpf: add test cases for new BPF_J{LT, LE, SLT, SLE} instructions

Add test cases to the verifier selftest suite in order to verify that
i) direct packet access, and ii) dynamic map value access is working
with the changes related to the new instructions.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: enable BPF_J{LT, LE, SLT, SLE} opcodes in verifier
Daniel Borkmann [Wed, 9 Aug 2017 23:40:02 +0000 (01:40 +0200)]
bpf: enable BPF_J{LT, LE, SLT, SLE} opcodes in verifier

Enable the newly added jump opcodes, main parts are in two
different areas, namely direct packet access and dynamic map
value access. For the direct packet access, we now allow for
the following two new patterns to match in order to trigger
markings with find_good_pkt_pointers():

Variant 1 (access ok when taking the branch):

  0: (61) r2 = *(u32 *)(r1 +76)
  1: (61) r3 = *(u32 *)(r1 +80)
  2: (bf) r0 = r2
  3: (07) r0 += 8
  4: (ad) if r0 < r3 goto pc+2
  R0=pkt(id=0,off=8,r=0) R1=ctx R2=pkt(id=0,off=0,r=0)
  R3=pkt_end R10=fp
  5: (b7) r0 = 0
  6: (95) exit

  from 4 to 7: R0=pkt(id=0,off=8,r=8) R1=ctx
               R2=pkt(id=0,off=0,r=8) R3=pkt_end R10=fp
  7: (71) r0 = *(u8 *)(r2 +0)
  8: (05) goto pc-4
  5: (b7) r0 = 0
  6: (95) exit
  processed 11 insns, stack depth 0

Variant 2 (access ok on fall-through):

  0: (61) r2 = *(u32 *)(r1 +76)
  1: (61) r3 = *(u32 *)(r1 +80)
  2: (bf) r0 = r2
  3: (07) r0 += 8
  4: (bd) if r3 <= r0 goto pc+1
  R0=pkt(id=0,off=8,r=8) R1=ctx R2=pkt(id=0,off=0,r=8)
  R3=pkt_end R10=fp
  5: (71) r0 = *(u8 *)(r2 +0)
  6: (b7) r0 = 1
  7: (95) exit

  from 4 to 6: R0=pkt(id=0,off=8,r=0) R1=ctx
               R2=pkt(id=0,off=0,r=0) R3=pkt_end R10=fp
  6: (b7) r0 = 1
  7: (95) exit
  processed 10 insns, stack depth 0

The above two basically just swap the branches where we need
to handle an exception and allow packet access compared to the
two already existing variants for find_good_pkt_pointers().

For the dynamic map value access, we add the new instructions
to reg_set_min_max() and reg_set_min_max_inv() in order to
learn bounds. Verifier test cases for both are added in a
follow-up patch.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf, nfp: implement jiting of BPF_J{LT,LE}
Daniel Borkmann [Wed, 9 Aug 2017 23:40:01 +0000 (01:40 +0200)]
bpf, nfp: implement jiting of BPF_J{LT,LE}

This work implements jiting of BPF_J{LT,LE} instructions with
BPF_X/BPF_K variants for the nfp eBPF JIT. The two BPF_J{SLT,SLE}
instructions have not been added yet given BPF_J{SGT,SGE} are
not supported yet either.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf, ppc64: implement jiting of BPF_J{LT, LE, SLT, SLE}
Daniel Borkmann [Wed, 9 Aug 2017 23:40:00 +0000 (01:40 +0200)]
bpf, ppc64: implement jiting of BPF_J{LT, LE, SLT, SLE}

This work implements jiting of BPF_J{LT,LE,SLT,SLE} instructions
with BPF_X/BPF_K variants for the ppc64 eBPF JIT.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf, s390x: implement jiting of BPF_J{LT, LE, SLT, SLE}
Daniel Borkmann [Wed, 9 Aug 2017 23:39:59 +0000 (01:39 +0200)]
bpf, s390x: implement jiting of BPF_J{LT, LE, SLT, SLE}

This work implements jiting of BPF_J{LT,LE,SLT,SLE} instructions
with BPF_X/BPF_K variants for the s390x eBPF JIT.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf, sparc64: implement jiting of BPF_J{LT, LE, SLT, SLE}
Daniel Borkmann [Wed, 9 Aug 2017 23:39:58 +0000 (01:39 +0200)]
bpf, sparc64: implement jiting of BPF_J{LT, LE, SLT, SLE}

This work implements jiting of BPF_J{LT,LE,SLT,SLE} instructions
with BPF_X/BPF_K variants for the sparc64 eBPF JIT.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf, arm64: implement jiting of BPF_J{LT, LE, SLT, SLE}
Daniel Borkmann [Wed, 9 Aug 2017 23:39:57 +0000 (01:39 +0200)]
bpf, arm64: implement jiting of BPF_J{LT, LE, SLT, SLE}

This work implements jiting of BPF_J{LT,LE,SLT,SLE} instructions
with BPF_X/BPF_K variants for the arm64 eBPF JIT.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf, x86: implement jiting of BPF_J{LT,LE,SLT,SLE}
Daniel Borkmann [Wed, 9 Aug 2017 23:39:56 +0000 (01:39 +0200)]
bpf, x86: implement jiting of BPF_J{LT,LE,SLT,SLE}

This work implements jiting of BPF_J{LT,LE,SLT,SLE} instructions
with BPF_X/BPF_K variants for the x86_64 eBPF JIT.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: add BPF_J{LT,LE,SLT,SLE} instructions
Daniel Borkmann [Wed, 9 Aug 2017 23:39:55 +0000 (01:39 +0200)]
bpf: add BPF_J{LT,LE,SLT,SLE} instructions

Currently, eBPF only understands BPF_JGT (>), BPF_JGE (>=),
BPF_JSGT (s>), BPF_JSGE (s>=) instructions, this means that
particularly *JLT/*JLE counterparts involving immediates need
to be rewritten from e.g. X < [IMM] by swapping arguments into
[IMM] > X, meaning the immediate first is required to be loaded
into a register Y := [IMM], such that then we can compare with
Y > X. Note that the destination operand is always required to
be a register.

This has the downside of having unnecessarily increased register
pressure, meaning complex program would need to spill other
registers temporarily to stack in order to obtain an unused
register for the [IMM]. Loading to registers will thus also
affect state pruning since we need to account for that register
use and potentially those registers that had to be spilled/filled
again. As a consequence slightly more stack space might have
been used due to spilling, and BPF programs are a bit longer
due to extra code involving the register load and potentially
required spill/fills.

Thus, add BPF_JLT (<), BPF_JLE (<=), BPF_JSLT (s<), BPF_JSLE (s<=)
counterparts to the eBPF instruction set. Modifying LLVM to
remove the NegateCC() workaround in a PoC patch at [1] and
allowing it to also emit the new instructions resulted in
cilium's BPF programs that are injected into the fast-path to
have a reduced program length in the range of 2-3% (e.g.
accumulated main and tail call sections from one of the object
file reduced from 4864 to 4729 insns), reduced complexity in
the range of 10-30% (e.g. accumulated sections reduced in one
of the cases from 116432 to 88428 insns), and reduced stack
usage in the range of 1-5% (e.g. accumulated sections from one
of the object files reduced from 824 to 784b).

The modification for LLVM will be incorporated in a backwards
compatible way. Plan is for LLVM to have i) a target specific
option to offer a possibility to explicitly enable the extension
by the user (as we have with -m target specific extensions today
for various CPU insns), and ii) have the kernel checked for
presence of the extensions and enable them transparently when
the user is selecting more aggressive options such as -march=native
in a bpf target context. (Other frontends generating BPF byte
code, e.g. ply can probe the kernel directly for its code
generation.)

  [1] https://github.com/borkmann/llvm/tree/bpf-insns

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'net-zerocopy-fixes'
David S. Miller [Wed, 9 Aug 2017 23:49:17 +0000 (16:49 -0700)]
Merge branch 'net-zerocopy-fixes'

Willem de Bruijn says:

====================
net: zerocopy fixes

Fix two issues introduced in the msg_zerocopy patchset.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosock: fix zerocopy_success regression with msg_zerocopy
Willem de Bruijn [Wed, 9 Aug 2017 23:09:44 +0000 (19:09 -0400)]
sock: fix zerocopy_success regression with msg_zerocopy

Do not use uarg->zerocopy outside msg_zerocopy. In other paths the
field is not explicitly initialized and aliases another field.

Those paths have only one reference so do not need this intermediate
variable. Call uarg->callback directly.

Fixes: 1f8b977ab32d ("sock: enable MSG_ZEROCOPY")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosock: fix zerocopy panic in mem accounting
Willem de Bruijn [Wed, 9 Aug 2017 23:09:43 +0000 (19:09 -0400)]
sock: fix zerocopy panic in mem accounting

Only call mm_unaccount_pinned_pages when releasing a struct ubuf_info
that has initialized its field uarg->mmp.

Before this patch, a vhost-net with experimental_zcopytx can crash in

  mm_unaccount_pinned_pages
  sock_zerocopy_put
  skb_zcopy_clear
  skb_release_data

Only sock_zerocopy_alloc initializes this field. Move the unaccount
call from generic sock_zerocopy_put to its specific callback
sock_zerocopy_callback.

Fixes: a91dbff551a6 ("sock: ulimit on MSG_ZEROCOPY pages")
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Wed, 9 Aug 2017 23:47:19 +0000 (16:47 -0700)]
Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2017-08-08

This series contains updates to e1000e and igb/igbvf.

Gangfeng Huang fixes an issue with receive network flow classification,
where igb_nfc_filter_exit() was not being called in igb_down() which
would cause the filter tables to "fill up" if a user where to change
the adapter settings (such as speed) which requires a reset of the
adapter.

Cliff Spradlin fixes a timestamping issue, where igb was allowing requests
for hardware timestamping even if it was not configured for hardware
transmit timestamping.

Corinna Vinschen removes the error message that there was an "unexpected
SYS WRAP", when it is actually expected.  So remove the message to not
confuse users.

Greg Edwards provides several patches for the mailbox interface between
the PF and VF drivers.  Added a mailbox unlock method to be used to unlock
the PF/VF mailbox by the PF.  Added a lock around the VF mailbox ops to
prevent the VF from sending another message while the PF is still
processing the previous message.  Fixed a "scheduling while atomic" issue
by changing msleep() to mdelay().

Sasha adds support for the next LOM generations i219 (v8 & v9) which
will be available in the next Intel client platform IceLake.

John Linville adds support for a Broadcom PHY to the igb driver, since
there are designs out in the world which use the igb MAC and a third
party PHY.  This allows the driver to load and function as expected on
these designs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Wed, 9 Aug 2017 23:28:45 +0000 (16:28 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

The UDP offload conflict is dealt with by simply taking what is
in net-next where we have removed all of the UFO handling code
entirely.

The TCP conflict was a case of local variables in a function
being removed from both net and net-next.

In netvsc we had an assignment right next to where a missing
set of u64 stats sync object inits were added.

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agofutex: Remove unnecessary warning from get_futex_key
Mel Gorman [Wed, 9 Aug 2017 07:27:11 +0000 (08:27 +0100)]
futex: Remove unnecessary warning from get_futex_key

Commit 65d8fc777f6d ("futex: Remove requirement for lock_page() in
get_futex_key()") removed an unnecessary lock_page() with the
side-effect that page->mapping needed to be treated very carefully.

Two defensive warnings were added in case any assumption was missed and
the first warning assumed a correct application would not alter a
mapping backing a futex key.  Since merging, it has not triggered for
any unexpected case but Mark Rutland reported the following bug
triggering due to the first warning.

  kernel BUG at kernel/futex.c:679!
  Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 0 PID: 3695 Comm: syz-executor1 Not tainted 4.13.0-rc3-00020-g307fec773ba3 #3
  Hardware name: linux,dummy-virt (DT)
  task: ffff80001e271780 task.stack: ffff000010908000
  PC is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679
  LR is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679
  pc : [<ffff00000821ac14>] lr : [<ffff00000821ac14>] pstate: 80000145

The fact that it's a bug instead of a warning was due to an unrelated
arm64 problem, but the warning itself triggered because the underlying
mapping changed.

This is an application issue but from a kernel perspective it's a
recoverable situation and the warning is unnecessary so this patch
removes the warning.  The warning may potentially be triggered with the
following test program from Mark although it may be necessary to adjust
NR_FUTEX_THREADS to be a value smaller than the number of CPUs in the
system.

    #include <linux/futex.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <sys/syscall.h>
    #include <sys/time.h>
    #include <unistd.h>

    #define NR_FUTEX_THREADS 16
    pthread_t threads[NR_FUTEX_THREADS];

    void *mem;

    #define MEM_PROT  (PROT_READ | PROT_WRITE)
    #define MEM_SIZE  65536

    static int futex_wrapper(int *uaddr, int op, int val,
                             const struct timespec *timeout,
                             int *uaddr2, int val3)
    {
        syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3);
    }

    void *poll_futex(void *unused)
    {
        for (;;) {
            futex_wrapper(mem, FUTEX_CMP_REQUEUE_PI, 1, NULL, mem + 4, 1);
        }
    }

    int main(int argc, char *argv[])
    {
        int i;

        mem = mmap(NULL, MEM_SIZE, MEM_PROT,
               MAP_SHARED | MAP_ANONYMOUS, -1, 0);

        printf("Mapping @ %p\n", mem);

        printf("Creating futex threads...\n");

        for (i = 0; i < NR_FUTEX_THREADS; i++)
            pthread_create(&threads[i], NULL, poll_futex, NULL);

        printf("Flipping mapping...\n");
        for (;;) {
            mmap(mem, MEM_SIZE, MEM_PROT,
                 MAP_FIXED | MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        }

        return 0;
    }

Reported-and-tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: stable@vger.kernel.org # 4.7+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Wed, 9 Aug 2017 20:21:28 +0000 (13:21 -0700)]
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "The main thing is to allow empty id_tables for ACPI to make some
  drivers get probed again. It looks a bit bigger than usual because it
  needs some internal renaming, too.

  Other than that, there is a fix for broken DSTDs, a super simple
  enablement for ARM MPS, and two documentation fixes which I'd like to
  see in v4.13 already"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: rephrase explanation of I2C_CLASS_DEPRECATED
  i2c: allow i2c-versatile for ARM MPS platforms
  i2c: designware: Some broken DSTDs use 1MiHz instead of 1MHz
  i2c: designware: Print clock freq on invalid clock freq error
  i2c: core: Allow empty id_table in ACPI case as well
  i2c: mux: pinctrl: mention correct module name in Kconfig help text

7 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Wed, 9 Aug 2017 17:37:35 +0000 (10:37 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Three patches that should go into this release.

  Two of them are from Paolo and fix up some corner cases with BFQ, and
  the last patch is from Ming and fixes up a potential usage count
  imbalance regression due to the recent NOWAIT work"

* 'for-linus' of git://git.kernel.dk/linux-block:
  blk-mq: don't leak preempt counter/q_usage_counter when allocating rq failed
  block, bfq: consider also in_service_entity to state whether an entity is active
  block, bfq: reset in_service_entity if it becomes idle

7 years agoMerge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Wed, 9 Aug 2017 17:33:49 +0000 (10:33 -0700)]
Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "Fix two regressions in the inside-secure driver with respect to
  hmac(sha1)"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: inside-secure - fix the sha state length in hmac_sha1_setkey
  crypto: inside-secure - fix invalidation check in hmac_sha1_setkey

7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Wed, 9 Aug 2017 17:14:04 +0000 (10:14 -0700)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
 "The pull requests are getting smaller, that's progress I suppose :-)

   1) Fix infinite loop in CIPSO option parsing, from Yujuan Qi.

   2) Fix remote checksum handling in VXLAN and GUE tunneling drivers,
      from Koichiro Den.

   3) Missing u64_stats_init() calls in several drivers, from Florian
      Fainelli.

   4) TCP can set the congestion window to an invalid ssthresh value
      after congestion window reductions, from Yuchung Cheng.

   5) Fix BPF jit branch generation on s390, from Daniel Borkmann.

   6) Correct MIPS ebpf JIT merge, from David Daney.

   7) Correct byte order test in BPF test_verifier.c, from Daniel
      Borkmann.

   8) Fix various crashes and leaks in ASIX driver, from Dean Jenkins.

   9) Handle SCTP checksums properly in mlx4 driver, from Davide
      Caratti.

  10) We can potentially enter tcp_connect() with a cached route
      already, due to fastopen, so we have to explicitly invalidate it.

  11) skb_warn_bad_offload() can bark in legitimate situations, fix from
      Willem de Bruijn"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
  net: avoid skb_warn_bad_offload false positives on UFO
  qmi_wwan: fix NULL deref on disconnect
  ppp: fix xmit recursion detection on ppp channels
  rds: Reintroduce statistics counting
  tcp: fastopen: tcp_connect() must refresh the route
  net: sched: set xt_tgchk_param par.net properly in ipt_init_target
  net: dsa: mediatek: add adjust link support for user ports
  net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packets
  qed: Fix a memory allocation failure test in 'qed_mcp_cmd_init()'
  hysdn: fix to a race condition in put_log_buffer
  s390/qeth: fix L3 next-hop in xmit qeth hdr
  asix: Fix small memory leak in ax88772_unbind()
  asix: Ensure asix_rx_fixup_info members are all reset
  asix: Add rx->ax_skb = NULL after usbnet_skb_return()
  bpf: fix selftest/bpf/test_pkt_md_access on s390x
  netvsc: fix race on sub channel creation
  bpf: fix byte order test in test_verifier
  xgene: Always get clk source, but ignore if it's missing for SGMII ports
  MIPS: Add missing file for eBPF JIT.
  bpf, s390: fix build for libbpf and selftest suite
  ...

7 years agonet: ipv6: avoid overhead when no custom FIB rules are installed
Vincent Bernat [Tue, 8 Aug 2017 18:23:49 +0000 (20:23 +0200)]
net: ipv6: avoid overhead when no custom FIB rules are installed

If the user hasn't installed any custom rules, don't go through the
whole FIB rules layer. This is pretty similar to f4530fa574df (ipv4:
Avoid overhead when no custom FIB rules are installed).

Using a micro-benchmark module [1], timing ip6_route_output() with
get_cycles(), with 40,000 routes in the main routing table, before this
patch:

    min=606 max=12911 count=627 average=1959 95th=4903 90th=3747 50th=1602 mad=821
    table=254 avgdepth=21.8 maxdepth=39
    value │                         ┊                            count
      600 │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒                                         199
      880 │▒▒▒░░░░░░░░░░░░░░░░                                      43
     1160 │▒▒▒░░░░░░░░░░░░░░░░░░░░                                  48
     1440 │▒▒▒░░░░░░░░░░░░░░░░░░░░░░░                               43
     1720 │▒▒▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░                          59
     2000 │▒▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                      50
     2280 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                    26
     2560 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                  31
     2840 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░               28
     3120 │▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░              17
     3400 │▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░             17
     3680 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░             8
     3960 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░           11
     4240 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░            6
     4520 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░           6
     4800 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░           9

After:

    min=544 max=11687 count=627 average=1776 95th=4546 90th=3585 50th=1227 mad=565
    table=254 avgdepth=21.8 maxdepth=39
    value │                         ┊                            count
      540 │▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒                                        201
      800 │▒▒▒▒▒░░░░░░░░░░░░░░░░                                    63
     1060 │▒▒▒▒▒░░░░░░░░░░░░░░░░░░░░░                               68
     1320 │▒▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░                            39
     1580 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                         32
     1840 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                       32
     2100 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                    34
     2360 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░                 33
     2620 │▒▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░               26
     2880 │▒░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░              22
     3140 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░              9
     3400 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░             8
     3660 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░             9
     3920 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░            8
     4180 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░           8
     4440 │░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░           8

At the frequency of the host during the bench (~ 3.7 GHz), this is
about a 100 ns difference on the median value.

A next step would be to collapse local and main tables, as in
0ddcf43d5d4a (ipv4: FIB Local/MAIN table collapse).

[1]: https://github.com/vincentbernat/network-lab/blob/master/lab-routes-ipv6/kbench_mod.c

Signed-off-by: Vincent Bernat <vincent@bernat.im>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: avoid skb_warn_bad_offload false positives on UFO
Willem de Bruijn [Tue, 8 Aug 2017 18:22:55 +0000 (14:22 -0400)]
net: avoid skb_warn_bad_offload false positives on UFO

skb_warn_bad_offload triggers a warning when an skb enters the GSO
stack at __skb_gso_segment that does not have CHECKSUM_PARTIAL
checksum offload set.

Commit b2504a5dbef3 ("net: reduce skb_warn_bad_offload() noise")
observed that SKB_GSO_DODGY producers can trigger the check and
that passing those packets through the GSO handlers will fix it
up. But, the software UFO handler will set ip_summed to
CHECKSUM_NONE.

When __skb_gso_segment is called from the receive path, this
triggers the warning again.

Make UFO set CHECKSUM_UNNECESSARY instead of CHECKSUM_NONE. On
Tx these two are equivalent. On Rx, this better matches the
skb state (checksum computed), as CHECKSUM_NONE here means no
checksum computed.

See also this thread for context:
http://patchwork.ozlabs.org/patch/799015/

Fixes: b2504a5dbef3 ("net: reduce skb_warn_bad_offload() noise")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoisdn: hfcsusb: constify usb_device_id
Arvind Yadav [Tue, 8 Aug 2017 16:49:28 +0000 (22:19 +0530)]
isdn: hfcsusb: constify usb_device_id

usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoisdn: hisax: hfc_usb: constify usb_device_id
Arvind Yadav [Tue, 8 Aug 2017 16:49:27 +0000 (22:19 +0530)]
isdn: hisax: hfc_usb: constify usb_device_id

usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoqmi_wwan: fix NULL deref on disconnect
Bjørn Mork [Tue, 8 Aug 2017 16:02:11 +0000 (18:02 +0200)]
qmi_wwan: fix NULL deref on disconnect

qmi_wwan_disconnect is called twice when disconnecting devices with
separate control and data interfaces.  The first invocation will set
the interface data to NULL for both interfaces to flag that the
disconnect has been handled.  But the matching NULL check was left
out when qmi_wwan_disconnect was added, resulting in this oops:

  usb 2-1.4: USB disconnect, device number 4
  qmi_wwan 2-1.4:1.6 wwp0s29u1u4i6: unregister 'qmi_wwan' usb-0000:00:1d.0-1.4, WWAN/QMI device
  BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0
  IP: qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan]
  PGD 0
  P4D 0
  Oops: 0000 [#1] SMP
  Modules linked in: <stripped irrelevant module list>
  CPU: 2 PID: 33 Comm: kworker/2:1 Tainted: G            E   4.12.3-nr44-normandy-r1500619820+ #1
  Hardware name: LENOVO 4291LR7/4291LR7, BIOS CBET4000 4.6-810-g50522254fb 07/21/2017
  Workqueue: usb_hub_wq hub_event [usbcore]
  task: ffff8c882b716040 task.stack: ffffb8e800d84000
  RIP: 0010:qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan]
  RSP: 0018:ffffb8e800d87b38 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
  RDX: 0000000000000001 RSI: ffff8c8824f3f1d0 RDI: ffff8c8824ef6400
  RBP: ffff8c8824ef6400 R08: 0000000000000000 R09: 0000000000000000
  R10: ffffb8e800d87780 R11: 0000000000000011 R12: ffffffffc07ea0e8
  R13: ffff8c8824e2e000 R14: ffff8c8824e2e098 R15: 0000000000000000
  FS:  0000000000000000(0000) GS:ffff8c8835300000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000000e0 CR3: 0000000229ca5000 CR4: 00000000000406e0
  Call Trace:
   ? usb_unbind_interface+0x71/0x270 [usbcore]
   ? device_release_driver_internal+0x154/0x210
   ? qmi_wwan_unbind+0x6d/0xc0 [qmi_wwan]
   ? usbnet_disconnect+0x6c/0xf0 [usbnet]
   ? qmi_wwan_disconnect+0x87/0xc0 [qmi_wwan]
   ? usb_unbind_interface+0x71/0x270 [usbcore]
   ? device_release_driver_internal+0x154/0x210

Reported-and-tested-by: Nathaniel Roach <nroach44@gmail.com>
Fixes: c6adf77953bc ("net: usb: qmi_wwan: add qmap mux protocol support")
Cc: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoqmi_wwan: fix NULL deref on disconnect
Bjørn Mork [Tue, 8 Aug 2017 16:02:11 +0000 (18:02 +0200)]
qmi_wwan: fix NULL deref on disconnect

qmi_wwan_disconnect is called twice when disconnecting devices with
separate control and data interfaces.  The first invocation will set
the interface data to NULL for both interfaces to flag that the
disconnect has been handled.  But the matching NULL check was left
out when qmi_wwan_disconnect was added, resulting in this oops:

  usb 2-1.4: USB disconnect, device number 4
  qmi_wwan 2-1.4:1.6 wwp0s29u1u4i6: unregister 'qmi_wwan' usb-0000:00:1d.0-1.4, WWAN/QMI device
  BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0
  IP: qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan]
  PGD 0
  P4D 0
  Oops: 0000 [#1] SMP
  Modules linked in: <stripped irrelevant module list>
  CPU: 2 PID: 33 Comm: kworker/2:1 Tainted: G            E   4.12.3-nr44-normandy-r1500619820+ #1
  Hardware name: LENOVO 4291LR7/4291LR7, BIOS CBET4000 4.6-810-g50522254fb 07/21/2017
  Workqueue: usb_hub_wq hub_event [usbcore]
  task: ffff8c882b716040 task.stack: ffffb8e800d84000
  RIP: 0010:qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan]
  RSP: 0018:ffffb8e800d87b38 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
  RDX: 0000000000000001 RSI: ffff8c8824f3f1d0 RDI: ffff8c8824ef6400
  RBP: ffff8c8824ef6400 R08: 0000000000000000 R09: 0000000000000000
  R10: ffffb8e800d87780 R11: 0000000000000011 R12: ffffffffc07ea0e8
  R13: ffff8c8824e2e000 R14: ffff8c8824e2e098 R15: 0000000000000000
  FS:  0000000000000000(0000) GS:ffff8c8835300000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000000e0 CR3: 0000000229ca5000 CR4: 00000000000406e0
  Call Trace:
   ? usb_unbind_interface+0x71/0x270 [usbcore]
   ? device_release_driver_internal+0x154/0x210
   ? qmi_wwan_unbind+0x6d/0xc0 [qmi_wwan]
   ? usbnet_disconnect+0x6c/0xf0 [usbnet]
   ? qmi_wwan_disconnect+0x87/0xc0 [qmi_wwan]
   ? usb_unbind_interface+0x71/0x270 [usbcore]
   ? device_release_driver_internal+0x154/0x210

Reported-and-tested-by: Nathaniel Roach <nroach44@gmail.com>
Fixes: c6adf77953bc ("net: usb: qmi_wwan: add qmap mux protocol support")
Cc: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: phy: mdio-bcm-unimac: fix unsigned wrap-around when decrementing timeout
Colin Ian King [Tue, 8 Aug 2017 09:52:32 +0000 (10:52 +0100)]
net: phy: mdio-bcm-unimac: fix unsigned wrap-around when decrementing timeout

Change post-decrement compare to pre-decrement to avoid an
unsigned integer wrap-around on timeout. This leads to the following
!timeout check to never to be true so -ETIMEDOUT is never returned.

Detected by CoverityScan, CID#1452623 ("Logically dead code")

Fixes: 69a60b0579a4 ("net: phy: mdio-bcm-unimac: factor busy polling loop")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoppp: fix xmit recursion detection on ppp channels
Guillaume Nault [Tue, 8 Aug 2017 09:43:24 +0000 (11:43 +0200)]
ppp: fix xmit recursion detection on ppp channels

Commit e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp
devices") dropped the xmit_recursion counter incrementation in
ppp_channel_push() and relied on ppp_xmit_process() for this task.
But __ppp_channel_push() can also send packets directly (using the
.start_xmit() channel callback), in which case the xmit_recursion
counter isn't incremented anymore. If such packets get routed back to
the parent ppp unit, ppp_xmit_process() won't notice the recursion and
will call ppp_channel_push() on the same channel, effectively creating
the deadlock situation that the xmit_recursion mechanism was supposed
to prevent.

This patch re-introduces the xmit_recursion counter incrementation in
ppp_channel_push(). Since the xmit_recursion variable is now part of
the parent ppp unit, incrementation is skipped if the channel doesn't
have any. This is fine because only packets routed through the parent
unit may enter the channel recursively.

Finally, we have to ensure that pch->ppp is not going to be modified
while executing ppp_channel_push(). Instead of taking this lock only
while calling ppp_xmit_process(), we now have to hold it for the full
ppp_channel_push() execution. This respects the ppp locks ordering
which requires locking ->upl before ->downl.

Fixes: e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp devices")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agords: Reintroduce statistics counting
Håkon Bugge [Tue, 8 Aug 2017 09:13:32 +0000 (11:13 +0200)]
rds: Reintroduce statistics counting

In commit 7e3f2952eeb1 ("rds: don't let RDS shutdown a connection
while senders are present"), refilling the receive queue was removed
from rds_ib_recv(), along with the increment of
s_ib_rx_refill_from_thread.

Commit 73ce4317bf98 ("RDS: make sure we post recv buffers")
re-introduces filling the receive queue from rds_ib_recv(), but does
not add the statistics counter. rds_ib_recv() was later renamed to
rds_ib_recv_path().

This commit reintroduces the statistics counting of
s_ib_rx_refill_from_thread and s_ib_rx_refill_from_cq.

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agotcp: fastopen: tcp_connect() must refresh the route
Eric Dumazet [Tue, 8 Aug 2017 08:41:58 +0000 (01:41 -0700)]
tcp: fastopen: tcp_connect() must refresh the route

With new TCP_FASTOPEN_CONNECT socket option, there is a possibility
to call tcp_connect() while socket sk_dst_cache is either NULL
or invalid.

 +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4
 +0 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
 +0 setsockopt(4, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0
 +0 connect(4, ..., ...) = 0

<< sk->sk_dst_cache becomes obsolete, or even set to NULL >>

 +1 sendto(4, ..., 1000, MSG_FASTOPEN, ..., ...) = 1000

We need to refresh the route otherwise bad things can happen,
especially when syzkaller is running on the host :/

Fixes: 19f6d3f3c8422 ("net/tcp-fastopen: Add new API support")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Wei Wang <weiwan@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: sched: set xt_tgchk_param par.net properly in ipt_init_target
Xin Long [Tue, 8 Aug 2017 07:25:25 +0000 (15:25 +0800)]
net: sched: set xt_tgchk_param par.net properly in ipt_init_target

Now xt_tgchk_param par in ipt_init_target is a local varibale,
par.net is not initialized there. Later when xt_check_target
calls target's checkentry in which it may access par.net, it
would cause kernel panic.

Jaroslav found this panic when running:

  # ip link add TestIface type dummy
  # tc qd add dev TestIface ingress handle ffff:
  # tc filter add dev TestIface parent ffff: u32 match u32 0 0 \
    action xt -j CONNMARK --set-mark 4

This patch is to pass net param into ipt_init_target and set
par.net with it properly in there.

v1->v2:
  As Wang Cong pointed, I missed ipt_net_id != xt_net_id, so fix
  it by also passing net_id to __tcf_ipt_init.
v2->v3:
  Missed the fixes tag, so add it.

Fixes: ecb2421b5ddf ("netfilter: add and use nf_ct_netns_get/put")
Reported-by: Jaroslav Aster <jaster@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agocxgb4: Clear On FLASH config file after a FW upgrade
Arjun Vynipadath [Tue, 8 Aug 2017 05:50:52 +0000 (11:20 +0530)]
cxgb4: Clear On FLASH config file after a FW upgrade

Because Firmware and the Firmware Configuration File need to be
in sync; clear out any On-FLASH Firmware Configuration File when new
Firmware is loaded.  This will avoid difficult to diagnose and fix
problems with a mis-matched Firmware Configuration File which prevents the
adapter from being initialized.

Original work by: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet_sched: get rid of some forward declarations
WANG Cong [Mon, 7 Aug 2017 22:26:50 +0000 (15:26 -0700)]
net_sched: get rid of some forward declarations

If we move up tcf_fill_node() we can get rid of these
forward declarations.

Also, move down tfilter_notify_chain() to group them together.

Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: dsa: lan9303: Only allocate 3 ports
Egil Hjelmeland [Mon, 7 Aug 2017 22:22:21 +0000 (00:22 +0200)]
net: dsa: lan9303: Only allocate 3 ports

Save 2628 bytes on arm eabi by allocate only the required 3 ports.

Now that ds->num_ports is correct: In net/dsa/tag_lan9303.c
eliminate duplicate LAN9303_MAX_PORTS, use ds->num_ports.
(Matching the pattern of other net/dsa/tag_xxx.c files.)

Signed-off-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoselftests: bpf: add a test for XDP redirect
William Tu [Mon, 7 Aug 2017 20:14:42 +0000 (13:14 -0700)]
selftests: bpf: add a test for XDP redirect

Add test for xdp_redirect by creating two namespaces with two
veth peers, then forward packets in-between.

Signed-off-by: William Tu <u9012063@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoliquidio: fix misspelled firmware image filenames
Derek Chickles [Mon, 7 Aug 2017 19:22:15 +0000 (12:22 -0700)]
liquidio: fix misspelled firmware image filenames

Fix misspelled firmware image filenames advertised via MODULE_FIRMWARE().

Signed-off-by: Derek Chickles <derek.chickles@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: Extend check_uarg_tail_zero() checks
Mickaël Salaün [Mon, 7 Aug 2017 18:45:20 +0000 (20:45 +0200)]
bpf: Extend check_uarg_tail_zero() checks

The function check_uarg_tail_zero() was created from bpf(2) for
BPF_OBJ_GET_INFO_BY_FD without taking the access_ok() nor the PAGE_SIZE
checks. Make this checks more generally available while unlikely to be
triggered, extend the memory range check and add an explanation
including why the ToCToU should not be a security concern.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Link: https://lkml.kernel.org/r/CAGXu5j+vRGFvJZmjtAcT8Hi8B+Wz0e1b6VKYZHfQP_=DXzC4CQ@mail.gmail.com
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf: Move check_uarg_tail_zero() upward
Mickaël Salaün [Mon, 7 Aug 2017 18:45:19 +0000 (20:45 +0200)]
bpf: Move check_uarg_tail_zero() upward

The function check_uarg_tail_zero() may be useful for other part of the
code in the syscall.c file. Move this function at the beginning of the
file.

Signed-off-by: Mickaël Salaün <mic@digikod.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Kees Cook <keescook@chromium.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonetvsc: make sure and unregister datapath
stephen hemminger [Mon, 7 Aug 2017 18:30:00 +0000 (11:30 -0700)]
netvsc: make sure and unregister datapath

Go back to switching datapath directly in the notifier callback.
Otherwise datapath might not get switched on unregister.

No need for calling the NOTIFY_PEERS notifier since that is only for
a gratitious ARP/ND packet; but that is not required with Hyper-V
because both VF and synthetic NIC have the same MAC address.

Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Fixes: 0c195567a8f6 ("netvsc: transparent VF management")
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoigb: support BCM54616 PHY
John W Linville [Fri, 21 Jul 2017 18:12:24 +0000 (14:12 -0400)]
igb: support BCM54616 PHY

The management port on an Edgecore AS7712-32 switch uses an igb MAC, but
it uses a BCM54616 PHY. Without a patch like this, loading the igb
module produces dmesg output like this:

[    3.439125] igb: Copyright (c) 2007-2014 Intel Corporation.
[    3.439866] igb: probe of 0000:00:14.0 failed with error -2

Signed-off-by: John W Linville <linville@tuxdriver.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoliquidio: fix wrong info about vf rx/tx ring parameters reported to ethtool
Intiyaz Basha [Mon, 7 Aug 2017 17:39:00 +0000 (10:39 -0700)]
liquidio: fix wrong info about vf rx/tx ring parameters reported to ethtool

Information reported to ethtool about vf rx/tx ring parameters is wrong.
Fix it by adding the missing initializations.

Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoigbvf: convert msleep to mdelay in atomic context
Greg Edwards [Thu, 20 Jul 2017 16:15:14 +0000 (10:15 -0600)]
igbvf: convert msleep to mdelay in atomic context

This fixes a "scheduling while atomic" splat seen with
CONFIG_DEBUG_ATOMIC_SLEEP enabled.

Signed-off-by: Greg Edwards <gedwards@ddn.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoigbvf: after mailbox write, wait for reply
Greg Edwards [Thu, 20 Jul 2017 16:00:58 +0000 (10:00 -0600)]
igbvf: after mailbox write, wait for reply

Two of the VF mailbox commands were not waiting for a reply from the PF,
which can result in a VF mailbox timeout in the VM for the next command.

Signed-off-by: Greg Edwards <gedwards@ddn.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agonet: dsa: mediatek: add adjust link support for user ports
John Crispin [Mon, 7 Aug 2017 14:20:49 +0000 (16:20 +0200)]
net: dsa: mediatek: add adjust link support for user ports

Manually adjust the port settings of user ports once PHY polling has
completed. This patch extends the adjust_link callback to configure the
per port PMCR register, applying the proper values polled from the PHY.
Without this patch flow control was not always getting setup properly.

Signed-off-by: Shashidhar Lakkavalli <shashidhar.lakkavalli@openmesh.com>
Signed-off-by: Muciri Gatimu <muciri@openmesh.com>
Signed-off-by: John Crispin <john@phrozen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packets
Davide Caratti [Thu, 3 Aug 2017 20:54:48 +0000 (22:54 +0200)]
net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packets

if the NIC fails to validate the checksum on TCP/UDP, and validation of IP
checksum is successful, the driver subtracts the pseudo-header checksum
from the value obtained by the hardware and sets CHECKSUM_COMPLETE. Don't
do that if protocol is IPPROTO_SCTP, otherwise CRC32c validation fails.

V2: don't test MLX4_CQE_STATUS_IPV6 if MLX4_CQE_STATUS_IPV4 is set

Reported-by: Shuang Li <shuali@redhat.com>
Fixes: f8c6455bb04b ("net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Acked-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoigbvf: add lock around mailbox ops
Greg Edwards [Thu, 20 Jul 2017 16:00:57 +0000 (10:00 -0600)]
igbvf: add lock around mailbox ops

The PF driver assumes the VF will not send another mailbox message until
the PF has written its reply to the previous message.  If the VF does,
that message will be silently dropped by the PF before it writes its
reply to the mailbox.  This results in a VF mailbox timeout for posted
messages waiting for an ACK, and the VF is reset by the
igbvf_watchdog_task in the VM.

Add a lock around the VF mailbox ops to prevent the VF from sending
another message while the PF is still processing the previous one.

Signed-off-by: Greg Edwards <gedwards@ddn.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoe1000e: Initial Support for IceLake
Sasha Neftin [Mon, 17 Jul 2017 22:13:39 +0000 (15:13 -0700)]
e1000e: Initial Support for IceLake

i219 (8) and i219 (9) are the next LOM generations that will be available
on the next Intel Client platform (IceLake).
This patch provides the initial support for these devices

Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Reviewed-by: Raanan Avargil <raanan.avargil@intel.com>
Reviewed-by: Dima Ruinskiy <dima.ruinskiy@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoigb: do not drop PF mailbox lock after read of VF message
Greg Edwards [Wed, 28 Jun 2017 15:22:26 +0000 (09:22 -0600)]
igb: do not drop PF mailbox lock after read of VF message

When the PF receives a mailbox message from the VF, it grabs the mailbox
lock, reads the VF message from the mailbox, ACKs the message and drops
the lock.

While the PF is performing the action for the VF message, nothing
prevents another VF message from being posted to the mailbox.  The
current code handles this condition by just dropping any new VF messages
without processing them.  This results in a mailbox timeout in the VM
for posted messages waiting for an ACK, and the VF is reset by the
igbvf_watchdog_task in the VM.

Given the right sequence of VF messages and mailbox timeouts, this
condition can go on ad infinitum.

Modify the PF mailbox read method to take an 'unlock' argument that
optionally leaves the mailbox locked by the PF after reading the VF
message.  This ensures another VF message is not posted to the mailbox
until after the PF has completed processing the VF message and written
its reply.

Signed-off-by: Greg Edwards <gedwards@ddn.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoMerge branch 'bpf-rewrite-value-tracking-in-verifier'
David S. Miller [Wed, 9 Aug 2017 00:51:36 +0000 (17:51 -0700)]
Merge branch 'bpf-rewrite-value-tracking-in-verifier'

Edward Cree says:

====================
bpf: rewrite value tracking in verifier

This series simplifies alignment tracking, generalises bounds tracking
and fixes some bounds-tracking bugs in the BPF verifier.  Pointer
arithmetic on packet pointers, stack pointers, map value pointers and
context pointers has been unified, and bounds on these pointers are
only checked when the pointer is dereferenced.

Operations on pointers which destroy all relation to the original
pointer (such as multiplies and shifts) are disallowed if
!env->allow_ptr_leaks, otherwise they convert the pointer to an
unknown scalar and feed it to the normal scalar arithmetic handling.

Pointer types have been unified with the corresponding
adjusted-pointer types where those existed
(e.g. PTR_TO_MAP_VALUE[_ADJ] or FRAME_PTR vs PTR_TO_STACK); similarly,
CONST_IMM and UNKNOWN_VALUE have been unified into SCALAR_VALUE.

Pointer types (except CONST_PTR_TO_MAP, PTR_TO_MAP_VALUE_OR_NULL and
PTR_TO_PACKET_END, which do not allow arithmetic) have a 'fixed
offset' and a 'variable offset'; the former is used when e.g. adding
an immediate or a known-constant register, as long as it does not
overflow.  Otherwise the latter is used, and any operation creating a
new variable offset creates a new 'id' (and, for PTR_TO_PACKET, clears
the 'range').  SCALAR_VALUEs use the 'variable offset' fields to track
the range of possible values; the 'fixed offset' should never be set
on a scalar.
====================

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf/verifier: increase complexity limit to 128k
Edward Cree [Mon, 7 Aug 2017 14:30:30 +0000 (15:30 +0100)]
bpf/verifier: increase complexity limit to 128k

The more detailed value tracking can reduce the effectiveness of pruning
 for some programs.  So, to avoid rejecting previously valid programs, up
 the limit to 128kinsns.  Hopefully we will be able to bring this back
 down later by improving pruning performance.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoDocumentation: describe the new eBPF verifier value tracking behaviour
Edward Cree [Mon, 7 Aug 2017 14:30:09 +0000 (15:30 +0100)]
Documentation: describe the new eBPF verifier value tracking behaviour

Also bring the eBPF documentation up to date in other ways.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoselftests/bpf: variable offset negative tests
Edward Cree [Mon, 7 Aug 2017 14:29:51 +0000 (15:29 +0100)]
selftests/bpf: variable offset negative tests

Variable ctx accesses and stack accesses aren't allowed, because we can't
 determine what type of value will be read.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoselftests/bpf: add tests for subtraction & negative numbers
Edward Cree [Mon, 7 Aug 2017 14:29:34 +0000 (15:29 +0100)]
selftests/bpf: add tests for subtraction & negative numbers

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoselftests/bpf: don't try to access past MAX_PACKET_OFF in test_verifier
Edward Cree [Mon, 7 Aug 2017 14:29:11 +0000 (15:29 +0100)]
selftests/bpf: don't try to access past MAX_PACKET_OFF in test_verifier

A number of selftests fell foul of the changed MAX_PACKET_OFF handling.
For instance, "direct packet access: test2" was potentially reading four
 bytes from pkt + 0xffff, which could take it past the verifier's limit,
 causing the program to be rejected (checks against pkt_end didn't give
 us any reg->range).
Increase the shifts by one so that R2 is now mask 0x7fff instead of
 mask 0xffff.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoselftests/bpf: add test for bogus operations on pointers
Edward Cree [Mon, 7 Aug 2017 14:28:45 +0000 (15:28 +0100)]
selftests/bpf: add test for bogus operations on pointers

Tests non-add/sub operations (AND, LSH) on pointers decaying them to
 unknown scalars.
Also tests that a pkt_ptr add which could potentially overflow is rejected
 (find_good_pkt_pointers ignores it and doesn't give us any reg->range).

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoselftests/bpf: add a test to test_align
Edward Cree [Mon, 7 Aug 2017 14:28:00 +0000 (15:28 +0100)]
selftests/bpf: add a test to test_align

New test adds 14 to the unknown value before adding to the packet pointer,
 meaning there's no 'fixed offset' field and instead we add into the
 var_off, yielding a '4n+2' value.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoselftests/bpf: rewrite test_align
Edward Cree [Mon, 7 Aug 2017 14:27:34 +0000 (15:27 +0100)]
selftests/bpf: rewrite test_align

Expectations have changed, as has the format of the logged state.
To make the tests easier to read, add a line-matching framework so that
 each match need only quote the register it cares about.  (Multiple
 matches may refer to the same line, but matches must be listed in
 order of increasing line.)

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoselftests/bpf: change test_verifier expectations
Edward Cree [Mon, 7 Aug 2017 14:27:12 +0000 (15:27 +0100)]
selftests/bpf: change test_verifier expectations

Some of the verifier's error messages have changed, and some constructs
 that previously couldn't be verified are now accepted.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf/verifier: more concise register state logs for constant var_off
Edward Cree [Mon, 7 Aug 2017 14:26:56 +0000 (15:26 +0100)]
bpf/verifier: more concise register state logs for constant var_off

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf/verifier: track signed and unsigned min/max values
Edward Cree [Mon, 7 Aug 2017 14:26:36 +0000 (15:26 +0100)]
bpf/verifier: track signed and unsigned min/max values

Allows us to, sometimes, combine information from a signed check of one
 bound and an unsigned check of the other.
We now track the full range of possible values, rather than restricting
 ourselves to [0, 1<<30) and considering anything beyond that as
 unknown.  While this is probably not necessary, it makes the code more
 straightforward and symmetrical between signed and unsigned bounds.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobpf/verifier: rework value tracking
Edward Cree [Mon, 7 Aug 2017 14:26:19 +0000 (15:26 +0100)]
bpf/verifier: rework value tracking

Unifies adjusted and unadjusted register value types (e.g. FRAME_POINTER is
 now just a PTR_TO_STACK with zero offset).
Tracks value alignment by means of tracking known & unknown bits.  This
 also replaces the 'reg->imm' (leading zero bits) calculations for (what
 were) UNKNOWN_VALUEs.
If pointer leaks are allowed, and adjust_ptr_min_max_vals returns -EACCES,
 treat the pointer as an unknown scalar and try again, because we might be
 able to conclude something about the result (e.g. pointer & 0x40 is either
 0 or 0x40).
Verifier hooks in the netronome/nfp driver were changed to match the new
 data structures.

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoigb: expose mailbox unlock method
Greg Edwards [Wed, 28 Jun 2017 15:22:25 +0000 (09:22 -0600)]
igb: expose mailbox unlock method

Add a mailbox unlock method to e1000_mbx_operations, which will be used
to unlock the PF/VF mailbox by the PF.

Signed-off-by: Greg Edwards <gedwards@ddn.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoigb: add argument names to mailbox op function declarations
Greg Edwards [Wed, 28 Jun 2017 15:22:24 +0000 (09:22 -0600)]
igb: add argument names to mailbox op function declarations

Signed-off-by: Greg Edwards <gedwards@ddn.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agonet: usb: rtl8150: constify usb_device_id
Arvind Yadav [Tue, 8 Aug 2017 15:58:41 +0000 (21:28 +0530)]
net: usb: rtl8150: constify usb_device_id

usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: usb: r8152: constify usb_device_id
Arvind Yadav [Tue, 8 Aug 2017 15:58:05 +0000 (21:28 +0530)]
net: usb: r8152: constify usb_device_id

usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: usb: kaweth: constify usb_device_id
Arvind Yadav [Tue, 8 Aug 2017 15:58:04 +0000 (21:28 +0530)]
net: usb: kaweth: constify usb_device_id

usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: usb: ipheth: constify usb_device_id
Arvind Yadav [Tue, 8 Aug 2017 15:58:03 +0000 (21:28 +0530)]
net: usb: ipheth: constify usb_device_id

usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: usb: cdc-phonet: constify usb_device_id
Arvind Yadav [Tue, 8 Aug 2017 15:58:02 +0000 (21:28 +0530)]
net: usb: cdc-phonet: constify usb_device_id

usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: usb: catc: constify usb_device_id and fix space before '[' error
Arvind Yadav [Tue, 8 Aug 2017 15:58:01 +0000 (21:28 +0530)]
net: usb: catc: constify usb_device_id and fix space before '[' error

usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.

Fix checkpatch.pl error:
ERROR: space prohibited before open square bracket '['.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: irda: stir4200: constify usb_device_id
Arvind Yadav [Tue, 8 Aug 2017 15:56:45 +0000 (21:26 +0530)]
net: irda: stir4200: constify usb_device_id

usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: irda: mcs7780: constify usb_device_id
Arvind Yadav [Tue, 8 Aug 2017 15:56:44 +0000 (21:26 +0530)]
net: irda: mcs7780: constify usb_device_id

usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: irda: ksdazzle: constify usb_device_id
Arvind Yadav [Tue, 8 Aug 2017 15:56:43 +0000 (21:26 +0530)]
net: irda: ksdazzle: constify usb_device_id

usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: irda: ks959: constify usb_device_id
Arvind Yadav [Tue, 8 Aug 2017 15:56:42 +0000 (21:26 +0530)]
net: irda: ks959: constify usb_device_id

usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: irda: kingsun: constify usb_device_id
Arvind Yadav [Tue, 8 Aug 2017 15:56:41 +0000 (21:26 +0530)]
net: irda: kingsun: constify usb_device_id

usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: irda: irda-usb: constify usb_device_id
Arvind Yadav [Tue, 8 Aug 2017 15:56:40 +0000 (21:26 +0530)]
net: irda: irda-usb: constify usb_device_id

usb_device_id are not supposed to change at runtime. All functions
working with usb_device_id provided by <linux/usb.h> work with
const usb_device_id. So mark the non-const structs as const.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoigb: Remove incorrect "unexpected SYS WRAP" log message
Corinna Vinschen [Fri, 23 Jun 2017 12:26:30 +0000 (14:26 +0200)]
igb: Remove incorrect "unexpected SYS WRAP" log message

TSAUXC.DisableSystime is never set, so SYSTIM runs into a SYS WRAP
every 1100 secs on 80580/i350/i354 (40 bit SYSTIM) and every 35000
secs on 80576 (45 bit SYSTIM).

This wrap event sets the TSICR.SysWrap bit unconditionally.

However, checking TSIM at interrupt time shows that this event does not
actually cause the interrupt.  Rather, it's just bycatch while the
actual interrupt is caused by, for instance, TSICR.TXTS.

The conclusion is that the SYS WRAP is actually expected, so the
"unexpected SYS WRAP" message is entirely bogus and just helps to
confuse users.  Drop it.

Signed-off-by: Corinna Vinschen <vinschen@redhat.com>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoe1000e: add check on e1e_wphy() return value
Gustavo A R Silva [Tue, 20 Jun 2017 21:22:34 +0000 (16:22 -0500)]
e1000e: add check on e1e_wphy() return value

Check return value from call to e1e_wphy(). This value is being
checked during previous calls to function e1e_wphy() and it seems
a check was missing here.

Addresses-Coverity-ID: 1226905
Signed-off-by: Gustavo A R Silva <garsilva@embeddedor.com>
Reviewed-by: Ethan Zhao <ethan.zhao@oracle.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoigb: protect TX timestamping from API misuse
Cliff Spradlin [Mon, 19 Jun 2017 20:30:43 +0000 (13:30 -0700)]
igb: protect TX timestamping from API misuse

HW timestamping can only be requested for a packet if the NIC is first
setup via ioctl(SIOCSHWTSTAMP). If this step was skipped, then the igb
driver still allowed TX packets to request HW timestamping. In this
situation, the _IGB_PTP_TX_IN_PROGRESS flag was set and would never
clear. This prevented any future HW timestamping requests to succeed.

Fix this by checking that the NIC is configured for HW TX timestamping
before accepting a HW TX timestamping request.

Signed-off-by: Cliff Spradlin <cspradlin@google.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoigb: Fix error of RX network flow classification
Gangfeng Huang [Sat, 27 May 2017 01:17:53 +0000 (09:17 +0800)]
igb: Fix error of RX network flow classification

After add an ethertype filter, if user change the adapter speed several
times, the error "ethtool -N: etype filters are all used" is reported by
igb driver.

In older patch, function igb_nfc_filter_exit() and igb_nfc_filter_restore()
is not paried. igb_nfc_filter_restore() exist in igb_up(), but function
igb_nfc_filter_exit() is exist in __igb_close(). In the process of speed
changing, only igb_nfc_filter_restore() is called, it will take a position
of ethertype bitmap.

Reproduce steps:
Step 1: Add a etype filter by ethtool
$ethtool -N eth0 flow-type ether proto 0x88F8 action 1
Step 2: Change the adapter speed to 100M/full duplex
$ethtool -s eth0 speed 100 duplex full
Step 3: Change the adapter speed to 1000M/full duplex
ethtool -s eth0 speed 1000 duplex full
Repeat step2 and step3, then dmesg the system log, you can find the error
message, add new ethtype filter is also failed.

This fixing is move igb_nfc_filter_exit() from __igb_close() to igb_down()
to make igb_nfc_filter_restore()/igb_nfc_filter_exit() is paired.

Signed-off-by: Gangfeng Huang <gangfeng.huang@ni.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
7 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Linus Torvalds [Tue, 8 Aug 2017 18:42:33 +0000 (11:42 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma

Pull rdma fixes from Doug Ledford:
 "Third set of -rc fixes for 4.13 cycle

   - small set of miscellanous fixes

   - a reasonably sizable set of IPoIB fixes that deal with multiple
     long standing issues"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  IB/hns: checking for IS_ERR() instead of NULL
  RDMA/mlx5: Fix existence check for extended address vector
  IB/uverbs: Fix device cleanup
  RDMA/uverbs: Prevent leak of reserved field
  IB/core: Fix race condition in resolving IP to MAC
  IB/ipoib: Notify on modify QP failure only when relevant
  Revert "IB/core: Allow QP state transition from reset to error"
  IB/ipoib: Remove double pointer assigning
  IB/ipoib: Clean error paths in add port
  IB/ipoib: Add get statistics support to SRIOV VF
  IB/ipoib: Add multicast packets statistics
  IB/ipoib: Set IPOIB_NEIGH_TBL_FLUSH after flushed completion initialization
  IB/ipoib: Prevent setting negative values to max_nonsrq_conn_qp
  IB/ipoib: Make sure no in-flight joins while leaving that mcast
  IB/ipoib: Use cancel_delayed_work_sync when needed
  IB/ipoib: Fix race between light events and interface restart

7 years agoparse-maintainers: Move matching sections from MAINTAINERS
Joe Perches [Sun, 6 Aug 2017 01:45:49 +0000 (18:45 -0700)]
parse-maintainers: Move matching sections from MAINTAINERS

Allow any number of command line arguments to match either the
section header or the section contents and create new files.

Create MAINTAINERS.new and SECTION.new.

This allows scripting of the movement of various sections from
MAINTAINERS.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoparse-maintainers: Use perl hash references and specific filenames
Joe Perches [Sun, 6 Aug 2017 01:45:48 +0000 (18:45 -0700)]
parse-maintainers: Use perl hash references and specific filenames

Instead of reading STDIN and writing STDOUT, use specific filenames of
MAINTAINERS and MAINTAINERS.new.

Use hash references instead of global hash %hash so future modifications
can read and write specific hashes to split up MAINTAINERS into multiple
files using a script.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoparse-maintainers: Add section pattern sorting
Joe Perches [Sun, 6 Aug 2017 01:45:47 +0000 (18:45 -0700)]
parse-maintainers: Add section pattern sorting

Section [A-Z]: patterns are not currently in any required sorting order.
Add a specific sorting sequence to MAINTAINERS entries.
Sort F: and X: patterns in alphabetic order.

The preferred section ordering is:

  SECTION HEADER
  M: Maintainers
  R: Reviewers
  P: Named persons without email addresses
  L: Mailing list addresses
  S: Status of this section (Supported, Maintained, Orphan, etc...)
  W: Any relevant URLs
  T: Source code control type (git, quilt, etc)
  Q: Patchwork patch acceptance queue site
  B: Bug tracking URIs
  C: Chat URIs
  F: Files with wildcard patterns (alphabetic ordered)
  X: Excluded files with wildcard patterns (alphabetic ordered)
  N: Files with regex patterns
  K: Keyword regexes in source code for maintainership identification

Miscellaneous perl neatening:

 - Rename %map to %hash, map has a different meaning in perl
 - Avoid using \& and local variables for function indirection
 - Use return for a little c like clarity
 - Use c-like function call style instead of &function

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoget_maintainer: Prepare for separate MAINTAINERS files
Joe Perches [Sat, 5 Aug 2017 04:45:48 +0000 (21:45 -0700)]
get_maintainer: Prepare for separate MAINTAINERS files

Allow for MAINTAINERS to become a directory and if it is,
read all the files in the directory for maintained sections.

Optionally look for all files named MAINTAINERS in directories
excluding the .git directory by using --find-maintainer-files.

This optional feature adds ~.3 seconds of CPU on an Intel
i5-6200 with an SSD.

Miscellanea:

 - Create a read_maintainer_file subroutine from the existing code
 - Test only the existence of MAINTAINERS, not whether it's a file

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoMAINTAINERS: openbmc mailing list is moderated
Randy Dunlap [Wed, 2 Aug 2017 17:57:45 +0000 (10:57 -0700)]
MAINTAINERS: openbmc mailing list is moderated

The openbmc mailing list is moderated for non-subscribers.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Brendan Higgins <brendanhiggins@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Joel Stanley <joel@jms.id.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoMAINTAINERS: greybus: Fix typo s/LOOBACK/LOOPBACK
Sedat Dilek [Tue, 25 Jul 2017 12:53:42 +0000 (14:53 +0200)]
MAINTAINERS: greybus: Fix typo s/LOOBACK/LOOPBACK

Fixes: f47e07bc5f1a5c48 ("Fix up MAINTAINERS file problems")
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Tue, 8 Aug 2017 16:38:41 +0000 (09:38 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Two small fixes, one re-fix of a previous fix and five patches sorting
  out hotplug in the bnx2X class of drivers. The latter is rather
  involved, but necessary because these drivers have started dropping
  lockdep recursion warnings on the hotplug lock because of its
  conversion to a percpu rwsem"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: sg: only check for dxfer_len greater than 256M
  scsi: aacraid: reading out of bounds
  scsi: qedf: Limit number of CQs
  scsi: bnx2i: Simplify cpu hotplug code
  scsi: bnx2fc: Simplify CPU hotplug code
  scsi: bnx2i: Prevent recursive cpuhotplug locking
  scsi: bnx2fc: Prevent recursive cpuhotplug locking
  scsi: bnx2fc: Plug CPU hotplug race

7 years agorandom: fix warning message on ia64 and parisc
Helge Deller [Tue, 8 Aug 2017 16:28:41 +0000 (18:28 +0200)]
random: fix warning message on ia64 and parisc

Fix the warning message on the parisc and IA64 architectures to show the
correct function name of the caller by using %pS instead of %pF. The
message is printed with the value of _RET_IP_ which calls
__builtin_return_address(0) and as such returns the IP address caller
instead of pointer to a function descriptor of the caller.

The effect of this patch is visible on the parisc and ia64 architectures
only since those are the ones which use function descriptors while on
all others %pS and %pF will behave the same.

Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Fixes: eecabf567422 ("random: suppress spammy warnings about unseeded randomness")
Fixes: d06bfd1989fe ("random: warn when kernel uses unseeded randomness")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoMerge tag 'xtensa-20170807' of git://github.com/jcmvbkbc/linux-xtensa
Linus Torvalds [Tue, 8 Aug 2017 01:58:10 +0000 (18:58 -0700)]
Merge tag 'xtensa-20170807' of git://github.com/jcmvbkbc/linux-xtensa

Pull Xtensa fixes from Max Filippov:

 - use asm-generic instances of asm/param.h and asm/device.h instead of
   exact copies in arch/xtensa/include/asm;

 - fix build error for xtensa cores with aliasing WT cache: define cache
   flushing functions and copy_{to,from}_user_page;

 - add missing EXPORT_SYMBOLs for clear_user_highpage, copy_user_highpage,
   flush_dcache_page, local_flush_cache_range, local_flush_cache_page,
   csum_partial and csum_partial_copy_generic.

* tag 'xtensa-20170807' of git://github.com/jcmvbkbc/linux-xtensa:
  xtensa: mm/cache: add missing EXPORT_SYMBOLs
  xtensa: don't limit csum_partial export by CONFIG_NET
  xtensa: fix cache aliasing handling code for WT cache
  xtensa: remove wrapper header for asm/param.h
  xtensa: remove wrapper header for asm/device.h

7 years agoMerge tag 'for-linus-20170807' of git://git.infradead.org/linux-mtd
Linus Torvalds [Tue, 8 Aug 2017 01:40:18 +0000 (18:40 -0700)]
Merge tag 'for-linus-20170807' of git://git.infradead.org/linux-mtd

Pull MTD fixes from Brian Norris:
 "I missed getting these out for rc4, but here are some MTD fixes.

  Just NAND fixes (in both the core handling, and a few drivers). Notes
  stolen from Boris:

  Core fixes:

   - fix data interface setup for ONFI NANDs that do not support the SET
     FEATURES command

   - fix a kernel doc header

   - fix potential integer overflow when retrieving timing information
     from the parameter page

   - fix wrong OOB layout for small page NANDs

  Driver fixes:

   - fix potential division-by-zero bug

   - fix backward compat with old atmel-nand DT bindings

   - fix ->setup_data_interface() in the atmel NAND driver"

* tag 'for-linus-20170807' of git://git.infradead.org/linux-mtd:
  mtd: nand: atmel: Fix EDO mode check
  mtd: nand: Declare tBERS, tR and tPROG as u64 to avoid integer overflow
  mtd: nand: Fix timing setup for NANDs that do not support SET FEATURES
  mtd: nand: Fix a docs build warning
  mtd: nand: sunxi: fix potential divide-by-zero error
  nand: fix wrong default oob layout for small pages using soft ecc
  mtd: nand: atmel: Fix DT backward compatibility in pmecc.c

7 years agoMerge tag 'xfs-4.13-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Tue, 8 Aug 2017 01:16:22 +0000 (18:16 -0700)]
Merge tag 'xfs-4.13-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs fixes from Darrick Wong:
 "I have a couple more bug fixes for you today:

   - fix memory leak when issuing discard

   - fix propagation of the dax inode flag"

* tag 'xfs-4.13-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: Fix per-inode DAX flag inheritance
  xfs: Fix leak of discard bio

7 years agonet: vrf: Add extack messages for newlink failures
David Ahern [Mon, 7 Aug 2017 17:08:10 +0000 (10:08 -0700)]
net: vrf: Add extack messages for newlink failures

Add extack error messages for failure paths creating vrf devices. Once
extack support is added to iproute2, we go from the unhelpful:
    $  ip li add foobar type vrf
    RTNETLINK answers: Invalid argument

to:
    $ ip li add foobar type vrf
    Error: VRF table id is missing

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoqed: Fix a memory allocation failure test in 'qed_mcp_cmd_init()'
Christophe Jaillet [Sun, 6 Aug 2017 22:00:17 +0000 (00:00 +0200)]
qed: Fix a memory allocation failure test in 'qed_mcp_cmd_init()'

We allocate 'p_info->mfw_mb_cur' and 'p_info->mfw_mb_shadow' but we check
'p_info->mfw_mb_addr' instead of 'p_info->mfw_mb_cur'.

'p_info->mfw_mb_addr' is never 0, because it is initiliazed a few lines
above in 'qed_load_mcp_offsets()'.

Update the test and check the result of the 2 'kzalloc()' instead.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoisdn: kcapi: make capi_version const
Bhumika Goyal [Sun, 6 Aug 2017 17:09:06 +0000 (22:39 +0530)]
isdn: kcapi: make capi_version const

Declare this structure as const as it is only used during a copy
operation.

Signed-off-by: Bhumika Goyal <bhumirks@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'Update-DSAs-FDB-API-and-perform-switchdev-cleanup'
David S. Miller [Mon, 7 Aug 2017 21:48:49 +0000 (14:48 -0700)]
Merge branch 'Update-DSAs-FDB-API-and-perform-switchdev-cleanup'

Arkadi Sharshevsky says:

====================
Update DSA's FDB API and perform switchdev cleanup

The patchset adds support for configuring static FDB entries via the
switchdev notification chain. The current method for FDB configuration
uses the switchdev's bridge bypass implementation. In order to support
this legacy way and to perform the switchdev cleanup, the implementation
is moved inside DSA.

The DSA drivers cannot sync the software bridge with hardware learned
entries and use the switchdev's implementation of bypass FDB dumping.
Because they are the only ones using this functionality, the fdb_dump
implementation is moved from switchdev code into DSA.

Finally after this changes a major cleanup in switchdev can be done.

Please see individual patches for patch specific change logs.
v1->v2
- Split MDB/vlan dump removal into core/driver removal.

v2->v3
- The self implementation for FDB add/del is moved inside DSA.
====================

Tested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Tested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: switchdev: Remove bridge bypass support from switchdev
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:51 +0000 (16:15 +0300)]
net: switchdev: Remove bridge bypass support from switchdev

Currently the bridge port flags, vlans, FDBs and MDBs can be offloaded
through the bridge code, making the switchdev's SELF bridge bypass
implementation to be redundant. This implies several changes:
- No need for dump infra in switchdev, DSA's special case is handled
  privately.
- Remove obj_dump from switchdev_ops.
- FDBs are removed from obj_add/del routines, due to the fact that they
  are offloaded through the bridge notification chain.
- The switchdev_port_bridge_xx() and switchdev_port_fdb_xx() functions
  can be removed.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: bridge: Remove FDB deletion through switchdev object
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:50 +0000 (16:15 +0300)]
net: bridge: Remove FDB deletion through switchdev object

At this point no driver supports FDB add/del through switchdev object
but rather via notification chain, thus, it is removed.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: dsa: Move FDB dump implementation inside DSA
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:49 +0000 (16:15 +0300)]
net: dsa: Move FDB dump implementation inside DSA

>From all switchdev devices only DSA requires special FDB dump. This is due
to lack of ability for syncing the hardware learned FDBs with the bridge.
Due to this it is removed from switchdev and moved inside DSA.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: dsa: Remove redundant MDB dump support
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:48 +0000 (16:15 +0300)]
net: dsa: Remove redundant MDB dump support

Currently the MDB HW database is synced with the bridge's one, thus,
There is no need to support special dump functionality.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: dsa: Remove support for MDB dump from DSA's drivers
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:47 +0000 (16:15 +0300)]
net: dsa: Remove support for MDB dump from DSA's drivers

This is done as a preparation before removing support for MDB dump from
DSA core. The MDBs are synced with the bridge and thus there is no
need for special dump operation support.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: dsa: Remove support for bypass bridge port attributes/vlan set
Arkadi Sharshevsky [Sun, 6 Aug 2017 13:15:46 +0000 (16:15 +0300)]
net: dsa: Remove support for bypass bridge port attributes/vlan set

The bridge port attributes/vlan for DSA devices should be set only
from bridge code. Furthermore, The vlans are synced totally with the
bridge so there is no need for special dump support.

Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>