tcp: remove redundant argument from tcp_rcv_established()
The last (4th) argument of tcp_rcv_established() is redundant as it
always equals to skb->len and the skb itself is always passed as 2th
agrument. There is no reason to have it.
Signed-off-by: Ilya V. Matveychikov <matvejchikov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The function mlx4_en_get_profile always return zero. So it is not
necessary to check its return value.
CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Fix it by replacing 'arp_tbl' and 'nd_tbl' with 'tbl->family' wherever
possible and reference 'nd_tbl' only when IPV6 is enabled.
Fixes: fa1c23e55f7c ("mlxsw: spectrum_router: Reflect IPv6 neighbours to the device") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The function mlx4_en_arm_cq always returns zero. So change the
return type of the function mlx4_en_arm_cq to void.
CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sergei Shtylyov [Sun, 23 Jul 2017 18:45:47 +0000 (21:45 +0300)]
of_mdio: kill useless variable in of_phy_register_fixed_link()
of_phy_register_fixed_link() declares the 'err' variable to hold the result
of of_property_read_string() but only uses it once after that, while that
function can be called directly from the *if* statement...
Remove that variable and move/regroup 'link_gpio' and 'len' variables in
order to sort the declarations in the reverse Xmas tree order -- to please
DaveM. ;-)
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
skbuff: re-add check for NULL skb->head in kfree_skb path
A null check is needed after all. netlink skbs can have skb->head be
backed by vmalloc. The netlink destructor vfree()s head, then sets it to
NULL. We then panic in skb_release_data with a NULL dereference.
Re-add such a test.
Alternative would be to switch to kvfree to free skb->head memory
and remove the special handling in netlink destructor.
Reported-by: kernel test robot <fengguang.wu@intel.com> Fixes: 986e14096549e ("net: Revert "net: add function to allocate sk_buff head without data area") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Gortmaker [Sun, 23 Jul 2017 14:44:52 +0000 (10:44 -0400)]
liquidio: fix implicit irq include causing build failures
To fix
In file included from drivers/net/ethernet/cavium/liquidio/octeon_mem_ops.c:24:0:
drivers/net/ethernet/cavium/liquidio/octeon_device.h:216:2: error: expected specifier-qualifier-list before ‘irqreturn_t’
irqreturn_t (*process_interrupt_regs)(void *);
^
as seen on arm64 allmodconfig builds.
Cc: Derek Chickles <derek.chickles@caviumnetworks.com> Cc: Satanand Burla <satananda.burla@caviumnetworks.com> Cc: Felix Manlunas <felix.manlunas@caviumnetworks.com> Cc: Raghu Vatsavayi <raghu.vatsavayi@caviumnetworks.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Sat, 22 Jul 2017 07:40:04 +0000 (10:40 +0300)]
bpf: dev_map_alloc() shouldn't return NULL
We forgot to set the error code on two error paths which means that we
return ERR_PTR(0) which is NULL. The caller, find_and_alloc_map(), is
not expecting that and will have a NULL dereference.
Fixes: a6dd72d34c12 ("bpf: add devmap, a map for storing net device references") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: 52ea636677d3 ("netvsc: use ERR_PTR to avoid dereference issues") CC: stephen hemminger <stephen@networkplumber.org> Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 24 Jul 2017 23:17:10 +0000 (16:17 -0700)]
Merge tag 'rxrpc-rewrite-20170721' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Rearrange headers
Here's a pair of patches that rearrange some of the AF_RXRPC header files
that are outside of the net/rxrpc/ directory:
(1) The bits userspace need are moved to uapi/linux/rxrpc.h. [Should this
be af_rxrpc.h instead, I wonder - but there doesn't seem to be
precedent for that in the other net UAPI headers.]
(2) For the most part, the contents of rxrpc/packet.h are no longer used
outside of the AF_RXRPC module, so move them to net/rxrpc/protocol.h
with the exception of the standard abort codes which are exposed to
userspace when an abort occurs and the security index values which are
needed when constructing keys.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Sun, 23 Jul 2017 01:34:28 +0000 (09:34 +0800)]
sctp: remove the typedef sctp_unrecognized_param_t
This patch is to remove the typedef sctp_unrecognized_param_t, and
replace with struct sctp_unrecognized_param in the places where it's
using this typedef.
It is also to fix some indents in sctp_sf_do_unexpected_init() and
sctp_sf_do_5_1B_init().
Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 24 Jul 2017 20:53:00 +0000 (13:53 -0700)]
Merge branch 'udp-tunnel-offloads-toggle'
Sabrina Dubroca says:
====================
Allow to switch off UDP-based tunnel offloads per netdevice
This patchset adds a new netdevice feature to toggle RX offloads of
UDP-based tunnel via ethtool. This is useful if the offload is causing
issues, for example if the hardware is buggy.
The feature is added to all devices providing the ->ndo_udp_tunnel_add
op, and enabled by default to preserve current behavior.
When the administrator disables this feature on a device, all
currently offloaded ports are cleared from the device. When the
feature is turned on, the stack notifies the device about all current
ports.
v2:
- rename feature bit to NETIF_F_RX_UDP_TUNNEL_PORT
- rename ethtool feature to rx-udp_tunnel-port-offload
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
geneve/vxlan: offload ports on register/unregister events
This improves consistency of handling when moving a netdev to another
netns. Most drivers currently do a full reset when the device goes up,
so that will flush the offload state anyway.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
net: call udp_tunnel_get_rx_info when NETIF_F_RX_UDP_TUNNEL_PORT is toggled
NETIF_F_RX_UDP_TUNNEL_PORT is special, in that we need to do more than
just flip the bit in dev->features. When disabling we must also clear
currently offloaded ports from the device, and when enabling we must
tell the device to offload the ports it can.
Because vxlan stores its sockets in a hashtable and they are inserted at
the head of per-bucket lists, switching the feature off and then on can
result in a different set of ports being offloaded (depending on the
HW's limits).
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
net: add new netdevice feature for offload of RX port for UDP tunnels
This adds a new netdevice feature, so that the offloading of RX port for
UDP tunnels can be disabled by the administrator on some netdevices,
using the "rx-udp_tunnel-port-offload" feature in ethtool.
This feature is set for all devices that provide ndo_udp_tunnel_add.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds changelink rtnl operation support for geneve devices
and the code changes involve:
- added geneve_quiesce() which quiesces the geneve device data path
for both TX and RX. This lets us perform the changelink operation
atomically w.r.t data path. Also added geneve_unquiesce() to
reverse the operation of geneve_quiesce().
- refactor geneve_newlink into geneve_nl2info to be used by both
geneve_newlink and geneve_changelink
- geneve_nl2info takes a changelink boolean argument to isolate
changelink checks.
- Allow changing only a few attributes (ttl, tos, and remote tunnel
endpoint IP address (within the same address family)):
- return -EOPNOTSUPP for attributes that cannot be changed for
now. Incremental patches can make the non-supported one
available in the future if needed.
Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Rob Herring [Tue, 18 Jul 2017 21:43:19 +0000 (16:43 -0500)]
net: Convert to using %pOF instead of full_name
Now that we have a custom printf format specifier, convert users of
full_name to use %pOF instead. This is preparation to remove storing
of the full path string for each node.
Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 24 Jul 2017 20:37:01 +0000 (13:37 -0700)]
Merge branch 'virtio_net-xdp-refine'
Jason Wang says:
====================
Refine virtio-net XDP
This series brings two optimizations for virtio-net XDP:
- avoid reset during XDP set
- turn off offloads on demand
Changes from V1:
- Various tweaks on commit logs and comments
- Use virtnet_napi_enable() when enabling NAPI on XDP set
- Copy the small buffer packet only if xdp_headroom is smaller than
required
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Wed, 19 Jul 2017 08:54:49 +0000 (16:54 +0800)]
virtio-net: switch off offloads on demand if possible on XDP set
Current XDP implementation wants guest offloads feature to be disabled
on device. This is inconvenient and means guest can't benefit from
offloads if XDP is not used. This patch tries to address this
limitation by disabling the offloads on demand through control guest
offloads. Guest offloads will be disabled and enabled on demand on XDP
set.
Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Wed, 19 Jul 2017 08:54:48 +0000 (16:54 +0800)]
virtio-net: do not reset during XDP set
We currently reset the device during XDP set, the main reason is
that we allocate more headroom with XDP (for header adjustment).
This works but causes network downtime for users.
Previous patches encoded the headroom in the buffer context,
this makes it possible to detect the case where a buffer
with headroom insufficient for XDP is added to the queue and
XDP is enabled afterwards.
Upon detection, we handle this case by copying the packet
(slow, but it's a temporary condition).
Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jason Wang [Wed, 19 Jul 2017 08:54:46 +0000 (16:54 +0800)]
virtio-net: pack headroom into ctx for mergeable buffers
Pack headroom into ctx - this way when we get a buffer we can figure out
the actual headroom that was allocated for the buffer. Will be helpful
to optimize switching between XDP and non-XDP modes which have different
headroom requirements.
Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David Howells [Fri, 21 Jul 2017 09:07:10 +0000 (10:07 +0100)]
rxrpc: Move the packet.h include file into net/rxrpc/
Move the protocol description header file into net/rxrpc/ and rename it to
protocol.h. It's no longer necessary to expose it as packets are no longer
exposed to kernel services (such as AFS) that use the facility.
The abort codes are transferred to the UAPI header instead as we pass these
back to userspace and also to kernel services.
Signed-off-by: David Howells <dhowells@redhat.com>
1) BPF verifier signed/unsigned value tracking fix, from Daniel
Borkmann, Edward Cree, and Josef Bacik.
2) Fix memory allocation length when setting up calls to
->ndo_set_mac_address, from Cong Wang.
3) Add a new cxgb4 device ID, from Ganesh Goudar.
4) Fix FIB refcount handling, we have to set it's initial value before
the configure callback (which can bump it). From David Ahern.
5) Fix double-free in qcom/emac driver, from Timur Tabi.
6) A bunch of gcc-7 string format overflow warning fixes from Arnd
Bergmann.
7) Fix link level headroom tests in ip_do_fragment(), from Vasily
Averin.
8) Fix chunk walking in SCTP when iterating over error and parameter
headers. From Alexander Potapenko.
9) TCP BBR congestion control fixes from Neal Cardwell.
10) Fix SKB fragment handling in bcmgenet driver, from Doug Berger.
11) BPF_CGROUP_RUN_PROG_SOCK_OPS needs to check for null __sk, from Cong
Wang.
12) xmit_recursion in ppp driver needs to be per-device not per-cpu,
from Gao Feng.
13) Cannot release skb->dst in UDP if IP options processing needs it.
From Paolo Abeni.
14) Some netdev ioctl ifr_name[] NULL termination fixes. From Alexander
Levin and myself.
15) Revert some rtnetlink notification changes that are causing
regressions, from David Ahern.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (83 commits)
net: bonding: Fix transmit load balancing in balance-alb mode
rds: Make sure updates to cp_send_gen can be observed
net: ethernet: ti: cpsw: Push the request_irq function to the end of probe
ipv4: initialize fib_trie prior to register_netdev_notifier call.
rtnetlink: allocate more memory for dev_set_mac_address()
net: dsa: b53: Add missing ARL entries for BCM53125
bpf: more tests for mixed signed and unsigned bounds checks
bpf: add test for mixed signed and unsigned bounds checks
bpf: fix up test cases with mixed signed/unsigned bounds
bpf: allow to specify log level and reduce it for test_verifier
bpf: fix mixed signed/unsigned derived min/max value bounds
ipv6: avoid overflow of offset in ip6_find_1stfragopt
net: tehuti: don't process data if it has not been copied from userspace
Revert "rtnetlink: Do not generate notifications for CHANGEADDR event"
net: dsa: mv88e6xxx: Enable CMODE config support for 6390X
dt-binding: ptp: Add SoC compatibility strings for dte ptp clock
NET: dwmac: Make dwmac reset unconditional
net: Zero terminate ifr_name in dev_ifname().
wireless: wext: terminate ifr name coming from userspace
netfilter: fix netfilter_net_init() return
...
net: bonding: Fix transmit load balancing in balance-alb mode
balance-alb mode used to have transmit dynamic load balancing feature
enabled by default. However, transmit dynamic load balancing no longer
works in balance-alb after commit 8bc3f1172a6b ("bonding: remove
hardcoded value").
Both balance-tlb and balance-alb use the function bond_do_alb_xmit() to
send packets. This function uses the parameter tlb_dynamic_lb.
tlb_dynamic_lb used to have the default value of 1 for balance-alb, but
now the value is set to 0 except in balance-tlb.
Re-enable transmit dyanmic load balancing by initializing tlb_dynamic_lb
for balance-alb similar to balance-tlb.
Fixes: 8bc3f1172a6b ("bonding: remove hardcoded value") Signed-off-by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com> Acked-by: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net>
print the versions of vpd and serial configuration file,
flashed to adapter, and cleanup the relevant code.
Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethernet: ti: cpsw: Push the request_irq function to the end of probe
Push the request_irq function to the end of probe so as
to ensure all the required fields are populated in the event
of an ISR getting executed right after requesting the irq.
Currently while loading the crash kernel a crash was seen as
soon as devm_request_threaded_irq was called. This was due to
n->poll being NULL which is called as part of net_rx_action
function.
Suggested-by: Sekhar Nori <nsekhar@ti.com> Signed-off-by: Keerthy <j-keerthy@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
ipv4: initialize fib_trie prior to register_netdev_notifier call.
Net stack initialization currently initializes fib-trie after the
first call to netdevice_notifier() call. In fact fib_trie initialization
needs to happen before first rtnl_register(). It does not cause any problem
since there are no devices UP at this moment, but trying to bring 'lo'
UP at initialization would make this assumption wrong and exposes the issue.
Fixes: 539ccea17fa4 ("[NETNS]: Refactor fib initialization so it can handle multiple namespaces.") Fixes: a96a9eba8458 ("[IPV4]: fib hash|trie initialization") Signed-off-by: Mahesh Bandewar <maheshb@google.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Thu, 20 Jul 2017 18:27:57 +0000 (11:27 -0700)]
rtnetlink: allocate more memory for dev_set_mac_address()
virtnet_set_mac_address() interprets mac address as struct
sockaddr, but upper layer only allocates dev->addr_len
which is ETH_ALEN + sizeof(sa_family_t) in this case.
We lack a unified definition for mac address, so just fix
the upper layer, this also allows drivers to interpret it
to struct sockaddr freely.
Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: dsa: b53: Add missing ARL entries for BCM53125
The BCM53125 entry was missing an arl_entries member which would
basically prevent the ARL search from terminating properly. This switch
has 4 ARL entries, so add that.
Fixes: 42c5927a3531 ("net: dsa: b53: Implement ARL add/del/dump operations") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Thu, 20 Jul 2017 22:00:25 +0000 (00:00 +0200)]
bpf: more tests for mixed signed and unsigned bounds checks
Add a couple of more test cases to BPF selftests that are related
to mixed signed and unsigned checks.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Thu, 20 Jul 2017 22:00:24 +0000 (00:00 +0200)]
bpf: add test for mixed signed and unsigned bounds checks
These failed due to a bug in verifier bounds handling.
Signed-off-by: Edward Cree <ecree@solarflare.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Thu, 20 Jul 2017 22:00:23 +0000 (00:00 +0200)]
bpf: fix up test cases with mixed signed/unsigned bounds
Fix the few existing test cases that used mixed signed/unsigned
bounds and switch them only to one flavor. Reason why we need this
is that proper boundaries cannot be derived from mixed tests.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Thu, 20 Jul 2017 22:00:22 +0000 (00:00 +0200)]
bpf: allow to specify log level and reduce it for test_verifier
For the test_verifier case, it's quite hard to parse log level 2 to
figure out what's causing an issue when used to log level 1. We do
want to use bpf_verify_program() in order to simulate some of the
tests with strict alignment. So just add an argument to pass the level
and put it to 1 for test_verifier.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Thu, 20 Jul 2017 22:00:21 +0000 (00:00 +0200)]
bpf: fix mixed signed/unsigned derived min/max value bounds
Edward reported that there's an issue in min/max value bounds
tracking when signed and unsigned compares both provide hints
on limits when having unknown variables. E.g. a program such
as the following should have been rejected:
... r1 carries an unsigned value, and is compared as unsigned
against a register carrying an immediate. Verifier deduces in
reg_set_min_max() that since the compare is unsigned and operation
is greater than (>), that in the fall-through/false case, r1's
minimum bound must be 0 and maximum bound must be r2. Latter is
larger than the bound and thus max value is reset back to being
'invalid' aka BPF_REGISTER_MAX_RANGE. Thus, r1 state is now
'R1=inv,min_value=0'. The subsequent test ...
11: (65) if r1 s> 0x1 goto pc+2
... is a signed compare of r1 with immediate value 1. Here,
verifier deduces in reg_set_min_max() that since the compare
is signed this time and operation is greater than (>), that
in the fall-through/false case, we can deduce that r1's maximum
bound must be 1, meaning with prior test, we result in r1 having
the following state: R1=inv,min_value=0,max_value=1. Given that
the actual value this holds is -8, the bounds are wrongly deduced.
When this is being added to r0 which holds the map_value(_adj)
type, then subsequent store access in above case will go through
check_mem_access() which invokes check_map_access_adj(), that
will then probe whether the map memory is in bounds based
on the min_value and max_value as well as access size since
the actual unknown value is min_value <= x <= max_value; commit 7db5e9e65f04 ("bpf, verifier: fix alu ops against map_value{,
_adj} register types") provides some more explanation on the
semantics.
It's worth to note in this context that in the current code,
min_value and max_value tracking are used for two things, i)
dynamic map value access via check_map_access_adj() and since
commit 97748102b68f ("bpf: allow helpers access to variable memory")
ii) also enforced at check_helper_mem_access() when passing a
memory address (pointer to packet, map value, stack) and length
pair to a helper and the length in this case is an unknown value
defining an access range through min_value/max_value in that
case. The min_value/max_value tracking is /not/ used in the
direct packet access case to track ranges. However, the issue
also affects case ii), for example, the following crafted program
based on the same principle must be rejected as well:
Meaning, while we initialize the max_value stack slot that the
verifier thinks we access in the [1,2] range, in reality we
pass -7 as length which is interpreted as u32 in the helper.
Thus, this issue is relevant also for the case of helper ranges.
Resetting both bounds in check_reg_overflow() in case only one
of them exceeds limits is also not enough as similar test can be
created that uses values which are within range, thus also here
learned min value in r1 is incorrect when mixed with later signed
test to create a range:
This leaves us with two options for fixing this: i) to invalidate
all prior learned information once we switch signed context, ii)
to track min/max signed and unsigned boundaries separately as
done in [0]. (Given latter introduces major changes throughout
the whole verifier, it's rather net-next material, thus this
patch follows option i), meaning we can derive bounds either
from only signed tests or only unsigned tests.) There is still the
case of adjust_reg_min_max_vals(), where we adjust bounds on ALU
operations, meaning programs like the following where boundaries
on the reg get mixed in context later on when bounds are merged
on the dst reg must get rejected, too:
Meaning, in adjust_reg_min_max_vals() we must also reset range
values on the dst when src/dst registers have mixed signed/
unsigned derived min/max value bounds with one unbounded value
as otherwise they can be added together deducing false boundaries.
Once both boundaries are established from either ALU ops or
compare operations w/o mixing signed/unsigned insns, then they
can safely be added to other regs also having both boundaries
established. Adding regs with one unbounded side to a map value
where the bounded side has been learned w/o mixing ops is
possible, but the resulting map value won't recover from that,
meaning such op is considered invalid on the time of actual
access. Invalid bounds are set on the dst reg in case i) src reg,
or ii) in case dst reg already had them. The only way to recover
would be to perform i) ALU ops but only 'add' is allowed on map
value types or ii) comparisons, but these are disallowed on
pointers in case they span a range. This is fine as only BPF_JEQ
and BPF_JNE may be performed on PTR_TO_MAP_VALUE_OR_NULL registers
which potentially turn them into PTR_TO_MAP_VALUE type depending
on the branch, so only here min/max value cannot be invalidated
for them.
In terms of state pruning, value_from_signed is considered
as well in states_equal() when dealing with adjusted map values.
With regards to breaking existing programs, there is a small
risk, but use-cases are rather quite narrow where this could
occur and mixing compares probably unlikely.
Fixes: a91a21eaf94f ("bpf: allow access into map value arrays") Reported-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'pm-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These are two stable-candidate fixes for the intel_pstate driver and
the generic power domains (genpd) framework.
Specifics:
- Fix the average CPU load computations in the intel_pstate driver on
Knights Landing (Xeon Phi) processors that require an extra factor
to compensate for a rate change differences between the TSC and
MPERF which is missing (Srinivas Pandruvada).
- Fix an initialization ordering issue in the generic power domains
(genpd) framework (Sudeep Holla)"
* tag 'pm-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / Domains: defer dev_pm_domain_set() until genpd->attach_dev succeeds if present
cpufreq: intel_pstate: Correct the busy calculation for KNL
x86: mark kprobe templates as character arrays, not single characters
They really are, and the "take the address of a single character" makes
the string fortification code unhappy (it believes that you can now only
acccess one byte, rather than a byte range, and then raises errors for
the memory copies going on in there).
We could now remove a few 'addressof' operators (since arrays naturally
degrade to pointers), but this is the minimal patch that just changes
the C prototypes of those template arrays (the templates themselves are
defined in inline asm).
Reported-by: kernel test robot <xiaolong.ye@intel.com> Acked-and-tested-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'for-f2fs-v4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs fixes from Jaegeuk Kim:
"We've filed some bug fixes:
- missing f2fs case in terms of stale SGID bit, introduced by Jan
- build error for seq_file.h
- avoid cpu lockup
- wrong inode_unlock in error case"
* tag 'for-f2fs-v4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs:
f2fs: avoid cpu lockup
f2fs: include seq_file.h for sysfs.c
f2fs: Don't clear SGID when inheriting ACLs
f2fs: remove extra inode_unlock() in error path
Merge tag 'libnvdimm-fixes-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
"A handful of small fixes for 4.13-rc2. Three of these fixes are tagged
for -stable. They have all appeared in at least one -next release with
no reported issues
- Fix handling of media errors that span a sector
- Fix support of multiple namespaces in a libnvdimm region being in
device-dax mode
- Clean up the machine check notifier properly when the nfit driver
fails to register
- Address a static analysis (smatch) report in device-dax"
* tag 'libnvdimm-fixes-4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
device-dax: fix sysfs duplicate warnings
MAINTAINERS: list drivers/acpi/nfit/ files for libnvdimm sub-system
acpi/nfit: Fix memory corruption/Unregister mce decoder on failure
device-dax: fix 'passing zero to ERR_PTR()' warning
libnvdimm: fix badblock range handling of ARS range
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Pull HID fixes from Jiri Kosina:
- HID multitouch 4.12 regression fix from Dmitry Torokhov
- error handling fix for HID++ driver from Gustavo A. R. Silva
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
HID: hid-logitech-hidpp: add NULL check on devm_kmemdup() return value
HID: multitouch: do not blindly set EV_KEY or EV_ABS bits
HID: hid-logitech-hidpp: add NULL check on devm_kmemdup() return value
Check return value from call to devm_kmemdup() in order to prevent a NULL
pointer dereference.
Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
ipv6: avoid overflow of offset in ip6_find_1stfragopt
In some cases, offset can overflow and can cause an infinite loop in
ip6_find_1stfragopt(). Make it unsigned int to prevent the overflow, and
cap it at IPV6_MAXPLEN, since packets larger than that should be invalid.
This problem has been here since before the beginning of git history.
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Colin Ian King [Wed, 19 Jul 2017 17:46:59 +0000 (18:46 +0100)]
net: tehuti: don't process data if it has not been copied from userspace
The array data is only populated with valid information from userspace
if cmd != SIOCDEVPRIVATE, other cases the array contains garbage on
the stack. The subsequent switch statement acts on a subcommand in
data[0] which could be any garbage value if cmd is SIOCDEVPRIVATE which
seems incorrect to me. Instead, just return EOPNOTSUPP for the case
where cmd == SIOCDEVPRIVATE to avoid this issue.
As a side note, I suspect that the original intention of the code
was for this ioctl to work just for cmd == SIOCDEVPRIVATE (and the
current logic is reversed). However, I don't wont to change the current
semantics in case any userspace code relies on this existing behaviour.
Detected by CoverityScan, CID#139647 ("Uninitialized scalar variable")
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The duplicate CHANGEADDR event message is sent regardless of link
status whereas the setlink changes only generate a notification when
the link is up. Not sending a notification when the link is down breaks
dhcpcd which only processes hwaddr changes when the link is down.
Reported-by: Yaroslav Isakov <yaroslav.isakov@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Hundebøll [Wed, 19 Jul 2017 06:17:02 +0000 (08:17 +0200)]
net: dsa: mv88e6xxx: Enable CMODE config support for 6390X
Commit 134a05e2380f1 ('net: dsa: mv88e6xxx: Set the CMODE for mv88e6390
ports 9 & 10') added support for setting the CMODE for the 6390X family,
but only enabled it for 9290 and 6390 - and left out 6390X.
Fix support for setting the CMODE on 6390X also by assigning
mv88e6390x_port_set_cmode() to the .port_set_cmode function pointer in
mv88e6390x_ops too.
Fixes: 134a05e2380f ("net: dsa: mv88e6xxx: Set the CMODE for mv88e6390 ports 9 & 10") Signed-off-by: Martin Hundebøll <mnhu@prevas.dk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 20 Jul 2017 05:20:05 +0000 (22:20 -0700)]
Merge branch 'netvsc-lockdep-and-related-fixes'
Stephen Hemminger says:
====================
netvsc: lockdep and related fixes
These fix sparse and lockdep warnings from netvsc driver.
Targeting these at net-next since no actual related failures
have been observed in non-debug kernels.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
netvsc: save pointer to parent netvsc_device in channel table
Keep back pointer in the per-channel data structure to
avoid any possible RCU related issues when napi poll is
called but netvsc_device is in RCU limbo.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
netvsc: need rcu_derefence when accessing internal device info
The netvsc_device structure should be accessed by rcu_dereference
in the send path. Change arguments to netvsc_send() to make
this easier to do correctly.
Remove no longer needed hv_device_to_netvsc_device.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The rndis_filter_device_add function is called both in
probe context and RTNL context,and creates the netvsc_device
inner structure. It is easier to get the RTNL lock annotation
correct if it returns the object directly, rather than implicitly
by updating network device private data.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
netvsc: change logic for change mtu and set_queues
Use device detach/attach to ensure that no packets are handed
to device during state changes. Call rndis_filter_open/close
directly as part of later VF related changes.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This fixes the error unwind logic for incorrect number of queues.
If netif_set_real_num_XX_queues failed then rndis_filter_device_add
would have been called twice. Since input arguments are already
ranged checked this is a hypothetical only problem, not possible
in actual code.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
If two MTU changes are in less than update interval (2 seconds),
then the netvsc network device may get stuck with no carrier.
The netvsc driver debounces link status events which is fine
for unsolicited updates, but blocks getting the update after
down/up from MTU reinitialization.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 19 Jul 2017 23:45:16 +0000 (16:45 -0700)]
Merge branch 'dev_close-void'
Stephen Hemminger says:
====================
net: make dev_close void
Noticed while working on other changes. Why is dev_close()
returning int, it should be void. Should also change
ndo_close to be void, but that requires more work and someone
with more coccinelle foo (smpl) than me.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
liquidio: lio_main: remove unnecessary static in setup_io_queues()
Remove unnecessary static on local variables cpu_id_modulus and cpu_id.
Such variables are initialized before being used, on every execution
path throughout the function. The static has no benefit and, removing
it reduces the object file size.
This issue was detected using Coccinelle and the following semantic patch:
@bad exists@
position p;
identifier x;
type T;
@@
static T x@p;
...
x = <+...x...+>
@@
identifier x;
expression e;
type T;
position p != bad.p;
@@
-static
T x@p;
... when != x
when strict
?x = e;
In the following log you can see a significant difference in the object
file size. Also, there is a significant difference in the bss segment.
This log is the output of the size command, before and after the code
change:
before:
text data bss dec hex filename
78689 15272 27808 121769 1dba9 drivers/net/ethernet/cavium/liquidio/lio_main.o
after:
text data bss dec hex filename
78667 15128 27680 121475 1da83 drivers/net/ethernet/cavium/liquidio/lio_main.o
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
liquidio: lio_vf_main: remove unnecessary static in setup_io_queues()
Remove unnecessary static on local variables cpu_id_modulus and cpu_id.
Such variables are initialized before being used, on every execution
path throughout the function. The static has no benefit and, removing
it reduces the object file size.
This issue was detected using Coccinelle and the following semantic patch:
@bad exists@
position p;
identifier x;
type T;
@@
static T x@p;
...
x = <+...x...+>
@@
identifier x;
expression e;
type T;
position p != bad.p;
@@
-static
T x@p;
... when != x
when strict
?x = e;
In the following log you can see a significant difference in the object
file size. Also, there is a significant difference in the bss segment.
This log is the output of the size command, before and after the code
change:
before:
text data bss dec hex filename
55656 10680 576 66912 10560 drivers/net/ethernet/cavium/liquidio/lio_vf_main.o
after:
text data bss dec hex filename
55796 10536 448 66780 104dc drivers/net/ethernet/cavium/liquidio/lio_vf_main.o
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethernet: mediatek: remove useless code in mtk_poll_tx()
Remove useless local variable _condition_ and the code related.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>
qlcnic: remove unnecessary static in qlcnic_dump_fw()
Remove unnecessary static on local variable fw_dump_ops.
Such variable is initialized before being used, on every
execution path throughout the function. The static has no
benefit and, removing it reduces the object file size.
This issue was detected using Coccinelle and the following semantic patch:
@bad exists@
position p;
identifier x;
type T;
@@
static T x@p;
...
x = <+...x...+>
@@
identifier x;
expression e;
type T;
position p != bad.p;
@@
-static
T x@p;
... when != x
when strict
?x = e;
In the following log you can see a difference in the object file size.
This log is the output of the size command, before and after the code
change:
before:
text data bss dec hex filename
19032 2136 64 21232 52f0 drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.o
after:
text data bss dec hex filename
19020 2048 0 21068 524c drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.o
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Remove useless local variables last_read_point and last_txw_point and
the code related.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Acked-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: David S. Miller <davem@davemloft.net>
wireless: airo: remove unnecessary static in writerids()
Remove unnecessary static on local function pointer _writer_.
Such pointer is initialized before being used, on every
execution path throughout the function. The static has no
benefit and, removing it reduces the object file size.
This issue was detected using Coccinelle and the following semantic patch:
@bad exists@
position p;
identifier x;
type T;
@@
static T x@p;
...
x = <+...x...+>
@@
identifier x;
expression e;
type T;
position p != bad.p;
@@
-static
T x@p;
... when != x
when strict
?x = e;
In the following log you can see a significant difference in the object
file size. This log is the output of the size command, before and after
the code change:
before:
text data bss dec hex filename
113797 19152 1216 134165 20c15 drivers/net/wireless/cisco/airo.o
after:
text data bss dec hex filename
113881 19096 1152 134129 20bf1 drivers/net/wireless/cisco/airo.o
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch removes the definition of PGV_FROM_VMALLOC from af_packet.c.
The PGV_FROM_VMALLOC definition was already removed by
commit 62633ad24fc2 ("net: cleanup unused macros in net directory"),
and its usage was removed even before by commit 6474f5100eed
("af_packet: remove pgv.flags"); but it was added back by mistake later on,
in commit caf94a261123 ("af-packet: TPACKET_V3 flexible buffer implementation").
Signed-off-by: Rami Rosen <rami.rosen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dt-binding: ptp: Add SoC compatibility strings for dte ptp clock
Add SoC specific compatibility strings to the Broadcom DTE
based PTP clock binding document.
Fixed the document heading and node name.
Fixes: f8f64b68e223 ("dt-binding: ptp: add bindings document for dte based ptp clock") Signed-off-by: Arun Parameswaran <arun.parameswaran@broadcom.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Tue, 18 Jul 2017 15:49:26 +0000 (18:49 +0300)]
ISDN: eicon: switch to use native bitmaps
Two arrays are clearly bit maps, so, make that explicit by converting to
bitmap API and remove custom helpers.
Note sig_ind() uses out of boundary bit to (looks like) protect against
potential bitmap_empty() checks for the same bitmap.
This patch removes that since:
1) that didn't guarantee atomicity anyway;
2) the first operation inside the for-loop is set bit in the bitmap
(which effectively makes it non-empty);
3) group_optimization() doesn't utilize possible emptiness of the bitmap
in question.
Thus, if there is a protection needed it should be implemented properly.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adjusts the timeout formula to schedule the TCP loss probe
(TLP). The previous formula uses 2*SRTT or 1.5*RTT + DelayACKMax if
only one packet is in flight. It keeps a lower bound of 10 msec which
is too large for short RTT connections (e.g. within a data-center).
The new formula = 2*RTT + (inflight == 1 ? 200ms : 2ticks) which
performs better for short and fast connections.
Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Currently llist_for_each_entry() and llist_for_each_entry_safe() iterate
until &pos->member != NULL. But when building the kernel with Clang,
the compiler assumes &pos->member cannot be NULL if the member's offset
is greater than 0 (which would be equivalent to the object being
non-contiguous in memory). Therefore the loop condition is always true,
and the loops become infinite.
To work around this, introduce the member_address_is_nonnull() macro,
which casts object pointer to uintptr_t, thus letting the member pointer
to be NULL.
openvswitch: Optimize operations for OvS flow_stats.
When calling the flow_free() to free the flow, we call many times
(cpu_possible_mask, eg. 128 as default) cpumask_next(). That will
take up our CPU usage if we call the flow_free() frequently.
When we put all packets to userspace via upcall, and OvS will send
them back via netlink to ovs_packet_cmd_execute(will call flow_free).
The test topo is shown as below. VM01 sends TCP packets to VM02,
and OvS forward packtets. When testing, we use perf to report the
system performance.
VM01 --- OvS-VM --- VM02
Without this patch, perf-top show as below: The flow_free() is
3.02% CPU usage.
With this patch, the TCP throughput(we dont use Megaflow Cache
+ Microflow Cache) between VMs is 1.18Gbs/sec up to 1.30Gbs/sec
(maybe ~10% performance imporve).
This patch adds cpumask struct, the cpu_used_mask stores the cpu_id
that the flow used. And we only check the flow_stats on the cpu we
used, and it is unncessary to check all possible cpu when getting,
cleaning, and updating the flow_stats. Adding the cpu_used_mask to
sw_flow struct does’t increase the cacheline number.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
openvswitch: Optimize updating for OvS flow_stats.
In the ovs_flow_stats_update(), we only use the node
var to alloc flow_stats struct. But this is not a
common case, it is unnecessary to call the numa_node_id()
everytime. This patch is not a bugfix, but there maybe
a small increase.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>