Ben Hutchings [Sun, 26 Jun 2016 09:16:11 +0000 (11:16 +0200)]
batman-adv: Fix double-put of vlan object
Each batadv_tt_local_entry hold a single reference to a
batadv_softif_vlan. In case a new entry cannot be added to the hash
table, the error path puts the reference, but the reference will also
now be dropped by batadv_tt_local_entry_release().
Fixes: a33d970d0b54 ("batman-adv: Fix reference counting of vlan object for tt_local_entry") Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Sven Eckelmann [Sun, 26 Jun 2016 09:16:10 +0000 (11:16 +0200)]
batman-adv: Fix use-after-free/double-free of tt_req_node
The tt_req_node is added and removed from a list inside a spinlock. But the
locking is sometimes removed even when the object is still referenced and
will be used later via this reference. For example batadv_send_tt_request
can create a new tt_req_node (including add to a list) and later
re-acquires the lock to remove it from the list and to free it. But at this
time another context could have already removed this tt_req_node from the
list and freed it.
CPU#0
batadv_batman_skb_recv from net_device 0
-> batadv_iv_ogm_receive
-> batadv_iv_ogm_process
-> batadv_iv_ogm_process_per_outif
-> batadv_tvlv_ogm_receive
-> batadv_tvlv_ogm_receive
-> batadv_tvlv_containers_process
-> batadv_tvlv_call_handler
-> batadv_tt_tvlv_ogm_handler_v1
-> batadv_tt_update_orig
-> batadv_send_tt_request
-> batadv_tt_req_node_new
spin_lock(...)
allocates new tt_req_node and adds it to list
spin_unlock(...)
return tt_req_node
CPU#1
batadv_batman_skb_recv from net_device 1
-> batadv_recv_unicast_tvlv
-> batadv_tvlv_containers_process
-> batadv_tvlv_call_handler
-> batadv_tt_tvlv_unicast_handler_v1
-> batadv_handle_tt_response
spin_lock(...)
tt_req_node gets removed from list and is freed
spin_unlock(...)
CPU#0
<- returned to batadv_send_tt_request
spin_lock(...)
tt_req_node gets removed from list and is freed
MEMORY CORRUPTION/SEGFAULT/...
spin_unlock(...)
This can only be solved via reference counting to allow multiple contexts
to handle the list manipulation while making sure that only the last
context holding a reference will free the object.
Fixes: a73105b8d4c7 ("batman-adv: improved client announcement mechanism") Signed-off-by: Sven Eckelmann <sven@narfation.org> Tested-by: Martin Weinelt <martin@darmstadt.freifunk.net> Tested-by: Amadeus Alfa <amadeus@chemnitz.freifunk.net> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Wunderlich [Sun, 26 Jun 2016 09:16:09 +0000 (11:16 +0200)]
batman-adv: replace WARN with rate limited output on non-existing VLAN
If a VLAN tagged frame is received and the corresponding VLAN is not
configured on the soft interface, it will splat a WARN on every packet
received. This is a quite annoying behaviour for some scenarios, e.g. if
bat0 is bridged with eth0, and there are arbitrary VLAN tagged frames
from Ethernet coming in without having any VLAN configuration on bat0.
The code should probably create vlan objects on the fly and
transparently transport these VLAN-tagged Ethernet frames, but until
this is done, at least the WARN splat should be replaced by a rate
limited output.
Fixes: 354136bcc3c4 ("batman-adv: fix kernel crash due to missing NULL checks") Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Fri, 24 Jun 2016 23:25:24 +0000 (16:25 -0700)]
net: phy: Manage fixed PHY address space using IDA
If we have a system which uses fixed PHY devices and calls
fixed_phy_register() then fixed_phy_unregister() we can exhaust the
number of fixed PHYs available after a while, since we keep incrementing
the variable phy_fixed_addr, but we never decrement it.
This patch fixes that by converting the fixed PHY allocation to using
IDA, which takes care of the allocation/dealloaction of the PHY
addresses for us.
Fixes: a75951217472 ("net: phy: extend fixed driver with fixed_phy_register()") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Willem de Bruijn [Fri, 24 Jun 2016 20:02:35 +0000 (16:02 -0400)]
sock_diag: do not broadcast raw socket destruction
Diag intends to broadcast tcp_sk and udp_sk socket destruction.
Testing sk->sk_protocol for IPPROTO_TCP/IPPROTO_UDP alone is not
sufficient for this. Raw sockets can have the same type.
Add a test for sk->sk_type.
Fixes: eb4cb008529c ("sock_diag: define destruction multicast groups") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The proc connector messages include a sequence number, allowing userspace
programs to detect lost messages. However, performing this detection is
currently more difficult than necessary, since netlink messages can be
delivered to the application out-of-order. To fix this, leave pre-emption
disabled during cn_netlink_send(), and use GFP_NOWAIT.
The following was written as a test case. Building the kernel w/ make -j32
proved a reliable way to generate out-of-order cn_proc messages.
daniel [Fri, 24 Jun 2016 10:35:18 +0000 (12:35 +0200)]
Bridge: Fix ipv6 mc snooping if bridge has no ipv6 address
The bridge is falsly dropping ipv6 mulitcast packets if there is:
1. No ipv6 address assigned on the brigde.
2. No external mld querier present.
3. The internal querier enabled.
When the bridge fails to build mld queries, because it has no
ipv6 address, it slilently returns, but keeps the local querier enabled.
This specific case causes confusing packet loss.
Ipv6 multicast snooping can only work if:
a) An external querier is present
OR
b) The bridge has an ipv6 address an is capable of sending own queries
Otherwise it has to forward/flood the ipv6 multicast traffic,
because snooping cannot work.
This patch fixes the issue by adding a flag to the bridge struct that
indicates that there is currently no ipv6 address assinged to the bridge
and returns a false state for the local querier in
__br_multicast_querier_exists().
Special thanks to Linus Lüssing.
Fixes: d1d81d4c3dd8 ("bridge: check return value of ipv6_dev_get_saddr()") Signed-off-by: Daniel Danzberger <daniel@dd-wrt.com> Acked-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Sheng-Hui [Fri, 24 Jun 2016 00:52:11 +0000 (08:52 +0800)]
net/mlx5: use mlx5_buf_alloc_node instead of mlx5_buf_alloc in mlx5_wq_ll_create
Commit 311c7c71c9bb ("net/mlx5e: Allocate DMA coherent memory on
reader NUMA node") introduced mlx5_*_alloc_node() but missed changing
some calling and warn messages. This patch introduces 2 changes:
* Use mlx5_buf_alloc_node() instead of mlx5_buf_alloc() in
mlx5_wq_ll_create()
* Update the failure warn messages with _node postfix for
mlx5_*_alloc function names
Fixes: 311c7c71c9bb ("net/mlx5e: Allocate DMA coherent memory on reader NUMA node") Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com> Acked-By: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Thu, 23 Jun 2016 21:25:33 +0000 (14:25 -0700)]
net: bgmac: Remove superflous netif_carrier_on()
bgmac_open() calls phy_start() to initialize the PHY state machine,
which will set the interface's carrier state accordingly, no need to
force that as this could be conflicting with the PHY state determined by
PHYLIB.
Fixes: dd4544f05469 ("bgmac: driver for GBit MAC core on BCMA bus") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Thu, 23 Jun 2016 21:25:32 +0000 (14:25 -0700)]
net: bgmac: Start transmit queue in bgmac_open
The driver does not start the transmit queue in bgmac_open(). If the
queue was stopped prior to closing then re-opening the interface, we
would never be able to wake-up again.
Fixes: dd4544f05469 ("bgmac: driver for GBit MAC core on BCMA bus") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Fainelli [Thu, 23 Jun 2016 21:23:12 +0000 (14:23 -0700)]
net: bgmac: Fix SOF bit checking
We are checking for the Start of Frame bit in the ctl1 word, while this
bit is set in the ctl0 word instead. Read the ctl0 word and update the
check to verify that.
Fixes: 9cde94506eac ("bgmac: implement scatter/gather support") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jay Vosburgh [Thu, 23 Jun 2016 21:20:51 +0000 (14:20 -0700)]
bonding: fix 802.3ad aggregator reselection
Since commit 7bb11dc9f59d ("bonding: unify all places where
actor-oper key needs to be updated."), the logic in bonding to handle
selection between multiple aggregators has not functioned.
This affects only configurations wherein the bonding slaves
connect to two discrete aggregators (e.g., two independent switches, each
with LACP enabled), thus creating two separate aggregation groups within a
single bond.
The cause is a change in 7bb11dc9f59d to no longer set
AD_PORT_BEGIN on a port after a link state change, which would cause the
port to be reselected for attachment to an aggregator as if were newly
added to the bond. We cannot restore the prior behavior, as it
contradicts IEEE 802.1AX 5.4.12, which requires ports that "become
inoperable" (lose carrier, setting port_enabled=false as per 802.1AX
5.4.7) to remain selected (i.e., assigned to the aggregator). As the port
now remains selected, the aggregator selection logic is not invoked.
A side effect of this change is that aggregators in bonding will
now contain ports that are link down. The aggregator selection logic
does not currently handle this situation correctly, causing incorrect
aggregator selection.
This patch makes two changes to repair the aggregator selection
logic in bonding to function as documented and within the confines of the
standard:
First, the aggregator selection and related logic now utilizes the
number of active ports per aggregator, not the number of selected ports
(as some selected ports may be down). The ad_select "bandwidth" and
"count" options only consider ports that are link up.
Second, on any carrier state change of any slave, the aggregator
selection logic is explicitly called to insure the correct aggregator is
active.
Reported-by: Veli-Matti Lintu <veli-matti.lintu@opinsys.fi> Fixes: 7bb11dc9f59d ("bonding: unify all places where actor-oper key needs to be updated.") Signed-off-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stefan Hajnoczi [Thu, 23 Jun 2016 15:28:58 +0000 (16:28 +0100)]
vsock: make listener child lock ordering explicit
There are several places where the listener and pending or accept queue
child sockets are accessed at the same time. Lockdep is unhappy that
two locks from the same class are held.
Tell lockdep that it is safe and document the lock ordering.
Originally Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> sent a similar
patch asking whether this is safe. I have audited the code and also
covered the vsock_pending_work() function.
Suggested-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paolo Abeni [Thu, 23 Jun 2016 13:25:09 +0000 (15:25 +0200)]
ipv6: enforce egress device match in per table nexthop lookups
with the commit 8c14586fc320 ("net: ipv6: Use passed in table for
nexthop lookups"), net hop lookup is first performed on route creation
in the passed-in table.
However device match is not enforced in table lookup, so the found
route can be later discarded due to egress device mismatch and no
global lookup will be performed.
This cause the following to fail:
ip link add dummy1 type dummy
ip link add dummy2 type dummy
ip link set dummy1 up
ip link set dummy2 up
ip route add 2001:db8:8086::/48 dev dummy1 metric 20
ip route add 2001:db8:d34d::/64 via 2001:db8:8086::2 dev dummy1 metric 20
ip route add 2001:db8:8086::/48 dev dummy2 metric 21
ip route add 2001:db8:d34d::/64 via 2001:db8:8086::2 dev dummy2 metric 21
RTNETLINK answers: No route to host
This change fixes the issue enforcing device lookup in
ip6_nh_lookup_table()
v1->v2: updated commit message title
Fixes: 8c14586fc320 ("net: ipv6: Use passed in table for nexthop lookups") Reported-and-tested-by: Beniamino Galvani <bgalvani@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Mon, 27 Jun 2016 14:05:55 +0000 (10:05 -0400)]
Merge tag 'linux-can-fixes-for-4.7-20160623' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2016-06-23
this is a pull request of 3 patches for the upcoming linux-4.7 release.
The first two patches are by Oliver Hartkopp fixing oopes in the generic CAN
device netlink handling. Jimmy Assarsson's patch for the kvaser_usb driver adds
support for more devices by adding their USB product ids.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeremy Linton [Wed, 22 Jun 2016 17:40:50 +0000 (12:40 -0500)]
net: smsc911x: Fix bug where PHY interrupts are overwritten by 0
By default, mdiobus_alloc() sets the PHYs to polling mode, but a
pointer size memcpy means that a couple IRQs end up being overwritten
with a value of 0. This means that PHY_POLL is disabled and results
in unpredictable behavior depending on the PHY's location on the
MDIO bus. Remove that memcpy and the now unused phy_irq member to
force the SMSC911x PHYs into polling mode 100% of the time.
Fixes: e7f4dc3536a4 ("mdio: Move allocation of interrupts into core") Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 23 Jun 2016 19:22:31 +0000 (15:22 -0400)]
Merge tag 'wireless-drivers-for-davem-2016-06-21' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers
Kalle Valo says:
====================
wireless-drivers fixes for 4.7
iwlwifi
* fix the scan timeout for long scans
* fix an RCU splat caused when updating the TKIP key
* fix a potential NULL-derefence introduced recently
* fix a IGTK key bug that has existed since the MVM driver was introduced
* fix some fw capabilities checks that got accidentally inverted
rtl8xxxu
* fix typo on variable name
ath10k
* fix deadlock when peer cannot be created
* fix crash related to printing features
* fix deadlock while processing rx_in_ord_ind
ath9k
* fix GPIO mask regression for AR9462 and AR9565
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Chris Packham [Tue, 21 Jun 2016 03:39:43 +0000 (15:39 +1200)]
net: vrf: replace hard tab with space in assignment
The assignment of rth->dst.output in vrf_rt6_create() and
vrf_rtable_create() used a hard tab before the '='. The neighboring
assignments did not. Make the assignment of rth->dst.output consistent
with the surrounding code.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 20 Jun 2016 22:00:43 +0000 (15:00 -0700)]
netem: fix a use after free
If the packet was dropped by lower qdisc, then we must not
access it later.
Save qdisc_pkt_len(skb) in a temp variable.
Fixes: 2ccccf5fb43f ("net_sched: update hierarchical backlog too") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: WANG Cong <xiyou.wangcong@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Mon, 20 Jun 2016 20:37:19 +0000 (13:37 -0700)]
act_ife: acquire ife_mod_lock before reading ifeoplist
Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Mon, 20 Jun 2016 20:37:18 +0000 (13:37 -0700)]
act_ife: only acquire tcf_lock for existing actions
Alexey reported that we have GFP_KERNEL allocation when
holding the spinlock tcf_lock. Actually we don't have
to take that spinlock for all the cases, especially
for the new one we just create. To modify the existing
actions, we still need this spinlock to make sure
the whole update is atomic.
For net-next, we can get rid of this spinlock because
we already hold the RTNL lock on slow path, and on fast
path we can use RCU to protect the metalist.
Joint work with Jamal.
Reported-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Sat, 18 Jun 2016 05:03:36 +0000 (13:03 +0800)]
esp: Fix ESN generation under UDP encapsulation
Blair Steven noticed that ESN in conjunction with UDP encapsulation
is broken because we set the temporary ESP header to the wrong spot.
This patch fixes this by first of all using the right spot, i.e.,
4 bytes off the real ESP header, and then saving this information
so that after encryption we can restore it properly.
Fixes: 7021b2e1cddd ("esp4: Switch to new AEAD interface") Reported-by: Blair Steven <Blair.Steven@alliedtelesis.co.nz> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Hartkopp [Tue, 21 Jun 2016 13:45:47 +0000 (15:45 +0200)]
can: fix oops caused by wrong rtnl dellink usage
For 'real' hardware CAN devices the netlink interface is used to set CAN
specific communication parameters. Real CAN hardware can not be created nor
removed with the ip tool ...
This patch adds a private dellink function for the CAN device driver interface
that does just nothing.
It's a follow up to commit 993e6f2fd ("can: fix oops caused by wrong rtnl
newlink usage") but for dellink.
Reported-by: ajneu <ajneu1@gmail.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Oliver Hartkopp [Tue, 21 Jun 2016 10:14:07 +0000 (12:14 +0200)]
can: fix handling of unmodifiable configuration options fix
With upstream commit bb208f144cf3f59 (can: fix handling of unmodifiable
configuration options) a new can_validate() function was introduced.
When invoking 'ip link set can0 type can' without any configuration data
can_validate() tries to validate the content without taking into account that
there's totally no content. This patch adds a check for missing content.
Reported-by: ajneu <ajneu1@gmail.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
David S. Miller [Wed, 22 Jun 2016 20:38:17 +0000 (16:38 -0400)]
Merge branch 'mlx4-fixes'
Tariq Toukan says:
====================
mlx4_en fixes for 4.7-rc
This small patchset includes two small fixes for mlx4_en driver.
One allows a clean shutdown even when clients do not release their
netdev reference.
The other adds error return values to the VLAN VID add/kill functions.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Eran Ben Elisha [Tue, 21 Jun 2016 11:20:03 +0000 (14:20 +0300)]
net/mlx4_en: Avoid unregister_netdev at shutdown flow
This allows a clean shutdown, even if some netdev clients do not
release their reference from this netdev. It is enough to release
the HW resources only as the kernel is shutting down.
Fixes: 2ba5fbd62b25 ('net/mlx4_core: Handle AER flow properly') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kamal Heib [Tue, 21 Jun 2016 11:20:02 +0000 (14:20 +0300)]
net/mlx4_en: Fix the return value of a failure in VLAN VID add/kill
Modify mlx4_en_vlan_rx_[add/kill]_vid to return error value in case of
failure.
Fixes: 8e586137e6b6 ('net: make vlan ndo_vlan_rx_[add/kill]_vid return error value') Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Mon, 20 Jun 2016 13:20:46 +0000 (09:20 -0400)]
tipc: unclone unbundled buffers before forwarding
When extracting an individual message from a received "bundle" buffer,
we just create a clone of the base buffer, and adjust it to point into
the right position of the linearized data area of the latter. This works
well for regular message reception, but during periods of extremely high
load it may happen that an extracted buffer, e.g, a connection probe, is
reversed and forwarded through an external interface while the preceding
extracted message is still unhandled. When this happens, the header or
data area of the preceding message will be partially overwritten by a
MAC header, leading to unpredicatable consequences, such as a link
reset.
We now fix this by ensuring that the msg_reverse() function never
returns a cloned buffer, and that the returned buffer always contains
sufficient valid head and tail room to be forwarded.
Reported-by: Erik Hugne <erik.hugne@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
It is caused by a missing free in the ->release path. So fix it by
providing seq_release_net as the ->release method.
Signed-off-by: Jiri Slaby <jslaby@suse.cz> Fixes: cd6e111bf5 (kcm: Add statistics and proc interfaces) Cc: "David S. Miller" <davem@davemloft.net> Cc: Tom Herbert <tom@herbertland.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel [Mon, 20 Jun 2016 08:53:20 +0000 (11:53 +0300)]
team: Fix possible deadlock during team enslave
Both dev_uc_sync_multiple() and dev_mc_sync_multiple() require the
source device to be locked by netif_addr_lock_bh(), but this is missing
in team's enslave function, so add it.
Fixes: cb41c997d444 ("team: team should sync the port's uc/mc addrs when add a port") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Wed, 22 Jun 2016 20:29:17 +0000 (16:29 -0400)]
Merge tag 'linux-can-fixes-for-4.7-20160620' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2016-06-20
this is a pull request of 3 patches for the upcoming linux-4.7 release.
The first patch is by Thor Thayer for the c_can/d_can driver. It fixes the
registar access on Altera Cyclone devices, which caused CAN frames to have 0x0
in the first two bytes incorrectly. Wolfgang Grandegger's patch for the at91
driver fixes a hanging driver under high bus load situations. A patch for the
gs_usb driver by Maximilian Schneider adds support for the bytewerk.org
candleLight interface.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
can: at91_can: RX queue could get stuck at high bus load
At high bus load it could happen that "at91_poll()" enters with all RX
message boxes filled up. If then at the end the "quota" is exceeded as
well, "rx_next" will not be reset to the first RX mailbox and hence the
interrupts remain disabled.
Signed-off-by: Wolfgang Grandegger <wg@grandegger.com> Tested-by: Amr Bekhit <amrbekhit@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Thor Thayer [Thu, 16 Jun 2016 16:10:19 +0000 (11:10 -0500)]
can: c_can: Update D_CAN TX and RX functions to 32 bit - fix Altera Cyclone access
When testing CAN write floods on Altera's CycloneV, the first 2 bytes
are sometimes 0x00, 0x00 or corrupted instead of the values sent. Also
observed bytes 4 & 5 were corrupted in some cases.
The D_CAN Data registers are 32 bits and changing from 16 bit writes to
32 bit writes fixes the problem.
Testing performed on Altera CycloneV (D_CAN). Requesting tests on other
C_CAN & D_CAN platforms.
Reported-by: Richard Andrysek <richard.andrysek@gomtec.de> Signed-off-by: Thor Thayer <tthayer@opensource.altera.com> Cc: <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
David S. Miller [Sun, 19 Jun 2016 17:47:33 +0000 (10:47 -0700)]
Merge branch 'qed-fixes'
Yuval Mintz says:
====================
qed*: Fixes series
This series contains several small fixes to driver behavior
[4th patch is the only one containing a 'fatal' fix, but the error
is only theoretical for qede; if would require another protocol
driver yet unsubmitted to reach it].
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sun, 19 Jun 2016 12:18:15 +0000 (15:18 +0300)]
qed: Add missing port-mode
The 'MODULE_FIBER' value replaced several other FIBER values
in newer management firmware images, so existing code would
fail to properly reflect its mode.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sun, 19 Jun 2016 12:18:14 +0000 (15:18 +0300)]
qed: Fix returning unlimited SPQ entries
Driver has 2 sets of entries for handling ramrod configurations
toward firmware - a regular pre-allocated set of entires and a
possible 'unlimited' list of additional pending entries.
In most scenarios the 'unlimited' list would not be used, but
when it does the handling of the ramrod completion doesn't
properly handle the release of the entry.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sun, 19 Jun 2016 12:18:13 +0000 (15:18 +0300)]
qed*: Don't reset statistics on inner reload
Several user APIs can cause driver to perform an inner-reload.
Currently, doing this would cause the HW/FW statistics of the
adapter to reset, which isn't the expected behavior [statistics
should only reset on explicit unloads].
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sun, 19 Jun 2016 12:18:12 +0000 (15:18 +0300)]
qed: Prevent VF from Tx-switching 'promisc'
Internal loopback in driver is based on two things - first
is the willingness of transmitter to use it [in case of VFs,
this can be forced based on VEPA/VEB] and secondly on another
vport classification configuration which should match the
packet's destination.
Current code allows non-linux VFs to configure a 'promisc'
mode on Tx, meaning all traffic sent by VF would be loopbacked
internally by firmware; This isn't considered a valid mode and
as such should be prevented by PF.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sun, 19 Jun 2016 12:18:11 +0000 (15:18 +0300)]
qed: Correct default vlan behavior
When no vlan filter is configured, firmware has a configurable
default on whether to pass only untagged packets or all packets
regardless of their tagging. Driver currently doesn't set this
field in the necessary ramrod, causing the default to always be
'receive all'.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Joshua Houghton [Sat, 18 Jun 2016 15:46:31 +0000 (15:46 +0000)]
net: rds: fix coding style issues
Fix coding style issues in the following files:
ib_cm.c: add space
loop.c: convert spaces to tabs
sysctl.c: add space
tcp.h: convert spaces to tabs
tcp_connect.c:remove extra indentation in switch statement
tcp_recv.c: convert spaces to tabs
tcp_send.c: convert spaces to tabs
transport.c: move brace up one line on for statement
Signed-off-by: Joshua Houghton <josh@awful.name> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Basil Gunn [Thu, 16 Jun 2016 16:42:30 +0000 (09:42 -0700)]
AX.25: Close socket connection on session completion
A socket connection made in ax.25 is not closed when session is
completed. The heartbeat timer is stopped prematurely and this is
where the socket gets closed. Allow heatbeat timer to run to close
socket. Symptom occurs in kernels >= 4.2.0
Originally sent 6/15/2016. Resend with distribution list matching
scripts/maintainer.pl output.
Signed-off-by: Basil Gunn <basil@pacabunga.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sowmini Varadhan [Fri, 17 Jun 2016 20:12:12 +0000 (16:12 -0400)]
RDS: TCP: rds_tcp_accept_one() should transition socket from RESETTING to UP
The state of the rds_connection after rds_tcp_reset_callbacks() would
be RDS_CONN_RESETTING and this is the value that should be passed
by rds_tcp_accept_one() to rds_connect_path_complete() to transition
the socket to RDS_CONN_UP.
Fixes: b5c21c0947c1 ("RDS: TCP: fix race windows in send-path quiescence
by rds_tcp_accept_one()") Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Fri, 17 Jun 2016 16:15:30 +0000 (18:15 +0200)]
net: tilegx: use correct timespec64 type
The conversion to the 64-bit time based ptp methods left two instances
of 'struct timespec' in place. This is harmless because 64-bit
architectures define timespec64 as timespec, and this driver is
not used on 32-bit machines.
However, using 'struct timespec64' directly is obviously the right
thing to do, and will help us remove 'struct timespec' in the future.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Fixes: b9acf24f779c ("ptp: tilegx: convert to the 64 bit get/set time methods.") Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jon Paul Maloy [Fri, 17 Jun 2016 10:35:57 +0000 (06:35 -0400)]
tipc: fix socket timer deadlock
We sometimes observe a 'deadly embrace' type deadlock occurring
between mutually connected sockets on the same node. This happens
when the one-hour peer supervision timers happen to expire
simultaneously in both sockets.
Further analysis reveals that there are three different locations in the
socket code where tipc_sk_respond() is called within the context of the
socket lock, with ensuing risk of similar deadlocks.
We now solve this by passing a buffer queue along with all upcalls where
sk_lock.slock may potentially be held. Response or rejected message
buffers are accumulated into this queue instead of being sent out
directly, and only sent once we know we are safely outside the slock
context.
Reported-by: GUNA <gbalasun@gmail.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The following patchset contains Netfilter fixes for your net tree,
they are rather small patches but fixing several outstanding bugs in
nf_conntrack and nf_tables, as well as minor problems with missing
SYNPROXY header uapi installation:
1) Oneliner not to leak conntrack kmemcache on module removal, this
problem was introduced in the previous merge window, patch from
Florian Westphal.
2) Two fixes for insufficient ruleset loop validation, one due to
incorrect flag check in nf_tables_bind_set() and another related to
silly wrong generation mask logic from the walk path, from Liping
Zhang.
3) Fix double-free of anonymous sets on error, this fix simplifies the
code to let the abort path take care of releasing the set object,
also from Liping Zhang.
4) The introduction of helper function for transactions broke the skip
inactive rules logic from the nft_do_chain(), again from Liping
Zhang.
5) Two patches to install uapi xt_SYNPROXY.h header and calm down
kbuild robot due to missing #include <linux/types.h>.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski [Thu, 16 Jun 2016 13:42:50 +0000 (14:42 +0100)]
nfp: use correct index to mask link state irq
We were using an incorrect define to get the irq vector number.
NFP_NET_CFG_LSC is a control BAR offset, LSC interrupt vector
index is called NFP_NET_IRQ_LSC_IDX. For machines with less
than 30 CPUs this meant that we were disabling/enabling IRQ 0.
For bigger hosts we were just playing with the 31st RX/TX
interrupt.
Fixes: 0ba40af963f0 ("nfp: move link state interrupt request/free calls") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Thu, 16 Jun 2016 08:06:19 +0000 (17:06 +0900)]
sit: correct IP protocol used in ipip6_err
Since 32b8a8e59c9c ("sit: add IPv4 over IPv4 support")
ipip6_err() may be called for packets whose IP protocol is
IPPROTO_IPIP as well as those whose IP protocol is IPPROTO_IPV6.
In the case of IPPROTO_IPIP packets the correct protocol value is not
passed to ipv4_update_pmtu() or ipv4_redirect().
This patch resolves this problem by using the IP protocol of the packet
rather than a hard-coded value. This appears to be consistent
with the usage of the protocol of a packet by icmp_socket_deliver()
the caller of ipip6_err().
I was able to exercise the redirect case by using a setup where an ICMP
redirect was received for the destination of the encapsulated packet.
However, it appears that although incorrect the protocol field is not used
in this case and thus no problem manifests. On inspection it does not
appear that a problem will manifest in the fragmentation needed/update pmtu
case either.
In short I believe this is a cosmetic fix. None the less, the use of
IPPROTO_IPV6 seems wrong and confusing.
Reviewed-by: Dinan Gunawardena <dinan.gunawardena@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Acked-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Wed, 15 Jun 2016 21:42:11 +0000 (14:42 -0700)]
mlx4e: Do not attempt to offload VXLAN ports that are unrecognized
The mlx4e driver does not support more than one port for VXLAN offload. As
such expecting the hardware to offload other ports is invalid since it
appears the parsing logic is used to perform Tx checksum and segmentation
offloads. Use the vxlan_port number to determine in which cases we can
apply the offload and in which cases we can not.
Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Reviewed-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kalle Valo [Thu, 16 Jun 2016 14:55:19 +0000 (17:55 +0300)]
Merge ath-current from ath.git
ath.git fixes for 4.7. Major changes:
ath9k
* fix GPIO mask regression with AR9462 and AR9565
ath10k
* fix deadlock while processing rx_in_ord_ind
* fix crash related to printing firmware features in debug mode
* fix deadlock when peer cannot be created
Colin Ian King [Thu, 9 Jun 2016 18:38:50 +0000 (14:38 -0400)]
rtl8xxxu: fix typo on variable name, compare against correct variable
path_b_ok is being assigned but immediately after path_a_ok is being
compared to the value 0x03. This appears to be a typo on the
variable name, compare path_b_ok instead.
Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
David S. Miller [Thu, 16 Jun 2016 06:37:55 +0000 (23:37 -0700)]
Merge branch 'bpf-fixes'
Alexei Starovoitov says:
====================
bpf fixes
Fixes for two bpf bugs:
1st bug reported by Sasha Goldshtein here:
https://github.com/iovisor/bcc/issues/570
2nd discovered by Daniel Borkmann by manual code analysis.
See patches for details.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
bpf, trace: check event type in bpf_perf_event_read
similar to bpf_perf_event_output() the bpf_perf_event_read() helper
needs to check the type of the perf_event before reading the counter.
Fixes: a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") Reported-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
The ctx structure passed into bpf programs is different depending on bpf
program type. The verifier incorrectly marked ctx->data and ctx->data_end
access based on ctx offset only. That caused loads in tracing programs
int bpf_prog(struct pt_regs *ctx) { .. ctx->ax .. }
to be incorrectly marked as PTR_TO_PACKET which later caused verifier
to reject the program that was actually valid in tracing context.
Fix this by doing program type specific matching of ctx offsets.
Fixes: 969bf05eb3ce ("bpf: direct packet access") Reported-by: Sasha Goldshtein <goldshtn@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 15 Jun 2016 13:24:00 +0000 (06:24 -0700)]
gre: fix error handler
1) gre_parse_header() can be called from gre_err()
At this point transport header points to ICMP header, not the inner
header.
2) We can not really change transport header as ipgre_err() will later
assume transport header still points to ICMP header (using icmp_hdr())
3) pskb_may_pull() logic in gre_parse_header() really works
if we are interested at zone pointed by skb->data
4) As Jiri explained in commit b7f8fe251e46 ("gre: do not pull header in
ICMP error processing") we should not pull headers in error handler.
So this fix :
A) changes gre_parse_header() to use skb->data instead of
skb_transport_header()
B) Adds a nhs parameter to gre_parse_header() so that we can skip the
not pulled IP header from error path.
This offset is 0 for normal receive path.
C) remove obsolete IPV6 includes
Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <tom@herbertland.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: Don't forget pr_fmt on net_dbg_ratelimited for CONFIG_DYNAMIC_DEBUG
The implementation of net_dbg_ratelimited in the CONFIG_DYNAMIC_DEBUG
case was added with 2c94b5373 ("net: Implement net_dbg_ratelimited() for
CONFIG_DYNAMIC_DEBUG case"). The implementation strategy was to take the
usual definition of the dynamic_pr_debug macro, but alter it by adding a
call to "net_ratelimit()" in the if statement. This is, in fact, the
correct approach.
However, while doing this, the author of the commit forgot to surround
fmt by pr_fmt, resulting in unprefixed log messages appearing in the
console. So, this commit adds back the pr_fmt(fmt) invocation, making
net_dbg_ratelimited properly consistent across DEBUG, no DEBUG, and
DYNAMIC_DEBUG cases, and bringing parity with the behavior of
dynamic_pr_debug as well.
Fixes: 2c94b5373 ("net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Tim Bingham <tbingham@akamai.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Wed, 15 Jun 2016 15:45:51 +0000 (17:45 +0200)]
net: skfb: remove obsolete -I cflag
The skfp driver has been moved to drivers/net/fddi/skfp a long time
ago, but we still attempt to include headers from the old location,
which causes a warning when building with W=1:
cc1: error: /git/arm-soc/drivers/net/skfp: No such file or directory [-Werror=missing-include-dirs]
cc1: error: drivers/net/skfp: No such file or directory [-Werror=missing-include-dirs]
Clearly this include directive is not needed any more, so we can
just remove it now.
Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Ying Xue [Wed, 15 Jun 2016 06:11:31 +0000 (14:11 +0800)]
tipc: eliminate uninitialized variable warning
net/tipc/link.c: In function ‘tipc_link_timeout’:
net/tipc/link.c:744:28: warning: ‘mtyp’ may be used uninitialized in this function [-Wuninitialized]
Fixes: 42b18f605fea ("tipc: refactor function tipc_link_timeout()") Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The warning appears as rtnl_dereference() is wrongly used in
tipc_l2_send_msg() under RCU read lock protection. Instead the proper
usage should be that rcu_dereference_rtnl() is called here.
Fixes: 5b7066c3dd24 ("tipc: stricter filtering of packets in bearer layer") Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sabrina Dubroca [Tue, 14 Jun 2016 13:25:15 +0000 (15:25 +0200)]
macsec: allocate sg and iv on the heap
For the crypto callbacks to work properly, we cannot have sg and iv on
the stack. Use kmalloc instead, with a single allocation for
aead_request + scatterlist + iv.
Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Jamal Hadi Salim [Mon, 13 Jun 2016 22:08:42 +0000 (18:08 -0400)]
net sched actions: bug fix dumping actions directly didnt produce NLMSG_DONE
This refers to commands to direct action access as follows:
sudo tc actions add action drop index 12
sudo tc actions add action pipe index 10
And then dumping them like so:
sudo tc actions ls action gact
iproute2 worked because it depended on absence of TCA_ACT_TAB TLV
as end of message.
This fix has been tested with iproute2 and is backward compatible.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
WANG Cong [Mon, 13 Jun 2016 20:44:14 +0000 (13:44 -0700)]
act_ipt: fix a bind refcnt leak
And avoid calling tcf_hash_check() twice.
Fixes: a57f19d30b2d ("net sched: ipt action fix late binding") Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Now prio_init() can return -ENOMEM, it also has to make sure
any allocated qdiscs are freed, since the caller (qdisc_create()) wont
call ->destroy() handler for us.
More generally, we want a transactional behavior for "tc qdisc
change ...", so prio_tune() should not make modifications if
any error is returned.
It means that we must validate parameters and allocate missing qdisc(s)
before taking root qdisc lock exactly once, to not leave the prio qdisc
in an intermediate state.
Fixes: cbdf45116478 ("net_sched: prio: properly report out of memory errors") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Maciej Żenczykowski reported lockdep warning a spinlock
was not registered before being held in mlx4_cmd_wake_completions()
cmd.context_lock initialization is not at the right place.
1) mlx4_cmd_use_events() can be called multiple times.
Calling spin_lock_init() on a live spinlock can lead
to hangs.
2) mlx4_cmd_wake_completions() can be called while lock
has not been initialized.
Lockdep complains, and current logic is not race prone.
It seems better to move the initialization earlier in
mlx4_load_one()
Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Maciej Żenczykowski <maze@google.com> Cc: Eugenia Emantayev <eugenia@mellanox.com> Cc: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Liping Zhang [Sat, 11 Jun 2016 04:20:28 +0000 (12:20 +0800)]
netfilter: nf_tables: fix wrong destroy anonymous sets if binding fails
When we add a nft rule like follows:
# nft add rule filter test tcp dport vmap {1: jump test}
-ELOOP error will be returned, and the anonymous set will be
destroyed.
But after that, nf_tables_abort will also try to remove the
element and destroy the set, which was already destroyed and
freed.
If we add a nft wrong rule, nft_tables_abort will do the cleanup
work rightly, so nf_tables_set_destroy call here is redundant and
wrong, remove it.
Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This is because before we commit, the element in the current anonymous
set is inactive, so osp->walk will skip this element and miss the
validate check."
To resolve this problem, this patch passes the generation mask to the
walk function through the iter container structure depending on the code
path:
1) If we're dumping the elements, then we have to check if the element
is active in the current generation. Thus, we check for the current
bit in the genmask.
2) If we're checking for loops, then we have to check if the element is
active in the next generation, as we're in the middle of a
transaction. Thus, we check for the next bit in the genmask.
Nicolas Dichtel [Mon, 13 Jun 2016 08:31:07 +0000 (10:31 +0200)]
ovs/geneve: fix rtnl notifications on iface deletion
The function geneve_dev_create_fb() (only used by ovs) never calls
rtnl_configure_link(). The consequence is that dev->rtnl_link_state is
never set to RTNL_LINK_INITIALIZED.
During the deletion phase, the function rollback_registered_many() sends
a RTM_DELLINK only if dev->rtnl_link_state is set to RTNL_LINK_INITIALIZED.
Fixes: e305ac6cf5a1 ("geneve: Add support to collect tunnel metadata.") CC: Pravin B Shelar <pshelar@nicira.com> CC: Jesse Gross <jesse@nicira.com> CC: Thomas Graf <tgraf@suug.ch> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Mon, 13 Jun 2016 08:31:06 +0000 (10:31 +0200)]
ovs/gre: fix rtnl notifications on iface deletion
The function gretap_fb_dev_create() (only used by ovs) never calls
rtnl_configure_link(). The consequence is that dev->rtnl_link_state is
never set to RTNL_LINK_INITIALIZED.
During the deletion phase, the function rollback_registered_many() sends
a RTM_DELLINK only if dev->rtnl_link_state is set to RTNL_LINK_INITIALIZED.
Fixes: b2acd1dc3949 ("openvswitch: Use regular GRE net_device instead of vport") CC: Thomas Graf <tgraf@suug.ch> CC: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Mon, 13 Jun 2016 08:31:05 +0000 (10:31 +0200)]
ovs/vxlan: fix rtnl notifications on iface deletion
The function vxlan_dev_create() (only used by ovs) never calls
rtnl_configure_link(). The consequence is that dev->rtnl_link_state is
never set to RTNL_LINK_INITIALIZED.
During the deletion phase, the function rollback_registered_many() sends
a RTM_DELLINK only if dev->rtnl_link_state is set to RTNL_LINK_INITIALIZED.
Note that the function vxlan_dev_create() is moved after the rtnl stuff so
that vxlan_dellink() can be called in this function.
Fixes: dcc38c033b32 ("openvswitch: Re-add CONFIG_OPENVSWITCH_VXLAN") CC: Thomas Graf <tgraf@suug.ch> CC: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Nicolas Dichtel [Mon, 13 Jun 2016 08:31:04 +0000 (10:31 +0200)]
ovs/gre,geneve: fix error path when creating an iface
After ipgre_newlink()/geneve_configure() call, the netdev is registered.
Fixes: 7e059158d57b ("vxlan, gre, geneve: Set a large MTU on ovs-created tunnel devices") CC: David Wragg <david@weave.works> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Su, Xuemin [Mon, 13 Jun 2016 03:02:50 +0000 (11:02 +0800)]
udp reuseport: fix packet of same flow hashed to different socket
There is a corner case in which udp packets belonging to a same
flow are hashed to different socket when hslot->count changes from 10
to 11:
1) When hslot->count <= 10, __udp_lib_lookup() searches udp_table->hash,
and always passes 'daddr' to udp_ehashfn().
2) When hslot->count > 10, __udp_lib_lookup() searches udp_table->hash2,
but may pass 'INADDR_ANY' to udp_ehashfn() if the sockets are bound to
INADDR_ANY instead of some specific addr.
That means when hslot->count changes from 10 to 11, the hash calculated by
udp_ehashfn() is also changed, and the udp packets belonging to a same
flow will be hashed to different socket.
This is easily reproduced:
1) Create 10 udp sockets and bind all of them to 0.0.0.0:40000.
2) From the same host send udp packets to 127.0.0.1:40000, record the
socket index which receives the packets.
3) Create 1 more udp socket and bind it to 0.0.0.0:44096. The number 44096
is 40000 + UDP_HASH_SIZE(4096), this makes the new socket put into the
same hslot as the aformentioned 10 sockets, and makes the hslot->count
change from 10 to 11.
4) From the same host send udp packets to 127.0.0.1:40000, and the socket
index which receives the packets will be different from the one received
in step 2.
This should not happen as the socket bound to 0.0.0.0:44096 should not
change the behavior of the sockets bound to 0.0.0.0:40000.
It's the same case for IPv6, and this patch also fixes that.
Signed-off-by: Su, Xuemin <suxm@chinanetcenter.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Mon, 13 Jun 2016 03:01:25 +0000 (20:01 -0700)]
net_sched: fix pfifo_head_drop behavior vs backlog
When the qdisc is full, we drop a packet at the head of the queue,
queue the current skb and return NET_XMIT_CN
Now we track backlog on upper qdiscs, we need to call
qdisc_tree_reduce_backlog(), even if the qlen did not change.
Fixes: 2ccccf5fb43f ("net_sched: update hierarchical backlog too") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: WANG Cong <xiyou.wangcong@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Feng Tang [Sun, 12 Jun 2016 09:36:37 +0000 (17:36 +0800)]
net: alx: Work around the DMA RX overflow issue
Commit 26c5f03 uses a new skb allocator to avoid the RFD overflow
issue.
But from debugging without datasheet, we found the error always
happen when the DMA RX address is set to 0x....fc0, which is very
likely to be a HW/silicon problem.
So one idea is instead of adding a new allocator, why not just
hitting the right target by avaiding the error-prone DMA address?
This patch will actually
* Remove the commit 26c5f03
* Apply rx skb with 64 bytes longer space, and if the allocated skb
has a 0x...fc0 address, it will use skb_resever(skb, 64) to
advance the address, so that the RX overflow can be avoided.
In theory this method should also apply to atl1c driver, which
I can't find anyone who can help to test on real devices.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=70761 Signed-off-by: Feng Tang <feng.tang@intel.com> Suggested-by: Eric Dumazet <edumazet@google.com> Tested-by: Ole Lukoie <olelukoie@mail.ru> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Tom Herbert <tom@herbertland.com> Fixes: 4068579e1e098fa ("net: Implmement RFC 6936 (zero RX csums for UDP/IPv6") Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Tom Herbert <tom@herbertland.com> Fixes: 4068579e1e098fa ("net: Implmement RFC 6936 (zero RX csums for UDP/IPv6") Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
ipv6: tcp: fix endianness annotation in tcp_v6_send_response
Cc: Florent Fourcot <florent.fourcot@enst-bretagne.fr> Fixes: 1d13a96c74fc ("ipv6: tcp: fix flowlabel value in ACK messages send from TIME_WAIT") Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
IPv6 ping socket error handler doesn't correctly convert the new 32 bit
mtu to host endianness before using.
Cc: Lorenzo Colitti <lorenzo@google.com> Fixes: 6d0bfe22611602f ("net: ipv6: Add IPv6 support to the ping socket.") Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Lorenzo Colitti <lorenzo@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>