Pravin B Shelar [Tue, 13 Aug 2013 08:41:06 +0000 (01:41 -0700)]
ip_tunnel: Do not use inner ip-header-id for tunnel ip-header-id.
Using inner-id for tunnel id is not safe in some rare cases.
E.g. packets coming from multiple sources entering the same tunnel
can have the same id. Therefore on tunnel packet receive we
could have packets from two different streams but with the same
source and dst IP and the same ip-id, which could confuse IP packet
reassembly.
Following patch reverts optimization from commit b92c2f282b (IP_GRE: Fix IP-Identification.)
CC: Jarno Rajahalme <jrajahalme@nicira.com> CC: Ansis Atteka <aatteka@nicira.com> Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Tue, 13 Aug 2013 23:04:38 +0000 (16:04 -0700)]
Merge branch 'bnx2x'
Dmitry Kravkov says:
====================
Please consider applying the series of bnx2x fixes to net:
* statistics may cause FW assert
* missing fairness configuration in DCB flow
* memory leak in sriov related part
* Illegal PTE access
* Pagefault crash in shutdown flow with cnic
v1->v2
* fixed sparse error pointed by Joe Perches
* added missing signed-off from Sergei Shtylyov
v2->v3
* added missing signed-off from Sergei Shtylyov
* fixed formatting from Sergei Shtylyov
v3->v4
* patch 1/6: fixed declaration order
* patch 2/6 replaced with: protect flows using set_bit constraints
v4->v5
* patch 2/6: replace proprietary locking with semaphore
* dropped 1/6: since adds redundant code from Benjamin Poirier
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Mon, 12 Aug 2013 23:25:03 +0000 (02:25 +0300)]
bnx2x: prevent crash in shutdown flow with CNIC
A crash may occur because, during the shutdown flow, CNIC might try
to access resources already freed by bnx2x.
Change bnx2x_close() into dev_close() in __bnx2x_remove (shutdown flow)
to guarantee CNIC is notified of the device's change of status.
Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Barak Witkowsky [Mon, 12 Aug 2013 23:25:02 +0000 (02:25 +0300)]
bnx2x: fix PTE write access error
PTE write access error might occur in MF_ALLOWED mode when IOMMU
is active. The patch adds rmmod HSI indicating to MFW to stop
running queries which might trigger this failure.
Signed-off-by: Barak Witkowsky <barak@broadcom.com> Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dmitry Kravkov [Mon, 12 Aug 2013 23:24:59 +0000 (02:24 +0300)]
bnx2x: protect different statistics flows
Add locking to protect different statistics flows from
running simultaneously.
This is done in order to serialize statistics requests sent to FW;
otherwise two outstanding queries may cause a FW assert.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com> Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: Eilon Greenstein <eilong@broadcom.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg [Tue, 13 Aug 2013 07:04:05 +0000 (09:04 +0200)]
genetlink: fix family dump race
When dumping generic netlink families, only the first dump call
is locked with genl_lock(), which protects the list of families,
and thus subsequent calls can access the data without locking,
racing against family addition/removal. This can cause a crash.
Fix it - the locking needs to be conditional because the first
time around it's already locked.
A similar bug was reported to me on an old kernel (3.4.47) but
the exact scenario that happened there is no longer possible;
on those kernels the first round wasn't locked either. Looking
at the current code I found the race described above, which had
also existed on the old kernel.
Cc: stable@vger.kernel.org Reported-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Probably this one is quite unlikely to be triggered, but it's safer
to do the call_rcu() at the end, after we have dropped the reference on
the asoc and freed sctp packet chunks. The reason is that in
sctp_transport_destroy_rcu() the transport is being kfree()'d, and if
we're unlucky enough we could run into corrupted pointers. Probably
that's more of a theoretical nature, but it's safer to have this simple fix.
Introduced by commit 72c4f188 ("sctp: sctp_close: fix release of bindings
for deferred call_rcu's"). I also did the 72c4f188 regression test and
it's fine that way.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Karl Heiss <kheiss@gmail.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
stmmac: fix init_dma_desc_rings() to handle errors
In stmmac_init_rx_buffers():
* add missing handling of dma_map_single() error
* remove superfluous unlikely() optimization while at it
Add stmmac_free_rx_buffers() helper and use it in dma_free_rx_skbufs().
In init_dma_desc_rings():
* add missing handling of kmalloc_array() errors
* fix handling of dma_alloc_coherent() and stmmac_init_rx_buffers() errors
* make function return an error value on error and 0 on success
In stmmac_open():
* add handling of init_dma_desc_rings() return value
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The problem is that tipc_link_delete() cancels the disc_timeout() timer while
b_ptr->lock is held, but disc_timeout() still takes b_ptr->lock to finish its
work, so a deadlock occurs.
We should release b_ptr->lock when deleting the disc_timeout() timer.
Removing link_timeout() still hit the same problem; the patch:
fixes that problem too, so there is no need to send a separate patch for the link_timeout() deadlock warning.
Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Fix possibly wrong memcpy() bytes length since some CAN records received from
PCAN-USB could define a DLC field in range [9..15].
In that case, the real DLC value MUST be used to move the record pointer
forward, but at most 8 bytes MUST be copied into the data field of the struct
can_frame object of the skb given to the network core.
Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
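The rule described in the commit above (advance by the real DLC, copy at most 8 bytes) can be sketched in plain C. This is only an illustration: struct demo_can_frame and demo_decode_record() are hypothetical names, not the kernel's struct can_frame or the driver's decode routine.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CAN_MAX_DLEN 8	/* classic CAN carries at most 8 data bytes */

/* Hypothetical, simplified frame; not the kernel's struct can_frame. */
struct demo_can_frame {
	uint8_t can_dlc;
	uint8_t data[DEMO_CAN_MAX_DLEN];
};

/*
 * Decode one record payload starting at rec. raw_dlc is the DLC field
 * as received (0..15). The copy is clamped to 8 bytes, while the return
 * value tells the caller to advance the record pointer by the real DLC.
 */
static size_t demo_decode_record(const uint8_t *rec, uint8_t raw_dlc,
				 struct demo_can_frame *cf)
{
	uint8_t len = raw_dlc > DEMO_CAN_MAX_DLEN ? DEMO_CAN_MAX_DLEN : raw_dlc;

	cf->can_dlc = len;
	memcpy(cf->data, rec, len);

	return raw_dlc;
}
```

Keeping the two lengths separate is the point: the clamped one protects the 8-byte data field, the raw one keeps the parser aligned on the next record.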
In the first case, macvtap_put_user() calls macvlan_count_rx()
in a preemptible context, and this is not allowed.
In the second case, macvtap_get_user() calls
macvlan_start_xmit() with BH enabled, and this is not allowed.
Reported-by: Thomas Huth <thuth@linux.vnet.ibm.com> Bisected-by: Thomas Huth <thuth@linux.vnet.ibm.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Thomas Huth <thuth@linux.vnet.ibm.com> Cc: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Lüssing [Tue, 6 Aug 2013 18:21:15 +0000 (20:21 +0200)]
batman-adv: fix potential kernel paging errors for unicast transmissions
There are several functions which might reallocate skb data. Currently
some places keep reusing their old ethhdr pointer regardless of whether
they became invalid after such a reallocation or not. This potentially
leads to kernel paging errors.
This patch fixes these by refetching the ethhdr pointer after the
potential reallocations.
Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
David S. Miller [Sat, 10 Aug 2013 20:44:22 +0000 (13:44 -0700)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:
====================
The following patchset contains four netfilter fixes, they are:
* Fix possible invalid access and mangling of the TCPMSS option in
xt_TCPMSS. This was spotted by Julian Anastasov.
* Fix possible off by one access and mangling of the TCP packet in
xt_TCPOPTSTRIP, also spotted by Julian Anastasov.
* Fix possible information leak due to missing initialization of one
padding field of several structures that are included in nfqueue and
nflog netlink messages, from Dan Carpenter.
* Fix TCP window tracking with Fast Open, from Yuchung Cheng.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuchung Cheng [Sat, 10 Aug 2013 00:21:27 +0000 (17:21 -0700)]
netfilter: nf_conntrack: fix tcp_in_window for Fast Open
Currently the conntrack checks if the ending sequence of a packet
falls within the observed receive window. However it does so even
if it has not observed any packet from the remote yet and uses an
uninitialized receive window (td_maxwin).
If a connection uses Fast Open to send a SYN-data packet which is
dropped afterward in the network, the subsequent SYN retransmits
will all fail this check and be discarded, leading to a connection
timeout. This is because the SYN retransmit does not contain data
payload so
end == initial sequence number (isn) + 1
sender->td_end == isn + syn_data_len
receiver->td_maxwin == 0
The fix is to only apply this check after td_maxwin is initialized.
Reported-by: Michael Chan <mcfchan@stanford.edu> Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Byungho An [Thu, 8 Aug 2013 06:30:26 +0000 (15:30 +0900)]
net: stmmac: Fixed the condition of extend_desc for jumbo frame
This patch fixes the condition of extend_desc for jumbo frames.
There is no check for extend_desc in the stmmac_jumbo_frm function.
If dma_tx is used instead of dma_etx even though extend_desc is set,
it causes a kernel panic.
Signed-off-by: Byungho An <bh74.an@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The problem is that vxlan_dellink(), which is called with RTNL lock
held, tries to flush the workqueue synchronously, but apparently
igmp_join and igmp_leave work need to hold RTNL lock too, therefore we
have a soft lockup!
As suggested by Stephen, probably the flush_workqueue can just be
removed and let the normal refcounting work. The workqueue has a
reference to device and socket, therefore the cleanups should work
correctly.
Suggested-by: Stephen Hemminger <stephen@networkplumber.org> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: David S. Miller <davem@davemloft.net> Tested-by: Cong Wang <amwang@redhat.com> Signed-off-by: Cong Wang <amwang@redhat.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
if (vxlan_group_used(vn, vxlan->default_dst.remote_ip))
ip_mc_join_group(sk, &mreq);
else
ip_mc_leave_group(sk, &mreq);
therefore we should check vxlan_group_used(), not its opposite,
for igmp_join.
Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Timo Teräs [Tue, 6 Aug 2013 10:45:43 +0000 (13:45 +0300)]
ip_gre: fix ipgre_header to return correct offset
Fix ipgre_header() (header_ops->create) to return the correct
amount of bytes pushed. Most callers of dev_hard_header() seem
to care only whether it succeeded, but af_packet.c uses it as
offset to the skb to copy from userspace only once. In practice
this fixes packet socket sendto()/sendmsg() to gre tunnels.
Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Timo Teräs <timo.teras@iki.fi> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 8 Aug 2013 21:12:10 +0000 (14:12 -0700)]
Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:
====================
This is a batch of fixes intended for the 3.11 queue...
Regarding the mac80211 (and related) bits, Johannes says:
"I have a fix from Chris for an infinite loop along with fixes from
myself to prevent it entering the loop to start with (continue using
disabled channels, many thanks to Chris for his debug/test help) and a
workaround for broken APs that advertise a bad HT primary channel in
their beacons. Additionally, a fix for another attrbuf race in mac80211
and a fix to clean up properly while P2P GO interfaces go down."
Along with that...
Solomon Peachy corrects a range check in cw1200 that would lead to
a BUG_ON when starting AP mode.
Stanislaw Gruszka provides an iwl4965 patch to power-up the device
earlier (avoiding microcode errors), and another iwl4965 fix that
resets the firmware after turning rfkill off (resolving a bug in the
Red Hat Bugzilla).
Please let me know if there are problems!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
ipv6: don't stop backtracking in fib6_lookup_1 if subtree does not match
In case a subtree did not match we currently stop backtracking and return
NULL (root table from fib_lookup). This could yield invalid routing
table lookups when using subtrees.
Instead continue to backtrack until a valid subtree or node is found
and return this match.
Also remove unneeded NULL check.
Reported-by: Teco Boot <teco@inf-net.nl> Cc: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Cc: David Lamparter <equinox@diac24.net> Cc: <boutier@pps.univ-paris-diderot.fr> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 6 Aug 2013 03:05:12 +0000 (20:05 -0700)]
tcp: cubic: fix bug in bictcp_acked()
While investigating a strange increase in retransmit rates
on hosts ~24 days after boot, Van found hystart was disabled
if ca->epoch_start was 0, as the following condition is true
when the tcp_time_stamp high order bit is set.
(s32)(tcp_time_stamp - ca->epoch_start) < HZ
Quoting Van :
At initialization & after every loss ca->epoch_start is set to zero so
I believe that the above line will turn off hystart as soon as the 2^31
bit is set in tcp_time_stamp & hystart will stay off for 24 days.
I think we've observed that cubic's restart is too aggressive without
hystart so this might account for the higher drop rate we observe.
Diagnosed-by: Van Jacobson <vanj@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
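The condition quoted in the commit above can be tried in isolation. The sketch below is an assumption-laden illustration: DEMO_HZ stands in for HZ=1000, demo_condition_true() is a hypothetical helper, and the cast from a large unsigned value to int32_t relies on the usual two's-complement behaviour of common compilers. With a tick every millisecond, bit 31 is first set after 2^31 ms, roughly 24.9 days.

```c
#include <stdint.h>

#define DEMO_HZ 1000	/* assumed tick rate, matching the HZ=1000 case */

/*
 * Evaluates the quoted test (s32)(tcp_time_stamp - ca->epoch_start) < HZ
 * for the given 32-bit timestamp and epoch start.
 */
static int demo_condition_true(uint32_t now, uint32_t epoch_start)
{
	return (int32_t)(now - epoch_start) < DEMO_HZ;
}
```

With epoch_start equal to 0, the helper returns 0 for a timestamp of 0x7fffffff but 1 for 0x80000000, because the difference is then interpreted as a large negative number.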
Eric Dumazet [Tue, 6 Aug 2013 00:10:15 +0000 (17:10 -0700)]
tcp: cubic: fix overflow error in bictcp_update()
Commit 3d6503dec95 ("tcp_cubic: fix clock dependency") added an
overflow error in bictcp_update() in the following code:
/* change the unit from HZ to bictcp_HZ */
t = ((tcp_time_stamp + msecs_to_jiffies(ca->delay_min>>3) -
ca->epoch_start) << BICTCP_HZ) / HZ;
Because msecs_to_jiffies() returns unsigned long, the compiler does
implicit type promotion.
We really want to constrain (tcp_time_stamp - ca->epoch_start)
to a signed 32bit value, or else 't' has unexpected high values.
This bug triggers an increase of retransmit rates ~24 days after
boot [1], as the high order bit of tcp_time_stamp flips.
[1] for hosts with HZ=1000
Big thanks to Van Jacobson for spotting this problem.
Diagnosed-by: Van Jacobson <vanj@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
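A small, standalone sketch of the promotion problem described in the commit above. The helper names are hypothetical, and uint64_t is used in place of unsigned long so that the effect does not depend on the platform; this is an illustration of the arithmetic, not the kernel code.

```c
#include <stdint.h>

/*
 * Buggy shape: the 64-bit delay term widens the whole expression, so
 * the subtraction no longer wraps at 32 bits.
 */
static uint64_t demo_elapsed_promoted(uint32_t now, uint32_t epoch_start,
				      uint64_t delay)
{
	return now + delay - epoch_start;
}

/*
 * Constrained shape: take the 32-bit difference first, treat it as a
 * signed 32-bit value, and only then add the wider delay term.
 */
static int64_t demo_elapsed_constrained(uint32_t now, uint32_t epoch_start,
					uint64_t delay)
{
	return (int32_t)(now - epoch_start) + (int64_t)delay;
}
```

For example, with now = 5 and epoch_start = 0xfffffff0 (the counter wrapped 21 ticks after the epoch), the constrained form yields 21 while the promoted form yields a value far above 2^32.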
Linus Lüssing [Mon, 5 Aug 2013 22:32:05 +0000 (00:32 +0200)]
bridge: don't try to update timers in case of broken MLD queries
Currently we are reading an uninitialized value for the max_delay
variable when snooping an MLD query message of invalid length and would
update our timers with that.
Fixing this by simply ignoring such broken MLD queries (just like we do
for IGMP already).
This is a regression introduced by:
"bridge: disable snooping if there is no querier" (5231d8c6ad99)
Reported-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel Borkmann [Mon, 5 Aug 2013 10:49:35 +0000 (12:49 +0200)]
net: esp{4,6}: fix potential MTU calculation overflows
Commit 6e2269b48 ("xfrm: take net hdr len into account for esp payload
size calculation") introduced a possible integer overflow in
esp{4,6}_get_mtu() handlers when x->props.mode equals
XFRM_MODE_TUNNEL. Thus, the following expression will overflow
where (net_adj - 2) would be evaluated as <foo> + (0 - 2) in an unsigned
context. Fix it by simply removing brackets as those operations here
do not need to have special precedence.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Benjamin Poirier <bpoirier@suse.de> Cc: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Benjamin Poirier <bpoirier@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
net_sched: make dev_trans_start return vlan's real dev trans_start
Vlan devices are LLTX and don't update their own trans_start, so if
dev_trans_start has to be called with a vlan device then 0 or a stale
value will be returned. Currently the bonding is the only such user, and
it's needed for proper arp monitoring when the slaves are vlans.
Fix this by extracting the vlan's real device trans_start.
Suggested-by: David Miller <davem@davemloft.net> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
vlan: make vlan_dev_real_dev work over stacked vlans
Sometimes we might have stacked vlans on top of each other, and we're
interested in the first non-vlan real device on the path, so transform
vlan_dev_real_dev to go over the stacked vlans and extract the first
non-vlan device.
Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
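The walk over stacked vlans described in the commit above amounts to following the chain of lower devices until a non-vlan device is reached. A minimal sketch, assuming a hypothetical struct demo_dev rather than the kernel's net_device and vlan private data:

```c
#include <stddef.h>

/* Hypothetical device model; not the kernel's struct net_device. */
struct demo_dev {
	int is_vlan;		/* nonzero if this device is a vlan */
	struct demo_dev *lower;	/* device the vlan sits on, NULL otherwise */
};

/* Return the first non-vlan device found below dev (or dev itself). */
static struct demo_dev *demo_real_dev(struct demo_dev *dev)
{
	while (dev->is_vlan)
		dev = dev->lower;

	return dev;
}
```

A single-level lookup would stop at the intermediate vlan; the loop keeps going until the device is no longer a vlan.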
Solomon Peachy [Fri, 2 Aug 2013 23:57:40 +0000 (19:57 -0400)]
cw1200: Fix spurious BUG_ON() trigger when starting AP mode.
There's an underlying race condition with the unjoin_work() call that is
sometimes triggered depending on scheduling order and the phase of the
moon. This doesn't fix the race condition, but it does remove the
ill-advised BUG_ON() call in an easily-recoverable situation.
Signed-off-by: Solomon Peachy <pizza@shaftnet.org> Signed-off-by: John W. Linville <linville@tuxdriver.com>
commit a3e17141dde26b498f4a33e8139dfe0fc2e874d2
macvlan: add FDB bridge ops and macvlan flags
added a flags field to macvlan, which can be
controlled from userspace.
The idea is to make the interface future-proof
so we can add flags and not new fields.
However, the flags value isn't validated; as a result,
userspace can't detect which flags are supported.
Cc: "David S. Miller" <davem@davemloft.net> Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The following is needed as well to fix a warning/error about shifting a 32 bit
value by 32 bits, which occurs when building on a 32 bit platform and is caused
by the conversion to using dma_addr_t.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
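For background on the warning mentioned above: in C, shifting a 32-bit value by 32 bits is undefined, which is why compilers complain when the address type is only 32 bits wide. One common way to extract the upper half safely is to shift by 16 twice, as the kernel's upper_32_bits() macro does. The sketch below uses hypothetical DEMO_ names and is not necessarily what this patch changed.

```c
#include <stdint.h>

/*
 * Works for both 32-bit and 64-bit address types: two 16-bit shifts
 * never exceed the width of a 32-bit operand.
 */
#define DEMO_UPPER_32_BITS(n) ((uint32_t)(((n) >> 16) >> 16))
#define DEMO_LOWER_32_BITS(n) ((uint32_t)(n))
```

With a 32-bit address the upper half simply evaluates to 0, and with a 64-bit address it yields the high word.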
Eliezer Tamir [Sun, 4 Aug 2013 09:55:48 +0000 (12:55 +0300)]
busy_poll: cleanup do-nothing placeholders
When renaming ll_poll to busy poll, I introduced a typo
in the name of the do-nothing placeholder for sk_busy_loop
and called it sk_busy_poll.
This broke compile when busy poll was not configured.
Cong Wang submitted a patch to fix that.
This patch removes the now redundant, misspelled placeholder.
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
This old driver never checked for DMA mapping errors,
causing splats with the new DMA mapping checks:
WARNING: at lib/dma-debug.c:937 check_unmap+0x47b/0x930()
skge 0000:01:09.0: DMA-API: device driver failed to check map
Add checks and unwind code.
Reported-by: poma <pomidorabelisima@gmail.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
1) Don't ignore user initiated wireless regulatory settings on cards
with custom regulatory domains, from Arik Nemtsov.
2) Fix length check of bluetooth information responses, from Jaganath
Kanakkassery.
3) Fix misuse of PTR_ERR in btusb, from Adam Lee.
4) Handle rfkill properly while iwlwifi devices are offline, from
Emmanuel Grumbach.
5) Fix r815x devices DMA'ing to stack buffers, from Hayes Wang.
6) Kernel info leak in ATM packet scheduler, from Dan Carpenter.
7) 8139cp doesn't check for DMA mapping errors, from Neil Horman.
8) Fix bridge multicast code to not snoop when no querier exists,
otherwise multicast traffic is lost. From Linus Lüssing.
9) Avoid soft lockups in fib6_run_gc(), from Michal Kubecek.
10) Fix races in automatic address assignment on ipv6, which can result
in incorrect lifetime assignments. From Jiri Benc.
11) Cure build bustage when CONFIG_NET_LL_RX_POLL is not set and rename
it CONFIG_NET_RX_BUSY_POLL to eliminate the last reference to the
original naming of this feature. From Cong Wang.
12) Fix crash in TIPC when server socket creation fails, from Ying Xue.
13) macvlan_changelink() silently succeeds when it shouldn't, from
Michael S Tsirkin.
14) HTB packet scheduler can crash due to sign extension, fix from
Stephen Hemminger.
15) With the cable unplugged, r8169 prints out a message every 10
seconds, make it netif_dbg() instead of netif_warn(). From Peter
Wu.
16) Fix memory leak in rtm_to_ifaddr(), from Daniel Borkmann.
17) sis900 gets spurious TX queue timeouts due to mismanagement of link
carrier state, from Denis Kirjanov.
18) Validate somaxconn sysctl to make sure it fits inside of a u16.
From Roman Gushchin.
19) Fix MAC address filtering on qlcnic, from Shahed Shaikh.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (68 commits)
qlcnic: Fix for flash update failure on 83xx adapter
qlcnic: Fix link speed and duplex display for 83xx adapter
qlcnic: Fix link speed display for 82xx adapter
qlcnic: Fix external loopback test.
qlcnic: Removed adapter series name from warning messages.
qlcnic: Free up memory in error path.
qlcnic: Fix ingress MAC learning
qlcnic: Fix MAC address filter issue on 82xx adapter
net: ethernet: davinci_emac: drop IRQF_DISABLED
netlabel: use domain based selectors when address based selectors are not available
net: check net.core.somaxconn sysctl values
sis900: Fix the tx queue timeout issue
net: rtm_to_ifaddr: free ifa if ifa_cacheinfo processing fails
r8169: remove "PHY reset until link up" log spam
net: ethernet: cpsw: drop IRQF_DISABLED
htb: fix sign extension bug
macvlan: handle set_promiscuity failures
macvlan: better mode validation
tipc: fix oops when creating server socket fails
net: rename CONFIG_NET_LL_RX_POLL to CONFIG_NET_RX_BUSY_POLL
...
Rajesh Borundia [Sat, 3 Aug 2013 03:16:00 +0000 (23:16 -0400)]
qlcnic: Fix link speed and duplex display for 83xx adapter
o Set link speed and duplex to unknown when link is not up.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Rajesh Borundia [Sat, 3 Aug 2013 03:15:59 +0000 (23:15 -0400)]
qlcnic: Fix link speed display for 82xx adapter
o Do not obtain link speed from register when adapter
link is down.
Signed-off-by: Rajesh Borundia <rajesh.borundia@qlogic.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Pratik Pujar [Sat, 3 Aug 2013 03:15:57 +0000 (23:15 -0400)]
qlcnic: Removed adapter series name from warning messages.
Signed-off-by: Pratik Pujar <pratik.pujar@qlogic.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Himanshu Madhani <himanshu.madhani@qlogic.com> Signed-off-by: Shahed Shaikh <shahed.shaikh@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sat, 3 Aug 2013 18:15:03 +0000 (11:15 -0700)]
Merge branch 'for-3.11' of git://linux-nfs.org/~bfields/linux
Pull nfsd bugfixes from Bruce Fields:
"Most of this is due to a screwup on my part -- some gss-proxy crashes
got fixed before the merge window but somehow never made it out of a
temporary git repo on my laptop...."
* 'for-3.11' of git://linux-nfs.org/~bfields/linux:
svcrpc: set cr_gss_mech from gss-proxy as well as legacy upcall
svcrpc: fix kfree oops in gss-proxy code
svcrpc: fix gss-proxy xdr decoding oops
svcrpc: fix gss_rpc_upcall create error
NFSD/sunrpc: avoid deadlock on TCP connection due to memory pressure.
Linus Torvalds [Sat, 3 Aug 2013 18:12:09 +0000 (11:12 -0700)]
Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM fixes from Russell King:
"This fixes a couple of problems with commit 24b27675b3bc ("ARM: move
signal handlers into a vdso-like page"), one of which was originally
discovered via my testing, but the fix for it was never
actually committed.
The other shows up on noMMU builds, and such platforms are extremely
rare and as such are not part of my nightly testing"
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: fix nommu builds with 24b27675b (ARM: move signal handlers into a vdso-like page)
ARM: fix a cockup in 24b27675b (ARM: move signal handlers into a vdso-like page)
Russell King [Sat, 3 Aug 2013 09:39:51 +0000 (10:39 +0100)]
ARM: fix nommu builds with 24b27675b (ARM: move signal handlers into a vdso-like page)
Olof reports that noMMU builds error out with:
arch/arm/kernel/signal.c: In function 'setup_return':
arch/arm/kernel/signal.c:413:25: error: 'mm_context_t' has no member named 'sigpage'
This shows one of the evilnesses of IS_ENABLED(). Get rid of it here
and replace it with #ifdef's - and as no noMMU platform can make use
of sigpage, depend on CONFIG_MMU not CONFIG_ARM_MPU.
Reported-by: Olof Johansson <olof@lixom.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Russell King [Sat, 3 Aug 2013 09:30:05 +0000 (10:30 +0100)]
ARM: fix a cockup in 24b27675b (ARM: move signal handlers into a vdso-like page)
Unfortunately, I never committed the fix to a nasty oops which can
occur as a result of that commit:
------------[ cut here ]------------
kernel BUG at /home/olof/work/batch/include/linux/mm.h:414!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0 PID: 490 Comm: killall5 Not tainted 3.11.0-rc3-00288-gabe0308 #53
task: e90acac0 ti: e9be8000 task.ti: e9be8000
PC is at special_mapping_fault+0xa4/0xc4
LR is at __do_fault+0x68/0x48c
This doesn't show up unless you do quite a bit of testing; a simple
boot test does not do this, so all my nightly tests were passing fine.
The reason for this is that install_special_mapping() expects the
page array to stick around, and as this was only inserting one page
which was stored on the kernel stack, that's why this was blowing up.
Reported-by: Olof Johansson <olof@lixom.net> Tested-by: Olof Johansson <olof@lixom.net> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Paul Moore [Fri, 2 Aug 2013 18:45:08 +0000 (14:45 -0400)]
netlabel: use domain based selectors when address based selectors are not available
NetLabel has the ability to selectively assign network security labels
to outbound traffic based on either the LSM's "domain" (different for
each LSM), the network destination, or a combination of both. Depending
on the type of traffic, local or forwarded, and the type of traffic
selector, domain or address based, different hooks are used to label the
traffic; the goal being minimal overhead.
Unfortunately, there is a bug such that a system using NetLabel domain
based traffic selectors does not correctly label outbound local traffic
that is not assigned to a socket. The issue is that in these cases
the associated NetLabel hook only looks at the address based selectors
and not the domain based selectors. This patch corrects this by
checking both the domain and address based selectors so that the correct
labeling is applied, regardless of the configuration type.
In order to accomplish this fix, this patch also simplifies some of the
NetLabel domainhash structures to use a more common outbound traffic
mapping type: struct netlbl_dommap_def. This simplifies some of the code
in this patch and paves the way for further simplifications in the
future.
Signed-off-by: Paul Moore <pmoore@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Roman Gushchin [Fri, 2 Aug 2013 14:36:40 +0000 (18:36 +0400)]
net: check net.core.somaxconn sysctl values
It's possible to assign an invalid value to the net.core.somaxconn
sysctl variable, because there are no checks at all.
The sk_max_ack_backlog field of the sock structure is defined as
unsigned short. Therefore, the backlog argument in inet_listen()
shouldn't exceed USHRT_MAX. The backlog argument in the listen() syscall
is truncated to the somaxconn value. So, the somaxconn value shouldn't
exceed 65535 (USHRT_MAX).
Also, negative values of somaxconn are meaningless.
Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> Reported-by: Changli Gao <xiaosuo@gmail.com> Suggested-by: Eric Dumazet <edumazet@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
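The truncation described in the commit above is easy to reproduce outside the kernel. The sketch below assumes a 16-bit unsigned short (so USHRT_MAX is 65535); demo_store_backlog() and demo_somaxconn_valid() are hypothetical helpers that mimic the clamping to somaxconn and the range check, not the kernel's implementation.

```c
#include <limits.h>

/*
 * Clamp the listen() backlog to somaxconn, then store it in an
 * unsigned short, as a stand-in for the sk_max_ack_backlog field.
 */
static unsigned short demo_store_backlog(int backlog, int somaxconn)
{
	if (backlog > somaxconn)
		backlog = somaxconn;

	return (unsigned short)backlog;
}

/* Range check of the kind the commit argues for: 0..USHRT_MAX. */
static int demo_somaxconn_valid(int value)
{
	return value >= 0 && value <= USHRT_MAX;
}
```

With somaxconn set to 65536, a large backlog request ends up stored as 0, which is why values above USHRT_MAX should be rejected at the sysctl level.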
timer routine checks the link status and if it's up calls
netif_carrier_on() allowing upper layer to start the tx queue
even if the auto-negotiation process is not finished.
Also remove ugly auto-negotiation check from the sis900_start_xmit()
CC: Duan Fugang <B38611@freescale.com> CC: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 2 Aug 2013 21:58:30 +0000 (14:58 -0700)]
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull infiniband/rdma fixes from Roland Dreier:
- Fixes for the newly merged mlx5 hardware driver
- Stack info leak fixes from Dan Carpenter
- Fixes for pkey table handling with SR-IOV
- A few other small things
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
IPoIB: Fix pkey change flow for virtualization environments
IPoIB: Make sure child devices use valid/proper pkeys
IB/core: Create QP1 using the pkey index which contains the default pkey
mlx5_core: Variable may be used uninitialized
mlx5_core: Implement new initialization sequence
mlx5_core: Fix use after free in mlx5_cmd_comp_handler()
IB/mlx5: Fix stack info leak in mlx5_ib_alloc_ucontext()
IB/mlx5: Fix error return code in init_one()
IB/mlx4: Use default pkey when creating tunnel QPs
RDMA/cma: Only call cma_save_ib_info() for CM REQs
RDMA/cma: Fix accessing invalid private data for UD
RDMA/cma: Fix gcc warning
Revert "RDMA/nes: Fix compilation error when nes_debug is enabled"
IB/qib: Add err_decode() call for ring dump
RDMA/cxgb3: Fix stack info leak in iwch_create_cq()
RDMA/nes: Fix info leaks in nes_create_qp() and nes_create_cq()
RDMA/ocrdma: Fix several stack info leaks
RDMA/cxgb4: Fix stack info leak in c4iw_create_qp()
RDMA/ocrdma: Remove unused include
Linus Torvalds [Fri, 2 Aug 2013 21:57:24 +0000 (14:57 -0700)]
Merge tag 'gpio-for-v3.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"Yet another GPIO pull request, fixing the fix from the last one. It
turns out that fixing the boot path for device tree boots on OMAP
breaks our antique systems (such as OMAP1) and we need to find a
better way. So we're reverting that "fix" for the moment and thinking
about something better.
Also fixing a build issue on the MSM driver"
* tag 'gpio-for-v3.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio_msm: Fix build error due to missing err.h
Revert "gpio/omap: don't create an IRQ mapping for every GPIO on DT"
Revert "gpio/omap: auto request GPIO as input if used as IRQ via DT"
Revert "gpio/omap: fix build error when OF_GPIO is not defined."
Daniel Borkmann [Fri, 2 Aug 2013 09:32:43 +0000 (11:32 +0200)]
net: rtm_to_ifaddr: free ifa if ifa_cacheinfo processing fails
Commit f102c101c ("ipv4: introduce address lifetime") leaves the ifa
resource that was allocated via inet_alloc_ifa() unfreed when returning
the function with -EINVAL. Thus, free it first via inet_free_ifa().
Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
Lekensteyn [Fri, 2 Aug 2013 08:36:55 +0000 (10:36 +0200)]
r8169: remove "PHY reset until link up" log spam
This message was added in commit a7154cb8 (June 2004, [PATCH] r8169:
link handling and phy reset rework) and is printed every ten seconds
when no cable is connected and runtime power management is disabled.
(Before that commit, "Reset RTL8169s PHY" would be printed instead.)
Signed-off-by: Peter Wu <lekensteyn@gmail.com> Acked-by: Francois Romieu <romieu@fr.zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net>
When userspace passes a large priority value
the assignment of the unsigned value hopt->prio
to signed int cl->prio causes cl->prio to become negative and the
comparison with TC_HTB_NUMPRIO is always false.
The result is that HTB crashes by referencing outside
the array when processing packets. With this patch the large value
wraps around like other values outside the normal range.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
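The signedness problem described in the HTB message above can be reproduced in userspace. The sketch below uses Python's `ctypes` to mimic C integer conversion. The value 8 for `TC_HTB_NUMPRIO` and the choice to clamp out-of-range values to the highest index are assumptions made for illustration, and the function names are hypothetical.

```python
import ctypes

TC_HTB_NUMPRIO = 8  # assumed number of priority levels, for illustration only


def prio_signed(user_prio: int) -> int:
    """Buggy variant: store the unsigned value in a signed int, then range-check."""
    prio = ctypes.c_int(user_prio).value  # a large unsigned value turns negative
    if prio >= TC_HTB_NUMPRIO:            # false for negative values
        prio = TC_HTB_NUMPRIO - 1
    return prio


def prio_unsigned(user_prio: int) -> int:
    """Fixed variant: keep the value unsigned so the range check catches it."""
    prio = ctypes.c_uint(user_prio).value
    if prio >= TC_HTB_NUMPRIO:
        prio = TC_HTB_NUMPRIO - 1
    return prio
```

With an input of 0xFFFFFFFF the signed variant returns -1, which would index outside a per-priority array. The unsigned variant maps the same input into the valid range.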
Linus Torvalds [Fri, 2 Aug 2013 21:39:49 +0000 (14:39 -0700)]
Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc fixes from Ben Herrenschmidt:
"Here is not quite a handful of powerpc fixes for rc3.
The windfarm fix is a regression fix (though not a new one), the PMU
interrupt rename is not a fix per-se but has been submitted a long
time ago and I kept forgetting to put it in (it puts us back in sync
with x86), the other perf bit is just about putting an API/ABI bit
definition in the right place for userspace to consume, and finally,
we have a fix for the VPHN (Virtual Partition Home Node) feature
(notification that the hypervisor is moving nodes around) which could
cause lockups so we may as well fix it now"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/windfarm: Fix noisy slots-fan on Xserve (rm31)
powerpc: VPHN topology change updates all siblings
powerpc/perf: Export PERF_EVENT_CONFIG_EBB_SHIFT to userspace
powerpc: Rename PMU interrupts from CNT to PMI
Linus Torvalds [Fri, 2 Aug 2013 21:37:45 +0000 (14:37 -0700)]
Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm
Pull ARM fixes from Russell King:
"I've thought long and hard about what to say for this pull request,
and I really can't work out anything sane to say to summarise much of
these commits. The problem is, most of these are, yet again, lots
of small bits scattered around the place without any real overall
theme to them"
Most notable is probably the kuser page helper improvements.
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm: (22 commits)
ARM: Add .text annotations where required after __CPUINIT removal
ARM: 7803/1: Fix deadlock scenario with smp_send_stop()
ARM: make vectors page inaccessible from userspace
ARM: move signal handlers into a vdso-like page
ARM: allow kuser helpers to be removed from the vector page
ARM: update FIQ support for relocation of vectors
ARM: use linker magic for vectors and vector stubs
ARM: move vector stubs
ARM: poison memory between kuser helpers
ARM: poison the vectors page
ARM: 7801/1: v6: prevent gcc 4.5 from reordering extended CP15 reads above is_smp() test
ARM: 7800/1: ARMv7-M: Fix name of NVIC handler function
ARM: Fix sorting of machine- initializers
ARM: 7791/1: a.out: remove partial a.out support
ARM: 7790/1: Fix deferred mm switch on VIVT processors
ARM: 7789/1: Do not run dummy_flush_tlb_a15_erratum() on non-Cortex-A15
ARM: 7787/1: virt: ensure visibility of __boot_cpu_mode
ARM: 7788/1: elf: fix lpae hwcap feature reporting in proc/cpuinfo
ARM: 7786/1: hyp: fix macro parameterisation
ARM: 7785/1: mm: restrict early_alloc to section-aligned memory
...
Linus Torvalds [Fri, 2 Aug 2013 21:36:32 +0000 (14:36 -0700)]
Merge branch 'parisc-3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc updates from Helge Deller:
"The majority of lines changed are due to the addition of a defconfig for
the C8000 machine. Even the fix in parisc/kernel/cache.c file is
actually only a 10-line fix, but the change became bigger (and much
nicer) to avoid errors of the checkpatch script.
Here is the short-changelog:
This round of parisc updates includes mostly fixes for the C8000
workstation. We have a new defconfig file for this machine, as well
as fixes for its serial port, the AGP driver and the cache routines
to cope with the vmas of the FireGL card in a C8000. The sys32.h
header file was not used and as such it's now gone"
* 'parisc-3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix interrupt routing for C8000 serial ports
parisc: Remove arch/parisc/kernel/sys32.h header
parisc: add defconfig for c8000 machine
parisc: agp/parisc-agp: allow binding of user memory to the AGP GART
parisc: Fix cache routines to ignore vma's with an invalid pfn
Linus Torvalds [Fri, 2 Aug 2013 20:12:52 +0000 (13:12 -0700)]
Merge tag 'pci-v3.11-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"Yinghai fixed a couple regressions: one resource assignment problem
introduced in v3.10 that showed up with SR-IOV on powerpc, and another
SR-IOV hot-remove issue related to refcounting changes we merged for
v3.11.
Yinghai is still working on another SR-IOV-related fix or two, which
will be simpler if pciehp is non-modular, so I included the Kconfig
changes now to get them in earlier.
Finally, a minor fix for the ARM Marvell EBU host bridge driver that
was merged for v3.11
Hotplug:
PCI: pciehp: Fix null pointer deref when hot-removing SR-IOV device
PCI: hotplug: Convert to be builtin only, not modular
PCI: pciehp: Convert pciehp to be builtin only, not modular
Resource allocation:
PCI: Retry allocation of only the resource type that failed
ARM:
PCI: mvebu: Disable prefetchable memory support in PCI-to-PCI bridge"
* tag 'pci-v3.11-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: mvebu: Disable prefetchable memory support in PCI-to-PCI bridge
PCI: Retry allocation of only the resource type that failed
PCI: pciehp: Convert pciehp to be builtin only, not modular
PCI: hotplug: Convert to be builtin only, not modular
PCI: pciehp: Fix null pointer deref when hot-removing SR-IOV device
Linus Torvalds [Fri, 2 Aug 2013 19:21:32 +0000 (12:21 -0700)]
Merge tag 'pm+acpi-3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI and power management fixes from Rafael Wysocki:
- Revert two cpuidle commits added during the 3.8 development cycle
that turn out to have introduced a significant performance regression
as requested by Jeremy Eder.
- The recent patches that made the freezer less heavy-weight introduced
a regression causing user-space-driven hibernation using the ioctl()
interface to block indefinitely when the hibernate process executes
try_to_freeze(). Fix from Colin Cross addresses this by adding a
process flag to mark the hibernate/suspend process to inform the
freezer that that process should be ignored.
- One of the recent cpufreq reverts uncovered a problem in the core
causing the cpufreq driver module refcount to become negative after a
system suspend-resume cycle. Fix from Rafael J Wysocki.
- The evaluation of the ACPI battery _BIX method has never worked
correctly, because the commit that added support for it forgot to
take the "Revision" field in the return package into account. As a
result, the reading of battery info doesn't work at all on some
systems, which is addressed by a fix from Lan Tianyu.
* tag 'pm+acpi-3.11-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
freezer: set PF_SUSPEND_TASK flag on tasks that call freeze_processes
ACPI / battery: Fix parsing _BIX return value
cpufreq: Fix cpufreq driver module refcount balance after suspend/resume
Revert "cpuidle: Quickly notice prediction failure for repeat mode"
Revert "cpuidle: Quickly notice prediction failure in general case"
Using the rfkill switch can make the firmware unstable, which causes various
Microcode errors and kernel warnings. Resetting the firmware just after
rfkill off (radio on) helps with that.
If the device was put to sleep and the system was restarted or the module
reloaded, we have to wake the device up before sending other commands.
Otherwise it will fail to start with a Microcode error.
Cc: stable@vger.kernel.org Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
macvlan passthrough mode is special: it's not possible to switch to or
from it through a netlink command.
But if you try, the command will succeed, which is
confusing.
Validate input and return error to user.
Cc: Sridhar Samudrala <sri@us.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
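The validation described in the macvlan message above amounts to rejecting any mode change that enters or leaves passthrough mode. A minimal sketch follows, assuming modes are represented as plain strings. `change_mode` is a hypothetical helper, not the driver's netlink handler.

```python
PASSTHRU = "passthru"


def change_mode(current: str, requested: str) -> str:
    """Return the new mode, refusing any switch to or from passthrough."""
    if requested != current and PASSTHRU in (current, requested):
        # Fail loudly rather than report success for a request that cannot be honoured.
        raise ValueError(f"cannot switch from {current} to {requested}")
    return requested
```

Switching between two non-passthrough modes still succeeds, and requesting the mode already in use is a no-op.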
tipc_core_start()
ret = tipc_subscr_start()
ret = tipc_server_start(){
server->enabled = 1;
ret = tipc_open_listening_sock()
}
I.e., the server->enabled flag is unconditionally set to 1, whatever
the return value of tipc_open_listening_sock().
This causes a crash when tipc_core_start() tries to clean up
resources after a failed initialization:
if (ret == failed)
tipc_subscr_stop()
tipc_server_stop(){
if (server->enabled)
tipc_close_conn(){
NULL reference of con->sock->sk
OOPS!
}
}
To avoid this, tipc_server_start() should only set server->enabled
to 1 in case of a successful socket creation. In case of failure, it
should release all allocated resources before returning.
Problem introduced in commit d111e95686f31ca4e589f920694248770bf395e8
("tipc: introduce new TIPC server infrastructure") in v3.11-rc1.
Note that it won't be seen often; it takes a module load under memory
constrained conditions in order to trigger the failure condition.
Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
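The ordering fix described in the TIPC message above (set the enabled flag only after the listening socket exists, and leave nothing half-initialised on failure) can be modelled in a few lines. This is a userspace sketch. The `Server` class and its injected `open_listening_sock` callable are hypothetical stand-ins for the kernel structures.

```python
class Server:
    """Toy model of a server whose 'enabled' flag tracks a live listening socket."""

    def __init__(self, open_listening_sock):
        self._open_listening_sock = open_listening_sock  # callable; may raise OSError
        self.sock = None
        self.enabled = False

    def start(self) -> int:
        try:
            self.sock = self._open_listening_sock()
        except OSError:
            self.sock = None   # nothing is left half-initialised
            return -1          # report failure; 'enabled' stays False
        self.enabled = True    # set only after the socket exists
        return 0

    def stop(self) -> None:
        if not self.enabled:   # a failed start leaves nothing to tear down
            return
        self.sock.close()
        self.sock = None
        self.enabled = False
```

Because `enabled` is set last, a cleanup path that calls `stop()` after a failed `start()` returns early instead of touching a socket that was never created.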
Cong Wang [Thu, 1 Aug 2013 03:10:25 +0000 (11:10 +0800)]
net: rename CONFIG_NET_LL_RX_POLL to CONFIG_NET_RX_BUSY_POLL
Eliezer renamed several *ll_poll to *busy_poll, but forgot
CONFIG_NET_LL_RX_POLL; to avoid confusion, rename it too.
Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Cong Wang [Thu, 1 Aug 2013 03:10:24 +0000 (11:10 +0800)]
net: fix a compile error when CONFIG_NET_LL_RX_POLL is not set
When CONFIG_NET_LL_RX_POLL is not set, I got:
net/socket.c: In function ‘sock_poll’:
net/socket.c:1165:4: error: implicit declaration of function ‘sk_busy_loop’ [-Werror=implicit-function-declaration]
Fix this by adding a nop when !CONFIG_NET_LL_RX_POLL.
Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/mlx4_core: VFs must ignore the enable_64b_cqe_eqe module param
Slaves get the 64B CQE/EQE state from QUERY_HCA, not from the module parameter.
If the parameter is set to zero, the slave outputs an incorrect/irrelevant
warning message that 64B CQEs/EQEs are supported but not enabled (even if the
hypervisor has enabled 64B CQEs/EQEs).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz [Thu, 1 Aug 2013 16:55:00 +0000 (19:55 +0300)]
net/mlx4_core: Don't give VFs MAC addresses which are derived from the PF MAC
If the user has not assigned a MAC address to a VM, then don't give it a MAC which
is based on the PF one. The current derivation scheme is wrong and leads to VM
MAC collisions when the number of cards/hypervisors becomes big enough.
Instead, just give it zeros and let them figure out what to do with that.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Benc [Thu, 1 Aug 2013 08:41:28 +0000 (10:41 +0200)]
ipv6: prevent race between address creation and removal
There's a race in IPv6 automatic address assignment. The address is created
with zero lifetime when it's added to various address lists. Before it gets
assigned the correct lifetime, there's a window where a new address may be
configured. This causes the semi-initiated address to be deleted in
addrconf_verify.
This was discovered as a reference leak caused by concurrent run of
__ipv6_ifa_notify for both RTM_NEWADDR and RTM_DELADDR with the same
address.
Fix this by setting the lifetime before the address is added to
inet6_addr_lst.
A few notes:
1. In addrconf_prefix_rcv, by setting update_lft to zero, the
if (update_lft) { ... } condition is no longer executed for newly
created addresses. This is okay, as the ifp fields are set in
ipv6_add_addr now and ipv6_ifa_notify is called (and has been called)
through addrconf_dad_start.
2. The removal of the whole block under ifp->lock in inet6_addr_add is okay,
too, as tstamp is initialized to jiffies in ipv6_add_addr.
Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Kubeček [Thu, 1 Aug 2013 08:04:24 +0000 (10:04 +0200)]
ipv6: update ip6_rt_last_gc every time GC is run
As pointed out by Eric Dumazet, net->ipv6.ip6_rt_last_gc should
hold the last time the garbage collector was run, so we should
update it whenever fib6_run_gc() calls fib6_clean_all(), not only
if we got there from ip6_dst_gc().
Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
Michal Kubeček [Thu, 1 Aug 2013 08:04:14 +0000 (10:04 +0200)]
ipv6: prevent fib6_run_gc() contention
On a high-traffic router with many processors and many IPv6 dst
entries, a soft lockup in fib6_run_gc() can occur when the number of
entries reaches gc_thresh.
This happens because fib6_run_gc() uses fib6_gc_lock to allow
only one thread to run the garbage collector but ip6_dst_gc()
doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
returns. On a system with many entries, this can take some time
so that in the meantime, other threads pass the tests in
ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
the lock. They then have to run the garbage collector one after
another, which blocks them for quite a long time.
Resolve this by replacing special value ~0UL of expire parameter
to fib6_run_gc() by explicit "force" parameter to choose between
spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
force=false if gc_thresh is reached but not max_size.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
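The "force" parameter described in the fib6_run_gc() message above selects between waiting for the lock and giving up when another thread already holds it. The sketch below models that choice with Python's `threading.Lock`. Here `run_gc` and `collect` are hypothetical names, and the blocking and non-blocking acquires stand in for spin_lock_bh() and spin_trylock_bh().

```python
import threading

gc_lock = threading.Lock()


def run_gc(collect, force: bool) -> bool:
    """Run collect() under gc_lock; return True if this caller actually ran it."""
    if force:
        gc_lock.acquire()                      # wait for the lock
    elif not gc_lock.acquire(blocking=False):  # try once, do not wait
        return False                           # another caller is already collecting
    try:
        collect()
        return True
    finally:
        gc_lock.release()
```

With `force=False`, callers that arrive while a collection is in progress return immediately instead of queueing up to repeat the same work one after another.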
PCI: mvebu: Disable prefetchable memory support in PCI-to-PCI bridge
The Marvell PCIe driver uses an emulated PCI-to-PCI bridge to be able
to dynamically set up MBus address decoding windows for PCI I/O and
memory regions depending on the PCI devices enumerated by Linux.
However, this emulated PCI-to-PCI bridge logic makes the Linux PCI
core believe that prefetchable memory regions are supported (because
the registers are read/write), while in fact no address decoding window
is ever created for such regions. Since the Marvell MBus address
decoding windows do not distinguish memory regions and prefetchable
memory regions, this patch takes a simple approach: change the
PCI-to-PCI bridge emulation to let the Linux PCI core know that we
don't support prefetchable memory regions.
To achieve this, we simply make the prefetchable memory base a
read-only register that always returns 0. Reading/writing all the
other prefetchable memory related registers has no effect.
This problem was originally reported by Finn Hoffmann
<finn@uni-bremen.de>, who couldn't get a RTL8111/8168B PCI NIC working
on the NSA310 Kirkwood platform after updating to 3.11-rc. The problem
was that the PCI-to-PCI bridge emulation was making the Linux PCI core
believe that we support prefetchable memory, so the Linux PCI core was
only filling the prefetchable memory base and limit registers, which
does not lead to a MBus window being created. The below patch has been
confirmed by Finn Hoffmann to fix his problem on Kirkwood, and has
otherwise been successfully tested on the Armada XP GP platform with a
e1000e PCIe NIC and a Marvell SATA PCIe card.
Reported-by: Finn Hoffmann <finn@uni-bremen.de> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
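The register behaviour described in the mvebu message above can be modelled as a small emulated configuration space. This is a sketch, not the driver code. The offsets are the standard PCI-to-PCI bridge header offsets, and `EmulatedBridge` is a hypothetical class. As a simplification, every prefetchable register in this model reads back as 0 and ignores writes.

```python
PCI_MEMORY_BASE = 0x20          # non-prefetchable memory base (stays writable)
PCI_PREF_MEMORY_BASE = 0x24     # prefetchable memory base
PCI_PREF_MEMORY_LIMIT = 0x26    # prefetchable memory limit
PCI_PREF_BASE_UPPER32 = 0x28    # upper 32 bits of the prefetchable base
PCI_PREF_LIMIT_UPPER32 = 0x2C   # upper 32 bits of the prefetchable limit

PREFETCH_REGS = {
    PCI_PREF_MEMORY_BASE,
    PCI_PREF_MEMORY_LIMIT,
    PCI_PREF_BASE_UPPER32,
    PCI_PREF_LIMIT_UPPER32,
}


class EmulatedBridge:
    """Toy config space that advertises no prefetchable memory window."""

    def __init__(self):
        self._regs = {}

    def write(self, offset: int, value: int) -> None:
        if offset in PREFETCH_REGS:
            return                    # writes are silently dropped
        self._regs[offset] = value

    def read(self, offset: int) -> int:
        if offset in PREFETCH_REGS:
            return 0                  # always reads back as zero
        return self._regs.get(offset, 0)
```

Since a value written to the prefetchable base never reads back, software that probes the register by writing and re-reading it sees no prefetchable window, while the ordinary memory base keeps working.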
David S. Miller [Thu, 1 Aug 2013 19:57:52 +0000 (12:57 -0700)]
Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless
John W. Linville says:
====================
This pull request is intended for the 3.11 stream. It is a bit
larger than usual, as it includes pulls from most of my feeder trees
as well...
For the Bluetooth bits, Gustavo says:
"A few fixes and devices ID additions for 3.11:
* There are 4 new ath3k device ids
* Fixed stack memory usage in ath3k.
* Fixed the init process of BlueFRITZ! devices, they were failing to init
due to an unsupported command we sent.
* Fixed wrong use of PTR_ERR in btusb code that was preventing Intel devices
from working properly.
* Fixed race condition between hci_register_dev() and hci_dev_open() that
could cause a NULL pointer dereference.
* Fixed race condition that could call hci_req_cmd_complete() and make some
devices fail, as shown in the log added to the commit message."
Regarding the NFC bits, Samuel says:
"We have:
1) A build failure fix for the NCI SPI transport layer due to a
missing CRC_CCITT Kconfig dependency.
2) A netlink command rename: CMD_FW_UPLOAD was merged during the 3.11
merge window but the typical terminology for loading a firmware to a
target is firmware download rather than upload. In order to avoid any
confusion in a file exported to userspace, we rename this command to
CMD_FW_DOWNLOAD."
Samuel's item #2 isn't strictly a fix, but it seems safe and should
avoid confusion in the future.
As for the mac80211 bits, Johannes says:
"I only have three fixes this time, a fix for a suspend regression, a
patch correcting the initiator in regulatory code and one fix for mesh
station powersave."
With respect to the iwlwifi bits, Johannes says:
"We have a scan fix for passive channels, a new PCI device ID for an old
device, a NIC reset fix, an RF-Kill fix, a fix for powersave when GO
interfaces are present as well as an aggregation session fix (for a
corner case) and a workaround for a firmware design issue - it only
supports a single GTK in D3."
Bringing up the rear with the Atheros trees, Kalle says:
"Geert Uytterhoeven fixed an ath10k build problem when NO_DMA=y. I added
a missing MAINTAINERS entry for ath10k and updated ath6kl git tree
location."
Along with the above...
Arend van Spriel fixes a brcmfmac WARNING when unplugging the device.
Avinash Patil provides a couple of minor mwifiex fixes relating to P2P mode.
Luciano Coelho updates the MAINTAINERS entry for the wilink drivers.
Stanislaw Gruszka brings an rt2x00 fix for a queue start/stop problem.
Stone Piao fixes another mwifiex problem, a command timeout related to P2P mode.
Tomasz Moń corrects an endian problem in mwifiex.
I'll remind my feeder maintainers to slow down the patch flow.
Beyond that, please let me know if there are problems!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>