Johannes Berg [Sun, 1 Nov 2009 18:25:40 +0000 (19:25 +0100)]
mac80211: check interface is down before type change
For some strange reason the netif_running() check
ended up after the actual type change instead of
before, potentially causing all kinds of problems
if the interface is up while changing the type;
one of the problems manifests itself as a warning:
WARNING: at net/mac80211/iface.c:651 ieee80211_teardown_sdata+0xda/0x1a0 [mac80211]()
Hardware name: Aspire one
Pid: 2596, comm: wpa_supplicant Tainted: G W 2.6.31-10-generic #32-Ubuntu
Call Trace:
[] warn_slowpath_common+0x6d/0xa0
[] warn_slowpath_null+0x15/0x20
[] ieee80211_teardown_sdata+0xda/0x1a0 [mac80211]
[] ieee80211_if_change_type+0x4a/0xc0 [mac80211]
[] ieee80211_change_iface+0x61/0xa0 [mac80211]
[] cfg80211_wext_siwmode+0xc7/0x120 [cfg80211]
[] ioctl_standard_call+0x58/0xf0
Cc: Arjan van de Ven <arjan@infradead.org> Cc: stable@kernel.org Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
introduced a potential NULL pointer dereference that
some people have been hitting for some reason -- the
params.bssid pointer is not guaranteed to be non-NULL
for what seems to be a race between various ways of
reaching the same thing.
While I'm trying to analyse the problem more let's
first fix the crash. I think the real fix may be to
avoid doing _anything_ if it ended up being NULL, but
right now I'm not sure yet.
I think
http://bugzilla.kernel.org/show_bug.cgi?id=14342
might also be this issue.
Reported-by: Parag Warudkar <parag.lkml@gmail.com> Tested-by: Parag Warudkar <parag.lkml@gmail.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
David Woodhouse [Fri, 30 Oct 2009 17:45:14 +0000 (17:45 +0000)]
libertas if_usb: Fix crash on 64-bit machines
On a 64-bit kernel, skb->tail is an offset, not a pointer. The libertas
usb driver passes it to usb_fill_bulk_urb() anyway, causing interesting
crashes. Fix that by using skb->data instead.
This highlights a problem with usb_fill_bulk_urb(). It doesn't notice
when dma_map_single() fails and return the error to its caller as it
should. In fact it _can't_ currently return the error, since it returns
void.
So this problem was showing up only at unmap time, after we'd already
suffered memory corruption by doing DMA to a bogus address.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Cc: stable@kernel.org Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Thu, 29 Oct 2009 09:09:28 +0000 (10:09 +0100)]
mac80211: fix reason code output endianness
When HT debugging is enabled and we receive a DelBA
frame we print out the reason code in the wrong byte
order. Fix that so we don't get weird values printed.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
downgraded an ASSERT to a WARN_ON() but also misplaced a
semicolon at the end of the second check. What this did
was force the rate control code to always return the rate
even if we should have warned about it. Since this should
not have happened anymore anyway this fix isn't critical
as the proper rate would have been returned anyway.
Cc: stable@kernel.org Reported-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Michael Buesch [Wed, 28 Oct 2009 21:08:13 +0000 (22:08 +0100)]
b43: Fix DMA TX bounce buffer copying
b43 allocates a bouncebuffer, if the supplied TX skb is in an invalid
memory range for DMA.
However, this is broken in that it fails to copy over some metadata to the
new skb.
This patch fixes three problems:
* Failure to adjust the ieee80211_tx_info pointer to the new buffer.
This results in a kmemcheck warning.
* Failure to copy the skb cb, which contains ieee80211_tx_info, to the new skb.
This results in breakage of various TX-status postprocessing (Rate control).
* Failure to transfer the queue mapping.
This results in the wrong queue being stopped on saturation and can result in queue overflow.
Signed-off-by: Michael Buesch <mb@bu3sch.de> Tested-by: Christian Casteyde <casteyde.christian@free.fr> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Zhu Yi [Thu, 15 Oct 2009 06:50:28 +0000 (14:50 +0800)]
ipw2200: fix oops on missing firmware
For non-monitor interfaces, the syntax for alloc_ieee80211/free_80211
is wrong. Because alloc_ieee80211 only creates (wiphy_new) a wiphy, but
free_80211() does wiphy_unregister() also. This is only correct when
the later wiphy_register() is called successfully, which apparently
is not the case for your fw doesn't exist one.
Signed-off-by: Zhu Yi <yi.zhu@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Herbert Xu [Fri, 30 Oct 2009 05:51:48 +0000 (05:51 +0000)]
gre: Fix dev_addr clobbering for gretap
Nathan Neulinger noticed that gretap devices get their MAC address
from the local IP address, which results in invalid MAC addresses
half of the time.
This is because gretap is still using the tunnel netdev ops rather
than the correct tap netdev ops struct.
This patch also fixes changelink to not clobber the MAC address
for the gretap case.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Tested-by: Nathan Neulinger <nneul@mst.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Fri, 30 Oct 2009 05:03:53 +0000 (05:03 +0000)]
net: fix sk_forward_alloc corruption
On UDP sockets, we must call skb_free_datagram() with socket locked,
or risk sk_forward_alloc corruption. This requirement is not respected
in SUNRPC.
Add a convenient helper, skb_free_datagram_locked() and use it in SUNRPC
Reported-by: Francis Moreau <francis.moro@gmail.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Thu, 29 Oct 2009 13:46:05 +0000 (13:46 +0000)]
e1000e: rework disable K1 at 1000Mbps for 82577/82578
This patch reworks a previous workaround (commit 7d3cabbcc) for an issue
in hardware where noise on the interconnect between the MAC and PHY could
be generated by a lower power mode (K1) at 1000Mbps resulting in bad
packets. Disable K1 while at 1000 Mbps but keep it enabled for 10/100Mbps
and when the cable is disconnected. The original version of this
workaround was found to be incomplete.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Thu, 29 Oct 2009 13:45:45 +0000 (13:45 +0000)]
e1000e: config PHY via software after resets
On PCH-based (82577/82578) and some ICH8-based parts (82566) there is an
issue with the hardware automatically configuring the PHY with contents
from the EEPROM after the PHY is reset, so do the configuration by the
driver instead. This was already similarly done for some 82566 parts in
e1000_phy_hw_reset_ich8lan() but needs to be done after other resets,
so move the PHY configuration code to its own function and call after
all PHY resets.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Thu, 29 Oct 2009 13:42:41 +0000 (13:42 +0000)]
e100: e100_phy_init() isolates selected PHY, causes 10 second boot delay
A change in how PHYs are electrically isolated caused all PHYs to be
isolated followed by reverting that isolation for the selected PHY.
Unfortunately, isolating the selected PHY for even a short period of
time can result in DHCP negotiation taking more than 10 seconds on certain
embedded configurations delaying boot time as reported by Bernhard Kaindl.
This patch reverts the change to how PHYs are isolated yet still works
around the issue for 82552 needing the selected PHY's BMCR register to
be written after the unused PHYs are isolated. This code is moved below
the setting of nic->phy ID in order to do the 82552-specific workaround.
Cc: Bernhard Kaindl <bernhard.kaindl@gmx.net> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Gabor Gombas [Thu, 29 Oct 2009 10:19:11 +0000 (03:19 -0700)]
net: Fix 'Re: PACKET_TX_RING: packet size is too long'
Currently PACKET_TX_RING forces certain amount of every frame to remain
unused. This probably originates from an early version of the
PACKET_TX_RING patch that in fact used the extra space when the (since
removed) CONFIG_PACKET_MMAP_ZERO_COPY option was enabled. The current
code does not make any use of this extra space.
This patch removes the extra space reservation and lets userspace make
use of the full frame size.
Signed-off-by: Gabor Gombas <gombasg@sztaki.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
netdev: usb: dm9601.c can drive a device not supported yet, add support for it
I found that the current version of drivers/net/usb/dm9601.c can be used to
successfully drive a low-power, low-cost network adapter with USB ID
0a46:9000, based on a DM9000E chipset. As no device with this ID is yet
present in the kernel, I have created a patch that adds support for the device
to the dm9601 driver.
Created and tested against linux-2.6.32-rc5.
Signed-off-by: Janusz Krzysztofik <jkrzyszt@tis.icnet.pl> Acked-by: Peter Korsgaard <jacmet@sunsite.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
Ron Mercer [Wed, 28 Oct 2009 08:39:21 +0000 (08:39 +0000)]
qlge: Fix firmware mailbox command timeout.
The mailbox command process would only process a maximum of 5 unrelated
firmware events while waiting for it's command completion status.
It should process an unlimited number of events while waiting for a maximum of 5 seconds.
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Neil Horman [Wed, 28 Oct 2009 08:59:47 +0000 (08:59 +0000)]
AF_RAW: Augment raw_send_hdrinc to expand skb to fit iphdr->ihl (v2)
Augment raw_send_hdrinc to correct for incorrect ip header length values
A series of oopses was reported to me recently. Apparently when using AF_RAW
sockets to send data to peers that were reachable via ipsec encapsulation,
people could panic or BUG halt their systems.
I've tracked the problem down to user space sending an invalid ip header over an
AF_RAW socket with IP_HDRINCL set to 1.
Basically what happens is that userspace sends down an ip frame that includes
only the header (no data), but sets the ip header ihl value to a large number,
one that is larger than the total amount of data passed to the sendmsg call. In
raw_send_hdrincl, we allocate an skb based on the size of the data in the msghdr
that was passed in, but assume the data is all valid. Later during ipsec
encapsulation, xfrm4_tranport_output moves the entire frame back in the skbuff
to provide headroom for the ipsec headers. During this operation, the
skb->transport_header is repointed to a spot computed by
skb->network_header + the ip header length (ihl). Since so little data was
passed in relative to the value of ihl provided by the raw socket, we point
transport header to an unknown location, resulting in various crashes.
This fix for this is pretty straightforward, simply validate the value of of
iph->ihl when sending over a raw socket. If (iph->ihl*4U) > user data buffer
size, drop the frame and return -EINVAL. I just confirmed this fixes the
reported crashes.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jiri Bohac [Thu, 29 Oct 2009 05:23:54 +0000 (22:23 -0700)]
bonding: fix a race condition in calls to slave MII ioctls
In mii monitor mode, bond_check_dev_link() calls the the ioctl
handler of slave devices. It stores the ndo_do_ioctl function
pointer to a static (!) ioctl variable and later uses it to call the
handler with the IOCTL macro.
If another thread executes bond_check_dev_link() at the same time
(even with a different bond, which none of the locks prevent), a
race condition occurs. If the two racing slaves have different
drivers, this may result in one driver's ioctl handler being
called with a pointer to a net_device controlled with a different
driver, resulting in unpredictable breakage.
Unless I am overlooking something, the "static" must be a
copy'n'paste error (?).
Signed-off-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
virtio net used to unlink skbs from send queues on error,
but ever since 48925e372f04f5e35fec6269127c62b2c71ab794
we do not do this. This causes guest data corruption and crashes
with vhost since net core can requeue the skb or free it without
it being taken off the list.
This patch fixes this by queueing the skb after successful
transmit.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings [Wed, 28 Oct 2009 10:43:49 +0000 (03:43 -0700)]
sfc: Set ip_summed correctly for page buffers passed to GRO
Page buffers containing packets with an incorrect checksum or using a
protocol not handled by hardware checksum offload were previously not
passed to LRO. The conversion to GRO changed this, but did not set
the ip_summed value accordingly.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Björn Smedman [Sat, 24 Oct 2009 18:55:09 +0000 (20:55 +0200)]
mac80211: fix for incorrect sequence number on hostapd injected frames
When hostapd injects a frame, e.g. an authentication or association
response, mac80211 looks for a suitable access point virtual interface
to associate the frame with based on its source address. This makes it
possible e.g. to correctly assign sequence numbers to the frames.
A small typo in the ethernet address comparison statement caused a
failure to find a suitable ap interface. Sequence numbers on such
frames where therefore left unassigned causing some clients
(especially windows-based 11b/g clients) to reject them and fail to
authenticate or associate with the access point. This patch fixes the
typo in the address comparison statement.
Signed-off-by: Björn Smedman <bjorn.smedman@venatech.se> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Cc: stable@kernel.org Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Tue, 20 Oct 2009 06:08:53 +0000 (15:08 +0900)]
cfg80211: sme: deauthenticate on assoc failure
When the in-kernel SME gets an association failure from
the AP we don't deauthenticate, and thus get into a very
confused state which will lead to warnings later on. Fix
this by actually deauthenticating when the AP indicates
an association failure.
(Brought to you by the hacking session at Kernel Summit 2009 in Tokyo,
Japan. -- JWL)
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Johannes Berg [Tue, 20 Oct 2009 06:08:12 +0000 (15:08 +0900)]
mac80211: keep auth state when assoc fails
When association fails, we should stay authenticated,
which in mac80211 is represented by the existence of
the mlme work struct, so we cannot free that, instead
we need to just set it to idle.
(Brought to you by the hacking session at Kernel Summit 2009 in Tokyo,
Japan. -- JWL)
Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Reinette Chatre [Mon, 19 Oct 2009 21:55:37 +0000 (14:55 -0700)]
mac80211: fix ibss joining
Recent commit "mac80211: fix logic error ibss merge bssid check" fixed
joining of ibss cell when static bssid is provided. In this case
ifibss->bssid is set before the cell is joined and comparing that address
to a bss should thus always succeed. Unfortunately this change broke the
other case of joining a ibss cell without providing a static bssid where
the value of ifibss->bssid is not set before the cell is joined.
Since ifibss->bssid may be set before or after joining the cell we do not
learn anything by comparing it to a known bss. Remove this check.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Larry Finger [Fri, 16 Oct 2009 15:18:09 +0000 (10:18 -0500)]
b43: Fix Bugzilla #14181 and the bug from the previous 'fix'
"b43: Fix PPC crash in rfkill polling on unload" fixed the bug reported
in Bugzilla No. 14181; however, it introduced a new bug. Whenever the
radio switch was turned off, it was necessary to unload and reload
the driver for it to recognize the switch again.
This patch fixes both the original bug in #14181 and the bug introduced by
the previous patch. It must be stated, however, that if there is a BCM4306/3
with an rfkill switch (not yet proven), then the driver will need an
unload/reload cycle to turn the device back on.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Benoit PAPILLAULT <benoit.papillault@free.fr> Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
Michal Ostrowski [Mon, 26 Oct 2009 23:23:20 +0000 (16:23 -0700)]
PPPoE: Fix flush/close races.
Be more careful about the state of pointers during tear-down.
The "pppoe_dev" field can only be looked at safely while holding socket locks.
This subsequently allows for the flush_lock to be killed.
We depend on the PPPOX_CONNECTED state to tell us that that those fields are
valid, so whoever clears that state (pppox_unbind_sock()) is responsible for
the dev_put() call.
We also have to ensure that we delete_item() on all sockets before they are
cleaned up.
The need for these changes has been exposed by scenarios wherein namespace
bindings of ethernet devices change while there are ongoing PPPoE sessions,
which resulted in oopses due to unusual socket connection termination paths,
exposing these issues.
Bruce Allan [Mon, 26 Oct 2009 11:24:02 +0000 (11:24 +0000)]
e1000e: allow for swflag to be held over consecutive PHY accesses
PCH-based parts (82577/82578) and some ICH8-based parts (82566) need to
hold the swflag (sw/fw/hw hardware semaphore) over consecutive PHY accesses
in order to perform sw-driven PHY configuration during initialization to
workaround known hardware issues (see follow-on patch). This patch
provides new PHY read/write functions (and function pointers) that will
allow accessing the PHY registers assuming the swflag has already been
acquired. The actual PHY register access code has moved into helper
functions that are called with a flag indicating whether or not the swflag
has already been acquired and acquires/releases it if not.
The functions called from within the updated PHY access functions had to be
updated to assume the swflag was already acquired, and other functions that
called those functions were also updated to acquire/release the swflag.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Mon, 26 Oct 2009 11:23:43 +0000 (11:23 +0000)]
e1000e: separate mutex usage between NVM and PHY/CSR register for ICHx/PCH
Accesses to NVM and PHY/CSR registers on ICHx/PCH-based parts are protected
from concurrent accesses with a mutex that is acquired when the access is
initiated and released when the access has completed. However, the two
types of accesses should not be protected by the same mutex because the
driver may have to access the NVM while already holding the mutex over
several consecutive PHY/CSR accesses which would result in livelock.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Mon, 26 Oct 2009 11:23:25 +0000 (11:23 +0000)]
e1000e: 82577/82578 requires a different method to configure LPLU
Unlike previous ICHx-based parts, the PCH-based parts (82577/82578) require
LPLU (Low Power Link Up, or "reverse auto-negotiation") to be configured in
the PHY rather than the MAC.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Mon, 26 Oct 2009 11:23:06 +0000 (11:23 +0000)]
e1000e: increase swflag acquisition timeout for ICHx/PCH
In some conditions (e.g. when AMT is enabled on the system), it is possible
to take an extended period of time to for the driver to acquire the sw/fw/hw
hardware semaphore used to protect against concurrent access of a shared
resource (e.g. PHY registers). This could cause PHY registers to not get
configured properly resulting in link issues.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Mon, 26 Oct 2009 11:22:47 +0000 (11:22 +0000)]
e1000e: clear PHY wakeup bit after LCD reset on 82577/82578
Performing a dummy read of the PHY Wakeup Control (WUC) register clears the
wakeup enable bit set by an PHY reset. If this bit remains set, link
problems may occur.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Mon, 26 Oct 2009 11:32:25 +0000 (11:32 +0000)]
igbvf: fix memory leak when ring size changed while interface down
This patch resolves a memory leak which occurs while changing the ring size
while the interface is down.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Mon, 26 Oct 2009 11:32:05 +0000 (11:32 +0000)]
ixgbe: fix memory leak when resizing rings while interface is down
This patch resolves a memory leak that occurs when you resize the rings via
the ethtool -G option while the interface is down.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck [Mon, 26 Oct 2009 11:31:47 +0000 (11:31 +0000)]
igb: fix memory leak when setting ring size while interface is down
Changing ring sizes while the interface was down was causing a double
allocation of the receive and transmit rings. This issue is amplified when
there are multiple rings enabled. To prevent this we need to add an
additional check which will just update the ring counts when the interface
is not up and skip the allocation steps.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Jasper Spaans [Fri, 23 Oct 2009 04:08:46 +0000 (04:08 +0000)]
bonding: Modify hash transmit policies to use the packet's source MAC address
Modify bonding hash transmit policies to use the psource MAC address of
the packet instead of the MAC address configured for the bonding device.
The old sitation conflicts with the documentation.
Signed-off-by: Jasper Spaans <spaans@fox-it.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Jay Vosburgh <fubar@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Wunderlich [Sat, 24 Oct 2009 13:47:33 +0000 (06:47 -0700)]
r8169: fix Ethernet Hangup for RTL8110SC rev d
The 8110SC rev d chip on our board shows a regression which the 8110SB chip
did not have. When inbound traffic is overflowing the receive descriptor queue,
"holes" in the ring buffer may occur which lead to a hangup until the buffer
is filled again. The packets are than completely processed, but the ring
remains porous and no packets are processed until the next overflow. Setting
the interface down and up can fix the problem temporary from userspace.
For some reason we don't know, this behaviour is not occuring if the RxVlan
bit for hardware VLAN untagging is set. There is another "Work around for
AMD plateform" in the current code which checks the VLAN status
word in receive descriptors, but does never come to effect when hardware
VLAN support is enabled. We assume that this is a bug in the chip.
The following patch fixes the problem. Without the patch we could reproduce
the hang within minutes (given other devices also generating lots of
interrupts), without we couldn't reproduce within a few days of long term
testing.
This version contains minor style adjustments and is sent with mutt which
will hopefully not destroy the formatting again.
Signed-off-by: Bernhard Schmidt <bernhard.schmidt@saxnet.de> Signed-off-by: Simon Wunderlich <simon.wunderlich@saxnet.de> Acked-by: Francois Romieu <romieu@zoreil.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Manuel Lauss [Sat, 17 Oct 2009 02:00:07 +0000 (02:00 +0000)]
net: au1000_eth: add missing capability.h
fixes the following build failure:
CC drivers/net/au1000_eth.o
/drivers/net/au1000_eth.c: In function 'au1000_set_settings':
/drivers/net/au1000_eth.c:623: error: implicit declaration of function 'capable'
/drivers/net/au1000_eth.c:623: error: 'CAP_NET_ADMIN' undeclared (first use in this function)
/drivers/net/au1000_eth.c:623: error: (Each undeclared identifier is reported only once
/drivers/net/au1000_eth.c:623: error: for each function it appears in.
Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Arjan van de Ven [Fri, 23 Oct 2009 04:37:56 +0000 (21:37 -0700)]
net: use WARN() for the WARN_ON in commit b6b39e8f3fbbb
Commit b6b39e8f3fbbb (tcp: Try to catch MSG_PEEK bug) added a printk()
to the WARN_ON() that's in tcp.c. This patch changes this combination
to WARN(); the advantage of WARN() is that the printk message shows up
inside the message, so that kerneloops.org will collect the message.
In addition, this gets rid of an extra if() statement.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Bruce Allan [Fri, 23 Oct 2009 04:22:18 +0000 (21:22 -0700)]
e1000e: reset the PHY on 82577/82578 when going to Sx
The PHY on 82577/82578 parts needs a soft reset when transitioning to Sx
state in order for the PHY write which disables gigabit speed to take
effect. Gigabit speed must be disabled in order for the PHY writes to
registers on page 800 (the wakeup control registers) to work as expected
otherwise the system might not wake via WoL.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
=======================================================
[ INFO: possible circular locking dependency detected ]
2.6.32-rc4-testing #7
-------------------------------------------------------
ipppd/28379 is trying to acquire lock:
(&netdev->queue_lock){......}, at: [<e62ad0fd>] isdn_net_device_busy+0x2c/0x74 [isdn]
but task is already holding lock:
(&netdev->local->xmit_lock){+.....}, at: [<e62aefc2>] isdn_net_write_super+0x3f/0x6e [isdn]
which lock already depends on the new lock.
.......
We don't need to lock nd->queue->xmit_lock to protect single
isdn_net_lp_busy(). This can fix above lockdep warnings.
Reported-and-tested-by: Tilman Schmidt <tilman@imap.cc> Signed-off-by: Xiaotian Feng <xtfeng@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dhananjay Phadke [Wed, 21 Oct 2009 19:39:03 +0000 (19:39 +0000)]
netxen: avoid undue board config check
Old code assumed board config version in the flash to be 1.
When this will get changed by tools, driver just refuses to
attach. This is unnecessary since driver does not have to
parse board config structure directly (maintained by firmware).
Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Clear NX_RESETING bit in netxen_tx_timeout_task() so that
the firmware watchdog task can catch need_reset request
from tx timeout.
Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com> Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Dooks [Mon, 19 Oct 2009 23:49:05 +0000 (23:49 +0000)]
KS8851: Fix ks8851_set_rx_mode() for IFF_MULTICAST
In ks8851_set_rx_mode() the case handling IFF_MULTICAST was also setting
the RXCR1_AE bit by accident. This meant that all unicast frames where
being accepted by the device. Remove RXCR1_AE from this case.
Note, RXCR1_AE was also masking a problem with setting the MAC address
properly, so needs to be applied after fixing the MAC write order.
Fixes a bug reported by Doong, Ping of Micrel. This version of the
patch avoids setting RXCR1_ME for all cases.
Signed-off-by: Ben Dooks <ben@simtec.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Dooks [Mon, 19 Oct 2009 23:49:04 +0000 (23:49 +0000)]
KS8851: Fix MAC address write order
The MAC address register was being written in the wrong order, so add
a new address macro to convert mac-address byte to register address and
a ks8851_wrreg8() function to write each byte without having to worry
about any difficult byte swapping.
Fixes a bug reported by Doong, Ping of Micrel.
Signed-off-by: Ben Dooks <ben@simtec.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Dooks [Mon, 19 Oct 2009 23:49:03 +0000 (23:49 +0000)]
KS8851: Add soft reset at probe time
Issue a full soft reset at probe time.
This was reported by Doong Ping of Micrel, but no explanation of why this
is necessary or what bug it is fixing. Add it as it does not seem to hurt
the current driver and ensures that the device is in a known state when we
start setting it up.
Signed-off-by: Ben Dooks <ben@simtec.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Mon, 19 Oct 2009 19:41:06 +0000 (19:41 +0000)]
tcp: Try to catch MSG_PEEK bug
This patch tries to print out more information when we hit the
MSG_PEEK bug in tcp_recvmsg. It's been around since at least
2005 and it's about time that we finally fix it.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Oliver Hartkopp <oliver@hartkopp.net> Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Tested-by: Oliver Hartkopp <oliver@hartkopp.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Anastasov [Mon, 19 Oct 2009 10:10:40 +0000 (10:10 +0000)]
tcp: fix TCP_DEFER_ACCEPT retrans calculation
Fix TCP_DEFER_ACCEPT conversion between seconds and
retransmission to match the TCP SYN-ACK retransmission periods
because the time is converted to such retransmissions. The old
algorithm selects one more retransmission in some cases. Allow
up to 255 retransmissions.
Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Anastasov [Mon, 19 Oct 2009 10:03:58 +0000 (10:03 +0000)]
tcp: reduce SYN-ACK retrans for TCP_DEFER_ACCEPT
Change SYN-ACK retransmitting code for the TCP_DEFER_ACCEPT
users to not retransmit SYN-ACKs during the deferring period if
ACK from client was received. The goal is to reduce traffic
during the deferring period. When the period is finished
we continue with sending SYN-ACKs (at least one) but this time
any traffic from client will change the request to established
socket allowing application to terminate it properly.
Also, do not drop acked request if sending of SYN-ACK fails.
Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Anastasov [Mon, 19 Oct 2009 10:01:56 +0000 (10:01 +0000)]
tcp: accept socket after TCP_DEFER_ACCEPT period
Willy Tarreau and many other folks in recent years
were concerned what happens when the TCP_DEFER_ACCEPT period
expires for clients which sent ACK packet. They prefer clients
that actively resend ACK on our SYN-ACK retransmissions to be
converted from open requests to sockets and queued to the
listener for accepting after the deferring period is finished.
Then application server can decide to wait longer for data
or to properly terminate the connection with FIN if read()
returns EAGAIN which is an indication for accepting after
the deferring period. This change still can have side effects
for applications that expect always to see data on the accepted
socket. Others can be prepared to work in both modes (with or
without TCP_DEFER_ACCEPT period) and their data processing can
ignore the read=EAGAIN notification and to allocate resources for
clients which proved to have no data to send during the deferring
period. OTOH, servers that use TCP_DEFER_ACCEPT=1 as flag (not
as a timeout) to wait for data will notice clients that didn't
send data for 3 seconds but that still resend ACKs.
Thanks to Willy Tarreau for the initial idea and to
Eric Dumazet for the review and testing the change.
Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Tomoki Sekiyama [Mon, 19 Oct 2009 06:17:37 +0000 (23:17 -0700)]
AF_UNIX: Fix deadlock on connecting to shutdown socket
I found a deadlock bug in UNIX domain socket, which makes able to DoS
attack against the local machine by non-root users.
How to reproduce:
1. Make a listening AF_UNIX/SOCK_STREAM socket with an abstruct
namespace(*), and shutdown(2) it.
2. Repeat connect(2)ing to the listening socket from the other sockets
until the connection backlog is full-filled.
3. connect(2) takes the CPU forever. If every core is taken, the
system hangs.
PoC code: (Run as many times as cores on SMP machines.)
int main(void)
{
int ret;
int csd;
int lsd;
struct sockaddr_un sun;
/* make an abstruct name address (*) */
memset(&sun, 0, sizeof(sun));
sun.sun_family = PF_UNIX;
sprintf(&sun.sun_path[1], "%d", getpid());
/* connect loop */
alarm(15); /* forcely exit the loop after 15 sec */
for (;;) {
csd = socket(AF_UNIX, SOCK_STREAM, 0);
ret = connect(csd, (struct sockaddr *)&sun, sizeof(sun));
if (-1 == ret) {
perror("connect()");
break;
}
puts("Connection OK");
}
return 0;
}
(*) Make sun_path[0] = 0 to use the abstruct namespace.
If a file-based socket is used, the system doesn't deadlock because
of context switches in the file system layer.
Why this happens:
Error checks between unix_socket_connect() and unix_wait_for_peer() are
inconsistent. The former calls the latter to wait until the backlog is
processed. Despite the latter returns without doing anything when the
socket is shutdown, the former doesn't check the shutdown state and
just retries calling the latter forever.
Patch:
The patch below adds shutdown check into unix_socket_connect(), so
connect(2) to the shutdown socket will return -ECONREFUSED.
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama.qu@hitachi.com> Signed-off-by: Masanori Yoshida <masanori.yoshida.tv@hitachi.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Thomas Chou [Wed, 7 Oct 2009 14:16:43 +0000 (14:16 +0000)]
ethoc: clear only pending irqs
This patch fixed the problem of dropped packets due to lost of
interrupt requests. We should only clear what was pending at the
moment we read the irq source reg.
Signed-off-by: Thomas Chou <thomas@wytron.com.tw> Signed-off-by: David S. Miller <davem@davemloft.net>
Randy Dunlap [Sat, 17 Oct 2009 00:54:34 +0000 (17:54 -0700)]
vmxnet3: use dev_dbg, fix build for CONFIG_BLOCK=n
vmxnet3 was using dprintk() for debugging output. This was
defined in <linux/dst.h> and was the only thing that was
used from that header file. This caused compile errors
when CONFIG_BLOCK was not enabled due to bio* and BIO*
uses in the header file, so change this driver to use
dev_dbg() for debugging output.
include/linux/dst.h:520: error: dereferencing pointer to incomplete type
include/linux/dst.h:520: error: 'BIO_POOL_BITS' undeclared (first use in this function)
include/linux/dst.h:521: error: dereferencing pointer to incomplete type
include/linux/dst.h:522: error: dereferencing pointer to incomplete type
include/linux/dst.h:525: error: dereferencing pointer to incomplete type
make[4]: *** [drivers/net/vmxnet3/vmxnet3_drv.o] Error 1
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Bhavesh Davda <bhavesh@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Wed, 14 Oct 2009 14:36:43 +0000 (14:36 +0000)]
virtio_net: use dev_kfree_skb_any() in free_old_xmit_skbs()
Because netpoll can call netdevice start_xmit() method with
irqs disabled, drivers should not call kfree_skb() from
their start_xmit(), but use dev_kfree_skb_any() instead.
Oct 8 11:16:52 172.30.1.31 [113074.791813] ------------[ cut here ]------------
Oct 8 11:16:52 172.30.1.31 [113074.791813] WARNING: at net/core/skbuff.c:398 \
skb_release_head_state+0x64/0xc8()
Oct 8 11:16:52 172.30.1.31 [113074.791813] Hardware name:
Oct 8 11:16:52 172.30.1.31 [113074.791813] Modules linked in: netconsole ocfs2 jbd2 quota_tree \
ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs crc32c drbd cn loop \
serio_raw psmouse snd_pcm snd_timer snd soundcore snd_page_alloc virtio_net pcspkr parport_pc parport \
i2c_piix4 i2c_core button processor evdev ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_snapshot \
dm_mod ide_cd_mod cdrom ata_generic ata_piix virtio_blk libata scsi_mod piix ide_pci_generic ide_core \
virtio_pci virtio_ring virtio floppy thermal fan thermal_sys [last unloaded: netconsole]
Oct 8 11:16:52 172.30.1.31 [113074.791813] Pid: 11132, comm: php5-cgi Tainted: G W \
2.6.31.2-vserver #1
Oct 8 11:16:52 172.30.1.31 [113074.791813] Call Trace:
Oct 8 11:16:52 172.30.1.31 [113074.791813] <IRQ> [<ffffffff81253cd5>] ? \
skb_release_head_state+0x64/0xc8
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81253cd5>] ? skb_release_head_state+0x64/0xc8
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81049ae1>] ? warn_slowpath_common+0x77/0xa3
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81253cd5>] ? skb_release_head_state+0x64/0xc8
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81253a1a>] ? __kfree_skb+0x9/0x7d
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffffa01cb139>] ? free_old_xmit_skbs+0x51/0x6e \
[virtio_net]
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffffa01cbc85>] ? start_xmit+0x26/0xf2 [virtio_net]
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff8126934f>] ? netpoll_send_skb+0xd2/0x205
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffffa0429216>] ? write_msg+0x90/0xeb [netconsole]
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81049f06>] ? __call_console_drivers+0x5e/0x6f
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff8102b49d>] ? kvm_clock_read+0x4d/0x52
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff8104a082>] ? release_console_sem+0x115/0x1ba
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff8104a632>] ? vprintk+0x2f2/0x34b
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff8106b142>] ? vx_update_load+0x18/0x13e
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81308309>] ? printk+0x4e/0x5d
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff8102b49d>] ? kvm_clock_read+0x4d/0x52
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81070b62>] ? getnstimeofday+0x55/0xaf
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81062683>] ? ktime_get_ts+0x21/0x49
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff810626b7>] ? ktime_get+0xc/0x41
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81062788>] ? hrtimer_interrupt+0x9c/0x146
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81024a4b>] ? smp_apic_timer_interrupt+0x80/0x93
Oct 8 11:16:52 172.30.1.31 [113074.791813] [<ffffffff81011663>] ? apic_timer_interrupt+0x13/0x20
Oct 8 11:16:52 172.30.1.31 [113074.791813] <EOI> [<ffffffff8130a9eb>] ? _spin_unlock_irq+0xd/0x31
Reported-and-tested-by: Massimo Cetra <mcetra@navynet.it> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Bug-Entry: http://bugzilla.kernel.org/show_bug.cgi?id=14378 Signed-off-by: David S. Miller <davem@davemloft.net>
Sathya Perla [Wed, 14 Oct 2009 20:21:17 +0000 (20:21 +0000)]
be2net: fix support for PCI hot plug
Before issuing any cmds to the FW, the driver must first wait
till the fW becomes ready. This is needed for PCI hot plug when
the driver can be probed while the card fw is being initialized.
Signed-off-by: Sathya Perla <sathyap@serverengines.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Randy Dunlap [Thu, 15 Oct 2009 03:38:58 +0000 (20:38 -0700)]
vmxnet: fix 2 build problems
vmxnet3 uses in_dev* interfaces so it should depend on INET.
Also fix so that the driver builds when CONFIG_PCI_MSI is disabled.
vmxnet3_drv.c:(.text+0x2a88cb): undefined reference to `in_dev_finish_destroy'
drivers/net/vmxnet3/vmxnet3_drv.c:1335: error: 'struct vmxnet3_intr' has no member named 'msix_entries'
drivers/net/vmxnet3/vmxnet3_drv.c:1384: error: 'struct vmxnet3_intr' has no member named 'msix_entries'
drivers/net/vmxnet3/vmxnet3_drv.c:2137: error: 'struct vmxnet3_intr' has no member named 'msix_entries'
drivers/net/vmxnet3/vmxnet3_drv.c:2138: error: 'struct vmxnet3_intr' has no member named 'msix_entries'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Bhavesh davda <bhavesh@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: add support for STMicroelectronics Ethernet controllers.
This is the driver for the ST MAC 10/100/1000 on-chip Ethernet
controllers (Synopsys IP blocks).
Driver documentation:
o http://stlinux.com/drupal/kernel/network/stmmac
Revisions:
o http://stlinux.com/drupal/kernel/network/stmmac-driver-revisions
Performances:
o http://stlinux.com/drupal/benchmarks/networking/stmmac
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Randy Dunlap [Wed, 14 Oct 2009 22:10:58 +0000 (15:10 -0700)]
net: ks8851_mll uses mii interfaces
From: Randy Dunlap <randy.dunlap@oracle.com>
ks8851_mll uses mii interfaces so it needs to select MII.
ks8851_mll.c:(.text+0xf95ac): undefined reference to `generic_mii_ioctl'
ks8851_mll.c:(.text+0xf96a0): undefined reference to `mii_ethtool_gset'
ks8851_mll.c:(.text+0xf96fa): undefined reference to `mii_ethtool_sset'
ks8851_mll.c:(.text+0xf9754): undefined reference to `mii_link_ok'
ks8851_mll.c:(.text+0xf97ae): undefined reference to `mii_nway_restart'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Bonesio [Wed, 14 Oct 2009 22:10:19 +0000 (15:10 -0700)]
net/fec_mpc52xx: Fix kernel panic on FEC error
The MDIO bus cannot be accessed at interrupt context, but on an FEC
error, the fec_mpc52xx driver reset function also tries to reset the
PHY. Since the error is detected at IRQ context, and the PHY functions
try to sleep, the kernel ends up panicking.
Resetting the PHY on an FEC error isn't even necessary. This patch
solves the problem by removing the PHY reset entirely.
Signed-off-by: John Bonesio <bones@secretlab.ca> Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
Anton Vorontsov [Wed, 14 Oct 2009 21:54:52 +0000 (14:54 -0700)]
net: Fix OF platform drivers coldplug/hotplug when compiled as modules
Some OF platform drivers are missing module device tables, so they won't
load automatically on boot. This patch fixes the issue by adding proper
MODULE_DEVICE_TABLE() macros to the drivers.
Signed-off-by: Anton Vorontsov <avorontsov@ru.mvista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Sriram [Wed, 7 Oct 2009 02:44:30 +0000 (02:44 +0000)]
TI DaVinci EMAC: Clear statistics register properly.
The mechanism to clear the statistics register is dependent
on the status of GMIIEN bit in MAC control register. If the
GMIIEN bit is set, the stats registers are write to decrement.
If the GMIIEN bit is cleared, the stats registers are plain
read/write registers. The stats register clearing operation
must take into account the current state of GMIIEN as it
can be cleared when the interface is brought down.
With existing implementation logic, querying for interface stats
when the interface is down, can corrupt the statistics counters.
This patch examines the GMIIEN bit status in MAC_CONTROL
register before choosing an appropriate mask for clearing stats
registers.
Signed-off-by: Sriramakrishnan <srk@ti.com> Acked-by: Chaithrika U S <chaithrika@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
françois romieu [Wed, 7 Oct 2009 12:44:20 +0000 (12:44 +0000)]
r8169: partial support and phy init for the 8168d
Extracted from Realtek's 8.012.00 r8168 driver.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Tested-by: Simon Farnsworth <simon.farnsworth@onelan.com> Cc: Edward Hsu <edward_hsu@realtek.com.tw> Signed-off-by: David S. Miller <davem@davemloft.net>
Cisco HDLC uses keepalive packets and sequence numbers to determine link
state. In rare cases both ends could transmit keepalive packets at the same
time, causing the received sequence numbers to be treated as incorrect.
Now we accept our current sequence number as well as the previous one.
Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
Willy Tarreau [Tue, 13 Oct 2009 07:27:40 +0000 (00:27 -0700)]
tcp: fix tcp_defer_accept to consider the timeout
I was trying to use TCP_DEFER_ACCEPT and noticed that if the
client does not talk, the connection is never accepted and
remains in SYN_RECV state until the retransmits expire, where
it finally is deleted. This is bad when some firewall such as
netfilter sits between the client and the server because the
firewall sees the connection in ESTABLISHED state while the
server will finally silently drop it without sending an RST.
This behaviour contradicts the man page which says it should
wait only for some time :
TCP_DEFER_ACCEPT (since Linux 2.4)
Allows a listener to be awakened only when data arrives
on the socket. Takes an integer value (seconds), this
can bound the maximum number of attempts TCP will
make to complete the connection. This option should not
be used in code intended to be portable.
Also, looking at ipv4/tcp.c, a retransmit counter is correctly
computed :
case TCP_DEFER_ACCEPT:
icsk->icsk_accept_queue.rskq_defer_accept = 0;
if (val > 0) {
/* Translate value in seconds to number of
* retransmits */
while (icsk->icsk_accept_queue.rskq_defer_accept < 32 &&
val > ((TCP_TIMEOUT_INIT / HZ) <<
icsk->icsk_accept_queue.rskq_defer_accept))
icsk->icsk_accept_queue.rskq_defer_accept++;
icsk->icsk_accept_queue.rskq_defer_accept++;
}
break;
==> rskq_defer_accept is used as a counter of retransmits.
But in tcp_minisocks.c, this counter is only checked. And in
fact, I have found no location which updates it. So I think
that what was intended was to decrease it in tcp_minisocks
whenever it is checked, which the trivial patch below does.
Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: David S. Miller <davem@davemloft.net>