Linus Torvalds [Fri, 11 Dec 2009 23:59:23 +0000 (15:59 -0800)]
Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
[ACPI/CPUFREQ] Introduce bios_limit per cpu cpufreq sysfs interface
[CPUFREQ] make internal cpufreq_add_dev_* static
[CPUFREQ] use an enum for speedstep processor identification
[CPUFREQ] Document units for transition latency
[CPUFREQ] Use global sysfs cpufreq structure for conservative governor tunings
[CPUFREQ] Documentation: ABI: /sys/devices/system/cpu/cpu#/cpufreq/
[CPUFREQ] powernow-k6: set transition latency value so ondemand governor can be used
[CPUFREQ] cpumask: don't put a cpumask on the stack in x86...cpufreq/powernow-k8.c
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6: (58 commits)
tty: split the lock up a bit further
tty: Move the leader test in disassociate
tty: Push the bkl down a bit in the hangup code
tty: Push the lock down further into the ldisc code
tty: push the BKL down into the handlers a bit
tty: moxa: split open lock
tty: moxa: Kill the use of lock_kernel
tty: moxa: Fix modem op locking
tty: moxa: Kill off the throttle method
tty: moxa: Locking clean up
tty: moxa: rework the locking a bit
tty: moxa: Use more tty_port ops
tty: isicom: fix deadlock on shutdown
tty: mxser: Use the new locking rules to fix setserial properly
tty: mxser: use the tty_port_open method
tty: isicom: sort out the board init logic
tty: isicom: switch to the new tty_port_open helper
tty: tty_port: Add a kref object to the tty port
tty: istallion: tty port open/close methods
tty: stallion: Convert to the tty_port_open/close methods
...
Linus Torvalds [Fri, 11 Dec 2009 23:31:13 +0000 (15:31 -0800)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: (21 commits)
ext3: PTR_ERR return of wrong pointer in setup_new_group_blocks()
ext3: Fix data / filesystem corruption when write fails to copy data
ext4: Support for 64-bit quota format
ext3: Support for vfsv1 quota format
quota: Implement quota format with 64-bit space and inode limits
quota: Move definition of QFMT_OCFS2 to linux/quota.h
ext2: fix comment in ext2_find_entry about return values
ext3: Unify log messages in ext3
ext2: clear uptodate flag on super block I/O error
ext2: Unify log messages in ext2
ext3: make "norecovery" an alias for "noload"
ext3: Don't update the superblock in ext3_statfs()
ext3: journal all modifications in ext3_xattr_set_handle
ext2: Explicitly assign values to on-disk enum of filetypes
quota: Fix WARN_ON in lookup_one_len
const: struct quota_format_ops
ubifs: remove manual O_SYNC handling
afs: remove manual O_SYNC handling
kill wait_on_page_writeback_range
vfs: Implement proper O_SYNC semantics
...
Linus Torvalds [Fri, 11 Dec 2009 23:30:29 +0000 (15:30 -0800)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: Fix error return for fallocate() on XFS
xfs: cleanup dmapi macros in the umount path
xfs: remove incorrect sparse annotation for xfs_iget_cache_miss
xfs: kill the STATIC_INLINE macro
xfs: uninline xfs_get_extsz_hint
xfs: rename xfs_attr_fetch to xfs_attr_get_int
xfs: simplify xfs_buf_get / xfs_buf_read interfaces
xfs: remove IO_ISAIO
xfs: Wrapped journal record corruption on read at recovery
xfs: cleanup data end I/O handlers
xfs: use WRITE_SYNC_PLUG for synchronous writeout
xfs: reset the i_iolock lock class in the reclaim path
xfs: I/O completion handlers must use NOFS allocations
xfs: fix mmap_sem/iolock inversion in xfs_free_eofblocks
xfs: simplify inode teardown
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging-2.6: (235 commits)
Staging: IIO: add selection of IIO_SW_RING to LIS3L02DQ as needed
Staging: IIO: Add tsl2560-2 support to tsl2563 driver.
Staging: IIO: Remove tsl2561 driver. Support merged with tsl2563.
Staging: wlags49_h2: fix up signal levels
+ drivers-staging-wlags49_h2-remove-cvs-metadata.patch added to -mm tree
Staging: samsung-laptop: add TODO file
Staging: samsung-laptop: remove old kernel code
Staging: add Samsung Laptop driver
staging: batman-adv meshing protocol
Staging: rtl8192u: depends on USB
Staging: rtl8192u: remove dead code
Staging: rtl8192u: remove bad whitespaces
Staging: rtl8192u: make it compile
Staging: Added Realtek rtl8192u driver to staging
Staging: dream: add gpio and pmem support
Staging: dream: add TODO file
Staging: android: delete android drivers
Staging: et131x: clean up the avail fields in the rx registers
Staging: et131x: Clean up number fields
Staging: et131x: kill RX_DMA_MAX_PKT_TIME
...
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (27 commits)
Driver core: fix race in dev_driver_string
Driver Core: Early platform driver buffer
sysfs: sysfs_setattr remove unnecessary permission check.
sysfs: Factor out sysfs_rename from sysfs_rename_dir and sysfs_move_dir
sysfs: Propagate renames to the vfs on demand
sysfs: Gut sysfs_addrm_start and sysfs_addrm_finish
sysfs: In sysfs_chmod_file lazily propagate the mode change.
sysfs: Implement sysfs_getattr & sysfs_permission
sysfs: Nicely indent sysfs_symlink_inode_operations
sysfs: Update s_iattr on link and unlink.
sysfs: Fix locking and factor out sysfs_sd_setattr
sysfs: Simplify iattr time assignments
sysfs: Simplify sysfs_chmod_file semantics
sysfs: Use dentry_ops instead of directly playing with the dcache
sysfs: Rename sysfs_d_iput to sysfs_dentry_iput
sysfs: Update sysfs_setxattr so it updates secdata under the sysfs_mutex
debugfs: fix create mutex racy fops and private data
Driver core: Don't remove kobjects in device_shutdown.
firmware_class: make request_firmware_nowait more useful
Driver-Core: devtmpfs - set root directory mode to 0755
...
Linus Torvalds [Fri, 11 Dec 2009 23:22:27 +0000 (15:22 -0800)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: ohci: handle receive packets with a data length of zero
Linus Torvalds [Fri, 11 Dec 2009 23:19:56 +0000 (15:19 -0800)]
Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
kgdb: Always process the whole breakpoint list on activate or deactivate
kgdb: continue and warn on signal passing from gdb
kgdb,x86: do not set kgdb_single_step on x86
kgdb: allow for cpu switch when single stepping
kgdb,i386: Fix corner case access to ss with NMI watch dog exception
kgdb: Replace strstr() by strchr() for single-character needles
kgdbts: Read buffer overflow
kgdb: Read buffer overflow
kgdb,x86: remove redundant test
Alan Cox [Mon, 30 Nov 2009 13:18:51 +0000 (13:18 +0000)]
tty: split the lock up a bit further
The tty count sanity check may need the BKL, that isn't clear. However it
is clear that the count use of the lock is internal and independant of the
bigger use of the lock.
Furthermore the file list locking is also separately locked already
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Cox [Mon, 30 Nov 2009 13:18:45 +0000 (13:18 +0000)]
tty: Move the leader test in disassociate
There are two call points, both want to check that tty->signal->leader is
set. Move the test into disassociate_ctty() as that will make locking
changes easier in a bit
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Cox [Mon, 30 Nov 2009 13:18:24 +0000 (13:18 +0000)]
tty: moxa: split open lock
moxa_openlock is used for several situations where we want to handle the
case of an ioctl that crosses many ports (not just the open tty), and also
cases where an open races a deinit (eg a pci unplug) and we hangup a port
before we can cope with that.
The non open race cases can use the moxa_lock spinlock. This simplifies sorting
out the remaining mess.
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Cox [Mon, 30 Nov 2009 13:18:02 +0000 (13:18 +0000)]
tty: moxa: Locking clean up
- The open lock is needed to fix up the case of a board reset occuring during
tty open but too early for a sane hangup response.
- The lock can however got for other cases
- Use the port mutex for get/setserial
- Fix up the confused lack of locking on the THROTTLE and other bits in the
private flags. Just use set/test/clear bit and it covers the cases we need
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Path to might_sleep macro from isicom_hangup:
1. isicom_hangup calls spin_lock_irqsave (drivers/char/isicom.c:1315) and
then
calls isicom_shutdown_port.
2. isiscom_shutdown_port calls tty_port_free_xmit_buf at
drivers/char/isicom.c:906
3. tty_port_free_xmit_buf calls mutex_lock at drivers/char/tty_port:48
Found by Linux Driver Verification Project.
Reported-by: Alexander Strakh <strakh@ispras.ru> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Cox [Mon, 30 Nov 2009 13:17:41 +0000 (13:17 +0000)]
tty: mxser: Use the new locking rules to fix setserial properly
Propogate the init/shutdown mutex through the setserial logic. Use the proper
locks for the various bits still using the BKL. Kill the BKL in this driver.
Updated to fix the bug noted by Dan Carpenter
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Cox [Mon, 30 Nov 2009 13:17:35 +0000 (13:17 +0000)]
tty: mxser: use the tty_port_open method
At first this looks a fairly trivial conversion but we can't quite push
everything into the right format yet. The open side is easy but care is needed
over the setserial methods. Fix up the locking now that we've adopted the
port->mutex locking rule for the initialization.
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Cox [Mon, 30 Nov 2009 13:17:30 +0000 (13:17 +0000)]
tty: isicom: sort out the board init logic
Split this into two flags - INIT meaning the board is set up and ACTIVE
meaning the board has ports open. Remove the broken HUPCL casing and push
the counts somewhere sensible.
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Cox [Mon, 30 Nov 2009 13:16:57 +0000 (13:16 +0000)]
tty: tty_port: Move the IO_ERROR clear
Some devices want to set IO_ERROR in their activate methods so that you can
be handed a 'dead' port for operations like setserial. Thus we need to
clear the flag before activate so that activate can choose to set the flag
and still return 0.
This is fine as the file handle/tty are not accessible to the user yet.
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Cox [Mon, 30 Nov 2009 13:16:41 +0000 (13:16 +0000)]
tty: tty_port: Change the buffer allocator locking
We want to be able to do this without regard for the activate/own open
method being used which causes a problem using port->mutex. Add another
mutex for now. Once everything uses port_open to do buffer allocs we can
kill it back off
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Cox [Thu, 5 Nov 2009 13:28:38 +0000 (13:28 +0000)]
sdio_uart: Move the open lock
When we move to the tty_port logic the port mutex will protect open v close
v hangup. Move to this first in the existing open code so we have a bisection
point.
Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
devpts_get_tty() assumes that the inode passed in is associated with a valid
pty. But if the only reference to the pty is via a bind-mount, the inode
passed to devpts_get_tty() while valid, would refer to a pty that no longer
exists.
With a lot of debug effort, Grzegorz Nosek developed a small program (see
below) to reproduce a crash on recent kernels. This crash is a regression
introduced by the commit:
Ian Jackson [Wed, 18 Nov 2009 10:08:11 +0000 (11:08 +0100)]
Serial: Do not read IIR in serial8250_start_tx when UART_BUG_TXEN
Do not read IIR in serial8250_start_tx when UART_BUG_TXEN
Reading the IIR clears some oustanding interrupts so it is not safe.
Instead, simply transmit immediately if the buffer is empty without
regard to IIR.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Jiri Kosina <jkosina@suse.cz> Cc: Alan Cox <alan@linux.intel.com> Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Intel(R) PXA27x Processor Family Specification Update (Nov 2005)
says:
E75. UART: Baud rate may not be programmed correctly on
back-to-back writes.
Problem:
When programming the Divisor Latch registers, Low and High (DLL and
DLH), with back-to-back writes, the second register write may not
take effect. The result is an incorrect baud rate.
Workaround:
After programming the first Divisor Latch register, read and verify
it before programming the second Divisor Latch register.
This was hit when changing the baud rate from 115200 to 9600 while
receiving characters at 9600 Bd.
And fixed indention of some comments nearby.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Wolfram Sang <w.sang@pengutronix.de> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Cc: Eric Miao <eric.y.miao@gmail.com> Cc: Alan Cox <alan@linux.intel.com> Cc: Mike Rapoport <mike@compulab.co.il> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Alan Cox [Tue, 6 Oct 2009 15:06:36 +0000 (16:06 +0100)]
usb_serial: Use the shutdown() operation
As Alan Stern pointed out - now we have tty_port_open the shutdown method
and locking allow us to whack the other bits into the full helper methods
and provide a shutdown op which the tty port code will synchronize with
setup for us.
Signed-off-by: Alan Cox <alan@linux.intel.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Oliver Neukum <oliver@neukum.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Currently there is a field in the jsm_board structure to cont
the number of interrupt that the card recevived, but it's not
working properly when the IRQ line is shared, and also nowhere
else this field is used. So, This patch is removing it.
Actually jsm displays "Device Added" 8 times (for a 8 port device).
This silly patch just makes things more informative, showing
the port (instead of the device) that was added.
Signed-off-by: Breno Leitão <leitao@linux.vnet.ibm.com> Cc: Scott Kilau <scottk@digi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
1. If unregister_netdevice_many() is called with both registered
and unregistered devices, rollback_registered_many() bails out
when it reaches the first unregistered device. The processing
of the prior registered devices is unfinished, and the
remaining devices are skipped, and possible registered netdev's
are leaked/unregistered.
2. System hangs or panics depending on how the devices are passed,
since when netdev_run_todo() runs, some devices were not fully
processed.
Tested by passing intermingled unregistered and registered vlan
devices to unregister_netdevice_many() as follows:
1. dev, fake_dev1, fake_dev2: hangs in run_todo
("unregister_netdevice: waiting for eth1.100 to become
free. Usage count = 1")
2. fake_dev1, dev, fake_dev2: failure during de-registration
and next registration, followed by a vlan driver Oops
during subsequent registration.
Confirmed that the patch fixes both cases.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Barry Song [Thu, 10 Dec 2009 23:46:28 +0000 (23:46 +0000)]
can: add the driver for Analog Devices Blackfin on-chip CAN controllers
Signed-off-by: Barry Song <21cnbao@gmail.com> Signed-off-by: H.J. Oertel <oe@port.de> Signed-off-by: Wolfgang Grandegger <wg@grandegger.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Martin Willi [Wed, 9 Dec 2009 06:11:15 +0000 (06:11 +0000)]
xfrm: Fix truncation length of authentication algorithms installed via PF_KEY
Commit 4447bb33f09444920a8f1d89e1540137429351b6 ("xfrm: Store aalg in
xfrm_state with a user specified truncation length") breaks
installation of authentication algorithms via PF_KEY, as the state
specific truncation length is not installed with the algorithms
default truncation length. This patch initializes state properly to
the default if installed via PF_KEY.
Signed-off-by: Martin Willi <martin@strongswan.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Heiko Carstens [Wed, 9 Dec 2009 20:59:15 +0000 (20:59 +0000)]
net: use compat helper functions in compat_sys_recvmmsg
Use (get|put)_compat_timespec helper functions to simplify the code.
Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Heiko Carstens [Wed, 9 Dec 2009 20:58:16 +0000 (20:58 +0000)]
net: fix compat_sys_recvmmsg parameter type
compat_sys_recvmmsg has a compat_timespec parameter and not a
timespec parameter. This way we also get rid of an odd cast.
Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Mundt [Thu, 10 Dec 2009 20:42:27 +0000 (20:42 +0000)]
net: smc91x: Fix up type mismatch in smc_drv_resume().
smc_drv_resume() takes a struct device, while smc_enable_device() takes a
platform device. This fixes up the smc_enable_device() callsite with the
proper pointer.
It's not obvious when this change was introduced, as git history doesn't
go back that far. Presumably the resume code has always been broken in
this fashion.
Signed-off-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Mike Frysinger [Wed, 9 Dec 2009 03:40:04 +0000 (03:40 +0000)]
smc91x: fix unused flags warnings on UP systems
Local flags variables will be declared whenever these functions get used,
but obviously on UP systems the flags parameter won't be touched. So add
some dummy ops that get optimized away anyways to satisfy gcc's warnings.
Signed-off-by: Mike Frysinger <vapier@gentoo.org> Acked-by: Nicolas Pitre <nico@fluxnic.net> Signed-off-by: David S. Miller <davem@davemloft.net>
Oliver Neukum [Fri, 11 Dec 2009 23:01:26 +0000 (15:01 -0800)]
MAINTAINERS: Transfering maintainership of cdc-ether
Oliver Neukum takes over from Greg KH
Signed-off-by: Oliver Neukum <oliver@neukum.org> Signed-off-by: Greg Kroah-Hartman <greg@kroah.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Takashi Iwai [Thu, 3 Dec 2009 05:12:02 +0000 (05:12 +0000)]
net: Add missing TST_CFG_WRITE bits around sky2_pci_write
Add missing TST_CFG_WRITE bits around sky2_pci_write*() in Optima
setup routines. Without the cfg-write bits, the driver may spew endless
link-up messages through qlink irq.
Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Fri, 11 Dec 2009 22:32:30 +0000 (14:32 -0800)]
Merge branch 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/ttm: export some functions useful to drivers using ttm
drm/radeon/kms/avivo: fix typo in new_pll module description
drm/radeon/kms: Convert radeon to new ttm_bo_init
drm/ttm: Convert ttm_buffer_object_init to use ttm_placement
Jason Gunthorpe [Tue, 24 Nov 2009 21:52:53 +0000 (21:52 +0000)]
xfs: Fix error return for fallocate() on XFS
Noticed that through glibc fallocate would return 28 rather than -1
and errno = 28 for ENOSPC. The xfs routines uses XFS_ERROR format
positive return error codes while the syscalls use negative return
codes. Fixup the two cases in xfs_vn_fallocate syscall to convert to
negative.
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Alex Elder <aelder@sgi.com>
Stop the flag saving as we never mangle those in the unmount path, and
hide all the weird arguents to the dmapi code inside the
XFS_SEND_PREUNMOUNT / XFS_SEND_UNMOUNT macros.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Alex Elder <aelder@sgi.com>
Remove our own STATIC_INLINE macro. For small function inside
implementation files just use STATIC and let gcc inline it, and for
those in headers do the normal static inline - they are all small
enough to be inlined for debug builds, too.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Alex Elder <aelder@sgi.com>
Using a totally different name for the low-level get operation does
not fit the _int convention used in the rest of the attr code, so
rename it.
While we're at it also fix the prototype to use the normal convention
and mark it static as it's never used outside of xfs_attr.c.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Alex Elder <aelder@sgi.com>
Currently the low-level buffer cache interfaces are highly confusing
as we have a _flags variant of each that does actually respect the
flags, and one without _flags which has a flags argument that gets
ignored and overriden with a default set. Given that very few places
use the default arguments get rid of the duplication and convert all
callers to pass the flags explicitly. Also remove the now confusing
_flags postfix.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Alex Elder <aelder@sgi.com>
We set the IO_ISAIO flag for all read/write I/O since early Linux
2.6.x. Remove it as it has lost it's purpose long ago.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Eric Sandeen <sandeen@sandeen.net> Signed-off-by: Alex Elder <aelder@sgi.com>
Andy Poling [Tue, 3 Nov 2009 17:26:47 +0000 (17:26 +0000)]
xfs: Wrapped journal record corruption on read at recovery
Summary of problem:
If a journal record wraps at the physical end of the journal, it has to be
read in two parts in xlog_do_recovery_pass(): a read at the physical end and a
read at the physical beginning. If xlog_bread() has to re-align the first
read, the second read request does not take that re-alignment into account.
If the first read was re-aligned, the second read over-writes the end of the
data from the first read, effectively corrupting it. This can happen either
when reading the record header or reading the record data.
The first sanity check in xlog_recover_process_data() is to check for a valid
clientid, so that is the error reported.
Summary of fix:
If there was a first read at the physical end, XFS_BUF_PTR() returns where the
data was requested to begin. Conversely, because it is the result of
xlog_align(), offset indicates where the requested data for the first read
actually begins - whether or not xlog_bread() has re-aligned it.
Using offset as the base for the calculation of where to place the second read
data ensures that it will be correctly placed immediately following the data
from the first read instead of sometimes over-writing the end of it.
The attached patch has resolved the reported problem of occasional inability
to recover the journal (reporting "bad clientid").
Signed-off-by: Andy Poling <andy@realbig.com> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
Currently we have different end I/O handlers for read vs the different
types of write I/O. But they are all very similar so we could just
use one with a few conditionals and reduce code size a lot.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
The VM and I/O schedulers now expect us to use WRITE_SYNC_PLUG for
synchronous writeout. Right now I can't see any changes in performance
numbers with this, but we're getting some beating for not using it,
and the knowledge definitely could help the block code to make better
decisions.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
xfs: reset the i_iolock lock class in the reclaim path
The iolock is used for protecting reads, writes and block truncates
against each other. We have two classes of callers, the first one is
induced by a file operation and requires a reference to the inode be
held and not dropped after the operation is done:
- xfs_vm_vmap, xfs_vn_fallocate, xfs_read, xfs_write, xfs_splice_read,
xfs_splice_write and xfs_setattr are all implementations of VFS
methods that require a live inode
- xfs_getbmap and xfs_swap_extents are ioctl subcommand for which the
same is true
- xfs_truncate_file is only called on quota inodes just returned from
xfs_iget
- xfs_sync_inode_data does the lock just after an igrab()
- xfs_filestream_associate and xfs_filestream_new_ag take the iolock
on the parent inode of an inode which by VFS rules must be referenced
And we have various calls to truncate blocks past EOF or the whole
file when dropping the last reference to an inode. Unfortunately
lockdep complains when we do memory allocations that can recurse into
the filesystem in the first class because the second class happens to
take the same lock. To avoid this re-init the iolock in the beginning
of xfs_fs_clear_inode to get a new lock class.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
xfs: I/O completion handlers must use NOFS allocations
When completing I/O requests we must not allow the memory allocator to
recurse into the filesystem, as we might deadlock on waiting for the
I/O completion otherwise. The only thing currently allocating normal
GFP_KERNEL memory is the allocation of the transaction structure for
the unwritten extent conversion. Add a memflags argument to
_xfs_trans_alloc to allow controlling the allocator behaviour.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Thomas Neumann <tneumann@users.sourceforge.net> Tested-by: Thomas Neumann <tneumann@users.sourceforge.net> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
xfs: fix mmap_sem/iolock inversion in xfs_free_eofblocks
When xfs_free_eofblocks is called from ->release the VM might already
hold the mmap_sem, but in the write path we take the iolock before
taking the mmap_sem in the generic write code.
Switch xfs_free_eofblocks to only trylock the iolock if called from
->release and skip trimming the prellocated blocks in that case.
We'll still free them later on the final iput.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>
Currently the reclaim code for the case where we don't reclaim the
final reclaim is overly complicated. We know that the inode is clean
but instead of just directly reclaiming the clean inode we go through
the whole process of marking the inode reclaimable just to directly
reclaim it from the calling context. Besides being overly complicated
this introduces a race where iget could recycle an inode between
marked reclaimable and actually being reclaimed leading to panics.
This patch gets rid of the existing reclaim path, and replaces it with
a simple call to xfs_ireclaim if the inode was clean. While we're at
it we also use the slightly more lax xfs_inode_clean check we'd use
later to determine if we need to flush the inode here.
Finally get rid of xfs_reclaim function and place the remaining small
bits of reclaim code directly into xfs_fs_destroy_inode.
Signed-off-by: Christoph Hellwig <hch@lst.de> Reported-by: Patrick Schreurs <patrick@news-service.com> Reported-by: Tommy van Leeuwen <tommy@news-service.com> Tested-by: Patrick Schreurs <patrick@news-service.com> Reviewed-by: Alex Elder <aelder@sgi.com> Signed-off-by: Alex Elder <aelder@sgi.com>