Commit 054b92b902a7 ("scripts/faddr2line: Fix overlapping text section
failures") can cause faddr2line to fail on ppc64le on some
distributions, while it works fine on other distributions. The failure
can be attributed to differences in the readelf output.
$ ./scripts/faddr2line vmlinux find_busiest_group+0x00
no match for find_busiest_group+0x00
On ppc64le, readelf adds the localentry tag before the symbol name on
some distributions, and adds the localentry tag after the symbol name on
other distributions. This problem has been discussed previously:
This problem can be overcome by filtering out the localentry tags in the
readelf output. Similar fixes are already present in the kernel by way
of the following commits:
1d4f33c247b4 ("libbpf: Fix VERSIONED_SYM_COUNT number parsing") b7eca9898dd4 ("libbpf: Fix readelf output parsing for Fedora")
[jpoimboe: rework commit log]
Fixes: 054b92b902a7 ("scripts/faddr2line: Fix overlapping text section failures") Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Reviewed-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Link: https://lore.kernel.org/r/20220927075211.897152-1-srikar@linux.vnet.ibm.com Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix an implicit declaration of function error for rpr0521 under some configs
When CONFIG_RPR0521 is enabled without CONFIG_IIO_TRIGGERED_BUFFER,
the build results in "implicit declaration of function" errors, e.g.,
drivers/iio/light/rpr0521.c:434:3: error: implicit declaration of function
'iio_trigger_poll_chained' [-Werror=implicit-function-declaration]
434 | iio_trigger_poll_chained(data->drdy_trigger0);
| ^~~~~~~~~~~~~~~~~~~~~~~~
This fix adds select dependencies to RPR0521's configuration declaration.
The array size of afe4404_channel_leds and afe4404_channel_offdacs
are less than channels, so access with chan->address cause OOB read
in afe4404_[read|write]_raw. Fix it by moving access before use them.
Fixes: fbd956c6130a ("iio: health/afe440x: Use regmap fields") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Andrew Davis <afd@ti.com> Link: https://lore.kernel.org/r/20221107152010.95937-1-weiyongjun@huaweicloud.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The array size of afe4403_channel_leds is less than channels, so access
with chan->address cause OOB read in afe4403_read_raw. Fix it by moving
access before use it.
Fixes: fbd956c6130a ("iio: health/afe440x: Use regmap fields") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Andrew Davis <afd@ti.com> Link: https://lore.kernel.org/r/20221107151946.89260-1-weiyongjun@huaweicloud.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Calling drm_connector_update_edid_property() in
amdgpu_connector_free_edid() causes a noticeable pause in
the system every 10 seconds on polled outputs so revert this
part of the change.
drm_display_info is updated by drm_get_edid() or
drm_connector_update_edid_property(). In the amdgpu driver it is almost
always updated when the edid is read in amdgpu_connector_get_edid(),
but not always. Change amdgpu_connector_get_edid() and
amdgpu_connector_free_edid() to keep drm_display_info updated.
Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Claudio Suarez <cssk@net-c.es> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stable-dep-of: e0d3fc5fe2b6 ("drm/amdgpu: Partially revert "drm/amdgpu: update drm_display_info correctly when the edid is read"") Signed-off-by: Sasha Levin <sashal@kernel.org>
Instead of having 2 places that short circuit the qgroup leaf scan have
everything in the qgroup_rescan_leaf function. In addition to that, also
ensure that the inconsistent qgroup flag is set when rescan_should_stop
returns true. This both retains the old behavior when -EINTR was set in
the body of the loop and at the same time also extends this behavior
when scanning is interrupted due to remount or unmount operations.
Signed-off-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Stable-dep-of: cb4e6a0ae540 ("btrfs: qgroup: fix sleep from invalid context bug in btrfs_qgroup_inherit()") Signed-off-by: Sasha Levin <sashal@kernel.org>
In case the requested bus clock is higher than the input clock, the correct
dividers (pre = 0, post = 0) are returned from mx51_ecspi_clkdiv(), but
*fres is left uninitialized and therefore contains an arbitrary value.
This causes trouble for the recently introduced PIO polling feature as the
value in spi_imx->spi_bus_clk is used there to calculate for which
transfers to enable PIO polling.
Fix this by setting *fres even if no clock dividers are in use.
This issue was observed on Kontron BL i.MX8MM with an SPI peripheral clock set
to 50 MHz by default and a requested SPI bus clock of 80 MHz for the SPI NOR
flash.
With the fix applied the debug message from mx51_ecspi_clkdiv() now prints the
following:
file_modified() must be called with inode lock held. fuse_fallocate()
didn't lock the inode in case of just FALLOC_KEEP_SIZE flags value, which
resulted in a kernel Warning in notify_change().
Lock the inode unconditionally, like all other fallocate implementations
do.
In case of Gen12 video and compute engines, TLB_INV registers are masked -
to modify one bit, corresponding bit in upper half of the register must
be enabled, otherwise nothing happens.
CVE: CVE-2022-4139 Suggested-by: Chris Wilson <chris.p.wilson@intel.com> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Fixes: 2947b9324e2f ("drm/i915: Flush TLBs before releasing backing store") Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since switching to HMM we always need that because we no longer grab
references to the pages.
Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> CC: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There's been a very long running bug that seems to have been neglected for
a while, where amdgpu consistently triggers a KASAN error at start:
BUG: KASAN: global-out-of-bounds in read_indirect_azalia_reg+0x1d4/0x2a0 [amdgpu]
Read of size 4 at addr ffffffffc2274b28 by task modprobe/1889
After digging through amd's rather creative method for accessing registers,
I eventually discovered the problem likely has to do with the fact that on
my dce120 GPU there are supposedly 7 sets of audio registers. But we only
define a register mapping for 6 sets.
So, fix this and fix the KASAN warning finally.
Signed-off-by: Lyude Paul <lyude@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Although kset_unregister() can eventually remove all attribute files,
explicitly rolling back with the matching function makes the code logic
look clearer.
CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
btrfs_ioctl_ino_to_path() frees the search path after the userspace copy
from the temp buffer @ipath->fspath. Which potentially can lead to a lock
splat warning.
Fix this by freeing the path before we copy it to userspace.
CC: stable@vger.kernel.org # 4.19+ Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We do generally the right thing here, copying the references into a
temporary buffer, however we are still holding the path when we do
copy_to_user from the temporary buffer. Fix this by freeing the path
before we copy to user space.
Reported-by: syzbot+4ef9e52e464c6ff47d9d@syzkaller.appspotmail.com CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Some android userspace is sending BINDER_TYPE_FDA objects with
num_fds=0. Like the previous patch, this is reproducible when
playing a video.
Before commit a44ad4a107ad BINDER_TYPE_FDA objects with num_fds=0
were 'correctly handled', as in no fixup was performed.
After commit a44ad4a107ad we aggregate fixup and skip regions in
binder_ptr_fixup structs and distinguish between the two by using
the skip_size field: if it's 0, then it's a fixup, otherwise skip.
When processing BINDER_TYPE_FDA objects with num_fds=0 we add a
skip region of skip_size=0, and this causes issues because now
binder_do_deferred_txn_copies will think this was a fixup region.
To address that, return early from binder_translate_fd_array to
avoid adding an empty skip region.
When handling BINDER_TYPE_FDA object we are pushing a parent fixup
with a certain skip_size but no scatter-gather copy object, since
the copy is handled standalone.
If BINDER_TYPE_FDA is the last children the scatter-gather copy
loop will never stop to skip it, thus we are left with an item in
the parent fixup list. This will trigger the BUG_ON().
This is reproducible in android when playing a video.
We receive a transaction that looks like this:
obj[0] BINDER_TYPE_PTR, parent
obj[1] BINDER_TYPE_PTR, child
obj[2] BINDER_TYPE_PTR, child
obj[3] BINDER_TYPE_FDA, child
binder_uintptr_t is not the same as uintptr_t, so converting it into a
pointer requires a second cast:
drivers/android/binder.c: In function 'binder_translate_fd_array':
drivers/android/binder.c:2511:28: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
2511 | sender_ufda_base = (void __user *)sender_uparent->buffer + fda->parent_offset;
| ^
BINDER_TYPE_PTR objects point to memory areas in the
source process to be copied into the target buffer
as part of a transaction. This implements a scatter-
gather model where non-contiguous memory in a source
process is "gathered" into a contiguous region in
the target buffer.
The data can include pointers that must be fixed up
to correctly point to the copied data. To avoid making
source process pointers visible to the target process,
this patch defers the copy until the fixups are known
and then copies and fixeups are done together.
There is a special case of BINDER_TYPE_FDA which applies
the fixup later in the target process context. In this
case the user data is skipped (so no untranslated fds
become visible to the target).
This patch is to prepare for an up coming patch where we read
pre-translated fds from the sender buffer and translate them before
copying them to the target. It does not change run time.
The patch adds two new parameters to binder_translate_fd_array() to
hold the sender buffer and sender buffer parent. These parameters let
us call copy_from_user() directly from the sender instead of using
binder_alloc_copy_from_buffer() to copy from the target. Also the patch
adds some new alignment checks. Previously the alignment checks would
have been done in a different place, but this lets us print more
useful error messages.
Transactions are copied from the sender to the target
first and objects like BINDER_TYPE_PTR and BINDER_TYPE_FDA
are then fixed up. This means there is a short period where
the sender's version of these objects are visible to the
target prior to the fixups.
Instead of copying all of the data first, copy data only
after any needed fixups have been applied.
This commit flushes the journal on suspend. It is prerequisite for the
next commit that enables activating dm integrity devices in read-only mode.
Note that we deliberately didn't flush the journal on suspend, so that the
journal replay code would be tested. However, the dm-integrity code is 5
years old now, so that journal replay is well-tested, and we can make this
change now.
pci_get_device() will increase the reference count for the returned
pci_dev. We need to use pci_dev_put() to decrease the reference count
before asus_wmi_set_xusb2pr() returns.
RS485-enabled UART ports on TI Sitara SoCs with active-low polarity
exhibit a Transmit Enable glitch on ->set_termios():
omap8250_restore_regs(), which is called from omap_8250_set_termios(),
sets the TCRTLR bit in the MCR register and clears all other bits,
including RTS. If RTS uses active-low polarity, it is now asserted
for no reason.
The TCRTLR bit is subsequently cleared by writing up->mcr to the MCR
register. That variable is always zero, so the RTS bit is still cleared
(incorrectly so if RTS is active-high).
(up->mcr is not, as one might think, a cache of the MCR register's
current value. Rather, it only caches a single bit of that register,
the AFE bit. And it only does so if the UART supports the AFE bit,
which OMAP does not. For details see serial8250_do_set_termios() and
serial8250_do_set_mctrl().)
Finally at the end of omap8250_restore_regs(), the MCR register is
restored (and RTS deasserted) by a call to up->port.ops->set_mctrl()
(which equals serial8250_set_mctrl()) and serial8250_em485_stop_tx().
So there's an RTS glitch between setting TCRTLR and calling
serial8250_em485_stop_tx(). Avoid by using a read-modify-write
when setting TCRTLR.
While at it, drop a redundant initialization of up->mcr. As explained
above, the variable isn't used by the driver and it is already
initialized to zero because it is part of the static struct
serial8250_ports[] declared in 8250_core.c. (Static structs are
initialized to zero per section 6.7.8 nr. 10 of the C99 standard.)
Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Su Bao Cheng <baocheng.su@siemens.com> Tested-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Link: https://lore.kernel.org/r/6554b0241a2c7fd50f32576fdbafed96709e11e8.1664278942.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
The Nanote UMPC-01 mini laptop has stereo speakers, while the default
bytcht_es8316 settings assume a mono speaker setup. Add a quirk for this.
Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Link: https://lore.kernel.org/r/20221025140942.509066-1-hdegoede@redhat.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently, in clang version of gcov code when module is getting removed
gcov_info_add() incorrectly adds the sfn_ptr->counter to all the
dst->functions and it result in the kernel panic in below crash report.
Fix this by properly handling it.
When extending segments, nilfs_sufile_alloc() is called to get an
unassigned segment, then mark it as dirty to avoid accidentally allocating
the same segment in the future.
But for some special cases such as a corrupted image it can be unreliable.
If such corruption of the dirty state of the segment occurs, nilfs2 may
reallocate a segment that is in use and pick the same segment for writing
twice at the same time.
This case started with segbuf1.segnum = 3, nextnum = 4 when constructed.
It supposed segment 4 has already been allocated and marked as dirty.
However the dirty state was corrupted and segment 4 usage was not dirty.
For the first time nilfs_segctor_extend_segments() segment 4 was allocated
again, which made segbuf2 and next segbuf3 had same segment 4.
sb_getblk() will get same bh for segbuf2 and segbuf3, and this bh is added
to both buffer lists of two segbuf. It makes the lists broken which
causes NULL pointer dereference.
Fix the problem by setting usage as dirty every time in
nilfs_sufile_mark_dirty(), which is called during constructing current
segment to be written out and before allocating next segment.
The coreboot_table driver registers a coreboot bus while probing a
"coreboot_table" device representing the coreboot table memory region.
Probing this device (i.e., registering the bus) is a dependency for the
module_init() functions of any driver for this bus (e.g.,
memconsole-coreboot.c / memconsole_driver_init()).
With synchronous probe, this dependency works OK, as the link order in
the Makefile ensures coreboot_table_driver_init() (and thus,
coreboot_table_probe()) completes before a coreboot device driver tries
to add itself to the bus.
With asynchronous probe, however, coreboot_table_probe() may race with
memconsole_driver_init(), and so we're liable to hit one of these two:
1. coreboot_driver_register() eventually hits "[...] the bus was not
initialized.", and the memconsole driver fails to register; or
2. coreboot_driver_register() gets past #1, but still races with
bus_register() and hits some other undefined/crashing behavior (e.g.,
in driver_find() [1])
We can resolve this by registering the bus in our initcall, and only
deferring "device" work (scanning the coreboot memory region and
creating sub-devices) to probe().
[1] Example failure, using 'driver_async_probe=*' kernel command line:
Fix a bug where the kernel module can't be loaded after it has been
unloaded as the devices are still present and conflicting with the
to be created coreboot devices.
When decoding the snaps fails it maybe leaving the 'first_realm'
and 'realm' pointing to the same snaprealm memory. And then it'll
put it twice and could cause random use-after-free, BUG_ON, etc
issues.
We will only track the uppest parent snapshot realm from which we
need to rebuild the snapshot contexts _downward_ in hierarchy. For
all the others having no new snapshot we will do nothing.
This fix will avoid calling ceph_queue_cap_snap() on some inodes
inappropriately. For example, with the code in mainline, suppose there
are 2 directory hierarchies (with 6 directories total), like this:
/dir_X1/dir_X2/dir_X3/
/dir_Y1/dir_Y2/dir_Y3/
Firstly, make a snapshot under /dir_X1/dir_X2/.snap/snap_X2, then make a
root snapshot under /.snap/root_snap. Every time we make snapshots under
/dir_Y1/..., the kclient will always try to rebuild the snap context for
snap_X2 realm and finally will always try to queue cap snaps for dir_Y2
and dir_Y3, which makes no sense.
That's because the snap_X2's seq is 2 and root_snap's seq is 3. So when
creating a new snapshot under /dir_Y1/... the new seq will be 4, and
the mds will send the kclient a snapshot backtrace in _downward_
order: seqs 4, 3.
When ceph_update_snap_trace() is called, it will always rebuild the from
the last realm, that's the root_snap. So later when rebuilding the snap
context, the current logic will always cause it to rebuild the snap_X2
realm and then try to queue cap snaps for all the inodes related in that
realm, even though it's not necessary.
This is accompanied by a lot of these sorts of dout messages:
Also, the 'invalidate' word is not precise here. In actuality, it will
cause a rebuild of the existing snapshot contexts or just build
non-existent ones. Rename it to 'rebuild_snapcs'.
URL: https://tracker.ceph.com/issues/44100 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Stable-dep-of: 848e3060bf94 ("ceph: avoid putting the realm twice when decoding snaps fails") Signed-off-by: Sasha Levin <sashal@kernel.org>
When using multiple instances of this driver the compensation PROM was
overwritten by the last initialized sensor. Now each sensor has own PROM
storage.
Signed-off-by: Mitja Spes <mitja@lxnav.com> Fixes: aa0343cce1c1 ("iio: pressure: ms5611: add support for MS5607 temperature and pressure sensor") Link: https://lore.kernel.org/r/20221021135827.1444793-2-mitja@lxnav.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The ms5611 passes &indio_dev->dev as a parameter to all its IO callbacks
only to directly cast the struct device back to struct iio_dev. And the
struct iio_dev is then only used to get the drivers state struct.
Simplify this a bit by passing the state struct directly. This makes it a
bit easier to follow what the code is doing.
When using dash as /bin/sh, the CC_HAS_ASM_GOTO_TIED_OUTPUT test fails
with a syntax error which is not the one we are looking for:
<stdin>: In function ‘foo’:
<stdin>:1:29: warning: missing terminating " character
<stdin>:1:29: error: missing terminating " character
<stdin>:2:5: error: expected ‘:’ before ‘+’ token
<stdin>:2:7: warning: missing terminating " character
<stdin>:2:7: error: missing terminating " character
<stdin>:2:5: error: expected declaration or statement at end of input
CRC errors (code -84 EILSEQ) have been observed for some SanDisk
Ultra A1 cards when running at 50MHz.
Waveform analysis suggest that the level shifters that are used on the
RK3399-Q7 module for voltage translation between 3.0 and 3.3V don't
handle clock rates at or above 48MHz properly. Back off to 40MHz for
some safety margin.
The core DWC3 device node was not properly removed by the custom
dwc3_exynos_remove_child() function. Replace it with generic
of_platform_depopulate() which does that job right.
Fixes: 363ff3d00045 ("usb: dwc3: exynos: Use of_platform API to create dwc3 core pdev") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Cc: stable@vger.kernel.org Reviewed-by: Sam Protsenko <semen.protsenko@linaro.org> Link: https://lore.kernel.org/r/20221110154131.2577-1-m.szyprowski@samsung.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The latest version of grep claims the egrep is now obsolete so the build
now contains warnings that look like:
egrep: warning: egrep is obsolescent; using grep -E
fix this up by moving the vdso Makefile to use "grep -E" instead.
The size of the TOD programmable field was incorrectly increased from
four to eight bytes with commit 6ac8c3d942dd ("s390/dump: cleanup CPU
save area handling").
This leads to an elf notes section NT_S390_TODPREG which has a size of
eight instead of four bytes in case of kdump, however even worse is
that the contents is incorrect: it is supposed to contain only the
contents of the TOD programmable field, but in fact contains a mix of
the TOD programmable field (32 bit upper bits) and parts of the CPU
timer register (lower 32 bits).
Fix this by simply changing the size of the todpreg field within the
save area structure. This will implicitly also fix the size of the
corresponding elf notes sections.
This also gets rid of this compile time warning:
in function ‘fortify_memcpy_chk’,
inlined from ‘save_area_add_regs’ at arch/s390/kernel/crash_dump.c:99:2:
./include/linux/fortify-string.h:413:25: error: call to ‘__read_overflow2_field’
declared with attribute warning: detected read beyond size of field
(2nd parameter); maybe use struct_group()? [-Werror=attribute-warning]
413 | __read_overflow2_field(q_size_field, size);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: 6ac8c3d942dd ("s390/dump: cleanup CPU save area handling") Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The ACPI buffer memory (string.pointer) should be freed as the buffer is
not used after returning from bgx_acpi_match_id(), free it to prevent
memory leak.
Fixes: 8144d1b1e9ee ("net, thunder, bgx: Add support to get MAC address from ACPI.") Signed-off-by: Yu Liao <liaoyu15@huawei.com> Link: https://lore.kernel.org/r/20221123082237.1220521-1-liaoyu15@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
The first validation check for EVT_TRANSACTION has two different checks
tied together with logical AND. One is a check for minimum packet length,
and the other is for a valid aid_tag. If either condition is true (fails),
then an error should be triggered. The fix is to change && to ||.
For DASD devices in raw_track_access mode only full track images are
read and written.
For this purpose it is not necessary to do search operation in the
locate record extended function. The documentation even states that
this might fail if the searched record is not found on a track.
Currently the driver sets a value of 1 in the search field for the first
record after record zero. This is the default for disks not in
raw_track_access mode but record 1 might be missing on a completely
empty track.
There has not been any problem with this on IBM storage servers but it
might lead to errors with DASD devices on other vendors storage servers.
Fix this by setting the search field to 0. Record zero is always available
even on a completely empty track.
Fixes: c007a8beb930 ("[S390] dasd: Add support for raw ECKD access.") Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Link: https://lore.kernel.org/r/20221123160719.3002694-4-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
When connect() is called on a socket bound to the wildcard address,
we change the socket's saddr to a local address. If the socket
fails to connect() to the destination, we have to reset the saddr.
However, when an error occurs after inet_hash6?_connect() in
(dccp|tcp)_v[46]_conect(), we forget to reset saddr and leave
the socket bound to the address.
From the user's point of view, whether saddr is reset or not varies
with errno. Let's fix this inconsistent behaviour.
Note that after this patch, the repro [0] will trigger the WARN_ON()
in inet_csk_get_port() again, but this patch is not buggy and rather
fixes a bug papering over the bhash2's bug for which we need another
fix.
For the record, the repro causes -EADDRNOTAVAIL in inet_hash6_connect()
by this sequence:
As comment of pci_get_domain_bus_and_slot() says, it returns
a pci device with refcount increment, when finish using it,
the caller must decrement the reference count by calling
pci_dev_put(). Call pci_dev_put() before returning from
bnx2x_vf_is_pcie_pending() to avoid refcount leak.
Fixes: fffee62ad6e4 ("bnx2x: Prepare device and initialize VF database") Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20221119070202.1407648-1-yangyingliang@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
In former times, info->feature was populated via the parent driver
by pdata/regulator_init_data->driver_data for all regulators when
USB_PRODUCT_ID_LSB indicates a TWL6032.
Today, the information is not set, so re-add it at the regulator
definitions.
Fixes: 44fc21d999a97 ("regulator: twl: make driver DT only") Signed-off-by: Andreas Kemnade <andreas@kemnade.info> Link: https://lore.kernel.org/r/20221120221208.3093727-2-andreas@kemnade.info Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
One extra conn_get() is needed in tipc_conn_alloc(), as after
tipc_conn_alloc() is called, tipc_conn_close() may free this
con before deferencing it in tipc_topsrv_accept():
This patch fixes it by holding it in tipc_conn_alloc(), then after
all accessing in tipc_topsrv_accept() releasing it. Note when does
this in tipc_topsrv_kern_subscr(), as tipc_conn_rcv_sub() returns
0 or -1 only, we don't need to check for "> 0".
Fixes: d111e95686f3 ("tipc: introduce new TIPC server infrastructure") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
It was caused by !con->sock in tipc_conn_close(). In tipc_topsrv_accept(),
con is allocated in conn_idr then its sock is set:
con = tipc_conn_alloc();
... <----[1]
con->sock = newsock;
If tipc_conn_close() is called in anytime of [1], the null-pointer-def
is triggered by con->sock->sk due to con->sock is not yet set.
This patch fixes it by moving the con->sock setting to tipc_conn_alloc()
under s->idr_lock. So that con->sock can never be NULL when getting the
con from s->conn_idr. It will be also safer to move con->server and flag
CF_CONNECTED setting under s->idr_lock, as they should all be set before
tipc_conn_alloc() is called.
Fixes: d111e95686f3 ("tipc: introduce new TIPC server infrastructure") Reported-by: Wei Chen <harperchen1110@gmail.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fix a bug in calculation of FW tracer timestamp. Decreasing one in the
calculation should effect only bits 52_7 and not effect bits 6_0 of the
timestamp, otherwise bits 6_0 are always set in this calculation.
If device_register() returns error in vmbus_device_register(),
the name allocated by dev_set_name() must be freed. As comment
of device_register() says, it should use put_device() to give
up the reference in the error path. So fix this by calling
put_device(), then the name can be freed in kobject_cleanup().
Fixes: eba68b3942d9 ("Staging: hv: make the Hyper-V virtual bus code build") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20221119081135.1564691-3-yangyingliang@huawei.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In the error path of vmbus_device_register(), device_unregister()
is called, which calls vmbus_device_release(). The latter frees
the struct hv_device that was passed in to vmbus_device_register().
So remove the kfree() in vmbus_add_channel_work() to avoid a double
free.
Fixes: 03d76a8b8a52 ("vmbus: add per-channel sysfs info") Suggested-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20221119081135.1564691-2-yangyingliang@huawei.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Setting of the port flag `NFP_PORT_CHANGED`, introduced
to ensure the correct reading of EEPROM data, causes a
fatal kernel NULL pointer dereference in cases where
the target netdev type cannot be determined.
Add validation of port struct pointer before attempting
to set the `NFP_PORT_CHANGED` flag. Return that operation
is not supported if the netdev type cannot be determined.
Fixes: 57b063516ab7 ("nfp: ethtool: fix the display error of `ethtool -m DEVNAME`") Signed-off-by: Jaco Coetzee <jaco.coetzee@corigine.com> Reviewed-by: Louis Peens <louis.peens@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
As comment of pci_get_domain_bus_and_slot() says, it returns
a pci device with refcount increment, when finish using it,
the caller must decrement the reference count by calling
pci_dev_put().
In pch_gbe_probe(), pci_get_domain_bus_and_slot() is called,
so in error path in probe() and remove() function, pci_dev_put()
should be called to avoid refcount leak. Compile tested only.
The process is as follows:
Thread A: Thread B:
p9_poll_workfn() p9_client_create()
... ...
p9_conn_cancel() p9_fd_cancel()
list_del() ...
... list_del() //list_del
corruption
There is no lock protection when deleting list in p9_conn_cancel(). After
deleting list in Thread A, thread B will delete the same list again. It
will cause issue of list_del corruption.
Setting req->status to REQ_STATUS_ERROR under lock prevents other
cleanup paths from trying to manipulate req_list.
The other thread can safely check req->status because it still holds a
reference to req at this point.
Link: https://lkml.kernel.org/r/20221110122606.383352-1-shaozhengchao@huawei.com Fixes: f2265193a1b4 ("9p: trans_fd/p9_conn_cancel: drop client lock earlier") Reported-by: syzbot+9b69b8d10ab4a7d88056@syzkaller.appspotmail.com Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
[Dominique: add description of the fix in commit message] Signed-off-by: Dominique Martinet <asmadeus@codewreck.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
In pch_gbe_xmit_frame(), NETDEV_TX_OK will be returned whether
pch_gbe_tx_queue() sends data successfully or not, so pch_gbe_tx_queue()
needs to free skb before returning. But pch_gbe_tx_queue() returns without
freeing skb in case of dma_map_single() fails. Add dev_kfree_skb_any()
to fix it.
Fixes: d818c768e816 ("net: Add Gigabit Ethernet driver of Topcliff PCH") Signed-off-by: Wang Hai <wanghai38@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
Previously we leverage NCI_UNREG and the lock inside nci_close_device to
prevent the race condition between opening a device and closing a
device. However, it still has problem because a failed opening command
will erase the NCI_UNREG flag and allow another opening command to
bypass the status checking.
This fix corrects that by making sure the NCI_UNREG is held.
Reported-by: syzbot+43475bf3cfbd6e41f5b7@syzkaller.appspotmail.com Fixes: 5437906a10e9 ("NFC: add NCI_UNREG flag to eliminate the race") Signed-off-by: Lin Ma <linma@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
cocci warnings: (new ones prefixed by >>)
>> drivers/net/ethernet/cavium/liquidio/lio_main.c:1797:54-56: WARNING !A || A && B is equivalent to !A || B
drivers/net/ethernet/cavium/liquidio/lio_main.c:1827:54-56: WARNING !A || A && B is equivalent to !A || B
Fixes: 822507ec9f0b ("net: liquidio: release resources when liquidio driver open failed") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Saeed Mahameed <saeed@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
We set the PIOC to GPIO mode. This way the pin becomes an
input signal will be usable by the controller. Without
this change the udc on the 9g20ek does not work.
If device_register() returns error in optee_register_device(),
the name allocated by dev_set_name() need be freed. As comment
of device_register() says, it should use put_device() to give
up the reference in the error path. So fix this by calling
put_device(), then the name can be freed in kobject_cleanup(),
and optee_device is freed in optee_release_device().
Fixes: a8611d81269b ("tee: optee: add TEE bus device enumeration support") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Sumit Garg <sumit.garg@linaro.org> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
When communicating with a PMIC during system poweroff (pm_power_off()),
IRQs are disabled and we are in a RCU read-side critical section, so we
cannot use wait_for_completion_io_timeout(). Instead, poll the status
register for transfer completion.
Fixes: 2982481577db ("bus: sunxi-rsb: Add driver for Allwinner Reduced Serial Bus") Signed-off-by: Samuel Holland <samuel@sholland.org> Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://lore.kernel.org/r/20221114015749.28490-3-samuel@sholland.org Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
processor A processor B
regulator_register()
rdev_init_debugfs()
rdev->debugfs = debugfs_create_dir()
devm_regulator_get()
rdev = regulator_dev_lookup()
create_regulator(rdev)
// using rdev->debugfs as parent
debugfs_create_dir(rdev->debugfs)
In devm_regulator_get(), after getting rdev, the refcount
is get, so fix this by moving debugfs_remove_recursive()
to regulator_dev_release(), then it can be proctected by
the refcount, the 'rdev->debugfs' can not be freed until
the refcount is 0.
Here is a warning report about lack of registered release()
from kobject lib:
Device '(null)' does not have a release() function, it is broken and must be fixed.
WARNING: CPU: 0 PID: 48430 at drivers/base/core.c:2332 device_release+0x104/0x120
Call Trace:
kobject_put+0xdc/0x180
put_device+0x1b/0x30
regulator_register+0x651/0x1170
devm_regulator_register+0x4f/0xb0
When regulator_register() returns fail and directly goto `clean` symbol,
rdev->dev has not registered release() function yet (which is registered
by regulator_class in the following), so rdev needs to be freed manually.
If rdev->dev.of_node is not NULL, which means the of_node has gotten by
regulator_of_get_init_data(), it needs to call of_node_put() to avoid
refcount leak.
Otherwise, only calling put_device() would lead memory leak of rdev
in further:
When regulator_register() returns fail and goto `wash` symbol,
rdev->dev has registered release() function, so directly call
put_device() to cleanup everything.
Fixes: 82efdee9b09c ("regulator: plug of_node leak in regulator_register()'s error path") Signed-off-by: Zeng Heng <zengheng4@huawei.com> Link: https://lore.kernel.org/r/20221116074339.1024240-1-zengheng4@huawei.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Since commit 4aa9a74a7f12 ("ASoC: sgtl5000: Fix noise on shutdown/remove"),
the device power control registers are reset when the driver is
removed/shutdown.
This is an issue when the device is configured to use the PLL clock. The
device will stop responding if it is still configured to use the PLL
clock but the PLL clock is powered down.
When rebooting linux, the probe function will show:
sgtl5000 0-000a: Error reading chip id -11
Make sure that the CHIP_CLK_CTRL is reset to its default value before
powering down the device.
Fixes: 4aa9a74a7f12 ("ASoC: sgtl5000: Fix noise on shutdown/remove") Signed-off-by: Detlev Casanova <detlev.casanova@collabora.com> Reviewed-by: Fabio Estevam <festevam@gmail.com> Link: https://lore.kernel.org/r/20221110190612.1341469-1-detlev.casanova@collabora.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Remove the regulators node and define fixed regulators in the root node.
Prevents the sdhci-omap driver from waiting in probe deferral forever
because of the missing vmmc-supply and keeps am335x-pcm-953 consistent with
the other Phytec AM335 boards.
Fixes: 2bddee75b155 ("ARM: dts: Add support for phyCORE-AM335x PCM-953 carrier board") Signed-off-by: Dominik Haller <d.haller@phytec.de>
Message-Id: <20221011143115.248003-1-d.haller@phytec.de> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
get_port_from_cmdline() returns an int, yet is assigned to a char, which
is wrong in its own right, but also, with char becoming unsigned, this
poses problems, because -1 is used as an error value. Further
complicating things, fw_init_early_console() is only ever called with a
-1 argument. Fix this up by removing the unused argument from
fw_init_early_console() and treating port as a proper signed integer.
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
Recently, ld.lld moved from '--undefined-version' to
'--no-undefined-version' as the default, which breaks the compat vDSO
build:
ld.lld: error: version script assignment of 'LINUX_4.15' to symbol '__vdso_gettimeofday' failed: symbol not defined
ld.lld: error: version script assignment of 'LINUX_4.15' to symbol '__vdso_clock_gettime' failed: symbol not defined
ld.lld: error: version script assignment of 'LINUX_4.15' to symbol '__vdso_clock_getres' failed: symbol not defined
These symbols are not present in the compat vDSO or the regular vDSO for
32-bit but they are unconditionally included in the version section of
the linker script, which is prohibited with '--no-undefined-version'.
Fix this issue by only including the symbols that are actually exported
in the version section of the linker script.
Add the same change for ARM64 as done in the commit 33ed529a3375
("x86/syscall: Include asm/ptrace.h in syscall_wrapper header") to
make sure all syscalls see 'struct pt_regs' definition and resulted
BTF for '__arm64_sys_*(struct pt_regs *regs)' functions point to
actual struct.
Without this patch, the BPF verifier refuses to load a tracing prog
which accesses pt_regs.
blkcg_unpin_online
blkcg_destroy_blkgs
blkg_destroy
list_del_init(&blkg->q_node)
// blkg is removed from queue list
6) switch elevator back to bfq
bfq_init_queue
bfq_create_group_hierarchy
blkcg_activate_policy
list_for_each_entry_reverse(blkg, &q->blkg_list)
// blkg is removed from list, hence bfq policy is still NULL
7) throttled io is dispatched to bfq:
bfq_insert_requests
bfq_init_rq
bfq_bic_update_cgroup
bfq_bio_bfqg
bfqg = blkg_to_bfqg(blkg)
// bfqg is NULL because bfq policy is NULL
The problem is only possible in bfq because only bfq can be deactivated and
activated while queue is online, while others can only be deactivated while
the device is removed.
Fix the problem in bfq by checking if blkg is online before calling
blkg_to_bfqg().
Like the Acer Switch One 10 S1003, for which there already is a quirk,
the Acer Switch V 10 (SW5-017) has a 800x1280 portrait screen mounted
in the tablet part of a landscape oriented 2-in-1. Add a quirk for this.
When this driver is used with a driver that uses preallocated spi_transfer
structs. The speed_hz is halved by every run. This results in:
spi_stm32 44004000.spi: SPI transfer setup failed
ads7846 spi0.0: SPI transfer failed: -22
Example when running with DIV_ROUND_UP():
- First run; speed_hz = 1000000, spi->clk_rate 125000000
div 125 -> mbrdiv = 7, cur_speed = 976562
- Second run; speed_hz = 976562
div 128,00007 (roundup to 129) -> mbrdiv = 8, cur_speed = 488281
- Third run; speed_hz = 488281
div 256,000131072067109 (roundup to 257) and then -EINVAL is returned.
Use DIV_ROUND_CLOSEST to allow to round down and allow us to keep the
set speed.
When trying to transmit an data frame with tx_status to a destination
that have no route in the mesh, then it is dropped without recrediting
the ack_status_frames idr.
Once it is exhausted, wpa_supplicant starts failing to do SAE with
NL80211_CMD_FRAME and logs "nl80211: Frame command failed".
Use ieee80211_free_txskb() instead of kfree_skb() to fix it.
Shifting signed 32-bit value by 31 bits is undefined, so changing
significant bit to unsigned. The UBSAN warning calltrace like below:
UBSAN: shift-out-of-bounds in kernel/auditfilter.c:179:23
left shift of 1 by 31 places cannot be represented in type 'int'
Call Trace:
<TASK>
dump_stack_lvl+0x7d/0xa5
dump_stack+0x15/0x1b
ubsan_epilogue+0xe/0x4e
__ubsan_handle_shift_out_of_bounds+0x1e7/0x20c
audit_register_class+0x9d/0x137
audit_classes_init+0x4d/0xb8
do_one_initcall+0x76/0x430
kernel_init_freeable+0x3b3/0x422
kernel_init+0x24/0x1e0
ret_from_fork+0x1f/0x30
</TASK>
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
[PM: remove bad 'Fixes' tag as issue predates git, added in v2.6.6-rc1] Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Fixes a warning that occurs when rc table support is enabled
(IEEE80211_HW_SUPPORTS_RC_TABLE) in mac80211_hwsim and the PS mode
is changed via the exported debugfs attribute.
When the PS mode is changed, a packet is broadcasted via
hwsim_send_nullfunc by creating and transmitting a plain skb with only
header initialized. The ieee80211 rate array in the control buffer is
zero-initialized. When ratetbl support is enabled, ieee80211_get_tx_rates
is called for the skb with sta parameter set to NULL and thus no
ratetbl can be used. The final rate array then looks like
[-1,0; 0,0; 0,0; 0,0] which causes the warning in ieee80211_get_tx_rate.
The issue is fixed by setting the count of the first rate with idx '0'
to 1 and hence ieee80211_get_tx_rates won't overwrite it with idx '-1'.
Signed-off-by: Jonas Jelonek <jelonek.jonas@gmail.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Kernel iterates over ATTR_RECORDs in mft record in ntfs_attr_find().
Because the ATTR_RECORDs are next to each other, kernel can get the next
ATTR_RECORD from end address of current ATTR_RECORD, through current
ATTR_RECORD length field.
The problem is that during iteration, when kernel calculates the end
address of current ATTR_RECORD, kernel may trigger an integer overflow bug
in executing `a = (ATTR_RECORD*)((u8*)a + le32_to_cpu(a->length))`. This
may wrap, leading to a forever iteration on 32bit systems.
This patch solves it by adding some checks on calculating end address
of current ATTR_RECORD during iteration.
Kernel iterates over ATTR_RECORDs in mft record in ntfs_attr_find(). To
ensure access on these ATTR_RECORDs are within bounds, kernel will do some
checking during iteration.
The problem is that during checking whether ATTR_RECORD's name is within
bounds, kernel will dereferences the ATTR_RECORD name_offset field, before
checking this ATTR_RECORD strcture is within bounds. This problem may
result out-of-bounds read in ntfs_attr_find(), reported by Syzkaller:
==================================================================
BUG: KASAN: use-after-free in ntfs_attr_find+0xc02/0xce0 fs/ntfs/attrib.c:597
Read of size 2 at addr ffff88807e352009 by task syz-executor153/3607
This patch solves it by moving the ATTR_RECORD strcture's bounds checking
earlier, then checking whether ATTR_RECORD's name is within bounds.
What's more, this patch also add some comments to improve its
maintainability.
Patch series "ntfs: fix bugs about Attribute", v2.
This patchset fixes three bugs relative to Attribute in record:
Patch 1 adds a sanity check to ensure that, attrs_offset field in first
mft record loading from disk is within bounds.
Patch 2 moves the ATTR_RECORD's bounds checking earlier, to avoid
dereferencing ATTR_RECORD before checking this ATTR_RECORD is within
bounds.
Patch 3 adds an overflow checking to avoid possible forever loop in
ntfs_attr_find().
Without patch 1 and patch 2, the kernel triggersa KASAN use-after-free
detection as reported by Syzkaller.
Although one of patch 1 or patch 2 can fix this, we still need both of
them. Because patch 1 fixes the root cause, and patch 2 not only fixes
the direct cause, but also fixes the potential out-of-bounds bug.
This patch (of 3):
Syzkaller reported use-after-free read as follows:
==================================================================
BUG: KASAN: use-after-free in ntfs_attr_find+0xc02/0xce0 fs/ntfs/attrib.c:597
Read of size 2 at addr ffff88807e352009 by task syz-executor153/3607
Kernel will loads $MFT/$DATA's first mft record in
ntfs_read_inode_mount().
Yet the problem is that after loading, kernel doesn't check whether
attrs_offset field is a valid value.
To be more specific, if attrs_offset field is larger than bytes_allocated
field, then it may trigger the out-of-bounds read bug(reported as
use-after-free bug) in ntfs_attr_find(), when kernel tries to access the
corresponding mft record's attribute.
This patch solves it by adding the sanity check between attrs_offset field
and bytes_allocated field, after loading the first mft record.
Functions implementing the a_ops->write_end() interface accept the `void
*fsdata` parameter that is supposed to be initialized by the corresponding
a_ops->write_begin() (which accepts `void **fsdata`).
However not all a_ops->write_begin() implementations initialize `fsdata`
unconditionally, so it may get passed uninitialized to a_ops->write_end(),
resulting in undefined behavior.
Fix this by initializing fsdata with NULL before the call to
write_begin(), rather than doing so in all possible a_ops implementations.
This patch covers only the following cases found by running x86 KMSAN
under syzkaller:
- generic_perform_write()
- cont_expand_zero() and generic_cont_expand_simple()
- page_symlink()
Other cases of passing uninitialized fsdata may persist in the codebase.
Link: https://lkml.kernel.org/r/20220915150417.722975-43-glider@google.com Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrey Konovalov <andreyknvl@gmail.com> Cc: Andrey Konovalov <andreyknvl@google.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Christoph Hellwig <hch@lst.de> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Eric Biggers <ebiggers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kees Cook <keescook@chromium.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Petr Mladek <pmladek@suse.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vegard Nossum <vegard.nossum@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
syzbot is reporting hung task at p9_fd_close() [1], for p9_mux_poll_stop()
from p9_conn_destroy() from p9_fd_close() is failing to interrupt already
started kernel_read() from p9_fd_read() from p9_read_work() and/or
kernel_write() from p9_fd_write() from p9_write_work() requests.
Since p9_socket_open() sets O_NONBLOCK flag, p9_mux_poll_stop() does not
need to interrupt kernel_read()/kernel_write(). However, since p9_fd_open()
does not set O_NONBLOCK flag, but pipe blocks unless signal is pending,
p9_mux_poll_stop() needs to interrupt kernel_read()/kernel_write() when
the file descriptor refers to a pipe. In other words, pipe file descriptor
needs to be handled as if socket file descriptor.
We somehow need to interrupt kernel_read()/kernel_write() on pipes.
A minimal change, which this patch is doing, is to set O_NONBLOCK flag
from p9_fd_open(), for O_NONBLOCK flag does not affect reading/writing
of regular files. But this approach changes O_NONBLOCK flag on userspace-
supplied file descriptors (which might break userspace programs), and
O_NONBLOCK flag could be changed by userspace. It would be possible to set
O_NONBLOCK flag every time p9_fd_read()/p9_fd_write() is invoked, but still
remains small race window for clearing O_NONBLOCK flag.
If we don't want to manipulate O_NONBLOCK flag, we might be able to
surround kernel_read()/kernel_write() with set_thread_flag(TIF_SIGPENDING)
and recalc_sigpending(). Since p9_read_work()/p9_write_work() works are
processed by kernel threads which process global system_wq workqueue,
signals could not be delivered from remote threads when p9_mux_poll_stop()
from p9_conn_destroy() from p9_fd_close() is called. Therefore, calling
set_thread_flag(TIF_SIGPENDING)/recalc_sigpending() every time would be
needed if we count on signals for making kernel_read()/kernel_write()
non-blocking.
Switch from strlcpy to strscpy and make sure that @count is the size of
the smaller of the source and destination buffers. This prevents
reading beyond the end of the source buffer when the source string isn't
null terminated.
Found by a modified version of syzkaller.
Suggested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>