Linus Torvalds [Sat, 24 Aug 2019 18:26:51 +0000 (11:26 -0700)]
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Four fixes, three for edge conditions which don't occur very often.
The lpfc fix mitigates memory exhaustion for some high CPU systems"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: lpfc: Mitigate high memory pre-allocation by SCSI-MQ
scsi: ufs: Fix NULL pointer dereference in ufshcd_config_vreg_hpm()
scsi: target: tcmu: avoid use-after-free after command timeout
scsi: qla2xxx: Fix gnl.l memory leak on adapter init failure
Linus Torvalds [Sat, 24 Aug 2019 18:21:26 +0000 (11:21 -0700)]
Merge tag 'xfs-5.3-fixes-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fix from Darrick Wong:
"A single patch that fixes a xfs lockup problem when a chown/chgrp
operation fails due to running out of quota. It has survived the usual
xfstests runs and merges cleanly with this morning's master:
- Fix a forgotten inode unlock when chown/chgrp fail due to quota"
* tag 'xfs-5.3-fixes-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT
Linus Torvalds [Sat, 24 Aug 2019 18:16:04 +0000 (11:16 -0700)]
Merge tag 'drm-fixes-2019-08-24' of git://anongit.freedesktop.org/drm/drm
Pull more drm fixes from Dave Airlie:
"Although the tree built for me fine on arm here, it appears either
header cleanups in next or some kconfig combo it breaks, so this
contains a fix to mediatek to include dma-mapping.h explicitly.
There was also one nouveau fix that came in late that I was going to
leave until next week, but since I was sending this I thought it may
as well be in here:
mediatek:
- fix build in some cases
nouveau:
- fix hang with i2c and mst docks"
* tag 'drm-fixes-2019-08-24' of git://anongit.freedesktop.org/drm/drm:
drm/mediatek: include dma-mapping header
drm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX
Dave Airlie [Sat, 24 Aug 2019 05:07:07 +0000 (15:07 +1000)]
drm/mediatek: include dma-mapping header
Although it builds fine here in my arm cross compile, it seems
either via some other patches in -next or some Kconfig combination,
this fails to build for everyone.
Linus Torvalds [Fri, 23 Aug 2019 21:53:09 +0000 (14:53 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Doug Ledford:
"No beating around the bush: this is a monster pull request for an -rc5
kernel. Intel hit me with a series of fixes for TID processing.
Mellanox hit me with a series for their UMR memory support.
And we had one fix for siw that fixes the 32bit build warnings and
because of the number of casts that had to be changed to properly
silence the warnings, that one patch alone is a full 40% of the LOC of
this entire pull request. Given that this is the initial release
kernel for siw, I'm trying to fix anything in it that we can, so that
adds to the impetus to take fixes for it like this one.
I had to do a rebase early in the week. Jason had thought he put a
patch on the rc queue that he needed to be there so he could base some
work off of it, and it had actually not been placed there. So he asked
me (on Tuesday) to fix that up before pushing my wip branch to the
official rc branch. I did, and that's why the early patches look like
they were all committed at the same time on Tuesday. That bunch had
been in my queue prior.
The various patches all pass my test for being legitimate fixes and
not attempts to slide new features or development into a late rc.
Well, they were all fixes with the exception of a couple clean up
patches people wrote for making the fixes they also wrote better (like
a cleanup patch to move UMR checking into a function so that the
remaining UMR fix patches can reference that function), so I left
those in place too.
My apologies for the LOC count and the number of patches here, it's
just how the cards fell this cycle.
Summary:
- Fix siw buffer mapping issue
- Fix siw 32/64 casting issues
- Fix a KASAN access issue in bnxt_re
- Fix several memory leaks (hfi1, mlx4)
- Fix a NULL deref in cma_cleanup
- Fixes for UMR memory support in mlx5 (4 patch series)
- Fix namespace check for restrack
- Fixes for counter support
- Fixes for hfi1 TID processing (5 patch series)
- Fix potential NULL deref in siw
- Fix memory page calculations in mlx5"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (21 commits)
RDMA/siw: Fix 64/32bit pointer inconsistency
RDMA/siw: Fix SGL mapping issues
RDMA/bnxt_re: Fix stack-out-of-bounds in bnxt_qplib_rcfw_send_message
infiniband: hfi1: fix memory leaks
infiniband: hfi1: fix a memory leak bug
IB/mlx4: Fix memory leaks
RDMA/cma: fix null-ptr-deref Read in cma_cleanup
IB/mlx5: Block MR WR if UMR is not possible
IB/mlx5: Fix MR re-registration flow to use UMR properly
IB/mlx5: Report and handle ODP support properly
IB/mlx5: Consolidate use_umr checks into single function
RDMA/restrack: Rewrite PID namespace check to be reliable
RDMA/counters: Properly implement PID checks
IB/core: Fix NULL pointer dereference when bind QP to counter
IB/hfi1: Drop stale TID RDMA packets that cause TIDErr
IB/hfi1: Add additional checks when handling TID RDMA WRITE DATA packet
IB/hfi1: Add additional checks when handling TID RDMA READ RESP packet
IB/hfi1: Unsafe PSN checking for TID RDMA READ Resp packet
IB/hfi1: Drop stale TID RDMA packets
RDMA/siw: Fix potential NULL de-ref
...
Linus Torvalds [Fri, 23 Aug 2019 21:45:45 +0000 (14:45 -0700)]
Merge tag 'for-linus-20190823' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"Here's a set of fixes that should go into this release. This contains:
- Three minor fixes for NVMe.
- Three minor tweaks for the io_uring polling logic.
- Officially mark Song as the MD maintainer, after he's been filling
that role sucessfully for the last 6 months or so"
* tag 'for-linus-20190823' of git://git.kernel.dk/linux-block:
io_uring: add need_resched() check in inner poll loop
md: update MAINTAINERS info
io_uring: don't enter poll loop if we have CQEs pending
nvme: Add quirk for LiteON CL1 devices running FW 22301111
nvme: Fix cntlid validation when not using NVMEoF
nvme-multipath: fix possible I/O hang when paths are updated
io_uring: fix potential hang with polled IO
Linus Torvalds [Fri, 23 Aug 2019 17:53:34 +0000 (10:53 -0700)]
Merge tag 'for-5.3/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Revert a DM bufio change from during the 5.3 merge window now that a
proper fix has been made to the block loopback driver.
- Fix DM kcopyd to wakeup so failed subjobs get completed.
- Various fixes to DM zoned target to address error handling, and other
small tweaks (SPDX license identifiers and fix typos).
- Fix DM integrity range locking race by tracking whether journal has
changed.
- Fix DM dust target to detect reads of badblocks beyond the first 512b
sector (applicable if blocksize is larger than 512b).
- Fix DM persistent-data issue in both the DM btree and DM
space-map-metadata interfaces.
- Fix out of bounds memory access with certain DM table configurations.
* tag 'for-5.3/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm table: fix invalid memory accesses with too high sector number
dm space map metadata: fix missing store of apply_bops() return value
dm btree: fix order of block initialization in btree_split_beneath
dm raid: add missing cleanup in raid_ctr()
dm zoned: fix potential NULL dereference in dmz_do_reclaim()
dm dust: use dust block size for badblocklist index
dm integrity: fix a crash due to BUG_ON in __journal_read_write()
dm zoned: fix a few typos
dm zoned: add SPDX license identifiers
dm zoned: properly handle backing device failure
dm zoned: improve error handling in i/o map code
dm zoned: improve error handling in reclaim
dm kcopyd: always complete failed jobs
Revert "dm bufio: fix deadlock with loop device"
Linus Torvalds [Fri, 23 Aug 2019 17:49:44 +0000 (10:49 -0700)]
Merge tag 'xfs-5.3-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fixes from Darrick Wong:
"Here are a few more bug fixes that trickled in since the last pull.
They've survived the usual xfstests runs and merge cleanly with this
morning's master.
I expect there to be one more pull request tomorrow for the fix to
that quota related inode unlock bug that we were reviewing last night,
but it will continue to soak in the testing machine for several more
hours.
- Fix missing compat ioctl handling for get/setlabel
- Fix missing ioctl pointer sanitization on s390
- Fix a page locking deadlock in the dedupe comparison code
- Fix inadequate locking in reflink code w.r.t. concurrent directio
- Fix broken error detection when breaking layouts"
* tag 'xfs-5.3-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
fs/xfs: Fix return code of xfs_break_leased_layouts()
xfs: fix reflink source file racing with directio writes
vfs: fix page locking deadlocks when deduping files
xfs: compat_ioctl: use compat_ptr()
xfs: fall back to native ioctls for unhandled compat ones
Linus Torvalds [Fri, 23 Aug 2019 16:19:38 +0000 (09:19 -0700)]
Merge tag 'ceph-for-5.3-rc6' of git://github.com/ceph/ceph-client
Pull ceph fixes from Ilya Dryomov:
"Three important fixes tagged for stable (an indefinite hang, a crash
on an assert and a NULL pointer dereference) plus a small series from
Luis fixing instances of vfree() under spinlock"
* tag 'ceph-for-5.3-rc6' of git://github.com/ceph/ceph-client:
libceph: fix PG split vs OSD (re)connect race
ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply
ceph: clear page dirty before invalidate page
ceph: fix buffer free while holding i_ceph_lock in fill_inode()
ceph: fix buffer free while holding i_ceph_lock in __ceph_build_xattrs_blob()
ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()
libceph: allow ceph_buffer_put() to receive a NULL ceph_buffer
Linus Torvalds [Fri, 23 Aug 2019 16:03:06 +0000 (09:03 -0700)]
Merge tag 'drm-fixes-2019-08-23' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Live from the laundromat after my washing machine broke down, we have
the 5.3-rc6 fixes. Changelog is in the tag below, but nothing too
noteworthy in here:
rcar-du:
- LVDS dual-link mode fix
mediatek:
- of node refcount fix
- prime buffer import fix
- dma max seg fix
* tag 'drm-fixes-2019-08-23' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu/powerplay: silence a warning in smu_v11_0_setup_pptable
drm/amd/display: Calculate bpc based on max_requested_bpc
drm/amdgpu: prevent memory leaks in AMDGPU_CS ioctl
drm/amd/amdgpu: disable MMHUB PG for navi10
drm/amd/powerplay: remove duplicate macro smu_get_uclk_dpm_states in amdgpu_smu.h
drm/amd/powerplay: fix variable type errors in smu_v11_0_setup_pptable
drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible
drm/i915: Fix HW readout for crtc_clock in HDMI mode
drm/mediatek: mtk_drm_drv.c: Add of_node_put() before goto
drm: rcar_lvds: Fix dual link mode operations
drm/mediatek: set DMA max segment size
drm/mediatek: use correct device to import PRIME buffers
drm/omap: ensure we have a valid dma_mask
drm/komeda: Add support for 'memory-region' DT node property
drm/komeda: Adds internal bpp computing for arm afbc only format YU08 YU10
drm/komeda: Initialize and enable output polling on Komeda
Mikulas Patocka [Fri, 23 Aug 2019 13:54:09 +0000 (09:54 -0400)]
dm table: fix invalid memory accesses with too high sector number
If the sector number is too high, dm_table_find_target() should return a
pointer to a zeroed dm_target structure (the caller should test it with
dm_target_is_valid).
However, for some table sizes, the code in dm_table_find_target() that
performs btree lookup will access out of bound memory structures.
Fix this bug by testing the sector number at the beginning of
dm_table_find_target(). Also, add an "inline" keyword to the function
dm_table_get_size() because this is a hot path.
================================================
WARNING: lock held when returning to user space!
5.3.0-rc5 #rc5 Tainted: G W
------------------------------------------------
chgrp/47006 is leaving the kernel with locks still held!
1 lock held by chgrp/47006:
#0: 000000006664ea2d (&xfs_nondir_ilock_class){++++}, at: xfs_ilock+0xd2/0x290 [xfs]
...which is clearly caused by xfs_setattr_nonsize failing to unlock the
ILOCK after the xfs_qm_vop_chown_reserve call fails. Add the missing
unlock.
Reported-by: benjamin.moody@gmail.com Fixes: 04e7d7da4cc5 ("xfs: better xfs_trans_alloc interface") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Tested-by: Salvatore Bonaccorso <carnil@debian.org>
Lyude Paul [Thu, 25 Jul 2019 19:40:01 +0000 (15:40 -0400)]
drm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX
While I had thought I had fixed this issue in:
commit ab2b95aa6970 ("drm/nouveau/i2c: Disable i2c bus access after
->fini()")
It turns out that while I did fix the error messages I was seeing on my
P50 when trying to access i2c busses with the GPU in runtime suspend, I
accidentally had missed one important detail that was mentioned on the
bug report this commit was supposed to fix: that the CPU would only lock
up when trying to access i2c busses _on connected devices_ _while the
GPU is not in runtime suspend_. Whoops. That definitely explains why I
was not able to get my machine to hang with i2c bus interactions until
now, as plugging my P50 into it's dock with an HDMI monitor connected
allowed me to finally reproduce this locally.
Now that I have managed to reproduce this issue properly, it looks like
the problem is much simpler then it looks. It turns out that some
connected devices, such as MST laptop docks, will actually ACK i2c reads
even if no data was actually read:
[ 275.063043] nouveau 0000:01:00.0: i2c: aux 000a: 1: 0000004c 1
[ 275.063447] nouveau 0000:01:00.0: i2c: aux 000a: 00 0110100010040000
[ 275.063759] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000001
[ 275.064024] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
[ 275.064285] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
[ 275.064594] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
Because we don't handle the situation of i2c ack without any data, we
end up entering an infinite loop in nvkm_i2c_aux_i2c_xfer() since the
value of cnt always remains at 0. This finally properly explains how
this could result in a CPU hang like the ones observed in the
aforementioned commit.
So, fix this by retrying transactions if no data is written or received,
and give up and fail the transaction if we continue to not write or
receive any data after 32 retries.
Signed-off-by: Lyude Paul <lyude@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Dave Airlie [Fri, 23 Aug 2019 01:43:47 +0000 (11:43 +1000)]
Merge tag 'drm-misc-fixes-2019-08-22' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Fixes for v5.3-rc6:
- dma fix for omap.
- Make output polling work on komeda.
- Fix bpp computing for AFBC formats in komeda.
- Support the memory-region property in komeda.
Jens Axboe [Thu, 22 Aug 2019 04:19:11 +0000 (22:19 -0600)]
io_uring: add need_resched() check in inner poll loop
The outer poll loop checks for whether we need to reschedule, and
returns to userspace if we do. However, it's possible to get stuck
in the inner loop as well, if the CPU we are running on needs to
reschedule to finish the IO work.
Add the need_resched() check in the inner loop as well. This fixes
a potential hang if the kernel is configured with
CONFIG_PREEMPT_VOLUNTARY=y.
* tag 'pci-v5.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
Documentation PCI: Fix pciebus-howto.rst filename typo
PCI: Reset both NVIDIA GPU and HDA in ThinkPad P50 workaround
ZhangXiaoxu [Mon, 19 Aug 2019 03:31:21 +0000 (11:31 +0800)]
dm space map metadata: fix missing store of apply_bops() return value
In commit 0a16d460d1a8 ("dm space map metadata: fix occasional leak
of a metadata block on resize"), we refactor the commit logic to a new
function 'apply_bops'. But when that logic was replaced in out() the
return value was not stored. This may lead out() returning a wrong
value to the caller.
Fixes: 0a16d460d1a8 ("dm space map metadata: fix occasional leak of a metadata block on resize") Cc: stable@vger.kernel.org Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
ZhangXiaoxu [Sat, 17 Aug 2019 05:32:40 +0000 (13:32 +0800)]
dm btree: fix order of block initialization in btree_split_beneath
When btree_split_beneath() splits a node to two new children, it will
allocate two blocks: left and right. If right block's allocation
failed, the left block will be unlocked and marked dirty. If this
happened, the left block'ss content is zero, because it wasn't
initialized with the btree struct before the attempot to allocate the
right block. Upon return, when flushing the left block to disk, the
validator will fail when check this block. Then a BUG_ON is raised.
Fix this by completely initializing the left block before allocating and
initializing the right block.
Fixes: 4e98028405ea0 ("dm btree: fix leak of bufio-backed block in btree_split_beneath error path") Cc: stable@vger.kernel.org Signed-off-by: ZhangXiaoxu <zhangxiaoxu5@huawei.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Linus Torvalds [Thu, 22 Aug 2019 18:17:20 +0000 (11:17 -0700)]
Merge tag 'tag-chrome-platform-fixes-for-v5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux
Pull chrome platform fix from Benson Leung:
"Fix a kernel crash during suspend/resume of cros_ec_ishtp"
* tag 'tag-chrome-platform-fixes-for-v5.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
platform/chrome: cros_ec_ishtp: fix crash during suspend
Linus Torvalds [Thu, 22 Aug 2019 18:12:33 +0000 (11:12 -0700)]
Merge tag 'afs-fixes-20190822' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull AFS fixes from David Howells:
- Fix a cell record leak due to the default error not being cleared.
- Fix an oops in tracepoint due to a pointer that may contain an error.
- Fix the ACL storage op for YFS where the wrong op definition is being
used. By luck, this only actually affects the information appearing
in traces.
* tag 'afs-fixes-20190822' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
afs: use correct afs_call_type in yfs_fs_store_opaque_acl2
afs: Fix possible oops in afs_lookup trace event
afs: Fix leak in afs_lookup_cell_rcu()
Bernard Metzler [Thu, 22 Aug 2019 15:07:41 +0000 (17:07 +0200)]
RDMA/siw: Fix SGL mapping issues
All user level and most in-kernel applications submit WQEs
where the SG list entries are all of a single type.
iSER in particular, however, will send us WQEs with mixed SG
types: sge[0] = kernel buffer, sge[1] = PBL region.
Check and set is_kva on each SG entry individually instead of
assuming the first SGE type carries through to the last.
This fixes iSER over siw.
Selvin Xavier [Thu, 22 Aug 2019 10:02:50 +0000 (03:02 -0700)]
RDMA/bnxt_re: Fix stack-out-of-bounds in bnxt_qplib_rcfw_send_message
Driver copies FW commands to the HW queue as units of 16 bytes. Some
of the command structures are not exact multiple of 16. So while copying
the data from those structures, the stack out of bounds messages are
reported by KASAN. The following error is reported.
[ 1337.530155] ==================================================================
[ 1337.530277] BUG: KASAN: stack-out-of-bounds in bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530413] Read of size 16 at addr ffff888725477a48 by task rmmod/2785
[ 1337.530996] Memory state around the buggy address:
[ 1337.531072] ffff888725477900: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 f2 f2 f2
[ 1337.531180] ffff888725477980: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
[ 1337.531288] >ffff888725477a00: 00 f2 f2 f2 f2 f2 f2 00 00 00 f2 00 00 00 00 00
[ 1337.531393] ^
[ 1337.531478] ffff888725477a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1337.531585] ffff888725477b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1337.531691] ==================================================================
Fix this by passing the exact size of each FW command to
bnxt_qplib_rcfw_send_message as req->cmd_size. Before sending
the command to HW, modify the req->cmd_size to number of 16 byte units.
If the inode from afs_do_lookup() is an error other than ENOENT, or if it
is ENOENT and afs_try_auto_mntpt() returns an error, the trace event will
try to dereference the error pointer as a valid pointer.
Use IS_ERR_OR_NULL to only pass a valid pointer for the trace, or NULL.
Ideally the trace would include the error value, but for now just avoid
the oops.
Fixes: 5833e6942c84 ("afs: Add more tracepoints") Signed-off-by: Marc Dionne <marc.dionne@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com>
David Howells [Thu, 22 Aug 2019 12:28:43 +0000 (13:28 +0100)]
afs: Fix leak in afs_lookup_cell_rcu()
Fix a leak on the cell refcount in afs_lookup_cell_rcu() due to
non-clearance of the default error in the case a NULL cell name is passed
and the workstation default cell is used.
Also put a bit at the end to make sure we don't leak a cell ref if we're
going to be returning an error.
This leak results in an assertion like the following when the kafs module is
unloaded:
AFS: Assertion failed
2 == 1 is false
0x2 == 0x1 is false
------------[ cut here ]------------
kernel BUG at fs/afs/cell.c:770!
...
RIP: 0010:afs_manage_cells+0x220/0x42f [kafs]
...
process_one_work+0x4c2/0x82c
? pool_mayday_timeout+0x1e1/0x1e1
? do_raw_spin_lock+0x134/0x175
worker_thread+0x336/0x4a6
? rescuer_thread+0x4af/0x4af
kthread+0x1de/0x1ee
? kthread_park+0xd4/0xd4
ret_from_fork+0x24/0x30
Ilya Dryomov [Tue, 20 Aug 2019 14:40:33 +0000 (16:40 +0200)]
libceph: fix PG split vs OSD (re)connect race
We can't rely on ->peer_features in calc_target() because it may be
called both when the OSD session is established and open and when it's
not. ->peer_features is not valid unless the OSD session is open. If
this happens on a PG split (pg_num increase), that could mean we don't
resend a request that should have been resent, hanging the client
indefinitely.
In userspace this was fixed by looking at require_osd_release and
get_xinfo[osd].features fields of the osdmap. However these fields
belong to the OSD section of the osdmap, which the kernel doesn't
decode (only the client section is decoded).
Instead, let's drop this feature check. It effectively checks for
luminous, so only pre-luminous OSDs would be affected in that on a PG
split the kernel might resend a request that should not have been
resent. Duplicates can occur in other scenarios, so both sides should
already be prepared for them: see dup/replay logic on the OSD side and
retry_attempt check on the client side.
Cc: stable@vger.kernel.org Fixes: 23f46420feef ("libceph: resend on PG splits if OSD has RESEND_ON_SPLIT") Link: https://tracker.ceph.com/issues/41162 Reported-by: Jerry Lee <leisurelysw24@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Tested-by: Jerry Lee <leisurelysw24@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org>
Jeff Layton [Thu, 15 Aug 2019 10:23:38 +0000 (06:23 -0400)]
ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply
When ceph_mdsc_do_request returns an error, we can't assume that the
filelock_reply pointer will be set. Only try to fetch fields out of
the r_reply_info when it returns success.
Cc: stable@vger.kernel.org Reported-by: Hector Martin <hector@marcansoft.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
clear_page_dirty_for_io(page) before mapping->a_ops->invalidatepage().
invalidatepage() clears page's private flag, if dirty flag is not
cleared, the page may cause BUG_ON failure in ceph_set_page_dirty().
Luis Henriques [Fri, 19 Jul 2019 14:32:22 +0000 (15:32 +0100)]
ceph: fix buffer free while holding i_ceph_lock in fill_inode()
Calling ceph_buffer_put() in fill_inode() may result in freeing the
i_xattrs.blob buffer while holding the i_ceph_lock. This can be fixed by
postponing the call until later, when the lock is released.
The following backtrace was triggered by fstests generic/070.
Luis Henriques [Fri, 19 Jul 2019 14:32:21 +0000 (15:32 +0100)]
ceph: fix buffer free while holding i_ceph_lock in __ceph_build_xattrs_blob()
Calling ceph_buffer_put() in __ceph_build_xattrs_blob() may result in
freeing the i_xattrs.blob buffer while holding the i_ceph_lock. This can
be fixed by having this function returning the old blob buffer and have
the callers of this function freeing it when the lock is released.
The following backtrace was triggered by fstests generic/117.
BUG: sleeping function called from invalid context at mm/vmalloc.c:2283
in_atomic(): 1, irqs_disabled(): 0, pid: 649, name: fsstress
4 locks held by fsstress/649:
#0: 00000000a7478e7e (&type->s_umount_key#19){++++}, at: iterate_supers+0x77/0xf0
#1: 00000000f8de1423 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: ceph_check_caps+0x7b/0xc60
#2: 00000000562f2b27 (&s->s_mutex){+.+.}, at: ceph_check_caps+0x3bd/0xc60
#3: 00000000f83ce16a (&mdsc->snap_rwsem){++++}, at: ceph_check_caps+0x3ed/0xc60
CPU: 1 PID: 649 Comm: fsstress Not tainted 5.2.0+ #439
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x67/0x90
___might_sleep.cold+0x9f/0xb1
vfree+0x4b/0x60
ceph_buffer_release+0x1b/0x60
__ceph_build_xattrs_blob+0x12b/0x170
__send_cap+0x302/0x540
? __lock_acquire+0x23c/0x1e40
? __mark_caps_flushing+0x15c/0x280
? _raw_spin_unlock+0x24/0x30
ceph_check_caps+0x5f0/0xc60
ceph_flush_dirty_caps+0x7c/0x150
? __ia32_sys_fdatasync+0x20/0x20
ceph_sync_fs+0x5a/0x130
iterate_supers+0x8f/0xf0
ksys_sync+0x4f/0xb0
__ia32_sys_sync+0xa/0x10
do_syscall_64+0x50/0x1c0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7fc6409ab617
Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Luis Henriques [Fri, 19 Jul 2019 14:32:20 +0000 (15:32 +0100)]
ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()
Calling ceph_buffer_put() in __ceph_setxattr() may end up freeing the
i_xattrs.prealloc_blob buffer while holding the i_ceph_lock. This can be
fixed by postponing the call until later, when the lock is released.
The following backtrace was triggered by fstests generic/117.
BUG: sleeping function called from invalid context at mm/vmalloc.c:2283
in_atomic(): 1, irqs_disabled(): 0, pid: 650, name: fsstress
3 locks held by fsstress/650:
#0: 00000000870a0fe8 (sb_writers#8){.+.+}, at: mnt_want_write+0x20/0x50
#1: 00000000ba0c4c74 (&type->i_mutex_dir_key#6){++++}, at: vfs_setxattr+0x55/0xa0
#2: 000000008dfbb3f2 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: __ceph_setxattr+0x297/0x810
CPU: 1 PID: 650 Comm: fsstress Not tainted 5.2.0+ #437
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
Call Trace:
dump_stack+0x67/0x90
___might_sleep.cold+0x9f/0xb1
vfree+0x4b/0x60
ceph_buffer_release+0x1b/0x60
__ceph_setxattr+0x2b4/0x810
__vfs_setxattr+0x66/0x80
__vfs_setxattr_noperm+0x59/0xf0
vfs_setxattr+0x81/0xa0
setxattr+0x115/0x230
? filename_lookup+0xc9/0x140
? rcu_read_lock_sched_held+0x74/0x80
? rcu_sync_lockdep_assert+0x2e/0x60
? __sb_start_write+0x142/0x1a0
? mnt_want_write+0x20/0x50
path_setxattr+0xba/0xd0
__x64_sys_lsetxattr+0x24/0x30
do_syscall_64+0x50/0x1c0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7ff23514359a
Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
drm/amd/display: Calculate bpc based on max_requested_bpc
[Why]
The only place where state->max_bpc is updated on the connector is
at the start of atomic check during drm_atomic_connector_check. It
isn't updated when adding the connectors to the atomic state after
the fact. It also doesn't necessarily reflect the right value when
called in amdgpu during mode validation outside of atomic check.
This can cause the wrong bpc to be used even if the max_requested_bpc
is the correct value.
[How]
Don't rely on state->max_bpc reflecting the real bpc value and just
do the min(...) based on display info bpc and max_requested_bpc.
Fixes: 5c8741632440 ("drm/amd/display: Use current connector state if NULL when checking bpc") Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Nicolai Hähnle [Tue, 20 Aug 2019 13:39:53 +0000 (15:39 +0200)]
drm/amdgpu: prevent memory leaks in AMDGPU_CS ioctl
Error out if the AMDGPU_CS ioctl is called with multiple SYNCOBJ_OUT and/or
TIMELINE_SIGNAL chunks, since otherwise the last chunk wins while the
allocated array as well as the reference counts of sync objects are leaked.
Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Kevin Wang [Mon, 19 Aug 2019 15:38:02 +0000 (23:38 +0800)]
drm/amd/powerplay: fix variable type errors in smu_v11_0_setup_pptable
fix size type errors, from uint32_t to uint16_t.
it will cause only initializes the highest 16 bits in
smu_get_atom_data_table function.
bug report:
This fixes the following static checker warning.
drivers/gpu/drm/amd/amdgpu/../powerplay/smu_v11_0.c:390 smu_v11_0_setup_pptable()
warn: passing casted pointer '&size' to 'smu_get_atom_data_table()' 32 vs 16.
Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Thu, 15 Aug 2019 13:27:09 +0000 (08:27 -0500)]
drm/amdgpu/gfx9: update pg_flags after determining if gfx off is possible
We need to set certain power gating flags after we determine
if the firmware version is sufficient to support gfxoff.
Previously we set the pg flags in early init, but we later
we might have disabled gfxoff if the firmware versions didn't
support it. Move adding the additional pg flags after we
determine whether or not to support gfxoff.
Fixes: 8e7ced5bdb19 ("drm/amdgpu: enable gfxoff again on raven series (v2)") Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Kai-Heng Feng <kai.heng.feng@canonical.com> Cc: stable@vger.kernel.org
Linus Torvalds [Wed, 21 Aug 2019 18:48:38 +0000 (11:48 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM fixes from Paolo Bonzini:
"A couple bugfixes, and mostly selftests changes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
selftests/kvm: make platform_info_test pass on AMD
Revert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot"
selftests: kvm: fix state save/load on processors without XSAVE
selftests: kvm: fix vmx_set_nested_state_test
selftests: kvm: provide common function to enable eVMCS
selftests: kvm: do not try running the VM in vmx_set_nested_state_test
KVM: x86: svm: remove redundant assignment of var new_entry
MAINTAINERS: add KVM x86 reviewers
MAINTAINERS: change list for KVM/s390
kvm: x86: skip populating logical dest map if apic is not sw enabled
Vitaly Kuznetsov [Mon, 10 Jun 2019 17:22:55 +0000 (19:22 +0200)]
selftests/kvm: make platform_info_test pass on AMD
test_msr_platform_info_disabled() generates EXIT_SHUTDOWN but VMCB state
is undefined after that so an attempt to launch this guest again from
test_msr_platform_info_enabled() fails. Reorder the tests to make test
pass.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linus Torvalds [Wed, 21 Aug 2019 17:04:38 +0000 (10:04 -0700)]
Merge tag 'nfsd-5.3-1' of git://linux-nfs.org/~bfields/linux
Pull nfsd fixes from Bruce Fields:
"Fix nfsd bugs: three in the new nfsd/clients/ code, one in the reply
cache containerization"
* tag 'nfsd-5.3-1' of git://linux-nfs.org/~bfields/linux:
nfsd4: Fix kernel crash when reading proc file reply_cache_stats
nfsd: initialize i_private before d_add
nfsd: use i_wrlock instead of rcu for nfsdfs i_private
nfsd: fix dentry leak upon mkdir failure.
Wenwen Wang [Mon, 19 Aug 2019 00:18:34 +0000 (19:18 -0500)]
dm raid: add missing cleanup in raid_ctr()
If rs_prepare_reshape() fails, no cleanup is executed, leading to
leak of the raid_set structure allocated at the beginning of
raid_ctr(). To fix this issue, go to the label 'bad' if the error
occurs.
Fixes: 1189d68f91da2 ("dm raid: stop keeping raid set frozen altogether") Cc: stable@vger.kernel.org Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Dan Carpenter [Mon, 19 Aug 2019 09:58:14 +0000 (12:58 +0300)]
dm zoned: fix potential NULL dereference in dmz_do_reclaim()
This function is supposed to return error pointers so it matches the
dmz_get_rnd_zone_for_reclaim() function. The current code could lead to
a NULL dereference in dmz_do_reclaim()
Fixes: 160c1004c5a9 ("dm zoned: improve error handling in reclaim") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Dmitry Fomichev <dmitry.fomichev@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Bryan Gurney [Fri, 16 Aug 2019 14:09:53 +0000 (10:09 -0400)]
dm dust: use dust block size for badblocklist index
Change the "frontend" dust_remove_block, dust_add_block, and
dust_query_block functions to store the "dust block number", instead
of the sector number corresponding to the "dust block number".
For the "backend" functions dust_map_read and dust_map_write,
right-shift by sect_per_block_shift. This fixes the inability to
emulate failure beyond the first sector of each "dust block" (for
devices with a "dust block size" larger than 512 bytes).
He Zhe [Tue, 20 Aug 2019 14:53:10 +0000 (22:53 +0800)]
modules: page-align module section allocations only for arches supporting strict module rwx
We should keep the case of "#define debug_align(X) (X)" for all arches
without CONFIG_HAS_STRICT_MODULE_RWX ability, which would save people, who
are sensitive to system size, a lot of memory when using modules,
especially for embedded systems. This is also the intention of the
original #ifdef... statement and still valid for now.
Note that this still keeps the effect of the fix of the following commit, 8223415277a7 ("modules: always page-align module section allocations"),
since when CONFIG_ARCH_HAS_STRICT_MODULE_RWX is enabled, module pages are
aligned.
Signed-off-by: He Zhe <zhe.he@windriver.com> Signed-off-by: Jessica Yu <jeyu@kernel.org>
Paolo Bonzini [Thu, 15 Aug 2019 07:43:32 +0000 (09:43 +0200)]
Revert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot"
This reverts commit 2955cb206c319211ab4e59f7f197fbf280e0cbbe.
Alex Williamson reported regressions with device assignment with
this patch. Even though the bug is probably elsewhere and still
latent, this is needed to fix the regression.
Fixes: 2955cb206c31 ("KVM: x86/mmu: Zap only the relevant pages when removing a memslot", 2019-02-05) Reported-by: Alex Willamson <alex.williamson@redhat.com> Cc: stable@vger.kernel.org Cc: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 20 Aug 2019 15:35:52 +0000 (17:35 +0200)]
selftests: kvm: fix state save/load on processors without XSAVE
state_test and smm_test are failing on older processors that do not
have xcr0. This is because on those processor KVM does provide
support for KVM_GET/SET_XSAVE (to avoid having to rely on the older
KVM_GET/SET_FPU) but not for KVM_GET/SET_XCRS.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
video: fbdev: acornfb: Mark expected switch fall-through
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: rpc_defconfig arm):
drivers/video/fbdev/acornfb.c: In function ‘acornfb_parse_dram’:
drivers/video/fbdev/acornfb.c:860:9: warning: this statement may fall through [-Wimplicit-fallthrough=]
size *= 1024;
~~~~~^~~~~~~
drivers/video/fbdev/acornfb.c:861:3: note: here
case 'K':
^~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
scsi: libsas: sas_discover: Mark expected switch fall-through
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: mtx1_defconfig mips):
drivers/scsi/libsas/sas_discover.c: In function ‘sas_discover_domain’:
./include/linux/printk.h:309:2: warning: this statement may fall through [-Wimplicit-fallthrough=]
printk(KERN_NOTICE pr_fmt(fmt), ##__VA_ARGS__)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/libsas/sas_discover.c:459:3: note: in expansion of macro ‘pr_notice’
pr_notice("ATA device seen but CONFIG_SCSI_SAS_ATA=N so cannot attach\n");
^~~~~~~~~
drivers/scsi/libsas/sas_discover.c:462:2: note: here
default:
^~~~~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
power: supply: ab8500_charger: Mark expected switch fall-through
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: allmodconfig arm):
drivers/power/supply/ab8500_charger.c: In function ‘ab8500_charger_max_usb_curr’:
drivers/power/supply/ab8500_charger.c:738:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (di->vbus_detected) {
^
drivers/power/supply/ab8500_charger.c:745:2: note: here
case USB_STAT_HM_IDGND:
^~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
watchdog: wdt285: Mark expected switch fall-through
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: footbridge_defconfig arm):
drivers/watchdog/wdt285.c: In function ‘watchdog_ioctl’:
drivers/watchdog/wdt285.c:170:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
watchdog_ping();
^~~~~~~~~~~~~~~
drivers/watchdog/wdt285.c:172:2: note: here
case WDIOC_GETTIMEOUT:
^~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: assabet_defconfig arm):
drivers/mtd/maps/sa1100-flash.c: In function ‘sa1100_probe_subdev’:
drivers/mtd/maps/sa1100-flash.c:82:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
printk(KERN_WARNING "SA1100 flash: unknown base address "
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
"0x%08lx, assuming CS0\n", phys);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/mtd/maps/sa1100-flash.c:85:2: note: here
case SA1100_CS0_PHYS:
^~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
drm/sun4i: tcon: Mark expected switch fall-through
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: sunxi_defconfig arm):
drivers/gpu/drm/sun4i/sun4i_tcon.c: In function ‘sun4i_tcon0_mode_set_dithering’:
drivers/gpu/drm/sun4i/sun4i_tcon.c:318:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
val |= SUN4I_TCON0_FRM_CTL_MODE_B;
drivers/gpu/drm/sun4i/sun4i_tcon.c:319:2: note: here
case MEDIA_BUS_FMT_RGB666_1X18:
^~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
drm/sun4i: sun6i_mipi_dsi: Mark expected switch fall-through
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: multi_v7_defconfig arm):
drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c: In function ‘sun6i_dsi_transfer’:
drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c:993:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
if (msg->rx_len == 1) {
^
drivers/gpu/drm/sun4i/sun6i_mipi_dsi.c:998:2: note: here
default:
^~~~~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Mark switch cases where we are expecting to fall through.
Fix the following warning (Building: rpc_defconfig arm):
arch/arm/mach-rpc/riscpc.c: In function ‘parse_tag_acorn’:
arch/arm/mach-rpc/riscpc.c:48:13: warning: this statement may fall through [-Wimplicit-fallthrough=]
vram_size += PAGE_SIZE * 256;
~~~~~~~~~~^~~~~~~~~~~~~~~~~~
arch/arm/mach-rpc/riscpc.c:49:2: note: here
case 256:
^~~~
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
dmaengine: fsldma: Mark expected switch fall-through
Mark switch cases where we are expecting to fall through.
Fix the following warnings (Building: powerpc-ppa8548_defconfig powerpc):
drivers/dma/fsldma.c: In function ‘fsl_dma_chan_probe’:
drivers/dma/fsldma.c:1165:26: warning: this statement may fall through [-Wimplicit-fallthrough=]
chan->toggle_ext_pause = fsl_chan_toggle_ext_pause;
~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/dma/fsldma.c:1166:2: note: here
case FSL_DMA_IP_83XX:
^~~~
Reported-by: kbuild test robot <lkp@intel.com> Acked-by: Li Yang <leoyang.li@nxp.com> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Linus Torvalds [Tue, 20 Aug 2019 18:18:43 +0000 (11:18 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:
- a few regression fixes for wacom driver (including fix for my earlier
mismerge) from Aaron Armstrong Skomra and Jason Gerecke
- revert of a few Logitech device ID additions which turn out to not
work perfectly with the hidpp driver at the moment; proper support is
now scheduled for 5.4. Fixes from Benjamin Tissoires
- scheduling-in-atomic fix for cp2112 driver, from Benjamin Tissoires
- new device ID to intel-ish, from Even Xu
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
HID: wacom: correct misreported EKR ring values
HID: cp2112: prevent sleeping function called from invalid context
HID: intel-ish-hid: ipc: add EHL device id
HID: wacom: Correct distance scale for 2nd-gen Intuos devices
HID: logitech-hidpp: remove support for the G700 over USB
Revert "HID: logitech-hidpp: add USB PID for a few more supported mice"
HID: wacom: add back changes dropped in merge commit
Wenwen Wang [Sun, 18 Aug 2019 18:54:46 +0000 (13:54 -0500)]
infiniband: hfi1: fix memory leaks
In fault_opcodes_write(), 'data' is allocated through kcalloc(). However,
it is not deallocated in the following execution if an error occurs,
leading to memory leaks. To fix this issue, introduce the 'free_data' label
to free 'data' before returning the error.
Wenwen Wang [Sun, 18 Aug 2019 19:29:31 +0000 (14:29 -0500)]
infiniband: hfi1: fix a memory leak bug
In fault_opcodes_read(), 'data' is not deallocated if debugfs_file_get()
fails, leading to a memory leak. To fix this bug, introduce the 'free_data'
label to free 'data' before returning the error.
Wenwen Wang [Sun, 18 Aug 2019 20:23:01 +0000 (15:23 -0500)]
IB/mlx4: Fix memory leaks
In mlx4_ib_alloc_pv_bufs(), 'tun_qp->tx_ring' is allocated through
kcalloc(). However, it is not always deallocated in the following execution
if an error occurs, leading to memory leaks. To fix this issue, free
'tun_qp->tx_ring' whenever an error occurs.
zhengbin [Mon, 19 Aug 2019 04:27:39 +0000 (12:27 +0800)]
RDMA/cma: fix null-ptr-deref Read in cma_cleanup
In cma_init, if cma_configfs_init fails, need to free the
previously memory and return fail, otherwise will trigger
null-ptr-deref Read in cma_cleanup.
Moni Shoua [Thu, 15 Aug 2019 08:38:34 +0000 (11:38 +0300)]
IB/mlx5: Block MR WR if UMR is not possible
Check conditions that are mandatory to post_send UMR WQEs.
1. Modifying page size.
2. Modifying remote atomic permissions if atomic access is required.
If either condition is not fulfilled then fail to post_send() flow.
Fixes: 6b538ebbfdf2 ("IB/mlx5: Respect new UMR capabilities") Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190815083834.9245-9-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
Moni Shoua [Thu, 15 Aug 2019 08:38:33 +0000 (11:38 +0300)]
IB/mlx5: Fix MR re-registration flow to use UMR properly
The UMR WQE in the MR re-registration flow requires that
modify_atomic and modify_entity_size capabilities are enabled.
Therefore, check that the these capabilities are present before going to
umr flow and go through slow path if not.
Fixes: 6b538ebbfdf2 ("IB/mlx5: Respect new UMR capabilities") Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190815083834.9245-8-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
Moni Shoua [Thu, 15 Aug 2019 08:38:32 +0000 (11:38 +0300)]
IB/mlx5: Report and handle ODP support properly
ODP depends on the several device capabilities, among them is the ability
to send UMR WQEs with that modify atomic and entity size of the MR.
Therefore, only if all conditions to send such a UMR WQE are met then
driver can report that ODP is supported. Use this check of conditions
in all places where driver needs to know about ODP support.
Also, implicit ODP support depends on ability of driver to send UMR WQEs
for an indirect mkey. Therefore, verify that all conditions to do so are
met when reporting support.
Fixes: 6b538ebbfdf2 ("IB/mlx5: Respect new UMR capabilities") Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190815083834.9245-7-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Thu, 15 Aug 2019 08:38:29 +0000 (11:38 +0300)]
RDMA/restrack: Rewrite PID namespace check to be reliable
task_active_pid_ns() is wrong API to check PID namespace because it
posses some restrictions and return PID namespace where the process
was allocated. It created mismatches with current namespace, which
can be different.
Rewrite whole rdma_is_visible_in_pid_ns() logic to provide reliable
results without any relation to allocated PID namespace.
Fixes: 62424a9874c3 ("RDMA/nldev: Factor out the PID namespace check") Fixes: 2b939a57c1d4 ("RDMA/restrack: Make is_visible_in_pid_ns() as an API") Reviewed-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190815083834.9245-4-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
Leon Romanovsky [Thu, 15 Aug 2019 08:38:28 +0000 (11:38 +0300)]
RDMA/counters: Properly implement PID checks
"Auto" configuration mode is called for visible in that PID
namespace and it ensures that all counters and QPs are coexist
in the same namespace and belong to same PID.
Ido Kalir [Thu, 15 Aug 2019 08:38:27 +0000 (11:38 +0300)]
IB/core: Fix NULL pointer dereference when bind QP to counter
If QP is not visible to the pid, then we try to decrease its reference
count and return from the function before the QP pointer is
initialized. This lead to NULL pointer dereference.
Fix it by pass directly the res to the rdma_restract_put as arg instead of
&qp->res.
Kaike Wan [Thu, 15 Aug 2019 19:20:58 +0000 (15:20 -0400)]
IB/hfi1: Drop stale TID RDMA packets that cause TIDErr
In a congested fabric with adaptive routing enabled, traces show that
packets could be delivered out of order. A stale TID RDMA data packet
could lead to TidErr if the TID entries have been released by duplicate
data packets generated from retries, and subsequently erroneously force
the qp into error state in the current implementation.
Since the payload has already been dropped by hardware, the packet can
be simply dropped and it is no longer necessary to put the qp into
error state.
Fixes: 78cbedc12a74 ("IB/hfi1: Add functions to receive TID RDMA READ response") Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Link: https://lore.kernel.org/r/20190815192058.105923.72324.stgit@awfm-01.aw.intel.com Signed-off-by: Doug Ledford <dledford@redhat.com>
Kaike Wan [Thu, 15 Aug 2019 19:20:51 +0000 (15:20 -0400)]
IB/hfi1: Add additional checks when handling TID RDMA WRITE DATA packet
In a congested fabric with adaptive routing enabled, traces show that
packets could be delivered out of order, which could cause incorrect
processing of stale packets. For stale TID RDMA WRITE DATA packets that
cause KDETH EFLAGS errors, this patch adds additional checks before
processing the packets.
Fixes: 2bfb71b37bd2 ("IB/hfi1: Add a function to receive TID RDMA WRITE DATA packet") Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Link: https://lore.kernel.org/r/20190815192051.105923.69979.stgit@awfm-01.aw.intel.com Signed-off-by: Doug Ledford <dledford@redhat.com>
Kaike Wan [Thu, 15 Aug 2019 19:20:45 +0000 (15:20 -0400)]
IB/hfi1: Add additional checks when handling TID RDMA READ RESP packet
In a congested fabric with adaptive routing enabled, traces show that
packets could be delivered out of order, which could cause incorrect
processing of stale packets. For stale TID RDMA READ RESP packets that
cause KDETH EFLAGS errors, this patch adds additional checks before
processing the packets.
Fixes: 78cbedc12a74 ("IB/hfi1: Add functions to receive TID RDMA READ response") Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Link: https://lore.kernel.org/r/20190815192045.105923.59813.stgit@awfm-01.aw.intel.com Signed-off-by: Doug Ledford <dledford@redhat.com>
Kaike Wan [Thu, 15 Aug 2019 19:20:39 +0000 (15:20 -0400)]
IB/hfi1: Unsafe PSN checking for TID RDMA READ Resp packet
When processing a TID RDMA READ RESP packet that causes KDETH EFLAGS
errors, the packet's IB PSN is checked against qp->s_last_psn and
qp->s_psn without the protection of qp->s_lock, which is not safe.
This patch fixes the issue by acquiring qp->s_lock first.
Fixes: 78cbedc12a74 ("IB/hfi1: Add functions to receive TID RDMA READ response") Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Link: https://lore.kernel.org/r/20190815192039.105923.7852.stgit@awfm-01.aw.intel.com Signed-off-by: Doug Ledford <dledford@redhat.com>
Kaike Wan [Thu, 15 Aug 2019 19:20:33 +0000 (15:20 -0400)]
IB/hfi1: Drop stale TID RDMA packets
In a congested fabric with adaptive routing enabled, traces show that
the sender could receive stale TID RDMA NAK packets that contain newer
KDETH PSNs and older Verbs PSNs. If not dropped, these packets could
cause the incorrect rewinding of the software flows and the incorrect
completion of TID RDMA WRITE requests, and eventually leading to memory
corruption and kernel crash.
The current code drops stale TID RDMA ACK/NAK packets solely based
on KDETH PSNs, which may lead to erroneous processing. This patch
fixes the issue by also checking the Verbs PSN. Addition checks are
added before rewinding the TID RDMA WRITE DATA packets.
Fixes: 27d6f5ea6140 ("IB/hfi1: Add a function to receive TID RDMA ACK packet") Cc: <stable@vger.kernel.org> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Link: https://lore.kernel.org/r/20190815192033.105923.44192.stgit@awfm-01.aw.intel.com Signed-off-by: Doug Ledford <dledford@redhat.com>
Jason Gunthorpe [Thu, 15 Aug 2019 08:38:30 +0000 (11:38 +0300)]
RDMA/mlx5: Fix MR npages calculation for IB_ACCESS_HUGETLB
When ODP is enabled with IB_ACCESS_HUGETLB then the required pages
should be calculated based on the extent of the MR, which is rounded
to the nearest huge page alignment.
Fixes: 74627516f55c ("RDMA/umem: Move page_shift from ib_umem to ib_odp_umem") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Link: https://lore.kernel.org/r/20190815083834.9245-5-leon@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
Jens Axboe [Tue, 20 Aug 2019 17:03:11 +0000 (11:03 -0600)]
io_uring: don't enter poll loop if we have CQEs pending
We need to check if we have CQEs pending before starting a poll loop,
as those could be the events we will be spinning for (and hence we'll
find none). This can happen if a CQE triggers an error, or if it is
found by eg an IRQ before we get a chance to find it through polling.
nvme: Add quirk for LiteON CL1 devices running FW 22301111
One of the components in LiteON CL1 device has limitations that
can be encountered based upon boundary race conditions using the
nvme bus specific suspend to idle flow.
When this situation occurs the drive doesn't resume properly from
suspend-to-idle.
LiteON has confirmed this problem and fixed in the next firmware
version. As this firmware is already in the field, avoid running
nvme specific suspend to idle flow.
Fixes: f95338d9c7b3 ("nvme-pci: use host managed power state for suspend") Link: http://lists.infradead.org/pipermail/linux-nvme/2019-July/thread.html Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> Signed-off-by: Charles Hyde <charles.hyde@dellteam.com> Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Commit bb771bddf1e7 ("nvme: validate cntlid during controller initialisation")
introduced a validation for controllers with duplicate cntlid that runs
on nvme_init_subsystem(). The problem is that the validation relies on
ctrl->cntlid, and this value is assigned (from id_ctrl value) after the
call for nvme_init_subsystem() in nvme_init_identify() for non-fabrics
scenario. That leads to ctrl->cntlid always being 0 in case we have a
physical set of controllers in the same subsystem.
This patch fixes that by loading the discovered cntlid id_ctrl value into
ctrl->cntlid before the subsystem initialization, only for the non-fabrics
case. The patch was tested with emulated nvme devices (qemu) having two
controllers in a single subsystem. Without the patch, we couldn't make
it work failing in the duplicate check; when running with the patch, we
could see the subsystem holding both controllers.
For the fabrics case we see ctrl->cntlid has a more intricate relation
with the admin connect, so we didn't change that.
Anton Eidelman [Mon, 12 Aug 2019 20:00:36 +0000 (23:00 +0300)]
nvme-multipath: fix possible I/O hang when paths are updated
nvme_state_set_live() making a path available triggers requeue_work
in order to resubmit requests that ended up on requeue_list when no
paths were available.
This requeue_work may race with concurrent nvme_ns_head_make_request()
that do not observe the live path yet.
Such concurrent requests may by made by either:
- New IO submission.
- Requeue_work triggered by nvme_failover_req() or another ana_work.
A race may cause requeue_work capture the state of requeue_list before
more requests get onto the list. These requests will stay on the list
forever unless requeue_work is triggered again.
In order to prevent such race, nvme_state_set_live() should
synchronize_srcu(&head->srcu) before triggering the requeue_work and
prevent nvme_ns_head_make_request referencing an old snapshot of the
path list.
Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Anton Eidelman <anton@lightbitslabs.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Mon, 19 Aug 2019 18:15:59 +0000 (12:15 -0600)]
io_uring: fix potential hang with polled IO
If a request issue ends up being punted to async context to avoid
blocking, we can get into a situation where the original application
enters the poll loop for that very request before it has been issued.
This should not be an issue, except that the polling will hold the
io_uring uring_ctx mutex for the duration of the poll. When the async
worker has actually issued the request, it needs to acquire this mutex
to add the request to the poll issued list. Since the application
polling is already holding this mutex, the workqueue sleeps on the
mutex forever, and the application thus never gets a chance to poll for
the very request it was interested in.
Fix this by ensuring that the polling drops the uring_ctx occasionally
if it's not making any progress.
Reported-by: Jeffrey M. Birnbaum <jmbnyc@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
James Smart [Fri, 16 Aug 2019 02:36:49 +0000 (19:36 -0700)]
scsi: lpfc: Mitigate high memory pre-allocation by SCSI-MQ
When SCSI-MQ is enabled, the SCSI-MQ layers will do pre-allocation of MQ
resources based on shost values set by the driver. In newer cases of the
driver, which attempts to set nr_hw_queues to the cpu count, the
multipliers become excessive, with a single shost having SCSI-MQ
pre-allocation reaching into the multiple GBytes range. NPIV, which
creates additional shosts, only multiply this overhead. On lower-memory
systems, this can exhaust system memory very quickly, resulting in a system
crash or failures in the driver or elsewhere due to low memory conditions.
After testing several scenarios, the situation can be mitigated by limiting
the value set in shost->nr_hw_queues to 4. Although the shost values were
changed, the driver still had per-cpu hardware queues of its own that
allowed parallelization per-cpu. Testing revealed that even with the
smallish number for nr_hw_queues for SCSI-MQ, performance levels remained
near maximum with the within-driver affiinitization.
A module parameter was created to allow the value set for the nr_hw_queues
to be tunable.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Ira Weiny [Tue, 20 Aug 2019 01:15:28 +0000 (18:15 -0700)]
fs/xfs: Fix return code of xfs_break_leased_layouts()
The parens used in the while loop would result in error being assigned
the value 1 rather than the intended errno value.
This is required to return -ETXTBSY from follow on break_layout()
changes.
Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Linus Torvalds [Mon, 19 Aug 2019 23:28:25 +0000 (16:28 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A couple fixes to the core framework logic that finds clk parents, a
handful of samsung clk driver fixes for audio and display clks, and a
small fix for the Stratix10 SoC driver that was checking the wrong
register for validity"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: Fix potential NULL dereference in clk_fetch_parent_index()
clk: Fix falling back to legacy parent string matching
clk: socfpga: stratix10: fix rate caclulationg for cnt_clks
clk: samsung: exynos542x: Move MSCL subsystem clocks to its sub-CMU
clk: samsung: exynos5800: Move MAU subsystem clocks to MAU sub-CMU
clk: samsung: Change signature of exynos5_subcmus_init() function
Linus Torvalds [Mon, 19 Aug 2019 23:17:59 +0000 (16:17 -0700)]
Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull kernel thread signal handling fix from Eric Biederman:
"I overlooked the fact that kernel threads are created with all signals
set to SIG_IGN, and accidentally caused a regression in cifs and drbd
when replacing force_sig with send_sig.
This is my fix for that regression. I add a new function
allow_kernel_signal which allows kernel threads to receive signals
sent from the kernel, but continues to ignore all signals sent from
userspace. This ensures the user space interface for cifs and drbd
remain the same.
These kernel threads depend on blocking networking calls which block
until something is received or a signal is pending. Making receiving
of signals somewhat necessary for these kernel threads.
Perhaps someday we can cleanup those interfaces and remove
allow_kernel_signal. If not allow_kernel_signal is pretty trivial and
clearly documents what is going on so I don't think we will mind
carrying it"
* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal: Allow cifs and drbd to receive their terminating signals
1) Fix jmp to 1st instruction in x64 JIT, from Alexei Starovoitov.
2) Severl kTLS fixes in mlx5 driver, from Tariq Toukan.
3) Fix severe performance regression due to lack of SKB coalescing of
fragments during local delivery, from Guillaume Nault.
4) Error path memory leak in sch_taprio, from Ivan Khoronzhuk.
5) Fix batched events in skbedit packet action, from Roman Mashak.
6) Propagate VLAN TX offload to hw_enc_features in bond and team
drivers, from Yue Haibing.
7) RXRPC local endpoint refcounting fix and read after free in
rxrpc_queue_local(), from David Howells.
8) Fix endian bug in ibmveth multicast list handling, from Thomas
Falcon.
9) Oops, make nlmsg_parse() wrap around the correct function,
__nlmsg_parse not __nla_parse(). Fix from David Ahern.
10) Memleak in sctp_scend_reset_streams(), fro Zheng Bin.
11) Fix memory leak in cxgb4, from Wenwen Wang.
12) Yet another race in AF_PACKET, from Eric Dumazet.
13) Fix false detection of retransmit failures in tipc, from Tuong
Lien.
14) Use after free in ravb_tstamp_skb, from Tho Vu.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (101 commits)
ravb: Fix use-after-free ravb_tstamp_skb
netfilter: nf_tables: map basechain priority to hardware priority
net: sched: use major priority number as hardware priority
wimax/i2400m: fix a memory leak bug
net: cavium: fix driver name
ibmvnic: Unmap DMA address of TX descriptor buffers after use
bnxt_en: Fix to include flow direction in L2 key
bnxt_en: Use correct src_fid to determine direction of the flow
bnxt_en: Suppress HWRM errors for HWRM_NVM_GET_VARIABLE command
bnxt_en: Fix handling FRAG_ERR when NVM_INSTALL_UPDATE cmd fails
bnxt_en: Improve RX doorbell sequence.
bnxt_en: Fix VNIC clearing logic for 57500 chips.
net: kalmia: fix memory leaks
cx82310_eth: fix a memory leak bug
bnx2x: Fix VF's VLAN reconfiguration in reload.
Bluetooth: Add debug setting for changing minimum encryption key size
tipc: fix false detection of retransmit failures
lan78xx: Fix memory leaks
MAINTAINERS: r8169: Update path to the driver
MAINTAINERS: PHY LIBRARY: Update files in the record
...
David Howells [Mon, 19 Aug 2019 15:02:01 +0000 (16:02 +0100)]
keys: Fix description size
The maximum key description size is 4095. Commit e89a490f3192 ("keys:
Simplify key description management") inadvertantly reduced that to 255
and made sizes between 256 and 4095 work weirdly, and any size whereby
size & 255 == 0 would cause an assertion in __key_link_begin() at the
following line:
BUG_ON(index_key->desc_len == 0);
This can be fixed by simply increasing the size of desc_len in struct
keyring_index_key to a u16.
Note the argument length test in keyutils only checked empty
descriptions and descriptions with a size around the limit (ie. 4095)
and not for all the values in between, so it missed this. This has been
addressed and
keyctl add user "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" a @s
Fixes: e89a490f3192 ("keys: Simplify key description management") Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>