Merge tag 'timers-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull timer fix from Ingo Molnar:
"Fix a suspend/resume regression (crash) on TI AM3/AM4 SoC's"
* tag 'timers-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/drivers/timer-ti-dm: Fix suspend and resume for am3 and am4
Merge tag 'sched-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull scheduler fixes from Ingo Molnar:
"Fix a race introduced by the recent loadavg race fix, plus add a debug
check for a hard to debug case of bogus wakeup function flags"
* tag 'sched-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Warn if garbage is passed to default_wake_function()
sched: Fix race against ptrace_freeze_trace()
Merge tag 'efi-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into master
Pull EFI fixes from Ingo Molnar:
"Various EFI fixes:
- Fix the layering violation in the use of the EFI runtime services
availability mask in users of the 'efivars' abstraction
- Revert build fix for GCC v4.8 which is no longer supported
- Clean up some x86 EFI stub details, some of which are borderline
bugs that copy around garbage into padding fields - let's fix these
out of caution.
- Fix build issues while working on RISC-V support
- Avoid --whole-archive when linking the stub on arm64"
* tag 'efi-urgent-2020-07-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi: Revert "efi/x86: Fix build with gcc 4"
efi/efivars: Expose RT service availability via efivars abstraction
efi/libstub: Move the function prototypes to header file
efi/libstub: Fix gcc error around __umoddi3 for 32 bit builds
efi/libstub/arm64: link stub lib.a conditionally
efi/x86: Only copy upto the end of setup_header
efi/x86: Remove unused variables
Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net into master
Pull networking fixes from David Miller:
1) Fix RCU locaking in iwlwifi, from Johannes Berg.
2) mt76 can access uninitialized NAPI struct, from Felix Fietkau.
3) Fix race in updating pause settings in bnxt_en, from Vasundhara
Volam.
4) Propagate error return properly during unbind failures in ax88172a,
from George Kennedy.
5) Fix memleak in adf7242_probe, from Liu Jian.
6) smc_drv_probe() can leak, from Wang Hai.
7) Don't muck with the carrier state if register_netdevice() fails in
the bonding driver, from Taehee Yoo.
8) Fix memleak in dpaa_eth_probe, from Liu Jian.
9) Need to check skb_put_padto() return value in hsr_fill_tag(), from
Murali Karicheri.
10) Don't lose ionic RSS hash settings across FW update, from Shannon
Nelson.
11) Fix clobbered SKB control block in act_ct, from Wen Xu.
12) Missing newlink in "tx_timeout" sysfs output, from Xiongfeng Wang.
13) IS_UDPLITE cleanup a long time ago, incorrectly handled
transformations involving UDPLITE_RECV_CC. From Miaohe Lin.
14) Unbalanced locking in netdevsim, from Taehee Yoo.
15) Suppress false-positive error messages in qed driver, from Alexander
Lobakin.
16) Out of bounds read in ax25_connect and ax25_sendmsg, from Peilin Ye.
17) Missing SKB release in cxgb4's uld_send(), from Navid Emamdoost.
18) Uninitialized value in geneve_changelink(), from Cong Wang.
19) Fix deadlock in xen-netfront, from Andera Righi.
19) flush_backlog() frees skbs with IRQs disabled, so should use
dev_kfree_skb_irq() instead of kfree_skb(). From Subash Abhinov
Kasiviswanathan.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits)
drivers/net/wan: lapb: Corrected the usage of skb_cow
dev: Defer free of skbs in flush_backlog
qrtr: orphan socket in qrtr_release()
xen-netfront: fix potential deadlock in xennet_remove()
flow_offload: Move rhashtable inclusion to the source file
geneve: fix an uninitialized value in geneve_changelink()
bonding: check return value of register_netdevice() in bond_newlink()
tcp: allow at most one TLP probe per flight
AX.25: Prevent integer overflows in connect and sendmsg
cxgb4: add missing release on skb in uld_send()
net: atlantic: fix PTP on AQC10X
AX.25: Prevent out-of-bounds read in ax25_sendmsg()
sctp: shrink stream outq when fails to do addstream reconf
sctp: shrink stream outq only when new outcnt < old outcnt
AX.25: Fix out-of-bounds read in ax25_connect()
enetc: Remove the mdio bus on PF probe bailout
net: ethernet: ti: add NETIF_F_HW_TC hw feature flag for taprio offload
net: ethernet: ave: Fix error returns in ave_init
drivers/net/wan/x25_asy: Fix to make it work
ipvs: fix the connection sync failed in some cases
...
Xie He [Fri, 24 Jul 2020 16:33:47 +0000 (09:33 -0700)]
drivers/net/wan: lapb: Corrected the usage of skb_cow
This patch fixed 2 issues with the usage of skb_cow in LAPB drivers
"lapbether" and "hdlc_x25":
1) After skb_cow fails, kfree_skb should be called to drop a reference
to the skb. But in both drivers, kfree_skb is not called.
2) skb_cow should be called before skb_push so that is can ensure the
safety of skb_push. But in "lapbether", it is incorrectly called after
skb_push.
More details about these 2 issues:
1) The behavior of calling kfree_skb on failure is also the behavior of
netif_rx, which is called by this function with "return netif_rx(skb);".
So this function should follow this behavior, too.
2) In "lapbether", skb_cow is called after skb_push. This results in 2
logical issues:
a) skb_push is not protected by skb_cow;
b) An extra headroom of 1 byte is ensured after skb_push. This extra
headroom has no use in this function. It also has no use in the
upper-layer function that this function passes the skb to
(x25_lapb_receive_frame in net/x25/x25_dev.c).
So logically skb_cow should instead be called before skb_push.
Cc: Eric Dumazet <edumazet@google.com> Cc: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Xie He <xie.he.0141@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
IRQs are disabled when freeing skbs in input queue.
Use the IRQ safe variant to free skbs here.
Fixes: 12df4281be2a ("net: flush the softnet backlog in process context") Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'pci-v5.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci into master
Pull PCI fixes from Bjorn Helgaas:
- Reject invalid IRQ 0 command line argument for virtio_mmio because
IRQ 0 now generates warnings (Bjorn Helgaas)
- Revert "PCI/PM: Assume ports without DLL Link Active train links in
100 ms", which broke nouveau (Bjorn Helgaas)
* tag 'pci-v5.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
Revert "PCI/PM: Assume ports without DLL Link Active train links in 100 ms"
virtio-mmio: Reject invalid IRQ 0 command line argument
Cong Wang [Fri, 24 Jul 2020 16:45:51 +0000 (09:45 -0700)]
qrtr: orphan socket in qrtr_release()
We have to detach sock from socket in qrtr_release(),
otherwise skb->sk may still reference to this socket
when the skb is released in tun->queue, particularly
sk->sk_wq still points to &sock->wq, which leads to
a UAF.
Reported-and-tested-by: syzbot+6720d64f31c081c2f708@syzkaller.appspotmail.com Fixes: 99f74953578f ("net: qrtr: Expose tunneling endpoint to user space") Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Andrea Righi [Fri, 24 Jul 2020 08:59:10 +0000 (10:59 +0200)]
xen-netfront: fix potential deadlock in xennet_remove()
There's a potential race in xennet_remove(); this is what the driver is
doing upon unregistering a network device:
1. state = read bus state
2. if state is not "Closed":
3. request to set state to "Closing"
4. wait for state to be set to "Closing"
5. request to set state to "Closed"
6. wait for state to be set to "Closed"
If the state changes to "Closed" immediately after step 1 we are stuck
forever in step 4, because the state will never go back from "Closed" to
"Closing".
Make sure to check also for state == "Closed" in step 4 to prevent the
deadlock.
Also add a 5 sec timeout any time we wait for the bus state to change,
to avoid getting stuck forever in wait_event().
Signed-off-by: Andrea Righi <andrea.righi@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Fri, 24 Jul 2020 00:50:22 +0000 (10:50 +1000)]
flow_offload: Move rhashtable inclusion to the source file
I noticed that touching linux/rhashtable.h causes lib/vsprintf.c to
be rebuilt. This dependency came through a bogus inclusion in the
file net/flow_offload.h. This patch moves it to the right place.
This patch also removes a lingering rhashtable inclusion in cls_api
created by the same commit.
Fixes: 103e3997221b ("flow_offload: move tc indirect block to...") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge branch 'akpm' into master (patches from Andrew)
Merge misc fixes from Andrew Morton:
"Subsystems affected by this patch series: mm/pagemap, mm/shmem,
mm/hotfixes, mm/memcg, mm/hugetlb, mailmap, squashfs, scripts,
io-mapping, MAINTAINERS, and gdb"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
scripts/gdb: fix lx-symbols 'gdb.error' while loading modules
MAINTAINERS: add KCOV section
io-mapping: indicate mapping failure
scripts/decode_stacktrace: strip basepath from all paths
squashfs: fix length field overlap check in metadata reading
mailmap: add entry for Mike Rapoport
khugepaged: fix null-pointer dereference due to race
mm/hugetlb: avoid hardcoding while checking if cma is enabled
mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
mm/memcg: fix refcount error while moving and swapping
mm/memcontrol: fix OOPS inside mem_cgroup_get_nr_swap_pages()
mm: initialize return of vm_insert_pages
vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way
mm/mmap.c: close race between munmap() and expand_upwards()/downwards()
Merge tag 'for-5.8-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into master
Pull btrfs fixes from David Sterba:
"A few resouce leak fixes from recent patches, all are stable material.
The problems have been observed during testing or have a reproducer"
* tag 'for-5.8-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: fix mount failure caused by race with umount
btrfs: fix page leaks after failure to lock page for delalloc
btrfs: qgroup: fix data leak caused by race between writeback and truncate
btrfs: fix double free on ulist after backref resolution failure
Merge tag 'zonefs-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs into master
Pull zonefs fixes from Damien Le Moal:
"Two fixes, the first one to remove compilation warnings and the second
to avoid potentially inefficient allocation of BIOs for direct writes
into sequential zones"
* tag 'zonefs-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
zonefs: count pages after truncating the iterator
zonefs: Fix compilation warning
Merge tag 'io_uring-5.8-2020-07-24' of git://git.kernel.dk/linux-block into master
Pull io_uring fixes from Jens Axboe:
- Fix discrepancy in how sqe->flags are treated for a few requests,
this makes it consistent (Daniele)
- Ensure that poll driven retry works with double waitqueue poll users
- Fix a missing io_req_init_async() (Pavel)
* tag 'io_uring-5.8-2020-07-24' of git://git.kernel.dk/linux-block:
io_uring: missed req_init_async() for IOSQE_ASYNC
io_uring: always allow drain/link/hardlink/async sqe flags
io_uring: ensure double poll additions work with both request types
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma into master
Pull rdma fixes from Jason Gunthorpe:
"One merge window regression, some corruption bugs in HNS and a few
more syzkaller fixes:
- Two long standing syzkaller races
- Fix incorrect HW configuration in HNS
- Restore accidentally dropped locking in IB CM
- Fix ODP prefetch bug added in the big rework several versions ago"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/mlx5: Prevent prefetch from racing with implicit destruction
RDMA/cm: Protect access to remote_sidr_table
RDMA/core: Fix race in rdma_alloc_commit_uobject()
RDMA/hns: Fix wrong PBL offset when VA is not aligned to PAGE_SIZE
RDMA/hns: Fix wrong assignment of lp_pktn_ini in QPC
RDMA/mlx5: Use xa_lock_irq when access to SRQ table
Merge tag 'for-5.8/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm into master
Pull device mapper fix from Mike Snitzer:
"A stable fix for DM integrity target's integrity recalculation that
gets skipped when resuming a device. This is a fix for a previous
stable@ fix"
* tag 'for-5.8/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm integrity: fix integrity recalculation that is improperly skipped
Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux into master
Pull i2c fixes from Wolfram Sang:
"Again some driver bugfixes and some documentation fixes"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: i2c-qcom-geni: Fix DMA transfer race
i2c: rcar: always clear ICSAR to avoid side effects
MAINTAINERS: i2c: at91: handover maintenance to Codrin Ciubotariu
i2c: drop duplicated word in the header file
i2c: cadence: Clear HOLD bit at correct time in Rx path
Revert "i2c: cadence: Fix the hold bit setting"
Merge tag 'drm-fixes-2020-07-24' of git://anongit.freedesktop.org/drm/drm into master
Pull drm fixes from Dave Airlie:
"Quiet fixes, I may have a single regression fix follow up to this for
nouveau, but it might be next week, Ben was testing it a bit more .
Otherwise two amdgpu fixes, one lima and one sun4i:
amdgpu:
- Fix crash when overclocking VegaM
- Fix possible crash when editing dpm levels
sun4i:
- Fix inverted HPD result; fixes an earlier fix
lima:
- fix timeout during reset"
* tag 'drm-fixes-2020-07-24' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu: Fix NULL dereference in dpm sysfs handlers
drm/amd/powerplay: fix a crash when overclocking Vega M
drm/lima: fix wait pp reset timeout
drm: sun4i: hdmi: Fix inverted HPD result
scripts/gdb: fix lx-symbols 'gdb.error' while loading modules
Commit 1ebbd1dd767a ("module: Refactor section attr into bin attribute")
removed the 'name' field from 'struct module_sect_attr' triggering the
following error when invoking lx-symbols:
(gdb) lx-symbols
loading vmlinux
scanning for modules in linux/build
loading @0xffffffffc014f000: linux/build/drivers/net/tun.ko
Python Exception <class 'gdb.error'> There is no member named name.:
Error occurred in Python: There is no member named name.
This patch fixes the issue taking the module name from the 'struct
attribute'.
Fixes: 1ebbd1dd767a ("module: Refactor section attr into bin attribute") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Kieran Bingham <kbingham@kernel.org> Link: http://lkml.kernel.org/r/20200722102239.313231-1-sgarzare@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: 652612a28b50 ("io-mapping: Always create a struct to hold metadata about the io-mapping") Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200721171936.81563-1-michael.j.ruhl@intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
scripts/decode_stacktrace: strip basepath from all paths
Currently the basepath is removed only from the beginning of the string.
When the symbol is inlined and there's multiple line outputs of
addr2line, only the first line would have basepath removed.
Change to remove the basepath prefix from all lines.
Fixes: d78ec5f789f1 ("scripts/decode_stacktrace: match basepath using shell prefix operator, not regex") Co-developed-by: Shik Chen <shik@chromium.org> Signed-off-by: Pi-Hsun Shih <pihsun@chromium.org> Signed-off-by: Shik Chen <shik@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Cc: Sasha Levin <sashal@kernel.org> Cc: Nicolas Boichat <drinkcat@chromium.org> Cc: Jiri Slaby <jslaby@suse.cz> Link: http://lkml.kernel.org/r/20200720082709.252805-1-pihsun@chromium.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
squashfs: fix length field overlap check in metadata reading
This is a regression introduced by the "migrate from ll_rw_block usage
to BIO" patch.
Squashfs packs structures on byte boundaries, and due to that the length
field (of the metadata block) may not be fully in the current block.
The new code rewrote and introduced a faulty check for that edge case.
Fixes: fa3d76cb540228bb7f ("squashfs: migrate from ll_rw_block usage to BIO") Reported-by: Bernd Amend <bernd.amend@gmail.com> Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Adrien Schildknecht <adrien+dev@schischi.me> Cc: Guenter Roeck <groeck@chromium.org> Cc: Daniel Rosenberg <drosen@google.com> Link: http://lkml.kernel.org/r/20200717195536.16069-1-phillip@squashfs.org.uk Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
khugepaged: fix null-pointer dereference due to race
khugepaged has to drop mmap lock several times while collapsing a page.
The situation can change while the lock is dropped and we need to
re-validate that the VMA is still in place and the PMD is still subject
for collapse.
But we miss one corner case: while collapsing an anonymous pages the VMA
could be replaced with file VMA. If the file VMA doesn't have any
private pages we get NULL pointer dereference:
general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
anon_vma_lock_write include/linux/rmap.h:120 [inline]
collapse_huge_page mm/khugepaged.c:1110 [inline]
khugepaged_scan_pmd mm/khugepaged.c:1349 [inline]
khugepaged_scan_mm_slot mm/khugepaged.c:2110 [inline]
khugepaged_do_scan mm/khugepaged.c:2193 [inline]
khugepaged+0x3bba/0x5a10 mm/khugepaged.c:2238
The fix is to make sure that the VMA is anonymous in
hugepage_vma_revalidate(). The helper is only used for collapsing
anonymous pages.
Fixes: 21abde3434f6 ("mm,thp: add read-only THP support for (non-shmem) FS") Reported-by: syzbot+ed318e8b790ca72c5ad0@syzkaller.appspotmail.com Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Acked-by: Yang Shi <yang.shi@linux.alibaba.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200722121439.44328-1-kirill.shutemov@linux.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Barry Song [Fri, 24 Jul 2020 04:15:30 +0000 (21:15 -0700)]
mm/hugetlb: avoid hardcoding while checking if cma is enabled
hugetlb_cma[0] can be NULL due to various reasons, for example, node0
has no memory. so NULL hugetlb_cma[0] doesn't necessarily mean cma is
not enabled. gigantic pages might have been reserved on other nodes.
This patch fixes possible double reservation and CMA leak.
[akpm@linux-foundation.org: fix CONFIG_CMA=n warning]
[sfr@canb.auug.org.au: better checks before using hugetlb_cma] Link: http://lkml.kernel.org/r/20200721205716.6dbaa56b@canb.auug.org.au Fixes: 0eeb7efcd4e6 ("mm: hugetlb: optionally allocate gigantic hugepages using cma") Signed-off-by: Barry Song <song.bao.hua@hisilicon.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Roman Gushchin <guro@fb.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200710005726.36068-1-song.bao.hua@hisilicon.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Muchun Song [Fri, 24 Jul 2020 04:15:27 +0000 (21:15 -0700)]
mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
If the kmem_cache refcount is greater than one, we should not mark the
root kmem_cache as dying. If we mark the root kmem_cache dying
incorrectly, the non-root kmem_cache can never be destroyed. It
resulted in memory leak when memcg was destroyed. We can use the
following steps to reproduce.
1) Use kmem_cache_create() to create a new kmem_cache named A.
2) Coincidentally, the kmem_cache A is an alias for kmem_cache B,
so the refcount of B is just increased.
3) Use kmem_cache_destroy() to destroy the kmem_cache A, just
decrease the B's refcount but mark the B as dying.
4) Create a new memory cgroup and alloc memory from the kmem_cache
B. It leads to create a non-root kmem_cache for allocating memory.
5) When destroy the memory cgroup created in the step 4), the
non-root kmem_cache can never be destroyed.
If we repeat steps 4) and 5), this will cause a lot of memory leak. So
only when refcount reach zero, we mark the root kmem_cache as dying.
Fixes: 4c15dba4b967 ("mm: fix race between kmem_cache destroy, create and deactivate") Signed-off-by: Muchun Song <songmuchun@bytedance.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Shakeel Butt <shakeelb@google.com> Acked-by: Roman Gushchin <guro@fb.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20200716165103.83462-1-songmuchun@bytedance.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/memcg: fix refcount error while moving and swapping
It was hard to keep a test running, moving tasks between memcgs with
move_charge_at_immigrate, while swapping: mem_cgroup_id_get_many()'s
refcount is discovered to be 0 (supposedly impossible), so it is then
forced to REFCOUNT_SATURATED, and after thousands of warnings in quick
succession, the test is at last put out of misery by being OOM killed.
This is because of the way moved_swap accounting was saved up until the
task move gets completed in __mem_cgroup_clear_mc(), deferred from when
mem_cgroup_move_swap_account() actually exchanged old and new ids.
Concurrent activity can free up swap quicker than the task is scanned,
bringing id refcount down 0 (which should only be possible when
offlining).
Just skip that optimization: do that part of the accounting immediately.
Fixes: 0103cd5f4f9b ("mm: memcontrol: fix memcg id ref counter on swap charge move") Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Alex Shi <alex.shi@linux.alibaba.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Michal Hocko <mhocko@suse.com> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2007071431050.4726@eggly.anvils Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Prabhakar reported an OOPS inside mem_cgroup_get_nr_swap_pages()
function in a corner case seen on some arm64 boards when kdump kernel
runs with "cgroup_disable=memory" passed to the kdump kernel via
bootargs.
The root-cause behind the same is that currently mem_cgroup_swap_init()
function is implemented as a subsys_initcall() call instead of a
core_initcall(), this means 'cgroup_memory_noswap' still remains set to
the default value (false) even when memcg is disabled via
"cgroup_disable=memory" boot parameter.
This may result in premature OOPS inside mem_cgroup_get_nr_swap_pages()
function in corner cases:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000188
Mem abort info:
ESR = 0x96000006
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000006
CM = 0, WnR = 0
[0000000000000188] user address but active_mm is swapper
Internal error: Oops: 96000006 [#1] SMP
Modules linked in:
<..snip..>
Call trace:
mem_cgroup_get_nr_swap_pages+0x9c/0xf4
shrink_lruvec+0x404/0x4f8
shrink_node+0x1a8/0x688
do_try_to_free_pages+0xe8/0x448
try_to_free_pages+0x110/0x230
__alloc_pages_slowpath.constprop.106+0x2b8/0xb48
__alloc_pages_nodemask+0x2ac/0x2f8
alloc_page_interleave+0x20/0x90
alloc_pages_current+0xdc/0xf8
atomic_pool_expand+0x60/0x210
__dma_atomic_pool_init+0x50/0xa4
dma_atomic_pool_init+0xac/0x158
do_one_initcall+0x50/0x218
kernel_init_freeable+0x22c/0x2d0
kernel_init+0x18/0x110
ret_from_fork+0x10/0x18
Code: aa1403e39110600097f82a2714000011 (f940c663)
---[ end trace 9795948475817de4 ]---
Kernel panic - not syncing: Fatal exception
Rebooting in 10 seconds..
Fixes: ef3a4fbf23da ("mm: memcontrol: prepare swap controller setup for integration") Reported-by: Prabhakar Kushwaha <pkushwaha@marvell.com> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: James Morse <james.morse@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Link: http://lkml.kernel.org/r/1593641660-13254-2-git-send-email-bhsharma@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tom Rix [Fri, 24 Jul 2020 04:15:18 +0000 (21:15 -0700)]
mm: initialize return of vm_insert_pages
clang static analysis reports a garbage return
In file included from mm/memory.c:84:
mm/memory.c:1612:2: warning: Undefined or garbage value returned to caller [core.uninitialized.UndefReturn]
return err;
^~~~~~~~~~
The setting of err depends on a loop executing. So initialize err.
vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way
After commit bd7eca96bd4b ("kernfs: kvmalloc xattr value instead of
kmalloc"), simple xattr entry is allocated with kvmalloc() instead of
kmalloc(), so we should release it with kvfree() instead of kfree().
Fixes: bd7eca96bd4b ("kernfs: kvmalloc xattr value instead of kmalloc") Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Daniel Xu <dxu@dxuuu.xyz> Cc: Chris Down <chris@chrisdown.name> Cc: Andreas Dilger <adilger@dilger.ca> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> [5.7] Link: http://lkml.kernel.org/r/20200704051608.15043-1-cgxu519@mykernel.net Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/mmap.c: close race between munmap() and expand_upwards()/downwards()
VMA with VM_GROWSDOWN or VM_GROWSUP flag set can change their size under
mmap_read_lock(). It can lead to race with __do_munmap():
Thread A Thread B
__do_munmap()
detach_vmas_to_be_unmapped()
mmap_write_downgrade()
expand_downwards()
vma->vm_start = address;
// The VMA now overlaps with
// VMAs detached by the Thread A
// page fault populates expanded part
// of the VMA
unmap_region()
// Zaps pagetables partly
// populated by Thread B
Similar race exists for expand_upwards().
The fix is to avoid downgrading mmap_lock in __do_munmap() if detached
VMAs are next to VM_GROWSDOWN or VM_GROWSUP VMA.
[akpm@linux-foundation.org: s/mmap_sem/mmap_lock/ in comment]
Fixes: 5dcd5782f466 ("mm: mmap: zap pages with read mmap_sem in munmap") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> [4.20+] Link: http://lkml.kernel.org/r/20200709105309.42495-1-kirill.shutemov@linux.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cong Wang [Wed, 22 Jul 2020 23:31:54 +0000 (16:31 -0700)]
bonding: check return value of register_netdevice() in bond_newlink()
Very similar to commit 28de64e5daa7
("bonding: check error value of register_netdevice() immediately"),
we should immediately check the return value of register_netdevice()
before doing anything else.
Fixes: 1b0b93607679 ("bonding: set carrier off for devices created through netlink") Reported-and-tested-by: syzbot+bbc3a11c4da63c1b74d6@syzkaller.appspotmail.com Cc: Beniamino Galvani <bgalvani@redhat.com> Cc: Taehee Yoo <ap420073@gmail.com> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Upon additional testing with older servers, it was found that
the original commit introduced a regression when using the old SMB1
dialect and rsyncing over an existing file.
The patch will need to be respun to address this, likely including
a larger refactoring of the SMB1 and SMB3 rename code paths to make
it less confusing and also to address some additional rename error
cases that SMB3 may be able to workaround.
Signed-off-by: Steve French <stfrench@microsoft.com> Reported-by: Patrick Fernie <patrick.fernie@gmail.com> CC: Stable <stable@vger.kernel.org> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com> Acked-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Merge tag 's390-5.8-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux into master
Pull s390 fixes from Heiko Carstens:
- Change cpum_cf/perf counter name from DFLT_CCERROR to DFLT_CCFINISH
to reflect reality and avoid further confusion. This is a user space
visible change therefore the commit has also a stable tag for 5.7,
where this counter was introduced.
- Add Matthew Rosato as s390 IOMMU maintainer.
* tag 's390-5.8-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
MAINTAINERS: add Matthew for s390 IOMMU
s390/cpum_cf,perf: change DFLT_CCERROR counter name
Douglas Anderson [Wed, 22 Jul 2020 22:00:21 +0000 (15:00 -0700)]
i2c: i2c-qcom-geni: Fix DMA transfer race
When I have KASAN enabled on my kernel and I start stressing the
touchscreen my system tends to hang. The touchscreen is one of the
only things that does a lot of big i2c transfers and ends up hitting
the DMA paths in the geni i2c driver. It appears that KASAN adds
enough delay in my system to tickle a race condition in the DMA setup
code.
When the system hangs, I found that it was running the geni_i2c_irq()
over and over again. It had these:
Notably we're in DMA mode but are getting M_RX_IRQ_EN and
M_RX_FIFO_WATERMARK_EN over and over again.
Putting some traces in geni_i2c_rx_one_msg() showed that when we
failed we were getting to the start of geni_i2c_rx_one_msg() but were
never executing geni_se_rx_dma_prep().
I believe that the problem here is that we are starting the geni
command before we run geni_se_rx_dma_prep(). If a transfer makes it
far enough before we do that then we get into the state I have
observed. Let's change the order, which seems to work fine.
Although problems were seen on the RX path, code inspection suggests
that the TX should be changed too. Change it as well.
Fixes: e5dc314c03a3 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller") Signed-off-by: Douglas Anderson <dianders@chromium.org> Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Reviewed-by: Akash Asthana <akashast@codeaurora.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Mukesh Kumar Savaliya <msavaliy@codeaurora.org> Signed-off-by: Wolfram Sang <wsa@kernel.org>
Wolfram Sang [Sat, 4 Jul 2020 13:38:29 +0000 (15:38 +0200)]
i2c: rcar: always clear ICSAR to avoid side effects
On R-Car Gen2, we get a timeout when reading from the address set in
ICSAR, even though the slave interface is disabled. Clearing it fixes
this situation. Note that Gen3 is not affected.
To reproduce: bind and undbind an I2C slave on some bus, run
'i2cdetect' on that bus.
Fixes: 151af86086c4 ("i2c: rcar: add slave support") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
Previously TLP may send multiple probes of new data in one
flight. This happens when the sender is cwnd limited. After the
initial TLP containing new data is sent, the sender receives another
ACK that acks partial inflight. It may re-arm another TLP timer
to send more, if no further ACK returns before the next TLP timeout
(PTO) expires. The sender may send in theory a large amount of TLP
until send queue is depleted. This only happens if the sender sees
such irregular uncommon ACK pattern. But it is generally undesirable
behavior during congestion especially.
The original TLP design restrict only one TLP probe per inflight as
published in "Reducing Web Latency: the Virtue of Gentle Aggression",
SIGCOMM 2013. This patch changes TLP to send at most one probe
per inflight.
Note that if the sender is app-limited, TLP retransmits old data
and did not have this issue.
Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 23 Jul 2020 14:49:57 +0000 (17:49 +0300)]
AX.25: Prevent integer overflows in connect and sendmsg
We recently added some bounds checking in ax25_connect() and
ax25_sendmsg() and we so we removed the AX25_MAX_DIGIS checks because
they were no longer required.
Unfortunately, I believe they are required to prevent integer overflows
so I have added them back.
Fixes: 84b9456ddd8e ("AX.25: Prevent out-of-bounds read in ax25_sendmsg()") Fixes: 6bfe1b919f17 ("AX.25: Fix out-of-bounds read in ax25_connect()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
The problem is dm_suspended() returns true not only during suspend,
but also during resume. So this race condition could occur:
1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
2. integrity_recalc (&ic->recalc_work) preempts the current thread
3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
4. integrity_recalc exits and no recalculating is done.
To fix this race condition, add a function dm_post_suspending that is
only true during the postsuspend phase and use it instead of
dm_suspended().
Pavel Begunkov [Thu, 23 Jul 2020 17:17:20 +0000 (20:17 +0300)]
io_uring: missed req_init_async() for IOSQE_ASYNC
IOSQE_ASYNC branch of io_queue_sqe() is another place where an
unitialised req->work can be accessed (i.e. prior io_req_init_async()).
Nothing really bad though, it just looses IO_WQ_WORK_CONCURRENT flag.
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
arm64: vdso32: Fix '--prefix=' value for newer versions of clang
Newer versions of clang only look for $(COMPAT_GCC_TOOLCHAIN_DIR)as [1],
rather than $(COMPAT_GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE_COMPAT)as,
resulting in the following build error:
$ make -skj"$(nproc)" ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- \
CROSS_COMPILE_COMPAT=arm-linux-gnueabi- LLVM=1 O=out/aarch64 distclean \
defconfig arch/arm64/kernel/vdso32/
...
/home/nathan/cbl/toolchains/llvm-binutils/bin/as: unrecognized option '-EL'
clang-12: error: assembler command failed with exit code 1 (use -v to see invocation)
make[3]: *** [arch/arm64/kernel/vdso32/Makefile:181: arch/arm64/kernel/vdso32/note.o] Error 1
...
Adding the value of CROSS_COMPILE_COMPAT (adding notdir to account for a
full path for CROSS_COMPILE_COMPAT) fixes this issue, which matches the
solution done for the main Makefile [2].
This patch fixes PTP on AQC10X.
PTP support on AQC10X requires FW involvement and FW configures the
TPS data arb mode itself.
So we must make sure driver doesn't touch TPS data arb mode on AQC10x
if PTP is enabled. Otherwise, there are no timestamps even though
packets are flowing.
Fixes: 28eee4eb691b7 ("net: atlantic: QoS implementation: min_rate") Signed-off-by: Egor Pomozov <epomozov@marvell.com> Signed-off-by: Mark Starovoytov <mstarovoitov@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peilin Ye [Wed, 22 Jul 2020 16:05:12 +0000 (12:05 -0400)]
AX.25: Prevent out-of-bounds read in ax25_sendmsg()
Checks on `addr_len` and `usax->sax25_ndigis` are insufficient.
ax25_sendmsg() can go out of bounds when `usax->sax25_ndigis` equals to 7
or 8. Fix it.
It is safe to remove `usax->sax25_ndigis > AX25_MAX_DIGIS`, since
`addr_len` is guaranteed to be less than or equal to
`sizeof(struct full_sockaddr_ax25)`
Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Wed, 22 Jul 2020 15:52:12 +0000 (23:52 +0800)]
sctp: shrink stream outq when fails to do addstream reconf
When adding a stream with stream reconf, the new stream firstly is in
CLOSED state but new out chunks can still be enqueued. Then once gets
the confirmation from the peer, the state will change to OPEN.
However, if the peer denies, it needs to roll back the stream. But when
doing that, it only sets the stream outcnt back, and the chunks already
in the new stream don't get purged. It caused these chunks can still be
dequeued in sctp_outq_dequeue_data().
As its stream is still in CLOSE, the chunk will be enqueued to the head
again by sctp_outq_head_data(). This chunk will never be sent out, and
the chunks after it can never be dequeued. The assoc will be 'hung' in
a dead loop of sending this chunk.
To fix it, this patch is to purge these chunks already in the new
stream by calling sctp_stream_shrink_out() when failing to do the
addstream reconf.
Fixes: 87f668e6dad9 ("sctp: implement receiver-side procedures for the Reconf Response Parameter") Reported-by: Ying Xu <yinxu@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Wed, 22 Jul 2020 15:52:11 +0000 (23:52 +0800)]
sctp: shrink stream outq only when new outcnt < old outcnt
It's not necessary to go list_for_each for outq->out_chunk_list
when new outcnt >= old outcnt, as no chunk with higher sid than
new (outcnt - 1) exists in the outqueue.
While at it, also move the list_for_each code in a new function
sctp_stream_shrink_out(), which will be used in the next patch.
Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Peilin Ye [Wed, 22 Jul 2020 15:19:01 +0000 (11:19 -0400)]
AX.25: Fix out-of-bounds read in ax25_connect()
Checks on `addr_len` and `fsa->fsa_ax25.sax25_ndigis` are insufficient.
ax25_connect() can go out of bounds when `fsa->fsa_ax25.sax25_ndigis`
equals to 7 or 8. Fix it.
This issue has been reported as a KMSAN uninit-value bug, because in such
a case, ax25_connect() reaches into the uninitialized portion of the
`struct sockaddr_storage` statically allocated in __sys_connect().
It is safe to remove `fsa->fsa_ax25.sax25_ndigis > AX25_MAX_DIGIS` because
`addr_len` is guaranteed to be less than or equal to
`sizeof(struct full_sockaddr_ax25)`.
For ENETC ports that register an external MDIO bus,
the bus doesn't get removed on the error bailout path
of enetc_pf_probe().
This issue became much more visible after recent:
commit 07095c025ac2 ("net: enetc: Use DT protocol information to set up the ports")
Before this commit, one could make probing fail on the error
path only by having register_netdev() fail, which is unlikely.
But after this commit, because it moved the enetc_of_phy_get()
call up in the probing sequence, now we can trigger an mdiobus_free()
bug just by forcing enetc_alloc_msix() to return error, i.e. with the
'pci=nomsi' kernel bootarg (since ENETC relies on MSI support to work),
as the calltrace below shows:
Thomas Gleixner [Wed, 22 Jul 2020 22:46:44 +0000 (00:46 +0200)]
Merge tag 'efi-urgent-for-v5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi into efi/urgent
Pull EFI fixes from Ard Biesheuvel:
- Fix the layering violation in the use of the EFI runtime services
availability mask in users of the 'efivars' abstraction
- Revert build fix for GCC v4.8 which is no longer supported
- Some fixes for build issues found by Atish while working on RISC-V support
- Avoid --whole-archive when linking the stub on arm64
- Some x86 EFI stub cleanups from Arvind
J. Bruce Fields [Wed, 15 Jul 2020 17:31:36 +0000 (13:31 -0400)]
nfsd4: fix NULL dereference in nfsd/clients display code
We hold the cl_lock here, and that's enough to keep stateid's from going
away, but it's not enough to prevent the files they point to from going
away. Take fi_lock and a reference and check for NULL, as we do in
other code.
Reported-by: NeilBrown <neilb@suse.de> Fixes: 20a26298532f ("nfsd4: add file to display list of client's opens") Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Merge tag 'media/v5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media into master
Pull media fixes from Mauro Carvalho Chehab:
"A series of fixes for the upcoming atomisp driver. They solve issues
when probing atomisp on devices with multiple cameras and get rid of
warnings when built with W=1.
The diffstat is a bit long, as this driver has several abstractions.
The patches that solved the issues with W=1 had to get rid of some
duplicated code (there used to have 2 versions of the same code, one
for ISP2401 and another one for ISP2400).
As this driver is not in 5.7, such changes won't cause regressions"
* tag 'media/v5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (38 commits)
Revert "media: atomisp: keep the ISP powered on when setting it"
media: atomisp: fix mask and shift operation on ISPSSPM0
media: atomisp: move system_local consts into a C file
media: atomisp: get rid of version-specific system_local.h
media: atomisp: move global stuff into a common header
media: atomisp: remove non-used 32-bits consts at system_local
media: atomisp: get rid of some unused static vars
media: atomisp: Fix error code in ov5693_probe()
media: atomisp: Replace trace_printk by pr_info
media: atomisp: Fix __func__ style warnings
media: atomisp: fix help message for ISP2401 selection
media: atomisp: i2c: atomisp-ov2680.c: fixed a brace coding style issue.
media: atomisp: make const arrays static, makes object smaller
media: atomisp: Clean up non-existing folders from Makefile
media: atomisp: Get rid of ACPI specifics in gmin_subdev_add()
media: atomisp: Provide Gmin subdev as parameter to gmin_subdev_add()
media: atomisp: Use temporary variable for device in gmin_subdev_add()
media: atomisp: Refactor PMIC detection to a separate function
media: atomisp: Deduplicate return ret in gmin_i2c_write()
media: atomisp: Make pointer to PMIC client global
...
Merge tag 'exfat-for-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat into master
Pull exfat fixes from Namjae Jeon:
- fix overflow issue at sector calculation
- fix wrong hint_stat initialization
- fix wrong size update of stream entry
- fix endianness of upname in name_hash computation
* tag 'exfat-for-5.8-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: fix name_hash computation on big endian systems
exfat: fix wrong size update of stream entry by typo
exfat: fix wrong hint_stat initialization in exfat_find_dir_entry()
exfat: fix overflow issue in exfat_cluster_to_sector()
Karol reported that this commit broke Nouveau firmware loading on a Lenovo
P1G2 with Intel UHD Graphics 630 and NVIDIA TU117GLM [Quadro T1000 Mobile]:
nouveau 0000:01:00.0: acr: AHESASC binary failed
In both cases, reverting 47d55665d087 solved the problem. Unfortunately,
this revert will reintroduce the "Thunderbolt bridges take long time to
resume from D3cold" problem:
https://bugzilla.kernel.org/show_bug.cgi?id=206837
virtio-mmio: Reject invalid IRQ 0 command line argument
The "virtio_mmio.device=" command line argument allows a user to specify
the size, address, and IRQ of a virtio device. Previously the only
requirement for the IRQ was that it be an unsigned integer.
Zero is an unsigned integer but an invalid IRQ number, and after 77849ba8d8719 ("driver core: platform: Clarify that IRQ 0 is invalid"),
attempts to use IRQ 0 cause warnings.
If the user specifies IRQ 0, return failure instead of registering a device
with IRQ 0.
Fixes: 77849ba8d8719 ("driver core: platform: Clarify that IRQ 0 is invalid") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>
i2c: cadence: Clear HOLD bit at correct time in Rx path
There are few issues on Zynq SOC observed in the stress tests causing
timeout errors. Even though all the data is received, timeout error
is thrown. This is due to an IP bug in which the COMP bit in ISR is
not set at end of transfer and completion interrupt is not generated.
This bug is seen on Zynq platforms when the following condition occurs:
Master read & HOLD bit set & Transfer size register reaches '0'.
One workaround is to clear the HOLD bit before the transfer size
register reaches '0'. The current implementation checks for this at
the start of the loop and also only for less than FIFO DEPTH case
(ignoring the equal to case).
So clear the HOLD bit when the data yet to receive is less than or
equal to the FIFO DEPTH. This avoids the IP bug condition.
Signed-off-by: Raviteja Narayanam <raviteja.narayanam@xilinx.com> Acked-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
There are two issues with "i2c: cadence: Fix the hold bit setting" commit.
1. In case of combined message request from user space, when the HOLD
bit is cleared in cdns_i2c_mrecv function, a STOP condition is sent
on the bus even before the last message is started. This is because when
the HOLD bit is cleared, the FIFOS are empty and there is no pending
transfer. The STOP condition should occur only after the last message
is completed.
2. The code added by the commit is redundant. Driver is handling the
setting/clearing of HOLD bit in right way before the commit.
The setting of HOLD bit based on 'bus_hold_flag' is taken care in
cdns_i2c_master_xfer function even before cdns_i2c_msend/cdns_i2c_recv
functions.
The clearing of HOLD bit is taken care at the end of cdns_i2c_msend and
cdns_i2c_recv functions based on bus_hold_flag and byte count.
Since clearing of HOLD bit is done after the slave address is written to
the register (writing to address register triggers the message transfer),
it is ensured that STOP condition occurs at the right time after
completion of the pending transfer (last message).
Signed-off-by: Raviteja Narayanam <raviteja.narayanam@xilinx.com> Acked-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
Peter Zijlstra [Mon, 20 Jul 2020 15:20:21 +0000 (17:20 +0200)]
sched: Fix race against ptrace_freeze_trace()
There is apparently one site that violates the rule that only current
and ttwu() will modify task->state, namely ptrace_{,un}freeze_traced()
will change task->state for a remote task.
Oleg explains:
"TASK_TRACED/TASK_STOPPED was always protected by siglock. In
particular, ttwu(__TASK_TRACED) must be always called with siglock
held. That is why ptrace_freeze_traced() assumes it can safely do
s/TASK_TRACED/__TASK_TRACED/ under spin_lock(siglock)."
This breaks the ordering scheme introduced by commit:
net: ethernet: ti: add NETIF_F_HW_TC hw feature flag for taprio offload
Currently drive supports taprio offload which is a tc feature offloaded
to cpsw hardware. So driver has to set the hw feature flag, NETIF_F_HW_TC
in the net device to be compliant. This patch adds the flag.
Fixes: 87a485068ff4 ("ethernet: ti: am65-cpsw-qos: add TAPRIO offload support") Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Wang Hai [Fri, 17 Jul 2020 02:50:49 +0000 (10:50 +0800)]
net: ethernet: ave: Fix error returns in ave_init
When regmap_update_bits failed in ave_init(), calls of the functions
reset_control_assert() and clk_disable_unprepare() were missed.
Add goto out_reset_assert to do this.
Fixes: 939b6c01d89c ("net: ethernet: ave: add support for phy-mode setting of system controller") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wang Hai <wanghai38@huawei.com> Reviewed-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Xie He [Thu, 16 Jul 2020 23:44:33 +0000 (16:44 -0700)]
drivers/net/wan/x25_asy: Fix to make it work
This driver is not working because of problems of its receiving code.
This patch fixes it to make it work.
When the driver receives an LAPB frame, it should first pass the frame
to the LAPB module to process. After processing, the LAPB module passes
the data (the packet) back to the driver, the driver should then add a
one-byte pseudo header and pass the data to upper layers.
The changes to the "x25_asy_bump" function and the
"x25_asy_data_indication" function are to correctly implement this
procedure.
Also, the "x25_asy_unesc" function ignores any frame that is shorter
than 3 bytes. However the shortest frames are 2-byte long. So we need
to change it to allow 2-byte frames to pass.
Cc: Eric Dumazet <edumazet@google.com> Cc: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Xie He <xie.he.0141@gmail.com> Reviewed-by: Martin Schiller <ms@dev.tdt.de> Signed-off-by: David S. Miller <davem@davemloft.net>
ipvs: fix the connection sync failed in some cases
The sync_thread_backup only checks sk_receive_queue is empty or not,
there is a situation which cannot sync the connection entries when
sk_receive_queue is empty and sk_rmem_alloc is larger than sk_rcvbuf,
the sync packets are dropped in __udp_enqueue_schedule_skb, this is
because the packets in reader_queue is not read, so the rmem is
not reclaimed.
Here I add the check of whether the reader_queue of the udp sock is
empty or not to solve this problem.
Fixes: 29a9fbc2513f ("udp: use a separate rx queue for packet reception") Reported-by: zhouxudong <zhouxudong8@huawei.com> Signed-off-by: guodeqing <geffrey.guo@huawei.com> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Paolo Pisati [Tue, 21 Jul 2020 16:17:10 +0000 (18:17 +0200)]
selftest: txtimestamp: fix net ns entry logic
According to 'man 8 ip-netns', if `ip netns identify` returns an empty string,
there's no net namespace associated with current PID: fix the net ns entrance
logic.
Signed-off-by: Paolo Pisati <paolo.pisati@canonical.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Max Filippov [Tue, 21 Jul 2020 22:00:35 +0000 (15:00 -0700)]
xtensa: fix access check in csum_and_copy_from_user
Commit 54d7ddbc9747 ("xtensa: switch to providing
csum_and_copy_from_user()") introduced access check, but incorrectly
tested dst instead of src.
Fix access_ok argument in csum_and_copy_from_user.
Cc: Al Viro <viro@zeniv.linux.org.uk> Fixes: 54d7ddbc9747 ("xtensa: switch to providing csum_and_copy_from_user()") Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
====================
qed: suppress irrelevant error messages on HW init
This raises the verbosity level of several error/warning messages on
driver/module initialization, most of which are false-positives, and
the one actively spamming the log for no reason.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
qed: suppress false-positives interrupt error messages on HW init
It was found that qed_pglueb_rbc_attn_handler() can produce a lot of
false-positive error detections on driver load/reload (especially after
crashes/recoveries) and spam the kernel log:
Reduce the logging level of two false-positive prone error messages from
notice to verbose on initialization (only) to not mix it with real error
attentions while debugging.
Fixes: b1f861c6e79b ("qed: Revise load sequence to avoid PCI errors") Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
qed: suppress "don't support RoCE & iWARP" flooding on HW init
Change the verbosity of the "don't support RoCE & iWARP simultaneously"
warning to debug level to stop flooding on driver/hardware initialization:
[ 4.783230] qede 01:00.00: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth0]
[ 4.810020] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
[ 4.861186] qede 01:00.01: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth1]
[ 4.893311] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
[ 5.181713] qede a1:00.00: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth2]
[ 5.224740] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
[ 5.276449] qede a1:00.01: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth3]
[ 5.318671] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
[ 5.369548] qede a1:00.02: Storm FW 8.37.7.0, Management FW 8.52.9.0
[MBI 15.10.6] [eth4]
[ 5.411645] [qed_rdma_set_pf_params:2076()]Current day drivers don't
support RoCE & iWARP simultaneously on the same PF. Default to RoCE-only
Fixes: 09259a6bc60d ("qed: Add iWARP enablement support") Signed-off-by: Alexander Lobakin <alobakin@marvell.com> Signed-off-by: Igor Russkikh <irusskikh@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
In the nsim_create(), rtnl_lock() is called before nsim_bpf_init().
If nsim_bpf_init() is failed, rtnl_unlock() should be called,
but it isn't called.
So, unbalanced locking would occur.
Fixes: ce2062b08335 ("netdevsim: move netdev creation/destruction to dev probe") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
Helmut Grohne [Tue, 21 Jul 2020 11:07:39 +0000 (13:07 +0200)]
net: dsa: microchip: call phy_remove_link_mode during probe
When doing "ip link set dev ... up" for a ksz9477 backed link,
ksz9477_phy_setup is called and it calls phy_remove_link_mode to remove
1000baseT HDX. During phy_remove_link_mode, phy_advertise_supported is
called. Doing so reverts any previous change to advertised link modes
e.g. using a udevd .link file.
phy_remove_link_mode is not meant to be used while opening a link and
should be called during phy probe when the link is not yet available to
userspace.
Therefore move the phy_remove_link_mode calls into
ksz9477_switch_register. It indirectly calls dsa_register_switch, which
creates the relevant struct phy_devices and we update the link modes
right after that. At that time dev->features is already initialized by
ksz9477_switch_detect.
Remove phy_setup from ksz_dev_ops as no users remain.
Link: https://lore.kernel.org/netdev/20200715192722.GD1256692@lunn.ch/ Fixes: 9abfa544c4c8d4 ("net: dsa: microchip: prepare PHY for proper advertisement") Signed-off-by: Helmut Grohne <helmut.grohne@intenta.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
net: hns3: fix return value error when query MAC link status fail
Currently, PF queries the MAC link status per second by calling
function hclge_get_mac_link_status(). It return the error code
when failed to send cmdq command to firmware. It's incorrect,
because this return value is used as the MAC link status, which
0 means link down, and none-zero means link up. So fixes it.
Fixes: 575d84053c84 ("net: hns3: Add HNS3 Acceleration Engine & Compatibility Layer Support") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Huazhong tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Tue, 21 Jul 2020 11:03:53 +0000 (19:03 +0800)]
net: hns3: fix error handling for desc filling
The content of the TX desc is automatically cleared by the HW
when the HW has sent out the packet to the wire. When desc filling
fails in hns3_nic_net_xmit(), it will call hns3_clear_desc() to do
the error handling, which miss zeroing of the TX desc and the
checking if a unmapping is needed.
So add the zeroing and checking in hns3_clear_desc() to avoid the
above problem. Also add DESC_TYPE_UNKNOWN to indicate the info in
desc_cb is not valid, because hns3_nic_reclaim_desc() may treat
the desc_cb->type of zero as packet and add to the sent pkt
statistics accordingly.
Fixes: 327a841d1dd9 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Tue, 21 Jul 2020 11:03:52 +0000 (19:03 +0800)]
net: hns3: fix for not calculating TX BD send size correctly
With GRO and fraglist support, the SKB can be aggregated to
a total size of 65535, and when that SKB is forwarded through
a bridge, the size of the SKB may be pushed to exceed the size
of 65535 when br_dev_queue_push_xmit() is called.
The max send size of BD supported by the HW is 65535, when a SKB
with a headlen of over 65535 is sent to the driver, the driver
needs to use multi BD to send the linear data, and the send size
of the last BD is calculated incorrectly by the driver who is
using '&' operation, which causes a TX error.
Use '%' operation to fix this problem.
Fixes: 4c6401564832 ("net: hns3: avoid mult + div op in critical data path") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Yunsheng Lin [Tue, 21 Jul 2020 11:03:51 +0000 (19:03 +0800)]
net: hns3: fix for not unmapping TX buffer correctly
When a big TX buffer is sent using multi BD, the driver maps the
whole TX buffer, and unmaps it using info in desc_cb corresponding
to each BD, but only the info in the desc_cb of first BD is correct,
other info in desc_cb is wrong, which causes TX unmapping problem
when SMMU is on.
Only set the mapping and freeing info in the desc_cb of first BD to
fix this problem, because the TX buffer only need to be unmapped and
freed once.
Fixes: 02823d65546f("net: hns3: add handling for big TX fragment") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huzhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Miaohe Lin [Tue, 21 Jul 2020 09:11:44 +0000 (17:11 +0800)]
net: udp: Fix wrong clean up for IS_UDPLITE macro
We can't use IS_UDPLITE to replace udp_sk->pcflag when UDPLITE_RECV_CC is
checked.
Fixes: 239a54a74982 ("[UDP]: Clean up for IS_UDPLITE macro") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: ethernet: ravb: exit if re-initialization fails in tx timeout
According to the report of [1], this driver is possible to cause
the following error in ravb_tx_timeout_work().
ravb e6800000.ethernet ethernet: failed to switch device to config mode
This error means that the hardware could not change the state
from "Operation" to "Configuration" while some tx and/or rx queue
are operating. After that, ravb_config() in ravb_dmac_init() will fail,
and then any descriptors will be not allocaled anymore so that NULL
pointer dereference happens after that on ravb_start_xmit().
To fix the issue, the ravb_tx_timeout_work() should check
the return values of ravb_stop_dma() and ravb_dmac_init().
If ravb_stop_dma() fails, ravb_tx_timeout_work() re-enables TX and RX
and just exits. If ravb_dmac_init() fails, just exits.
Reported-by: Dirk Behme <dirk.behme@de.bosch.com> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, SO_REUSEPORT does not work well if connected sockets are in a
UDP reuseport group.
Then reuseport_has_conns() returns true and the result of
reuseport_select_sock() is discarded. Also, unconnected sockets have the
same score, hence only does the first unconnected socket in udp_hslot
always receive all packets sent to unconnected sockets.
So, the result of reuseport_select_sock() should be used for load
balancing.
The noteworthy point is that the unconnected sockets placed after
connected sockets in sock_reuseport.socks will receive more packets than
others because of the algorithm in reuseport_select_sock().
index | connected | reciprocal_scale | result
---------------------------------------------
0 | no | 20% | 40%
1 | no | 20% | 20%
2 | yes | 20% | 0%
3 | no | 20% | 40%
4 | yes | 20% | 0%
If most of the sockets are connected, this can be a problem, but it still
works better than now.
Fixes: e599397fd331 ("udp: correct reuseport selection with connected sockets") CC: Willem de Bruijn <willemb@google.com> Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
If an unconnected socket in a UDP reuseport group connect()s, has_conns is
set to 1. Then, when a packet is received, udp[46]_lib_lookup2() scans all
sockets in udp_hslot looking for the connected socket with the highest
score.
However, when the number of sockets bound to the port exceeds max_socks,
reuseport_grow() resets has_conns to 0. It can cause udp[46]_lib_lookup2()
to return without scanning all sockets, resulting in that packets sent to
connected sockets may be distributed to unconnected sockets.
Therefore, reuseport_grow() should copy has_conns.
Fixes: e599397fd331 ("udp: correct reuseport selection with connected sockets") CC: Willem de Bruijn <willemb@google.com> Reviewed-by: Benjamin Herrenschmidt <benh@amazon.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Boris Burkov [Thu, 16 Jul 2020 20:29:46 +0000 (13:29 -0700)]
btrfs: fix mount failure caused by race with umount
It is possible to cause a btrfs mount to fail by racing it with a slow
umount. The crux of the sequence is generic_shutdown_super not yet
calling sop->put_super before btrfs_mount_root calls btrfs_open_devices.
If that occurs, btrfs_open_devices will decide the opened counter is
non-zero, increment it, and skip resetting fs_devices->total_rw_bytes to
0. From here, mount will call sget which will result in grab_super
trying to take the super block umount semaphore. That semaphore will be
held by the slow umount, so mount will block. Before up-ing the
semaphore, umount will delete the super block, resulting in mount's sget
reliably allocating a new one, which causes the mount path to dutifully
fill it out, and increment total_rw_bytes a second time, which causes
the mount to fail, as we see double the expected bytes.
To fix this, we clear total_rw_bytes from within btrfs_read_chunk_tree
before the calls to read_one_dev, while holding the sb umount semaphore
and the uuid mutex.
To reproduce, it is sufficient to dirty a decent number of inodes, then
quickly umount and mount.
for i in $(seq 0 500)
do
dd if=/dev/zero of="/mnt/foo/$i" bs=1M count=1
done
umount /mnt/foo&
mount /mnt/foo
does the trick for me.
CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Boris Burkov <boris@bur.io> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Robbie Ko [Mon, 20 Jul 2020 01:42:09 +0000 (09:42 +0800)]
btrfs: fix page leaks after failure to lock page for delalloc
When locking pages for delalloc, we check if it's dirty and mapping still
matches. If it does not match, we need to return -EAGAIN and release all
pages. Only the current page was put though, iterate over all the
remaining pages too.
CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Robbie Ko <robbieko@synology.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
btrfs: qgroup: fix data leak caused by race between writeback and truncate
[BUG]
When running tests like generic/013 on test device with btrfs quota
enabled, it can normally lead to data leak, detected at unmount time:
BTRFS warning (device dm-3): qgroup 0/5 has unreleased space, type 0 rsv 4096
------------[ cut here ]------------
WARNING: CPU: 11 PID: 16386 at fs/btrfs/disk-io.c:4142 close_ctree+0x1dc/0x323 [btrfs]
RIP: 0010:close_ctree+0x1dc/0x323 [btrfs]
Call Trace:
btrfs_put_super+0x15/0x17 [btrfs]
generic_shutdown_super+0x72/0x110
kill_anon_super+0x18/0x30
btrfs_kill_super+0x17/0x30 [btrfs]
deactivate_locked_super+0x3b/0xa0
deactivate_super+0x40/0x50
cleanup_mnt+0x135/0x190
__cleanup_mnt+0x12/0x20
task_work_run+0x64/0xb0
__prepare_exit_to_usermode+0x1bc/0x1c0
__syscall_return_slowpath+0x47/0x230
do_syscall_64+0x64/0xb0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
---[ end trace caf08beafeca2392 ]---
BTRFS error (device dm-3): qgroup reserved space leaked
[CAUSE]
In the offending case, the offending operations are:
2/6: writev f2X[269 1 0 0 0 0] [1006997,67,288] 0
2/7: truncate f2X[269 1 0 0 48 1026293] 18388 0
The following sequence of events could happen after the writev():
CPU1 (writeback) | CPU2 (truncate)
-----------------------------------------------------------------
btrfs_writepages() |
|- extent_write_cache_pages() |
|- Got page for 1003520 |
| 1003520 is Dirty, no writeback |
| So (!clear_page_dirty_for_io()) |
| gets called for it |
|- Now page 1003520 is Clean. |
| | btrfs_setattr()
| | |- btrfs_setsize()
| | |- truncate_setsize()
| | New i_size is 18388
|- __extent_writepage() |
| |- page_offset() > i_size |
|- btrfs_invalidatepage() |
|- Page is clean, so no qgroup |
callback executed
This means, the qgroup reserved data space is not properly released in
btrfs_invalidatepage() as the page is Clean.
[FIX]
Instead of checking the dirty bit of a page, call
btrfs_qgroup_free_data() unconditionally in btrfs_invalidatepage().
As qgroup rsv are completely bound to the QGROUP_RESERVED bit of
io_tree, not bound to page status, thus we won't cause double freeing
anyway.
Fixes: 45467edf4474 ("btrfs: qgroup: Prevent qgroup->reserved from going subzero") CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>