Paolo Bonzini [Fri, 15 Jul 2022 11:34:55 +0000 (07:34 -0400)]
KVM: emulate: do not adjust size of fastop and setcc subroutines
Instead of doing complicated calculations to find the size of the subroutines
(which are even more complicated because they need to be stringified into
an asm statement), just hardcode to 16.
It is less dense for a few combinations of IBT/SLS/retbleed, but it has
the advantage of being really simple.
Cc: stable@vger.kernel.org # 5.15.x: 8db6b1bc7390: x86/kvm: fix FASTOP_SIZE when return thunks are enabled Cc: stable@vger.kernel.org Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM: x86: Fully initialize 'struct kvm_lapic_irq' in kvm_pv_kick_cpu_op()
'vector' and 'trig_mode' fields of 'struct kvm_lapic_irq' are left
uninitialized in kvm_pv_kick_cpu_op(). While these fields are normally
not needed for APIC_DM_REMRD, they're still referenced by
__apic_accept_irq() for trace_kvm_apic_accept_irq(). Fully initialize
the structure to avoid consuming random stack memory.
Fixes: 715c1b0ced00 ("KVM: x86: make apic_accept_irq tracepoint more generic") Reported-by: syzbot+d6caa905917d353f0d07@syzkaller.appspotmail.com Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220708125147.593975-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Thu, 14 Jul 2022 11:29:57 +0000 (07:29 -0400)]
Documentation: kvm: clarify histogram units
In the case of histogram statistics, the values are always sample
counts; the unit instead applies to the bucket range. For example,
halt_poll_success_hist is a nanosecond statistic because the buckets are
for 0ns, 1ns, 2-3ns, 4-7ns etc. There isn't really any other sensible
interpretation, but clarify this anyway in the Documentation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Thu, 14 Jul 2022 11:27:31 +0000 (07:27 -0400)]
kvm: stats: tell userspace which values are boolean
Some of the statistics values exported by KVM are always only 0 or 1.
It can be useful to export this fact to userspace so that it can track
them specially (for example by polling the value every now and then to
compute a % of time spent in a specific state).
Therefore, add "boolean value" as a new "unit". While it is not exactly
a unit, it walks and quacks like one. In particular, using the type
would be wrong because boolean values could be instantaneous or peak
values (e.g. "is the rmap allocated?") or even two-bucket histograms
(e.g. "number of posted vs. non-posted interrupt injections").
Suggested-by: Amneesh Singh <natto@weirdnatto.in> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
x86/kvm: fix FASTOP_SIZE when return thunks are enabled
The return thunk call makes the fastop functions larger, just like IBT
does. Consider a 16-byte FASTOP_SIZE when CONFIG_RETHUNK is enabled.
Otherwise, functions will be incorrectly aligned and when computing their
position for differently sized operators, they will executed in the middle
or end of a function, which may as well be an int3, leading to a crash
like:
Fixes: 2c70af946767 ("x86: Use return-thunk in asm code") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Co-developed-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Borislav Petkov <bp@suse.de> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Message-Id: <20220713171241.184026-1-cascardo@canonical.com> Tested-by: Jack Wang <jinpu.wang@ionos.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM: nVMX: Always enable TSC scaling for L2 when it was enabled for L1
Windows 10/11 guests with Hyper-V role (WSL2) enabled are observed to
hang upon boot or shortly after when a non-default TSC frequency was
set for L1. The issue is observed on a host where TSC scaling is
supported. The problem appears to be that Windows doesn't use TSC
frequency for its guests even when the feature is advertised and KVM
filters SECONDARY_EXEC_TSC_SCALING out when creating L2 controls from
L1's. This leads to L2 running with the default frequency (matching
host's) while L1 is running with an altered one.
Keep SECONDARY_EXEC_TSC_SCALING in secondary exec controls for L2 when
it was set for L1. TSC_MULTIPLIER is already correctly computed and
written by prepare_vmcs02().
vf/remap: return the amount of bytes actually deduplicated
When using the FIDEDUPRANGE ioctl, in case of success the requested size
is returned. In some cases this might not be the actual amount of bytes
deduplicated.
This change modifies vfs_dedupe_file_range() to report the actual amount
of bytes deduplicated, instead of the requested amount.
Link: https://lore.kernel.org/linux-fsdevel/5548ef63-62f9-4f46-5793-03165ceccacc@tu-darmstadt.de/ Reported-by: Ansgar Lößer <ansgar.loesser@kom.tu-darmstadt.de> Reported-by: Max Schlecht <max.schlecht@informatik.hu-berlin.de> Reported-by: Björn Scheuermann <scheuermann@kom.tu-darmstadt.de> Cc: Dave Chinner <david@fromorbit.com> Cc: Darrick J Wong <djwong@kernel.org> Signed-off-by: Ansgar Lößer <ansgar.loesser@kom.tu-darmstadt.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'cgroup-for-5.19-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:
"Fix an old and subtle bug in the migration path.
css_sets are used to track tasks and migrations are tasks moving from
a group of css_sets to another group of css_sets. The migration path
pins all source and destination css_sets in the prep stage.
Unfortunately, it was overloading the same list_head entry to track
sources and destinations, which got confused for migrations which are
partially identity leading to use-after-frees.
Fixed by using dedicated list_heads for tracking sources and
destinations"
* tag 'cgroup-for-5.19-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: Use separate src/dst nodes when preloading css_sets for migration
Dave Chinner [Wed, 13 Jul 2022 07:49:15 +0000 (17:49 +1000)]
fs/remap: constrain dedupe of EOF blocks
If dedupe of an EOF block is not constrainted to match against only
other EOF blocks with the same EOF offset into the block, it can
match against any other block that has the same matching initial
bytes in it, even if the bytes beyond EOF in the source file do
not match.
Fix this by constraining the EOF block matching to only match
against other EOF blocks that have identical EOF offsets and data.
This allows "whole file dedupe" to continue to work without allowing
eof blocks to randomly match against partial full blocks with the
same data.
Merge tag 'trace-v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Fixes and minor clean ups for tracing:
- Fix memory leak by reverting what was thought to be a double free.
A static tool had gave a false positive that a double free was
possible in the error path, but it was actually a different
location that confused the static analyzer (and those of us that
reviewed it).
- Move use of static buffers by ftrace_dump() to a location that can
be used by kgdb's ftdump(), as it needs it for the same reasons.
- Clarify in the Kconfig description that function tracing has
negligible impact on x86, but may have a bit bigger impact on other
architectures.
- Remove unnecessary extra semicolon in trace event.
- Make a local variable static that is used in the fprobes sample
- Use KSYM_NAME_LEN for length of function in kprobe sample and get
rid of unneeded macro for the same purpose"
* tag 'trace-v5.19-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
samples: Use KSYM_NAME_LEN for kprobes
fprobe/samples: Make sample_probe static
blk-iocost: tracing: atomic64_read(&ioc->vtime_rate) is assigned an extra semicolon
ftrace: Be more specific about arch impact when function tracer is enabled
tracing: Fix sleeping while atomic in kdb ftdump
tracing/histograms: Fix memory leak problem
sunliming [Mon, 6 Jun 2022 07:56:59 +0000 (15:56 +0800)]
fprobe/samples: Make sample_probe static
This symbol is not used outside of fprobe_example.c, so marks it static.
Fixes the following warning:
sparse warnings: (new ones prefixed by >>)
>> samples/fprobe/fprobe_example.c:23:15: sparse: sparse: symbol 'sample_probe'
was not declared. Should it be static?
Link: https://lkml.kernel.org/r/20220606075659.674556-1-sunliming@kylinos.cn Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: sunliming <sunliming@kylinos.cn> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
ftrace: Be more specific about arch impact when function tracer is enabled
It was brought up that on ARMv7, that because the FUNCTION_TRACER does not
use nops to keep function tracing disabled because of the use of a link
register, it does have some performance impact.
The start of functions when -pg is used to compile the kernel is:
Which just puts the stack back to its normal location. But these two
instructions at the start of every function does incur some overhead.
Be more honest in the Kconfig FUNCTION_TRACER description and specify that
the overhead being in the noise was x86 specific, but other architectures
may vary.
If you drop into kdb and type "ftdump" you'll get a sleeping while
atomic warning from memory allocation in trace_find_next_entry().
This appears to have been caused by commit 7c2a3035d9a2 ("tracing:
Save off entry when peeking at next entry"), which added the
allocation in that path. The problematic commit was already fixed by
commit d2f84cc688f0 ("tracing: Do not allocate buffer in
trace_find_next_entry() in atomic") but that fix missed the kdb case.
The fix here is easy: just move the assignment of the static buffer to
the place where it should have been to begin with:
trace_init_global_iter(). That function is called in two places, once
is right before the assignment of the static buffer added by the
previous fix and once is in kdb.
Note that it appears that there's a second static buffer that we need
to assign that was added in commit ce133cf6c0cd ("tracing: Show real
address for trace event arguments"), so we'll move that too.
As commit 955d06b86ab6 ("tracing: fix double free") said, the
"double free" problem reported by clang static analyzer is:
> In parse_var_defs() if there is a problem allocating
> var_defs.expr, the earlier var_defs.name is freed.
> This free is duplicated by free_var_defs() which frees
> the rest of the list.
However, if there is a problem allocating N-th var_defs.expr:
+ in parse_var_defs(), the freed 'earlier var_defs.name' is
actually the N-th var_defs.name;
+ then in free_var_defs(), the names from 0th to (N-1)-th are freed;
IF ALLOCATING PROBLEM HAPPENED HERE!!! -+
\
|
0th 1th (N-1)-th N-th V
+-------------+-------------+-----+-------------+-----------
var_defs: | name | expr | name | expr | ... | name | expr | name | ///
+-------------+-------------+-----+-------------+-----------
These two frees don't act on same name, so there was no "double free"
problem before. Conversely, after that commit, we get a "memory leak"
problem because the above "N-th var_defs.name" is not freed.
If enable CONFIG_DEBUG_KMEMLEAK and inject a fault at where the N-th
var_defs.expr allocated, then execute on shell like:
$ echo 'hist:key=call_site:val=$v1,$v2:v1=bytes_req,v2=bytes_alloc' > \
/sys/kernel/debug/tracing/events/kmem/kmalloc/trigger
Merge tag 'ovl-fixes-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs fix from Miklos Szeredi:
"Add a temporary fix for posix acls on idmapped mounts introduced in
this cycle. A proper fix will be added in the next cycle"
* tag 'ovl-fixes-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: turn off SB_POSIXACL with idmapped layers temporarily
Merge tag 'drm-fixes-2022-07-12' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"I see you picked up one of the fbdev fixes, this is the other stuff
that was queued up last week.
A bit of a scattering of fixes, three for i915, one amdgpu, and a
couple of panfrost, rockchip, panel and bridge ones.
amdgpu:
- Hibernation fix
dma-buf:
- fix use after free of fence
i915:
- Fix a possible refcount leak in DP MST connector (Hangyu)
- Fix on loading guc on ADL-N (Daniele)
- Fix vm use-after-free in vma destruction (Thomas)
* tag 'drm-fixes-2022-07-12' of git://anongit.freedesktop.org/drm/drm:
drm/ssd130x: Fix pre-charge period setting
dma-buf: Fix one use-after-free of fence
drm/i915: Fix vm use-after-free in vma destruction
drm/i915/guc: ADL-N should use the same GuC FW as ADL-S
drm/i915: fix a possible refcount leak in intel_dp_add_mst_connector()
drm/amdgpu/display: disable prefer_shadow for generic fb helpers
drm/amdgpu: keep fbdev buffers pinned during suspend
drm/panfrost: Fix shrinker list corruption by madvise IOCTL
drm/panfrost: Put mapping instead of shmem obj on panfrost_mmu_map_fault_addr() error
drm/rockchip: Detach from ARM DMA domain in attach_device
drm/bridge: fsl-ldb: Drop DE signal polarity inversion
drm/bridge: fsl-ldb: Enable split mode for LVDS dual link
drm/bridge: fsl-ldb: Fix mode clock rate validation
drm/aperture: Run fbdev removal before internal helpers
drm: panel-orientation-quirks: Add quirk for the Lenovo Yoga Tablet 2 830
__static_call_fixup() invokes __static_call_transform() without holding
text_mutex, which causes lockdep to complain in text_poke_bp().
Adding the proper locking cures that, but as this is either used during
early boot or during module finalizing, it's not required to use
text_poke_bp(). Add an argument to __static_call_transform() which tells
it to use text_poke_early() for it.
Fixes: 9bd2b3a6b6f0 ("x86,static_call: Use alternative RET encoding") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Borislav Petkov <bp@suse.de>
Merge tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 retbleed fixes from Borislav Petkov:
"Just when you thought that all the speculation bugs were addressed and
solved and the nightmare is complete, here's the next one: speculating
after RET instructions and leaking privileged information using the
now pretty much classical covert channels.
It is called RETBleed and the mitigation effort and controlling
functionality has been modelled similar to what already existing
mitigations provide"
* tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
x86/speculation: Disable RRSBA behavior
x86/kexec: Disable RET on kexec
x86/bugs: Do not enable IBPB-on-entry when IBPB is not supported
x86/entry: Move PUSH_AND_CLEAR_REGS() back into error_entry
x86/bugs: Add Cannon lake to RETBleed affected CPU list
x86/retbleed: Add fine grained Kconfig knobs
x86/cpu/amd: Enumerate BTC_NO
x86/common: Stamp out the stepping madness
KVM: VMX: Prevent RSB underflow before vmenter
x86/speculation: Fill RSB on vmexit for IBRS
KVM: VMX: Fix IBRS handling after vmexit
KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS
KVM: VMX: Convert launched argument to flags
KVM: VMX: Flatten __vmx_vcpu_run()
objtool: Re-add UNWIND_HINT_{SAVE_RESTORE}
x86/speculation: Remove x86_spec_ctrl_mask
x86/speculation: Use cached host SPEC_CTRL value for guest entry/exit
x86/speculation: Fix SPEC_CTRL write on SMT state change
x86/speculation: Fix firmware entry SPEC_CTRL handling
x86/speculation: Fix RSB filling with CONFIG_RETPOLINE=n
...
Dave Airlie [Tue, 12 Jul 2022 00:43:49 +0000 (10:43 +1000)]
Merge tag 'drm-misc-fixes-2022-07-07-1' of ssh://git.freedesktop.org/git/drm/drm-misc into drm-fixes
Three mode setting fixes for fsl-ldb, a fbdev removal use-after-free fix,
a dma-buf fence use-after-free fix, a DMA setup fix for rockchip, an error
path fix and memory corruption fix for panfrost and one more orientation
quirk
Dave Airlie [Tue, 12 Jul 2022 00:40:24 +0000 (10:40 +1000)]
Merge tag 'drm-intel-fixes-2022-07-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix a possible refcount leak in DP MST connector (Hangyu)
- Fix on loading guc on ADL-N (Daniele)
- Fix vm use-after-free in vma destruction (Thomas)
Merge tag 'for-5.19-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
"A more fixes that seem to me to be important enough to get merged
before release:
- in zoned mode, fix leak of a structure when reading zone info, this
happens on normal path so this can be significant
- in zoned mode, revert an optimization added in 5.19-rc1 to finish a
zone when the capacity is full, but this is not reliable in all
cases
- try to avoid short reads for compressed data or inline files when
it's a NOWAIT read, applications should handle that but there are
two, qemu and mariadb, that are affected"
* tag 'for-5.19-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: zoned: drop optimization of zone finish
btrfs: zoned: fix a leaked bioc in read_zone_info
btrfs: return -EAGAIN for NOWAIT dio reads/writes on compressed and inline extents
Merge tags 'free-mq_sysctls-for-v5.19' and 'ptrace_unfreeze_fix-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull ipc namespace fix from Eric Biederman:
"This fixes a bug with error handling if ipc creation fails that was
reported by syzbot"
For completeness, this also pulls the ptrace_unfreeze_fix tag that
contains the original version of one of the hotfixes that I manually
applied earlier so that it would be fixed in rc6.
* tag 'free-mq_sysctls-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
ipc: Free mq_sysctls if ipc namespace creation failed
* tag 'ptrace_unfreeze_fix-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
ptrace: fix clearing of JOBCTL_TRACED in ptrace_unfreeze_traced()
Merge tag 'mm-hotfixes-stable-2022-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
"Mainly MM fixes. About half for issues which were introduced after
5.18 and the remainder for longer-term issues"
* tag 'mm-hotfixes-stable-2022-07-11' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm: split huge PUD on wp_huge_pud fallback
nilfs2: fix incorrect masking of permission flags for symlinks
mm/rmap: fix dereferencing invalid subpage pointer in try_to_migrate_one()
riscv/mm: fix build error while PAGE_TABLE_CHECK enabled without MMU
Documentation: highmem: use literal block for code example in highmem.h comment
mm: sparsemem: fix missing higher order allocation splitting
mm/damon: use set_huge_pte_at() to make huge pte old
sh: convert nommu io{re,un}map() to static inline functions
mm: userfaultfd: fix UFFDIO_CONTINUE on fallocated shmem pages
Merge tag 'modules-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull module fixes from Luis Chamberlain:
"Although most of the move of code in in v5.19-rc1 should have not
introduced a regression patch review on one of the file changes
captured a checkpatch warning which advised to use strscpy() and it
caused a buffer overflow when an incorrect length is passed.
Another change which checkpatch complained about was an odd RCU usage,
but that was properly addressed in a separate patch to the move by
Aaron. That caused a regression with PREEMPT_RT=y due to an unbounded
latency.
This series fixes both and adjusts documentation which we forgot to do
for the move"
* tag 'modules-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
module: kallsyms: Ensure preemption in add_kallsyms() with PREEMPT_RT
doc: module: update file references
module: Fix "warning: variable 'exit' set but not used"
module: Fix selfAssignment cppcheck warning
modules: Fix corruption of /proc/kallsyms
module: kallsyms: Ensure preemption in add_kallsyms() with PREEMPT_RT
The commit 40468939e937 ("module: kallsyms: Fix suspicious rcu usage")
under PREEMPT_RT=y, disabling preemption introduced an unbounded
latency since the loop is not fixed. This change caused a regression
since previously preemption was not disabled and we would dereference
RCU-protected pointers explicitly. That being said, these pointers
cannot change.
Before kallsyms-specific data is prepared/or set-up, we ensure that
the unformed module is known to be unique i.e. does not already exist
(see load_module()). Therefore, we can fix this by using the common and
more appropriate RCU flavour as this section of code can be safely
preempted.
Reported-by: Steven Rostedt <rostedt@goodmis.org> Fixes: 40468939e937 ("module: kallsyms: Fix suspicious rcu usage") Signed-off-by: Aaron Tomlin <atomlin@redhat.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
fix race between exit_itimers() and /proc/pid/timers
As Chris explains, the comment above exit_itimers() is not correct,
we can race with proc_timers_seq_ops. Change exit_itimers() to clear
signal->posix_timers with ->siglock held.
RISC-V: KVM: Fix SRCU deadlock caused by kvm_riscv_check_vcpu_requests()
The kvm_riscv_check_vcpu_requests() is called with SRCU read lock held
and for KVM_REQ_SLEEP request it will block the VCPU without releasing
SRCU read lock. This causes KVM ioctls (such as KVM_IOEVENTFD) from
other VCPUs of the same Guest/VM to hang/deadlock if there is any
synchronize_srcu() or synchronize_srcu_expedited() in the path.
To fix the above in kvm_riscv_check_vcpu_requests(), we should do SRCU
read unlock before blocking the VCPU and do SRCU read lock after VCPU
wakeup.
Fixes: 255c4ecb5dc9 ("RISC-V: KVM: Implement VCPU interrupts and requests handling") Reported-by: Bin Meng <bmeng.cn@gmail.com> Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Tested-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com> Tested-by: Bin Meng <bmeng.cn@gmail.com> Signed-off-by: Anup Patel <anup@brainfault.org>
There are a bunch of functions that use the PFN from a page table entry
that end up with the svpbmt upper-bits because they are missing the newly
introduced PAGE_PFN_MASK which leads to wrong addresses conversions and
then crash: fix this by adding this mask.
Fixes: 88f4cb2d66e4 ("riscv: Fix accessing pfn bits in PTEs for non-32bit variants") Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com> Signed-off-by: Anup Patel <anup@brainfault.org>
This is a collection of three fixes for small annoyances.
Two of these are already pending in other trees, but I really don't want
to release another -rc with these issues pending, so I picked up the
patches for these things directly. We'll end up with duplicate commits
eventually, I prefer that over having these issues pending.
The third one is just me getting rid of another BUG_ON() just because it
was reported and I dislike those things so much.
* merge 'hot-fixes' branch:
ida: don't use BUG_ON() for debugging
drm/aperture: Run fbdev removal before internal helpers
ptrace: fix clearing of JOBCTL_TRACED in ptrace_unfreeze_traced()
This is another old BUG_ON() that just shouldn't exist (see also commit a3a14d4efebf: "signal handling: don't use BUG_ON() for debugging").
In fact, as Matthew Wilcox points out, this condition shouldn't really
even result in a warning, since a negative id allocation result is just
a normal allocation failure:
"I wonder if we should even warn here -- sure, the caller is trying to
free something that wasn't allocated, but we don't warn for
kfree(NULL)"
and goes on to point out how that current error check is only causing
people to unnecessarily do their own index range checking before freeing
it.
This was noted by Itay Iellin, because the bluetooth HCI socket cookie
code does *not* do that range checking, and ends up just freeing the
error case too, triggering the BUG_ON().
The HCI code requires CAP_NET_RAW, and seems to just result in an ugly
splat, but there really is no reason to BUG_ON() here, and we have
generally striven for allocation models where it's always ok to just do
free(alloc());
even if the allocation were to fail for some random reason (usually
obviously that "random" reason being some resource limit).
Fixes: 802d8c8f8237 ("ida: simplified functions for id allocation") Reported-by: Itay Iellin <ieitayie@gmail.com> Suggested-by: Matthew Wilcox <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'staging-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging driver fix from Greg KH:
"Here is a single staging driver fix for a reported problem that showed
up in 5.19-rc1 in the wlan-ng driver. It has been in linux-next for a
week with no reported problems"
* tag 'staging-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging/wlan-ng: get the correct struct hfa384x in work callback
Merge tag 'char-misc-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are four small char/misc driver fixes for 5.19-rc6 to resolve
some reported issues. They only affect two drivers:
- rtsx_usb: fix for of-reported DMA warning error, the driver was
handling memory buffers in odd ways, it has now been fixed up to be
much simpler and correct by Shuah.
- at25 eeprom driver bugfix for reported problem
All of these have been in linux-next for a week with no reported
problems"
* tag 'char-misc-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
misc: rtsx_usb: set return value in rsp_buf alloc err path
misc: rtsx_usb: use separate command and response buffers
misc: rtsx_usb: fix use of dma mapped buffer for usb bulk transfer
eeprom: at25: Rework buggy read splitting
Merge tag 'kbuild-fixes-v5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Adjust gen_compile_commands.py to the format change of *.mod files
- Remove unused macro in scripts/Makefile.modinst
* tag 'kbuild-fixes-v5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: remove unused cmd_none in scripts/Makefile.modinst
gen_compile_commands: handle multiple lines per .mod file
Merge tag 'irq_urgent_for_v5.19_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Borislav Petkov:
- Gracefully handle failure to request MMIO resources in the GICv3
driver
- Make a static key static in the Apple AIC driver
- Fix the Xilinx intc driver dependency on OF_ADDRESS
* tag 'irq_urgent_for_v5.19_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/apple-aic: Make symbol 'use_fast_ipi' static
irqchip/xilinx: Add explicit dependency on OF_ADDRESS
irqchip/gicv3: Handle resource request failure consistently
Merge tag 'x86_urgent_for_v5.19_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Prepare for and clear .brk early in order to address XenPV guests
failures where the hypervisor verifies page tables and uninitialized
data in that range leads to bogus failures in those checks
- Add any potential setup_data entries supplied at boot to the identity
pagetable mappings to prevent kexec kernel boot failures. Usually,
this is not a problem for the normal kernel as those mappings are
part of the initially mapped 2M pages but if kexec gets to allocate
the second kernel somewhere else, those setup_data entries need to be
mapped there too.
- Fix objtool not to discard text references from the __tracepoints
section so that ENDBR validation still works
- Correct the setup_data types limit as it is user-visible, before 5.19
releases
* tag 'x86_urgent_for_v5.19_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Fix the setup data types max limit
x86/ibt, objtool: Don't discard text references from tracepoint section
x86/compressed/64: Add identity mappings for setup_data entries
x86: Fix .brk attribute in linker script
x86: Clear .brk area at early boot
x86/xen: Use clear_bss() for Xen PV guests
* tag 'i2c-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: cadence: Unregister the clk notifier in error path
i2c: piix4: Fix a memory leak in the EFCH MMIO support
drm/aperture: Run fbdev removal before internal helpers
Always run fbdev removal first to remove simpledrm via sysfb_disable().
This clears the internal state.
The later call to drm_aperture_detach_drivers() then does nothing.
Otherwise, with drm_aperture_detach_drivers() running first, the call to
sysfb_disable() uses inconsistent state.
Example backtrace show below:
BUG: KASAN: use-after-free in device_del+0x79/0x5f0
Read of size 8 at addr ffff888108185050 by task systemd-udevd/311
CPU: 0 PID: 311 Comm: systemd-udevd Tainted: G E 5.19.0-rc2-1-default+ #1689
Hardware name: HP ProLiant DL120 G7, BIOS J01 04/21/2011
Call Trace:
device_del+0x79/0x5f0
platform_device_del.part.0+0x19/0xe0
platform_device_unregister+0x1c/0x30
sysfb_disable+0x2d/0x70
remove_conflicting_framebuffers+0x1c/0xf0
remove_conflicting_pci_framebuffers+0x130/0x1a0
drm_aperture_remove_conflicting_pci_framebuffers+0x86/0xb0
mgag200_pci_probe+0x2d/0x140 [mgag200]
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 873eb3b11860 ("fbdev: Disable sysfb device registration when removing conflicting FBs") Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Helge Deller <deller@gmx.de> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Zhen Lei <thunder.leizhen@huawei.com> Cc: Changcheng Deng <deng.changcheng@zte.com.cn> Reviewed-by: Zack Rusin <zackr@vmware.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'powerpc-5.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
- On Power8 bare metal, fix creation of RNG platform devices, which are
needed for the /dev/hwrng driver to probe correctly.
Thanks to Jason A. Donenfeld, and Sachin Sant.
* tag 'powerpc-5.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/powernv: delay rng platform device creation until later in boot
io_uring: check that we have a file table when allocating update slots
If IORING_FILE_INDEX_ALLOC is set asking for an allocated slot, the
helper doesn't check if we actually have a file table or not. The non
alloc path does do that correctly, and returns -ENXIO if we haven't set
one up.
Do the same for the allocated path, avoiding a NULL pointer dereference
when trying to find a free bit.
Fixes: 3c02192c8665 ("io_uring: let IORING_OP_FILES_UPDATE support choosing fixed file slots") Signed-off-by: Jens Axboe <axboe@kernel.dk>
Some Intel processors may use alternate predictors for RETs on
RSB-underflow. This condition may be vulnerable to Branch History
Injection (BHI) and intramode-BTI.
Kernel earlier added spectre_v2 mitigation modes (eIBRS+Retpolines,
eIBRS+LFENCE, Retpolines) which protect indirect CALLs and JMPs against
such attacks. However, on RSB-underflow, RET target prediction may
fallback to alternate predictors. As a result, RET's predicted target
may get influenced by branch history.
A new MSR_IA32_SPEC_CTRL bit (RRSBA_DIS_S) controls this fallback
behavior when in kernel mode. When set, RETs will not take predictions
from alternate predictors, hence mitigating RETs as well. Support for
this is enumerated by CPUID.7.2.EDX[RRSBA_CTRL] (bit2).
For spectre v2 mitigation, when a user selects a mitigation that
protects indirect CALLs and JMPs against BHI and intramode-BTI, set
RRSBA_DIS_S also to protect RETs for RSB-underflow case.
Merge tag 'fscache-fixes-20220708' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull fscache fixes from David Howells:
- Fix a check in fscache_wait_on_volume_collision() in which the
polarity is reversed. It should complain if a volume is still marked
acquisition-pending after 20s, but instead complains if the mark has
been cleared (ie. the condition has cleared).
Also switch an open-coded test of the ACQUIRE_PENDING volume flag to
use the helper function for consistency.
- Not a fix per se, but neaten the code by using a helper to check for
the DROPPED state.
- Fix cachefiles's support for erofs to only flush requests associated
with a released control file, not all requests.
- Fix a race between one process invalidating an object in the cache
and another process trying to look it up.
* tag 'fscache-fixes-20220708' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
fscache: Fix invalidation/lookup race
cachefiles: narrow the scope of flushed requests when releasing fd
fscache: Introduce fscache_cookie_is_dropped()
fscache: Fix if condition in fscache_wait_on_volume_collision()
Merge tag 'acpi-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix two recent regressions related to CPPC support.
Specifics:
- Prevent _CPC from being used if the platform firmware does not
confirm CPPC v2 support via _OSC (Mario Limonciello)
- Allow systems with X86_FEATURE_CPPC set to use _CPC even if CPPC
support cannot be agreed on via _OSC (Mario Limonciello)"
* tag 'acpi-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: CPPC: Don't require _OSC if X86_FEATURE_CPPC is supported
ACPI: CPPC: Only probe for _CPC if CPPC v2 is acked
Merge tag 'pm-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix a NULL pointer dereference in a devfreq driver and a runtime
PM framework issue that may cause a supplier device to be suspended
before its consumer.
Specifics:
- Fix NULL pointer dereference related to printing a diagnostic
message in the exynos-bus devfreq driver (Christian Marangi)
- Fix race condition in the runtime PM framework which in some cases
may cause a supplier device to be suspended when its consumer is
still active (Rafael Wysocki)"
* tag 'pm-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / devfreq: exynos-bus: Fix NULL pointer dereference
PM: runtime: Fix supplier device management during consumer probe
PM: runtime: Redefine pm_runtime_release_supplier()
* tag 'cxl-fixes-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/mbox: Fix missing variable payload checks in cmd size validation
memregion: Fix memregion_free() fallback definition
cxl/mbox: Use __le32 in get,set_lsa mailbox structures
cxl/core: Use is_endpoint_decoder
cxl: Fix cleanup of port devices on failure to probe driver.
MAINTAINERS: Update Ben's email address
Merge tag 'block-5.19-2022-07-08' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
"NVMe pull request with another id quirk addition, and a tracing fix"
* tag 'block-5.19-2022-07-08' of git://git.kernel.dk/linux-block:
nvme: use struct group for generic command dwords
nvme-pci: phison e16 has bogus namespace ids
Merge tag 'io_uring-5.19-2022-07-08' of git://git.kernel.dk/linux-block
Pull io_uring tweak from Jens Axboe:
"Just a minor tweak to an addition made in this release cycle: padding
a 32-bit value that's in a 64-bit union to avoid any potential
funkiness from that"
* tag 'io_uring-5.19-2022-07-08' of git://git.kernel.dk/linux-block:
io_uring: explicit sqe padding for ioctl commands
Merge tag 'for-5.19/fbdev-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
Pull fbdev fixes from Helge Deller:
- fbcon now prevents switching to screen resolutions which are smaller
than the font size, and prevents enabling a font which is bigger than
the current screen resolution. This fixes vmalloc-out-of-bounds
accesses found by KASAN.
- Guiling Deng fixed a bug where the centered fbdev logo wasn't
displayed correctly if the screen size matched the logo size.
- Hsin-Yi Wang provided a patch to include errno.h to fix build when
CONFIG_OF isn't enabled.
* tag 'for-5.19/fbdev-3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
fbcon: Use fbcon_info_from_console() in fbcon_modechange_possible()
fbmem: Check virtual screen sizes in fb_set_var()
fbcon: Prevent that screen size is smaller than font size
fbcon: Disallow setting font bigger than screen size
video: of_display_timing.h: include errno.h
fbdev: fbmem: Fix logo center image dx issue
Naohiro Aota [Wed, 29 Jun 2022 02:00:38 +0000 (11:00 +0900)]
btrfs: zoned: drop optimization of zone finish
We have an optimization in do_zone_finish() to send REQ_OP_ZONE_FINISH only
when necessary, i.e. we don't send REQ_OP_ZONE_FINISH when we assume we
wrote fully into the zone.
The assumption is determined by "alloc_offset == capacity". This condition
won't work if the last ordered extent is canceled due to some errors. In
that case, we consider the zone is deactivated without sending the finish
command while it's still active.
This inconstancy results in activating another block group while we cannot
really activate the underlying zone, which causes the active zone exceeds
errors like below.
BTRFS error (device nvme3n2): allocation failed flags 1, wanted 520192 tree-log 0, relocation: 0
nvme3n2: I/O Cmd(0x7d) @ LBA 160432128, 127 blocks, I/O Error (sct 0x1 / sc 0xbd) MORE DNR
active zones exceeded error, dev nvme3n2, sector 0 op 0xd:(ZONE_APPEND) flags 0x4800 phys_seg 1 prio class 0
nvme3n2: I/O Cmd(0x7d) @ LBA 160432128, 127 blocks, I/O Error (sct 0x1 / sc 0xbd) MORE DNR
active zones exceeded error, dev nvme3n2, sector 0 op 0xd:(ZONE_APPEND) flags 0x4800 phys_seg 1 prio class 0
Fix the issue by removing the optimization for now.
Fixes: c0b3c3bbc6c7 ("btrfs: zoned: finish superblock zone once no space left for new SB") Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com>
The bioc would leak on the normal completion path and also on the RAID56
check (but that one won't happen in practice due to the invalid
combination with zoned mode).
Fixes: ca593da8f389 ("btrfs: zoned: support dev-replace in zoned filesystems") CC: stable@vger.kernel.org # 5.16+ Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
[ update changelog ] Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
Filipe Manana [Mon, 4 Jul 2022 11:42:03 +0000 (12:42 +0100)]
btrfs: return -EAGAIN for NOWAIT dio reads/writes on compressed and inline extents
When doing a direct IO read or write, we always return -ENOTBLK when we
find a compressed extent (or an inline extent) so that we fallback to
buffered IO. This however is not ideal in case we are in a NOWAIT context
(io_uring for example), because buffered IO can block and we currently
have no support for NOWAIT semantics for buffered IO, so if we need to
fallback to buffered IO we should first signal the caller that we may
need to block by returning -EAGAIN instead.
This behaviour can also result in short reads being returned to user
space, which although it's not incorrect and user space should be able
to deal with partial reads, it's somewhat surprising and even some popular
applications like QEMU (Link tag #1) and MariaDB (Link tag #2) don't
deal with short reads properly (or at all).
The short read case happens when we try to read from a range that has a
non-compressed and non-inline extent followed by a compressed extent.
After having read the first extent, when we find the compressed extent we
return -ENOTBLK from btrfs_dio_iomap_begin(), which results in iomap to
treat the request as a short read, returning 0 (success) and waiting for
previously submitted bios to complete (this happens at
fs/iomap/direct-io.c:__iomap_dio_rw()). After that, and while at
btrfs_file_read_iter(), we call filemap_read() to use buffered IO to
read the remaining data, and pass it the number of bytes we were able to
read with direct IO. Than at filemap_read() if we get a page fault error
when accessing the read buffer, we return a partial read instead of an
-EFAULT error, because the number of bytes previously read is greater
than zero.
So fix this by returning -EAGAIN for NOWAIT direct IO when we find a
compressed or an inline extent.
ovl: turn of SB_POSIXACL with idmapped layers temporarily
This cycle we added support for mounting overlayfs on top of idmapped
mounts. Recently I've started looking into potential corner cases when
trying to add additional tests and I noticed that reporting for POSIX ACLs
is currently wrong when using idmapped layers with overlayfs mounted on top
of it.
I have sent out an patch that fixes this and makes POSIX ACLs work
correctly but the patch is a bit bigger and we're already at -rc5 so I
recommend we simply don't raise SB_POSIXACL when idmapped layers are
used. Then we can fix the VFS part described below for the next merge
window so we can have good exposure in -next.
I'm going to give a rather detailed explanation to both the origin of the
problem and mention the solution so people know what's going on.
Let's assume the user creates the following directory layout and they have
a rootfs /var/lib/lxc/c1/rootfs. The files in this rootfs are owned as you
would expect files on your host system to be owned. For example, ~/.bashrc
for your regular user would be owned by 1000:1000 and /root/.bashrc would
be owned by 0:0. IOW, this is just regular boring filesystem tree on an
ext4 or xfs filesystem.
The user chooses to set POSIX ACLs using the setfacl binary granting the
user with uid 4 read, write, and execute permissions for their .bashrc
file:
This for example makes it so that
/var/lib/lxc/c2/rootfs/home/ubuntu/.bashrc which is owned by uid and gid
1000 as being owned by uid and gid 10001000 at
/vol/contpool/lowermap/home/ubuntu/.bashrc.
Assume the user wants to expose these idmapped mounts through an overlayfs
mount to a container.
(1) Mount overlayfs in the initial user namespace and expose it to the
container.
(2) Mount overlayfs on top of the idmapped mounts inside of the container's
user namespace.
Let's assume the user chooses the (1) option and mounts overlayfs on the
host and then changes into a container which uses the idmapping
0:10000000:65536 which is the same used for the two idmapped mounts.
Now the user tries to retrieve the POSIX ACLs using the getfacl command
indicating the uid wasn't correctly translated according to the idmapped
mount. The problem is how we currently translate POSIX ACLs. Let's inspect
the callchain in this example:
As is easily seen the problem arises because the idmapping of the lower
mount isn't taken into account as all of this happens in do_gexattr(). But
do_getxattr() is always called on an overlayfs mount and inode and thus
cannot possible take the idmapping of the lower layers into account.
This problem is similar for fscaps but there the translation happens as
part of vfs_getxattr() already. Let's walk through an fscaps overlayfs
callchain:
The expected outcome here is that we'll receive the cap_net_raw capability
as we are able to map the uid associated with the fscap to 0 within our
container. IOW, we want to see 0 as the result of the idmapping
translations.
If the user chooses option (1) we get the following callchain for fscaps:
We can see how the translation happens correctly in those cases as the
conversion happens within the vfs_getxattr() helper.
For POSIX ACLs we need to do something similar. However, in contrast to
fscaps we cannot apply the fix directly to the kernel internal posix acl
data structure as this would alter the cached values and would also require
a rework of how we currently deal with POSIX ACLs in general which almost
never take the filesystem idmapping into account (the noteable exception
being FUSE but even there the implementation is special) and instead
retrieve the raw values based on the initial idmapping.
The correct values are then generated right before returning to
userspace. The fix for this is to move taking the mount's idmapping into
account directly in vfs_getxattr() instead of having it be part of
posix_acl_fix_xattr_to_user().
To this end we simply move the idmapped mount translation into a separate
step performed in vfs_{g,s}etxattr() instead of in
posix_acl_fix_xattr_{from,to}_user().
To see how this fixes things let's go back to the original example. Assume
the user chose option (1) and mounted overlayfs on top of idmapped mounts
on the host:
passing through the get_acl request to the underlying filesystem. This will
retrieve the acls stored in the lower filesystem without taking the
idmapping of the underlying mount into account as this would mean altering
the cached values for the lower filesystem. The simple solution is to have
ovl_get_acl() simply duplicate the ACLs, update the values according to the
idmapped mount and return it to acl_permission_check() so it can be used in
posix_acl_permission(). Since overlayfs doesn't cache ACLs they'll be
released right after.
Link: https://github.com/brauner/mount-idmapped/issues/9 Cc: Seth Forshee <sforshee@digitalocean.com> Cc: Amir Goldstein <amir73il@gmail.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Aleksa Sarai <cyphar@cyphar.com> Cc: linux-unionfs@vger.kernel.org Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org> Fixes: ff5ca22de39b ("ovl: support idmapped layers") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
x86/bugs: Do not enable IBPB-on-entry when IBPB is not supported
There are some VM configurations which have Skylake model but do not
support IBPB. In those cases, when using retbleed=ibpb, userspace is going
to be killed and kernel is going to panic.
If the CPU does not support IBPB, warn and proceed with the auto option. Also,
do not fallback to IBPB on AMD/Hygon systems if it is not supported.
Fixes: bf9a7f6cdeaa ("x86/bugs: Add retbleed=ibpb") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: Borislav Petkov <bp@suse.de>
Merge tag 'nvme-5.19-2022-07-07' of git://git.infradead.org/nvme into block-5.19
Pull NVMe fixes from Christoph:
"nvme fixes for Linux 5.19
- another bogus identifier quirk (Keith Busch)
- use struct group in the tracer to avoid a gcc warning (Keith Busch)"
* tag 'nvme-5.19-2022-07-07' of git://git.infradead.org/nvme:
nvme: use struct group for generic command dwords
nvme-pci: phison e16 has bogus namespace ids
Pavel Begunkov [Thu, 7 Jul 2022 14:00:38 +0000 (15:00 +0100)]
io_uring: explicit sqe padding for ioctl commands
32 bit sqe->cmd_op is an union with 64 bit values. It's always a good
idea to do padding explicitly. Also zero check it in prep, so it can be
used in the future if needed without compatibility concerns.
Merge tag 'devfreq-fixes-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux
Pull a devfreq fix for 5.19-rc6 from Chanwoo Choi:
"- Fix exynos-bus NULL pointer dereference by correctly using the local
generated freq_table to output the debug values instead of using the
profile freq_table that is not used in the driver."
* tag 'devfreq-fixes-for-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
PM / devfreq: exynos-bus: Fix NULL pointer dereference
Fix exynos-bus NULL pointer dereference by correctly using the local
generated freq_table to output the debug values instead of using the
profile freq_table that is not used in the driver.
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Fixes: 520b39385eb4 ("PM / devfreq: Rework freq_table to be local to devfreq struct") Cc: stable@vger.kernel.org Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Acked-by: Chanwoo Choi <cw00.choi@samsung.com> Signed-off-by: Chanwoo Choi <cw00.choi@samsung.com>
Merge tag 'loongarch-fixes-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch fixes from Huacai Chen:
"A fix for tinyconfig build error, a fix for section mismatch warning,
and two cleanups of obsolete code"
* tag 'loongarch-fixes-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
LoongArch: Fix section mismatch warning
LoongArch: Fix build errors for tinyconfig
LoongArch: Remove obsolete mentions of vcsr
LoongArch: Drop these obsolete selects in Kconfig
- eth: ibmvnic: properly dispose of all skbs during a failover
Previous releases - always broken:
- bpf:
- fix insufficient bounds propagation from
adjust_scalar_min_max_vals
- clear page contiguity bit when unmapping pool
- netfilter: nft_set_pipapo: release elements in clone from
abort path
- mptcp: netlink: issue MP_PRIO signals from userspace PMs
- can:
- rcar_canfd: fix data transmission failed on R-Car V3U
- gs_usb: gs_usb_open/close(): fix memory leak
Misc:
- add Wenjia as SMC maintainer"
* tag 'net-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
wireguard: Kconfig: select CRYPTO_CHACHA_S390
crypto: s390 - do not depend on CRYPTO_HW for SIMD implementations
wireguard: selftests: use microvm on x86
wireguard: selftests: always call kernel makefile
wireguard: selftests: use virt machine on m68k
wireguard: selftests: set fake real time in init
r8169: fix accessing unset transport header
net: rose: fix UAF bug caused by rose_t0timer_expiry
usbnet: fix memory leak in error case
Revert "tls: rx: move counting TlsDecryptErrors for sync"
mptcp: update MIB_RMSUBFLOW in cmd_sf_destroy
mptcp: fix local endpoint accounting
selftests: mptcp: userspace PM support for MP_PRIO signals
mptcp: netlink: issue MP_PRIO signals from userspace PMs
mptcp: Acquire the subflow socket lock before modifying MP_PRIO flags
mptcp: Avoid acquiring PM lock for subflow priority changes
mptcp: fix locking in mptcp_nl_cmd_sf_destroy()
net/mlx5e: Fix matchall police parameters validation
net/sched: act_police: allow 'continue' action offload
net: lan966x: hardcode the number of external ports
...
Merge tag 'pinctrl-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- Tag Intel pin control as supported in MAINTAINERS
- Fix a NULL pointer exception in the Aspeed driver
- Correct some NAND functions in the Sunxi A83T driver
- Use the right offset for some Sunxi pins
- Fix a zero base offset in the Freescale (NXP) i.MX93
- Fix the IRQ support in the STM32 driver
* tag 'pinctrl-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: stm32: fix optional IRQ support to gpios
pinctrl: imx: Add the zero base flag for imx93
pinctrl: sunxi: sunxi_pconf_set: use correct offset
pinctrl: sunxi: a83t: Fix NAND function name for some pins
pinctrl: aspeed: Fix potential NULL dereference in aspeed_pinmux_set_mux()
MAINTAINERS: Update Intel pin control to Supported
These are indeed "should not happen" situations, but it turns out recent
changes made the 'task_is_stopped_or_trace()' case trigger (fix for that
exists, is pending more testing), and the BUG_ON() makes it
unnecessarily hard to actually debug for no good reason.
It's been that way for a long time, but let's make it clear: BUG_ON() is
not good for debugging, and should never be used in situations where you
could just say "this shouldn't happen, but we can continue".
Use WARN_ON_ONCE() instead to make sure it gets logged, and then just
continue running. Instead of making the system basically unusuable
because you crashed the machine while potentially holding some very core
locks (eg this function is commonly called while holding 'tasklist_lock'
for writing).
had to change that because the 'ret' was too early and moved it into
idtentry, bloating the text size, since idtentry is expanded for every
exception vector.
However, with the advent of xen_error_entry() in commit
Kent Gibson [Wed, 6 Jul 2022 08:45:07 +0000 (16:45 +0800)]
gpiolib: cdev: fix null pointer dereference in linereq_free()
Fix a kernel NULL pointer dereference reported by gpio kselftests.
linereq_free() can be called as part of the cleanup of a failed request,
at which time the desc for a line may not have been determined, so it
is unsafe to dereference without a check.
Tiezhu Yang [Mon, 27 Jun 2022 06:57:35 +0000 (14:57 +0800)]
LoongArch: Fix section mismatch warning
init_numa_memory() is annotated __init and not used by any module,
thus don't export it.
Remove not needed EXPORT_SYMBOL for init_numa_memory() to fix the
following section mismatch warning:
MODPOST vmlinux.symvers
WARNING: modpost: vmlinux.o(___ksymtab+init_numa_memory+0x0): Section mismatch in reference
from the variable __ksymtab_init_numa_memory to the function .init.text:init_numa_memory()
The symbol init_numa_memory is exported and annotated __init
Fix this by removing the __init annotation of init_numa_memory or drop the export.
Qi Hu [Wed, 6 Jul 2022 11:29:37 +0000 (19:29 +0800)]
LoongArch: Remove obsolete mentions of vcsr
The `vcsr` only exists in the old hardware design, it isn't used in any
shipped hardware from Loongson-3A5000 on. Both scalar FP and LSX/LASX
instructions use the `fcsr` as their control and status registers now.
For example, the RM control bit in fcsr0 is shared by FP, LSX and LASX
instructions.
Particularly, fcsr16 to fcsr31 are reserved for LSX/LASX now, access to
these registers has no visible effect if LSX/LASX is enabled, and will
cause SXD/ASXD exceptions if LSX/LASX is not enabled.
So, mentions of vcsr are obsolete in the first place (it was just used
for debugging), let's remove them.
Reviewed-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Qi Hu <huqi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Commit 740b41dc45ee ("LoongArch: Add build infrastructure") adds the new
file arch/loongarch/Kconfig.
As the work on LoongArch was probably quite some time under development,
various config symbols have changed and disappeared from the time of
initial writing of the Kconfig file and its inclusion in the repository.
Helge Deller [Sat, 25 Jun 2022 11:00:34 +0000 (13:00 +0200)]
fbcon: Use fbcon_info_from_console() in fbcon_modechange_possible()
Use the fbcon_info_from_console() wrapper which was added to kernel
v5.19 with commit 40964d68ad69 ("fbcon: Introduce wrapper for console->fb_info lookup").
Helge Deller [Wed, 29 Jun 2022 13:53:55 +0000 (15:53 +0200)]
fbmem: Check virtual screen sizes in fb_set_var()
Verify that the fbdev or drm driver correctly adjusted the virtual
screen sizes. On failure report the failing driver and reject the screen
size change.
Helge Deller [Sat, 25 Jun 2022 11:00:34 +0000 (13:00 +0200)]
fbcon: Prevent that screen size is smaller than font size
We need to prevent that users configure a screen size which is smaller than the
currently selected font size. Otherwise rendering chars on the screen will
access memory outside the graphics memory region.
This patch adds a new function fbcon_modechange_possible() which
implements this check and which later may be extended with other checks
if necessary. The new function is called from the FBIOPUT_VSCREENINFO
ioctl handler in fbmem.c, which will return -EINVAL if userspace asked
for a too small screen size.
Helge Deller [Sat, 25 Jun 2022 10:56:49 +0000 (12:56 +0200)]
fbcon: Disallow setting font bigger than screen size
Prevent that users set a font size which is bigger than the physical screen.
It's unlikely this may happen (because screens are usually much larger than the
fonts and each font char is limited to 32x32 pixels), but it may happen on
smaller screens/LCD displays.
And in release_reference() we dereference vma->vm to get to the
vm gt pointer, leading to a use-after free.
However, __i915_vm_release() grabs the vm->mutex so the vm won't be
destroyed before vma->vm->mutex is released, so extract the gt pointer
under the vm->mutex to avoid the vma->vm dereference in
release_references().
v2: Fix a typo in the commit message (Andi Shyti)
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5944 Fixes: 9112ac6ebbc1 ("drm/i915: Remove the vm open count") Cc: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.con> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220620123659.381772-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 1926a6b75954fc1a8b44d10bd0c67db957b78cf7) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
drm/i915/guc: ADL-N should use the same GuC FW as ADL-S
The only difference between the ADL S and P GuC FWs is the HWConfig
support. ADL-N does not support HWConfig, so we should use the same
binary as ADL-S, otherwise the GuC might attempt to fetch a config
table that does not exist. ADL-N is internally identified as an ADL-P,
so we need to special-case it in the FW selection code.
Fixes: a7865916df13 ("drm/i915/adl-n: Enable ADL-N platform") Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Tejas Upadhyay <tejas.upadhyay@intel.com> Cc: Anusha Srivatsa <anusha.srivatsa@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220621233005.3952293-1-daniele.ceraolospurio@intel.com
(cherry picked from commit 971e4a9781742aaad1587e25fd5582b2dd595ef8) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Hangyu Hua [Fri, 24 Jun 2022 13:04:06 +0000 (06:04 -0700)]
drm/i915: fix a possible refcount leak in intel_dp_add_mst_connector()
If drm_connector_init fails, intel_connector_free will be called to take
care of proper free. So it is necessary to drop the refcount of port
before intel_connector_free.
Jakub Kicinski [Thu, 7 Jul 2022 03:04:09 +0000 (20:04 -0700)]
Merge branch 'wireguard-patches-for-5-19-rc6'
Jason A. Donenfeld says:
====================
wireguard patches for 5.19-rc6
1) A few small fixups to the selftests, per usual. Of particular note is
a fix for a test flake that occurred on especially fast systems that
boot in less than a second.
2) An addition during this cycle of some s390 crypto interacted with the
way wireguard selects dependencies, resulting in linker errors
reported by the kernel test robot. So Vladis sent in a patch for
that, which also required a small preparatory fix moving some Kconfig
symbols around.
====================
Select the new implementation of CHACHA20 for S390 when available.
It is faster than the generic software implementation, but also prevents
some linker errors in certain situations.