git.baikalelectronics.ru Git

KVM: selftests: Link selftests directly with lib object files

The linker does obey strong/weak symbols when linking static libraries,
it simply resolves an undefined symbol to the first-encountered symbol.
This means that defining __weak arch-generic functions and then defining
arch-specific strong functions to override them in libkvm will not
always work.

More specifically, if we have:

lib/generic.c:

  void __weak foo(void)
  {
          pr_info("weak\n");
  }

  void bar(void)
  {
          foo();
  }

lib/x86_64/arch.c:

  void foo(void)
  {
          pr_info("strong\n");
  }

And a selftest that calls bar(), it will print "weak". Now if you make
generic.o explicitly depend on arch.o (e.g. add function to arch.c that
is called directly from generic.c) it will print "strong". In other
words, it seems that the linker is free to throw out arch.o when linking
because generic.o does not explicitly depend on it, which causes the
linker to lose the strong symbol.

One solution is to link libkvm.a with --whole-archive so that the linker
doesn't throw away object files it thinks are unnecessary. However that
is a bit difficult to plumb since we are using the common selftests
makefile rules. An easier solution is to drop libkvm.a just link
selftests with all the .o files that were originally in libkvm.a.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-9-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: selftests: Drop unnecessary rule for STATIC_LIBS

Drop the "all: $(STATIC_LIBS)" rule. The KVM selftests already depend
on $(STATIC_LIBS), so there is no reason to have an extra "all" rule.

Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-8-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: selftests: Add a helper to check EPT/VPID capabilities

Create a small helper function to check if a given EPT/VPID capability
is supported. This will be re-used in a follow-up commit to check for 1G
page support.

No functional change intended.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-7-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: selftests: Move VMX_EPT_VPID_CAP_AD_BITS to vmx.h

This is a VMX-related macro so move it to vmx.h. While here, open code
the mask like the rest of the VMX bitmask macros.

No functional change intended.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-6-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: selftests: Refactor nested_map() to specify target level

Refactor nested_map() to specify that it explicityl wants 4K mappings
(the existing behavior) and push the implementation down into
__nested_map(), which can be used in subsequent commits to create huge
page mappings.

No function change intended.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-5-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: selftests: Drop stale function parameter comment for nested_map()

nested_map() does not take a parameter named eptp_memslot. Drop the
comment referring to it.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-4-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: selftests: Add option to create 2M and 1G EPT mappings

The current EPT mapping code in the selftests only supports mapping 4K
pages. This commit extends that support with an option to map at 2M or
1G. This will be used in a future commit to create large page mappings
to test eager page splitting.

No functional change intended.

Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-3-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: selftests: Replace x86_page_size with PG_LEVEL_XX

x86_page_size is an enum used to communicate the desired page size with
which to map a range of memory. Under the hood they just encode the
desired level at which to map the page. This ends up being clunky in a
few ways:

- The name suggests it encodes the size of the page rather than the
level.
- In other places in x86_64/processor.c we just use a raw int to encode
the level.

Simplify this by adopting the kernel style of PG_LEVEL_XX enums and pass
around raw ints when referring to the level. This makes the code easier
to understand since these macros are very common in KVM MMU code.

Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220520233249.3776001-2-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: SVM: fix nested PAUSE filtering when L0 intercepts PAUSE

Commit 2ed38210b89c ("KVM: x86: nSVM: support PAUSE filtering when L0
doesn't intercept PAUSE") introduced passthrough support for nested pause
filtering, (when the host doesn't intercept PAUSE) (either disabled with
kvm module param, or disabled with '-overcommit cpu-pm=on')

Before this commit, L1 KVM didn't intercept PAUSE at all; afterwards,
the feature was exposed as supported by KVM cpuid unconditionally, thus
if L1 could try to use it even when the L0 KVM can't really support it.

In this case the fallback caused KVM to intercept each PAUSE instruction;
in some cases, such intercept can slow down the nested guest so much
that it can fail to boot. Instead, before the problematic commit KVM
was already setting both thresholds to 0 in vmcb02, but after the first
userspace VM exit shrink_ple_window was called and would reset the
pause_filter_count to the default value.

To fix this, change the fallback strategy - ignore the guest threshold
values, but use/update the host threshold values unless the guest
specifically requests disabling PAUSE filtering (either simple or
advanced).

Also fix a minor bug: on nested VM exit, when PAUSE filter counter
were copied back to vmcb01, a dirty bit was not set.

Thanks a lot to Suravee Suthikulpanit for debugging this!

Fixes: 2ed38210b89c ("KVM: x86: nSVM: support PAUSE filtering when L0 doesn't intercept PAUSE")
Reported-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220518072709.730031-1-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: SVM: drop preempt-safe wrappers for avic_vcpu_load/put

Now that these functions are always called with preemption disabled,
remove the preempt_disable()/preempt_enable() pair inside them.

No functional change intended.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220606180829.102503-8-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: disable preemption around the call to kvm_arch_vcpu_{un|}blocking

On SVM, if preemption happens right after the call to finish_rcuwait
but before call to kvm_arch_vcpu_unblocking on SVM/AVIC, it itself
will re-enable AVIC, and then we will try to re-enable it again
in kvm_arch_vcpu_unblocking which will lead to a warning
in __avic_vcpu_load.

The same problem can happen if the vCPU is preempted right after the call
to kvm_arch_vcpu_blocking but before the call to prepare_to_rcuwait
and in this case, we will end up with AVIC enabled during sleep -
Ooops.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220606180829.102503-7-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: disable preemption while updating apicv inhibition

Currently nothing prevents preemption in kvm_vcpu_update_apicv.

On SVM, If the preemption happens after we update the
vcpu->arch.apicv_active, the preemption itself will
'update' the inhibition since the AVIC will be first disabled
on vCPU unload and then enabled, when the current task
is loaded again.

Then we will try to update it again, which will lead to a warning
in __avic_vcpu_load, that the AVIC is already enabled.

Fix this by disabling preemption in this code.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220606180829.102503-6-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: SVM: fix avic_kick_target_vcpus_fast

There are two issues in avic_kick_target_vcpus_fast

1. It is legal to issue an IPI request with APIC_DEST_NOSHORT
   and a physical destination of 0xFF (or 0xFFFFFFFF in case of x2apic),
   which must be treated as a broadcast destination.

   Fix this by explicitly checking for it.
   Also donâ€™t use â€˜indexâ€™ in this case as it gives no new information.

2. It is legal to issue a logical IPI request to more than one target.
   Index field only provides index in physical id table of first
   such target and therefore can't be used before we are sure
   that only a single target was addressed.

   Instead, parse the ICRL/ICRH, double check that a unicast interrupt
   was requested, and use that info to figure out the physical id
   of the target vCPU.
   At that point there is no need to use the index field as well.

In addition to fixing the above issues, also skip the call to
kvm_apic_match_dest.

It is possible to do this now, because now as long as AVIC is not
inhibited, it is guaranteed that none of the vCPUs changed their
apic id from its default value.

This fixes boot of windows guest with AVIC enabled because it uses
IPI with 0xFF destination and no destination shorthand.

Fixes: b6da0dfa9a64 ("KVM: SVM: Use target APIC ID to complete AVIC IRQs when possible")
Cc: stable@vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220606180829.102503-5-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: SVM: remove avic's broken code that updated APIC ID

AVIC is now inhibited if the guest changes the apic id,
and therefore this code is no longer needed.

There are several ways this code was broken, including:

1. a vCPU was only allowed to change its apic id to an apic id
of an existing vCPU.

2. After such change, the vCPU whose apic id entry was overwritten,
could not correctly change its own apic id, because its own
entry is already overwritten.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220606180829.102503-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base

Neither of these settings should be changed by the guest and it is
a burden to support it in the acceleration code, so just inhibit
this code instead.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220606180829.102503-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: document AVIC/APICv inhibit reasons

These days there are too many AVIC/APICv inhibit
reasons, and it doesn't hurt to have some documentation
for them.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220606180829.102503-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86/mmu: Set memory encryption "value", not "mask", in shadow PDPTRs

Assign shadow_me_value, not shadow_me_mask, to PAE root entries,
a.k.a. shadow PDPTRs, when host memory encryption is supported. The
"mask" is the set of all possible memory encryption bits, e.g. MKTME
KeyIDs, whereas "value" holds the actual value that needs to be
stuffed into host page tables.

Using shadow_me_mask results in a failed VM-Entry due to setting
reserved PA bits in the PDPTRs, and ultimately causes an OOPS due to
physical addresses with non-zero MKTME bits sending to_shadow_page()
into the weeds:

set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state.
BUG: unable to handle page fault for address: ffd43f00063049e8
PGD 86dfd8067 P4D 0
Oops: 0000 [#1] PREEMPT SMP
RIP: 0010:mmu_free_root_page+0x3c/0x90 [kvm]
kvm_mmu_free_roots+0xd1/0x200 [kvm]
__kvm_mmu_unload+0x29/0x70 [kvm]
kvm_mmu_unload+0x13/0x20 [kvm]
kvm_arch_destroy_vm+0x8a/0x190 [kvm]
kvm_put_kvm+0x197/0x2d0 [kvm]
kvm_vm_release+0x21/0x30 [kvm]
__fput+0x8e/0x260
____fput+0xe/0x10
task_work_run+0x6f/0xb0
do_exit+0x327/0xa90
do_group_exit+0x35/0xa0
get_signal+0x911/0x930
arch_do_signal_or_restart+0x37/0x720
exit_to_user_mode_prepare+0xb2/0x140
syscall_exit_to_user_mode+0x16/0x30
do_syscall_64+0x4e/0x90
entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: fbc967129720 ("KVM: x86/mmu: Add shadow_me_value and repurpose shadow_me_mask")
Signed-off-by: Yuan Yao <yuan.yao@intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Message-Id: <20220608012015.19566-1-yuan.yao@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Merge tag 'kvmarm-fixes-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 fixes for 5.19, take #1

- Properly reset the SVE/SME flags on vcpu load

- Fix a vgic-v2 regression regarding accessing the pending
state of a HW interrupt from userspace (and make the code
common with vgic-v3)

- Fix access to the idreg range for protected guests

- Ignore 'kvm-arm.mode=protected' when using VHE

- Return an error from kvm_arch_init_vm() on allocation failure

- A bunch of small cleanups (comments, annotations, indentation)

Merge tag 'kvm-riscv-fixes-5.19-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv fixes for 5.19, take #1

- Typo fix in arch/riscv/kvm/vmid.c

- Remove broken reference pattern from MAINTAINERS entry

KVM: arm64: Drop stale comment

The layout of 'struct kvm_vcpu_arch' has evolved significantly since
the initial port of KVM/arm64, so remove the stale comment suggesting
that a prefix of the structure is used exclusively from assembly code.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609121223.2551-7-will@kernel.org

KVM: arm64: Remove redundant hyp_assert_lock_held() assertions

host_stage2_try() asserts that the KVM host lock is held, so there's no
need to duplicate the assertion in its wrappers.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609121223.2551-6-will@kernel.org

KVM: arm64: Extend comment in has_vhe()

has_vhe() expands to a compile-time constant when evaluated from the VHE
or nVHE code, alternatively checking a static key when called from
elsewhere in the kernel. On face value, this looks like a case of
premature optimization, but in fact this allows symbol references on
VHE-specific code paths to be dropped from the nVHE object.

Expand the comment in has_vhe() to make this clearer, hopefully
discouraging anybody from simplifying the code.

Cc: David Brazdil <dbrazdil@google.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609121223.2551-5-will@kernel.org

KVM: arm64: Ignore 'kvm-arm.mode=protected' when using VHE

Ignore 'kvm-arm.mode=protected' when using VHE so that kvm_get_mode()
only returns KVM_MODE_PROTECTED on systems where the feature is available.

Cc: David Brazdil <dbrazdil@google.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609121223.2551-4-will@kernel.org

KVM: arm64: Handle all ID registers trapped for a protected VM

A protected VM accessing ID_AA64ISAR2_EL1 gets punished with an UNDEF,
while it really should only get a zero back if the register is not
handled by the hypervisor emulation (as mandated by the architecture).

Introduce all the missing ID registers (including the unallocated ones),
and have them to return 0.

Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609121223.2551-3-will@kernel.org

KVM: arm64: Return error from kvm_arch_init_vm() on allocation failure

If we fail to allocate the 'supported_cpus' cpumask in kvm_arch_init_vm()
then be sure to return -ENOMEM instead of success (0) on the failure
path.

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220609121223.2551-2-will@kernel.org

MAINTAINERS: Limit KVM RISC-V entry to existing selftests

Commit 3994299d2ce4 ("MAINTAINERS: Update KVM RISC-V entry to cover
selftests support") optimistically adds a file entry for
tools/testing/selftests/kvm/riscv/, but this directory does not exist.

Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about a
broken reference. The script is very useful to keep MAINTAINERS up to date
and MAINTAINERS can be kept in a state where the script emits no warning.

So, just drop the non-matching file entry rather than starting to collect
exceptions of entries that may match in some close or distant future.

Fixes: 3994299d2ce4 ("MAINTAINERS: Update KVM RISC-V entry to cover selftests support")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Anup Patel <anup@brainfault.org>

RISC-V: KVM: fix typos in comments

Various spelling mistakes in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Anup Patel <anup@brainfault.org>

KVM: arm64: Warn if accessing timer pending state outside of vcpu context

A recurrent bug in the KVM/arm64 code base consists in trying to
access the timer pending state outside of the vcpu context, which
makes zero sense (the pending state only exists when the vcpu
is loaded).

In order to avoid more embarassing crashes and catch the offenders
red-handed, add a warning to kvm_arch_timer_get_input_level() and
return the state as non-pending. This avoids taking the system down,
and still helps tracking down silly bugs.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220607131427.1164881-4-maz@kernel.org

KVM: arm64: Replace vgic_v3_uaccess_read_pending with vgic_uaccess_read_pending

Now that GICv2 has a proper userspace accessor for the pending state,
switch GICv3 over to it, dropping the local version, moving over the
specific behaviours that CGIv3 requires (such as the distinction
between pending latch and line level which were never enforced
with GICv2).

We also gain extra locking that isn't really necessary for userspace,
but that's a small price to pay for getting rid of superfluous code.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/20220607131427.1164881-3-maz@kernel.org

KVM: x86: do not report a vCPU as preempted outside instruction boundaries

If a vCPU is outside guest mode and is scheduled out, it might be in the
process of making a memory access.  A problem occurs if another vCPU uses
the PV TLB flush feature during the period when the vCPU is scheduled
out, and a virtual address has already been translated but has not yet
been accessed, because this is equivalent to using a stale TLB entry.

To avoid this, only report a vCPU as preempted if sure that the guest
is at an instruction boundary.  A rescheduling request will be delivered
to the host physical CPU as an external interrupt, so for simplicity
consider any vmexit *not* instruction boundary except for external
interrupts.

It would in principle be okay to report the vCPU as preempted also
if it is sleeping in kvm_vcpu_block(): a TLB flush IPI will incur the
vmentry/vmexit overhead unnecessarily, and optimistic spinning is
also unlikely to succeed.  However, leave it for later because right
now kvm_vcpu_check_block() is doing memory accesses.  Even
though the TLB flush issue only applies to virtual memory address,
it's very much preferrable to be conservative.

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: do not set st->preempted when going back to user space

Similar to the Xen path, only change the vCPU's reported state if the vCPU
was actually preempted. The reason for KVM's behavior is that for example
optimistic spinning might not be a good idea if the guest is doing repeated
exits to userspace; however, it is confusing and unlikely to make a difference,
because well-tuned guests will hardly ever exit KVM_RUN in the first place.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: SVM: fix tsc scaling cache logic

SVM uses a per-cpu variable to cache the current value of the
tsc scaling multiplier msr on each cpu.

Commit d64985fc22b2d
("KVM: X86: Add vendor callbacks for writing the TSC multiplier")
broke this caching logic.

Refactor the code so that all TSC scaling multiplier writes go through
a single function which checks and updates the cache.

This fixes the following scenario:

1. A CPU runs a guest with some tsc scaling ratio.

2. New guest with different tsc scaling ratio starts on this CPU
   and terminates almost immediately.

   This ensures that the short running guest had set the tsc scaling ratio just
   once when it was set via KVM_SET_TSC_KHZ. Due to the bug,
   the per-cpu cache is not updated.

3. The original guest continues to run, it doesn't restore the msr
   value back to its own value, because the cache matches,
   and thus continues to run with a wrong tsc scaling ratio.

Fixes: d64985fc22b2d ("KVM: X86: Add vendor callbacks for writing the TSC multiplier")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220606181149.103072-1-mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: selftests: Make hyperv_clock selftest more stable

hyperv_clock doesn't always give a stable test result, especially with
AMD CPUs. The test compares Hyper-V MSR clocksource (acquired either
with rdmsr() from within the guest or KVM_GET_MSRS from the host)
against rdtsc(). To increase the accuracy, increase the measured delay
(done with nop loop) by two orders of magnitude and take the mean rdtsc()
value before and after rdmsr()/KVM_GET_MSRS.

Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Tested-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220601144322.1968742-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86/MMU: Zap non-leaf SPTEs when disabling dirty logging

Currently disabling dirty logging with the TDP MMU is extremely slow.
On a 96 vCPU / 96G VM backed with gigabyte pages, it takes ~200 seconds
to disable dirty logging with the TDP MMU, as opposed to ~4 seconds with
the shadow MMU.

When disabling dirty logging, zap non-leaf parent entries to allow
replacement with huge pages instead of recursing and zapping all of the
child, leaf entries. This reduces the number of TLB flushes required.
and reduces the disable dirty log time with the TDP MMU to ~3 seconds.

Opportunistically add a WARN() to catch GFNs that are mapped at a
higher level than their max level.

Signed-off-by: Ben Gardon <bgardon@google.com>
Message-Id: <20220525230904.1584480-1-bgardon@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

x86: drop bogus "cc" clobber from __try_cmpxchg_user_asm()

As noted (and fixed) a couple of times in the past, "=@cc<cond>" outputs
and clobbering of "cc" don't work well together. The compiler appears to
mean to reject such, but doesn't - in its upstream form - quite manage
to yet for "cc". Furthermore two similar macros don't clobber "cc", and
clobbering "cc" is pointless in asm()-s for x86 anyway - the compiler
always assumes status flags to be clobbered there.

Fixes: f6d21aee53fe ("x86/uaccess: Implement macros for CMPXCHG on user addresses")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Message-Id: <485c0c0b-a3a7-0b7c-5264-7d00c01de032@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86/mmu: Check every prev_roots in __kvm_mmu_free_obsolete_roots()

When freeing obsolete previous roots, check prev_roots as intended, not
the current root.

Signed-off-by: Shaoqin Huang <shaoqin.huang@intel.com>
Fixes: fec964ffc9c3 ("KVM: x86/mmu: Zap only obsolete roots if a root shadow page is zapped")
Message-Id: <20220607005905.2933378-1-shaoqin.huang@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: arm64: Don't read a HW interrupt pending state in user context

Since 7ad0e61718dd ("KVM: arm64: vgic: Read HW interrupt pending state
from the HW"), we're able to source the pending bit for an interrupt
that is stored either on the physical distributor or on a device.

However, this state is only available when the vcpu is loaded,
and is not intended to be accessed from userspace. Unfortunately,
the GICv2 emulation doesn't provide specific userspace accessors,
and we fallback with the ones that are intended for the guest,
with fatal consequences.

Add a new vgic_uaccess_read_pending() accessor for userspace
to use, build on top of the existing vgic_mmio_read_pending().

Reported-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Fixes: 7ad0e61718dd ("KVM: arm64: vgic: Read HW interrupt pending state from the HW")
Link: https://lore.kernel.org/r/20220607131427.1164881-2-maz@kernel.org
Cc: stable@vger.kernel.org

entry/kvm: Exit to user mode when TIF_NOTIFY_SIGNAL is set

A livepatch transition may stall indefinitely when a kvm vCPU is heavily
loaded. To the host, the vCPU task is a user thread which is spending a
very long time in the ioctl(KVM_RUN) syscall. During livepatch
transition, set_notify_signal() will be called on such tasks to
interrupt the syscall so that the task can be transitioned. This
interrupts guest execution, but when xfer_to_guest_mode_work() sees that
TIF_NOTIFY_SIGNAL is set but not TIF_SIGPENDING it concludes that an
exit to user mode is unnecessary, and guest execution is resumed without
transitioning the task for the livepatch.

This handling of TIF_NOTIFY_SIGNAL is incorrect, as set_notify_signal()
is expected to break tasks out of interruptible kernel loops and cause
them to return to userspace. Change xfer_to_guest_mode_work() to handle
TIF_NOTIFY_SIGNAL the same as TIF_SIGPENDING, signaling to the vCPU run
loop that an exit to userpsace is needed. Any pending task_work will be
run when get_signal() is called from exit_to_user_mode_loop(), so there
is no longer any need to run task work from xfer_to_guest_mode_work().

Suggested-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Petr Mladek <pmladek@suse.com>
Signed-off-by: Seth Forshee <sforshee@digitalocean.com>
Message-Id: <20220504180840.2907296-1-sforshee@digitalocean.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: Don't null dereference ops->destroy

A KVM device cleanup happens in either of two callbacks:
1) destroy() which is called when the VM is being destroyed;
2) release() which is called when a device fd is closed.

Most KVM devices use 1) but Book3s's interrupt controller KVM devices
(XICS, XIVE, XIVE-native) use 2) as they need to close and reopen during
the machine execution. The error handling in kvm_ioctl_create_device()
assumes destroy() is always defined which leads to NULL dereference as
discovered by Syzkaller.

This adds a checks for destroy!=NULL and adds a missing release().

This is not changing kvm_destroy_devices() as devices with defined
release() should have been removed from the KVM devices list by then.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: arm64: Fix inconsistent indenting

Fix the following smatch warnings:

arch/arm64/kvm/vmid.c:62 flush_context() warn: inconsistent indenting

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: sunliming <sunliming@kylinos.cn>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220602024805.511457-1-sunliming@kylinos.cn

KVM: arm64: Always start with clearing SME flag on load

On each vcpu load, we set the KVM_ARM64_HOST_SME_ENABLED
flag if SME is enabled for EL0 on the host. This is used to
restore the correct state on vpcu put.

However, it appears that nothing ever clears this flag. Once
set, it will stick until the vcpu is destroyed, which has the
potential to spuriously enable SME for userspace. As it turns
out, this is due to the SME code being more or less copied from
SVE, and inheriting the same shortcomings.

We never saw the issue because nothing uses SME, and the amount
of testing is probably still pretty low.

Fixes: 1776d52ce61a ("KVM: arm64: Handle SME host state when running guests")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviwed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220528113829.1043361-3-maz@kernel.org

KVM: arm64: Always start with clearing SVE flag on load

On each vcpu load, we set the KVM_ARM64_HOST_SVE_ENABLED
flag if SVE is enabled for EL0 on the host. This is used to restore
the correct state on vpcu put.

However, it appears that nothing ever clears this flag. Once
set, it will stick until the vcpu is destroyed, which has the
potential to spuriously enable SVE for userspace.

We probably never saw the issue because no VMM uses SVE, but
that's still pretty bad. Unconditionally clearing the flag
on vcpu load addresses the issue.

Fixes: 14a852cfc411 ("KVM: arm64: Get rid of host SVE tracking/saving")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220528113829.1043361-2-maz@kernel.org

Linux 5.19-rc1

Merge tag 'pull-work.fd-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull file descriptor fix from Al Viro:
"Fix for breakage in #work.fd this window"

* tag 'pull-work.fd-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix the breakage in close_fd_get_file() calling conventions change

Merge tag 'mm-hotfixes-stable-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull mm hotfixes from Andrew Morton:
"Fixups for various recently-added and longer-term issues and a few
  minor tweaks:

   - fixes for material merged during this merge window

   - cc:stable fixes for more longstanding issues

   - minor mailmap and MAINTAINERS updates"

* tag 'mm-hotfixes-stable-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/oom_kill.c: fix vm_oom_kill_table[] ifdeffery
  x86/kexec: fix memory leak of elf header buffer
  mm/memremap: fix missing call to untrack_pfn() in pagemap_range()
  mm: page_isolation: use compound_nr() correctly in isolate_single_pageblock()
  mm: hugetlb_vmemmap: fix CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON
  MAINTAINERS: add maintainer information for z3fold
  mailmap: update Josh Poimboeuf's email

Merge tag 'mm-nonmm-stable-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull delay-accounting update from Andrew Morton:
"A single featurette for delay accounting.

  Delayed a bit because, unusually, it had dependencies on both the
  mm-stable and mm-nonmm-stable queues"

* tag 'mm-nonmm-stable-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  delayacct: track delays from write-protect copy

bluetooth: don't use bitmaps for random flag accesses

The bluetooth code uses our bitmap infrastructure for the two bits (!)
of connection setup flags, and in the process causes odd problems when
it converts between a bitmap and just the regular values of said bits.

It's completely pointless to do things like bitmap_to_arr32() to convert
a bitmap into a u32. It shoudln't have been a bitmap in the first
place. The reason to use bitmaps is if you have arbitrary number of
bits you want to manage (not two!), or if you rely on the atomicity
guarantees of the bitmap setting and clearing.

The code could use an "atomic_t" and use "atomic_or/andnot()" to set and
clear the bit values, but considering that it then copies the bitmaps
around with "bitmap_to_arr32()" and friends, there clearly cannot be a
lot of atomicity requirements.

So just use a regular integer.

In the process, this avoids the warnings about erroneous use of
bitmap_from_u64() which were triggered on 32-bit architectures when
conversion from a u64 would access two words (and, surprise, surprise,
only one word is needed - and indeed overkill - for a 2-bit bitmap).

That was always problematic, but the compiler seems to notice it and
warn about the invalid pattern only after commit a14787beba79 ("lib: add
bitmap_{from,to}_arr64") changed the exact implementation details of
'bitmap_from_u64()', as reported by Sudip Mukherjee and Stephen Rothwell.

Fixes: 0aa74cb2aca9 ("Bluetooth: hci_core: Rework hci_conn_params flags")
Link: https://lore.kernel.org/all/YpyJ9qTNHJzz0FHY@debian/
Link: https://lore.kernel.org/all/20220606080631.29526b9c@canb.auug.org.au/
Link: https://lore.kernel.org/all/20220605162537.1604762-1-yury.norov@gmail.com/
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Cc: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

fix the breakage in close_fd_get_file() calling conventions change

It used to grab an extra reference to struct file rather than
just transferring to caller the one it had removed from descriptor
table. New variant doesn't, and callers need to be adjusted.

Reported-and-tested-by: syzbot+47dd250f527cb7bebf24@syzkaller.appspotmail.com
Fixes: f95f2c7bc54e ("Unify the primitives for file descriptor closing")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

Merge tag 'x86-urgent-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 SGX fix from Thomas Gleixner:
"A single fix for x86/SGX to prevent that memory which is allocated for
an SGX enclave is accounted to the wrong memory control group"

* tag 'x86-urgent-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sgx: Set active memcg prior to shmem allocation

Merge tag 'x86-mm-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 mm cleanup from Thomas Gleixner:
"Use PAGE_ALIGNED() instead of open coding it in the x86/mm code"

* tag 'x86-mm-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Use PAGE_ALIGNED(x) instead of IS_ALIGNED(x, PAGE_SIZE)

Merge tag 'x86-microcode-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 microcode updates from Thomas Gleixner:

- Disable late microcode loading by default. Unless the HW people get
   their act together and provide a required minimum version in the
   microcode header for making a halfways informed decision its just
   lottery and broken.

- Warn and taint the kernel when microcode is loaded late

- Remove the old unused microcode loader interface

- Remove a redundant perf callback from the microcode loader

* tag 'x86-microcode-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode: Remove unnecessary perf callback
  x86/microcode: Taint and warn on late loading
  x86/microcode: Default-disable late loading
  x86/microcode: Rip out the OLD_INTERFACE

Merge tag 'x86-cleanups-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cleanups from Thomas Gleixner:
"A set of small x86 cleanups:

   - Remove unused headers in the IDT code

   - Kconfig indendation and comment fixes

   - Fix all 'the the' typos in one go instead of waiting for bots to
     fix one at a time"

* tag 'x86-cleanups-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Fix all occurences of the "the the" typo
  x86/idt: Remove unused headers
  x86/Kconfig: Fix indentation of arch/x86/Kconfig.debug
  x86/Kconfig: Fix indentation and add endif comments to arch/x86/Kconfig

Merge tag 'x86-boot-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 boot update from Thomas Gleixner:
"Use strlcpy() instead of strscpy() in arch_setup()"

* tag 'x86-boot-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/setup: Use strscpy() to replace deprecated strlcpy()

Merge tag 'timers-core-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull clockevent/clocksource updates from Thomas Gleixner:

- Device tree bindings for MT8186

- Tell the kernel that the RISC-V SBI timer stops in deeper power
   states

- Make device tree parsing in sp804 more robust

- Dead code removal and tiny fixes here and there

- Add the missing SPDX identifiers

* tag 'timers-core-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/drivers/oxnas-rps: Fix irq_of_parse_and_map() return value
  clocksource/drivers/timer-ti-dm: Remove unnecessary NULL check
  clocksource/drivers/timer-sun5i: Convert to SPDX identifier
  clocksource/drivers/timer-sun4i: Convert to SPDX identifier
  clocksource/drivers/pistachio: Convert to SPDX identifier
  clocksource/drivers/orion: Convert to SPDX identifier
  clocksource/drivers/lpc32xx: Convert to SPDX identifier
  clocksource/drivers/digicolor: Convert to SPDX identifier
  clocksource/drivers/armada-370-xp: Convert to SPDX identifier
  clocksource/drivers/mips-gic-timer: Convert to SPDX identifier
  clocksource/drivers/jcore: Convert to SPDX identifier
  clocksource/drivers/bcm_kona: Convert to SPDX identifier
  clocksource/drivers/sp804: Avoid error on multiple instances
  clocksource/drivers/riscv: Events are stopped during CPU suspend
  clocksource/drivers/ixp4xx: Drop boardfile probe path
  dt-bindings: timer: Add compatible for Mediatek MT8186

Merge tag 'sched-urgent-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fix from Thomas Gleixner:
"Fix the fallout of sysctl code move which placed the init function
wrong"

* tag 'sched-urgent-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/autogroup: Fix sysctl move

Merge tag 'perf-urgent-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:

  - Make the ICL event constraints match reality

  - Remove a unused local variable

* tag 'perf-urgent-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Remove unused local variable
  perf/x86/intel: Fix event constraints for ICL

Merge tag 'perf-core-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixlet from Thomas Gleixner:
"Trivial indentation fix in Kconfig"

* tag 'perf-core-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/Kconfig: Fix indentation in the Kconfig file

Merge tag 'objtool-urgent-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool fixes from Thomas Gleixner:

- Handle __ubsan_handle_builtin_unreachable() correctly and treat it as
   noreturn

- Allow architectures to select uaccess validation

- Use the non-instrumented bit test for test_cpu_has() to prevent
   escape from non-instrumentable regions

- Use arch_ prefixed atomics for JUMP_LABEL=n builds to prevent escape
   from non-instrumentable regions

- Mark a few tiny inline as __always_inline to prevent GCC from
   bringing them out of line and instrumenting them

- Mark the empty stub context_tracking_enabled() as always inline as
   GCC brings them out of line and instruments the empty shell

- Annotate ex_handler_msr_mce() as dead end

* tag 'objtool-urgent-2022-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/extable: Annotate ex_handler_msr_mce() as a dead end
  context_tracking: Always inline empty stubs
  x86: Always inline on_thread_stack() and current_top_of_stack()
  jump_label,noinstr: Avoid instrumentation for JUMP_LABEL=n builds
  x86/cpu: Elide KCSAN for cpu_has() and friends
  objtool: Mark __ubsan_handle_builtin_unreachable() as noreturn
  objtool: Add CONFIG_HAVE_UACCESS_VALIDATION

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull more SCSI updates from James Bottomley:
"Mostly small bug fixes plus other trivial updates.

  The major change of note is moving ufs out of scsi and a minor update
  to lpfc vmid handling"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits)
  scsi: qla2xxx: Remove unused 'ql_dm_tgt_ex_pct' parameter
  scsi: qla2xxx: Remove setting of 'req' and 'rsp' parameters
  scsi: mpi3mr: Fix kernel-doc
  scsi: lpfc: Add support for ATTO Fibre Channel devices
  scsi: core: Return BLK_STS_TRANSPORT for ALUA transitioning
  scsi: sd_zbc: Prevent zone information memory leak
  scsi: sd: Fix potential NULL pointer dereference
  scsi: mpi3mr: Rework mrioc->bsg_device model to fix warnings
  scsi: myrb: Fix up null pointer access on myrb_cleanup()
  scsi: core: Unexport scsi_bus_type
  scsi: sd: Don't call blk_cleanup_disk() in sd_probe()
  scsi: ufs: ufshcd: Delete unnecessary NULL check
  scsi: isci: Fix typo in comment
  scsi: pmcraid: Fix typo in comment
  scsi: smartpqi: Fix typo in comment
  scsi: qedf: Fix typo in comment
  scsi: esas2r: Fix typo in comment
  scsi: storvsc: Fix typo in comment
  scsi: ufs: Split the drivers/scsi/ufs directory
  scsi: qla1280: Remove redundant variable
  ...

Merge tag 'hte/for-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux

Pull hardware timestamping subsystem from Thierry Reding:
"This contains the new HTE (hardware timestamping engine) subsystem
  that has been in the works for a couple of months now.

  The infrastructure provided allows for drivers to register as hardware
  timestamp providers, while consumers will be able to request events
  that they are interested in (such as GPIOs and IRQs) to be timestamped
  by the hardware providers.

  Note that this currently supports only one provider, but there seems
  to be enough interest in this functionality and we expect to see more
  drivers added once this is merged"

[ Linus Walleij mentions the Intel PMC in the Elkhart and Tiger Lake
  platforms as another future timestamp provider ]

* tag 'hte/for-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
  dt-bindings: timestamp: Correct id path
  dt-bindings: Renamed hte directory to timestamp
  hte: Uninitialized variable in hte_ts_get()
  hte: Fix off by one in hte_push_ts_ns()
  hte: Fix possible use-after-free in tegra_hte_test_remove()
  hte: Remove unused including <linux/version.h>
  MAINTAINERS: Add HTE Subsystem
  hte: Add Tegra HTE test driver
  tools: gpio: Add new hardware clock type
  gpiolib: cdev: Add hardware timestamp clock type
  gpio: tegra186: Add HTE support
  gpiolib: Add HTE support
  dt-bindings: Add HTE bindings
  hte: Add Tegra194 HTE kernel provider
  drivers: Add hardware timestamp engine (HTE) subsystem
  Documentation: Add HTE subsystem guide

Merge tag 'kbuild-v5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull more Kbuild updates from Masahiro Yamada:

- Fix build regressions for parisc, csky, nios2, openrisc

- Simplify module builds for CONFIG_LTO_CLANG and CONFIG_X86_KERNEL_IBT

- Remove arch/parisc/nm, which was presumably a workaround for old
   tools

- Check the odd combination of EXPORT_SYMBOL and 'static' precisely

- Make external module builds robust against "too long argument error"

- Support j, k keys for moving the cursor in nconfig

* tag 'kbuild-v5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (25 commits)
  kbuild: Allow to select bash in a modified environment
  scripts: kconfig: nconf: make nconfig accept jk keybindings
  modpost: use fnmatch() to simplify match()
  modpost: simplify mod->name allocation
  kbuild: factor out the common objtool arguments
  kbuild: move vmlinux.o link to scripts/Makefile.vmlinux_o
  kbuild: clean .tmp_* pattern by make clean
  kbuild: remove redundant cleanups in scripts/link-vmlinux.sh
  kbuild: rebuild multi-object modules when objtool is updated
  kbuild: add cmd_and_savecmd macro
  kbuild: make *.mod rule robust against too long argument error
  kbuild: make built-in.a rule robust against too long argument error
  kbuild: check static EXPORT_SYMBOL* by script instead of modpost
  parisc: remove arch/parisc/nm
  kbuild: do not create *.prelink.o for Clang LTO or IBT
  kbuild: replace $(linked-object) with CONFIG options
  kbuild: do not try to parse *.cmd files for objects provided by compiler
  kbuild: replace $(if A,A,B) with $(or A,B) in scripts/Makefile.modpost
  modpost: squash if...else-if in find_elf_symbol2()
  modpost: reuse ARRAY_SIZE() macro for section_mismatch()
  ...

Merge tag 'pull-18-rc1-work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs pathname updates from Al Viro:
"Several cleanups in fs/namei.c"

* tag 'pull-18-rc1-work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  namei: cleanup double word in comment
  get rid of dead code in legitimize_root()
  fs/namei.c:reserve_stack(): tidy up the call of try_to_unlazy()

Merge tag 'pull-18-rc1-work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull mount handling updates from Al Viro:
"Cleanups (and one fix) around struct mount handling.

  The fix is usermode_driver.c one - once you've done kern_mount(), you
  must kern_unmount(); simple mntput() will end up with a leak. Several
  failure exits in there messed up that way... In practice you won't hit
  those particular failure exits without fault injection, though"

* tag 'pull-18-rc1-work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  move mount-related externs from fs.h to mount.h
  blob_to_mnt(): kern_unmount() is needed to undo kern_mount()
  m->mnt_root->d_inode->i_sb is a weird way to spell m->mnt_sb...
  linux/mount.h: trim includes
  uninline may_mount() and don't opencode it in fspick(2)/fsopen(2)

Merge tag 'pull-18-rc1-work.fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull file descriptor updates from Al Viro.

- Descriptor handling cleanups

* tag 'pull-18-rc1-work.fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  Unify the primitives for file descriptor closing
  fs: remove fget_many and fput_many interface
  io_uring_enter(): don't leave f.flags uninitialized

Merge tag '5.19-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs client fixes from Steve French:
"Nine cifs/smb3 client fixes.

  Includes DFS fixes, some cleanup of leagcy SMB1 code, duplicated
  message cleanup and a double free and deadlock fix"

* tag '5.19-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix uninitialized pointer in error case in dfs_cache_get_tgt_share
  cifs: skip trailing separators of prefix paths
  cifs: update internal module number
  cifs: version operations for smb20 unneeded when legacy support disabled
  cifs: do not build smb1ops if legacy support is disabled
  cifs: fix potential deadlock in direct reclaim
  cifs: when extending a file with falloc we should make files not-sparse
  cifs: remove repeated debug message on cifs_put_smb_ses()
  cifs: fix potential double free during failed mount

kbuild: Allow to select bash in a modified environment

This fixes the build error when the system has a default bash version
which is too old to support associative array variables.

The build error log as fellowing:
linux/scripts/check-local-export: line 11: declare: -A: invalid option
declare: usage: declare [-afFirtx] [-p] [name[=value] ...]

Signed-off-by: Schspa Shi <schspa@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

scripts: kconfig: nconf: make nconfig accept jk keybindings

Make nconfig accept jk keybindings for movement in addition to arrow
keys.

Signed-off-by: Isak Ellmer <isak01@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

modpost: use fnmatch() to simplify match()

Replace the own implementation for wildcard (glob) matching with
a function call to fnmatch().

Also, change the return type to 'bool'.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

modpost: simplify mod->name allocation

mod->name is set to the ELF filename with the suffix ".o" stripped.

The current code calls strdup() and free() to manipulate the string,
but a simpler approach is to pass new_module() with the name length
subtracted by 2.

Also, check if the passed filename ends with ".o" before stripping it.

The current code blindly chops the suffix:

tmp[strlen(tmp) - 2] = '\0'

It will cause buffer under-run if strlen(tmp) < 2;

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

kbuild: factor out the common objtool arguments

scripts/Makefile.build and scripts/link-vmlinux.sh have similar setups
for the objtool arguments.

It was difficult to factor out them because all the vmlinux build rules
were written in a shell script. It is somewhat tedious to touch the two
files every time a new objtool option is supported.

To reduce the code duplication, move the objtool for vmlinux.o into
scripts/Makefile.vmlinux_o. Then, move the common macros to Makefile.lib
so they are shared between Makefile.build and Makefile.vmlinux_o.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)

kbuild: move vmlinux.o link to scripts/Makefile.vmlinux_o

This is a preparation for moving the objtool rule in the next commit.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)

kbuild: clean .tmp_* pattern by make clean

Change the "make clean" rule to remove all the .tmp_* files.

.tmp_objdiff is the only exception, which should be removed by
"make mrproper".

Rename the record directory of objdiff, .tmp_objdiff to .objdiff to
avoid the removal by "make clean".

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM-14 (x86-64)

Merge tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:

- bitmap: optimize bitmap_weight() usage, from me

- lib/bitmap.c make bitmap_print_bitmask_to_buf parseable, from Mauro
   Carvalho Chehab

- include/linux/find: Fix documentation, from Anna-Maria Behnsen

- bitmap: fix conversion from/to fix-sized arrays, from me

- bitmap: Fix return values to be unsigned, from Kees Cook

It has been in linux-next for at least a week with no problems.

* tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linux: (31 commits)
  nodemask: Fix return values to be unsigned
  bitmap: Fix return values to be unsigned
  KVM: x86: hyper-v: replace bitmap_weight() with hweight64()
  KVM: x86: hyper-v: fix type of valid_bank_mask
  ia64: cleanup remove_siblinginfo()
  drm/amd/pm: use bitmap_{from,to}_arr32 where appropriate
  KVM: s390: replace bitmap_copy with bitmap_{from,to}_arr64 where appropriate
  lib/bitmap: add test for bitmap_{from,to}_arr64
  lib: add bitmap_{from,to}_arr64
  lib/bitmap: extend comment for bitmap_(from,to)_arr32()
  include/linux/find: Fix documentation
  lib/bitmap.c make bitmap_print_bitmask_to_buf parseable
  MAINTAINERS: add cpumask and nodemask files to BITMAP_API
  arch/x86: replace nodes_weight with nodes_empty where appropriate
  mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate
  clocksource: replace cpumask_weight with cpumask_empty in clocksource.c
  genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate
  irq: mips: replace cpumask_weight with cpumask_empty where appropriate
  drm/i915/pmu: replace cpumask_weight with cpumask_empty where appropriate
  arch/x86: replace cpumask_weight with cpumask_empty where appropriate
  ...

Merge tag 'for-5.19/parisc-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux

Pull more parisc architecture updates from Helge Deller:
"A fix to prevent crash at bootup if CONFIG_SCHED_MC is enabled, and
  add auto-detection of primary graphics card for framebuffer driver"

* tag 'for-5.19/parisc-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc/stifb: Keep track of hardware path of graphics card
  parisc/stifb: Implement fb_is_primary_device()
  parisc: fix a crash with multicore scheduler

Merge tag 'for-linus-5.19-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull more xen updates from Juergen Gross:
"Two cleanup patches for Xen related code and (more important) an
  update of MAINTAINERS for Xen, as Boris Ostrovsky decided to step
  down"

* tag 'for-linus-5.19-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen: replace xen_remap() with memremap()
  MAINTAINERS: Update Xen maintainership
  xen: switch gnttab_end_foreign_access() to take a struct page pointer

Merge tag 'perf-tools-for-v5.19-2022-06-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull more perf tools updates from Arnaldo Carvalho de Melo:

- Synthesize task events for pre-existing threads when using 'perf lock
   --threads', as we need to show task names.

- Fix unwinding with ld.lld (>= version 10.0) linked objects, where
  .eh_frame_hdr and .text are in different PT_LOAD program headers,
   which makes perf record --call-graph dwarf fail with such obkects.

- Check if 'perf record' hangs in the ARM SPE (Statistical Profiling
   Extensions) 'perf test' entry when recording a workload with forks.

- Trace physical address for Arm SPE events, needed for 'perf c2c' to
   locate the memory node for samples.

- Fix sorting in percent_rmt_hitm_cmp() in 'perf c2c'.

- Further support for Intel hybrid systems in the evlist and 'perf
   record' code.

- Update IBM s/390 vendor event JSON tables.

- Add metrics (JSON) for Intel Sapphirerapids.

- Update metrics for Intel Alderlake.

- Correct typo of sysf 'event_source' directory in the documentation.

* tag 'perf-tools-for-v5.19-2022-06-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf vendor events intel: Update metrics for Alderlake
  perf vendor events intel: Add metrics for Sapphirerapids
  perf c2c: Fix sorting in percent_rmt_hitm_cmp()
  perf mem: Trace physical address for Arm SPE events
  perf list: Update event description for IBM zEC12/zBC12 to latest level
  perf list: Update event description for IBM z196/z114 to latest level
  perf list: Update event description for IBM z15 to latest level
  perf list: Update event description for IBM z14 to latest level
  perf list: Update event description for IBM z13 to latest level
  perf list: Update event description for IBM z10 to latest level
  perf list: Add IBM z16 event description for s390
  perf record: Support sample-read topdown metric group for hybrid platforms
  perf lock: Change to synthesize task events
  perf unwind: Fix segbase for ld.lld linked objects
  perf test arm-spe: Check if perf-record hangs when recording workload with forks
  perf docs: Correct typo of event_sources
  perf evlist: Extend arch_evsel__must_be_in_group to support hybrid systems

cifs: fix uninitialized pointer in error case in dfs_cache_get_tgt_share

Set default value of ppath to null.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>

parisc/stifb: Keep track of hardware path of graphics card

Keep the pa_path (hardware path) of the graphics card in sti_struct and use
this info to give more useful info which card is currently being used.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.10+

parisc/stifb: Implement fb_is_primary_device()

Implement fb_is_primary_device() function, so that fbcon detects if this
framebuffer belongs to the default graphics card which was used to start
the system.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v5.10+

Merge tag 'gpio-fixes-for-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

- use the correct register for regcache sync in gpio-pca953x

- remove unused and potentially harmful code from gpio-adp5588

- MAINTAINERS update

* tag 'gpio-fixes-for-v5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio: adp5588: Remove support for platform setup and teardown callbacks
  gpio: pca953x: use the correct register address to do regcache sync
  MAINTAINERS: Update Intel GPIO (PMIC and PCH) to Supported
  MAINTAINERS: Update GPIO ACPI library to Supported

Merge tag 'regulator-fix-v5.19-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator

Pull regulator fix from Mark Brown:
"One fix that came in during the merge window, fixing an error in the
examples in the DT binding documentation for mt6315"

* tag 'regulator-fix-v5.19-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: mt6315-regulator: fix invalid allowed mode

Merge tag 'ntfs3_for_5.19' of https://github.com/Paragon-Software-Group/linux-ntfs3

Pull ntfs3 updates from Konstantin Komarov:

- fix some memory leaks and panic

- fixed xfstests (tested on x86_64): generic/092 generic/099
   generic/228 generic/240 generic/307 generic/444

- fix some typos, dead code, etc

* tag 'ntfs3_for_5.19' of https://github.com/Paragon-Software-Group/linux-ntfs3:
  fs/ntfs3: provide block_invalidate_folio to fix memory leak
  fs/ntfs3: Fix invalid free in log_replay
  fs/ntfs3: Update valid size if -EIOCBQUEUED
  fs/ntfs3: Check new size for limits
  fs/ntfs3: Fix fiemap + fix shrink file size (to remove preallocated space)
  fs/ntfs3: In function ntfs_set_acl_ex do not change inode->i_mode if called from function ntfs_init_acl
  fs/ntfs3: Optimize locking in ntfs_save_wsl_perm
  fs/ntfs3: Update i_ctime when xattr is added
  fs/ntfs3: Restore ntfs_xattr_get_acl and ntfs_xattr_set_acl functions
  fs/ntfs3: Keep preallocated only if option prealloc enabled
  fs/ntfs3: Fix some memory leaks in an error handling path of 'log_replay()'

Merge tag 'ptrace_stop-cleanup-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull ptrace_stop cleanups from Eric Biederman:
"While looking at the ptrace problems with PREEMPT_RT and the problems
  Peter Zijlstra was encountering with ptrace in his freezer rewrite I
  identified some cleanups to ptrace_stop that make sense on their own
  and move make resolving the other problems much simpler.

  The biggest issue is the habit of the ptrace code to change
  task->__state from the tracer to suppress TASK_WAKEKILL from waking up
  the tracee. No other code in the kernel does that and it is straight
  forward to update signal_wake_up and friends to make that unnecessary.

  Peter's task freezer sets frozen tasks to a new state TASK_FROZEN and
  then it stores them by calling "wake_up_state(t, TASK_FROZEN)" relying
  on the fact that all stopped states except the special stop states can
  tolerate spurious wake up and recover their state.

  The state of stopped and traced tasked is changed to be stored in
  task->jobctl as well as in task->__state. This makes it possible for
  the freezer to recover tasks in these special states, as well as
  serving as a general cleanup. With a little more work in that
  direction I believe TASK_STOPPED can learn to tolerate spurious wake
  ups and become an ordinary stop state.

  The TASK_TRACED state has to remain a special state as the registers
  for a process are only reliably available when the process is stopped
  in the scheduler. Fundamentally ptrace needs acess to the saved
  register values of a task.

  There are bunch of semi-random ptrace related cleanups that were found
  while looking at these issues.

  One cleanup that deserves to be called out is from commit 94196e3db8d4
  ("ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs"). This
  makes a change that is technically user space visible, in the handling
  of what happens to a tracee when a tracer dies unexpectedly. According
  to our testing and our understanding of userspace nothing cares that
  spurious SIGTRAPs can be generated in that case"

* tag 'ptrace_stop-cleanup-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  sched,signal,ptrace: Rework TASK_TRACED, TASK_STOPPED state
  ptrace: Always take siglock in ptrace_resume
  ptrace: Don't change __state
  ptrace: Admit ptrace_stop can generate spuriuos SIGTRAPs
  ptrace: Document that wait_task_inactive can't fail
  ptrace: Reimplement PTRACE_KILL by always sending SIGKILL
  signal: Use lockdep_assert_held instead of assert_spin_locked
  ptrace: Remove arch_ptrace_attach
  ptrace/xtensa: Replace PT_SINGLESTEP with TIF_SINGLESTEP
  ptrace/um: Replace PT_DTRACE with TIF_SINGLESTEP
  signal: Replace __group_send_sig_info with send_signal_locked
  signal: Rename send_signal send_signal_locked

Merge tag 'kthread-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull kthread updates from Eric Biederman:
"This updates init and user mode helper tasks to be ordinary user mode
  tasks.

  Commit aa52bf4ac30d ("kthread: Ensure struct kthread is present for
  all kthreads") caused init and the user mode helper threads that call
  kernel_execve to have struct kthread allocated for them. This struct
  kthread going away during execve in turned made a use after free of
  struct kthread possible.

  Here, commit ce0508581165 ("kthread: Don't allocate kthread_struct for
  init and umh") is enough to fix the use after free and is simple
  enough to be backportable.

  The rest of the changes pass struct kernel_clone_args to clean things
  up and cause the code to make sense.

  In making init and the user mode helpers tasks purely user mode tasks
  I ran into two complications. The function task_tick_numa was
  detecting tasks without an mm by testing for the presence of
  PF_KTHREAD. The initramfs code in populate_initrd_image was using
  flush_delayed_fput to ensuere the closing of all it's file descriptors
  was complete, and flush_delayed_fput does not work in a userspace
  thread.

  I have looked and looked and more complications and in my code review
  I have not found any, and neither has anyone else with the code
  sitting in linux-next"

* tag 'kthread-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  sched: Update task_tick_numa to ignore tasks without an mm
  fork: Stop allowing kthreads to call execve
  fork: Explicitly set PF_KTHREAD
  init: Deal with the init process being a user mode process
  fork: Generalize PF_IO_WORKER handling
  fork: Explicity test for idle tasks in copy_thread
  fork: Pass struct kernel_clone_args into copy_thread
  kthread: Don't allocate kthread_struct for init and umh

Merge tag 'per-namespace-ipc-sysctls-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull ipc sysctl namespace updates from Eric Biederman:
"This updates the ipc sysctls so that they are fundamentally per ipc
  namespace. Previously these sysctls depended upon a hack to simulate
  being per ipc namespace by looking up the ipc namespace in read or
  write. With this set of changes the ipc sysctls are registered per ipc
  namespace and open looks up the ipc namespace.

  Not only does this series of changes ensure the traditional binding at
  open time happens, but it sets a foundation for being able to relax
  the permission checks to allow a user namspace root to change the ipc
  sysctls for an ipc namespace that the user namespace root requires. To
  do this requires the ipc namespace to be known at open time"

* tag 'per-namespace-ipc-sysctls-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  ipc: Remove extra braces
  ipc: Check permissions for checkpoint_restart sysctls at open time
  ipc: Remove extra1 field abuse to pass ipc namespace
  ipc: Use the same namespace to modify and validate
  ipc: Store ipc sysctls in the ipc namespace
  ipc: Store mqueue sysctls in the ipc namespace

firmware_loader: enable XZ by default if compressed support is enabled

Commit aad1da054959 ("firmware: Add the support for ZSTD-compressed
firmware files") added support for ZSTD compression, but in the process
also made the previously default XZ compression a config option.

That means that anybody who upgrades their kernel and does a

make oldconfig

to update their configuration, will end up without the XZ compression
that the configuration used to have.

Add the 'default y' to make sure this doesn't happen.

The whole compression question should probably be improved upon, since
it is now possible to "enable" compression in the kernel config but not
enable any actual compression algorithm, which makes it all very
useless. It makes no sense to ask Kconfig questions that enable
situations that are nonsensical like that.

This at least fixes the immediate problem of a kernel update resulting
in a nonbootable machine because of a missed option.

Fixes: aad1da054959 ("firmware: Add the support for ZSTD-compressed firmware files")
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'for-linus-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs

Pull JFFS2, UBI and UBIFS updates from Richard Weinberger:
"JFFS2:
   - Fixes for a memory leak

  UBI:
   - Fixes for fastmap (UAF, high CPU usage)

  UBIFS:
   - Minor cleanups"

* tag 'for-linus-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
  ubi: ubi_create_volume: Fix use-after-free when volume creation failed
  ubi: fastmap: Check wl_pool for free peb before wear leveling
  ubi: fastmap: Fix high cpu usage of ubi_bgt by making sure wl_pool not empty
  ubifs: Use NULL instead of using plain integer as pointer
  ubifs: Simplify the return expression of run_gc()
  jffs2: fix memory leak in jffs2_do_fill_super
  jffs2: Use kzalloc instead of kmalloc/memset

Merge tag 'for-linus-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml

Pull UML updates from Richard Weinberger:

- Various cleanups and fixes: xterm, serial line, time travel

- Set ARCH_HAS_GCOV_PROFILE_ALL

* tag 'for-linus-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  um: Fix out-of-bounds read in LDT setup
  um: chan_user: Fix winch_tramp() return value
  um: virtio_uml: Fix broken device handling in time-travel
  um: line: Use separate IRQs per line
  um: Enable ARCH_HAS_GCOV_PROFILE_ALL
  um: Use asm-generic/dma-mapping.h
  um: daemon: Make default socket configurable
  um: xterm: Make default terminal emulator configurable

Merge tag 'devicetree-fixes-for-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree fixes from Rob Herring:

- Fixes for unevaluatedProperties warnings.

   These were missed to due to a bug in dtschema which is now fixed. The
   changes involve either adding missing properties or removing spurious
   properties from examples.

- Update several Qualcomm binding maintainer email addresses

- Fix typo in imx8mp-media-blk-ctrl example

- Fix fixed string pattern in qcom,smd

- Correct the order of 'reg' entries in Xilinx PCI binding

* tag 'devicetree-fixes-for-5.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: mtd: spi-nand: Add spi-peripheral-props.yaml reference
  dt-bindings: memory-controllers: ingenic: Split out child node properties
  dt-bindings: net/dsa: Add spi-peripheral-props.yaml references
  dt-bindings: PCI: apple: Add missing 'power-domains' property
  dt-bindings: Update Sibi Sankar's email address
  dt-bindings: clock: Update my email address
  dt-bindings: PCI: xilinx-cpm: Fix reg property order
  dt-bindings: net: Fix unevaluatedProperties warnings in examples
  dt-bindings: PCI: socionext,uniphier-pcie: Add missing child interrupt controller
  dt-bindings: usb: snps,dwc3: Add missing 'dma-coherent' property
  dt-bindings: soc: imx8mp-media-blk-ctrl: Fix DT example
  dt-bindings: soc: qcom,smd: do not use pattern for simple rpm-requests string

Merge tag 'arm-multiplatform-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull yet more ARM multiplatform updates from Arnd Bergmann:
"This is the third and final bit of the multiplatform conversion for
  ARMv5, finishing off OMAP1. One patch enables the common-clk
  interface, and the other ones does the Kconfig change.

  These were waiting on a few dependencies to trickle in for common-clk,
  and the last one of those was in the USB tree"

* tag 'arm-multiplatform-5.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  ARM: omap1: enable multiplatform
  ARM: OMAP1: clock: Convert to CCF

Merge tag 'loongarch-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull initial Loongarch architecture code from Arnd Bergmann:
"This is the majority of the loongarch architecture code, including the
  final system call interface and all core functionality.

  It still misses three sets of peripheral but vital patches to add
  support for other subsystems, which have yet to pass review:

   - The drivers/firmware/efi stub for booting from a standard UEFI
     firmware implementation. Both the original custom boot interface
     and a draft implementation of the EFI stub did not make it, so it
     is currently impossible to boot the kernel, until the loongarch
     specific portions get accepted into the UEFI subsystem

   - The drivers/irqchip/irq-loongson-*.c drivers are shared with the
     the MIPS port, but currently lack support for ACPI based booting,
     which will get merged through the irqchip subsystem.

   - Similarly, the drivers/pci/controller/pci-loongson.c needs to be
     modified for ACPI support, which will be merged through the PCI
     subsystem.

  While the port cannot actually be used before all the above are
  merged, having it in 5.19 helps to establish the user space ABI for
  the libc ports to build on, and to help any treewide changes in the
  mainline kernel get applied here as well.

  A gcc-12 based tool chains for build testing is now included in

    https://mirrors.edge.kernel.org/pub/tools/crosstool/"

Original description from Huacai Chen:
"LoongArch is a new RISC ISA, which is a bit like MIPS or RISC-V.
  LoongArch includes a reduced 32-bit version (LA32R), a standard 32-bit
  version (LA32S) and a 64-bit version (LA64). LoongArch use ACPI as its
  boot protocol LoongArch-specific interrupt controllers (similar to APIC)
  are already added in the next revision of ACPI Specification (current
  revision is 6.4).

  This patchset is adding basic LoongArch support in mainline kernel, we
  can see a complete snapshot here:

    https://github.com/loongson/linux/tree/loongarch-next
    https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson.git/log/?h=loongarch-next

  Cross-compile tool chain to build kernel:

    https://github.com/loongson/build-tools/releases/download/2021.12.21/loongarch64-clfs-2022-03-03-cross-tools-gcc-glibc.tar.xz

  A CLFS-based Linux distro:

    https://github.com/loongson/build-tools/releases/download/2021.12.21/loongarch64-clfs-system-2022-03-03.tar.bz2

  Open-source tool chain which is under review (Binutils and Gcc are already upstream):

    https://github.com/loongson/binutils-gdb/tree/upstream_v3.1
    https://github.com/loongson/gcc/tree/loongarch_upstream_v6.3
    https://github.com/loongson/glibc/tree/loongarch_2_35_dev_v2.2

  Loongson and LoongArch documentations:

    https://github.com/loongson/LoongArch-Documentation

  LoongArch-specific interrupt controllers:

    https://mantis.uefi.org/mantis/view.php?id=2203
    https://mantis.uefi.org/mantis/view.php?id=2313"

Link: https://lore.kernel.org/lkml/20220603072053.35005-1-chenhuacai@loongson.cn/
* tag 'loongarch-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (24 commits)
  MAINTAINERS: Add maintainer information for LoongArch
  LoongArch: Add Loongson-3 default config file
  LoongArch: Add Non-Uniform Memory Access (NUMA) support
  LoongArch: Add multi-processor (SMP) support
  LoongArch: Add VDSO and VSYSCALL support
  LoongArch: Add some library functions
  LoongArch: Add misc common routines
  LoongArch: Add ELF and module support
  LoongArch: Add signal handling support
  LoongArch: Add system call support
  LoongArch: Add memory management
  LoongArch: Add process management
  LoongArch: Add exception/interrupt handling
  LoongArch: Add boot and setup routines
  LoongArch: Add other common headers
  LoongArch: Add atomic/locking headers
  LoongArch: Add CPU definition headers
  LoongArch: Add build infrastructure
  LoongArch: Add writecombine support for drm
  LoongArch: Add ELF-related definitions
  ...

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Catalin Marinas:
"Most of issues addressed were introduced during this merging window.

   - Initialise jump labels before setup_machine_fdt(), needed by commit
     137c2c274a37 ("random: use static branch for crng_ready()").

   - Sparse warnings: missing prototype, incorrect __user annotation.

   - Skip SVE kselftest if not sufficient vector lengths supported"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  kselftest/arm64: signal: Skip SVE signal test if not enough VLs supported
  arm64: Initialize jump labels before setup_machine_fdt()
  arm64: hibernate: Fix syntax errors in comments
  arm64: Remove the __user annotation for the restore_za_context() argument
  ftrace/fgraph: fix increased missing-prototypes warnings

Merge tag 'riscv-for-linus-5.19-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:
"This is mostly some DT updates, but also a handful of cleanups and
  some fixes. The most user-visible of those are:

   - A device tree for the Sundance Polarberry, along with a handful of
     fixes and clenups to the PolarFire SOC device trees and bindings.

   - The memfd_secret syscall number is now visible to userspace,

   - Some improvements to the vm layout dump, which really should have
     followed shortly after the sv48 patches but I missed"

* tag 'riscv-for-linus-5.19-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Move alternative length validation into subsection
  riscv: mm: init: make pt_ops_set_[early|late|fixmap] static
  riscv: move errata/ and kvm/ builds to arch/riscv/Kbuild
  RISC-V: Mark IORESOURCE_EXCLUSIVE for reserved mem instead of IORESOURCE_BUSY
  riscv: Wire up memfd_secret in UAPI header
  riscv: Fix irq_work when SMP is disabled
  riscv: Improve virtual kernel memory layout dump
  riscv: Initialize thread pointer before calling C functions
  Documentation: riscv: Add sv48 description to VM layout
  RISC-V: Only default to spinwait on SBI-0.1 and M-mode
  riscv: dts: icicle: sort nodes alphabetically
  riscv: microchip: icicle: readability fixes
  riscv: dts: microchip: add the sundance polarberry
  dt-bindings: riscv: microchip: add polarberry compatible string
  dt-bindings: vendor-prefixes: add Sundance DSP
  riscv: dts: microchip: make the fabric dtsi board specific
  dt-bindings: riscv: microchip: document icicle reference design
  riscv: dts: microchip: remove soc vendor from filenames
  riscv: dts: microchip: move sysctrlr out of soc bus
  riscv: dts: microchip: remove icicle memory clocks

Merge tag 's390-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Heiko Carstens:
"Just a couple of small improvements, bug fixes and cleanups:

   - Add Eric Farman as maintainer for s390 virtio drivers.

   - Improve machine check handling, and avoid incorrectly injecting a
     machine check into a kvm guest.

   - Add cond_resched() call to gmap page table walker in order to avoid
     possible huge latencies. Also use non-quiesing sske instruction to
     speed up storage key handling.

   - Add __GFP_NORETRY to KEXEC_CONTROL_MEMORY_GFP so s390 behaves
     similar like common code.

   - Get sie control block address from correct stack slot in perf event
     code. This fixes potential random memory accesses.

   - Change uaccess code so that the exception handler sets the result
     of get_user() and __get_kernel_nofault() to zero in case of a
     fault. Until now this was done via input parameters for inline
     assemblies. Doing it via fault handling is what most or even all
     other architectures are doing.

   - Couple of other small cleanups and fixes"

* tag 's390-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/stack: add union to reflect kvm stack slot usages
  s390/stack: merge empty stack frame slots
  s390/uaccess: whitespace cleanup
  s390/uaccess: use __noreturn instead of __attribute__((noreturn))
  s390/uaccess: use exception handler to zero result on get_user() failure
  s390/uaccess: use symbolic names for inline assembler operands
  s390/mcck: isolate SIE instruction when setting CIF_MCCK_GUEST flag
  s390/mm: use non-quiescing sske for KVM switch to keyed guest
  s390/gmap: voluntarily schedule during key setting
  MAINTAINERS: Update s390 virtio-ccw
  s390/kexec: add __GFP_NORETRY to KEXEC_CONTROL_MEMORY_GFP
  s390/Kconfig.debug: fix indentation
  s390/Kconfig: fix indentation
  s390/perf: obtain sie_block from the right address
  s390: generate register offsets into pt_regs automatically
  s390: simplify early program check handler
  s390/crypto: fix scatterwalk_unmap() callers in AES-GCM

Merge tag 'efi-next-for-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull more EFI updates from Ard Biesheuvel:
"Follow-up tweaks for EFI changes - they mostly address issues
  introduced this merge window, except for Heinrich's patch:

   - fix new DXE service invocations for mixed mode

   - use correct Kconfig symbol when setting PE header flag

   - clean up the drivers/firmware/efi Kconfig dependencies so that
     features that depend on CONFIG_EFI are hidden from the UI when the
     symbol is not enabled.

  Also included is a RISC-V bugfix from Heinrich to avoid read-write
  mappings of read-only firmware regions in the EFI page tables"

* tag 'efi-next-for-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  efi: clean up Kconfig dependencies on CONFIG_EFI
  efi/x86: libstub: Make DXE calls mixed mode safe
  efi: x86: Fix config name for setting the NX-compatibility flag in the PE header
  riscv: read-only pages should not be writable

perf vendor events intel: Update metrics for Alderlake

Update JSON metrics for Alderlake to perf.

It included both P-core and E-core metrics.

P-core metrics based on TMA 4.4 (TMA_Metrics-full.csv)
E-core metrics based on E-core TMA 2.0 (E-core_TMA_Metrics.csv)

https://download.01.org/perfmon/

Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220528095933.1784141-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf vendor events intel: Add metrics for Sapphirerapids

Add JSON metrics for Sapphirerapids to perf.

Based on TMA4.4 metrics.

https://download.01.org/perfmon/TMA_Metrics-full.csv

Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220528095933.1784141-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf c2c: Fix sorting in percent_rmt_hitm_cmp()

The function percent_rmt_hitm_cmp() wrongly uses local HITMs for
sorting remote HITMs.

Since this function is to sort cache lines for remote HITMs, this patch
changes to use 'rmt_hitm' field for correct sorting.

Fixes: 0fa35fad13c5559f ("perf c2c report: Add hitm/store percent related sort keys")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220530084253.750190-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf mem: Trace physical address for Arm SPE events

Currently, Arm SPE events don't trace physical address, therefore, the
field 'phys_addr' is always zero in synthesized memory samples. This
leads to perf c2c tool cannot locate the memory node for samples.

This patch enables configuration 'pa_enable' for Arm SPE events, so the
physical address packet can be traced, finally this can allow perf c2c
tool to locate properly for memory node.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220530083645.253432-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

perf list: Update event description for IBM zEC12/zBC12 to latest level

Update IBM zEC12/zBC12 event counter description to the latest level
as described in the documents
1. SA23-2260-07:
   "The Load-Program-Parameter and the CPU-Measurement Facilities."
   released on May, 2022
for the following counter sets:
   * Basic counter set
   * Problem counter set
   * Crypto counter set

2. SA23-2261-07:
   "The CPU-Measurement Facility Extended Counters Definition
   for z10, z196/z114, zEC12/zBC12, z13/z13s, z14, z15 and z16"
   released on April 29, 2022
for the following counter sets:
   * Extended counter set
   * MT-Diagnostic counter set

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20220531092706.1931503-7-tmricht@linux.ibm.com
Cc: acme@kernel.org
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: svens@linux.ibm.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org