iommu/dma-iommu: Split iommu_dma_map_msi_msg() in two parts

On RT, iommu_dma_map_msi_msg() may be called from non-preemptible
context. This will lead to a splat with CONFIG_DEBUG_ATOMIC_SLEEP
because the function takes a spin_lock, which can sleep on RT.

iommu_dma_map_msi_msg() is used to map the MSI page in the IOMMU PT
and update the MSI message with the IOVA.

Only the part that looks up the MSI page needs to be called in
preemptible context. As the MSI page cannot change over the lifecycle
of the MSI interrupt, the lookup can be cached and re-used later on.

iommu_dma_map_msi_msg() is now split in two functions:
    - iommu_dma_prepare_msi(): This function will prepare the mapping
      in the IOMMU and store the cookie in the structure msi_desc. This
      function should be called in preemptible context.
    - iommu_dma_compose_msi_msg(): This function will update the MSI
      message with the IOVA when the device is behind an IOMMU.
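
The split described above can be sketched as a small user-space model.
This is only an illustration of the two-phase pattern, not the kernel
code: the *_model types, the lookup table and the function names are
hypothetical, and the real iommu_dma_compose_msi_msg() does more address
handling than this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct msi_page_model {
    uint64_t phys;  /* physical address of the MSI doorbell */
    uint64_t iova;  /* address the device has to write to instead */
};

struct msi_desc_model {
    const struct msi_page_model *iommu_cookie;  /* NULL: nothing cached */
};

struct msi_msg_model {
    uint32_t address_lo;
    uint32_t address_hi;
};

/*
 * Phase 1: the lookup. In the kernel this is the part that may sleep,
 * so it has to run in preemptible context. The result is cached in the
 * descriptor. Returns 0 on success, -1 if no mapping was found.
 */
static int prepare_msi_model(struct msi_desc_model *desc,
                             const struct msi_page_model *table,
                             size_t entries, uint64_t doorbell_phys)
{
    size_t i;

    assert(desc != NULL);
    desc->iommu_cookie = NULL;
    for (i = 0; i < entries; i++) {
        if (table[i].phys == doorbell_phys) {
            desc->iommu_cookie = &table[i];
            return 0;
        }
    }
    return -1;
}

/*
 * Phase 2: no lookup and no locking, only a read of the cached cookie,
 * so it is safe to call from non-preemptible context.
 */
static void compose_msi_msg_model(const struct msi_desc_model *desc,
                                  struct msi_msg_model *msg)
{
    const struct msi_page_model *page = desc->iommu_cookie;

    if (!page)
        return;  /* no cached mapping: leave the message untouched */
    msg->address_hi = (uint32_t)(page->iova >> 32);
    msg->address_lo = (uint32_t)(page->iova & 0xffffffffu);
}
```

A caller would run prepare_msi_model() once when the interrupt is set
up and compose_msi_msg_model() every time the message is written.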

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

genirq/msi: Add a new field in msi_desc to store an IOMMU cookie

When an MSI doorbell is located downstream of an IOMMU, it is required
to swizzle the physical address with an appropriately-mapped IOVA for any
device attached to one of our DMA ops domains.

At the moment, the allocation of the mapping may be done when composing
the message. However, the composing may be done in non-preemptible
context while the allocation must be done from preemptible
context.

A follow-up change will split the current logic into two functions,
which requires keeping an IOMMU cookie per MSI.

A new field is introduced in msi_desc to store an IOMMU cookie. As the
cookie may not be required in some configurations, the field is guarded
by a new config option, CONFIG_IRQ_MSI_IOMMU.

A pair of helpers has also been introduced to access the field.
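
A minimal sketch of what a config-guarded field with accessor helpers
can look like. MODEL_IRQ_MSI_IOMMU stands in for CONFIG_IRQ_MSI_IOMMU,
and the struct and helper names are made up for this illustration; the
commit text does not name the real helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for CONFIG_IRQ_MSI_IOMMU; remove to model a build without it. */
#define MODEL_IRQ_MSI_IOMMU 1

struct msi_desc_model {
    unsigned int irq;
#ifdef MODEL_IRQ_MSI_IOMMU
    const void *iommu_cookie;  /* only present when the option is set */
#endif
};

#ifdef MODEL_IRQ_MSI_IOMMU
static inline const void *model_get_iommu_cookie(const struct msi_desc_model *desc)
{
    assert(desc != NULL);
    return desc->iommu_cookie;
}

static inline void model_set_iommu_cookie(struct msi_desc_model *desc,
                                          const void *cookie)
{
    assert(desc != NULL);
    desc->iommu_cookie = cookie;
}
#else
/* Without the option the helpers compile to no-ops, so callers need no #ifdef. */
static inline const void *model_get_iommu_cookie(const struct msi_desc_model *desc)
{
    (void)desc;
    return NULL;
}

static inline void model_set_iommu_cookie(struct msi_desc_model *desc,
                                          const void *cookie)
{
    (void)desc;
    (void)cookie;
}
#endif
```

Going through the helpers keeps the #ifdef in one place; code that sets
or reads the cookie builds unchanged in both configurations.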

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

arm64: arch_k3: Enable interrupt controller drivers

Select the TISCI Interrupt Router and Interrupt Aggregator drivers and
all their dependencies for TI's SoCs based on the K3 architecture.

Suggested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

irqchip/ti-sci-inta: Add msi domain support

Add an MSI domain that is a child of the INTA domain. Clients
use the INTA MSI bus layer to allocate irqs in this
MSI domain.

Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

soc: ti: Add MSI domain bus support for Interrupt Aggregator

With the system coprocessor managing the range allocation of the
inputs to the Interrupt Aggregator, it is difficult to represent
the device IRQs from DT.

The suggestion is to use MSI in such cases, where devices want
to allocate and group interrupts dynamically.

Create an MSI domain bus layer that allocates and frees MSIs for
a device.

APIs that are implemented:
- ti_sci_inta_msi_create_irq_domain() creates an MSI domain.
- ti_sci_inta_msi_domain_alloc_irqs() creates MSIs for the
  specified device and resource.
- ti_sci_inta_msi_domain_free_irqs() frees the irqs attached to the device.
- ti_sci_inta_msi_get_virq() gets the virq attached to a specific event.

Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

irqchip/ti-sci-inta: Add support for Interrupt Aggregator driver

Texas Instruments' K3 generation SoCs have an Interrupt Aggregator IP,
an interrupt controller that does the following:
- Converts events to interrupts that can be understood by
  an interrupt router.
- Allows for multiplexing of events to interrupts.

Configuration of the interrupt aggregator registers can only be done by
a system co-processor, so the driver needs to send a message to this
co-processor over the TISCI protocol. Add the required infrastructure to
allow the allocation and routing of these events.

Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

dt-bindings: irqchip: Introduce TISCI Interrupt Aggregator bindings

Add the DT binding documentation for Interrupt Aggregator driver.

Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

irqchip/ti-sci-intr: Add support for Interrupt Router driver

Texas Instruments' K3 generation SoCs have an Interrupt Router IP
that allows for redirection of input interrupts to the host
interrupt controller. Interrupt Router inputs come either from a
peripheral or from an Interrupt Aggregator, which is another
interrupt controller.

Configuration of the interrupt router registers can only be done by
a system co-processor, so the driver needs to send a message to this
co-processor over the TISCI protocol.

Add support for the Interrupt Router driver over the TISCI protocol.

Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

dt-bindings: irqchip: Introduce TISCI Interrupt router bindings

Add the DT binding documentation for Interrupt router driver.

Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

gpio: thunderx: Use the default parent apis for {request,release}_resources

thunderx_gpio_irq_{request,release}_resources apis are trying to
{request,release} resources on parent interrupt. There are default
apis doing the same. Use the default parent apis instead of writing
the same code snippet.

Cc: linux-gpio@vger.kernel.org
Cc: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

genirq: Introduce irq_chip_{request,release}_resource_parent() apis

Introduce irq_chip_{request,release}_resource_parent() apis so
that these can be used in hierarchical irqchips.
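
As a rough illustration of the delegation pattern in a hierarchical
irqchip, the sketch below forwards a request to the parent level. The
*_model types, the demo_* callbacks and the choice of -ENOSYS when the
parent has no callback are assumptions made for this example, not taken
from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct irq_data_model;

struct irq_chip_model {
    int (*irq_request_resources)(struct irq_data_model *d);
    void (*irq_release_resources)(struct irq_data_model *d);
};

struct irq_data_model {
    const struct irq_chip_model *chip;
    struct irq_data_model *parent_data;  /* next level up, NULL at the root */
};

/* Forward the request to the parent chip in the hierarchy. */
static int request_resources_parent_model(struct irq_data_model *d)
{
    assert(d->parent_data != NULL);
    d = d->parent_data;
    if (d->chip->irq_request_resources)
        return d->chip->irq_request_resources(d);
    return -ENOSYS;  /* assumed policy when the parent has no callback */
}

/* Forward the release to the parent chip in the hierarchy. */
static void release_resources_parent_model(struct irq_data_model *d)
{
    assert(d->parent_data != NULL);
    d = d->parent_data;
    if (d->chip->irq_release_resources)
        d->chip->irq_release_resources(d);
}

/* Demo parent callbacks that only count how many resources are held. */
static int demo_parent_held;

static int demo_parent_request(struct irq_data_model *d)
{
    (void)d;
    demo_parent_held++;
    return 0;
}

static void demo_parent_release(struct irq_data_model *d)
{
    (void)d;
    demo_parent_held--;
}
```

A child chip can plug the two *_parent_model functions straight into its
own callbacks instead of open-coding the same forwarding logic.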

Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

firmware: ti_sci: Add helper apis to manage resources

Each resource within the device can be uniquely identified as defined
by TISCI. Since this is generic across the devices, resource allocation
can also be made generic instead of each client driver handling the
resource. So add helper apis to manage the resource.

Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

firmware: ti_sci: Add RM mapping table for am654

Add the resource mapping table for AM654 SoC as defined in
http://downloads.ti.com/tisci/esd/latest/5_soc_doc/am6x/resasg_types.html
Introduce a new compatible for AM654 "ti,am654-sci" for using
this resource map table.

Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

firmware: ti_sci: Add support for IRQ management

TISCI abstracts the handling of IRQ routes where interrupt sources
are not directly connected to host interrupt controller. Add support
for the set of TISCI commands for requesting and releasing IRQs.

Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

firmware: ti_sci: Add support for RM core ops

TISCI provides support for getting the resources (IRQ, RING, etc.)
assigned to a specific device. These resources can be handled by
the client, which in turn sends TISCI commands to configure them.

It is very important that the client keeps track of the usage of these
resources.

Add support for TISCI commands to get resource ranges.

Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

firmware: ti_sci: Add support to get TISCI handle using of_phandle

TISCI has been updated to support resource management (interrupts
etc.), and there can be multiple device instances of a
resource type in an SoC. So every driver corresponding to a resource type
should get a TISCI handle so that it can make TISCI calls, and each
DT node corresponding to a device should exist under its corresponding
bus node as per the SoC architecture.

But the existing apis in the TISCI library assume that all TISCI users
are child nodes of TISCI, which is not true in the above case. So introduce
(devm_)ti_sci_get_by_phandle() apis that can be used by TISCI users
to get a TISCI handle using an OF phandle property.

Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

irqchip/renesas-intc-irqpin: Remove devm_kzalloc() error printing

There is no need to print a message if devm_kzalloc() fails, as the
memory allocation core already takes care of that.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

irqchip: Remove unneeded select IRQ_DOMAIN

IRQ_DOMAIN_HIERARCHY selects IRQ_DOMAIN, hence there is no need for
drivers to select both.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

irqchip/gic-v3-its: Make free_lpi_range a little cheaper

Using list_add + list_sort to insert an element and keeping the list
sorted is a somewhat blunt instrument; one can find the right place to
insert in fewer lines of code than the cmp callback uses. Moreover,
walking the entire list afterwards to merge adjacent ranges is
overkill, since we know that only the just-inserted element may be
merged with its neighbours.
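
The idea can be sketched in user space as follows: find the insertion
point by walking the sorted list, link the new range in, then try to
merge it only with its two direct neighbours. The types and names are
hypothetical and use a hand-rolled circular list rather than the kernel
list API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct range {
    uint32_t base;
    uint32_t span;
    struct range *prev, *next;
};

/* Circular list with a sentinel head, kept sorted by base. */
struct range_list {
    struct range head;
};

static void range_list_init(struct range_list *l)
{
    assert(l != NULL);
    l->head.prev = l->head.next = &l->head;
}

static size_t range_count(const struct range_list *l)
{
    const struct range *r;
    size_t n = 0;

    for (r = l->head.next; r != &l->head; r = r->next)
        n++;
    return n;
}

/* If a ends exactly where b starts, fold a into b and free a. */
static void merge_ranges(struct range_list *l, struct range *a, struct range *b)
{
    if (a == &l->head || b == &l->head)
        return;
    if (a->base + a->span != b->base)
        return;
    b->base = a->base;
    b->span += a->span;
    a->prev->next = a->next;
    a->next->prev = a->prev;
    free(a);
}

/* Give [base, base + span) back to the free list. Returns 0 or -1. */
static int free_range_model(struct range_list *l, uint32_t base, uint32_t span)
{
    struct range *new_range = malloc(sizeof(*new_range));
    struct range *pos;

    if (!new_range)
        return -1;
    new_range->base = base;
    new_range->span = span;

    /* Walk backwards to the last entry that starts below the new one. */
    for (pos = l->head.prev; pos != &l->head; pos = pos->prev)
        if (pos->base < base)
            break;

    /* Link the new entry right after pos (pos may be the sentinel). */
    new_range->prev = pos;
    new_range->next = pos->next;
    pos->next->prev = new_range;
    pos->next = new_range;

    /*
     * Only the new entry can have become adjacent to something. The
     * order matters: merge_ranges() frees its first range argument, so
     * new_range must not be touched after the second call.
     */
    merge_ranges(l, new_range->prev, new_range);
    merge_ranges(l, new_range, new_range->next);
    return 0;
}
```

Freeing a range that bridges two existing ones collapses all three into
a single entry without sorting or rescanning the list.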

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

irqchip/gic-v3-its: Drop redundant initialization in mk_lpi_range

There's no reason to ask kmalloc() to zero the allocation, since all
the fields get initialized immediately afterwards. Except that there's
also not any reason to initialize the ->entry member, since the
element gets added to the lpi_range_list immediately.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

irqchip/gic-v3-its: Move allocation outside mutex

There's no reason to do the allocation of the new lpi_range inside the
lpi_range_lock. One could change the code to avoid the allocation
altogether in case the freed range can be merged with one or two
existing ranges (in which case the allocation would naturally be done
under the lock), but it's probably not worth complicating the code for
that.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

irqchip/stm32: Use a platform driver for stm32mp1-exti device

This irqchip driver uses the hwspinlock framework (to handle concurrent
access to HW registers shared with the coprocessor) for the
stm32mp1-exti device.
Hence, this driver needs to handle the hwspinlock driver dependency
using the deferred probe mechanism, which requires converting it
into a platform driver with a probe() op.
This applies only to the device which is "st,stm32mp1-exti" compatible;
the management of the other devices (st,stm32h7-exti / st,stm32-exti) is
kept unchanged (they use IRQCHIP_DECLARE).

Signed-off-by: Fabien Dessenne <fabien.dessenne@st.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

irqchip/gic-pm: Fix suspend handling

If interrupts are enabled for a non-root GIC device that uses the
gic-pm driver, when system suspend occurs, the current interrupt
state is not saved and restored correctly and so interrupts do not
work again on resuming the system. Add a late suspend handler to
save and restore the state for these devices.

Suggested-by: Jonathan Hunter <jonathanh@nvidia.com>
Signed-off-by: Sameer Pujar <spujar@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

irqchip/gic-pm: Update driver to use clk_bulk APIs

gic-pm driver is using pm-clk framework to manage clock resources, where
clocks remain always ON. This happens on Tegra devices which use BPMP
co-processor to manage the clocks. Calls to BPMP are always blocking and
hence it is necessary to enable/disable clocks during prepare/unprepare
phase respectively. When pm-clk is used, prepare count of clock is not
balanced until pm_clk_remove() happens. Clock is prepared in the driver
probe() and thus prepare count of clock remains non-zero, which in turn
keeps clock ON always.

Please note that above mentioned behavior is specific to Tegra devices
using BPMP for clock management and this should not be seen on other
devices. Though this patch uses clk_bulk APIs to address the mentioned
behavior, this works fine for all devices.

To simplify, the gic_get_clocks() API is removed and probe does the
necessary setup instead.

Suggested-by: Mohan Kumar D <mkumard@nvidia.com>
Signed-off-by: Sameer Pujar <spujar@nvidia.com>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

irq/irqdomain: Fix typo in the comment on top of __irq_domain_alloc_irqs()

The word 'number' has been misspelt in the comment on top of
__irq_domain_alloc_irqs().

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

irqchip/imx-irqsteer: Use devm_platform_ioremap_resource() to simplify code

Use the new helper devm_platform_ioremap_resource() which wraps the
platform_get_resource() and devm_ioremap_resource() together, to
simplify the code.

Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

irqchip/gic-v3-its: Fix typo in a comment in its_msi_prepare()

The word 'entirely' has been misspelt in a comment in its_msi_prepare().

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

irqchip/gic-v3-its: fix some definitions of inner cacheability attributes

Some definitions of Inner Cacheability attributes need to be corrected.

Fixes: 8c828a535e29f ("irqchip/gicv3-its: Restore all cacheability attributes")
Signed-off-by: Hongbo Yao <yaohongbo@huawei.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

irqchip/bcm: Restore registration print with %pOF

It is useful to print which interrupt controllers are registered in the
system and which parent IRQ they use, especially given that L2 interrupt
controllers do not call request_irq() on their parent interrupt and do
not appear under /proc/interrupts for that reason.

We used to print the virtual address of the register base, which had
little value. Use %pOF to print the path to the Device Tree node
instead; it maps to the physical address more easily and is what people
need to troubleshoot systems.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>

Linux 5.1-rc5

Merge branch 'page-refs' (page ref overflow)

Merge page ref overflow branch.

Jann Horn reported that he can overflow the page ref count with
sufficient memory (and a filesystem that is intentionally extremely
slow).

Admittedly it's not exactly easy.  To have more than four billion
references to a page requires a minimum of 32GB of kernel memory just
for the pointers to the pages, much less any metadata to keep track of
those pointers.  Jann needed a total of 140GB of memory and a specially
crafted filesystem that leaves all reads pending (in order to not ever
free the page references and just keep adding more).
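
The 32GB figure follows from simple arithmetic, assuming 8-byte
pointers on a 64-bit kernel (an assumption, the text does not state the
pointer size): 2^32 references times 8 bytes each. A small helper with
a made-up name makes the calculation explicit.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to hold one pointer per page reference. */
static uint64_t pointer_bytes_for_refs(uint64_t refs, uint64_t pointer_size)
{
    /* Guard against overflowing the 64-bit result. */
    assert(pointer_size == 0 || refs <= UINT64_MAX / pointer_size);
    return refs * pointer_size;
}
```

With refs = 2^32 and pointer_size = 8 the result is 2^35 bytes, which is
32 GiB before any bookkeeping for those pointers is counted.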

Still, we have a fairly straightforward way to limit the two obvious
user-controllable sources of page references: direct-IO like page
references gotten through get_user_pages(), and the splice pipe page
duplication.  So let's just do that.

* branch page-refs:
  fs: prevent page refcount overflow in pipe_buf_get
  mm: prevent get_user_pages() from overflowing page refcount
  mm: add 'try_get_page()' helper function
  mm: make page ref count overflow check tighter and more explicit

fs: prevent page refcount overflow in pipe_buf_get

Change pipe_buf_get() to return a bool indicating whether it succeeded
in raising the refcount of the page (if the thing in the pipe is a page).
This removes another mechanism for overflowing the page refcount. All
callers converted to handle a failure.

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: prevent get_user_pages() from overflowing page refcount

If the page refcount wraps around past zero, it will be freed while
there are still four billion references to it. One of the possible
avenues for an attacker to try to make this happen is by doing direct IO
on a page multiple times. This patch makes get_user_pages() refuse to
take a new page reference if there are already more than two billion
references to the page.

Reported-by: Jann Horn <jannh@google.com>
Acked-by: Matthew Wilcox <willy@infradead.org>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: add 'try_get_page()' helper function

This is the same as the traditional 'get_page()' function, but instead
of unconditionally incrementing the reference count of the page, it only
does so if the count was "safe". It returns whether the reference count
was incremented (and is marked __must_check, since the caller obviously
has to be aware of it).

Also like 'get_page()', you can't use this function unless you already
had a reference to the page. The intent is that you can use this
exactly like get_page(), but in situations where you want to limit the
maximum reference count.

The code currently does an unconditional WARN_ON_ONCE() if we ever hit
the reference count issues (either zero or negative), as a notification
that the conditional non-increment actually happened.

NOTE! The count access for the "safety" check is inherently racy, but
that doesn't matter since the buffer we use is basically half the range
of the reference count (ie we look at the sign of the count).
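
A user-space sketch of the behaviour described above, using C11
atomics. struct page_model and try_get_page_model() are hypothetical
names for this illustration; the warning on failure is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct page_model {
    atomic_int refcount;
};

/*
 * Take a reference only if the count looks "safe", that is, strictly
 * positive. Returns true if the reference was taken; callers have to
 * check the result.
 */
static bool try_get_page_model(struct page_model *page)
{
    assert(page != NULL);

    /*
     * The load and the increment are not one atomic step. That is
     * tolerable because the refusal threshold sits half the counter
     * range away from the point where the count would wrap to zero.
     */
    if (atomic_load(&page->refcount) <= 0)
        return false;
    atomic_fetch_add(&page->refcount, 1);
    return true;
}
```

Once the count has crossed into negative territory, further attempts
fail and the count is left unchanged.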

Acked-by: Matthew Wilcox <willy@infradead.org>
Cc: Jann Horn <jannh@google.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

mm: make page ref count overflow check tighter and more explicit

We have a VM_BUG_ON() to check that the page reference count doesn't
underflow (or get close to overflow) by checking the sign of the count.

That's all fine, but we actually want to allow people to use a "get page
ref unless it's already very high" helper function, and we want that one
to use the sign of the page ref (without triggering this VM_BUG_ON).

Change the VM_BUG_ON to only check for small underflows (or _very_ close
to overflowing), and ignore overflows which have strayed into negative
territory.
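
One way to express "zero, a small underflow, or very close to
overflowing" is a single unsigned comparison. The sketch below uses a
window of 127, which is an assumption here since the text above does not
give the width; the function names are made up, and assert() plays the
role of VM_BUG_ON.

```c
#include <assert.h>
#include <stdbool.h>

/* True for counts in [-127, 0]; false for everything else. */
static bool ref_zero_or_close_to_overflow(int count)
{
    /* Adding 127 in unsigned arithmetic maps [-127, 0] onto [0, 127]. */
    return (unsigned int)count + 127u <= 127u;
}

/* Model of an unconditional "get": complain only in the narrow window. */
static void get_ref_model(int *count)
{
    assert(!ref_zero_or_close_to_overflow(*count));
    /* Wrap like a hardware counter instead of overflowing a signed int. */
    *count = (int)((unsigned int)*count + 1u);
}
```

A count that has strayed far into negative territory, such as
-2000000000, no longer trips the check, which leaves the sign of the
count free for a "get unless already very high" helper to use.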

Acked-by: Matthew Wilcox <willy@infradead.org>
Cc: Jann Horn <jannh@google.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'for-linus-20190412' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
"Set of fixes that should go into this round. This pull is larger than
  I'd like at this time, but there's really no specific reason for that.
  Some are fixes for issues that went into this merge window, others are
  not. Anyway, this contains:

   - Hardware queue limiting for virtio-blk/scsi (Dongli)

   - Multi-page bvec fixes for lightnvm pblk

   - Multi-bio dio error fix (Jason)

   - Remove the cache hint from the io_uring tool side, since we didn't
     move forward with that (me)

   - Make io_uring SETUP_SQPOLL root restricted (me)

   - Fix leak of page in error handling for pc requests (Jérôme)

   - Fix BFQ regression introduced in this merge window (Paolo)

   - Fix break logic for bio segment iteration (Ming)

   - Fix NVMe cancel request error handling (Ming)

   - NVMe pull request with two fixes (Christoph):
       - fix the initial CSN for nvme-fc (James)
       - handle log page offsets properly in the target (Keith)"

* tag 'for-linus-20190412' of git://git.kernel.dk/linux-block:
  block: fix the return errno for direct IO
  nvmet: fix discover log page when offsets are used
  nvme-fc: correct csn initialization and increments on error
  block: do not leak memory in bio_copy_user_iov()
  lightnvm: pblk: fix crash in pblk_end_partial_read due to multipage bvecs
  nvme: cancel request synchronously
  blk-mq: introduce blk_mq_complete_request_sync()
  scsi: virtio_scsi: limit number of hw queues by nr_cpu_ids
  virtio-blk: limit number of hw queues by nr_cpu_ids
  block, bfq: fix use after free in bfq_bfqq_expire
  io_uring: restrict IORING_SETUP_SQPOLL to root
  tools/io_uring: remove IOCQE_FLAG_CACHEHIT
  block: don't use for-inside-for in bio_for_each_segment_all

Merge tag 'nfs-for-5.1-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:
"Highlights include:

  Stable fix:

   - Fix a deadlock in close() due to incorrect draining of RDMA queues

  Bugfixes:

   - Revert "SUNRPC: Micro-optimise when the task is known not to be
     sleeping" as it is causing stack overflows

   - Fix a regression where NFSv4 getacl and fs_locations stopped
     working

   - Forbid setting AF_INET6 to "struct sockaddr_in"->sin_family.

   - Fix xfstests failures due to incorrect copy_file_range() return
     values"

* tag 'nfs-for-5.1-4' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  Revert "SUNRPC: Micro-optimise when the task is known not to be sleeping"
  NFSv4.1 fix incorrect return value in copy_file_range
  xprtrdma: Fix helper that drains the transport
  NFS: Fix handling of reply page vector
  NFS: Forbid setting AF_INET6 to "struct sockaddr_in"->sin_family.

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fix from James Bottomley:
"One obvious fix for a csiostor data corruption on error bug"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: csiostor: fix missing data copy in csio_scsi_err_handler()

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux

Pull clk fixes from Stephen Boyd:
"Here's more than a handful of clk driver fixes for changes that came
  in during the merge window:

   - Fix the AT91 sama5d2 programmable clk prescaler formula

   - A bunch of Amlogic meson clk driver fixes for the VPU clks

   - A DMI quirk for Intel's Bay Trail SoC's driver to properly mark pmc
     clks as critical only when really needed

   - Stop overwriting CLK_SET_RATE_PARENT flag in mediatek's clk gate
     implementation

   - Use the right structure to test for a frequency table in i.MX's
     PLL_1416x driver"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: imx: Fix PLL_1416X not rounding rates
  clk: mediatek: fix clk-gate flag setting
  platform/x86: pmc_atom: Drop __initconst on dmi table
  clk: x86: Add system specific quirk to mark clocks as critical
  clk: meson: vid-pll-div: remove warning and return 0 on invalid config
  clk: meson: pll: fix rounding and setting a rate that matches precisely
  clk: meson-g12a: fix VPU clock parents
  clk: meson: g12a: fix VPU clock muxes mask
  clk: meson-gxbb: round the vdec dividers to closest
  clk: at91: fix programmable clock for sama5d2

Merge tag 'pci-v5.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

- Add a DMA alias quirk for another Marvell SATA device (Andre
   Przywara)

- Fix a pciehp regression that broke safe removal of devices (Sergey
   Miroshnichenko)

* tag 'pci-v5.1-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: pciehp: Ignore Link State Changes after powering off a slot
  PCI: Add function 1 DMA alias quirk for Marvell 9170 SATA controller

Merge tag 'powerpc-5.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fixes from Michael Ellerman:
"A minor build fix for 64-bit FLATMEM configs.

  A fix for a boot failure on 32-bit powermacs.

  My commit to fix CLOCK_MONOTONIC across Y2038 broke the 32-bit VDSO on
  64-bit kernels, ie. compat mode, which is only used on big endian.

  The rewrite of the SLB code we merged in 4.20 missed the fact that the
  0x380 exception is also used with the Radix MMU to report out of range
  accesses. This could lead to an oops if userspace tried to read from
  addresses outside the user or kernel range.

  Thanks to: Aneesh Kumar K.V, Christophe Leroy, Larry Finger, Nicholas
  Piggin"

* tag 'powerpc-5.1-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/mm: Define MAX_PHYSMEM_BITS for all 64-bit configs
  powerpc/64s/radix: Fix radix segment exception handling
  powerpc/vdso32: fix CLOCK_MONOTONIC on PPC64
  powerpc/32: Fix early boot failure with RTAS built-in

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
"The main thing is a fix to our FUTEX_WAKE_OP implementation which was
  unbelievably broken, but did actually work for the one scenario that
  GLIBC used to use.

  Summary:

   - Fix stack unwinding so we ignore user stacks

   - Fix ftrace module PLT trampoline initialisation checks

   - Fix terminally broken implementation of FUTEX_WAKE_OP atomics"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value
  arm64: backtrace: Don't bother trying to unwind the userspace stack
  arm64/ftrace: fix inadvertent BUG() in trampoline check

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
"Fix typos in user-visible resctrl parameters, and also fix assembly
  constraint bugs that might result in miscompilation"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asm: Use stricter assembly constraints in bitops
  x86/resctrl: Fix typos in the mba_sc mount option

Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull timer fix from Ingo Molnar:
"Fix the alarm_timer_remaining() return value"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  alarmtimer: Return correct remaining time

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler fix from Ingo Molnar:
"Fix a NULL pointer dereference crash in certain environments"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Do not re-read ->h_load_next during hierarchical load calculation

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"Six kernel side fixes: three related to NMI handling on AMD systems, a
  race fix, a kexec initialization fix and a PEBS sampling fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix perf_event_disable_inatomic() race
  x86/perf/amd: Remove need to check "running" bit in NMI handler
  x86/perf/amd: Resolve NMI latency issues for active PMCs
  x86/perf/amd: Resolve race condition when disabling PMC
  perf/x86/intel: Initialize TFA MSR
  perf/x86/intel: Fix handling of wakeup_events for multi-entry PEBS

Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull locking fix from Ingo Molnar:
"Fixes a crash when accessing /proc/lockdep"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/lockdep: Zap lock classes even with lock debugging disabled

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Ingo Molnar:
"Two genirq fixes, plus an irqchip driver error handling fix"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Respect IRQCHIP_SKIP_SET_WAKE in irq_chip_set_wake_parent()
  genirq: Initialize request_mutex if CONFIG_SPARSE_IRQ=n
  irqchip/irq-ls1x: Missing error code in ls1x_intc_of_init()

Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core fixes from Ingo Molnar:
"Fix an objtool warning plus fix a u64_to_user_ptr() macro expansion
  bug"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Add rewind_stack_do_exit() to the noreturn list
  linux/kernel.h: Use parentheses around argument in u64_to_user_ptr()

clk: imx: Fix PLL_1416X not rounding rates

Code which initializes the "clk_init_data.ops" checks pll->rate_table
before that field is ever assigned to so it always picks
"clk_pll1416x_min_ops".

This breaks dynamic rate rounding for features such as cpufreq.

Fix by checking pll_clk->rate_table instead, here pll_clk refers to
the constant initialization data coming from per-soc clk driver.
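
The ordering bug can be modelled in a few lines. The names below
(pll_init_model, pll_model, pll_register_model, the ops enum) are
hypothetical; the sketch only shows why the decision has to be made
from the init data rather than from the not-yet-filled runtime object.

```c
#include <assert.h>
#include <stddef.h>

enum pll_ops_model { PLL_OPS_MIN, PLL_OPS_FULL };

/* Constant per-SoC description, analogous to "pll_clk" in the text. */
struct pll_init_model {
    const unsigned long *rate_table;
    size_t rate_count;
};

/* Runtime object, analogous to "pll"; its fields start out empty. */
struct pll_model {
    const unsigned long *rate_table;
    size_t rate_count;
    enum pll_ops_model ops;
};

static void pll_register_model(struct pll_model *pll,
                               const struct pll_init_model *pll_clk)
{
    assert(pll != NULL && pll_clk != NULL);

    /*
     * Decide from the init data. Testing pll->rate_table at this point
     * would always see NULL, because it is only assigned further down,
     * and the minimal ops would be picked every time.
     */
    pll->ops = pll_clk->rate_table ? PLL_OPS_FULL : PLL_OPS_MIN;

    pll->rate_table = pll_clk->rate_table;
    pll->rate_count = pll_clk->rate_count;
}
```

With a rate table present in the init data, the full set of ops is
selected, which is what rate rounding depends on.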

Signed-off-by: Leonard Crestez <leonard.crestez@nxp.com>
Fixes: 8646d4dcc7fb ("clk: imx: Add PLLs driver for imx8mm soc")
Signed-off-by: Stephen Boyd <sboyd@kernel.org>

clk: mediatek: fix clk-gate flag setting

CLK_SET_RATE_PARENT would be dropped.
Merge two flag setting together to correct the error.

Fixes: 5a1cc4c27ad2 ("clk: mediatek: Add flags to mtk_gate")
Cc: <stable@vger.kernel.org>
Signed-off-by: Weiyi Lu <weiyi.lu@mediatek.com>
Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>

Merge tag 'dma-mapping-5.1-1' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fixes from Christoph Hellwig:
"Fix a sparc64 sun4v_pci regression introduced in this merge window,
  and a dma-debug stacktrace regression from the big refactor last
  merge window"

* tag 'dma-mapping-5.1-1' of git://git.infradead.org/users/hch/dma-mapping:
  dma-debug: only skip one stackframe entry
  sparc64/pci_sun4v: fix ATU checks for large DMA masks

Merge tag 'iommu-fix-v5.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU fix from Joerg Roedel:
"Fix an AMD IOMMU issue where the driver didn't correctly set up the
  exclusion range in the hardware registers, resulting in exclusion
  ranges being one page too big.

  This can cause data corruption if the address of that last page is
  used by DMA operations"

* tag 'iommu-fix-v5.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
  iommu/amd: Set exclusion range correctly

Merge tag 'clang-format-for-linus-v5.1-rc5' of git://github.com/ojeda/linux

Pull clang-format update from Miguel Ojeda:
"The usual roughly-per-release .clang-format macro list update"

* tag 'clang-format-for-linus-v5.1-rc5' of git://github.com/ojeda/linux:
  clang-format: Update with the latest for_each macro list

Merge tag 'mmc-v5.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc

Pull MMC host fixes from Ulf Hansson:

- alcor: Stabilize data write requests

- sdhci-omap: Fix command error path during tuning

* tag 'mmc-v5.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci-omap: Don't finish_mrq() on a command error during tuning
  mmc: alcor: don't write data before command has completed

Merge tag 'sound-5.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
"Well, this one became unpleasantly larger than previous pull requests,
  but it's a kind of usual pattern: now it contains a collection of ASoC
  fixes, and nothing to worry too much.

  The fixes for ASoC core (DAPM, DPCM, topology) are all small and just
  cover corner cases. The remaining changes are driver-specific, many of
  which are for x86 platforms and new drivers like STM32, in addition to
  the usual fixups for HD-audio"

* tag 'sound-5.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (66 commits)
  ASoC: wcd9335: Fix missing regmap requirement
  ALSA: hda: Fix racy display power access
  ASoC: pcm: fix error handling when try_module_get() fails.
  ASoC: stm32: sai: fix master clock management
  ASoC: Intel: kbl: fix wrong number of channels
  ALSA: hda - Add two more machines to the power_save_blacklist
  ASoC: pcm: update module refcount if module_get_upon_open is set
  ASoC: core: conditionally increase module refcount on component open
  ASoC: stm32: fix sai driver name initialisation
  ASoC: topology: Use the correct dobj to free enum control values and texts
  ALSA: seq: Fix OOB-reads from strlcpy
  ASoC: intel: skylake: add remove() callback for component driver
  ASoC: cs35l35: Disable regulators on driver removal
  ALSA: xen-front: Do not use stream buffer size before it is set
  ASoC: rockchip: pdm: change dma burst to 8
  ASoC: rockchip: pdm: fix regmap_ops hang issue
  ASoC: simple-card: don't select DPCM via simple-audio-card
  ASoC: audio-graph-card: don't select DPCM via audio-graph-card
  ASoC: tlv320aic32x4: Change author's name
  ALSA: hda/realtek - Add quirk for Tuxedo XC 1509
  ...

Merge tag 'acpi-5.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
"Fix an ACPICA issue introduced during the 4.20 development cycle and
  causing some systems to crash because of leftover operation region
  data still maintained after the operation region in question has gone
  away (Erik Schmauss)"

* tag 'acpi-5.1-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPICA: Namespace: remove address node from global list after method termination

Merge tag 'drm-fixes-2019-04-12' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
"Fixes across the driver spectrum this week, the mediatek fbdev support
  might be a bit late for this round, but I looked over it and it's not
  very large and seems like a useful feature for them.

  Otherwise the main thing is a regression fix for i915 5.0 bug that
  caused black screens on a bunch of Dell XPS 15s I think, I know at
  least Fedora is waiting for this to land, and the udl fix is also for
  a regression since 5.0 where unplugging the device would end badly.

  core:
   - make atomic hooks optional

  i915:
   - Revert a 5.0 regression where some eDP panels stopped working
   - DSI related fixes for platforms up to IceLake
   - GVT (regression fix, warning fix, use-after free fix)

  amdgpu:
   - Cursor fixes
   - missing PCI ID fix for KFD
   - XGMI fix
   - shadow buffer handling after reset fix

  udl:
   - fix unplugging device crashes.

  mediatek:
   - stabilise MT2701 HDMI support
   - fbdev support

  tegra:
   - fix for build regression in rc1.

  sun4i:
   - Allwinner A6 max freq improvements
   - null ptr deref fix

  dw-hdmi:
   - SCDC configuration improvements

  omap:
   - CEC clock management policy fix"

* tag 'drm-fixes-2019-04-12' of git://anongit.freedesktop.org/drm/drm: (32 commits)
  gpu: host1x: Fix compile error when IOMMU API is not available
  drm/i915/gvt: Roundup fb->height into tile's height at calucation fb->size
  drm/i915/dp: revert back to max link rate and lane count on eDP
  drm/i915/icl: Fix port disable sequence for mipi-dsi
  drm/i915/icl: Ungate ddi clocks before IO enable
  drm/mediatek: no change parent rate in round_rate() for MT2701 hdmi phy
  drm/mediatek: using new factor for tvdpll for MT2701 hdmi phy
  drm/mediatek: remove flag CLK_SET_RATE_PARENT for MT2701 hdmi phy
  drm/mediatek: make implementation of recalc_rate() for MT2701 hdmi phy
  drm/mediatek: fix the rate and divder of hdmi phy for MT2701
  drm/mediatek: fix possible object reference leak
  drm/i915: Get power refs in encoder->get_power_domains()
  drm/i915: Fix pipe_bpp readout for BXT/GLK DSI
  drm/amd/display: Fix negative cursor pos programming (v2)
  drm/sun4i: tcon top: Fix NULL/invalid pointer dereference in sun8i_tcon_top_un/bind
  drm/udl: add a release method and delay modeset teardown
  drm/i915/gvt: Prevent use-after-free in ppgtt_free_all_spt()
  drm/i915/gvt: Annotate iomem usage
  drm/sun4i: DW HDMI: Lower max. supported rate for H6
  Revert "Documentation/gpu/meson: Remove link to meson_canvas.c"
  ...

arm64: futex: Fix FUTEX_WAKE_OP atomic ops with non-zero result value

Rather embarrassingly, our futex() FUTEX_WAKE_OP implementation doesn't
explicitly set the return value on the non-faulting path and instead
leaves it holding the result of the underlying atomic operation. This
means that any FUTEX_WAKE_OP atomic operation which computes a non-zero
value will be reported as having failed. Regrettably, I wrote the buggy
code back in 2011 and it was upstreamed as part of the initial arm64
support in 2012.

The reasons we appear to get away with this are:

  1. FUTEX_WAKE_OP is rarely used and therefore doesn't appear to get
     exercised by futex() test applications

  2. If the result of the atomic operation is zero, the system call
     behaves correctly

  3. Prior to version 2.25, the only operation used by GLIBC set the
     futex to zero, and therefore worked as expected. From 2.25 onwards,
     FUTEX_WAKE_OP is not used by GLIBC at all.

Fix the implementation by ensuring that the return value is either 0
to indicate that the atomic operation completed successfully, or -EFAULT
if we encountered a fault when accessing the user mapping.
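
The return convention described above can be modelled in user space
roughly as follows. This is only a sketch: demo_futex_add() is a
hypothetical name, the operation is not atomic here, and a NULL pointer
merely stands in for a faulting user access.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Model of the fixed convention: the old value is handed back through
 * *oldval, and the return value only reports success (0) or a fault.
 */
static int demo_futex_add(int *uaddr, int oparg, int *oldval)
{
	if (uaddr == NULL)
		return -EFAULT;	/* stand-in for a fault on the user mapping */

	*oldval = *uaddr;
	*uaddr += oparg;	/* the computed value may well be non-zero */
	return 0;		/* explicitly 0, never the operation's result */
}
```

The buggy behaviour corresponds to returning the computed value instead
of 0, which a caller would then misread as a failure.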

Cc: <stable@kernel.org>
Fixes: 6170a97460db ("arm64: Atomic operations")
Signed-off-by: Will Deacon <will.deacon@arm.com>

iommu/amd: Set exclusion range correctly

The exclusion range limit register needs to contain the
base-address of the last page that is part of the range, as
bits 0-11 of this register are treated as 0xfff by the
hardware for comparisons.

So correctly set the exclusion range in the hardware to the
last page which is _in_ the range.
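
As a worked example of the arithmetic, here is a small sketch assuming
4 KiB pages. The helper names and DEMO_ constants are illustrative and
are not taken from the driver.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096ULL
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))

/* Base address of the last page that is still inside the range. */
static uint64_t demo_exclusion_limit(uint64_t start, uint64_t length)
{
	return (start + length - 1) & DEMO_PAGE_MASK;
}

/* The hardware compares as if bits 0-11 of the limit were 0xfff. */
static uint64_t demo_effective_end(uint64_t limit)
{
	return limit | 0xfffULL;
}
```

For a range starting at 0x10000 with length 0x2000, the limit becomes
0x11000 and the effective end is 0x11fff. Using start + length without
the "- 1" would give 0x12000 and an effective end of 0x12fff, which is
one page too many.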

Fixes: b2026aa2dce44 ('x86, AMD IOMMU: add functions for programming IOMMU MMIO space')
Signed-off-by: Joerg Roedel <jroedel@suse.de>

clang-format: Update with the latest for_each macro list

Re-run the shell fragment that generated the original list now that
there are two dozen new entries after v5.1's merge window.

Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>

perf/core: Fix perf_event_disable_inatomic() race

Thomas-Mich Richter reported he triggered a WARN()ing from event_function_local()
on his s390. The problem boils down to:

  CPU-A                               CPU-B

  perf_event_overflow()
    perf_event_disable_inatomic()
      @pending_disable = 1
      irq_work_queue();

  sched-out
    event_sched_out()
      @pending_disable = 0

                                      sched-in
                                      perf_event_overflow()
                                        perf_event_disable_inatomic()
                                          @pending_disable = 1;
                                          irq_work_queue(); // FAILS

  irq_work_run()
    perf_pending_event()
      if (@pending_disable)
        perf_event_disable_local(); // WHOOPS

The problem exists in generic code, but s390 is particularly sensitive
because it doesn't implement arch_irq_work_raise(), nor does it call
irq_work_run() from its PMU interrupt handler (nor would that be
sufficient in this case, because s390 also generates
perf_event_overflow() from pmu::stop). Add to that the fact that s390
is a virtual architecture and (virtual) CPU-A can stall long enough
for the above race to happen, even if it would self-IPI.

Adding an irq_work_sync() to event_sched_in() would work for all hardware
PMUs that properly use irq_work_run() but fails for software PMUs.

Instead encode the CPU number in @pending_disable, such that we can
tell which CPU requested the disable. This then allows us to detect
the above scenario and even redirect the IPI to make up for the failed
queue.
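
The idea of encoding the requesting CPU can be sketched as a small
user-space model. Every name below (demo_event, demo_request_disable,
demo_pending_event, the DEMO_ constants) is hypothetical, and the model
leaves out the atomic accesses and the real irq_work machinery.

```c
#define DEMO_NO_CPU (-1)

enum demo_action {
	DEMO_NOTHING,		/* no disable is pending */
	DEMO_DISABLE_LOCAL,	/* this CPU requested it: disable here */
	DEMO_REDIRECT_IPI	/* another CPU requested it: forward there */
};

struct demo_event {
	int pending_disable;	/* requesting CPU number, or DEMO_NO_CPU */
};

/* Record which CPU asked for the disable instead of a plain 0/1 flag. */
static void demo_request_disable(struct demo_event *ev, int this_cpu)
{
	ev->pending_disable = this_cpu;
}

/* Decide what the pending-work handler running on this_cpu should do. */
static enum demo_action demo_pending_event(struct demo_event *ev,
					   int this_cpu, int *target_cpu)
{
	int cpu = ev->pending_disable;

	if (cpu < 0)
		return DEMO_NOTHING;
	if (cpu == this_cpu) {
		ev->pending_disable = DEMO_NO_CPU;
		return DEMO_DISABLE_LOCAL;
	}
	*target_cpu = cpu;	/* make up for the queue that failed there */
	return DEMO_REDIRECT_IPI;
}
```

In the race from the diagram, the handler runs on CPU-A while the value
names CPU-B, so the model answers DEMO_REDIRECT_IPI rather than
disabling the event on the wrong CPU.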

Reported-by: Thomas-Mich Richter <tmricht@linux.ibm.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'drm-intel-fixes-2019-04-11' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Revert back to max link rate and lane count on eDP.
- DSI related fixes for all platforms including Ice Lake.
- GVT Fixes including one vGPU display plane size regression fix,
one for preventing use-after-free in ppgtt shadow free function,
and another warning fix for iomem access annotation.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190411235832.GA6476@intel.com

block: fix the return errno for direct IO

If the last bio returned is not dio->bio, its status is not assigned
to dio->bio when it is an error. This leaves the status of the whole
IO wrong.

    ksoftirqd/21-117   [021] ..s.  4017.966090:   8,0    C   N 4883648 [0]
          <idle>-0     [018] ..s.  4017.970888:   8,0    C  WS 4924800 + 1024 [0]
          <idle>-0     [018] ..s.  4017.970909:   8,0    D  WS 4935424 + 1024 [<idle>]
          <idle>-0     [018] ..s.  4017.970924:   8,0    D  WS 4936448 + 321 [<idle>]
    ksoftirqd/21-117   [021] ..s.  4017.995033:   8,0    C   R 4883648 + 336 [65475]
    ksoftirqd/21-117   [021] d.s.  4018.001988: myprobe1: (blkdev_bio_end_io+0x0/0x168) bi_status=7
    ksoftirqd/21-117   [021] d.s.  4018.001992: myprobe: (aio_complete_rw+0x0/0x148) x0=0xffff802f2595ad80 res=0x12a000 res2=0x0

We always have to assign bio->bi_status to dio->bio.bi_status because we
will only check dio->bio.bi_status when we return the whole IO to
the upper layer.
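
A simplified model of the propagation rule, with hypothetical names
(demo_dio, demo_bio_end_io) and a plain int standing in for the status
field. It only illustrates that an error seen by any bio must end up in
the status that is checked at the end.

```c
struct demo_dio {
	int status;	/* 0 means success; models dio->bio.bi_status */
};

/*
 * Called once for every completed bio. The first error is recorded
 * regardless of which bio reported it or in which order bios finish.
 */
static void demo_bio_end_io(struct demo_dio *dio, int bio_status)
{
	if (bio_status && !dio->status)
		dio->status = bio_status;
}
```

If three bios complete with status 0, 7 and 0, the recorded status is 7
even though the last bio to complete was successful.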

Fixes: 542ff7bf18c6 ("block: new direct I/O implementation")
Cc: stable@vger.kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Merge tag 'for-5.1-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

- fix parsing of compression algorithm when set as an inode property,
   this could end up with e.g. 'zst' or 'zli' in the value

- don't allow trim on a filesystem with unreplayed log, this could
   cause data loss if there are pending updates to the block groups that
   would not be subject to trim after replay

* tag 'for-5.1-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: prop: fix vanished compression property after failed set
  btrfs: prop: fix zstd compression parameter validation
  Btrfs: do not allow trimming when a fs is mounted with the nologreplay option

Merge tag 'drm-misc-fixes-2019-04-11' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

- core: Make atomic_enable and disable optional for CRTC
- dw-hdmi: Lower max frequency for the Allwinner H6, SCDC configuration
improvements for older controller versions
- omap: a fix for the CEC clock management policy

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190411151658.orm46ccd5zmrw27l@flea

Revert "SUNRPC: Micro-optimise when the task is known not to be sleeping"

This reverts commit 009a82f6437490c262584d65a14094a818bcb747.

The ability to optimise here relies on the compiler being able to optimise
away tail calls to avoid stack overflows. Unfortunately, we are seeing
reports of problems, so let's just revert.

Reported-by: Daniel Mack <daniel@zonque.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFSv4.1 fix incorrect return value in copy_file_range

According to the NFSv4.2 spec, if the input and output files are the
same file, the operation should fail with EINVAL. However, the Linux
copy_file_range() system call has no such restriction. Therefore,
in such a case let's return EOPNOTSUPP and allow the VFS to fall back
to doing do_splice_direct(). Also, when copy_file_range() is called
on an NFSv4.0 or 4.1 mount (i.e., a server that doesn't support
COPY functionality), we need to return EOPNOTSUPP and
fall back to a regular copy.
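
The decision described above can be sketched as a tiny predicate. The
function name and its two boolean parameters are assumptions made for
illustration; the real code inspects the file and server state itself.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Returns 0 when a server-side COPY may be attempted, or -EOPNOTSUPP
 * so that the caller can fall back to a regular copy.
 */
static int demo_nfs_copy_check(bool same_file, bool server_supports_copy)
{
	if (same_file || !server_supports_copy)
		return -EOPNOTSUPP;
	return 0;
}
```

Both the same-file case and the no-COPY-support case deliberately share
one error code, since the fallback path is the same for both.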

Fixes xfstest generic/075, generic/091, generic/112, generic/263
for all NFSv4.x versions.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

xprtrdma: Fix helper that drains the transport

We want to drain only the RQ first. Otherwise the transport can
deadlock on ->close if there are outstanding Send completions.

Fixes: 6d2d0ee27c7a ("xprtrdma: Replace rpcrdma_receive_wq ... ")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: stable@vger.kernel.org # v5.0+
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS: Fix handling of reply page vector

NFSv4 GETACL and FS_LOCATIONS requests stopped working in v5.1-rc.

These two need the extra padding to be added directly to the reply
length.

Reported-by: Olga Kornievskaia <aglo@umich.edu>
Fixes: 02ef04e432ba ("NFS: Account for XDR pad of buf->pages")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Olga Kornievskaia <aglo@umich.edu>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

NFS: Forbid setting AF_INET6 to "struct sockaddr_in"->sin_family.

syzbot is reporting an uninitialized value at rpc_sockaddr2uaddr() [1]. This
is because syzbot is setting AF_INET6 in "struct sockaddr_in"->sin_family
(which is embedded in the user-visible "struct nfs_mount_data" structure)
even though nfs23_validate_mount_data() cannot pass sizeof(struct sockaddr_in6)
bytes of AF_INET6 address to rpc_sockaddr2uaddr().

Since "struct nfs_mount_data" structure is user-visible, we can't change
"struct nfs_mount_data" to use "struct sockaddr_storage". Therefore,
assuming that everybody is using AF_INET family when passing address via
"struct nfs_mount_data"->addr, reject if its sin_family is not AF_INET.

[1] https://syzkaller.appspot.com/bug?id=599993614e7cbbf66bc2656a919ab2a95fb5d75c
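
A user-space sketch of the family check, using the standard socket
headers. demo_validate_mount_addr() is a hypothetical helper, and the
choice of -EINVAL as the rejection code is an assumption for this
example.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * A struct sockaddr_in only has room for an IPv4 address, so any other
 * address family in sin_family is rejected up front.
 */
static int demo_validate_mount_addr(const struct sockaddr_in *addr)
{
	if (addr->sin_family != AF_INET)
		return -EINVAL;
	return 0;
}
```

An address whose sin_family was set to AF_INET6 is refused before any
code tries to read sizeof(struct sockaddr_in6) bytes from it.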

Reported-by: syzbot <syzbot+047a11c361b872896a4f@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>

dma-debug: only skip one stackframe entry

With skip set to 1, I get a traceback like this:

[  106.867637] DMA-API: Mapped at:
[  106.870784]  afu_dma_map_region+0x2cd/0x4f0 [dfl_afu]
[  106.875839]  afu_ioctl+0x258/0x380 [dfl_afu]
[  106.880108]  do_vfs_ioctl+0xa9/0x720
[  106.883688]  ksys_ioctl+0x60/0x90
[  106.887007]  __x64_sys_ioctl+0x16/0x20

With the previous value of 2, afu_dma_map_region was being omitted.  I
suspect that the code paths have simply changed since the value of 2 was
chosen a decade ago, but it's also possible that it varies based on which
mapping function was used, compiler inlining choices, etc.  In any case,
it's best to err on the side of skipping less.

Signed-off-by: Scott Wood <swood@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

platform/x86: pmc_atom: Drop __initconst on dmi table

It's used by probe and that isn't an init function. Drop this so that we
don't get a section mismatch.

Reported-by: kbuild test robot <lkp@intel.com>
Cc: David Müller <dave.mueller@gmx.ch>
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Andy Shevchenko <andy.shevchenko@gmail.com>
Fixes: 7c2e07130090 ("clk: x86: Add system specific quirk to mark clocks as critical")
Signed-off-by: Stephen Boyd <sboyd@kernel.org>

Merge tag 'gvt-fixes-2019-04-11' of https://github.com/intel/gvt-linux into drm-intel-fixes

gvt-fixes-2019-04-11

- Fix sparse warning on iomem usage (Chris)
- Prevent use-after-free for ppgtt shadow table free (Chris)
- Fix display plane size regression for tiled surface (Xiong)

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190411064910.GF17995@zhen-hp.sh.intel.com

Merge branch 'nvme-5.1' of git://git.infradead.org/nvme into for-linus

Pull NVMe fixes from Christoph:

"Two nvme fixes for 5.1 - fixing the initial CSN for nvme-fc, and handle
log page offsets properly in the target."

* 'nvme-5.1' of git://git.infradead.org/nvme:
  nvmet: fix discover log page when offsets are used
  nvme-fc: correct csn initialization and increments on error

nvmet: fix discover log page when offsets are used

The nvme target hadn't been taking the Get Log Page offset parameter
into consideration, and so has been returning corrupted log pages when
offsets are used. Since many tools, including nvme-cli, split the log
request to 4k, we've been breaking discovery log responses when more
than 3 subsystems exist.

Fix the returned data by internally generating the entire discovery
log page and copying only the requested bytes into the user buffer. The
command log page offset type has been modified to a native __le64 to
make it easier to extract the value from a command.
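
The "generate everything, copy only the requested window" approach can
be illustrated with a plain buffer. demo_copy_log_window() and its
parameters are hypothetical; the sketch ignores how the target actually
builds the page and hands data to the host.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy at most out_len bytes of the fully generated log page, starting
 * at the requested offset. Returns the number of bytes copied.
 */
static size_t demo_copy_log_window(const unsigned char *full, size_t full_len,
				   size_t offset, unsigned char *out,
				   size_t out_len)
{
	size_t n;

	if (offset >= full_len)
		return 0;	/* nothing left at or beyond the end */

	n = full_len - offset;
	if (n > out_len)
		n = out_len;
	memcpy(out, full + offset, n);
	return n;
}
```

A host that splits its request into fixed-size chunks then receives
consecutive, non-overlapping slices of the same page.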

Signed-off-by: Keith Busch <keith.busch@intel.com>
Tested-by: Minwoo Im <minwoo.im@samsung.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

nvme-fc: correct csn initialization and increments on error

This patch fixes a long-standing bug that initialized the FC-NVME
cmnd iu CSN value to 1. Early FC-NVME specs had the connection starting
with CSN=1. By the time the spec reached approval, the language had
changed to state a connection should start with CSN=0. This patch
corrects the initialization value for FC-NVME connections.

Additionally, in reviewing the transport, the CSN value is assigned to
the new IU early in the start routine. It's possible that a later dma
map request may fail, causing the command to never be sent to the
controller. Change the location of the assignment so that it is
immediately prior to calling the lldd. Add a comment block to explain
the impacts if the lldd were to additionally fail sending the command.

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>

Merge tag 'asoc-fix-v5.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v5.1

A few core fixes along with the driver specific ones, mainly fixing
small issues that only affect x86 platforms for various reasons (their
unusual machine enumeration mechanisms mainly, plus a fix for error
handling in topology).

Some of the driver fixes look larger than they are, like
the hdmi-codec changes which resulted in an indentation change, and most
of the other large changes are for new drivers like the STM32 changes.

mmc: sdhci-omap: Don't finish_mrq() on a command error during tuning

commit 5b0d62108b46 ("mmc: sdhci-omap: Add platform specific reset
callback") skips data resets during tuning operation. Because of this,
a data error or data finish interrupt might still arrive after a command
error has been handled and the mrq ended. This ends up with a "mmc0: Got
data interrupt 0x00000002 even though no data operation was in progress"
error message.

Fix this by adding a platform specific callback for sdhci_irq. Mark the
mrq as a failure but wait for a data interrupt instead of calling
finish_mrq().

Fixes: 5b0d62108b46 ("mmc: sdhci-omap: Add platform specific reset
callback")
Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

Merge branch 'drm-fixes-5.1' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

A few fixes for 5.1:
- Cursor fixes
- Add missing picasso pci id to KFD
- XGMI fix
- Shadow buffer handling fix for GPU reset

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190410183031.3710-1-alexander.deucher@amd.com

Merge branch 'mediatek-drm-fixes-5.1' of https://github.com/ckhu-mediatek/linux.git-tags into drm-fixes

This includes stable MT2701 HDMI, framebuffer device support and some
fixes for the mediatek drm driver.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: CK Hu <ck.hu@mediatek.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1554860914.29842.4.camel@mtksdaap41

Merge tag 'drm/tegra/for-5.1-rc5' of git://anongit.freedesktop.org/tegra/linux into drm-fixes

drm/tegra: Fixes for v5.1-rc5

A single, one-line fix for a build error introduced in v5.1-rc1.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thierry Reding <thierry.reding@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190411084106.7552-1-thierry.reding@gmail.com

gpu: host1x: Fix compile error when IOMMU API is not available

In case the IOMMU API is not available compiling host1x fails with
the following error:

  In file included from drivers/gpu/host1x/hw/host1x06.c:27:
  drivers/gpu/host1x/hw/channel_hw.c: In function ‘host1x_channel_set_streamid’:
  drivers/gpu/host1x/hw/channel_hw.c:118:30: error: implicit declaration of function
    ‘dev_iommu_fwspec_get’; did you mean ‘iommu_fwspec_free’?  [-Werror=implicit-function-declaration]
  struct iommu_fwspec *spec = dev_iommu_fwspec_get(channel->dev->parent);
                              ^~~~~~~~~~~~~~~~~~~~
                              iommu_fwspec_free

Fixes: de5469c21ff9 ("gpu: host1x: Program the channel stream ID")
Signed-off-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Thierry Reding <treding@nvidia.com>

drm/i915/gvt: Roundup fb->height into tile's height at calucation fb->size

When the fb is tiled and fb->height isn't a multiple of the tile's height,
the formula fb->size = fb->stride * fb->height yields a smaller size
than the actual size, because the memory height of a tiled fb must be a
multiple of the tile's height.
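
A worked example of the rounding, as a sketch with hypothetical helper
names. The tile height of 32 rows used in the note after the code is
only an illustrative value.

```c
#include <stdint.h>

/* Round value up to the next multiple of 'multiple' (multiple > 0). */
static uint32_t demo_roundup(uint32_t value, uint32_t multiple)
{
	return ((value + multiple - 1) / multiple) * multiple;
}

/* Size of a tiled fb: the height is padded to whole tiles first. */
static uint32_t demo_tiled_fb_size(uint32_t stride, uint32_t height,
				   uint32_t tile_height)
{
	return stride * demo_roundup(height, tile_height);
}
```

With a stride of 4096 bytes, a height of 1080 and a tile height of 32,
the height is padded to 1088 and the size is 4456448 bytes, whereas
stride * height alone would give 4423680.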

Fixes: 7f1a93b1f1d1 ("drm/i915/gvt: Correct the calculation of plane size")
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

clk: x86: Add system specific quirk to mark clocks as critical

Since commit 648e921888ad ("clk: x86: Stop marking clocks as
CLK_IS_CRITICAL"), the pmc_plt_clocks of the Bay Trail SoC are
unconditionally gated off. Unfortunately this will break systems where these
clocks are used for external purposes beyond the kernel's knowledge. Fix it
by implementing a system specific quirk to mark the necessary pmc_plt_clks as
critical.

Fixes: 648e921888ad ("clk: x86: Stop marking clocks as CLK_IS_CRITICAL")
Signed-off-by: David Müller <dave.mueller@gmx.ch>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>

block: do not leak memory in bio_copy_user_iov()

When bio_add_pc_page() fails in bio_copy_user_iov() we should free
the page we just allocated otherwise we are leaking it.
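
The error-path rule can be shown with ordinary heap allocations. All
names here are hypothetical, malloc()/free() stand in for page
allocation, and the add_ok flag models whether bio_add_pc_page()
accepted the page.

```c
#include <stdbool.h>
#include <stdlib.h>

static int demo_live_pages;	/* pages allocated and not yet freed */

static void *demo_alloc_page(void)
{
	void *page = malloc(4096);

	if (page)
		demo_live_pages++;
	return page;
}

static void demo_free_page(void *page)
{
	free(page);
	demo_live_pages--;
}

/* Allocate one page and hand it over; free it again if the add fails. */
static bool demo_add_one_page(bool add_ok, void **slot)
{
	void *page = demo_alloc_page();

	if (!page)
		return false;
	if (!add_ok) {
		demo_free_page(page);	/* without this the page leaks */
		return false;
	}
	*slot = page;	/* ownership moves to the caller's slot */
	return true;
}
```

After a failed add the live-page counter is back at its previous value,
which is the property the fix restores.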

Cc: linux-block@vger.kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: stable@vger.kernel.org
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

PCI: pciehp: Ignore Link State Changes after powering off a slot

During a safe hot remove, the OS powers off the slot, which may cause a
Data Link Layer State Changed event. The slot has already been set to
OFF_STATE, so that event results in re-enabling the device, making it
impossible to safely remove it.

Clear out the Presence Detect Changed and Data Link Layer State Changed
events when the disabled slot has settled down.

It is still possible to re-enable the device if it remains in the slot
after pressing the Attention Button by pressing it again.

Fixes the problem that Micah reported below: an NVMe drive power button may
not actually turn off the drive.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=203237
Reported-by: Micah Parrish <micah.parrish@hpe.com>
Tested-by: Micah Parrish <micah.parrish@hpe.com>
Signed-off-by: Sergey Miroshnichenko <s.miroshnichenko@yadro.com>
[bhelgaas: changelog, add bugzilla URL]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v4.19+

sparc64/pci_sun4v: fix ATU checks for large DMA masks

Now that we allow drivers to always set larger than required DMA masks,
we need to be a little more careful in the sun4v PCI iommu driver when
choosing whether to use the ATU support: a larger DMA mask can be set
even when the platform does not support ATU, so we always have to check
if it is available before using it. Add a little helper for that and
use it in all the places where we make ATU usage decisions based on the
DMA mask.
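
Such a helper might look roughly like the sketch below. The struct, the
field name and the 32-bit threshold constant are assumptions made for
this example and do not claim to match the driver's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_DMA_BIT_MASK_32 0xffffffffULL

struct demo_iommu {
	bool has_atu;	/* whether the platform provides ATU at all */
};

/* Use the ATU only if it exists and the mask really exceeds 32 bits. */
static bool demo_use_atu(const struct demo_iommu *iommu, uint64_t mask)
{
	return iommu->has_atu && mask > DEMO_DMA_BIT_MASK_32;
}
```

Checking the mask alone is what goes wrong on platforms without ATU:
a 64-bit mask would select a facility that is not there.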

Fixes: 24132a419c68 ("sparc64/pci_sun4v: allow large DMA masks")
Reported-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Meelis Roos <mroos@linux.ee>
Acked-by: David S. Miller <davem@davemloft.net>

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
"Several driver bug fixes posted in the last several weeks

   - Several bug fixes for the hfi1 driver 'TID RDMA' functionality
     merged into 5.1. Since TID RDMA is on by default these all seem to
     be regressions.

   - Wrong software permission checks on memory in mlx5

   - Memory leak in vmw_pvrdma during driver remove

   - Several bug fixes for hns driver features merged into 5.1"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/hfi1: Do not flush send queue in the TID RDMA second leg
  RDMA/hns: Bugfix for SCC hem free
  RDMA/hns: Fix bug that caused srq creation to fail
  RDMA/vmw_pvrdma: Fix memory leak on pvrdma_pci_remove
  IB/mlx5: Reset access mask when looping inside page fault handler
  IB/hfi1: Fix the allocation of RSM table
  IB/hfi1: Eliminate opcode tests on mr deref
  IB/hfi1: Clear the IOWAIT pending bits when QP is put into error state
  IB/hfi1: Failed to drain send queue when QP is put into error state

lightnvm: pblk: fix crash in pblk_end_partial_read due to multipage bvecs

The introduction of multipage bio vectors broke pblk's partial read
logic due to it not being prepared for multipage bio vectors.

Use bio vector iterators instead of direct bio vector indexing.

Fixes: 07173c3ec276 ("block: enable multipage bvecs")
Reported-by: Klaus Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Hans Holmberg <hans.holmberg@cnexlabs.com>
Updated description.
Signed-off-by: Matias Bjørling <mb@lightnvm.io>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

IB/hfi1: Do not flush send queue in the TID RDMA second leg

When a QP is put into error state, the send queue will be flushed.
This mechanism is implemented in both the first and the second leg
of the send engine. Since the second leg is only responsible for
data transactions in the KDETH space for the TID RDMA WRITE request,
it should not perform the flushing of the send queue.

This patch removes the flushing function of the second leg, but
still keeps the bailing out of the QP if it is put into error state.

Fixes: 70dcb2e3dc6a ("IB/hfi1: Add the TID second leg send packet builder")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
"Several fixes, add more reviewers to the list"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio: Honour 'may_reduce_num' in vring_create_virtqueue
  MAiNTAINERS: add Paolo, Stefan for virtio blk/scsi
  virtio_pci: fix a NULL pointer reference in vp_del_vqs

ASoC: wcd9335: Fix missing regmap requirement

wcd9335.c: undefined reference to 'devm_regmap_add_irq_chip'

Signed-off-by: Marc Gonzalez <marc.w.gonzalez@free.fr>
Signed-off-by: Mark Brown <broonie@kernel.org>

drm/i915/dp: revert back to max link rate and lane count on eDP

Commit 7769db588384 ("drm/i915/dp: optimize eDP 1.4+ link config fast
and narrow") started to optimize the eDP 1.4+ link config, both per spec
and as preparation for display stream compression support.

Sadly, we again face panels that flat out fail with parameters they
claim to support. Revert, and go back to the drawing board.

v2: Actually revert to max params instead of just wide-and-slow.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109959
Fixes: 7769db588384 ("drm/i915/dp: optimize eDP 1.4+ link config fast and narrow")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matt Atwood <matthew.s.atwood@intel.com>
Cc: "Lee, Shawn C" <shawn.c.lee@intel.com>
Cc: Dave Airlie <airlied@gmail.com>
Cc: intel-gfx@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v5.0+
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Tested-by: Albert Astals Cid <aacid@kde.org> # v5.0 backport
Tested-by: Emanuele Panigati <ilpanich@gmail.com> # v5.0 backport
Tested-by: Matteo Iervasi <matteoiervasi@gmail.com> # v5.0 backport
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190405075220.9815-1-jani.nikula@intel.com
(cherry picked from commit f11cb1c19ad0563b3c1ea5eb16a6bac0e401f428)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915/icl: Fix port disable sequence for mipi-dsi

Re-enable clock gating of DDI clocks.

v2: Fix the default ddi clk state for mipi-dsi (Imre)

Fixes: 1026bea00381 ("drm/i915/icl: Ungate DSI clocks")
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1553513202-13863-2-git-send-email-vandita.kulkarni@intel.com
(cherry picked from commit 942d1cf48eae3fcd7e973cfb708d5c4860f0c713)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

drm/i915/icl: Ungate ddi clocks before IO enable

IO enable sequencing needs ddi clocks enabled.
These clocks will be gated at a later point in
the enable sequence.

v2: Fix the commit header (Uma)
v3: Remove the redundant read (Ville)

Fixes: 949fc52af19e ("drm/i915/icl: add pll mapping for DSI")
Signed-off-by: Vandita Kulkarni <vandita.kulkarni@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1553513202-13863-1-git-send-email-vandita.kulkarni@intel.com
(cherry picked from commit c5b81a325263a891d5811aabe938c87e03db4c37)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

nvme: cancel request synchronously

nvme_cancel_request() is used in the error handler, where it is always
reliable to cancel requests synchronously. Doing so avoids a possible race
in which a request is completed after the real hw queue is destroyed.

One such issue was reported by our customer on NVMe RDMA, in which a freed
ib queue pair may be used in nvme_rdma_complete_rq().

Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: James Smart <james.smart@broadcom.com>
Cc: linux-nvme@lists.infradead.org
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

blk-mq: introduce blk_mq_complete_request_sync()

NVMe's error handler follows the typical steps for tearing down
hardware in order to recover the controller:

1) stop blk_mq hw queues
2) stop the real hw queues
3) cancel in-flight requests via
	blk_mq_tagset_busy_iter(tags, cancel_request, ...)
   cancel_request():
	mark the request as abort
	blk_mq_complete_request(req);
4) destroy real hw queues

However, there may be a race between #3 and #4, because
blk_mq_complete_request() may run q->mq_ops->complete(rq) remotely and
asynchronously, so ->complete(rq) may run after #4.

This patch introduces blk_mq_complete_request_sync() to fix the
above race.
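
The ordering problem between #3 and #4 can be illustrated with a small
user-space model. This is only a sketch, not kernel code: every toy_* name
is hypothetical, the "deferred" list merely stands in for a completion
handed off to another CPU, and the real blk-mq implementation differs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_hw_queue {
	bool alive;             /* false once step 4 destroyed the queue */
	int use_after_destroy;  /* completions that ran on a dead queue */
};

struct toy_request {
	struct toy_hw_queue *hwq;
	bool aborted;
	bool completed;
};

#define TOY_MAX_DEFERRED 8

/* Stand-in for completions handed off to run later on another CPU. */
struct toy_deferred {
	struct toy_request *pending[TOY_MAX_DEFERRED];
	size_t count;
};

/* Models ->complete(rq): it touches the real hardware queue. */
static void toy_complete(struct toy_request *rq)
{
	if (!rq->hwq->alive)
		rq->hwq->use_after_destroy++;
	rq->completed = true;
}

/* Asynchronous path: the completion is only queued, not run. */
static void toy_complete_request_async(struct toy_deferred *d,
				       struct toy_request *rq)
{
	assert(d->count < TOY_MAX_DEFERRED);
	d->pending[d->count++] = rq;
}

/* Synchronous path: the completion has run by the time we return. */
static void toy_complete_request_sync(struct toy_request *rq)
{
	toy_complete(rq);
}

/* Runs whatever the asynchronous path left behind. */
static void toy_run_deferred(struct toy_deferred *d)
{
	for (size_t i = 0; i < d->count; i++)
		toy_complete(d->pending[i]);
	d->count = 0;
}

/* Steps 3 and 4; returns how many completions ran after the destroy. */
int toy_teardown(bool sync)
{
	struct toy_hw_queue hwq = { .alive = true, .use_after_destroy = 0 };
	struct toy_request rq = { .hwq = &hwq };
	struct toy_deferred d = { .count = 0 };

	rq.aborted = true;              /* step 3: mark the request aborted */
	if (sync)
		toy_complete_request_sync(&rq);
	else
		toy_complete_request_async(&d, &rq);

	hwq.alive = false;              /* step 4: destroy the real hw queue */
	toy_run_deferred(&d);           /* late completion, if one is pending */
	return hwq.use_after_destroy;
}
```

In this model toy_teardown(false) reports one completion on a destroyed
queue, while toy_teardown(true) reports none, because the completion has
already run when step 4 begins.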

Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: James Smart <james.smart@broadcom.com>
Cc: linux-nvme@lists.infradead.org
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

scsi: virtio_scsi: limit number of hw queues by nr_cpu_ids

When tag_set->nr_maps is 1, the block layer limits the number of hw queues
to nr_cpu_ids. Since virtio-scsi has tag_set->nr_maps == 1, it can use at
most nr_cpu_ids hw queues, no matter how many it asks for.

In addition, specifically for the PCI case, when the 'num_queues' specified
by qemu is greater than maxcpus, virtio-scsi cannot allocate more than
maxcpus vectors to give each queue its own vector. As a result, it falls
back to MSI-X with one vector for config and one shared by all queues.

For the above reasons, this patch limits the number of hw queues used
by virtio-scsi to nr_cpu_ids.
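
The limit amounts to taking the smaller of the device-advertised queue
count and the CPU count. A minimal user-space sketch, assuming a
hypothetical helper name (toy_limit_hw_queues) and plain unsigned
parameters in place of the driver's actual variables:

```c
#include <assert.h>

/*
 * Hypothetical helper, not the driver's code: cap the queue count the
 * device advertises at the number of CPU ids the block layer can map.
 */
unsigned int toy_limit_hw_queues(unsigned int requested,
				 unsigned int nr_cpu_ids)
{
	assert(nr_cpu_ids > 0);	/* a running system has at least one CPU */
	return requested < nr_cpu_ids ? requested : nr_cpu_ids;
}
```

For example, a device advertising 8 queues on a 4-CPU guest would end up
with 4 hw queues, while a device advertising 2 queues keeps both.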

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

virtio-blk: limit number of hw queues by nr_cpu_ids

When tag_set->nr_maps is 1, the block layer limits the number of hw queues
to nr_cpu_ids. Since virtio-blk has tag_set->nr_maps == 1, it can use at
most nr_cpu_ids hw queues, no matter how many it asks for.

In addition, specifically for the PCI case, when the 'num-queues' specified
by qemu is greater than maxcpus, virtio-blk cannot allocate more than
maxcpus vectors to give each queue its own vector. As a result, it falls
back to MSI-X with one vector for config and one shared by all queues.

For the above reasons, this patch limits the number of hw queues used
by virtio-blk to nr_cpu_ids.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>