mmc: sdhci-pci-o2micro: Add quirk for O2 Micro dev 0x8620 rev 0x01
This device reports SDHCI_CLOCK_INT_STABLE even though it's not
ready to take SDHCI_CLOCK_CARD_EN. The symptom is that reading
SDHCI_CLOCK_CONTROL after enabling the clock shows absence of the
bit from the register (e.g. expecting 0x0000fa07 = 0x0000fa03 |
SDHCI_CLOCK_CARD_EN but only observed the first operand).
mmc: omap_hsmmc: Delete platform data GPIO CD and WP
The OMAP HSMMC driver has some elaborate and hairy handling for
passing GPIO card detect and write protect lines from a boardfile
into the driver: the machine defines a struct omap2_hsmmc_info
that is copied into struct omap_hsmmc_platform_data by
omap_hsmmc_pdata_init() in arch/arm/mach-omap2/hsmmc.c.
However the .gpio_cd and .gpio_wp fields are not copied from
omap2_hsmmc_info to omap_hsmmc_platform_data by
omap_hsmmc_pdata_init() so they remain unused. The only platform
defining omap2_hsmmc_info also define both to -1, unused.
It turn out there are no boardfiles passing any valid GPIO
lines into the OMAP HSMMC driver at all. And since we are not
going to add any more OMAP2 boardfiles, we can delete this
card detect and write protect handling altogether.
This seems to also fix a bug: the card detect callback
mmc_gpio_get_cd() in the slot GPIO core needs to be called
by drivers utilizing slot GPIO. It appears the the boardfile
quirks were not doing this right, so this would only get
called for boardfiles, i.e. since no boardfile was using it,
never.
Just assign mmc_gpio_get_cd() unconditionally to omap_hsmmc_ops
.get_cd() so card detects from the device tree works.
AFAICT card detect with GPIO lines assigned from
mmc_of_parse() are not working at the moment, but that is
no regression since it probably never worked.
Cc: Tony Lindgren <tony@atomide.com> Cc: linux-omap@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Cover detection appears to be a feature protecting the SD
card on mobile phones with a slide-cover, such as some Nokia
phones. The idea seems to be to not allow access to the
SD card when the cover is open.
It is only usable with platform data from board files, but
no board file in the kernel is using it, yet it takes up
a sizeable chunk of code in the OMAP HSMMC driver.
Since we do not add new board files for the OMAPs any target
that need this should anyway reimplement it properly using
the device tree, so delete this legacy code.
The driver is marked as orphan in MAINTAINERS by the way.
Cc: Tony Lindgren <tony@atomide.com> Cc: linux-omap@vger.kernel.org Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Tony Lindgren <tony@atomide.com> Acked-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
mmc: core: Allow building PWRSEQ_SD8787 with LIBERTAS_SDIO
The sd8686 "libertas" SDIO adapter's power is controlled with WLAN_RST
and WLAN_PD pins -- pretty much the same way as sd8787. Allow building
the power sequencing driver along with the libertas Wi-Fi driver.
This driver is complicating things for no reason: the "cd"
GPIO can easily be retrieved from the device tree if present
using just mmc_gpiod_request_cd(), which will fetch the
descriptor from the device tree using the standard binding
just fine.
If the retrieveal is successful, we also request the IRQ.
As a result the private subdriver data can be removed
entirely.
Cc: Weijun Yang <york.yang@csr.com> Cc: Barry Song <baohua@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
The platform data for the PXAv3 driver allows passing a card
detect GPIO, but this code is not used in the kernel.
In order to not encourage the use of the old global GPIO
numberspace we need to remove this.
Card detect (and write protect) GPIO can easily be added into
the driver using machine descriptor tables instead, and the
descriptor-based (gpiod) variants of the slot GPIO APIs.
This driver is complicating things for no reason: the "cd"
GPIO can easily be retrieved from the device tree if present
using just mmc_gpiod_request_cd(), which will fetch the
descriptor from the device tree using the standard binding
just fine.
All the machines using the MMCI are passing GPIOs for the
card detect and write protect using the device tree or
descriptor table (one single case, Integrator/AP IM-PD1).
Drop support for passing global GPIO numbers through
platform data, noone is using it.
mmc: renesas_sdhi_internal_dmac: set scatter/gather max segment size
Fix warning when running with CONFIG_DMA_API_DEBUG_SG=y by allocating a
device_dma_parameters structure and filling in the max segment size. The
size used is the result of a discussion with Renesas hardware engineers
and unfortunately not found in the datasheet.
renesas_sdhi_internal_dmac ee140000.sd: DMA-API: mapping sg segment
longer than device claims to support [len=126976] [max=65536]
Reported-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
[wsa: simplified some logic after validating intended dma_parms life cycle
and added comment] Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
mmc: sunxi: Use new timing mode for A64 eMMC controller
The eMMC controller is also a new timing mode controller, but it doesn't
have the timing mode switch. It does however have signal delay and
calibration controls, typical of Allwinner MMC controllers that support
the new timing mode.
Enable the new timing mode setting for the A64 eMMC controller. This
also enables MMC HS-DDR modes, which gives higher throughput for eMMC
chips that support it, and can deliver such throughput.
mmc: sunxi: Clarify new timing mode usage and implementation
Newer sunxi mmc controller variants support what they call the "new
timing mode". Support for this was implemented in two ways, according
to the hardware that was seen at the time.
The first type retained the old timing mode, and both the clock and mmc
controllers had switches to select which mode was used. Both switches
had to be set to the same setting. This variant was denoted with the
.has_timings_switch field in the sunxi_mmc_cfg structure. This hardware
is only seen on the A83T.
The second type did away with the old timing mode. The clock controller
no longer had the mode selection or clock delay setting bits. In some
cases the mmc controller retained its mode selection bit, but this
always needed to be set to the new mode, or instabilities would occur.
In a few cases, such as the A64 and H6 eMMC controller, the mode
selection bit is gone, but the controller still behaves like the new
timing mode, requiring the module clock to be double the card clock
in DDR transfer modes. This variant is denoted with the
.needs_new_timings field.
This patch adds more comments explaining the two fields, as well as
the possibly nonexistent mode switch in the mmc controller.
The .has_timings_switch is renamed to .ccu_has_timings_switch to clarify
its meaning.
This patch adds the initial support of Secure Digital Host Controller
Interface compliant controller found in some latest Spreadtrum chipsets.
This patch has been tested on the version of SPRD-R11 controller.
R11 is a variant based on SD v4.0 specification.
With this driver, R11 mmc can be initialized, can be mounted, read and
written.
Chunyan Zhang [Thu, 30 Aug 2018 08:21:43 +0000 (16:21 +0800)]
mmc: sdhci: SDMA may use Auto-CMD23 in v4 mode
When Host Version 4 Enable is set to 1, SDMA uses ADMA System Address
register (05Fh-058h) instead of using register (000h-004h) to indicate
its system address of data location. The register (000h-004h) is
re-assigned to 32-bit Block Count and Auto CMD23 argument, so then SDMA
may use Auto CMD23.
Chunyan Zhang [Thu, 30 Aug 2018 08:21:42 +0000 (16:21 +0800)]
mmc: sdhci: Add Auto CMD Auto Select support
As SD Host Controller Specification v4.10 documents:
Host Controller Version 4.10 defines this "Auto CMD Auto Select" mode.
Selection of Auto CMD depends on setting of CMD23 Enable in the Host
Control 2 register which indicates whether card supports CMD23. If CMD23
Enable =1, Auto CMD23 is used and if CMD23 Enable =0, Auto CMD12 is
used. In case of Version 4.10 or later, use of Auto CMD Auto Select is
recommended rather than use of Auto CMD12 Enable or Auto CMD23
Enable.
Chunyan Zhang [Thu, 30 Aug 2018 08:21:41 +0000 (16:21 +0800)]
mmc: sdhci: Add 32-bit block count support for v4 mode
Host Controller Version 4.10 re-defines SDMA System Address register
as 32-bit Block Count for v4 mode, and SDMA uses ADMA System
Address register (05Fh-058h) instead if v4 mode is enabled. Also
when using 32-bit block count, 16-bit block count register need
to be set to zero.
Since using 32-bit Block Count would cause problems for auto-cmd23,
it can be chosen via host->quirk2.
Chunyan Zhang [Thu, 30 Aug 2018 08:21:40 +0000 (16:21 +0800)]
mmc: sdhci: Add ADMA2 64-bit addressing support for V4 mode
ADMA2 64-bit addressing support is divided into V3 mode and V4 mode.
So there are two kinds of descriptors for ADMA2 64-bit addressing
i.e. 96-bit Descriptor for V3 mode, and 128-bit Descriptor for V4
mode. 128-bit Descriptor is aligned to 8-byte.
For V4 mode, ADMA2 64-bit addressing is enabled via Host Control 2
register.
Signed-off-by: Chunyan Zhang <zhang.chunyan@linaro.org> Acked-by: Adrian Hunter <adrian.hunter@intel.com>
[Ulf: Fixed conflict while applying] Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Chunyan Zhang [Thu, 30 Aug 2018 08:21:39 +0000 (16:21 +0800)]
mmc: sdhci: Change SDMA address register for v4 mode
According to the SD host controller specification version 4.10, when
Host Version 4 is enabled, SDMA uses ADMA System Address register
(05Fh-058h) instead of using SDMA System Address register to
support both 32-bit and 64-bit addressing.
Chunyan Zhang [Thu, 30 Aug 2018 08:21:38 +0000 (16:21 +0800)]
mmc: sdhci: Add sd host v4 mode
For SD host controller version 4.00 or later ones, there're two
modes of implementation - Version 3.00 compatible mode or
Version 4 mode. This patch introduced an interface to enable
v4 mode.
Aapo Vienamo [Fri, 10 Aug 2018 18:14:01 +0000 (21:14 +0300)]
mmc: tegra: Implement HS400 delay line calibration
Implement HS400 specific delay line calibration procedure. This is a
Tegra specific procedure and has to be performed regardless whether
enhanced strobe or HS400 tuning is used.
Aapo Vienamo [Thu, 30 Aug 2018 15:06:26 +0000 (18:06 +0300)]
mmc: tegra: Disable card clock during tuning cmd on Tegra210
Implement tegra210_sdhci_writew() to disable card clock and issue a
reset when the tuning command is sent. This is done to prevent an
intermittent hang with around 10 % failure rate during tuning.
Add tegra186_sdhci_ops because this workaround is specific to Tegra210.
Aapo Vienamo [Thu, 30 Aug 2018 15:06:25 +0000 (18:06 +0300)]
mmc: tegra: Remove tegra_sdhci_writew() from tegra210_sdhci_ops
tegra_sdhci_writew() defers the write to SDHCI_TRANSFER_MODE until
SDHCI_COMMAND is written. This is not necessary on Tegra210 and Tegra186
and it breaks read-modify-write operations on SDHCI_TRANSFER_MODE
because writes to SDHCI_TRANSFER_MODE aren't visible until SDHCI_COMMAND
has been written to. This results in tuning failures on Tegra210.
Aapo Vienamo [Thu, 30 Aug 2018 15:06:23 +0000 (18:06 +0300)]
mmc: tegra: Configure default trim value on reset
Program the outbound sampling trim value in tegra_sdhci_reset(). Unlike
the outbound tap value this does not depend on the signaling mode and
needs to be only programmed once.
Aapo Vienamo [Thu, 30 Aug 2018 15:06:20 +0000 (18:06 +0300)]
mmc: tegra: Add a workaround for tap value change glitch
Add quirk to disable the card clock during configuration of the tap
value in tegra_sdhci_set_tap() and issue sdhci_reset() after value
change. This is a workaround to avoid propagation of a potential
glitch caused by setting the tap value.
Aapo Vienamo [Thu, 30 Aug 2018 15:06:15 +0000 (18:06 +0300)]
mmc: tegra: Power on the calibration pad
Automatic pad drive strength calibration is performed on a separate pad
identical to the ones used for driving the actual bus. Power on the
calibration pad during the calibration procedure and power it off
afterwards to save power.
Signed-off-by: Aapo Vienamo <avienamo@nvidia.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Aapo Vienamo [Thu, 30 Aug 2018 15:06:12 +0000 (18:06 +0300)]
mmc: tegra: Reconfigure pad voltages during voltage switching
Parse the pinctrl state and nvidia,only-1-8-v properties from the device
tree. Validate the pinctrl and regulator configuration before unmasking
UHS modes. Implement pad voltage state reconfiguration in the mmc
start_signal_voltage_switch() callback. Add NVQUIRK_NEEDS_PAD_CONTROL
and add set it for Tegra210 and Tegra186.
The pad configuration is done in the mmc callback because the order of
pad reconfiguration and sdhci voltage switch depend on the voltage to
which the transition occurs.
Jisheng Zhang [Tue, 28 Aug 2018 09:47:23 +0000 (17:47 +0800)]
mmc: sdhci: introduce adma_write_desc() hook to struct sdhci_ops
Add this hook so that it can be overridden with driver specific
implementations. We also let the original sdhci_adma_write_desc()
accept &desc so that the function can set its new value. Then export
the function so that it could be reused by driver's specific
implementations.
Wang Dongsheng [Thu, 16 Aug 2018 04:48:43 +0000 (12:48 +0800)]
sdhci: acpi: add qcom sdhci host reset quirk fix
After host requests RESET_FOR_ALL action, the hardware output an
interrupt for OS and waiting for the OS to approve.
Before writing this fix, ACPI GED has handled the interrupt. But
the ACPI GED belongs to a slow process, and sometimes the handling
process time is more than 100ms(Mutex wait more than 100ms). So
drop the GED solution and add this quirk fix.
Signed-off-by: Wang Dongsheng <dongsheng.wang@hxt-semitech.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
In tuning mode of operation, when TBCTL[TB_EN] is set, eSDHC may report
one of the following errors :
1)Tuning error while running tuning operation where SYSCTL2[SAMPCLKSEL]
will not get set even when SYSCTL2[EXTN] is reset. OR
2)Data transaction error (e.g. IRQSTAT[DCE], IRQSTAT[DEBE]) during data
transaction errors.
This issue occurs when the data window sampled within eSDHC is in full
cycle. So, in that case, eSDHC is not able to find out the start and
end points of the data window and sets the sampling pointer at default
location (which is middle of the internal SD clock). If this sampling
point coincides with the data eye boundary, then it can result in the
above mentioned errors. Impact: Tuning mode of operation for SDR50,
SDR104 or HS200 speed modes may not work properly
Workaround: In case eSDHC reports tuning error or data errors in tuning
mode of operation, by add the erratum A008171 support to fix the issue.
Here is another TMIO MMC variant found in Socionext UniPhier SoCs.
As commit b6147490e6aa ("mmc: tmio: split core functionality, DMA and
MFD glue") said, these MMC controllers use the IP from Panasonic.
However, the MMC controller in the TMIO (Toshiba Mobile IO) MFD chip
was the first upstreamed user of this IP. The common driver code
for this IP is now called 'tmio-mmc-core' in Linux although it is a
historical misnomer.
Anyway, this driver select's MMC_TMIO_CORE to borrow the common code
from tmio-mmc-core.c
Older UniPhier SoCs (LD4, Pro4, sLD8) support the external DMA engine
like renesas_sdhi_sys_dmac.c. The difference is UniPhier SoCs use a
single DMA channel whereas Renesas chips request separate channels for
RX and TX.
Newer UniPhier SoCs (Pro5 and later) support the internal DMA engine
like renesas_sdhi_internal_dmac.c The register map is almost the same,
so I guess Renesas and Socionext use the same internal DMA hardware.
The main difference is, the register offsets are doubled for Renesas.
This comes from the difference of host->bus_shift; 2 for Renesas SoCs,
and 1 for UniPhier SoCs. Also, the datasheet for UniPhier SoCs defines
DM_DTRAN_ADDR and DM_DTRAN_ADDREX as two separate registers.
It could be possible to factor out the DMA common code by introducing
some hooks to cope with platform quirks, but this patch does not touch
that for now.
Sergei Shtylyov [Sat, 18 Aug 2018 18:08:26 +0000 (21:08 +0300)]
mmc: renesas_sdhi_internal_dmac: add R8A77970 to whitelist
I've successfully tested eMMC on the V3H Starter Kit board and since the
R8A77970 SoC has a single SDHI core, it can't be a subject to the known RX
DMA errata.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Sergei Shtylyov [Fri, 17 Aug 2018 20:19:02 +0000 (23:19 +0300)]
mmc: renesas_sdhi_internal_dmac: Fix a few typos
Remove the stray underscore in the DM_CM_DTRAN_MODE.BUS_WIDTH register
field name and fix the typo in the comment of the #define
DTRAN_MODE_CH_NUM_CH1.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Paul Cercueil [Tue, 21 Aug 2018 13:03:20 +0000 (15:03 +0200)]
mmc: jz4740: Drop dependency on MACH_JZ4740/80
Depending on MACH_JZ4740 | MACH_JZ4780 prevent us from creating a generic
kernel that works on more than one MIPS board. Instead, we just depend on
MIPS being set.
Signed-off-by: Paul Cercueil <paul@crapouillou.net> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
We need r8a774a1 to be whitelisted for SDHI to work on the RZ/G2M,
but we don't care about the revision of the SoC, so just whitelist
the generic part number.
Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com> Reviewed-by: Biju Das <biju.das@bp.renesas.com> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Michal Simek [Mon, 6 Aug 2018 08:43:10 +0000 (10:43 +0200)]
mmc: sdhci-of-arasan: Do now show error message in case of deffered probe
When mmc-pwrseq property is passed mmc_pwrseq_alloc() can return
-EPROBE_DEFER because driver for power sequence provider is not probed
yet. Do not show error message when this situation happens.
Signed-off-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
All of these have been in linux-next with no reported issues."
* tag 'char-misc-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
thunderbolt: Initialize after IOMMUs
thunderbolt: Do not handle ICM events after domain is stopped
firmware: Always initialize the fw_priv list object
docs: fpga: document fpga manager flags
fpga: bridge: fix obvious function documentation error
tools: hv: fcopy: set 'error' in case an unknown operation was requested
fpga: do not access region struct after fpga_region_unregister
Drivers: hv: vmbus: Use get/put_cpu() in vmbus_connect()
Merge tag 'usb-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
I wrote:
"USB fixes for 4.19-rc7
Here are some small USB fixes for 4.19-rc7
These include:
- the usual xhci bugfixes for reported issues
- some new serial driver device ids
- bugfix for the option serial driver for some devices
- bugfix for the cdc_acm driver that has been there for a long time.
All of these have been in linux-next for a while with no reported
issues."
* tag 'usb-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: xhci-mtk: resume USB3 roothub first
xhci: Add missing CAS workaround for Intel Sunrise Point xHCI
usb: cdc_acm: Do not leak URB buffers
USB: serial: simple: add Motorola Tetra MTP6550 id
USB: serial: option: add two-endpoints device-id flag
USB: serial: option: improve Quectel EP06 detection
Merge tag 'powerpc-4.19-4' of https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Michael writes:
"powerpc fixes for 4.19 #4
Four regression fixes.
A fix for a change to lib/xz which broke our zImage loader when
building with XZ compression. OK'ed by Herbert who merged the
original patch.
The recent fix we did to avoid patching __init text broke some 32-bit
machines, fix that.
Our show_user_instructions() could be tricked into printing kernel
memory, add a check to avoid that.
And a fix for a change to our NUMA initialisation logic, which causes
crashes in some kdump configurations.
Thanks to:
Christophe Leroy, Hari Bathini, Jann Horn, Joel Stanley, Meelis
Roos, Murilo Opsfelder Araujo, Srikar Dronamraju."
* tag 'powerpc-4.19-4' of https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/numa: Skip onlining a offline node in kdump path
powerpc: Don't print kernel instructions in show_user_instructions()
powerpc/lib: fix book3s/32 boot failure due to code patching
lib/xz: Put CRC32_POLY_LE in xz_private.h
1) Fix truncation of 32-bit right shift in bpf, from Jann Horn.
2) Fix memory leak in wireless wext compat, from Stefan Seyfried.
3) Use after free in cfg80211's reg_process_hint(), from Yu Zhao.
4) Need to cancel pending work when unbinding in smsc75xx otherwise
we oops, also from Yu Zhao.
5) Don't allow enslaving a team device to itself, from Ido Schimmel.
6) Fix backwards compat with older userspace for rtnetlink FDB dumps.
From Mauricio Faria.
7) Add validation of tc policy netlink attributes, from David Ahern.
8) Fix RCU locking in rawv6_send_hdrinc(), from Wei Wang."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
net: mvpp2: Extract the correct ethtype from the skb for tx csum offload
ipv6: take rcu lock in rawv6_send_hdrinc()
net: sched: Add policy validation for tc attributes
rtnetlink: fix rtnl_fdb_dump() for ndmsg header
yam: fix a missing-check bug
net: bpfilter: Fix type cast and pointer warnings
net: cxgb3_main: fix a missing-check bug
bpf: 32-bit RSH verification must truncate input before the ALU op
net: phy: phylink: fix SFP interface autodetection
be2net: don't flip hw_features when VXLANs are added/deleted
net/packet: fix packet drop as of virtio gso
net: dsa: b53: Keep CPU port as tagged in all VLANs
openvswitch: load NAT helper
bnxt_en: get the reduced max_irqs by the ones used by RDMA
bnxt_en: free hwrm resources, if driver probe fails.
bnxt_en: Fix enables field in HWRM_QUEUE_COS2BW_CFG request
bnxt_en: Fix VNIC reservations on the PF.
team: Forbid enslaving team device to itself
net/usb: cancel pending work when unbinding smsc75xx
mlxsw: spectrum: Delete RIF when VLAN device is removed
...
* akpm:
mm: madvise(MADV_DODUMP): allow hugetlbfs pages
ocfs2: fix locking for res->tracking and dlm->tracking_list
mm/vmscan.c: fix int overflow in callers of do_shrink_slab()
mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly
mm/vmstat.c: fix outdated vmstat_text
proc: restrict kernel stack dumps to root
mm/hugetlb: add mmap() encodings for 32MB and 512MB page sizes
mm/migrate.c: split only transparent huge pages when allocation fails
ipc/shm.c: use ERR_CAST() for shm_lock() error return
mm/gup_benchmark: fix unsigned comparison to zero in __gup_benchmark_ioctl
mm, thp: fix mlocking THP page with migration enabled
ocfs2: fix crash in ocfs2_duplicate_clusters_by_page()
hugetlb: take PMD sharing into account when flushing tlb/caches
mm: migration: fix migration of huge PMD shared pages
hugetlbfs pages have VM_DONTEXPAND in the VmFlags driver pages based on
author testing with analysis from Florian Weimer[1].
The inclusion of VM_DONTEXPAND into the VM_SPECIAL defination was a
consequence of the large useage of VM_DONTEXPAND in device drivers.
A consequence of [2] is that VM_DONTEXPAND marked pages are unable to be
marked DODUMP.
A user could quite legitimately madvise(MADV_DONTDUMP) their hugetlbfs
memory for a while and later request that madvise(MADV_DODUMP) on the same
memory. We correct this omission by allowing madvice(MADV_DODUMP) on
hugetlbfs pages.
[1] https://stackoverflow.com/questions/52548260/madvisedodump-on-the-same-ptr-size-as-a-successful-madvisedontdump-fails-wit
[2] commit 0103bd16fb90 ("mm: prepare VM_DONTDUMP for using in drivers")
Link: http://lkml.kernel.org/r/20180930054629.29150-1-daniel@linux.ibm.com Link: https://lists.launchpad.net/maria-discuss/msg05245.html Fixes: 0103bd16fb90 ("mm: prepare VM_DONTDUMP for using in drivers") Reported-by: Kenneth Penza <kpenza@gmail.com> Signed-off-by: Daniel Black <daniel@linux.ibm.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Konstantin Khlebnikov <khlebnikov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ashish Samant [Fri, 5 Oct 2018 22:52:15 +0000 (15:52 -0700)]
ocfs2: fix locking for res->tracking and dlm->tracking_list
In dlm_init_lockres() we access and modify res->tracking and
dlm->tracking_list without holding dlm->track_lock. This can cause list
corruptions and can end up in kernel panic.
Fix this by locking res->tracking and dlm->tracking_list with
dlm->track_lock instead of dlm->spinlock.
Link: http://lkml.kernel.org/r/1529951192-4686-1-git-send-email-ashish.samant@oracle.com Signed-off-by: Ashish Samant <ashish.samant@oracle.com> Reviewed-by: Changwei Ge <ge.changwei@h3c.com> Acked-by: Joseph Qi <jiangqi903@gmail.com> Acked-by: Jun Piao <piaojun@huawei.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <ge.changwei@h3c.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kirill Tkhai [Fri, 5 Oct 2018 22:52:10 +0000 (15:52 -0700)]
mm/vmscan.c: fix int overflow in callers of do_shrink_slab()
do_shrink_slab() returns unsigned long value, and the placing into int
variable cuts high bytes off. Then we compare ret and 0xfffffffe (since
SHRINK_EMPTY is converted to ret type).
Thus a large number of objects returned by do_shrink_slab() may be
interpreted as SHRINK_EMPTY, if low bytes of their value are equal to
0xfffffffe. Fix that by declaration ret as unsigned long in these
functions.
Jann Horn [Fri, 5 Oct 2018 22:52:07 +0000 (15:52 -0700)]
mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly
5dd0b16cdaff ("mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even
on UP") made the availability of the NR_TLB_REMOTE_FLUSH* counters inside
the kernel unconditional to reduce #ifdef soup, but (either to avoid
showing dummy zero counters to userspace, or because that code was missed)
didn't update the vmstat_array, meaning that all following counters would
be shown with incorrect values.
This only affects kernel builds with
CONFIG_VM_EVENT_COUNTERS=y && CONFIG_DEBUG_TLBFLUSH=y && CONFIG_SMP=n.
Link: http://lkml.kernel.org/r/20181001143138.95119-2-jannh@google.com Fixes: 5dd0b16cdaff ("mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even on UP") Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Roman Gushchin <guro@fb.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Christoph Lameter <clameter@sgi.com> Cc: Kemi Wang <kemi.wang@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jann Horn [Fri, 5 Oct 2018 22:52:03 +0000 (15:52 -0700)]
mm/vmstat.c: fix outdated vmstat_text
7a9cdebdcc17 ("mm: get rid of vmacache_flush_all() entirely") removed the
VMACACHE_FULL_FLUSHES statistics, but didn't remove the corresponding
entry in vmstat_text. This causes an out-of-bounds access in
vmstat_show().
Luckily this only affects kernels with CONFIG_DEBUG_VM_VMACACHE=y, which
is probably very rare.
Link: http://lkml.kernel.org/r/20181001143138.95119-1-jannh@google.com Fixes: 7a9cdebdcc17 ("mm: get rid of vmacache_flush_all() entirely") Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Roman Gushchin <guro@fb.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Christoph Lameter <clameter@sgi.com> Cc: Kemi Wang <kemi.wang@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jann Horn [Fri, 5 Oct 2018 22:51:58 +0000 (15:51 -0700)]
proc: restrict kernel stack dumps to root
Currently, you can use /proc/self/task/*/stack to cause a stack walk on
a task you control while it is running on another CPU. That means that
the stack can change under the stack walker. The stack walker does
have guards against going completely off the rails and into random
kernel memory, but it can interpret random data from your kernel stack
as instruction pointers and stack pointers. This can cause exposure of
kernel stack contents to userspace.
Restrict the ability to inspect kernel stacks of arbitrary tasks to root
in order to prevent a local attacker from exploiting racy stack unwinding
to leak kernel task stack contents. See the added comment for a longer
rationale.
There don't seem to be any users of this userspace API that can't
gracefully bail out if reading from the file fails. Therefore, I believe
that this change is unlikely to break things. In the case that this patch
does end up needing a revert, the next-best solution might be to fake a
single-entry stack based on wchan.
Link: http://lkml.kernel.org/r/20180927153316.200286-1-jannh@google.com Fixes: 2ec220e27f50 ("proc: add /proc/*/stack") Signed-off-by: Jann Horn <jannh@google.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Ken Chen <kenchen@google.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Laura Abbott <labbott@redhat.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mm/migrate.c: split only transparent huge pages when allocation fails
split_huge_page_to_list() fails on HugeTLB pages. I was experimenting
with moving 32MB contig HugeTLB pages on arm64 (with a debug patch
applied) and hit the following stack trace when the kernel crashed.
When unmap_and_move[_huge_page]() fails due to lack of memory, the
splitting should happen only for transparent huge pages not for HugeTLB
pages. PageTransHuge() returns true for both THP and HugeTLB pages.
Hence the conditonal check should test PagesHuge() flag to make sure that
given pages is not a HugeTLB one.
Link: http://lkml.kernel.org/r/1537798495-4996-1-git-send-email-anshuman.khandual@arm.com Fixes: 94723aafb9 ("mm: unclutter THP migration") Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Zi Yan <zi.yan@cs.rutgers.edu> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kees Cook [Fri, 5 Oct 2018 22:51:48 +0000 (15:51 -0700)]
ipc/shm.c: use ERR_CAST() for shm_lock() error return
This uses ERR_CAST() instead of an open-coded cast, as it is casting
across structure pointers, which upsets __randomize_layout:
ipc/shm.c: In function `shm_lock':
ipc/shm.c:209:9: note: randstruct: casting between randomized structure pointer types (ssa): `struct shmid_kernel' and `struct kern_ipc_perm'