The Xen tmem (transcendent memory) driver can be removed, as the
related Xen hypervisor feature never made it past the "experimental"
state and will be removed in future Xen versions (>= 4.13).
The xen-selfballoon driver depends on tmem, so it can be removed, too.
Commit c06a25805333 ("x86/jump_label: Initialize static branching
early") adds jump_label_init() call in setup_arch() to make static
keys initialized early, so we could use the original simpler code
again.
Juergen Gross [Fri, 21 Jun 2019 18:47:03 +0000 (20:47 +0200)]
xen/events: fix binding user event channels to cpus
When binding an interdomain event channel to a vcpu via
IOCTL_EVTCHN_BIND_INTERDOMAIN not only the event channel needs to be
bound, but the affinity of the associated IRQi must be changed, too.
Otherwise the IRQ and the event channel won't be moved to another vcpu
in case the original vcpu they were bound to is going offline.
Cc: <stable@vger.kernel.org> # 4.13 Fixes: d994e60fbddd289b ("xen-evtchn: Bind dyn evtchn:qemu-dm interrupt to next online VCPU") Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
Merge tag 'for-linus-20190715' of git://git.kernel.dk/linux-block
Pull more block updates from Jens Axboe:
"A later pull request with some followup items. I had some vacation
coming up to the merge window, so certain things items were delayed a
bit. This pull request also contains fixes that came in within the
last few days of the merge window, which I didn't want to push right
before sending you a pull request.
This contains:
- NVMe pull request, mostly fixes, but also a few minor items on the
feature side that were timing constrained (Christoph et al)
- Report zones fixes (Damien)
- Removal of dead code (Damien)
- Turn on cgroup psi memstall (Josef)
- block cgroup MAINTAINERS entry (Konstantin)
- Flush init fix (Josef)
- blk-throttle low iops timing fix (Konstantin)
- nbd resize fixes (Mike)
- nbd 0 blocksize crash fix (Xiubo)
- block integrity error leak fix (Wenwen)
- blk-cgroup writeback and priority inheritance fixes (Tejun)"
* tag 'for-linus-20190715' of git://git.kernel.dk/linux-block: (42 commits)
MAINTAINERS: add entry for block io cgroup
null_blk: fixup ->report_zones() for !CONFIG_BLK_DEV_ZONED
block: Limit zone array allocation size
sd_zbc: Fix report zones buffer allocation
block: Kill gfp_t argument of blkdev_report_zones()
block: Allow mapping of vmalloc-ed buffers
block/bio-integrity: fix a memory leak bug
nvme: fix NULL deref for fabrics options
nbd: add netlink reconfigure resize support
nbd: fix crash when the blksize is zero
block: Disable write plugging for zoned block devices
block: Fix elevator name declaration
block: Remove unused definitions
nvme: fix regression upon hot device removal and insertion
blk-throttle: fix zero wait time for iops throttled group
block: Fix potential overflow in blk_report_zones()
blkcg: implement REQ_CGROUP_PUNT
blkcg, writeback: Implement wbc_blkcg_css()
blkcg, writeback: Add wbc->no_cgroup_owner
blkcg, writeback: Rename wbc_account_io() to wbc_account_cgroup_owner()
...
Merge tag 'for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull power supply and reset updates from Sebastian Reichel:
"Core:
- add HWMON compat layer
- new properties:
- input power limit
- input voltage limit
Drivers:
- qcom-pon: add gen2 support
- new driver for storing reboot move in NVMEM
- new driver for Wilco EC charger configuration
- simplify getting the adapter of a client"
* tag 'for-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
power: reset: nvmem-reboot-mode: add CONFIG_OF dependency
power_supply: wilco_ec: Add charging config driver
power: supply: cros: allow to set input voltage and current limit
power: supply: add input power and voltage limit properties
power: supply: fix semicolon.cocci warnings
power: reset: nvmem-reboot-mode: use NVMEM as reboot mode write interface
dt-bindings: power: reset: add document for NVMEM based reboot-mode
reset: qcom-pon: Add support for gen2 pon
dt-bindings: power: reset: qcom: Add qcom,pm8998-pon compatibility line
power: supply: Add HWMON compatibility layer
power: supply: sbs-manager: simplify getting the adapter of a client
power: supply: rt9455_charger: simplify getting the adapter of a client
power: supply: rt5033_battery: simplify getting the adapter of a client
power: supply: max17042_battery: simplify getting the adapter of a client
power: supply: max17040_battery: simplify getting the adapter of a client
power: supply: max14656_charger_detector: simplify getting the adapter of a client
power: supply: bq25890_charger: simplify getting the adapter of a client
power: supply: bq24257_charger: simplify getting the adapter of a client
power: supply: bq24190_charger: simplify getting the adapter of a client
- Convert docs to reST (Changbin Du, Mauro Carvalho Chehab)"
* tag 'pci-v5.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (107 commits)
PCI: Enable NVIDIA HDA controllers
tools: PCI: Fix installation when `make tools/pci_install`
PCI: dwc: pci-dra7xx: Fix compilation when !CONFIG_GPIOLIB
PCI: Fix typos and whitespace errors
PCI: mobiveil: Fix INTx interrupt clearing in mobiveil_pcie_isr()
PCI: mobiveil: Fix infinite-loop in the INTx handling function
PCI: mobiveil: Move PCIe PIO enablement out of inbound window routine
PCI: mobiveil: Add upper 32-bit PCI base address setup in inbound window
PCI: mobiveil: Add upper 32-bit CPU base address setup in outbound window
PCI: mobiveil: Mask out hardcoded bits in inbound/outbound windows setup
PCI: mobiveil: Clear the control fields before updating it
PCI: mobiveil: Add configured inbound windows counter
PCI: mobiveil: Fix the valid check for inbound and outbound windows
PCI: mobiveil: Clean-up program_{ib/ob}_windows()
PCI: mobiveil: Remove an unnecessary return value check
PCI: mobiveil: Fix error return values
PCI: mobiveil: Refactor the MEM/IO outbound window initialization
PCI: mobiveil: Make some register updates more readable
PCI: mobiveil: Reformat the code for readability
dt-bindings: PCI: mobiveil: Change gpio_slave and apb_csr to optional
...
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"A smaller cycle this time. Notably we see another new driver, 'Soft
iWarp', and the deletion of an ancient unused driver for nes.
- Revise and simplify the signature offload RDMA MR APIs
- More progress on hoisting object allocation boiler plate code out
of the drivers
- Driver bug fixes and revisions for hns, hfi1, efa, cxgb4, qib,
i40iw
- Tree wide cleanups: struct_size, put_user_page, xarray, rst doc
conversion
- Removal of obsolete ib_ucm chardev and nes driver
- netlink based discovery of chardevs and autoloading of the modules
providing them
- Move more of the rdamvt/hfi1 uapi to include/uapi/rdma
- New driver 'siw' for software based iWarp running on top of netdev,
much like rxe's software RoCE.
- mlx5 feature to report events in their raw devx format to userspace
- Expose per-object counters through rdma tool
- Adaptive interrupt moderation for RDMA (DIM), sharing the DIM core
from netdev"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (194 commits)
RMDA/siw: Require a 64 bit arch
RDMA/siw: Mark expected switch fall-throughs
RDMA/core: Fix -Wunused-const-variable warnings
rdma/siw: Remove set but not used variable 's'
rdma/siw: Add missing dependencies on LIBCRC32C and DMA_VIRT_OPS
RDMA/siw: Add missing rtnl_lock around access to ifa
rdma/siw: Use proper enumerated type in map_cqe_status
RDMA/siw: Remove unnecessary kthread create/destroy printouts
IB/rdmavt: Fix variable shadowing issue in rvt_create_cq
RDMA/core: Fix race when resolving IP address
RDMA/core: Make rdma_counter.h compile stand alone
IB/core: Work on the caller socket net namespace in nldev_newlink()
RDMA/rxe: Fill in wc byte_len with IB_WC_RECV_RDMA_WITH_IMM
RDMA/mlx5: Set RDMA DIM to be enabled by default
RDMA/nldev: Added configuration of RDMA dynamic interrupt moderation to netlink
RDMA/core: Provide RDMA DIM support for ULPs
linux/dim: Implement RDMA adaptive moderation (DIM)
IB/mlx5: Report correctly tag matching rendezvous capability
docs: infiniband: add it to the driver-api bookset
IB/mlx5: Implement VHCA tunnel mechanism in DEVX
...
New Device Support:
- Add support for LP87561 4-Phase Regulator to TI LP87565 PMIC
- Add support for RK809 and RK817 to Rockchip RK808
- Add support for Lid Angle to ChromeOS core
- Add support for CS47L15 CODEC to Madera core
- Add support for CS47L92 CODEC to Madera core
- Add support for ChromeOS (legacy) Accelerometers in ChromeOS core
- Add support for Add Intel Elkhart Lake PCH to Intel LPSS
New Functionality:
- Provide regulator supply information when registering; madera-core
- Additional Device Tree support; lp87565, madera, cros-ec, rohm,bd71837-pmic
- Allow over-riding power button press via Device Tree; rohm-bd718x7
- Differentiate between running processors; cros_ec_dev
Fix-ups:
- Big header file update; cros_ec_commands.h
- Split header per-subsystem; rohm-bd718x7
- Remove superfluous code; menelaus, cs5535-mfd, cs47lXX-tables
- Trivial; sorting, coding style; intel-lpss-pci
- Only remove Power Off functionality if set locally; rk808
- Make use for Power Off Prepare(); rk808
- Fix spelling mistake in header guards; stmfx
- Properly free IDA resources
- SPDX fixups; cs47lXX-tables, madera
- Error path fixups; hi655x-pmic
Bug Fixes:
- Add missing break in case() statement
- Repair undefined behaviour when not initialising variables; arizona-core, madera-core
- Fix reference to Device Tree documentation; madera"
* tag 'mfd-next-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (45 commits)
mfd: hi655x-pmic: Fix missing return value check for devm_regmap_init_mmio_clk
mfd: madera: Fixup SPDX headers
mfd: madera: Remove some unused registers and fix some defaults
mfd: intel-lpss: Release IDA resources
mfd: intel-lpss: Add Intel Elkhart Lake PCH PCI IDs
mfd: cs5535-mfd: Remove ifdef OLPC noise
mfd: stmfx: Fix macro definition spelling
dt-bindings: mfd: Add link to ROHM BD71847 Datasheet
MAINAINERS: Swap words in INTEL PMIC MULTIFUNCTION DEVICE DRIVERS
mfd: cros_ec_dev: Register cros_ec_accel_legacy driver as a subdevice
mfd: rk808: Prepare rk805 for poweroff
mfd: rk808: Check pm_power_off pointer
mfd: cros_ec: differentiate SCP from EC by feature bit
dt-bindings: Add binding for cros-ec-rpmsg
mfd: madera: Add Madera core support for CS47L92
mfd: madera: Add Madera core support for CS47L15
mfd: madera: Update DT bindings to add additional CODECs
mfd: madera: Add supply mapping for MICVDD
mfd: madera: Fix potential uninitialised use of variable
mfd: madera: Fix bad reference to pinctrl.txt file
...
Merge tag 'drm-next-2019-07-16' of git://anongit.freedesktop.org/drm/drm
Pull drm updates from Dave Airlie:
"The biggest thing in this is the AMD Navi GPU support, this again
contains a bunch of header files that are large. These are the new AMD
RX5700 GPUs that just recently became available.
New drivers:
- ST-Ericsson MCDE driver
- Ingenic JZ47xx SoC
UAPI change:
- HDR source metadata property
Core:
- HDR inforframes and EDID parsing
- drm hdmi infoframe unpacking
- remove prime sg_table caching into dma-buf
- New gem vram helpers to reduce driver code
- Lots of drmP.h removal
- reservation fencing fix
- documentation updates
- drm_fb_helper_connector removed
- mode name command handler rewrite
fbcon:
- Remove the fbcon notifiers
ttm:
- forward progress fixes
dma-buf:
- make mmap call optional
- debugfs refcount fixes
- dma-fence free with pending signals fix
- each dma-buf gets an inode
Panels:
- Lots of additional panel bindings
amdgpu:
- initial navi10 support
- avoid hw reset
- HDR metadata support
- new thermal sensors for vega asics
- RAS fixes
- use HMM rather than MMU notifier
- xgmi topology via kfd
- SR-IOV fixes
- driver reload fixes
- DC use a core bpc attribute
- Aux fixes for DC
- Bandwidth calc updates for DC
- Clock handling refactor
- kfd VEGAM support
vmwgfx:
- Coherent memory support changes
i915:
- HDR Support
- HDMI i2c link
- Icelake multi-segmented gamma support
- GuC firmware update
- Mule Creek Canyon PCH support for EHL
- EHL platform updtes
- move i915.alpha_support to i915.force_probe
- runtime PM refactoring
- VBT parsing refactoring
- DSI fixes
- struct mutex dependency reduction
- GEM code reorg
mali-dp:
- Komeda driver features
msm:
- dsi vs EPROBE_DEFER fixes
- msm8998 snapdragon 835 support
- a540 gpu support
- mdp5 and dpu interconnect support
exynos:
- drmP.h removal
tegra:
- misc fixes
tda998x:
- audio support improvements
- pixel repeated mode support
- quantisation range handling corrections
- HDMI vendor info fix
armada:
- interlace support fix
- overlay/video plane register handling refactor
- add gamma support
rockchip:
- RX3328 support
panfrost:
- expose perf counters via hidden ioctls
vkms:
- enumerate CRC sources list
ast:
- rework BO handling
mgag200:
- rework BO handling
dw-hdmi:
- suspend/resume support
rcar-du:
- R8A774A1 Soc Support
- LVDS dual-link mode support
- Additional formats
- Misc fixes
* tag 'drm-next-2019-07-16' of git://anongit.freedesktop.org/drm/drm: (1815 commits)
Revert "Merge branch 'vmwgfx-next' of git://people.freedesktop.org/~thomash/linux into drm-next"
Revert "mm: adjust apply_to_pfn_range interface for dropped token."
mm: adjust apply_to_pfn_range interface for dropped token.
drm/amdgpu/navi10: add uclk activity sensor
drm/amdgpu: properly guard the generic discovery code
drm/amdgpu: add missing documentation on new module parameters
drm/amdgpu: don't invalidate caches in RELEASE_MEM, only do the writeback
drm/amd/display: avoid 64-bit division
drm/amdgpu/psp11: simplify the ucode register logic
drm/amdgpu: properly guard DC support in navi code
drm/amd/powerplay: vega20: fix uninitialized variable use
drm/amd/display: dcn20: include linux/delay.h
amdgpu: make pmu support optional
drm/amd/powerplay: Zero initialize current_rpm in vega20_get_fan_speed_percent
drm/amd/powerplay: Zero initialize freq in smu_v11_0_get_current_clk_freq
drm/amd/powerplay: Use memset to initialize metrics structs
drm/amdgpu/mes10.1: Fix header guard
drm/amd/powerplay: add temperature sensor support for navi10
drm/amdgpu: fix scheduler timeout calc
drm/amdgpu: Prepare for hmm_range_register API change (v2)
...
The mm changes in there we premature and not fully ack or reviewed by core mm folks,
I dropped the ball by merging them via this tree, so lets take em all back out.
Dave Airlie [Mon, 15 Jul 2019 05:16:20 +0000 (15:16 +1000)]
mm: adjust apply_to_pfn_range interface for dropped token.
mm/pgtable: drop pgtable_t variable from pte_fn_t functions
drops the token came in via the hmm tree, this caused lots of
conflicts, but applying this cleanup patch should reduce it
to something easier to handle. Just accept the token is unused
at this point.
Merge tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull HMM updates from Jason Gunthorpe:
"Improvements and bug fixes for the hmm interface in the kernel:
- Improve clarity, locking and APIs related to the 'hmm mirror'
feature merged last cycle. In linux-next we now see AMDGPU and
nouveau to be using this API.
- Remove old or transitional hmm APIs. These are hold overs from the
past with no users, or APIs that existed only to manage cross tree
conflicts. There are still a few more of these cleanups that didn't
make the merge window cut off.
- Improve some core mm APIs:
- export alloc_pages_vma() for driver use
- refactor into devm_request_free_mem_region() to manage
DEVICE_PRIVATE resource reservations
- refactor duplicative driver code into the core dev_pagemap
struct
- Remove hmm wrappers of improved core mm APIs, instead have drivers
use the simplified API directly
- Remove DEVICE_PUBLIC
- Simplify the kconfig flow for the hmm users and core code"
* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (42 commits)
mm: don't select MIGRATE_VMA_HELPER from HMM_MIRROR
mm: remove the HMM config option
mm: sort out the DEVICE_PRIVATE Kconfig mess
mm: simplify ZONE_DEVICE page private data
mm: remove hmm_devmem_add
mm: remove hmm_vma_alloc_locked_page
nouveau: use devm_memremap_pages directly
nouveau: use alloc_page_vma directly
PCI/P2PDMA: use the dev_pagemap internal refcount
device-dax: use the dev_pagemap internal refcount
memremap: provide an optional internal refcount in struct dev_pagemap
memremap: replace the altmap_valid field with a PGMAP_ALTMAP_VALID flag
memremap: remove the data field in struct dev_pagemap
memremap: add a migrate_to_ram method to struct dev_pagemap_ops
memremap: lift the devmap_enable manipulation into devm_memremap_pages
memremap: pass a struct dev_pagemap to ->kill and ->cleanup
memremap: move dev_pagemap callbacks into a separate structure
memremap: validate the pagemap type passed to devm_memremap_pages
mm: factor out a devm_request_free_mem_region helper
mm: export alloc_pages_vma
...
Merge tag 'ecryptfs-5.3-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs
Pull eCryptfs updates from Tyler Hicks:
- Fix error handling when ecryptfs_read_lower() encounters an error
- Fix read-only file creation when the eCryptfs mount is configured to
store metadata in xattrs
- Minor code cleanups
* tag 'ecryptfs-5.3-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
ecryptfs: Change return type of ecryptfs_process_flags
ecryptfs: Make ecryptfs_xattr_handler static
ecryptfs: remove unnessesary null check in ecryptfs_keyring_auth_tok_for_sig
ecryptfs: use print_hex_dump_bytes for hexdump
eCryptfs: fix permission denied with ecryptfs_xattr mount option when create readonly file
ecryptfs: re-order a condition for static checkers
eCryptfs: fix a couple type promotion bugs
Merge tag 'upstream-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs
Pull UBIFS updates from Richard Weinberger:
- Support for zstd compression
- Support for offline signed filesystems
- Various fixes for regressions
* tag 'upstream-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
ubifs: Don't leak orphans on memory during commit
ubifs: Check link count of inodes when killing orphans.
ubifs: Add support for zstd compression.
ubifs: support offline signed images
ubifs: remove unnecessary check in ubifs_log_start_commit
ubifs: Fix typo of output in get_cs_sqnum
ubifs: Simplify redundant code
ubifs: Correctly use tnc_next() in search_dh_cookie()
Merge tag 'stream_open-5.3' of https://lab.nexedi.com/kirr/linux
Pull stream_open() updates from Kirill Smelkov:
"This time on stream_open front it is only two small changes:
- the first one converts stream_open.cocci to treat all functions
that start with wait_.* as blocking. Previously it was only
wait_event_.* functions that were considered as blocking, but this
was falsely reporting several deadlock cases as only warning.
This was picked by linux-kbuild and entered mainline as commit fa081d4607c5 ("coccinelle: api/stream_open: treat all wait_.*()
calls as blocking"), and already merged earlier.
- the second one teaches stream_open.cocci to consider files as being
stream-like even if they use noop_llseek. It results in two more
drivers being converted to stream_open() (mousedev.c and
hid-sensor-custom.c)"
* tag 'stream_open-5.3' of https://lab.nexedi.com/kirr/linux:
*: convert stream-like files -> stream_open, even if they use noop_llseek
Merge tag 'platform-drivers-x86-v5.3-1' of git://git.infradead.org/linux-platform-drivers-x86
Pull x86 platform driver updates from Andy Shevchenko:
"Gathered a bunch of x86 platform driver changes. It's rather big,
since includes two big refactors and completely new driver:
- ASUS WMI driver got a big refactoring in order to support the TUF
Gaming laptops. Besides that, the regression with backlight being
permanently off on various EeePC laptops has been fixed.
- Accelerometer on HP ProBook 450 G0 shows wrong measurements due to
X axis being inverted. This has been fixed.
- Intel PMC core driver has been extended to be ACPI enumerated if
the DSDT provides device with _HID "INT33A1". This allows to
convert the driver to be pure platform and support new hardware
purely based on ACPI DSDT.
- From now on the Intel Speed Select Technology is supported thru a
corresponding driver. This driver provides an access to the
features of the ISST, such as Performance Profile, Core Power, Base
frequency and Turbo Frequency.
- Mellanox platform drivers has been refactored and now extended to
support more systems, including new coming ones.
- The OLPC XO-1.75 platform is now supported.
- CB4063 Beckhoff Automation board is using PMC clocks, provided via
pmc_atom driver, for ethernet controllers in a way that they can't
be managed by the clock driver. The quirk has been extended to
cover this case.
- Touchscreen on Chuwi Hi10 Plus tablet has been enabled. Meanwhile
the information of Chuwi Hi10 Air has been fixed to cover more
models based on the same platform.
- Xiaomi notebooks have WMI interface enabled. Thus, the driver to
support it has been provided. It required some extension of the
generic WMI library, which allows to propagate opaque context to
the ->probe() of the individual drivers.
This release includes debugfs clean up from Greg KH for several
drivers that drop return code check and make debugfs absence or
failure non-fatal.
Also miscellaneous fixes here and there, mostly for Acer WMI and
various Intel drivers"
* tag 'platform-drivers-x86-v5.3-1' of git://git.infradead.org/linux-platform-drivers-x86: (74 commits)
platform/x86: Fix PCENGINES_APU2 Kconfig warning
tools/power/x86/intel-speed-select: Add .gitignore file
platform/x86: mlx-platform: Fix error handling in mlxplat_init()
platform/x86: intel_pmc_core: Attach using APCI HID "INT33A1"
platform/x86: intel_pmc_core: transform Pkg C-state residency from TSC ticks into microseconds
platform/x86: asus-wmi: Use dev_get_drvdata()
Documentation/ABI: Add new attribute for mlxreg-io sysfs interfaces
platform/x86: mlx-platform: Add more reset cause attributes
platform/x86: mlx-platform: Modify DMI matching order
platform/x86: mlx-platform: Add regmap structure for the next generation systems
platform/x86: mlx-platform: Change API for i2c-mlxcpld driver activation
platform/x86: mlx-platform: Move regmap initialization before all drivers activation
MAINTAINERS: Update for Intel Speed Select Technology
tools/power/x86: A tool to validate Intel Speed Select commands
platform/x86: ISST: Restore state on resume
platform/x86: ISST: Add Intel Speed Select PUNIT MSR interface
platform/x86: ISST: Add Intel Speed Select mailbox interface via MSRs
platform/x86: ISST: Add Intel Speed Select mailbox interface via PCI
platform/x86: ISST: Add Intel Speed Select mmio interface
platform/x86: ISST: Add IOCTL to Translate Linux logical CPU to PUNIT CPU number
...
Merge branch 'for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu
Pull percpu updates from Dennis Zhou:
"This includes changes to let percpu_ref release the backing percpu
memory earlier after it has been switched to atomic in cases where the
percpu ref is not revived.
This will help recycle percpu memory earlier in cases where the
refcounts are pinned for prolonged periods of time"
* 'for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
percpu_ref: release percpu memory early without PERCPU_REF_ALLOW_REINIT
md: initialize percpu refcounters using PERCU_REF_ALLOW_REINIT
io_uring: initialize percpu refcounters using PERCU_REF_ALLOW_REINIT
percpu_ref: introduce PERCPU_REF_ALLOW_REINIT flag
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"A number of PMU driver corner case fixes, a race fix, an event
grouping fix, plus a bunch of tooling fixes/updates"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
perf/x86/intel: Fix spurious NMI on fixed counter
perf/core: Fix exclusive events' grouping
perf/x86/amd/uncore: Set the thread mask for F17h L3 PMCs
perf/x86/amd/uncore: Do not set 'ThreadMask' and 'SliceMask' for non-L3 PMCs
perf/core: Fix race between close() and fork()
perf intel-pt: Fix potential NULL pointer dereference found by the smatch tool
perf intel-bts: Fix potential NULL pointer dereference found by the smatch tool
perf script: Assume native_arch for pipe mode
perf scripts python: export-to-sqlite.py: Fix DROP VIEW power_events_view
perf scripts python: export-to-postgresql.py: Fix DROP VIEW power_events_view
perf hists browser: Fix potential NULL pointer dereference found by the smatch tool
perf cs-etm: Fix potential NULL pointer dereference found by the smatch tool
perf parse-events: Remove unused variable: error
perf parse-events: Remove unused variable 'i'
perf metricgroup: Add missing list_del_init() when flushing egroups list
perf tools: Use list_del_init() more thorougly
perf tools: Use zfree() where applicable
tools lib: Adopt zalloc()/zfree() from tools/perf
perf tools: Move get_current_dir_name() cond prototype out of util.h
perf namespaces: Move the conditional setns() prototype to namespaces.h
...
Kirill Smelkov [Wed, 19 Jun 2019 17:26:56 +0000 (20:26 +0300)]
*: convert stream-like files -> stream_open, even if they use noop_llseek
This patch continues 8a0ca9292a2d (fs: stream_open - opener for
stream-like files so that read and write can run simultaneously without
deadlock) and 4fe550b29706 (*: convert stream-like files from
nonseekable_open -> stream_open) and teaches steam_open.cocci to
consider files as being stream-like not only if they have
.llseek=no_llseek, but also if they have .llseek=noop_llseek.
This is safe to do: the comment about noop_llseek says
This is an implementation of ->llseek useable for the rare special case when
userspace expects the seek to succeed but the (device) file is actually not
able to perform the seek. In this case you use noop_llseek() instead of
falling back to the default implementation of ->llseek.
and in general noop_llseek was massively added to drivers in 321d382138a2
(llseek: automatically add .llseek fop) when changing default for NULL .llseek
from NOP to no_llseek with the idea to avoid breaking compatibility, if
maybe some user-space program was using lseek on a device without caring
about the result, but caring if it was an error or not.
Amended semantic patch produces two changes when applied tree-wide:
drivers/hid/hid-sensor-custom.c:690:8-24: WARNING: hid_sensor_custom_fops: .read() has stream semantic; safe to change nonseekable_open -> stream_open.
drivers/input/mousedev.c:564:1-17: ERROR: mousedev_fops: .read() can deadlock .write(); change nonseekable_open -> stream_open to fix.
Cc: Julia Lawall <Julia.Lawall@lip6.fr> Cc: Jan Blunck <jblunck@suse.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jiri Kosina <jikos@kernel.org> Cc: Jonathan Cameron <jic23@kernel.org> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Merge tag 'powerpc-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc updates from Michael Ellerman:
"Notable changes:
- Removal of the NPU DMA code, used by the out-of-tree Nvidia driver,
as well as some other functions only used by drivers that haven't
(yet?) made it upstream.
- A fix for a bug in our handling of hardware watchpoints (eg. perf
record -e mem: ...) which could lead to register corruption and
kernel crashes.
- Enable HAVE_ARCH_HUGE_VMAP, which allows us to use large pages for
vmalloc when using the Radix MMU.
- A large but incremental rewrite of our exception handling code to
use gas macros rather than multiple levels of nested CPP macros.
And the usual small fixes, cleanups and improvements.
* tag 'powerpc-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (163 commits)
powerpc/powernv/idle: Fix restore of SPRN_LDBAR for POWER9 stop state.
powerpc/eeh: Handle hugepages in ioremap space
ocxl: Update for AFU descriptor template version 1.1
powerpc/boot: pass CONFIG options in a simpler and more robust way
powerpc/boot: add {get, put}_unaligned_be32 to xz_config.h
powerpc/irq: Don't WARN continuously in arch_local_irq_restore()
powerpc/module64: Use symbolic instructions names.
powerpc/module32: Use symbolic instructions names.
powerpc: Move PPC_HA() PPC_HI() and PPC_LO() to ppc-opcode.h
powerpc/module64: Fix comment in R_PPC64_ENTRY handling
powerpc/boot: Add lzo support for uImage
powerpc/boot: Add lzma support for uImage
powerpc/boot: don't force gzipped uImage
powerpc/8xx: Add microcode patch to move SMC parameter RAM.
powerpc/8xx: Use IO accessors in microcode programming.
powerpc/8xx: replace #ifdefs by IS_ENABLED() in microcode.c
powerpc/8xx: refactor programming of microcode CPM params.
powerpc/8xx: refactor printing of microcode patch name.
powerpc/8xx: Refactor microcode write
powerpc/8xx: refactor writing of CPM microcode arrays
...
Pull sparc updates from David Miller:
"Just a few small changes:
- Fix console naming inconsistency with hypervisor consoles, from
John Paul Adrian Glaubitz
- Fix userland compilation due to use of u_int, from Masahiro Yamada"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Add missing newline at end of file
sparc: fix unknown type name u_int in uapi header
sparc: configs: Remove useless UEVENT_HELPER_PATH
sparc: Remove redundant copy of the LGPL-2.0
sunhv: Fix device naming inconsistency between sunhv_console and sunhv_reg
1) Fix excessive stack usage in cxgb4, from Arnd Bergmann.
2) Missing skb queue lock init in tipc, from Chris Packham.
3) Fix some regressions in ipv6 flow label handling, from Eric Dumazet.
4) Elide flow dissection of local packets in FIB rules, from Petar
Penkov.
5) Fix TLS support build failure in mlx5, from Tariq Toukab.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (36 commits)
ppp: mppe: Revert "ppp: mppe: Add softdep to arc4"
net: dsa: qca8k: replace legacy gpio include
net: hisilicon: Use devm_platform_ioremap_resource
cxgb4: reduce kernel stack usage in cudbg_collect_mem_region()
tipc: ensure head->lock is initialised
tc-tests: updated skbedit tests
nfp: flower: ensure ip protocol is specified for L4 matches
nfp: flower: fix ethernet check on match fields
net/mlx5e: Provide cb_list pointer when setting up tc block on rep
net: phy: make exported variables non-static
net: sched: Fix NULL-pointer dereference in tc_indr_block_ing_cmd()
davinci_cpdma: don't cast dma_addr_t to pointer
net: openvswitch: do not update max_headroom if new headroom is equal to old headroom
net/mlx5e: Convert single case statement switch statements into if statements
net/mlx5: E-Switch, Reduce ingress acl modify metadata stack usage
net/mlx5e: Fix unused variable warning when CONFIG_MLX5_ESWITCH is off
net/mlx5e: Fix compilation error in TLS code
ipv6: fix static key imbalance in fl_create()
ipv6: fix potential crash in ip6_datagram_dst_update()
ipv6: tcp: fix flowlabels reflection for RST packets
...
Merge tag 'mtd/for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD updates from Miquel Raynal:
"This contains the following changes for MTD:
MTD core changes:
- New Hyperbus framework
- New _is_locked (concat) implementation
- Various cleanups
NAND core changes:
- use longest matching pattern in ->exec_op() default parser
- export NAND operation tracer
- add flag to indicate panic_write in MTD
- use kzalloc() instead of kmalloc() and memset()
Raw NAND controller drivers changes:
- brcmnand:
- fix BCH ECC layout for large page NAND parts
- fallback to detected ecc-strength, ecc-step-size
- when oops in progress use pio and interrupt polling
- code refactor code to introduce helper functions
- add support for v7.3 controller
- FSMC:
- use nand_op_trace for operation tracing
- GPMI:
- move all driver code into single file
- various cleanups (including dmaengine changes)
- use runtime PM to manage clocks
- implement exec_op
- MTK:
- correct low level time calculation of r/w cycle
- improve data sampling timing for read cycle
- add validity check for CE# pin setting
- fix wrongly assigned OOB buffer pointer issue
- re-license MTK NAND driver as Dual MIT/GPL
- STM32:
- manage the get_irq error case
- increase DMA completion timeouts
Raw NAND chips drivers changes:
- Macronix: add read-retry support
Onenand driver changes:
- add support for 8Gb datasize chips
- avoid fall-through warnings
SPI-NAND changes:
- define macros for page-read ops with three-byte addresses
- add support for two-byte device IDs and then for GigaDevice
GD5F1GQ4UFxxG
- add initial support for Paragon PN26G0xA
- handle the case where the last page read has bitflips
SPI-NOR core changes:
- add support for the mt25ql02g and w25q16jv flashes
- print error in case of jedec read id fails
- is25lp256: add post BFPT fix to correct the addr_width
SPI NOR controller drivers changes:
- intel-spi: Add support for Intel Elkhart Lake SPI serial flash
- smt32: remove the driver as the driver was replaced by spi-stm32-qspi.c
- cadence-quadspi: add reset control"
* tag 'mtd/for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (60 commits)
mtd: concat: implement _is_locked mtd operation
mtd: concat: refactor concat_lock/concat_unlock
mtd: abi: do not use C++ style comments in uapi header
mtd: afs: remove unneeded NULL check
mtd: rawnand: stm32_fmc2: increase DMA completion timeouts
mtd: rawnand: Use kzalloc() instead of kmalloc() and memset()
mtd: hyperbus: Add driver for TI's HyperBus memory controller
mtd: spinand: read returns badly if the last page has bitflips
mtd: spinand: Add initial support for Paragon PN26G0xA
mtd: rawnand: mtk: Re-license MTK NAND driver as Dual MIT/GPL
mtd: rawnand: gpmi: remove double assignment to block_size
dt-bindings: mtd: brcmnand: Add brcmnand, brcmnand-v7.3 support
mtd: rawnand: brcmnand: Add support for v7.3 controller
mtd: rawnand: brcmnand: Refactored code to introduce helper functions
mtd: rawnand: brcmnand: When oops in progress use pio and interrupt polling
mtd: Add flag to indicate panic_write
mtd: rawnand: Add Macronix NAND read retry support
mtd: onenand: Avoid fall-through warnings
mtd: spinand: Add support for GigaDevice GD5F1GQ4UFxxG
mtd: spinand: Add support for two-byte device IDs
...
Merge tag 'for-5.3/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:
- Add encrypted byte-offset initialization vector (eboiv) to DM crypt.
- Add optional discard features to DM snapshot which allow freeing
space from a DM device whose free space was exhausted.
- Various small improvements to use struct_size() and kzalloc().
- Fix to check if DM thin metadata is in fail_io mode before attempting
to update the superblock to set the needs_check flag. Otherwise the
DM thin-pool can hang.
- Fix DM bufio shrinker's potential for ABBA recursion deadlock with DM
thin provisioning on loop usecase.
* tag 'for-5.3/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm bufio: fix deadlock with loop device
dm snapshot: add optional discard support features
dm crypt: implement eboiv - encrypted byte-offset initialization vector
dm crypt: remove obsolete comment about plumb IV
dm crypt: wipe private IV struct after key invalid flag is set
dm integrity: use kzalloc() instead of kmalloc() + memset()
dm: update stale comment in end_clone_bio()
dm log writes: fix incorrect comment about the logged sequence example
dm log writes: use struct_size() to calculate size of pending_block
dm crypt: use struct_size() when allocating encryption context
dm integrity: always set version on superblock update
dm thin metadata: check if in fail_io mode when setting needs_check
Merge tag 'for-linus-5.3' of git://github.com/cminyard/linux-ipmi
Pull IPMI updates from Corey Minyard:
"Some small fixes for various things, nothing huge, mostly found by
automated tools.
Plus add a driver that allows Linux to act as an IPMB slave device, so
it can be a satellite MC in an IPMI network"
* tag 'for-linus-5.3' of git://github.com/cminyard/linux-ipmi:
docs: ipmb: place it at driver-api and convert to ReST
fix platform_no_drv_owner.cocci warnings
ipmi: ipmb: don't allocate i2c_client on stack
ipmi: ipmb: Fix build error while CONFIG_I2C is set to m
Add support for IPMB driver
drivers: ipmi: Drop device reference
ipmi_ssif: fix unexpected driver unregister warning
ipmi_si: use bool type for initialized variable
ipmi_si: fix unexpected driver unregister warning
Merge tag 'pinctrl-v5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control updates from Linus Walleij:
"This is the bulk of pin control changes for the v5.3 kernel cycle:
Core changes:
- Device links can optionally be added between a pin control producer
and its consumers. This will affect how the system power management
is handled: a pin controller will not suspend before all of its
consumers have been suspended.
This was necessary for the ST Microelectronics STMFX expander and
need to be tested on other systems as well: it makes sense to make
this default in the long run.
Right now it is opt-in per driver.
- Drive strength can be specified in microamps. With decreases in
silicon technology, milliamps isn't granular enough, let's make it
possible to select drive strengths in microamps.
Right now the Meson (AMlogic) driver needs this.
New drivers:
- New subdriver for the Tegra 194 SoC.
- New subdriver for the Qualcomm SDM845.
- New subdriver for the Qualcomm SM8150.
- New subdriver for the Freescale i.MX8MN (Freescale is now a product
line of NXP).
- New subdriver for Marvell MV98DX1135.
Driver improvements:
- The Bitmain BM1880 driver now supports pin config in addition to
muxing.
- The Qualcomm drivers can now reserve some GPIOs as taken aside and
not usable for users. This is used in ACPI systems to take out some
GPIO lines used by the BIOS so that noone else (neither kernel nor
userspace) will play with them by mistake and crash the machine.
- A slew of refurbishing around the Aspeed drivers (board management
controllers for servers) in preparation for the new Aspeed AST2600
SoC.
- A slew of improvements over the SH PFC drivers as usual.
- Misc cleanups and fixes"
* tag 'pinctrl-v5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (106 commits)
pinctrl: aspeed: Strip moved macros and structs from private header
pinctrl: aspeed: Fix missed include
pinctrl: baytrail: Use GENMASK() consistently
pinctrl: baytrail: Re-use data structures from pinctrl-intel.h
pinctrl: baytrail: Use defined macro instead of magic in byt_get_gpio_mux()
pinctrl: qcom: Add SM8150 pinctrl driver
dt-bindings: pinctrl: qcom: Add SM8150 pinctrl binding
dt-bindings: pinctrl: qcom: Document missing gpio nodes
pinctrl: aspeed: Add implementation-related documentation
pinctrl: aspeed: Split out pinmux from general pinctrl
pinctrl: aspeed: Clarify comment about strapping W1C
pinctrl: aspeed: Correct comment that is no longer true
MAINTAINERS: Add entry for ASPEED pinctrl drivers
dt-bindings: pinctrl: aspeed: Convert AST2500 bindings to json-schema
dt-bindings: pinctrl: aspeed: Convert AST2400 bindings to json-schema
dt-bindings: pinctrl: aspeed: Split bindings document in two
pinctrl: qcom: Add irq_enable callback for msm gpio
pinctrl: madera: Fixup SPDX headers
pinctrl: qcom: sdm845: Fix CONFIG preprocessor guard
pinctrl: tegra: Add bitmask support for parked bits
...
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input updates from Dmitry Torokhov:
- an update to Elan touchpad SMBus driver to fetch device parameters
(size, resolution) while it is still in PS/2 mode, before switching
over to SMBus, as in that mode some devices return garbage dimensions
- update to iforce joystick driver
- miscellaneous driver fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: (48 commits)
Input: gpio_keys_polled - allow specifying name of input device
Input: edt-ft5x06 - simplify event reporting code
Input: max77650-onkey - add MODULE_ALIAS()
Input: atmel_mxt_ts - fix leak in mxt_update_cfg()
Input: synaptics - enable SMBUS on T480 thinkpad trackpad
Input: atmel_mxt_ts - fix -Wunused-const-variable
Input: joydev - extend absolute mouse detection
HID: quirks: Refactor ELAN 400 and 401 handling
Input: elan_i2c - export the device id whitelist
Input: edt-ft5x06 - use get_unaligned_be16()
Input: iforce - add the Saitek R440 Force Wheel
Input: iforce - use unaligned accessors, where appropriate
Input: iforce - drop couple of temps from transport code
Input: iforce - drop bus type from iforce structure
Input: iforce - use DMA-safe buffores for USB transfers
Input: iforce - allow callers supply data buffer when fetching device IDs
Input: iforce - only call iforce_process_packet() if initialized
Input: iforce - signal command completion from transport code
Input: iforce - do not combine arguments for iforce_process_packet()
Input: iforce - factor out hat handling when parsing packets
...
Merge tag 'for-5.3/io_uring-20190711' of git://git.kernel.dk/linux-block
Pull io_uring updates from Jens Axboe:
"This contains:
- Support for recvmsg/sendmsg as first class opcodes.
I don't envision going much further down this path, as there are
plans in progress to support potentially any system call in an
async fashion through io_uring. But I think it does make sense to
have certain core ops available directly, especially those that can
support a "try this non-blocking" flag/mode. (me)
- Handle generic short reads automatically.
This can happen fairly easily if parts of the buffered read is
cached. Since the application needs to issue another request for
the remainder, just do this internally and save kernel/user
roundtrip while providing a nicer more robust API. (me)
- Support for linked SQEs.
This allows SQEs to depend on each other, enabling an application
to eg queue a read-from-this-file,write-to-that-file pair. (me)
- Fix race in stopping SQ thread (Jackie)"
* tag 'for-5.3/io_uring-20190711' of git://git.kernel.dk/linux-block:
io_uring: fix io_sq_thread_stop running in front of io_sq_thread
io_uring: add support for recvmsg()
io_uring: add support for sendmsg()
io_uring: add support for sqe links
io_uring: punt short reads to async context
uio: make import_iovec()/compat_import_iovec() return bytes on success
Yuyang Du [Tue, 9 Jul 2019 10:15:22 +0000 (18:15 +0800)]
locking/lockdep: Fix lock used or unused stats error
The stats variable nr_unused_locks is incremented every time a new lock
class is register and decremented when the lock is first used in
__lock_acquire(). And after all, it is shown and checked in lockdep_stats.
However, under configurations that either CONFIG_TRACE_IRQFLAGS or
CONFIG_PROVE_LOCKING is not defined:
The commit:
04d85b95693b8c5 ("locking/lockdep: Consolidate lock usage bit initialization")
missed marking the LOCK_USED flag at IRQ usage initialization because
as mark_usage() is not called. And the commit:
Kan Liang [Tue, 25 Jun 2019 14:21:35 +0000 (07:21 -0700)]
perf/x86/intel: Fix spurious NMI on fixed counter
If a user first sample a PEBS event on a fixed counter, then sample a
non-PEBS event on the same fixed counter on Icelake, it will trigger
spurious NMI. For example:
perf record -e 'cycles:p' -a
perf record -e 'cycles' -a
The error message for spurious NMI:
[June 21 15:38] Uhhuh. NMI received for unknown reason 30 on CPU 2.
[ +0.000000] Do you have a strange power saving mode enabled?
[ +0.000000] Dazed and confused, but trying to continue
The bug was introduced by the following commit:
commit fe57f2df2f66 ("perf/x86/intel: Fix race in intel_pmu_disable_event()")
The commit moves the intel_pmu_pebs_disable() after intel_pmu_disable_fixed(),
which returns immediately. The related bit of PEBS_ENABLE MSR will never be
cleared for the fixed counter. Then a non-PEBS event runs on the fixed counter,
but the bit on PEBS_ENABLE is still set, which triggers spurious NMIs.
Check and disable PEBS for fixed counters after intel_pmu_disable_fixed().
Reported-by: Yi, Ammy <ammy.yi@intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: fe57f2df2f66 ("perf/x86/intel: Fix race in intel_pmu_disable_event()") Link: https://lkml.kernel.org/r/20190625142135.22112-1-kan.liang@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
So far, we tried to disallow grouping exclusive events for the fear of
complications they would cause with moving between contexts. Specifically,
moving a software group to a hardware context would violate the exclusivity
rules if both groups contain matching exclusive events.
This attempt was, however, unsuccessful: the check that we have in the
perf_event_open() syscall is both wrong (looks at wrong PMU) and
insufficient (group leader may still be exclusive), as can be illustrated
by running:
$ perf record -e '{intel_pt//,cycles}' uname
$ perf record -e '{cycles,intel_pt//}' uname
ultimately successfully.
Furthermore, we are completely free to trigger the exclusivity violation
by:
Kim Phillips [Fri, 28 Jun 2019 21:59:20 +0000 (21:59 +0000)]
perf/x86/amd/uncore: Do not set 'ThreadMask' and 'SliceMask' for non-L3 PMCs
The following commit:
29a45c5b24c8 ("perf/x86/amd/uncore: Set ThreadMask and SliceMask for L3 Cache perf events")
enables L3 PMC events for all threads and slices by writing 1's in
'ChL3PmcCfg' (L3 PMC PERF_CTL) register fields.
Those bitfields overlap with high order event select bits in the Data
Fabric PMC control register, however.
So when a user requests raw Data Fabric events (-e amd_df/event=0xYYY/),
the two highest order bits get inadvertently set, changing the counter
select to events that don't exist, and for which no counts are read.
This patch changes the logic to write the L3 masks only when dealing
with L3 PMC counters.
AMD Family 16h and below Northbridge (NB) counters were not affected.
Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Gary Hook <Gary.Hook@amd.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Martin Liska <mliska@suse.cz> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Pu Wen <puwen@hygon.cn> Cc: Stephane Eranian <eranian@google.com> Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: 29a45c5b24c8 ("perf/x86/amd/uncore: Set ThreadMask and SliceMask for L3 Cache perf events") Link: https://lkml.kernel.org/r/20190628215906.4276-1-kim.phillips@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
Peter Zijlstra [Sat, 13 Jul 2019 09:21:25 +0000 (11:21 +0200)]
perf/core: Fix race between close() and fork()
Syzcaller reported the following Use-after-Free bug:
close() clone()
copy_process()
perf_event_init_task()
perf_event_init_context()
mutex_lock(parent_ctx->mutex)
inherit_task_group()
inherit_group()
inherit_event()
mutex_lock(event->child_mutex)
// expose event on child list
list_add_tail()
mutex_unlock(event->child_mutex)
mutex_unlock(parent_ctx->mutex)
...
goto bad_fork_*
bad_fork_cleanup_perf:
perf_event_free_task()
perf_release()
perf_event_release_kernel()
list_for_each_entry()
mutex_lock(ctx->mutex)
mutex_lock(event->child_mutex)
// event is from the failing inherit
// on the other CPU
perf_remove_from_context()
list_move()
mutex_unlock(event->child_mutex)
mutex_unlock(ctx->mutex)
list_for_each_entry_safe()
list_del()
free_event()
_free_event()
// and so event->hw.target
// is the already freed failed clone()
if (event->hw.target)
put_task_struct(event->hw.target)
// WHOOPSIE, already quite dead
Which puts the lie to the the comment on perf_event_free_task():
'unexposed, unused context' not so much.
Which is a 'fun' confluence of fail; copy_process() doing an
unconditional free_task() and not respecting refcounts, and perf having
creative locking. In particular:
bb9148868a32 ("perf/core: Fix lock inversion between perf,trace,cpuhp")
seems to have overlooked this 'fun' parade.
Solve it by using the fact that detached events still have a reference
count on their (previous) context. With this perf_event_free_task()
can detect when events have escaped and wait for their destruction.
Debugged-by: Alexander Shishkin <alexander.shishkin@linux.intel.com> Reported-by: syzbot+a24c397a29ad22d86c98@syzkaller.appspotmail.com Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: <stable@vger.kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Fixes: bb9148868a32 ("perf/core: Fix lock inversion between perf,trace,cpuhp") Signed-off-by: Ingo Molnar <mingo@kernel.org>
Eric Biggers [Fri, 12 Jul 2019 23:39:31 +0000 (16:39 -0700)]
ppp: mppe: Revert "ppp: mppe: Add softdep to arc4"
Commit e7eee881057a ("ppp: mppe: switch to RC4 library interface"),
which was merged through the crypto tree for v5.3, changed ppp_mppe.c to
use the new arc4_crypt() library function rather than access RC4 through
the dynamic crypto_skcipher API.
Meanwhile commit 3b4a21a08d5a ("ppp: mppe: Add softdep to arc4") was
merged through the net tree and added a module soft-dependency on "arc4".
The latter commit no longer makes sense because the code now uses the
"libarc4" module rather than "arc4", and also due to the direct use of
arc4_crypt(), no module soft-dependency is required.
So revert the latter commit.
Cc: Takashi Iwai <tiwai@suse.de> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'dlm-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm
Pull dlm updates from David Teigland:
"This set removes some unnecessary debugfs error handling, and checks
that lowcomms workqueues are not NULL before destroying"
* tag 'dlm-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: no need to check return value of debugfs_create functions
dlm: check if workqueues are NULL before flushing/destroying
Merge tag 'f2fs-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this round, we've introduced native swap file support which can
exploit DIO, enhanced existing checkpoint=disable feature with
additional mount option to tune the triggering condition, and allowed
user to preallocate physical blocks in a pinned file which will be
useful to avoid f2fs fragmentation in append-only workloads. In
addition, we've fixed subtle quota corruption issue.
Enhancements:
- add swap file support which uses DIO
- allocate blocks for pinned file
- allow SSR and mount option to enhance checkpoint=disable
- enhance IPU IOs
- add more sanity checks such as memory boundary access
Bug fixes:
- quota corruption in very corner case of error-injected SPO case
- fix root_reserved on remount and some wrong counts
- add missing fsck flag
Some patches were also introduced to clean up ambiguous i_flags and
debugging messages codes"
* tag 'f2fs-for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (33 commits)
f2fs: improve print log in f2fs_sanity_check_ckpt()
f2fs: avoid out-of-range memory access
f2fs: fix to avoid long latency during umount
f2fs: allow all the users to pin a file
f2fs: support swap file w/ DIO
f2fs: allocate blocks for pinned file
f2fs: fix is_idle() check for discard type
f2fs: add a rw_sem to cover quota flag changes
f2fs: set SBI_NEED_FSCK for xattr corruption case
f2fs: use generic EFSBADCRC/EFSCORRUPTED
f2fs: Use DIV_ROUND_UP() instead of open-coding
f2fs: print kernel message if filesystem is inconsistent
f2fs: introduce f2fs_<level> macros to wrap f2fs_printk()
f2fs: avoid get_valid_blocks() for cleanup
f2fs: ioctl for removing a range from F2FS
f2fs: only set project inherit bit for directory
f2fs: separate f2fs i_flags from fs_flags and ext4 i_flags
f2fs: replace ktype default_attrs with default_groups
f2fs: Add option to limit required GC for checkpoint=disable
f2fs: Fix accounting for unusable blocks
...
Merge tag 'xfs-5.3-merge-12' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs updates from Darrick Wong:
"In this release there are a significant amounts of consolidations and
cleanups in the log code; restructuring of the log to issue struct
bios directly; new bulkstat ioctls to return v5 fs inode information
(and fix all the padding problems of the old ioctl); the beginnings of
multithreaded inode walks (e.g. quotacheck); and a reduction in memory
usage in the online scrub code leading to reduced runtimes.
- Refactor inode geometry calculation into a single structure instead
of open-coding pieces everywhere.
- Add online repair to build options.
- Remove unnecessary function call flags and functions.
- Claim maintainership of various loose xfs documentation and header
files.
- Use struct bio directly for log buffer IOs instead of struct
xfs_buf.
- Reduce log item boilerplate code requirements.
- Merge log item code spread across too many files.
- Further distinguish between log item commits and cancellations.
- Various small cleanups to the ag small allocator.
- Support cgroup-aware writeback
- libxfs refactoring for mkfs cleanup
- Remove unneeded #includes
- Fix a memory allocation miscalculation in the new log bio code
- Fix bisection problems
- Fix a crash in ioend processing caused by tripping over freeing of
preallocated transactions
- Split out a generic inode walk mechanism from the bulkstat code,
hook up all the internal users to use the walking code, then clean
up bulkstat to serve only the bulkstat ioctls.
- Add a multithreaded iwalk implementation to speed up quotacheck on
fast storage with many CPUs.
- Remove unnecessary return values in logging teardown functions.
- Supplement the bstat and inogrp structures with new bulkstat and
inumbers structures that have all the fields we need for v5
filesystem features and none of the padding problems of their
predecessors.
- Wire up new ioctls that use the new structures with a much simpler
bulk_ireq structure at the head instead of the pointerhappy mess we
had before.
- Enable userspace to constrain bulkstat returns to a single AG or a
single special inode so that we can phase out a lot of geometry
guesswork in userspace.
- Reduce memory consumption and zeroing overhead in extended
attribute scrub code.
- Fix some behavioral regressions in the new bulkstat backend code.
- Fix some behavioral regressions in the new log bio code"
* tag 'xfs-5.3-merge-12' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (100 commits)
xfs: chain bios the right way around in xfs_rw_bdev
xfs: bump INUMBERS cursor correctly in xfs_inumbers_walk
xfs: don't update lastino for FSBULKSTAT_SINGLE
xfs: online scrub needn't bother zeroing its temporary buffer
xfs: only allocate memory for scrubbing attributes when we need it
xfs: refactor attr scrub memory allocation function
xfs: refactor extended attribute buffer pointer functions
xfs: attribute scrub should use seen_enough to pass error values
xfs: allow single bulkstat of special inodes
xfs: specify AG in bulk req
xfs: wire up the v5 inumbers ioctl
xfs: wire up new v5 bulkstat ioctls
xfs: introduce v5 inode group structure
xfs: introduce new v5 bulkstat structure
xfs: rename bulkstat functions
xfs: remove various bulk request typedef usage
fs: xfs: xfs_log: Change return type from int to void
xfs: poll waiting for quotacheck
xfs: multithreaded iwalk implementation
xfs: refactor INUMBERS to use iwalk functions
...
Merge tag 'vfs-fix-ioctl-checking-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull common SETFLAGS/FSSETXATTR parameter checking from Darrick Wong:
"Here's a patch series that sets up common parameter checking functions
for the FS_IOC_SETFLAGS and FS_IOC_FSSETXATTR ioctl implementations.
The goal here is to reduce the amount of behaviorial variance between
the filesystems where those ioctls originated (ext2 and XFS,
respectively) and everybody else.
- Standardize parameter checking for the SETFLAGS and FSSETXATTR
ioctls (which were the file attribute setters for ext4 and xfs and
have now been hoisted to the vfs)
- Only allow the DAX flag to be set on files and directories"
* tag 'vfs-fix-ioctl-checking-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
vfs: only allow FSSETXATTR to set DAX flag on files and dirs
vfs: teach vfs_ioc_fssetxattr_check to check extent size hints
vfs: teach vfs_ioc_fssetxattr_check to check project id info
vfs: create a generic checking function for FS_IOC_FSSETXATTR
vfs: create a generic checking and prep function for FS_IOC_SETFLAGS
Merge tag 'linux-kselftest-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull Kselftest updates from Shuah Khan:
"This Kselftest update for Linux 5.3-rc1 consists of build failure
fixes and minor code cleaning patch to remove duplicate headers"
* tag 'linux-kselftest-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
rseq/selftests: Fix Thumb mode build failure on arm32
kselftests: cgroup: remove duplicated include from test_freezer.c
selftests: timestamping: Fix SIOCGSTAMP undeclared build failure
selftests: dma-buf: Adding kernel config fragment CONFIG_UDMABUF=y
Merge tag 'kconfig-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kconfig updates from Masahiro Yamada:
- always require argument for --defconfig and remove the hard-coded
arch/$(ARCH)/defconfig path
- make arch/$(SRCARCH)/configs/defconfig the new default of defconfig
- some code cleanups
* tag 'kconfig-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kconfig: remove meaningless if-conditional in conf_read()
kconfig: Fix spelling of sym_is_changable
unicore32: rename unicore32_defconfig to defconfig
kconfig: make arch/*/configs/defconfig the default of KBUILD_DEFCONFIG
kconfig: add static qualifier to expand_string()
kconfig: require the argument of --defconfig
kconfig: remove always false ifeq ($(KBUILD_DEFCONFIG,) conditional
- propagate 'No space left on device' error in fixdep to Make
- allow Clang to use its integrated assembler
- improve some coccinelle scripts
- add a new flag KBUILD_ABS_SRCTREE to request Kbuild to use absolute
path for $(srctree).
- do not ignore errors when compression utility is missing
- misc cleanups
* tag 'kbuild-v5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (49 commits)
kbuild: use -- separater intead of $(filter-out ...) for cc-cross-prefix
kbuild: Inform user to pass ARCH= for make mrproper
kbuild: fix compression errors getting ignored
kbuild: add a flag to force absolute path for srctree
kbuild: replace KBUILD_SRCTREE with boolean building_out_of_srctree
kbuild: remove src and obj from the top Makefile
scripts/tags.sh: remove unused environment variables from comments
scripts/tags.sh: drop SUBARCH support for ARM
kbuild: compile-test kernel headers to ensure they are self-contained
kheaders: include only headers into kheaders_data.tar.xz
kheaders: remove meaningless -R option of 'ls'
kbuild: support header-test-pattern-y
kbuild: do not create wrappers for header-test-y
kbuild: compile-test exported headers to ensure they are self-contained
init/Kconfig: add CONFIG_CC_CAN_LINK
kallsyms: exclude kasan local symbols on s390
kbuild: add more hints about SUBDIRS replacement
coccinelle: api/stream_open: treat all wait_.*() calls as blocking
coccinelle: put_device: Add a cast to an expression for an assignment
coccinelle: put_device: Adjust a message construction
...
Merge tag 'asm-generic-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"The asm-generic changes for 5.3 consist of a cleanup series to remove
ptrace.h from Christoph Hellwig, who explains:
'asm-generic/ptrace.h is a little weird in that it doesn't actually
implement any functionality, but it provided multiple layers of
macros that just implement trivial inline functions. We implement
those directly in the few architectures and be off with a much
simpler design.'
at https://lore.kernel.org/lkml/20190624054728.30966-1-hch@lst.de/"
* tag 'asm-generic-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
asm-generic: remove ptrace.h
x86: don't use asm-generic/ptrace.h
sh: don't use asm-generic/ptrace.h
powerpc: don't use asm-generic/ptrace.h
arm64: don't use asm-generic/ptrace.h
- Add CPU measurement counters for newer machines.
- Add base DASD thin provisioning support and code cleanups.
* tag 's390-5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (21 commits)
s390/unwind: avoid int overflow in outside_of_stack
s390/zcrypt: remove the exporting of ap_query_configuration
s390/pci: add mio_enabled attribute
s390: fix setting of mio addressing control
s390/ipl: Fix detection of has_secure attribute
s390: vfio-ap: fix irq registration
s390/cpumf: Add extended counter set definitions for model 8561 and 8562
s390/dasd: Handle out-of-space constraint
s390/dasd: Add discard support for ESE volumes
s390/dasd: Use ALIGN_DOWN macro
s390/dasd: Make dasd_setup_queue() a discipline function
s390/dasd: Add new ioctl to release space
s390/dasd: Add dasd_sleep_on_queue_interruptible()
s390/dasd: Add missing intensity definition
s390/dasd: Fix whitespace
s390/dasd: Add dynamic formatting support for ESE volumes
s390/dasd: Recognise data for ESE volumes
s390/dasd: Put sub-order definitions in a separate section
s390/dasd: Make layout analysis ESE compatible
s390/dasd: Remove old defines and function
...
This patch replaces the legacy bulk gpio.h include
with the proper gpio/consumer.h variant. This was
caught by the kbuild test robot that was running
into an error because of this.
For more information why linux/gpio.h is bad can be found in:
commit 1ce3a113e232 ("gpio: Clarify that <linux/gpio.h> is legacy")
Reported-by: kbuild test robot <lkp@intel.com> Link: https://www.spinics.net/lists/netdev/msg584447.html Fixes: 48699428c91b ("net: dsa: qca8k: introduce reset via gpio feature") Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'nios2-v5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2
Pull arch/nios2 updates from Ley Foon Tan.
* tag 'nios2-v5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
nios2: configs: Remove useless UEVENT_HELPER_PATH
nios2: remove pointless second entry for CONFIG_TRACE_IRQFLAGS_SUPPORT
cxgb4: reduce kernel stack usage in cudbg_collect_mem_region()
The cudbg_collect_mem_region() and cudbg_read_fw_mem() both use several
hundred kilobytes of kernel stack space. One gets inlined into the other,
which causes the stack usage to be combined beyond the warning limit
when building with clang:
drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c:1057:12: error: stack frame size of 1244 bytes in function 'cudbg_collect_mem_region' [-Werror,-Wframe-larger-than=]
Restructuring cudbg_collect_mem_region() lets clang do the same
optimization that gcc does and reuse the stack slots as it can
see that the large variables are never used together.
A better fix might be to avoid using cudbg_meminfo on the stack
altogether, but that requires a larger rewrite.
Fixes: 75c85be3c1f6 ("cxgb4: collect MC memory dump") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"ARM:
- support for chained PMU counters in guests
- improved SError handling
- handle Neoverse N1 erratum #1349291
- allow side-channel mitigation status to be migrated
- standardise most AArch64 system register accesses to msr_s/mrs_s
- fix host MPIDR corruption on 32bit
- selftests ckleanups
x86:
- PMU event {white,black}listing
- ability for the guest to disable host-side interrupt polling
- fixes for enlightened VMCS (Hyper-V pv nested virtualization),
- new hypercall to yield to IPI target
- support for passing cstate MSRs through to the guest
- lots of cleanups and optimizations
Generic:
- Some txt->rST conversions for the documentation"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (128 commits)
Documentation: virtual: Add toctree hooks
Documentation: kvm: Convert cpuid.txt to .rst
Documentation: virtual: Convert paravirt_ops.txt to .rst
KVM: x86: Unconditionally enable irqs in guest context
KVM: x86: PMU Event Filter
kvm: x86: Fix -Wmissing-prototypes warnings
KVM: Properly check if "page" is valid in kvm_vcpu_unmap
KVM: arm/arm64: Initialise host's MPIDRs by reading the actual register
KVM: LAPIC: Retry tune per-vCPU timer_advance_ns if adaptive tuning goes insane
kvm: LAPIC: write down valid APIC registers
KVM: arm64: Migrate _elx sysreg accessors to msr_s/mrs_s
KVM: doc: Add API documentation on the KVM_REG_ARM_WORKAROUNDS register
KVM: arm/arm64: Add save/restore support for firmware workaround state
arm64: KVM: Propagate full Spectre v2 workaround state to KVM guests
KVM: arm/arm64: Support chained PMU counters
KVM: arm/arm64: Remove pmc->bitmask
KVM: arm/arm64: Re-create event when setting counter value
KVM: arm/arm64: Extract duplicated code to own function
KVM: arm/arm64: Rename kvm_pmu_{enable/disable}_counter functions
KVM: LAPIC: ARBPRI is a reserved register for x2APIC
...
Chris Packham [Thu, 11 Jul 2019 22:41:15 +0000 (10:41 +1200)]
tipc: ensure head->lock is initialised
tipc_named_node_up() creates a skb list. It passes the list to
tipc_node_xmit() which has some code paths that can call
skb_queue_purge() which relies on the list->lock being initialised.
The spin_lock is only needed if the messages end up on the receive path
but when the list is created in tipc_named_node_up() we don't
necessarily know if it is going to end up there.
Once all the skb list users are updated in tipc it will then be possible
to update them to use the unlocked variants of the skb list functions
and initialise the lock when we know the message will follow the receive
path.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 12 Jul 2019 22:31:55 +0000 (15:31 -0700)]
Merge branch 'nfp-flower-bugs'
John Hurley says:
====================
Fix bugs in NFP flower match offload
This patchset contains bug fixes for corner cases in the match fields of
flower offloads. The patches ensure that flows that should not be
supported are not (incorrectly) offloaded. These include rules that match
on layer 3 and/or 4 data without specified ethernet or ip protocol fields.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Wed, 10 Jul 2019 18:30:30 +0000 (19:30 +0100)]
nfp: flower: ensure ip protocol is specified for L4 matches
Flower rules on the NFP firmware are able to match on an IP protocol
field. When parsing rules in the driver, unknown IP protocols are only
rejected when further matches are to be carried out on layer 4 fields, as
the firmware will not be able to extract such fields from packets.
L4 protocol dissectors such as FLOW_DISSECTOR_KEY_PORTS are only parsed if
an IP protocol is specified. This leaves a loophole whereby a rule that
attempts to match on transport layer information such as port numbers but
does not explicitly give an IP protocol type can be incorrectly offloaded
(in this case with wildcard port numbers matches).
Fix this by rejecting the offload of flows that attempt to match on L4
information, not only when matching on an unknown IP protocol type, but
also when the protocol is wildcarded.
Fixes: c549b3fc5aba ("nfp: flower: check L4 matches on unknown IP protocols") Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
John Hurley [Wed, 10 Jul 2019 18:30:29 +0000 (19:30 +0100)]
nfp: flower: fix ethernet check on match fields
NFP firmware does not explicitly match on an ethernet type field. Rather,
each rule has a bitmask of match fields that can be used to infer the
ethernet type.
Currently, if a flower rule contains an unknown ethernet type, a check is
carried out for matches on other fields of the packet. If matches on
layer 3 or 4 are found, then the offload is rejected as firmware will not
be able to extract these fields from a packet with an ethernet type it
does not currently understand.
However, if a rule contains an unknown ethernet type without any L3 (or
above) matches then this will effectively be offloaded as a rule with a
wildcarded ethertype. This can lead to misclassifications on the firmware.
Fix this issue by rejecting all flower rules that specify a match on an
unknown ethernet type.
Further ensure correct offloads by moving the 'L3 and above' check to any
rule that does not specify an ethernet type and rejecting rules with
further matches. This means that we can still offload rules with a
wildcarded ethertype if they only match on L2 fields but will prevent
rules which match on further fields that we cannot be sure if the firmware
will be able to extract.
Fixes: ad758551af4d ("nfp: extend flower add flow offload") Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyper-v updates from Sasha Levin:
- Add a module description to the Hyper-V vmbus module.
- Rework some vmbus code to separate architecture specifics out to
arch/x86/. This is part of the work of adding arm64 support to
Hyper-V.
* tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
Drivers: hv: vmbus: Break out ISA independent parts of mshyperv.h
drivers: hv: Add a module description line to the hv_vmbus driver
net/mlx5e: Provide cb_list pointer when setting up tc block on rep
Recent refactoring of tc block offloads infrastructure introduced new
flow_block_cb_setup_simple() method intended to be used as unified way for
all drivers to register offload callbacks. However, commit that actually
extended all users (drivers) with block cb list and provided it to
flow_block infra missed mlx5 en_rep. This leads to following NULL-pointer
dereference when creating Qdisc:
Extend en_rep with new static mlx5e_rep_block_cb_list list and pass it to
flow_block_cb_setup_simple() function instead of hardcoded NULL pointer.
Fixes: 0c5c09752259 ("drivers: net: use flow block API") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
The variables phy_basic_ports_array, phy_fibre_port_array and
phy_all_ports_features_array are declared static and marked
EXPORT_SYMBOL_GPL(), which is at best an odd combination.
Because the variables were decided to be a part of API, this commit
removes the static attributes and adds the declarations to the header.
Fixes: 1adbfd4fe348 ("net: ethernet: Convert phydev advertize and supported from u32 to link mode") Signed-off-by: Denis Efremov <efremov@linux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net: sched: Fix NULL-pointer dereference in tc_indr_block_ing_cmd()
After recent refactoring of block offlads infrastructure, indr_dev->block
pointer is dereferenced before it is verified to be non-NULL. Example stack
trace where this behavior leads to NULL-pointer dereference error when
creating vxlan dev on system with mlx5 NIC with offloads enabled:
Introduce new function tcf_block_non_null_shared() that verifies block
pointer before dereferencing it to obtain index. Use the function in
tc_indr_block_ing_cmd() to prevent NULL pointer dereference.
Fixes: 0c5c09752259 ("drivers: net: use flow block API") Signed-off-by: Vlad Buslov <vladbu@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
dma_addr_t may be 64-bit wide on 32-bit architectures, so it is not
valid to cast between it and a pointer:
drivers/net/ethernet/ti/davinci_cpdma.c: In function 'cpdma_chan_submit_si':
drivers/net/ethernet/ti/davinci_cpdma.c:1047:12: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
drivers/net/ethernet/ti/davinci_cpdma.c: In function 'cpdma_chan_idle_submit_mapped':
drivers/net/ethernet/ti/davinci_cpdma.c:1114:12: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
drivers/net/ethernet/ti/davinci_cpdma.c: In function 'cpdma_chan_submit_mapped':
drivers/net/ethernet/ti/davinci_cpdma.c:1164:12: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
Solve this by using two separate members in 'struct submit_info'.
Since this avoids the use of the 'flag' member, the structure does
not even grow in typical configurations.
Fixes: e82b152bae73 ("net: ethernet: ti: davinci_cpdma: add dma mapped submit") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
net: openvswitch: do not update max_headroom if new headroom is equal to old headroom
When a vport is deleted, the maximum headroom size would be changed.
If the vport which has the largest headroom is deleted,
the new max_headroom would be set.
But, if the new headroom size is equal to the old headroom size,
updating routine is unnecessary.
Signed-off-by: Taehee Yoo <ap420073@gmail.com> Tested-by: Greg Rose <gvrose8192@gmail.com> Reviewed-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'dma-mapping-5.3' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping updates from Christoph Hellwig:
- move the USB special case that bounced DMA through a device bar into
the USB code instead of handling it in the common DMA code (Laurentiu
Tudor and Fredrik Noring)
- don't dip into the global CMA pool for single page allocations
(Nicolin Chen)
- fix a crash when allocating memory for the atomic pool failed during
boot (Florian Fainelli)
- move support for MIPS-style uncached segments to the common code and
use that for MIPS and nios2 (me)
- make support for DMA_ATTR_NON_CONSISTENT and
DMA_ATTR_NO_KERNEL_MAPPING generic (me)
- convert nds32 to the generic remapping allocator (me)
* tag 'dma-mapping-5.3' of git://git.infradead.org/users/hch/dma-mapping: (29 commits)
dma-mapping: mark dma_alloc_need_uncached as __always_inline
MIPS: only select ARCH_HAS_UNCACHED_SEGMENT for non-coherent platforms
usb: host: Fix excessive alignment restriction for local memory allocations
lib/genalloc.c: Add algorithm, align and zeroed family of DMA allocators
nios2: use the generic uncached segment support in dma-direct
nds32: use the generic remapping allocator for coherent DMA allocations
arc: use the generic remapping allocator for coherent DMA allocations
dma-direct: handle DMA_ATTR_NO_KERNEL_MAPPING in common code
dma-direct: handle DMA_ATTR_NON_CONSISTENT in common code
dma-mapping: add a dma_alloc_need_uncached helper
openrisc: remove the partial DMA_ATTR_NON_CONSISTENT support
arc: remove the partial DMA_ATTR_NON_CONSISTENT support
arm-nommu: remove the partial DMA_ATTR_NON_CONSISTENT support
ARM: dma-mapping: allow larger DMA mask than supported
dma-mapping: truncate dma masks to what dma_addr_t can hold
iommu/dma: Apply dma_{alloc,free}_contiguous functions
dma-remap: Avoid de-referencing NULL atomic_pool
MIPS: use the generic uncached segment support in dma-direct
dma-direct: provide generic support for uncached kernel segments
au1100fb: fix DMA API abuse
...
- Complete PCI endpoint removal so a subsequent add doesn't fail with
-EBUSY (Alan Mikhak)
- Pay attention to PCI endpoint fixed-size BARs (Alan Mikhak)
- Fix PCI endpoint handling of 64bit BARs (Alan Mikhak)
- Clear PCI endpoint BARs before freeing space (Alan Mikhak)
* remotes/lorenzo/pci/endpoint:
PCI: endpoint: Clear BAR before freeing its space
PCI: endpoint: Skip odd BAR when skipping 64bit BAR
PCI: endpoint: Allocate enough space for fixed size BAR
PCI: endpoint: Set endpoint controller pointer to NULL
- Put Tegra PEX CLK & BIAS pads in DPD mode to reduce power usage when
powergated (Manikanta Maddireddy)
- Add generic DT binding for "reset-gpios" property (Manikanta
Maddireddy)
- Add Tegra support for GPIO-based PERST# (Manikanta Maddireddy)
- Enable Relaxed Ordering only for Tegra20 & Tegra30 (Vidya Sagar)
* remotes/lorenzo/pci/tegra:
PCI: tegra: Enable Relaxed Ordering only for Tegra20 & Tegra30
PCI: tegra: Change link retry log level to debug
PCI: tegra: Add support for GPIO based PERST#
PCI: Add DT binding for "reset-gpios" property
PCI: tegra: Put PEX CLK & BIAS pads in DPD mode
dt-bindings: pci: tegra: Document PCIe DPD pinctrl optional prop
PCI: tegra: Add AFI_PEX2_CTRL reg offset as part of SoC struct
PCI: tegra: Change PRSNT_SENSE IRQ log to debug
PCI: tegra: Program AFI_CACHE_BAR_{0,1}_{ST,SZ} registers only for Tegra20
PCI: tegra: Fix PLLE power down issue due to CLKREQ# signal
PCI: tegra: Set target speed as Gen1 before starting LTSSM
PCI: tegra: Update flow control timer frequency in Tegra210
PCI: tegra: Add SW fixup for RAW violations
PCI: tegra: Increase the deskew retry time
PCI: tegra: Enable PCIe xclk clock clamping
PCI: tegra: Process pending DLL transactions before entering L1 or L2
PCI: tegra: Disable AFI dynamic clock gating
PCI: tegra: Enable opportunistic UpdateFC and ACK
PCI: tegra: Program UPHY electrical settings for Tegra210
PCI: tegra: Advertise PCIe Advanced Error Reporting (AER) capability
PCI: tegra: Add PCIe Gen2 link speed support
PCI: tegra: Fix PCIe host power up sequence
PCI: tegra: Mask AFI_INTR in runtime suspend
PCI: tegra: Rearrange Tegra PCIe driver functions
PCI: tegra: Handle failure cases in tegra_pcie_power_on()
soc/tegra: pmc: Export tegra_powergate_power_on()
- Move qcom driver to bulk clock API (Bjorn Andersson)
- Add Qualcomm QCS404 PCIe controller support (Bjorn Andersson)
- Ensure Qualcomm PERST is asserted for at least 100ms (Niklas Cassel)
* remotes/lorenzo/pci/qcom:
PCI: qcom: Ensure that PERST is asserted for at least 100 ms
PCI: qcom: Add QCS404 PCIe controller support
dt-bindings: PCI: qcom: Add QCS404 to the binding
PCI: qcom: Use clk bulk API for 2.4.0 controllers
* remotes/lorenzo/pci/mobiveil:
PCI: mobiveil: Fix INTx interrupt clearing in mobiveil_pcie_isr()
PCI: mobiveil: Fix infinite-loop in the INTx handling function
PCI: mobiveil: Move PCIe PIO enablement out of inbound window routine
PCI: mobiveil: Add upper 32-bit PCI base address setup in inbound window
PCI: mobiveil: Add upper 32-bit CPU base address setup in outbound window
PCI: mobiveil: Mask out hardcoded bits in inbound/outbound windows setup
PCI: mobiveil: Clear the control fields before updating it
PCI: mobiveil: Add configured inbound windows counter
PCI: mobiveil: Fix the valid check for inbound and outbound windows
PCI: mobiveil: Clean-up program_{ib/ob}_windows()
PCI: mobiveil: Remove an unnecessary return value check
PCI: mobiveil: Fix error return values
PCI: mobiveil: Refactor the MEM/IO outbound window initialization
PCI: mobiveil: Make some register updates more readable
PCI: mobiveil: Reformat the code for readability
dt-bindings: PCI: mobiveil: Change gpio_slave and apb_csr to optional
PCI: mobiveil: Fix devfn check in mobiveil_pcie_valid_device()
PCI: mobiveil: Initialize Primary/Secondary/Subordinate bus numbers
PCI: mobiveil: Move IRQ chained handler setup out of DT parse
PCI: mobiveil: Move the link up waiting out of mobiveil_host_init()
PCI: mobiveil: Fix the Class Code field
PCI: mobiveil: Use the 1st inbound window for MEM inbound transactions
PCI: mobiveil: Use WIN_NUM_0 explicitly for CFG outbound window
PCI: mobiveil: Update the resource list traversal function
PCI: mobiveil: Fix PCI base address in MEM/IO outbound windows
PCI: mobiveil: Remove the flag MSI_FLAG_MULTI_PCI_MSI
PCI: mobiveil: Unify register accessors
- Fix dra7xx build error when !CONFIG_GPIOLIB (YueHaibing)
* remotes/lorenzo/pci/dwc:
PCI: dwc: pci-dra7xx: Fix compilation when !CONFIG_GPIOLIB
PCI: imx6: Simplify Kconfig depends on
PCI: dwc: Export APIs to support .remove() implementation
PCI: dwc: Cleanup DBI,ATU read and write APIs
PCI: dwc: Add API support to de-initialize host
- Allow building Altera host bridge driver as a module (Ley Foon Tan)
- Fix Altera Stratix 10 Type 1 to Type 0 config access conversion (Ley
Foon Tan)
* remotes/lorenzo/pci/altera:
PCI: altera: Fix configuration type based on secondary number
PCI: altera-msi: Allow building as module
PCI: altera: Allow building as module
coresight: Make the coresight_device_fwnode_match declaration's fwnode parameter const
Fix Linus' merge error in the parent commit, causing:
drivers/hwtracing/coresight/coresight.c:1051:11: error: incompatible pointer types passing 'int (struct device *, void *)' to parameter of type 'int (*)(struct device *, const void *)' [-Werror,-Wincompatible-pointer-types]
coresight_device_fwnode_match);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/linux/device.h:173:17: note: passing argument to parameter 'match' here
int (*match)(struct device *dev, const void *data));
^
due to missed header file fixup.
Fixes: a4c647baf9da ("Merge tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core") Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
[ Greg even sent this patch with his pull request, but I stupidly
thought it was the merge resolution fix I had already done as part of
the merge. But no, this was the extra fix for the header file
that goes with the definition I _had_ caught - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core and debugfs updates from Greg KH:
"Here is the "big" driver core and debugfs changes for 5.3-rc1
It's a lot of different patches, all across the tree due to some api
changes and lots of debugfs cleanups.
Other than the debugfs cleanups, in this set of changes we have:
- bus iteration function cleanups
- scripts/get_abi.pl tool to display and parse Documentation/ABI
entries in a simple way
- cleanups to Documenatation/ABI/ entries to make them parse easier
due to typos and other minor things
- default_attrs use for some ktype users
- driver model documentation file conversions to .rst
- compressed firmware file loading
- deferred probe fixes
All of these have been in linux-next for a while, with a bunch of
merge issues that Stephen has been patient with me for"
* tag 'driver-core-5.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (102 commits)
debugfs: make error message a bit more verbose
orangefs: fix build warning from debugfs cleanup patch
ubifs: fix build warning after debugfs cleanup patch
driver: core: Allow subsystems to continue deferring probe
drivers: base: cacheinfo: Ensure cpu hotplug work is done before Intel RDT
arch_topology: Remove error messages on out-of-memory conditions
lib: notifier-error-inject: no need to check return value of debugfs_create functions
swiotlb: no need to check return value of debugfs_create functions
ceph: no need to check return value of debugfs_create functions
sunrpc: no need to check return value of debugfs_create functions
ubifs: no need to check return value of debugfs_create functions
orangefs: no need to check return value of debugfs_create functions
nfsd: no need to check return value of debugfs_create functions
lib: 842: no need to check return value of debugfs_create functions
debugfs: provide pr_fmt() macro
debugfs: log errors when something goes wrong
drivers: s390/cio: Fix compilation warning about const qualifiers
drivers: Add generic helper to match by of_node
driver_find_device: Unify the match function with class_find_device()
bus_find_device: Unify the match callback with class_find_device
...
Merge updates from Andrew Morton:
"Am experimenting with splitting MM up into identifiable subsystems
perhaps with a view to gitifying it in complex ways. Also with more
verbose "incoming" emails.
hotfixes:
mm: vmscan: scan anonymous pages on file refaults
mm/nvdimm: add is_ioremap_addr and use that to check ioremap address
mm/memcontrol: fix wrong statistics in memory.stat
mm/z3fold.c: lock z3fold page before __SetPageMovable()
nilfs2: do not use unexported cpu_to_le32()/le32_to_cpu() in uapi header
MAINTAINERS: nilfs2: update email address
iommu:
include/linux/dmar.h: replace single-char identifiers in macros
scripts:
scripts/decode_stacktrace: match basepath using shell prefix operator, not regex
scripts/decode_stacktrace: look for modules with .ko.debug extension
scripts/spelling.txt: drop "sepc" from the misspelling list
scripts/spelling.txt: add spelling fix for prohibited
scripts/decode_stacktrace: Accept dash/underscore in modules
scripts/spelling.txt: add more spellings to spelling.txt
arch/sh:
arch/sh/configs/sdk7786_defconfig: remove CONFIG_LOGFS
sh: config: remove left-over BACKLIGHT_LCD_SUPPORT
sh: prevent warnings when using iounmap
ocfs2:
fs: ocfs: fix spelling mistake "hearbeating" -> "heartbeat"
ocfs2/dlm: use struct_size() helper
ocfs2: add last unlock times in locking_state
ocfs2: add locking filter debugfs file
ocfs2: add first lock wait time in locking_state
ocfs: no need to check return value of debugfs_create functions
fs/ocfs2/dlmglue.c: unneeded variable: "status"
ocfs2: use kmemdup rather than duplicating its implementation
mm:slab-generic:
Patch series "mm/slab: Improved sanity checking":
mm/slab: validate cache membership under freelist hardening
mm/slab: sanity-check page type when looking up cache
lkdtm/heap: add tests for freelist hardening
mm:slub:
mm/slub.c: avoid double string traverse in kmem_cache_flags()
slub: don't panic for memcg kmem cache creation failure
mm:kmemleak:
mm/kmemleak.c: fix check for softirq context
mm/kmemleak.c: change error at _write when kmemleak is disabled
docs: kmemleak: add more documentation details
mm:kasan:
mm/kasan: print frame description for stack bugs
Patch series "Bitops instrumentation for KASAN", v5:
lib/test_kasan: add bitops tests
x86: use static_cpu_has in uaccess region to avoid instrumentation
asm-generic, x86: add bitops instrumentation for KASAN
Patch series "mm/kasan: Add object validation in ksize()", v3:
mm/kasan: introduce __kasan_check_{read,write}
mm/kasan: change kasan_check_{read,write} to return boolean
lib/test_kasan: Add test for double-kzfree detection
mm/slab: refactor common ksize KASAN logic into slab_common.c
mm/kasan: add object validation in ksize()
mm:cleanups:
include/linux/pfn_t.h: remove pfn_t_to_virt()
Patch series "remove ARCH_SELECT_MEMORY_MODEL where it has no effect":
arm: remove ARCH_SELECT_MEMORY_MODEL
s390: remove ARCH_SELECT_MEMORY_MODEL
sparc: remove ARCH_SELECT_MEMORY_MODEL
mm/gup.c: make follow_page_mask() static
mm/memory.c: trivial clean up in insert_page()
mm: make !CONFIG_HUGE_PAGE wrappers into static inlines
include/linux/mm_types.h: ifdef struct vm_area_struct::swap_readahead_info
mm: remove the account_page_dirtied export
mm/page_isolation.c: change the prototype of undo_isolate_page_range()
include/linux/vmpressure.h: use spinlock_t instead of struct spinlock
mm: remove the exporting of totalram_pages
include/linux/pagemap.h: document trylock_page() return value
mm:debug:
mm/failslab.c: by default, do not fail allocations with direct reclaim only
Patch series "debug_pagealloc improvements":
mm, debug_pagelloc: use static keys to enable debugging
mm, page_alloc: more extensive free page checking with debug_pagealloc
mm, debug_pagealloc: use a page type instead of page_ext flag
mm:pagecache:
Patch series "fix filler_t callback type mismatches", v2:
mm/filemap.c: fix an overly long line in read_cache_page
mm/filemap: don't cast ->readpage to filler_t for do_read_cache_page
jffs2: pass the correct prototype to read_cache_page
9p: pass the correct prototype to read_cache_page
mm/filemap.c: correct the comment about VM_FAULT_RETRY
mm:swap:
mm, swap: fix race between swapoff and some swap operations
mm/swap_state.c: simplify total_swapcache_pages() with get_swap_device()
mm, swap: use rbtree for swap_extent
mm/mincore.c: fix race between swapoff and mincore
mm:memcg:
memcg, oom: no oom-kill for __GFP_RETRY_MAYFAIL
memcg, fsnotify: no oom-kill for remote memcg charging
mm, memcg: introduce memory.events.local
mm: memcontrol: dump memory.stat during cgroup OOM
Patch series "mm: reparent slab memory on cgroup removal", v7:
mm: memcg/slab: postpone kmem_cache memcg pointer initialization to memcg_link_cache()
mm: memcg/slab: rename slab delayed deactivation functions and fields
mm: memcg/slab: generalize postponed non-root kmem_cache deactivation
mm: memcg/slab: introduce __memcg_kmem_uncharge_memcg()
mm: memcg/slab: unify SLAB and SLUB page accounting
mm: memcg/slab: don't check the dying flag on kmem_cache creation
mm: memcg/slab: synchronize access to kmem_cache dying flag using a spinlock
mm: memcg/slab: rework non-root kmem_cache lifecycle management
mm: memcg/slab: stop setting page->mem_cgroup pointer for slab pages
mm: memcg/slab: reparent memcg kmem_caches on cgroup removal
mm, memcg: add a memcg_slabinfo debugfs file
mm:gup:
Patch series "switch the remaining architectures to use generic GUP", v4:
mm: use untagged_addr() for get_user_pages_fast addresses
mm: simplify gup_fast_permitted
mm: lift the x86_32 PAE version of gup_get_pte to common code
MIPS: use the generic get_user_pages_fast code
sh: add the missing pud_page definition
sh: use the generic get_user_pages_fast code
sparc64: add the missing pgd_page definition
sparc64: define untagged_addr()
sparc64: use the generic get_user_pages_fast code
mm: rename CONFIG_HAVE_GENERIC_GUP to CONFIG_HAVE_FAST_GUP
mm: reorder code blocks in gup.c
mm: consolidate the get_user_pages* implementations
mm: validate get_user_pages_fast flags
mm: move the powerpc hugepd code to mm/gup.c
mm: switch gup_hugepte to use try_get_compound_head
mm: mark the page referenced in gup_hugepte
mm/gup: speed up check_and_migrate_cma_pages() on huge page
mm/gup.c: remove some BUG_ONs from get_gate_page()
mm/gup.c: mark undo_dev_pagemap as __maybe_unused
mm:pagemap:
asm-generic, x86: introduce generic pte_{alloc,free}_one[_kernel]
alpha: switch to generic version of pte allocation
arm: switch to generic version of pte allocation
arm64: switch to generic version of pte allocation
csky: switch to generic version of pte allocation
m68k: sun3: switch to generic version of pte allocation
mips: switch to generic version of pte allocation
nds32: switch to generic version of pte allocation
nios2: switch to generic version of pte allocation
parisc: switch to generic version of pte allocation
riscv: switch to generic version of pte allocation
um: switch to generic version of pte allocation
unicore32: switch to generic version of pte allocation
mm/pgtable: drop pgtable_t variable from pte_fn_t functions
mm/memory.c: fail when offset == num in first check of __vm_map_pages()
mm:infrastructure:
mm/mmu_notifier: use hlist_add_head_rcu()
mm:vmalloc:
Patch series "Some cleanups for the KVA/vmalloc", v5:
mm/vmalloc.c: remove "node" argument
mm/vmalloc.c: preload a CPU with one object for split purpose
mm/vmalloc.c: get rid of one single unlink_va() when merge
mm/vmalloc.c: switch to WARN_ON() and move it under unlink_va()
mm/vmalloc.c: spelling> s/informaion/information/
mm:initialization:
mm/large system hash: use vmalloc for size > MAX_ORDER when !hashdist
mm/large system hash: clear hashdist when only one node with memory is booted
mm:pagealloc:
arm64: move jump_label_init() before parse_early_param()
Patch series "add init_on_alloc/init_on_free boot options", v10:
mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options
mm: init: report memory auto-initialization features at boot time
mm:vmscan:
mm: vmscan: remove double slab pressure by inc'ing sc->nr_scanned
mm: vmscan: correct some vmscan counters for THP swapout
mm:tools:
tools/vm/slabinfo: order command line options
tools/vm/slabinfo: add partial slab listing to -X
tools/vm/slabinfo: add option to sort by partial slabs
tools/vm/slabinfo: add sorting info to help menu
mm:proc:
proc: use down_read_killable mmap_sem for /proc/pid/maps
proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup
proc: use down_read_killable mmap_sem for /proc/pid/pagemap
proc: use down_read_killable mmap_sem for /proc/pid/clear_refs
proc: use down_read_killable mmap_sem for /proc/pid/map_files
mm: use down_read_killable for locking mmap_sem in access_remote_vm
mm: smaps: split PSS into components
mm: vmalloc: show number of vmalloc pages in /proc/meminfo
mm:oom-kill:
mm: memcontrol: use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks()
mm, oom: refactor dump_tasks for memcg OOMs
mm, oom: remove redundant task_in_mem_cgroup() check
oom: decouple mems_allowed from oom_unkillable_task
mm/oom_kill.c: remove redundant OOM score normalization in select_bad_process()"
* akpm: (147 commits)
mm/oom_kill.c: remove redundant OOM score normalization in select_bad_process()
oom: decouple mems_allowed from oom_unkillable_task
mm, oom: remove redundant task_in_mem_cgroup() check
mm, oom: refactor dump_tasks for memcg OOMs
mm: memcontrol: use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks()
mm/memory-failure.c: clarify error message
mm: vmalloc: show number of vmalloc pages in /proc/meminfo
mm: smaps: split PSS into components
mm: use down_read_killable for locking mmap_sem in access_remote_vm
proc: use down_read_killable mmap_sem for /proc/pid/map_files
proc: use down_read_killable mmap_sem for /proc/pid/clear_refs
proc: use down_read_killable mmap_sem for /proc/pid/pagemap
proc: use down_read_killable mmap_sem for /proc/pid/smaps_rollup
proc: use down_read_killable mmap_sem for /proc/pid/maps
tools/vm/slabinfo: add sorting info to help menu
tools/vm/slabinfo: add option to sort by partial slabs
tools/vm/slabinfo: add partial slab listing to -X
tools/vm/slabinfo: order command line options
mm: vmscan: correct some vmscan counters for THP swapout
mm: vmscan: remove double slab pressure by inc'ing sc->nr_scanned
...
net/mlx5e: Convert single case statement switch statements into if statements
During the review of commit a543bf7e0068 ("net/mlx5e: Return in default
case statement in tx_post_resync_params"), Leon and Nick pointed out
that the switch statements can be converted to single if statements
that return early so that the code is easier to follow.
Suggested-by: Leon Romanovsky <leon@kernel.org> Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
oom: decouple mems_allowed from oom_unkillable_task
Commit d45e01294d66 ("[PATCH] cpusets: confine oom_killer to
mem_exclusive cpuset") introduces a heuristic where a potential
oom-killer victim is skipped if the intersection of the potential victim
and the current (the process triggered the oom) is empty based on the
reason that killing such victim most probably will not help the current
allocating process.
However the commit 150f593a85a1 ("[PATCH] oom: cpuset hint") changed the
heuristic to just decrease the oom_badness scores of such potential
victim based on the reason that the cpuset of such processes might have
changed and previously they may have allocated memory on mems where the
current allocating process can allocate from.
Unintentionally 150f593a85a1 ("[PATCH] oom: cpuset hint") introduced a
side effect as the oom_badness is also exposed to the user space through
/proc/[pid]/oom_score, so, readers with different cpusets can read
different oom_score of the same process.
Later, commit d0e426a94ef6 ("oom: filter tasks not sharing the same
cpuset") fixed the side effect introduced by 150f593a85a1 by moving the
cpuset intersection back to only oom-killer context and out of
oom_badness. However the combination of 1237d1e7bb88 ("oom: make
oom_unkillable_task() helper function") and 7b11f3e89545 ("oom:
/proc/<pid>/oom_score treat kernel thread honestly") unintentionally
brought back the cpuset intersection check into the oom_badness
calculation function.
Other than doing cpuset/mempolicy intersection from oom_badness, the memcg
oom context is also doing cpuset/mempolicy intersection which is quite
wrong and is caught by syzcaller with the following report:
The fix is to decouple the cpuset/mempolicy intersection check from
oom_unkillable_task() and make sure cpuset/mempolicy intersection check is
only done in the global oom context.
[shakeelb@google.com: change function name and update comment] Link: http://lkml.kernel.org/r/20190628152421.198994-3-shakeelb@google.com Link: http://lkml.kernel.org/r/20190624212631.87212-3-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Reported-by: syzbot+d0fc9d3c166bc5e4a94b@syzkaller.appspotmail.com Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Paul Jackson <pj@sgi.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm, oom: remove redundant task_in_mem_cgroup() check
oom_unkillable_task() can be called from three different contexts i.e.
global OOM, memcg OOM and oom_score procfs interface. At the moment
oom_unkillable_task() does a task_in_mem_cgroup() check on the given
process. Since there is no reason to perform task_in_mem_cgroup()
check for global OOM and oom_score procfs interface, those contexts
provide NULL memcg and skips the task_in_mem_cgroup() check. However
for memcg OOM context, the oom_unkillable_task() is always called from
mem_cgroup_scan_tasks() and thus task_in_mem_cgroup() check becomes
redundant and effectively dead code. So, just remove the
task_in_mem_cgroup() check altogether.
Link: http://lkml.kernel.org/r/20190624212631.87212-2-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Paul Jackson <pj@sgi.com> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
dump_tasks() traverses all the existing processes even for the memcg OOM
context which is not only unnecessary but also wasteful. This imposes a
long RCU critical section even from a contained context which can be quite
disruptive.
Change dump_tasks() to be aligned with select_bad_process and use
mem_cgroup_scan_tasks to selectively traverse only processes of the target
memcg hierarchy during memcg OOM.
Link: http://lkml.kernel.org/r/20190617231207.160865-1-shakeelb@google.com Signed-off-by: Shakeel Butt <shakeelb@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Roman Gushchin <guro@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Paul Jackson <pj@sgi.com> Cc: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm: memcontrol: use CSS_TASK_ITER_PROCS at mem_cgroup_scan_tasks()
Since commit 0940dd91a4dc ("cgroup: Include dying leaders with live
threads in PROCS iterations") corrected how CSS_TASK_ITER_PROCS works,
mem_cgroup_scan_tasks() can use CSS_TASK_ITER_PROCS in order to check
only one thread from each thread group.