Linus Torvalds [Wed, 10 May 2017 18:33:08 +0000 (11:33 -0700)]
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio updates from Michael Tsirkin:
"Fixes, cleanups, performance
A bunch of changes to virtio, most affecting virtio net. Also ptr_ring
batched zeroing - first of batching enhancements that seems ready."
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
s390/virtio: change maintainership
tools/virtio: fix spelling mistake: "wakeus" -> "wakeups"
virtio_net: tidy a couple debug statements
ptr_ring: support testing different batching sizes
ringtest: support test specific parameters
ptr_ring: batch ring zeroing
virtio: virtio_driver doc
virtio_net: don't reset twice on XDP on/off
virtio_net: fix support for small rings
virtio_net: reduce alignment for buffers
virtio_net: rework mergeable buffer handling
virtio_net: allow specifying context for rx
virtio: allow extra context per descriptor
tools/virtio: fix build breakage
virtio: add context flag to find vqs
virtio: wrap find_vqs
ringtest: fix an assert statement
Linus Torvalds [Wed, 10 May 2017 18:29:23 +0000 (11:29 -0700)]
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more KVM updates from Paolo Bonzini:
"ARM:
- bugfixes
- moved shared 32-bit/64-bit files to virt/kvm/arm
- support for saving/restoring virtual ITS state to userspace
PPC:
- XIVE (eXternal Interrupt Virtualization Engine) support
x86:
- nVMX improvements, including emulated page modification logging
(PML) which brings nice performance improvements on some workloads"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (45 commits)
KVM: arm/arm64: vgic-its: Cleanup after failed ITT restore
KVM: arm/arm64: Don't call map_resources when restoring ITS tables
KVM: arm/arm64: Register ITS iodev when setting base address
KVM: arm/arm64: Get rid of its->initialized field
KVM: arm/arm64: Register iodevs when setting redist base and creating VCPUs
KVM: arm/arm64: Slightly rework kvm_vgic_addr
KVM: arm/arm64: Make vgic_v3_check_base more broadly usable
KVM: arm/arm64: Refactor vgic_register_redist_iodevs
KVM: Add kvm_vcpu_get_idx to get vcpu index in kvm->vcpus
nVMX: Advertise PML to L1 hypervisor
nVMX: Implement emulated Page Modification Logging
kvm: x86: Add a hook for arch specific dirty logging emulation
kvm: nVMX: Validate CR3 target count on nested VM-entry
KVM: set no_llseek in stat_fops_per_vm
KVM: arm/arm64: vgic: Rename kvm_vgic_vcpu_init to kvm_vgic_vcpu_enable
KVM: arm/arm64: Clarification and relaxation to ITS save/restore ABI
KVM: arm64: vgic-v3: KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES
KVM: arm64: vgic-its: Fix pending table sync
KVM: arm64: vgic-its: ITT save and restore
KVM: arm64: vgic-its: Device table save/restore
...
Linus Torvalds [Wed, 10 May 2017 18:26:25 +0000 (11:26 -0700)]
Merge tag 'trace-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fix from Steven Rostedt:
"This is a trivial patch that changes a check for a cpumask from a NULL
pointer to using cpumask_available(), which will do the check. This is
because cpumasks when not allocated are always set, and clang
complains about it"
* tag 'trace-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing: Use cpumask_available() to check if cpumask variable may be used
Linus Torvalds [Wed, 10 May 2017 18:20:09 +0000 (11:20 -0700)]
Merge tag 'armsoc-tee' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull TEE driver infrastructure and OP-TEE drivers from Arnd Bergmann:
"This introduces a generic TEE framework in the kernel, to handle
trusted environemtns (security coprocessor or software implementations
such as OP-TEE/TrustZone). I'm sending it separately from the other
arm-soc driver changes to give it a little more visibility, once the
subsystem is merged, we will likely keep this in the armâ‚‹soc drivers
branch or have the maintainers submit pull requests directly,
depending on the patch volume.
I have reviewed earlier versions in the past, and have reviewed the
latest version in person during Linaro Connect BUD17.
Here is my overall assessment of the subsystem:
- There is clearly demand for this, both for the generic
infrastructure and the specific OP-TEE implementation.
- The code has gone through a large number of reviews, and the review
comments have all been addressed, but the reviews were not coming
up with serious issues any more and nobody volunteered to vouch for
the quality.
- The user space ioctl interface is sufficient to work with the
OP-TEE driver, and it should in principle work with other TEE
implementations that follow the GlobalPlatform[1] standards, but it
might need to be extended in minor ways depending on specific
requirements of future TEE implementations
- The main downside of the API to me is how the user space is tied to
the TEE implementation in hardware or firmware, but uses a generic
way to communicate with it. This seems to be an inherent problem
with what it is trying to do, and I could not come up with any
better solution than what is implemented here.
For a detailed history of the patch series, see
https://lkml.org/lkml/2017/3/10/1277"
* tag 'armsoc-tee' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm64: dt: hikey: Add optee node
Documentation: tee subsystem and op-tee driver
tee: add OP-TEE driver
tee: generic TEE subsystem
dt/bindings: add bindings for optee
- Improve the performance of Tree SRCU on a CPU-hotplug stress test
- Documentation updates
- Miscellaneous fixes"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (74 commits)
rcu: Open-code the rcu_cblist_n_lazy_cbs() function
rcu: Open-code the rcu_cblist_n_cbs() function
rcu: Open-code the rcu_cblist_empty() function
rcu: Separately compile large rcu_segcblist functions
srcu: Debloat the <linux/rcu_segcblist.h> header
srcu: Adjust default auto-expediting holdoff
srcu: Specify auto-expedite holdoff time
srcu: Expedite first synchronize_srcu() when idle
srcu: Expedited grace periods with reduced memory contention
srcu: Make rcutorture writer stalls print SRCU GP state
srcu: Exact tracking of srcu_data structures containing callbacks
srcu: Make SRCU be built by default
srcu: Fix Kconfig botch when SRCU not selected
rcu: Make non-preemptive schedule be Tasks RCU quiescent state
srcu: Expedite srcu_schedule_cbs_snp() callback invocation
srcu: Parallelize callback handling
kvm: Move srcu_struct fields to end of struct kvm
rcu: Fix typo in PER_RCU_NODE_PERIOD header comment
rcu: Use true/false in assignment to bool
rcu: Use bool value directly
...
Linus Torvalds [Wed, 10 May 2017 16:35:42 +0000 (09:35 -0700)]
Merge tag 'acpi-extra-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more ACPI updates from Rafael Wysocki:
"These update the ACPICA code in the kernel to upstream revision 20170303 which adds a few minor fixes and improvements, update ACPI
SoC drivers with new device IDs, platform-related information and
similar, fix the register information in the xpower PMIC driver,
introduce a concept of "always present" devices to the ACPI device
enumeration code and use it to fix a problem with one platform, and
fix a system resume issue related to power resources.
Specifics:
- Update the ACPICA code in the kernel to upstream revision 20170303
which includes:
* Minor fixes and improvements in the core code (Bob Moore,
Seunghun Han).
* Debugger fixes (Colin Ian King, Lv Zheng).
* Compiler/disassembler improvements (Bob Moore, David Box, Lv
Zheng).
* Build-related update (Lv Zheng).
- Add new device IDs and platform-related information to the ACPI
drivers for Intel (LPSS) and AMD (APD) SoCs (Hanjun Guo, Hans de
Goede).
- Make it possible to quirk ACPI-enumerated devices as "always
present" on platforms where they are incorrectly reported as not
present by the AML and add the INT0002 device ID to the list of
"always present" devices (Hans de Goede).
- Fix the register information in the xpower PMIC driver and add
comments to map the registers to symbols used by AML to it (Hans de
Goede).
- Move the code turning off unused ACPI power resources during system
resume to a point after all devices have been resumed to avoid
issues with power resources that do not behave as expected (Hans de
Goede)"
* tag 'acpi-extra-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (22 commits)
ACPI / power: Delay turning off unused power resources after suspend
ACPI / PMIC: xpower: Fix power_table addresses
ACPI / LPSS: Call pwm_add_table() for Bay Trail PWM device
ACPICA: Update version to 20170303
ACPICA: iasl: add ASL conversion tool
ACPICA: Local cache support: Allow small cache objects
ACPICA: Disassembler: Do not unconditionally remove temporary names
ACPICA: iasl: Fix IORT SMMU GSI disassembling
ACPICA: Cleanup AML opcode definitions, no functional change
ACPICA: Debugger: Add interpreter blocking mark for single-step mode
ACPICA: debugger: fix memory leak on Pathname
ACPICA: Update for automatic repair code for objects returned by evaluate_object
ACPICA: Namespace: fix operand cache leak
ACPICA: Fix several incorrect invocations of ACPICA return macro
ACPICA: Fix a module for excessive debug output
ACPICA: Update some function headers, no funtional change
ACPICA: Disassembler: Enhance resource descriptor detection
i2c: designware: Add ACPI HID for Hisilicon Hip07/08 I2C controller
ACPI / APD: Add clock frequency for Hisilicon Hip07/08 I2C controller
ACPI / bus: Add INT0002 to list of always-present devices
...
Linus Torvalds [Wed, 10 May 2017 16:12:30 +0000 (09:12 -0700)]
Merge tag 'pm-extra-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
"These add new CPU IDs to a couple of drivers, fix a possible NULL
pointer dereference in the cpuidle core, update DT-related things in
the generic power domains framework and finally update the
suspend/resume infrastructure to improve the handling of wakeups from
suspend-to-idle.
Specifics:
- Add Intel Gemini Lake CPU IDs to the intel_idle and intel_rapl
drivers (David Box).
- Add a NULL pointer check to the cpuidle core to prevent it from
crashing on platforms with incomplete cpuidle configuration (Fei
Li).
- Fix DT-related documentation in the generic power domains (genpd)
framework and add a MAINTAINERS entry for DT-related material in
genpd (Viresh Kumar).
- Update the system suspend/resume infrastructure to improve the
handling of aborts of suspend transitions in progress in the wakeup
framework and rework the suspend-to-idle core loop to make it
possible to filter out spurious wakeup events (specifically the
ones coming from ACPI) without resuming all the way up to user
space every time (Rafael Wysocki)"
* tag 'pm-extra-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / sleep: Ignore spurious SCI wakeups from suspend-to-idle
PM / wakeup: Integrate mechanism to abort transitions in progress
x86/intel_idle: add Gemini Lake support
cpuidle: check dev before usage in cpuidle_use_deepest_state()
powercap: intel_rapl: Add support for Gemini Lake
PM / Domains: Add DT file to MAINTAINERS
PM / Domains: Fix DT example
Linus Torvalds [Wed, 10 May 2017 16:03:48 +0000 (09:03 -0700)]
Merge branch 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs update from Miklos Szeredi:
"The biggest part of this is making st_dev/st_ino on the overlay behave
like a normal filesystem (i.e. st_ino doesn't change on copy up,
st_dev is the same for all files and directories). Currently this only
works if all layers are on the same filesystem, but future work will
move the general case towards more sane behavior.
There are also miscellaneous fixes, including fixes to handling
append-only files. There's a small change in the VFS, but that only
has an effect on overlayfs, since otherwise file->f_path.dentry->inode
and file_inode(file) are always the same"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: update documentation w.r.t. constant inode numbers
ovl: persistent inode numbers for upper hardlinks
ovl: merge getattr for dir and nondir
ovl: constant st_ino/st_dev across copy up
ovl: persistent inode number for directories
ovl: set the ORIGIN type flag
ovl: lookup non-dir copy-up-origin by file handle
ovl: use an auxiliary var for overlay root entry
ovl: store file handle of lower inode on copy up
ovl: check if all layers are on the same fs
ovl: do not set overlay.opaque on non-dir create
ovl: check IS_APPEND() on real upper inode
vfs: ftruncate check IS_APPEND() on real upper inode
ovl: Use designated initializers
ovl: lockdep annotate of nested stacked overlayfs inode lock
Linus Torvalds [Wed, 10 May 2017 15:45:30 +0000 (08:45 -0700)]
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi:
"Support for pid namespaces from Seth and refcount_t work from Elena"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: Add support for pid namespaces
fuse: convert fuse_conn.count from atomic_t to refcount_t
fuse: convert fuse_req.count from atomic_t to refcount_t
fuse: convert fuse_file.count from atomic_t to refcount_t
Linus Torvalds [Wed, 10 May 2017 15:42:33 +0000 (08:42 -0700)]
Merge tag 'ceph-for-4.12-rc1' of git://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"The two main items are support for disabling automatic rbd exclusive
lock transfers from myself and the long awaited -ENOSPC handling
series from Jeff.
The former will allow rbd users to take advantage of exclusive lock's
built-in blacklist/break-lock functionality while staying in control
of who owns the lock. With the latter in place, we will abort
filesystem writes on -ENOSPC instead of having them block
indefinitely.
Beyond that we've got the usual pile of filesystem fixes from Zheng,
some refcount_t conversion patches from Elena and a patch for an
ancient open() flags handling bug from Alexander"
* tag 'ceph-for-4.12-rc1' of git://github.com/ceph/ceph-client: (31 commits)
ceph: fix memory leak in __ceph_setxattr()
ceph: fix file open flags on ppc64
ceph: choose readdir frag based on previous readdir reply
rbd: exclusive map option
rbd: return ResponseMessage result from rbd_handle_request_lock()
rbd: kill rbd_is_lock_supported()
rbd: support updating the lock cookie without releasing the lock
rbd: store lock cookie
rbd: ignore unlock errors
rbd: fix error handling around rbd_init_disk()
rbd: move rbd_unregister_watch() call into rbd_dev_image_release()
rbd: move rbd_dev_destroy() call out of rbd_dev_image_release()
ceph: when seeing write errors on an inode, switch to sync writes
Revert "ceph: SetPageError() for writeback pages if writepages fails"
ceph: handle epoch barriers in cap messages
libceph: add an epoch_barrier field to struct ceph_osd_client
libceph: abort already submitted but abortable requests when map or pool goes full
libceph: allow requests to return immediately on full conditions if caller wishes
libceph: remove req->r_replay_version
ceph: make seeky readdir more efficient
...
Linus Torvalds [Wed, 10 May 2017 15:33:17 +0000 (08:33 -0700)]
Merge branch 'for-linus-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
Pull btrfs updates from Chris Mason:
"This has fixes and cleanups Dave Sterba collected for the merge
window.
The biggest functional fixes are between btrfs raid5/6 and scrub, and
raid5/6 and device replacement. Some of our pending qgroup fixes are
included as well while I bash on the rest in testing.
We also have the usual set of cleanups, including one that makes
__btrfs_map_block() much more maintainable, and conversions from
atomic_t to refcount_t"
* 'for-linus-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (71 commits)
btrfs: fix the gfp_mask for the reada_zones radix tree
Btrfs: fix reported number of inode blocks
Btrfs: send, fix file hole not being preserved due to inline extent
Btrfs: fix extent map leak during fallocate error path
Btrfs: fix incorrect space accounting after failure to insert inline extent
Btrfs: fix invalid attempt to free reserved space on failure to cow range
btrfs: Handle delalloc error correctly to avoid ordered extent hang
btrfs: Fix metadata underflow caused by btrfs_reloc_clone_csum error
btrfs: check if the device is flush capable
btrfs: delete unused member nobarriers
btrfs: scrub: Fix RAID56 recovery race condition
btrfs: scrub: Introduce full stripe lock for RAID56
btrfs: Use ktime_get_real_ts for root ctime
Btrfs: handle only applicable errors returned by btrfs_get_extent
btrfs: qgroup: Fix qgroup corruption caused by inode_cache mount option
btrfs: use q which is already obtained from bdev_get_queue
Btrfs: switch to div64_u64 if with a u64 divisor
Btrfs: update scrub_parity to use u64 stripe_len
Btrfs: enable repair during read for raid56 profile
btrfs: use clear_page where appropriate
...
Pull sparc updates from David Miller:
"sparc changes, including a bug fix for handling exceptions during
bzero on some sparc64 cpus"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: fix fault handling in NGbzero.S and GENbzero.S
sparc: use memdup_user_nul in sun4m LED driver
sparc: Remove redundant tests in boot_flags_init().
Linus Torvalds [Tue, 9 May 2017 22:40:28 +0000 (15:40 -0700)]
Merge tag 'dmaengine-4.12-rc1' of git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine updates from Vinod Koul:
"This time again a smaller update consisting of:
- support for TI DA8xx dma controller and updates to the cppi driver
- updates on bunch of drivers like xilinx, pl08x, stm32-dma, mv_xor,
ioat, dmatest"
* tag 'dmaengine-4.12-rc1' of git://git.infradead.org/users/vkoul/slave-dma: (35 commits)
dmaengine: pl08x: remove lock documentation
dmaengine: pl08x: fix pl08x_dma_chan_state documentation
dmaengine: pl08x: Use the BIT() macro consistently
dmaengine: pl080: Fix some missing kerneldoc
dmaengine: pl080: Cut some unused defines
dmaengine: dmatest: Add check for supported buffer count (sg_buffers)
dmaengine: dmatest: Select DMA_ENGINE_RAID as its needed for the slave_sg test
dmaengine: virt-dma: Convert to use list_for_each_entry_safe()
dma-debug: use offset_in_page() macro
dmaengine: mv_xor: use offset_in_page() macro
dmaengine: dmatest: use offset_in_page() macro
dmaengine: sun4i: fix invalid argument
dmaengine: ioat: use setup_timer
dmaengine: cppi41: Fix an Oops happening in cppi41_dma_probe()
dmaengine: pl330: remove pdata based initialization
dmaengine: cppi: fix build error due to bad variable
dmaengine: imx-sdma: add 1ms delay to ensure SDMA channel is stopped
dmaengine: cppi41: use managed functions devm_*()
dmaengine: cppi41: fix cppi41_dma_tx_status() logic
dmaengine: qcom_hidma: pause the channel on shutdown
...
Linus Torvalds [Tue, 9 May 2017 22:34:20 +0000 (15:34 -0700)]
Merge tag 'pwm/for-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
Pull pwm updates from Thierry Reding:
"Adds a new driver for the PWM controller found on MediaTek SoCs and
extends support for the Atmel PWM controller to include the SAMA5D2.
Some existing drivers have been migrated to the atomic API and a few
others see miscellaneous improvements"
* tag 'pwm/for-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
pwm: tegra: Read PWM clock source rate in driver init
pwm: pca9685: Fix GPIO-only operation
pwm: mediatek: Don't explicitly set .owner
pwm: tegra: Avoid potential overflow for short periods
pwm: tegra: Add support to configure pin state in suspends/resume
pwm: tegra: Add DT binding details to configure pin in suspends/resume
pwm: tegra: Increase precision in PWM rate calculation
pwm: tegra: Use DIV_ROUND_CLOSEST_ULL() instead of local implementation
pwm: Add MediaTek PWM support
dt-bindings: pwm: Add MediaTek PWM bindings
pwm: atmel: Enable PWM on sama5d2
pwm: atmel: Switch to atomic PWM
pwm: atmel-hlcdc: Implement the suspend/resume hooks
pwm: atmel-hlcdc: Convert to the atomic PWM API
Linus Torvalds [Tue, 9 May 2017 22:15:47 +0000 (15:15 -0700)]
Merge tag 'iommu-updates-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU updates from Joerg Roedel:
- code optimizations for the Intel VT-d driver
- ability to switch off a previously enabled Intel IOMMU
- support for 'struct iommu_device' for OMAP, Rockchip and Mediatek
IOMMUs
- header optimizations for IOMMU core code headers and a few fixes that
became necessary in other parts of the kernel because of that
- ACPI/IORT updates and fixes
- Exynos IOMMU optimizations
- updates for the IOMMU dma-api code to bring it closer to use per-cpu
iova caches
- new command-line option to set default domain type allocated by the
iommu core code
- another command line option to allow the Intel IOMMU switched off in
a tboot environment
- ARM/SMMU: TLB sync optimisations for SMMUv2, Support for using an
IDENTITY domain in conjunction with DMA ops, Support for SMR masking,
Support for 16-bit ASIDs (was previously broken)
- various other small fixes and improvements
* tag 'iommu-updates-v4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (63 commits)
soc/qbman: Move dma-mapping.h include to qman_priv.h
soc/qbman: Fix implicit header dependency now causing build fails
iommu: Remove trace-events include from iommu.h
iommu: Remove pci.h include from trace/events/iommu.h
arm: dma-mapping: Don't override dma_ops in arch_setup_dma_ops()
ACPI/IORT: Fix CONFIG_IOMMU_API dependency
iommu/vt-d: Don't print the failure message when booting non-kdump kernel
iommu: Move report_iommu_fault() to iommu.c
iommu: Include device.h in iommu.h
x86, iommu/vt-d: Add an option to disable Intel IOMMU force on
iommu/arm-smmu: Return IOVA in iova_to_phys when SMMU is bypassed
iommu/arm-smmu: Correct sid to mask
iommu/amd: Fix incorrect error handling in amd_iommu_bind_pasid()
iommu: Make iommu_bus_notifier return NOTIFY_DONE rather than error code
omap3isp: Remove iommu_group related code
iommu/omap: Add iommu-group support
iommu/omap: Make use of 'struct iommu_device'
iommu/omap: Store iommu_dev pointer in arch_data
iommu/omap: Move data structures to omap-iommu.h
iommu/omap: Drop legacy-style device support
...
* acpica:
ACPICA: Update version to 20170303
ACPICA: iasl: add ASL conversion tool
ACPICA: Local cache support: Allow small cache objects
ACPICA: Disassembler: Do not unconditionally remove temporary names
ACPICA: iasl: Fix IORT SMMU GSI disassembling
ACPICA: Cleanup AML opcode definitions, no functional change
ACPICA: Debugger: Add interpreter blocking mark for single-step mode
ACPICA: debugger: fix memory leak on Pathname
ACPICA: Update for automatic repair code for objects returned by evaluate_object
ACPICA: Namespace: fix operand cache leak
ACPICA: Fix several incorrect invocations of ACPICA return macro
ACPICA: Fix a module for excessive debug output
ACPICA: Update some function headers, no funtional change
ACPICA: Disassembler: Enhance resource descriptor detection
ACPICA: Add non-linux host build support
Eric Dumazet [Tue, 9 May 2017 13:29:19 +0000 (06:29 -0700)]
dccp/tcp: do not inherit mc_list from parent
syzkaller found a way to trigger double frees from ip_mc_drop_socket()
It turns out that leave a copy of parent mc_list at accept() time,
which is very bad.
Very similar to commit 99c32c3fa32f ("tcp: do not inherit
fastopen_req from parent")
Initial report from Pray3r, completed by Andrey one.
Thanks a lot to them !
Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Pray3r <pray3r.z@gmail.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Dave Aldridge [Tue, 9 May 2017 08:57:35 +0000 (02:57 -0600)]
sparc64: fix fault handling in NGbzero.S and GENbzero.S
When any of the functions contained in NGbzero.S and GENbzero.S
vector through *bzero_from_clear_user, we may end up taking a
fault when executing one of the store alternate address space
instructions. If this happens, the exception handler does not
restore the %asi register.
This commit fixes the issue by introducing a new exception
handler that ensures the %asi register is restored when
a fault is handled.
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com> Reviewed-by: Rob Gardner <rob.gardner@oracle.com> Reviewed-by: Babu Moger <babu.moger@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 9 May 2017 17:07:33 +0000 (10:07 -0700)]
Merge tag 'armsoc-dt64' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM 64-bit DT updates from Olof Johansson:
"Device-tree updates for arm64 platforms. Just as with 32-bit, a bunch
of smaller changes, but also some new platforms that are worth
mentioning:
- Rockchip RK3399 platforms for Chromebooks, including Samsung
Chromebook Plus (Kevin)
- Orange Pi PC2 (Allwinner H5)
- Freescale LS2088A and LS1088A SoCs
- Expanded support for Nvidia Tegra186 (and Jetson TX2)"
* tag 'armsoc-dt64' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (180 commits)
arm64: dts: Add basic DT to support Spreadtrum's SP9860G
arm64: dts: exynos: Use - instead of @ for DT OPP entries
arm64: dts: exynos: Add support for s6e3hf2 panel device on TM2e board
arm64: dts: juno: add information about L1 and L2 caches
arm64: dts: juno: fix few unit address format warnings
arm64: marvell: dts: enable the crypto engine on the Armada 8040 DB
arm64: marvell: dts: enable the crypto engine on the Armada 7040 DB
arm64: marvell: dts: add crypto engine description for 7k/8k
arm64: dts: marvell: add sdhci support for Armada 7K/8K
arm64: dts: marvell: add eMMC support for Armada 37xx
arm64: dts: hisi: add pinctrl dtsi file for HiKey960 development board
arm64: dts: hisi: add drive strength levels of the pins for Hi3660 SoC
arm64: dts: hisi: enable the NIC and SAS for the hip07-d05 board
arm64: dts: hisi: add SAS nodes for the hip07 SoC
arm64: dts: hisi: add RoCE nodes for the hip07 SoC
arm64: dts: hisi: add network related nodes for the hip07 SoC
arm64: dts: hisi: add mbigen nodes for the hip07 SoC
arm64: dts: rockchip: fix the memory size of PX5 Evaluation board
arm64: dts: hisilicon: add dts files for hi3798cv200-poplar board
dt-bindings: arm: hisilicon: add bindings for hi3798cv200 SoC and Poplar board
...
Linus Torvalds [Tue, 9 May 2017 17:04:17 +0000 (10:04 -0700)]
Merge tag 'armsoc-arm64' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC 64-bit changes from Olof Johansson:
"Changes to platform code for 64-bit ARM platforms.
Most of these are small changes to the one defconfig we use on arm64
(no per-platform configs there), to enable new drivers.
There are also a few other changes. Broadcom sold off their 'Vulcan'
design to Cavium, where it is now called ThunderX2. While we normally
don't rename stuff based on marketing's whims, it seemed appropriate
to bring in renames on a few things such as MAINTAINERS, etc"
* tag 'armsoc-arm64' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arm64: sunxi: always enable reset controller
arm64: defconfig: enable the Safexcel crypto engine as a module
arm64: configs: enable SDHCI driver for Xenon
MAINTAINERS: Broadcom Vulcan is now Cavium ThunderX2
arm64: defconfig: add Allwinner USB PHY
arm64: defconfig: enable MVPP2
arm64: defconfig: Enable video, DRM and LPASS drivers for Exynos5433 and Exynos7
arm64: exynos: Enable Exynos PMU and PM domains drivers
arm64: only select PINCTRL for Allwinner platforms
arm64: set CONFIG_MMC_BCM2835=y in defconfig
arm64: defconfig: enable I2C_PXA
arm64: defconfig: enable MVNETA
ARM64: defconfig: enable the leds-pwm driver and default-on trigger
arm64: defconfig: Enable SH Mobile I2C controller
Linus Torvalds [Tue, 9 May 2017 17:01:15 +0000 (10:01 -0700)]
Merge tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC driver updates from Olof Johansson:
"Driver updates for ARM SoCs:
Reset subsystem, merged through arm-soc by tradition:
- Make bool drivers explicitly non-modular
- New support for i.MX7 and Arria10 reset controllers
PATA driver for Palmchip BK371 (acked by Tejun)
Power domain drivers for i.MX (GPC, GPCv2)
- Moved out of mach-imx for GPC
- Bunch of tweaks, fixes, etc
PMC support for Tegra186
SoC detection support for Renesas RZ/G1H and RZ/G1N
Move Tegra flow controller driver from mach directory to drivers/soc
- (Power management / CPU power driver)
Misc smaller tweaks for other platforms"
* tag 'armsoc-drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (60 commits)
soc: pm-domain: Fix the mangled urls
soc: renesas: rcar-sysc: Add support for R-Car H3 ES2.0
soc: renesas: rcar-sysc: Add support for fixing up power area tables
soc: renesas: Register SoC device early
soc: imx: gpc: add workaround for i.MX6QP to the GPC PD driver
dt-bindings: imx-gpc: add i.MX6 QuadPlus compatible
soc: imx: gpc: add defines for domain index
soc: imx: Add GPCv2 power gating driver
dt-bindings: Add GPCv2 power gating driver
ARM/clk: move the ICST library to drivers/clk
ARM: plat-versatile: remove stale clock header
ARM: keystone: Drop PM domain support for k2g
soc: ti: Add ti_sci_pm_domains driver
dt-bindings: Add TI SCI PM Domains
PM / Domains: Do not check if simple providers have phandle cells
PM / Domains: Add generic data pointer to genpd data struct
soc/tegra: Add initial flowctrl support for Tegra132/210
soc/tegra: flowctrl: Add basic platform driver
soc/tegra: Move Tegra flowctrl driver
ARM: tegra: Remove unnecessary inclusion of flowctrl header
...
Linus Torvalds [Tue, 9 May 2017 16:58:15 +0000 (09:58 -0700)]
Merge tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM: SoC defconfig updates from Olof Johansson:
"We've traditionally kept defconfig updates in a separate branch, often
to encourage submaintainers to handle those patches separately to
avoid conflicts on the shared files. The amount of changes seem to be
decreasing though, so we might rethink how we handle this going
forward.
There really isn't much to write about here. The bulk of changes here
are enabling drivers for whatever platforms the hardware is found on
(and multi-configs)"
* tag 'armsoc-defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (30 commits)
multi_v7_defconfig: make Rockchip usb2-phy built-in
ARM: omap2plus_defconfig: Enable droid 4 devices
ARM: omap2plus_defconfig: Add QMI, ACM and PPP as loadable modules
ARM: configs: aspeed: Add new drivers
ARM: configs: aspeed: Update configs for BMC systems
ARM: omap2plus_defconfig: Enable TI Ethernet PHY
ARM: configs: Add new config fragment to change RAM start point
ARM: configs: stm32: Add I2C support
multi_v7_defconfig: make Rockchip DRM drivers built-in
ARM: configs: stm32: Set CPU_V7M_NUM_IRQ to max value
ARM: imx_v6_v7_defconfig: Select SMSC_PHY
ARM: davinci_all_defconfig: convert to use libata PATA
ARM: qcom_defconfig: Enable Qualcomm remoteproc and related drivers
ARM: omap2plus_defconfig: enable ahci-dm816 module
arm: set CONFIG_MMC_BCM2835=y in bcm2835_defconfig and multi_v7_defconfig
ARM: bcm2835: Enable missing CMA settings for VC4 driver
ARM: socfpga: updates for socfpga_defconfig
ARM: imx_v6_v7_defconfig: Select hid-multitouchdriver
ARM: imx_v6_v7_defconfig: Select max11801_ts touchscreen driver
ARM: exynos_defconfig: Increase CONFIG_CMA_SIZE_MBYTES to 96
...
Linus Torvalds [Tue, 9 May 2017 16:54:39 +0000 (09:54 -0700)]
Merge tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM Device-tree updates from Olof Johansson:
"Device-tree continues to see lots of updates. The majority of patches
here are smaller changes for new hardware on existing platforms, and
there are a few larger changes worth pointing out.
Major new platforms:
- Gemini has been ported to DT, so a handful of "new" platforms moved
over from board files
- Rockchip RK3288 support for Tinkerboard and Phytec phyCORE-RK3288
SoM and RDK
- A bunch of embedded platforms, several Linksys platforms, Synology
DS116,
- Motorola Droid4 (really old OMAP-based phone) support is added.
Some refactorings, i.e. Allwinner H3/H5 support is commonalized.
And lots of smaller changes, cleanups, etc. See shortlog for more
description
We're adding ability to cross-include DT files between arm and arm64,
by creating appropriate links in the dt-include directory, and using
arm/ and arm64/ as include prefixes. This will avoid other local hacks
such as per-file links between the two arch trees (this broke for
external mirroring of DT contents). Now they can just provide their
own appropriate dt-include hierarcy per platform"
* tag 'armsoc-dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (349 commits)
ARM: dts: exynos: Use - instead of @ for DT OPP entries
arm: spear6xx: add DT description of the ADC on SPEAr600
arm: spear6xx: remove unneeded pinctrl properties in spear600-evb
arm: spear6xx: switch spear600-evb to the new flash partition DT binding
arm: spear6xx: fix spaces in spear600-evb.dts
arm: spear6xx: use node labels in spear600-evb.dts
arm: spear6xx: add labels to various nodes in spear600.dtsi
ARM: dts: vexpress: fix few unit address format warnings
ARM: dts: at91: sama5d3_xplained: not all ADC channels are available
ARM: dts: at91: sama5d3_xplained: fix ADC vref
ARM: dts: at91: add envelope detector mux to the Axentia TSE-850
ARM: dts: armada-38x: label USB and SATA nodes
ARM: dts: imx6q-utilite-pro: add hpd gpio
ARM: dts: imx6qp-sabresd: Set reg_arm regulator supply
ARM: dts: imx6qdl-sabresd: Set LDO regulator supply
ARM: dts: imx: add Gateworks Ventana GW5903 support
ARM: dts: i.MX25: add AIPS control registers
ARM: dts: imx7-colibri: add Carrier Board 3.3V/5V regulators
ARM: dts: imx7-colibri: remove 1.8V fixed regulator
ARM: dts: imx7-colibri: allow to disable Ethernet rail
...
Linus Torvalds [Tue, 9 May 2017 16:49:36 +0000 (09:49 -0700)]
Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC platform updates from Olof Johansson:
"SoC platform changes (arch/arm/mach-*). This merge window, the bulk is
for a few platforms:
Gemini:
- Legacy platform that Linus Walleij has converted to multiplatform
and DT, so a handful of various tweaks there, removal of some old
stale support, etc.
Atmel AT91:
- Fixup of various power management related pieces
- Move of SoC detection to a drivers/soc driver instead
ST Micro STM32:
- New SoC support: STM32H743
TI platforms:
- More driver support for Davinci (SATA in particular)
- Removal of some old stale hwmod files (linkspace platform)
Misc:
- A couple of smaller patches for i.MX, sunxi, hisi"
* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (57 commits)
ARM: davinci: Add clock for CPPI 4.1 DMA engine
ARM: mxs: add support for I2SE Duckbill 2 boards
MAINTAINERS: Update the Allwinner sunXi entry
ARM: i.MX25: globally disable supervisor protect
ARM: at91: move SoC detection to its own driver
ARM: at91: pm: correct typo
ARM: at91: pm: Remove at91_pm_set_standby
ARM: at91: pm: Merge all at91sam9*_pm_init
ARM: at91: pm: Tie the USB clock mask to the pmc
ARM: at91: pm: Tie the memory controller type to the ramc id
ARM: at91: pm: Workaround DDRSDRC self-refresh bug with LPDDR1 memories.
ARM: at91: pm: Simplify at91rm9200_standby
ARM: at91: pm: Use struct at91_pm_data in pm_suspend.S
ARM: at91: pm: Move global variables into at91_pm_data
ARM: at91: pm: Move at91_ramc_read/write to pm.c
ARM: at91: pm: Cleanup headers
MAINTAINERS: Add memory drivers to AT91 entry
MAINTAINERS: Update AT91 entry
ARM: davinci: add pata_bk3710 libata driver support
ARM: OMAP2+: mark omap_init_rng as __init
...
Linus Torvalds [Tue, 9 May 2017 16:20:16 +0000 (09:20 -0700)]
Merge tag 'armsoc-fixes-nc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull misc ARM SoC fixes from Olof Johansson:
"ARM SoC non-urgent fixes for merge window
Smaller patches that didn't seem to find a home in other branches, and
low-priority fixes from late in the merge window. A number of these
are MAINTAINER updates, it seems.
Highlights:
* Maintainers:
- Remove Alexandre Courbot and Stephen Warren from Tegra
maintainership, add Jon Hunter
- Remove Stephen Warren and add Stefan Wahren to bcm2835
- Tweaks for file flagging for Marvell Dove
* Fixes:
- For two non-common-clk platform, handle clk_disable with NULL arg
- Remove redundant Kconfig select for Oxnas"
* tag 'armsoc-fixes-nc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: mmp: let clk_disable() return immediately if clk is NULL
ARM: w90x900: let clk_disable() return immediately if clk is NULL
MAINTAINERS: Add file patterns for dove device tree bindings
ARM: oxnas: remove redundant select CPU_V6K
MAINTAINERS: tegra: Remove self as maintainer
MAINTAINERS: tegra: Replace Stephen with Jon
MAINTAINERS: Add Stefan Wahren to bcm2835.
MAINTAINERS: remove swarren from bcm2835
MAINTAINERS: Add Jon Mason to BCM5301X maintainers
Linus Torvalds [Tue, 9 May 2017 16:12:53 +0000 (09:12 -0700)]
Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
"Assorted bits and pieces from various people. No common topic in this
pile, sorry"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs/affs: add rename exchange
fs/affs: add rename2 to prepare multiple methods
Make stat/lstat/fstatat pass AT_NO_AUTOMOUNT to vfs_statx()
fs: don't set *REFERENCED on single use objects
fs: compat: Remove warning from COMPATIBLE_IOCTL
remove pointless extern of atime_need_update_rcu()
fs: completely ignore unknown open flags
fs: add a VALID_OPEN_FLAGS
fs: remove _submit_bh()
fs: constify tree_descr arrays passed to simple_fill_super()
fs: drop duplicate header percpu-rwsem.h
fs/affs: bugfix: Write files greater than page size on OFS
fs/affs: bugfix: enable writes on OFS disks
fs/affs: remove node generation check
fs/affs: import amigaffs.h
fs/affs: bugfix: make symbolic links work again
Linus Torvalds [Tue, 9 May 2017 15:45:16 +0000 (08:45 -0700)]
proc: try to remove use of FOLL_FORCE entirely
We fixed the bugs in it, but it's still an ugly interface, so let's see
if anybody actually depends on it. It's entirely possible that nothing
actually requires the whole "punch through read-only mappings"
semantics.
For example, gdb definitely uses the /proc/<pid>/mem interface, but it
looks like it mainly does it for regular reads of the target (that don't
need FOLL_FORCE), and looking at the gdb source code seems to fall back
on the traditional ptrace(PTRACE_POKEDATA) interface if it needs to.
If this breaks something, I do have a (more complex) version that only
enables FOLL_FORCE when somebody has PTRACE_ATTACH'ed to the target,
like the comment here used to say ("Maybe we should limit FOLL_FORCE to
actual ptrace users?").
Cc: Kees Cook <keescook@chromium.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Eric Biederman <ebiederm@xmission.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David S. Miller [Tue, 9 May 2017 15:24:24 +0000 (11:24 -0400)]
Merge branch 'qed-general-fixes'
Yuval Mintz says:
====================
qed*: General fixes
This series contain several fixes for qed and qede.
- #1 [and ~#5] relate to XDP cleanups
- #2 and #5 correct VF behavior
- #3 and #4 fix and add missing configurations needed for RoCE & storage
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Mintz, Yuval [Tue, 9 May 2017 12:07:51 +0000 (15:07 +0300)]
qede: Split PF/VF ndos.
PFs and VFs share the same structure of NDOs today,
and the VFs explicitly fails the ndo_xdp() callback stating
it doesn't support XDP.
This results in lots of:
[qede_xdp:1032(enp131s2)]VFs don't support XDP
------------[ cut here ]------------
WARNING: CPU: 4 PID: 1426 at net/core/rtnetlink.c:1637 rtnl_dump_ifinfo+0x354/0x3c0
...
Call Trace:
? __alloc_skb+0x9b/0x1d0
netlink_dump+0x122/0x290
netlink_recvmsg+0x27d/0x430
sock_recvmsg+0x3d/0x50
...
As every dump request for the VF interface info would fail due to
rtnl_xdp_fill() returning an error code.
To resolve this, introduce a subset of the NDOs meant for the VF
in a seperate structure and register that one instead for VFs,
and omit the ndo_xdp initialization.
Fixes: 1f91669b7f57 ("qede: Prevent VFs from using XDP") Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Ram Amrani [Tue, 9 May 2017 12:07:50 +0000 (15:07 +0300)]
qed: Correct doorbell configuration for !4Kb pages
When configuring the doorbell DPI address, driver aligns the start
address to 4KB [HW-pages] instead of host PAGE_SIZE.
As a result, RoCE applications might receive addresses which are
unaligned to pages [when PAGE_SIZE > 4KB], which is a security risk.
Fixes: 917e79de7fa7 ("qed: Add support for RoCE hw init") Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Mintz, Yuval [Tue, 9 May 2017 12:07:49 +0000 (15:07 +0300)]
qed: Tell QM the number of tasks
Driver doesn't pass the number of tasks to the QM init logic
which would cause back-pressure in scenarios requiring many tasks
[E.g., using max MRs] and thus reduced performance.
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
When (re|un)loading, Tx-queues belonging to XDP would not get freed.
Fixes: def841306001 ("qede: Add support for XDP_TX") Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
net/mlx4_core: Reduce harmless SRIOV error message to debug level
Under SRIOV resource management, extra counters are allocated to VFs
from a free pool. If that pool is empty, the ALLOC_RES command for
a counter resource fails -- and this generates a misleading error
message in the message log.
Under SRIOV, each VF is allocated (i.e., guaranteed) 2 counters --
one counter per port. For ETH ports, the RoCE driver requests an
additional counter (above the guaranteed counters). If that request
fails, the VF RoCE driver simply uses the default (i.e., guaranteed)
counter for that port.
Thus, failing to allocate an additional counter does not constitute
a problem, and the error message on the PF when this occurs should
be reduced to debug level.
Finally, to identify the situation that the reason for the failure is
that no resources are available to grant to the VF, we modified the
error returned by mlx4_grant_resource to -EDQUOT (Quota exceeded),
which more accurately describes the error.
Fixes: d1c046996b29 ("IB/mlx4: Add RoCE/IB dedicated counters") Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Kamal Heib [Tue, 9 May 2017 11:45:22 +0000 (14:45 +0300)]
net/mlx4_en: Change the error print to debug print
The error print within mlx4_en_calc_rx_buf() should be a debug print.
Fixes: bb00d06969b5 ('mlx4: allow order-0 memory allocations in RX path') Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Halil is doing a lot more work in the virtio area on s390 than I
do. Let's reflect the reality in the maintainers file.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A known weakness in ptr_ring design is that it does not handle well the
situation when ring is almost full: as entries are consumed they are
immediately used again by the producer, so consumer and producer are
writing to a shared cache line.
To fix this, add batching to consume calls: as entries are
consumed do not write NULL into the ring until we get
a multiple (in current implementation 2x) of cache lines
away from the producer. At that point, write them all out.
We do the write out in the reverse order to keep
producer from sharing cache with consumer for as long
as possible.
Writeout also triggers when ring wraps around - there's
no special reason to do this but it helps keep the code
a bit simpler.
What should we do if getting away from producer by 2 cache lines
would mean we are keeping the ring moe than half empty?
Maybe we should reduce the batching in this case,
current patch simply reduces the batching.
Notes:
- it is no longer true that a call to consume guarantees
that the following call to produce will succeed.
No users seem to assume that.
- batching can also in theory reduce the signalling rate:
users that would previously send interrups to the producer
to wake it up after consuming each entry would now only
need to do this once in a batch.
Doing this would be easy by returning a flag to the caller.
No users seem to do signalling on consume yet so this was not
implemented yet.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>
Paolo Bonzini [Tue, 9 May 2017 10:51:49 +0000 (12:51 +0200)]
Merge tag 'kvm-arm-for-v4.12-round2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
Second round of KVM/ARM Changes for v4.12.
Changes include:
- A fix related to the 32-bit idmap stub
- A fix to the bitmask used to deode the operands of an AArch32 CP
instruction
- We have moved the files shared between arch/arm/kvm and
arch/arm64/kvm to virt/kvm/arm
- We add support for saving/restoring the virtual ITS state to
userspace
KVM: arm/arm64: Don't call map_resources when restoring ITS tables
The only reason we called kvm_vgic_map_resources() when restoring the
ITS tables was because we wanted to have the KVM iodevs registered in
the KVM IO bus framework at the time when the ITS was restored such that
a restored and active device can inject MSIs prior to otherwise calling
kvm_vgic_map_resources() from the first run of a VCPU.
Since we now register the KVM iodevs for the redestributors and ITS as
soon as possible (when setting the base addresses), we no longer need
this call and kvm_vgic_map_resources() is again called only when first
running a VCPU.
Signed-off-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com>
KVM: arm/arm64: Register ITS iodev when setting base address
We have to register the ITS iodevice before running the VM, because in
migration scenarios, we may be restoring a live device that wishes to
inject MSIs before the VCPUs have started.
All we need to register the ITS io device is the base address of the
ITS, so we can simply register that when the base address of the ITS is
set.
[ Code to fix concurrency issues when setting the ITS base address and
to fix the undef base address check written by Marc Zyngier ]
Signed-off-by: Christoffer Dall <cdall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>
Marc Zyngier [Mon, 8 May 2017 17:15:40 +0000 (18:15 +0100)]
KVM: arm/arm64: Get rid of its->initialized field
The its->initialized doesn't bring much to the table, and creates
unnecessary ordering between setting the address and initializing it
(which amounts to exactly nothing).
Let's kill it altogether, making KVM_DEV_ARM_VGIC_CTRL_INIT the no-op
it deserves to be.
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com>
KVM: arm/arm64: Register iodevs when setting redist base and creating VCPUs
Instead of waiting with registering KVM iodevs until the first VCPU is
run, we can actually create the iodevs when the redist base address is
set. The only downside is that we must now also check if we need to do
this for VCPUs which are created after creating the VGIC, because there
is no enforced ordering between creating the VGIC (and setting its base
addresses) and creating the VCPUs.
Signed-off-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com>
As we are about to handle setting the address for the redistributor base
region separately from some of the other base addresses, let's rework
this function to leave a little more room for being flexible in what
each type of base address does.
Signed-off-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com>
KVM: arm/arm64: Make vgic_v3_check_base more broadly usable
As we are about to fiddle with the IO device registration mechanism,
let's be a little more careful when setting base addresses as early as
possible. When setting a base address, we can check that there's
address space enough for its scope and when the last of the two
base addresses (dist and redist) get set, we can also check if the
regions overlap at that time.
This allows us to provide error messages to the user at time when trying
to set the base address, as opposed to later when trying to run the VM.
To do this, we make vgic_v3_check_base available in the core vgic-v3
code as well as in the other parts of the GICv3 code, namely the MMIO
config code.
We also return true for undefined base addresses so that the function
can be used before all base addresses are set; all callers already check
for uninitialized addresses before calling this function.
Signed-off-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com>
Split out the function to register all the redistributor iodevs into a
function that handles a single redistributor at a time in preparation
for being able to call this per VCPU as these get created.
Signed-off-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com>
KVM: Add kvm_vcpu_get_idx to get vcpu index in kvm->vcpus
There are occasional needs to use the index of vcpu in the kvm->vcpus
array to map something related to a VCPU. For example, unlike the
vcpu->vcpu_id, the vcpu index is guaranteed to not be sparse across all
vcpus which is useful when allocating a memory area for each vcpu.
Signed-off-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com>
Bandan Das [Fri, 5 May 2017 19:25:15 +0000 (15:25 -0400)]
nVMX: Advertise PML to L1 hypervisor
Advertise the PML bit in vmcs12 but don't try to enable
it in hardware when running L2 since L0 is emulating it. Also,
preserve L0's settings for PML since it may still
want to log writes.
Signed-off-by: Bandan Das <bsd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With EPT A/D enabled, processor access to L2 guest
paging structures will result in a write violation.
When this happens, write the GUEST_PHYSICAL_ADDRESS
to the pml buffer provided by L1 if the access is
write and the dirty bit is being set.
This patch also adds necessary checks during VMEntry if L1
has enabled PML. If the PML index overflows, we change the
exit reason and run L1 to simulate a PML full event.
Signed-off-by: Bandan Das <bsd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Jim Mattson [Fri, 5 May 2017 18:28:09 +0000 (11:28 -0700)]
kvm: nVMX: Validate CR3 target count on nested VM-entry
According to the SDM, the CR3-target count must not be greater than
4. Future processors may support a different number of CR3-target
values. Software should read the VMX capability MSR IA32_VMX_MISC to
determine the number of values supported.
Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 9 May 2017 09:50:01 +0000 (11:50 +0200)]
Merge branch 'kvm-ppc-next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into HEAD
The main thing here is a new implementation of the in-kernel
XICS interrupt controller emulation for POWER9 machines, from Ben
Herrenschmidt.
POWER9 has a new interrupt controller called XIVE (eXternal Interrupt
Virtualization Engine) which is able to deliver interrupts directly
to guest virtual CPUs in hardware without hypervisor intervention.
With this new code, the guest still sees the old XICS interface but
performance is better because the XICS emulation in the host uses the
XIVE directly rather than going through a XICS emulation in firmware.
Geliang Tang [Sat, 6 May 2017 15:37:19 +0000 (23:37 +0800)]
KVM: set no_llseek in stat_fops_per_vm
In vm_stat_get_per_vm_fops and vcpu_stat_get_per_vm_fops, since we
use nonseekable_open() to open, we should use no_llseek() to seek,
not generic_file_llseek().
Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM: arm/arm64: vgic: Rename kvm_vgic_vcpu_init to kvm_vgic_vcpu_enable
This function really doesn't init anything, it enables the CPU
interface, so name it as such, which gives us the name to use for actual
init work later on.
Signed-off-by: Christoffer Dall <cdall@linaro.org> Reviewed-by: Eric Auger <eric.auger@redhat.com>
Linus Torvalds [Tue, 9 May 2017 03:43:30 +0000 (20:43 -0700)]
Merge tag 'linux-kselftest-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
Pull kselftest updates from Shuah Khan:
"This update consists of:
- important fixes for build failures and clean target related
warnings to address regressions introduced in commit 9b0106044ebe
("selftests: remove duplicated all and clean target")
- several minor spelling fixes in and log messages and comment
blocks.
- Enabling configs for better test coverage in ftrace, vm, and
cpufreq tests.
- .gitignore changes"
* tag 'linux-kselftest-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (26 commits)
selftests: x86: add missing executables to .gitignore
selftests: watchdog: accept multiple params on command line
selftests: create cpufreq kconfig fragments
selftests: x86: override clean in lib.mk to fix warnings
selftests: sync: override clean in lib.mk to fix warnings
selftests: splice: override clean in lib.mk to fix warnings
selftests: gpio: fix clean target to remove all generated files and dirs
selftests: add gpio generated files to .gitignore
selftests: powerpc: override clean in lib.mk to fix warnings
selftests: gpio: override clean in lib.mk to fix warnings
selftests: futex: override clean in lib.mk to fix warnings
selftests: lib.mk: define CLEAN macro to allow Makefiles to override clean
selftests: splice: fix clean target to not remove default_file_splice_read.sh
selftests: gpio: add config fragment for gpio-mockup
selftests: breakpoints: allow to cross-compile for aarch64/arm64
selftests/Makefile: Add missed PHONY targets
selftests/vm/run_vmtests: Fix wrong comment
selftests/Makefile: Add missed closing `"` in comment
selftests/vm/run_vmtests: Polish output text
selftests/timers: fix spelling mistake: "Asynchronous"
...
Linus Torvalds [Tue, 9 May 2017 03:36:38 +0000 (20:36 -0700)]
Merge tag 'trace-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull more tracing updates from Steven Rostedt:
"These are three simple changes.
The first one is just a switch from using strcpy() to strlcpy().
Someone thought that it may cause an overflow bug, but since it only
copies comms into a pre-allocated array of TASK_COMM_LEN, and no comm
should ever be bigger than that, nor not end with a nul character,
this change is more of a safety precaution than fixing anything that
is actually broken.
The other two changes are simply cleaning and optimizing some code"
* tag 'trace-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
ftrace: Simplify ftrace_match_record() even more
ftrace: Remove an unneeded condition
tracing: Use strlcpy() instead of strcpy() in __trace_find_cmdline()
Linus Torvalds [Tue, 9 May 2017 03:07:29 +0000 (20:07 -0700)]
Merge tags 'for-linus' and 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull more rdma updates from Doug Ledford:
"As mentioned in my first pull request, this is the subsequent pull
requests I had. This is all I have, and in fact this cleans out the
RDMA subsystem's entire patchworks queue of kernel changes that are
ready to go (well, it did for the weekend anyway, a few new patches
are in, but they'll be coming during the -rc cycle).
The first tag contains a single patch that would have conflicted if
taken from my tree or DaveM's tree as it needed our trees merged to
come cleanly.
The second tag contains the patch series from Intel plus three other
stragllers that came in late last week. I took them because it allowed
me to legitimately claim that the RDMA patchworks queue was, for a
short time, 100% cleared of all waiting kernel patches, woohoo! :-).
I have it under my for-next tag, so it did get 0day and linux- next
over the end of last week, and linux-next did show one minor conflict.
Summary:
'for-linus' tag:
- mlx5/IPoIB fixup patch
'for-next' tag:
- the hfi1 15 patch set that landed late
- IPoIB get_link_ksettings which landed late because I asked for a
respin
- one late rxe change
- one -rc worthy fix that's in early"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/mlx5: Enable IPoIB acceleration
* tag 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
rxe: expose num_possible_cpus() cnum_comp_vectors
IB/rxe: Update caller's CRC for RXE_MEM_TYPE_DMA memory type
IB/hfi1: Clean up on context initialization failure
IB/hfi1: Fix an assign/ordering issue with shared context IDs
IB/hfi1: Clean up context initialization
IB/hfi1: Correctly clear the pkey
IB/hfi1: Search shared contexts on the opened device, not all devices
IB/hfi1: Remove atomic operations for SDMA_REQ_HAVE_AHG bit
IB/hfi1: Use filedata rather than filepointer
IB/hfi1: Name function prototype parameters
IB/hfi1: Fix a subcontext memory leak
IB/hfi1: Return an error on memory allocation failure
IB/hfi1: Adjust default eager_buffer_size to 8MB
IB/hfi1: Get rid of divide when setting the tx request header
IB/hfi1: Fix yield logic in send engine
IB/hfi1, IB/rdmavt: Move r_adefered to r_lock cache line
IB/hfi1: Fix checks for Offline transient state
IB/ipoib: add get_link_ksettings in ethtool
- add ITE 8893 bridge DMA alias quirk (Jarod Wilson)
- restrict Cavium ACS quirk only to CN81xx/CN83xx/CN88xx devices
(Manish Jaggi)
* tag 'pci-v4.12-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (146 commits)
PCI: Don't allow unbinding host controllers that aren't prepared
ARM: DRA7: clockdomain: Change the CLKTRCTRL of CM_PCIE_CLKSTCTRL to SW_WKUP
MAINTAINERS: Add PCI Endpoint maintainer
Documentation: PCI: Add userguide for PCI endpoint test function
tools: PCI: Add sample test script to invoke pcitest
tools: PCI: Add a userspace tool to test PCI endpoint
Documentation: misc-devices: Add Documentation for pci-endpoint-test driver
misc: Add host side PCI driver for PCI test function device
PCI: Add device IDs for DRA74x and DRA72x
dt-bindings: PCI: dra7xx: Add DT bindings to enable unaligned access
PCI: dwc: dra7xx: Workaround for errata id i870
dt-bindings: PCI: dra7xx: Add DT bindings for PCI dra7xx EP mode
PCI: dwc: dra7xx: Add EP mode support
PCI: dwc: dra7xx: Facilitate wrapper and MSI interrupts to be enabled independently
dt-bindings: PCI: Add DT bindings for PCI designware EP mode
PCI: dwc: designware: Add EP mode support
Documentation: PCI: Add binding documentation for pci-test endpoint function
ixgbe: Use pcie_flr() instead of duplicating it
IB/hfi1: Use pcie_flr() instead of duplicating it
PCI: imx6: Fix spelling mistake: "contol" -> "control"
...
Linus Torvalds [Tue, 9 May 2017 01:49:23 +0000 (18:49 -0700)]
Merge tag 'tty-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial updates from Greg KH:
"Here is the "big" TTY/Serial patch updates for 4.12-rc1
Not a lot of new things here, the normal number of serial driver
updates and additions, tiny bugs fixed, and some core files split up
to make future changes a bit easier for Nicolas's "tiny-tty" work.
All of these have been in linux-next for a while"
* tag 'tty-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (62 commits)
serial: small Makefile reordering
tty: split job control support into a file of its own
tty: move baudrate handling code to a file of its own
console: move console_init() out of tty_io.c
serial: 8250_early: Add earlycon support for Palmchip UART
tty: pl011: use "qdf2400_e44" as the earlycon name for QDF2400 E44
vt: make mouse selection of non-ASCII consistent
vt: set mouse selection word-chars to gpm's default
imx-serial: Reduce RX DMA startup latency when opening for reading
serial: omap: suspend device on probe errors
serial: omap: fix runtime-pm handling on unbind
tty: serial: omap: add UPF_BOOT_AUTOCONF flag for DT init
serial: samsung: Remove useless spinlock
serial: samsung: Add missing checks for dma_map_single failure
serial: samsung: Use right device for DMA-mapping calls
serial: imx: setup DCEDTE early and ensure DCD and RI irqs to be off
tty: fix comment typo s/repsonsible/responsible/
tty: amba-pl011: Fix spurious TX interrupts
serial: xuartps: Enable clocks in the pm disable case also
serial: core: Re-use struct uart_port {name} field
...
Linus Torvalds [Tue, 9 May 2017 01:17:56 +0000 (18:17 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
- the rest of MM
- various misc things
- procfs updates
- lib/ updates
- checkpatch updates
- kdump/kexec updates
- add kvmalloc helpers, use them
- time helper updates for Y2038 issues. We're almost ready to remove
current_fs_time() but that awaits a btrfs merge.
- add tracepoints to DAX
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (114 commits)
drivers/staging/ccree/ssi_hash.c: fix build with gcc-4.4.4
selftests/vm: add a test for virtual address range mapping
dax: add tracepoint to dax_insert_mapping()
dax: add tracepoint to dax_writeback_one()
dax: add tracepoints to dax_writeback_mapping_range()
dax: add tracepoints to dax_load_hole()
dax: add tracepoints to dax_pfn_mkwrite()
dax: add tracepoints to dax_iomap_pte_fault()
mtd: nand: nandsim: convert to memalloc_noreclaim_*()
treewide: convert PF_MEMALLOC manipulations to new helpers
mm: introduce memalloc_noreclaim_{save,restore}
mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC
mm/huge_memory.c: deposit a pgtable for DAX PMD faults when required
mm/huge_memory.c: use zap_deposited_table() more
time: delete CURRENT_TIME_SEC and CURRENT_TIME
gfs2: replace CURRENT_TIME with current_time
apparmorfs: replace CURRENT_TIME with current_time()
lustre: replace CURRENT_TIME macro
fs: ubifs: replace CURRENT_TIME_SEC with current_time
fs: ufs: use ktime_get_real_ts64() for birthtime
...
Andrew Morton [Mon, 8 May 2017 23:00:22 +0000 (16:00 -0700)]
drivers/staging/ccree/ssi_hash.c: fix build with gcc-4.4.4
drivers/staging/ccree/ssi_hash.c:1990: error: unknown field 'template_ahash' specified in initializer
drivers/staging/ccree/ssi_hash.c:1991: error: unknown field 'init' specified in initializer
drivers/staging/ccree/ssi_hash.c:1991: warning: missing braces around initializer
drivers/staging/ccree/ssi_hash.c:1991: warning: (near initialization for 'driver_hash[0].<anonymous>.template_ahash')
drivers/staging/ccree/ssi_hash.c:1992: error: unknown field 'update' specified in initializer
drivers/staging/ccree/ssi_hash.c:1992: warning: excess elements in union initializer
drivers/staging/ccree/ssi_hash.c:1992: warning: (near initialization for 'driver_hash[0].<anonymous>')
drivers/staging/ccree/ssi_hash.c:1993: error: unknown field 'final' specified in initializer
drivers/staging/ccree/ssi_hash.c:1993: warning: excess elements in union initializer
drivers/staging/ccree/ssi_hash.c:1993: warning: (near initialization for 'driver_hash[0].<anonymous>')
...
gcc-4.4.4 has issues with anon union initializers. Work around this.
selftests/vm: add a test for virtual address range mapping
This verifies virtual address mapping below and above the 128TB range
and makes sure that address returned are within the expected range
depending upon the hint passed from the user space.
Ross Zwisler [Mon, 8 May 2017 23:00:16 +0000 (16:00 -0700)]
dax: add tracepoint to dax_insert_mapping()
Add a tracepoint to dax_insert_mapping(), following the same logging
conventions as the rest of DAX. This tracepoint, along with the one in
dax_load_hole(), lets us know how a DAX PTE fault was serviced.
Here is an example DAX fault that inserts a PTE mapping:
Link: http://lkml.kernel.org/r/20170221195116.13278-7-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[akpm@linux-foundation.org: struct blk_dax_ctl has disappeared] Link: http://lkml.kernel.org/r/20170221195116.13278-6-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[ross.zwisler@linux.intel.com: fix regression in dax_writeback_mapping_range()] Link: http://lkml.kernel.org/r/20170314215358.31451-1-ross.zwisler@linux.intel.com Link: http://lkml.kernel.org/r/20170221195116.13278-5-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20170221195116.13278-4-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20170221195116.13278-3-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Ross Zwisler [Mon, 8 May 2017 23:00:00 +0000 (16:00 -0700)]
dax: add tracepoints to dax_iomap_pte_fault()
Patch series "second round of tracepoints for DAX".
This second round of DAX tracepoint patches adds tracing to the PTE
fault path (dax_iomap_pte_fault(), dax_pfn_mkwrite(), dax_load_hole(),
dax_insert_mapping()) and to the writeback path
(dax_writeback_mapping_range(), dax_writeback_one()).
The purpose of this tracing is to give us a high level view of what DAX
is doing, whether faults are being serviced by PMDs or PTEs, and by real
storage or by zero pages covering holes.
I do have some patches nearly ready which also add tracing to
grab_mapping_entry() and dax_insert_mapping_entry(). These are more
targeted at logging how we are interacting with the radix tree, how we
use empty entries for locking, whether we "downgrade" huge zero pages to
4k PTE sized allocations, etc. In the end it seemed to me that this
might be too detailed to have as constantly present tracepoints, but if
anyone sees value in having tracepoints like this in the DAX code
permanently (Jan?), please let me know and I'll add those last two
patches.
All these tracepoints were done to be consistent with the style of the
XFS tracepoints and with the existing DAX PMD tracepoints.
This patch (of 6):
Add tracepoints to dax_iomap_pte_fault(), following the same logging
conventions as the rest of DAX.
Here is an example fault that initially tries to be serviced by the PMD
fault handler but which falls back to PTEs because the VMA isn't large
enough to hold a PMD:
small-1086 [005] ....
71.140014: xfs_filemap_huge_fault: dev 259:0 ino 0x1003
Link: http://lkml.kernel.org/r/20170221195116.13278-2-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vlastimil Babka [Mon, 8 May 2017 22:59:57 +0000 (15:59 -0700)]
mtd: nand: nandsim: convert to memalloc_noreclaim_*()
Nandsim has own functions set_memalloc() and clear_memalloc() for robust
setting and clearing of PF_MEMALLOC. Replace them by the new generic
helpers. No functional change.
Link: http://lkml.kernel.org/r/20170405074700.29871-5-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Richard Weinberger <richard@nod.at> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Chris Leech <cleech@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Josef Bacik <jbacik@fb.com> Cc: Lee Duncan <lduncan@suse.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vlastimil Babka [Mon, 8 May 2017 22:59:53 +0000 (15:59 -0700)]
treewide: convert PF_MEMALLOC manipulations to new helpers
We now have memalloc_noreclaim_{save,restore} helpers for robust setting
and clearing of PF_MEMALLOC. Let's convert the code which was using the
generic tsk_restore_flags(). No functional change.
[vbabka@suse.cz: in net/core/sock.c the hunk is missing] Link: http://lkml.kernel.org/r/20170405074700.29871-4-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Josef Bacik <jbacik@fb.com> Cc: Lee Duncan <lduncan@suse.com> Cc: Chris Leech <cleech@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: Wouter Verhelst <w@uter.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vlastimil Babka [Mon, 8 May 2017 22:59:50 +0000 (15:59 -0700)]
mm: introduce memalloc_noreclaim_{save,restore}
The previous patch ("mm: prevent potential recursive reclaim due to
clearing PF_MEMALLOC") has shown that simply setting and clearing
PF_MEMALLOC in current->flags can result in wrongly clearing a
pre-existing PF_MEMALLOC flag and potentially lead to recursive reclaim.
Let's introduce helpers that support proper nesting by saving the
previous stat of the flag, similar to the existing memalloc_noio_* and
memalloc_nofs_* helpers. Convert existing setting/clearing of
PF_MEMALLOC within mm to the new helpers.
There are no known issues with the converted code, but the change makes
it more robust.
Link: http://lkml.kernel.org/r/20170405074700.29871-3-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Suggested-by: Michal Hocko <mhocko@suse.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Chris Leech <cleech@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Josef Bacik <jbacik@fb.com> Cc: Lee Duncan <lduncan@suse.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Richard Weinberger <richard@nod.at> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vlastimil Babka [Mon, 8 May 2017 22:59:46 +0000 (15:59 -0700)]
mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC
Patch series "more robust PF_MEMALLOC handling"
This series aims to unify the setting and clearing of PF_MEMALLOC, which
prevents recursive reclaim. There are some places that clear the flag
unconditionally from current->flags, which may result in clearing a
pre-existing flag. This already resulted in a bug report that Patch 1
fixes (without the new helpers, to make backporting easier). Patch 2
introduces the new helpers, modelled after existing memalloc_noio_* and
memalloc_nofs_* helpers, and converts mm core to use them. Patches 3
and 4 convert non-mm code.
This patch (of 4):
__alloc_pages_direct_compact() sets PF_MEMALLOC to prevent deadlock
during page migration by lock_page() (see the comment in
__unmap_and_move()). Then it unconditionally clears the flag, which can
clear a pre-existing PF_MEMALLOC flag and result in recursive reclaim.
This was not a problem until commit a0277328372b ("mm, page_alloc:
restructure direct compaction handling in slowpath"), because direct
compation was called only after direct reclaim, which was skipped when
PF_MEMALLOC flag was set.
Even now it's only a theoretical issue, as the new callsite of
__alloc_pages_direct_compact() is reached only for costly orders and
when gfp_pfmemalloc_allowed() is true, which means either
__GFP_NOMEMALLOC is in gfp_flags or in_interrupt() is true. There is no
such known context, but let's play it safe and make
__alloc_pages_direct_compact() robust for cases where PF_MEMALLOC is
already set.
Fixes: a0277328372b ("mm, page_alloc: restructure direct compaction handling in slowpath") Link: http://lkml.kernel.org/r/20170405074700.29871-2-vbabka@suse.cz Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Andrey Ryabinin <aryabinin@virtuozzo.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Hillf Danton <hillf.zj@alibaba-inc.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> Cc: Chris Leech <cleech@redhat.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Josef Bacik <jbacik@fb.com> Cc: Lee Duncan <lduncan@suse.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Richard Weinberger <richard@nod.at> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/huge_memory.c: deposit a pgtable for DAX PMD faults when required
Although all architectures use a deposited page table for THP on
anonymous VMAs, some architectures (s390 and powerpc) require the
deposited storage even for file backed VMAs due to quirks of their MMUs.
This patch adds support for depositing a table in DAX PMD fault handling
path for archs that require it. Other architectures should see no
functional changes.
Link: http://lkml.kernel.org/r/20170411174233.21902-3-oohall@gmail.com Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Cc: Reza Arbab <arbab@linux.vnet.ibm.com> Cc: Balbir Singh <bsingharora@gmail.com> Cc: linux-nvdimm@ml01.01.org Cc: Oliver O'Halloran <oohall@gmail.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Depending on the flags of the PMD being zapped there may or may not be a
deposited pgtable to be freed. In two of the three cases this is open
coded while the third uses the zap_deposited_table() helper. This patch
converts the others to use the helper to clean things up a bit.
Deepa Dinamani [Mon, 8 May 2017 22:59:37 +0000 (15:59 -0700)]
time: delete CURRENT_TIME_SEC and CURRENT_TIME
All uses of CURRENT_TIME_SEC and CURRENT_TIME macros have been replaced
by other time functions. These macros are also not y2038 safe. And,
all their use cases can be fulfilled by y2038 safe ktime_get_* variants.
Link: http://lkml.kernel.org/r/1491613030-11599-12-git-send-email-deepa.kernel@gmail.com Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: John Stultz <john.stultz@linaro.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>