Ramalingam C [Sat, 16 Feb 2019 17:37:03 +0000 (23:07 +0530)]
drm/i915: Fix KBL HDCP2.2 encrypt status signalling
HDCP transmitter is supposed to indicate the HDCP encryption status of
the link through enc_en signals in a window of time called "window of
opportunity" defined by HDCP HDMI spec.
But on KBL the timing of this signalling is faulty. As a workaround (WA),
the signalling has to be reset.
v2:
WA is moved into the toggle_signalling [Daniel]
v3:
Commit msg is rewritten with more information
v4:
Reviewed-by Daniel.
Ramalingam C [Sat, 16 Feb 2019 17:37:02 +0000 (23:07 +0530)]
drm/i915: CP_IRQ handling for DP HDCP2.2 msgs
Implements the following:
A waitqueue is created to wait for CP_IRQ.
CP_IRQ arrival is signalled through an atomic variable.
For applicable DP HDCP2.2 msgs, the read waits for CP_IRQ.
As per HDCP2.2 spec "HDCP Transmitters must process CP_IRQ interrupts
when they are received from HDCP Receivers"
Without CP_IRQ processing, the DP HDCP2.2 H_Prime msg was getting corrupted
while being read based on the corresponding status bit. This caused
random failures in reading the DP HDCP2.2 msgs.
v2:
CP_IRQ arrival is tracked based on the atomic val inc [daniel]
Recording the reviewed-by Daniel from IRC.
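The counter-based tracking described in this entry can be sketched in portable C11 as below. This is only an illustration under stated assumptions: struct cp_irq_tracker and the cp_irq_* helpers are hypothetical names, and C11 atomics stand in for the kernel's atomic_t. In the driver, the waiter would sleep on the waitqueue until the condition holds instead of calling the check directly.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical tracker that counts CP_IRQ arrivals (not the i915 structure). */
struct cp_irq_tracker {
	atomic_int count;
};

static void cp_irq_tracker_init(struct cp_irq_tracker *t)
{
	atomic_init(&t->count, 0);
}

/* Interrupt path: record that one more CP_IRQ has arrived. */
static void cp_irq_signal(struct cp_irq_tracker *t)
{
	atomic_fetch_add(&t->count, 1);
}

/* Reader path: remember the count before starting a message exchange. */
static int cp_irq_snapshot(struct cp_irq_tracker *t)
{
	return atomic_load(&t->count);
}

/* Wait condition: true once a CP_IRQ has arrived after the snapshot. */
static bool cp_irq_arrived_since(struct cp_irq_tracker *t, int snapshot)
{
	return atomic_load(&t->count) != snapshot;
}
```

Comparing against a snapshot rather than a boolean flag means an interrupt that fires between taking the snapshot and starting to wait is not lost.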
Ramalingam C [Sat, 16 Feb 2019 17:37:01 +0000 (23:07 +0530)]
drm/i915: Implement the HDCP2.2 support for HDMI
Implements the HDMI adaptation specific HDCP2.2 operations.
Basically these are DDC read and write for authenticating through
HDCP2.2 messages.
v2: Rebased.
v3:
No more special handling of Gmbus burst read for AKE_SEND_CERT.
Style fixed with few naming. [Uma]
%s/PARING/PAIRING
v4:
msg_sz is initialized at definition.
Lookup table is defined for HDMI HDCP2.2 msgs [Daniel].
v5: Rebased.
v6:
Make a function as inline [Uma]
%s/uintxx_t/uxx
v7:
Errors due to sinks are reported as DEBUG logs.
Adjust to the new mei interface.
v8:
ARRAY_SIZE for the # of array members [Jon & Daniel].
hdcp adaptation is added as a const in the hdcp_shim [Daniel]
Ramalingam C [Sat, 16 Feb 2019 17:37:00 +0000 (23:07 +0530)]
drm/i915: Implement the HDCP2.2 support for DP
Implements the DP adaptation specific HDCP2.2 functions.
These functions perform the DPCD read and write for communicating the
HDCP2.2 auth message back and forth.
v2:
wait for cp_irq is merged with this patch. Rebased.
v3:
wait_queue is used for wait for cp_irq [Chris Wilson]
v4:
Style fixed.
%s/PARING/PAIRING
Few style fixes [Uma]
v5:
Lookup table for DP HDCP2.2 msg details [Daniel].
Extra lines are removed.
v6: Rebased.
v7:
Fixed some regression introduced at v5. [Ankit]
Macro HDCP_2_2_RX_CAPS_VERSION_VAL is reused [Uma]
Converted a function to inline [Uma]
%s/uintxx_t/uxx
v8:
Errors due to sinks are reported as DEBUG logs.
Adjust to the new mei interface.
v9:
ARRAY_SIZE for no of array members [Jon & Daniel]
return of the wait_for_cp_irq is made as void [Daniel]
Wait for HDCP2.2 msg is done by polling the reg bit rather than
waiting on CP_IRQ. [Daniel]
hdcp adaptation is added as a const in the hdcp_shim [Daniel]
v10:
config_stream_type is redefined [Daniel]
DP Errata specific defines are moved into intel_dp.c.
Ramalingam C [Sat, 16 Feb 2019 17:36:59 +0000 (23:06 +0530)]
drm: removing the DP Errata msg and its msg id
Since the DP ERRATA message is not defined in the spec, its structure
definition is removed from drm_hdcp.h
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/1550338640-17470-13-git-send-email-ramalingam.c@intel.com
When a repeater notifies a downstream topology change, this patch
reauthenticates the repeater alone without disabling the HDCP
encryption. If that fails, a complete reauthentication is executed.
v2:
Rebased.
v3:
Typo in commit msg is fixed [Uma]
v4:
Rebased as part of patch reordering.
Minor style fixes.
v5:
Rebased.
v6:
Rebased.
v7:
Errors due to sinks are reported as DEBUG logs.
Ramalingam C [Sat, 16 Feb 2019 17:36:57 +0000 (23:06 +0530)]
drm/i915: Implement HDCP2.2 link integrity check
Implements the link integrity check, performed once every 500 ms.
Once encryption is enabled, an ongoing Link Integrity Check is
performed by the HDCP Receiver to check that cipher synchronization
is maintained between the HDCP Transmitter and the HDCP Receiver.
On detecting loss of synchronization, the HDCP Receiver must assert
the corresponding bits of the RxStatus register. The Transmitter polls
the RxStatus register and may initiate re-authentication.
v2:
Rebased.
v3:
enum check_link_response is used to check the link status [Uma]
v4:
Rebased as part of patch reordering.
v5:
Required members of intel_hdcp are defined [Sean Paul]
v6:
hdcp2_check_link is cancelled at required places.
v7:
Rebased for the component i/f changes.
Errors due to the sinks are reported as DEBUG logs.
v8:
hdcp_check_work is used for both hdcp1 and hdcp2 check_link [Daniel]
hdcp2.2 encryption status check is put under WARN_ON [Daniel]
drm_hdcp.h changes are moved into separate patch [Daniel]
v9:
enum check_link_status is defined at intel_drv.h [Daniel]
Implements the HDCP2.2 repeater authentication steps, such as verifying
the downstream topology and sending stream management information.
v2: Rebased.
v3:
-EINVAL is returned for topology error and rollover scenario.
Endianness conversion func from drm_hdcp.h is used [Uma]
v4:
Rebased as part of patches reordering.
Defined the mei service functions [Daniel]
v5:
Redefined the mei service functions as per comp redesign.
v6:
%s/uintxx_t/uxx
Check for comp_master is removed.
v7:
Adjust to the new mei interface.
style issue fixed.
v8:
drm_hdcp.h change is moved into separate patch [Daniel]
v9:
%s/__swab16/cpu_to_be16. [Tomas]
Reviewed-by Uma.
Implements HDCP2.2 authentication for HDCP2.2 receivers, with the
following steps:
Authentication and Key Exchange (AKE).
Locality Check (LC).
Session Key Exchange (SKE).
DP Errata for stream type configuration for receivers.
At AKE, the HDCP Receiver’s public key certificate is verified by the
HDCP Transmitter. A Master Key km is exchanged.
At LC, the HDCP Transmitter enforces locality on the content by
requiring that the Round Trip Time (RTT) between a pair of messages
is not more than 20 ms.
At SKE, the HDCP Transmitter exchanges the Session Key ks with
the HDCP Receiver.
In DP HDCP2.2, the encryption and decryption logic uses the stream type
as one of its parameters. So before encryption is enabled, the stream
type must be communicated to the DP HDCP2.2 receiver. This is added to
the spec as ERRATA.
This generic implementation is complete only with the hdcp2 specific
functions defined at hdcp_shim.
v2: Rebased.
v3:
%s/PARING/PAIRING
Coding style fixing [Uma]
v4:
Rebased as part of patch reordering.
Defined the functions for mei services. [Daniel]
v5:
Redefined the mei service functions as per comp redesign.
Required intel_hdcp members are defined [Sean Paul]
v6:
Typo of cipher is Fixed [Uma]
%s/uintxx_t/uxx
Check for comp_master is removed.
v7:
Adjust to the new interface.
Avoid using bool structure members. [Tomas]
v8: Rebased.
v9:
bool is used in struct intel_hdcp [Daniel]
config_stream_type is redesigned [Daniel]
Reviewed-by Uma.
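The ordering of the steps described in this entry (AKE, then LC with its 20 ms RTT bound, then SKE) can be sketched as follows. This is a simplified illustration, not the driver code: struct auth_inputs and hdcp2_authenticate() are hypothetical, and each step's outcome is passed in as a plain value instead of being produced by real message exchanges.

```c
#include <stdbool.h>
#include <stdint.h>

/* RTT bound for the locality check, taken from the text above. */
#define LC_RTT_LIMIT_MS 20u

/* Step at which authentication stopped; STEP_DONE means all steps passed. */
enum hdcp2_step { STEP_AKE, STEP_LC, STEP_SKE, STEP_DONE };

/* Hypothetical, pre-computed outcomes of each exchange. */
struct auth_inputs {
	bool cert_valid;	/* AKE: receiver certificate verified */
	uint32_t rtt_ms;	/* LC: measured round trip time */
	bool ske_sent;		/* SKE: session key delivered */
};

static bool locality_check_ok(uint32_t rtt_ms)
{
	return rtt_ms <= LC_RTT_LIMIT_MS;
}

/* Runs the steps in order and reports the first one that fails. */
static enum hdcp2_step hdcp2_authenticate(const struct auth_inputs *in)
{
	if (!in->cert_valid)
		return STEP_AKE;
	if (!locality_check_ok(in->rtt_ms))
		return STEP_LC;
	if (!in->ske_sent)
		return STEP_SKE;
	return STEP_DONE;
}
```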
Ramalingam C [Sat, 16 Feb 2019 17:36:53 +0000 (23:06 +0530)]
drm/i915: Enable and Disable of HDCP2.2
Since HDCP2.2 is more secure than HDCP1.4, HDCP2.2 is enabled when a
setup supports both HDCP2.2 and HDCP1.4.
When enabling HDCP2.2 fails and HDCP1.4 is supported, HDCP1.4 is
enabled instead.
This change implements the sequence for enabling and disabling
HDCP2.2 authentication and HDCP2.2 port encryption.
v2:
Included few optimization suggestions [Chris Wilson]
Commit message is updated as per the rebased version.
intel_wait_for_register is used instead of wait_for. [Chris Wilson]
v3:
Extra comment added and Style issue fixed [Uma]
v4:
Rebased as part of patch reordering.
HDCP2 encryption status is tracked.
HW state check is moved into WARN_ON [Daniel]
v5:
Redefined the mei service functions as per comp redesign.
Merged patches related to hdcp2.2 enabling and disabling [Sean Paul].
Required shim functionality is defined [Sean Paul]
v6:
Return values are handled [Uma]
Realigned the code.
Check for comp_master is removed.
v7:
HDCP2.2 is attempted only if mei interface is up.
Adjust to the new interface
Avoid bool usage in struct [Tomas]
v8:
mei_binded status check is removed.
%s/hdcp2_in_use/hdcp2_encrypted
v9:
bool is used in struct intel_hdcp. [Daniel]
v10:
panel is replaced with sink [Uma]
Mei interface decided the hdcp2_capability.
WARN_ON if hdcp_enable is called when hdcp state is ENABLED.
Reviewed-by Uma.
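The preference order described in this entry (try HDCP2.2 first, fall back to HDCP1.4) might look like the sketch below. The struct and function names are hypothetical, and the enable outcomes are modelled as flags rather than real authentication attempts.

```c
#include <stdbool.h>

enum hdcp_version { HDCP_NONE, HDCP_1_4, HDCP_2_2 };

/* Hypothetical description of what a setup supports and whether enabling works. */
struct hdcp_setup {
	bool hdcp2_supported;
	bool hdcp14_supported;
	bool hdcp2_enable_ok;	/* simulated result of enabling HDCP2.2 */
	bool hdcp14_enable_ok;	/* simulated result of enabling HDCP1.4 */
};

/* Returns the version that ended up enabled, preferring HDCP2.2. */
static enum hdcp_version hdcp_pick_and_enable(const struct hdcp_setup *s)
{
	if (s->hdcp2_supported && s->hdcp2_enable_ok)
		return HDCP_2_2;
	/* HDCP2.2 unavailable or failed: fall back to HDCP1.4 if supported. */
	if (s->hdcp14_supported && s->hdcp14_enable_ok)
		return HDCP_1_4;
	return HDCP_NONE;
}
```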
Ramalingam C [Sat, 16 Feb 2019 17:36:52 +0000 (23:06 +0530)]
drm/i915: hdcp1.4 CP_IRQ handling and SW encryption tracking
"hdcp_encrypted" flag is defined to denote the HDCP1.4 encryption status.
This SW tracking is used to determine whether a real hdcp1.4 disable
and hdcp_check_link upon CP_IRQ are needed.
On CP_IRQ, we filter the CP_IRQs related to states such as link failure
and reauthentication request, and handle them in hdcp_check_link.
CP_IRQs corresponding to authentication msg availability are ignored.
WARN_ON is added for the abrupt stop of HDCP encryption of a port.
v2:
bool is used in struct for the cleaner coding. [Daniel]
check_link work_fn is scheduled for cp_irq handling [Daniel]
v3:
rebased.
Ramalingam C [Sat, 16 Feb 2019 17:36:51 +0000 (23:06 +0530)]
drm/i915: MEI interface implementation
Defining the mei-i915 interface functions and initialization of
the interface.
v2:
Adjust to the new interface changes. [Tomas]
Added further debug logs for the failures at MEI i/f.
port in hdcp_port data is equipped to handle -ve values.
v3:
mei comp is matched for global i915 comp master. [Daniel]
In hdcp_shim hdcp_protocol() is replaced with const variable. [Daniel]
mei wrappers are adjusted as per the i/f change [Daniel]
v4:
port initialization is done only at hdcp2_init [Danvet]
v5:
I915 registers a subcomponent to be matched with mei_hdcp [Daniel]
v6:
HDCP_disable for all connectors in case of comp_unbind.
Tear down HDCP comp interface at i915_unload [Daniel]
v7:
Component init and fini are moved out of connector ops [Daniel]
hdcp_disable is not called from unbind. [Daniel]
v8:
subcomponent name is dropped as it is already merged.
Ramalingam C [Sat, 16 Feb 2019 17:36:50 +0000 (23:06 +0530)]
drm/i915: Initialize HDCP2.2
Add the HDCP2.2 initialization to the existing HDCP1.4 stack.
v2:
mei interface handle is protected with mutex. [Chris Wilson]
v3:
Notifiers are used for the mei interface state.
v4:
Poll for mei client device state
Error msg for out of mem [Uma]
Inline req for init function removed [Uma]
v5:
Rebase as Part of reordering.
Component is used for the I915 and MEI_HDCP interface [Daniel]
v6:
HDCP2.2 uses the I915 component master to communicate with mei_hdcp
- [Daniel]
Required HDCP2.2 variables defined [Sean Paul]
v7:
intel_hdcp2.2_init returns void [Uma]
Realigning the codes.
v8:
Avoid using bool structure members.
MEI interface related changes are moved into separate patch.
Commit msg is updated accordingly.
intel_hdcp_exit is defined and used from i915_unload
v9:
Movement of the hdcp_check_link is moved to new patch [Daniel]
intel_hdcp2_exit is removed as mei_comp will be unbound in i915_unload.
v10:
bool is used in struct to make coding simpler. [Daniel]
hdmi hdcp init is placed correctly after encoder attachment.
v11:
hdcp2_capability check is moved into hdcp.c [Tomas]
Chris Wilson [Tue, 19 Feb 2019 12:21:54 +0000 (12:21 +0000)]
drm/i915: Avoid reset lock in writing fence registers
The idea of taking the reset lock around writing the fence register was
to serialise the mmio write we also perform during the reset where those
registers get clobbered. However, the lock is overkill as write tearing
between reset and fence_update() is harmless; the final value of the
fence register is the same. A race between revoke_fences() and
fence_update() is also harmless at this point as on the fault path where
this is necessary, we acquire the reset lock to coordinate ourselves in
the upper layer.
The danger of acquiring the reset lock again in fence_update() is that
we may recurse from the shrinker along the i915_gem_fault() path.
<4> [125.739646] ============================================
<4> [125.739652] WARNING: possible recursive locking detected
<4> [125.739659] 5.0.0-rc6-ga6e4cbf00557-drmtip_223+ #1 Tainted: G U
<4> [125.739666] --------------------------------------------
<4> [125.739672] gem_mmap_gtt/1017 is trying to acquire lock:
<4> [125.739679] 00000000a730190a (&dev_priv->gpu_error.reset_backoff_srcu){+.+.}, at: i915_reset_trylock+0x0/0x310 [i915]
<4> [125.739848]
but task is already holding lock:
<4> [125.739854] 00000000a730190a (&dev_priv->gpu_error.reset_backoff_srcu){+.+.}, at: i915_reset_trylock+0x192/0x310 [i915]
<4> [125.739918]
other info that might help us debug this:
<4> [125.739925] Possible unsafe locking scenario:
Chris Wilson [Wed, 20 Feb 2019 14:56:37 +0000 (14:56 +0000)]
drm/i915: Beware temporary wedging when determining -EIO
At a few points in our uABI, we check to see if the driver is wedged and
report -EIO back to the user in that case. However, as we perform the
check and reset asynchronously (where once before they were both
serialised by the struct_mutex), we may instead see the temporary wedging
used to cancel inflight rendering to avoid a deadlock during reset
(caused by either us timing out in our reset handler,
i915_wedge_on_timeout or with malice aforethought in intel_reset_prepare
for a stuck modeset). If we suspect this is the case, that is we see a
wedged driver *and* reset in progress, then wait until the reset is
resolved before reporting upon the wedged status.
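The check described in this entry can be sketched as below. This is a simplified model under stated assumptions: struct gpu_state, check_wedged() and the two reset_* helpers are hypothetical, and the wait is modelled as a callback that resolves the reset one way or the other.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical driver state: wedged flag plus reset-in-progress flag. */
struct gpu_state {
	bool wedged;
	bool reset_in_progress;
};

typedef void (*wait_for_reset_fn)(struct gpu_state *gpu);

/* Simulated outcome: the reset recovers the GPU and clears the wedge. */
static void reset_recovers(struct gpu_state *gpu)
{
	gpu->reset_in_progress = false;
	gpu->wedged = false;
}

/* Simulated outcome: the reset finishes but the GPU stays wedged. */
static void reset_fails(struct gpu_state *gpu)
{
	gpu->reset_in_progress = false;
}

/* Report -EIO only for a wedge that persists once any reset has resolved. */
static int check_wedged(struct gpu_state *gpu, wait_for_reset_fn wait)
{
	if (gpu->wedged && gpu->reset_in_progress)
		wait(gpu);	/* the wedge may be temporary: wait first */
	return gpu->wedged ? -EIO : 0;
}
```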
Dave Airlie [Wed, 20 Feb 2019 00:08:35 +0000 (10:08 +1000)]
Merge branch 'linux-5.1' of git://github.com/skeggsb/linux into drm-next
Various fixes/cleanups, along with initial support for SVM features
utilising HMM address-space mirroring and device memory migration.
There's a lot more work to do in these areas, both in terms of
features and efficiency, but these can slowly trickle in later down
the track.
Jérôme Glisse [Tue, 7 Aug 2018 20:13:16 +0000 (16:13 -0400)]
drm/nouveau/svm: new ioctl to migrate process memory to GPU memory
This adds an ioctl to migrate a range of process address space to the
device memory. On platforms without a cache-coherent bus (x86, ARM, ...)
this means that the CPU cannot access that range directly; instead the
CPU will fault, which will migrate the memory back to system memory.
This is behind a staging flag so that we can evolve the API.
Device memory can be used in SVM, in which case we do not have any of
the existing buffer objects. This commit adds infrastructure to allow
use of device memory without nouveau_bo. Again this is a temporary
solution until a rework of GPU memory management.
Ben Skeggs [Thu, 5 Jul 2018 02:57:12 +0000 (12:57 +1000)]
drm/nouveau/svm: initial support for shared virtual memory
This uses HMM to mirror a process' CPU page tables into a channel's page
tables, and keep them synchronised so that both the CPU and GPU are able
to access the same memory at the same virtual address.
While this code also supports Volta/Turing, it's only enabled for Pascal
GPUs currently due to channel recovery being unreliable right now on the
later GPUs.
Ben Skeggs [Tue, 19 Feb 2019 07:21:48 +0000 (17:21 +1000)]
drm/nouveau: prepare for enabling svm with existing userspace interfaces
For a channel to make use of SVM features, it requires a different GPU MMU
configuration than we would normally use, which is not desirable to switch
to unless a client is actively going to use SVM.
In order to support SVM without more extensive changes to the userspace
interfaces, the SVM_INIT ioctl needs to replace the previous configuration
safely.
The only way we can currently do this safely, accounting for some unlikely
failure conditions, is to allocate the new VMM without destroying the last
one, and prioritising the SVM-enabled configuration in the code that cares.
This will get cleaned up again further down the track.
Ben Skeggs [Tue, 8 May 2018 10:39:48 +0000 (20:39 +1000)]
drm/nouveau/mmu/gp100-: support vmms with gcc/tex replayable faults enabled
Some GPU units are capable of supporting "replayable" page faults, where
the execution unit will wait for SW to fixup GPU page tables rather than
triggering a channel-fatal fault.
This feature isn't useful (it's harmful, even) unless something like HMM
is being used to manage events appearing in the replayable fault buffer,
so, it's disabled by default.
This commit allows a client to request it be enabled.
Ben Skeggs [Mon, 9 Jul 2018 06:07:40 +0000 (16:07 +1000)]
drm/nouveau/mmu/gp100-: add privileged methods for fault replay/cancel
Host methods exist to do at least some of what we need, but we are not
currently pushing replay/cancels through a channel like UVM does as it's
not clear whether it's necessary in our case (UVM also updates PTEs with
the GPU).
UVM also pushes a software method for fault cancels on Pascal, seemingly
because the host methods don't appear to be sufficient. If/when we want
to push the replay/cancel on the GPU, we can re-purpose the cancellation
code here to implement that swmthd.
Keep it simple for now, until we figure out exactly what we need here.
Ben Skeggs [Wed, 13 Jun 2018 06:25:53 +0000 (16:25 +1000)]
drm/nouveau/mmu: support initialisation of client-managed address-spaces
NVKM is currently responsible for managing the allocation of a client's
GPU address-space, but there's various use-cases (ie. HMM address-space
mirroring) where giving a client more direct control is desirable.
This commit allows for a VMM to be created where the area allocated for
NVKM is limited to a client-specified window, the remainder of address-
space is controlled directly by the client.
Leaving a window is necessary to support various internal requirements,
but also to support existing allocation interfaces as not all of the HW
is capable of working with a HMM allocation.
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau: allow accelerated buffer moves even when gr isn't present
There's no need to avoid using copy engines if gr init fails for some
reason (usually missing FW, or incomplete bring-up).
It's not terribly useful for an end-user, but it'll slightly speed up
suspend/resume when saving fb contents, and allow for host/ce code to
be validated.
Ben Skeggs [Tue, 12 Feb 2019 12:28:13 +0000 (22:28 +1000)]
drm/nouveau: allocate kernel channel(s) before initialising display
Some of the pre-NV50 code depends on SW methods to implement synchronisation
for page flips, and we want to move this setup out of common code, thus
we require the channel to have been allocated before display init.
Colin Ian King [Mon, 8 Oct 2018 20:47:36 +0000 (21:47 +0100)]
drm/nouveau: fix missing break in switch statement
The NOUVEAU_GETPARAM_PCI_DEVICE case is missing a break statement and falls
through to the following NOUVEAU_GETPARAM_BUS_TYPE case and may end up
re-assigning the getparam->value to an undesired value. Fix this by adding
in the missing break.
Detected by CoverityScan, CID#1460507 ("Missing break in switch")
Fixes: 35a773670318 ("drm/nouveau: remove trivial cases of nvxx_device() usage")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
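For illustration, the shape of the corrected switch is sketched below. The enum, the constants and getparam_value() are hypothetical stand-ins, not the nouveau definitions. The comment marks the break whose absence would let the first case fall through and overwrite the value.

```c
#include <stdint.h>

/* Hypothetical parameter ids and values, for illustration only. */
enum getparam_id { GETPARAM_PCI_DEVICE, GETPARAM_BUS_TYPE };

#define EXAMPLE_PCI_DEVICE_ID 0x1234u
#define EXAMPLE_BUS_TYPE      2u

static uint64_t getparam_value(enum getparam_id id)
{
	uint64_t value = 0;

	switch (id) {
	case GETPARAM_PCI_DEVICE:
		value = EXAMPLE_PCI_DEVICE_ID;
		break;	/* without this, the next case would overwrite value */
	case GETPARAM_BUS_TYPE:
		value = EXAMPLE_BUS_TYPE;
		break;
	}
	return value;
}
```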
This is much louder than we want. VCPI allocation failures are quite
normal, since they will happen if any part of the modesetting process is
interrupted by removing the DP MST topology in question. So just print a
debugging message on VCPI failures instead.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Fixes: 810c68f17dd6 ("drm/nouveau/kms/nv50: initial support for DP 1.2 multi-stream")
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Cc: nouveau@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Colin Ian King [Wed, 19 Dec 2018 15:29:49 +0000 (15:29 +0000)]
drm/nouveau/pmu: don't print reply values if exec is false
Currently the uninitialized values in the array reply are printed out
when exec is false and nvkm_pmu_send has not updated the array. Avoid
confusion by only dumping out these values if they have been actually
updated.
Detected by CoverityScan, CID#1271291 ("Uninitialized scalar variable")
Fixes: 6aad52bc87aa ("drm/nouveau/pmu: rename from pwr (no binary change)")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Colin Ian King [Sun, 25 Nov 2018 17:09:18 +0000 (17:09 +0000)]
drm/nouveau/bios/ramcfg: fix missing parentheses when calculating RON
Currently, the expression for calculating RON is always going to result
in zero no matter the value of ram->mr[1] because the ! operator has
higher precedence than the shift >> operator. I believe adding the missing
parentheses around the expression before applying the ! operator will
give the desired result.
[ Note, not tested ]
Detected by CoverityScan, CID#1324005 ("Operands don't affect result")
Fixes: 9c9ccb29aebf ("drm/nouveau/bios/ramcfg: Separate out RON pull value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
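The precedence issue can be reproduced in isolation. In the sketch below, the 0x300 mask and the shift by 8 are assumed values chosen for illustration, and ron_buggy()/ron_fixed() are hypothetical helpers rather than the driver's code. Because ! yields only 0 or 1, shifting its result right by 8 always gives 0.

```c
#include <stdint.h>

/* Assumed field layout, for illustration only. */
#define RON_MASK  0x300u
#define RON_SHIFT 8

/* Buggy form: ! binds tighter than >>, so this is (!(...)) >> 8, always 0. */
static unsigned int ron_buggy(uint32_t mr1)
{
	return !(mr1 & RON_MASK) >> RON_SHIFT;
}

/* Fixed form: extract the field first, then negate it. */
static unsigned int ron_fixed(uint32_t mr1)
{
	return !((mr1 & RON_MASK) >> RON_SHIFT);
}
```

With these definitions, ron_buggy() returns 0 for every input, while ron_fixed() returns 1 only when the masked field is zero.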
GF117 appears to use the same register as GK104 (but still with the
general Fermi readout mechanism).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108980
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Jordan Crouse [Tue, 19 Feb 2019 18:40:19 +0000 (11:40 -0700)]
drm/msm: Truncate the buffer object name if the copy from user failed
(Resend since there was a compile error that I forgot to commit before sending)
If there is an error while doing a copy_from_user() for MSM_INFO_SET_NAME,
make sure to truncate the object name so that there isn't a chance that
we'll have random data in the string.
This is on top of [1] reported and fixed by Dan Carpenter.
Fixes: aa3287591b0b ("drm/msm: add uapi to get/set debug name")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Dan Carpenter [Thu, 14 Feb 2019 07:19:27 +0000 (10:19 +0300)]
drm/msm: fix an error code in the ioctl
The copy_to/from_user() functions return the number of bytes remaining
to be copied but we should return -EFAULT to the user.
Fixes: aa3287591b0b ("drm/msm: add uapi to get/set debug name")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
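The return-value convention behind this fix, together with the name truncation from the preceding entry, can be modelled in user space as below. Everything here is a hypothetical stand-in: fake_copy_from_user() only imitates the "bytes not copied" return convention, and set_name() with its -EINVAL length check is an assumption, not the msm ioctl code.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Stand-in for copy_from_user(): copies at most `ok` bytes and returns
 * the number of bytes that were NOT copied (0 on full success).
 */
static size_t fake_copy_from_user(void *dst, const void *src,
				  size_t n, size_t ok)
{
	size_t copied = n < ok ? n : ok;

	memcpy(dst, src, copied);
	return n - copied;
}

/* Hypothetical setter: name_sz must be at least 1. */
static int set_name(char *name, size_t name_sz,
		    const char *src, size_t len, size_t ok)
{
	if (len >= name_sz)
		return -EINVAL;

	if (fake_copy_from_user(name, src, len, ok)) {
		name[0] = '\0';	/* truncate: leave no partial data behind */
		return -EFAULT;	/* an errno, not the leftover byte count */
	}

	name[len] = '\0';
	return 0;
}
```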
Chris Wilson [Tue, 19 Feb 2019 12:21:52 +0000 (12:21 +0000)]
drm/i915: Use time based guilty context banning
Currently, we accumulate a score each time a context hangs the GPU, offset
against the number of requests it submits, and if that score exceeds a
certain threshold, we ban that context from submitting any more requests
(cancelling any work in flight). In contrast, we use a simple timer on
the file: if we see more than 9 hangs, each less than 60s apart, in
total across all of its contexts, we will ban the client from creating
any more contexts. This leads to a confusing situation where the file
may be banned before the context, so let's use a simple timer scheme for
each.
If the context submits 3 hanging requests within a 120s period, declare
it forbidden to ever send more requests.
This has the advantage of not being easy to repair by simply sending
empty requests, but has the disadvantage that if the context is idle
then it is forgiven. However, if the context is idle, it is not
disrupting the system, but a hog can evade the request counting and
cause much more severe disruption to the system.
Updating ban_score from request retirement is dubious as the retirement
is purposely not in sync with request submission (i.e. we try and batch
retirement to reduce overhead and avoid latency on submission), which
leads to surprising situations where we can forgive a hang immediately
due to a backlog of requests from before the hang being retired
afterwards.
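A time-based rule of the kind described in this entry (3 hangs within a 120s period) can be sketched as follows. The structure and names are hypothetical and the bookkeeping in i915 may differ. The sketch keeps the timestamps of the two most recent hangs and assumes timestamps never go backwards.

```c
#include <stdbool.h>
#include <stdint.h>

#define HANG_WINDOW_S 120u	/* period from the text above */
#define HANGS_TO_BAN  3u	/* hangs within the period that trigger a ban */

/* Hypothetical per-context state; zero-initialise before use. */
struct ctx_hang_state {
	uint64_t prev[HANGS_TO_BAN - 1];	/* earlier hang times, oldest first */
	unsigned int count;			/* valid entries in prev[] */
	bool banned;
};

/* Record a hang at time now_s (seconds); returns true once banned. */
static bool ctx_mark_hang(struct ctx_hang_state *c, uint64_t now_s)
{
	unsigned int i;

	/* Ban if this hang and the previous two all fit inside the window. */
	if (c->count == HANGS_TO_BAN - 1 &&
	    now_s - c->prev[0] < HANG_WINDOW_S)
		c->banned = true;

	/* Slide the history: drop the oldest entry, append the new one. */
	for (i = 0; i + 1 < HANGS_TO_BAN - 1; i++)
		c->prev[i] = c->prev[i + 1];
	c->prev[HANGS_TO_BAN - 2] = now_s;
	if (c->count < HANGS_TO_BAN - 1)
		c->count++;

	return c->banned;
}
```

Because only timestamps are compared, submitting empty requests between hangs does not change the outcome, whereas a context that stays idle long enough lets old hangs age out of the window.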
Chris Wilson [Tue, 19 Feb 2019 12:21:57 +0000 (12:21 +0000)]
drm/i915: Trim delays for wedging
CI still reports the occasional multi-second delay for resets, in
particular along the wedge+recovery paths. As the likely, and unbounded,
delay here is from sync_rcu, use the expedited variant instead.
Chris Wilson [Mon, 18 Feb 2019 09:46:28 +0000 (09:46 +0000)]
drm/i915: Include reminders about leaving no holes in uAPI enums
We don't want to pre-reserve any holes in our uAPI for that is a sign of
nefarious and hidden activity. Add a reminder about our uAPI
expectations to encourage good practice when adding new defines/enums.
Chris Wilson [Mon, 18 Feb 2019 15:31:06 +0000 (15:31 +0000)]
drm/i915: Restore interrupt enabling after a reset
At least on i965g and i965gm, performing a device reset clobbers the IER
resulting in loss of interrupts thereafter. So, run the irq_postinstall
hook to restore them.
v2: Ville pointed out that he already attempted to solve this problem by
reinstalling the interrupts in intel_reset_finish() (part of the display
handling around reset). However, reinstalling the irq clobbers the
i915->irq_mask which we need for handling MI_USER_INTERRUPTS, and does
so too late to handle any interrupts generated from resuming the rings.
The simple solution to both is to pull the interrupt reenabling from
afterwards to around the device reset.
Chris Wilson [Mon, 18 Feb 2019 10:58:21 +0000 (10:58 +0000)]
drm/i915: Optionally disable automatic recovery after a GPU reset
Some clients, such as mesa, may only emit minimal incremental batches
that rely on the logical context state from previous batches. They know
that recovery is impossible after a hang as their required GPU state is
lost, and that each in flight and subsequent batch will hang (resetting
the context image back to default perpetuating the problem).
To avoid getting into the state in the first place, we can allow clients
to opt out of automatic recovery and elect to ban any guilty context
following a hang. This prevents the continual stream of hangs and allows
the client to recreate their context and rebuild the state from scratch.
v2: Prefer calling it recoverable rather than unrecoverable.
References: https://lists.freedesktop.org/archives/mesa-dev/2019-February/215431.html
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org> # for mesa
Link: https://patchwork.freedesktop.org/patch/msgid/20190218105821.17293-1-chris@chris-wilson.co.uk
Chris Wilson [Sun, 17 Feb 2019 20:25:18 +0000 (20:25 +0000)]
drm/i915/selftests: Move local mock_ggtt allocations to the heap
This struct appears quite large and pushes our stack frame over
1024 bytes -- too high for conservative setups. So move the mock_ggtt
struct to the heap.
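The general pattern, moving a large local object off the stack, is sketched below in user-space C. struct big_mock and run_mock_selftest() are hypothetical, and calloc()/free() stand in for the kernel allocator that the selftest would use.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical large object: too big to place comfortably on the stack. */
struct big_mock {
	unsigned char state[4096];
	int ready;
};

static int run_mock_selftest(void)
{
	struct big_mock *mock;
	int err = 0;

	/* Heap allocation keeps this function's stack frame small. */
	mock = calloc(1, sizeof(*mock));
	if (!mock)
		return -ENOMEM;

	mock->ready = 1;
	if (mock->state[0] != 0 || !mock->ready)
		err = -EINVAL;

	free(mock);
	return err;
}
```

The cost is an extra failure path (-ENOMEM) that a stack allocation does not have.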
Linus Torvalds [Sun, 17 Feb 2019 17:22:01 +0000 (09:22 -0800)]
Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fixes from Ingo Molnar:
"This tree reverts a GICv3 commit (which was broken) and fixes it in
another way, by adding a memblock build-time entries quirk for ARM64"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi/arm: Revert "Defer persistent reservations until after paging_init()"
arm64, mm, efi: Account for GICv3 LPI tables in static memblock reserve table
Linus Torvalds [Sun, 17 Feb 2019 16:44:38 +0000 (08:44 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"Three changes:
- An UV fix/quirk to pull UV BIOS calls into the efi_runtime_lock
locking regime. (This done by aliasing __efi_uv_runtime_lock to
efi_runtime_lock, which should make the quirk nature obvious and
maintain the general policy that the EFI lock (name...) isn't
exposed to drivers.)
- Our version of MAGA: Make a.out Great Again.
- Add a new Intel model name enumerator to an upstream header to help
reduce dependencies going forward"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/platform/UV: Use efi_runtime_lock to serialise BIOS calls
x86/CPU: Add Icelake model number
x86/a.out: Clear the dump structure initially
Linus Torvalds [Sun, 17 Feb 2019 16:38:13 +0000 (08:38 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Two fixes on the kernel side: fix an over-eager condition that failed
larger perf ring-buffer sizes, plus fix crashes in the Intel BTS code
for a corner case, found by fuzzing"
Linus Torvalds [Sun, 17 Feb 2019 16:36:21 +0000 (08:36 -0800)]
Merge tag 'powerpc-5.0-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
"Just one fix, for pgd/pud_present() which were broken on big endian
since v4.20, leading to possible data corruption.
Thanks to: Aneesh Kumar K.V., Erhard F., Jan Kara"
* tag 'powerpc-5.0-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/64s: Fix possible corruption on big endian due to pgd/pud_present()
Linus Torvalds [Sun, 17 Feb 2019 16:34:10 +0000 (08:34 -0800)]
Merge tag 'csky-for-linus-5.0-rc6' of git://github.com/c-sky/csky-linux
Pull arch/csky fixes from Guo Ren:
"Here are some fixup patches for 5.0-rc6"
* tag 'csky-for-linus-5.0-rc6' of git://github.com/c-sky/csky-linux:
csky: Fixup dead loop in show_stack
csky: Fixup io-range page attribute for mmap("/dev/mem")
csky: coding convention: Use task_stack_page
csky: Fixup wrong pt_regs size
csky: Fixup _PAGE_GLOBAL bit for 610 tlb entry