Chris Wilson [Mon, 4 Apr 2016 13:46:42 +0000 (14:46 +0100)]
mm/vmap: Add a notifier for when we run out of vmap address space
vmaps are temporary kernel mappings that may be of long duration.
Reusing a vmap on an object is preferrable for a driver as the cost of
setting up the vmap can otherwise dominate the operation on the object.
However, the vmap address space is rather limited on 32bit systems and
so we add a notification for vmap pressure in order for the driver to
release any cached vmappings.
The interface is styled after the oom-notifier where the callees are
passed a pointer to an unsigned long counter for them to indicate if they
have freed any space.
v2: Guard the blocking notifier call with gfpflags_allow_blocking()
v3: Correct typo in forward declaration and move to head of file
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Rientjes <rientjes@google.com> Cc: Roman Peniaev <r.peniaev@gmail.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org Acked-by: Andrew Morton <akpm@linux-foundation.org> # for inclusion via DRM Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1459777603-23618-3-git-send-email-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Chris Wilson [Mon, 4 Apr 2016 13:46:41 +0000 (14:46 +0100)]
drm/i915/shrinker: Account for unshrinkable unbound pages
Since we only attempt to purge an object if can_release_pages() report
true, we should also only add it to the count of potential recoverable
pages when can_release_pages() is true.
drm/i915: Move execlists irq handler to a bottom half
Doing a lot of work in the interrupt handler introduces huge
latencies to the system as a whole.
Most dramatic effect can be seen by running an all engine
stress test like igt/gem_exec_nop/all where, when the kernel
config is lean enough, the whole system can be brought into
multi-second periods of complete non-interactivty. That can
look for example like this:
I could not explain, or find a code path, which would explain
a +20 second lockup, but from some instrumentation it was
apparent the interrupts off proportion of time was between
10-25% under heavy load which is quite bad.
When a interrupt "cliff" is reached, which was >~320k irq/s on
my machine, the whole system goes into a terrible state of the
above described multi-second lockups.
By moving the GT interrupt handling to a tasklet in a most
simple way, the problem above disappears completely.
Testing the effect on sytem-wide latencies using
igt/gem_syslatency shows the following before this patch:
Showing a huge improvement in the unrelated process wake-up
latency. It also shows an approximate halving in the number
of total empty batches submitted during the test. This may
not be worrying since the test puts the driver under
a very unrealistic load with ncpu threads doing empty batch
submission to all GPU engines each.
Another benefit compared to the hard-irq handling is that now
work on all engines can be dispatched in parallel since we can
have up to number of CPUs active tasklets. (While previously
a single hard-irq would serially dispatch on one engine after
another.)
More interesting scenario with regards to throughput is
"gem_latency -n 100" which shows 25% better throughput and
CPU usage, and 14% better dispatch latencies.
I did not find any gains or regressions with Synmark2 or
GLbench under light testing. More benchmarking is certainly
required.
v2:
* execlists_lock should be taken as spin_lock_bh when
queuing work from userspace now. (Chris Wilson)
* uncore.lock must be taken with spin_lock_irq when
submitting requests since that now runs from either
softirq or process context.
v3:
* Expanded commit message with more testing data;
* converted missed locking sites to _bh;
* added execlist_lock comment. (Chris Wilson)
v4:
* Mention dispatch parallelism in commit. (Chris Wilson)
* Do not hold uncore.lock over MMIO reads since the block
is already serialised per-engine via the tasklet itself.
(Chris Wilson)
* intel_lrc_irq_handler should be static. (Chris Wilson)
* Cancel/sync the tasklet on GPU reset. (Chris Wilson)
* Document and WARN that tasklet cannot be active/pending
on engine cleanup. (Chris Wilson/Imre Deak)
Ville Syrjälä [Mon, 21 Mar 2016 14:43:22 +0000 (14:43 +0000)]
drm/i915: Fix plane init failure paths
Deal with errors from drm_universal_plane_init() in primary and cursor
plane init paths (sprites were already covered). Also make the code
neater by using goto for error handling.
v2: Rebased due to drm_universal_plane_init() 'name' parameter
v3: Another rebase due to s/""/NULL/
v4: Rebased on drm-nightly (Matthew Auld)
v5: Fix email address (Matthew Auld)
Ville Syrjälä [Tue, 15 Mar 2016 14:39:56 +0000 (16:39 +0200)]
drm/i915: Implement WaPixelRepeatModeFixForC0:chv
DPLL_MD(PIPE_C) is AWOL on CHV. Instead of fixing it someone added
chicken bits to propagate the pixel multiplier from DPLL_MD(PIPE_B)
to either pipe B or C. So do that to make pixel repeat work on pipes
B and C. Pipe A is fine without any tricks.
Fortunately the pixel repeat propagation appears to be a oneshot
operation, so once the value has been written we can clear the
chicken bits. So it is still possible to drive pipe B and C with
different pixel multipliers simultaneosly.
Looks like DPLL_VGA_MODE_DIS must also be set in DPLL(PIPE_B)
for this to work. But since we keep that bit always set in all
DPLLs there's no problem.
This of course means we can't reliably read out the pixel multiplier
for pipes B and C. That would make the state checker unhappy, so I
added shadow copies of those registers in to dev_priv. The other
option would have been to skip pixel multiplier, dpll_md an dotclock
checks entirely on CHV, but that feels like a serious loss of cross
checking, so just pretending that we have working DPLL MD registers
seemed better. Obviously with the shadow copies we can't detect if
the pixel multiplier was properly configured, nor can we take over
its state from the BIOS, but hopefully people won't have displays
that would be limitd to such crappy modes.
There is one strange flicker still remaining. It's visible on
pipe C/HDMID when HDMIB is enabled while driven by pipe B.
It doesn't occur if pipe A drives HDMIB, nor is there any glitch
on pipe B/HDMIB when port C/HDMID starts up. I don't have a board
with HDMIC so not sure if it happens there too. So I'm not sure
if it's somehow tied in with this strange linkage between pipe B
and C. Sadly I was unable to find an enable sequence that would
avoid the glitch, but at least it's not fatal ie. the output
recovers afterwards.
Ville Syrjälä [Tue, 15 Mar 2016 14:39:55 +0000 (16:39 +0200)]
drm/i915: Make {vlv,chv}_{disable,update}_pll() more similar
The VLV and CHV DPLL disable and update are almost identical in
how the DPLL/DPLL_MD registers need to be set up. But the code
looks more different than it really is. Try to bring them into
line.
Note that we now leave the refclock always enabled for both
DPLLs in the dual channel PHY. But that's perfectly fine since
it's the same clock, and we anyway already do that when turning
the disp2d power well on.
v2: s/chv_update_pll/chv_compute_dpll/
v3: Add a note that we leave refclocks enabled for both DPLLs (Jani)
Ville Syrjälä [Tue, 1 Mar 2016 14:16:23 +0000 (16:16 +0200)]
drm/i915: Disable FDI RX before DDI_BUF_CTL
Bspec is confused w.r.t. the HSW/BDW FDI disable sequence. It lists
FDI RX disable both as step 13 and step 18 in the sequence. But I dug
up an old BUN mail from Art that moved the FDI RX disable to happen
before DDI_BUF_CTL disable. That BUN did not renumber the steps and just
added a note:
"Workaround: Disable PCH FDI Receiver before disabling DDI_BUF_CTL."
The BUN described the symptoms of the fixed issue as:
"PCH display underflow and a black screen on the analog CRT port that
happened after a FDI re-train"
I suppose later someone tried to renumber the steps to match, but forgot
to remove the FDI RX disable from its old position in the sequence.
They also forgot to update the note describing what should be done in
case of an FDI training failure. Currently it says:
"To retry FDI training, follow the Disable Sequence steps to Disable FDI,
but skip the steps related to clocks and PLLs (16, 19, and 20), ..."
It should really say "17, 20, and 21" with the current sequence because
those are the steps that deal with PLLs and whatnot, after step 13 became
FDI RX disable. And had the step 18 FDI RX disable been removed, as I
suspect it should have, the note should actually say "17, 19, and 20".
So, let's move the FDI RX disable to happen before DDI_BUF_CTL disable,
as that would appear to be the correct order based on the BUN.
Note that Art has since unconfused the spec, and so this patch should
now match the steps listed in the spec.
With the patch applied SNB, IVB and ILK are experiencing hard machine
hangs. Original patch was to fix "just" kernel panics so it's not a good
trade-off.
Proper fix for the panic is on the way, lets revert until then.
Fixes: a7442b93cf32 ("drm/i915: Fix races on fbdev") Cc: Lukas Wunner <lukas@wunner.de> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tomi Sarvela <tomi.p.sarvela@intel.com> Cc: stable@vger.kernel.org Acked-by: Lukas Wunner <lukas@wunner.de> Tested-by: Tomi Sarvela <tomi.p.sarvela@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1459510861-29035-1-git-send-email-joonas.lahtinen@linux.intel.com
Vandana Kannan [Thu, 31 Mar 2016 17:45:54 +0000 (23:15 +0530)]
drm/i915: BXT DDI PHY sequence BUN
According to the BSpec update, bit 7 of PORT_CL1CM_DW0 register needs to be
checked to ensure that the register is in accessible state.
Also, based on a BSpec update, changing the timeout value to
check iphypwrgood, from 10ms to wait for up to 100us.
v2: [Ville] use wait_for_us instead of the atomic call.
v3: [Jani/Imre] read register only once
Signed-off-by: Vandana Kannan <vandana.kannan@intel.com> Reported-by: Philippe Lecluse <Philippe.Lecluse@intel.com> Cc: Deak, Imre <imre.deak@intel.com> Cc: Nikula, Jani <jani.nikula@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1459446354-19012-1-git-send-email-vandana.kannan@intel.com
This patch checks for changes in sink count between short pulse
hpds and forces full detect when there is a change.
This will allow both detection of hotplug and unplug of panels
through dongles that give only short pulse for such events.
v2: changed variable type from u8 to bool (Jani)
return immediately if perform_full_detect is set(Siva)
v3: changed method of determining full detection from using
pointer to return code (Siva)
v4: changed comments to indicate meaning of return value of
intel_dp_short_pulse and explain the use of return value
from intel_dp_get_dpcd in intel_dp_short_pulse (Ander)
Tested-by: Nathan D Ciobanu <nathan.d.ciobanu@intel.com> Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com> Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com> Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1459341326-13142-5-git-send-email-shubhangi.shrivastava@intel.com
Sink count can change between short pulse hpd hence this patch
adds a member variable to intel_dp so we can track any changes
between short pulse interrupts.
This patch reads sink_count dpcd always and removes its
read operation based on values in downstream port dpcd.
SINK_COUNT dpcd is not dependent on DOWNSTREAM_PORT_PRESENT dpcd.
SINK_COUNT denotes if a display is attached, while
DOWNSTREAM_PORT_PRESET indicates how many ports are available
in the dongle where display can be attached. so it is possible
for sink count to change irrespective of value in downstream
port dpcd.
Here is a table of possible values and scenarios
sink_count downstream_port
present
0 0 no display is attached
0 1 dongle is connected without display
1 0 display connected directly
1 1 display connected through dongle
v2: Storing value of intel_dp->sink_count that is ready
for consumption. (Ander)
Squashing two commits into one. (Ander)
v3: Added comment to explain the need of early return when
sink count is 0. (Ander)
Tested-by: Nathan D Ciobanu <nathan.d.ciobanu@intel.com> Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com> Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com> Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1459341326-13142-4-git-send-email-shubhangi.shrivastava@intel.com
When created originally intel_dp_check_link_status()
was supposed to handle only link training for short
pulse but has grown into handler for short pulse itself.
This patch cleans up this function by splitting it into
two halves. First intel_dp_short_pulse() is called,
which will be entry point and handle all logic for
short pulse handling while intel_dp_check_link_status()
will retain its original purpose of only doing link
status related work.
intel_dp_short_pulse: All existing code other than
link status read and link training upon error status.
intel_dp_check_link_status:
The link status should be read on short pulse
irrespective of panel being enabled or not so
intel_dp_get_link_status() performs dpcd read first
then based on crtc active / enabled it will
perform the link training.
This is because short pulse is a generic interrupt
which should always be handled, because it may mean:
1. Hotplug/unplug of MST panel
2. Hotplug/unplug of dongle
3. Link status change for other DP panels
v2: Added WARN_ON to intel_dp_check_link_status()
Removed a call to intel_dp_get_link_status() (Ander)
v3: Changed commit message to explain need of link status
being read before performing encoder checks (Daniel)
v4: Changed commit message to explain need of reading
link status on short pulse (Ander)
Tested-by: Nathan D Ciobanu <nathan.d.ciobanu@intel.com> Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com> Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
[anderco: fix parenthesis alignment] Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1459341326-13142-3-git-send-email-shubhangi.shrivastava@intel.com
Current DP detection has DPCD operations split across
intel_dp_hpd_pulse and intel_dp_detect which contains
duplicates as well. Also intel_dp_detect is called
during modes enumeration as well which will result
in multiple dpcd operations. So this patch tries
to solve both these by bringing all DPCD operations
in one single function and make intel_dp_detect
use existing values instead of repeating same steps.
v2: Pulled in a hunk from last patch of the series to
this patch. (Ander)
v3: Added MST hotplug handling. (Ander)
v4: Added a flag to check if detect is performed to
prevent multiple detects on hotplug. (Ander)
Tested-by: Nathan D Ciobanu <nathan.d.ciobanu@intel.com> Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com> Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
[anderco: fix parenthesis aligment] Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1459341326-13142-2-git-send-email-shubhangi.shrivastava@intel.com
intel_dp_detect() is called for not just detection but
during modes enumeration as well. Repeating the whole
sequence during each of these calls is wasteful and
time consuming.
This patch moves probing for panel, DPCD read etc done in
intel_dp_detect() to a new function intel_dp_long_pulse().
Note that the behavior of intel_dp_detect() is changed to
report connected or disconnected depending on whether the
EDID is available or not.
This change will be required by further patches in the series
to avoid performing duplicated DPCD operations on hotplug.
v2: Moved a hunk to next patch of the series.
Moved intel_dp_unset_edid to out. (Ander)
v3: Rephrased commit message and intel_dp_unset_dp() is called
within intel_dp_set_dp() to free the previous EDID. (Ander)
v4: Added overriding of status to disconnected for MST. (Ander)
Tested-by: Nathan D Ciobanu <nathan.d.ciobanu@intel.com> Signed-off-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Shubhangi Shrivastava <shubhangi.shrivastava@intel.com> Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
[anderco: fix parenthesis alignment] Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1459341326-13142-1-git-send-email-shubhangi.shrivastava@intel.com
Joonas Lahtinen [Wed, 30 Mar 2016 13:57:10 +0000 (16:57 +0300)]
drm/i915: Refer to GGTT {,VM} consistently
Refer to the GGTT VM consistently as "ggtt->base" instead of just "ggtt",
"vm" or indirectly through other variables like "dev_priv->ggtt.base"
to avoid confusion with the i915_ggtt object itself and PPGTT VMs.
Refer to the GGTT as "ggtt" instead of indirectly through chaining.
As a bonus gets rid of the long-standing i915_obj_to_ggtt vs.
i915_gem_obj_to_ggtt conflict, due to removal of i915_obj_to_ggtt!
v2:
- Added some more after grepping sources with Chris
v3:
- Refer to GGTT VM through ggtt->base consistently instead of ggtt_vm
(Chris)
v4:
- Convert all dev_priv->ggtt->foo accesses to ggtt->foo.
v5:
- Make patch checker happy
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
drm/i915: Update color management during vblank evasion.
Without this a vblank may occur between updating color management
and planes, which should be prevented.
intel_color_set_csc was called in update pipe config because the
handover from hardware may not have any csc set, which resulted
in a black screen. Because of this also update color management
during fastset.
With async modesets this is no longer protected with connection_mutex,
so ensure that each pll has its own lock. The pll configuration state
is still protected; it's only the pll updates that need locking against
concurrency.
Changes since v1:
- Rebased.
- Fix locking to protect all accesses. (Durgadoss)
Changes since v2:
- Make the dpll_lock global to protect concurrent updates to the
same register, for example DPLL_CTRL1 on skl. (Ander)
However, not servicing all available IIR within the handler does hurt the
throughput of pathological nop execbuf by about 20%, with a similar effect
upon the dispatch latency of a series of execbuf.
v2: use do {} while(0) for a smaller patch, and easier to revert again
I have reasonable confidence that we do not miss GT interrupts (as
execlists provides a stress case with a failure mechanism easily
detected by igt), however I have less confidence about all the other
sources of interrupts and worry that may lose a display hotplug
interrupt, for example.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93467
Testcase: igt/gem_exec_nop/basic # requires NMI watchdog Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Antti Koskipää <antti.koskipaa@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1457946117-6714-1-git-send-email-chris@chris-wilson.co.uk
Chris Wilson [Thu, 24 Mar 2016 14:31:47 +0000 (14:31 +0000)]
drm/i915: Rename __force_wake_get to __force_wake_auto
__force_wake_get() only acquires a temporary wakeref on forcewake that is
automatically released when a timer expires. When reading the code
again, I confused __intel_uncore_forcewake_get() for __force_wake_get()
and to my shame thought I found a bug in unbalanced wake_count handling.
I claim that if the function had been called __force_wake_auto() instead
I would not have embarrassed myself.
Matthew Auld [Thu, 24 Mar 2016 15:54:20 +0000 (15:54 +0000)]
drm/i915: BUG_ON when ggtt_view is NULL
Lets BUG_ON and don't bother with a WARN and returning an error, so we can
remove the need to pollute the code with error handling, after all it is
a programmer error to provide NULL view. Also while we're here remove
redundant NULL ggtt_view check.
Bjørn Mork [Wed, 30 Mar 2016 09:08:33 +0000 (11:08 +0200)]
drm/i915: fix deadlock on lid open
commit e2c8b8701e2d moved modeset locking inside resume/suspend
functions, but missed a code path only executed on lid close/open
on older hardware. The result was a deadlock when closing and
opening the lid without suspending on such hardware:
=============================================
[ INFO: possible recursive locking detected ]
4.6.0-rc1 #385 Not tainted
---------------------------------------------
kworker/0:3/88 is trying to acquire lock:
(&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa063e6a4>] intel_display_resume+0x4a/0x12f [i915]
but task is already holding lock:
(&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa02d0d4f>] drm_modeset_lock_all+0x3e/0xa6 [drm]
other info that might help us debug this:
Possible unsafe locking scenario:
Lyude [Fri, 11 Mar 2016 15:57:01 +0000 (10:57 -0500)]
drm/i915: Call intel_dp_mst_resume() before resuming displays
Since we need MST devices ready before we try to resume displays,
calling this after intel_display_resume() can result in some issues with
various laptop docks where the monitor won't turn back on after
suspending the system.
This order was originally changed in
commit e7d6f7d70829 ("drm/i915: resume MST after reading back hw state")
In order to fix some unclaimed register errors, however the actual cause
of those has since been fixed.
CC: stable@vger.kernel.org Signed-off-by: Lyude <cpaul@redhat.com>
[danvet: Resolve conflicts with locking changes.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Lukas Wunner [Wed, 9 Mar 2016 11:52:53 +0000 (12:52 +0100)]
drm/i915: Fix races on fbdev
The ->lastclose callback invokes intel_fbdev_restore_mode() and has
been witnessed to run before intel_fbdev_initial_config_async()
has finished.
We might likewise receive hotplug events before we've had a chance to
fully set up the fbdev.
Fix by waiting for the asynchronous thread to finish.
v2:
An async_synchronize_full() was also added to intel_fbdev_set_suspend()
in v1 which turned out to be entirely gratuitous. It caused a deadlock
on suspend (discovered by CI, thanks to Damien Lespiau and Tomi Sarvela
for CI support) and was unnecessary since a device is never suspended
until its ->probe callback (and all asynchronous tasks it scheduled)
have finished. See dpm_prepare(), which calls wait_for_device_probe(),
which calls async_synchronize_full().
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93580 Reported-by: Gustav Fägerlind <gustav.fagerlind@gmail.com> Reported-by: "Li, Weinan Z" <weinan.z.li@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20160309115147.67B2B6E0D3@gabe.freedesktop.org
Dave Gordon [Thu, 24 Mar 2016 11:20:38 +0000 (11:20 +0000)]
drm/i915: replace for_each_engine()
Having provided for_each_engine_id() for cases where the third (id)
argument is useful, we can now replace all the remaining instances with
a simpler version that takes only two parameters. In many cases, this
also allows the elimination of the local variable used in the iterator
(usually 'i').
v2:
s/dev_priv/(dev_priv__)/ in body of for_each_engine_masked() [Chris Wilson]
Dave Gordon [Wed, 23 Mar 2016 18:19:53 +0000 (18:19 +0000)]
drm/i915: introduce for_each_engine_id()
Equivalent to the existing for_each_engine() macro, this will replace
the latter wherever the third argument *is* actually wanted (in most
places, it is not used). The third argument is renamed to emphasise
that it is an engine id (type enum intel_engine_id). All the callers of
the macro that actually need the third argument are updated to use this
version, and the argument (generally 'i') is also updated to be 'id'.
Other callers (where the third argument is unused) are untouched for
now; they will be updated in the next patch.
Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Imre Deak [Thu, 24 Mar 2016 10:41:40 +0000 (12:41 +0200)]
drm/i915/bxt: Fix DSI HW state readout
Currently the machine hangs during booting while accessing the
BXT_MIPI_PORT_CTRL register during pipe HW state readout. After some
experimentation I found that the hang is caused by the DSI PLL being
disabled, or it being enabled but with an incorrect divider
configuration. Enabling the PLL got rid of the boot problem, so fix
this by checking the PLL enabled state/configuration before attempting
to read out the HW state.
The DSI_PLL_ENABLE register is in the always-on power well, while the
BXT_DSI_PLL_CTL is in power well 0. This isn't exactly matched by the
transcoder power domain, but what we really need is just a runtime PM
reference, which is provided by any power domain.
Ville also found this dependency specified in BSpec, so I added a
reference to that too.
v2:
- Make sure we hold a power reference while accessing the PLL registers.
v3: (Jani)
- Simplify check in bxt_get_dsi_transcoder_state()
- Add comment explaining why we check for valid dividers in
bxt_dsi_pll_is_enabled()
CC: Shashank Sharma <shashank.sharma@intel.com> CC: Uma Shankar <uma.shankar@intel.com> CC: Jani Nikula <jani.nikula@intel.com> Fixes: c6c794a2fc5e ("drm/i915/bxt: Initialize MIPI DSI for BXT") Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Shashank Sharma <shashank.sharma@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1458816100-31269-1-git-send-email-imre.deak@intel.com
we wrote the ggtt_bind_vma() observing a number of cleanups we could do
over the template of aliasing_gtt_bind_vma(). Now let's apply the
cleanups we made there back to the original. The essence is to avoid
redundant variables and assignements, and by doing so make the code
easier to read.
Split a GEN2 specific version from i9xx_crtc_compute_clock(). With this
there is no need for i9xx_get_refclk() anymore, and the differences
between platforms become more obvious.
drm/i915: Split CHV and VLV specific crtc_compute_clock() hooks
In order for VLV and CHV to use i9xx_crtc_compute_clocks(), a number
of if ladders is necessary: one for setting the find_dpll() hook, one
for choosing the limits struct, one for choosing the right compute dpll
function and one for initializing the crtc_compute_clock() hook.
By extracting a platform specific implementation for each platform, the
number of if-ladders is reduced to one.
While at it also clean up bxt_find_best_dpll() which depends on some of
the CHV code.
drm/i915: Merge ironlake_compute_clocks() and ironlake_crtc_compute_clock()
Merge ironlake_compute_clocks() into ironlake_crtc_compute_clock() so
the clock computation logic is all in one place. The resulting function
is still quite simple. Follow up patches will make the similar code for
GMCH platforms look similar.
drm/i915: Pass crtc_state->dpll directly to ->find_dpll()
When calculating clocks, just pass a pointer to crtc_state->dpll
directly to the find_dpll() hook. Back when this was introduced in
commit f47709a9502f3 ("drm/i915: create pipe_config->dpll for clock
state") there was no staged crtc config or atomic crtc state, so it was
possible to overwrite the current configuration on error. That hasn't
been the case for a while now, so finally make it "disappear".
drm/i915: Simplify ironlake_crtc_compute_clock() CPU eDP case
None of the code in ironlake_crtc_compute_clock() is relevant for CPU
eDP. The CPU eDP PLL is turned on and off in ironlake_edp_pll_{on,off}
from the DP code and that doesn't depend on the crtc_state->dpll values,
so just return early in that case.
drm/i915: Remove PCH type checks from ironlake_crtc_compute_clock()
The checks were added in commit 5dc5298bb3e5 ("drm/i915: add proper
CPU/PCH checks to crtc_mode_set functions") in a time when there was
doubts on what PCHs would be supported by HSW. There are similar checks
for PCH type in intel_detect_pch() and the function pointers are
initialized based on platform/pch information, so the removed WARN can't
ever be reached.
drm/i915: Don't calculate a new clock in ILK+ code if it is already set
Remove the clock calculation from ironlake_crtc_compute_clock() when the
encoder compute_config() already set one. The value was just thrown away
in that case.
Note that the previously set clock is not validated against the limits
anymore. That is ok since the fixed clocks from DP and SDVO are within
the supported range, so the call to ironlake_compute_clocks() would
never fail in that case.
drm/i915: Fold intel_ironlake_limit() into clock computation function
The function intel_ironlake_limit() is only called by the crtc compute
clock path. By merging it into ironlake_compute_clocks(), the code gets
clearer, since there's no more if-ladders to follow.
drm/i915: Wait for vblank in i9xx_disable_crtc() for gen 2 only
The wait for other gens was added in commit 564ed191f5d8 ("drm/i915:
gmch: fix stuck primary plane due to memory self-refresh mode") since
that's necessary when disabling cxsr. However, cxsr disabling was later
moved to intel_pre_disable_primary() in commit 87d4300a7dbc ("drm/i915:
Move intel_(pre_disable/post_enable)_primary to intel_display.c, and use
it there.") and that function got its own vblank wait for cxsr in commit 262cd2e154c2 ("drm/i915: CHV DDR DVFS support and another watermark
rewrite"). So remove the extra vblank wait from i9xx_crtc_distable().
Mika Kuoppala [Wed, 23 Mar 2016 08:31:46 +0000 (10:31 +0200)]
drm/i915: Fix use after free when printing load failure
Commit d15d7538c6d2 ("drm/i915: Tune down init error message due
to failure injection") added i915_load_error message to failure
path on device initialization. The message is printed
after the device is freed. And as the message printing helper
uses the device structure, this leads to use after free.
Imre Deak [Mon, 21 Mar 2016 15:08:57 +0000 (17:08 +0200)]
drm/i915: Make __i915_printk debug output behave the same as DRM_DEBUG_DRIVER
Joonas and Daniel remarked that our debugging output should stay compatible
with the core DRM's debug facility. The recently added __i915_printk() would
output debug messages even if debugging is completely disabled via the
drm.debug option. To fix this make __i915_printk behave the same as
DRM_DEBUG_DRIVER in this case.
Matt Roper [Fri, 4 Mar 2016 23:59:39 +0000 (15:59 -0800)]
drm/i915: Wait until after wm optimization to drop runtime PM reference
At the end of an atomic commit, we currently wait for vblanks to
complete, call put() on the various runtime PM references, and then try
to optimize our watermarks (on platforms that need two-step watermark
programming). This can lead to watermark registers being programmed
while the power well is powered down. We need to wait until after
watermark optimization is complete before dropping our runtime power
references.
Note that in the future the watermark optimization is probably going to
move to an asynchronous workqueue task that happens at some arbitrary
point after vblank. When we make that change, we'll no longer
necessarily be operating under the power reference held here, so we'll
need to wrap the watermark register programmin in a call to
intel_runtime_pm_get_if_in_use() or similar.
drm/i915/tdr: Prepare error handler to accept mask of hung engines
In preparation for engine reset, the wedged argument of i915_handle_error()
is extended to reflect as a mask of engines that are hung. This is further
passed down to error state capture functions which are also updated.
Engine reset recovery mechanism uses this mask and schedules recovery work
for those particular engines.
drm/i915: Extract out gamma table and CSC to their own file
The moves a couple of functions programming the gamma LUT and CSC
units into their own file.
On generations prior to Haswell there is only a gamma LUT. From
haswell on there is also a new enhanced color correction unit that
isn't used yet. This is why we need to set the GAMMA_MODE register,
either we're using the legacy 8bits LUT or enhanced LUTs (of 10 or
12bits).
The CSC unit is only available from Haswell on.
We also need to make a special case for CherryView which is recognized
as a gen 8 but doesn't have the same enhanced color correction unit
from Haswell on.
v2: Fix access to GAMMA_MODE register on older generations than
Haswell (from Matt Roper's comments)
Jani Nikula [Fri, 18 Mar 2016 15:05:42 +0000 (17:05 +0200)]
drm/i915/bxt: add dsi transcoders
The BXT display connections have DSI transcoders A and C that can be
muxed to any pipe, not unlike the eDP transcoder. Add the notion of DSI
transcoders.
The "normal" transcoders A, B and C are not used with BXT DSI, so care
must be taken to avoid accessing those registers with DSI transcoders in
the hardware state readout, modeset, and generally everywhere.
v2: addressing comments by Ville:
- rename the dsi get config function to hsw_get_dsi_transcoder_state
- rebase onto the higher level split of pipe/transcoder functions
- use more has_dsi_encoder as we can now because of the above,
with no need to look at the transcoder so much
- rename IS_DSI_TRANSCODER to transcoder_is_dsi
- use the above a bit more instead of comparing to < TRANSCODER_EDP
Jordan Justen [Mon, 7 Mar 2016 07:30:27 +0000 (23:30 -0800)]
drm/i915: Use an array of register tables in command parser
For Haswell, we will want another table of registers while retaining
the large common table of whitelisted registers shared by all gen7
devices.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net>
[danvet: Pipe patch through sed -e 's/\<ring\>/engine/g' to make it
apply.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Imre Deak [Fri, 18 Mar 2016 08:46:10 +0000 (10:46 +0200)]
drm/i915: Tune down init error message due to failure injection
Atm, in case failure injection forces an error the subsequent "*ERROR*
failed to init modeset" error message will make automated tests (CI)
report this event as a breakage even though the event is expected. To
fix this print the error message with debug log level in this case.
While at it print the error message for any init failure and change it
to
"""
Device initialization failed (errno)
Please file a bug at https://bugs.freedesktop.org/enter_bug.cgi?product=DRI
against DRM/Intel providing the dmesg log by booting with drm.debug=0xf
"""
and export a helper printing error messages using this same format.
A follow-up patch will convert all uses of DRM_ERROR reporting a user
facing problem to use this new helper instead.
v2:
- Include the problematic error message in the commit log, add a
request to file an fdo bug to the message (Chris)
v3:
- Include the new error message too in the commit log, make the
fdo link more precise and print part of the message with info log
level (Chris)
v4: (Chris)
- Use dev_printk instead of DRM_ERROR/INFO and use NOTICE instead of
INFO loglevel
- Export a helper for printing user facing error messages
v5:
- Keep the DRM_ERROR message prefix used by piglit-igt/CI to filter
relevant dmesg lines
- Use dev_notice(), instead of dev_printk(KERN_NOTICE,...)
v6:
- Print the fdo bug link only once (Chris)
Chris Wilson [Fri, 18 Mar 2016 08:42:59 +0000 (10:42 +0200)]
drm/i915: Codify our assumption that the Global GTT is <= 4GiB
Throughout the code base, we use u32 for offsets into the global GTT. If
we ever see any hardware with a larger GGTT, then we run the real risk
of silent corruption. So test for our assumption up front so that we
have a nice reminder should the time come when it fails.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
[Rebased and changed 1ull -> 1ULL, cut 80 char line] Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1458290579-27783-1-git-send-email-joonas.lahtinen@linux.intel.com
Joonas Lahtinen [Fri, 18 Mar 2016 08:42:58 +0000 (10:42 +0200)]
drm/i915/gtt: Clean up GGTT probing code
Use less pointers with the probing code, making it much less confusing
to read.
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
This allows writes to EU flow control registers. Together
with SIP code from the user-mode driver this resolves a
hang seen in some pre-emption scenarios. Note that this
patch is just the kernel mode part of this workaround.
v2. Oops, add FLOW_CONTROL_ENABLE macro to i915_reg.h.
Tvrtko Ursulin [Thu, 17 Mar 2016 12:59:46 +0000 (12:59 +0000)]
drm/i915: Move CSB MMIO reads out of the execlists lock
By reading the CSB (slow MMIO accesses) into a temporary local
buffer we can decrease the duration of holding the execlist
lock.
Main advantage is that during heavy batch buffer submission we
reduce the execlist lock contention, which should decrease the
latency and CPU usage between the submitting userspace process
and interrupt handling.
Downside is that we need to grab and relase the forcewake twice,
but as the below numbers will show this is completely hidden
by the primary gains.
Testing with "gem_latency -n 100" (submit batch buffers with a
hundred nops each) shows more than doubling of the throughput
and more than halving of the dispatch latency, overall latency
and CPU time spend in the submitting process.
Submitting empty batches ("gem_latency -n 0") does not seem
significantly affected by this change with throughput and CPU
time improving by half a percent, and overall latency worsening
by the same amount.
Above tests were done in a hundred runs on a big core Broadwell.
v2:
* Overflow protection to local CSB buffer.
* Use closer dev_priv in execlists_submit_requests. (Chris Wilson)
v3: Rebase.
v4: Added commend about irq needed to be disabled in
execlists_submit_request. (Chris Wilson)
Imre Deak [Wed, 16 Mar 2016 11:39:08 +0000 (13:39 +0200)]
drm/i915: Add fault injection support
Add support for forcing an error at selected places in the driver. As an
example add 4 options to fail during driver loading.
Requested by Chris.
v2:
- Add fault point for modeset initialization
- Print debug message when injecting an error
v3:
- Rename inject_fault to inject_load_failure, rename the related macros
and helper accordingly (Chris)
- Use a counter instead of a mask to identify the failure point (Daniel)
- Mark the module option as _unsafe and keep i915_params ordered (Joonas)
v4:
- Rebase on latest -nightly
v5:
- Use DRM_INFO instead of DRM_DEBUG_DRIVER, making it clearer in CI reports
that a following error message is expected (IRC r-b from Chris on v5)
CC: Chris Wilson <chris@chris-wilson.co.uk> CC: Daniel Vetter <daniel.vetter@ffwll.ch> CC: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Imre Deak [Wed, 16 Mar 2016 11:39:07 +0000 (13:39 +0200)]
drm/i915: Fix power domain HW state cleanup on error path
Move the cleanup of the power domain HW state on the error path to the
same function where the corresponding init call was called from. I
noticed this problem when loading the module with load failure injection
enabled, making i915_load_modeset_init() fail.
Imre Deak [Wed, 16 Mar 2016 11:39:06 +0000 (13:39 +0200)]
drm/i915: Split out load time interface registration
According to the new init phases scheme we should register the device
making it available via some kernel internal or user space interface as
the last step in the init sequence, so move the corresponding code to a
separate function.
Also add a TODO comment about code that still needs to be moved around
to one of the init phases functions depending on what the role and effect
of that code is.
No functional change, except for the reordering of the unload time
unregistration steps of sysfs wrt. acpi and opregion.
Suggested by Chris.
v3:
- rename i915_driver_init_register to i915_driver_init_frameworks
(Chris)
- rename i915_driver_init_frameworks to i915_driver_register (Daniel)
Imre Deak [Wed, 16 Mar 2016 11:39:05 +0000 (13:39 +0200)]
drm/i915: Split out load time HW initialization
According to the new init phases scheme we should have a definite step
in the init sequence where we setup things requiring accessing the
device, so move the corresponding code to separate function. The steps
in this init phase should avoid exposing the driver via some interface,
which is done in the last registration init phase. This changae also
has the benefit of making the error path cleaner both in the new
function and i915_driver_load()/unload().