Daniel Drake [Tue, 23 Apr 2019 09:28:10 +0000 (17:28 +0800)]
drm/i915/fbc: disable framebuffer compression on GeminiLake
On many (all?) the Gemini Lake systems we work with, there is frequent
momentary graphical corruption at the top of the screen, and it seems
that disabling framebuffer compression can avoid this.
The ticket was reported 6 months ago and has already affected a
multitude of users, without any real progress being made. So, lets
disable framebuffer compression on GeminiLake until a solution is found.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108085 Fixes: fd7d6c5c8f3e ("drm/i915: enable FBC on gen9+ too") Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: <stable@vger.kernel.org> # v4.11+ Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com> Signed-off-by: Daniel Drake <drake@endlessm.com> Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190423092810.28359-1-jian-hong@endlessm.com
(cherry picked from commit 1d25724b41fad7eeb2c3058a5c8190d6ece73e08) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Joonas Lahtinen [Tue, 7 May 2019 12:29:14 +0000 (15:29 +0300)]
Merge tag 'gvt-next-fixes-2019-05-07' of https://github.com/intel/gvt-linux into drm-intel-next-fixes
gvt-next-fixes-2019-05-07
- Revert MCHBAR save range change for BXT regression (Yakui)
- Align display dmabuf size for bytes instead of error-prone pages (Xiong)
- Fix one context MMIO save/restore after RCS0 name change (Colin)
- Misc klocwork warning/errors fixes (Aleksei)
Chris Wilson [Sat, 4 May 2019 07:07:07 +0000 (08:07 +0100)]
drm/i915: Disable semaphore busywaits on saturated systems
Asking the GPU to busywait on a memory address, perhaps not unexpectedly
in hindsight for a shared system, leads to bus contention that affects
CPU programs trying to concurrently access memory. This can manifest as
a drop in transcode throughput on highly over-saturated workloads.
The only clue offered by perf, is that the bus-cycles (perf stat -e
bus-cycles) jumped by 50% when enabling semaphores. This corresponds
with extra CPU active cycles being attributed to intel_idle's mwait.
This patch introduces a heuristic to try and detect when more than one
client is submitting to the GPU pushing it into an oversaturated state.
As we already keep track of when the semaphores are signaled, we can
inspect their state on submitting the busywait batch and if we planned
to use a semaphore but were too late, conclude that the GPU is
overloaded and not try to use semaphores in future requests. In
practice, this means we optimistically try to use semaphores for the
first frame of a transcode job split over multiple engines, and fail if
there are multiple clients active and continue not to use semaphores for
the subsequent frames in the sequence. Periodically, we try to
optimistically switch semaphores back on whenever the client waits to
catch up with the transcode results.
With 1 client, on Broxton J3455, with the relative fps normalized by %cpu:
x no semaphores
+ drm-tip
* patched
+------------------------------------------------------------------------+
| * |
| *+ |
| **+ |
| **+ x |
| x * +**+ x |
| x x * * +***x xx |
| x x * * *+***x *x |
| x x* + * * *****x *x x |
| + x xx+x* + *** * ********* x * |
| + x xx+x* * *** +** ********* xx * |
| * + ++++* + x*x****+*+* ***+*************+x* * |
|*+ +** *+ + +* + *++****** *xxx**********x***+*****************+*++ *|
| |__________A_____M_____| |
| |_______________A____M_________| |
| |____________A___M________| |
+------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 120 2.60475 3.50941 3.31123 3.2143953 0.21117399
+ 120 2.3826 3.57077 3.25101 3.1414161 0.28146407
Difference at 95.0% confidence
-0.0729792 +/- 0.0629585
-2.27039% +/- 1.95864%
(Student's t, pooled s = 0.248814)
* 120 2.35536 3.66713 3.2849 3.2059917 0.24618565
No difference proven at 95.0% confidence
Indicating that we've recovered the regression from enabling semaphores
on this saturated setup, with a hint towards an overall improvement.
Very similar, but of smaller magnitude, results are observed on both
Skylake(gt2) and Kabylake(gt4). This may be due to the reduced impact of
bus-cycles, where we see a 50% hit on Broxton, it is only 10% on the big
core, in this particular test.
One observation to make here is that for a greedy client trying to
maximise its own throughput, using semaphores is the right choice. It is
only the holistic system-wide view that semaphores of one client
impacts another and reduces the overall throughput where we would choose
to disable semaphores.
The most noticeable negactive impact this has is on the no-op
microbenchmarks, which are also very notable for having no cpu bus load.
In particular, this increases the runtime and energy consumption of
gem_exec_whisper.
Fixes: e88619646971 ("drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Cc: Dmitry Ermilov <dmitry.ermilov@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190504070707.30902-1-chris@chris-wilson.co.uk
(cherry picked from commit ca6e56f654e7b241256ffba78cd2abb22aa3bc97) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Chris Wilson [Wed, 1 May 2019 11:45:36 +0000 (12:45 +0100)]
drm/i915: Delay semaphore submission until the start of the signaler
Currently we submit the semaphore busywait as soon as the signaler is
submitted to HW. However, we may submit the signaler as the tail of a
batch of requests, and even not as the first context in the HW list,
i.e. the busywait may start spinning far in advance of the signaler even
starting.
If we wait until the request before the signaler is completed before
submitting the busywait, we prevent the busywait from starting too
early, if the signaler is not first in submission port.
To handle the case where the signaler is at the start of the second (or
later) submission port, we will need to delay the execution callback
until we know the context is promoted to port0. A challenge for later.
Colin Xu [Fri, 22 Feb 2019 06:13:42 +0000 (14:13 +0800)]
drm/i915/gvt: Add in context mmio 0x20D8 to gen9 mmio list
Depends on GEN family and I915_PARAM_HAS_CONTEXT_ISOLATION, Mesa driver
will decide whether constant buffer 0 address is relative or absolute,
and load GPU initial state by lri to context mmio INSTPM (GEN8)
or 0x20D8 (>=GEN9).
Mesa Commit fa8a764b62
("i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+.")
INSTPM is already added to gen8_engine_mmio_list, but 0x20D8 is missed
in gen9_engine_mmio_list. From GVT point of view, different guest could
have different context so should switch those mmio accordingly.
v2: Update fixes commit ID.
Fixes: 178657139307 ("drm/i915/gvt: vGPU context switch") Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Colin Xu <colin.xu@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
(cherry picked from commit 1e8b15a1988ed3c7429402017d589422628cdf47)
Ville Syrjälä [Thu, 25 Apr 2019 19:24:19 +0000 (22:24 +0300)]
drm/i915: Fix ICL output CSC programming
When I refactored the code into its own function I accidentally
misplaced the <<16 shifts for some of the registers causing us
to lose the blue channel entirely.
BXT needs to access 0x141000-0x1417ff register to obtain the dram info.
But after the snapshot range of I915_MCHBAR is refined in f74a6d9a2c,
it only initializes the range of 0x144000-0x147fff for VGPU and then
causes that the guest GPU can't get the initialized value for dram
detection on BXT.
Signed-off-by: Zhao Yakui <yakui.zhao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
drm/i915/gvt: Check if get_next_pt_type() always returns a valid value
According to gtt_type_table[] function get_next_pt_type() may returns
GTT_TYPE_INVALID in some cases. To prevent driver to try to create memory
page with invalid data type, additional check is added.
Signed-off-by: Aleksei Gimbitskii <aleksei.gimbitskii@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Colin Xu <colin.xu@intel.com> Reviewed-by: Colin Xu <colin.xu@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
drm/i915/gvt: Use snprintf() to prevent possible buffer overflow.
For printing the intel_vgpu->id, a buffer with fixed length is allocated
on the stack. But if vgpu->id is greater than 6 characters, the buffer
overflow will happen. Even the string of the amount of max vgpu is less
that the length buffer right now, it's better to replace sprintf() with
snprintf().
v2:
- Increase the size of the buffer. (Colin Xu)
This patch fixed the critical issue #673 reported by klocwork.
Signed-off-by: Aleksei Gimbitskii <aleksei.gimbitskii@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Colin Xu <colin.xu@intel.com> Reviewed-by: Colin Xu <colin.xu@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
drm/i915/gvt: Do not copy the uninitialized pointer from fb_info
In the code the memcpy() function copied uninitialized pointer in fb_info
to dmabuf_obj->info. Later the pointer in dmabuf_obj->info will be
initialized. To make the code aligned with requirements of the klocwork
static code analyzer, the uninitialized pointer should be initialized
before memcpy().
v2:
- Initialize fb_info.obj in vgpu_get_plane_info(). (Colin Xu)
This patch fixed the critical issue #632 reported by klockwork.
Signed-off-by: Aleksei Gimbitskii <aleksei.gimbitskii@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> Cc: Colin Xu <colin.xu@intel.com> Reviewed-by: Colin Xu <colin.xu@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
drm/i915/gvt: Remove typedef and let the enumeration starts from zero
Typedef is not recommended in the Linux kernel.The klocwork static code
analyzer takes the enumeration as the full range of intel_gvt_gtt_type_t.
But the intel_gvt_gtt_type_t will never be used in full range. For
example, the GTT_TYPE_INVALID will never be used as an index of an array.
Remove the typedef and let the enumeration starts from zero to pass
klocwork analysis.
This patch fixed the critial issues #483, #551, #665 reported by
klockwork.
v3:
- Remove the typedef and let the enumeration starts from zero.
Signed-off-by: Aleksei Gimbitskii <aleksei.gimbitskii@intel.com> Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Zhi Wang <zhi.a.wang@intel.com> CC: Colin Xu <colin.xu@intel.com> Reviewed-by: Colin Xu <colin.xu@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
drm/i915/gvt: Change fb_info->size from pages to bytes
fb_info->size is in pages, but some function need bytes when it
is as a parameter. Such as:
a. intel_gvt_ggtt_validate_range(), according to function definition
b. vifio_device_gfx_plane_info->size, according to the comment of
its definition
So change fb_info->size into bytes.
v2: Keep fb_info->size in real size instead of assinging casted page
size(zhenyu)
v3: obj->size should be page aligned and delete redundant check(zhenyu)
Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com> Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Fix the order of lane, port parameters passed to the register macro.
Note that this was already partly fixed by commit 37fc7845df7b6 ("drm/i915: Call MG_DP_MODE() macro with the right parameters order")
While at it simplify things by using the macro directly instead of an
unnecessary redirection via an array.
v2:
- Add a note the commit message about simplifying things. (José)
Fixes: 58106b7d816e1 ("drm/i915: Make MG PHY macros semantically consistent") Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Aditya Swarup <aditya.swarup@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190419071026.32370-1-imre.deak@intel.com
(cherry picked from commit 9c11b12184bb01d8ba2c48e655509b184f02c769) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Chris Wilson [Wed, 17 Apr 2019 13:25:07 +0000 (14:25 +0100)]
drm/i915: Avoid use-after-free in reporting create.size
We have to avoid chasing after a userspace race!
<3>[ 473.114328] BUG: KASAN: use-after-free in i915_gem_create+0x1d2/0x1f0 [i915]
<3>[ 473.114389] Read of size 8 at addr ffff88815bf1d840 by task gem_flink_race/1541
Dave Airlie [Wed, 24 Apr 2019 03:09:50 +0000 (13:09 +1000)]
Merge tag 'exynos-drm-next-for-v5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next
Log cleanups
- Correct the use of log macro in error case.
- Drop unnecessary messages.
- Replace DRM_ERROR/DEBUG with DRM_DEV_ERROR/DEBUG.
- Print out debug messages with correct device name in vidi and ipp drivers.
One trivial cleanup
- Just fix checkpatch error, "foo* bar" to "foo *bar" in g2d driver.
Print out debug messages with correct device name.
As for this, this patch adds device pointer to exynos_drm_ipp structure,
and in case of exynos_drm_ipp_task structure, replace drm_device pointer
with device one. This will make each ipp driver to print out debug
messages with correct device name.
drm/vidi: replace platform_device pointer with device one
Add device pointer to vidi_context and remove platform_device pointer.
It doesn't need for vidi_context to contain platform_device object.
Instead, this patch makes this driver more simply by replacing platform_device
pointer with device one.
Dave Airlie [Wed, 24 Apr 2019 01:54:26 +0000 (11:54 +1000)]
Merge tag 'drm-msm-next-2019-04-21' of https://gitlab.freedesktop.org/drm/msm into drm-next
This time around it is a bunch of cleanup and fixes, expanding gpu
"zap" shader support (so we can take the GPU out of secure mode on
boot) to a6xx, and small UABI extension to support robustness (see
mesa MR 673).
Dave Airlie [Wed, 24 Apr 2019 01:24:22 +0000 (11:24 +1000)]
Merge branch 'drm-next-5.2' of git://people.freedesktop.org/~agd5f/linux into drm-next
- Add the amdgpu specific bits for timeline support
- Add internal interfaces for xgmi pstate support
- DC Z ordering fixes for planes
- Add support for NV12 planes in DC
- Add colorspace properties for planes in DC
- eDP optimizations if the GOP driver already initialized eDP
- DC bandwidth validation tracing support
Dave Airlie [Wed, 24 Apr 2019 00:30:41 +0000 (10:30 +1000)]
Merge tag 'drm/tegra/for-5.2-rc1' of git://anongit.freedesktop.org/tegra/linux into drm-next
drm/tegra: Changes for v5.2-rc1
This contains a fix for the usage of shared resets that previously
generated a WARN on boot. In addition, there's a fix for CPU cache
maintenance of GEM buffers allocated using get_pages().
(airlied: contains a merge from a shared tegra tree)
Dave Airlie [Wed, 24 Apr 2019 00:12:34 +0000 (10:12 +1000)]
Merge tag 'drm-misc-next-2019-04-18' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.2:
UAPI Changes:
- Document which feature flags belong to which command in virtio_gpu.h
- Make the FB_DAMAGE_CLIPS available for atomic userspace only, it's useless for legacy.
Cross-subsystem Changes:
- Add device tree bindings for lg,acx467akm-7 panel and ST-Ericsson Multi Channel Display Engine MCDE
- Add parameters to the device tree bindings for tfp410
- iommu/io-pgtable: Add ARM Mali midgard MMU page table format
- dma-buf: Only do a 64-bits seqno compare when driver explicitly asks for it, else wraparound.
- Use the 64-bits compare for dma-fence-chains
Core Changes:
- Make the fb conversion functions use __iomem dst.
- Rename drm_client_add to drm_client_register
- Move intel_fb_initial_config to core.
- Add a drm_gem_objects_lookup helper
- Add drm_gem_fence_array helpers, and use it in lima.
- Add drm_format_helper.c to kerneldoc.
Driver Changes:
- Add panfrost driver for mali midgard/bitfrost.
- Converts bochs to use the simple display type.
- Small fixes to sun4i, tinydrm, ti-fp410.
- Fid aspeed's Kconfig options.
- Make some symbols/functions static in lima, sun4i and meson.
- Add a driver for the lg,acx467akm-7 panel.
Dave Airlie [Wed, 24 Apr 2019 00:02:20 +0000 (10:02 +1000)]
Merge tag 'drm-intel-next-2019-04-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:
- uAPI "Fixes:" patch for the upcoming kernel 5.1, included here too
We have an Ack from the media folks (only current user) for this
late tweak
Cross-subsystem Changes:
- ALSA: hda: Fix racy display power access (Takashi, Chris)
Driver Changes:
- DDI and MIPI-DSI clocks fixes for Icelake (Vandita)
- Fix Icelake frequency change/locking (RPS) (Mika)
- Temporarily disable ppGTT read-only bit on Icelake (Mika)
- Add missing Icelake W/As (Mika)
- Enable 12 deep CSB status FIFO on Icelake (Mika)
- Inherit more Icelake code for Elkhartlake (Bob, Jani)
- Handle catastrophic error on engine reset (Mika)
- Shortcut readiness to reset check (Mika)
- Regression fix for GEM_BUSY causing us to report a mixed uabi-class request as not busy (Chris)
- Revert back to max link rate and lane count on eDP (Jani)
- Fix pipe BPP readout for BXT/GLK DSI (Ville)
- Set DP min_bpp to 8*3 for non-RGB output formats (Ville)
- Enable coarse preemption boundaries for Gen8 (Chris)
- Do not enable FEC without DSC (Ville)
- Restore correct BXT DDI latency optim setting calculation (Ville)
- Always reset context's RING registers to avoid running workload twice during reset (Chris)
- Set GPU wedged on driver unload (Janusz)
- Consolidate two similar barries from timeline into one (Chris)
- Only reset the pinned kernel contexts on resume (Chris)
- Wakeref tracking improvements (Chris, Imre)
- Lockdep fixes for shrinker interactions (Chris)
- Bump ready tasks ahead of busywaits in prep of semaphore use (Chris)
- Huge step in splitting display code into fine grained files (Jani)
- Refactor the IRQ init/reset macros for code saving (Paulo)
- Convert IRQ initialization code to uncore MMIO access (Paulo)
- Convert workarounds code to use uncore MMIO access (Chris)
- Nuke drm_crtc_state and use intel_atomic_state instead (Manasi)
- Update SKL clock-gating WA (Radhakrishna, Ville)
- Isolate GuC reset code flow (Chris)
- Expose force_dsc_enable through debugfs (Manasi)
- Header standalone compile testing framework (Jani)
- Code cleanups to reduce driver footprint (Chris)
- PSR code fixes and cleanups (Jose)
- Sparse and kerneldoc updates (Chris)
- Suppress spurious combo PHY B warning (Vile)
Jordan Crouse [Fri, 19 Apr 2019 19:46:15 +0000 (13:46 -0600)]
drm/msm/a6xx: Add zap shader load
The a6xx GPU powers on in secure mode which restricts what memory it can
write to. To get out of secure mode the GPU driver can write to
REG_A6XX_RBBM_SECVID_TRUST_CNTL but on targets that are "secure" that
register region is blocked and writes will cause the system to go down.
For those targets we need to execute a special sequence that involves
loadinga special shader that clears the GPU registers and use a PM4
sequence to pull the GPU out of secure. Add support for loading the zap
shader and executing the secure sequence. For targets that do not support
SCM or the specific SCM sequence this should fail and we would fall back
to writing the register.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Jordan Crouse [Fri, 19 Apr 2019 19:46:14 +0000 (13:46 -0600)]
drm/msm/gpu: Move zap shader loading to adreno
a5xx and a6xx both share (mostly) the same code to load the zap shader and
bring the GPU out of secure mode. Move the formerly 5xx specific code to
adreno to make it available for a6xx too.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Jordan Crouse [Fri, 19 Apr 2019 19:56:27 +0000 (13:56 -0600)]
dt-bindings: drm/msm/a6xx: Document interconnect properties for GPU
Add documentation for the interconnect and interconnect-names bindings
for the GPU node as detailed by bindings/interconnect/interconnect.txt.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Georgi Djakov <georgi.djakov@linaro.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
drm/msm: Split submit_lookup_objects() into two loops
First loop does copy_from_user() without the table lock held and
just stores the handle. Second loop looks up buffer objects with the
table_lock held without potentially blocking or faulting. This lets us
clean up a bunch of custom, non-faulting copy_from_user() code.
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
We use a llist and a worker to delay the object cleanup. This avoids
taking mmap_sem and struct_mutex in the wrong order when calling
drm_gem_object_put_unlocked() from drm_gem_mmap().
Fixes lockdep problem with copy_from_user() in msm_ioctl_gem_submit().
Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Jordan Crouse [Mon, 4 Feb 2019 16:15:43 +0000 (09:15 -0700)]
msm/drm/a6xx: Turn off the GMU if resume fails
Currently if the GMU resume function fails all we try to do is clear the
BOOT_SLUMBER oob which usually times out and ends up in a cycle of death.
If the resume function fails at any point remove any RPMh votes that might
have been added and try to shut down the GMU hardware cleanly.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Jordan Crouse [Mon, 4 Feb 2019 16:15:42 +0000 (09:15 -0700)]
drm/msm/a6xx: Make GMU reset useful
Now that the GX domain is sorted we can wire up a working GMU reset.
IF a GMU hang was detected then try to forcefully shut down the GMU
in the power down sequence which should ensure that it can recover
normally on the next power up.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Jordan Crouse [Mon, 4 Feb 2019 16:15:41 +0000 (09:15 -0700)]
drm/msm/gpu: Attach to the GPU GX power domain
99.999% of the time during normal operation the GMU is responsible
for power and clock control on the GX domain and the CPU remains
blissfully unaware. However, there is one situation where the CPU
needs to get involved:
The power sequencing rules dictate that the GX needs to be turned
off before the CX so that the CX can be turned on before the GX
during power up. During normal operation when the CPU is taking
down the CX domain a stop command is sent to the GMU which turns
off the GX domain and then the CPU handles the CX domain.
But if the GMU happened to be unresponsive while the GX domain was
left then the CPU will need to step in and turn off the GX domain
before resetting the CX and rebooting the GMU. This unfortunately
means that the CPU needs to be marginally aware of the GX domain
even though it is expected to usually keep its hands off.
To support this we create a semi-disabled GX power domain that
does nothing to the hardware on power up but tries to shut it
down normally on power down. In this method the reference counting
is correct and we can step in with the pm_runtime_put() at the right
time during the failure path.
This patch sets up the connection to the GX power domain and does
the magic to "enable" and disable it at the right points.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Jordan Crouse [Mon, 4 Feb 2019 16:15:40 +0000 (09:15 -0700)]
dt-bindings: drm/msm/a6xx: Add GX power-domain for GMU bindings
The GMU should have two power domains defined: "cx" and "gx". "cx" is the
actual power domain for the device and "gx" will be attached at runtime
to manage reference counting on the GPU device in case of a GMU crash.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Jordan Crouse [Mon, 4 Feb 2019 16:15:39 +0000 (09:15 -0700)]
drm/msm/a6xx: Remove unwanted regulator code
The GMU code currently has some misguided code to try to work around
a hardware quirk that requires the power domains on the GPU be
collapsed in a certain order. Upcoming patches will do this the
right way so get rid of the unused and unwanted regulator
code.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Jordan Crouse [Fri, 22 Mar 2019 20:21:22 +0000 (14:21 -0600)]
drm/msm/gpu: Add submit queue queries
Add the capability to query information from a submit queue.
The first available parameter is for querying the number of GPU faults
(hangs) that can be attributed to the queue.
This is useful for implementing context robustness. A user context can
regularly query the number of faults to see if it is responsible for any
and if so it can invalidate itself.
This is also helpful for testing by confirming to the user driver if a
particular command stream caused a fault (or not as the case may be).
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org>
Rob Clark [Tue, 16 Apr 2019 23:13:28 +0000 (16:13 -0700)]
drm/msm: add param to retrieve # of GPU faults (global)
For KHR_robustness, userspace wants to know two things, the count of GPU
faults globally, and the count of faults attributed to a given context.
This patch providees the former, and the next patch provides the latter.
Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org>
Wen Yang [Wed, 3 Apr 2019 16:04:11 +0000 (00:04 +0800)]
drm/msm: a5xx: fix possible object reference leak
The call to of_get_child_by_name returns a node pointer with refcount
incremented thus it must be explicitly decremented after the last
usage.
Detected by coccinelle with the following warnings:
drivers/gpu/drm/msm/adreno/a5xx_gpu.c:57:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 47, but without a corresponding object release within this function.
drivers/gpu/drm/msm/adreno/a5xx_gpu.c:66:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 47, but without a corresponding object release within this function.
drivers/gpu/drm/msm/adreno/a5xx_gpu.c:118:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 47, but without a corresponding object release within this function.
drivers/gpu/drm/msm/adreno/a5xx_gpu.c:57:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 51, but without a corresponding object release within this function.
drivers/gpu/drm/msm/adreno/a5xx_gpu.c:66:2-8: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 51, but without a corresponding object release within this function.
drivers/gpu/drm/msm/adreno/a5xx_gpu.c:118:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 51, but without a corresponding object release within this function.
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn> Cc: Rob Clark <robdclark@gmail.com> Cc: Sean Paul <sean@poorly.run> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jordan Crouse <jcrouse@codeaurora.org> Cc: Mamta Shukla <mamtashukla555@gmail.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Sharat Masetty <smasetty@codeaurora.org> Cc: linux-arm-msm@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: freedreno@lists.freedesktop.org Cc: linux-kernel@vger.kernel.org (open list) Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Clark <robdclark@chromium.org>
Douglas Anderson [Wed, 16 Jan 2019 18:46:22 +0000 (10:46 -0800)]
drm/msm: Cleanup A6XX opp-level reading
The patch ("OPP: Add support for parsing the 'opp-level' property")
adds an API enabling a cleaner way to read the opp-level. Let's use
the new API.
Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Jordan Crouse <jcrouse@codeaurora.org> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Clark <robdclark@chromium.org>
Iterate and assign HW intf block to physical encoders
in encoder modeset. Moving all the HW block assignments
to encoder modeset to allow easy switching to state
based resource management.
drm/msm/dpu: map mixer/ctl hw blocks in encoder modeset
After resource allocation, iterate and populate mixer/ctl
hw blocks in encoder modeset thereby centralizing all
the resource mapping to the CRTC. This change is made
for easy switching to state based allocation using
private objects later in this series.
release resources allocated in mode_set if any of
the hw check fails. Most of these checks are not
necessary and they will be removed in the follow up
patches with state based resource allocations.
Both video and command physical encoders will have
a hw interface assigned to it. So there is really no
need to track the hw block in specific encoder subclass.
Sean Paul [Wed, 30 Jan 2019 16:32:12 +0000 (11:32 -0500)]
drm/msm: dpu: Don't set frame_busy_mask for async updates
The frame_busy mask is used in frame_done event handling, which is not
invoked for async commits. So an async commit will leave the
frame_busy mask populated after it completes and future commits will start
with the busy mask incorrect.
This showed up on disable after cursor move. I was hitting the "this should
not happen" comment in the frame event worker since frame_busy was set,
we queued the event, but there were no frames pending (since async
also doesn't set that).
Sean Paul [Mon, 28 Jan 2019 20:42:51 +0000 (15:42 -0500)]
drm/msm: dpu: Don't queue the frame_done watchdog for cursor
In the case of an async/cursor update, we don't wait for the frame_done
event, which means handle_frame_done is never called, and the frame_done
watchdog isn't canceled. Currently, this results in a frame_done timeout
every time the cursor moves without a synchronous frame following it up
before the timeout expires. Since we don't wait for frame_done, and
don't handle it, we shouldn't modify the watchdog.
Sean Paul [Mon, 28 Jan 2019 20:42:50 +0000 (15:42 -0500)]
drm/msm: dpu: Untangle frame_done timeout units
There exists a bunch of confusion as to what the actual units of
frame_done is:
- The definition states it's in # of frames
- CRTC treats it like it's ms
- frame_done_timeout comment thinks it's Hz, but it stores ms
- frame_done timer is setup such that it _should_ be in frames, but the
timeout is super long
So this patch tries to interpret what the driver really wants. I've
de-centralized the #define since the consumers are expecting different
units.
For crtc, we just use 60ms since that's what it was doing before.
Perhaps we could get fancy and scale with vrefresh, but that's for
another time.
For encoder, fix the comments and rename frame_done_timeout so it's
obvious what the units are. In practice, frame_done_timeout is really
just checked against 0 || !0, which I guess is why the units being wrong
didn't matter. I've also dropped the timeout from the previous 60 frames
to 5. That seems like more than enough time to give up on a frame, and
my guess is that no one intended for the timeout to _actually_ be 60
frames.
Instead of setting the timeout and then immediately reading it back
(along with the hand-rolled msecs_to_jiffies calculation), just
calculate it once and set it in both places at the same time.
Jordan Crouse [Tue, 19 Feb 2019 18:48:20 +0000 (11:48 -0700)]
drm/msm: Remove pm_runtime calls from msm_iommu.c
Currently the IOMMU code calls pm_runtime_get/put on the GPU or display
device before doing a IOMMU operation. This was because usually the
IOMMU driver didn't do power control of its own and since the hardware
used the same clocks and power as the respective multimedia device it
was a easy way to make sure that the power was available.
Now two things have changed. First, the SMMU devices can do their own power
control and more important bringing up the a6xx GPU isn't as easy as
turning on some clocks. To bring the GPU up we need the GMU which itself
needs the IOMMU so we have a chicken and egg problem.
Luckily this is easily fixed by removing the pm_runtime calls from the
functions and letting the device link to the IOMMU device handle the magic.
Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org> Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Clark <robdclark@chromium.org>
Lucas Stach [Thu, 28 Feb 2019 06:23:29 +0000 (07:23 +0100)]
drm/msm: don't allocate pages from the MOVABLE zone
The pages backing the GEM objects are kept pinned in place as
long as they are alive, so they must not be allocated from the
MOVABLE zone. Blocking page migration for too long will cause
the VM subsystem headaches and will outright break CMA, as a
few pinned pages in CMA will lead to failure to find the
required large contiguous regions.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Rob Clark <robdclark@chromium.org>
Dmitry Osipenko [Wed, 6 Mar 2019 22:55:19 +0000 (01:55 +0300)]
drm/tegra: gem: Fix CPU-cache maintenance for BO's allocated using get_pages()
The allocated pages need to be invalidated in CPU caches. On ARM32 the
DMA_BIDIRECTIONAL flag only ensures that data is written-back to DRAM and
the data stays in CPU cache lines. While the DMA_FROM_DEVICE flag ensures
that the corresponding CPU cache lines are getting invalidated and nothing
more, that's exactly what is needed for a newly allocated pages.
This fixes randomly failing rendercheck tests on Tegra30 using the
Opentegra driver for tests that use small-sized pixmaps (10x10 and less,
i.e. 1-2 memory pages) because apparently CPU reads out stale data from
caches and/or that data is getting evicted to DRAM at the time of HW job
execution.
Add binding for the LG ACX467AKM-7 4.95" 1080×1920 LCD panel that is
found on the LG Nexus 5 (hammerhead) phone. This appears to be a JDI
panel based on some Internet searches, however a specific model number
could not be found. I disassembled an old Nexus 5 with a broken
screen and the LG part number is the only model number present on the
back of the panel, so I think that is probably the best ID to use.
drm/drv: Fix incorrect resolution of merge conflict
Commit f06ddb53096b ("BackMerge v5.1-rc5 into drm-next") incorrectly
resolved a merge conflict related to a patch having been merged twice:
- commit 3f04e0a6cfeb ("drm: Fix drm_release() and device unplug")
introduced as a standalone fix via drm-fixes branch,
- commit 1ee57d4d75fb ("drm: Fix drm_release() and device unplug")
applied as patch 1/2 of a series on drm-next branch.
That incorrect resolution of the conflict effectively reverted a change
introduced to drivers/gpu/drm/drm_drv.c by patch 2/2 of that series -
commit ba3bf37e150a ("drm/drv: drm_dev_unplug(): Move out drm_dev_put()
call"). Fix it.
drivers/gpu/drm/meson/meson_viu.c:93:6: warning: symbol 'meson_viu_set_g12a_osd1_matrix' was not declared. Should it be static?
drivers/gpu/drm/meson/meson_viu.c:121:6: warning: symbol 'meson_viu_set_osd_matrix' was not declared. Should it be static?
drivers/gpu/drm/meson/meson_viu.c:190:6: warning: symbol 'meson_viu_set_osd_lut' was not declared. Should it be static?
drivers/gpu/drm/sun4i/sun8i_tcon_top.c:271:36: warning: symbol 'sun8i_r40_tcon_top_quirks' was not declared. Should it be static?
drivers/gpu/drm/sun4i/sun8i_tcon_top.c:276:36: warning: symbol 'sun50i_h6_tcon_top_quirks' was not declared. Should it be static?
drivers/gpu/drm/sun4i/sun4i_tcon.c:239:6: warning: symbol 'sun4i_tcon_set_mux' was not declared. Should it be static?
Jani Nikula [Tue, 16 Apr 2019 08:28:52 +0000 (11:28 +0300)]
drm/i915/ehl: inherit icl cdclk init/uninit
The cdclk init/uninit code was changed by commit 93a643f29bcb
("drm/i915/cdclk: have only one init/uninit function") between the
versions of commit 39564ae86d51 ("drm/i915/ehl: Inherit Ice Lake
conditional code"). What got merged fails to do cdclk init/uninit on
ehl.
Fixes: 39564ae86d51 ("drm/i915/ehl: Inherit Ice Lake conditional code") Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Bob Paauwe <bob.j.paauwe@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Bob Paauwe <bob.j.paauwe@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190416082852.18141-1-jani.nikula@intel.com
Chris Wilson [Fri, 12 Apr 2019 07:14:16 +0000 (08:14 +0100)]
drm/i915: Introduce struct class_instance for engines across the uAPI
SSEU reprogramming of the context introduced the notion of engine class
and instance for a forwards compatible method of describing any engine
beyond the old execbuf interface. We wish to adopt this class:instance
description for more interfaces, so pull it out into a separate type for
userspace convenience.
Fixes: e46c2e99f600 ("drm/i915: Expose RPCS (SSEU) configuration to userspace (Gen11 only)") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Cc: Tony Ye <tony.ye@intel.com> Cc: Andi Shyti <andi@etezian.org> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Tony Ye <tony.ye@intel.com> Reviewed-by: Andi Shyti <andi@etezian.org> Link: https://patchwork.freedesktop.org/patch/msgid/20190412071416.30097-1-chris@chris-wilson.co.uk
Eric Anholt [Mon, 1 Apr 2019 22:26:33 +0000 (15:26 -0700)]
drm: Add helpers for setting up an array of dma_fence dependencies.
I needed to add implicit dependency support for v3d, and Rob Herring
has been working on it for panfrost, and I had recently looked at the
lima implementation so I think this will be a good intersection of
what we all want and simplify our scheduler implementations.
v2: Rebase on xa_limit_32b API change, and tiny checkpatch cleanups on
the way in (unsigned int vs unsigned, extra return before
EXPORT_SYMBOL_GPL)
Paulo Zanoni [Wed, 10 Apr 2019 23:53:44 +0000 (16:53 -0700)]
drm/i915: fully convert the IRQ initialization macros to intel_uncore
Make them take the uncore argument from the caller instead of passing
the implicit &dev_priv->uncore directly. This will allow us to finally
pass something that's not dev_priv->uncore in the future, and gets rid
of the implicit variables in register macros.
Paulo Zanoni [Wed, 10 Apr 2019 23:53:43 +0000 (16:53 -0700)]
drm/i915: convert the IRQ initialization functions to intel_uncore
The IRQ initialization helpers are simple and self-contained. Continue
the transition started in the recent uncore rework to get us rid of
I915_READ/WRITE and the implicit dev_priv variables.
While the implicit dev_priv is removed from the IRQ initialization
helpers, we didn't get rid of them in the macro callers. Doing that
should be very simple now.
Paulo Zanoni [Wed, 10 Apr 2019 23:53:42 +0000 (16:53 -0700)]
drm/i915: add GEN2_ prefix to the I{E, I, M, S}R registers
This discussion started because we use token pasting in the
GEN{2,3}_IRQ_INIT and GEN{2,3}_IRQ_RESET macros, so gen2-4 passes an
empty argument to those macros, making the code a little weird. The
original proposal was to just add a comment as the empty argument, but
Ville suggested we just add a prefix to the registers, and that indeed
sounds like a more elegant solution.
Now doing this is kinda against our rules for register naming since we
only add gens or platform names as register prefixes when the given
gen/platform changes a register that already existed before. On the
other hand, we have so many instances of IIR/IMR in comments that
adding a prefix would make the users of these register more easily
findable, in addition to make our token pasting macros actually
readable. So IMHO opening an exception here is worth it.
Paulo Zanoni [Wed, 10 Apr 2019 23:53:41 +0000 (16:53 -0700)]
drm/i915: don't specify the IRQ register in the gen2 macros
Like the gen3+ macros, the gen2 versions of the IRQ initialization
macros take the register name in the 'type' argument. But gen2 only
has one set of registers, so there's really no need to specify the
type. This commit removes the type argument and uses the registers
directly instead of passing them through variables.
Paulo Zanoni [Wed, 10 Apr 2019 23:53:40 +0000 (16:53 -0700)]
drm/i915: refactor the IRQ init/reset macros
The whole point of having macros here is for the token pasting
necessary to automatically have IMR, IIR and IER selected. We don't
really need or want all the inlining that happens as a consequence.
The good thing about the current code is that it works regardless of
the relative offsets between these registers (they change after gen4,
with the usual VLV/CHV exceptions).
One thing which we can do is to split the logic of what we do with
imr/ier/iir to functions separate from the macros that pick them.
That's what we do in this commit. This allows us to get rid of the
gen8 duplicates and also all the inlining:
v2:
- Make checkpatch happy with a temporary which_ (Checkpatch).
- Reorder the arguments for the INIT macros (Ville).
- Correctly explain when the register offsets change in the commit
message (Ville).
- Use more line breaks in the macro calls to make the arguments look
a little more organized/readable.
- Update the bloat-o-meter output (minor change only).
Joel Stanley [Fri, 5 Apr 2019 08:11:17 +0000 (18:41 +1030)]
drm: aspeed: Clean up Kconfig options
The GFX IP is inside of the ASPEED BMC SoC so there is little use
enabling it on a kernel that does not support ASPEED.
When building with COMPILE_TEST the architecture many not have CMA
support, so to avoid breaking the build we only select these options if
the architecture supports the contiguous allocator.
I suspect the DRM_PANEL came from a cut/paste error.
Fixes: 4f2a8f5898ec ("drm: Add ASPEED GFX driver") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20190405081117.27339-1-joel@jms.id.au
Christian König [Mon, 15 Apr 2019 12:46:34 +0000 (14:46 +0200)]
dma-buf: explicitely note that dma-fence-chains use 64bit seqno
Instead of checking the upper values of the sequence number use an explicit
field in the dma_fence_ops structure to note if a sequence should be 32bit
or 64bit.
Chris Wilson [Tue, 16 Apr 2019 08:52:18 +0000 (09:52 +0100)]
drm/i915: Drop bool return from breadcrumbs signaler
Since removal of the "missed interrupt detection" nobody used the result
of whether or not we signaled anybody during that invocation, so now
remove the return value.
drm/i915/gvt: addressed guest GPU hang with HWS index mode
with the introduce of "switch to use HWS indices rather than address",
guest GPU hang observed when running workloads which will update the
seqno to the real HW HWSP, not vitural GPU HWSP and then cause GPU hang.
this patch is to revoke index mode in PIPE_CTRL and MI_FLUSH_DW and
patch guest GPU HWSP address value to these commands.
Fixes: 54939ea0bd85 ("drm/i915: Switch to use HWS indices rather than addresses") Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com> Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Zhenyu Wang [Tue, 16 Apr 2019 08:50:34 +0000 (16:50 +0800)]
Merge tag 'drm-intel-next-2019-04-04' into gvt-next
Merge back drm-intel-next for engine name definition refinement
and 54939ea0bd85 ("drm/i915: Switch to use HWS indices rather than addresses")
that would need gvt fixes to depend on.
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
drm/i915: Nuke drm_crtc_state and use intel_atomic_state instead
This is one of the patches to start replacing drm pointers
and use the intel_atomic_state and intel_crtc to derive
the necessary intel state variables required for the intel
modeset functions.
v3:
* Remove the unwanted newline (Ville)
v2:
* Flip the function arguments (Ville)
* Remove some remaining instances of drm pointers (Ville)
* Use old_crtc_state and new_crtc_state (Ville)
Chris Wilson [Sat, 13 Apr 2019 12:58:20 +0000 (13:58 +0100)]
drm/i915/selftests: Skip live timeline/suspend tests if wedged
If the driver is wedged, we can not issue the requests to exercise the
timelines or the system across suspend, so skip the tests. live_hangcheck
is there to fail if we cannot recover.
drm/amd/display: Add profiling tools for bandwidth validation
[Why]
We used this change to investigate the performance of bandwidth validation,
it will be useful to have if we need to investigate further.
[How]
We use performance counter tick numbers to profile performance, they live
at dc->debug.bw_val_profile (set .enable in debugger to turn on measuring).
Signed-off-by: Joshua Aberback <joshua.aberback@amd.com> Reviewed-by: Tony Cheng <Tony.Cheng@amd.com> Acked-by: Bhawanpreet Lakha <Bhawanpreet Lakha@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>