Kevin Wang [Tue, 2 Mar 2021 07:54:00 +0000 (15:54 +0800)]
drm/amdgpu: fix parameter error of RREG32_PCIE() in amdgpu_regs_pcie
the register offset isn't needed division by 4 to pass RREG32_PCIE()
Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Colin Ian King [Tue, 2 Mar 2021 14:05:09 +0000 (14:05 +0000)]
drm/amd/display: fix the return of the uninitialized value in ret
Currently if stream->signal is neither SIGNAL_TYPE_DISPLAY_PORT_MST or
SIGNAL_TYPE_DISPLAY_PORT then variable ret is uninitialized and this is
checked for > 0 at the end of the function. Ret should be initialized,
I believe setting it to zero is a correct default.
Addresses-Coverity: ("Uninitialized scalar variable") Fixes: 2bd7e0b0e183 ("drm/amd/display: Add return code instead of boolean for future use") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Mon, 1 Mar 2021 15:42:50 +0000 (10:42 -0500)]
drm/amdgpu: enable BACO runpm by default on sienna cichlid and navy flounder
It works fine and was only disabled because primary GPUs
don't enter runpm if there is a console bound to the fbdev due
to the kmap. This will at least allow runpm on secondary cards.
Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 24 Feb 2021 20:46:59 +0000 (15:46 -0500)]
drm/amdgpu/swsmu/vangogh: Only use RLCPowerNotify msg for disable
Per discussions with PMFW team, the driver only needs to
notify the PMFW when the RLC is disabled. The RLC FW will notify
the PMFW directly when it's enabled.
Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Wed, 24 Feb 2021 17:21:07 +0000 (12:21 -0500)]
drm/amdgpu/pm: make unsupported power profile messages debug
Making them an error confuses users and the errors are harmless
as not all asics support all profiles.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1488 Acked-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Qingqing Zhuo [Tue, 9 Feb 2021 21:36:41 +0000 (16:36 -0500)]
drm/amd/display: Fix system hang after multiple hotplugs (v3)
[Why]
mutex_lock() was introduced in dm_disable_vblank(), which could
be called in an IRQ context. Waiting in IRQ would cause issues
like kernel lockup, etc.
[How]
Handle code that requires mutex lock on a different thread.
v2: squash in compilation fix without CONFIG_DRM_AMD_DC_DCN (Alex)
v3: squash in warning fix (Wei)
Signed-off-by: Qingqing Zhuo <qingqing.zhuo@amd.com> Acked-by: Bindu Ramamurthy <bindu.r@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Eric Bernstein [Fri, 5 Feb 2021 18:53:31 +0000 (13:53 -0500)]
drm/amd/display: Remove Assert from dcn10_get_dig_frontend
[Why]
In some cases, this function is called when DIG BE is not
connected to DIG FE, in which case a value of zero isn't
invalid and assert should not be hit.
[How]
Remove assert and handle ENGINE_ID_UNKNOWN result in calling
function.
Signed-off-by: Eric Bernstein <eric.bernstein@amd.com> Acked-by: Bindu Ramamurthy <bindu.r@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
drm/amd/display: Add vupdate_no_lock interrupts for DCN2.1
When run igt@kms_vrr in a device that uses DCN2.1 architecture, we
noticed multiple failures. Furthermore, when we tested a VRR demo, we
noticed a system hang where the mouse pointer still works, but the
entire system freezes; in this case, we don't see any dmesg warning or
failure messages kernel. This happens due to a lack of vupdate_no_lock
interrupt, making the userspace wait eternally to get the event back.
For fixing this issue, we need to add the vupdate_no_lock interrupt in
the interrupt list.
drm/amd/pm/swsmu: Avoid using structure_size uninitialized in smu_cmn_init_soft_gpu_metrics
Clang warns:
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:764:2: warning:
variable 'structure_size' is used uninitialized whenever switch default
is taken [-Wsometimes-uninitialized]
default:
^~~~~~~
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:770:23: note:
uninitialized use occurs here
memset(header, 0xFF, structure_size);
^~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:753:25: note:
initialize the variable 'structure_size' to silence this warning
uint16_t structure_size;
^
= 0
1 warning generated.
Return in the default case, as the size of the header will not be known.
Fixes: 428976f0c36d ("drm/amd/pm/swsmu: unify the init soft gpu metrics function") Link: https://github.com/ClangBuiltLinux/linux/issues/1304 Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Tue, 16 Feb 2021 15:57:00 +0000 (10:57 -0500)]
drm/amdgpu: Set reference clock to 100Mhz on Renoir (v2)
Fixes the rlc reference clock used for GPU timestamps.
Value is 100Mhz. Confirmed with hardware team.
v2: reword commit message.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1480 Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Alex Deucher [Tue, 16 Feb 2021 14:02:36 +0000 (09:02 -0500)]
drm/radeon: OLAND boards don't have VCE
Disable it on those boards. No functional change, this just
removes the message about VCE failing to initialize.
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=197327 Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Felix Kuehling [Thu, 4 Feb 2021 05:11:17 +0000 (00:11 -0500)]
drm/amdkfd: Fix recursive lock warnings
memalloc_nofs_save/restore are no longer sufficient to prevent recursive
lock warnings when holding locks that can be taken in MMU notifiers. Use
memalloc_noreclaim_save/restore instead.
Fixes: 8b47713e069e ("mm: track mmu notifiers in fs_reclaim_acquire/release") CC: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 5.10.x
Jan Kokemüller [Thu, 11 Feb 2021 18:28:43 +0000 (19:28 +0100)]
drm/amd/display: Add FPU wrappers to dcn21_validate_bandwidth()
dcn21_validate_bandwidth() calls functions that use floating point math.
On my machine this sometimes results in simd exceptions when there are
other FPU users such as KVM virtual machines running. The screen freezes
completely in this case.
Wrapping the function with DC_FP_START()/DC_FP_END() seems to solve the
problem. This mirrors the approach used for dcn20_validate_bandwidth.
Tested on a AMD Ryzen 7 PRO 4750U (Renoir).
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=206987 Signed-off-by: Jan Kokemüller <jan.kokemueller@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Fix potential integer overflow by casting actual_calculated_clock_100hz
to u64, in order to give the compiler complete information about the
proper arithmetic to use.
Notice that such variable is used in a context that expects
an expression of type u64 (64 bits, unsigned) and the following
expression is currently being evaluated using 32-bit arithmetic:
actual_calculated_clock_100hz * post_divider
Fixes: 164e1a8ae6a5 ("drm/amd/display: fix 64bit division issue on 32bit OS")
Addresses-Coverity-ID: 1501691 ("Unintentional integer overflow") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Thu, 4 Feb 2021 07:46:20 +0000 (02:46 -0500)]
drm/amdgpu: fix CGTS_TCC_DISABLE register offset on gfx10.3
This fixes incorrect TCC harvesting info reported to userspace.
The impact was a very very tiny performance degradation (unnecessary
GL2 cache flushes).
Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
Ville Syrjälä [Tue, 9 Feb 2021 02:19:16 +0000 (04:19 +0200)]
drm/i915: Disallow plane x+w>stride on ilk+ with X-tiling
ilk+ planes get notably unhappy when the plane x+w exceeds
the stride. This wasn't a problem previously because we
always aligned SURF to the closest tile boundary so the
x offset never got particularly large. But now with async
flips we have to align to 256KiB instead and thus this
becomes a real issue.
On ilk/snb/ivb it looks like the accesses just wrap
early to the next tile row when scanout goes past the
SURF+n*stride boundary, hsw/bdw suffer more heavily and
start to underrun constantly. i965/g4x appear to be immune.
vlv/chv I've not yet checked.
Let's borrow another trick from the skl+ code and search
backwards for a better SURF offset in the hopes of getting the
x offset below the limit. IIRC when I ran into a similar issue
on skl years ago it was causing the hardware to fall over
pretty hard as well.
And let's be consistent and include i965/g4x in the check
as well, just in case I just got super lucky somehow when
I wasn't able to reproduce the issue. Not that it really
matters since we still use 4k SURF alignment for i965/g4x
anyway.
Fixes: 8b79dec3ee72 ("drm/i915: Implement async flips for vlv/chv") Fixes: 61c540487587 ("drm/i915: Implement async flip for ilk/snb") Fixes: 0ed3f5d84476 ("drm/i915: Implement async flip for ivb/hsw") Fixes: ea62da612990 ("drm/i915: Implement async flips for bdw") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210209021918.16234-1-ville.syrjala@linux.intel.com Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit 59fb8218c8e5001f854e7d5fdb5fb135cba58102) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Rodrigo also exported some functions from intel_display.c during backport]
Dave Airlie [Fri, 12 Feb 2021 00:15:42 +0000 (10:15 +1000)]
Merge branch '00.00-inst' of git://github.com/skeggsb/linux into drm-next
Ben wrote:
The problem is that GA100 added enough new engine types and instances
that we would have begun to overflow various u64 bitfields used to
track the connections between various engines.
Rather than addressing subdevs by a unique index, we give
each subdev a type and instance id, and replace the use of bitfields
tied to subdev index with other methods.
Notable changes:
- replace subdev index with subdev type + instance id
- engines that turn out to be fused-off (can't detect until later in
init) no longer leave dangling pointers around
- new subdev/instance additions no longer need to be made in multiple places
- ampere engine topology is now being parsed
Ben Skeggs [Tue, 9 Feb 2021 03:01:01 +0000 (13:01 +1000)]
drm/nouveau/fifo: turn chan subdev mask into engine mask
This data is used to know which engines/classes are reachable on a given
channel's runlist, and needs to be replaced with something that doesn't
rely on subdev index.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>
Ben Skeggs [Sun, 6 Dec 2020 02:14:13 +0000 (12:14 +1000)]
drm/nouveau/nvkm: add macros for subdev layout
Rather than having to add new engines / engine instances to multiple places,
define everything in include/nvkm/core/layout.h and use macros to generate
the required plumbing.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com>