git.baikalelectronics.ru Git

drm/amdgpu: fix limiting AV1 to the first instance on VCN3

The job is not yet initialized here.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2037
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: e6f966f9ab37 ("drm/amdgpu: use job and ib structures directly in CS parsers")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd:Fix fw version for 10.3.6

fix fw error when loading fw for 10.3.6

Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.18.x

drm/amdgpu: Add MODE register to wave debug info in gfx11

All other chips, from gfx6-gfx10, now include the MODE register at the
end of the wave debug state. This appears to have been missed in gfx11,
so this patch adds in MODE to the debug state for gfx11.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amd/display: Pass the new context into disable OTG WA"

This reverts commit 692bc9afb6316c1083145832ed3374dded8deb52.

Causes a hang when hotplugging DP, shutting down system, or
enabling dual eDP.

Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amdgpu: Ensure the DMA engine is deactivated during set ups"

This reverts commit aad1e87c826e1c80a4924fc1254180aae401e2f3.

This causes regression in GPU reset related test.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: ricetons@gmail.com
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: suppress the compile warning about 64 bit type

Suppress the compile warning below:
drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:1292
gfx_v11_0_rlc_backdoor_autoload_copy_ucode() warn: should '1 << id' be a 64 bit type?

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: suppress compile warnings about possible unaligned accesses

Suppress the following compile warnings:
>> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/inc/smu_v11_0_pptable.h:163:17:
warning: field smc_pptable within 'struct smu_11_0_powerplay_table' is
less aligned than 'PPTable_t' and is usually due to 'struct smu_11_0_powerplay_table'
being packed, which can lead to unaligned accesses [-Wunaligned-access]
         PPTable_t smc_pptable;                        //PPTable_t in smu11_driver_if.h
                   ^
   1 warning generated.
--
>> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/inc/smu_v11_0_7_pptable.h:193:17:
warning: field smc_pptable within 'struct smu_11_0_7_powerplay_table' is
less aligned than 'PPTable_t' and is usually due to 'struct smu_11_0_7_powerplay_table'
being packed, which can lead to unaligned accesses [-Wunaligned-access]
         PPTable_t smc_pptable;                        //PPTable_t in smu11_driver_if.h
                   ^
   1 warning generated.
--
>> drivers/gpu/drm/amd/amdgpu/../pm/swsmu/inc/smu_v13_0_pptable.h:161:12:
warning: field smc_pptable within 'struct smu_13_0_powerplay_table' is less aligned than
'PPTable_t' and is usually due to 'struct smu_13_0_powerplay_table' being packed, which
can lead to unaligned accesses [-Wunaligned-access]

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Fix partial migration bugs

Migration range from system memory to VRAM, if system page can not be
locked or unmapped, we do partial migration and leave some pages in
system memory. Several bugs found to copy pages and update GPU mapping
for this situation:

1. copy to vram should use migrate->npage which is total pages of range
as npages, not migrate->cpages which is number of pages can be migrated.

2. After partial copy, set VRAM res cursor as j + 1, j is number of
system pages copied plus 1 page to skip copy.

3. copy to ram, should collect all continuous VRAM pages and copy
together.

4. Call amdgpu_vm_update_range, should pass in offset as bytes, not
as number of pages.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdkfd: add pinned BOs to kfd_bo_list

The kfd_bo_list is used to restore process BOs after
evictions. As page tables could be destroyed during
evictions, we should also update pinned BOs' page tables
during restoring to make sure they are valid.

So for pinned BOs,
1, Validate them and update their page tables.
2, Don't add eviction fence for them.

v2:
- Don't handle pinned ones specially in BO validation.(Felix)

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Update PDEs flush TLB if PTB/PDB moved

Flush TLBs when existing PDEs are updated because a PTB or PDB moved,
but avoids unnecessary TLB flushes when new PDBs or PTBs are added to
the page table, which commonly happens when memory is mapped for the
first time.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: enable tmz by default for GC 10.3.7

Add IP GC 10.3.7 in the list of target to have
tmz enabled by default.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.18.x

drm/amdkfd: Add GC 10.3.6 and 10.3.7 KFD definitions

Loading amdgpu on GC 10.3.7 shows an ERR level message:
`kfd kfd: amdgpu: GC IP 0a0307 not supported in kfd`

Add these targets to match yellow carp structures.

Reported-by: David Chang <david.chang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Jesse(Jie) Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 5.18.x

drm/amdkfd: Use mmget_not_zero in MMU notifier

MMU notifier callback may pass in mm with mm->mm_users==0 when process
is exiting, use mmget_no_zero to avoid accessing invalid mm in deferred
list work after mm is gone.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Resolve RAS GFX error count issue after cold boot on Arcturus

Adjust the sequence for ras late init and separate ras reset error status
from query status.

v2: squash in fix from Candice

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix ras supported check

Fix aldebaran ras supported check on SRIOV guest side,
the previous check conditicon block all ras feature
on baremetal

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: remove stale config guards

This code should be executed.

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu: make gfx_v11_0_rlc_stop static

This symbol is not used outside of gfx_v11_0.c, so marks it static.

Fixes the following w1 warning:

drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:1945:6: warning: no previous
prototype for function 'gfx_v11_0_rlc_stop' [-Wmissing-prototypes].

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: sunliming <sunliming@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix a missing break in gfx_v11_0_handle_priv_fault

Fixes the following w1 warning:

drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c:5873:2: warning: unannotated
fall-through between switch labels [-Wimplicit-fallthrough].

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: sunliming <sunliming@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix aper_base for APU

[Why]
Wrong fb offset results in dmub f/w errors and white screen.
[drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3

[How]
Read aper_base from mmhub because GC is off by default

v2: use BAR for passthrough (Alex)

Signed-off-by: Roman Li <Roman.Li@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: update VCN codec support for Yellow Carp

Supports AV1. Mesa already has support for this and
doesn't rely on the kernel caps for yellow carp, so
this was already working from an application perspective.

Fixes: 3fd1f9fec830 ("amdgpu/nv.c - Added video codec support for Yellow Carp")
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2002
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amdgpu: make program_imu_rlc_ram static

This symbol is not used outside of imu_v11_0.c, so marks it
static.

Fixes the following w1 warning:

drivers/gpu/drm/amd/amdgpu/imu_v11_0.c:302:6: warning: no previous
prototype for ‘program_imu_rlc_ram’ [-Wmissing-prototypes].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: 3.2.187

This version brings along the following fixes:

* Changes to DP LT fallback behavior to more closely match the DP standard
* Added new interfaces for lut pipeline
* Restore ref_dtblck value when clk struct is cleared in init_clocks
* Fixes DMUB outbox trace in S4
* Fixes lingering DIO FIFO errors when DIO no longer enabled
* Reads Golden Settings Table from VBIOS

Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix possible infinite loop in DP LT fallback

[Why]
It's possible for some fallback scenarios to result in infinite looping
during link training.

[How]
This change modifies DP LT fallback behavior to more closely match the
DP standard. Keep track of the link rate during the EQ_FAIL fallback,
and use it as the maximum link rate for the CR sequence.

Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Ilya <Ilya.Bakoulin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Don't clear ref_dtbclk value

[Description]
ref_dtbclk value is assigned in clk_mgr_construct,
but the clks struct is cleared in init_clocks.
Make sure to restore the value or we will get
0 value for ref_dtbclk in DCN31.

Reviewed-by: Chris Park <Chris.Park@amd.com>
Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fix DMUB outbox trace in S4 (#4465)

[Why]
DMUB Outbox0 read/write pointer not sync after resumed from S4.
And that caused old traces were sent to outbox.

[How]
Disable DMUB Outbox0 interrupt
and clear DMUB Outbox0 read/write pointer when resumes from S4.
And then enable Outbox0 interrupt before starts DMCUB.

Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Cruise Hung <Cruise.Hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Wait DMCUB to idle state before reset.

[WHY]
Very low rate to cause memory access issue while resetting
DMCUB after the halt command was sent to it.
The process of stopping fw of DMCUB may be timeout, that means
it is not in idle state, such as the window frames may still be
kept in cache, so reset by force will cause MMHUB hang.

[HOW]
After the halt command was sent, keep checking the DMCUB state until
it is idle.

Reviewed-by: Eric Yang <Eric.Yang2@amd.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: hengzhou <Hengyong.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Pass the new context into disable OTG WA

[Why]
When enabling an HPO stream for the first time after having previously
enabled a DIO stream there may be lingering DIO FIFO errors even though
the DIO is no longer enabled.

These can cause display clock change to hang if we don't apply the
OTG disable workaround since the ramping logic is tied to OTG on.

[How]
The workaround wasn't being applied in the sequence of:

1 DIO stream
0 streams
1 HPO stream

because current_state has no stream or planes in its context - and
it's only swapped after optimize has finished.

We should be using the incoming context instead to determine whether
this logic is needed or not.

Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: revert Blank eDP on disable/enable drv

why and how:
Revert this change. It was causing a black screen with certain blocks

Reviewed-by: George Shen <George.Shen@amd.com>
Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Leung, Martin <Martin.Leung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Read Golden Settings Table from VBIOS

[Why]
Dmub read AUX_DPHY_RX_CONTROL0 from Golden Setting Table,
but driver will set it to default value 0x103d1110, which
causes issue in some case

[How]
Remove the driver code, use the value set by dmub in
dp_aux_init

Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Jasdeep Dhillon <jdhillon@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Sherry Wang <YAO.WANG1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Merge tag 'amd-drm-next-5.19-2022-05-26-2' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-5.19-2022-05-26-2:

amdgpu:
- Update fdinfo to the common drm format

UAPI:
- Add VM_NOALLOC GPUVM attribute to prevent buffers for going into the MALL
Add AMDGPU_GEM_CREATE_DISCARDABLE flag to create buffers that can be discarded on eviction
Mesa code which uses these: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16466

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220526202546.66860-1-alexander.deucher@amd.com

Merge tag 'amd-drm-next-5.19-2022-05-26' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-5.19-2022-05-26:

amdgpu:
- Link training fixes
- DPIA fixes
- Misc code cleanups
- Aux fixes
- Hotplug fixes
- More FP clean up
- Misc GFX9/10 fixes
- Fix a possible memory leak in SMU shutdown
- SMU 13 updates
- RAS fixes
- TMZ fixes
- GC 11 updates
- SMU 11 metrics fixes
- Fix coverage blend mode for overlay plane
- Note DDR vs LPDDR memory
- Fuzz fix for CS IOCTL
- Add new PCI DID

amdkfd:
- Clean up hive setup
- Misc fixes

radeon:
- Fix a possible NULL pointer dereference

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220526200641.64097-1-alexander.deucher@amd.com

drm/amdgpu: add drm-client-id to fdinfo v2

This is enough to get gputop working :)

v2: rebase and some addition cleanup

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com> (v1)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Convert to common fdinfo format v5

Convert fdinfo format to one documented in drm-usage-stats.rst.

It turned out that the existing implementation was actually completely
nonsense. The calculated percentages indeed represented the usage of the
engine, but with varying time slices.

So 10% usage for application A could mean something completely different
than 10% usage for application B.

Completely nuke that and just use the now standardized nanosecond
interface.

v2: drop the documentation change for now, nuke percentage calculation
v3: only account for each hw_ip, move the time_spend to the ctx mgr.
v4: move general ctx changes into separate patch, rework the fdinfo to
ctx_mgr interface so that all usages are calculated at once, drop
some unecessary and dangerous refcount dance.
v5: add one more comment how we calculate the time spend

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: bump minor version number

Increase the minor version number to indicate that the new flags are
available.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add AMDGPU_VM_NOALLOC v2

Add the AMDGPU_VM_NOALLOC flag to let userspace control MALL allocation.

v2: also add the flag to the allowed flags.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add AMDGPU_GEM_CREATE_DISCARDABLE

Add a AMDGPU_GEM_CREATE_DISCARDABLE flag to note that the content of a BO
doesn't needs to be preserved during eviction.

KFD was already using a similar functionality for SVM BOs so replace the
internal flag with the new UAPI.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add beige goby PCI ID

Add a beige goby PCI ID.

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

drm/amd/pm: Return auto perf level, if unsupported

When powerplay is not enabled, return AUTO as default level.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: fix typo in comment

Spelling mistake (triple letters) in comment.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gfx: fix typos in comments

Spelling mistakes (triple letters) in comments.
Detected with the help of Coccinelle.

Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/cs: make commands with 0 chunks illegal behaviour.

Submitting a cs with 0 chunks, causes an oops later, found trying
to execute the wrong userspace driver.

MESA_LOADER_DRIVER_OVERRIDE=v3d glxinfo

[172536.665184] BUG: kernel NULL pointer dereference, address: 00000000000001d8
[172536.665188] #PF: supervisor read access in kernel mode
[172536.665189] #PF: error_code(0x0000) - not-present page
[172536.665191] PGD 6712a0067 P4D 6712a0067 PUD 5af9ff067 PMD 0
[172536.665195] Oops: 0000 [#1] SMP NOPTI
[172536.665197] CPU: 7 PID: 2769838 Comm: glxinfo Tainted: P           O      5.10.81 #1-NixOS
[172536.665199] Hardware name: To be filled by O.E.M. To be filled by O.E.M./CROSSHAIR V FORMULA-Z, BIOS 2201 03/23/2015
[172536.665272] RIP: 0010:amdgpu_cs_ioctl+0x96/0x1ce0 [amdgpu]
[172536.665274] Code: 75 18 00 00 4c 8b b2 88 00 00 00 8b 46 08 48 89 54 24 68 49 89 f7 4c 89 5c 24 60 31 d2 4c 89 74 24 30 85 c0 0f 85 c0 01 00 00 <48> 83 ba d8 01 00 00 00 48 8b b4 24 90 00 00 00 74 16 48 8b 46 10
[172536.665276] RSP: 0018:ffffb47c0e81bbe0 EFLAGS: 00010246
[172536.665277] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[172536.665278] RDX: 0000000000000000 RSI: ffffb47c0e81be28 RDI: ffffb47c0e81bd68
[172536.665279] RBP: ffff936524080010 R08: 0000000000000000 R09: ffffb47c0e81be38
[172536.665281] R10: ffff936524080010 R11: ffff936524080000 R12: ffffb47c0e81bc40
[172536.665282] R13: ffffb47c0e81be28 R14: ffff9367bc410000 R15: ffffb47c0e81be28
[172536.665283] FS:  00007fe35e05d740(0000) GS:ffff936c1edc0000(0000) knlGS:0000000000000000
[172536.665284] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[172536.665286] CR2: 00000000000001d8 CR3: 0000000532e46000 CR4: 00000000000406e0
[172536.665287] Call Trace:
[172536.665322]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[172536.665332]  drm_ioctl_kernel+0xaa/0xf0 [drm]
[172536.665338]  drm_ioctl+0x201/0x3b0 [drm]
[172536.665369]  ? amdgpu_cs_find_mapping+0x110/0x110 [amdgpu]
[172536.665372]  ? selinux_file_ioctl+0x135/0x230
[172536.665399]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[172536.665403]  __x64_sys_ioctl+0x83/0xb0
[172536.665406]  do_syscall_64+0x33/0x40
[172536.665409]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2018
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: differentiate between LP and non-LP DDR memory

Some applications want to know whether the memory is LP or
not.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Resolve pcie_bif RAS recovery bug

Check shared buf instead of init flag for xgmi ta shared buf init
during xgmi ta initialization.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: clean up asd on the ta_firmware_header_v2_0

On the psp13 series use ta_firmware_header_v2_0 and the asd firmware
was buildin ta, so needn't request asd firmware separately.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/discovery: validate VCN and SDMA instances

Validate the VCN and SDMA instances against the driver
structure sizes to make sure we don't get into a
situation where the firmware reports more instances than
the driver supports.

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: add Coverage blend mode for overlay plane

According to the KMS man page, there is a
"Coverage" alpha blend mode that assumes the
pixel color values have NOT been pre-multiplied
and will be done when the actual blending to
the background color values happens.

Previously, this mode hasn't been enabled
in our driver and it was assumed that all
normal overlay planes are pre-multiplied
by default.

When a 3rd party app is used to input a image
in a specific format, e.g. PNG, as a source
of a overlay plane to blend with the background
primary plane, the pixel color values are not
pre-multiplied. So by adding "Coverage" blend
mode, our driver will support those cases.

Issue fixed: Overlay plane alpha channel blending is incorrect
Issue tracker: https://gitlab.freedesktop.org/drm/amd/-/issues/1769

Reference:
https://dri.freedesktop.org/docs/drm/gpu/drm-kms.html#plane-composition-properties

Adding Coverage support also enables IGT
kms_plane_alpha_blend Coverage subtests:
1. coverage-7efc
2. coverage-vs-premult-vs-constant

Changes
1. Add DRM_MODE_BLEND_COVERAGE blend mode capability
2. Add "pre_multiplied_alpha" flag for Coverage case
3. Read the correct flag and set the DCN MPCC
pre_multiplied register bit (only on overlay plane)

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1769
Signed-off-by: Sung Joon Kim <Sungjoon.Kim@amd.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Off by one in dm_dmub_outbox1_low_irq()

The > ARRAY_SIZE() should be >= ARRAY_SIZE() to prevent an out of bounds
access.

Fixes: 5c4f2ed1ef60 ("drm/amd/display: Support for DMUB HPD interrupt handling")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: suppress some compile warnings

Suppress two compile warnings about "no previous prototype".

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: correct the metrics version for SMU 11.0.11/12/13

Correct the metrics version used for SMU 11.0.11/12/13.
Fixes misreported GPU metrics (e.g., fan speed, etc.) depending
on which version of SMU firmware is loaded.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1925
Signed-off-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: Add gfx11 trap handler

Based on gfx10 with following changes:

- GPR_ALLOC.VGPR_SIZE field moved (and size corrected in gfx10)
- s_sendmsg_rtn_b64 replaces some s_sendmsg/s_getreg
- Buffer instructions no longer have direct-to-LDS modifier

Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Laurent Morichetti <laurent.morichetti@amd.com>
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: port cwsr trap handler from dkms branch

Most of changes are for debugger feature, and it is
to simplify trap handler support for new asics in the
future.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Add HDMI_ACP_SEND register

Define HDMI_ACP_SEND register shift/mask.

Signed-off-by: Alan Liu <HaoPing.Liu@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: Fix missing thermal throttler status

On aldebaran, when thermal throttling happens due to excessive GPU
temperature, the reason for throttling event is missed in warning
message. This patch fixes it.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add support of tmz for GC 10.3.7

Add support of IP GC 10.3.7 in amdgpu_gmc_tmz_set.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: change code name to ip version for tmz set

Use IP version rather then code name of IPs for
tmz set.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move amdgpu_gmc_tmz_set after ip_version populated

To enable TMZ feature based on IP version needs adev->ip_version
populated but its empty. Move amdgpu_gmc_tmz_set to a place where
ip_version is populated.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: support ras on SRIOV

support umc/gfx/sdma ras on guest side

Changed from V1:
move sriov judgment in amdgpu_ras_interrupt_fatal_error_handler

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/pm: smu_v13_0_4: delete duplicate condition

There is no need to check if "clock_ranges' is non-NULL. It is checked
already on the line before.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: enable memory temp reading for SMU 13.0.0

With the latest vbios, the memory temp reading is working.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: enable more dpm features for SMU 13.0.0

Enable OOB Monitor and SOC CG which are ready since 78.38.0.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: correct the softpptable ids used for SMU 13.0.0

To better match with the pptable_id settings from VBIOS.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: update SMU 13.0.0 driver_if header

To align with 78.37.0 and later PMFWs.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: skip dpm disablement on suspend for SMU 13.0.0

Since PMFW will handle this properly. Driver involvement is
unnecessary.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: enable more dpm features for SMU 13.0.0

Enable MP0CLK DPM and FW Dstate since they are already supported
by latest 78.36.0 PMFW.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/pm: fix a potential gpu_metrics_table memory leak

gpu_metrics_table is allocated in yellow_carp_init_smc_tables() but
not freed in yellow_carp_fini_smc_tables().

Signed-off-by: Gong Yuanjun <ruc_gongyuanjun@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/radeon: fix a possible null pointer dereference

In radeon_fp_native_mode(), the return value of drm_mode_duplicate()
is assigned to mode, which will lead to a NULL pointer dereference
on failure of drm_mode_duplicate(). Add a check to avoid npd.

The failure status of drm_cvt_mode() on the other path is checked too.

Signed-off-by: Gong Yuanjun <ruc_gongyuanjun@163.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gfx10: rework KIQ programming

Make sure the queue is not longer active before programming
the kiq EOP registers.

Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Set CP_HQD_PQ_CONTROL.RPTR_BLOCK_SIZE correctly

Remove the accidental shifts on the values of RPTR_BLOCK_SIZE
in gfx_v8-v11. The bug essentially always programs the
corresponding fields to zero instead of the correct value.
The hardware clamps the min value to 5 so this resulted in a
value of 5 being programmed.

Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: cleanup ctx implementation

Let each context have a pointer to the ctx manager and properly
initialize the adev pointer inside the context manager.

Reduce the BUG_ON() in amdgpu_ctx_add_fence() into a WARN_ON() and
directly return the sequence number instead of writing into a parmeter.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdkfd: simplify cpu hive assignment

CPU hive assignment currently assumes when a GPU hive is connected_to_cpu,
there is only one hive in the system.

Only assign CPUs to the hive if they are explicitly directly connected to
the GPU hive to get rid of the need for this assumption.

It's more efficient to do this when querying IO links since other non-CRAT
info has to be filled in anyways. Also, stop re-assigning the
same CPU to the same GPU hive if it has already been done before.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Clean up of initializing doorbells for gfx_v9 and gfx_v10

Clean up redundant, copy-paste code blocks during the initialization of
the doorbells in mqd_init().

Signed-off-by: Haohui Mai <ricetons@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: 3.2.186

This version brings along the following:
- Improvements in link training fallback
- Adding individual edp hotplug support
- Fixes in DPIA HPD status, display clock change hang, etc.
- FPU isolation work for DCN30

Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Revert "drm/amd/display: Refactor LTTPR cap retrieval"

This reverts commit faf8f71d34ac25572b95e539a5dd10cc935e7b8d.

[WHY]
Regressions unintentionally caused by change,
reverting until this can be resolved.

Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Michael Strauss <michael.strauss@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Fic incorrect pipe being used for clk update

[Why]
we save the prev_dppclk value using "dpp_inst" but
when reading this value we use the index "i". In
a case where a pipe is fused off we can end up reading
the incorrect instance because i != dpp_inst in this
case.

[How]
read the prev_dppclk using dpp_inst instead of i

Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Move FPU associated DCN30 code to DML folder

[why & how]
As part of the FPU isolation work documented in
https://patchwork.freedesktop.org/series/93042/, isolate
code that uses FPU in DCN30 to DML, where all FPU code
should locate.

Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Jasdeep Dhillon <jdhillon@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Check zero planes for OTG disable W/A on clock change

[Why]
A display clock change hang can occur when switching between DIO and HPO
enabled modes during the optimize_bandwidth in dc_commit_state_no_check
call.

This happens when going from 4k120 8bpc 420 to 4k144 10bpc 444.

Display clock in the DIO case is 1200MHz, but pixel rate is 600MHz
because the pixel format is 420.

Display clock in the HPO case is less (800MHz?) because of ODM combine
which results in a smaller divider.

The DIO is still active in prepare but not active in the optimize which
results in the hang occuring.

During this change there are no planes on the stream so it's safe to
apply the workaround, but dpms_off = false and signal type is not
virtual.

[How]
Check for plane_count == 0, no planes on the stream.

It's easiest to check pipe->plane_state == NULL as an equivalent check
rather than trying to search for the stream status in the context
associated with the stream, so let's do that.

The primary, non MPO pipe should not have a NULL plane state.

Reviewed-by: Dmytro Laktyushkin <Dmytro.Laktyushkin@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Allow individual control of eDP hotplug support

[Why]
Second eDP can send display off notification through HPD
but DC isn't hooked up to handle. Some primary eDP panels
will toggle on/off incorrectly if it's enabled generically.

[How]
Extend the debug option to allow individually enabling hotplug
either the first eDP or the second eDP in a dual eDP system.

Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Derek Lai <Derek.Lai@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Check if modulo is 0 before dividing.

[How & Why]
If a value of 0 is read, then this will cause a divide-by-0 panic.

Reviewed-by: Martin Leung <Martin.Leung@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: clear request when release aux engine

[Why]
when driver and dmub request aux engine at the same time,
dmub grant the aux engine but driver fail. Then driver
release aux engine but doesn't clear the request bit.
Then aux engine will be occupied by driver forever.

[How]
When driver release aux engine, clear request bit as well.

Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Paul Hsieh <paul.hsieh@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Clean up code in dc

[Why & How]
Code clean up in dc.

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alvin Lee <Alvin.Lee2@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Query DPIA HPD status.

[Why]
Driver needs up to date DPIA HPD status.

[How]
Use HPD query command to get DPIA HPD status.

Reviewed-by: Meenakshikumar Somasundaram <Meenakshikumar.Somasundaram@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amd/display: Update link training fallback behaviour.

[Why]
Some displays may need several link training attempts before
link training succeeds.

[How]
If training succeeds after falling back to lower link bandwidth,
retry at original link bandwidth instead of abandoning link training
whenever link bandwidth is less than stream bandwidth.

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Jimmy Kizito <Jimmy.Kizito@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Merge tag 'drm-intel-next-fixes-2022-05-24' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

drm/i915 fixes for v5.19 merge window:
- Build, sparse, UB, and CFI fixes
- Variable scope fix
- Audio pipe logging fix
- ICL+ DSI NULL dereference fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87sfozuj44.fsf@intel.com

drm/i915/dsi: fix VBT send packet port selection for ICL+

The VBT send packet port selection was never updated for ICL+ where the
2nd link is on port B instead of port C as in VLV+ DSI.

First, single link DSI needs to use the configured port instead of
relying on the VBT sequence block port. Remove the hard-coded port C
check here and make it generic. For reference, see commit 73b510e1241e
("drm/i915: Changes related to the sequence port no for") for the
original VLV specific fix.

Second, the sequence block port number is either 0 or 1, where 1
indicates the 2nd link. Remove the hard-coded port C here for 2nd
link. (This could be a "find second set bit" on DSI ports, but just
check the two possible options.)

Third, sanity check the result with a warning to avoid a NULL pointer
dereference.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/5984
Cc: stable@vger.kernel.org # v4.19+
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220520094600.2066945-1-jani.nikula@intel.com
(cherry picked from commit 08c59dde71b73a0ac94e3ed2d431345b01f20485)

Merge tag 'msm-next-5.19-fixes' of https://gitlab.freedesktop.org/abhinavk/msm into drm-next

5.19 fixes for msm-next

- Limiting WB modes to max sspp linewidth
- Fixing the supported rotations to add 180 back for IGT
- Fix to handle pm_runtime_get_sync() errors to avoid unclocked access
in the bind() path for dpu driver
- Fix the irq_free() without request issue which was a big-time
hitter in the CI-runs.

Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Abhinav Kumar <quic_abhinavk@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/b011d51d-d634-123e-bf5f-27219ee33151@quicinc.com

Merge tag 'drm-misc-next-fixes-2022-05-19' of git://anongit.freedesktop.org/drm/drm-misc into drm-next

A device tree binding change for Rockchip VOP2

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220519080556.42p52cya4u6y3kps@houat

drm/i915/uc: Fix undefined behavior due to shift overflowing the constant

Fix:

  In file included from <command-line>:0:0:
  drivers/gpu/drm/i915/gt/uc/intel_guc.c: In function ‘intel_guc_send_mmio’:
  ././include/linux/compiler_types.h:352:38: error: call to ‘__compiletime_assert_1047’ \
  declared with attribute error: FIELD_PREP: mask is not constant
    _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)

and other build errors due to shift overflowing values.

See https://lore.kernel.org/r/YkwQ6%2BtIH8GQpuct@zn.tnic for the gory
details as to why it triggers with older gccs only.

v2 by Jani:
- Drop the i915_reg.h changes

Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Ruiqi GONG <gongruiqi1@huawei.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220518113315.1305027-2-jani.nikula@intel.com
(cherry picked from commit 962bd34bb457f6353f333ce234c3fd34cad1c00a)

drm/i915/reg: fix undefined behavior due to shift overflowing the constant

Use REG_GENMASK() and REG_FIELD_PREP() to avoid errors due to
-fsanitize=shift.

References: https://lore.kernel.org/r/20220405151517.29753-12-bp@alien8.de
Reported-by: Borislav Petkov <bp@suse.de>
Reported-by: Ruiqi GONG <gongruiqi1@huawei.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220518113315.1305027-1-jani.nikula@intel.com
(cherry picked from commit 230fb39ff7e07bd0324c87acf08dd2c9b0bbcea8)

drm/i915/gt: Fix use of static in macro mismatch

The INTEL_GT_RPS_SYSFS_ATTR was creating to different structures
but. When called with the "static" keyword this is affecting only
the first structure, while the second is created as non static.

Move the static keyword inside the macros to affect both the
structures.

Reported-by: Jani Nikula <jani.nikula@linux.intel.com>
Fixes: baa281355e321 ("drm/i915/gt: Create per-tile RPS sysfs interfaces")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220510140447.80200-1-andi.shyti@linux.intel.com
(cherry picked from commit 1ade30812abfdd1c161a155fd54b0dd594c217ee)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/audio: fix audio code enable/disable pipe logging

Need to use pipe_name(pipe) instead of pipe directly.

Fixes: 200ced035257 ("drm/i915/audio: unify audio codec enable/disable debug logging")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220512161638.272601-1-jani.nikula@intel.com
(cherry picked from commit 945ae909aa76f55ac8c9e95feb3683512d39134a)

drm/i915: Fix CFI violation with show_dynamic_id()

When an attribute group is created with sysfs_create_group(), the
->sysfs_ops() callback is set to kobj_sysfs_ops, which sets the ->show()
callback to kobj_attr_show(). kobj_attr_show() uses container_of() to
get the ->show() callback from the attribute it was passed, meaning the
->show() callback needs to be the same type as the ->show() callback in
'struct kobj_attribute'.

However, show_dynamic_id() has the type of the ->show() callback in
'struct device_attribute', which causes a CFI violation when opening the
'id' sysfs node under drm/card0/metrics. This happens to work because
the layout of 'struct kobj_attribute' and 'struct device_attribute' are
the same, so the container_of() cast happens to allow the ->show()
callback to still work.

Change the type of show_dynamic_id() to match the ->show() callback in
'struct kobj_attributes' and update the type of sysfs_metric_id to
match, which resolves the CFI violation.

Fixes: 95c5c461d32f ("drm/i915/perf: Implement I915_PERF_ADD/REMOVE_CONFIG interface")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220513075136.1027007-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 18fb42db05a0b93ab5dd5eab5315e50eaa3ca620)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915: Fix 'mixing different enum types' warnings in intel_display_power.c

Fix the following sparse warnings:

drivers/gpu/drm/i915/display/intel_display_power.c:2431:34: warning: mixing different enum types:
drivers/gpu/drm/i915/display/intel_display_power.c:2431:34:    unsigned int enum intel_display_power_domain
drivers/gpu/drm/i915/display/intel_display_power.c:2431:34:    int enum port
drivers/gpu/drm/i915/display/intel_display_power.c:2442:37: warning: mixing different enum types:
drivers/gpu/drm/i915/display/intel_display_power.c:2442:37:    unsigned int enum intel_display_power_domain
drivers/gpu/drm/i915/display/intel_display_power.c:2442:37:    int enum port
drivers/gpu/drm/i915/display/intel_display_power.c:2468:43: warning: mixing different enum types:
drivers/gpu/drm/i915/display/intel_display_power.c:2468:43:    unsigned int enum intel_display_power_domain
drivers/gpu/drm/i915/display/intel_display_power.c:2468:43:    unsigned int enum aux_ch
drivers/gpu/drm/i915/display/intel_display_power.c:2479:35: warning: mixing different enum types:
drivers/gpu/drm/i915/display/intel_display_power.c:2479:35:    unsigned int enum intel_display_power_domain
drivers/gpu/drm/i915/display/intel_display_power.c:2479:35:    unsigned int enum aux_ch

Fixes: 9c58e24e4579 ("drm/i915: Sanitize the port -> DDI/AUX power domain mapping for each platform")
Reported-by: Jani Nikula <jani.nikula@intel.com>
Cc: Jouni Högander <jouni.hogander@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220510114957.406070-1-imre.deak@intel.com
(cherry picked from commit 7ecc3cc8a7b39f08eee9aea7b718187583342a70)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

drm/i915/gt: Fix build error without CONFIG_PM

drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.c: In function ‘act_freq_mhz_show’:
drivers/gpu/drm/i915/gt/intel_gt_sysfs_pm.c:276:20: error: implicit declaration of function ‘sysfs_gt_attribute_r_max_func’ [-Werror=implicit-function-declaration]
276 | u32 actual_freq = sysfs_gt_attribute_r_max_func(dev, attr,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Move sysfs_gt_attribute_* macros out of #ifdef block to fix this.

Fixes: baa281355e32 ("drm/i915/gt: Create per-tile RPS sysfs interfaces")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220506032652.1856-1-yuehaibing@huawei.com
(cherry picked from commit 09708b6d82ef473de91c49d90f35e38b0db463f5)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

Merge tag 'amd-drm-next-5.19-2022-05-18' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

amd-drm-next-5.19-2022-05-18:

amdgpu:
- Misc code cleanups
- Additional SMU 13.x enablement
- Smartshift fixes
- GFX11 fixes
- Support for SMU 13.0.4
- SMU mutex fix
- Suspend/resume fix

amdkfd:
- static checker fix
- Doorbell/MMIO resource handling fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220518205621.5741-1-alexander.deucher@amd.com

drm/msm/dpu: handle pm_runtime_get_sync() errors in bind path

If there are errors while trying to enable the pm in the
bind path, it will lead to unclocked access of hw revision
register thereby crashing the device.

This will not address why the pm_runtime_get_sync() fails
but at the very least we should be able to prevent the
crash by handling the error and bailing out earlier.

changes in v2:
- use pm_runtime_resume_and_get() instead of
pm_runtime_get_sync()

Fixes: 19c2c6078363 ("drm/msm: Add SDM845 DPU support")
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/486721/
Link: https://lore.kernel.org/r/20220518223407.26147-1-quic_abhinavk@quicinc.com
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>

drm/msm/dpu: add DRM_MODE_ROTATE_180 back to supported rotations

DRM_MODE_ROTATE_180 was previously marked as supported even
for devices not supporting inline rotation.

This is true because the SSPPs can always flip the image.

After inline rotation support changes, this bit was removed
and kms_rotation_crc IGT test starts skipping now whereas
it was previously passing.

Restore DRM_MODE_ROTATE_180 bit to the supported rotations
list.

Fixes: d2e7d6eb81521 ("add inline rotation support for sc7280")
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Tested-by: Jessica Zhang <quic_jesszhan@quicinc.com> # Trogdor (SC8170)
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/485928/
Link: https://lore.kernel.org/r/20220511222710.22394-1-quic_abhinavk@quicinc.com
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>

drm/msm: don't free the IRQ if it was not requested

As msm_drm_uninit() is called from the msm_drm_init() error path,
additional care should be necessary as not to call the free_irq() for
the IRQ that was not requested before (because an error occured earlier
than the request_irq() call).

This fixed the issue reported with the following backtrace:

[    8.571329] Trying to free already-free IRQ 187
[    8.571339] WARNING: CPU: 0 PID: 76 at kernel/irq/manage.c:1895 free_irq+0x1e0/0x35c
[    8.588746] Modules linked in: pmic_glink pdr_interface fastrpc qrtr_smd snd_soc_hdmi_codec msm fsa4480 gpu_sched drm_dp_aux_bus qrtr i2c_qcom_geni crct10dif_ce qcom_stats qcom_q6v5_pas drm_display_helper gpi qcom_pil_info drm_kms_helper qcom_q6v5 qcom_sysmon qcom_common qcom_glink_smem qcom_rng mdt_loader qmi_helpers phy_qcom_qmp ufs_qcom typec qnoc_sm8350 socinfo rmtfs_mem fuse drm ipv6
[    8.624154] CPU: 0 PID: 76 Comm: kworker/u16:2 Not tainted 5.18.0-rc5-next-20220506-00033-g6cee8cab6089-dirty #419
[    8.624161] Hardware name: Qualcomm Technologies, Inc. SM8350 HDK (DT)
[    8.641496] Workqueue: events_unbound deferred_probe_work_func
[    8.647510] pstate: 604000c5 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    8.654681] pc : free_irq+0x1e0/0x35c
[    8.658454] lr : free_irq+0x1e0/0x35c
[    8.662228] sp : ffff800008ab3950
[    8.665642] x29: ffff800008ab3950 x28: 0000000000000000 x27: ffff16350f56a700
[    8.672994] x26: ffff1635025df080 x25: ffff16350251badc x24: ffff16350251bb90
[    8.680343] x23: 0000000000000000 x22: 00000000000000bb x21: ffff16350e8f9800
[    8.687690] x20: ffff16350251ba00 x19: ffff16350cbd5880 x18: ffffffffffffffff
[    8.695039] x17: 0000000000000000 x16: ffffa2dd12179434 x15: ffffa2dd1431d02d
[    8.702391] x14: 0000000000000000 x13: ffffa2dd1431d028 x12: 662d79646165726c
[    8.709740] x11: ffffa2dd13fd2438 x10: 000000000000000a x9 : 00000000000000bb
[    8.717111] x8 : ffffa2dd13fd23f0 x7 : ffff800008ab3750 x6 : 00000000fffff202
[    8.724487] x5 : ffff16377e870a18 x4 : 00000000fffff202 x3 : ffff735a6ae1b000
[    8.731851] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff1635015f8000
[    8.739217] Call trace:
[    8.741755]  free_irq+0x1e0/0x35c
[    8.745198]  msm_drm_uninit.isra.0+0x14c/0x294 [msm]
[    8.750548]  msm_drm_bind+0x28c/0x5d0 [msm]
[    8.755081]  try_to_bring_up_aggregate_device+0x164/0x1d0
[    8.760657]  __component_add+0xa0/0x170
[    8.764626]  component_add+0x14/0x20
[    8.768337]  dp_display_probe+0x2a4/0x464 [msm]
[    8.773242]  platform_probe+0x68/0xe0
[    8.777043]  really_probe.part.0+0x9c/0x28c
[    8.781368]  __driver_probe_device+0x98/0x144
[    8.785871]  driver_probe_device+0x40/0x140
[    8.790191]  __device_attach_driver+0xb4/0x120
[    8.794788]  bus_for_each_drv+0x78/0xd0
[    8.798751]  __device_attach+0xdc/0x184
[    8.802713]  device_initial_probe+0x14/0x20
[    8.807031]  bus_probe_device+0x9c/0xa4
[    8.810991]  deferred_probe_work_func+0x88/0xc0
[    8.815667]  process_one_work+0x1d0/0x320
[    8.819809]  worker_thread+0x14c/0x444
[    8.823688]  kthread+0x10c/0x110
[    8.827036]  ret_from_fork+0x10/0x20

Reported-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Fixes: 246959573df1 ("drm/msm: Convert to Linux IRQ interfaces")
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/485422/
Link: https://lore.kernel.org/r/20220507010021.1667700-1-dmitry.baryshkov@linaro.org
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>

drm/msm/dpu: limit writeback modes according to max_linewidth

Writeback modes were being added according to mode_config.max_width
but this is assigned to double of max_mixer_width.

For compositors/clients using a single SSPP, this will fail
the dpu_plane's atomic check as it checks for max_linewidth.

Limit writeback modes according to max_linewidth to allow
even compositors/clients which use only a single SSPP to
use writeback.

Fixes: 86de9cdb00b20 ("drm/msm/dpu: add the writeback connector layer")
Reported-by: Jessica Zhang <quic_jesszhan@quicinc.com>
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Tested-by: Jessica Zhang <quic_jesszhan@quicinc.com> # Trogdor (SC8170)
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/486176/
Link: https://lore.kernel.org/r/20220513225959.19004-1-quic_abhinavk@quicinc.com
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>

drm/amd: Don't reset dGPUs if the system is going to s2idle

An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC
reset for handling aborted suspend can't work with s2idle.

This functionality was introduced in commit b890ad00836d61 ("drm/amdgpu:
always reset the asic in suspend (v2)"). A few other commits have
gone on top of the ASIC reset, but this still doesn't work on the A+A
configuration in s2idle.

Avoid doing the reset on dGPUs specifically when using s2idle.

Fixes: b890ad00836d61 ("drm/amdgpu: always reset the asic in suspend (v2)")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2008
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Unmap legacy queue when MES is enabled

This fixes a kernel oops when MES is not enabled.

Reported-by: Kenny Ho <Kenny.Ho@amd.com>
Suggested-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Fixes: 63b65fdbad91fc ("drm/amdgpu: add mes unmap legacy queue routine")
Fixes: 06cbc96024af16 ("drm/amdgpu: add init support for GFX11 (v2)")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>