It seems that killing an application while faults are occurring
(particularly with a GPU in FPGA at a whopping 40MHz) can lead to
handling a lingering page fault after all the address space contexts
have already been freed. In this situation, the LRU list is empty so
addr_to_drm_mm_node() ends up dereferencing the list head as if it were
a struct panfrost_mmu entry; this leaves "mmu->as" actually pointing at
the pfdev->alloc_mask bitmap, which is also empty, and given that the
fault has a high likelihood of being in AS0, hilarity ensues.
Sadly, the cleanest solution seems to involve another goto. Oh well, at
least it's robust...
Fixes: a7aede2d3cda ("drm/panfrost: Prevent race when handling page fault")
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/9a0b09e6b5851f0d4428b72dd6b8b4c0d0ef4206.1572293305.git.robin.murphy@arm.com
spin_lock(&pfdev->as_lock);
list_for_each_entry(mmu, &pfdev->as_lru_list, list) {
if (as == mmu->as)
- break;
+ goto found_mmu;
}
- if (as != mmu->as)
- goto out;
+ goto out;
+found_mmu:
priv = container_of(mmu, struct panfrost_file_priv, mmu);
spin_lock(&priv->mm_lock);