From 9065c47cd37dab503eecba0a212b84dc1ddafc4c Mon Sep 17 00:00:00 2001 From: Daniel Vetter Date: Thu, 28 Jun 2012 09:48:42 +0200 Subject: [PATCH] drm/i915: "Flush Me Harder" required on gen6+ The prep to remove the flushing list in commit 0f849b3a800fae1db986f3a55f3d13238e3c6ca6 Author: Daniel Vetter Date: Wed Jun 13 20:45:19 2012 +0200 drm/i915: disable flushing_list/gpu_write_list causes quite some decent regressions. We can fix this by setting the CS_STALL bit to ensure that the following seqno write happens only after the cache flush has completed. But only do that when the caller actually wants the flush (and not also when we invalidate caches before starting the next batch). I've looked through all our ancient scrolls about gen6+ pipe control workarounds, and this seems to be indeed a legal combination: We're allowed to set the CS_STALL bit when we flush the render cache (which we do). While yelling at this code, also pass back the return value from intel_emit_post_sync_nonzero_flush properly. v2: Instead of emitting more pipe controls, set the CS_STALL bit on the write flush as suggested by Chris Wilson. It seems to work, too. Cc: Eric Anholt Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51436 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51429 Tested-by: Lu Hua Tested-by: Chris Wilson Reviewed-by: Chris Wilson Signed-Off-by: Daniel Vetter --- drivers/gpu/drm/i915/intel_ringbuffer.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index f30a53a8917e6..dce4d1a492a84 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -219,7 +219,9 @@ gen6_render_ring_flush(struct intel_ring_buffer *ring, int ret; /* Force SNB workarounds for PIPE_CONTROL flushes */ - intel_emit_post_sync_nonzero_flush(ring); + ret = intel_emit_post_sync_nonzero_flush(ring); + if (ret) + return ret; /* Just flush everything. Experiments have shown that reducing the * number of bits based on the write domains has little performance @@ -233,6 +235,12 @@ gen6_render_ring_flush(struct intel_ring_buffer *ring, flags |= PIPE_CONTROL_VF_CACHE_INVALIDATE; flags |= PIPE_CONTROL_CONST_CACHE_INVALIDATE; flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE; + /* + * Ensure that any following seqno writes only happen when the render + * cache is indeed flushed (but only if the caller actually wants that). + */ + if (flush_domains) + flags |= PIPE_CONTROL_CS_STALL; ret = intel_ring_begin(ring, 6); if (ret) -- 2.39.5