Re: [PATCH] drm/i915/gen9: Increase PCODE request timeout to 100ms

From: Imre Deak <hidden>
Date: 2017-02-21 09:22:15
Also in: intel-gfx

On Mon, Feb 20, 2017 at 04:05:33PM +0000, Chris Wilson wrote:

On Mon, Feb 20, 2017 at 05:29:44PM +0200, Imre Deak wrote:

quoted

After
commit 2c7d0602c815277f7cb7c932b091288710d8aba7
Author: Imre Deak [off-list ref]
Date:   Mon Dec 5 18:27:37 2016 +0200

    drm/i915/gen9: Fix PCODE polling during CDCLK change notification

there is still one report of the CDCLK-change request timing out on a
KBL machine, see the Reference link. On that machine the maximum time
the request took to succeed was 34ms, so increase the timeout to 100ms.

Reference: https://bugs.freedesktop.org/show_bug.cgi?id=99345
Cc: Ville Syrjï¿½lï¿½ <redacted>
Cc: Chris Wilson <redacted>
Cc: <redacted>
Signed-off-by: Imre Deak <redacted>
---
 drivers/gpu/drm/i915/intel_drv.h |  2 +-
 drivers/gpu/drm/i915/intel_pm.c  | 11 ++++++-----
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 821c57c..7970ba8 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h

@@ -87,7 +87,7 @@
 	int cpu, ret, timeout = (US) * 1000; \
 	u64 base; \
 	_WAIT_FOR_ATOMIC_CHECK(ATOMIC); \
-	BUILD_BUG_ON((US) > 50000); \
+	BUILD_BUG_ON((US) > 100000); \
 	if (!(ATOMIC)) { \
 		preempt_disable(); \
 		cpu = smp_processor_id(); \

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index fe243c6..90134b0 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c

@@ -7910,10 +7910,10 @@ static bool skl_pcode_try_request(struct drm_i915_private *dev_priv, u32 mbox,
  * @timeout_base_ms: timeout for polling with preemption enabled
  *
  * Keep resending the @request to @mbox until PCODE acknowledges it, PCODE
- * reports an error or an overall timeout of @timeout_base_ms+10 ms expires.
+ * reports an error or an overall timeout of @timeout_base_ms+100 ms expires.
  * The request is acknowledged once the PCODE reply dword equals @reply after
  * applying @reply_mask. Polling is first attempted with preemption enabled
- * for @timeout_base_ms and if this times out for another 10 ms with
+ * for @timeout_base_ms and if this times out for another 100 ms with
  * preemption disabled.
  *
  * Returns 0 on success, %-ETIMEDOUT in case of a timeout, <0 in case of some

@@ -7949,14 +7949,15 @@ int skl_pcode_request(struct drm_i915_private *dev_priv, u32 mbox, u32 request,
 	 * worst case) _and_ PCODE was busy for some reason even after a
 	 * (queued) request and @timeout_base_ms delay. As a workaround retry
 	 * the poll with preemption disabled to maximize the number of
-	 * requests. Increase the timeout from @timeout_base_ms to 10ms to
+	 * requests. Increase the timeout from @timeout_base_ms to 100ms to
 	 * account for interrupts that could reduce the number of these
-	 * requests.
+	 * requests, and for any quirks of the PCODE firmware that delays
+	 * the request completion.
 	 */
 	DRM_DEBUG_KMS("PCODE timeout, retrying with preemption disabled\n");
 	WARN_ON_ONCE(timeout_base_ms > 3);
 	preempt_disable();
-	ret = wait_for_atomic(COND, 10);
+	ret = wait_for_atomic(COND, 100);
 	preempt_enable();

Ugh. Straw + camel.  How about something like:

__try_request_atomic:
	cond_resched();

	preempt_disable()
	ret = COND ? wait_for_atomic(COND, 10) : 0;
	preempt_enable();
	return ret;

try_request:
	ret = wait_for(__try_request_atomic() == 0, 100);

So that our preempt-off period doesn't grow completely unchecked, or do
we need that 34ms loop?

Yes, that's at least how I understand it. Scheduling away is what let's
PCODE start servicing some other request than ours or go idle. That's
in a way what we see when the preempt-enabled poll times out.

--Imre

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help