Thread (41 messages), 10 authors, 2017-12-07

[RFC] Improving udelay/ndelay on platforms where that is possible

From: Pavel Machek <hidden>
Date: 2017-12-07 12:43:28
Also in: lkml

On Wed, Nov 15, 2017 at 01:51:54PM +0100, Marc Gonzalez wrote:
On 01/11/2017 20:38, Marc Gonzalez wrote:
OK, I'll just send my patch, and then crawl back under my rock.
Linus,

As promised, the patch is provided below. And as promised, I will
no longer bring this up on LKML.

FWIW, I have checked that the computed value matches the expected
value for all HZ and delay_us, and for a few clock frequencies,
using the following program:

$ cat delays.c
#include <stdio.h>
#define MEGA 1000000u
typedef unsigned int uint;
typedef unsigned long long u64;
#define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))

static const uint HZ_tab[] = { 100, 250, 300, 1000 };

static void check_cycle_count(uint freq, uint HZ, uint delay_us)
{
	uint UDELAY_MULT = (2147 * HZ) + (483648 * HZ / MEGA);
	uint lpj = DIV_ROUND_UP(freq, HZ);
	uint computed = ((u64)lpj * delay_us * UDELAY_MULT >> 31) + 1;
	uint expected = DIV_ROUND_UP((u64)delay_us * freq, MEGA);

	if (computed != expected)
		printf("freq=%u HZ=%u delay_us=%u comp=%u exp=%u\n", freq, HZ, delay_us, computed, expected);
}

int main(void)
{
	uint idx, delay_us, freq;

	for (freq = 3*MEGA; freq <= 100*MEGA; freq += 3*MEGA)
		for (idx = 0; idx < sizeof HZ_tab / sizeof *HZ_tab; ++idx)
			for (delay_us = 1; delay_us <= 2000; ++delay_us)
				check_cycle_count(freq, HZ_tab[idx], delay_us);

	return 0;
}
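
As a single worked case of the check above, the two quantities can be pulled out into standalone functions. This is only a sketch: the function names cycles_fixed_point and cycles_exact are hypothetical, and the 27 MHz / HZ=100 / 10 us inputs are just one arbitrary point from the ranges the program sweeps.

```c
#include <stdint.h>

#define MEGA 1000000u

/* Fixed-point estimate, mirroring the arithmetic in delays.c above.
 * mult approximates 2^31 * HZ / 10^6 (truncated). */
uint32_t cycles_fixed_point(uint32_t freq, uint32_t hz, uint32_t delay_us)
{
	uint32_t mult = (2147u * hz) + (483648u * hz / MEGA);
	uint32_t lpj = (freq + hz - 1) / hz;	/* timer ticks per jiffy, rounded up */

	/* Truncating shift, then +1 to compensate for the truncation. */
	return (uint32_t)(((uint64_t)lpj * delay_us * mult >> 31) + 1);
}

/* Reference value: ceil(delay_us * freq / 10^6). */
uint32_t cycles_exact(uint32_t freq, uint32_t delay_us)
{
	return (uint32_t)(((uint64_t)delay_us * freq + MEGA - 1) / MEGA);
}
```

For instance, cycles_fixed_point(27000000u, 100u, 10u) and cycles_exact(27000000u, 10u) both evaluate to 270: the 64-bit product shifts down to 269, and the +1 brings it up to the exact cycle count.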



-- >8 --
Subject: [PATCH] ARM: Tweak clock-based udelay implementation

In 9f8197980d87a ("delay: Add explanation of udelay() inaccuracy")
Russell pointed out that loop-based delays may return early.

On the ARM platform, delays may be either loop-based or clock-based.

This patch tweaks the clock-based implementation so that udelay(N)
is guaranteed to spin at least N microseconds.

As I've already said, I don't want this, because it encourages people
to use too-small delays in driver code. If we merge it, you will look
at your data sheet, decide it says "you need to wait 10us", and write
"udelay(10)" in your driver, which will break on the loop-based
delay.

udelay() needs to offer a consistent interface so that drivers know
what to expect no matter what the implementation is.  Making one
implementation conform to your ideas while leaving the other
implementations with other expectations is a recipe for bugs.

udelay() needs to be consistent across platforms, and yes, udelay(10)
is expected to delay at least 10 usec.

If that is not true on your platform, _fix your platform_. But it is
not valid to reject patches fixing other platforms, just because your
platform is broken.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html