[RFC] Improving udelay/ndelay on platforms where that is possible
From: Pavel Machek <hidden>
Date: 2017-12-07 12:43:28
Also in: lkml
> On Wed, Nov 15, 2017 at 01:51:54PM +0100, Marc Gonzalez wrote:
> > On 01/11/2017 20:38, Marc Gonzalez wrote:
> > > OK, I'll just send my patch, and then crawl back under my rock.
> >
> > Linus,
> >
> > As promised, the patch is provided below. And as promised, I will
> > no longer bring this up on LKML.
> >
> > FWIW, I have checked that the computed value matches the expected
> > value for all HZ and delay_us, and for a few clock frequencies,
> > using the following program:
> >
> > $ cat delays.c
> > #include <stdio.h>
> > #define MEGA 1000000u
> > typedef unsigned int uint;
> > typedef unsigned long long u64;
> > #define DIV_ROUND_UP(n,d) (((n) + (d) - 1) / (d))
> >
> > static const uint HZ_tab[] = { 100, 250, 300, 1000 };
> >
> > static void check_cycle_count(uint freq, uint HZ, uint delay_us)
> > {
> > 	uint UDELAY_MULT = (2147 * HZ) + (483648 * HZ / MEGA);
> > 	uint lpj = DIV_ROUND_UP(freq, HZ);
> > 	uint computed = ((u64)lpj * delay_us * UDELAY_MULT >> 31) + 1;
> > 	uint expected = DIV_ROUND_UP((u64)delay_us * freq, MEGA);
> >
> > 	if (computed != expected)
> > 		printf("freq=%u HZ=%u delay_us=%u comp=%u exp=%u\n",
> > 			freq, HZ, delay_us, computed, expected);
> > }
> >
> > int main(void)
> > {
> > 	uint idx, delay_us, freq;
> >
> > 	for (freq = 3*MEGA; freq <= 100*MEGA; freq += 3*MEGA)
> > 		for (idx = 0; idx < sizeof HZ_tab / sizeof *HZ_tab; ++idx)
> > 			for (delay_us = 1; delay_us <= 2000; ++delay_us)
> > 				check_cycle_count(freq, HZ_tab[idx], delay_us);
> >
> > 	return 0;
> > }
> >
> > -- >8 --
> > Subject: [PATCH] ARM: Tweak clock-based udelay implementation
> >
> > In 9f8197980d87a ("delay: Add explanation of udelay() inaccuracy")
> > Russell pointed out that loop-based delays may return early.
> >
> > On the arm platform, delays may be either loop-based or clock-based.
> >
> > This patch tweaks the clock-based implementation so that udelay(N)
> > is guaranteed to spin at least N microseconds.
>
> As I've already said, I don't want this, because it encourages people
> to use too-small delays in driver code, and if we merge it then you
> will look at your data sheet, decide it says "you need to wait 10us"
> and write in your driver "udelay(10)" which will break on the loops
> based delay.
>
> udelay() needs to offer a consistent interface so that drivers know
> what to expect no matter what the implementation is. Making one
> implementation conform to your ideas while leaving the other
> implementations with other expectations is a recipe for bugs.
udelay() needs to be consistent across platforms, and yes, udelay(10)
is expected to delay at least 10usec.

If that is not true on your platform, _fix your platform_. But it is
not valid to reject patches fixing other platforms, just because your
platform is broken.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html