Thread: 24 messages, 6 authors, 2021-08-06

Re: [clocksource] 8901ecc231: stress-ng.lockbus.ops_per_sec -9.5% regression

From: "Paul E. McKenney" <paulmck@kernel.org>
Date: 2021-08-06 04:15:06
Also in: lkml, oe-lkp

On Fri, Aug 06, 2021 at 10:10:00AM +0800, Chao Gao wrote:
On Thu, Aug 05, 2021 at 08:37:27AM -0700, Paul E. McKenney wrote:
On Thu, Aug 05, 2021 at 01:39:40PM +0800, Chao Gao wrote:
[snip]
This patch works well; no false-positive (marking TSC unstable) in a
10hr stress test.
Very good, thank you!  May I add your Tested-by?
sure.
Tested-by: Chao Gao <redacted>
Very good, thank you!  I will apply this on the next rebase.
I expect that I will need to modify the patch a bit more to check for
a system where it is -never- able to get a good fine-grained read from
the clock.
Agreed.
And it might be that your test run ended up in that state.
Not the case, judging from the kernel logs. Coarse-grained checks happened
6475 times in 43k seconds (by grep "coarse-grained skew check" in kernel logs).
So many checks were still fine-grained.
Whew!  ;-)

So about once per 13 clocksource watchdog checks.
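
The "once per 13" figure can be reproduced with a little arithmetic. The sketch below assumes the clocksource watchdog runs every 0.5 seconds (the usual WATCHDOG_INTERVAL of HZ >> 1). That interval is not stated in this thread, so treat it as an assumption. The variable names are illustrative only.

```python
# Figures reported in the thread.
run_seconds = 43_000        # "43k seconds" of stress testing
coarse_checks = 6475        # count of "coarse-grained skew check" log lines

# Assumption (not stated in the thread): the watchdog fires every 0.5 s.
watchdog_interval_s = 0.5

total_checks = run_seconds / watchdog_interval_s
checks_per_coarse = total_checks / coarse_checks
seconds_per_coarse = run_seconds / coarse_checks

print(f"total watchdog checks:       {total_checks:.0f}")
print(f"checks per coarse-grained:   {checks_per_coarse:.1f}")
print(f"seconds per coarse-grained:  {seconds_per_coarse:.1f}")
```

Under that assumption there are roughly 86,000 watchdog checks in the run, so one coarse-grained check per 13 or so, or about one every 6.6 seconds.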

To Andi's point, do you have enough information in your console log to
work out the longest run of coarse-grained clocksource checks?
Yes. 5 consecutive coarse-grained clocksource checks. Note that,
considering the reinitialization after a coarse-grained check, in my
calculation two coarse-grained checks are considered consecutive if
they happen within 1s (+/- 0.3s) of each other.
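
The run-length rule described above could be sketched as follows. This is a minimal illustration, not the script actually used for the test: the function name, the parameters, and the assumption that the log timestamps have already been extracted into a list of seconds are all hypothetical.

```python
def longest_coarse_run(timestamps, gap=1.0, tol=0.3):
    """Return the longest run of consecutive coarse-grained checks.

    timestamps: times (in seconds) of coarse-grained check log lines,
                in ascending order.
    Two checks count as consecutive when the gap between them is
    within `gap` +/- `tol` seconds, following the rule in the thread.
    """
    longest = 0
    current = 0
    prev = None
    for t in timestamps:
        if prev is not None and abs((t - prev) - gap) <= tol:
            current += 1      # extends the current run
        else:
            current = 1       # starts a new run
        longest = max(longest, current)
        prev = t
    return longest


# Made-up timestamps for illustration only (not taken from any real log).
example = [10.0, 11.0, 12.1, 13.0, 14.2, 30.0, 31.0]
print(longest_coarse_run(example))
```

With the made-up timestamps above, the first five entries are each about one second apart and form one run, while the last two form a separate, shorter run.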
Very good, thank you!

So it seems eminently reasonable to have the clocksource watchdog complain
bitterly for more than (say) 100 consecutive coarse-grained checks.
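
The idea can be modeled roughly as below. This is a Python model of the logic only, not kernel code and not the proposed patch. The class name, the warn-once behavior, and the reset on a fine-grained read are assumptions made for illustration, and the limit of 100 is the "(say)" figure from the message.

```python
class CoarseRunMonitor:
    """Track consecutive coarse-grained checks and flag long runs."""

    def __init__(self, limit=100):
        self.limit = limit        # complain once the run exceeds this
        self.consecutive = 0      # length of the current coarse-grained run
        self.warned = False       # avoid repeating the complaint every check

    def record(self, coarse):
        """Record one watchdog check; return True if a complaint is due."""
        if not coarse:
            # A good fine-grained read ends the run.
            self.consecutive = 0
            self.warned = False
            return False
        self.consecutive += 1
        if self.consecutive > self.limit and not self.warned:
            self.warned = True
            return True
        return False


monitor = CoarseRunMonitor(limit=100)
results = [monitor.record(True) for _ in range(101)]
print(results.index(True))  # index of the first check that triggers a complaint
```

With a longest observed run of 5, a limit of 100 leaves a wide margin before such a complaint would fire on the tested system.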

I am thinking in terms of a separate patch for this purpose.

Thoughts?

							Thanx, Paul