Thread (6 messages), 3 authors, 2018-09-10

Re: PREEMPT_RT_FULL breaks NO_HZ_FULL (full dynticks)

From: Ramesh Thomas <hidden>
Date: 2018-08-31 12:35:13

On 2018-08-30 at 16:18:56 +0200, Sebastian Andrzej Siewior wrote:
On 2018-08-26 20:39:22 [-0700], Ramesh Thomas wrote:
Case #2 with CONFIG_PREEMPT_RT_FULL=y (First run after boot)
S [cpuhp/3]
S [migration/3]
S [posixcputmr/3]
S [rcuc/3]
S [ktimersoftd/3]
S [ksoftirqd/3]
I [kworker/3:0-mm_]
I [kworker/3:0H]
R [irq/125-nvme0q4]
R [kworker/3:1-mm_]
R ./jitter
irq/125 shouldn't be there, right?
Yes. Also, posixcputmr and ktimersoftd are not seen if PREEMPT_RT_FULL is not
enabled. They don't seem to be running when the timer interrupts occur. But
their presence alone indicates that something different is happening.
                     Sched  RT_Prio  Cpu_Time
S [posixcputmr/3]    FF     99       00:00:00
R [ktimersoftd/3]    FF      1       00:00:00
S [irq/125-nvme0q4]  FF     50       00:00:00
R ./jitter           FF     99       00:01:53
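
For anyone who wants to reproduce a listing like the table above, here is a minimal sketch that reads /proc/<pid>/task/<tid>/stat and reports state, scheduling policy and RT priority for threads last seen on a given CPU. It is not the tool used for the table. The helper names (parse_stat, tasks_on_cpu) are made up for illustration, and the "cpu" field is the CPU the thread last ran on, not its affinity mask.

```python
import os

# Short policy names as printed by ps in its CLS column.
POLICIES = {0: "TS", 1: "FF", 2: "RR", 3: "B", 5: "IDL", 6: "DLN"}

def parse_stat(line):
    """Parse one /proc/<pid>/task/<tid>/stat line (hypothetical helper)."""
    # comm sits in parentheses and may contain spaces, so split on the last ')'.
    head, _, tail = line.rpartition(")")
    comm = head.split("(", 1)[1]
    fields = tail.split()  # fields[0] is stat field 3 (state)
    return {
        "comm": comm,
        "state": fields[0],
        "cpu": int(fields[36]),      # stat field 39: CPU last executed on
        "rt_prio": int(fields[37]),  # stat field 40: rt_priority
        "policy": POLICIES.get(int(fields[38]), "?"),  # stat field 41
    }

def tasks_on_cpu(cpu, proc="/proc"):
    """Return parsed stat entries for all threads last seen on `cpu`."""
    found = []
    for pid in filter(str.isdigit, os.listdir(proc)):
        task_dir = os.path.join(proc, pid, "task")
        try:
            tids = os.listdir(task_dir)
        except OSError:
            continue  # process exited while we were scanning
        for tid in tids:
            try:
                with open(os.path.join(task_dir, tid, "stat")) as f:
                    info = parse_stat(f.read())
            except (OSError, IndexError, ValueError):
                continue  # thread vanished or line not in the expected shape
            if info["cpu"] == cpu:
                found.append(info)
    return found

if __name__ == "__main__":
    for t in tasks_on_cpu(3):
        print(f'{t["state"]} {t["comm"]:<20} {t["policy"]:<4} {t["rt_prio"]}')
```
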
Case #3 with CONFIG_PREEMPT_RT_FULL=y (Second run after boot)
S [cpuhp/3]
S [migration/3]
S [posixcputmr/3]
S [rcuc/3]
R [ktimersoftd/3]
S [ksoftirqd/3]
I [kworker/3:0-mm_]
I [kworker/3:0H]
S [irq/125-nvme0q4]
R [kworker/3:1-mm_]
R ./jitter

In Case #3, /proc/interrupts shows timer interrupts occurring on CPU 3, while
they are stopped in the other cases. ktimersoftd is in a runnable state in Case #3.
can you trace down who or what is arming the timer on CPU3?
Ok, I will take a look.
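
A minimal sketch of how the timer interrupt count for one CPU could be sampled from /proc/interrupts to check whether the tick is really stopped. It assumes the x86 row label "LOC" (local timer interrupts); other architectures use different labels. The function names are hypothetical.

```python
import time

def timer_count(interrupts_text, cpu, label="LOC"):
    """Return the counter in row `label` for CPU `cpu` (hypothetical helper)."""
    lines = interrupts_text.splitlines()
    # The header lists only online CPUs, so look the column up by name.
    col = lines[0].split().index(f"CPU{cpu}")
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        if name.strip() == label:
            return int(rest.split()[col])
    raise KeyError(label)

def sample_delta(cpu, seconds=1.0, path="/proc/interrupts"):
    """Count how many timer interrupts hit `cpu` during `seconds`."""
    with open(path) as f:
        before = timer_count(f.read(), cpu)
    time.sleep(seconds)
    with open(path) as f:
        after = timer_count(f.read(), cpu)
    return after - before

if __name__ == "__main__":
    print("timer interrupts on CPU 3 in 1s:", sample_delta(3))
```

With the tick stopped on an isolated CPU, the delta should stay close to zero while the test task runs; a delta that tracks the configured HZ would match the behaviour described for Case #3.
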
Is this a known issue and is it being looked at by anyone?
not that I know of. Do you happen to know if this is a regression
compared to v4.14-RT?
I see the issue in 4.14.63 RT as well. There are slight differences in
behavior due to changes that went into 4.17, but the main issue is seen there
also.
If it is an issue, I would be glad to help in any way to get these 2 very
important features compatible with each other.
So if the ktimersoftd runs and you see the interrupt counter
incrementing for CPU3 then it would be interesting to figure out why
there is an armed timer on the second invocation (and none on the first
one).
In 4.18.5 RT, the issue does not always happen in the second invocation.  
Sometimes it works as expected, but the issue will show up after a few 
tries.
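
One way to narrow down who arms the timer is to enable the timer trace events (the events/timer directory under tracefs) and summarise the resulting trace by callback function. The sketch below only parses trace text that has already been captured. It assumes the usual "[cpu]" column and the "function=" field of the hrtimer_start and timer_start events, and the function name timer_armers is made up. Note that the "[cpu]" column is the CPU on which the event was recorded, which may differ from the CPU the timer ends up queued on.

```python
import re
from collections import Counter

# Matches e.g. "... [003] d..h  12.345678: hrtimer_start: hrtimer=... function=tick_sched_timer ..."
_EVENT = re.compile(
    r"\[(\d+)\].*?\b(hrtimer_start|timer_start):.*?function=(\S+)"
)

def timer_armers(trace_text, cpu):
    """Count (event, callback) pairs recorded on `cpu` in captured trace text."""
    counts = Counter()
    for line in trace_text.splitlines():
        m = _EVENT.search(line)
        if m and int(m.group(1)) == cpu:
            counts[(m.group(2), m.group(3))] += 1
    return counts

if __name__ == "__main__":
    import sys
    # Usage (assumed): pipe a saved trace into the script.
    for (event, func), n in timer_armers(sys.stdin.read(), 3).most_common():
        print(f"{n:6d}  {event:<14} {func}")
```
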

It looks like there are 2 issues when PREEMPT_RT_FULL is enabled:
1. Some additional processes are pinned to isolated cores.
2. A timer is armed even though only a single high-priority task is running.

Thanks,
Ramesh
Sebastian