Re: clocksource_watchdog causing scheduling of timers every second (was [v13] support "task_isolation" mode)
From: Francis Giraldeau <hidden>
Date: 2016-07-29 18:31:55
Also in:
lkml
I tested this patch on 4.7 and confirm that irq_work no longer occurs on
the isolated cpu. Thanks!
I don't know of any utility to test the task isolation feature, so I started
one:
https://github.com/giraldeau/taskisol
The script exp.sh runs taskisol under five different conditions, but some of
the behavior is not what I would expect.
At startup, it does:
- register a custom signal handler for SIGUSR1
- sched_setaffinity() on CPU 1, which is isolated
- mlockall(MCL_CURRENT) to prevent undesired page faults
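
The three startup steps above can be sketched roughly as follows. This is only
an illustration and not the actual taskisol source: the helper names
(install_sigusr1_handler, pin_to_cpu, lock_memory) and the got_sigusr1 flag are
made up for this example. It only uses sigaction(), sched_setaffinity() and
mlockall(), and leaves out the prctl() call since that depends on the patched
kernel headers.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* needed for sched_setaffinity() and CPU_* macros */
#endif
#include <sched.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical flag set by the handler so the test can see the signal. */
volatile sig_atomic_t got_sigusr1 = 0;

static void on_sigusr1(int sig)
{
    (void)sig;
    got_sigusr1 = 1;
}

/* Register a custom handler for SIGUSR1. Returns 0 on success, -1 on error. */
int install_sigusr1_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Restrict the calling thread to a single cpu. Returns 0 on success, -1 on error. */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Lock the pages that are currently mapped; this may fail if RLIMIT_MEMLOCK is low. */
int lock_memory(void)
{
    return mlockall(MCL_CURRENT);
}
```

A caller would run install_sigusr1_handler(), pin_to_cpu(1) and lock_memory()
in that order before enabling isolation, checking each return value.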
The default strict mode is set with:
prctl(PR_SET_TASK_ISOLATION, PR_TASK_ISOLATION_ENABLE)
And then the write() syscall is called. Based on the previous discussion,
SIGKILL should be sent, but that does not happen. When we force a page fault
instead of calling write(), SIGKILL is correctly sent.
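
For reference, one way to force a page fault after mlockall(MCL_CURRENT) is to
map a fresh anonymous page and touch it, since MCL_CURRENT does not cover
mappings created later. The sketch below assumes that approach; it is not
necessarily how taskisol does it, and the function name force_page_fault is
made up for this example. It returns the change in the minor fault counter
reported by getrusage(), or -1 on error.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Map one new anonymous page and write to it so that the kernel has to
 * fault it in. Returns the number of minor faults counted around the
 * access, or -1 if a call failed.
 */
long force_page_fault(void)
{
    struct rusage before, after;
    long page = sysconf(_SC_PAGESIZE);
    volatile char *p;

    if (page <= 0)
        return -1;
    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    if (getrusage(RUSAGE_SELF, &before) != 0) {
        munmap((void *)p, (size_t)page);
        return -1;
    }

    p[0] = 1;                  /* first touch of the page triggers the fault */

    if (getrusage(RUSAGE_SELF, &after) != 0) {
        munmap((void *)p, (size_t)page);
        return -1;
    }
    munmap((void *)p, (size_t)page);
    return after.ru_minflt - before.ru_minflt;
}
```

On a regular kernel this normally reports at least one minor fault; under
strict task isolation the same access would be expected to trigger the
isolation signal instead.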
When a custom signal, SIGUSR1, is requested instead:
prctl(PR_SET_TASK_ISOLATION, PR_TASK_ISOLATION_USERSIG |
PR_TASK_ISOLATION_SET_SIG(SIGUSR1))
The signal is never delivered, neither when the syscall is issued nor when the
page fault occurs.
I can confirm that, if two taskisol are created on the same CPU, the second one
fails with Resource temporarily unavailable, so that's fine.
I can add more test cases depending on your comments, such as the TLB events
triggered by another thread on a non-isolated core. But maybe there is already
a test suite?
Francis
2016-07-27 15:58 GMT-04:00 Chris Metcalf [off-list ref]:
> On 7/27/2016 3:53 PM, Christoph Lameter wrote:
>> On Wed, 27 Jul 2016, Chris Metcalf wrote:
>>> Looks good. Did you omit the equivalent fix in
>>> clocksource_start_watchdog() on purpose? For now I just took your
>>> change, but tweaked it to add the equivalent diff with
>>> cpumask_first_and() there.
>> Can the watchdog be started on an isolated cpu at all? I would expect
>> that the code would start a watchdog only on a housekeeping cpu.
> The code just starts the watchdog initially on the first online cpu. In
> principle you could have configured that as an isolated cpu, so without
> any change to that code, you'd interrupt that cpu. I guess another way to
> slice it would be to start the watchdog on the current core. But just
> using the same idiom as in clocksource_watchdog() seems cleanest to me.
>
> I added your patch to the series and pushed it up (along with adding your
> Tested-by to the x86 enablement commit). It's still based on 4.6 so I'll
> need to rebase it once the merge window closes.
>
> --
> Chris Metcalf, Mellanox Technologies
> http://www.mellanox.com