Thread: 16 messages, 7 authors, 2011-06-11

Re: Changing Kernel thread priorities

From: Thomas Gleixner <hidden>
Date: 2011-06-10 15:37:58
Also in: lkml

On Fri, 10 Jun 2011, Remy Bohmer wrote:
2011/6/8 Thomas Gleixner [off-list ref]:
quoted
On Wed, 8 Jun 2011, Remy Bohmer wrote:
quoted
In real life you may want, for EXAMPLE, this setup:
* prio 70: high priority motor control loop
* prio 60: network device irq
* prio 59: network softirqs
* prio 55: some realtime task depending on the networking stack
* prio 54: mass storage irq
* prio 53: block device softirq
* prio 52: some realtime task depending on mass-storage
* prio 50: all remaining irq threads
* prio 49: all remaining softirqs

Assume here you do an ifconfig down and ifconfig up. With the current
kernel behaviour you will see that the irq thread switches from prio
60 to 50.
The irq thread then ends up at a lower priority than its related
softirqs, which can result in this network interface dying
completely... even before it ever came back up again...
Not really. If that's the case it needs to be investigated and
fixed.
I, of course, agree with that, but these cases are usually extremely
hard to find, and occur typically only in the once-a-month-condition
that you cannot reproduce...
Do you remember why the priority of the softirqs was moved down from
50 to 49? IIRC this was because of the very same reason, and IIRC it
is still valid.
No, it's not. The root cause was a problem with the network softirq
and a network driver, the softirq ->49 was a temporary workaround
until we had enough information to find the real root cause. I wish
I'd never committed that change at all.
We do not have control over all kernel code, and new drivers are
continuously being developed that make wrong implicit assumptions
about the order of irq->sirq->everything else. Of course this is
wrong, and there is no excuse, but it is a fact of life...
In practice the softirq prio can be set to a higher value than 50 (or
1), and a hirq thread that is started at 50 (or 2) will result in
situations that are not expected.
quoted
quoted
As mentioned before by Thomas, the configuration is a policy issue and
must be set from user-context. I understand what he means by that and
I agree, but there still has to be a mechanism to make the kernel
remember the configuration set by the user to prevent all kinds of
race conditions. You cannot demand from the user to run after
Which race conditions?
Race conditions that occur when a softirq preempts a related hardirq,
which the driver did not expect and was not designed for.
And making it the other way round hides the problem, which is even
worse. We want stuff to explode right away. You can run into the same
problem when the softirq holds a lock and the high prio irq thread
boosts it.
quoted
So moving the base priority down to 1 or 2 is probably the most
sensible solution to avoid that a newly brought up interrupt thread
interferes with anything in the rt domain and it's not rocket science
to adjust the priority in an ifup.post or with a udev rule.
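A minimal sketch of the kind of post-up adjustment described above, in Python. It assumes that threaded interrupt handlers show up as kernel threads named "irq/<nr>-<name>" under /proc, and that the script runs as root (CAP_SYS_NICE). The POLICY table, the "eth0" pattern and the value 60 are hypothetical and only mirror the example setup quoted earlier; they are not taken from any real configuration.

```python
import os
import re

# Hypothetical policy table: IRQ-thread name pattern -> SCHED_FIFO priority.
# The "eth0" pattern and the value 60 are illustrative assumptions.
POLICY = [(re.compile(r"^irq/\d+-eth0"), 60)]


def wanted_priority(comm, policy=POLICY):
    """Return the configured priority for a thread name, or None."""
    for pattern, prio in policy:
        if pattern.search(comm):
            return prio
    return None


def irq_threads(proc="/proc"):
    """Yield (pid, comm) for every task whose name starts with 'irq/'."""
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # task exited while we were scanning
        if comm.startswith("irq/"):
            yield int(entry), comm


def apply_policy(policy=POLICY):
    """Set SCHED_FIFO priorities on matching IRQ threads (needs root)."""
    for pid, comm in irq_threads():
        prio = wanted_priority(comm, policy)
        if prio is not None:
            os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(prio))
```

Calling apply_policy() from an interface post-up hook or a udev-triggered script would re-apply the user's policy each time the irq thread is recreated. How the hook is wired up is distribution specific and is left out here.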
At prio 1 or 2, _every_ RT-thread in the system is to be assumed to be
more low-latency bound compared to _any_ interrupt handler. And you
assume here that no user RT-thread in the system shall use any
functionality of any driver that has an interrupt handler (otherwise
you get the priority inversions issue)
Sigh. People who use RT threads should better know what they do and
configure their damned system correctly. We cannot provide a solution
which takes every incarnation of lusers into account.
As mentioned in this thread before by someone else, you will get this
old issue back: 'My drivers start to behave weird when I create a
RT-thread...'
And I do not care at all. The answer is: Do not use an RT-thread when
you do not know what you are doing.
The prio inversion issue between hirq/sirq will become even worse,
since there will be a smaller chance that softirqs will stay at
prio 1 and thus there is less guarantee that they will stay below the
hirq-prio all the time.
There is no such thing and if it's there, then it needs to be found
and fixed.
Furthermore, I prefer the principle: _Nothing_ goes above interrupt
(thread) priority unless there is a very special reason for it and it
has been investigated that it is safe to do so. And a user-thread that
requires functionality of a certain driver shall be set below the
priority of the hirq-thread of that driver. The prio of the softirq
must _always_ be between that user-thread and hirq-thread if there is
a relation between the driver and softirq.
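The ordering rule stated in the paragraph above (hirq thread above softirq above dependent user thread) can be written down as a small check. This is only a sketch of that stated rule, not something the kernel enforces. The chain names and the helper functions are hypothetical; the numbers are the EXAMPLE values quoted earlier in this thread.

```python
# Each chain is (hirq thread prio, softirq prio, dependent user task prio).
# Values are the EXAMPLE numbers quoted in this thread, not recommendations.
EXAMPLE = {
    "network": (60, 59, 55),
    "mass-storage": (54, 53, 52),
}


def violations(chains):
    """Return sorted names of chains that break hirq > softirq > user task."""
    return sorted(
        name
        for name, (hirq, sirq, user) in chains.items()
        if not (hirq > sirq > user)
    )


def after_irq_reset(chains, name, default=50):
    """Model the irq thread of one chain falling back to the default prio."""
    changed = dict(chains)
    _, sirq, user = changed[name]
    changed[name] = (default, sirq, user)
    return changed


print(violations(EXAMPLE))                              # []
print(violations(after_irq_reset(EXAMPLE, "network")))  # ['network']
```

The second call models the ifconfig down/up case described earlier: the network irq thread falls back to 50 while its softirq stays at 59, so the chain no longer satisfies the rule.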

In that light I think prio 1/2 is worse than 49/50. I
think the current _default_ is okay, it makes the system at least
boot.
It boots with 50 or whatever you set it to as well.

Thanks,

	tglx