Thread (20 messages) 20 messages, 7 authors, 2004-12-28

Re: Lockup with 2.6.9-ac15 related to netconsole

From: Patrick McHardy <hidden>
Date: 2004-12-22 14:59:06
Also in: lkml

Francois Romieu wrote:
Patrick McHardy [off-list ref] :
[...]
quoted
at least the queued messages ordered. But you need to grab
dev->queue_lock, otherwise you risk corrupting qdisc internal data.
You should probably also deal with the noqueue-qdisc, which doesn't
have an enqueue function. So it should look something like this:

If I am not mistaken, a failure on spin_trylock + the test on
xmit_lock_owner imply that it is safe to directly handle the
queue. It means that qdisc_run() has been interrupted on the
current cpu and the other paths seem fine as well. Counter-example
is welcome (no joke).
enqueue is only protected by dev->queue_lock, and dev->queue_lock
is dropped as soon as dev->xmit_lock is grabbed, so any other CPU
might call enqueue at the same time.

Example:

CPU1					CPU2

dev_queue_xmit				dev_queue_xmit
  lock(dev->queue_lock)			 lock(dev->queue_lock)
q->enqueue
qdisc_run
qdisc_restart
  trylock(dev->xmit_lock), ok
  unlock(dev->queue_lock)
...
printk("something")
...
netpoll_send_skb
  trylock(dev->xmit_lock), fails
q->enqueue				q->enqueue
Of course the patch is completely ugly and violates any layering
principle one could think of. It was not submitted for inclusion :o)
Sure, but I think we should have a short-term workaround until
a better solution has been invented. Maybe dropping the packets
would be best for now, it only affects printks issued in paths
starting at qdisc_restart (-> hard_start_xmit -> ...). Queueing
the packets might also cause reordering since not all packets
are queued.
quoted
while (!spin_trylock(&np->dev->xmit_lock)) {
if (np->dev->xmit_lock_owner == smp_processor_id()) {
	struct Qdisc *q;

	rcu_read_lock();
	q = rcu_dereference(dev->qdisc);
	if (q->enqueue) {
		spin_lock(&dev->queue_lock);

I'd expect it to deadlock if dev_queue_xmit -> qdisc_run is interrupted
on the current cpu and a printk is issued as dev->queue_lock will have
been taken elsewhere.
Hmm this is complicated, let me think some more about it.

Regards
Patrick
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help