Thread (288 messages), 13 authors, 2014-04-01

Re: [PATCH] netpoll: Don't call driver methods from interrupt context.

From: Eric W. Biederman <hidden>
Date: 2014-03-05 19:24:40
Subsystem: netconsole, networking drivers, the rest · Maintainers: Breno Leitao, Andrew Lunn, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Linus Torvalds

David Miller [off-list ref] writes:
> From: ebiederm@xmission.com (Eric W. Biederman)
> Date: Tue, 04 Mar 2014 16:03:43 -0800
>
>> So I would like some clear guidance.  Will you accept patches to make
>> it safe to call the napi poll routines from hard irq context, or should
>> we simply defer messages printed with netconsole in hard irq context
>> into another context where we can run the napi code?
>>
>> If there is not a clear way to fix the problems that crop up we should
>> just delete all of the netpoll code altogether, as it seems deadly in
>> its current form.
>
> Clearly to make netconsole most useful we should synchronously emit
> log messages.
>
> Because what if the system hangs right after this event, but before
> we get back to a "safe" context?
>
> That's one bug that will be a billion times harder to diagnose if
> we defer.

In general I agree.

The gripping hand for me is kernel/rcu/tree.c:print_cpu_stall() that
generates a warning from irq context on every cpu simultaneously.

Without netpoll I can debug that by just logging into the machine and
dumping dmesg, but with netpoll the machine dies when the warning is
generated, because after the first few messages each additional
message generates a new message.

Now that I have looked closer the printk generating a printk problem
seems to be something that is best solved at the printk level.  So if
you will accept the patches I will proceed to shore up the existing
netpoll implementations.
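
The feedback loop described above (emitting a message causes another
message to be generated) can be modelled in user space. The following is
only a sketch under stated assumptions: `log_model`, `emit`, `run_model`
and `MAX_DEPTH` are hypothetical names invented for illustration, not
kernel interfaces, and the re-entrancy guard is just one possible shape a
printk-level fix could take, not what printk actually does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct log_model {
	bool guard_enabled;	/* drop messages produced while already emitting */
	int depth;		/* current nesting of emit() calls */
	int emitted;		/* messages that reached the modelled console */
	int dropped;		/* messages suppressed by the guard */
};

/* Cap on nesting so the unguarded case terminates in this model. */
#define MAX_DEPTH 8

/*
 * Emit one message. The modelled transmit path is "unsafe": every
 * emission triggers one further warning, mimicking a message that
 * generates a new message.
 */
static void emit(struct log_model *m)
{
	if (m->guard_enabled && m->depth > 0) {
		m->dropped++;	/* nested message: suppress instead of recursing */
		return;
	}
	if (m->depth >= MAX_DEPTH)
		return;		/* model-only safety net */

	m->depth++;
	m->emitted++;
	emit(m);		/* the transmit path itself logs a warning */
	m->depth--;
	assert(m->depth >= 0);
}

/* Run the model once and report how many messages were emitted. */
static int run_model(bool guard_enabled)
{
	struct log_model m = { .guard_enabled = guard_enabled };

	emit(&m);
	return m.emitted;
}
```

In this model, `run_model(false)` keeps emitting until it hits the
`MAX_DEPTH` cap, while `run_model(true)` emits the original message once
and drops the nested one.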

I am thinking pretty seriously about forcing hard irq context during
netconsole's use of netpoll to ensure that the hard irq context case
actually gets tested.  I need to do some audits to see if that would
cause any side effects beyond leaving irqs disabled.

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index ba2f5e710af1..aaa9062061c8 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -734,6 +734,7 @@ static void write_msg(struct console *con, const char *msg, unsigned int len)
        unsigned long flags;
        struct netconsole_target *nt;
        const char *tmp;
+       bool hard_irq;
 
        if (oops_only && !oops_in_progress)
                return;
@@ -742,6 +743,9 @@ static void write_msg(struct console *con, const char *msg, unsigned int len)
                return;
 
        spin_lock_irqsave(&target_list_lock, flags);
+       hard_irq = in_irq();
+       if (!hard_irq)
+               irq_enter();
        list_for_each_entry(nt, &target_list, list) {
                netconsole_target_get(nt);
                if (nt->enabled && netif_running(nt->np.dev)) {
@@ -761,6 +765,8 @@ static void write_msg(struct console *con, const char *msg, unsigned int len)
                }
                netconsole_target_put(nt);
        }
+       if (!hard_irq)
+               irq_exit();
        spin_unlock_irqrestore(&target_list_lock, flags);
 }
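
The pattern in the patch (remember whether we were already in hard irq
context, enter it if not, and leave it again only if we entered it) can
be exercised in user space with a small model. This is a sketch under
stated assumptions: every `model_*` name, `hardirq_depth` and
`driver_saw_hardirq` are hypothetical stand-ins, and the counter only
imitates the nesting behaviour of irq_enter()/irq_exit(), not their
real side effects.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's hard irq nesting count. */
static int hardirq_depth;

static bool model_in_irq(void)
{
	return hardirq_depth > 0;
}

static void model_irq_enter(void)
{
	hardirq_depth++;
}

static void model_irq_exit(void)
{
	assert(hardirq_depth > 0);
	hardirq_depth--;
}

/* Records the context the modelled driver observed on the last send. */
static bool driver_saw_hardirq;

static void model_send(void)
{
	driver_saw_hardirq = model_in_irq();
}

/* Mirrors the shape of the patched write_msg(): force hard irq context. */
static void model_write_msg(void)
{
	bool hard_irq = model_in_irq();

	if (!hard_irq)
		model_irq_enter();
	model_send();
	if (!hard_irq)
		model_irq_exit();	/* undo only what we did ourselves */
}
```

Whether `model_write_msg()` is called with `hardirq_depth` at zero or
already raised, the modelled driver always sees hard irq context and the
depth is left unchanged on return.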

Eric