Re: 2.6.25rc7 lockdep trace
From: Johannes Berg <johannes@sipsolutions.net>
Date: 2008-03-29 00:54:20
> stack backtrace:
> Pid: 2308, comm: NetworkManager Not tainted 2.6.25-0.161.rc7.fc9.i686 #1
>  [print_circular_bug_tail+91/102] print_circular_bug_tail+0x5b/0x66
>  [print_circular_bug_entry+57/67] ? print_circular_bug_entry+0x39/0x43
>  [__lock_acquire+2488/3089] __lock_acquire+0x9b8/0xc11
>  [_spin_unlock_irq+34/47] ? _spin_unlock_irq+0x22/0x2f
>  [lock_acquire+106/144] lock_acquire+0x6a/0x90
>  [flush_workqueue+0/133] ? flush_workqueue+0x0/0x85
>  [flush_workqueue+68/133] flush_workqueue+0x44/0x85
>  [flush_workqueue+0/133] ? flush_workqueue+0x0/0x85
>  [flush_scheduled_work+13/15] flush_scheduled_work+0xd/0xf
>  [<d096d80a>] tulip_down+0x20/0x1a3 [tulip]
>  [trace_hardirqs_on+233/266] ? trace_hardirqs_on+0xe9/0x10a
>  [dev_deactivate+177/222] ? dev_deactivate+0xb1/0xde
>
> Yes, see for example:
>
> http://www.mail-archive.com/netdev@vger.kernel.org/msg31718.html
>
> You can't flush a workqueue in the device close handler exactly
> because of this locking conflict. Nobody has come up with a suitable
> way to fix this yet.
Maybe we should check which schedule_work users actually lock the rtnl
within the work function and move them to a uses-rtnl-in-work workqueue
so that everybody else can have rtnl around flush.

Depending on how many users there are, that might not be feasible, but I
have so far only seen linkwatch_event() lock the rtnl within the work
function, and everybody else seems to want to use the rtnl around the
flushing.

johannes
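
To make the dependency argument concrete, here is a rough user-space model of it in Python. It is not kernel code and not how lockdep is implemented. It is a toy lock-order graph where an edge A -> B means "B is acquired or waited on while A is held". The node names events_wq and rtnl_wq, the LockGraph class, and both model functions are made up for illustration. Only the two dependencies themselves come from the discussion above: the close handler flushing under rtnl, and linkwatch_event() taking rtnl inside a work function.

```python
from collections import defaultdict


class LockGraph:
    """Toy lock-order graph; edge A -> B means B is wanted while A is held."""

    def __init__(self):
        self.edges = defaultdict(set)

    def add_dependency(self, held, wanted):
        self.edges[held].add(wanted)

    def has_cycle(self):
        # Depth-first search; reaching a node that is still on the current
        # path (GREY) means the dependencies form a circle.
        WHITE, GREY, BLACK = 0, 1, 2
        state = defaultdict(int)

        def visit(node):
            state[node] = GREY
            for nxt in self.edges.get(node, ()):
                if state[nxt] == GREY:
                    return True
                if state[nxt] == WHITE and visit(nxt):
                    return True
            state[node] = BLACK
            return False

        return any(state[n] == WHITE and visit(n) for n in list(self.edges))


def shared_queue_model():
    """Everything on one shared queue (hypothetical name: events_wq)."""
    g = LockGraph()
    # Close handler: rtnl is held, then flush_scheduled_work() waits
    # for the shared queue to drain.
    g.add_dependency("rtnl", "events_wq")
    # linkwatch_event(): runs on the shared queue and takes rtnl.
    g.add_dependency("events_wq", "rtnl")
    return g


def split_queue_model():
    """rtnl-taking work moved to its own queue (hypothetical name: rtnl_wq)."""
    g = LockGraph()
    # Close handler still flushes the shared queue under rtnl.
    g.add_dependency("rtnl", "events_wq")
    # linkwatch_event() now runs on the separate queue and takes rtnl there.
    g.add_dependency("rtnl_wq", "rtnl")
    return g


if __name__ == "__main__":
    print("shared queue has cycle:", shared_queue_model().has_cycle())
    print("split queue has cycle:", split_queue_model().has_cycle())
```

In this model the split only helps as long as nothing flushes the rtnl-taking queue while holding rtnl. Adding that edge brings the cycle straight back.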