Thread (4 messages) 4 messages, 2 authors, 2022-02-02

Re: [BUG] net_device UAF: linkwatch_fire_event() calls dev_hold() after netdev_wait_allrefs() is done

From: Jakub Kicinski <kuba@kernel.org>
Date: 2022-02-02 00:57:37
Also in: lkml

On Tue, 1 Feb 2022 23:46:16 +0100 Jann Horn wrote:
On Fri, Jan 28, 2022 at 3:19 AM Jakub Kicinski [off-list ref] wrote:
quoted
Interesting..

I don't know what link_reset does, but since it turns the carrier on it
seems like something that should be flushed/canceled when the device
goes down. unregister brings the device down under rtnl_lock.

On Fri, 28 Jan 2022 02:51:24 +0100 Jann Horn wrote:  
quoted
Is the bug that usbnet_disconnect() should be stopping &dev->kevent
before calling unregister_netdev()?  
I'd say not this one, I think the generally agreed on semantics are that
the netdev is under users control between register and unregister, we
should not cripple it before unregister.
 
quoted
Or is the bug that ax88179_link_reset() doesn't take some kind of lock
and re-check that the netdev is still alive?  
That'd not be an uncommon way to fix this.. taking rtnl_lock, not even
a driver lock in similar.  
Ah, I found a comment with a bit of explanation on how this is
supposed to work... usbnet_stop() explains:

    /* deferred work (task, timer, softirq) must also stop.
     * can't flush_scheduled_work() until we drop rtnl (later),
     * else workers could deadlock; so make workers a NOP.
     */

And usbnet_stop() is ->ndo_stop(), which indeed runs under RTNL.

I wonder what the work items can do that'd conflict with RTNL... or is
the comment just talking about potential issues if a bunch of *other*
work items need RTNL and clog up the system_wq so that
flush_scheduled_work() blocks forever?
Good question. Nothing that would explicitly take rtnl_lock jumps out
in the usbnet code or the link_reset handler. The code is ancient, too:

/* work that cannot be done in interrupt context uses keventd.
 *
 * NOTE:  with 2.5 we could do more of this using completion callbacks,

:)
If it's the latter case, I guess we could instead do cancel_work_sync() and
then maybe re-run the work function's handler one more time
synchronously?
cancel_sync() sounds good, lan78xx.c does that plus clearing the event
bits. I don't think you need to call the link_reset handler manually,
the stop function should shut down the link, anyway.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help