Thread (17 messages), 3 authors, 2014-03-28

Re: [PATCH net] ipv6: fix RTNL assert fail in DAD

From: Hannes Frederic Sowa <hidden>
Date: 2014-03-20 06:38:24

On Wed, Mar 19, 2014 at 11:52:17PM -0400, David Miller wrote:
> From: Hannes Frederic Sowa <redacted>
> Date: Wed, 19 Mar 2014 23:44:42 +0100
>
> > On Wed, Mar 19, 2014 at 01:53:19PM -0400, David Miller wrote:
> > > Ok, the timer stuff could run from a workqueue just fine.
> > We have no-timer invocations, too, like addrconf_prefix_rcv. In that case the
> > whole handling of the router advertisement should get deferred into the
> > workqueue.
> Just to be clear, you are saying that this doesn't need to be
> synchronous?  Handling a prefix event seems like something that would
> in fact need to be.

Here are my current analysis and proposals:

Actually, I would say that a safe entry point for pushing the rest of the
prefix event handling into a workqueue would be addrconf_dad_start.
From there on, we need to make sure that addrconf_join_solict (which
is the first point where we actually need RTNL held) is called before we
do any optimistic duplicate address detection processing (this seems to be
the only happens-before invariant we need to preserve here). Stephen already
allocated the work_struct in inet6_ifaddr, so my suggestion would be to
change Stephen's patch to use delayed work and to convert the other timer
operations into delayed operations on that new work item in inet6_ifaddr.
The entry point would be addrconf_dad_start, which simply queues the
delayed work with 0 delay, maybe together with a new flag, so that
addrconf_dad_timer (which should be renamed addrconf_dad_work by then)
does the work that was previously done in addrconf_dad_start.
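
To make the ordering concrete, here is a rough user-space model of that
flow. It is only a sketch and not kernel code: a plain flag stands in for
RTNL, a single pending bit stands in for the delayed work item, and every
model_* name is invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for RTNL: a flag is enough because the model is single-threaded. */
static bool model_rtnl_held;

static void model_rtnl_lock(void)   { assert(!model_rtnl_held); model_rtnl_held = true; }
static void model_rtnl_unlock(void) { assert(model_rtnl_held);  model_rtnl_held = false; }

/* Hypothetical "new flag": tells the work handler which step to perform. */
enum model_dad_action { MODEL_DAD_BEGIN, MODEL_DAD_PROCESS };

struct model_ifaddr {
    bool work_pending;            /* models a queued work item */
    enum model_dad_action action;
    bool joined_solict;
    bool dad_started;
    int order_violations;         /* counts DAD started before the join */
};

/* Models the first step that needs RTNL held. */
static void model_join_solict(struct model_ifaddr *ifp)
{
    assert(model_rtnl_held);
    ifp->joined_solict = true;
}

/* Models the start of DAD processing; records an ordering violation. */
static void model_dad_begin(struct model_ifaddr *ifp)
{
    if (!ifp->joined_solict)
        ifp->order_violations++;
    ifp->dad_started = true;
}

/* Work handler: runs later, in a context where taking the lock is fine. */
static void model_dad_work(struct model_ifaddr *ifp)
{
    model_rtnl_lock();
    if (ifp->action == MODEL_DAD_BEGIN) {
        model_join_solict(ifp);   /* must happen before DAD processing */
        model_dad_begin(ifp);
        ifp->action = MODEL_DAD_PROCESS;
    }
    model_rtnl_unlock();
}

/* Entry point: callable without the lock, it only queues the work. */
static void model_dad_start(struct model_ifaddr *ifp)
{
    ifp->action = MODEL_DAD_BEGIN;
    ifp->work_pending = true;     /* "queue with 0 delay" */
}

/* Models the worker picking up a pending item; returns true if it ran. */
static bool model_run_pending(struct model_ifaddr *ifp)
{
    if (!ifp->work_pending)
        return false;
    ifp->work_pending = false;
    model_dad_work(ifp);
    return true;
}
```

In this model the caller of model_dad_start never touches the lock; the
join and the start of DAD both happen inside the handler, in that order.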

The addrconf_dad_completed handling could be under RTNL, too, so the
original problem would be gone.

addrconf_verify would also need delayed work: split the body out into
addrconf_verify_rtnl, and make addrconf_verify just an invocation
of mod_delayed_work(wq, addrconf_verify_work, 0), whose handler calls
addrconf_verify_rtnl with RTNL held. That would be my approach from only
looking at the code.
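
A small user-space model of that split, again only a sketch under stated
assumptions: the sketch_* names are invented, a flag stands in for RTNL,
and sketch_mod_delayed_work merely imitates the "re-arm the pending item
with a new timeout" behaviour I would expect from the real helper.

```c
#include <assert.h>
#include <stdbool.h>

static bool sketch_rtnl_held;     /* stand-in for RTNL */
static int sketch_verify_runs;    /* how often the locked body ran */

/* Models a delayed work item: a pending bit plus an expiry time. */
struct sketch_delayed_work {
    bool pending;
    unsigned long expires;
};

static struct sketch_delayed_work sketch_verify_work;

/* Re-arm the item; returns true if it was already pending. */
static bool sketch_mod_delayed_work(struct sketch_delayed_work *w,
                                    unsigned long now, unsigned long delay)
{
    bool was_pending = w->pending;

    w->pending = true;
    w->expires = now + delay;     /* a later call overrides the old timeout */
    return was_pending;
}

/* The real body: must only ever run with the lock held. */
static void sketch_verify_rtnl(void)
{
    assert(sketch_rtnl_held);
    sketch_verify_runs++;
}

/* Work handler: takes the lock, then calls the locked body. */
static void sketch_verify_work_fn(void)
{
    sketch_rtnl_held = true;
    sketch_verify_rtnl();
    sketch_rtnl_held = false;
}

/* Lock-free entry point: only asks for the work to run as soon as possible. */
static void sketch_verify(unsigned long now)
{
    sketch_mod_delayed_work(&sketch_verify_work, now, 0);
}

/* Models the worker: runs the item once its expiry time has been reached. */
static bool sketch_worker_tick(unsigned long now)
{
    if (!sketch_verify_work.pending || now < sketch_verify_work.expires)
        return false;
    sketch_verify_work.pending = false;
    sketch_verify_work_fn();
    return true;
}
```

Callers that already hold the lock would call the locked body directly;
everyone else goes through the entry point and never needs the lock.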

That leaves us with one unsafe call to a function that needs RTNL held,
in pndisc_constructor for proxy_ndp handling. I don't know what to do about
that yet, but I haven't looked too closely.

Also, to find problems like this sooner, should we propagate ASSERT_RTNL()
checks up from conditional callees to their callers (e.g. __dev_set_promiscuity
-> __dev_set_rx_mode -> maybe even further up the stack)?
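
To illustrate why hoisting the check helps, here is a toy user-space
sketch. All demo_* names are invented and this makes no claim about the
real call graph; the check only counts failures instead of printing a
warning.

```c
#include <stdbool.h>

static bool demo_rtnl_held;        /* stand-in for RTNL */
static int demo_assert_failures;   /* counts failed lock checks */

/* Non-fatal check: records the failure and carries on. */
#define DEMO_ASSERT_RTNL() \
    do { if (!demo_rtnl_held) demo_assert_failures++; } while (0)

/* Callee that needs the lock and checks for it. */
static void demo_callee(void)
{
    DEMO_ASSERT_RTNL();
}

/* Caller without its own check: a missing lock is only noticed when
 * the rarely taken branch happens to run. */
static void demo_caller_unchecked(bool rare_condition)
{
    if (rare_condition)
        demo_callee();
}

/* Caller with the check hoisted: a missing lock is noticed on every
 * call, whether or not the rare branch is taken. */
static void demo_caller_checked(bool rare_condition)
{
    DEMO_ASSERT_RTNL();
    if (rare_condition)
        demo_callee();
}
```

Calling demo_caller_unchecked(false) without the lock goes unnoticed,
while demo_caller_checked(false) records the problem immediately.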

Greetings,

  Hannes