Re: [RFC] bonding: fix workqueue re-arming races

From: Jiri Bohac <hidden>
Date: 2010-09-01 18:29:41

On Wed, Sep 01, 2010 at 10:14:34AM -0700, Jay Vosburgh wrote:

	I also thought a bit more, and in the current code, the mode
shouldn't change in the middle of one of the work functions, because a
mode change requires the bond to be closed, so the various work things
will be stopped (more or less; excepting the race under disucssion
here).

	I don't think this is true for the new wq_rtnl functions,
however, because its work items are not canceled until the workqueue
itself is freed in bond_destructor.  Does the wq_rtnl open new races,
because it's work items are not synchronized with other activities
(close in particular)?  It's possible for its work functions (which may
do things like set the active slave, carrier, etc) to be invoked after
the bond has closed, and possibly reopened, or been otherwise adjusted.

I don't think this patch opens new races. The current race
scenario is:

1) schedule_delayed_work(foo)
2) foo's timer expires and foo is queued on bond->wq
  (possibly, foo starts to run and either gets preempted or
  sleeps on rtnl)
3) bond_close() sets kill_timers=1 and calls
  cancel_delayed_work() which accomplishes nothing
4) bond_open() sets kill_timers=0
5) bond_open() calls schedule_delayed_work(bar)
6) foo may run the "commit" work that should not be run
7) foo re-arms
8) if (foo == bar) -> BUG	/* bond->mode did not change */

With this patch, it is:

1) schedule_delayed_work(foo)
2) foo's timer expires and foo is queued on bond->wq
3) foo may have queued foo_commit on bond->wq_rtnl
4) bond_close() cancels foo
5) bond_open()
6) foo_commit may run and it should not be run

The patch avoids the problem of 7) and 8)

I think the race in 6) remains the same. It is now easier to fix.
This could even be done with a flag (similar to kill_timers),
which would be set each time the "commit" work is queued on wq_rtnl and
cleared by bond_close(). This should avoid the races completely,
I think. The trick is that, unlike kill_timers, bond_open() would
not touch this flag.

	I'm not sure this is better than one of the alternatives I
believe we discussed the last time around: having the rtnl-acquiring
work functions do a conditional acquire of rtnl, and if that fails,
reschedule themselves.

[...]

		if (rtnl_trylock()) {
			read_lock(&bond->lock);

			bond_miimon_commit(bond);

			read_unlock(&bond->lock);
			rtnl_unlock();	/* might sleep, hold no other locks */
			read_lock(&bond->lock);
		} else {
			read_lock(&bond->lock);
			queue_work(bond->wq, &bond->mii_work);
			read_unlock(&bond->lock);
			return;
		}

I actually tried the other variant suggested last time
(basically:

while (!rtnl_trylock()) {
	read_lock(&bond->lock)
	kill = bond->kill_timers;
	read_unlock(&bond->lock)
	if (kill)
		return;
})

and gave that to a customer experiencing this problem (I cannot
reproduce it myself). It was reported to lock up. I suspect some
kind of live-lock on bond->lock caused by the active waiting, but
I did not spend too much time debugging this.
[BTW, this is https://bugzilla.novell.com/show_bug.cgi?id=602969
,Novell BZ account needeed]

FWIW  this would be the only use of rtnl_trylock() in the kernel,
besides a few places that do:
	if (!rtnl_trylock()) return restart_syscall();
I think it is plain ugly -- semaphores are just not supposed to be
spinned on.

Your re-scheduling variant is more or less equivalent to active
spinning, isn't it? Anyway, if you really think it is a better approach,
are you going to apply it? I can supply the patch. (Although I
kinda don't like people seeing my name next to it ;))

-- 
Jiri Bohac [off-list ref]
SUSE Labs, SUSE CZ

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help