Re: [RFC PATCH v2 2/6] PM / Runtime: introduce pm_runtime_set_memalloc_noio()

From: Alan Stern <stern@rowland.harvard.edu>
Date: 2012-10-23 14:46:52
Also in: linux-mm, linux-pm, lkml

On Tue, 23 Oct 2012, Ming Lei wrote:

On Mon, Oct 22, 2012 at 10:33 PM, Alan Stern [off-list ref] wrote:

quoted

Tail recursion should be implemented as a loop, not as an explicit
recursion.  That is, the function should be:

void pm_runtime_set_memalloc_noio(struct device *dev, bool enable)
{
        do {
                dev->power.memalloc_noio_resume = enable;

                if (!enable) {
                        /*
                         * Don't clear the parent's flag if any of the
                         * parent's children have their flag set.
                         */
                        if (device_for_each_child(dev->parent, NULL,
                                          dev_memalloc_noio))
                                return;
                }
                dev = dev->parent;
        } while (dev);
}

OK, will take the non-recursion implementation for saving kernel
stack space.

quoted

except that you need to add locking, for two reasons:

        There's a race.  What happens if another child sets the flag
        between the time device_for_each_child() runs and the next loop
        iteration?

Yes, I know the race, and not adding a lock because the function
is mostly called in .probe() or .remove() callback and its parent's device
lock is held to avoid this race.

Considered that it may be called in async probe() (scsi disk), one lock
is needed, the simplest way is to add a global lock. Any suggestion?

No.  Because of where you put the new flag, it must be protected by
dev->power.lock.  And this means the iterative implementation shown
above can't be used as is.  It will have to be more like this:

void pm_runtime_set_memalloc_noio(struct device *dev, bool enable)
{
	spin_lock_irq(&dev->power.lock);
	dev->power.memalloc_noio_resume = enable;

	while (dev->parent) {
		spin_unlock_irq(&dev->power.lock);
		dev = dev->parent;

		spin_lock_irq(&dev->power.lock);
		/*
		 * Don't clear the parent's flag if any of the
		 * parent's children have their flag set.
		 */
		if (!enable && device_for_each_child(dev->parent, NULL,
				dev_memalloc_noio))
			break;
		dev->power.memalloc_noio_resume = enable;
	}
	spin_unlock_irq(&dev->power.lock);
}

quoted

        Even without a race, access to bitfields is not SMP-safe
        without locking.

You mean one ancestor device might not be in active when
one of its descendants is being probed or removed?

No.  Consider this example:

	struct foo {
		int a:1;
		int b:1;
	} x;

Consider what happens if CPU 0 does "x.a = 1" at the same time as 
another CPU 1 does "x.b = 1".  The compiler might produce object code 
looking like this for CPU 0:

	move	x, reg1
	or	0x1, reg1
	move	reg1, x

and this for CPU 1:

	move	x, reg2
	or	0x2, reg2
	move	reg2, x

With no locking, the two "or" instructions could execute 
simultaneously.  What will the final value of x be?

The two CPUs will interfere, even though they are touching different 
bitfields.

Alan Stern

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help