Thread (18 messages) 18 messages, 4 authors, 2012-10-24

Re: [RFC PATCH v2 2/6] PM / Runtime: introduce pm_runtime_set_memalloc_noio()

From: Ming Lei <hidden>
Date: 2012-10-23 15:18:36
Also in: linux-mm, lkml, netdev

On Tue, Oct 23, 2012 at 10:46 PM, Alan Stern [off-list ref] wrote:
On Tue, 23 Oct 2012, Ming Lei wrote:
quoted
On Mon, Oct 22, 2012 at 10:33 PM, Alan Stern [off-list ref] wrote:
quoted
Tail recursion should be implemented as a loop, not as an explicit
recursion.  That is, the function should be:

void pm_runtime_set_memalloc_noio(struct device *dev, bool enable)
{
        do {
                dev->power.memalloc_noio_resume = enable;

                if (!enable) {
                        /*
                         * Don't clear the parent's flag if any of the
                         * parent's children have their flag set.
                         */
                        if (device_for_each_child(dev->parent, NULL,
                                          dev_memalloc_noio))
                                return;
                }
                dev = dev->parent;
        } while (dev);
}
OK, will take the non-recursion implementation for saving kernel
stack space.
quoted
except that you need to add locking, for two reasons:

        There's a race.  What happens if another child sets the flag
        between the time device_for_each_child() runs and the next loop
        iteration?
Yes, I know the race, and not adding a lock because the function
is mostly called in .probe() or .remove() callback and its parent's device
lock is held to avoid this race.

Considered that it may be called in async probe() (scsi disk), one lock
is needed, the simplest way is to add a global lock. Any suggestion?
No.  Because of where you put the new flag, it must be protected by
dev->power.lock.  And this means the iterative implementation shown
above can't be used as is.  It will have to be more like this:

void pm_runtime_set_memalloc_noio(struct device *dev, bool enable)
{
        spin_lock_irq(&dev->power.lock);
        dev->power.memalloc_noio_resume = enable;

        while (dev->parent) {
                spin_unlock_irq(&dev->power.lock);
                dev = dev->parent;

                spin_lock_irq(&dev->power.lock);
                /*
                 * Don't clear the parent's flag if any of the
                 * parent's children have their flag set.
                 */
                if (!enable && device_for_each_child(dev->parent, NULL,
s/dev->parent/dev
                                dev_memalloc_noio))
                        break;
                dev->power.memalloc_noio_resume = enable;
        }
        spin_unlock_irq(&dev->power.lock);
}
With the problem of non-SMP-safe bitfields access, the power.lock should
be held, but that is not enough to prevent children from being probed or
disconnected. Looks another lock is still needed. I think a global lock
is OK in the infrequent path.
quoted
quoted
        Even without a race, access to bitfields is not SMP-safe
        without locking.
You mean one ancestor device might not be in active when
one of its descendants is being probed or removed?
No.  Consider this example:

        struct foo {
                int a:1;
                int b:1;
        } x;

Consider what happens if CPU 0 does "x.a = 1" at the same time as
another CPU 1 does "x.b = 1".  The compiler might produce object code
looking like this for CPU 0:

        move    x, reg1
        or      0x1, reg1
        move    reg1, x

and this for CPU 1:

        move    x, reg2
        or      0x2, reg2
        move    reg2, x

With no locking, the two "or" instructions could execute
simultaneously.  What will the final value of x be?

The two CPUs will interfere, even though they are touching different
bitfields.
Got it, thanks for your detailed explanation.

Looks the problem is worse than above, not only bitfields are affected, the
adjacent fields might be involved too, see:

           http://lwn.net/Articles/478657/


Thanks,
--
Ming Lei

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help