Re: [RFC PATCH v2 2/6] PM / Runtime: introduce pm_runtime_set_memalloc_noio()
From: Ming Lei <hidden>
Date: 2012-10-23 15:18:36
Also in:
linux-mm, lkml, netdev
On Tue, Oct 23, 2012 at 10:46 PM, Alan Stern [off-list ref] wrote:
On Tue, 23 Oct 2012, Ming Lei wrote:quoted
On Mon, Oct 22, 2012 at 10:33 PM, Alan Stern [off-list ref] wrote:quoted
Tail recursion should be implemented as a loop, not as an explicit recursion. That is, the function should be: void pm_runtime_set_memalloc_noio(struct device *dev, bool enable) { do { dev->power.memalloc_noio_resume = enable; if (!enable) { /* * Don't clear the parent's flag if any of the * parent's children have their flag set. */ if (device_for_each_child(dev->parent, NULL, dev_memalloc_noio)) return; } dev = dev->parent; } while (dev); }OK, will take the non-recursion implementation for saving kernel stack space.quoted
except that you need to add locking, for two reasons: There's a race. What happens if another child sets the flag between the time device_for_each_child() runs and the next loop iteration?Yes, I know the race, and not adding a lock because the function is mostly called in .probe() or .remove() callback and its parent's device lock is held to avoid this race. Considered that it may be called in async probe() (scsi disk), one lock is needed, the simplest way is to add a global lock. Any suggestion?No. Because of where you put the new flag, it must be protected by dev->power.lock. And this means the iterative implementation shown above can't be used as is. It will have to be more like this: void pm_runtime_set_memalloc_noio(struct device *dev, bool enable) { spin_lock_irq(&dev->power.lock); dev->power.memalloc_noio_resume = enable; while (dev->parent) { spin_unlock_irq(&dev->power.lock); dev = dev->parent; spin_lock_irq(&dev->power.lock); /* * Don't clear the parent's flag if any of the * parent's children have their flag set. */ if (!enable && device_for_each_child(dev->parent, NULL,
s/dev->parent/dev
dev_memalloc_noio))
break;
dev->power.memalloc_noio_resume = enable;
}
spin_unlock_irq(&dev->power.lock);
}With the problem of non-SMP-safe bitfields access, the power.lock should be held, but that is not enough to prevent children from being probed or disconnected. Looks another lock is still needed. I think a global lock is OK in the infrequent path.
quoted
quoted
Even without a race, access to bitfields is not SMP-safe without locking.You mean one ancestor device might not be in active when one of its descendants is being probed or removed?No. Consider this example: struct foo { int a:1; int b:1; } x; Consider what happens if CPU 0 does "x.a = 1" at the same time as another CPU 1 does "x.b = 1". The compiler might produce object code looking like this for CPU 0: move x, reg1 or 0x1, reg1 move reg1, x and this for CPU 1: move x, reg2 or 0x2, reg2 move reg2, x With no locking, the two "or" instructions could execute simultaneously. What will the final value of x be? The two CPUs will interfere, even though they are touching different bitfields.
Got it, thanks for your detailed explanation.
Looks the problem is worse than above, not only bitfields are affected, the
adjacent fields might be involved too, see:
http://lwn.net/Articles/478657/
Thanks,
--
Ming Lei
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>