Re: [PATCH] md: Protecting mddev with barriers to avoid races
From: SandeepKsinha <hidden>
Date: 2009-05-22 13:41:12
On Fri, May 22, 2009 at 5:03 PM, NeilBrown [off-list ref] wrote:
On Fri, May 22, 2009 7:46 pm, SandeepKsinha wrote:quoted
Hi Neil, On Fri, May 22, 2009 at 2:28 PM, NeilBrown [off-list ref] wrote:quoted
On Fri, May 22, 2009 6:36 pm, Andre Noll wrote:quoted
On 13:48, SandeepKsinha wrote:quoted
Signed-off-by: Sandeep K Sinha <redacted> Date: Fri May 22 13:46:56 2009 +0530 Protecting mddev with barriers to avoid races.diff --git a/drivers/md/linear.c b/drivers/md/linear.c index 9ad6ec4..e2dbda8 100644 --- a/drivers/md/linear.c +++ b/drivers/md/linear.c@@ -28,10 +28,12 @@static inline dev_info_t *which_dev(mddev_t *mddev, sector_t sector) { int lo, mid, hi; - linear_conf_t *conf = mddev_to_conf(mddev); + linear_conf_t *conf; lo = 0; hi = mddev->raid_disks - 1; + barrier(); + conf = mddev_to_conf(mddev);Are you sure that a compiler barrier is sufficient here? Maybe we need smp_mb(), but I'm not familar with all the different memory barrier types that exist in the kernel.I think we just want a read barrier here: smb_rmb(); because it is a barrier between reading mddev->raid_disk and reading mddev->private. In linear_add, we want a write barrier smb_wmb(); before setting mdev->raid_disks, and I think we also want one before setting mddev->private to make sure that all the assignments to newconf->whatever are stable.Do we really need one before setting mddev->private? Will newconf be used by anyone before its assigned to mddev->private? No. Then what makes you put a write barrier before it?I'm not 100% certain we do need it, but the logic goes like that: One processor does newconf->disks[0].end_sector = X; mddev->private = newconf; and another processor does conf = mddev->private; foo = conf->disk[0].end_sector; Due to the lack of memory ordering guarantees, foo could end up with the value of end_sector *before* X was assigned to it.
Not sure but can be a possibility. But I feel if thats the case, we might have to revisit the whole md code. I see similar instances at lot many places.
In fact, rcu_assign_pointer (in include/linux/rcupdate.h) has exactly this purpose: place an smp_wmb before assigning the address of a structure to a pointer, to ensure that the content of the structure is globally visible before the address of the structure become visible. So I would probably do rcu_assign_pointer(mddev->private, newconf); smb_wmb(); mddev->raid_disks ++; An alternative (which actually might be an improvement, I'm not sure) is to have a conf->raid_disks field and use that value in which_dev. Then it would be more like mddev->raid_disk++; newconf->raid_disk = mddev->raid_disks; rcu_assign_pointer(mddev->private, newconf); and we only have one memory barrier. The which_dev code then becomes: conf = rcu_dereference(mddev->private); hi = conf->raid_disks = 1; .....
I like this approach more. And this will surely keep consistency throughout the code.
and again we have only one memory barrier. The significant benefit
of this is that instead of using the low level smp_[rw]mb() operations,
we use the higher level rcu_* interfaces which make it a lot more
obvious what we are doing.
It would then be a very small step use use rcu_dereference for
all mddev->private accesses in linear.c, wrap them in rcu_read_{,un}lock,
and then use call_rcu to arrange for the old conf_t to be freed.I am going to try this. Will keep you posted.
NeilBrown
-- Regards, Sandeep. “To learn is to change. Education is a process that changes the learner.” -- To unsubscribe from this list: send the line "unsubscribe linux-raid" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html