Re: [PATCH] md: Protecting mddev with barriers to avoid races
From: NeilBrown <hidden>
Date: 2009-05-22 11:33:53
On Fri, May 22, 2009 7:46 pm, SandeepKsinha wrote:
Hi Neil, On Fri, May 22, 2009 at 2:28 PM, NeilBrown [off-list ref] wrote:quoted
On Fri, May 22, 2009 6:36 pm, Andre Noll wrote:quoted
On 13:48, SandeepKsinha wrote:quoted
Signed-off-by: Sandeep K Sinha <redacted> Date: Fri May 22 13:46:56 2009 +0530 Protecting mddev with barriers to avoid races.diff --git a/drivers/md/linear.c b/drivers/md/linear.c index 9ad6ec4..e2dbda8 100644 --- a/drivers/md/linear.c +++ b/drivers/md/linear.c@@ -28,10 +28,12 @@static inline dev_info_t *which_dev(mddev_t *mddev, sector_t sector) { int lo, mid, hi; - linear_conf_t *conf = mddev_to_conf(mddev); + linear_conf_t *conf; lo = 0; hi = mddev->raid_disks - 1; + barrier(); + conf = mddev_to_conf(mddev);Are you sure that a compiler barrier is sufficient here? Maybe we need smp_mb(), but I'm not familar with all the different memory barrier types that exist in the kernel.I think we just want a read barrier here: smb_rmb(); because it is a barrier between reading mddev->raid_disk and reading mddev->private. In linear_add, we want a write barrier smb_wmb(); before setting mdev->raid_disks, and I think we also want one before setting mddev->private to make sure that all the assignments to newconf->whatever are stable.Do we really need one before setting mddev->private? Will newconf be used by anyone before its assigned to mddev->private? No. Then what makes you put a write barrier before it?
I'm not 100% certain we do need it, but the logic goes like that:
One processor does
newconf->disks[0].end_sector = X;
mddev->private = newconf;
and another processor does
conf = mddev->private;
foo = conf->disk[0].end_sector;
Due to the lack of memory ordering guarantees, foo could end up
with the value of end_sector *before* X was assigned to it.
In fact, rcu_assign_pointer (in include/linux/rcupdate.h) has
exactly this purpose: place an smp_wmb before assigning the address
of a structure to a pointer, to ensure that the content of the
structure is globally visible before the address of the structure
become visible.
So I would probably do
rcu_assign_pointer(mddev->private, newconf);
smb_wmb();
mddev->raid_disks ++;
An alternative (which actually might be an improvement, I'm not
sure) is to have a conf->raid_disks field and use that value in
which_dev. Then it would be more like
mddev->raid_disk++;
newconf->raid_disk = mddev->raid_disks;
rcu_assign_pointer(mddev->private, newconf);
and we only have one memory barrier.
The which_dev code then becomes:
conf = rcu_dereference(mddev->private);
hi = conf->raid_disks = 1;
.....
and again we have only one memory barrier. The significant benefit
of this is that instead of using the low level smp_[rw]mb() operations,
we use the higher level rcu_* interfaces which make it a lot more
obvious what we are doing.
It would then be a very small step use use rcu_dereference for
all mddev->private accesses in linear.c, wrap them in rcu_read_{,un}lock,
and then use call_rcu to arrange for the old conf_t to be freed.
NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html