Re: [RFC][PATCH] md: avoid fullsync if a faulty member missed a dirty transition
From: Mike Snitzer <hidden>
Date: 2008-05-09 04:42:44
Also in: lkml
On Thu, May 8, 2008 at 9:40 PM, Neil Brown [off-list ref] wrote:
> On Thursday May 8, snitzer@gmail.com wrote:
> > On Thu, May 8, 2008 at 2:13 AM, Neil Brown [off-list ref] wrote:
> > > On Tuesday May 6, snitzer@gmail.com wrote:
> > > >
> > > > It looks like bitmap_update_sb()'s incrementing of events_cleared (on
> > > > behalf of the local member) could be racing with the fact that the NBD
> > > > member becomes faulty (thereby making the array degraded). This
> > > > allows the events_cleared to reflect a clean->dirty transition that last
> > > > occurred before the array became degraded. My reasoning is: If it was
> > > > a clean->dirty transition the bitmap still has the associated dirty
> > > > bit set in the local member's bitmap, so using the bitmap to resync is
> > > > valid.
> > > >
> > > > thanks,
> > > > Mike
> > >
> > > Thanks for persisting. I think I understand what is going on now.
> > >
> > > How about this patch? It is similar to yours, but instead of depending
> > > on the odd/even state of the event counter, it directly checks the
> > > clean/dirty state of the array.
> >
> > Hi Neil,
> >
> > Your revised patch works great and is obviously cleaner.
>
> But I'm still not happy with it :-(
> I suspect there might be other cases where it will still do the wrong
> thing. The real problem is that we are updating events_cleared too
> early. We are setting it to the new event counter before that is even
> written out.
>
> So I've come up with this patch, which I think more clearly
> encapsulates what events_cleared means. It is now set to the current
> 'events' counter immediately before we clear any bit.
>
> If you could test it, I'd really appreciate it.
Unfortunately my testing with this patch results in a full resync.
Here is the state of the array after shutdown:
# mdadm -X /dev/nbd0 /dev/sdq
Filename : /dev/nbd0
Magic : 6d746962
Version : 4
UUID : 7140cc3c:8681416c:12c5668a:984ca55d
Events : 896
Events Cleared : 897
State : OK
Chunksize : 128 KB
Daemon : 5s flush period
Write Mode : Normal
Sync Size : 52428736 (50.00 GiB 53.69 GB)
Bitmap : 409600 bits (chunks), 1 dirty (0.0%)
Filename : /dev/sdq
Magic : 6d746962
Version : 4
UUID : 7140cc3c:8681416c:12c5668a:984ca55d
Events : 898
Events Cleared : 897
State : OK
Chunksize : 128 KB
Daemon : 5s flush period
Write Mode : Normal
Sync Size : 52428736 (50.00 GiB 53.69 GB)
Bitmap : 409600 bits (chunks), 0 dirty (0.0%)
# mdadm --examine /dev/nbd0 /dev/sdq
/dev/nbd0:
Magic : a92b4efc
Version : 00.90.00
UUID : 7140cc3c:8681416c:12c5668a:984ca55d
Creation Time : Thu May 8 06:55:32 2008
Raid Level : raid1
Used Dev Size : 52428736 (50.00 GiB 53.69 GB)
Array Size : 52428736 (50.00 GiB 53.69 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Update Time : Thu May 8 18:07:47 2008
State : clean
Internal Bitmap : present
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Checksum : df65cb35 - correct
Events : 0.896
      Number   Major   Minor   RaidDevice State
this     1      43        0        1      active sync write-mostly   /dev/nbd0
   0     0      65        0        0      active sync   /dev/sdq
   1     1      43        0        1      active sync write-mostly   /dev/nbd0
/dev/sdq:
Magic : a92b4efc
Version : 00.90.00
UUID : 7140cc3c:8681416c:12c5668a:984ca55d
Creation Time : Thu May 8 06:55:32 2008
Raid Level : raid1
Used Dev Size : 52428736 (50.00 GiB 53.69 GB)
Array Size : 52428736 (50.00 GiB 53.69 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 0
Update Time : Thu May 8 18:07:49 2008
State : clean
Internal Bitmap : present
Active Devices : 1
Working Devices : 1
Failed Devices : 1
Spare Devices : 0
Checksum : df65c956 - correct
Events : 0.898
      Number   Major   Minor   RaidDevice State
this     0      65        0        0      active sync   /dev/sdq
   0     0      65        0        0      active sync   /dev/sdq
   1     1       0        0        1      faulty removed
Was I supposed to use this latest patch in combination with your
previous patch (to validate_super)? Because you'll note that with
your most recent patch nbd0's events (ev1) is still one less than
sdq's events_cleared. As such the validate_super's "ev1 <
mddev->bitmap->events_cleared" check triggers a full rebuild.
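
To make the failing comparison concrete, here is a minimal sketch in Python (a model for illustration, not kernel code) of the check as quoted above, fed with the values from the mdadm output. The function name needs_full_resync and the variable names are made up for this example; only the "ev1 < events_cleared" comparison comes from validate_super.

```python
# Illustrative model only; not kernel code. The function and variable
# names are hypothetical. Only the comparison itself mirrors the
# "ev1 < mddev->bitmap->events_cleared" check quoted above.

def needs_full_resync(ev1: int, events_cleared: int) -> bool:
    """Return True when the member's event count is below events_cleared.

    Assumption: in that case bits may have been cleared from the bitmap
    after the member last updated its superblock, so the bitmap is not
    trusted for recovering this member.
    """
    return ev1 < events_cleared


# Values taken from the mdadm output above.
nbd0_events = 896          # "Events" reported for /dev/nbd0
sdq_events_cleared = 897   # "Events Cleared" reported for /dev/sdq

if needs_full_resync(nbd0_events, sdq_events_cleared):
    print("full resync: ev1 (%d) < events_cleared (%d)"
          % (nbd0_events, sdq_events_cleared))
else:
    print("bitmap-based resync")
```

With these numbers the comparison is true, which matches the full recovery seen in the kernel log; nbd0 would need an event count of at least 897 for the bitmap to be used.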
The kernel log shows:
md: md0 stopped.
md: bind<nbd0>
md: bind<sdq>
md: kicking non-fresh nbd0 from array!
md: unbind<nbd0>
md: export_rdev(nbd0)
raid1: raid set md0 active with 1 out of 2 mirrors
md0: bitmap initialized from disk: read 13/13 pages, set 0 bits, status: 0
created bitmap (200 pages) for device md0
Nope!!! ev1 (896) < mddev->bitmap->events_cleared (897)
md: bind<nbd0>
RAID1 conf printout:
--- wd:1 rd:2
disk 0, wo:0, o:1, dev:sdq
disk 1, wo:1, o:1, dev:nbd0
md: recovery of RAID array md0