Re: RAID5 reconstruction ?
From: Greg Freemyer <hidden>
Date: 2009-06-03 01:54:37
On Sun, May 31, 2009 at 8:14 AM, NeilBrown [off-list ref] wrote:
On Sun, May 31, 2009 9:54 pm, Goswin von Brederlow wrote:
SandeepKsinha [off-list ref] writes:
On Sun, May 31, 2009 at 1:07 AM, Redeeman [off-list ref] wrote:
On Sat, 2009-05-30 at 20:55 +0200, Goswin von Brederlow wrote:
And just when I hit send I thought of something else. Instead of the initial sync when creating a raid, the bitmap could just mark all blocks as unused. Much faster raid creation.

This really sounds like a good option. This would have a slight hit for writes, which I believe will be compensated for by later reconstructions, replacing a disk, mirror resync and many more operations.

What hit? Currently with bitmap support a write will set the block to "unclean", write the data, write the parity and set the block to "clean". Setting the "used" bit along the way should not cost much. The only difference I see is that the bitmap would have to have finer granularity so one "used" bit covers one filesystem block (4k usually). Otherwise you could only "use" blocks but not "unuse" them again when the filesystem frees them in 4k chunks.

But the filesystem could "unuse" blocks in larger chunks. There is this thing called "thin provisioning" and I believe the proponents of that would like the "TRIM" command to be sent in aligned multiples of 1 gigabyte or something like that. I believe this is one aspect of Linux TRIM support that is still open.

I think there would be real value in providing an 'allocated' bitmap even if it were quite coarsely grained. The problem with a very large grain is that every time you set a bit, you need to resync that region, and you don't want that to take too long. So 1 gig (10-30 seconds?) would be an upper limit, I would think.

If you used 1 sector for the bitmap, that is 4096 bits, so on a terabyte array you have 256 Meg chunks that resync in a few seconds. Certainly an interesting idea to experiment with, I think.

NeilBrown
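Neil's granularity estimate can be checked with a little arithmetic. The sketch below is only an illustration, not md code: the function names are made up, the sector is assumed to be 512 bytes, "a terabyte" is taken as 1 TiB, and the 100 MiB/s resync rate is a guess chosen to land inside his "10-30 seconds per gig" range.

```python
# Back-of-the-envelope check of the bitmap granularity figures.
# All names here are hypothetical; nothing is taken from the md driver.

SECTOR_BYTES = 512          # assumption: classic 512-byte sector
MIB = 1024 ** 2
TIB = 1024 ** 4


def chunk_bytes_per_bit(array_bytes, bitmap_bytes=SECTOR_BYTES):
    """Size of the array region covered by one bit of the bitmap."""
    bits = bitmap_bytes * 8          # one sector -> 4096 bits
    return array_bytes // bits


def resync_seconds(chunk_bytes, throughput_bytes_per_sec):
    """Time to resync one region at a given sustained throughput."""
    return chunk_bytes / throughput_bytes_per_sec


chunk = chunk_bytes_per_bit(TIB)
print(chunk // MIB)                               # 256 (MiB per bit)
print(round(resync_seconds(chunk, 100 * MIB), 2)) # 2.56 (seconds, at the assumed rate)
```

With a one-sector bitmap on a 1 TiB array, each bit covers 256 MiB, and at the assumed rate a single region resyncs in under three seconds, which matches the "few seconds" figure.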
Neil,

Is there likely to be any discussion at OLS of how trim / unmap will be invoked by the filesystem layer? Or how do those decisions get made?

i.e., the ext4 list was recently talking about sending down very fine-grained info, but coarse-grained seems to make a lot more sense to me.

Hopefully, each filesystem is not given the ability to decide for itself. This very much seems like something the linux-kernel community should have input into, not just the ext4 maintainer.

Greg
--
Greg Freemyer
Head of EDD Tape Extraction and Processing team
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
First 99 Days Litigation White Paper - http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf

The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com