Re: [RFC PATCH V1] raid1: rewrite the iobarrier
From: NeilBrown <hidden>
Date: 2013-02-06 02:16:53
On Thu, 24 Jan 2013 14:02:52 +0800 majianpeng [off-list ref] wrote:
There is iobarrier in raid1 because two reason resync/recovery or reconfigure the array. At present,it suspend all nornal IO when reysync/recovey. But if nornal IO is outrange the resync/recovery windwos,it don't need to iobarrier. So I rewrite the iobarrier. Because the reasons of barrier are two,so i use two different methods. First for resync/recovery, there is a reysnc window.The end position is 'next_resync'.Because the resync depth is RESYNC_DEPTH(32), so the start is 'next_resync - RESYNC_SECTOR * RESYNC_DEPTH' The nornal IO Will be divided into three categories by the location. a: before the start of resync window b: between the resync window c: after the end of resync window For a, it don't barrier. For b, it need barrier and used the original method For c, it don't barrier but it need record the minimum position.If next resync is larger this, resync action will suspend.Otherwise versa. I used rbtree to order those io. For the reason of reconfigure of the arrary,I proposed a concept "force_barrier".When there is force_barrier, all Nornam IO must be suspended. NOTE: Because this problem is also for raid10, but i only do it for raid1. It is post out mainly to make sure it is going in the correct direction and hope to get some helpful comments from other guys. If the methods is accepted,i will send the patch for raid10. Signed-off-by: Jianpeng Ma <redacted>
Hi, thanks for this and sorry for the delay in replying. The patch is reasonably good, but there is room for improvement. I would break it up into several patches which are easier to review. - Firstly, don't worry about the barrier for 'read' requests - it really isn't relevant for them (your patch didn't do this). - Secondly, find those places where raise_barrier() is used to reconfigure the array and replace those with "freeze_array()". I think that is safe, and it means that we don't need to pass a 'force' argument to 'raise_barrier()' and 'lower_barrier()'. - The rest might have to be all one patch, though if it could be done in a couple of distinct stages that would be good. For the different positions (before, during, after), which you currently call 0, 1, and 2 you should use an enum so they all have names. I don't really like the rbtree. It adds an extra place where you take the spinlock and causes more work to be done inside a spinlock. And it isn't really needed. Instead, you can have two R1BIO flags for "AFTER_RESYNC". One for requests that are more than 2 windows after, one for request that are less then 2 windows after (or maybe 3 windows or maybe 8..). Each of these has a corresponding count (as you already have: nr_after). Then when resync gets to the end of a window you wait until the count of "less than 2 windows after" reaches zero, then atomically swap the meaning of the two bits (toggle a bit somewhere). This should give you nearly the same effect with constant (not log-n) effort. Finally, have you done any testing, either to ensure there is no slow-down, or to demonstrate some improvement? Thanks, NeilBrown
Attachments
- signature.asc [application/pgp-signature] 828 bytes