Re: [PATCH, RFC] Don't do page stablization if !CONFIG_BLKDEV_INTEGRITY
From: Boaz Harrosh <hidden>
Date: 2012-03-08 20:20:47
Also in:
linux-fsdevel
On 03/08/2012 10:09 AM, Chris Mason wrote:
But, why are we writeback for a second or more? Aren't there other parts of this we would want to fix as well? I'm not against only turning on stable pages when they are needed, but the code that isn't the default tends to be somewhat less used. So it does increase testing burden when we do want stable pages, and it tends to make for awkward bugs that are hard to reproduce because someone neglects to mention it. IMHO it's much more important to nail down the 2 second writeback latency. That's not good.
I think I understand this one. It's due to the sync nature introduced by page_waiting in mkwrite. The system is loaded, so everything lags by roughly 2 seconds or more. The 2 sec (or more) comes from max-dirty-limit/disk-speed, so any IO you submit will probably be on stable disk 2 sec later. (In theory, any power failure will lose all dirty pages, which in our case is max-dirty-limit.)

Usually that's fine: everything is queued, the waiting is spread fairly evenly, and in theory you only wait in proportion to the rate of your own IO. But here, all of a sudden, you are not allowed to be queued. You are waiting for the head of the queue to actually complete, and the app is just frozen.

Actually, now that I think of it, the pages must already have been submitted for us to be waiting on them. So the 2 sec is the depth of the block+scsi+target queues. I guess they can be pretty deep.

I have a theory of how we can fix that 2-sec wait: avoid writeback of the last n pages of an inode whose mtime is less than 2 sec old. This would solve the wait penalty for any sequential writer, which is Ted's case.

Thanks
Boaz
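
To make the estimate and the proposed heuristic above concrete, here is a rough userspace sketch in Python. It is not kernel code. The function names, the example numbers, and the `tail_pages` default are all assumptions chosen for illustration; the thread does not specify a value for n or report these figures.

```python
def wait_sec(bytes_ahead, bytes_per_sec):
    """Time a blocked writer waits: data queued ahead of it / device throughput.

    bytes_ahead may be the dirty limit (first estimate in the mail) or the
    combined depth of the block+scsi+target queues (revised estimate).
    """
    return bytes_ahead / bytes_per_sec


def should_defer_writeback(page_index, last_page_index, mtime, now,
                           tail_pages=4, min_age_sec=2.0):
    """Hypothetical form of the proposal: skip writeback for the last
    `tail_pages` pages of an inode modified less than `min_age_sec` ago."""
    recently_modified = (now - mtime) < min_age_sec
    in_tail = page_index > last_page_index - tail_pages
    return recently_modified and in_tail


if __name__ == "__main__":
    MiB = 2 ** 20
    # Assumed figures: 200 MiB queued ahead, device sustaining 100 MiB/s.
    print(wait_sec(200 * MiB, 100 * MiB))
    # Tail page of a file written 1 s ago: writeback would be deferred.
    print(should_defer_writeback(99, 99, mtime=10.0, now=11.0))
```

With a policy like this, a sequential writer appending to the tail of a file would rarely find its most recent pages under writeback, so it would rarely block in mkwrite. Pages outside the tail, or belonging to inodes that have been idle for longer than the threshold, would be written back as usual.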