Thread: 20 messages, 6 authors, 2012-10-31

Re: semi-stable page writes

From: Dave Chinner <david@fromorbit.com>
Date: 2012-10-30 23:43:31
Also in: linux-fsdevel

On Tue, Oct 30, 2012 at 01:40:37PM -0700, Darrick J. Wong wrote:
> On Tue, Oct 30, 2012 at 09:01:22AM +1100, Dave Chinner wrote:
> > On Fri, Oct 26, 2012 at 03:19:09AM -0700, Darrick J. Wong wrote:
> > > Hi everyone,
> > >
> > > Are people still annoyed about writes taking unexpectedly long amounts of time
> > > due to the stable page write patchset?  I'm guessing yes...
> >
> > I haven't heard anyone except the lunatic fringe complain
> > recently...
> >
> > > I'm close to posting a patchset that (a) gates the wait_on_page_writeback calls
> > > on a flag that you can set in the bdi to indicate that you need stable writes
> > > (which blk_integrity_register will set);
> >
> > I'd prefer stable pages by default (e.g. btrfs needs it for sane
> > data crc calculations), with an option to turn it off.
> >
> > > (b) (ab)uses a page flag bit (PG_slab)
> > > to indicate that a page is actually being sent out to disk hardware; and (c)
> >
> > I don't think you can do that. You can send slab allocated memory to
> > disk (e.g. kmalloc()d memory) and XFS definitely does that for
> > sub-page sized metadata. I'm pretty sure that means the PG_slab
> > flag is not available for (ab)use in the IO path....
>
> I gave up on PG_slab and declared my own PG_ bit.  Unfortunately, atm I can't
> remember which bit of code marks the page ptes so that they have to go back
> through page_mkwrite, where we can trap the write.  Hopefully for a shorter
> duration.

clear_page_dirty_for_io(), IIRC.

> Also, I was wondering -- is it possible to pursue a dual strategy?  If we can
> obtain a memory page without sleeping or causing any writeback, then use the
> page as a bounce buffer.  Otherwise, just wait like we do now.

Using bounce buffers for all IO is not a feasible solution. Way too
much overhead copying data, not to mention we are already suffering
from the problem of flusher threads going CPU bound trying to issue
enough IO to keep high bandwidth storage fully utilised...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com