Thread: 11 messages, 3 authors, 2009-03-01

Re: [patch][rfc] mm: new address space calls

From: Chris Mason <hidden>
Date: 2009-02-27 13:52:47
Also in: linux-fsdevel

On Fri, 2009-02-27 at 12:26 +0100, Nick Piggin wrote:
> On Thu, Feb 26, 2009 at 08:21:45AM -0500, Chris Mason wrote:
> > > > One problem I have with the btrfs extent state code is that I might
> > > > choose to release the extent state in releasepage, but the VM might not
> > > > choose to free the page.  So I've got an up to date page without any of
> > > > the rest of my state.
> > >
> > > I'm not sure. What semantics do you want there? In most cases (including
> > > fsblock default case where the filesystem does not have a pin), we're
> > > happy to leave clean, uptodate pages in pagecache in that case.
> >
> > Right, but it really limits the state that we can keep outside the page
> > bits.  Take a subpage block, where we know the first 1k is up to date.
> > releasepage comes and we free our tracking that says the first 1k is up
> > to date, but the VM doesn't free the page.
> >
> > Now we have a page where the uptodate bit isn't set, but the first 1k
> > has valid data.  We have to reread it.
>
> Well I don't see how that limits us? Either we prefer to keep the
> metadata, or we throw it away and it is inevitable that we lose
> information.

We can't have metadata that isn't freed by releasepage unless we want to
pin the page completely.  There was a time when the btrfs metadata had a
bit for 'this block needs defrag', and I ended up not being able to use
it because releasepage was consistently freeing my extra data while the
page was still around.

> Regardless of whether you store the data in a tree of extents in the
> inode, or per-page buffers, you have the same problem (buffer heads
> have that same problem too).

Right.

> > I'd like a form of releasepage that knows if the vm is going to really
> > get rid of the page.  Or another callback that happens when the VM is
> > sure the page will be freed so we can drop extra metadata that doesn't
> > pin the page, but we always want to stay with the page.
>
> Well, for page reclaim/invalidate/truncate, we have releasepage that you
> can use even if the metadata is stored outside the page, just set PagePrivate
> and it will still get called when the page is about to be freed.

For clean pages, shrink_page_list seems to check the page count after
the releasepage call.  It was a big enough window for me to see it in
practice under normal workloads.

> There are *some* races that can result in the page subsequently not being
> freed, but I don't think that should be a big deal. I don't want to add
> a callback in the pagecache remove path if possible, but we can try to
> rework or improve things if btrfs needs something specific.

Btrfs doesn't need it today, but it should help once I finally get
subpage blocks going again (and metadata defrag as well).

-chris

