Re: Two questions on VFS/mm

From: Jan Kara <jack@suse.cz>
Date: 2008-06-05 08:12:20
Also in: linux-fsdevel, linux-mm, lkml

On Wed 04-06-08 19:10:42, Miklos Szeredi wrote:

(Added some CCs)

quoted

  could some kind soul knowledgeable in VFS/mm help me with the following
two questions? I've spotted them when testing some ext4 patches...
  1) In write_cache_pages() we do:
	...
	lock_page(page);
	...
	if (!wbc->range_cyclic && page->index > end) {
		done = 1;
		unlock_page(page);
		continue;
	}
	...
	ret = (*writepage)(page, wbc, data);

  Now the problem is that if range_cyclic is set, it can happen that the
page we give to the filesystem is beyond the current end of file (and may
already have been processed by invalidatepage()). Is the filesystem supposed
to handle this (what would be the point of giving such a page to the fs?) or
is it just a bug in write_cache_pages()?

There may be a bug somewhere, but write_cache_pages() looks correct.
It locks the page then checks for page->mapping to make sure the page
wasn't truncated.  And truncation (including invalidatepage()) happens
with the page locked, so that can't race with page writeback.
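The lock-then-check ordering described above can be illustrated with a small userspace model. This is only a sketch: the toy_* types and functions are hypothetical stand-ins, not kernel code, and the dirty flag is a simplification added so the example has something to count. It assumes truncation clears page->mapping while holding the page lock, as stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct address_space and struct page. */
struct toy_mapping { int id; };

struct toy_page {
	struct toy_mapping *mapping;	/* NULL once the page is truncated */
	unsigned long index;
	bool locked;
	bool dirty;
};

static void toy_lock_page(struct toy_page *p)
{
	assert(!p->locked);	/* single-threaded model: lock must be free */
	p->locked = true;
}

static void toy_unlock_page(struct toy_page *p)
{
	assert(p->locked);
	p->locked = false;
}

/* Truncation detaches the page from its mapping under the page lock. */
void toy_truncate_page(struct toy_page *p)
{
	toy_lock_page(p);
	p->mapping = NULL;
	p->dirty = false;
	toy_unlock_page(p);
}

/* Returns how many pages would have been handed to writepage(). */
int toy_write_cache_pages(struct toy_mapping *mapping, struct toy_page *pages,
			  size_t n, unsigned long end, bool range_cyclic)
{
	int written = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct toy_page *p = &pages[i];

		toy_lock_page(p);
		/* Truncated while we waited for the lock: skip it. */
		if (p->mapping != mapping) {
			toy_unlock_page(p);
			continue;
		}
		/* Past the end of the requested range (non-cyclic only). */
		if (!range_cyclic && p->index > end) {
			toy_unlock_page(p);
			continue;
		}
		if (!p->dirty) {
			toy_unlock_page(p);
			continue;
		}
		/* Stand-in for ret = (*writepage)(page, wbc, data). */
		p->dirty = false;
		written++;
		toy_unlock_page(p);
	}
	return written;
}
```

In this model a truncated page never reaches the writepage() stand-in, whether or not range_cyclic is set, because the mapping check runs first and under the same lock that truncation takes.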

  You are right, write_cache_pages() is correct - I misunderstood what
'end' means.

However the do_invalidatepage() in block_write_full_page() looks
suspicious.  It calls invalidatepage(), but doesn't perform all the
other things needed for truncation.  Maybe there's a valid reason for
that, but I really don't have any idea what.

  Hmm, the fact is that in my tests I've seen writepage() being called on a
page which had its buffers removed. And because we attach buffers to a page
in page_mkwrite() and in write_begin(), I think we should not see such a
page. I've added more debug output to the code to verify that the page has
indeed been truncated, but so far I have not reproduced the problem again.

quoted

  2) I have the following problem with page_mkwrite() when blocksize <
pagesize. What we want to do is to fill in a potential hole under a page
somebody wants to write to. But consider the following scenario with a
filesystem with 1k blocksize:
  truncate("file", 1024);
  ptr = mmap("file");
  *ptr = 'a'
     -> page_mkwrite() is called.
        but "file" is only 1k large and we cannot really allocate blocks
        beyond end of file. So we allocate just one 1k block.
  truncate("file", 4096);
  *(ptr + 2048) = 'a'
     - nothing is called and later during writepage() time we are surprised
       we have a dirty page which is not backed by a filesystem block.
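For anyone who wants to replay the scenario above from userspace, a minimal sketch could look like the following. The function name run_mkwrite_scenario() and the 4096-byte mapping length are assumptions made for illustration. The program only performs the syscall sequence and checks that both stores reach the file; whether the second store really ends up on an unallocated block depends on the filesystem under test (blocksize smaller than the page size) and is not detected here.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define MAP_LEN 4096	/* assumed page size, as in the scenario */

/*
 * Replays the scenario on 'path'. Returns 0 if every call succeeded and
 * both stores are visible in the file afterwards, -1 otherwise.
 * The file is removed again before returning.
 */
int run_mkwrite_scenario(const char *path)
{
	char first = 0, second = 0;
	char *ptr = MAP_FAILED;
	int ret = -1;
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	/* truncate("file", 1024) */
	if (ftruncate(fd, 1024) < 0)
		goto out;

	/* ptr = mmap("file") */
	ptr = mmap(NULL, MAP_LEN, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (ptr == MAP_FAILED)
		goto out;

	/* First store: the write fault is where page_mkwrite() is called. */
	ptr[0] = 'a';

	/* truncate("file", 4096) */
	if (ftruncate(fd, 4096) < 0)
		goto out;

	/* Second store: the page is already writable, so no new fault. */
	ptr[2048] = 'a';

	/* Force writeback so that writepage() sees the dirty page. */
	if (msync(ptr, MAP_LEN, MS_SYNC) < 0)
		goto out;

	if (pread(fd, &first, 1, 0) != 1 || pread(fd, &second, 1, 2048) != 1)
		goto out;

	if (first == 'a' && second == 'a')
		ret = 0;
out:
	if (ptr != MAP_FAILED)
		munmap(ptr, MAP_LEN);
	close(fd);
	unlink(path);
	return ret;
}
```

Calling run_mkwrite_scenario() from a small main() with a path on the filesystem under test should be enough to exercise the sequence; any warnings would have to come from the filesystem's own debugging checks.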

  How to solve this? One idea I have here is that when we handle truncate(),
we mark the original last page (if it is partial) as read-only again so
that page_mkwrite() is called on the next write to it. Is something like
this possible? Pointers to code doing something similar are welcome, I don't
really know these things ;).

									Honza
-- 
Jan Kara [off-list ref]
SUSE Labs, CR
