Thread: 15 messages, 7 authors, 2007-02-12

Re: [RFC][PATCH 2/3] Move the file data to the new blocks

From: Jan Kara <jack@suse.cz>
Date: 2007-02-08 10:21:02
Also in: linux-fsdevel

On Thu 08-02-07 01:45:29, Andrew Morton wrote:
 <snip>
  I thought Andreas meant "any write changes" - i.e. you check that no one
has an open file descriptor for writing and block any new open for writing.
That can be done quite easily.
  Anyway, I agree with you that a userspace solution to a possible page
cache pollution is preferable after thinking about it for a while.
As I've been thinking about it, we could actually do the copying
from user space. We could do something like:
  block any writes to file (as I described above)
  craft new inode with blocks allocated as we want (using preallocation,
    we should mostly have the kernel infrastructure we need)
  copy data using splice syscall
  call the kernel to switch data
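
The "copy data using splice syscall" step above could look roughly like the
sketch below. It is only an illustration, not code from this thread: the
helper name splice_copy() is made up, and it assumes a Linux system where
splice(2) works between regular files and a pipe. splice() needs a pipe on
one side of every call, so the data travels file -> pipe -> file.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy `len` bytes from in_fd to out_fd with
 * splice(2), using the current file positions of both descriptors.
 * Returns 0 on success, -1 on error or premature end of file.
 */
static int splice_copy(int in_fd, int out_fd, size_t len)
{
	int pipefd[2];
	int ret = -1;

	assert(in_fd >= 0 && out_fd >= 0);
	if (pipe(pipefd) < 0)
		return -1;

	while (len > 0) {
		/* Stay within the default pipe capacity (64 KiB). */
		size_t chunk = len < 65536 ? len : 65536;
		ssize_t in = splice(in_fd, NULL, pipefd[1], NULL,
				    chunk, SPLICE_F_MOVE);

		if (in < 0)
			goto out;
		if (in == 0)
			break;	/* source ended early */

		/* Drain everything we just put into the pipe. */
		ssize_t left = in;
		while (left > 0) {
			ssize_t out = splice(pipefd[0], NULL, out_fd, NULL,
					     (size_t)left, SPLICE_F_MOVE);
			if (out <= 0)
				goto out;
			left -= out;
		}
		len -= (size_t)in;
	}
	ret = (len == 0) ? 0 : -1;
out:
	close(pipefd[0]);
	close(pipefd[1]);
	return ret;
}
```

A caller would open the source file and the freshly crafted destination,
then call splice_copy(src_fd, dst_fd, size). Blocking writers and the final
"switch data" call are not shown, since no such kernel interface is defined
in this thread.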
I don't think we need to block any writes to any file or anything.

To move a page within a file:

	fd = open(file);
	p = mmap(fd);
	the_page_was_in_core = mincore(p, offset);
	munmap(p);
	ioctl(fd, ..., new_block);

			<kernel>
			read_cache_page(inode, offset);
			lock_page(page);
			if (try_to_free_buffers(page)) {
				<relocate the page>
				set_page_dirty(page);
			}
			unlock_page(page);

	if (the_page_was_in_core) {
		sync_file_range(fd, offset, SYNC_FILE_RANGE_WAIT_BEFORE|
						SYNC_FILE_RANGE_WRITE|
						SYNC_FILE_RANGE_WAIT_AFTER);
		fadvise(fd, offset, FADV_DONTNEED);
	}

completely coherent with pagecache, quite safe in the presence of mmap,
mlock, O_DIRECT, everything else.  Also fully journallable in-kernel.
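
The user-space half of the sequence above can be sketched with real
syscalls. This is a hedged illustration only: page_in_core() and
flush_and_drop_page() are hypothetical names, the relocation ioctl itself is
left out because no such interface exists in this thread, and the code
assumes Linux with a page-aligned offset.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: report whether the page at `offset` is resident.
 * Returns 1 if resident, 0 if not, -1 on error.
 */
static int page_in_core(int fd, off_t offset)
{
	long psz = sysconf(_SC_PAGESIZE);
	unsigned char vec = 0;
	void *p;
	int ret;

	assert(offset % psz == 0);
	/* Map one page; mincore() inspects it without touching the data. */
	p = mmap(NULL, (size_t)psz, PROT_READ, MAP_SHARED, fd, offset);
	if (p == MAP_FAILED)
		return -1;
	ret = mincore(p, (size_t)psz, &vec);
	munmap(p, (size_t)psz);
	return ret < 0 ? -1 : (vec & 1);
}

/*
 * Hypothetical helper: write back one page, wait for the I/O, then ask
 * the kernel to drop the page from the page cache.
 * Returns 0 on success, -1 on error.
 */
static int flush_and_drop_page(int fd, off_t offset)
{
	long psz = sysconf(_SC_PAGESIZE);

	if (sync_file_range(fd, offset, psz,
			    SYNC_FILE_RANGE_WAIT_BEFORE |
			    SYNC_FILE_RANGE_WRITE |
			    SYNC_FILE_RANGE_WAIT_AFTER) < 0)
		return -1;
	/* posix_fadvise() returns an error number instead of setting errno. */
	return posix_fadvise(fd, offset, psz, POSIX_FADV_DONTNEED) == 0 ? 0 : -1;
}
```

Note that POSIX_FADV_DONTNEED is only a hint, so the page may still be
resident afterwards, depending on the filesystem and kernel.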
  Yes, this is the simple way. But I see two disadvantages:
1) You'd like to relocate metadata (indirect blocks) too. For that you need
   a different mechanism. In my approach, you can mostly assume you've got
   sanely laid out metadata and so the existence of such a mechanism is not
   so important.
2) You'd like to allocate new blocks in big chunks. So your kernel function
   should rather take a range. Also when you fail in the middle of
   relocating a file (for example the block you'd like to use is already
   taken by someone else), I find it nice if you can return at least to the
   original state. But that's probably not important.
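
To make point 2 concrete, here is a toy user-space model of a range-based
relocation that either moves the whole range or leaves everything as it
was. Nothing here is kernel code or an existing interface: struct toy_fs,
NBLOCKS and relocate_range() are invented for illustration, and the "blocks"
are just entries in an in-memory bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NBLOCKS 64

/* Toy model: one allocation flag per block. */
struct toy_fs {
	bool used[NBLOCKS];
};

/*
 * Move file blocks map[first .. first+count-1] to the contiguous range
 * starting at new_start. Returns 0 on success. On failure returns -1 and
 * leaves both the bitmap and the file's block map in the original state.
 */
static int relocate_range(struct toy_fs *fs, unsigned *map,
			  size_t first, size_t count, unsigned new_start)
{
	size_t i;

	assert(fs && map);
	if (count > NBLOCKS || new_start > NBLOCKS - count)
		return -1;

	/* Phase 1: claim every target block, undoing on the first conflict. */
	for (i = 0; i < count; i++) {
		if (fs->used[new_start + i]) {
			while (i-- > 0)
				fs->used[new_start + i] = false;
			return -1;
		}
		fs->used[new_start + i] = true;
	}

	/* Phase 2: cannot fail; switch the mapping and free the old blocks. */
	for (i = 0; i < count; i++) {
		fs->used[map[first + i]] = false;
		map[first + i] = new_start + (unsigned)i;
	}
	return 0;
}
```

The design choice is to do everything that can fail (claiming the target
blocks) before anything irreversible (switching the mapping), so a block
taken by someone else in the middle of the range costs only a cheap undo.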

								Honza

-- 
Jan Kara [off-list ref]
SuSE CR Labs