Thread (9 messages), 4 authors, 2011-01-12

Re: [PATCH v2 0/5] add new ioctls to do metadata readahead in btrfs

From: Wu Fengguang <hidden>
Date: 2011-01-11 01:38:13
Also in: linux-fsdevel

On Tue, Jan 11, 2011 at 08:15:19AM +0800, Li, Shaohua wrote:
On Mon, 2011-01-10 at 22:26 +0800, Wu, Fengguang wrote:
Shaohua,

On Tue, Jan 04, 2011 at 01:40:30PM +0800, Li, Shaohua wrote:
Hi,
  We have file readahead to do async file reads, but no metadata
readahead. For a list of files, their metadata is stored in fragmented
disk space and metadata read is a sync operation, which greatly hurts
the efficiency of readahead. The patches try to add metadata readahead
for btrfs.
  In btrfs, metadata is stored in btree_inode. Ideally, we would hook
the inode to an fd so we could use existing syscalls (readahead, mincore
or the upcoming fincore) to do readahead, but the inode is hidden and,
from my understanding, there is no easy way to do this. So we add two
ioctls for
If that is the main obstacle, why not do straightforward fincore()/
fadvise(), and add ioctls to btrfs to export/grab the hidden
btree_inode in any form?  This will address btrfs' specific issue, and
have the benefit of making the VFS part general enough. You know
ext2/3/4 already have block_dev ready for metadata readahead.
I forgot to update this comment. Please see patch 2 and patch 4: both
incore and readahead need btrfs-specific stuff involved, so we can't use
generic fincore or something.
You can if you like :)

- fincore() can return the referenced bit, which is generally
  useful information

- btrfs_metadata_readahead() can be passed to some (faked)
  ->readpages() for use with fadvise.

Thanks,
Fengguang
this. One is like the readahead syscall, the other is like the mincore/fincore
syscall.
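
For reference, here is a minimal sketch of what the existing syscalls
mentioned above look like when an ordinary fd is available. It is not
taken from the patches: the helper names prefetch_file() and
resident_pages() are hypothetical, and it uses posix_fadvise() with
POSIX_FADV_WILLNEED plus mincore() on a regular file. It only
illustrates the generic interface that cannot be pointed at btrfs's
hidden btree_inode, since no fd refers to it.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel to start asynchronous readahead
 * of the whole file. Returns 0 on success or an errno value. */
int prefetch_file(int fd)
{
    /* offset 0, len 0 means "from the start to the end of the file" */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
}

/* Hypothetical helper: count the pages of the file that are currently
 * resident in memory. Returns the count, or -1 on error. */
long resident_pages(int fd)
{
    struct stat st;
    if (fstat(fd, &st) < 0)
        return -1;
    if (st.st_size == 0)
        return 0;

    long page = sysconf(_SC_PAGESIZE);
    size_t len = (size_t)st.st_size;
    size_t npages = (len + (size_t)page - 1) / (size_t)page;

    /* mincore() works on a mapping, so map the file read-only first */
    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return -1;

    unsigned char *vec = malloc(npages);
    long count = -1;
    if (vec != NULL && mincore(map, len, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < npages; i++)
            count += vec[i] & 1;   /* low bit set: page is resident */
    }

    free(vec);
    munmap(map, len);
    return count;
}
```

A boot-time prefetcher could record which pages are resident with
something like resident_pages() after one boot, then replay that list
with prefetch_file() early in the next boot. The two ioctls proposed in
this series play the same two roles for metadata.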
  On a hard-disk-based netbook running MeeGo, metadata readahead cut
boot time by about 3.5s on average, out of a total of 16s.
  Last time I posted similar patches to the btrfs mailing list, which
added the new ioctls in btrfs-specific ioctl code. But Christoph Hellwig
asked that we have a generic interface for this so other filesystems can
share some code, so I came up with the new one. Comments and suggestions
are welcome!

v1->v2:
1. Added more comments and fixed return values as suggested by Andrew Morton
2. Fixed a race condition pointed out by Yan Zheng

initial post:
http://marc.info/?l=linux-fsdevel&m=129222493406353&w=2

Thanks,
Shaohua
