Thread: 3 messages, 2 authors, 2012-06-21

Re: [RFC] ext4: add an io-tree to track block allocation

From: Zheng Liu <hidden>
Date: 2012-06-21 11:48:58
Also in: linux-ext4

On Thu, Jun 21, 2012 at 07:04:31PM +0800, Yongqiang Yang wrote:
On Thu, Jun 21, 2012 at 5:46 PM, Zheng Liu [off-list ref] wrote:
quoted
Hi all,

At this year's ext4 workshop a new idea called io-tree was proposed to
solve some problems in ext4 [1].  I summarize here the problems that
io-tree needs to solve:
1. reserve quota calculation in bigalloc
2. simplify punch hole implementation
3. simplify fiemap implementation
4. SEEK_DATA/HOLE implementation
Actually, we can accelerate
ext4_da_write_cache_pages by looking up the extent status tree rather
than the page cache.  This is one of the aims of the original patch set.
Thanks for the feedback.  I will add it to my TODO list.
quoted
Meanwhile, with io-tree, some code can be improved as follows:
1. accelerate get_block functions
2. simplify uninitialized extent conversion
3. fine granularity locking (extent lock)

I have made a plan to implement io-tree, divided into three steps.
I describe it in detail below.

* Step 1
The following problems will be solved in this step:
1. reserve quota calculation in bigalloc
2. simplify punch hole implementation
3. simplify fiemap implementation
4. SEEK_DATA/HOLE implementation
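
To make item 4 concrete, here is a minimal sketch of how SEEK_DATA and
SEEK_HOLE could be answered from a sorted list of tracked extents.  It is
not taken from the patch set: struct data_extent, seek_data() and
seek_hole() are hypothetical names, offsets are in blocks, a plain array
stands in for the tree, and file size handling (ENXIO, the implicit hole
at EOF) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: data occupies blocks [start, start + len). */
struct data_extent {
	uint64_t start;
	uint64_t len;
};

/*
 * First block >= pos that holds data, or -1 if there is none.
 * ext[] must be sorted by start and non-overlapping.
 */
static int64_t seek_data(const struct data_extent *ext, size_t n, uint64_t pos)
{
	assert(ext != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (pos < ext[i].start)
			return (int64_t)ext[i].start;	/* next data starts here */
		if (pos < ext[i].start + ext[i].len)
			return (int64_t)pos;		/* pos is already in data */
	}
	return -1;
}

/*
 * First block >= pos that is not covered by any extent.
 * Adjacent, unmerged extents are skipped as one run of data.
 */
static uint64_t seek_hole(const struct data_extent *ext, size_t n, uint64_t pos)
{
	assert(ext != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (pos < ext[i].start)
			break;				/* pos is in a hole */
		if (pos < ext[i].start + ext[i].len)
			pos = ext[i].start + ext[i].len; /* skip this extent */
	}
	return pos;
}
```

With extents {0,4}, {4,4} and {16,8}, seek_hole() from block 2 skips both
adjacent extents and stops at block 8, and seek_data() from block 8 jumps
to block 16.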

Currently a patch set has been submitted to the mailing list by
Yongqiang and Allison, which is called status extent tree, and it
simplifies the fiemap implementation.  But it only works when delayed
As I remember, reserving quota for bigalloc is also resolved in the
original patch set.  Was it sent out?  If not, I can send the patch
to you if you need it :-)
I think that this patch is 'ext4: reimplement
ext4_find_delay_alloc_range on status extent tree'.  Right?
quoted
allocation is enabled.  I will pick up this work.  Now I have rebased
this patch set to 3.5-rc3, and renamed it to extent status tree as
Darrick advised.

Next I will try to solve the above problems and make it run in
nodelalloc mode.

* Step 2
To be improved:
1. accelerate get_block functions
2. simplify uninitialized extent conversion
IMHO ext4_da_write_cache_pages can be improved in this step.

Yongqiang.
quoted
For the above improvements, a status member will be added to the extent
status tree to indicate the current status of each extent.  I think that
the status includes delalloc, allocated, uninit, and hole.  Then we can
let the get_block functions look up the extent status tree first to
accelerate get_block.  Meanwhile, uninitialized extent conversion can be
modified to reduce lock contention on i_mutex.
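
As an illustration of the status member, here is a minimal sketch of a
status lookup that a get_block function could consult before walking the
on-disk extent tree.  It is only a sketch under assumptions: the names
es_status, es_extent and es_lookup() are hypothetical, and a sorted array
with binary search stands in for the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status values, following the four states listed above. */
enum es_status { ES_HOLE, ES_DELALLOC, ES_ALLOCATED, ES_UNINIT };

/* Hypothetical extent record: logical blocks [start, start + len). */
struct es_extent {
	uint32_t start;
	uint32_t len;
	enum es_status status;
};

/*
 * Return the status of logical block lblk.  ext[] must be sorted by
 * start and non-overlapping.  Blocks not covered by any extent are
 * reported as holes (an assumption of this sketch).
 */
static enum es_status es_lookup(const struct es_extent *ext, size_t n,
				uint32_t lblk)
{
	size_t lo = 0, hi = n;

	assert(ext != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (lblk < ext[mid].start)
			hi = mid;		/* look in the left half */
		else if (lblk - ext[mid].start >= ext[mid].len)
			lo = mid + 1;		/* look in the right half */
		else
			return ext[mid].status;	/* lblk is inside ext[mid] */
	}
	return ES_HOLE;
}
```

A caller would fall back to the on-disk extent tree only when the cached
status is not enough to map the block.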

* Step 3
To be done:
1. fine granularity locking (extent lock)

Currently ext4 performs some operations while holding i_mutex.  After
adding the extent status tree, we can avoid taking this lock as much as
possible.  It seems that a new member needs to be added to indicate the
type of locking.  We can take a range lock in shared or exclusive mode,
and, when a range is locked, it cannot be merged by other processes or
with extent locks of other types.
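
A minimal sketch of the conflict rule such a range lock might use,
assuming ordinary reader/writer semantics (the RFC does not fix the exact
rules): two requests conflict only if their ranges overlap and at least
one of them is exclusive.  The names rl_mode, range_lock and
rl_conflicts() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum rl_mode { RL_SHARED, RL_EXCLUSIVE };

/* Hypothetical lock request covering blocks [start, end], end inclusive. */
struct range_lock {
	uint64_t start;
	uint64_t end;
	enum rl_mode mode;
};

/* True if the two block ranges share at least one block. */
static bool rl_overlaps(const struct range_lock *a, const struct range_lock *b)
{
	return a->start <= b->end && b->start <= a->end;
}

/*
 * True if req must wait for held: the ranges overlap and at least one
 * side wants exclusive access.  Shared locks never conflict with each
 * other, and disjoint ranges never conflict at all.
 */
static bool rl_conflicts(const struct range_lock *held,
			 const struct range_lock *req)
{
	assert(held->start <= held->end && req->start <= req->end);
	return rl_overlaps(held, req) &&
	       (held->mode == RL_EXCLUSIVE || req->mode == RL_EXCLUSIVE);
}
```

Under this rule, writers to disjoint ranges of one file can proceed in
parallel, which a single per-inode lock does not allow.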

Dave Chinner said that a range lock could perhaps be used in xfs too.  So
I will try to make the extent locking as generic as possible after
step 3.

Please review this RFC; any feedback is appreciated.  Thanks.

In addition, I remember that at the ext4 workshop Ted mentioned that a big
extent tree has been implemented to improve the extent cache.  So we need
to consider whether to merge the big extent tree and io-tree after
both have been done.

1. http://www.spinics.net/lists/linux-ext4/msg31742.html

Regards,
Zheng


-- 
Best Wishes
Yongqiang Yang
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html