Thread (25 messages) 25 messages, 4 authors, 2012-09-23

Re: [PATCH 2/5] Btrfs: fix trans block rsv regression

From: David Sterba <hidden>
Date: 2012-09-14 12:07:20

On Fri, Sep 14, 2012 at 07:25:45PM +0800, Liu Bo wrote:
On 09/14/2012 07:15 PM, David Sterba wrote:
quoted
On Fri, Sep 14, 2012 at 04:58:04PM +0800, Liu Bo wrote:
quoted
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -306,9 +306,17 @@ static struct btrfs_trans_handle *start_transaction(struct btrfs_root *root,
 		WARN_ON(type != TRANS_JOIN && type != TRANS_JOIN_NOLOCK &&
 			type != TRANS_JOIN_ONLY);
 		h = current->journal_info;
-		h->use_count++;
-		h->orig_rsv = h->block_rsv;
+		if (h->block_rsv) {
+			struct btrfs_trans_rsv_item *item;
+			item = kmalloc(sizeof(*item), GFP_NOFS);
I'd rather avoid the kmalloc here and add a list hook into
btrfs_block_rsv itself (used only for this purpose).

It also does not increase the failure surface and we don't have to
handle error conditions from deep callchains.
Actually I placed a list hook at first, but I found the same block_rsv could be inserted into
the list chain twice, which will cause list_head's terrible warnings.
Is it expected and correct to add the rsv twice? I know the transaction
joins and commits are sometimes wildly nested so it does not mean that's
necessarily a bug. Then it is not possible to embed the list hook, we
need to keep the full track of the blk_rsv stack.

I'm counting two other similar structs used inside btrfs
(list_head + 8B value), we could make a shared separate slab for them.

struct delayed_iput {
        struct list_head           list;                 /*     0    16 */
        struct inode *             inode;                /*    16     8 */

        /* size: 24, cachelines: 1, members: 2 */
        /* last cacheline: 24 bytes */
};

truct seq_list {
        struct list_head           list;                 /*     0    16 */
        u64                        seq;                  /*    16     8 */

        /* size: 24, cachelines: 1, members: 2 */
        /* last cacheline: 24 bytes */
};

seq_list is never allocated, only embedded inside tree_mod_log structures.

delayed_iput is allocated on every add_delayed_iput and freed quite
frequently (once per-commit at least), so this is a short-lived object
as well as the proposed btrfs_trans_rsv_item. IMO this justifies using
the slab. Does this sound good to you?


david
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help