Thread: 68 messages, 4 authors, 2021-01-27

Re: [PATCH v4 04/18] btrfs: make attach_extent_buffer_page() to handle subpage case

From: Josef Bacik <josef@toxicpanda.com>
Date: 2021-01-20 14:52:50

On 1/19/21 7:27 PM, Qu Wenruo wrote:

On 2021/1/20 5:54 AM, Josef Bacik wrote:
On 1/16/21 2:15 AM, Qu Wenruo wrote:
For subpage case, we need to allocate new memory for each metadata page.

So we need to:
- Allow attach_extent_buffer_page() to return int
   To indicate allocation failure

- Prealloc btrfs_subpage structure for alloc_extent_buffer()
   We don't want to allocate memory with a spinlock held, so
   do the preallocation before we acquire mapping->private_lock.

- Handle subpage and regular case differently in
   attach_extent_buffer_page()
   For regular case, just do the usual thing.
   For subpage case, allocate new memory or use the preallocated memory.

For future subpage metadata, we will make more use of the radix tree to
grab extent buffers.

Signed-off-by: Qu Wenruo <redacted>
---
  fs/btrfs/extent_io.c | 75 ++++++++++++++++++++++++++++++++++++++------
  fs/btrfs/subpage.h   | 17 ++++++++++
  2 files changed, 82 insertions(+), 10 deletions(-)
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index a816ba4a8537..320731487ac0 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -24,6 +24,7 @@
  #include "rcu-string.h"
  #include "backref.h"
  #include "disk-io.h"
+#include "subpage.h"
  static struct kmem_cache *extent_state_cache;
  static struct kmem_cache *extent_buffer_cache;
@@ -3140,9 +3141,13 @@ static int submit_extent_page(unsigned int opf,
      return ret;
  }
-static void attach_extent_buffer_page(struct extent_buffer *eb,
-                      struct page *page)
+static int attach_extent_buffer_page(struct extent_buffer *eb,
+                      struct page *page,
+                      struct btrfs_subpage *prealloc)
  {
+    struct btrfs_fs_info *fs_info = eb->fs_info;
+    int ret;
int ret = 0;
+
      /*
       * If the page is mapped to btree inode, we should hold the private
       * lock to prevent race.
@@ -3152,10 +3157,32 @@ static void attach_extent_buffer_page(struct extent_buffer *eb,
      if (page->mapping)
          lockdep_assert_held(&page->mapping->private_lock);
-    if (!PagePrivate(page))
-        attach_page_private(page, eb);
-    else
-        WARN_ON(page->private != (unsigned long)eb);
+    if (fs_info->sectorsize == PAGE_SIZE) {
+        if (!PagePrivate(page))
+            attach_page_private(page, eb);
+        else
+            WARN_ON(page->private != (unsigned long)eb);
+        return 0;
+    }
+
+    /* Already mapped, just free prealloc */
+    if (PagePrivate(page)) {
+        kfree(prealloc);
+        return 0;
+    }
+
+    if (prealloc) {
+        /* Has preallocated memory for subpage */
+        spin_lock_init(&prealloc->lock);
+        attach_page_private(page, prealloc);
+    } else {
+        /* Do new allocation to attach subpage */
+        ret = btrfs_attach_subpage(fs_info, page);
+        if (ret < 0)
+            return ret;
Delete the above 2 lines.
+    }
+
+    return 0;
return ret;
  }
  void set_page_extent_mapped(struct page *page)
@@ -5062,21 +5089,29 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src)
      if (new == NULL)
          return NULL;
+    set_bit(EXTENT_BUFFER_UPTODATE, &new->bflags);
+    set_bit(EXTENT_BUFFER_UNMAPPED, &new->bflags);
+
Why are you doing this here?  It seems unrelated?  Looking at the code it 
appears there's a reason for this later, but I had to go look to make sure I 
wasn't crazy, so at the very least it needs to be done in a more relevant patch.
This is to handle the case where we allocated a page but failed to allocate
the subpage structure.

In that case, btrfs_release_extent_buffer() will take a different path to free
the eb.

Without the UNMAPPED bit, it goes wrong because it doesn't know it's an
unmapped eb.

This change is mostly due to the extra failure pattern introduced by the subpage
memory allocation.
Yes, but my point is it's unrelated to this change, and in fact the problem 
exists outside of your changes, so it needs to be addressed in its own patch 
with its own changelog.
      for (i = 0; i < num_pages; i++) {
+        int ret;
+
          p = alloc_page(GFP_NOFS);
          if (!p) {
              btrfs_release_extent_buffer(new);
              return NULL;
          }
-        attach_extent_buffer_page(new, p);
+        ret = attach_extent_buffer_page(new, p, NULL);
+        if (ret < 0) {
+            put_page(p);
+            btrfs_release_extent_buffer(new);
+            return NULL;
+        }
          WARN_ON(PageDirty(p));
          SetPageUptodate(p);
          new->pages[i] = p;
          copy_page(page_address(p), page_address(src->pages[i]));
      }
-    set_bit(EXTENT_BUFFER_UPTODATE, &new->bflags);
-    set_bit(EXTENT_BUFFER_UNMAPPED, &new->bflags);
      return new;
  }
@@ -5308,12 +5343,28 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
      num_pages = num_extent_pages(eb);
      for (i = 0; i < num_pages; i++, index++) {
+        struct btrfs_subpage *prealloc = NULL;
+
          p = find_or_create_page(mapping, index, GFP_NOFS|__GFP_NOFAIL);
          if (!p) {
              exists = ERR_PTR(-ENOMEM);
              goto free_eb;
          }
+        /*
+         * Preallocate page->private for subpage case, so that
+         * we won't allocate memory with private_lock held.
+         * The memory will be freed by attach_extent_buffer_page() or
+         * freed manually if exit earlier.
+         */
+        ret = btrfs_alloc_subpage(fs_info, &prealloc);
+        if (ret < 0) {
+            unlock_page(p);
+            put_page(p);
+            exists = ERR_PTR(ret);
+            goto free_eb;
+        }
+
I realize that for subpage sectorsize we'll only have 1 page, but I'd still 
rather see this outside of the for loop, just for clarity's sake.
This is the trade-off.
Either we do everything separately, sharing the minimal amount of code (and
needing an extra for loop for future 16K pages), or we use the same loop and
sacrifice a little readability.

Here I'd say sharing more code is not that big a deal.
It's not a tradeoff, it's confusing.  What I'm suggesting is you do

ret = btrfs_alloc_subpage(fs_info, &prealloc);
if (ret) {
	exists = ERR_PTR(ret);
	goto free_eb;
}
for (i = 0; i < num_pages; i++, index++) {
	/* page lookup and attach, as before */
}

free_eb:
	kfree(prealloc);

The subpage portion is part of the eb itself, and there's one per eb, and thus 
should be pre-allocated outside of the loop that is doing the page lookup, as 
it's logically a different thing.  Thanks,

Josef
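
The pattern under discussion (allocate once before the page loop, never allocate while the lock is held, free whatever was not consumed on exit) can be illustrated with a small userspace sketch. Everything here is a hypothetical stand-in and not the kernel code: a pthread mutex plays the role of mapping->private_lock, calloc()/free() play the role of btrfs_alloc_subpage()/kfree(), and struct page, struct subpage and setup_buffer() are simplified names invented for this example. It assumes, as noted above, that the subpage case has a single page per buffer, so only the first page without private data receives the preallocated structure.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
struct subpage { int initialized; };
struct page { void *private_data; };

/* Plays the role of mapping->private_lock in this model. */
static pthread_mutex_t private_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Allocate the per-buffer structure once, before the page loop and
 * before taking the lock, then attach it under the lock.
 * Returns 0 on success, -1 if the preallocation fails.
 */
static int setup_buffer(struct page *pages, int num_pages)
{
    /* Preallocate outside the lock: allocation may block. */
    struct subpage *prealloc = calloc(1, sizeof(*prealloc));
    int i;

    if (!prealloc)
        return -1;

    for (i = 0; i < num_pages; i++) {
        pthread_mutex_lock(&private_lock);
        if (prealloc && !pages[i].private_data) {
            /* Page has nothing attached yet: hand over the prealloc. */
            prealloc->initialized = 1;
            pages[i].private_data = prealloc;
            prealloc = NULL; /* ownership moved to the page */
        }
        pthread_mutex_unlock(&private_lock);
    }

    /* Single exit point: frees the prealloc only if it was not consumed. */
    free(prealloc);
    return 0;
}
```

Calling setup_buffer() on a page that already has private data leaves that data untouched and releases the unused preallocation at the single exit point, which corresponds to the free_eb label in the outline above.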