Thread: 14 messages, 4 authors, 2021-06-16

Re: [PATCH 3/4] fs: add a filemap_fdatawrite_wbc helper

From: Josef Bacik <josef@toxicpanda.com>
Date: 2021-06-15 13:38:34

On 6/14/21 6:16 AM, Nikolay Borisov wrote:

On 11.06.21 16:53, Josef Bacik wrote:
Btrfs sometimes needs to flush dirty pages on a bunch of dirty inodes in
order to reclaim metadata reservations.  Unfortunately most helpers in
this area are too smart for us:

1) The normal filemap_fdata* helpers only take range and sync modes, and
    don't give any indication of how much was written, so we can only
    flush full inodes, which isn't what we want in most cases.
2) The normal writeback path requires us to have the s_umount sem held,
    but we can't unconditionally take it in this path because we could
    deadlock.
3) The normal writeback path also skips inodes with I_SYNC set if we
    write with WB_SYNC_NONE.  This isn't the behavior we want under heavy
    ENOSPC pressure; we want to actually make sure the pages are under
    writeback before returning, and if another thread is in the middle of
    writing the file we may return before they're under writeback and
    miss our ordered extents and not properly wait for completion.
4) sync_inode() uses the normal writeback path and has the same problem
    as #3.

What we really want is to call do_writepages() with our wbc.  This way
we can make sure that writeback is actually started on the pages, and we
can control how many pages are written as a whole as we write many
inodes using the same wbc.  Accomplish this with a new helper that does
just that so we can use it for our ENOSPC flushing infrastructure.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
  include/linux/fs.h |  2 ++
  mm/filemap.c       | 29 ++++++++++++++++++++++++-----
  2 files changed, 26 insertions(+), 5 deletions(-)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c3c88fdb9b2a..aace07f88b73 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2886,6 +2886,8 @@ extern int filemap_fdatawrite_range(struct address_space *mapping,
  				loff_t start, loff_t end);
  extern int filemap_check_errors(struct address_space *mapping);
  extern void __filemap_set_wb_err(struct address_space *mapping, int err);
+extern int filemap_fdatawrite_wbc(struct address_space *mapping,
+				  struct writeback_control *wbc);
  
  static inline int filemap_write_and_wait(struct address_space *mapping)
  {
diff --git a/mm/filemap.c b/mm/filemap.c
index 66f7e9fdfbc4..0408bc247e71 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -376,6 +376,29 @@ static int filemap_check_and_keep_errors(struct address_space *mapping)
  		return -ENOSPC;
  	return 0;
  }
+/**
+ * filemap_fdatawrite_wbc - start writeback on mapping dirty pages in range
+ * @mapping:	address space structure to write
+ * @wbc:	the writeback_control controlling the writeout
+ *
+ * This behaves the same way as __filemap_fdatawrite_range, but simply takes the

That's not true, because __filemap_fdatawrite_range will only issue
writeback in case of PAGECACHE_TAG_DIRTY && the inode's bdi having
BDI_CAP_WRITEBACK. So I think those checks should also be moved to
fdatawrite_wbc.

Yeah I'll move those into _wbc
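
To make the agreed change concrete, here is a minimal userspace sketch of a helper that performs both guard checks itself before delegating. Every name below (`mock_mapping`, `mock_wbc`, `mock_do_writepages`, `mock_fdatawrite_wbc`) is a hypothetical stand-in and not kernel code; the two boolean fields only model the BDI_CAP_WRITEBACK and PAGECACHE_TAG_DIRTY conditions mentioned in the review.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct mock_wbc {
	long nr_to_write;	/* page budget, passed through untouched here */
};

struct mock_mapping {
	bool can_writeback;	/* models the bdi having BDI_CAP_WRITEBACK */
	bool tagged_dirty;	/* models PAGECACHE_TAG_DIRTY being set */
	int writepages_calls;	/* how often the mock writepages step ran */
};

/* Models do_writepages(): record the call and clear the dirty tag. */
static int mock_do_writepages(struct mock_mapping *m, struct mock_wbc *wbc)
{
	(void)wbc;		/* the budget is not modeled in this sketch */
	m->writepages_calls++;
	m->tagged_dirty = false;
	return 0;
}

/* The helper owns the guard checks, so every caller gets them. */
static int mock_fdatawrite_wbc(struct mock_mapping *m, struct mock_wbc *wbc)
{
	if (!m->can_writeback || !m->tagged_dirty)
		return 0;	/* nothing to write back, not an error */
	return mock_do_writepages(m, wbc);
}
```

With the checks inside the helper, a caller that builds its own wbc cannot accidentally skip them, which is the concern raised above.
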

In fact what would be good for readability, since we have a bunch of
__filemap_fdatawrite functions, is to have each one call your newly
introduced helper and have their body simply set up the correct
writeback_control structure. As it stands, one has to chase up
to 3-4 levels of (admittedly very short) functions, i.e.

filemap_fdatawrite->__filemap_fdatawrite->__filemap_fdatawrite_range->filemap_fdatawrite_wbc

which is somewhat annoying. Instead I propose having

filemap_fdatawrite->filemap_fdatawrite_wbc
filemap_flush->filemap_fdatawrite_wbc etc...
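
The flattened call structure proposed here could look roughly like the userspace sketch below, where each entry point only fills in a control structure and calls the helper directly. All `toy_` names are hypothetical stand-ins, not the kernel's definitions, and the helper merely records what it was handed so the wiring can be inspected.

```c
#include <limits.h>

/* Hypothetical stand-ins for the kernel's sync modes and writeback_control. */
enum toy_sync_mode { TOY_SYNC_NONE, TOY_SYNC_ALL };

struct toy_wbc {
	enum toy_sync_mode sync_mode;
	long nr_to_write;
	long long range_start;
	long long range_end;
};

struct toy_mapping {
	struct toy_wbc last;	/* the wbc most recently handed to the helper */
	int calls;		/* number of helper invocations */
};

/* Stand-in for the new helper: just record the wbc it received. */
static int toy_fdatawrite_wbc(struct toy_mapping *m, struct toy_wbc *wbc)
{
	m->last = *wbc;
	m->calls++;
	return 0;
}

/* Each wrapper is one level deep: build the wbc, call the helper. */
static int toy_fdatawrite_range(struct toy_mapping *m,
				long long start, long long end)
{
	struct toy_wbc wbc = {
		.sync_mode = TOY_SYNC_ALL,
		.nr_to_write = LONG_MAX,
		.range_start = start,
		.range_end = end,
	};
	return toy_fdatawrite_wbc(m, &wbc);
}

static int toy_fdatawrite(struct toy_mapping *m)
{
	return toy_fdatawrite_range(m, 0, LLONG_MAX);
}

static int toy_flush(struct toy_mapping *m)
{
	struct toy_wbc wbc = {
		.sync_mode = TOY_SYNC_NONE,	/* do not wait on busy pages */
		.nr_to_write = LONG_MAX,
		.range_start = 0,
		.range_end = LLONG_MAX,
	};
	return toy_fdatawrite_wbc(m, &wbc);
}
```

In this sketch `toy_fdatawrite` still goes through the range variant for brevity; the fully flattened form would have it build its own wbc the way `toy_flush` does.
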

Yeah I'd like to clean this up at some point, but that's outside the scope of
this patch.  I want to get a helper in without needing to run it by everybody,
we can take up cleaning this up at a later point with input from everybody else.
Thanks,

Josef
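
For readers following the motivation in the commit message at the top of this thread, the idea of writing many inodes against a single wbc can be modeled in userspace as below. This is only a sketch under stated assumptions: the `demo_` types and functions are hypothetical, the page budget is assumed to be non-negative, and the real btrfs ENOSPC flushing code is not shown in this thread.

```c
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's writeback_control or inode. */
struct demo_wbc {
	long nr_to_write;	/* shared page budget, counts down as pages go out */
};

struct demo_inode {
	long dirty_pages;
};

/* Models one writeback call: write what the remaining budget allows. */
static void demo_writepages(struct demo_inode *inode, struct demo_wbc *wbc)
{
	long n = inode->dirty_pages < wbc->nr_to_write ?
		 inode->dirty_pages : wbc->nr_to_write;

	inode->dirty_pages -= n;
	wbc->nr_to_write -= n;
}

/*
 * Flush inodes in order until the shared budget is used up.
 * Returns the number of pages written across all inodes.
 * Assumes nr_pages >= 0.
 */
static long demo_flush_inodes(struct demo_inode *inodes, size_t count,
			      long nr_pages)
{
	struct demo_wbc wbc = { .nr_to_write = nr_pages };

	for (size_t i = 0; i < count && wbc.nr_to_write > 0; i++)
		demo_writepages(&inodes[i], &wbc);

	return nr_pages - wbc.nr_to_write;
}
```

Because the budget lives in the shared structure, the caller learns how much was written in total and can stop partway through an inode, which the range-and-sync-mode helpers described in point 1 do not allow.
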