Thread (64 messages) 64 messages, 4 authors, 2021-12-16

Re: [PATCH v2 02/28] mm: Add functions to zero portions of a folio

From: Matthew Wilcox <willy@infradead.org>
Date: 2021-11-18 20:08:54
Also in: linux-fsdevel, linux-xfs, lkml

On Thu, Nov 18, 2021 at 09:26:15AM -0800, Darrick J. Wong wrote:
On Thu, Nov 18, 2021 at 03:55:12PM +0000, Matthew Wilcox wrote:
quoted
                        if (buffer_new(bh)) {
...
                                        folio_zero_segments(folio,
                                                to, block_end,
                                                block_start, from);

("zero between block_start and block_end, except for the region
specified by 'from' and 'to'").  Except that for some reason the
ranges are specified backwards, so it's not obvious what's going on.
Converting that to folio_zero_ranges() would be a possibility, at the
expense of complexity in the caller, or using 'max' instead of 'end'
would also add complexity to the callers.
The call above looks like it is preparing to copy some data into the
middle of a buffer by zero-initializing the bytes before and the bytes
after that middle region.

Admittedly my fs-addled brain actually finds this hot mess easier to
understand:

folio_zero_segments(folio, to, blocksize - 1, block_start, from - 1);

but I suppose the xend method involves less subtraction everywhere.
That's exactly what it's doing.  It's kind of funny because it's an
abstraction that permits a micro-optimisation (removing potentially one
kmap() call), but removes the opportunity for a larger optimisation
(removing several, and also removing calls to flush_dcache_folio).
That is, we could rewrite __block_write_begin_int() as:

static void *kremap_folio(void *kaddr, struct folio *folio)
{
	if (kaddr)
		return kaddr;
	/* buffer heads only support single page folios */
	return kmap_local_folio(folio, 0);
}

+       void *kaddr = NULL;
...
-                               if (block_end > to || block_start < from)
-                                       folio_zero_segments(folio,
-                                               to, block_end,
-                                               block_start, from);
+                               if (from > block_start) {
+                                       kaddr = kremap_folio(kaddr, folio);
+                                       memset(kaddr + block_start, 0,
+                                               block_start - from);
+                               }
+                               if (block_end > to) {
+                                       kaddr = kremap_folio(kaddr, folio);
+                                       memset(kaddr + to, 0, block_end - to);
+                               }
...
        }
+       if (kaddr) {
+               kunmap_local(kaddr);
+               flush_dcache_folio(folio);
+       }

That way if there are multiple unmapped+new buffers, we only kmap/kunmap
once per page.  I don't care to submit this as a patch though ... buffer
heads just need to go away.  iomap can't use an optimisation like this;
it already reports all the contiguous unmapped blocks as a single extent,
and if you have multiple unmapped extents per page, well ... I'm sorry
for you, but the overhead of kmap/kunmap is the least of your problems.
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Thanks.  Pushed to
https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/for-next

I'll give that until Monday to soak and send a pull request.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help