Thread: 11 messages, 2 authors, 2021-10-25

Re: [RFC PATCH 0/5] Shared memory for shared extents

From: Matthew Wilcox <willy@infradead.org>
Date: 2021-10-25 15:45:14
Also in: linux-fsdevel

On Mon, Oct 25, 2021 at 09:53:01AM -0500, Goldwyn Rodrigues wrote:
> On  2:43 23/10, Matthew Wilcox wrote:
> > On Fri, Oct 22, 2021 at 03:15:00PM -0500, Goldwyn Rodrigues wrote:
> > > This is an attempt to reduce the memory footprint by using a shared
> > > page(s) for shared extent(s) in the filesystem. I am hoping to start a
> > > discussion to iron out the details for implementation.
> >
> > When you say "Shared extents", you mean reflinks, which are COW, right?
>
> Yes, shared extents are extents which are shared on disk by two or more
> files. Yes, same as reflinks. Just to explain with an example:
>
> If two files, f1 and f2 have shared extent(s), and both files are read. Each
> file's mapping->i_pages will hold a copy of the contents of the shared
> extent on disk. So, f1->mapping will have one copy and f2->mapping will
> have another copy.
>
> For reads (and only reads), if we use underlying device's mapping, we
> can save on duplicate copy of the pages.

Yes; I'm familiar with the problem.  Dave Chinner and I had a great
discussion about it at LCA a couple of years ago.

The implementation I've had in mind for a while is that the filesystem
either creates a separate inode for a shared extent, or (as you've
done here) uses the bdev's inode.  We can discuss the pros/cons of
that separately.

To avoid the double-lookup problem, I was intending to generalise DAX
entries into PFN entries.  That way, if the read() (or mmap read fault)
misses in the inode's cache, we can look up the shared extent cache,
and then cache the physical address of the memory in the inode.
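
A toy userspace sketch of that lookup path may help make it concrete. It
is not kernel code: SharedExtentCache, Inode, read_page() and the fake
PFN numbers are all hypothetical names invented for illustration, and a
plain dict stands in for each mapping's i_pages. The point it shows is
that the second lookup happens only on a miss, and that two files end up
pointing at one page.

```python
# Toy model only: every name here is hypothetical, not a kernel API.
class SharedExtentCache:
    """Stands in for the shared-extent (or bdev) inode's page cache."""

    def __init__(self):
        self.pages = {}        # disk block -> fake PFN
        self.next_pfn = 1000   # arbitrary starting value for fake PFNs
        self.disk_reads = 0    # how many times we had to "read the disk"

    def get_pfn(self, block):
        if block not in self.pages:
            self.disk_reads += 1
            self.pages[block] = self.next_pfn
            self.next_pfn += 1
        return self.pages[block]


class Inode:
    def __init__(self, extent_map, shared):
        self.extent_map = extent_map  # file page index -> disk block
        self.shared = shared
        self.i_pages = {}             # file page index -> cached PFN entry

    def read_page(self, index):
        pfn = self.i_pages.get(index)
        if pfn is not None:
            return pfn                # hit: one lookup, no second cache
        # Miss: consult the shared extent cache, then remember the PFN.
        pfn = self.shared.get_pfn(self.extent_map[index])
        self.i_pages[index] = pfn
        return pfn


shared = SharedExtentCache()
f1 = Inode({0: 42}, shared)   # page 0 of f1 lives in disk block 42
f2 = Inode({5: 42}, shared)   # page 5 of f2 shares the same block
pfn1 = f1.read_page(0)
pfn2 = f2.read_page(5)
```

Both reads resolve to the same fake PFN after a single simulated disk
read, and a repeated f1.read_page(0) is served from f1's own i_pages.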

That makes reclaim/eviction of the page in the shared extent more
expensive because you have to iterate all the inodes which share the
extent and remove the PFN entries before the page can be reused.
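
The eviction cost can be sketched the same way. Again this is a toy
model with hypothetical names (SharedExtentCache, install(), evict()),
and it assumes the shared cache keeps a list of back-references to the
sharing inodes, which is one possible way to find them rather than a
settled design. The work done by evict() grows with the number of
sharers.

```python
from collections import defaultdict


# Toy model only: every name here is hypothetical, not a kernel API.
class SharedExtentCache:
    def __init__(self):
        self.pfn_of = {}                   # disk block -> fake PFN
        self.sharers = defaultdict(list)   # block -> [(i_pages, index)]

    def install(self, block, pfn, i_pages, index):
        """Cache the PFN in one inode and record the back-reference."""
        self.pfn_of[block] = pfn
        i_pages[index] = pfn
        self.sharers[block].append((i_pages, index))

    def evict(self, block):
        """Drop every inode's PFN entry before the page can be reused."""
        visited = 0
        for i_pages, index in self.sharers.pop(block, []):
            i_pages.pop(index, None)
            visited += 1
        self.pfn_of.pop(block, None)
        return visited


cache = SharedExtentCache()
inode_caches = [{} for _ in range(3)]      # three files sharing block 42
for n, i_pages in enumerate(inode_caches):
    cache.install(block=42, pfn=1000, i_pages=i_pages, index=n)
visited = cache.evict(42)
```

With three sharers, evict() has to touch three per-inode caches before
the page is free, where an unshared page would need one.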

Perhaps we should have a Zoom meeting about this before producing duelling
patch series?  I can host if you're interested.