Thread (13 messages) 13 messages, 4 authors, 2025-04-02

Re: [PATCH v3] mm/filemap: Allow arch to request folio size for exec memory

From: Matthew Wilcox <willy@infradead.org>
Date: 2025-03-28 19:14:34
Also in: linux-fsdevel, linux-mm, lkml

On Thu, Mar 27, 2025 at 04:23:14PM -0400, Ryan Roberts wrote:
+ Kalesh

On 27/03/2025 12:44, Matthew Wilcox wrote:
quoted
On Thu, Mar 27, 2025 at 04:06:58PM +0000, Ryan Roberts wrote:
quoted
So let's special-case the read(ahead) logic for executable mappings. The
trade-off is performance improvement (due to more efficient storage of
the translations in iTLB) vs potential read amplification (due to
reading too much data around the fault which won't be used), and the
latter is independent of base page size. I've chosen 64K folio size for
arm64 which benefits both the 4K and 16K base page size configs and
shouldn't lead to any read amplification in practice since the old
read-around path was (usually) reading blocks of 128K. I don't
anticipate any write amplification because text is always RO.
Is there not also the potential for wasted memory due to ELF alignment?
I think this is an orthogonal issue? My change isn't making that any worse.
To a certain extent, it is.  If readahead was doing order-2 allocations
before and is now doing order-4, you're tying up 0-12 extra pages which
happen to be filled with zeroes due to being used to cache the contents
of a hole.
quoted
Kalesh talked about it in the MM BOF at the same time that Ted and I
were discussing it in the FS BOF.  Some coordination required (like
maybe Kalesh could have mentioned it to me rathere than assuming I'd be
there?)
I was at Kalesh's talk. David H suggested that a potential solution might be for
readahead to ask the fs where the next hole is and then truncate readahead to
avoid reading the hole. Given it's padding, nothing should directly fault it in
so it never ends up in the page cache. Not sure if you discussed anything like
that if you were talking in parallel?
Ted said that he and Kalesh had talked about that solution.  I have a
more bold solution in mind which lifts the ext4 extent cache to the
VFS inode so that the readahead code can interrogate it.
Anyway, I'm not sure if you're suggesting these changes need to be considered as
one somehow or if you're just mentioning it given it is loosely related? My view
is that this change is an improvement indepently and could go in much sooner.
This is not a reason to delay this patch.  It's just a downside which
should be mentioned in the commit message.
quoted
quoted
+static inline int arch_exec_folio_order(void)
+{
+	return -1;
+}
This feels a bit fragile.  I often expect to be able to store an order
in an unsigned int.  Why not return 0 instead?
Well 0 is a valid order, no? I think we have had the "is order signed or
unsigned" argument before. get_order() returns a signed int :)
But why not always return a valid order?  I don't think we need a
sentinel.  The default value can be 0 to do what we do today.

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help