Thread (137 messages) 137 messages, 11 authors, 2025-10-09

Re: [PATCH v3 29/30] luo: allow preserving memfd

From: Jason Gunthorpe <jgg@nvidia.com>
Date: 2025-09-03 15:02:10
Also in: linux-doc, linux-fsdevel, linux-mm, lkml

On Wed, Sep 03, 2025 at 04:10:37PM +0200, Pratyush Yadav wrote:
quoted
So, it could be useful, but I wouldn't use it for memfd, the vmalloc
approach is better and we shouldn't optimize for sparsness which
should never happen.
I disagree. I think we are re-inventing the same data format with minor
variations. I think we should define extensible fundamental data formats
first, and then use those as the building blocks for the rest of our
serialization logic.
page, vmalloc, slab seem to me to be the fundamental units of memory
management in linux, so they should get KHO support.

If you want to preserve a known-sized array you use vmalloc and then
write out the per-list items. If it is a dictionary/sparse array then
you write an index with each item too. This is all trivial and doesn't
really need more abstraction in of itself, IMHO.
cases can then build on top of it. For example, the preservation bitmaps
can get rid of their linked list logic and just use KHO array to hold
and retrieve its bitmaps. It will make the serialization simpler.
I don't think the bitmaps should, the serialization here is very
special because it is not actually preserved, it just exists for the
time while the new kernel runs in scratch and is insta freed once the
allocators start up.
I also don't get why you think sparseness "should never happen". For
memfd for example, you say in one of your other emails that "And again
in real systems we expect memfd to be fully populated too." Which
systems and use cases do you have in mind? Why do you think people won't
want a sparse memfd?
memfd should principally be used to back VM memory, and I expect VM
memory to be fully populated. Why would it be sparse?
All in all, I think KHO array is going to prove useful and will make
serialization for subsystems easier. I think sparseness will also prove
useful but it is not a hill I want to die on. I am fine with starting
with a non-sparse array if people really insist. But I do think we
should go with KHO array as a base instead of re-inventing the linked
list of pages again and again.
The two main advantages I see to the kho array design vs vmalloc is
that it should be a bit faster as it doesn't establish a vmap, and it
handles unknown size lists much better.

Are these important considerations? IDK.

As I said to Chris, I think we should see more examples of what we
actually need before assuming any certain datastructure is the best
choice.

So I'd stick to simpler open coded things and go back and improve them
than start out building the wrong shared data structure.

How about have at least three luo clients that show meaningful benefit
before proposing something beyond the fundamental page, vmalloc, slab
things?
What do you mean by "data per version"? I think there should be only one
version of the serialized object. Multiple versions of the same thing
will get ugly real quick.
If you want to support backwards/forwards compatability then you
probably should support multiple versions as well. Otherwise it
could become quite hard to make downgrades..

Ideally I'd want to remove the upstream code for obsolete versions
fairly quickly so I'd imagine kernels will want to generate both
versions during the transition period and then eventually newer
kernels will only accept the new version.

I've argued before that the extended matrix of any kernel version to
any other kernel version should lie with the distro/CSP making the
kernel fork. They know what their upgrade sequence will be so they can
manage any missing versions to make it work.

Upstream should do like v6.1 to v6.2 only or something similarly well
constrained. I think this is a reasonable trade off to get subsystem
maintainers to even accept this stuff at all.
Other than that, I think this could work well. I am guessing luo_object
stores the version and gives us a way to query it on the other side. I
think if we are letting LUO manage supported versions, it should be
richer than just a list of strings. I think it should include a ops
structure for deserializing each version. That would encapsulate the
versioning more cleanly.
Yeah, sounds about right

Jason
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help