Thread (50 messages) 50 messages, 9 authors, 2025-07-26

Re: [PATCH v5 bpf-next 0/5] bpf path iterator

From: Song Liu <hidden>
Date: 2025-06-26 05:52:55
Also in: bpf, linux-fsdevel, lkml

On Jun 25, 2025, at 6:05 PM, NeilBrown [off-list ref] wrote:
[...]
quoted
I can't speak for Mickaël, but a callback-based interface is less flexible
(and _maybe_ less performant?).  Also, probably we will want to fallback
to a reference-taking walk if the walk fails (rather than, say, retry
infinitely), and this should probably use Song's proposed iterator.  I'm
not sure if Song would be keen to rewrite this iterator patch series in
callback style (to be clear, it doesn't necessarily seem like a good idea
to me, and I'm not asking him to), which means that we will end up with
the reference walk API being a "call this function repeatedly", and the
rcu walk API taking a callback.  I think it is still workable (after all,
if Landlock wants to reuse the code in the callback it can just call the
callback function itself when doing the reference walk), but it seems a
bit "ugly" to me.
call-back can have a performance impact (less opportunity for compiler
optimisation and CPU speculation), though less than taking spinlock and
references.  However Al and Christian have drawn a hard line against
making seq numbers visible outside VFS code so I think it is the
approach most likely to be accepted.

Certainly vfs_walk_ancestors() would fallback to ref-walk if rcu-walk
resulted in -ECHILD - just like all other path walking code in namei.c.
This would be largely transparent to the caller - the caller would only
see that the callback received a NULL path indicating a restart.  It
wouldn't need to know why.
I guess I misunderstood the proposal of vfs_walk_ancestors() 
initially, so some clarification:

I think vfs_walk_ancestors() is good for the rcu-walk, and some 
rcu-then-ref-walk. However, I don’t think it fits all use cases. 
A reliable step-by-step ref-walk, like this set, works well with 
BPF, and we want to keep it. 

Can we ship this set as-is (or after fixing the comment reported
by kernel test robot)? I really don’t think we need figure out 
all details about the rcu-walk here. 

Once this is landed, we can try implementing the rcu-walk
(vfs_walk_ancestors or some variation). If no one volunteers, I
can give it a try. 

Thanks,
Song
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help