Re: [RFC/PATCH v2 bpf-next fanotify 7/7] selftests/bpf: Add test for BPF based fanotify fastpath handler
From: Song Liu <hidden>
Date: 2024-11-19 08:35:52
Also in:
bpf, linux-fsdevel, lkml
Hi Amir, Thanks for sharing these thoughts. I will take a closer look later, as I am not awake enough to fully understand everything.
On Nov 18, 2024, at 11:59 PM, Amir Goldstein [off-list ref] wrote:
[...]
quoted
Moving struct_ops hook inside send_to_group does save us an indirect call. But this also means we need to introduce the fastpath concept to both fsnotify and fanotify. I personally don't really like duplications like this (see the big BUILD_BUG_ON array in fanotify_handle_event). OTOH, maybe the benefit of one fewer indirect call justifies the extra complexity. Let me think more about it.I need to explain something about fsnotify vs. fanotify in order to argue why the feature should be "fanotify", but the bottom line is that is should not be too hard to avoid the indirect call even if the feature is introduced through fanotify API as I think that it should be.
When I first looked into this, I thought about "whether there will be a use case that uses fsnotify but not fanotify". I didn't get any conclusion on this back then. But according to this thread, I think we are pretty confident that future use cases (such as FAN_PRE_ACCESS) will have a fanotify part. If this is the case, I think fanotify-bpf makes more sense.
TLDR: The fsnotify_backend abstraction has become somewhat of a theater of abstraction over time, because the feature distance between fanotify backend and all the rest has grew quite large. The logic in send_to_group() is *seemingly* the generic fsnotify logic, but not really, because only fanotify has ignore masks and only fanotify has mark types (inode,mount,sb). This difference is encoded by the group->ops->handle_event() operation that only fanotify implements. All the rest of the backends implement the simpler ->handle_inode_event(). Similarly, the group->private union is always dominated by the size of group->fanotify_data, so there is no big difference if we place group->fp_hook (or ->filter_hook) outside of fanotify_data, so that we can query and call it from send_to_group() saving the indirect call to ->handle_event(). That still leaves the question if we need to call fanotify_group_event_mask() before the filter hook.
I was trying Alexei's idea to move the API to fsnotify, and got stucked at fanotify_group_event_mask(). It appears we should always honor these built-in filters.
fanotify_group_event_mask() does several things, but I think that
the only thing relevant before the filter hook is this line:
/*
* Send the event depending on event flags in mark mask.
*/
if (!fsnotify_mask_applicable(mark->mask, ondir, type))
continue;
This code is related to the two "built-in fanotify filters", namely
FAN_ONDIR and FAN_EVENT_ON_CHILD.
These built-in filters are so lame that they serve as a good example
why a programmable filter is a better idea.
For example, users need to opt-in for events on directories, but they
cannot request events only on directories.
Historically, the "generic" abstraction in send_to_group() has dealt
with the non-generic fanotify ignore mask, but has not dealt with
these non-generic fanotify built-in filters.
However, since commit 31a371e419c8 ("fanotify: prepare for setting
event flags in ignore mask"), send_to_group() is already aware of those
fanotify built-in filters.I will continue on this tomorrow. It is time to get some sleep. :)
So unless I am missing something, if we align the marks iteration loop in send_to_group() to look exactly like the marks iteration loop in fanotify_group_event_mask(), there should be no problem to call the filter hook directly, right before calling ->handle_event(). Hope this brain dump helps.
Thanks again for your input! Song
Thanks, Amir.