Thread (33 messages) 33 messages, 3 authors, 2021-03-28

Re: [PATCH v2 0/2] unprivileged fanotify listener

From: Christian Brauner <hidden>
Date: 2021-03-24 16:29:29
Also in: linux-fsdevel

On Wed, Mar 24, 2021 at 05:05:45PM +0200, Amir Goldstein wrote:
On Wed, Mar 24, 2021 at 4:32 PM Christian Brauner
[off-list ref] wrote:
quoted
On Wed, Mar 24, 2021 at 03:57:12PM +0200, Amir Goldstein wrote:
quoted
quoted
quoted
Now tested FAN_MARK_FILESYSTEM watch on tmpfs mounted
inside userns and works fine, with two wrinkles I needed to iron:

1. FAN_REPORT_FID not supported on tmpfs because tmpfs has
    zero f_fsid (easy to fix)
2. open_by_handle_at() is not userns aware (can relax for
    FS_USERNS_MOUNT fs)

Pushed these two fixes to branch fanotify_userns.
Pushed another fix to mnt refcount bug in WIP and another commit to
add the last piece that could make fanotify usable for systemd-homed
setup - a filesystem watch filtered by mnt_userns (not tested yet).
Now I used mount-idmapped (from xfstest) to test that last piece.
Found a minor bug and pushed a fix.

It is working as expected, that is filtering only the events generated via
the idmapped mount. However, because the listener I tested is capable in
the mapped userns and not in the sb userns, the listener cannot
open_ny_handle_at(), so the result is not as useful as one might hope.
This is another dumb question probably but in general, are you saying
that someone watching a mount or directory and does _not_ want file
descriptors from fanotify to be returned has no other way of getting to
the path they want to open other than by using open_by_handle_at()?
Well there is another way.
It is demonstrated in my demo with intoifywatch --fanotify --recursive.
It involved userspace iterating a subtree of interest to create fid->path
map.
Ok, so this seems to be

inotifytools_filename_from_fid()
-> if (fanotify_mark_type != FAN_MARK_FILESYSTEM)
           watch_from_fid()
   -> read_path_from(/proc/self/fd/w->dirfd)
The fanotify recursive watch is similar but not exactly the same as the
old intoify recursive watch, because with inotify recursive watch you
can miss events.

With fanotify recursive watch, the listener (if capable) can setup a
filesystem mark so events will not be missed. They will be recorded
by fid with an unknown path and the path information can be found later
by the crawler and updated in the map before the final report.

Events on fid that were not found by the crawler need not be reported.
That's essentially a subtree watch for the poor implemented in userspace.
This is already a good improvement.
Honestly, having FAN_MARK_INODE workable unprivileged is already pretty
great. In addition having FAN_MARK_MOUNT workable with idmapped mounts
will likely get us what most users care about, afaict that is the POC
in:
https://github.com/amir73il/linux/commit/f0d5d462c5baeb82a658944c6df80704434f09a1

(I'm reading the source correctly that FAN_MARK_MOUNT works with
FAN_REPORT_FID as long as no inode event set in FANOTIFY_INODE_EVENTS is
set? I'm asking because my manpage - probably too old - seems to imply
that FAN_REPORT_FID can't be used with FAN_MARK_MOUNT although I might
just be stumbling over the phrasing.)

I think FAN_MARK_FILESYSTEM should simply stay under the s_userns_s
capable requirement. That's imho the cleanest semantics for this, i.e.
I'd drop:
https://github.com/amir73il/linux/commit/bd20e273f3c3a650805b3da32e493f01cc2a4763
This is neither an urgent use-case nor am I feeling very comfortable
with it.
I did not implement the combination --fanotify --global --recursive in my
demo, because it did not make sense with the current permission model
(listener that can setup a fs mark can always resolve fids to path), but it
would be quite trivial to add.

quoted
quoted
I guess we will also need to make open_by_handle_at() idmapped aware
and use a variant of vfs_dentry_acceptable() that validates that the opened
path is legitimately accessible via the idmapped mount.
So as a first step, I think there's a legitimate case to be made for
open_by_handle_at() to be made useable inside user namespaces. That's a
change worth to be made independent of fanotify. For example, nowadays
cgroups have a 64 bit identifier that can be used with open_by_handle_at
to map a cgrp id to a path and back:
https://lkml.org/lkml/2020/12/2/1126
Right now this can't be used in user namespaces because of this
restriction but it is genuinely useful to have this feature available
since cgroups are FS_USERNS_MOUNT and that identifier <-> path mapping
is very convenient.
FS_USERNS_MOUNT is a simple case and I think it is safe.
There is already a patch for that on my fanotify_userns branch.
Great!
Christian
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help