Re: [RFC PATCH 2/3] add statmnt(2) syscall

From: Christian Brauner <brauner@kernel.org>
Date: 2023-09-18 16:20:36
Also in: linux-fsdevel, linux-man, linux-security-module, lkml

Atomicity of getting a snapshot of the current mount tree with all of
its attributes was never guaranteed, although reading
/proc/self/mountinfo into a sufficiently large buffer would work that
way.   However, I don't see why mount trees would require stronger
guarantees than dentry trees (for which we have basically none).

So atomicity was never put forward as a requirement. In that
session/recording I explicitly state that we won't guarantee atomicity.
And systemd agreed with this. So I think we're all on the same page.

Even more type clean interface:

struct statmnt *statmnt(u64 mnt_id, u64 mask, void *buf, size_t bufsize,
                        unsigned int flags);

Kernel would return a fully initialized struct with the numeric as
well as string fields filled.  That part is trivial for userspace to
deal with.

I really would prefer a properly typed struct and that's what everyone
was happy with in the session as well. So I would not like to change the
main parameters.

Plus, the format for how to return arbitrary filesystem mount options
warrants a separate discussion imho as that's not really vfs level
information.

Okay.   Let's take fs options out of this.

Thanks.

That leaves:

 - fs type and optionally subtype

So since subtype is FUSE specific it might be better to move this to
filesystem specific options imho.

 - root of mount within fs
 - mountpoint path

The type and subtype are naturally limited to sane sizes, those are
not an issue.

What's the limit for fstype actually? I don't think there is one.
There's one by chance but not by design afaict?

Maybe a crazy idea:
Is there a good reason for the magic number thing that we do in
include/uapi/linux/magic.h, or why don't we just add a proper,
simple enum:

enum {
        FS_TYPE_ADFS    = 1,
        FS_TYPE_AFFS    = 2,
        FS_TYPE_AFS     = 3,
        FS_TYPE_AUTOFS  = 4,
        FS_TYPE_EXT2    = 5,
        FS_TYPE_EXT3    = 6,
        FS_TYPE_EXT4    = 7,
        /* ... */
        FS_TYPE_MAX
};

that we start returning from statmount(). We can still return both the
old and the new fstype? It always felt a bit odd that fs developers
just select a magic number.
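
One thing a distinct enum would buy: ext2, ext3 and ext4 all report the
same superblock magic (0xEF53), so the magic alone cannot tell them apart.
A minimal sketch of that point, assuming the FS_TYPE_* numbering from the
enum above. The names fs_type_entry, fs_types_for_magic() and
fs_type_name() are made up for illustration and are not existing UAPI; the
magic values are copied from include/uapi/linux/magic.h so the sketch
stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from include/uapi/linux/magic.h. */
#define ADFS_SUPER_MAGIC	0xadf5
#define AFFS_SUPER_MAGIC	0xadff
#define AFS_SUPER_MAGIC		0x5346414F
#define AUTOFS_SUPER_MAGIC	0x0187
#define EXT4_SUPER_MAGIC	0xEF53	/* ext2 and ext3 use the same value */

/* Hypothetical enum following the proposal; not an existing UAPI. */
enum fs_type {
	FS_TYPE_ADFS = 1,
	FS_TYPE_AFFS = 2,
	FS_TYPE_AFS = 3,
	FS_TYPE_AUTOFS = 4,
	FS_TYPE_EXT2 = 5,
	FS_TYPE_EXT3 = 6,
	FS_TYPE_EXT4 = 7,
	FS_TYPE_MAX
};

/* Illustrative table tying each enum value to its magic and name. */
static const struct fs_type_entry {
	enum fs_type type;
	uint64_t magic;
	const char *name;
} fs_types[] = {
	{ FS_TYPE_ADFS,   ADFS_SUPER_MAGIC,   "adfs" },
	{ FS_TYPE_AFFS,   AFFS_SUPER_MAGIC,   "affs" },
	{ FS_TYPE_AFS,    AFS_SUPER_MAGIC,    "afs" },
	{ FS_TYPE_AUTOFS, AUTOFS_SUPER_MAGIC, "autofs" },
	{ FS_TYPE_EXT2,   EXT4_SUPER_MAGIC,   "ext2" },
	{ FS_TYPE_EXT3,   EXT4_SUPER_MAGIC,   "ext3" },
	{ FS_TYPE_EXT4,   EXT4_SUPER_MAGIC,   "ext4" },
};

#define FS_TYPES_COUNT (sizeof(fs_types) / sizeof(fs_types[0]))

/* Compile-time check that the table covers every enum value. */
static_assert(FS_TYPES_COUNT == FS_TYPE_MAX - 1,
	      "fs_types[] must have one entry per enum fs_type value");

/* Count how many filesystem types report the given superblock magic. */
static size_t fs_types_for_magic(uint64_t magic)
{
	size_t i, n = 0;

	for (i = 0; i < FS_TYPES_COUNT; i++)
		if (fs_types[i].magic == magic)
			n++;
	return n;
}

/* Map an enum value back to its name, or NULL if it is not in the table. */
static const char *fs_type_name(enum fs_type type)
{
	size_t i;

	for (i = 0; i < FS_TYPES_COUNT; i++)
		if (fs_types[i].type == type)
			return fs_types[i].name;
	return NULL;
}
```

With this table, fs_types_for_magic(EXT4_SUPER_MAGIC) finds three
candidates while the enum value identifies exactly one, which is the
argument for returning both the old and the new fstype.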

For paths the evolution of the relevant system/library calls was:

  char *getwd(char buf[PATH_MAX]);
  char *getcwd(char *buf, size_t size);
  char *get_current_dir_name(void);

It started out using a fixed size buffer, then a variable sized
buffer, then an automatically allocated buffer by the library, hiding
the need to resize on overflow.

The latest style is suitable for the statmnt() call as well, if we
worry about pleasantness of the API.
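
For reference, the "resize on overflow" step that the third style hides can
be sketched on top of getcwd(), which fails with ERANGE when the buffer is
too small. This is only an illustration of the pattern, not how any
particular libc implements get_current_dir_name(); cwd_alloc() is a made-up
name. A statmnt() wrapper in a library could presumably follow the same
loop.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: call getcwd() with a growing buffer until the
 * path fits. Returns a malloc()ed string that the caller must free(),
 * or NULL on error.
 */
static char *cwd_alloc(size_t initial_size)
{
	size_t size = initial_size;
	char *buf = NULL;

	/* getcwd() rejects a zero-sized buffer with EINVAL. */
	assert(initial_size > 0);

	for (;;) {
		char *tmp = realloc(buf, size);

		if (!tmp) {
			free(buf);
			return NULL;
		}
		buf = tmp;

		if (getcwd(buf, size))
			return buf;	/* the path fit */

		if (errno != ERANGE) {	/* real failure, not an overflow */
			free(buf);
			return NULL;
		}
		size *= 2;		/* buffer too small: grow and retry */
	}
}
```

Starting from a deliberately tiny buffer, e.g. cwd_alloc(1), exercises the
retry path; the caller never sees the intermediate ERANGE failures.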

So, can we then do the following struct:

struct statmnt {
        __u64 mask;             /* What results were written [uncond] */
        __u32 sb_dev_major;     /* Device ID */
        __u32 sb_dev_minor;
        __u64 sb_magic;         /* ..._SUPER_MAGIC */
        __u32 sb_flags;         /* MS_{RDONLY,SYNCHRONOUS,DIRSYNC,LAZYTIME} */
        __u32 __spare1;
        __u64 mnt_id;           /* Unique ID of mount */
        __u64 mnt_parent_id;    /* Unique ID of parent (for root == mnt_id) */
        __u32 mnt_id_old;       /* Reused IDs used in proc/.../mountinfo */
        __u32 mnt_parent_id_old;
        __u64 mnt_attr;         /* MOUNT_ATTR_... */
        __u64 mnt_propagation;  /* MS_{SHARED,SLAVE,PRIVATE,UNBINDABLE} */
        __u64 mnt_peer_group;   /* ID of shared peer group */
        __u64 mnt_master;       /* Mount receives propagation from this ID */
        __u64 propagate_from;   /* Propagation from in current namespace */
        __aligned_u64 mountpoint;
        __u32 mountpoint_len;
        __aligned_u64 mountroot;
        __u32 mountroot_len;
        __u64 __spare[20];
};

Userspace knows already how to deal with that because of bpf and other
structs (e.g., both systemd and LXC have ptr_to_u64() helpers and so
on). Libmount and glibc can hide this away internally as well.
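
A rough sketch of what that looks like on the userspace side, assuming a
cut-down stand-in for the struct above that keeps only the two string
fields. struct statmnt_strings, u64_to_ptr() and statmnt_set_buffers() are
names invented for this example; ptr_to_u64() follows the usual shape of
such helpers but is written from scratch here, not copied from systemd or
LXC.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical, reduced stand-in for the proposed struct. The kernel
 * header would use __aligned_u64 and __u32; plain stdint types are
 * used here so the sketch builds without kernel headers.
 */
struct statmnt_strings {
	uint64_t mountpoint;		/* userspace buffer address as u64 */
	uint32_t mountpoint_len;	/* size of that buffer in bytes */
	uint64_t mountroot;
	uint32_t mountroot_len;
};

/* Carry a pointer in a fixed-width field, independent of pointer size. */
static inline uint64_t ptr_to_u64(const void *ptr)
{
	return (uint64_t)(uintptr_t)ptr;
}

/* Inverse conversion, for reading the field back in userspace. */
static inline void *u64_to_ptr(uint64_t val)
{
	return (void *)(uintptr_t)val;
}

/* Point the string fields at caller-provided buffers before the call. */
static void statmnt_set_buffers(struct statmnt_strings *sm,
				char *mp, uint32_t mp_len,
				char *root, uint32_t root_len)
{
	assert(sm && mp && root);

	memset(sm, 0, sizeof(*sm));
	sm->mountpoint = ptr_to_u64(mp);
	sm->mountpoint_len = mp_len;
	sm->mountroot = ptr_to_u64(root);
	sm->mountroot_len = root_len;
}
```

The double cast through uintptr_t keeps the conversion well defined on
32-bit userspace, where a direct pointer to 64-bit integer cast would
trigger compiler warnings about differing sizes.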
