Thread: 78 messages, 14 authors, 2024-10-21

Re: [RFC PATCH v1 1/7] fs: Add inode_get_ino() and implement get_ino() for NFS

From: Trond Myklebust <hidden>
Date: 2024-10-17 21:06:47
Also in: linux-fsdevel, linux-nfs

On Thu, 2024-10-17 at 13:59 -0400, Jeff Layton wrote:
On Thu, 2024-10-17 at 17:09 +0000, Trond Myklebust wrote:
quoted
On Thu, 2024-10-17 at 13:05 -0400, Jeff Layton wrote:
quoted
On Thu, 2024-10-17 at 11:15 -0400, Paul Moore wrote:
quoted
On Thu, Oct 17, 2024 at 10:58 AM Christoph Hellwig
[off-list ref] wrote:
quoted
On Thu, Oct 17, 2024 at 10:54:12AM -0400, Paul Moore wrote:
quoted
Okay, good to know, but I was hoping that we could come up with an
explicit list of filesystems that maintain their own private inode
numbers outside of inode->i_ino.
Anything using iget5_locked is a good start.  Add to that file systems
implementing their own inode cache (at least xfs and bcachefs).
Also good to know, thanks.  However, at this point the lack of a clear
answer is making me wonder a bit more about inode numbers in the view
of VFS developers; do you folks care about inode numbers?  I'm not
asking to start an argument, it's a genuine question so I can get a
better understanding about the durability and sustainability of
inode->i_ino.  If all of you (the VFS folks) aren't concerned about
inode numbers, I suspect we are going to have similar issues in the
future and we (the LSM folks) likely need to move away from reporting
inode numbers as they aren't reliably maintained by the VFS layer.
Like Christoph said, the kernel doesn't care much about inode numbers.

People care about them though, and sometimes we have things in the
kernel that report them in some fashion (tracepoints, procfiles, audit
events, etc.). Having those match what the userland stat() st_ino field
tells you is ideal, and for the most part that's the way it works.

The main exception is when people use 32-bit interfaces (somewhat rare
these days), or they have a 32-bit kernel with a filesystem that has a
64-bit inode number space (NFS being one of those). The NFS client has
basically hacked around this for years by tracking its own fileid field
in its inode. That's really a waste though. That could be converted
over to use i_ino instead if it were always wide enough.
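
[Editor's illustration, not part of the original message.] The pattern described above, where a filesystem keeps its own full-width identifier and only uses a narrower value as a hash key, can be sketched in userspace C. This is a toy model under stated assumptions, not kernel code: `toy_inode`, `toy_insert`, `toy_lookup`, `fold_fileid` and `NBUCKETS` are hypothetical names, and the real iget5_locked() interface takes test/set callbacks rather than a bare ID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 64

/* Toy inode: the full 64-bit ID is kept privately, as the NFS client
 * is described as doing with its fileid field. */
struct toy_inode {
    uint64_t fileid;
    struct toy_inode *next;   /* hash chain */
};

static struct toy_inode *buckets[NBUCKETS];

/* Fold 64 bits into 32 by XOR-ing the halves; collisions are possible. */
static uint32_t fold_fileid(uint64_t fileid)
{
    return (uint32_t)fileid ^ (uint32_t)(fileid >> 32);
}

/* Add an inode to the bucket selected by its folded ID. */
static void toy_insert(struct toy_inode *inode)
{
    assert(inode != NULL);
    size_t b = fold_fileid(inode->fileid) % NBUCKETS;
    inode->next = buckets[b];
    buckets[b] = inode;
}

/* The folded value only picks the bucket; the match is decided by
 * comparing the full 64-bit ID, so colliding IDs stay distinct. */
static struct toy_inode *toy_lookup(uint64_t fileid)
{
    struct toy_inode *p = buckets[fold_fileid(fileid) % NBUCKETS];
    while (p != NULL && p->fileid != fileid)
        p = p->next;
    return p;
}
```

Two IDs such as 0x0000000100000002 and 0x0000000200000001 fold to the same 32-bit value here, yet each lookup still returns its own entry because the comparison uses the 64-bit field.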

It'd be better to stop with these sort of hacks and just fix this the
right way once and for all, by making i_ino 64 bits everywhere.
Nope.

That won't fix glibc, which is the main problem NFS has to work around.
True, but that's really a separate problem.

Currently, the problem where the kernel needs to use one inode number
in iget5() and a different one when replying to stat() is limited to
the set of 64-bit kernels that can operate in 32-bit userland
compatibility mode. So mainly on x86_64 kernels that are set up to run in
i386 userland compatibility mode.

If you now decree that all kernels will use 64-bit inode numbers
internally, then you've suddenly expanded the problem to encompass all
the remaining 32-bit kernels. In order to avoid stat() returning
EOVERFLOW to the applications, they too will have to start generating
separate 32-bit inode numbers.
It also doesn't inform how we track inode numbers inside the kernel.
Inode numbers have been 64 bits for years on "real" filesystems. If we
were designing this today, i_ino would be a u64, and we'd only hash
that down to 32 bits when necessary.
"I'm doing a (free) operating system (just a hobby, won't be big and
professional like gnu) for 386(486) AT clones."

History is a bitch...

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com
