Re: [PATCH v2] man/man3/readdir.3, man/man3type/stat.3type: Improve documentation about .d_ino and .st_ino
From: Pali Rohár <pali@kernel.org>
Date: 2025-10-29 19:34:19
Hello Branden, On Wednesday 29 October 2025 02:00:39 G. Branden Robinson wrote:
Hi Pali, Thanks for following up. At 2025-10-29T00:53:06+0100, Pali Rohár wrote:quoted
Hello Branden, I'm sorry but I have not received your message because I'm not subscribed to the list. Otherwise I would have replied to you earlier.No worries--it's a risk I take when forgetting to CC people's accounts.quoted
If you are referring to the "bug" then it is written in informative part in RATIONALE section of readdir / POSIX.1-2024. I wrote in my first email in that email thread which Alejandro linked above. Here is direct link to POSIX spec and below is quoted part: https://pubs.opengroup.org/onlinepubs/9799919799/functions/readdir.html "When returning a directory entry for the root of a mounted file system, some historical implementations of readdir() returned the file serial number of the underlying mount point, rather than of the root of the mounted file system. This behavior is considered to be a bug, since the underlying file serial number has no significance to applications."Thanks--this is precisely what I was asking for!quoted
That part is in the "informative" section. I have not found anything in normative sections which would disallow usage of that "historical" behavior, so my understanding was that "historical" behavior is conforming too. Please correct me if I'm wrong here, or if it should be understood in different way.I can't speak for the Austin Group, but I don't read the text quite the same way. I interpret it as saying that some historical implementations of readdir() would _not_ "return a pointer to a structure representing the directory entry at the current position in the directory stream specified by the argument dirp, and position the directory stream at the next entry." But I suspect that's not what it _intends_ to say. Instead, these implementations "returned [sic] the file serial number of the underlying mount point", which I interpret to mean that they would return a pointer to a _dirent_ struct whose _d_ino_ member was not the file serial number of the file (of directory type) named by the _d_name_ member but a pointer to a _dirent_ struct whose _d_ino_ member was the file serial number of the underlying mount point. I think there are two conclusions we can reach here. 1. POSIX.1-2024 might be a little sloppy in the wording of its "RATIONALE" for this interface. Presumably no historical implementation's readdir() returned a _d_ino_ number directly. (Though with all the exuberant integer/pointer punning that used to go in Unix, I'd wouldn't bet a lot of money that *no* implementation ever did.) I'll wager a nickel that readdir() has always, on every implementation, returned a pointer to a _dirent_ struct, and it is only the value of the _d_ino_ member of the pointed-to struct that some implementations have populated inconsistently when the entry is a directory that is a mount point. If I'm right, this is an example of the common linguistic error of synecdoche: confusing a container with (a subset of) its contents. 2. The behavior POSIX describes as buggy is, in fact, nonconforming.
Only two? I can image that somebody come up with another conclusion. (just a joke) Anyway, I think that it is important to document the existing Linux behavior and whether it is POSIX-conforming or not is then second step. We can drop the information about POSIX conformity from manpage until we figure out how it is.
quoted
Also I have not read all those 4000 pages, so maybe there is something hidden. It is quite hard to find information about this topic and that is why I think this should be documented in Linux manpages.I reckon someone should open a Mantis ticket with the Austin Group's issue tracker to get clarity on what I characterized as "sloppy" wording. Either it is and we can get the standard clarified, or I'm wrong and an authority can point out how. (Maybe both!) I'm subscribed to the austin-group-l reflector and will take an action item to file this ticket. I'll try to do within a week. (I have a lot of old Unix books and would like to rummage around in them for any documented land mines in this area.) Regards, Branden
Thanks for taking that part. It would be really nice if austin group can clarify how the whole situation is in a non-confusing way. Anyway, inode number is always connected to the specific mounted filesystem. So when the application is doing something with inodes, it always needs a pair (dev_t, ino_t) unless inodes belongs to same fs dev. readdir() and getdents() returns just ino_t, and without knowledge of dev_t, applications cannot use returned ino_t for anything useful. On "historical" implementations, the dev_t can be fetched for example by one fstat(dir_fd, &st) call as dev_t would be same for all readdir and getdents entries. But on non-"historical" implementation, it would be needed to call stat() on every one entry. For example /mnt/ directory which usually contains just mountpoints, will contain entries where each one has inode number 2 (common inode number for root of fs). I looked into archives and I have found that this problem was already discussed in the past. Here are some email threads from coreutils: https://lore.kernel.org/lkml/87y6oyhkz8.fsf@meyering.net/t/#u (local) https://public-inbox.org/bug-coreutils/8763c5wcgn.fsf@meyering.net/t/#u https://public-inbox.org/bug-coreutils/87iqvi2j0q.fsf@rho.meyering.net/t/#u https://public-inbox.org/bug-coreutils/87verkborm.fsf@rho.meyering.net/ https://public-inbox.org/bug-coreutils/022320061637.4398.43FDE4D7000110830000112E22007507440A050E040D0C079D0A@comcast.net/ Maybe they could be a good reference for future discussion by austin group. Just my personal idea: If there would be some xgetdents syscall (like there statx over stat), it could return both inode numbers with dev_t and application can take which it wants. For example, NFS4's readdir can return both inode numbers (depending what is client asking). NFSv4.1 spec has nicely documented this problem with UNIX background of mount point crossing: https://www.rfc-editor.org/rfc/rfc8881.html#section-5.8.2.23 Pali