Re: d_off field in struct dirent and 32-on-64 emulation
From: Andy Lutomirski <luto@kernel.org>
Date: 2018-12-28 15:26:15
Also in:
linux-ext4, linux-fsdevel, lkml, qemu-devel
[sending again, slightly edited, due to email client issues]

On Thu, Dec 27, 2018 at 9:25 AM Florian Weimer [off-list ref] wrote:
We have a bit of an interesting problem with respect to the d_off
field in struct dirent.
When running a 64-bit kernel on certain file systems, notably ext4,
this field uses the full 63 bits even for small directories (strace -v
output, wrapped here for readability):
getdents(3, [
{d_ino=1494304, d_off=3901177228673045825, d_reclen=40, d_name="authorized_keys", d_type=DT_REG},
{d_ino=1494277, d_off=7491915799041650922, d_reclen=24, d_name=".", d_type=DT_DIR},
{d_ino=1314655, d_off=9223372036854775807, d_reclen=24, d_name="..", d_type=DT_DIR}
], 32768) = 88
When running in 32-bit compat mode, this value is somehow truncated to
31 bits, for both the getdents and the getdents64 (!) system call (at
least on i386)....
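To make the size mismatch concrete, here is a minimal sketch in C that checks whether a d_off value can be stored without loss in a signed 32-bit offset, i.e. whether it is non-negative and needs at most 31 bits. The helper name d_off_fits_31_bits and the array sample_d_off are hypothetical names chosen for illustration; the three values are copied from the strace output above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel or libc function): returns true if
 * a directory offset is non-negative and fits in 31 bits, so it can be
 * stored in a signed 32-bit off_t without truncation.
 */
static bool d_off_fits_31_bits(int64_t d_off)
{
    return d_off >= 0 && d_off <= INT32_MAX;
}

/* d_off values copied from the strace output quoted above. */
static const int64_t sample_d_off[] = {
    3901177228673045825LL,
    7491915799041650922LL,
    9223372036854775807LL,
};
```

None of the three sample values passes this check, so a 32-bit caller cannot represent them as they are; they have to be truncated, rejected, or generated differently.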
However, both qemu-user and the 9p file system can run in such a way that the kernel is entered from a 64-bit process, but the actual usage is from a 32-bit process:
I imagine that at least some of the problems you're seeing are due to this bug:

https://lkml.org/lkml/2018/10/18/859

Presumably the right fix involves modifying the relevant VFS file operations to indicate the relevant ABI to the implementations. I would guess that 9p is triggering the “not really in the syscall you think you’re in” issue.
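A rough user-space sketch of what "indicate the relevant ABI to the implementations" could look like, loosely modeled on how a hashed-directory file system might derive a directory position from a hash pair. Everything here is an assumption made for illustration: dir_cookie and its abi_32bit parameter are hypothetical names, not an existing kernel interface. The point is only that the caller states the ABI explicitly instead of the implementation guessing it from the syscall it believes it is in.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical sketch, not kernel code: build a directory position
 * ("cookie") from a (major, minor) hash pair. The caller passes the
 * ABI explicitly via abi_32bit.
 */
static uint64_t dir_cookie(uint32_t major, uint32_t minor, bool abi_32bit)
{
    /* Drop the low bit so the high part is at most 31 bits wide. */
    uint64_t hi = major >> 1;

    if (abi_32bit)
        return hi;              /* fits in a signed 32-bit offset */

    /* Up to 63 bits: 31 bits of major hash, then 32 bits of minor hash. */
    return (hi << 32) | minor;
}
```

With this shape, a caller such as 9p or an emulator acting on behalf of a 32-bit process could request the 31-bit form even when the kernel was entered through a 64-bit syscall.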