Thread (28 messages), 8 authors, 2017-01-12

[Question] New mmap64 syscall?

From: catalin.marinas@arm.com (Catalin Marinas)
Date: 2016-12-07 16:38:40
Also in: linux-arch, lkml

On Wed, Dec 07, 2016 at 06:09:44PM +0530, Yury Norov wrote:
On Wed, Dec 07, 2016 at 12:07:24PM +0100, Dr.Philipp Tomsich wrote:
[Resend, as my mail-client had insisted on using the wrong MIME type?]
On 07 Dec 2016, at 11:34, Yury Norov [off-list ref] wrote:
If there is a use case for larger than 16TB offsets, we should add
the call on all architectures, probably using your approach 3. I don't
think that we should treat it as anything special for arm64 though.
From this point of view, a 16+TB offset is a matter of 16+TB storage,
which is more than real. Another reason to add it is that we already
have 64-bit support for offsets in syscalls like sys_llseek(), so
mmap64() would simply extend this support.
I believe the question is rather if the 16TB offset is a real use-case for ILP32.
This is not just for ilp32, but for all 32-bit architectures - both native
and compat. And because the scope is so generic, I think it's a
strong reason for us to support a true 64-bit offset in mmap().
When I mentioned it, I didn't realise that we already use 6 registers
for mmap(). While we can go up to 8 on AArch64/ILP32, I think Arnd has a
point that we don't want this to diverge from other new 32-bit
architectures. I don't really have a strong opinion either way here,
just a remark that AArch64/ILP32 already diverged from _current_ 32-bit
architectures by introducing 64-bit off_t in a 32-bit world. Introducing
an mmap64() at the same time wouldn't look too bad either.
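
For readers wondering where the 16TB figure comes from, here is a minimal sketch. It assumes the existing 32-bit interface is mmap2(), which takes the file offset in 4096-byte units in a single 32-bit register. The helper names (mmap2_offset_limit, byte_offset_to_pgoff) are hypothetical and used only for illustration; they are not kernel or libc functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: mmap2() counts the offset in 4096-byte units. */
#define MMAP2_UNIT_SHIFT 12

/* First byte offset that no longer fits in a 32-bit unit count:
 * 2^32 units * 2^12 bytes per unit = 2^44 bytes = 16 TiB. */
uint64_t mmap2_offset_limit(void)
{
    return (uint64_t)1 << (32 + MMAP2_UNIT_SHIFT);
}

/* Convert a byte offset into the 32-bit unit count passed to mmap2().
 * Returns 0 on success, -1 if the offset is unaligned or out of range. */
int byte_offset_to_pgoff(uint64_t offset, uint32_t *pgoff)
{
    assert(pgoff != NULL);

    if (offset & (((uint64_t)1 << MMAP2_UNIT_SHIFT) - 1))
        return -1;                      /* not a multiple of 4096 */
    if (offset >= mmap2_offset_limit())
        return -1;                      /* needs more than 32 bits */

    *pgoff = (uint32_t)(offset >> MMAP2_UNIT_SHIFT);
    return 0;
}
```

Under that assumption, any file offset at or beyond 2^44 bytes cannot be expressed, which is the gap a true 64-bit offset argument would close.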
This seems to bring the discussion full-circle, as this would indicate that 64bit is the 
preferred bit-width for all sizes, offsets, etc. throughout all filesystem-related calls 
(i.e. stat, seek, etc.).
AARCH64/ILP32 (and all new arches) exposes ino_t, off_t, blkcnt_t,
fsblkcnt_t, fsfilcnt_t and rlim_t as 64-bit types. (size_t should
be 32-bit of course, because it has the same length as a pointer.)

This allows the syscalls that pass these types to support 64-bit values;
refer to Documentation/arm64/ilp32.txt for details. Stat and seek both
support 64-bit types. From this point of view, mmap() is the (only?)
exception in the current ILP32 ABI.
I thought ILP32 will use llseek() which has its own explicit way of
passing a 64-bit offset and the result written back by the kernel. We
wouldn't be able to use lseek() because of the return type.
But if that is the case, then we should have gone with 64bit arguments in a single
register for our ILP32 definition on AArch64.
 
There are 2 unrelated matters - the size of types, and the size of
registers. Most 32-bit architectures have a hardware limitation on
register size (consider aarch32). And it doesn't mean that they are
forced to stick with a 32-bit off_t etc. It is still an open question
how to pass 64-bit parameters in aarch64/ilp32, because there we have
the choice (the reason why it's RFC). If you have new ideas - welcome
to that discussion. This topic also covers architectures that have to
pass 64-bit parameters in a pair.
We've discussed this a few times already and the only sane option from
the _kernel_ perspective seemed to be either (a) close to native ABI for
ILP32 (and breaking POSIX) or (b) just a standard 32-bit ABI. The latter
implies splitting 64-bit values in register pairs, especially to avoid a
lot of annotations/wrapping in the generic kernel unistd.h file. IIRC,
we decided to go with option (b), so I don't think it's worth re-opening
that discussion.
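
As an illustration of what "splitting 64-bit values in register pairs" means in option (b), here is a minimal sketch. The function names (split_u64, join_u64) are hypothetical, and which half goes into which register of the pair is ABI-specific; the low/high order below is only an assumption for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Caller side: split a 64-bit value into two 32-bit halves,
 * one per argument register. */
void split_u64(uint64_t value, uint32_t *lo, uint32_t *hi)
{
    assert(lo != NULL && hi != NULL);

    *lo = (uint32_t)(value & 0xffffffffu);
    *hi = (uint32_t)(value >> 32);
}

/* Receiver side: rebuild the 64-bit value from the two halves. */
uint64_t join_u64(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}
```

With this scheme a 64-bit offset costs two argument slots instead of one, which is why an mmap() that already uses 6 registers is the awkward case.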
In other words: Why not keep ILP32 simple and ask users that need a 16TB+ offset
to use LP64? It seems much more consistent with the other choices taken so far.
If the user can switch to lp64, he doesn't need ilp32 at all, right? :)
Also, I don't understand how a true 64-bit offset in mmap64() would
complicate this port.
It's more like the user wanting a quick transition from code that was
only ever compiled for AArch32 (or another 32-bit architecture) with a
goal of full LP64 transition in the long run. I have yet to see
convincing benchmarks showing ILP32 as an advantage over LP64 (of
course, I hear the argument that reading a pointer in a loop is twice as
fast with a half-size pointer but I don't consider such benchmarks relevant).

-- 
Catalin