Re: [PATCH V5 0/7] Allow user to request memory to be locked on page fault

From: Eric B Munson <hidden>
Date: 2015-07-27 14:54:13
Also in: linux-alpha, linux-arch, linux-mips, linux-mm, linuxppc-dev, lkml, sparclinux

On Mon, 27 Jul 2015, Vlastimil Babka wrote:

On 07/27/2015 03:35 PM, Eric B Munson wrote:

quoted

On Mon, 27 Jul 2015, Vlastimil Babka wrote:

quoted

On 07/24/2015 11:28 PM, Eric B Munson wrote:

...

quoted

Changes from V4:
Drop all architectures for new sys call entries except x86[_64] and MIPS
Drop munlock2 and munlockall2
Make VM_LOCKONFAULT a modifier to VM_LOCKED only to simplify book keeping
Adjust tests to match

Hi, thanks for considering my suggestions. Well, I do hope they
were correct, as APIs are hard and I'm no API expert. But since
APIs are also impossible to change after merging, I'm sorry but
I'll keep pestering for one last thing. Thanks again for persisting,
I do believe it's for a good cause!

The thing is that I still don't like that one has to call
mlock2(MLOCK_LOCKED) to get the equivalent of the old mlock(). Why
is that flag needed? We have two modes of locking now, and v5 no
longer treats them separately in vma flags. But having two flags
gives us four possible combinations, so two of them would serve
nothing but to confuse the programmer IMHO. What will mlock2()
without flags do? What will mlock2(MLOCK_LOCKED | MLOCK_ONFAULT) do?
(Note I haven't studied the code yet, as having agreed on the API
should come first. But I did suggest documenting these things more
thoroughly too...)
OK I checked now and both cases above seem to return EINVAL.
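
The V5 behaviour described above (two flags, four combinations, two of them rejected) can be sketched roughly as follows. This is only an illustration: the function name and the flag values are placeholders and are not taken from the patch set.

```c
#include <errno.h>

/* Placeholder values for illustration; not the values used in the patches. */
#define SKETCH_V5_MLOCK_LOCKED  0x01
#define SKETCH_V5_MLOCK_ONFAULT 0x02

/*
 * Hypothetical model of the V5 flag check: exactly one locking mode
 * must be named.  No flags, or both flags together, give -EINVAL.
 */
static int sketch_v5_check_flags(int flags)
{
	switch (flags) {
	case SKETCH_V5_MLOCK_LOCKED:
	case SKETCH_V5_MLOCK_ONFAULT:
		return 0;
	default:
		return -EINVAL;
	}
}
```

Of the four possible inputs, only the two single-flag values are accepted, which is the "four states of which two are invalid" shape being objected to.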

So about the only point I see in MLOCK_LOCKED flag is parity with
MAP_LOCKED for mmap(). But as Kirill said (and me before as well)
MAP_LOCKED is broken anyway so we shouldn't twist the rest of the
API just to keep the poor thing happier in its misery.

Also note that AFAICS you don't have MCL_LOCKED for mlockall() so
there's no full parity anyway. But please don't fix that by adding
MCL_LOCKED :)

Thanks!


I have an MLOCK_LOCKED flag because I prefer an interface to be
explicit.

I think it's already explicit enough that the user calls mlock2(),
no? He obviously wants the range mlocked. An optional flag says that
there should be no pre-fault.

quoted

The caller of mlock2() will be required to fill in the flags
argument regardless.

I guess users not caring about MLOCK_ONFAULT will continue using
plain mlock() without flags anyway.

quoted

I can drop the MLOCK_LOCKED flag with 0 being the value for LOCKED, but
I thought it easier to make clear what was going on at any call to
mlock2().  If user space defines a MLOCK_LOCKED that happens to be 0, I
suppose that would be okay.

Yeah that would remove the weird 4-states-of-which-2-are-invalid
problem I mentioned, but at the cost of glibc wrapper behaving
differently than the kernel syscall itself. For little gain.

quoted

We do actually have an MCL_LOCKED, we just call it MCL_CURRENT.  Would
you prefer that I match the name in mlock2() (add MLOCK_CURRENT
instead)?

Hm it's similar but not exactly the same, because MCL_FUTURE is not
the same as MLOCK_ONFAULT :) So MLOCK_CURRENT would be even more
confusing. Especially if mlockall(MCL_CURRENT | MCL_FUTURE) is OK,
but mlock2(MLOCK_LOCKED | MLOCK_ONFAULT) is invalid.

MLOCK_ONFAULT isn't meant to be the same as MCL_FUTURE, rather it is
meant to be the same as MCL_ONFAULT.  MCL_FUTURE only controls if the
locking policy will be applied to any new mappings made by this process,
not the locking policy itself.  The better comparison is MCL_CURRENT to
MLOCK_LOCKED and MCL_ONFAULT to MLOCK_ONFAULT.  MCL_CURRENT and
MLOCK_LOCKED do the same thing, only one requires a specific range of
addresses while the other works process wide.  This is why I suggested
changing MLOCK_LOCKED to MLOCK_CURRENT.  It is an error to call
mlock2(MLOCK_LOCKED | MLOCK_ONFAULT) just like it is an error to call
mlockall(MCL_CURRENT | MCL_ONFAULT).  The combinations do not make sense.

This was all decided when VM_LOCKONFAULT was a separate state from
VM_LOCKED.  Now that VM_LOCKONFAULT is a modifier to VM_LOCKED and
cannot be specified independently, it might make more sense to mirror
that relationship to userspace.  Which would lead to something like the
following:

To lock and populate a region:
mlock2(start, len, 0);

To lock on fault a region:
mlock2(start, len, MLOCK_ONFAULT);

If LOCKONFAULT is seen as a modifier to mlock, then having a flags
argument of 0 mean "do classic mlock" makes more sense to me.
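
A minimal sketch of how that modifier relationship could map a flags argument onto VMA flags. The function name and all bit values here are placeholders chosen for illustration, and rejecting unknown bits is an assumption of the sketch rather than something settled in this thread.

```c
#include <errno.h>

/* Placeholder bit values for illustration only. */
#define SKETCH_MLOCK_ONFAULT  0x01
#define SKETCH_VM_LOCKED      0x01UL
#define SKETCH_VM_LOCKONFAULT 0x02UL

/*
 * Hypothetical translation of a proposed mlock2() flags argument.
 * flags == 0 behaves like classic mlock(); MLOCK_ONFAULT only adds a
 * modifier on top of VM_LOCKED and never stands alone in the VMA.
 */
static int sketch_mlock2_vm_flags(int flags, unsigned long *vm_flags)
{
	/* Assumption: any bit other than MLOCK_ONFAULT is rejected. */
	if (flags & ~SKETCH_MLOCK_ONFAULT)
		return -EINVAL;

	*vm_flags = SKETCH_VM_LOCKED;
	if (flags & SKETCH_MLOCK_ONFAULT)
		*vm_flags |= SKETCH_VM_LOCKONFAULT;
	return 0;
}
```

With a single optional flag there are only two valid inputs and neither one conflicts with the other.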

To mlock current on fault only:
mlockall(MCL_CURRENT | MCL_ONFAULT);

To mlock future on fault only:
mlockall(MCL_FUTURE | MCL_ONFAULT);

To lock everything on fault:
mlockall(MCL_CURRENT | MCL_FUTURE | MCL_ONFAULT);
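
The mlockall() combinations above could be validated along these lines. Again this is only a sketch: the flag values are placeholders, and treating a bare MCL_ONFAULT (with neither MCL_CURRENT nor MCL_FUTURE) as an error is an assumption made here because a modifier on its own selects nothing to lock.

```c
#include <errno.h>

/* Placeholder values for illustration only. */
#define SKETCH_MCL_CURRENT 0x01
#define SKETCH_MCL_FUTURE  0x02
#define SKETCH_MCL_ONFAULT 0x04

/*
 * Hypothetical flag check for mlockall() with MCL_ONFAULT as a modifier.
 * MCL_CURRENT and MCL_FUTURE pick which mappings are affected;
 * MCL_ONFAULT only changes how those mappings are locked.
 */
static int sketch_mlockall_check_flags(int flags)
{
	const int scope = SKETCH_MCL_CURRENT | SKETCH_MCL_FUTURE;

	/* Unknown bits are rejected. */
	if (flags & ~(scope | SKETCH_MCL_ONFAULT))
		return -EINVAL;

	/* Assumption: at least one scope bit is required. */
	if (!(flags & scope))
		return -EINVAL;

	return 0;
}
```

All three calls listed above pass this check, and the modifier can be combined freely with either scope bit or with both.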

I think I have talked myself into rewriting the set again :/

quoted

Finally, on the question of MAP_LOCKONFAULT, do you just dislike
MAP_LOCKED and do not want to see it extended, or is this a NAK on the
set if that patch is included?  I ask because I have to spin a V6 to get
the MLOCK flag declarations right, but I would prefer not to do a V7+.
If this is a NAK with that patch included, I can drop it and rework the
tests to provide coverage without the mmap flag.  Otherwise I want to
keep it; I have an internal user that would like to see it added.

I don't want to NAK that patch if you think it's useful.
