Re: MPK: removing a pkey
From: Dave Hansen <dave.hansen@linux.intel.com>
Date: 2017-11-23 15:09:18
Also in:
linux-api, linux-mm
On 11/23/2017 04:38 AM, Florian Weimer wrote:
> On 11/22/2017 05:32 PM, Dave Hansen wrote:
>> On 11/22/2017 08:21 AM, Florian Weimer wrote:
>>> On 11/22/2017 05:10 PM, Dave Hansen wrote:
>>>> On 11/22/2017 04:15 AM, Florian Weimer wrote:
>>>>> On 11/22/2017 09:18 AM, Vlastimil Babka wrote:
>>>>>> And, was the pkey == -1 internal wiring supposed to be exposed to the pkey_mprotect() syscall, or should there have been a pre-check returning EINVAL in SYSCALL_DEFINE4(pkey_mprotect), before calling do_mprotect_pkey()? I assume it's too late to change it now anyway (or not?), so should we also document it?
>>>>>
>>>>> I think the -1 case to set the default key is useful because it allows you to use a key value of -1 to mean “MPK is not supported”, and still call pkey_mprotect.
>>>>
>>>> The behavior to not allow 0 to be set was unintentional and is a bug. We should fix that.
>>>
>>> On the other hand, x86-64 has no single default protection key due to the PROT_EXEC emulation.
>>
>> No, the default is clearly 0 and documented to be so. The PROT_EXEC emulation one should be inaccessible in all the APIs so does not even show up as *being* a key in the API.
I should have been more explicit: the EXEC pkey does not show up in the syscall API.
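For anyone following along, the -1 usage described in the quoted text can be sketched as below. This is only an illustration, assuming the glibc wrappers pkey_alloc() and pkey_mprotect() (glibc 2.27 or later); the helper names try_alloc_pkey() and protect_region() are hypothetical and do not come from this thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: returns an allocated protection key, or -1 when
 * no key can be allocated (for example, when MPK is not supported). */
static int try_alloc_pkey(void)
{
    return pkey_alloc(0, 0);
}

/* Hypothetical helper: applies `prot` to the region. A `pkey` of -1
 * stands for "no explicit key", so the same call works whether or not
 * try_alloc_pkey() succeeded. */
static int protect_region(void *addr, size_t len, int prot, int pkey)
{
    assert(addr != NULL && len > 0);
    return pkey_mprotect(addr, len, prot, pkey);
}
```

The caller keeps whatever try_alloc_pkey() returned and passes it through unchanged, so no separate mprotect() code path is needed for the unsupported case.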
> I see key 1 in /proc for a PROT_EXEC mapping. If I supply an explicit protection key, that key is used, and the page ends up having read access enabled.
>
> The key is also visible in the siginfo_t argument on read access to a PROT_EXEC mapping with the default key, so it's not just /proc:
>
> page 1 (0x7f008242d000): read access denied
> SIGSEGV address: 0x7f008242d000
> SIGSEGV code: 4
> SIGSEGV key: 1
>
> I'm attaching my test.
Yes, it is exposed there. But since it is a non-allocated pkey, the intention in the kernel was to make sure that it could not be passed to the syscalls. If that behavior is broken, we should probably fix it.
>> The fact that it's implemented with pkeys should be pretty immaterial other than the fact that you can't touch the high bits in PKRU.
>
> I don't see a restriction for PKRU updates. If I write zero to the PKRU register, PROT_EXEC implies PROT_READ, as I would expect.
I'll rephrase: The fact that it's implemented with pkeys should be pretty immaterial other than the fact that you must not touch the bits controlling PROT_EXEC in PKRU if you want to keep it working. There is no restriction which is *enforced*. It's just documented.
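To make "not touching the bits" concrete, here is a minimal sketch of a read-modify-write on a PKRU value that changes only one key's bits. It assumes the x86 layout of two bits per key (access-disable, then write-disable) for 16 keys, and it works on a plain 32-bit value rather than the real register; the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PKRU_NUM_KEYS     16
#define PKRU_BITS_PER_KEY 2
#define PKRU_AD 0x1u   /* access-disable bit within a key's pair */
#define PKRU_WD 0x2u   /* write-disable bit within a key's pair */

/* Mask covering the two PKRU bits that belong to `pkey`. */
static uint32_t pkru_key_mask(int pkey)
{
    assert(pkey >= 0 && pkey < PKRU_NUM_KEYS);
    return (PKRU_AD | PKRU_WD) << (pkey * PKRU_BITS_PER_KEY);
}

/* Return `pkru` with the rights of `pkey` replaced by `rights`
 * (a combination of PKRU_AD and PKRU_WD). The bits of every other key,
 * including the key backing PROT_EXEC, are preserved. */
static uint32_t pkru_set_rights(uint32_t pkru, int pkey, uint32_t rights)
{
    uint32_t mask = pkru_key_mask(pkey);
    uint32_t bits = (rights << (pkey * PKRU_BITS_PER_KEY)) & mask;
    return (pkru & ~mask) | bits;
}
```

Writing zero to the whole register, as in the quoted experiment, clears every key's bits at once, which is why PROT_EXEC then implies PROT_READ.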