Thread (57 messages) 57 messages, 14 authors, 2020-11-17

Re: [PATCH v6 5/6] mm: secretmem: use PMD-size pages to amortize direct map fragmentation

From: Mike Rapoport <rppt@kernel.org>
Date: 2020-09-29 14:04:44
Also in: linux-arch, linux-arm-kernel, linux-fsdevel, linux-kselftest, linux-mm, linux-riscv, lkml, nvdimm

On Fri, Sep 25, 2020 at 11:31:14AM +0100, Mark Rutland wrote:
Hi,

Agreed. I think if we really need something like this, something between
XPFO and DEBUG_PAGEALLOC would be generally better, since:

* Secretmem puts userspace in charge of kernel internals (AFAICT without
  any ulimits?), so that seems like an avenue for malicious or buggy
  userspace to exploit and trigger DoS, etc. The other approaches leave
  the kernel in charge at all times, and it's a system-level choice
  which is easier to reason about and test.
Secretmem obeys RLIMIT_MLOCK.
I don't see why it "puts userpspace in charge of kernel internals" more
than other system calls. The fact that memory is dropped from
linear/direct mapping does not make userspace in charge of the kernel
internals. The fact that this is not system-level actually makes it more
controllable and tunable, IMHO.
* Secretmem interaction with existing ABIs is unclear. Should uaccess
  primitives work for secretmem? If so, this means that it's not valid
  to transform direct uaccesses in syscalls etc into accesses via the
  linear/direct map. If not, how do we prevent syscalls? The other
  approaches are clear that this should always work, but the kernel
  should avoid mappings wherever possible.
Our idea was that direct uaccess in the context of the process that owns
the secretmem should work and that transforming the direct uaccesses
into accesses via the linear map would be valid only when allowed
explicitly. E.g with addition of FOLL_SOMETHING to gup.
Yet, this would be required for any implementation of memory areas that
excludes pages from the linear mapping.
* The uncached option doesn't work in a number of situations, such as
  systems which are purely cache coherent at all times, or where the
  hypervisor has overridden attributes. The kernel cannot even know that
  whther this works as intended. On its own this doens't solve a
  particular problem, and I think this is a solution looking for a
  problem.
As we discussed at one of the previous iterations, the uncached makes
sense for x86 to reduce availability of side channels and I've only
enabled uncached mappings on x86.
... and fundamentally, this seems like a "more security, please" option
that is going to be abused, since everyone wants security, regardless of
how we say it *should* be used. The few use-cases that may make sense
(e.g. protection of ketys and/or crypto secrrets), aren't going to be
able to rely on this (since e.g. other uses may depelete memory pools),
so this is going to be best-effort. With all that in mind, I struggle to
beleive that this is going to be worth the maintenance cost (e.g. with
any issues arising from uaccess, IO, etc).
I think that making secretmem a file descriptor that only allows mmap()
already makes it quite self contained and simple. There could be several
cases that will need special treatment, but I don't think it will have
large maintenance cost.
I've run syzkaller for some time with memfd_secret() enabled and I never
hit a crash because of it.
Thanks,
Mark.
-- 
Sincerely yours,
Mike.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help