Re: [PATCH v7 1/2] mm: security: introduce init_on_alloc=1 and... | linux-security-module

Re: [PATCH v7 1/2] mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options

From: Alexander Potapenko <glider@google.com>
Date: 2019-06-21 08:57:50
Also in: linux-mm

On Fri, Jun 21, 2019 at 9:09 AM Michal Hocko [off-list ref] wrote:

On Mon 17-06-19 17:10:49, Alexander Potapenko wrote:

quoted

The new options are needed to prevent possible information leaks and
make control-flow bugs that depend on uninitialized values more
deterministic.

init_on_alloc=1 makes the kernel initialize newly allocated pages and heap
objects with zeroes. Initialization is done at allocation time at the
places where checks for __GFP_ZERO are performed.

init_on_free=1 makes the kernel initialize freed pages and heap objects
with zeroes upon their deletion. This helps to ensure sensitive data
doesn't leak via use-after-free accesses.

Both init_on_alloc=1 and init_on_free=1 guarantee that the allocator
returns zeroed memory. The two exceptions are slab caches with
constructors and SLAB_TYPESAFE_BY_RCU flag. Those are never
zero-initialized to preserve their semantics.

Both init_on_alloc and init_on_free default to zero, but those defaults
can be overridden with CONFIG_INIT_ON_ALLOC_DEFAULT_ON and
CONFIG_INIT_ON_FREE_DEFAULT_ON.

Slowdown for the new features compared to init_on_free=0,
init_on_alloc=0:

hackbench, init_on_free=1:  +7.62% sys time (st.err 0.74%)
hackbench, init_on_alloc=1: +7.75% sys time (st.err 2.14%)

Linux build with -j12, init_on_free=1:  +8.38% wall time (st.err 0.39%)
Linux build with -j12, init_on_free=1:  +24.42% sys time (st.err 0.52%)
Linux build with -j12, init_on_alloc=1: -0.13% wall time (st.err 0.42%)
Linux build with -j12, init_on_alloc=1: +0.57% sys time (st.err 0.40%)

The slowdown for init_on_free=0, init_on_alloc=0 compared to the
baseline is within the standard error.

The new features are also going to pave the way for hardware memory
tagging (e.g. arm64's MTE), which will require both on_alloc and on_free
hooks to set the tags for heap objects. With MTE, tagging will have the
same cost as memory initialization.

Although init_on_free is rather costly, there are paranoid use-cases where
in-memory data lifetime is desired to be minimized. There are various
arguments for/against the realism of the associated threat models, but
given that we'll need the infrastructre for MTE anyway, and there are
people who want wipe-on-free behavior no matter what the performance cost,
it seems reasonable to include it in this series.

Thanks for reworking the original implemenation. This looks much better!

quoted

Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Kees Cook <redacted>
To: Andrew Morton <akpm@linux-foundation.org>
To: Christoph Lameter <redacted>
To: Kees Cook <redacted>
Cc: Masahiro Yamada <redacted>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Nick Desaulniers <redacted>
Cc: Kostya Serebryany <redacted>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Sandeep Patil <redacted>
Cc: Laura Abbott <redacted>
Cc: Randy Dunlap <redacted>
Cc: Jann Horn <jannh@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marco Elver <elver@google.com>
Cc: linux-mm@kvack.org
Cc: linux-security-module@vger.kernel.org
Cc: kernel-hardening@lists.openwall.com

Acked-by: Michal Hocko <redacted> # page allocator parts.

kmalloc based parts look good to me as well but I am not sure I fill
qualified to give my ack there without much more digging and I do not
have much time for that now.

[...]

quoted

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index fd5c95ff9251..2f75dd0d0d81 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c

@@ -315,7 +315,7 @@ static struct page *kimage_alloc_pages(gfp_t gfp_mask, unsigned int order)
              arch_kexec_post_alloc_pages(page_address(pages), count,
                                          gfp_mask);

-             if (gfp_mask & __GFP_ZERO)
+             if (want_init_on_alloc(gfp_mask))
                      for (i = 0; i < count; i++)
                              clear_highpage(pages + i);
      }

I am not really sure I follow here. Why do we want to handle
want_init_on_alloc here? The allocated memory comes from the page
allocator and so it will get zeroed there. arch_kexec_post_alloc_pages
might touch the content there but is there any actual risk of any kind
of leak?

You're right, we don't want to initialize this memory if init_on_alloc is on.
We need something along the lines of:
  if (!static_branch_unlikely(&init_on_alloc))
    if (gfp_mask & __GFP_ZERO)
      // clear the pages

Another option would be to disable initialization in alloc_pages() using a flag.

quoted

diff --git a/mm/dmapool.c b/mm/dmapool.c
index 8c94c89a6f7e..e164012d3491 100644
--- a/mm/dmapool.c
+++ b/mm/dmapool.c

@@ -378,7 +378,7 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
 #endif
      spin_unlock_irqrestore(&pool->lock, flags);

-     if (mem_flags & __GFP_ZERO)
+     if (want_init_on_alloc(mem_flags))
              memset(retval, 0, pool->size);

      return retval;

Don't you miss dma_pool_free and want_init_on_free?

Agreed.
I'll fix this and add tests for DMA pools as well.

--
Michal Hocko
SUSE Labs



-- 
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help