Re: [PATCH 3/4] gfp: mm: introduce __GFP_NOINIT

From: Souptick Joarder <hidden>
Date: 2019-05-11 07:28:42
Also in: linux-mm

On Thu, May 9, 2019 at 6:53 PM Alexander Potapenko [off-list ref] wrote:

From: Kees Cook <redacted>
Date: Wed, May 8, 2019 at 9:16 PM
To: Alexander Potapenko
Cc: Andrew Morton, Christoph Lameter, Kees Cook, Laura Abbott,
Linux-MM, linux-security-module, Kernel Hardening, Masahiro Yamada,
James Morris, Serge E. Hallyn, Nick Desaulniers, Kostya Serebryany,
Dmitry Vyukov, Sandeep Patil, Randy Dunlap, Jann Horn, Mark Rutland

quoted

On Wed, May 8, 2019 at 8:38 AM Alexander Potapenko [off-list ref] wrote:

quoted

When passed to an allocator (either pagealloc or SL[AOU]B), __GFP_NOINIT
tells it to not initialize the requested memory if the init_on_alloc
boot option is enabled. This can be useful in the cases newly allocated
memory is going to be initialized by the caller right away.

__GFP_NOINIT doesn't affect init_on_free behavior, except for SLOB,
where init_on_free implies init_on_alloc.

__GFP_NOINIT basically defeats the hardening against information leaks
provided by init_on_alloc, so one should use it with caution.

This patch also adds __GFP_NOINIT to alloc_pages() calls in SL[AOU]B.
Doing so is safe, because the heap allocators initialize the pages they
receive before passing memory to the callers.

Slowdown for the initialization features compared to init_on_free=0,
init_on_alloc=0:

hackbench, init_on_free=1:  +6.84% sys time (st.err 0.74%)
hackbench, init_on_alloc=1: +7.25% sys time (st.err 0.72%)

Linux build with -j12, init_on_free=1:  +8.52% wall time (st.err 0.42%)
Linux build with -j12, init_on_free=1:  +24.31% sys time (st.err 0.47%)
Linux build with -j12, init_on_alloc=1: -0.16% wall time (st.err 0.40%)
Linux build with -j12, init_on_alloc=1: +1.24% sys time (st.err 0.39%)

The slowdown for init_on_free=0, init_on_alloc=0 compared to the
baseline is within the standard error.

Not sure, but I think this patch will clash with Matthew's posted patch series
*Remove 'order' argument from many mm functions*.

quoted

Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Masahiro Yamada <redacted>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Nick Desaulniers <redacted>
Cc: Kostya Serebryany <redacted>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kees Cook <redacted>
Cc: Sandeep Patil <redacted>
Cc: Laura Abbott <redacted>
Cc: Randy Dunlap <redacted>
Cc: Jann Horn <jannh@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-mm@kvack.org
Cc: linux-security-module@vger.kernel.org
Cc: kernel-hardening@lists.openwall.com
---
 include/linux/gfp.h | 6 +++++-
 include/linux/mm.h  | 2 +-
 kernel/kexec_core.c | 2 +-
 mm/slab.c           | 2 +-
 mm/slob.c           | 3 ++-
 mm/slub.c           | 1 +
 6 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index fdab7de7490d..66d7f5604fe2 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h

@@ -44,6 +44,7 @@ struct vm_area_struct;
 #else
 #define ___GFP_NOLOCKDEP       0
 #endif
+#define ___GFP_NOINIT          0x1000000u

I mentioned this in the other patch, but I think this needs to be
moved ahead of GFP_NOLOCKDEP and adjust the values for GFP_NOLOCKDEP
and to leave the IS_ENABLED() test in __GFP_BITS_SHIFT alone.

Do we really need this blinking GFP_NOLOCKDEP bit at all?
This approach doesn't scale, we can't even have a second feature that
has a bit depending on the config settings.
Cannot we just fix the number of bits instead?

quoted

 /* If the above are modified, __GFP_BITS_SHIFT may need updating */

 /*

@@ -208,16 +209,19 @@ struct vm_area_struct;
  * %__GFP_COMP address compound page metadata.
  *
  * %__GFP_ZERO returns a zeroed page on success.
+ *
+ * %__GFP_NOINIT requests non-initialized memory from the underlying allocator.
  */
 #define __GFP_NOWARN   ((__force gfp_t)___GFP_NOWARN)
 #define __GFP_COMP     ((__force gfp_t)___GFP_COMP)
 #define __GFP_ZERO     ((__force gfp_t)___GFP_ZERO)
+#define __GFP_NOINIT   ((__force gfp_t)___GFP_NOINIT)

 /* Disable lockdep for GFP context tracking */
 #define __GFP_NOLOCKDEP ((__force gfp_t)___GFP_NOLOCKDEP)

 /* Room for N __GFP_FOO bits */
-#define __GFP_BITS_SHIFT (23 + IS_ENABLED(CONFIG_LOCKDEP))
+#define __GFP_BITS_SHIFT (25)

AIUI, this will break non-CONFIG_LOCKDEP kernels: it should just be:

-#define __GFP_BITS_SHIFT (23 + IS_ENABLED(CONFIG_LOCKDEP))
+#define __GFP_BITS_SHIFT (24 + IS_ENABLED(CONFIG_LOCKDEP))

quoted

 #define __GFP_BITS_MASK ((__force gfp_t)((1 << __GFP_BITS_SHIFT) - 1))

 /**

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ee1a1092679c..8ab152750eb4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h

@@ -2618,7 +2618,7 @@ DECLARE_STATIC_KEY_FALSE(init_on_alloc);
 static inline bool want_init_on_alloc(gfp_t flags)
 {
        if (static_branch_unlikely(&init_on_alloc))
-               return true;
+               return !(flags & __GFP_NOINIT);
        return flags & __GFP_ZERO;

What do you think about renaming __GFP_NOINIT to __GFP_NO_AUTOINIT or something?

Regardless, yes, this is nice.

--
Kees Cook



--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help