Thread (4 messages) 4 messages, 3 authors, 2022-11-30

Re: [PATCH v2 1/2] kasan: allow sampling page_alloc allocations for HW_TAGS

From: Andrey Konovalov <andreyknvl@gmail.com>
Date: 2022-11-30 19:51:58
Also in: linux-mm, lkml

.On Tue, Nov 29, 2022 at 12:30 PM Marco Elver [off-list ref] wrote:
On Sat, 26 Nov 2022 at 20:12, [off-list ref] wrote:
quoted
From: Andrey Konovalov <redacted>

Add a new boot parameter called kasan.page_alloc.sample, which makes
Hardware Tag-Based KASAN tag only every Nth page_alloc allocation for
allocations marked with __GFP_KASAN_SAMPLE.
This is new - why was it decided that this is a better design?
Sampling all page_alloc allocations (with the suggested frequency of 1
out of 10) effectively means that KASAN/MTE is no longer mitigation
for page_alloc corruptions. The idea here was to only apply sampling
to selected allocations, so that all others are still checked
deterministically.

However, it's hard to say whether this is critical from the security
perspective. Most exploits today corrupt slab objects, not page_alloc.
This means we have to go around introducing the GFP_KASAN_SAMPLE flag
everywhere where we think it might cause a performance degradation.

This depends on accurate benchmarks. Yet, not everyone's usecases will
be the same. I fear we might end up with marking nearly all frequent
and large page-alloc allocations with GFP_KASAN_SAMPLE.

Is it somehow possible to make the sampling decision more automatic?

E.g. kasan.page_alloc.sample_order -> only sample page-alloc
allocations with order greater or equal to sample_order.
Hm, perhaps this could be a good middle ground between sampling all
allocations and sprinkling GFP_KASAN_SAMPLE.

Looking at the networking code, most multi-page data allocations are
done with the order of 3 (either via PAGE_ALLOC_COSTLY_ORDER or
SKB_FRAG_PAGE_ORDER). So this would be the required minimum value for
kasan.page_alloc.sample_order to alleviate the performance impact for
the networking workloads.

I measured the number of allocations for each order from 0 to 8 during
boot in my test build:

7299 867 318 206 86 8 7 5 2

So sampling with kasan.page_alloc.sample_order=3 would affect only ~7%
of page_alloc allocations that happen normally, which is not bad. (Of
course, if an attacker can control the size of the allocation, they
can increase the order to enable sampling.)

I'll do some more testing and either send a v3 with this approach or
get back to this discussion.

Thanks for the suggestion!
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help