Re: [PATCH v4 14/36] lib: add allocation tagging support for memory allocation profiling
From: Kees Cook <hidden>
Date: 2024-02-22 00:58:00
Also in:
cgroups, linux-arch, linux-doc, linux-fsdevel, linux-iommu, linux-mm, lkml
On Wed, Feb 21, 2024 at 07:34:44PM -0500, Kent Overstreet wrote:
On Wed, Feb 21, 2024 at 04:25:02PM -0800, Kees Cook wrote:quoted
On Wed, Feb 21, 2024 at 06:29:17PM -0500, Kent Overstreet wrote:quoted
On Wed, Feb 21, 2024 at 03:05:32PM -0800, Kees Cook wrote:quoted
On Wed, Feb 21, 2024 at 11:40:27AM -0800, Suren Baghdasaryan wrote:quoted
[...] +struct alloc_tag { + struct codetag ct; + struct alloc_tag_counters __percpu *counters; +} __aligned(8); [...] +#define DEFINE_ALLOC_TAG(_alloc_tag) \ + static DEFINE_PER_CPU(struct alloc_tag_counters, _alloc_tag_cntr); \ + static struct alloc_tag _alloc_tag __used __aligned(8) \ + __section("alloc_tags") = { \ + .ct = CODE_TAG_INIT, \ + .counters = &_alloc_tag_cntr }; [...] +static inline struct alloc_tag *alloc_tag_save(struct alloc_tag *tag) +{ + swap(current->alloc_tag, tag); + return tag; +}Future security hardening improvement idea based on this infrastructure: it should be possible to implement per-allocation-site kmem caches. For example, we could create: struct alloc_details { u32 flags; union { u32 size; /* not valid after __init completes */ struct kmem_cache *cache; }; }; - add struct alloc_details to struct alloc_tag - move the tags section into .ro_after_init - extend alloc_hooks() to populate flags and size: .flags = __builtin_constant_p(size) ? KMALLOC_ALLOCATE_FIXED : KMALLOC_ALLOCATE_BUCKETS; .size = __builtin_constant_p(size) ? size : SIZE_MAX; - during kernel start or module init, walk the alloc_tag list and create either a fixed-size kmem_cache or to allocate a full set of kmalloc-buckets, and update the "cache" member. - adjust kmalloc core routines to use current->alloc_tag->cache instead of using the global buckets. This would get us fully separated allocations, producing better than type-based levels of granularity, exceeding what we have currently with CONFIG_RANDOM_KMALLOC_CACHES. Does this look possible, or am I misunderstanding something in the infrastructure being created here?Definitely possible, but... would we want this?Yes, very very much. One of the worst and mostly unaddressed weaknesses with the kernel right now is use-after-free based type confusion[0], which depends on merged caches (or cache reuse). This doesn't solve cross-allocator (kmalloc/page_alloc) type confusion (as terrifyingly demonstrated[1] by Jann Horn), but it does help with what has been a very common case of "use msg_msg to impersonate your target object"[2] exploitation.We have a ton of code that references PAGE_SIZE and uses the page allocator completely unnecessarily - that's something worth harping about at conferences; if we could motivate people to clean that stuff up it'd have a lot of positive effects.quoted
quoted
That would produce a _lot_ of kmem cachesFewer than you'd expect, but yes, there is some overhead. However, out-of-tree forks of Linux have successfully experimented with this already and seen good results[3].So in that case - I don't think there's any need for a separate alloc_details; we'd just add a kmem_cache * to alloc_tag and then hook into the codetag init/unload path to create and destroy the kmem caches.
Okay, sounds good. There needs to be a place to track "is this a fixed size or a run-time size" choice.
No need to adjust the slab code either; alloc_hooks() itself could dispatch to kmem_cache_alloc() instead of kmalloc() if this is in use.
Right, it'd go to either kmem_cache_alloc() directly, or to a modified kmalloc() that used the passed-in cache is the base for an array of sized buckets, rather than the global (or 16-way global) buckets. Yay for the future! -- Kees Cook