[PATCH v4 00/17] khwasan: kernel hardware assisted address sanitizer
From: akpm@linux-foundation.org (Andrew Morton)
Date: 2018-07-02 20:30:28
Also in: linux-doc, linux-kbuild, linux-mm, lkml

On Mon, 2 Jul 2018 13:22:23 -0700 Evgenii Stepanov [off-list ref] wrote:

> On Mon, Jul 2, 2018 at 12:21 PM, Andrew Morton [off-list ref] wrote:
> > On Mon, 2 Jul 2018 12:16:42 -0700 Evgenii Stepanov [off-list ref] wrote:
> > > On Fri, Jun 29, 2018 at 7:41 PM, Andrew Morton [off-list ref] wrote:
> > > > On Fri, 29 Jun 2018 14:45:08 +0200 Andrey Konovalov [off-list ref] wrote:
> > > > > > > What kind of memory consumption testing would you like to see?
> > > > > >
> > > > > > Well, 100kb or so is a teeny amount on virtually any machine.
> > > > > > I'm assuming the savings are (much) more significant once the
> > > > > > machine gets loaded up and doing work?
> > > > >
> > > > > So with clean kernel after boot we get 40 kb memory usage. With
> > > > > KASAN it is ~120 kb, which is 200% overhead. With KHWASAN it's
> > > > > 50 kb, which is 25% overhead. This should approximately scale to
> > > > > any amounts of used slab memory. For example with 100 mb memory
> > > > > usage we would get +200 mb for KASAN and +25 mb with KHWASAN.
> > > > > (And KASAN also requires quarantine for better use-after-free
> > > > > detection). I can explicitly mention the overhead in %s in the
> > > > > changelog.
> > > > >
> > > > > If you think it makes sense, I can also make separate
> > > > > measurements with some workload. What kind of workload should I
> > > > > use?
> > > >
> > > > Whatever workload people were running when they encountered
> > > > problems with KASAN memory consumption ;)
> > > >
> > > > I dunno, something simple. `find / > /dev/null'?
> > >
> > > Looking at a live Android device under load, slab (according to
> > > /proc/meminfo) + kernel stack take 8-10% available RAM (~350MB).
> > > Kasan's overhead of 2x - 3x on top of it is not insignificant.
> >
> > (top-posting repaired. Please don't)
> >
> > For a debugging, not-for-production-use feature, that overhead sounds
> > quite acceptable to me. What problems is it known to cause?
>
> Not having this overhead enables near-production use - ex. running
> kasan/khasan kernel on a personal, daily-use device to catch bugs that
> do not reproduce in test configuration. These are the ones that often
> cost the most engineering time to track down.
>
> CPU overhead is bad, but generally tolerable. RAM is critical, in our
> experience. Once it gets low enough, OOM-killer makes your life
> miserable.

OK, anecdotal experience works for me. But this is all stuff that should have been in the changelog from day zero, please. It describes the reason for the patchset's existence!
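
For readers who want to check the overhead figures quoted in this thread, here is a minimal sketch of the arithmetic. It assumes only the measurements quoted above (40 kb baseline after boot, ~120 kb with KASAN, 50 kb with KHWASAN) and the thread's own assumption that overhead scales roughly linearly with slab usage. The function names are hypothetical and are not part of the patchset.

```python
def overhead_percent(baseline, with_sanitizer):
    """Extra memory used by the sanitizer, as a percentage of the baseline."""
    return (with_sanitizer - baseline) / baseline * 100


def extra_memory(usage, overhead_pct):
    """Additional memory expected for a given usage, assuming linear scaling."""
    return usage * overhead_pct / 100


# Figures quoted in the thread (kb, clean kernel after boot).
BASELINE_KB = 40
KASAN_KB = 120    # quoted as "~120 kb", so this is approximate
KHWASAN_KB = 50

kasan_pct = overhead_percent(BASELINE_KB, KASAN_KB)
khwasan_pct = overhead_percent(BASELINE_KB, KHWASAN_KB)

print(f"KASAN overhead:   {kasan_pct:.0f}%")
print(f"KHWASAN overhead: {khwasan_pct:.0f}%")

# Scaling example from the thread: 100 mb of slab usage.
print(f"KASAN extra at 100 mb:   +{extra_memory(100, kasan_pct):.0f} mb")
print(f"KHWASAN extra at 100 mb: +{extra_memory(100, khwasan_pct):.0f} mb")
```

Under those assumptions the script reproduces the 200% and 25% overheads and the +200 mb and +25 mb estimates given in the thread. Real workloads may not scale linearly, which is why a measurement under load was requested.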