Re: [PATCH 1/1] mm: prevent a race between process_mrelease and exit_mmap
From: Suren Baghdasaryan <surenb@google.com>
Date: 2021-10-22 05:23:24
Also in:
linux-mm, lkml
On Thu, Oct 21, 2021 at 7:25 PM Andrew Morton [off-list ref] wrote:
On Thu, 21 Oct 2021 18:46:58 -0700 Suren Baghdasaryan [off-list ref] wrote:
Race between process_mrelease and exit_mmap, where free_pgtables is called while __oom_reap_task_mm is in progress, leads to kernel crash during pte_offset_map_lock call. oom-reaper avoids this race by setting MMF_OOM_VICTIM flag and causing exit_mmap to take and release mmap_write_lock, blocking it until oom-reaper releases mmap_read_lock. Reusing MMF_OOM_VICTIM for process_mrelease would be the simplest way to fix this race, however that would be considered a hack. Fix this race by elevating mm->mm_users and preventing exit_mmap from executing until process_mrelease is finished. Patch slightly refactors the code to adapt for a possible mmget_not_zero failure. This fix has considerable negative impact on process_mrelease performance and will likely need later optimization.

Has the impact been quantified?
As a ballpark figure, process_mrelease takes about 4x longer to exit for a large (6GB) process.
And where's the added cost happening? The changes all look quite lightweight?
I think it's caused by the fact that exit_mmap and all the other cleanup routines that run on the last mmput are postponed until process_mrelease finishes __oom_reap_task_mm and drops mm->mm_users. I suspect all that cleanup now happens at the end of process_mrelease, and that might be contributing to the regression. I haven't had time yet to fully understand all the reasons for the regression, but I wanted to fix the crash first. I will proceed with more investigation and, hopefully, a quick fix for the lost performance.
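
To illustrate the suspected effect, here is a minimal user-space sketch of the reference-count pinning. It is a toy model and not the kernel code: struct toy_mm, toy_mmget_not_zero, toy_mmput and toy_scenario are hypothetical names made up for this example, and the "teardown" is only a flag standing in for exit_mmap and the other last-mmput cleanup. It shows that once process_mrelease holds an extra mm_users reference, the exiting task's own mmput no longer drops the count to zero, so the teardown work lands on whoever drops the last reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for struct mm_struct (assumption: only what the sketch needs). */
struct toy_mm {
    atomic_int mm_users;       /* users pinning the address space */
    bool torn_down;            /* set when the toy teardown has run */
    const char *torn_down_by;  /* who dropped the last reference */
};

/* Take a reference unless the count has already reached zero. */
static bool toy_mmget_not_zero(struct toy_mm *mm)
{
    int old = atomic_load(&mm->mm_users);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&mm->mm_users, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; the caller dropping the last one pays for the teardown. */
static void toy_mmput(struct toy_mm *mm, const char *who)
{
    if (atomic_fetch_sub(&mm->mm_users, 1) == 1) {
        mm->torn_down = true;   /* stands in for exit_mmap and friends */
        mm->torn_down_by = who;
    }
}

/* Replays the interleaving discussed above; returns who ran the teardown. */
static const char *toy_scenario(void)
{
    struct toy_mm mm;

    atomic_init(&mm.mm_users, 1);        /* the victim task's own reference */
    mm.torn_down = false;
    mm.torn_down_by = "nobody";

    if (!toy_mmget_not_zero(&mm))        /* process_mrelease pins the mm */
        return "nobody";

    toy_mmput(&mm, "exiting task");      /* victim exits: 2 -> 1, no teardown */
    assert(!mm.torn_down);               /* reaping would run safely here */

    toy_mmput(&mm, "process_mrelease");  /* 1 -> 0: teardown happens here */
    return mm.torn_down_by;
}
```

In this model toy_scenario() reports "process_mrelease" as the caller that ran the teardown, which matches the suspicion that the cleanup cost has moved to the end of process_mrelease. Whether that fully explains the measured slowdown is not something the sketch can show.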