Re: [RESEND PATCH v2] modules: wait do_free_init correctly
From: Eric Chanudet <hidden>
Date: 2024-02-15 14:18:13
Also in:
lkml
On Tue, Jan 30, 2024 at 06:21:03AM -0800, Luis Chamberlain wrote:
On Tue, Jan 30, 2024 at 09:40:38AM +0800, Changbin Du wrote:quoted
On Mon, Jan 29, 2024 at 09:53:58AM -0800, Luis Chamberlain wrote:quoted
On Mon, Jan 29, 2024 at 10:03:04AM +0800, Changbin Du wrote:quoted
The commit 1a7b7d922081 ("modules: Use vmalloc special flag") moves do_free_init() into a global workqueue instead of call_rcu(). So now rcu_barrier() can not ensure that do_free_init has completed. We should wait it via flush_work(). Without this fix, we still could encounter false positive reports in W+X checking, and rcu synchronization is unnecessary.
The comment in do_init_module(), just before schedule_work(&init_free_wq), mentioning rcu_barrier(), should be amended as well.
quoted
quoted
You didn't answer my question, which should be documented in the commit log. Does this mean we never freed modules init because of this? If so then your commit log should clearly explain that. It should also explain that if true (you have to verify) then it means we were no longer saving the memory we wished to save, and that is important for distributions which do want to save anything on memory. You may want to do a general estimate on how much that means these days on any desktop / server.Actually, I have explained it in commit msg. It's not about saving memory. The synchronization here is just to ensure the module init's been freed before doing W+X checking. The problem is that the current implementation is wrong, rcu_barrier() cannot guarantee that. So we can encounter false positive reports. But anyway, the module init will be freed, and it's just a timing related issue.Your desciption here is better than the commit log.
I saw this problem using a PREEMPT_RT kernel as well. Setting DEBUG_WX=n
stills show a significant delay due to the rcu_barrier:
[ 0.291444] Freeing unused kernel memory: 5568K
[ 0.402442] Run /sbin/init as init process
The same delay is shorter using linux-next, but still noticeable
(DEBUG_WX=n):
[ 0.384362] Freeing unused kernel memory: 14080K
[ 0.413423] Run /sbin/init as init process
Matching trace_event=rcu:rcu_barrier trace:
systemd-1 [002] ..... 0.384391: rcu_barrier: rcu_preempt Begin cpu -1 remaining 0 # 4
systemd-1 [002] d..1. 0.384394: rcu_barrier: rcu_preempt Inc1 cpu -1 remaining 0 # 1
systemd-1 [002] ..... 0.384395: rcu_barrier: rcu_preempt NQ cpu 0 remaining 2 # 1
<idle>-0 [001] d.h2. 0.384407: rcu_barrier: rcu_preempt IRQ cpu -1 remaining 2 # 1
systemd-1 [002] ..... 0.384408: rcu_barrier: rcu_preempt OnlineQ cpu 1 remaining 3 # 1
systemd-1 [002] ..... 0.384409: rcu_barrier: rcu_preempt NQ cpu 2 remaining 3 # 1
<idle>-0 [003] d.h2. 0.384416: rcu_barrier: rcu_preempt IRQ cpu -1 remaining 3 # 1
systemd-1 [002] ..... 0.384418: rcu_barrier: rcu_preempt OnlineQ cpu 3 remaining 4 # 1
<idle>-0 [004] d.h2. 0.384428: rcu_barrier: rcu_preempt IRQ cpu -1 remaining 4 # 1
systemd-1 [002] ..... 0.384430: rcu_barrier: rcu_preempt OnlineQ cpu 4 remaining 5 # 1
<idle>-0 [005] d.h2. 0.384438: rcu_barrier: rcu_preempt IRQ cpu -1 remaining 5 # 1
systemd-1 [002] ..... 0.384441: rcu_barrier: rcu_preempt OnlineQ cpu 5 remaining 6 # 1
<idle>-0 [006] d.h2. 0.384450: rcu_barrier: rcu_preempt IRQ cpu -1 remaining 6 # 1
systemd-1 [002] ..... 0.384452: rcu_barrier: rcu_preempt OnlineQ cpu 6 remaining 7 # 1
<idle>-0 [007] d.h2. 0.384461: rcu_barrier: rcu_preempt IRQ cpu -1 remaining 7 # 1
systemd-1 [002] ..... 0.384463: rcu_barrier: rcu_preempt OnlineQ cpu 7 remaining 8 # 1
<idle>-0 [004] ..s1. 0.385339: rcu_barrier: rcu_preempt CB cpu -1 remaining 5 # 1
<idle>-0 [007] ..s1. 0.397335: rcu_barrier: rcu_preempt CB cpu -1 remaining 4 # 1
<idle>-0 [003] ..s1. 0.397337: rcu_barrier: rcu_preempt CB cpu -1 remaining 3 # 1
<idle>-0 [005] ..s1. 0.401336: rcu_barrier: rcu_preempt CB cpu -1 remaining 2 # 1
<idle>-0 [006] ..s1. 0.401336: rcu_barrier: rcu_preempt CB cpu -1 remaining 1 # 1
<idle>-0 [001] .Ns1. 0.413338: rcu_barrier: rcu_preempt LastCB cpu -1 remaining 0 # 1
systemd-1 [002] ..... 0.413351: rcu_barrier: rcu_preempt Inc2 cpu -1 remaining 0 # 1
With this patch the delay is no longer there:
[ 0.377662] Freeing unused kernel memory: 14080K
[ 0.377767] Run /sbin/init as init process
AFAIU, for the race to happen, module_alloc() needs to create a W+X
mapping (neither x86 nor arm64 does) and debug_checkwx() has to happen
before module_enable_nx() in complete_formation(), I didn't get a
reproducer so far.
Best,
--
Eric Chanudet