Thread (6 messages) 6 messages, 3 authors, 2024-02-17

Re: [RESEND PATCH v2] modules: wait do_free_init correctly

From: Eric Chanudet <hidden>
Date: 2024-02-15 14:18:13
Also in: lkml

On Tue, Jan 30, 2024 at 06:21:03AM -0800, Luis Chamberlain wrote:
On Tue, Jan 30, 2024 at 09:40:38AM +0800, Changbin Du wrote:
quoted
On Mon, Jan 29, 2024 at 09:53:58AM -0800, Luis Chamberlain wrote:
quoted
On Mon, Jan 29, 2024 at 10:03:04AM +0800, Changbin Du wrote:
quoted
The commit 1a7b7d922081 ("modules: Use vmalloc special flag") moves
do_free_init() into a global workqueue instead of call_rcu(). So now
rcu_barrier() can not ensure that do_free_init has completed. We should
wait it via flush_work().

Without this fix, we still could encounter false positive reports in
W+X checking, and rcu synchronization is unnecessary.
The comment in do_init_module(), just before
schedule_work(&init_free_wq), mentioning rcu_barrier(), should be
amended as well.
quoted
quoted
You didn't answer my question, which should be documented in the commit log.

Does this mean we never freed modules init because of this? If so then
your commit log should clearly explain that. It should also explain that
if true (you have to verify) then it means we were no longer saving
the memory we wished to save, and that is important for distributions
which do want to save anything on memory. You may want to do a general
estimate on how much that means these days on any desktop / server.
Actually, I have explained it in commit msg. It's not about saving memory. The
synchronization here is just to ensure the module init's been freed before
doing W+X checking. The problem is that the current implementation is wrong,
rcu_barrier() cannot guarantee that. So we can encounter false positive reports.
But anyway, the module init will be freed, and it's just a timing related issue.
Your desciption here is better than the commit log.
I saw this problem using a PREEMPT_RT kernel as well. Setting DEBUG_WX=n
stills show a significant delay due to the rcu_barrier:
  [    0.291444] Freeing unused kernel memory: 5568K
  [    0.402442] Run /sbin/init as init process

The same delay is shorter using linux-next, but still noticeable
(DEBUG_WX=n):
  [    0.384362] Freeing unused kernel memory: 14080K
  [    0.413423] Run /sbin/init as init process

Matching trace_event=rcu:rcu_barrier trace:
         systemd-1       [002] .....     0.384391: rcu_barrier: rcu_preempt Begin cpu -1 remaining 0 # 4
         systemd-1       [002] d..1.     0.384394: rcu_barrier: rcu_preempt Inc1 cpu -1 remaining 0 # 1
         systemd-1       [002] .....     0.384395: rcu_barrier: rcu_preempt NQ cpu 0 remaining 2 # 1
          <idle>-0       [001] d.h2.     0.384407: rcu_barrier: rcu_preempt IRQ cpu -1 remaining 2 # 1
         systemd-1       [002] .....     0.384408: rcu_barrier: rcu_preempt OnlineQ cpu 1 remaining 3 # 1
         systemd-1       [002] .....     0.384409: rcu_barrier: rcu_preempt NQ cpu 2 remaining 3 # 1
          <idle>-0       [003] d.h2.     0.384416: rcu_barrier: rcu_preempt IRQ cpu -1 remaining 3 # 1
         systemd-1       [002] .....     0.384418: rcu_barrier: rcu_preempt OnlineQ cpu 3 remaining 4 # 1
          <idle>-0       [004] d.h2.     0.384428: rcu_barrier: rcu_preempt IRQ cpu -1 remaining 4 # 1
         systemd-1       [002] .....     0.384430: rcu_barrier: rcu_preempt OnlineQ cpu 4 remaining 5 # 1
          <idle>-0       [005] d.h2.     0.384438: rcu_barrier: rcu_preempt IRQ cpu -1 remaining 5 # 1
         systemd-1       [002] .....     0.384441: rcu_barrier: rcu_preempt OnlineQ cpu 5 remaining 6 # 1
          <idle>-0       [006] d.h2.     0.384450: rcu_barrier: rcu_preempt IRQ cpu -1 remaining 6 # 1
         systemd-1       [002] .....     0.384452: rcu_barrier: rcu_preempt OnlineQ cpu 6 remaining 7 # 1
          <idle>-0       [007] d.h2.     0.384461: rcu_barrier: rcu_preempt IRQ cpu -1 remaining 7 # 1
         systemd-1       [002] .....     0.384463: rcu_barrier: rcu_preempt OnlineQ cpu 7 remaining 8 # 1
          <idle>-0       [004] ..s1.     0.385339: rcu_barrier: rcu_preempt CB cpu -1 remaining 5 # 1
          <idle>-0       [007] ..s1.     0.397335: rcu_barrier: rcu_preempt CB cpu -1 remaining 4 # 1
          <idle>-0       [003] ..s1.     0.397337: rcu_barrier: rcu_preempt CB cpu -1 remaining 3 # 1
          <idle>-0       [005] ..s1.     0.401336: rcu_barrier: rcu_preempt CB cpu -1 remaining 2 # 1
          <idle>-0       [006] ..s1.     0.401336: rcu_barrier: rcu_preempt CB cpu -1 remaining 1 # 1
          <idle>-0       [001] .Ns1.     0.413338: rcu_barrier: rcu_preempt LastCB cpu -1 remaining 0 # 1
         systemd-1       [002] .....     0.413351: rcu_barrier: rcu_preempt Inc2 cpu -1 remaining 0 # 1

With this patch the delay is no longer there:
  [    0.377662] Freeing unused kernel memory: 14080K
  [    0.377767] Run /sbin/init as init process

AFAIU, for the race to happen, module_alloc() needs to create a W+X
mapping (neither x86 nor arm64 does) and debug_checkwx() has to happen
before module_enable_nx() in complete_formation(), I didn't get a
reproducer so far.

Best,

-- 
Eric Chanudet
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help