Re: [PATCH v2 0/5] mm: reduce mmap_lock contention and improve page fault performance
From: Lorenzo Stoakes <ljs@kernel.org>
Date: 2026-05-22 15:53:17
Also in:
linux-arm-kernel, linux-mm, linux-riscv, linux-s390, lkml, loongarch
On Thu, May 21, 2026 at 07:37:58AM +0800, Barry Song wrote:
> On Thu, May 21, 2026 at 5:35 AM David Hildenbrand (Arm) [off-list ref] wrote:
> > On 5/20/26 23:15, Matthew Wilcox wrote:
> > > On Thu, May 21, 2026 at 05:14:20AM +0800, Barry Song wrote:
> > > > My understanding is that we should not blame applications here.
> > > > This is 2026: there are basically only two kinds of
> > > > applications - single-threaded and multi-threaded - and
> > > > single-threaded applications are nearly extinct.
> > >
> > > all of the applications i run are either single threaded or don't
> > > fork. what multithreaded applications call fork?
> >
> > Traditionally the problem was random libraries using fork+execve to
> > launch other programs ... instead of using alternatives like
> > posix_spawn (some use cases require more work done before execve and
> > cannot yet switch to that).
> >
> > I'd hope that that is less of a problem on Android.
> >
> > I assume Android zygote might be multi threaded? Maybe sshd as well?
> > Systemd? But I'd be surprised if there are really performance
> > implications.
>
> I am trying to answer the question above:
>
> 1. zygote, multi-threaded on my phone using Android 13.
>
> / # ls /proc/`pidof zygote64`/task/
> 1359 22728 22729 22730 22731 22732
> /proc/1359/task # cat 22728/comm
> Jit thread pool
> /proc/1359/task # cat 22730/comm
> ReferenceQueueD
> /proc/1359/task # cat 22731/comm
> FinalizerDaemon
> /proc/1359/task # cat 22732/comm
> FinalizerWatchd
> /proc/1359/task # cat 1359/comm
> main
>
> But on another phone of mine running Android 16, zygote64 is
> single-threaded. Not sure if it is due to the Android team making some
> changes related to threads from Android 13 to Android 16.
>
> 2. sshd, multi-processes instead of multi-threads:
>
> $ ps aux | grep sshd
> root 1192 0.0 0.0 15444 9032 ? Ss 09:42 0:00 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
> root 2465 0.0 0.0 17164 10760 ? Ss 09:42 0:00 sshd: barry [priv]
> barry 2632 0.0 0.0 17164 7852 ? S 09:42 0:00 sshd: barry@pts/0
> root 3305 2.5 0.0 17164 10772 ? Ss 09:44 0:00 sshd: barry [priv]
> barry 3406 0.0 0.0 17164 7940 ? S 09:44 0:00 sshd: barry@pts/1
>
> 3. systemd, also multi-processes
>
> $ ps ax | grep systemd
> 350 ? S<s 0:00 /lib/systemd/systemd-journald
> 387 ? Ss 0:00 /lib/systemd/systemd-udevd
> 666 ? Ss 0:00 /lib/systemd/systemd-oomd
> 667 ? Ss 0:00 /lib/systemd/systemd-resolved
> 728 ? Ss 0:00 @dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
> 751 ? Ss 0:00 /lib/systemd/systemd-logind
> 753 ? Ssl 0:00 /usr/sbin/thermald --systemd --dbus-enable --adaptive
> 1350 ? Ss 0:00 /lib/systemd/systemd --user
> 1428 ? Ss 0:00 /usr/bin/dbus-daemon --session --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
> 1900 ? Ssl 0:00 /usr/libexec/gnome-session-binary --systemd-service --session=ubuntu
> 2141 ? Ssl 0:00 /lib/systemd/systemd-timesyncd
>
> > Not sure about web browsers .... I think most of them switched to
> > fork servers, where I would assume fork servers would be
> > single-threaded.
>
> On my phone, Chrome is multi-process, but its parent process
> chrome_zygote (10774) is single-threaded:
>
> ps -A | grep chrome
> u0_i15 9883 10774 321066464 119452 do_epoll_wait 0 S com.android.chrome:sandboxed_process0:org.chromium.content.app.SandboxedProcessService0:15
> u0_a142 10164 1359 35110548 277640 do_epoll_wait 0 S com.android.chrome
> u0_a278 10724 1359 9779864 104988 do_epoll_wait 0 S com.google.android.apps.chromecast.app
> u0_a142 10774 1359 32803908 64076 do_sys_poll 0 S com.android.chrome_zygote
> u0_a142 11173 1359 34208592 142192 do_epoll_wait 0 S com.android.chrome:privileged_process0
>
> /proc/10774/task # ls
> 10774
>
> > So, yeah, getting a clear understanding how this ends up being a
> > problem on Android would be great.
>
> I guess the real issue is that in the Android market, there are so many
> applications that are out of our control?
>
> Here are some trace examples from Nanzhe:
>
> iQIYI plugin
> vma reader thread: PbMisc-0, pid=27183, tgid=26444
> vma writer thread: i.video:plugin1, pid=27298, tgid=26444
> writer blocked: 440394938 ns (440 ms)
> reader stack:
>     vma_start_read
>     lock_vma_under_rcu
>     do_page_fault
>     do_translation_fault
>     do_mem_abort
>     el0_da
>     el0t_64_sync_handler
>     el0t_64_sync
> writer stack:
>     __vma_start_write
>     dup_mmap
>     copy_mm
>     copy_process
>     kernel_clone
>     __arm64_sys_clone
>     invoke_syscall
>     el0_svc_common
>     do_el0_svc
>     el0_svc
>
> Baidu Tieba
> vma reader thread: elastic_pms_pro, pid=7731, tgid=7575
> vma writer thread: com.baidu.tieba, pid=8005, tgid=7575
> writer blocked: 514975545 ns (515 ms)
> reader stack:
>     vma_start_read
>     lock_vma_under_rcu
>     do_page_fault
>     do_translation_fault
>     do_mem_abort
>     el0_da
>     el0t_64_sync_handler
>     el0t_64_sync
> writer stack:
>     __vma_start_write
>     dup_mmap
>     copy_mm
>     copy_process
>     kernel_clone
>     __arm64_sys_clone
>     invoke_syscall
>     el0_svc_common
>     do_el0_svc
>     el0_svc
>
> Thanks
> Barry

Again this is making me want to sit outside and sip on some lemonade and
ice :)

Yes - android processes are aggressively multi-threaded, sure of course.
The missing bit here is the forking - what, where, why, when?

And then you say zygote is sometimes multi-threaded but sometimes
single-threaded, which is adding a whole bunch of confusion on top of all
that.

I don't find these stack trace dumps all that useful (though thanks of
course for taking the time to gather them), I think we'd be better off
with specific data on forking, in some _concise_ _summarised_ form,
ideally with numbers.

There's such a thing as too much information :))

Anyway, again, please let's see a new _RFC_ with the approach proposed by
Suren, with some _succinct_ data demonstrating _exactly_ what the problem
is, so we can make some headway here.

And now I'm off for a cornetto! :)

Thanks, Lorenzo