Re: [PATCH 09/10] kthread: Ensure struct kthread is present for all kthreads
From: Nathan Chancellor <nathan@kernel.org>
Date: 2021-12-22 18:46:50
Also in:
lkml
On Wed, Dec 22, 2021 at 12:30:57PM -0600, Eric W. Biederman wrote:
> Nathan Chancellor [off-list ref] writes:
>
>> Hi Eric,
>>
>> On Wed, Dec 08, 2021 at 02:25:31PM -0600, Eric W. Biederman wrote:
>>> Today the rules are a bit iffy and arbitrary about which kernel
>>> threads have struct kthread present. Both idle threads and thread
>>> started with create_kthread want struct kthread present so that is
>>> effectively all kernel threads.
>>>
>>> Make the rule that if PF_KTHREAD and the task is running then
>>> struct kthread is present. This will allow the kernel thread code
>>> to using tsk->exit_code with different semantics from ordinary
>>> processes.
>>>
>>> To make ensure that struct kthread is present for all kernel
>>> threads move it's allocation into copy_process.
>>>
>>> Add a deallocation of struct kthread in exec for processes that
>>> were kernel threads.
>>>
>>> Move the allocation of struct kthread for the initial thread
>>> earlier so that it is not repeated for each additional idle thread.
>>>
>>> Move the initialization of struct kthread into set_kthread_struct
>>> so that the structure is always and reliably initailized.
>>>
>>> Clear set_child_tid in free_kthread_struct to ensure the kthread
>>> struct is reliably freed during exec. The function
>>> free_kthread_struct does not need to clear vfork_done during exec
>>> as exec_mm_release called from exec_mmap has already cleared
>>> vfork_done.
>>>
>>> Signed-off-by: "Eric W. Biederman" <redacted>
>>
>> This patch as commit 40966e316f86 ("kthread: Ensure struct kthread is
>> present for all kthreads") in -next causes an ARCH=arm
>> multi_v5_defconfig kernel to fail to boot in QEMU. I had to apply
>> commit 6692c98c7df5 ("fork: Stop protecting
>> back_fork_cleanup_cgroup_lock with CONFIG_NUMA") to get it to build
>> and I applied commit dd621ee0cf8e ("kthread: Warn about failed
>> allocations for the init kthread") to avoid the known runtime warning.
>>
>> $ make -skj"$(nproc)" ARCH=arm CROSS_COMPILE=arm-linux-gnueabi- distclean multi_v5_defconfig all
>>
>> $ qemu-system-arm \
>>     -initrd rootfs.cpio \
>>     -append earlycon \
>>     -machine palmetto-bmc \
>>     -no-reboot \
>>     -dtb arch/arm/boot/dts/aspeed-bmc-opp-palmetto.dtb \
>>     -display none \
>>     -kernel arch/arm/boot/zImage \
>>     -m 512m \
>>     -nodefaults \
>>     -serial mon:stdio
>> qemu-system-arm: warning: nic ftgmac100.0 has no peer
>> qemu-system-arm: warning: nic ftgmac100.1 has no peer
>> Booting Linux on physical CPU 0x0
>> Linux version 5.16.0-rc1-00016-g40966e316f86-dirty (nathan@archlinux-ax161) (arm-linux-gnueabi-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 PREEMPT Wed Dec 22 18:08:53 UTC 2021
>> CPU: ARM926EJ-S [41069265] revision 5 (ARMv5TEJ), cr=00093177
>> CPU: VIVT data cache, VIVT instruction cache
>> OF: fdt: Machine model: Palmetto BMC
>> earlycon: ns16550a0 at MMIO 0x1e784000 (options '')
>> printk: bootconsole [ns16550a0] enabled
>> Memory policy: Data cache writethrough
>> cma: Reserved 16 MiB at 0x5b000000
>> Zone ranges:
>>   DMA      [mem 0x0000000040000000-0x000000005edfffff]
>>   Normal   empty
>>   HighMem  [mem 0x000000005ee00000-0x000000005fffffff]
>> Movable zone start for each node
>> Early memory node ranges
>>   node   0: [mem 0x0000000040000000-0x000000005bffffff]
>>   node   0: [mem 0x000000005c000000-0x000000005dffffff]
>>   node   0: [mem 0x000000005e000000-0x000000005edfffff]
>>   node   0: [mem 0x000000005ee00000-0x000000005fffffff]
>> Initmem setup node 0 [mem 0x0000000040000000-0x000000005fffffff]
>> Built 1 zonelists, mobility grouping on.  Total pages: 130084
>> Kernel command line: earlycon
>> Dentry cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
>> Inode-cache hash table entries: 32768 (order: 5, 131072 bytes, linear)
>> mem auto-init: stack:off, heap alloc:off, heap free:off
>> Memory: 433140K/524288K available (9628K kernel code, 2019K rwdata, 2368K rodata, 340K init, 661K bss, 74764K reserved, 16384K cma-reserved, 0K highmem)
>> SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
>> rcu: Preemptible hierarchical RCU implementation.
>> rcu: RCU event tracing is enabled.
>> Trampoline variant of Tasks RCU enabled.
>> rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
>> NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
>> i2c controller registered, irq 16
>> random: get_random_bytes called from start_kernel+0x408/0x624 with crng_init=0
>> clocksource: FTTMR010-TIMER2: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 79635851949 ns
>> sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 89478484971ns
>> Switching to timer-based delay loop, resolution 41ns
>> Console: colour dummy device 80x30
>> printk: console [tty0] enabled
>> printk: bootconsole [ns16550a0] disabled
>>
>> After that, it just hangs.
>>
>> The rootfs is available at https://github.com/ClangBuiltLinux/boot-utils
>> in the images/arm folder.
>>
>> If there is any more information that I can provide or changes to
>> test, please let me know.
>
> Well crap. I hate to hear my code is causing problems like this.
>
> This is however a very good bug report, which I very much appreciate.
>
> I think I have enough information. I will see if I can reproduce this
> and track down what is happening.
>
> Have you by any chance tried linux-next with just these changes backed
> out?

Yes, if I back out the following commits on top of next-20211222, then
the kernel boots right up.
dd621ee0cf8e ("kthread: Warn about failed allocations for the init kthread")
ff8288ff475e ("fork: Rename bad_fork_cleanup_threadgroup_lock to bad_fork_cleanup_delayacct")
6692c98c7df5 ("fork: Stop protecting back_fork_cleanup_cgroup_lock with CONFIG_NUMA")
1fb466dff904 ("objtool: Add a missing comma to avoid string concatenation")
5eb6f22823e0 ("exit/kthread: Fix the kerneldoc comment for kthread_complete_and_exit")
6b1248798eb6 ("exit/kthread: Move the exit code for kernel threads into struct kthread")
40966e316f86 ("kthread: Ensure struct kthread is present for all kthreads")
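
For anyone reproducing this, the back-out can be scripted. This is a minimal sketch, not taken from the original mail: it assumes a checkout of next-20211222, that the commits are reverted in the order listed above, and that none of the reverts conflict. The `commits` variable and the `revert_cmd` helper are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Commit IDs copied from the list above, in the order they appear there.
# Assumption: that order is newest first, so each revert applies on top
# of the previous one without conflicts.
commits="dd621ee0cf8e ff8288ff475e 6692c98c7df5 1fb466dff904 5eb6f22823e0 6b1248798eb6 40966e316f86"

# Hypothetical helper: print the revert command rather than running it,
# so it can be reviewed before it touches the tree.
revert_cmd() {
    echo "git revert --no-edit $commits"
}

revert_cmd
```

Running the printed command inside the kernel tree creates one revert commit per listed commit; the build and QEMU invocation from the original report can then be repeated unchanged.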
Cheers,
Nathan