Re: [linux-next:master] [fs] 313c47f4fe: BUG:kernel_hang_in_test_stage
From: Oliver Sang <hidden>
Date: 2026-02-03 01:31:04
Also in:
linux-fsdevel, oe-lkp
Hi Christian Brauner, sorry for the late reply. It took us some time to double-check.

On Sat, Jan 31, 2026 at 12:41:12PM +0100, Christian Brauner wrote:
On Fri, Jan 30, 2026 at 05:59:00PM +0100, Christian Brauner wrote:
On Tue, Jan 27, 2026 at 02:26:09PM +0800, kernel test robot wrote:
Hello,

kernel test robot noticed "BUG:kernel_hang_in_test_stage" on:

commit: 313c47f4fe4d07eb2969f429a66ad331fe2b3b6f ("fs: use nullfs unconditionally as the real rootfs")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

[test failed on linux-next/master ca3a02fda4da8e2c1cb6baee5d72352e9e2cfaea]

in testcase: trinity
version:
with following parameters:

runtime: 300s
group: group-00
nr_groups: 5

config: x86_64-kexec
compiler: clang-20
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 32G

(please refer to attached dmesg/kmsg for entire log/backtrace)

The reproducer doesn't work:

ubuntu@pengar:~/data/kernel/linux/MODULES/lkp-tests$ sudo bin/lkp qemu -k ../../vmlinux -m ./modules.cgz job-script # job-script
result_root: /home/ubuntu/.lkp//result/trinity/group-00-5-300s/vm-snb/yocto-x86_64-minimal-20190520.cgz/x86_64-kexec/clang-20/313c47f4fe4d07eb2969f429a66ad331fe2b3b6f/15
downloading initrds ...
skip downloading /home/ubuntu/.lkp/cache/osimage/yocto/yocto-x86_64-minimal-20190520.cgz
19270 blocks
/usr/bin/wget -q --timeout=3600 --tries=1 --local-encoding=UTF-8 https://download.01.org/0day-ci/lkp-qemu/osimage/pkg/debian-x86_64-20180403.cgz/trinity-static-x86_64-x86_64-1c734c75-1_2020-01-06.cgz -N -P /home/ubuntu/.lkp/cache/osimage/pkg/debian-x86_64-20180403.cgz
Failed to download osimage/pkg/debian-x86_64-20180403.cgz/trinity-static-x86_64-x86_64-1c734c75-1_2020-01-06.cgz
cat: '': No such file or directory
exec command: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -fsdev local,id=test_dev,path=/home/ubuntu/.lkp//result/trinity/group-00-5-300s/vm-snb/yocto-x86_64-minimal-20190520.cgz/x86_64-kexec/clang-20/313c47f4fe4d07eb2969f429a66ad331fe2b3b6f/15,security_model=none -device virtio-9p-pci,fsdev=test_dev,mount_tag=9p/virtfs_mount -kernel ../../vmlinux -append root=/dev/ram0 RESULT_ROOT=/result/trinity/group-00-5-300s/vm-snb/yocto-x86_64-minimal-20190520.cgz/x86_64-kexec/clang-20/313c47f4fe4d07eb2969f429a66ad331fe2b3b6f/0 BOOT_IMAGE=/pkg/linux/x86_64-kexec/clang-20/313c47f4fe4d07eb2969f429a66ad331fe2b3b6f/vmlinuz-6.19.0-rc1-00006-g313c47f4fe4d branch=internal-devel/devel-hourly-20260124-050739 job=/lkp/jobs/scheduled/vm-meta-17/trinity-group-00-5-300s-yocto-x86_64-minimal-20190520.cgz-313c47f4fe4d-20260126-53110-19zhjsh-2.yaml user=lkp ARCH=x86_64 kconfig=x86_64-kexec commit=313c47f4fe4d07eb2969f429a66ad331fe2b3b6f intremap=posted_msi watchdog_thresh=240 rcuperf.shutdown=0 rcuscale.shutdown=0 refscale.shutdown=0 audit=0 kunit.enable=0 ia32_emulation=on max_uptime=7200 LKP_LOCAL_RUN=1 selinux=0 debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 net.ifnames=0 printk.devkmsg=on panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0 drbd.minor_count=8 systemd.log_level=err ignore_loglevel console=tty0 earlyprintk=ttyS0,115200 console=ttyS0,115200 vga=normal rw ip=dhcp result_service=9p/virtfs_mount -initrd /home/ubuntu/.lkp/cache/final_initrd -smp 2 -m 12872M -no-reboot -device i6300esb -rtc base=localtime -device e1000,netdev=net0 -netdev user,id=net0 -display none -monitor null -serial stdio
qemu-system-x86_64: Error loading uncompressed kernel without PVH ELF Note

The paths for the downloads in the job script are wrong or don't work. Even if I manually modify the above path I still get in the next step:

/usr/bin/wget -q --timeout=3600 --tries=1 --local-encoding=UTF-8 https://download.01.org/0day-ci/lkp-qemu/modules.cgz -N -P /home/ubuntu/.lkp/cache
Failed to download modules.cgz
cat: '': No such file or directory

I need a way to reproduce the issue to figure out exactly what is happening.

Ok, I got it all working and can run the reproducer.
Not sure how you solved it. From the above log, the problem is caused by trinity-static-x86_64-x86_64-1c734c75-1_2020-01-06.cgz; do you have any log? Internally, it's a soft link pointing to a debian version, so maybe there is a code issue in uploading to https://download.01.org/0day-ci. If you could share more information with us, we could check further to improve our process and reproducer. Thanks a lot! We will also check by ourselves, so no problem at all if you ignore this.
But I cannot reproduce the error below at all. I've tried vfs.all, vfs-7.0.nullfs, vfs-7.0.initrd, and I tried ca3a02fda4da8e2c1cb6baee5d72352e9e2cfaea. In all cases:

root@vm-snb:~# which dmesg
/bin/dmesg
root@vm-snb:~# which sleep
/bin/sleep
root@vm-snb:~# which grep
/bin/grep
root@vm-snb:~# /lkp/lkp/src/bin/event/wait
Usage: /lkp/lkp/src/bin/event/wait [-t|--timeout seconds] PIPE_NAME

root 1736 0.0 0.0 4144 1020 ? S 10:36 0:00 /bin/sh /etc/rc5.d/S77lkp-bootstrap start
root 1738 0.0 0.0 4408 2676 ? S 10:36 0:00 \_ /bin/sh /lkp/lkp/src/bin/lkp-setup-rootfs
root 1771 0.0 0.0 4144 1908 ? S 10:36 0:00 \_ tail -f /tmp/stderr
root 1853 0.0 0.0 4408 2688 ? S 10:36 0:00 \_ /bin/sh /lkp/lkp/src/bin/run-lkp /lkp/jobs/scheduled/vm-meta-17/trinity-group-00-5-300s-yocto-x86_64-minimal-20190520.c
root 1875 0.0 0.0 4144 1932 ? S 10:36 0:00 \_ tail -n 0 -f /tmp/stdout
root 1876 0.0 0.0 4144 1900 ? S 10:36 0:00 \_ tail -n 0 -f /tmp/stderr
root 1877 0.0 0.0 4144 2196 ? S 10:36 0:00 \_ tail -n 0 -f /tmp/stdout /tmp/stderr
root 1930 0.0 0.0 4148 2524 ? S 10:36 0:00 \_ /bin/sh /lkp/jobs/scheduled/vm-meta-17/trinity-group-00-5-300s-yocto-x86_64-minimal-20190520.cgz-313c47f4fe4d-20260
root 1943 0.0 0.0 4016 1892 ? S 10:36 0:00 \_ cat /proc/kmsg
root 1974 0.0 0.0 4016 1856 ? S 10:36 0:00 | \_ cat /tmp/lkp/fifo-kmsg
root 1945 0.0 0.0 2476 1732 ? S 10:36 0:00 \_ vmstat --timestamp -n 10
root 1989 0.0 0.0 4016 1928 ? S 10:36 0:00 | \_ cat /tmp/lkp/fifo-heartbeat
root 1948 0.1 0.0 4148 2492 ? S 10:36 0:00 \_ /bin/sh /lkp/lkp/src/monitors/meminfo
root 1995 0.0 0.0 4280 2120 ? S 10:36 0:00 | \_ gzip -c
root 2532 0.0 0.0 1144 832 ? S 10:39 0:00 | \_ /lkp/lkp/src/bin/event/wait post-test --timeout 1
root 1952 0.0 0.0 4148 2392 ? S 10:36 0:00 \_ /bin/sh /lkp/lkp/src/monitors/oom-killer
root 1978 0.0 0.0 4016 1880 ? S 10:36 0:00 | \_ cat /tmp/lkp/fifo-oom-killer
root 2523 0.0 0.0 1144 836 ? S 10:39 0:00 | \_ /lkp/lkp/src/bin/event/wait post-test --timeout 11
root 1955 0.0 0.0 4148 2140 ? S 10:36 0:00 \_ /bin/sh /lkp/lkp/src/monitors/plain/watchdog
root 1971 0.0 0.0 1144 832 ? S 10:36 0:00 | \_ /lkp/lkp/src/bin/event/wait job-finished --timeout 7200
root 2026 0.0 0.0 4148 2448 ? S 10:37 0:00 \_ /bin/sh /lkp/lkp/src/programs/trinity/run
root 2049 0.0 0.0 4016 1820 ? S 10:37 0:00 \_ sleep 300
root 1747 0.0 0.0 4144 2176 ? Ss 10:36 0:00 /bin/sh /bin/start_getty 115200 ttyS0 vt102
root 1794 0.0 0.0 4188 2424 ttyS0 Ss 10:36 0:00 \_ -sh
root 2533 0.0 0.0 3288 2272 ttyS0 R+ 10:39 0:00 \_ ps auxf
root 1749 0.0 0.0 4144 2176 tty1 Ss+ 10:36 0:00 /sbin/getty 38400 tty1
root 1963 0.0 0.0 1144 324 ? Ss 10:36 0:00 /lkp/lkp/src/bin/event/wakeup activate-monitor
root 1968 0.0 0.0 1144 320 ? Ss 10:36 0:00 /lkp/lkp/src/bin/event/wakeup pre-test
root 2030 0.0 0.0 4148 1972 ? S 10:37 0:00 tee -a //result/trinity/group-00-5-300s/vm-snb/yocto-x86_64-minimal-20190520.cgz/x86_64-kexec/clang-20/313c47f4fe4d07eb2969f429

Is there any more data you can provide or how much and how reliable this test fails? Otherwise I have no choice but to discount this for now.
In our bot tests, the results are quite persistent. We have now run even more,
up to ~120 times for each of 313c47f4fe4d07eb2969f429a66 and its parent. We always
see the issue for 313c47f4fe4d07eb2969f429a66, and the parent stays clean.
=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/runtime/group/nr_groups:
vm-snb/trinity/yocto-x86_64-minimal-20190520.cgz/x86_64-kexec/clang-20/300s/group-00/5
7416634fd6f18762 313c47f4fe4d07eb2969f429a66
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :121          98%        119:119  last_state.is_incomplete_run
           :121          98%        119:119  last_state.running
           :121          98%        119:119  dmesg.BUG:kernel_hang_in_test_stage
(BTW, we also rebuilt the kernel and reran the tests, and got the same
results.)
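The fail:runs cells in the table above can be read mechanically. The following is a minimal Python sketch, assuming that an empty count before the colon means zero failing runs. That reading, and the helper names `parse_fail_runs` and `failure_rate`, are assumptions for illustration and not part of lkp-tests; the sketch does not try to reproduce how the %reproduction column is derived.

```python
def parse_fail_runs(cell):
    """Parse a 'fail:runs' cell such as '119:119' or ':121' into (fails, runs)."""
    fails, _, runs = cell.strip().partition(":")
    # Assumption: an empty fail count means no failing runs were recorded.
    return (int(fails) if fails else 0, int(runs))


def failure_rate(cell):
    """Return the fraction of runs that failed for one 'fail:runs' cell."""
    fails, runs = parse_fail_runs(cell)
    return fails / runs


if __name__ == "__main__":
    # Cells copied from the comparison table (parent commit, then 313c47f4fe4d).
    cells = [("7416634fd6f18762", ":121"),
             ("313c47f4fe4d07eb2969f429a66", "119:119")]
    for commit, cell in cells:
        fails, runs = parse_fail_runs(cell)
        print(f"{commit}: {fails} of {runs} runs failed ({failure_rate(cell):.0%})")
```

Read this way, the parent has no failing run out of 121, while every one of the 119 runs on 313c47f4fe4d fails.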
I also tried the reproducer
https://download.01.org/0day-ci/archive/20260127/202601270735.29b7c33e-lkp@intel.com/reproduce
on my local Ubuntu 22.04.5 LTS machine.
I can reproduce the issue for 313c47f4fe4d07eb2969f429a66. One log is attached as
313c47f4fe4d-run.log.
There is no issue when running on the parent commit 7416634fd6f18762. One log is
attached as parent-7416634fd6f1-run.log.
You mentioned that you tried ca3a02fda4da8e2c1cb6baee5d72352e9e2cfaea
(commit ca3a02fda4da8e2c1cb6baee5d72352e9e2cfaea (tag: next-20260123)).
I can reproduce the issue on it as well. One log is attached as
next-20260123-ca3a02fda4da-run.log.
I uploaded the binaries I used for the reproducer to
https://download.01.org/0day-ci/archive/20260127/202601270735.29b7c33e-lkp@intel.com
I am not sure whether they are useful.
for 313c47f4fe4d07eb2969f429a66:
vmlinuz-6.19.0-rc1-00006-g313c47f4fe4d <-- <bzImage>
modules-313c47f4fe4d.cgz
vmlinux-313c47f4fe4d.xz <---- not sure if it could supply some information
for parent 7416634fd6f18762:
vmlinuz-6.19.0-rc1-00005-g7416634fd6f1
modules-7416634fd6f1.cgz
vmlinux-7416634fd6f1.xz
for ca3a02fda4da8e2c1cb6baee5d72352e9e2cfaea (tag: next-20260123):
vmlinuz-6.19.0-rc6-next-20260123
modules-6.19.0-rc6-next-20260123.cgz
vmlinux-next-20260123.xz
config-6.19.0-rc6-next-20260123 <--- since it differs from config-6.19.0-rc1-00006-g313c47f4fe4d
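The file lists above distinguish the bootable vmlinuz (marked `<bzImage>`) from the vmlinux ELF file, and the quoted QEMU invocation passed `-kernel ../../vmlinux` before failing with "Error loading uncompressed kernel without PVH ELF Note". One plausible reading, not confirmed in this thread, is that an ELF vmlinux was given where a bzImage was expected. The following is a minimal Python sketch for telling the two apart by their magic bytes; the function names are hypothetical, and the only facts relied on are the ELF magic at offset 0 and the "HdrS" signature at offset 0x202 of the x86 setup header.

```python
import sys

ELF_MAGIC = b"\x7fELF"   # first four bytes of any ELF file, e.g. vmlinux
HDRS_OFFSET = 0x202      # offset of the "HdrS" signature in the x86 setup header
HDRS_MAGIC = b"HdrS"


def classify_kernel_image(head):
    """Classify a kernel image from its leading bytes (hypothetical helper)."""
    if head[:4] == ELF_MAGIC:
        return "elf"
    if head[HDRS_OFFSET:HDRS_OFFSET + 4] == HDRS_MAGIC:
        return "x86-boot"
    return "unknown"


def classify_file(path):
    """Read just enough of the file to cover both magic values."""
    with open(path, "rb") as f:
        return classify_kernel_image(f.read(HDRS_OFFSET + 4))


if __name__ == "__main__":
    # Example: python3 classify.py vmlinuz-6.19.0-rc1-00006-g313c47f4fe4d
    for path in sys.argv[1:]:
        print(f"{path}: {classify_file(path)}")
```

If the image handed to `-k` reports "elf", trying the vmlinuz file instead may be worth a quick check.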
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot [off-list ref]
| Closes: https://lore.kernel.org/oe-lkp/202601270735.29b7c33e-lkp@intel.com

[ 27.746952][ T1793] /lkp/lkp/src/monitors/meminfo: line 25: /lkp/lkp/src/bin/event/wait: not found
[ 31.757224][ T1793] /lkp/lkp/src/monitors/oom-killer: line 94: dmesg: not found
[ 31.757224][ T1793] /lkp/lkp/src/monitors/oom-killer: line 94: grep: not found
[ 31.757224][ T1793] /lkp/lkp/src/monitors/oom-killer: line 25: /lkp/lkp/src/bin/event/wait: not found
[ 65.744824][ T4974] trinity-main[4974]: segfault at 0 ip 0000000000000000 sp 00007ffe08d2ec08 error 15 likely on CPU 0 (core 0, socket 0)
[ 65.746308][ T4974] Code: Unable to access opcode bytes at 0xffffffffffffffd6.
Code starting with the faulting instruction
===========================================
/etc/rc5.d/S77lkp-bootstrap: line 79: sleep: not found
BUG: kernel hang in test stage

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260127/202601270735.29b7c33e-lkp@intel.com

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
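The "not found" messages in the original report name the tools that could not be resolved inside the guest (dmesg, grep, sleep and /lkp/lkp/src/bin/event/wait), which is what the `which` checks quoted earlier were probing. The following is a minimal Python sketch of the same check, assuming a Python interpreter is available in the test image (the thread does not say whether it is); `missing_commands` is a hypothetical helper name.

```python
import shutil

# Tools reported as "not found" in the quoted dmesg output.
REQUIRED = ["dmesg", "grep", "sleep", "/lkp/lkp/src/bin/event/wait"]


def missing_commands(names, path=None):
    """Return the names that shutil.which cannot resolve to an executable.

    Bare names are looked up on PATH (or on `path` if given); names that
    contain a directory component are checked at that location directly.
    """
    return [name for name in names if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    missing = missing_commands(REQUIRED)
    if missing:
        print("not found:", " ".join(missing))
    else:
        print("all required commands resolve")
```

Running it once on a good boot and once on a failing boot would show whether the lookup itself differs between the two kernels.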
Attachments
- 313c47f4fe4d-run.log [text/plain] 87488 bytes
- parent-7416634fd6f1-run.log [text/plain] 104975 bytes
- next-20260123-ca3a02fda4da-run.log [text/plain] 80746 bytes