Thread (6 messages), 2 authors, 2026-02-05

Re: [linux-next:master] [fs] 313c47f4fe: BUG:kernel_hang_in_test_stage

From: Oliver Sang <hidden>
Date: 2026-02-03 01:31:04
Also in: linux-fsdevel, oe-lkp

hi, Christian Brauner,

sorry for the late reply. it took us some time to double-check.

On Sat, Jan 31, 2026 at 12:41:12PM +0100, Christian Brauner wrote:
On Fri, Jan 30, 2026 at 05:59:00PM +0100, Christian Brauner wrote:
On Tue, Jan 27, 2026 at 02:26:09PM +0800, kernel test robot wrote:

Hello,

kernel test robot noticed "BUG:kernel_hang_in_test_stage" on:

commit: 313c47f4fe4d07eb2969f429a66ad331fe2b3b6f ("fs: use nullfs unconditionally as the real rootfs")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

[test failed on linux-next/master ca3a02fda4da8e2c1cb6baee5d72352e9e2cfaea]

in testcase: trinity
version: 
with following parameters:

	runtime: 300s
	group: group-00
	nr_groups: 5



config: x86_64-kexec
compiler: clang-20
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 32G

(please refer to attached dmesg/kmsg for entire log/backtrace)
The reproducer doesn't work:

ubuntu@pengar:~/data/kernel/linux/MODULES/lkp-tests$ sudo bin/lkp qemu -k ../../vmlinux -m ./modules.cgz job-script # job-script
result_root: /home/ubuntu/.lkp//result/trinity/group-00-5-300s/vm-snb/yocto-x86_64-minimal-20190520.cgz/x86_64-kexec/clang-20/313c47f4fe4d07eb2969f429a66ad331fe2b3b6f/15
downloading initrds ...
skip downloading /home/ubuntu/.lkp/cache/osimage/yocto/yocto-x86_64-minimal-20190520.cgz
19270 blocks
/usr/bin/wget -q --timeout=3600 --tries=1 --local-encoding=UTF-8 https://download.01.org/0day-ci/lkp-qemu/osimage/pkg/debian-x86_64-20180403.cgz/trinity-static-x86_64-x86_64-1c734c75-1_2020-01-06.cgz -N -P /home/ubuntu/.lkp/cache/osimage/pkg/debian-x86_64-20180403.cgz
Failed to download osimage/pkg/debian-x86_64-20180403.cgz/trinity-static-x86_64-x86_64-1c734c75-1_2020-01-06.cgz
cat: '': No such file or directory
exec command: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -fsdev local,id=test_dev,path=/home/ubuntu/.lkp//result/trinity/group-00-5-300s/vm-snb/yocto-x86_64-minimal-20190520.cgz/x86_64-kexec/clang-20/313c47f4fe4d07eb2969f429a66ad331fe2b3b6f/15,security_model=none -device virtio-9p-pci,fsdev=test_dev,mount_tag=9p/virtfs_mount -kernel ../../vmlinux -append root=/dev/ram0 RESULT_ROOT=/result/trinity/group-00-5-300s/vm-snb/yocto-x86_64-minimal-20190520.cgz/x86_64-kexec/clang-20/313c47f4fe4d07eb2969f429a66ad331fe2b3b6f/0 BOOT_IMAGE=/pkg/linux/x86_64-kexec/clang-20/313c47f4fe4d07eb2969f429a66ad331fe2b3b6f/vmlinuz-6.19.0-rc1-00006-g313c47f4fe4d branch=internal-devel/devel-hourly-20260124-050739 job=/lkp/jobs/scheduled/vm-meta-17/trinity-group-00-5-300s-yocto-x86_64-minimal-20190520.cgz-313c47f4fe4d-20260126-53110-19zhjsh-2.yaml user=lkp ARCH=x86_64 kconfig=x86_64-kexec commit=313c47f4fe4d07eb2969f429a66ad331fe2b3b6f intremap=posted_msi watchdog_thresh=240 rcuperf.shutdown=0 rcuscale.shutdown=0 refscale.shutdown=0 audit=0 kunit.enable=0 ia32_emulation=on max_uptime=7200 LKP_LOCAL_RUN=1 selinux=0 debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 net.ifnames=0 printk.devkmsg=on panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0 drbd.minor_count=8 systemd.log_level=err ignore_loglevel console=tty0 earlyprintk=ttyS0,115200 console=ttyS0,115200 vga=normal rw  ip=dhcp result_service=9p/virtfs_mount -initrd /home/ubuntu/.lkp/cache/final_initrd -smp 2 -m 12872M -no-reboot -device i6300esb -rtc base=localtime -device e1000,netdev=net0 -netdev user,id=net0 -display none -monitor null -serial stdio
qemu-system-x86_64: Error loading uncompressed kernel without PVH ELF Note

The paths for the downloads in the job script are wrong or don't work.
Even if I manually modify the above path I still get in the next step:

/usr/bin/wget -q --timeout=3600 --tries=1 --local-encoding=UTF-8 https://download.01.org/0day-ci/lkp-qemu/modules.cgz -N -P /home/ubuntu/.lkp/cache
Failed to download modules.cgz
cat: '': No such file or directory

I need a way to reproduce the issue to figure out exactly what is
happening.
Ok, I got it all working and can run the reproducer.
not sure how you solved it? from the log above, the failure comes from downloading
trinity-static-x86_64-x86_64-1c734c75-1_2020-01-06.cgz. do you have any log?
internally, that file is a soft link pointing to a debian version, so maybe there
is an issue in the code that uploads to https://download.01.org/0day-ci.

if you could share more information with us, we could check further to improve
our process and reproducer. thanks a lot!

we will also check by ourselves, so no problem at all if you ignore this.

But I cannot
reproduce the error below at all. I've tried vfs.all, vfs-7.0.nullfs,
vfs-7.0.initrd, and I tried ca3a02fda4da8e2c1cb6baee5d72352e9e2cfaea. In
all cases:

root@vm-snb:~# which dmesg
/bin/dmesg
root@vm-snb:~# which sleep
/bin/sleep
root@vm-snb:~# which grep
/bin/grep
root@vm-snb:~# /lkp/lkp/src/bin/event/wait
Usage: /lkp/lkp/src/bin/event/wait [-t|--timeout seconds] PIPE_NAME

root      1736  0.0  0.0   4144  1020 ?        S    10:36   0:00 /bin/sh /etc/rc5.d/S77lkp-bootstrap start
root      1738  0.0  0.0   4408  2676 ?        S    10:36   0:00  \_ /bin/sh /lkp/lkp/src/bin/lkp-setup-rootfs
root      1771  0.0  0.0   4144  1908 ?        S    10:36   0:00      \_ tail -f /tmp/stderr
root      1853  0.0  0.0   4408  2688 ?        S    10:36   0:00      \_ /bin/sh /lkp/lkp/src/bin/run-lkp /lkp/jobs/scheduled/vm-meta-17/trinity-group-00-5-300s-yocto-x86_64-minimal-20190520.c
root      1875  0.0  0.0   4144  1932 ?        S    10:36   0:00          \_ tail -n 0 -f /tmp/stdout
root      1876  0.0  0.0   4144  1900 ?        S    10:36   0:00          \_ tail -n 0 -f /tmp/stderr
root      1877  0.0  0.0   4144  2196 ?        S    10:36   0:00          \_ tail -n 0 -f /tmp/stdout /tmp/stderr
root      1930  0.0  0.0   4148  2524 ?        S    10:36   0:00          \_ /bin/sh /lkp/jobs/scheduled/vm-meta-17/trinity-group-00-5-300s-yocto-x86_64-minimal-20190520.cgz-313c47f4fe4d-20260
root      1943  0.0  0.0   4016  1892 ?        S    10:36   0:00              \_ cat /proc/kmsg
root      1974  0.0  0.0   4016  1856 ?        S    10:36   0:00              |   \_ cat /tmp/lkp/fifo-kmsg
root      1945  0.0  0.0   2476  1732 ?        S    10:36   0:00              \_ vmstat --timestamp -n 10
root      1989  0.0  0.0   4016  1928 ?        S    10:36   0:00              |   \_ cat /tmp/lkp/fifo-heartbeat
root      1948  0.1  0.0   4148  2492 ?        S    10:36   0:00              \_ /bin/sh /lkp/lkp/src/monitors/meminfo
root      1995  0.0  0.0   4280  2120 ?        S    10:36   0:00              |   \_ gzip -c
root      2532  0.0  0.0   1144   832 ?        S    10:39   0:00              |   \_ /lkp/lkp/src/bin/event/wait post-test --timeout 1
root      1952  0.0  0.0   4148  2392 ?        S    10:36   0:00              \_ /bin/sh /lkp/lkp/src/monitors/oom-killer
root      1978  0.0  0.0   4016  1880 ?        S    10:36   0:00              |   \_ cat /tmp/lkp/fifo-oom-killer
root      2523  0.0  0.0   1144   836 ?        S    10:39   0:00              |   \_ /lkp/lkp/src/bin/event/wait post-test --timeout 11
root      1955  0.0  0.0   4148  2140 ?        S    10:36   0:00              \_ /bin/sh /lkp/lkp/src/monitors/plain/watchdog
root      1971  0.0  0.0   1144   832 ?        S    10:36   0:00              |   \_ /lkp/lkp/src/bin/event/wait job-finished --timeout 7200
root      2026  0.0  0.0   4148  2448 ?        S    10:37   0:00              \_ /bin/sh /lkp/lkp/src/programs/trinity/run
root      2049  0.0  0.0   4016  1820 ?        S    10:37   0:00                  \_ sleep 300
root      1747  0.0  0.0   4144  2176 ?        Ss   10:36   0:00 /bin/sh /bin/start_getty 115200 ttyS0 vt102
root      1794  0.0  0.0   4188  2424 ttyS0    Ss   10:36   0:00  \_ -sh
root      2533  0.0  0.0   3288  2272 ttyS0    R+   10:39   0:00      \_ ps auxf
root      1749  0.0  0.0   4144  2176 tty1     Ss+  10:36   0:00 /sbin/getty 38400 tty1
root      1963  0.0  0.0   1144   324 ?        Ss   10:36   0:00 /lkp/lkp/src/bin/event/wakeup activate-monitor
root      1968  0.0  0.0   1144   320 ?        Ss   10:36   0:00 /lkp/lkp/src/bin/event/wakeup pre-test
root      2030  0.0  0.0   4148  1972 ?        S    10:37   0:00 tee -a //result/trinity/group-00-5-300s/vm-snb/yocto-x86_64-minimal-20190520.cgz/x86_64-kexec/clang-20/313c47f4fe4d07eb2969f429

Is there any more data you can provide or how much and how reliable this
test fails? Otherwise I have no choice but to discount this for now.
in our bot tests, the results are quite consistent. we have now run about 120
times each for 313c47f4fe4d07eb2969f429a66 and its parent: the issue always
shows up on 313c47f4fe4d07eb2969f429a66, while the parent stays clean.

=========================================================================================
tbox_group/testcase/rootfs/kconfig/compiler/runtime/group/nr_groups:
  vm-snb/trinity/yocto-x86_64-minimal-20190520.cgz/x86_64-kexec/clang-20/300s/group-00/5

7416634fd6f18762 313c47f4fe4d07eb2969f429a66
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :121         98%         119:119   last_state.is_incomplete_run
           :121         98%         119:119   last_state.running
           :121         98%         119:119   dmesg.BUG:kernel_hang_in_test_stage

(BTW, we also rebuilt the kernel and reran the tests, and got the same
results)
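
For readers unfamiliar with the comparison table above, each `fail:runs` cell appears to read as failures over total runs, with an empty left-hand side standing for zero failures (which matches the statement that the parent stays clean). A minimal sketch of that reading follows. The helper names `parse_fail_runs` and `failure_rate` are hypothetical and not part of lkp-tests, and the sketch does not try to reproduce how the `%reproduction` column is derived, since the thread does not say.

```python
def parse_fail_runs(cell: str) -> tuple[int, int]:
    """Split a 'fail:runs' cell such as '119:119' or ':121' into (fails, runs)."""
    fails, _, runs = cell.strip().partition(":")
    # Assumption: an empty left-hand side means zero failures.
    return (int(fails) if fails else 0, int(runs))


def failure_rate(cell: str) -> float:
    """Fraction of runs that failed, or 0.0 when there were no runs."""
    fails, runs = parse_fail_runs(cell)
    return fails / runs if runs else 0.0


if __name__ == "__main__":
    # Cells copied from the table in this report.
    cells = [
        ("7416634fd6f18762 (parent)", ":121"),
        ("313c47f4fe4d07eb2969f429a66", "119:119"),
    ]
    for label, cell in cells:
        fails, runs = parse_fail_runs(cell)
        print(f"{label}: {fails}/{runs} runs failed ({failure_rate(cell):.0%})")
```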

I also tried the reproducer
https://download.01.org/0day-ci/archive/20260127/202601270735.29b7c33e-lkp@intel.com/reproduce
on my local Ubuntu 22.04.5 LTS machine.

I can reproduce the issue for 313c47f4fe4d07eb2969f429a66. one log is attached as
313c47f4fe4d-run.log.

there is no issue when running on the parent commit 7416634fd6f18762. one log is
attached as parent-7416634fd6f1-run.log.

you mentioned you tried ca3a02fda4da8e2c1cb6baee5d72352e9e2cfaea
(commit ca3a02fda4da8e2c1cb6baee5d72352e9e2cfaea (tag: next-20260123)).
I can reproduce the issue on it as well. one log is attached as
next-20260123-ca3a02fda4da-run.log.


I uploaded the binaries I used for the reproducer to
https://download.01.org/0day-ci/archive/20260127/202601270735.29b7c33e-lkp@intel.com
though I am not sure if they are useful.

for 313c47f4fe4d07eb2969f429a66:
vmlinuz-6.19.0-rc1-00006-g313c47f4fe4d <-- <bzImage>
modules-313c47f4fe4d.cgz
vmlinux-313c47f4fe4d.xz   <---- not sure if it could supply some information

for parent 7416634fd6f18762:
vmlinuz-6.19.0-rc1-00005-g7416634fd6f1
modules-7416634fd6f1.cgz
vmlinux-7416634fd6f1.xz

for ca3a02fda4da8e2c1cb6baee5d72352e9e2cfaea (tag: next-20260123):
vmlinuz-6.19.0-rc6-next-20260123
modules-6.19.0-rc6-next-20260123.cgz
vmlinux-next-20260123.xz
config-6.19.0-rc6-next-20260123  <--- since it differs from config-6.19.0-rc1-00006-g313c47f4fe4d
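
The `<bzImage>` note in the list above relates to the "Error loading uncompressed kernel without PVH ELF Note" message earlier in the thread, where `-kernel ../../vmlinux` was passed to QEMU. A minimal sketch for checking which kind of image a file is before handing it to `-kernel` follows. It assumes the x86 Linux boot protocol header layout (boot flag 0xAA55 at offset 0x1FE and the `HdrS` signature at offset 0x202). The function name `classify_kernel_image` is hypothetical.

```python
import sys

ELF_MAGIC = b"\x7fELF"
BOOT_FLAG_OFFSET = 0x1FE   # x86 boot protocol: boot_flag, 0xAA55 little-endian
HDRS_OFFSET = 0x202        # x86 boot protocol: "HdrS" header signature


def classify_kernel_image(header: bytes) -> str:
    """Return 'elf', 'bzImage', or 'unknown' based on the leading bytes of a file."""
    if header[:4] == ELF_MAGIC:
        # An uncompressed vmlinux is an ELF file.
        return "elf"
    has_boot_flag = header[BOOT_FLAG_OFFSET:BOOT_FLAG_OFFSET + 2] == b"\x55\xaa"
    has_hdrs = header[HDRS_OFFSET:HDRS_OFFSET + 4] == b"HdrS"
    if has_boot_flag and has_hdrs:
        return "bzImage"
    return "unknown"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Only the first bytes are needed for the check.
    with open(sys.argv[1], "rb") as f:
        print(classify_kernel_image(f.read(0x210)))
```

Run against the files listed above, the `vmlinuz-*` images would be expected to report `bzImage`, while an unpacked `vmlinux` would report `elf`.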
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot [off-list ref]
| Closes: https://lore.kernel.org/oe-lkp/202601270735.29b7c33e-lkp@intel.com



[   27.746952][ T1793] /lkp/lkp/src/monitors/meminfo: line 25: /lkp/lkp/src/bin/event/wait: not found
[   31.757224][ T1793] /lkp/lkp/src/monitors/oom-killer: line 94: dmesg: not found
[   31.757224][ T1793] /lkp/lkp/src/monitors/oom-killer: line 94: grep: not found
[   31.757224][ T1793] /lkp/lkp/src/monitors/oom-killer: line 25: /lkp/lkp/src/bin/event/wait: not found
[   65.744824][ T4974] trinity-main[4974]: segfault at 0 ip 0000000000000000 sp 00007ffe08d2ec08 error 15 likely on CPU 0 (core 0, socket 0)
[   65.746308][ T4974] Code: Unable to access opcode bytes at 0xffffffffffffffd6.

Code starting with the faulting instruction
===========================================
/etc/rc5.d/S77lkp-bootstrap: line 79: sleep: not found
BUG: kernel hang in test stage
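
The `not found` lines in the excerpt above name the commands the shell could not locate inside the guest. A minimal sketch for pulling them out of a saved dmesg or console log follows, assuming the `<script>: line <N>: <command>: not found` shape seen here. The function name `missing_commands` is hypothetical.

```python
import re

# Matches shell errors of the form "<script>: line <N>: <command>: not found".
NOT_FOUND = re.compile(
    r"(?P<script>\S+): line (?P<line>\d+): (?P<cmd>\S+): not found"
)


def missing_commands(log_text: str) -> list[str]:
    """Return the distinct commands reported as 'not found', in first-seen order."""
    seen: list[str] = []
    for match in NOT_FOUND.finditer(log_text):
        cmd = match.group("cmd")
        if cmd not in seen:
            seen.append(cmd)
    return seen


if __name__ == "__main__":
    # Two lines copied from the excerpt in this report.
    sample = (
        "[   31.757224][ T1793] /lkp/lkp/src/monitors/oom-killer: line 94: dmesg: not found\n"
        "/etc/rc5.d/S77lkp-bootstrap: line 79: sleep: not found\n"
    )
    print(missing_commands(sample))
```

The resulting list can be compared against the `which dmesg`, `which sleep`, and `which grep` checks shown earlier in the thread, where the same binaries were present.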



The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20260127/202601270735.29b7c33e-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
