Thread: 9 messages, 3 authors, 2024-08-20

Re: blktests failures with v6.11-rc1 kernel

From: Yi Zhang <hidden>
Date: 2024-08-13 07:06:48
Also in: linux-nvme, linux-rdma, linux-scsi

On Sat, Aug 3, 2024 at 12:49 AM Nilay Shroff [off-list ref] wrote:


On 8/2/24 18:04, Shinichiro Kawasaki wrote:
CC+: Yi Zhang,

On Aug 02, 2024 / 17:46, Nilay Shroff wrote:

On 8/2/24 14:39, Shinichiro Kawasaki wrote:
#3: nvme/052 (CKI failure)

   The CKI project reported that nvme/052 fails occasionally [4].
   This needs further debug effort.

  nvme/052 (tr=loop) (Test file-ns creation/deletion under one subsystem) [failed]
      runtime    ...  22.209s
      --- tests/nvme/052.out        2024-07-30 18:38:29.041716566 -0400
      +++ /mnt/tests/gitlab.com/redhat/centos-stream/tests/kernel/kernel-tests/-/archive/production/kernel-tests-production.zip/storage/blktests/nvme/nvme-loop/blktests/results/nodev_tr_loop/nvme/052.out.bad     2024-07-30 18:45:35.438067452 -0400
      @@ -1,2 +1,4 @@
       Running nvme/052
      +cat: /sys/block/nvme1n2/uuid: No such file or directory
      +cat: /sys/block/nvme1n2/uuid: No such file or directory
       Test complete

   [4] https://datawarehouse.cki-project.org/kcidb/tests/13669275
I just checked the console logs of the nvme/052 and from the logs it's
apparent that all namespaces were created successfully and so it's strange
to see that the test couldn't access "/sys/block/nvme1n2/uuid".
I agree that it's strange. I think the "No such file or directory" error
happened in _find_nvme_ns(), and it checks existence of the uuid file before
the cat command. I have no idea why the error happens.
Yes, exactly, and these two operations (checking for the existence of uuid
and running the cat command) are not atomic. So the only plausible theory I
have at this time is that if the namespace is deleted after the existence
check but before the cat command is executed, this issue may manifest.
Furthermore, as you mentioned, this issue is seen on the test machine
occasionally, so I asked if there's a possibility of simultaneous blktest
or some other tests running on this system.
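
To illustrate the check-then-read window described above, here is a minimal
bash sketch that reads the uuid attribute in a single step and treats a failed
read as "namespace not present". The function names and the sysfs_root
parameter are hypothetical and are not the actual _find_nvme_ns() code from
blktests. Note that this would only hide the symptom; it does not explain why
the uuid file goes away.

```shell
#!/bin/bash
# Hypothetical helper, NOT the blktests implementation: read a namespace's
# uuid attribute in one step instead of testing for the file and then
# running cat on it as two separate operations.
read_ns_uuid() {
	local ns_dir="$1" uuid
	# Single read attempt. If the namespace vanished, the read fails
	# quietly and the caller gets a non-zero status, not a cat error.
	uuid=$(cat "${ns_dir}/uuid" 2>/dev/null) || return 1
	printf '%s\n' "$uuid"
}

# Hypothetical lookup: print the name of the namespace under sysfs_root
# whose uuid matches the requested one.
find_ns_by_uuid() {
	local sysfs_root="$1" want="$2" ns_dir uuid
	for ns_dir in "${sysfs_root}"/nvme*n*; do
		# Skip entries whose uuid cannot be read (missing or removed).
		uuid=$(read_ns_uuid "$ns_dir") || continue
		if [[ "$uuid" == "$want" ]]; then
			basename "$ns_dir"
			return 0
		fi
	done
	return 1
}
```

Called as find_ns_by_uuid /sys/block "$uuid", it prints a name such as
nvme1n2 on a match and returns non-zero otherwise.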

No other tests run concurrently during the CKI test runs.
I reproduced the failure on that server, and it always reproduces
within 5 runs:
# sh a.sh
==============================0
nvme/052 (tr=loop) (Test file-ns creation/deletion under one subsystem) [passed]
    runtime  21.496s  ...  21.398s
==============================1
nvme/052 (tr=loop) (Test file-ns creation/deletion under one subsystem) [failed]
    runtime  21.398s  ...  21.974s
    --- tests/nvme/052.out 2024-08-10 00:30:06.989814226 -0400
    +++ /root/blktests/results/nodev_tr_loop/nvme/052.out.bad 2024-08-13 02:53:51.635047928 -0400
    @@ -1,2 +1,5 @@
     Running nvme/052
    +cat: /sys/block/nvme1n2/uuid: No such file or directory
    +cat: /sys/block/nvme1n2/uuid: No such file or directory
    +cat: /sys/block/nvme1n2/uuid: No such file or directory
     Test complete
# uname -r
6.11.0-rc3
[root@hpe-rl300gen11-04 blktests]# lsblk
NAME                                MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
zram0                               252:0    0     8G  0 disk [SWAP]
nvme0n1                             259:0    0 447.1G  0 disk
├─nvme0n1p1                         259:1    0   600M  0 part /boot/efi
├─nvme0n1p2                         259:2    0     1G  0 part /boot
└─nvme0n1p3                         259:3    0 445.5G  0 part
  └─fedora_hpe--rl300gen11--04-root 253:0    0 445.5G  0 lvm  /
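
The a.sh script used for the runs above is not included in this message. A
loop of roughly the following shape would produce the same kind of numbered
separator output; this is a guess at its structure, and the run_repeatedly
name is hypothetical.

```shell
#!/bin/bash
# Hypothetical reconstruction of a repeat loop like a.sh; the real script
# is not shown in the thread.
run_repeatedly() {
	local count="$1" i
	shift
	for ((i = 0; i < count; i++)); do
		# Numbered separator, matching the style of the output above.
		echo "==============================${i}"
		# Run the given command; keep looping even if it fails.
		"$@"
	done
}

# Example, run from the blktests top-level directory:
#   run_repeatedly 5 ./check nvme/052
```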

Thanks,
--Nilay

-- 
Best Regards,
  Yi Zhang