Thread (2 messages) 2 messages, 2 authors, 2021-09-06

Re: nvme-tcp crashes the system when overloading the backend device.

From: Mark Ruijter <hidden>
Date: 2021-09-06 12:25:49

This is all the output I could still find from the test.
If you are still missing info than let me know. I will need to re-run the test in that case.

--Mark

On 06/09/2021, 13:12, "Sagi Grimberg" [off-list ref] wrote:


    > Hi Sagi,
    > 
    > I installed a recent kernel on the system and restarted the test.
    > The kernel is: 5.10.57
    > 
    > Before the kernel would crash, I stopped the test by disconnecting the initiators.
    > "nvmetcli clear" did not hang and in this case, it still managed to remove the configuration.
    > 
    > # ls -l /sys/kernel/config/nvmet/ports/
    > total 0
    > # ls -l /sys/kernel/config/nvmet/subsystems/
    > total 0
    > 
    > However, after this I still see nvmet_tcp_wq workers that are actively running:
    > # ps aux | grep nvmet
    > USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
    > root       163  0.4  0.0      0     0 ?        I<   02:28   0:21 [kworker/24:0H-nvmet_tcp_wq]
    > root       199  0.0  0.0      0     0 ?        I<   02:28   0:02 [kworker/30:0H-nvmet_tcp_wq]
    > root       355  0.0  0.0      0     0 ?        I<   02:28   0:04 [kworker/56:0H-nvmet_tcp_wq]
    > root       361  0.0  0.0      0     0 ?        I<   02:28   0:03 [kworker/57:0H-nvmet_tcp_wq]
    > root       683  1.2  0.0      0     0 ?        D<   02:30   0:56 [kworker/53:1H+nvmet_tcp_wq]
    > root       785  1.0  0.0      0     0 ?        D<   02:30   0:44 [kworker/59:1H+nvmet_tcp_wq]
    > root      1200  0.1  0.0      0     0 ?        D<   02:30   0:08 [kworker/27:1H+nvmet_tcp_wq]
    > root     29212  2.0  0.0      0     0 ?        I<   03:28   0:18 [kworker/31:2H-nvmet_tcp_wq]
    > root     32691  0.0  0.0      0     0 ?        I<   02:31   0:00 [nvmet-buffered-]
    > root     39437  5.3  0.0      0     0 ?        D<   03:32   0:35 [kworker/51:3H+nvmet_tcp_wq]
    > root     39440  1.8  0.0      0     0 ?        I<   03:32   0:12 [kworker/59:3H-nvmet_tcp_wq]
    > root     39458 13.3  0.0      0     0 ?        I<   03:32   1:28 [kworker/18:3H-nvmet_tcp_wq]
    > root     39508  7.1  0.0      0     0 ?        D<   03:32   0:47 [kworker/53:4H+nvmet_tcp_wq]
    > root     39511  2.7  0.0      0     0 ?        D<   03:32   0:17 [kworker/28:5H+nvmet_tcp_wq]
    > root     39520  7.7  0.0      0     0 ?        D<   03:32   0:51 [kworker/52:3H+nvmet_tcp_wq]
    > root     39855  4.3  0.0      0     0 ?        I<   03:32   0:28 [kworker/48:4H-nvmet_tcp_wq]
    > root     39857  3.0  0.0      0     0 ?        D<   03:32   0:20 [kworker/28:7H+nvmet_tcp_wq]
    > root     39902  6.3  0.0      0     0 ?        D<   03:32   0:41 [kworker/27:6H+nvmet_tcp_wq]
    > root     39928  5.3  0.0      0     0 ?        D<   03:32   0:35 [kworker/25:9H+nvmet_tcp_wq]
    > root     39963  8.8  0.0      0     0 ?        D<   03:32   0:57 [kworker/24:6H+nvmet_tcp_wq]
    > root     40024  3.3  0.0      0     0 ?        I<   03:32   0:21 [kworker/28:9H-nvmet_tcp_wq]
    > root     40087  6.3  0.0      0     0 ?        I<   03:32   0:41 [kworker/53:6H-nvmet_tcp_wq]
    > root     40169  6.1  0.0      0     0 ?        D<   03:32   0:40 [kworker/59:5H+nvmet_tcp_wq]
    > root     40201  3.5  0.0      0     0 ?        D<   03:32   0:23 [kworker/54:8H+nvmet_tcp_wq]
    > root     40333  0.6  0.0      0     0 ?        D<   03:32   0:04 [kworker/59:7H+nvmet_tcp_wq]
    > root     40371  0.4  0.0      0     0 ?        I<   03:32   0:03 [kworker/49:5H-nvmet_tcp_wq]
    > root     40375  2.5  0.0      0     0 ?        I<   03:32   0:16 [kworker/20:8H-nvmet_tcp_wq]
    > root     40517  0.4  0.0      0     0 ?        I<   03:32   0:02 [kworker/58:6H-nvmet_tcp_wq]
    > root     40811  2.8  0.0      0     0 ?        D<   03:33   0:17 [kworker/51:9H+nvmet_tcp_wq]
    > root     40864  1.5  0.0      0     0 ?        I<   03:33   0:09 [kworker/29:5H-nvmet_tcp_wq]
    > root     40891  1.7  0.0      0     0 ?        I<   03:33   0:10 [kworker/17:9H-nvmet_tcp_wq]
    > root     40902  4.3  0.0      0     0 ?        D<   03:33   0:25 [kworker/59:8H+nvmet_tcp_wq]
    > root     41061  3.3  0.0      0     0 ?        I<   03:34   0:18 [kworker/51:10H-nvmet_tcp_wq]
    > root     41145  2.6  0.0      0     0 ?        D<   03:34   0:14 [kworker/56:7H+nvmet_tcp_wq]
    > root     41278  1.3  0.0      0     0 ?        I<   03:34   0:07 [kworker/22:9H-nvmet_tcp_wq]
    > 
    > I've attached dmesg.txt as requested.

    The dmesg output seems incomplete, I'm missing the nvmet-tcp threads 
    that are blocked. Do you have this output in the journal that you can share?

Attachments

Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help