Thread (12 messages) 12 messages, 3 authors, 2016-02-01

Re: ata: BUG in ata_sff_hsm_move

From: Dmitry Vyukov <dvyukov@google.com>
Date: 2016-01-29 13:40:57
Also in: lkml

On Fri, Jan 29, 2016 at 2:18 PM, Dmitry Vyukov [off-list ref] wrote:
On Fri, Jan 29, 2016 at 1:23 PM, Tejun Heo [off-list ref] wrote:
quoted
Hello, Dmitry.

On Fri, Jan 29, 2016 at 12:59:49PM +0100, Dmitry Vyukov wrote:
quoted
quoted
Hmmm... the port interrupt handler checks for IDLE before calling into
hsm_move, so the only explanation would be that something is resetting
it to IDLE inbetween.  ce7514526742 ("libata: prevent HSM state change
race between ISR and PIO") describes and fixes the same problem.  The
fix seems correct and I can't find anywhere else where this can
happen.  :(

Can you please post the kernel log leading to the BUG?  Also, I don't
think that condition needs to be BUG.  I'll change it to WARN.
Here are two logs, in both cases no kernel messages prior to the bug:
https://gist.githubusercontent.com/dvyukov/5087d633e3620280b6c7/raw/31c9ab1ced92ac5f85cfb15eaf48ec5793c2c3a1/gistfile1.txt
https://gist.githubusercontent.com/dvyukov/825b2e3d5fb80ae08a9a/raw/03c5a4f4c4bd9d0a304a71cda2da4c92f4b7f1ba/gistfile1.txt
lol, this is kinda embarrassing.  It looks like the poll path wasn't
doing any locking.  Can you please verify the following patch at least
doesn't crash the machine immediately and if so keep it applied to the
test kernel so that we can verify that the problem actually goes away?

Great that you managed to debug it without a repro!
I've applied this patch to my tree and will rerun fuzzer. I will
notify you if I see this warning again.
Thanks
It now deadlocks at this stack. It is probably not OK to call
ata_sff_hsm_move holding the spinlock.

[    8.168809] NMI backtrace for cpu 2
[    8.168809] CPU: 2 PID: 17 Comm: kworker/2:0 Not tainted 4.5.0-rc1+ #302
[    8.168809] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Bochs 01/01/2011
[    8.168809] Workqueue: ata_sff ata_sff_pio_task
[    8.168809] task: ffff88006d24c740 ti: ffff88006d090000 task.ti:
ffff88006d090000
[    8.168809] RIP: 0010:[<ffffffff82c0e388>]  [<ffffffff82c0e388>]
delay_tsc+0x18/0x70
[    8.168809] RSP: 0000:ffff88006d097a60  EFLAGS: 00000086
[    8.168809] RAX: 000000008edaeede RBX: ffff880036e88028 RCX: 000000000000001e
[    8.168809] RDX: 0000000000000033 RSI: 000000338edaee9c RDI: 0000000000000001
[    8.168809] RBP: ffff88006d097a60 R08: 0000000000000002 R09: 0000000000000000
[    8.168809] R10: fffffbfff11baf82 R11: 1ffffffff134edc5 R12: ffff880036e88038
[    8.168809] R13: 000000009a9d2d40 R14: ffff880036e88030 R15: 0000000099383d98
[    8.168809] FS:  0000000000000000(0000) GS:ffff88006d600000(0000)
knlGS:0000000000000000
[    8.168809] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    8.168809] CR2: 00000000ffffffff CR3: 0000000007a4b000 CR4: 00000000000006e0
[    8.168809] Stack:
[    8.168809]  ffff88006d097a70 ffffffff82c0e2da ffff88006d097aa8
ffffffff814661a9
[    8.168809]  ffff880036e88028 0000000000000082 ffff880036e90008
ffff880036e90010
[    8.168809]  0000000000000001 ffff88006d097ad0 ffffffff86652c47
ffffffff83a80c4c
[    8.168809] Call Trace:
[    8.168809]  [<ffffffff82c0e2da>] __delay+0xa/0x10
[    8.168809]  [<ffffffff814661a9>] do_raw_spin_lock+0x149/0x2b0
[    8.168809]  [<ffffffff86652c47>] _raw_spin_lock_irqsave+0xa7/0xd0
[    8.168809]  [<ffffffff83a80c4c>] ? ata_hsm_qc_complete+0x12c/0x420
[    8.168809]  [<ffffffff83a80c4c>] ata_hsm_qc_complete+0x12c/0x420
[    8.168809]  [<ffffffff83a82f42>] ata_sff_hsm_move+0x1e2/0x1c60
[    8.168809]  [<ffffffff83a85f17>] ? ata_sff_pio_task+0x87/0x530
[    8.168809]  [<ffffffff83a86296>] ata_sff_pio_task+0x406/0x530
[    8.168809]  [<ffffffff813a2386>] process_one_work+0x796/0x1440
[    8.168809]  [<ffffffff813a22ba>] ? process_one_work+0x6ca/0x1440
[    8.168809]  [<ffffffff813d6190>] ? finish_task_switch+0x120/0x5f0
[    8.168809]  [<ffffffff813a1bf0>] ? pwq_dec_nr_in_flight+0x2e0/0x2e0
[    8.168809]  [<ffffffff813a310b>] worker_thread+0xdb/0xfc0
[    8.168809]  [<ffffffff813b637f>] kthread+0x23f/0x2d0
[    8.168809]  [<ffffffff813a3030>] ? process_one_work+0x1440/0x1440
[    8.168809]  [<ffffffff813b6140>] ? kthread_create_on_node+0x3b0/0x3b0
[    8.168809]  [<ffffffff813b6140>] ? kthread_create_on_node+0x3b0/0x3b0
[    8.168809]  [<ffffffff8665362f>] ret_from_fork+0x3f/0x70
[    8.168809]  [<ffffffff813b6140>] ? kthread_create_on_node+0x3b0/0x3b0



quoted
diff --git a/drivers/ata/libata-sff.c b/drivers/ata/libata-sff.c
index 608677d..6991efc 100644
--- a/drivers/ata/libata-sff.c
+++ b/drivers/ata/libata-sff.c
@@ -1362,12 +1362,14 @@ static void ata_sff_pio_task(struct work_struct *work)
        u8 status;
        int poll_next;

+       spin_lock_irq(ap->lock);
+
        BUG_ON(ap->sff_pio_task_link == NULL);
        /* qc can be NULL if timeout occurred */
        qc = ata_qc_from_tag(ap, link->active_tag);
        if (!qc) {
                ap->sff_pio_task_link = NULL;
-               return;
+               goto out_unlock;
        }

 fsm_start:
@@ -1382,11 +1384,14 @@ static void ata_sff_pio_task(struct work_struct *work)
         */
        status = ata_sff_busy_wait(ap, ATA_BUSY, 5);
        if (status & ATA_BUSY) {
+               spin_unlock_irq(ap->lock);
                ata_msleep(ap, 2);
+               spin_lock_irq(ap->lock);
+
                status = ata_sff_busy_wait(ap, ATA_BUSY, 10);
                if (status & ATA_BUSY) {
                        ata_sff_queue_pio_task(link, ATA_SHORT_PAUSE);
-                       return;
+                       goto out_unlock;
                }
        }
@@ -1403,6 +1408,8 @@ static void ata_sff_pio_task(struct work_struct *work)
         */
        if (poll_next)
                goto fsm_start;
+out_unlock:
+       spin_unlock_irq(ap->lock);
 }

 /**

--
You received this message because you are subscribed to the Google Groups "syzkaller" group.
To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help