Thread (5 messages) 5 messages, 2 authors, 2021-05-25

RE: [PATCH] usb: cdnsp: Fix deadlock issue in cdnsp_thread_irq_handler

From: Pawel Laszczak <pawell@cadence.com>
Date: 2021-05-25 05:11:44
Also in: stable

On 21-05-24 10:56:23, Pawel Laszczak wrote:
quoted
quoted

On 21-05-20 11:42:24, Pawel Laszczak wrote:
quoted
From: Pawel Laszczak <pawell@cadence.com>

Patch fixes the critical issue caused by deadlock which has been detected
during testing NCM class.

The root cause of issue is spin_lock/spin_unlock instruction instead
spin_lock_irqsave/spin_lock_irqrestore in cdnsp_thread_irq_handler
function.
Pawel, would you please explain more about why the deadlock occurs with
current code, and why this patch could fix it?
I'm going to add such description to commit message:

Lack of spin_lock_irqsave causes that during handling threaded
interrupt inside spin_lock/spin_unlock section the ethernet
subsystem invokes eth_start_xmit function from interrupt context
which in turn starts queueing free requests in cdnsp driver.
Consequently, the deadlock occurs inside cdnsp_gadget_ep_queue
on spin_lock_irqsave instruction. Replacing spin_lock/spin_unlock
with spin_lock_irqsave/spin_lock_irqrestor causes disableing
%s/disableing/disabling
quoted
interrupts and blocks queuing requests by ethernet subsystem until
cdnsp_thread_irq_handler ends..

I hope this description is sufficient.
A call stack graph may be better, like [1]

[1]: https://urldefense.com/v3/__https://www.spinics.net/lists/linux-
usb/msg211931.html__;!!EHscmS1ygiU1lA!WCxrsHL4OWKd3iPdyymp-dQlVCoiHM9QzjIZPUC6-Dm6cnpU5CRLLOHrgiYSRA$
After putting online extra cpus, I've manage to catch call trace showing the deadlock issue:

[  223.051673] smp: csd: Detected non-responsive CSD lock (#1) on CPU#0, waiting 5000000015 ns for CPU#02 do_sync_core+0x0/0x30(0x0).
[  223.051690] smp:     csd: CSD lock (#1) unresponsive.
[  223.051693] Sending NMI from CPU 0 to CPUs 2:
[  223.052690] NMI backtrace for cpu 2
[  223.052692] CPU: 2 PID: 3053 Comm: irq/16-0000:01: Tainted: G           OE     5.11.0-rc1+ #5
[  223.052693] Hardware name: ASUS All Series/Q87T, BIOS 0908 07/22/2014
[  223.052693] RIP: 0010:native_queued_spin_lock_slowpath+0x61/0x1d0
[  223.052695] Code: 0f ba 2f 08 0f 92 c0 0f b6 c0 c1 e0 08 89 c2 8b 07 30 e4 09 d0 a9 00 01 ff ff 75 47 85 c0 74 0e 8b 07 84 c0 74 08 f3 90 8b 07 <84> c0 75 f8 b8 01 00 00 00 5d 66 89 07 c3 8b 37 b8 00 02 00 00 81
[  223.052696] RSP: 0018:ffffbc494011cde0 EFLAGS: 00000002
[  223.052698] RAX: 0000000000000101 RBX: ffff9ee8116b4a68 RCX: 0000000000000000
[  223.052699] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9ee8116b4658
[  223.052700] RBP: ffffbc494011cde0 R08: 0000000000000001 R09: 0000000000000000
[  223.052701] R10: ffff9ee8116b4670 R11: 0000000000000000 R12: ffff9ee8116b4658
[  223.052702] R13: ffff9ee8116b4670 R14: 0000000000000246 R15: ffff9ee8116b4658
[  223.052702] FS:  0000000000000000(0000) GS:ffff9ee89b400000(0000) knlGS:0000000000000000
[  223.052703] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  223.052704] CR2: 00007f7bcc41a830 CR3: 000000007a612003 CR4: 00000000001706e0
[  223.052705] Call Trace:
[  223.052706]  <IRQ>
[  223.052706]  do_raw_spin_lock+0xc0/0xd0
[  223.052707]  _raw_spin_lock_irqsave+0x95/0xa0
[  223.052708]  cdnsp_gadget_ep_queue.cold+0x88/0x107 [cdnsp_udc_pci]
[  223.052708]  usb_ep_queue+0x35/0x110
[  223.052709]  eth_start_xmit+0x220/0x3d0 [u_ether]
[  223.052710]  ncm_tx_timeout+0x34/0x40 [usb_f_ncm]
[  223.052711]  ? ncm_free_inst+0x50/0x50 [usb_f_ncm]
[  223.052711]  __hrtimer_run_queues+0xac/0x440
[  223.052712]  hrtimer_run_softirq+0x8c/0xb0
[  223.052713]  __do_softirq+0xcf/0x428
[  223.052713]  asm_call_irq_on_stack+0x12/0x20
[  223.052714]  </IRQ>
[  223.052715]  do_softirq_own_stack+0x61/0x70
[  223.052715]  irq_exit_rcu+0xc1/0xd0
[  223.052716]  sysvec_apic_timer_interrupt+0x52/0xb0
[  223.052717]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[  223.052718] RIP: 0010:do_raw_spin_trylock+0x18/0x40
[  223.052719] Code: ff ff 66 90 eb 86 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 55 8b 07 48 89 e5 85 c0 75 29 ba 01 00 00 00 f0 0f b1 17 <75> 1e 65 8b 05 2f 77 ee 70 89 47 08 5d 65 48 8b 04 25 40 7e 01 00
[  223.052720] RSP: 0018:ffffbc494138bda8 EFLAGS: 00000246
[  223.052721] RAX: 0000000000000000 RBX: ffff9ee8116b4658 RCX: 0000000000000000
[  223.052722] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9ee8116b4658
[  223.052723] RBP: ffffbc494138bda8 R08: 0000000000000001 R09: 0000000000000000
[  223.052724] R10: ffff9ee8116b4670 R11: 0000000000000000 R12: ffff9ee8116b4658
[  223.052725] R13: ffff9ee8116b4670 R14: ffff9ee7b5c73d80 R15: ffff9ee8116b4000
[  223.052726]  _raw_spin_lock+0x3d/0x70
[  223.052726]  ? cdnsp_thread_irq_handler.cold+0x32/0x112c [cdnsp_udc_pci]
[  223.052727]  cdnsp_thread_irq_handler.cold+0x32/0x112c [cdnsp_udc_pci]
[  223.052728]  ? cdnsp_remove_request+0x1f0/0x1f0 [cdnsp_udc_pci]
[  223.052729]  ? cdnsp_thread_irq_handler+0x5/0xa0 [cdnsp_udc_pci]
[  223.052730]  ? irq_thread+0xa0/0x1c0
[  223.052730]  irq_thread_fn+0x28/0x60
[  223.052731]  irq_thread+0x105/0x1c0
[  223.052732]  ? __kthread_parkme+0x42/0x90
[  223.052732]  ? irq_forced_thread_fn+0x90/0x90
[  223.052733]  ? wake_threads_waitq+0x30/0x30
[  223.052734]  ? irq_thread_check_affinity+0xe0/0xe0
[  223.052735]  kthread+0x12a/0x160
[  223.052735]  ? kthread_park+0x90/0x90
[  223.052736]  ret_from_fork+0x22/0x30

Pawel
Peter
quoted
Thanks,
Pawel
quoted
quoted
Cc: stable@vger.kernel.org
Fixes: 3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
Signed-off-by: Pawel Laszczak <pawell@cadence.com>
---
 drivers/usb/cdns3/cdnsp-ring.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/cdns3/cdnsp-ring.c b/drivers/usb/cdns3/cdnsp-ring.c
index 5f0513c96c04..68972746e363 100644
--- a/drivers/usb/cdns3/cdnsp-ring.c
+++ b/drivers/usb/cdns3/cdnsp-ring.c
@@ -1517,13 +1517,14 @@ irqreturn_t cdnsp_thread_irq_handler(int irq, void *data)
 {
 	struct cdnsp_device *pdev = (struct cdnsp_device *)data;
 	union cdnsp_trb *event_ring_deq;
+	unsigned long flags;
 	int counter = 0;

-	spin_lock(&pdev->lock);
+	spin_lock_irqsave(&pdev->lock, flags);

 	if (pdev->cdnsp_state & (CDNSP_STATE_HALTED | CDNSP_STATE_DYING)) {
 		cdnsp_died(pdev);
-		spin_unlock(&pdev->lock);
+		spin_unlock_irqrestore(&pdev->lock, flags);
 		return IRQ_HANDLED;
 	}
@@ -1539,7 +1540,7 @@ irqreturn_t cdnsp_thread_irq_handler(int irq, void *data)

 	cdnsp_update_erst_dequeue(pdev, event_ring_deq, 1);

-	spin_unlock(&pdev->lock);
+	spin_unlock_irqrestore(&pdev->lock, flags);

 	return IRQ_HANDLED;
 }
--
2.25.1
--

Thanks,
Peter Chen
--

Thanks,
Peter Chen
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help