Re: [PATCH v10 0/6] Re-introduce TX FIFO resize for larger EP bursting
From: Thinh Nguyen <Thinh.Nguyen@synopsys.com>
Date: 2021-08-21 02:57:26
Also in:
linux-arm-msm, linux-usb, lkml
Possibly related (same subject, not in this thread)
- 2021-08-26 · Re: [PATCH v10 0/6] Re-introduce TX FIFO resize for larger EP bursting · Pavel Hofman <hidden>
- 2021-08-26 · Re: [PATCH v10 0/6] Re-introduce TX FIFO resize for larger EP bursting · Ferry Toth <hidden>
- 2021-08-26 · Re: [PATCH v10 0/6] Re-introduce TX FIFO resize for larger EP bursting · Thinh Nguyen <Thinh.Nguyen@synopsys.com>
- 2021-07-23 · Re: [PATCH v10 0/6] Re-introduce TX FIFO resize for larger EP bursting · Felipe Balbi <balbi@kernel.org>
- 2021-07-22 · Re: [PATCH v10 0/6] Re-introduce TX FIFO resize for larger EP bursting · Ferry Toth <hidden>
Pavel Hofman wrote:
Dne 19. 08. 21 v 22:10 Ferry Toth napsal(a):quoted
Hi Op 19-08-2021 om 09:51 schreef Pavel Hofman:quoted
Hi, Dne 18. 08. 21 v 21:07 Ferry Toth napsal(a):quoted
Hi, Op 18-08-2021 om 00:00 schreef Ferry Toth:quoted
Hi, Op 16-08-2021 om 07:18 schreef Felipe Balbi:quoted
Hi, Ferry Toth [off-list ref] writes:quoted
quoted
Ferry Toth [off-list ref] writes:quoted
quoted
quoted
quoted
Ferry Toth [off-list ref] writes:quoted
quoted
quoted
quoted
quoted
Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542 2015.01.21:18.19.48 RIP: 0010:0x500000000 Code: Unable to access opcode bytes at RIP 0x4ffffffd6. RSP: 0018:ffffa4d00045fc28 EFLAGS: 00010046 RAX: 0000000500000000 RBX: ffff8cd546aed200 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff8cd547bfcae0 RDI: ffff8cd546aed200 RBP: ffff8cd547bfcae0 R08: 0000000000000000 R09: 0000000000000001 R10: ffff8cd541fd28c0 R11: 0000000000000000 R12: ffff8cd547342828 R13: ffff8cd546aed248 R14: 0000000000000000 R15: ffff8cd548b1d000 FS: 0000000000000000(0000) GS:ffff8cd57e200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000500000000 CR3: 000000000311e000 CR4: 00000000001006f0 Call Trace: ? dwc3_remove_requests.constprop.0+0x14d/0x170 ? __dwc3_gadget_ep_disable+0x7a/0x160 ? dwc3_gadget_ep_disable+0x3d/0xd0 ? usb_ep_disable+0x1c/0x ? u_audio_stop_capture+0x79/0x120 [u_audio] ? afunc_set_alt+0x73/0x80 [usb_f_uac2]So this is triggered by a SetInterface request...quoted
quoted
quoted
quoted
quoted
quoted
quoted
? composite_setup+0x224/0x1b90 [libcomposite] ? __dwc3_gadget_kick_transfer+0x160/0x400 ? dwc3_gadget_ep_queue+0xf3/0x1a0 ? configfs_composite_setup+0x6b/0x90 [libcomposite] ? configfs_composite_setup+0x6b/0x90 [libcomposite] ? dwc3_ep0_interrupt+0x459/0xa40 ? dwc3_thread_interrupt+0x8ee/0xf40 ? __schedule+0x235/0x6c0 ? disable_irq_nosync+0x10/0x10 ? irq_thread_fn+0x1b/0x60 ? irq_thread+0xc0/0x160 ? irq_thread_check_affinity+0x70/0x70 ? irq_forced_thread_fn+0x70/0x70 ? kthread+0x122/0x140 ? set_kthread_struct+0x40/0x40 ? ret_from_fork+0x22/0x30Do you mind enabling dwc3 traces and collecting them? Trying to figure out how we got here.I'll try if I can get the same error by booting with USB in host mode and then switch to device mode. If so I can enable traces and collect as you explained me before. I'll try before monday, as then I fly for a holiday and will not be available before rc5.you can enable all of those with kernel cmdline :-) https://urldefense.com/v3/__https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html__;!!A4F2R9G_pg!PZHOWnH3HDs-9a9bz4KLDShZpsubSfG84fu3wEZyK4Q_Wvv2lhHKuk7LJpWPAPch0qja$ you need ftrace_dump_on_oops=1 and also need the correct options on trace_buf_size and trace_event.On Edison-Arduino I have a switch to go to device mode, after which udev triggers a script configure gadgets through configfs. I tried to log following these instructions: https://urldefense.com/v3/__https://www.kernel.org/doc/html/latest/driver-api/usb/dwc3.html*reporting-bugs__;Iw!!A4F2R9G_pg!PZHOWnH3HDs-9a9bz4KLDShZpsubSfG84fu3wEZyK4Q_Wvv2lhHKuk7LJpWPAG64agnA$ <https://urldefense.com/v3/__https://www.kernel.org/doc/html/latest/driver-api/usb/dwc3.html*reporting-bugs__;Iw!!A4F2R9G_pg!PZHOWnH3HDs-9a9bz4KLDShZpsubSfG84fu3wEZyK4Q_Wvv2lhHKuk7LJpWPAG64agnA$quoted
Unfortunately the kernel crashes so badly I can not get to the ` cp /t/trace /root/trace.txt` line (after a while the watchdog kicks). What to do next?Pass ftrace_dump_on_oops to kernel cmdline.No sure if I did this right, on oops everything is pushed to console (115k2 serial), I hope nothing essential is lost. I copied the screen buffer to file see attached.Thank you, I bet it took quite a some time :-) Anyway, looking at the logs around Set Interface requests, we can track every endpoint that's disabled. I'll take a guess and assume we're failing at the last Set Interface, that means we should have something odd with ep6in, but everything looks fine in the trace output: [ 75.823107] irq/14-d-596 0d... 42789194us : dwc3_gadget_ep_enable: ep6in: mps 192/346 streams 16 burst 0 ring 0/0 flags E:swbp:< [ 75.835472] irq/14-d-596 0d... 42789198us : dwc3_alloc_request: ep6in: req 0000000002c71409 length 0/0 zsI ==> 0 [ 75.846416] irq/14-d-596 0d... 42789202us : dwc3_ep_queue: ep6in: req 0000000002c71409 length 0/192 zsI ==> -115 [ 75.857360] irq/14-d-596 0d... 42789204us : dwc3_alloc_request: ep6in: req 00000000a324f5d0 length 0/0 zsI ==> 0 [ 75.868301] irq/14-d-596 0d... 42789206us : dwc3_ep_queue: ep6in: req 00000000a324f5d0 length 0/192 zsI ==> -115 [ 75.879244] irq/14-d-596 0d... 42789209us : dwc3_event: event (000020c2): ep0in: Transfer Not Ready [0] (Not Active) [Status Phase] [ 75.891880] irq/14-d-596 0d... 42789211us : dwc3_prepare_trb: ep0in: trb 000000004c0ae319 (E0:D0) buf 000000001bded000 size 0 ctrl 00000c33 (HLcs:SC:status2) [ 75.989131] irq/14-d-596 0d... 42789224us : dwc3_gadget_ep_cmd: ep0in: cmd 'Start Transfer' [406] params 00000000 1bded000 00000000 --> status: Successful [ 76.096261] irq/14-d-596 0d... 42789272us : dwc3_event: event (0000c042): ep0in: Transfer Complete (sIL) [Status Phase] [ 76.107834] irq/14-d-596 0d... 42789275us : dwc3_complete_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf 000000001bded000 size 0 ctrl 00000c32 (hLcs:SC:status2) [ 76.122944] irq/14-d-596 0d... 42789277us : dwc3_gadget_giveback: ep0out: req 00000000cb1bd3cd length 0/0 zsI ==> 0 [ 76.134160] irq/14-d-596 0d... 42789280us : dwc3_prepare_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf 000000001bded000 size 8 ctrl 00000c23 (HLcs:SC:setup) [ 76.231322] irq/14-d-596 0d... 42789292us : dwc3_gadget_ep_cmd: ep0out: cmd 'Start Transfer' [406] params 00000000 1bded000 00000000 --> status: Successful [ 76.297418] kworker/-23 0d... 42789670us : dwc3_ep_queue: ep3in: req 0000000029586135 length 0/96 ZsI ==> -115 [ 76.308278] kworker/-23 0d... 42789695us : dwc3_prepare_trb: ep3in: trb 00000000b81213d6 (E1:D0) buf 0000000003b7a800 size 96 ctrl 00000811 (Hlcs:sC:normal) [ 76.395294] kworker/-23 0d... 42789707us : dwc3_gadget_ep_cmd: ep3in: cmd 'Update Transfer' [60007] params 00000000 00000000 00000000 --> status: Successful [ 76.471900] irq/14-d-596 0d... 42789842us : dwc3_event: event (0000c040): ep0out: Transfer Complete (sIL) [Setup Phase] [ 76.489308] irq/14-d-596 0d... 42789845us : dwc3_ctrl_req: Set Interface(Intf = 5, Alt.Setting = 0) [ 76.505650] irq/14-d-596 0d... 42789851us : dwc3_ep_dequeue: ep6in: req 0000000002c71409 length 0/192 zsI ==> -115 [ 76.523315] irq/14-d-596 0d... 42789854us : dwc3_gadget_giveback: ep6in: req 0000000002c71409 length 0/192 zsI ==> -104 [ 76.541427] irq/14-d-596 0d... 42789857us : dwc3_free_request: ep6in: req 0000000002c71409 length 0/192 zsI ==> -104 [ 76.559267] irq/14-d-596 0d... 42789859us : dwc3_ep_dequeue: ep6in: req 00000000a324f5d0 length 0/192 zsI ==> -115 [ 76.576937] irq/14-d-596 0d... 42789861us : dwc3_gadget_giveback: ep6in: req 00000000a324f5d0 length 0/192 zsI ==> -104 [ 76.595046] irq/14-d-596 0d... 42789862us : dwc3_free_request: ep6in: req 00000000a324f5d0 length 0/192 zsI ==> -104 [ 76.612892] irq/14-d-596 0d... 42789865us : dwc3_gadget_ep_disable: ep6in: mps 192/346 streams 16 burst 0 ring 0/0 flags E:swbp:< [ 76.665535] irq/14-d-596 0d... 42789873us : dwc3_event: event (000020c2): ep0in: Transfer Not Ready [0] (Not Active) [Status Phase] [ 76.684716] irq/14-d-596 0d... 42789875us : dwc3_prepare_trb: ep0in: trb 000000004c0ae319 (E0:D0) buf 000000001bded000 size 0 ctrl 00000c33 (HLcs:SC:status2) [ 76.819195] irq/14-d-596 0d... 42789886us : dwc3_gadget_ep_cmd: ep0in: cmd 'Start Transfer' [406] params 00000000 1bded000 00000000 --> status: Successful [ 76.926324] irq/14-d-596 0d... 42789930us : dwc3_event: event (0000c042): ep0in: Transfer Complete (sIL) [Status Phase] [ 76.937892] irq/14-d-596 0d... 42789933us : dwc3_complete_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf 000000001bded000 size 0 ctrl 00000c32 (hLcs:SC:status2) [ 76.953003] irq/14-d-596 0d... 42789935us : dwc3_gadget_giveback: ep0out: req 00000000cb1bd3cd length 0/0 zsI ==> 0 [ 76.964217] irq/14-d-596 0d... 42789938us : dwc3_prepare_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf 000000001bded000 size 8 ctrl 00000c23 (HLcs:SC:setup) [ 77.061379] irq/14-d-596 0d... 42789950us : dwc3_gadget_ep_cmd: ep0out: cmd 'Start Transfer' [406] params 00000000 1bded000 00000000 --> status: Successful [ 77.168595] irq/14-d-596 0d... 42790509us : dwc3_event: event (0000c040): ep0out: Transfer Complete (sIL) [Setup Phase] [ 77.180159] irq/14-d-596 0d... 42790512us : dwc3_ctrl_req: Get String Descriptor(Index = 18, Length = 255) [ 77.190578] irq/14-d-596 0d... 42790537us : dwc3_prepare_trb: ep0in: trb 000000004c0ae319 (E0:D0) buf 0000000003b68000 size 36 ctrl 00000c53 (HLcs:SC:data) [ 77.287648] irq/14-d-596 0d... 42790550us : dwc3_gadget_ep_cmd: ep0in: cmd 'Start Transfer' [406] params 00000000 1bded000 00000000 --> status: Successful [ 77.333107] irq/14-d-596 0d... 42790557us : dwc3_event: event (000010c2): ep0in: Transfer Not Ready [0] (Not Active) [Data Phase] [ 77.407223] irq/14-d-596 0d... 42790575us : dwc3_event: event (000090c2): ep0in: Transfer Not Ready [0] (Active) [Data Phase] [ 77.480985] irq/14-d-596 0d... 42790588us : dwc3_event: event (0000c042): ep0in: Transfer Complete (sIL) [Data Phase] [ 77.492376] irq/14-d-596 0d... 42790590us : dwc3_complete_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf 0000000003b68000 size 0 ctrl 00000c52 (hLcs:SC:data) [ 77.507221] irq/14-d-596 0d... 42790595us : dwc3_gadget_giveback: ep0out: req 00000000cb1bd3cd length 36/36 ZsI ==> 0 [ 77.518609] irq/14-d-596 0d... 42790597us : dwc3_event: event (000020c0): ep0out: Transfer Not Ready [0] (Not Active) [Status Phase] [ 77.531332] irq/14-d-596 0d... 42790598us : dwc3_prepare_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf 000000001bded000 size 0 ctrl 00000c43 (HLcs:SC:status3) [ 77.628669] irq/14-d-596 0d... 42790609us : dwc3_gadget_ep_cmd: ep0out: cmd 'Start Transfer' [406] params 00000000 1bded000 00000000 --> status: Successful Do you mind adding a few prints in dwc3_remove_requests to tell us which endpoint is being processed? Then we'll know for sure which one caused the crash.I wouldn't mind but am leaving on a holiday, won't have time until 6 aug.not a problem, we'll still be here when you're back :-)Well, let's go then :-) To get back in the mood I have retested 5.13.0, 5.14.0-rc1, 5.14.0-rc2 and 5.14.0-rc5. I find that 5.13.0 works fine, and the issue starts from 5.14.0-rc1.That's great finding. We have a bisection point. There are a total of 13764 commits between v5.13 and v5.14-rc1 $ git rev-list --count v5.13..v5.14-rc1 13764 git bisect should find the offending commit in at most 14 tries. That's not too bad.I correctly guesstimated that the problem got introduced by the usb merge 79160a60 "Merge tag 'usb-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb" 116 commits(7 bisects). 24f779dac8f3efb9629adc0e486914d93dc45517 is the first bad commit "usb: gadget: f_uac2/u_audio: add feedback endpoint support" Ruslan's 3 patches are related to each other so I reverted all three 24f779da...e89bb428 and applied the reverts to rc1. I can confirm this indeed resolves the problem in rc1. Is late now, tomorrow evening I will apply the reverts to rc6.With these reverts rc6 works fine as well. So, where do we go from here?I know the patches have been tested on dwc2 (by me and others). I do not know if Ruslan or Jerome tested them on dwc3 but probably not. Ruslan has talked about RPi (my case too) and BeagleboneBlack, both with dwc2. Perhaps the dwc2 behaves a bit differently than dwc3? The patches add a new EP-IN for async feedback. I am sorry I have not followed your long thread (it started as unrelated to uac). Does the problem appear with f_uac1 or f_uac2? Please how have you reached the above problem?I'm sorry too. I first believed the issue was related to the patch mentioned in the subject line. The problem appaers with f_uac2. I bost Edison_Arduino board in host mode (there is a switch allowing to select host/device mode). When flipping the switch to device mode udev run a script: But as I am using configfs (excerpt follows) and just disabling the last 2 line resolves the issue, I'm guessing uac2 is the issue. Or exceeding the available resources. # Create directory structure mkdir "${GADGET_BASE_DIR}" cd "${GADGET_BASE_DIR}" mkdir -p configs/c.1/strings/0x409 mkdir -p strings/0x409 # Serial device mkdir functions/gser.usb0 ln -s functions/gser.usb0 configs/c.1/ ### # Ethernet device mkdir functions/eem.usb0 echo "${DEV_ETH_ADDR}" > functions/eem.usb0/dev_addr echo "${HOST_ETH_ADDR}" > functions/eem.usb0/host_addr ln -s functions/eem.usb0 configs/c.1/ # Mass Storage device mkdir functions/mass_storage.usb0 echo 1 > functions/mass_storage.usb0/stall echo 0 > functions/mass_storage.usb0/lun.0/cdrom echo 0 > functions/mass_storage.usb0/lun.0/ro echo 0 > functions/mass_storage.usb0/lun.0/nofua echo "${USBDISK}" > functions/mass_storage.usb0/lun.0/file ln -s functions/mass_storage.usb0 configs/c.1/ # UAC2 device mkdir functions/uac2.usb0 ln -s functions/uac2.usb0 configs/c.1 ....As you say, could perhaps the reason be that the extra EP-IN added in those patches (previously 1, now 2 with the default config you use) exceeds your EP-IN max count or available fifos somehow? You have a number of functions initialized. If you change the load order of the functions, do you get the error later with a different function? Just guessing... You should be able to switch the default async EP-OUT (which configures the new feedback EP-IN ) to adaptive EP-OUT (which requires no feedback EP) with c_sync=8 parameter of f_uac2. https://urldefense.com/v3/__https://elixir.bootlin.com/linux/v5.14-rc6/source/drivers/usb/gadget/function/f_uac2.c*L47__;Iw!!A4F2R9G_pg!PZHOWnH3HDs-9a9bz4KLDShZpsubSfG84fu3wEZyK4Q_Wvv2lhHKuk7LJpWPAHQ6Lj4y$ https://urldefense.com/v3/__https://elixir.bootlin.com/linux/v5.14-rc6/source/drivers/usb/gadget/function/f_uac2.c*L830__;Iw!!A4F2R9G_pg!PZHOWnH3HDs-9a9bz4KLDShZpsubSfG84fu3wEZyK4Q_Wvv2lhHKuk7LJpWPAC7mfgo7$ https://urldefense.com/v3/__https://elixir.bootlin.com/linux/v5.14-rc6/source/include/uapi/linux/usb/ch9.h*L453__;Iw!!A4F2R9G_pg!PZHOWnH3HDs-9a9bz4KLDShZpsubSfG84fu3wEZyK4Q_Wvv2lhHKuk7LJpWPANdeuGIx$ Does that fix the problem?
Hi Pavel,
I took a look at 24f779dac8f3 ("usb: gadget: f_uac2/u_audio: add
feedback endpoint support") that Ferry reported the issue from
bisection. I see at least a couple problems in the new UAC2 changes.
1) usb_ep_dequeue() is asynchronous. Don't free requests before the
controller driver give them back.
2) Did you test with SuperSpeed? I don't see companion descriptor.
Please fix them.
Thanks,
Thinh