Thread (32 messages) 32 messages, 7 authors, 2021-09-02

Re: [PATCH v10 0/6] Re-introduce TX FIFO resize for larger EP bursting

From: Ferry Toth <hidden>
Date: 2021-08-19 20:11:04
Also in: linux-arm-msm, linux-devicetree, lkml

Possibly related (same subject, not in this thread)

Hi

Op 19-08-2021 om 09:51 schreef Pavel Hofman:
Hi,

Dne 18. 08. 21 v 21:07 Ferry Toth napsal(a):
quoted
Hi,

Op 18-08-2021 om 00:00 schreef Ferry Toth:
quoted
Hi,

Op 16-08-2021 om 07:18 schreef Felipe Balbi:
quoted
Hi,

Ferry Toth [off-list ref] writes:
quoted
quoted
Ferry Toth [off-list ref] writes:
quoted
quoted
quoted
quoted
Ferry Toth [off-list ref] writes:
quoted
quoted
quoted
quoted
quoted
Hardware name: Intel Corporation Merrifield/BODEGA BAY, 
BIOS 542
2015.01.21:18.19.48
RIP: 0010:0x500000000
Code: Unable to access opcode bytes at RIP 0x4ffffffd6.
RSP: 0018:ffffa4d00045fc28 EFLAGS: 00010046
RAX: 0000000500000000 RBX: ffff8cd546aed200 RCX: 
0000000000000000
RDX: 0000000000000000 RSI: ffff8cd547bfcae0 RDI: 
ffff8cd546aed200
RBP: ffff8cd547bfcae0 R08: 0000000000000000 R09: 
0000000000000001
R10: ffff8cd541fd28c0 R11: 0000000000000000 R12: 
ffff8cd547342828
R13: ffff8cd546aed248 R14: 0000000000000000 R15: 
ffff8cd548b1d000
FS:  0000000000000000(0000) GS:ffff8cd57e200000(0000) 
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000500000000 CR3: 000000000311e000 CR4: 
00000000001006f0
Call Trace:
        ? dwc3_remove_requests.constprop.0+0x14d/0x170
        ? __dwc3_gadget_ep_disable+0x7a/0x160
        ? dwc3_gadget_ep_disable+0x3d/0xd0
        ? usb_ep_disable+0x1c/0x
        ? u_audio_stop_capture+0x79/0x120 [u_audio]
        ? afunc_set_alt+0x73/0x80 [usb_f_uac2]
So this is triggered by a SetInterface request...
quoted
quoted
quoted
quoted
quoted
quoted
quoted
        ? composite_setup+0x224/0x1b90 [libcomposite]
        ? __dwc3_gadget_kick_transfer+0x160/0x400
        ? dwc3_gadget_ep_queue+0xf3/0x1a0
        ? configfs_composite_setup+0x6b/0x90 [libcomposite]
        ? configfs_composite_setup+0x6b/0x90 [libcomposite]
        ? dwc3_ep0_interrupt+0x459/0xa40
        ? dwc3_thread_interrupt+0x8ee/0xf40
        ? __schedule+0x235/0x6c0
        ? disable_irq_nosync+0x10/0x10
        ? irq_thread_fn+0x1b/0x60
        ? irq_thread+0xc0/0x160
        ? irq_thread_check_affinity+0x70/0x70
        ? irq_forced_thread_fn+0x70/0x70
        ? kthread+0x122/0x140
        ? set_kthread_struct+0x40/0x40
        ? ret_from_fork+0x22/0x30
Do you mind enabling dwc3 traces and collecting them? 
Trying to figure
out how we got here.
I'll try if I can get the same error by booting with USB in 
host mode
and then switch to device mode. If so I can enable traces 
and collect as
you explained me before.

I'll try before monday, as then I fly for a holiday and 
will not be
available before rc5.
you can enable all of those with kernel cmdline :-)

https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html 


you need ftrace_dump_on_oops=1 and also need the correct 
options on
trace_buf_size and trace_event.
On Edison-Arduino I have a switch to go to device mode, after 
which
udev triggers a script configure gadgets through configfs.

I tried to log following these instructions:

https://www.kernel.org/doc/html/latest/driver-api/usb/dwc3.html#reporting-bugs 
<https://www.kernel.org/doc/html/latest/driver-api/usb/dwc3.html#reporting-bugs> 


Unfortunately the kernel crashes so badly I can not get to 
the ` cp
/t/trace /root/trace.txt` line (after a while the watchdog 
kicks).

What to do next?
Pass ftrace_dump_on_oops to kernel cmdline.
No sure if I did this right, on oops everything is pushed to 
console
(115k2 serial), I hope nothing essential is lost.

I copied the screen buffer to file see attached.
Thank you, I bet it took quite a some time :-) Anyway, looking at
the logs around Set Interface requests, we can track every endpoint
that's disabled. I'll take a guess and assume we're failing at 
the last
Set Interface, that means we should have something odd with 
ep6in, but
everything looks fine in the trace output:

[   75.823107] irq/14-d-596       0d... 42789194us : 
dwc3_gadget_ep_enable: ep6in: mps 192/346 streams 16 burst 0 
ring 0/0 flags E:swbp:<
[   75.835472] irq/14-d-596       0d... 42789198us : 
dwc3_alloc_request: ep6in: req 0000000002c71409 length 0/0 zsI 
==> 0
[   75.846416] irq/14-d-596       0d... 42789202us : 
dwc3_ep_queue: ep6in: req 0000000002c71409 length 0/192 zsI ==> 
-115
[   75.857360] irq/14-d-596       0d... 42789204us : 
dwc3_alloc_request: ep6in: req 00000000a324f5d0 length 0/0 zsI 
==> 0
[   75.868301] irq/14-d-596       0d... 42789206us : 
dwc3_ep_queue: ep6in: req 00000000a324f5d0 length 0/192 zsI ==> 
-115
[   75.879244] irq/14-d-596       0d... 42789209us : dwc3_event: 
event (000020c2): ep0in: Transfer Not Ready [0] (Not Active) 
[Status Phase]
[   75.891880] irq/14-d-596       0d... 42789211us : 
dwc3_prepare_trb: ep0in: trb 000000004c0ae319 (E0:D0) buf 
000000001bded000 size 0 ctrl 00000c33 (HLcs:SC:status2)
[   75.989131] irq/14-d-596       0d... 42789224us : 
dwc3_gadget_ep_cmd: ep0in: cmd 'Start Transfer' [406] params 
00000000 1bded000 00000000 --> status: Successful
[   76.096261] irq/14-d-596       0d... 42789272us : dwc3_event: 
event (0000c042): ep0in: Transfer Complete (sIL) [Status Phase]
[   76.107834] irq/14-d-596       0d... 42789275us : 
dwc3_complete_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf 
000000001bded000 size 0 ctrl 00000c32 (hLcs:SC:status2)
[   76.122944] irq/14-d-596       0d... 42789277us : 
dwc3_gadget_giveback: ep0out: req 00000000cb1bd3cd length 0/0 
zsI ==> 0
[   76.134160] irq/14-d-596       0d... 42789280us : 
dwc3_prepare_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf 
000000001bded000 size 8 ctrl 00000c23 (HLcs:SC:setup)
[   76.231322] irq/14-d-596       0d... 42789292us : 
dwc3_gadget_ep_cmd: ep0out: cmd 'Start Transfer' [406] params 
00000000 1bded000 00000000 --> status: Successful
[   76.297418] kworker/-23        0d... 42789670us : 
dwc3_ep_queue: ep3in: req 0000000029586135 length 0/96 ZsI ==> -115
[   76.308278] kworker/-23        0d... 42789695us : 
dwc3_prepare_trb: ep3in: trb 00000000b81213d6 (E1:D0) buf 
0000000003b7a800 size 96 ctrl 00000811 (Hlcs:sC:normal)
[   76.395294] kworker/-23        0d... 42789707us : 
dwc3_gadget_ep_cmd: ep3in: cmd 'Update Transfer' [60007] params 
00000000 00000000 00000000 --> status: Successful
[   76.471900] irq/14-d-596       0d... 42789842us : dwc3_event: 
event (0000c040): ep0out: Transfer Complete (sIL) [Setup Phase]
[   76.489308] irq/14-d-596       0d... 42789845us : 
dwc3_ctrl_req: Set Interface(Intf = 5, Alt.Setting = 0)
[   76.505650] irq/14-d-596       0d... 42789851us : 
dwc3_ep_dequeue: ep6in: req 0000000002c71409 length 0/192 zsI 
==> -115
[   76.523315] irq/14-d-596       0d... 42789854us : 
dwc3_gadget_giveback: ep6in: req 0000000002c71409 length 0/192 
zsI ==> -104
[   76.541427] irq/14-d-596       0d... 42789857us : 
dwc3_free_request: ep6in: req 0000000002c71409 length 0/192 zsI 
==> -104
[   76.559267] irq/14-d-596       0d... 42789859us : 
dwc3_ep_dequeue: ep6in: req 00000000a324f5d0 length 0/192 zsI 
==> -115
[   76.576937] irq/14-d-596       0d... 42789861us : 
dwc3_gadget_giveback: ep6in: req 00000000a324f5d0 length 0/192 
zsI ==> -104
[   76.595046] irq/14-d-596       0d... 42789862us : 
dwc3_free_request: ep6in: req 00000000a324f5d0 length 0/192 zsI 
==> -104
[   76.612892] irq/14-d-596       0d... 42789865us : 
dwc3_gadget_ep_disable: ep6in: mps 192/346 streams 16 burst 0 
ring 0/0 flags E:swbp:<
[   76.665535] irq/14-d-596       0d... 42789873us : dwc3_event: 
event (000020c2): ep0in: Transfer Not Ready [0] (Not Active) 
[Status Phase]
[   76.684716] irq/14-d-596       0d... 42789875us : 
dwc3_prepare_trb: ep0in: trb 000000004c0ae319 (E0:D0) buf 
000000001bded000 size 0 ctrl 00000c33 (HLcs:SC:status2)
[   76.819195] irq/14-d-596       0d... 42789886us : 
dwc3_gadget_ep_cmd: ep0in: cmd 'Start Transfer' [406] params 
00000000 1bded000 00000000 --> status: Successful
[   76.926324] irq/14-d-596       0d... 42789930us : dwc3_event: 
event (0000c042): ep0in: Transfer Complete (sIL) [Status Phase]
[   76.937892] irq/14-d-596       0d... 42789933us : 
dwc3_complete_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf 
000000001bded000 size 0 ctrl 00000c32 (hLcs:SC:status2)
[   76.953003] irq/14-d-596       0d... 42789935us : 
dwc3_gadget_giveback: ep0out: req 00000000cb1bd3cd length 0/0 
zsI ==> 0
[   76.964217] irq/14-d-596       0d... 42789938us : 
dwc3_prepare_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf 
000000001bded000 size 8 ctrl 00000c23 (HLcs:SC:setup)
[   77.061379] irq/14-d-596       0d... 42789950us : 
dwc3_gadget_ep_cmd: ep0out: cmd 'Start Transfer' [406] params 
00000000 1bded000 00000000 --> status: Successful
[   77.168595] irq/14-d-596       0d... 42790509us : dwc3_event: 
event (0000c040): ep0out: Transfer Complete (sIL) [Setup Phase]
[   77.180159] irq/14-d-596       0d... 42790512us : 
dwc3_ctrl_req: Get String Descriptor(Index = 18, Length = 255)
[   77.190578] irq/14-d-596       0d... 42790537us : 
dwc3_prepare_trb: ep0in: trb 000000004c0ae319 (E0:D0) buf 
0000000003b68000 size 36 ctrl 00000c53 (HLcs:SC:data)
[   77.287648] irq/14-d-596       0d... 42790550us : 
dwc3_gadget_ep_cmd: ep0in: cmd 'Start Transfer' [406] params 
00000000 1bded000 00000000 --> status: Successful
[   77.333107] irq/14-d-596       0d... 42790557us : dwc3_event: 
event (000010c2): ep0in: Transfer Not Ready [0] (Not Active) 
[Data Phase]
[   77.407223] irq/14-d-596       0d... 42790575us : dwc3_event: 
event (000090c2): ep0in: Transfer Not Ready [0] (Active) [Data 
Phase]
[   77.480985] irq/14-d-596       0d... 42790588us : dwc3_event: 
event (0000c042): ep0in: Transfer Complete (sIL) [Data Phase]
[   77.492376] irq/14-d-596       0d... 42790590us : 
dwc3_complete_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf 
0000000003b68000 size 0 ctrl 00000c52 (hLcs:SC:data)
[   77.507221] irq/14-d-596       0d... 42790595us : 
dwc3_gadget_giveback: ep0out: req 00000000cb1bd3cd length 36/36 
ZsI ==> 0
[   77.518609] irq/14-d-596       0d... 42790597us : dwc3_event: 
event (000020c0): ep0out: Transfer Not Ready [0] (Not Active) 
[Status Phase]
[   77.531332] irq/14-d-596       0d... 42790598us : 
dwc3_prepare_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf 
000000001bded000 size 0 ctrl 00000c43 (HLcs:SC:status3)
[   77.628669] irq/14-d-596       0d... 42790609us : 
dwc3_gadget_ep_cmd: ep0out: cmd 'Start Transfer' [406] params 
00000000 1bded000 00000000 --> status: Successful

Do you mind adding a few prints in dwc3_remove_requests to tell 
us which
endpoint is being processed? Then we'll know for sure which one 
caused
the crash.
I wouldn't mind but am leaving on a holiday, won't have time 
until 6 aug.
not a problem, we'll still be here when you're back :-)
Well, let's go then :-)

To get back in the mood I have retested 5.13.0, 5.14.0-rc1, 5.14.0-rc2
and 5.14.0-rc5.

I find that 5.13.0 works fine, and the issue starts from 5.14.0-rc1.
That's great finding. We have a bisection point. There are a total of
13764 commits between v5.13 and v5.14-rc1

    $ git rev-list  --count v5.13..v5.14-rc1
    13764

git bisect should find the offending commit in at most 14 tries. That's
not too bad.
I correctly guesstimated that the problem got introduced by the usb 
merge 79160a60

"Merge tag 'usb-5.14-rc1' of 
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb"

116 commits(7 bisects).

24f779dac8f3efb9629adc0e486914d93dc45517 is the first bad commit

"usb: gadget: f_uac2/u_audio: add feedback endpoint support"

Ruslan's 3 patches are related to each other so I reverted all three 
24f779da...e89bb428 and applied the reverts to rc1.

I can confirm this indeed resolves the problem in rc1.

Is late now, tomorrow evening I will apply the reverts to rc6.
With these reverts rc6 works fine as well.

So, where do we go from here?
I know the patches have been tested on dwc2 (by me and others).  I do 
not know if Ruslan or Jerome tested them on dwc3 but probably not. 
Ruslan has talked about RPi (my case too) and BeagleboneBlack, both with 
dwc2. Perhaps the dwc2 behaves a bit differently than dwc3?

The patches add a new EP-IN for async feedback. I am sorry I have not 
followed your long thread (it started as unrelated to uac). Does the 
problem appear with f_uac1 or f_uac2? Please how have you reached the 
above problem?
I'm sorry too. I first believed the issue was related to the patch 
mentioned in the subject line.

The problem appaers with f_uac2. I bost Edison_Arduino board in host 
mode (there is a switch allowing to select host/device mode). When 
flipping the switch to device mode udev run a script:
But as I am using configfs (excerpt follows) and just disabling the last 
2 line resolves the issue, I'm guessing uac2 is the issue. Or exceeding 
the available resources.

# Create directory structure
mkdir "${GADGET_BASE_DIR}"
cd "${GADGET_BASE_DIR}"
mkdir -p configs/c.1/strings/0x409
mkdir -p strings/0x409

# Serial device
mkdir functions/gser.usb0
ln -s functions/gser.usb0 configs/c.1/
###

# Ethernet device
mkdir functions/eem.usb0
echo "${DEV_ETH_ADDR}" > functions/eem.usb0/dev_addr
echo "${HOST_ETH_ADDR}" > functions/eem.usb0/host_addr
ln -s functions/eem.usb0 configs/c.1/

# Mass Storage device
mkdir functions/mass_storage.usb0
echo 1 > functions/mass_storage.usb0/stall
echo 0 > functions/mass_storage.usb0/lun.0/cdrom
echo 0 > functions/mass_storage.usb0/lun.0/ro
echo 0 > functions/mass_storage.usb0/lun.0/nofua
echo "${USBDISK}" > functions/mass_storage.usb0/lun.0/file
ln -s functions/mass_storage.usb0 configs/c.1/

# UAC2 device
mkdir functions/uac2.usb0
ln -s functions/uac2.usb0 configs/c.1
....

u_audio_stop_capture calls free_ep_fback 
https://elixir.bootlin.com/linux/v5.14-rc6/source/drivers/usb/gadget/function/u_audio.c#L414 
which clears the feedback request prm->req_fback.

Then (reading from your the call trace), usb_ep_disable calls 
dwc3_gadget_ep_disable which for every request in the EP calls 
dwc3_remove_requests which gives every found request back to the gadget 
function, IIUC by calling the complete method of the request 
https://elixir.bootlin.com/linux/v5.14-rc6/source/drivers/usb/gadget/udc/core.c#L912 
. Perhaps the problem is somewhere here?

The dwc2 gadget has its endpoint_disable method a bit different 
https://elixir.bootlin.com/linux/v5.14-rc6/source/drivers/usb/dwc2/gadget.c#L4200 
  -> 
https://elixir.bootlin.com/linux/v5.14-rc6/source/drivers/usb/dwc2/gadget.c#L3251 
-> 
https://elixir.bootlin.com/linux/v5.14-rc6/source/drivers/usb/dwc2/gadget.c#L2090 
which also calls usb_gadget_giveback_request 
https://elixir.bootlin.com/linux/v5.14-rc6/source/drivers/usb/dwc2/gadget.c#L2126 
. Perhaps there is some minor but important difference in processing the 
requests when disabling the endpoint between dwc2 and dwc3?

Thanks,

Pavel.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help