Thread (32 messages) 32 messages, 7 authors, 2021-09-02

Re: [PATCH v10 0/6] Re-introduce TX FIFO resize for larger EP bursting

From: Pavel Hofman <hidden>
Date: 2021-08-23 07:57:55
Also in: linux-arm-msm, linux-devicetree, lkml

Possibly related (same subject, not in this thread)


Dne 22. 08. 21 v 22:10 Ferry Toth napsal(a):
Op 21-08-2021 om 04:57 schreef Thinh Nguyen:
quoted
Pavel Hofman wrote:
quoted
Dne 19. 08. 21 v 22:10 Ferry Toth napsal(a):
quoted
Hi

Op 19-08-2021 om 09:51 schreef Pavel Hofman:
quoted
Hi,

Dne 18. 08. 21 v 21:07 Ferry Toth napsal(a):
quoted
Hi,

Op 18-08-2021 om 00:00 schreef Ferry Toth:
quoted
Hi,

Op 16-08-2021 om 07:18 schreef Felipe Balbi:
quoted
Hi,

Ferry Toth [off-list ref] writes:
quoted
quoted
Ferry Toth [off-list ref] writes:
quoted
quoted
quoted
quoted
Ferry Toth [off-list ref] writes:
quoted
quoted
quoted
quoted
quoted
Hardware name: Intel Corporation Merrifield/BODEGA BAY,
BIOS 542
2015.01.21:18.19.48
RIP: 0010:0x500000000
Code: Unable to access opcode bytes at RIP 0x4ffffffd6.
RSP: 0018:ffffa4d00045fc28 EFLAGS: 00010046
RAX: 0000000500000000 RBX: ffff8cd546aed200 RCX:
0000000000000000
RDX: 0000000000000000 RSI: ffff8cd547bfcae0 RDI:
ffff8cd546aed200
RBP: ffff8cd547bfcae0 R08: 0000000000000000 R09:
0000000000000001
R10: ffff8cd541fd28c0 R11: 0000000000000000 R12:
ffff8cd547342828
R13: ffff8cd546aed248 R14: 0000000000000000 R15:
ffff8cd548b1d000
FS:  0000000000000000(0000) GS:ffff8cd57e200000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000500000000 CR3: 000000000311e000 CR4:
00000000001006f0
Call Trace:
         ? dwc3_remove_requests.constprop.0+0x14d/0x170
         ? __dwc3_gadget_ep_disable+0x7a/0x160
         ? dwc3_gadget_ep_disable+0x3d/0xd0
         ? usb_ep_disable+0x1c/0x
         ? u_audio_stop_capture+0x79/0x120 [u_audio]
         ? afunc_set_alt+0x73/0x80 [usb_f_uac2]
So this is triggered by a SetInterface request...
quoted
quoted
quoted
quoted
quoted
quoted
quoted
         ? composite_setup+0x224/0x1b90 [libcomposite]
         ? __dwc3_gadget_kick_transfer+0x160/0x400
         ? dwc3_gadget_ep_queue+0xf3/0x1a0
         ? configfs_composite_setup+0x6b/0x90
[libcomposite]
         ? configfs_composite_setup+0x6b/0x90
[libcomposite]
         ? dwc3_ep0_interrupt+0x459/0xa40
         ? dwc3_thread_interrupt+0x8ee/0xf40
         ? __schedule+0x235/0x6c0
         ? disable_irq_nosync+0x10/0x10
         ? irq_thread_fn+0x1b/0x60
         ? irq_thread+0xc0/0x160
         ? irq_thread_check_affinity+0x70/0x70
         ? irq_forced_thread_fn+0x70/0x70
         ? kthread+0x122/0x140
         ? set_kthread_struct+0x40/0x40
         ? ret_from_fork+0x22/0x30
Do you mind enabling dwc3 traces and collecting them?
Trying to figure
out how we got here.
I'll try if I can get the same error by booting with USB
in host mode
and then switch to device mode. If so I can enable traces
and collect as
you explained me before.

I'll try before monday, as then I fly for a holiday and
will not be
available before rc5.
you can enable all of those with kernel cmdline :-)

https://urldefense.com/v3/__https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html__;!!A4F2R9G_pg!PZHOWnH3HDs-9a9bz4KLDShZpsubSfG84fu3wEZyK4Q_Wvv2lhHKuk7LJpWPAPch0qja$ 



you need ftrace_dump_on_oops=1 and also need the correct
options on
trace_buf_size and trace_event.
On Edison-Arduino I have a switch to go to device mode,
after which
udev triggers a script configure gadgets through configfs.

I tried to log following these instructions:

https://urldefense.com/v3/__https://www.kernel.org/doc/html/latest/driver-api/usb/dwc3.html*reporting-bugs__;Iw!!A4F2R9G_pg!PZHOWnH3HDs-9a9bz4KLDShZpsubSfG84fu3wEZyK4Q_Wvv2lhHKuk7LJpWPAG64agnA$ 

<https://urldefense.com/v3/__https://www.kernel.org/doc/html/latest/driver-api/usb/dwc3.html*reporting-bugs__;Iw!!A4F2R9G_pg!PZHOWnH3HDs-9a9bz4KLDShZpsubSfG84fu3wEZyK4Q_Wvv2lhHKuk7LJpWPAG64agnA$ 
quoted
Unfortunately the kernel crashes so badly I can not get to
the ` cp
/t/trace /root/trace.txt` line (after a while the watchdog
kicks).

What to do next?
Pass ftrace_dump_on_oops to kernel cmdline.
No sure if I did this right, on oops everything is pushed to
console
(115k2 serial), I hope nothing essential is lost.

I copied the screen buffer to file see attached.
Thank you, I bet it took quite a some time :-) Anyway, 
looking at
the logs around Set Interface requests, we can track every
endpoint
that's disabled. I'll take a guess and assume we're failing at
the last
Set Interface, that means we should have something odd with
ep6in, but
everything looks fine in the trace output:

[   75.823107] irq/14-d-596       0d... 42789194us :
dwc3_gadget_ep_enable: ep6in: mps 192/346 streams 16 burst 0
ring 0/0 flags E:swbp:<
[   75.835472] irq/14-d-596       0d... 42789198us :
dwc3_alloc_request: ep6in: req 0000000002c71409 length 0/0 zsI
==> 0
[   75.846416] irq/14-d-596       0d... 42789202us :
dwc3_ep_queue: ep6in: req 0000000002c71409 length 0/192 zsI
==> -115
[   75.857360] irq/14-d-596       0d... 42789204us :
dwc3_alloc_request: ep6in: req 00000000a324f5d0 length 0/0 zsI
==> 0
[   75.868301] irq/14-d-596       0d... 42789206us :
dwc3_ep_queue: ep6in: req 00000000a324f5d0 length 0/192 zsI
==> -115
[   75.879244] irq/14-d-596       0d... 42789209us :
dwc3_event: event (000020c2): ep0in: Transfer Not Ready [0]
(Not Active) [Status Phase]
[   75.891880] irq/14-d-596       0d... 42789211us :
dwc3_prepare_trb: ep0in: trb 000000004c0ae319 (E0:D0) buf
000000001bded000 size 0 ctrl 00000c33 (HLcs:SC:status2)
[   75.989131] irq/14-d-596       0d... 42789224us :
dwc3_gadget_ep_cmd: ep0in: cmd 'Start Transfer' [406] params
00000000 1bded000 00000000 --> status: Successful
[   76.096261] irq/14-d-596       0d... 42789272us :
dwc3_event: event (0000c042): ep0in: Transfer Complete (sIL)
[Status Phase]
[   76.107834] irq/14-d-596       0d... 42789275us :
dwc3_complete_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf
000000001bded000 size 0 ctrl 00000c32 (hLcs:SC:status2)
[   76.122944] irq/14-d-596       0d... 42789277us :
dwc3_gadget_giveback: ep0out: req 00000000cb1bd3cd length 0/0
zsI ==> 0
[   76.134160] irq/14-d-596       0d... 42789280us :
dwc3_prepare_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf
000000001bded000 size 8 ctrl 00000c23 (HLcs:SC:setup)
[   76.231322] irq/14-d-596       0d... 42789292us :
dwc3_gadget_ep_cmd: ep0out: cmd 'Start Transfer' [406] params
00000000 1bded000 00000000 --> status: Successful
[   76.297418] kworker/-23        0d... 42789670us :
dwc3_ep_queue: ep3in: req 0000000029586135 length 0/96 ZsI ==>
-115
[   76.308278] kworker/-23        0d... 42789695us :
dwc3_prepare_trb: ep3in: trb 00000000b81213d6 (E1:D0) buf
0000000003b7a800 size 96 ctrl 00000811 (Hlcs:sC:normal)
[   76.395294] kworker/-23        0d... 42789707us :
dwc3_gadget_ep_cmd: ep3in: cmd 'Update Transfer' [60007]
params 00000000 00000000 00000000 --> status: Successful
[   76.471900] irq/14-d-596       0d... 42789842us :
dwc3_event: event (0000c040): ep0out: Transfer Complete (sIL)
[Setup Phase]
[   76.489308] irq/14-d-596       0d... 42789845us :
dwc3_ctrl_req: Set Interface(Intf = 5, Alt.Setting = 0)
[   76.505650] irq/14-d-596       0d... 42789851us :
dwc3_ep_dequeue: ep6in: req 0000000002c71409 length 0/192 zsI
==> -115
[   76.523315] irq/14-d-596       0d... 42789854us :
dwc3_gadget_giveback: ep6in: req 0000000002c71409 length 0/192
zsI ==> -104
[   76.541427] irq/14-d-596       0d... 42789857us :
dwc3_free_request: ep6in: req 0000000002c71409 length 0/192
zsI ==> -104
[   76.559267] irq/14-d-596       0d... 42789859us :
dwc3_ep_dequeue: ep6in: req 00000000a324f5d0 length 0/192 zsI
==> -115
[   76.576937] irq/14-d-596       0d... 42789861us :
dwc3_gadget_giveback: ep6in: req 00000000a324f5d0 length 0/192
zsI ==> -104
[   76.595046] irq/14-d-596       0d... 42789862us :
dwc3_free_request: ep6in: req 00000000a324f5d0 length 0/192
zsI ==> -104
[   76.612892] irq/14-d-596       0d... 42789865us :
dwc3_gadget_ep_disable: ep6in: mps 192/346 streams 16 burst 0
ring 0/0 flags E:swbp:<
[   76.665535] irq/14-d-596       0d... 42789873us :
dwc3_event: event (000020c2): ep0in: Transfer Not Ready [0]
(Not Active) [Status Phase]
[   76.684716] irq/14-d-596       0d... 42789875us :
dwc3_prepare_trb: ep0in: trb 000000004c0ae319 (E0:D0) buf
000000001bded000 size 0 ctrl 00000c33 (HLcs:SC:status2)
[   76.819195] irq/14-d-596       0d... 42789886us :
dwc3_gadget_ep_cmd: ep0in: cmd 'Start Transfer' [406] params
00000000 1bded000 00000000 --> status: Successful
[   76.926324] irq/14-d-596       0d... 42789930us :
dwc3_event: event (0000c042): ep0in: Transfer Complete (sIL)
[Status Phase]
[   76.937892] irq/14-d-596       0d... 42789933us :
dwc3_complete_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf
000000001bded000 size 0 ctrl 00000c32 (hLcs:SC:status2)
[   76.953003] irq/14-d-596       0d... 42789935us :
dwc3_gadget_giveback: ep0out: req 00000000cb1bd3cd length 0/0
zsI ==> 0
[   76.964217] irq/14-d-596       0d... 42789938us :
dwc3_prepare_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf
000000001bded000 size 8 ctrl 00000c23 (HLcs:SC:setup)
[   77.061379] irq/14-d-596       0d... 42789950us :
dwc3_gadget_ep_cmd: ep0out: cmd 'Start Transfer' [406] params
00000000 1bded000 00000000 --> status: Successful
[   77.168595] irq/14-d-596       0d... 42790509us :
dwc3_event: event (0000c040): ep0out: Transfer Complete (sIL)
[Setup Phase]
[   77.180159] irq/14-d-596       0d... 42790512us :
dwc3_ctrl_req: Get String Descriptor(Index = 18, Length = 255)
[   77.190578] irq/14-d-596       0d... 42790537us :
dwc3_prepare_trb: ep0in: trb 000000004c0ae319 (E0:D0) buf
0000000003b68000 size 36 ctrl 00000c53 (HLcs:SC:data)
[   77.287648] irq/14-d-596       0d... 42790550us :
dwc3_gadget_ep_cmd: ep0in: cmd 'Start Transfer' [406] params
00000000 1bded000 00000000 --> status: Successful
[   77.333107] irq/14-d-596       0d... 42790557us :
dwc3_event: event (000010c2): ep0in: Transfer Not Ready [0]
(Not Active) [Data Phase]
[   77.407223] irq/14-d-596       0d... 42790575us :
dwc3_event: event (000090c2): ep0in: Transfer Not Ready [0]
(Active) [Data Phase]
[   77.480985] irq/14-d-596       0d... 42790588us :
dwc3_event: event (0000c042): ep0in: Transfer Complete (sIL)
[Data Phase]
[   77.492376] irq/14-d-596       0d... 42790590us :
dwc3_complete_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf
0000000003b68000 size 0 ctrl 00000c52 (hLcs:SC:data)
[   77.507221] irq/14-d-596       0d... 42790595us :
dwc3_gadget_giveback: ep0out: req 00000000cb1bd3cd length
36/36 ZsI ==> 0
[   77.518609] irq/14-d-596       0d... 42790597us :
dwc3_event: event (000020c0): ep0out: Transfer Not Ready [0]
(Not Active) [Status Phase]
[   77.531332] irq/14-d-596       0d... 42790598us :
dwc3_prepare_trb: ep0out: trb 000000004c0ae319 (E0:D0) buf
000000001bded000 size 0 ctrl 00000c43 (HLcs:SC:status3)
[   77.628669] irq/14-d-596       0d... 42790609us :
dwc3_gadget_ep_cmd: ep0out: cmd 'Start Transfer' [406] params
00000000 1bded000 00000000 --> status: Successful

Do you mind adding a few prints in dwc3_remove_requests to
tell us which
endpoint is being processed? Then we'll know for sure which
one caused
the crash.
I wouldn't mind but am leaving on a holiday, won't have time
until 6 aug.
not a problem, we'll still be here when you're back :-)
Well, let's go then :-)

To get back in the mood I have retested 5.13.0, 5.14.0-rc1,
5.14.0-rc2
and 5.14.0-rc5.

I find that 5.13.0 works fine, and the issue starts from 
5.14.0-rc1.
That's great finding. We have a bisection point. There are a 
total of
13764 commits between v5.13 and v5.14-rc1

     $ git rev-list  --count v5.13..v5.14-rc1
     13764

git bisect should find the offending commit in at most 14 tries.
That's
not too bad.
I correctly guesstimated that the problem got introduced by the usb
merge 79160a60

"Merge tag 'usb-5.14-rc1' of
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb"

116 commits(7 bisects).

24f779dac8f3efb9629adc0e486914d93dc45517 is the first bad commit

"usb: gadget: f_uac2/u_audio: add feedback endpoint support"

Ruslan's 3 patches are related to each other so I reverted all
three 24f779da...e89bb428 and applied the reverts to rc1.

I can confirm this indeed resolves the problem in rc1.

Is late now, tomorrow evening I will apply the reverts to rc6.
With these reverts rc6 works fine as well.

So, where do we go from here?
I know the patches have been tested on dwc2 (by me and others).  I do
not know if Ruslan or Jerome tested them on dwc3 but probably not.
Ruslan has talked about RPi (my case too) and BeagleboneBlack, both
with dwc2. Perhaps the dwc2 behaves a bit differently than dwc3?

The patches add a new EP-IN for async feedback. I am sorry I have not
followed your long thread (it started as unrelated to uac). Does the
problem appear with f_uac1 or f_uac2? Please how have you reached the
above problem?
I'm sorry too. I first believed the issue was related to the patch
mentioned in the subject line.

The problem appaers with f_uac2. I bost Edison_Arduino board in host
mode (there is a switch allowing to select host/device mode). When
flipping the switch to device mode udev run a script:
But as I am using configfs (excerpt follows) and just disabling the
last 2 line resolves the issue, I'm guessing uac2 is the issue. Or
exceeding the available resources.

# Create directory structure
mkdir "${GADGET_BASE_DIR}"
cd "${GADGET_BASE_DIR}"
mkdir -p configs/c.1/strings/0x409
mkdir -p strings/0x409

# Serial device
mkdir functions/gser.usb0
ln -s functions/gser.usb0 configs/c.1/
###

# Ethernet device
mkdir functions/eem.usb0
echo "${DEV_ETH_ADDR}" > functions/eem.usb0/dev_addr
echo "${HOST_ETH_ADDR}" > functions/eem.usb0/host_addr
ln -s functions/eem.usb0 configs/c.1/

# Mass Storage device
mkdir functions/mass_storage.usb0
echo 1 > functions/mass_storage.usb0/stall
echo 0 > functions/mass_storage.usb0/lun.0/cdrom
echo 0 > functions/mass_storage.usb0/lun.0/ro
echo 0 > functions/mass_storage.usb0/lun.0/nofua
echo "${USBDISK}" > functions/mass_storage.usb0/lun.0/file
ln -s functions/mass_storage.usb0 configs/c.1/

# UAC2 device
mkdir functions/uac2.usb0
ln -s functions/uac2.usb0 configs/c.1
....
As you say, could perhaps the reason be that the extra EP-IN added in
those patches (previously 1, now 2 with the default config you use)
exceeds your EP-IN max count or available fifos somehow?  You have a
number of functions initialized. If you change the load order of the
functions, do you get the error later with a different function? Just
guessing...

You should be able to switch the default async EP-OUT (which configures
the new feedback EP-IN ) to adaptive EP-OUT (which requires no feedback
EP) with c_sync=8 parameter of f_uac2.

https://urldefense.com/v3/__https://elixir.bootlin.com/linux/v5.14-rc6/source/drivers/usb/gadget/function/f_uac2.c*L47__;Iw!!A4F2R9G_pg!PZHOWnH3HDs-9a9bz4KLDShZpsubSfG84fu3wEZyK4Q_Wvv2lhHKuk7LJpWPAHQ6Lj4y$ 


https://urldefense.com/v3/__https://elixir.bootlin.com/linux/v5.14-rc6/source/drivers/usb/gadget/function/f_uac2.c*L830__;Iw!!A4F2R9G_pg!PZHOWnH3HDs-9a9bz4KLDShZpsubSfG84fu3wEZyK4Q_Wvv2lhHKuk7LJpWPAC7mfgo7$ 


https://urldefense.com/v3/__https://elixir.bootlin.com/linux/v5.14-rc6/source/include/uapi/linux/usb/ch9.h*L453__;Iw!!A4F2R9G_pg!PZHOWnH3HDs-9a9bz4KLDShZpsubSfG84fu3wEZyK4Q_Wvv2lhHKuk7LJpWPANdeuGIx$ 


Does that fix the problem?
Hi Pavel,

I took a look at 24f779dac8f3 ("usb: gadget: f_uac2/u_audio: add
feedback endpoint support") that Ferry reported the issue from
bisection. I see at least a couple problems in the new UAC2 changes.

1) usb_ep_dequeue() is asynchronous. Don't free requests before the
controller driver give them back.

2) Did you test with SuperSpeed? I don't see companion descriptor.

Please fix them.
Given it's late in the cycle I agree with Andy it would be best to 
revert the series now.

I will be happy to test on dwc3 when a fixed series arrives.
Yes, I agree it's the best solution now. I have asked the patch devs 
Ruslan and Jerome for help with these issues, I am mostly a tester of 
the UAC2 gadget changes (with dwc2 USB2 only available for now).

Best regards,

Pavel.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help