Thread (13 messages) 13 messages, 4 authors, 2021-01-18

RE: dmaengine : xilinx_dma two issues

From: Radhey Shyam Pandey <hidden>
Date: 2021-01-08 18:37:20
Also in: linux-arm-kernel, lkml

-----Original Message-----
From: Paul Thomas <redacted>
Sent: Friday, January 8, 2021 9:27 PM
To: Radhey Shyam Pandey <redacted>
Cc: Dan Williams <redacted>; Vinod Koul
[off-list ref]; Michal Simek [off-list ref]; Matthew Murrian
[off-list ref]; Romain Perier
[off-list ref]; Krzysztof Kozlowski [off-list ref]; Marc
Ferland [off-list ref]; Sebastian von Ohr
[off-list ref]; dmaengine@vger.kernel.org; Linux ARM <linux-
arm-kernel@lists.infradead.org>; linux-kernel <linux-
kernel@vger.kernel.org>; dave.jiang@intel.com; Shravya Kumbham
[off-list ref]; git [off-list ref]
Subject: Re: dmaengine : xilinx_dma two issues

Hi All,

On Fri, Jan 8, 2021 at 2:13 AM Radhey Shyam Pandey [off-list ref]
wrote:
quoted
quoted
-----Original Message-----
From: Radhey Shyam Pandey
Sent: Monday, January 4, 2021 10:50 AM
To: Paul Thomas <redacted>; Dan Williams
[off-list ref]; Vinod Koul [off-list ref]; Michal
Simek [off-list ref]; Matthew Murrian
[off-list ref]; Romain Perier
[off-list ref]; Krzysztof Kozlowski [off-list ref];
Marc Ferland [off-list ref]; Sebastian von Ohr
[off-list ref]; dmaengine@vger.kernel.org; Linux ARM <linux-
arm-kernel@lists.infradead.org>; linux-kernel <linux-
kernel@vger.kernel.org>; Shravya Kumbham [off-list ref]; git
[off-list ref]
Subject: RE: dmaengine : xilinx_dma two issues
quoted
-----Original Message-----
From: Paul Thomas <redacted>
Sent: Monday, December 28, 2020 10:14 AM
To: Dan Williams <redacted>; Vinod Koul
[off-list ref]; Michal Simek [off-list ref]; Radhey
Shyam Pandey [off-list ref]; Matthew Murrian
[off-list ref]; Romain Perier
[off-list ref];
quoted
Krzysztof Kozlowski [off-list ref]; Marc Ferland
[off-list ref]; Sebastian von Ohr [off-list ref];
dmaengine@vger.kernel.org; Linux ARM <linux-
arm-kernel@lists.infradead.org>; linux-kernel <linux-
kernel@vger.kernel.org>
Subject: dmaengine : xilinx_dma two issues

Hello,

I'm trying to get the 5.10 kernel up and running for our system,
and I'm running into a couple of issues with xilinx_dma.
+ (Xilinx mailing list)

Thanks for bringing the issues to our notice. Replies inline.
quoted
First, commit 14ccf0aab46e 'dmaengine: xilinx_dma: In dma channel
probe fix node order dependency' breaks our usage. Before this
commit a
call to:
quoted
dma_request_chan(&indio_dev->dev, "axi_dma_0"); returns fine, but
after that commit it returns -19. The reason for this seems to be
that the only channel that is setup is channel 1 (chan->id is 1 in
xilinx_dma_chan_probe()).
quoted
However in
of_dma_xilinx_xlate() chan_id is gets set to 0 (int chan_id =
dma_spec-
quoted
args[0];), which causes the:
!xdev->chan[chan_id]
test to fail in of_dma_xilinx_xlate()
What is the channel number passed in dmaclient DT?
Is this a question for me?
Yes, please also share the dmaclient DT client node. Need to see
channel number passed to dmas property. Something like below-

dmas = <& axi_dma_0 1>
dma-names = "axi_dma_0"
quoted
Any update on this issue?
quoted
quoted
dmas = <& axi_dma_0 1>
dma-names = "axi_dma_0"
quoted
Our device-tree entry looks like this:
    axi_dma_0: dma@80002000 {
        status = "okay";
        #dma-cells = <1>;
        compatible = "xlnx,axi-dma-1.00.a";
        interrupt-parent = <&gic>;
        interrupts = <0 89 4>;
        reg = <0x0 0x80002000 0x0 0x1000>;
        xlnx,addrwidth = <0x20>;
        clocks = <&zynqmp_clk LPD_LSBUS>, <&zynqmp_clk LPD_LSBUS>,
<&zynqmp_clk LPD_LSBUS>, <&zynqmp_clk LPD_LSBUS>;
        clock-names = "s_axi_lite_aclk", "m_axi_sg_aclk",
"m_axi_mm2s_aclk", "m_axi_s2mm_aclk";
        dma-channel@80002030 {
            compatible = "xlnx,axi-dma-s2mm-channel";
            dma-channels = <0x1>;
            interrupts = <0 89 4>;
            xlnx,datawidth = <0x20>;
            xlnx,device-id = <0x0>;
        };
    };

This is on a 5.10.1 kernel on arm64 zynqmp hardware.

The second issue goes a little further back to commit
e81274cd6b526
'dmaengine: add support to dynamic register/unregister of channels'.
After this commit even just removing the module 'rmmod
xilinx_dma', without ever using it, results in a kernel oops like this:
[   37.214568] xilinx-vdma 80002000.dma: ch 0: SG disabled
[   37.219807] xilinx-vdma 80002000.dma: WARN: Device release is not
defined so it is not safe to unbind this driver while in use
[   37.231299] xilinx-vdma 80002000.dma: Xilinx AXI DMA Engine Driver
Probed!!
[   42.100660] Unable to handle kernel paging request at virtual
address dead000000000108
[   42.108598] Mem abort info:
[   42.111393]   ESR = 0x96000044
[   42.114443]   EC = 0x25: DABT (current EL), IL = 32 bits
[   42.119744]   SET = 0, FnV = 0
[   42.122794]   EA = 0, S1PTW = 0
[   42.125918] Data abort info:
[   42.128789]   ISV = 0, ISS = 0x00000044
[   42.132617]   CM = 0, WnR = 1
[   42.135577] [dead000000000108] address between user and kernel
address ranges
[   42.142705] Internal error: Oops: 96000044 [#1] SMP
[   42.147566] Modules linked in: xilinx_dma(-) clk_xlnx_clock_wizard
uio_pdrv_genirq
[   42.155139] CPU: 1 PID: 2075 Comm: rmmod Not tainted
5.10.1-00026-g3a2e6dd7a05-dirty #192
[   42.163302] Hardware name: Enclustra XU5 SOM (DT)
[   42.167992] pstate: 40000005 (nZcv daif -PAN -UAO -TCO BTYPE=--)
[   42.173996] pc : xilinx_dma_chan_remove+0x74/0xa0 [xilinx_dma]
[   42.179815] lr : xilinx_dma_chan_remove+0x70/0xa0 [xilinx_dma]
[   42.185636] sp : ffffffc01112bca0
[   42.188935] x29: ffffffc01112bca0 x28: ffffff80402ea640
[   42.194238] x27: 0000000000000000 x26: 0000000000000000
[   42.199542] x25: 0000000000000000 x24: 0000000000000000
[   42.204845] x23: 0000000000000000 x22: 0000000000000000
[   42.210149] x21: ffffffc0088a2028 x20: ffffff8040c08410
[   42.215452] x19: ffffff80423fa480 x18: ffffffffffffffff
[   42.220756] x17: 0000000000000000 x16: 0000000000000000
[   42.226059] x15: ffffffc010ce88c8 x14: 0000000000000040
[   42.231363] x13: ffffff0000000000 x12: ffffffffffffffff
[   42.236667] x11: 0000000000000028 x10: ffffffff7fffffff
[   42.241970] x9 : ffffffff00f0dfe0 x8 : 0000000000000000
[   42.247273] x7 : ffffffc010da4000 x6 : 0000000000000000
[   42.252577] x5 : 0000000000210d00 x4 : ffffffc010da4da0
[   42.257881] x3 : ffffff80423fa578 x2 : 0000000000000000
[   42.263184] x1 : dead000000000100 x0 : dead000000000122
[   42.268488] Call trace:
[   42.270923]  xilinx_dma_chan_remove+0x74/0xa0 [xilinx_dma]
[   42.276399]  xilinx_dma_remove+0x3c/0x70 [xilinx_dma]
[   42.281446]  platform_drv_remove+0x24/0x38
[   42.285530]  device_release_driver_internal+0xec/0x1a8
[   42.290659]  driver_detach+0x64/0xd8
[   42.294226]  bus_remove_driver+0x58/0xb8
[   42.298133]  driver_unregister+0x30/0x60
[   42.302048]  platform_driver_unregister+0x14/0x20
[   42.306744]  xilinx_vdma_driver_exit+0x18/0x978 [xilinx_dma]
[   42.312396]  __arm64_sys_delete_module+0x1e4/0x270
[   42.317178]  el0_svc_common.constprop.4+0x68/0x170
[   42.321959]  do_el0_svc+0x70/0x90
[   42.325267]  el0_svc+0x14/0x20
[   42.328313]  el0_sync_handler+0x90/0xb8
[   42.332141]  el0_sync+0x158/0x180
[   42.335442] Code: 95dfce29 9103c260 95de7ffb a9490261 (f9000420)
[   42.341525] ---[ end trace dbd90aeb5ca71943 ]---

So if I use the 04c2bc2bede1 (commit before 14ccf0aab46e) version
of xilinx_dma.c and never remove the module then it is working
with the
5.10.1 kernel.
Ok, we will analyze this issue and report back the findings.
+ Dave

I was able to reproduce this crash on the unloading xilinx_dma module.
This is introduced due to the e81274cd6b526 mainline commit added in
the 5.6 kernel version. The crash is coming from -

xilinx_dma_chan_remove+0x74/0xa0:
__list_del at ./include/linux/list.h:112 (inlined by) __list_del_entry
at./include/linux/list.h:135 (inlined by) list_del at
./include/linux/list.h:146 (inlined by) xilinx_dma_chan_remove at
drivers/dma/xilinx/xilinx_dma.c:2546

Looking into e81274cd6b526 commit - It deletes channel device_node
entry.
quoted
Same channel device_node entry is also deleted in
xilinx_dma_chan_remove as a result we see this crash.
@@ -993,12 +1007,22 @@ static
void __dma_async_device_channel_unregister(struct dma_device *device,
                  "%s called while %d clients hold a reference\n",
                  __func__, chan->client_count);
        mutex_lock(&dma_list_mutex);
+       list_del(&chan->device_node);
+       device->chancnt--;

I will let Dave comment on the background of this change.  In
dmaengine driver - we are adding channel device node entry so deleting
it in the exit path looks fine to me. Also, I checked other dma
drivers are also deleting channel device_node entry in .remove so it
likely a common problem.
quoted
quoted
Hopefully, this will be clear to someone how these issues can be
resolved. In general we've been very happy using the xilinx dma.

I'm not subscribed to the linux-kernel ML so if you need any
further info or testing just keep me in the to: list.

thanks,
Paul
Thanks for looking at this! Let me know if anyone needs more info.

-Paul
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help