Thread: 24 messages, 6 authors, 2014-05-30

RE: [PATCH net-next 5/7] net:fec: add support for dumping transmit ring on timeout

From: fugang.duan@freescale.com <hidden>
Date: 2014-05-29 08:17:40

Hi, Russell,

From: Russell King - ARM Linux <redacted>
Date: Tuesday, April 29, 2014 10:38 PM
To: David Laight
Cc: 'Frank Li'; Li Frank-B20596; shawn.guo@linaro.org; Duan Fugang-B38611;
davem@davemloft.net; linux-arm-kernel@lists.infradead.org;
netdev@vger.kernel.org
Subject: Re: [PATCH net-next 5/7] net:fec: add support for dumping
transmit ring on timeout

On Tue, Apr 29, 2014 at 02:23:09PM +0000, David Laight wrote:
From: Frank Li [mailto:lznuaa@gmail.com]
On Tue, Apr 29, 2014 at 9:01 AM, David Laight [off-list ref]
wrote:
From: Frank ...
You probably want the read and write indexes as well.
                     bdp == fep->cur_tx ? 'S' : ' ',
                     bdp == fep->dirty_tx ? 'H' : ' ',

The above code already prints the read and write indexes: 'S', 'H'.
Gah I must be asleep!
Something made me think that was to do with the ring ownership bit!
I think it is the same thing. If I am wrong, please tell me the difference.
The ownership bit in the ring flags - that the hardware uses.
Which are printed in the next field.
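
To make the distinction above concrete, here is a minimal user-space sketch of one dump line. It is not the driver's code: struct bd, struct ring, BD_READY and format_bd() are simplified, hypothetical stand-ins. The 'S' and 'H' columns come from the software pointers (cur_tx, dirty_tx), while the ownership bit that the hardware uses is part of the status flags printed in the next field.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed ownership bit: set while the hardware owns the entry. */
#define BD_READY 0x8000u

/* Hypothetical, simplified buffer descriptor (not the driver's struct). */
struct bd {
    uint16_t status;   /* flags, including the ownership bit */
    uint16_t len;      /* data length */
    uint32_t addr;     /* buffer address */
};

/* Hypothetical ring bookkeeping kept by software only. */
struct ring {
    struct bd *base;       /* first descriptor in the ring */
    unsigned int size;     /* number of descriptors */
    struct bd *cur_tx;     /* software pointer, marked 'S' in the dump */
    struct bd *dirty_tx;   /* software pointer, marked 'H' in the dump */
};

/*
 * Format one dump line: index, the two software markers, then the
 * status flags (where the ownership bit lives), address and length.
 * Returns the snprintf() result.
 */
int format_bd(const struct ring *r, const struct bd *bdp, char *buf, size_t n)
{
    assert(bdp >= r->base && bdp < r->base + r->size);

    unsigned int index = (unsigned int)(bdp - r->base);

    return snprintf(buf, n, "%3u %c%c 0x%04x 0x%08x %4u",
                    index,
                    bdp == r->cur_tx ? 'S' : ' ',
                    bdp == r->dirty_tx ? 'H' : ' ',
                    (unsigned int)bdp->status,
                    (unsigned int)bdp->addr,
                    (unsigned int)bdp->len);
}
```

Calling format_bd() for every descriptor from base to base + size - 1 gives the full picture: the markers show where software thinks it is, and the flags show what the hardware has or has not finished.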

I'm guessing that the reason the tx ring is 'interesting' is that
there have been bugs where the driver and hardware disagree about
which entry each should process next.
Otherwise the full tx ring is likely to be very very boring.
There have been several bugs.

One is where the ring is completely owned by software (because all the
entries have been transmitted) but the driver is buggy and hasn't reaped
the ring at all, leading to a tx timeout.

The second one is where the ring appears to be completely full, because
the hardware hasn't been transmitting for various reasons (eg, there are
bugs in the way the transmitter is started.)

The third one is where the transmitter skips a ring entry on earlier iMX
hardware.

However, those are specific bugs.  The point of dumping the whole ring is
to allow bugs to be diagnosed, because we can then see the state of the
ring, and start looking for likely causes of the symptoms that are visible.
With the driver as it currently is, the only thing we know is "oops, the
transmit seemed to stop for some reason" and we hope that resetting the
device gets it going again - after many seconds of it being non-responsive.
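
As a rough illustration of how a full dump helps, the three symptoms described above could be told apart from the dumped flags along these lines. This is only a sketch under stated assumptions: BD_READY, enum ring_symptom and classify_ring() are hypothetical names, "first" is taken to be the index of the oldest queued entry and "pending" the number of entries queued but not yet reaped. The real driver's bookkeeping differs.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ownership bit: set while the hardware still has to send the entry. */
#define BD_READY 0x8000u

enum ring_symptom {
    RING_NORMAL,         /* nothing obviously wrong in the flags */
    RING_NOT_REAPED,     /* everything transmitted, software never reaped it */
    RING_HW_STALLED,     /* ring full, hardware has transmitted nothing */
    RING_ENTRY_SKIPPED   /* a completed entry follows one still waiting */
};

/*
 * status:  status word of each descriptor, in ring order
 * size:    number of descriptors in the ring
 * first:   index of the oldest queued entry (hypothetical bookkeeping)
 * pending: number of entries queued and not yet reaped
 */
enum ring_symptom classify_ring(const uint16_t *status, unsigned int size,
                                unsigned int first, unsigned int pending)
{
    unsigned int i, ready = 0;
    int seen_ready = 0;

    assert(size > 0 && first < size && pending <= size);

    /* Walk the queued entries from oldest to newest. */
    for (i = 0; i < pending; i++) {
        uint16_t sc = status[(first + i) % size];

        if (sc & BD_READY) {
            ready++;
            seen_ready = 1;
        } else if (seen_ready) {
            /* Hardware finished a later entry but left an earlier one. */
            return RING_ENTRY_SKIPPED;
        }
    }

    if (pending == size && ready == 0)
        return RING_NOT_REAPED;
    if (pending == size && ready == size)
        return RING_HW_STALLED;
    return RING_NORMAL;
}
```

A classification like this is only meaningful at the moment of a tx timeout, and a real diagnosis would still need the full dump, since new bugs will not match any of these patterns.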

This is how I've sorted out many issues with this driver.
I see that the linux-next and net "imx_v6_v7_defconfig" don't enable "CONFIG_CMA". If you enable the feature,
the FEC issue may disappear. I tested with that config, and the FEC BD DMA coherency issue and the watchdog timeout issue don't happen.

I attach the document (it is not confidential); you can glance over it.
The issue happened on imx6q/dl fec, usb, audio...

Thanks,
Andy

Attachments
