Thread (18 messages) 18 messages, 4 authors, 2021-12-03

Re: [PATCH net-next v3 4/4] net: ocelot: add FDMA support

From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: 2021-11-29 17:42:45
Also in: linux-devicetree, lkml

On Mon, Nov 29, 2021 at 09:19:02AM +0100, Clément Léger wrote:
quoted
I'm not sure why you're letting the hardware grind to a halt first,
before refilling? I think since the CPU is the bottleneck anyway, you
can stop the extraction channel at any time you want to refill.
A constant stream of less data might be better than a bursty one.
Or maybe I'm misunderstanding some of the details of the hardware.
Indeed, I can stop the extraction channel but that does not seems a
good idea to stop the channel in a steady state. At least that's what I
thought since it will make the receive "window" non predictable. Not
sure how well it will play with various protocol but I will try
implementing the refill we talked previously (ie when there an
available threshold is reached).
(...)
quoted
I don't understand why you restart the injection channel from the TX
confirmation interrupt. It raised the interrupt to tell you that it hit
a NULL LLP because there's nothing left to send. If you restart it now and
no other transmission has happened in the meantime, won't it stop again?
Actually, it is only restarted if there is some pending packets to
send. With this hardware, packets can't be added while the FDMA is
running and it must be stopped everytime we want to add a packet to the
list. To avoid that, in the TX path, if the FDMA is stopped, we set the
llp of the packet to NULL and start the chan. However, if the FDMA TX
channel is running, we don't stop it, we simply add the next packets to
the ring. However, the FDMA will stop on the previous NULL LLP. So when
we hit a LLP, we might not be at the end of the list. This is why the
next check verifies if we hit a NULL LLP and if there is still some
packet to send. 
Oh, is that so? That would be pretty odd if the hardware is so dumb that
it doesn't detect changes made to an LLP on the go.

The manual has this to say, and I'm not sure how to interpret it:

| It is possible to update an active channels LLP pointer and pointers in
| the DCB chains. Before changing pointers software must schedule the
| channel for disabling (by writing FDMA_CH_DISABLE.CH_DISABLE[ch]) and
| then wait for the channel to set FDMA_CH_SAFE.CH_SAFE[ch]. When the
| pointer update is complete, soft must re-activate the channel by setting
| FDMA_CH_ACTIVATE.CH_ACTIVATE[ch]. Setting activate will cancel the
| deactivate-request, or if the channel has disabled itself in the
| meantime, it will re activate the channel.

So it is possible to update an active channel's LLP pointer, but not
while it's active? Thank you very much!

If true, this will severely limit the termination performance one is
able to obtain with this switch, even with a faster CPU and PCIe.
quoted
quoted
+void ocelot_fdma_netdev_init(struct ocelot_fdma *fdma, struct net_device *dev)
+{
+	dev->needed_headroom = OCELOT_TAG_LEN;
+	dev->needed_tailroom = ETH_FCS_LEN;  
The needed_headroom is in no way specific to FDMA, right? Why aren't you
doing it for manual register-based injection too? (in a separate patch ofc)
Actually, If I switch to page based ring, This won't be useful anymore
because the header will be written directly in the page and not anymore
directly in the skb header.
I don't understand this comment. You set up the needed headroom and
tailroom netdev variables to avoid reallocation on TX, not for RX.
And you use half page buffers for RX, not for TX.
quoted
I can't help but think how painful it is that with a CPU as slow as
yours, insult over injury, you also need to check for each packet
whether the device tree had defined the "fdma" region or not, because
you practically keep two traffic I/O implementations due to that sole
reason. I think for the ocelot switchdev driver, which is strictly for
MIPS CPUs embedded within the device, it should be fine to introduce a
static key here (search for static_branch_likely in the kernel).
I thinked about it *but* did not wanted to add a key since it would be
global. However, we could consider that there is always only one
instance of the driver and indeed a static key is an option.
Unfortunately, I'm not sure this will yield any noticeable performance
improvement.
What is the concern with a static key in this driver, exactly?
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help