Thread (30 messages) 30 messages, 6 authors, 2021-02-05

RE: [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets

From: <Bryan.Whitehead@microchip.com>
Date: 2021-02-01 18:06:36
Also in: lkml

Hi Sven, see below
quoted
If lan743x_rx_init_ring_element fails to allocate an skb, Then
lan743x_rx_reuse_ring_element will be called.
But that function expects the skb is already allocated and dma mapped.
But the dma was unmapped above.
Good catch. I think you're right, the skb allocation always has to come before
the unmap. Because if we unmap, and then the skb allocation fails, there is
no guarantee that we can remap the old skb we've just unmapped (it could
fail).
And then we'd end up with a broken driver.

BUT I actually joined skb alloc and init_ring_element, because of a very
subtle synchronization bug I was seeing: if someone changes the mtu
_in_between_ skb alloc and init_ring_element, things will go haywire,
because the skb and mapping lengths would be different !

We could fix that by using a spinlock I guess, but synchronization primitives in
"hot paths" like these are far from ideal... Would be nice if we could avoid
that.

Here's an idea: what if we fold "unmap from dma" into init_ring_element()?
That way, we get the best of both worlds: length cannot change in the
middle, and the function can always "back out" without touching the ring
element in case an allocation or mapping fails.

Pseudo-code:

init_ring_element() {
    /* single "sampling" of mtu, so no synchronization required */
    length = netdev->mtu + ETH_HLEN + 4 + RX_HEAD_PADDING;

    skb = alloc(length);
    if (!skb) return FAIL;
    dma_ptr = dma_map(skb, length);
    if (!dma_ptr) {
        free(skb);
        return FAIL;
    }
    if (buffer_info->dma_ptr)
        dma_unmap(buffer_info->dma_ptr, buffer_info->buffer_length);
    buffer_info->skb = skb;
    buffer_info->dma_ptr = dma_ptr;
    buffer_info->buffer_length = length;

    return SUCCESS;
}

What do you think?
Yes, I believe this will work.
quoted
Also if lan743x_rx_init_ring_element fails to allocate an skb.
Then control will jump to process_extension and therefor the currently
received skb will not be added to the skb list.
I assume that would corrupt the packet? Or am I missing something?
Yes if an skb alloc failure in the middle of a multi-buffer frame, will corrupt
the packet inside the frame. A chunk will be missing. I had assumed that this
would be caught by an upper network layer, some checksum would be
incorrect?

Are there current networking devices that would send a corrupted packet to
Linux if there is a corruption on the physical link? Especially if they don't
support checksumming?

Maybe my assumption is naive.
I'll fix this up if you believe that it could be an issue.
Yes the upper layers will catch it and drop it.
But if we already know the packet is corrupted, I think it would be better if we
dropped it here, to avoid unnecessary processing upstream.

...
RX_PROCESS_RESULT_XXX can now only take two values (RECEIVED and
NOTHING_TO_DO), so in theory it could be replaced by a bool. But perhaps
we should keep the current names, because they are clearer to the reader?
I'm ok if you want to change it too a bool. 
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help