RE: [PATCH net-next v1 2/6] lan743x: support rx multi-buffer packets
From: <Bryan.Whitehead@microchip.com>
Date: 2021-02-01 18:06:36
Also in:
lkml
Hi Sven, see below
quoted
If lan743x_rx_init_ring_element fails to allocate an skb, Then lan743x_rx_reuse_ring_element will be called. But that function expects the skb is already allocated and dma mapped. But the dma was unmapped above.Good catch. I think you're right, the skb allocation always has to come before the unmap. Because if we unmap, and then the skb allocation fails, there is no guarantee that we can remap the old skb we've just unmapped (it could fail). And then we'd end up with a broken driver. BUT I actually joined skb alloc and init_ring_element, because of a very subtle synchronization bug I was seeing: if someone changes the mtu _in_between_ skb alloc and init_ring_element, things will go haywire, because the skb and mapping lengths would be different ! We could fix that by using a spinlock I guess, but synchronization primitives in "hot paths" like these are far from ideal... Would be nice if we could avoid that. Here's an idea: what if we fold "unmap from dma" into init_ring_element()? That way, we get the best of both worlds: length cannot change in the middle, and the function can always "back out" without touching the ring element in case an allocation or mapping fails. Pseudo-code: init_ring_element() { /* single "sampling" of mtu, so no synchronization required */ length = netdev->mtu + ETH_HLEN + 4 + RX_HEAD_PADDING; skb = alloc(length); if (!skb) return FAIL; dma_ptr = dma_map(skb, length); if (!dma_ptr) { free(skb); return FAIL; } if (buffer_info->dma_ptr) dma_unmap(buffer_info->dma_ptr, buffer_info->buffer_length); buffer_info->skb = skb; buffer_info->dma_ptr = dma_ptr; buffer_info->buffer_length = length; return SUCCESS; } What do you think?
Yes, I believe this will work.
quoted
Also if lan743x_rx_init_ring_element fails to allocate an skb. Then control will jump to process_extension and therefor the currently received skb will not be added to the skb list. I assume that would corrupt the packet? Or am I missing something?Yes if an skb alloc failure in the middle of a multi-buffer frame, will corrupt the packet inside the frame. A chunk will be missing. I had assumed that this would be caught by an upper network layer, some checksum would be incorrect? Are there current networking devices that would send a corrupted packet to Linux if there is a corruption on the physical link? Especially if they don't support checksumming? Maybe my assumption is naive. I'll fix this up if you believe that it could be an issue.
Yes the upper layers will catch it and drop it. But if we already know the packet is corrupted, I think it would be better if we dropped it here, to avoid unnecessary processing upstream. ...
RX_PROCESS_RESULT_XXX can now only take two values (RECEIVED and NOTHING_TO_DO), so in theory it could be replaced by a bool. But perhaps we should keep the current names, because they are clearer to the reader?
I'm ok if you want to change it too a bool.