Thread (15 messages) 15 messages, 5 authors, 2023-02-27

Re: [PATCH bpf-next V1] igc: enable and fix RX hash usage by netstack

From: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: 2023-02-24 16:43:18
Also in: bpf, intel-wired-lan

From: Jesper Dangaard Brouer <redacted>
Date: Wed, 22 Feb 2023 16:00:30 +0100
On 20/02/2023 16.39, Alexander Lobakin wrote:
quoted
From: Jesper Dangaard Brouer <redacted>
Date: Thu, 16 Feb 2023 17:46:53 +0100
[...]
quoted
Rx descriptors are located in the DMA coherent zone (allocated via
dma_alloc_coherent()), I am missing something? Because I was (I am) sure
CPU doesn't cache anything from it (and doesn't reorder reads/writes
from/to). I thought that's the point of coherent zones -- you may talk
to hardware without needing for syncing...
That is a good point and you are (likely) right.

I do want to remind you that this is a "fixes" patch that dates back to
v5.2.  This driver is from the very beginning coded to access descriptor
this way via union igc_adv_rx_desc.  For a fixes patch, I'm not going to
code up a new and more effecient way of accessing the descriptor memory.
Sure, not for fixes definitely. +
If you truely believe this matters for a 2.5 Gbit/s device, then someone
(e.g you) can go through this driver and change this pattern in the code.
[...]
quoted
quoted
quoted
quoted
+    [10].hash_type = PKT_HASH_TYPE_L2, /* RSS Type above 9
"Reserved" by HW */
+    [11].hash_type = PKT_HASH_TYPE_L2,
+    [12].hash_type = PKT_HASH_TYPE_L2,
+    [13].hash_type = PKT_HASH_TYPE_L2,
+    [14].hash_type = PKT_HASH_TYPE_L2,
+    [15].hash_type = PKT_HASH_TYPE_L2,
Changing these 10-15 to PKT_HASH_TYPE_NONE, which is zero.
The ASM generated table is smaller code size with zero padded content.
Yeah, and _L2 is applicable only when there's actual hash (but it's
hashed by MAC addresses, for example). Sorry I didn't notice this :s
quoted
quoted
quoted
Why define those empty if you could do a bound check in the code
instead? E.g. `if (unlikely(bigger_than_9)) return PKT_HASH_TYPE_L2`.
Having a branch for this is likely slower.  On godbolt I see that this
generates suboptimal and larger code.
But you have to verify HW output anyway, right? Or would like to rely on
that on some weird revision it won't spit BIT(69) on you?
The table is constructed such that the lookup takes care of "verifying"
the HW output.  Notice that software will bit mask the last 4 bits, thus
the number will max be 15.  No matter what hardware outputs it is safe
to do a lookup in the table.  IMHO it is a simple way to avoid an
unnecessary verification branch and still be able to handle buggy/weird
HW revs.
Ah, didn't notice the field is of 4 bits. Ack then.

[...]

Thanks,
Olek
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help