Thread (15 messages) 15 messages, 5 authors, 2023-02-27

Re: [xdp-hints] Re: [Intel-wired-lan] [PATCH bpf-next V1] igc: enable and fix RX hash usage by netstack

From: Jesper Dangaard Brouer <hidden>
Date: 2023-02-16 15:18:58
Also in: bpf, intel-wired-lan

On 14/02/2023 16.13, Alexander Lobakin wrote:
From: Paul Menzel <redacted>
Date: Tue, 14 Feb 2023 16:00:52 +0100
quoted
Am 10.02.23 um 16:07 schrieb Jesper Dangaard Brouer:
quoted
When function igc_rx_hash() was introduced in v4.20 via commit 0507ef8a0372
("igc: Add transmit and receive fastpath and interrupt handlers"), the
hardware wasn't configured to provide RSS hash, thus it made sense to not
enable net_device NETIF_F_RXHASH feature bit.
[...]
quoted
quoted
hash value doesn't include UDP port numbers. Not being PKT_HASH_TYPE_L4, have
the effect that netstack will do a software based hash calc calling into
flow_dissect, but only when code calls skb_get_hash(), which doesn't
necessary happen for local delivery.
Excuse my ignorance, but is that bug visible in practice by users
(performance?) or is that fix needed for future work?
Hash calculation always happens when RPS or RFS is enabled. So having no
hash in skb before hitting the netstack slows down their performance.
Also, no hash in skb passed from the driver results in worse NAPI bucket
distribution when there are more traffic flows than Rx queues / CPUs.
+ Netfilter needs hashes on some configurations.
Thanks Olek for explaining that.

My perf measurements show that the expensive part is that netstack will
call the flow_dissector code, when the hardware RX-hash is missing.
quoted
quoted
Fixes: 2121c2712f82 ("igc: Add multiple receive queues control
supporting")
Signed-off-by: Jesper Dangaard Brouer <redacted>
[...]

Nice to see that you also care about (not) using short types on the stack :)
As can be seen by godbolt.org exploration[0] I have done, the stack
isn't used for storing the values.

  [0] 
https://github.com/xdp-project/xdp-project/tree/master/areas/hints/godbolt/

I have created three files[2] with C-code that can be compiled via 
https://godbolt.org/.  The C-code contains a comment with the ASM code 
that was generated with -02 with compiler x86-64 gcc 12.2.

The first file[01] corresponds to this patch.

  [01] 
https://github.com/xdp-project/xdp-project/blob/master/areas/hints/godbolt/igc_godbolt01.c
  [G01] https://godbolt.org/z/j79M9aTsn

The second file igc_godbolt02.c [02] have changes in [diff02]

  [02] 
https://github.com/xdp-project/xdp-project/blob/master/areas/hints/godbolt/igc_godbolt02.c
  [G02] https://godbolt.org/z/sErqe4qd5
  [diff02] https://github.com/xdp-project/xdp-project/commit/1f3488a932767

The third file igc_godbolt03.c [03] have changes in [diff03]

  [03] 
https://github.com/xdp-project/xdp-project/blob/master/areas/hints/godbolt/igc_godbolt03.c
  [G03] https://godbolt.org/z/5K3vE1Wsv
  [diff03] https://github.com/xdp-project/xdp-project/commit/aa9298f68705

Summary, the only thing we can save is replacing some movzx
(zero-extend) with mov instructions.
quoted
[1]: https://notabs.org/coding/smallIntsBigPenalty.htm
Thanks,
Olek
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help