Thread: 15 messages, 8 authors, 2024-11-04

RE: [resend PATCH 2/2] dim: pass dim_sample to net_dim() by reference

From: "Kiyanovski, Arthur" <akiyano@amazon.com>
Date: 2024-10-31 18:28:39
Also in: intel-wired-lan, linux-doc, linux-mediatek, linux-rdma, linuxppc-dev, lkml, virtualization

-----Original Message-----
From: Caleb Sander Mateos <redacted>
Sent: Wednesday, October 30, 2024 5:23 PM

net_dim() is currently passed a struct dim_sample argument by value.
struct dim_sample is 24 bytes. Since this is greater than 16 bytes, x86-64
passes it on the stack. All callers have already initialized dim_sample on the
stack, so passing it by value requires pushing a duplicated copy to the stack.
Either writing to the stack and immediately reading it, or perhaps
dereferencing addresses relative to the stack pointer in a chain of push
instructions, seems to perform quite poorly.

In a heavy TCP workload, mlx5e_handle_rx_dim() consumes 3% of CPU time,
94% of which is attributed to the first push instruction to copy dim_sample on
the stack for the call to net_dim():
// Call ktime_get()
  0.26 |4ead2:   call   4ead7 <mlx5e_handle_rx_dim+0x47>
// Pass the address of struct dim in %rdi
       |4ead7:   lea    0x3d0(%rbx),%rdi
// Set dim_sample.pkt_ctr
       |4eade:   mov    %r13d,0x8(%rsp)
// Set dim_sample.byte_ctr
       |4eae3:   mov    %r12d,0xc(%rsp)
// Set dim_sample.event_ctr
  0.15 |4eae8:   mov    %bp,0x10(%rsp)
// Duplicate dim_sample on the stack
 94.16 |4eaed:   push   0x10(%rsp)
  2.79 |4eaf1:   push   0x10(%rsp)
  0.07 |4eaf5:   push   %rax
// Call net_dim()
  0.21 |4eaf6:   call   4eafb <mlx5e_handle_rx_dim+0x6b>

To allow the caller to reuse the struct dim_sample already on the stack, pass
the struct dim_sample by reference to net_dim().
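
As a rough user-space illustration of the change described above (not the
actual kernel code), the sketch below contrasts the two calling styles. The
struct name, its field layout, and both helper functions are hypothetical
stand-ins; the layout is only assumed to match the offsets visible in the
annotated disassembly (pkt_ctr at 0x8, byte_ctr at 0xc, event_ctr at 0x10).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct dim_sample. The field layout is an
 * assumption chosen to match the stack offsets in the disassembly. */
struct dim_sample_sketch {
	int64_t  time;      /* stand-in for ktime_t */
	uint32_t pkt_ctr;
	uint32_t byte_ctr;
	uint16_t event_ctr;
	uint32_t comp_ctr;
};

/* 24 bytes is larger than 16, so the x86-64 System V ABI passes this
 * struct in memory (on the stack) when it is passed by value. */
_Static_assert(sizeof(struct dim_sample_sketch) == 24,
	       "sketch assumes a 24-byte sample");

/* By value: the caller must copy all 24 bytes into the argument area,
 * even though it already holds the sample on its own stack. */
uint32_t consume_by_value(struct dim_sample_sketch sample)
{
	return sample.pkt_ctr + sample.byte_ctr;
}

/* By reference: the caller passes one pointer in a register and the
 * callee reads the caller's existing copy. */
uint32_t consume_by_ref(const struct dim_sample_sketch *sample)
{
	return sample->pkt_ctr + sample->byte_ctr;
}
```

A caller that has already filled in a local sample would then call
consume_by_ref(&sample) instead of consume_by_value(sample), which avoids
the extra copy at the call site.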

Signed-off-by: Caleb Sander Mateos <redacted>
---
Thank you for this patch.

For the ENA part:

Reviewed-by: Arthur Kiyanovski <akiyano@amazon.com>

Thanks,
Arthur