Thread (48 messages) 48 messages, 7 authors, 2024-09-09

Re: [PATCH net-next 5/5] netdev-genl: Support setting per-NAPI config values

From: Jakub Kicinski <kuba@kernel.org>
Date: 2024-09-03 00:49:46
Also in: lkml

On Sat, 31 Aug 2024 18:27:45 +0100 Joe Damato wrote:
quoted
How do you feel about making this configuration opt-in / require driver
changes? What I'm thinking is that having the new "netif_napi_add()"
variant (or perhaps extending netif_napi_set_irq()) to take an extra
"index" parameter would make the whole thing much simpler.  
I think if we are going to go this way, then opt-in is probably the
way to go. This series would include the necessary changes for mlx5,
in that case (because that's what I have access to) so that the new
variant has a user?
SG! FWIW for bnxt the "id" is struct bnxt_napi::index (I haven't looked
at bnxt before writing the suggestion :))
quoted
Index would basically be an integer 0..n, where n is the number of
IRQs configured for the driver. The index of a NAPI instance would
likely match the queue ID of the queue the NAPI serves.

We can then allocate an array of "napi_configs" in net_device -
like we allocate queues, the array size would be max(num_rx_queue,
num_tx_queues). We just need to store a couple of ints so it will
be tiny compared to queue structs, anyway.  
I assume napi_storage exists for both combined RX/TX NAPIs (i.e.
drivers that multiplex RX/TX on a single NAPI like mlx5) as well
as drivers which create NAPIs that are RX or TX-only, right?
Hm.
If so, it seems like we'd either need to:
  - Do something more complicated when computing how much NAPI
    storage to make, or
  - Provide a different path for drivers which don't multiplex and
    create some number of (for example) TX-only NAPIs ?

I guess I'm just imagining a weird case where a driver has 8 RX
queues but 64 TX queues. max of that is 64, so we'd be missing 8
napi_storage ?

Sorry, I'm probably just missing something about the implementation
details you summarized above.
I wouldn't worry about it. We can added a variant of alloc_netdev_mqs()
later which takes the NAPI count explicitly. For now we can simply
assume max(rx, tx) is good enough, and maybe add a WARN_ON_ONCE() to 
the set function to catch drivers which need something more complicated.

Modern NICs have far more queues than IRQs (~NAPIs).
quoted
The NAPI_SET netlink op can then work based on NAPI index rather 
than the ephemeral NAPI ID. It can apply the config to all live
NAPI instances with that index (of which there really should only 
be one, unless driver is mid-reconfiguration somehow but even that
won't cause issues, we can give multiple instances the same settings)
and also store the user config in the array in net_device.  
I understand what you are proposing. I suppose napi-get could be
extended to include the NAPI index, too?
Yup!
Then users could map queues to NAPI indexes to queues (via NAPI ID)?
Yes.
quoted
When new NAPI instance is associate with a NAPI index it should get
all the config associated with that index applied.

Thoughts? Does that makes sense, and if so do you think it's an
over-complication?  
It feels a bit tricky, to me, as it seems there are some edge cases
to be careful with (queue count change). I could probably give the
implementation a try and see where I end up.

Having these settings per-NAPI would be really useful and being able
to support IRQ suspension would be useful, too.

I think being thoughtful about how we get there is important; I'm a
little wary of getting side tracked, but I trust your judgement and
if you think this is worth exploring I'll think on it some more.
I understand, we can abandon it if the implementation drags out due to
various nit picks and back-and-forths. But I don't expect much of that
🤞️
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help