Re: [PATCH net-next 5/5] netdev-genl: Support setting per-NAPI config values
From: Jakub Kicinski <kuba@kernel.org>
Date: 2024-09-03 00:49:46
Also in:
lkml
On Sat, 31 Aug 2024 18:27:45 +0100 Joe Damato wrote:
quoted
How do you feel about making this configuration opt-in / require driver changes? What I'm thinking is that having the new "netif_napi_add()" variant (or perhaps extending netif_napi_set_irq()) to take an extra "index" parameter would make the whole thing much simpler.I think if we are going to go this way, then opt-in is probably the way to go. This series would include the necessary changes for mlx5, in that case (because that's what I have access to) so that the new variant has a user?
SG! FWIW for bnxt the "id" is struct bnxt_napi::index (I haven't looked at bnxt before writing the suggestion :))
quoted
Index would basically be an integer 0..n, where n is the number of IRQs configured for the driver. The index of a NAPI instance would likely match the queue ID of the queue the NAPI serves. We can then allocate an array of "napi_configs" in net_device - like we allocate queues, the array size would be max(num_rx_queue, num_tx_queues). We just need to store a couple of ints so it will be tiny compared to queue structs, anyway.I assume napi_storage exists for both combined RX/TX NAPIs (i.e. drivers that multiplex RX/TX on a single NAPI like mlx5) as well as drivers which create NAPIs that are RX or TX-only, right?
Hm.
If so, it seems like we'd either need to:
- Do something more complicated when computing how much NAPI
storage to make, or
- Provide a different path for drivers which don't multiplex and
create some number of (for example) TX-only NAPIs ?
I guess I'm just imagining a weird case where a driver has 8 RX
queues but 64 TX queues. max of that is 64, so we'd be missing 8
napi_storage ?
Sorry, I'm probably just missing something about the implementation
details you summarized above.I wouldn't worry about it. We can added a variant of alloc_netdev_mqs() later which takes the NAPI count explicitly. For now we can simply assume max(rx, tx) is good enough, and maybe add a WARN_ON_ONCE() to the set function to catch drivers which need something more complicated. Modern NICs have far more queues than IRQs (~NAPIs).
quoted
The NAPI_SET netlink op can then work based on NAPI index rather than the ephemeral NAPI ID. It can apply the config to all live NAPI instances with that index (of which there really should only be one, unless driver is mid-reconfiguration somehow but even that won't cause issues, we can give multiple instances the same settings) and also store the user config in the array in net_device.I understand what you are proposing. I suppose napi-get could be extended to include the NAPI index, too?
Yup!
Then users could map queues to NAPI indexes to queues (via NAPI ID)?
Yes.
quoted
When new NAPI instance is associate with a NAPI index it should get all the config associated with that index applied. Thoughts? Does that makes sense, and if so do you think it's an over-complication?It feels a bit tricky, to me, as it seems there are some edge cases to be careful with (queue count change). I could probably give the implementation a try and see where I end up. Having these settings per-NAPI would be really useful and being able to support IRQ suspension would be useful, too. I think being thoughtful about how we get there is important; I'm a little wary of getting side tracked, but I trust your judgement and if you think this is worth exploring I'll think on it some more.
I understand, we can abandon it if the implementation drags out due to various nit picks and back-and-forths. But I don't expect much of that 🤞️