Re: [RFC net-next v3 2/4] virtio_net: Prepare for NAPI to queue mapping
From: Joe Damato <hidden>
Date: 2025-01-27 17:52:10
Also in:
lkml, virtualization
On Sun, Jan 26, 2025 at 04:04:02PM +0800, Jason Wang wrote:
On Sat, Jan 25, 2025 at 4:19 AM Joe Damato [off-list ref] wrote:quoted
On Fri, Jan 24, 2025 at 09:14:54AM +0800, Jason Wang wrote:quoted
On Thu, Jan 23, 2025 at 10:47 AM Joe Damato [off-list ref] wrote:quoted
On Thu, Jan 23, 2025 at 10:40:43AM +0800, Jason Wang wrote:quoted
On Thu, Jan 23, 2025 at 1:41 AM Joe Damato [off-list ref] wrote:quoted
On Wed, Jan 22, 2025 at 02:12:46PM +0800, Jason Wang wrote:quoted
On Wed, Jan 22, 2025 at 3:11 AM Joe Damato [off-list ref] wrote:
[...]
quoted
quoted
quoted
quoted
quoted
quoted
quoted
-static void virtnet_napi_enable(struct virtqueue *vq, struct napi_struct *napi) +static void virtnet_napi_do_enable(struct virtqueue *vq, + struct napi_struct *napi) { napi_enable(napi);Nit: it might be better to not have this helper to avoid a misuse of this function directly.Sorry, I'm probably missing something here. Both virtnet_napi_enable and virtnet_napi_tx_enable need the logic in virtnet_napi_do_enable. Are you suggesting that I remove virtnet_napi_do_enable and repeat the block of code in there twice (in virtnet_napi_enable and virtnet_napi_tx_enable)?I think I miss something here, it looks like virtnet_napi_tx_enable() calls virtnet_napi_do_enable() directly. I would like to know why we don't call netif_queue_set_napi() for TX NAPI here?Please see both the cover letter and the commit message of the next commit which addresses this question. TX-only NAPIs do not have NAPI IDs so there is nothing to map.Interesting, but I have more questions 1) why need a driver to know the NAPI implementation like this?I'm not sure I understand the question, but I'll try to give an answer and please let me know if you have another question. Mapping the NAPI IDs to queue IDs is useful for applications that use epoll based busy polling (which relies on the NAPI ID, see also SO_INCOMING_NAPI_ID and [1]), IRQ suspension [2], and generally per-NAPI configuration [3]. Without this code added to the driver, the user application can get the NAPI ID of an incoming connection, but has no way to know which queue (or NIC) that NAPI ID is associated with or to set per-NAPI configuration settings. [1]: https://lore.kernel.org/all/20240213061652.6342-1-jdamato@fastly.com/ (local) [2]: https://lore.kernel.org/netdev/20241109050245.191288-5-jdamato@fastly.com/T/ (local) [3]: https://lore.kernel.org/lkml/20241011184527.16393-1-jdamato@fastly.com/ (local)Yes, exactly. Sorry for being unclear, what I want to ask is actually: 1) TX NAPI doesn't have a NAPI ID, this seems more like a NAPI implementation details which should be hidden from the driver. 2) If 1 is true, in the netif_queue_set_napi(), should it be better to add and check for whether or not NAPI has an ID and return early if it doesn't have one 3) Then driver doesn't need to know NAPI implementation details like NAPI stuffs?
Sorry it just feels like this conversation is getting off track.
This change is about mapping virtio_net RX queues to NAPI IDs to
allow for RX busy polling, per-NAPI config settings, etc.
If you try to use netif_queue_set_napi with a TX-only NAPI, it will
set the NAPI ID to 0.
I already addressed this in the cover letter, would you mind
carefully re-reading my cover letter and commit messages?
If your main concern is that you want me to call
netif_queue_set_napi for TX-only NAPIs in addition to the RX NAPIs
in virtio_net, I can do that and resend an RFC.
In that case, the output will show "0" for NAPI ID for the TX-only
NAPIs. See the commit message of patch 3 and imagine that the output
shows this instead:
$ ./tools/net/ynl/pyynl/cli.py \
--spec Documentation/netlink/specs/netdev.yaml \
--dump queue-get --json='{"ifindex": 2}'
[{'id': 0, 'ifindex': 2, 'napi-id': 8289, 'type': 'rx'},
{'id': 1, 'ifindex': 2, 'napi-id': 8290, 'type': 'rx'},
{'id': 2, 'ifindex': 2, 'napi-id': 8291, 'type': 'rx'},
{'id': 3, 'ifindex': 2, 'napi-id': 8292, 'type': 'rx'},
{'id': 0, 'ifindex': 2, 'napi-id': 0, 'type': 'tx'},
{'id': 1, 'ifindex': 2, 'napi-id': 0, 'type': 'tx'},
{'id': 2, 'ifindex': 2, 'napi-id': 0, 'type': 'tx'},
{'id': 3, 'ifindex': 2, 'napi-id': 0, 'type': 'tx'}]
If in the future the TX-only NAPIs get NAPI IDs, then nothing would
need to be updated in the driver and the NAPI IDs would "just work"
and appear.
quoted
quoted
2) does NAPI know (or why it needs to know) whether or not it's a TX or not? I only see the following code in napi_hash_add():Note that I did not write the original implementation of NAPI IDs or epoll-based busy poll, so I can only comment on what I know :) I don't know why TX-only NAPIs do not have NAPI IDs. My guess is that in the original implementation, the code was designed only for RX busy polling, so TX-only NAPIs were not assigned NAPI IDs. Perhaps in the future, TX-only NAPIs will be assigned NAPI IDs, but currently they do not have NAPI IDs.Jakub, could you please help to clarify this part?
Can you please explain what part needs clarification? Regardless of TX-only NAPIs, we can still set NAPI IDs for virtio_net RX queues and that would be immensely useful for users. There's two options for virtio_net as I've outlined above and in my cover letter and commit messages: 1. This implementation as-is. Then if one day in the future TX-only NAPIs get NAPI IDs, this driver (and others like mlx4) can be updated. - OR - 2. Calling netif_queue_set_napi for all NAPIs, which results in the TX-only NAPIs displaying "0" as shown above. Please let me know which option you'd like to see; I don't have a preference, I just want to get support for this API in virtio_net.
quoted
quoted
static void napi_hash_add(struct napi_struct *napi) { unsigned long flags; if (test_bit(NAPI_STATE_NO_BUSY_POLL, &napi->state)) return; ... __napi_hash_add_with_id(napi, napi_gen_id); spin_unlock_irqrestore(&napi_hash_lock, flags); } It seems it only matters with NAPI_STATE_NO_BUSY_POLL. And if NAPI knows everything, should it be better to just do the linking in napi_enable/disable() instead of letting each driver do it by itself?It would be nice if this were possible, I agree. Perhaps in the future some work could be done to make this possible. I believe that this is not currently possible because the NAPI does not know which queue ID it is associated with. That mapping of which queue is associated with which NAPI is established in patch 3 (please see the commit message of patch 3 to see an example of the output). The driver knows both the queue ID and the NAPI for that queue, so the mapping can be established only by the driver. Let me know if that helps.Yes, definitely. Let's see Jakub's comment.
As mentioned above, I'm not sure if we need to worry about TX-only NAPIs getting NAPI IDs. That seems pretty unrelated to this change as I've explained above.