Re: [PATCH net] virtio-net: suppress bad irq warning for tx napi
From: Wei Wang <hidden>
Date: 2021-02-11 00:14:14
On Wed, Feb 10, 2021 at 1:14 AM Michael S. Tsirkin [off-list ref] wrote:
On Tue, Feb 09, 2021 at 10:00:22AM -0800, Wei Wang wrote:
On Tue, Feb 9, 2021 at 6:58 AM Willem de Bruijn [off-list ref] wrote:
I have no preference. Just curious, especially if it complicates the patch.
My understanding is that it's probably ok for net. But we probably need to document the assumptions to make sure it is not abused in other drivers. Introducing new parameters for find_vqs() can help to eliminate the subtle stuff, but I agree it looks like overkill. (Btw, I forget the numbers, but wonder how much difference it makes if we simply remove the free_old_xmits() from the rx NAPI path?)
The committed patchset did not record those numbers, but I found them in an earlier iteration:
[PATCH net-next 0/3] virtio-net tx napi
https://lists.openwall.net/netdev/2017/04/02/55
It did seem to significantly reduce compute cycles ("Gcyc") at the time. For instance:
TCP_RR Latency (us):
1x:
  p50       24   24   21
  p99       27   27   27
  Gcycles  299  432  308
I'm concerned that removing it now may cause a regression report in a few months. That is higher risk than the spurious interrupt warning that was only reported after years of use.
Right. So if Michael is fine with this approach, I'm ok with it. But we probably need to add a TODO to invent interrupt handlers that can be used for more than one virtqueue. When MSI-X is enabled, the interrupt handler (vring_interrupt()) assumes the interrupt is used by a single virtqueue.
Thanks. The approach of scheduling tx-napi from virtnet_poll_cleantx instead of cleaning directly in this rx-napi function was not effective at suppressing the warning, I understand.
Correct. I tried the approach of scheduling tx napi instead of directly calling free_old_xmit_skbs() in virtnet_poll_cleantx(). But the warning still happens.
Two questions here: is the device using packed or split vqs? And is event index enabled?
The device is indeed using split vqs with event index enabled.
I think one issue is that at the moment with split and event index we don't actually disable events at all.
You mean we don't disable 'interrupts' right? What is the reason for that?
static void virtqueue_disable_cb_split(struct virtqueue *_vq)
{
	struct vring_virtqueue *vq = to_vvq(_vq);
	if (!(vq->split.avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
		vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
		if (!vq->event)
			vq->split.vring.avail->flags =
				cpu_to_virtio16(_vq->vdev,
						vq->split.avail_flags_shadow);
	}
}
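To make the point above concrete, here is a minimal userspace sketch of why the disable is effectively a no-op on a split ring with event index. It is a model under stated assumptions, not the kernel implementation: struct model_vq and the model_* helpers are hypothetical names, the ring is reduced to the three fields that matter, and the device side follows my reading of the virtio spec (with event index negotiated, the device ignores avail->flags and compares against used_event, using the same arithmetic as vring_need_event()).

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag value as defined by the virtio spec for the split ring. */
#define VRING_AVAIL_F_NO_INTERRUPT 1

/* Hypothetical, simplified state; NOT the kernel's struct vring_virtqueue. */
struct model_vq {
	bool event;                  /* event index feature negotiated */
	uint16_t avail_flags_shadow; /* driver-private copy of avail->flags */
	uint16_t avail_flags;        /* avail->flags as the device sees it */
	uint16_t used_event;         /* driver-written index the device compares against */
};

/* Mirrors the logic of the quoted virtqueue_disable_cb_split(). */
static void model_disable_cb(struct model_vq *vq)
{
	if (!(vq->avail_flags_shadow & VRING_AVAIL_F_NO_INTERRUPT)) {
		vq->avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
		/* Only published to the device when event index is off. */
		if (!vq->event)
			vq->avail_flags = vq->avail_flags_shadow;
	}
}

/* Same arithmetic as vring_need_event(): true if event_idx lies in [old_idx, new_idx). */
static bool model_need_event(uint16_t event_idx, uint16_t new_idx,
			     uint16_t old_idx)
{
	return (uint16_t)(new_idx - event_idx - 1) <
	       (uint16_t)(new_idx - old_idx);
}

/* Assumed device-side decision after advancing used idx from old_idx to new_idx. */
static bool model_device_should_interrupt(const struct model_vq *vq,
					  uint16_t old_idx, uint16_t new_idx)
{
	if (vq->event)
		return model_need_event(vq->used_event, new_idx, old_idx);
	return !(vq->avail_flags & VRING_AVAIL_F_NO_INTERRUPT);
}
```

In this model, calling model_disable_cb() on a queue without event index makes the device stop interrupting, while on a queue with event index only the shadow changes and the device still interrupts as soon as the used index passes used_event. That matches the observation that events are not really disabled in the split + event index case.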
Can you try your napi patch + disable event index?
Thanks for the suggestion. I've run the reproducer with napi patch + disable event index, and so far, I did not see the warning getting triggered. Will keep it running for a bit longer.
-- MST