Re: [PATCH net-next] virtio-net: invoke zerocopy callback on xmit path if no tx napi
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: 2017-09-01 16:17:50

>> This is not a 50/50 split, which implies that some packets from the large packet flow are still converted to copying. Without the change, the rate without queue was 80k zerocopy vs 80k copy, so this choice of (vq->num >> 2) appears too conservative. However, testing with (vq->num >> 1) was not as effective at mitigating stalls. I did not save that data, unfortunately. Can run more tests on fine tuning this variable, if the idea sounds good.
>
> Looks like there are still two cases left:

To be clear, this patch is not intended to fix all issues. It is a small improvement to avoid HoL blocking due to queued zerocopy skbs.

The trade-off is that reverting to copying in these cases increases cycle cost. I think that is a trade-off worth making compared to the alternative drop in throughput. It probably would be good to be able to measure this without kernel instrumentation: export counters similar to net->tx_zcopy_err and net->tx_packets (though without reset to zero, as in vhost_net_tx_packet).
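
For illustration only, the (vq->num >> 2) versus (vq->num >> 1) threshold discussed above can be sketched in userspace C. This is not the patch itself. It assumes that outstanding zerocopy descriptors are counted as the distance between upend_idx and done_idx on a fixed-size index ring. RING_SLOTS, zc_pending() and zc_allowed() are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical size of the index ring that tracks zerocopy descriptors. */
#define RING_SLOTS 1024u

/* Zerocopy descriptors submitted but not yet retired (assumed accounting). */
static unsigned int zc_pending(unsigned int upend_idx, unsigned int done_idx)
{
    assert(upend_idx < RING_SLOTS && done_idx < RING_SLOTS);
    return (upend_idx + RING_SLOTS - done_idx) % RING_SLOTS;
}

/*
 * Allow zerocopy for the next packet only while the number of pending
 * descriptors does not exceed vq_num >> shift. Otherwise the caller would
 * fall back to copying. shift = 2 models (vq->num >> 2), shift = 1 models
 * (vq->num >> 1).
 */
static bool zc_allowed(unsigned int upend_idx, unsigned int done_idx,
                       unsigned int vq_num, unsigned int shift)
{
    return zc_pending(upend_idx, done_idx) <= (vq_num >> shift);
}
```

With a 256-entry queue, shift = 2 stops zerocopy once more than 64 descriptors are pending, while shift = 1 tolerates up to 128. A smaller limit means fewer descriptors can be stuck behind a stalled one, at the price of more packets being copied.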

> 1) sndbuf is not INT_MAX

You mean the case where the device stalls, later zerocopy notifications are queued, but these are never cleaned in free_old_xmit_skbs, because it requires a start_xmit and by now the (only) socket is out of descriptors?

Typo, sorry. I meant out of sndbuf.

A watchdog would help somewhat. With tx-napi, this case cannot occur, either, as free_old_xmit_skbs no longer depends on a call to start_xmit.

> 2) tx napi is used for virtio-net

I am not aware of any issue specific to the use of tx-napi?

> 1) could be a corner case, and for 2) what you suggest here may not solve the issue since it still does in-order completion.

Somewhat tangential, but it might also help to break the in-order completion processing in vhost_zerocopy_signal_used. Complete all descriptors between done_idx and upend_idx. done_idx should then only be advanced to the oldest descriptor that has not yet completed.

In the test I ran, where the oldest descriptors are held in a queue and all newer ones are tail-dropped, this would avoid blocking a full ring of completions when only a small number (or 1) is actually delayed.

Dynamic switching between copy and zerocopy using zcopy_used already returns completions out-of-order, so this is not a huge leap.
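
The difference between in-order and out-of-order completion described above can be sketched with a small userspace model. This is a simplified illustration under stated assumptions, not the vhost code. struct zc_ring, the slot_state values, RING_SLOTS and both function names are hypothetical. Only done_idx and upend_idx are taken from the discussion.

```c
#include <assert.h>

#define RING_SLOTS 8u /* hypothetical ring size, kept small for illustration */

enum slot_state {
    SLOT_IN_FLIGHT, /* handed to the device, completion not yet seen */
    SLOT_DONE,      /* completed, not yet signalled to the guest */
    SLOT_SIGNALLED  /* already reported to the guest */
};

struct zc_ring {
    enum slot_state state[RING_SLOTS];
    unsigned int done_idx;  /* oldest slot that is not yet retired */
    unsigned int upend_idx; /* next slot to be filled */
};

/* In-order: stop at the first slot that has not completed. */
static unsigned int signal_in_order(struct zc_ring *r)
{
    unsigned int n = 0;

    while (r->done_idx != r->upend_idx &&
           r->state[r->done_idx] == SLOT_DONE) {
        r->state[r->done_idx] = SLOT_SIGNALLED;
        r->done_idx = (r->done_idx + 1) % RING_SLOTS;
        n++;
    }
    return n;
}

/* Out-of-order: signal every completed slot, then advance done_idx. */
static unsigned int signal_out_of_order(struct zc_ring *r)
{
    unsigned int n = 0;
    unsigned int i;

    /* Pass 1: signal all completed slots in [done_idx, upend_idx). */
    for (i = r->done_idx; i != r->upend_idx; i = (i + 1) % RING_SLOTS) {
        if (r->state[i] == SLOT_DONE) {
            r->state[i] = SLOT_SIGNALLED;
            n++;
        }
    }
    /* Pass 2: move done_idx up to the oldest slot not yet signalled. */
    while (r->done_idx != r->upend_idx &&
           r->state[r->done_idx] == SLOT_SIGNALLED)
        r->done_idx = (r->done_idx + 1) % RING_SLOTS;
    return n;
}
```

With six slots outstanding and only the oldest still in flight, signal_in_order() reports nothing, whereas signal_out_of_order() reports the five completed slots and leaves done_idx on the delayed one. Once that slot completes, a second call signals it and moves done_idx up to upend_idx.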