Thread: 47 messages, 6 authors, 2017-09-06

Re: [PATCH net-next] virtio-net: invoke zerocopy callback on xmit path if no tx napi

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: 2017-09-01 16:17:50

quoted
quoted
This is not a 50/50 split, which implies that some packets from the large
packet flow are still converted to copying. Without the change the rate
without queue was 80k zerocopy vs 80k copy, so this choice of
(vq->num >> 2) appears too conservative.

However, testing with (vq->num >> 1) was not as effective at mitigating
stalls. I did not save that data, unfortunately. Can run more tests on
fine tuning this variable, if the idea sounds good.
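
For readers comparing the two shift values, a minimal user-space sketch of the arithmetic follows. It assumes the value acts as a cap on in-flight zerocopy descriptors, above which a packet is copied instead; the helper names zc_pending_limit and zc_should_copy are hypothetical and are not the vhost_net code.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: cap on in-flight zerocopy descriptors,
 * expressed as ring_size >> shift. Not the vhost_net implementation.
 */
static unsigned int zc_pending_limit(unsigned int ring_size, unsigned int shift)
{
	/* Virtqueue sizes are assumed to be a nonzero power of two here. */
	assert(ring_size != 0 && (ring_size & (ring_size - 1)) == 0);
	return ring_size >> shift;
}

/* Fall back to copying once more than the cap is already pending. */
static bool zc_should_copy(unsigned int pending, unsigned int ring_size,
			   unsigned int shift)
{
	return pending > zc_pending_limit(ring_size, shift);
}
```

With a 256-entry ring, a shift of 2 allows 64 pending zerocopy descriptors and a shift of 1 allows 128, so the larger cap leaves more room for delayed completions to occupy the ring.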

Looks like there are still two cases left:
To be clear, this patch is not intended to fix all issues. It is a small
improvement to avoid HoL blocking due to queued zerocopy skbs.

The trade-off is that reverting to copying in these cases increases
cycle cost. I think that that is a trade-off worth making compared to
the alternative drop in throughput. It probably would be good to be
able to measure this without kernel instrumentation: export
counters similar to net->tx_zcopy_err and net->tx_packets (though
without reset to zero, as in vhost_net_tx_packet).
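
A rough sketch of what such counters could look like, in user-space C. The struct zc_stats and both helpers are hypothetical, and the exact meaning of tx_zcopy_err (here: a zerocopy send that ended up being copied) is an assumption. The point is only that the counters never reset, so a reader derives rates from the difference between two snapshots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cumulative counters; never reset to zero. */
struct zc_stats {
	uint64_t tx_packets;	/* all transmitted packets */
	uint64_t tx_zcopy_err;	/* assumed: zerocopy sends that were copied */
};

/* Account one transmitted packet. */
static void zc_stats_account(struct zc_stats *s, bool zcopy_err)
{
	s->tx_packets++;
	if (zcopy_err)
		s->tx_zcopy_err++;
}

/* Difference between two snapshots, taken by whoever reads the counters. */
static struct zc_stats zc_stats_delta(const struct zc_stats *now,
				      const struct zc_stats *then)
{
	struct zc_stats d;

	/* Counters only grow, so a later snapshot is never smaller. */
	assert(now->tx_packets >= then->tx_packets);
	assert(now->tx_zcopy_err >= then->tx_zcopy_err);

	d.tx_packets = now->tx_packets - then->tx_packets;
	d.tx_zcopy_err = now->tx_zcopy_err - then->tx_zcopy_err;
	return d;
}
```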
quoted
1) sndbuf is not INT_MAX
You mean the case where the device stalls, later zerocopy notifications
are queued, but these are never cleaned in free_old_xmit_skbs,
because it requires a start_xmit and by now the (only) socket is out of
descriptors?
Typo, sorry. I meant out of sndbuf.
A watchdog would help somewhat. With tx-napi, this case cannot occur,
either, as free_old_xmit_skbs no longer depends on a call to start_xmit.
quoted
2) tx napi is used for virtio-net
I am not aware of any issue specific to the use of tx-napi?
quoted
1) could be a corner case, and for 2) what you suggest here may not solve
the issue since it still does in-order completion.
Somewhat tangential, but it might also help to break the in-order
completion processing in vhost_zerocopy_signal_used. Complete
all descriptors between done_idx and upend_idx. done_idx should
then only be advanced to the oldest descriptor that is still not completed.

In the test I ran, where the oldest descriptors are held in a queue and
all newer ones are tail-dropped, this would avoid blocking a full ring
of completions, when only a small number (or 1) is actually delayed.
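
To make the idea concrete, here is a small user-space model, not a patch against vhost_zerocopy_signal_used. The ring layout, the slot states and every zc_* name are hypothetical; only done_idx and upend_idx are borrowed from the discussion. It reports every completed slot between done_idx and upend_idx, then moves done_idx only up to the oldest slot that is still outstanding.

```c
#include <assert.h>

#define ZC_RING_SIZE 8	/* hypothetical size; one slot is always left unused */

enum zc_slot_state { ZC_FREE, ZC_PENDING, ZC_DONE, ZC_SIGNALED };

struct zc_ring {
	enum zc_slot_state state[ZC_RING_SIZE];
	unsigned int done_idx;	/* oldest slot not yet retired */
	unsigned int upend_idx;	/* next slot to hand out */
};

/* Hand out the next slot and mark it as in flight. */
static unsigned int zc_submit(struct zc_ring *r)
{
	unsigned int slot = r->upend_idx;

	assert((slot + 1) % ZC_RING_SIZE != r->done_idx);	/* ring not full */
	r->state[slot] = ZC_PENDING;
	r->upend_idx = (slot + 1) % ZC_RING_SIZE;
	return slot;
}

/* Lower layer finished with this slot, possibly out of order. */
static void zc_complete(struct zc_ring *r, unsigned int slot)
{
	r->state[slot] = ZC_DONE;
}

/* Returns the number of completions reported by this call. */
static unsigned int zc_signal_completed(struct zc_ring *r)
{
	unsigned int i, signaled = 0;

	/* Report every finished slot, wherever it sits in the window. */
	for (i = r->done_idx; i != r->upend_idx; i = (i + 1) % ZC_RING_SIZE) {
		if (r->state[i] == ZC_DONE) {
			r->state[i] = ZC_SIGNALED;
			signaled++;
		}
	}

	/* Retire only the leading run; stop at the oldest outstanding slot. */
	while (r->done_idx != r->upend_idx &&
	       r->state[r->done_idx] == ZC_SIGNALED) {
		r->state[r->done_idx] = ZC_FREE;
		r->done_idx = (r->done_idx + 1) % ZC_RING_SIZE;
	}
	return signaled;
}
```

In this model, if the oldest slot is delayed and five newer ones complete, one call reports all five while done_idx stays put; a strictly in-order walk would report none of them until the oldest slot finishes.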

Dynamic switching between copy and zerocopy using zcopy_used
already returns completions out-of-order, so this is not a huge leap.