Thread: 39 messages, 5 authors, 2019-01-02

Re: [PATCH net-next 0/3] vhost: accelerate metadata access through vmap()

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: 2018-12-24 19:09:32
Also in: kvm, lkml, virtualization

On Mon, Dec 24, 2018 at 04:44:14PM +0800, Jason Wang wrote:
On 2018/12/17 3:57 AM, Michael S. Tsirkin wrote:
On Sat, Dec 15, 2018 at 11:43:08AM -0800, David Miller wrote:
From: Jason Wang <jasowang@redhat.com>
Date: Fri, 14 Dec 2018 12:29:54 +0800
On 2018/12/14 4:12 AM, Michael S. Tsirkin wrote:
On Thu, Dec 13, 2018 at 06:10:19PM +0800, Jason Wang wrote:
Hi:

This series tries to access virtqueue metadata through kernel virtual
addresses instead of copy_user() and friends, since those carry too much
overhead: checks, speculation barriers, or even hardware feature
toggling.

Tests show about a 24% improvement in TX PPS. It should benefit other
cases as well.

Please review
I think the idea of speeding up userspace access is a good one.
However I think that moving all checks to start is way too aggressive.
So did packet and AF_XDP. Anyway, sharing the address space and accessing
it directly is the fastest way. Performance is the major
consideration for people choosing a backend. Compared to a userspace
implementation, vhost does not have security advantages at any
level. If vhost is still slow, people will start to develop backends
based on e.g. AF_XDP.
Exactly, this is precisely how this kind of problem should be solved.

Michael, I strongly support the approach Jason is taking here, and I
would like to ask you to seriously reconsider your objections.

Thank you.
Okay. Won't be the first time I'm wrong.

Let's say we ignore security aspects, but we need to make sure the
following all keep working (broken with this revision):
- file backed memory (I didn't see where we mark memory dirty -
   if we don't we get guest memory corruption on close, if we do
   then host crash as https://lwn.net/Articles/774411/ seems to apply here?)

We only pin metadata pages, so I don't think they can be used for DMA. So it
is probably not an issue. The real issue is the zerocopy code; maybe it's
time to disable it by default?

- THP

We will miss 2 or 4 pages for THP, I wonder whether or not it's measurable.

- auto-NUMA

I'm not sure auto-NUMA will help for the case of IPC. It can damage the
performance in the worst case if vhost and userspace are running in two
different nodes. Anyway I can measure.

Because vhost isn't like AF_XDP where you can just tell people "use
hugetlbfs" and "data is removed on close" - people are using it in lots
of configurations with guest memory shared between rings and unrelated
data.

This series doesn't share data, only metadata is shared.
Let me clarify - I mean that metadata is in same huge page with
unrelated guest data. 
Jason, thoughts on these?
Based on the above, I can measure THP to see how much impact it has.

For the unsafe variants, they can only work when we can batch the accesses,
and they need a non-trivial rework of the vhost code, with an unknown amount
of work for archs other than x86. I'm not sure it's worth trying.

Thanks
Yes I think we need better APIs in vhost. Right now
we have an API to get and translate a single buffer.
We should have one that gets a batch of descriptors
and stores it, then one that translates this batch.

IMHO this will benefit everyone even if we do vmap due to
better code locality.
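
A rough userspace sketch of the two-phase shape described above, for
illustration only. All names here (desc_batch, batch_fetch,
batch_translate, struct region, BATCH_MAX) are hypothetical and do not
exist in drivers/vhost; the linear region lookup is a stand-in for
whatever translation structure the real code would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types and names; none of these come from the vhost code. */
#define BATCH_MAX 16

struct desc {                 /* guest-visible descriptor */
    uint64_t addr;            /* guest address of the buffer */
    uint32_t len;             /* buffer length in bytes */
};

struct region {               /* one entry of a guest-to-host memory map */
    uint64_t gpa_start;
    uint64_t size;
    uint8_t *host_base;
};

struct desc_batch {
    struct desc descs[BATCH_MAX];   /* snapshot taken in phase 1 */
    uint8_t *host[BATCH_MAX];       /* host pointers filled in phase 2 */
    size_t count;
};

/* Phase 1: copy up to BATCH_MAX descriptors out of the ring in one pass
 * and store them, so later steps never re-read guest-writable memory. */
static size_t batch_fetch(struct desc_batch *b, const struct desc *ring,
                          size_t ring_size, size_t head, size_t avail)
{
    size_t n = avail < BATCH_MAX ? avail : BATCH_MAX;

    assert(ring_size > 0);
    for (size_t i = 0; i < n; i++)
        b->descs[i] = ring[(head + i) % ring_size];
    b->count = n;
    return n;
}

/* Phase 2: translate the stored batch in one loop. Returns 0 on success,
 * -1 as soon as a descriptor does not fit inside any region. */
static int batch_translate(struct desc_batch *b, const struct region *map,
                           size_t nregions)
{
    for (size_t i = 0; i < b->count; i++) {
        const struct desc *d = &b->descs[i];

        b->host[i] = NULL;
        for (size_t r = 0; r < nregions; r++) {
            uint64_t off = d->addr - map[r].gpa_start;

            if (d->addr >= map[r].gpa_start && off < map[r].size &&
                d->len <= map[r].size - off) {
                b->host[i] = map[r].host_base + off;
                break;
            }
        }
        if (!b->host[i])
            return -1;
    }
    return 0;
}
```

The point of the split is that each phase becomes a tight loop over one
kind of work, which is where the code locality argument comes from; it
says nothing about whether the metadata itself is reached through vmap()
or copy_user().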

-- 
MST
_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization