Thread: 20 messages, 6 authors, 2011-06-28

Re: SKB paged fragment lifecycle on receive

From: Ian Campbell <hidden>
Date: 2011-06-27 09:41:35
Also in: xen-devel

On Sun, 2011-06-26 at 11:25 +0100, Michael S. Tsirkin wrote:
> On Fri, Jun 24, 2011 at 04:43:22PM +0100, Ian Campbell wrote:
> > In this mode guest data pages ("foreign pages") were mapped into the
> > backend domain (using Xen grant-table functionality) and placed into the
> > skb's paged frag list (skb_shinfo(skb)->frags, I hope I am using the
> > right term). Once the page is finished with netback unmaps it in order
> > to return it to the guest (we really want to avoid returning such pages
> > to the general allocation pool!).
> Are the pages writeable by the source guest while netback processes
> them?  If yes, firewalling becomes unreliable as the packet can be
> modified after it's checked, right?
We only map the paged frags, the linear area is always copied (enough to
cover maximally sized TCP/IP, including options), for this reason.
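As a rough illustration of why copying the header helps, here is a toy Python model (not netback code). The names GuestPage, ToySkb and firewall_allows are hypothetical, and HEADER_COPY_LEN is an assumed value, since the text above only says "enough to cover maximally sized TCP/IP, including options". The sketch shows a check made against the copied linear area staying valid even if the guest later rewrites its page, while the mapped payload does change.

```python
# Toy model only: these names are hypothetical, not netback or kernel APIs.
HEADER_COPY_LEN = 128  # assumed size, chosen only for illustration


class GuestPage:
    """Memory that the sending guest can still write to while it is mapped."""

    def __init__(self, data: bytes):
        self.buf = bytearray(data)


class ToySkb:
    def __init__(self, page: GuestPage, copy_len: int = HEADER_COPY_LEN):
        n = min(copy_len, len(page.buf))
        # Linear area: a private copy, so later guest writes cannot change it.
        self.linear = bytes(page.buf[:n])
        # Paged fragment: a view onto the guest page (mapped, not copied).
        self.frag = memoryview(page.buf)[n:]


def firewall_allows(skb: ToySkb) -> bool:
    """Stand-in check that inspects only the copied header bytes."""
    return skb.linear.startswith(b"ALLOW")


page = GuestPage(b"ALLOW" + b"h" * 123 + b"payload")
skb = ToySkb(page)
verdict_before = firewall_allows(skb)

# The guest rewrites its page after the check has run.
page.buf[0:5] = b"DENY!"
page.buf[128:135] = b"PAYLOAD"

verdict_after = firewall_allows(skb)  # the header copy is unchanged
payload_seen = bytes(skb.frag)        # the mapped payload did change
```

In this model only the payload remains under the guest's control, so anything that makes decisions from the header alone sees stable data.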
> Also, for guest to guest communication, do you wait for
> the destination to stop looking at the packet in order
> to return it to the source? If yes, can source guest
> networking be disrupted by a slow destination?
There is a timeout which ultimately does a copy into dom0 memory and
frees up the domain grant for return to the sending guest.
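The timeout behaviour can be sketched with another toy Python model. This is not how netback implements it: PendingFrag, expire_grants and the GRANT_TIMEOUT value are all hypothetical (no timeout figure is given above), and setting guest_page to None merely stands in for unmapping the grant.

```python
# Toy model only: names, the timeout value and the structure are hypothetical.
from dataclasses import dataclass
from typing import Optional

GRANT_TIMEOUT = 1.0  # seconds; an assumed value for illustration


@dataclass
class PendingFrag:
    guest_page: Optional[bytearray]  # mapped guest memory (None once released)
    mapped_at: float                 # time at which the grant was mapped
    local_copy: Optional[bytes] = None

    def data(self) -> bytes:
        """Return the payload from wherever it currently lives."""
        if self.guest_page is not None:
            return bytes(self.guest_page)
        return self.local_copy


def expire_grants(frags, now, timeout=GRANT_TIMEOUT):
    """Copy overdue frags into local memory and release their guest pages.

    Returns the pages that can now be handed back to the sending guest.
    """
    released = []
    for frag in frags:
        if frag.guest_page is not None and now - frag.mapped_at >= timeout:
            frag.local_copy = bytes(frag.guest_page)  # copy into "dom0" memory
            released.append(frag.guest_page)
            frag.guest_page = None                    # stands in for the unmap
    return released


slow = PendingFrag(bytearray(b"held by a slow receiver"), mapped_at=0.0)
fresh = PendingFrag(bytearray(b"just arrived"), mapped_at=0.9)
returned = expire_grants([slow, fresh], now=1.2)
```

Here the overdue fragment keeps its contents via the local copy, so a slow consumer delays the return of the sender's page by at most the timeout.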
> > Jeremy Fitzhardinge and I subsequently
> > looked at the possibility of a no-clone skb flag (i.e. always forcing a
> > copy instead of a clone)
> I think this is the approach that the patchset
> 'macvtap/vhost TX zero-copy support' takes.
That's TX from the guest's PoV, the same as I am looking at here,
correct?

I should definitely check this work out, thanks for the pointer. Is V7
(http://marc.info/?l=linux-kernel&m=130661128431312&w=2) the most recent
posting?

I suppose one difference with this is that it deals with data from
"dom0" userspace buffers rather than (what looks like) kernel memory,
although I don't know if that matters yet. Also it hangs off of struct
sock which netback doesn't have. Anyway I'll check it out.
> > but IIRC honouring it universally turned into a
> > very twisty maze with a number of nasty corner cases etc.
> Any examples? Are they covered by the patchset above?
It was quite a while ago so I don't remember many of the specifics.
Jeremy might remember better, but, for example, any broadcast traffic
hitting a bridge (a very interesting case for Xen) seems like a likely
candidate. pcap was another one which I do remember, but that's obviously
less critical.

I presume with the TX zero-copy support the "copying due to attempted
clone" rate is low?
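To make the "copying due to attempted clone" idea concrete, here is a toy Python model of a no-clone flag. It is not the kernel's skb_clone() and not the approach of any particular patchset. ToySkb, toy_clone, flood and the forced_copies counter are hypothetical names, and giving every port its own skb is a simplification of bridge flooding.

```python
# Toy model only: not the kernel's skb_clone(); every name here is hypothetical.
class ToySkb:
    def __init__(self, data: bytearray, no_clone: bool = False):
        self.data = data
        self.no_clone = no_clone


forced_copies = 0  # counts clone attempts that had to become copies


def toy_clone(skb: ToySkb) -> ToySkb:
    """Share the data buffer unless the skb is flagged no-clone, then copy."""
    global forced_copies
    if skb.no_clone:
        forced_copies += 1
        # The copy lives in local memory, so it no longer needs the flag.
        return ToySkb(bytearray(skb.data), no_clone=False)
    return ToySkb(skb.data, no_clone=skb.no_clone)


def flood(skb: ToySkb, ports: int):
    """Simplified flooding: every port gets its own skb for the same frame."""
    return [toy_clone(skb) for _ in range(ports)]


foreign = ToySkb(bytearray(b"broadcast frame"), no_clone=True)
out = flood(foreign, ports=4)
```

In this model a single broadcast frame flooded to four ports costs four full copies, which is why the rate of such forced copies matters.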
> > FWIW I proposed a session on the subject for LPC this year.
> We also plan to discuss this on kvm forum 2011
> (colocated with linuxcon 2011).
> http://www.linux-kvm.org/page/KVM_Forum_2011
I had already considered coming to LinuxCon for other reasons but
unfortunately I have family commitments around then :-(

Ian.