Thread (39 messages) 39 messages, 5 authors, 2010-09-29

Re: [PATCH v11 13/17] Add mp(mediate passthru) device.

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: 2010-09-28 13:13:30
Also in: kvm, lkml

On Mon, Sep 27, 2010 at 10:23:33PM +0100, Ben Hutchings wrote:
quoted
+/* The main function to transform the guest user space address
+ * to host kernel address via get_user_pages(). Thus the hardware
+ * can do DMA directly to the external buffer address.
+ */
+static struct page_info *alloc_page_info(struct page_ctor *ctor,
+		struct kiocb *iocb, struct iovec *iov,
+		int count, struct frag *frags,
+		int npages, int total)
+{
+	int rc;
+	int i, j, n = 0;
+	int len;
+	unsigned long base, lock_limit;
+	struct page_info *info = NULL;
+
+	lock_limit = current->signal->rlim[RLIMIT_MEMLOCK].rlim_cur;
+	lock_limit >>= PAGE_SHIFT;
+
+	if (ctor->lock_pages + count > lock_limit && npages) {
+		printk(KERN_INFO "exceed the locked memory rlimit.");
+		return NULL;
+	}
What if the process is locking pages with mlock() as well?  Doesn't this
allow it to lock twice as many pages as it should be able to?
No, since locked_vm is incremented by both mp and mlock.
Or at least, that's the idea :)
In any case, twice as many would not be a big deal: admin can control
which processes can do this by controlling access to the device.
quoted
+	info = kmem_cache_alloc(ext_page_info_cache, GFP_KERNEL);
+	
+	if (!info)
+		return NULL;
+	info->skb = NULL;
+	info->next = info->prev = NULL;
+
+	for (i = j = 0; i < count; i++) {
+		base = (unsigned long)iov[i].iov_base;
+		len = iov[i].iov_len;
+
+		if (!len)
+			continue;
+		n = ((base & ~PAGE_MASK) + len + ~PAGE_MASK) >> PAGE_SHIFT;
+
+		rc = get_user_pages_fast(base, n, npages ? 1 : 0,
+				&info->pages[j]);
+		if (rc != n)
+			goto failed;
+
+		while (n--) {
+			frags[j].offset = base & ~PAGE_MASK;
+			frags[j].size = min_t(int, len,
+					PAGE_SIZE - frags[j].offset);
+			len -= frags[j].size;
+			base += frags[j].size;
+			j++;
+		}
+	}
+
+#ifdef CONFIG_HIGHMEM
+	if (npages && !(dev->features & NETIF_F_HIGHDMA)) {
+		for (i = 0; i < j; i++) {
+			if (PageHighMem(info->pages[i]))
+				goto failed;
+		}
+	}
+#endif
Shouldn't you try to allocate lowmem pages explicitly, rather than
failing at this point?
We don't allocate pages, we lock given pages. Once this is
integrated in macvtap presumably we'll fall back on data copy
for such devices.

...
quoted
+	skb_reserve(skb, NET_IP_ALIGN);
+	skb_put(skb, len);
+
+	if (skb_copy_datagram_from_iovec(skb, 0, iov, 0, len)) {
+		kfree_skb(skb);
+		return -EAGAIN;
+	}
+
+	skb->protocol = eth_type_trans(skb, mp->dev);
Why are you calling eth_type_trans() on transmit?
So that GSO can work.  BTW macvtap does:

        skb_set_network_header(skb, ETH_HLEN);
        skb_reset_mac_header(skb);
        skb->protocol = eth_hdr(skb)->h_proto;

and I think this is broken for vlans. Arnd?

-- 
MST
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help