Thread (49 messages) 49 messages, 4 authors, 2015-06-07

Re: [RFC PATCH 03/13] xen-netback: implement TX persistent grants

From: Joao Martins <hidden>
Date: 2015-05-22 10:25:16

On 19 May 2015, at 17:23, Wei Liu [off-list ref] wrote:
On Tue, May 12, 2015 at 07:18:27PM +0200, Joao Martins wrote:
quoted
Introduces persistent grants for TX path which follows similar code path
as the grant mapping.

It starts by checking if there's a persistent grant available for header
and frags grefs and if so setting it in tx_pgrants. If no persistent grant
is found in the tree for the header it will resort to grant copy (but
preparing the map ops and add them laster). For the frags it will use the
                                    ^
                                    later
quoted
tree page pool, and in case of no pages it fallbacks to grant map/unmap
using mmap_pages. When skb destructor callback gets called we release the
slot and persistent grant within the callback to avoid waking up the
dealloc thread. As long as there are no unmaps to done the dealloc thread
will remain inactive.
This scheme looks complicated. Can we just only use one
scheme at a time? What's the rationale for using this combined scheme?
Maybe you're thinking about using a max_grants < ring_size to save
memory?
Yes, my purpose was to allow a max_grants < ring_size to save amount of
memory mapped. I did a bulk transfer test with iperf and the max amount of
grants in tree was <160 TX gnts, without affecting the max performance;
tough using pktgen fills the tree completely.
The second reason is to handle the case for a (malicious?) frontend providing
more grefs than the max allowed in which I would fallback to grant map/unmap.
Only skim the patch. I will do detailed reviews after we're sure this is
the right way to go.
quoted
Results show an improvement of 46% (1.82 vs 1.24 Mpps, 64 pkt size)
measured with pktgen and up to over 48% (21.6 vs 14.5 Gbit/s) measured
with iperf (TCP) with 4 parallel flows 1 queue vif, DomU to Dom0.
Tests ran on a Intel Xeon E5-1650 v2 with HT disabled.
[…]
int xenvif_init_queue(struct xenvif_queue *queue)
{
	int err, i;
@@ -496,9 +524,23 @@ int xenvif_init_queue(struct xenvif_queue *queue)
			  .ctx = NULL,
			  .desc = i };
		queue->grant_tx_handle[i] = NETBACK_INVALID_HANDLE;
+		queue->tx_pgrants[i] = NULL;
+	}
+
+	if (queue->vif->persistent_grants) {
+		err = init_persistent_gnt_tree(&queue->tx_gnts_tree,
+					       queue->tx_gnts_pages,
+					       XEN_NETIF_TX_RING_SIZE);
+		if (err)
+			goto err_disable;
	}

	return 0;
+
+err_disable:
+	netdev_err(queue->vif->dev, "Could not reserve tree pages.");
+	queue->vif->persistent_grants = 0;
You can just move the above two lines under `if (err)'.

Also see below.
In the next patch this is also common cleanup path. I did it this way to 
avoid moving this hunk around.

quoted
+	return 0;
}

void xenvif_carrier_on(struct xenvif *vif)
@@ -654,6 +696,10 @@ void xenvif_disconnect(struct xenvif *vif)
		}

		xenvif_unmap_frontend_rings(queue);
+
+		if (queue->vif->persistent_grants)
+			deinit_persistent_gnt_tree(&queue->tx_gnts_tree,
+						   queue->tx_gnts_pages);
If the init function fails on queue N (N>0) you now leak resources.
Correct. I should return -ENOMEM (on init) and not 0.

quoted
	}
}
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 332e489..529d7c3 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -269,6 +269,11 @@ static inline unsigned long idx_to_kaddr(struct xenvif_queue *queue,
	return (unsigned long)pfn_to_kaddr(idx_to_pfn(queue, idx));
}

+static inline void *page_to_kaddr(struct page *page)
+{
+	return pfn_to_kaddr(page_to_pfn(page));
+}
+
#define callback_param(vif, pending_idx) \
	(vif->pending_tx_info[pending_idx].callback_struct)
@@ -299,6 +304,29 @@ static inline pending_ring_idx_t pending_index(unsigned i)
	return i & (MAX_PENDING_REQS-1);
}

+/*  Creates a new persistent grant and add it to the tree.
+ */
+static struct persistent_gnt *xenvif_pgrant_new(struct persistent_gnt_tree *tree,
+						struct gnttab_map_grant_ref *gop)
+{
+	struct persistent_gnt *persistent_gnt;
+
+	persistent_gnt = kmalloc(sizeof(*persistent_gnt), GFP_KERNEL);
xenvif_pgrant_new can be called in NAPI which runs in softirq context
which doesn't allow you to sleep.
Silly mistake, I will fix this. The whole point of the tree page pool was that we
aren't allowed to sleep both here and in ndo_start_xmit (in patch #6).
quoted
[…]

+/* Checks if there's a persistent grant available for gref and
+ * if so, set it also in the tx_pgrants array that keeps the ones
+ * in use.
+ */
+static bool xenvif_tx_pgrant_available(struct xenvif_queue *queue,
+				       grant_ref_t ref, u16 pending_idx,
+				       bool *can_map)
+{
+	struct persistent_gnt_tree *tree = &queue->tx_gnts_tree;
+	struct persistent_gnt *persistent_gnt;
+	bool busy;
+
+	if (!queue->vif->persistent_grants)
+		return false;
+
+	persistent_gnt = get_persistent_gnt(tree, ref);
+
+	/* If gref is already in use we fallback, since it would
+	 * otherwise mean re-adding the same gref to the tree
+	 */
+	busy = IS_ERR(persistent_gnt);
+	if (unlikely(busy))
+		persistent_gnt = NULL;
+
Under what circumstance can we retrieve a already in use persistent
grant? You seem to suggest this is a bug in RX case.
A guest could share try to share the same mapped page in multiple frags,
in which case I fallback to map/unmap. I think this is a limitation in
the way we manage the persistent gnts where we can only have a single
reference of a persistent grant inflight.
quoted
[…]
			MAX_PENDING_REQS);
		index = pending_index(queue->dealloc_prod);
@@ -1691,7 +1939,10 @@ void xenvif_zerocopy_callback(struct ubuf_info *ubuf, bool zerocopy_success)
		smp_wmb();
		queue->dealloc_prod++;
	} while (ubuf);
-	wake_up(&queue->dealloc_wq);
+	/* Wake up only when there are grants to unmap */
+	if (dealloc_prod_save != queue->dealloc_prod)
+		wake_up(&queue->dealloc_wq);
+
	spin_unlock_irqrestore(&queue->callback_lock, flags);

	if (likely(zerocopy_success))
@@ -1779,10 +2030,13 @@ int xenvif_tx_action(struct xenvif_queue *queue, int budget)
	xenvif_tx_build_gops(queue, budget, &nr_cops, &nr_mops);

-	if (nr_cops == 0)
+	if (!queue->vif->persistent_grants &&
+	    nr_cops == 0)
You can just move nr_cops to previous line.

Wei.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help