Re: [net-next RFC PATCH 5/5] virtio-net: flow director support
From: Jason Wang <jasowang@redhat.com>
Date: 2011-12-06 07:25:16
Also in:
kvm, virtualization
On 12/06/2011 04:42 AM, Ben Hutchings wrote:
On Mon, 2011-12-05 at 16:59 +0800, Jason Wang wrote:quoted
In order to let the packets of a flow to be passed to the desired guest cpu, we can co-operate with devices through programming the flow director which was just a hash to queue table. This kinds of co-operation is done through the accelerate RFS support, a device specific flow sterring method virtnet_fd() is used to modify the flow director based on rfs mapping. The desired queue were calculated through reverse mapping of the irq affinity table. In order to parallelize the ingress path, irq affinity of rx queue were also provides by the driver. In addition to accelerate RFS, we can also use the guest scheduler to balance the load of TX and reduce the lock contention on egress path, so the processor_id() were used to tx queue selection.[...]quoted
+#ifdef CONFIG_RFS_ACCEL + +int virtnet_fd(struct net_device *net_dev, const struct sk_buff *skb, + u16 rxq_index, u32 flow_id) +{ + struct virtnet_info *vi = netdev_priv(net_dev); + u16 *table = NULL; + + if (skb->protocol != htons(ETH_P_IP) || !skb->rxhash) + return -EPROTONOSUPPORT;Why only IPv4?
Oops, IPv6 should work also.
quoted
+ table = kmap_atomic(vi->fd_page); + table[skb->rxhash& TAP_HASH_MASK] = rxq_index; + kunmap_atomic(table); + + return 0; +} +#endifThis is not a proper implementation of ndo_rx_flow_steer. If you steer a flow by changing the RSS table this can easily cause packet reordering in other flows. The filtering should be more precise, ideally matching exactly a single flow by e.g. VID and IP 5-tuple. I think you need to add a second hash table which records exactly which flow is supposed to be steered. Also, you must call rps_may_expire_flow() to check whether an entry in this table may be replaced; otherwise you can cause packet reordering in the flow that was previously being steered. Finally, this function must return the table index it assigned, so that rps_may_expire_flow() works.
Thanks for the explanation, how about document this briefly in scaling.txt?
quoted
+static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb) +{ + int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : + smp_processor_id(); + + /* As we make use of the accelerate rfs which let the scheduler to + * balance the load, it make sense to choose the tx queue also based on + * theprocessor id? + */ + while (unlikely(txq>= dev->real_num_tx_queues)) + txq -= dev->real_num_tx_queues; + return txq; +}[...] Don't do this, let XPS handle it. Ben.