Re: [PATCH] xps-mq: Transmit Packet Steering for multiqueue
From: Tom Herbert <hidden>
Date: 2010-09-01 16:24:28
On Wed, Sep 1, 2010 at 8:54 AM, Eric Dumazet [off-list ref] wrote:
> On Wednesday, 1 September 2010 at 08:41 -0700, Tom Herbert wrote:
>>> Why don't we do this in the normal transmit processing. There is already so much policy mechanism filters/actions/qdisc that doing it in higher level is fighting against these.
>> Are you proposing that TX queue selection be done in the qdiscs? The queue has to be selected before taking the lock (cannot afford taking a lock over the whole interface). This would necessitate moving the locking and probably rearranging a lot of the xmit code around that.
> Stephen point is not adding yet another layer 'before' qdisc layer. I would like something not as complex as your patch.
>
> 1) Why current selection fails ?
Current selection does a hash on the 4-tuple to map packets to queues. So any CPU can send on any queue, which leads to cache line bouncing of transmit structures. Also, sending from one CPU to a queue whose transmit interrupt is on a CPU in another cache domain causes more cache line bouncing on transmit completion. So while the current scheme nicely distributes load across the queues, it does nothing to promote locality. Getting some reasonable locality is where the benefits we are demonstrating come from.
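To make that concrete, here is a minimal user-space sketch of hash-based queue selection. It is not the kernel's code: `struct flow4`, `toy_hash()` and `pick_queue_by_hash()` are hypothetical names, and the mixing function is only a stand-in for whatever hash the stack really uses. The point it illustrates is that the sending CPU is not an input at all.

```c
#include <stdint.h>

/* Hypothetical flow key; field names are illustrative only. */
struct flow4 {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

/* Toy mixing function standing in for the real hash (assumption). */
static uint32_t toy_hash(const struct flow4 *f)
{
    uint32_t h = f->saddr;

    h = h * 31u + f->daddr;
    h = h * 31u + (((uint32_t)f->sport << 16) | f->dport);
    h ^= h >> 16;
    h *= 0x45d9f3bu;
    h ^= h >> 16;
    return h;
}

/*
 * Scale the 32-bit hash into [0, nqueues). The queue depends only on
 * the flow, so two CPUs sending on the same flow hit the same queue,
 * and one CPU sending on many flows touches many queues.
 */
static unsigned int pick_queue_by_hash(const struct flow4 *f,
                                       unsigned int nqueues)
{
    return (unsigned int)(((uint64_t)toy_hash(f) * nqueues) >> 32);
}
```

Load spreads evenly with a decent hash, but nothing ties a queue to the cache domain of the CPU that is transmitting.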
> 2) Could we change current selection to :
>
> - Use a lightweight selection, with no special configuration.
> - Use driver RX multiqueue information if available, in a one-to-one relationship.
Not generally. It's very possible that only a subset of CPUs are getting RX interrupts in multiqueue (consider when #queues < #CPUs), so there's really not an obvious 1-1 relationship. But each CPU can send and should be mapped to at least one transmit queue; the most obvious plan would be to send on a queue in the same cache domain.
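A rough sketch of that "same cache domain" mapping, again in user-space C and not taken from the patch: `queue_for_cpu()`, `cpu_domain[]` and `queue_domain[]` are hypothetical, with `queue_domain[q]` assumed to be the cache domain of the CPU that takes queue q's transmit completion interrupt.

```c
/*
 * Pick a TX queue for `cpu`, preferring one whose completion interrupt
 * lands in the same cache domain. All inputs are hypothetical tables
 * that something else would have to fill in.
 */
static unsigned int queue_for_cpu(unsigned int cpu,
                                  const int *cpu_domain,
                                  const int *queue_domain,
                                  unsigned int nqueues)
{
    /* Start at cpu % nqueues so CPUs do not all pile onto queue 0. */
    unsigned int start = cpu % nqueues;
    unsigned int i;

    for (i = 0; i < nqueues; i++) {
        unsigned int q = (start + i) % nqueues;

        if (queue_domain[q] == cpu_domain[cpu])
            return q;
    }

    /* No queue in this CPU's domain: fall back to a plain spread. */
    return start;
}
```

With 8 CPUs in two domains and 4 queues split two per domain, every CPU gets a queue in its own domain even though #queues < #CPUs, which is the case where a 1-1 RX/TX pairing breaks down.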
> 3) Eventually have a user selectable selection (socket option, or system wide, but one sysctl, not many bitmasks ;) ).
Right, but it would also be nice if a single sysctl could optimally set up multiqueue, RSS, RPS, and all my interrupt affinities for me ;-)

Tom