Thread: 27 messages, 10 authors, 2011-02-26

Re: [PATCH] xps-mq: Transmit Packet Steering for multiqueue

From: Tom Herbert <hidden>
Date: 2010-09-01 16:24:28

On Wed, Sep 1, 2010 at 8:54 AM, Eric Dumazet [off-list ref] wrote:
On Wednesday, 1 September 2010 at 08:41 -0700, Tom Herbert wrote:
Why don't we do this in the normal transmit processing?
There are already so many policy mechanisms (filters/actions/qdisc) that
doing it at a higher level fights against them.
Are you proposing that TX queue selection be done in the qdiscs?  The
queue has to be selected before taking the lock (cannot afford taking
a lock over the whole interface).  This would necessitate moving the
locking and probably rearranging a lot of the xmit code around that.
Stephen's point is about not adding yet another layer 'before' the qdisc layer.

I would like something not as complex as your patch.

1) Why does the current selection fail?
The current selection hashes the 4-tuple to map packets to queues, so
any CPU can send on any queue, which leads to cache line bouncing of
transmit structures.  Also, sending from one CPU to a queue whose
transmit interrupt is on a CPU in another cache domain causes more
cache line bouncing at transmit completion.  So while the current
scheme nicely distributes load across the queues, it does nothing to
promote locality.  Getting reasonable locality is where the benefits
we are demonstrating come from.
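
To make the locality problem concrete, here is a minimal sketch of
hash-based queue selection.  It is not the kernel code: struct flow4,
flow_hash() and pick_queue_by_hash() are hypothetical names, and the
mixing function is a toy stand-in for the real flow hash.  The point is
only that the sending CPU never enters the computation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flow key (the 4-tuple); not a kernel structure. */
struct flow4 {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
};

/* Toy mixing function standing in for a real flow hash (assumption). */
static uint32_t flow_hash(const struct flow4 *f)
{
    uint32_t h = f->saddr ^ (f->daddr * 2654435761u);

    h ^= ((uint32_t)f->sport << 16) | f->dport;
    h ^= h >> 16;
    h *= 0x45d9f3bu;
    h ^= h >> 16;
    return h;
}

/*
 * Scale the 32-bit hash into [0, num_queues).  The result depends only
 * on the flow, so two CPUs sending on the same flow share a queue, and
 * one CPU sending on many flows touches many queues.
 */
static unsigned int pick_queue_by_hash(const struct flow4 *f,
                                       unsigned int num_queues)
{
    assert(num_queues > 0);
    return (unsigned int)(((uint64_t)flow_hash(f) * num_queues) >> 32);
}
```

Load spreads well across queues this way, but nothing ties a queue to
the cache domain of the CPU that fills it or of the CPU that handles
its completion interrupt.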
2) Could we change the current selection to:

 - Use a lightweight selection, with no special configuration.

 - Use driver RX multiqueue information if available, in a one-to-one
relationship.
Not generally.  It's very possible that only a subset of CPUs is
getting RX interrupts in multiqueue (consider when #queues < #CPUs),
so there's really no obvious 1-1 relationship.  But each CPU can
send and should be mapped to at least one transmit queue; the most
obvious plan would be to send on a queue in the same cache domain.
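
As a rough illustration of the per-CPU mapping, the sketch below builds
a CPU-to-queue table.  It is not the patch itself: the names are
hypothetical, and it assumes that CPUs are numbered so that each run of
cpus_per_domain consecutive CPUs shares a cache and that queue d has
its completion interrupt pinned inside domain d.

```c
#include <assert.h>

#define MAX_CPUS 64

/* Hypothetical table: tx_queue_for_cpu[c] is the queue CPU c sends on. */
static unsigned int tx_queue_for_cpu[MAX_CPUS];

/*
 * Give every CPU a queue, even when there are fewer queues than CPUs.
 * Assumption: consecutive CPUs share a cache domain, and queue d is
 * serviced from domain d.
 */
static void build_cpu_to_queue_map(unsigned int num_cpus,
                                   unsigned int cpus_per_domain,
                                   unsigned int num_queues)
{
    assert(num_cpus > 0 && num_cpus <= MAX_CPUS);
    assert(cpus_per_domain > 0 && num_queues > 0);

    for (unsigned int cpu = 0; cpu < num_cpus; cpu++) {
        unsigned int domain = cpu / cpus_per_domain;

        /* Wrap around if there are more domains than queues. */
        tx_queue_for_cpu[cpu] = domain % num_queues;
    }
}

/* Selection is a table lookup keyed by the sending CPU, not the flow. */
static unsigned int pick_queue_by_cpu(unsigned int cpu)
{
    assert(cpu < MAX_CPUS);
    return tx_queue_for_cpu[cpu];
}
```

With 8 CPUs, 4 CPUs per cache domain and 2 queues, CPUs 0-3 send on
queue 0 and CPUs 4-7 on queue 1, so the transmit structures for each
queue stay within one cache domain.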
3) Eventually have a user-selectable policy (a socket option, or
system-wide, but one sysctl, not many bitmasks ;) ).
Right, but it would also be nice if a single sysctl could optimally
set up multiqueue, RSS, RPS, and all my interrupt affinities for me
;-)

Tom