Re: [PATCH v6] net: batch skb dequeueing from softnet input_pkt_queue

From: Eric Dumazet <hidden>
Date: 2010-04-30 17:40:28

On Thursday 29 April 2010 at 20:23 +0200, Andi Kleen wrote:
On Thu, Apr 29, 2010 at 07:56:12PM +0200, Eric Dumazet wrote:
On Thursday 29 April 2010 at 19:42 +0200, Andi Kleen wrote:
Andi, what do you think of this one ?
Don't we have a function to send an IPI to an individual CPU instead?
That's what this function already does. You only set a single CPU 
in the target mask, right?

IPIs are unfortunately always a bit slow. Nehalem-EX systems have X2APIC
which is a bit faster for this, but that's not available in the lower
end Nehalems. But even then it's not exactly fast.

I don't think the IPI primitive can be optimized much. It's not a cheap 
operation.

If it's a problem do it less often and batch IPIs.

It's essentially the same problem that interrupt mitigation and NAPI
solve for NICs. I guess we just need a suitable mitigation mechanism.

Of course that would move more work to the sending CPU again, but
perhaps there's no alternative. I guess you could make it cheaper by
minimizing access to packet data.

-Andi
Well, IPIs are already batched, and the rate is auto-adaptive.
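
A minimal sketch of how this kind of batching can make the IPI rate adapt to load. It is a user-space model, not the kernel's softnet code: `struct input_queue`, `enqueue_pkt()` and `drain_queue()` are hypothetical names, and the rule "send an IPI only when the target queue goes from empty to non-empty" is an assumption about the mechanism being described. Under that rule, a busy target CPU drains more slowly, its queue stays non-empty longer, and each IPI ends up covering more packets.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 1024

/* Hypothetical per-CPU input queue (a model, not the kernel's softnet_data). */
struct input_queue {
    int pkts[QUEUE_CAP];
    size_t len;
    unsigned long ipis_sent;   /* how many IPIs the sender had to raise */
};

/*
 * Queue one packet for the remote CPU.
 * Returns  1 if the packet was queued and an IPI is needed (queue was empty),
 *          0 if it was queued behind an already pending wakeup,
 *         -1 if the queue was full and the packet was dropped.
 */
static int enqueue_pkt(struct input_queue *q, int pkt)
{
    assert(q != NULL);

    if (q->len == QUEUE_CAP)
        return -1;

    int need_ipi = (q->len == 0);
    q->pkts[q->len++] = pkt;
    if (need_ipi)
        q->ipis_sent++;        /* stand-in for actually sending the IPI */
    return need_ipi;
}

/* Target CPU side: take everything queued so far in one pass. */
static size_t drain_queue(struct input_queue *q)
{
    size_t n = q->len;

    q->len = 0;
    return n;
}
```

In this model, three packets queued before the target runs cost one IPI; the next packet after a drain costs another.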

After various changes, it seems things are going better; maybe there is
something related to cache line thrashing.

I 'solved' it by using idle=poll, but you might take a look at
clockevents_notify (acpi_idle_enter_bm) abuse of a shared and highly
contended spinlock...
acpi_idle_enter_bm should not be executed on a Nehalem, it's obsolete.
If it does on your system something is wrong.

Ahh, that rings a bell. One issue is that if the remote CPU is in a very
deep idle state, it can take a long time to wake it up. Nehalem has deeper
sleep states than earlier CPUs. When this happens the IPI sender will be slow
too, I believe.

Are the target CPUs idle? 
Yes, mostly, but with about 200,000 wakeups per second I would say...

If a CPU in a deep state receives an IPI and processes a softirq, should it
go back to the deep state immediately, or should it wait for some
milliseconds?
Perhaps we need to feed some information to cpuidle's governor to prevent this problem.

idle=poll is very drastic, better to limit to C1
How can I do this?

Thanks!
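
One possible way to express the C1 limit from user space is the PM QoS device `/dev/cpu_dma_latency`: a process writes a 32-bit maximum latency in microseconds and the request stays active for as long as the file descriptor is held open. The sketch below assumes that interface is available on the running kernel; `hold_cpu_latency()` is a hypothetical helper name, and the right latency value is an assumption that depends on the exit latencies the platform reports for C1 and the deeper states. A coarser alternative with the ACPI idle driver is the boot parameter `processor.max_cstate=1`.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Ask PM QoS to keep CPU wakeup latency at or below max_usec.
 * path is normally "/dev/cpu_dma_latency"; it is a parameter here only
 * so the helper can be exercised against an ordinary file.
 * Returns an open fd on success, -1 on failure. The request lasts only
 * while the fd stays open, so the caller must keep it.
 */
static int hold_cpu_latency(const char *path, int32_t max_usec)
{
    assert(path != NULL);

    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;

    /* The device takes the value as a raw 32-bit integer. */
    if (write(fd, &max_usec, sizeof(max_usec)) != (ssize_t)sizeof(max_usec)) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would pick a value at or above the C1 exit latency but below that of the next deeper state, call `hold_cpu_latency("/dev/cpu_dma_latency", value)` as root, and close the returned descriptor to drop the request.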
