Re: Multicast packet loss


From: Neil Horman <nhorman@tuxdriver.com>
Date: 2009-02-04 01:21:47

On Tue, Feb 03, 2009 at 12:34:54PM -0500, Kenny Chang wrote:

Eric Dumazet wrote:

Wes Chow wrote:

Eric Dumazet wrote:

Wes Chow wrote:

(I'm Kenny's colleague, and I've been doing the kernel builds)

First I'd like to note that there were a lot of bnx2 NAPI changes
between 2.6.21 and 2.6.22. As a reminder, 2.6.21 shows tiny amounts
of packet loss, whereas loss in 2.6.22 is significant.

Second, some CPU affinity info: if I do like Eric and pin all of the
apps onto a single CPU, I see no packet loss. Also, I do *not* see
ksoftirqd show up in top at all!

If I pin half the processes on one CPU and the other half on another
CPU, one ksoftirqd process shows up in top and completely pegs one
CPU. My packet loss in that case is significant (25%).

Now, the strange case: if I pin 3 processes to one CPU and 1 process
to another, I get about 25% packet loss and ksoftirqd pegs one CPU.
However, one of the apps takes significantly less CPU than the others,
and all apps lose the *exact same number of packets*. In all other
situations where we see packet loss, the actual number lost per
application instance appears random.

You see the same number of packets lost because they are lost at the NIC level.

Understood.

I have a new observation: if I pin processes to just CPUs 0 and 1, I see
no packet loss. Pinning to 0 and 2, I do see packet loss. Pinning 2 and
3, no packet loss. 4 & 5 - no packet loss, 6 & 7 - no packet loss. Any
other combination appears to produce loss (though I have not tried all
28 combinations, this seems to be the case).

At first I thought maybe it had to do with processes pinned to the same
CPU, but different cores. The machine is a dual quad core, which means
that CPUs 0-3 should be a physical CPU, correct? Pinning to 0/2 and 0/3
produce packet loss.

A quad core is really a 2 x 2 core.

The L2 cache is split into two blocks, one used by CPU0/1, the other by
CPU2/3.

You are at the limit of the machine with such a workload, so as soon as your
CPUs have to transfer 64-byte cache lines between those two L2 blocks, you lose.
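
To make the pairing argument concrete, here is a minimal Python sketch that enumerates all 28 CPU pairs on an 8-CPU box and separates the pairs that would share an L2 block from those that would not. The `L2_GROUPS` layout is an assumption inferred from the observations in this thread (0/1, 2/3, 4/5 and 6/7 behaving well); it is not read from the hardware, and the function names are illustrative only.

```python
from itertools import combinations

# Assumed L2 sharing layout for a dual quad-core machine, inferred from the
# pinning results reported in this thread. Not queried from the hardware.
L2_GROUPS = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]


def shares_l2(cpu_a, cpu_b, groups=L2_GROUPS):
    """Return True if both CPUs sit in the same assumed L2 block."""
    return any(cpu_a in group and cpu_b in group for group in groups)


def classify_pairs(n_cpus=8, groups=L2_GROUPS):
    """Split every unordered CPU pair into (same_l2, cross_l2) lists."""
    pairs = list(combinations(range(n_cpus), 2))
    same_l2 = [pair for pair in pairs if shares_l2(*pair, groups=groups)]
    cross_l2 = [pair for pair in pairs if not shares_l2(*pair, groups=groups)]
    return same_l2, cross_l2


if __name__ == "__main__":
    same_l2, cross_l2 = classify_pairs()
    print("pairs sharing an L2 block:", same_l2)
    print("pairs crossing L2 blocks: ", len(cross_l2))
```

Under that assumed layout only 4 of the 28 pairs stay inside one L2 block, which would match the report that almost every other combination shows loss.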

I've also noticed that it does not matter which of the working pairs I
pin to. For example, pinning 5 processes in any combination on 0/1
produces no packet loss, and pinning all 5 to just CPU 0 also produces no
packet loss.

The failures are also sudden. In all of the working cases mentioned
above, I don't see ksoftirqd in top at all. But when I run 6 processes
on a single CPU, ksoftirqd shoots up to 100% and I lose a huge number of
packets.

Normally, the softirq runs on the same CPU (the one handling the hard irq).

What determines which CPU the hard irq occurs on?

Check /proc/irq/{irqnumber}/smp_affinity

If you want IRQ16 served only by CPU0:

echo 1 >/proc/irq/16/smp_affinity
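
The value written to smp_affinity is a hexadecimal bitmask with bit N standing for CPU N, which is why `1` selects CPU0 alone. A small Python sketch of the conversion in both directions follows; the helper names are made up for illustration, and nothing here touches /proc.

```python
def smp_affinity_mask(cpus):
    """Return the hex bitmask string selecting the given CPU numbers."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu  # bit N stands for CPU N
    return format(mask, "x")


def cpus_from_mask(mask_hex):
    """Return the sorted CPU numbers selected by a hex bitmask string.

    Commas are dropped because larger machines print the mask in
    comma-separated 32-bit groups.
    """
    mask = int(mask_hex.replace(",", ""), 16)
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]


if __name__ == "__main__":
    print(smp_affinity_mask([0]))      # mask for CPU0 only
    print(smp_affinity_mask([2, 3]))   # mask for CPU2 and CPU3
    print(cpus_from_mask("ff"))        # CPUs selected by mask ff
```

For example, restricting an IRQ to CPU2 and CPU3 would mean writing `c` instead of `1`.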

Hi everyone,

First, thanks for all the effort so far. I think we've learned much more
about the problem in the last couple of days than we had in the previous
month.

Just to summarize where we are:

* pinning processes to specific cores/CPUs alleviates the problem
* issues exist from 2.6.22 up to 2.6.29-rc3
* the issue does not appear to be isolated to 64-bit; 32-bit kernels have
  problems too
* I'm attaching an updated test program with the PR_SET_TIMERSLACK call
  added
* on troubled machines, we are seeing a high number of context switches
  and interrupts
* we've ordered an Intel card to try in our machine to see if we can
  circumvent the issue with a different driver
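
For readers who want to reproduce the pinning mentioned in the first bullet from inside a test process, a minimal Python sketch is below. It is not the attached test program (which is not shown here); it only assumes a Linux host, where `os.sched_setaffinity` is available, and the function name is illustrative.

```python
import os


def pin_current_process(cpus):
    """Restrict the calling process to the given CPUs (Linux only).

    Returns the affinity set reported by the kernel afterwards.
    """
    os.sched_setaffinity(0, set(cpus))  # pid 0 means the calling process
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    # Pick a CPU the process is already allowed to run on, so the call
    # cannot fail because of an existing cpuset restriction.
    allowed = sorted(os.sched_getaffinity(0))
    print("pinned to:", pin_current_process([allowed[0]]))
```

The same effect can be had from the shell with `taskset`, which may be more convenient when the receiver binary cannot be modified.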

Kernel Version    Has Problem?    Notes
--------------    ------------    -----
2.6.15.x          N
2.6.16.x          -
2.6.17.x          -               Doesn't build on Hardy
2.6.18.x          -               Doesn't boot (kernel panic)
2.6.19.7          N               ksoftirqd is up there, but not pegging a CPU.
                                  Takes roughly same amount of CPU as the other
                                  processes, all of which are from 20-40%
2.6.20.21         N
2.6.21.7          N               sort of lopsided load, but no load from
                                  ksoftirqd -- strange
2.6.22.19         Y               First broken kernel
2.6.23.x          -
2.6.24-19         Y               (from Hardy)
2.6.25.x          -
2.6.26.x          -
2.6.27.x          Y               (from Intrepid)
2.6.28.1          Y
2.6.29-rc         Y


Correct me if I'm wrong, but from what we've seen, it looks like it's
pointing to some inefficiency in the softirq handling. The question is
whether it's something in the driver or the kernel. If we can isolate
that, maybe we can take some action to have it fixed.

I don't think it's softirq inefficiencies (oprofile would have shown that). I
know I keep harping on this, but I still think irq affinity is your problem.
I'd be interested in knowing what your /proc/interrupts file looked like on
each of the above kernels. Perhaps it's not that the bnx2 card you have can't
handle the setting of MSI interrupt affinities, but rather that something
changed to break irq affinity on this card.

Neil
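
When comparing /proc/interrupts across kernels as suggested above, it can help to reduce each IRQ line to the set of CPUs that actually serviced it. The Python sketch below does that on a text snapshot. The `SAMPLE` data is invented purely to show the layout (it is not output from the machines in this thread), and the function names are illustrative.

```python
# Hypothetical snapshot, only to show the column layout of /proc/interrupts.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 16:     120034          0          0          0   PCI-MSI-edge      eth0
 17:          5         12          7          9   IO-APIC-fasteoi   ahci
NMI:          0          0          0          0   Non-maskable interrupts
"""


def parse_interrupts(text):
    """Map each IRQ name to ({cpu: count}, description)."""
    lines = [line for line in text.splitlines() if line.strip()]
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        irq, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:len(cpus)]:
            if not field.isdigit():
                break  # some rows carry fewer counters than CPUs
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        table[irq.strip()] = (dict(zip(cpus, counts)), description)
    return table


def cpus_serving(table, irq):
    """Return the CPUs with a non-zero count for the given IRQ."""
    counts, _ = table[irq]
    return [cpu for cpu, count in counts.items() if count > 0]


if __name__ == "__main__":
    table = parse_interrupts(SAMPLE)
    for irq, (_, description) in table.items():
        print(irq, description, cpus_serving(table, irq))
```

Running it on snapshots taken before and after a test run, on a good and a bad kernel, would show whether the NIC's interrupt stays on one CPU or moves around.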
