Thread (17 messages) 17 messages, 4 authors, 2010-10-06

Re: [PATCH] bonding: rejoin multicast groups on VLANs

From: Flavio Leitner <hidden>
Date: 2010-09-29 20:38:47

On Wed, Sep 29, 2010 at 03:54:11PM -0400, Andy Gospodarek wrote:
On Wed, Sep 29, 2010 at 04:35:39PM -0300, Flavio Leitner wrote:
quoted
On Wed, Sep 29, 2010 at 02:44:13PM -0400, Andy Gospodarek wrote:
quoted
On Wed, Sep 29, 2010 at 04:12:24AM -0300, Flavio Leitner wrote:
quoted
It fixes bonding to rejoin multicast groups added
to VLAN devices on top of bonding when a failover
happens.

The first packet may be discarded, so the timer
assure that at least 3 Reports are sent.
Good find, Flavio.  Clearly the fact that multicast membership is broken
needs to be fixed, but I would rather not see timers used at all.  We
worked hard in the past to eliminate timers for several reasons, so I
would rather see a workqueue used.
I noticed that the code is using workqueues now, just thought a
simple thing which may run couple times would fit perfectly with
a simple timer.
Timers runs in softirq context, so I'd rather not add code that takes
locks and runs in softirq context.
quoted
quoted
I also don't like retransmitting the membership report 3 times when it
may not be needed.  Though many switches can handle it, the cost of
receiving and processing what might be a large list of multicast
addresses every 200ms for 600ms doesn't seem ideal.  It also feels like
a hack. :)
Definitely a parameter is much better, but I wasn't sure about
the patch approach so I was expecting a review like this and then
do the refinements needed. Better to post early, right? :)

I see your point to change the default to one membership report,
but we can't assure during a failover if everything has been
received. Also, it isn't supposed to keep failing flooding the
network, so I would rather have couple membership reports being
send than watch an important multicast application failing.

Perhaps 3 is too much, but one sounds too few to me.

what you think?
Adding a tunable parameter allows the administrator to decide how many
is enough.  I would rather keep the default at one and add the tunable
parameter (which needs to be added to bond_sysfs.c to be effective).

I have not heard loud complaints about only sending one since the code
to send retransmits of membership reports was added a few years ago, so
I'm inclined to think it is working well for most users (or no one is
using bonding).

Maybe it would be best to break this into 2 patches.  One that simply
fixes the failover code so it works with VLANs (that could be done
easily today) and another patch that can add the code to send multiple
retransmits.  Would you be willing to do that?
Sure, I can do it and then start another testing session here.

-- 
Flavio
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help