Re: [net-2.6 PATCH] bonding: fix broken multicast with round-robin mode
From: Andy Gospodarek <andy@greyhouse.net>
Date: 2010-03-31 14:49:04
On Wed, Mar 31, 2010 at 5:08 AM, Eric Dumazet [off-list ref] wrote:
Le jeudi 25 mars 2010 à 17:40 -0400, Andy Gospodarek a écrit :quoted
Round-robin (mode 0) does nothing to ensure that any multicast traffic originally destined for the host will continue to arrive at the host when the link that sent the IGMP join or membership report goes down. One of the benefits of absolute round-robin transmit. Keeping track of subscribed multicast groups for each slave did not seem like a good use of resources, so I decided to simply send on the curr_active slave of the bond (typically the first enslaved device that is up). This makes failover management simple as IGMP membership reports only need to be sent when the curr_active_slave changes. I tested this patch and it appears to work as expected. Originally reported by Lon Hohberger [off-list ref]. Signed-off-by: Andy Gospodarek <andy@greyhouse.net> CC: Lon Hohberger <redacted> CC: Jay Vosburgh <redacted> --- drivers/net/bonding/bond_main.c | 34 ++++++++++++++++++++++++++-------- 1 files changed, 26 insertions(+), 8 deletions(-)diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 430c022..0b38455 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c@@ -1235,6 +1235,11 @@ void bond_change_active_slave(struct bonding *bond, struct slave *new_active)write_lock_bh(&bond->curr_slave_lock); } } + + /* resend IGMP joins since all were sent on curr_active_slave */ + if (bond->params.mode == BOND_MODE_ROUNDROBIN) { + bond_resend_igmp_join_requests(bond); + } } /**@@ -4138,22 +4143,35 @@ static int bond_xmit_roundrobin(struct sk_buff *skb, struct net_device *bond_devstruct bonding *bond = netdev_priv(bond_dev); struct slave *slave, *start_at; int i, slave_no, res = 1; + struct iphdr *iph = ip_hdr(skb); read_lock(&bond->lock); if (!BOND_IS_OK(bond)) goto out; - /* - * Concurrent TX may collide on rr_tx_counter; we accept that - * as being rare enough not to justify using an atomic op here + * Start with the curr_active_slave that joined the bond as the + * default for sending IGMP traffic. For failover purposes one + * needs to maintain some consistency for the interface that will + * send the join/membership reports. The curr_active_slave found + * will send all of this type of traffic. */ - slave_no = bond->rr_tx_counter++ % bond->slave_cnt; + if ((skb->protocol == htons(ETH_P_IP)) && + (iph->protocol == htons(IPPROTO_IGMP))) {Hmm... iph->protocol is a u8, how can htons(IPPROTO_IGMP) be equal to iph->protocol ?
Heh, this isn't needed for a single-byte check. Thanks for catching that.
[PATCH] bonding: bond_xmit_roundrobin() fix Commit a2fd940f (bonding: fix broken multicast with round-robin mode) added a problem on litle endian machines. drivers/net/bonding/bond_main.c:4159: warning: comparison is always false due to limited range of data type
Curious what version of GCC are you using? Before applying your patch it compiles without warning on my x86_64 F11-ish system with: gcc version 4.4.1 20090725 (Red Hat 4.4.1-2) (GCC)
quoted hunk ↗ jump to hunk
Signed-off-by: Eric Dumazet <redacted> ---diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 5b92fbf..5972a52 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c@@ -4156,7 +4156,7 @@ static int bond_xmit_roundrobin(struct sk_buff *skb, struct net_device *bond_dev* send the join/membership reports. The curr_active_slave found * will send all of this type of traffic. */ - if ((iph->protocol == htons(IPPROTO_IGMP)) && + if ((iph->protocol == IPPROTO_IGMP) && (skb->protocol == htons(ETH_P_IP))) { read_lock(&bond->curr_slave_lock);