Thread: 32 messages, 5 authors, 2011-02-03

Re: [PATCH] bonding: added 802.3ad round-robin hashing policy for single TCP session balancing

From: Jay Vosburgh <hidden>
Date: 2011-01-15 00:05:22

Oleg V. Ukhno [off-list ref] wrote:
Jay Vosburgh wrote:
	This is a violation of the 802.3ad (now 802.1AX) standard, 5.2.1
(f), which requires that all frames of a given "conversation" are passed
to a single port.

	The existing layer3+4 hash has a similar problem (it may send
packets from a conversation to multiple ports), but in that case it's
an unlikely exception (only in the case of IP fragmentation), whereas
here it's the norm.  At a minimum, this must be clearly documented.
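
To illustrate the difference, here is a minimal Python sketch. The hash is modelled on the layer3+4 formula given in the bonding documentation of that period (source port XOR destination port, XORed with the low 16 bits of the IP XOR, modulo the slave count, with the ports left out for fragments). The function names, addresses and port numbers are invented for the example, and this is not the kernel code.

```python
import ipaddress


def layer34_slave(src_ip, dst_ip, src_port, dst_port, slave_count,
                  fragmented=False):
    """Pick a slave index the way a layer3+4 style hash would (sketch)."""
    ip_xor = (int(ipaddress.IPv4Address(src_ip))
              ^ int(ipaddress.IPv4Address(dst_ip))) & 0xFFFF
    # Fragments carry no usable L4 header, so the ports drop out of the hash.
    port_xor = 0 if fragmented else (src_port ^ dst_port)
    return (port_xor ^ ip_xor) % slave_count


def round_robin_slave(packet_index, slave_count):
    """Pick a slave index per packet, ignoring the conversation entirely."""
    return packet_index % slave_count


# Hypothetical iSCSI-like flow between two hosts over a 4-slave bond.
flow = ("10.0.0.1", "10.0.0.2", 40001, 3260, 4)
whole_packets = {layer34_slave(*flow) for _ in range(8)}
with_fragment = whole_packets | {layer34_slave(*flow, fragmented=True)}
rr_ports = {round_robin_slave(i, 4) for i in range(8)}
```

With these made-up values, unfragmented packets of the flow always hash to one slave, a fragment of the same flow hashes to a different one, and round robin uses every slave for the single conversation.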

	Also, what does a round robin in 802.3ad provide that the
existing round robin does not?  My presumption is that you're looking to
get the aggregator autoconfiguration that 802.3ad provides, but you
don't say.
	I'm still curious about this question.  Given the rather
intricate setup of your particular network (described below), I'm not
sure why 802.3ad is of benefit over traditional etherchannel
(balance-rr / balance-xor).
	I don't necessarily think this is a bad cheat (round robining on
802.3ad as an explicit non-standard extension), since everybody wants to
stripe their traffic across multiple slaves.  I've given some thought to
making round robin into just another hash mode, but this also does some
magic to the MAC addresses of the outgoing frames (more on that below).
Yes, I am resetting MAC addresses when transmitting packets so that the
switch puts packets onto different ports of the receiving etherchannel.
	By "etherchannel" do you really mean "Cisco switch with a
port-channel group using LACP"?
I am using this patch to provide full-mesh iSCSI connectivity between at
least 4 hosts (all hosts are, of course, in the same Ethernet segment), and
every host is connected by an aggregate link with (usually) 4 slaves.
Using round-robin I get near-equal load striping when transmitting;
using MAC address magic I force the switch to stripe packets over all slave
links in the destination port-channel (when the number of rx-ing slaves is
equal to the number of tx-ing slaves and is even).
	By "MAC address magic" do you mean that you're assigning
specifically chosen MAC addresses to the slaves so that the switch's
hash is essentially "assigning" the bonding slaves to particular ports
on the outgoing port-channel group?

	Assuming that this is the case, it's an interesting idea, but
I'm unconvinced that it's better on 802.3ad vs. balance-rr.  Unless I'm
missing something, you can get everything you need from an option to
have balance-rr / balance-xor utilize the slave's permanent address as
the source address for outgoing traffic.
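
A toy model may make the point concrete. The sketch below assumes a hypothetical switch whose src-mac hash is simply the last octet of the source MAC modulo the number of ports in the destination port-channel; real switches use their own hash functions, and the MAC addresses and function names here are invented for illustration.

```python
def src_mac_port(mac, n_ports):
    """Hypothetical switch src-mac hash: last octet modulo port count."""
    return int(mac.split(":")[-1], 16) % n_ports


def egress_ports(src_macs, n_ports=4):
    """Return the sorted set of egress ports hit by the given source MACs."""
    return sorted({src_mac_port(mac, n_ports) for mac in src_macs})


# All four slaves transmitting with the bond's single MAC address.
bond_mac = "00:11:22:33:44:00"
same_mac = egress_ports([bond_mac] * 4)

# Each slave transmitting with its own (made-up) permanent address.
permanent_macs = [
    "00:11:22:33:44:00",
    "00:11:22:33:44:01",
    "00:11:22:33:44:02",
    "00:11:22:33:44:03",
]
per_slave_mac = egress_ports(permanent_macs)
```

Under this assumed hash, frames that all carry the bond's MAC leave the switch on a single port, while frames carrying four suitably distinct source addresses are spread over all four ports of the destination port-channel.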
[...] So I am able to utilize all slaves
for tx and for rx up to maximum capacity; besides, I am getting L2 link
failure detection (and load rebalancing), which is (in my opinion) much
faster and more robust than what L3 or dm-multipath provides.
That is my idea with the patch.
	Can somebody (John?) more knowledgeable than I about dm-multipath
comment on the above?
	This is the code that resets the MAC header as described above.
It doesn't quite match the documentation, since it only resets the MAC
for ETH_P_IP packets.
Yes, I really meant that my patch applies only to ETH_P_IP packets, and I
omitted that from the documentation I wrote.
	Is limiting this to just ETH_P_IP really a means to exclude ARP,
or is there some advantage to (effectively) only balancing IP traffic,
and leaving other traffic (IPv6, for one) essentially unbalanced (when
exiting the switch through the destination port-channel group, which
you've set to use a src-mac hash)?

	-J

---
	-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com