Thread: 52 messages, 10 authors, 2012-03-13

Re: [RFC PATCH v0 1/2] net: bridge: propagate FDB table into hardware

From: John Fastabend <hidden>
Date: 2012-02-14 18:57:04
Also in: kvm

On 2/14/2012 5:18 AM, jamal wrote:
> On Mon, 2012-02-13 at 07:13 -0800, John Fastabend wrote:
>> The use case here is multiple VFs but the same solution should work with
>> multiple PFs as well. FDB controls should be independent of how the ports
>> are exposed: VFs, PFs, VMDQ/queue pairs, macvlan, etc.
>
> Makes sense.
>
>> With events and ADD/DEL/GET FDB controls we can solve both cases. This also
>> solves Roopa's case with macvlan where she wants to add additional addresses
>> to macvlan ports.
>
> Not familiar with that issue - I'll prowl the list.

Roopa was likely on the right track here,

http://patchwork.ozlabs.org/patch/123064/

But I think the proper syntax is to use the existing PF_BRIDGE:RTM_XXX
netlink messages. And if possible drive this without extending ndo_ops.
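
To make the message format concrete, here is a minimal sketch of how a PF_BRIDGE:RTM_NEWNEIGH request could be laid out on the wire. It is written in Python for brevity and only builds the bytes, it does not open a socket. The numeric constants are taken from the Linux UAPI headers; the helper name build_fdb_add, the ifindex value, and the choice of NUD_NOARP as the entry state are assumptions for illustration, not something settled in this thread.

```python
import struct

# Constants as defined in linux/socket.h, linux/netlink.h,
# linux/rtnetlink.h and linux/neighbour.h.
AF_BRIDGE = 7
RTM_NEWNEIGH = 28
NLM_F_REQUEST = 0x001
NLM_F_ACK = 0x004
NLM_F_EXCL = 0x200
NLM_F_CREATE = 0x400
NUD_NOARP = 0x40      # assumed state for a manually added entry
NDA_LLADDR = 2

NLMSG_HDRLEN = 16     # sizeof(struct nlmsghdr)
RTA_HDRLEN = 4        # sizeof(struct rtattr)


def align4(n):
    """Round n up to the 4-byte boundary netlink uses for padding."""
    return (n + 3) & ~3


def build_fdb_add(mac, ifindex, seq=1):
    """Hypothetical helper: bytes of an 'add FDB entry' request."""
    lladdr = bytes(int(part, 16) for part in mac.split(":"))

    # struct ndmsg: family, pad1, pad2, ifindex, state, flags, type
    ndmsg = struct.pack("=BBHiHBB", AF_BRIDGE, 0, 0, ifindex, NUD_NOARP, 0, 0)

    # One NDA_LLADDR attribute carrying the MAC, padded to 4 bytes.
    rta_len = RTA_HDRLEN + len(lladdr)
    rta = struct.pack("=HH", rta_len, NDA_LLADDR) + lladdr
    rta += b"\0" * (align4(rta_len) - rta_len)

    payload = ndmsg + rta
    flags = NLM_F_REQUEST | NLM_F_ACK | NLM_F_CREATE | NLM_F_EXCL
    # struct nlmsghdr: len, type, flags, seq, pid (0 = let the kernel fill in)
    header = struct.pack("=IHHII", NLMSG_HDRLEN + len(payload),
                         RTM_NEWNEIGH, flags, seq, 0)
    return header + payload


if __name__ == "__main__":
    # ifindex 5 is a placeholder; a real tool would resolve the device name.
    msg = build_fdb_add("52:e5:62:7b:57:88", ifindex=5)
    print(len(msg), msg.hex())
```

The point of the sketch is that nothing in this layout is specific to the software bridge: the same family, message type and attribute could in principle be directed at a device with an embedded bridge, which is the behaviour asked for in this mail.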

An ideal user space interaction IMHO would look like,

[root@jf-dev1-dcblab iproute2]# ./br/br fdb add 52:e5:62:7b:57:88 dev veth10
[root@jf-dev1-dcblab iproute2]# ./br/br fdb
port    mac addr                flags
veth2   36:a6:35:9b:96:c4       local
veth4   aa:54:b0:7b:42:ef       local
veth0   2a:e8:5c:95:6c:1b       local
veth6   6e:26:d5:43:a3:36       local
veth0   f2:c1:39:76:6a:fb
veth8   4e:35:16:af:87:13       local
veth10  52:e5:62:7b:57:88       static
veth10  aa:a9:35:21:15:c4       local
[root@jf-dev1-dcblab iproute2]# ./br/br fdb add dev eth3 to 52:e5:62:7b:57:88
RTNETLINK answers: Invalid argument

This uses Stephen's br tool. The first command adds an FDB entry to the SW
bridge. If the same tool could also add entries to an embedded bridge, I think
that would be the best case, i.e. no RTNETLINK error on the second cmd.
Embedded FDB entries could then be dumped this way too, so I get a complete
view of my FDB setup across multiple sw bridges and embedded bridges.

I don't think br is part of iproute2 yet. I just pulled it out of some RFC,
but it works reasonably well and is intuitive enough.

>> Yes it should flood here, unless it's acting as a 802.1Qbg VEB or VEPA.
>
> Ok. So there is a toggle somewhere which controls how flooding should
> happen.

Yes. The hardware has a bit to support this which is currently not exposed
to user space. That's a case where we have 'yet another knob' that needs
a clean solution. This causes real bugs today when users try to use the
macvlan devices in VEPA mode on top of SR-IOV. By the way these modes are
all part of the 802.1Qbg spec which people actually want to use with Linux
so a good clean solution is probably needed.

>> Maybe not. But the kernel already has the needed signals; with one extra
>> hook we can save running a daemon in user space. Maybe that's not a great
>> argument to add kernel code though.
>
> You make a reasonable argument to have it in the kernel but I think we
> win more if we separate the control. So while I empathize, I am hoping
> that you'd go with the path that is hard to travel ;->
>
>> The PF_BRIDGE:RTM_GETNEIGH,RTM_NEWNEIGH,RTM_DELNEIGH are registered in the
>> br_netlink_init() path.
>
> Hrm - hadn't paid attention to that before. Nasty.
> The bridge seems to be hard-coding policy on station movement, no?
> This is a good example of the qualms I have on adding things to the
> kernel ;->
> I may not want to auto update a MAC address moving ports as part of
> some policy I have. I can go and add YAK (Yet Another Knob) - but where
> is the line drawn?

I have no problem with drawing the line here and trying to implement something
over PF_BRIDGE:RTM_xxx nlmsgs. I'll work with Roopa and see if we can come
up with something in the next couple days.

w.r.t. VEPA/VEB and flooding behavior we could probably have a bit to indicate
if the port is a flooding port or not. Then users could build any sort of forwarding
table they wanted OR we could just drive it through a notifier (ndo_ops?) in the
macvlan path which does VEPA today.
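
As a rough illustration of the per-port flooding bit idea, the following sketch models the forwarding decision in plain Python. It is a toy model under stated assumptions, not kernel or driver code: the function name egress_ports, the dictionary-based FDB, and the set of flood-enabled ports are all hypothetical.

```python
def egress_ports(fdb, flood_ports, all_ports, in_port, dst_mac):
    """Return the ports a frame should be sent out of.

    fdb         -- dict mapping destination MAC to port name
    flood_ports -- set of ports whose (hypothetical) flood bit is set
    all_ports   -- iterable of every port on the bridge
    in_port     -- port the frame arrived on
    dst_mac     -- destination MAC of the frame
    """
    port = fdb.get(dst_mac)
    if port is not None:
        # Known destination: forward to that port, never back to the sender.
        return [] if port == in_port else [port]
    # Unknown destination: only ports marked as flooding ports get a copy.
    return [p for p in all_ports if p in flood_ports and p != in_port]


if __name__ == "__main__":
    fdb = {"52:e5:62:7b:57:88": "veth10"}
    ports = ["veth0", "veth10", "eth3"]
    # Only the uplink floods; the other ports see just their known addresses.
    print(egress_ports(fdb, {"eth3"}, ports, "veth0", "52:e5:62:7b:57:88"))
    print(egress_ports(fdb, {"eth3"}, ports, "veth0", "aa:bb:cc:dd:ee:ff"))
```

With the flood bit set on every port this behaves like an ordinary learning bridge on an FDB miss; restricting it to the uplink approximates the case where unknown traffic should only leave towards the adjacent switch.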

OK I'll try to write some actual code now that can be critiqued.

> cheers,
> jamal
  