RE: macvlan devices and vlan interaction
From: "Keller, Jacob E" <jacob.e.keller@intel.com>
Date: 2018-01-30 23:07:21
-----Original Message----- From: Alexander Duyck [mailto:alexander.duyck@gmail.com] Sent: Tuesday, January 30, 2018 2:39 PM To: Keller, Jacob E <jacob.e.keller@intel.com> Cc: Shannon Nelson <redacted>; netdev@vger.kernel.org; Duyck, Alexander H [off-list ref] Subject: Re: macvlan devices and vlan interaction On Tue, Jan 30, 2018 at 2:20 PM, Keller, Jacob E [off-list ref] wrote:quoted
quoted
-----Original Message----- From: Alexander Duyck [mailto:alexander.duyck@gmail.com] Sent: Tuesday, January 30, 2018 12:49 PM To: Shannon Nelson <redacted> Cc: Keller, Jacob E <jacob.e.keller@intel.com>; netdev@vger.kernel.org;Duyck,quoted
quoted
Alexander H [off-list ref] Subject: Re: macvlan devices and vlan interactionquoted
Hi Jake, The current behavior seems logical to me, but I suppose Alex might argue differently. The macvlan was put onto the default lowerdev assuming the lowerdev will hand it all the default traffic, and then the macvlan splits out its own vlan traffic. As soon as the lowerdev assumption changes, it is going to change what gets pushed up to the macvlan dev. If the lowerdev is separating the vlan traffic out of the "default" flow headed to the macvlan, then the initial assumption has changed and the vlan traffic has been vectored off before it can be delivered up the stack to the macvlan.It depends on what your goal is. In my mind making macvlan VLAN challenged is the easier solution since you just have to add some pass-thru ops to the VLAN drivers and you can guarantee that you are passing MAC-VLAN pair for each address on the interface for the call. The alternative gets to be a bit more complex since it requires multiple rules, one for non-tagged and one per VLAN for tagged traffic.quoted
There's an argument that the lowerdev shouldn't know anything about the upperdev's routing, just deliver to the upperdev and let the upperdev worry about it. But perhaps this becomes is a question of precedence: does the lowerdev split traffic first by mac address or by vlan tag.That is where things get messy. We found it splits by VLAN tag if the VLAN is present on the lowerdev, or it splits by MAC if it is not. That is why as Jake pointed out adding the VLAN to the lower dev causes issues.Yes, right now the problem is that it splits differently depending on whether ornot a VLAN is present on the lower dev.quoted
quoted
quoted
I don't like your option 1: as you point out, it breaks current functionality, likely depended upon in some containers that are using macvlans to manage their traffic. We don't know what's going on inside that container and I don't think we want to break its ability to split its own vlans.Maybe we should look at an option 1.5. Mark the lowerdev as VLAN challenged if any macvlan is operating with any VLANs enabled on it since we can only really allow VLAN filtering to occur at one level reliably. Either that or maybe we look at making VLANs and rx_handler setups mutually exclusive.Actually.. what if we changed the order of splitting, so that we always checkmacvlan MAC address first, before checking VLANs?quoted
This should work in both cases of macvlan -> VLAN -> lowerdev, or VLAN ->macvlan -> lowerdev. The thing you have to then watch out for is how something like this would impact bonding or bridging since both of those use the Rx handler as well from my understanding. I suppose it would make sense though to do Rx handler first and then VLAN since the Rx handler should be placed on the VLAN itself if you are bridging/bonding/macvlan over a VLAN versus the reverse.quoted
In the first case, the macvlan isn't directly attached to the lowerdev, so we'd doVLAN filtering first, and then the VLAN would check MAC address. Right, that bit works without any issues.quoted
In the second case, even if lowerdev also had the VLAN, we'd do macvlanfiltering first, and things would work. That piece makes sense, at least for macvlan.quoted
Both the lowerdev VLAN and upperdev macvlan should receive traffic correctlyin this case.quoted
I think this resolves the problem of which device goes to which VLAN.Right, this works for macvlan. The only concern I have then is bond and bridging. It is probably fine but it wouldn't hurt to check.
Yea, it's quite possible bonding and bridging won't work quite right...
quoted
I don't know if it resolves the issues with leaked VLANs, where a VLAN added tothe macvlan device causes traffic for that VLAN to be received by all the MAC addresses of the lowerdev...quoted
I suppose this might not be considered a problem? The traffic could be receivedeither way if you're in promiscuous mode. It's not like we have a sense of "trusted" configuration either.quoted
I think some separate work for the case of macvlan on top of VLAN on top oflower dev can be done as well, to enable offloading in this case. I'll have some more thoughts on that soon.quoted
Thanks, JakeSo the issue with all of this apears to be: commit 2425717b27eb92b175335ca4ff0bb218cbe0cb64 Author: John Fastabend [off-list ref] Date: Mon Oct 10 09:16:41 2011 +0000 net: allow vlan traffic to be received under bond The problem is that patch made it so that you could put a bond on two interfaces and still peel out traffic via VLANs. Prior to this patch the code is the way we were already discussing.
Interesting. I am not sure how to resolve this without breaking one or the other case. The commit message itself calls out that: Putting a VLAN.228 on both the bond0 and eth2 device will result in eth2.228 receiving the skb. I don't think this is completely unexpected and was the result prior to the rx_handler result. So.. to me this *was* unexpected, but I guess it matches how things used to work? Thanks, Jake