Re: [PATCH net-next RFC v2] switchdev: bridge: drop hardware forwarded packets
From: roopa <hidden>
Date: 2015-03-20 23:30:33
On 3/20/15, 3:37 PM, Scott Feldman wrote:
On Fri, Mar 20, 2015 at 3:06 PM, roopa [off-list ref] wrote:quoted
On 3/20/15, 11:13 AM, Scott Feldman wrote:quoted
On Fri, Mar 20, 2015 at 10:11 AM, John Fastabend [off-list ref] wrote:quoted
On 03/20/2015 09:58 AM, roopa@cumulusnetworks.com wrote:quoted
From: Roopa Prabhu <redacted> On a Linux bridge with bridge forwarding offloaded to switch ASIC, there is a need to not re-forward frames that have already been forwarded in hardware. Typically these are broadcast or multicast frames forwarded by the hardware to multiple destination ports including sending a copy of the packet to the cpu (kernel e.g. an arp broadcast). The bridge driver will try to forward the packet again, resulting in two copies of the same packet. These packets can also come up to the kernel for logging when they hit a LOG acl rule in hardware. In such cases, you do want the packet to go through the bridge netfilter hooks. Hence, this patch adds the required checks just before the packet is being xmited. v2: - Add a new hw_fwded flag in skbuff to indicate that the packet is already hardware forwarded. Switch driver will set this flag. I have been trying to avoid having this flag in the skb and thats why this patch has been in my tree for long. Cant think of other better alternatives. Suggestions are welcome. I have put this under CONFIG_NET_SWITCHDEV to minimize the impact. Signed-off-by: Roopa Prabhu <redacted> Signed-off-by: Wilson Kok <redacted> ---Interesting. I completely avoid this problem by not instantiating a software bridge ;) When these pkts come up the stack I either use a raw socket to capture them, put a 'tc' ingress rule to do something, or have OVS handle them in some special way. It seems to me that this is where the sw/hw model starts to break when you have these magic bits to handle the packets differently. How do you know to set the skb bit? Do you have some indicator in the descriptor? I don't have any good way to learn this on my hardware. But I can assume if it reached the CPU it was because of some explicit rule.I was wondering that also, since there was no example. This features seems like it belongs in the bridge.yes, it does, the check today is really in the bridge.quoted
We already have BR_FLOOD to indicate whether unknown unicast traffic is flooded to a bridge port. Can we add another BR_FLOOD_BCAST (or some name) for this new feature? You would set/clear this flag on the bridge (master) port. The default is set. And now: - #define BR_AUTO_MASK (BR_FLOOD | BR_LEARNING) + #define BR_AUTO_MASK (BR_FLOOD | BR_FLOOD_BCAST | BR_LEARNING) Does this work for your use-case, Roopa?Note my first RFC patch, sort of did this: https://marc.info/?l=linux-netdev&m=142147999420017&w=2 But there are open things there as listed in the comment and also in the subsequent discussion on the thread. We discussed this flag before and i think it does not allow the case where hw switch ports are bridged with non-hw ports.I went back and read the thread just to remind me what the pros/cons where. I think the mixed case isn't a concern since this BR_FLOOD_BCAST check is made on egress to the bridge port. So only clear BR_FLOOD_BCAST on hw switch ports (if hw did the flood already amongst its ports), and leave it set for non-hw-ports. It seems the patch should mostly be a clone of how BR_FLOOD is handled. Is there more to it?
That may work. But, we have recently moved igmp handling to software in kernel and i was trying to make this work for that case. I am going to try what you suggest by finding a work around for the igmp case. I will get back to you. Thanks! -Roopa