Thread: 51 messages, 11 authors, 2015-03-30

Re: [PATCH net-next RFC v2] switchdev: bridge: drop hardware forwarded packets

From: roopa <hidden>
Date: 2015-03-20 23:30:33

On 3/20/15, 3:37 PM, Scott Feldman wrote:
On Fri, Mar 20, 2015 at 3:06 PM, roopa [off-list ref] wrote:
quoted
On 3/20/15, 11:13 AM, Scott Feldman wrote:
quoted
On Fri, Mar 20, 2015 at 10:11 AM, John Fastabend
[off-list ref] wrote:
quoted
On 03/20/2015 09:58 AM, roopa@cumulusnetworks.com wrote:
quoted
From: Roopa Prabhu <redacted>

On a Linux bridge with bridge forwarding offloaded to switch ASIC,
there is a need to not re-forward frames that have already been
forwarded in hardware.

Typically these are broadcast or multicast frames that the hardware
forwards to multiple destination ports while also sending a copy of
the packet to the CPU (the kernel), e.g. an ARP broadcast.
The bridge driver will try to forward the packet again, resulting in
two copies of the same packet.

These packets can also come up to the kernel for logging when they hit
a LOG acl rule in hardware. In such cases, you do want the packet
to go through the bridge netfilter hooks. Hence, this patch adds the
required checks just before the packet is transmitted.

v2:
        - Add a new hw_fwded flag in skbuff to indicate that the packet
        is already hardware forwarded. Switch driver will set this flag.
        I have been trying to avoid having this flag in the skb
        and that's why this patch has been in my tree for so long. Can't think
        of better alternatives. Suggestions are welcome. I have put
        this under CONFIG_NET_SWITCHDEV to minimize the impact.

Signed-off-by: Roopa Prabhu <redacted>
Signed-off-by: Wilson Kok <redacted>
---
Interesting. I completely avoid this problem by not instantiating a
software bridge ;) When these pkts come up the stack I either use a
raw socket to capture them, put a 'tc' ingress rule to do something,
or have OVS handle them in some special way. It seems to me that this
is where the sw/hw model starts to break when you have these magic
bits to handle the packets differently.

How do you know to set the skb bit? Do you have some indicator in the
descriptor? I don't have any good way to learn this on my hardware. But
I can assume if it reached the CPU it was because of some explicit rule.
I was wondering that also, since there was no example.

This feature seems like it belongs in the bridge.
yes, it does, the check today is really in the bridge.
quoted
We already have
BR_FLOOD to indicate whether unknown unicast traffic is flooded to a
bridge port.  Can we add another BR_FLOOD_BCAST (or some name) for
this new feature?  You would set/clear this flag on the bridge
(master) port.  The default is set.  And now:

- #define BR_AUTO_MASK          (BR_FLOOD | BR_LEARNING)
+ #define BR_AUTO_MASK          (BR_FLOOD | BR_FLOOD_BCAST | BR_LEARNING)

Does this work for your use-case, Roopa?
Note my first RFC patch, sort of did this:
https://marc.info/?l=linux-netdev&m=142147999420017&w=2

But there are open issues there, as listed in the comment and also in the
subsequent discussion on the thread.

We discussed this flag before, and I think it does not allow the case where
hw switch ports are bridged with non-hw ports.
I went back and read the thread just to remind myself what the pros/cons
were.  I think the mixed case isn't a concern since this
BR_FLOOD_BCAST check is made on egress to the bridge port.  So only
clear BR_FLOOD_BCAST on hw switch ports (if hw did the flood already
amongst its ports), and leave it set for non-hw-ports.   It seems the
patch should mostly be a clone of how BR_FLOOD is handled.  Is there
more to it?
That may work. But we have recently moved IGMP handling to software in the
kernel, and I was trying to make this work for that case. I am going to try
what you suggest by finding a workaround for the IGMP case.

I will get back to you.

Thanks!
-Roopa
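
The per-port egress check discussed above can be sketched as a small
user-space model. This is only an illustration under stated assumptions, not
the patch from this thread and not kernel code: BR_FLOOD_BCAST is the name
proposed by Scott, the bit values are arbitrary, and struct port, may_flood()
and flood() are hypothetical helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Illustrative bit values only. BR_FLOOD_BCAST is the flag name proposed
 * in this thread; it is an assumption here, not an existing kernel flag.
 */
#define BR_LEARNING    (1UL << 0)
#define BR_FLOOD       (1UL << 1)
#define BR_FLOOD_BCAST (1UL << 2)
#define BR_AUTO_MASK   (BR_FLOOD | BR_FLOOD_BCAST | BR_LEARNING)

/* Hypothetical stand-in for a bridge port. */
struct port {
    int id;
    unsigned long flags;
};

enum dst_kind { DST_UNKNOWN_UNICAST, DST_BROADCAST };

/* Egress-side decision: may a flooded frame be sent out of port p? */
static bool may_flood(const struct port *p, const struct port *ingress,
                      enum dst_kind kind)
{
    if (p == ingress)
        return false; /* never send a frame back out its source port */
    if (kind == DST_UNKNOWN_UNICAST)
        return (p->flags & BR_FLOOD) != 0;
    return (p->flags & BR_FLOOD_BCAST) != 0;
}

/* Collect the ids of the ports that would receive a software copy. */
static size_t flood(const struct port *ports, size_t nports,
                    const struct port *ingress, enum dst_kind kind,
                    int *out_ids)
{
    size_t n = 0;

    assert(ports != NULL && out_ids != NULL);
    for (size_t i = 0; i < nports; i++)
        if (may_flood(&ports[i], ingress, kind))
            out_ids[n++] = ports[i].id;
    return n;
}
```

In the mixed case, with BR_FLOOD_BCAST cleared on two hw switch ports and
left set on a non-hw port, a broadcast arriving on one hw port gets a software
copy only toward the non-hw port, because the flag is tested per egress port.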