Thread (21 messages), 3 authors, 2019-06-11

Re: [PATCH net v3 0/3] net/sched: fix actions reading the network header in case of QinQ packets

From: Eli Britstein <hidden>
Date: 2019-06-11 04:43:10

On 6/11/2019 3:52 AM, Cong Wang wrote:
On Wed, Jun 5, 2019 at 10:37 PM Eli Britstein [off-list ref] wrote:
quoted
On 6/6/2019 4:42 AM, Cong Wang wrote:
quoted
On Tue, Jun 4, 2019 at 11:19 AM Eli Britstein [off-list ref] wrote:
quoted
On 6/4/2019 8:55 PM, Cong Wang wrote:
quoted
On Sat, Jun 1, 2019 at 9:22 PM Eli Britstein [off-list ref] wrote:
quoted
I think that's because QinQ, or VLAN is not an encapsulation. There is
no outer/inner packets, and if you want to mangle fields in the packet
you can do it and the result is well-defined.
Sort of. Perhaps VLAN tags are too short to be called an
encapsulation; my point is that it still needs some endpoints to push
or pop the tags, in a similar way to how we do encap/decap.

quoted
BTW, the motivation for my fix was a use case where 2 VGT VMs
communicating through OVS failed. Since OVS sees the same VLAN tag, it
doesn't add explicit VLAN pop/push actions (i.e. pop, mangle, push). If
you force explicit pop/mangle/push you will break such applications.
From what you said, it seems act_csum is in the middle of the packet
receive/transmit path. So, which one pops the VLAN tags in
this scenario? If the VMs are the endpoints, why not use act_csum
there?
In a switchdev mode, we can passthru the VFs to VMs, and have their
representors in the host, enabling us to manipulate the HW eswitch
without knowledge of the VMs.

To simplify it, consider the following setup:

v1a <-> v1b and v2a <-> v2b are veth pairs.

Now, we configure v1a.20 and v2a.20 as VLAN devices over v1a/v2a
respectively (and put the "a" devs in separate namespaces).

The TC rules are on the "b" devs, for example:

tc filter add dev v1b ... action pedit ... action csum ... action
redirect dev v2b

Now, ping from v1a.20 to v1b.20. The namespaces transmit/receive tagged
packets, and are not aware of the packet manipulation (and the required
act_csum).
This is what I said: v1b is not the endpoint which pops the VLAN tag,
v1b.20 is. So, why not simply move at least the csum action to
v1b.20? With that, you can still filter and redirect packets on v1b,
and you can even still modify them; just defer the checksum fixup to
the endpoint.
There are no vxb.20 ports:

ns0:                               ns1:
v1a.20 ----(VLAN)---- v1a          v2a ----(VLAN)---- v2a.20
                       |            |
                    (veth)       (veth)
                       |            |
                      v1b <-(TC)-> v2b
This diagram makes me even more confused...

Can you explain explicitly why there is no vxb.20? Is it a router or
something?
Yes.
By the way, even if it is a router and you really want to checksum the
packet at that point, you still don't have to move the skb->data
pointer; you just need to parse the header and calculate the offset
without touching skb->data. This would at least avoid having to restore
skb->data afterwards.
Sure, this is another implementation method. It doesn't change the 
essence. I just wanted to reuse the existing tcf_csum_ipv4/6.
Thanks.
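
For readers following the thread, the alternative suggested above
(parse the headers and compute the network header offset without
moving the data pointer) can be illustrated in plain userspace C. This
is only a sketch under stated assumptions, not code from the patch
series: it works on a raw byte buffer instead of an skb, and the names
l3_offset, read_be16 and the *_LEN/TPID_* macros are hypothetical and
introduced here for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for this sketch (not the kernel's macros). */
#define ETH_HDR_LEN   14      /* dst MAC + src MAC + EtherType */
#define VLAN_TAG_LEN  4       /* TCI + encapsulated EtherType */
#define TPID_8021Q    0x8100  /* C-tag (802.1Q) */
#define TPID_8021AD   0x88A8  /* S-tag (802.1ad, QinQ outer tag) */

/* Read a big-endian 16-bit value from a byte buffer. */
static uint16_t read_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Walk over any stacked VLAN tags and return the offset of the network
 * header, leaving the frame pointer itself untouched. The EtherType
 * found after the last tag is stored in *proto. Returns -1 if the
 * frame is too short to hold the headers it claims to carry.
 */
static int l3_offset(const uint8_t *frame, size_t len, uint16_t *proto)
{
    size_t off = ETH_HDR_LEN;
    uint16_t type;

    if (len < ETH_HDR_LEN)
        return -1;

    type = read_be16(frame + 12);
    while (type == TPID_8021Q || type == TPID_8021AD) {
        if (len < off + VLAN_TAG_LEN)
            return -1;
        /* Skip the 2-byte TCI; the next EtherType follows it. */
        type = read_be16(frame + off + 2);
        off += VLAN_TAG_LEN;
    }

    *proto = type;
    return (int)off;
}
```

A caller would then read or fix up the IPv4/IPv6 header at
frame + offset. As noted in the reply above, the patch instead adjusts
the data pointer so that the existing tcf_csum_ipv4/6 helpers can be
reused unchanged; either approach reaches the same header.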