Re: [RFC PATCH 14/16] gtp: add support for flow based tunneling
From: Pravin Shelar <hidden>
Date: 2021-02-07 17:57:09
On Sat, Feb 6, 2021 at 10:06 AM Jonas Bonn [off-list ref] wrote:
Hi Pravin et al; TL;DR: we don't need to introduce an entire collect_md mode to the driver; we just need to tweak what we've got so that metadata is "always" added on RX and respected on TX; make the userspace socket optional and dump GTP packets to netstack if it's not present; and make a decision on whether to allow packets into netstack without validating their IP address.
Thanks for summarizing the LWT mechanism below. Overall I am fine with the changes, a couple of comments inlined.
On 23/01/2021 20:59, Jonas Bonn wrote:quoted
From: Pravin B Shelar <redacted> This patch adds support for flow based tunneling, allowing to send and receive GTP tunneled packets via the (lightweight) tunnel metadata mechanism. This would allow integration with OVS and eBPF using flow based tunneling APIs. The mechanism used here is to get the required GTP tunnel parameters from the tunnel metadata instead of looking up a pre-configured PDP context. The tunnel metadata contains the necessary information for creating the GTP header.So, I've been investigating this a bit further... What is being introduced in this patch is a variant of "normal functionality" that resembles something called "collect metadata" mode in other drivers (GRE, VXLAN, etc). These other drivers operate in one of two modes: more-or-less-point-to-point mode, or this alternative collect metadata varaint. The latter is something that has been bolted on to an existing implementation of the former. For iproute2, a parameter called 'external' is added to the setup of links of the above type to switch between the two modes; the point-to-point parameters are invalid in 'external' (or 'collect metadata') mode. The name 'external' is because the driver is externally controlled, meaning that the tunnel information is not hardcoded into the device instance itself. The GTP driver, however, has never been a point-to-point device. It is already 'externally controlled' in the sense that tunnels can be added and removed at any time. Adding this 'external' mode doesn't immediately make sense because that's roughly what we already have. Looking into how ip_tunnel_collect_metadata() works, it looks to me like it's always true if there's _ANY_ routing rule in the system with 'tunnel ID' set. Is that correct? I'll assume it is in order to continue my thoughts.
Right. It is just optimization to avoid allocating tun-dst in datapath.
So, with that, either the system is collecting SKB metadata or it isn't.
If it is, it seems reasonable that we populate incoming packets with
the tunnel ID as in this patch. That said, I'm not convinced that we
should bypass the PDP context mechanism entirely... there should still
be a PDP context set up or we drop the packet for security reasons.
For outgoing packets, it seems reasonable that the remote TEID may come
from metadata OR a PDP context. That would allow some routing
alternatives to what we have right now which just does a PDP context
lookup based on the destination/source address of the packet. It would
be nice for OVS/BPF to be able to direct a packet to a remote GTP
endpoint by providing that endpoint/TEID via a metadata structure.
So we end up with, roughly:
On RX:
i) check TEID in GTP header
ii) lookup PDP context
iii) validate IP of encapsulated packet
iv) if ip(tunnel_collect_metadata()) { /* add tun info */ }
v) decapsulate and pass to network stack
On TX:
i) if SKB has metadata, get destination and TEID from metadata (tunnel ID)
ii) otherwise, lookup PDP context for packet
For RX, only iv) is new; for TX only step i) is new. The rest is what
we already have.
The one thing that this complicates is the case where an external entity
(OVS or BPF) is doing validation of packet IP against incoming TEID. In
that case, we've got double validation of the incoming address and, I
suppose, OVS/BPF would perhaps be more efficient (?)... In that case,
holding a PDP context is a bit of a waste of memory.
It's somewhat of a security issue to allow un-checked packets into the
system, so I'm not super keen on dropping the validation of incoming
packets; given the previous paragraph, however, we might add a flag when
creating the link to decide whether or not we allow packets through even
if we can't validate them. This would also apply to packets without a
PDP context for an incoming TEID, too. This flag, I suppose, looks a
bit like the 'collect_metadata' flag that Pravin has added here.Yes. user should have the option to use GTP devices in LTW mode and bypass PDP session context completely. Lets add a flag for GTP link to have consistent behaviour for all types of LWT devices.
The other difference to what we currently have is that this patch sets up a socket exclusively in kernel space for the tunnel; currently, all sockets terminate in userspace: userspace receives all packets that can't be decapsulated and re-injected in to the network stack. With this patch, ALL packets (without a userspace termination) are re-injected into the network stack; it's just that anything that can't be decapsulated gets injected as a "GTP packet" with some metadata and an UNKNOWN protocol type. If nothing is looking at the metadata and acting on it, then these will just get dropped; and that's what would happen if nothing was listening on the userspace end, too. So we might as well just make the FD1 socket parameter to the link setup optional and be done with it.
Good idea to make FD1 optional argument. Thanks.