Re: [PATCH net-next v4 2/2] net/sched: cls_flower: add support for matching tunnel control flags
From: Davide Caratti <hidden>
Date: 2024-06-06 08:09:47
hello Asbjørn, thanks for looking at this! On Wed, Jun 05, 2024 at 07:37:13AM +0000, Asbjørn Sloth Tønnesen wrote:
Hi Davide and Ilva,
[...]
Sorry, that I am late to the party, I have been away camping at Electromagnetic Field in the UK.
guess you had fun :)
Why not use the already existing key->enc_control.flags with dissector key FLOW_DISSECTOR_KEY_ENC_CONTROL for storing the flags? Currently key->enc_control.flags are unused.
I looked at the struct, it currently stores information about IPv4 / IPv6 fragmentation of the inner/outer packet plus FLOW_DIS_ENCAPSULATION - that IIRC is true in case the packet has an inner IP header (key->enc_control.addr_type selects the address family). these 3 bits are usually set when FLOW_DISSECTOR_KEY_CONTROL is requested.
I haven't fixed the drivers to validate that field yet, so currently only sfc does so.
I see: [1] https://elixir.bootlin.com/linux/v6.10-rc2/source/drivers/net/ethernet/sfc/tc.c#L276 (for FLOW_DISSECTOR_KEY_CONTROL) [2] https://elixir.bootlin.com/linux/v6.10-rc2/source/drivers/net/ethernet/sfc/tc.c#L390 (for FLOW_DISSECTOR_KEY_ENC_CONTROL)
Look at include/uapi/linux/pkt_cls.h for netlink flags:
enum {
TCA_FLOWER_KEY_FLAGS_IS_FRAGMENT = (1 << 0),
TCA_FLOWER_KEY_FLAGS_FRAG_IS_FIRST = (1 << 1),
};
and at include/net/flow_dissector.h for the dissector flags:
#define FLOW_DIS_IS_FRAGMENT BIT(0)
#define FLOW_DIS_FIRST_FRAG BIT(1)
#define FLOW_DIS_ENCAPSULATION BIT(2)
I would like to keep FLOW_DIS_ENCAPSULATION as the last flag, in order to
keep parity with the netlink flags, since the dissector flags field is
exposed to user space when some flags are not supported.to fully understand this, I need some more info on how you think IP_TUNNEL_<blah> bit should be written on the dissected data, since FLOW_DIS_IS_FRAGMENT has the same value as BIT(IP_TUNNEL_CSUM_BIT). So, in a way or another parity with netlink can't be guaranteed. The main reason why I added a new dissector bit is: avoiding breaking functionality of drivers, because I don't really know what the hardware does. If we want to use FLOW_DISSECTOR_KEY_ENC_CONTROL, then all drivers supporting the offload of this bit need to return -EOPNOTSUPP if key->enc_control.flags contains IP_TUNNEL_<blah> bits (like done in [2] for FLOW_DIS_*FRAG*). Moreover, if a driver supports offload of one of these bits (e.g. TUNNEL_OAM), we need to understand if it's possible for hardware to match when 'addr_type' is not specified, e.g. a geneve packet carrying a non-IP packet.
I realize that since this series is now merged, then fixing this up will have to go in another series, are you up for that?
I'm not against: if there's something to improve, this is the right moment _ later it will be more difficult; but we should clarify: - how / if we want to handle overlap of 'flags'. I guess, pass the current tunnel flags to the arguments of skb_flow_dissect_set_enc_addr_type(), and write "something" in ctrl->flags. - what to do with with flower netlink API. If these bits are in the same struct, we don't need TCA_FLOWER_KEY_ENC_FLAGS anymore? should we extend TCA_FLOWER_KEY_FLAGS instead? thanks, -- davide