Re: [PATCH net-next v4 1/6] net/sched: cls_api: Support hardware miss to tc action
From: Paul Blakey <hidden>
Date: 2023-01-23 11:39:58
On 23/01/2023 12:04, Simon Horman wrote:
On Sun, Jan 22, 2023 at 04:55:07PM +0200, Paul Blakey wrote:quoted
For drivers to support partial offload of a filter's action list, add support for action miss to specify an action instance to continue from in sw. CT action in particular can't be fully offloaded, as new connections need to be handled in software. This imposes other limitations on the actions that can be offloaded together with the CT action, such as packet modifications. Assign each action on a filter's action list a unique miss_cookie which drivers can then use to fill action_miss part of the tc skb extension. On getting back this miss_cookie, find the action instance with relevant cookie and continue classifying from there. Signed-off-by: Paul Blakey <redacted> Reviewed-by: Jiri Pirko <redacted>...quoted
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h index 4cabb32a2ad94..9ef85cf9b5328 100644 --- a/include/net/pkt_cls.h +++ b/include/net/pkt_cls.h...quoted
@@ -240,21 +243,11 @@ struct tcf_exts { static inline int tcf_exts_init(struct tcf_exts *exts, struct net *net, int action, int police) { -#ifdef CONFIG_NET_CLS_ACT - exts->type = 0; - exts->nr_actions = 0; - /* Note: we do not own yet a reference on net. - * This reference might be taken later from tcf_exts_get_net(). - */ - exts->net = net; - exts->actions = kcalloc(TCA_ACT_MAX_PRIO, sizeof(struct tc_action *), - GFP_KERNEL); - if (!exts->actions) - return -ENOMEM; +#ifdef CONFIG_NET_CLS + return tcf_exts_init_ex(exts, net, action, police, NULL, 0, false); +#else + return -EOPNOTSUPP; #endifnit: I think it would be nicer if there was a dummy implementation of tcf_exts_init_ex for the !CONFIG_NET_CLS case and #ifdefs could disappear from tcf_exts_init() entirely. Likewise, elsewhere in this patch there seems to be new #if/#ifdefs Which may be in keeping with the style of this file. But I also think it's a style we ought to be moving away from.
I agree, I'll do that for v5.
quoted
- exts->action = action; - exts->police = police; - return 0; } /* Return false if the netns is being destroyed in cleanup_net(). Callers...quoted
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index 5b4a95e8a1ee0..46524ae353c5a 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c...quoted
+static struct tcf_exts_miss_cookie_node * +tcf_exts_miss_cookie_lookup(u64 miss_cookie, int *act_index) +{ + union tcf_exts_miss_cookie mc = { .miss_cookie = miss_cookie, }; + + *act_index = mc.act_index; + return xa_load(&tcf_exts_miss_cookies_xa, mc.miss_cookie_base); +} +#endif /* IS_ENABLED(CONFIG_NET_TC_SKB_EXT) */ + +static u64 tcf_exts_miss_cookie_get(u32 miss_cookie_base, int act_index) +{ + union tcf_exts_miss_cookie mc = { .act_index = act_index, }; + + if (!miss_cookie_base) + return 0;nit: perhaps the compiler optimises this out, or it is otherwise unimportant, but doesn't the assignment of mc zero it's fields, other than .act_index only for: 1. Any assignment of mc to be unused in the !miss_cookie_base case 2. mc.miss_cookie_base to be reassigned, otherwise
It should always zero all other fields with such assignment, and both are equivalent while we don't have any other fields.
FWIIW, I might have gone for something more like this (*untested*). union tcf_exts_miss_cookie mc; if (!miss_cookie_base) return 0; mc.act_index = act_index; mc.miss_cookie_base = miss_cookie_base; return mc.miss_cookie;quoted
+ + mc.miss_cookie_base = miss_cookie_base; + return mc.miss_cookie; +} + #ifdef CONFIG_NET_CLS_ACT DEFINE_STATIC_KEY_FALSE(tc_skb_ext_tc); EXPORT_SYMBOL(tc_skb_ext_tc);@@ -1549,6 +1642,8 @@ static inline int __tcf_classify(struct sk_buff *skb, const struct tcf_proto *orig_tp, struct tcf_result *res, bool compat_mode, + struct tcf_exts_miss_cookie_node *n, + int act_index, u32 *last_executed_chain) { #ifdef CONFIG_NET_CLS_ACT@@ -1560,13 +1655,40 @@ static inline int __tcf_classify(struct sk_buff *skb, #endif for (; tp; tp = rcu_dereference_bh(tp->next)) { __be16 protocol = skb_protocol(skb, false); - int err; + int err = 0; - if (tp->protocol != protocol && - tp->protocol != htons(ETH_P_ALL)) - continue; + if (n) { + struct tcf_exts *exts; - err = tc_classify(skb, tp, res); + if (n->tp_prio != tp->prio) + continue; + + /* We re-lookup the tp and chain based on index instead + * of having hard refs and locks to them, so do a sanity + * check if any of tp,chain,exts was replaced by the + * time we got here with a cookie from hardware. + */ + if (unlikely(n->tp != tp || n->tp->chain != n->chain || + !tp->ops->get_exts)) + return TC_ACT_SHOT; + + exts = tp->ops->get_exts(tp, n->handle);I see that this is probably safe, but it's not entirely obvious that tp->ops->get_exts will never by NULL here > 1) It is invariant on n, which is set in a different function and is in turn 2) invariant on ext->act_miss being set, several patches away, in the driver. 3) Which is lastly invariant on being a code path relating to for hardware offload of a classifier where tp->ops->get_exts is defined. Or perhaps I mixed it up somehow. But I do think at a minimum this ought to be documented.
Yes, I check for get_exts above it, and for !exts below it.
Which brings me to a more central concern with this series: it's very invasive and sets up a complex relationship between the core and the driver. Is this the right abstraction for the problem at hand?
I think so and the changes are pretty minimal, and align with what we currently do with sw/hw cooperation.
quoted
+ if (unlikely(!exts || n->exts != exts)) + return TC_ACT_SHOT; + + n = NULL; +#ifdef CONFIG_NET_CLS_ACT + err = tcf_action_exec(skb, exts->actions + act_index, + exts->nr_actions - act_index, + res); +#endif + } else { + if (tp->protocol != protocol && + tp->protocol != htons(ETH_P_ALL)) + continue; + + err = tc_classify(skb, tp, res); + } #ifdef CONFIG_NET_CLS_ACT if (unlikely(err == TC_ACT_RECLASSIFY && !compat_mode)) { first_tp = orig_tp;...quoted
@@ -1606,21 +1731,33 @@ int tcf_classify(struct sk_buff *skb, #if !IS_ENABLED(CONFIG_NET_TC_SKB_EXT) u32 last_executed_chain = 0; - return __tcf_classify(skb, tp, tp, res, compat_mode, + return __tcf_classify(skb, tp, tp, res, compat_mode, NULL, 0, &last_executed_chain); #else u32 last_executed_chain = tp ? tp->chain->index : 0; + struct tcf_exts_miss_cookie_node *n = NULL; const struct tcf_proto *orig_tp = tp; struct tc_skb_ext *ext; + int act_index = 0; int ret; if (block) { ext = skb_ext_find(skb, TC_SKB_EXT); - if (ext && ext->chain) { + if (ext && (ext->chain || ext->act_miss)) { struct tcf_chain *fchain; + u32 chain = ext->chain;nit: ext->chain seems to only be used once, as it was before this patch. Perhaps the chain local variable is an artifact of development and is best not added?
It is used to share the chain lookup code below without new scope or a '?' op (e.g n ? n->chain_index : ext->chain) which I think is cleaner.
quoted
+ + if (ext->act_miss) { + n = tcf_exts_miss_cookie_lookup(ext->act_miss_cookie, + &act_index); + if (!n) + return TC_ACT_SHOT; - fchain = tcf_chain_lookup_rcu(block, ext->chain); + chain = n->chain_index; + } + + fchain = tcf_chain_lookup_rcu(block, chain); if (!fchain) return TC_ACT_SHOT;...