Re: [PATCH net-next v4 1/6] net/sched: cls_api: Support hardware miss to tc action

From: Paul Blakey <hidden>
Date: 2023-01-23 11:39:58


On 23/01/2023 12:04, Simon Horman wrote:

On Sun, Jan 22, 2023 at 04:55:07PM +0200, Paul Blakey wrote:

quoted

For drivers to support partial offload of a filter's action list,
add support for action miss to specify an action instance to
continue from in sw.

CT action in particular can't be fully offloaded, as new connections
need to be handled in software. This imposes other limitations on
the actions that can be offloaded together with the CT action, such
as packet modifications.

Assign each action on a filter's action list a unique miss_cookie
which drivers can then use to fill action_miss part of the tc skb
extension. On getting back this miss_cookie, find the action
instance with relevant cookie and continue classifying from there.

Signed-off-by: Paul Blakey <redacted>
Reviewed-by: Jiri Pirko <redacted>

...

quoted

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 4cabb32a2ad94..9ef85cf9b5328 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h

...

quoted

@@ -240,21 +243,11 @@ struct tcf_exts {
  static inline int tcf_exts_init(struct tcf_exts *exts, struct net *net,
  				int action, int police)
  {
-#ifdef CONFIG_NET_CLS_ACT
-	exts->type = 0;
-	exts->nr_actions = 0;
-	/* Note: we do not own yet a reference on net.
-	 * This reference might be taken later from tcf_exts_get_net().
-	 */
-	exts->net = net;
-	exts->actions = kcalloc(TCA_ACT_MAX_PRIO, sizeof(struct tc_action *),
-				GFP_KERNEL);
-	if (!exts->actions)
-		return -ENOMEM;
+#ifdef CONFIG_NET_CLS
+	return tcf_exts_init_ex(exts, net, action, police, NULL, 0, false);
+#else
+	return -EOPNOTSUPP;
  #endif

nit: I think it would be nicer if there was a dummy implementation
of tcf_exts_init_ex for the !CONFIG_NET_CLS case and #ifdefs
could disappear from tcf_exts_init() entirely.

Likewise, elsewhere in this patch there seems to be new #if/#ifdefs
Which may be in keeping with the style of this file. But I also
think it's a style we ought to be moving away from.

I agree, I'll do that for v5.

quoted

-	exts->action = action;
-	exts->police = police;
-	return 0;
  }
  
  /* Return false if the netns is being destroyed in cleanup_net(). Callers

...

quoted

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 5b4a95e8a1ee0..46524ae353c5a 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c

...

quoted

+static struct tcf_exts_miss_cookie_node *
+tcf_exts_miss_cookie_lookup(u64 miss_cookie, int *act_index)
+{
+	union tcf_exts_miss_cookie mc = { .miss_cookie = miss_cookie, };
+
+	*act_index = mc.act_index;
+	return xa_load(&tcf_exts_miss_cookies_xa, mc.miss_cookie_base);
+}
+#endif /* IS_ENABLED(CONFIG_NET_TC_SKB_EXT) */
+
+static u64 tcf_exts_miss_cookie_get(u32 miss_cookie_base, int act_index)
+{
+	union tcf_exts_miss_cookie mc = { .act_index = act_index, };
+
+	if (!miss_cookie_base)
+		return 0;

nit: perhaps the compiler optimises this out, or it is otherwise
unimportant,
but doesn't the assignment
of mc zero it's fields, other than .act_index only for:

1. Any assignment of mc to be unused in the !miss_cookie_base case
2. mc.miss_cookie_base to be reassigned, otherwise

It should always zero all other fields with such assignment, and both 
are equivalent while we don't have any other fields.

FWIIW, I might have gone for something more like this (*untested*).

	union tcf_exts_miss_cookie mc;

	if (!miss_cookie_base)
		return 0;

	mc.act_index = act_index;
	mc.miss_cookie_base = miss_cookie_base;

	return mc.miss_cookie;

quoted

+
+	mc.miss_cookie_base = miss_cookie_base;
+	return mc.miss_cookie;
+}
+
  #ifdef CONFIG_NET_CLS_ACT
  DEFINE_STATIC_KEY_FALSE(tc_skb_ext_tc);
  EXPORT_SYMBOL(tc_skb_ext_tc);

@@ -1549,6 +1642,8 @@ static inline int __tcf_classify(struct sk_buff *skb,
  				 const struct tcf_proto *orig_tp,
  				 struct tcf_result *res,
  				 bool compat_mode,
+				 struct tcf_exts_miss_cookie_node *n,
+				 int act_index,
  				 u32 *last_executed_chain)
  {
  #ifdef CONFIG_NET_CLS_ACT
@@ -1560,13 +1655,40 @@ static inline int __tcf_classify(struct sk_buff *skb,
  #endif
  	for (; tp; tp = rcu_dereference_bh(tp->next)) {
  		__be16 protocol = skb_protocol(skb, false);
-		int err;
+		int err = 0;
  
-		if (tp->protocol != protocol &&
-		    tp->protocol != htons(ETH_P_ALL))
-			continue;
+		if (n) {
+			struct tcf_exts *exts;
  
-		err = tc_classify(skb, tp, res);
+			if (n->tp_prio != tp->prio)
+				continue;
+
+			/* We re-lookup the tp and chain based on index instead
+			 * of having hard refs and locks to them, so do a sanity
+			 * check if any of tp,chain,exts was replaced by the
+			 * time we got here with a cookie from hardware.
+			 */
+			if (unlikely(n->tp != tp || n->tp->chain != n->chain ||
+				     !tp->ops->get_exts))
+				return TC_ACT_SHOT;
+
+			exts = tp->ops->get_exts(tp, n->handle);

I see that this is probably safe, but it's not entirely obvious
that tp->ops->get_exts will never by NULL here >
1) It is invariant on n, which is set in a different function and is in turn
2) invariant on ext->act_miss being set, several patches away, in the driver.
3) Which is lastly invariant on being a code path relating to for
    hardware offload of a classifier where tp->ops->get_exts is defined.

Or perhaps I mixed it up somehow. But I do think at a minimum this
ought to be documented.

Yes, I check for get_exts above it, and for !exts below it.

Which brings me to a more central concern with this series: it's very
invasive and sets up a complex relationship between the core and the
driver. Is this the right abstraction for the problem at hand?


I think so and the changes are pretty minimal, and align with what
we currently do with sw/hw cooperation.

quoted

+			if (unlikely(!exts || n->exts != exts))
+				return TC_ACT_SHOT;
+
+			n = NULL;
+#ifdef CONFIG_NET_CLS_ACT
+			err = tcf_action_exec(skb, exts->actions + act_index,
+					      exts->nr_actions - act_index,
+					      res);
+#endif
+		} else {
+			if (tp->protocol != protocol &&
+			    tp->protocol != htons(ETH_P_ALL))
+				continue;
+
+			err = tc_classify(skb, tp, res);
+		}
  #ifdef CONFIG_NET_CLS_ACT
  		if (unlikely(err == TC_ACT_RECLASSIFY && !compat_mode)) {
  			first_tp = orig_tp;

...

quoted

@@ -1606,21 +1731,33 @@ int tcf_classify(struct sk_buff *skb,
  #if !IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
  	u32 last_executed_chain = 0;
  
-	return __tcf_classify(skb, tp, tp, res, compat_mode,
+	return __tcf_classify(skb, tp, tp, res, compat_mode, NULL, 0,
  			      &last_executed_chain);
  #else
  	u32 last_executed_chain = tp ? tp->chain->index : 0;
+	struct tcf_exts_miss_cookie_node *n = NULL;
  	const struct tcf_proto *orig_tp = tp;
  	struct tc_skb_ext *ext;
+	int act_index = 0;
  	int ret;
  
  	if (block) {
  		ext = skb_ext_find(skb, TC_SKB_EXT);
  
-		if (ext && ext->chain) {
+		if (ext && (ext->chain || ext->act_miss)) {
  			struct tcf_chain *fchain;
+			u32 chain = ext->chain;

nit: ext->chain seems to only be used once, as it was before this patch.
Perhaps the chain local variable is an artifact of development
and is best not added?

It is used to share the chain lookup code below without new scope or a 
'?' op (e.g n ? n->chain_index : ext->chain) which I think is cleaner.

quoted

+
+			if (ext->act_miss) {
+				n = tcf_exts_miss_cookie_lookup(ext->act_miss_cookie,
+								&act_index);
+				if (!n)
+					return TC_ACT_SHOT;
  
-			fchain = tcf_chain_lookup_rcu(block, ext->chain);
+				chain = n->chain_index;
+			}
+
+			fchain = tcf_chain_lookup_rcu(block, chain);
  			if (!fchain)
  				return TC_ACT_SHOT;

...

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help