Re: [PATCH net-next v6 01/11] net: sched: use rcu for action cookie update
From: Vlad Buslov <hidden>
Date: 2018-07-16 08:58:08
On Fri 13 Jul 2018 at 21:51, Cong Wang [off-list ref] wrote:
On Fri, Jul 13, 2018 at 6:30 AM Vlad Buslov [off-list ref] wrote:quoted
On Fri 13 Jul 2018 at 03:52, Cong Wang [off-list ref] wrote:quoted
On Thu, Jul 5, 2018 at 7:24 AM Vlad Buslov [off-list ref] wrote:quoted
Implement functions to atomically update and free action cookie using rcu mechanism.Without stating any reason..... Is this even a changelog?Yes, it is.What do you expect in a changelog generally? Repeating what your code does? Thanks but we don't even want to read any code unless the need of this code is reasonably justified.
In my cover letter: - Motivation for patchset is presented in first paragraph. - Problems that prevent us from removing rtnl lock dependency are described, problem 3 is about cookie pointer. - In implementation section, point 3 presents solution for that problem.
Can we at least agree you have no justification for this change in this changelog? Or you believe this patch is as trivial as a white space change which doesn't need a justification?
Cong, from your last letter I understand that you want to have justification specifically for using atomic operation in this particular patch. I agree with you that I should have explained it in more details. I found a lot of prior art for same or similar atomic ops usage for rcu pointers(examples in my previous mail) and assumed it to be trivial, but now I understand that I was wrong in this case.
quoted
quoted
quoted
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>Dear Marcelo, how did it pass your review? See below:quoted
+static void tcf_set_action_cookie(struct tc_cookie __rcu **old_cookie, + struct tc_cookie *new_cookie) +{ + struct tc_cookie *old; + + old = xchg(old_cookie, new_cookie);This is an incorrect use of RCU, obviously should be rcu_assign_pointer() here.Could you please explain your concern in more details? Similar pattern is already widely used in kernel for re-assigning rcu pointers. ForMy reasoning is too simple: search whatisRCU.txt for "xchg", I find nothing. :) Here is the link: https://www.kernel.org/doc/Documentation/RCU/whatisRCU.txt Of course, both xchg() and rcu_assign_pointer() are aimed to make an assignment of a pointer. But even without looking into their implementations, there must be a reason for rcu_assign_pointer() to exist, right? Can we agree or you believe rcu_assign_pointer() can be replaced by xchg() and removed finally? This also means you need to justify your pick of xchg() in your changelog where there is nothing literally.
Now I understand that you want a justification for using xchg specifically and I do agree with you that it should have been described in details.
quoted
example, Eric Dumazet uses it in 1c0d32fde5bd ("net_sched: gen_estimator: complete rewrite of rate estimators"): void gen_kill_estimator(struct net_rate_estimator __rcu **rate_est) { struct net_rate_estimator *est; est = xchg((__force struct net_rate_estimator **)rate_est, NULL); if (est) { del_timer_sync(&est->timer); kfree_rcu(est, rcu); } }In this case, *I think* the only reason is the burden of the API gen_kill_estimator(). It aims to be a core API for both netfilter and net_sched, therefore it _has to_ provide a wrapper for its users, because otherwise each user has to repeat rcu_assign_pointer() + del_timer_sync() + kfree_rcu(), just a matter of duplication. Apparently this rule does NOT apply to your case, where tcf_set_action_cookie() is merely a static function called by two users in the same C file, and without anything but a call_rcu().quoted
Tom Herbert uses same idiom in a8c5f90fb59a ("ip_tunnel: Ops registration for secondary encap (fou, gue)"): int ip_tunnel_encap_add_ops(const struct ip_tunnel_encap_ops *ops, unsigned int num) { if (num >= MAX_IPTUN_ENCAP_OPS) return -ERANGE; return !cmpxchg((const struct ip_tunnel_encap_ops **) &iptun_encaps[num], NULL, ops) ? 0 : -1; }cmpxchg() is completely different with xchg(), first of all. In this case, its caller expects if this cmpxchg() fails or not. How this could be even related to your case given tcf_set_action_cookie() returns void?quoted
Again, Eric uses xchg to re-assign rcu pointer in 45f6fad84cc3 ("ipv6: add complete rcu protection around np->opt"): struct ipv6_txoptions *ipv6_update_options(struct sock *sk, struct ipv6_txoptions *opt) { if (inet_sk(sk)->is_icsk) { if (opt && !((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE)) && inet_sk(sk)->inet_daddr != LOOPBACK4_IPV6) { struct inet_connection_sock *icsk = inet_csk(sk); icsk->icsk_ext_hdr_len = opt->opt_flen + opt->opt_nflen; icsk->icsk_sync_mss(sk, icsk->icsk_pmtu_cookie); } } opt = xchg((__force struct ipv6_txoptions **)&inet6_sk(sk)->opt, opt); sk_dst_reset(sk);In this case, it is the caller's requirement. The callers of ipv6_update_options() want to get the old pointer and do something about it: opt = ipv6_update_options(sk, opt); if (opt) { atomic_sub(opt->tot_len, &sk->sk_omem_alloc); txopt_put(opt); } It should be functionally equivalent to saving the old pointer before a rcu_assign_pointer(). So, this does NOT apply to your case either, you only call call_rcu() and the callers require nothing.quoted
return opt; }quoted
quoted
@@ -65,10 +83,7 @@ static void free_tcf(struct tc_action *p) free_percpu(p->cpu_bstats); free_percpu(p->cpu_qstats); - if (p->act_cookie) { - kfree(p->act_cookie->data); - kfree(p->act_cookie); - } + tcf_set_action_cookie(&p->act_cookie, NULL);So, this is called in free_tcf(), where the action is already invisible from readers so it is ready to be freed. The question is: If the action itself is already ready to be freed, why do you need RCU here? What could still read 'act->act_cookie' while 'act' is already invisible? Its last refcnt is already gone, the fast path RCU readers are gone too given filters use rcu work already. Standalone action dump? Again, the last refcnt is already gone.It is not necessary here, I just used tcf_set_action_cookie() that already implements cookie pointer cleanup to prevent code duplication. I'm open to changing it, if you concerned with performance impact of using atomic operation for re-assigning cookie pointer.Yeah, totally understand. But >act_cookie very special here, it requires no copying when update, unlike the normal cases. This means from RCU we can remove the "C" here. I know you still copy it when dumping it, but it is a part of Read, not a part of Update, so it is safe to say you only need R and U here. Which in turn means two things: 1. You don't have to use RCU anymore. 2. You don't need a lock for writers given there is no copy during update, if you still stick to RCU. This is why I keep saying you need to justify it, it is not trivial and it is not easy to understand either.
I agree with your analysis. I just want to use rcu mechanism here for managing lifetime of cookie pointer(protecting reads with rcu read lock and freeing cookie after rcu timeout). However, due to lack of actual copy phase, I omit locking on update and just use atomic ops to substitute cookie pointer in concurrent-safe manner.
Thanks!
Thank you for reviewing my code!