Re: [PATCH net-next 2/3] ipv4: rate limit updating of next hop exceptions with same pmtu
From: Timo Teras <hidden>
Date: 2013-05-29 05:05:45
On Wed, 29 May 2013 00:04:47 +0300 (EEST) Julian Anastasov [off-list ref] wrote:
Hello, On Tue, 28 May 2013, Timo Teras wrote:quoted
quoted
quoted
--- a/net/ipv4/route.c +++ b/net/ipv4/route.c@@ -947,6 +947,10 @@ static void __ip_rt_update_pmtu(structrtable *rt, struct flowi4 *fl4, u32 mtu) if (mtu < ip_rt_min_pmtu) mtu = ip_rt_min_pmtu; + if (rt->rt_pmtu == mtu && + time_before(jiffies, dst->expires - ip_rt_mtu_expires / 2)) + return; +Can we also add logic in this patch in update_or_create_fnhe, so that we avoid invalidation for cached routes when only pmtu expiration is updated (same pmtu), i.e.: + if (gw || pmtu != fnhe->fnhe_pmtu) { + /* Exception created; mark the cached routes for the nexthop + ... + } BTW, I now see that previous patch should call for_each_possible_cpu for the both cases, not only when fnhe is created but also on update:Why would this be needed? The idea is that if there is no next hop exception, everyone is using the next hop's dsts. If there is a next hop exception hashed, they will be using those routes in the exception entry. When the exception is created, we need to invalidate the nexthop's routes, to force relookup of these dst's if should go to the exception. Basically it flushes the nexthop's 'default' dst. But if we have a valid exception, and we are just updating it. Everyone is already using the right dst. The update_or_create_fnhe() updates all exception's dst's that are effected. Since the set of hosts to which that exception applies is the same, I don't see why we would need to invalidate the nexthop's 'default' dst.Agreed for the NH route. But there is another case: when fnhe exists for fnhe_pmtu!=0 and fnhe_gw=0 and the cached FNHE route has just rt_pmtu!=0 and rt_uses_gateway=0. In the event of redirect we should invalidate fnhe_rth. Users should come again to get the new gw from the same fnhe. As you note, nh_pcpu_rth_output does not need to be invalidated, it was already invalidated when fnhe was created. So, my idea is something like this. If cached route detects change in gw on redirect, it calls update_or_create_fnhe but FNHE can already know this gw, we should invalidate the fnhe_rth only if gw changes. As caller has some stale cache route, it should be invalidated as before, I assume 'kill_route' is properly provided. if (fnhe) { if (fnhe->fnhe_gw != gw && gw) { rt = rcu_dereference_protected(fnhe->fnhe_rth, 1); if (rt) rt->dst.obsolete = DST_OBSOLETE_KILL; fnhe->fnhe_gw = gw; } ... } ... }
This is already done by the caller of update_or_create_fnhe for the case of gw change. It might be worth to put all this to the same place, but would be worth of separate patch.
Also, I don't know the XFRM code well but xfrm_bundle_ok() calls dst_check. Now ipv4_dst_check will not give any indication for recent change in rt_pmtu. I hope this is not a problem because I see some comparisons with cached pmtus. I.e. the main question is: are there any dst_check and ->check users that needs to know for changes in rt_pmtu or all of them just use dst_mtu().
In XFRM case, xfrm_bundle_ok() will propagate the pmtu by calling dst_mtu() for each target. All caching users need to use dst_mtu() to check it, so the route.c code is correct - this is how it worked before my patches too. - Timo