Thread (28 messages) 28 messages, 4 authors, 2019-06-05

Re: [PATCH v2 net-next 4/7] ipv6: Plumb support for nexthop object in a fib6_info

From: Wei Wang <hidden>
Date: 2019-06-04 21:36:41

On Tue, Jun 4, 2019 at 2:13 PM David Ahern [off-list ref] wrote:
On 6/4/19 3:06 PM, Martin Lau wrote:
quoted
On Tue, Jun 04, 2019 at 02:17:28PM -0600, David Ahern wrote:
quoted
On 6/3/19 11:29 PM, Martin Lau wrote:
quoted
On Mon, Jun 03, 2019 at 07:36:06PM -0600, David Ahern wrote:
quoted
On 6/3/19 6:58 PM, Martin Lau wrote:
quoted
I have concern on calling ip6_create_rt_rcu() in general which seems
to trace back to this commit
dec9b0e295f6 ("net/ipv6: Add rt6_info create function for ip6_pol_route_lookup")

This rt is not tracked in pcpu_rt, rt6_uncached_list or exception bucket.
In particular, how to react to NETDEV_UNREGISTER/DOWN like
the rt6_uncached_list_flush_dev() does and calls dev_put()?

The existing callers seem to do dst_release() immediately without
caching it, but still concerning.
those are the callers that don't care about the dst_entry, but are
forced to deal with it. Removing the tie between fib lookups an
dst_entry is again the right solution.
Great to know that there will be a solution.  It would be great
if there is patch (or repo) to show how that may look like on
those rt6_lookup() callers.
Not 'will be', 'there is' a solution now. Someone just needs to do the
conversions and devise the tests for the impacted users.
I don't think everyone will convert to the new nexthop solution
immediately.

How about ensuring the existing usage stays solid first?
Use of nexthop objects has nothing to do with separating fib lookups
from dst_entries, but with the addition of fib6_result it Just Works.

Wei converted ipv6 to use exception caches instead of adding them to the
FIB.

I converted ipv6 to use separate data structures for fib entries, added
direct fib6 lookup functions and added fib6_result. See the
net/core/filter.c.

The stage is set for converting users.

For example, ip6_nh_lookup_table does not care about the dst entry, only
the fib entry. This converts it:

static int ip6_nh_lookup_table(struct net *net, struct fib6_config *cfg,
                               const struct in6_addr *gw_addr, u32 tbid,
                               int flags, struct fib6_result *res)
{
        struct flowi6 fl6 = {
                .flowi6_oif = cfg->fc_ifindex,
                .daddr = *gw_addr,
                .saddr = cfg->fc_prefsrc,
        };
        struct fib6_table *table;
        struct rt6_info *rt;

        table = fib6_get_table(net, tbid);
        if (!table)
                return -EINVAL;

        if (!ipv6_addr_any(&cfg->fc_prefsrc))
                flags |= RT6_LOOKUP_F_HAS_SADDR;

        flags |= RT6_LOOKUP_F_IGNORE_LINKSTATE;

        fib6_table_lookup(net, table, cfg->fc_ifindex, fl6, res, flags);
        if (res.f6i == net->ipv6.fib6_null_entry)
                return -ENETUNREACH;

        fib6_select_path(net, &res, fl6, oif, false, NULL, flags);

        return 0;
}
I do agree with Martin that ip6_create_rt_rcu() seems to be dangerous
as the dst cache created by this func does not get tracked anywhere
and it is up to the user to not cache it for too long.
But I think David, what you are suggesting is:
instead of trying to convert ip6_create_rt_rcu() to use the pcpu_dst
logic, completely get rid of the calling to ip6_create_rt_rcu(), and
directly return f6i in those cases to the caller. Is that correct?
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help