Thread (10 messages), 6 authors, 2022-09-12

Re: [RFC] Socket termination for policy enforcement and load-balancing

From: Kuniyuki Iwashima <hidden>
Date: 2022-08-31 23:44:04
Also in: bpf

Thanks for CCing, Martin.

Date:   Wed, 31 Aug 2022 16:01:57 -0700
From:   Martin KaFai Lau <redacted>
On Wed, Aug 31, 2022 at 09:37:41AM -0700, Aditi Ghag wrote:
- Use BPF (sockets) iterator to identify sockets connected to a
deleted backend. The BPF (sockets) iterator is network namespace aware
so we'll either need to enter every possible container network
namespace to identify the affected connections, or adapt the iterator
to be without netns checks [3]. This was discussed with my colleague
Daniel Borkmann based on the feedback he shared from the LSFMMBPF
conference discussions.
Being able to iterate all sockets across different netns will
be useful.

It should be doable to ignore the netns check.  For udp, a quick
thought is to have another iter target. eg. "udp_all_netns".
From the sk, the bpf prog should be able to learn the netns and
the bpf prog can filter the netns by itself.
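
To make the "iterate everything, let the prog filter" idea concrete, here is a minimal userspace model in plain C. It is only a sketch: `struct mock_sock`, its fields, and `abort_backend_socks()` are hypothetical stand-ins for what a bpf iter prog could read from the sk, and setting the `aborted` flag stands in for a real socket destroy. A `netns_inum` of 0 is an assumed convention meaning "do not filter by netns".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields a bpf iter prog could read from a sk. */
struct mock_sock {
	uint32_t netns_inum;	/* netns the socket belongs to */
	uint32_t daddr;		/* remote (backend) IPv4 address, host order */
	uint16_t dport;		/* remote port, host order */
	int aborted;		/* set once the model "destroys" the socket */
};

/*
 * Walk every socket regardless of netns and abort those connected to the
 * deleted backend.  The netns check lives in the caller-supplied filter,
 * not in the iterator: netns_inum == 0 means "match all namespaces".
 * Returns the number of sockets newly aborted by this call.
 */
static size_t abort_backend_socks(struct mock_sock *socks, size_t n,
				  uint32_t netns_inum,
				  uint32_t backend_addr, uint16_t backend_port)
{
	size_t hits = 0;

	assert(socks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct mock_sock *sk = &socks[i];

		if (netns_inum && sk->netns_inum != netns_inum)
			continue;	/* prog-side netns filter */
		if (sk->daddr != backend_addr || sk->dport != backend_port)
			continue;	/* not connected to the deleted backend */
		if (!sk->aborted) {
			sk->aborted = 1;
			hits++;
		}
	}
	return hits;
}
```

With this shape, the same walk serves both the "one netns" and the "all netns" case. Only the filter argument changes.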

The TCP side is going to have an 'optional' per netns ehash table [0] soon,
not lhash2 (listening hash) though.  Ideally, the same bpf
all-netns iter interface should work similarly for both udp and
tcp case.  Thus, both should be considered and work at the same time.
I'm going to add optional hash tables for UDP as well.  The first series [1]
covered both TCP and UDP but was split, and the UDP part is pending for now.

So, if both series are merged, the TCP and UDP all-netns iters would share
similar logic.

[1]: https://lore.kernel.org/netdev/20220826000445.46552-14-kuniyu@amazon.com/

For udp, something more useful than plain udp_abort() could potentially
be done.  eg. directly connect to another backend (by bpf kfunc?).
There may be some details to work out in socket locking, etc., but it should
be doable, and the bpf-iter program could also be sleepable.
fwiw, we are iterating the tcp socket to retire some older
bpf-tcp-cc (congestion control) on the long-lived connections
by bpf_setsockopt(TCP_CONGESTION).
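
For reference, the userspace counterpart of that option is setsockopt(2) with TCP_CONGESTION on a socket the process owns. The sketch below is not the bpf-iter program itself. `switch_tcp_cc()` is a hypothetical helper name, and it assumes a Linux host where the requested algorithm (e.g. "reno") is available to the caller.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Switch the congestion control algorithm of one TCP socket and read the
 * active name back into 'out'.  Returns 0 on success, -1 on failure
 * (errno is left as set by setsockopt()/getsockopt()).
 */
static int switch_tcp_cc(int fd, const char *name, char *out, socklen_t outlen)
{
	socklen_t len = outlen;

	assert(fd >= 0 && name != NULL && out != NULL && outlen > 0);

	if (setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, name, strlen(name)) < 0)
		return -1;

	memset(out, 0, outlen);
	if (getsockopt(fd, IPPROTO_TCP, TCP_CONGESTION, out, &len) < 0)
		return -1;

	out[outlen - 1] = '\0';	/* make sure the name is NUL-terminated */
	return 0;
}
```

A caller would pass a buffer of at least 16 bytes, which is the kernel's limit for congestion control names.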

Also, potentially, instead of iterating all,
a more selective case can be done by
bpf_prog_test_run()+bpf_sk_lookup_*()+udp_abort().

[0]: https://lore.kernel.org/netdev/20220830191518.77083-1-kuniyu@amazon.com/