Thread: 20 messages, 8 authors, 2019-12-19

Re: epoll_wait() performance

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: 2019-12-02 16:47:45
Also in: lkml

On Mon, Dec 2, 2019 at 7:24 AM David Laight [off-list ref] wrote:
From: Jakub Sitnicki <jakub@cloudflare.com>
Sent: 30 November 2019 13:30
On Sat, Nov 30, 2019 at 02:07 AM CET, Eric Dumazet wrote:
On 11/28/19 2:17 AM, David Laight wrote:
...
How can you do that when all the UDP flows have different destination port numbers?
These are message flows not idempotent requests.
I don't really want to collect the packets before they've been processed by IP.

I could write a driver that uses kernel udp sockets to generate a single message queue
that can be efficiently processed from userspace - but it is a faff compiling it for
the system's kernel version.
Well if destination ports are not under your control,
you also could use AF_PACKET sockets, no need for 'UDP sockets' to receive UDP traffic,
especially if the rate is small.
Alternatively, you could steer UDP flows coming to a certain port range
to one UDP socket using TPROXY [0, 1].
I don't think that can work, we don't really know the list of valid UDP port
numbers ahead of time.
How about -j REDIRECT? That does not require all ports to be known
ahead of time.
TPROXY has the same downside as AF_PACKET, meaning that it requires at
least CAP_NET_RAW to create/set up the socket.
CAP_NET_RAW wouldn't be a problem - we already send from a 'raw' socket.
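
For illustration, a rough sketch of the receive side of the TPROXY
approach discussed above: one UDP socket takes traffic for many
destination ports and recovers each datagram's original destination
from ancillary data. This is only a sketch under assumptions: a TPROXY
rule and matching policy routing are already in place, listen_port is a
hypothetical value matching that rule, and the option numbers are
copied from <linux/in.h> because not every Python version exports them.
Opening the socket needs the privileges mentioned above.

```python
import socket
import struct

# Option numbers from <linux/in.h> (hard-coded here as an assumption
# that the running Python may not export them).
IP_TRANSPARENT = 19
IP_RECVORIGDSTADDR = 20
IP_ORIGDSTADDR = IP_RECVORIGDSTADDR  # cmsg type shares the same value


def parse_orig_dst(cmsg_data):
    """Decode a struct sockaddr_in into (address, port)."""
    # sin_family is in host byte order, sin_port in network byte order.
    family = struct.unpack_from("=H", cmsg_data, 0)[0]
    if family != socket.AF_INET:
        raise ValueError("not an AF_INET address")
    port = struct.unpack_from("!H", cmsg_data, 2)[0]
    addr = socket.inet_ntoa(cmsg_data[4:8])
    return addr, port


def open_tproxy_socket(listen_port):
    """Open the single receiving socket (requires privileges)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_IP, IP_TRANSPARENT, 1)
    sock.setsockopt(socket.SOL_IP, IP_RECVORIGDSTADDR, 1)
    sock.bind(("0.0.0.0", listen_port))
    return sock


def recv_with_orig_dst(sock, bufsize=65535):
    """Return (payload, source, original destination or None)."""
    # 16 bytes is sizeof(struct sockaddr_in).
    data, ancdata, _flags, src = sock.recvmsg(bufsize, socket.CMSG_SPACE(16))
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_IP and ctype == IP_ORIGDSTADDR:
            return data, src, parse_orig_dst(cdata)
    return data, src, None
```

The original destination port is what lets userspace tell the message
flows apart even though they all arrive on one socket.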
One other issue when comparing UDP and packet sockets is IP
defragmentation. That is critical code that is not at all trivial to
duplicate in userspace.

Even when choosing packet sockets, which normally would not
defragment, there is a trick. A packet socket with fanout and flag
PACKET_FANOUT_FLAG_DEFRAG will defragment before fanout.
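
A minimal sketch of that trick, for illustration only: the socket joins
a fanout group with PACKET_FANOUT_FLAG_DEFRAG set, so fragments are
reassembled before delivery. The constants are copied from
<linux/if_packet.h> and <linux/if_ether.h>; the helper names, the
choice of PACKET_FANOUT_HASH as the mode, and the group id are
assumptions of this sketch. Opening the socket needs CAP_NET_RAW.

```python
import socket
import struct

# Values from <linux/if_packet.h> and <linux/if_ether.h>, hard-coded
# because the Python socket module may not export them.
SOL_PACKET = 263
PACKET_FANOUT = 18
PACKET_FANOUT_HASH = 0
PACKET_FANOUT_FLAG_DEFRAG = 0x8000
ETH_P_IP = 0x0800


def fanout_arg(group_id, mode=PACKET_FANOUT_HASH,
               flags=PACKET_FANOUT_FLAG_DEFRAG):
    """Build the PACKET_FANOUT option value.

    Low 16 bits carry the group id, high 16 bits carry mode | flags.
    """
    if not 0 <= group_id <= 0xFFFF:
        raise ValueError("fanout group id must fit in 16 bits")
    return group_id | ((mode | flags) << 16)


def open_defrag_socket(group_id):
    """Open a packet socket that receives defragmented IPv4 packets."""
    # SOCK_DGRAM strips the link-layer header, so data starts at the
    # IP header. A non-zero protocol makes the socket start receiving.
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_DGRAM,
                         socket.htons(ETH_P_IP))
    # Pack as unsigned: the value does not fit in a signed C int.
    sock.setsockopt(SOL_PACKET, PACKET_FANOUT,
                    struct.pack("I", fanout_arg(group_id)))
    return sock
```

A fanout group with a single member is enough to get the
defragmentation; more sockets can join the same group id to spread
the load.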