Thread: 21 messages, 5 authors, 2010-11-08

Re: [PATCH 0/1] RFC: poll/select performance on datagram sockets

From: Eric Dumazet <hidden>
Date: 2010-10-30 12:53:45
Also in: lkml
Subsystem: networking [general], networking [unix sockets], the rest · Maintainers: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, Kuniyuki Iwashima, Linus Torvalds

On Saturday 30 October 2010 at 12:34 +0100, Alban Crequy wrote:
On Fri, 29 Oct 2010 21:27:11 +0200,
Eric Dumazet [off-list ref] wrote:
On Friday 29 October 2010 at 19:18 +0100, Alban Crequy wrote:
Hi,

When a process calls poll() or select(), the kernel calls (struct
file_operations)->poll on every file descriptor and returns a mask
of events which are ready. If the process is only interested in
POLLIN events, the mask is still computed for POLLOUT and it can be
expensive. For example, on Unix datagram sockets, a process running
poll() with POLLIN will wake up when the remote end calls read().
This is a performance regression introduced by the fix for another
bug, in commits 3c73419c09a5ef73d56472dbfdade9e311496e9b and
ec0d215f9420564fc8286dcf93d2d068bb53a07e.

The attached program illustrates the problem. It measures the
performance of sending and receiving data on a Unix datagram socket
with select(). When the datagram sockets are not connected, the
performance problem is not triggered, but when they are connected
it becomes a lot slower. On my computer, I get the following times:

Connected datagram sockets: >4 seconds
Non-connected datagram sockets: <1 second
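
For illustration, the I/O pattern described above can be sketched as
follows. This is not the attached program, which is not reproduced here;
the function name dgram_select_rounds(), the helper make_addr(), the
abstract-namespace address (Linux-specific) and the single-process
structure are all assumptions. Because sender and receiver share one
thread, nobody sleeps in select() while the peer reads, so the sketch
only shows the connected versus non-connected setup and does not
reproduce the timings quoted above.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: build an abstract-namespace address (Linux only).
 * sun_path[0] stays '\0', so there is no filesystem entry to unlink. */
static socklen_t make_addr(struct sockaddr_un *addr)
{
	int n;

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	n = snprintf(addr->sun_path + 1, sizeof(addr->sun_path) - 1,
		     "dgram-sketch-%ld", (long)getpid());
	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
}

/* Send `rounds` one-byte datagrams; receive each one after select()
 * reports the bound socket readable. With use_connect != 0 the sender
 * is connect()ed and uses send(), otherwise it uses sendto().
 * Returns the number of datagrams received, or -1 on error. */
int dgram_select_rounds(int rounds, int use_connect)
{
	struct sockaddr_un addr;
	socklen_t len = make_addr(&addr);
	int received = -1;
	int srv = socket(AF_UNIX, SOCK_DGRAM, 0);
	int cli = socket(AF_UNIX, SOCK_DGRAM, 0);

	if (srv < 0 || cli < 0)
		goto out;
	if (bind(srv, (struct sockaddr *)&addr, len) < 0)
		goto out;
	if (use_connect && connect(cli, (struct sockaddr *)&addr, len) < 0)
		goto out;

	for (received = 0; received < rounds; received++) {
		char c = 'x';
		fd_set rfds;
		ssize_t sent = use_connect
			? send(cli, &c, 1, 0)
			: sendto(cli, &c, 1, 0, (struct sockaddr *)&addr, len);

		if (sent != 1) {
			received = -1;
			break;
		}
		FD_ZERO(&rfds);
		FD_SET(srv, &rfds);
		if (select(srv + 1, &rfds, NULL, NULL, NULL) != 1 ||
		    recv(srv, &c, 1, 0) != 1) {
			received = -1;
			break;
		}
	}
out:
	if (srv >= 0)
		close(srv);
	if (cli >= 0)
		close(cli);
	return received;
}
```

Calling dgram_select_rounds(100000, 1) exercises the connect() plus
send() path, and dgram_select_rounds(100000, 0) the sendto() path.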

The patch attached in the next email fixes the performance problem:
both cases then take <1 second. I am not suggesting the patch
for inclusion; I would like to change the prototype of (struct
file_operations)->poll instead of adding ->poll2. But there are a
lot of poll functions to change (grep tells me 337 functions).

Any opinions?
My opinion would be to use epoll() for this kind of workload.

I found a problem with epoll() with the following program. When
several datagram sockets are connected to the same server and the
receiving queue is full, epoll(EPOLLOUT) wakes up only the emitter
whose skb was removed from the queue, and not all the emitters. This is
because sock_wfree() runs sk->sk_write_space() only for one emitter.
I don't think this is the reason.

sock_wfree() really is fine here, since it copes with one socket (the
one that sent the message).

The problem is peer_wait, which epoll doesn't seem to be plugged into.

The bug is in unix_dgram_poll().

It calls sock_poll_wait(..., &unix_sk(other)->peer_wait, ...) only if the
socket is 'writable'. It's a clear bug.

Could you try this patch, please?
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 0ebc777..315716c 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2092,7 +2092,7 @@ static unsigned int unix_dgram_poll(struct file *file, struct socket *sock,
 
 	/* writable? */
 	writable = unix_writable(sk);
-	if (writable) {
+	if (1 /*writable*/) {
 		other = unix_peer_get(sk);
 		if (other) {
 			if (unix_peer(other) != sk) {
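
As a user-space illustration of why a sender's writability depends on
the peer's queue, the sketch below fills a connected Unix datagram
socket until send() fails with EAGAIN, then samples POLLOUT before and
after the receiver drains its queue. The names is_writable(),
fill_addr() and pollout_before_and_after_drain(), the MAX_SENDS cap and
the abstract-namespace address (Linux-specific) are hypothetical. The
sketch only samples poll() state with a zero timeout; it does not test
whether a poller that is already sleeping gets woken, which is the
behaviour the patch above is meant to change.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#define MAX_SENDS (1L << 20)	/* safety cap, an arbitrary assumption */

/* Hypothetical helper: abstract-namespace address (Linux only). */
static socklen_t fill_addr(struct sockaddr_un *addr)
{
	int n;

	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	n = snprintf(addr->sun_path + 1, sizeof(addr->sun_path) - 1,
		     "pollout-sketch-%ld", (long)getpid());
	return (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
}

/* Non-blocking check: does poll() report POLLOUT for fd right now? */
static int is_writable(int fd)
{
	struct pollfd p = { .fd = fd, .events = POLLOUT };

	return poll(&p, 1, 0) == 1 && (p.revents & POLLOUT) != 0;
}

/* Returns a bitmask, or -1 on setup error:
 *   bit 0: sender reported writable while send() was failing with EAGAIN
 *   bit 1: sender reported writable after the receiver drained its queue */
int pollout_before_and_after_drain(void)
{
	struct sockaddr_un addr;
	socklen_t len = fill_addr(&addr);
	char c = 'x';
	int result = -1;
	long sent = 0;
	int srv = socket(AF_UNIX, SOCK_DGRAM, 0);
	int cli = socket(AF_UNIX, SOCK_DGRAM, 0);

	if (srv < 0 || cli < 0)
		goto out;
	if (bind(srv, (struct sockaddr *)&addr, len) < 0 ||
	    connect(cli, (struct sockaddr *)&addr, len) < 0)
		goto out;

	/* Fill until the kernel refuses another datagram. */
	while (sent < MAX_SENDS && send(cli, &c, 1, MSG_DONTWAIT) == 1)
		sent++;
	if (sent == MAX_SENDS || (errno != EAGAIN && errno != EWOULDBLOCK))
		goto out;

	result = is_writable(cli) ? 1 : 0;

	/* Drain everything the receiver has queued. */
	while (recv(srv, &c, 1, MSG_DONTWAIT) == 1)
		;
	if (is_writable(cli))
		result |= 2;
out:
	if (srv >= 0)
		close(srv);
	if (cli >= 0)
		close(cli);
	return result;
}
```

On Linux the function should return 2: not writable while the queue is
full, writable again once the receiver has read everything.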