Re: UDP multi-core performance on a single socket and SO_REUSEPORT
From: Tom Herbert <hidden>
Date: 2013-01-04 20:47:06
I believe the hard part of making SO_REUSEPORT was on the TCP side, in dealing with state in req structs, which we have not resolved. UDP SO_REUSEPORT seems to be working pretty well.

Tom

On Fri, Jan 4, 2013 at 11:37 AM, Eric Dumazet [off-list ref] wrote:
On Fri, 2013-01-04 at 18:50 +0000, Mark Zealey wrote:
> I have written two small test scripts now which can be found at http://mark.zealey.org/uploads/ - one launches 16 listening threads for a single UDP socket, the other needs to be run as
>
>     for i in `seq 16`; do ./udp_test_client & done
>
> On my test server (32-core), stock kernel 3.7.1, 90% of the time is spent in the kernel waiting on spinlocks. Perf output:
>
> Mark

We know the scalability issue of using a single socket and many threads.

The send path was somehow fixed to not require socket lock. But the receive path uses a single receive_queue, protected by a spinlock.

SO_REUSEPORT would be nice, but had known issues. af_packet fanout implementation was nicer.

You could try:

1) Use af_packet FANOUT instead of UDP sockets
2) rewrite SO_REUSEPORT to use a FANOUT like implementation
3) Extend UDP sockets to be able to use a configurable number of receive queues instead of a single one.
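
For illustration, here is a minimal sketch of the per-thread socket layout that UDP SO_REUSEPORT allows, as an alternative to many threads sharing one receive queue. It assumes a kernel and Python build that expose SO_REUSEPORT for UDP, which the stock 3.7.1 kernel mentioned above may not. The helper name open_reuseport_sockets and its parameters are hypothetical, not taken from the test scripts in this thread.

```python
import socket


def open_reuseport_sockets(count, host="127.0.0.1", port=0):
    """Open `count` UDP sockets that are all bound to the same (host, port).

    Hypothetical helper: assumes the platform supports SO_REUSEPORT.
    """
    socks = []
    for _ in range(count):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        # The option has to be set on every socket before bind().
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind((host, port))
        # If port was 0, the first bind picks a free port; reuse it afterwards.
        port = s.getsockname()[1]
        socks.append(s)
    return socks
```

Each worker thread would then call recvfrom() on its own socket from the returned list, so every thread has a separate receive queue and lock. How incoming datagrams are spread across the sockets is decided by the kernel.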