Re: [ofa-general] [PATCH 0/26] Reliable Datagram Sockets (RDS), take 2
From: Andrew Grover <hidden>
Date: 2009-02-28 20:44:37
On Fri, Feb 27, 2009 at 9:56 PM, Andi Kleen [off-list ref] wrote:
> On Fri, Feb 27, 2009 at 05:53:19PM -0800, Andrew Grover wrote:
>> On Fri, Feb 27, 2009 at 9:08 AM, Andi Kleen [off-list ref] wrote:
>>>> This patchset against net-next adds support for RDS sockets. RDS is an
>>>> Oracle-originated protocol used to send IPC datagrams (up to 1MB)
>>>> reliably, and is used currently in Oracle RAC and Exadata products.
>>>
>>> Perhaps I missed it earlier, but what is the rationale for putting this
>>> as a socket type into the kernel? I assume they also work directly as
>>> implemented in user space using raw sockets or similar, don't they?
>>
>> You want me to implement my fancy protocol in userspace???
>
> I just asked why you're putting it in kernel space.
>
>> Do I even get to write it in C or do I need to use Ruby?
>
> Well normally people who add new subsystems to the kernel explain why they
> do that. Perhaps it's obvious to you, but at least to me it isn't.

Sure thing, sorry to be flippant :-)

The previous solution for IPC that Oracle was using was based on UDP, which I think could be considered very close to using raw sockets -- each process is responsible for its own acks, retransmits, everything. Doing this on a highly loaded machine resulted in a cascade where performance got worse and worse. Moving this to kernel code made a big difference.

Additionally, our interconnect is primarily Infiniband. It natively implements a reliable datagram connection type so RDS leverages that. RDS multiplexes all processes' traffic between two hosts over a single IB connection. Since RDS is managing IB connections at the host level (but based on socket traffic) this is also more naturally a fit for kernel code.

Regards -- Andy
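
To make the multiplexing point concrete, here is a toy model in Python of connections being managed per host pair rather than per socket. It is only a sketch of the idea described above, not code from the patchset: the names HostMux and Connection, the addresses, and the ports are all made up for illustration, and no real transport is involved.

```python
# Toy model only: HostMux and Connection are hypothetical names and do not
# correspond to anything in the RDS patchset.
from dataclasses import dataclass, field


@dataclass
class Connection:
    """Stands in for one reliable connection between two hosts."""
    local_host: str
    remote_host: str
    queued: list = field(default_factory=list)  # (src_port, dst_port, payload)


class HostMux:
    """Shares one Connection per (local host, remote host) pair."""

    def __init__(self):
        self.connections = {}

    def send(self, src, dst, payload):
        # src and dst are (host, port) tuples, like socket addresses.
        key = (src[0], dst[0])  # ports are deliberately not part of the key
        conn = self.connections.get(key)
        if conn is None:
            # The first datagram between two hosts triggers connection setup.
            conn = Connection(*key)
            self.connections[key] = conn
        conn.queued.append((src[1], dst[1], payload))
        return conn


mux = HostMux()
# Three different "processes" (ports) on host A send to host B.
for port in (4000, 4001, 4002):
    mux.send(("10.0.0.1", port), ("10.0.0.2", 5000), b"ping")
# A different remote host gets its own connection.
mux.send(("10.0.0.1", 4000), ("10.0.0.3", 5000), b"ping")

print(len(mux.connections))  # 2
```

Three sending sockets end up sharing one connection to the first remote host. Because the connection table is keyed by host pair and outlives any single socket, it is state that belongs to the host as a whole rather than to one process.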