Thread (73 messages), 14 authors, 2005-03-31

Re: [Ksummit-2005-discuss] Summary of 2005 Kernel Summit Proposed Topics

From: Andrea Arcangeli <hidden>
Date: 2005-03-27 06:04:03

On Sat, Mar 26, 2005 at 09:48:31PM -0800, Matt Mackall wrote:
I believe the mempool can be shared among all sockets that represent
the same storage device. Packets out any socket represent progress.
What's the point of having more than one socket connected to each storage
device anyway?
Yes, done before it was even called iSCSI.
Ok, theoretical deadlock conditions aren't nice anyway, but knowing this
is a real life problem too makes it more interesting ;).
The receive buffer is allocated at the time we DMA it from the card.
We have no idea of its contents and we won't know what socket mempool
to pull the receive skbuff from until much higher in the network
stack, which could be quite a while later if we're under OOM load. And
we can't have a mempool big enough to handle all the traffic that
might potentially be deferred for softirq processing when we're OOM,
especially at gigabit rates.

I think this is actually the tricky piece of the problem and solving
the socket and send buffer allocation doesn't help until this gets
figured out.

We could perhaps try to address this with another special receive-side
alloc_skb that fails most of the time on OOM but sometimes pulls from
a special reserve.
One algo to handle this is: after we get the GFP_ATOMIC failure, we
look at all the mempools that are registered for a certain NIC, and we
pick a random mempool that isn't empty. We use the non-empty mempool to
receive the packet, and we let netif_rx process the packet. Then if,
going up the stack, we find that the packet doesn't belong to the
socket that owns that mempool, we discard the packet and we release the
ram back into the mempool. This should make progress since eventually
the right packet will go in the right mempool.
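
To make the steps above concrete, here is a minimal userspace sketch of
that fallback, not kernel code. Everything in it is an assumption made
for illustration: struct sock_pool, struct nic, struct rx_buf, rx_alloc
and rx_deliver are hypothetical names, the pools are plain counters
rather than mempool_t, and the atomic allocation result is passed in as
a flag instead of coming from a real allocator.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_POOLS 8   /* hypothetical cap on pools registered per NIC */

/* Hypothetical stand-in for a per-socket mempool (not the kernel's mempool_t). */
struct sock_pool {
    int sock_id;   /* socket this reserve belongs to */
    int free;      /* buffers currently available in the reserve */
};

/* Hypothetical per-NIC registry of socket mempools; npools <= MAX_POOLS. */
struct nic {
    struct sock_pool *pools[MAX_POOLS];
    size_t npools;
};

/* A receive buffer remembers which reserve, if any, it was borrowed from. */
struct rx_buf {
    struct sock_pool *from;  /* NULL when the normal allocation succeeded */
    int dest_sock;           /* only known once the stack has parsed the packet */
};

/* Pick a random non-empty pool registered on this NIC, or NULL if all are empty. */
static struct sock_pool *pick_nonempty_pool(struct nic *nic)
{
    struct sock_pool *candidates[MAX_POOLS];
    size_t n = 0;

    for (size_t i = 0; i < nic->npools; i++)
        if (nic->pools[i]->free > 0)
            candidates[n++] = nic->pools[i];
    return n ? candidates[(size_t)rand() % n] : NULL;
}

/* Receive-side allocation: normal path first, reserve fallback on failure. */
static bool rx_alloc(struct nic *nic, bool atomic_alloc_ok, struct rx_buf *buf)
{
    buf->from = NULL;
    buf->dest_sock = -1;
    if (atomic_alloc_ok)
        return true;                 /* ordinary memory was available */
    buf->from = pick_nonempty_pool(nic);
    if (!buf->from)
        return false;                /* every reserve is empty: drop at the NIC */
    buf->from->free--;               /* borrow one buffer from the chosen reserve */
    return true;
}

/* Called once the stack knows the destination socket; true means delivered. */
static bool rx_deliver(struct rx_buf *buf, int dest_sock)
{
    buf->dest_sock = dest_sock;
    if (buf->from == NULL)
        return true;                 /* ordinary memory: deliver as usual */
    if (buf->from->sock_id == dest_sock)
        return true;                 /* right pool: buffer stays borrowed */
    buf->from->free++;               /* wrong pool: discard, give the ram back */
    buf->from = NULL;
    return false;
}
```

In this sketch a packet that lands in the wrong pool costs nothing but
the retransmit: the buffer goes straight back to its reserve, so the
reserve is never consumed by traffic that doesn't belong to it.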
Perhaps the mempooling overhead will be too high to pay when it's not
necessary; in that case iscsid will have to pass a new bitflag to the
socket syscall when it creates the socket meant to talk with the
remote disk.
I think we probably attach a mempool to a socket after the fact. And
I guess you meant before the fact (i.e. before the connection to the
server); anything attached after the fact (whatever the fact is ;) isn't
going to help.