Re: [Ksummit-2005-discuss] Summary of 2005 Kernel Summit Proposed Topics
From: Andrea Arcangeli <hidden>
Date: 2005-03-27 06:04:03
On Sat, Mar 26, 2005 at 09:48:31PM -0800, Matt Mackall wrote:
I believe the mempool can be shared among all sockets that represent the same storage device. Packets out any socket represent progress.
What's the point of having more than one socket connected to each storage device anyway?
Yes, done before it was even called iSCSI.
Ok, theoretical deadlock conditions aren't nice anyway, but knowing this is a real life problem too makes it more interesting ;).
The receive buffer is allocated at the time we DMA it from the card. We have no idea of its contents and we won't know what socket mempool to pull the receive skbuff from until much higher in the network stack, which could be quite a while later if we're under OOM load. And we can't have a mempool big enough to handle all the traffic that might potentially be deferred for softirq processing when we're OOM, especially at gigabit rates. I think this is actually the tricky piece of the problem and solving the socket and send buffer allocation doesn't help until this gets figured out. We could perhaps try to address this with another special receive-side alloc_skb that fails most of the time on OOM but sometimes pulls from a special reserve.
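To make the last suggestion concrete, here is a rough userspace sketch of a receive-side allocator that normally fails under OOM but occasionally dips into a reserve. Everything in it is hypothetical: struct rx_reserve, rx_alloc() and the one_in knob are made-up names, and "every one_in-th failure" is only one assumed reading of "sometimes". The normal_alloc_ok flag stands in for the result of the ordinary atomic allocation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical receive-side reserve; none of these names exist in the kernel. */
struct rx_reserve {
    int free_bufs;        /* buffers left in the reserve */
    int capacity;         /* total size of the reserve */
    unsigned fail_count;  /* allocation failures seen so far */
    unsigned one_in;      /* assumed policy: use the reserve on every one_in-th failure */
};

enum rx_src { RX_NORMAL, RX_RESERVE, RX_DROP };

/*
 * normal_alloc_ok models whether the ordinary atomic allocation succeeded.
 * On failure we drop the packet most of the time and only occasionally
 * hand out a reserve buffer, so the reserve is not drained by the first
 * burst of traffic.
 */
static enum rx_src rx_alloc(struct rx_reserve *r, bool normal_alloc_ok)
{
    assert(r->one_in > 0);

    if (normal_alloc_ok)
        return RX_NORMAL;

    /* Count the failure; drop unless this is the one_in-th and a buffer is left. */
    if (++r->fail_count % r->one_in != 0 || r->free_bufs == 0)
        return RX_DROP;

    r->free_bufs--;
    return RX_RESERVE;
}
```

How large the reserve must be, and how often to dip into it, is exactly the open question; the sketch only shows the shape of the control flow.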
One algo to handle this is: after we get the GFP_ATOMIC failure, we look at all the mempools that are registered for a certain NIC, and we pick a random mempool that isn't empty. We use that non-empty mempool to receive the packet, and we let netif_rx process the packet. Then, if going up the stack we find that the packet doesn't belong to the socket that owns the mempool, we discard the packet and release the RAM back into the mempool. This should make progress, since eventually the right packet will go in the right mempool.
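The algo above could be sketched roughly as follows. This is a userspace model only, not kernel code: struct sock_pool, struct nic, rx_alloc_fallback() and deliver_or_discard() are hypothetical names, and each pool is reduced to a counter of free buffers rather than a real mempool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_POOLS 8

/* Hypothetical per-socket reserve, modelled as a counter of free buffers. */
struct sock_pool {
    int sock_id;    /* socket that owns this reserve */
    int free_bufs;  /* buffers currently available */
    int capacity;   /* total buffers in the reserve */
};

/* Hypothetical registry of the pools attached to one NIC. */
struct nic {
    struct sock_pool *pools[MAX_POOLS];
    size_t npools;
};

struct rx_buf {
    struct sock_pool *from;  /* pool the buffer was borrowed from */
};

/* Called after the atomic allocation failed: borrow from a random non-empty pool. */
static bool rx_alloc_fallback(struct nic *nic, struct rx_buf *buf)
{
    struct sock_pool *candidates[MAX_POOLS];
    size_t n = 0;

    for (size_t i = 0; i < nic->npools; i++)
        if (nic->pools[i]->free_bufs > 0)
            candidates[n++] = nic->pools[i];
    if (n == 0)
        return false;  /* every reserve is empty: the packet is dropped */

    buf->from = candidates[(size_t)rand() % n];
    buf->from->free_bufs--;
    return true;
}

/* Give the buffer back to the pool it was borrowed from. */
static void pool_release(struct rx_buf *buf)
{
    assert(buf->from && buf->from->free_bufs < buf->from->capacity);
    buf->from->free_bufs++;
    buf->from = NULL;
}

/*
 * Higher up the stack, once the destination socket is known: keep the
 * packet only if it landed in its own socket's reserve, otherwise discard
 * it and return the memory. Returns true if the packet was kept.
 */
static bool deliver_or_discard(struct rx_buf *buf, int dest_sock_id)
{
    if (buf->from->sock_id == dest_sock_id)
        return true;
    pool_release(buf);
    return false;
}
```

A discarded packet costs a retransmit, but the buffer goes straight back to its pool, so the reserves are never permanently consumed by traffic for the wrong socket.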
Perhaps the mempooling overhead will be too high to pay when it's not necessary; in that case iscsid will have to pass a new bitflag to the socket syscall when it creates the socket meant to talk with the remote disk.
I think we probably attach a mempool to a socket after the fact. And
I guess you meant before the fact (i.e. before the connection to the server), anything attached after the fact (whatever the fact is ;) isn't going to help.