Re: [RFC][PATCH 2/9] deadlock prevention core
From: Peter Zijlstra <hidden>
Date: 2006-08-09 14:04:57
Also in:
linux-mm, lkml
On Wed, 2006-08-09 at 15:48 +0200, Indan Zupancic wrote:
On Wed, August 9, 2006 14:54, Peter Zijlstra said:quoted
On Wed, 2006-08-09 at 14:02 +0200, Indan Zupancic wrote:quoted
That avoids lots of checks and should guarantee that the accounting is correct, except in the case when the IFF_MEMALLOC flag is cleared and the counter is set to zero manually. Can't that be avoided and just let it decrease to zero naturally?That would put the atomic op on the free path unconditionally, I think davem gets nightmares from that.I confused SOCK_MEMALLOC with sk_buff::memalloc, sorry. What I meant was to unconditionally decrement the reserved usage only when memalloc is true on the free path. That way all skbs that increased the reserve also decrease it, and the counter should never go below zero.
OK, so far so good, except we loose the notion of getting memory back from regular skbs.
Also as far as I can see it should be possible to replace all atomic "if (unlikely(dev_reserve_used(skb->dev)))" checks witha check if memalloc is set. That should make davem happy, as there aren't any atomic instructions left in hot paths.
dev_reserve_used() uses atomic_read() which isn't actually a LOCK'ed instruction, so that should not matter.
If IFF_MEMALLOC is set new skbs set memalloc and increase the reserve.
Not quite, if IFF_MEMALLOC is set new skbs _could_ get memalloc set. We only fall back to alloc_pages() if the regular path fails to alloc. If the skb is backed by a page (as opposed to kmem_cache fluff) sk_buff::memalloc is set.
When the skb is being destroyed it doesn't matter if IFF_MEMALLOC is set or not, only if that skb used reserves and thus only the memalloc flag needs to be checked. This means that changing the IFF_MEMALLOC doesn't affect in-flight skbs but only newly created ones, and there's no need to update in-flight skbs whenever the flag is changed as all should go well.
Your reasoning is sound, except for these two point above... <snip code>
It seems that here SOCK_MEMALLOC is only set on the first socket. Shouldn't it be set on all sockets instead?
Ouch, where was my brain... Thanks! Also, I've been thinking (more pain), should I not up the reserve for each SOCK_MEMALLOC socket.