Thread (29 messages) 29 messages, 2 authors, 2012-07-10

Re: [PATCH 04/16] mm: allow PF_MEMALLOC from softirq context

From: Mel Gorman <mgorman@suse.de>
Date: 2012-07-10 11:09:41
Also in: linux-mm, lkml

On Mon, Jul 09, 2012 at 06:57:10PM +0200, Sebastian Andrzej Siewior wrote:
On Mon, Jul 09, 2012 at 11:04:42AM +0100, Mel Gorman wrote:
quoted
quoted
- lets assume your allocation happens with kmalloc() without __GFP_MEMALLOC
  and current->flags has PF_MEMALLOC ORed and your SLAB pool is empty. This
  forces SLAB to allocate more pages from the buddy allocator with it will
  receive more likely (due to ->current->flags + PF_MEMALLOC) but SLAB will
  drop this extra memory because the page has ->pf_memory (or something like
  that) set and the GFP_FLAGS do not have __GFP_MEMALLOC set.
It's recorded if the slab page was allocated from PFMEMALLOC reserves (see
patch 2 from the swap over NBD series). slab will use this page for objects
but only allocate them to callers that pass a gfp_pfmemalloc_allowed() check.
kmalloc() users with either __GFP_MEMALLOC or PF_MEMALLOC will get
the pages they need but they will not "leak" to !_GFP_MEMALLOC users as
that would potentially deadlock.
Argh, I missed that gfp_to_alloc_flags() is not only called from
within the buddy allocater but also from slab. So this is fine then :)
Good to hear. I appreciate you taking the time to give it a solid review
like this looking for holes.
One thing:
You only get current->flags |= PF_MEMALLOC in softirq _if_ the skb, which is 
passed to netif_receive_skb(), was allocated with __GFP_MEMALLOC. That
means if the NIC's RX allocation did not require an allocation from the
emergency pool (without ->pfmemalloc set) then you never use this extra
pool, even if this skb would end up in your swap socket. Also, the other way
around, where you allocate it from the emergency pool but it is a user
socket and you could drop it.
While there is a possibility that packets may get dropped later like this,
they still get retransmitted and eventually it'll get through.  This is
not optimal but optimised swap-over-network was not the primary goal of
the series, deadlock avoidance was.
What about extending sk_set_memalloc() to record socket's ips + ports
in a separate list so that skb_pfmemalloc_protocol() might use that
information and decide on per-protocol basis if the skb is worth to
spend more ressource to deliver it. That means you would enable the
extra pool if the currently received skb is part of your swap socket and
not if the skb was allocated from the emergency pool.

That said, there is nothing wrong with the code as of now and this
optimization could be added later (if at all).
I think it is a good idea but it could also be done later iff a user had
a serious problem with the performance and that this made a measurable
difference. The series is already quite complex and I'd rather not add to
that complexity without strong motivation.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help