Re: [PATCH 2/6] MM: improve documentation for __GFP_NOFAIL
From: Michal Hocko <mhocko@suse.com>
Date: 2021-10-13 08:27:05
Also in:
linux-doc, linux-ext4, linux-fsdevel, linux-mm, linux-xfs, lkml
On Wed 13-10-21 13:32:31, Dave Chinner wrote:
On Mon, Oct 11, 2021 at 01:57:36PM +0200, Michal Hocko wrote:quoted
On Sat 09-10-21 09:36:49, Dave Chinner wrote:quoted
On Fri, Oct 08, 2021 at 09:48:39AM +0200, Michal Hocko wrote:quoted
quoted
quoted
quoted
Even the API constaints of kvmalloc() w.r.t. only doing the vmalloc fallback if the gfp context is GFP_KERNEL - we already do GFP_NOFS kvmalloc via memalloc_nofs_save/restore(), so this behavioural restriction w.r.t. gfp flags just makes no sense at all.GFP_NOFS (without using the scope API) has the same problem as NOFAIL in the vmalloc. Hence it is not supported. If you use the scope API then you can GFP_KERNEL for kvmalloc. This is clumsy but I am not sure how to define these conditions in a more sensible way. Special case NOFS if the scope api is in use? Why do you want an explicit NOFS then?Exactly my point - this is clumsy and a total mess. I'm not asking for an explicit GFP_NOFS, just pointing out that the documented restrictions that "vmalloc can only do GFP_KERNEL allocations" is completely wrong. vmalloc() { if (!(gfp_flags & __GFP_FS)) memalloc_nofs_save(); p = __vmalloc(gfp_flags | GFP_KERNEL) if (!(gfp_flags & __GFP_FS)) memalloc_nofs_restore(); } Yup, that's how simple it is to support GFP_NOFS support in vmalloc().Yes, this would work from the functionality POV but it defeats the philosophy behind the scope API. Why would you even need this if the scope was defined by the caller of the allocator?Who actually cares that vmalloc might be using the scoped API internally to implement GFP_NOFS or GFP_NOIO? Nobody at all. It is far more useful (and self documenting!) for one-off allocations to pass a GFP_NOFS flag than it is to use a scope API...
I would agree with you if the explicit GFP_NOFS usage was consistent and actually justified in the majority cases. My experience tells me otherwise though. Many filesystems use the flag just because that is easier. That leads to a huge overuse of the flag that leads to practical problems. I was hoping that if we offer an API that would define problematic reclaim recursion scopes then it would reduce the abuse. I haven't expected this to happen overnight but it is few years and it seems it will not happen soon either. [...]
quoted
quoted
It also points out that the scope API is highly deficient. We can do GFP_NOFS via the scope API, but we can't do anything else because *there is no scope API for other GFP flags*. Why don't we have a GFP_NOFAIL/__GFP_RETRY_FOREVER scope API?NO{FS,IO} where first flags to start this approach. And I have to admit the experiment was much less successful then I hoped for. There are still thousands of direct NOFS users so for some reason defining scopes is not an easy thing to do. I am not against NOFAIL scopes in principle but seeing the nofs "success" I am worried this will not go really well either and it is much more tricky as NOFAIL has much stronger requirements than NOFS. Just imagine how tricky this can be if you just call a library code that is not under your control within a NOFAIL scope. What if that library code decides to allocate (e.g. printk that would attempt to do an optimistic NOWAIT allocation).I already asked you that _exact_ question earlier in the thread w.r.t. kvmalloc(GFP_NOFAIL) using optimistic NOWAIT kmalloc allocation. I asked you as a MM expert to define *and document* the behaviour that should result, not turn around and use the fact that it is undefined behaviour as a "this is too hard" excuse for not changing anything.
Dave, you have "thrown" a lot of complains in previous emails and it is hard to tell rants from features requests apart. I am sorry but I believe it would be much more productive to continue this discussion if you could mild your tone. Can I ask you to break down your feature requests into separate emails so that we can discuss and track them separately rather in this quite a long thread which has IMHO diverghed from the initial topic. Thanks!
THe fact is that the scope APIs are only really useful for certain contexts where restrictions are set by higher level functionality. For one-off allocation constraints the API sucks and we end up with
Could you be more specific about these one-off allocation constrains?
What would be the reason to define one-off NO{FS,IO} allocation
constrain? Or did you have your NOFAIL example in mind?
crap like this (found in btrfs):
/*
* We're holding a transaction handle, so use a NOFS memory
* allocation context to avoid deadlock if reclaim happens.
*/
nofs_flag = memalloc_nofs_save();
value = kmalloc(size, GFP_KERNEL);
memalloc_nofs_restore(nofs_flag); Yes this looks wrong indeed! If I were to review such a code I would ask why the scope cannot match the transaction handle context. IIRC jbd does that. I am aware of these patterns. I was pulled in some discussions in the past and in some it turned out that the constrain is not needed at all and in some cases that has led to a proper scope definition. As you point out in your other examples it just happens that it is easier to go an easy path and define scopes ad-hoc to work around allocation API limitations. [...]
IOWs, a large number of the users of the scope API simply make [k]vmalloc() provide GFP_NOFS behaviour. ceph_kvmalloc() is pretty much a wrapper that indicates how all vmalloc functions should behave. Honour GFP_NOFS and GFP_NOIO by using the scope API internally.
I was discouraging from this behavior at vmalloc level to push people to use scopes properly - aka at the level where the reclaim recursion is really a problem. If that is infeasible in practice then we can re-evaluate of course. I was really hoping we can get rid of cargo cult GFP_NOFS usage this way but the reality often disagrees with hopes. All that being said, let's discuss [k]vmalloc constrains and usecases that need changes in a separate email thread. Thanks! -- Michal Hocko SUSE Labs