Thread: 99 messages, 9 authors, 2024-08-30

Re: [PATCH v3 4/4] mm: prohibit NULL deference exposed for unsupported non-blockable __GFP_NOFAIL

From: Yafang Shao <hidden>
Date: 2024-08-19 11:56:56
Also in: linux-mm

On Mon, Aug 19, 2024 at 6:18 PM Michal Hocko [off-list ref] wrote:
On Mon 19-08-24 17:25:18, Yafang Shao wrote:
On Mon, Aug 19, 2024 at 3:50 PM Michal Hocko [off-list ref] wrote:
On Sun 18-08-24 10:55:09, Yafang Shao wrote:
On Sat, Aug 17, 2024 at 2:25 PM Barry Song [off-list ref] wrote:
From: Barry Song <redacted>

When users allocate memory with the __GFP_NOFAIL flag, they might
incorrectly use it alongside GFP_ATOMIC, GFP_NOWAIT, etc.  This kind of
non-blockable __GFP_NOFAIL is not supported and is pointless.  If we
attempt and still fail to allocate memory for these users, we have two
choices:

    1. We could busy-loop and hope that some other direct reclamation or
    kswapd rescues the current process. However, this is unreliable
    and could ultimately lead to hard or soft lockups,
That can occur even if we set both __GFP_NOFAIL and
__GFP_DIRECT_RECLAIM, right?
No, it cannot! With __GFP_DIRECT_RECLAIM the allocator might take a long
time to satisfy the allocation but it will reclaim to get the memory, it
will sleep if necessary and it will trigger the OOM killer if there is
no other option. __GFP_DIRECT_RECLAIM is a completely different story
than without it which means _no_sleeping_ is allowed and therefore only
a busy loop waiting for the allocation to proceed is allowed.
That could be a livelock.
From the user's perspective, there's no noticeable difference between
a livelock, soft lockup, or hard lockup.
Ohh, it very much is different if somebody in a sleepable context is
taking too long to complete and making a CPU completely unusable for
anything else.
__alloc_pages_slowpath
retry:
    if (gfp_mask & __GFP_NOFAIL) {
        goto retry;
    }

When the loop continues indefinitely here, it indicates that the
system is unstable. In such a scenario, does it really matter whether
you sleep or not?
Please consider that asking for never failing allocation is a major
requirement.
So, I don't believe the issue is related
to setting __GFP_DIRECT_RECLAIM; rather, it stems from the flawed
design of __GFP_NOFAIL itself.
Care to elaborate?
I've read the documentation explaining why the busy loop is embedded
within the page allocation process instead of letting users implement
it based on their needs. However, the complexity and numerous issues
suggest that this design might be fundamentally flawed.
I really fail to see what you mean.
I mean giving the user the option to handle the loop at the call site,
rather than having it loop within __alloc_pages_slowpath().
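
For illustration only, the call-site approach described above could look
roughly like the following userspace sketch. It is not kernel code and is
not taken from the patch series: struct fake_allocator, try_alloc() and
alloc_with_caller_retry() are hypothetical names, and malloc() stands in
for an allocation that is allowed to fail. The point is only that the
caller, not the allocator, decides how many times to retry and what to do
between attempts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for an allocator that may fail.
 * It fails the first `failures_left` calls, then falls through to malloc().
 */
struct fake_allocator {
	unsigned int failures_left;
	unsigned int calls;
};

static void *try_alloc(struct fake_allocator *a, size_t size)
{
	a->calls++;
	if (a->failures_left > 0) {
		a->failures_left--;
		return NULL;
	}
	return malloc(size);
}

/*
 * Call-site retry loop (an assumption for illustration, not an existing API).
 * The caller picks the attempt bound and an optional backoff callback,
 * and must handle a NULL return once the bound is exhausted.
 */
static void *alloc_with_caller_retry(struct fake_allocator *a, size_t size,
				     unsigned int max_attempts,
				     void (*backoff)(unsigned int attempt))
{
	assert(max_attempts > 0);

	for (unsigned int attempt = 1; attempt <= max_attempts; attempt++) {
		void *p = try_alloc(a, size);

		if (p)
			return p;
		if (backoff)
			backoff(attempt);	/* e.g. sleep or yield, if the context allows it */
	}
	return NULL;	/* the caller decides what failure means */
}
```

A caller in a sleepable context could pass a backoff that sleeps, while a
caller that cannot sleep would pass NULL and a small bound, then handle
the NULL return itself. Whether that trade-off is preferable to looping
inside the allocator is exactly what this thread disputes.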

--
Regards
Yafang