Thread: 7 messages, 5 authors, 2021-08-03

Re: [PATCH] mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code

From: Rik van Riel <hidden>
Date: 2021-08-03 14:34:34
Also in: linux-mm, lkml

On Tue, 2021-07-27 at 09:51 -0700, Shakeel Butt wrote:
> On Mon, Jul 26, 2021 at 8:19 AM Rik van Riel [off-list ref] wrote:
> > On Mon, 2021-07-26 at 11:00 -0400, Johannes Weiner wrote:
> > > __mem_cgroup_threshold() indeed holds the rcu lock. In addition, the
> > > thresholding code is invoked during stat changes, and those contexts
> > > have irqs disabled as well. If the lock breaking occurs inside the
> > > flush function, it will result in a sleep from an atomic context.
> > >
> > > Use the irqsafe flushing variant in mem_cgroup_usage() to fix this.
> >
> > While this fix is necessary, in the long term I think we may
> > want some sort of redesign here, to make sure the irq safe
> > version does not spin long times trying to get the statistics
> > off some other CPU.
> >
> > I have seen a number of soft (IIRC) lockups deep inside the
> > bowels of cgroup_rstat_flush_irqsafe, with the function taking
> > multiple seconds to complete.
>
> Can you please share a bit more detail on this lockup? I am wondering
> if this was due to the flush not happening more often and thus the
> update tree is large or if there are too many concurrent flushes
> happening.

I was not logged into any system while it happened, but
only found it later in the logs.

I suspect your explanation is the reason why it happened,
though.