Re: [PATCH 2/3] cgroup: add lockless fast-path checks to cgroup_file_notify()

From: Shakeel Butt <shakeel.butt@linux.dev>
Date: 2026-03-02 16:14:24
Also in: cgroups, linux-mm, lkml

Hi Chen, thanks for taking a look.

On Mon, Mar 02, 2026 at 09:50:53AM +0800, Chen Ridong wrote:

> Hi Shakeel,
>
> Good series to move away from the global lock.
>
> On 2026/2/28 22:20, Shakeel Butt wrote:

>> Add two lockless checks before acquiring the lock:
>>
>> 1. READ_ONCE(cfile->kn) NULL check to skip torn-down files.
>> 2. READ_ONCE(cfile->notified_at) check to skip when within the
>>    rate-limit window (~10ms).
>>
>> Both checks have safe error directions -- a stale read can only cause
>> unnecessary lock acquisition, never a missed notification.  Annotate
>> all write sites with WRITE_ONCE() to pair with the lockless readers.
>>
>> The trade-off is that trailing timer_reduce() calls during bursts are
>> skipped, so the deferred notification that delivers the final state
>> may be lost.  This is acceptable for the primary callers like
>> __memcg_memory_event() where events keep arriving.
>>
>> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
>> Reported-by: Jakub Kicinski <kuba@kernel.org>
>> ---
>>  kernel/cgroup/cgroup.c | 21 ++++++++++++++-------
>>  1 file changed, 14 insertions(+), 7 deletions(-)

>> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
>> index 33282c7d71e4..5473ebd0f6c1 100644
>> --- a/kernel/cgroup/cgroup.c
>> +++ b/kernel/cgroup/cgroup.c
>> @@ -1749,7 +1749,7 @@ static void cgroup_rm_file(struct cgroup *cgrp, const struct cftype *cft)
>>  		struct cgroup_file *cfile = (void *)css + cft->file_offset;
>>  
>>  		spin_lock_irq(&cgroup_file_kn_lock);
>> -		cfile->kn = NULL;
>> +		WRITE_ONCE(cfile->kn, NULL);
>>  		spin_unlock_irq(&cgroup_file_kn_lock);
>>  
>>  		timer_delete_sync(&cfile->notify_timer);
>> @@ -4430,7 +4430,7 @@ static int cgroup_add_file(struct cgroup_subsys_state *css, struct cgroup *cgrp,
>>  		timer_setup(&cfile->notify_timer, cgroup_file_notify_timer, 0);
>>  
>>  		spin_lock_irq(&cgroup_file_kn_lock);
>> -		cfile->kn = kn;
>> +		WRITE_ONCE(cfile->kn, kn);
>>  		spin_unlock_irq(&cgroup_file_kn_lock);
>>  	}
>> @@ -4686,20 +4686,27 @@ int cgroup_add_legacy_cftypes(struct cgroup_subsys *ss, struct cftype *cfts)
>>   */
>>  void cgroup_file_notify(struct cgroup_file *cfile)
>>  {
>> -	unsigned long flags;
>> +	unsigned long flags, last, next;
>>  	struct kernfs_node *kn = NULL;
>>  
>> +	if (!READ_ONCE(cfile->kn))
>> +		return;
>> +
>> +	last = READ_ONCE(cfile->notified_at);
>> +	if (time_before_eq(jiffies, last + CGROUP_FILE_NOTIFY_MIN_INTV))
>> +		return;
>> +

> Previously, if a notification arrived within the rate-limit window, we would
> still call timer_reduce(&cfile->notify_timer, next) to schedule a deferred
> notification.
>
> With this change, returning early here bypasses that timer scheduling entirely.
> Does this risk missing notifications that would have been delivered by the timer?

You are right that this can cause missed notifications. After giving it some
thought, I think the lockless check-and-return can pretty much be simplified to
a timer_pending() check: if the timer is already pending, do nothing, because
the deferred notification will be delivered when the timer fires.
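
To make the difference concrete, here is a rough userspace model of the two
fast paths. It is only a sketch, not kernel code and not the updated patch:
struct model_file, MODEL_MIN_INTV and all model_*() helpers are hypothetical
stand-ins, the lock is omitted, and the timer callback is simplified to deliver
directly on expiry.

```c
/* Userspace model only: not kernel code and not the actual updated patch. */
#include <stdbool.h>

#define MODEL_MIN_INTV 10UL	/* stand-in for CGROUP_FILE_NOTIFY_MIN_INTV, in ticks */

/* Hypothetical stand-in for the fields of struct cgroup_file used here. */
struct model_file {
	bool has_kn;			/* models cfile->kn != NULL */
	unsigned long notified_at;	/* tick of the last delivered notification */
	bool timer_armed;		/* models timer_pending(&cfile->notify_timer) */
	unsigned long timer_expires;	/* tick at which the armed timer fires */
	unsigned int delivered;		/* notifications that were delivered */
};

/* Wrap-safe "a <= b", same idea as the kernel's time_before_eq(). */
bool model_time_before_eq(unsigned long a, unsigned long b)
{
	return (long)(b - a) >= 0;
}

/* Models the existing locked path: deliver now, or defer to the timer. */
void model_notify_locked(struct model_file *f, unsigned long now)
{
	unsigned long next = f->notified_at + MODEL_MIN_INTV;

	if (model_time_before_eq(f->notified_at, now) &&
	    model_time_before_eq(now, next)) {
		/* Like timer_reduce(): arm, or only ever move the expiry earlier. */
		if (!f->timer_armed || !model_time_before_eq(f->timer_expires, next))
			f->timer_expires = next;
		f->timer_armed = true;
	} else {
		f->delivered++;
		f->notified_at = now;
	}
}

/* Fast path as posted in this patch: return early inside the window. */
void model_notify_v1(struct model_file *f, unsigned long now)
{
	if (!f->has_kn)
		return;
	if (model_time_before_eq(now, f->notified_at + MODEL_MIN_INTV))
		return;		/* the timer is never armed on this path */
	model_notify_locked(f, now);
}

/* Alternative: return early only if a deferred notification is already queued. */
void model_notify_pending(struct model_file *f, unsigned long now)
{
	if (!f->has_kn)
		return;
	if (f->timer_armed)
		return;
	model_notify_locked(f, now);
}

/* Models the timer firing; simplified to deliver directly on expiry. */
void model_tick(struct model_file *f, unsigned long now)
{
	if (f->timer_armed && model_time_before_eq(f->timer_expires, now)) {
		f->timer_armed = false;
		f->delivered++;
		f->notified_at = now;
	}
}
```

With events at ticks 100 and 103 and nothing afterwards, model_notify_v1()
delivers only the first event and never arms the timer, so the second one is
lost. model_notify_pending() arms the timer at tick 103, ignores a further event
at tick 105 because the timer is already armed, and model_tick() at tick 110
delivers the deferred notification.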

I will send the updated version soon. Any comments on the other two patches?
