Thread (47 messages) 47 messages, 8 authors, 2025-03-15

Re: [PATCH v7 08/16] rust: add devres abstraction

From: Danilo Krummrich <dakr@kernel.org>
Date: 2025-01-15 06:42:03
Also in: linux-pci, lkml, rcu, rust-for-linux

On Tue, Jan 14, 2025 at 05:33:45PM +0100, Danilo Krummrich wrote:
quoted
quoted
+impl<T> Drop for Devres<T> {
+    fn drop(&mut self) {
+        // Revoke the data, such that it gets dropped already and the actual resource is freed.
+        //
+        // `DevresInner` has to stay alive until the devres callback has been called. This is
+        // necessary since we don't know when `Devres` is dropped and calling
+        // `devm_remove_action()` instead could race with `devres_release_all()`.
IIUC, the outcome of that race is the `WARN` if
devres_release_all takes the spinlock first and has already remvoed the
action?

Could you do a custom devres_release here that mimick
`devm_remove_action` but omit the `WARN`? This way it allows the memory
behind DevresInner to be freed early without keeping it allocated until
the end of device lifetime.
Better late than never, I now remember what's the *actual* race I was preventing
here:

  | Task 0                               | Task 1
--|----------------------------------------------------------------------------
1 | Devres::drop() {                     | devres_release_all() {
2 |    DevresInner::remove_action() {    |    spin_lock_irqsave();
3 |                                      |    remove_nodes(&todo);
4 |                                      |    spin_unlock_irqrestore();
5 |       devm_remove_action_nowarn();   |
6 |       let _ = Arc::from_raw();       |
7 |    }                                 |
8 | }                                    |
9 |                                      |    release_nodes(&todo);
10|                                      | }

So, if devres_release_all() wins the race it stores the devres action within the
temporary todo list. Which means that in 9 we enter
DevresInner::devres_callback() even though DevresInner has been freed already.

Unfortunately, this race can happen with [1], but not with this original version
of Devres.
Well, actually this can't happen, since obviously we know whether Devres:drop
removed it or whether it will be released by devres_release_all() and we only
free DevresInner conditionally.

So, the code in [1] is fine; I somehow managed to confuse myself.

Sorry for the noise.

- Danilo
I see two ways to fix it:

1) Just revert [1] and stick with the original version.

2) Use devm_release_action() instead of devm_remove_action() and don't drop the
   reference in DevresInner::remove_action() (6). This way the reference is
   always dropped from the callback.

With 2) we still have an unnecessary call to revoke() if Devres is dropped
before the device is detached from the driver, but that's still better than
keeping DevresInner alive until the device is detached from the driver, which
the original version did. Hence, I'll go ahead and prepare a patch for this,
unless there are any concerns.

- Danilo

[1] https://lore.kernel.org/rust-for-linux/20250107122609.8135-2-dakr@kernel.org/ (local)
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help