Thread (92 messages) 92 messages, 7 authors, 2021-11-03

Re: [PATCH v8 09/12] sysfs: fix deadlock race with module removal

From: Ming Lei <hidden>
Date: 2021-10-05 09:24:45
Also in: linux-block, linux-fsdevel, linux-kselftest, lkml

On Mon, Sep 27, 2021 at 09:38:02AM -0700, Luis Chamberlain wrote:
When driver sysfs attributes use a lock also used on module removal we
can race to deadlock. This happens when for instance a sysfs file on
a driver is used, then at the same time we have module removal call
trigger. The module removal call code holds a lock, and then the
driver's sysfs file entry waits for the same lock. While holding the
lock the module removal tries to remove the sysfs entries, but these
cannot be removed yet as one is waiting for a lock. This won't complete
as the lock is already held. Likewise module removal cannot complete,
and so we deadlock.

This can now be easily reproducible with our sysfs selftest as follows:

./tools/testing/selftests/sysfs/sysfs.sh -t 0027

This uses a local driver lock. Test 0028 can also be used, that uses
the rtnl_lock():

./tools/testing/selftests/sysfs/sysfs.sh -t 0028

To fix this we extend the struct kernfs_node with a module reference
and use the try_module_get() after kernfs_get_active() is called. As
documented in the prior patch, we now know that once kernfs_get_active()
is called the module is implicitly guarded to exist and cannot be removed.
This is because the module is the one in charge of removing the same
sysfs file it created, and removal of sysfs files on module exit will wait
until they don't have any active references. By using a try_module_get()
after kernfs_get_active() we yield to let module removal trump calls to
process a sysfs operation, while also preventing module removal if a sysfs
operation is in already progress. This prevents the deadlock.

This deadlock was first reported with the zram driver, however the live
Looks not see the lock pattern you mentioned in zram driver, can you
share the related zram code?
patching folks have acknowledged they have observed this as well with
live patching, when a live patch is removed. I was then able to
reproduce easily by creating a dedicated selftest for it.

A sketch of how this can happen follows, consider foo a local mutex
part of a driver, and used on the driver's module exit routine and
on one of its sysfs ops:

foo.c:
static DEFINE_MUTEX(foo);
static ssize_t foo_store(struct device *dev,
			 struct device_attribute *attr,
			 const char *buf, size_t count)
{
	...
	mutex_lock(&foo);
	...
	mutex_lock(&foo);
	...
}
static DEVICE_ATTR_RW(foo);
...
void foo_exit(void)
{
	mutex_lock(&foo);
	...
	mutex_unlock(&foo);
}
module_exit(foo_exit);

And this can lead to this condition:

CPU A                              CPU B
                                   foo_store()
foo_exit()
  mutex_lock(&foo)
                                   mutex_lock(&foo)
   del_gendisk(some_struct->disk);
     device_del()
       device_remove_groups()
I guess the deadlock exists if foo_exit() is called anywhere. If yes,
look the issue may not be related with removing module directly, right?



Thanks,
Ming
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help