Re: [PATCH v3 2/3] zram: fix deadlock with sysfs attribute usage and driver removal
From: Luis Chamberlain <mcgrof@kernel.org>
Date: 2021-06-22 19:57:09
Also in:
lkml
On Tue, Jun 22, 2021 at 08:05:12PM +0200, Greg KH wrote:
On Tue, Jun 22, 2021 at 10:27:12AM -0700, Luis Chamberlain wrote:quoted
On Tue, Jun 22, 2021 at 07:16:34PM +0200, Greg KH wrote:quoted
On Tue, Jun 22, 2021 at 09:32:08AM -0700, Luis Chamberlain wrote:quoted
On Tue, Jun 22, 2021 at 09:45:39AM +0200, Greg KH wrote:quoted
On Mon, Jun 21, 2021 at 04:36:34PM -0700, Luis Chamberlain wrote:quoted
@@ -2048,13 +2048,19 @@ static ssize_t hot_add_show(struct class *class, { int ret; + if (!try_module_get(THIS_MODULE)) + return -ENODEV; +You can not increment/decrement your own module's reference count and expect it to work properly, as it is still a race.The goal here is to prevent an rmmod call if this succeeds. If it succeeds then any subsequent rmmod will fail. Can you explain how this is still racy?{sigh} What happens if the driver core is just about to call hot_add_show() and the module is removed from the system. It then calls to the memory location that hot_add_show() was previously at, but now that is not a valid pointer to code, and boom.The new kobject_get() on patch 3/3 ensures that the device will be up throughout the entire life of the store call, and thus prevent the code being executed being removed, no?I do not know, I no longer remember what is in that patch at the moment as it is long-gone from my queue.
It was the changes *you* recommended, a generic way to ensure the lifetime of the derefernce is valid. I had used bdgrab()/bdget() and you suggested we generalize it with the kobject_get() for the device and a bus get. With that change, I confirm that the device will still be present during the lifetime of the sysfs knobs call.
Also, if the device will be "up" for the whole lifetime, why do you need to increment the module reference count?
The goal is to prevent a deadlock. The lifetime of the device is not an issue in this deadlock case, the issue is a race with module removal and that code path using a lock which is also used on a sysfs knob. Luis