Thread (92 messages) 92 messages, 7 authors, 2021-11-03

Re: [PATCH v8 11/12] zram: fix crashes with cpu hotplug multistate

From: Petr Mladek <pmladek@suse.com>
Date: 2021-11-02 15:24:16
Also in: linux-block, linux-fsdevel, linux-kselftest, live-patching, lkml

On Wed 2021-10-27 13:57:40, Miroslav Benes wrote:
On Tue, 26 Oct 2021, Luis Chamberlain wrote:
quoted
On Tue, Oct 26, 2021 at 11:37:30PM +0800, Ming Lei wrote:
quoted
On Tue, Oct 26, 2021 at 10:48:18AM +0200, Petr Mladek wrote:
quoted
Livepatch code never called kobject_del() under a lock. It would cause
the obvious deadlock.
I have to correct myself. IMHO, the deadlock is far from obvious. I
always get lost in the code and the documentation is not clear.
I always get lost.
quoted
Never?
kobject_put() to be precise.
IMHO, the problem is actually with kobject_del() that gets blocked
until the sysfs interface gets removed. kobject_put() will have
the same problem only when the clean up is not delayed.

When I started working on the support for module/live patches removal, 
calling kobject_put() under our klp_mutex lock was the obvious first 
choice given how the code was structured, but I ran into problems with 
deadlocks immediately. So it was changed to async approach with the 
workqueue. Thus the mainline code has never suffered from this, but we 
knew about the issues.
 
quoted
quoted
quoted
The historic code only waited in the
module_exit() callback until the sysfs interface was removed.
OK, then Luis shouldn't consider livepatching as one such issue to solve
with one generic solution.
It's not what I was told when the deadlock was found with zram, so I was
informed quite the contrary.
quoted
From my perspective, it is quite easy to get it wrong due to either a lack 
of generic support, or missing rules/documentation. So if this thread 
leads to "do not share locks between a module removal and a sysfs 
operation" strict rule, it would be at least something. In the same 
manner as Luis proposed to document try_module_get() expectations.
The rule "do not share locks between a module removal and a sysfs
operation" is not clear to me.

IMHO, there are the following rules:

1. rule: kobject_del() or kobject_put() must not be called under a lock that
	 is used by store()/show() callbacks.

   reason: kobject_del() waits until the sysfs interface is destroyed.
	 It has to wait until all store()/show() callbacks are finished.


2. rule: kobject_del()/kobject_put() must not be called from the
	related store() callbacks.

   reason: same as in 1st rule.


3. rule: module_exit() must wait until all release() callbacks are called
	 when kobject are static.

   reason: kobject_put() must be called to clean up internal
	dependencies. The clean up might be done asynchronously
	and need access to the kobject structure.


Best Regards,
Petr

PS: I am sorry if I am messing things. I want to be sure that we are
    all talking about the same and understand it the same way.
    
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help