Thread (92 messages) 92 messages, 7 authors, 2021-11-03

Re: [PATCH v8 11/12] zram: fix crashes with cpu hotplug multistate

From: Luis Chamberlain <mcgrof@kernel.org>
Date: 2021-10-11 18:27:40
Also in: linux-block, linux-fsdevel, linux-kselftest, lkml

On Tue, Oct 05, 2021 at 01:55:35PM -0700, Kees Cook wrote:
On Mon, Sep 27, 2021 at 09:38:04AM -0700, Luis Chamberlain wrote:
quoted
Provide a simple state machine to fix races with driver exit where we
remove the CPU multistate callbacks and re-initialization / creation of
new per CPU instances which should be managed by these callbacks.

The zram driver makes use of cpu hotplug multistate support, whereby it
associates a struct zcomp per CPU. Each struct zcomp represents a
compression algorithm in charge of managing compression streams per
CPU. Although a compiled zram driver only supports a fixed set of
compression algorithms, each zram device gets a struct zcomp allocated
per CPU. The "multi" in CPU hotplug multstate refers to these per
cpu struct zcomp instances. Each of these will have the CPU hotplug
callback called for it on CPU plug / unplug. The kernel's CPU hotplug
multistate keeps a linked list of these different structures so that
it will iterate over them on CPU transitions.

By default at driver initialization we will create just one zram device
(num_devices=1) and a zcomp structure then set for the now default
lzo-rle comrpession algorithm. At driver removal we first remove each
zram device, and so we destroy the associated struct zcomp per CPU. But
since we expose sysfs attributes to create new devices or reset /
initialize existing zram devices, we can easily end up re-initializing
a struct zcomp for a zram device before the exit routine of the module
removes the cpu hotplug callback. When this happens the kernel's CPU
hotplug will detect that at least one instance (struct zcomp for us)
exists. This can happen in the following situation:

CPU 1                            CPU 2

                                disksize_store(...);
class_unregister(...);
idr_for_each(...);
zram_debugfs_destroy();

idr_destroy(...);
unregister_blkdev(...);
cpuhp_remove_multi_state(...);
So this is strictly separate from the sysfs/module unloading race?
It is only related in the sense that the sysfs/module unloading race
happened *after* this other issue, but addressing these through
separate threads created a break in conversation and focus. For
instance, a theoretical race was mentioned in one thread, which
I worked to prove/disprove and then I disproved it was not possible.

But at this point, yes, this is a purely separate issue, and this
patch *should* be picked up already.

Andrew, can you merge this? It already has the respective maintainer
Ack, and I can continue to work on the rest of the patches. The only
issue I can think of would be a conflict with the last patch but
that's a oneliner, I think chances are low that would create a conflict
if its all merged separately, and if so, it should be an easy fix for
a merge conflict.

  Luis
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help