Re: [PATCH v5 6/7] module: Improve support for asynchronous module exit code
From: Bart Van Assche <bvanassche@acm.org>
Date: 2022-09-28 19:27:16
Also in: linux-scsi, lkml
Subsystem: scsi subsystem, the rest
Maintainers: "James E.J. Bottomley", "Martin K. Petersen", Linus Torvalds
On 9/27/22 18:09, Ming Lei wrote:
> On Wed, Sep 14, 2022 at 03:56:20PM -0700, Bart Van Assche wrote:
>> Some kernel modules call device_del() from their module exit code and
>> schedule asynchronous work from inside the .release callback without
>> waiting until that callback has finished. As an example, many SCSI LLD
>> drivers call
>
> It isn't only related with device, any kobject has such issue, or any
> reference counter usage has similar potential risk, see previous discussion:
>
> https://lore.kernel.org/lkml/YsZm7lSXYAHT14ui@T590/
>
> IMO, it is one fundamental problem wrt. module vs. reference counting or
> kobject uses at least, since the callback depends on module code segment.
>
>> scsi_remove_host() from their module exit code. scsi_remove_host() may
>> invoke scsi_device_dev_release_usercontext() asynchronously.
>> scsi_device_dev_release_usercontext() uses the host template pointer and
>> that pointer usually exists in static storage in the SCSI LLD. Support
>> using the module reference count to keep the module around until
>> asynchronous module exiting has completed by waiting in the delete_module()
>> system call until the module reference count drops to zero.
>
> The issue can't be addressed by the normal mod->refcnt, since user need to
> unload module when the device isn't used.

Hi Ming,

How about removing support for calling scsi_device_put() from atomic
context as is done in the untested patch below?

Thanks,

Bart.

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index c59eac7a32f2..661753a10b47 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -561,6 +561,8 @@ EXPORT_SYMBOL(scsi_report_opcode);
  */
 int scsi_device_get(struct scsi_device *sdev)
 {
+	might_sleep();
+
 	if (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL)
 		goto fail;
 	if (!get_device(&sdev->sdev_gendev))
@@ -588,6 +590,7 @@ void scsi_device_put(struct scsi_device *sdev)
 {
 	struct module *mod = sdev->host->hostt->module;

+	might_sleep();
 	put_device(&sdev->sdev_gendev);
 	module_put(mod);
 }
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index a3aaafdeac1d..4cfc9317b4ad 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -441,7 +441,7 @@ static void scsi_device_cls_release(struct device *class_dev)
 	put_device(&sdev->sdev_gendev);
 }

-static void scsi_device_dev_release_usercontext(struct work_struct *work)
+static void scsi_device_dev_release(struct device *dev)
 {
 	struct scsi_device *sdev;
 	struct device *parent;
@@ -450,11 +450,8 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	struct scsi_vpd *vpd_pg0 = NULL, *vpd_pg89 = NULL;
 	struct scsi_vpd *vpd_pgb0 = NULL, *vpd_pgb1 = NULL, *vpd_pgb2 = NULL;
 	unsigned long flags;
-	struct module *mod;
-
-	sdev = container_of(work, struct scsi_device, ew.work);

-	mod = sdev->host->hostt->module;
+	sdev = to_scsi_device(dev);

 	parent = sdev->sdev_gendev.parent;

@@ -516,19 +513,6 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)

 	if (parent)
 		put_device(parent);
-	module_put(mod);
-}
-
-static void scsi_device_dev_release(struct device *dev)
-{
-	struct scsi_device *sdp = to_scsi_device(dev);
-
-	/* Set module pointer as NULL in case of module unloading */
-	if (!try_module_get(sdp->host->hostt->module))
-		sdp->host->hostt->module = NULL;
-
-	execute_in_process_context(scsi_device_dev_release_usercontext,
-				   &sdp->ew);
 }

 static struct class sdev_class = {
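
For readers who want to see the ordering argument outside the kernel, here is a minimal user-space sketch of the difference the patch above is after. It is only a model: struct model_dev, model_put(), model_run_pending() and model_safe_to_unload() are made-up names, and the one-slot pending_release pointer is an assumed stand-in for work that execute_in_process_context() would queue when called from interrupt context. The point is that with a synchronous release, the release callback has finished by the time the last put returns, while with a deferred release it may still be outstanding.

```c
/* User-space model only; none of these names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_dev {
	int refcount;		/* stands in for the struct device refcount */
	bool defer_release;	/* true: deferred release, false: synchronous */
	bool released;		/* set once the release callback has run */
};

/* One-slot stand-in for a queued release work item (assumption). */
static struct model_dev *pending_release;

static void model_release(struct model_dev *dev)
{
	dev->released = true;
}

/* Drop one reference; on the last put, release now or queue it. */
static void model_put(struct model_dev *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount > 0)
		return;
	if (dev->defer_release)
		pending_release = dev;	/* release still outstanding on return */
	else
		model_release(dev);	/* release finished before return */
}

/* Run the queued release, as a worker would at some later time. */
static void model_run_pending(void)
{
	if (pending_release) {
		model_release(pending_release);
		pending_release = NULL;
	}
}

/* True if "module exit" could return without a callback left behind. */
static bool model_safe_to_unload(const struct model_dev *dev)
{
	return dev->released && pending_release == NULL;
}
```

In this model, a device with defer_release set to false is safe to unload right after its last model_put(), whereas one with defer_release set to true only becomes safe after model_run_pending() has run. The real code has more going on (parent references, the host template, locking), so treat this as an illustration of the ordering only.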