Thread (20 messages) 20 messages, 4 authors, 2016-09-01

Re: Time to make dynamically allocated devt the default for scsi disks?

From: James Bottomley <James.Bottomley@HansenPartnership.com>
Date: 2016-08-14 17:20:55
Also in: linux-scsi
Subsystem: scsi subsystem, the rest · Maintainers: "James E.J. Bottomley", "Martin K. Petersen", Linus Torvalds

On Sat, 2016-08-13 at 11:27 -0700, Dan Williams wrote:
On Sat, Aug 13, 2016 at 10:43 AM, James Bottomley
[off-list ref] wrote:
quoted
On Sat, 2016-08-13 at 09:29 -0700, Dan Williams wrote:
quoted
On Sat, Aug 13, 2016 at 8:23 AM, James Bottomley
[off-list ref] wrote:
quoted
It does?  The race is the fact that the parent can be removed
before the child meaning if the parent name is re-registered 
before the child dies we get a duplicate name in bdi space.
No, the race is that the *name* of the parent isn't released 
until the child is both unregistered and put.  The device core is
already ensuring that the parent is not released until all 
descendants have been removed.
We're both saying the same thing: the problem is that, with
df08c32ce3be the bdi name lifetime is tied to the lifetime of the
gendisk.  However, the parent of the gendisk currently is only tied 
to the visibility lifetime of the gendisk, not the final put 
lifetime, so it doesn't see this.
quoted
quoted
quoted
So I tried the attached and it makes the libnvdimm unit tests
start crashing.
Well, the attached is clearly buggy, isn't it?  You're trying 
to do a get on the parent before the parent is actually set.
Ah, yes, thank you.  Fixed up v2 attached that passes my tests.
quoted
Why don't you just try the incremental patch I sent instead of
trying to rework it?
I reworked it because it is the bdi that holds this extra 
dependency on the disk's parent, not the disk itself.
Philosophically I don't like this approach.  The dependency goes

bdi->gendisk->parent
I'm arguing that there is no bdi->gendisk dependency.
You created one with your bdi->owner field.  Just because you didn't
call it a parent doesn't mean it wasn't one.  Arguably the whole bdi
thing is strangely done because gendisk treats it like a class and
that's how it behaves, it just doesn't have a proper class structure
(which is why gendisk creates the link that would be done by the class
infrastructure)
The dependency is:

bdi->devt
devt isn't a device (in the struct device sense).  It exists as
effectively an embedded component of the bdi.  As far as I can tell
there's no reason for it to be separately allocated, it could be
properly embedded as is the normal pattern.
It just so happens that block-dynamic devt is released in
disk_release().
quoted
Making the bdi manage the parent lifetime is an unusual pattern.
 Making the parent stay around until the last reference to gendisk 
is put is the usual one.
What's unusual is the bdi's dependency on the allocated name, not the
gendisk itself.
A name is just a resource belonging to an object.  The object it
belongs to is the bdi and the bdi is parented by the owner field (and a
hokey link) to the gendisk.
quoted
quoted
quoted
quoted
  A couple crash logs attached.  Not yet sure what assumption
is getting violated, but how about that conversion of scsi to 
use dynamic devt? ;-)
It's completely orthogonal.  The problem is in hierarchy 
lifetimes: switching from static to dynamic allocation won't 
change that at all.  You don't see this problem in nvme because 
the parent control device's lifetime belongs to the controller 
not the disk.  In SCSI the parent is our representation of the 
SCSI device whose lifetime is governed at the SCSI level and 
effectively represents the disk.
No, it's only the name.  We could achieve the same by teaching 
the block core to manage the "sd_index_ida" instead of the sd 
driver itself, but the v2-patch attached works and does not 
introduce that layering violation.
Um, so this patch doesn't fix the problem. It merely makes the 
lifetime rules correct so the problem can then be fixed at the scsi
level.
You're right that this patch does not fix the problem, I missed that
the scsi_disk is not the parent of the gendisk, so this patch does
nothing to delay scsi_disk_release.  What I think is the real fix is
to make the devt properly reference counted and prevent
ida_remove(&sd_index_ida, sdkp->index); from being called until all
objects derived from that allocation are done with it.
OK, this is another philosophical difference, I suppose: since bdi is
already so complex and non-standard, I really don't think adding more
non standard stuff is a good idea.  The simplest way to fix it is

   1. The two line patch I already sent to make the bdi hold the owner
      ->parent until release
   2. Parent the gendisk to scsi_disk->dev.  The name release is already
      in the correct place, so this is a 3 line patch.

These are established patterns, so they're both understandable to
anyone who reads the code.  The answer to any other BDI lifetime
problem is to free the name in the parent release.

James

---
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d3e852a..222771d 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3000,7 +3000,13 @@ static void sd_probe_async(void *data, async_cookie_t cookie)
 	}
 
 	blk_pm_runtime_init(sdp->request_queue, dev);
-	device_add_disk(dev, gd);
+	/*
+	 * previously the parent of the gendisk was the scsi device.  It
+	 * was moved to fix lifetime rules, so now we install a symlink
+	 * to the new location of the block class directory
+	 */
+	device_add_disk(&sdkp->dev, gd);
+	WARN_ON(sysfs_add_link_to_group(&dev->kobj, "block", &sdkp->dev.kobj, "block"));
 	if (sdkp->capacity)
 		sd_dif_config_host(sdkp);
 
@@ -3142,6 +3148,7 @@ static int sd_remove(struct device *dev)
 
 	async_synchronize_full_domain(&scsi_sd_pm_domain);
 	async_synchronize_full_domain(&scsi_sd_probe_domain);
+	sysfs_remove_link(&dev->kobj, "block");
 	device_del(&sdkp->dev);
 	del_gendisk(sdkp->disk);
 	sd_shutdown(dev);
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help