Thread (12 messages), 4 authors, 2021-02-25

Re: [PATCH 2/2] nvme: delete disk when last path is gone

From: Hannes Reinecke <hare@suse.de>
Date: 2021-02-25 08:37:24

On 2/24/21 11:40 PM, Sagi Grimberg wrote:
>> The multipath code currently deletes the disk only after all references
>> to it are dropped rather than when the last path to that disk is lost.
>> This has been reported to cause problems with some use cases like MD
>> RAID.
>
> What is the exact problem?
>
> Can you describe the problem you see now and what you expect
> to see (unrelated to patch #1)?

The problem is a difference in behaviour between multipathed and 
non-multipathed namespaces (ie whether 'CMIC' is set or not).
If the CMIC bit is _not_ set, the disk device will be removed once
the controller is gone; if the CMIC bit is set the disk device will be 
retained, and only removed once the last _reference_ is dropped.
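
For readers unfamiliar with the field: CMIC is the "Controller Multi-Path I/O and Namespace Sharing Capabilities" byte of the Identify Controller data. The following is a minimal Python sketch of how such a byte could be decoded. The bit positions are believed to follow the NVMe base specification and should be checked against it; the helper name `gets_nshead_disk` and the choice of the multi-controller bit as the deciding one are assumptions for illustration, not something stated in this thread.

```python
# Assumed CMIC bit layout (verify against the NVMe base specification).
CMIC_MULTI_PORT = 1 << 0   # subsystem may have more than one port
CMIC_MULTI_CTRL = 1 << 1   # subsystem may have more than one controller
CMIC_SRIOV      = 1 << 2   # controller is an SR-IOV virtual function
CMIC_ANA        = 1 << 3   # asymmetric namespace access reporting


def gets_nshead_disk(cmic: int, multipath_enabled: bool = True) -> bool:
    """Hypothetical helper: would this controller be handled as multipathed?

    Assumption: the multi-controller bit decides, and only when native
    multipathing is enabled at all.
    """
    return multipath_enabled and bool(cmic & CMIC_MULTI_CTRL)


for cmic in (0x00, 0x02):
    kind = "multipathed" if gets_nshead_disk(cmic) else "non-multipathed"
    print(f"CMIC={cmic:#04x}: {kind}")
```

Two otherwise identical devices that differ only in this byte would end up on different code paths, which is the situation described above.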

This is causing customer issues, as some vendors produce nearly 
identical PCI NVMe devices, which differ in the CMIC bit.
So depending on which device the customer uses, he might be getting one 
or the other behaviour.
And this is causing issues when said customer deploys MD RAID on them;
with one set of devices PCI hotplug works, with the other set of devices 
it doesn't.
>> This patch implements an alternative behaviour of deleting the disk when
>> the last path is gone, ie the same behaviour as non-multipathed nvme
>> devices.
>
> But we also don't remove the non-multipath'd nvme device until the
> last reference drops (e.g. if you have a mounted filesystem on top).

Au contraire.

When doing PCI hotplug in the non-multipathed case, the controller is 
removed and put_disk() is called during nvme_free_ns().
When doing PCI hotplug in the multipathed case, the controller is 
removed, too, but put_disk() is only called on the namespace itself; the 
'nshead' disk is still kept around, and put_disk() on the 'nshead' disk 
is only called after the last reference is dropped.
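
The difference can be illustrated with a toy model. This is a minimal Python sketch, not kernel code: `ToyDisk`, `remove_last_path` and the `delete_on_last_path` flag are hypothetical names, and the model only tracks a reference count and whether the device node is still visible, assuming an upper layer such as MD RAID holds the disk open while the only controller is hot-unplugged.

```python
from dataclasses import dataclass


@dataclass
class ToyDisk:
    name: str
    refs: int = 1          # the driver's own reference
    visible: bool = True   # True while the block device node exists

    def open(self) -> None:
        self.refs += 1     # e.g. MD RAID or a mounted filesystem

    def delete(self) -> None:
        self.visible = False   # node goes away; object may still be referenced

    def put(self) -> None:
        self.refs -= 1
        if self.refs == 0:
            self.visible = False   # last reference gone, nothing left to see


def remove_last_path(multipath: bool, delete_on_last_path: bool = False) -> ToyDisk:
    """Hot-unplug the only controller while an upper layer holds the disk open."""
    disk = ToyDisk("nshead" if multipath else "ns")
    disk.open()                    # upper layer takes its reference
    if not multipath or delete_on_last_path:
        disk.delete()              # node removed as soon as the last path is gone
    disk.put()                     # driver drops its own reference
    return disk


print(remove_last_path(multipath=False).visible)
print(remove_last_path(multipath=True).visible)
print(remove_last_path(multipath=True, delete_on_last_path=True).visible)
```

In this model the multipathed disk stays visible until the upper layer also calls `put()`, whereas the non-multipathed disk disappears with the controller; the `delete_on_last_path` flag stands in for the behaviour the patch proposes.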

> This would be the equivalent to running raid on top of dm-mpath on
> top of scsi devices right? And if all the mpath device nodes go away
> the mpath device is deleted even if it has an open reference to it?

See above. The prime motivator behind this patch is to get equivalent 
behaviour between multipathed and non-multipathed devices.
It just so happens that MD RAID exercises this particular issue.
>> The new behaviour will be selected with the 'fail_if_no_path'
>> attribute, as returning it's arguably the same functionality.
>
> But it's not the same functionality.

Agreed. But as the first patch will be dropped (see my other mail) I'll 
be redoing the patchset anyway.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                Kernel Storage Architect
hare@suse.de                              +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme