Re: [LSF/MM TOPIC] linux servers as a storage server - what's missing?
From: Bart Van Assche <bvanassche@acm.org>
Date: 2012-01-18 18:51:37
Also in:
linux-scsi
On Wed, Jan 18, 2012 at 6:46 PM, Roland Dreier [off-list ref] wrote:
> > Why would you crash if you have device mapper multipath configured to handle path fail over? We have tons of enterprise customers that use that...
>
> cf http://www.spinics.net/lists/linux-scsi/msg56254.html
>
> Basically hot unplug of an sdX can oops on any recent kernel, no matter what dm stuff you have on top.
>
> > On the broader topic of error handling and so on, I do agree that is always an area of concern (how many times to retry, how long time outs need to be, when to panic/reboot or propagate up an error code)
>
> Yes, especially the scsi eh stuff escalating to a host reset when a single drive has gone bad -- even if the HBA is happily doing IO to other drives, we'll kill access to the whole SAS fabric.

With which SCSI low-level driver does that occur, and what does the call stack look like? I haven't encountered any such issues while testing the srp-ha patch set. However, I have to admit that the issues mentioned in the description of commit 3308511 were discovered while testing the srp-ha patch set.

Bart.