Re: [PATCH] block: always requeue !fs requests at the front
From: Christoph Hellwig <hch@infradead.org>
Date: 2007-06-16 19:54:44
Also in:
linux-xfs, lkml
On Fri, Jun 15, 2007 at 01:05:44PM +0200, Jens Axboe wrote:
> On Fri, Jun 15 2007, Tejun Heo wrote:
> > SCSI marks internal commands with REQ_PREEMPT and pushes them at the
> > front of the request queue using blk_execute_rq(). When entering
> > suspended or frozen state, SCSI devices are quiesced using
> > scsi_device_quiesce(). In quiesced state, only REQ_PREEMPT requests
> > are processed. This is how SCSI blocks other requests out while
> > suspending and resuming. As all internal commands are pushed at the
> > front of the queue, this usually works.
> >
> > Unfortunately, this interacts badly with ordered requeueing. To
> > preserve request order on requeueing (due to busy device, active EH
> > or other failures), requests are sorted according to ordered
> > sequence on requeue if IO barrier is in progress.
> >
> > The following sequence deadlocks.
> >
> > 1. IO barrier sequence issues.
> >
> > 2. Suspend requested. Queue is quiesced with part or all of IO
> >    barrier sequence at the front.
> >
> > 3. During suspending or resuming, SCSI issues internal command which
> >    gets deferred and requeued for some reason. As the command is
> >    issued after the IO barrier in #1, ordered requeueing code puts
> >    the request after IO barrier sequence.
> >
> > 4. The device is ready to process requests again but still is in
> >    quiesced state and the first request of the queue isn't
> >    REQ_PREEMPT, so command processing is deadlocked -
> >    suspending/resuming waits for the issued request to complete
> >    while the request can't be processed till device is put back
> >    into running state by resuming.
> >
> > This can be fixed by always putting !fs requests at the front when
> > requeueing.
> >
> > The following thread reports this deadlock.
> >
> >   http://thread.gmane.org/gmane.linux.kernel/537473
> >
> > Signed-off-by: Tejun Heo <redacted>
> > Cc: Jens Axboe <redacted>
> > Cc: David Greaves <redacted>
> > ---
> > Okay, it took a lot of hours of debugging but boiled down to a
> > two-liner fix. I feel so empty. :-) RAID6 triggers this reliably
> > because it uses BIO_BARRIER heavily to update its superblock. The
> > recent ATA suspend/resume rewrite is hit by this because it uses
> > SCSI internal commands to spin down and up the drives for
> > suspending and resuming.
> >
> > David, please test this. Jens, does it look okay?
>
> Yep looks good, except for the bad multi-line comment style, but
> that's minor stuff ;-)
>
> Acked-by: Jens Axboe <redacted>

I'd much much prefer having a description of the problem in the actual comment than a hyperlink. There's just too much chance of the latter breaking over time, and it's impossible to update it when things change that should be reflected in the comment.
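
For illustration only, here is a minimal user-space sketch of the deadlock from the quoted patch description, with the problem spelled out in the comment itself rather than behind a link. It is not the kernel code: `struct request`, `struct queue`, `insert_at()`, `requeue()`, `can_dispatch()` and the `front_for_non_fs` switch are all hypothetical names invented for this model, and the real block layer is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_MAX 8

/* Hypothetical, simplified request; NOT the kernel's struct request. */
struct request {
    bool is_fs;     /* filesystem I/O, subject to barrier ordering */
    bool preempt;   /* internal command, allowed while quiesced */
    int  ordseq;    /* position in the ordered (barrier) sequence */
};

/* Hypothetical, simplified queue; slot 0 is the head. */
struct queue {
    struct request slots[QUEUE_MAX];
    size_t len;
    bool quiesced;
    bool barrier_in_progress;
};

/* Insert rq at index pos, shifting later entries toward the tail. */
static void insert_at(struct queue *q, size_t pos, struct request rq)
{
    assert(q->len < QUEUE_MAX && pos <= q->len);
    for (size_t i = q->len; i > pos; i--)
        q->slots[i] = q->slots[i - 1];
    q->slots[pos] = rq;
    q->len++;
}

/*
 * Requeue a request that the device deferred.
 *
 * Non-fs requests always go to the front. If they were sorted by
 * ordered sequence instead, an internal command issued during
 * suspend/resume would land behind a pending barrier sequence. A
 * quiesced queue only dispatches a preempt request from its head, so
 * the internal command would never run, and the resume that would
 * lift the quiesced state would wait for that command forever.
 *
 * front_for_non_fs == false models the old, deadlocking behaviour.
 */
static void requeue(struct queue *q, struct request rq, bool front_for_non_fs)
{
    size_t pos = 0;
    bool force_front = front_for_non_fs && !rq.is_fs;

    if (q->barrier_in_progress && !force_front) {
        /* Ordered requeue: skip entries earlier in the sequence. */
        while (pos < q->len && q->slots[pos].ordseq <= rq.ordseq)
            pos++;
    }
    insert_at(q, pos, rq);
}

/* A quiesced queue may only dispatch a preempt request at its head. */
static bool can_dispatch(const struct queue *q)
{
    if (q->len == 0)
        return false;
    return !q->quiesced || q->slots[0].preempt;
}
```

In this model, a quiesced queue holding one barrier request (`ordseq` 1) followed by a requeued internal command (`ordseq` 2) cannot dispatch anything when `front_for_non_fs` is false, because the barrier request sits at the head. With `front_for_non_fs` set, the internal command lands at the head and the queue can make progress.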