Re: [PATCH v7.1] block: Coordinate flush requests

From: Shaohua Li <shli@kernel.org>
Date: 2011-01-13 05:38:59
Also in: lkml

2011/1/13 Darrick J. Wong [off-list ref]:

On certain types of storage hardware, flushing the write cache takes a
considerable amount of time.  Typically, these are simple storage systems with
write cache enabled and no battery to save that cache during a power failure.
When we encounter a system with many I/O threads that try to flush the cache,
performance is suboptimal because each of those threads issues its own flush
command to the drive instead of trying to coordinate the flushes, thereby
wasting execution time.

Instead of each thread initiating its own flush, we now try to detect the
situation where multiple threads are issuing flush requests.  The first thread
to enter blkdev_issue_flush becomes the owner of the flush, and all threads
that enter blkdev_issue_flush before the flush finishes are queued up to wait
for the next flush.  When that first flush finishes, one of those sleeping
threads is woken up to perform the next flush and then wake up the other
threads which are asleep waiting for the second flush to finish.

In the single-threaded case, the thread will simply issue the flush and exit.

To test the performance of this latest patch, I created a spreadsheet
reflecting the performance numbers I obtained with the same ffsb fsync-happy
workload that I've been running all along:  http://tinyurl.com/6xqk5bs

The second tab of the workbook provides easy comparisons of the performance
before and after adding flush coordination to the block layer.  Variations in
the runs were never more than about 5%, so the slight performance increases and
decreases are negligible.  It is expected that devices with low flush times
should not show much change, whether the low flush times are due to the lack of
write cache or the controller having a battery and thereby ignoring the flush
command.

Notice that the elm3b231_ipr, elm3b231_bigfc, elm3b57, elm3c44_ssd,
elm3c44_sata_wc, and elm3c71_scsi profiles showed large performance increases
from flush coordination.  These 6 configurations all feature large write caches
without battery backups, and fairly high (or at least non-zero) average flush
times, as was discovered when I studied the v6 patch.

Unfortunately, there is one very odd regression: elm3c44_sas.  This profile is
a couple of battery-backed RAID cabinets striped together with raid0 on md.  I
suspect that there is some sort of problematic interaction with md, because
running ffsb on the individual hardware arrays produces numbers similar to
elm3c71_extsas.  elm3c71_extsas uses the same type of hardware array as does
elm3c44_sas, in fact.

FYI, the flush coordination patch shows performance improvements both with and
without Christoph's patch that issues pure flushes directly.  The spreadsheet
only captures the performance numbers collected without Christoph's patch.

Hi,
can you explain why there is improvement with your patch? If there are
multiple flush, blk_do_flush already has queue for them (the
->pending_flushes list).

Thanks,
Shaohua
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help