Re: [PATCH v7.1] block: Coordinate flush requests
From: Shaohua Li <shli@kernel.org>
Date: 2011-01-13 05:38:59
Also in: lkml
2011/1/13 Darrick J. Wong [off-list ref]:
On certain types of storage hardware, flushing the write cache takes a considerable amount of time. Typically, these are simple storage systems with write cache enabled and no battery to save that cache during a power failure. When we encounter a system with many I/O threads that try to flush the cache, performance is suboptimal because each of those threads issues its own flush command to the drive instead of trying to coordinate the flushes, thereby wasting execution time.

Instead of each thread initiating its own flush, we now try to detect the situation where multiple threads are issuing flush requests. The first thread to enter blkdev_issue_flush becomes the owner of the flush, and all threads that enter blkdev_issue_flush before the flush finishes are queued up to wait for the next flush. When that first flush finishes, one of those sleeping threads is woken up to perform the next flush and then wake up the other threads which are asleep waiting for the second flush to finish. In the single-threaded case, the thread will simply issue the flush and exit.

To test the performance of this latest patch, I created a spreadsheet reflecting the performance numbers I obtained with the same ffsb fsync-happy workload that I've been running all along: http://tinyurl.com/6xqk5bs

The second tab of the workbook provides easy comparisons of the performance before and after adding flush coordination to the block layer. Variations in the runs were never more than about 5%, so the slight performance increases and decreases are negligible. It is expected that devices with low flush times should not show much change, whether the low flush times are due to the lack of write cache or the controller having a battery and thereby ignoring the flush command.

Notice that the elm3b231_ipr, elm3b231_bigfc, elm3b57, elm3c44_ssd, elm3c44_sata_wc, and elm3c71_scsi profiles showed large performance increases from flush coordination. These 6 configurations all feature large write caches without battery backups, and fairly high (or at least non-zero) average flush times, as was discovered when I studied the v6 patch.

Unfortunately, there is one very odd regression: elm3c44_sas. This profile is a couple of battery-backed RAID cabinets striped together with raid0 on md. I suspect that there is some sort of problematic interaction with md, because running ffsb on the individual hardware arrays produces numbers similar to elm3c71_extsas. elm3c71_extsas uses the same type of hardware array as does elm3c44_sas, in fact.

FYI, the flush coordination patch shows performance improvements both with and without Christoph's patch that issues pure flushes directly. The spreadsheet only captures the performance numbers collected without Christoph's patch.
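For readers following the thread, the owner/waiter scheme described in the quoted patch description can be modeled in userspace roughly as below. This is only a sketch under stated assumptions, not the patch's code: `struct flush_coord`, `coordinated_flush`, `device_flush` and `run_demo` are hypothetical names, pthreads stand in for kernel wait primitives, a 2 ms sleep stands in for the drive's cache flush, and waiters are woken with a broadcast rather than by waking a single successor.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical userspace model; none of these names come from the patch. */
struct flush_coord {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int in_flight;          /* 1 while some thread owns a running flush */
    unsigned long started;  /* flushes started so far */
    unsigned long done;     /* flushes completed so far */
};

/* Stand-in for the slow cache flush on the device (assumed 2 ms). */
static void device_flush(void)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = 2000000 };
    nanosleep(&ts, NULL);
}

void coordinated_flush(struct flush_coord *fc)
{
    pthread_mutex_lock(&fc->lock);
    /* A flush already running may not cover our writes, so we need
     * one that starts after we arrived. */
    unsigned long need = fc->started + 1;

    while (fc->done < need) {
        if (!fc->in_flight) {
            /* Nobody is flushing: become the owner of the next flush. */
            fc->in_flight = 1;
            unsigned long mine = ++fc->started;
            pthread_mutex_unlock(&fc->lock);
            device_flush();
            pthread_mutex_lock(&fc->lock);
            fc->done = mine;
            fc->in_flight = 0;
            /* Wake all waiters: those satisfied return, one of the
             * rest becomes the next owner. */
            pthread_cond_broadcast(&fc->cond);
        } else {
            pthread_cond_wait(&fc->cond, &fc->lock);
        }
    }
    pthread_mutex_unlock(&fc->lock);
}

static void *worker(void *arg)
{
    coordinated_flush(arg);
    return NULL;
}

/* Runs nthreads concurrent callers (capped at 64) and returns how many
 * device flushes were actually issued. */
unsigned long run_demo(int nthreads)
{
    struct flush_coord fc = { .in_flight = 0, .started = 0, .done = 0 };
    pthread_t tids[64];

    if (nthreads > 64)
        nthreads = 64;
    pthread_mutex_init(&fc.lock, NULL);
    pthread_cond_init(&fc.cond, NULL);
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tids[i], NULL, worker, &fc);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    assert(fc.done == fc.started && !fc.in_flight);
    pthread_mutex_destroy(&fc.lock);
    pthread_cond_destroy(&fc.cond);
    return fc.done;
}
```

Calling `run_demo(1)` models the single-threaded case, where the caller simply issues its own flush. With several threads the number of device flushes depends on scheduling, but it never exceeds the number of callers, since each caller owns at most one flush.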
Hi,
can you explain why there is an improvement with your patch? If there are multiple flushes, blk_do_flush already has a queue for them (the ->pending_flushes list).

Thanks,
Shaohua