Re: Writeback efficiency -- proposal

From: Vojtech Pavlik <hidden>
Date: 2017-09-20 08:07:13

On Wed, Sep 20, 2017 at 01:01:47AM -0700, Michael Lyle wrote:

Hey everyone---

Right now writeback is pretty inefficient.  It lowers the seek
workload some on the disk by doing things in ascending-LBA order, but
there is no prioritization of writing back larger blocks (that is,
doing larger sequential IOs).

On RAID devices, bcache attempts writing out full RAID stripes, avoiding
the issue you describe.

It might make sense to extend that logic to non-striped devices, too.

At the same time, there is no on-disk index that makes it easy to find
larger sequential pieces.  However, I think it's possible to take a
heuristic approach to make this better.

Proposal: when gathering dirty chunks, I would like to track the
median size written back in the last batch of writebacks, and then
skip the first 500 chunks smaller than that median.  This still
puts all of our writes in LBA order, and has a
relatively minimal cost (having to scan through 1000 dirty chunks
instead of 500 in the worst case).  Upon reaching the end of the btree
we can revert to accepting all blocks.
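
For illustration only, the skip heuristic described above could be sketched roughly as follows. This is not bcache code: the names gather_batch and next_median, the (lba, size) tuples, and the batch size of 500 are assumptions made for the sketch (the batch size is inferred from the "1000 instead of 500" remark). Reverting to accepting all blocks at the end of the btree is modeled by passing last_median=0.

```python
from statistics import median

SKIP_BUDGET = 500  # from the proposal: skip at most 500 small extents
BATCH_SIZE = 500   # assumption: implied by "1000 dirty things instead of 500"


def gather_batch(dirty, last_median, batch_size=BATCH_SIZE, skip_budget=SKIP_BUDGET):
    """Pick extents for one writeback batch.

    dirty is a list of (lba, size) tuples already sorted by ascending LBA
    (hypothetical stand-in for walking the btree). Extents smaller than
    last_median are skipped until the skip budget is used up. Returns the
    batch and the number of extents examined.
    """
    batch = []
    skipped = 0
    scanned = 0
    for lba, size in dirty:
        if len(batch) >= batch_size:
            break
        scanned += 1
        if size < last_median and skipped < skip_budget:
            skipped += 1  # leave this small extent for a later pass
            continue
        batch.append((lba, size))  # LBA order is preserved
    return batch, scanned


def next_median(batch):
    """Median extent size of a finished batch; feeds the next gather_batch call."""
    return median(size for _, size in batch) if batch else 0
```

With 4k and 8k extents alternating and last_median=8, this scans 1000 extents to fill a batch of 500, all of them 8k, which matches the worst case mentioned above.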

Taking a trivial case: if half of the chunks to write back are 4k,
and half are 8k, this will make us write back 8k chunks almost
exclusively, and will demand 25% fewer seeks to do an
equivalent amount of writeback, in exchange for a small amount of
additional CPU.  (To an extent even this will be mitigated, because we
won't have to scan to find dirty blocks as often).
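
The 25% figure can be checked with a little arithmetic. The sketch below assumes every extent costs exactly one seek (no two extents are adjacent on disk); seeks_needed and the 24000 KiB total are made up for the example.

```python
def seeks_needed(total_kib, sizes_kib):
    """Seeks to write total_kib when extents are an even mix of sizes_kib.

    Assumption: one seek per extent, i.e. no extents are contiguous.
    """
    average_kib = sum(sizes_kib) / len(sizes_kib)
    return total_kib / average_kib


mixed = seeks_needed(24_000, [4, 8])  # even mix of 4k and 8k, 6k per seek on average
large = seeks_needed(24_000, [8])     # 8k extents only
reduction = 1 - large / mixed         # fraction of seeks saved
```

Writing the same amount of data at 8k per seek instead of an average of 6k per seek needs 6/8 as many seeks, hence the 25% reduction.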

Does this sound reasonable?

It doesn't sound wrong. :)

Vojtech
