Thread (2 messages) 2 messages, 1 author, 2017-06-20

Re: Re: [PATCH] bcache: fix issue of writeback rate at minimum 1 blockper second

From: Eric Wheeler <hidden>
Date: 2017-06-20 21:36:56

On Sat, 17 Jun 2017, bcache@lists.ewheeler.net wrote:
On Fri, 9 Jun 2017, tang.junhui@zte.com.cn wrote:
quoted
Hello Eric,


> It would be nice if ZERO_RATE_DELAY_NS was configurable via sysfs, as some 
> might want a longer delay than 30s.

How about we reuse the dc->writeback_delay configuration for the delay time?
That seems reasonable.

For small writeback caches where the cache could be exhausted while in 
schedule_timeout(), will IOs stall until schedule_timeout() allows 
writeback_thread to proceed?
Hi Tang, 

Also, I wonder, is schedule_timeout the right place to do this?  You would 
delay the thread for 30 seconds (by default) even in the case of 
BCACHE_DEV_DETACHING if I am reading this right. This would make for a 
slow shutdown if it is waiting in schedule_timeout.

Would it would be better to handle this by setting d->next = WRITEBACK_WAIT_CYCLE?  

After all, we are trying to avoid doing 1-secondly writes if the dirty 
data is within threshold.

--
Eric Wheeler

quoted
> Also, if zero-rate has been hit, why not have a full-writeback of 
whatever > remains so we only use the ZERO_RATE_DELAY_NS when 
dirty_data==0?  No > sense in leaving it slightly dirty if there is 
only a block or two still > remaining.

I think it's risky, write IO maybe in continue, we should to accumulate 
more dirty data
ok, agreed---also I note that writeback is still 10% dirty by default, so 
it could be a large amount of data.  The cache is always seeking to hit 
dirty_ratio and this is what the 1-sector writeback fix is targetd for.
quoted
to make writeback sequentially, otherwise, the full-writeback would be 
inefficient when there

is only a little dirty, as it is small and random IOs, and the write IO 
are in continue.


So, Eric, would you agree to just modify this patch to reuse 
dc->writeback_delay as

the delay time?
Yes that seems OK, unless someone else says otherwise.
quoted
Acctually, we do many optimizations for writeback in our product, such 
as the issue that the dirty device never get clean (we start 
full-writeback when we find there is no more write IOs in comming), 
these modifications make bcache works better.
I would be interested in seeing other optimization patches if you have 
any.  Thank you for your work on this!

-Eric
quoted

Regards,

Tang Junhui




原始邮件
发件人: <bcache@lists.ewheeler.net>;
收件人: <i@coly.li>;
抄送人:唐军辉10074136; <kent.overstreet@gmail.com>; <linux-bcache@vger.kernel.org>;
日 期 :2017年06月09日 09:43
主 题 :Re: [PATCH] bcache: fix issue of writeback rate at minimum 1 blockper second


On Fri, 9 Jun 2017, Coly Li wrote:

> On 2017/6/8 下午5:34, tang.junhui@zte.com.cn wrote:
> > From: Tang Junhui <tang.junhui@zte.com.cn>
> > 
> > When there is not enough data in writeback cache, let the writeback rate
> > to be 0, and delay 30 seconds in read_dirty().
> > 
> 
> I feel uncomfortable to the above idea. Is the purpose of this change is
> to make writeback more efficient with more dirty data ? Or some other
> reason that you want to make this change?
> 
> When data writing back to the backing device, most of the writes are
> random. If the backing device is a hard disk, the write back rate will
> be very slow. Such a longer delay only makes more data accumulated in
> writeback cache, does not help the writeback rate indeed.
> 
> Do you observe performance improvement by this patch ? If yes, I'd like
> to see the performance number.

Its not a performance issue, it is a power saving issue.  Laptop users 
have reported that zero-rate writeback writes 512-bytes every second and 
never lets the HDD spin down and several users on the list have asked for 
such a bugfix.

It would be nice if ZERO_RATE_DELAY_NS was configurable via sysfs, as some 
might want a longer delay than 30s.

Also, if zero-rate has been hit, why not have a full-writeback of whatever 
remains so we only use the ZERO_RATE_DELAY_NS when dirty_data==0?  No 
sense in leaving it slightly dirty if there is only a block or two still 
remaining.


--
Eric Wheeler



> 
> Thanks.
> 
> Coly
> 
> 
> > Signed-off-by: Tang Junhui <tang.junhui@zte.com.cn>
> > ---
> >  drivers/md/bcache/util.c      | 17 ++++++++++++-----
> >  drivers/md/bcache/writeback.c |  2 +-
> >  2 files changed, 13 insertions(+), 6 deletions(-)
> > 
> > diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
> > index dde6172..cf750df 100644
> > --- a/drivers/md/bcache/util.c
> > +++ b/drivers/md/bcache/util.c
> > @@ -15,6 +15,8 @@
> >  
> >  #include "util.h"
> >  
> > +#define ZERO_RATE_DELAY_NS    30*NSEC_PER_SEC
> > +
> >  #define simple_strtoint(c, end, base)    simple_strtol(c, end, base)
> >  #define simple_strtouint(c, end, base)    simple_strtoul(c, end, base)
> >  
> > @@ -209,13 +211,18 @@ uint64_t bch_next_delay(struct bch_ratelimit *d, uint64_t done)
> >  {
> >      uint64_t now = local_clock();
> >  
> > -    d->next += div_u64(done * NSEC_PER_SEC, d->rate);
> > +    if (!d->rate) {
> > +        d->next = now + ZERO_RATE_DELAY_NS;
> > +    }
> > +    else {
> > +        d->next += div_u64(done * NSEC_PER_SEC, d->rate);
> >  
> > -    if (time_before64(now + NSEC_PER_SEC, d->next))
> > -        d->next = now + NSEC_PER_SEC;
> > +        if (time_before64(now + NSEC_PER_SEC, d->next))
> > +            d->next = now + NSEC_PER_SEC;
> >  
> > -    if (time_after64(now - NSEC_PER_SEC * 2, d->next))
> > -        d->next = now - NSEC_PER_SEC * 2;
> > +        if (time_after64(now - NSEC_PER_SEC * 2, d->next))
> > +            d->next = now - NSEC_PER_SEC * 2;
> > +    }
> >  
> >      return time_after64(d->next, now)
> >          ? div_u64(d->next - now, NSEC_PER_SEC / HZ)
> > diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
> > index 69e1ae5..8fac280 100644
> > --- a/drivers/md/bcache/writeback.c
> > +++ b/drivers/md/bcache/writeback.c
> > @@ -60,7 +60,7 @@ static void __update_writeback_rate(struct cached_dev *dc)
> >  
> >      dc->writeback_rate.rate =
> >          clamp_t(int64_t, (int64_t) dc->writeback_rate.rate + change,
> > -            1, NSEC_PER_MSEC);
> > +            0, NSEC_PER_MSEC);
> >  
> >      dc->writeback_rate_proportional = proportional;
> >      dc->writeback_rate_derivative = derivative;
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-bcache" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 



Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help