Thread: 18 messages, 4 authors, 2011-06-30

Re: [RFC PATCH 0/3] block: Fix fsync slowness with CFQ cgroups

From: Vivek Goyal <vgoyal@redhat.com>
Date: 2011-06-29 01:29:55
Also in: linux-fsdevel, lkml

On Wed, Jun 29, 2011 at 09:04:55AM +0800, Shaohua Li wrote:

[..]
> > We idle on last queue on sync-noidle tree. So we idle on fsync queue as
> > it is last queue on sync-noidle tree. That's how we provide protection
> > to all sync-noidle queues against sync-idle queues. Instead of idling
> > on individual queues we do idling in group and that is on service tree.
> Ok. but this looks silly. We are idling in a noidle service tree or a
> group (backed by the last queue of the tree or group) because we assume
> the tree or group can dispatch a request soon. But if the think time of
> the tree or group is big, the assumption isn't true. Doing idle here is
> blind. I thought we can extend the think time check for both service
> tree and group.

We can implement thinktime for the noidle service tree and for group idle
as well. That's not a problem, though I am yet to be convinced that
thinktime still makes sense for the group. I guess it will just ask: in
the past, have you done a bunch of IO with gaps between IOs of less than
8ms? If yes, then we expect you to do more IO in the future. Frankly
speaking, I am not too sure how the past IO pattern predicts the future
IO pattern of the group.
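
To make the 8ms check concrete, here is a rough user-space model of a
think time check. It is a toy sketch, not the CFQ code: the ThinkTime
class, the 7/8 decay weights and the clamp at twice the idle window are
all assumptions made for illustration. Only the 8ms window comes from
this thread.

```python
# Toy model of a think time check; not the kernel implementation.
SLICE_IDLE_MS = 8  # idle window mentioned in this thread


class ThinkTime:
    """Decaying average of the gap between one IO completing and the next arriving."""

    def __init__(self):
        self.samples = 0.0  # decayed sample weight
        self.total = 0.0    # decayed sum of gaps, in ms
        self.mean = 0.0     # total / samples

    def update(self, gap_ms):
        # Assumption of this sketch: clamp one long pause so that it
        # cannot dominate the average.
        gap_ms = min(gap_ms, 2 * SLICE_IDLE_MS)
        # Old history keeps 7/8 of its weight, the new gap gets 1/8.
        self.samples = (7 * self.samples + 1) / 8
        self.total = (7 * self.total + gap_ms) / 8
        self.mean = self.total / self.samples

    def worth_idling(self):
        # Idle only if recent IO arrived with gaps shorter than the window.
        return self.mean < SLICE_IDLE_MS


if __name__ == "__main__":
    tt = ThinkTime()
    for gap in [1, 2, 1, 3]:       # a burst of closely spaced IO
        tt.update(gap)
    print(tt.mean, tt.worth_idling())
    for _ in range(20):            # then a long run of 20ms pauses
        tt.update(20)
    print(tt.mean, tt.worth_idling())
```

The same bookkeeping could be kept per queue, per service tree or per
group. The model only records what the gaps were in the past, which is
exactly why it is unclear how well it predicts a group's future IO.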

But anyway, the point is, even if we implement it, it will not solve
the fsync issue at hand. I explained the reason in my previous mail. We
will be oscillating between high thinktime and low thinktime depending
on whether we are idling or not. There is no correlation between the
think time of the fsync thread and idling here.
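
The oscillation can be illustrated with a small simulation. This is
only a sketch under made-up numbers: it assumes that while we idle on
the fsync queue the fsync thread's next IO shows up after 16ms, and
that when we do not idle it shows up after 1ms. Both gaps, the 7/8
decay and the simulate() helper are hypothetical; only the 8ms window
is taken from this thread.

```python
# Toy feedback loop: the idling decision changes the gap that is
# measured next, which in turn changes the idling decision.
SLICE_IDLE_MS = 8         # idle window from this thread
GAP_WHEN_IDLING_MS = 16   # assumed gap while the dependent IO is held up
GAP_WHEN_NOT_IDLING = 1   # assumed gap when the dependent IO goes through


def simulate(rounds):
    """Return the idling decision (True/False) taken in each round."""
    samples = 0.0
    total = 0.0
    mean = 0.0  # no history yet, so the first decision is to idle
    decisions = []
    for _ in range(rounds):
        idling = mean < SLICE_IDLE_MS
        decisions.append(idling)
        gap = GAP_WHEN_IDLING_MS if idling else GAP_WHEN_NOT_IDLING
        # Same style of decaying average: 7/8 history, 1/8 new gap.
        samples = (7 * samples + 1) / 8
        total = (7 * total + gap) / 8
        mean = total / samples
    return decisions


if __name__ == "__main__":
    print("".join("I" if d else "-" for d in simulate(40)))
```

In this model the decision keeps flipping between idling and not
idling: the measured thinktime is a product of the idling decision
rather than a property of the fsync thread itself.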

I think you are banking on the fact that after fsync, journaling thread
IO can take more than 8ms, hence delaying the next IO from the fsync
thread and pushing its thinktime above 8ms, so that we will not idle on
the fsync thread at all. That is just one corner case, and I think it
breaks in multiple cases.

- If filesystem barriers are disabled or backend storage has battery
  backup, then journal IO will most likely go to cache and barriers
  will be ignored. In that case the write will finish almost instantly
  and we will get the next IO from the fsync thread very soon, pushing
  down the thinktime of the fsync thread. That will enable idling and we
  will be back to the problem we are trying to solve.

- Fsync thread might be submitting a string of IOs (say 10-12) before it
  moves to the journal thread to commit metadata. In that case we might
  have lowered the thinktime of fsync and hence enabled idling.
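
As a rough check of the second case, the sketch below counts how many
back-to-back IOs it takes for a decaying average to fall back under
the 8ms window. The starting mean of 16ms, the 1ms gap, the 7/8 decay
and the ios_until_idling() helper are assumptions for illustration,
not values taken from CFQ.

```python
SLICE_IDLE_MS = 8  # idle window from this thread


def ios_until_idling(start_mean_ms, gap_ms):
    """Count closely spaced IOs needed before the mean drops below the window."""
    if gap_ms >= SLICE_IDLE_MS:
        # The mean can never fall below the window with gaps this large.
        raise ValueError("gap_ms must be smaller than SLICE_IDLE_MS")
    mean = start_mean_ms
    count = 0
    while mean >= SLICE_IDLE_MS:
        # Simplified steady-state update: 7/8 history, 1/8 new gap.
        mean = (7 * mean + gap_ms) / 8
        count += 1
    return count


if __name__ == "__main__":
    print(ios_until_idling(16, 1))
```

With these assumed numbers the count comes out well below 10-12, so a
string of IOs of that length would be enough to re-enable idling before
the journal commit.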

So implementing think time for service tree/group might be a good idea
in general, but it will not solve this IO dependency issue across cgroups.

Thanks
Vivek