Thread (41 messages) 41 messages, 5 authors, 2012-01-06

Re: [PATCH block:for-3.3/core] cfq: merged request shouldn't jump to a different cfqq

From: Shaohua Li <hidden>
Date: 2012-01-06 02:59:37
Also in: linux-next, linux-scsi, lkml

On Thu, 2012-01-05 at 18:36 -0800, Tejun Heo wrote:
Hello, again.

On Thu, Jan 05, 2012 at 06:17:07PM -0800, Tejun Heo wrote:
quoted
When two requests are merged, if the absorbed request is older than
the absorbing one, cfq_merged_requests() tries to reposition it in the
cfqq->fifo list by list_move()'ing the absorbing request to the
absorbed one before removing it.

This works if both requests are on the same cfqq but nothing
guarantees that and the code ends up moving the merged request to a
different cfqq's fifo list without adjusting the rest.  This leads to
the following failures.

* A request may be on the fifo list of a cfqq without holding
  reference to it and the cfqq can be freed before requst is finished.
  Among other things, this triggers list debug warning and slab debug
  use-after-free warning.

* As a request can be on the wrong fifo queue, it may be issued and
  completed before its cfqq is scheduled.  If the cfqq didn't have
  other requests on it, it would be empty by the time it's dispatched
  triggering BUG_ON() in cfq_dispatch_request().

Fix it by making cfq_merged_requests() scan the absorbing request's
fifo list for the correct slot and move there instead.
Hmmm... while the patch would fix the problem.  It isn't entirely
correct.  The root cause is,

1. q->last_merge and rqhash used to be used only for merging bios into
   requests and that queries elevator whether the merge should be
   allowed.  cfq disallows merging if they belong to different cfqqs.

2. request-request merging didn't use to use q->last_merge or rqhash to
   find request candidates.  It used elv_former/latter_request() and
   cfq never returned request from a different cfqq.

3. Plug merging started using q->last_merge and rqhash and now
   elevator can't prevent cross cfqq merges.

So, yeah, the right fix would be using elv_former/latter_request()
instead.  Maybe we should strip out rqhash altogether and change
elevator handle everything?  I don't know.  I'll prepare a different
fix patch soon.
So not allow merge from two cfq queues strictly? This will impact
performance. I don't know how important the strict isolation is. we even
allow two cfq queues merge to improve performance.
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help