Thread (38 messages), 9 authors, 2020-05-15

Re: raid6check extremely slow ?

From: Piergiorgio Sartor <hidden>
Date: 2020-05-12 16:09:43

On Mon, May 11, 2020 at 11:44:11PM +0100, Peter Grandi wrote:
quoted
quoted
quoted
With lock / unlock, I get around 1.2MB/sec per device
component, with ~13% CPU load.  Without lock / unlock, I get
around 15.5MB/sec per device component, with ~30% CPU load.
quoted
quoted
[...] we still need to avoid race conditions. [...]
Not all race conditions are equally bad in this situation.
quoted
1. Per your previous reply, only call raid6check when array is
RO, then we don't need the lock.
2. Investigate whether it is possible to acquire stripe_lock in
suspend_lo/hi_store [...]
Some other ways could be considered:

* Read a stripe without locking and check it; if it checks good,
  no problem, else either it was modified during the read, or it
  was faulty, so acquire a W lock, reread and recheck it (it
  could have become good in the meantime).

  The assumption here is that there is a modest write load from
  applications on the RAID set, so the check will almost always
  succeed, and it is worth rereading the stripe in very rare
  cases of "collisions" or faults.

* Variants, like acquiring a W lock (if possible) on the stripe
  solely while reading it ("atomic" read, which may be possible
  in other ways without locking) and then if check fails we know
  it was faulty, so optionally acquire a new W lock and reread
  and recheck it (it could have become good in the meantime).

  The assumption here is that the write load is less modest, but
  there are a lot more reads than writes, so a W lock only
  during read will eliminate the rereads and rechecks from
  relatively rare "collisions".
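
The first option above (check without the lock, and only on a
mismatch take the W lock, reread and recheck) can be sketched
roughly as follows. This is a toy model, not raid6check code:
the Stripe class, check_stripe() and the stats counters are
hypothetical names, a single XOR parity stands in for the RAID6
P/Q syndromes, and threading.Lock stands in for whatever
mechanism actually provides the stripe W lock.

```python
import threading
from functools import reduce
from operator import xor


class Stripe:
    """Toy stripe: data chunks plus one XOR parity (stand-in for P/Q)."""

    def __init__(self, data):
        self.lock = threading.Lock()  # stand-in for the per-stripe W lock
        self.data = list(data)
        self.parity = reduce(xor, self.data, 0)

    def read(self):
        # Plain snapshot of the stripe. On a real array an unlocked read
        # could race with a writer and return a mix of old and new chunks.
        return list(self.data), self.parity


def is_consistent(data, parity):
    return reduce(xor, data, 0) == parity


def check_stripe(stripe, stats):
    """Return True if the stripe checks good, False if it is really faulty."""
    # 1. Optimistic pass: read and check without taking the lock.
    data, parity = stripe.read()
    if is_consistent(data, parity):
        stats["fast"] += 1
        return True
    # 2. Mismatch: either a write collided with the read or the stripe is
    #    faulty. Take the lock, reread and recheck to tell the two apart.
    with stripe.lock:
        stats["locked"] += 1
        data, parity = stripe.read()
        return is_consistent(data, parity)


stats = {"fast": 0, "locked": 0}
good = Stripe([1, 2, 3])
bad = Stripe([1, 2, 3])
bad.parity ^= 0xFF  # simulate on-disk corruption of the parity chunk
print(check_stripe(good, stats), check_stripe(bad, stats), stats)
# prints: True False {'fast': 1, 'locked': 1}
```

With a modest write load nearly every stripe takes the "fast"
path, so the lock cost is paid only for the rare collision or
genuine fault.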
The locking method was suggested by Neil,
I'm not aware of other methods.

About the check -> maybe lock -> re-check,
it is a possible workaround, but I find it
a bit extreme.

In any case, we should keep it in mind.

bye,

pg
 
The case where a large application write load on the RAID set
coincides with checking is hard to improve; probably the best
that can be done is to eliminate rereads and rechecks by just
acquiring the stripe W lock for the whole duration of read and
check.
-- 

piergiorgio