Re: Very long raid5 init/rebuild times

From: Stan Hoeppner <hidden>
Date: 2014-01-29 00:56:32

On 1/28/2014 10:50 AM, Marc MERLIN wrote:

On Tue, Jan 28, 2014 at 01:46:28AM -0600, Stan Hoeppner wrote:

quoted

Today, I don't use PMPs anymore, except for some enclosures where it's easy
to just have one cable and where what you describe would need 5 sata cables
to the enclosure, would it not?

No.  For external JBOD storage you go with an SAS expander unit instead
of a PMP.  You have a single SFF 8088 cable to the host which carries 4
SAS/SATA channels, up to 2.4 GB/s with 6G interfaces.
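
For anyone wanting to check that 2.4 GB/s figure, here is a rough sketch of the arithmetic. It assumes 8b/10b line encoding (which SATA and 6G SAS use), so every 10 bits on the wire carry one byte of payload. The function name is made up for illustration, and protocol overhead beyond the encoding is ignored.

```python
def usable_mb_per_s(line_rate_gbps: float, lanes: int = 1) -> float:
    """Approximate payload bandwidth in MB/s for 8b/10b encoded links."""
    # 8b/10b: 10 bits on the wire carry 8 data bits, i.e. one byte
    bytes_per_s_per_lane = line_rate_gbps * 1e9 / 10
    return lanes * bytes_per_s_per_lane / 1e6


# One SFF 8088 cable carrying 4 lanes at 6 Gb/s
print(usable_mb_per_s(6, lanes=4))  # 2400.0 MB/s, i.e. 2.4 GB/s
```

Real-world throughput will land somewhat below this ceiling once framing and command overhead are counted.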

 
Yeah, I know about those, but I have 5 drives in my enclosures, so that's
one short :)

I think you misunderstood.  I was referring to a JBOD chassis with SAS
expander, up to 32 drives, typically 12-24 drives with two host or two
daisy chain ports.  Maybe an example would help here.

http://www.newegg.com/Product/Product.aspx?Item=N82E16816133047

Obviously this is in a different cost category, and not typical for
consumer use.  Smaller units are available for less $$ but you pay more
per drive, as the expander board is the majority of the cost.  Steel and
plastic are cheap, as are PSUs.

quoted

I generally agree. Here I was using it to transfer data off some drives, but
indeed I wouldn't use this for a main array.

Your original posts left me with the impression that you were using this
as a production array.  Apologies for not digesting those correctly.

 
I likely wasn't clear, sorry about that.

quoted

You don't get extra performance.  You expose the performance you already
have.  Serial submission typically doesn't reach peak throughput.  Both
the resync operation and dd copy are serial submitters.  You usually
must submit asynchronously or in parallel to reach maximum throughput.
Being limited by a PMP it may not matter.  But with your direct
connected drives of your production array you should see a substantial
increase in throughput with parallel submission.

I agree, it should be faster.

quoted

[global]
directory=/some/directory
zero_buffers
numjobs=4
group_reporting
blocksize=1024k
ioengine=libaio
iodepth=16
direct=1
size=1g

[read]
rw=read
stonewall

[write]
rw=write
stonewall

Yeah, I have fio, didn't seem needed here, but I'll give it a shot when I
get a chance.

With your setup and its apparent hardware limitations, parallel
submission may not reveal any more performance.  On the vast majority of
systems it does.
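
To illustrate what "serial" versus "parallel" submission means, here is a minimal Python sketch. It is not what fio or dd do internally: it uses a thread pool and plain buffered reads (no libaio, no O_DIRECT), and the function names are hypothetical. Point it at a large file on the array that is not already cached if you want the timings to mean anything.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

BLOCK = 1024 * 1024  # 1 MiB, same block size as the fio job


def read_serial(path, block=BLOCK):
    """Issue one read at a time, like dd does."""
    total = 0
    offset = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            buf = os.pread(fd, block, offset)
            if not buf:
                break
            total += len(buf)
            offset += len(buf)
    finally:
        os.close(fd)
    return total


def read_parallel(path, workers=4, block=BLOCK):
    """Keep several reads in flight at once using a thread pool."""
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            chunks = pool.map(lambda off: os.pread(fd, block, off),
                              range(0, size, block))
            return sum(len(c) for c in chunks)
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Small throwaway file so the sketch runs anywhere; it will be
    # served from cache, so these timings are not a benchmark.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(8 * BLOCK))
    try:
        for fn in (read_serial, read_parallel):
            start = time.perf_counter()
            nbytes = fn(tmp.name)
            print(fn.__name__, nbytes, "bytes in",
                  round(time.perf_counter() - start, 4), "s")
    finally:
        os.unlink(tmp.name)
```

Both functions read the same bytes. The only difference is how many requests are outstanding at a time, which is what numjobs and iodepth control in the fio job.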

fio said:
Run status group 0 (all jobs):
   READ: io=4096.0MB, aggrb=77695KB/s, minb=77695KB/s, maxb=77695KB/s, mint=53984msec, maxt=53984msec

Run status group 1 (all jobs):
  WRITE: io=4096.0MB, aggrb=77006KB/s, minb=77006KB/s, maxb=77006KB/s, mint=54467msec, maxt=54467msec

Something is definitely not right if parallel FIO submission is ~25%
lower than single submission dd.  But you were running your dd tests
through buffer cache IIRC.  This FIO test uses O_DIRECT.  So it's not
apples to apples.  When testing IO throughput one should also bypass
buffer cache.
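
As a sanity check on the fio summary lines quoted above, the aggrb figure is just total IO divided by runtime. A quick sketch, assuming this fio version reports "MB" and "KB" as 1024-based units (the numbers work out that way); the helper name is made up:

```python
def aggr_kb_per_s(io_mb: float, runtime_msec: int) -> float:
    """Aggregate bandwidth in KB/s from fio's io= and mint/maxt= fields."""
    # Assumes 1024-based units: 1 MB = 1024 KB
    return io_mb * 1024 / (runtime_msec / 1000)


read_kbs = aggr_kb_per_s(4096.0, 53984)   # READ group
write_kbs = aggr_kb_per_s(4096.0, 54467)  # WRITE group

print(int(read_kbs), int(write_kbs))  # 77695 77006, matching aggrb
print(round(read_kbs / 1024, 1), round(write_kbs / 1024, 1))  # roughly 75.9 and 75.2
```

So both directions come in at about 75 to 76 MB/s aggregate across the 4 jobs.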

quoted

Of course, I'm not getting that speed, but again, I'll look into it.

Yeah, something's definitely up with that.  All drives are 3G sync, so
you 'should' have 300 MB/s data rate through the PMP.

Right.

quoted

Thanks for your suggestions for tweaks.

No problem Marc.  Have you noticed the right hand side of my email
address? :)  I'm kinda like a dog with a bone when it comes to hardware
issues.  Apologies if I've been a bit too tenacious with this.

I had not :) I usually try to optimize stuff as much as possible when it's
worth it or when I really care and have time. I agree this one is puzzling
me a bit and even if it's fast enough for my current needs and the time I
have right now, I'll try and move it to another system to see. I'm pretty
sure that one system has a weird bottleneck.

Yeah, something's definitely not right.  Your RAID throughput is less than
a single 7.2K SATA drive.  It's probably just something funky with that
JBOD chassis.

-- 
Stan
