Thread: 36 messages, 14 authors, 2011-05-07

Re: mdadm raid1 read performance

From: Roberto Spadim <hidden>
Date: 2011-05-04 23:57:00

2011/5/4 NeilBrown [off-list ref]:
On Thu, 5 May 2011 00:08:59 +0100 Liam Kurmos [off-list ref] wrote:
quoted
Thanks to all who replied on this.

I somewhat naively assumed that having 2 disks with the same data
would mean a similar read speed to raid0 should be the norm (and i
think this is a very popular misconception).
I was neglecting the seek time to skip alternate blocks, which i guess
must be the flaw.

In theory though if i was reading a larger file, couldn't one disk
start reading at the beginning to a buffer and one start reading from
half way (assuming 2 disks) and hence get close to 2x single disk
speed?
If you write your program to read from both the beginning and the middle
then you might get double-speed.  The kernel doesn't know you are going to do
this so the best it can do is read ahead in large amounts.
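
A minimal sketch of the "read from both the beginning and the middle" idea, using two concurrent dd processes. The helper name read_halves is hypothetical, and whether md/raid1 actually serves the two streams from different disks depends on its read balancing, so this illustrates the access pattern only, not a guaranteed speed-up.

```shell
#!/bin/sh
# Sketch only: read a file or block device as two concurrent streams,
# one starting at offset 0 and one starting at the midpoint.
# read_halves is a hypothetical helper name, not an existing tool.
read_halves() {
    target=$1            # file or block device to read, e.g. /dev/md0
    total_mib=$2         # total amount to read, in MiB
    half=$((total_mib / 2))

    # First reader: first half, starting at the beginning.
    dd if="$target" of=/dev/null bs=1M count="$half" 2>/dev/null &
    pid1=$!

    # Second reader: remaining part, starting at the midpoint.
    # skip= is counted in bs-sized input blocks.
    dd if="$target" of=/dev/null bs=1M skip="$half" \
        count="$((total_mib - half))" 2>/dev/null &
    pid2=$!

    # Wait for both readers and report failure if either one failed.
    rc1=0; wait "$pid1" || rc1=$?
    rc2=0; wait "$pid2" || rc2=$?
    [ "$rc1" -eq 0 ] && [ "$rc2" -eq 0 ]
}
```

For example, `time read_halves /dev/md0 1000` could be compared against the single-stream dd run used elsewhere in this thread. Data already in the page cache will distort such a comparison.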

raid1 could notice large reads and send some to one disk and some to another,
but the size for each device must be large enough that the time to seek over
must be much less than the time to read, which is probably many megabytes on
today's hardware - and raid1 has no way to know what that size is.

Certainly it is possible that the read_balance code in md/raid1 could be
improved.  As yet no-one has improved it and provided convincing performance
numbers.
yes, it's not a 10000% improvement; i got a max of 1% on a big test (1
hour of non-sequential reads). for ssd, round robin allows more use of the
drives and gives some improvement. since i don't know how to get the
hardware/software queue size, i couldn't improve the code to select the
'best' disk (the disk that should return in the least time), but the
benchmark results were interesting, since 1% was 1% three times
(60 minutes dropped to 54 minutes)

it could be very interesting to get information about each disk and
tune read balance automatically
information: access time (RPM information can help here), MB/s in a
sequential read (depends on RPM + disk size (1.8", 2.5", 3.5") + interface
(SATA1, SATA2, SAS), since SATA1 can't allow more than 1.5Gb/s),
rotational/non-rotational information
difference between rotational and non-rotational:
rotational: access time proportional to block distance (head arm /
disk position)
non-rotational: fixed access time with low variation
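
One of the items listed above, the rotational/non-rotational flag, is exposed by Linux kernels that provide the queue/rotational attribute in sysfs. The sketch below reads it from a shell. The helper name is_rotational and its optional second argument (an alternative sysfs root, used only for testing) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: report whether a block device is flagged as rotational.
# is_rotational is a hypothetical helper name.
# Exit status: 0 = rotational, 1 = non-rotational, 2 = attribute not readable.
is_rotational() {
    dev=$1                          # device name without /dev/, e.g. sda
    sysroot=${2:-/sys/block}        # overridable so a fake tree can be used
    flag_file="$sysroot/$dev/queue/rotational"

    [ -r "$flag_file" ] || return 2

    # The kernel writes 1 for rotational media and 0 otherwise.
    [ "$(cat "$flag_file")" = "1" ]
}
```

A caller could then branch on it, for example `if is_rotational sda; then echo hdd; else echo "ssd or unknown"; fi`.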

quoted
as a separate question, what should be the theoretical performance of raid5?
x(N-1)

So a 4 drive RAID5 should read at 3 times the speed of a single drive.
quoted
in my tests i read 1GB and throw away the data.
dd if=/dev/md0 of=/dev/null bs=1M count=1000

With 4 fairly fast hdd's i get
Which apparently do 140MB/s:
quoted
raid0: ~540MB/s
I would expect 4*140 == 560, so this is a good result.
quoted
raid10: 220MB/s
Assuming the default 'n2' layout, I would expect 2*140 or 280, so this is a
little slow.  Try "--layout=f2" and see what you get (should be more like
RAID0).
quoted
raid5: ~165MB/s
I would expect 3*140 or 420, so this is very slow.  I wonder if read-ahead is
set badly.
Can you:
  blockdev --getra /dev/md0
multiply the number it gives you by 8 and give it back with
  blockdev --setra NUMBER /dev/md0
very nice :)
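
The read-ahead adjustment suggested above can be done in one step. This is a sketch: scale_readahead is a hypothetical helper name, the factor of 8 is the one suggested in this thread, and the commands that change the device are left commented out because they need root.

```shell
#!/bin/sh
# Sketch only: compute a new read-ahead value from the current one.
# scale_readahead is a hypothetical helper name.
scale_readahead() {
    current=$1           # value printed by: blockdev --getra DEVICE
    factor=${2:-8}       # multiplier; 8 is the value suggested in the thread
    echo $((current * factor))
}

# Applying it to a device (needs root), left commented out on purpose:
# dev=/dev/md0
# blockdev --setra "$(scale_readahead "$(blockdev --getra "$dev")")" "$dev"
```

blockdev reports read-ahead in 512-byte sectors, so a current value of 256 (128 KiB) would become 2048 (1 MiB).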
quoted
raid1: ~140MB/s  (single disk speed)
as expected.
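
The expected figures in this message are simple multiples of the single-disk speed. The sketch below recomputes them and shows each measured result as a percentage of the expectation. The helper name efficiency is hypothetical, and the percentages use integer arithmetic, so they are truncated.

```shell
#!/bin/sh
# Sketch only: expected sequential read speeds for 4 disks at 140 MB/s each,
# using the multipliers given in this thread.
single=140   # MB/s for one disk, as quoted in the thread
disks=4

# efficiency MEASURED EXPECTED: measured speed as a truncated integer
# percentage of the expected speed. Hypothetical helper name.
efficiency() {
    echo $(( $1 * 100 / $2 ))
}

raid0_expected=$(( disks * single ))          # all disks in parallel
raid10_expected=$(( 2 * single ))             # default n2 layout
raid5_expected=$(( (disks - 1) * single ))    # N-1 data disks
raid1_expected=$single                        # single stream, one disk

echo "raid0:  expected $raid0_expected, measured 540 ($(efficiency 540 "$raid0_expected")%)"
echo "raid10: expected $raid10_expected, measured 220 ($(efficiency 220 "$raid10_expected")%)"
echo "raid5:  expected $raid5_expected, measured 165 ($(efficiency 165 "$raid5_expected")%)"
echo "raid1:  expected $raid1_expected, measured 140 ($(efficiency 140 "$raid1_expected")%)"
```

Seen this way, the raid5 result stands out: it reaches well under half of its expected 420 MB/s, which is why read-ahead is the first thing to check.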
quoted
for 4 disks raid0 seems like suicide, but for my system drive the
speed advantage is so great I'm tempted to try it anyway and
use rsync to keep a constant backup.
If you have somewhere to rsync to, then you have more disks so RAID10 might
be an answer... but I suspect you cannot move disks around that freely :-)

NeilBrown


quoted
cheers for your responses,

Liam
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


-- 
Roberto Spadim
Spadim Technology / SPAEmpresarial