Thread (11 messages), 5 authors, 2012-08-16

Re: O_DIRECT to md raid 6 is slow

From: Andy Lutomirski <luto@amacapital.net>
Date: 2012-08-16 01:09:09
Also in: lkml

On Wed, Aug 15, 2012 at 4:50 PM, Stan Hoeppner [off-list ref] wrote:
On 8/15/2012 5:10 PM, Andy Lutomirski wrote:
quoted
On Wed, Aug 15, 2012 at 3:00 PM, Stan Hoeppner [off-list ref] wrote:
quoted
On 8/15/2012 12:57 PM, Andy Lutomirski wrote:
quoted
On Wed, Aug 15, 2012 at 4:50 AM, John Robinson
[off-list ref] wrote:
quoted
On 15/08/2012 01:49, Andy Lutomirski wrote:
quoted
If I do:
# dd if=/dev/zero of=/dev/md0p1 bs=8M
[...]
Grr.  I thought the bad old days of filesystem and related defaults
sucking were over.
The previous md chunk default of 64KB wasn't horribly bad, though still
maybe a bit high for a lot of common workloads.  I didn't have eyes/ears
on the discussion and/or testing process that led to the 'new' 512KB
default.  Obviously something went horribly wrong here.  512KB isn't a
show stopper as a default for 0/1/10, but is 8-16 times too large for
parity RAID.
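
To make the chunk-size point concrete, here is a minimal sketch of the full-stripe arithmetic for parity RAID. The disk counts are hypothetical (the array geometry is not given in this message), and the helper names are made up for illustration. It assumes the usual layout where a full stripe is one chunk on each data disk, and a write avoids read-modify-write only if it starts on a stripe boundary and covers whole stripes.

```python
KIB = 1024
MIB = 1024 * KIB


def full_stripe_bytes(n_disks, chunk_bytes, parity_disks=2):
    """Data bytes in one full stripe: one chunk per data disk.

    parity_disks=2 models RAID 6; use 1 for RAID 5.
    """
    data_disks = n_disks - parity_disks
    if data_disks < 1:
        raise ValueError("need at least one data disk")
    return data_disks * chunk_bytes


def is_full_stripe_write(io_bytes, offset_bytes, n_disks, chunk_bytes,
                         parity_disks=2):
    """True if the write starts on a stripe boundary and spans whole stripes."""
    stripe = full_stripe_bytes(n_disks, chunk_bytes, parity_disks)
    return offset_bytes % stripe == 0 and io_bytes % stripe == 0


if __name__ == "__main__":
    # Hypothetical array sizes; the 8 MiB write size matches dd bs=8M.
    for n_disks in (6, 7):
        for chunk in (64 * KIB, 512 * KIB):
            stripe = full_stripe_bytes(n_disks, chunk)
            aligned = is_full_stripe_write(8 * MIB, 0, n_disks, chunk)
            print(n_disks, "disks,", chunk // KIB, "KiB chunk:",
                  stripe // KIB, "KiB stripe, 8 MiB write aligned:", aligned)
```

With a 512KB chunk the full stripe grows to several megabytes, so fewer real-world writes cover whole stripes. Note that this checks offsets relative to the start of the array; a partition such as md0p1 adds its own start offset, which also has to fall on a stripe boundary.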
quoted
cryptsetup aligns sanely these days, xfs is
sensible, etc.
XFS won't align with the 512KB chunk default of metadata 1.2.  The
largest XFS journal stripe unit (su--chunk) is 256KB, and even that
isn't recommended.  Thus mkfs.xfs throws an error due to the 512KB
stripe.  See the md and xfs archives for more details, specifically Dave
Chinner's colorful comments on the md 512KB default.
Heh -- that's why the math didn't make any sense :)
quoted
wtf?  <rant>Why is there no sensible filesystem for
huge disks?  zfs can't cp --reflink and has all kinds of source
availability and licensing issues, xfs can't dedupe at all, and btrfs
isn't nearly stable enough.</rant>
Deduplication isn't a responsibility of a filesystem.  TTBOMK there are
two, and only two, COW filesystems in existence:  ZFS and BTRFS.  And
these are the only two to offer a native dedupe capability.  They did it
because they could, with COW, not necessarily because they *should*.
There are dozens of other single node, cluster, and distributed
filesystems in use today and none of them support COW, and thus none
support dedup.  So to *expect* a 'sensible' filesystem to include dedupe
is wishful thinking at best.
I should clarify my rant for the record.  I don't care about in-fs
dedupe.  I want COW so userspace can dedupe and generally replace
hardlinks with sensible cowlinks.  I'm also working on some fun tools
that *require* reflinks for anything resembling decent performance.

--Andy