Re: O_DIRECT to md raid 6 is slow
From: Andy Lutomirski <luto@amacapital.net>
Date: 2012-08-16 01:09:09
Also in: lkml
On Wed, Aug 15, 2012 at 4:50 PM, Stan Hoeppner [off-list ref] wrote:
On 8/15/2012 5:10 PM, Andy Lutomirski wrote:
On Wed, Aug 15, 2012 at 3:00 PM, Stan Hoeppner [off-list ref] wrote:
On 8/15/2012 12:57 PM, Andy Lutomirski wrote:
On Wed, Aug 15, 2012 at 4:50 AM, John Robinson [off-list ref] wrote:
On 15/08/2012 01:49, Andy Lutomirski wrote:
If I do:
# dd if=/dev/zero of=/dev/md0p1 bs=8M
[...]
Grr. I thought the bad old days of filesystem and related defaults sucking were over.
The previous md chunk default of 64KB wasn't horribly bad, though still maybe a bit high for a lot of common workloads. I didn't have eyes/ears on the discussion and/or testing process that led to the 'new' 512KB default. Obviously something went horribly wrong here. 512KB isn't a show stopper as a default for 0/1/10, but is 8-16 times too large for parity RAID.
cryptsetup aligns sanely these days, xfs is sensible, etc.
XFS won't align with the 512KB chunk default of metadata 1.2. The largest XFS journal stripe unit (su--chunk) is 256KB, and even that isn't recommended. Thus mkfs.xfs throws an error due to the 512KB stripe. See the md and xfs archives for more details, specifically Dave Chinner's colorful comments on the md 512KB default.
Heh -- that's why the math didn't make any sense :)
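For readers following the chunk-size argument, here is a rough sketch of the arithmetic, not anything taken from md itself. On parity RAID a write avoids read-modify-write only when it covers whole stripes, and a full stripe is the chunk size times the number of data disks. The disk counts (6 and 7) and the helper names `full_stripe_bytes` and `is_full_stripe_write` are hypothetical, chosen only for illustration; the thread does not state the array geometry. The check also assumes the write starts on a stripe boundary.

```python
KIB = 1024
MIB = 1024 * KIB


def full_stripe_bytes(n_disks, chunk_bytes, parity_disks=2):
    """Bytes of data in one full stripe (RAID 6 has two parity disks)."""
    data_disks = n_disks - parity_disks
    if data_disks < 1:
        raise ValueError("need at least one data disk")
    return data_disks * chunk_bytes


def is_full_stripe_write(write_bytes, n_disks, chunk_bytes, parity_disks=2):
    """True if a stripe-aligned write of write_bytes covers only whole stripes."""
    stripe = full_stripe_bytes(n_disks, chunk_bytes, parity_disks)
    return write_bytes % stripe == 0


if __name__ == "__main__":
    write = 8 * MIB  # matches the bs=8M in the dd command quoted above
    for n_disks in (6, 7):  # hypothetical array sizes
        for chunk in (64 * KIB, 512 * KIB):  # old and new md defaults
            stripe = full_stripe_bytes(n_disks, chunk)
            ok = is_full_stripe_write(write, n_disks, chunk)
            print(f"{n_disks} disks, {chunk // KIB}KB chunk: "
                  f"stripe={stripe // KIB}KB, 8M is whole stripes: {ok}")
```

The larger the chunk, the larger the full stripe, so fewer I/O sizes happen to cover whole stripes and more writes fall back to read-modify-write.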
wtf? <rant>Why is there no sensible filesystem for huge disks? zfs can't cp --reflink and has all kinds of source availability and licensing issues, xfs can't dedupe at all, and btrfs isn't nearly stable enough.</rant>
Deduplication isn't a responsibility of a filesystem. TTBOMK there are two, and only two, COW filesystems in existence: ZFS and BTRFS. And these are the only two to offer a native dedupe capability. They did it because they could, with COW, not necessarily because they *should*. There are dozens of other single node, cluster, and distributed filesystems in use today and none of them support COW, and thus none support dedup. So to *expect* a 'sensible' filesystem to include dedupe is wishful thinking at best.
I should clarify my rant for the record. I don't care about in-fs dedupe. I want COW so userspace can dedupe and generally replace hardlinks with sensible cowlinks. I'm also working on some fun tools that *require* reflinks for anything resembling decent performance.
--Andy
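To illustrate what "userspace can dedupe with reflinks" might look like, below is a minimal sketch and not the tools mentioned above. It asks the kernel to make one file share extents with another via the Linux clone ioctl (the same operation `cp --reflink` uses) and falls back to an ordinary copy when the filesystem does not support it. The helper name `clone_or_copy` is hypothetical, and the `FICLONE` constant is hard-coded on the assumption of the common Linux ioctl encoding (x86 and most other architectures); a few architectures encode it differently.

```python
import fcntl
import shutil

# Assumption: _IOW(0x94, 9, int) as encoded on x86 and most Linux
# architectures (originally known as BTRFS_IOC_CLONE).
FICLONE = 0x40049409


def clone_or_copy(src_path, dst_path):
    """Reflink src to dst if possible; otherwise make a plain copy.

    Returns True when the clone ioctl succeeded (dst shares extents
    with src, copy-on-write), False when a regular copy was made.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return True
        except OSError:
            # Filesystem lacks reflink support, files are on different
            # filesystems, or the ioctl is unknown on this platform.
            pass
    shutil.copyfile(src_path, dst_path)
    return False
```

Unlike a hardlink, a reflinked file has its own inode, so later writes to one copy do not show up in the other; only unmodified blocks stay shared.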