Re: [PATCH] Btrfs: do not async metadata csums if we have hardware crc32c
From: Chris Mason <hidden>
Date: 2012-09-24 18:58:11
On Mon, Sep 24, 2012 at 12:19:20PM -0600, Arne Jansen wrote:
> On 09/24/12 20:11, Josef Bacik wrote:
> > The reason we offload csumming is because it is CPU intensive, except it is
> > not on modern intel CPUs.  So check to see if we support hardware crc32c, and
> > if we do just do the csumming in our current threads context.  Otherwise we
> > can farm it off.  Thanks,
> >
> > Signed-off-by: Josef Bacik <redacted>
> > ---
> >  fs/btrfs/disk-io.c |   17 +++++++++++++++++
> >  1 files changed, 17 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> > index dcaf556..830b9af 100644
> > --- a/fs/btrfs/disk-io.c
> > +++ b/fs/btrfs/disk-io.c
> > @@ -31,6 +31,7 @@
> >  #include <linux/migrate.h>
> >  #include <linux/ratelimit.h>
> >  #include <asm/unaligned.h>
> > +#include <asm/cpufeature.h>
> >  #include "compat.h"
> >  #include "ctree.h"
> >  #include "disk-io.h"
> > @@ -880,6 +881,22 @@ static int btree_submit_bio_hook(struct inode *inode, int rw, struct bio *bio,
> >  	}
> >
> >  	/*
> > +	 * Pretty sure I'm going to hell for this.  If our CPU can do crc32cs in
> > +	 * the hardware then there is no reason to do the csum stuff
> > +	 * asynchronously, it will be faster to do it inline, so test to see if
> > +	 * our CPU can do hardware crc32c and if it can just do the csum in our
> > +	 * threads context.
> > +	 */
> > +#ifdef CONFIG_X86
> > +	if (cpu_has_xmm4_2) {
> > +		printk(KERN_ERR "doing it the fast way\n");
>
> You'll probably go to hell for the printk...
;)

Testing with dd on my recent Intel box, I can hardware crc32c at 1.3GB/s. Anything beyond that and you really want more CPUs jumping into the mix.

I wanted to use this test for data crcs too, but I suppose the helpers only really hurt for the synchronous IO.

-chris
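
For anyone who wants to poke at the idea outside the kernel, the check-then-dispatch logic in the quoted patch can be sketched roughly in user space as below. This is not the btrfs code. The names crc32c_sw, crc32c_hw, have_hw_crc32c and csum_dispatch are made up for illustration, and the sketch assumes GCC or Clang (for __builtin_cpu_supports and the target attribute). On non-x86 builds it simply falls back to the bitwise software loop, which stands in for the path the kernel would hand to a helper thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <nmmintrin.h>          /* _mm_crc32_u8 (SSE4.2) */
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0
#endif

/* Portable bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c_sw(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint32_t crc = 0xFFFFFFFFu;

    assert(buf != NULL || len == 0);
    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

#if HAVE_X86_CPUID
/* Same checksum using the SSE4.2 crc32 instruction, one byte at a time. */
__attribute__((target("sse4.2")))
static uint32_t crc32c_hw(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint32_t crc = 0xFFFFFFFFu;

    assert(buf != NULL || len == 0);
    while (len--)
        crc = _mm_crc32_u8(crc, *p++);
    return crc ^ 0xFFFFFFFFu;
}
#endif

/* Runtime check, loosely analogous to the cpu_has_xmm4_2 test in the patch. */
static int have_hw_crc32c(void)
{
#if HAVE_X86_CPUID
    return __builtin_cpu_supports("sse4.2") ? 1 : 0;
#else
    return 0;
#endif
}

/*
 * Hypothetical dispatcher: checksum inline when the CPU has the crc32
 * instruction, otherwise use the software loop (the case the kernel would
 * offload to helper threads).  If inline_path is non-NULL it is set to 1
 * when the hardware path was taken and 0 otherwise.
 */
static uint32_t csum_dispatch(const void *buf, size_t len, int *inline_path)
{
    int hw = have_hw_crc32c();

    if (inline_path)
        *inline_path = hw;
#if HAVE_X86_CPUID
    if (hw)
        return crc32c_hw(buf, len);
#endif
    return crc32c_sw(buf, len);
}
```

Calling csum_dispatch(buf, len, &took_hw) on a buffer returns the same CRC32C either way; only the path differs. The byte-at-a-time hardware loop is kept simple on purpose and would not reach the throughput mentioned above; a real implementation would consume 8 bytes per instruction.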