Re: [BUG] 2.6.0-test4-mm1: NFS+XFS=data corruption
From: Andrew Morton <hidden>
Date: 2003-08-26 10:11:37
Also in:
lkml
Steve Lord [off-list ref] wrote:
> > Is this enough information to help find the cause of the bug? If not,
> > it might be several days (if I'm unlucky, maybe even a week or two)
> > before I have time to do anything more...
> >
>
> -mm kernels have O_DIRECT-for-NFS patches in them.  And some versions of
> RPM use O_DIRECT.  Whether O_DIRECT makes any difference at the server end
> I do not know, but it would be useful if you could repeat the test on stock
> 2.6.0-test4.
>
> Alternatively, run
>
> 	export LD_ASSUME_KERNEL=2.2.5
>
> before running RPM.  I think that should tell RPM to not try O_DIRECT.

I doubt the NFS client is O_DIRECT capable here, I have run some rpm
builds over nfs to 2.6.0-test4 and an xfs filesystem, everything is
behaving so far. I will try mm1 tomorrow.

Do we know if this NFS V3 or V2 by the way?

OK, sorry for the noise.  It appears that this is due to the AIO patches in
-mm.  fsx-linux fails instantly on nfsv3 to localhost on XFS.  It's OK on
ext2 for some reason.

Binary searching reveals that the offending patch is
O_SYNC-speedup-nolock-fix.patch

testcase:

	mkfs.xfs -f /dev/hda5
	mount /dev/hda5 /mnt/hda5
	chmod a+rw /mnt/hda5

	service nfs start
	mount localhost:/mnt/hda5 /mnt/localhost
	cd /mnt/localhost
	fsx-linux foo

truncating to largest ever: 0x13e76
READ BAD DATA: offset = 0x18f13, size = 0xee06, fname = foo
OFFSET	GOOD	BAD	RANGE
0x26000	0x02eb	0x0000	0x    0
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x26001	0xeb02	0x0000	0x    1
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x26002	0x0228	0x0000	0x    2
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
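
For a quicker sanity check than a full fsx run, the general shape of the
failure above (expected data in the GOOD column, zeros read back, with fsx
pointing at HOLE and EXTEND ops) can be approximated with a plain
write/extend/read-back comparison. This is only a rough sketch, not a
replacement for fsx-linux, and it is untested against this particular bug.
The function name check_extend and the 64 KiB / 256 KiB sizes are
hypothetical choices made for illustration; it assumes dd, cmp and mktemp
are available.

```shell
#!/bin/sh
# check_extend DIR
# Hypothetical helper (not part of fsx-linux): write random data into
# DIR/foo, extend the file so it ends in a hole, then compare the result
# against a reference copy built the same way on local temp storage.
check_extend() {
    dir=$1
    ref=$(mktemp) || return 2

    # 64 KiB of random reference data.
    dd if=/dev/urandom of="$ref" bs=1024 count=64 2>/dev/null

    # Copy the same data to the filesystem under test.
    dd if="$ref" of="$dir/foo" bs=1024 2>/dev/null

    # Extend both files to 256 KiB without writing any data. With no input
    # and seek=256, dd just sets the file size, leaving a hole at the end.
    dd if=/dev/null of="$ref" bs=1024 seek=256 2>/dev/null
    dd if=/dev/null of="$dir/foo" bs=1024 seek=256 2>/dev/null

    # Read everything back and compare byte for byte.
    if cmp "$ref" "$dir/foo"; then
        echo "OK"
        rc=0
    else
        echo "MISMATCH"
        rc=1
    fi
    rm -f "$ref"
    return $rc
}
```

Running check_extend /mnt/localhost against the NFS mount and then
check_extend /mnt/hda5 directly against XFS would show whether a mismatch
only appears when going through NFS.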