Re: [RFC 11/32] xfs: convert to struct inode_time
From: Dave Chinner <david@fromorbit.com>
Date: 2014-05-31 05:54:57
Also in:
linux-fsdevel, linux-xfs, lkml
[ Please don't top post. ]

On Fri, May 30, 2014 at 06:22:55PM -0700, H. Peter Anvin wrote:
> On May 30, 2014 6:14:50 PM PDT, Dave Chinner [off-list ref] wrote:
> > On Fri, May 30, 2014 at 05:41:14PM -0700, H. Peter Anvin wrote:
> > > On 05/30/2014 05:37 PM, Dave Chinner wrote:
> > > > IOWs, the filesystem has to be able to reject any attempt to set a
> > > > timestamp that it can't represent on disk otherwise Bad Stuff will
> > > > happen,
> > >
> > > Actually it is questionable if it is worse to reject a timestamp or
> > > just let it wrap. Rejecting a valid timestamp is a bit like "You
> > > don't exist, go away."
> >
> > I think having the new system calls being able to return EINVAL if
> > the value cannot be stored permanently on disk correctly is the right
> > thing to do. Having it silently mangled by the filesystem and
> > returning "everything is just fine, trust me" is close to the worst
> > solution I can think of. That's exactly what leads to overflow bugs
> > occurring....
> >
> > > > and filesystems have to be able to specify in their on disk format
> > > > what timestamp encoding is being used. The solution will be
> > > > different for every filesystem that needs to support time beyond
> > > > 2038.
> > >
> > > Actually the cutoff can be really different for each filesystem, not
> > > necessarily 2038. However, I maintain the above still holds.
> >
> > Sure, but all filesystems are supposed to handle at least the current
> > unix epoch.
> >
> > > Consider a filesystem that kept timestamps in YYMMDDHHMMSS format.
> > > What would you have expected such a filesystem to do on Jan 1, 2000?
> >
> > Strawman. We don't need to cater for fundamentally broken designs that
> > can't even handle the current unix epoch correctly. If such
> > filesystems exist, then they can simply say "original unix epoch
> > support only" and do whatever crap they are doing right now.
>
> No, not a strawman. Replace with Jan 26, 2038 and you have the same
> situation.

But that's not the problem I'm talking about. The problem isn't the roll-over date of the epoch - the problem is that we're changing the in-memory meaning of time without changing what the filesystems store on disk or how they translate them.

To use your example, what I'm actually talking about is the kernel switching to CCYYMMDDHHMMSS while the filesystem has YYMMDDHHMMSS on disk. The filesystem doesn't know the timestamp is now a different format, so it could mangle it writing it to disk, or it could mangle existing timestamps in the YY.. format reading them from disk and putting them into CC.. format structures.

IOWs, it will incorrectly translate YY format dates to CC format, or translate something in the CC format as though it was in YY format. And it wouldn't even know what was the correct format because there's nothing telling it on disk whether the date is in CC or YY format.

Either way, you get mangled timestamps, the filesystem doesn't know about it because it's just storing what the kernel gives it, the kernel thinks they are fine because they are just opaque when read back, but the user says "what the fuck did a reboot do to all these timestamps?".

Hence your example of roll-over dates is a strawman - you've constructed a problem that is irrelevant to the issue being pointed out.

FWIW, we already have code in the superblock and VFS to avoid such problems on filesystems with limited timestamp resolution (i.e. s_time_gran and current_fs_time()) so that what the VFS hands the filesystem is exactly what the VFS expects to get back from disk when comparing timestamps.

If we are changing the in-kernel timestamp to have a greater dynamic range than anything we currently support on disk, then we need support for all filesystems for similar translation and constraint. The filesystems need to be able to tell the kernel what timestamp range they support, and then the kernel needs to follow those guidelines.
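
As a rough illustration of a filesystem telling the kernel what range it supports, here is a minimal userspace sketch in C. The names struct fs_time_limits, struct wide_time and fs_time_prepare() are hypothetical and do not exist in the kernel. The gran_ns field only mirrors the idea behind s_time_gran, and returning -ERANGE for an unrepresentable value is an assumption about what such an interface might do, not a description of any current code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-filesystem limits; not an existing kernel structure. */
struct fs_time_limits {
	int64_t  min_sec;  /* earliest second the on-disk format can encode */
	int64_t  max_sec;  /* latest second the on-disk format can encode */
	uint32_t gran_ns;  /* on-disk granularity in ns (same idea as s_time_gran) */
};

/* Hypothetical wide in-memory timestamp. */
struct wide_time {
	int64_t  sec;
	uint32_t nsec;     /* 0 .. 999999999 */
};

/*
 * Validate *t against the limits the filesystem advertised.
 *
 * Returns 0 and truncates t->nsec to the on-disk granularity when the
 * value is representable, -EINVAL for a malformed nanosecond field, and
 * -ERANGE (leaving *t untouched) when the seconds value cannot be stored.
 */
int fs_time_prepare(const struct fs_time_limits *lim, struct wide_time *t)
{
	/* A granularity of zero or more than one second makes no sense here. */
	assert(lim->gran_ns > 0 && lim->gran_ns <= 1000000000u);

	if (t->nsec >= 1000000000u)
		return -EINVAL;

	/* Reject instead of silently wrapping or mangling. */
	if (t->sec < lim->min_sec || t->sec > lim->max_sec)
		return -ERANGE;

	/* Round down so what is handed over is what will be read back. */
	t->nsec -= t->nsec % lim->gran_ns;
	return 0;
}
```

With the limits of a signed 32-bit seconds counter and one-second granularity, a timestamp one second past INT32_MAX is rejected rather than wrapped, and an in-range timestamp has its nanoseconds dropped before it reaches the on-disk format.
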

And if the filesystem is mounted on a kernel that doesn't support the current filesystem's timestamp format, then at minimum that filesystem cannot do anything that writes a timestamp....

Put simply: the filesystem defines the timestamp range that can be used safely, not the userspace API. If the filesystem can't support the date it is handed then that is an out-of-range error.

Since when have we accepted that it's OK to handle out-of-range data with silent overflows or corruption of the data that we are attempting to store? We're defining a new API to support a wider date range - there is nothing that prevents us from saying ERANGE can be returned for a timestamp that the filesystem cannot store correctly....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com