Re: [PATCH RESEND v2] scsi: ignore Synchronize Cache command failures to keep using drives not supporting it
From: Damien Le Moal <hidden>
Date: 2021-03-01 12:43:59
On 2021/03/01 21:30, Damien Le Moal wrote:
On 2021/03/01 20:33, Guido Trentalancia wrote:quoted
I have just checked the drive and I can now confirm that you are right about the capabilities reported: it is indeed reporting Write Caching. However the point is that the current kernel behaviour is wrong and leading to data corruption on such drives, as the sync function fails due to missing Synchronize Cache command.Without sync working, how could you ever guarantee that even a clean shutdown of the host does not result in data loss ? I am not even talking about power failures and crashes here. A simple, clean "shutdown -h now". Without issuing a synchronize cache command after flushing the page cache and unmounting the FS, you will loose data and corrupt things. 100% guaranteed. As James explained in different terms, it is the other way around: the kernel is correct and behaves according to the fact that the drive is saying "I am caching written sectors". That implies that synchronize cache *is* supported, per the standards. The corruption errors you are seeing are due to the drive being silly and failing synchronize cache commands, resulting in the cached data *not* going to persistent media, and loss of data on power down or crash. This is not a bad behavior. On the contrary, not seeing any data corruption would only mean that you are being lucky.quoted
Once again, this is because of very old HD specifications that were implementing Write Caching without that command.If the drive is scanned and initialized by sd/libata, it means that it is reporting following a supported standard version (SPC, SBC, ACS). Probably an old version, but still a standard. Synchronize cache command (for scsi) and flush cache/flush cache ext (for ATA) are not recent additions to SCSI & ATA. These have been around for a long time. It does not matter that the drive is old and following an old version. It should support cache flushing. Refer to the ACS specs. It is clearly noted that: "If the volatile write cache is disabled or no volatile write cache is present, the device shall indicate command completion without error.". Get the point ? That drive firmware is simply broken and missing a critical command.quoted
The way forward is to treat the command failure as non-critical (see attached patch) because clearly it's not implemented in all drives, but only more recent ones.Nope. Simply disable the write cache. Try "hdparm -W 0 /dev/sdX" or "sdparm --clear=WCE /dev/sdX" to disable it. And a udev rule can do that for you on boot too. Failures and data corruption will go away. But the performance will likely go to the trash bin too... A little bit of history: it used to be a thing, many many years ago, to see drives that had synchronize cache commands implemented as "nop" or not implemented at all. Unscrupulous vendors would do that to get better performance results for their drives with benchmarks. Because synchronize cache can be very costly and can take several seconds to complete. Using such drives in big RAID arrays with power failure protections would be fine, but any other use case by any regular user would create data corruption situations very quickly. As noted above, even a simple clean shutdown would lose data ! Such bad practice ended fairly quickly. For some reasons these bad drives failed to sell very well :)
I checked the standards again. It turns out that SYNCHRONIZE CACHE command is optional in SBC... Aouch. Got so used to have that one on any drive that I thought it was mandatory. Well, it certainly is mandatory if the drive has a write cache, which seems to be the case for you. The problem with your patch though is that you disable write caching when you see an ILLEGAL REQUEST/INVALID OPCODE termination of synchronize cache. Which means that the drive was already used, written too and the cache has likely dirty data and I do not see any method to guarantee that that data makes it to persistent media before shutdown. Imagine if that was the synchronize cache sent before shutdown. So the only reasonable solution for such drive is to use it with write cache disabled from the start.
quoted
Guido On Mon, 01/03/2021 at 07.38 +0000, Damien Le Moal wrote:quoted
On 2021/03/01 16:12, Guido Trentalancia wrote:quoted
Hi James, thanks for getting back on this issue. I have tested this patch for over a year and it works flawlessly without any data corruption ! On such kind of drives the actual situation is just the opposite as you describe: data corruption occurs when not using this patch ! I do not agree with you: if a drive does not support Synchronize Cache command, there is no point in treating the failure as critical and by all means the failure must be ignored, as there is nothing which can be done about it and it should not be treated as a failure !If the drive does not support synchronize cache, then the drive should *not* report write caching either. If it does, then the kernel will issue synchronize cache commands and that command failing indicates the drive is broken/lying == junk and should not be trusted. The user can trivially remedy to this situation by force disabling the write cache: no more synchronize cache commands will be issued and no more failures. No need to patch the kernel for that. And if the drive does not allow disabling write caching, then I seriously recommend replacing it :)quoted
However, if you are willing to propose an alternative patch, I'd be happy to test it and report back, as long as this bug is fixed in the shortest time possible. Guido On Sun, 28/02/2021 at 08.37 -0800, James Bottomley wrote:quoted
On Sun, 2021-02-28 at 10:01 +0100, Guido Trentalancia wrote:quoted
Many obsolete hard drives do not support the Synchronize Cache SCSI command. Such command is generally issued during fsync() calls which at the moment therefore fail with the ILLEGAL_REQUEST sense key.It should be that all drives that don't support sync cache also don't have write back caches, which means we don't try to do a cache sync on them. The only time you we ever try to sync the cache is if the device advertises a write back cache, in which case the sync cache command is mandatory. I'm sure some SATA manufacturers somewhere cut enough corners to produce an illegally spec'd drive like this, but your proposed remedy is unviable: you can't ignore a cache failure on flush barriers which will cause data corruption. You have to disable barriers on the filesystem to get correct operation and be very careful about power down. James
-- Damien Le Moal Western Digital Research