Re: [PATCH v4 13/13] bcache: add stop_when_cache_set_failed to struct cached_dev
From: Coly Li <hidden>
Date: 2018-01-29 13:02:27
Also in:
linux-block
On 29/01/2018 8:57 PM, Nix wrote:
On 27 Jan 2018, Coly Li said:
Current bcache failure handling code will stop all attached bcache devices when the cache set is broken or disconnected. This is the desired behavior for most enterprise or cloud use cases, but maybe not for low-end configurations. As Nix [off-list ref] points out, users may still want to access the bcache device after the cache device has failed, for example on laptops.
Actually, I'm very interested in the server use case. On laptops, it's relatively easy to recover if you know what you're doing, because they usually have a user in front of them with console access -- but if a remote headless server with a hundred users a thousand miles away has its cache device wear out, I would really rather the hundred users get served, if more slowly, than have the whole machine go down for hours or days until I can get someone there to bash on the hardware!
Hi Nix, thanks for the input. I hadn't thought of such a use case before. It makes a lot of sense!
(Sure, ideally you'd detect the wearing out in advance, but SSDs are not always nice and helpful like that and sometimes just go instantly readonly or simply vanish off the bus entirely without warning.)
Yes. In the v5 patch set, I will add an "always"/"auto" option, which will leave the bcache device alive if the broken cache set is clean. Thank you all again for the insight and the brilliant suggestion!
Coly Li
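To make the proposed "always"/"auto" behavior concrete, here is a minimal sketch of the decision it implies. This is not code from the patch set: the enum, the function name, and the dirty flag are hypothetical, and it assumes that "always" means the device is stopped on any cache set failure while "auto" stops it only when the failed cache set still holds dirty data.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not taken from the patch. */
enum stop_policy {
	STOP_AUTO,	/* assumed: stop only if the failed cache set is dirty */
	STOP_ALWAYS,	/* assumed: stop whenever the cache set fails */
};

/*
 * Decide whether a bcache device should be stopped after its cache set
 * has failed. With STOP_AUTO and a clean cache set, the device stays
 * alive and requests can be served from the backing device alone.
 */
static bool should_stop_bcache_device(enum stop_policy policy,
				      bool cache_set_is_dirty)
{
	if (policy == STOP_ALWAYS)
		return true;

	/* STOP_AUTO: dirty data exists only on the failed cache, so stop. */
	return cache_set_is_dirty;
}
```

Under these assumptions, the remote headless server described above would keep serving its users, more slowly, as long as the failed cache held no dirty data.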