Thread: 18 messages, 7 authors, 2016-02-24

Re: [PULL] Re: bcache stability patches

From: Vojtech Pavlik <hidden>
Date: 2016-01-02 12:15:28

On Fri, Jan 01, 2016 at 08:28:39PM -0500, Denis Bychkov wrote:
> > We certainly can do all that, but: since new bcache can't read the old bcache
> > format (I can go into why that's impractical, if people are curious) - that
> > means there's a pretty high cost to switching to the new format:
> >  - people have to manually upgrade
> >  - the kernel would have to carry around both the old and the new
> >    implementations of bcache for as long as people are using the old format -
> >    we can't force people to upgrade
>
> Not that hard technically, you could just leave the existing bcache
> module as-is to avoid regressions when adding the new format support,
> since the existing code does not require a lot of maintenance and add
> a new module that would only recognize a new superblock. Another
> question is how easy it is to convince Linus/top maintainers to keep 2
> modules with a lot of duplication with intention to retire the old
> code eventually, but something tells me that you know ways around this
> problem.

We had that with the UHCI drivers, and it is there with ext3/ext4; I don't
think that is a real problem.

> > So this isn't something we want to do more than once, which means we need to
> > make sure the new on disk format is 100% done. And it's not quite done - the
> > main thing that's left for it to really be considered done is big endian support
> > and endian compatibility (writing the code so a little endian machine can read a
> > big endian layout and vice versa; due to the way bkeys work it's not possible to
> > just have an endian agnostic layout, we'll have to do swabbing).
>
> But this problem is not unique to bcache at all, AFAIK, almost any FS
> linux supports would not be able to work on a different endianness
> than it was created for.

On the contrary, all modern filesystems cope with endianness
portability. The only major filesystem in use where endianness is not
handled is, as far as I know, UFS.
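
For illustration, the usual way a filesystem copes is to fix the byte
order of the on-disk format (ext4, for example, is little endian on disk)
and convert at every access; the kernel provides helpers such as
le32_to_cpu() for this. Below is a minimal userspace sketch of the same
idea. The names get_le32() and put_le32() are hypothetical and are not
taken from any existing code.

```c
#include <stdint.h>

/*
 * Sketch only: encode and decode a 32-bit field that is always stored
 * little endian on disk. Working byte by byte gives the same result on
 * little endian and big endian hosts.
 */
static void put_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}

static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}
```

A layout like this never needs swabbing of whole structures, which is
exactly what the bkey packing described above is said to rule out for
bcache.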

At the same time, I don't see endianness portability, i.e. the ability
to create a cache on a machine of one endianness and then mount it on a
machine of the opposite endianness, as a real use case.

Unlike filesystems, which can be used to transfer valuable data between
machines, the cache only contains ephemeral data, which can easily be
recreated from the backing device.

Hence I believe that it is reasonable to require the user to nuke the
contents of the cache when moving the cache set between machines of
different endianness.

Ideally this would happen automatically and error out if the cache isn't
clean.

Actually, the same would be fine for format version changes.
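
The policy described above can be sketched as a superblock check at
attach time. This is an illustration under stated assumptions, not bcache
code: struct cache_sb, its fields, the magic value and the version number
are all hypothetical. It assumes the superblock is written in the
creating host's byte order, so a foreign-endian cache shows up as a
byte-swapped magic.

```c
#include <stdbool.h>
#include <stdint.h>

#define SB_MAGIC   0x62636163u	/* hypothetical, must not be a byte palindrome */
#define SB_VERSION 2u		/* hypothetical current on-disk format version */

enum sb_action { SB_USE, SB_REINIT, SB_REFUSE };

struct cache_sb {		/* hypothetical superblock, host byte order */
	uint32_t magic;
	uint32_t version;
	uint32_t dirty;		/* nonzero: cache holds data not yet written back */
};

static uint32_t sb_swab32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Decide what to do with a cache device:
 *  - native endianness and current version: use it as is
 *  - foreign endianness or other version, cache clean: wipe and reuse
 *  - foreign endianness or other version, cache dirty: error out
 */
static enum sb_action check_sb(const struct cache_sb *sb)
{
	bool foreign = sb->magic == sb_swab32(SB_MAGIC);

	if (!foreign && sb->magic != SB_MAGIC)
		return SB_REFUSE;	/* not a superblock we recognize */

	uint32_t version = foreign ? sb_swab32(sb->version) : sb->version;
	bool dirty = sb->dirty != 0;	/* nonzero in either byte order */

	if (!foreign && version == SB_VERSION)
		return SB_USE;

	return dirty ? SB_REFUSE : SB_REINIT;
}
```

With a check like this, only the three fixed-width superblock fields ever
need to be understood across endianness or versions; the rest of the
cache contents are either used natively or thrown away.
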
> > And this isn't a trivial amount of work - and besides finishing the on disk
> > format, there's a fair amount of work on tooling and related stuff to make sure
> > everything is ready for the switch.
> >
> > And, I can't work for free, so somehow funding has to be secured. Given the
> > number of companies that are using bcache, and the fact that Canonical and SUSE
> > are both apparently putting in at least a little bit of engineering time into
> > supporting bcache, you'd think it should be possible but offers have not been
> > forthcoming.
>
> I don't know, IMHO bcache was hurt a lot because of a host of small
> problems that nobody was able to address for quite some time. It
> gained a bad reputation as a production system, unfortunately, which
> means not much interest from the enterprise world, which means
> Canonical & co. did not want to invest into it. Don't get me wrong, I
> am not blaming you. Of all people, I might understand pretty well what
> was going on, just explaining why RH or Canonical or SUSE did not
> fight for the privilege to financially support this project.

SUSE had plans for bcache; however, since upstream stable branch
maintenance has been more than unreliable, we postponed most of them and
are building knowledge in-house to be able to fully support it before we
deploy.

The structure of the code doesn't really help, either.

-- 
Vojtech Pavlik
Director SUSE Labs