Re: bcachefs: can bcachefs export block devices?
From: Kent Overstreet <hidden>
Date: 2016-08-04 07:10:09
On Fri, May 27, 2016 at 07:45:32PM -0700, Eric Wheeler wrote:
> On Wed, May 25, 2016 at 02:47:29PM -0700, Eric Wheeler wrote:
>>> Does bcachefs's implementation reuse and update the existing bcache code such that the block device driver inherits the bcachefs improvements? I understand the cache superblock changed, maybe the cached dev super too.
>>
>> Yes, all of the existing functionality is still there (though some of it's broken at the moment because I haven't been running those tests; if you're interested in using bcache-dev for the old style caching (there are performance and robustness improvements) it wouldn't take me long to get it working again).
>
> I can test that once it's working. Would it use the same bcachefs tools for formatting superblocks?
>
> Relatedly, can you point out the best place to abstract cachemeta-v1 vs. cachemeta-v2 for simultaneous use? Could it be just a bunch of function pointers in the cachedev struct, assigned during initialization for v1/v2? Have the call arguments changed? What functions would need abstractions (the smallest v1/v2 intersection)?
You mean compile a kernel that supports both the old and new on-disk formats? Realistically the only way that's going to happen is to completely fork the source code, ext2/3/4 style. That's going to have to happen eventually anyway.
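
For readers unfamiliar with the pattern being asked about, here is a minimal sketch of per-format dispatch through function pointers chosen at initialization. Everything in it is hypothetical: `struct cached_dev_sketch`, `struct cache_meta_ops`, `cached_dev_bind_ops()` and the two callbacks are illustrative names, not identifiers from the bcache or bcachefs source, and the callbacks are stubs. The reply above considers a full source fork the realistic route, so this only illustrates the question, not a proposed design.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct cached_dev_sketch;

/* Hypothetical ops table: one entry per operation that differs by format. */
struct cache_meta_ops {
	const char *name;
	int (*read_super)(struct cached_dev_sketch *dc);
	int (*write_super)(struct cached_dev_sketch *dc);
};

/* Hypothetical stand-in for the cached device struct. */
struct cached_dev_sketch {
	uint32_t sb_version;
	const struct cache_meta_ops *ops;
	int supers_written;	/* counts write_super calls, for illustration */
};

/* Stub callbacks; real ones would parse or emit the on-disk layout. */
static int meta_read_super_stub(struct cached_dev_sketch *dc)
{
	(void)dc;
	return 0;
}

static int meta_write_super_stub(struct cached_dev_sketch *dc)
{
	dc->supers_written++;
	return 0;
}

static const struct cache_meta_ops meta_v1_ops = {
	.name = "cachemeta-v1",
	.read_super = meta_read_super_stub,
	.write_super = meta_write_super_stub,
};

static const struct cache_meta_ops meta_v2_ops = {
	.name = "cachemeta-v2",
	.read_super = meta_read_super_stub,
	.write_super = meta_write_super_stub,
};

/* Pick the ops table once, at initialization; returns -1 if unsupported. */
static inline int cached_dev_bind_ops(struct cached_dev_sketch *dc,
				      uint32_t version)
{
	assert(dc != NULL);

	switch (version) {
	case 1:
		dc->ops = &meta_v1_ops;
		break;
	case 2:
		dc->ops = &meta_v2_ops;
		break;
	default:
		dc->ops = NULL;
		return -1;
	}
	dc->sb_version = version;
	return 0;
}
```

After binding, callers would go through `dc->ops->write_super(dc)` without checking the version again. The pattern only helps where the two formats share call signatures, which is exactly what the question about changed call arguments is probing.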
>>> Can bcachefs provide /dev/bcacheN devices without loop.ko? If so, are these simply filesystem objects (files)?
>>
>> The way it works is the first 4096 inode numbers are owned by the block device interface - inodes in that range are for either cached devices or thin provisioned volumes. The filesystem code owns inode numbers >= 4096. So while blockdev volumes/cached data do have inodes, they're not reachable via the filesystem because there will never be dirents that point to them (also, they use a different inode type with extra fields for the UUID/label).
>
> That's a neat implementation. Would creating a dirent for such an inode expose the block device with the same size and content (and ordering) if the inode were compatible? Would the blockdev be block-size aligned versus the file, or might the file have an alignment requirement?
What we'd want to do is add an ioctl or something to take a fs inode (a normal file, that already has a dirent) and create at runtime a block device for it.
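
The inode-number split described in the quoted explanation above (the first 4096 inode numbers belong to the block device interface, the filesystem owns everything from 4096 up) can be sketched as a simple range check. This is an illustration only: `BLOCKDEV_INODE_MAX`, `inode_owner()` and `dirent_target_allowed()` are hypothetical names, not taken from the bcachefs source; only the boundary value 4096 comes from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; the thread only gives the boundary value. */
#define BLOCKDEV_INODE_MAX 4096ULL

enum inode_owner_kind {
	INODE_OWNER_BLOCKDEV,	/* cached devices, thin provisioned volumes */
	INODE_OWNER_FS,		/* regular filesystem inodes */
};

/* Inode numbers below the boundary belong to the block device interface. */
static inline enum inode_owner_kind inode_owner(uint64_t inum)
{
	return inum < BLOCKDEV_INODE_MAX ? INODE_OWNER_BLOCKDEV
					 : INODE_OWNER_FS;
}

/*
 * Assumption for illustration: filesystem code never creates a dirent
 * pointing into the blockdev range, which is why those inodes are not
 * reachable through the directory tree.
 */
static inline bool dirent_target_allowed(uint64_t inum)
{
	return inode_owner(inum) == INODE_OWNER_FS;
}

/* Small self-check of the boundary behaviour. */
static inline void inode_split_selfcheck(void)
{
	assert(inode_owner(BLOCKDEV_INODE_MAX - 1) == INODE_OWNER_BLOCKDEV);
	assert(inode_owner(BLOCKDEV_INODE_MAX) == INODE_OWNER_FS);
}
```

Under this split, exposing a normal file as a block device goes the other way from what the question suggests: rather than adding a dirent for a blockdev inode, a block device would be created at runtime for an inode that is already in the filesystem range.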
> I'm particularly excited about this as a precursor to snapshot support, especially if udev could help produce something like this:
>
>   /dev/disk/by-path/bcache-mydiskfile -> /dev/bcacheN
>   /dev/disk/by-path/bcache-mydisksnap -> /dev/bcacheN+1
Not sure what you mean by precursor - that would still require essentially the entire snapshots implementation. But yes, once we have snapshots we could do that too.