Re: [Patch v4 0/3] Introduce a driver to support host accelerated access to Microsoft Azure Blob
From: Bart Van Assche <bvanassche@acm.org>
Date: 2021-07-20 04:42:45
Also in:
linux-block, lkml
On 7/19/21 8:31 PM, longli@linuxonhyperv.com wrote:
> From: Long Li <longli@microsoft.com>
>
> Microsoft Azure Blob storage service exposes a REST API to applications
> for data access.
> (https://docs.microsoft.com/en-us/rest/api/storageservices/blob-service-rest-api)
>
> This patchset implements a VSC (Virtualization Service Consumer) that
> communicates with a VSP (Virtualization Service Provider) on the Hyper-V
> host to execute Blob storage access via native network stack on the host.
> This VSC doesn't implement the semantics of REST API. Those are implemented
> in user-space. The VSC provides a fast data path to VSP.
>
> Answers to some previous questions discussing the driver:
>
> Q: Why this driver doesn't use the block layer
>
> A: The Azure Blob is based on a model of object oriented storage. The
> storage object is not modeled in block sectors. While it's possible to
> present the storage object as a block device (assuming it makes sense to
> fake all the block device attributes), we lose the ability to express
> functionality that are defined in the REST API.
>
> Q: You just lost all use of caching and io_uring and loads of other kernel
> infrastructure that has been developed and relied on for decades?
>
> A: The REST API is not designed to have caching at system level. This
> driver doesn't attempt to improve on this. There are discussions on
> supporting ioctl() on io_uring (https://lwn.net/Articles/844875/), that
> will benefit this driver. The block I/O scheduling is not helpful in this
> case, as the Blob application and Blob storage server have complete
> knowledge on the I/O pattern based on storage object type. This knowledge
> doesn't get easily consumed by the block layer.
>
> Q: You also just abandoned the POSIX model and forced people to use a
> random-custom-library just to access their storage devices, breaking all
> existing programs in the world?
>
> A: The existing Blob applications access storage via HTTP (REST API). They
> don't use POSIX interface. The interface for Azure Blob is not designed
> on POSIX.
>
> Q: What programs today use this new API?
>
> A: Currently none is released. But per above, there are also none using
> POSIX.
>
> Q: Where is the API published and what ensures that it will remain stable?
>
> A: Cloud based REST protocols have similar considerations to the kernel in
> terms of interface stability. Applications depend on cloud services via
> REST in much the same way as they depend on kernel services. Because
> applications can consume cloud APIs over the Internet, there is no
> opportunity to recompile applications to ensure compatibility. This causes
> the underlying APIs to be exceptionally stable, and Azure Blob has not
> removed support for an exposed API to date. This driver is supporting a
> pass-through model where requests in a guest process can be reflected to a
> VM host environment. Like the current REST interface, the goal is to ensure
> that each host provide a high degree of compatibility with each guest, but
> that task is largely outside the scope of this driver, which exists to
> communicate requests in the same way an HTTP stack would. Just like an HTTP
> stack does not require updates to add a new custom header or receive one
> from a server, this driver does not require updates for new functionality
> so long as the high level request/response model is retained.
>
> Q: What happens when it changes over time, do we have to rebuild all
> userspace applications?
>
> A: No. We don’t rebuild them all to talk HTTP either. In the current HTTP
> scheme, applications specify the version of the protocol they talk, and
> the storage backend responds with that version.
>
> Q: What happens to the kernel code over time, how do you handle changes to
> the API there?
>
> A: The goal of this driver is to get requests to the Hyper-V host, so the
> kernel isn’t involved in API changes, in the same way that HTTP
> implementations are robust to extra functionality being added to HTTP.
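The versioning argument in the quoted answers can be illustrated with a small sketch. It is not code from the patchset: build_request(), pass_through() and fake_backend() are hypothetical names, the extra header is invented, and the two version strings are only example values. The one real detail is that Azure Storage REST requests carry their API version in the x-ms-version header. The point of the sketch is that a transport which forwards opaque bytes needs no change when an application switches versions or adds a header.

```python
# Hypothetical sketch: none of these names come from the driver or any Azure SDK.
API_VERSION_HEADER = "x-ms-version"  # version header of the Azure Storage REST API


def build_request(method, path, version, extra_headers=None):
    """Serialize a request; the application, not the transport, picks the version."""
    headers = {API_VERSION_HEADER: version}
    headers.update(extra_headers or {})
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def pass_through(request_bytes, backend):
    """Stand-in for the transport: forwards the bytes unchanged, never parses them."""
    return backend(request_bytes)


def fake_backend(request_bytes):
    """Toy backend that answers with whatever version the request named."""
    text = request_bytes.decode("ascii")
    version = None
    for line in text.split("\r\n")[1:]:
        name, sep, value = line.partition(": ")
        if sep and name.lower() == API_VERSION_HEADER:
            version = value
    return {"status": 200, "version": version}


if __name__ == "__main__":
    # Two applications built at different times share one unchanged transport.
    old_app = build_request("GET", "/container/blob", "2019-12-12")
    new_app = build_request("GET", "/container/blob", "2020-04-08",
                            {"x-hypothetical-feature": "1"})  # invented header
    print(pass_through(old_app, fake_backend))
    print(pass_through(new_app, fake_backend))
```

Only build_request() and the backend know about versions and headers in this sketch; pass_through() stays the same for both callers, which is the property the cover letter claims for the driver.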
Another question is why do we need this in the kernel? Has it been
considered to provide a driver similar to vfio on top of the Hyper-V bus
such that this object storage driver can be implemented as a user-space
library instead of as a kernel driver? As you may know vfio users can
either use eventfds for completion notifications or polling. An interface
like io_uring can be built easily on top of vfio.

Thanks,

Bart.