Thread (23 messages), 7 authors, 2021-07-21

Re: [Patch v4 0/3] Introduce a driver to support host accelerated access to Microsoft Azure Blob

From: Greg KH <hidden>
Date: 2021-07-21 05:19:14
Also in: linux-block, lkml

On Tue, Jul 20, 2021 at 06:37:38PM +0000, Long Li wrote:
quoted
Subject: Re: [Patch v4 0/3] Introduce a driver to support host accelerated
access to Microsoft Azure Blob

On Mon, Jul 19, 2021 at 08:31:03PM -0700, longli@linuxonhyperv.com wrote:
quoted
From: Long Li <longli@microsoft.com>

Microsoft Azure Blob storage service exposes a REST API to
applications for data access.
(https://docs.microsoft.com/en-us/rest/api/storageservices/blob-service-rest-api)

This patchset implements a VSC (Virtualization Service Consumer) that
communicates with a VSP (Virtualization Service Provider) on the
Hyper-V host to execute Blob storage access via native network stack on
the host.
quoted
This VSC doesn't implement the semantics of REST API. Those are
implemented in user-space. The VSC provides a fast data path to VSP.

Answers to some previous questions discussing the driver:

Q: Why doesn't this driver use the block layer?

A: The Azure Blob is based on a model of object oriented storage. The
storage object is not modeled in block sectors. While it's possible to
present the storage object as a block device (assuming it makes sense
to fake all the block device attributes), we lose the ability to
express functionality that is defined in the REST API.
What "REST API"?

Why doesn't object oriented storage map to a file handle to read from?
No need to mess with "blocks", why would you care about them?

And again, you lose all caching, this thing has got to be really slow, why add
a slow storage interface?  What workload requires this type of slow block
storage?
"Blob REST API" expresses storage request semantics through HTTP. In Blob
REST API, each request is associated with a context meta data expressed in
HTTP headers. Those semantics are rich; they are limited only by the HTTP
protocol.
HTTP has nothing to do with the kernel, so I do not understand what you
are talking about here.
There are attempts to implement the Blob as a file system.
Here is an example filesystem (BlobFuse) implemented for Blob:
(https://github.com/Azure/azure-storage-fuse).

It's doable, but at the same time you lose all the performance and
shareable/security features presented in the REST API for Blob.
What sharable/security features are in this driver instead?  I saw none.
A POSIX
interface cannot express same functionality as the REST API for Blob.
But you are not putting a REST api into this kernel driver, so I fail to
understand this.
For example, the Blob API for read (Get Blob,
https://docs.microsoft.com/en-us/rest/api/storageservices/get-blob)
has rich metadata context that cannot easily be mapped to POSIX. The same
goes for the other Blob APIs that manage security tokens and the life cycle of shareable
objects.
How can you have sharable objects in this ioctl interface instead?
BlobFuse (above) filesystem demonstrated why Blob should not be implemented
on a filesystem. It's usable for data management purposes. It's not usable for an I/O
intensive workload. It's not usable for managing sharable objects and security tokens.
What is the bottleneck for the throughput/performance issues involved?
Blob is designed not to use file system caching and block layer I/O scheduling.
There are many solutions existing today, that access raw disk for I/O, bypassing
filesystem and block layer. For example, many database applications access raw
disks for I/O. When the application knows the I/O pattern and its intended behavior,
it doesn't get much benefit from filesystem or block.
Databases that use raw i/o "know what they are doing" and are constantly
fighting with the kernel.  Don't expect kernel developers to just think
that this is ok.

But this is not what you are doing here at all, this is an object
storage, you are still being forced to open/ioctl/close to get an object
instead of just doing open/read/close, so I fail to understand where the
performance issues are.
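
To make the contrast between the two access patterns concrete, here is a minimal user-space sketch. It is not taken from the patch set: the device path `/dev/hypothetical_blob`, the request number `HYPOTHETICAL_BLOB_REQUEST`, and the flat request buffer are all assumptions made up for illustration, and the real driver's ioctl layout is not shown in this message.

```python
import fcntl
import os

# Hypothetical values for illustration only; neither comes from the patch set.
HYPOTHETICAL_DEVICE = "/dev/hypothetical_blob"
HYPOTHETICAL_BLOB_REQUEST = 0xC0DE


def read_object_posix(path: str, chunk_size: int = 65536) -> bytes:
    """open/read/close: the generic pattern that works on any file."""
    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:  # an empty read means end of file
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)


def read_object_ioctl(device: str, request: bytes) -> bytes:
    """open/ioctl/close: the caller must know a driver-specific request format."""
    fd = os.open(device, os.O_RDWR)
    try:
        buf = bytearray(request)  # mutable buffer the driver may overwrite
        fcntl.ioctl(fd, HYPOTHETICAL_BLOB_REQUEST, buf, True)
        return bytes(buf)
    finally:
        os.close(fd)
```

Only `read_object_posix` can be exercised on an ordinary system. `read_object_ioctl` needs the hypothetical device to exist, which is the point of the comparison: every caller of the ioctl path has to be written against one driver's private interface.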

And if they are in the FUSE layer, why not just write a filesystem layer
in the kernel instead to resolve that?  Who is insisting that you do
this through a character device driver to get filesystem data?
quoted
quoted
Q: You also just abandoned the POSIX model and forced people to use a
random-custom-library just to access their storage devices, breaking
all existing programs in the world?

A: The existing Blob applications access storage via HTTP (REST API).
They don't use POSIX interface. The interface for Azure Blob is not
designed on POSIX.
I do not see a HTTP interface here, what does that have to do with the kernel?

I see a single ioctl interface, that's all.
The driver doesn't attempt to implement Blob API or HTTP. It presents a fast data
path to pass Blob requests to hosts, so the guest VM doesn't need to assemble
a HTTP packet for each Blob REST request. This also eliminates additional
overhead in guest network stack to send the HTTP packets over TCP/IP.
Again, I fail to understand how http or tcp/ip comes into play here at
all, that's not what this driver does.
Once the request reaches the Azure host, it knows the best way to reach the
backend storage and serve the Blob request, while at the same time all the
security and integrity features are preserved.
I do not understand this statement at all.
quoted
quoted
Q: What programs today use this new API?

A: Currently none is released. But per above, there are also none
using POSIX.
Great, so no one uses this, so who is asking for and requiring this thing?

What about just doing it all from userspace using FUSE?  Why does this HAVE
to be a kernel driver?
We have a major workload nearing the end of development. Compared with
REST over HTTP, this device model presented faster data access and CPU savings
in that there is no overhead of sending HTTP over network.
Your development cycle means nothing to us, sorry.  Please realize that
if you submit something that is not acceptable to us, there is no
requirement that we take it just because we feel this is implemented in
totally the wrong way.

And you have not, again, proven that there is any performance
improvement anywhere due to lack of numbers and data.  And again, what
does HTTP have to do with this driver?
Eventually, all the existing Blob REST API applications can use this new API, once
it gets to their Blob transport libraries.
What applications?  What libraries?  Who is using any of this?
I have explained why BlobFuse is not suitable for production workloads. There
are people using BlobFuse but mostly for management purposes.
I fail to see why it is not usable as you have not provided any real
information, sorry.

Please step back and write up a document that explains the problem you
are trying to solve and then we can go from there.  Right now you are
throwing a driver at us and expecting us to just accept it, despite it
looking like it is completely wrong for the problem space it is
attempting to interact with.

Please work with some experienced Linux kernel developers on your team
to do all of this _BEFORE_ submitting it to the community again.  Based
on the mistakes made so far, it looks like you could use some guidance
in turning this into something that might be acceptable.  As it is, you
are a long way off.

good luck!

greg k-h