Re: IMA on remote file systems

From: Chuck Lever <hidden>
Date: 2019-09-16 18:18:48
Also in: linux-fsdevel

On Sep 16, 2019, at 12:10 PM, Theodore Y. Ts'o [off-list ref] wrote:

On Sun, Sep 15, 2019 at 05:42:10PM -0400, Mimi Zohar wrote:

quoted

My thought was to use an ephemeral Merkle tree for NFS (and
possibly other remote filesystems, like FUSE, until these
filesystems support durable per-file Merkle trees). A tree would
be constructed when the client measures a file, but it would not
be saved to the filesystem. Instead of a hash of the file's contents,
the tree's root signature is stored as the IMA metadata.

Once a Merkle tree is available, it can be used in exactly the
same way that a durable Merkle tree would, to verify the integrity
of individual pages as they are used, evicted, and then read back
from the server.

If the client needs to evict part or all of an ephemeral tree, it
can subsequently be reconstructed by measuring the file again and
verifying its root signature against the stored IMA metadata.
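As a rough illustration of the scheme described above, here is a minimal user-space sketch in Python. It is not kernel code and not fs-verity's on-disk format: the binary tree shape, the 4096-byte block size, SHA-256, and the names build_tree, root and verify_block are all assumptions made for the example. The client measures the file once to build the tree, keeps the root as the trusted value, and later checks a single re-read block by hashing it up to that root.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(data: bytes) -> list:
    """Measure a whole file; return tree levels, leaves first, root last."""
    blocks = [data[i:i + BLOCK_SIZE]
              for i in range(0, len(data), BLOCK_SIZE)] or [b""]
    level = [_h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        # Hash nodes in pairs; an unpaired last node is hashed on its own.
        level = [_h(b"".join(level[i:i + 2]))
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def root(levels: list) -> bytes:
    return levels[-1][0]


def verify_block(levels: list, index: int, block: bytes,
                 trusted_root: bytes) -> bool:
    """Check one re-read block by recomputing the path up to the root."""
    node = _h(block)
    for level in levels[:-1]:
        start = index - (index % 2)
        pair = list(level[start:start + 2])
        pair[index - start] = node  # substitute the recomputed hash
        node = _h(b"".join(pair))
        index //= 2
    return node == trusted_root
```

After eviction, the same build_tree call can be repeated and its root compared with the stored value before the rebuilt tree is trusted again.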

Where would the client store the ephemeral tree?  If you're thinking
about storing in memory, calculating the ephemeral tree would require
dragging the entire file across the network, which is going to be just
as bad as using IMA --- plus the CPU cost of calculating the Merkle
tree, and the memory cost of storing the ephemeral Merkle tree.

A client would store ephemeral Merkle trees in memory.

The most interesting use case to me is protecting executables and
DLLs. These will tend to be limited in size, so the cost of Merkle
tree construction should be nicely bounded in the typical case.

An additional cost would arise if the in-memory tree were to be
evicted. We hope that is an infrequent event. If the tree is
partially evicted, only some of the file needs to be read back
to re-construct it, since we would still have in-memory hashes
stored in the interior nodes of the tree that enable the client to
verify the portion of the tree that needs to be re-constructed.

The short-term purpose of these trees is to add the value of better
integrity protection for file systems that find it difficult to
store per-file Merkle trees durably. We expect that situation will
be temporary for many file systems, though not all.

The price that is paid for this extra protection is that it will
perform like traditional IMA, as you observed above. This is probably
a different cost than reading from flash on a mobile device: a typical
NFS client will be less memory- and CPU-constrained than a mobile
device, and the cost of reading over NFS on a fast network from the
server's cache is not high. The trade-offs here are going to be
different.

I suspect that for most clients, it wouldn't be worth it unless the
client can store the ephemeral tree *somewhere* on the client's local
persistent storage, or maybe if it could store the Merkle tree on the
NFS server (maybe via an xattr which contains the pathname to the
Merkle tree relative to the NFS mount point?).

The trees could be cached locally for exceptionally large files (e.g.,
files larger than the client's physical memory). For smaller files,
which I expect will be the typical case, the cost of reading a file
will be about the same as reading a Merkle tree.
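For a rough sense of how much memory such a tree occupies relative to the file it covers, the hash count can be worked out directly. This is only a back-of-the-envelope sketch in Python: the 4096-byte block size, 32-byte digests, and the fanout parameter (2 for a plain binary tree, 128 if each 4096-byte tree block holds 128 digests) are assumptions, and tree_hash_count and tree_bytes are hypothetical names.

```python
def tree_hash_count(num_blocks: int, fanout: int = 2) -> int:
    """Total number of hashes in a tree over `num_blocks` data blocks."""
    count = 0
    level = max(1, num_blocks)
    while True:
        count += level
        if level <= 1:
            break
        level = -(-level // fanout)  # ceiling division
    return count


def tree_bytes(file_size: int, block_size: int = 4096,
               digest_size: int = 32, fanout: int = 2) -> int:
    """Approximate bytes of hash data for a file of `file_size` bytes."""
    num_blocks = -(-file_size // block_size)
    return tree_hash_count(num_blocks, fanout) * digest_size
```

Under these assumptions a 1 MiB file has 256 data blocks, giving 511 hashes (16352 bytes) for a binary tree, or 259 hashes (8288 bytes) with a fanout of 128.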

As mentioned in my proposal, the eventual goal is to extend the NFS
protocol to store the Merkle tree durably on the server. We will get
there eventually. Changing the protocol is a slow process, particularly
because it involves consensus among NFS implementers who work on other
operating systems besides Linux.

quoted

So the only difference here is that the latency-to-first-byte
benefit of a durable Merkle tree would be absent.

What problem are you most interested in solving?  And what cost do you
think the user will be willing to pay in order to solve that problem?

NFS users would get full protection of their files from storage
to point-of-use, at the same cost as IMA, until some point in the
future when NFS can store the trees durably. The same would apply
to other filesystems that find storing a full Merkle tree to be
a challenge.

quoted

I like the idea, but there are a couple of things that need to happen
first.  Both fs-verity and IMA appended signatures need to be
upstreamed.

Eric sent the fs-verity pull request today.

quoted

 The IMA appended signature support simplifies
ima_appraise_measurement(), paving the way for adding IMA support for
other types of signature verification.  How IMA will support fs-verity 
signatures still needs to be defined.  That discussion will hopefully
include NFS support.

As far as using the Merkle tree root hash for the IMA measurement,
what sort of policy should be used for determining when the Merkle
tree root hash should be used in preference to reading and checksumming
the whole file when it is first opened?  It could be as simple as, "if
this is an fs-verity file, use the fs-verity Merkle root".  Is that OK?
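The simple rule floated here could be sketched as below. This is a Python illustration only, not IMA policy syntax or kernel code; FileInfo, its verity_root field and ima_measurement are hypothetical names, and SHA-256 is assumed for the fallback whole-file hash.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FileInfo:
    data: bytes
    # Hypothetical field: set only when the file has a Merkle tree.
    verity_root: Optional[bytes] = None


def ima_measurement(f: FileInfo) -> Tuple[str, bytes]:
    """Prefer the Merkle root when present, else hash the whole file."""
    if f.verity_root is not None:
        return ("verity-root", f.verity_root)
    return ("file-hash", hashlib.sha256(f.data).digest())
```

The returned tag records which kind of digest was used, so the two kinds of measurement are never compared against each other by mistake.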

    	  	     	     	       	      - Ted

--
Chuck Lever
