Thread (61 messages), 10 authors, 2020-08-14

Re: [dm-devel] [RFC PATCH v5 00/11] Integrity Policy Enforcement LSM (IPE)

From: Chuck Lever <hidden>
Date: 2020-08-10 23:37:03
Also in: dm-devel, linux-block, linux-fsdevel, linux-integrity, lkml

On Aug 10, 2020, at 11:35 AM, James Bottomley [off-list ref] wrote:

On Sun, 2020-08-09 at 13:16 -0400, Mimi Zohar wrote:
On Sat, 2020-08-08 at 13:47 -0400, Chuck Lever wrote:
On Aug 5, 2020, at 2:15 PM, Mimi Zohar [off-list ref]
wrote:
<snip>
If block layer integrity was enough, there wouldn't have been a
need for fs-verity.   Even fs-verity is limited to read only
filesystems, which makes validating file integrity so much
easier.  From the beginning, we've said that fs-verity signatures
should be included in the measurement list.  (I thought someone
signed on to add that support to IMA, but have not yet seen
anything.)
Mimi, when you and I discussed this during LSS NA 2019, I didn't
fully understand that you expected me to implement signed Merkle
trees for all filesystems. At the time, it sounded to me like you
wanted signed Merkle trees only for NFS files. Is that still the
case?
I definitely do not expect you to support signed Merkle trees for all
filesystems.  My interest is from an IMA perspective of measuring
and verifying the fs-verity Merkle tree root (and header info)
signature. This is independent of which filesystems support it.
The first priority (for me, anyway) therefore is getting the
ability to move IMA metadata between NFS clients and servers
shoveled into the NFS protocol, but that's been blocked for various
legal reasons.
Up to now, verifying remote filesystem file integrity has been out of
scope for IMA.   With fs-verity file signatures I can at least grasp
how remote file integrity could possibly work.  I don't understand
how remote file integrity with existing IMA formats could be
supported. You might want to consider writing a whitepaper, which
could later be used as the basis for a patch set cover letter.
I think, before this, we can help with the basics (and perhaps we
should sort them out before we start documenting what we'll do).
Thanks for the help! I just want to emphasize that documentation
(e.g., a specification) will be critical for remote filesystems.

If any of this is to be supported by a remote filesystem, then we
need an unencumbered description of the new metadata format rather
than code. GPL-encumbered formats cannot be contributed to the NFS
standard, and are probably difficult for other filesystems that are
not Linux-native, like SMB, as well.

The first basic is that a Merkle tree allows unit-at-a-time verification.
First of all we should agree on the unit.  Since we always fault a page
at a time, I think our merkle tree unit should be a page not a block.
Remote filesystems will need to agree that the size of that unit is
the same everywhere, or the unit size could be stored in the per-file
metadata.
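
To illustrate why the unit size has to be agreed on or recorded, here
is a minimal Python sketch. It is not the fs-verity on-disk format; the
helper names, the zero padding of the last unit, and the "pair an odd
node with itself" rule are all assumptions made for the example. The
same content hashed with two different unit sizes yields two different
roots, so a client that guesses the unit size wrong can never match the
signed root.

```python
import hashlib

def unit_hashes(data: bytes, unit_size: int) -> list[bytes]:
    """Hash each fixed-size unit; the last unit is zero-padded (assumption)."""
    units = [data[i:i + unit_size] for i in range(0, len(data), unit_size)] or [b""]
    return [hashlib.sha256(u.ljust(unit_size, b"\0")).digest() for u in units]

def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold hashes pairwise up to one root; an odd node is paired with itself."""
    level = leaves
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

def describe(data: bytes, unit_size: int) -> dict:
    """Hypothetical per-file metadata: the unit size travels with the root."""
    return {
        "unit_size": unit_size,
        "file_size": len(data),
        "root": merkle_root(unit_hashes(data, unit_size)).hex(),
    }

data = bytes(range(256)) * 40          # 10240 bytes of sample content
meta_4k = describe(data, 4096)         # page-sized units
meta_1k = describe(data, 1024)         # smaller, block-sized units
roots_differ = meta_4k["root"] != meta_1k["root"]
```

Carrying `unit_size` next to the root, as in the hypothetical metadata
above, is one way to avoid requiring the same unit size everywhere.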

Next, we should agree where the check gates for the per-page accesses
should be (definitely somewhere in readpage, I suspect). Finally, we
should agree how the Merkle tree is presented at the gate.  I think
there are three ways:

  1. Ahead of time transfer:  The merkle tree is transferred and verified
     at some time before the accesses begin, so we already have a
     verified copy and can compare against the lower leaf.
  2. Async transfer:  We provide an async mechanism to transfer the
     necessary components, so when presented with a unit, we check the
     log n components required to get to the root
  3. The protocol actually provides the capability of 2 (like the SCSI
     DIF/DIX), so to IMA all the pieces get presented instead of IMA
     having to manage the tree
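
As a rough illustration of the "log n components" in option 2, the
Python sketch below checks a single unit against an already trusted
root using only the sibling hashes along its path. The function names
and the tree layout (SHA-256, odd node paired with itself) are
assumptions for the example, not anything IMA or fs-verity defines.

```python
import hashlib

def h(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, leaves first, root level last."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]      # odd node is paired with itself
        levels.append([h(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def auth_path(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Collect the sibling hash at each level below the root."""
    path = []
    for level in levels[:-1]:
        sib = index ^ 1
        path.append(level[sib] if sib < len(level) else level[index])
        index //= 2
    return path

def verify_unit(unit: bytes, index: int, path: list[bytes], trusted_root: bytes) -> bool:
    """Recompute the root from one unit plus its sibling hashes."""
    node = h(unit)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == trusted_root

units = [bytes([i]) * 4096 for i in range(5)]   # five page-sized units
levels = build_levels([h(u) for u in units])
root = levels[-1][0]                            # assumed verified beforehand
path = auth_path(levels, 3)
ok = verify_unit(units[3], 3, path, root)
tampered = verify_unit(b"x" * 4096, 3, path, root)
```

With five units the path holds three hashes, so the gate touches three
tree nodes per unit rather than the whole tree.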
A Merkle tree is potentially large enough that it cannot be stored in
an extended attribute. In addition, an extended attribute is not a
byte stream that you can seek into or read small parts of; it is
retrieved in a single shot.

For this reason, the idea was to save only the signature of the tree's
root on durable storage. The client would retrieve that signature
possibly at open time, and reconstruct the tree at that time.

Or the tree could be partially constructed on-demand at the time each
unit is to be checked (say, as part of 2. above).

The client would have to reconstruct that tree again if memory pressure
caused some or all of the tree to be evicted, so perhaps an on-demand
mechanism is preferable.
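
A rough Python sketch of that client-side behavior, under stated
assumptions: the class and method names are hypothetical, an HMAC
stands in for a real public-key signature over the root, and "eviction"
is modeled by simply dropping the cached hashes. Only the signed root
is treated as durable; everything else is rebuilt from the file content
when needed.

```python
import hashlib
import hmac

UNIT = 4096  # assumed unit size

def leaf_hashes(data: bytes) -> list[bytes]:
    return [hashlib.sha256(data[i:i + UNIT]).digest()
            for i in range(0, len(data), UNIT)]

def root_of(leaves: list[bytes]) -> bytes:
    level = leaves
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

class VerifiedFile:
    """Hypothetical per-open-file state on the client."""

    def __init__(self, fetch, root: bytes, tag: bytes, key: bytes):
        self._fetch = fetch      # callable returning the file content
        self._root = root        # root retrieved from durable storage
        self._tag = tag          # stand-in for the signature over the root
        self._key = key
        self._leaves = None      # cached hashes; None means evicted
        self.rebuilds = 0

    def _rebuild(self) -> None:
        expected = hmac.new(self._key, self._root, hashlib.sha256).digest()
        if not hmac.compare_digest(self._tag, expected):
            raise ValueError("root signature does not verify")
        leaves = leaf_hashes(self._fetch())
        if root_of(leaves) != self._root:
            raise ValueError("content does not match signed root")
        self._leaves = leaves
        self.rebuilds += 1

    def evict(self) -> None:
        """Model memory pressure dropping the in-memory tree."""
        self._leaves = None

    def check_unit(self, index: int, unit: bytes) -> bool:
        if self._leaves is None:
            self._rebuild()      # rebuilt lazily, on first use after eviction
        return hashlib.sha256(unit).digest() == self._leaves[index]

content = b"A" * UNIT + b"B" * UNIT + b"C" * 100
key = b"demo-key"
root = root_of(leaf_hashes(content))
tag = hmac.new(key, root, hashlib.sha256).digest()
vf = VerifiedFile(lambda: content, root, tag, key)
```

Note that this sketch rebuilds the whole tree after eviction, which
requires re-reading the entire file; a mechanism that fetches only the
nodes needed for one unit would avoid that cost.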

There are also a load of minor things like how we get the head hash,
which must be presented and verified ahead of time for each of the
above 3.
Also, changes to a file's content and its tree signature are not
atomic. If a file is mutable, then there is a period between when
the file content has changed and when the signature is updated,
during which the two do not match. Some discussion of how a client
is to behave in those situations will be necessary.
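
A tiny Python sketch of that window, with hypothetical names and a
plain SHA-256 digest standing in for the signed tree root. It only
shows that a client reading between the two steps sees a mismatch; what
the client should do about it is the open question.

```python
import hashlib

class Server:
    """Hypothetical server: content and its stored root are updated separately."""

    def __init__(self, content: bytes):
        self.content = content
        self.stored_root = hashlib.sha256(content).digest()

    def write(self, new_content: bytes) -> None:
        self.content = new_content          # step 1: content changes

    def resign(self) -> None:
        # step 2: the stored root catches up with the content
        self.stored_root = hashlib.sha256(self.content).digest()

def client_sees_consistent(server: Server) -> bool:
    """What a verifying client would conclude at this instant."""
    return hashlib.sha256(server.content).digest() == server.stored_root

s = Server(b"version 1")
before = client_sees_consistent(s)
s.write(b"version 2")
during = client_sees_consistent(s)   # inside the window
s.resign()
after = client_sees_consistent(s)
```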


--
Chuck Lever
chucklever@gmail.com

