Thread (8 messages) 8 messages, 4 authors, 2022-09-28

Re: [PATCH] read-cache: avoid misaligned reads in index v4

From: Victoria Dye <hidden>
Date: 2022-09-26 16:47:53

Jeff King wrote:
On Fri, Sep 23, 2022 at 07:43:55PM +0000, Victoria Dye via GitGitGadget wrote:
quoted
@@ -1883,7 +1883,7 @@ static struct cache_entry *create_from_disk(struct mem_pool *ce_mem_pool,
 	size_t len;
 	const char *name;
 	const unsigned hashsz = the_hash_algo->rawsz;
-	const uint16_t *flagsp = (const uint16_t *)(ondisk->data + hashsz);
+	const char *flagsp = ondisk + offsetof(struct ondisk_cache_entry, data) + hashsz;
Now we use the "const char *" pointer instead of the cast to the
ondisk_cache_entry struct, which is good, and is what fixes the
alignment question.

But we also convert flagsp from being a uint16_t into a byte pointer.
I'm not sure if that's strictly necessary from an alignment perspective,
as we'd dereference it only via get_be16(), which handles alignment and
type conversion itself.

I'd imagine the standard probably says that even forming such a pointer
is illegal, so in that sense, it probably is undefined behavior. But I
think it's one of those things that's OK in practice.
Yep, per the C standard §6.3.2.3 #7 [1]:

  A pointer to an object or incomplete type may be converted to a pointer to
  a different object or incomplete type. If the resulting pointer is not
  correctly aligned for the pointed-to type, the behavior is undefined.

To your point, it is probably fine in practice, but I'd lean towards
sticking with a 'char *' to play it safe.

[1] https://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf
quoted
@@ -1935,20 +1935,24 @@ static struct cache_entry *create_from_disk(struct mem_pool *ce_mem_pool,
 
 	ce = mem_pool__ce_alloc(ce_mem_pool, len);
 
-	ce->ce_stat_data.sd_ctime.sec = get_be32(&ondisk->ctime.sec);
[...]
+	ce->ce_stat_data.sd_ctime.sec = get_be32(ondisk + offsetof(struct ondisk_cache_entry, ctime)
+							+ offsetof(struct cache_time, sec));
I had figured we'd be able to drop ondisk_cache_entry entirely. But here
you're using it essentially as a template for a set of constants
retrieved via offsetof().

That's OK from an alignment perspective. It does mean we'd be in trouble
if a compiler ever decided to introduce padding into the struct. That's
probably unlikely. We don't use __attribute__((packed)) because it's not
portable, and our existing uses have generally been OK, because our
data structures are organized around 8-byte alignment. We might have
problems on a theoretical 128-bit processor or something.
In addition to portability, using '__attribute__((packed))' could hurt
performance (and, in a large index, that might have a noticeable effect).

As for dropping 'ondisk_cache_entry()', I didn't want to drop it only from
the "read" operation (and use something like the "parse left-to-right"
strategy below) while leaving it in "write." And, as you mentioned later,
changing 'ce_write_entry()' is a lot more invasive than what's already in
this patch and possibly out-of-scope.
Another strategy is to just parse left-to-right, advancing the byte
pointer. Like:

  ce->ce_state_data.sd_ctime.sec = get_be32(ondisk);
  ondisk += sizeof(uint32_t);
  ce->ce_state_data.sd_mtime.sec = get_be32(ondisk);
  ondisk += sizeof(uint32_t);
  ...etc...

You can even stick that in a helper function that does the get_b32() and
advances, so you know they're always done in sync. See pack-bitmap.c's
read_be32(), etc. IMHO this produces a nice result because the reading
code itself becomes the source of truth for the format.
...
One final note, though:
quoted
+	ce->ce_stat_data.sd_mtime.sec = get_be32(ondisk + offsetof(struct ondisk_cache_entry, mtime)
+							+ offsetof(struct cache_time, sec));
Here (and elsewhere), you can assume that the offsetof() "sec" in
cache_time is 0, for two reasons:

  - I didn't look up chapter and verse, but I'm pretty sure the standard
    does guarantee that the first field of a struct is at the beginning.

  - If there's any padding, this whole scheme is hosed anyway, because
    it means sizeof(cache_time) is bigger than we expect, which messes
    up the offsetof() the entry after us (in this case sd_dev).

So this can just be:

  ce->ce_stat_data.sd_mtime.sec = get_be32(ondisk + offsetof(struct ondisk_cache_entry, mtime));

which is mercifully shorter.

Assuming we dismiss the rest of what I said as not worth it for a
minimal fix, I do think that simplification is worth rolling a v2.
That makes sense from a technical perspective, but I included the starting
entry offset for readability reasons. It might be confusing to someone
unfamiliar with C struct memory alignment to see every other 'get_be32'
refer to the exact entry it's reading via the 'offsetof()', but have that
information absent only for a few entries. And, the double 'offsetof()'
would still be used by the 'mtime.nsec'/'ctime.nsec' fields anyway.

In any case, if this patch is intended to be a short-lived change on the way
to a more complete refactor and/or I'm being overzealous on the readability,
I'd be happy to change it. :) 

Thanks!
-Peff

PS BTW, I mentioned earlier "can we just get rid of ondisk_cache_entry".
   We also use it for the writing side, of course. That doesn't have
   alignment issues, but it does have the same "I hope there's never any
   padding" question. In an ideal world, it would be using the
   equivalent put_be32(), but again, that's getting out of the "minimal
   fix" territory.
  
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help