Re: t9210-scalar.sh fails with SANITIZE=undefined
From: Victoria Dye <hidden>
Date: 2022-09-22 22:10:06
Subsystem:
the rest · Maintainer:
Linus Torvalds
Jeff King wrote:
On Thu, Sep 22, 2022 at 08:35:22AM -0400, Derrick Stolee wrote:quoted
quoted
I didn't dig further. It's obviously new in v2.38.0-rc1, but I'm not sure it's a show-stopper. It _might_ have been there all along, and is just now surfacing. Or it might be in an existing experimental feature that just wasn't exercised enough in the tests. Or if it really is new in scalar, then it will only hurt people using scalar, which didn't exist before. So I don't think it's a regression in the strictest sense, but it might be nice to get a more accurate understanding of the problem before the release.Interesting find! Here are the index-related settings that Scalar sets as of -rc1: * core.preloadIndex=true * index.threads=true * index.version=4 My gut feeling is that index.version=4 might be the culprit. I thought we tested GIT_TEST_INDEX_VERSION=4 in some CI modes, but apparently we do not. Do you get the same error in other tests with that environment variable?Yeah, that seems by far the most likely of those three. And indeed, running with GIT_TEST_INDEX_VERSION=4 causes even t0000 to fail with the same problem. A minimal reproduction in git.git is just: make SANITIZE=undefined git clone . tmp cd tmp rm .git/index export GIT_INDEX_VERSION=4 ../git reset --hard ;# ok, writes v4 index ../git reset --hard ;# fails reading unaligned v4 index So it seems like a problem with the v4 format in general. Which...yikes.
Other than allowing us to use a (non-packed) 'struct ondisk_cache_entry' to parse the index entries, is there any reason why the on-disk index entries should (or need to be) 4-byte aligned? If not, we could update how we read the 'ondisk' index entry in 'create_from_disk()' to avoid the misalignment. E.g.: ------------------8<------------------8<------------------8<------------------
diff --git a/read-cache.c b/read-cache.c
index b09128b188..f132a3f256 100644
--- a/read-cache.c
+++ b/read-cache.c@@ -1875,7 +1875,7 @@ static int read_index_extension(struct index_state *istate, static struct cache_entry *create_from_disk(struct mem_pool *ce_mem_pool, unsigned int version, - struct ondisk_cache_entry *ondisk, + const char *ondisk, unsigned long *ent_size, const struct cache_entry *previous_ce) {
@@ -1883,7 +1883,7 @@ static struct cache_entry *create_from_disk(struct mem_pool *ce_mem_pool, size_t len; const char *name; const unsigned hashsz = the_hash_algo->rawsz; - const uint16_t *flagsp = (const uint16_t *)(ondisk->data + hashsz); + const char *flagsp = ondisk + offsetof(struct ondisk_cache_entry, data) + hashsz; unsigned int flags; size_t copy_len = 0; /* ------------------>8------------------>8------------------>8------------------
the do the same sort of conversion with the rest of the function. It's certainly uglier than just using the 'struct ondisk_index_entry *' pointer, but it should avoid the misaligned addressing error.
-Peff