Re: [PATCH 08/15] cache: compare the entire buffer for struct object_id
From: brian m. carlson <hidden>
Date: 2021-04-11 21:05:57
On 2021-04-11 at 11:36:33, Ævar Arnfjörð Bjarmason wrote:
On Sat, Apr 10 2021, brian m. carlson wrote:quoted
Currently, when we compare two object IDs, we have to take a branch to determine what the hash size is supposed to be. The compiler can optimize well for a single length, but has trouble when there are two possible lengths.This would benefit from some performance/perf numbers. When this code was first changed like this in 183a638b7da (hashcmp: assert constant hash size, 2018-08-23) we had: Test v2.18.0 v2.19.0-rc0 HEAD ------------------------------------------------------------------------------ 0001.2: 34.24(33.81+0.43) 34.83(34.42+0.40) +1.7% 33.90(33.47+0.42) -1.0% Then it was later modified in 0dab7129ab1 (cache: make hashcmp and hasheq work with larger hashes, 2018-11-14).
I can do some perf numbers.
quoted
@@ -205,7 +205,7 @@ static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2) static inline int oidcmp(const struct object_id *oid1, const struct object_id *oid2) { - return hashcmp(oid1->hash, oid2->hash); + return memcmp(oid1->hash, oid2->hash, GIT_MAX_RAWSZ); }hashcmp is now: if (the_hash_algo->rawsz == GIT_MAX_RAWSZ) return memcmp(sha1, sha2, GIT_MAX_RAWSZ); return memcmp(sha1, sha2, GIT_SHA1_RAWSZ); Wouldn't it make more sense to amend it to just be a memcmp wrapper/macro if we're going to not make this conditional on the hash algorithm, or are there other callsites where we still want the old way of doing it?
No, we can't do that. With oidcmp, we know the buffer is large enough. However, in some cases, the buffer in hashcmp is not large enough. For example, we may be at the end of a SHA-1 tree object and we'd segfault. I did try that and I quickly found that it was totally broken. -- brian m. carlson (he/him or they/them) Houston, Texas, US
Attachments
- signature.asc [application/pgp-signature] 263 bytes