Re: [PATCH] fsck: do not loop infinitely when processing packs
From: Jeff King <hidden>
Date: 2026-02-23 09:27:50
On Mon, Feb 23, 2026 at 09:43:41AM +0100, Patrick Steinhardt wrote:
This is because ultimately the root cause seems to be a bit more complex: we don't only care about there being multiple packfiles. We also care about "core.bigFileThreshold". Typically, we don't execute `find_pack_entry()` at all when verifying packfiles as we iterate through objects in packfile order. We thus don't have to look up objects via their object ID, but instead we do so by using their packfile offset. And this mechanism will not end up in `find_pack_entry()`, and thus we wouldn't update the MRU. But there's an exception: when the size of the object that is to be checked exceeds "core.bigFileThreshold" we won't read it directly, but we'll instead use `stream_object_signature()`, which eventually ends up calling `odb_read_stream_open()`. And that of course _will_ call `find_pack_entry()`, as we're now in the mode where we search by object ID, not by offset. And consequently, we'll update the MRU in this call path.
Good find.
With that knowledge it's kind of easy to reproduce the issue: we simply need two packfiles, and each of them must contain at least one blob that is larger than "core.bigFileThreshold". Now I agree that the below proposed fix would be a good change to make the code more solid while we still have `repo_for_each_pack()` (I plan to eventually get rid of it). But arguably, the above logic is kind of broken regardless of this: we are asked to verify objects in the current pack, but we may end up verifying the object via a different pack. So if the same object were to exist in multiple packs, we might end up only verifying one of its instances.
Yeah, that was my immediate response after reading your analysis above (that fsck should not be doing find_pack_entry() in the first place here). -Peff