Re: [PATCH 0/6] odb: track commit graphs via object source
From: Patrick Steinhardt <hidden>
Date: 2025-10-02 11:36:01
On Thu, Oct 02, 2025 at 01:21:34PM +0200, Patrick Steinhardt wrote:
On Thu, Sep 25, 2025 at 12:17:50PM -0700, Junio C Hamano wrote:
Patrick Steinhardt [off-list ref] writes:
There is no inherent reason why a new backend would not be able to use the existing commit-graph infrastructure indeed. But there are reasons that specific backends may not want to do so. If objects are already stored in a database table, then it may make way more sense to store additional metadata that is currently stored in the commit-graph in a secondary database table instead of in the commit graph. ... This is roughly what I have in my head right now. And I realize that this information really should be sitting in a design document. I'm working on that, but still need to land two more patch series before I want to send such a patch series to the list.

So is everybody happy with this line of thought that makes it mandatory for each backend to decide and implement the commit-graph support if they want to? My reading of the later part of Taylor's message[*] tells me that at least Taylor does not agree with that position, and I am not sure about this design choice, either.

Surely, each backend can have its own optimization, but looking at the way data from the commit-graph and other auxiliary data files are used to optimize real operations (like populating the essential fields of the commit object first from the graph, only to read other things lazily from the object database, or switching to completely different traversal machinery when reachability bitmap is available), we cannot say that each backend can store whatever side data they please and leave it at that. The code paths that are supposed to be generic need to be aware of these side data used for optimization to some degree, so conceptually it is much cleaner (well, at least to my eyes, that is) to declare that the auxiliary data files like commit-graph and reachability bitmaps are defined on the objects in the repository, no matter what backend is used to store them.

My intent here is mostly to allow us to swap out how exactly the data is being cached.
During the Git Merge I heard from some JJ developer (I think) that they also have a pluggable cache, but they approach the issue differently: instead of making the cache a property of the object backend, they instead make the cache itself pluggable. I think that's a worthwhile angle to explore.

The cache would still sit on the repository level, and it wouldn't have to care at all whether we use loose objects/packfiles or any other backend. But in theory, we can still swap it out for a different representation as desired. Which overall means that we can defer this to a later point in time, as we can make it pluggable independent from making the object database itself pluggable.

So I'd propose to merge the first six patches, as everyone seemed to
Correction: first five patches, of course :)

Patrick
agree that they improve the status quo, but drop the last patch that moves the commit-graph into the ODB sources.

Does that seem reasonable to everyone? If so, I don't really see a reason to reroll at this point. But please let me know in case I miss anything that needs addressing.

Thanks!

Patrick
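
For readers following along, the "make the cache itself pluggable" idea from the quoted message can be sketched roughly as below. This is only an illustration under stated assumptions, written in Python for brevity (Git itself is C). Every name in it (`CommitMeta`, `CommitCache`, `ObjectBackend`, `DictCache`, `FakeBackend`, `Repository`) is hypothetical and does not correspond to any existing Git or JJ interface. The point is only that the cache sits at the repository level and can be swapped without the object backend knowing about it.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol, Tuple


@dataclass(frozen=True)
class CommitMeta:
    """Hypothetical subset of what a commit-graph-like cache might hold."""
    tree: str
    parents: Tuple[str, ...]
    generation: int


class CommitCache(Protocol):
    """Hypothetical pluggable cache interface, owned by the repository."""

    def lookup(self, oid: str) -> Optional[CommitMeta]: ...
    def store(self, oid: str, meta: CommitMeta) -> None: ...


class ObjectBackend(Protocol):
    """Hypothetical object backend; it knows nothing about the cache."""

    def read_commit(self, oid: str) -> CommitMeta: ...


class DictCache:
    """In-memory stand-in for a commit-graph file or a database table."""

    def __init__(self) -> None:
        self._entries: Dict[str, CommitMeta] = {}

    def lookup(self, oid: str) -> Optional[CommitMeta]:
        return self._entries.get(oid)

    def store(self, oid: str, meta: CommitMeta) -> None:
        self._entries[oid] = meta


class FakeBackend:
    """Toy backend that counts how often it is asked to parse a commit."""

    def __init__(self, commits: Dict[str, CommitMeta]) -> None:
        self._commits = commits
        self.reads = 0

    def read_commit(self, oid: str) -> CommitMeta:
        self.reads += 1
        return self._commits[oid]


class Repository:
    """Generic code path: consult the cache first, fall back to the backend."""

    def __init__(self, backend: ObjectBackend, cache: CommitCache) -> None:
        self.backend = backend
        self.cache = cache

    def commit_meta(self, oid: str) -> CommitMeta:
        meta = self.cache.lookup(oid)
        if meta is None:
            # Cache miss: read from whichever backend is configured,
            # then remember the result for later lookups.
            meta = self.backend.read_commit(oid)
            self.cache.store(oid, meta)
        return meta
```

In this sketch, replacing `DictCache` with another class that satisfies `CommitCache` changes how the metadata is stored without touching `FakeBackend` or `Repository.commit_meta`, which is the independence the message describes. Having the backend return a ready-made generation number is a simplification made for the example.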