Thread (75 messages) 75 messages, 5 authors, 8h ago

Re: [PATCH/RFC 2/6] commit-reach: introduce struct paint_queue with per-side counters

From: Derrick Stolee <hidden>
Date: 2026-06-23 13:50:18

On 6/23/2026 6:13 AM, Kristofer Karlsson wrote:
On Mon, 22 Jun 2026 at 22:23, Derrick Stolee [off-list ref] wrote:
quoted
On 6/22/2026 3:14 PM, Kristofer Karlsson wrote:
quoted
On Mon, 22 Jun 2026 at 20:10, Derrick Stolee [off-list ref] wrote:
quoted
When possible, I like to try to make loops only have one terminating
condition. Should we have paint_queue_get() return NULL when it sees
this internal state condition?
Possibly, but that would couple the paint_queue struct very tightly with
the usage. Not a problem in practice since it only has one call site, and
it's unlikely that we want to add more of them but it may feel more natural
to let the paint_queue purely have the queue semantics and counters,
and keep the halt condition within the function itself. I don't feel
super-strongly about this and can change it if needed, I will just need to
verify that nothing else gets complex as a result, I have not fully thought
through the effects.
Hm. Interesting. The coupling is perhaps expected, because the data
structure tracks counts that don't otherwise need to be tracked.
Maybe the terminating condition method could be descriptively named
to say why it would be completing.
I have been working on v2 locally and most of the changes landed
nicely and were clear improvements but there's one point I would
want to discuss a bit more.

For the termination conditions, I moved them into paint_queue_get()
as you suggested.  The all-zero check was straightforward since it
only depends on the counters but the side-exhaustion check also
needs to know whether we have entered the finite-generation region,
so I pass last_gen (already a local in paint_down_to_common) as a
parameter:

  static struct commit *paint_queue_get(struct paint_state *state,
                                        timestamp_t last_gen)

Inside, the two conditions merge nicely under a shared guard:

  if (!state->pending_merge_bases) {
      if (!state->p1_count && !state->p2_count)
          return NULL;
      if (last_gen < GENERATION_NUMBER_INFINITY &&
          (!state->p1_count || !state->p2_count))
          return NULL;
  }
This looks good to me. I'm not even bothered by the last_gen
parameter. You do make a good point about it being a potentially
leaky abstraction.
Both conditions require pending_merge_bases == 0, so the nesting
felt natural. The first is "nothing non-stale left" (works in any
region). The second is "one side exhausted" (only in the finite
region where topological ordering holds).

I think passing in last_gen into paint_queue_get() feels _slightly_
awkward but not too bad in practice.  However, we also have my
older (first) patch with the fast-exit if the caller only needs one
merge base -- that has a separate break that also could be folded
into paint_queue_get(). The messy part here is that we would need
to also pass the mb_flags parameter to paint_queue_get().
How much of this data that you are passing into the method could be
state in the paint_queue struct? Could we have the paint_queue manage
all of the state necessary to make decisions around the walk
termination?

Or, could we do a peek into the queue to see the "top" commit, and
check if it is a finite commit or not? I know that 'last_gen' is
supposed to be the commit walked in the previous cycle, but it seems
that we only care about "the remaining commits are finite" as our
condition. 
Perhaps we should just let this remain as-is for now and follow up
with _removing_ that optimization. I think the value of having it
is much diminished (but not fully gone) by the side-exhaust approach.

Additionally there's a correctness argument to be made -- perhaps
all callers _should_ care about multiple merge bases existing, and
instead bail out if it finds more than one. The only use case
where this matters today is "git merge-base A B" without --all.
Right now I am leaning towards simply passing in last_gen and
containing all of the halt conditions there
(except the old !FIND_ALL).
This is a good start, but hopefully storing the data in the
struct would be a good way to handle that.

Thanks,
-Stolee
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help