Thread (329 messages) 329 messages, 12 authors, 2018-03-14

Re: [PATCH v3 13/35] ls-refs: introduce ls-refs server command

From: Jeff King <hidden>
Date: 2018-02-24 04:01:58

On Thu, Feb 22, 2018 at 04:45:14PM -0800, Brandon Williams wrote:
quoted
This kind of tail matching can't quite implement all of the current
behavior. Because we actually do the normal dwim_ref() matching, which
includes stuff like "refs/remotes/%s/HEAD".

The other problem with tail-matching is that it's inefficient on the
server. Ideally we could get a request for "master" and only look up
refs/heads/master, refs/tags/master, etc. And if there are 50,000 refs
in refs/pull, we wouldn't have to process those at all. Of course this
is no worse than the current code, which not only looks at each ref but
actually _sends_ it. But it would be nice if we could fix this.

There's some more discussion in this old thread:

  https://public-inbox.org/git/20161024132932.i42rqn2vlpocqmkq@sigill.intra.peff.net/
Thanks for the pointer.  I was told to be wary a while about about
performance implications on the server but no discussion ensued till now
about it :)

We always have the ability to extend the patterns accepted via a feature
(or capability) to ls-refs, so maybe the best thing to do now would only
support a few patterns with specific semantics.  Something like if you
say "master" only match against refs/heads/ and refs/tags/ and if you
want something else you would need to specify "refs/pull/master"?
The big question is whether you want to break compatibility with the
existing program behavior. If not, then I think you have to ask for
every variant in ref_rev_parse_rules (of which there are 6 variants).

Which sounds pretty gross, but it actually may not be _too_ bad. Most
fetches tend to ask for either a single name, or they use left-anchored
wildcards. So it would work to just have the client expand all of the
possibilities itself into fully-qualified refs, and keep the server as
dumb as possible.

And then the server for now can just cull based on the pattern list,
like you have here. But later, we could optimize it to look up the
individual patterns, which should be cheaper, since we'd generally have
many fewer patterns than total refs.
quoted
Does the client have to be aware that we're using wildmatch? I think
they'd need "refs/heads/**" to actually implement what we usually
specify in refspecs as "refs/heads/*". Or does the lack of WM_PATHNAME
make this work with just "*"?

Do we anticipate that the client would left-anchor the refspec like
"/refs/heads/*" so that in theory the server could avoid looking outside
of /refs/heads/?
Yeah we may want to anchor it by providing the leading '/' instead of
just "refs/<blah>".
I actually wonder if we should just specify that the patterns must
_always_ be fully-qualified, but may end with a single "/*" to iterate
over wildcards. Or even simpler, that "refs/heads/foo" would find that
ref itself, and anything under it.

That drops any question about how wildcards work (e.g., does "refs/foo*"
work to find "refs/foobar"?).
I need to read over the discussion you linked to more but what sort of
ref patterns do you believe we should support as part of the initial
release of v2?  It seems like you wanted this at some point in the past
so I assume you have an idea of what sort of filtering would be
beneficial.
My goals were just optimizing:

  1. Don't send all the refs across the wire if we can avoid it.

  2. Don't even iterate over all the refs internally if we can avoid it.

Especially with the new binary-searching packed-refs code, we should be
able to serve a request like "ls-refs refs/heads/*" without looking into
"refs/pull" or "refs/changes" at all.

-Peff
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help