Re: [PATCH v3 14/35] connect: request remote refs using v2
From: Jonathan Nieder <hidden>
Date: 2018-02-27 06:52:07
Brandon Williams wrote:
Teach the client to be able to request a remote's refs using protocol v2. This is done by having a client issue a 'ls-refs' request to a v2 server.
Yay, ls-remote support! [...]
quoted hunk ↗ jump to hunk
--- a/builtin/upload-pack.c +++ b/builtin/upload-pack.c@@ -5,6 +5,7 @@ #include "parse-options.h" #include "protocol.h" #include "upload-pack.h" +#include "serve.h"
nit, no change needed in this patch: What is a good logical order for the #includes here? Bonus points if there's a tool to make it happen automatically. Asking since adding #includes like this at the end tends to result in a harder-to-read list of #includes, sometimes with duplicates, and often producing conflicts when multiple patches in flight add a #include to the same file. [...]
quoted hunk ↗ jump to hunk
@@ -48,11 +50,9 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix) switch (determine_protocol_version_server()) { case protocol_v2: - /* - * fetch support for protocol v2 has not been implemented yet, - * so ignore the request to use v2 and fallback to using v0. - */ - upload_pack(&opts); + serve_opts.advertise_capabilities = opts.advertise_refs; + serve_opts.stateless_rpc = opts.stateless_rpc; + serve(&serve_opts);
Interesting! daemon.c has its own (file-local) serve() function; can one of the two be renamed? I actually like both names: serve() is a traditional name for the function a server calls when it's done setting up and is ready to serve. But the clash is confusing. [...]
quoted hunk ↗ jump to hunk
+++ b/connect.c@@ -12,9 +12,11 @@ #include "sha1-array.h" #include "transport.h" #include "strbuf.h" +#include "version.h" #include "protocol.h" static char *server_capabilities; +static struct argv_array server_capabilities_v2 = ARGV_ARRAY_INIT;
Can a quick doc comment describe these and how they relate? Is only one of them set, based on which protocol version is in use? Should server_capabilities be renamed to server_capabilities_v1?
quoted hunk ↗ jump to hunk
static const char *parse_feature_value(const char *, const char *, int *); static int check_ref(const char *name, unsigned int flags)@@ -62,6 +64,33 @@ static void die_initial_contact(int unexpected) "and the repository exists.")); } +/* Checks if the server supports the capability 'c' */ +int server_supports_v2(const char *c, int die_on_error) +{ + int i; + + for (i = 0; i < server_capabilities_v2.argc; i++) { + const char *out; + if (skip_prefix(server_capabilities_v2.argv[i], c, &out) && + (!*out || *out == '=')) + return 1; + } + + if (die_on_error) + die("server doesn't support '%s'", c); + + return 0; +}
Nice.
+
+static void process_capabilities_v2(struct packet_reader *reader)
+{
+ while (packet_reader_read(reader) == PACKET_READ_NORMAL)
+ argv_array_push(&server_capabilities_v2, reader->line);
+
+ if (reader->status != PACKET_READ_FLUSH)
+ die("protocol error");Can this say more? E.g. "expected flush after capabilities, got <foo>"? [...]
quoted hunk ↗ jump to hunk
@@ -85,7 +114,7 @@ enum protocol_version discover_version(struct packet_reader *reader) /* Maybe process capabilities here, at least for v2 */
Is this comment out of date now?
quoted hunk ↗ jump to hunk
switch (version) { case protocol_v2: - die("support for protocol v2 not implemented yet"); + process_capabilities_v2(reader); break; case protocol_v1: /* Read the peeked version line */@@ -293,6 +322,98 @@ struct ref **get_remote_heads(struct packet_reader *reader, return list; } +static int process_ref_v2(const char *line, struct ref ***list)
What does the return value represent? Could it return the more typical 0 on success, -1 on error?
+{
+ int ret = 1;
+ int i = 0;
+ struct object_id old_oid;
+ struct ref *ref;
+ struct string_list line_sections = STRING_LIST_INIT_DUP;
+
+ if (string_list_split(&line_sections, line, ' ', -1) < 2) {Can there be a comment describing the expected format?
+ ret = 0;
+ goto out;
+ }
+
+ if (get_oid_hex(line_sections.items[i++].string, &old_oid)) {
+ ret = 0;
+ goto out;
+ }
+
+ ref = alloc_ref(line_sections.items[i++].string);Ref names cannot contains a space, so this is safe. Good.
+
+ oidcpy(&ref->old_oid, &old_oid);
+ **list = ref;
+ *list = &ref->next;
+
+ for (; i < line_sections.nr; i++) {
+ const char *arg = line_sections.items[i].string;
+ if (skip_prefix(arg, "symref-target:", &arg))
+ ref->symref = xstrdup(arg);Using space-delimited fields in a single pkt-line means that - values cannot contains a space - total length is limited by the size of a pkt-line Given the context, I think that's fine. More generally it is tempting to use a pkt-line per field to avoid the trouble v1 had with capability lists crammed into a pkt-line, but I see why you used a pkt-line per ref to avoid having to have sections-within-a-section. My only potential worry is the length part: do we have an explicit limit on the length of a ref name? git-check-ref-format(1) doesn't mention one. A 32k ref name would be a bit ridiculous, though.
+
+ if (skip_prefix(arg, "peeled:", &arg)) {
+ struct object_id peeled_oid;
+ char *peeled_name;
+ struct ref *peeled;
+ if (get_oid_hex(arg, &peeled_oid)) {
+ ret = 0;
+ goto out;
+ }Can this also check that there's no trailing garbage after the oid? [...]
+
+ peeled_name = xstrfmt("%s^{}", ref->name);
optional: can reuse a buffer to avoid allocation churn:
struct strbuf peeled_name = STRBUF_INIT;
strbuf_reset(&peeled_name);
strbuf_addf(&peeled_name, "%s^{}", ref->name);
// or strbuf_addstr(ref->name); strbuf_addstr("^{}");
out:
strbuf_release(&peeled_name);
[...]+struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
+ struct ref **list, int for_push,
+ const struct argv_array *ref_patterns)
+{
+ int i;
+ *list = NULL;
+
+ /* Check that the server supports the ls-refs command */
+ /* Issue request for ls-refs */
+ if (server_supports_v2("ls-refs", 1))
+ packet_write_fmt(fd_out, "command=ls-refs\n");Since the code is so clear, I don't think the above two comments are helping.
+
+ if (server_supports_v2("agent", 0))
+ packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized());whitespace nit: mixing tabs and spaces. Does "make style" catch this?
+ + packet_delim(fd_out); + /* When pushing we don't want to request the peeled tags */
Can you say more about this? In theory it would be nice to have the peeled tags since they name commits whose history can be excluded from the pack.
+ if (!for_push) + packet_write_fmt(fd_out, "peel\n"); + packet_write_fmt(fd_out, "symrefs\n");
Are symrefs useful during push?
+ for (i = 0; ref_patterns && i < ref_patterns->argc; i++) {
+ packet_write_fmt(fd_out, "ref-pattern %s\n",
+ ref_patterns->argv[i]);
+ }The exciting part. Why do these pkts end with \n? I would have expected the pkt-line framing to work to delimit them.
+ packet_flush(fd_out);
+
+ /* Process response from server */
+ while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+ if (!process_ref_v2(reader->line, &list))
+ die("invalid ls-refs response: %s", reader->line);
+ }
+
+ if (reader->status != PACKET_READ_FLUSH)
+ die("protocol error");Can this protocol error give more detail? When diagnosing an error in servers, proxies, or lower-level networking issues, informative protocol errors can be very helpful (similar to syntax errors from a compiler). [...]
quoted hunk ↗ jump to hunk
--- a/connect.h +++ b/connect.h@@ -16,4 +16,6 @@ extern int url_is_local_not_ssh(const char *url); struct packet_reader; extern enum protocol_version discover_version(struct packet_reader *reader); +extern int server_supports_v2(const char *c, int die_on_error);
const char *cap, maybe? [...]
quoted hunk ↗ jump to hunk
--- a/remote.h +++ b/remote.h@@ -151,10 +151,14 @@ void free_refs(struct ref *ref); struct oid_array; struct packet_reader; +struct argv_array; extern struct ref **get_remote_heads(struct packet_reader *reader, struct ref **list, unsigned int flags, struct oid_array *extra_have, struct oid_array *shallow_points); +extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader, + struct ref **list, int for_push, + const struct argv_array *ref_patterns);
What is the difference between get_remote_heads and get_remote_refs? A comment might help. (BTW, thanks for making the new saner name to replace get_remote_heads!) [...]
quoted hunk ↗ jump to hunk
--- /dev/null +++ b/t/t5702-protocol-v2.sh@@ -0,0 +1,53 @@ +#!/bin/sh + +test_description='test git wire-protocol version 2'
Woot! [...]
+test_expect_success 'list refs with git:// using protocol v2' ' + GIT_TRACE_PACKET=1 git -c protocol.version=2 \ + ls-remote --symref "$GIT_DAEMON_URL/parent" >actual 2>log && + + # Client requested to use protocol v2 + grep "git> .*\\\0\\\0version=2\\\0$" log && + # Server responded using protocol v2 + grep "git< version 2" log &&
optional: Could anchor these greps to make the test tighter (e.g. to not match "version 20". Thanks, Jonathan