Thread (329 messages) 329 messages, 12 authors, 2018-03-14

Re: [PATCH v3 14/35] connect: request remote refs using v2

From: Jonathan Nieder <hidden>
Date: 2018-02-27 06:52:07

Brandon Williams wrote:
Teach the client to be able to request a remote's refs using protocol
v2.  This is done by having a client issue a 'ls-refs' request to a v2
server.
Yay, ls-remote support!

[...]
quoted hunk ↗ jump to hunk
--- a/builtin/upload-pack.c
+++ b/builtin/upload-pack.c
@@ -5,6 +5,7 @@
 #include "parse-options.h"
 #include "protocol.h"
 #include "upload-pack.h"
+#include "serve.h"
nit, no change needed in this patch: What is a good logical order for
the #includes here?  Bonus points if there's a tool to make it happen
automatically.

Asking since adding #includes like this at the end tends to result in
a harder-to-read list of #includes, sometimes with duplicates, and
often producing conflicts when multiple patches in flight add a
#include to the same file.

[...]
quoted hunk ↗ jump to hunk
@@ -48,11 +50,9 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
 
 	switch (determine_protocol_version_server()) {
 	case protocol_v2:
-		/*
-		 * fetch support for protocol v2 has not been implemented yet,
-		 * so ignore the request to use v2 and fallback to using v0.
-		 */
-		upload_pack(&opts);
+		serve_opts.advertise_capabilities = opts.advertise_refs;
+		serve_opts.stateless_rpc = opts.stateless_rpc;
+		serve(&serve_opts);
Interesting!  daemon.c has its own (file-local) serve() function;
can one of the two be renamed?

I actually like both names: serve() is a traditional name for the
function a server calls when it's done setting up and is ready to
serve.  But the clash is confusing.

[...]
quoted hunk ↗ jump to hunk
+++ b/connect.c
@@ -12,9 +12,11 @@
 #include "sha1-array.h"
 #include "transport.h"
 #include "strbuf.h"
+#include "version.h"
 #include "protocol.h"
 
 static char *server_capabilities;
+static struct argv_array server_capabilities_v2 = ARGV_ARRAY_INIT;
Can a quick doc comment describe these and how they relate?

Is only one of them set, based on which protocol version is in use?
Should server_capabilities be renamed to server_capabilities_v1?
quoted hunk ↗ jump to hunk
 static const char *parse_feature_value(const char *, const char *, int *);
 
 static int check_ref(const char *name, unsigned int flags)
@@ -62,6 +64,33 @@ static void die_initial_contact(int unexpected)
 		      "and the repository exists."));
 }
 
+/* Checks if the server supports the capability 'c' */
+int server_supports_v2(const char *c, int die_on_error)
+{
+	int i;
+
+	for (i = 0; i < server_capabilities_v2.argc; i++) {
+		const char *out;
+		if (skip_prefix(server_capabilities_v2.argv[i], c, &out) &&
+		    (!*out || *out == '='))
+			return 1;
+	}
+
+	if (die_on_error)
+		die("server doesn't support '%s'", c);
+
+	return 0;
+}
Nice.
+
+static void process_capabilities_v2(struct packet_reader *reader)
+{
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL)
+		argv_array_push(&server_capabilities_v2, reader->line);
+
+	if (reader->status != PACKET_READ_FLUSH)
+		die("protocol error");
Can this say more?  E.g. "expected flush after capabilities, got <foo>"?

[...]
quoted hunk ↗ jump to hunk
@@ -85,7 +114,7 @@ enum protocol_version discover_version(struct packet_reader *reader)
 	/* Maybe process capabilities here, at least for v2 */
Is this comment out of date now?
quoted hunk ↗ jump to hunk
 	switch (version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		process_capabilities_v2(reader);
 		break;
 	case protocol_v1:
 		/* Read the peeked version line */
@@ -293,6 +322,98 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 	return list;
 }
 
+static int process_ref_v2(const char *line, struct ref ***list)
What does the return value represent?

Could it return the more typical 0 on success, -1 on error?
+{
+	int ret = 1;
+	int i = 0;
+	struct object_id old_oid;
+	struct ref *ref;
+	struct string_list line_sections = STRING_LIST_INIT_DUP;
+
+	if (string_list_split(&line_sections, line, ' ', -1) < 2) {
Can there be a comment describing the expected format?
+		ret = 0;
+		goto out;
+	}
+
+	if (get_oid_hex(line_sections.items[i++].string, &old_oid)) {
+		ret = 0;
+		goto out;
+	}
+
+	ref = alloc_ref(line_sections.items[i++].string);
Ref names cannot contains a space, so this is safe.  Good.
+
+	oidcpy(&ref->old_oid, &old_oid);
+	**list = ref;
+	*list = &ref->next;
+
+	for (; i < line_sections.nr; i++) {
+		const char *arg = line_sections.items[i].string;
+		if (skip_prefix(arg, "symref-target:", &arg))
+			ref->symref = xstrdup(arg);
Using space-delimited fields in a single pkt-line means that
- values cannot contains a space
- total length is limited by the size of a pkt-line

Given the context, I think that's fine.  More generally it is tempting
to use a pkt-line per field to avoid the trouble v1 had with
capability lists crammed into a pkt-line, but I see why you used a
pkt-line per ref to avoid having to have sections-within-a-section.

My only potential worry is the length part: do we have an explicit
limit on the length of a ref name?  git-check-ref-format(1) doesn't
mention one.  A 32k ref name would be a bit ridiculous, though.
+
+		if (skip_prefix(arg, "peeled:", &arg)) {
+			struct object_id peeled_oid;
+			char *peeled_name;
+			struct ref *peeled;
+			if (get_oid_hex(arg, &peeled_oid)) {
+				ret = 0;
+				goto out;
+			}
Can this also check that there's no trailing garbage after the oid?

[...]
+
+			peeled_name = xstrfmt("%s^{}", ref->name);
optional: can reuse a buffer to avoid allocation churn:

	struct strbuf peeled_name = STRBUF_INIT;

			strbuf_reset(&peeled_name);
			strbuf_addf(&peeled_name, "%s^{}", ref->name);
			// or strbuf_addstr(ref->name); strbuf_addstr("^{}");

 out:
 	strbuf_release(&peeled_name);

[...]
+struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
+			     struct ref **list, int for_push,
+			     const struct argv_array *ref_patterns)
+{
+	int i;
+	*list = NULL;
+
+	/* Check that the server supports the ls-refs command */
+	/* Issue request for ls-refs */
+	if (server_supports_v2("ls-refs", 1))
+		packet_write_fmt(fd_out, "command=ls-refs\n");
Since the code is so clear, I don't think the above two comments are
helping.
+
+	if (server_supports_v2("agent", 0))
+	    packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized());
whitespace nit: mixing tabs and spaces.  Does "make style" catch this?
+
+	packet_delim(fd_out);
+	/* When pushing we don't want to request the peeled tags */
Can you say more about this?  In theory it would be nice to have the
peeled tags since they name commits whose history can be excluded from
the pack.
+	if (!for_push)
+		packet_write_fmt(fd_out, "peel\n");
+	packet_write_fmt(fd_out, "symrefs\n");
Are symrefs useful during push?
+	for (i = 0; ref_patterns && i < ref_patterns->argc; i++) {
+		packet_write_fmt(fd_out, "ref-pattern %s\n",
+				 ref_patterns->argv[i]);
+	}
The exciting part.

Why do these pkts end with \n?  I would have expected the pkt-line
framing to work to delimit them.
+	packet_flush(fd_out);
+
+	/* Process response from server */
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		if (!process_ref_v2(reader->line, &list))
+			die("invalid ls-refs response: %s", reader->line);
+	}
+
+	if (reader->status != PACKET_READ_FLUSH)
+		die("protocol error");
Can this protocol error give more detail?  When diagnosing an error in
servers, proxies, or lower-level networking issues, informative protocol
errors can be very helpful (similar to syntax errors from a compiler).

[...]
quoted hunk ↗ jump to hunk
--- a/connect.h
+++ b/connect.h
@@ -16,4 +16,6 @@ extern int url_is_local_not_ssh(const char *url);
 struct packet_reader;
 extern enum protocol_version discover_version(struct packet_reader *reader);
 
+extern int server_supports_v2(const char *c, int die_on_error);
const char *cap, maybe?

[...]
quoted hunk ↗ jump to hunk
--- a/remote.h
+++ b/remote.h
@@ -151,10 +151,14 @@ void free_refs(struct ref *ref);
 
 struct oid_array;
 struct packet_reader;
+struct argv_array;
 extern struct ref **get_remote_heads(struct packet_reader *reader,
 				     struct ref **list, unsigned int flags,
 				     struct oid_array *extra_have,
 				     struct oid_array *shallow_points);
+extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
+				    struct ref **list, int for_push,
+				    const struct argv_array *ref_patterns);
What is the difference between get_remote_heads and get_remote_refs?
A comment might help.  (BTW, thanks for making the new saner name to
replace get_remote_heads!)

[...]
quoted hunk ↗ jump to hunk
--- /dev/null
+++ b/t/t5702-protocol-v2.sh
@@ -0,0 +1,53 @@
+#!/bin/sh
+
+test_description='test git wire-protocol version 2'
Woot!

[...]
+test_expect_success 'list refs with git:// using protocol v2' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=2 \
+		ls-remote --symref "$GIT_DAEMON_URL/parent" >actual 2>log &&
+
+	# Client requested to use protocol v2
+	grep "git> .*\\\0\\\0version=2\\\0$" log &&
+	# Server responded using protocol v2
+	grep "git< version 2" log &&
optional: Could anchor these greps to make the test tighter (e.g. to
not match "version 20".

Thanks,
Jonathan
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help