Re: [PATCH] http: add custom hostname to IP address resolves
From: Christian Couder <hidden>
Date: 2022-05-04 10:08:17
On Mon, May 2, 2022 at 9:04 PM Junio C Hamano [off-list ref] wrote:
Christian Couder [off-list ref] writes:quoted
Subject: Re: [PATCH] http: add custom hostname to IP address resolvesI can guess what you means, but I am not able to quite parse the above. I guess the phrase that makes me hiccup when I read is "resolve" used as a noun.
Ok, I will use "http: add custom hostname to IP address resolutions" in the next version then.
quoted
Libcurl has a CURLOPT_RESOLVE easy option that allows hostname resolve information in the following form toThe same here, "... allows the result of hostname resolution in the following format ...", perhaps?
Ok, I will use your suggestion.
quoted
be passed: [+]HOST:PORT:ADDRESS[,ADDRESS] This way, redirects and everything operating against the HOST+PORT will use the provided ADDRESS(s). The following form is also allowed to stop using these resolves: -HOST:PORTThe above is a reasonable summary of CURLOPT_RESOLVE documentation that is appropriate to have here for those of us who are not familiar with it. For those of us who may want to learn more, it would help to have an URL to the canonical documentation page, e.g. https://curl.se/libcurl/c/CURLOPT_RESOLVE.html but it is not required. People should be able to find it easily.
Yeah, I also thought that it wasn't required, but I will add it anyway, as I agree it could be useful and hopefully it doesn't change very often.
quoted
Let's add a corresponding "http.hostResolve" config option that takes advantage of CURLOPT_RESOLVE.CURLOPT_RESOLVE allows us to feed cURL a list of these <host,port> -> <address> mappings, so we use that mechansim to feed the values listed on the multi-valued configuration variable (spell it out as such, by the way, instead of saying "config option", which may give a false impression that it is a last-one-wins single string with many such mapping entries on it). OK.quoted
Each value configured for the "http.hostResolve" key is passed "as is" to curl through CURLOPT_RESOLVE, so it should be in one of the above 2 forms. This keeps the implementation simple and makes us consistent with libcurl's CURLOPT_RESOLVE, and with curl's corresponding `--resolve` command line option.OK.quoted
I am not sure if some tests could/should be added. Ideas about how to test this are welcome.It should. Perhaps invent a totally bogus domain name, map that to localhost ::1, run a test server locally, and try to clone from that bogus domain?
Ok, I will add a simple test like this.
quoted
+http.hostResolve::Is "host" a good prefix for it? In the context of HTTP(s), if there is no other thing than host that we resolve, "http.resolve" may be sufficient. For those who are looking for CURLOPT_RESOLVE equivalent, "http.curloptResolve" may make it easier to find.
I am Ok with just "http.resolve". I think using "curlopt" is perhaps going into too many details about the implementation of the feature, which could theoretically change if we ever decided to use something other than curl.
quoted
+ Hostname resolve information that will be used first when sending + HTTP requests. This information should be in one of the following + forms: + + - [+]HOST:PORT:ADDRESS[,ADDRESS] + - -HOST:PORT + ++ +The first form redirects all requests to the given `HOST:PORT` +to the provided `ADDRESS`(s). The second form clears all previous +config values for that `HOST:PORT` combination. To allow easy +overriding of all the settings inherited from the system config, +an empty value will reset all resolve information to the empty +list.If I understand your use case correctly, this is not something you would want to hardcode in your configuration files for long term, is it?
Right, but I wonder if there are other use cases where people would like to hardcode it, for example on a private network where IP addresses rarely change. Also a config option makes it more consistent with "http.extraHeaders" and other such options.
I am wondering if we want to mention the expected use case here
as well, something like
This is designed to be used primarily from the command line
configuration variable override, e.g.
$ git -c http.resolve=example.com:443:127.0.0.1 \
clone https://example.com/user/project.git
perhaps? Not a suggestion, but soliciting thoughts.I am also interested in others' thoughts about this. If no one thinks that a config option could be useful, I am Ok with making it a "--resolve" command line option that can be passed to any Git command similar to "-c <name>=<value>": git --resolve=... <command> [<args>]
quoted
diff --git a/http.c b/http.c index 229da4d148..e9cc46ee52 100644 --- a/http.c +++ b/http.c@@ -128,6 +128,9 @@ static struct curl_slist *pragma_header; static struct curl_slist *no_pragma_header; static struct string_list extra_http_headers = STRING_LIST_INIT_DUP; +static struct curl_slist *host_resolves; +static struct string_list http_host_resolve = STRING_LIST_INIT_DUP; + static struct active_request_slot *active_queue_head; static char *cached_accept_language;@@ -393,6 +396,17 @@ static int http_options(const char *var, const char *value, void *cb) return 0; } + if (!strcmp("http.hostresolve", var)) { + if (!value) { + return config_error_nonbool(var); + } else if (!*value) { + string_list_clear(&http_host_resolve, 0);OK, this is a way to "clear" the list of entries accumulated on this multi-valued configuration variable so far. And it is documented in the above, too. Good.quoted
+ } else { + string_list_append(&http_host_resolve, value); + } + return 0; + } + if (!strcmp("http.followredirects", var)) { if (value && !strcmp(value, "initial")) http_follow_config = HTTP_FOLLOW_INITIAL;@@ -985,6 +999,17 @@ static void set_from_env(const char **var, const char *envname) *var = val; } +static struct curl_slist *http_copy_host_resolve(void) +{ + struct curl_slist *hosts = NULL; + const struct string_list_item *item; + + for_each_string_list_item(item, &http_host_resolve) + hosts = curl_slist_append(hosts, item->string); + + return hosts; +} + void http_init(struct remote *remote, const char *url, int proactive_auth) { char *low_speed_limit;@@ -1048,6 +1073,8 @@ void http_init(struct remote *remote, const char *url, int proactive_auth) no_pragma_header = curl_slist_append(http_copy_default_headers(), "Pragma:"); + host_resolves = http_copy_host_resolve();This is curious. I imagined that the reason why you keep the original in a string_list and copy it to a curl_slist was perhaps because you'll lose the latter every time you make a curl request, but it does not appear to be the case. You http_init() just once and then the same CURL *curl instance will be reused until you clear it with http_cleanup(). So I do not see offhand the need to have the string_list at all.
Ok, I will remove it.
Does it work equally well if you used curl_slist_append() in http_options() and maintain ONLY the curl_slist version of the host_resolve list? That would make it unnecessary to keep http_host_resolve and add http_copy_host_resolve() function, no?
Yeah, right. I will remove http_host_resolve and http_copy_host_resolve(). Thanks!