Re: [PATCH v14 14/27] bisect--helper: `bisect_next` and `bisect_auto_next` shell function in C
From: Pranit Bauva <hidden>
Date: 2016-08-30 18:25:47
Possibly related (same subject, not in this thread)
- 2016-08-30 · Re: [PATCH v14 14/27] bisect--helper: `bisect_next` and `bisect_auto_next` shell function in C · Pranit Bauva <hidden>
- 2016-08-30 · Re: [PATCH v14 14/27] bisect--helper: `bisect_next` and `bisect_auto_next` shell function in C · Junio C Hamano <hidden>
- 2016-08-23 · [PATCH v14 14/27] bisect--helper: `bisect_next` and `bisect_auto_next` shell function in C · Pranit Bauva <hidden>
Hey Junio, Sorry for a late replay. On Fri, Aug 26, 2016 at 2:00 AM, Junio C Hamano [off-list ref] wrote:
Pranit Bauva [off-list ref] writes:quoted
A lot of parts of bisect.c uses exit() and these signals are then trapped in the `bisect_start` function. Since the shell script ceases its existence it would be necessary to convert those exit() calls to return statements so that errors can be reported efficiently in C code.Is efficiency really an issue? I think the real reason is that it would make it impossible for the callers to handle errors, if you do not convert and let the error codepaths exit().
I think I put the word "efficiently" wrongly over here. Will omit it.
quoted
@@ -729,7 +735,7 @@ static struct commit **get_bad_and_good_commits(int *rev_nr) return rev; } -static void handle_bad_merge_base(void) +static int handle_bad_merge_base(void) { if (is_expected_rev(current_bad_oid)) { char *bad_hex = oid_to_hex(current_bad_oid);@@ -750,17 +756,18 @@ static void handle_bad_merge_base(void) "between %s and [%s].\n"), bad_hex, term_bad, term_good, bad_hex, good_hex); } - exit(3); + return 3; } fprintf(stderr, _("Some %s revs are not ancestor of the %s rev.\n" "git bisect cannot work properly in this case.\n" "Maybe you mistook %s and %s revs?\n"), term_good, term_bad, term_good, term_bad); - exit(1); + bisect_clean_state(); + return 1;What is the logic behind this function sometimes clean the state, and some other times do not, when it makes an error-return? We see above that "return 3" codepath leaves the state behind. Either you forgot a necessary clean_state in "return 3" codepath, or you forgot to document why the distinction exists in the in-code comment for the function. I cannot tell which, but I am leaning towards guessing that it is the former.
This is a very tricky one. I have purposely not included this after a lot of testing. I have hand tested with the original git and with this branch. The reason why anyone wouldn't be able to catch this is because its not covered in the test suite. I am including a patch with this as an attachment (because I am behind a proxy right now but don't worry I will include this as a commit in the next series). The original behaviour of git does not clean the bisect state when this situation occurs. On another note which you might have missed that bisect_clean_state() is purposely put before return 1 which is covered by the test suite. You can try removing it and see that there is a broken tes. tI was thinking of including the tests after the whole conversion but now I think including this before will make the conversion more easier for review.
quoted
-static void check_good_are_ancestors_of_bad(const char *prefix, int no_checkout) +static int check_good_are_ancestors_of_bad(const char *prefix, int no_checkout) { char *filename = git_pathdup("BISECT_ANCESTORS_OK"); struct stat st; - int fd; + int fd, res = 0; if (!current_bad_oid) die(_("a %s revision is needed"), term_bad);Can you let it die yere?
Not really. I should change it to return error().
quoted
@@ -873,8 +890,11 @@ static void check_good_are_ancestors_of_bad(const char *prefix, int no_checkout) filename); else close(fd); + + return 0; done: free(filename); + return 0; }Who owns "filename"? The first "return 0" leaves it unfreed, and when "goto done" is done, it is freed. The above two may indicate that "perhaps 'retval + goto finish' pattern?" is a really relevant suggestion for the earlier steps in this series.
Yes.
quoted
if (!all) { fprintf(stderr, _("No testable commit found.\n" "Maybe you started with bad path parameters?\n")); - exit(4); + return 4; } bisect_rev = revs.commits->item->object.oid.hash; if (!hashcmp(bisect_rev, current_bad_oid->hash)) { - exit_if_skipped_commits(tried, current_bad_oid); + res = exit_if_skipped_commits(tried, current_bad_oid); + if (res) + return res; + printf("%s is the first %s commit\n", sha1_to_hex(bisect_rev), term_bad); show_diff_tree(prefix, revs.commits->item); /* This means the bisection process succeeded. */ - exit(10); + return 10; } nr = all - reaches - 1;@@ -1005,7 +1033,11 @@ int bisect_next_all(const char *prefix, int no_checkout) "Bisecting: %d revisions left to test after this %s\n", nr), nr, steps_msg); - return bisect_checkout(bisect_rev, no_checkout); + res = bisect_checkout(bisect_rev, no_checkout); + if (res) + bisect_clean_state(); + + return res; }There were tons of "exit_if" that was converted to "if (res) return res" above, instead of jumping here to cause clean_state to be called. I cannot tell if this new call to clean_state() is wrong, or all the earlier "return res" should come here. I am guessing the latter.
No I don't think its wrong. It is advised in the comment
In case of mistaken revs or checkout error, or signals received,
"bisect_auto_next" below may exit or misbehave.
We have to trap this to be able to clean up using
"bisect_clean_state".
to clean the bisection state *iff checkout fails* otherwise not.
Luckily this is already covered by the test suite. I think what you
meant is that bisect_clean_state() might be covered elsewhere too then
why perform cleanup here, right? bisect_clean_state() have been
carefully put where ever required only.
quoted
diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c index c64996a..ef7b3a1 100644 --- a/builtin/bisect--helper.c +++ b/builtin/bisect--helper.c@@ -8,6 +8,7 @@ #include "run-command.h" #include "prompt.h" #include "quote.h" +#include "revision.h" static GIT_PATH_FUNC(git_path_bisect_terms, "BISECT_TERMS") static GIT_PATH_FUNC(git_path_bisect_expected_rev, "BISECT_EXPECTED_REV")@@ -29,6 +30,8 @@ static const char * const git_bisect_helper_usage[] = { N_("git bisect--helper --bisect-terms [--term-good | --term-old | --term-bad | --term-new]"), N_("git bisect--helper --bisect start [--term-{old,good}=<term> --term-{new,bad}=<term>]" "[--no-checkout] [<bad> [<good>...]] [--] [<paths>...]"), + N_("git bisect--helper --bisect-next"), + N_("git bisect--helper --bisect-auto-next"), NULL };@@ -396,6 +399,129 @@ static int bisect_terms(struct bisect_terms *terms, const char **argv, int argc) return 0; } +static int register_good_ref(const char *refname, + const struct object_id *oid, int flags, + void *cb_data) +{ + struct string_list *good_refs = cb_data; + string_list_append(good_refs, oid_to_hex(oid)); + return 0; +} + +static int bisect_next(struct bisect_terms *terms, const char *prefix) +{ + int res, no_checkout; + + /* In case of mistaken revs or checkout error, or signals received,That's an unbalanced comment. You end the block with "*/" on its own line, so you should start the block with "/*" on its own line. There seems to be at least one more such comment in this patch but I won't repeat.
I shall change this.
quoted
+ * "bisect_auto_next" below may exit or misbehave. + * We have to trap this to be able to clean up using + * "bisect_clean_state". + */"exit" meaning "call exit() to terminate the process", or something else? If the latter, don't say "exit", but say "return error".
Yes I think its necessary to change "exit" in comments to "return". There are other places too in which I will have to change.
quoted
+ if (bisect_next_check(terms, terms->term_good.buf)) + return -1;Mental note. The "autostart" in the original is gone. Perhaps it is done by next_check in this code, but we haven't seen that yet.
This will be added back again in a coming patch[1].
quoted
+ no_checkout = !is_empty_or_missing_file(git_path_bisect_head()); + + /* Perform all bisection computation, display and checkout */ + res = bisect_next_all(prefix , no_checkout); + + if (res == 10) { + FILE *fp; + unsigned char sha1[20]; + struct commit *commit; + struct pretty_print_context pp = {0}; + struct strbuf commit_name = STRBUF_INIT; + char *bad_ref = xstrfmt("refs/bisect/%s", + terms->term_bad.buf); + read_ref(bad_ref, sha1); + commit = lookup_commit_reference(sha1); + format_commit_message(commit, "%s", &commit_name, &pp); + fp = fopen(git_path_bisect_log(), "a"); + if (!fp) { + free(bad_ref); + strbuf_release(&commit_name); + return -1; + } + if (fprintf(fp, "# first %s commit: [%s] %s\n", + terms->term_bad.buf, sha1_to_hex(sha1), + commit_name.buf) < 1){ + free(bad_ref); + strbuf_release(&commit_name); + fclose(fp); + return -1; + } + free(bad_ref); + strbuf_release(&commit_name); + fclose(fp); + return 0;Doesn't it bother you that you have to write free(bad_ref)...fclose(fp) repeatedly?
My mind was more involved in the actual conversion and that was the main part bothering me. I did all the "clean up" stuff after I did the actual conversion. And since I was extremely happy after the porting happened, the "cleanup stuff" didn't bother me much.
quoted
+ } + else if (res == 2) { + FILE *fp; + struct rev_info revs; + struct argv_array rev_argv = ARGV_ARRAY_INIT; + struct string_list good_revs = STRING_LIST_INIT_DUP; + struct pretty_print_context pp = {0}; + struct commit *commit; + char *term_good = xstrfmt("%s-*", terms->term_good.buf); + int i; + + fp = fopen(git_path_bisect_log(), "a"); + if (!fp) { + free(term_good); + return -1; + } + if (fprintf(fp, "# only skipped commits left to test\n") < 1) { + free(term_good); + fclose(fp); + return -1; + } + for_each_glob_ref_in(register_good_ref, term_good, + "refs/bisect/", (void *) &good_revs); + + free(term_good);Doesn't it bother you that you have to write free(term_good) repeatedly?quoted
+ argv_array_pushl(&rev_argv, "skipped_commits", "refs/bisect/bad", "--not", NULL); + for (i = 0; i < good_revs.nr; i++) + argv_array_push(&rev_argv, good_revs.items[i].string); + + /* It is important to reset the flags used by revision walks + * as the previous call to bisect_next_all() in turn + * setups a revision walk. + */ + reset_revision_walk(); + init_revisions(&revs, NULL); + rev_argv.argc = setup_revisions(rev_argv.argc, rev_argv.argv, &revs, NULL); + argv_array_clear(&rev_argv); + string_list_clear(&good_revs, 0);Are you sure that the revision walking machinery is prepared to see the argv[] and elements in it you have given to setup_revisions() gets cleared before it starts doing the real work in prepare_revision_walk()? I am reasonably sure that the machinery borrows strings like pathspec elements from the caller and expects them to stay during revision traversal.
I have extremely little understanding of revision walking. All what I know about revision walking is what's covered in Documentation/technical/api-revision-walking.txt . There it is mentioned that for multiple revision walking its necessary to reset. What's happening is that there are multiple calls for revision walking. Now bisect_next() in itself calls for revision walking two times (the first one being with bisect_next_all and the next one conditionally with skip). This is also the reason why I included a reset_revision_walk() in bisect.c too in this patch. Before what used to happen is that when git plumbing commands were called there was no previous need for resetting the revision walk. Now that only a C code exists after bisect_replay() conversion (I faced a problem in a futuristic patch and thought it would be better to cover it up here), there would be multiple calls to bisect_next(). Previously it wasn't a problem because bisect_next_all() was a subcommand called from shell script and for "skip" it was git-rev-list which did revision walking as a separate thing. I didn't do reset previously and because of this I used to get a NULL value in the revs.commits . It would be really helpful if you could look more into it.
You seem to have acquired a habit of freeing and clearing things early, which is bad. Instead, make it a habit of free/clear at the end, and make sure all error paths go through that central freeing site. That tends to future-proof your code better from leaking even when later other people change it.
Sure!
quoted
+ if (prepare_revision_walk(&revs)) { + die(_("revision walk setup failed\n")); + }This one is still allowed to die, without clean_state?
I couldn't really test this part of code because I don't know how to make this call fail. And this isn't covered by the test suite either. BTW, here also it should be return error().
quoted
+ while ((commit = get_revision(&revs)) != NULL) { + struct strbuf commit_name = STRBUF_INIT; + format_commit_message(commit, "%s", + &commit_name, &pp); + fprintf(fp, "# possible first %s commit: " + "[%s] %s\n", terms->term_bad.buf, + oid_to_hex(&commit->object.oid), + commit_name.buf); + strbuf_release(&commit_name); + } + + fclose(fp); + return res; + } + + return res; +}quoted
@@ -415,47 +541,67 @@ static int bisect_start(struct bisect_terms *terms, int no_checkout, no_checkout = 1; for (i = 0; i < argc; i++) { - if (!strcmp(argv[i], "--")) { + const char *arg; + if (starts_with(argv[i], "'")) + arg = sq_dequote(xstrdup(argv[i])); + else + arg = argv[i]; + if (!strcmp(arg, "--")) { has_double_dash = 1; break; } }This is really bad; you are blindly assuming that anything that begins with " ' " begins with " ' " because "git-bisect.sh" sq-quoted and it never directly came from the command line that _wanted_ to give you something that begins with a " ' ". I suspect that you should be able to dequote on the calling script side. Then all these ugliness would disappear.
I had an unsuccessful attempt at it. Though this ugliness is removed in the patch[2]. This is specifically because of the line in bisect_replay() shell function which calls bisect_start().
quoted
for (i = 0; i < argc; i++) { - const char *commit_id = xstrfmt("%s^{commit}", argv[i]); + const char *arg, *commit_id; + if (starts_with(argv[i], "'")) + arg = sq_dequote(xstrdup(argv[i])); + else + arg = argv[i];Likewise.quoted
+ commit_id = xstrfmt("%s^{commit}", arg);In any case, I think a separate "const char *arg" that is an alias to argv[i] in the loop is a very good idea to do from the very beginning (i.e. should be done in a much much earlier patch in this series).
I will do it in the earlier patch. [1]: http://public-inbox.org/git/01020156b73fe6d7-8b80c663-7c77-469e-811f-40200ec6dbb1-000000@eu-west-1.amazonses.com/ [2]: https://public-inbox.org/git/01020156b73fe6e4-d45cf1f7-03a3-4566-95d1-73788c5ab2f9-000000@eu-west-1.amazonses.com/ Regards, Pranit Bauva
Attachments
- out2 [application/octet-stream] 625 bytes