Thread (3 messages) 3 messages, 2 authors, 2016-08-30

Re: [PATCH v14 14/27] bisect--helper: `bisect_next` and `bisect_auto_next` shell function in C

From: Pranit Bauva <hidden>
Date: 2016-08-30 18:25:47

Possibly related (same subject, not in this thread)

Hey Junio,

Sorry for a late replay.

On Fri, Aug 26, 2016 at 2:00 AM, Junio C Hamano [off-list ref] wrote:
Pranit Bauva [off-list ref] writes:
quoted
A lot of parts of bisect.c uses exit() and these signals are then
trapped in the `bisect_start` function. Since the shell script ceases
its existence it would be necessary to convert those exit() calls to
return statements so that errors can be reported efficiently in C code.
Is efficiency really an issue?  I think the real reason is that it
would make it impossible for the callers to handle errors, if you do
not convert and let the error codepaths exit().
I think I put the word "efficiently" wrongly over here. Will omit it.
quoted
@@ -729,7 +735,7 @@ static struct commit **get_bad_and_good_commits(int *rev_nr)
      return rev;
 }

-static void handle_bad_merge_base(void)
+static int handle_bad_merge_base(void)
 {
      if (is_expected_rev(current_bad_oid)) {
              char *bad_hex = oid_to_hex(current_bad_oid);
@@ -750,17 +756,18 @@ static void handle_bad_merge_base(void)
                              "between %s and [%s].\n"),
                              bad_hex, term_bad, term_good, bad_hex, good_hex);
              }
-             exit(3);
+             return 3;
      }

      fprintf(stderr, _("Some %s revs are not ancestor of the %s rev.\n"
              "git bisect cannot work properly in this case.\n"
              "Maybe you mistook %s and %s revs?\n"),
              term_good, term_bad, term_good, term_bad);
-     exit(1);
+     bisect_clean_state();
+     return 1;
What is the logic behind this function sometimes clean the state,
and some other times do not, when it makes an error-return?  We see
above that "return 3" codepath leaves the state behind.

Either you forgot a necessary clean_state in "return 3" codepath,
or you forgot to document why the distinction exists in the in-code
comment for the function.  I cannot tell which, but I am leaning
towards guessing that it is the former.
This is a very tricky one. I have purposely not included this after a
lot of testing. I have hand tested with the original git and with this
branch. The reason why anyone wouldn't be able to catch this is
because its not covered in the test suite. I am including a patch with
this as an attachment (because I am behind a proxy right now but don't
worry I will include this as a commit in the next series). The
original behaviour of git does not clean the bisect state when this
situation occurs. On another note which you might have missed that
bisect_clean_state() is purposely put before return 1 which is covered
by the test suite. You can try removing it and see that there is a
broken tes. tI was thinking of including the tests after the whole
conversion but now I think including this before will make the
conversion more easier for review.
quoted
-static void check_good_are_ancestors_of_bad(const char *prefix, int no_checkout)
+static int check_good_are_ancestors_of_bad(const char *prefix, int no_checkout)
 {
      char *filename = git_pathdup("BISECT_ANCESTORS_OK");
      struct stat st;
-     int fd;
+     int fd, res = 0;

      if (!current_bad_oid)
              die(_("a %s revision is needed"), term_bad);
Can you let it die yere?
Not really. I should change it to return error().
quoted
@@ -873,8 +890,11 @@ static void check_good_are_ancestors_of_bad(const char *prefix, int no_checkout)
                            filename);
      else
              close(fd);
+
+     return 0;
  done:
      free(filename);
+     return 0;
 }
Who owns "filename"?  The first "return 0" leaves it unfreed, and
when "goto done" is done, it is freed.

The above two may indicate that "perhaps 'retval + goto finish'
pattern?" is a really relevant suggestion for the earlier steps in
this series.
Yes.
quoted
      if (!all) {
              fprintf(stderr, _("No testable commit found.\n"
                      "Maybe you started with bad path parameters?\n"));
-             exit(4);
+             return 4;
      }

      bisect_rev = revs.commits->item->object.oid.hash;

      if (!hashcmp(bisect_rev, current_bad_oid->hash)) {
-             exit_if_skipped_commits(tried, current_bad_oid);
+             res = exit_if_skipped_commits(tried, current_bad_oid);
+             if (res)
+                     return res;
+
              printf("%s is the first %s commit\n", sha1_to_hex(bisect_rev),
                      term_bad);
              show_diff_tree(prefix, revs.commits->item);
              /* This means the bisection process succeeded. */
-             exit(10);
+             return 10;
      }

      nr = all - reaches - 1;
@@ -1005,7 +1033,11 @@ int bisect_next_all(const char *prefix, int no_checkout)
                "Bisecting: %d revisions left to test after this %s\n",
                nr), nr, steps_msg);

-     return bisect_checkout(bisect_rev, no_checkout);
+     res = bisect_checkout(bisect_rev, no_checkout);
+     if (res)
+             bisect_clean_state();
+
+     return res;
 }
There were tons of "exit_if" that was converted to "if (res) return
res" above, instead of jumping here to cause clean_state to be
called.  I cannot tell if this new call to clean_state() is wrong,
or all the earlier "return res" should come here.  I am guessing the
latter.
No I don't think its wrong. It is advised in the comment
    In case of mistaken revs or checkout error, or signals received,
    "bisect_auto_next" below may exit or misbehave.
    We have to trap this to be able to clean up using
    "bisect_clean_state".

to clean the bisection state *iff checkout fails* otherwise not.
Luckily this is already covered by the test suite. I think what you
meant is that bisect_clean_state() might be covered elsewhere too then
why perform cleanup here, right? bisect_clean_state() have been
carefully put where ever required only.
quoted
diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
index c64996a..ef7b3a1 100644
--- a/builtin/bisect--helper.c
+++ b/builtin/bisect--helper.c
@@ -8,6 +8,7 @@
 #include "run-command.h"
 #include "prompt.h"
 #include "quote.h"
+#include "revision.h"

 static GIT_PATH_FUNC(git_path_bisect_terms, "BISECT_TERMS")
 static GIT_PATH_FUNC(git_path_bisect_expected_rev, "BISECT_EXPECTED_REV")
@@ -29,6 +30,8 @@ static const char * const git_bisect_helper_usage[] = {
      N_("git bisect--helper --bisect-terms [--term-good | --term-old | --term-bad | --term-new]"),
      N_("git bisect--helper --bisect start [--term-{old,good}=<term> --term-{new,bad}=<term>]"
                                            "[--no-checkout] [<bad> [<good>...]] [--] [<paths>...]"),
+     N_("git bisect--helper --bisect-next"),
+     N_("git bisect--helper --bisect-auto-next"),
      NULL
 };
@@ -396,6 +399,129 @@ static int bisect_terms(struct bisect_terms *terms, const char **argv, int argc)
      return 0;
 }

+static int register_good_ref(const char *refname,
+                          const struct object_id *oid, int flags,
+                          void *cb_data)
+{
+     struct string_list *good_refs = cb_data;
+     string_list_append(good_refs, oid_to_hex(oid));
+     return 0;
+}
+
+static int bisect_next(struct bisect_terms *terms, const char *prefix)
+{
+     int res, no_checkout;
+
+     /* In case of mistaken revs or checkout error, or signals received,
That's an unbalanced comment.  You end the block with "*/" on its
own line, so you should start the block with "/*" on its own line.
There seems to be at least one more such comment in this patch but I
won't repeat.
I shall change this.
quoted
+      * "bisect_auto_next" below may exit or misbehave.
+      * We have to trap this to be able to clean up using
+      * "bisect_clean_state".
+      */
"exit" meaning "call exit() to terminate the process", or something
else?  If the latter, don't say "exit", but say "return error".
Yes I think its necessary to change "exit" in comments to "return".
There are other places too in which I will have to change.
quoted
+     if (bisect_next_check(terms, terms->term_good.buf))
+             return -1;
Mental note.  The "autostart" in the original is gone.  Perhaps it
is done by next_check in this code, but we haven't seen that yet.
This will be added back again in a coming patch[1].
quoted
+     no_checkout = !is_empty_or_missing_file(git_path_bisect_head());
+
+     /* Perform all bisection computation, display and checkout */
+     res = bisect_next_all(prefix , no_checkout);
+
+     if (res == 10) {
+             FILE *fp;
+             unsigned char sha1[20];
+             struct commit *commit;
+             struct pretty_print_context pp = {0};
+             struct strbuf commit_name = STRBUF_INIT;
+             char *bad_ref = xstrfmt("refs/bisect/%s",
+                                           terms->term_bad.buf);
+             read_ref(bad_ref, sha1);
+             commit = lookup_commit_reference(sha1);
+             format_commit_message(commit, "%s", &commit_name, &pp);
+             fp = fopen(git_path_bisect_log(), "a");
+             if (!fp) {
+                     free(bad_ref);
+                     strbuf_release(&commit_name);
+                     return -1;
+             }
+             if (fprintf(fp, "# first %s commit: [%s] %s\n",
+                         terms->term_bad.buf, sha1_to_hex(sha1),
+                         commit_name.buf) < 1){
+                     free(bad_ref);
+                     strbuf_release(&commit_name);
+                     fclose(fp);
+                     return -1;
+             }
+             free(bad_ref);
+             strbuf_release(&commit_name);
+             fclose(fp);
+             return 0;
Doesn't it bother you that you have to write free(bad_ref)...fclose(fp)
repeatedly?
My mind was more involved in the actual conversion and that was the
main part bothering me. I did all the "clean up" stuff after I did the
actual conversion. And since I was extremely happy after the porting
happened, the "cleanup stuff" didn't bother me much.
quoted
+     }
+     else if (res == 2) {
+             FILE *fp;
+             struct rev_info revs;
+             struct argv_array rev_argv = ARGV_ARRAY_INIT;
+             struct string_list good_revs = STRING_LIST_INIT_DUP;
+             struct pretty_print_context pp = {0};
+             struct commit *commit;
+             char *term_good = xstrfmt("%s-*", terms->term_good.buf);
+             int i;
+
+             fp = fopen(git_path_bisect_log(), "a");
+             if (!fp) {
+                     free(term_good);
+                     return -1;
+             }
+             if (fprintf(fp, "# only skipped commits left to test\n") < 1) {
+                     free(term_good);
+                     fclose(fp);
+                     return -1;
+             }
+             for_each_glob_ref_in(register_good_ref, term_good,
+                                  "refs/bisect/", (void *) &good_revs);
+
+             free(term_good);
Doesn't it bother you that you have to write free(term_good) repeatedly?
quoted
+             argv_array_pushl(&rev_argv, "skipped_commits", "refs/bisect/bad", "--not", NULL);
+             for (i = 0; i < good_revs.nr; i++)
+                     argv_array_push(&rev_argv, good_revs.items[i].string);
+
+             /* It is important to reset the flags used by revision walks
+              * as the previous call to bisect_next_all() in turn
+              * setups a revision walk.
+              */
+             reset_revision_walk();
+             init_revisions(&revs, NULL);
+             rev_argv.argc = setup_revisions(rev_argv.argc, rev_argv.argv, &revs, NULL);
+             argv_array_clear(&rev_argv);
+             string_list_clear(&good_revs, 0);
Are you sure that the revision walking machinery is prepared to see
the argv[] and elements in it you have given to setup_revisions()
gets cleared before it starts doing the real work in
prepare_revision_walk()?  I am reasonably sure that the machinery
borrows strings like pathspec elements from the caller and expects
them to stay during revision traversal.
I have extremely little understanding of revision walking. All what I
know about revision walking is what's covered in
Documentation/technical/api-revision-walking.txt . There it is
mentioned that for multiple revision walking its necessary to reset.
What's happening is that there are multiple calls for revision
walking. Now bisect_next() in itself calls for revision walking two
times (the first one being with bisect_next_all and the next one
conditionally with skip).  This is also the reason why I included a
reset_revision_walk() in bisect.c too in this patch. Before what used
to happen is that when git plumbing commands were called there was no
previous need for resetting the revision walk.  Now that only a C code
exists after bisect_replay() conversion (I faced a problem in a
futuristic patch and thought it would be better to cover it up here),
there would be multiple calls to bisect_next(). Previously it wasn't a
problem because bisect_next_all() was a subcommand called from shell
script and for "skip" it was git-rev-list which did revision walking
as a separate thing. I didn't do reset previously and because of this
I used to get a NULL value in the revs.commits . It would be really
helpful if you could look more into it.
You seem to have acquired a habit of freeing and clearing things
early, which is bad.  Instead, make it a habit of free/clear at the
end, and make sure all error paths go through that central freeing
site.  That tends to future-proof your code better from leaking even
when later other people change it.
Sure!
quoted
+             if (prepare_revision_walk(&revs)) {
+                     die(_("revision walk setup failed\n"));
+             }
This one is still allowed to die, without clean_state?
I couldn't really test this part of code because I don't know how to
make this call fail. And this isn't covered by the test suite either.
BTW, here also it should be return error().
quoted
+             while ((commit = get_revision(&revs)) != NULL) {
+                     struct strbuf commit_name = STRBUF_INIT;
+                     format_commit_message(commit, "%s",
+                                           &commit_name, &pp);
+                     fprintf(fp, "# possible first %s commit: "
+                                 "[%s] %s\n", terms->term_bad.buf,
+                                 oid_to_hex(&commit->object.oid),
+                                 commit_name.buf);
+                     strbuf_release(&commit_name);
+             }
+
+             fclose(fp);
+             return res;
+     }
+
+     return res;
+}
quoted
@@ -415,47 +541,67 @@ static int bisect_start(struct bisect_terms *terms, int no_checkout,
              no_checkout = 1;

      for (i = 0; i < argc; i++) {
-             if (!strcmp(argv[i], "--")) {
+             const char *arg;
+             if (starts_with(argv[i], "'"))
+                     arg = sq_dequote(xstrdup(argv[i]));
+             else
+                     arg = argv[i];
+             if (!strcmp(arg, "--")) {
                      has_double_dash = 1;
                      break;
              }
      }
This is really bad; you are blindly assuming that anything that
begins with " ' " begins with " ' " because "git-bisect.sh" sq-quoted
and it never directly came from the command line that _wanted_ to
give you something that begins with a " ' ".

I suspect that you should be able to dequote on the calling script
side.  Then all these ugliness would disappear.
I had an unsuccessful attempt at it. Though this ugliness is removed
in the patch[2]. This is specifically because of the line in
bisect_replay() shell function which calls bisect_start().
quoted
      for (i = 0; i < argc; i++) {
-             const char *commit_id = xstrfmt("%s^{commit}", argv[i]);
+             const char *arg, *commit_id;
+             if (starts_with(argv[i], "'"))
+                     arg = sq_dequote(xstrdup(argv[i]));
+             else
+                     arg = argv[i];
Likewise.
quoted
+             commit_id = xstrfmt("%s^{commit}", arg);
In any case, I think a separate "const char *arg" that is an alias
to argv[i] in the loop is a very good idea to do from the very
beginning (i.e. should be done in a much much earlier patch in this
series).
I will do it in the earlier patch.

[1]: http://public-inbox.org/git/01020156b73fe6d7-8b80c663-7c77-469e-811f-40200ec6dbb1-000000@eu-west-1.amazonses.com/
[2]: https://public-inbox.org/git/01020156b73fe6e4-d45cf1f7-03a3-4566-95d1-73788c5ab2f9-000000@eu-west-1.amazonses.com/

Regards,
Pranit Bauva

Attachments

  • out2 [application/octet-stream] 625 bytes
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help