Re: [PATCH 6/6] revisions API: don't leak memory on argv elements that need free()-ing
From: Jeff King <hidden>
Date: 2022-07-29 18:03:22
On Fri, Jul 29, 2022 at 09:07:40AM +0200, Ævar Arnfjörð Bjarmason wrote:
quoted
In that case, we could replace your patch 5 in favor of just calling strvec_clear() at the end of bisect_rev_setup().5/6 is doing that, or rather at the end of check_ancestors() and bisect_next_all(), but those call bisect_rev_setup() and pass it the strvec, so in terms of memory management it amounts to the same thing.
Right, but my point is that we do not need to complicate the interface and ownership rules for bisect_rev_setup() by passing around the extra variable.
quoted
It's possible there's a case I'm missing that makes this generally not-safe, but in the case of bisect_rev_setup() there's a very limited set of items in our argv in the first place. Doing so also passes the test suite with SANITIZE=address, though again, this is just exercising the very limited bisect options here.I think what you're missing is that this code is basically doing this, which is a common pattern in various parts of our codebase: struct strvec sv = STRVEC_INIT; strvec_push(&sv, "foo"); // sv.v[0] strvec_push(&sv, "bar"); // sv.v[1] sv.v[1] = NULL; // the code in revisions.c strvec_clear(&sv); I.e. you can't fix this by simply having revision.c having its own strvec, because it would just .. proceed to do the same thing.
Right, none of what I was suggesting above gets rid of the flag to tell revisions.c that it should free elements it removes from argv. We still have to do that. My point was just that if we can assume that setup_revisions() does not need for the passed-in argv to exist after it exits (which _used_ to not be true, but I think is these days), then that can simplify a few of the callers.
Of course we could alter its argv-iterating state machine to not clobber that "argv" element to NULL, and have other subsequent code know that it should stop at a new lower "argc" etc. etc., but that's getting at the much more invasive API changes 6/6 notes as the path not taken.
Yeah, I don't think that's at all worth it. If anything, it could perhaps hold on to the "removed" pointer and restore it at the end, but I wouldn't be surprised if that gets ugly, too.
And, in the general case for things that do this to the strvec what we're explicitly doing is having the caller then operate on that munged argv, i.e. it's important that we change *its* argv. That's not going on here, but e.g. various parse_options() callers rely on that.
Right, agreed.
IIRC this fails SANITIZE=address or was otherwise broken, I didn't test it now, but that's not the point... ... I'm just including it by way of explanation. I.e. for at least *some* callers (IIRC the below mostly works, and I can't remember where it's broken) it would suffice to just keep around a list of "here's stuff we should free later". In case below I opted to do that with a string_list, but it could be another strvec or whatever.
FWIW, I don't really like this direction. It feels like a huge band-aid, and the right solution is either having functions _not_ munge strvecs too much, or be told that they can take ownership of removed elements and free them (i.e., your current patch 6). -Peff