Thread (162 messages) 162 messages, 2 authors, 2022-09-01

Re: [PATCH 03/11] submodule--helper: fix "module_clone_data" memory leaks

From: Glen Choo <hidden>
Date: 2022-07-13 23:04:33

Ævar Arnfjörð Bjarmason [off-list ref] writes:
On Wed, Jul 13 2022, Glen Choo wrote:
quoted
Glen Choo [off-list ref] writes:
quoted
Ævar Arnfjörð Bjarmason [off-list ref] writes:
quoted
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 73717be957c..23ab9c7e349 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -1511,6 +1511,7 @@ static int module_deinit(int argc, const char **argv, const char *prefix)
 struct module_clone_data {
 	const char *prefix;
 	const char *path;
+	char *path_free;
 	const char *name;
 	const char *url;
 	const char *depth;
@@ -1527,6 +1528,11 @@ struct module_clone_data {
 	.single_branch = -1, \
 }
 
+static void module_clone_data_release(struct module_clone_data *cd)
+{
+	free(cd->path_free);
+}
+
 struct submodule_alternate_setup {
 	const char *submodule_name;
 	enum SUBMODULE_ALTERNATE_ERROR_MODE {
@@ -1651,9 +1657,9 @@ static int clone_submodule(struct module_clone_data *clone_data)
 
 	if (!is_absolute_path(clone_data->path)) {
 		strbuf_addf(&sb, "%s/%s", get_git_work_tree(), clone_data->path);
-		clone_data->path = strbuf_detach(&sb, NULL);
+		clone_data->path = clone_data->path_free = strbuf_detach(&sb, NULL);
 	} else {
-		clone_data->path = xstrdup(clone_data->path);
+		clone_data->path = clone_data->path_free = xstrdup(clone_data->path);
 	}
Hm, having .path_free doesn't seem necessary. If I'm reading the code
correctly,

- in module_clone() we set clone_data.path from argv
- then we unconditionally call clone_submodule(), which does all of the
  hard work
- in clone_submodule(), we always hit this conditional, which means that
  past this point, clone_data.path is always free()-able.

which suggests that we don't need clone_data.path_free at all. I suspect
that just this

   static void module_clone_data_release(struct module_clone_data *cd)
   {
   	free(cd->path);
   }

is enough to plug the leak (though admittedly, I haven't properly tested
the leak because it's been a PITA to set up locally).

That's a pretty hacky suggestion though, since we're still using the
same member to hold free()-able and non-free()-able memory....
Ah, here's a wacky idea (whose feasibility I haven't thought about at
all) that would make things a lot cleaner. If we had something like
OPT_STRING_DUP, that xstrdup()-s the value from argv when we parse it,
then we could drop the "const" from clone_data.path and just free() it.
I suppose so, it might make some things simpler, of course at the cost
of needlessly duplicating things.

But we also have various common patterns such as string_lists where some
elements are dup'd, some aren't, and need to deal with that. I think
just having common idioms for tracking the dupe is usually better,
e.g. in the case of a string list stick the pointer to free in "util".
Hm, sounds fair. "Sometimes dup and sometimes not" sounds like an
inevitability. I'm not experienced enough to know better, and folks
whose opinion I sought seem to agree with you.
I think in this case the patch as-is is better than your suggestions,
because it's a less fragile pattern, we explicitly mark when we dup
something that's sometimes a dup, and sometimes not.

Whereas if we do it with the xstrdup() at the start it requires more
moving things around, and if we have a new user who parses the same
argument we'll bug out on that free().

What do you think?
Frankly I'm ok with moving things around; I think the code could use
a little cleaning up :) But yeah, I think my suggestion isn't so great -
it's a bit weird to keep around an auto variable that exists only to be
dup-ed to the thing we care about. We can forget about that.

I do think that it's worth avoiding the "sometimes dup, sometimes not"
pattern if we can, though (of course these are just my non-C instincts
talking), and we can do that here if we just choose not to assign back
to .path. Something like:

  struct module_clone_data {
    const char *prefix;
  -	const char *path;
  +	char *path;
  +	const char *path_argv;

  ...

   	if (!is_absolute_path(clone_data->path)) {
  -		strbuf_addf(&sb, "%s/%s", get_git_work_tree(), clone_data->path);
  +		strbuf_addf(&sb, "%s/%s", get_git_work_tree(), clone_data->path_argv);
  +		clone_data->path = strbuf_detach(&sb, NULL);
   	} else {
  -		clone_data->path = xstrdup(clone_data->path);
  +		clone_data->path = xstrdup(clone_data->path_argv);
   	}

would be clearer to me since the const pointer never points to something
that the struct actually owns.

But if the "= .to_free = " idiom is well-understood and accepted to the
point that we don't need to actively avoid "sometimes dup, sometimes
not", then we should drop my suggestion and just go with your patch :)
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help