Re: [PATCH] git-compat-util: make git_find_last_dir_sep return a const pointer
From: Jeff King <hidden>
Date: 2026-02-04 05:32:20
Subsystem:
the rest · Maintainer:
Linus Torvalds
On Tue, Feb 03, 2026 at 07:15:10PM -0800, Collin Funk wrote:
And that plan makes sense to me. I'll create a seperate patch handling the obvious 's/char/const char/' conversions that make sense regardless of this glibc change.
Sounds good. BTW, thank you for working on this! I think some of it will be a slog. :)
Yes, I see. I think the "arg" there is from the command line or from a
buffer read using fgets() in get_object_list from
builtin/pack-objects.c, so it is safe to write to there.
It's also called like this though:
handle_revision_arg("HEAD", &revs, 0, 0);
We can't write to the string "HEAD", but it doesn't have a "dotdot" so
we don't. It could probably be cleaned up a bit.
Oof, yeah. So this is a trap waiting to happen, and any code like:
handle_revision_arg("..HEAD", &revs, 0, 0);
would segfault. That's something we're unlikely to write, which is how
it's managed to hang around for so long. But I'm happy that the new
warning will help us find and fix such cases.
This is a case where I think the interface really should be a const
string, and we should just pay the cost to make a NUL-terminated
version. I.e., something like this:
diff --git a/revision.c b/revision.c
index ba0da18f26..289af7507c 100644
--- a/revision.c
+++ b/revision.c@@ -2143,16 +2143,17 @@ static int handle_dotdot(const char *arg, int cant_be_filename) { struct object_context a_oc = {0}, b_oc = {0}; - char *dotdot = strstr(arg, ".."); + const char *dotdot = strstr(arg, ".."); + char *lhs; int ret; if (!dotdot) return -1; - *dotdot = '\0'; - ret = handle_dotdot_1(arg, dotdot, revs, flags, cant_be_filename, + lhs = xmemdupz(arg, dotdot - arg); + ret = handle_dotdot_1(arg, lhs, revs, flags, cant_be_filename, &a_oc, &b_oc); - *dotdot = '.'; + free(lhs); object_context_release(&a_oc); object_context_release(&b_oc);
This isn't a hot enough code path for the allocation to matter, and simple and safe is the best approach (IMHO). I suspect many cases you'll find are similar.
FYI, that code would also be made much clearer if not all of the declarations were at the top of the function. I guess it just hasn't been touched in a long while.
Though mixed statements and declarations are allowed by modern C99 (which we require these days), I think we still prefer not to use it as a point of style. And we enforce it via -Wdeclaration-after-statement, at least with all of our developer warnings on. Speaking of which: I noticed your original patch introduced one such case. Make sure you're building with "make DEVELOPER=1" as you build and test. All that said, there are often cases where variable declarations could be pushed down into the inner block where they are used, and those sorts of cleanups are welcome. We also allow declaring variables in loop initializers these days.
quoted
Looking at strchr()'s declaration in string.h, which is defined like: # define strchr(S, C) \ __glibc_const_generic (S, const char *, strchr (S, C)) I think the answer is probably "yes". But it also doesn't quite solve our problem. That would give us type-checking of callers of our function, but we still have to convince the compiler not to complain about its implementation. For that we'd need to either cast away const manually, I guess.That macro depends on Generic selections from C11 [1]. I wasn't sure if Git would like that, given it is conservative with other C features.
We definitely can't rely on it everywhere. But if there is a solution that is conditionally compiled, and can kick in only when these extra warnings also kick in, that would be OK. Assuming the result is not too painful to look at, of course. Probably the best path forward for most spots is just fixing the code to make it more obvious about its use of const. We may find there are not enough left for us to try to get too clever afterwards. Even though I think the skip_iprefix() thing is a general problem with constness in C (the same one faced by strchr() in the first place!), in practice we can probably just rewrite the code in the few cases where it matters. For instance, the "cookie" example I gave could probably just do something like this:
diff --git a/http.c b/http.c
index 7815f144de..e6f0913691 100644
--- a/http.c
+++ b/http.c@@ -749,15 +749,16 @@ static int redact_sensitive_header(struct strbuf *header, size_t offset) sensitive_header++; cookie = sensitive_header; while (cookie) { - char *equals; - char *semicolon = strstr(cookie, "; "); - if (semicolon) - *semicolon = 0; - equals = strchrnul(cookie, '='); + const char *equals; + const char *semicolon = strstr(cookie, "; "); + + equals = semicolon ? + memchr(cookie, '=', semicolon - cookie) : + strchr(cookie, '='); if (!equals) { /* invalid cookie, just append and continue */ strbuf_addstr(&redacted_header, cookie); continue; }
Though note there is another bug lurking in this code! If we hit the "!equals" case, we will continue the loop without advancing "cookie" at all, and loop forever. But in the current version of the function, that is dead code, because strchrnul() will never return NULL (you get either the matched char or the end-of-string). Probably the "continue" should be a "break", though perhaps we could keep parsing past the next semicolon. Not necessarily a problem we need to solve, but as I've written it above, the dead code becomes live. So I wanted to give fair warning. ;) -Peff