Thread (23 messages) 23 messages, 6 authors, 2021-09-13

[PATCH v2 0/7] strvec: use size_t to store nr and alloc

From: Ævar Arnfjörð Bjarmason <hidden>
Date: 2021-09-12 00:16:08

This is a proposed v2 of Jeff King's one-patch change to change
strvec's nr/alloc from "int" to "size_t". As noted below I think it's
worthwhile to not only change that in the struct, but also in code
that directly references the "nr" member.

On Sat, Sep 11 2021, Philip Oakley wrote:
On 11/09/2021 17:13, Ævar Arnfjörð Bjarmason wrote:
quoted
On Sat, Sep 11 2021, Jeff King wrote:
quoted
We converted argv_array (which later became strvec) to use size_t in
819f0e76b1 (argv-array: use size_t for count and alloc, 2020-07-28) in
order to avoid the possibility of integer overflow. But later, commit
d70a9eb611 (strvec: rename struct fields, 2020-07-28) accidentally
converted these back to ints!

Those two commits were part of the same patch series. I'm pretty sure
what happened is that they were originally written in the opposite order
and then cleaned up and re-ordered during an interactive rebase. And
when resolving the inevitable conflict, I mistakenly took the "rename"
patch completely, accidentally dropping the type change.

We can correct it now; better late than never.

Signed-off-by: Jeff King <redacted>
---
This was posted previously in the midst of another thread, but I don't
think was picked up. There was some positive reaction, but one "do we
really need this?" to which I responded in detail:

  https://lore.kernel.org/git/YTIBnT8Ue1HZXs82@coredump.intra.peff.net/ (local)

I don't really think any of that needs to go into the commit message,
but if that's a hold-up, I can try to summarize it (though I think
referring to the commit which _already_ did this and was accidentally
reverted would be sufficient).
Thanks, I have a WIP version of this outstanding starting with this
patch that I was planning to submit sometime, but I'm happy to have you
pursue it, especially with the ~100 outstanding patches I have in
master..seen.

It does feel somewhere between iffy and a landmine waiting to be stepped
on to only convert the member itself, and not any of the corresponding
"int" variables that track it to "size_t".

If you do the change I suggested in
https://lore.kernel.org/git/87v93i8svd.fsf@evledraar.gmail.com/ (local) you'll
find that there's at least one first-order reference to this that now
uses "int" that if converted to "size_t" will result in a wrap-around
error, we're lucky that one has a test failure.

I can tell you what that bug is, but maybe it's better if you find it
yourself :) I.e. I found *that* one, but I'm not sure I found them
all. I just s/int nr/size_t *nr/ and eyeballed the wall off compiler
errors & the code context (note: pointer, obviously broken, but makes
the compiler yell).

That particular bug will be caught by the compiler as it involves a >= 0
comparison against unsigned, but we may not not have that everywhere...
I'm particularly interested in the int -> size_t change problem as part
of the wider 4GB limitations for the LLP64 systems [0] such as the
RaspPi, git-lfs (on windows [1]), and Git-for-Windows[2]. It is a big
problem.
Okey, fine, no fun excercise for the reader then ;)

This is what I'd been sitting on locally since that recent thread, I
polished it up a bit since Jeff King posted his version.

The potential overflow bug I mentioned is in rebase.c. See
5/7. "Potential" because it's not a bug now, but that code
intentionally considers a strvec, and then iterates it from nr-1 to 0,
and if it reaches 0 intentionally counts down one more to -1 to
indicate that it's visited all elements.

We then check that with i >= 0, except of course if it becomes
unsigned that doesn't become -1, but rather it wraps around.

The rest of this is all changes to have that s/int/size_t/ radiate
outwards, i.e. when we assign that value to a variable somewhere its
now a "size_t" instead of an "int" etc.
[0]
http://nickdesaulniers.github.io/blog/2016/05/30/data-models-and-word-size/
[1] https://github.com/git-lfs/git-lfs/issues/2434  Git on Windows
client corrupts files > 4Gb
[2] https://github.com/git-for-windows/git/pull/2179  [DRAFT] for
testing : Fix 4Gb limit for large files on Git for Windows
Jeff King (1):
  strvec: use size_t to store nr and alloc

Ævar Arnfjörð Bjarmason (6):
  remote-curl: pass "struct strvec *" instead of int/char ** pair
  pack-objects: pass "struct strvec *" instead of int/char ** pair
  sequencer.[ch]: pass "struct strvec *" instead of int/char ** pair
  upload-pack.c: pass "struct strvec *" instead of int/char ** pair
  rebase: don't have loop over "struct strvec" depend on signed "nr"
  strvec API users: change some "int" tracking "nr" to "size_t"

 builtin/pack-objects.c |  6 +++---
 builtin/rebase.c       | 26 ++++++++++++--------------
 connect.c              |  8 ++++----
 fetch-pack.c           |  4 ++--
 ls-refs.c              |  2 +-
 remote-curl.c          | 23 +++++++++++------------
 sequencer.c            |  8 ++++----
 sequencer.h            |  4 ++--
 serve.c                |  2 +-
 shallow.c              |  5 +++--
 shallow.h              |  6 ++++--
 strvec.h               |  4 ++--
 submodule.c            |  2 +-
 upload-pack.c          |  7 +++----
 14 files changed, 53 insertions(+), 54 deletions(-)

Range-diff against v1:
-:  ----------- > 1:  2ef48d734e8 remote-curl: pass "struct strvec *" instead of int/char ** pair
-:  ----------- > 2:  7f59a58ed97 pack-objects: pass "struct strvec *" instead of int/char ** pair
-:  ----------- > 3:  c35cfb9c9c5 sequencer.[ch]: pass "struct strvec *" instead of int/char ** pair
-:  ----------- > 4:  2e0b82d4316 upload-pack.c: pass "struct strvec *" instead of int/char ** pair
-:  ----------- > 5:  be85a0565ef rebase: don't have loop over "struct strvec" depend on signed "nr"
1:  498f5ed80dc ! 6:  ba17290852c strvec: use size_t to store nr and alloc
    @@ Commit message
         We can correct it now; better late than never.
     
         Signed-off-by: Jeff King [off-list ref]
    +    Signed-off-by: Ævar Arnfjörð Bjarmason [off-list ref]
     
      ## strvec.h ##
     @@ strvec.h: extern const char *empty_strvec[];
-:  ----------- > 7:  2edd9708888 strvec API users: change some "int" tracking "nr" to "size_t"
-- 
2.33.0.998.ga4d44345d43
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help