Re: [PATCH 9/9] git archive docs: document output non-stability
From: Ævar Arnfjörð Bjarmason <hidden>
Date: 2023-02-02 10:41:07
On Thu, Feb 02 2023, brian m. carlson wrote:
quoted
+* We will do our best not to change the "tar" output itself, but won't
+  promise that we're never going to change it.
++
+If you must avoid using "git" itself for the tree validation, you
+should be checksumming the uncompressed "tar" output, not e.g. the
+compressed "tgz" output.
++
I don't think I want to state this, because it implies that the changes I made that broke kernel.org (making tar.umask apply to pax headers) wouldn't have been allowed.
I don't see how "we'll do our best, but it might change" precludes that...
We should probably just state that "we won't promise that the tar output won't change between versions". Maybe,
...but it sounds like you'd like this "softer" promise. I think it's saying the same thing, but I picked the "we'll try not to" wording because I think it more accurately reflects reality, but...
"We won't change the tar output needlessly, but it may change from time to time." That is, we won't be "let's change the format just to mix it up for users", but if there's a valuable patch that could be applied, then we might well take it.
...here we're back (at least per my reading) to basically what my proposed patch said. I'm happy to improve/change the wording, but I'm confused about the "because it implies" part you noted.
As I said, it's my goal to provide more concrete guarantees in a future patch, probably this weekend.
I think that would be great, but I also think that if we're going to make new guarantees, they're probably best applied on top of a series such as this one, which, aside from reverting to gzip as the default, attempts to clarify the status quo.
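To make the patch's advice about checksumming the uncompressed "tar" output concrete, here is a minimal Python sketch. It assumes the archive is already in memory as gzip-compressed bytes; the helper name sha256_of_uncompressed and the stand-in payload are hypothetical, and nothing here invokes git itself.

```python
import gzip
import hashlib
import io


def sha256_of_uncompressed(gz_bytes: bytes) -> str:
    """Hash the decompressed stream, ignoring how it was compressed."""
    digest = hashlib.sha256()
    with gzip.GzipFile(fileobj=io.BytesIO(gz_bytes)) as stream:
        # Read in chunks so large archives need not fit in memory twice.
        for chunk in iter(lambda: stream.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Stand-in for "git archive" tar output (an assumption for the demo;
# any byte string behaves the same way here).
tar_payload = b"example tar payload\n" * 1000

# The same payload run through two different compressor settings,
# mimicking two gzip implementations or versions.
fast = gzip.compress(tar_payload, compresslevel=1, mtime=0)
best = gzip.compress(tar_payload, compresslevel=9, mtime=0)

compressed_differ = fast != best
uncompressed_match = sha256_of_uncompressed(fast) == sha256_of_uncompressed(best)
```

The compressed bytes depend on the compressor, while the digest of the decompressed stream depends only on the tar bytes, which is why the latter is the less fragile thing to pin.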
quoted
+* We promise that a given version of git will emit stable "tar" output
+  for the same tree ID (but not commit ID, see the discussion in the
+  <<DESCRIPTION>> section above).
I think that section contradicts this. The tree version uses the current timestamp, which would make the archive change based on the time of day.
Thanks! It's referring back to the previous discussion, but I managed to somehow get the tree & commit cases reversed.
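The timestamp point can be illustrated with a small Python sketch. It uses Python's own tarfile module rather than git's archiver, so it only models the effect: identical file content with a different member mtime yields different tar bytes. The helper names build_tar and digest, the file name, and the timestamps are hypothetical.

```python
import hashlib
import io
import tarfile


def build_tar(content: bytes, mtime: int) -> bytes:
    """Build a one-file tar in memory with a fixed member mtime."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.PAX_FORMAT) as archive:
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(content)
        info.mtime = mtime
        archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


content = b"hello\n"

# A fixed timestamp (as with a commit time) gives repeatable bytes.
same_time_a = digest(build_tar(content, mtime=1675334467))
same_time_b = digest(build_tar(content, mtime=1675334467))

# A timestamp one second later (as with "current time") changes the bytes.
later_time = digest(build_tar(content, mtime=1675334468))
```

With a tree ID there is no commit time to fall back on, so the current time ends up in the headers and the checksum moves with the clock.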
quoted
+While you shouldn't assume that different versions of git will emit
+the same output, you can assume (e.g. for the purposes of caching)
+that a given version's output is stable.
Unfortunately, this isn't actually true if someone uses export-subst. That's because adding unrelated objects can increase the length of abbreviations, and then the tar contents can be different. I've actually seen this in the wild. Modulo that, yes, I agree with this.
I didn't know about the export-subst case, I'll add that caveat in there. Thanks!
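For the export-subst caveat, a simplified Python model of abbreviation growth may help. This is not git's actual abbreviation logic, which is more involved; it only shows that the shortest unique prefix of one object ID can lengthen when an unrelated object with a colliding prefix is added. The function name shortest_unique_prefix and all object IDs below are made up for illustration.

```python
from typing import Iterable


def shortest_unique_prefix(target: str, object_ids: Iterable[str], minimum: int = 7) -> str:
    """Return the shortest prefix of target (at least `minimum` chars)
    that no other ID in object_ids shares. Simplified model, not git's code."""
    others = [oid for oid in object_ids if oid != target]
    for length in range(minimum, len(target) + 1):
        prefix = target[:length]
        if not any(other.startswith(prefix) for other in others):
            return prefix
    return target


# Hypothetical 40-character hex object IDs.
target = "1234567" + "a" * 33
unrelated = "f" * 40
colliding = "1234567" + "b" * 33

# Before the colliding object exists, seven characters are enough.
before = shortest_unique_prefix(target, [target, unrelated])

# After an unrelated object with the same first seven characters is
# added, the abbreviation needs one more character.
after = shortest_unique_prefix(target, [target, unrelated, colliding])
```

If an abbreviated ID is substituted into an archived file via export-subst, that one-character change alters the file contents and therefore the tar output, even though the tree being archived is the same.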