Re: git archive generates tar with malformed pax extended attribute
From: Jeff King <hidden>
Date: 2019-05-28 05:58:09
On Sat, May 25, 2019 at 03:26:53PM +0200, René Scharfe wrote:
> We could truncate symlink targets at the first NUL as well in git archive -- but that would be a bit sad, as the archive formats allow storing the "real" target from the repo, with NUL and all. We could make git fsck report such symlinks.
This is a little tricky, because fsck generally looks at individual objects, and the bad pattern is a combination of a tree and a blob together. I think you could make it work by reusing some of the code and patterns from 9e84a6d758 (Merge branch 'jk/submodule-fsck-loose' into maint, 2018-05-22).
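To make the tree-plus-blob point concrete, here is a rough out-of-tree sketch in Python. It is not how fsck works and is not the approach from the commit mentioned above; it just shells out to `git ls-tree -r -z` and `git cat-file blob`. The tree entry says "this is a symlink" (mode 120000) and only the blob says "the target contains a NUL", so both have to be consulted. The function names are made up for illustration.

```python
import subprocess

SYMLINK_MODE = b"120000"  # tree-entry mode Git uses for symlinks


def symlink_entries(ls_tree_output: bytes):
    """Yield (oid, path) for each symlink in `git ls-tree -r -z` output.

    Each NUL-terminated record looks like: b"<mode> <type> <oid>\t<path>".
    """
    for record in ls_tree_output.split(b"\0"):
        if not record:
            continue
        meta, path = record.split(b"\t", 1)
        mode, _type, oid = meta.split(b" ")
        if mode == SYMLINK_MODE:
            yield oid.decode("ascii"), path


def target_has_nul(target: bytes) -> bool:
    """A symlink blob's content is the link target; flag embedded NULs."""
    return b"\0" in target


def find_nul_symlinks(rev="HEAD", repo="."):
    """Return [(path, target)] for symlinks in `rev` whose target has a NUL."""
    listing = subprocess.run(
        ["git", "-C", repo, "ls-tree", "-r", "-z", rev],
        check=True, capture_output=True,
    ).stdout
    bad = []
    for oid, path in symlink_entries(listing):
        target = subprocess.run(
            ["git", "-C", repo, "cat-file", "blob", oid],
            check=True, capture_output=True,
        ).stdout
        if target_has_nul(target):
            bad.append((path, target))
    return bad
```

Running `find_nul_symlinks()` inside a repository lists the offending paths for one revision only; a real fsck check would have to cover every tree, which is where the difficulty described above comes from.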
> Can Unicode symlink targets contain NULs? We wouldn't want to damage them even if we decide to truncate.
On Windows, I suppose, where pathnames can be UTF-16? I don't know how any of that works with Git. I guess we'd always have to assume the filenames in Git are UTF-8 or at least some ASCII-superset, since we cannot encode NULs; and presumably that would extend to link destinations, too. So I doubt it's a problem in practice. Personally, I'd wait until somebody with such a system cares enough to suggest a new behavior, rather than trying to guess. :)

Likewise, I think at this point with Keegan's original report that Git is doing something reasonable with a lousy input. Unless something interesting comes out of the golang/go bug report discussion (thank you for opening that!), it's probably not worth chasing hypotheticals.

-Peff
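
As a small illustration of the encoding point (a Python sketch with a made-up example target, not anything Git itself does): UTF-8 never produces a NUL byte unless the text contains U+0000, while UTF-16 produces one for every ASCII character, so truncating at the first NUL is harmless for the former and destructive for the latter.

```python
def truncate_at_nul(raw: bytes) -> bytes:
    """Keep only the bytes before the first NUL, as a C string reader would."""
    return raw.split(b"\0", 1)[0]


# Hypothetical link target mixing ASCII and non-ASCII characters.
target = "dir/файл.txt"

as_utf8 = target.encode("utf-8")
as_utf16 = target.encode("utf-16-le")

print(b"\0" in as_utf8)            # False: no NUL bytes in the UTF-8 form
print(b"\0" in as_utf16)           # True: 'd' alone becomes b"d\x00"
print(truncate_at_nul(as_utf8) == as_utf8)   # True: truncation is a no-op
print(truncate_at_nul(as_utf16))   # b'd': almost everything is lost
```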