Re: git archive generates tar with malformed pax extended attribute
From: René Scharfe <hidden>
Date: 2019-05-26 21:34:04
On 25.05.19 at 23:07, Ævar Arnfjörð Bjarmason wrote:
> On Sat, May 25 2019, René Scharfe wrote:
>> We could truncate symlink targets at the first NUL as well in git archive -- but that would be a bit sad, as the archive formats allow storing the "real" target from the repo, with NUL and all.
>
> But that being said, this assumption that data in a tar archive will get written to a FS of some sort isn't true. There's plenty of consumers of the format that read it in-memory and stream its contents out to something else entirely, e.g. taking "git archive --remote" output, parsing it with e.g. [1] and throwing some/all of the content into a database.
>
> 1. https://metacpan.org/pod/Archive::Tar

Git archive writes link targets that are 100 characters long or less into the appropriate field in the plain tar header. It copies everything, including NULs, but unlike a PAX extended header that field lacks a length indicator, so extractors basically have to treat it as NUL-terminated.

If we want to preserve NUL in short link targets as well, we'd have to put such names into a PAX extended header.
---
 archive-tar.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/archive-tar.c b/archive-tar.c
index 3e53aac1e6..e8f55578d1 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -291,7 +291,8 @@ static int write_tar_entry(struct archiver_args *args,
 	}
 	if (S_ISLNK(mode)) {
-		if (size > sizeof(header.linkname)) {
+		if (size > sizeof(header.linkname) ||
+		    memchr(buffer, '\0', size)) {
 			xsnprintf(header.linkname, sizeof(header.linkname),
 				  "see %s.paxheader", oid_to_hex(oid));
 			strbuf_append_ext_header(&ext_header, "linkpath",
-- 
2.21.0
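
For illustration only, here is a minimal standalone sketch of the point made above: the fixed-size linkname field carries no length, so a reader can only see what precedes the first NUL, while a pax record of the form "<len> <keyword>=<value>\n" carries its own length and can hold the NUL. This is not code from git. The helper names needs_pax_linkpath(), linkname_visible_length() and format_pax_record() are made up for this example, and LINKNAME_FIELD_SIZE is assumed to match the 100-byte field mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed size of the linkname field in a plain tar header. */
#define LINKNAME_FIELD_SIZE 100

/*
 * Hypothetical helper mirroring the condition in the patch: a link target
 * needs a pax "linkpath" record if it does not fit into the field or if it
 * contains a NUL byte.  Returns 1 or 0.
 */
static int needs_pax_linkpath(const char *target, size_t size)
{
	return size > LINKNAME_FIELD_SIZE ||
	       memchr(target, '\0', size) != NULL;
}

/*
 * What a reader can recover from the fixed-size field: lacking a length
 * indicator, it has to stop at the first NUL or at the end of the field.
 */
static size_t linkname_visible_length(const char *field)
{
	const char *nul = memchr(field, '\0', LINKNAME_FIELD_SIZE);
	return nul ? (size_t)(nul - field) : LINKNAME_FIELD_SIZE;
}

/* Number of decimal digits needed to print n. */
static size_t count_digits(size_t n)
{
	size_t digits = 1;
	while (n > 9) {
		n /= 10;
		digits++;
	}
	return digits;
}

/*
 * Write one pax record "<len> <keyword>=<value>\n" into out, where <len>
 * counts the whole record including its own digits.  The value is copied
 * with memcpy(), so embedded NULs survive.  The result is not
 * NUL-terminated.  Returns the record length, or 0 if out is too small.
 */
static size_t format_pax_record(char *out, size_t outsize,
				const char *keyword,
				const char *value, size_t valuesize)
{
	/* Everything except the length digits: ' ' keyword '=' value '\n' */
	size_t body = 1 + strlen(keyword) + 1 + valuesize + 1;
	size_t digits = 1;
	size_t len;
	int n;

	assert(out != NULL && keyword != NULL && value != NULL);

	/* Grow the digit count until it matches the total it describes. */
	while (count_digits(body + digits) != digits)
		digits++;
	len = body + digits;
	if (len > outsize)
		return 0;

	n = snprintf(out, outsize, "%zu %s=", len, keyword);
	memcpy(out + n, value, valuesize);
	out[n + valuesize] = '\n';
	return len;
}
```

With a three-byte target consisting of 'a', NUL, 'b', needs_pax_linkpath() reports that an extended header is needed, a reader of the plain field sees only one byte, and the pax record keeps all three.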