Re: [PATCH 2/2] archive: avoid spawning `gzip`

From: René Scharfe <hidden>
Date: 2019-06-10 10:45:31
Subsystem: kernel build + files below scripts/ (unless maintained elsewhere), the rest · Maintainers: Nathan Chancellor, Nicolas Schier, Linus Torvalds

Am 01.05.19 um 20:18 schrieb Jeff King:

On Wed, May 01, 2019 at 07:45:05PM +0200, René Scharfe wrote:

quoted

But since the performance is still not quite on par with `gzip`, I would
actually rather not, and really, just punt on that one, stating that
people interested in higher performance should use `pigz`.

Here are my performance numbers for generating .tar.gz files again:

OK, tried one more version, with pthreads (patch at the end).  Also
redid all measurements for better comparability; everything is faster
now for some reason (perhaps due to a compiler update? clang version
7.0.1-8 now):

master, using gzip(1):
Benchmark #1: git archive --format=tgz HEAD
  Time (mean ± σ):     15.697 s ±  0.246 s    [User: 19.213 s, System: 0.386 s]
  Range (min … max):   15.405 s … 16.103 s    10 runs

using zlib sequentially:
Benchmark #1: git archive --format=tgz HEAD
  Time (mean ± σ):     19.191 s ±  0.408 s    [User: 19.091 s, System: 0.100 s]
  Range (min … max):   18.802 s … 19.877 s    10 runs

using a gzip-lookalike:
Benchmark #1: git archive --format=tgz HEAD
  Time (mean ± σ):     16.289 s ±  0.218 s    [User: 19.485 s, System: 0.337 s]
  Range (min … max):   16.020 s … 16.555 s    10 runs

using zlib with start_async:
Benchmark #1: git archive --format=tgz HEAD
  Time (mean ± σ):     16.516 s ±  0.334 s    [User: 20.282 s, System: 0.383 s]
  Range (min … max):   16.166 s … 17.283 s    10 runs

using zlib in a separate thread (that's the new one):
Benchmark #1: git archive --format=tgz HEAD
  Time (mean ± σ):     16.310 s ±  0.237 s    [User: 20.075 s, System: 0.173 s]
  Range (min … max):   15.983 s … 16.790 s    10 runs

I think the start_async() one seems like a good option. It reclaims most
of the (wall-clock) performance, isn't very much code, and doesn't leave
any ugly user-visible traces.

The pthreads numbers look a bit better still.  The patch is huge though,
because it duplicates almost everything.  It was easier that way; a real
patch series would extract functions that can be used both with static
and allocated headers first, and keep everything in archive-tar.c.

I'd be fine to see it come later, though, on top of the patches Dscho is
sending. Even though changing to sequential zlib is technically a change
in behavior, the existing behavior wasn't really planned. And given the
wall-clock versus CPU time tradeoff, it's not entirely clear that one
solution is better than the other.

The current behavior is not an accident; the synchronous method was
rejected in 2009 because it was slower [1].  Redid the measurements
with v1.6.5-rc0 and the old patch [2], but they would only compile with
gcc (Debian 8.3.0-6) for me, so it's not directly comparable to the
numbers above:

v1.6.5-rc0:
Benchmark #1: ../git/git-archive HEAD | gzip
  Time (mean ± σ):     16.051 s ±  0.486 s    [User: 19.514 s, System: 0.341 s]
  Range (min … max):   15.416 s … 17.001 s    10 runs

v1.6.5-rc0 + [2]:
Benchmark #1: ../git/git-archive --format=tar.gz HEAD
  Time (mean ± σ):     19.684 s ±  0.374 s    [User: 19.601 s, System: 0.060 s]
  Range (min … max):   19.082 s … 20.177 s    10 runs

User time is still slightly higher, but the difference is in the noise.

[1] http://public-inbox.org/git/4AAAC8CE.8020302@lsrfire.ath.cx/
[2] http://public-inbox.org/git/4AA97B61.6030301@lsrfire.ath.cx/

quoted

And who knows, maybe nobody will complain at all about the performance?

Probably.  And popular tarballs would be cached anyway, I guess.

At GitHub we certainly do cache the git-archive output. We'd also be
just fine with the sequential solution. We generally turn down
pack.threads to 1, and keep our CPUs busy by serving multiple users
anyway.

So whatever has the lowest overall CPU time is generally preferable, but
the times are close enough that I don't think we'd care much either way
(and it's probably not worth having a config option or similar).

Moving back to 2009 and reducing the number of utilized cores both feels
weird, but the sequential solution *is* the most obvious, easiest and
(by a narrow margin) lightest one if gzip(1) is not an option anymore.

Anyway, the threading patch:

---
 Makefile      |   1 +
 archive-tar.c |  11 +-
 archive-tgz.c | 452 ++++++++++++++++++++++++++++++++++++++++++++++++++
 archive.h     |   4 +
 4 files changed, 465 insertions(+), 3 deletions(-)
 create mode 100644 archive-tgz.c

diff --git a/Makefile b/Makefile
index 8a7e235352..ed649ac18d 100644
--- a/Makefile
+++ b/Makefile

@@ -834,6 +834,7 @@ LIB_OBJS += alloc.o
 LIB_OBJS += apply.o
 LIB_OBJS += archive.o
 LIB_OBJS += archive-tar.o
+LIB_OBJS += archive-tgz.o
 LIB_OBJS += archive-zip.o
 LIB_OBJS += argv-array.o
 LIB_OBJS += attr.o

diff --git a/archive-tar.c b/archive-tar.c
index 3e53aac1e6..929eb58235 100644
--- a/archive-tar.c
+++ b/archive-tar.c

@@ -15,7 +15,9 @@
 static char block[BLOCKSIZE];
 static unsigned long offset;

-static int tar_umask = 002;
+int tar_umask = 002;
+
+static const char internal_gzip[] = "git archive gzip";

 static int write_tar_filter_archive(const struct archiver *ar,
 				    struct archiver_args *args);

@@ -445,6 +447,9 @@ static int write_tar_filter_archive(const struct archiver *ar,
 	if (!ar->data)
 		BUG("tar-filter archiver called with no filter defined");

+	if (!strcmp(ar->data, internal_gzip))
+		return write_tgz_archive(ar, args);
+
 	strbuf_addstr(&cmd, ar->data);
 	if (args->compression_level >= 0)
 		strbuf_addf(&cmd, " -%d", args->compression_level);

@@ -483,9 +488,9 @@ void init_tar_archiver(void)
 	int i;
 	register_archiver(&tar_archiver);

-	tar_filter_config("tar.tgz.command", "gzip -cn", NULL);
+	tar_filter_config("tar.tgz.command", internal_gzip, NULL);
 	tar_filter_config("tar.tgz.remote", "true", NULL);
-	tar_filter_config("tar.tar.gz.command", "gzip -cn", NULL);
+	tar_filter_config("tar.tar.gz.command", internal_gzip, NULL);
 	tar_filter_config("tar.tar.gz.remote", "true", NULL);
 	git_config(git_tar_config, NULL);
 	for (i = 0; i < nr_tar_filters; i++) {

diff --git a/archive-tgz.c b/archive-tgz.c
new file mode 100644
index 0000000000..ae219e1cc0
--- /dev/null
+++ b/archive-tgz.c

@@ -0,0 +1,452 @@
+#include "cache.h"
+#include "config.h"
+#include "tar.h"
+#include "archive.h"
+#include "object-store.h"
+#include "streaming.h"
+
+#define RECORDSIZE	(512)
+#define BLOCKSIZE	(RECORDSIZE * 20)
+
+static gzFile gzip;
+static size_t offset;
+
+/*
+ * This is the max value that a ustar size header can specify, as it is fixed
+ * at 11 octal digits. POSIX specifies that we switch to extended headers at
+ * this size.
+ *
+ * Likewise for the mtime (which happens to use a buffer of the same size).
+ */
+#if ULONG_MAX == 0xFFFFFFFF
+#define USTAR_MAX_SIZE ULONG_MAX
+#else
+#define USTAR_MAX_SIZE 077777777777UL
+#endif
+#if TIME_MAX == 0xFFFFFFFF
+#define USTAR_MAX_MTIME TIME_MAX
+#else
+#define USTAR_MAX_MTIME 077777777777ULL
+#endif
+
+static void tgz_write(const void *data, size_t size)
+{
+	const char *p = data;
+	while (size) {
+		size_t to_write = size;
+		if (to_write > UINT_MAX)
+			to_write = UINT_MAX;
+		if (gzwrite(gzip, p, to_write) != to_write)
+			die(_("gzwrite failed"));
+		p += to_write;
+		size -= to_write;
+		offset = (offset + to_write) % BLOCKSIZE;
+	}
+}
+
+static void tgz_finish_record(void)
+{
+	size_t tail = offset % RECORDSIZE;
+	if (tail) {
+		size_t to_seek = RECORDSIZE - tail;
+		if (gzseek(gzip, to_seek, SEEK_CUR) < 0)
+			die(_("gzseek failed"));
+		offset = (offset + to_seek) % BLOCKSIZE;
+	}
+}
+
+static void tgz_write_trailer(void)
+{
+	size_t to_seek = BLOCKSIZE - offset;
+	if (to_seek < 2 * RECORDSIZE)
+		to_seek += BLOCKSIZE;
+	if (gzseek(gzip, to_seek, SEEK_CUR) < 0)
+		 die(_("gzseek failed"));
+	if (gzflush(gzip, Z_FINISH) != Z_OK)
+		die(_("gzflush failed"));
+}
+
+struct work_item {
+	void *buffer;
+	size_t size;
+	int finish_record;
+};
+
+#define TODO_SIZE 64
+struct work_item todo[TODO_SIZE];
+static int todo_start;
+static int todo_end;
+static int todo_done;
+static int all_work_added;
+static pthread_mutex_t tar_mutex;
+static pthread_t thread;
+
+static void tar_lock(void)
+{
+	pthread_mutex_lock(&tar_mutex);
+}
+
+static void tar_unlock(void)
+{
+	pthread_mutex_unlock(&tar_mutex);
+}
+
+static pthread_cond_t cond_add;
+static pthread_cond_t cond_write;
+static pthread_cond_t cond_result;
+
+static void add_work(void *buffer, size_t size, int finish_record)
+{
+	tar_lock();
+
+	while ((todo_end + 1) % ARRAY_SIZE(todo) == todo_done)
+		pthread_cond_wait(&cond_write, &tar_mutex);
+
+	todo[todo_end].buffer = buffer;
+	todo[todo_end].size = size;
+	todo[todo_end].finish_record = finish_record;
+
+	todo_end = (todo_end + 1) % ARRAY_SIZE(todo);
+
+	pthread_cond_signal(&cond_add);
+	tar_unlock();
+}
+
+static struct work_item *get_work(void)
+{
+	struct work_item *ret = NULL;
+
+	tar_lock();
+	while (todo_start == todo_end && !all_work_added)
+		pthread_cond_wait(&cond_add, &tar_mutex);
+
+	if (todo_start != todo_end || !all_work_added) {
+		ret = &todo[todo_start];
+		todo_start = (todo_start + 1) % ARRAY_SIZE(todo);
+	}
+	tar_unlock();
+	return ret;
+}
+
+static void work_done(void)
+{
+	tar_lock();
+	todo_done = (todo_done + 1) % ARRAY_SIZE(todo);
+	pthread_cond_signal(&cond_write);
+
+	if (all_work_added && todo_done == todo_end)
+		pthread_cond_signal(&cond_result);
+	tar_unlock();
+}
+
+static void *run(void *arg)
+{
+	for (;;) {
+		struct work_item *w = get_work();
+		if (!w)
+			break;
+		tgz_write(w->buffer, w->size);
+		free(w->buffer);
+		if (w->finish_record)
+			tgz_finish_record();
+		work_done();
+	}
+	return NULL;
+}
+
+static void start_output_thread(void)
+{
+	int err;
+
+	pthread_mutex_init(&tar_mutex, NULL);
+	pthread_cond_init(&cond_add, NULL);
+	pthread_cond_init(&cond_write, NULL);
+	pthread_cond_init(&cond_result, NULL);
+
+	memset(todo, 0, sizeof(todo));
+
+	err = pthread_create(&thread, NULL, run, NULL);
+	if (err)
+		die(_("failed to create thread: %s"), strerror(err));
+}
+
+static void wait_for_output_thread(void)
+{
+	tar_lock();
+	all_work_added = 1;
+
+	while (todo_done != todo_end)
+		pthread_cond_wait(&cond_result, &tar_mutex);
+
+	pthread_cond_broadcast(&cond_add);
+	tar_unlock();
+
+	pthread_join(thread, NULL);
+
+	pthread_mutex_destroy(&tar_mutex);
+	pthread_cond_destroy(&cond_add);
+	pthread_cond_destroy(&cond_write);
+	pthread_cond_destroy(&cond_result);
+}
+
+static int stream_blob(const struct object_id *oid)
+{
+	struct git_istream *st;
+	enum object_type type;
+	unsigned long sz;
+	ssize_t readlen;
+	size_t chunk_size = BLOCKSIZE * 10;
+
+	st = open_istream(oid, &type, &sz, NULL);
+	if (!st)
+		return error(_("cannot stream blob %s"), oid_to_hex(oid));
+	for (;;) {
+		char *buf = xmalloc(chunk_size);
+		readlen = read_istream(st, buf, chunk_size);
+		if (readlen <= 0)
+			break;
+		sz -= readlen;
+		add_work(buf, readlen, !sz);
+	}
+	close_istream(st);
+	return readlen;
+}
+
+/*
+ * pax extended header records have the format "%u %s=%s\n".  %u contains
+ * the size of the whole string (including the %u), the first %s is the
+ * keyword, the second one is the value.  This function constructs such a
+ * string and appends it to a struct strbuf.
+ */
+static void strbuf_append_ext_header(struct strbuf *sb, const char *keyword,
+				     const char *value, unsigned int valuelen)
+{
+	int len, tmp;
+
+	/* "%u %s=%s\n" */
+	len = 1 + 1 + strlen(keyword) + 1 + valuelen + 1;
+	for (tmp = len; tmp > 9; tmp /= 10)
+		len++;
+
+	strbuf_grow(sb, len);
+	strbuf_addf(sb, "%u %s=", len, keyword);
+	strbuf_add(sb, value, valuelen);
+	strbuf_addch(sb, '\n');
+}
+
+/*
+ * Like strbuf_append_ext_header, but for numeric values.
+ */
+static void strbuf_append_ext_header_uint(struct strbuf *sb,
+					  const char *keyword,
+					  uintmax_t value)
+{
+	char buf[40]; /* big enough for 2^128 in decimal, plus NUL */
+	int len;
+
+	len = xsnprintf(buf, sizeof(buf), "%"PRIuMAX, value);
+	strbuf_append_ext_header(sb, keyword, buf, len);
+}
+
+static unsigned int ustar_header_chksum(const struct ustar_header *header)
+{
+	const unsigned char *p = (const unsigned char *)header;
+	unsigned int chksum = 0;
+	while (p < (const unsigned char *)header->chksum)
+		chksum += *p++;
+	chksum += sizeof(header->chksum) * ' ';
+	p += sizeof(header->chksum);
+	while (p < (const unsigned char *)header + sizeof(struct ustar_header))
+		chksum += *p++;
+	return chksum;
+}
+
+static size_t get_path_prefix(const char *path, size_t pathlen, size_t maxlen)
+{
+	size_t i = pathlen;
+	if (i > 1 && path[i - 1] == '/')
+		i--;
+	if (i > maxlen)
+		i = maxlen;
+	do {
+		i--;
+	} while (i > 0 && path[i] != '/');
+	return i;
+}
+
+static void prepare_header(struct archiver_args *args,
+			   struct ustar_header *header,
+			   unsigned int mode, unsigned long size)
+{
+	xsnprintf(header->mode, sizeof(header->mode), "%07o", mode & 07777);
+	xsnprintf(header->size, sizeof(header->size), "%011"PRIoMAX , S_ISREG(mode) ? (uintmax_t)size : (uintmax_t)0);
+	xsnprintf(header->mtime, sizeof(header->mtime), "%011lo", (unsigned long) args->time);
+
+	xsnprintf(header->uid, sizeof(header->uid), "%07o", 0);
+	xsnprintf(header->gid, sizeof(header->gid), "%07o", 0);
+	strlcpy(header->uname, "root", sizeof(header->uname));
+	strlcpy(header->gname, "root", sizeof(header->gname));
+	xsnprintf(header->devmajor, sizeof(header->devmajor), "%07o", 0);
+	xsnprintf(header->devminor, sizeof(header->devminor), "%07o", 0);
+
+	memcpy(header->magic, "ustar", 6);
+	memcpy(header->version, "00", 2);
+
+	xsnprintf(header->chksum, sizeof(header->chksum), "%07o", ustar_header_chksum(header));
+}
+
+static void write_extended_header(struct archiver_args *args,
+				  const struct object_id *oid,
+				  struct strbuf *extended_header)
+{
+	size_t size;
+	char *buffer = strbuf_detach(extended_header, &size);
+	struct ustar_header *header = xcalloc(1, sizeof(*header));
+	unsigned int mode;
+	*header->typeflag = TYPEFLAG_EXT_HEADER;
+	mode = 0100666;
+	xsnprintf(header->name, sizeof(header->name), "%s.paxheader",
+		  oid_to_hex(oid));
+	prepare_header(args, header, mode, size);
+	add_work(header, sizeof(*header), 1);
+	add_work(buffer, size, 1);
+}
+
+static int write_tar_entry(struct archiver_args *args,
+			   const struct object_id *oid,
+			   const char *path, size_t pathlen,
+			   unsigned int mode)
+{
+	struct ustar_header *header = xcalloc(1, sizeof(*header));
+	struct strbuf ext_header = STRBUF_INIT;
+	unsigned int old_mode = mode;
+	unsigned long size, size_in_header;
+	void *buffer;
+	int err = 0;
+
+	if (S_ISDIR(mode) || S_ISGITLINK(mode)) {
+		*header->typeflag = TYPEFLAG_DIR;
+		mode = (mode | 0777) & ~tar_umask;
+	} else if (S_ISLNK(mode)) {
+		*header->typeflag = TYPEFLAG_LNK;
+		mode |= 0777;
+	} else if (S_ISREG(mode)) {
+		*header->typeflag = TYPEFLAG_REG;
+		mode = (mode | ((mode & 0100) ? 0777 : 0666)) & ~tar_umask;
+	} else {
+		return error(_("unsupported file mode: 0%o (SHA1: %s)"),
+			     mode, oid_to_hex(oid));
+	}
+	if (pathlen > sizeof(header->name)) {
+		size_t plen = get_path_prefix(path, pathlen,
+					      sizeof(header->prefix));
+		size_t rest = pathlen - plen - 1;
+		if (plen > 0 && rest <= sizeof(header->name)) {
+			memcpy(header->prefix, path, plen);
+			memcpy(header->name, path + plen + 1, rest);
+		} else {
+			xsnprintf(header->name, sizeof(header->name), "%s.data",
+				  oid_to_hex(oid));
+			strbuf_append_ext_header(&ext_header, "path",
+						 path, pathlen);
+		}
+	} else
+		memcpy(header->name, path, pathlen);
+
+	if (S_ISREG(mode) && !args->convert &&
+	    oid_object_info(args->repo, oid, &size) == OBJ_BLOB &&
+	    size > big_file_threshold)
+		buffer = NULL;
+	else if (S_ISLNK(mode) || S_ISREG(mode)) {
+		enum object_type type;
+		buffer = object_file_to_archive(args, path, oid, old_mode, &type, &size);
+		if (!buffer)
+			return error(_("cannot read %s"), oid_to_hex(oid));
+	} else {
+		buffer = NULL;
+		size = 0;
+	}
+
+	if (S_ISLNK(mode)) {
+		if (size > sizeof(header->linkname)) {
+			xsnprintf(header->linkname, sizeof(header->linkname),
+				  "see %s.paxheader", oid_to_hex(oid));
+			strbuf_append_ext_header(&ext_header, "linkpath",
+						 buffer, size);
+		} else
+			memcpy(header->linkname, buffer, size);
+	}
+
+	size_in_header = size;
+	if (S_ISREG(mode) && size > USTAR_MAX_SIZE) {
+		size_in_header = 0;
+		strbuf_append_ext_header_uint(&ext_header, "size", size);
+	}
+
+	prepare_header(args, header, mode, size_in_header);
+
+	if (ext_header.len > 0) {
+		write_extended_header(args, oid, &ext_header);
+	}
+	add_work(header, sizeof(*header), 1);
+	if (S_ISREG(mode) && size > 0) {
+		if (buffer)
+			add_work(buffer, size, 1);
+		else
+			err = stream_blob(oid);
+	} else
+		free(buffer);
+	return err;
+}
+
+static void write_global_extended_header(struct archiver_args *args)
+{
+	const struct object_id *oid = args->commit_oid;
+	struct strbuf ext_header = STRBUF_INIT;
+	struct ustar_header *header = xcalloc(1, sizeof(*header));
+	unsigned int mode;
+	size_t size;
+	char *buffer;
+
+	if (oid)
+		strbuf_append_ext_header(&ext_header, "comment",
+					 oid_to_hex(oid),
+					 the_hash_algo->hexsz);
+	if (args->time > USTAR_MAX_MTIME) {
+		strbuf_append_ext_header_uint(&ext_header, "mtime",
+					      args->time);
+		args->time = USTAR_MAX_MTIME;
+	}
+
+	if (!ext_header.len)
+		return;
+
+	buffer = strbuf_detach(&ext_header, &size);
+	*header->typeflag = TYPEFLAG_GLOBAL_HEADER;
+	mode = 0100666;
+	xsnprintf(header->name, sizeof(header->name), "pax_global_header");
+	prepare_header(args, header, mode, size);
+	add_work(header, sizeof(*header), 1);
+	add_work(buffer, size, 1);
+}
+
+int write_tgz_archive(const struct archiver *ar, struct archiver_args *args)
+{
+	int level = args->compression_level;
+	int err = 0;
+
+	gzip = gzdopen(1, "wb");
+	if (!gzip)
+		return error(_("gzdopen failed"));
+	if (gzsetparams(gzip, level, Z_DEFAULT_STRATEGY) != Z_OK)
+		return error(_("unable to set compression level %d"), level);
+
+	start_output_thread();
+	write_global_extended_header(args);
+	err = write_archive_entries(args, write_tar_entry);
+	if (err)
+		return err;
+	wait_for_output_thread();
+	tgz_write_trailer();
+	return err;
+}

diff --git a/archive.h b/archive.h
index e60e3dd31c..b00afa1a9f 100644
--- a/archive.h
+++ b/archive.h

@@ -45,6 +45,10 @@ void init_tar_archiver(void);
 void init_zip_archiver(void);
 void init_archivers(void);

+int tar_umask;
+
+int write_tgz_archive(const struct archiver *ar, struct archiver_args *args);
+
 typedef int (*write_archive_entry_fn_t)(struct archiver_args *args,
 					const struct object_id *oid,
 					const char *path, size_t pathlen,
--

2.22.0

`h`	back out one level
`j`	next message in thread
`k`	previous message in thread
`l`	drill in
`Esc`	close help / fold thread tree
`?`	toggle this help