Thread (5 messages), 4 authors, 2023-10-03

Re: [silly] loose, pack, and another thing?

From: Jonathan Tan <hidden>
Date: 2023-09-28 21:40:17

Junio C Hamano [off-list ref] writes:
> Just wondering if it would help to have the third kind of object
> representation in the object database, sitting next to loose objects
> and packed objects, say .git/objects/verbatim/<hex-object-name> for
> the contents and .git/objects/verbatim/<hex-object-name>.type that
> records "blob", "tree", "commit", or "tag" (in practice, I would
> expect huge "blob" objects would be the only ones that use this
> mechanism).
>
> The contents will be stored verbatim without compression and without
> any object header (i.e., the usual "<type> <length>\0") and the file
> could be "ln"ed (or "cow"ed if the underlying filesystem allows it)
> to materialize it in the working tree if needed.

This sounds like a useful feature. We probably would want to use the
"ln" or "cow" every time we use streaming (stream_blob_to_fd() in
streaming.h) currently, so hopefully we won't need to increase the
number of ways in which we can write an object to the worktree (just
change the streaming to write to a filename instead of an fd).
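
As a rough illustration of the "write to a filename instead of an fd" idea, the sketch below materializes a verbatim object at a working-tree path by hard-linking it and falls back to a plain copy when the link fails (for example, across filesystems). It is not Git code: `materialize` and its arguments are hypothetical names chosen for this example, and the filesystem-specific "cow" (reflink) path is left out.

```python
import os
import shutil


def materialize(object_path, worktree_path):
    """Place the contents of object_path at worktree_path.

    Hypothetical helper, not part of Git. Tries a hard link ("ln") first
    and falls back to a byte-for-byte copy. Returns "link" or "copy" so
    the caller can tell which path was taken.
    """
    # Remove any existing file so that os.link() does not fail with EEXIST.
    if os.path.lexists(worktree_path):
        os.remove(worktree_path)
    try:
        os.link(object_path, worktree_path)
        return "link"
    except OSError:
        # Cross-device link, unsupported filesystem, permissions, etc.
        shutil.copyfile(object_path, worktree_path)
        return "copy"
```

One consequence worth keeping in mind: a hard-linked file shares its data with the object store, so an in-place write to the working-tree file would also change the stored object. A copy-on-write clone would not have this problem.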
"fsck" needs to be told about how to verify them.  Create the object
header in-core and hash that, followed by the contents of that file,
and make sure the result matches the <hex-object-name> part of the
filename, or something like that.
Yeah, this sounds like what index-pack is doing - the hash algo can take
the contents of one buffer (a header that we synthesize ourselves), and
then take the contents of another buffer (the file contents).
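
The verification described above can be sketched as follows: synthesize the usual "<type> <length>\0" header in memory, feed it to the hash, then stream the file contents into the same hash and compare the result with the file name. This is an illustration only, not Git's fsck or index-pack code. The function names are hypothetical, the directory layout follows the `.git/objects/verbatim/<hex-object-name>` and `.type` proposal quoted earlier, and SHA-1 is assumed as the hash algorithm.

```python
import hashlib
import os

VALID_TYPES = {"blob", "tree", "commit", "tag"}


def hash_verbatim_object(path, obj_type="blob", algo="sha1", chunk_size=1 << 20):
    """Hash a synthesized object header followed by the raw file contents."""
    size = os.path.getsize(path)
    hasher = hashlib.new(algo)
    # First buffer: the header that the verbatim file itself does not store.
    hasher.update(f"{obj_type} {size}".encode("ascii") + b"\0")
    # Second buffer(s): the file contents, streamed in chunks so that a
    # huge blob never has to fit in memory.
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_verbatim_object(verbatim_dir, hex_name):
    """Return True if the file named hex_name hashes to its own name."""
    content_path = os.path.join(verbatim_dir, hex_name)
    # The proposed sidecar file records the object type.
    with open(content_path + ".type", "r", encoding="ascii") as f:
        obj_type = f.read().strip()
    if obj_type not in VALID_TYPES:
        return False
    return hash_verbatim_object(content_path, obj_type) == hex_name
```

For an empty file recorded as a "blob", this yields e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, the well-known object name of Git's empty blob under SHA-1.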