Re: [RFC] git blame-tree
From: Jakub Narebski <hidden>
Date: 2016-06-15 22:50:42
Jeff King [off-list ref] writes:
On Wed, Mar 02, 2011 at 11:40:32AM -0500, Jeff King wrote:
quoted
It's sometimes useful to get a list of files in a tree along with the last commit that touched them. This is the default tree view shown on github.com, but it can also be handy from the command line (there has been talk lately of having a "git ls"), or as plumbing for a local fancier tree view. E.g., something like: add.c 6e7293e git-add: make -A description clearer vs. -u apply.c fd03881 add description parameter to OPT__VERBOSE blame.c 9ca1169 parse-options: Don't call parse_options_check() so much branch.c 62270f6 branch_merged: fix grammar in warning bundle.c 62b4698 Use angles for placeholders consistently
[...]
quoted
blame-tree.c | 156 +++++++++++++++++++++++++++++++++++++++++++++++++ blame-tree.h | 25 ++++++++ builtin.h | 1 + builtin/blame-tree.c | 34 +++++++++++I tried to lib-ify the implementation as much as possible. It increases the lines of code, of course, but I figured there was a reasonable chance that there might be a user-friendly "git ls" command eventually, and this would probably make a good "-v" option to it.
I think it _might_ be a good idea to add `--blame' option to "git ls-tree", as a one of ways of presenting tree-blame output. Or perhaps as part of "git ls". In "[RFC] Tree blame (git blame <directory>)"[1] I proposed for $ git blame --abbrev v1.6.3.3 -- . to generate 100644 blob e57630e ba19a80 Junio C Hamano Feb 10 17:42 walker.c 100644 blob 8a149e1 c13b263 Daniel Barkalow Apr 26 2008 walker.h 100644 blob 7eb3218 fc71db3 Alex Riesen Apr 29 23:21 wrapper.c 100644 blob 4c29255 559e840 Junio C Hamano Jul 20 2008 write_or_die.c 100644 blob 819c797 a437900 Junio C Hamano Jun 21 02:35 ws.c 100644 blob 1b6df45 2af202b Linus Torvalds Jun 18 10:28 wt-status.c 100644 blob 78add09 6c2ce04 Marius Storm-Olsen Jun 5 2008 wt-status.h 100644 blob b9b0db8 eb3a9dd Benjamin Kramer Mar 7 21:02 xdiff-interface.c 100644 blob 7352b9a 86295bb Rene Scharfe Oct 25 2008 xdiff-interface.h 040000 tree ef5d413 5719db9 Charles Bailey May 25 01:21 xdiff/ or something like that. Date doesn't have to be in this strange format used by 'ls'. Also instead of name we can use username part of email, or just email; OTOH git-blame uses above format for author. This could be result of "git ls-tree --abbrev --blame v1.6.3.3"... and it could be combined with `-l' option of git-ls-tree. [1]: http://article.gmane.org/gmane.comp.version-control.git/122830
I considered making it a special mode of "git blame" when blame is fed a directory instead of a file. But the implementations aren't shared at all (nor do I think they need to be; blame-tree is _way_ simpler). And I didn't want to steal that concept in case somebody can think of a more content-level way of blaming a whole tree that makes sense (obviously just showing the concatenation of the blames of each file is one way, but I don't know how useful that would be). If we want to go that way, we can always catch the special case in blame and just exec blame-tree.
Well, having "git blame [<rev>] <directory>" to output tre-blame would allow to reuse some of already existing options to ordinary git-blame; well those that makes sense, like `-b', `-S <revs-file>', `--reverse', perhaps (depending on available output) also `-l', `-t', `-s', `--date <format>'. <rev> is here starting revision or revision range; if it is revision range then negative specifiers function as boundary. We could use `-M' to turn on rename detection, and `-C' to turn on copy detection; I think that in tree-blame we need to consider only _exact_ renames (pure renames, i.e. the same SHA-1, different name). Also for GitHub (and perhaps also in the future for gitweb too) would I think use `--porcelain' or even `--incremental' version of tree-blame; in [1] I have proposed the following output (following existing "for porcelain" format): $ git blame --porcelain v1.6.3.3 -- . 86295bb6bac1482d29650d1f77f19d8e7a7cc2fe 7352b9a9c204c2b1d4ca9df5ce040fe22d6f521c author Rene Scharfe author-mail [off-list ref] author-time 1224941475 author-tz +0200 committer Junio C Hamano committer-mail [off-list ref] committer-time 1224961771 committer-tz -0700 summary add xdi_diff_hunks() for callers that only need hunk lengths filename xdiff-interface.h 100644 blob 7352b9a9c204c2b1d4ca9df5ce040fe22d6f521c xdiff-interface.h 5719db91ce5915ee07c50f1afdc94fe34e91529f ef5d413237b3a390007fba56671b00d7c371ae1e author Charles Bailey author-mail [off-list ref] author-time 1243210874 author-tz +0100 committer Junio C Hamano committer-mail [off-list ref] committer-time 1243234594 committer-tz -0700 summary add xdi_diff_hunks() for callers that only need hunk lengths filename xdiff 040000 tree ef5d413237b3a390007fba56671b00d7c371ae1e xdiff
quoted
+void blame_tree_init(struct blame_tree *bt) +{ + memset(bt, 0, sizeof(*bt)); + bt->paths.strdup_strings = 1; + init_revisions(&bt->rev, NULL); + bt->rev.no_merges = 1; + bt->rev.def = "HEAD"; +}I turn off merges by default, since they are unlikely to be interesting matches (you will see the merge of a side-branch that touched a file instead of the actual commit on the side-branch). You could of course do "git blame-tree . --no-merges" to get the same effect. I think no-merges makes a saner default, but sadly it doesn't seem like there is a way to turn no-merges back off ("--merges" means something else, and there is no --no-no-merges").
IMHO merges are interecting; moreover if I remember correctly my proof
of concept of tree-blame which I tried to implement in Perl using
Git.pm (git cat-file --batch + git diff-tree --stdin), I have problems
with merges in tree-blame of subdirectory ("--relative" option doesn't
work as I thought it did).
Right now the code just handles trees. But in the long run, it would probably make sense to get the list of files from the index, and mark files modified in the working tree or index, too. So something like: foo.c 1234abcd this is a commit subject bar.c modified in working tree baz.c modified in index Sort of like how gitk shows "pseudo-commits" on top of history to indicate changes.
Or how "git blame" handles "--contents <file>" option... though what you mentioned is more than that. [...]
quoted
+int cmd_blame_tree(int argc, const char **argv, const char *prefix) +{ + struct blame_tree bt; + const char *path = NULL; + + git_config(git_default_config, NULL); + + if (argv[1]) { + path = argv[1]; + argc--; + argv++; + }Obviously no options. Probably there should at least be "--porcelain" to output the current form, and the default output should be more user-friendly. And probably "-z" to avoid quoting issues.
Thank you for working on this. -- Jakub Narebski Poland ShadeHawk on #git