gigantic commit messages, was Re: Git Bug Report: out of memory using git tag
From: Jeff King <hidden>
Date: 2022-11-02 09:15:38
On Wed, Nov 02, 2022 at 01:14:59AM -0700, Elijah Newren wrote:
> On Wed, Nov 2, 2022 at 12:51 AM Jeff King [off-list ref] wrote:
> >
> > Here are patches which fix them both. I may be setting a new record for
> > the ratio of commit message lines to changed code
>
> It looks like the first patch is 72 lines of commit message for a
> one-line fix, and the second patch is 61 lines of commit message for a
> two line fix. I don't know what the record ratio is, but it's at least
> 96[1], so clearly you'll need to figure out how to pad your first commit
> message with at least another 25 lines before this series can be
> accepted. ;-)
Well, if we want to start digging things up... ;)
Try this:
  git log --no-merges --no-renames --format='%H %B' -z --numstat '*.c' |
  perl -0ne '
    chomp;
    if (s/^([0-9a-f]{40}) //) {
      if (defined $commit && $diff) {
        my $ratio = $body / $diff;
        print "$ratio $body $diff $commit\n";
      }
      $commit = $1;
      $body = () = /\n/g;
      $diff = 0;
    } elsif (/^\s*(\d+)\t/) {
      # this counts only added lines, under the assumption that
      # small commits generally remove/add in proportion. Of course
      # ones that _only_ remove lines have infinite ratios.
      $diff += $1;
    } else {
      die "confusing record: $_\n";
    }
  ' |
  sort -rn |
  head
which shows there are a few in the 100's. Pipe through:
  awk '{print $4}' |
  git log --stdin --no-walk=unsorted --stat
for a nicer view. I'm rejecting the top one on the grounds that it's
mostly cut-and-paste output, and also that #2 is mine. ;)
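
To list every commit at or above a given ratio instead of just the top ten, something like the sketch below could replace the "head" step. The helper name "over_ratio" and its default threshold of 100 are made up for illustration; it assumes only the "ratio body diff commit" line format that the perl snippet prints.

```shell
# Hypothetical helper: keep only lines whose first field (the ratio)
# is at least the threshold given as $1 (default 100). Input is the
# "ratio body diff commit" lines produced by the perl snippet.
over_ratio() {
    awk -v min="${1:-100}" '$1 + 0 >= min + 0'
}
```

Usage would be along the lines of "... | sort -rn | over_ratio 100", and the result can still be piped through the awk/git-log step for the nicer view.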
-Peff