Git: How to Compute Stats for Each Contributor

Date: 2015-08-07

**Problem:** I wanted an easy way to compute my git contribution stats for a specific repo. Ideally the solution wouldn’t involve any computation on my part; the script itself would do the work for me.

**Solution:** I found the solution on Stack Overflow.

The one I chose is a one-liner that computes the stats for every user, which is why it may take a few minutes to complete on a large repo. Open a terminal in a directory that contains a git repo and run the line below. It lists everyone who has made a commit in the current repo, along with the number of files they’ve changed, their insertions and deletions, and the net line total. Note that the names come from the committer e-mail (%cE) with the domain stripped; use %aE instead if you want to group by author e-mail.

git log --shortstat --pretty="%cE" | sed 's/\(.*\)@.*/\1/' | grep -v "^$" | awk 'BEGIN { line=""; } !/^ / { if (line=="" || !match(line, $0)) {line = $0 "," line }} /^ / { print line " # " $0; line=""}' | sort | sed -E 's/# //;s/ files? changed,//;s/([0-9]+) ([0-9]+ deletion)/\1 0 insertions\(+\), \2/;s/\(\+\)$/\(\+\), 0 deletions\(-\)/;s/insertions?\(\+\), //;s/ deletions?\(-\)//' | awk 'BEGIN {name=""; files=0; insertions=0; deletions=0;} {if ($1 != name && name != "") { print name ": " files " files changed, " insertions " insertions(+), " deletions " deletions(-), " insertions-deletions " net"; files=0; insertions=0; deletions=0; name=$1; } name=$1; files+=$2; insertions+=$3; deletions+=$4} END {print name ": " files " files changed, " insertions " insertions(+), " deletions " deletions(-), " insertions-deletions " net";}'

The output of this script for a single author looks like this: hgreene,: 72 files changed, 2070 insertions(+), 1719 deletions(-), 351 net
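The one-liner is hard to read, so here is a rough sketch of the same idea split into a small function. It is not the Stack Overflow answer itself. The function name `summarize_shortstat` is made up for illustration, and the sketch assumes the input is `git log --shortstat` output in which each commit is introduced by a bare e-mail line. It keys on the author e-mail (`%aE`) rather than the committer e-mail.

```shell
# Hypothetical helper: aggregate `git log --shortstat` output per author.
# Reads the log from stdin so it can be tried on canned input as well.
summarize_shortstat() {
  awk '
    # A non-indented, non-empty line is the e-mail for the commit that follows.
    /^[^ ]/ { author = $0; sub(/@.*/, "", author); next }
    # An indented line containing "changed" is the --shortstat summary.
    /^ / && /changed/ {
      files[author] += $1
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^insertion/) ins[author] += $(i - 1)
        if ($i ~ /^deletion/)  del[author] += $(i - 1)
      }
    }
    END {
      for (a in files)
        printf "%s: %d files changed, %d insertions(+), %d deletions(-), %d net\n",
               a, files[a], ins[a], del[a], ins[a] - del[a]
    }
  ' | sort
}

# Usage inside a repo (left commented out so the function can be sourced anywhere):
#   git log --shortstat --pretty='%aE' | summarize_shortstat
# To look at a single contributor, git log also accepts --author=<pattern>.
```

Because the counts live in awk arrays keyed by name, the input does not need to be sorted first, and a commit that has no stat line simply contributes nothing.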

The meaning of these stats is pretty straightforward. You may notice some entries with multiple usernames next to them. I’m not really sure what these mean, but I assume they come from commits that involve more than one author, perhaps merges or rebases. That requires more research, but the stats with a single author listed seem relatively accurate and were suitable for my purposes.
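One possible explanation for the multi-name entries, based only on reading the first awk stage of the one-liner and not confirmed by the original answer: that stage keeps appending names until it sees a stat line. By default `git log --shortstat` prints no stat line for merge commits, so a merge's name would be carried over and joined to the next commit's stats. The sketch below copies that awk stage into a function (the name `group_names` is made up) so it can be tried on canned input.

```shell
# The name-grouping awk stage from the one-liner, wrapped in a hypothetical
# function so its behaviour can be checked on hand-written input.
group_names() {
  awk 'BEGIN { line = "" }
       # Name line: prepend it to the pending list unless it is already there.
       !/^ / { if (line == "" || !match(line, $0)) { line = $0 "," line } }
       # Stat line: emit the pending names with the stats, then reset.
       /^ /  { print line " # " $0; line = "" }'
}

# Simulate a merge by "bob" (no stat line) followed by a normal commit by "alice".
printf '%s\n' 'bob' 'alice' ' 1 file changed, 2 insertions(+)' | group_names
```

On that input the single output row starts with `alice,bob,`, which matches the shape of the multi-name entries. If this reading is right, adding `--no-merges` to the `git log` call should make most of them go away, and it would also explain the trailing comma after single names.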

On another note, I think the files changed field is a bit misleading. It appears to count each file changed in a commit regardless of whether that file was already counted in a previous commit. So if you had two commits that changed the same six files, you’d probably get a total of 12 files changed rather than 6, the number of unique files modified.
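If the number of unique files per contributor is what you want, one way to get it is to list file names per commit and de-duplicate them per author. This is a minimal sketch under a few assumptions: the function name `count_unique_files` and the `@@ ` marker are made up for illustration, and a renamed file counts as a new path.

```shell
# Hypothetical helper: count distinct file paths touched per author.
# Expects lines of the form "@@ <e-mail>" followed by that commit's file paths.
count_unique_files() {
  awk '
    # Marker line: remember the current author (e-mail with the domain stripped).
    /^@@ / { author = substr($0, 4); sub(/@.*/, "", author); next }
    # Any other non-blank line is a path; count it once per author.
    NF && !seen[author SUBSEP $0]++ { unique[author]++ }
    END { for (a in unique) printf "%s: %d unique files\n", a, unique[a] }
  ' | sort
}

# Usage inside a repo (left commented out so the function can be sourced anywhere):
#   git log --no-merges --name-only --pretty='@@ %aE' | count_unique_files
```

With this approach, two commits that touch the same six files count as 6 for that author, not 12.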
