behaviour of git-blame -M -C (maybe a bug?)
To: git@vger.kernel.org
From: dmg
Date: 2018-12-06 22:41:54 UTC


Hi everybody,

I am the maintainer of cregit. We are trying to improve blame
traceability at the token level (see
https://github.com/dmgerman/papers/blob/master/editorials/cregit/cregit.org).

We use git-blame heavily in cregit. One of the features that I
would like to add to cregit is the ability to track movement of code.

I have been testing git-blame -M -C and I found some behaviour
that seems incorrect. I have created a very simple repository
that I think showcases this problem:

https://github.com/dmgerman/testBlameMove

This repo has 4 commits (listed below in order of execution):

1. A file, tpm-dev.c, is created (authored by D German).
2. A refactoring (code is moved from tpm-dev.c to
tpm-dev-common.c, a new file). Author is "refactor".
3. A commit that adds a few contiguous lines (the existence of
this commit seems to matter). Author is "none".
4. A commit that changes a few lines and alters the result of blame
for lines not modified by this commit. Author is "problem".

See below. I am running blame at different commits, showing only 
the lines attributed to author "refactor" (author of commit #2).
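
For anyone who wants to repeat the comparison without checking out each commit, here is a minimal sketch. It assumes the --line-porcelain output of git-blame, where each entry starts with a "<hash> <orig-line> <final-line>" header followed by an "author <name>" line and a TAB-prefixed content line. The function names lines_by_author and blame_lines_at are hypothetical helpers, not part of git or cregit.

```shell
#!/bin/sh
# Sketch: print the final line numbers that blame attributes to one author.
# Reads `git blame --line-porcelain` output on stdin; $1 is the author name.
lines_by_author() {
    awk -v who="$1" '
        # Content lines start with a TAB; skip them so file text is never
        # mistaken for a header field.
        /^\t/ { next }
        # Entry header: 40-hex commit hash, original line, final line.
        length($1) == 40 && $1 ~ /^[0-9a-f]+$/ && NF >= 3 { final = $3; next }
        # "author <name>": compare everything after the 7-character prefix.
        $1 == "author" && substr($0, 8) == who { print final }
    '
}

# Hypothetical wrapper: blame a file at a given revision (no checkout needed).
# usage: blame_lines_at <rev> <file> <author>
blame_lines_at() {
    git blame -M -C --line-porcelain "$1" -- "$2" | lines_by_author "$3"
}
```

With the repository above, blame_lines_at 3720e68 tpm-dev-common.c refactor and blame_lines_at ded1aa1 tpm-dev-common.c refactor should list the same lines that the grep-based commands below show, but as bare line numbers that are easier to diff.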

dmg@iodine:/tmp/testRepo|master ⇒  git log --oneline
ded1aa1 (HEAD -> master, origin/master) problematic commit
3720e68 simple commit
391adba refactoring
33165cb file before refactoring

If we check out 391adba and run blame -M -C, we get this:

dmg@iodine:/tmp/testRepo|3720e68 ⇒  git checkout 3720e68 && git blame -M -C tpm-dev-common.c | grep refactor | head
HEAD is now at 3720e68 simple commit
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800  24) begin_include
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800  25) include|#
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800  26) directive|include
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800  27) file|"tpm-dev.h"
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800  28) end_include
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800 147) DECL|function|tpm_common_open (struct file * file,struct tpm_chip * chip,struct file_priv * priv)
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800 148) name|void
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800 149) name|tpm_common_open
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800 150) parameter_list|(
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800 151) name|struct

So far, so good: blame detects the movement. Note that the changes
attributed to refactor are 5 added lines (24 to 28) and then more
added lines at 147 and beyond.

Now do it for the next commit: 3720e68


Things continue to look good. The changes in this commit do not
affect any of these lines.

Now the next commit, the problematic one: ded1aa1 (its author is not
refactor).

dmg@iodine:/tmp/testRepo|3720e68 ⇒  git checkout ded1aa1 && git blame -M -C tpm-dev-common.c | grep refactor | head
Previous HEAD position was 3720e68 simple commit
HEAD is now at ded1aa1 problematic commit
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800  24) begin_include
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800  25) include|#
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800  26) directive|include
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800  27) file|"tpm-dev.h"
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800  28) end_include
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800  29)
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800  30) begin_function
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800  53) name|user_read_timer
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800  54) argument_list|)
391adba4 tpm-dev-common.c (refactor 2018-12-06 12:41:10 -0800 150) DECL|function|tpm_common_open (struct file * file,struct tpm_chip * chip,struct file_priv * priv)

Now blame assigns lines 29, 30, 53 and 54 to commit 391adba4
(refactor), even though it did not do so at the previous commit.
This is what I think is a bug.
(By the way, the changes made in this last commit were between
lines 28 and 150.)
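
Because line numbers shift between revisions, one way to isolate the change in attribution is to compare line content instead. The following is only a sketch under the same assumption about the --line-porcelain format (hash header, then a TAB-prefixed content line per entry); content_by_commit and newly_attributed are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: print the content of every line that blame assigns to a commit
# whose hash starts with the prefix given in $1.
# Reads `git blame --line-porcelain` output on stdin.
content_by_commit() {
    awk -v pfx="$1" '
        # Content line: emit it (minus the leading TAB) if the entry matched.
        /^\t/ { if (hit) print substr($0, 2); next }
        # Entry header: remember whether this commit hash has the prefix.
        length($1) == 40 && $1 ~ /^[0-9a-f]+$/ && NF >= 3 {
            hit = (index($1, pfx) == 1)
        }
    '
}

# Print lines present in the newer listing ($2) but not in the older one ($1).
# Both arguments are files holding content_by_commit output.
newly_attributed() {
    old_sorted=$(mktemp) && new_sorted=$(mktemp) || return 1
    sort "$1" > "$old_sorted"
    sort "$2" > "$new_sorted"
    # comm -13 keeps only the lines unique to the second (newer) file.
    comm -13 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}
```

For example, one could save git blame -M -C --line-porcelain 3720e68 -- tpm-dev-common.c | content_by_commit 391adba to one file, do the same for ded1aa1 into a second file, and pass both to newly_attributed. Repeated tokens are compared as a multiset, so the result shows how many extra copies of each line are now blamed on the refactoring commit, not their positions.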

Thank you in advance for any clues as to why git-blame is behaving
like this.

--dmg

---
D M German
http://turingmachine.org