bug: git pull may delete untracked files in submodule without notice
To
git@vger.kernel.org
From
Christian Spanier
Date
2019-05-03 08:02:35 UTC
Hi,

I found a bug where Git may delete untracked files without notice in 
certain situations. This bug affects Git 2.21.0 on both Linux and Windows.
In summary this happens when git pull merges a commit that replaces a 
submodule folder with a symlink. Any files within the folder are deleted 
without notice.
Check out the script below for details.

This happened on a developer's machine and deleted a repository 
containing about 200GiB of files and tons of uncommitted local scripts, 
log files and whatever, just because some other dev accidentally 
committed a temporary change.

Greetings,
Christian Spanier

##### PREPARATION #####

# New empty repository #1
mkdir rep1
cd rep1
git init --bare .
cd ..

# New empty repository #2
mkdir rep2
cd rep2
git init --bare .
cd ..

# Clone repository #1 and create initial commit
git clone rep1 clone_rep1_user1
cd clone_rep1_user1
touch README
git add README
git commit -m "initial commit"
git push
cd ..

# Clone repository #2 and create initial commit
git clone rep2 clone_rep2
cd clone_rep2
touch README
git add README
git commit -m "initial commit"
git push
cd ..

# Add repository #2 as a submodule to repository #1
cd clone_rep1_user1
git submodule add ../rep2
git commit -m "add submodule"
git push
cd ..

# User 2 also clones repository #1 and #2 recursively
git clone --recursive rep1 clone_rep1_user2

# User 2 starts working in his folder and adds an important local file which is
# not yet committed inside the submodule folder.
cd clone_rep1_user2/rep2
echo "important work" > uncommitted_file
cd ../../

# Meanwhile, user 1 temporarily switches out folder clone_rep1_user1/rep2 with a
# symbolic link to a different folder (for whatever reason, maybe a copy of an
# older version or anything).
mkdir rep2_alternative
cd clone_rep1_user1
mv rep2 ../rep2_backup
ln -s ../rep2_alternative rep2
# On Windows this can be done with 'mklink /D rep2 ../rep2_alternative',
# which requires admin privileges. The bug is not reproducible when
# using a directory junction with 'mklink /D /J ...'.

# He does some work on rep1 but then accidentally adds the symbolic link to his
# next commit and pushes the changes. Notice the typechange of rep2.
echo "some" > work
git status
# On branch master
# Your branch is up to date with 'origin/master'.
#
# Changes not staged for commit:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#         typechange: rep2
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#         work
#
# no changes added to commit (use "git add" and/or "git commit -a")
git add .
git commit -m "do some work"
git push
cd ..

# NOW THE BUG:

# User 2 pulls the changes and loses his important work in
# rep2/uncommitted_file because Git replaces the folder with a symlink
# without checking for modified or uncommitted files!
# He should get an error in this case!
cd clone_rep1_user2
git pull
cat rep2/uncommitted_file
# cat: rep2/uncommitted_file: Not a directory
# "important work" in rep2/uncommitted_file is gone :(
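
##### POSSIBLE PRECAUTION (SKETCH) #####

# Until Git refuses such a merge by itself, user 2 could check for the
# situation above before merging. The following is only a rough sketch, not
# a fix. The helper names list_local_changes and find_submodule_typechanges
# are made up for this example and are not Git commands. The sketch assumes
# that the branch has an upstream configured and that paths contain no
# characters which 'git diff --raw' would quote.

```shell
# Hypothetical helper: print untracked or modified files in the work tree
# at directory $1, for example the submodule folder rep2.
list_local_changes() {
    # --porcelain gives stable, script-friendly output; untracked files
    # are listed with a leading "??".
    git -C "$1" status --porcelain --untracked-files=all
}

# Hypothetical helper: read 'git diff --raw' output on stdin and print every
# path whose old mode is 160000 (a submodule entry) and whose new mode is
# something else, such as 120000 (a symlink). Plain deletions (new mode
# 000000) are skipped.
find_submodule_typechanges() {
    awk -F'\t' '{
        split($1, meta, " ")
        if (meta[1] == ":160000" && meta[2] != "160000" && meta[2] != "000000")
            print $2
    }'
}

# Possible usage by user 2 instead of a plain 'git pull':
#   cd clone_rep1_user2
#   git fetch
#   git diff --raw HEAD '@{upstream}' | find_submodule_typechanges
#   list_local_changes rep2
```

# If the first command prints rep2 and the second one prints any files, the
# merge is about to replace a submodule folder that still holds local work,
# so that work should be saved elsewhere first.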