GIT Advanced Hints

Adjusting the Last Commit

You may have just made a commit that needs a small fix. Use git commit --amend for that. If you replicate your GIT repository (i.e. with clone, fetch, pull, or push), you need to amend the commit before you synchronize with the other repositories.

Example: fix the commit message
git commit -m 'messege with a typo'
git commit -m 'message without a typo' --amend

The second commit command will replace the first entirely.

Example: add more files
git add file1 file2
git commit
git add file3 file4
git commit --amend

The files file3 and file4 will be folded into the same commit and appear alongside file1 and file2.

Example: fix author name
git commit --amend --author="Actual Name <name@example.com>"
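
The examples above can be tried safely in a throwaway repository. The following is a minimal sketch, not a definitive recipe: the temporary directory, the identity "Demo User" and the file contents are assumptions made for illustration. It shows that amending rewrites the last commit (new commit ID, same commit count) rather than adding a second one.

```shell
set -e
# Scratch repository so that no real history is touched (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo one > file1
git add file1
git commit -q -m 'messege with a typo'
before=$(git rev-parse HEAD)

# Fold a forgotten file into the last commit and fix its message in one step.
echo two > file2
git add file2
git commit -q --amend -m 'message without a typo'
after=$(git rev-parse HEAD)

count=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)
files=$(git ls-tree -r --name-only HEAD | xargs)
echo "commits: $count, subject: $subject, files: $files"
if [ "$before" != "$after" ]; then
    echo "the commit ID changed, so the old commit was replaced"
fi
```

Because the commit ID changes, an amended commit that was already pushed will no longer match the copy in the other repositories.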

Adjusting Older Commits

If you want to do the same as above with older commits you need to rebase the GIT repository first. Specifically you need interactive rebasing with git rebase -i.

Note: if your repository is replicated elsewhere there is a risk that those repositories will not be able to properly synchronize after a rebase. The rule-of-thumb with rebase is to edit only those commits that have not yet been replicated to other repositories. Depending on the situation you may decide to just clone the other repositories again.
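
One way to add a forgotten file to an older commit is a "fixup" commit combined with git rebase -i --autosquash. The sketch below assumes a scratch repository with three hypothetical commits. It sets GIT_SEQUENCE_EDITOR=true only so that the prepared todo list is accepted without opening an editor; in normal use you would review the list yourself.

```shell
set -e
# Scratch repository with three commits (all names are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
for n in 1 2 3; do
    echo "content $n" > "file$n"
    git add "file$n"
    git commit -q -m "commit $n"
done
target=$(git rev-parse HEAD~1)      # the middle commit, which forgot a file

# Record the correction as a "fixup!" commit aimed at the older commit.
echo forgotten > extra
git add extra
git commit -q --fixup="$target"

# Rebase everything after the parent of the target; --autosquash moves the
# fixup commit directly behind its target and marks it as "fixup".
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash "$target"~1

git log --format=%s
git diff-tree --no-commit-id --name-only -r HEAD~1
```

After the rebase there are again three commits, and the middle one contains both file2 and extra. All commits from the target onwards get new IDs, which is why the replication warning above applies.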

Removing a File From a GIT Repository

Occasionally you may have inadvertently committed a file to the repository that does not belong there. A file containing passwords or some other secret information would be a case in point. Assuming that the path to the file has always been the same you can filter out all occurrences and then get rid of the unreachable objects in the GIT repository with the following commands. Replace path/to/secret-file with the actual file name.

git filter-branch --index-filter 'git rm --cached --ignore-unmatch path/to/secret-file'
rm -rf .git/refs/original/
git reflog expire --all --expire=now
git fsck --full --unreachable
git repack -A -d
git prune

Note1: if other branches have commits that reference the same file you will need to repeat the procedure for the other branches.
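
Instead of repeating the procedure branch by branch, git filter-branch accepts rev-list options after a "--" separator, so "-- --all" rewrites every branch and tag in one pass. The following sketch runs in a scratch repository; the file names, the branch name "other" and the identity are assumptions for illustration. FILTER_BRANCH_SQUELCH_WARNING is set only to skip the warning pause that recent Git versions print before filter-branch starts.

```shell
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause of newer Git versions
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo ok > keep.txt
echo hunter2 > secret-file
git add keep.txt secret-file
git commit -q -m 'initial commit with a secret'
git branch other                         # a second branch that references the file
echo more >> keep.txt
git commit -q -am 'later change'

# "-- --all" hands --all to rev-list, so all refs are filtered together.
git filter-branch --index-filter \
    'git rm -q --cached --ignore-unmatch secret-file' -- --all >/dev/null

# Drop the backup refs and the unreachable objects.
rm -rf .git/refs/original/
git reflog expire --all --expire=now
git gc -q --prune=now

remaining=$(git log --all --oneline -- secret-file | wc -l | tr -d ' ')
echo "commits still touching secret-file: $remaining"
```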

Note2: if remote repositories are involved (i.e. if the file propagated to other repositories or was cloned from a remote repository) you would need to remove the file from all repositories and you risk losing the consistency between the repositories. The best course of action then is to pick one best master repository, delete any references to remote repositories (git remote rm ...), and then remove the file with the above procedure. After that you would delete the other repositories and clone them freshly from the master repository. Of course, this is only feasible if you know about all remote repositories.

See also: Remove file from git repository (history) on stackoverflow.

Detach subdirectory into separate git repository

Suppose you have a git repository MyRepo with two subfolders subA and subB and you would like to create a new repository SubRepo with only the folder subA, but keeping its full history, branches and tags. This can be achieved as follows:

 # Clone the repo using file copies instead of hardlinks
git clone --no-hardlinks MyRepo SubRepo
cd SubRepo
 # Filter to keep only the desired folder
git filter-branch --subdirectory-filter subA HEAD -- --all
 # Remove all unwanted files
git reset --hard
rm -rf .git/refs/original/
git reflog expire --expire=now --all
git gc --aggressive --prune=now

See also detach subdirectory into separate Git repository on stackoverflow.
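
To see the effect of the subdirectory filter, you can try it on a scratch copy first. This sketch builds a hypothetical MyRepo with three commits and uses the "-- --all" form shown in the filter-branch manual page; the file names and identity are assumptions. The commit that only touched subB disappears from the new history, and the contents of subA move to the top level.

```shell
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause of newer Git versions
work=$(mktemp -d)
cd "$work"

# Build a small MyRepo with two subfolders (hypothetical contents).
git init -q MyRepo
cd MyRepo
git config user.name "Demo User"
git config user.email "demo@example.com"
mkdir subA subB
echo a > subA/a.txt
echo b > subB/b.txt
git add subA subB
git commit -q -m 'add both folders'
echo a2 >> subA/a.txt
git commit -q -am 'change only subA'
echo b2 >> subB/b.txt
git commit -q -am 'change only subB'
cd ..

# Clone with file copies, then keep only the history of subA.
git clone -q --no-hardlinks MyRepo SubRepo
cd SubRepo
git filter-branch --subdirectory-filter subA -- --all >/dev/null 2>&1

git ls-tree -r --name-only HEAD          # files that remain, relative to the new root
git log --format=%s                      # commits that touched subA
```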

Cherry-picking a commit from a different repository

Suppose you have two different git repositories RepoA and RepoB with at least one common file. It may happen that someone committed a change to this file in RepoA which you would also like to have in RepoB, without merging any other differences. This is where cherry-picking comes to the rescue. First add RepoA as a remote to RepoB, then cherry-pick the desired commit:

git remote add RemoteRepoA path/to/RepoA
git fetch RemoteRepoA
git cherry-pick <SHA-Key>

This adds a new commit to RepoB with the desired changes.
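
The whole sequence can be rehearsed with two scratch repositories. In the sketch below, the file common.txt, its contents and the identity are assumptions; the SHA key is looked up in RepoA with git rev-parse instead of being typed by hand.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Two unrelated repositories that share one file (hypothetical contents).
for r in RepoA RepoB; do
    git init -q "$r"
    git -C "$r" config user.name "Demo User"
    git -C "$r" config user.email "demo@example.com"
    echo "shared line" > "$r/common.txt"
    git -C "$r" add common.txt
    git -C "$r" commit -q -m "initial commit in $r"
done

# Someone commits a change to the common file in RepoA.
echo "bug fix" >> RepoA/common.txt
git -C RepoA commit -q -am 'fix common.txt'
sha=$(git -C RepoA rev-parse HEAD)

# Fetch RepoA's objects into RepoB and apply only that one commit.
cd RepoB
git remote add RemoteRepoA ../RepoA
git fetch -q RemoteRepoA
git cherry-pick "$sha" >/dev/null

git log --format=%s
```

The picked commit keeps its message and author but gets a new SHA key in RepoB, because its parent is different there.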

Finding Objects in GIT Repositories by Size

To slim down a large repository you may want to remove some very large files (e.g. a .tar.gz file that should not have been in the repository in the first place). The procedure to remove such a file is the one described above. This is how you find the large files:

git gc
git verify-pack -v .git/objects/pack/*.idx | sort -k 3 -n | tail

This will give you the SHA keys (first column) of the largest objects, sorted by size (third column) with the largest last. To find out the file name, use:

git rev-list --objects --all | grep <SHA-key>

If you are only interested in the largest file you can combine the commands like this:

git rev-list --objects --all | grep `git verify-pack -v .git/objects/pack/*.idx | sort -k 3 -n | tail -1 | awk '{print$1}'`

To actually purge that file from the repository so that it no longer takes up disk space anywhere see e.g. http://help.github.com/remove-sensitive-data/.
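
As an alternative to the verify-pack pipeline, the object list from git rev-list can be fed to git cat-file --batch-check, which prints size and file name in one step and also works for objects that are not packed yet. This is a sketch under assumptions: the scratch repository and the file big.tar.gz (200000 zero bytes) are made up for the demonstration, and the size shown is the uncompressed object size.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo small > small.txt
# 200000 zero bytes stand in for an oversized tarball (hypothetical file name).
head -c 200000 /dev/zero > big.tar.gz
git add small.txt big.tar.gz
git commit -q -m 'add files'

# Print type, SHA key, size and path for every object, keep only blobs,
# and sort by size so that the largest file comes last.
largest=$(git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
    awk '$1 == "blob"' |
    sort -k 3 -n |
    tail -n 1)
echo "$largest"
```

Replace tail -n 1 with plain tail to see the ten largest files instead of only the largest one.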

Creating multiple working trees

Let's say you are currently refactoring some code and your boss comes to you and wants an emergency fix. Usually you would do a git stash but now you can also create a new worktree which is attached to the same repository.

Example:

git worktree add -b <new-branch> <path> <branch-to-checkout>
git worktree list
git worktree add -b emergency-fix ../fix master
cd ../fix

Do the fix, then

git commit -am 'Emergency fix'
cd -
rm -rf ../fix
git worktree prune      # clean up the administrative entry of the deleted worktree

The fix is now on the emergency-fix branch, so you can merge it or do whatever else you want with it.

git merge emergency-fix
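
Newer Git versions also offer git worktree remove, which deletes the directory and its administrative entry in one step instead of rm -rf followed by git worktree prune. The sketch below assumes a Git version that has this subcommand and uses a scratch repository; the file names and identity are hypothetical. The unfinished change in the main working tree is left untouched throughout.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q main-tree
cd main-tree
git config user.name "Demo User"
git config user.email "demo@example.com"
echo v1 > app.txt
git add app.txt
git commit -q -m 'initial commit'

# Half-finished refactoring stays uncommitted in the main working tree.
echo 'work in progress' >> app.txt

# Second working tree on a new branch that starts at the current commit.
git worktree add -b emergency-fix ../fix HEAD >/dev/null 2>&1
echo fixed > ../fix/hotfix.txt
git -C ../fix add hotfix.txt
git -C ../fix commit -q -m 'Emergency fix'

# Remove the directory and its administrative entry in one step.
git worktree remove ../fix

git worktree list
git status --short
```

The emergency-fix branch survives the removal of its working tree, so it can still be merged afterwards.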

Limitations

  • You cannot check out the same branch in two working trees at once (except when using the --force flag)