Rebasing

Matt Krump
Matt Krump

Brian gave a presentation on some of git's more advanced features for Zagaku on Friday. He mainly covered the reflog command and how it could potentially help you recover lost work even in pretty dire situations (after a rebase gone bad for example). It was definitely good to learn about some of git's more advanced features. It’s one of those tools that seems only secondary for doing your job, until something goes wrong and you waste hours fighting it, or even worse end up having to redo work.

The talk referenced rebasing quite a bit. Previously, I've only used git in a very basic way, so I didn't have a great understanding of the difference between rebasing and merging. So I thought I’d do some reading and write quick blog post and hopefully gain a better understanding of git in the process. That turned out to be kind of a bad a idea. For such a seeming simple tool the amount of material out there about anything git related is more or less limitless. After a bunch of reading and multiple detours (git bisect is command if you didn’t know), I reemerged with a little better understanding of git and how rebasing works.

Below is an example repo that I created. You can see that work has been added to master since we've started working on the feature branch. In git the default strategy for combining branches in this situation is a three way merge.

git checkout master
git merge feature_branch

pre

After merging in the feature branch you can see a couple things have happened. First, git determines the common ancestor of the two branch tips (the initial commit in this case), it then creates a new commit (a merge commit cdcda19) and moves the tip of master forward. It's very clear looking at the commit history which work occurred on the feature branch and which work occurred on master.

merge

git checkout feature_branch
git rebase master 
git checkout master
git merge feature_branch

Rebasing takes a different approach. When you rebase, git first determines a common ancestor between the current branch and base branch. It then sets aside (into a temporary staging area) all of the commits that are in the current branch, but not in the base branch. The head of the current branch is then set to match the base branch. Finally, the commits previously set aside are applied in order. Even though the contents of the commits have not changed, you can see that the commit hashes have changed, so these are actually viewed as new commits by git. The graph is now completely linear, resulting in a generally cleaner history. However, compared to the merge commit it is less clear what work was related to the feature branch.

rebase

I definitely found pros and cons to each approach and git strategies in general seemed to very much depend on the team. The best summary that I found of the philosophical differences between the two approaches was on git-scm.com:

One point of view on this is that your repository’s commit history is a record of what actually happened. It’s a historical document, valuable in its own right, and shouldn’t be tampered with. From this angle, changing the commit history is almost blasphemous; you’re lying about what actually transpired. So what if there was a messy series of merge commits? That’s how it happened, and the repository should preserve that for posterity.

The opposing point of view is that the commit history is the story of how your project was made. You wouldn’t publish the first draft of a book, and the manual for how to maintain your software deserves careful editing. This is the camp that uses tools like rebase and filter-branch to tell the story in the way that’s best for future readers.

Also, the one rule that I saw over and over again regarding rebasing was something similar to what git-scm.com says:

Do not rebase commits that exist outside your repository.

If you follow that guideline, you’ll be fine. If you don’t, people will hate you, and you’ll be scorned by friends and family.

Like all things git related, rebasing is both simple and complicated. Also, the answer to the question "what is the right way" seems to be most often be "it depends".

Resources

https://git-scm.com/book/en/v2/Git-Branching-Rebasing

https://git-scm.com/book/en/v2/Git-Branching-Basic-Branching-and-Merging

http://stackoverflow.com/questions/21817248/whats-the-difference-between-git-rebase-and-merge-no-ff

https://www.atlassian.com/git/articles/git-team-workflows-merge-or-rebase

https://www.derekgourlay.com/blog/git-when-to-merge-vs-when-to-rebase/

https://benmarshall.me/git-rebase/