Merge vs Rebase: which is better?

I usually prefer Merge over Rebase because it is the safer option.

Merge and Rebase are two ways of combining changes from different branches in Git, the version control system behind platforms such as GitHub. Since Merge seems to be enough to get things done in every case, why does Git include Rebase at all?

The answer seems to be related to a team’s preference about commit history. Git maintains a graph of commits per repository, and each commit is a snapshot of all files. It is important to note that Git stores project snapshots, not the diffs that we see with the command git diff. Diffs are calculated on the fly when we compare two commits. This design affects how Merge and Rebase actually behave under the hood.
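
To make the "snapshots, not diffs" idea concrete, here is a toy sketch in Python. It is not how Git is implemented: a commit is modeled as a plain dictionary from file path to content, and the file names and the diff helper are made up for illustration.

```python
# Toy model: each commit stores a full snapshot (path -> content), not a diff.
def diff(old, new):
    """Compute the changes between two snapshots on demand."""
    changes = {}
    for path in sorted(set(old) | set(new)):
        if old.get(path) != new.get(path):
            # (content before, content after); None means "file absent"
            changes[path] = (old.get(path), new.get(path))
    return changes


commit_1 = {"README": "hello", "app.py": "print(1)"}
commit_2 = {"README": "hello", "app.py": "print(2)", "util.py": "x = 1"}

# Nothing below was stored anywhere; it is derived from the two snapshots.
print(diff(commit_1, commit_2))
# {'app.py': ('print(1)', 'print(2)'), 'util.py': (None, 'x = 1')}
```

Any two snapshots can be compared this way, which is what lets Merge jump straight to the commits it cares about.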

How does Merge actually work?

When using git merge, for example to merge branch A into branch B, given that branch B was created from branch A, Git performs the steps below:

  1. Finds the common ancestor snapshot (the merge base), i.e. the commit from which branch B was created.
  2. Compares the latest snapshot of branch A (MERGE_HEAD) to the ancestor snapshot to get the diff D1.
  3. Compares the latest snapshot of branch B (HEAD) to the ancestor snapshot to get the diff D2.
  4. Applies diffs D1 and D2 to the ancestor snapshot and outputs a new merged snapshot, stored in a new commit on branch B.
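
The steps above can be sketched as a toy three-way merge in Python. This is a simplification under stated assumptions: snapshots are dictionaries from path to content, a whole file is the unit of change (real Git also merges inside files, line by line), and the function and variable names are hypothetical.

```python
def three_way_merge(base, ours, theirs):
    """Merge two snapshots using their common ancestor.

    base   = ancestor snapshot
    ours   = latest snapshot of branch B (HEAD)
    theirs = latest snapshot of branch A (MERGE_HEAD)
    Returns (merged_snapshot, list_of_conflicting_paths).
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o            # both sides agree
        elif o == b:
            result = t            # only branch A changed this file
        elif t == b:
            result = o            # only branch B changed this file
        else:
            conflicts.append(path)
            result = o            # placeholder until a human resolves it
        if result is not None:    # None means the file is deleted
            merged[path] = result
    return merged, conflicts


base = {"a.txt": "1", "b.txt": "1"}
ours = {"a.txt": "2", "b.txt": "1"}                    # branch B edited a.txt
theirs = {"a.txt": "1", "b.txt": "3", "c.txt": "new"}  # branch A edited b.txt, added c.txt

merged, conflicts = three_way_merge(base, ours, theirs)
print(merged)     # {'a.txt': '2', 'b.txt': '3', 'c.txt': 'new'}
print(conflicts)  # []
```

Note that the function never looks at the commits between the ancestor and the two branch tips, no matter how many there are.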

Because commits are snapshots:

  • Git doesn’t need to replay all intermediate diffs.
  • It just looks at 3 snapshots: ancestor, HEAD and MERGE_HEAD.

That’s why merging long histories is fast and doesn’t rewrite old commits: the snapshots are stable and immutable. And because a merge always takes three snapshots into account and always outputs one new snapshot, any conflicts usually have to be resolved only once.

How does Rebase actually work?

When using git rebase, for example to rebase branch B onto branch A, given that branch B was created from branch A, Git performs the steps below:

  1. Calculates the diff between each commit (snapshot) of branch B and its parent commit. This produces a series of patches describing, step by step, the changes that were made on branch B.
  2. Reapplies those patches, one by one, on top of the latest snapshot of branch A.
  3. Creates new commits with new IDs (new snapshots).
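
A toy replay in Python shows the three steps and why the commit IDs change. It is only a sketch under assumptions: snapshots are dictionaries, a commit ID is a hash of the parent ID plus the snapshot (real Git hashes more than that), conflicts are ignored, and all helper names are invented for this example.

```python
import hashlib
import json


def commit_id(parent_id, snapshot):
    """Toy commit ID: depends on the parent ID and the snapshot content."""
    payload = json.dumps([parent_id, snapshot], sort_keys=True)
    return hashlib.sha1(payload.encode()).hexdigest()


def diff(old, new):
    """Patch from old to new: path -> new content (None = file deleted)."""
    return {p: new.get(p) for p in set(old) | set(new) if old.get(p) != new.get(p)}


def apply_patch(snapshot, patch):
    result = dict(snapshot)
    for path, content in patch.items():
        if content is None:
            result.pop(path, None)
        else:
            result[path] = content
    return result


def build_chain(parent_id, snapshots):
    """Turn a list of snapshots into a chain of (id, snapshot) commits."""
    chain = []
    for snap in snapshots:
        parent_id = commit_id(parent_id, snap)
        chain.append((parent_id, snap))
    return chain


def rebase(fork_snapshot, branch_commits, new_base_id, new_base_snapshot):
    rebased, parent_snap = [], fork_snapshot
    tip_id, tip_snap = new_base_id, new_base_snapshot
    for _, snap in branch_commits:
        patch = diff(parent_snap, snap)          # step 1: commit vs its parent
        tip_snap = apply_patch(tip_snap, patch)  # step 2: replay on the new base
        tip_id = commit_id(tip_id, tip_snap)     # step 3: new commit, new ID
        rebased.append((tip_id, tip_snap))
        parent_snap = snap
    return rebased


fork = {"a.txt": "1"}
fork_id = commit_id(None, fork)
old_b = build_chain(fork_id, [{"a.txt": "1", "b.txt": "x"},
                              {"a.txt": "1", "b.txt": "y"}])
a_tip = {"a.txt": "2"}                 # branch A moved on after the fork
a_tip_id = commit_id(fork_id, a_tip)

rebased = rebase(fork, old_b, a_tip_id, a_tip)
print(rebased[-1][1])                  # {'a.txt': '2', 'b.txt': 'y'}
print([cid[:8] for cid, _ in old_b], [cid[:8] for cid, _ in rebased])
```

The changes on branch B are the same, but every replayed commit has a different parent and a different snapshot, so none of the old IDs survive.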

So each time we rebase branch B onto branch A, new commits (snapshots) are created as if we had just made those changes on top of the latest snapshot of branch A. Because the diffs are reapplied every time we rebase, if there are conflicts we will likely have to resolve the same conflicts again and again. This is why I prefer Merge to Rebase.

So, why does Rebase exist?

Rebase is mostly used when we have a reason to control how the commit history looks on a branch. This is useful when a team prefers a linear commit history that is easier to read and does not care about what actually happened, such as when a branch was created and what was merged. Because it rewrites the commit history of a branch, Rebase is not recommended on the main branch: there is a risk of losing commits and of having to resolve conflicts multiple times. Rebase is safer on a feature branch that was created from the main branch and, most importantly, that is short-lived. On a long-lived feature branch, re-resolving conflicts can happen frequently, which slows down development and can even frustrate developers.

Conclusion

In summary, my suggestion on Merge vs Rebase is:

  1. Use Merge by default, for safety.
  2. If we are working on a feature branch (NOT main or master), want a nicer commit history on that branch, and the branch is short-lived, then Rebase is an option.
