git – TheTechLead

Beside Ambiguous Requirements, Tight Deadline and Unstable Legacy Codebase, Merge Conflict is another light fear that annoys and disrupts developers the most while making software.

From a fact of how Merge Conflict might appear in this post that long-live branches should be avoid as much as possible to mitigate chance of conflict, here are some guidelines to help any software teams deal with this fear.

User Story based Task Description

Human brain is designed to consume story, not ambiguity. In software development, User Story is short, simple description of a feature or requirement told from the perspective of the end user. It’s a fundamental element of Agile and Scrum methods — meant to capture what the user wants and why, without prescribing how developers should implement it. A user story usually follows this format: As a [type of user], when [something happen], user want [some goal] so that [some reason] . For example: As a registered user, I want to reset my password so that I can access my account if I forget it. This helps teams understand who needs something, what they need, and why it matters.

Depends on how big the goal user want is, a User Story can be splitted into simpler stories. We don’t have to write an essay in task description because we are not at school. Clarity is the top priority when writing tasks descriptions, so don’t think, just tell stories.

When a User Story is simple enough, a task stemmed from it can have a small scope of change with limited effect of codebase, and in predictable way. Small scope of change in a task helps to mitigate chance of conflicts. Even conflicts happen, resolving them can be easier because it happens in fewer places.

Merge/Rebase daily, resolve early

When tasks are defined well and scope of change is limited, the branch now can be short-live which is okay for Rebase tactic. Depend on preference of history commit tree, Merge or Rebase both is okay. The recommended practice here is to do it daily: merge main branch into new branches, or rebase new branches onto main branch. This practice helps to early aware of possible places might cause conflicts so that we can adjust coding tactic or sync up with other developers about what changes are made.

FIFO Merging

It is obviously right when prioritize tasks but do not apply priority to the order of merging code because it can turn some branches into long-live one when higher priority tasks keep being merge first. When a task is put on progress, your Kanban board for example, and when it is in Done column for example, it should be merged asap regardless priority of its task. What is completed first should be merged first, (First-In-First-Out order) . To achieve this state, utilize any tool to automatically do so is highly recommended. Of course, completion of a task here includes testing phase as well.

When a task is well defined with limited scope in User Story format, and somehow it gets stuck and turn into long-live branch, we need to review its necessity:

If we don’t need this feature anymore, so discard the task and close the branch.
If we still need this feature, but it contains risky changes, and it is why no one dare to merge it to main branch, so it is time to make simplier stories from the risky parts. Don’t let task being stuck. Keep feature branches short-live.

Commit with Task ID

Commit message is encouraged to describe what changes are but there is nothing to force or ensure that consistency. This depends on how good a developer can explain things. So to make it simple and scalable, it is recommended to begin a commit message with Task ID, for example: #1234 fix things , so whenever anyone wonders why a commit is added, they can trace back to related task description and let the User Story explains.

Long-live branches as Microservices

For any reason that a branch is planned to be long-live, such as when making a challenging feature that inevitable requires long time of development, consider to turn this big part into Microservices. Microservices architecture can keep new code in separated repository, which in turn, can mitigate risk of conflicts with existing main branch. Main system can communicate to this new Microservices in any kind to get things done without worrying about large changes from new features. The new Microservices can have its own tasks board with its own User Stories, and User here, is the Main system.

Time is money, so Don’t waste development time for resolving merge conflicts !

I usually prefer using Merge to Rebase for safety first.

Merge and Rebase is 2 ways of combining changes from different branches when using Github as chosen source code management platform. Since Merge seems to be enough to get things done in every cases, why does Github includes Rebase method ?

The answer seems related to team’s preference on the commit history. Github maintains a tree of commits per repository and each commit is a snapshot of all files. It is important to notice that Github stores project snapshots, not the diffs that we see with command git diff . Diffs are calculated on the fly when we compare 2 commits. This nature of Github affects to how actually Merge and Rebase behaves under the hood:

How does Merge actually work ?

When using git merge, for example, to merge branch A into branch B and given branch B is created from branch A, Github performs below steps:

Finds the common ancestor snapshot, aka the commit where branch B is created from.
Compares the latest snapshots of branch A to ancestor snapshot, get the diffs D1 (aka MERGE_HEAD)
Compares the latest snapshot of branch B to ancestor snapshot, get the diffs D2 (aka HEAD)
Applies diffs D1 & D2 on the ancestor snapshot then output a new merged snapshot, stored in a new commit of branch B

Because commits are snapshots:

Git doesn’t need to replay all intermediate diffs.
It just looks at 3 snapshots: ancestor , HEAD and MERGE_HEAD.

That’s why merging large histories is fast and doesn’t rewrite old commits — the snapshots are stable and immutable. When using Merge, if Conflicts happen, because there are always 3 snapshots is taken into account and the output is always 1 new snapshot, resolving Conflicts when using Merge likely happens only once.

How does Rebase actually work ?

When using git rebase, for example, to rebase branch B onto branch A, given that branch B is created from branch A, Github performs below steps:

Calculate the diff between each commit (aka snapshot) of branch B to its parent commit. This is likely to create a “patch” telling step-by-step how changes are already made on branch B,
Reapplies those diffs (patches) on top of latest snapshot of branch A
Creates new commits with new IDs (aka new snapshots).

So when each time when we rebase a branch B onto branch A, new commits (or snapshots) are added as if we have just made those changes on the snapshots of branch A. Because diffs are reapplied every time when we rebase, if there are Conflicts, it is likely we have to resolve same conflicts again and again. And this is why I prefer Merge to Rebase.

So, why does Rebase exists ?

Rebase is mostly used when we have a reason to control how the commit history looks like on a branch. This can be useful when a team prefer a linear commit history that is easier to read and do not care what actually happen such as when a branch is created and what is merged. Because it rewrites commit history on a branch, Rebase is not recommended to use on main branch due to the risk of losing commits and resolving conflicts multiple time. Rebase is safer only on a feature branch, which is created from main branch, and most important, this feature branch should have short-time development. On a feature branch that long-live enough, re-resolving conflicts might happen frequently and this can slow down development speed and even frustrate developers.

Conclusion

In summary, my suggestion on Merge vs Rebase is :

Always using Merge for safety first
If we are working on a feature branch (NOT the main or master one), and want to have a nicer commit history on this branch, and development time for this branch is short, then can use Rebase

Tag: git

How to avoid Merge Conflicts in software development