Blog Feed

How to compete with generative AI as a software engineer?

Before the decade of the AI boom, software engineering was mostly about writing code that realizes requirements. Software engineers, to some extent, acted like translators between human languages and computer languages. Today this translation can be done by many generative AI products in seconds and, from my observation, generated code often follows patterns better than code written by most developers. It is understandable that companies have begun laying off employees who cannot match existing AI. It is just cost optimization, a vital part of every business, and perhaps also one of the coldest truths of this life!

What is generative AI good and not good at?

Recall the flow that each software engineer goes through daily:

Receive requirements -> Review current state of source code -> Define a target state of source code -> Retrieve information from documents of related tools, libraries and solutions -> Pick solutions -> Actually write code -> Aligning new code to existing code -> Deploy -> Testing -> Measuring results -> Read error messages -> Debugging.

Some steps of this flow are done better by generative AI, and some are done better by humans:

| Step | Description | Winner and why |
| --- | --- | --- |
| Receive requirements | Capture goals, constraints, acceptance criteria, performance, security needs, and stakeholders’ expectations. | Human. Reason: humans are better at eliciting ambiguous needs, negotiating trade-offs, and asking the right follow-ups with stakeholders. AI can help by summarizing long requirement documents and suggesting missing or inconsistent points. |
| Review current state of source code | Read the codebase, architecture, tests, docs, build scripts, dependencies, and CI config. | Human + AI. Reason: AI can quickly index and summarize files, find patterns and risky hotspots, and generate dependency graphs. But humans provide domain knowledge, historical context, and recognize subtle intent (business logic, quirks, trade-offs). |
| Define a target state of source code | Design the desired architecture, interfaces, data flows, APIs, and acceptance criteria for the new state. | Human + AI. Reason: AI can propose multiple concrete design options and highlight trade-offs. Humans must pick the option that fits non-technical constraints (policy, team skill, product strategy). |
| Retrieve information from documents of related tools, libraries and solutions | Find API docs, migration guides, best practices, configuration notes. | AI. Reason: AI can extract key steps, call signatures, breaking changes, and produce concise examples from long docs much faster than manual reading. Humans validate and interpret edge cases. |
| Pick solutions | Select libraries, patterns, and implementation approaches considering performance, security, license, team skills. | Human. Reason: human decision-makers must weigh organizational constraints, long-term maintenance, licensing, and political factors. |
| Actually write code | Implement features, refactor, add tests, update docs. | AI. Reason: AI excels at generating boilerplate, test stubs, consistent code patterns, and translations across languages. |
| Aligning new code to existing code | Ensure style, APIs, error-handling, logging, and patterns match the codebase; maintain backward compatibility. | Human + AI. Reason: AI can automatically reformat, rename for consistency, and propose refactors to match patterns; humans confirm that changes don’t break conventions tied to tests or runtime behaviors. |
| Deploy | Push to staging/production, run migration scripts, coordinate releases, rollback plans. | Human. Reason: humans must coordinate cross-team tasks, business windows, and incident response. AI/automation is excellent at packaging, CI/CD scripts, and repeatable deployment steps. |
| Testing | Run the application locally and manually verify that new changes behave as expected. | Human. Reason: manual testing relies heavily on intuition, product knowledge, and human perception (e.g., UX feel, layout issues, unexpected delays, weird state transitions). |
| Measuring results | Monitor metrics, logs, user feedback, testing results, and define success signals. | Human + AI. Reason: AI can detect anomalies, summarize metrics, and surface correlations. Humans decide what metrics matter, interpret business impact, and choose next actions. |
| Read error messages | Analyze stack traces, logs, exceptions, and failure contexts. | Human + AI. Reason: AI quickly maps errors to likely root causes and suggests reproducible steps. Humans provide context (recent changes, infra issues) and confirm fixes. |
| Debugging | Reproduce issues, step through code, identify root cause, fix and validate. | Human. Reason: AI speeds discovery (identifying suspicious diffs, suggesting breakpoints, generating reproducer scripts), but complex debugging often needs human insight into domain rules, race conditions, and stateful behaviors. |

How to compete with generative AI to secure a career as a software engineer?

Similar to the Industrial Revolution and the Digital Revolution, where labor was replaced by machines, some jobs disappear but new jobs are created. To some extent, AI is just another machine, a huge one, so essentially we are still in the era of the machine revolution.

The answer to this question is that we need to work where this huge machine cannot. At the time of this post, here is what we can do to compete with AI in software development:

Transition to Solution Architect

As AI becomes strong at writing code, humans can shift upward into architectural thinking. A Solution Architect focuses on shaping systems, not just lines of code. This involves interpreting ambiguous requirements, negotiating constraints across teams, balancing trade-offs between cost, performance, security, and future growth. AI can propose patterns, but only a human understands organizational politics, legacy constraints, domain history, and long-term impact. By moving into architecture, you operate at a layer where human judgment, experience, and foresight remain irreplaceable.

Become Reviewer / Validator

AI can produce solutions quickly, but it still needs someone to verify correctness, safety, and alignment with real-world constraints. A human reviewer checks assumptions, identifies risks, ensures compliance with business rules, and validates that AI-generated code or plans actually make sense in context. Humans excel at spotting hidden inconsistencies, ethical issues, and practical pitfalls that AI may overlook. Becoming a Validator means owning the final approval — the role of the responsible adult in the loop.

Become Orchestrator

Future engineers will spend less time typing code and more time coordinating AI agents, tools, workflows, and automation. An Orchestrator knows how to decompose problems, feed the right information to the right AI tool, evaluate outputs, and blend them into a coherent product. This role requires systems thinking, communication, and the ability to see the entire workflow end-to-end. AI is powerful but narrow; an Orchestrator provides the glue, strategy, and oversight that turns multiple AI capabilities into a real solution.

Study Broader knowledge

AI is good at depth (consuming a specific library or framework instantly), but humans win by having breadth. Understanding multiple domains (networking, security, product design, compliance, UX, devops, data, hardware) allows you to make decisions AI cannot contextualize. Breadth lets you spot cross-domain interactions, anticipate downstream consequences, and design better holistic systems. The wider your knowledge, the more you can see risks, opportunities, and real-world constraints that AI cannot infer from text alone.

Expertise in task description

In an AI-driven era, the most valuable skill is the ability to turn a messy idea into a clear, precise, constraints-rich task. This includes defining scope, edge cases, success criteria, and architectural boundaries. AI is only as good as the instructions it receives — so those who excel at describing tasks will control the quality of AI output. Humans who master problem framing, prompt engineering, and requirement decomposition gain leverage: they make AI more accurate, faster, and more predictable than others can.

Business Analyst

The heart of value creation lies in understanding the business, not writing the code. AI cannot replace someone who knows market dynamics, user behavior, budget constraints, prioritization logic, risk tolerance, stakeholder psychology, and regulatory boundaries. A Business Analyst bridges the gap between technology and real-world value. They decide why a feature matters, who it serves, how it impacts revenue or cost, and what risk it introduces — areas where AI can help, but not replace the human nuance needed.

Pentester

Security is one of the hardest domains for AI to master fully because it requires creativity, unpredictability, street knowledge, and adversarial thinking. A pentester does more than run scanners — they exploit human behavior, spot surprising vulnerabilities, and think like an attacker. Humans who understand security fundamentals, threat modeling, social engineering, and advanced exploitation techniques will stay in demand. AI helps automate scanning and code analysis, but a creative pentester stays ahead by understanding motives, tactics, and real-world constraints.


Essentially, it is about using AI as a super-assistant that can write code very well to realize our intentions.

What is Refactoring and why does it matter?

For someone with no programming experience, the word “refactor” can be confusing at first, because it is mostly used in coding. Put simply, refactoring is making something better or clearer without changing what it does.

Why refactor when nothing changes?

“If it works, don’t touch it” is a principle that is still valid in the real world. It is true, pragmatic, and recommended to some extent when we do not understand the system we are working on well enough. From a business perspective, refactoring feels unproductive because no new features are added to the system. But just as a business sometimes needs to restructure processes, reorganize people, and rearrange tasks to maximize outcomes, the programming process also needs refactoring to improve the coding experience, which makes source code more readable, maintainable, and scalable. These benefits, in turn, speed developers up when they add new features or fix bugs later on.

What is Readable code?

Readable code is code written clearly so that another developer, or future you, can read it quickly and know what is going on, sometimes just by guessing from variable names and function names. Some tactics that help keep code readable are:

  • Clear naming: Variables, functions, and classes have names that explain their purpose.
  • Short, focused functions: Each function does one thing, not many things.
  • Consistent formatting: Proper indentation, spacing, and line breaks.
  • Avoids unnecessary complexity: No overly clever tricks, just straightforward logic.
  • Helpful comments: Explain why, not what.
  • Use of standard patterns: Code follows common conventions so others instantly recognize the structure.
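
As a small illustration of these tactics, here is a hedged before/after sketch in Python. Both functions and the sample data are made up for this example; the point is only that the second version does exactly the same thing as the first while being easier to read.

```python
# Before: cryptic names force the reader to decode the loop.
def f(a, b):
    r = []
    for x in a:
        if x[1] >= b:
            r.append(x[0])
    return r


# After: same behavior, but the names explain the purpose.
def names_at_least_age(people, minimum_age):
    """Return the names of people whose age is at least minimum_age."""
    return [name for name, age in people if age >= minimum_age]


# Hypothetical sample data: (name, age) pairs.
people = [("An", 17), ("Binh", 30), ("Chi", 18)]

# Refactoring must not change behavior: both versions agree.
assert f(people, 18) == names_at_least_age(people, 18)
```

The final assertion is the important part: a refactor is only safe when something checks that the old and new versions still behave the same.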

What is Maintainable code?

  • Readable: as explained above.
  • Well-organized: Code is structured logically into modules, functions, or classes
  • Consistent: Follows the same style, naming, and patterns everywhere.
  • Well-tested: Covered by tests to catch bugs early and safely.
  • Documented: Has comments or docs explaining why and how things work.
  • Flexible: Easy to modify, extend, or adapt without breaking existing code.

What is Scalable code?

  • Efficient: Uses memory and CPU wisely and avoids unnecessary heavy operations.
  • Modular: Pieces of code can be separated or duplicated easily.
  • Asynchronous / non-blocking when needed: Doesn’t freeze the whole system while waiting for one slow task.
  • Uses good architecture: Clear layers that can be split into microservices or separate components if needed.
  • Uses proper data structures: For example, using a Map instead of a List for fast lookups.
  • Database scalability: Indexes, caching, batching queries, sharding, etc.
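
To make the data-structure point concrete, here is a minimal Python sketch (Python's dict plays the role of a Map). The user records and function names are hypothetical. Scanning a list checks items one by one, while a dictionary keyed by ID finds the item directly.

```python
# Hypothetical records used only for illustration.
users = [
    {"id": 1, "name": "An"},
    {"id": 2, "name": "Binh"},
    {"id": 3, "name": "Chi"},
]


def find_in_list(items, user_id):
    """Linear scan: time grows with the number of items."""
    for item in items:
        if item["id"] == user_id:
            return item
    return None


# Build the index once; afterwards each lookup takes roughly constant time.
users_by_id = {user["id"]: user for user in users}


def find_in_map(index, user_id):
    """Dictionary lookup: does not scan the other entries."""
    return index.get(user_id)


assert find_in_list(users, 3) == find_in_map(users_by_id, 3)
```

The trade-off is the extra memory for the index and the need to keep it in sync when the list changes.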

Refactor safety

Because the goal is to keep the system working the same while rewriting code, there must be a metric that indicates sameness, or that detects differences in system behavior early. This is where Test Driven Development shines.

Writing tests is mistakenly overlooked by inexperienced developers. Beginners usually think the programming job is just to write code, see it run, then move on to writing more code. Writing tests looks like extra work or an annoying requirement. This is okay, just as a young man does not yet understand “karma”. And the karma for this oversight usually is:

  • Bugs keep coming back
  • Bugs evolve as more code is added
  • Debugging takes so much time
  • Source code becomes a mess, and a small change can take months to add

When bugs bring enough pain, developers begin to gain experience.

Test Driven Development (aka TDD)

Test-Driven Development (TDD) is a software development process where you write tests before writing the actual code.

TDD follows a repeating 3-step loop:

  1. Write a failing test (yes, it must always fail first!)
    • The test describes what the code should do.
    • It fails because the feature doesn’t exist yet, or the bug is not fixed yet.
  2. Write the minimum code needed to make the test pass
    • Not perfect code, just enough to pass the test.
  3. Refactor – Clean up the code
    • Improve readability, maintainability, scalability
    • Keep the tests passing.

Then repeat the cycle for the next feature or bug fix.
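
The loop above can be sketched in Python as follows. This is only an illustration: slugify is a hypothetical function invented for this example, and the comments mark which TDD step each part belongs to.

```python
import re


# Step 2 (make it pass): the minimum code that satisfies the tests below.
# Step 3 (refactor): this body can later be cleaned up freely,
# as long as the tests keep passing.
def slugify(title: str) -> str:
    """Turn a post title into a URL-friendly slug."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


# Step 1 (fail first): in real TDD these tests are written before
# slugify() exists, so the first run fails with a NameError.
def test_lowercases_and_joins_words():
    assert slugify("Merge vs Rebase") == "merge-vs-rebase"


def test_drops_punctuation():
    assert slugify("What is TDD ?") == "what-is-tdd"


# Run the tests directly; a test runner such as pytest would also find them.
test_lowercases_and_joins_words()
test_drops_punctuation()
```

When a bug is reported later, the same loop applies: first add a test that reproduces the bug and fails, then fix the code until it passes.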

Ideally, tests simulate the UX that users will experience on the real product. This cannot be 100% achieved, but keeping this principle in mind helps a lot in writing good tests. Depending on how close a test is to real-world UX, tests can be classified into 3 levels: Unit Test, Integration Test, and E2E Test.

Unit Test

A Unit Test is ideal for testing the behavior of a function or a class. In each unit test, we check the output of a function given a particular input. We can anticipate what the inputs might be, even unrealistic ones (hackers usually supply unrealistic ones), to ensure our functions keep working regardless of the input. Unit tests can also be used as a debugging tool, because we can test a part of the system directly without trying to reproduce the issue via the UI/UX. For functions that are well guarded by unit tests, developers can feel more confident adding changes or refactoring, because bugs are caught early.
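
Here is a minimal sketch of a unit test that feeds both realistic and unrealistic inputs to one function. The function parse_age and its 0 to 150 range are assumptions made up for this example.

```python
def parse_age(raw: str) -> int:
    """Parse a user-supplied age; reject anything that is not 0..150."""
    text = raw.strip()
    # Only plain ASCII digits are accepted, so "-5", "1e3", "" and
    # injection-style strings are all rejected here.
    if not (text.isascii() and text.isdigit()) or len(text) > 3:
        raise ValueError(f"not a valid age: {raw!r}")
    age = int(text)
    if age > 150:
        raise ValueError(f"age out of range: {age}")
    return age


def is_rejected(raw: str) -> bool:
    """Helper for the tests: True when parse_age refuses the input."""
    try:
        parse_age(raw)
    except ValueError:
        return True
    return False


def test_parse_age():
    # Realistic input.
    assert parse_age(" 42 ") == 42
    # Unrealistic inputs that a careless or hostile user might send.
    for bad in ["", "-5", "999", "1e3", "42; DROP TABLE users"]:
        assert is_rejected(bad)


test_parse_age()
```

Each unrealistic case in the list documents an input that has been thought about, which is what makes later changes to the function less risky.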

Integration Test

Integration tests check how multiple parts of a system work together. Functions, classes, and flows can be tested on how they interact inside the system. This ensures that all the “pieces” of the system integrate properly. As with Unit Tests, we can anticipate and simulate flows to catch bugs early.
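
As a rough sketch of the idea, the test below exercises two pieces together: a service class and a repository backed by a real (in-memory) SQLite database. The class names and the registration flow are hypothetical; only the sqlite3 module from the Python standard library is used.

```python
import sqlite3


class UserRepository:
    """Storage layer: talks to the database."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS users (email TEXT PRIMARY KEY)"
        )

    def add(self, email: str) -> None:
        self.conn.execute("INSERT INTO users (email) VALUES (?)", (email,))

    def exists(self, email: str) -> bool:
        row = self.conn.execute(
            "SELECT 1 FROM users WHERE email = ?", (email,)
        ).fetchone()
        return row is not None


class RegistrationService:
    """Business layer: depends on the repository."""

    def __init__(self, repo: UserRepository) -> None:
        self.repo = repo

    def register(self, email: str) -> bool:
        normalized = email.strip().lower()
        if self.repo.exists(normalized):
            return False  # already registered
        self.repo.add(normalized)
        return True


def test_registration_flow() -> None:
    # Both pieces are real; only the database lives in memory.
    repo = UserRepository(sqlite3.connect(":memory:"))
    service = RegistrationService(repo)
    assert service.register("Ann@Example.com") is True
    # Same address with different case and spacing must be a duplicate.
    assert service.register(" ann@example.com ") is False


test_registration_flow()
```

A unit test of register() with a fake repository would not notice a wrong SQL statement; this test does, because the two parts actually run together.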

E2E Test

E2E (End-to-End) tests simulate a real user using the real app, by actually clicking buttons and typing text.

This is the ultimate form of test, and it can catch bugs that unit or integration tests cannot. E2E Tests exercise the app in an environment closest to production. They validate the entire system from UI/UX to data storage. But they are the hardest tests to write, because a real system needs to be deployed before E2E tests can execute, and simulating user behavior in code requires more effort. This is why many teams stop at Integration Tests, and that is totally okay when the majority of bugs can be caught at the Integration Test level. Writing E2E Tests is time-consuming, so we should only write them for bugs that cannot be reproduced at the Integration Test level. These are high-level bugs that should be addressed by high-level tests, and they are mostly about concurrency, timing, and resources:

  • Race Condition: A situation where the correct behavior of a program depends on the relative timing or interleaving of multiple threads or processes. It’s about uncontrolled timing causing wrong results.
  • Deadlock: Two or more threads/processes are waiting on each other indefinitely, preventing progress. As a result, the system freezes because resources are locked in a circular wait.
  • Livelock: Threads or processes keep changing state in response to each other but make no actual progress. As a result, CPU or threads are active, but nothing gets done.
  • Starvation: A thread never gets access to a needed resource or CPU because other threads dominate it. As a result, the resource exists, but some threads never get a chance to execute.
  • Atomicity violation: A set of operations that should be executed as a single, indivisible unit is interrupted, causing incorrect results.
  • Order violation: Correct behavior depends on operations happening in a specific order, but the order is not guaranteed, which eventually leads to incorrect results.
  • Heisenbug: A bug that disappears or changes when you try to observe it (e.g., by debugging, logging, or adding print statements). This sounds like quantum physics but yes, it does exist. These bugs are often caused by concurrency, timing, or memory issues.
  • Data corruption: Shared data is modified concurrently without proper synchronization, resulting in invalid or inconsistent values.
  • Lost update: Two concurrent operations overwrite each other’s results, causing data loss.
  • Dirty read / inconsistent read: A thread reads a partially updated or uncommitted value from another thread or transaction and then produces wrong results.
  • Priority inversion: A low-priority thread holds a resource needed by a high-priority thread, causing the high-priority thread to wait unnecessarily.
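
To show why such bugs are hard to catch with ordinary tests, here is a hedged Python sketch of one item from the list, the lost update. Real races depend on timing, so the bad interleaving is replayed by hand to keep the result deterministic. The Account class is a made-up example.

```python
import threading


class Account:
    def __init__(self, balance: int) -> None:
        self.balance = balance
        self._lock = threading.Lock()

    def deposit_safely(self, amount: int) -> None:
        # The lock makes read-modify-write one indivisible step.
        with self._lock:
            self.balance = self.balance + amount


def lost_update_demo() -> int:
    """Replay the unlucky interleaving of two unsynchronized deposits."""
    account = Account(100)
    seen_by_a = account.balance        # worker A reads 100
    seen_by_b = account.balance        # worker B reads 100 before A writes
    account.balance = seen_by_a + 10   # A writes 110
    account.balance = seen_by_b + 20   # B writes 120, A's deposit is lost
    return account.balance


def locked_demo() -> int:
    """Two real threads depositing through the lock."""
    account = Account(100)
    workers = [
        threading.Thread(target=account.deposit_safely, args=(amount,))
        for amount in (10, 20)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return account.balance


assert lost_update_demo() == 120   # should have been 130
assert locked_demo() == 130
```

In a live system the unlucky interleaving happens only occasionally, which is why these bugs tend to show up under production-like load and not in small, isolated tests.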

In conclusion, to refactor code safely, we need a lot of tests, and good ones!

Phishing attacks in their ultimate form in Asia

Here is a poster in Vietnam that every building has to display to warn citizens about online scammers. Scammers are now tech-powered and government-backed criminals, well funded and well organized!

The poster above lists popular tricks that scammers have used for a decade and that have caused extreme financial damage to citizens. Below is a summary of what happened, with existing solutions at the end of this post.

Impersonate bankers

Scammers pretend to be bank employees, using forged caller IDs or fake emails to convince victims that their accounts have problems or suspicious activity. They pressure people to provide OTPs, passwords, or transfer money to “secure accounts,” exploiting the victim’s fear of losing funds.

Love trap on social networks

Criminals create fake profiles on Facebook, Zalo, or dating apps, using attractive photos and sweet messages to build emotional bonds. After gaining trust, they fabricate emergencies, travel problems, or gifts stuck at customs and ask the victim to send money to “help.”

Impersonate telecommunication officer

Fraudsters pose as telecom staff claiming your SIM will be locked, your number is involved in illegal activity, or you must update customer information. They then guide victims to provide ID details or install malicious apps that allow remote control of the phone.

Fake SIM 4G upgrade

Scammers contact victims saying their SIM card needs to be upgraded to 4G/5G and ask for OTP verification. When the victim shares the OTP, the scammer hijacks the phone number, enabling them to reset banking passwords and steal funds.

Recruit Partner

These scams offer “partnership” opportunities with fake companies or online stores. Victims are promised high profits or commissions, but after investing money, they cannot withdraw earnings, or the scammers disappear entirely.

Impersonate Social Insurance

Scammers claim to be from the social insurance authority, saying the victim has unpaid contributions, benefits problems, or involvement in illegal records. They create panic and manipulate victims into sharing personal data or making payments.

Impersonate charity

Fraudsters pose as charity organizations, exploiting compassion by collecting “donations” for fake causes such as medical emergencies, disaster relief, or orphan support. The collected money goes directly to the scammers’ accounts.

Gambling

Many scams involve illegal online betting sites. Victims are lured with promises of guaranteed wins or insider tips. After depositing money, the site manipulates the results or locks the account, making withdrawal impossible.

Impersonate Financial Organization

Scammers pretend to be from loan companies or investment firms, offering high returns or easy loan approval. They require “processing fees,” “insurance,” or initial deposits—after receiving the money, they vanish.

Forced loan

Victims receive a money transfer from strangers. The strangers then call and say the money was borrowed from black credit firms, and threaten to come and collect it by force if the victims do not pay.

Fake Crypto Trading Platform

Fraudulent crypto apps or websites show manipulated profit charts to convince victims they are earning money. When victims deposit larger amounts, withdrawals are blocked, and the platform disappears.

Recruit house cleaner

Scammers post fake job ads for housekeeping, offering high salaries. Applicants are then asked to pay “training fees,” “uniform fees,” or deposits for tools. Once paid, the job offer is withdrawn and the scammer disappears.

Buy / sell on digital platforms

In online marketplaces, scammers sell products they never deliver, or buy goods and send fake payment receipts. Some also lure victims into sending deposits to “hold” an item, then immediately block them.

Missions via strange apps

Victims are assigned “simple online tasks” such as liking posts or rating products, with small initial payouts. Later, the tasks require larger deposits to continue earning, and once enough money is collected, the scammers cut off contact.

Clone Facebook account

Fraudsters impersonate the victim by cloning their Facebook account, asking friends and family to send emergency money or mobile card codes. Others use the hacked account to run ads or steal linked personal information.

Impersonate government officers

Scammers masquerade as police, prosecutors, or tax officials, claiming the victim is involved in money laundering, tax evasion, or criminal cases. They use intimidation to force victims into transferring money to “verify” or “clear” their records.

Fake jackpot / gift

Victims receive messages claiming they’ve won a prize, iPhone, or overseas gift package. To claim it, they must pay customs fees or taxes. After sending the money, the supposed prize never arrives.

Terrorism via phone calls

Some scammers make threatening calls pretending to be criminals or debt collectors. They use fear—claiming harm, kidnapping, or legal consequences—to force victims to transfer money quickly without thinking.

Impersonate law firms

Scammers pose as lawyers claiming there is a lawsuit, unpaid debt, or urgent legal issue. They pressure victims to pay consulting fees or settlement amounts immediately to avoid prosecution.


Terribly, this keeps going on, at least at the time of this post, despite many efforts from the police of Vietnam, Korea, Singapore, and other countries. Because it is backed by some other governments, it is really hard to eliminate them all.

Well-organized criminal networks

Scam centers in Cambodia are hard to destroy because they are often backed by well-organized criminal networks that operate across multiple countries. These groups have resources, connections, and the ability to relocate quickly when law enforcement pressure increases. Their cross-border structure makes it difficult for any single government to completely shut them down.

Corruption & weak enforcement

Another reason is the presence of corruption and weak enforcement in certain regions. Some scam compounds operate in areas where local authorities have limited oversight or where bribery and influence allow criminals to continue operating with minimal interference. Even when raids happen, the networks frequently rebuild in nearby locations or migrate to neighboring countries.

Many scam centers also hide behind the facade of legal businesses, such as casinos, entertainment centers, or investment companies. These fronts make investigations more complicated because law-enforcement agencies need strong evidence before taking action. Criminals exploit this ambiguity to stay operational for long periods.

Human trafficking victims

Additionally, these scam operations rely on a steady supply of human trafficking victims brought in from various countries. Victims are forced to work under threats, making the operations difficult to expose. Because the workers are often imprisoned and isolated, reliable information rarely reaches the outside world, slowing down international rescue efforts.

High profitability and Low traceability

Finally, global factors contribute to their persistence. The rapid rise of online scams, cryptocurrency, and digital anonymity provides scam centers with high profitability and low traceability. As long as these operations generate massive revenue with relatively low risk, shutting them down completely requires coordinated international action—something that remains complex and slow.


Solutions

So it looks like citizens have to protect themselves before governments get things done.

Below are some protection tactics that can be observed in Vietnam:

Community-based reporting website

chongluadao.vn

Chongluadao.vn is a Vietnamese cybersecurity initiative that maintains a large database of verified scam websites, phishing pages, and fake online services. It allows users to check whether a link is safe and relies heavily on community submissions to keep its blacklist updated. It focuses on suspicious URLs and websites. Users can search past reports to find out whether a page is a scam.

trangtrang.com

TrangTrang.com is another platform that supports community reporting, this time of suspicious phone numbers. It focuses on gathering public complaints about calls. Users can search past reports before picking up a call, helping them avoid risks.

Firewalls on smartphones

Smartphone Firewalls can act as a digital shield that monitors network traffic to detect and block malicious connections. Unlike antivirus software that only reacts after threats appear, firewalls proactively prevent dangerous apps or websites from communicating with scam servers. They help stop phishing pages, data exfiltration, and suspicious background activities. This makes them especially useful in preventing scams delivered through fake apps or hidden links.

SafePhone (Firewall for smartphone)

SafePhone is a specialized mobile firewall designed to filter both internet traffic and incoming call threats. It can block incoming calls from known scam numbers. It can also prevent users from accessing scam websites when they tap URLs in messengers. By putting blacklists right on the user’s smartphone, it helps users defend against risks more seamlessly, without frequently looking things up on other websites.

Browser Extensions

Browser extensions can add an additional security layer directly inside the user’s web browser. They can warn about dangerous websites before loading, block trackers, stop pop-ups, and identify phishing attempts. Extensions with anti-scam features check every website against a global blacklist and use heuristics to detect fake login pages or fraudulent shopping sites. This type of protection is crucial because most scams start with a single click on a malicious link.
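
The blacklist check itself can be sketched in a few lines of Python. This is not how any of the products mentioned here are implemented; the domains below are invented placeholders under the reserved .example suffix, and a real tool would load a much larger, regularly updated list.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist entries, for illustration only.
BLACKLISTED_DOMAINS = {"scam-bank.example", "free-gift.example"}


def is_blacklisted(url: str, blacklist: set = BLACKLISTED_DOMAINS) -> bool:
    """Return True if the URL's host, or any parent domain, is blacklisted."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    parts = host.split(".")
    # Check "login.scam-bank.example", then "scam-bank.example", and so on,
    # so that subdomains of a reported site are caught too.
    return any(".".join(parts[i:]) in blacklist for i in range(len(parts)))


assert is_blacklisted("https://login.scam-bank.example/otp")
assert not is_blacklisted("https://scam-bank.example.org/")
```

The second assertion shows why matching must be done on whole domain labels: a scam name appearing as a prefix of a different domain is not a match.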

chongluadao.vn

Chongluadao.vn offers a browser extension that automatically warns users whenever they visit a suspicious or reported scam site.

SafePhone

SafePhone includes a feature called SafeBrowser. SafeBrowser is a secure browsing mode inside the SafePhone ecosystem. It routes traffic through SafePhone’s protection filters, blocking malicious domains and preventing users from accidentally accessing scam websites. This controlled environment is especially useful for elderly users, children, or anyone who prefers a safe but still simple browsing experience.


How to avoid Merge Conflicts in software development

Besides Ambiguous Requirements, Tight Deadlines, and Unstable Legacy Codebases, Merge Conflict is another small fear that annoys and disrupts developers a great deal while making software.

Starting from the fact, explained in this post on how Merge Conflicts appear, that long-lived branches should be avoided as much as possible to reduce the chance of conflict, here are some guidelines to help any software team deal with this fear.

User Story based Task Description

The human brain is designed to consume stories, not ambiguity. In software development, a User Story is a short, simple description of a feature or requirement told from the perspective of the end user. It is a fundamental element of Agile and Scrum methods, meant to capture what the user wants and why, without prescribing how developers should implement it. A user story usually follows this format: As a [type of user], when [something happens], I want [some goal] so that [some reason]. For example: As a registered user, I want to reset my password so that I can access my account if I forget it. This helps teams understand who needs something, what they need, and why it matters.

Depending on how big the user’s goal is, a User Story can be split into simpler stories. We don’t have to write an essay in a task description because we are not at school. Clarity is the top priority when writing task descriptions, so don’t overthink, just tell stories.

When a User Story is simple enough, a task stemming from it can have a small scope of change, with a limited and predictable effect on the codebase. A small scope of change helps reduce the chance of conflicts. Even if conflicts happen, resolving them is easier because they occur in fewer places.

Merge/Rebase daily, resolve early

When tasks are well defined and the scope of change is limited, the branch can be short-lived, which suits the Rebase tactic. Depending on the preferred shape of the commit history tree, either Merge or Rebase is okay. The recommended practice is to do it daily: merge the main branch into new branches, or rebase new branches onto the main branch. This practice makes us aware early of places that might cause conflicts, so that we can adjust our coding tactics or sync up with other developers about the changes being made.

FIFO Merging

It is obviously right to prioritize tasks, but do not apply priority to the order of merging code, because it can turn some branches into long-lived ones when higher-priority tasks keep being merged first. Once a task that was in progress reaches the Done column of your Kanban board, for example, it should be merged as soon as possible regardless of its priority. What is completed first should be merged first (First-In-First-Out order). To achieve this, using a tool that does it automatically is highly recommended. Of course, completion of a task here includes the testing phase as well.
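
The ordering rule can be sketched in Python as follows. The FinishedTask record and its fields are hypothetical; a real setup would read this data from the task board or the CI system.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FinishedTask:
    task_id: str
    priority: int       # used for planning only, ignored when merging
    done_at: datetime   # when the task (including testing) reached Done


def merge_order(tasks: list) -> list:
    """Return task IDs in FIFO order: first completed, first merged."""
    return [task.task_id for task in sorted(tasks, key=lambda t: t.done_at)]


done = [
    FinishedTask("#1236", priority=1, done_at=datetime(2025, 1, 3, 9, 0)),
    FinishedTask("#1234", priority=3, done_at=datetime(2025, 1, 2, 17, 0)),
    FinishedTask("#1235", priority=2, done_at=datetime(2025, 1, 3, 8, 0)),
]

# The low-priority task finished first, so it is merged first.
assert merge_order(done) == ["#1234", "#1235", "#1236"]
```

Priority still decides which task is picked up next; it simply plays no part once the work is done.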

When a task is well defined with limited scope in User Story format, and it somehow gets stuck and turns into a long-lived branch, we need to review its necessity:

  • If we don’t need this feature anymore, discard the task and close the branch.
  • If we still need this feature, but it contains risky changes and that is why no one dares to merge it into the main branch, it is time to make simpler stories from the risky parts. Don’t let a task get stuck. Keep feature branches short-lived.

Commit with Task ID

A commit message is encouraged to describe what the changes are, but nothing enforces or ensures that consistency. It depends on how well a developer can explain things. So to keep it simple and scalable, it is recommended to begin a commit message with the Task ID, for example: #1234 fix things, so that whenever anyone wonders why a commit was added, they can trace it back to the related task description and let the User Story explain.

Long-lived branches as Microservices

If for any reason a branch is planned to be long-lived, such as when building a challenging feature that inevitably requires a long development time, consider turning this big part into a Microservice. A Microservices architecture can keep new code in a separate repository, which in turn reduces the risk of conflicts with the existing main branch. The main system can communicate with the new Microservice in whatever way suits to get things done, without worrying about large changes from new features. The new Microservice can have its own task board with its own User Stories, where the User is the main system.
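
A convention like this is easy to check automatically. Below is a minimal Python sketch, assuming task IDs are written as # followed by digits as in the example above; the function name is made up, and the check could be called from a commit hook or a CI step.

```python
import re
from typing import Optional

# Assumed convention: "#<digits>", whitespace, then a non-empty description.
TASK_ID_PREFIX = re.compile(r"^#(\d+)\s+\S")


def task_id_of(message: str) -> Optional[str]:
    """Return the Task ID at the start of a commit message, or None."""
    match = TASK_ID_PREFIX.match(message)
    return match.group(1) if match else None


assert task_id_of("#1234 fix things") == "1234"
assert task_id_of("fix things") is None   # missing Task ID
assert task_id_of("#1234") is None        # Task ID without a description
```

Rejecting commits where task_id_of returns None keeps the convention consistent without relying on every developer remembering it.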

Time is money, so Don’t waste development time for resolving merge conflicts !

Merge vs Rebase: which is better ?

I usually prefer Merge to Rebase, for safety first.

Merge and Rebase are two ways of combining changes from different branches in Git, the version control system behind platforms such as GitHub. Since Merge seems to be enough to get things done in every case, why does Git include a Rebase method?

The answer seems related to the team’s preference for how the commit history looks. Git maintains a graph of commits per repository, and each commit is a snapshot of all files. It is important to notice that Git stores project snapshots, not the diffs that we see with the command git diff. Diffs are calculated on the fly when we compare two commits. This nature of Git affects how Merge and Rebase actually behave under the hood:

How does Merge actually work ?

When using git merge, for example to merge branch A into branch B, given that branch B was created from branch A, Git performs the steps below:

  1. Finds the common ancestor snapshot, i.e. the commit from which branch B was created.
  2. Compares the latest snapshot of branch A (MERGE_HEAD) to the ancestor snapshot to get the diffs D1.
  3. Compares the latest snapshot of branch B (HEAD) to the ancestor snapshot to get the diffs D2.
  4. Applies diffs D1 and D2 to the ancestor snapshot and outputs a new merged snapshot, stored in a new commit on branch B.

Because commits are snapshots:

  • Git doesn’t need to replay all intermediate diffs.
  • It just looks at 3 snapshots: ancestor, HEAD and MERGE_HEAD.

That’s why merging large histories is fast and doesn’t rewrite old commits: the snapshots are stable and immutable. If conflicts happen during a Merge, only three snapshots are taken into account and the output is always one new snapshot, so the conflicts usually need to be resolved only once.
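The three-snapshot idea can be sketched in a few lines. This is a deliberately simplified model, not how Git is implemented: snapshots are plain dicts mapping a file path to its content, the comparison is per whole file (Git also merges inside files, line by line), and `three_way_merge` is a hypothetical name:

```python
def three_way_merge(ancestor, head, merge_head):
    """Merge two snapshots against their common ancestor, file by file.

    Returns (merged_snapshot, conflicting_paths). A missing file is None.
    """
    merged, conflicts = {}, []
    for path in sorted(set(ancestor) | set(head) | set(merge_head)):
        base = ancestor.get(path)
        ours = head.get(path)
        theirs = merge_head.get(path)
        if ours == theirs:        # both sides agree (or neither changed it)
            result = ours
        elif ours == base:        # only the other branch changed it
            result = theirs
        elif theirs == base:      # only our branch changed it
            result = ours
        else:                     # both changed it differently
            conflicts.append(path)
            continue
        if result is not None:    # None means the file was deleted
            merged[path] = result
    return merged, conflicts


ancestor = {"a.txt": "1", "b.txt": "1", "c.txt": "1"}
head = {"a.txt": "2", "b.txt": "1", "c.txt": "2"}
merge_head = {"a.txt": "1", "b.txt": "3", "c.txt": "3"}
print(three_way_merge(ancestor, head, merge_head))
```

All conflicts are reported in one pass over the three snapshots, no matter how many intermediate commits each branch has.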

How does Rebase actually work ?

When using git rebase, for example to rebase branch B onto branch A, given that branch B was created from branch A, Git performs the steps below:

  1. Calculates the diff between each commit (aka snapshot) of branch B and its parent commit. This creates a series of “patches” describing step by step how the changes were made on branch B.
  2. Reapplies those diffs (patches) on top of the latest snapshot of branch A.
  3. Creates new commits with new IDs (aka new snapshots).

So each time we rebase branch B onto branch A, new commits (or snapshots) are added as if we had just made those changes on top of the latest snapshot of branch A. Because the diffs are reapplied every time we rebase, if there are conflicts, it is likely that we have to resolve the same conflicts again and again. And this is why I prefer Merge to Rebase.
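The replay behaviour can be sketched with the same simplified model of snapshots as dicts of file path to content. This is an illustration under stated assumptions, not Git's real algorithm: diffs are per whole file, a conflict is "resolved" by always taking the branch's version, and `diff` and `rebase` are hypothetical names:

```python
def diff(parent, child):
    """Return {path: (old, new)} for every file that differs between two snapshots."""
    paths = sorted(set(parent) | set(child))
    return {p: (parent.get(p), child.get(p))
            for p in paths if parent.get(p) != child.get(p)}


def rebase(fork_point, branch_commits, new_base):
    """Replay each branch commit as a patch on top of new_base.

    Returns (new_snapshots, conflicts), where conflicts lists the
    (commit_index, path) pairs at which the replay had to stop.
    """
    current, parent = dict(new_base), fork_point
    new_commits, conflicts = [], []
    for index, snapshot in enumerate(branch_commits):
        for path, (old, new) in diff(parent, snapshot).items():
            if current.get(path) not in (old, new):
                conflicts.append((index, path))  # the new base changed this file too
            if new is None:
                current.pop(path, None)          # the patch deletes the file
            else:
                current[path] = new              # assumed resolution: keep the branch's version
        new_commits.append(dict(current))        # a brand-new snapshot, hence a new commit ID
        parent = snapshot
    return new_commits, conflicts


fork_point = {"f": "0", "g": "0"}
branch_b = [{"f": "1", "g": "0"}, {"f": "1", "g": "1"}]
latest_a = {"f": "9", "g": "9"}
print(rebase(fork_point, branch_b, latest_a))
```

In this example the replay stops twice, once per replayed commit, whereas a merge of the same branches would report both conflicting files in a single step.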

So, why does Rebase exist ?

Rebase is mostly used when we have a reason to control how the commit history looks on a branch. This can be useful when a team prefers a linear commit history that is easier to read and does not care what actually happened, such as when a branch was created and what was merged. Because it rewrites the commit history of a branch, Rebase is not recommended on the main branch, due to the risk of losing commits and of resolving conflicts multiple times. Rebase is safer only on a feature branch created from the main branch and, most importantly, only when that feature branch has a short development time. On a feature branch that lives long enough, re-resolving conflicts might happen frequently, which can slow down development and even frustrate developers.

Conclusion

In summary, my suggestion on Merge vs Rebase is:

  1. Use Merge by default, for safety first.
  2. If we are working on a feature branch (NOT the main or master one), want a nicer commit history on this branch, and the development time for this branch is short, then Rebase can be used.

Story behind ads you see on your Facebook

TO PROTECT YOUR PRIVACY !!

Have you ever wondered why powerful tools like Google Search, Gmail, Facebook, X, TikTok, etc. are all free to use?

When you see they sell nothing, then you are what is being sold!

Advertising is the main source of profit that keeps most online tools free nowadays. Advertising is not bad when it brings information to us proactively, so we don’t have to spend time investigating market options. But because this method of advertising is indirect, we can’t know who is actually behind an ad. In fact, it has become an ideal channel for scammers to lure users via social networks. Many students and elderly people have become victims because they have the least knowledge and experience online. And when many companies’ systems are infiltrated and hacked and their data is stolen, users’ personal data is leaked globally. These problems, combined, harm our privacy!

This post reveals some methods used in the online advertising industry, as found on Facebook. The same methods can be applied on every other social network as well.

How can anyone make you read something while using Facebook ?

We all use some banking or non-banking applications, and on the news we hear that the companies behind those applications have been hacked and their data leaked on the dark web. Dark web sites operate outside the law and are ideal places for criminal activities, of which selling hacked data is the most popular. Today anyone can buy leaked data with Bitcoin or Ethereum to hide their identity. When someone has our names, phone numbers, emails or even addresses, they can search for us and start stalking us on social networks such as Facebook, Instagram, X, etc. After that, they pay to run a targeted ad campaign using the information they know about us, and let the Facebook algorithm handle presenting those ads on our mobile phones.

How advertisers can control who sees their ads:

1. Custom Audiences

Facebook allows anyone to use phone numbers, emails, or names to directly target a specific user:

  • Advertisers upload a list of customer data (CSV) to Facebook Ads Manager.
  • Facebook matches the data (phone/email/name) with existing user accounts.
  • Once matched, only those users in the list will see the ads.

This is the most direct method of delivering ads to a known individual.

2. Lookalike Audiences

Based on a seed audience (e.g., 100 users with names and emails), Facebook finds other users with similar behavior.

This is indirect targeting, used to expand reach to similar users even when advertisers don’t know their names or emails.

3. Geotargeting / Geofencing

Facebook allows anyone to use the user’s location (address or GPS) to limit where an ad appears. This is usually used by physical stores. You may have noticed that after you pass by some stores, you are more likely to see their ads in your news feed.

4. Interest, Demographic, and Behavioral Targeting

When no personal data is available, Facebook allows anyone to filter audiences by:

  • Age, gender, region, job title
  • Online behavior (e.g., searching for a laptop, following specific pages)
  • Past engagement with posts, videos, or websites

This is an indirect way, but it still gets ads in front of us.

Lesson

By utilizing the methods above, anyone, whether a real advertiser or a scammer, can put messages in front of us while we are scrolling on Facebook, YouTube, Instagram, etc. Social network applications, on one hand, create free tools to soothe people’s desire for connection and, on the other hand, sell privacy to anyone willing to pay.

Although most ads are not harmful, make sure to share just enough on social networks, to avoid the worst situations, because scammers can use ads too!

Store datetime as ISO 8601 instead of timestamp

Like many programmers, I used to use Date for storing date time values in the database, until I suffered from bugs related to timezones and DST (Daylight Saving Time). The most common symptom is that date fields, such as created_date or updated_date, often display the day before the one that was set. Let’s say an admin sets created_date to Jan 1st 2025 00:00:00, yet clients somehow see Dec 31 2024, 23:00:00.

I tried using the browser’s timezone information to adjust timestamps, but then I noticed that the code base became more complex and the bug still couldn’t be fixed when DST happens. The bug appears when:

  • most databases use an integer representing microseconds (a timestamp) to store datetime data
  • users are located in many countries with different timezones
  • Daylight Saving Time happens with no fixed schedule

Then I realized the power of a standard: ISO 8601 can be used to store date time values in plain text while keeping them sortable. Storing datetimes in the format YYYY-MM-DD HH:mm:ss in UTC (or any format specified in ISO 8601) removes the headaches of using timestamps while retaining sorting and filtering capabilities. The only downside is that the frontend and backend have to make sure date time data is sent and received in ISO 8601 format instead of integers, but this conversion is simple enough and can help avoid days of debugging.
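A minimal sketch of the conversion, assuming all values are normalized to UTC and written in the ISO 8601 form YYYY-MM-DDTHH:mm:ssZ (the "T" separator and "Z" suffix are my choice here; the helper names `to_utc_iso` and `from_utc_iso` are hypothetical):

```python
from datetime import datetime, timedelta, timezone

ISO_UTC_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def to_utc_iso(dt: datetime) -> str:
    """Convert a timezone-aware datetime to an ISO 8601 string in UTC."""
    return dt.astimezone(timezone.utc).strftime(ISO_UTC_FORMAT)


def from_utc_iso(text: str) -> datetime:
    """Parse a string produced by to_utc_iso back into an aware UTC datetime."""
    return datetime.strptime(text, ISO_UTC_FORMAT).replace(tzinfo=timezone.utc)


# An admin in a UTC+1 timezone sets midnight of Jan 1st 2025 local time.
admin_value = datetime(2025, 1, 1, 0, 0, 0, tzinfo=timezone(timedelta(hours=1)))
stored = to_utc_iso(admin_value)
print(stored)

# Plain string sorting matches chronological order when every value uses this format.
rows = ["2025-03-30T01:30:00Z", "2024-12-31T23:00:00Z", "2025-01-15T08:00:00Z"]
print(sorted(rows))
```

The stored text is unambiguous: any client can parse it and convert it to its own local time for display, and the database can still sort and filter on the column as plain text.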

Security incident 2023 …

News gets old and lessons are usually forgotten. Below are some incidents from the cyber battlefield, to give a feel for 2023, a year full of drama.

Company | Domain | Breached Data | Money | Attack vector
iRent | car rental | millions of partial credit card numbers; at least 100,000 customer identification documents | N/A | the database has no password
Yes Madam | Salon platform | customers’ location data; user device details; IMEI numbers of ~900,000 users | N/A | the database has no password
PeopleGrove | social platform for higher education institutions and alumni networks | gigabytes of personal information: email addresses, phone numbers, addresses, details of university achievements and scores, and resumes containing detailed work histories and employment details | N/A | the database has no password
Proskauer Rose | international law | private and privileged financial and legal documents, contracts, non-disclosure agreements, financial deals and files relating to high-profile acquisitions | N/A | misconfiguration
AvidXchange | automate invoice processing and payment management processes | employee payroll information; corporate bank account numbers | N/A | easily guessable passwords
Toyota | Manufacture | data of 2 million customers | N/A | misconfiguration
Ferrari | Supercar Manufacturer | 7GB of documents, data sheets and repair manuals | N/A | Ransomware
LogicMonitor | network security | data of a small number of customers | N/A | use of default password
Microsoft | AI | accidentally exposed tens of terabytes of sensitive data, including private keys and passwords | N/A | publishing a storage bucket of open source training data on GitHub
Tesla |  | 75,000 company employees’ personal information | N/A | two former employees leaked
Microsoft | Email | a key that allowed to stealthily break into dozens of email inboxes, including those belonging to several federal government agencies | N/A | Unknown
Crema Finance | Crypto |  | $9 million in cryptocurrency | ethical hacker turning rogue
Taiwan Semiconductor Manufacturing | ChipMaker |  | $70 million ransom demand | Leaked setup information
Reddit | Social Network | 80+ gigabytes of compressed data | N/A | “highly-targeted” phishing attack
T-Mobile | Telecom | personal data belonging to 37+ million customers | N/A | social engineering + SIM swap
Twitter | Social Network | 400+ million email addresses and phone numbers | N/A | a security bug
MailChimp | Email | 400+ accounts mostly of cryptocurrency and finance-related accounts | N/A | social engineering
Okta | Identity | 134 organizations data | N/A | stolen credentials
Truepill | Pharmacy Fulfillment | 2.3+ million patients personal data | N/A | poor security design
Perry Johnson & Associates | Healthcare | ~9 million patients data | N/A | Unknown
Intellihartx | Patient payment | half a million people’s personal and health information | N/A | MOVEit (1)
PharMerica | Pharmacy | 5.8 million personal information | N/A | MOVEit (1)
HCA Healthcare | healthcare | 11 million patients’ data | N/A | Unknown
Enzo Biochem | biotechnology | 2.5 million patients’ clinical test information | N/A | Ransomware (2)
Managed Care of North America | Dental | 8.9 million clients data | N/A | Unknown
NextGen Healthcare | electronic health record software | 1.05 million patients personal data | N/A | Ransomware (2)
Illumina | DNA sequencing devices | Can alter test result in devices | N/A | CVE-2023-1968
Maternal & Family Health Services | Healthcare | 461,070 personal data of patients, employees and vendors | N/A | Unknown
23AndMe | Genetic testing | 6.9 million user data records | N/A | Unknown
Welltok | Patient Engagement | 8 million personal data | N/A | MOVEit (1)
McLaren Health Care | Healthcare | 2.2 million patients sensitive personal and health information | N/A | Ransomware (2)
Performance Health Technology | Data Management Services | 1.7 million Oregon citizens health information | N/A | MOVEit (1)
Colorado Department of Health Care Policy and Financing | Healthcare | 4 million patients data | N/A | MOVEit (1)
Sabre | Travel booking | 1.3 terabytes of data on ticket sales, passenger turnover, employees’ personal data, corporate financial information | N/A | Ransomware (2)
See Tickets | Global Ticketing | customers’ credit card information | N/A | Credit Card Skimming Malware
MGM Resorts | Hotel & Casino | unspecified amount of customers’ personal information; ATM shut down; Website offline | ~ $100 million | Ransomware
Caesars Entertainment | Hotel & Casino | N/A | $30 million demanded | Ransomware
Motel One | Hotel | 50 credit cards data | N/A | Ransomware
Radisson | Hotel | N/A | N/A | Ransomware
Fidelity National Financial | real estate services | virtually froze all the company and its subsidiaries’ activities | N/A | Unknown
Mr. Cooper | mortgage and loan | unknown amount of ~4 million users | N/A | Unknown
1st Source Bank | Bank | N/A | N/A | MOVEit
Hatch Bank | fintech infrastructure | 140,000 customers SSN | N/A | CVE-2023-0669
Flutterwave | Startup | N/A | lost ~$4.2 million in the accounts | Unknown
Euler Finance | Finance | N/A | ~ $197 million in crypto theft; 1.3M USD gone | “in a flurry of transactions” (3)
AT&T email addresses | Mail | customers account compromised | $15 – $20 million crypto stolen | Unknown
Mixin | Crypto | N/A | ~ $200 million stolen | Unknown
Mom’s Meals | Food | 1.2+ million individuals data | N/A | Ransomware
NationBenefits | supplementary benefits | 7,100+ residents personal data | N/A | Ransomware
Yum Brands | fast-food chains | ~ 300 UK restaurants data | N/A | Unknown
Forever 21 | Clothing | 500,000+ individuals data | N/A | Ransomware
Byju | edtech | N/A | N/A | misconfiguration
Electoral Commission | overseeing elections | ~ 40 million U.K. voters data | N/A | “complex cyberattack.”
Ofcom |  | 412 employees data | N/A | MOVEit
JumpCloud | Access management | a “small and specific” set of customers | N/A | Unknown
Shell | Oil | terabyte of logging data | N/A | MOVEit
Dish | satellite television | 300,000 personal information | N/A | Unknown
SchoolDude | order management system | 3M SchoolDude user accounts | N/A | Unknown
Atlassian | SaaS | N/A | N/A | CVE-2023-22515
Shadow | Game | 530,000 customers data | N/A | advanced social engineering
CCleaner | Tools | 2% users | N/A | MOVEit
Boeing | Aerospace | N/A | N/A | Ransomware
National Aerospace Laboratories | Aerospace | eight purportedly stolen documents (confidential letters, an employee’s passport internal documents) | N/A | Ransomware
Zhefengle | e-commerce | millions of Chinese citizen identity numbers from 3.3 million orders | N/A | Unknown
A network of knockoff apparel stores | Store | 330,000 credit card numbers, cardholder names, and full billing addresses | N/A | Unknown
ODIN Intelligence | Applications for polices | Leaked files reveal tactical plans for police raids, surveillance and facial recognition | N/A | Unknown
LastPass | Password manager | customers’ encrypted password vaults | N/A | Unknown
British Library | Library | website offline; ~490,000 user data | N/A | Ransomware
2023 security incidents sample

A thought about language

There are certainly a lot of programming languages on the market. Each language creates its own job positions and even a programming “religion” in which engineers prefer a certain language over the others. Thanks to that, the software development market can thrive, because the diversity of languages increases the scarcity of the workforce.

So why do some languages offer higher salaries than others ?

Well, a language is a tool, and each tool is more appropriate for a certain purpose. Salary is compensation taken from the budget assigned to a purpose. So there is actually no standard market price assigned to each language. It is still the supply-demand game. The difference is that the demand is influenced by the big players in the industry. They create programming languages and frameworks that overcome the limitations of the ones that existed before. The limitation can be speed, memory efficiency, friendliness of syntax, or built-in solutions for repeatable tasks. As a result, a new language can save money on renting physical resources (computers) and save working hours (construction and error fixing), and those savings also contribute to the compensation, or the salary.

Can an engineer learn multiple languages ?

Absolutely yes.

Learning a language is always hard and time-consuming. There is a lot to remember, mostly syntax. But syntax can be translated from one language to another by the same principle as human languages. For each language, what we need to remember is:

  • How to declare variables
  • How to write conditional checks (if-else)
  • How to write loops (for, while)
  • How to separate code into reusable blocks (class, method, function, interface)
  • How to handle concurrency (process, thread, event loop)
  • What libraries each language offers out of the box.

The tactic is: for the first language, learn it carefully and practice a lot. For each new language, compare it to the one we already know, find the translation between the two syntax sets, then practice a lot.

Another tip is to search for open-source projects and read their code. This is also one of the quickest ways to learn from top engineers.

Is it worth knowing more than one language ?

Yes, I think.

If we only need a stable job, one language is enough. But remember that technology changes every day, and every project has a start date and an end date. So, ensure your adaptability.

If we are interested in technology, learning new languages can give us a deeper understanding of computers and show us more than one way to solve the same issue. It keeps us open-minded and curious and provides a broad range of knowledge, which is true stability, as far as I have learned from many professionals.

And the career is programming as a whole, not a single language.

Is it tiring to know more than one language ?

Yes, obviously. No need to explain.

First time to NLP huh ?

Natural Language Processing (NLP) is a major research field of AI, and to most developers it sounds like a miracle. I have recently taken an interest in this field, ever since the news about the GPT-3 model went viral. I decided to learn to use it as a tool before it somehow replaces developer jobs in the future, as many illustrious figures predict. But the more I study it, the less I feel I know. There is too much background knowledge to acquire before understanding each word of the GPT-3 paper. Below is a quick summary of the work behind the scenes, which is hopefully useful to developers like me who want to make a leap and catch up with the progress of AI.

List of keywords

It is an inevitably long and exhausting journey to get a fairly basic understanding of the terms below:

  • Convolutional Neural Network, Recurrent Neural Network, Activation Function, Loss Function, Back Propagation, Feed Forward.
  • Word Embedding, Contextual Word Embedding, Positional Encoding.
  • Long Short-Term Memory (LSTM).
  • Attention Mechanism.
  • Encoder – Decoder Architecture.
  • Language Model.
  • Transformer Architecture.
  • Pre-trained Model, Masked Language Modeling, Next Sentence Prediction.
  • Zero-shot learning, One-shot learning, Few-shot learning.
  • Knowledge Graph.
  • BERT, GPT, BART, T5

What existed before BERT and GPT ?

A lot of research and work already existed in the NLP field. Working in the NLP field means solving the common Tasks below:

  • Tagging Part of Speech.
  • Recognising Named Entities.
  • Sentiment Classification.
  • Question & Answering.
  • Text Generation.
  • Machine Translation.
  • Summarization.
  • Similarity Matching.

SpaCy and NLTK are two of the most famous libraries in the NLP field. They provide tools, frameworks and models that solve a few of the Tasks above, but not everything. Each Task usually had its own model, and there was no reuse or transfer between models, until the Transformer Architecture was published. Given its performance and capabilities, researchers began to think about using this architecture to perform the NLP Tasks above, so that one single model could do it all. The results are the BERT and GPT models, both of which use the Transformer. In fact, BERT is used in the Google search engine, and the GPT-3 family of models powered the ChatGPT application at its launch. More applications making use of these models can be found around the Internet.

Some Core Challenges when doing NLP

No matter what method is applied, the challenges that form the NLP field are still the same:

  • Computers do not understand words; they understand numbers. Find a method to convert each word in a sentence into a vector (a group of numbers) such that two words with similar meanings get two vectors that are close to each other, representing that similarity.
  • Given a sentence with many words and variable length, find a vector that can represent the sentence.
  • Given a passage with many sentences and variable length, find a vector that can represent the whole passage.
  • From the vector of a word, sentence or passage, find a method to convert it back to words, sentences or a passage. This task in turn becomes Machine Translation or Text Summarization.
  • From the vector of a word, sentence or passage, find a method to classify it into some senses or intents. This task in turn becomes Sentiment Classification.
  • From the vector of a word, sentence or passage, find a method to calculate its similarity to another vector. This task in turn becomes Question & Answering, Text Generation or Text Suggestion.
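To make the first, second and last challenges concrete, here is a toy sketch. The three-dimensional vectors are made up by hand purely for illustration (real embeddings are learned by a model and have hundreds of dimensions), cosine similarity is one common choice of closeness measure, and averaging word vectors is only the simplest possible way to get a sentence vector:

```python
import math

# Hand-made toy vectors, NOT real embeddings.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "kitten": [0.85, 0.75, 0.2],
    "car": [0.1, 0.2, 0.95],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def sentence_vector(words):
    """Represent a sentence of any length as the average of its word vectors."""
    vectors = [embeddings[w] for w in words]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


print(cosine_similarity(embeddings["cat"], embeddings["kitten"]))
print(cosine_similarity(embeddings["cat"], embeddings["car"]))
print(sentence_vector(["cat", "kitten"]))
```

With these toy numbers, "cat" is much closer to "kitten" than to "car", and a sentence of any length is reduced to a vector of the same fixed size as a single word.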

It would take too long to dive into each keyword here, so please hit the Subscribe button to receive upcoming posts from my learning journey.

Thanks for reading!