
For someone with no programming experience, the word “refactor” can be confusing at first, because it is mostly used in coding. Put simply, to refactor is to make something better or clearer without changing what it does.
Why refactor when nothing changes?
“If it works, don’t touch it” is a principle that is still valid in the real world. It is true, pragmatic, and recommended to some extent when we do not understand the system we are working on well enough. From a business perspective, refactoring feels unproductive because no new features are added to the system. But just as a business sometimes needs to restructure processes, reorganize people, and rearrange tasks to maximize outcomes, a codebase also needs refactoring to make the source code more readable, maintainable, and scalable. These benefits, in turn, speed developers up when they add new features or fix bugs later on.
What is Readable code?
Readable code is code written clearly enough that another developer, or future you, can read it quickly and know what is going on, sometimes just by looking at variable and function names. Some tactics that help keep code readable are:
- Clear naming: Variables, functions, and classes have names that explain their purpose.
- Short, focused functions: Each function does one thing, not many things.
- Consistent formatting: Proper indentation, spacing, and line breaks.
- Avoids unnecessary complexity: No overly clever tricks, just straightforward logic.
- Helpful comments: Explain why, not what.
- Use of standard patterns: Code follows common conventions so others instantly recognize the structure.
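As a small illustration of the tactics above, here is a sketch in Python. The function names, the tax rule, and the numbers are all invented for this example. Both functions return the same result; only the readability differs.

```python
# Before: it works, but the names and the magic number hide the intent.
def calc(a, b):
    return a * b * 1.1


# After: same behavior, clearer names, and a comment that explains "why".
TAX_MULTIPLIER = 1.1  # hypothetical 10% tax, made up for this example


def total_price_with_tax(unit_price, quantity):
    # Prices are stored before tax (an assumption), so tax is added at the end.
    return unit_price * quantity * TAX_MULTIPLIER
```

Because the behavior is unchanged, a caller can switch from `calc` to `total_price_with_tax` without noticing any difference in results.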
What is Maintainable code?
Maintainable code is code that stays easy to change over time. Its typical traits are:
- Readable: as explained above.
- Well-organized: Code is structured logically into modules, functions, or classes
- Consistent: Follows the same style, naming, and patterns everywhere.
- Well-tested: Covered by tests to catch bugs early and safely.
- Documented: Has comments or docs explaining why and how things work.
- Flexible: Easy to modify, extend, or adapt without breaking existing code.
What is Scalable code?
Scalable code is code that keeps working well as load and data grow. Its typical traits are:
- Efficient: Uses memory and CPU wisely and avoids unnecessary heavy operations.
- Modular: Pieces of code can be separated or replicated easily.
- Asynchronous / non-blocking when needed: Doesn’t freeze the whole system while waiting for one slow task.
- Uses good architecture: Clear layers that can be split into microservices or separate components if needed.
- Uses proper data structures: For example, using a Map instead of a List for fast lookups.
- Database scalability: Indexes, caching, batching queries, sharding, etc.
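The Map versus List point can be sketched in Python, where `dict` plays the role of a Map. The user records and function names below are hypothetical, invented only for this illustration.

```python
# Hypothetical user records, invented for this example.
users = [
    {"id": 1, "name": "An"},
    {"id": 2, "name": "Binh"},
    {"id": 3, "name": "Chi"},
]


def find_in_list(user_id):
    # Scans the list one item at a time, so it gets slower as the list grows.
    for user in users:
        if user["id"] == user_id:
            return user
    return None


# Build the lookup table once; after that, each lookup is a single
# hash lookup on average, no matter how many users there are.
users_by_id = {user["id"]: user for user in users}


def find_in_dict(user_id):
    return users_by_id.get(user_id)
```

Both functions return the same answers. The difference only shows up as the data grows, which is exactly the concern of scalable code.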
Refactoring safely
Because the goal is to keep the system working the same way while the code is rewritten, there must be a measure that confirms the behavior is unchanged, or that detects differences in system behavior early. This is where Test Driven Development shines.
Writing tests is often overlooked by inexperienced developers. Beginners usually think a programming job is just to write code, see it run, then move on to the next piece of code. Writing tests looks like extra work or an annoying requirement. That is understandable, just as a young person does not yet understand “karma”. The karma for overlooking tests usually is:
- Bugs keep coming back
- Bugs evolve as more code is added
- Debugging takes a lot of time
- The source code becomes a mess, and a small change can take months to add
When bugs bring enough pain, developers become more experienced.
Test Driven Development (aka TDD)
Test-Driven Development (TDD) is a software development process where you write tests before writing the actual code.
TDD follows a repeating 3-step loop:
- Write a failing test (yes, it must always fail first!)
  - The test describes what the code should do.
  - It fails because the feature doesn’t exist yet, or the bug is not fixed yet.
- Write the minimum code needed to make the test pass
  - Not perfect code, just enough to pass the test.
- Refactor: clean up the code
  - Improve readability, maintainability, scalability.
  - Keep the tests passing.
Then repeat the cycle for the next feature or bug fix.
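One pass through the loop could look like the following Python sketch. The `slugify` function and its expected behavior are hypothetical, chosen only to keep the example short.

```python
# Step 1: write the test first. If it were run at this point, it would
# fail, because slugify() does not exist yet.
def test_slugify_replaces_spaces_with_dashes():
    assert slugify("Hello World") == "hello-world"


# Step 2: write the minimum code needed to make the test pass.
def slugify(title):
    return title.lower().replace(" ", "-")


# Step 3: refactor if needed, then run the test again to confirm
# that it still passes.
test_slugify_replaces_spaces_with_dashes()
```

In a real project the test would usually live in its own file and be run by a test runner, but the order of the steps stays the same.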
Ideally, tests simulate the experience that users will have with the real product. This cannot be achieved 100%, but keeping the principle in mind helps a lot in writing good tests. Depending on how close a test is to real-world UX, tests can be classified into 3 levels: Unit Test, Integration Test, and E2E Test.
Unit Test
A unit test is ideal for testing the behavior of a function or a class. In each unit test, we check the output of a function for a particular input. We can anticipate what the inputs might be, even unrealistic ones (attackers often send unrealistic ones), to ensure our functions keep working regardless of the input. Unit tests can also be used as a debugging tool, because we can test one part of the system directly without trying to reproduce the problem through the UI. For functions that are well guarded by unit tests, developers can feel more confident making changes or refactoring, because bugs are caught early.
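A minimal sketch of a unit test in Python, assuming a hypothetical `parse_age` function with an arbitrary 0 to 150 range. It checks one normal input and several unrealistic ones.

```python
def parse_age(text):
    """Return the age as an int, or None when the input is not a valid age."""
    if not isinstance(text, str):
        return None
    try:
        age = int(text)
    except ValueError:
        return None
    # The 0..150 range is an arbitrary limit chosen for this example.
    return age if 0 <= age <= 150 else None


def test_parse_age():
    # Realistic input.
    assert parse_age("42") == 42
    # Unrealistic inputs that a real user or an attacker might still send.
    assert parse_age("-5") is None
    assert parse_age("abc") is None
    assert parse_age("") is None
    assert parse_age("999999") is None
    assert parse_age(None) is None
    assert parse_age("'; DROP TABLE users; --") is None


test_parse_age()
```

With these tests in place, the body of `parse_age` can be rewritten freely: if the new version breaks any of the cases, the test fails immediately.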
Integration Test
Integration tests check how multiple parts of a system work together. Functions, classes, and flows can be tested on how they interact inside the system. This ensures that all the “pieces” of the system integrate properly. As with unit tests, we can anticipate and simulate flows to catch bugs early.
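As a rough sketch in Python, an integration test exercises two pieces together instead of one in isolation. The `InMemoryUserStore` and `SignupService` classes are hypothetical, invented for this example; a real system would likely use a real database in place of the in-memory store.

```python
class InMemoryUserStore:
    """Hypothetical storage piece that keeps emails in memory."""

    def __init__(self):
        self._emails = set()

    def exists(self, email):
        return email in self._emails

    def add(self, email):
        self._emails.add(email)


class SignupService:
    """Hypothetical business-logic piece that relies on the store."""

    def __init__(self, store):
        self._store = store

    def sign_up(self, email):
        normalized = email.strip().lower()
        if self._store.exists(normalized):
            return False  # duplicate account
        self._store.add(normalized)
        return True


def test_signup_flow():
    # Both pieces are wired together, as they would be in the real system.
    store = InMemoryUserStore()
    service = SignupService(store)
    assert service.sign_up("An@Example.com") is True
    # The same email in a different form must be rejected as a duplicate.
    assert service.sign_up(" an@example.com ") is False
    assert store.exists("an@example.com")


test_signup_flow()
```

The test does not care how each class works internally. It only checks that the flow across both of them produces the right outcome.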
E2E Test
E2E (End-to-End) tests simulate a real user using the real app, by actually clicking buttons and typing text.
This is the ultimate form of testing and can catch bugs that unit or integration tests cannot. E2E tests exercise the app in an environment closest to production and validate the entire system, from UI/UX to data storage. But they are the hardest tests to build, because a real system needs to be deployed before E2E tests can run, and simulating user behavior in code requires more effort. This is why many teams stop at integration tests, which is totally OK when the majority of bugs can be caught at that level. Writing E2E tests is time-consuming, so we should only write them for bugs that cannot be reproduced at the integration test level. These are high-level bugs that should be addressed by high-level tests, and they are mostly about concurrency, timing, and resources:
- Race Condition: A situation where the correct behavior of a program depends on the relative timing or interleaving of multiple threads or processes. It’s about uncontrolled timing causing wrong results.
- Deadlock: Two or more threads/processes wait on each other indefinitely, preventing progress. As a result, the system freezes because resources are locked in a circular wait.
- Livelock: Threads or processes keep changing state in response to each other but make no actual progress. As a result, the CPU or threads are active, but nothing gets done.
- Starvation: A thread never gets access to a needed resource or CPU because other threads dominate it. As a result, the resource exists, but some threads never get a chance to execute.
- Atomicity violation: A set of operations that should be executed as a single, indivisible unit is interrupted, causing incorrect results.
- Order violation: Correct behavior depends on operations happening in a specific order, but that order is not guaranteed, which eventually leads to incorrect results.
- Heisenbug: A bug that disappears or changes when you try to observe it (e.g., by debugging, logging, or adding print statements). This sounds like quantum physics but yes, it does exist. These bugs are often caused by concurrency, timing, or memory issues.
- Data corruption: Shared data is modified concurrently without proper synchronization, resulting in invalid or inconsistent values.
- Lost update: Two concurrent operations overwrite each other’s results, causing data loss.
- Dirty read / inconsistent read: A thread reads a partially updated or uncommitted value from another thread or transaction and then produces wrong results.
- Priority inversion: A low-priority thread holds a resource needed by a high-priority thread, causing the high-priority thread to wait unnecessarily.
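To make one item from the list concrete, here is a Python sketch of a lost update. Real races depend on timing and are hard to reproduce on demand, so the bad interleaving is replayed by hand in a fixed order. The `Account` class, the amounts, and the function name are hypothetical.

```python
import threading

# Lost update, replayed step by step so the result is repeatable.
balance = 100
read_by_a = balance            # A reads 100
read_by_b = balance            # B reads 100 before A has written
balance = read_by_a + 10       # A writes 110
balance = read_by_b + 20       # B writes 120 and overwrites A's update
lost_update_result = balance   # 120, although 130 was expected


class Account:
    """Hypothetical account whose deposits are protected by a lock."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        # The lock turns read-modify-write into one indivisible step.
        with self._lock:
            self.balance = self.balance + amount


def run_concurrent_deposits():
    account = Account(100)
    threads = [
        threading.Thread(target=account.deposit, args=(amount,))
        for amount in (10, 20)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return account.balance
```

The first half shows how an update disappears when two operations read before either one writes. The second half shows one common fix: because each deposit holds the lock for the whole read-modify-write step, neither deposit can overwrite the other.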
In conclusion, to refactor code safely, we need a lot of tests, and good ones!
