Some recent projects involving big rewrites have led to a large number of regressions. Some bugs are inevitable, but I think a fair number could have been prevented with a better refactoring strategy.
I’ve talked about this privately to some developers before, all the way back to Blender 2.5 planning. Note that this is my personal opinion and something that other developers have disagreed with at various points. But maybe it will help future projects.
If you are converting a large amount of code from one design to another, there are broadly two strategies:
- Incrementally convert to the new design, refactoring one thing at a time while keeping the system working.
- Build a new system parallel to the old one from the ground up, adding functionality back piece by piece.
There are trade-offs between these approaches. Building from the ground up has the advantage that you are less constrained by the old design and may end up with a better architecture in the end. It can also be more rewarding to start with a clean slate and do things right immediately.
The downside is that it’s easy to lose important information in the process: implicit behavior and design decisions that have been added over the years and are impossible to see at a glance. This can lead to quick progress at the start, followed by a long time at the end of the project spent discovering and fixing what was missed.
I will not try to argue here which strategy is right for which project. A project may use both strategies for different parts as well. However, I think it’s important to realize that even when doing a ground-up rebuild, it is possible to retain many of the benefits of the incremental strategy.
It works something like this:
- Copy-paste the entire old code into the new code, and comment it out. Broadly organize it into new files.
- Where straightforward, use tools to do automated renaming and refactoring horizontally across the old code. This avoids manual work and prevents some types of mistakes.
- When reimplementing a piece of functionality, either extract the relevant piece of old code and refactor it into the new one, or rewrite it from scratch while deleting pieces of the old code as they are replaced. Or do a mix of both.
- The important thing here is to be meticulous: either ensure that the new implementation really is equivalent and complete (unless a change is intentional), or leave the missing bits of the old code around until they are completed later or it is decided they are not needed.
- As you work, immediately write down TODO comments for anything you are uncertain about or that is incomplete. Do not assume you will remember these things or simply hope they will not be a problem.
- This is an iterative process. Continue doing automated refactoring across the old, commented-out code as you discover the need. This helps enforce naming conventions and patterns right away, instead of having to keep track of them for every future change or code review.
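The automated renaming step can be sketched as a small script. This is only an illustration under stated assumptions: the function `rename_identifiers` and the identifiers `old_update` and `object_update` are made up for the example, and a real project would more likely use its editor's or compiler's refactoring tools. The point it shows is that a whole-word rename applies equally to live code and to old code that is still commented out.

```python
import re
from typing import Dict


def rename_identifiers(source: str, renames: Dict[str, str]) -> str:
    """Rename whole-word identifiers everywhere in the text, comments included."""
    if not renames:
        return source
    # Try longer names first so a shorter name is never preferred over a longer one.
    names = sorted(renames, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\b")
    return pattern.sub(lambda m: renames[m.group(1)], source)


# Hypothetical snippet: the old function is still commented out,
# while the new code already calls it.
old_code = """\
// static void old_update(Object *ob)
// {
//   old_update_flag(ob);
// }
void new_entry(Object *ob) { old_update(ob); }
"""

renamed = rename_identifiers(old_code, {"old_update": "object_update"})
print(renamed)
```

Because the match is on whole words, `old_update_flag` is left alone while both the commented-out definition and the live call are renamed together.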
Typically the project work is tracked in a high-level list outside the code. This is fine, but such a list will miss many things. The code itself is guaranteed to be complete, and having it there confronts you with the hidden behavior and design that you may otherwise miss.
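As a rough illustration of letting the code itself act as the task list, the sketch below scans a source text for TODO comments and counts how many lines of old code are still waiting to be ported. The `//OLD` prefix for commented-out old code is an assumption made for this example, not an established convention, and `scan` and `Report` are hypothetical names.

```python
import re
from typing import List, NamedTuple

OLD_MARKER = "//OLD"  # assumed prefix for old code that has not been ported yet
TODO_PATTERN = re.compile(r"\bTODO\b[:\s]*(.*)")


class Report(NamedTuple):
    todos: List[str]  # "line number: text" for each TODO in the new code
    old_lines: int    # lines of old code still left in the file


def scan(source: str) -> Report:
    """Collect TODO notes and count remaining commented-out old lines."""
    todos = []
    old_lines = 0
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith(OLD_MARKER):
            old_lines += 1
            continue
        match = TODO_PATTERN.search(stripped)
        if match:
            # Drop a trailing C-style comment terminator, if present.
            text = match.group(1).rstrip("*/ ")
            todos.append(f"{number}: {text}")
    return Report(todos, old_lines)


sample = """\
void object_update(Object *ob)
{
  /* TODO: old code also tagged the parent here, check if still needed. */
  tag_update(ob);
}
//OLD static void old_update_children(Object *ob)
//OLD {
//OLD   ...
//OLD }
"""

report = scan(sample)
print(report.old_lines, "old lines remaining")
for todo in report.todos:
    print(todo)
```

A report like this complements the high-level list rather than replacing it: the count only reaches zero once every piece of old code has been either reimplemented or deliberately dropped.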