Technical debt is the silent enemy of every product team. Product development is hectic: teams try to fit as many features as possible into a single release while the business pushes for higher productivity. At times, development teams take shortcuts as they push for a deadline, make architectural choices that fit the current environment and defer updates to a later time, or put off fixing small problems while they push to get bigger features out.
That silent enemy is technical debt, and if it goes unchecked, over the long run it can wreak havoc on your product. Technical debt is an expression of cost: the cost of making changes, either because you need to fix things or because the product has become too complex. The key to reducing technical debt is to work on it early and often, so that its cost stays low. The smaller the debt, the easier it is to fix. If you wait and delay, technical debt can only increase, and it becomes harder to resolve later. It’s like accumulating a financial debt and waiting to pay it back. If you wait too long, the interest may become expensive, making the debt payments too hard (see Martin Fowler, “Technical Debt”).
Many consider technical debt the number one cause of product failures over the long run. When an architecture becomes too rigid and too complex to upgrade, it cannot sustain the rhythm of innovation, and the product risks giving ground to competitors.
The list of products, or entire companies, that failed because of technical debt is too long to compile. One story that stands out is Nokia’s:
Nokia was once the market leader in mobile phones. When Apple launched the iPhone in 2007, Nokia wasn’t ready, and within a few years it was barely a shadow of its former self. What happened? An interesting analysis in “How Nokia Lost the Smartphone Battle” uncovers several major issues with the product development culture at Nokia, and among these is technical debt.
For years, the company decided to live with the limitations and deficiencies of its mobile operating system (Symbian) in an effort to go to market faster.
“[The Symbian OS software] had a very antiquated architecture in many ways, which [software developers] could never modernize and they weren’t given the time to modernize. […] And a terrible technical complexity emerged through that process.” (“How Nokia Lost the Smartphone Battle”)
When the time came for a new concept of mobile phone that included a touchscreen and an app store, the engineering team at Nokia was clear: Symbian could not support it. They needed to rewrite the operating system in order to build the new phone. By the time they were done, Apple was the market leader, and Nokia was history. The technical debt accumulated was too big and had sunk the company.
“The way the user interface was done, it was really old. A totally antique system. So doing anything with [this old system resulted in] very slow [performance]. Then instead of saying early on that we have to get rid of this [old system], it’s not worth fixing it, they had just been patching it up. It might help in getting the next product out, but it doesn’t solve the [core] problem.” (“How Nokia Lost the Smartphone Battle”)
Sources of technical debt
Depending on circumstances, technical debt can arise at any time: during initial development, maintenance (fixing issues), or enhancements (building new capabilities). The sources of technical debt can be voluntary (for example, taking a shortcut to get to market sooner) or involuntary (for example, a system architecture that is built poorly because dependencies between teams were never addressed). Let’s look at some examples:
At times, teams may choose to take a shortcut in order to complete development and launch a product in market quickly. They know that they will have to fix the product at a later time, but they judge that the value today from releasing faster is higher than the cost it will take in the future to fix the shortcut and bring it up to quality. That is,
Value (today) > Cost (tomorrow)
This is a voluntary technical debt, and the team has a path to resolution (the fix they will do in the future).
A different example is Nokia’s, as described above. The teams accumulated technical debt voluntarily to get to market quickly, but did not have a path to resolution. They let the technical debt grow uncontrollably, until it was too large to fix.
There is also the involuntary accumulation of technical debt. I once worked with a development team that built iPad apps. At the time, the programming language was Objective-C. We had a full suite of apps built when Apple announced the switch to a new programming language called Swift. We spoke with Apple and the message was clear: we had to rewrite in Swift all the apps we had already created, otherwise we would not be able to support or expand them in the future.
To us, Apple’s switch from Objective-C to Swift was a huge accumulation of technical debt overnight. None of it depended on technical choices we had made. Technical debt just showed up, and we had to deal with it. We chose to adopt capacity allocation (described below) and rewrite the existing apps over time while building new ones directly in the new language.
Involuntary technical debt that builds up without anyone noticing can be very dangerous. A typical source is a company that keeps adding feature upon feature to its product without a real product strategy (feature creep), sometimes in response to market pressure. And if the team does not employ the right engineering practices, it’s very easy to bend to that pressure and take shortcuts.
Reducing technical debt
The key is to control technical debt to avoid the long-term problems it creates, limit its accumulation in the first place, and address it once it has accumulated. Here are a few techniques that teams can use to keep technical debt in check:
Build up to quality, define a strong DoD
In an ideal world, teams would never take shortcuts, would have plenty of time to check (and re-check) their code, and would document everything their system does. In reality it’s often difficult to do all of these things, as time is limited and stakeholders push to get the product done as quickly as possible.
But having a focus on quality is key to keeping technical debt from growing uncontrollably. Teams need to slow down to go far. They need to bring everything they do to their standard of quality, to avoid reworking at a later time.
A couple of good practices support this. The first is having a strong Definition of Done. This is an understanding within the team of what conditions need to be true for the work to be considered “done”. Not half done, or 95% done. Fully, completely, 100% done so that it can be released (potentially) and the team doesn’t have to touch it again.
The second good practice is to establish a team policy around the Definition of Done. Any work that is not done by the end of the sprint is simply moved back to the product backlog, to be completed at a later time. This keeps half-done work from being released and saves you from dealing with bugs and rework later, which is more expensive than getting it right the first time.
Make sure that your team has a Definition of Done and a policy that the team uses to check every work item that gets done.
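To make the policy concrete, here is a minimal Python sketch of an end-of-sprint check. It is only an illustration under stated assumptions: the three criteria and the names SprintItem, is_done, and close_sprint are hypothetical, and a real team would agree on its own list.

```python
from dataclasses import dataclass, field

# Hypothetical criteria; every team defines its own Definition of Done.
DEFINITION_OF_DONE = ("code_reviewed", "tests_pass", "documented")


@dataclass
class SprintItem:
    title: str
    checks: set = field(default_factory=set)  # criteria met so far


def is_done(item):
    """An item is done only when every criterion is met, not most of them."""
    return all(criterion in item.checks for criterion in DEFINITION_OF_DONE)


def close_sprint(sprint_items):
    """Split sprint items into releasable work and work returned to the backlog."""
    done = [item for item in sprint_items if is_done(item)]
    back_to_backlog = [item for item in sprint_items if not is_done(item)]
    return done, back_to_backlog


if __name__ == "__main__":
    items = [
        SprintItem("Login screen", {"code_reviewed", "tests_pass", "documented"}),
        SprintItem("Export to PDF", {"code_reviewed", "tests_pass"}),
    ]
    done, back_to_backlog = close_sprint(items)
    print([item.title for item in done])             # ['Login screen']
    print([item.title for item in back_to_backlog])  # ['Export to PDF']
```

The point of the sketch is the all-or-nothing test: an item that misses a single criterion goes back to the backlog instead of being released half done.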
DevOps and engineering practices
The team can also adopt engineering practices. These practices are widely adopted by high-performing teams. While they were born in software development contexts, they have meaningful complements in other applications as well.
Many of these practices are now at the core of DevOps. Watch how Microsoft has adopted DevOps and has transformed its engineering capability.
The key here is that by having the right engineering practices in place, teams can avoid injecting errors into the final product, can reduce the amount of work it takes to test the product, and can create a shared understanding of the inner workings of a product so that everyone on the team knows how to fix it.
Read more in the engineering practices article.
Reserve capacity, backlog prioritization
Smart teams know the technical debt they have accumulated and take steps to fix it. This is usually in the form of work items added to the product backlog. When the work is properly prioritized by the product manager, the team gets to fix the technical debt over time.
The problem is that product managers usually want to build the next cool feature for their customers, not fix “older” things. They may not see value in investing precious development time in fixing technical debt, so features always get prioritized. Over time this may create a huge risk for the company (remember Nokia, above?).
Product managers should understand the value of fixing technical debt and the risk of not doing so. They should order their product backlog so that it contains a combination of new features and technical debt reduction.
Yet, the negotiation between developers and product managers is often difficult. So how can teams make sure that technical debt is addressed even when the product managers don’t consider it a priority? One possible solution is capacity allocation.
Capacity allocation is the idea that the team decides upfront what percentage of its capacity to allocate to technical debt reduction. Once this is done, technical work is addressed within that percentage, and new feature work takes all the remaining capacity. This takes the emotion out of the negotiation and makes visible to everyone what work the team is doing.
For example, if your team’s velocity is typically 30 points and you are concerned about your ability to address technical debt, you could use capacity allocation and reserve 10 points for technical debt work. This applies to every sprint: in each sprint 10 points are reserved for technical debt reduction, and work on new features and everything else is limited to the remaining 20 points. I’m using 10 points here as an example; the right number depends on your team’s velocity, on how much technical debt you have to address, and on how quickly you want to get it solved.
Read the Velocity, capacity or load? article to learn more about how to use velocity to properly plan your Sprint.