Characterizing Tech Debt

These notes focus on accurately describing what tech debt is, more so than on tactics for handling it.

A Taxonomy of Tech Debt

This might be the best discussion of what technical debt is that I've read.

Debt can be measured by:

Impact
Fix cost
Contagion

Debt can be:

Local debt
MacGyver debt
Foundational debt
Data debt
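
As a way of making the two lists above concrete, here is a minimal sketch of how a team might record a piece of debt along both axes. The 1-to-5 scale, the class names, and the example entry are all assumptions for illustration; they are not taken from the article.

```python
from dataclasses import dataclass
from enum import Enum


class DebtKind(Enum):
    """The four kinds of debt listed above."""
    LOCAL = "local"
    MACGYVER = "macgyver"
    FOUNDATIONAL = "foundational"
    DATA = "data"


@dataclass(frozen=True)
class DebtItem:
    """One piece of debt, scored on the three measures listed above.

    The 1 (low) to 5 (high) scale is an assumption made for this sketch.
    """
    description: str
    kind: DebtKind
    impact: int
    fix_cost: int
    contagion: int

    def __post_init__(self) -> None:
        # Reject scores outside the assumed 1-5 range.
        for name in ("impact", "fix_cost", "contagion"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")


# Hypothetical entry, invented purely for illustration.
example = DebtItem(
    description="Config values duplicated across three services",
    kind=DebtKind.LOCAL,
    impact=2,
    fix_cost=1,
    contagion=3,
)
```

Keeping the kind and the three scores as separate fields means a list of such items can be filtered or sorted on any one of them without committing to a single combined score.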

Towards an Understanding of Technical Debt

There are at least 5 distinct things we mean when we say “technical debt”.

Maintenance work

Features of the codebase that resist change

Operability choices that resist change

Code choices that suck the will to live

Dependencies that resist upgrading

Ur-Technical Debt

Describes a narrower notion of tech debt, as it was originally coined by Ward Cunningham.

What counts as technical debt has expanded over the years, which has caused many people to lose sight of the interesting phenomena that Ward Cunningham was talking about when he coined the term in 1992. Today, any code that a developer dislikes is branded as technical debt. Tech debt is also hacky code, code written by novices, code written without consideration of software architecture (so-called “big balls of mud”), and code with anti-patterns flagged by static analysis tools.

When you choose an iterative process, you use that time to deliver several partial solutions. You don’t understand the full requirements until the last iteration, so for most of the year, the ideas in your head are partial. The code embodies your partial understanding because it cannot be more insightful than the ideas in your head. The original paper argues that this is ok, so long as you don’t leave the code in that state forever, and introduces the debt metaphor: “Shipping first time code is like going into debt. A little debt speeds development so long as it is paid back promptly with a rewrite.” [1]

This understanding of technical debt is not so far away from common understandings of technical debt:

“[I]f you develop a program for a long period of time by only adding features and never reorganizing it to reflect your understanding of those features, then eventually that program simply does not contain any understanding and all efforts to work on it take longer and longer.” [2]

But it does presume a level of care that tech-debt-ridden systems may not show:

If you knowingly write hacky code, or you allow inexperienced developers to use their first draft code, you undermine the very thing that makes iterative development viable. “In other words, the whole debt metaphor, let’s say, the ability to pay back debt, and make the debt metaphor work for your advantage depends upon your writing code that is clean enough to be able to refactor as you come to understand your problem.”

The fascinating, surprising thing about ur-technical debt is that it happens even under the best circumstances, say with expert developers who always choose to fix debt immediately. It’s inherent to using an iterative process and acting with a partial understanding.

I don't see a reason to restrict technical debt so tightly. But understanding Cunningham's idea of unavoidable technical debt isolates an important phenomenon. Some debt is unavoidable regardless of how much care you take, while the worst technical debt inhibits future refactoring and improvement.

On Systemic Debt

Discusses the author's experience with tech debt in an inventory management process. A good account of how some kinds of brittle code create process barriers to improvement. Because the code was bad, the team required expensive proof that new code was good, which blocked incremental improvement. Because the system was brittle, the team prioritized safety: the existing system worked, so they couldn't justify making changes that risked downtime. These policies meant the code remained in bad shape. The story ends with a discussion of how actual improvements were made.

Here, we have a very concrete number: technical debt was two full-time salaries to just maintain basic operations.

Stop Saying Technical Debt

Equating tech debt to bad code also allows us to conflate “this code doesn’t match my personal preferences” with “this code is a problem”—which, again, is fine, until we’re under a time constraint. We spend “tech debt week” doing our pet refactors instead of actually fixing anything. Engineers love tech debt week because they get to chase down their personal bugaboos. The thing is, those bugaboos rarely intersect with the code’s most pressing maintenance challenges. So when each engineer finishes their gang-of-four-fueled refactoring bender, the code is no easier to work in than it was before: it’s just different, so no one besides the refactorer knows it as well anymore. Fantastic. A+. No notes.

In all seriousness, this is a huge reason that spending three weeks paying down tech debt, carte blanche, often does little or nothing for the team’s velocity after those weeks have ended.

To fix these problems, choose something measurable to evaluate the quality of the system. My recommendation: maintenance load. How much time and effort are developers spending on tasks that are not adding features or removing features? We can talk to folks outside the engineering team about that number. If we have six developers but half of our work is maintenance work, then our feature plan can only assume three developers. Business people think of engineers as expensive, so this framing motivates them to help us decrease our maintenance load.

Fair, but...

Technical Debt Doesn't Exist

...this simplification goes too far. It's true that the cost of maintenance is the ultimate measure of technical debt. But some types of maintenance are inevitable and outside your control, while other costs are avoidable.
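
The maintenance-load arithmetic from the previous section, combined with the avoidable/unavoidable split made here, can be sketched in a few lines. This is only an illustration: the class name, the field names, and the 25%/25% split are assumptions, chosen so that the total matches the "six developers, half the work is maintenance" example.

```python
from dataclasses import dataclass


@dataclass
class MaintenanceLoad:
    developers: int
    # Fractions of total team time; both values are assumed inputs.
    unavoidable_fraction: float  # maintenance outside the team's control
    avoidable_fraction: float    # maintenance that paying down debt could remove

    def feature_capacity(self) -> float:
        """Developers effectively available for feature work today."""
        total = self.unavoidable_fraction + self.avoidable_fraction
        return self.developers * (1 - total)

    def best_case_capacity(self) -> float:
        """Capacity if all avoidable maintenance were eliminated."""
        return self.developers * (1 - self.unavoidable_fraction)


# Six developers, half of the work is maintenance (split is hypothetical).
team = MaintenanceLoad(developers=6, unavoidable_fraction=0.25, avoidable_fraction=0.25)

print(team.feature_capacity())    # 3.0
print(team.best_case_capacity())  # 4.5
```

The gap between the two numbers is the part of the maintenance load that is worth arguing about; the rest is a cost of running the system at all.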