Diagnostic before surgery
Where should I spend the engineering budget?
How to rank technical debt by business value, not by ugliness
The first thing most engineers do with an inherited backend is open the file with the worst code. The first thing the System Report does is ask which file the business will miss most when it breaks. These are rarely the same file.
A 4,000-line OrderController looks like the obvious place to start. It is ugly, it has been touched by twelve people, it is the first thing a senior engineer winces at on day one. The temptation is to refactor it on principle. The report's Business-Use Map tends to suggest something different — that the OrderController, however ugly, has been working for years against well-understood inputs and is not in fact carrying the company's most exposed risk. The exposed risk lives in a 200-line nightly settlement job that almost nobody on the team can describe end to end.
This article is about that gap.
The wrong ranking
Three rankings dominate inherited-system audits. All three are wrong on their own.
Rank by file size. The longest files are the most complex; therefore the most dangerous. Sometimes true, often not. A 4,000-line file that has been stable for three years and is only ever read by one team is much less risky than a 400-line file that ten teams import and that nobody owns.
Rank by complexity score. Cyclomatic complexity is a real signal, but it tells you about the local difficulty of reading a function, not about its blast radius. A complex function that is exercised by three users a year is less of a problem than a simple function in the hot path of every checkout.
Rank by lint warnings. Static analysis is cheap to run and produces an impressive list. The list correlates with code-quality conventions, not with what the business depends on. A clean file can carry the most important rule in the system; a file full of warnings can carry nothing.
What all three share is that they describe the code. The business does not pay for code. It pays for the workflows the code makes possible.
The Business-Use Map
The artefact the report builds for ranking is short, ugly, and load-bearing. It is a table — kept in the consultant's working folder during the engagement and confirmed in writing by the technical and business owners. A row per system area or workflow, with these columns:
- System area / workflow. The named thing: OrderController.Submit, the partner CSV export, the nightly settlement job, the admin override screen.
- Primary users. Sales, operations, finance, support, customers, partners.
- Business purpose. What value flow this area serves, in plain English.
- Usage frequency. Daily, weekly, monthly, quarterly, rare.
- Value type. Revenue, cost saving, risk reduction, compliance, customer experience, delivery speed.
- Business criticality. What hurts if this fails. Multi-select: revenue-critical, operationally critical, compliance-critical, customer-trust-critical, delivery-blocking.
- Pain or risk. The friction or failure mode the business already feels here, if any.
- Evidence. Interview line, analytics, log volume, support tickets, admin usage, DB activity.
Rows without evidence are not added. The map is bounded by what the consultant has seen and heard, not by what could be assumed about a generic business of this shape.
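As a minimal sketch, one row of the map could be modelled like this, with the evidence rule enforced at the point of insertion. The names `MapRow` and `add_row` are hypothetical, and the sample row's users, purpose, and evidence strings are illustrative only; the allowed frequency and criticality values are taken from the column descriptions above.

```python
from dataclasses import dataclass

# Allowed values, copied from the column descriptions above.
FREQUENCIES = {"daily", "weekly", "monthly", "quarterly", "rare"}
CRITICALITIES = {
    "revenue-critical",
    "operationally critical",
    "compliance-critical",
    "customer-trust-critical",
    "delivery-blocking",
}


@dataclass(frozen=True)
class MapRow:
    """One row of the Business-Use Map (hypothetical shape)."""
    area: str
    primary_users: tuple[str, ...]
    business_purpose: str
    usage_frequency: str
    value_type: str
    criticality: frozenset[str]  # multi-select; empty set means "none"
    pain_or_risk: str
    evidence: tuple[str, ...]


def add_row(business_use_map: list[MapRow], row: MapRow) -> bool:
    """Append the row only if it carries evidence; return True if added."""
    if not any(item.strip() for item in row.evidence):
        return False  # rows without evidence are not added
    if row.usage_frequency not in FREQUENCIES:
        raise ValueError(f"unknown frequency: {row.usage_frequency!r}")
    unknown = row.criticality - CRITICALITIES
    if unknown:
        raise ValueError(f"unknown criticality: {sorted(unknown)}")
    business_use_map.append(row)
    return True


business_use_map: list[MapRow] = []

settlement = MapRow(
    area="Nightly settlement job",
    primary_users=("finance",),                      # illustrative
    business_purpose="Settle the day's orders",      # illustrative
    usage_frequency="daily",
    value_type="revenue",
    criticality=frozenset({"compliance-critical", "revenue-critical"}),
    pain_or_risk="Has failed silently twice",
    evidence=("finance interview", "job logs"),      # illustrative
)
added = add_row(business_use_map, settlement)
```

Returning `False` rather than raising for a missing-evidence row is a design choice in this sketch: an unevidenced row is an expected outcome of the rule, while an unknown frequency or criticality value is treated as a data-entry error.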
The map is not a strategy document. It is the input that lets the synthesis pass at the end of the report rank findings against something other than how messy the code looks.
Frequency is not value
The most common mistake the map prevents is reading frequency as importance. A workflow that runs once a quarter — month-end close, year-end statements, the annual partner reconciliation — can be the single most business-critical thing the system does. A workflow that runs every minute can be ambient noise nobody would miss for a day.
This is why the map records Business criticality and Usage frequency in different columns. Criticality carries the rank weight. Frequency modulates how often that weight will be triggered. A weekly workflow that touches nothing important ranks below a quarterly workflow that touches a regulatory deadline.
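One way to keep frequency from overriding criticality is to sort on a two-part key, criticality first and frequency only as a tie-breaker. The sketch below assumes a simple weighting that the report does not prescribe: criticality is scored by the number of tags selected, and `FREQUENCY_ORDER` and `rank_key` are hypothetical names. The "weekly status digest" row is an invented example of a frequent workflow that touches nothing important.

```python
# Assumed ordering for the tie-breaker; higher means triggered more often.
FREQUENCY_ORDER = {"rare": 0, "quarterly": 1, "monthly": 2, "weekly": 3, "daily": 4}


def rank_key(criticality: set[str], frequency: str) -> tuple[int, int]:
    """Criticality carries the rank; frequency only breaks ties."""
    return (len(criticality), FREQUENCY_ORDER[frequency])


workflows = [
    ("Weekly status digest", set(), "weekly"),  # hypothetical row
    ("Quarterly tax export", {"compliance-critical"}, "quarterly"),
    ("Nightly settlement job", {"compliance-critical", "revenue-critical"}, "daily"),
]

# Tuples compare left to right, so no frequency value can outrank
# a difference in criticality.
ranked = sorted(workflows, key=lambda w: rank_key(w[1], w[2]), reverse=True)
ranked_names = [name for name, _, _ in ranked]
# ranked_names ==
#   ["Nightly settlement job", "Quarterly tax export", "Weekly status digest"]
```

Counting tags is the crudest possible weight. In practice the owners confirming the map may well decide that one compliance-critical tag outweighs two others, in which case the first element of the key would become a weighted sum.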
The mistake gets made anyway, because frequency is easy to measure and criticality is not. Logs, analytics, and request counts all give frequency for free. Criticality only comes out in interviews — what would hurt the business most if this were unavailable for a day? — and from understanding the value flow well enough to recognise that the rare workflow is the one that pays the rent. The report's Pass 11 is mostly about asking that question and writing the answer down before the synthesis pass tries to rank anything.
What it looks like in practice
A real-shaped excerpt — anonymised, but the shape is what shows up in nearly every engagement:
| Workflow | Frequency | Criticality | Pain today |
|---|---|---|---|
| OrderController.Submit | Hundreds/day | Revenue-critical | Slow under peak load; staff have learned to retry |
| Nightly settlement job | Daily | Compliance-critical, revenue-critical | Has failed silently twice; nobody noticed for 36 hours |
| Partner CSV export | Weekly | Operationally critical | Format drift causes partner complaints once a quarter |
| Admin override screen | Several/week | Customer-trust-critical | Used to fix support tickets; bypasses validations |
| Quarterly tax export | Quarterly | Compliance-critical | Last quarter required two engineers and three days |
| LegacyImageGenerator | Rare | None | Code is ugly, but unused since 2022 |
The OrderController is the workflow the team complains about. The settlement job is the workflow that is actually exposed. The legacy image generator is the workflow the engineer who walks in on day one wants to refactor first, and the one that should not be touched at all.
The findings the report writes against this map look very different from the findings a code-quality audit would write. The settlement job's silent-failure mode outranks the OrderController's ugliness. The admin override's validation bypass outranks the partner CSV's format drift. The legacy image generator falls off the list entirely, with a recommended action of "leave alone", because the only correct response to dead code is not to move it.
The yes list
Ranking by business value also produces a counterweight that pure code-quality audits do not: the areas where the system is sound enough that new work can begin without prerequisite cleanup. The report calls this "what can be safely built directly". Many engagements produce more value from this list than from the findings. "You have permission to proceed in these areas" is a more useful statement than "the system is fine", and a much more useful one than "the system needs a year of rework".
The yes list also depends on the Business-Use Map. An area is on it when the business value carried is well-understood, the test trust on the relevant flows is medium or high, and the four properties of safe change can be answered for the kind of work the team wants to do next. Without the map, you cannot tell the yes list from the no list.
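The three conditions can be read as a single predicate over each area. The sketch below is an assumption about shape, not the report's own format: `AreaAssessment` and `on_yes_list` are hypothetical names, the four properties of safe change are collapsed into one boolean because this article does not enumerate them, and the assessments of the two sample areas are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AreaAssessment:
    area: str
    value_understood: bool      # business value carried is well understood
    test_trust: str             # "low", "medium", or "high"
    safe_change_answered: bool  # all four properties of safe change answered


def on_yes_list(a: AreaAssessment) -> bool:
    """An area qualifies only when all three conditions hold."""
    return (
        a.value_understood
        and a.test_trust in {"medium", "high"}
        and a.safe_change_answered
    )


# Illustrative assessments, not findings from a real engagement.
areas = [
    AreaAssessment("Partner CSV export", True, "medium", True),
    AreaAssessment("Nightly settlement job", False, "low", False),
]
yes_list = [a.area for a in areas if on_yes_list(a)]
# yes_list == ["Partner CSV export"]
```

The `value_understood` field is the part that cannot be filled in without the map, which is why the predicate is unanswerable for any area that has no row.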
What this tells you
Three moves, in order.
Stop scanning the codebase for the worst file. It is the wrong question. The right question is which areas of the system are most exposed if they fail, and the answer is rarely visible from git blame.
Build the map before the budget conversation. A budget-vs-debt conversation that does not start from a Business-Use Map will rank by ugliness by default, because that is the easiest signal to point at on a screen. Spend the day or two it takes to build the map first. The conversation that follows is qualitatively different.
Treat the rare-but-critical workflows as the prime suspects. They are usually the most exposed. They are also the most likely to be missing observability, missing tests, and missing a documented owner, because the loop is self-reinforcing: nobody touches them, so nobody learns them, so nobody touches them. Business rules tend to hide in exactly this kind of workflow.
Where this fits in a System Report
The Business-Use Map is built across passes 4 and 11 of the System Report process and used at synthesis. It is the load-bearing input that prevents the final report from being a list of pretty refactors. Findings are ranked against it. The recommended implementation package is shaped by it. The yes list comes out of it.
If you only take one artefact from this article, take this one: a table on a single page, ranking your system's workflows by what the business depends on, with evidence. That table will outrank every code-quality dashboard you have on the day a hard decision has to be made.
Articles describe the lens. The questions a System Report asks are how that lens is applied to your system.