The Real Cost of One Trillion Dollars in IT Debt: Part II – The Performance Paradox
Some of the business ramifications of the $1 trillion in IT debt have been explored in the first post of this two-part analysis. This second post focuses on “an ounce of prevention is worth a pound of cure” aspects of IT debt. In particular, it proposes an explanation why prevention was often neglected in the US over the past decade and very possibly longer. This explanation is not meant to dwell on the past. Rather, it studies the patterns of the past in order to provide guidance for what you could do and should do in the future to rein in technical debt.
The prevention vis-a-vis cure trade-off in software was illustrated by colleague and friend Jim Highsmith in the following figure:
Figure 1: The Technical Debt Curve
As Jim astutely points out, “once on far right of curve all choices are hard.” My experience as well as those of various Cutter colleagues have shown it is actually very hard. The reason is simple: on the far right the software controls you more than you control it. The manifestations of technical debt  in the form of pressing customer problems in the production environment force you into a largely reactive mode of operation. This reactive mode of operation is prone to a high error injection rate – you introduce new bugs while you fix old ones. Consequently, progress is agonizingly slow and painful. It is often characterized by “never-ending” testing periods.
In Measure and Manage Your IT Debt, Gartner’s Andrew Kyte put his finger on the mechanics that lead to the accumulation of technical debt – “when budget are tight, maintenance gets cut.” While I do not doubt Andrew’s observation, it does not answer a deeper question: why would maintenance get cut in the face of the consequences depicted in Figure 1? Most CFOs and CEOs I know would get quite alarmed by Figure 1. They do not need to be experts in object-oriented programming in order to take steps to mitigate the risks associated with slipping to the far right of the curve.
I believe the deeper answer to the question “why would maintenance get cut in the face of the consequences depicted in Figure 1?” was given by John Seely Brown in his 2009 presentation The Big Shift: The Mutual Decoupling of Two Sets of Disruptions – One in Business and One in IT. Brown points out five alarming facts in his presentation:
- The return on assets (ROA) for U.S. firms has steadily fallen to almost one-quarter of 1965 levels.
- Similarly, the ROA performance gap between corporate winners and losers has increased over time, with the “winners” barely maintaining previous performance levels while the losers experience rapid performance deterioration.
- U.S. competitive intensity has more than doubled during that same time [i.e. the US has become twice as competitive - IG].
- Average Lifetime of S&P 500 companies [declined steadily over this period].
- However, in those same 40 years, labor productivity has doubled – largely due to advances in technology and business innovation.
Discussion of the full-fledged analysis that Brown derives based on these five facts is beyond the scope of this blog post . However, one of the phenomena he highlights – “The performance paradox: ROA has dropped in the face of increasing labor productivity” – is IMHO at the roots of the staggering IT debt we are staring at.
Put yourself in the shoes of your CFO or your CEO, weighing the five facts highlighted by Brown in the context of Highsmith’s technical debt curve. Unless you are one of the precious few winner companies, the only viable financial strategy you can follow is a margin strategy. You are very competitive (#3 above). You have already ridden the productivity curve (#5 above). However, growth is not demonstrable or not economically feasible given the investment it takes (#1 & #2 above). Needless to say, just thinking about being dropped out of the S&P 500 index sends cold sweat down your spine. The only way left to you to satisfy the quarterly expectations of Wall Street is to cut, cut and cut again anything that does not immediately contribute to your cashflow. You cut on-going refactoring of code even if your CTO and CIO have explained the technical debt curve to you in no uncertain terms. You are not happy to do so but you are willing to pay the price down the road. You are basically following a “survive to fight another day” strategy.
If you accept this explanation for the level of debt we are staring at, the core issue with respect to IT debt at the individual company level  is how “patient” (or “impatient”) investment capital is. Studies by Carlota Perez seem to indicate we are entering a phase of the techno-economic cycle in which investment capital will shift from financial speculation toward (the more “patient”) production capital. While this shift is starting to happens, you have the opportunity to apply “an ounce of prevention is worth a pound of cure” strategy with respect to the new code you will be developing.
My recommendation would be to combine technical debt measurements with software process change. The ability to measure technical debt through code analysis is a necessary but not sufficient condition for changing deep-rooted patterns. Once you institute a process policy like “stop the line whenever the level of technical debt rose,” you combine the “necessary” with the “sufficient” by tying the measurement to human behavior. A possible way to do so through a modified Agile/Scrum process is illustrated in Figure 2:
Figure 2: Process Control Model for Controlling Technical Debt
As you can see in Figure 2, you stop the line and convene an event-driven Agile meeting whenever the technical debt of a certain build exceeds that of the previous build. If ‘stopping the line’ with every such build is “too much of a good thing” for your environment, you can adopt statistical process control methods to gauge when the line should be stopped. (See Using 3σ Control Limits in Software Engineering for a discussion of the settings appropriate for your environment.)
An absolutely critical question this analysis does not cover is “But how do we pay back our $1 trillion debt?!” I will address this most important question in a forthcoming post which draws upon the threads of this post plus those in the preceding Part I.
 Kyte/Gartner define IT Debt as “the costs for bringing all the elements [i.e. business applications] in the [IT] portfolio up to a reasonable standard of engineering integrity, or replace them.” In essence, IT Debt differs from the definition of Technical Debt used in The Agile Executive in that it accounts for the possible costs associated with replacing an application. For example, the technical debt calculated through doing code analysis on a certain application might amount to $500K. In contrast, the cost of replacement might be $250K, $1M or some other figure that is not necessarily related to intrinsic quality defects in the current code base.
 See Hagel, Brown and Davison: The Power of Pull: How Small Moves, Smartly Made, Can Set Big Things in Motion.
 As distinct from the core issue at the national level.