The Agile Executive

Making Agile Work

Posts Tagged ‘Empirical Process Control Model’

The Real Cost of One Trillion Dollars in IT Debt: Part II – The Performance Paradox

Some of the business ramifications of the $1 trillion in IT debt have been explored in the first post of this two-part analysis. This second post focuses on the “an ounce of prevention is worth a pound of cure” aspects of IT debt. In particular, it proposes an explanation of why prevention was often neglected in the US over the past decade and very possibly longer. This explanation is not meant to dwell on the past. Rather, it studies the patterns of the past in order to provide guidance on what you could and should do in the future to rein in technical debt.

The prevention vis-a-vis cure trade-off in software was illustrated by colleague and friend Jim Highsmith in the following figure:

Figure 1: The Technical Debt Curve

As Jim astutely points out, “once on far right of curve all choices are hard.” My experience, as well as that of various Cutter colleagues, has shown it is actually very hard. The reason is simple: on the far right the software controls you more than you control it. The manifestations of technical debt [1] in the form of pressing customer problems in the production environment force you into a largely reactive mode of operation. This reactive mode of operation is prone to a high error injection rate – you introduce new bugs while you fix old ones. Consequently, progress is agonizingly slow and painful. It is often characterized by “never-ending” testing periods.

In Measure and Manage Your IT Debt, Gartner’s Andrew Kyte put his finger on the mechanics that lead to the accumulation of technical debt – “when budgets are tight, maintenance gets cut.” While I do not doubt Andrew’s observation, it does not answer a deeper question: why would maintenance get cut in the face of the consequences depicted in Figure 1? Most CFOs and CEOs I know would get quite alarmed by Figure 1. They do not need to be experts in object-oriented programming in order to take steps to mitigate the risks associated with slipping to the far right of the curve.

I believe the deeper answer to the question “why would maintenance get cut in the face of the consequences depicted in Figure 1?” was given by John Seely Brown in his 2009 presentation The Big Shift: The Mutual Decoupling of Two Sets of Disruptions – One in Business and One in IT. In it, Brown points out five alarming facts:

  1. The return on assets (ROA) for U.S. firms has steadily fallen to almost one-quarter of 1965 levels.
  2. Similarly, the ROA performance gap between corporate winners and losers has increased over time, with the “winners” barely maintaining previous performance levels while the losers experience rapid performance deterioration.
  3. U.S. competitive intensity has more than doubled during that same time [i.e. the US has become twice as competitive – IG].
  4. The average lifetime of S&P 500 companies [has declined steadily over this period].
  5. However, in those same 40 years, labor productivity has doubled – largely due to advances in technology and business innovation.

Discussion of the full-fledged analysis that Brown derives from these five facts is beyond the scope of this blog post [2]. However, one of the phenomena he highlights – “The performance paradox: ROA has dropped in the face of increasing labor productivity” – is IMHO at the root of the staggering IT debt we are staring at.

Put yourself in the shoes of your CFO or your CEO, weighing the five facts highlighted by Brown in the context of Highsmith’s technical debt curve. Unless you are one of the precious few winner companies, the only viable financial strategy you can follow is a margin strategy. You are very competitive (#3 above). You have already ridden the productivity curve (#5 above). However, growth is not demonstrable or not economically feasible given the investment it takes (#1 & #2 above). Needless to say, just the thought of being dropped from the S&P 500 index sends a cold sweat down your spine. The only way left to you to satisfy the quarterly expectations of Wall Street is to cut, cut and cut again anything that does not immediately contribute to your cash flow. You cut ongoing refactoring of code even if your CTO and CIO have explained the technical debt curve to you in no uncertain terms. You are not happy to do so, but you are willing to pay the price down the road. You are basically following a “survive to fight another day” strategy.

If you accept this explanation for the level of debt we are staring at, the core issue with respect to IT debt at the individual company level [3] is how “patient” (or “impatient”) investment capital is. Studies by Carlota Perez seem to indicate we are entering a phase of the techno-economic cycle in which investment capital will shift from financial speculation toward (the more “patient”) production capital. While this shift is starting to happen, you have the opportunity to apply an “ounce of prevention is worth a pound of cure” strategy with respect to the new code you will be developing.

My recommendation would be to combine technical debt measurements with software process change. The ability to measure technical debt through code analysis is a necessary but not sufficient condition for changing deep-rooted patterns. Once you institute a process policy like “stop the line whenever the level of technical debt rises,” you combine the “necessary” with the “sufficient” by tying the measurement to human behavior. A possible way to do so through a modified Agile/Scrum process is illustrated in Figure 2:

Figure 2: Process Control Model for Controlling Technical Debt

As you can see in Figure 2, you stop the line and convene an event-driven Agile meeting whenever the technical debt of a certain build exceeds that of the previous build. If ‘stopping the line’ with every such build is “too much of a good thing” for your environment, you can adopt statistical process control methods to gauge when the line should be stopped. (See Using 3σ Control Limits in Software Engineering for a discussion of the settings appropriate for your environment.)
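By way of illustration only, here is a minimal sketch of what such a control-limit check might look like. Nothing here is prescribed by the referenced article: the assumption that technical debt is reported per build as a dollar figure by your code-analysis tool, the rolling window of twenty builds, and the function names are all placeholders you would adapt to your own environment.

```python
# Minimal sketch: decide whether to "stop the line" based on 3-sigma control
# limits computed over recent builds. All figures and names are illustrative.
from statistics import mean, stdev


def should_stop_the_line(td_history, td_current, window=20):
    """Return True if the current build's technical debt breaches the upper control limit.

    td_history -- technical-debt figures (e.g., in dollars) for prior builds
    td_current -- technical-debt figure for the build that just completed
    window     -- number of recent builds used to compute the control limits
    """
    recent = td_history[-window:]
    if len(recent) < 2:
        # Not enough history for control limits; fall back to the simple rule
        # of stopping whenever debt rises relative to the previous build.
        return bool(td_history) and td_current > td_history[-1]

    center_line = mean(recent)
    upper_control_limit = center_line + 3 * stdev(recent)
    return td_current > upper_control_limit


# Example: debt has hovered around $480K-$520K; a $610K build trips the limit.
history = [500_000, 510_000, 495_000, 505_000, 480_000, 520_000, 500_000]
print(should_stop_the_line(history, 610_000))  # True -> convene the event-driven meeting
```

Whether you stop on every rise or only on a 3σ breach is precisely the kind of setting the referenced discussion suggests tuning to your environment.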

An absolutely critical question this analysis does not cover is “But how do we pay back our $1 trillion debt?!” I will address this most important question in a forthcoming post that draws upon the threads of this post plus those of the preceding Part I.

Footnotes:

[1] Kyte/Gartner define IT Debt as “the costs for bringing all the elements [i.e. business applications] in the [IT] portfolio up to a reasonable standard of engineering integrity, or replace them.” In essence, IT Debt differs from the definition of Technical Debt used in The Agile Executive in that it accounts for the possible costs associated with replacing an application. For example, the technical debt calculated through code analysis of a certain application might amount to $500K. In contrast, the cost of replacement might be $250K, $1M or some other figure that is not necessarily related to intrinsic quality defects in the current code base.

[2] See Hagel, Brown and Davison: The Power of Pull: How Small Moves, Smartly Made, Can Set Big Things in Motion.

[3] As distinct from the core issue at the national level.

Toward An Event Driven Scrum Process

Recent posts in this blog and elsewhere recommended using technical debt (TD) criteria as an integral part of the build process. One could fail the build under conditions such as the following (a sketch of such a build gate follows the list):

  • TD(current build) > TD(previous build)
  • TD per line of code > $1
  • TD as % of cost > 50%
  • Many others…
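As the sketch below shows, criteria of this kind can be wired into a continuous-integration step that fails the build when any of them is violated. The report fields, thresholds, and dollar figures are illustrative assumptions, not something mandated by the posts referenced above.

```python
# Sketch of a technical-debt build gate. The inputs would typically come from
# whatever code-analysis tool produces your technical-debt figures; the numbers
# below are made up for illustration.
import sys


def check_td_gate(td_current, td_previous, lines_of_code, project_cost):
    """Return the list of violated criteria; an empty list means the gate passes."""
    violations = []
    if td_current > td_previous:
        violations.append("TD(current build) > TD(previous build)")
    if td_current / lines_of_code > 1.0:   # more than $1 of debt per line of code
        violations.append("TD per line of code > $1")
    if td_current / project_cost > 0.5:    # debt exceeds 50% of project cost
        violations.append("TD as % of cost > 50%")
    return violations


if __name__ == "__main__":
    problems = check_td_gate(
        td_current=520_000, td_previous=500_000,
        lines_of_code=400_000, project_cost=2_000_000,
    )
    if problems:
        print("Failing the build:", "; ".join(problems))
        sys.exit(1)  # a non-zero exit code fails the CI step
```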

If you follow this recommendation while integrating continuously, chances are that fixing will need to be done multiple times a day. This being the case, the relationship between the daily stand-up meeting and the build-fail event needs to be examined. If the team re-groups whenever the build fails anyway, what is the value added by the daily stand-up meeting?

Figure 1 below, taken from Agile Software Development with Scrum, examines the feedback loop of Scrum from a process control perspective. It shows the daily stand-up meeting providing the process control function.

Figure 1: Process Control View of Scrum

Figure 2 below shows the feedback loop being driven by the build process instead of by the daily stand-up meeting. From a process control standpoint it is essentially identical to Figure 1. Control through the regularly scheduled daily stand-up meeting has simply been replaced by event-driven control. The team re-groups whenever the build fails.

Figure 2: Build Driven Process Control
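To make the event-driven loop concrete, the sketch below shows one possible shape of a post-build hook that asks the team to re-group when the technical-debt gate fails the build. The webhook URL, payload format, and function names are hypothetical placeholders rather than part of any tool mentioned in this post.

```python
# Sketch of an event-driven trigger: when the build fails its technical-debt
# gate, notify the team so an ad hoc meeting replaces waiting for the next
# scheduled daily stand-up. The webhook endpoint below is a placeholder.
import json
import urllib.request

TEAM_WEBHOOK_URL = "https://chat.example.com/hooks/scrum-team"  # hypothetical


def convene_event_driven_meeting(build_id, violations):
    """Post a 'stop the line' notification for the build that failed the gate."""
    payload = {
        "text": f"Build {build_id} failed the technical-debt gate: "
                + "; ".join(violations)
                + ". Please re-group for an event-driven meeting."
    }
    request = urllib.request.Request(
        TEAM_WEBHOOK_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

In a setup like this the control function of Figure 1 is preserved; only the trigger changes from the clock to the build.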

Some groups, e.g. the Scrum/Kanban teams under Erik Huddleston and Stephen Chin at Inovis, have been practicing Event Driven Scrum for some time now. A lot more data is needed in order to validate the premise that Event Driven Scrum is indeed an improvement over vanilla Scrum. Until the necessary data is collected and examined, two intuitive observations are worth mentioning:

  1. Event Driven Scrum preserves the nature of Scrum as an empirical process.
  2. A control unit that is triggered by events (build fail) is in the very spirit of adaptive Agile methods.

Between Agile and ITIL – Part II

The July 2009 post Between Agile and ITIL introduced the application of Agile principles to system management with the following words:

You do not need to be an expert in Value Stream Mapping to appreciate the power of speeding up deployment to match the pace of Agile development. By aligning development with deployment, you streamline “production” with “consumption.” The rationale for so doing is aptly captured in the first bullet of the Declaration of Interdependence: “We increase return on investment by making continuous flow of value our focus.”

Yesterday’s press release about the acquisition of Phurnace by BMC validates the projection given in the aforementioned post. Colleague and friend Michael Cote puts his finger on the heart of the acquisition in his post on People Over Process:

The interesting part is also that this is automation – I’m assuming – at the application layer, whereas most automation talk in past and present is at the infrastructure layer. Of course, the thought leaders in this area – folks like Reductive Labs (Puppet), OpsCode (Chef), and in a more general sense cloud management outfits – are doing a helpful job of blurring the distinction between traditional IT layers like application and infrastructure with their dev/ops angled automation. Check out this white paper done by Reductive Labs and dto solutions on the topic for a nice toe-dip. And, I’d expect to see more application layer automation from VMWare/SpringSource. Older automation portfolios like BMC’s Blade Logic line need to keep a close eye on these developments, hopefully, taking in the proven parts of that work.

One can, of course, automate IT tasks without embracing Agile. The fundamental question to be answered is whether one considers ITIL as an “empirical” process control model or as a “defined” process control model (or possibly a hybrid).