Posts Tagged ‘Technical Debt’
Photo credit: Dancing Lemur (Flickr)
Cunningham’s quip “A little debt speeds development so long as it is paid back promptly with a rewrite” is intuitively very clear. We are talking about short-term debt which will be reduced, and hopefully eliminated in entirety, at the earliest possible time.
The question this post addresses is what happens when the expected short-term technical debt becomes a significant long-term debt? Specifically, can technical debt under some conditions constitute a breach of implied warranties?
In his InformIT article Don’t “Enron” Your Software Project, Aaron Erickson coined the term “Technical Fraud” and connected it to Lemmon Laws:
As a reaction to seeing this condition and its deleterious effects, I coined the term technical fraud to refer to the practice of incurring unmanaged and hidden technical debt. Many U.S. states have “lemon laws” that make it illegal to knowingly sell someone a car that has undisclosed maintenance problems. Selling a “lemon” is a fraudulent practice in the world of cars, and it should be considered as such in the world of software.
It is a little tricky (though not impossible – see Using Credit limits to Constrain Development on Margin) to define the precise point where technical debt becomes “unmanaged.” One needs to walk a fine line between technical/methodical incompetence and resource availability to determine technical fraud. For example, if your code has 35% coverage, is it or is not unmanaged? Does the answer to this question change if your cyclomatic complexity per class exceeds 30? I would think the courts might be divided for a very long time on the question when does hidden technical debt represent a fraudulent misrepresentation.
One component of technical debt deserves special attention in the context of this post. I am referring to the conscious decision not to do unit testing at all.
Best I understand it, the rationale for not “bothering” with unit testing is a variant of the old ploy “we do not have time for testing here.” It is a resource allocation strategy that bets on the code being miraculously bug-free. Some amount of functional testing is done out of necessity – the code in customers hands needs to function as proclaimed. But, the pieces of code from which functionality is constructed are not subject to direct rigorous testing. The individual units of code will be indirectly exercised in some manner through functional testing, but not in a systemic manner to verify and validate correctness of the units of code per se.
Such a conscious decision IMHO indicates no intention to pay back this category of technical debt – unit test coverage. It is therefore quite incompatible with the nature of an implied warranty:
An implied warranty is as an unstated promise, assumed by the law in most sales transactions, that the product will be of at least average quality and will do what the average customer would expect it to do [The Reader's Digest Legal Questions & Answers Book]
To #1 defense open to a software vendor who gets sued over lack of unit testing is that a fair average quality of software can be attained without any unit testing. As a programmer, I would think such defense would fly at the teeth of the availability since 1987 of the IEEE Standard for Software Unit Testing.
It is fascinating to note the duality between contracts and programming. For the programmer who follows the tenets of design by contract, “a unit test provides a strict, written contract that the piece of code must satisfy…”
Disclaimer: I am not an expert in the law. The opinion expressed in this post merely represents my layman’s understanding of principles of contract law that might be applicable to technical debt situations.
Photo credit: tengtan (Flickr)
Most posts on technical debt in this blog emphasize the use of technical debt for strategic decision-making. In this post we will point out the use of technical debt in Agile teams at the tactical level. Specifically:
- Every two weeks; and/or,
- With every build.
Taking a close look at the various components of technical debt during the bi-weekly iteration review meeting provides plenty of useful information to the process. For example, you might look for insights to explain the following:
- Why is the unit test coverage figure going down?
- Any particular reason the cyclomatic complexity figure has gone up?
- Why is the figure of merit for design lower than the figure indicated in the previous iteration review meeting?
- Many others…
The emphasis in this mode of operation is on guiding the retrospection. Plenty of good and valid reasons might exist for any of the trends mentioned above. However, observing the trends helps you ask the right questions, focusing on what happened during the iteration just completed. In conjunction with technical debt data from previous iteration review meetings, trends that characterize your software development project become visible. You may or may not need to change anything you are doing, but you become very conscious of any “let’s not change” decision.
An intriguing practice suggested by colleague and friend Erik Huddleston is to make technical debt a criterion for the build to pass. The build automatically fails if the technical debt figure has gone up. Or, if you are very focused on a specific aspect of technical debt such as complexity, you fail the build whenever the complexity figure of merit rises above a certain pre-determined threshold. For example, you might fail a build in which the cyclomatic complexity per method has exceeded 4.
The power of failing a build whenever the technical debt arises is in utilizing the build as an exceptionally effective influence point. You instill the discipline of reducing technical debt one build at a time. If your team aggressively practices continuous integration, it will address technical debt issues multiple times a day. Instead of staring at a “mountain” of technical debt towards the release of a product, you chunk it to really small increments that get addressed “real-time.” For instance, a build that failed due to lack of comments can usually be fixed very quickly by the developer who “upset the apple cart” while the logic embedded in the code is fresh on his/her mind.
A good insight to the way the tactical use of technical debt techniques adds value is provided by the following observation: the technical debt data is observed from outside the Agile process. Hence, technical debt data is nicely suited to guiding the process. If you think of the software engineering fabric as a virtual stack, the technical debt “layer” could be considered a layer above the Agile process.
Technical debt is usually perceived as a measure of expediency. You borrow a little (time) with the intent of paying it back as soon as possible. To quote Ward Cunnigham:
Shipping first time code is like going into debt. A little debt speeds development so long as it is paid back promptly with a rewrite… I thought that rushing software out the door to get some experience with it was a good idea, but that of course, you would eventually go back and as you learned things about that software you would repay that loan by refactoring the program to reflect your experience as you acquired it.
As is often the case with financial debt, technical debt accrues with compound interest. Once it reaches a certain level (e.g. $1 per line of code) you stare at a difficult question:
Should I ship this code before reducing the accrued technical debt?!
The Figure below, taken from An Objective Measure of Code Quality by Mark Dixon, answers the question with respect to one important component of technical debt – cyclomatic complexity. Once complexity per source code file exceeds 74, the file is for most practical purposes guaranteed to contain errors. Some of the errors in such a file might be trivial. However, a 2007 study by Capers Jones indicates about a third of the errors found in released code are likely to be serious enough to stop an application from running or create erroneous outputs.
To answer the question cited above – Should You Ship This Software Before Reducing Technical Debt?! – examine both cost and risk for the number of error-prone files you are about to unleash:
- The economics of defect removal clearly favor early defect removal over late defect removal. The cost of removal grows exponentially as function of time.
- Brand risk should be first and foremost on your mind. If complexity figures higher than 74 per file are more of the norm than the exception, you are quite likely to tarnish your image due to poor quality.
If you decide to postpone the release date until the technical debt has been reduced, you can apply yourself to technical debt reduction in a biggest-bang-for-the-buck manner. The analysis of complexity can identify the hot spots in your code, giving you a de-facto roadmap you would be wise to follow.
Conversely, if you opt to ship the code without reducing technical debt, you might lose this degree of freedom to prioritize your “fix it” work. Customer situations and pressures might force you to attend to fixing modules that do not necessarily provide as much bang for the buck.
Postscript: Please note that the discussion in this post is strictly limited to intrinsic quality. It does not address at all extrinsic quality. In other words, reducing/eliminating technical debt does not guarantee that the customer will find the code valuable. I would suggest reading Beyond Scope, Schedule and Cost: Measuring Agile Performance in the Cutter Blog for a more detailed analysis of the distinction between the two.
Erratum: The figure above is actually taken from a blog post on the Mark Dixon paper cited in my post. See McCabe Cyclomatic Complexity: the proof is in the pudding. My apology for the error.
Source: 17th/21st Lancers c. 1922-1929 “THE FIGHTING SPIRIT!”
Agile consultants on a development project often start by helping the team construct a backlog. The task is sufficiently concrete to get all stakeholders (product management, project management, development, test, any others) on a collaborative track through the creation of a key artifact. The backlog establishes a base line for the tasks to be carried out in the project.
For a DevOps project, start by establishing the technical debt of the software to be released to operations. By so doing you build the foundations for collaboration between development and operations through shared data. In the DevOps context, the technical debt data form the basis for the creation and grooming of a unified backlog which includes various user stories from operations.
Apply the same approach when you are fortunate to be able to include folks from operations in the Agile team from the very beginning. You start with zero technical debt, but you track it on an ongoing basis and include the corresponding “fix-it” stories in the backlog as you accrue the debt. Running technical debt analytics on the source code every two weeks is a good practice to follow.
As the head of development, you might not be comfortable sharing technical debt data. This being the case, you are not ready for DevOps.
Much has been said recently about the success/failure rate of Agile projects. In particular, a debate arose around the success rate of Scrum vis-a-vis Kanban. For example, in a post entitled Some Day Kanban will fail 75% of the Time, colleague Jurgen Appelo states as follows:
Unfortunately, some people arguing against Scrum include these ScrumBut teams in their evaluations of the “high failure rate” of Scrum. They love quoting that “at least 75 percent of Scrum implementations fail.” And I think “Yes of course, 75% fails when that includes the teams that don’t understand what they’re doing.”
I would like to add one other “dimension” to the discussion: boundary conditions.
Any Agile initiative – Crystal, Scrum, Kanban, etc. – typically starts from a certain state of affairs of the code that has already been developed using a Waterfall method or no method at all. Even brand new projects produce code that invariably interacts with other software components that are already deployed, warts and everything. Pristine environments with no technical debt for the Agile initiative to deal with are rare.
Like it or not, the Agile initiative is saddled from the outset with a certain amount of technical debt. Code has been duplicated, rules violated, complexity ran amuck, etc. A typical enterprise software team starts with hundreds of thousands $$ in technical debt, if not millions. This debt needs to be “paid back.” Probably not over night, but certainly over a period of time. As illustrated by the following figure from Jim Highsmith, things get ugly if the debt is not paid back over an extended period of time.
The evaluation of success or failure of the Agile initiative needs to take technical debt into account. A team of 50 with an accrued technical debt of $100,000 has a much easier job on its hands transitioning to Agile than a similar size team starting with $1M in technical debt on its hands.
Whatever criteria you use to determine whether an Agile initiative has been successful, I would suggest the following boundary condition needs to be satisfied:
Technical debt at the end of the project/initiative must be significantly lower than technical debt at the start of the project.
Use the techniques outlined in Using Credit Limits to Constrain Development on Margin to calculate technical debt before and after. In addition to qualifying your Agile success, quantifying technical debt will do a lot towards improving the quality of your software.
Consider the situation described in Should You Invest in This Software:
- One of your portfolio companies expects to ship 500K lines of code in 6 months.
- The company asks for additional $2M to complete development and bring the product to market.
- Using technical debt quantification techniques you find the technical debt amounts to $1M.
You are not at all comfortable “paying back” the technical debt in addition to funding the requested $2M. You wonder whether you should start afresh instead of trying to complete and fix the code.
Photo credit: @muntz (Flickr)
A good starting point for assessing the fresh start option is Michael Mah‘s studies of software productivity. Based on the QSMA SLIM metrics database of more than 8,000 projects, Michael will probably bracket the productivity per person in a team consisting of product management, development and test at 10-15K lines of code per year. If you use the 15K lines of code per year figure for the purposes of the analysis, 500K lines of code could theoretically be delivered with an investment of about 33.3 (500/15) man years. Assuming average loaded cost of $99,000 per man-year, the software represents a programming effort of $3.3M. Not much is left if you deduct $3M ($2M+1M) from $3.3M…
Five considerations are of paramount importance in evaluating the start afresh option:
- The comparison above ($3.3M versus $3.0M) is timeless. It is a snapshot at a certain point in time which does not take into account the value of time. To factor in the time dimension, the analysis needs to get into value (as distinct from cost) considerations. See the note on Intrinsic Quality v. Extrinsic Quality at the bottom of this post.
- Your “mileage” may vary. For example, best in class teams in large software projects have reported productivity of 20K lines of code per team member per year. As another example, productivity in business applications is very different from productivity in real-time software.
- If you decide to start with a brand new team, remember Napoleon’s quip: “Soldiers have to eat soup together for a long time before they are ready to fight.”
- If you decide to start afresh with the same team plus some enhancements to the headcount, be mindful of ‘Mythical Man-Month‘ effects. Michael Mah’s studies of the BMC BPM projects indicate that such effects might not hold for proficient Agile teams. Hence, you might opt to go Agile if you plan to enhanced the team in an aggressive manner.
- Starting afresh is not an antidote to accruing technical debt (yet again…) over time. But, it gives you the opportunity to aggressively curtail technical debt by applying the techniques described in Using Credit Limits to Constrain Development on Margin. For example, you might run source code analytics every two weeks and go over the results in the bi-weekly demo.
As long as you are mindful of these five aspects (timeless analysis, your mileage may vary, Napoleon’s quip, mythical man-month effects and credit limits on technical debt), combining technical debt figures with productivity data is an effective way to consider the pros and cons of “fix it” versus starting afresh. The combination of the two simplifies a complex investment decision by reducing all considerations to a single common denominator – $$.
Note: This is not a discussion from a value perspective. The software, warts and everything, might (or might not) be valuable to the target customers. The reader is referred to Jim Highsmith‘s analysis of Intrinsic quality versus Extrinsic Quality in Agile Project Management: Creating Innovative Products. See the Cutter Blog post entitled Beyond Scope, Schedule and Cost: Measuring Agile Performance for a short summary of the distinction between the two.
Consider the following scenario: You are a venture capitalist. One of your portfolio companies has been working for a few years on a promising software application. Various surprises with respect to schedule and functionality have been sprung on you along the way. The company now asks for one last shot-in-the-arm in order to get the product out the door, market and sell it. Should you open your wallet one more time to fund this alleged last push?
It is a familiar scenario not only for venture capitalists, but for CEOs, CFOs, general managers and M&A executives. A renowned CEO once told me the following when I pushed my luck with respect to project funding:
Israel, I have a warehouse of software products that never generated a dime for me.
Believe me, this CEO was neither amused nor philosophical…
Code analysis techniques have progressed to the point that the answer to the software investment question for object-oriented code can to a certain extent be determined through quantifying technical debt. For example, assume the following circumstances:
- A company expects to ship 500K lines of code in 6 months.
- The company asks for additional $2M to complete development and make a significant resound in the market.
To assess the investment decision, apply the code analysis techniques described in Using Credit Limits to Constrain Development on Margin to quantify the technical debt. Assuming a debt of $2 per line of code has been identified, the overall technical debt amounts to $1M (2X500K).
The investment decision then is not an incremental $2M decision. It is actually a $3M ($2M+1M) investment decision when the technical debt is taken into account. The technical debt might not need to be paid overnight, but it will have to be paid back over a period of time. The team might not hire additional resources to reduce/eliminate the technical debt, but the team resources dedicated to reducing technical debt will not be available to carry out other assignments. Hence, the opportunity cost ($1M) is real, relevant and should be taken into account.
If you are hesitant to continue investing in this software/team, you stare at a tricky question:
- What will it take to start afresh?
If you decide to make the $3M investment, two operational questions pose themselves:
- How should work on reducing/eliminating technical debt be interleaved with other pressing work such as new functions and features?
- Given a $1M debt on 500K lines of code, can the company indeed ship as expected in 6 months?
We will address these three questions in forthcoming posts in The Agile Executive.
Buying (stocks) on margin is broadly recognized as a risky investment strategy. Funding long-term investments with short-term debt exposes the investor to margin calls as he/she might not be able to secure more financing when needed. The resultant margin call is never pleasant.
The accrual of technical debt in the course of aggressively developing functions and features is quite a similar phenomenon. The CTO is betting the functionality he/she is developing will pay off before the need to “pay back” the technical debt becomes imperative. The temptation to do so is particularly strong due to the lack of credit limits on technical debt. For all practical purposes the CTO is “developing on margin.”
In his comprehensive studies of the economics of software, Capers Jones has actually put a 3-5 year ceiling on the economical viability of developing on margin:
Indeed, the economic value of lagging applications is questionable after about three to five years. The degradation of initial structure and the increasing difficulty of making updates without “bad fixes” tends towards negative returns on investment (ROI) within a few years.
As the CEO leading a company, or the venture capitalist funding it, you can restrain development on margin by establishing credit limits. Use a combination of static code analysis with dynamic program analysis to calculate the amount of accrued technical debt in $$ terms. (An illustration of such calculation as well as a breakdown of the technical debt is given in the Sonar chart above). Set a limit (say $0.25 per line of code) on the amount of permitted technical debt. Once the limit is reached, developers are not allowed to continue developing new functionality – they have to first reduce (and hopefully eliminate) their technical debt.
A very simple “Lacmus test” is available to the CEO/VC until the code is instrumented and the analytics illustrated above generated. Ask your CTO about unit test coverage. If the coverage is low (say <30%), chances are the technical debt is high. Whether the CTO realizes it or not, low unit test coverage is a good indicator of technical debt of all kinds. Moreover, the investment required to develop a full-fledged suite of unit tests is often the largest component of the technical debt to be paid back.
But if people are already reluctant to run the things they have, on another platform they already have, on an operating system they are already familiar with (Linux on zSeries), how can you expect them to even look at cloud computing seriously? Every technological advancement requires people to adapt and change. Human nature is that we don’t like that, so it often requires a disaster to change our behavior. Or carefully planned steps to prove and convince people. However, nothing makes IT people more cautious than a hype. And that is how cloud is perceived. When the press, the analysts and the industry start writing about cloud as part of the IT solution, people will want to change. Now that it’s presented as the silver bullet to all IT problems, people are cautious to say the least.
Here is Annie Shum‘s thoughtful reply to Marcel’s comment:
Today, the Cloud era has only just begun. Despite lingering doubts, growing concerns and wide-spread confusion (especially separating media and vendor spun hype from reality), the IT industry generally views Cloud Computing as more appealing than traditional ASP /hosting or outsourcing/off-shoring. To technology-centric startups and nimble entrepreneurs, Cloud Computing enables them to punch above their weight class. By turning up-front CapEx into a more scalable and variable cost structure based on an on-demand pay-as-you-go model, Cloud Computing can provide a temporary, level playing field. Similarly, many budget-constrained and cash-strapped organizations also look to Cloud Computing for immediate (friction-free) access to “unlimited” computing resources. To wit: Cloud Computing may be considered as a utility-based alternative to an on-premises datacenter and allow an organization (notably cash-strapped startups) to “Think like a ‘big guy’. Pay like a ‘little guy’ ”.
Forward-thinking organizations should not lose sight of the vast potential of Cloud Computing that extends well beyond short-term economics. At its core, Cloud Computing is about enabling business agility and connectivity by abstracting computing infrastructure via a new set of flexible service delivery/deployment models. Harvard Business School Professor Andrew McAffee painted a “Cloudy” future for Corporate IT in his August 21, 2009 blog and cited a perceptive 1983 paper by Warren D. Devine, Jr. in the Journal of Economic History called “From Shafts to Wires: Historical Perspective on Electrification”. There are three key take-away messages that resonate with the current Cloud Computing paradigm shift. First: The real impact of the new technology was not apparent right away. Second: The transition to full utilization of the new technology will be long, but inevitable. Third: There will be detractors and skeptics about the new technology throughout the transition. Interestingly, telephone is another groundbreaking disruptive technology that might have faced similar skepticism in the beginning. Legend has it that a Western Union internal memo dated 1876 downplayed the viability of the telephone: “This ‘telephone’ has too many shortcomings to be seriously considered as a means of communications. The device is inherently of no value to us.”
The dominance of Cloud Computing as a computing platform, however, is far from a fait accompli. Nor will it ever be complete, a “one-size fits all” or a “big and overnight switch”. The shape of computing is constantly changing but it is always a blended and gradual transition, analogous to a modern city. While the cityscape continues to change, a complete “rip-and-replace” overhaul is rarely feasible or cost-effective. Instead, city planners generally preserve legacy structures although some of them are retrofitted with standards-based interfaces that enable them to connect to the shared infrastructure of the city. For example, the Paris city planners retrofitted Notre Dame with facilities such as electricity, water, and plumbing. Similarly, despite the passage of the last three computing paradigm shifts – first mainframe, next Client/Server and PCs, and then Web N-tier – they all co-exist and can be expected to continue in the future. Consider the following. Major shares of mission-critical business applications are running today on mainframe servers. Through application modernization, legacy applications – notably Cobol for example – now can operate in a Web 2.0 environment as well as deploy in the Cloud via the Amazon EC2 platform.
Cloud Computing can provide great appeal to a wide swath of organizations spanning startups, SMBs, ISVs, enterprise IT and government agencies. The most commonly cited benefits include the promise of avoiding CapEx and lowering TCO to on-demand elasticity, immediacy and ease of deployment, time to value, location independence and catalyzing innovation. However, there is no magic in the Cloud and it is certainly not a panacea for all IT woes. Some applications are not “Cloud-friendly”. While deploying applications in the Cloud can enable business agility incrementally, such deployment will not change the characteristics of the applications fundamentally to be highly scalable, flexible and automatically responsive to new business requirements. Realistically, one must recognize that the many of the challenging problems – security, data integration and service interoperability in particular – will persist and live on regardless of the computing delivery medium: Cloud, hosted or on-premises.
 “The author combed through the contemporaneous business and technology press to learn what ‘experts’ were saying as manufacturing switched over from steam to electrical power, a process that took about 50 years to complete.” – Andrew McAfee, September 21, 2009.
I will go one step further and add quality to Annie’s list of challenging problem. A crappy on-premises application will continue to be crappy in the cloud. An audit of the technical debt should be conducted before “clouding” an application. See Technical Debt on Your Balance Sheet for a recommendation on quantifying the results of the quality audit.
Everything is very simple in War, but the simplest thing is difficult. These difficulties accumulate and produce a friction which no man can imagine exactly who has not seen War… This enormous friction, which is not concentrated, as in mechanics, at a few points, is therefore everywhere brought into contact with chance, and thus incidents take place upon which it was impossible to calculate…
IMHO, this excerpt from On War applies exceptionally well to Agile roll-outs these days:
- Simplicity: The four principles of the Agile Manifesto are intuitively compelling. You could (and probably should) use them as the core definition of what Agile is all about. Likewise, you do not need more than a single hand-drawn matrix to illustrate how WIP limits in Kanban work. In contrast to various other terms used in development and IT – e.g. SOA – the conceptual power of Agile methods is easy to grasp.
- Friction: Assume you were building a company from scratch without any pre-conceived notions of the organizational constructs you would put in place. Assume as well that you were developing your organization with end-to-end Agile effectiveness in mind. You would probably devise a holistically integrated organization. For example, you might opt for an organizational design in which each level of the organization will include all functions relevant to Agile – R&D, IT, Marketing, Support, Sales etc. In other words, ideally you will not go for the traditional organizational design: a vertical R&D stove pipe, a vertical Marketing stove pipe, a vertical Sales stove pipe, etc. As in reality you are unlikely to get the charter to start with a clean sheet of paper, the friction arises in each and every point in which your end-to-end organizational design for Agile deviates from the traditional organizational constructs your company uses.
- Not concentrated: As Clausewitz points out, the friction of war is not mechanical friction – you can’t address it by pouring in a ‘organizational lubricant’ in just a few places. Flooding the whole organization with the lubricant is likely to create a morass in which all agility will be lost.
I recommend four principles to minimize the organizational friction of Agile, as follows:
- Acknowledge you accrued organizational debt: It is conceptually quite similar to accruing technical debt – various organizational decisions and compromises taken along the way were rushed to the extent that they leave much to be desired. The organizational constructs and practices that sprang out of these decisions need to be refactored.
- Carry out the organizational refactoring work from the outside to the inside: A truly holistic Agile design will incorporate customers and partners. Start with the way you will integrate them, thence apply this very same way to the integration of the organizational entities within your company.
- Build on the strengths of your core corporate culture: As pointed out by Drucker:
… culture is singularly persistent… changing [organizational] behavior works only if it can be based on the existing ‘culture’… [Drucker, 1991]
- Use modern corporate for the fundamental organizing principles. See Relational Networks, Strategic Advantage: New Challenges for Collaborative Control by Hagel, Brown and Jelinek for a good recent article on the subject.
Since the end of the Cold War, a fair amount of debate has taken place around the applicability of the friction of war principles to armed conflicts in the information age. The conclusion is of interest to both military personnel, Agile practitioners and IT professionals:
… while technological advances might temporarily mitigate general friction, they could neither eliminate it nor substantially reduce its potential magnitude.