Posts Tagged ‘Java’
What 108M Lines of Code Tell Us
Results of the first annual report on application quality have just been released by CAST. The company analyzed 108M lines of code in 288 applications from 75 companies in various industries. In addition to the ‘usual suspects’ – COBOL, C/C++, Java, .NET – CAST included Oracle 4GL and ABAP in the report.
The CAST report is quite important in shedding light on the code itself. As explained in various posts in this blog, this shift from the process to its output is of paramount importance. Proficiency in the software process is a bit elusive; the ‘proof of the pudding’ is in the output of the process. The ability to measure code quality enables effective governance of the software process. Moreover, Statistical Process Control methods can be applied to samples of technical debt readings. Such application is most helpful in striking a good balance in ‘stopping the line’ – neither too frequently nor too rarely.
According to CAST’s report, the average technical debt per line of code across all applications is $2.82. This figure, depressing as it might be, is reasonably consistent with quick eyeballing of Nemo. The figure is somewhat lower than the average technical debt figure reported recently by Cutter for a sample of the Cassandra code. (The difference is probably attributable to the difference in sample sizes between the two studies.) What the data means is that the average business application in the CAST study is saddled with over $1M in technical debt!
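As a rough back-of-the-envelope check (my own arithmetic, not a figure from the report): 108M lines spread over 288 applications averages roughly 375,000 lines per application, and 375,000 lines × $2.82 per line comes to about $1.06M – which is where the “over $1M per application” figure comes from.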
An intriguing finding in the CAST report is the impact of size on the quality of COBOL applications. This finding is demonstrated in Figure 1. It has been quite a while since I last saw such a dramatic demonstration of the correlation between size and quality (again, for COBOL applications in the CAST study).
Source: First Annual CAST Worldwide Application Software Quality Study – 2010
One other intriguing finding in the CAST study is that “application in government sector show poor changeability.” CAST hypothesizes that the poor changeability might be due to a higher level of outsourcing in the government sector compared to the private sector. As pointed out by Amy Thorne in a recent comment posted on The Agile Executive, it might also be attributable to the incentive system:
… since external developers often don’t maintain the code they write, they don’t have incentives to write code that is low in technical debt…
Congratulations to Vincent Delaroche, Dr. Bill Curtis, Lev Lesokhin and the rest of the CAST team. We as an industry need more studies like this!
Forrester on Managing Technical Debt
Forrester Research analysts Dave West and Tom Grant just published their report on Agile 2010. Here is the section in their report on managing technical debt:
Managing technical debt
Dave: The Agile community has faced a lot of hard questions about how a methodology that breaks development into short iterations can maintain a long-term view on issues like maintainability. Does Agile unintentionally increase the risk of technical debt? Israel Gat is leading some breakthrough thinking in the financial measures and ramifications of technical debt. This topic deserves the attention it’s beginning to receive, in part because of its ramifications for backlog management and architecture planning. Application development professionals should:
- Start capturing debt. Even if it is just by encouraging developers to note issues in code comments as they write, or by putting in place more formal peer review processes where debt is captured, it is important to document debt as it accumulates.
- Start measuring debt. Once captured, placing a value/cost on the debt created enables objective discussions. It also enables reporting that provides the organization with transparency into its growing debt. I believe this approach would enable application and product end-of-life discussions to be made earlier and with more accuracy.
- Adopt standard architectures and open source models. The more people who look at a piece of code, the more likely it is that debt will be reduced. The simple truth is that many people using the same software makes it simpler and less prone to debt.
Tom: Since the role I serve, the product manager in technology companies, sits on the fault line between business and technology, I’m really interested in where Israel Gat and others take this discussion. The era of piling up functionality in the hope that customers will be impressed with the size of the pile is clearly ending. What will replace it is still undetermined.
I will be responding to Tom’s good question in various posts along the way. For now I would just like to mention the tremendous importance of automated technical debt assessment. The typical velocity of formal code inspection is 100-200 lines of code per hour. Useful and important as formal code inspection is, there is only so much that can be inspected with our eyes, expertise and brains. The tools we use nowadays for code analysis apply to code bases of any size. Consequently, the assessment of quality (or lack thereof) shifts from the local to the global. It is no longer a matter of an arcane code metric in an esoteric Java class that precious few folks have ever heard of. Rather, it is a matter of overall quality across the portfolio of projects/products a company possesses. As mentioned in an earlier post, companies that capitalize software will sooner or later need to report technical debt as a line item on their balance sheet. It will simply be listed as a liability.
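To put the scale difference in perspective, consider some rough arithmetic of my own: manually inspecting the 108M lines from the CAST study discussed above at 150 lines per hour would take on the order of 720,000 person-hours. Portfolio-wide assessment is simply not feasible by inspection alone; automated analysis is the only practical way to get a global reading.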
From a governance perspective, technical debt techniques give us the opportunity to carry out consistent governance of the software process based on a single source of truth. The single source of truth is, of course, the code itself. The very same truth is reflected at every level in the organization. For the developer in the trenches the truth manifests itself as a blocking violation in a specific line of code. For the CFO it is the need to “pay back” $500K in the very same project. Different as the two views are, they are absolutely consistent. They merely differ in the level of aggregation.
Technical Debt Meets Continuous Deployment
As you would expect at a conference entitled Velocity, and at a follow-on devops day, speeding things up was an overarching theme. In the context of devops, the theme primarily manifested itself in lively discussions about the number of deploys per day. Comments such as the following reply to my post Ops Driven Dev were typical:
Conceptually, I move the whole business application configuration into the source code…
The theme that was missing for me in many of the presentations and discussions on the subject was the need to strike a balance between velocity and quality. The classical trade-off in process control is between production rate and product quality (and safety, but that aspect is beyond the scope of this post). IMHO this trade-off applies to software just as it applies to mechanical or chemical processes.
The heart of the “deploy early and often” strategy hailed by advocates of continuous deployment is moving from known deployment state to known deployment state. You don’t let the deployment evolve from one state to another before it has stabilized to a robust state. The power of this incremental deployment is in dealing with single-piece flow (or as small a number of pieces as possible) rather than dealing with the effects of multiple-piece flow. When the deployment increments are small enough, rollback, root cause analysis and recovery are relatively straightforward if a deployment turns sour. It is a similar concept to Agile development, extending continuous integration to continuous deployment.
While I am wholeheartedly behind this devops strategy, I believe it needs to be reinforced through rigorous quality criteria the code must satisfy prior to deployment. The most straightforward way to do so is by embedding technical debt criteria in the release/deploy process. For example (a code sketch of such a gate follows the list):
- The code will not be deployed unless the overall technical debt per line of code is lower than $2.
- To qualify for deployment, code duplication levels must be kept under 8%.
- Code whose Cyclomatic complexity per Java class is higher than 15 will not be accepted for deployment.
- 50% unit test coverage is the minimal level required for deployment.
- Many others…
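To make the idea concrete, here is a minimal sketch in Java of what such a pre-deployment gate might look like. The thresholds are the example figures above; the class and method names, and the metric readings themselves, are my own illustrative assumptions – in practice the readings would come from whatever code analysis tool feeds your release pipeline.

```java
// Minimal sketch of a pre-deployment quality gate based on the example criteria above.
// All names and sample values are illustrative assumptions, not any particular product's API.
public class DeploymentQualityGate {

    // Thresholds taken from the example criteria in the post
    private static final double MAX_DEBT_PER_LINE_USD        = 2.00;
    private static final double MAX_DUPLICATION_PERCENT      = 8.0;
    private static final int    MAX_CYCLOMATIC_COMPLEXITY    = 15;
    private static final double MIN_UNIT_TEST_COVERAGE_PCT   = 50.0;

    /** Returns true only if every criterion is satisfied; otherwise deployment is blocked. */
    public static boolean mayDeploy(double debtPerLineUsd,
                                    double duplicationPercent,
                                    int worstClassComplexity,
                                    double unitTestCoveragePercent) {
        return debtPerLineUsd        <  MAX_DEBT_PER_LINE_USD
            && duplicationPercent    <  MAX_DUPLICATION_PERCENT
            && worstClassComplexity  <= MAX_CYCLOMATIC_COMPLEXITY
            && unitTestCoveragePercent >= MIN_UNIT_TEST_COVERAGE_PCT;
    }

    public static void main(String[] args) {
        // Hypothetical readings for a candidate release
        boolean ok = mayDeploy(1.75, 6.5, 12, 63.0);
        System.out.println(ok ? "Deploy" : "Blocked: technical debt criteria not met");
    }
}
```

The design point is simply that the criteria are checked mechanically on every release candidate, so the decision to deploy never rests on anyone’s impression of the code.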
I have no doubt whatsoever that code which does not satisfy these criteria might be successfully deployed in the short term. The problem, however, is the cumulative effect over the long haul of successive deployments of code increments of inadequate quality. As Figure 1 demonstrates, a Java file with a Cyclomatic complexity of 38 has a 50% probability of being error-prone. If you do not stop it prior to deployment through technical debt criteria, it is likely to affect your customers and play havoc with your deployment quite a few times in the future. The fact that it did not do so during the first hour of deployment does not guarantee that such a file will be “well-behaved” in the future.
Figure 1: Error-proneness as a Function of Cyclomatic Complexity (Source: http://www.enerjy.com/blog/?p=198)
To attain satisfactory long-term quality and stability, you need both the right process and the right code. Continuous deployment is the “right process” if you have developed the deployment infrastructure to support it. The “right code” in this context is code whose technical debt levels are quantified and governed prior to deployment.
Using 3σ Control Limits in Software Engineering
Source: Wikipedia; Control Chart
The July/August 2010 issue of IEEE Software features an article entitled “Monitoring Software Quality Evolution for Defects” by Hongyu Zhang and Sunghun Kim. The article is of interest to the software developer/tester/manager in quite a few ways. In particular, the authors report on their successful use of 3σ control limits in c-charts used to plot defects in software projects.
To put things in perspective, consider my recent assessment of the results accomplished by Quick Solutions (QSI) in two of their projects:
One to one-and-a-half standard deviations better than the mean might not seem like much to six-sigma black belts. However, in the context of the typical results we see in the software industry, the QSI results are outstanding. I have not done the exact math on whether those results are superior to 95%, 97% or 98% of the software projects in Michael Mah‘s QSMA database, as the exact figure hardly matters when you achieve this level of excellence.
A complementary perspective is provided by Capers Jones in Estimating Software Costs: Bringing Realism to Estimating:
Another way of looking at six-sigma in a software context would be to achieve a defect-removal efficiency level of about 99.9999 percent. Since the average defect-removal efficiency level in the United States is only about 85 percent, and less than one project in 1000 has ever topped 98 percent, it can be seen that actual six-sigma results are beyond the current state of the art.
The setting of control limits is, of course, quite a different thing from the actual defect-removal efficiency numbers reported by Jones for the US and the very low number of defects reported by Mah for QSI. Having said that, driving a continuous improvement process by using 3σ control limits is the best recipe for eventually reaching six-sigma results. For example, one could drive the development process by using Cyclomatic complexity per Java class as the quality characteristic in the figure at the top of this post. In this figure, a Cyclomatic complexity reading higher than 10.860 (the Upper Control Limit) will indicate a need to “stop the line” and attend to reducing complexity before resuming work on functions and features.
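For readers who want to try this on their own data, here is a minimal sketch in Java of computing 3σ control limits from a series of readings and flagging an out-of-control point. The readings and class name are my own illustrative assumptions, and I use the general mean ± 3σ form; for a textbook c-chart of defect counts, as in the Zhang and Kim article, the limits would instead be c̄ ± 3√c̄ under the Poisson assumption.

```java
import java.util.Arrays;

// Sketch of a 3-sigma "stop the line" check over a quality characteristic
// (e.g. Cyclomatic complexity per Java class, or defects per release).
public class ControlLimits {

    public static void main(String[] args) {
        // Hypothetical readings gathered over successive builds
        double[] readings = {7, 9, 6, 8, 10, 7, 9, 8, 11, 7};

        double mean = Arrays.stream(readings).average().orElse(0.0);
        double variance = Arrays.stream(readings)
                                .map(r -> (r - mean) * (r - mean))
                                .average().orElse(0.0);
        double sigma = Math.sqrt(variance);

        double ucl = mean + 3 * sigma;              // Upper Control Limit
        double lcl = Math.max(0.0, mean - 3 * sigma); // Lower Control Limit, floored at 0

        System.out.printf("Mean=%.3f  UCL=%.3f  LCL=%.3f%n", mean, ucl, lcl);

        // Any new reading above the UCL signals a need to "stop the line"
        double newReading = 14;
        if (newReading > ucl) {
            System.out.println("Out of control: reduce complexity before resuming feature work");
        }
    }
}
```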
Coming on the heels of the impressive results reported by David Joyce on the use of statistical process control (SPC) techniques by the BBC, the article by Zhang and Kim is another encouraging report on the successful application of manufacturing techniques to software (and to knowledge work in general). I am not at liberty to quote from this just-published IEEE article, but here is the abstract:
Quality control charts, especially c-charts, can help monitor software quality evolution for defects over time. c-charts of the Eclipse and Gnome systems showed that for systems experiencing active maintenance and updates, quality evolution is complicated and dynamic. The authors identify six quality evolution patterns and describe their implications. Quality assurance teams can use c-charts and patterns to monitor quality evolution and prioritize their efforts.