The Agile Executive

Making Agile Work

Posts Tagged ‘IT Operations’

Through the Prism of IT Transformation for Tomorrow’s Enterprise Datacenters: Interview with Annie Shum

As indicated in our recent post “Extending the Scope of the Agile Executive”, Cote and I have recently reached the conclusion that The Agile Executive needs to cover structural changes in order to give a forward-looking view to its readers. We start the coverage of structural changes that are relevant to Agile with an interview with Annie Shum, VP of Advanced Technology, Amdocs Corp.

We cover a broad panorama in this interview with Annie. Here are some items that may be of special interest to the reader who focuses on Agile methods, processes and governance in a broad sense – from programming to IT operations and anything in between:

  • Unleashing disruptive transformations
  • Supply and demand – the two sides of the IT “coin”
  • Open source software in general and OpenStack in particular
  • The impact of social networking and other Web 2.0 tools
  • Three billion downloads and counting…
  • Finding the “right” balance between hierarchical command-control and bottom-up empowerment
  • “Self-service” IT service delivery/deployment
  • Forthcoming changes in IT system administration and the rise of DevOps
  • How to gain freedom from a variety of low-level operational tasks and controls of physical infrastructure
  • Provisioning and over provisioning
  • Many others…

Annie answers all questions with data, insights and passion. No surprises there…

Israel: Nancy Foy immortalized the monolithic International Business Machines Corporation in her classic “The Sun Never Sets on IBM.” Much has changed, of course, since the book was published in the 70’s. For quite a few years IBM has been deconstructing its business design, its organizational structure and both internal and external processes. By some accounts, prior to Gerstner IBM had even been contemplating reforming itself as a bunch of independent companies. The contrast to IBM’s announcement a couple of weeks ago about putting both software and hardware under one hand is noteworthy. What do you make of it, Annie? Is this a new development? Or is it a blast from the past?

Annie: Interesting question but I would be remiss if I failed to point out that I don’t have a crystal ball or the expertise to predict reliably whether this will be an isolated case or a trend-setter.  Although the arguably radical IBM organizational restructuring in management is newsworthy, I am not especially interested in looking at it purely from the perspective of vendor management structure because it is merely a means to an end.  What intrigues me is the rationale behind this key announcement.  In particular, I am interested in envisioning the more profound and potentially game-changing, if not disruptive, transformation that IBM hopes to unleash by adopting this bold organizational restructuring with likely (significant) risks.

To better understand this new undertaking, I think it would be instructive to analyze it from the supply side as well as the demand side.  So let’s break up the narrative: first by looking at the supply side, namely the IT service providers/system vendors, followed by the second half of the narrative, the customers/consumers.

Israel: I am intrigued by your supply side/demand side approach. Please elaborate.

Annie: To understand the supply side, consider the three major IT vendor announcements made during the week of July 19, 2010, not as three disparate events but in context. By connecting the dots among them, we can uncover some very interesting insights into emerging trends of the IT industry in general and actionable guidelines for tomorrow’s enterprise datacenters in particular.

Let’s begin with the May 2010 report from Saugatuck Research titled “Gorillas In the Cloud: Applying Saugatuck’s ‘Master Brand’ Model to Cloud IT”, in which “Master Brands” refers to those vendors (and service providers) that dominate and influence IT marketplaces, technologies and/or user accounts. This May report sets the stage for the latest Saugatuck research alert, titled “One-Stop Shopping – Major Vendors Acquire Assets for the Cloud”. This research alert describes how increasing numbers of major vendors are striving to become the “sole source for offerings up and down the IT EcoStack™ targeting the Cloud.”

As if on cue, IBM released two major announcements just this past week. First, on July 20, 2010, InformationWeek reported that IBM plans[i] to combine hardware and software to spur the company’s efforts to deliver bundled, plug-and-play systems. According to Sam Palmisano, the core strategy pivots on producing tightly bundled computer systems that “feature chips, middleware, and business software designed from the ground up to support Cloud Computing and other new-wave IT architectures.”

To some long-standing industry observers, this strategy may appear to be “back to the future”, with IBM simply returning to its roots after a prolonged hiatus from its original business model.  There is, however, an important historical footnote. More than five decades ago, due to antitrust concerns stemming from the bundling of hardware and software in the IBM mainframe systems, the US government took legal action leading to IBM’s acceptance of the 1956 Consent Decree.

Today, unlike the past, IBM no longer dominates the computer systems market. In fact, there is a growing trend towards bundled systems, mainly by the “Master Brands”, to “mask” complexity for customers as they embark on implementing complex IT endeavors including key programs such as datacenter consolidation, server/storage virtualization, predictive analytics, SOA/BPM, Cloud Computing (public, private or hybrid), and Green IT. For example, Oracle acquired Sun Microsystems in 2009 for $7.4 billion to support what InformationWeek described as Larry Ellison’s “applications-to-disk” strategy, while HP and Microsoft earlier this year unveiled a multi-million dollar initiative under which they will jointly engineer servers and software.

It is likely that the timing of the July 19 IBM announcement was influenced (perhaps even pressured) by its rivals taking a similar approach to address evolving enterprise datacenters. To expedite this strategy to deliver bundled “plug-and-play” systems, IBM first announced sweeping organizational restructuring to foster internal collaboration and harness synergies across products and LOBs. Clearly, the biggest change is the consolidation of key hardware and software divisions under the watch of a single executive, Steve Mills, IBM’s longtime software chief.

Next, just three days later, on the heels of this organizational makeover, IBM made another major announcement on July 22, 2010 amidst much fanfare and hype. Presenting the vision of a new “Dimension in Computing” designed to control multi-platform datacenter operational costs and significantly reduce complexity, IBM announced a new hybrid “system of systems” platform that unifies IT for efficient service delivery and large-scale datacenter simplification. Dubbed a “datacenter in a box” or a “cloud in a box[ii]”, it integrates the new super powerful and energy-efficient mainframe zEnterprise 196, running z/OS, and the zEnterprise BladeCenter Extension (zBX), running Linux and AIX. By extending the System z qualities of service (spanning security, scalability, availability, efficiency and virtualization) to enable Cloud readiness and optimized service delivery for enterprises, IBM likely is promoting its strength in building private Clouds for large enterprises.  See the following two slides from the IBM July 22 announcement.


Israel: So it looks like the IT industry is heading towards more “power” consolidation of mega vendors or, as you referenced earlier, “Master Brands”. Is this a fait accompli? If so, is it a matter of channeling demand toward one-stop-shopping irrespective of integration realities underneath? Isn’t there a danger to this trend?

Annie: Despite these high profile announcements by the major vendors, it is far from fait accompli. And yes, your comments are only too real especially for those who have lived through the era of monopolies and antitrust concerns. Frankly, many people believe that such a trend may be a clear threat in the presently emerging era.  While I don’t want to downplay the risk and potential damage of antitrust abuses, I believe there are some factors at work here to counteract, or at least limit, unchecked monopolies in the IT industry.

In this Internet age with the rise of “Consumerization of IT”, catalyzed by the nearly ubiquitous access to social networking and other Web 2.0 tools, IT has permeated almost every market sector in our society. The set of functions and services supported and enabled by IT has become exceedingly vast, diverse and complex such that no single business model or supplier is in a position to dominate, let alone destroy all others.  The era when a handful of proprietary stalwart vendors dominated the IT industry is all but over. Just this past decade, we have witnessed the meteoric rise of Google, Facebook and more recently, Twitter. A growing formidable force, namely the open source software and its bottom-up self-organizing community, powers as well as empowers most if not all of the Web 2.0 companies. At this point in our discussion, it is apt to segue to the third vendor announcement during the week of July 19, 2010.

On July 19, Cloud service provider RackSpace, together with NASA, announced the sponsorship of OpenStack, an open source IaaS Cloud platform. The announcement included a diverse group of computer system providers from across the technology industry, such as CITRIX, DELL, NTT DATA, RIGHTSCALE and others, to drive a deployable, totally open cloud solution.  According to their mission statement, OpenStack is designed to foster the emergence of technology standards and Cloud interoperability. One of the primary objectives is to help enterprises avoid vendor lock-in.

Israel: This appears to be a very timely announcement given that “vendor lock-in” is one of the top concerns confronting enterprises as they evaluate and plan for the transition to Cloud Computing. Having said that, are we not back to “square zero” – striking a balance between openness and “one-stop shopping” tight integration?

Annie: Yes indeed. Although some industry observers describe the issue as “vendor lock-in”, others see it as a broader issue, describing it as the “challenge/difficulty of bringing back in-house” or the “lack of interoperability standards for seamless portability”.  For example, in the 2009 Cloud Computing survey conducted by IDC, over 80% of those surveyed rated this issue, under both labels, as very important.  Incidentally, I should point out that “vendor lock-in” is neither a new issue nor one unique to Cloud Computing. On the contrary, it is a long-standing “problem” going all the way back to the early days of mainframe computing and culminating with the government versus IBM antitrust lawsuit in the ‘50s, as we discussed earlier.

Interestingly, there are many forms and variants of vendor lock-in and they are not all equal. For example, many industry observers have been unhappy with the proprietary development and delivery model that Apple imposed on the iPod/iPhone/iPad.  Although the risk of “vendor lock-in” may be real, any negative impact on the large, loyal and ever-growing Apple customer base seems minimal.  Just think about the runaway success of the App Store. It is heavily “curated” by Apple. Yet since its opening on July 10, 2008, the App Store has grown to more than one hundred thousand available apps, passed two billion application downloads (as of November 2009), and reached three billion downloads by January 2010. Steve Jobs hailed this as a landmark event: “Three billion applications downloaded in less than 18 months – this is like nothing we’ve ever seen before.”

Sorry we digressed. So let’s resume our discussion of the recent major announcements.  In a nutshell, the OpenStack announcement attempts to address the issue directly by allowing any organization to create and offer Cloud Computing capabilities using open source software freely available under the Apache 2.0 license running on standard hardware.

Now this gets interesting: a tale of two diametrically opposite strategies.  On one hand, we have IBM announcing the high performance zEnterprise 196 as a hybrid integrated multi-architecture “datacenter/Cloud in a box”. The goal is to mask complexity and maximize efficiency in infrastructure (management/admin cost savings of up to 70%) and energy consumption (up to 82% reduction in energy usage) with a bundled technology stack that integrates multiple platforms, infrastructure and management (spanning service, platform and hardware).  A principal concern of this proprietary single vendor approach is the risk of “vendor lock-in”.

On the other hand, the OpenStack is “DIY” based on an open source development platform. The goal of OpenStack is the following: “Anyone can run it, build on it, or submit changes back to the project. We strongly believe that an open development model is the only way to foster badly-needed cloud standards, remove the fear of proprietary lock-in for cloud customers, and create a large ecosystem that spans Cloud providers.”  The cons/challenges of this approach are probably similar to conventional “DIY” open source projects.

I should clarify that this dichotomy may be seen as an entire spectrum. As noted here IBM, VMware, etc on one hand, and RackSpace, Eucalyptus, etc on the other hand, exemplify the two end-points bookending the dichotomy spectrum. Along the spectrum, there are a growing number of intermediate options/offerings (with a rising number of variations) by a wide variety of IT Cloud service vendors: stalwart vendors including Amazon, Microsoft, Google, Salesforce.com,  etc as well as young companies and startups such as RackSpace, RightScale, Boomi, Canonical, Cloudkick, Opscode, etc.

Israel: Is this shaping up to be a battle between two diametrically opposite strategies? And if so, which one will come out on top? Or is it a draw?

Annie: To me, a similar dichotomy has already existed in the IT industry. For example, think Apple versus Google. Consider the modus operandi of the Apple core business model (“closed or at least closely curated” to optimize user experience and quality) versus that of Google’s (“open standards/APIs” to maximize opportunities for 3rd party development participation).

Insofar as whether bundled systems or “Cloud in a box” versus open source “DIY” will be the ultimate winner, I have to defer to other industry observers with more experience such as you.  Perhaps in our future Q&A meet-up, I am interested to hear your views on how the competition may be settled eventually.  However, while we all await the uncertain outcome, IT practitioners should be mindful that the dichotomy spectrum would have profound implications not only on the supply side but also on the demand side.  In particular, because the offerings from the dichotomy spectrum will be rapidly evolving, the fluidity will very likely confound and confuse users/consumers as they attempt to balance a convoluted set of different tradeoffs. Many  enterprise IT practitioners will be under pressure to make difficult and ambiguous choices by picking one or more evolving offerings over other evolving offerings for building the foundation of tomorrow’s enterprise datacenters in the Cloud era.

Israel: Good timing.  So far in our Q&A today, you have focused on the first half of the narrative, namely the supply side. Now let’s continue to part 2 of your narrative, namely the demand side.

Annie:  Earlier, I discussed the supply side by connecting the dots among three key announcements during the week of July 19. Now similarly for the demand side, I will suggest a few more dots that I believe should be connected. Specifically, I suggest connecting the following trends:

  • The growing complexities and inefficiencies of on-premises enterprise datacenters;
  • The inevitable rise of alternative delivery and deployment models for IT services; and
  • The advent of Cloud Computing:  a long-standing vision whose time may finally have arrived.

Several months ago, I published a guest post on your blog site entitled “The Urgency of Now.”  You might recall that I began the post with some sobering and perhaps even alarming statistics about the gross inefficiency of traditional on-premises enterprise datacenters.  Here again is the Enterprise Datacenter Index at-a-glance:

In summary, enterprise IT faces a “crisis of staggering complexity” and IT infrastructure is reaching a “breaking point” marked by such salient factors/trends[iii] as the following:

  • 1.5 X: Information explosion driving over fifty percent yearly growth in storage shipments;
  • 85% idle: Over-provisioned waste, primarily in distributed computing environments; e.g., typical computing resources (capacity) sit idle more than eighty percent of the time on average;
  • $40 Billion or 3.5% of sales: The retail industry’s annual loss due to (supply) value chain inefficiencies;
  • 60-70% IT spending on maintenance/overhead: The overall IT spending profile shows that the lion’s share of IT expenses goes towards overhead and maintenance. As much as seventy cents per dollar is spent on maintaining IT infrastructures at the expense of adding new capabilities.

Now consider the following scenario. Suppose enterprise IT could choose an alternative set of “self-service” IT service delivery/deployment models that would be orthogonal to traditional hierarchical command-and-control Cap-ex based datacenters.  Instead of owning and tightly controlling its own private internal datacenter and purchasing capital resources up front, an organization would “rent”, on demand, pooled computing resources hosted on the provider’s multi-tenant environment. The Internet would serve as the global infrastructure “grid” and all services would be delivered through Web APIs.  In lieu of having a dedicated IT staff administering IT operations, users could avoid lengthy red-tape delays and directly and immediately provision and manage computing capacity as “self-service IT”. In addition, instead of formal contracts and protracted delays in hardware procurement, an organization would pay for access at any time to “unlimited” computing capacity simply with a credit card.

Because there would not be formal contracts imposing preset time commitments, both entry and exit would be friction-free. In this way, an organization could accelerate time-to-value/market and help to catalyze experimentation and innovative endeavor. Furthermore, CIOs of enterprise IT could avoid or mitigate the lose-lose dilemma because they would not be restricted to choosing either a policy that leads to “waste due to over-provisioning” using peak usage estimates for capacity planning or a policy that can incur “risk due to under-provisioning” using non-peak estimates. Ideally, IT staff would “plan capacity based on typical usage” while confident that it could “scale dynamically at peak times” to maintain performance and SLAs. Simply put, the primary objectives for today’s organizations are not just about increasing speed and efficiency for back office automation. Rather, they also are about increasing speed and flexibility to adapt to changes by yielding judicious control to providers for on-demand utility computing services off-premises.
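The provisioning dilemma described above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming an invented 24-hour demand curve measured in abstract "server units"; the numbers, the use of the median as "typical usage", and all names are illustrative assumptions rather than data from the interview.

```python
from statistics import median

# Hypothetical hourly demand for one day, in abstract server units.
demand = [20, 18, 15, 15, 16, 22, 35, 50, 70, 85, 90, 100,
          95, 88, 80, 75, 60, 55, 45, 40, 35, 30, 25, 22]


def evaluate(capacity_per_hour, demand):
    """Return (idle_units, unmet_units) summed over all hours."""
    idle = sum(max(c - d, 0) for c, d in zip(capacity_per_hour, demand))
    unmet = sum(max(d - c, 0) for c, d in zip(capacity_per_hour, demand))
    return idle, unmet


hours = len(demand)
typical = median(demand)  # assumption: "typical usage" means the median hour

# Policy 1: fixed capacity sized for the peak (waste from over-provisioning).
peak_policy = [max(demand)] * hours
# Policy 2: fixed capacity sized for typical load (risk from under-provisioning).
typical_policy = [typical] * hours
# Policy 3: typical baseline that scales up dynamically when demand exceeds it.
elastic_policy = [max(typical, d) for d in demand]

for name, policy in [("peak", peak_policy),
                     ("typical", typical_policy),
                     ("elastic", elastic_policy)]:
    idle, unmet = evaluate(policy, demand)
    print(f"{name:8s} idle={idle:7.1f} unmet={unmet:7.1f}")
```

Under these assumptions the peak-sized policy never falls short but carries the most idle capacity, the typical-sized policy leaves demand unmet at busy hours, and the elastic policy avoids unmet demand with far less idle capacity.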

Conceptually, this scenario is an overall vision of Cloud Computing. With the advent of Cloud Computing, the vision of “Computing as a Utility” is beginning to take shape. Since the early days of time-sharing computing, that vision has taken a quantum leap towards reality. One of the earliest references to Utility Computing occurred in 1961 at the MIT Centennial. On that occasion, John McCarthy presented his vision of computing organized as a public utility. Just as the telephone system had developed into a major industry, Professor McCarthy envisioned that “Computing as a Utility” could one day become the basis of a new and important public industry.

Rooted in the long-standing vision and hope for “Computing as a Utility” that began more than half a century ago, the genesis of Cloud Computing goes back a long way. To a growing number of industry observers, it is an old idea whose time may have finally arrived when, in 2006, Amazon began offering Cloud infrastructure services to the public as a utility. Despite initial skepticism, it was a watershed event in the quest of Utility Computing and helped to usher in the first wave of industrial-strength commercial Cloud Computing offerings.

Israel: To wrap up our discussion today, can you leave us with a few thoughts about some of the implications of Cloud Computing as enterprises begin their transition to the Cloud?

Annie: Eric Schmidt, Google’s Chairman and Chief Executive has stated that Cloud computing will be “the defining technological shift of our Generation”. However, the media and vendor-spun hype (at times referred to as “cloud-washing”) around this topic has created an unprecedented level of confusion. Today, unabated sound and fury surrounding the Cloud Computing buzz continues and indeed, increases. Nevertheless, it is all but certain that there will be no “big or easy switch” for enterprise IT to transition overnight from running applications on premises to the Cloud. Because the shift is not an “all-or-nothing” or a “one size fits all” endeavor, stakeholders in enterprises should take a judicious measured approach to balance different tradeoffs.

To sustain the transition of enterprise IT to the Cloud will require not only technological advances but also new business models, new forms of IT organizational management structure and perhaps even new IT roles.  One of the “inconvenient” truths about embracing new user-empowerment technology trends and business models is the slippery slope of finding the “right” balance between hierarchical command-control and bottom-up empowerment. The harm (ineffectiveness and counter-productivity) of too much top-down control can be matched or even surpassed by the dangers of too little control. User empowerment without reasonable constraints can lead to anarchy and chaos. A new form of organizational governance is clearly required to avoid these problems. Striking a balance between planned orderliness and new emergent forces has been a challenging dynamic since the dawn of civilization.

Many of the principles that have been refined over the millennia will have direct applicability for governing tomorrow’s world of “self-service” computing in the Cloud. Clearly, security and governance related policies will come under new scrutiny and will need to be shaped or changed accordingly. However, an organization should not overlook the human aspects and the cultural impact on the IT system administration personnel.  For example, resistance to sweeping changes, driven by a fear of losing control and stress over the prospect of losing employment, can be one of the more profound ramifications that often fall under the management radar.

Cloud Computing likely will change the status quo of IT system administration and, perhaps in the future, could obviate the need for some traditional IT system skills. Cloud Computing, however, is also opening new opportunities for the technical IT community and enterprise IT personnel. There is a growing consensus that, as Cloud Computing evolves, the need for more business-minded IT staff will accelerate. Specifically, there likely will be an urgent need for people “with broader business skills who can manage multiple supplier relationships.”  Freed from a variety of low-level operational tasks and controls of physical infrastructure via Cloud Computing, enterprise IT has the opportunity to promote system administration staff to higher-level decision makers as IT service facilitators and SLA contracts managers. In the near future, many traditional hierarchical command-control system operators may pursue a wider array of IT professional opportunities spanning the roles of enterprise architects; capacity planning; budget planning; performance assurance; and data, security, governance gatekeepers.

Israel: This really resonates with what I see happening in many of my consulting engagements. Successful companies waste an immense amount of capital, energy and management attention on migrating from yesterday’s datacenter to today’s or tomorrow’s datacenter. When exposed to the pains of such migrations, I am always reminded of Peter Drucker’s quip “Companies make shoes!” It is beyond me why companies that make shoes, cars, drugs or financial instruments would want to be prisoners of their own success, hopping over from one data center to a bigger data center every few years.

Annie: Thanks for Peter Drucker’s quip. I am going to borrow it for my future use.

Israel: Annie, I can’t thank you enough for sharing your insights with us. You really connect the dots!

Endnotes:

[i] Based on the assumption that IT infrastructure performance can be greatly enhanced when each element is designed and brought to market as a component of a tightly integrated, optimized system.

[ii] With this slogan, IBM is promoting the hybrid zEnterprise 196 integrating multiple architectures and OS in a “box” as the one stop shopping ready-made private Cloud for enterprises.

[iii] Source: IBM, The Open Group Conference, July 22, 2009.

Schedule Constraints in the Devops Triangle

Last week’s post “The Devops Triangle” demonstrated the extension of Jim Highsmith’s Agile Triangle to devops. The extension relied on adding compliance to the three traditional constraints of software development: scope, schedule, cost. A graphical representation of this extension is given in Figure 1.

Figure 1: Compliance as the Fourth Constraint in Devops Projects

This blog post examines how time/schedule should be governed in the devops context. It does so by building on the concluding observation in the previous post:

The Devops Triangle and the corresponding Tradeoff Matrix demonstrate how governance a la Agile can be extended to devops projects as far as compliance goes. The proposed governance framework however is incomplete in the following sense: schedule in devops projects can be a much more granular and stringent constraint than schedule in “dev only” projects.

For the schedule constraint in devops, I propose a schedule set.  It consists of  four components:

  • Lead Time or Engineering Time
  • Time to change
  • Time to deploy
  • Time to roll back

Lead Time/Engineering Time: These are customary metrics used in Kanban software development, as demonstrated in Figure 3.

Figure 3: The Engineering Time Metric Used by the BBC (David Joyce in the LSSC10 Conference)

Time to change: The amount of time it takes for the various stakeholders (e.g., dev, test, ops, customer support) to review the code to be deployed, approve its deployment and assign a time window for the deployment.

Time to deploy: The amount of time from (metaphorically speaking) pushing the Deploy “button” to completion of deployment.

Time to roll back: The amount of time to undo a deployment. (Rigorous though the engineering practices and IT processes might be, the time to roll back a deployment can’t be ignored – it is a critical risk parameter).

A graphical representation of these four schedule metrics together with the Devops Triangle is given in the figure below:

Figure 4: The Devops Triangle with a Schedule Set

Using hours as the common unit of measure, a typical schedule set could be {100, 48, 3, 2}. In this hypothetical example, it takes a little over 4 days to carry out the development of the code increment; 2 days to get approval for the change; 3 hours to deploy the code; and, 2 hours to roll back.
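As a minimal sketch, the schedule set can be represented as a small record with the four components in hours. The class name, field names and the helper that sums the first three components are assumptions made for illustration; the post itself only defines the four metrics and the example values.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ScheduleSet:
    """The four schedule components described in the post, all in hours."""
    lead_time: float
    time_to_change: float
    time_to_deploy: float
    time_to_roll_back: float

    def hours_to_production(self) -> float:
        # Assumption: development, approval and deployment run back to back.
        # Roll back is excluded because it only occurs when a deployment fails.
        return self.lead_time + self.time_to_change + self.time_to_deploy

    def in_days(self) -> dict:
        """Express every component in 24-hour days."""
        return {name: hours / 24 for name, hours in asdict(self).items()}


# The hypothetical schedule set from the text: {100, 48, 3, 2}.
example = ScheduleSet(lead_time=100, time_to_change=48,
                      time_to_deploy=3, time_to_roll_back=2)

print(example.hours_to_production())
for name, days in example.in_days().items():
    print(f"{name}: {days:.2f} days")
```

Keeping roll back as a separate field, rather than folding it into a single total, reflects its role in the post as a risk parameter instead of a step on the normal path to production.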

Whatever your specific schedule numbers might be, it is highly recommended you apply value stream mapping (see Figure 5 below) to your schedule set. Based on the findings of the value stream mapping, apply statistical process control methods like those illustrated in Figure 3 to continuously improve both the mean and the variance of the four schedule components.

Figure 5: An Example of Value Stream Mapping (Source: Wikipedia entry on the subject)
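One common statistical process control technique that could be applied to a schedule component is the individuals (XmR) control chart, which sets control limits at the mean plus or minus 2.66 times the average moving range. The sketch below assumes that technique and uses invented time-to-deploy observations; it is not necessarily the method shown in the BBC figure.

```python
from statistics import mean


def xmr_limits(samples):
    """Individuals (XmR) chart limits: centre line +/- 2.66 * mean moving range."""
    centre = mean(samples)
    moving_ranges = [abs(b - a) for a, b in zip(samples, samples[1:])]
    mr_bar = mean(moving_ranges)
    lower = max(centre - 2.66 * mr_bar, 0.0)  # a duration cannot be negative
    upper = centre + 2.66 * mr_bar
    return lower, centre, upper


def out_of_control(samples, baseline):
    """Return the samples that fall outside the limits derived from the baseline."""
    lower, _, upper = xmr_limits(baseline)
    return [x for x in samples if x < lower or x > upper]


# Hypothetical time-to-deploy observations, in hours.
baseline = [3.0, 2.8, 3.2, 3.1, 2.9, 3.0, 3.3, 2.7]
recent = [3.1, 2.9, 6.5]

lower, centre, upper = xmr_limits(baseline)
print(f"limits: {lower:.2f} .. {upper:.2f} (centre {centre:.2f})")
print("signals:", out_of_control(recent, baseline))
```

A point outside the limits signals a special cause worth investigating, while narrowing the limits over time corresponds to reducing the variance of that schedule component.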

The Devops Triangle

The Agile Triangle was introduced by Jim Highsmith as an antidote to the Iron Triangle. Instead of balancing development between cost, schedule and scope, the Agile Triangle strives to strike a balance between value, quality and constraints:

Figure 1 – The Agile Triangle (based on Figure 1-3 in Agile Project Management: Creating Innovative Products.)

Consider the Agile Triangle in the context of devops. Value, quality and constraints apply to IT operations as meaningfully as they apply to software development. IT can go beyond cost, schedule and scope to focus on value and quality just as the Agile software development team does. Between development and operations the specific tasks to be carried out change, but the principles embodied in the triangle remain invariant.

In addition to cost, schedule and scope, devops projects must cope with another constraint: compliance. For example, a bank that implements a ‘follow the sun’ strategy with respect to trading must finish reconciling transactions that took place in London before the start of trading on Wall Street. From the bank’s point of view, its IT department needs to be mindful of four constraints: compliance, cost, schedule and scope. This view is represented in Figure 2 below.

Figure 2 – The Devops Triangle

Balancing the four constraints – compliance, cost, schedule, and scope – is not a trivial task. However, just like the Agile Triangle, the Tradeoff Matrix used in Agile software development applies to IT. In its software development variant, the Tradeoff matrix is an effective tool to decide between conflicting constraints, as follows:

Table 1 – Tradeoff Matrix (based on Table 6-1 in Agile Project Management: Creating Innovative Products.)

For devops, the matrix is extended to include a compliance row and a Reluctantly Accept column as follows:

Table 2 – Tradeoff Matrix for Devops
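Since the table images are not reproduced here, the following is only a sketch of how a devops Tradeoff Matrix might be captured and checked in code. It assumes that the three original columns are labeled Fixed, Flexible and Accept, and that each constraint is assigned to exactly one column with no column reused; the sample assignment for the bank example is hypothetical and is not taken from Table 2.

```python
# Assumed labels: the post names only the added "Reluctantly Accept" column;
# "fixed", "flexible" and "accept" are assumptions about the original columns.
CONSTRAINTS = ("compliance", "cost", "schedule", "scope")
PRIORITIES = ("fixed", "flexible", "accept", "reluctantly accept")


def validate_tradeoff_matrix(matrix):
    """Raise ValueError unless each constraint is paired with a distinct priority."""
    if set(matrix) != set(CONSTRAINTS):
        raise ValueError("matrix must rank exactly: " + ", ".join(CONSTRAINTS))
    unknown = set(matrix.values()) - set(PRIORITIES)
    if unknown:
        raise ValueError("unknown priorities: " + ", ".join(sorted(unknown)))
    if len(set(matrix.values())) != len(PRIORITIES):
        raise ValueError("each priority may be used only once")
    return True


# Hypothetical assignment for the bank example; not taken from Table 2.
bank_matrix = {
    "compliance": "fixed",
    "schedule": "flexible",
    "scope": "accept",
    "cost": "reluctantly accept",
}

print(validate_tradeoff_matrix(bank_matrix))
```

Forcing a one-to-one assignment is what makes the matrix useful as a governance tool: stakeholders cannot declare every constraint fixed.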

The Devops Triangle and the corresponding Tradeoff Matrix demonstrate how governance a la Agile can be extended to devops projects as far as compliance goes. The proposed governance framework however is incomplete in the following sense: schedule in devops projects can be a much more granular and stringent constraint than schedule in “dev only” projects. The subject of schedule constraints in devops projects will be addressed in a forthcoming post.

Boundary Objects in DevOps

Boundary Object by Cherice.

Source: Flickr; Chrice’s Photostream

The following recommendation was given in the post How to Initiate a Devops Project:

For a DevOps project, start by establishing the technical debt of the software to be released to operations. By so doing you build the foundations for collaboration between development and operations through shared data. In the devops context, the technical debt data form the basis for the creation and grooming of a unified backlog which includes various user stories from operations.

I would like to augment this recommendation with a suggestion with respect to the mindset during the initiation phase. Chances are the IT folks feel outnumbered by the dev folks. It might or might not be a matter of optics, but recognizing and appreciating this mindset will help a lot in getting the devops project on track.

Here is a simple example I heard from a participant in the June 25 devops day in Mountain View, CA. The participant with whom I talked is an IT ops person who tries to get ops aligned with  fairly proficient Agile development teams. She is, however, constrained with respect to the IT ops resources available to her. She simply does not have the resources required to attend each and every Scrum meeting as 25 such meetings take place every day. She strongly feels “outnumbered.”

Various schemes could be devised to enable meaningful participation of ops in the Agile process. The more important thing though is to be fully sensitized to the “outnumbered” feeling. The extension of Agile principles to ops will not succeed in the face of such a feeling.

When I discussed the subject with my friend Andrew Shafer, he mentioned the effectiveness of boundary objects in such cross-organizational situations:

Boundary objects are objects which are both plastic enough to adapt to local needs and constraints of the several parties employing them, yet robust enough to maintain a common identity across sites. They are weakly structured in common use, and become strongly structured in individual-site use. They may be abstract or concrete. They have different meanings in different social worlds but their structure is common enough to more than one world to make them recognizable means of translation. The creation and management of boundary objects is key in developing and maintaining coherence across intersecting social worlds. [Source: Wikipedia].

As an example, the boundary object for the situation described in this post could be a set of technical debt criteria that make the code eligible for deployment from a product life cycle perspective. Such a boundary object shifts the dialog from the process to the outcome of the process. Instead of working on generating IT resources in an “outnumbered” mode, the energy shifts toward developing a working agreement on the intrinsic quality of the code to be deployed.

Some technical debt criteria that could form the core of a devops boundary object are mentioned in the post Technical Debt Meets Continuous Deployment. Corresponding criteria could be used in the boundary object to satisfy operational requirements which are critical to the proper functioning of the code. For example, a ceiling on configuration drift in IT could be established to ensure an adequate operating environment for the code. A boundary object that contains both technical debt criteria and configuration drift criteria satisfies different concerns – those of dev and those of ops – simultaneously.
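To make the idea more concrete, here is a minimal sketch of a boundary object that carries one dev-side criterion and one ops-side criterion. Everything in it is an assumption made for illustration: the class and function names, the threshold values, and in particular the definition of configuration drift as the fraction of baseline settings that differ on a host. It is not a description of any particular tool.

```python
from dataclasses import dataclass

_MISSING = object()  # sentinel for settings that are absent on a host


def drift_ratio(baseline: dict, actual: dict) -> float:
    """Fraction of baseline settings whose actual value differs or is missing.

    This definition of configuration drift is an assumption for the sketch.
    """
    if not baseline:
        return 0.0
    drifted = sum(
        1 for key, expected in baseline.items()
        if actual.get(key, _MISSING) != expected
    )
    return drifted / len(baseline)


@dataclass(frozen=True)
class BoundaryObject:
    """Hypothetical shared agreement that dev and ops both read and sign off on."""
    max_debt_per_line_usd: float  # dev-side criterion (technical debt)
    max_drift_ratio: float        # ops-side criterion (configuration drift ceiling)

    def dev_view(self, debt_per_line_usd: float) -> bool:
        """Dev's concern: is the code's technical debt below the agreed level?"""
        return debt_per_line_usd < self.max_debt_per_line_usd

    def ops_view(self, baseline: dict, actual: dict) -> bool:
        """Ops' concern: is the target environment close enough to its baseline?"""
        return drift_ratio(baseline, actual) <= self.max_drift_ratio

    def eligible(self, debt_per_line_usd: float, baseline: dict, actual: dict) -> bool:
        """Deployment is agreed only when both concerns are satisfied."""
        return self.dev_view(debt_per_line_usd) and self.ops_view(baseline, actual)
```

The same object answers a different question for each party, which is what lets it serve as a means of translation: dev looks at `dev_view`, ops looks at `ops_view`, and the conversation between them is about `eligible`.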

Written by israelgat

July 6, 2010 at 6:44 am

Technical Debt Meets Continuous Deployment

with 11 comments

As you would expect at a conference entitled Velocity, and at a follow-on devops day, speeding things up was an overarching theme. In the context of devops, the theme primarily manifested itself in lively discussions about the number of deploys per day. Comments such as the following reply to my post Ops Driven Dev were typical:

Conceptually, I move the whole business application configuration into the source code…

The theme that was missing for me in many of the presentations and discussions on the subject was the striking of a balance between velocity and quality. The classical trade-off in process control is between production rate and product quality (and safety, but that aspect is beyond the scope of this post). IMHO this trade-off applies to software just as it applies to mechanical or chemical processes.

The heart of the “deploy early and often” strategy hailed by advocates of continuous deployment is moving from one known deployment state to the next known deployment state. You don’t let the deployment evolve from one state to another before it has stabilized to a robust state. The power of this incremental deployment is in dealing with single-piece flow (or with as small a number of pieces as possible) rather than dealing with the effects of multiple-piece flow. When the deployment increments are small enough, rollback, root cause analysis and recovery are relatively straightforward if a deployment turns sour. It is a similar concept to Agile development, extending continuous integration to continuous deployment.

While I am wholeheartedly behind this devops strategy, I believe it needs to be reinforced through rigorous quality criteria the code must satisfy prior to deployment. The most straightforward way of doing so is to embed technical debt criteria in the release/deploy process. For example:

  • The code will not be deployed unless the overall technical debt per line of code is lower than $2.
  • To qualify for deployment, code duplication levels must be kept under 8%.
  • Code whose Cyclomatic complexity per Java class is higher than 15 will not be accepted for deployment.
  • 50% unit test coverage is the minimal level required for deployment.
  • Many others…
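A minimal sketch of how such a gate might be wired into a release/deploy script follows. The four thresholds are the ones listed above; the `CodeMetrics` type, the `gate_violations` function and the sample numbers are hypothetical, and the sketch assumes the metrics have already been produced by whatever analysis tool is in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeMetrics:
    """Metrics for one release candidate (hypothetical container type)."""
    debt_per_line_usd: float       # overall technical debt per line of code
    duplication_pct: float         # code duplication level, in percent
    max_class_complexity: int      # highest cyclomatic complexity of any class
    unit_test_coverage_pct: float  # unit test coverage, in percent


def gate_violations(m: CodeMetrics) -> list:
    """Return the criteria the candidate fails; an empty list means 'deploy'."""
    violations = []
    if m.debt_per_line_usd >= 2.0:
        violations.append("technical debt per line is not lower than $2")
    if m.duplication_pct >= 8.0:
        violations.append("code duplication is not under 8%")
    if m.max_class_complexity > 15:
        violations.append("a class has cyclomatic complexity higher than 15")
    if m.unit_test_coverage_pct < 50.0:
        violations.append("unit test coverage is below 50%")
    return violations


# Illustrative, made-up numbers: only the complexity criterion fails here.
candidate = CodeMetrics(
    debt_per_line_usd=1.50,
    duplication_pct=6.0,
    max_class_complexity=38,
    unit_test_coverage_pct=62.0,
)
for problem in gate_violations(candidate):
    print("Deployment blocked:", problem)
```

Returning the list of failed criteria, rather than a bare yes/no, gives dev and ops something specific to discuss when a deployment is held back.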

I have no doubt whatsoever that code which does not satisfy these criteria might be deployed successfully in the short term. The problem, however, is the cumulative effect over the long haul of successive deployments of code increments of inadequate quality. As Figure 1 demonstrates, a Java file with Cyclomatic complexity of 38 has a 50% probability of being error-prone. If you do not stop it prior to deployment through technical debt criteria, it is likely to affect your customers and play havoc with your deployment quite a few times in the future. The fact that it did not do so during the first hour of deployment does not guarantee that such a file will be “well-behaved” in the future.

Figure 1: Error-proneness as a Function of Cyclomatic Complexity (Source: http://www.enerjy.com/blog/?p=198)

To attain satisfactory long-term quality and stability, you need both the right process and the right code. Continuous deployment is the “right process” if you have developed the deployment infrastructure to support it. The “right code” in this context is code whose technical debt levels are quantified and governed prior to deployment.

Devops: It is Not About ITIL, It is About Proficiency

with 2 comments

As you would expect, the Information Technology Infrastructure Library (ITIL) topic was brought up at the devops day held last Friday in a LinkedIn facility in Mountain View, CA. We, of course, had the expected spectrum of opinions about ITIL in the context of devops – from “ITIL will never work for a true continuous development shop” to “well, you can make ITIL work under such circumstances.” Needless to say, a noticeable level of passion accompanied these two statements…

IMHO the heart of the issue is not ITIL per se but system management proficiency. If your system management proficiency is high, you are likely to be able to effectively respond to 10, 20 or 50 deploys per day. Conversely, if your system management proficiency is low, ops is not likely to be able to cope with high velocity in dev. The critical piece is alignment of velocities between dev and ops, not the method used to manage IT systems and services.  Whether you use ITIL, COBIT or your own home-grown set of best practices is irrelevant. Achieving alignment of velocities between dev and ops is a matter of proficiency in system management.

Cutter’s Technical Debt Assessment and Valuation Service

with 3 comments

Source: Cutter Technical Debt and Valuation Service

The Cutter Consortium has announced the availability of the Technical Debt Assessment and Valuation Service. The service combines static code analytics with dynamic program analytics to give the client “x-rays” of the software being examined at any desired granularity – from the whole project portfolio to a single instruction. It breaks down technical debt into the areas of coverage, complexity, duplication, violations and comments. Clients get an aggregate dollar figure for “paying back” debt that they can then plug into their financial models to objectively analyze their critical software assets. Based on these metrics, they can make the best decisions about their ongoing strategy for the software development effort under scrutiny.

This new service is an important addition to the enlightened software governance framework that Jim Highsmith, Michael Mah and I have been thinking about and contributing to for some time now (see Beyond Scope, Schedule and Cost: Measuring Agile Performance and Quantifying the Start Afresh Option). The heart of both the technical debt service and the enlightened governance framework is captured by the following words from the press release:

Executives in charge of software governance have long dealt with two kinds of dollar figures: One, the cost of producing and maintaining the software; and two, the value of the software, which is usually expressed in terms of the net present value associated with the expected value stream the product will generate. Now we can deal with technical debt in the same quantitative manner, regardless of the software methods a company uses.

When expressed in terms of dollars, technical debt ties neatly into value vis-à-vis cost considerations. For a “well behaved” software project, three factors — value, cost, and technical debt — have to satisfy the equation Value >> Cost > Technical Debt. Monitoring the balance between value, cost, and technical debt on an ongoing basis is an effective way for organizations to stay on top of their real progress, and for stakeholders and investors to ensure their investment is sound.
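As a rough illustration, the inequality can be checked mechanically once all three quantities are expressed in dollars. The sketch below assumes that “>>” means value exceeds cost by some agreed multiple; the `much_greater_factor` parameter, its default of 3 and the function name are hypothetical and are not taken from the press release.

```python
def is_well_behaved(value_usd: float, cost_usd: float, debt_usd: float,
                    much_greater_factor: float = 3.0) -> bool:
    """Check Value >> Cost > Technical Debt for one project.

    'Much greater than' is interpreted (as an assumption) as
    value >= much_greater_factor * cost.
    """
    value_far_exceeds_cost = value_usd >= much_greater_factor * cost_usd
    cost_exceeds_debt = cost_usd > debt_usd
    return value_far_exceeds_cost and cost_exceeds_debt


# Made-up figures, for illustration only.
print(is_well_behaved(10_000_000, 2_000_000, 500_000))    # True
print(is_well_behaved(10_000_000, 2_000_000, 2_500_000))  # False: debt exceeds cost
```

Running such a check at regular intervals is one simple way to monitor the balance between the three factors on an ongoing basis.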

By boiling down technical debt to dollars and tying it to cost and value, the service enables a metrics-driven governance framework for the use of five major constituencies, as follows:

Technical debt assessments and valuation can specifically help CIOs ensure alignment of software development with IT Operations; give CTOs early warning signs of impending project trouble; assure those involved in due diligence for M&A activity that the code being acquired will adapt to meet future needs; enables CEOs to effectively govern the software development process; and, it provides critical information as to whether software under consideration constitutes an asset or a liability for venture capitalists who need to make informed investment decisions.

It should finally be pointed out that the technical debt assessment service and the governance framework it enables are applicable to any software method. They can be used to:

  • Govern a heterogeneous environment in which multiple software methods are used
  • Make apples-to-apples comparisons between disparate software projects
  • Assess project performance vis-à-vis industry norms

Forthcoming Cutter Executive Reports, Executive Updates and Email Advisors on the technical debt service are restricted to Cutter clients. As appropriate, I will publish the latest and greatest news on the subject in the Cutter Blog (which is an open forum I highly recommend).

Acknowledgements: I would like to wholeheartedly thank the following colleagues for inspiring, enlightening and supporting me during the preparation of the service:

  • Karen Coburn
  • Jennifer Flaxman
  • Jonathon Golden
  • John Heintz
  • Jim Highsmith
  • Ken Collier
  • Kim Leonard
  • Kara Letourneau
  • Michael Mah
  • Anne Mullaney
  • Chris Sterling
  • Cindy Swain
  • Sarah Wiesbrock

How to Initiate a DevOps Project

with 4 comments

Source: 17th/21st Lancers c. 1922-1929 “THE FIGHTING SPIRIT!”

Agile consultants on a development project often start by helping the team construct a backlog. The task is sufficiently concrete to get all stakeholders (product management, project management, development, test, any others) on a collaborative track through the creation of a key artifact. The backlog establishes a baseline for the tasks to be carried out in the project.

For a DevOps project, start by establishing the technical debt of the software to be released to operations. By so doing you build the foundations for collaboration between development and operations through shared data. In the DevOps context, the technical debt data form the basis for the creation and grooming of a unified backlog which includes various user stories from operations.

Apply the same approach when you are fortunate to be able to include folks from operations in the Agile team from the very beginning. You start with zero technical debt, but you track it on an ongoing basis and include the corresponding “fix-it” stories in the backlog as you accrue the debt. Running technical debt analytics on the source code every two weeks is a good practice to follow.

As the head of development, you might not be comfortable sharing technical debt data. If that is the case, you are not ready for DevOps.

The Agile Flywheel

with 8 comments

Readers of The Agile Executive have been exposed to the “All In!” strategy used by Erik Huddleston to transform the software engineering process at Inovis and make it uniquely streamlined. In this post we follow up on the original discussion of the subject to explore the effect of Agile on IT Operations. As the title implies, Agile at Inovis served as a flywheel which created the momentum required to transform IT Operations and blend the best of Agile with the best of ITIL.

This guest post was written by Ray Riescher – a Six Sigma Black Belt, Agile evangelist and a business process change agent. Ray is currently responsible for business process management and IT governance at Inovis, a leading provider of business-to-business (B2B) e-commerce services, in Alpharetta, GA.

Here is Ray:

When we converted to an Agile Scrum software methodology some 24 months ago, I never imagined the lessons I’d learn and the organizational change that would be driven by the adoption of Scrum.

I’ve lived by the philosophy that managing a business is managing its processes and that all of those processes, especially the operational processes, are interconnected. However, I don’t think I was fully prepared for the effect Agile Scrum would have on our company operations.

We dove head first into Agile Scrum and adapted to it very quickly. However, it wasn’t until we landed a very large and demanding customer that Scrum was really put to the test. New enhancements, new features, and new configurations were all needed ASAP. Scrum delivered with rapid development and deployment in the form of releases that were moving into production with amazing velocity. Our release cadence hit warp drive and at one point we experienced several months where multiple teams’ production releases were deploying at the end of every two-week sprint.

We’ve subscribed to the ITIL service support processes for Release, Change, Incident, Problem and Configuration Management. ITIL has served us well, giving us a common language and a clear understanding of process boundaries.

As the Scrum release cadence kicked in, the downstream ITIL processes had to keep up, adapt, and support the dynamics of rapid production changes. What happened was enlightening and maybe a bit groundbreaking.

The Release Management process had to reassess its reliance on artifacts for gatekeeping. The levels of sign-offs had to be streamlined, the heavyweight deployment documentation had to be lightened, yet the process still had to control the production release to ensure deployment success. The rapidity of the release cycles meant that maintenance window downtime would be experienced too frequently by customers, so “rolling bounce” deployment strategies were devised and implemented.

Change requests could no longer wait for a weekly Change Management review board to approve and schedule the changes.  Change management risk models had to be relied on for accurate detection of risky changes.

Early on in this dynamic environment, we weren’t quite as good as we needed to be and our Incident Management process was put to the test.  Faster releases meant more opportunity for problems with service degradation and outages. This reality manifested itself more frequently than we’d ever experienced. Monitoring, detecting and repairing became paramount for environment stability and customer satisfaction.

What we found out was that we became very agile at this break/fix game. We developed a small team approach to managing incidents and leveraged the ITIL Problem Management process to rapidly perform root cause analysis. Once the true root cause was determined, a fix would be defined and deployed. Sometimes the fix was software related and went through the Scrum process, sometimes the fix was hardware related and went through the Configuration Management process, other times it was more operational and the fix took the form of training or corrections to procedural documentation.

The point is we’ve become agile across the entire IT spectrum. Whether it’s development via Scrum, the velocity with which we now operate our ITIL processes, or the integrated break/fix operational support processes, we are performing all of these with an agile mindset and discipline. We have small teams, working on priorities, and completing what needs to be completed now.

Scrum set the flywheel in motion and caused the rest of the IT process life cycle to respond.  ITIL’s processes still form the solid core of service support and we’ve improved the processes’ capability to handle intense work velocity. The organization adapted by developing unprecedented speed in the ability to deliver production fixes and to solve root cause problems with agility.

What I think we are witnessing is a manifestation of Agile Business Service Management: a holistic agile methodology running across the IT process spectrum that’s delivering eye-popping change and tremendous results.

Harnessing Economies of Scale in Cloud Computing to Realize a Greener Computing Option

with 2 comments

Economies of Scale have been much discussed in The Agile Executive since the recent OpsCamp in Austin, TX. The significant savings on system administration costs in very large data centers have been called out as a major advantage of Internet-scale Clouds. Unlike various short-lived advantages, the benefits to the Cloud operator, and to the Cloud user when the savings are passed on to him/her, are sustainable.

In this guest post, colleague and friend Annie Shum analyzes the various sources of waste in operations in traditional data centers. Like an Agilist with Lean inclinations who confronts an inefficient Waterfall process, Annie explains how economies of scale apply to the various kinds of waste that are prevalent in today’s small and medium data centers. Furthermore, she connects the dots that lead toward a Green IT option.

Here is Annie:

Harnessing Economies of Scale in Cloud Computing to Realize a Greener Computing Option

Scale Matters: “Over time, however, competitive advantage within categories shifts inexorably toward volume operations architecture.” – Geoffrey Moore, “Dealing with Darwin”

It is a truism that today’s datacenters are systemically inefficient. This is not intended as an indictment of all conventional datacenters. Nor does it imply that today’s datacenters cannot be made more efficient (incrementally) through right sizing and other initiatives, notably consolidation by deploying virtualization technologies and governance by enforcing energy conservation/recycling policies. There are a myriad of inefficiencies, however, that are prevalent in datacenters today.

Many industry observers lament the “staggering complexity” that permeates on-premises datacenters. Over time, most, if not all, enterprise IT datacenters have become amalgamations of disparate heterogeneous resources. Generally, they can be described as incohesive, perhaps even haphazard, accumulations. The datacenter components and configurations often reflect the intersections of organizational politics (LOB reporting structures leading to highly customized/organizational asset acquisitions and configurations), business needs of the moment (shifting corporate strategies and changing business imperatives to gain competitive edge or meet regulatory compliances) and technology limitations (commercial tools available in the marketplace). It should come as no surprise that human interactions and errors are considered a major contributor to the inefficiencies of datacenters: IBM reported that human errors account for seventy percent of the datacenter problems.

The challenge of maximizing energy efficiency begins fundamentally with the historical capital-intensive ownership model for computing assets, under which each organization operates its own datacenter and provides “24×7 availability” to its own users. The enterprise IT staff has been required to support unpredictable future growth, accommodate situational demands and unscheduled but deadline-critical events, meet performance levels within SLAs and comply with regulatory and auditing requirements. Hence, datacenters generally are over-configured and over-provisioned. In addition to highly skewed under-utilization of distributed platform servers, ninety percent of corporate datacenters have excess cooling capacity. Worst of all, according to IBM, about seventy-two percent of cooling bypasses the computing equipment entirely. Further compounding these problems for a typical enterprise datacenter is the lack of transparency and the inability to control energy consumption properly, due to inadequate and often inaccurate instrumentation for quantifying energy consumption and energy lost as waste.

The economics of Cloud Computing can offer a compelling option for more efficient IT: by lowering power consumption for individual organizations and by improving the efficiency of a large number of discrete datacenters. Although the electricity consumption of Cloud Computing is projected to be one to two percent of today’s global electricity use, Cloud service providers can still cultivate sustainable Green I.T. effectively at lower costs by leveraging state-of-the-art super energy efficient massive datacenters, proximity to power generation thereby reducing transmission costs and, above all, harnessing enormous economies of scale. To better understand how Cloud Computing can offer greener computing in the Cloud and how it will help moderate power consumption by datacenters and rein in run-away costs, a good starting place is James Hamilton’s September 2008 study on “Internet-Scale Service Efficiency,” as summarized in the table below.

Resource | Cost in Medium DC | Cost in Very Large DC | Ratio
Network | $95 / Mbps / month | $13 / Mbps / month | 7.1x
Storage | $2.20 / GB / month | $0.40 / GB / month | 5.7x
Administration | ≈140 servers/admin | >1000 servers/admin | 7.1x

Table 1: Internet-Scale Service Efficiency [Source: James Hamilton]

This study concludes that hosted services by Cloud providers with super large datacenters (at least tens of thousands of servers) can achieve enormous economies of scale of five to seven times over smaller scale (thousands of servers) medium deployments. The significant cost savings are driven primarily by scale. Other key factors include location (low-cost real estate and electricity rates, abundant water supply and readily available fiber-optic connectivity), proximity to electricity and power generators, load diversity, and virtualization technologies.

Will this mark the beginning of the end for traditional on-premises datacenters? Can enterprise IT continue to justify new business cases for expanding today’s non-renewable energy powered datacenters? According to the McKinsey article, the costs to launch a large enterprise datacenter have risen sharply from $150M to over $500M over the past five years. The facility operating costs are also increasing at about twenty percent per year. How long will the status quo last for enterprise IT considering the recent trend of Cloud service providers? Major players such as Google, Microsoft as well as the U.S. government itself have invested in or are planning ultra energy-efficient mega-size datacenters (also known as “container hotels”) with massive commoditized containerization and proximity both to power source and less expensive power rates. Bottom line: will the tide turn if the economics (radical cost savings) due to enormous economies of scale become too significant to ignore?

Despite the potential for significant cost savings, it is premature to declare the demise of traditional IT or the end of enterprise datacenters. After all, the rationale for today’s enterprise IT extends well beyond simplistic bottom-line economics – at least for now. To most industry observers, enterprise datacenters are unlikely to disappear although the traditional roles of enterprise IT will be changing. A likely scenario may involve redistributing IT personnel from performing low-level operational tasks to addressing higher-level functions involving governance, energy management, security and business processes. Such change will not only become more apparent but will likely be precipitated by the rise of hybrid Clouds and the growing interconnection linking SOA, BPM and social computing. Another likely scenario is the rise of the mega datacenters or “container hotels” for Cloud Utility Computing providers. Although the global economic outlook will undoubtedly play a key role in shaping the development plans/timelines of the mega datacenters, they are here to stay. Case in point: Intel estimates that by 2012 it will design and ship about a quarter of the server chips it sells to such mega datacenters.