The Agile Executive

Making Agile Work

Posts Tagged ‘Agile Business Service Management

Schedule Constraints in the Devops Triangle

leave a comment »

Last week’s post “The Devops Triangle” demonstrated the extension of Jim Highsmith‘s Agile Triangle to devops. The extension relied on adding compliance to the three traditional constraints of software development: scope, schedule, cost. A graphical representation of this extension is given in Figure 1.

Figure 1: Compliance as the Fourth Constraint in Devops Projects

This blog post examines how time/schedule should be governed in the devops context. It does so by building on the concluding observation in the previous post:

The Devops Triangle and the corresponding Tradeoff Matrix demonstrate how governance a la Agile can be extended to devops projects as far as compliance goes. The proposed governance framework however is incomplete in the following sense: schedule in devops projects can be a much more granular and stringent constraint than schedule in “dev only” projects.

For the schedule constraint in devops, I propose a schedule set.  It consists of  four components:

  • Lead Time or Engineering Time
  • Time to change
  • Time to deploy
  • Time to roll back

Lead Time/Engineering Time: These are customary metrics used in Kanban software development, as demonstrated in Figure 3.

Figure 3: The Engineering Time Metric Used by the BBC (David Joyce in the LSSC10 Conference)

Time to change: The amount of time it takes for the various stakeholders (e.g., dev, test, ops, customer support) to review the code to be deployed, approve its deployment and assign a time window for the deployment.

Time to deploy: The amount of time from (metaphorically speaking) pushing the Deploy “button” to completion of deployment.

Time to roll back: The amount of time to undo a deployment. (Rigorous that the engineering practices and IT processes might be, the time to roll back a deployment can’t be ignored – it is a critical risk parameter).

A graphical representation of these four schedule metrics together with the Devops Triangle is given in the figure below:

Figure 4: The Devops Triangle with a Schedule Set

Using hours as the common unit of measure, a typical schedule set could be {100, 48, 3, 2}. In this hypothetical example, it takes a little over 4 days to carry out the development of the code increment; 2 days to get approval for the change; 3 hours to deploy the code; and, 2 hours to roll back.

Whatever your specific schedule numbers might be, it is highly recommended you apply value stream mapping (see Figure 5 below) to your schedule set. Based on the findings of the value stream mapping, apply statistical process control methods like those illustrated in Figure 3 to continuously improving both the mean and the variances of the four schedule components.

Figure 5: An Example of Value Stream Mapping (Source: Wikipedia entry on the subject)

Advertisements

The Devops Triangle

with one comment

The Agile Triangles was introduced by Jim Highsmith as an antidote to the Iron Triangle. Instead of balancing development between cost, schedule and scope, the Agile Triangle strives to strike a balance between value, quality and constraints:

Figure 1 – The Agile Triangle (based on Figure 1-3 in Agile Project Management: Creating Innovative Products.)

Consider the Iron Triangle in the context of devops. Value, quality and constraints apply to IT operations as meaningfully as they apply to software development. IT can go beyond cost, schedule and scope to focus on value and quality just as the Agile software development team does. Between development and operations the specific tasks to be carried out change, but the principles embodies in the triangle remain invariant.

In addition to cost, schedule and scope, devops projects must cope with another constraint: compliance. For example, a bank that implements a ‘follow the sun’ strategy with respect to trading must finish reconciling transaction that took place in London before the start of trade in Wall Street. From the bank’s point of view, its IT department needs to be mindful of four constraints: compliance, cost, schedule and scope. This view is represented in Figure 2 below.

Figure 2 – The Devops Triangle

Balancing the four constraints – compliance, cost, schedule, and scope – is not a trivial task. However, just like the Agile Triangle, the Tradeoff Matrix used in Agile software development applies to IT. In its software development variant, the Tradeoff matrix is an effective tool to decide between conflicting constraints, as follows:

Table 1 – Tradeoff Matrix (based on Table 6-1 in Agile Project Management: Creating Innovative Products.)

For devops, the matrix is extended to include a compliance row and a Reluctantly Accept column as follows:

Table 2 – Tradeoff Matrix for Devops

The Devops Triangle and the corresponding Tradeoff Matrix demonstrate how governance a la Agile can be extended to devops projects as far as compliance goes. The proposed governance framework however is incomplete in the following sense: schedule in devops projects can be a much more granular and stringent constraint than schedule in “dev only” projects. The subject of schedule constraints in devops projects will be addressed in a forthcoming post.

Devops: It is Not About ITIL, It is About Proficiency

with 2 comments

As you would expect, the Information Technology Infrastructure Library (ITIL) topic was brought up in the devops day held last Friday in a LinkedIn facility in Mountain View, CA. We, of course, had the expected spectrum of opinions about ITIL in the context of devops – from “ITIL will never work for a true continuous development shop” to “well, you can make ITIL work under such circumstances.” Needless to say, a noticeable level of passion accompanied these two statements…

IMHO the heart of the issue is not ITIL per se but system management proficiency. If your system management proficiency is high, you are likely to be able to effectively respond to 10, 20 or 50 deploys per day. Conversely, if your system management proficiency is low, ops is not likely to be able to cope with high velocity in dev. The critical piece is alignment of velocities between dev and ops, not the method used to manage IT systems and services.  Whether you use ITIL, COBIT or your own home-grown set of best practices is irrelevant. Achieving alignment of velocities between dev and ops is a matter of proficiency in system management.

The Agile Flywheel

with 8 comments

Readers of The Agile Executive have been exposed to the “All In!” strategy used by Erik Huddleston to transform the software engineering process at Inovis and make it uniquely streamlined. In this post we follow up on the original discussion of the subject to explore the effect of Agile on IT Operations. As the title implies, Agile at Inovis served as a flywheel which created the momentum required to transform IT Operations and blend the best of Agile with the best of ITIL.

This guest post was written by Ray Riescher – a Six Sigma Black Belt, Agile evangelist and a business process change agent. Ray is currently responsible for business process management and IT governance at Inovis, a  leading provider of business-to-business (B2B) e-commerce services, in Alpharetta, GA

Here is Ray:

When we converted to an Agile Scrum software methodology some 24 months ago, I never imagined the lessons I’d learn and the organizational change that would be driven by the adoption of Scrum.

I’ve lived by the philosophy that managing a business is managing its processes and that all of those processes, especially the operational processes, are interconnected.  However, I don’t think I was fully prepared for effect Agile Scrum would have on our company operations.

We dove head first into Agile Scrum and adapted to it very quickly. However, it wasn’t until we landed a very large and demanding customer that Scrum was really put to the test. New enhancements, new features, and new configurations were all needed ASAP.  Scrum delivered with rapid development and deployment in the form of releases that were moving into production with amazing velocity. Our release cadence hit warp drive and at one point we experienced several months where multiple teams’ production releases were deploying at the end of every two week sprint.

We’ve subscribed to the ITIL service support processes for Release, Change, Incident, Problem and Configuration Management. ITIL has served us well, giving us a common language and a clear understanding of process boundaries.

As the Scrum release cadence kicked in, the downstream ITIL processes had to keep up, adapt, and support the dynamics of rapid production changes.  What happened was enlightening and maybe a bit ground breaking.

The Release Management process had to reassess its reliance on artifacts for gate keeping. The levels of sign offs had to be streamlined, the heavyweight deployment documentation had to be lightened, yet the process still had to control the production release to ensure deployment success.  The rapidity of the release cycles meant that maintenance window downtime would be experienced too frequently by customers, so “rolling bounce” deployment strategies were devised and implemented.

Change requests could no longer wait for a weekly Change Management review board to approve and schedule the changes.  Change management risk models had to be relied on for accurate detection of risky changes.

Early on in this dynamic environment, we weren’t quite as good as we needed to be and our Incident Management process was put to the test.  Faster releases meant more opportunity for problems with service degradation and outages. This reality manifested itself more frequently than we’d ever experienced. Monitoring, detecting and repairing became paramount for environment stability and customer satisfaction.

What we found out was that we became very agile at this break/fix game. We developed a small team approach to managing incidents and leveraged the ITIL Problem Management process to rapidly perform root cause analysis. Once the true root cause was determined, a fix would be defined and deployed. Sometimes the fix was software related and went through the Scrum process, sometimes the fix was hardware related and went through the Configuration Management process, other times it was more operational and the fix took the form of training or corrections to procedural documentation.

The point is we’ve become agile across the entire IT spectrum. Whether it’s development via Scrum, the velocity with which we now operate our ITIL processes, or the integrated break/fix operational support processes, we are performing all of these with an agile mindset and discipline. We have small teams, working on priorities, and completing what needs to be completed now.

Scrum set the flywheel in motion and caused the rest of the IT process life cycle to respond.  ITIL’s processes still form the solid core of service support and we’ve improved the processes’ capability to handle intense work velocity. The organization adapted by developing unprecedented speed in the ability to deliver production fixes and to solve root cause problems with agility.

What I think we are witnessing is a manifestation of Agile Business Service Management; a holistic agile methodology running across the IT process spectrum that’s delivering eye popping change and tremendous results.

The Urgency of Now – Guest Post by Annie Shum

with 3 comments

Failure to learn, failure to anticipate, and failure to adapt are the three generic causes of military disasters. Each one of these three failures is bad enough. In combination, they can be catastrophic. Germany swiftly defeated and conquered France in 1940 due to the utter failure of the French army to grasp the nature of future war, to conceive the probable action of the German forces and to adequately react to the German initiative once it unfolded through the Ardennes. The patterns leading to the catastrophe suffered by the French are similar in some ways to the eco-meltdowns described by Jared Diamond in Collapse: How Societies Choose to Succeed or Fail.

In this guest post, colleague and friend Annie Shum poses disturbing questions with respect to our willingness and ability as IT professionals to learn, anticipate and adapt to the imperatives of Cloud Computing. Between shockingly low (15%) server capacity utilization on the one hand, and dramatic changes in the needs of the business on the other hand, companies who continue to use industrial-era IT models are at peril. Annie weaves theses and other related threads together, and makes a resounding call-to-action to re-think IT.

It is remarkable that Annie’s analysis herein of the root causes of a possible meltdown in IT identifies worrisome patterns similar to those that the Agile movement has pointed out to with respect to arcane methods of software development. The very same core problems that afflict software development manifest themselves in the IT paradigm as well as in the corresponding business design. Painful and wasteful that this repeated manifestation is, it actually creates the opportunity to manage software, IT, and the business in unison. To do so, we need to embrace a data-driven version of the economics of IT, to grasp the true nature of Cloud Computing without the hype that currently surrounds it, and to adapt software development, IT operations and business design accordingly. As the title of this post states, we need to start carrying out these three tasks now.

Here is Annie:

The Urgency of Now: The Edge of Chaos and A “Strategic Inflection Point” for IT

“It was the worst of times. It may be the best of times.” – IBM

Consider the following table. It contains a list of statistics pertaining to the enterprise datacenter index compiled by Peter Mell and Tim Grance, NIST. Overall, the statistics are sobering, perhaps even alarming, and do not bode well for the long-term sustainability of traditional on-premises datacenters. Prudent IT organizations – whether big or small, stalwart or startup – should consider this as a wake-up call. In particular, out of the almost twelve million servers in US datacenters today, the typical server capacity utilization is only around fifteen percent. Although not explicitly shown in this table, the average utilization of the mainframe z/OS servers is typically over eighty percent. However, mainframe z/OS server utilization is only a minor component of the overall average server utilization.

Statistics Enterprise Datacenter Index
11,800,000 Servers in US datacenters
15% Typical server capacity utilization
$800,000,000,000/year Purchasing & maintaining enterprise software
80% Software costs spent on maintenance: the “80-20” ratio
100x Power consumption/sq ft compared to office building
4x Increase in server power consumption, 2001 to 2006
2x Increase in number of servers, 2001 to 2006
$21,300,000 Datacenter construction cost, 9000 sq ft
$1,000,000/year Annual cost to power the datacenter
1.5% Portion of national power generation
50% Potential power reduction from green technologies
2% Portion of global carbon emissions

 

Over the years, organizations have accepted such skewed levels of server inefficiency and escalating maintenance costs of IT infrastructure as the norm. Even as organizations continue to express concerns, many seem resigned to the status quo tacitly: akin to what Bob Evans of InfoWeek described as “insurmountable laws of physics.” Looking ahead, however, the status quo may no longer be a viable option for most organizations. Due to soaring electricity/power costs compounded by the recent global financial meltdown with a near collapse of the financial system that triggered a prolonged (and for now, apparently indefinite) credit crunch, these are unparalleled strident and chaotic times for businesses. Pressured by business decision-makers who are under a heightened level of anxiety, enterprise IT is now confronting a transformative dilemma whether to preserve the status quo or to re-think IT.

On one hand, the current global recessionary down cycle is a particularly powerful (albeit rooted in fear) and instinctive deterrent to challenging the status quo. For risk-adverse organizations, it is only understandable why status quo, fundamental flaws notwithstanding, may trump disruptive change during these challenging times. On the other hand, forward-thinking decision-makers may make the bold but disruptive (radical) choice to view status quo as the fundamental problem: acknowledge the growing “urgency of now” by resolving to overcome and correct the entrenched shortcomings of enterprise IT.

“You never want a serious crisis to go to waste”. That quote (or its many variations) has been attributed alike to economists and politicians. The same could be said for IT. Indeed a growing number of IT industry observers believe the profound impact of the on-going economic crisis could offer a rare window of opportunity for organizations to rethink traditional capital-intensive, command-control, on-premises IT operations and invest in new and more flexible self-service IT delivery/deployment models. Think of this defining moment as what Andy Grove, co-founder of Intel, described as the “strategic inflection point”.  He was referring to the point in the dynamic when the fundamentals of a business are about to change and “that change can mean an opportunity to rise to new heights.” Nonetheless, the choices will be hard decisions because the options are stark: either counter-intuitively invest in a down cycle by focusing on a more sustainable but disruptive trajectory or hunker down and risk irreversible shrinking business. 

As one considers how to address the challenges of today’s enterprise IT, perhaps the following two observations should be taken into account. First, despite the quantum leap in technology advancements, generally the basic design and delivery models of existing IT applications/services are variations of traditionally insular, back-office automation business tools. Second, the organizational structure and business models of most companies are deeply rooted in models of yesteryear, in many instances dating back to the Industrial Revolution. In theory, adhering to the traditional organizational model of top-down command-control can maximize predictability, efficiency and order. Heretofore, this has been the modus operandi for most organizations that Umair Haque succinctly characterized as “ industrial-era companies that make industrial-era stuff — and play by industrial-era rules.” In today’s exponential times, however, the velocity of change and the rapidly growing need of interconnecting to other organizations and automating value chains inevitably lead to an increase in uncertainty and disorder.  Strategically, forward-thinking organizations should consider seeking alternative models to address the interdependent and shifting new world order.  

In their book, “Presence – Human Purpose and the Field of the Future”, authors Peter Senge, Otto Schramer, Joe Jaworski and Betty Sue Flower observe that many of the practices of the Industrial Age appear to be largely unaffected by the changing reality of today’s society and continue to expand in today’s business organizations. They conclude with this advice:  “As long as our thinking is governed by industrial ‘machine age’ metaphors such as control, predictability, and faster is better, we will continue to re-create organizations as we have had – for the last 100 years – despite their increasing disharmony with the world and the science of the 21st century.”  Likewise, the traditional top-down command-control modus operandi of enterprise IT today does not reflect adequately and hence likely is unable to accommodate fully the transformational shift of business from silo organizations to “all thing’s digital all the time”, hyper-interconnected and hyper-interdependent ecosystems.

Three Criteria for Qualifying as Agile

leave a comment »

Agile methods have been gaining popularity to the extent that one sees the term Agile used beyond the domain of software methods. Agile Infrastructure and Agile Business Service Management were used in this blog and elsewhere. Recently I have seen the term used in the domain of Business Process Management (BPM). For example, a presentations entitled Best Practices for Agile BPM will be delivered in the forthcoming Gartner Group Business Process Management Summit 2010.

I have no doubt the term Agile will be adopted in various fields. Using BPM as an example, I propose the following three criteria to differentiate between agile (small A) and Agile (capital A):

  1. Beyond software: A software team carrying out a BPM initiative might use Agile methods. This fact to itself does not suffice to make the initiative Agile BPM.
  2. Methodical specificity: Roles, forums/ceremonies and artifacts for the BPM initiative must be specified. Folks might be already applying Lean, TOC or other approaches to BPM, but a definitive Agile BPM method has not crystalized yet.
  3. Values: Adherence in spirit to the four principles of the Agile Manifesto. Replace the word “software” with “product” in the manifesto (just two occurences!) and you get a universal value statement that is not restricted to “just” software. It applies to BPM as well as to any other field in which products are produced and used.

You might be impressively agile in what you do but it does not necessarily make you Agile. The pace by which you do things must be anchored in a broader perspective that incorporates customers and employees. A forthcoming post entitled Indivisibility of the Principles of Operation will explore the connection between the Agile values (plural) you hold and the business value (singular) you generate.

An Update on Agile Business Service Management

leave a comment »

A previous post in this blog defined the demarcation line between The Agile Executive and BSM Review as follows:

If software development is your primary interest, you might find my forthcoming posts in BSM Review go a little beyond the traditional scope of software methods. If, however, you are interested in software delivery in entirety, you are likely to find good synergy between the topics I will address in BSM Review and those I will continue to bring up in The Agile Executive. Either way, I trust my posts and Cote’s will be of on-going interest to you.

Since writing these words, I realized how tricky it is to adhere to this differentiation. The difficulty lies in the “cord” between development and operations. Development needs to devise algorithms that take into account operational characteristics in IT. Operations needs to comprehend the limits of such algorithms in the context of the service level agreements and operational level agreements that had been negotiated with their customers (either external or internal). The mutual need is particularly strong in the web application/web operations domain where mutual understanding, collaborative work and joint commitment often need to transcend organizational lines.

Given the inherently close ties between development and operations, here are some BSM Review articles and posts that are likely to be of interest to readers of The Agile Executive:

It is a little premature at this early stage to project how BSM Review will evolve. My hunch is that forthcoming articles in BSM Review on cloud computing, large-scale operations, leadership, risk mitigation and technology trends will be of particular interest to readers of this blog.