The Agile Executive

Making Agile Work

Archive for February 2010

The Mojo of Innovation Games

with 2 comments

This post is a shameless plug for Innovation Games. Shameless as it might be, it is grounded in the hands-on experience I acquired as a participant in Luke Hohmann’s workshop on the subject last week. Colleagues Ken Collier, Alan Shalloway and Michele Sliger took the workshop together with me.

While Innovation Games was conceived, implemented and published by Luke more than four years ago, the contemporary on-line implementation breaks new ground in three important ways:

  1. It ties in ideation, requirements management and software project management in a seamless fashion. (Stay tuned for exciting announcements on the subject in a couple of months).
  2. It gets over the “too much data” barrier of the paper-based version of the game. Data capture is largely automated now. Data analysis tools are forthcoming.
  3. It gets over the “across the pond” obstacle. You can play Innovation Games to your heart’s content no matter how geographically dispersed your teams might be.

Not bad for three guys and a dog. Actually, I don’t even know whether they can afford a dog. These guys operate on passion, craftsmanship and mojo…

Postscript: If you know Luke, you already know what I mean by “the Luke mojo.” If you don’t, may I suggest you get to know him. A convenient opportunity might be the forthcoming Agile Roots 2010 conference – the organizers are speaking with Luke about delivering a keynote presentation literally as I write this post.

The Agile Flywheel

with 8 comments

Readers of The Agile Executive have been exposed to the “All In!” strategy used by Erik Huddleston to transform the software engineering process at Inovis and make it uniquely streamlined. In this post we follow up on the original discussion of the subject to explore the effect of Agile on IT Operations. As the title implies, Agile at Inovis served as a flywheel which created the momentum required to transform IT Operations and blend the best of Agile with the best of ITIL.

This guest post was written by Ray Riescher – a Six Sigma Black Belt, Agile evangelist and a business process change agent. Ray is currently responsible for business process management and IT governance at Inovis, a leading provider of business-to-business (B2B) e-commerce services, in Alpharetta, GA.

Here is Ray:

When we converted to an Agile Scrum software methodology some 24 months ago, I never imagined the lessons I’d learn and the organizational change that would be driven by the adoption of Scrum.

I’ve lived by the philosophy that managing a business is managing its processes and that all of those processes, especially the operational processes, are interconnected. However, I don’t think I was fully prepared for the effect Agile Scrum would have on our company operations.

We dove headfirst into Agile Scrum and adapted to it very quickly. However, it wasn’t until we landed a very large and demanding customer that Scrum was really put to the test. New enhancements, new features, and new configurations were all needed ASAP. Scrum delivered with rapid development and deployment in the form of releases that moved into production with amazing velocity. Our release cadence hit warp drive, and at one point we experienced several months where multiple teams’ production releases were deploying at the end of every two-week sprint.

We’ve subscribed to the ITIL service support processes for Release, Change, Incident, Problem and Configuration Management. ITIL has served us well, giving us a common language and a clear understanding of process boundaries.

As the Scrum release cadence kicked in, the downstream ITIL processes had to keep up, adapt, and support the dynamics of rapid production changes. What happened was enlightening and maybe a bit groundbreaking.

The Release Management process had to reassess its reliance on artifacts for gatekeeping. The levels of sign-offs had to be streamlined and the heavyweight deployment documentation had to be lightened, yet the process still had to control the production release to ensure deployment success. The rapidity of the release cycles meant that maintenance-window downtime would be experienced too frequently by customers, so “rolling bounce” deployment strategies were devised and implemented.
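
To make the “rolling bounce” idea concrete, here is a minimal sketch of such a deployment loop in Python, assuming a hypothetical pool of application nodes behind a load balancer. The node names, drain/health-check/restore calls and soak time are illustrative placeholders rather than the actual Inovis tooling.

```python
import time

# Hypothetical node pool; in practice this would come from inventory or the CMDB.
NODES = ["app01", "app02", "app03", "app04"]

def drain(node):
    """Stop sending new traffic to the node (e.g., mark it down in the load balancer)."""
    print(f"draining {node}")

def deploy(node, version):
    """Push the new release to a single node."""
    print(f"deploying {version} to {node}")

def healthy(node):
    """Probe the node after the bounce; return True when it passes its health check."""
    print(f"health-checking {node}")
    return True

def restore(node):
    """Put the node back into the load-balancer rotation."""
    print(f"restoring {node}")

def rolling_bounce(version, soak_seconds=30):
    """Bounce one node at a time so the service never takes a full maintenance window."""
    for node in NODES:
        drain(node)
        deploy(node, version)
        if not healthy(node):
            # Halt the rollout rather than degrade the whole pool.
            raise RuntimeError(f"{node} failed health check; halting rollout of {version}")
        restore(node)
        time.sleep(soak_seconds)  # let the node take traffic before moving on

if __name__ == "__main__":
    rolling_bounce("release-2010.02", soak_seconds=0)
```

The point of the pattern is that only one node is out of rotation at any moment, so customers never experience a full outage window during a release.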

Change requests could no longer wait for a weekly Change Management review board to approve and schedule the changes.  Change management risk models had to be relied on for accurate detection of risky changes.
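
As an illustration of what such a risk model might look like, the sketch below scores a change request against a few weighted factors and routes only high-risk changes to the (now exception-based) review board. The factors, weights and threshold are hypothetical, not the model we actually used.

```python
# Toy change-risk model: weighted factors produce a score, and only changes
# above a threshold are escalated to the Change Management review board.
WEIGHTS = {
    "touches_production_data": 5,
    "no_rollback_plan": 4,
    "peak_hours_window": 3,
    "components_affected": 1,   # applied per component touched
}

def risk_score(change):
    score = 0
    score += WEIGHTS["touches_production_data"] if change.get("touches_production_data") else 0
    score += WEIGHTS["no_rollback_plan"] if not change.get("rollback_plan") else 0
    score += WEIGHTS["peak_hours_window"] if change.get("peak_hours_window") else 0
    score += WEIGHTS["components_affected"] * change.get("components_affected", 0)
    return score

def requires_review_board(change, threshold=8):
    return risk_score(change) >= threshold

if __name__ == "__main__":
    routine = {"rollback_plan": True, "components_affected": 1}
    risky = {"touches_production_data": True, "rollback_plan": False,
             "peak_hours_window": True, "components_affected": 4}
    print(risk_score(routine), requires_review_board(routine))  # low score: auto-approve path
    print(risk_score(risky), requires_review_board(risky))      # high score: review board
```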

Early on in this dynamic environment, we weren’t quite as good as we needed to be and our Incident Management process was put to the test.  Faster releases meant more opportunity for problems with service degradation and outages. This reality manifested itself more frequently than we’d ever experienced. Monitoring, detecting and repairing became paramount for environment stability and customer satisfaction.

What we found was that we became very agile at this break/fix game. We developed a small-team approach to managing incidents and leveraged the ITIL Problem Management process to rapidly perform root cause analysis. Once the true root cause was determined, a fix would be defined and deployed. Sometimes the fix was software-related and went through the Scrum process; sometimes it was hardware-related and went through the Configuration Management process; other times it was more operational, and the fix took the form of training or corrections to procedural documentation.

The point is we’ve become agile across the entire IT spectrum. Whether it’s development via Scrum, the velocity with which we now operate our ITIL processes, or the integrated break/fix operational support processes, we are performing all of these with an agile mindset and discipline. We have small teams, working on priorities, and completing what needs to be completed now.

Scrum set the flywheel in motion and caused the rest of the IT process life cycle to respond.  ITIL’s processes still form the solid core of service support and we’ve improved the processes’ capability to handle intense work velocity. The organization adapted by developing unprecedented speed in the ability to deliver production fixes and to solve root cause problems with agility.

What I think we are witnessing is a manifestation of Agile Business Service Management: a holistic agile methodology running across the IT process spectrum that is delivering eye-popping change and tremendous results.

Harnessing Economies of Scale in Cloud Computing to Realize a Greener Computing Option

with 2 comments

Economies of Scale have been much discussed in The Agile Executive since the recent OpsCamp in Austin, TX. The significant savings on system administration costs  in very large data centers have been called out as a major advantage of Internet-scale Clouds. Unlike various short-lived advantages, the benefits to the Cloud operator, and to the Cloud user when the savings are passed on to him/her, are sustainable.

In this guest post, colleague and friend Annie Shum analyzes the various sources of waste in operations in traditional data centers. Like an Agilist with Lean inclinations who confronts an inefficient Waterfall process, Annie explains how economies of scale apply to the various kinds of waste that are prevalent in today’s small and medium data centers. Furthermore, she connects the dots that lead toward a Green IT option.

Here is Annie:

Harnessing Economies of Scale in Cloud Computing to Realize a Greener Computing Option

Scale Matters: “Over time, however, competitive advantage within categories shifts inexorably toward volume operations architecture.” – Geoffrey Moore, “Dealing with Darwin”

It is a truism that today’s datacenters are systemically inefficient. This is not intended as an indictment of all conventional datacenters. Nor does it imply that today’s datacenters cannot be made more efficient (incrementally) through right sizing and other initiatives, notably consolidation by deploying virtualization technologies and governance by enforcing energy conservation/recycling policies. There are a myriad of inefficiencies, however, that are prevalent in datacenters today.

Many industry observers lament the “staggering complexity” that permeates on-premises datacenters. Over time, most, if not all, enterprise IT datacenters have become amalgamations of disparate heterogeneous resources. Generally, they can be described as incohesive, perhaps even haphazard, accumulations. The datacenter components and configurations often reflect the intersections of organizational politics (LOB reporting structures leading to highly customized/organizational asset acquisitions and configurations), business needs of the moment (shifting corporate strategies and changing business imperatives to gain competitive edge or meet regulatory compliances) and technology limitations (commercial tools available in the marketplace). It should come as no surprise that human interactions and errors are considered a major contributor to the inefficiencies of datacenters: IBM reported that human errors account for seventy percent of the datacenter problems.

The challenge of maximizing energy efficiency begins fundamentally with the historical capital-intensive ownership model for computing assets, under which each organization operates its own datacenter and provides “24×7 availability” to its own users. The enterprise IT staff has been required to support unpredictable future growth, accommodate situational demands and unscheduled but deadline-critical events, meet performance levels within SLAs and comply with regulatory and auditing requirements. Hence, datacenters generally are over-configured and over-provisioned. In addition to highly skewed under-utilization of distributed platform servers, ninety percent of corporate datacenters have excess cooling capacity. Worst of all, according to IBM, about seventy-two percent of cooling bypassed the computing equipment entirely. Further compounding these problems for a typical enterprise datacenter is the lack of transparency and the inability to control energy consumption properly, due to inadequate and often inaccurate instrumentation for quantifying energy consumption and the waste caused by energy loss.

The economics of Cloud Computing can offer a compelling option for more efficient IT: by lowering power consumption for individual organizations and by improving the efficiency of a large number of discrete datacenters. Although the electricity consumption of Cloud Computing is projected to be one to two percent of today’s global electricity use, Cloud service providers can still cultivate sustainable Green IT effectively at lower costs by leveraging state-of-the-art, super energy-efficient massive datacenters, proximity to power generation (thereby reducing transmission costs) and, above all, enormous economies of scale. To better understand how Cloud Computing can offer greener computing and how it will help moderate power consumption by datacenters and rein in runaway costs, a good starting place is James Hamilton’s September 2008 study on “Internet-Scale Service Efficiency”, as summarized in the table below.

Resource         Cost in Medium DC        Cost in Very Large DC    Ratio
Network          $95 / Mbps / month       $13 / Mbps / month       7.1x
Storage          $2.20 / GB / month       $0.40 / GB / month       5.7x
Administration   ≈140 servers/admin       >1000 servers/admin      7.1x

Table 1: Internet-Scale Service Efficiency [Source: James Hamilton]

This study concludes that hosted services offered by Cloud providers with super-large datacenters (at least tens of thousands of servers) can achieve economies of scale of five to seven times over medium-sized deployments (thousands of servers). The significant cost savings are driven primarily by scale. Other key factors include location (low-cost real estate and electricity rates, abundant water supply and readily available fiber-optic connectivity), proximity to electricity and power generators, load diversity, and virtualization technologies.
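
To put those ratios in concrete terms, here is a back-of-the-envelope comparison that applies the unit costs from Table 1 to an illustrative workload. The workload figures and the assumed administrator cost are mine, not part of Hamilton’s study.

```python
# Back-of-the-envelope comparison using the unit costs in Table 1.
# The workload (1,000 Mbps, 200,000 GB, 1,400 servers) and the $10K/month
# fully loaded administrator cost are illustrative assumptions only.
MEDIUM_DC = {"network_per_mbps": 95.0, "storage_per_gb": 2.20, "servers_per_admin": 140}
LARGE_DC  = {"network_per_mbps": 13.0, "storage_per_gb": 0.40, "servers_per_admin": 1000}
ADMIN_MONTHLY_COST = 10_000.0

def monthly_cost(dc, mbps, gb, servers):
    admins = servers / dc["servers_per_admin"]
    return (mbps * dc["network_per_mbps"]
            + gb * dc["storage_per_gb"]
            + admins * ADMIN_MONTHLY_COST)

medium = monthly_cost(MEDIUM_DC, mbps=1000, gb=200_000, servers=1400)
large  = monthly_cost(LARGE_DC,  mbps=1000, gb=200_000, servers=1400)
print(f"medium DC: ${medium:,.0f}/month, very large DC: ${large:,.0f}/month, "
      f"ratio = {medium / large:.1f}x")
```

With these assumptions the medium datacenter comes in at roughly six times the monthly cost of the very large one, consistent with the five-to-seven-times range cited above.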

Will this mark the beginning of the end for traditional on-premises datacenters? Can enterprise IT continue to justify new business cases for expanding today’s non-renewable-energy-powered datacenters? According to the McKinsey article, the costs to launch a large enterprise datacenter have risen sharply from $150M to over $500M over the past five years. Facility operating costs are also increasing at about twenty percent per year. How long will the status quo last for enterprise IT, considering the recent trend among Cloud service providers? Major players such as Google and Microsoft, as well as the U.S. government itself, have invested in or are planning ultra energy-efficient, mega-size datacenters (also known as “container hotels”) with massive commoditized containerization and proximity both to power sources and to less expensive power rates. Bottom line: will the tide turn if the economics (radical cost savings) due to enormous economies of scale become too significant to ignore?

Despite the potential for significant cost savings, it is premature to declare the demise of traditional IT or the end of enterprise datacenters. After all, the rationale for today’s enterprise IT extends well beyond simplistic bottom-line economics – at least for now. To most industry observers, enterprise datacenters are unlikely to disappear, although the traditional roles of enterprise IT will be changing. A likely scenario may involve redistributing IT personnel from low-level system operational tasks to higher-level functions involving governance, energy management, security and business processes. Such change will not only become more apparent but will likely be precipitated by the rise of hybrid Clouds and the growing interconnection linking SOA, BPM and social computing. Another likely scenario is the rise of mega datacenters, or “container hotels”, for Cloud Utility Computing providers. Although the global economic outlook will undoubtedly play a key role in shaping the development plans and timelines of the mega datacenters, they are here to stay. Case in point: Intel estimates that by 2012 it will design and ship about a quarter of the server chips it sells to such mega datacenters.


dev/ops with John Allspaw – The Agile Executive #08

leave a comment »

Guest co-host Andrew Shafer and Coté talk with John Allspaw (now at Etsy) about the dev/ops idea and, more interestingly, cultural and process needs and changes.

To listen to this podcast, download the podcast directly, subscribe to the blog/podcast feed in iTunes (or whatever), or click play below to hear it:

Show Notes

  • Check out the video of the Velocity talk we reference frequently. It has some actual meat when it comes to biting off a dev/ops culture, practices, and process.
  • John opens up by speaking to tools vs/& culture
  • I ask John how his org got over the hump of “we’re a special snow-flake” – resistance to change.
  • Effecting cultural change by speaking to problems solved and benefits of the new culture.
  • What are some major changes in ops and dev process/culture? Continuous deployment (pushing small changes, often).
  • Getting beyond the culture of no – change management tends to be really “no management.”
  • You have to have a good track record to do this (MTTR, MTTD) – don’t crash the car. What’s the deployment-to-incident ratio? (See the short metrics sketch after these show notes.)
  • The role of metrics (monitoring) becomes important.
  • Physical vs. virtual stuff in operations, EC2 use, etc. People use public cloud stuff for best-of-breed solutions, like SmugMug usage.
  • Taking Flickr from the 25th most popular web property to the 5th.
  • Coté is glad John’s at Etsy, because it fuels the best blog du jour, Regretsy.
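
As a small illustration of the track-record metrics John mentions (MTTR, MTTD and the deployment-to-incident ratio), here is a toy calculation over a made-up incident log; the field names and numbers are purely illustrative.

```python
from datetime import datetime

# Toy incident log: when each incident started, was detected, and was resolved.
INCIDENTS = [
    {"started": "2010-02-01 10:00", "detected": "2010-02-01 10:05", "resolved": "2010-02-01 10:35"},
    {"started": "2010-02-08 14:00", "detected": "2010-02-08 14:02", "resolved": "2010-02-08 14:20"},
]
DEPLOYS_IN_PERIOD = 40  # production deployments over the same period

def _minutes(later, earlier):
    fmt = "%Y-%m-%d %H:%M"
    return (datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)).total_seconds() / 60

# MTTD: mean time from incident start to detection.
mttd = sum(_minutes(i["detected"], i["started"]) for i in INCIDENTS) / len(INCIDENTS)
# MTTR here is measured from detection to resolution.
mttr = sum(_minutes(i["resolved"], i["detected"]) for i in INCIDENTS) / len(INCIDENTS)
ratio = DEPLOYS_IN_PERIOD / len(INCIDENTS)

print(f"MTTD: {mttd:.1f} min, MTTR: {mttr:.1f} min, deployments per incident: {ratio:.0f}")
```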

Written by Coté

February 11, 2010 at 3:53 pm

Posted in Podcasts


A Core Formula for Agile B2C Startups

leave a comment »

Colleague Chris Sterling drew my attention to a Pivotal Labs talk by Nathaniel Talbott on Experiment-Driven Development (EDD). It is a forward-looking think piece, focused on development helping the business make decisions based on actual A/B testing data. Basically, EDD is to the business what TDD is to development.

Between this talk and a recent discussion with Columbia’s Yechiam Yemini on his Principles of Innovation and Entrepreneurship course, a core “formula” for Agile B2C startups emerges:

  1. Identify a business process P
  2. Create a minimum viable Internet service S to support P
  3. Apply EDD to S on just about any feature decision of significance (a sketch of this step follows the list)
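
Here is a minimal sketch of step 3, assuming a hypothetical feature experiment and a simple conversion metric. The deterministic split and the bare-bones rate comparison are my illustration, not something prescribed in Talbott’s talk; a real implementation would add proper significance testing.

```python
import hashlib

def variant(user_id, experiment):
    """Deterministically assign a user to 'A' (control) or 'B' (new feature)."""
    digest = hashlib.sha1(f"{experiment}:{user_id}".encode()).hexdigest()
    return "B" if int(digest, 16) % 2 else "A"

def conversion_rates(events):
    """events: list of (user_id, converted) tuples recorded by the service S."""
    counts = {"A": [0, 0], "B": [0, 0]}   # [visitors, conversions] per arm
    for user_id, converted in events:
        arm = variant(user_id, "signup-flow-v2")  # hypothetical experiment name
        counts[arm][0] += 1
        counts[arm][1] += int(converted)
    return {arm: conv / max(visits, 1) for arm, (visits, conv) in counts.items()}

if __name__ == "__main__":
    # The feature decision is then made from observed behavior, not conjecture.
    sample = [(f"user{i}", i % 7 == 0) for i in range(1000)]
    print(conversion_rates(sample))
```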

This core formula can be easily refined and extended. For example:

  • Criteria for choosing P could/should be established
  • Other kinds of testing (in addition to or instead of A/B testing) could be done
  • A customer development layer could be added to the formula
  • Many others…

By following this formula a startup can implement the Agile Triangle depicted below in a meaningful manner. Value is validated – it is determined based on real customer feedback rather than through conjectures, speculations or ego trips.

Figure 1 – The Agile Triangle (based on Figure 1-3 in Jim Highsmith‘s Agile Project Management: Creating Innovative Products)

The quip “The voice of the people is the voice of God” has long been a tenet of musicians. The “formula” described above enables the Agile B2C startup to capture the voice of the people and thoughtfully act on it to accomplish business results.

OpsCamp Through an Internet-scale Lens

with 10 comments

OpsCamp Austin 2010

Like Agile Roots in Salt Lake City in June 2009, OpsCamp in Austin last week demonstrated how powerful grass-roots conferences can be. We might not have had big names on the roster, but we sure had a productive dialog on the tricky issues lurking at the cusp between software development and IT operations in Cloud environments.

The conference has been amply covered by Michael Cote, John Willis, Mark Hinkle, and Damon Edwards (to name a few). This post restricts itself to commenting on one fundamental aspect of the cloud which IMHO does not get the attention it deserves. It might be implied in various discourses on the subject, but I believe it needs to be called out as a fundamental assumption for just about anything  and everything one might consider doing with respect to the cloud. I am referring to economies of scale.

As pointed out in a forthcoming book on Cloud Computing by colleague and friend Annie Shum, the cloud phenomenon is fundamentally driven by substantial economies of scale in very large data centers. The operational costs of running such data centers are close to an order of magnitude lower than those prevailing in small and mid-sized data centers. User benefits are primarily derived from these compelling economies of scale.

I will be asking Annie to write a detailed guest post on the subject for readers of The Agile Executive. Until her post is published here, I would recommend we primarily consider the Cloud as a phenomenon that only becomes meaningful at scale. In particular, Private Clouds are not likely to yield Internet-scale efficiencies. Folks who regard their company’s conventional data center as a private cloud might be missing out on the ‘secret sauce’ of cloud computing.

The various agile system administration schemes discussed at the Austin OpsCamp are essential to attaining the requisite economies of scale in cloud services. Watch out for follow-on OpsCamps in other cities for developments to come in this all-important space.

Agile Infrastructure

leave a comment »

Ten years ago I probably would not have seen any connection between global warming and server design. Today, power considerations prevail in the packaging of servers, particularly those slated for use in large and very large data centers. The dots have been connected to characterize servers in terms of their eco-footprint.

In his Agile Austin presentation a couple of days ago, Cote delivered a strong case for connecting the dots of Agile software development with those of Cloud Computing. Software development and IT operations become largely inseparable in cloud environments. In many of these environments, customer feedback arrives in real time and needs to be responded to in an ultra-fast manner. Companies that develop fast closed-loop feedback and response systems are likely to have a major competitive advantage. They can make development and investment decisions based on actual user analytics, feature analytics and aggregate analytics instead of speculating about what might prove valuable.

While the connection between Agile and Cloud might not be broadly recognized yet, the subject IMHO is of paramount importance. In recognition of this importance, Michael Cote, John Allspaw,  Andrew Shafer and I plan to dig into it in a podcast next week. Stay tuned…

Cloud Computing Forecasts: “Cloudy” Future for Enterprise IT

leave a comment »

In a comment on The Urgency of Now, Marcel Den Hartog discusses technology assimilation in the face of hype:

But if people are already reluctant to run the things they have, on another platform they already have, on an operating system they are already familiar with (Linux on zSeries), how can you expect them to even look at cloud computing seriously? Every technological advancement requires people to adapt and change. Human nature is that we don’t like that, so it often requires a disaster to change our behavior. Or carefully planned steps to prove and convince people. However, nothing makes IT people more cautious than a hype. And that is how cloud is perceived. When the press, the analysts and the industry start writing about cloud as part of the IT solution, people will want to change. Now that it’s presented as the silver bullet to all IT problems, people are cautious to say the least.

Here is Annie Shum‘s thoughtful reply to Marcel’s comment:

Today, the Cloud era has only just begun. Despite lingering doubts, growing concerns and widespread confusion (especially separating media- and vendor-spun hype from reality), the IT industry generally views Cloud Computing as more appealing than traditional ASP/hosting or outsourcing/off-shoring. To technology-centric startups and nimble entrepreneurs, Cloud Computing enables them to punch above their weight class. By turning up-front CapEx into a more scalable and variable cost structure based on an on-demand, pay-as-you-go model, Cloud Computing can provide a temporary, level playing field. Similarly, many budget-constrained and cash-strapped organizations also look to Cloud Computing for immediate (friction-free) access to “unlimited” computing resources. To wit: Cloud Computing may be considered a utility-based alternative to an on-premises datacenter, allowing an organization (notably cash-strapped startups) to “Think like a ‘big guy’. Pay like a ‘little guy’.”

Forward-thinking organizations should not lose sight of the vast potential of Cloud Computing that extends well beyond short-term economics. At its core, Cloud Computing is about enabling business agility and connectivity by abstracting computing infrastructure via a new set of flexible service delivery/deployment models. Harvard Business School Professor Andrew McAfee painted a “Cloudy” future for Corporate IT in his August 21, 2009 blog post and cited a perceptive 1983 paper by Warren D. Devine, Jr. in the Journal of Economic History called “From Shafts to Wires: Historical Perspective on Electrification”.[1] There are three key take-away messages that resonate with the current Cloud Computing paradigm shift. First: The real impact of the new technology was not apparent right away. Second: The transition to full utilization of the new technology will be long, but inevitable. Third: There will be detractors and skeptics about the new technology throughout the transition. Interestingly, the telephone is another groundbreaking disruptive technology that might have faced similar skepticism in the beginning. Legend has it that a Western Union internal memo dated 1876 downplayed the viability of the telephone: “This ‘telephone’ has too many shortcomings to be seriously considered as a means of communications. The device is inherently of no value to us.”

The dominance of Cloud Computing as a computing platform, however, is far from a fait accompli. Nor will it ever be complete, a “one-size-fits-all” or a “big and overnight switch”. The shape of computing is constantly changing, but it is always a blended and gradual transition, analogous to a modern city. While the cityscape continues to change, a complete “rip-and-replace” overhaul is rarely feasible or cost-effective. Instead, city planners generally preserve legacy structures, although some of them are retrofitted with standards-based interfaces that enable them to connect to the shared infrastructure of the city. For example, the Paris city planners retrofitted Notre Dame with facilities such as electricity, water and plumbing. Similarly, despite the passage of the last three computing paradigm shifts – first mainframe, next Client/Server and PCs, and then Web N-tier – they all co-exist and can be expected to continue in the future. Consider the following: major shares of mission-critical business applications are running today on mainframe servers. Through application modernization, legacy applications – Cobol applications, for example – can now operate in a Web 2.0 environment as well as deploy in the Cloud via the Amazon EC2 platform.

Cloud Computing can hold great appeal for a wide swath of organizations spanning startups, SMBs, ISVs, enterprise IT and government agencies. The most commonly cited benefits range from avoiding CapEx and lowering TCO to on-demand elasticity, immediacy and ease of deployment, time to value, location independence and catalyzing innovation. However, there is no magic in the Cloud and it is certainly not a panacea for all IT woes. Some applications are not “Cloud-friendly”. While deploying applications in the Cloud can enable business agility incrementally, such deployment will not fundamentally change the characteristics of the applications to be highly scalable, flexible and automatically responsive to new business requirements. Realistically, one must recognize that many of the challenging problems – security, data integration and service interoperability in particular – will persist and live on regardless of the computing delivery medium: Cloud, hosted or on-premises.

[1] “The author combed through the contemporaneous business and technology press to learn what ‘experts’ were saying as manufacturing switched over from steam to electrical power, a process that took about 50 years to complete.” – Andrew McAfee, September 21, 2009.

I will go one step further and add quality to Annie’s list of challenging problems. A crappy on-premises application will continue to be crappy in the cloud. An audit of the technical debt should be conducted before “clouding” an application. See Technical Debt on Your Balance Sheet for a recommendation on quantifying the results of the quality audit.
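
As a rough illustration of what quantifying such an audit might look like (not the specific method in the referenced post), the sketch below prices static-analysis findings by assumed remediation effort. The violation types, hours and hourly rate are hypothetical.

```python
# Illustrative technical-debt pricing: findings from a static-analysis audit
# are multiplied by assumed remediation hours and a blended hourly rate.
HOURLY_RATE = 100  # assumed blended cost of an engineer-hour, in dollars

REMEDIATION_HOURS = {
    "duplicated_block": 1.0,
    "missing_unit_test": 0.5,
    "high_complexity_method": 2.0,
    "security_violation": 4.0,
}

def technical_debt_dollars(findings):
    """findings: mapping of violation type to count reported by the audit."""
    return sum(REMEDIATION_HOURS[kind] * count * HOURLY_RATE
               for kind, count in findings.items())

if __name__ == "__main__":
    audit = {"duplicated_block": 320, "missing_unit_test": 1500,
             "high_complexity_method": 85, "security_violation": 12}
    print(f"estimated technical debt: ${technical_debt_dollars(audit):,.0f}")
    # Compare this figure with the expected benefit of the migration before
    # deciding whether the application is worth "clouding" as-is.
```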