Sep 20

Managing Chaos

Managers mostly come in two flavors: the fireman and the accountant.  The fireman is a crisis manager.  This is the one you run to when the data center melts down or your product blows up in the face of a child star on television.  They excel at stopping the bleeding by applying tourniquets and enough emergency care to get the patient out of the hospital.  A crisis manager keeps a cool head during all the shouting, ensures everyone is focused on the most important problem, and bulldozes any roadblocks that come up.  They are not good, however, at what I call steady-state management.  They get bored because they are addicted to adrenaline.  If there’s no crisis, they spend a little time getting better prepared for the next crisis, then they go to sleep.

The steady-state accountant manager, on the other hand, is keeping things moving day-by-day, and slowly improving the processes to produce more and prevent future crises.  They manage people and processes, not situations.  The accountant ideally thinks proactively, compared with the reactive crisis manager.  The better they do their job, the less you realize their value – or even their existence.  In a crisis, however, they are lost.  They look for what went wrong, where their systems failed them, rather than the main goal of getting back in operation as quickly as possible.  They can often add to the panic rather than push through it.

So where do I fit in?  I’d love to say I’m the ideal mix between the two, but that would just be self-serving and inaccurate.  I do fit in the intersection, however.  I would describe my approach as building order from chaos.  I do manage to keep everyone focused on solving the crisis if one appears, but I don’t make the intuitive leaps that a natural crisis manager can.  When the crisis is over, my first task is to gather Lessons Learned and divide them into two categories: How to prevent the next crisis, and How to respond faster next time it happens.  But unlike the crisis manager, my next step is building the processes to prevent future occurrences along with the plans and tools to respond more quickly to any similar event.  In between crises, my aim is to build robust processes within my group to proactively address future needs and more efficiently produce current results – similar to a steady-state manager.  I will admit that focusing on the minutiae of minor incremental improvements, where the accountant shines, is where I begin to lose some interest.

Where is your sweet spot in this spectrum?

Jun 11

Strategic IT Resource Allocation

So, now that I have pontificated for about 6 weeks on IT organizational structure, I can finally answer the Professor’s question: how do you strategically allocate people to keep normal operations flowing yet still advance strategic IT capabilities that extend the business’ competitive advantage?  If you put all your attention on keeping current things from falling apart, the competition will pass you by.  If you focus only on the future, the floor will rot out from under you – that was the essence of the conundrum he presented (rephrased in my own words).  So: where do you put your best people in order to keep both progressing?

The obvious and over-simplified answer is to balance it out so that they’re spread across the organization.  But I contend that beyond being trite, it is also wrong.  First, the two areas require different talents, meaning it is not an either-or situation.  Second, there are different levels (individual contributors, first-level management, executive direction) that provide more levers to push.  I think that the U.S. Navy has long had the general answer to this question.

A ship is run 24 hours a day, separated into watches.  Let’s divorce theory from reality for this discussion and say that there are 3 watches, one led by the CO (Captain), one by the XO (first officer), and one by the CDO (command duty officer, which is a rotating role, not a person).  Who is on the bridge with the captain?  The weakest, newest officers being evaluated or trained into the positions.  Who is on with the weakest of the three commanding officers?  The best officers for each position – specifically to cover the weakness of the officer commanding the watch.  We’ll ignore the mix characteristics for the XO for the time being.

This same model can act as a guide in strategic personnel assignment.  In the maintenance role, you want a tactical leader who is a great crisis manager.  This role needs little strategic thinking, other than planning for the next crisis.  The people they lead need dogged troubleshooting skills and deep knowledge of how things work, but they do not need to be the “best and the brightest.”  The senior leadership of this group performs mostly an administrative role.  Of the three levels (IC, mid-management, executive), the maintenance/operations group needs the mid-management to be its strongest link.

The development organization, on the other hand, is the exact opposite.  The individual contributors need to be independent and creative, the best and the brightest.  Mid-managers in development often only need resource-management or administrative skills – if the individual contributors are as strong as they should be.  The executive level needs strong vision and inspiration ability.  In this case, the stronger people are in the Individual contributor and Executive positions, while mid-management can be weaker.

Those who show strong management potential might be promoted into mid-management of the maintenance organization where they gain knowledge of how the business works, how important it is that things keep operating, and how to deal with high-stress situations.  Managers from the maintenance side of the house can make good candidates for the executive core of the development organization because they now understand more of how the whole business works and what they need to work better.

As with any organization, there is no one cookie-cutter approach that works all the time.  What I describe here works when the IT organization is structured as I described in the past several articles.  The same kind of thinking (why you need strength at different levels) applied to IT Organizations with different structures and strengths will lead to an optimal layering of talent for that specific organization.

Tell us all how your organization focuses its technical talent to achieve organizational objectives.  Have you seen models that work better?

May 14

Framing the Problem…

There are a lot of frameworks out there: ITIL, CMMi, PMP, TOGAF, Six Sigma, and a couple dozen more.   Do any of them work?  Do they add more value than the effort to implement them?  I have seen companies adopt them thinking they are a panacea to solve their organizational problems, and other companies avoid them as if they would rather catch the Bubonic Plague than implement a framework.  Today, I am going to take a break from my ever-growing series on organizational structures and talk about the benefits of a framework.

“The Benefits of a Framework” is a good starting place because it really does not matter which framework you try.  Nor does it truly matter how “complete” the implementation is.  Purity is not beneficial in this case.  If you are starting from zero, and go about it with the right attitude, ITIL will be just as transformative to the organization as Six Sigma – and they could not be more different.  Which one will be easier to implement?  Which one will give you more value?  The framework cannot answer that question.  Only the implementing champion can.

This means that the key to successfully implementing a framework, and to most of the benefit the organization will achieve, lies in the choice of who will champion the effort.  It is that person’s attitude, vision, and ability to rally support around that vision that determine success.  It is ironic that the principle of the framework (every one of them) is to move away from reliance on the organizational hero, yet the implementation depends exactly on that hero.

So what are you looking for in that champion?  First, and foremost, their conviction that the framework will solve organizational challenges and bring business value.  If they do not believe, they cannot inspire.  If they’re not focused on business value as the end-game, they will not achieve it.  If they cannot see the full scope of organizational challenges in the harsh light of day, including in their back yard, they have no ability to apply the tools to solve the problems. 

Another key characteristic is that the champion must plan their own obsolescence.  The framework is not successful if it requires personal charisma to maintain.  At that point the hero is not only no longer needed, but an actual hindrance to eventual success.  The champion must be able to build follower momentum until it is capable of sustaining itself, then get out of the way.  It is about the organization’s success, not the champion’s personal resume or fiefdom.

Finally, the champion needs a healthy dose of realism.  What will the framework solve, and what can it not solve?  In reality, it solves nothing in the present.  It is only the way it was implemented that solves or does not solve a problem – did the implementer aim to solve the right business problem?

So if the framework doesn’t matter, and it is only the alignment of the implementation to the challenges of the business that determines the resulting value, what good is a framework anyway?  Nothing – now.  But it provides a common language with which the future can be described.  Whichever framework you choose to solve any given problem gives you a common taxonomy to ground the discussion with vendors, with potential candidates for key positions, orienting new hires and guiding the thought process of the strategic thinkers in the organization.

So, it does not matter which framework you adopt, only that you have one.  So what if it is only adopted within two groups of your organization – it provides a communication structure and it solved business problems.  This group believes in CMMi and that one believes in Lean?  Great!  Neither one solves all problems, and each has drawbacks that detract from their value which can be solved by the other.  So get the two talking to each other and see what results.  It’ll be interesting, and if it is approached with the right mindset, the offspring of that union will generate business value.  The only problem is that no one will know what to call it.

So: What are the different frameworks and where do they fit in?

Information Technology Infrastructure Library (ITIL) focuses on – in a phrase – service availability.  It defines several disciplines and organizational structures, but in the end every one of them is aimed at preventing a service disruption or restoring a service to operations as quickly as possible.  It provides one possible map for how to structure an IT Operations group and processes for managing change.  The change management is best suited to a Waterfall SDLC, but can be adapted to Agile.  The effort is typically led from the Operations side of the house, in either Risk Management or Production Operations.

Capability Maturity Model Integration (CMMi) focuses on process in the hope that if you get the processes right, the end result cannot help but be right.  Any failure in the result, by definition, was caused by a defect in the process.  CMMi is frequently and incorrectly thought to be the answer to software quality issues.  It is usually led from the Change Management group.

Project Management Professional (PMP) is one of the best certifications I’ve seen.  In order to achieve the certification, the candidate must have led projects throughout the full lifecycle for at least 5 years – and that claim is randomly audited.  Most other certifications test your ability to absorb and repeat their taxonomy.  While PMP speaks in terms of process, with inputs and outputs, it truly centers on communication.  Leadership typically centers in the obvious portfolio/project management organization.

The Open Group Architecture Framework (TOGAF) derives from earlier architectural frameworks, but from what I can see it appears to be a re-cast of PMP with an architectural point of view.  While PMP focuses on communication, TOGAF centers on the artifacts.  The obvious group for this is Enterprise Architecture.

Six Sigma is all about statistical process control.  It is frequently disregarded in the IT realm because it is strongly linked conceptually to manufacturing, and of course manufacturing software is completely unlike manufacturing anything else…  The real problem with using Six Sigma in IT is that there must be sufficient statistical data points to generate a meaningful sigma.  I am lining up a guest writer to discuss how some Lean concepts such as Kanban can apply to the software development process.  This is often led by outside business units or by the Process Improvement committee in the Governance arena.
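To make the "sufficient data points" concern concrete, here is a minimal sketch of one common way a process sigma is computed from defect counts.  The function name `sigma_level` and the sample numbers are hypothetical, and the 1.5 shift is the conventional long-term adjustment used in most Six Sigma tables, not something this article defines.

```python
from statistics import NormalDist


def sigma_level(defects: int, opportunities: int, shift: float = 1.5) -> float:
    """Estimate a process sigma from observed defects.

    Assumption: uses the conventional 1.5-sigma long-term shift.
    """
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    if not 0 < defects < opportunities:
        # With zero defects (or all defects) the normal quantile is infinite.
        raise ValueError("need at least one defect and one success")
    process_yield = 1 - defects / opportunities
    # Convert the yield into a z-score, then add the conventional shift.
    return NormalDist().inv_cdf(process_yield) + shift


# Hypothetical numbers: 3.4 defects per million opportunities.
high_volume = sigma_level(34, 10_000_000)

# Hypothetical small IT sample: 20 releases with 1 or 2 failed releases.
one_failure = sigma_level(1, 20)
two_failures = sigma_level(2, 20)

print(round(high_volume, 2), round(one_failure, 2), round(two_failures, 2))
```

With only 20 releases in the sample, a single additional failure moves the estimate by roughly a third of a sigma, which is why a small IT data set rarely supports a meaningful number.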

Understand that any summarization of these extensive frameworks that boils down to 3 or 4 sentences is generalized almost to the point of uselessness, but it helps to understand the “problem space” that each is intended to solve.  I do not recommend implementing any of them in their entirety, but cherry-picking the best of breed processes to solve targeted operational problems.  Has anyone experienced something that shows otherwise?  Any disagreement with the overly simplistic summaries here?  Please comment so we can all chime in.

May 11

Supporting Production Operations

Continuing my series on the structure of a theoretic IT organization, today’s subject is the group in Operations that supports production systems. The Production Support group’s primary function is to act as a proxy for the customer throughout the organization. Resources are limited in the company, and reacting to every customer issue that appears through the help desk is impossible – trying to do so results in chaos. This group prioritizes the organization’s response on behalf of the customer. They must determine the organizational cost of not resolving the customer’s issue and compare that to the estimated cost to resolve it – as expanded across the potentially affected customer base. There is no single cut-and-dried formula for this; each company must develop its own. The result of this calculation determines which issues get addressed first and how fast.
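As a rough illustration of the comparison described above, the sketch below ranks issues by the cost of leaving them unresolved across the affected customer base, minus the estimated cost to resolve them. The `Issue` class, its field names, and every number are hypothetical; a real company would substitute its own formula and inputs.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    name: str
    cost_of_inaction_per_customer: float  # assumed estimate of loss per affected customer
    affected_customers: int               # potentially affected customer base
    cost_to_resolve: float                # estimated cost of the fix

    @property
    def net_benefit(self) -> float:
        # Cost of not resolving, expanded across affected customers,
        # compared with the estimated cost to resolve.
        exposure = self.cost_of_inaction_per_customer * self.affected_customers
        return exposure - self.cost_to_resolve


def prioritize(issues: list[Issue]) -> list[Issue]:
    """Return issues ordered from highest to lowest net benefit."""
    return sorted(issues, key=lambda issue: issue.net_benefit, reverse=True)


# Hypothetical help desk backlog.
backlog = [
    Issue("rare export crash", 200.0, 10, 6000.0),
    Issue("login timeout", 40.0, 500, 8000.0),
    Issue("report typo", 1.0, 2000, 500.0),
]

ranked = prioritize(backlog)
for issue in ranked:
    print(f"{issue.name}: {issue.net_benefit:.0f}")
```

An issue with a negative net benefit, like the export crash here, costs more to fix than it costs to leave alone under these assumptions, so it falls to the bottom of the queue.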

Production support also represents the customer in planning meetings and product roadmap sessions.

The people in this group must be hired for their empathy with the customer, drive to satisfy the customer’s need, and ability to quantify the business impact of customer problems. Technical skills can be learned. Customer rapport is a talent. One key set of metrics this group must understand inherently is Customer Lifetime Value (CLV), Customer Referral Value (CRV), and the Cost of Quality (CoQ). While it is important to resolve the customer’s problem and generate customer satisfaction, it must be done with the company’s long term interest (the bottom line both now and in the future) in mind.

Another consideration is that, as proxy for the customer, the Production Support organization also owns the data in the production databases.  This seems a strange concept – the data is not owned by the lines of business, but by the customer’s representative.  It makes more sense when you clarify that the line of business users are customers of the IT organization.  Once you make the internal mindset shift that says that the customers own their own data and that Production Support acts as their proxy/guardian, various decisions and thought processes become automatic.  Privacy issues resolve themselves.  SOX separations of concerns become intuitive.  Decisions like retention policies are much easier to manage.  Most importantly, the IT Operations organization becomes naturally and axiomatically more customer focused.  The transformation is amazing to watch.

I have rarely seen Production Operations organizations play prominent roles in large IT organizations, and even more rarely acting as the customer’s proxy.  When used, it has been extremely powerful.  Can anyone share stories of when this concept made a significant difference, for good or ill, in your organization?

May 03

Governing IT

Unlike the Development and Operations organizations, Governance is a functional role, not a position.  People throughout the IT organization perform governance roles. The Governance organization is little more than a collection of committees. 

The most obvious and common of the committees is the Change Advisory Board (CAB) responsible for approving the changes to production systems.  This group should be led by Release Management and include the proposers of changes along with those impacted by the change.  Typically, the group has a set of fixed members from infrastructure and PMO along with temporary invitees reflecting the drivers for the change and the other affected parties.

The Enterprise Architecture Oversight Committee (EAOC) is led by the chief architect or CTO and includes the leadership of all development and infrastructure groups.  This group is charged with maintaining the list of approved/acceptable technologies within the company and vets any proposed new technology – ensuring that they are both supportable and support the strategic direction of the company.

Another governance group is the Process Improvement Committee (PIC).  This team is responsible for defining, producing, and responding to all IT performance metrics for Development and Infrastructure, using this information to define Service Level Agreements and Operating Level Agreements (SLA/OLA) and any process improvement initiatives.  This is the natural home for Six Sigma methodologies and projects (Yes, Six Sigma can be applied to IT).  Once identified, process improvement projects are passed to the Portfolio Management group.

The Project Governance group (PG), led by the head of the Project Management group, arbitrates all project conflicts ensuring that projects are prioritized and resourced at the company level.  Another key function of this team is to spot when one silo within the business is taking action that will either conflict with or potentially enhance the operations of another.  Significant project timeline and cost risks are directed at this group for resolution or mitigation.  Out of this resolution, problems requiring Process Improvement focus will be surfaced and referred to PIC.

The final committee in the Governance arena is the Strategic Planning Committee (SP).  This might be chaired by either or both of the CTO and the leader of Portfolio Management.  The group uses the roadmaps of all the business units and corporate executives as input to define the IT roadmap for the next 2 years, adjusted at each quarterly meeting.

Apr 26

IT Operations’ View of the World…

This discussion continues my series of how a theoretical IT organization might be structured.  Today’s topic is the Operations side of the house.  For this group, the first order of business is to “Keep the lights on” and deliver the IT services the business needs as reliably as possible.  They also experience an interesting dichotomy: reduce the operational risk for the organization while simultaneously providing more services for less money.  Does that sound like fun to you?  It can be, with the right mind-set.  You can imagine, however, why many look upon operations jobs as thankless, meaningless, all-risk-for-no-benefit prospects – when the leader is uninspired.

[Figure: Potential IT org chart]

IT Operations functions break down into three major categories: Infrastructure Management, Production Support, and Risk Management.  Infrastructure Management focuses on the servers, storage, networking equipment, databases, and middleware that keep the organization humming.  Production Support focuses on the customer’s daily experience.  Risk Management provides information security and ensures the quality of the systems produced and maintained by IT.  It is an interesting and worthy distinction that the DBAs in Infrastructure are responsible for the physical databases while Production Support owns the data inside them.

In many organizations I have seen, Infrastructure Support personnel are the least inspired to excel.  There are many good, talented people here, but if they do their job right, no one notices.  A friend of mine was fond of saying: How do you know when you have a good Database Administrator?  He’s bored to tears and no one knows he exists.  This principle is true of the whole Infrastructure Support group.  They are almost always in the blame-chain for everything that goes wrong in their area of expertise.  It takes a special management style to lead such a group effectively.  I’ll cover that in a future article.

Production Support is the Operations Group’s proxy for the customer – both internal and external.   This is a high-stress, interrupt driven organization.  They advise other groups from the customer’s point of view, own the data and therefore are responsible for data quality and scheduling any production data changes.

Risk Management is responsible for protecting corporate assets.  It controls the usual information security functions as well as some less-common functions that other IT organizations place in other functional areas: Change Management and Quality Assurance.  These latter are typically the purview of either infrastructure or production support, but properly are risks to the organization that must be managed.

IT Operations is also the natural “home” of two process framework disciplines, ITIL and CMMi, due to their focus on daily operations.  These frameworks are rarely successful in implementation if the sponsor/champion is not in the right sub-discipline of Operations.  ITIL is often best championed from the Help Desk, but could be led from other Production Support areas.  CMMi is best led from Change Management.   These frameworks require strong buy-in from all of Operations, but the group with the most “skin in the game” is typically the best to lead it.

In future articles, I will expand more on the three focus areas of IT operations as well as a separate article on process frameworks.  Let me hear your thoughts on this article so I can be sure to fully discuss areas of interest to you – and more importantly, include your feedback in designing this theoretic organization.  What kinds of problems do you see with this organizational breakdown?

Apr 23

Musings on Organizational Structure

I was attempting to explain the principles for building a high-functioning IT department to one of my professors, and it led to the diagram below.  I thought it might be interesting to bring the thought process out in the open so that we may all learn from each other.  Too much is involved to discuss the entire structure in one blog entry, but this will serve as the foundation for further conversations on the topic.
[Figure: Potential IT org chart]

To begin with, obviously this chart describes a significantly sized IT organization – at least 50 FTE, perhaps as many as 250.  Smaller organizations would, by necessity, have people performing multiple roles.  The smaller the organization, the more it will drift from this model – as the combination of tasks assigned to individuals will be based more on the inherent talents they bring to the table than on any view of “proper” roles.  Larger organizations will specialize even more than what is indicated here.

At a high level, you can see a division into two large organizational groupings: Development and Operations.  The “Governance” box demonstrates the cross-functional nature of the governance role – it represents all constituencies of IT.  Future blogs will focus on each organizational group and eventually get down to individual roles.  This split, however, represents the natural balance of tensions that is normal to an IT organization.  You begin with one organization whose main purpose is to generate and implement change (Development), then balance that with an organization dedicated to ensuring proper operations of necessary services.  One creates change, the other attempts to limit the risk of implementing that change.  The Governance role brings blended balance to the picture.

In my next blog, I will discuss the IT Operations organization.  In the meantime, please join in the conversation and toss in your two cents.