Spare Some Change?

A Penny for Your Thoughts

Does your organization have a process to handle changes to services, applications and other types of configuration items?  Is that process able to withstand compliance audits?  Are you able to correlate changes to other ticket types within your organization like incidents and requests?  If the answer to any one of these questions is no, it may be time to reevaluate your change process.

 

Here’s a Nickelback 

ServiceNow’s current Change Application has 5 features out of box that can set your organization up for success when implementing Change Management.

  1. Standard Change Proposal process – allows anyone within the organization to propose a standard change template that, once approved, is automatically added to the standard change library for selection.
  2. Change Request Interceptor – a landing page that lets users select from the three ITIL-recognized change types: Normal, Standard, and Emergency.
  3. Conflict Checking – the ability to check conflicts against blackout periods and maintenance windows.
  4. Risk Assessment – using survey type functionality, a risk assessment provides a list of questions with various weighted values that help determine the risk score and level of a change.
  5. CAB Workbench – the one stop shop for the Change Approval Board to schedule, plan, and manage CAB meetings from a sleek user interface.
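
To make the Risk Assessment item in the list above a little more concrete, here is a minimal sketch of how weighted questions can roll up into a score and a level. It is an illustration in plain Python, not ServiceNow's implementation: the question text, the weights, the 0 to 4 answer scale and the thresholds for Low, Moderate and High are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Answer:
    question: str
    weight: int  # hypothetical importance of the question
    value: int   # hypothetical answer score: 0 (low risk) to 4 (high risk)


MAX_VALUE = 4  # assumed top of the answer scale


def risk_score(answers):
    """Weighted average of the answer values, normalised to 0..100."""
    total_weight = sum(a.weight for a in answers)
    if total_weight == 0:
        return 0.0
    weighted = sum(a.weight * a.value for a in answers)
    return 100.0 * weighted / (total_weight * MAX_VALUE)


def risk_level(score):
    """Map a score to a level; the thresholds are illustrative only."""
    if score >= 70:
        return "High"
    if score >= 35:
        return "Moderate"
    return "Low"


answers = [
    Answer("Has this change been done before?", weight=2, value=1),
    Answer("Is there a tested back-out plan?", weight=3, value=4),
    Answer("How many users are affected?", weight=5, value=2),
]
score = risk_score(answers)
level = risk_level(score)
print(score, level)
```

In practice the questions, weights and thresholds are exactly the things each organization will want to tune.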


On a Dime

Service Catalyst has completed half a dozen Change implementations this year alone, and while the out-of-box offerings are great, it’s telling that no two implementations have been the same.  For example, the implementation we did for our higher education client was drastically different from the one for our tightly regulated third-party insurance client.  Each of these implementations faced Change Management needs beyond what ServiceNow’s current out-of-box solution provided, so we worked to ensure that the solutions we delivered met the needs of each individual client.  We developed a standard change template review process, ensuring that templates remain current.  We provided the ability to automate risk calculations based on the Configuration Item’s information, without a user ever having to complete a risk assessment.  We recognized that not all CIs will have the same approval or tasking workflows and developed a system that allows for subtypes of changes.  These pieces of functionality, along with other enhancements, boosted the out-of-box offering of Change Management and raised its value on a dime.

 

Quarter Pounder

With cheese… tying it all together in a big juicy burger: you can have all your changes in one release.  Using ServiceNow’s Release application, organizations can group changes into a single release and keep tabs on the status of each change within that release.  We’ve developed the ability to check conflicts against all changes within a release and to update changes directly from the release record.
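
As a rough illustration of what checking conflicts across the changes in a release can mean, the sketch below compares every pair of changes and flags those that touch the same Configuration Item with overlapping planned windows. This is plain Python rather than ServiceNow code, and the definition of a conflict used here (same CI, overlapping window), the record fields and the change numbers are assumptions for the example only.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import combinations


@dataclass(frozen=True)
class Change:
    number: str
    ci: str          # Configuration Item the change touches
    start: datetime  # planned start
    end: datetime    # planned end


def overlaps(a, b):
    """True when the two planned windows share any time (touching ends do not count)."""
    return a.start < b.end and b.start < a.end


def release_conflicts(changes):
    """Pairs of changes in one release that hit the same CI at the same time."""
    return [
        (a.number, b.number)
        for a, b in combinations(changes, 2)
        if a.ci == b.ci and overlaps(a, b)
    ]


release = [
    Change("CHG001", "db-server-01", datetime(2024, 1, 6, 22, 0), datetime(2024, 1, 7, 0, 0)),
    Change("CHG002", "db-server-01", datetime(2024, 1, 6, 23, 0), datetime(2024, 1, 7, 1, 0)),
    Change("CHG003", "web-server-01", datetime(2024, 1, 6, 23, 0), datetime(2024, 1, 7, 1, 0)),
    Change("CHG004", "db-server-01", datetime(2024, 1, 7, 1, 0), datetime(2024, 1, 7, 2, 0)),
]
conflicts = release_conflicts(release)
print(conflicts)
```

Here only CHG001 and CHG002 are reported: CHG003 is on a different CI, and CHG004 starts exactly when CHG002 ends.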

 

We’ve found all that’s left in the couch and wouldn’t mind sharing because we know all organizations could use a little change.


ITIL’s Last Gasp? It may be more relevant than ever

Recently Charles Araujo penned an article raising the question of whether we were seeing ITIL’s last gasp.

https://www.cio.com/article/3237225/itil/is-this-itils-last-gasp.html

I thought it was a very good and thought-provoking post, and my immediate conclusion, expressed in a tweet, was that what ITIL prescribes is still a prerequisite for positioning an organization to adopt concepts such as DevOps and Digital Transformation.



Troy DuMoulin responded to Araujo’s article with a spot-on reminder of the ‘Lean components of value’ (Quality, Speed and Cost) and of ITIL’s role in ensuring the parts that might seem less sexy in light of the current focus on DevOps and Digital Transformation. He emphasizes that ITIL covers the entire set of capabilities for creating value and reminds us that approaches such as DevOps and Agile do not do this, nor do they aspire to.

https://www.linkedin.com/pulse/itil-continued-pursuit-relevance-value-troy-dumoulin/

This is what we continue to find: many organizations still have to work on their blocking and tackling before they can seriously consider applying approaches such as DevOps. In our experience the precepts laid forth in ITIL continue to be the best source for laying this groundwork. The ability to deliver consistent Quality at an acceptable Cost is a requirement that must be met before focusing on Speed and Velocity.

There are some data from the Service Management software market that back this up. ServiceNow estimates the current ITSM software market at about $1.5 Billion, and has set a goal of growing its market to $4 Billion by 2020. I think a good part of these sales, present and future, represent IT organizations investing in a Service Management platform that will allow them to take care of the fundamentals.

In a few recent posts, I have been waxing nostalgic about the old ITIL v2 approach to these ‘fundamentals.’ One of the areas I continue to emphasize is the old ‘Blue Book’ approach to ‘Release and Control’ – the triad of the Change, Release and Configuration processes. Having a solid control over your environment is an absolute requirement before you can consider automating and accelerating your delivery through the Service life cycle.

In fact, one of the books I still frequently recommend is the old Visible Ops book. Even in the SaaS and Cloud environment the message of that volume is critical to set the stage for considering how to automate the value chain:

  • Stabilize and Control your environment using Change Mgt.
  • Identify your CIs – particularly your ‘fragile’ CIs
  • Build a Repeatable Build Library

For organizations that have this level of control in place, we have defined a design and transition method that we call Service Onboarding. While not specifically in the ITIL volumes, this is a unified approach to the Service Design and Transition stages of the ITIL lifecycle. It expands on the concept of the ‘repeatable build library’ to standardize the components and steps for building and transitioning new IT Services. If you will, it standardizes ITIL’s Service Design Package and defines the steps to actualize Service Design in a consistent manner throughout the organization.

http://docplayer.net/12867956-Service-onboarding-a-process-approach-for-uniting-itil-and-devops-bill-cunningham.html

The point is, if these principles that cover the entire lifecycle are not in place, you will not be ready to realize the benefits from automating the Development through Change and Release processes using DevOps. These fundamentals are absolute requirements and ITIL is still the best source for them.

In our experience at Service Catalyst, DevOps and Digital Transformation have made ITIL more necessary and relevant than ever. The sales figures from ServiceNow cited above would seem to support that.

We would love to discuss with you how you might lay the groundwork to realize the benefits from DevOps automation and make the turn towards Digital Transformation. You can contact us at sales@service-catalyst.com or call us at +1.888.718.1708

Major Incident Management in ServiceNow – Integrating with Problem and Change

– by Bill Cunningham and Rodney Holiman

In a few previous posts we have discussed designing and implementing a Major Incident Management (MI) process and made some comments around MI roles and communications. Let’s begin to apply those ideas to setting up an MI process in ServiceNow.

The Out of the Box ServiceNow approach to Major Incident is to treat it as a sub-process of Incident Management and provide a separate Form View for those Incidents that have a Priority indication that rises above a certain level.

At Service Catalyst we highly recommend developing Major Incident Management as a separate process altogether. This means that it has a separate Process Owner and its own Policy and Workflow, and in ServiceNow it gets its own Table and State Model.

We have had a continuing internal discussion over whether it makes more sense to build the ‘MajorIncident’ table by extending either the Incident or Problem table. We typically extend the Problem Management table so that we can make use of the underlying Workaround capability built into that module.

Building off of that table we are able to build a separate state model around Major Incident Management that allows us to tightly integrate with Incident, Problem and Change Management.

Unlike ‘regular’ Incidents, Major Incidents have no ‘Pending’ State. By definition Major Incidents are highest priority and need to remain open and active throughout their lifecycle. If you can pause working on it – you no longer have a Major Incident. (Maybe convert it to a Problem?)

We do make one exception, and that is when a Change has to be raised to resolve the Major Incident. We usually add a State, ‘Pending Emergency Change’, that generates a related Emergency Request for Change (RFC) record. Unlike the Pending state on ‘regular’ Incidents, this state does not pause any SLA clocks we may have defined, but it does streamline the process of generating the Emergency Change documentation. It also signals that a Change expected to address the Major Incident is in progress.

When the related RFC is closed, the Major Incident record is returned to Work-In-Progress, if it has not already been manually returned to that state. Remember, we are using ServiceNow to document the process here, not drive the workflow.

In our most common approach – when the Major Incident record is ‘Resolved’, two things occur –
1). An After Action Review (AAR) Task is generated and assigned to the Service Owner of the Major Incident. An SLA is assigned to this task with notifications that it needs to be completed. When the task status is marked as completed then the parent Major Incident record is marked as ‘Closed.’ Enforcing these After Action Reviews ensures that some thought is put into what went wrong, what went right, and how the process can be improved. This step also builds an ongoing record of the Major Incidents that we have managed. It is to be hoped that we don’t have too many of them.

2). A related Problem record is also generated when the Major Incident is marked as ‘Closed’. The After Action Review is focused on immediate lessons learned from the Major Incident. The Problem record is to determine and document the underlying Root Cause of the Major Incident and to take steps so that it will not happen again.
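
The state model described above can be sketched as a small state machine. This is plain Python for illustration, not a ServiceNow State Model or Business Rule. The ‘New’ starting state, the class and method names and the record number are assumptions. Generating the Problem record on closure follows the wording of item 2 above.

```python
from enum import Enum


class State(Enum):
    NEW = "New"  # assumed starting state, not named in the text
    WIP = "Work-In-Progress"
    PENDING_EMERGENCY_CHANGE = "Pending Emergency Change"
    RESOLVED = "Resolved"
    CLOSED = "Closed"


# Note there is no general 'Pending' state: the only pause-like state
# is the one tied to an Emergency Change.
ALLOWED = {
    State.NEW: {State.WIP},
    State.WIP: {State.PENDING_EMERGENCY_CHANGE, State.RESOLVED},
    State.PENDING_EMERGENCY_CHANGE: {State.WIP},
    State.RESOLVED: {State.CLOSED},
    State.CLOSED: set(),
}


class MajorIncident:
    def __init__(self, number):
        self.number = number
        self.state = State.NEW
        self.related = []  # related records generated along the way

    def move_to(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} is not allowed")
        self.state = new_state
        if new_state is State.PENDING_EMERGENCY_CHANGE:
            self.related.append("Emergency RFC")
        elif new_state is State.RESOLVED:
            self.related.append("After Action Review Task")
        elif new_state is State.CLOSED:
            self.related.append("Problem")

    def rfc_closed(self):
        """Return to WIP unless someone already did so manually."""
        if self.state is State.PENDING_EMERGENCY_CHANGE:
            self.move_to(State.WIP)

    def aar_completed(self):
        """Completing the After Action Review closes the Major Incident."""
        self.move_to(State.CLOSED)


mi = MajorIncident("MI0001")  # hypothetical record number
mi.move_to(State.WIP)
mi.move_to(State.PENDING_EMERGENCY_CHANGE)
mi.rfc_closed()
mi.move_to(State.RESOLVED)
mi.aar_completed()
print(mi.state.value, mi.related)
```

Keeping the allowed transitions in one table makes it easy to see, and to argue about, what the process does and does not permit.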

If Root Cause has been determined, the Problem record can be immediately closed. But it is helpful to have all of our Root Cause documentation in the same place. This is particularly true if you are using the pure model of having all Incident workarounds vetted by Problem Management.

Another thing to keep in mind is that a Problem is a declared state. If for some reason it is decided that Root Cause does not need to be determined, the Problem record can be closed with a closure code of ‘Declined to Pursue.’ Again, there is value in recording this decision in the Problem record. It ensures that it was a conscious decision to not pursue Root Cause and that the decision can be readdressed at a later date should it need to be.

So, we have taken three posts to discuss Major Incident Management at a fairly high level. I hope that they have provided some insight into how to approach managing Major Incidents in your organization.

If you would like to discuss these concepts further – you can contact us at contact_us@service-catalyst.com or call us at +1.888.718.1708 and let us know you would like to discuss Major Incident Management or anything about ITSM and ServiceNow implementation services.

Major Incident Management II – Comments on Roles and Communication

In our last post we began discussing Major Incident Management and offered some thoughts around the authority to declare a Major Incident (MI) and guidelines for when such a declaration should be made. Let’s continue that topic by outlining the roles we typically define for the process. We’ll also offer some suggestions for handling communications during a Major Incident.

Remember – Major Incident Management is a process where the key activity happens outside of your ITSM software. We raise Major Incident records to document what has been done, but the process activities themselves are driven by immediate verbal communication.

Once a Major Incident is declared we suggest three roles to manage the response –
1. The Owner of the Major Incident. This person is responsible for coordinating efforts to resolve the issue. This includes naming the Technical Lead, overseeing communications, contacting vendors and ensuring all the other tasks that are required are assigned and carried out.

The Owner of the Major Incident can change during the lifecycle of the event as the nature of the issue is learned and troubleshooting advances. Any such turnover should, of course, be verbally confirmed, and then documented in the MI record.

We usually recommend that anyone in IT is eligible to be the Owner of any specific Major Incident. We reflect this in the ITSM software by not restricting the ‘Assigned To’ field to be a member of the ‘Assignment Group.’ This field is left open to be filled in by anyone from any IT group.

2. The Technical Lead. The Technical Lead works with the Owner of the Major Incident to form the Technical Team that will be in the troubleshooting room and on the conference bridge, and in the data center, to figure out what went wrong and how to restore service.

As with the Owner of the Major Incident, the Technical Lead and the members of the Technical Team can change as the Major Incident moves through its lifecycle and details of the issue develop.

The Owner of the Major Incident serves as the conduit for communications with the Technical Lead. You should strive to limit the point of contact to only the Owner of the Major Incident. This allows the Technical resources to focus their energies on troubleshooting and resolving the issue.

3. The Communications Lead. The Communications Lead is responsible for coordinating messages about the Major Incident to the client community and other stakeholders (e.g. Board of Directors, Media, etc). The Owner of the Major Incident will serve as a coordination point between the Tech Lead and the Communications Lead.

Some clients have dedicated Communications officers or departments and these folks handle all the communications responsibilities for all Major Incidents. Others do not have permanent dedicated roles for this and the role is assigned separately for each Major Incident.

A brief ServiceNow comment – one fancy thing we usually do is establish a separate ‘MI Communications Tab’ to facilitate the collection and coordination of information that will be used to communicate status and progress about the Major Incident. The specifics of this tab vary greatly from client to client, but the underlying principle is the same: allow the central collection of information that will be used to build communications about the Major Incident, and provide the capacity to build distribution lists that are used to disseminate the right information to the right groups of clients and stakeholders.

Now that we have considered the three roles of Major Incident Management we can take a quick look at an example of a high level workflow of the overall process. This sample process illustrates the MI declaration and the three roles in action:

In our last two posts we have covered some of the considerations we bring to the design of a Major Incident Management process. In a future post we will discuss some ideas for designing a Major Incident process in ServiceNow. (I think we promised that a few posts ago…)


Service Operation: Major Incident Management Overview

In a recent post I mentioned we’d offer some thoughts on fancy integration workflow with Incident, Major Incident and Problem Management.

But first, let’s begin with some suggestions on how to approach Major Incident Management and then in a future article we’ll discuss how it might integrate with Incident, Change and Problem Management.

Within the Service Operation and Service Transition lifecycle stages where most ITSM projects focus, there are three processes that can be effectively started without immediate use of an ITSM tool to enable them. These processes are Problem Management, Change Management and Major Incident Management.

In fact, Major Incident Management is a process that should be designed almost completely separate from any ITSM software considerations. The tool should be almost an afterthought, helping to document what was done, but not driving any workflow during a Major Incident. You should be able to respond to any Major Incident independent of any supporting software or systems because, depending on the situation, you have to be prepared for those systems to not be available.

Another, more prosaic, reason for this system independence is the urgent nature of a Major Incident.

 

Whenever you appoint the owner of a Major Incident, or assign someone a Task to perform in troubleshooting or resolving a Major Incident, there needs to be real-time contact and confirmation. There is no time to waste waiting for tickets in work queues to be noticed and picked up. This is a process that relies on immediate communication and acknowledgment.

In considering the design of the process itself, one necessary element that must be defined is the authority to declare. A Major Incident is a deliberately declared state. It is an organizational decision to prioritize response to a specific Incident above all other activities. Some organizations have the Major Incident process owner declare all Major Incidents; in others the office of the CIO does this. It is also common to delegate the authority to declare to the Service Owner of the principal Service affected by the Incident.

Wherever you choose to locate your Major Incident declaration authority, be sure to consider the chain of command in the event you cannot immediately contact the first person on the list.
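
The chain of command idea can be written down as an ordered list that is walked until someone answers. The sketch below is only an illustration: the order of the roles and the way reachability is checked are assumptions, and in a real Major Incident the check is a phone call, not a function.

```python
def declaring_authority(chain, is_reachable):
    """Return the first person in the chain who can be reached right now."""
    for person in chain:
        if is_reachable(person):
            return person
    return None  # nobody reachable: the policy should say what happens next


# Hypothetical order of authority for one organization.
chain = ["Service Owner", "Major Incident Process Owner", "Office of the CIO"]

# Stand-in for real-time contact attempts.
reachable_now = {"Major Incident Process Owner", "Office of the CIO"}

authority = declaring_authority(chain, lambda person: person in reachable_now)
print(authority)
```

The useful part is less the code than the discipline of writing the list down, in order, before you need it.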

It is helpful to define some guidelines governing the threshold for Major Incident declaration. My advice would be to not go overboard with this. Some organizations try to define these boundaries to the penny, ‘if we are losing $10,000.01 an hour or more then we have a Major Incident.’ Reality is seldom going to be that precise. Keeping the declaration guidelines general is usually best.

Here are some suggested categories of thresholds you might consider providing guidance around:

  • Service interruption that causes potential impact on safety
  • Impact on vital business processes
  • Impact on teaching and learning (for our higher education clients)
  • Loss of revenue
  • Loss of reputation – Public Perception
  • Compliance – Regulatory breaches
  • Impact on ability to perform work

These are just some commonly used categories; the parameters you develop will obviously vary depending on the specific nature of your industry and organization. For example, one higher education client identified the concept of a ‘Teaching Emergency’:

  • The inability of at least one faculty member to teach OR more than one student to learn, now or within the next 24 hours.
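
A guideline like the ‘Teaching Emergency’ above is simple enough to express as a single check, which is a good sign that it is general rather than over-engineered. The function below is a sketch of that one rule. The parameter names are assumptions, and ‘now’ is represented as zero hours until impact.

```python
def is_teaching_emergency(faculty_unable_to_teach,
                          students_unable_to_learn,
                          hours_until_impact):
    """At least one faculty member OR more than one student, now or within 24 hours."""
    within_window = 0 <= hours_until_impact <= 24
    affected = faculty_unable_to_teach >= 1 or students_unable_to_learn > 1
    return within_window and affected


print(is_teaching_emergency(1, 0, 0))    # one faculty member affected right now
print(is_teaching_emergency(0, 1, 0))    # a single student does not meet the bar
print(is_teaching_emergency(0, 2, 12))   # two students, impact in 12 hours
print(is_teaching_emergency(3, 0, 48))   # outside the 24 hour window
```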

In this post we have provided some suggestions to get started thinking about how to approach designing an IT Major Incident Management process. In our next article we will continue the topic with a discussion of the process roles and high level workflow.


 

Improving Problem Management – Root Cause Analysis (RCA) Sub-Process in ServiceNow

I wanted to add some thoughts to our last (rather long) post on defining a Root Cause Analysis (RCA) sub-process as part of a Problem Management discipline. Specifically, just a brief mention of how we might approach enabling Root Cause Analysis in an ITSM Toolset such as ServiceNow.

You may recall that we defined a sample RCA sub-process with six steps we want each of our Problem Management teams to follow. A sensible next question might be: how do we ensure that the process is followed?

Example RCA Sub-Process

Consider Using ServiceNow Tasks
One approach is to use different Task types in ServiceNow. As in most modern ITSM tools, ServiceNow has the capability to add multiple child ‘Task’ records to a parent process record, such as an Incident or Problem.

We can take advantage of this capacity to design different Task types for our RCA process. Each of these different Task types can then have different fields to handle the different phases of the process.


The above slides are simplified examples of a few of these different RCA Task types defined within ServiceNow. Depending on which RCA Task Type is selected, the ‘Task Details’ Tab displays different fields – those fields that are appropriate for the specified Task type.

There are a few variations to consider:
The ‘light’ version makes the different Task types available. They can be created and assigned as necessary.

A more prescriptive version defines a workflow to create the Tasks in a sequence each time an RCA is initiated. This ensures that the team is working through the different phases of the RCA process. This will help drive consistency in approach throughout the organization.

If desired, each different Task in the workflow can be assigned to a different individual, so responsibility can be distributed through the process.

Sample RCA Workflow in ServiceNow

The above is a representation of a very simple workflow in ServiceNow. It creates the RCA Tasks in a sequence. For example, when the Problem Definition Task has been completed successfully the Data Collection Task is then created. Each of these Tasks can be assigned to different individuals for completion. The slide below shows the different RCA Task types that are attached to a single Problem record.
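
For readers who think in code, the sequencing just described (complete one RCA Task, and the next one is created) can be sketched as follows. This is plain Python, not a ServiceNow workflow. The step names follow the six steps of our sample RCA sub-process, while the class, the task states ‘Open’ and ‘Closed Complete’, the Problem number and the assignee are assumptions for the example.

```python
RCA_STEPS = [
    "Problem Definition",
    "Data Collection",
    "Identify Possible Causal Factors",
    "Identify Root Cause",
    "Recommend and Design Solutions",
    "Implement and Review Solutions",
]


class RcaWorkflow:
    """Creates RCA Tasks one at a time, in sequence, under a Problem record."""

    def __init__(self, problem_number, assignees=None):
        self.problem_number = problem_number
        self.assignees = assignees or {}  # optional task type -> person
        self.tasks = []
        self._create_task(0)  # starting an RCA creates the first Task

    def _create_task(self, index):
        name = RCA_STEPS[index]
        self.tasks.append({
            "type": name,
            "assigned_to": self.assignees.get(name),
            "state": "Open",
        })

    def complete_current(self):
        current = self.tasks[-1]
        if current["state"] == "Closed Complete":
            raise RuntimeError("RCA already finished")
        current["state"] = "Closed Complete"
        next_index = len(self.tasks)
        if next_index < len(RCA_STEPS):
            self._create_task(next_index)  # the next phase only now appears


wf = RcaWorkflow("PRB0001", assignees={"Data Collection": "alice"})
wf.complete_current()  # Problem Definition done, Data Collection created
print([(t["type"], t["state"], t["assigned_to"]) for t in wf.tasks])
```

Because a Task only exists once its predecessor is complete, a team cannot skip ahead, which is the point of the more prescriptive version.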

Consider Using Tabs on the Problem Record
Consider this approach if you anticipate having one person responsible for the management of individual Problems throughout their lifecycle. The Problem record is assigned to that individual and they are responsible for completing the information for the respective RCA phases directly on the Problem record.

The sample slide above shows the RCA stages arranged as tabs on the Problem record itself. The individual managing the Problem record can then enter all the RCA information in one place.

Conclusion
These are just a few examples of how Service Catalyst is enabling the RCA sub-process in the ServiceNow environment for our clients. These concepts can be easily applied to any comprehensive ITSM Tool set. We hope they have given you some practical ideas for how to improve your Problem Management approach within your ITSM software.

If you would like to discuss these concepts further – you can contact us at info@service-catalyst.com or call us at +1.888.718.1708 and let us know you would like to discuss Problem Management, Root Cause Analysis or anything about ITSM and ServiceNow implementation services.

Improving Problem Management – Consider Formalizing an RCA Sub-Process

In a previous post we offered some practical solutions for getting started with Problem Management. We suggested that there is value in simply organizing your Problem candidates in a central location and keeping track of their pursuit, and that this doesn’t necessarily require any fancy software.

This is certainly true, but as your Problem Management practice gains traction you may want to consider becoming a bit more formal in your approach. One way to do this is to define a ‘Root Cause Analysis’ (RCA) sub-process. This will ensure that all Problems are approached using a similar methodology. This methodology can then, if you choose, be defined as a sub-process workflow in your ITSM tool set.

Way back in the 1950’s Dr. Charles Kepner and Dr. Benjamin Tregoe conducted research on decision making at the Strategic Air Command. They discovered that successful decision making by US Air Force officers had less to do with rank or career path than with the logical processes the officer used to gather, organize, and analyze information before taking action. Their resulting book The Rational Manager (1965) is a business classic well worth reading.

Defining a standard RCA approach leverages this insight and applies it to your Problem Management process.

Step 1: Problem Definition
Clearly define the Problem, paying close attention to the ‘boundary.’ Two questions can help –
‘What IS the Problem?’
‘What IS NOT the Problem?’

This may sound simple, but don’t underestimate the value of a disciplined approach to defining the Problem and the boundary of your definition.

It can be quite valuable to clearly understand what is working correctly and what is NOT working correctly as part of your investigations for identifying the root cause.

In framing this step be sure to include relevant stakeholders and ask ‘open’ questions that people cannot answer with a simple ‘yes’ or ‘no’.

For example:
• What do you see happening?
• What are the specific symptoms?

Step 2: Collect Data about the Problem
You want to collect as much relevant data about the Problem as you can.

You may also need to obtain evidence that you have an underlying Problem and not just a perception.

Example questions for this stage:
• What proof is there that the Problem actually exists? Reports, Logs, anecdotes (they count), etc.
• How long has the Problem existed? What is the timeline of the Problem and its symptoms?

• What is the impact of the Problem? (e.g., Number of people affected, loss of revenue, effect on brand etc.)
• What’s the urgency to resolve the Problem? (Is it getting worse?)

Be sure to analyze the situation completely before examining the factors that contributed to the Problem.

It will help if you include as many stakeholder representatives as possible. The data collection and analysis will benefit from a variety of points of view, including your clients as well as the ‘experts’ and front line staff. Folks who are most familiar with the Problem can often help lead you to a better understanding of the underlying issues.

Remember, in all these phases, to document your findings. They are inputs to the next Step…

Step 3: Identify Possible Causal Factors
Be sure to be exhaustive here. It can be a mistake to identify a few possible causes and then rush off to the next step.

If the issue is urgent, you can certainly work in parallel – pursuing possible solutions while you continue to consider investigating possible causes of the Problem.

The point is that you want to consider all possibilities. Make use of the team you have assembled to look deeply at the issue and identify as many causal factors as possible. Too often, people identify one or two factors and then stop. But that’s not sufficient.

Consider deeper causes and try to be sure that you have arrived at the underlying Root Cause of the Problem rather than just surface symptoms.

Some tools and techniques that help with this step include:

• 5 Whys – Ask “Why?” until you have dug down as far as you can. Use the facts and keep asking “So what?” to determine all the possible consequences of a fact.
• Fault Tree Analysis – Drill Down to break a Problem down into small, detailed parts. Paradoxically this will help to better understand the big picture.
• Ishikawa or Fishbone (Cause and Effect Diagrams) – Create a chart of all of the possible causal factors, to see where the trouble may have begun.

Step 4: Identify Root Cause
Use the Information and Analysis to identify underlying Root Cause. As a practical matter this step is closely aligned with the third step. The same tools can be used to try to determine the underlying causes of the Problem.

Again the caution – be sure not to skip too far ahead. Be sure to examine each possible identified factor if you want to be confident in the completeness of your Root Cause analysis.

Look for the root cause of each factor and how it contributes to the Problem.

• Why is the Causal Factor present, why is it happening, when is it happening?
• Ultimately – why is the Problem occurring? Remember, there may be more than one contributing factor.

Step 5: Recommend and Design Solutions

Another caution – Since there may be more than one contributing factor, there may need to be more than one solution pursued to completely address the underlying Problem.

For each proposed solution you should be sure to consider the possible effects –
• What can be done to prevent recurrence of the Problem?
• How will the solution be implemented? What are the risks? What other systems or components could be affected by the solution?
Try to anticipate the possible effects of the solution, particularly for complex Problems. Predict the positive and negative effects of the solution.

Step 6: Implement and Review Solutions

It is likely that you will have to engage with Change Management to implement some solutions.

Others may be addressed with workarounds. Some may require ‘acceptance’ meaning that the cost is too great to pursue correction and the Problem will be ‘lived with.’

After each solution has been implemented the Problem team should review the results. Be alert for unanticipated effects on related systems and components.

For some Problems the RCA process itself should be a part of the review. What parts of the process worked well? Were the right stakeholders included in the Problem definition phase? Was the solution analyzed correctly? These are some of the questions that can inform a review of the process.

Conclusion
This has been a bit longer than our normal post, but Root Cause Analysis is an important topic. If you have taken our advice and begun a simple centralized approach to Problem Management, formalizing the Root Cause sub-process is a great next step to take to increase the effectiveness of your Problem Management process.

If you would like to get started applying these concepts in your organization – you can contact us at sales@service-catalyst.com or call us at +1.888.718.1708 and let us know you would like to discuss Problem Management or anything about ITSM and ServiceNow implementation services.

Problem Management – Some Practical Suggestions on Getting Started

In a previous post we went back in history to the old ITIL v2 Service Support volume to highlight an effective role Problem Management can play in providing control and oversight to the solutions and workarounds developed and applied in the support organization. This post will offer some thoughts on how to get started with Problem Management.

I have been on several projects where Problem Management was in the original scope of work but we never actually developed the process. More urgent items and limited time and resources moved the Problem effort out of ‘phase 1’. This is unfortunate because having a separate process to oversee the Incident workarounds and keep track of all of the longer term issues in a centralized place can provide immediate benefits.

And the fact is that Problem Management can be implemented with very little initial overhead.

One way to start very quickly is to appoint a Problem Manager to oversee the process. Of course, as with all process owners your Problem Manager will need to be provided with proper resources and authority to manage the process.

Once given the proper organizational authority, your new Problem Manager simply needs to keep track of the list of outstanding ‘Problem Candidates’ and track progress against them.

Figure out a way to keep that list up to date and publicly available and you are in the Problem Management business.

‘Problem Candidates’ is an intentional term. Remember – you always have the option to ‘decline to pursue’ a Problem. It is useful, however, to keep track of them in a centralized list so that the organization has history about the decisions made about which Problems to pursue and which not to pursue.

One mistake to avoid is to form a ‘Problem Team’ that has the responsibility for investigating all Problems. If your organization is large enough that you need a team of people to manage and coordinate the PROCESS of Problem investigation that is one thing. But make no mistake, the Problem Management role is one of coordinating and managing Problem investigations. The teams doing the actual troubleshooting will vary depending on the nature of the specific Problem being investigated.

That is one reason having a centralized Problem process to coordinate all the Problem investigations is of immediate benefit. It is a big help in providing an organizational understanding of Work-In-Progress.

Problem Management is similar to Change Management in that you can derive immediate benefits without having it integrated into your ITSM tool, at least not at the outset. Just appoint a Problem Manager, be sure they have the organizational authority to oversee the process and keep a centralized list of Problems considered, Problems that are being actively pursued (WIP) and those completed.
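As a rough illustration of how little tooling this needs, the centralized list could be sketched in a few lines of Python. The class names, status values and sample entries below are hypothetical, chosen only for this example; a shared spreadsheet would serve just as well.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    # Hypothetical status values: Problems considered, being actively
    # pursued (WIP), completed, and those we 'declined to pursue'.
    CANDIDATE = "candidate"
    IN_PROGRESS = "in progress"
    COMPLETED = "completed"
    DECLINED = "declined"


@dataclass
class Problem:
    title: str
    status: Status = Status.CANDIDATE
    decision_note: str = ""  # why the Problem was pursued or declined


@dataclass
class ProblemRegister:
    problems: list = field(default_factory=list)

    def add_candidate(self, title):
        problem = Problem(title)
        self.problems.append(problem)
        return problem

    def decide(self, problem, status, note):
        # Keep the reasoning with the record so the organization has history.
        problem.status = status
        problem.decision_note = note

    def by_status(self, status):
        return [p.title for p in self.problems if p.status is status]


# Made-up sample data for illustration only.
register = ProblemRegister()
vpn = register.add_candidate("Recurring VPN drops")
printer = register.add_candidate("Legacy printer driver crashes")
register.decide(vpn, Status.IN_PROGRESS, "High Incident volume")
register.decide(printer, Status.DECLINED, "Printers retire next quarter")

print(register.by_status(Status.IN_PROGRESS))
```

The point of the sketch is that declined candidates stay on the list along with the note explaining the decision, rather than being deleted.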

One idea that I have successfully used to jump-start the process is to add a 5-10 minute ‘Problem-section’ to the CAB meeting. Now, you have to be careful with this. It is only going to work if you already have a smoothly functioning Change process and your CAB meetings are efficiently run. But if you do find yourself in that happy state, adding a quick Problem overview section to the meeting can help to jump-start the Problem process and ensure it is well integrated with Change Management.

When it does come time to add Problem to the ITSM tool-set, the process definition itself can be pretty basic. In fact, the out-of-the-box ServiceNow process is not a bad place to start. Unlike Incident and Change, you don’t need a lot of workflow to have an effective Problem Management module. The ability to track Problem Type, assign Tasks to the teams and relate Problems to Incidents and Changes is all you need for an effective baseline Problem Management application.
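To make that baseline concrete, here is a minimal sketch of the three capabilities as a data model. This is not the ServiceNow schema; the class names, field names and record numbers are assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ProblemTask:
    description: str
    assigned_team: str  # the team doing the actual troubleshooting


@dataclass
class ProblemRecord:
    number: str        # hypothetical identifier, not a real numbering scheme
    problem_type: str  # capability 1: track Problem Type
    tasks: list = field(default_factory=list)              # capability 2
    related_incidents: list = field(default_factory=list)  # capability 3
    related_changes: list = field(default_factory=list)    # capability 3

    def assign_task(self, description, team):
        self.tasks.append(ProblemTask(description, team))

    def teams_involved(self):
        # Unique team names, sorted for a stable display order.
        return sorted({task.assigned_team for task in self.tasks})


# Made-up sample data for illustration only.
record = ProblemRecord(number="PRB-001", problem_type="Network")
record.related_incidents += ["INC-101", "INC-102"]
record.related_changes.append("CHG-201")
record.assign_task("Capture packet traces", "Network Operations")
record.assign_task("Review firewall firmware", "Security")

print(record.teams_involved())
```

Note that nothing here involves workflow: the record only carries a type, a list of tasks and links to other tickets.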

You might be surprised to hear that from a ServiceNow implementation specialist, but in a future post we’ll offer some thoughts on fancy integration workflow with Incident, Major Incident and Problem Management.

For now, the thoughts above and in our recent post on using Problem Management to improve Incident Management should help you get started with Problem Management.

Service Operation: Consider using Problem Management to improve Incident Management

‘This stuff makes the most sense of all this stuff.’ – ITIL Foundations student

This comment was made by a student in one of my ITIL Foundations classes, and it was a reasonable statement because he made it while we were covering Service Operation.

Service Operation is the ITIL Lifecycle phase that is most immediately familiar to the majority of the front-line IT staffers who find themselves in an ITIL Foundations class.

The core processes included in the Service Operation volume have roots in earlier versions of ITIL. One of the things we emphasize at Service-Catalyst is the development of a management system that relies on well defined process integration.

The old ITIL v2 Service Support book did a very good job of making the interconnections between the core Service Operation processes clear.

The triad of the Service Desk function and the Incident and Problem Management processes formed a section of the ITIL (v2) Service Support volume. Collectively they were referred to as ‘Support and Restore.’ You used to be able to take a Practitioner exam on these three processes and become a Certified ITIL Support and Restore Practitioner.

Now, aside from the interesting history lesson, I think it can still be helpful to begin with this triad in understanding and applying the Service Operation concepts from the ITIL v3 lifecycle.

In the classic model Problem Management was the ‘parent’ or ‘controlling’ process of the Support and Restore triad. We used to visually demonstrate this oversight role of Problem Management like so:

[Figure: Support and Restore Process Triad - Problem as Controlling Process]

What this meant was that Problem Management had oversight of the Incident Management process. According to the pure model, every workaround or solution applied by Incident Management to resolve an Incident was vetted, tested and approved by Problem Management. In a flow-chart it would look something like this:

[Figure: Incident Management integrated with Problem Management]

‘Wow, something like that could really help us.’ – (a different) ITIL Foundations student when I drew this up on the board.

Now, those of you who have read the v3 materials might object at this point that this is properly the function of the Knowledge Management process. That objection would be correct. But if we go back to the ITIL v2 Service Support outline, as indicated in the graphic above, Knowledge Management was originally a sub-process of Problem Management.

One of the things that the v3 refresh did was to break sub-processes, such as Knowledge Management, out more clearly. Knowledge Management thus became a much more comprehensive and useful process.

One cost of this, however, is that we have lost track of the immediate oversight role that historically was provided by having Problem Management, using what was the sub-process of (a simpler) Knowledge Management, play the role of controlling process for the Support and Restore triad.

In my unscientific experience, most organizations that are ‘doing ITSM’ never really get to Problem Management. The immediate focus is on fixing the ticketing system, getting the Service Desk squared away and getting everyone on the same system for tracking Incidents. The immediacy of the break-fix backlog takes up all the project mind-share.

However, it’s not necessarily such a big step to get started with Problem Management, and using it as a systematic way to review the workarounds and solutions that are applied in the support environment is an excellent way to begin.

Our next article will offer some further thoughts about getting started with Problem Management.

Organizational Vital Signs II – Two Steps to Control your Work In Progress (WIP)

As consultants we have the opportunity to see, examine and work with a variety of different organizations. When I’m first learning about an organization there are a few ‘vital signs’ I look for to get a quick assessment of the shop’s health and maturity.

The first, which was covered in our last post, is the cohesiveness and effectiveness of the leadership team.

Once such a team is in place, one of its first tasks is to understand and gain control over Work in Progress (WIP).

This might seem obvious, but in my experience it is not. It is very rare, even in IT organizations that have a formally organized Project Management Office (PMO), for management to have a really clear idea of the project and operational work the various parts of the shop have committed to.

If the management team does not have that clear idea, it is very difficult to manage effectively. The very basis of management is the efficient and effective allocation of resources. If you, as a manager, don’t know how your resources are allocated, how can you make good decisions about what projects to pursue in the future?

There can certainly be a chicken and egg objection at this point. One might point out that the reason an ITSM program is adopted is, in part, to gain control over the operational commitments required to maintain the current IT environment.

That is a fair point, but before committing to any project, management should at least have a rough order of magnitude of the amount of staff and resources that are committed to just keeping the proverbial lights on. If they don’t, it is inevitable that they will increase the stress on staff by over-committing to project work.

A good first step, and one that we look for, is to have a single, common project approval procedure. There should be no shadow projects; all projects should appear on the common list of approved projects.

A good second step, one that we also look for, is that each of the approved projects has an up to date resource management plan. These plans can be used to gain a comprehensive insight into the collective commitment made to Work-In-Progress.
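To show how those plans roll up, here is a minimal sketch that totals the hours committed per team and flags any team whose project work plus operational load exceeds its capacity. The project names, team names and every number are made up for illustration, and the flat hours-per-week model is an assumption, not a recommended planning method.

```python
from collections import defaultdict

# Hypothetical resource plans: project -> {team: committed hours per week}.
resource_plans = {
    "Service Desk consolidation": {"Service Desk": 60, "Infrastructure": 20},
    "Data center migration": {"Infrastructure": 80, "Network": 40},
}

# Hypothetical weekly capacity and 'keeping the lights on' load per team.
capacity = {"Service Desk": 160, "Infrastructure": 120, "Network": 80}
operational_load = {"Service Desk": 90, "Infrastructure": 40, "Network": 30}


def committed_hours(plans):
    """Sum the project hours committed by each team across all plans."""
    totals = defaultdict(int)
    for allocations in plans.values():
        for team, hours in allocations.items():
            totals[team] += hours
    return dict(totals)


def over_committed(plans, capacity, operational_load):
    """Return {team: hours over capacity} for teams that are over-committed."""
    project_hours = committed_hours(plans)
    result = {}
    for team, available in capacity.items():
        total = project_hours.get(team, 0) + operational_load.get(team, 0)
        if total > available:
            result[team] = total - available
    return result


print(committed_hours(resource_plans))
print(over_committed(resource_plans, capacity, operational_load))
```

With these sample numbers only the Infrastructure team is over-committed (100 project hours plus 40 operational hours against a capacity of 120), which is exactly the kind of conflict that stays invisible when each project is planned in isolation.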

These first two steps to gaining control over project WIP do not require a formal Project Management Office (PMO), but they do require a cohesive leadership team committed to establishing a common set of processes.

Once this has been established, and you have gained an understanding of your WIP, you are in very good shape to consider adopting a formal management framework, such as ITIL.

If you need help gaining control of your WIP, or if you are all set with that step and ready to talk about moving forward with ITSM, let us know. We’d love to discuss your next steps.