Major Incident Management in ServiceNow: Integrating with Problem and Change

– by Bill Cunningham and Rodney Holiman

In a few previous posts we have discussed designing and implementing a Major Incident Management (MI) process and made some comments around MI roles and communications. Let’s begin to apply those ideas to setting up an MI process in ServiceNow.

The out-of-the-box ServiceNow approach to Major Incident is to treat it as a sub-process of Incident Management and provide a separate Form View for those Incidents whose Priority rises above a certain level.

At Service Catalyst we highly recommend developing Major Incident Management as a separate process altogether. This means that it has a separate Process Owner, its own Policy and Workflow, and, in ServiceNow, its own Table and State Model.

We have had a continuing internal discussion over whether it makes more sense to build the ‘MajorIncident’ table by extending either the Incident or Problem table. We typically extend the Problem Management table so that we can make use of the underlying Workaround capability built into that module.

Building off of that table we are able to build a separate state model around Major Incident Management that allows us to tightly integrate with Incident, Problem and Change Management.

Unlike ‘regular’ Incidents, Major Incidents have no ‘Pending’ State. By definition Major Incidents are highest priority and need to remain open and active throughout their lifecycle. If you can pause working on it – you no longer have a Major Incident. (Maybe convert it to a Problem?)

We do make one exception: when a Change has to be raised to resolve the Major Incident. We usually add a State, ‘Pending Emergency Change’, that generates a related Emergency Request for Change (RFC) record. Unlike the Pending State on ‘regular’ Incidents, this State does not pause any SLA clocks we may have defined, but it does streamline the process of generating the Emergency Change documentation. It also signals that a Change is in progress and is expected to address the Major Incident.

When the related RFC is closed, the Major Incident record is returned to Work-In-Progress, if it has not already been manually returned to that state. Remember, we are using ServiceNow to document the process here, not drive the workflow.
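The state handling described above can be sketched in a few lines of platform-neutral Python. This is only an illustration, not ServiceNow script: the class names, the `request_emergency_change` and `on_change_closed` methods, the `sla_paused` flag and the RFC numbering are all hypothetical, and the set of States is limited to the ones named in this post.

```python
from dataclasses import dataclass, field
from enum import Enum


class MIState(Enum):
    # Only the States named in the post; note there is no generic "Pending".
    WORK_IN_PROGRESS = "Work-In-Progress"
    PENDING_EMERGENCY_CHANGE = "Pending Emergency Change"
    RESOLVED = "Resolved"
    CLOSED = "Closed"


@dataclass
class EmergencyChange:
    number: str  # hypothetical numbering scheme
    closed: bool = False


@dataclass
class MajorIncident:
    number: str
    state: MIState = MIState.WORK_IN_PROGRESS
    sla_paused: bool = False  # assumption: nothing in this model ever sets it to True
    changes: list = field(default_factory=list)

    def request_emergency_change(self) -> EmergencyChange:
        """Move to 'Pending Emergency Change' and generate the related RFC."""
        rfc = EmergencyChange(number=f"RFC-{self.number}-{len(self.changes) + 1}")
        self.changes.append(rfc)
        self.state = MIState.PENDING_EMERGENCY_CHANGE
        return rfc

    def on_change_closed(self, rfc: EmergencyChange) -> None:
        """Return to Work-In-Progress unless someone already moved the record."""
        rfc.closed = True
        if self.state is MIState.PENDING_EMERGENCY_CHANGE:
            self.state = MIState.WORK_IN_PROGRESS
```

The guard in `on_change_closed` reflects the point that the tool documents the process rather than drives it: if the record was already moved manually, closing the RFC leaves its State alone.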

In our most common approach, marking the Major Incident record ‘Resolved’ sets two things in motion:
1). An After Action Review (AAR) Task is generated and assigned to the Service Owner of the Major Incident. An SLA is assigned to this task, with notifications that it needs to be completed. When the task is marked as completed, the parent Major Incident record is marked as ‘Closed.’ Enforcing these After Action Reviews ensures that some thought is put into what went wrong, what went right, and how the process can be improved. This step also builds an ongoing record of the Major Incidents that we have managed. It is to be hoped that we don’t have too many of them.

2). A related Problem record is also generated when the Major Incident is marked as ‘Closed’. The After Action Review is focused on immediate lessons learned from the Major Incident. The Problem record is to determine and document the underlying Root Cause of the Major Incident and to take steps so that it will not happen again.
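The chain of follow-up records in the two steps above might be modeled roughly as follows. This is a simplified Python sketch under stated assumptions, not the actual ServiceNow configuration: the class and method names are hypothetical, States are plain strings, and the SLA and notifications on the AAR Task are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AARTask:
    assigned_to: str  # the Service Owner of the Major Incident
    completed: bool = False


@dataclass
class Problem:
    source_major_incident: str  # hypothetical back-reference field
    state: str = "Open"


@dataclass
class MajorIncident:
    number: str
    service_owner: str
    state: str = "Work-In-Progress"
    aar: Optional[AARTask] = None
    problem: Optional[Problem] = None

    def resolve(self) -> AARTask:
        """Step 1: resolving generates the After Action Review Task."""
        self.state = "Resolved"
        self.aar = AARTask(assigned_to=self.service_owner)
        return self.aar

    def complete_aar(self) -> Problem:
        """Completing the AAR closes the parent record, which raises the Problem."""
        if self.aar is None:
            raise ValueError("Major Incident must be resolved before the AAR exists")
        self.aar.completed = True
        self.state = "Closed"
        # Step 2: the related Problem record is generated on closure.
        self.problem = Problem(source_major_incident=self.number)
        return self.problem
```

Keeping the Problem creation inside the closure step means a Major Incident cannot reach ‘Closed’ without both the review and the Root Cause record existing.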

If Root Cause has already been determined, the Problem record can be closed immediately. Even so, it is helpful to have all of our Root Cause documentation in the same place. This is particularly true if you are using the pure model of having all Incident workarounds vetted by Problem Management.

Another thing to keep in mind is that a Problem is a declared state. If for some reason it is decided that Root Cause does not need to be determined, the Problem record can be closed with a closure code of ‘Declined to Pursue.’ Again, there is value in recording this decision in the Problem record. It ensures that it was a conscious decision to not pursue Root Cause and that the decision can be readdressed at a later date should it need to be.
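One way to make sure the decision is always recorded is to refuse to close a Problem without either a Root Cause or a stated reason for declining. The Python sketch below illustrates that rule only; the `close_problem` helper, the field names and the ‘Root Cause Determined’ closure code are hypothetical, and only ‘Declined to Pursue’ comes from the process described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProblemRecord:
    number: str
    state: str = "Open"
    closure_code: Optional[str] = None
    closure_notes: str = ""


def close_problem(
    problem: ProblemRecord,
    root_cause: Optional[str] = None,
    decline_reason: Optional[str] = None,
) -> ProblemRecord:
    """Close a Problem, insisting that the outcome is documented either way."""
    if root_cause:
        problem.closure_code = "Root Cause Determined"  # hypothetical code name
        problem.closure_notes = root_cause
    elif decline_reason:
        problem.closure_code = "Declined to Pursue"
        problem.closure_notes = decline_reason
    else:
        raise ValueError("Record a Root Cause or a reason for declining to pursue it")
    problem.state = "Closed"
    return problem
```

Because the reason is stored on the record, the decision can be found and revisited later if circumstances change.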

So, we have taken three posts to discuss Major Incident Management at a fairly high level. We hope that they have provided some insight into how to approach managing Major Incidents in your organization.

If you would like to discuss these concepts further – you can contact us at contact_us@service-catalyst.com or call us at +1.888.718.1708 and let us know you would like to discuss Major Incident Management or anything about ITSM and ServiceNow implementation services.