Incident and Problem Management

The ISO 20000 standard for IT Service Management includes requirements for incident and problem management.

Incident and problem management are separate resolution processes, but they are closely linked. Incident management deals with restoring services to users. Problem management is concerned with identifying and removing the causes of incidents.

Incident Management
An incident is defined as any event which is not part of the standard operation of a service and which causes or may cause an interruption to, or a reduction in, the quality of that service.

The ISO 20000-1 Specification standard requires the following steps be taken for incident management:

  • Record all incidents
  • Adopt procedures to manage the impact of incidents
  • Define in procedures the recording, prioritization, business impact, classification, updating, escalation, resolution, and formal closure of all incidents
  • Keep customers informed of the progress of their reported incident or service request, alert them in advance if their service levels cannot be met, and agree on an action
  • Give all staff involved in incident management access to relevant information such as known errors, problem resolutions, and the configuration management database
  • Classify and manage major incidents according to a process

The ISO 20000-2 Code of Practice reminds us that incidents may be reported by telephone calls, voice mails, visits, faxes, letters, or emails. They can also be recorded directly by users who have access to your incident recording system, or by automatic monitoring software.

The incident management process should include priority assignment and first line resolution or referral. In addition, the process should address security issues, incident tracking, incident verification and closure, and escalation paths.

The incident management staff should have access to an up-to-date database containing information on technical specialists, previous incidents, known errors, workarounds, and checklists that will help them restore service to the business.

Final closure of an incident should only take place when the initiating user has been given the opportunity to confirm the incident has been resolved and service restored.
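
The closure rule above can be sketched as a small state model. This is only an illustration, not something ISO 20000 prescribes: the class, the status names, and the method names (Incident, Status, offer_confirmation, and so on) are hypothetical, and a real service desk tool would track far more detail.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    RECORDED = "recorded"
    RESOLVED = "resolved"
    CLOSED = "closed"


@dataclass
class Incident:
    # Hypothetical record; the field names are illustrative assumptions.
    reference: str
    summary: str
    status: Status = Status.RECORDED
    confirmation_offered: bool = False
    history: list = field(default_factory=list)

    def resolve(self, note: str) -> None:
        """Mark service as restored; the incident is not yet closed."""
        self.status = Status.RESOLVED
        self.history.append(f"resolved: {note}")

    def offer_confirmation(self) -> None:
        """Record that the initiating user was asked to confirm the fix."""
        if self.status is not Status.RESOLVED:
            raise ValueError("only a resolved incident can be offered for confirmation")
        self.confirmation_offered = True
        self.history.append("user asked to confirm resolution")

    def close(self) -> None:
        """Formally close, but only after the user had the chance to confirm."""
        if self.status is not Status.RESOLVED or not self.confirmation_offered:
            raise ValueError("resolve the incident and offer confirmation before closing")
        self.status = Status.CLOSED
        self.history.append("closed")
```

Keeping "resolved" and "closed" as separate states is the design point here: service can be restored quickly, while formal closure waits until the initiating user has had the opportunity to respond.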

Problem Management
A problem is defined as an unknown underlying cause of one or more incidents.

The objective of problem management is to minimize disruption to the business by proactive identification and analysis of the cause of incidents and by managing problems to closure.

The ISO 20000-1 Specification standard requires the following steps be taken for problem management:

  • Record all identified problems
  • Adopt procedures to identify, minimize, or avoid the impact of incidents and problems
  • Define in procedures the recording, classification, updating, escalation, resolution, and closure of all problems
  • Take preventive actions to reduce potential problems, e.g., following trend analysis of incident volumes and types
  • Pass to the change management process any changes required to correct the underlying cause of problems
  • Monitor, review, and report on effectiveness of problem resolution
  • Ensure problem management is responsible for making up-to-date information on known errors and corrected problems available to incident management
  • Record actions for improvement identified during this process and input into a plan for improving the service
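
The trend analysis mentioned in the list above might look like the following minimal sketch. It assumes incidents are available as (reference, category) pairs and that a category recurring at least three times is worth reviewing as a potential problem. The data, the function name recurring_categories, and the threshold are all made up for illustration.

```python
from collections import Counter


def recurring_categories(incidents, threshold=3):
    """Return (category, count) pairs at or above the threshold, most frequent first."""
    counts = Counter(category for _, category in incidents)
    return [(cat, n) for cat, n in counts.most_common() if n >= threshold]


# Hypothetical sample data: (incident reference, category)
incidents = [
    ("INC-101", "email"),
    ("INC-102", "printing"),
    ("INC-103", "email"),
    ("INC-104", "vpn"),
    ("INC-105", "email"),
    ("INC-106", "vpn"),
]

for category, count in recurring_categories(incidents):
    print(f"{category}: {count} incidents, candidate for a problem record")
```

In practice the threshold and the grouping (by category, by configuration item, by time window) would be chosen to suit the service and its targets.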

The ISO 20000-2 Code of Practice says that incidents should be classified to help determine the causes of problems. Once the root cause has been identified, along with a method of resolving the incident, the problem should be classified as a known error.

Known errors should be recorded in the knowledge database together with any workarounds. Information on workarounds, permanent fixes, and problem status should be communicated to those affected and those that support the affected services.
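
As a rough sketch of how known errors and their workarounds could be made available to incident management, the example below treats a problem as a known error once both a root cause and a workaround are recorded. The Problem and KnownErrorDatabase classes and their fields are hypothetical, and lookup by category is an assumption made for the sake of the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Problem:
    # Hypothetical record; the field names are illustrative assumptions.
    reference: str
    category: str
    root_cause: Optional[str] = None
    workaround: Optional[str] = None

    @property
    def is_known_error(self) -> bool:
        # Known error: root cause identified and a way to resolve incidents exists.
        return self.root_cause is not None and self.workaround is not None


class KnownErrorDatabase:
    """In-memory stand-in for the knowledge database used by incident staff."""

    def __init__(self) -> None:
        self._by_category = {}

    def record(self, problem: Problem) -> None:
        if not problem.is_known_error:
            raise ValueError("root cause and workaround are required for a known error")
        self._by_category.setdefault(problem.category, []).append(problem)

    def workarounds_for(self, category: str) -> list:
        """Return the workarounds recorded for a category of incident."""
        return [p.workaround for p in self._by_category.get(category, [])]
```

Incident staff could then call workarounds_for("email") while handling a new email incident, which mirrors the requirement that problem management keeps known error information available to incident management.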

The problem management process should cover identifying any incidents that breach service level targets, as well as defining escalation points and recording resources used and any actions taken.

Problem reviews should be held to investigate unresolved, unusual, or high impact problems. These reviews look for process improvements that prevent recurrence of the incidents, and they examine incident levels against service targets.

We offer ISO 20000 courses that explain the following topics:

Service Delivery
  • Capacity management
  • Service continuity
  • Availability management
  • Service level management
  • Service reporting
  • Information security management
  • Budgeting and accounting for IT services

Control Processes
  • Configuration management
  • Change management

Release Processes
  • Release management

Resolution Processes
  • Incident management
  • Problem management

Relationship Processes
  • Business relationship management
  • Supplier management