
CompanyX Major Incident Management Process 2022 03

Purpose
This document describes the process followed by the Helpdesk for Major Incident Management
(Severity 1 and Severity 2).
Scope
This document is intended as a reference for CompanyX IS employees.
Overview
A Major Incident is an unplanned interruption to an IS service that seriously disrupts business
activities (for example, by affecting revenue, causing financial impact, risking the loss of
machine-critical data, or damaging the reputation of the business) and must therefore be resolved
with greater urgency. The aim is the fast recovery of the service, where necessary by means of a
Workaround. If required, specialist support groups or third-party suppliers (3rd Level Support)
are involved. If the root cause cannot be corrected, a Problem Record is created and transferred
to Problem Management.
Related Documents
Document Title
User Access Management Process
Incident Priority Matrix
Incident Management Process
Problem Management Process
Abbreviations and Acronyms
Term    Description
CMR     Change Management Request
ECMR    Emergency Change Management Request
MOD     Manager on Duty
NOC     Network Operations Center
SLO     Service Level Objective
SNOW    ServiceNow
SOS     System Order Severity
MOC     Manager On Call
Common Terms
Term            Description
Change Request  A formal proposal for a Change to be made.
Everbridge      Third-party SMS mass notification system.
Incident        An unplanned interruption to an IS service or a reduction in quality of an IS
                service. It could also be a failure of a configuration item such as server,
                database, application, network, scheduled job, and so on, prior to service
                disruption (Event Management input).
Problem         A problem is the unknown, underlying cause of one or more incidents. The
                cause is usually unknown at the time of Record creation. Problem
                Management process is responsible for further investigation.
Clarifications and Revisions
Please send an e-mail to cybersecurity@CompanyX.com for any of the following:
- Clarifications on this process
- Recommendations for revision
MAJOR INCIDENT MANAGEMENT PROCESS
The process flow diagram and process descriptions are provided in the following sections.
Severity 1 Process Flow
The following diagram shows the process activities. Double-click the diagram to view it in PDF format.
Severity 1 Process Description
1. The NOC team sends a bridge notification to IS users and calls the on-call support groups
if it is determined that a bridge is required.
2. If the incident is being worked on by a vendor or ISP, or is a power-related issue, a
bridge is not required. The NOC team calls the on-call support groups if required.
Helpdesk/NOC works with the IT vendor/third party to resolve the issue.
Once the issue is resolved and confirmation is received, step 12 is followed.
The problem ticket is assigned to the respective team that needs to coordinate with
the IT vendor/third party.
3. The MOC calls the MOD to inform them of the Severity 1 incident and sends an SMS to the
members of the MOC list using MIR3.
4. If the magnitude of the impact is higher (for example, Wolverine is unavailable globally),
the incident is treated as an SOS. The MOD decides whether to call it out as an SOS. The NOC
and Helpdesk teams then start calling the people on the SOS list.
5. For issues such as e-mail being unavailable on mobile devices, an SMS is sent to
executives if this is decided on the bridge.
6. The Helpdesk sends an alert message to end users for issues that require a
communication. This needs to be discussed and decided on the bridge.
7. Initial diagnosis and troubleshooting are carried out by the support groups on the bridge.
8. When a potential resolution has been identified, it should be applied and tested to
ensure that the service has been fully restored to the users.
9. If a change is required to restore the services, an ECMR is raised and goes
through the Change Management process. Once the ECMR is implemented, it should
be tested to ensure that the service has been fully restored to the users.
10. After incident resolution, the incident ticket is assigned to the support group that
resolved the incident, and the incident is closed in SNOW. Only the Helpdesk should
close a Severity 1 incident.
11. A problem ticket should be created against the incident in SNOW and assigned to the
respective support group (this will be discussed on the bridge).
12. The User and IS communications on the resolution/availability of the systems are
sent out by the Helpdesk team.
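The decision points in the Severity 1 steps above can be sketched in code. This is only an illustrative sketch, not part of any CompanyX tooling: the names Cause, EXTERNAL_CAUSES, bridge_required, next_step_after_resolution, and may_close_sev1 are hypothetical, and the team name "Helpdesk" is compared as a plain string for simplicity.

```python
from enum import Enum


class Cause(Enum):
    """Hypothetical classification of who or what is behind the incident."""
    INTERNAL = "internal"
    VENDOR = "vendor"
    ISP = "isp"
    POWER = "power"


# Causes for which step 2 states that a bridge is not required.
EXTERNAL_CAUSES = {Cause.VENDOR, Cause.ISP, Cause.POWER}


def bridge_required(cause: Cause, noc_wants_bridge: bool) -> bool:
    """Steps 1-2: no bridge for vendor, ISP, or power issues; otherwise the NOC decides."""
    return cause not in EXTERNAL_CAUSES and noc_wants_bridge


def next_step_after_resolution(cause: Cause) -> int:
    """Step 2 sends externally handled incidents to step 12 (communications);
    other incidents continue with step 10 (closure in SNOW)."""
    return 12 if cause in EXTERNAL_CAUSES else 10


def may_close_sev1(team: str) -> bool:
    """Step 10: only the Helpdesk closes a Severity 1 incident."""
    return team == "Helpdesk"


if __name__ == "__main__":
    print(bridge_required(Cause.INTERNAL, noc_wants_bridge=True))
    print(bridge_required(Cause.VENDOR, noc_wants_bridge=True))
    print(next_step_after_resolution(Cause.POWER))
    print(may_close_sev1("NOC"))
```

Keeping each rule in its own small function mirrors the numbered steps, so a change to one step of the process touches only one function.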
Severity 2 Process Flow
The following diagram shows the process activities. Double-click to view the PDF.
Severity 2 Process Description
1. In most Severity 2 incidents, a bridge is not required. The NOC or Helpdesk team calls
the on-call support group: the NOC team for Severity 2 tickets generated from alerts, and
the Helpdesk team for user-reported issues.
2. The NOC team sends a bridge notification to IS users and calls the on-call support groups
if it is determined that a bridge is required.
3. If the incident is being worked on by a vendor or ISP, or is a power-related issue, the NOC
team calls the on-call support groups if required. Helpdesk/NOC works with the IT
vendor/third party to resolve the issue. Once the issue is resolved and
confirmation is received, step 10 is followed.
4. For issues such as e-mail being unavailable on mobile devices, an SMS is sent to
executives if required or if management requests it.
5. The Helpdesk sends an alert message to end users for issues that require a
communication.
6. Initial diagnosis and troubleshooting are carried out by the support groups (if a bridge is
opened, the support groups do the troubleshooting on the bridge). If this is a recurring
incident, it goes to the Problem Management Process. Refer to 05.84 Problem
Management Process.
7. When a potential resolution has been identified, it should be applied and tested to
ensure that the service has been fully restored to the users.
8. If a change is required to restore the services, a CMR is raised and goes
through the Change Management process. Once the CMR is implemented, it should
be tested to ensure that the service has been fully restored to the users.
9. After incident resolution, the incident ticket should be resolved in SNOW by the support
group that resolved the incident.
10. The User and IS communications on the resolution/availability of the systems are
sent out by the Helpdesk team.
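The routing rules in the Severity 2 steps above, together with the choice between a CMR and an ECMR, can be sketched as follows. This is a hedged illustration only: the names Source, on_call_caller, change_request_type, and needs_problem_management are hypothetical and do not correspond to any SNOW API or existing CompanyX script.

```python
from enum import Enum


class Source(Enum):
    """Hypothetical label for how a Severity 2 ticket originated."""
    ALERT = "alert"  # ticket generated from an alert
    USER = "user"    # issue reported by a user


def on_call_caller(source: Source) -> str:
    """Step 1: the NOC calls on-call support for alert-generated tickets,
    the Helpdesk for user-reported issues."""
    return "NOC" if source is Source.ALERT else "Helpdesk"


def change_request_type(severity: int) -> str:
    """Severity 1 raises an ECMR (Severity 1 step 9);
    Severity 2 raises a CMR (Severity 2 step 8)."""
    if severity == 1:
        return "ECMR"
    if severity == 2:
        return "CMR"
    raise ValueError("this process covers only Severity 1 and Severity 2")


def needs_problem_management(recurring: bool) -> bool:
    """Step 6: a recurring incident goes to the Problem Management Process."""
    return recurring


if __name__ == "__main__":
    print(on_call_caller(Source.ALERT))
    print(on_call_caller(Source.USER))
    print(change_request_type(2))
```

Raising an error for severities other than 1 and 2 reflects the stated purpose of this document, which covers only Major Incidents.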