Major Incident Process
@ Progressive
Enable quick restoration of service and minimize impact for all key incidents via centralized communication, collaboration, facilitation, and coordination.
» Own lifecycle of all Major Incidents
» Remedy Ticket Handling and Documentation
» Send Incident Notifications and Escalations
» Maintenance of the Incident Management Notifications site
» Direct-line for Vendors, Partners, & Customers
Major Incident Facilitation
» When the Incident Management Team (IMT) takes ownership of an incident record, it will provide direction over an AT&T bridge.
» The bridge brings support groups together to make decisions on service restoration and normalization, allowing technicians to collaborate and troubleshoot the incident.
» Research potential causes using the tools available to the team, including change requests, dashboards, and Remedy tickets that may be related to the incident.
» Engage Support Groups / Technicians during the course of a Major Incident as needed
» Document actions from the Major Incident in an Incident Ticket
» Regularly communicate to stakeholders and interested parties
» Monitor if the Major Incident meets any Disaster Recovery Triggers
Incident Task Force
» For prolonged incidents where there is no immediate resolution, the IMT will schedule and facilitate meetings to investigate root cause and normalize service.
Participate in Problem Reviews
» The Problem Management team schedules a review of the Key Incident and invites the IMT. The IMT is needed to collaborate on details and to assist in determining the root cause of the incident it handled. The IMT will also give feedback on how well the incident process was executed.
Emergency Change Management
» When working a Key Incident and a change is required within 24 hours to restore service, the Incident Management Team will be engaged to initiate the Emergency Change process.
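The trigger above can be sketched as a simple check. This is a minimal illustration only: the 24-hour window comes from the bullet above, while the function name, parameter names, and the use of timestamps are assumptions and do not reflect any Remedy API or the team's actual tooling.

```python
from datetime import datetime, timedelta

# Assumption: the 24-hour window is taken from the slide text; everything
# else here (names, inputs) is hypothetical and for illustration only.
EMERGENCY_WINDOW = timedelta(hours=24)


def requires_emergency_change(is_key_incident: bool,
                              change_needed_by: datetime,
                              now: datetime) -> bool:
    """Return True when a Key Incident needs a change within 24 hours."""
    if not is_key_incident:
        # Only Key Incidents are covered by the rule sketched here.
        return False
    return change_needed_by - now <= EMERGENCY_WINDOW
```

For example, a Key Incident that needs a fix in 6 hours would return True and be routed to the Emergency Change process, while one that can wait 3 days would return False.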
» Successful restoration and normalization of Service for Major Incidents
» Creation of Problem Investigations, Emergency Change Requests, & CI Unavailability Forms
» Pages, texts, and emails regarding a Major Incident
» MyQumas documentation, training
Lifecycle Owner
» Leadership, Direction, Standards & Practices
Platform Owner
» Planning, Road-mapping, & Supporting
Support Groups / Technical Staff
» Investigation & Resolution of Major Incidents
Problem COE
» Communicate & Cooperate
Business
» Inform on Major Incidents
Service Desk
» Coordination during a Major Incident
Service Management teams
» Consult with other SMO process teams for consistency
» A Major Incident is an Incident that meets any of the criteria below.
» Call Auditing:
» Incident Management Notifications
» Goal: decrease IMT engagement time
» Pilot to begin in which we dissect a critical application:
– Understand the current monitoring strategy
– Enhance the monitors
– Have monitors directly notify the IMT
– Build pre-defined communication / escalation scripts per monitor
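The last two pilot steps could look roughly like the sketch below: each monitor maps to a pre-defined script naming who to notify and what to say. The monitor name, recipients, and message wording are all hypothetical placeholders, not the team's actual monitors or scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationScript:
    notify: tuple   # groups to contact, in order
    message: str    # template with {monitor} and {details} placeholders


# Hypothetical example entry; real monitors and wording would come from the pilot.
SCRIPTS = {
    "checkout-latency": EscalationScript(
        notify=("IMT", "Service Desk"),
        message="[{monitor}] Possible Major Incident: {details}",
    ),
}


def build_notification(monitor_name: str, details: str) -> dict:
    """Build the notification for a firing monitor from its pre-defined script."""
    script = SCRIPTS.get(monitor_name)
    if script is None:
        # No script defined yet: still notify the IMT directly with a generic text.
        return {"to": ("IMT",),
                "text": f"Unscripted alert from {monitor_name}: {details}"}
    return {"to": script.notify,
            "text": script.message.format(monitor=monitor_name, details=details)}
```

Keeping the scripts in one table means a new monitor only needs a new entry, and any monitor without an entry still reaches the IMT.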