Seven Steps to Creating an Effective Computer Security Incident Response Team

11 January 2012
ID: G00225512
Analyst(s): Rob McMillan, Andrew Walls

Summary

A carefully managed incident response team, designed to meet specific organizational circumstances, is a vital component of an organization's defense. A phased approach to the creation of the team will ensure optimal effectiveness.

Overview

An effective computer security incident response team (CSIRT) is the result of planning, preparation and training. Chief information security officers (CISOs) and other key security decision makers should follow a phased approach in developing and maintaining a CSIRT that will identify, contain, escalate, investigate and remediate incidents in a timely and efficient manner.

Key Findings

- A CSIRT, the entity that "owns" an organization's security incident response functions, is an integral component of an effective security program. Nonetheless, many organizations either have no CSIRT in place, or have not established workable goals or procedures for the team.
- CSIRTs require the involvement and support of individuals and roles beyond the information security and IT organizations. Most are actually "virtual teams," with members who can be called on as needed for specific skill sets. Part-time responders cannot be successful without adequate management support for training and the ability to leave their normal jobs on demand.
- The CSIRT's activities impact many different functional areas and operate across organizational and geographical boundaries. For this reason, the commitment and support of many different stakeholders, including senior management, are crucial to the team's success.
- A phased approach to development and implementation will enable organizations to best assess their needs and implement a CSIRT that will satisfy all stakeholders' needs, and use available resources effectively.
Recommendations

- Identify key executive stakeholders, such as those in lines of business, risk, HR and communications, and gain their explicit backing for the CSIRT's goals and for the budgetary and cultural support it requires.
- Incorporate your response and escalation plan into your corporate policy. This establishes the CSIRT's authority to do whatever is necessary to protect the organization and conduct an investigation, and ensures that CSIRT staff acting within their authority are protected from corporate politics and legal actions.
- Build reporting into all facets of operations, including regular reporting on all activity detected; this provides context when incidents occur.
- Include your virtual team in regular training drills, using a variety of techniques and scenarios.

Table of Contents

Analysis
  The CSIRT: The Key to Managing Today's Security Incident, and Tomorrow's
  Seven Steps to Creating an Effective CSIRT
    Step 1. Define Context and Scope
    Step 2. Establish a Governance Structure
    Step 3. Identify and Acquire Necessary Resources
    Step 4. Define Processes
      CSIRT Process Phases
    Step 5. Assemble a Toolkit
    Step 6. Establish Detection Capabilities
      Human Sources
      Technical Sources
    Step 7. Validate the Process

Tables
  Table 1. CSIRT Skill Sets
  Table 2. Tool Categories

Analysis

The CSIRT: The Key to Managing Today's Security Incident, and Tomorrow's

The CSIRT is an entity that "owns" an enterprise's security incident response functions: their definition, execution and improvement. A capable CSIRT, with adequate executive support, funding and other resources, is an integral component of any information security program. Gartner client interactions show, however, that many enterprises have no CSIRT in place, have poorly defined goals for the team or follow inconsistent procedures.
These problems can dramatically reduce the security organization's effectiveness in responding to future information security incidents, and lack of planning may expose the enterprise to serious legal or regulatory liability. Enterprises at a low level of security maturity view the goal of incident response as simply to recover from the incident, paying minimal attention to "secondary" functions, such as evidence collection, analysis and postmortem reporting. In the long term, this approach will almost certainly result in more security incidents, not fewer, because their root causes are not identified, and infrastructure and process management are not improved. Moreover, disclosure laws in some jurisdictions compel enterprises to carefully analyze data that may have been leaked. A CSIRT does not need to be a full-time team of investigators and technical experts. CSIRTs are often "virtual teams," with members who can be called on when a particular incident requires their specific skill sets. A structure of this type reduces the team's staffing costs and enhances the flexibility of the team, but also presents skills development challenges (for example, the difficulty of bringing team members together for practice exercises). The team's membership must be flexible and agile enough to allow the CSIRT to respond to specific issues that may emerge in the course of an incident, which may require that other personnel be called on to assist the regular members. The team requires at least some dedicated full-time members, and the other virtual team members (part-time or on-call) must have adequate training. The creation of an agile and effective CSIRT requires a phased approach that simultaneously develops the team's technical, managerial and procedural capabilities and the organizational support for it. Incident response is a discipline that touches on many different functional areas and operates across organizational and geographic boundaries. 
Gartner's seven-step approach provides opportunities for feedback from the many stakeholders and participants in incident response, and allows for continuous improvement throughout the development process (see Note 1).

Seven Steps to Creating an Effective CSIRT

Step 1. Define Context and Scope

A clearly defined context, purpose and scope focus the development of the CSIRT, ensure that the program meets internal and external requirements, set appropriate expectations, and establish support from senior leadership.

Context: A CSIRT cannot function in isolation. It depends on internal, outsourced and external functions, and needs to provide support to other organizations within the enterprise. The administrative "home" of the CSIRT must be defined, usually within the security team, because this fundamental decision affects many of the other steps in the process. Establishing other contextual relationships for the CSIRT will help to define its operating model and build organizational support. These relationships may include dependencies on network support teams or technology outsourcing providers, and the provision of services to customer channels and other outside entities (see "Six Decisions You Must Make To Prepare for a Security Incident"). The extent and implications of team interactions with external organizations, such as the Forum of Incident Response and Security Teams (FIRST), must be determined and defined.

Scope: The scope of the team's activities will be determined by its context and purpose. Gartner's phased approach assumes that the CSIRT and its clients are internal. If third-party providers are used for business processing, technology support or other services, it may be necessary to determine whether the CSIRT will have accountability for executing core functions on behalf of the providers, coordinate or oversee provider security responses, or play some other role in ensuring adequate provider response to an incident.
The CSIRT must, of course, cooperate with other operational teams within the enterprise. Clear definitions of responsibility and points of coordination between teams are essential to avoiding functional gaps and territorial conflicts that could diminish the CSIRT's effectiveness.

Core CSIRT functions may include:

- Event monitoring and identification: Monitoring logging infrastructure and other information sources (for example, security information and event management [SIEM] tools) to identify events of interest, and conducting early assessment of events that may constitute security incidents.
- Incident management: Command and control of actions directly required to coordinate work and manage an incident to its completion.
- Forensic analysis and evidence collection: Collation, cataloging and protection of material used in support of decisions made during the incident, as well as research, disciplinary and legal activities following the incident.

Other functions may include:

- Communication: Communication with internal personnel, including impacted business process owners (but not the public or external entities, which should be managed by the corporate communications or public relations organization).
- Investigation: Investigation of potentially inappropriate activities by personnel (including security personnel or CSIRT members), which makes it critical that the team include a human resources (HR) member to reduce the potential for conflict of interest.
- Legal support: Action to help the organization deal with potential legal or other liability (for example, through the exposure of confidential data and trade secrets, theft of funds, loss of private data or implied breach of contract). This requires a clearly defined relationship with legal counsel, and internal legal mechanisms that give the CSIRT the authority and flexibility to make decisions according to organizational priorities (see "Toolkit: Security Incident Response Preparation") and that protect the team if good-faith mistakes are made.
- Service management: Management of service commitments that may be affected by a security incident (for example, unscheduled outages for system repair and patch application, or external review/audit) and result in business or financial impact. This may become critical in heavily outsourced environments.
- Customer service: Managing the impact of security incidents on customer service, which will probably involve a dedicated customer service team and marketing or public relations personnel.

Whether ancillary teams have individuals formally identified as part of the CSIRT, or whether the CSIRT is tasked with providing supporting service to those teams, will depend on enterprise-specific requirements. Many of these elements will also be considered during the development of business continuity plans, and possibly IT disaster recovery plans, and the CSIRT's role should be consistent with these plans.

Step 2. Establish a Governance Structure

The CSIRT's authority must be clearly established by a defined governance structure that is supported by documented policies and oversight by a governance committee. The committee should not be confined to IT leaders and other technologists, but should also include business, operational risk and audit representation (for example, from the corporate audit committee).
The governance committee should have oversight of some elements of team administration, such as structure, critical personnel, funding, and policies and procedures, particularly regarding escalation criteria. Escalation criteria are especially important in large organizations where command and control of an incident that is formally escalated to a crisis may transition from the CSIRT or middle management to senior management. It is particularly important in these environments that the requirements for adherence and support for the incident response process and delegation be enshrined in policy.

The team's governance structure should include regular, required reporting whose content and frequency are determined by the committee, as well as postincident reporting after all high-severity incidents. The governance committee should also ensure oversight of the CSIRT's communications plan, including communications during an incident and ongoing communications not related to specific incidents. This should include defining triggers for, and content of, messages to ensure that they are adequate and appropriate.

Step 3. Identify and Acquire Necessary Resources

The success of the CSIRT will depend heavily on the allocation and availability of appropriate resources, in two key areas:

Personnel: It is impossible to anticipate all the skills that will be required to successfully manage all security incidents. Core roles and skill sets can and should be defined in advance, through a gap analysis, but the CSIRT's structure should anticipate that new skills and capabilities may be required, particularly as new technologies are deployed and the enterprise evolves. Some of the personnel required may come from outside the IT organization (see Table 1).
Function | Description
Technical Systems Knowledge | Knowledge of enterprise systems and platforms (important in establishing dependencies)
Technical Artifact Analysis | The ability to obtain and analyze log information, malware examples and other evidence
Investigative Process Capabilities | Skills in all aspects of incident investigation, including evidence management, journal maintenance, risk escalation, surveillance and team coordination
Intelligence Analysis Capabilities | The ability to collate and analyze intelligence information
Incident Supervision | The ability to manage incidents capably
Communications Management | The ability to deliver accurate, timely communication concerning ongoing incidents and incident outcomes
Public Relations and Media Management | Skills in communicating with the public and other external parties
Law Enforcement Liaison | Knowledge of when and how to engage with law enforcement agencies
Legal and Compliance | The ability to identify and manage legal risks and instances when legal responses (for example, criminal prosecution or other forms of litigation) are called for
HR Management | The knowledge and authority to ensure that appropriate HR processes are followed, particularly regarding information concerning employee behavior or risks to privacy or health and safety
Planning and Evaluation | The ability to document and update the plan in a disciplined fashion and organize periodic testing

Table 1. CSIRT Skill Sets
Source: Gartner (January 2012)

The goal of the personnel/skill set gap analysis is to validate that there is an appropriate level of expertise within each sphere of CSIRT responsibilities. If the necessary skill sets cannot be identified within the enterprise, contractors or service providers may be needed to supplement internal skills. Due to the sensitivity of the information the CSIRT handles, members and outside parties should be subjected to more rigorous background security checks than usual.
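The gap analysis described above can be expressed as a simple coverage check. The following is a minimal sketch in Python, not a prescribed method: the skill names are taken from Table 1, while the roster structure, the `skill_gaps` function and the `min_people` parameter are hypothetical and shown only for illustration.

```python
from typing import Dict, Set

# A subset of the Table 1 functions, used here as the required skill set.
REQUIRED_SKILLS: Set[str] = {
    "Technical Systems Knowledge",
    "Technical Artifact Analysis",
    "Incident Supervision",
    "Law Enforcement Liaison",
    "Legal and Compliance",
    "HR Management",
}


def skill_gaps(
    roster: Dict[str, Set[str]],
    required: Set[str],
    min_people: int = 1,
) -> Dict[str, int]:
    """Return each required skill held by fewer than min_people members.

    roster maps a (hypothetical) member identifier to the skills that
    member holds. The result maps each under-covered skill to the number
    of members who currently hold it.
    """
    coverage = {skill: 0 for skill in required}
    for skills in roster.values():
        for skill in skills & required:
            coverage[skill] += 1
    return {skill: n for skill, n in coverage.items() if n < min_people}
```

Skills reported by such a check are candidates for training, or for supplementing with contractors or service providers. Raising `min_people` above 1 gives a crude way to test whether a skill depends on a single individual.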
Senior management must recognize that, during an incident response action, team members and assigned respondents may need to shift priorities on short notice. CSIRT functions must always be seen as a high priority, and personnel who perform these roles must not be penalized for their CSIRT-related efforts.

Facilities: CSIRT members will require access to key systems, such as capabilities that are normally available via network operations centers (NOCs) or security operations centers (SOCs). The team will also need dedicated infrastructure, which may be protected from the rest of the enterprise, including secure physical facilities, materials storage and dedicated computers, as well as specialized software and hardware. Redundancy in physical resources and technical systems (mobile phones, fixed-line telephones, fax machines and, in extreme circumstances, radio communications) will ensure the continuity of CSIRT operations when normal facilities and technology are corrupted or unavailable.

Step 4. Define Processes

The CSIRT's operational processes must be clearly established. The CSIRT should not, for example, become involved until an event has been identified as security-related and confirmed as an actual security event as defined by enterprise-specific criteria.

In some extreme cases, a security incident may become so severe that it threatens the ongoing viability of the organization and becomes a crisis. The management of a crisis often requires the active participation of management that is typically more senior than that of the CSIRT (for example, the CEO and direct reports). Accordingly, the process should include escalation criteria to identify when an incident is to be declared a crisis and managed via crisis management procedures. It is imperative that this escalation protocol be tested at the executive committee (CEO and direct reports) level.
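Escalation criteria of this kind are easier to test when they are written as explicit rules. The following is a minimal sketch in Python under stated assumptions: the `Event` fields, the `Handling` categories and the `MATERIAL_LOSS_THRESHOLD` value are hypothetical placeholders, and each enterprise would substitute the criteria approved by its own governance committee.

```python
from dataclasses import dataclass
from enum import Enum


class Handling(Enum):
    NOT_CSIRT = "route to normal IT operations"
    CSIRT_INCIDENT = "manage via CSIRT incident procedures"
    CRISIS = "escalate to crisis management procedures"


@dataclass
class Event:
    security_related: bool             # identified as security-related
    confirmed: bool                    # confirmed against enterprise-specific criteria
    estimated_loss: float = 0.0        # hypothetical: projected loss, in currency units
    threatens_viability: bool = False  # hypothetical: threat to ongoing viability


# Hypothetical threshold; the real value is an enterprise-specific decision.
MATERIAL_LOSS_THRESHOLD = 1_000_000.0


def classify(event: Event) -> Handling:
    """Apply the illustrative escalation criteria to a single event."""
    # The CSIRT is not engaged until the event is security-related and confirmed.
    if not (event.security_related and event.confirmed):
        return Handling.NOT_CSIRT
    # Crisis criteria: a threat to viability, or a loss at or above the threshold.
    if event.threatens_viability or event.estimated_loss >= MATERIAL_LOSS_THRESHOLD:
        return Handling.CRISIS
    return Handling.CSIRT_INCIDENT
```

Writing the criteria this way lets a tabletop exercise walk executives through concrete cases, for example confirming that `classify(Event(True, True, threatens_viability=True))` is treated as a crisis.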
The CSIRT's process documentation must identify key roles and responsibilities to ensure that decisions are made by the right people at the right time using the right information. An individual managing an "everyday" incident, for example, will typically not need to be particularly senior. However, if the incident is escalated to crisis level (because it could involve material losses or even more dramatic consequences), higher-level involvement will likely be required. An optimal way to achieve this transition is to have procedures recorded in the form of cross-functional flowcharts, which also reinforces the importance of recruiting disciplined individuals who are dedicated to process.

CSIRT Process Phases

Various references outline proposed and reasonable processes for the management of an incident (see Note 2). A generic process typically includes the following phases:

- Detection
- Assessment
- Mitigation
- Recovery
- Postmortem

Event detection may occur by several means, and is not necessarily the sole responsibility of the CSIRT. However, the CSIRT is generally responsible for the other phases.

The final phase is crucial: postincident reporting addresses the cause, scope, type and level of impact (short- and long-term) and resolution of the incident. The postmortem should also be used to evaluate the effectiveness of the response and identify areas for continuous improvement. All participants in the response should contribute to the postincident report, with an independent colleague taking responsibility for preparing the report. All postmortem reports should use a common format and include a set of action items that are tracked to optimize future operations.

The CSIRT should provide only "anonymized" management presentations of incident response activities. It is not necessary to identify the individual employees involved as investigators, perpetrators or suspects.
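One way to enforce a common postmortem format with tracked action items, while keeping management presentations anonymized, is to separate the full record from the view shown to management. The sketch below is in Python and is illustrative only: the field names and the `management_summary` helper are hypothetical, not a prescribed report format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ActionItem:
    description: str
    owner_role: str   # a role rather than a named individual
    done: bool = False


@dataclass
class PostmortemReport:
    incident_id: str
    cause: str
    scope: str
    impact_short_term: str
    impact_long_term: str
    resolution: str
    # Restricted field: identities are not included in management views.
    people_involved: List[str] = field(default_factory=list)
    action_items: List[ActionItem] = field(default_factory=list)

    def open_actions(self) -> List[ActionItem]:
        """Action items that still need tracking to completion."""
        return [item for item in self.action_items if not item.done]

    def management_summary(self) -> Dict[str, object]:
        """Anonymized view: omits the field that names individuals."""
        return {
            "incident_id": self.incident_id,
            "cause": self.cause,
            "scope": self.scope,
            "impact_short_term": self.impact_short_term,
            "impact_long_term": self.impact_long_term,
            "resolution": self.resolution,
            "open_action_items": [i.description for i in self.open_actions()],
        }
```

Because every report shares the same fields, open action items can be aggregated across incidents. Note that the free-text fields would still need review, since this sketch cannot prevent a name from being typed into `cause` or `resolution`.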
Information concerning the identity and disposition of personnel involved in incidents is the sole responsibility of the HR organization.

Step 5. Assemble a Toolkit

The technical and other tools the CSIRT needs will vary widely, depending on the types of services to be offered, the infrastructure being managed and any outsourcing arrangements that are in place (see Table 2). The enterprise may already have deployed some of these tools for other purposes. The CSIRT should seek to leverage existing investments in tools whenever possible to maximize the cost-effectiveness of the CSIRT operation and to take advantage of available skills.

Function | Tool Categories
Monitoring and Detection | SIEM; Database activity monitoring; Data loss prevention (DLP); Anti-malware; Intrusion prevention system (IPS) (application, host and network); Fraud monitoring and prevention
Incident Management | Case management; Ticketing and tracking system; Database for data storage; Knowledge base
Forensics | Disk/network forensics tools; Network protocol analyzer; Change management records; Malware analysis tools
Technology Support | Notebook computers; Network island separated from production environments; Communications tools (e.g., phones)

Table 2. Tool Categories
Source: Gartner (January 2012)

Some of these tools will be dedicated to the CSIRT (for example, forensics tools), but many (for example, intrusion detection system [IDS]/IPS and DLP tools) are likely to be managed by other teams, or to be shared systems (such as case management systems). The tools the CSIRT requires will be used in an uncertain operational environment, one that may have been compromised, so it is important that the enterprise be able to assert with confidence that they are reliable and can preserve evidence in an untainted fashion. This may become crucial as the consequences of an incident are realized during commercial negotiation or legal proceedings.
CSIRT members must receive training on the tools and techniques they are expected to use. Training for the tools is usually available from the vendor. External training resources for CSIRT operations are available from organizations such as regional computer emergency response teams (CERTs), the CERT Coordination Center and commercial entities (see Note 2).

Step 6. Establish Detection Capabilities

The rapid detection of potential incidents is critical to early identification and response. (Forewarning of planned incident activity is, of course, ideal, but it is difficult to achieve.) Any incidents affecting competitors, business partners or customers should be assessed for possible implications for the enterprise. Third-party intelligence services can also be highly beneficial for providing new information or insight (see "How to Select a Security Threat Intelligence Service").

Detection capabilities should not be restricted to technical sources within the enterprise. High-quality intelligence will usually be a blend of information from human and technical sources.

Human Sources

Human sources are often the trigger for the detection of particularly important or damaging incidents. The establishment of a whistleblower capability will enable personnel to report incidents without fear of retribution (and, in some jurisdictions, guarantee legal protections for the whistleblower). In many cases, individuals may not recognize an event as a security incident, but may simply see a technical anomaly and contact the help desk. For this reason, scripts for help desks and call centers should include questions that may trigger a CSIRT response.

Technical Sources

Technology is one of the key sources, if not the key source, of security information. However, it is important to understand that the role of the CSIRT is to manage the incident as a business event, not purely as a technical event.
The CSIRT must liaise with IT managers to establish clear guidelines for analysis of technical system outputs in order to filter, identify and escalate evidence of possible incidents. The CSIRT is not responsible for operating security monitoring systems (such as SIEM or IPS), but should be informed of alerts generated by these and other systems.

Step 7. Validate the Process

All CSIRT processes must be validated, to ensure that they are adequately documented, that all participants understand their roles, that all participants have the requisite skills and that the available tools are appropriate for the team's purposes. This validation may be performed as:

- A tabletop exercise specifically formulated for the purpose of working through the CSIRT basic procedure (see "Tutorial: How to Plan and Run a Tabletop Training Exercise on Incident Response")
- Part of regular business continuity testing, in which alternative operational arrangements are tested
- Part of regular crisis management testing at the senior management level
- Part of industry exercises, such as the Cyber Storm exercises conducted within some jurisdictions (an option that will likely only be available to large enterprises considered to be part of critical national infrastructure)

Test exercises are easy to overlook, because they do not have the same level of urgency as the enterprise's core business. However, neglecting this phase may leave the enterprise unprepared for a security incident. For this reason, participation in exercises must be mandatory for everyone whose involvement would be needed in a real incident. In heavily outsourced environments, it may be necessary to incorporate participation as a contractual requirement.

Note 1: Seven-Step Approach

The steps Gartner recommends do not need to be implemented sequentially. Implementation will inevitably require that some activities be implemented concurrently, and perhaps iteratively.
Note 2: Resources for CSIRTs

A variety of public and commercial organizations provide a range of support services for CSIRTs, including:

- CERT.org (www.cert.org/cert): The original CERT was formed at Carnegie Mellon University (CMU) in Pittsburgh, Pennsylvania, and remains a major force in the CERT/CSIRT world. Its website provides numerous resources for CSIRTs, including links to manuals, policy templates and training resources.
- SANS (http://isc.sans.edu): The SANS Institute provides training in incident handling and forensic techniques, and the website provides tools and resources for the anticipation and analysis of attack activity.
- US-CERT (www.us-cert.gov): The U.S. Department of Homeland Security, working in conjunction with the CMU CERT, has developed a national, government-based CERT capability to coordinate national CERT efforts. The website provides numerous resources and links.
- CSIRT.org (http://csirt.org): This commercial organization provides a collection of materials and services, including team training, incident escalation and investigation services and links to other online resources (such as CERT.org).
- FIRST (http://first.org): This membership-based organization provides a support service for CERTs and CSIRTs on a global basis. FIRST members tend to be governmental organizations (for example, the U.S. Army CERT, ACERT) and major commercial enterprises (for example, GE-CIRT, General Electric's computer incident response team).

Acronym Key and Glossary Terms

CERT: computer emergency response team
CSIRT: computer security incident response team
DLP: data loss prevention
FIRST: Forum of Incident Response and Security Teams
IPS: intrusion prevention system
SIEM: security information and event management