Backup and Recovery Plan

BACKUP AND RECOVERY PLAN
The Backup and Recovery Plan presents the aspects of the solution relevant to backup and
recovery, identifies and describes weaknesses in the system, and describes backup methods and
recovery steps.
The paragraphs written in the “Comment” style are for the benefit of the person writing the
document and should be removed before the document is finalized.
SEPTEMBER 11, 1998
Revision Chart
This chart contains a history of this document’s revisions. The entries below are provided solely
for purposes of illustration. Entries should be deleted until the revision they refer to has actually
been created.
The document itself should be stored in revision control, and a brief description of each version
should be entered in the revision control system. That brief description can be repeated in this
section.
Version | Primary Author(s) | Description of Version | Date Completed
Draft | TBD | Initial draft created for distribution and review comments | TBD
Preliminary | TBD | Second draft incorporating initial review comments, distributed for final review | TBD
Final | TBD | First complete draft, which is placed under change control | TBD
Revision 1 | TBD | Revised draft, revised according to the change control process and maintained under change control | TBD
etc. | TBD | TBD | TBD
Backup and Recovery Plan
Project Name
PREFACE
The preface contains an introduction to the document. It is optional and can be deleted if
desired.
Introduction
The Backup and Recovery Plan presents the aspects of the solution relevant to backup and
recovery, identifies and describes weaknesses in the system, and describes backup methods and
recovery steps. This plan should encompass several different scenarios, accounting for different
types of failure. This could include steps for replacing hardware, rebuilding/modifying/replacing
the operating system and applications, restoring data, or hot backup systems that stand in for a
failed solution.
Justification
This plan is a key component of the solution. Having the plan in place ensures that
comprehensive backup and recovery steps will be included in the deployment process. This leads
to a solution that meets its availability requirements even if something does fail. It also prevents
the compounding of failures when they do occur. Continuous service by the solution will
increase customer satisfaction and confidence in that solution.
Team Role Primary
Release Management is responsible for developing the Backup and Recovery Plan. Development
also plays a primary role in creating the plan's content to ensure the feasibility of the technical
implementation. Program Management will incorporate the Backup and Recovery Plan into the
Master Project Plan.
Team Role Secondary
All team roles are responsible for reviewing the plan’s content to ensure its execution is feasible.
533575959 (03/07/16)
CONTENTS
New paragraphs formatted as Heading 1, Heading 2, and Heading 3 will be added to the table
automatically. To update this table of contents in Microsoft Word, put the cursor anywhere in the
table and press F9. If you want the table to be easy to maintain, do not change it manually.
1. INTRODUCTION
    1.1 BACKUP AND RECOVERY PLAN SUMMARY
    1.2 AVAILABILITY PLAN OBJECTIVES
    1.3 DEFINITIONS, ACRONYMS, AND ABBREVIATIONS
    1.4 REFERENCES
2. DESCRIPTION OF SOLUTION
    2.1 RECOVERY RESPONSE TIME
    2.2 SINGLE POINTS OF FAILURE
    2.3 LATENCY
    2.4 SYSTEM REDUNDANCY
    2.5 DATA INTEGRITY
    2.6 BUSINESS COST WHILE SYSTEMS ARE DOWN
3. BACKUP AND RECOVERY METHODS
    3.1 RESTORE FROM BACKUP MEDIA
    3.2 REPLAY LOG FILES
    3.3 FAIL OVER
4. RECOVERY STEPS
    4.1 RESTORING SERVICE FROM BACKUP SYSTEMS
        4.1.1 Hot Stand By
        4.1.2 Spare Systems
    4.2 SYSTEM RECOVERY
    4.3 DATA RECOVERY
5. INDEX
6. APPENDICES
LIST OF FIGURES
New figures that are given captions using the Caption paragraph style will be added to the table
automatically. To update this table of figures in Microsoft Word, put the cursor anywhere in the
table and press F9. If you want the table to be easy to maintain, do not change it manually.
This section can be deleted if the document contains no figures or if otherwise desired.
1. INTRODUCTION
This section should provide an overview of the entire document. No text is necessary between the
heading above and the heading below unless otherwise desired.
1.1 Backup and Recovery Plan Summary
Provide an overall summary of the contents of this document.
Some project participants may need to know only the plan's highlights, and a summary gives
them that view. It also lets readers of the full document grasp its essence before they
examine the details.
1.2 Availability Plan Objectives
The Objectives section defines the objectives of the backup and recovery process. This
information should be derived from information about the current operational environment as
well as business requirements and functional specifications. One consistent objective critical to
the customer is to ensure reliable solution operations with a minimum of down time.
Identifying the objectives signals to the customer that the team has carefully considered the
present operational situation, the business requirements, and the solution and created an
appropriate backup and recovery approach.
1.3 Definitions, Acronyms, and Abbreviations
Provide definitions or references to all the definitions of the special terms, acronyms and
abbreviations used within this document.
1.4 References
List all the documents and other materials referenced in this document. This section is like the
bibliography in a published book.
2. DESCRIPTION OF SOLUTION
The Description of Solution section presents key aspects of the solution that are relevant to the
backup and recovery process.
These solution aspects will drive the development of a viable backup and recovery plan.
No text is necessary between the heading above and the heading below unless otherwise desired.
2.1 Recovery Response Time
The Recovery Response Time section defines for each type of solution failure the time estimated
(minimum, average, maximum) to recover and resume operations.
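As a minimal sketch of how the figures for this section could be tabulated, the following Python snippet summarizes recovery durations per failure type. The failure types and durations are hypothetical placeholders, not measurements from any real solution; substitute estimates or recovery-drill results for the actual solution.

```python
from statistics import mean

# Hypothetical recovery durations in minutes, grouped by failure type.
recovery_minutes = {
    "disk failure": [45, 60, 90],
    "operating system corruption": [120, 180],
    "application crash": [5, 10, 15, 30],
}


def summarize_recovery_times(samples):
    """Return {failure_type: (minimum, average, maximum)} in minutes."""
    return {
        failure_type: (min(times), mean(times), max(times))
        for failure_type, times in samples.items()
    }


if __name__ == "__main__":
    summary = summarize_recovery_times(recovery_minutes)
    for failure_type, (low, avg, high) in summary.items():
        print(f"{failure_type}: min={low} avg={avg:.1f} max={high}")
```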
2.2 Single Points of Failure
Critical solution components without redundancy constitute single-points-of-failure; that is, their
failure or degradation causes the solution to fail or to become degraded. The Single Points of
Failure section identifies solution components (hardware, operating system, applications,
infrastructure, procedures, people) that are single-points-of-failure.
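One possible way to start the inventory is sketched below in Python: list every critical component with its number of independent instances and flag those with no redundant copy. The component names and counts are hypothetical. The check is deliberately simple and assumes instances are truly independent; components that share a hidden dependency (for example, one power feed) still need manual review.

```python
# Hypothetical inventory: component name -> number of independent instances.
inventory = {
    "database server": 1,
    "web server": 3,
    "network switch": 1,
    "on-call administrator": 2,
}


def single_points_of_failure(components):
    """Return the sorted names of components that have no redundant instance."""
    return sorted(name for name, count in components.items() if count <= 1)


if __name__ == "__main__":
    for name in single_points_of_failure(inventory):
        print(f"Single point of failure: {name}")
```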
2.3 Latency
Latency is the hidden and often unpredictable time from a failure occurrence (of a critical
solution component or an entire solution) to the point where its effect on other components or
systems has been recognized. The Latency section defines for each type of failure the other
components and systems that may be affected, describes the effect, and estimates the ranges of
latency times.
2.4 System Redundancy
When critical solution components (hardware power supplies, CPUs, data storage devices, key
people) fail or become degraded, solution failures can be avoided or minimized by providing
redundant copies of these components that can be brought online quickly or that operate in
parallel with their counterparts. The System Redundancy section identifies the critical solution
components for which the solution provides redundancy and describes how the redundant
components will be brought online.
2.5 Data Integrity
The Data Integrity section describes the methods the solution uses to preserve data integrity,
such as queuing or real-time backup. Data integrity becomes critical where solutions use
systems that record online transactions or have elements that use data representing a
snapshot from an earlier day's processing.
Plan for data integrity to prevent data loss or corruption that could significantly disrupt
the solution, affecting the users and potentially the business.
2.6 Business Cost While Systems Are Down
The Business Cost While Systems Are Down section estimates by periods of time the costs to the
business of the solution being unavailable because of failure, preventative maintenance, or other
reasons.
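A rough illustration of such an estimate is sketched below in Python. The periods and hourly cost figures are hypothetical assumptions chosen only to show the arithmetic; real figures should come from the business.

```python
# Hypothetical hourly cost of an outage for each period of the day.
HOURLY_COST = {
    "business hours": 5000.0,
    "off hours": 500.0,
}


def downtime_cost(hours_down_by_period, hourly_cost=HOURLY_COST):
    """Return the total estimated cost for the given hours of downtime."""
    return sum(
        hourly_cost[period] * hours
        for period, hours in hours_down_by_period.items()
    )


if __name__ == "__main__":
    outage = {"business hours": 2, "off hours": 4}
    print(f"Estimated cost: {downtime_cost(outage):.2f}")
```

For the sample outage, the estimate is 2 x 5000 + 4 x 500 = 12000.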
3. BACKUP AND RECOVERY METHODS
The Backup and Recovery Methods section describes the methods planned to back up the
hardware, operating system(s), applications, infrastructure, resources, and data that make up
the solution. For each of these solution component classes, the description should include the
type of backup, location of backups, backup procedures, and backup responsibilities. For each
backup method, describe the procedures for using the backup to restart the solution and recover
the state of its operations and the solution data.
No text is necessary between the heading above and the heading below unless otherwise desired.
3.1 Restore from Backup Media
At predetermined checkpoints (after key events or time periods), a solution may back up (store) a
snapshot of its operational state and the information it has processed. Restoring the solution
state and information from backup media (e.g., tape) enables past information to be
reconstructed and the solution to resume operation with a minimum of lost data and time. The
Restore from Backup Media section identifies solution checkpoints and the procedures for using
backed-up solution state information to recover from solution failures or degradation.
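Choosing which checkpoint to restore from can be sketched as follows in Python. This is only an illustration under the assumption that checkpoint timestamps are kept in a sorted list; the function name and timestamps are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime


def latest_checkpoint_before(checkpoints, failure_time):
    """Return the most recent checkpoint at or before failure_time, or None.

    `checkpoints` must be sorted in ascending order.
    """
    index = bisect_right(checkpoints, failure_time)
    return checkpoints[index - 1] if index > 0 else None


if __name__ == "__main__":
    # Hypothetical checkpoints taken every six hours.
    checkpoints = [
        datetime(2024, 1, 1, 0, 0),
        datetime(2024, 1, 1, 6, 0),
        datetime(2024, 1, 1, 12, 0),
    ]
    failure = datetime(2024, 1, 1, 9, 30)
    print(latest_checkpoint_before(checkpoints, failure))
```

Work performed between the selected checkpoint and the failure is not covered by the restore and has to be recovered by other means or accepted as lost.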
3.2 Replay Log Files
Operations personnel and operating systems maintain logs (log files) of solution events and their
time of occurrence. Replaying log files often enables past information to be reconstructed. The
Replay Log Files section describes the log files that operations will maintain, the procedures
used to record events and time in the logs, and the procedures employed to reconstruct solution
information from the log files.
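The idea of reconstructing information from a log can be illustrated with a small Python sketch. It assumes a hypothetical log format of (timestamp, operation, key, value) entries in chronological order and a key-value snapshot restored from backup; real log formats and replay tools will differ.

```python
def replay_log(snapshot, log_entries, since):
    """Rebuild state by applying logged events recorded after `since`.

    `snapshot` is a dict of key -> value restored from backup.
    `log_entries` is an iterable of (timestamp, operation, key, value)
    tuples in chronological order; operation is "set" or "delete".
    """
    state = dict(snapshot)  # copy so the restored snapshot stays untouched
    for timestamp, operation, key, value in log_entries:
        if timestamp <= since:
            continue  # already reflected in the snapshot
        if operation == "set":
            state[key] = value
        elif operation == "delete":
            state.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {operation!r}")
    return state


if __name__ == "__main__":
    snapshot = {"a": 1, "b": 2}  # state at checkpoint time 3
    log = [
        (1, "set", "a", 0),
        (5, "set", "a", 10),
        (6, "delete", "b", None),
        (7, "set", "c", 3),
    ]
    print(replay_log(snapshot, log, since=3))
```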
3.3 Fail Over
A fail-over system (one or more redundant systems operating in parallel with a primary system)
limits data loss to a minimal amount and can be used to reconstruct the data on the primary
system. The Fail Over section identifies and describes fail-over systems, the procedures for
keeping fail-over systems current with the primary system and for starting up their operations,
and the procedures for reconstructing lost or corrupted data.
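One common trigger for starting up a fail-over system is a missed heartbeat from the primary. The Python sketch below shows only that decision; the heartbeat mechanism, the 30-second timeout, and the function name are assumptions made for illustration, not part of this template.

```python
def should_fail_over(last_heartbeat, now, timeout_seconds=30.0):
    """Return True when the primary has been silent longer than the timeout.

    `last_heartbeat` and `now` are timestamps in seconds on the same clock.
    """
    return (now - last_heartbeat) > timeout_seconds


if __name__ == "__main__":
    # Hypothetical timestamps: the primary last reported at t=100 seconds.
    print(should_fail_over(last_heartbeat=100.0, now=120.0))
    print(should_fail_over(last_heartbeat=100.0, now=140.0))
```

A short timeout speeds up recovery but risks failing over during brief network interruptions, so the value chosen should be recorded alongside the fail-over procedure.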
4. RECOVERY STEPS
The Recovery Steps section describes the detailed procedures (with steps and decisions) for
restarting solution operations and restoring solution data to the state recorded at the
closest checkpoint before the failure.
No text is necessary between the heading above and the heading below unless otherwise desired.
4.1 Restoring Service from Backup Systems
The Restoring Service from Backup Systems section describes how service will be restored by
using standby (backup) systems. This can consist of using a "hot standby" with
automated fail over or of swapping the failed system with spare systems already configured for
use.
4.1.1 Hot Stand By
The Hot Stand By section describes the hot standby systems ready for use when needed.
4.1.2 Spare Systems
The Spare Systems section describes the spare systems, identifies where they are located, and
details the steps required to bring up the solution on a spare system.
4.2 System Recovery
The System Recovery section describes how system recovery occurs.
4.3 Data Recovery
The Data Recovery section defines how data will be recovered. The requirements for data
recovery are primarily dependent on the application:

- The data could be stored on RAID disks.
- Application logs can be stored on separate disks and backed up frequently.
- Recovery points or checkpoints can be created frequently.
5. INDEX
The index is optional according to the IEEE standard. If the document is made available in
electronic form, readers can search for terms electronically.
6. APPENDICES
Include supporting detail that would be too distracting to include in the main body of the
document.