Database Archiving Trends and Best Practices

Managing Data for Long Retention Periods
Database Archiving Trends & Best Practices
Including a “Titan Archive” Example
Craig S. Mullins
Corporate Technologist
http://www.neonesoft.com
Authors
This presentation was prepared by:
Craig S. Mullins
Corporate Technologist
NEON Enterprise Software, Inc.
14100 Southwest Freeway, Suite 400
Sugar Land, TX 77478
Tel: 281.491.6366
Fax: 281.207.4973
E-mail: craig.mullins@neonesoft.com
This document is protected under the copyright laws of the United States and other countries as an unpublished work.
This document contains information that is proprietary and confidential to NEON Enterprise Software, which shall not
be disclosed outside or duplicated, used, or disclosed in whole or in part for any purpose other than to evaluate NEON
Enterprise Software products. Any use or disclosure in whole or in part of this information without the express
written permission of NEON Enterprise Software is prohibited. © 2007 NEON Enterprise Software (Unpublished). All
rights reserved.
Confidential Material of NEON Enterprise Software, Inc.
Agenda
Data Retention: The Long-Term Data Storage Problem
Trends Driving the Need to Archive
Long-Term Data Storage Solutions, or what database archiving is and is not!
Required Database Archiving Capabilities
TITAN Archive for Long-Term Data Retention
Trends Impacting Data Retention
Data Retention Issues:
• Data growth (125% CAGR)
• Length of retention requirement
• Varied types of data
• Security issues
[Chart: Time Required, from 0 to 30 Yrs (and more)]
Data Retention Drivers
Data Retention Requirements refer to the length of time you need to keep data
• Determined by laws – regulatory compliance
   • More than 150 state and federal laws
   • Dramatically increasing retention periods for corporate data
• Determined by business needs
   • Reduce operational costs
      • Large volumes of data interfere with operations
   • Isolate content from changes
      • Protect archived/retained data from modification
What is Meant by “Long Term Data Retention”
Source: 100 Year Archive Requirements Survey (January 2007), SNIA-DMF 100 Year Archive Task Force
Retention Requirements Vary
Source: 100 Year Archive Requirements Survey (January 2007), SNIA-DMF 100 Year Archive Task Force
Regulatory Compliance & Data Retention Requirements
Can You Read This?
Regulations Tracked by Gartner
Regulatory Compliance is International
Country: Examples of Regulations
Australia: Commonwealth Government’s Information Exchange Steering Committee, Evidence Act 1995, more than 80 acts governing retention requirements
Brazil: Electronic Government Programme, EU GMP Directive 91/356/EEC
France: Model Requirements for the Management of Electronic Records, EU Directive 95/46/EC
Germany: Federal Data Protection Act, Model Requirements for the Management of Electronic Records, EU Directive 95/46/EC
Japan: Personal Data Protection Bill, J-SOX
Switzerland: Swiss Code of Obligations articles 957 and 962
United Kingdom: Data Protection Act, Civil Evidence Act 1995, Police and Criminal Evidence Act 1984, Employment Practices Data Protection Code, Combined Code on Corporate Governance 2003
http://www.itcinstitute.com/ucp/index.aspx
Retention: The Need for Archiving…
Paper
Blueprints
Forms
Claims
Word
Excel
PDF
XML
IMS
DB2
ORACLE
SYBASE
SQL Server
IDMS
VSAM
Programs
UNIX Files
Outlook
Lotus Notes
Attachments
Sound
Pictures
Video
Discovery and e-Discovery
Data retention and e-Discovery intersect…
Discovery is the compulsory disclosure, at a
party’s request, of information that relates to
the litigation.
(Source: Black’s Law Dictionary)
Therefore, e-Discovery is the discovery of
electronic information
E-Discovery
Electronic evidence is the predominant form of discovery today.
(Gartner, Research Note G00136366)
Electronic evidence could encompass anything that is stored anywhere.
(Gartner, Research Note G00133224)
When data is being collected (for e-discovery) it is imperative that it is
not changed in any way. Metadata must be preserved…
(Gartner, Research Note G00133224)
…it is not the job of the IT organization to determine what should and
should not be saved or how individual business users manage their data.
(Gartner, Research Note G00147735)
Through 2007, more than half of IT organizations and in-house legal
departments will lack the people and the appropriate skills to handle
electronic discovery requirements (0.8 probability).
(Gartner, Research Note G00131014)
By YE10, 75% of IT departments in large enterprises will employ one or
more legal IT or e-discovery specialists (0.8 probability).
(Gartner, Research Note G00146630)
E-Discovery and FRCP
Changes to the Federal Rules of Civil Procedure
Examples: Rules 26(b)(1) and 34(b)




Changes took effect December 2006
A party who produces documents for inspection
shall produce them . . . “as they are kept in the
usual course of business...”
The rules no longer use the term “data compilations” instead using
the term “electronically stored information”
The amended rules state that requested information must be
turned over within 90 to 120 days after a complaint has been
served.
So data stored in database systems must be produced in
electronic form.
What Does It Mean?
Enterprises must recognize that there is a
business value in organizing their information
and data.
Organizations that fail to respond run the risk
of seeing more of their cases decided on
questions of process rather than merit.
(Gartner, 20-April-2007, Research Note G00148170:
“Cost of E-Discovery Threatens to Skew Justice System”)
Some Sample Cases
• Morgan Stanley - $1.6B – lost backup tapes
• UBS Warburg - $29.3M – deleted email, could not produce backups.
   • Court to jurors: “Assume discarded emails would have negatively impacted the case”
• Bank of America - $10M – “repeatedly failed to promptly furnish email”
• Philip Morris - $2.75M – did not save email
• Arthur Andersen - $500,000 (overturned)
   • Innocently destroyed documents involved in a court “hold” order
Retiring Legacy Applications
• Older applications, perhaps running on outdated hardware and/or outdated DBMS
   • May be looking to save HW and/or SW licensing costs
• Archive the data to a secure database archive
• Retire the application – and perhaps HW/SW
• Application data is secured and available for access from the database archive
An Example of Retiring a Legacy Application
• Only one IMS application remaining
• Archive IMS data (data & metadata) to the archive store
• Retire IMS application and database
• Eliminate IMS license
• Continue to access the data in the archive
Operational Efficiency Drives Archiving, Too

• In addition to supporting regulatory compliance requirements, database archiving improves operational efficiency
• Large volumes of data in operational databases interfere with production operations
   • Efficiency of transactions
   • Efficiency of utilities: COPY, REORG, etc.
   • Improved storage
      • Archived data can be stored on cheaper media
      • Gartner: databases copied an average of 6 times!
Key Reasons to Archive
Source: Forrester Research, Database Archiving Remains An Important Part Of Enterprise DBMS Strategy (August 13, 2007)
Database Archiving
Purge
Database Archiving: The process of removing from operational databases selected data records that are not expected to be referenced again, and storing them in an archive data store where they can be retrieved if needed.
• Data comes from a structured DBMS (DB2, IMS, etc.)
• Selection criteria are logical rather than physical
• Data can be retrieved after a long period of time, regardless of whether the original DBMS is still in place (data independence)
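The definition above (logical selection, removal from the operational database, storage in an archive) can be sketched in a few lines. This is a minimal illustration only, assuming SQLite as a stand-in for both the operational database and the archive store; the tables `orders` and `orders_archive`, the `closed_on` column, and the function `archive_closed_orders` are hypothetical names introduced for the example. A real database archive would also capture metadata and be independent of the source DBMS, which this sketch does not attempt.

```python
import sqlite3


def archive_closed_orders(conn: sqlite3.Connection, cutoff: str) -> int:
    """Move orders closed before `cutoff` (ISO date string) into the archive table.

    Selection is logical (a business rule on closed_on), not physical
    (no reference to partitions, files, or pages).
    """
    with conn:  # single transaction: copy and delete both commit, or neither does
        conn.execute(
            "INSERT INTO orders_archive (id, customer, closed_on) "
            "SELECT id, customer, closed_on FROM orders "
            "WHERE closed_on IS NOT NULL AND closed_on < ?",
            (cutoff,),
        )
        cur = conn.execute(
            "DELETE FROM orders WHERE closed_on IS NOT NULL AND closed_on < ?",
            (cutoff,),
        )
    return cur.rowcount  # number of rows removed from the operational table


def demo():
    """Build a tiny in-memory example and archive everything closed before 2000."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, closed_on TEXT)"
    )
    conn.execute(
        "CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, customer TEXT, closed_on TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "A", "1999-03-01"), (2, "B", "2006-11-15"), (3, "C", None)],
    )
    moved = archive_closed_orders(conn, "2000-01-01")
    remaining = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    archived = conn.execute("SELECT COUNT(*) FROM orders_archive").fetchone()[0]
    return moved, remaining, archived
```

Calling `demo()` moves the one order closed before the cutoff and leaves the open order and the recently closed order in the operational table.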
The Lifecycle of Data
Create → Operational → Reference → Archive → Discard
• Operational: needed for completing business transactions
• Reference: needed for reporting or expected queries
• Archive: needed for compliance and business protection
[Figure label: Mandatory Retention Period]
Database or Archive?
[Figure: Keep in DB versus Keep in Archive; factors shown: Performance, Space, Compliance]
Based on Data Availability
[Figure: Keep in DB, Keep in Archive, or Purge; availability levels shown: Must be Available to App, Must be Available, Must Be Secure, Not Needed]
What Solutions Are Out There?

• Keep Data in Operational Database
   • Problems with authenticity of large amounts of data over long retention times
   • Operational performance degradation
• Store Data in UNLOAD files (or backups)
   • Problems with schema change and reading archived data
   • Using backups poses even more serious problems
• Move Data to a Parallel Reference Database
   • Combines problems of the previous two
• Move Data to a Database Archive
   • Secure, durable, and accessible
Components of a Database Archiving Solution
[Figure components: Production Databases; Data Extract; Data Recall; Archive Data Store and Retrieve; Archive Store (Data & Metadata, Policies, History); Metadata Capture; Archive & Retention Policies; Archive Data Query & Access; Archive Administration]
Database Archiving Requirements


• Policy-Driven
   • Policy-based archiving: logical selection
   • Discard data after retention period
• Data Requirements
   • Store very large amounts of data in archive
   • Keep data for very long periods of time
   • Access data when needed; as needed
      • Even across schema breaks
   • Protect authenticity of data
• Independence
   • Maintain archives for ever-changing operational systems
   • Become independent from operational metadata
   • Become independent from Applications/DBMS/Systems
Database Data is at Risk!
Source: 100 Year Archive Requirements Survey (January 2007), SNIA-DMF 100 Year Archive Task Force
Database Archiving Storage Capacity
Total Worldwide Database Archive Capacity, 2007-2012 (Petabytes)
[Bar chart. X-axis: 2007 to 2012. Y-axis: 0 to 16,000 petabytes. Data labels in ascending order: 1,198; 1,838; 2,991; 4,824; 8,110; 13,639]
Source: Enterprise Strategy Group
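To put the chart's figures in perspective, the implied compound annual growth rate can be worked out directly. The sketch below assumes the six data labels correspond to the years 2007 through 2012 in ascending order (the pairing is not stated in the extracted text), and the helper name `cagr` is introduced here for illustration.

```python
def cagr(first: float, last: float, years: int) -> float:
    """Compound annual growth rate between two values that are `years` apart."""
    return (last / first) ** (1 / years) - 1


# Chart data labels in petabytes. Pairing each label with a year is an
# assumption: the labels are taken in ascending order for 2007..2012.
capacity_pb = {
    2007: 1198,
    2008: 1838,
    2009: 2991,
    2010: 4824,
    2011: 8110,
    2012: 13639,
}

growth = cagr(capacity_pb[2007], capacity_pb[2012], 2012 - 2007)
print(f"Implied CAGR 2007-2012: {growth:.1%}")
```

Under that assumption the archive capacity grows by a little over 60% per year, which is roughly an elevenfold increase across the five-year span.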
TITAN Archive
An Introduction to Database Archiving with TITAN Archive™
TITAN Archive
NEON Enterprise Software’s database archive solution:
• Architected as a long-term data retention solution.
• Built to address the regulatory compliance issues impacting data and database systems.
• Delivers operational benefits to the existing database environment and applications.
• Supports e-Discovery needs.
• Built from the ground up as an enterprise solution.
TITAN Architecture
Details of the Physical Architecture
[Figure components: z/OS (DB2, IMS) with TITAN Extractor; Linux/Unix/Windows (ORACLE, Sybase, UDB, SQL Server) with TITAN Extractor; MQ and sockets connections; TITAN Archive Appliance (Linux) with TITAN Archive Catalog and TITAN EADO; SAN with TITAN EADO; browser access over HTTP]
Product Task Overview
USER ROLES
ADMINISTRATOR: CREATE APPLICATION
ADMINISTRATOR: ASSIGN USERS
ANALYST: DEFINE SCHEMA
• Add schema
• Edit schema objects
• Edit relationships
• Validate schema
Search DB2 for desired objects
Drag and drop objects from the DB2 catalog into Titan
Use the Schema Editor to arrange objects and modify relationships
Use the Column Editor to expand the basic metadata by adding annotations
IMS Schema Editor
Use the Schema Editor to arrange objects and modify relationships
Product Task Overview
USER ROLES
ADMINISTRATOR: CREATE APPLICATION
ADMINISTRATOR: ASSIGN USERS
ANALYST: DEFINE SCHEMA
ANALYST: DEFINE ARCHIVE PLAN
• Add plan
• Copy tables from schema
• Select root table
• Assign archive actions
• Review relationships
• Specify EADO properties
• Validate and register the plan
Remember the schema that we created in the previous section. We want to build a plan to archive accounts…
Use the Plan Editor to design each archive plan:
• Customer data is treated as reference data and is to be copied to the archive
• Accounts data is to be moved to the archive
• Transaction data is to be moved to the archive
Product Task Overview
USER ROLES
ADMINISTRATOR: CREATE APPLICATION
ADMINISTRATOR: ASSIGN USERS
ANALYST: DEFINE SCHEMA
ANALYST: DEFINE ARCHIVE PLAN
ANALYST: DEFINE ARCHIVE JOB
• Add job to plan
• Specify job options
• Generate JCL
• Edit JCL
• Register the job
Use the job editor to define the parameters for the archive extract
Define the archive policy using standard SQL.
Define archive storage attributes such as number of copies, storage locations and encryption keys.
Define the discard policy using standard SQL.
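As an illustration of what a SQL-expressed discard policy might look like, the sketch below deletes archived rows whose retention period has elapsed. It is not TITAN syntax: it assumes SQLite as a stand-in archive store, and the table `archive_records`, its columns `archived_on` and `retention_years`, and the names `DISCARD_POLICY` and `discard_expired` are all hypothetical.

```python
import sqlite3

# Hypothetical discard policy, written as a SQL predicate: a record may be
# discarded once archived_on plus its retention period is on or before "today".
DISCARD_POLICY = "date(archived_on, '+' || retention_years || ' years') <= ?"


def discard_expired(conn: sqlite3.Connection, today: str) -> int:
    """Delete archived rows whose retention period has elapsed as of `today`."""
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            f"DELETE FROM archive_records WHERE {DISCARD_POLICY}", (today,)
        )
    return cur.rowcount


def demo():
    """Apply the policy to three sample rows as of 2007-01-01."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE archive_records ("
        "id INTEGER PRIMARY KEY, archived_on TEXT, retention_years INTEGER)"
    )
    conn.executemany(
        "INSERT INTO archive_records VALUES (?, ?, ?)",
        [(1, "1995-06-30", 7), (2, "2005-01-01", 30), (3, "2000-01-01", 7)],
    )
    discarded = discard_expired(conn, "2007-01-01")
    kept = [row[0] for row in conn.execute("SELECT id FROM archive_records ORDER BY id")]
    return discarded, kept
```

This is a plain logical delete. It does not overwrite the underlying storage or remove backup copies, both of which a forensic discard would need to handle.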
Getting Started—Product Task Overview
USER ROLES
ADMINISTRATOR: CREATE APPLICATION
ADMINISTRATOR: ASSIGN USERS
ANALYST: DEFINE SCHEMA
ANALYST: DEFINE ARCHIVE PLAN
ANALYST: DEFINE ARCHIVE JOB
ANALYST: RUN SIMULATION AND/OR STATS
• TROUBLESHOOT AND VERIFY
Define JCL parameters with the job editor
For initial testing and validation, run the extract in simulate mode
Getting Started—Product Task Overview
USER ROLES
ADMINISTRATOR: CREATE APPLICATION
ADMINISTRATOR: ASSIGN USERS
ANALYST: DEFINE SCHEMA
ANALYST: DEFINE ARCHIVE PLAN
ANALYST: DEFINE ARCHIVE JOB
ANALYST: RUN SIMULATION AND/OR STATS
ANALYST OR READER: TROUBLESHOOT/VERIFY ARCHIVE
Titan will generate the archive extract JCL. For testing purposes the job can be submitted through the GUI. Eventually you will want to add the job to your scheduler.
Archive extract results can be viewed on the operational system
Or from within Titan
Data can be retrieved from the archive by using the Titan select tool or . . .
Data can be retrieved using any JDBC-compliant SQL tool. Here we use a free tool called SQuirreL to access data that we have archived with Titan
Getting Started—Product Task Overview
USER ROLES
ADMINISTRATOR: CREATE APPLICATION
ADMINISTRATOR: ASSIGN USERS
ANALYST: DEFINE SCHEMA
ANALYST: DEFINE ARCHIVE PLAN
ANALYST: DEFINE ARCHIVE JOB
ANALYST: RUN SIMULATION AND/OR STATS
ANALYST OR READER: TROUBLESHOOT/VERIFY ARCHIVE
ANALYST: SCHEDULE & RUN ARCHIVE JOB
Highlighted TITAN Archive Features
Protects Data Authenticity
• Archived data never changes
• Data can be encrypted
• Role-based security
• Signature / Checksum
Discard Policies
• Forensic discard (zeroed out)
• Flexible policy-based definition
• Discards archive & backups
Handles Media Rot
• Checks viability of media
• Repurpose to new media
Contingency Planning
• Up to four backup copies
• Recoverability
Access Assistance
• EADO Indexing
• EADO Scoping Variables
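The "Signature / Checksum" item can be illustrated with a keyed hash. The following is a general sketch of the idea using Python's standard `hmac` and `hashlib` modules, not a description of how TITAN Archive implements it; the function names `sign_record` and `is_authentic` are hypothetical.

```python
import hashlib
import hmac


def sign_record(payload: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 signature computed when the record is archived."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def is_authentic(payload: bytes, key: bytes, signature: str) -> bool:
    """Recompute the signature on retrieval and compare in constant time."""
    return hmac.compare_digest(sign_record(payload, key), signature)
```

A signature stored alongside the record at archive time lets a later reader detect any change to the payload. A plain checksum (an unkeyed hash) would detect accidental corruption such as media rot, while the keyed variant also guards against deliberate modification by anyone who does not hold the key.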
The Three Most Important Things to Remember
TITAN Archive Qualities
Accessible: the right data can be retrieved in the required timeframe
Authentic: the data is unchanging, accurately represents the business, and can be proven as such for legal purposes
Enterprise: engineered to be non-disruptive
Summary Points





• Keeping data in operational systems is a bad idea
• Putting data in UNLOAD files is a bad idea
• Putting data in a parallel reference database is a bad idea
• Using a DBMS to store the archive does not work
• Database archiving requires a great deal of data design
   • Establishing and maintaining metadata
   • Designing how data looks in the archive
   • Achieving application independence
• Database archives must be continuously managed
   • Copying data for storage problems (e.g. media rot)
   • Copying data for system changes
   • Copying data for data encoding standard changes
   • Logging, auditing, and monitoring
      • Archive events
      • Partition management
      • Accesses
• New IT/Business Position: Database Archivist
Database Archivist… or Archive Analyst
If you are serious about long-term data retention you will need to staff a database archivist:
• Understand the retention requirements & regulations
• Understand business data
   • Classification of data to match to regulations
• Interact with business and legal experts
• Build archive plans and jobs
• On-going archive administration
• Assist in e-discovery and other projects that require access to archived data
Intelligent Solutions for Enterprise Data.
Craig S. Mullins – Contact Info
NEON Enterprise Software, Inc.
craig.mullins@neonesoft.com
www.neonesoft.com
www.craigsmullins.com
www.DB2portal.com