Cross-Industry Preservation Architectures

Michael Peterson
May 2011
“Cross-Industry Preservation Architectures” – PASIG May, 2011
Introductions
● Michael Peterson
  - Founder, past President, and Chief Strategy Advocate for the SNIA
  - Currently working on Cloud Archive and Long-term Retention standards, best practices, market education
  - Information Services Architect – consulting in long-term retention and digital preservation system design and implementation
  - Currently driving the LTDPRM.org & ILM2.0.org Communities
  - Author: “100 Year Archive Requirements Study,” 2008, “Building a Terminology Bridge: Guidelines for Digital Information Retention and Preservation Practices in the Datacenter,” Sept. 2009
www.ltdprm.org
Agenda
● LTDP Paradoxes (Laws designed to be broken)
● New Digital Preservation Models
  - That the cloud empowers
● Using the Cloud for Digital Archives (Digital Preservation)
4 Paradoxes of Digital Preservation
● Data will be lost
● Migration does not scale
● Access & use models keep changing
● Cost overwhelms everything complexity does not
“Old” Laws of Digital Preservation
How do we break them?
100 Year Complexity Barrier
● Overwhelming growth, cost, change
  - Constant Physical and Logical migration
  - Power, cooling, space, people, resources, maintenance,…
  - Always adding & migrating systems, networking, storage
  - Managing thousands of formats
  - Constant Auditing and recovery of damaged or lost data
  - Thousands of moving parts
  - Complex systems and architectures
  - Changing software platforms
Aha!
Move from “physical” preservation architectures and design, as in physical media or a physical repository (a 2002 ‘OAIS’),
to “virtualized” Preservation Services based on Service Management principles.
(Sounds like the Cloud…)
Using the Cloud
New Digital Preservation Models
“Physical” Doesn’t Scale
● Old Architecture
  - “Storing digital images effectively requires standards related to the storage media, such as CD-ROMs, and the file formats, such as TIFF.”
    Source: “A Resource List for Standards Related to Digital Imaging” Dec. 2010
  - Physical Application & Storage infrastructure
● Physical Standards
  - Architecture: OAIS 2002
  - Metadata: Dublin Core: ISO 15836:2009
  - Storage Media: ISO 18921:2008, ISO 18925:2008
  - Digitization: ISO/IEC 10918-1/Cor1:2005, ISO/IEC 10918-3/Amd1:1999
  - File Formats: ISO 19005-1:2005, Adobe TIFF Specification, V6, 1992
  - Transfer Protocols: ISO 15740:2008
“Infrastructure Virtualization”
● New Architecture
  - Media independent
  - System Architecture virtualized, self-protecting, cloud based, and self-healing
  - Integrated migration & transformation services
  - Virtualized historical applications hosted in the cloud in specialized containers running in virtual machines
● New Standards
  - Architecture: OAIS 2010
  - Metadata: FCIS, PREMIS, IETF
  - Cloud: SNIA-CDMI
  - Interoperability: NIST Smart Grid Framework, Cloud and Interoperability workgroups
  - Object Containerization: SNIA-SIRF
Add “Information Virtualization”
● Portable Information Objects
  - Extensible Preservation Objects
  - Location, media independent
  - Secure, auditable, authentic, portable
  - Self-healing
● On-demand, virtual emulation
  - “Jumpbox” hosted emulators
  - Populations of legacy ‘readers’
  - Web-based delivery and access
● New Standards
  - Architecture: OAIS 2010
  - Metadata: FCIS, PREMIS, IETF
  - Cloud: SNIA-CDMI
  - Interoperability: NIST Smart Grid Framework, Cloud and Interoperability workgroups
  - Object Containerization: SNIA-SIRF and CDMI
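One way to picture an "authentic, auditable, self-healing" object is a payload that carries its own fixity value and audit trail. The sketch below is illustrative only: the class `PreservationObject`, its fields, and the choice of SHA-256 are assumptions made for this example and are not taken from SIRF, CDMI, or OAIS.

```python
import hashlib
from dataclasses import dataclass, field


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class PreservationObject:
    """Hypothetical object that travels with its own digest and audit log."""
    payload: bytes
    digest: str
    metadata: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    @classmethod
    def create(cls, payload: bytes, **metadata) -> "PreservationObject":
        # Record the digest once, at creation time.
        obj = cls(payload=payload, digest=sha256_hex(payload), metadata=dict(metadata))
        obj.audit_log.append("created")
        return obj

    def is_authentic(self) -> bool:
        # The payload is authentic if it still matches the recorded digest.
        return sha256_hex(self.payload) == self.digest

    def heal_from(self, replicas: list) -> bool:
        """Restore a damaged payload from the first replica matching the digest."""
        if self.is_authentic():
            return True
        for replica in replicas:
            if sha256_hex(replica.payload) == self.digest:
                self.payload = replica.payload
                self.audit_log.append("healed from replica")
                return True
        self.audit_log.append("heal failed: no intact replica")
        return False
```

Because the digest and log live inside the object rather than in a particular storage system, the same check works wherever the object is copied, which is the location and media independence the slide describes.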
Move to “Managed”
● Content Management
● Service Management
  - ITIL, ITSM, ILM2.0, Information Governance
● Litigation ‘Ready’
● Preservation begins at “Creation”
● Operating Practices:
  - ITSM: IT Service Mgmt.
  - ITIL: IT Infrastructure Library
  - ILM2.0: Service mgmt. based approach to information mgmt. and automation
  - Regulatory Compliance
Preservation is a new Datacenter Practice
And to “Virtual Services” in the Cloud
● Platform as a Service
● Infrastructure as a Service
● Storage as a Service
● Evolving Web access and use models
● Private, Hybrid, Public Clouds
  - Multiple clouds, multiple providers
Using the “Cloud” in Preservation
● Most likely use-cases:
  - Private and Hybrid clouds
  - Virtualize infrastructure
  - Virtualize delivery and access
  - Virtualize emulation
  - Virtualize information
      - Providing portability
● Examples
  - Web-access models
  - Web-drop boxes
  - Agile, scalable, cost-effective compute and storage resources (on demand)
  - Virtual emulation
  - Demand spikes
  - Disaster recovery
  - Distributed data sets
  - Infrastructure extensions
Emerging Cloud Standards
● Cloud Data Management Interface, CDMI
  - SNIA to ISO: storage-to-cloud, cloud-to-cloud interchange format
● Self-contained Information Retention Format, SIRF
  - SNIA to ISO: extensible preservation object format
● Interoperability
  - ISO project: Data Preservation Interchange Framework, DPIF
Cloud Data Management Interface
● Data Portability Standard with an Object Storage Interface
  - Move data and metadata in standard portable containers in and out of the cloud and between clouds
  - Simple XML container of objects plus metadata
● A data and information services management interface and control path
  - Operate services through CDMI
  - Rules and Policies in metadata
  - Cloud Peering – cloud to cloud communications
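The idea of a portable container that carries an object together with its metadata, including rules and policies, can be sketched as follows. This is a minimal illustration, not the CDMI schema or wire format: the element names (`container`, `metadata`, `item`, `object`) and the metadata keys in the usage note are invented for this example.

```python
import base64
import xml.etree.ElementTree as ET


def pack_container(name: str, payload: bytes, metadata: dict) -> str:
    """Serialize one object plus its metadata into a small XML document."""
    root = ET.Element("container", {"name": name})
    meta = ET.SubElement(root, "metadata")
    for key, value in sorted(metadata.items()):
        item = ET.SubElement(meta, "item", {"key": key})
        item.text = str(value)
    # Binary payloads are base64-encoded so they survive as XML text.
    obj = ET.SubElement(root, "object", {"encoding": "base64"})
    obj.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def unpack_container(document: str):
    """Return (name, payload, metadata) from a document made by pack_container."""
    root = ET.fromstring(document)
    metadata = {item.get("key"): item.text or "" for item in root.find("metadata")}
    payload = base64.b64decode(root.find("object").text or "")
    return root.get("name"), payload, metadata
```

For instance, `pack_container("scan-001", data, {"retention_years": 100, "legal_hold": "false"})` yields a single text document that can be moved between storage systems and read back with `unpack_container`. Metadata values come back as strings, so a receiving service would have to interpret policy values itself.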
Design for the Cloud
● Considerations
  - Establish Service Objectives
  - Include verification of recovery, authenticity, availability, digital audit, etc.
  - Consider using multiple cloud destinations or local and remote copies for increased reliability and availability
  - Beware of excessive moving of data across the WAN due to high I/O and bandwidth costs
● Evaluate Cloud providers
  - Establish strong contracts
● Test and Audit
  - All required services
● Use CDMI!
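The advice to keep multiple copies and to verify them can be illustrated with a small digital audit. The sketch assumes a SHA-256 digest was recorded when the data was ingested; the function names and the location labels in the usage note are hypothetical, and a real audit would read the copies from each provider rather than from memory.

```python
import hashlib


def audit_copies(expected_digest: str, copies: dict) -> dict:
    """Map each location to True if its bytes match the recorded SHA-256 digest."""
    return {
        location: hashlib.sha256(data).hexdigest() == expected_digest
        for location, data in copies.items()
    }


def recoverable(report: dict) -> bool:
    """The record is recoverable if at least one copy passed the audit."""
    return any(report.values())
```

Calling `audit_copies(digest, {"local": a, "cloud-a": b, "cloud-b": c})` gives a per-location pass/fail report. Computing each digest where the copy is stored and comparing only the digests is one way to audit without moving the data across the WAN.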
Cloud Contract Considerations
● Costs
● Retention Management
  - Preservation/Integrity/Authentication
  - Return and Secure Disposal
      – Subpoenas, Control
  - Legal Hold
  - Digital Audits & Verification
  - Physical and logical migration practices and authenticity verifications
● Access
● Availability, Protection, Security & Confidentiality
● Search/Discovery
● Multi-Cloud Provider Relationships
● Right to Conduct Forensic Exams
● Cross-Border Data Transfers
Preservation Architectures: Virtualization and Cloud
Summary Thoughts
Move to Virtual Preservation
● Shift thinking from “Physical” Preservation to “Virtual”
● Virtualization Applies in many ways
  - System, storage, application, infrastructure
  - Information
  - Migration – both physical and logical
  - Cost reduction
● Conclusion: ‘Cloud’ has a positive role
Using the Cloud
● Start out Private, Move to Hybrid
● Apply Service Management Principles
  - Classify, Requirements, SLAs, Design, Audit, Improve
● Design for the Cloud
  - Create strong and measurable SLA-style contracts
  - Test, Audit, Verify
● Use and Promote CDMI
  - Need cloud interface, management, and information portability standards
Contact Information
● Michael Peterson
  - IMERGE consulting and LTDPRM.org
  - mpeterson@ltdprm.org
  - (805)201-3178