Chapter 9

Section 2 : Storage Networking Technologies and Virtualization

Content Addressed Storage (CAS): Features
and Benefits of a CAS
Upon completion of this chapter, you will be
able to:
 Describe CAS, fixed content and archives,
traditional storage solutions for archive
 Describe the features and benefits of a CAS
based storage strategy
 List the physical and logical elements of CAS
 Describe the storage and retrieval process for
CAS data objects
 Describe the best suited operational
environments for CAS solutions
Upon completion of this lesson, you will be able to:
 Define fixed content
 Describe traditional archival solutions and their
shortcomings
 Define Content Addressed Storage (CAS)
 List benefits of CAS
Digital assets retained for active reference and value
◦ Generate new revenues
◦ Improve service levels
◦ Leverage historical value
Electronic Documents
•Contracts, claims, etc.
•E-mail and attachments
•Financial spreadsheets
•CAD/CAM designs
•Presentations
Digital Records
•Documents
– Checks, securities trades
– Historical preservation
•Photographs
– Personal / professional
•Surveys
– Seismic, astronomic,
geographic
Rich Media
•Medical
– X-rays, MRIs, CT scans
•Video
– News / media, movies
– Security surveillance
•Audio
– Voicemail
– Radio

Fixed content is growing at more than 90%
annually
◦ Significant amount of newly created information falls
into this category
◦ New regulations require retention and data protection





Often, long-term preservation is required (years to decades)
Simultaneous multi-user online access is
preferable to offline storage
Need faster access to fixed content
Need for location independent data, enabling
technology refresh and migration
Traditional storage methods are inadequate

Three categories of archival solution are:
◦ Online, nearline, and offline based on the means of
access

Traditional archival solutions were offline
◦ Traditional archival process used optical disks and
tapes as media for archival
◦ An archive is often stored on a Write Once Read
Many (WORM) device, such as a CD-ROM
Tape is slow, and standards are always
changing
 Optical is expensive, and requires vast
amounts of media
 Recovering files from tape and optical is often
time consuming
 Data on tape and optical is subject to media
degradation
Both solutions require sophisticated media management
CAS has emerged as an alternative to traditional
archiving solutions





Object-oriented, location-independent
approach to data storage
Repository for the “Objects”
Access mechanism to interface with
repository
Globally unique identifiers provide access to
objects
Additional Task
Research the role of CAS in an ILM strategy








Content authenticity
Content integrity
Location independence
Single-instance storage (SiS)
Retention enforcement
Record-level protection and disposition
Technology independence
Fast record retrieval
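
Two of the benefits listed above, single-instance storage and retention enforcement, can be illustrated with a small sketch. This is only a toy model under stated assumptions: the class `TinyArchive`, its method names, and the use of SHA-256 as the addressing function are hypothetical choices for illustration and do not describe any particular CAS product.

```python
import hashlib
import time
from typing import Optional


class TinyArchive:
    """Toy in-memory archive keyed by content address (illustrative only)."""

    def __init__(self) -> None:
        self._objects = {}       # content address -> stored bytes
        self._retain_until = {}  # content address -> retention expiry (epoch seconds)

    def store(self, data: bytes, retention_seconds: float = 0.0,
              now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        # Assumption: SHA-256 of the content serves as the content address.
        address = hashlib.sha256(data).hexdigest()
        # Single-instance storage: identical content yields the same address,
        # so storing it again does not create a second copy.
        self._objects.setdefault(address, data)
        # Keep the longest retention period requested for this content.
        expiry = now + retention_seconds
        self._retain_until[address] = max(self._retain_until.get(address, 0.0), expiry)
        return address

    def delete(self, address: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        # Retention enforcement: refuse deletion until the period has expired.
        if now < self._retain_until[address]:
            raise PermissionError("retention period has not expired")
        del self._objects[address]
        del self._retain_until[address]

    def object_count(self) -> int:
        return len(self._objects)
```

Storing the same bytes twice returns the same address and leaves `object_count()` at 1. Calling `delete` before the retention expiry raises `PermissionError`.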
Key points covered in this lesson:
 CAS Definition
 Challenges of Storing Fixed Content
 Shortcomings of Traditional Archiving
Solutions
 Benefits of CAS

CAS Architecture, Storage and Retrieval,
Examples.
Upon completion of this lesson, you will be
able to:
 Describe CAS architecture
 Describe Physical and logical elements of CAS
 Describe data storage and retrieval process in
CAS environment
 CAS examples



Storage devices (CAS Based)
◦ Storage node
◦ Access node
Servers (to which storage
devices get connected)
Client
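[Figure: CAS architecture, showing the Client, the Server with the API, the IP network, and the CAS System containing Access Nodes, a Private LAN, and Storage Nodes]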

Application Programming Interface (API)
◦ A set of function calls that enables
communication between applications or
between an application and an operating
system

BLOB (Binary Large Object)
◦ The actual data without the descriptive
information (metadata)
◦ The Distinct Bit Sequence (DBS) of user
data represents the actual content of a file
and is independent of the filename and
physical location

C-Clip
◦ A package containing the user's data and
associated metadata
◦ C-Clip ID (C-Clip handle or C-Clip reference) is
the CA that the system returns to the client
application

Content Address (CA)
◦ An identifier that uniquely addresses the
content of a file and not its location. Unlike
location-based addresses, content addresses
are inherently stable and, once calculated, they
never change and always refer to the same
content

C-Clip Descriptor File (CDF)
◦ The additional XML file that the system creates
when making a C-Clip. This file includes the
content addresses for all referenced BLOBs and
associated metadata
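
The relationship between a BLOB, its content address, and the CDF can be sketched as follows. This is a minimal illustration, not the format used by any real CAS product: the SHA-256 hash, the XML element and attribute names (`cdf`, `blob`, `meta`, `content-address`), the sample metadata, and the idea of deriving the C-Clip ID by hashing the CDF are all assumptions made for the example.

```python
import hashlib
import xml.etree.ElementTree as ET


def content_address(data: bytes) -> str:
    """Derive an address from the bytes alone (assumption: SHA-256 hex digest)."""
    return hashlib.sha256(data).hexdigest()


def build_cdf(blob: bytes, metadata: dict) -> bytes:
    """Build a hypothetical XML descriptor holding the BLOB's address and metadata."""
    root = ET.Element("cdf")
    ET.SubElement(root, "blob", {"content-address": content_address(blob)})
    for key in sorted(metadata):
        ET.SubElement(root, "meta", {"name": key, "value": str(metadata[key])})
    return ET.tostring(root, encoding="utf-8")


# The BLOB is the raw user data; filename and location are metadata only.
blob = b"placeholder bytes standing in for an X-ray image"
cdf = build_cdf(blob, {"filename": "scan-001.img", "department": "radiology"})

# Assumption: the C-Clip ID handed back to the application is the address of the CDF.
clip_id = content_address(cdf)
```

Renaming the file changes the metadata in the CDF but not the address recorded for the BLOB, because that address depends only on the bit sequence of the data.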
Data storage process in a CAS environment:
1. Client presents data to the API to be archived
2. Unique Content Address is calculated
3. Object (C-Clip with its CDF) is sent from the application server to the CAS System via the CAS API over IP
4. CAS System validates the Content Address and stores the object
5. Acknowledgement is returned to the application
6. Clip ID is retained and stored for future use
Data retrieval process in a CAS environment:
1. Object is needed by an application
2. Application finds the Content Address (C-Clip ID) of the object to be retrieved
3. Retrieval request is sent to the CAS System via the CAS API over IP
4. CAS authenticates the request and delivers the object
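
The storage and retrieval steps above can be sketched end to end. This is a simplified model under stated assumptions: `CasApi` and `CasSystem` are hypothetical classes, a direct method call stands in for the IP transport, SHA-256 stands in for the address calculation, and caller authentication is omitted (only the content is checked on retrieval).

```python
import hashlib


def content_address(data: bytes) -> str:
    # Assumption: SHA-256 hex digest used as the content address.
    return hashlib.sha256(data).hexdigest()


class CasSystem:
    """Hypothetical CAS back end holding objects by content address."""

    def __init__(self) -> None:
        self._objects = {}

    def store(self, address: str, data: bytes) -> bool:
        # Validate: recompute the address and compare before storing.
        if content_address(data) != address:
            return False
        self._objects[address] = data
        return True  # acknowledgement returned to the API

    def retrieve(self, address: str) -> bytes:
        data = self._objects[address]  # raises KeyError for an unknown address
        if content_address(data) != address:
            raise ValueError("stored object no longer matches its address")
        return data


class CasApi:
    """Hypothetical client-side API used by the application server."""

    def __init__(self, system: CasSystem) -> None:
        self._system = system

    def archive(self, data: bytes) -> str:
        address = content_address(data)            # address is calculated
        if not self._system.store(address, data):  # object is sent to the CAS system
            raise IOError("CAS system rejected the object")
        return address                             # ID the application must retain

    def fetch(self, address: str) -> bytes:
        return self._system.retrieve(address)      # retrieval by address, not location


api = CasApi(CasSystem())
clip_id = api.archive(b"scanned check image")
restored = api.fetch(clip_id)
```

The application keeps only `clip_id`; it never needs a path or volume name to read the object back.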

Features available with most CAS systems are:
◦ Integrity checking
◦ Data protection
  – Local replication
  – Remote replication
◦ Load balancing
◦ Scalability
◦ Self-diagnosis and repair
◦ Report generation and event notification
◦ Fault tolerance
  – Through the use of redundant components and data
protection schemes
◦ Audit trails
  – Documentation of management activities, access and
disposition of data
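
Integrity checking, replication, and self-repair work together: a stored object that no longer matches its content address can be detected and replaced from a healthy copy. The sketch below is only an illustration under assumptions: the function `check_and_repair`, the use of plain dictionaries as the primary and replica stores, and SHA-256 as the address function are all hypothetical.

```python
import hashlib


def content_address(data: bytes) -> str:
    # Assumption: SHA-256 hex digest used as the content address.
    return hashlib.sha256(data).hexdigest()


def check_and_repair(primary: dict, replica: dict) -> list:
    """Verify every object in primary; repair corrupted ones from replica.

    Returns the list of addresses that were repaired.
    """
    repaired = []
    for address, data in list(primary.items()):
        if content_address(data) == address:
            continue  # object is intact
        good_copy = replica.get(address)
        # Only trust the replica if its copy still matches the address.
        if good_copy is not None and content_address(good_copy) == address:
            primary[address] = good_copy
            repaired.append(address)
    return repaired


original = b"patient study"
address = content_address(original)
primary_store = {address: b"bit-rotted bytes"}  # simulated corruption
replica_store = {address: original}             # healthy replicated copy
repaired_addresses = check_and_repair(primary_store, replica_store)
```

Because the address is derived from the content, no separate checksum catalog is needed to detect the damaged copy.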
Example: Hospital
[Figure: Patient studies are stored locally for short-term use (60 days); the application server stores data on the CAS System through the API]
Each X-ray image ranges from about 15MB to
over 1GB
Patient record is stored online for a period of
60-90 days
Beyond 90 days patient records are archived
Example: Bank
[Figure: The application server stores data on the CAS System through the API]
Check image size is about 25KB
Check imaging service provider may process 50–90
million check images per month
Checks are stored online for a period of 60 days
Beyond 60 days data is archived
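
The figures above imply a rough monthly data volume, which the following back-of-the-envelope calculation makes explicit. It assumes every image is exactly 25 KB and uses decimal units (1 KB = 1,000 bytes, 1 TB = 10^12 bytes); the function name is hypothetical.

```python
CHECK_IMAGE_BYTES = 25 * 1000  # about 25 KB per check image (decimal units assumed)
BYTES_PER_TB = 1e12


def monthly_volume_tb(images_per_month: int) -> float:
    """Approximate new data per month, in terabytes."""
    return images_per_month * CHECK_IMAGE_BYTES / BYTES_PER_TB


low_tb = monthly_volume_tb(50_000_000)   # 50 million images -> about 1.25 TB
high_tb = monthly_volume_tb(90_000_000)  # 90 million images -> about 2.25 TB
```

So a provider in this range adds roughly 1.25 to 2.25 TB of small objects per month, all of which eventually moves to the archive.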
Key points covered in this lesson:
 CAS architecture
 Physical and logical elements of CAS
 CAS storage and retrieval process
 CAS solution examples
Key points covered in this chapter:
 Benefits of CAS based storage strategy
 Overview of physical and logical elements of
CAS
 Storing and retrieving data from CAS
 CAS application examples