APARSEN Storage Solutions

Co-funded by the European Union under FP7-ICT-2009-6
Storage Solutions
The use case at the National Library of the Netherlands (KB)
Jeffrey van der Hoeven
APARSEN webinar, April 14th, 2014
aparsen.eu
#APARSEN
Outline of talk
• About the National Library of the Netherlands (KB)
• Storage challenges: creating digital collections
• Storage solution
• Cost
• Future perspective
• Cloud storage: hot or not…
• Since 1798 / 248 FTE / 53M euro budget
• We preserve & give access to everything published in and about the Netherlands
• Central role in Dutch information infrastructure
• Kept safe: 6M physical publications / 18M digital publications
• Goal: everything digital in 2035
What we do
We give open access to:
• 8 million newspaper pages online
• 4.6 million online visits
• 2.1 million parliamentary pages online
Storage challenges: creating digital collections
Storage share of digital collections (in GB)
Storage prospect at KB
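[Figure: storage growth at KB compared with landmark heights]
• Years shown: 2010, 2011, 2012, 2018
• Landmarks: Eiffel Tower (324 m), Empire State Building (443 m), Burj Khalifa, Dubai (828 m)
• 0.5 PB & 500M files
• 1 PB & 1000M files (1800 m)
• 1.5 million CD-ROMs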
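The numbers in this figure are consistent with picturing the collection as a stack of CD-ROMs. The sketch below checks that arithmetic. It assumes a disc thickness of 1.2 mm and a capacity of 700 MB per disc; neither value is stated on the slide, and the function names are illustrative.

```python
# Assumptions (not from the slide): standard CD-ROM dimensions and capacity.
CD_THICKNESS_MM = 1.2
CD_CAPACITY_BYTES = 700 * 10**6  # 700 MB, decimal


def discs_needed(volume_bytes: float) -> float:
    """Number of CD-ROMs needed to hold the given volume."""
    return volume_bytes / CD_CAPACITY_BYTES


def stack_height_m(disc_count: float) -> float:
    """Height in metres of a stack of discs without cases."""
    return disc_count * CD_THICKNESS_MM / 1000


ONE_PB = 10**15  # 1 PB, decimal

if __name__ == "__main__":
    print(f"Discs for 1 PB: {discs_needed(ONE_PB):,.0f}")
    print(f"Height of 1.5 million discs: {stack_height_m(1_500_000):,.0f} m")
```

Under these assumptions, 1 PB needs a little over 1.4 million discs, which rounds to the 1.5 million CD-ROMs on the slide, and 1.5 million discs stack to about 1800 m.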
Challenges in (long-term) storage
• Volume (size and number of files)
• Type of data (structured / unstructured)
• Growth rate
• Availability vs preservation
• Cost per TB
Storage solution
IT & Storage at KB
Two locations:
– In-house: data centre for primary storage and computing
– Off-site: data back-up & archiving
• Hosting 230 servers (80 physical / 150 virtual)
• Managing 550 TB of data
• Managing approximately 500 million files:
– PDF, TIFF, JPEG2000, JPEG, XML
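The two volume figures above imply an average file size, which matters because many small files stress a storage system differently from a few large ones. A minimal sketch, assuming decimal units (1 TB = 10^12 bytes) and treating the "approximately 500 million" as exactly 500 million:

```python
TOTAL_BYTES = 550 * 10**12  # 550 TB, as stated on the slide (decimal units assumed)
FILE_COUNT = 500 * 10**6    # approximately 500 million files


def average_file_size_mb(total_bytes: int, file_count: int) -> float:
    """Average file size in megabytes (1 MB = 10**6 bytes)."""
    if file_count <= 0:
        raise ValueError("file_count must be positive")
    return total_bytes / file_count / 10**6


if __name__ == "__main__":
    print(f"Average file size: {average_file_size_mb(TOTAL_BYTES, FILE_COUNT):.1f} MB")
```

This gives an average of about 1.1 MB per file. The real distribution is likely skewed, since TIFF masters and XML metadata differ greatly in size.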
Storage Management
Storage tiers
• Gold: very fast, very expensive
– Used for: indexing, databases
– HW: SAN with HiPerf SAS disks, near-future: SSD
• Silver: fast, expensive
– Used for: web hosting, processing
– HW: SAN with HiCap SAS disks
• Steel: slow (45 sec), sustainable
– Used for: long-term archiving
– HW: disk-based NAS with WORM
• Bronze: very slow (> 45 sec)
– Used for: back-up & restore, archiving
– HW: LTO4/5 tape
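The tier scheme can be written down as a small lookup table. The sketch below is only an illustration of the four tiers as listed on this slide. The class and function names are hypothetical and do not describe KB's actual storage management software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageTier:
    name: str
    speed: str
    used_for: tuple[str, ...]
    hardware: str


# Tier properties copied from the slide; the data structure itself is an assumption.
TIERS = (
    StorageTier("Gold", "very fast", ("indexing", "databases"),
                "SAN with HiPerf SAS disks"),
    StorageTier("Silver", "fast", ("web hosting", "processing"),
                "SAN with HiCap SAS disks"),
    StorageTier("Steel", "slow", ("long-term archiving",),
                "disk-based NAS with WORM"),
    StorageTier("Bronze", "very slow", ("back-up & restore", "archiving"),
                "LTO4/5 tape"),
)


def tiers_for(use: str) -> list[str]:
    """Return the names of the tiers whose listed uses include `use`."""
    return [tier.name for tier in TIERS if use in tier.used_for]


if __name__ == "__main__":
    for use in ("databases", "long-term archiving", "archiving"):
        print(use, "->", tiers_for(use))
```

Keeping the table as data makes it straightforward to attach further attributes per tier, such as a cost per TB per year.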
Storage process & strategy
[Diagram: storage process & strategy]
• Workflow: Selection, Digital processing, Access (Stages 1 to 5)
• Shared file system(s) / API; DB; file system
• Storage management
• Storage on-site: Bronze, Steel, Silver, Gold, Platinum
• Off-site: Bronze
• Back-up
Storage cost
Source: http://www.brightsideofnews.com/2011/12/07/your-storage-blog-make-storage-cheaper-and-more-energy-efficient/
TCO storage
• Cost per terabyte (TB) per year per storage tier
• TCO is composed of several cost components, based on the whitepaper Four Principles for Reducing Total Cost of Ownership (Hitachi, 2011)
• 14 cost components included in total
• In 2014 the model was approved by the PwC accounting office
Referenced article: http://www.hds.com/assets/pdf/four-principles-for-reducing-totalcost-of-ownership.pdf
Hardware & software
Maintenance
Support
Power & cooling
Floor space
Monitoring
Off-site locations
Network
Waste & duplication
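A TCO figure per TB per year can be obtained by summing the yearly cost of each component and dividing by the capacity of the tier. The sketch below shows that calculation for the nine components named above. All amounts and the capacity are hypothetical placeholders, not KB figures, and the function name is illustrative.

```python
def tco_per_tb_per_year(annual_costs: dict[str, float], capacity_tb: float) -> float:
    """Sum all yearly cost components and divide by the tier capacity in TB."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return sum(annual_costs.values()) / capacity_tb


# Hypothetical yearly amounts in euro for one storage tier (placeholders only).
example_costs = {
    "hardware_and_software": 40_000.0,
    "maintenance": 10_000.0,
    "support": 8_000.0,
    "power_and_cooling": 12_000.0,
    "floor_space": 5_000.0,
    "monitoring": 4_000.0,
    "off_site_locations": 9_000.0,
    "network": 7_000.0,
    "waste_and_duplication": 5_000.0,
}

if __name__ == "__main__":
    # Hypothetical tier capacity of 100 TB.
    cost = tco_per_tb_per_year(example_costs, capacity_tb=100.0)
    print(f"TCO: EUR {cost:,.2f} per TB per year")
```

With these placeholder amounts the components add up to EUR 100,000 per year, or EUR 1,000 per TB per year for a 100 TB tier.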
KB TCO storage 2014
per TB per year
[Chart: TCO per storage tier]
• Tiers shown: Gold, Silver, Steel, Bronze
• Values shown: € 4,858 / € 1,036 / € 1,046 / € 387
KB TCO storage cost over years
KB vs storage providers (cloud)
Can we afford it in the future?
• Recent developments *:
– Disk storage is becoming more popular for archiving.
– The physical limits of the hard disk drive seem to have been reached.
– Kryder’s law seems to be failing: disk storage density is no longer keeping pace with a yearly increase of 30-40%.
– The market dominance of hard disk producers Seagate and Western Digital is risky, as prices might go up, especially in case of shortage.
Risk: storage costs can become a bottleneck for long-term preservation.
* David Rosenthal blog post, available at: http://blog.dshr.org/2012/12/talk-at-fall-2012cni.html
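To see why a slowdown in density growth matters for long-term budgets, it helps to compound the yearly rates. The sketch below compares the 30% and 40% rates mentioned above with a slower 15% rate. The 15% figure is a hypothetical value chosen for illustration, not a forecast, and the function names are assumptions.

```python
import math


def density_multiplier(yearly_increase: float, years: int) -> float:
    """Factor by which storage density grows after `years` of compound growth."""
    return (1.0 + yearly_increase) ** years


def years_to_double(yearly_increase: float) -> float:
    """Years needed for density to double at a constant yearly increase."""
    return math.log(2) / math.log(1.0 + yearly_increase)


if __name__ == "__main__":
    # 0.30 and 0.40 come from the slide; 0.15 is a hypothetical slower rate.
    for rate in (0.40, 0.30, 0.15):
        print(
            f"{rate:.0%} per year: x{density_multiplier(rate, 10):.1f} after 10 years, "
            f"doubling every {years_to_double(rate):.1f} years"
        )
```

If cost per TB follows density, which is itself an assumption, a collection that doubles in size stays affordable only while the doubling time of density remains comparably short.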
Cloud storage: hot… or not?
Storage in the cloud
Benefits of cloud storage
• Scalable
• Availability
• Pay per TB per month
• No need for own ICT infrastructure
• Less maintenance
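Cloud providers typically quote a price per TB per month, while the in-house TCO in this talk is expressed per TB per year, so a comparison needs a unit conversion first. A minimal sketch follows. The monthly price is a hypothetical placeholder, the 550 TB volume is the figure KB reports managing, and charges such as data retrieval or network traffic are deliberately left out.

```python
MONTHS_PER_YEAR = 12


def annual_cost_per_tb(price_per_tb_per_month: float) -> float:
    """Convert a monthly per-TB price into a yearly per-TB price."""
    return price_per_tb_per_month * MONTHS_PER_YEAR


def annual_bill(volume_tb: float, price_per_tb_per_month: float) -> float:
    """Yearly storage bill for a fixed volume, storage charges only."""
    return volume_tb * annual_cost_per_tb(price_per_tb_per_month)


if __name__ == "__main__":
    hypothetical_price = 25.0  # EUR per TB per month, placeholder value
    print(f"Per TB per year: EUR {annual_cost_per_tb(hypothetical_price):,.2f}")
    print(f"550 TB per year: EUR {annual_bill(550, hypothetical_price):,.2f}")
```

The resulting per-TB-per-year figure can be set against the in-house tiers, keeping in mind that a provider's list price and a full TCO do not cover the same cost components.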
However… in preservation terms:
• Is it sustainable?
• Who is responsible for the data?
• Which jurisdiction applies?
• What if I want to migrate to another cloud?
• Continuity: no money? No data!
• Advice: be cautious about using the cloud for long-term storage.
Read on:
http://www.ncdd.nl/blog/?p=2347
Thank you! Questions?
Jeffrey DOT vanderhoeven AT kb DOT nl