Case Study: Georgia Tech
University Private Cloud for Researchers
Didier Contis
Director Technology Services
College of Engineering
Georgia Institute of Technology
Joe Arnold
CEO SwiftStack Inc.
Session Speakers
Didier Contis
Didier Contis is the Director of Technology Services for the
College of Engineering (CoE) at the Georgia Institute of
Technology, in the Greater Atlanta area. The largest of
Georgia Tech’s six colleges, CoE offers more than
50 graduate and undergraduate degree programs
through its main Atlanta campus and satellites
around the world. Its 13,000 students use an
estimated 150 unique apps—the same ones
businesses rely on to design airplane wings,
model circuit-board layouts, and much more.
2 Georgia Tech Case Study - OpenStack Summit 2014
Joe Arnold
Joe Arnold is co-founder and CEO of SwiftStack, a
leading provider of object storage software.
SwiftStack's customers include some of the
largest web and enterprise IT organizations. Joe
managed the first public OpenStack launch of
Swift after its release as an open source project.
He has been active in the OpenStack community
since 2010. Joe is the author of Object Storage
with Swift published by O'Reilly Media.
Object Storage and OpenStack Swift
Joe Arnold
CEO SwiftStack Inc.
Swift Object Storage – Key Attributes
• Open-source object storage system
• Powers the largest storage clouds
• Geographically distributed
• Multi-tenant
• Massively concurrent
• Extremely durable
• Runs on standard Linux
• Inexpensive commodity x86 hardware
“OpenStack Swift in particular has gained a lot of traction both in the
enterprise and in the service provider space” (October 2013)
“Swift is a proven solution, suitable for production needs, and should be
included in competitive evaluations of object-based storage solutions.”
(February 2014)
Swift Data Redundancy
Swift places 3+ replicas of all data in locations that are as unique as possible
Single Node Cluster
Disks are “as-unique-as-possible”
Small Cluster
Storage Nodes are “as-unique-as-possible”
Large Cluster
Storage Racks are “as-unique-as-possible”
Multi-Region
Distributed data centers are “as-unique-as-possible”
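The "as-unique-as-possible" idea can be illustrated with a small sketch. This is not Swift's ring builder: it is a simplified greedy picker written for illustration, and the `Device` record, its field names, and the overlap weights are all assumptions. Each new replica goes to the device that shares the fewest tiers (region, then zone, then node) with the replicas already placed.

```python
from collections import namedtuple

# Hypothetical device record; the tiers mirror the slides (region, zone/rack, node, disk).
Device = namedtuple("Device", "region zone node disk")

def place_replicas(devices, replicas=3):
    """Greedily pick devices so that each new replica overlaps as little
    as possible with the replicas already chosen."""
    chosen = []
    for _ in range(min(replicas, len(devices))):
        def overlap(dev):
            # Sharing a region costs 10, sharing region+zone adds 100,
            # sharing region+zone+node adds 1000 (illustrative weights).
            score = 0
            for placed in chosen:
                for depth in range(1, 4):
                    if dev[:depth] == placed[:depth]:
                        score += 10 ** depth
            return score
        candidates = [d for d in devices if d not in chosen]
        chosen.append(min(candidates, key=overlap))
    return chosen

# Small cluster: one zone, three nodes with two disks each.
small = [Device("r1", "z1", f"n{n}", f"d{d}") for n in range(3) for d in range(2)]
for dev in place_replicas(small):
    print(dev)
```

With a single node the picker falls back to distinct disks; with several nodes, racks, or regions it spreads replicas across the widest tier available, which is the behavior the four cluster sizes above describe.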
Swift Object Storage – Filesystem Conceptual View
[Diagram: files enter through the SwiftStack Filesystem Gateway over CIFS or NFS and are stored on the Swift nodes]
Georgia Tech Case Study
Didier Contis
Director Technology Services
College of Engineering
Who we are and what we do
• Georgia Tech: 21,471 undergraduate and
graduate students (Fall 2013)
• Six colleges: Architecture, Business, Computing,
Engineering, Liberal Arts, and Sciences
• Engineering graduate programs ranked 6th
• Engineering undergraduate programs ranked 5th
• 13,000 students in the College of Engineering,
largest in the U.S.
We have been deploying pre-cloud systems
since 2007…
Meet our federated condominium systems
Our VDI / App publishing farm…
Virtual Lab Project and its supporting shared infrastructure (Matrix)
Our HPC farm…
PACE: Partnership for an Advanced Computing Environment
HPC federation / condominium system: 28,000 CPU cores and 2 PB of storage
What we learned:
1) Our HPC and VDI users love compute power
2) They love their research data even more
They are not alone….
All our researchers / students love their research data
They love to
• Acquire
• Create
• Exchange
• Receive
Data….
Here is a research project generating a lot of data
Remote Sensing and GIS-enabled Asset Management System (RS-GAMS)
Assessment of pavement, bridge, and roadway assets using various sensors
Estimated storage needs: 2,400 lane miles of interstate highway currently on file, with a plan
to analyze 2,000 miles in the next few months…
Raw data: 2.2 GB per lane mile
Processed data: 1.2 GB per lane mile
16 million JPEG files so far!
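The per-lane-mile figures above translate into a total footprint with simple arithmetic. The sketch below is a back-of-the-envelope estimate only. It assumes every lane mile produces both raw and processed data, that the 2,000 planned miles come on top of the 2,400 already on file, and that Swift keeps 3 replicas; the slide states none of these assumptions explicitly.

```python
RAW_GB_PER_LANE_MILE = 2.2        # from the slide
PROCESSED_GB_PER_LANE_MILE = 1.2  # from the slide

def storage_tb(lane_miles, replicas=1):
    """Raw + processed footprint in TB (decimal units, 1 TB = 1000 GB)."""
    gb = lane_miles * (RAW_GB_PER_LANE_MILE + PROCESSED_GB_PER_LANE_MILE)
    return gb * replicas / 1000

on_file = storage_tb(2400)                        # lane miles already on file
planned = storage_tb(2000)                        # lane miles planned for analysis
replicated = storage_tb(2400 + 2000, replicas=3)  # assumption: 3 Swift replicas

print(f"on file: {on_file:.2f} TB")
print(f"planned: {planned:.2f} TB")
print(f"both, with 3 replicas: {replicated:.2f} TB")
```

Under these assumptions the logical data is on the order of 15 TB, and roughly three times that in raw disk once replication is counted.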
Here is a research project receiving a lot of data
Effective Capacity Analysis and Traffic Data Collection for the I-85 HOV to HOT Conversion
The effectiveness of the implementation of the HOT lane is being evaluated in a before and
after study.
Direct fiber network feed from Georgia Department of Transportation to Georgia Tech
• Over 400 TB of video currently stored on random fileservers, USB drives…
• Lots more video to collect…
Oh, by the way have you heard about:
Research Data Curation
The White House Office of Science and Technology Policy:
“has directed Federal agencies with more than $100M in R&D expenditures to develop
plans to make the published results of federally funded research freely available to the
public within one year of publication and requiring researchers to better account for and
manage the digital data resulting from federally funded scientific research.”
http://www.whitehouse.gov/blog/2013/02/22/expanding-public-access-results-federally-funded-research
http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf
So where do we store all this data?
Research Data Storage Challenges
Our challenges:
• Obviously we have a lot of research data. How much ??? (2 PB just for HPC)
• Cheap enterprise-level storage is still expensive
• Backup is a problem (cost, time)
Meet our BIGGEST challenges:
• Bring-your-own-drive: USB, thumb drives
• Consumer cloud: Dropbox, etc.
Research Data Storage Challenges
“Sometimes” important research data might be stored on a not-so-reliable solution:
• Due to the cost of existing “enterprise storage” and research program funding
• Backup? Could you repeat the question please?
!!! WARNING UNCONFIRMED REPORT !!!
Ultimate cheap NFS File Server circa 2006
• Refurbished desktop tower
• Two 5-port USB cards
• 13+ USB drives, each shared individually via NFS
Our magic answer to all our problems ???
VAPOR the hybrid cloud
Meet VAPOR
Goal: Build a Georgia Tech Distributed and Federated Academic Cloud
Proposed design principles:
• Led by Academic Units in partnership with Central IT
(currently College of Engineering, College of Science, College of Computing, Library, HPC PACE Group,
Office of Information Technology)
• Support Instruction and Research at Georgia Tech
• Distributed across campus and beyond (Hybrid)
• Federate multiple departmental projects
• Design / Architecture by Committee
• Academic Governance Oversight
• Need to be able to experiment and iterate quickly !!!!
Proposed Use Cases for VAPOR Cloud
1. Ephemeral computing: A machine runs for short term use. Possibly for
development/testing purposes.
2. "Pet" computer: student needs a system which is "permanent", stateful
and accessible both off and on campus. Basic usage is like VDI.
3. IaaS: Running campus services (both production and beta) on VMs. E.g. I
don't want to manage a hardware layer but I need to set up a purpose
built website to host data and services for an international research group
and webhosting doesn't meet my needs.
4. PaaS: Running a platform. E.g. I don't want to manage hardware or the
OS layer, but please give me a database I can use for this application.
<DRAFT> Vapor Architecture Vision </DRAFT>
Self-Service (to be defined / under investigation)
Management (Microsoft Azure Pack / Red Hat CloudForms…), shown for both components
On-premise Component:
• HYPERVISOR or CONTAINER PODs (Hyper-V / KVM / XenServer + NVIDIA vGPU…)
• COMPUTE STORAGE (Gluster / Ceph / ScaleIO / ...)
• NETWORK (VXLAN / NVGRE / VPN…)
Off-premise Component:
• Amazon AWS, Microsoft Azure, Rackspace
• NETWORK (VXLAN / NVGRE / VPN…)
DATA STORAGE (SwiftStack / DDN WOS / Gluster…)
Today we are focusing on…. Data Storage
Vision of the Data Storage layer
• Will hold a large portion of GT “Research Data”
• Probably multiple data storage layers (multiple vendors / technologies)
• Some of our current requirements:
  - Distributed and resilient (support multiple catastrophic failures)
  - Limit vendor dependency / lock-in (priority to open source)
  - Leverage de-facto standards (S3 / Swift)
  - Support multiple entry points (API, Cloud NAS, pluggable services)
  - Flexible design to limit the need to migrate data to new systems down the road
  - Integration with Georgia Tech identity management system (LDAP & AD)
Services supported by the Data Storage Layer / Swift
Services built on the data storage layer:
• Research Data Storage
• Research Data Curation
• Research Data Repositories
• “Dropbox” type service
• Filesystem Gateway (CIFS / NFS / GPFS / ….)
DATA STORAGE (SwiftStack – Storage as a service)
Why Swift / SwiftStack for the Data Storage Layer?
Like:
• Swift is open-source (limits vendor lock-in, in our mind)
• Turnkey approach / manageability provided by SwiftStack
• Growing ecosystem around Swift
• Low hardware requirements / homogeneous hardware not required
• System seems robust -> replication rather than RAID technology
• Price is right !!! (so far….)
Don’t Like:
• It is object storage / not a native filesystem
• Still a young project / product
Research Project Candidates to use Swift
• Projects in Aerospace, Transportation, and BioEngineering are currently targeted
Examples of research projects looking / experimenting with Swift:
• Effective Capacity Analysis and Traffic Data Collection for the I-85 HOV to
HOT Conversion
• Remote Sensing and GIS-enabled Asset Management System (RS-GAMS)
Our current strategy to engage research groups
Goal: Incentivize researchers to store data directly into Swift as objects when it makes sense
• This means demonstrating advantages from:
  - Indexing
  - Metadata
  - Scalability
  - Performance
  - Future benefits (analytics)
• It also means making an up-front investment:
  - In training and technical assistance
  - Providing free storage
[Diagram: files reach the Swift nodes through the SwiftStack CIFS/NFS Gateway; apps and scripts use the Swift HTTP APIs directly]
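To make the metadata advantage concrete, here is a minimal sketch of storing one object with custom metadata through the Swift HTTP API. It uses only the Python standard library and builds the request without sending it. The storage URL, token, container, object name, and metadata keys are all hypothetical; what comes from the Swift API itself is the `PUT` to `<storage URL>/<container>/<object>`, the `X-Auth-Token` header, and the `X-Object-Meta-*` header prefix for user metadata.

```python
from urllib.parse import quote
from urllib.request import Request

def build_object_put(storage_url, token, container, name, data, metadata):
    """Build (but do not send) a Swift PUT request for one object.
    Custom metadata travels as X-Object-Meta-* headers."""
    url = f"{storage_url.rstrip('/')}/{quote(container)}/{quote(name)}"
    headers = {"X-Auth-Token": token}
    for key, value in metadata.items():
        headers[f"X-Object-Meta-{key}"] = str(value)
    return Request(url, data=data, headers=headers, method="PUT")

req = build_object_put(
    "https://swift.example.edu/v1/AUTH_demo",  # hypothetical storage URL
    "hypothetical-token",                      # hypothetical auth token
    "rs-gams",                                 # hypothetical container
    "i85/mile-0042/frame-000001.jpg",          # hypothetical object name
    b"...jpeg bytes...",
    {"Lane-Mile": 42, "Sensor": "camera"},     # hypothetical metadata keys
)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send the request to a real cluster.
```

A script that writes objects this way can attach the lane mile or sensor type at upload time, which is the kind of benefit a file dropped on a CIFS/NFS share does not get for free.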
Filesystem Gateway
• Using object storage natively is difficult
• Lots of workflows are based on files. Our students use Windows / Linux application
packages which are not object friendly.
• Latency / speed is also an issue
• Strategies being deployed or investigated:
  - SwiftStack gateway with lots of cache
  - Would like a GPFS Gateway (High Performance Computing)
  - Storage abstraction technology (software-defined storage utopia…. EMC ViPR Data Services ?)
Filesystem Gateway – No Data Lock In
• No lock-in due to encoding with the SwiftStack gateway
  - Data in/out is the same via the Swift API or via the CIFS/NFS filesystem
• Traditional gateways (like S3/Glacier, Avere, Panzura) are a 'medieval marriage....forever'
  - These gateways severely lock data in: all data going in via a gateway MUST come out through the same gateway
  - It's a "Hotel California" for data: you can check out any time you like, but you can never leave
• Other gateways with lock-in have other benefits/features
  - E.g. deduplication, compression, etc.
  - Or offer the POSIX semantics required by some applications
[Diagram: objects and files reach the same Swift nodes through the Swift HTTP APIs and the SwiftStack Filesystem Gateway]
What about our Research Data Curation problem?
• Initiative led by the Georgia Tech Library
• Migrate from DSpace to a Research Data Curation repository built around
Fedora (repository infrastructure) and Hydra (front-end repositories)
• Fedora 4.0 will connect to Swift
  - via JBoss ModeShape and the Infinispan storage subsystem
  - Infinispan connection to Swift will initially use the Swift3 (S3) emulation layer
Swift zones distribution across campus
Federated and Distributed Academic cloud:
Each zone is located in a server room which is owned and
operated by a different GT department….. No one owns the cloud
VAPOR ???
Zone ISYE 1
Zone PACE 1
Zone ECS 1
We expect more zones to come on-line in the next 12 months
A geographically distant region is on the roadmap (using hosting agreements with other
universities and Internet2)
What hardware are we using?
• Supermicro chassis, primarily 24-bay
• Hardware configuration is heterogeneous (different drive capacities, not the same number of
storage nodes per zone)
• Drives are a mix of enterprise / consumer grade, mainly 1 TB or 2 TB
• Most storage nodes have 10 Gb network connectivity (SolarFlare or Mellanox ConnectX). Currently SFP+ 10 Gb… 10GBASE-T is next
• LSI SAS Adapter 9211-8i (do not forget to re-flash if needed to change the card from
Integrated RAID (IR) to Initiator Target (IT) firmware)
• SSDs for Account/Container Ring (60 GB to 120 GB)
• Memory to TB ratio?? 1 GB of memory per TB can be expensive…
Distributed management of our Swift Infrastructure
• Sysadmins from multiple departments share administrative responsibilities.
• Would like more delegation granularity down the road: delegation on a
per-zone / per-region basis to enable node-management or cluster-operator roles.
• Students are a great resource to replace dead drives…. If they know which
one to replace…..
SwiftStack Auth and LDAP
• Initially we were considering using AD integration
• LDAP was a definite requirement. Its availability delayed some of our testing / usage.
• Georgia Tech LDAP size is fairly large:
  - ou=accounts -> 300K entries
  - ou=people -> 2M entries
• So far so good. Initial integration was easy (5 minutes) but we waited until
the code was stable…..
• Anyone with a valid GT account can access the Swift cluster !!!!
  - (but we have not advertised its existence…. Please don’t tell our students)
Our Financial Model Approach
Limit recurring cost at all costs !!!!
Fund recurring cost at the central level (licensing for example)
Focus on Bring Your Own:
• Zone (BYOZ)
• Server (BYOS)
• Drive (BYOD)
We envision using HDDs as a form of currency with research groups.
4 TB of data to store = a payment of 3 x 4 TB HDDs, good for 5 years.
Hopefully storing 4 TB will be negligible in 5 years when drives start to die.
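The "HDDs as currency" rule can be written down as a one-line formula. The sketch below assumes the payment is simply enough raw drive capacity to hold every replica of the group's data, rounded up to whole drives. The slide only gives the 4 TB example, so the generalization to other sizes (and the function name `drives_owed`) is an assumption, not the team's published pricing.

```python
import math

def drives_owed(data_tb, drive_tb=4, replicas=3):
    """Whole drives a research group contributes: raw capacity for
    every replica of its data (assumed generalization of the slide)."""
    return math.ceil(replicas * data_tb / drive_tb)

for size_tb in (4, 10, 40):
    print(f"{size_tb} TB of data -> {drives_owed(size_tb)} x 4 TB drives")
```

For the slide's own example, 4 TB of data with 3 replicas on 4 TB drives comes out to 3 drives.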
IT’S ALIVE…. Okay… So What’s Next…
• Implement Quotas… probably container based Quotas
• Re-architect the proxy layer this summer. Dedicated proxy nodes and possibly Load
Balancer (Netscaler / F5)
• Might need to identify a high-performance NAS Gateway for specific workloads ('medieval
marriage....forever')
• Investigate support for a GPFS based Gateway
• Keep educating people on long term benefits of using Swift API to access data
• Unified access to data via SwiftStack FileSystem Gateway (convergence)
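For the container-based quotas mentioned in the first item, Swift ships a container quotas middleware that reads a byte limit from container metadata. A minimal sketch of the request that would set such a limit follows, assuming that middleware is enabled on the cluster. The storage URL, token, and container name are hypothetical, and the request is only built here, not sent.

```python
from urllib.request import Request

def build_quota_request(storage_url, token, container, max_bytes):
    """Build (but do not send) a POST that sets a byte quota on a container
    via the X-Container-Meta-Quota-Bytes metadata header."""
    headers = {
        "X-Auth-Token": token,
        "X-Container-Meta-Quota-Bytes": str(max_bytes),
    }
    url = f"{storage_url.rstrip('/')}/{container}"
    return Request(url, headers=headers, method="POST")

quota_req = build_quota_request(
    "https://swift.example.edu/v1/AUTH_demo",  # hypothetical storage URL
    "hypothetical-token",                      # hypothetical auth token
    "rs-gams",                                 # hypothetical container
    4 * 10**12,                                # 4 TB, matching the HDD example
)
print(quota_req.get_method(), quota_req.full_url)
```

Setting the limit per container keeps the quota aligned with a research group's project rather than with an individual account.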
Managing Swift with SwiftStack
Joe Arnold
CEO SwiftStack Inc.
SwiftStack Object Storage Software
Simple, Web-based MANAGEMENT
• DEPLOY: Deploy in Minutes not Days
• INTEGRATE: Seamlessly
• SCALE: Without Disruption
OpenStack Swift (support included)
Standard Hardware & Linux Distribution
Questions & Answers