Helix Nebula
Big Science in the Cloud
Micheál Higgins
Enterprise Solutions Architecture
CloudSigma
A Collaboration Initiative
• European Commission & relevant projects
• User organisations (demand-side)
• Commercial Service Providers (supply-side)
• European Cloud Computing Strategy
Helix Nebula
• ATLAS High Energy Physics Cloud Use – to support the computing capacity needs of the ATLAS experiment
• Genomic Assembly in the Cloud – a new service to simplify large-scale genome analysis, for deeper insight into evolution and biodiversity
• SuperSites Exploitation Platform – to create an Earth Observation platform focusing on earthquake and volcano research
• Scientific challenges with societal impact
• Sponsored by user organisations
• Stretch what is possible with the cloud today
Helix Nebula – The European Science Cloud
• Concept
• Big Science needs to begin to use Public Clouds
• The European GRID is aging out
• Cloud-burst has a much better TCO
• Avoid lock-in and procurement issues
• Federation and Identity Management
• Disintermediation of Cloud Vendor Solutions
• APIs
• Drive image formats (KVM, ESX, Xen, etc.)
• Cost/billing models
• Etc.
• Initial Membership
• Demand side: CERN, ESA and EMBL
• Supply side: CloudSigma, Atos, T-Systems, Logica, Interoute and 4 non-provider SMEs
Helix Nebula – The European Science Cloud
• Phase I – Flagship Proofs-of-Concept
• Q4 2011 to Q4 2012
• CERN – LHC ATLAS jobs with PanDA/Condor
• EMBL – De Novo (non-human) genome assembly with StarCluster
• ESA – SSEP Earth Observation site
• Environment
• One-to-one tests: vendor API, vendor drive format, etc.
• Success criteria were binary; no performance or cost data
• CloudSigma successful in all 3 and continuing to run ATLAS live
• Phase II – PoCs with some Disintermediation
• Q1 2013
• Many-to-many
• Blue Box 0.9 – remove complexity for the customer
• enStratus + integration
• Phase III – Expanded Membership and Disintermediation
• More demand- and supply-side partners
• Increased Blue Box functionality – Federated Clouds
Blue Box Maturity Model
• Release 0.9 – enStratus – January 2013
• Basic Services Catalog
• Federation and Identity Management, token pass-through
• Web Portal and API Translations
• Some Image Management and Cost reporting (not Billing) – as time allows
• Release 1.0
• Service Catalog and Cube Filtering
• Image Factory and Transport
• Billing / Payment Module
• Embedded Monitoring
• On-screen Provisioning
• Contextualization
• Post 1.0
• Cluster Management plug-ins (StarCluster, SGE, G-POD, etc.)
• Payment gateways
• PaaS for Science
• Recipes and Golden Image Management
• SLA / OLA Reporting
• Data Movement and Open Networking
• Ecosystems
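The "API Translations" idea in Release 0.9 can be sketched as a thin adapter layer: one neutral server request, dispatched to per-vendor formats. All names, field labels, and sizes below are hypothetical illustrations, not enStratus or vendor APIs:

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class ServerRequest:
    """Vendor-neutral request, as a customer would express it."""
    name: str
    cpu_cores: int
    ram_gb: int

def to_granular(req: ServerRequest) -> dict:
    # Granular-resource provider: exact CPU (here as MHz) and RAM in bytes.
    return {"name": req.name, "cpu": req.cpu_cores * 2000,
            "mem": req.ram_gb * 1024 ** 3}

def to_bundled(req: ServerRequest) -> dict:
    # Bundle-style provider: pick the smallest instance size that fits.
    sizes = {"small": (2, 4), "large": (8, 32)}  # (cores, ram_gb)
    for size, (cores, ram) in sizes.items():
        if cores >= req.cpu_cores and ram >= req.ram_gb:
            return {"name": req.name, "instance_type": size}
    raise ValueError("no instance size fits the request")

ADAPTERS: Dict[str, Callable[[ServerRequest], dict]] = {
    "granular-vendor": to_granular,
    "bundled-vendor": to_bundled,
}

req = ServerRequest("atlas-worker", cpu_cores=2, ram_gb=4)
print({vendor: fn(req) for vendor, fn in ADAPTERS.items()})
```

The customer only ever writes `ServerRequest`; adding a supply-side partner means adding one adapter, which is the complexity-removal the Blue Box aims at.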
The Blue Box 0.9 (architecture diagram)
The Blue Box – Production (architecture diagram)
Helix Nebula – Learnings
• Not all things are Public Cloud suitable
• This is not the GRID
• Some science middleware is not cloud friendly
• IPs and UUIDs change at reboot
• The Public Cloud is commodity hardware
• 2.2 to 2.3 GHz CPUs are common, less in older clouds
• 32, 64 and 96 GB maximum RAM sizing – no 1 TB servers
• Caution: you can't eat the whole physical server
• Parts of De Novo are single-core; massive MapReduce
• Many science applications do not scale horizontally, yet
• Software innovation by the vendors is required
• Putting data in the Public Cloud is still perceived as a risk
• My firewall is better than your firewall – or is it?
• Burst utilization must be somewhat predictable
• No, you may not have 65,000 large VMs for 2 hours later today
• What do you mean I have to pay for it?!
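The "IPs and UUIDs change at reboot" point is worth a concrete sketch: cloud-unfriendly middleware caches its address to disk and breaks after a reboot, while cloud-friendly code re-discovers it at every boot. A minimal example (the 198.51.100.1 target is a reserved documentation address, used only for a route lookup):

```python
import socket

def current_ip() -> str:
    """Re-discover this VM's primary IP at boot instead of caching it."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        try:
            # A UDP connect() sends no packets; it only asks the kernel
            # which local address would route to the target.
            s.connect(("198.51.100.1", 53))
            return s.getsockname()[0]
        except OSError:
            return "127.0.0.1"  # no route available (e.g. offline host)

print(current_ip())
```

Anything the application persists (peer lists, license bindings, queue registrations) should be keyed off a call like this at startup rather than a value written once at install time.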
CloudSigma
• Public IaaS Provider since 2008
• Locations (Tier 4+ carrier-neutral datacenters)
• Zurich, Switzerland (headquarters)
• Las Vegas, USA
• Amsterdam (in work)
• São Paulo (planned 2012)
• Key Values:
• Open Platform and Networking
• Constant Innovation (big secret: all-SSD storage!)
• High Availability and Uptime SLAs
• Customer Relationships / Enterprise Architecture Team
• Standards Adherence (OpenNebula, jclouds, etc.)
• We only sell IaaS; we partner for SaaS and PaaS
• The Ecosystem Concept
CloudSigma Features
• Granular Resources, not Bundles
• CPU, cores, RAM, disk, SSD, GPUs, etc. all virtualized
• Graphic Equalizer – reboot required
• Allows tuning of the server to the workload
• E.g. Oracle requires 1.5x to 2x the memory of a standard config
• HEP/HPC applications are also not typical configurations
• Open Architecture
• KVM hypervisor with full virtualization; no sniffing, no root
• Any x86 O/S and application (no, you can't run Mac)
• Public Drives Library, pay-per-use and bring-your-own
• Open Networking – 2x 10 Gb/s NICs
• SigmaConnect and IX – private back-haul lines
• Peering (e.g. SWITCH, GÉANT, etc.)
• No customer lock-in – upload and download easily
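Granular resources mean a customer specifies exact quantities rather than picking an instance size. A sketch of what such a server definition could look like; the field names and units here are assumptions for illustration, not the documented CloudSigma API schema:

```python
import json

def server_spec(cpu_mhz: int, ram_mb: int, ssd_gb: int) -> str:
    """Build a JSON body for a granular (unbundled) server definition.

    Illustrative only: field names/units are placeholders, not a real API.
    """
    spec = {
        "name": "hep-worker",
        "cpu": cpu_mhz,                # total CPU in MHz, not an instance size
        "mem": ram_mb * 1024 ** 2,     # RAM in bytes
        "drives": [{"media": "ssd", "size": ssd_gb * 1024 ** 3}],
    }
    return json.dumps(spec)

# A memory-heavy (Oracle-style) workload: roughly 2x the RAM that a
# bundled "standard" shape with the same CPU would offer.
print(server_spec(cpu_mhz=8000, ram_mb=32768, ssd_gb=200))
```

The point of the "graphic equalizer" is exactly this: each slider (CPU, RAM, SSD) moves independently, so atypical HEP/HPC ratios don't force you into the next bundle size up.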
CloudSigma Features
• In Work for the Future
• All-SSD storage – only S3 is magnetic
• JSON API
• Virtualized GPUs
• Virtualized H/W for transcoding and rendering
• Virtual Desktop – Command and Control
• Additional science applications: PanDA/Condor, etc.
• PaaS for Media and PaaS for Science
The Ecosystem Concept
• Hold the data and the world will come to the cloud
• Low- to zero-cost data hosting
• More margin in CPU and RAM than in storage
• Meta meta-data – joining the databases
• Known point + vector + distance vs. lat/lon
• ESA EO + WHO = mosquito outbreak predictions
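Joining databases that describe locations differently (one as a known point plus bearing and distance, the other as latitude/longitude) requires converting between the two. A minimal sketch using the standard spherical "direct" formula, which is sufficient when cross-dataset matching tolerates a few hundred metres of error:

```python
import math

R = 6371000.0  # mean Earth radius in metres (spherical approximation)

def destination(lat_deg: float, lon_deg: float,
                bearing_deg: float, distance_m: float) -> tuple:
    """Convert known point + vector (bearing) + distance into lat/lon."""
    lat1 = math.radians(lat_deg)
    lon1 = math.radians(lon_deg)
    brg = math.radians(bearing_deg)
    d = distance_m / R  # angular distance

    lat2 = math.asin(math.sin(lat1) * math.cos(d) +
                     math.cos(lat1) * math.sin(d) * math.cos(brg))
    lon2 = lon1 + math.atan2(
        math.sin(brg) * math.sin(d) * math.cos(lat1),
        math.cos(d) - math.sin(lat1) * math.sin(lat2))
    return math.degrees(lat2), math.degrees(lon2)

# 100 km due north of (0, 0) lands at roughly 0.9 degrees N
print(destination(0.0, 0.0, 0.0, 100_000))
```

Once both datasets are in lat/lon, a spatial join (e.g. EO imagery cells against WHO case reports) becomes an ordinary nearest-neighbour or grid-bucket lookup.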
The Easiest Way to Understand the CS Ecosystem Concept
Questions / Discussion