Research Area Background
 Area: systems – applied computer science
 Question: what to do?
 Dr. Dan Reed, Vice President at Microsoft, in his keynote
talk “Clouds: from Both Sides New” in Washington in
2011, stated (my interpretation):
 University researchers should find a
research niche, because they do not have
enough resources (human and financial) to
compete against the mainstream research
carried out by big companies
SaaS Clouds Supporting HPC Biology Sciences - CMU CV
July 2013
Andrzej Goscinski
Service and Cloud Computing Lab
Senior Members: A. Wong, P. Church, M. Brock
Biology and Medicine Needs
 Biology and medicine specialists collect a lot of data
 Many of them only use their workstations, desktops
and even laptops to carry out data analysis
 Many of them are not familiar with HPC
 Many biology and medicine specialists do not program
well and do not have system administration skills (nor
should they need them)
 Biology and medicine specialists would like to use
computers to get analysis results quickly, without the
burden of computing “jargon”
Lab (Current) Research Aim
 To study and develop a technology that simplifies
the deployment, exposure, access and customization
of HPC science applications in SaaS clouds
 This technology forms the basis of research
environments that enable science specialists to use
HPC resources in clouds for running their
computationally demanding software
 easily
 on-demand
 at reasonable costs
for the discovery of new and significant discipline
knowledge
The NIST Definition of Cloud Computing
NIST Special Publication 800-145, P. Mell and T. Grance, Sept 2011
 Cloud computing is a model for enabling
ubiquitous, convenient, on-demand network
access to a shared pool of configurable computing
resources (e.g., networks, servers, storage,
applications, and services) that can be rapidly
provisioned and released with minimal
management effort or service provider interaction
 This cloud model is composed of five essential
characteristics, three service models, and four
deployment models
NIST: Service Models
 Infrastructure as a Service (IaaS)
 The delivery of hardware resources as a
service
 Users are granted access to cloud
infrastructure through virtual machines
 Platform as a Service (PaaS)
 Builds services on IaaS clouds to support
cloud application deployment
 Most cloud platforms consist of a high-level
language and a well-defined Application
Programming Interface
 Software as a Service (SaaS)
 Exposes applications designed to run on a
cloud as services
 Eliminates the need to install or run
applications on the customer’s computer and
is often cheaper than buying a full software
licence
NIST: Deployment Models
 Public Clouds
 Accessed by the general public
 Allows users to rent resources such as
computational time or storage as necessary
 Private Clouds
 Used exclusively by an organisation
 Allows for a specific service level agreement
(SLA) to be made to ensure availability and
security
 Community Clouds
 Used by a group of users that have shared
concerns
 Allows for a shared mission statement
which has specific security and policy
requirements
 Hybrid Clouds
 Combines cloud resources from two or
more deployment models to accomplish a
user’s goal
NIST: Essential Characteristics
 On-demand self-service
 Broad network access
 Resource pooling
 Rapid elasticity
 Measured service
Characteristics of Clouds that Attract Business
 Clients only pay for what they consume
 Rather than spending money on buying, managing and
upgrading servers, business administrators concentrate on
the management of their applications
 The required service is always there: availability is very
high, which leads to short times from submission to the
completion of execution
 Cloud computing provides opportunities to small
businesses by giving them access to world-class systems
that would otherwise be unaffordable
 On the other hand, even small companies can export their
specialized services to clients
When Using Clouds Additional Steps Must be
Carried out Depending on the Service Model
 IaaS - involves construction of a virtual cluster, and
compilation and deployment of distributed software
 These are system administrators’ jobs
 PaaS - aimed at developers; provides users with a
development environment and automates the
deployment of resources
 Access to development tools and languages is limited
 SaaS - users are able to access HPC applications through
graphical interfaces; however, users are reliant on what
cloud service providers have made available
 Such software may have expensive licenses or not be
readily available
Cloud Trends {ChangeWave Investing Weekly Update (5/21/2013)}
 Over the past 2.5 years the percentage of companies
who say they are currently using public cloud
computing services has climbed from 14% to 40%.
Cloud Trends {ChangeWave Investing Weekly Update (5/21/2013)}
 The results in the latest ChangeWave cloud survey
point to continued growth for public, private and
hybrid cloud computing
 Within public cloud computing, software as a service
(SaaS) remains the area with the fastest growth rate
 When asked why their companies do not use cloud
computing, respondents most often cite Security
Concerns (41%), while 15% cite the Complexity of
Integrating with Existing IT Infrastructure
HPC vs. HPC Clouds vs. Discipline Specialists
 Problem 1: HPC requires
 powerful and expensive computational and data storage hardware
 advanced middleware
 sophisticated discipline oriented applications
 knowledgeable programmers and system managers
 Clouds have been created for business ($$$), not to earn
money from HPC ($)
 Most HPC clouds are based on IaaS clouds enhanced by additional
hardware and middleware to support HPC
 Problem 2: the cost and time overheads in learning how to
 prepare an HPC cloud and
 properly install and configure applications in the
underlying HPC facilities
 Conclusion: if discipline specialists want to use HPC clouds
for scientific discovery, they also must become system
administrators and good programmers
Clouds and HPC
 A response to Problems 1 & 2 faced by discipline
specialists lies in cloud computing
 These days clouds can support some HPC workloads
 Clouds are oriented to support High Scalability
Computing (HSC) rather than HPC
 Note: with the improvement of communication
performance clouds are becoming a major tool for HPC
 Question: what kind of HPC applications could be
executed on a cloud?
HPC Clouds vs. Applications
HPC Clouds vs. Discipline Specialists
 Most HPC clouds are based on IaaS clouds enhanced by
additional hardware and middleware to support HPC
 Problem 2 again: the cost and time overheads in learning
how to prepare an HPC cloud and its applications remain a
problem
 HPC cloud users are
 presented with a set of virtual and physical servers
 required to put the servers together to form the HPC
facilities to run their software applications on
 The software applications must be properly installed and
configured in the underlying HPC facilities
 Conclusion: if discipline specialists want to use HPC clouds
for scientific discovery, they must also become
 system administrators and
 good programmers
Web-based Software Tools/Packages
 In many areas of science, discipline specialists
benefit from Web-based software tools
 Software tools are easy to use and attractive to
specialists through their discipline oriented
interfaces
 scientific workflow systems (Galaxy)
 web portals for accessing grid resources (P-GRADE)
 web portals of science gateways such as HubZero
 Observation: specialists appreciate easy to use
Web-based discipline oriented interfaces!
Plenary "Cloud in Action" CLOUD 2013 panel
HPC Applications Exposed as Services in SaaS
Clouds
 Use of clouds (ChangeWave Research)
 Conclusion: discipline specialists could benefit most from
the execution of their HPC applications if they are
 exposed as services in SaaS clouds and
 accessed through discipline (tool-based) interfaces
Merging SaaS Cloud Services and Web Tools
 Question: are we on the right track?
 Yes, we are!
 Giving users faster turnaround times on
their experiments by using clouds is one
of the major issues that a new version of
the AGAVE software tool promises to address
 AGAVE is one of the well-known and widely used Web-
based software tools
 AGAVE delivers science-as-a-service
 Data is processed using analytics provided as
SaaS services
Direct Research Questions
 How can scientists be enabled to deploy software applications
in clouds?
 How can clouds be made easy for discipline researchers to use
when running HPC applications?
 How can the customization and reuse of HPC
applications in clouds be supported?
 These three questions form the current research scope of
our Lab
 Our research aim again: develop a technology that
 automatically creates a virtual machine (VM)
 exposes an application as a service
 deploys it on the VM
 generates an easy to use interface – a Web form
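The four steps of the research aim above can be read as a small pipeline. The sketch below is only an illustration under stated assumptions: `create_vm`, `expose_service`, `deploy`, `generate_web_form` and `publish` are hypothetical names, the parameter values are made up, and no real cloud API is called.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualMachine:
    name: str
    deployed: list = field(default_factory=list)  # names of deployed services


@dataclass
class Service:
    app_name: str
    parameters: dict  # parameter name -> default value


def create_vm(app_name: str) -> VirtualMachine:
    # Hypothetical stand-in for an IaaS call that boots a VM.
    return VirtualMachine(name=f"vm-{app_name}")


def expose_service(app_name: str, parameters: dict) -> Service:
    # Wrap the application and its inputs in a service description.
    return Service(app_name=app_name, parameters=dict(parameters))


def deploy(service: Service, vm: VirtualMachine) -> None:
    # Record the service as deployed on the VM.
    vm.deployed.append(service.app_name)


def generate_web_form(service: Service) -> list:
    # One form field per service parameter, shown with its default value.
    return [f"{name} [{default}]" for name, default in service.parameters.items()]


def publish(app_name: str, parameters: dict):
    # Run the four steps in order and return the VM and the form fields.
    vm = create_vm(app_name)
    service = expose_service(app_name, parameters)
    deploy(service, vm)
    return vm, generate_web_form(service)


# Illustrative values only; "nt" is a placeholder database name.
vm, form = publish("mpiBlast", {"processes": 8, "database": "nt"})
print(vm.deployed, form)
```

Keeping each step as a separate function mirrors the slide: any one step could later be replaced by a real cloud call without touching the others.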
Lab’s Initial Research
 Web services, which are used to develop services, are
stateless
 Our response: stateful Web services
 Service discovery and selection is a major barrier to the
application of cloud computing (only simple catalogues
are in use)
 Our response: a dynamic broker based on attributed
names
 The application of HPC is unaffordable to small and
medium research groups and institutions
 Our response: the CaaS framework, which exposes a
cluster as a service and makes it available within a
private and public cloud
From IaaS/PaaS to SaaS with a Broker (M. Brock)
[Figure (diagram not reproduced): layers of the HPCynergy cloud, a
SaaS-like cloud, and the client (end user). Labels: Application Web Form
(Web form URL, application invocation); Dynamic Broker (cluster
configuration, installed software, resource availability, available
databases); CaaS (job start and monitoring, file transfer, compilation);
RVWS (state exposure via stateful WSDL); Cluster Middleware (scheduling
and monitoring); Hardware (CPUs, memory and disk)]
From IaaS/PaaS to SaaS with a Broker (M. Brock)
 The RVWS Framework
 Allows current activity and characteristics of resources to be exposed as
services via WSDL documents
 A compatible extension to existing Web standards
 The Dynamic Broker
 A discovery service that uses stateful WSDL documents
 CaaS Infrastructure
 Web service-based middleware for easy publishing, discovery and use
of clusters
 HPCynergy
 A prototype private cloud built using CaaS for easy access to HPC
resources and applications
 HPC Hybrid Deakin (H2D) Cloud
 Able to discover suitable resources from both public and private clouds
to execute single applications too large for a single cluster
 All tasks, such as parameter modification, data file break-up and
multiple application monitoring, are handled on behalf of the user
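The idea of a broker that selects resources by their published attributes can be illustrated with a short sketch. It does not use WSDL or the actual Dynamic Broker; it assumes each resource's current state is already available as a plain dictionary, and the cluster names, attribute names (`software`, `free_cpus`) and the `matches`/`discover` helpers are all hypothetical.

```python
def matches(resource: dict, required: dict) -> bool:
    """Return True if the resource satisfies every required attribute.

    Numeric requirements are treated as minimums; all other
    requirements must match exactly.
    """
    for key, wanted in required.items():
        actual = resource.get(key)
        if actual is None:
            return False
        if isinstance(wanted, (int, float)) and not isinstance(wanted, bool):
            if actual < wanted:
                return False
        elif actual != wanted:
            return False
    return True


def discover(registry: list, required: dict) -> list:
    """Return matching resources, those with most free CPUs first."""
    found = [r for r in registry if matches(r, required)]
    return sorted(found, key=lambda r: r.get("free_cpus", 0), reverse=True)


# Hypothetical state, as it might be published by each cluster.
registry = [
    {"name": "private-cluster", "free_cpus": 4, "software": "mpiBlast"},
    {"name": "public-cluster", "free_cpus": 16, "software": "mpiBlast"},
    {"name": "other-cluster", "free_cpus": 32, "software": "none"},
]

for resource in discover(registry, {"software": "mpiBlast", "free_cpus": 8}):
    print(resource["name"])
```

Because the selection runs against current state rather than a static catalogue, the same request can return different clusters as load changes.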
SaaS Cloud Supporting HPC Science Applications
• Three steps:
• Deployment of HPC applications on IaaS clouds
• Exposure of HPC application services
• Access of HPC application services
 Transforming complicated HPC applications into easy-to-use SaaS
cloud services
[Figure (diagram not reproduced). Labels: HPC Resources; SaaS Cloud;
HPC Application Service Registry; HPC Application Service Publishing;
Virtual Machine Image; User; Web Form; Accessing; Service Discovery
(Yes / No); HPC Application Service, Web Form Generation; Deploying HPC
Application; HPC Application Deployment; IaaS Cloud]
Using the Framework
 To conduct scientific discovery by executing HPC applications on clouds,
the discipline researcher contacts the HPC Application Service Registry
 Scenario 1: the HPC application service of the researcher’s interest is found
 The researcher selects the cloud service
 Resources are selected automatically, and the application deployment service sets up and
configures the cloud
 The automated interface generation service constructs a user-friendly, discipline-specific
interface for the requested HPC application service
 The researcher accesses the cloud service through the provided interface
 Scenario 2: the HPC application service of the user’s interest is not found, but
the discipline researcher has programming and system administration skills
and decides to deploy a new targeted HPC application in an IaaS cloud
 The Automatic HPC Application Deployment System can automate parts of this process
 The outcome is either
 a virtual machine image containing a copy of the properly installed and configured HPC application, or
 a software service (consisting of input/output, invocation information and hardware requirements) which
can be deployed on a virtual machine
 Stage 1: the cloud service published in the HPC application service registry is readily
accessible in the IaaS cloud
 The new cloud service generated by the Automatic HPC Application Deployment System is stored for future
use in the HPC Application Service Registry
 Stage 2: the user can employ the Automatic HPC Application Service and Web Form
Generation System to automate the formation of an HPC Application Service exposing the
HPC application
 The HPC Application Service is abstracted by a user-friendly, discipline-specific interface that is published in
the HPC application service registry (see Scenario 1)
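The two scenarios above amount to a lookup followed by an optional deploy-and-publish step. The sketch below illustrates that control flow only; `ServiceRegistry`, `request_service` and the `deploy_fn` callback are hypothetical names and do not describe the lab's actual registry.

```python
class ServiceRegistry:
    """A toy in-memory stand-in for the HPC Application Service Registry."""

    def __init__(self):
        self._services = {}

    def find(self, app_name):
        # Returns the stored description, or None if nothing is published.
        return self._services.get(app_name)

    def publish(self, app_name, description):
        self._services[app_name] = description


def request_service(registry, app_name, deploy_fn=None):
    """Return (scenario, service) for a researcher's request.

    deploy_fn is supplied only by a researcher who has the skills to
    deploy a new application (Scenario 2).
    """
    service = registry.find(app_name)
    if service is not None:
        return "scenario 1", service      # found: use it directly
    if deploy_fn is None:
        return "not available", None      # not found, nobody to deploy it
    service = deploy_fn(app_name)         # deploy and describe the service
    registry.publish(app_name, service)   # store it for future users
    return "scenario 2", service


registry = ServiceRegistry()
first = request_service(registry, "mpiBlast", deploy_fn=lambda name: {"app": name})
second = request_service(registry, "mpiBlast")
print(first[0], second[0])
```

The second request follows Scenario 1 because the first one published its result, which is the reuse that the registry is meant to provide.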
Implementation of the HPC Cloud Framework (A. Wong)
Cloud Service Stack
 Services provided at the cloud service stack:
 Bottom (IaaS layer): Amazon EC2 was used to provide cloud
infrastructure services
 Middle (HPCaaS layer): an HPC software library was used to
expose and access Amazon EC2 services
 Top (SaaS layer): an HPC application service was developed
and exposed as a tool in the Galaxy server
[Figure (diagram not reproduced): cloud service stack. Labels: HPC
Application Service, HPC Application API (SaaS layer); HPC Service, HPC
Software Library (HPCaaS layer); Amazon EC2 Service, HPC Application VM
Image (IaaS layer); services are exposed as Web form tools in the Galaxy
server]
The Galaxy Web-based Platform (A. Wong)
 Galaxy provides a powerful feature for tool integration where
each tool (application) is presented to users as a Web form
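To make the "tool as a Web form" idea concrete, the sketch below renders a plain HTML form from a list of tool parameters. It is not Galaxy's mechanism and does not use Galaxy; the `render_form` helper, the `/run/...` action URL and the parameter names are hypothetical.

```python
from html import escape


def render_form(tool_name, params):
    """Render one text input per (name, default) pair as an HTML form."""
    lines = [f'<form method="post" action="/run/{escape(tool_name)}">']
    for name, default in params:
        label = escape(name)
        value = escape(str(default))
        lines.append(
            f'  <label>{label} <input name="{label}" value="{value}"></label>'
        )
    lines.append('  <button type="submit">Execute</button>')
    lines.append("</form>")
    return "\n".join(lines)


# Illustrative parameters only.
print(render_form("mpiBlast", [("cluster_name", "demo"), ("processes", 8)]))
```

Escaping every name and value keeps the generated form well-formed even when a default contains quotes or angle brackets.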
An Interface to Access the HPC Cloud (A. Wong)
 An HPC cluster was being constructed whose compute
instances would support mpiBlast execution
An Interface to Access the HPC Cloud (A. Wong)
 A cluster of 8 nodes was constructed at Amazon EC2
An Interface to Access mpiBlast (A. Wong)
 mpiBlast was accessed by supplying parameters: cluster
name, number of processes and other typical parameters
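A form submission of this kind eventually has to become a command line on the cluster. The sketch below shows one way that mapping could look, assuming mpiBLAST is launched through `mpirun` and accepts blastall-style options (`-p`, `-d`, `-i`, `-o`). The `build_mpiblast_job` helper, the cluster name and the file and database names are hypothetical, and nothing is executed.

```python
import shlex


def build_mpiblast_job(cluster_name, processes, program, database,
                       query_file, output_file):
    """Turn form parameters into a job description (nothing is run here)."""
    if processes < 1:
        raise ValueError("processes must be a positive integer")
    argv = [
        "mpirun", "-np", str(processes),  # number of MPI processes
        "mpiblast",
        "-p", program,                    # BLAST program, e.g. blastn
        "-d", database,                   # formatted sequence database
        "-i", query_file,                 # query sequences
        "-o", output_file,                # result file
    ]
    return {"cluster": cluster_name, "argv": argv}


# Placeholder values only.
job = build_mpiblast_job("demo-cluster", 8, "blastn", "nt",
                         "query.fasta", "results.txt")
print(job["cluster"], shlex.join(job["argv"]))
```

Building the command as a list of arguments rather than a single string avoids quoting problems when a file name contains spaces.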
An Interface to mpiBlast (A. Wong)
 mpiBlast execution finished at Amazon EC2; its result file was
transferred automatically to the Galaxy server for post-processing
Uncinus: Cloud Deployment (P. Church)
 Supports
 Resource Allocation
 Workflow Orchestration
 Cloud Bursting
 Genomics in the clouds
 Gene Discovery
 Personalized Genomics
 Leverage EC2 to improve
the speed and accuracy of
analysis
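Cloud bursting, one of the features listed above, can be illustrated with a simple allocation rule: use local nodes first and send only the overflow to the cloud. This is a generic sketch of that rule, not Uncinus's actual policy; the `plan_allocation` helper and the numbers are hypothetical.

```python
def plan_allocation(nodes_needed, local_free, cloud_limit):
    """Split a node request between local resources and a public cloud.

    Local nodes are used first; any remainder "bursts" to the cloud,
    up to cloud_limit. Whatever still cannot be placed is reported
    as unmet.
    """
    if min(nodes_needed, local_free, cloud_limit) < 0:
        raise ValueError("all arguments must be non-negative")
    local = min(nodes_needed, local_free)
    cloud = min(nodes_needed - local, cloud_limit)
    unmet = nodes_needed - local - cloud
    return {"local": local, "cloud": cloud, "unmet": unmet}


print(plan_allocation(nodes_needed=7, local_free=3, cloud_limit=4))
```

A request that fits on local nodes never touches the cloud, so cloud cost is incurred only when local capacity runs out.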
Uncinus: Case Study (P. Church)
 To identify genes transferred
upon digestion of dairy products
 Mother -> Child
 An 8-step workflow was developed
and run on Uncinus
 Run on the following resources:
Resources               #Nodes
Amazon (cc1.4xlarge)    2
Amazon (m1.Large)       2
West-Lin Cluster        2
Mamsap Server           1
Uncinus: Case Study (P. Church)
 Cloud bursting improved performance
 Workflow mode reduced run time by 8 hours
Uncinus: Case Study (P. Church)
 Results from the workflow found genes active during
lactation and during digestion of dairy
 Is this gene transfer or a reaction? Further work is needed
Increasing Scalability – Hybrid Clouds
[Figure (diagram not reproduced). Labels: Public Clouds (Compute Cloud,
Storage Cloud); Service Request; Publishing; (Distributed) Service Broker
(Broker 1 to Broker N); Compute Cloud; Private Compute Cloud; Storage
Cloud]
Solutions from Hybrid/Federated Clouds
 Hybrid/Federated Cloud Management (FCM)
Architecture
 A recent work that provides a reference architecture
consisting of brokering services
 User requests are serviced by creating virtual
appliances based on user request parameters and
running them inside virtual machines
 Appliances are stored in repositories and decomposed
over time to support the creation of future appliances
 As virtual appliances contain a full software stack
(from the operating system upwards), there are high
data transfer costs
Solutions from Hybrid/Federated Clouds
 There is also an (unnamed) toolkit for VM migration
between clouds
 Users are able to transfer VMs between public and
private clouds to control load (manually or
automatically)
 However, the interface itself is primitive at best
Conclusions
 Clouds are being moved from business to specialized research
 HPC on clouds promises scalability, faster turnaround times,
lower costs, services on demand
 Discipline specialists should not be forced to become (good)
programmers and system administrators
 Easy and discipline oriented interfaces are very important
 Web tools offer discipline oriented interfaces but are
inflexible and do not support HPC widely
 Combining HPC clouds and Web tools is the way
 HPC applications exposed as services of SaaS clouds and
accessed using Web forms are the solution!
 Hybrid clouds will grab the HPC market