The EPIKH Project
(Exchange Programme to advance e-Infrastructure Know-How)
Giuseppe Andronico
INFN Sez. CT / Consorzio COMETA
Beijing, 13.05.2011
www.epikh.eu
Outline
• Computing and distributed computing
• Grid computing
• Cloud computing
• Grid and Cloud computing together?
Computing
The computing era started with mainframes
Big central CPU, memory, and storage, used at the same time by different users and batch jobs
Computing: multiprocessing
• Increase the processing power of a system
• Parallel processing
• Tightly coupled systems
– Master-slave multiprocessing
– Symmetric multiprocessing
• Loosely coupled systems
– Shared-nothing model
– Shared-disk model
Computing
Introduction of personal computers changed computing
Distributed computing
Ever more powerful personal computers and the introduction of networking made it easy to implement loosely coupled systems, known as clusters
Distributed computing
Externally, clusters appear as a single computing unit.
Network nodes are individually identifiable.
Workload on a cluster is determined by cluster administration and load-balancing software.
Workload across a generic network cannot be controlled in this way.
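To make the idea of load-balancing software on a cluster more concrete, here is a minimal sketch in Python. It is only an illustration, not the scheduler of any real cluster system: the function name `assign_jobs`, the node names, and the job costs are all hypothetical. Each job is given to the node that currently carries the smallest total load.

```python
import heapq


def assign_jobs(job_costs, node_names):
    """Greedy least-loaded placement (illustrative assumption, not a real scheduler).

    job_costs:  list of estimated job costs (arbitrary units)
    node_names: list of node identifiers
    Returns (placement, loads): job indices per node and total load per node.
    """
    # Min-heap of (current_load, node_name); the least-loaded node is popped first.
    heap = [(0, name) for name in node_names]
    heapq.heapify(heap)
    placement = {name: [] for name in node_names}

    for job_id, cost in enumerate(job_costs):
        load, name = heapq.heappop(heap)
        placement[name].append(job_id)
        heapq.heappush(heap, (load + cost, name))

    loads = {name: sum(job_costs[j] for j in jobs) for name, jobs in placement.items()}
    return placement, loads


if __name__ == "__main__":
    placement, loads = assign_jobs([5, 3, 2, 4], ["node-a", "node-b"])
    print(placement)
    print(loads)
```

Real cluster schedulers also consider queues, priorities, and resource requirements; the sketch only shows the central decision of where to send the next job.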
Distributed computing
• High performance networking
• Parallel computing with clusters
• Distributed and networking file systems
• Beowulf and Beowulf-like clusters
Distributed computing
Computing platforms:
Geographically distributed computers
(Grid computing in the broadest sense)
Cluster computing
Parallel computers
Software Techniques:
Object oriented approaches
Java Remote Method Invocation (RMI)
CORBA (Common Object Request Broker Architecture)
Web services
Remote procedure calls (RPC)
Concept of service registry
Beginnings of service oriented architecture
(Timeline axis in the original figure: 1985 to 2005)
Grid & Cloud computing
“If computers of the kind I have advocated become the computers of the future, then computing may someday be organized as a public utility just as the telephone system is a public utility... The computer utility could become the basis of a new and important industry.”
John McCarthy, at the MIT Centennial in 1961
Grid computing
Some problems arose that were too complex to be tackled by a single cluster built in one place.
One example is the Large Hadron Collider, whose experiments produce tens of petabytes of data to be analyzed every year.
Another is the analysis of the human genome.
The winning solution was to adopt grid computing
Grid computing
Grid computing is about collaborating and resource sharing as much as it is about high performance computing
Resources to be shared:
• Storage
• Sensors for experiments at particular sites
• Application Software
• Databases
• Network capacity, …
Grid computing
Ingredients:
• High capacity and high speed networks
• Computers and other resources
• Middleware, the software to share resources
• Authorization and authentication system
• Virtual Organizations
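As a rough illustration of how Virtual Organizations can tie into authorization, the sketch below grants access when a user belongs to at least one VO that a resource supports. This is a simplified assumption for teaching purposes, not the behaviour of any specific grid middleware; the class names, the `is_authorized` function, and the example subject string and VO names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    supported_vos: set = field(default_factory=set)  # VOs allowed to use this resource


@dataclass
class User:
    subject: str                                     # identity, e.g. a certificate subject
    vos: set = field(default_factory=set)            # VOs the user is a member of


def is_authorized(user: User, resource: Resource) -> bool:
    """Authorize if the user shares at least one VO with the resource (assumed rule)."""
    return bool(user.vos & resource.supported_vos)


if __name__ == "__main__":
    storage = Resource("storage-element-1", {"physics-vo"})
    alice = User("CN=Alice Example", {"physics-vo", "bio-vo"})
    bob = User("CN=Bob Example", {"bio-vo"})
    print(is_authorized(alice, storage))
    print(is_authorized(bob, storage))
```

Authentication (proving who the user is) happens before this step and is not shown here.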
Cloud Computing
Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
European Telecommunications Standards Institute (ETSI): http://www.etsi.org/website/document/tr_102997v010101p.pdf
The NIST Definition of Cloud Computing: http://www.mendeley.com/research/nist-definition-cloud-computing-v15/?mrr_wp=0.1
Cloud computing
Why only now?
• Broadband networks
• Fast penetration of virtualization technology for x86-based servers
– Virtual appliances
• Adoption of Software as a Service
– Salesforce.com
– Web 2.0 mindset
• General purpose on-line virtual machines that can do almost anything
Cloud computing
Main ingredients:
• Network
• Storage resources
• Computer resources
• Virtualization layer
• Provisioning, billing, accounting
Grid vs Cloud
Massive-scale resource sharing over the Internet sounds a lot like grid computing, yet the driving forces are different, hence the solutions are different too
Grid:
• Highly specialized resources that need to be shared by thousands [of researchers]
• Large data sets
• In many cases, providers are also consumers
• Driven by the need to increase performance (FLOPs)
Cloud:
• Reducing CAPEX, OPEX, time to market
• Millions of users that share to save, not for the sake of sharing
• Providers want market share and customer lock-in
• Driven by the need to reduce cost (€£¥$)
Grid computing is more a computing paradigm, while cloud computing is a business model
Grid & Cloud
You do not need a grid to have a cloud
Today a cluster with recent virtualization-enabled hardware is enough to start
Having a grid you can provide a cloud
Usually in a grid you have a lot of resources
Usually most of the resources in a working structure (research departments or business units) can be used to set up a cloud
By simply adding a virtualization hypervisor (Xen, KVM, VirtualBox, …) and a cloud environment (OpenNebula, Eucalyptus, Nimbus, …), the job is done
By adding storage virtualization and compute virtualization, you can handle provisioning
By improving accounting, you can provide billing
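The step from accounting to billing can be sketched as follows: accounting produces usage records, and billing applies a price list to them. This is a toy example under stated assumptions, not the accounting format of any grid or cloud system; the record layout, the metric names, and the prices in `RATE_CENTS` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical price list, in cents per unit, chosen only for illustration.
RATE_CENTS = {
    "cpu_hour": 5,          # one CPU core for one hour
    "gb_storage_hour": 1,   # one gigabyte stored for one hour
}


def bill(usage_records):
    """Turn accounting records into a bill per user.

    usage_records: iterable of (user, metric, quantity) tuples,
                   where metric is a key of RATE_CENTS.
    Returns a dict mapping each user to the amount due, in cents.
    """
    totals = defaultdict(int)
    for user, metric, quantity in usage_records:
        totals[user] += RATE_CENTS[metric] * quantity
    return dict(totals)


if __name__ == "__main__":
    records = [
        ("alice", "cpu_hour", 10),
        ("alice", "gb_storage_hour", 100),
        ("bob", "cpu_hour", 3),
    ]
    print(bill(records))
```

Amounts are kept as integer cents to avoid floating-point rounding in the totals.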
Grid & Cloud
This workshop explores three approaches to providing a cloud interface to grid resources:
1. Integrating a cloud environment into a grid middleware
2. Configuring the LRMS (local resource management system) and modifying part of the middleware to implement a cloud interface
3. Developing a different approach to a cloud environment, minimally invasive and easily interacting with clusters or grids