Enabling Technologies Behind Cloud Computing

Hargadon – Week 5
Cloud Computing – More Definitions:
Forrester Research:
“A pool of abstracted, highly scalable, and managed compute infrastructure capable of
hosting end-customer applications and billed by consumption.”
- http://www.slideshare.net/GoGrid/cloud-computing-defined-presentation (Slide 15/33)
“Cloud computing is an emerging approach to shared infrastructure in which large pools
of systems are linked together to provide IT services.”
- IBM press release on “Blue Cloud”
“a hosted infrastructure model that delivers abstracted IT resources over the Internet.”
- Thomas Weisel Partners LLC, from “Into the Clouds: Leveraging Data Centers
and the Road to Cloud Computing.”
“Cloud computing describes a systems architecture. Period. This particular architecture
assumes nothing about the physical location, internal composition or ownership of its
component parts.”
- James Urquhart blog post
GoGrid definition:
“Cloud computing is an internet infrastructure service where virtualized IT resources are
billed for on a variable usage basis and can be provisioned and consumed on-demand
using standard web-based or programmatic interfaces.”
- http://www.slideshare.net/GoGrid/cloud-computing-defined-presentation (16/33)
Cloud Computing Enabling Technologies
http://www.the451group.com/marketmonitor/mm_cloud_enabling/451_mm_cloud_enabling.php
“The following are some of the precursor technologies that enabled cloud computing as it
exists today:
• Inexpensive and plentiful storage and CPU bandwidth
• Sophisticated communication, security and syndication formats to communicate with applications, like HTTP, OpenID, Atom
• Established data formats and flexible protocols to enable message passing, like XML, JSON and REST
• Sophisticated client platforms, such as HTML, CSS, AJAX
• The right solution stacks, like .NET and LAMP
• SOA (service-oriented architectures) and SaaS
• Commercial virtualization
• Large datacenter infrastructure implementations from Microsoft, Yahoo, Amazon and others that provided real-world, massively scalable, distributed computing”
- http://www.azurepilot.com/page/Enabling+Technologies+of+Cloud+Computing
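To make the data-format items above concrete, here is a minimal sketch in Python (the "server" record is invented for illustration) showing the same piece of data serialized as JSON and as XML, the two message-passing formats named in the list:

import json
import xml.etree.ElementTree as ET

# A made-up record describing a virtual server (illustrative only).
server = {"name": "web-01", "cpus": 2, "region": "us-east"}

# JSON: the lightweight representation favoured by most REST-style cloud APIs.
print(json.dumps(server))
# {"name": "web-01", "cpus": 2, "region": "us-east"}

# XML: the older, schema-friendly representation used by SOAP and Atom.
root = ET.Element("server")
for key, value in server.items():
    ET.SubElement(root, key).text = str(value)
print(ET.tostring(root, encoding="unicode"))
# <server><name>web-01</name><cpus>2</cpus><region>us-east</region></server>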
Datacenter Design:
One of the newer designs being widely deployed is the containerized datacenter, which
involves the direct deployment of shipping containers packed with several thousand
servers each; when repairs or upgrades are needed, whole containers are replaced (rather
than repairing individual servers).
“As computation continues to move into the cloud, the computing platform of interest no
longer resembles a pizza box or a refrigerator, but a warehouse full of computers. These
new large datacenters are quite different from traditional hosting facilities of earlier times
and cannot be viewed simply as a collection of co-located servers. Large portions of the
hardware and software resources in these facilities must work in concert to efficiently
deliver good levels of Internet service performance, something that can only be achieved
by a holistic approach to their design and deployment. In other words, we must treat the
datacenter itself as one massive warehouse-scale computer (WSC). We describe the
architecture of WSCs, the main factors influencing their design, operation, and cost
structure, and the characteristics of their software base. We hope it will be useful to
architects and programmers of today's WSCs, as well as those of future many-core
platforms which may one day implement the equivalent of today's WSCs on a single
board.”
- http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
- http://www.azurepilot.com/page/Physical+Design+and+Architecture+of+the+%27Cloud%27+-+Datacenter+Design
Virtualization:
“A host computer runs an application known as a hypervisor; this creates one or more
virtual machines, which simulate real computers so faithfully that the simulations can run
any software, from operating systems to end-user applications. The software "thinks" it
has access to a processor, network, and disk drive, just as if it had a real computer all to
itself. The hypervisor retains ultimate control, however, and can pause, erase, or create
new virtual machines at any time. By providing multiple VMs at once, this approach
allows several operating systems to run simultaneously on a single physical machine.
Rather than paying for many under-utilized server machines, each dedicated to a specific
workload, server virtualization allows those workloads to be consolidated onto a smaller
number of more fully-used machines. Virtualization means that e-mail, Web, or file
servers (or anything else) can be conjured up as soon as they're needed; when the need is
gone, they can be wiped from existence, freeing the host computer to run a different
virtual machine for another user. Coupled with management software and vast data
centers, this technology allows cloud providers to reap massive economies of scale. And
it gives cloud users access to as much computing power as they want, whenever they
want it. Virtualization reduces IT costs, increases hardware utilization, optimizes
business and network infrastructure, and improves server availability.”
- http://www.azurepilot.com/page/Virtualization
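As a rough illustration of the hypervisor control described above, the following sketch uses the libvirt Python bindings to list the virtual machines on a host and shut one down. It assumes the libvirt-python package, a local QEMU/KVM hypervisor and a guest named "web-01"; the connection URI and the guest name are placeholders, not part of the quoted source.

import libvirt  # assumes the libvirt-python package and a running libvirt daemon

# Connect to a hypothetical local QEMU/KVM hypervisor (URI is an assumption).
conn = libvirt.open("qemu:///system")

# The hypervisor knows about every virtual machine ("domain") on the host.
for domain in conn.listAllDomains():
    state, _reason = domain.state()
    running = state == libvirt.VIR_DOMAIN_RUNNING
    print(domain.name(), "running" if running else "not running")

# The hypervisor retains ultimate control: it can pause, erase or create VMs.
victim = conn.lookupByName("web-01")  # "web-01" is a made-up guest name
victim.shutdown()                     # ask the guest OS to power off gracefully

conn.close()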
Cloud Protocols, Standards and Wire Formats
Cloud computing relies on users being able to reach across the Internet and access the compute and storage resources offered in the cloud. To enable this interaction, it depends on a number of open standards for communication and data representation.
Cloud computing most commonly exposes its resources as services over the Internet, so the HTTP and HTTPS protocols act as the backbone communication infrastructure supporting cloud computing. All the other standards for exposing compute and storage resources are therefore technologies that have evolved to interoperate closely with HTTP around issues like scalability and statelessness. In the following section, we
will look at the REST architecture for exposing services, the AtomPub protocol for
creating and updating web resources and the XML, Atom and JSON data representation
formats.
Pre-REST: Prior to REST, there was a long evolution of attempts to retrofit existing client/server inter-process communication (IPC) paradigms onto the web. Early in the Internet's history, circa 1995, the Common Gateway Interface (CGI) was created to handle computational and data-centric requests that did not always fit the hypertext-based document model, for example payment processing.
One of the most common distributed IPC paradigms that had taken hold in the client/server space was the Remote Procedure Call (RPC), with implementations such as Java RMI, DCOM and CORBA. Initial attempts to port this model to the web in a literal sense resulted in technologies like XML-RPC, which used XML to encode its calls and HTTP as a transport mechanism. HTTP was chosen as a framing protocol mostly because it was entrenched enough that people already had port 80 open in the firewall. XML was chosen because the bet was that XML was going to be the platform-neutral meta-schema for data.
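A minimal sketch of this XML-RPC style, using only Python's standard library; the port, the "payments.charge" method name and the payment example are invented for illustration. The client calls what looks like an ordinary function, and the library encodes the call as XML and sends it as an HTTP POST.

from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

# Server side: expose an ordinary function over HTTP (port chosen arbitrarily).
def charge(amount_cents, card_token):
    """A made-up payment-processing operation."""
    return {"status": "approved", "amount": amount_cents}

server = SimpleXMLRPCServer(("localhost", 8000), allow_none=True)
server.register_function(charge, "payments.charge")
# server.serve_forever()  # run the server in a separate process or thread

# Client side: the call looks local, but it is serialized to XML and POSTed.
proxy = ServerProxy("http://localhost:8000/")
result = proxy.payments.charge(1999, "tok_example")
print(result)  # {'status': 'approved', 'amount': 1999}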
As new functionality was introduced, the standard evolved in late 1999 into what is now SOAP, the Simple Object Access Protocol. SOAP defined a wire protocol that standardized how binary information within the computer's memory could be serialized to XML. Initially there were no discovery and description mechanisms either, but eventually the Web Service Description Language (WSDL) was settled on. WSDL enabled SOAP-based service designers to more fully describe the XML their service would accept, as well as an extensibility mechanism. Using this extensibility mechanism, the Web service community came up with a collection of extension specifications that together are referred to as the WS-* ("WS-star") specifications. These include WS-Security, WS-ReliableMessaging, WS-Addressing, and WS-Policy, to name just a few. These specifications all essentially added layered metadata to the SOAP payload by modifying the header while keeping the message body intact.
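To show what layered metadata in the SOAP header looks like on the wire, here is a hedged sketch that hand-builds a SOAP 1.1 envelope and POSTs it with Python's standard library. The endpoint URL, namespace, header block and operation are all invented for illustration; real services describe theirs in WSDL.

import urllib.request

# A hand-written SOAP 1.1 envelope. The Header carries layered metadata
# (here a made-up session token); the Body carries the actual call.
envelope = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header>
    <SessionToken xmlns="http://example.com/ws">abc123</SessionToken>
  </soap:Header>
  <soap:Body>
    <GetQuote xmlns="http://example.com/ws">
      <Symbol>MSFT</Symbol>
    </GetQuote>
  </soap:Body>
</soap:Envelope>"""

# Every SOAP call is an HTTP POST to a single service endpoint; the HTTP verb
# and URI carry no meaning of their own, unlike REST.
request = urllib.request.Request(
    "http://example.com/QuoteService",  # hypothetical endpoint
    data=envelope.encode("utf-8"),
    headers={"Content-Type": "text/xml; charset=utf-8",
             "SOAPAction": "http://example.com/ws/GetQuote"},
)
# response = urllib.request.urlopen(request)  # the reply is another SOAP envelope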
But there is a pattern here we should more closely examine. In the beginning we had
simple hypermedia documents. However, as soon as we added the capability to access
server-based application data processes, the basic architecture of our applications began
moving away from the pure hypermedia model. CGI introduced a "gateway" where none
had existed. SOAP introduced a novel use of XML, but this then required servers to
accept and handle SOAP requests with arbitrarily complex payloads or negotiate lowest
common denominators. Then came WSDL. With WSDL it became easier to modify the
client's SOAP request to match what the service provider required, but then the server
needed to formulate the WSDL, adding more processing requirements and moving the
data communication architecture even further away from a pure hypermedia basis. And
when the incredible complexity of the WS-* specifications is added, the entire data communication process incorporates processing and metadata transfer that would have been unforeseen only a few years earlier.
The gist of all of this is that somewhere along the way, all of us associated with designing and developing RPC-based Web services diverged from the basic architecture of the Internet. SOAP and WS-* started from examining various platform-specific solutions like DCOM and CORBA, and their ultimate goal was to build those same types of systems in a platform-neutral way. To accomplish this goal, they built on the substrate of some of the emerging Internet protocols of the time: HTTP and XML. All the advanced WS-* specifications (security, reliability, transactions) took the application-server features of the day and incorporated them into this platform-neutral protocol substrate.
REST: In 2000, which is about the time the SOAP protocol started to take root, Roy
Fielding, a co-designer of the HTTP specification, finished his doctoral dissertation, titled
"Architectural Styles and the Design of Network-based Software Architectures"
(www.ics.uci.edu/~fielding/pubs/dissertation/top.htm). His dissertation revolves around
the evaluation of network-based architectures, but his final two chapters introduce the
concepts of Representational State Transfer, or REST. REST essentially formalizes
services using concepts inherent in the Internet.
The principal concept in REST is the existence of resources (sources of specific
information), each of which is referenced with a global identifier (e.g., a URI in HTTP).
In order to manipulate these resources, components of the network (user agents and
origin servers) communicate via a standardized interface (e.g., HTTP) and exchange
representations of these resources (the actual documents conveying the information); a representation is typically an HTML, XML or JSON document of some kind, although it may be an image, plain text, or any other content.
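A brief sketch of this pattern with Python's standard library, against a hypothetical origin server and resource URIs: the URI names the resource, the HTTP verb says what to do with it, and a JSON document is the representation exchanged.

import json
import urllib.request

BASE = "http://example.com/api"  # hypothetical origin server

def exchange(method, uri, body=None):
    """Exchange a JSON representation of a resource over the uniform interface."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(uri, data=data, method=method,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        payload = resp.read()
        return json.loads(payload) if payload else None

# Each server is a resource named by a URI; the verbs are the whole interface.
# exchange("GET",    BASE + "/servers/42")                   # read a representation
# exchange("PUT",    BASE + "/servers/42", {"cpus": 4})      # replace the resource
# exchange("POST",   BASE + "/servers", {"name": "web-02"})  # create a new resource
# exchange("DELETE", BASE + "/servers/42")                   # remove it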
Atom / AtomPub: The name Atom applies to a pair of related standards. The Atom
Syndication Format is an XML language used for web feeds, while the Atom Publishing
Protocol (AtomPub or APP) is a simple HTTP-based protocol for creating and updating
web resources. The article “Atom Publishing Protocol” by James Snell introduces AtomPub and the Atom Syndication Format in detail, with worked curl examples:
http://www-128.ibm.com/developerworks/library/x-atompp1/
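The Snell article demonstrates AtomPub with curl; the sketch below does the equivalent with Python's standard library. The collection URI and entry content are made up: a new member is created by POSTing an Atom entry document to a collection, and the server answers 201 Created with the new member's URI in the Location header.

import urllib.request

# An Atom entry document: the representation AtomPub uses for a new member.
entry = """<?xml version="1.0" encoding="utf-8"?>
<entry xmlns="http://www.w3.org/2005/Atom">
  <title>First post</title>
  <author><name>jdoe</name></author>
  <content type="text">Hello from AtomPub.</content>
</entry>"""

# AtomPub creates resources by POSTing an entry to a collection URI.
request = urllib.request.Request(
    "http://example.com/blog/entries",  # hypothetical collection URI
    data=entry.encode("utf-8"),
    headers={"Content-Type": "application/atom+xml;type=entry"},
)
# with urllib.request.urlopen(request) as resp:
#     print(resp.status, resp.headers["Location"])  # e.g. 201 and the member URI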
ADO.NET Data Services: The goal of the ADO.NET Data Services framework is to
facilitate the creation of flexible data services that are naturally integrated with the web.
As such, ADO.NET Data Services uses URIs to point to pieces of data and simple, well-known formats to represent that data, such as JSON and Atom (an XML-based feed format). This results in the data service being surfaced as a REST-style resource
collection that is addressable with URIs and that agents can interact with using standard
HTTP verbs such as GET, POST, PUT or DELETE. In order for the system to understand
and leverage semantics over the data that it is surfacing, ADO.NET Data Services models
the data exposed through the data service using a model called the Entity Data Model
(EDM), an Entity-Relationship derivative. This organizes the data in the form of
instances of "entity types", or "entities", and the associations between them. For relational
data, ADO.NET Data Services supports exposing an EDM model created using the
ADO.NET Entity Framework. For all other (i.e., non-relational) data sources, or to use additional database access technologies (e.g., LINQ to SQL), a mechanism is provided that enables any data source to be modeled as entities and associations (i.e., described using an EDM schema) and exposed as a data service.
Many of the Microsoft cloud data services (Windows Azure tables, SQL Azure Data
Services, etc.) expose data using the same REST interaction conventions followed by
ADO.NET Data Services. This enables using the ADO.NET Data Services client libraries
and developer tools when working with hosted cloud services.
- http://www.azurepilot.com/page/Cloud+Protocols%2C+Standards+and+Wire+Formats
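Because the data service is surfaced as a REST-style resource collection, any HTTP client can consume it. The sketch below (Python for consistency with the earlier examples; the service root, entity set and query option are hypothetical) requests JSON representations using the addressing conventions described above.

import urllib.request

SERVICE = "http://example.com/northwind.svc"  # hypothetical data service root

def fetch_json(uri):
    """GET a URI and ask the data service for a JSON representation."""
    req = urllib.request.Request(uri, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")

# Entities and entity sets are plain addressable resources:
# fetch_json(SERVICE + "/Customers")           # the whole entity set
# fetch_json(SERVICE + "/Customers('ALFKI')")  # a single entity by key
# Query options such as $filter refine the addressed data; they must be
# URL-encoded before the request is sent.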
Cloud computing has emerged from existing parallel processing, distributed computing and grid computing technologies. It encompasses both the applications that data centres deliver as services over the Internet to their users and the hardware and system software used to provide those services; the term "cloud" refers to that underlying data-centre hardware and software. Although cloud computing has only recently emerged and gained popularity, one of its essential features is that nodes are logically separated so that each behaves like a distinct machine, which also makes the virtualization technology easier and simpler for users to work with. Above all, cloud computing overcomes a limitation of grid computing by connecting many separate computers into one large logical computer able to carry out substantial computations and handle huge amounts of data, while still treating each node as a separate machine, so that users can load an operating system and software on each node, and configure it, according to that node's own specifications.
Other cloud-related technologies from which cloud computing has evolved:
Grid computing
An extension of distributed and parallel computing in which a virtual supercomputer, made up of a number of networked, loosely coupled computers, acts as a single machine to perform very large tasks.
Utility computing
The packaging of computing resources as a metered service, in the same way as electricity from a traditional public utility.
Autonomic computing
Systems that are capable of self-management.
- http://www.cloudcomputingtechnology.org/