Hargadon – Week 5 Cloud Computing

More Definitions:

Forrester Research: “A pool of abstracted, highly scalable, and managed compute infrastructure capable of hosting end-customer applications and billed by consumption.” - http://www.slideshare.net/GoGrid/cloud-computing-defined-presentation (Slide 15/33)

“Cloud computing is an emerging approach to shared infrastructure in which large pools of systems are linked together to provide IT services.” - IBM press release on “Blue Cloud”

“a hosted infrastructure model that delivers abstracted IT resources over the Internet.” - Thomas Weisel Partners LLC, from “Into the Clouds: Leveraging Data Centers and the Road to Cloud Computing”

“Cloud computing describes a systems architecture. Period. This particular architecture assumes nothing about the physical location, internal composition or ownership of its component parts.” - James Urquhart blog post

GoGrid definition: “Cloud computing is an internet infrastructure service where virtualized IT resources are billed for on a variable usage basis and can be provisioned and consumed on-demand using standard web-based or programmatic interfaces.” - http://www.slideshare.net/GoGrid/cloud-computing-defined-presentation (Slide 16/33)

Cloud Computing Enabling Techs
http://www.the451group.com/marketmonitor/mm_cloud_enabling/451_mm_cloud_enabling.php

“The following are some of the precursor technologies that enabled cloud computing as it exists today:
Inexpensive and plentiful storage and CPU bandwidth
Sophisticated communication, security and syndication formats to communicate with applications like HTTP, OpenID, Atom
Established data formats and flexible protocols to enable message passing like XML, JSON and REST
Sophisticated client platforms, such as HTML, CSS, AJAX
The right solution stacks like .NET and LAMP
SOA (service-oriented architectures) and SaaS
Commercial virtualization
Large datacenter infrastructure implementations from Microsoft, Yahoo, Amazon and others that provided real-world, massively scalable, distributed computing”
- http://www.azurepilot.com/page/Enabling+Technologies+of+Cloud+Computing

Datacenter Design:

One of the newer designs being widely deployed is the containerized datacenter, which involves the direct deployment of shipping containers packed with several thousand servers each; when repairs or upgrades are needed, whole containers are replaced rather than individual servers being repaired.

“As computation continues to move into the cloud, the computing platform of interest no longer resembles a pizza box or a refrigerator, but a warehouse full of computers. These new large datacenters are quite different from traditional hosting facilities of earlier times and cannot be viewed simply as a collection of co-located servers. Large portions of the hardware and software resources in these facilities must work in concert to efficiently deliver good levels of Internet service performance, something that can only be achieved by a holistic approach to their design and deployment. In other words, we must treat the datacenter itself as one massive warehouse-scale computer (WSC). We describe the architecture of WSCs, the main factors influencing their design, operation, and cost structure, and the characteristics of their software base.
We hope it will be useful to architects and programmers of today's WSCs, as well as those of future many-core platforms which may one day implement the equivalent of today's WSCs on a single board.”
- http://www.morganclaypool.com/doi/pdf/10.2200/S00193ED1V01Y200905CAC006
- http://www.azurepilot.com/page/Physical+Design+and+Architecture+of+the+%27Cloud%27+-+Datacenter+Design

Virtualization:

“A host computer runs an application known as a hypervisor; this creates one or more virtual machines, which simulate real computers so faithfully that the simulations can run any software, from operating systems to end-user applications. The software "thinks" it has access to a processor, network, and disk drive, just as if it had a real computer all to itself. The hypervisor retains ultimate control, however, and can pause, erase, or create new virtual machines at any time. By providing multiple VMs at once, this approach allows several operating systems to run simultaneously on a single physical machine. Rather than paying for many under-utilized server machines, each dedicated to a specific workload, server virtualization allows those workloads to be consolidated onto a smaller number of more fully-used machines. Virtualization means that e-mail, Web, or file servers (or anything else) can be conjured up as soon as they're needed; when the need is gone, they can be wiped from existence, freeing the host computer to run a different virtual machine for another user. Coupled with management software and vast data centers, this technology allows cloud providers to reap massive economies of scale. And it gives cloud users access to as much computing power as they want, whenever they want it. Virtualization reduces IT costs, increases hardware utilization, optimizes business and network infrastructure, and improves server availability.”
- http://www.azurepilot.com/page/Virtualization

Cloud Protocols, Standards and Wire Formats

Cloud computing relies on the ability of users to reach across the internet and access the compute and storage resources offered in the cloud. To enable this interaction, it relies on a number of open standards in areas like communication and data representation. Cloud computing most commonly exposes its resources as a service over the Internet, so the HTTP and HTTPS protocols act as the backbone communication infrastructure supporting cloud computing. The other standards for exposing compute and storage resources are therefore technologies that have evolved to interoperate closely with HTTP around issues like scalability and statelessness. In the following section, we will look at the REST architecture for exposing services, the AtomPub protocol for creating and updating web resources, and the XML, Atom and JSON data representation formats.

Pre-REST:

Prior to REST, there was a long evolution of attempts to retrofit existing client/server inter-process communication (IPC) paradigms onto the web. Early in the Internet's history, circa 1995, the Common Gateway Interface (CGI) was created to handle computational and data-centric requests that didn't always fit into the hypertext-based document model (for example, payment processing). One of the most common distributed IPC paradigms that had taken hold in the client-server space was Remote Procedure Call (RPC), with implementations such as Java RMI, DCOM and CORBA. Initial attempts to port this model to the web in a literal sense resulted in technologies like XML-RPC, which used XML to encode its calls and HTTP as the transport mechanism. HTTP was chosen as a framing protocol mostly because it was entrenched enough that people already had port 80 open in their firewalls; XML was chosen because the bet was that XML would become the platform-neutral meta-schema for data.
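The flavor of this RPC-over-HTTP style is easy to see from the client's side: a remote method is invoked as if it were local, and the library serializes the call to an XML document and POSTs it over HTTP. The following is only a sketch, using Python's standard xmlrpc.client module; the endpoint URL and the payments.charge method name are hypothetical, chosen purely for illustration.

    # Minimal XML-RPC client sketch; the endpoint and remote method name
    # below are assumed for illustration only.
    import xmlrpc.client

    proxy = xmlrpc.client.ServerProxy("http://example.com/rpc")
    # The library encodes this call as an XML <methodCall> document and
    # sends it in the body of an HTTP POST request.
    result = proxy.payments.charge("order-42", 19.99)
    print(result)

From the caller's point of view this looks like an ordinary local method call, which is exactly the client/server RPC mindset carried over to the web.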
As new functionality was introduced, XML-RPC evolved in late 1999 into what is now SOAP, the Simple Object Access Protocol. SOAP defined a wire protocol that standardized how binary information within a computer's memory could be serialized to XML. Initially there were no discovery or description mechanisms, but eventually the Web Services Description Language (WSDL) was settled on. WSDL enabled SOAP-based service designers to describe more fully the XML their service would accept, and it also provided an extensibility mechanism. Using this extensibility mechanism, the Web service community came up with a collection of extension specifications that together are referred to as the WS-* ("WS-star") specifications. These include WS-Security, WS-ReliableMessaging, WS-Addressing, and WS-Policy, to name just a few. These specifications all essentially added layered metadata to the SOAP payload by modifying the header while keeping the message body intact.

But there is a pattern here we should examine more closely. In the beginning we had simple hypermedia documents. However, as soon as we added the capability to access server-based application data and processes, the basic architecture of our applications began moving away from the pure hypermedia model. CGI introduced a "gateway" where none had existed. SOAP introduced a novel use of XML, but it then required servers to accept and handle SOAP requests with arbitrarily complex payloads or to negotiate lowest common denominators. Then came WSDL. With WSDL it became easier to shape the client's SOAP request to match what the service provider required, but the server then needed to formulate the WSDL, adding more processing requirements and moving the data communication architecture even further from a pure hypermedia basis. And when the incredible complexity of the WS-* specifications is added, the entire data communication process incorporates processing and metadata transfer completely unforeseen only a few years earlier.

The gist of all of this is that somewhere along the way, those designing and developing RPC-based Web services departed from the basic architecture of the Internet. SOAP and WS-* started by examining various platform-specific solutions like DCOM and CORBA, and their ultimate goal was to build those same types of systems in a platform-neutral way. To accomplish this goal, they built on the substrate of some of the emerging Internet protocols of the time: HTTP and XML. The advanced WS-* specifications (security, reliability, transactions) took application-server features du jour and incorporated them into this platform-neutral protocol substrate.

REST:

In 2000, about the time the SOAP protocol started to take root, Roy Fielding, a co-designer of the HTTP specification, finished his doctoral dissertation, titled "Architectural Styles and the Design of Network-based Software Architectures" (www.ics.uci.edu/~fielding/pubs/dissertation/top.htm). His dissertation revolves around the evaluation of network-based architectures, but its final two chapters introduce the concepts of Representational State Transfer, or REST. REST essentially formalizes services using concepts inherent in the Internet. The principal concept in REST is the existence of resources (sources of specific information), each of which is referenced with a global identifier (e.g., a URI in HTTP). In order to manipulate these resources, components of the network (user agents and origin servers) communicate via a standardized interface (e.g., HTTP) and exchange representations of these resources (the actual documents conveying the information): typically an HTML, XML or JSON document of some kind, although a representation may be an image, plain text, or any other content.
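In practice, a RESTful interaction is just standard HTTP: a resource is addressed by a URI, a verb states the intent, and a representation travels in the message body. The sketch below uses only Python's standard library; the resource URIs and the JSON field are hypothetical, chosen purely for illustration.

    # REST sketch: manipulate a hypothetical "order" resource via HTTP verbs.
    import json
    import urllib.request

    base = "http://api.example.com/orders"   # assumed resource collection URI

    # GET retrieves a representation (here, a JSON document) of /orders/42.
    with urllib.request.urlopen(base + "/42") as resp:
        order = json.loads(resp.read())

    # PUT replaces the resource's state with a new representation.
    body = json.dumps({"status": "shipped"}).encode("utf-8")
    req = urllib.request.Request(base + "/42", data=body, method="PUT",
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)

Note that the client needs no service-specific description language or envelope format; the uniform interface (URIs, HTTP verbs, content types) carries all the information required.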
Atom / AtomPub:

The name Atom applies to a pair of related standards. The Atom Syndication Format is an XML language used for web feeds, while the Atom Publishing Protocol (AtomPub or APP) is a simple HTTP-based protocol for creating and updating web resources. The article "Atom Publishing Protocol" by James Snell introduces AtomPub, the Atom Syndication Format and example curl commands in detail: http://www-128.ibm.com/developerworks/library/x-atompp1/

ADO.NET Data Services:

The goal of the ADO.NET Data Services framework is to facilitate the creation of flexible data services that are naturally integrated with the web. As such, ADO.NET Data Services uses URIs to point to pieces of data and simple, well-known formats to represent that data, such as JSON and Atom (an XML-based feed format). This results in the data service being surfaced as a REST-style resource collection that is addressable with URIs and that agents can interact with using standard HTTP verbs such as GET, POST, PUT or DELETE. In order for the system to understand and leverage semantics over the data it is surfacing, ADO.NET Data Services models the data exposed through the data service using the Entity Data Model (EDM), an Entity-Relationship derivative. This organizes the data in the form of instances of "entity types", or "entities", and the associations between them. For relational data, ADO.NET Data Services supports exposing an EDM model created using the ADO.NET Entity Framework. For all other (i.e., non-relational) data sources, or to use additional database access technologies (e.g., LINQ to SQL), a mechanism is provided that enables any data source to be modeled as entities and associations (i.e., described using an EDM schema) and exposed as a data service. Many of the Microsoft cloud data services (Windows Azure tables, SQL Azure Data Services, etc.) expose data using the same REST interaction conventions followed by ADO.NET Data Services. This enables using the ADO.NET Data Services client libraries and developer tools when working with hosted cloud services.
- http://www.azurepilot.com/page/Cloud+Protocols%2C+Standards+and+Wire+Formats
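Because such a service exposes its entities through plain REST conventions, any HTTP client can consume it. The sketch below, again using only Python's standard library, addresses a single entity by URI and asks for a JSON representation instead of the default Atom feed; the service URL, entity set and key are hypothetical, modeled loosely on the familiar Northwind sample.

    # Sketch of reading one entity from an ADO.NET Data Services-style endpoint.
    import json
    import urllib.request

    url = "http://example.com/Northwind.svc/Customers('ALFKI')"  # assumed service URI
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req) as resp:
        customer = json.loads(resp.read())   # JSON representation of the entity
    print(customer)

The same URI answered with an Atom Accept header would return an Atom entry instead, which is why the AtomPub machinery described above matters to these data services.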
Cloud computing has emerged from existing parallel processing, distributed computing and grid computing technologies. It encompasses both the applications delivered over the internet as services to users and the hardware and system software in the data centres used to provide those services; the term "cloud" refers to that data centre hardware and software. Cloud computing has emerged only recently as a distinct model and is quickly gaining popularity.

An essential feature of cloud computing is that its nodes are partitioned logically so that each node behaves like a separate machine, which also makes the underlying virtualization technology easier for users to work with. Beyond that, cloud computing overcomes a limitation of grid computing by connecting many separate computers into one big logical computer able to run large computations and handle huge amounts of data. Because each node appears as a separate machine, users gain the additional advantage of loading an operating system and software onto each node, and configuring each node, individually according to its requirements.

Other cloud-related technologies from which cloud computing has evolved:

Grid computing: an extension of distributed and parallel computing in which a virtual supercomputer is composed of a number of networked, loosely coupled computers that act together to perform very large tasks.

Utility computing: the packaging of computing resources as a metered service, in the same way as electricity from a traditional public utility.

Autonomic computing: systems that are capable of self-management.

- http://www.cloudcomputingtechnology.org/
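The "metered service" idea behind utility computing is exactly the consumption-based billing that appears in the definitions at the top of these notes. A toy sketch of such a bill, with entirely hypothetical unit rates and usage figures:

    # Toy utility-computing bill: charge only for what was consumed.
    # All rates and usage numbers below are made up for illustration.
    RATE_PER_INSTANCE_HOUR = 0.10   # dollars per virtual machine hour
    RATE_PER_GB_MONTH = 0.15        # dollars per GB of storage per month

    instance_hours = 3 * 24 * 10    # 3 VMs running for 10 days
    storage_gb_months = 50          # 50 GB stored for one month

    bill = (instance_hours * RATE_PER_INSTANCE_HOUR
            + storage_gb_months * RATE_PER_GB_MONTH)
    print(f"Monthly charge: ${bill:.2f}")   # 720 * 0.10 + 50 * 0.15 = 79.50

No capacity is bought up front; when the virtual machines are released, the meter simply stops.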