Cloud Computing
Keith Joshua Massi

What is Cloud?
• The cloud is the combination of software and hardware units, including data storage units, located on centralized servers and accessible to customers through the Internet from anywhere.
• The major requirement is a fast, high-bandwidth Internet connection.

Definition of Cloud Computing
• Cloud computing is an Internet-based method of computing in which the end user accesses data servers on a paid basis, while virtual shared servers provision the infrastructure, software applications, platform, storage, and other resources.
• Cloud computing is defined as storing and accessing data and computing services over the Internet, rather than on your personal computer. It is the on-demand availability of computer services such as servers, data storage, networking, and databases.

Origins of Cloud Computing
• The concept of cloud computing dates to the 1950s, with the implementation of mainframe computers accessible via thin/static clients.
• In 1999, Salesforce.com became the first company to enter the cloud arena, pioneering the delivery of enterprise-level applications to end users through the Internet.
• Then in 2002, Amazon launched Amazon Web Services, providing services such as computation, storage, and even human intelligence. In 2009, Google Apps and Microsoft's Windows Azure also began to provide cloud computing enterprise applications.

Cloud Computing Architecture
• The architecture of cloud computing contains many different components.
• It includes client infrastructure, applications, services, runtime clouds, storage, management, and security.

Front End
• The client uses the front end, which contains a client-side interface and application. Both components are needed to access the cloud computing platform.
• The front end includes web browsers (such as Chrome, Firefox, and Opera), thin clients, and mobile devices.

Back End
• The back end manages all the resources needed to provide cloud computing services. This part of the architecture includes security mechanisms, large-scale data storage, servers, virtual machines, traffic control mechanisms, etc.

Cloud Computing Architecture Diagram

Important Components of Cloud Computing Architecture
1. Client Infrastructure
2. Application
3. Service
4. Runtime Cloud
5. Storage
6. Infrastructure
7. Management
8. Security
9. Internet

Characteristics of Cloud Computing
1. On-demand self-service
2. Broad network access
3. Multi-tenancy and resource pooling
4. Rapid elasticity and scalability
5. Measured service

• On-demand self-service: cloud computing services do not require human administrators; users themselves can provision, monitor, and manage computing resources as needed.
• Broad network access: computing services are generally provided over standard networks to heterogeneous devices.
• Rapid elasticity: IT resources can scale out and in quickly, on an as-needed basis. Resources are provided whenever the user requires them and are scaled back as soon as the requirement ends.
• Resource pooling: IT resources (e.g., networks, servers, storage, applications, and services) are shared across multiple applications and tenants in an uncommitted manner. Multiple clients are served from the same physical resources.
• Measured service: resource utilization is tracked per application and tenant, providing both the user and the resource provider with an account of what has been used. This supports monitoring, billing, and effective use of resources.

What is a cloud service provider?
• A cloud service provider (CSP) is a company that offers components of cloud computing -- typically infrastructure as a service (IaaS), software as a service (SaaS), or platform as a service (PaaS).
• Cloud service providers use their own data centers and compute resources to host cloud-based infrastructure and platform services for customer organizations.
• Cloud services are typically priced using pay-as-you-go subscription models.

What are the benefits and challenges of using a cloud service provider?
Benefits
• Cost and flexibility
• Scalability
• Mobility
• Disaster recovery
Challenges
• Hidden costs
• Cloud migration
• Cloud security
• Performance and outages
• Complicated contract terms
• Vendor lock-in

Types of cloud service providers
• IaaS providers. In the IaaS model, the cloud service provider delivers infrastructure components that would otherwise exist in an on-premises data center.
• SaaS providers. SaaS vendors offer a variety of business technologies, such as productivity suites, customer relationship management (CRM) software, human resources management (HRM) software, and data management software, all of which the SaaS vendor hosts and delivers over the Internet.
• PaaS providers. The third type of cloud service provider, PaaS vendors, offers cloud infrastructure and services that users can access to perform various functions.

Organizational Scenarios of Clouds
• End user to cloud: end-user data is managed in the cloud, and the end user need not keep track of anything other than a password. Example: email applications.
• Enterprise to cloud to end user: when the end user interacts with the enterprise, the enterprise accesses data from the cloud, manipulates it, and sends it to the end user.
• Enterprise to cloud: an enterprise uses cloud services for its internal processes.
• Enterprise to cloud to enterprise: two enterprises use the same cloud.

What is cloud monitoring?
• Cloud monitoring is a method of reviewing, observing, and managing the operational workflow in a cloud-based IT infrastructure.
• Manual or automated management techniques confirm the availability and performance of websites, servers, applications, and other cloud infrastructure.
• Continuous evaluation of resource levels, server response times, and speed helps predict possible vulnerabilities and future issues before they arise.

Types of cloud monitoring
• Database monitoring: because most cloud applications rely on databases, this technique reviews processes, queries, availability, and consumption of cloud database resources. It can also track queries and data integrity, monitoring connections to show real-time usage data. For security purposes, access requests can be tracked as well. For example, an uptime detector can alert on database instability and help improve resolution response time from the precise moment a database goes down.
• Website monitoring: a website is a set of files stored on a host, which serves those files to other computers over a network. This technique tracks processes, traffic, availability, and resource utilization of cloud-hosted sites.
• Virtual network monitoring: this type monitors software versions of network technology such as firewalls, routers, and load balancers. Because they are implemented in software, these integrated tools can provide a wealth of data about their operation. If one virtual router is persistently overwhelmed with traffic, for example, the network adjusts to compensate. Instead of swapping hardware, the virtualization infrastructure quickly adjusts to optimize the flow of data.

Continued…
• Cloud storage monitoring: this technique tracks multiple analytics simultaneously, monitoring storage resources and processes that are provisioned to virtual machines, services, databases, and applications.
This technique is often used to host infrastructure-as-a-service (IaaS) and software-as-a-service (SaaS) solutions. For these applications, you can configure monitoring to track performance metrics, processes, users, databases, and available storage. It provides data to help you focus on useful features or fix bugs that disrupt functionality.
• Virtual machine monitoring: this technique monitors a simulation of a computer within a computer, that is, the virtualization infrastructure and its virtual machines. It is usually scaled out in IaaS as a virtual server that hosts several virtual desktops. A monitoring application can track the users, traffic, and status of each machine. You get the benefits of traditional IT infrastructure monitoring with the added benefit of cloud monitoring solutions.

Benefits of Cloud Computing
• Reduced IT costs
• Scalability
• Simplicity
• Vendors
• Security

Limitations
• Sensitive information
• Applications not ready for deployment
• Developing your own applications

Security Concerns
• Privacy concerns with a third party
• Security level of the third party

Security Benefits
• Centralized data
• Reduced data loss
• Monitoring

Regulatory Issues
• No existing regulation

Infrastructure as a Service: Amazon EC2
• "Amazon Elastic Compute Cloud (Amazon EC2) is an Amazon Web Service (AWS) you can use to access servers, software, and storage resources across the Internet in a self-service manner."
• Provides scalable, pay-as-you-go compute capacity
• Elastic: scales in both directions

Platform as a Service: Google App Engine
• The Google cloud, called Google App Engine, is a platform-as-a-service (PaaS) offering. In contrast with the Amazon infrastructure-as-a-service cloud, where users explicitly provision virtual machines and control them fully (including installing, compiling, and running software on them), a PaaS offering hides the actual execution environment from users.

Microsoft Azure
• Like Google App Engine, Azure is a PaaS offering.
Developers create applications using Microsoft development tools (e.g., Visual Studio with its supported languages: C#, Visual Basic, ASP.NET, etc.); an Azure extension to the standard toolset allows such applications to be developed and deployed on Microsoft's cloud, in much the same manner as developing and deploying with Google App Engine's SDK.

Cloud Technologies
• A few technologies have been crucial in enabling the development and use of cloud platforms. Web services allow applications to communicate easily over the Internet, and composite applications are easily assembled from distributed web-based components using 'mashups.'
• If there is one technology that has contributed the most to cloud computing, it is virtualization.

Web Services
• Web services are applications that allow communication between devices over the Internet. They are usually independent of the technology or language the devices are built on, since they use the standardized eXtensible Markup Language (XML) for information exchange.
• A client or user invokes a web service by sending an XML request message and in turn receives an XML response message.
• There are a number of communication protocols for web services that use the XML format, such as Web Services Flow Language (WSFL) and Blocks Extensible Exchange Protocol (BEEP), among others.
• Simple Object Access Protocol (SOAP) and Representational State Transfer (REST) are by far the most used options for accessing web services. They are not directly comparable, however: SOAP is a communications protocol, while REST is a set of architectural principles for data transmission.

SOAP
• SOAP is a messaging protocol for exchanging information between two computers, based on XML, over the Internet.
• SOAP messages are written purely in XML, which is why they are platform and language independent.
• SOAP is a protocol that was designed before REST came into the picture.
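To make the XML messaging above concrete, here is a minimal sketch in Python of building a SOAP 1.1 envelope. The operation name `GetStockPrice` and its `Symbol` parameter are hypothetical examples, not part of any real service.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

def build_soap_request(operation: str, params: dict) -> str:
    """Build a minimal SOAP 1.1 envelope wrapping one operation call."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, operation)        # the remote operation to invoke
    for name, value in params.items():         # parameters as child elements
        ET.SubElement(op, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")

# Hypothetical request: ask a stock service for one symbol's price
print(build_soap_request("GetStockPrice", {"Symbol": "AAPL"}))
```

The entire request is plain XML, which is why SOAP clients and servers can be written in any language on any platform.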
• The main idea behind designing SOAP was to ensure that programs built on different platforms and in different programming languages could exchange data easily.
• SOAP stands for Simple Object Access Protocol.

REST
• REST is a web standard architecture that achieves data communication using a standard interface such as HTTP (or other transfer protocols), addressing resources with standard Uniform Resource Identifiers (URIs).
• REST was designed specifically for working with components such as media components, files, or even objects on a particular hardware device.
• Any web service defined on the principles of REST can be called a RESTful web service.
• A RESTful service uses the normal HTTP verbs GET, POST, PUT, and DELETE for working with the required components.
• REST stands for Representational State Transfer.

Difference Between SOAP and REST
• SOAP stands for Simple Object Access Protocol; REST stands for Representational State Transfer.
• SOAP is a protocol designed with a specification. It includes a WSDL file that describes what the web service does, in addition to the location of the web service. REST is an architectural style: a web service can only be treated as RESTful if it follows the constraints of being (1) client-server, (2) stateless, (3) cacheable, (4) a layered system, and (5) having a uniform interface.
• SOAP cannot make use of REST, since SOAP is a protocol and REST is an architectural pattern. REST can make use of SOAP as the underlying protocol for web services, because in the end REST is just an architectural pattern.
• SOAP uses service interfaces to expose its functionality to client applications; the WSDL file provides the client with the information needed to understand what services the web service can offer. REST uses uniform resource locators (URLs) to access the components on the hardware device.
• SOAP requires more bandwidth: SOAP messages contain a lot of information, so the amount of data transferred is generally large. REST does not need much bandwidth when requests are sent to the server; REST messages mostly consist of JSON.

Ajax: Asynchronous 'Rich' Interfaces
• Ajax is an acronym for Asynchronous JavaScript and XML. It is used to communicate with the server without refreshing the web page, thus improving user experience and performance.
• Ajax uses XHTML for content and CSS for presentation, along with the Document Object Model and JavaScript for dynamic content display.
• Conventional web applications transmit information to and from the server using synchronous requests: you fill out a form, hit submit, and get directed to a new page with new information from the server.
• With Ajax, when you hit submit, JavaScript makes a request to the server, interprets the results, and updates the current screen. In the purest sense, the user would never know that anything was even transmitted to the server.

Continued…
• XML is commonly used as the format for receiving server data, although any format, including plain text, can be used.
• Ajax is a web browser technology independent of web server software.
• A user can continue to use the application while the client program requests information from the server in the background.
• Interaction is intuitive and natural: clicking is not required, and mouse movement is a sufficient event trigger.
• Data-driven as opposed to page-driven.

Mashups: User Interface Services
• A mashup is a technique by which a website or web application uses data, presentation, or functionality from two or more sources to create a new service.

Virtualization Technology
• In computing, virtualization (or virtualisation) is the act of creating a virtual version of something, including virtual computer hardware platforms, storage devices, and computer network resources.
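Returning to the REST discussion above, the four HTTP verbs map naturally onto create, read, replace, and delete operations on a resource. A minimal in-memory sketch in Python follows; the class, method names, and status-code handling are illustrative, not a real web framework.

```python
class RestResource:
    """In-memory illustration of RESTful verb semantics on one collection."""

    def __init__(self):
        self._items = {}      # id -> stored representation
        self._next_id = 1

    def post(self, data):     # POST: create a new item, return 201 Created
        item_id = self._next_id
        self._next_id += 1
        self._items[item_id] = data
        return 201, {"id": item_id, **data}

    def get(self, item_id):   # GET: read an item, 200 OK or 404 Not Found
        if item_id in self._items:
            return 200, self._items[item_id]
        return 404, None

    def put(self, item_id, data):   # PUT: replace the item's representation
        self._items[item_id] = data
        return 200, data

    def delete(self, item_id):      # DELETE: remove, 204 No Content on success
        if self._items.pop(item_id, None) is not None:
            return 204, None
        return 404, None

# Usage: each call carries everything needed -- the service keeps no
# per-client session state, matching the "stateless" REST constraint.
store = RestResource()
print(store.post({"name": "vm-1"}))
print(store.get(1))
```

Note that the stateless constraint applies to client session state, not to the resources themselves: the server stores the items, but no conversation state between requests.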
• Virtualization relies on software to simulate hardware functionality and create a virtual computer system. This enables IT organizations to run more than one virtual system (and multiple operating systems and applications) on a single server. The resulting benefits include economies of scale and greater efficiency.

Continued…
• A virtual computer system is known as a "virtual machine" (VM): a tightly isolated software container with an operating system and an application inside. Each self-contained VM is completely independent. Putting multiple VMs on a single computer enables several operating systems and applications to run on just one physical server, or "host."
• A thin layer of software called a "hypervisor" decouples the virtual machines from the host and dynamically allocates computing resources to each virtual machine as needed.

Types of Virtualization: Applications in Enterprises
• Server virtualization: a kind of virtualization in which server resources are masked. The central (physical) server is divided into multiple virtual servers by changing the identity number and processors, so each system can run its own operating system in an isolated manner, while each sub-server knows the identity of the central server. This increases performance and reduces operating cost by deploying main server resources as sub-server resources. It is beneficial for virtual migration, reduced energy consumption, reduced infrastructure cost, etc.
• Network virtualization: the ability to run multiple virtual networks, each with a separate control and data plane, coexisting on top of one physical network. The networks can be managed by individual parties that are potentially confidential to each other. Network virtualization makes it possible to create and provision virtual networks (logical switches, routers, firewalls, load balancers, virtual private networks (VPNs), and workload security) within days or even weeks.
Continued…
• Desktop virtualization: allows the user's OS to be stored remotely on a server in the data centre, so users can access their desktops virtually, from any location, on a different machine. Users who want a specific operating system other than Windows Server will need a virtual desktop. The main benefits of desktop virtualization are user mobility, portability, and easy management of software installation, updates, and patches.

Multi-Tenant Software
• In multi-tenant software architecture (also called software multitenancy), a single instance of a software application, along with its underlying database and hardware, serves multiple tenants (or user accounts).
• A tenant can be an individual user but is more frequently a group of users, such as a customer organization, that shares common access to and privileges within the application instance.
• Each tenant's data is isolated from, and invisible to, the other tenants sharing the application instance, ensuring data security and privacy for all tenants.

Benefits of multi-tenant architecture
• Lower costs: because the software provider can serve multiple tenants from a single application instance and supporting infrastructure (and because tenants share the burden of software maintenance, infrastructure, and data center operations), ongoing costs tend to be lower than in a single-tenant arrangement. SaaS software is typically offered at a predictable monthly or annual subscription price based on the number of users, usage level, or data volumes managed within the application.
• Scalability: tenants can scale on demand; new users get access to the same instance of the software, typically for an incremental subscription rate increase.
• Customization without coding: SaaS multi-tenant offerings are highly configurable, so each tenant can tailor the application to its specific business purposes without expensive, time-consuming, and sometimes risky custom development.
• Continuous, consistent updates and maintenance: the multi-tenant software provider is responsible for updates and patches. New features are added and fixes applied without any effort on the customer's part, and only once (as opposed to single-tenant architecture, where providers must update every instance of the software).
• Improved productivity for tenants: not having to manage infrastructure or software means tenants are free to focus on more important tasks.

Data in the Cloud: Relational Databases
• A relational database is a collection of data items with pre-defined relationships between them.
• The items are organized as a set of tables with columns and rows. Tables hold information about the objects represented in the database.
• Each column in a table holds a certain kind of data, and a field stores the actual value of an attribute. The rows of a table represent a collection of related values of one object or entity.

Cloud File Systems: GFS and HDFS
• Big data is a term used to describe the huge volumes of data generated by digital processes and media exchange all over the world.
• The Google File System and the Hadoop Distributed File System were developed and implemented to handle huge amounts of data.
• The Google File System (GFS) is a scalable distributed file system (DFS) created by Google Inc. to accommodate Google's expanding data processing requirements.

Main Goals of GFS and HDFS
• HDFS and GFS were built to support large files coming from various sources and in a variety of formats.
• Huge data stores (petabytes) are distributed across thousands of disks attached to commodity hardware.
• Both HDFS and GFS are designed for data-intensive computing, not for normal end users.

Google File System Architecture
• GFS is composed of clusters. A cluster is a set of networked computers.
• GFS clusters contain three types of interdependent entities: clients, a master, and chunk servers.
• Clients can be computers or applications manipulating existing files or creating new files on the system.
• The master server is the orchestrator or manager of the cluster and maintains the operation log.
• Chunk servers are the core engine of GFS. They store file chunks of 64 MB in size, coordinate with the master server, and send requested chunks directly to clients.

Hadoop Distributed File System Architecture
• HDFS is a distributed file system that handles large data sets running on commodity hardware.
• It is used to scale a single Apache Hadoop cluster to hundreds (and even thousands) of nodes.

Limitations of Hadoop
• Hadoop can perform only batch processing, and data is accessed only in a sequential manner.
• This means the entire dataset must be scanned even for the simplest of jobs.

HBase
• HBase is a column-oriented, non-relational database management system that runs on top of the Hadoop Distributed File System (HDFS). It is an open-source project and is horizontally scalable.
• HBase's data model is similar to Google's Bigtable and is designed to provide quick random access to huge amounts of structured data. It leverages the fault tolerance provided by HDFS.

MapReduce
• MapReduce is a programming model developed by Google and used with both GFS and HDFS. Based on Google's MapReduce white paper, Apache adopted and developed its own MapReduce implementation with some minor differences.
• The primary role of MapReduce is to provide an infrastructure that allows the development and execution of large-scale data processing jobs.

BigTable
• Cloud Bigtable is a fully managed, wide-column and key-value NoSQL database service for large analytical and operational workloads, offered as part of the Google Cloud portfolio.
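The MapReduce model described above can be sketched as a single-process Python word count. A real MapReduce framework distributes the phases across a cluster and handles failures, but the three stages (map, shuffle/group, reduce) are the same.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework
    would do between the map and reduce phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the grouped values for each key (here, by summing)."""
    return {word: sum(counts) for word, counts in groups.items()}

def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle(pairs))

print(word_count(["the cloud", "the data the cloud"]))
```

Because each map call depends only on its own document and each reduce call only on one key's group, both phases can run on many machines at once, which is what makes the model suit GFS/HDFS-scale data.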
Dynamo
• Dynamo is a set of techniques that together form a highly available, key-value structured storage system, or distributed data store.
• Dynamo is an internal technology developed at Amazon to address the need for an incrementally scalable, highly available key-value storage system.
• The technology is designed to give its users the ability to trade off cost, consistency, durability, and performance while maintaining high availability.

Parallel Computing
• Parallel computing refers to the process of breaking larger problems down into smaller, independent, often similar parts that can be executed simultaneously by multiple processors communicating via shared memory; the results are combined upon completion as part of an overall algorithm.
• The primary goal of parallel computing is to increase the available computation power for faster application processing and problem solving.
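The divide-and-combine pattern described above can be sketched in Python: split a sum of squares into independent chunks, run the chunks on a pool of worker threads, then combine the partial results. (Because of CPython's global interpreter lock, threads here illustrate the structure rather than true CPU parallelism; a process pool would be used for real CPU-bound speedups.)

```python
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    """Each worker computes the sum of squares of its own chunk."""
    return sum(n * n for n in chunk)

def parallel_sum_of_squares(numbers, workers=4):
    """Split the problem into independent chunks, run them concurrently,
    then combine the partial results into the final answer."""
    size = max(1, len(numbers) // workers)
    chunks = [numbers[i:i + size] for i in range(0, len(numbers), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(partial_sum, chunks)
    return sum(partials)

print(parallel_sum_of_squares(list(range(1, 101))))  # prints 338350
```

The chunks are independent (no worker needs another's data), which is exactly the property that lets the parts execute simultaneously and the results be combined at the end.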