Virtualization Technologies
Hugo Valentim m6104, Nuno Pereira m6691, Rafael Couto m6027

Outline: Virtualization Technologies
• Introduction
• Making a Business Case for Virtualization
• Virtualization Technologies
• Virtualization Applications
• Performing a Server Virtualization Cost-Benefit Analysis
• Nine Steps to Your First Virtualization Project
• Five Virtualization Pitfalls to Avoid

References
• Virtualization: A Manager's Guide, by Dan Kusnetzky, O'Reilly – ISBN: 978-1-449-30645-8
• Virtualization for Dummies, by Bernard Golden, Wiley Publishing Inc. – ISBN: 978-0-470-14831-0
• Server Virtualization – Cloud Computing, by C. Y. Huang, Department of Computer Science, National Tsing Hua University

Introduction

Virtualization: A Definition
• Operating system virtualization is the use of software to allow a piece of hardware to run multiple operating system images at the same time.
• The technology got its start on mainframes decades ago, allowing administrators to avoid wasting expensive processing power.
• There are three areas of IT where virtualization is making inroads: network virtualization, storage virtualization, and server virtualization.

Network virtualization
• A method of combining the available resources in a network by splitting the available bandwidth into channels, each of which is independent from the others and each of which can be assigned (or reassigned) to a particular server or device in real time.

Storage virtualization
• The pooling of physical storage from multiple network storage devices into what appears to be a single storage device that is managed from a central console.
• Storage virtualization is commonly used in storage area networks (SANs).

Server virtualization
• The masking of server resources from server users. The intention is to spare the user from having to understand and manage complicated details of server resources while increasing resource sharing and utilization and maintaining the capacity to expand later.

Why Is Virtualization Hot Right Now?
• Hardware is underutilized
• Data centers run out of space
• Energy costs go through the roof
• System administration costs mount

Types of Virtualization
• Operating system virtualization (containers)
• Hardware emulation / full virtualization
• Para-virtualization

Operating system virtualization (containers)
• Multiple instances of the same application running at the same time
• I/O devices are partitioned among the containers
• Example: web servers

Hardware emulation / Full Virtualization
• The guest OS believes it has all the hardware to itself
• Example: Boot Camp

Para-virtualization
• The host OS controls the access of all guest OSes to the hardware
• The host OS services all the needs of the guest OSes
• The guest OS knows that it is virtualized
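Whether a host can use hardware-assisted full virtualization (covered in more detail later) is visible in its CPU flags: Intel VT-x advertises "vmx" and AMD-V advertises "svm". A minimal sketch, assuming a Linux host that exposes /proc/cpuinfo:

```python
# Check whether this Linux host's CPU advertises hardware
# virtualization extensions (Intel VT-x -> "vmx", AMD-V -> "svm").
# Minimal sketch; assumes a Linux host exposing /proc/cpuinfo.

def hw_virt_support(cpuinfo_path="/proc/cpuinfo"):
    with open(cpuinfo_path) as f:
        for line in f:
            if line.startswith("flags"):
                flags = line.split(":", 1)[1].split()
                if "vmx" in flags:
                    return "Intel VT-x"
                if "svm" in flags:
                    return "AMD-V"
    return None

if __name__ == "__main__":
    support = hw_virt_support()
    if support:
        print(f"Hardware-assisted virtualization available: {support}")
    else:
        print("No vmx/svm flag found; full virtualization would need "
              "software techniques such as dynamic recompilation.")
```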
Making a Business Case for Virtualization
• You're drowning in servers.
• You need to use your hardware more efficiently.
• You need to reduce energy costs.
• You need to make IT operations in general more efficient.

Virtualization Lowers Costs
• The total number of server machines can be lower
• Less power consumption
• Fewer maintenance problems
• Simpler control of servers and hardware

Software Costs: A Challenge for Virtualization
• When it comes to virtualization, software licensing is a mess.
• Oracle, for example, has taken the position that its fee should be based on the total number of processors in the server.
• End users are confused about their licensing rights and responsibilities in a virtualized environment, and the situation appears to be driven more by vendor business needs than by customer satisfaction.

Virtualization Technologies

Full Virtualization
• Full virtualization is a technique in which the guest operating system is unaware that it is in a virtualized environment. The hardware is virtualized by the host operating system, so the guest can issue commands to what it thinks is actual hardware, when these are really just simulated hardware devices created by the host.

Paravirtualization
• Paravirtualization is a virtualization technique in which the guest operating system is aware that it is a guest and accordingly has drivers that, instead of issuing hardware commands, issue commands directly to the host operating system.

Full Virtualization vs. Para-Virtualization

Nested Virtualization
• Simply a VM running inside another VM.
• Example: a Windows 8 VM running an Ubuntu VM.

Hybrid Virtualization
• This technique is a combination of paravirtualization and full virtualization: parts of the guest operating system use paravirtualization for certain hardware drivers, while the host uses full virtualization for other features. This often produces superior performance on the guest without the need for it to be completely paravirtualized.
• An example: the guest uses full virtualization for privileged instructions in the kernel but paravirtualization for I/O requests, using a special driver in the guest.

Hardware emulation
• Hardware-assisted virtualization (emulation) is a type of full virtualization where the microprocessor architecture has special instructions to aid the virtualization of hardware. These instructions might allow a virtual context to be set up so that the guest can execute privileged instructions directly on the processor, even though it is virtualized.
• If such instructions do not exist, full virtualization is still possible, but it must be done via software techniques such as dynamic recompilation, where the host recompiles privileged guest instructions on the fly so that they can run in a non-privileged way on the host.
• Examples: console emulators

OS Virtualization
• Operating-system-level virtualization is a virtualization method in which the kernel of an operating system allows for multiple isolated user-space instances instead of just one. Such instances (e.g., containers) may look and feel like a real server from the point of view of their owners and users.
• Operating-system-level virtualization usually imposes little to no overhead, because programs in virtual partitions use the operating system's normal system call interface and do not need to be subjected to emulation or run in an intermediate virtual machine, as is the case with whole-system virtualizers.
• This form of virtualization also does not require support in hardware to perform efficiently.
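To make OS-level virtualization concrete, the sketch below launches an isolated web-server container and lists the running containers. It is a minimal sketch that assumes Docker is installed and its daemon is running; the container name and host port are arbitrary examples:

```python
# Launch an nginx web server in an isolated container and list
# running containers. Minimal sketch; assumes Docker is installed
# and the current user may talk to the Docker daemon. The container
# name ("demo-web") and host port (8080) are arbitrary examples.
import subprocess

def run(cmd):
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Containers share the host kernel -- no guest OS is booted,
# which is why startup takes milliseconds rather than minutes.
run(["docker", "run", "-d", "--name", "demo-web",
     "-p", "8080:80", "nginx"])

# Show running containers (each one an isolated user-space instance).
run(["docker", "ps"])

# Clean up the example container.
run(["docker", "rm", "-f", "demo-web"])
```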
OS Virtualization (Cont.)
• This virtualization is not as flexible as other virtualization approaches, since it cannot host a guest operating system different from the host one, or a different guest kernel. For example, with Linux, different distributions are fine, but other operating systems such as Windows cannot be hosted.
• Some operating-system-level virtualization implementations provide file-level copy-on-write mechanisms. This means that a standard file system is shared between partitions, and partitions that change files automatically create their own copies. This is easier to back up, more space-efficient, and simpler to cache.

OS Virtualization
https://www.youtube.com/watch?v=NQSIWo8mXFE

Storage Virtualization
• Storage virtualization is the fusion of multiple network storage devices into what appears to be a single storage unit. It is usually implemented via software applications and is often used in SANs (storage area networks), making tasks such as archiving, backup, and recovery easier and faster.

Storage Virtualization (Ex.)
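As one concrete, Linux-specific illustration of the pooling described above, LVM can merge several physical disks into a single logical volume. A minimal sketch, assuming two spare disks (the device names, volume group name, and sizes are hypothetical) and root privileges:

```python
# Pool two physical disks into a single logical volume with LVM,
# illustrating storage virtualization on a single Linux host.
# Minimal sketch: the device names (/dev/sdb, /dev/sdc), the volume
# group name (vg_pool), and the sizes are hypothetical examples;
# requires root privileges and the lvm2 tools.
import subprocess

def run(cmd):
    print("#", " ".join(cmd))
    subprocess.run(cmd, check=True)

disks = ["/dev/sdb", "/dev/sdc"]

run(["pvcreate"] + disks)                # mark disks as physical volumes
run(["vgcreate", "vg_pool"] + disks)     # pool them into one volume group
run(["lvcreate", "-L", "10G",            # carve one 10 GB logical volume
     "-n", "data", "vg_pool"])           # out of the pooled capacity
run(["mkfs.ext4", "/dev/vg_pool/data"])  # the consumer sees one device
```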
Network Virtualization
• When applied to a network, virtualization creates a logical, software-based view of the hardware and software networking resources (switches, routers, etc.).
• The physical networking devices are simply responsible for the forwarding of packets, while the virtual network (software) provides an intelligent abstraction that makes it easy to deploy and manage network services and the underlying network resources.

Network Virtualization (Ex.)

Application Virtualization
• Application virtualization is software technology that encapsulates application software from the underlying operating system on which it is executed.
• A fully virtualized application is not installed in the traditional sense, although it is still executed as if it were. The application behaves at runtime like it is directly interfacing with the original operating system and all the resources managed by it.
• In this context, the term "virtualization" refers to the artifact being encapsulated.

Desktop Virtualization
• Desktop virtualization is software technology that separates the desktop environment and associated application software from the physical client device that is used to access it.
• Desktop virtualization can be used in conjunction with application virtualization and user profile management systems to provide a comprehensive desktop environment management system. In this model, all the components of the desktop are virtualized, which allows for a highly flexible and much more secure desktop delivery model.
• In addition, this approach supports a more complete desktop disaster recovery strategy, as all components are essentially saved in the data center and backed up through traditional redundant maintenance systems.

Desktop Virtualization (Cont.)
• If a user's device or hardware is lost, the restore is much more straightforward, because essentially all the components will be present at login from another device. In addition, because no data is saved to the user's device, if that device is lost there is much less chance that any critical data can be retrieved and compromised.

Virtualization Applications

Development and Testing
• Engineers build software to implement certain functionality. Depending on the type of software, it might need to run on more than one operating system (say, on Linux and Windows).
• After the engineers have built the software to run in those environments, the software is handed off to a testing group to ensure quality.
• Test groups often do things like inputting data of the wrong type or feeding in a data file of the wrong format, all to see whether the software is likely to work in real-world usage conditions.
• One consequence of this kind of destructive testing is that it tends to crash machines and often wipes out the operating system, necessitating a fresh install of the OS and application before any additional testing can be done.

Development and Testing (Cont.)
• There are a few problems with this situation, however:
  • Only a portion of the development and test cycle requires multiple systems.
  • Most development and test work that is done on multiple machines is done only to ascertain basic functionality; therefore, many of the machines are lightly loaded and lightly exercised.
  • When development or testing causes a system crash, it can take time and labor to reinstall everything to get ready for the next task.
  • Keeping a bunch of machines sitting around for occasional development or test work requires space and costs money.

Development and Testing (Cont.)
• Nowadays, even a developer's laptop is powerful enough to support several guest machines. By using virtualization, a developer or tester can replicate a distributed environment containing several machines on a single piece of hardware.
• This setup negates the need to have a bunch of servers sitting around for the occasional use of developers or testers. It also avoids the inevitable conflict that occurs when an organization attempts to have its members share machines for distributed use.
• This capability is a boon to developers and testers. Rather than having to repeatedly rebuild test instances, they can just save a complete virtual machine image and load it each time they trash a virtual machine instance (see the snapshot sketch after the Training section below).
• In terms of recovery time, the consequences of a machine crash go from hours to minutes.

Training (Teaching)
• Training is a common application of virtualization. Because technical training requires that students have a computer system available to experiment and perform exercises on, the management of a training environment has traditionally been very challenging and labor intensive. Training typically follows this scenario:
  1. The student begins with a basic technology setup.
  2. The student performs set exercises involving more and more complex operations, each of which extends or modifies the basic technology setup.
  3. At class end, the student leaves the course having successfully completed all exercises and having significantly modified the basic technology setup.
• The challenge then faced is the need to restore the environments back to the basic setup. With virtualization, the restoration process is quite simple. For instance, if hardware emulation or paravirtualization is being used, the virtual machines used by the last batch of students are deleted and new virtual machines are created from images that represent the desired starting point.
• Setting up training environments has gone from hours or days to minutes.
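Both the test-machine rebuild and the classroom reset described above boil down to reverting a VM to a known-good image. A minimal sketch of that workflow using libvirt's virsh CLI, assuming a KVM/QEMU host; the VM and snapshot names are hypothetical examples:

```python
# Reset a trashed test/training VM to a clean baseline using
# libvirt's virsh CLI. Minimal sketch; assumes a KVM/QEMU host with
# virsh installed, and a VM named "student-vm" with a snapshot named
# "baseline" (both names are hypothetical examples).
import subprocess

VM = "student-vm"
SNAPSHOT = "baseline"

def virsh(*args):
    cmd = ["virsh", *args]
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Taken once, right after the clean install:
virsh("snapshot-create-as", VM, SNAPSHOT, "Clean starting point")

# ... destructive testing, or a day of classroom exercises ...

# Restore in minutes instead of reinstalling the OS for hours;
# the VM resumes in whatever state the snapshot captured.
virsh("snapshot-revert", VM, SNAPSHOT)
```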
Server Consolidation
• Consolidation is the act of taking separate things and creating a union that contains all of them. With respect to virtualization, consolidation refers to taking separate server instances and migrating them into virtual machines running on a single server.
• It addresses the problems most folks desperately want virtualization to address:
  • underutilization of servers in the data center;
  • server sprawl that threatens to overwhelm data center capacity;
  • sky-high energy costs from running all those underutilized servers;
  • escalating operations costs as more system administrators are required to keep all those underutilized servers humming.

Server Consolidation (Cont.)
• A typical server consolidation project accomplishes the following:
  1. Identifies a number of underutilized systems in the data center.
  2. Selects one or more servers to be the new virtualization servers; these might be existing or newly purchased.
  3. Installs virtualization software on the new virtualization servers.
  4. Creates new virtual machines from the existing servers.
  5. Installs the new virtual machines on the new virtualization servers.
  6. Begins running the virtual machines as the production servers for the applications inside the virtual machines.
• Companies implementing server consolidation often move from running 150 physical servers to running 150 virtual machines on only 15 servers, with an understandable reduction in power, hardware investment, and employee time.

Failover
• Companies run many applications that they consider mission critical. Running an application on a single server exposes the company to a single point of failure (referred to as a SPOF).
• Failover technology essentially mirrors one copy of the application to a second machine and keeps them consistent by constantly sending messages back and forth between the copies. If the secondary machine notices that the primary machine is no longer responding, it steps up and takes on the workload: when one system goes down, the application fails over to the secondary system.
• Although this functionality is clearly critical, it's important to note some less-than-desirable aspects to it:

Failover (Cont.)
• It's application specific. If every mission-critical application provides its own method of achieving failover, the IT operations staff has to know as many different technologies as there are applications. This is a recipe for complexity and high cost.
• It's wasteful of resources. Keeping a mirrored system up for every mission-critical application requires a lot of extra machines, each of which imposes its own operations costs, not to mention wasting a lot of extra hardware. Furthermore, because one of the factors driving the move to virtualization is that data centers are overfilled with machines, a solution that requires even more machines seems undesirable.
• It's expensive. Every vendor recognizes that failover is mission critical, so it charges a lot for the functionality.
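The message exchange described above is usually a simple heartbeat. The sketch below shows the core of the idea — a secondary that promotes itself when the primary stops responding. The address and timing values are arbitrary illustrative choices, not any specific product's protocol:

```python
# Core idea of heartbeat-based failover: the secondary probes the
# primary; after enough missed heartbeats it takes over the workload.
# Minimal sketch only -- the address and timing values are arbitrary,
# and real products add fencing, shared state, and split-brain checks.
import socket
import time

PRIMARY = ("10.0.0.1", 7000)   # hypothetical primary address
INTERVAL = 2.0                 # seconds between heartbeats
MAX_MISSED = 3                 # misses tolerated before failover

def primary_alive():
    try:
        with socket.create_connection(PRIMARY, timeout=1.0):
            return True
    except OSError:
        return False

def run_secondary():
    missed = 0
    while True:
        if primary_alive():
            missed = 0
        else:
            missed += 1
            if missed >= MAX_MISSED:
                print("Primary unresponsive -- taking over workload")
                # start_application() would go here on a real system
                return
        time.sleep(INTERVAL)

if __name__ == "__main__":
    run_secondary()
```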
High Availability
• High availability extends the concept of simple failover to incorporate an additional hardware server.
• Instead of a crashed virtual machine being started on the same piece of hardware, it is started on a different server, thereby avoiding the problem of a hardware failure negating the use of virtualization failover.
• But how does it work? How can a hypervisor on one physical server start a virtual machine on another hypervisor?
• It can't.

High Availability (Cont.)
• High availability relies on an overarching piece of virtualization software that coordinates the efforts of multiple hypervisors. When a virtual machine on one hardware server crashes, the coordinating software starts another virtual machine on a separate hardware server.
• The coordinating virtualization software constantly monitors all the hypervisors and their virtual machines. If the coordinating software sees that the hypervisor on one server is no longer responding, it arranges to start any virtual machines that were on the failed hardware on other hardware.
• So HA addresses the issue of hardware failure by using higher-level virtualization software to coordinate the hypervisors on two or more machines, constantly monitoring them and restarting virtual machines on other machines if necessary. This setup certainly addresses the issue of hardware failure and makes the failover solution more robust.

Load Balancing
• Simply put, load balancing involves running two instances of a virtual machine on separate pieces of hardware and dividing the regular workload between them.
• By running two instances of the virtual machine, if one machine crashes, the other continues to operate. If the hardware underneath one of the virtual machines fails, the other keeps working. In this way, the application never suffers an outage.
• Load balancing also makes better use of machine resources. Rather than the second VM sitting idly by, being updated by the primary machine but performing no useful work, load balancing makes the second VM carry half of the load, thereby ensuring that its resources aren't going unused. The use of duplicate resources can extend beyond the virtual machines.

Load Balancing (Cont.)
• Virtualized storage is a prerequisite for load balancing because each virtual machine must be able to access the same data, which is necessary because application transactions can go to either virtual machine.
• In fact, depending on the application and how the load balancing is configured, transactions might go through both virtual machines during the execution of an extended transaction. Therefore, both machines must be able to access the data, and therefore virtualized storage is required.
• We can also configure load-balanced virtual machines to act as clustered machines and share state between them. That way, if a virtual machine crashes, the other can pick up the work and continue it.

Load Balancing
www.youtube.com/watch?v=oEcEqN8PeeI
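At its simplest, the division of work described above is a round-robin choice between the VM instances, with unhealthy instances skipped. A minimal sketch; the backend addresses are hypothetical placeholders, and real load balancers (HAProxy, nginx, cloud LBs) are far more sophisticated:

```python
# Round-robin selection between two VM instances, skipping any
# instance that fails a health check -- the essence of the load
# balancing plus failover combination described above.
import itertools
import socket

BACKENDS = [("10.0.0.11", 8080), ("10.0.0.12", 8080)]  # hypothetical

_rr = itertools.cycle(range(len(BACKENDS)))

def healthy(addr):
    """Crude TCP health check against one backend."""
    try:
        with socket.create_connection(addr, timeout=0.5):
            return True
    except OSError:
        return False

def pick_backend():
    """Return the next healthy backend in round-robin order."""
    for _ in range(len(BACKENDS)):
        addr = BACKENDS[next(_rr)]
        if healthy(addr):
            return addr
    raise RuntimeError("no healthy backends -- full outage")

if __name__ == "__main__":
    for _ in range(4):
        print("routing request to", pick_backend())
```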
Server pooling
• Now you're probably thinking: "Wow, virtualization sure can be applied in a lot of useful ways, but there seems to be a lot of installing and configuring. Wouldn't it be great if the virtualization software arranged the installation and configuration so that I automatically got failover and load balancing?"

Server pooling (Cont.)
• In fact, that functionality exists. It's called server pooling, and it's a great application of virtualization. With server pooling, virtualization software manages a group of virtualized servers.
• Instead of installing a virtual machine on a particular server, the administrator merely points the virtualization software at the virtual machine image, and the software figures out which physical server is best suited to run the machine.
• The server-pooling software also keeps track of every virtual machine and server to determine how resources are being allocated. If a virtual machine needs to be relocated to make better use of the available resources, the virtualization software automatically migrates it to a better-suited server.
• The pool is managed through a management console, and if the administrator notices that the overall pool of servers is getting near the defined maximum utilization rate, another server can be added to the pool transparently. The virtualization software then rebalances the loads to make the most effective use of all the server resources.

Server pooling (Cont.)
• Because there is no way of knowing which physical server will be running a virtual machine, the storage must be virtualized so that a VM on any server can get access to its data.
• Most IT organizations haven't moved toward server pooling yet.
• Server pooling and Distributed Resource Scheduler, in addition to intelligently migrating machines to other machines that have more resources (load balancing), help with moving machines when a hardware failure hits. So, rather than using clustering for failover, Distributed Resource Scheduler can help move a machine when its base hardware has an issue.

Server pooling
https://www.youtube.com/watch?v=yGX2QLutpdM

Disaster recovery
• Disaster recovery comes into play when an entire data center is temporarily or permanently lost. In a complete data center loss, IT organizations need to scramble to keep the computing infrastructure of the entire company operating.
• Virtualization can help with the application recovery and ongoing management tasks. Any of the failover, high availability, load-balancing, or server-pooling virtualization capabilities may be applied in a DR scenario. Their application just depends on how much you want to physically manage during the DR process.
• Because virtual machine images can be captured in files and then started by a hypervisor, virtualization is an ideal technology for DR scenarios.

Disaster recovery (Cont.)
• In a time of disaster, needing to locate physical servers, configure them, install applications, configure those, and then feed in backups to get the system up to date is a nightmare.
• Keeping spare computing capacity in a remote data center that completely mirrors your primary computing infrastructure is extremely expensive.
• With virtualization, a much smaller set of machines can be kept available in a remote data center, with virtualization software preinstalled and ready to accept virtual machine images. In the case of a disaster, such images can be transferred from the production data center to the backup data center.
• These images can then be started by the preinstalled virtualization software, and they can be up and running in just a few minutes.

Disaster recovery (Cont.)
• Because virtualization abstracts hardware, it is possible to move a virtual machine from one hypervisor to another, whether the hypervisors are on machines in the same rack or on machines in data centers halfway around the world from one another.
• Because the virtual machines can be moved to any available hypervisor, it is possible to implement a disaster recovery plan much more easily, with no need to replicate an entire data center.
• Example: Hurricane Katrina. When it struck, many IT shops lost all processing capability because their entire data centers were inundated. As if that weren't enough, Internet connectivity was lost as well, because telecommunications centers were flooded.
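The "transfer an image and start it" step above maps to a handful of hypervisor commands. A minimal sketch using libvirt's virsh on a KVM/QEMU standby host; the file paths and VM name are hypothetical, and the replicated disk image and domain XML are assumed to have been copied over already:

```python
# Bring a replicated VM back up at the standby site: register its
# saved configuration with the local hypervisor and boot it from the
# replicated disk image. Minimal sketch; assumes a KVM/QEMU host with
# virsh, and that /dr/web01.xml plus the matching disk image
# (hypothetical paths) were shipped from production beforehand.
import subprocess

DOMAIN_XML = "/dr/web01.xml"   # VM definition shipped from production
VM_NAME = "web01"              # hypothetical VM name

def virsh(*args):
    cmd = ["virsh", *args]
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

virsh("define", DOMAIN_XML)    # register the VM with this hypervisor
virsh("start", VM_NAME)        # boot it from the replicated image
virsh("list", "--all")         # confirm the domain is running
```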
Performing a Server Virtualization Cost-Benefit Analysis

Looking at current costs
• Most IT organizations don't track their costs with much granularity;
• It is almost impossible to request a report that shows the current costs of the infrastructure to be virtualized;
• However, it is possible, with a bit of work, to create a fairly accurate estimate of the current cost of the infrastructure;
• There are two broad categories: hard costs and soft costs;
• Hard costs are easier to document than soft costs.

Hard Costs
• Any costs that require paying actual money to an entity;
• Usually associated with a specific product or service;
• E.g., an outside company paid to run security scans on the servers.

Hard Costs Examples
• Power – obvious!
• Server Maintenance – covers maintenance contracts on machines;
• Outside Services – outside vendors paid for anything associated with keeping the infrastructure up and running (for example, someone to run backups on the machines).

Soft Costs
• Typically associated with internal personnel or internal services for which no explicit chargeback system is in place;
• Soft costs are what the organization spends on an ongoing basis doing its daily work;
• Very few organizations systematically keep track of the specific tasks their members perform, or the amount of time members devote to each task.

Soft Costs Examples
• Machine Administration – the work done by internal personnel keeping the technology infrastructure humming;
• Backup – the process of creating a secure copy of the data on individual machines.

Identifying virtualization costs
• After establishing the current running costs of the infrastructure as-is, it is important to understand the potential benefits and costs of moving to virtualization;
• This requires estimation rather than documentation.

Selecting a Virtualization Deployment Scenario
• The first step is to define the configuration you'll most likely install.
  • For example: individual virtualized servers, or a virtualized pool of servers, possibly including virtualized storage as well.
• Without a defined configuration, you can't accurately estimate the cost of moving to virtualization.
• One will incur a range of costs when moving to virtualization, and it's important to recognize and estimate them in the overall cost-benefit analysis.

Identifying New Hardware Needs
• Virtualization is a very flexible technology, capable of running on a very wide range of hardware;
• Bringing in new hardware as part of the virtualization project might make good sense;
• Those costs need to be added to the cost analysis;
• If the current hardware is in good condition, one should consider creating two versions of the cost analysis: one with new hardware and one without.

Considering Other Physical Equipment
• Depending on the envisioned configuration, one might also decide to include other physical equipment such as Network-Attached Storage or a Storage Area Network.
• Any additional physical equipment is, naturally, a hard cost and should be documented in that section of the detail spreadsheet.
• Software licenses might also be required to operate the additional physical equipment, so one might need to estimate and document an additional hard cost in the new software section of the worksheet.

Purchasing New Software
• Software licenses are a hard cost.
• Depending on the load generated on the physical hardware, one will be able to estimate how many virtual machines can be supported per physical machine.
• Presenting an annualized view is important because software licenses typically carry a significant cost in Year One and then a lower ongoing cost for software maintenance and licensing.

Training Employees
• A learning curve is associated with beginning to use any new software product;
• Many organizations educate their employees through crash courses;
• These training costs should be included in the implementation costs.
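The comparison the worksheets capture can be reduced to a small annualized model. A minimal sketch mirroring the hard/soft cost split above; every figure is a made-up placeholder for illustration, not data from the text:

```python
# Annualized cost comparison of the current vs. virtualized
# infrastructure, mirroring the hard/soft cost split above.
# Every number is a made-up placeholder for illustration only.

current = {                      # per-year hard + soft costs, as-is
    "power": 60_000,
    "server_maintenance": 40_000,
    "outside_services": 25_000,
    "administration": 120_000,   # soft cost (rough estimate)
    "backup": 30_000,            # soft cost (rough estimate)
}

virtualized = {                  # per-year costs after consolidation
    "power": 12_000,
    "server_maintenance": 9_000,
    "outside_services": 25_000,
    "administration": 70_000,
    "backup": 20_000,
    "virtualization_licenses": 18_000,
}

one_time = 90_000                # new servers, SAN, training (Year One)

annual_saving = sum(current.values()) - sum(virtualized.values())
print(f"Annual saving:  ${annual_saving:,}")
print(f"Payback period: {one_time / annual_saving:.1f} years")
```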
Identifying the financial benefits of virtualization
• After the definition of the configuration and associated costs of the virtualization solution, one can estimate the financial benefits of the migration.
• There are two types of financial benefits from moving to virtualization:
  • Cost savings from no longer having to spend money one is currently spending on the existing infrastructure;
  • Cost savings through more effective and efficient operations.

Reduced Hard Costs
• Reduced hardware maintenance: as there are fewer machines, the hardware maintenance bill will shrink;
• Reduced software licenses: one might be able to realize savings on software licenses, as some vendors charge per physical device rather than per machine;
• Reduced power cost: fewer machines means less power consumption.

Reduced Soft Costs
• System administration work is reduced: there are fewer worries about hardware failure, and less maintenance work, as much of the environment manages itself;
• Soft costs might require rough estimates: estimating reduced work is difficult, so it is better to make rough predictions.

Creating a Cost-Benefit Spreadsheet
• Three sheets:
  • Current cost structure;
  • Virtualized cost structure;
  • Project cost summary.

Creating a Cost-Benefit Spreadsheet (cont.)
• Important sections:
  • Financial implications across time;
  • Separate hard and soft cost sections;
  • The summary worksheet must really summarize the big picture.
• Creating the spreadsheet with both detail and summary worksheets ensures that one has captured all the costs of the two options.

Nine Steps to Your First Virtualization Project

Evaluate Use Cases
• Don't be shortsighted;
• Use cases must reflect what your virtualization identity should be;
• They must reflect the opinions and recommendations one has collected from the company's employees.

Review of Organizational Ops. Structure
• Virtualization isn't only a technical issue, it's also political;
• People need to be willing to adapt to it;
• Take into account that there might be groups within the company that use computational resources differently;
  • One group may use only Windows machines and another may use only Linux machines.

Define the Virtualization Architecture
• Merge the use cases and the insights gathered from the organizational structure;
• This step is critical and should be reviewed within the organization:
  • It ensures that the architecture captures everyone's needs;
  • It generates awareness and commitment from the different interviewed groups.

Select the Virtualization Product(s)
• It should be a straightforward process, as the choice should be clear from the virtualization architecture;
• Be aware: although a product's website might claim that it can do such-and-such a function, the actual implementation of that functionality might only work well enough for the vendor to claim support for it, but not that well in real-world use.

Select the Virtualization Hardware
• Final check on existing hardware;
• If possible and worthwhile, keeping existing hardware will lower the initial capital investment;
• There are machines specifically designed for virtualization.

Perform a Pilot Implementation
• Creates the opportunity to determine whether the designed virtualization architecture will work;
• It should replicate the final system on a small scale, including:
  • Hardware, software, applications, and workloads.

Implement the Production Environment
• The phase to:
  • Order software and hardware;
  • Install any necessary data center equipment (for example, power connections);
  • Install the virtualization software and hardware.
• One is then ready to move forward with the migration to the new virtualized architecture.
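Once the virtualization software is installed, a quick scripted sanity check of the new hosts can confirm the hypervisor is reachable and sized as expected before migration begins. A minimal sketch using the libvirt Python bindings against a local KVM/QEMU host; the connection URI is an assumption:

```python
# Sanity-check a freshly installed virtualization host before
# migrating production servers onto it: connect to the hypervisor
# and report its capacity and any defined VMs. Minimal sketch;
# assumes the libvirt Python bindings (pip install libvirt-python)
# and a local KVM/QEMU host -- the URI is an assumption.
import libvirt

URI = "qemu:///system"

conn = libvirt.open(URI)
info = conn.getInfo()  # [cpu model, memory (MB), cpus, MHz, ...]
print(f"Host: {conn.getHostname()}")
print(f"CPUs: {info[2]}, memory: {info[1]} MB")

for dom in conn.listAllDomains():
    state = "running" if dom.isActive() else "shut off"
    print(f"  domain {dom.name()}: {state}")

conn.close()
```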
Migrate the Physical Servers
• Two choices for migration:
  • Automated
    • Automated migration software products are available; they are usually called physical-to-virtual migration tools (P2V tools);
    • Works better for Windows-based systems than for Linux systems.
  • Manual
    • Hands-on work to move systems. It consists of installing software in a new virtual machine, backing up the data from the existing physical server, and then recovering that data into the new virtual server.

Manage the Virtualized Infrastructure
• There is always work to be done:
  • Tuning;
  • Management;
  • And so on...

Five Virtualization Pitfalls to Avoid

Don't Skimp on Training
• A typical error of organizations;
• There is always a learning curve associated with new technologies.

Don't Apply Virtualization in Areas That Are Not Appropriate
• Highly loaded systems are not candidates to be virtualized;
• Critical systems are also not candidates to be virtualized;
• Low-load systems are good candidates because virtualizing them optimizes hardware utilization;
• It is important to evaluate the purpose and load of the systems to be virtualized.

Don't Imagine That Virtualization Is Static
• Periodically re-examine the current solution in light of what's newly available;
• It may be worthwhile to upgrade the systems (hardware and/or software);
• This is part of the virtualization management tasks.

Don't Skip the "Boring" Stuff
• It is important to do the use cases and the architecture design reviews;
• Without them, important requirements may be forgotten.

Don't Overlook the Importance of Hardware
• It is quite possible that the current hardware isn't suitable for virtualization, or at least not optimized for it;
• The importance of hardware is only going to increase as new, virtualization-ready hardware comes to market.

Q&A
Hugo Valentim m6104, Nuno Pereira m6691, Rafael Couto m6027

Quiz Time!