
Virtual Data Center Design
• In computing, virtualization is a broad term that refers to
the abstraction of computer resources
• It is "a technique for hiding the physical characteristics of
computing resources from the way in which other
systems, applications, or end users interact with those
resources. This includes making a single physical
resource (such as a server, an operating system, an
application, or storage device) appear to function as
multiple logical resources; or it can include making
multiple physical resources (such as storage devices or
servers) appear as a single logical resource."
• The common theme of all virtualization
technologies is the hiding of technical
detail, through encapsulation.
• Virtualization creates an external interface
that hides an underlying implementation,
e.g. by multiplexing access, by combining
resources at different physical locations, or
by simplifying a control system.
• It is divided into two main categories:
– Platform virtualization involves the simulation
of virtual machines.
– Resource virtualization involves the simulation
of combined, fragmented, or simplified
resources.
Platform Virtualization
• The creation of a virtual machine using a combination of
hardware and software is referred to as platform
virtualization.
• Platform virtualization is performed on a given hardware
platform by "host" software (a control program), which
creates a simulated computer environment (a virtual
machine) for its "guest" software.
• The "guest" software, which is often itself a complete
operating system, runs just as if it were installed on a
stand-alone hardware platform.
• Typically, many such virtual machines are simulated on a
given physical machine.
• For the "guest" system to function, the simulation must
be robust enough to support all the guest system's
external interfaces, which (depending on the type of
virtualization) may include hardware drivers.
Platform Virtualization
• There are several approaches to platform
virtualization, listed below based on how
complete a hardware simulation is
implemented. (The following terms are not
universally recognized as such, but the
underlying concepts are all found in the
literature.)
Platform Virtualization
• Emulation or simulation
– the virtual machine simulates the complete hardware,
allowing an unmodified "guest" OS for a completely
different CPU to be run. This approach has long been
used to enable the creation of software for new
processors before they were physically available.
Examples include Bochs, PearPC, PPC version of
Virtual PC, QEMU without acceleration, and the
Hercules emulator. Emulation is implemented using a
variety of techniques, from state machines to the use
of dynamic recompilation on a full virtualization
platform.
Hardware emulation uses a VM to
simulate the required hardware
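The simplest of these techniques, an interpreter that steps through guest instructions one at a time, can be sketched as a fetch-decode-execute loop. The three-instruction machine below is hypothetical: the opcodes (LOAD, ADD, HALT), the two registers, and the run function are assumptions made for illustration and do not correspond to any real CPU or to the internals of Bochs or QEMU.

```python
# Toy interpreter for a hypothetical 3-instruction guest CPU.
# Opcodes and register names are invented for illustration only.

def run(program, max_steps=1000):
    """Interpret `program`, a list of (opcode, operand, ...) tuples."""
    regs = {"A": 0, "B": 0}   # emulated register file
    pc = 0                    # emulated program counter
    steps = 0
    while pc < len(program) and steps < max_steps:
        op, *args = program[pc]          # fetch and decode
        if op == "LOAD":                 # LOAD reg, immediate
            regs[args[0]] = args[1]
        elif op == "ADD":                # ADD dst, src  (dst += src)
            regs[args[0]] += regs[args[1]]
        elif op == "HALT":               # stop the guest
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
        pc += 1
        steps += 1
    return regs


guest_program = [
    ("LOAD", "A", 2),
    ("LOAD", "B", 40),
    ("ADD", "A", "B"),
    ("HALT",),
]
final_regs = run(guest_program)
print(final_regs)  # {'A': 42, 'B': 40}
```

Because every guest instruction is decoded in software, the host CPU architecture does not matter, which is why this style of emulation can run a guest built for a completely different processor, at a substantial cost in speed.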
Platform Virtualization
• Native virtualization and full virtualization
– the virtual machine simulates enough
hardware to allow an unmodified "guest" OS
(one designed for the same CPU) to be run in
isolation. Typically, many instances can be run
at once. This approach was pioneered in
1966 with CP-40 and CP[-67]/CMS,
predecessors of IBM's VM family.
• Examples include Virtual Iron, VMware
Workstation, VMware Server (formerly GSX
Server), Parallels Desktop, Adeos, Mac-on-Linux,
Win4BSD, Win4Lin Pro, and z/VM.
Full virtualization uses a hypervisor
to share the underlying hardware
• In computing, a hypervisor (also: virtual
machine monitor) is a virtualization
platform that allows multiple operating
systems to run on a host computer at the
same time. The term usually refers to an
implementation using full virtualization.
• Hypervisors are currently classified in two types:
– Type 1 hypervisor (or Type 1 virtual machine monitor) is software
that runs directly on a given hardware platform (as an operating
system control program). A "guest" operating system thus runs at
the second level above the hardware.
• The classic type 1 hypervisor was CP/CMS, developed at IBM in the
1960s, ancestor of IBM's current z/VM. More recent examples are
Xen, VMware's ESX Server, and Sun's Hypervisor (released in
– Type 2 hypervisor (or Type 2 virtual machine monitor) is software
that runs within an operating system environment. A "guest"
operating system thus runs at the third level above the hardware.
• Examples include VMware Server and Microsoft Virtual Server.
Platform Virtualization
• Partial virtualization (and including "address
space virtualization")
– the virtual machine simulates multiple instances of
much (but not all) of an underlying hardware
environment, particularly address spaces. Such an
environment supports resource sharing and process
isolation, but does not allow separate "guest"
operating system instances. Although not generally
viewed as a virtual machine category per se, this was
an important approach historically, and was used in
such systems as CTSS, the experimental IBM
M44/44X, and arguably such systems as OS/VS1,
OS/VS2, and MVS. (Many more recent systems, such
as Microsoft Windows and Linux, as well as the
remaining categories below, also use this basic
approach.)
Platform Virtualization
• Paravirtualization
– the virtual machine does not necessarily
simulate hardware, but instead (or in addition)
offers a special API that can only be used by
modifying the "guest" OS. This system call to
the hypervisor is called a "hypercall" in Xen,
Parallels Workstation and Enomalism; it is
implemented via a DIAG ("diagnose")
hardware instruction in IBM's CMS under VM
(which was the origin of the term hypervisor).
• Examples include VMware ESX Server, Win4Lin
9x, and z/VM.
Paravirtualization shares the process
with the guest operating system
Platform Virtualization
• Operating system-level virtualization
– virtualizing a physical server at the operating system
level, enabling multiple isolated and secure virtualized
servers to run on a single physical server. The "guest"
OS environments share the same OS as the host
system – i.e. the same OS kernel is used to
implement the "guest" environments. Applications
running in a given "guest" environment view it as a
stand-alone system.
• Examples are Linux-VServer, Virtuozzo, OpenVZ, Solaris
Containers, and FreeBSD Jails.
Operating system-level
virtualization isolates servers
Platform Virtualization
• Application Virtualization
– running a desktop or server application locally, using local resources,
within an appropriate virtual machine; this is in contrast with running the
application as conventional local software, i.e. software that has been
'installed' on the system. (Compare this approach with Software
installation and Terminal Services.)
– Such a virtualized application runs in a small virtual environment
containing the components needed to execute – such as registry
entries, files, environment variables, user interface elements, and global
objects.
– This virtual environment acts as a layer between the application and the
operating system, and eliminates application conflicts and application-OS
conflicts. Examples include the Sun Java Virtual Machine, Softricity,
Thinstall, Altiris, and Trigence. (This approach to virtualization is clearly
different from the preceding ones; only an arbitrary line separates it from
such virtual machine environments as Smalltalk, FORTH, Tcl, P-code,
or any interpreted language.)
Resource Virtualization
• The basic concept of platform
virtualization was later extended to the
virtualization of specific system resources,
such as storage volumes, name spaces,
and network resources.
Resource Virtualization
• Resource aggregation, spanning, or concatenation
combines individual components into larger resources or
resource pools. For example:
– RAID and volume managers combine many disks into one large
logical disk.
– Storage Virtualization refers to the process of completely
abstracting logical storage from physical storage, and is
commonly used in SANs. The physical storage resources are
aggregated into storage pools, from which the logical storage is
created. Multiple independent storage devices, which may be
scattered over a network, appear to the user as a single,
location-independent, monolithic storage device, which can be
managed centrally.
– Channel bonding and network equipment use multiple links
combined to work as though they offered a single,
higher-bandwidth link.
– Virtual Private Network (VPN), Network Address Translation
(NAT), and similar networking technologies create a virtualized
network namespace within or across network subnets.
– Multiprocessor and multi-core computer systems often present
what appears as a single, fast processor.
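The disk-concatenation case can be sketched as a mapping from a logical block number to a physical disk and an offset within it. This is a minimal illustration, not how any particular RAID or volume manager is implemented: build_volume, locate, and the disk sizes are hypothetical names and values chosen for the example.

```python
from bisect import bisect_right
from itertools import accumulate


def build_volume(disk_sizes):
    """Return the cumulative end offsets of disks concatenated in order."""
    return list(accumulate(disk_sizes))


def locate(ends, logical_block):
    """Map a logical block number to (disk_index, block_within_disk)."""
    if not 0 <= logical_block < ends[-1]:
        raise IndexError("logical block outside the volume")
    disk = bisect_right(ends, logical_block)   # first disk whose end is past the block
    start = ends[disk - 1] if disk else 0      # first logical block on that disk
    return disk, logical_block - start


# Three hypothetical disks of 100, 50 and 200 blocks appear as one 350-block volume.
ends = build_volume([100, 50, 200])
print(locate(ends, 0))     # (0, 0)
print(locate(ends, 120))   # (1, 20)
print(locate(ends, 349))   # (2, 199)
```

The caller only ever sees logical block numbers from 0 to 349; which physical disk holds a block is hidden behind the lookup.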
Resource Virtualization
• Computer clusters, grid computing, and virtual servers
use the above techniques to combine multiple discrete
computers into larger metacomputers.
• Partitioning is the splitting of a single resource (usually
large), such as disk space or network bandwidth, into a
number of smaller, more easily utilized resources of the
same type. This is sometimes also called "zoning,"
especially in storage networks.
• Encapsulation is the hiding of resource complexity by the
creation of a simplified interface. For example, CPUs
often incorporate cache memory or pipelines to improve
performance, but these elements are not reflected in
their virtualized external interface. Similar virtualized
interfaces hiding complex implementations are found in
disk drives, modems, routers, and many other "smart"
devices.
Linux-related virtualization projects
Bochs (emulation)
• Bochs is an x86 computer simulator that is portable and runs on a
variety of platforms, including x86, PowerPC, Alpha, SPARC, and
MIPS. What makes Bochs interesting is that it doesn't just simulate
the processor but the entire computer, including the peripherals,
such as the keyboard, mouse, video graphics hardware, network
interface card (NIC) devices, and so on.
• Bochs can be configured as an older Intel® 386, or successor
processors such as the 486, Pentium, Pentium Pro, or a 64-bit
variant. It even emulates optional instruction-set extensions such as MMX
and 3DNow!.
• Using the Bochs emulator, you can run any Linux distribution on
Linux, Microsoft® Windows® 95/98/NT/2000 (and a variety of
applications) on Linux, and even the Berkeley Software Distribution
(BSD) operating systems (FreeBSD, OpenBSD, and so on) on
Linux.
QEMU (emulation)
QEMU is another emulator, like Bochs, but it has some differences that are
worth noting. QEMU supports two modes of operation. The first is the Full
System Emulation mode. This mode is similar to Bochs in that it emulates a
full personal computer (PC) system with processor and peripherals. This
mode emulates a number of processor architectures, such as x86, x86_64,
ARM, SPARC, PowerPC, and MIPS, with reasonable speed using dynamic
translation. Using this mode, you can emulate the Windows operating
systems (including XP) and Linux on Linux, Solaris, and FreeBSD. Many
other operating system combinations are also supported (see the
Resources section for more information).
QEMU also supports a second mode called User Mode Emulation. In this
mode, which can only be hosted on Linux, a binary for a different
architecture can be launched. This allows, for example, a binary compiled
for the MIPS architecture to be executed on Linux running on x86. Other
architectures supported in this mode include ARM, SPARC, and PowerPC,
though more are under development.
VMware (full virtualization)
• VMware is a commercial solution for full virtualization. A hypervisor
sits between the guest operating systems and the bare hardware as
an abstraction layer. This abstraction layer allows any operating
system to run on the hardware without knowledge of any other guest
operating system.
• VMware also virtualizes the available I/O hardware and places
drivers for high-performance devices into the hypervisor.
• The entire virtualized environment is kept as a file, meaning that a
full system (including guest operating system, VM, and virtual
hardware) can be easily and quickly migrated to a new host for load
balancing.
z/VM (full virtualization)
• While the IBM System z™ is a new brand name, it has a
long heritage dating back to the 1960s. The System/360
supported virtualization using virtual machines in 1965. Interestingly,
the System z retains backward compatibility with the older
System/360 line.
• The z/VM® is the operating system hypervisor for the System z. At
its core is the Control Program (CP), which provides the
virtualization of physical resources to the guest operating systems,
including Linux (see the figure on the next slide). This permits
multiple processors and other resources to be virtualized for a
number of guest operating systems.
• The z/VM can also emulate a guest local area network (LAN)
virtually for those guest operating systems that want to communicate
with each other. This is emulated entirely in the hypervisor, making it
highly secure.
z/VM (full virtualization)
Xen (paravirtualization)
• Xen is a free open source solution for operating system-level
paravirtualization from XenSource. Recall that in paravirtualization
the hypervisor and the operating system collaborate on the
virtualization, requiring operating system changes but resulting in
near native performance.
• As Xen requires collaboration (modifications to the guest operating
system), only those operating systems that are patched can be
virtualized over Xen. From the perspective of Linux, which is itself
open source, this is a reasonable compromise because the result is
better performance than full virtualization. But from the perspective
of wide support (such as supporting other non-open source
operating systems), it's a clear disadvantage.
• It is possible to run Windows as a guest on Xen, but only on
systems whose processors provide Intel Vanderpool or AMD Pacifica
hardware virtualization support. Other
operating systems that support Xen include Minix, Plan 9, NetBSD,
FreeBSD, and OpenSolaris.
User-mode Linux
• User-mode Linux (UML) allows a Linux
operating system to run other Linux
operating systems in user-space. Each
guest Linux operating system exists within
a process of the host Linux operating
system (see Figure 6). This permits
multiple Linux kernels (with their own
associated user-spaces) to run within the
context of a single Linux kernel.
User-mode Linux
• As of the 2.6 Linux kernel, UML resides in
the main kernel tree, but it must be
enabled and then recompiled for use.
These changes provide, among other
things, device virtualization. This allows
the guest operating systems to share the
available physical devices, such as the
block devices (floppy, CD-ROM, and file
systems, for example), consoles, NIC
devices, sound hardware, and others.
User-mode Linux
• Note that since the guest kernels run in
application space, they must be specially
compiled for this use (though they can be
different kernel versions). This results in what's
called the host kernel (which resides on the
hardware) and the guest kernel (which runs in
the user space of the host kernel). These
kernels can even be nested, allowing a guest
kernel to run on another guest kernel that is
running on the host kernel.
User-mode Linux
Linux-VServer (operating system-level virtualization)
• Linux-VServer is a solution for operating
system-level virtualization. Linux-VServer
virtualizes the Linux kernel so that multiple
user-space environments, otherwise
known as Virtual Private Servers (VPS),
run independently with no knowledge of
one another. Linux-VServer achieves user-space isolation through a set of
modifications to the Linux kernel.
Linux-VServer (operating system-level virtualization)
• To isolate the individual user-spaces from one
another, you begin with the concept of a context.
A context is a container for processes of a given
VPS, so that tools like ps know only about the
processes of the VPS. For initial boot, the kernel
defines a default context. A spectator context
also exists for administration (to view all
executing processes). As you can guess, the
kernel and internal data structures are modified
to support this approach to virtualization.
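The effect of contexts on a tool like ps can be sketched as a filter over the process table. This is only a model of the idea, not Linux-VServer code: the Process class, the visible_processes function, and the numeric context IDs are assumptions made for the example.

```python
from dataclasses import dataclass

DEFAULT_CTX = 0      # assumed ID for the boot-time default context
SPECTATOR_CTX = 1    # assumed ID for the administrative spectator context


@dataclass(frozen=True)
class Process:
    pid: int
    name: str
    ctx: int   # context (VPS) that the process belongs to


def visible_processes(table, viewer_ctx):
    """Return the processes a ps-like tool would see from `viewer_ctx`."""
    if viewer_ctx == SPECTATOR_CTX:
        return list(table)                        # spectator sees everything
    return [p for p in table if p.ctx == viewer_ctx]


table = [
    Process(1, "init", DEFAULT_CTX),
    Process(200, "httpd", 42),
    Process(201, "sshd", 42),
    Process(300, "mysqld", 43),
]
print([p.name for p in visible_processes(table, 42)])             # ['httpd', 'sshd']
print([p.name for p in visible_processes(table, SPECTATOR_CTX)])  # all four names
```

A process in context 42 has no way to learn that mysqld exists, because the filter is applied before any result reaches it.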
Linux-VServer (operating system-level virtualization)
• Linux-VServer also uses a form of chroot
to isolate the root directory for each VPS.
Recall that chroot allows a new root
directory to be specified, but additional
functionality is required (called a Chroot-Barrier) so that a VPS can't escape its
isolated root directory to the parent. Given
an isolated root directory, each VPS has
its own user list and root password.
Linux-VServer (operating system-level virtualization)
• The Linux-VServer is supported by both
the 2.4 and 2.6 Linux kernels and operates
on a number of platforms, including x86,
x86-64, SPARC, MIPS, ARM and
OpenVZ (operating system-level virtualization)
• OpenVZ is another operating system-level
virtualization solution, like Linux-VServer,
but it has some interesting differences.
• OpenVZ is a virtualization-aware
(modified) kernel that supports isolated
user-spaces, VPS, with a set of user-tools
for management.
• For example, you can easily create a new
VPS from the command line:
OpenVZ (operating system-level virtualization)
$ vzctl create 42 --ostemplate fedora-core-4
Creating VPS private area
VPS private area was created
$ vzctl start 42
Starting VPS ...
VPS is mounted
OpenVZ (operating system-level virtualization)
• You can also list the currently created VPSes
using the vzlist command, which operates in a
similar fashion to the standard Linux ps
command.
• To schedule processes, OpenVZ includes a two-level CPU scheduler. First, the scheduler
determines which VPS should get the CPU. After
this is done, the second-level scheduler picks
the process to execute given the standard Linux
process priorities.
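The two scheduling levels can be sketched as two successive selections. The fairness rule used at the first level (pick the runnable VPS that has consumed the least CPU so far) and the data layout are assumptions made for this example; they are not taken from the OpenVZ scheduler itself.

```python
def pick_next(vps_table):
    """Choose the next (vps_id, process_name) to run.

    vps_table maps a VPS id to {"cpu_used": float, "procs": [(priority, name), ...]}.
    Level 1: pick the runnable VPS with the least CPU used (assumed fairness rule).
    Level 2: within that VPS, pick the process with the best priority
             (lower number means higher priority, as with Linux nice values).
    """
    runnable = {vid: v for vid, v in vps_table.items() if v["procs"]}
    if not runnable:
        return None
    vps_id = min(runnable, key=lambda vid: (runnable[vid]["cpu_used"], vid))
    _priority, name = min(runnable[vps_id]["procs"])
    return vps_id, name


vps_table = {
    42: {"cpu_used": 5.0, "procs": [(20, "httpd"), (10, "sshd")]},
    43: {"cpu_used": 1.5, "procs": [(15, "mysqld")]},
    44: {"cpu_used": 0.0, "procs": []},          # nothing runnable, so skipped
}
print(pick_next(vps_table))  # (43, 'mysqld')
```

Splitting the decision this way keeps one busy VPS from starving the others, since CPU time is divided between VPSes before it is divided between processes.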
OpenVZ (operating system-level virtualization)
• OpenVZ also includes what are called beancounters. A
beancounter consists of a number of parameters that
define resource distribution for a given VPS. This
provides a level of control over a VPS, defining how
much memory is available, how many interprocess
communication (IPC) objects are available, and so on.
• A unique feature of OpenVZ is the ability to checkpoint
and migrate a VPS from one physical server to another.
Checkpointing means that the state of a running VPS is
frozen and stored into a file. This file can then be migrated
to a new server and restored to bring the VPS back
online.
• OpenVZ supports a number of hardware architectures,
including x86, x86-64, and PowerPC.
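The beancounter idea described above, a per-VPS set of limits against which resource requests are charged, can be sketched as follows. The class, its field names (limit, held, failcnt), and the two resources shown are illustrative assumptions, not the actual OpenVZ data structures or parameter names.

```python
class Beancounter:
    """Track usage of one resource against a limit and count refused requests."""

    def __init__(self, limit):
        self.limit = limit     # maximum units this VPS may hold
        self.held = 0          # units currently in use
        self.failcnt = 0       # number of requests refused so far

    def charge(self, amount):
        """Try to allocate `amount` units; refuse if the limit would be exceeded."""
        if self.held + amount > self.limit:
            self.failcnt += 1
            return False
        self.held += amount
        return True

    def uncharge(self, amount):
        """Release `amount` units back to the pool."""
        self.held = max(0, self.held - amount)


# Hypothetical limits for one VPS.
vps_limits = {"memory_pages": Beancounter(1000), "ipc_objects": Beancounter(2)}

print(vps_limits["memory_pages"].charge(800))   # True
print(vps_limits["memory_pages"].charge(300))   # False, would exceed 1000
print(vps_limits["memory_pages"].failcnt)       # 1
```

Keeping a failure counter alongside the limit lets an administrator see which limit a VPS keeps hitting, rather than only seeing that an allocation failed.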
Hardware support for full
virtualization and paravirtualization
• Recall that the IA-32 (x86) architecture creates some
issues when it comes to virtualization. Certain privileged-mode instructions do not trap, and can return different
results based upon the mode. For example, the x86 STR
instruction retrieves the security state, but the value
returned is based upon the particular requester's
privilege level. This is problematic when attempting to
virtualize different operating systems at different levels.
For example, the x86 supports four rings of protection,
where level 0 (the highest privilege) typically runs the
operating system, levels 1 and 2 support operating
system services, and level 3 (the lowest level) supports
applications. Hardware vendors have recognized this
shortcoming (and others), and have produced new
designs that support and accelerate virtualization.
Hardware support for full
virtualization and paravirtualization
• Intel is producing new virtualization technology that will
support hypervisors for both the x86 (VT-x) and Itanium®
(VT-i) architectures.
• The VT-x supports two new forms of operation
– one for the VMM (root)
– one for guest operating systems (non-root).
• The root form is fully privileged, while the non-root form
is deprivileged (even for ring 0).
• The architecture also supports flexibility in defining the
instructions that cause a VM (guest operating system) to
exit to the VMM and store off processor state. Other
capabilities have been added
Hardware support for full
virtualization and paravirtualization
• AMD is also producing hardware-assisted virtualization
technology, under the name Pacifica.
• Among other things, Pacifica maintains a control block
for guest operating systems that is saved on execution
of special instructions.
• The VMRUN instruction allows a virtual machine (and its
associated guest operating system) to run until the VMM
regains control (which is also configurable). The
configurability allows the VMM to customize the
privileges for each of the guests.
• Pacifica also amends address translation with host and
guest memory management unit (MMU) tables.
Linux KVM (Kernel Virtual Machine)
• The most recent news out of Linux is the incorporation of
the KVM into the Linux kernel (2.6.20).
• KVM is a full virtualization solution that is unique in that it
turns a Linux kernel into a hypervisor using a kernel
module.
• This module allows other guest operating systems to
then run in user-space of the host Linux kernel (see
Figure in the next slide).
• The KVM module in the kernel exposes the virtualized
hardware through the /dev/kvm character device.
• The guest operating system interfaces to the KVM
module using a modified QEMU process for PC
hardware emulation.
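As a small illustration of the /dev/kvm interface, a user-space program can open the device and ask the module for its API version. This is a sketch under stated assumptions: it relies on the KVM_GET_API_VERSION ioctl as defined in current Linux kernel headers (request number 0xAE00), the helper name kvm_api_version is made up for the example, and the function simply returns None on machines where the device is missing or inaccessible.

```python
import fcntl
import os

# _IO(0xAE, 0x00) from <linux/kvm.h> on current kernels.
KVM_GET_API_VERSION = 0xAE00


def kvm_api_version(path="/dev/kvm"):
    """Return the KVM API version, or None if the device cannot be queried."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError:
        return None            # no KVM module loaded, or no permission
    try:
        return fcntl.ioctl(fd, KVM_GET_API_VERSION, 0)
    except OSError:
        return None            # the file exists but is not a KVM device
    finally:
        os.close(fd)


version = kvm_api_version()
print("KVM not available" if version is None else f"KVM API version {version}")
```

A virtual machine monitor such as QEMU uses the same device for much more (creating VMs and virtual CPUs), but every such program starts by opening /dev/kvm in this way.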
Linux KVM (Kernel Virtual Machine)
• The KVM module introduces a new execution mode into
the kernel. Where vanilla kernels support kernel mode
and user mode, the KVM introduces a guest mode. The
guest mode is used to execute all non-I/O guest code,
where normal user mode supports I/O for guests.
• The introduction of the KVM is an interesting evolution of
Linux, as it represents the first virtualization technology
that is part of the mainline Linux kernel. It exists in the
2.6.20 tree, but can be used as a kernel module for the
2.6.19 kernel. When run on hardware that supports
virtualization, Linux (32-and 64-bit) and Windows (32-bit)
guests are supported.
Virtualization Examples
• Server consolidation - Virtual machines
are used to consolidate many physical
servers into fewer servers, which in turn
host virtual machines. Each physical
server is reflected as a virtual machine
"guest" residing on a virtual machine host
system. This is also known as Physical-to-Virtual or 'P2V' transformation.
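A rough way to estimate how many hosts a consolidation needs is to pack the measured load of each physical server onto hosts with a fixed capacity. The sketch below uses a first-fit-decreasing heuristic; the server names, the load figures, and the 80 percent capacity ceiling are all hypothetical, and real planning would also consider memory, I/O and peak behavior.

```python
def consolidate(loads, capacity=80):
    """Pack servers onto hosts with first-fit decreasing.

    loads maps a server name to its average CPU load, in percent of one host.
    capacity is the share of each host we are willing to fill (assumed 80%).
    Returns a list of hosts, each a list of the server names placed on it.
    """
    hosts = []  # each entry is [remaining_capacity, [server names]]
    for name, load in sorted(loads.items(), key=lambda kv: kv[1], reverse=True):
        if load > capacity:
            raise ValueError(f"{name} does not fit on any host")
        for host in hosts:
            if host[0] >= load:           # first host with enough room
                host[0] -= load
                host[1].append(name)
                break
        else:                             # no existing host fits, add one
            hosts.append([capacity - load, [name]])
    return [names for _remaining, names in hosts]


loads = {"web": 30, "mail": 10, "db": 50, "build": 45, "dns": 5}
placement = consolidate(loads)
print(placement)  # [['db', 'web'], ['build', 'mail', 'dns']]
```

In this example five physical servers become five guests on two hosts.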
Virtualization Examples
• Disaster recovery - Virtual machines can
be used as "hot standby" environments for
physical production servers. This changes
the classical "backup-and-restore"
philosophy, by providing backup images
that can "boot" into live virtual machines,
capable of taking over workload for a
production server experiencing an outage.
Virtualization Examples
• Testing and training - Hardware
virtualization can give root access to a
virtual machine. This can be very useful
in settings such as kernel development and
operating system courses.
Virtualization Examples
• Portable applications - The Microsoft Windows platform
has a well-known issue involving the creation of portable
applications, needed (for example) when running an
application from a removable drive, without installing it
on the system's main disk drive. This is a particular issue
with USB drives. Virtualization can be used to
encapsulate the application with a redirection layer that
stores temporary files, Windows Registry entries, and
other state information in the application's installation
directory – and not within the system's permanent file
system. See portable applications for further details. It is
unclear whether such implementations are currently
available.
Virtualization Examples
• Portable workspaces - Recent technologies have used
virtualization to create portable workspaces on devices
like iPods and USB memory sticks. These products
include:
– Application level – Thinstall – a driver-less solution for
running "Thinstalled" applications directly from removable
storage without system changes or Admin rights
– OS level – MojoPac, Ceedo, and U3 – which allow end users to
install some applications onto a storage device for use on
another PC
– Machine level – moka5 and LivePC – which deliver an
operating system with a full software suite, including isolation
and security protections
Server Virtualization
• Server virtualization is used to describe
many different technologies and
approaches to abstract operating systems
from hardware.
• Server virtualization presents a virtual view
of hardware to an operating system to
allow multiple operating systems to share
the same physical resource in complete
isolation from each other.
Server Virtualization
• The key benefits of virtualization are:
– Isolation: A virtual server’s state is unaffected
by the state of other virtual servers on the
same physical hardware.
– Encapsulation: The state of a virtual server
can be captured and files representing a
virtual server are portable.
– Hardware-independence: Virtual hardware
does not have to be identical to the underlying
physical hardware.
X86 Virtualization
• The x86 architecture was not originally designed for
virtualization.
• This created tradeoffs in early server virtualization
implementations in terms of both performance and
complexity.
• Historically there have been two approaches to virtualizing
the x86 architecture:
– binary patching
– paravirtualization.
• Although both approaches create the illusion of physical
hardware to achieve the goal of operating system
independence from the hardware, there are significant
differences between the approaches.
X86 Virtualization
• Full virtualization with binary patching
rewrites, at run time, x86 instructions that
cannot be trapped and converts them into
a series of instructions that can be trapped
and virtualized. Full virtualization is
capable of running existing, legacy
operating systems without modification;
however, it has significant costs in
complexity and runtime performance.
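The rewriting step can be sketched on a symbolic instruction stream. This is a deliberately simplified model: real binary patching works on machine code, not mnemonic strings, and the CALL_VMM pseudo-instruction and emulate_* handler names are invented for the example. STR, SGDT and SIDT are used because they are x86 instructions that read privileged state without trapping when executed at lower privilege.

```python
# x86 instructions that expose privileged state but do not trap in user mode.
SENSITIVE = {"STR", "SGDT", "SIDT"}


def patch(block):
    """Rewrite a block of guest instructions before it is executed.

    Each sensitive, non-trapping instruction is replaced by a call into the
    virtual machine monitor (hypothetical CALL_VMM pseudo-instruction), which
    can then emulate it against the guest's virtual state. All other
    instructions are passed through unchanged.
    """
    patched = []
    for insn in block:
        mnemonic = insn.split()[0]
        if mnemonic in SENSITIVE:
            patched.append(f"CALL_VMM emulate_{mnemonic.lower()}")
        else:
            patched.append(insn)
    return patched


guest_block = ["MOV EAX, 1", "STR AX", "ADD EAX, EBX", "SGDT [ESI]"]
patched_block = patch(guest_block)
for line in patched_block:
    print(line)
```

The guest operating system is never modified on disk; the cost is that every block of guest code must be scanned and possibly rewritten before it runs, which is where the complexity and runtime overhead come from.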
X86 Virtualization
• Paravirtualization modifies an operating
system to replace non-trappable x86
instructions with a series of calls directly
into a hypervisor (a virtual machine
monitor). It achieves high performance
with less complexity in the virtualization
layer but requires the guest operating
system to be substantially modified and
tied to a particular version of the
hypervisor.
Virtual Infrastructure
• All data center resources can be virtualized to
create a Virtual Infrastructure. The components
described in the chart below provide the
foundation to create virtual servers. A virtual
server consists of 32 or 64-bit CPUs, memory,
disks, network adapters, fibre channel adapters,
keyboard, video, and mouse. A virtual server can
run standard Linux and Windows operating
systems and applications.
Virtual Infrastructure
Physical Resource | Virtual Infrastructure
Industry standard Intel and AMD servers upon which the virtualization layer is automatically deployed | A Virtualized Node consists of a collection of CPUs and RAM that can be allocated to a virtual server
Each server can have multiple gigabit Ethernet cards (NICs) to provide required throughput and availability | Virtual servers connect through virtual NICs to physical or virtual networks
iSCSI, SAN and NAS storage technologies are used for reliable persistent storage | A collection of storage resources can be partitioned and allocated to virtual servers using raw mappings or virtual hard disks
Virtualization Tips
• In the VMware space, VirtualCenter is the
management tool of choice for ESX Server.
• Other products, like Hewlett-Packard's Virtual
Machine Management or IBM's Director
modules, are adding functionality to deal with
virtual machine [VM] environments.
• The problem is that most of these tools that are
snap-ins lack much of the simple functionality
you get in VirtualCenter.
• Most companies will end up buying both
VirtualCenter and the vendor's tool and use both
depending on what they are doing.
Virtualization Tips
• Shy away from large amounts of processing
when doing consolidation.
• If you are doing virtualization for other reasons,
like workload management, then you can get
nearly anything to run virtualized if you are
willing to change some of the things you do.
• However, if you are looking for maximum
consolidation ratios and high ROIs, stay away
from the quad boxes that are already running at
Security Tips
• At a minimum, apply some standard security measures:
– Disable remote root access
– Use sudo when needed
– configure the AD PAM modules for Windows
Security Tips
• Some organizations use too much surrounding
security and end up making their environment
slower, more difficult and expensive to manage.
• When dealing with the VMs, all of the standard
procedures should be followed.
• The host systems themselves should often be
considered appliances, and organizations
should limit the amount of customized agents
and security hacks performed on these systems.
Security Tips
• One should not go overboard with ESX hosts,
since they are basically appliances serving up
computing resources and should be treated as
such. Nevertheless, taking a common sense
approach to security on the servers is the best
• The most common mistakes made with virtual
security stem from ignorance: lack of
knowledge of the Linux console, failure to
understand how the virtual switch architecture
works, and not knowing what the host does not
directly see in the data in the VM disk files.
Security Tips
• The same practices that are performed to
secure a physical environment can, and
should, be used in a virtual environment
as well.
• Everything from proper VLAN/firewall
organization to host-based intrusion
detection should be leveraged to keep the
environment secure.
Scalability Tips
• Simplicity. The more complicated the design and
infrastructure, the less scalable it will be.
– For example, a common mistake in large
organizations is assuming that they cannot create
a simple solution because they are big. One can
argue that they should make the solution or design for
VMware as simple as possible to make it scalable for
the size of their organization and largest client base.
• Don't design the entire solution around the one-offs.
Scalability Tips
• When designing a virtual infrastructure, one
should never look at the environment and try to
plan one large infrastructure for the entire
virtualization project. It won’t work.
• Organize the overall environment into smaller
groupings of servers and address each grouping individually.
• When approached this way, at the end of the
project, a very scalable deployment
methodology that uses the same principles with
a manageable number of servers in various
phases of the project will be in place.