virtualization

advertisement
Virtualization
… from a reliability perspective
DAIMI
Henrik Bærbak Christensen
1
Credits
Some slides from
– E6998 - Virtual Machines
Lecture 2
CPU Virtualization
Scott Devine
VMware, Inc.
DAIMI
Henrik Bærbak Christensen
2
Agenda
What is it?
– Overview
– Classifications
– Virtualization of CPU
Virtualization to improve reliability
– Fault Detection and Removal
• Configuration Testing
• Distribution Testing
• Record-n-Playback
– Fault Tolerance
• Recovery by snapshots
• Lock-step execution
DAIMI
Henrik Bærbak Christensen
3
Classes of Virtualization
DAIMI
Henrik Bærbak Christensen
4
What is it?
vir•tu•al (adj):
– existing in essence or effect, though not in actual
fact
Example
– ScummVM is a program which allows you to run
certain classic graphical point-and-click adventure
games, provided you already have their data files.
The clever part about this: ScummVM just replaces
the executables shipped with the games, allowing you
to play them on systems for which they were never
designed!
DAIMI
Henrik Bærbak Christensen
5
A Physical Machine
Hardware
– Processors, devices,
memory, etc.
Software
– Built to the given hardware
(Instruction Set
Architecture, e.g. x86)
– Built to given OS (App.
Programming Interface,
e.g. Win XP)
– OS controls hardware
DAIMI
Henrik Bærbak Christensen
6
A Virtual Machine
Hardware Abstraction
– Virtual processor, memory,
devices, etc.
Virtualization Software
– Indirection: Decouple
hardware and OS
– Multiplex physical
hardware across guest
VMs
DAIMI
Henrik Bærbak Christensen
7
Why VMs?
Many advantages (often business related)
– Utilization of resources
• VM1 that idles automatically free resources to VM2
– Of course not the case for physical machines
– Isolation
• Crash, virus, in VM1 does not propagate to VM2
– Encapsulation
•
•
•
•
A VM is a file: (OS,Apps,Data,Config, run-time state)
Content distribution: Demos as snapshot
Snapshot provides recovery point
Migration to new physical machines by snapshot/restore
– In e.g. server farms
DAIMI
Henrik Bærbak Christensen
8
Why VMs?
More advantages
– Maintenance costs
• Maintain 100 Linux-based web server machines versus a few
big VM platforms
– Legacy VMs
• Run ancient apps (like DOS or SCUMM-based games)
– Create once, run anywhere
• No configuration issues
– E.g. complex setup of app suite of database, servers, etc.
– … and some reliability related issues that we will
come back to…
DAIMI
Henrik Bærbak Christensen
9
Classification
DAIMI
Henrik Bærbak Christensen
10
Types of VMs
Smith & Nair 2005
Two superclasses
– Process VM
– System VM
Both can be subclassed based upon supporting
virtualization of same or different ISA (Instruction
Set Architecture).
DAIMI
Henrik Bærbak Christensen
11
Process / System VM
DAIMI
Henrik Bærbak Christensen
12
Process VM
The process VM puts the virtualization border
line at the process line.
– A application process/program executes in a
• Logical address space (assigned/handled by the OS)
• Using user-level instructions and registers (CPU user-mode)
• Only do IO using OS calls or High-Level Library(HHL) calls
Thus a process VM becomes a virtual OS in
which a process may execute.
Example: Different ISA Process VM
– JavaVM defines its own ISA (stack-based) as well as
normal OS operations
DAIMI
Henrik Bærbak Christensen
13
System VM
The system VM sets the virtualization border at
the system or hardware line
– A system/OS executes in a
• Physical memory space
• Using the full ISA of the underlying machine
• Interact directly with IO
Thus a system VM becomes a virtual machine
in which a system as a whole may execute
Example: Same ISA System VM
– VMWare: Executes Guest OS that is running on the
x86 ISA.
DAIMI
Henrik Bærbak Christensen
14
Other examples
System VM Different ISA
– VirtualPC running Windows on Mac
Classic VMs
– Virtual Machine Monitor (VMM) directly on hardware
• VMWare ESX
• Drivers 
Hosted VMs
– VMM runs as an application in an OS
• VMWare Workstation
• Drivers 
DAIMI
Henrik Bærbak Christensen
15
Exercise
How would you classify ScummVM?
DAIMI
Henrik Bærbak Christensen
16
How does it work?
Just a glimpse
DAIMI
Henrik Bærbak Christensen
17
Definition
[Rosenblum & Garfinkel 2005]
– A CPU architecture is virtualizable if it supports the
basic VMM technique of direct execution—executing
the virtual machine on the real machine, while letting
the VMM retain ultimate control of the CPU.
That is, in a ‘same ISA’, execution of a guest OS
instructions should be executed directly by the
hardware
– Why? Performance of course!
– Ex. JavaVM poses a performance penalty
DAIMI
Henrik Bærbak Christensen
18
Challenges
The problem is that not all instructions are
possible to execute directly!
To see the problem we have to dig a bit into
modern CPU architectures
– I stopped around Z80 and Motorola 68000 
– 80286:
• First with protected mode…
• Support multitasking, process protection, memory mgt., …
DAIMI
Henrik Bærbak Christensen
19
A few CPU terms
Supervisor mode/Privileged mode:
– “An execution mode on some processors which
enables execution of all instructions, including
privileged instructions. It may also give access to a
different address space, to memory management
hardware and to other peripherals. This is the mode
in which the operating system usually runs.”
(Wikipedia)
DAIMI
Henrik Bærbak Christensen
20
Supervisor/User mode
x86 architecture
– Ring 0: Kernel mode
• Can do anything
• OS and device drivers
– Ring 3: User mode
• Limited Instruction Set
• Apps can fail at any time without impact on rest of system!
• User applications
User apps must do system call (call OS) to
interact with hardware like device drivers...
DAIMI
Henrik Bærbak Christensen
21
Performance
Switching from “user mode” to “kernel mode” is,
in most existing systems, very expensive. It has
been measured, on the basic request getpid, to
cost 1000-1500 cycles on most machines.
DAIMI
Henrik Bærbak Christensen
22
Trapping
What happens if code in user mode executes a
privileged instruction?
– In computing and operating systems, a trap is a type
of synchronous interrupt typically caused by an
exceptional condition (e.g. division by zero or invalid
memory access) in a user process. A trap usually
results in a switch to kernel mode, wherein the
operating system performs some action before
returning control to the originating process.
DAIMI
Henrik Bærbak Christensen
23
Virtualization Requirements
If VMWare is an app, running in Windows, and it
runs the Linux OS “inside”, it follows that
– Privileged instructions have to run in user-mode!
VVM: Virtual Machine Monitor
– VVM runs in priviledged mode
– All ‘inside’ runs in user mode
DAIMI
Henrik Bærbak Christensen
24
VMM and privileged instructions
OK, so…
– Linux runs in user mode inside a VMM
– Now the Linux kernel disables interrupts (CLI)
• Which is certainly a privileged instruction which is not
allowed in user mode
So – what can we do about that?
– Emulation
– Trap-and-Emulate
– Binary Translation
DAIMI
Henrik Bærbak Christensen
25
Emulation Example: CPUState
static struct {
uint32 GPR[16];
uint32 LR;
uint32 PC;
int
IE;
int
IRQ;
} CPUState;
void CPU_CLI(void)
{
CPUState.IE = 0;
}
void CPU_STI(void)
{
CPUState.IE = 1;
}
CLI=Clear Interrupt Flag
STI=Set Interrupt Flag
IE = Interrupt Enabled
Goal for CPU virtualization techniques
– Process normal instructions as fast as possible
– Forward privileged instructions to emulation routines
DAIMI
Henrik Bærbak Christensen
26
Instruction Interpretation
Emulate Fetch/Decode/Execute pipeline in
software
Positives
– Easy to implement
– Minimal complexity
Negatives
– Slow!
DAIMI
Henrik Bærbak Christensen
27
Example: Virtualizing the Interrupt Flag
w/ Instruction Interpreter
void CPU_Run(void)
{
while (1) {
inst = Fetch(CPUState.PC);
CPUState.PC += 4;
switch (inst) {
case ADD:
CPUState.GPR[rd]
= GPR[rn] + GPR[rm];
break;
…
case CLI:
CPU_CLI();
break;
case STI:
CPU_STI();
break;
}
void CPU_CLI(void)
{
CPUState.IE = 0;
}
void CPU_STI(void)
{
CPUState.IE = 1;
}
void CPU_Vector(int exc)
{
CPUState.LR = CPUState.PC;
CPUState.PC = disTab[exc];
}
if (CPUState.IRQ
&& CPUState.IE) {
CPUState.IE = 0;
CPU_Vector(EXC_INT);
}
}
}
DAIMI
Henrik Bærbak Christensen
28
Guest OS + Applications
Page
Fault
Unprivileged
Trap and Emulate
Undef
Instr
MMU
Emulation
CPU
Emulation
I/O
Emulation
Privileged
vIRQ
Virtual Machine Monitor
DAIMI
Henrik Bærbak Christensen
29
The protocol…
The Linux kernel code contains CLI
– As in user mode, the CPU traps and transfer control
to the VVM
– VVM sets internal flag that interrupts are disabled
• Interrupts are now not delivered to the guest OS until it calls
the STI
• which again traps and the VVM clears the internal flag!
• From that time on, interrupts are again delivered.
Trap-and-Emulate:
– All user mode instructions execute full speed
– Privileged instructions are trapped and emulated…
DAIMI
Henrik Bærbak Christensen
30
Issues with Trap and Emulate
Not all architectures support it
Trap costs may be high
– Cf. performance measurements earlier
DAIMI
Henrik Bærbak Christensen
31
Challenges…
The x86 is not virtualizable 
– i.e. there are privileged instructions that do not trap!
Ex.
– POPF = pop CPU flag from stack
• Contains the interrupt flag bit and thus can enable/disable
interrupts without the VVM being notified!
Require advanced techniques to cope beyond
trap-and-emulate…
– Para Virtualization
– Binary Translation
DAIMI
Henrik Bærbak Christensen
32
Para Virtualization
Para virtualization = replacing non-virtualizable
portions of the original instruction set with easily
virtualizable and more efficient equivalents.
–  OS code has to be ported!
–  User apps run unmodified.
Ex. Disco: Change MIPS interrupt instruction to
read/write of special memory location in the
VVM
–  More efficient if handled by Trapping
–  Iris OS had to be ported…
DAIMI
Henrik Bærbak Christensen
33
Binary Translation
Basic idea:
– Read code blocks before the CPU gets them
– For each instruction classify
• “Ident”: copy directly to translation cache (TC)
• “Inline”: replace instruction by inline equivalent instructions
and copy these to TC
• “Callouts”: replace instruction by call to emulation code in
VVM
– Let CPU execute contents of translation cache
instead of original block
TC: keep the translated block for future
execution without any translation
DAIMI
Henrik Bærbak Christensen
34
Basic Blocks
Guest Code
vPC
mov
ebx, eax
cli
and
ebx, ~0xfff
mov
ebx, cr3
Straight-line code
Basic Block
sti
ret
DAIMI
Control flow
Henrik Bærbak Christensen
35
Binary Translation
Guest Code
vPC
mov
ebx, eax
cli
DAIMI
Translation Cache
mov
ebx, eax
call
HANDLE_CLI
and
ebx, ~0xfff
and
ebx, ~0xfff
mov
ebx, cr3
mov
[CO_ARG], ebx
sti
call
HANDLE_CR3
ret
call
HANDLE_STI
jmp
HANDLE_RET
Henrik Bærbak Christensen
start
36
Binary Translation
Guest Code
vPC
mov
ebx, eax
cli
Translation Cache
mov
ebx, eax
mov
[CPU_IE], 0
and
ebx, ~0xfff
and
ebx, ~0xfff
mov
ebx, cr3
mov
[CO_ARG], ebx
sti
call
HANDLE_CR3
ret
mov
[CPU_IE], 1
test
[CPU_IRQ], 1
start
jne
DAIMI
call
HANDLE_INTS
jmp
HANDLE_RET
Henrik Bærbak Christensen
37
CPU future…
DAIMI
Henrik Bærbak Christensen
38
Hypervisor mode
Modern/Future CPU architectures
– Make a CPU with a Ring -1
– VVM runs in Ring -1, and can run the OS in Ring 0.
– Recent CPUs from Intel and AMD offer x86
virtualization instructions for a hypervisor to control
Ring 0 hardware access. […] a guest operating
system can run Ring 0 operations natively without
affecting other guests or the host OS.
DAIMI
Henrik Bærbak Christensen
39
Virtualization and
Reliable Architectures?
What reliability can we get from using
VM techniques?
DAIMI
Henrik Bærbak Christensen
40
The three approaches
Three approaches to dependable systems
– Fault avoidance: simply avoid introducing defects!
– Fault detection and removal: Find and remove the
defects before they cause failures.
– Fault tolerance: Ensure that faults does not lead to
failures.
DAIMI
Henrik Bærbak Christensen
41
DAIMI
Henrik Bærbak Christensen
42
Fault Avoidance
I fail so see any here
– Fault avoidance in Sommerville’s perspective classify
a set of static techniques
• Programming languages, etc.
– VVM’s are dynamic techniques…
DAIMI
Henrik Bærbak Christensen
43
Fault Detection and Removal
Again, as in testing the detection aspect is
central. How does VVM’s help?
– Configuration Testing!
• HelpDesk:
– Large set of snapshots of typical customer configs
» Like Internet Explorer 6 on Windows XP
– Just resume this snapshot, reproduce defect and report to
development
• Configuration suite
– Configuration testing on a single tester’s machine
– Ex: I tested linux variants of my book’s code on my Win XP
VMWare within Ubuntu 9.0.4 installed
DAIMI
Henrik Bærbak Christensen
44
Fault Detection
Record and Playback
– http://blogs.vmware.com/workstation/2008/04/enhanc
ed-execut.html
– See VMWare demo later
DAIMI
Henrik Bærbak Christensen
45
Fault Detection
Distributed systems are complex
Make a virtual network with a set of virtual
machines
– Lower bandwidth of network connection
– Induce packet loss
– Kill machines to find defective behavior in the rest
– Test graceful degrading algorithms
DAIMI
Henrik Bærbak Christensen
46
Fault Tolerance
Backward recovery technique
– Restore previous state of system and restart
Ex. Hardware/device failure on machine
– Migrate last good snapshot to new machine
VMWare also works on migrating running VMs
– Capture run-time state of running VM
– Hot migration allows full hot standby techniques…
Dynamic VM management
– Load balancing by create/destroy VMs
– Hardware failure automatically migrate VMs
DAIMI
Henrik Bærbak Christensen
47
Fault Tolerance
Lock-step execution
– Primary VM has secondary VM running in shadow
operation
– If primary fails, the secondary acts as hot standby
– http://www.vmware.com/files/pdf/perf-vspherefault_tolerance.pdf
– See demo later…
DAIMI
Henrik Bærbak Christensen
48
Summary
Virtualization
– Isolate Apps+OS from hardware
– CPU near techniques to do this fast
• Trap-and-emulate & Binary Translation
Dependability relation
– Fault Detection
• Configurations; Tricky setups; record-playback
– Fault Tolerance
• Support hot standby
DAIMI
Henrik Bærbak Christensen
49
Download