A Primer on Virtualization

advertisement
A Primer on Virtualization
Software & Services Group
1
Agenda
• Westmere EP on Intel Roadmap
• Adv. Enc. Std: New Instructions (AES-NI) in
Westmere EP
• Intel® Virtualization Technology – a primer
–
–
–
–
–
–
–
Evolution of Virtualization
Intel Virtualization Technologies
A Framework for Optimizing Virtualization – improving efficiency and scaling
Reducing Virtualization Overheads: Processor, Memory, I/O
Improving VM scaling
VMM readiness
Improving VMM security through TXT
• Summary
Content condensed from: “Intel® Virtualization Technology in the Enterprise”, Rich Uhlig, Intel
Fellow, IDF 2009 San Francisco USA.
Software & Services Group
Evolution of Virtualization
OS
OS
OS
OS
OS
OS
VMM
VMM
VMM
Host
Host
Host
Flexible
Resource
Management
VMM
Host
Basic
Consolidation
OS
OS
VMM
Host
No Virtualization
OS
Host
OS
Host
OS
Host
Drives Priorities:
• Lower VMM Overheads
• Richer Workloads in VMs
• Seamless VM Migration
• Power Efficiency
• Increased Scaling (VMs, vCPUs)
• Improved Reliability
Software & Services Group
3
Intel® Virtualization Technologies
Intel® VT-x
Processor
Intel® VT-d
Chipset
Intel® VT-c
Network
Intel® VT-x
Hardware assists for robust virtualization
Intel® VT FlexMigration – Flexible live migration
Intel® VT FlexPriority – Interrupt acceleration
Intel® EPT – Memory Virtualization
Intel® VT for Directed I/O
Reliability and Security through device Isolation
I/O performance with direct assignment
Intel® VT for Connectivity
NIC Enhancement with VMDq
Single Root IOV support
Network Performance and reduced CPU utilization
Intel® I/OAT for virtualization
Lower CPU Overhead and Data Acceleration
Software & Services Group
A Framework for Optimizing Virtualization
•
Reduce overheads from virtualization
•
•
•
•
•
•
Intel® VT-x Latency Reductions
Extended Page Tables
Virtual Processor IDs
APIC Virtualization (Flex Priority)
I/O Assignment via DMA Remapping
Introduce capabilities that increase
scaling out across VMs
•
•
•
Intel® Hyper-Threading Technology
PAUSE-loop Exiting
Network Virtualization
2. … and scale out across VMs
1.
Reduce
overhead
within
each VM…
OS
OS
…
OS
VMM
Host
Intel® Virtualization Technology (Intel® VT) for IA-32, Intel® 64 and Intel® Architecture (Intel® VT-x)
5
Software & Services Group
Reducing VT Latencies
What makes VM Context Switching expensive
Virtual Machine Ctl Structure (VMCS)
– Maintains Guest & Host reg. state
– “Backed” by host physical memory
Saving-Loading privileged state
• Compounded by consistency checking
Addressing Context changes
• Translation Lookaside Buffer (TLB) flushes
–Accessed via architectural VMREAD/VMWRITE
–Enables caching of VMCS state on-die
Virtual Processor IDs (VPIDs)
–Tag µarch structures (TLBs)
–Removes need to flush TLBs
Software & Services Group
6
Intel® Virtualization Technology (VT-x)
• VT-x® provides architected assists to allow guest OSes to run directly on hardware
• On Nehalem and Westmere VT-x is extended with:
Extended Page Tables (EPT)
Eliminates VM exits to the VMM for shadow pagetable maintenance
Virtual Processor IDs (VPID)
Avoid flushes on VM transitions to give a lower-cost
VM transition time
Guest Preemption Timer -- lets a VMM preempt a guest OS
Aids VMM vendors in flexibility and Quality of
Service (QoS)
Descriptor Table Exiting –Traps on modifications of guest DTs
Allows VMM to protect a guest from internal attack
Transition Latency reductions
Continuing improvements in microarchitectural
handling of VMM round trips
Software & Services Group
Issues with abstracting physical memory
Address Translation
• Guest OS expects contiguous, zerobased physical memory
• VMM must preserve this illusion
Page-table Shadowing
• VMM intercepts paging operations
• Constructs copy of page tables
VM0
VMn
Guest
Page Tables
Guest
Page Tables
Induced
VM Exits
Remap
VMM
Shadow
Page Tables
Overheads
• VM exits add to execution time
• Shadow page tables consume
significant host memory
TLB
CPU0
Memory
Software & Services Group
8
How Extended Page Tables help with abstracting Physical
Memory
Extended Page Tables (EPT)
• Map guest physical to host address
• New hardware page-table walker
Performance Benefit
• A guest OS can modify its own page tables freely and without VM exits
Memory Savings
• A single EPT supports entire VM: instead of a shadow pagetable per guest process
Intel® Virtualization Technology (Intel® VT) for IA-32, Intel® 64 and Intel® Architecture (Intel® VT-x)
9
Software & Services Group
Linear Addr
Extended Page Table (EPT) Walk
CR3
EPTP
L4
L3
L2
L1
EPTP
L4
L3
L2
L1
EPTP
L4
L3
L2
L1
EPTP
L4
L3
L2
L1
EPTP
L4
L3
L2
L1
PML4
TLB
PDP
PDE
PTE
Physical Addr
2-level TLB reduces page-table walks
• VPID tags help to retain TLB entries
Paging-structure caches
• Cache intermediate steps in walk
• Result: Reduce length of walks
• Common case: Much better than 24 steps
Software & Services Group
10
Difficulties in virtualizing I/O
1
Virtual Device Interface
• Traps device commands
• Translates DMA operations
• Injects virtual interrupts
Software Methods
• I/O Device Emulation
• Paravirtualize Device Interface
Para- 2
virtualization
I/O Device
Emulation
VM0
VMn
Guest Device
Driver
Guest Device
Driver
Device Model
Device Model
Physical Device Driver
VMM
Challenges
• Controlling DMA and interrupts
• Overheads of copying I/O buffers
Storage
Memory
Network
Software & Services Group
11
Solution: DMA remapping (Intel® VT-d)
DMA Requests
Dev 31, Func 7
Device ID
Virtual Address
Length
Bus 255
Dev P, Func 2
4KB
Page
Frame
Bus N
Fault Generation
Bus 0
Dev P, Func 1
4KB Page
Tables
Dev 0, Func 0
Device D1
DMA Remapping
Engine
Translation Cache
Context Cache
Memory Access with Host Physical Address
Device
Assignment
Structures
Address Translation
Structures
Device D2
Address Translation
Structures
Memory-resident Partitioning &
Translation Structures
Software & Services Group
12
Intel® VT FlexPriority
APIC Task Priority Register (TPR)
• Controls interrupt delivery through the APIC
• Accessed very frequently by some guest OSes
Without Intel VT FlexPriority
With Intel VT FlexPriority
VM
VM
Guest OS
No VM
Exits
VM
Exits
VMM
APIC-TPR access
in software
• Fetch/decode instruction
• Emulate APIC-TPR behavior
• Thousands of cycles per exit
VMM
Guest OS
APIC TPR access
in HW
configure
• Instruction executes directly
• Hardware emulates APIC-TPR access
• No VM exit in the common case
Software & Services Group
13
Technologies to improve VM scaling
• Hyper-threading
• Decreasing lock holder preemption impact
• Network virtualization with Virtual Machine Device Queues
• Single Root I/O Virtualization (SR-IOV)
Software & Services Group
14
Reducing Lock Holder preemption impact
Problem:
• In an SMP guest, a vCPU holding a lock may get preempted
• Other vCPUs that attempt to acquire that lock spin for the full quantum
Solution: Pause-Loop Exiting (PLE)
• Spin-locking code typically uses PAUSE instructions in a loop
• A longer than “normal” loop duration taken as a sign of lockholder preemption
• When that happens, HW forces an exit into VMM
• VMM takes control and schedules some other vCPU
spin_lock:
attempt lock-acquire;
if fail {
PAUSE;
jmp spin_lock
}
Software & Services Group
15
Network Virtualization: HW traffic management
Virtual Machine Device Queues (VMDq)
•
Data packets grouped and sorted in HW
•
Packets sent to their respective VMs
•
Round-robin servicing on transmit
10.0
9.2
9.5
4.0
(vNIC)
(vNIC)
(vNIC)
Layer 2 Software Switch
Rx1
Rx1
Rx1
Rxn
Rxn
Rx2
NIC w/VMDq
4.0
0.0
w/o VMDq
w/ VMDq
Source Intel.
w/ VMDq
Jumbo
Frames
Tests measure Wire Speed Receive (Rx) Side Performance With
VMDq on Intel® 82598 10 Gigabit Ethernet Controller
Intel® Virtualization Technology (Intel® VT) for Connectivity (Intel® VT-c)
16
VMn
Layer 2 Classified Sorter
6.0
2.0
VM2
VMM
Throughput (Gbps)
8.0
VM1
LAN
MAC/PHY
Rx1
Rx2
Rx1
Rxn
Rx1
Rxn
Idle
Idle
Txn
Txn
Tx2
Txn
Tx2
Tx1
Software & Services Group
PCI-SIG SR-IOV
• Description:
– PCI-SIG Single Root I/O Virtualization (SR-IOV) Standard:
– Allows for I/O devices to be simultaneously shared among VMs
– Virtual interfaces can be directly assigned to reduce routing overheads
• Benefits:
– Provide near native performance due to direct connectivity
– Allows direct VM control of I/O virtual functions
• Requirements:
– Intel VT-d as the core platform ingredient
– BIOS support of SR-IOV
Software & Services Group
Westmere: Trusted Execution Technology (TXT)
Guest
Guest
Server Extensions
VM
VM
• TXT uses features in processor, chipset
and TPM to enable more secure platforms
• TXT works through measurement,
memory locking and sealing secrets
• TXT helps prevent software attacks
Guest
VM
VMM
Rootkit Hypervisor
Platform hardware
with VT-x support
Helps prevent hijacking by rootkit
Processor
Processor
– such as, attempts to insert rogue VMM
(rootkit hypervisor)
Chipset
– and, reset-attacks that are designed to
TPM
compromise platform secrets in memory
– and, BIOS and firmware update attacks TXT technology incorporates multiple components
VT
TXT Makes Platforms More Robust Against SW-based Attacks
Software & Services Group
Intel VT Features/VMM Support
Feature
Virtualization
PWR
Microsoft
Hyper-V
Xen OSS
Red Hat
(RHEL)
KVM
Novell
(SLES)
Min
Rev
Date
Min
Rev
Date
Min
Rev
Date
Min
Rev
Date
Rev
Date
Min
Rev
Date
Min Rev for
Nehalem
3.5U4
Now
Win Svr
’08
Now
Any
Now
Any
Now
Any
Recent
Now
Any
Now
FlexPriority
3.5U4
Now
Win Svr
’08
Now
3.1
Now
5.2
Now
2.6.24
Now
10
SP1
Now
FlexMigration
3.5U4
Now
Win Svr
’082
Now
3.3
Now
TBD
TBD
TBD
TBD
EPT + VPID
4.0
Now
Win Svr
’08 R2
1H’10
3.3
Now
5.3
Now
2.6.26
Now
10
SP2
Now
Chipset (VT-d)
VT-d2
4.0
Now
TBD
TBD
3.3
Now
5.4
Q3’09
2.6.31
Now
11
Now
Network
(VT-c)
SR-IOV
TBD
TBD
TBD
TBD
3.41
Now3
TBD
TBD
TBD
TBD
TBD
TBD
VMDq
3.5U4
Now
Win Svr
’08 R2
1H’10
TBD
TBD
TBD
TBD
TBD
TBD
TBD
TBD
Power Mgmt (S,
4.0
Now
1.0 (Sstate TBD)
Now
3.3
Now
TBD
TBD
11
Now
SSE 4.2
4.0
Now
TBD
TBD
Any
Now
Any
Now
Any
Now
Turbo
4.0
Now
1.0
Now
Any
Now
Any
Now
Any
Now
HT (SMT)
3.5U4
Now
1.0
Now
Any
Now
Any
Now
Any
Now
Processor (VT-x)
Other
VMware ESX
P, C, T States)
Notes: Table is based on latest information as of: VT-c : July 09; VT-x and VT-d: May 09
and subject to change; Please contact vendors directly for
unreleased products
1 Requires other ecosystem support
2
3
Product allows capability but robustness improvements in future releases
PCI-SIG SR-IOV standard is supported now
Software & Services Group
Software & Services Group
20
Download