Survey of State-of-the-art in Inter-VM
Communication Mechanisms
Jian Wang
 Introduction
 Shared memory research
 Scheduler optimization research
 Challenges and problems
[Figure: Virtual Machine A and Virtual Machine B running side by side on a hypervisor (or Virtual Machine Monitor) atop one physical machine]
 Virtualization technology is mainly focused on building the isolation barrier between co-located VMs.
 However, applications often wish to talk across this isolation barrier.
 E.g. high-performance grid apps, web services, virtual network appliances, transaction processing, graphics rendering.
Transparent to applications, BUT high communication overhead between co-located VMs:

                            Native Loopback   Xen Inter-VM
Flood Ping RTT (microsecs)          6              140
TCP Bandwidth (Mbps)             4666             2656
UDP Bandwidth (Mbps)             4928              707
Communication data path between co-located VMs
[Figure: a packet (PKT) from one VM is routed through Domain 0 to the other co-located VM]
[Figure: standard data path. VM 1 puts the packet into a page and asks Xen to transmit; the packet is routed through Domain-0; Xen is asked to swap/copy pages into VM 2]
Advantages of using Shared Memory:
 No need for per-packet processing
 Pages reused in circular buffer (see the sketch below)
 Writes are visible immediately
 Fewer hypercalls (only for signaling)
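To make the circular-buffer point concrete, here is a minimal sketch (not code from any of the surveyed systems) of the kind of lockless single-producer/single-consumer ring such channels place in shared pages; the names, sizes, and barrier choice are illustrative:

```c
#include <stdint.h>

#define RING_SIZE 4096            /* must be a power of two */

/* One direction of a channel, living in shared pages. The producer only
 * writes 'head' and the consumer only writes 'tail', so one producer and
 * one consumer need no lock; the pages are reused forever. */
struct ring {
    volatile uint32_t head;       /* next byte the producer will fill */
    volatile uint32_t tail;       /* next byte the consumer will read */
    uint8_t data[RING_SIZE];
};

/* Returns len if written, 0 if the ring lacks space. */
static uint32_t ring_put(struct ring *r, const uint8_t *buf, uint32_t len)
{
    uint32_t used = r->head - r->tail;       /* free-running counters wrap */

    if (len > RING_SIZE - used)
        return 0;
    for (uint32_t i = 0; i < len; i++)
        r->data[(r->head + i) & (RING_SIZE - 1)] = buf[i];
    __sync_synchronize();                    /* publish data before head */
    r->head += len;                          /* write is visible at once */
    return len;
}

/* Returns the number of bytes copied out (possibly 0). */
static uint32_t ring_get(struct ring *r, uint8_t *buf, uint32_t len)
{
    uint32_t avail = r->head - r->tail;

    if (len > avail)
        len = avail;
    for (uint32_t i = 0; i < len; i++)
        buf[i] = r->data[(r->tail + i) & (RING_SIZE - 1)];
    __sync_synchronize();                    /* read data before freeing */
    r->tail += len;
    return len;
}
```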
[Figure: shared-memory data path. VM 1 allocates one pool of pages and asks Xen to share the pages directly with VM 2 (sketched below)]
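In Xen, the "ask Xen to share pages" step goes through the grant table. A hedged sketch using the Linux grant-table call gnttab_grant_foreign_access (a real kernel API; the surrounding setup is simplified, and an auto-translated guest is assumed so pfn == gfn):

```c
#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/mm.h>
#include <xen/grant_table.h>

/* Share one freshly allocated page with a co-located peer domain.
 * Returns the grant reference the peer uses to map the page, or a
 * negative errno. (Sketch only: no teardown path shown.) */
static int share_page_with_peer(domid_t peer, struct page **out)
{
    struct page *page = alloc_page(GFP_KERNEL);
    int gref;

    if (!page)
        return -ENOMEM;

    /* Grant the peer read-write access to this frame. */
    gref = gnttab_grant_foreign_access(peer, page_to_pfn(page), 0);
    if (gref < 0) {
        __free_page(page);
        return gref;
    }

    *out = page;
    return gref;   /* advertise this gref to the peer, e.g. via XenStore */
}
```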
Design goals:
1. Performance: high throughput, low latency, and acceptable CPU consumption.
2. Transparency: no changes to the app, no changes to the kernel.
3. Dynamism: on-the-fly setup/teardown of channels; auto discovery; migration support.
[Figure: VCPU scheduling timeline. Time slices t1..t4 go to Dom1, Dom2, and other domains (DomX) in turn, so a domain with pending communication may wait several slices before it runs]
Scheduler-induced delays
[Figure: JBoss and DB exchanging query1/reply1 and query2/reply2. Running on dedicated servers, only network latency separates each query from its reply; running on a consolidated server, scheduler-induced delays are added on top of the network latency]
 Lack of communication awareness in the VCPU scheduler: it has no knowledge of the timing requirements of tasks/applications within each VM
 Absence of support for real-time inter-VM interactions
 Unpredictability of current VM scheduling mechanisms
What we want from the scheduler:
 Low latency
 Independent of other domains' workloads
 Predictable
Shared Memory Research
XenSocket (Xiaolan Zhang, Suzanne McIntosh)
 Shared memory between two domains
 One-way communication pipe
 Below socket layer
 Bypasses the TCP/IP stack
 No auto discovery, no migration support, no transparency
Standard TCP setup:
  Server: socket(); bind(sockaddr_inet); listen(); accept();
  Client: socket(); connect(sockaddr_inet);   (needs remote address, remote port #, local port #)

XenSocket setup (see the sketch below):
  Server: socket(); bind(sockaddr_xen);   (needs remote VM #; the system returns a grant # for the client)
  Client: socket(); connect(sockaddr_xen);   (needs remote VM #, remote grant #)
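From application code the change is essentially an address-family swap. The sketch below is a reconstruction from this slide, not the actual XenSocket header: the AF_XEN value and the sockaddr_xen field names are hypothetical.

```c
#include <stdint.h>
#include <sys/socket.h>

/* Hypothetical reconstruction: the real XenSocket header defines the
 * family number and address layout; these are illustrative only. */
#define AF_XEN 21

struct sockaddr_xen {
    sa_family_t sxe_family;    /* AF_XEN */
    uint16_t    remote_domid;  /* remote VM # */
    uint32_t    gref;          /* remote grant # (client side) */
};

/* Server: bind against the peer VM; per the slide, the system then
 * returns a grant # that is handed to the client out of band. */
int xensocket_server(uint16_t peer_vm)
{
    struct sockaddr_xen a = { .sxe_family = AF_XEN, .remote_domid = peer_vm };
    int s = socket(AF_XEN, SOCK_STREAM, 0);

    if (s < 0 || bind(s, (struct sockaddr *)&a, sizeof(a)) < 0)
        return -1;
    return s;
}

/* Client: connect with the peer VM # plus the grant # it was given. */
int xensocket_client(uint16_t peer_vm, uint32_t grant)
{
    struct sockaddr_xen a = { .sxe_family = AF_XEN,
                              .remote_domid = peer_vm, .gref = grant };
    int c = socket(AF_XEN, SOCK_STREAM, 0);

    if (c < 0 || connect(c, (struct sockaddr *)&a, sizeof(a)) < 0)
        return -1;
    return c;
}
```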
XWay (Kangho Kim, Cheiyol Kim)
 Bi-directional communication
 Transparent to applications
 Below socket layer
 Significant kernel modifications; no migration support; TCP only
[Figure: an XWay channel between Domain A and Domain B. Each side has a send queue (SQ) and a receive queue (RQ) in shared memory, each with head and tail pointers; an event channel carries notifications between the domains (signaling sketched below)]
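The notification side of such a channel uses Xen event channels. A sketch against the classic Linux Xen API (bind_interdomain_evtchn_to_irqhandler and notify_remote_via_irq are real calls; the handler body and names are illustrative):

```c
#include <linux/interrupt.h>
#include <xen/events.h>

/* Runs when the peer kicks the channel: the peer appended entries to
 * its send queue, so drain our receive queue here. */
static irqreturn_t xway_evtchn_handler(int irq, void *dev_id)
{
    /* rq_drain(dev_id);  hypothetical: consume RQ entries head..tail */
    return IRQ_HANDLED;
}

/* Bind our end of the interdomain event channel and route notifications
 * through an ordinary interrupt handler. Returns the local irq. */
static int xway_bind_channel(domid_t peer, evtchn_port_t peer_port, void *ch)
{
    return bind_interdomain_evtchn_to_irqhandler(peer, peer_port,
                                                 xway_evtchn_handler,
                                                 0, "xway", ch);
}

/* One-bit notification: tell the peer its receive queue has new data. */
static void xway_kick_peer(int irq)
{
    notify_remote_via_irq(irq);
}
```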
IVC (Wei Huang, Matthew Koop)
 IVC library providing efficient intra-physical-node communication through shared memory
 Provides auto discovery and migration support
 User transparency and kernel transparency not fully supported; only the MPI protocol is supported
IVC consists of two parts:
 A user-space communication library
 A kernel driver
It uses a general socket-style interface.
MMNet (Prashanth Radhakrishnan, Kiran Srinivasan)
 Maps in the entire physical memory of the peer VM
 Zero copy between guest kernels
 On-the-fly setup/teardown of channels not supported
 In their model, VMs need to fully trust each other, which is not practical
XenLoop (Jian Wang, Kartik Gopalan)
 Enables direct traffic exchange between co-located VMs
 Transparency for applications and libraries
 Kernel transparency
 Automatic discovery of co-located VMs
 On-the-fly setup/teardown of XenLoop channels
 Migration transparency
XenLoop Architecture
 A Netfilter hook below the network layer captures and examines outgoing packets (see the sketch after this slide).
 Lockless producer-consumer circular buffers, FIFO A->B and FIFO B->A, carry data directly between the two VMs.
 A one-bit bidirectional event channel notifies the other endpoint that data is available in the FIFOs.
 A domain discovery module in Domain 0 tracks which VMs are co-located.
[Figure: in each of Virtual Machine A and Virtual Machine B the stack is Applications, Socket Layer, Transport Layer, Network Layer, XenLoop Layer, Netfront; co-located traffic crosses the shared FIFOs, while other traffic follows the normal path through the software bridge in Domain 0]
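A sketch of what the outgoing-packet hook could look like with the modern Linux Netfilter API; the co-location lookup and FIFO send are hypothetical stand-ins for XenLoop's real modules:

```c
#include <linux/init.h>
#include <linux/ip.h>
#include <linux/netfilter.h>
#include <linux/netfilter_ipv4.h>
#include <linux/skbuff.h>
#include <net/net_namespace.h>

/* Hypothetical stand-ins for XenLoop's discovery table and FIFO code. */
static bool dest_is_colocated(__be32 daddr) { return false; }
static void fifo_send(struct sk_buff *skb)  { kfree_skb(skb); }

/* Examine each outgoing IP packet; if the destination VM is co-located,
 * steal the packet from the normal netfront path and push it into the
 * shared-memory FIFO instead. */
static unsigned int xenloop_out_hook(void *priv, struct sk_buff *skb,
                                     const struct nf_hook_state *state)
{
    struct iphdr *iph = ip_hdr(skb);

    if (dest_is_colocated(iph->daddr)) {
        fifo_send(skb);          /* deliver via FIFO + event-channel kick */
        return NF_STOLEN;        /* we now own the skb */
    }
    return NF_ACCEPT;            /* normal path via netfront and bridge */
}

static struct nf_hook_ops xenloop_ops = {
    .hook     = xenloop_out_hook,
    .pf       = NFPROTO_IPV4,
    .hooknum  = NF_INET_POST_ROUTING,
    .priority = NF_IP_PRI_LAST,
};

static int __init xenloop_init(void)
{
    return nf_register_net_hook(&init_net, &xenloop_ops);
}
```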
                      XenSocket      XWay              IVC             MMNet      XenLoop
User Transparent      X              √                 X               √          √
Kernel Transparent    √              X                 X               √          √
Transparent           X              X                 Not fully       X          √
  Migration Support                                    transparent
Standard protocol     X              Only TCP          Only MPI or     √          √
  support                                              app protocols
Auto VM Discovery     X              X                 √               √          √
  & Conn. Setup
Complete memory       √              √                 √               X          √
  isolation
Location in           Below socket   Below socket      User library    Below IP   Below IP
  Software Stack      layer          layer + syscalls                  layer      layer
Copying Overhead      2 copies       2 copies          2 copies        2 copies   4 copies
                                                                                  at present
Scheduler Optimization Research
 Preferentially schedule communication-oriented domains
   Introduces short-term unfairness: performance vs. fairness
 Address inter-VM communication characteristics
Sriram Govindan, Arjun R Nath
 Prefer the VM with the most pending network packets, both to be sent and to be received (see the sketch below)
 Predict pending packets: receive prediction and send prediction
 Fairness: reservation guarantees are still preserved over a coarser time scale, the PERIOD
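A hedged sketch of the selection rule, with illustrative structures: among domains that still have reservation left in the current PERIOD, pick the one with the most predicted pending packets.

```c
#include <stddef.h>

struct dom {
    struct dom *next;
    int pending_rx;   /* predicted packets waiting to be received */
    int pending_tx;   /* predicted packets waiting to be sent */
    int cpu_left;     /* reservation remaining in the current PERIOD */
};

/* Network-aware pick: prefer the runnable domain with the most pending
 * network work, but skip any domain that has used up its reservation,
 * so guarantees still hold over the coarser PERIOD time scale. */
static struct dom *pick_next(struct dom *runq)
{
    struct dom *best = NULL;

    for (struct dom *d = runq; d; d = d->next) {
        if (d->cpu_left <= 0)
            continue;                /* out of reservation this PERIOD */
        if (!best || d->pending_rx + d->pending_tx >
                     best->pending_rx + best->pending_tx)
            best = d;
    }
    return best;   /* NULL: fall back to the default scheduling policy */
}
```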
Packet Reception
[Figure: guest domains Domain 1, Domain 2, ..., Domain n above the hypervisor. A packet arrives at the NIC and raises an interrupt; the hypervisor increments Domain0.pending and schedules Domain 0; Domain 0 demultiplexes the packet, so Domain0.pending is decremented and Domain1.pending incremented; the scheduler then prefers Domain 1, and Domain1.pending is decremented when it receives the packet]
Diego Ongaro, Alan L. Cox
 Boosting I/O domains
   Used when an idle domain is sent a virtual interrupt
 Run-queue ordering (see the sketch below)
   Within each state, sort domains by credits remaining
 Tickling too soon
   Don't tickle while sending virtual interrupts
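A sketch of the run-queue ordering idea with illustrative structures (not Xen's actual credit-scheduler code): keep entries grouped by scheduling state, and within a state insert sorted by credits remaining.

```c
#include <stddef.h>

struct vcpu_ent {
    struct vcpu_ent *next;
    int prio;      /* scheduling state, e.g. BOOST > UNDER > OVER */
    int credits;   /* credits remaining */
};

/* Keep the run queue grouped by state and, within each state, sorted by
 * credits remaining (most first), instead of plain FIFO insertion. */
static void runq_insert(struct vcpu_ent **head, struct vcpu_ent *v)
{
    while (*head && ((*head)->prio > v->prio ||
                     ((*head)->prio == v->prio &&
                      (*head)->credits >= v->credits)))
        head = &(*head)->next;
    v->next = *head;
    *head = v;
}
```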
Hwanju Kim, Hyeontaek Lim
 Use task information to determine whether a domain that receives an event notification is I/O-bound
 Give the domain a partial boost if it is I/O-bound
 Partial boosting (sketched below)
   A partially boosted VCPU can preempt a running VCPU and handle the pending event
   Whenever it is inferred to be non-I/O-bound, the VMM revokes the CPU from the partially boosted VCPU
 Use correlation information to predict whether an event is directed at I/O tasks
   Block I/O
   Network I/O
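A self-contained sketch of the partial-boosting logic; the inference heuristic and all names are illustrative stand-ins for the VMM's task-aware tracking.

```c
#include <stdbool.h>

struct vcpu { bool boosted; };
struct domain {
    struct vcpu *vcpu;
    int recent_io_events;    /* hypothetical signal from task tracking */
    int recent_cpu_ticks;
};

/* Hypothetical inference: a domain whose recently woken tasks mostly
 * did I/O is treated as I/O-bound. */
static bool is_io_bound(const struct domain *d)
{
    return d->recent_io_events > d->recent_cpu_ticks;
}

static void partial_boost(struct vcpu *v) { v->boosted = true; }
static void revoke_boost(struct vcpu *v)  { v->boosted = false; }

/* On an event notification: boost only if the likely recipient task is
 * I/O-bound, letting the VCPU preempt and handle the pending event. */
static void on_event(struct domain *d)
{
    if (is_io_bound(d))
        partial_boost(d->vcpu);
}

/* The boost is provisional: revoke the CPU as soon as the domain is
 * inferred to be doing non-I/O work. */
static void scheduler_tick(struct domain *d)
{
    if (d->vcpu->boosted && !is_io_bound(d))
        revoke_boost(d->vcpu);
}
```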
Jian Wang, Kartik Gopalan
[Figure: scheduling timeline in which Dom1 and other domains (DomX) occupy time slices t1..t4 while Dom2 waits]
 Dom2 cannot get a time slice as early as it needs one
[Figure: within one 30 ms time slice, Dom1 hands the remainder of its slice to Dom2 (one-way AICT), or Dom1 and Dom2 hand it back and forth (two-way AICT)]
Basic idea
 Donate unused time slices to the target domain (see the accounting sketch below)
Proper accounting
 When the source domain donates a time slice to the target guest, charge credits to the source domain instead of the target domain
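A sketch of the donation accounting with illustrative structures: the target runs on the donated remainder of the slice, but the debit lands on the source's credits, so long-run fairness is preserved.

```c
#include <stddef.h>

struct sdom {
    int credits;    /* credit balance used by the scheduler */
    int runnable;
};

/* Called when 'source' blocks (e.g. after sending an inter-VM request)
 * with 'remaining' ms left in its 30 ms slice. The target runs on the
 * donated time, but the charge lands on the source's credits, so the
 * scheduler's long-run accounting stays fair. */
static struct sdom *donate_slice(struct sdom *source, struct sdom *target,
                                 int remaining)
{
    if (!target->runnable || remaining <= 0)
        return NULL;               /* nothing to donate */
    source->credits -= remaining;  /* charge the source, not the target */
    return target;                 /* dispatch target for 'remaining' ms */
}
```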
 Real-time guarantees
 Coordination with the guest scheduler
 Compositional VM systems, e.g. Web Server (Dom1) -> Application Server (Dom2) -> Database Server (Dom3)
For co-located inter-VM communication:
 Shared memory greatly improves performance
 Optimizing the scheduler also brings significant benefits
Thank You.
Questions?
Backup slides
XenLoop Performance
[Chart: Netperf UDP_STREAM results]
XenLoop Performance (contd.)
[Chart]

XenLoop Performance (contd.)
[Chart]
XenLoop Performance (contd.)
Migration Transparency
[Chart: bandwidth as the two VMs are co-located, then separated, then separated again]
Future Work
 Compatibility with routed-mode Xen setup
   Implemented; under testing
 Packet interception between the socket and transport layers
   Do this without changing the kernel
   Will reduce 4 copies to 2 (as in the other systems), significantly improving bandwidth performance
 XenLoop for Windows guests?
   A Windows <-> Linux XenLoop channel
   The XenLoop architecture is mostly OS-agnostic
Download