Towards Elastic Operating Systems

Amit Gupta, Ehab Ababneh, Richard Han, Eric Keller
University of Colorado, Boulder
OS + Cloud Today
[Diagram: OS/process behind an elastic load balancer (ELB) and cloud manager]

Resources limited:
• Thrashing
• CPUs limited
• I/O bottlenecks (network, storage)

Present workarounds:
• Additional scripting/code changes
• Extra modules/frameworks
• Coordination, synchronizing/aggregating state
Stretch Process

Advantages:
• Expands available memory
• Extends the scope of multithreaded parallelism (more CPUs available)
• Mitigates I/O bottlenecks (network, storage)
ElasticOS: Our Vision
ElasticOS: Our Goals
• "Elasticity" as an OS service
• Elasticize all resources: memory, CPU, network, …
• Single-machine abstraction: apps unaware whether they're running on 1 machine or 1,000 machines
• Simpler parallelism
• Compatible with an existing OS (e.g., Linux)
"Stretched" Process
• Unified address space across nodes
• [Diagram: elastic page table with V, R, and Location columns]
Movable Execution Context
• OS handles elasticity: apps don't change
• Partition locality across multiple nodes (useful for single and multiple threads)
• For multiple threads, seamlessly exploit network I/O and CPU parallelism
Replicate Code, Partition Data
[Diagram: code pages replicated on every node; Data 1 and Data 2 partitioned across nodes]
• Unique copy of data (unlike DSM)
• Execution context follows the data (unlike process migration or SSI)
Exploiting Elastic Locality
• We need an adaptive page-clustering algorithm
• LRU, NSWAP: "always pull"
• Execution follows data: "always jump"
• Hybrid (initial): pull pages, then jump
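The hybrid pull-then-jump policy can be sketched as follows. This is a simplified model, not the actual ElasticOS implementation; the class name, threshold value, and fault counters are all assumptions for illustration:

```python
# Sketch of the hybrid "pull pages, then jump" policy. On each remote
# page fault we normally pull the page to the local node (as an
# NSWAP-style "always pull" scheme would). If too many consecutive
# faults hit pages held by the same remote node, locality has likely
# shifted there, so we jump the execution context instead.

class HybridPlacer:
    def __init__(self, threshold=8):  # threshold value is assumed
        self.threshold = threshold
        self.remote_faults = {}  # node -> consecutive fault count

    def on_remote_fault(self, node):
        """Return 'pull' or 'jump' for a fault on a page held by `node`."""
        count = self.remote_faults.get(node, 0) + 1
        self.remote_faults = {node: count}  # resets counts for other nodes
        if count >= self.threshold:
            self.remote_faults = {}
            return "jump"   # move the execution context to `node`
        return "pull"       # fetch the page over the network

placer = HybridPlacer(threshold=3)
decisions = [placer.on_remote_fault("nodeB") for _ in range(3)]
# first two faults pull; the third crosses the threshold and jumps
```

The key design point is that pulls are cheap in isolation but jumps amortize better once faults cluster on one remote node.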
Status and Future Work
• Complete our initial prototype
• Improve our page-placement algorithm
• Improve context-jump efficiency
• Investigate fault-tolerance issues

Contact: amit.gupta@colorado.edu

Thank You
Questions?
Algorithm Performance (1)
[Performance chart]

Algorithm Performance (2)
[Performance chart]
Page Placement
Multinode Adaptive LRU
• Pull pages first; once the jump threshold is reached, move the execution context
• [Diagram: two nodes, each with Mem, CPUs, and Swap; pages are pulled between them until a jump occurs]
Locality in a Single Thread
• Temporal locality
• [Diagram: two nodes, each with Mem, CPUs, and Swap]
Locality across Multiple Threads
• [Diagram: multiple nodes, each with Mem, CPUs, and Swap]
Unlike DSM…
Exploiting Elastic Locality
• Assumptions: replicate code pages, place data pages (vs. DSM)
• We need an adaptive page-clustering algorithm
• LRU, NSWAP
• Ours (initial): pull pages, then jump
Replicate Code, Distribute Data
[Diagram: code pages replicated on each node; Data 1 and Data 2 placed on different nodes; execution moves to whichever node holds the data it is accessing]
• Unique copy of data (vs. DSM)
• Execution context follows the data (vs. process migration)
Benefits
• OS handles elasticity: apps don't change
• Partition locality across multiple nodes
• Useful for single (and multiple) threads
• For multiple threads, seamlessly exploit network I/O and CPU parallelism
Benefits (delete)
• OS handles elasticity
• Application ideally runs unmodified
• Application is naturally partitioned:
  • by page-access locality
  • by seamlessly exploiting multithreaded parallelism
  • by intelligent page placement
How should we place pages?
Execution Context Jumping
A single-thread example
[Diagram: one process whose address space spans Node 1 and Node 2; over time, execution jumps between the two nodes]
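A context jump moves only the thread's small execution state to the node that holds the data, rather than pulling data pages to the thread. A toy sketch of the idea, using an invented ExecutionContext record rather than the real kernel mechanism:

```python
from dataclasses import dataclass

# Toy model of a single-thread context jump: the execution state
# (program counter, registers, stack pages) is small, so shipping it
# to the node that owns the next data pages can beat pulling those
# pages over the network.

@dataclass
class ExecutionContext:
    pc: int
    registers: dict
    stack_pages: list
    node: str = "node1"

def jump(ctx, target_node):
    """Serialize the context and resume it on `target_node` (simulated)."""
    state = (ctx.pc, dict(ctx.registers), list(ctx.stack_pages))
    # ...a network transfer of `state` would happen here...
    pc, regs, stack = state
    return ExecutionContext(pc, regs, stack, node=target_node)

ctx = ExecutionContext(pc=0x4000, registers={"rax": 1}, stack_pages=[7, 8])
ctx = jump(ctx, "node2")  # execution now runs where Data 2 lives
```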
"Stretch" a Process
Unified address space
[Diagram: process address space spanning Node 1 and Node 2; the page table gains an IP-address field alongside the V and R bits]
Operating Systems Today
• Resource limit = 1 node
• [Diagram: OS/process confined to one node's Mem, Disks, and CPUs]
Cloud Applications at Scale
• More queries? Add a load balancer
• More resources? Ask the cloud manager
• [Diagram: processes, each with its own partitioned data, coordinated by a framework (e.g., MapReduce)]
Our Findings
• Important tradeoff: data-page pulls vs. execution-context jumps
• Latency cost is realistic
• Our algorithm's worst-case scenario, "always pull", matches NSWAP, with marginal improvements
Advantages
• Natural groupings: threads and pages
• Align resources with inherent parallelism
• Leverage existing mechanisms for synchronization
"Stretch" a Process: Unified Address Space
A "stretched" process = a collection of pages + other resources { across several machines }
[Diagram: page table with V, R, and IP-address fields mapping pages onto each node's Mem, Swap, and CPUs]
Exec. Context Follows Data (delete)
• Replicate code pages: read-only, so no consistency burden
• Smartly distribute data pages
• Execution context can jump: it moves towards the data (the converse is also allowed)
Elasticity in Cloud Apps Today
[Diagram: input data split into D1, D2, …, Dx and processed on nodes with Mem, Disk, and CPUs, producing output data]
[Diagram: input queries pass through a load balancer to nodes holding D1, D2, …, Dy (Mem, Disk, CPUs), producing output data]
Goals: Elasticity Dimensions (delete)
• Extend elasticity to:
  • Memory
  • CPU
  • I/O (network, storage)
Bang Head Here!
Stretching a Thread

Overlapping Elastic Processes

*Code Follows Data*

Application Locality

Multinode Adaptive LRU
Open Topics
• Fault tolerance
• Stack handling
• Dynamically linked libraries
• Locking
Elastic Page Table

Virtual Addr   Phys. Addr   Valid   Node (IP addr)   Location
A              B            1       localhost        local mem
C              D            0       localhost        swap space
E              F            1       128.138.60.1     remote mem
G              H            0       128.138.60.1     remote swap
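The elastic page table above can be modeled as an ordinary page table whose entries also carry a node address, so a fault handler can tell local faults from remote ones. A schematic sketch; the field layout and function name are assumptions, not the ElasticOS data structure:

```python
# Schematic elastic page table: each entry records, besides the valid
# bit, WHICH node (IP address) holds the page. A page fault may then
# be resolved locally (swap-in) or remotely (pull the page or jump
# the execution context to that node).

PAGE_TABLE = {
    # vaddr: (paddr, valid, node)
    "A": ("B", 1, "localhost"),     # local memory
    "C": ("D", 0, "localhost"),     # local swap
    "E": ("F", 1, "128.138.60.1"),  # remote memory
    "G": ("H", 0, "128.138.60.1"),  # remote swap
}

def classify(vaddr, local_node="localhost"):
    """Return where the page backing `vaddr` currently lives."""
    paddr, valid, node = PAGE_TABLE[vaddr]
    if node == local_node:
        return "local mem" if valid else "local swap"
    return "remote mem" if valid else "remote swap"

locations = [classify(v) for v in "ACEG"]
```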
"Stretch" a Process
• Move beyond the resource boundaries of ONE machine:
  • CPU
  • Memory
  • Network, I/O
[Diagram: input data partitioned into D1 and D2 across two nodes, each with CPUs, Mem, and Disk, producing output data]
[Diagram: D1 and D2 held in memory on two nodes, each with CPUs and Disk]
Reinventing Elasticity Wheel