http://www.eas.asu.edu/~calypso
Parallel Processing with
Windows NT Networks
Collaborators:
Zvi M. Kedem
Donald McLaughlin
Shantanu Sardesai
Rahul Thombre
Partha Dasgupta
Arizona State University
The MILAN Project
New York University
Arizona State University
Funding Sources: DARPA/Rome Laboratory, NSF, Intel, and Microsoft
ECLIPSE
Calypso
Linux 1.0
Malaxis
Joint Research of
Arizona State University
and New York University
+
Chime
The Platforms
 Calypso
 Language-independent parallel processing
 Shared memory and fault tolerance.
 Chime
 CC++-based parallel processing
 Shared memory, fault tolerance
 Malaxis
 DSM package for Windows NT
 Read/write locking, barriers
 Milan
 A metacomputing platform
 Coalesces features from the above systems into a general-purpose computing platform
Unix to Windows NT
 Port a system program or middleware from
Unix to Windows NT.
How?
 Just change the system calls?
 Does not work.
 Change programming and design styles to NT-centric:
 no signals in NT
 use structured exception handling (no such thing in Unix)
 use threads (useful)
 integrate with Windows messages or MFC
 remote execution support is weak
 Learn NT-centrism, and NT lingo
NT Terminology
 MSDN is not a network
 Developer’s library contains books
 Resource Kit is not about resources
 Huh?
 SDK, DDK, checked build
 Service Pack
 OSR2
 Remote access does not let you execute anything remotely
 Use a Share?
 You mean remote mount? No, I mean map network drive
 Memory can be reserved or committed or both.
 Synchronization primitives - never mind...
What is Calypso?
 Yet another parallel processing system, which runs on a
distributed network of microcomputers:
 Shared Memory
 Novel execution and memory management strategy
 Fault Tolerant:
 Machines may stop and start dynamically without affecting the
execution
 Automatic Load Balancing:
 Manages slow and fast machines
 Provides near-optimal thread assignments (measured)
 Execution strategy hidden from programmer:
 No message passing, process management, data partitioning
 Low-overhead mechanisms
Key Techniques in Calypso
 Eager Scheduling
 Manager-worker architecture
 Provides fault-tolerant and load-shared executions with minimal overhead
 Two-phase Idempotent Execution Strategy
 Distributed memory management strategy
 Stops side effects due to failures
 Ensures idempotence of results, in spite of duplicate executions
 These techniques were developed in previous joint theoretical research
[Figure: one manager connected to two workers]
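The idea of idempotent results under duplicate executions can be sketched as follows. This is only an illustration under stated assumptions, not Calypso's actual mechanism: the function `run_parallel_step`, the dictionary used as shared memory, and the rule that the first result to arrive wins are all hypothetical. Each execution reads the same pre-step snapshot, its writes are buffered, and the buffered writes are applied only once every task has reported.

```python
def run_parallel_step(shared, tasks, arrivals):
    """Run one parallel step with buffered, first-result-wins updates.

    shared   -- dict standing in for shared memory before the step
    tasks    -- dict: task id -> function(snapshot) returning a dict of updates
    arrivals -- task ids in the order executions finish; an id may repeat
                (duplicate execution) or be absent (its worker failed)
    """
    snapshot = dict(shared)      # phase 1: every execution reads pre-step state
    buffered = {}                # task id -> updates from the first finisher
    for task_id in arrivals:
        updates = tasks[task_id](snapshot)
        buffered.setdefault(task_id, updates)   # later duplicates are ignored

    missing = set(tasks) - set(buffered)
    if missing:
        # A task never reported. Nothing has been written to shared memory,
        # so the missing tasks can simply be run again.
        return False, missing

    for task_id in sorted(buffered):            # phase 2: commit once, in bulk
        shared.update(buffered[task_id])
    return True, set()


def make_task(i):
    # Hypothetical task: squares one input cell into one output cell.
    return lambda mem: {f"out{i}": mem[f"in{i}"] ** 2}


memory = {"in0": 2, "in1": 3}
tasks = {0: make_task(0), 1: make_task(1)}
# Task 0 is executed twice; the second result changes nothing.
ok, missing = run_parallel_step(memory, tasks, arrivals=[0, 1, 0])
```

Because no execution writes to shared memory directly, a failed worker leaves no side effects behind, and running a task twice yields the same committed state as running it once.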
Eager Scheduling
 Workers contact the manager for work after finishing
previous assignment, if any
 When there is unfinished work, the manager has the option of
assigning an unfinished thread to a “willing” worker regardless of who
is already working on that thread
 An example of Round Robin Eager Scheduling:
 3 machines: fast, slow and transient
 12 threads of equal length (50 secs)
[Figure: timeline of threads 1 to 12 on workers A, B, and C, with "Worker interrupted" and "Worker crashed" events marked; time axis from 50 to 400]
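The scheduling rule described on this slide can be sketched as a small event-driven simulation. It is a minimal sketch under assumptions that are not in the slides: the function name `eager_schedule`, the relative `speeds`, the `crash_at` parameter, and the particular round-robin cursor are all hypothetical, and the run below does not reproduce the slide's figure. A worker that becomes idle is handed the next unfinished thread, even if another worker already holds it, and only the first reported result counts.

```python
import heapq


def eager_schedule(num_tasks, task_len, speeds, crash_at=None):
    """Simulate round-robin eager scheduling.

    speeds   -- dict: worker name -> relative speed (1.0 runs a task in task_len)
    crash_at -- dict: worker name -> time after which it never reports back
    Returns (time of the last accepted result, dict task -> accepting worker).
    """
    crash_at = crash_at or {}
    done = {}       # task -> worker whose result was accepted first
    cursor = 0      # round-robin position over task ids
    events = []     # min-heap of (finish_time, worker, task)

    def assign(worker, now):
        nonlocal cursor
        unfinished = [t for t in range(num_tasks) if t not in done]
        if not unfinished:
            return
        # Next unfinished task at or after the cursor, wrapping around.
        # It may already be running on another worker: that is the eager part.
        task = min(unfinished, key=lambda t: (t - cursor) % num_tasks)
        cursor = (task + 1) % num_tasks
        finish = now + task_len / speeds[worker]
        if finish <= crash_at.get(worker, float("inf")):
            heapq.heappush(events, (finish, worker, task))
        # Otherwise the worker crashes mid-task and its result never arrives.

    for worker in sorted(speeds):
        assign(worker, 0.0)

    makespan = 0.0
    while events:
        now, worker, task = heapq.heappop(events)
        if task not in done:          # duplicate results are simply dropped
            done[task] = worker
            makespan = now
        assign(worker, now)           # the worker asks the manager for more work
    return makespan, done


# Two equal workers, 12 threads of 50 seconds; B crashes at t = 75.
makespan, done = eager_schedule(12, 50, {"A": 1.0, "B": 1.0}, crash_at={"B": 75})
```

In this run the thread B was holding when it crashed stays unfinished, so the manager eventually hands it to A and the computation still completes without any explicit failure detection.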
Chime
 Chime is a programming system and
runtime environment for parallel
processing
 The first system to incorporate standard parallel language support on a network of workstations:
 Nested parallelism, parallel statements
 Language-defined scoping of variables
 Synchronization support
 Transparent shared memory
 Chime supports the “shared memory”
constructs of CC++
 Adds fault tolerance….
 Adds load balancing….
…. with low overhead
[Figure: a “distributed” cactus stack]
Chime Software Architecture
[Figure: architecture diagram showing the Manager and one out of many Workers; labeled components are Application, Application Thread, Controlling Thread, Runtime System, and Application Request Protocol]
Chime Execution Trace
[Figure: execution trace between the Manager and one out of “many” workers, each with a Controlling Thread and an Application Thread, linked by the Application Request Protocol. The numbered steps (1 to 13) carry these labels: Start Cntrl. Thread, Initialize, Start Application, Sq. Step, Parallel Exec. Request, Suspend, Send Parallel Task, Start Task, Page Fault, Request Page from Manager, Send Page, Install Page, Compute, Done/Send Dirty Page Diffs, Resume & Done]
Malaxis
 A DSM Package
 Uses NT threads and memory mapping
and protection features
 Uses barrier synchronization,
memory XOR-ing and
intelligent monitoring of page/lock requests
to prevent page shuttling
 Programmer support:
 Spawning processes on remote machines
 Mapping shared segments
 Barrier Synchronization
 Read and Write locks (abstract, advisory)
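The slide mentions memory XOR-ing without saying how it is used. One common use of XOR in a DSM setting, shown here purely as an assumption and not as Malaxis's documented design, is to compute a page diff against an unmodified copy so that changes from two writers to the same page can be merged instead of shipping the whole page back and forth. The helper `xor_bytes` and the 8-byte "page" are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized buffers."""
    if len(a) != len(b):
        raise ValueError("buffers must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


# A tiny stand-in for a page; a real page would be several kilobytes.
base = bytes(8)

# Two machines modify different bytes of their own copies of the page.
copy_a = bytearray(base)
copy_a[0] = 0x11
copy_b = bytearray(base)
copy_b[5] = 0x22

# The diff is zero wherever a copy still equals the base page.
diff_a = xor_bytes(base, bytes(copy_a))
diff_b = xor_bytes(base, bytes(copy_b))

# Applying both diffs to the base merges the two sets of writes.
# This is only correct when the writers touched disjoint bytes.
merged = xor_bytes(xor_bytes(base, diff_a), diff_b)
```

The merge works because XOR-ing a byte with zero leaves it unchanged, so each diff only affects the bytes its writer modified.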
Milan
 A metacomputing platform
 Creates a system image of a large computer on a set of
workstations
 Smart scheduling
 bunching
 job recall
 pre-emption
 Shared memory
 Fault tolerant
Using Windows NT
 The needs of our implementations:
 User Level page fault handling
 Getting and setting thread contexts
 Getting and setting stack contents
 Asynchronous notification and exception handling
 Networking support
 Process/Thread control
 Windows NT provides all of the above
Memory Handling
 Windows NT memory handling is elegant and powerful
(After you understand the terminology)
 States of memory:
 committed
 reserved
 guarded
 Protection and allocation are done by:
 VirtualAlloc
 VirtualProtect
 Access violations generate exceptions
 Required reprogramming Calypso, for the better
Exception Handling
 All exceptions are delivered to an exception handler, defined
in the current scope of execution.
 Great for programmers: nice and structured
 Not good for middleware solutions….
 How can I execute another person’s code with my exception handlers?
 I cannot change the exception handler from within my exception handler.
 In our case, we found reasonable workarounds, but we don’t have general solutions to the above problems.
Threads
 Good, consistent, kernel threads.
 Easy to use
 works great
 plethora of synchronization constructs (too many, in fact)
 Threads are useful for:
 Threads inside middleware - wow!
 Handling distributed shared memory (callbacks, caching, memory
service)
 Process migration - a thread can set up the main process
 Segregating functionality (assign a thread per job)
Process and Stack Migration
 Migration is used by our system for several purposes:
 Cactus stacks
 Checkpointing
 Pre-emptive scheduling (produces better turnaround times in
dynamic environments)
 When a thread has to be migrated:
 Another thread suspends it and gets its context
 The context is a checkpoint
 The context is sent to the target machine
 A thread sets the context of a suspended thread to the new context and resumes it. The stack has to be reset too.
 IT WORKS
Other Features
 Networking
 winsock is like sockets, no surprises
 Remote execution
 our approach: Use a daemon process
 NT approach: use a starter service
 Execution Monitor (GUI)
 External process that controls and displays the state of the distributed computation
Performance
 Program: Ray Trace, generates a nice picture
 Equipment:
 Pentium-90, running Windows NT (Calypso tests)
 Pentium Pro 200, running Windows NT (Chime tests)
 Tests conducted
 Speedup
 Speedup in case of mixed speed machines
 Speedup in case of crashing and recovering machines
 Micro-tests (migration, stack creation)
– Not all tests will be shown now.
Calypso Performance
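[Chart: Calypso execution time and speedup by machine configuration]
Machine               Time          Speedup
Sequential, 1P 90     1042, 1037    1.0, 1.0
2P 90                 548           1.9
3P 90                 362           2.9
4P 90                 290           3.6
5P 90                 230           4.5
(1042 and 1037 are the sequential and one-machine times; the chart text does not indicate which is which.)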
Performance is comparable to Unix systems
Chime Performance
[Chart: Chime execution time and speedup by machine configuration]
Machine        Time    Speedup
Sequential     584     1.0
1-P90          639     0.9
2-P90          329     1.8
3-P90          240     2.4
4-P90          174     3.4
5-P90          148     3.9
Chime has higher network overhead than Calypso
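The speedup figures on this chart are consistent with the usual definition, sequential time divided by parallel time. The short check below assumes a pairing of times to machine counts that is inferred from the chart's numbers rather than stated on the slide (sequential 584, then 639, 329, 240, 174, and 148 for one to five machines); the dictionary and function names are hypothetical.

```python
# Times read from the Chime chart; the pairing with machine counts is inferred.
chime_times = {
    "Sequential": 584,
    "1-P90": 639,
    "2-P90": 329,
    "3-P90": 240,
    "4-P90": 174,
    "5-P90": 148,
}


def speedups(times, baseline="Sequential"):
    """Speedup of each configuration relative to the baseline run."""
    base = times[baseline]
    return {name: round(base / t, 1) for name, t in times.items()}


chime_speedups = speedups(chime_times)
for name, value in chime_speedups.items():
    print(f"{name:<12}{chime_times[name]:>6}{value:>6}")
```

Under this pairing the one-machine run is slower than the sequential program (speedup 0.9), which fits the slide's remark about Chime's overhead.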
In Retrospect
 NT has some strong points, things that are better than Unix
 Threads
 Exception Handling
 Memory Management
 Program development tools
– (very good, especially the debugger)
 Documentation
 A few shortcomings
 no signals
 no remote execution facility
 terrible terminology
Status
 Operational prototype systems
 Calypso on Windows NT / Windows 95 released
 A prototype of Chime implementing most of the “parallel part” of
Compositional C++ on an unreliable network of workstations
 Ongoing research
 Distributed scheduling and resource management (for MILAN)
 Quality of service
 Better integration with NT (MFC support, remote services, global
scheduling…)
Acknowledgements
 Co-PI
 Zvi M. Kedem
 Calypso
 Arash Baratloo, Mehmet Karaul
 Calypso NT
 Donald McLaughlin and Shantanu Sardesai
 Chime
 Shantanu Sardesai
 Calypso Linux
 Arash Baratloo