Dynamic adaptation of parallel codes Françoise André, Jérémy Buisson & Jean-Louis Pazat

Dynamic adaptation of parallel codes
Toward self-adaptable components for the Grid
Françoise André, Jérémy Buisson & Jean-Louis Pazat
IRISA / INSA de Rennes / Université de Rennes 1
Our view of the Grid
[Figure: an application spans several cluster resources connected by a WAN. Our point of interest is a single cluster resource, made up of processor resources, a network resource, …]
Our view of the Grid
• Environment that is:
– Parallel
• Grid is built up from parallel machines
– Dynamic
• Resource allocation may change dynamically
– Distributed
• Resources are distributed over a network
• Resources are in different administration domains
• Need for a new programming technique
Parallel self-adaptable distributed components
Related work
• Parallel and distributed components / objects exist
– Examples: GridCCM, PARDIS
• Self-adaptable components exist
– Examples: ACEEL, DART
• But no distributed component is both parallel and self-adaptable
Principles of parallel components
• Encapsulation of a parallel code
– Collaboration of several communicating processes
• Goal: make it easy to couple parallel codes
Principles of dynamic adaptation
• Modification of the executed code
– Reflective programming
• Goal: better fit to allocated resources
[Figure: execution flow. 1. An event occurs. 2. A reaction is executed.]
Dynamic adaptation
• Three key questions:
– When should the component adapt?
– How should the component be modified?
– Where can the reaction be executed?
Dynamic adaptation
• When should the component adapt?
– Upon reception of an event from a monitor
– According to the policy
[Figure: the Monitor notifies the Decider of events; the Decider interprets the adaptation policy.]
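The monitor / decider / policy relationship above can be illustrated with a minimal sketch. It assumes the policy is a list of explicit event-based rules; the class names `Event`, `Rule`, and `Decider`, the event kind `processors_added`, and the reaction name `spawn_processes` are all hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Event:
    kind: str   # hypothetical event name sent by a monitor
    value: int  # hypothetical payload, e.g. a number of processors


@dataclass
class Rule:
    matches: Callable[[Event], bool]  # condition on the incoming event
    reaction: str                     # name of the reaction to request


class Decider:
    """Interprets the adaptation policy each time a monitor notifies an event."""

    def __init__(self, policy: List[Rule]) -> None:
        self.policy = policy

    def on_event(self, event: Event) -> Optional[str]:
        # The first matching rule wins; None means "do not adapt".
        for rule in self.policy:
            if rule.matches(event):
                return rule.reaction
        return None


policy = [
    Rule(lambda e: e.kind == "processors_added" and e.value > 0, "spawn_processes"),
]
decider = Decider(policy)
decision = decider.on_event(Event("processors_added", 2))
ignored = decider.on_event(Event("temperature", 70))
```

Here `decision` names the reaction to request, while `ignored` is `None` because no rule of the policy covers that event.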
Dynamic adaptation
• How should the component be modified?
– Executing special code
– Following directives of the policy
[Figure: the Decider interprets the adaptation policy and requests execution of reactions from the Coordinator; the Coordinator executes the Reaction.]
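As a rough illustration of the coordinator's role, the sketch below queues the reactions requested by the decider and executes them as "special code" that modifies the component. The `Component` and `Coordinator` classes, the `run_pending` method, and the reaction `switch_to_behavior_2` are hypothetical names chosen for this example only.

```python
from typing import Callable, Dict, List


class Component:
    def __init__(self) -> None:
        self.behavior = "behavior_1"


def switch_to_behavior_2(component: Component) -> None:
    # A reaction is ordinary code that modifies the component.
    component.behavior = "behavior_2"


class Coordinator:
    """Receives reaction requests and executes them on the component."""

    def __init__(self, component: Component,
                 reactions: Dict[str, Callable[[Component], None]]) -> None:
        self.component = component
        self.reactions = reactions
        self.pending: List[str] = []

    def request(self, reaction_name: str) -> None:
        if reaction_name not in self.reactions:
            raise KeyError(f"unknown reaction: {reaction_name}")
        self.pending.append(reaction_name)

    def run_pending(self) -> List[str]:
        # Execute the requested reactions in the order they were requested.
        done = []
        while self.pending:
            name = self.pending.pop(0)
            self.reactions[name](self.component)
            done.append(name)
        return done


component = Component()
coordinator = Coordinator(component, {"switch": switch_to_behavior_2})
coordinator.request("switch")
executed = coordinator.run_pending()
```

In this sketch `run_pending` would be called by the component itself when it reaches a point where adaptation is allowed.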
Dynamic adaptation
• Where can the reaction be executed?
– At the next adaptation point
– Approximated prediction of the next point
• Based on control flow graph
[Figure: the Reaction switches the execution from Behavior 1 to Behavior 2; points in the code are marked either "an adaptation point" or "not an adaptation point".]
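One way to picture the prediction of the next adaptation point is a search over the control flow graph. The sketch below is an assumption about how such a prediction could work, not the algorithm of the slides: it returns the adaptation points that may be reached first from the current node. The result is only an approximation because the outcome of conditional branches is unknown in advance. The graph, its node names, and the function `next_adaptation_points` are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def next_adaptation_points(cfg: Dict[str, List[str]],
                           adaptation_points: Set[str],
                           current: str) -> Set[str]:
    """Return the adaptation points that can be reached first from `current`."""
    found: Set[str] = set()
    seen: Set[str] = set()
    queue = deque(cfg.get(current, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        if node in adaptation_points:
            found.add(node)  # do not explore past an adaptation point
            continue
        queue.extend(cfg.get(node, []))
    return found


# Hypothetical control flow graph of a loop: compute, test, exchange, loop back.
cfg = {
    "start": ["compute"],
    "compute": ["test"],
    "test": ["exchange", "end"],  # conditional branch, unpredictable statically
    "exchange": ["ap_loop"],
    "ap_loop": ["compute"],
    "end": [],
}
candidates = next_adaptation_points(cfg, {"ap_loop"}, "compute")
none_left = next_adaptation_points(cfg, {"ap_loop"}, "end")
```

From `compute` the candidate is `ap_loop`, but the branch at `test` may also leave the loop, which is why the prediction is approximate.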
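Dynamic adaptation
[Figure: overall architecture. The Monitor notifies the Decider of events; the Decider interprets the adaptation policy and requests execution of reactions from the Coordinator; the Coordinator executes the Reaction, which modifies the Behavior. The figure also carries the labels Component and Platform.]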
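Mixing parallelism and adaptation
[Figure: the same architecture with parallel elements. The Monitor notifies the Decider of events; the Decider interprets the adaptation policy and requests execution of reactions from the Parallel coordinator; the Parallel coordinator executes the Parallel reaction, which modifies the Parallel behaviors. The figure also carries the labels Component and Platform.]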
Mixing parallelism and adaptation
• Introduction of global adaptation points
– All the processes at the same state
• Need to coordinate all the processes
• Example: SPMD code
– Adaptation point between each phase
[Figure: local adaptation points of the processes; sets of local points that form global adaptation points, and sets that are not global adaptation points.]
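The SPMD example can be sketched with threads standing in for processes. This is a simplified illustration under stated assumptions: a static barrier between phases plays the role of the global adaptation point, whereas the slides suggest that synchronization is inserted dynamically. The names `pending`, `executed`, `global_adaptation_point`, and the reaction name `grow` are hypothetical.

```python
import threading

NUM_PROCESSES = 4
NUM_PHASES = 3
pending = ["grow"]   # hypothetical reaction already requested by the decider
executed = []        # (phase number, reaction) pairs, kept for inspection
phase_counter = [0]


def global_adaptation_point() -> None:
    # Run by exactly one thread once every process has reached the barrier,
    # that is, when all processes are in the same state (end of the same phase).
    phase_counter[0] += 1
    while pending:
        executed.append((phase_counter[0], pending.pop(0)))


barrier = threading.Barrier(NUM_PROCESSES, action=global_adaptation_point)


def spmd_process(rank: int) -> None:
    for _phase in range(NUM_PHASES):
        _ = sum(i * rank for i in range(1000))  # stand-in for the phase's work
        barrier.wait()                           # adaptation point between phases


threads = [threading.Thread(target=spmd_process, args=(r,))
           for r in range(NUM_PROCESSES)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The pending reaction runs once, at the first point where all processes have finished the same phase; later global points find nothing to execute.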
Mixing parallelism and adaptation
• Need for a distributed algorithm for the parallel coordinator
– Only consider globally reachable points
• In the future of all the processes
– Reach an agreement among all the processes
• Choose the same point for all the processes
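The agreement step can be sketched for the simple case of an iterative SPMD code, assuming adaptation points are numbered by iteration so that they are totally ordered. Each process proposes the first point it has not passed yet; the latest proposal is the earliest point that lies in the future of every process. The function `agree_on_point` and the proposal values are hypothetical, and this is not the negotiation protocol of the slides.

```python
from typing import Dict


def agree_on_point(next_point: Dict[int, int]) -> int:
    """next_point[rank] is the first adaptation point `rank` has not passed yet.

    The maximum is the earliest point still in the future of all processes.
    """
    return max(next_point.values())


# Hypothetical proposals from three processes that progress at different speeds.
proposals = {0: 12, 1: 14, 2: 13}
chosen = agree_on_point(proposals)

# Number of adaptation points each process passes before stopping at `chosen`.
remaining = {rank: chosen - point for rank, point in proposals.items()}
```

In a message-passing setting the same result could be obtained with a max all-reduce over the proposals.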
Mixing parallelism and adaptation
• Need to control the non-determinism
– Due to parallelism
• Dynamically insert synchronization statements
– Due to unpredictable conditional instructions
• Force the result of the conditions if possible
– Example: insertion of empty iterations in loops
• Otherwise postpone the decision-making
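The "empty iterations" example can be read as follows; this reading is an assumption, since the slides do not detail the mechanism. If the agreed adaptation point lies after the iteration at which a process would normally leave its loop, the loop condition is forced to remain true and the extra iterations do no work, so the process still reaches the agreed point. The function `run_loop` and its parameters are hypothetical.

```python
from typing import List, Tuple


def run_loop(work_iterations: int, agreed_point: int) -> List[Tuple[str, int]]:
    """Run a loop, padding it with empty iterations up to `agreed_point`."""
    trace: List[Tuple[str, int]] = []
    i = 0
    # The second clause forces the loop condition until the agreed point.
    while i < work_iterations or i < agreed_point:
        if i < work_iterations:
            trace.append(("work", i))
        else:
            trace.append(("empty", i))  # forced iteration, no computation
        i += 1
        if i == agreed_point:
            trace.append(("adapt", i))  # the adaptation point is reached
    return trace


trace = run_loop(work_iterations=2, agreed_point=4)
```

A process with only two iterations of real work performs two empty iterations so that it reaches the same point as the other processes.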
Experiment
• Experiment
– Iterative SPMD code
• Adaptation points between each iteration
– Increase of the number of processors
• Results
– Negligible time in adaptation points
– Gain thanks to the adaptation
– Expected to scale well
[Figure: execution time (sec, 0 to 120) versus iteration (0 to 100) for the original and the adaptable code, with the adaptation marked.]
Related domains
• Computation steering
– Notions equivalent to global adaptation points
• Need to execute some “special code” at the next “special point”
– Particular use of adaptation mechanisms
• User interface instead of monitors
[Figure: the adaptation architecture with a Man-Machine Interface added; the remaining labels are truncated in the source.]
Related domains
• Fault tolerance
– Consider dynamic environment
– Need for a global “consistent” state
• In the past for fault tolerance
• In the future for dynamic adaptation
– Relation to dynamic adaptation
• An application?
• A complementary feature?
Work done
• Design of the overall architecture
– Identification of functional “boxes”
• Distributed algorithm for the coordinator
– Automated instrumentation by static
behavioral reification
– Simple negotiation protocol
• Demonstration prototype
– Ad-hoc mechanisms
– Proof of concept
Future work
• Generalizing the approach
– Generic definition of global adaptation points
• Limits of the “same state” definition
• Case of non-SPMD codes
– Expression of the adaptation policy
• Limits of explicit event-based rules
• Need for more sophisticated (intelligent?) policies
– Smoothing measures of resource availability
– Balancing instabilities
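To illustrate what smoothing measures of resource availability could mean, the sketch below applies an exponential moving average to a series of samples. This technique is an assumption chosen for illustration; the slides do not name a method. The function `smooth`, the sample values, and the threshold are hypothetical.

```python
from typing import List


def smooth(samples: List[float], alpha: float = 0.3) -> List[float]:
    """Exponential moving average: higher alpha follows the raw data more closely."""
    smoothed: List[float] = []
    current = None
    for s in samples:
        current = s if current is None else alpha * s + (1 - alpha) * current
        smoothed.append(current)
    return smoothed


# Hypothetical number of available processors, with one transient drop.
raw = [8, 8, 2, 8, 8]
smoothed = smooth(raw, alpha=0.5)

THRESHOLD = 4  # hypothetical level below which the policy would react
raw_triggers = [s < THRESHOLD for s in raw]
smoothed_triggers = [s < THRESHOLD for s in smoothed]
```

With the raw measures the transient drop would trigger an adaptation; with the smoothed measures it would not, which avoids reacting to a short-lived instability.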
Future work
• Collaborative adaptation of components
– Control side-effects
• Avoid adaptation cycles
– Common policy at the level of:
• A group of components
• A composite
• The whole application
– Consider full Grid applications
• Not only their components