Use Case Scenarios for Performance Control of Grid-based Metacomputing

John Gurd, Ken Mayes, Graham Riley
3rd Grid Performance Workshop, June 2005
www.cs.man.ac.uk/cnc
Overview
 Preamble
• The case for Performance Control
 Context
• Malleable, component-based Grid applications
 The PERCO (Performance Control) System
• Design and implementation
 Homogeneous Components
• Simple performance control scenarios
 More Complex Scenarios
 Conclusions
Achieving Performance
 Engineering for maximum performance:
• coarse design, then fine tuning
• requires high degree of repeatability
• benefits from homogeneity, symmetry, etc.
 Control to achieve (less than maximum)
target:
• use negative feedback control at run-time
• necessary to cope with a dynamic environment
• helps to deal with heterogeneity
How to Control Performance?
 Requires (negative) feedback
[Figure: negative feedback loop showing the feedback function, the error signal and the actuator]
• needs sensors, actuators and compensators
• timers, control ‘handles’, predictive models
 Whole system vs. piece-wise control
• who is responsible for what?
 Perception is that a hierarchy is needed
• hence need hierarchical software structure
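The feedback loop described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than part of PERCO: the class name `ProportionalController`, the proportional compensator, and the toy plant model in which the measured rate is simply twice the actuator setting.

```python
from dataclasses import dataclass


@dataclass
class ProportionalController:
    """Hypothetical compensator: maps a performance error to an actuator setting."""
    target_rate: float  # desired rate, e.g. timesteps per second
    gain: float         # how strongly the setting reacts to the error
    setting: float      # current value of the control 'handle'

    def update(self, measured_rate: float) -> float:
        # Negative feedback: the error is the target minus the sensor reading.
        error = self.target_rate - measured_rate
        self.setting = max(0.0, self.setting + self.gain * error)
        return self.setting


def simulated_component(setting: float) -> float:
    """Toy plant model (assumption): achieved rate is twice the setting."""
    return 2.0 * setting


controller = ProportionalController(target_rate=10.0, gain=0.25, setting=1.0)
rate = simulated_component(controller.setting)  # initial sensor reading
for _ in range(20):
    # Sense, compensate, actuate, then observe the new rate.
    rate = simulated_component(controller.update(rate))
```

With these made-up numbers the error halves on every iteration, so the rate settles close to the target of 10. A real compensator would rely on a predictive model of the component rather than a fixed gain.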
Controllable Components?
 Several groups have suggested that control
should be effected via a component-based
software architecture
• in the degenerate case, the application is a singleton component
• can reduce the complexity of control
• can form a control hierarchy
Overview of PERCO
 Two-tier hierarchical performance control
• CPS (Component Performance Steerer)
- one wrapped around each component
- all attached to APS (see below)
- maximises performance on deployed platform
• APS (Application Performance Steerer)
- (re)deploys components on available resources
- maximises performance on allocated platforms
 Requires an external resource allocator
(to supply the set of resources on which
the APS deploys components)
Modus Operandi
 Components advance through a sequence of
progress points; at each one, the component
calls out to its CPS for any component-specific
performance control actions (local actuation,
which requires the component to be malleable)
 Certain progress points are also safe points
(i.e. the component is in a state that permits
it to be redeployed); at these points, the CPS
can call out to the APS for redeployment-based
performance control actions (the APS's means
of actuation)
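The call-out pattern described above might look roughly like the following sketch. The class names, method names and the threshold-based redeployment policy are all hypothetical, chosen only to show which steerer is consulted at which kind of progress point. They are not the PERCO interfaces.

```python
class ApplicationSteerer:
    """Stand-in for the APS: decides whether a component should be redeployed."""

    def __init__(self, slow_threshold: float):
        # Assumed policy: a phase slower than this (in seconds) triggers redeployment.
        self.slow_threshold = slow_threshold
        self.redeployed = []  # (component, progress point) pairs

    def at_safe_point(self, component: str, point: int, phase_time: float) -> bool:
        if phase_time > self.slow_threshold:
            self.redeployed.append((component, point))
            return True
        return False


class ComponentSteerer:
    """Stand-in for a CPS wrapped around one component."""

    def __init__(self, component: str, aps: ApplicationSteerer, safe_points: set):
        self.component = component
        self.aps = aps
        self.safe_points = safe_points
        self.local_actions = []  # progress points at which local actuation ran

    def at_progress_point(self, point: int, phase_time: float) -> None:
        # Local actuation is possible at every progress point.
        self.local_actions.append(point)
        # Redeployment is only considered where the component is safe to move.
        if point in self.safe_points:
            self.aps.at_safe_point(self.component, point, phase_time)


aps = ApplicationSteerer(slow_threshold=1.0)
cps = ComponentSteerer("C1", aps, safe_points={2, 5})
phase_times = [0.5, 1.5, 1.5, 0.5, 0.5, 0.5]  # made-up timings for points 0..5
for point, elapsed in enumerate(phase_times):
    cps.at_progress_point(point, elapsed)
```

In this run the slow phase ending at point 1 cannot trigger redeployment because point 1 is not a safe point. The APS is first consulted at point 2, where it records a redeployment of C1.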
Progress Points
 Assume that the execution of components
and application proceeds through phases,
and that the phase boundaries are marked
by progress points.
[Figure: timeline of seven phases (Ph 1 to Ph 7) delimited by progress points numbered 0 to 7]
 Can take decisions about performance and
(possibly) actuate at the progress points
Application vs. Component Progress Points
[Figure: application progress points (APS) and component progress points (CPS) for a component, plotted against time]
 Application progress points need to be safe
points
PERCO System Overview
PERCO Infrastructure
 Each component is attached to a local
loader, which can move the component
safely around the distributed Grid hardware
in response to commands from the APS
 The local loaders act in concert with the
APS to form a virtual loader layer for the
application
 Each CPS communicates with the local
loader on behalf of its component
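As a rough illustration of the loader layer described above, the sketch below models each local loader as an object that records where its component currently runs, and the virtual loader layer as the collection of loaders that the APS addresses. All names (`LocalLoader`, `VirtualLoaderLayer`, `redeploy`, the host labels) are assumptions for illustration. Checkpointing and transfer of the component are omitted.

```python
class LocalLoader:
    """Stand-in for the local loader attached to one component."""

    def __init__(self, component: str, host: str):
        self.component = component
        self.host = host

    def move_to(self, new_host: str) -> None:
        # A real loader would checkpoint, transfer and restart the component here.
        self.host = new_host


class VirtualLoaderLayer:
    """The local loaders taken together, as addressed by the APS."""

    def __init__(self, loaders):
        self.loaders = {loader.component: loader for loader in loaders}

    def redeploy(self, component: str, new_host: str) -> None:
        # Forward an APS command to the loader that owns the component.
        self.loaders[component].move_to(new_host)

    def placement(self) -> dict:
        return {name: loader.host for name, loader in self.loaders.items()}


layer = VirtualLoaderLayer([LocalLoader("C1", "hostA"), LocalLoader("C2", "hostB")])
layer.redeploy("C1", "hostC")  # hypothetical APS command
```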
PERCO System for 2 Components, C1 & C2
Controllable Components?
 But where do the components come from?
• a knotty problem (cf. RealityGrid LB3D)
One Answer . . .
 Homogeneous components
• each component a copy of the same model
• used e.g. for parameter search
• e.g. LB3D from RealityGrid
 Performance control scenarios
• N instances of LB3D, finish as fast as possible
- equates to keeping them in (approximate) timestep
with each other (see next slides)
• execute N instances of LB3D at specified rates
relative to one another
- e.g. N=2, one instance executes twice as many
timesteps per unit of time as the other
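Both scenarios above can be expressed with one measure: divide each instance's completed timesteps by its target relative rate and compare the results. The sketch below is one possible formulation, not the method used in PERCO. The function names and the example numbers are assumptions.

```python
def normalised_progress(steps, weights):
    """Timesteps completed, divided by each instance's target relative rate."""
    return [s / w for s, w in zip(steps, weights)]


def most_behind(steps, weights):
    """Index of the instance lagging furthest behind its target rate."""
    progress = normalised_progress(steps, weights)
    return min(range(len(progress)), key=progress.__getitem__)


def imbalance(steps, weights):
    """Spread between the leading and the lagging instance (0 means in step)."""
    progress = normalised_progress(steps, weights)
    return max(progress) - min(progress)


# Scenario 1: three instances that should stay in step (equal weights).
equal = [1, 1, 1]
lagging_equal = most_behind([40, 55, 50], equal)   # instance 0 is behind

# Scenario 2: N=2, instance 0 should run twice as fast as instance 1.
ratio = [2, 1]
on_target = imbalance([200, 100], ratio)           # 0.0: exactly on the 2:1 ratio
off_target = imbalance([150, 100], ratio)          # 25.0: instance 0 is lagging
lagging_ratio = most_behind([150, 100], ratio)
```

A steerer could use the index returned by `most_behind` to decide which instance should receive more resources at the next safe point, and treat `imbalance` as the error signal to drive towards zero.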
With No Control
With Control Exerted
Slightly More Complex Answer . . .
 “Almost homogeneous” components
• each component a copy of a similar model, but ...
• ... with different driving parameters
- e.g. LB3D with different resolutions
 Performance control scenarios
• TeraGyroid experiment (from RealityGrid;
conducted during SC’2003; see next slide)
• IntBioSim “beading” method
• Hurricane “tracking”
 Embedded high resolution subdomains
• when does extra resolution become new physics?
TeraGyroid Use Case Scenario
Even More Complex Answer: Coupled Models
 Many scientific modellers are finding a
need to link together multiple models:
• climate/environment models (ocean + atmosphere + ...)
• multi-scale phenomena (CFD + MD = HybridMD)
• aircraft lightning strike (CEM + airframe structure)
+ others, all needing high performance & ‘Grid’
 The individual models seem to constitute
ready-made components:
• can these be used for performance control?
Summary
 We are investigating the practicalities of
component-based performance control in Grid
execution environments
 A prototype performance control system is being
developed and we have shown that it can be used
to achieve a scientifically meaningful high-level
performance objective
 We are ready to apply it to realistic scientific
coupled model applications
K.R. Mayes, M. Luján, G.D. Riley, J. Chin, P.V. Coveney, J.R. Gurd,
Towards performance control on the Grid, Philosophical
Transactions of the Royal Society of London: Series A, to appear,
August 2005.
Related Projects at Manchester
 FLUME - design of next generation Unified Model
software
• funded by The Met Office (led by Mick Carter)
 RealityGrid – condensed matter modelling
• EPSRC-funded e-Science (led by Peter Coveney at UCL)
 SoftIAM - climate impact, integrated assessment
modelling
• funded by the Tyndall Centre (led by Rachel Warren)
 IntBioSim – integrated biological simulation
• BBSRC-funded e-Science (led by Mark Sansom at Oxford)
 GENIEfy – Earth system modelling
• NERC-funded e-Science (led by Tim Lenton + Tyndall Centre)
Weblinks
For more information check:
http://www.cs.man.ac.uk/cnc
http://www.realitygrid.org
http://www.intbiosim.org (under construction)