NeSC Workshop on Applications and Testbeds on the Grid

NeSC Workshop on Applications and
Testbeds on the Grid
Prior to GGF5 and HPDC-11
In collaboration with the Applications and Testbeds Research Group and the Grid User
Services Research Group of the Global Grid Forum
Workshop Presentations
collected by John Towns, NCSA and NLANR
jtowns@nlanr.net
NeSC Workshop on Applications and Testbeds on the Grid
Date: 20 July 2002
Venue: Marriott Hotel Glasgow, 500 Argyle Street, Glasgow
The GGF’s Applications Research Group (ARG) and Grid Users Services Research
Group (GUS), in seeking to provide a bridge between the wider application community
and the developers and directors of grid policies, standards and infrastructures, held a
workshop in Glasgow, Scotland preceding GGF5. The goals of this workshop were:
• to provide a forum for (prospective) applications utilizing Grids
• to spread information about both the current state-of-the-art and future directions of tools, toolkits and other instruments for users and application programmers
• to encourage users and application programmers to make use of existing Grid infrastructure
• to gather user and developer requirements for the effective use of Grid technologies
This document is the collected set of presentations from the workshop, following the agenda below. All presentation abstracts and presentations will be available through the workshop web page at: http://umbriel.dcs.gla.ac.uk/Nesc/general/esi/events/apps/
Agenda for the NeSC Workshop on Applications and Testbeds on the Grid
Saturday 20 July 2002
0800-0845  Registration
0845-0900  Welcome
0900-1000  Session: Applications 1
  0900  Grid Enabled Optimisation and Design Search for Engineering (GEODISE) - Simon Cox
  0920  Taming of the Grid: Lessons Learned and Solutions Found in the National Fusion Collaboratory - Kate Keahey
  0940  Chemical Reactor Performance Simulation: A Grid-Enabled Application - Ken Bishop
1000-1030  BREAK
1030-1130  Session: Applications 2
  1030  Experiences with Applications on the Grid using PACX-MPI - Matthias Mueller
  1050  DAME Presentation Overview - Tom Jackson
  1110  CFD Grid Research in N*Grid Project - Chun-ho Sung
1130-1230  OPEN DISCUSSION "Grid Experiences: Good, Bad and Ugly"
1230-1400  LUNCH
1400-1520  Session: Infrastructure 1
  1400  GridTools: "Customizable command line tools for using Grids" - Ian Kelley
  1420  Using pyGlobus to Expose Legacy Applications as OGSA Components - Keith Jackson
  1440  An Overview of the GAT API - Tom Goodale
  1500  Collaborative Tools for Grid Support - Laura McGinnis
1520-1550  BREAK
1550-1710  Session: Infrastructure 2
  1550  Application Web Service Tool Kit - Geoffrey Fox
  1610  Grid Portals: Bridging the gap between Grids and application scientists - Michael Russell
  1630  A Data Miner for the Information Power Grid - Thomas Hinke
  1650  Grid Programming Frameworks at ICASE - Thomas Eidson
1710-1800  OPEN DISCUSSION "If we build it, will they come?"
1800  CLOSING
Grid Enabled Optimisation and
Design Search for Engineering
(GEODISE)
Prof Simon Cox
Southampton University
http://www.geodise.org
Academic and Industrial Partners: Southampton, Oxford and Manchester
• Simon Cox - Grid/W3C technologies and high performance computing; Global Grid Forum Apps Working Group
• Andy Keane - Director of Rolls Royce / BAE Systems University Technology Partnership in Design Search and Optimisation
• Mike Giles - Director of Rolls Royce University Technology Centre for Computational Fluid Dynamics
• Carole Goble - Ontologies and DARPA Agent Markup Language (DAML) / Ontology Inference Language (OIL)
• Nigel Shadbolt - Director of Advanced Knowledge Technologies (AKT) IRC
• BAE Systems - Engineering
• Rolls-Royce - Engineering
• Fluent - Computational Fluid Dynamics
• Microsoft - Software / Web Services
• Intel - Hardware
• Compusys - Systems Integration
• Epistemics - Knowledge Technologies
• Condor - Grid Middleware
The GEODISE Team ...
Richard Boardman, Sergio Campobasso, Liming Chen, Mike Chrystall, Simon Cox, Mihai Duta, Clive Emberey, Hakki Eres, Matt Fairman, Carole Goble, Mike Giles, Zhuoan Jiao, Andy Keane, Juri Papay, Graeme Pound, Nicola Reader, Angus Roberts, Mark Scott, Tony Scurr, Nigel Shadbolt, Paul Smart, Barry Tao, Jasmin Wason, Gang "Luke" Xue, Fenglian Xu
Design
Design Challenges
Modern engineering firms are global and distributed. How to ... ?
• ... improve design environments, cope with legacy code / systems, and produce optimized designs: CAD and analysis tools, user interfaces, PSEs, visualization, and optimisation methods
• ... integrate large-scale systems in a flexible way: management of distributed compute and data resources
• ... archive and re-use design history: data archives (e.g. design / system usage)
• ... capture and re-use knowledge: knowledge repositories and knowledge capture and reuse tools
"Not just a problem of using HPC"
NASA Satellite Structure
Optimized satellite designs have been found with enhanced vibration isolation performance, using parallel GAs running on Intel workstation clusters.
Baseline 3D-boom on test
Gas Turbine Engine: Initial Design
[Figure: base geometry and secondary kinetic energy; collaboration with Rolls-Royce.]
Design of Experiment & Response Surface Modelling
[Workflow diagram: initial geometry -> DoE construction -> parallel CFD analyses on a cluster -> build database -> RSM construction, evaluation and tuning -> search using RSM -> adequate? -> best design.]
Optimised Design
[Figure: optimised geometry and secondary kinetic energy.]
The Grid Problem
"Flexible and secure sharing of resources among dynamic collections of individuals within and across organisations"
• Resources = assets, capabilities, and knowledge
  – Capabilities (e.g. application codes, analysis & design tools)
  – Compute Grids (PC cycles, commodity clusters, HPC)
  – Data Grids
  – Experimental Instruments
  – Knowledge Services
  – Virtual Organisations
  – Utility Services
• Grid middleware mediates between these resources
GEODISE Architecture
[Diagram: the engineer works through the GEODISE portal, backed by a knowledge repository, ontologies for engineering, computation, optimisation and design search, visualization, a session database with traceability, optimisation options, an application service provider with an intelligent application manager (reliability, security, QoS), CAD systems (CADDS, IDEAS, ProE, CATIA, ICAD), Globus, Condor and SRB middleware, optimisation and design archives, computation resources (licenses and code; CFD, FEM and CEM analysis; parallel machines and clusters), and pay-per-use internet resource providers with an intelligent resource provider.]
Geodise will provide grid-based seamless access to an intelligent knowledge repository, a state-of-the-art collection of optimisation and search tools, industrial strength analysis codes, and distributed computing & data resources.
GEODISE Demo
(1) Security infrastructure: authentication & authorisation
(2) Define geometry to optimise: nacelle design, 3D and axisymmetric (2D)
(3) Sample objective function to build Response Surface Model: Grid computing
GEODISE Demo (continued)
(4) Optimise over Response Surface Model: target velocity function evaluation; search using the Response Surface Model over design variables (width of curve on lower nacelle surface, position of curve on upper nacelle surface)
(5) Grid database query and post-processing of results: automated data archiving
[Diagram: the Problem Solving Environment auto-generates a database from an XML schema, inserts XML files into the data repository, and evolves the database when a new XML schema arrives (process schema, reconcile schema, update database); the data archive supports knowledge discovery and web/agent access to the repository.]
GEODISE Home Movie
Knowledge Technologies
• Knowledge capture for the design process
• Ontology-driven service composition
• Workflow management
• Knowledge driven: acquire, model, re-use, retrieve, publish, maintain
The future of design optimisation
Design optimisation needs integrated services:
• Design improvements driven by CAD tools coupled to advanced analysis codes (CFD, FEA, CEM etc.)
• On-demand heterogeneous distributed computing and data spread across companies and time zones
• Optimization "for the masses" alongside manual search, as part of a problem solving environment
• Knowledge-based tools for advice and control of process as well as product
Geodise will provide grid-based seamless access to an intelligent knowledge repository, a state-of-the-art collection of optimisation and search tools, industrial strength analysis codes, and distributed computing and data resources.
The Taming of the Grid:
Lessons Learned in the
National Fusion Collaboratory
Kate Keahey
Overview
• Goals and Vision
• Challenges and Solutions
• Deployment War Stories
• Team
• Summary
Goals
"Enabling more efficient use of experimental facilities and more effective integration of experiment, theory and modelling"
• Fusion experiments
  – Pulses every 15-20 minutes
  – Time-critical execution
• We want:
  – More people running more simulation/analysis codes in that critical time window
  – To reduce the time/cost to maintain the software
  – To make the best possible use of facilities
    > Share them
    > Use them efficiently
• Better collaborative visualization (outside of scope)
Overview of the Project
• Funded by DOE as part of the SciDAC initiative
  – 3-year project, currently in its first year
• First phase
  – SC demonstration of a prototype
• Second phase
  – More realistic scenario at Fusion conferences
  – First shot at research issues
• Planning an initial release for the November timeframe
• Work so far:
  – Honing existing infrastructure
  – Initial work on design and development of new capabilities
Vision
• Vision of the Grid as a set of "network services"
  – Characteristics of the software (problems):
    > Software is hard to port and maintain (large, complex)
    > Needs updating frequently and consistently (physics changes)
    > Maintenance of portability is expensive
    > "Software Grid" as important as "Hardware Grid"
    > Reliable between-pulse execution for certain codes
    > Prioritization, pre-emption, etc.
  – Solution:
    > Provide the application (along with hardware and maintenance) as a remotely available "service to community"
    > Provide the infrastructure enabling this mode of operation
What prevents us?
• Issues of control and trust
  – How do I enter into a contract with the resource owner?
  – How do I ensure that this contract is observed?
  – Will my code get priority when it is needed?
  – How do I deal with a dynamic set of users?
• Issues of reliability and performance
  – Time-critical applications
    > How do we handle reservations and ensure performance?
  – A shared environment is more susceptible to failure
  – No control over resources
  – But a lot of redundancy
Other Challenges
• Service monitoring
• Resource monitoring
• Good understanding of quality of service
  – Application-level
  – Composition of different QoS
• Accounting
• Abstractions
  – How do "network services" relate to OGSA Grid Services?
• Implementation and deployment issues
  – firewalls
Issues of Trust: Use Policies
• Requirements
  – Policies coming from different sources:
    > A center should be able to dedicate a percentage of its resources to a community
    > A community may want to grant different rights to different groups of users
  – A group within a VO may be given management rights for certain groups of jobs
  – Managers should be able to use their higher privileges (if any) to manage jobs
  – Shared/dynamic accounts to deal with the dynamic user community problem
Issues of Trust (cont.)
[Architecture diagram: the resource owner and the virtual organization handle policy specification and management; the client sends a request carrying a Grid-wide client credential, a policy target and a policy action; Akenti (authorization system) performs policy evaluation; GRAM (job manager, resource management) hosts the enforcement module / PEP, which passes a local enforcer credential to the local resource management system.]
Issues of Trust (cont.)
• Policy language
  – Based on RSL
  – Additions: policy tags, ownership, actions, etc.
• Experimenting with different enforcement strategies
  – Gateway
  – Sandboxing
  – Services
• Joint work with Von Welch (ANL) and Bo Liu
• Work based on GT2
• Collaborating with Mary Thompson (LBNL)
Issues of Reliable Performance
• Scenario:
  – A GA scientist needs to run TRANSP (at PPPL) between experimental pulses in less than 10 minutes
  – TRANSP inputs can be experimentally configured beforehand to determine how its execution time relates to them
    > Loss of complexity ("physics") to gain time
  – The scientist reserves the PPPL cluster for the time of the experiment
  – Multiple executions of TRANSP, initiated by different clients and requiring different QoS guarantees, can co-exist on the cluster at any time, but when a reservation is claimed, the corresponding TRANSP execution claims full CPU power
Issues of Reliable Performance (cont.)
[Architecture diagram: multiple clients with different requirements call the TRANSP service interface; an execution broker, driven by use policies (administrator), meta-data information (servers) and QoS requirements (client), dispatches executions across multiple TRANSP service installations.]
• Status: an OGSA-based prototype
  – Uses DSRT and other GARA-inspired solutions to implement pre-emption, reservations, etc.
• Joint work with Kal Motawi
Deployment (Firewall Problems)
• The single most serious problem: firewalls
  – Globus requires:
    > Opening specific ports for the services (GRAM, MDS)
    > Opening a range of non-deterministic ports for both client and server
    > Those requirements are necessitated by the design
  – Site policies and configurations:
    > Blocking outgoing ports
    > Opening a port only for traffic from a specific IP
    > Authenticating through the firewall using a SecureID card
    > NAT (private network)
    > "Opening a firewall is an extremely unrealistic request"
• An extremely serious problem: it makes us unable to use the Fusion Grid
Firewalls (Proposed Solutions)
• Inherently difficult problem
• Administrative solutions
  – Explain why it is OK to open certain ports
    > Document explaining Globus security (Von Welch)
  – Agree on acceptable firewall practices to use with Globus
    > Document outlining those practices (Von Welch)
  – Talk to potential "influential bodies"
    > ESCC: August meeting, Lew Randerson, Von Welch
    > DOE Science Grid: firewall practices under discussion
• Technical solutions
  – OGSA work: Von Welch, Frank Siebenlist
  – Example: route interactions through one port
• Do you have similar problems? Use cases?
Firewalls (Resources)
• New: updated firewall web page
  – http://www.globus.org/security/v2.0/firewalls.html
• Portsmouth, UK
  – http://esc.dl.ac.uk/Papers/firewalls/globus-firewall-experiences.pdf
• DOE SG Firewall Policy Draft (Von Welch)
• DOE SG firewall testbed
• Globus Security Primer for Site Administrators (Von Welch)
The NFC Team
• Fusion
  – David Schissel, PI, General Atomics (applications)
  – Doug McCune, PPPL (applications)
  – Martin Greenwald, MIT (MDSplus)
• Secure Grid Infrastructure
  – Mary Thompson, LBNL (Akenti)
  – Kate Keahey, ANL (Globus, network services)
• Visualization
  – ANL
  – University of Utah
  – Princeton University
• More information at www.fusiongrid.org
Summary
• Existing infrastructure
  – A lot in relatively little time
  – Caveat: firewalls
• Building infrastructure
  – Network services
    > A view of a "software grid"
    > Goal: to provide execution reliable in terms of an application-level QoS
    > To accomplish this goal we need: authorization and use policies; resource management strategies
Simulation of Chemical
Reactor Performance
– A Grid-Enabled Application –
Kenneth A. Bishop
Li Cheng
Karen D. Camarda
The University of Kansas
kbishop@ku.edu
NeSC Workshop July 20, 2002
Presentation Organization
• Application Background
  • Chemical Reactor Performance Evaluation
• Grid Assets In Play
  • Hardware Assets
  • Software Assets
• Contemporary Research
  • NCSA Chemical Engineering Portal Application
  • Cactus Environment Application
Chemical Reactor Description
[Reactor schematic: a feed of o-xylene : air mixture enters tubes packed with V2O5 catalyst, cooled by molten salt; the product is phthalic anhydride.]
Reaction conditions: temperature 640-770 K, pressure 2 atm, molten salt coolant
Simulator Capabilities
• Reaction Mechanism: Heterogeneous Or Pseudo-homogeneous
• Reaction Path: Three Specie Or Five Specie Paths
• Flow Phenomena: Diffusive vs Bulk And Radial vs Axial
• Excitation: Composition And/Or Temperature
Chemical Reactor Start-up
[Figure: tube temperature field (640-770 K) over radius and axial position, from tube entrance to exit. Initial condition: feed nitrogen, feed temp. 640 K, coolant temp. 640 K. Final condition: feed 1% ortho-xylene, feed temp. 683 K, coolant temp. 683 K.]
Reactor Start-up: t = 60
[Figure: fields of temperature and of ortho-xylene, phthalic anhydride, tolualdehyde, phthalide and COx concentrations (low to high).]
Reactor Start-up: t = ∞
[Figure: the same fields at steady state.]
Grid Assets In Play - Hardware
• The University of Kansas
  • JADE O2K [6] 250MHz, R10000, 512M RAM
  • PILTDOWN Indy [1] 175MHz, R4400, 64M RAM
  • Linux Workstations
  • Windows Workstations
• University of Illinois (NCSA)
  • MODI4 O2K [48] 195MHz, R10000, 12G RAM
  • Linux (IA32 [968] & IA64 [256] Clusters)
• Boston University
  • LEGO O2K [32] 195MHz, R10000, 8G RAM
Grid Assets In Play - Software
• The University of Kansas
• IRIX 6.5: Globus 2.0 (host); COG 0.9.13 [Java] (client);
Cactus
• Linux: Globus 2.0 (host); COG 0.9.13 [Java] (client);
Cactus
• Windows 2K: COG 0.9.13 (client); Cactus
• University of Illinois (NCSA)
• IRIX 6.5: Globus 2.0 (host); COG 0.9.13 (client);
Cactus
• Linux: Cactus
• Boston University
• IRIX 6.5: Globus 1.1.3 (host); COG 0.9.13 (client);
Cactus
NeSC Workshop July 20, 2002
Research Projects
• Problem Complexity: Initial (Target)
• Pseudo-homogeneous (Heterogeneous) Kinetics
• Temperature And Feed Composition Excitation
• 1,500 (70,000) grid nodes & 200 (1,000) time steps
• Applications
• Alliance Chemical Engineering Portal; Li Cheng
– Thrust: Distributed Computation Assets
– Infrastructure: Method of Lines, XCAT Portal, DDASSL
• Cactus Environment; Karen Camarda
– Thrust: Parallel Computation Algorithms
– Infrastructure: Crank-Nicholson, Cactus, PETSc
NeSC Workshop July 20, 2002
5
ChE Portal Project Plan
• Grid Asset Deployment
  • Client: KU
  • Host: KU or NCSA or BU
• Grid Services Used
  • Globus Resource Allocation Manager (GRAM)
  • GridFTP
• Computation Distribution (File Xfer Load)
  • Direct to Host Job Submission (Null)
  • Client- Job Submission; Host- Simulation (Negligible)
  • Client- Simulation; Host- ODE Solver (Light)
  • Client- Solver; Host- Derivative Evaluation (Heavy)
A rough sketch of the GRAM and GridFTP interaction follows below.
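As an illustration of the two Grid services named above, here is a minimal sketch of a direct-to-host job submission followed by result retrieval. It assumes the Globus Toolkit 2.0 command-line clients are installed and that a proxy certificate already exists; the host name, executable path and file names are hypothetical, not those used in the project.

# Minimal sketch (not project code): submit a simulation through GRAM and
# retrieve the result file with GridFTP, using the GT2 command-line clients.
import subprocess

GRAM_HOST = "modi4.ncsa.uiuc.edu"                # hypothetical GRAM contact
REMOTE_EXE = "/usr/local/bin/reactor_sim"        # hypothetical simulator path
RESULT_URL = "gsiftp://modi4.ncsa.uiuc.edu/tmp/run01.out"  # hypothetical result

def submit_and_fetch():
    # Direct-to-host job submission (the "Null" distribution case above)
    subprocess.run(["globus-job-run", GRAM_HOST, REMOTE_EXE, "run01.inp"],
                   check=True)
    # Copy the results back with GridFTP
    subprocess.run(["globus-url-copy", RESULT_URL, "file:///tmp/run01.out"],
                   check=True)

if __name__ == "__main__":
    submit_and_fetch()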
ChE Portal Project Results
Run Times (Wall Clock Minutes):

Load \ Host    PILTDOWN    JADE     MODI4
Null           76.33       22.08    7.75
Negligible     NA          27.76    8.25
Light          NA          35.08    13.49
Heavy          2540*       NA       15.00**

* 211,121 Derivative Evaluations
** Exceeded Interactive Queue Limit After 3 Time Steps (10,362 Derivative Evaluations)
ChE Portal Project Conclusions
• Conclusions
• The Cost For The Benefits Associated With The
Use Of Grid Enabled Assets Appears Negligible.
• The Portal Provides Robust Mechanisms For
Managing Grid Distributed Computations.
• The Cost Of File Transfer Standard Procedures As
A Message Passing Mechanism Is Extremely High.
• Recommendation
• A High Priority Must Be Assigned To Development
Of High Performance Alternatives To Standard File
Transfer Protocols.
NeSC Workshop July 20, 2002
Cactus Project Plan
• Grid Asset Deployment
• Client: KU
• Host: NCSA (O2K, IA32 Cluster, IA64 Cluster)
• Grid Services Used
• MPICH-G
• Cactus Environment Evaluation
• Shared Memory : Message Passing
• Problem Size: 5x10^5 - 1x10^8 Algebraic Equations
• Grid Assets: 0.5 - 8.0 O2K Processor Minutes; 0.1 - 4.0 IA32 Cluster Processor Minutes
• Application Script Use
NeSC Workshop July 20, 2002
7
Cactus Project Results
[Plot: parallel speedup (t1/tN) vs. number of cluster nodes on the IA32 Linux cluster, 2D dynamic simulation, 1 processor per cluster node, for problem sizes 9x33, 18x66 and 36x132.]
Cactus Project Conclusions
• Conclusions
• The IA32 Cluster Outperforms O2K On The Small
Problems Run To Date. (IA32 Faster Than O2K;
IA32 Speedup Exceeds O2K Speedup.)
• The Cluster Computations Appear To Be Somewhat
Fragile. (Convergence Problems Encountered Above
28 Cluster Node Configuration; Similar (?) Problems
With The IA64 Cluster.)
• The Grid Service (MPICH-G) Evaluation Has Only
Begun.
• Recommendations
• Continue The Planned Evaluation of Grid Services.
• Continue The Planned IA64 Cluster Evaluation.
NeSC Workshop July 20, 2002
8
Overall Conclusions
• The University Of Kansas Is Actively
Involved In Developing The Grid Enabled
Computation Culture Appropriate To Its
Research & Teaching Missions.
• Local Computation Assets Appropriate To
Topical Application Development And Use
Are Necessary.
• Understanding Of And Access To Grid
Enabled Assets Are Necessary.
NeSC Workshop July 20, 2002
9
Experiences with Applications on the Grid using PACX-MPI
Matthias Mueller
mueller@hlrs.de
HLRS
www.hlrs.de
University Stuttgart
Germany
23.07.2002
Matthias Müller
Höchstleistungsrechenzentrum Stuttgart
Outline
• Definition, Scenarios and Success Stories
• Middleware and Tools: The DAMIEN project
• Case Study
Grid Scenario
• Standard approach: one big supercomputer
• Grid approach: distributed resources
Example of Distributed Resources: Supercomputers
• HLRS:  Cray T3E 512/900, 460 GFlops
• CSAR:  Cray T3E 576/1200, 691 GFlops
• PSC:   Cray T3E 512/900, 460 GFlops
• TACC:  Hitachi SR8000, 512 CPUs / 64 nodes, 512 GFlops
• NCHC:  IBM SP3 Winter Hawk 2, 168 CPUs / 42 nodes, 252 GFlops
• JAERI: NEC SX-4, 4 nodes, 8 GFlops
Total: 2.383 TFlops
Applications
• CFD (HLRS)
  – re-entry simulation of space craft
• Processing of radio astronomy data (MC)
  – pulsar search code
• DSMC (HLRS)
  – simulation of granular media
GWAAT: Global Wide Area Application Testbed
NSF Award at SC’ 99
23.07.2002
Matthias Müller
Höchstleistungsrechenzentrum Stuttgart
3
Network Topology
[Network diagram: the testbed sites are linked through STAR-TAP (Chicago), TANET2, APAN/IMnet (Tsukuba/Tokyo), vBNS/Abilene, SCInet (Pittsburgh), TEN-155, DFN, Dante (Frankfurt), JANET and Belwü/RUS (Stuttgart), using a 2 Mbit/s ATM PVC and shared connections. Machines: TACC Hitachi SR8000 (sr8k.aist.go.jp), JAERI NEC SX-4 (frente.koma.jaeri.go.jp), NCHC IBM SP3 (ivory.nchc.gov.tw), PSC Cray T3E (jaromir.psc.edu), HLRS Cray T3E (hwwt3e-at.hww.de), Manchester MCC Cray T3E (turing.cfs.ac.uk).]
Middleware for Scientific Grid Computing: PACX-MPI
• PACX-MPI is a Grid-enabled MPI implementation
• No difference between parallel computing and Grid computing
• Higher latencies for external messages (70 ms compared to 20 µs)
• Co-operation with JAERI regarding the communication library (stampi)
Status of the Implementation (I)
• Full MPI 1.2 implemented
• MPI-2 functionality
  – Extended collective operations
  – Language interoperability routines
  – Canonical pack/unpack functions
• MPI-2 JoD cluster attributes
• Other implemented features:
  – data conversion
  – data compression
  – data encryption
Status of the Implementation (II)
• PACX-MPI ported and tested on:
  – Cray T3E
  – SGI Origin/Onyx
  – Hitachi SR2201 and SR8000
  – NEC SX4 and SX5
  – IBM RS6000/SP
  – SUN platforms
  – Alpha platforms
  – LINUX: IA32 and IA64
Challenge: What is the application performance?
• CFD (HLRS): re-entry simulation of space craft
• Processing of radio astronomy data (MC): pulsar search code
• DSMC (HLRS): simulation of granular media
• Electronic structure simulation (PSC)
• Risk management for environment crisis (JAERI)
• GFMC (NCHC): high-Tc superconductor simulation
• Coupled vibro-acoustic simulation
Comparison DSMC <-> MD
• Domain decomposition
• Speed-up
[Figures: DSMC and MD domain decompositions and speed-up.]
DSMC - Direct Simulation Monte Carlo on Transatlantic Grid

Particles/CPU   without PACX (1 x 60 nodes)   with PACX (2 x 30 nodes)
1953            0.05 sec                      0.28 sec
3906            0.10 sec                      0.31 sec
7812            0.20 sec                      0.31 sec
15625           0.40 sec                      0.40 sec
31250           0.81 sec                      0.81 sec
125000          3.27 sec                      3.30 sec
500000          13.04 sec                     13.41 sec
Necessary Tools:
The DAMIEN project
23.07.2002
Matthias Müller
Höchstleistungsrechenzentrum Stuttgart
7
The development phase
[Workflow: sequential code(s) -> parallelization, code coupling and optimisation (MPI, MpCCI) -> parallel (MPI) code(s) -> compiling and linking with libraries (MpCCI, PACX-MPI) -> debugging and performance analysis (Marmot, MetaVampir) -> testing -> results ok? If not, iterate.]
MpCCI: Basic Functionality
• Communication
  – Based on MPI
  – Coupling of sequential and parallel codes
  – Communicators for codes (internal) and coupling (external)
• Neighborhood search
  – Bucket search algorithm
  – Interface for user-defined neighborhood search
• Interpolation
  – Linear surface interpolation for standard elements
  – Volume interpolation
  – User-defined interpolation
DAMIEN End-User Application
• EADS is distributed over numerous sites all over Europe
• Computing resources are distributed ⇒ a "natural" need for Grid software to couple the resources
• Coupled vibro-acoustic simulations
  – structure of rockets during launch
  – noise reduction in airplane cabins
MetaVampir - Application Level Analysis
The production phase
[Workflow: from a given problem, a Grid-enabled code and experience with small problem sizes, MetaVampir traces and Dimemas traces feed Dimemas and MetaVampir to determine the optimal number of processors and combination of machines; the job is then executed via the Configuration Manager and the QoS Manager.]
Dimemas: Tuning Methodology
[Workflow: a message-passing code instrumented with tracing facilities (MPI, PVM, etc.) runs on a sequential machine to produce a Dimemas trace file; the DIMEMAS simulator models the parallel machine and, together with Paraver visualization and analysis of the resulting trace file, drives code modification and simulation parameter modification.]
DAMIEN tools in the production phase
[Workflow: starting from a Dimemas trace file, edit a new configuration, execute the Dimemas simulator and check the results with Vampir; repeat until all configurations are tested, then specify the best configuration and the QoS parameters and launch the job with the Configuration Manager.]
Case Study: PCM
PCM: direct numerical simulation of turbulent reactive flows
• 2-D flow fields
• detailed chemical reactions
• spatial discretization: 6th-order central derivatives
• integration in time: 4th-order explicit Runge-Kutta
Challenging Applications
Requirements for a 3D simulation
• Components
  – density, velocity and energy
  – mass fractions for chemical species (9 - 100)
• Spatial discretization
  – 100 µm (typical flame front), 1 cm for the computational domain
  – 100 grid points in each direction
• Discretization in time
  – 10^-8 s for some important intermediate radicals
  – 1 s for slowly produced pollutants (e.g. NO)
• Summary
  – 100 variables, 10^6 grid points
  – 1 ms simulation time with time steps of about 10^-8 s
  – 10^5 iterations with 10^8 unknowns
Example of a production run
[Setup: Cray T3E/900, 512 nodes, 64 GByte memory; Hitachi SR8000, 16 nodes / 128 CPUs, 128 GByte memory; coupled via HiPPI, 100 Mbit, 4 ms.]
• auto-ignition process
• fuel (10% H2 and 90% N2, T = 298 K) and heated oxidiser (air at T = 1298 K)
• distribution is superimposed with a turbulent flow field
• temporal evolution computed
Performance Analysis with Meta-Vampir
Message statistics - process view
Message statistics - cluster view
Result of production run: Maximum heat-release
Summary and Conclusion
• To make use of the Grid you need middleware and tools
• A Grid-aware MPI implementation like PACX-MPI offers an incremental, optimized approach to the Grid
• Together with other standard tools this attracted a lot of scientific applications
• Applications are the driving force for PACX-MPI and DAMIEN
• This kind of scientific Grid computing is very demanding, but the requirements are common to other forms of Grid computing
  – performance, resource management, scheduling, security
• The network is the bottleneck, because:
  – Fat networks that are weakly interconnected
  – Political barriers between networks
  – Performance is not transparent
  – No responsibility for end-to-end performance
Overview of the DAME Project
Distributed Aircraft Maintenance
Environment
University of York
Martyn Fletcher
Project Goal
• Build a GRID application for distributed diagnostics.
• The application is generic, but the application demonstrator will be for aircraft maintenance.
• Three-year project; began in January 2002.
• One of six pilot projects funded by the EPSRC under the current UK e-Science initiative.
Outline Of Basic Operation
• On landing, DAME receives data from the engine.
• DAME looks for patterns, performs modelling etc. and provides a diagnosis / prognosis.
• The DAME system is made up of GRID-based web services.
• The DAME system also provides analysis tools etc. for use by Domain Experts.
Benefits of Use
• Allows problems to be detected early and common causes to be detected.
• Ultimately will reduce flight delays, in-flight shutdowns and aborted take-offs due to engine problems.
DAME Collaborators
• University of York
• University of Leeds
• University of Oxford
• University of Sheffield
• Rolls-Royce Plc. (RR)
• Data Systems & Solutions LLC. (DS&S)
• Cybula Ltd.
Technologies Used
• AURA (Advanced Uncertain Reasoning Architecture): high-performance pattern matcher developed by the University of York and Cybula Ltd.
• QUOTE ("On The Engine"): intelligent engine signature collection / local diagnosis system developed by the University of Oxford for Rolls-Royce / DS&S.
• Decision support: University of Sheffield.
• GRID architecture / web services: University of Leeds.
Current Work
• Developing expertise in GRID architectures and web services.
• Developing AURA to make it available as a GRID web service.
• QUOTE work is ongoing.
• Developing decision support.
• Working with the "users", RR and DS&S, to develop the use cases (following slides).
Use Case Process
1. Define the DAME system scope and boundaries.
2. Identify the list of primary actors.
3. Identify the list of primary actor goals / use cases for the DAME system.
4. Describe the outermost (overall) summary use cases.
5. Revise the outermost summary use cases.
6. Expand each DAME system use case.
7. Reconsider and readjust the set of use cases.
Primary Actors
• Maintenance team
• Engine releaser
• Maintenance scheduler
• Maintenance advisor
• Domain expert
• MR&O engine releaser
• MR&O condition recorder
• Knowledge engineer
• System administrator
Outermost Use Cases
• Release Engine.
• Dispossess the QUOTE anomaly.
• Plan Maintenance Schedule.
• Provide Maintenance Advice.
• Provide Maintenance Information.
• Provide Expert Information.
• Pass-Off Engine.
• Capture Knowledge.
• Maintain the System.
DAME Use Cases
• Perform Diagnosis.
• Perform Analysis.
• Model The System.
• Match The Pattern.
• Provide The Decision.
• Update Local Diagnostics.
• Provide Statistics Report.
• Etc.
DAME Use Case Diagram
[UML use case diagram: actors (Engine & GSS, Domain Expert (RR), Engine Releaser (Airline), Maintenance Team, MRO Engine Releaser, Maintenance Scheduler (Airline), Maintenance Advisor (DS&S or airline equivalent), Knowledge Engineer, System Administrator) interact with use cases including Perform Diagnosis (extended by Update Local Diagnostics, Model The System, Match The Pattern, Perform Analysis, and "Perform Diagnosis not detected by QUOTE"; uses Provide The Decision and Store Result), Get Pass-Off Information, Provide System Diagnostics Information, Provide Statistics Report, Perform System Software Maintenance, and Capture Knowledge.]
Use Case - Perform Diagnosis (Main Success Scenario)
1. The DAME system analyses the data using e.g. the Match the Pattern and Model the System use cases, etc.
2. The DAME system assesses the results and determines the diagnoses / prognoses and confidence levels.
3. The Domain Expert (DE) receives the proposed diagnosis / prognosis.
4. The DE accepts it.
5. The DE provides the result to the Maintenance Team.
Use Case - Perform Diagnosis (Extensions)
4a. The Domain Expert does not accept the diagnosis / prognosis: requests more information from the Maintenance Team and / or Performs Analysis. Continues at step 5.
Future Use Case (UC) Work
• Expand existing UCs in discussion with the users, RR and DS&S.
• Use these in planning the DAME demonstrations.
• Update UCs as the project progresses.
• Consider other domains, e.g. medical.
Use Case Conclusions
• Use case diagrams
  – Tend to confuse people.
  – Useful as an overview only.
• Use cases in text form
  – Easily understood.
  – Users like them.
  – Very little jargon.
• Key point: know when to stop expansion of use cases.
FOR MORE INFO...
http://www.cs.york.ac.uk/dame
CFD Grid Research in
N*Grid Project
KISTI Supercomputing Center
Chun-ho Sung
Supercomputing Center
Introduction to N*Grid
z What is N*Grid?
ƒ Korean Grid research initiative
z Construction and Operation of the Korean National Grid
z N*Grid includes
ƒ National Computational Grid
ƒ National Data Grid
ƒ National Access Grid
ƒ National Application Grid (Ex: Bio Grid, CFD Grid, Meteo Grid etc)
z Funded by Korean government through Ministry of Information and
Communication
z KISTI supercomputing center is a primary contractor of N*Grid
Supercomputing Center
1
Scope of N*Grid
z High Performance Computational Grid
ƒ Supercomputers
ƒ High performance clusters
z Advanced Access Grid
ƒ Massive data distribution and processing (Data Grid)
ƒ Collaborative access grid
ƒ Immersive visualization
z Grid Middleware
ƒ Information service, Security, Scheduling, …
z Search and support of Grid Application Project (Seed Project)
ƒ Grid application testbed
ƒ Grid application portals
ƒ Grid applications
Supercomputing Center
CFD & Grid Research
z CFD – computational fluid dynamics
ƒ Nonlinear partial differential equations – Navier-Stokes equations
ƒ Requires huge amount of computing resource
ƒ The most limiting factor is computing power!
z CFD in Grid research
ƒ It can fully exploit the power of computing grid resources.
ƒ Parallel/Distributed computing algorithm in CFD shows high level of
maturity.
ƒ Grand Challege problem can be solved through grid research (direct
numerical simulation of turbulent flow).
ƒ Grid research can receive feedback from real application.
Supercomputing Center
2
CFD in N*Grid
z Virtual Wind Tunnel on Grid infrastructure
[Diagram: CAD system, mesh generation module, flow analysis module and optimization module connected on the Grid.]
Components of Virtual Wind Tunnel
z CAD system
ƒ Define geometry, integrated in grid portal
z Mesh Generator
ƒ Multi-block and/or Chimera grid system
ƒ Semi-automated mesh generation
z Flow Solver
ƒ 3-dimensional Navier-Stokes code parallelized with MPI
z Optimization Module
ƒ Sensitivity analysis, response surface etc
z Database
ƒ Repository for geometries and flow solutions
ƒ Communicate with other discipline code (CSD, CEM)
Supercomputing Center
3
High Throughput Computing Environment
z Improved throughput for
  ƒ Parametric study such as flutter analysis
  ƒ Construction of response surface
[Figure: many cases distributed to the computing Grid to locate the flutter boundary between stable and unstable regions.]
Preliminary Results
z Supercomputer Grid Experiment
[Map: KISTI Compaq GS320 (Taejon), Chonbuk National University IBM SP2, Chonan/Soongsil University cluster and Pusan Dong-Myoung University IBM SP2 connected over KREONet2 with Globus/MPICH-G.]
Preliminary Results - Cont.
z Cluster Grid Experiment
  ƒ 2 Linux PC cluster systems over a WAN
  ƒ duy.kaist.ac.kr: 1.8 GHz P4, 4 nodes, RAM 512M
  ƒ cluster.hpcnet.ne.kr: 450 MHz P2, 4 nodes, RAM 256M
  ƒ F90, PBS, MPICH-G2, GT2.0
[Diagram: the MPICH-G2 job is submitted to duy.kaist.ac.kr/jobmanager-pbs and cluster.hpcnet.ne.kr/jobmanager-pbs; each PBS scheduler dispatches work to its execution nodes.]
A sketch of how such a two-cluster MPICH-G2 job can be launched follows below.
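To make the setup above concrete, here is a minimal sketch of launching a two-cluster MPICH-G2 job: a DUROC multi-request RSL naming both PBS job managers is generated and handed to MPICH-G2's mpirun. The process counts, the executable path and the exact RSL attributes are illustrative assumptions; site configurations differ.

# Hypothetical sketch: build a multi-request RSL for the two clusters named
# on this slide and launch it through MPICH-G2's mpirun.
import subprocess

SUBJOBS = [                      # (GRAM contact, process count) - counts are illustrative
    ("duy.kaist.ac.kr/jobmanager-pbs", 4),
    ("cluster.hpcnet.ne.kr/jobmanager-pbs", 4),
]
EXECUTABLE = "/home/user/cfd_solver"   # hypothetical path to the MPI binary

def make_rsl():
    # One subjob per cluster, indexed so the processes form a single MPI world
    parts = []
    for index, (contact, count) in enumerate(SUBJOBS):
        parts.append(
            '( &(resourceManagerContact="%s")\n'
            '   (count=%d)(jobtype=mpi)(label="subjob %d")\n'
            '   (environment=(GLOBUS_DUROC_SUBJOB_INDEX %d))\n'
            '   (executable=%s) )' % (contact, count, index, index, EXECUTABLE))
    return "+\n" + "\n".join(parts)

if __name__ == "__main__":
    with open("cfd_grid.rsl", "w") as rsl_file:
        rsl_file.write(make_rsl())
    # Assumes grid-proxy-init has already been run
    subprocess.run(["mpirun", "-globusrsl", "cfd_grid.rsl"], check=True)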
Preliminary Results – Cont.
z Simulation of a parallel multi-stage rocket
ƒ 400 thousand grid points & 6 processors
ƒ Chimera methodology
Supercomputing Center
5
Preliminary Results – Cont.
z Aerodynamic Design Optimization
ƒ RAE2822 airfoil design in 2D turbulent flow field
ƒ 10 design variables & 4 processors
ƒ Adjoint sensitivity analysis
Supercomputing Center
Preliminary Results - Cont.
z Obtained parallel efficiency on the supercomputer grid
[Plot: speed-up vs. number of processors for the GS320-SP2 (Onera M6) case, compared with the ideal case.]
Ongoing Efforts
z CFD portal
ƒ PHP based web interface
ƒ GPDK for next version
ƒ Integrated PRE/POST processing interface
z High throughput computing Environment
ƒ Generate parameter set
ƒ Distribute/Submit jobs
ƒ Collect results
z Improved parallel algorithm
z Adequate for WAN
Supercomputing Center
Remarks
z Most application engineers are reluctant to use the grid, since they believe that it is just a WAN version of parallel computing
z We need to prove the power of the grid environment to application engineers, in order to encourage them to use the new grid technology
z Therefore, it is very important to show the capabilities of grid services and what can be done with those services
7
Thank you for your attention!
Supercomputing Center
8
GridTools:
Customizable command line
tools for Grids
Ian Kelley + Gabrielle Allen
Max Planck Institute for Gravitational Physics
Golm, Germany
ikelley@aei.mpg.de
NeSC Apps Workshop
July 20th, 2002
Introduction
• Simple command line tools (in Perl) for testing and
performing operations across TestBeds.
• Motivation:
– Working with 26 machines on the SC2001 testbed
– Tools to help us get our physics users onto the Grid
– Playground for easily testing different scenarios before
building them into portals/applications.
• Have been useful for us, so put them together and
wrote some documentation.
• See also TeraGrid pages:
– http://www.ncsa.uiuc.edu/~jbasney/teragrid-setup-test.html
NeSC Apps Workshop
July 20th, 2002
1
TestBeds
• What do we mean by "TestBed"?
  – My definition of a TestBed: "a collection of machines with some sort of coordinated infrastructure, that is used for a common purpose or by a specific group of users"
  – We want to develop, deploy and test portal and application software
  – Ultimately: want real "users" to view the TestBed as a single resource
• For me:
  – SC2001 (GGF Apps) TestBed
  – GridLab TestBed
  – AEI Relativity Group production machines
  – My personal TestBed
SC2001 TestBed
• 26 Machines
• Very heterogeneous
• All sites worked to build
towards a common setup
(GRAM, GSI, Cactus, Portal,
GIIS)
• At SC2001 showed a Cactus
simulation dynamically spawning
individual analysis tasks to all
machines
•
http://www.aei.mpg.de/~allen/TestBedWeb/
NeSC Apps Workshop
July 20th, 2002
2
NumRel Production TestBed
• This is what we really want!
• For physicists to do physics!
• Hard work!
[Diagram: production machines including Blue Horizon, Lemieux, Psi, Seaborg, Titan, Origin, Platinum, Los Lobos and sr8000, tied together with Globus.]
(Some) TestBed Gripes …
• Software deployment not yet standard/stable
• Information not easy-to-find or up-to-date
• Different security/account policies (firewalls!!)
• Priorities mean things not always fixed quickly.
• Hard to get a global view of current state.
• Have trouble keeping track of changes
• Not everything works as expected ☺
• Basically we need to work in a "research-like" environment, but the more we use it, the more "production-like" it will become …
What We Want To Do
• Run different tests (gsi, gram, etc) on our TestBed to verify that things are working correctly.
• Easily get up-to-date global views of our testbeds.
• Log files for tracking history, stability, etc.
• Easily add and configure machines and tests.
• Construct and test more complex scenarios for applications.
• Something that our end-users can also use!
Higher Level Scenarios
• For example, for our portal/applications we want to test the feasibility/usefulness etc. of:
  – Remote code assembly and compilation
  – Repositories of executables
  – Things specific for Cactus: parameter files, thornlists
  – Data description archiving, selection, transfer
  – Visualisation
  – Design of user-orientated interfaces
  – User customisations
  – Collaborative/group issues
  – Simulation announcing/steering/tracking
• These also require work on the applications!!
GridTools Aims
• Give a wrapper around Globus tools that enables
scripting capability to perform multiple tasks.
• Provide additional functionality such as a pseudo
database for storing machine and configuration
specific information.
• Modularization of functionality to allow for easy
development of more complex programs.
NeSC Apps Workshop
July 20th, 2002
What You Get
• Basic scripts:
– TestAuth
– TestResources
• Report making
– CreateTEXT
– CreateHTML
– CreateMAP
• A Library:
– GridTools.pm
• A Pseudo-database
– grid.dat
– (all the stuff we
really want to get
from e.g. MDS)
• Other stuff
NeSC Apps Workshop
July 20th, 2002
5
TestAuth Output
NeSC Apps Workshop
July 20th, 2002
Current Tests for TestResources
• Authorize to Globus Gatekeeper
• Simple GRAM job submission
• Using GSIFTP to copy files
• Using GSISCP to copy files
• Testing GSISSH in batch mode
• Simple job run using GASS server
• Simple MPI job run using GASS server
• Using machine-specific predefined RSLs to execute a simple job
• Very simple to add new tests. (A sketch of one such test follows below.)
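For illustration only (GridTools itself is written in Perl), the sketch below shows the shape of two of these checks: wrap a GT2 command-line client in a timeout and report a result per machine. The host names are hypothetical.

# Hypothetical Python illustration of TestResources-style checks; the real
# GridTools scripts are Perl and read machine data from grid.dat.
import subprocess

MACHINES = ["modi4.ncsa.uiuc.edu", "origin.aei.mpg.de"]   # hypothetical contacts

def run_test(args, timeout=30):
    """Run a Globus CLI command; report ok / failed / timed out."""
    try:
        result = subprocess.run(args, capture_output=True, timeout=timeout)
        return "ok" if result.returncode == 0 else "FAILED"
    except subprocess.TimeoutExpired:
        return "TIMED OUT"

for host in MACHINES:
    # "Authorize to Globus Gatekeeper": an authentication-only ping
    print(host, "gatekeeper:", run_test(["globusrun", "-a", "-r", host]))
    # "Simple GRAM job submission": run /bin/date through the jobmanager
    print(host, "gram job:  ", run_test(["globus-job-run", host, "/bin/date"]))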
TestResources Output
NeSC Apps Workshop
July 20th, 2002
TestResources Output
NeSC Apps Workshop
July 20th, 2002
7
Extensibility
• GridTools.pm, a Perl module, contains many common functions that allow you to easily write additional scripts or modify the existing ones.
  – Such as execution of commands via fork() or using timeouts
  – Reading of machine configuration
  – User (text based) interfaces
• Could implement other useful functionality
  – Timing how long things take to complete
  – More advanced monitoring (how often do different services go down on different machines)
  – Querying of information servers to update the local database, or vice versa
• GridTools can be extended to perform more complicated tasks.
  – Such as real job submission, using RSL templates and compilation-specific information
  – Distribution or aggregation of files and processes
Conclusion
• GridTools can help you to run tests on a group of computers to
provide you with a general overview of the status of your
TestBed.
• Can be extended to include more complicated tasks such as job
distribution and compilation.
• Obtain from CVS:
  – cvs -d :pserver:cvs_anon@cvs.aei.mpg.de:/numrelcvs login
    (password: anon)
  – cvs -d :pserver:cvs_anon@cvs.aei.mpg.de:/numrelcvs co GridTools
• Contact me for help/comments: ikelley@aei.mpg.de
NeSC Apps Workshop
July 20th, 2002
8
Exposing Legacy Applications as OGSI
Components using pyGlobus
Keith R. Jackson
Distributed Systems Department
Lawrence Berkeley National Lab
NeSC Grid Apps Workshop
Overview
• The Problem?
• Proposed Solution
• Why Python?
• Tools for generating Python interfaces to C/C++/Fortran
• pyGlobus Overview
• Current support in pyGlobus for Web Services
• OGSI plans for pyGlobus
• Steps for Exposing a Legacy Application
• Contacts & Acknowledgements
The Problem?
• Many existing codes in multiple languages, e.g., C,
C++, Fortran
— Would like to make these accessible on the Grid
• Should be accessible from any language
— Need to integrate standard Grid security
mechanisms for authentication
— Need a standard framework for doing
authorization
— Would like to avoid custom “one off” solutions for
each code
NeSC Grid Apps Workshop
The Solution
• Provide a framework that legacy applications can easily be
plugged into
— Must be easy to add applications written in many languages
• Use the Python language as the “glue”
• The framework should support:
— Authentication using standard Grid mechanisms
— Flexible authorization mechanisms
— Lifecycle management
• Including persistent state
• Develop one container, and reuse it for many legacy
applications
• Use Web Services protocols to provide language neutral
invocation and control
— Use standard high-performance Grid protocols, e.g.,
GridFTP, for data transfer
NeSC Grid Apps Workshop
2
Solution (cont.)
[Architecture diagram: a client connects with GSI authentication to a Python container holding an authorization adapter, a dispatcher, lifecycle management operations, state management operations and an application factory; the factory drives a Python shadow class that wraps the legacy application.]
Why Python?
• Easy to learn/read high-level scripting language
— Very little syntax
• A large collection of modules to support common
operations, e.g., networking, http, smtp, ldap, XML,
Web Services, etc.
• Excellent for “gluing” together existing codes
— Many automated tools for interfacing with
C/C++/Fortran
• Support for platform independent GUI components
• Runs on all popular OS’s, e.g., UNIX, Win32,
MacOS, etc.
• Support for Grid programming with pyGlobus,
PyNWS, etc.
NeSC Grid Apps Workshop
3
Tools for Interface Generation
• SWIG (Simple Wrapper Interface Generator)
— Generates interfaces from C/C++
• Supports the full C++ type system
— Can be used to generate interfaces for Python, Perl, Tcl,
Ruby, Guile, Java, etc.
— Automatic Python “shadow class” generation
— http://www.swig.org/
• Boost.Python (Boost Python interface generator)
— Generates interfaces from C++
— http://www.boost.org/libs/python/doc/
• PyFort (Python Fortran connection tool)
— Generates interfaces from Fortran
— http://pyfortran.sourceforge.net/
• F2PY (Fortran to Python Interface Generator)
— Generates interfaces from Fortran
— http://cens.ioc.ee/projects/f2py2e/
NeSC Grid Apps Workshop
pyGlobus Overview
• The Python CoG Kit provides a mapping between
Python and the Globus Toolkit™. It extends the use
of Globus by enabling access to advanced Python
features such as events and objects for Grid
programming.
• Hides much of the complexity of Grid programming
behind simple object-oriented interfaces.
• The Python CoG Kit is implemented as a series of
Python extension modules that wrap the Globus C
code.
• Provides a complete interface to GT2.0.
• Uses SWIG (http://www.swig.org) to help generate
the interfaces.
NeSC Grid Apps Workshop
4
pyGlobus and Web Services
• Provides a SOAP toolkit that supports
SOAP/HTTP/GSI
— Allows standard GSI delegation to web services
— Interoperates with the GSI enabled Java SOAP
• XSOAP from Indiana
• Axis SOAP from ANL
— Currently based on SOAP.py, but switching to
ZSI
• ZSI supports document-oriented SOAP
• ZSI supports much more flexible encoding of complex
types
NeSC Grid Apps Workshop
GSISOAP Client Example

from pyGlobus import GSISOAP, ioc

proxy = GSISOAP.SOAPProxy("https://host.lbl.gov:8081",
                          namespace="urn:gtg-Echo")
proxy.channel_mode = ioc.GLOBUS_IO_SECURE_CHANNEL_MODE
proxy.delegation_mode = ioc.GLOBUS_IO_SECURE_DELEGATION_MODE_NONE
print proxy.echo("spam, spam, spam, eggs, and spam")
GSISOAP Server Example

from pyGlobus import GSISOAP, ioc

def echo(s, _SOAPContext):
    cred = _SOAPContext.delegated_cred
    # Do something useful with cred here
    return s

server = GSISOAP.SOAPServer("host.lbl.gov", 8081)
server.channel_mode = ioc.GLOBUS_IO_SECURE_CHANNEL_MODE_GSI_WRAP
server.delegation_mode = ioc.GLOBUS_IO_SECURE_DELEGATION_MODE_FULL_PROXY
server.registerFunction(SOAP.MethodSig(echo, keywords=0, context=1),
                        "urn:gtg-Echo")
server.serve_forever()
OGSI Plans for pyGlobus
• Develop a full OGSI implementation in Python
— Planned alpha release of an OGSI client by the
end of August
— OGSI hosting environment based on WebWare
(http://webware.sourceforge.net/)
• Dynamic web service invocation framework
— Similar to WSIF (Web Services Invocation
Framework) from IBM for Java
• http://www.alphaworks.ibm.com/tech/wsif
— Download and parse WSDL document, create
request on the fly
— Support for multiple protocol bindings to WSDL
portTypes
NeSC Grid Apps Workshop
6
Steps to Expose a Legacy App
• Wrap the legacy application to create a series of
Python classes or functions
— Use one of the automated tools to help with this
• Use pyGlobus to add any needed Grid support
— GridFTP client to move data files
— IO module for GSI authenticated network
communication
• Extend the GridServiceFactory class to implement
any custom instantiation behavior
• Add the Python shadow class to the container
— XML descriptor file used to control properties of
the class, e.g., security, lifecycle, etc.
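A minimal sketch of these steps, reusing the GSISOAP server shown earlier: the legacy_solver module stands for a SWIG-generated wrapper and its solve() entry point is hypothetical, and the real framework adds the factory, authorization and lifecycle pieces described above.

# Hypothetical sketch: expose a SWIG-wrapped legacy code through the GSISOAP
# server from the earlier example.  legacy_solver and solve() are placeholders.
import SOAP                      # SOAP.py toolkit, as in the server example
from pyGlobus import GSISOAP, ioc
import legacy_solver             # hypothetical SWIG-generated wrapper module

def run_simulation(input_text, _SOAPContext):
    # The delegated GSI credential could drive authorization, or be used to
    # stage input data with GridFTP on the caller's behalf.
    cred = _SOAPContext.delegated_cred
    return legacy_solver.solve(input_text)

server = GSISOAP.SOAPServer("host.lbl.gov", 8081)
server.channel_mode = ioc.GLOBUS_IO_SECURE_CHANNEL_MODE_GSI_WRAP
server.delegation_mode = ioc.GLOBUS_IO_SECURE_DELEGATION_MODE_FULL_PROXY
server.registerFunction(SOAP.MethodSig(run_simulation, keywords=0, context=1),
                        "urn:legacy-Solver")
server.serve_forever()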
Contacts / Acknowledgements
• http://www-itg.lbl.gov/
• krjackson@lbl.gov
• This work was supported by the Mathematical,
Information, and Computational Science Division
subprogram of the Office of Advanced Scientific
Computing Research, U.S. Department of Energy,
under Contract DE-AC03-76SF00098 with the
University of California.
NeSC Grid Apps Workshop
7
www.gridlab.org
An Overview of the GAT API
Tom Goodale
Max-Plank-Institut fuer Gravitationsphysik
goodale@aei-potsdam.mpg.de
The Problem
 Application developers want to use the Grid now. The Grid is, however, a rapidly changing environment.
 What technology should they use?
 How soon will this technology be obsolete?
 Much of the existing Grid technology is very low level; it allows sophisticated programmers to do things, but leaves some application developers cold.
 Grid technologies are not deployed at all sites to the same extent.
 Something which works in one place may need to be tweaked to work elsewhere, even if the technologies used are, in principle, the same.
 It can take a lot of effort to get services deployed, but developers want to be able to develop and test their applications while this is going on.
The Solution?
 Need an API which insulates the application developer from the details of low-level grid operations.
 Need an API which is "future-proof".
 Need to make sure that developers can write and test their applications irrespective of the underlying state of deployment of Grid technologies.
GridLab
 The GridLab project, which started in January of this year, seeks to provide such an API.
 Three-year project.
 Will provide the Grid Application Toolkit (GAT) API.
 Will provide a set of services which can be accessed via the GAT API.
GAT
 What does it mean to provide the GAT API?
   An API specification.
   A sample implementation.
   A set of services (or equivalents) available through the sample implementation.
 What else?
   A testbed to prove that it works.
   Applications using it!
GAT-API
 Need to identify the Grid operations which people want to do.
 Needs to be high level, rather than just duplicating existing low-level APIs.
 Must give the user the choice of which sets of low-level functionality are actually used.
 Must allow application developers to do everything they want to do through the API, rather than forcing them to access specific Grid technologies with specific calls.
 Must provide the capability to see what actually happened, so that users or developers can diagnose problems.
 Must have bindings for many languages.
Which Operations
 Spawn, migrate
 Checkpoint
 Find resource, allocate resource
 Find process, find data
 Copy data, send/receive data
 Security
   Allow selective access to data, processes, etc.
   Multiple VOs, etc.
(A toy illustration of this level of abstraction follows below.)
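To illustrate the level of abstraction the GAT aims at, here is a toy sketch. It is not the GAT API (which was still being specified at the time); it only shows application code calling high-level operations while an interchangeable adaptor, here a trivial local one, does the work underneath.

# Purely illustrative, not the real GAT: the application names no specific
# Grid technology; a swappable adaptor supplies the actual mechanism.
import subprocess

class ToyGat:
    """Hypothetical stand-in for a GAT-like context with one 'local' adaptor."""
    def find_resource(self, requirements):
        return "localhost"                                 # pretend resource discovery
    def submit(self, resource, executable, args):
        return subprocess.Popen([executable] + args)       # pretend job submission
    def copy(self, source, dest):
        subprocess.run(["cp", source, dest], check=True)   # pretend data movement

def run_job(gat):
    resource = gat.find_resource({"cpus": 1})
    job = gat.submit(resource, "/bin/echo", ["hello grid"])
    job.wait()

run_job(ToyGat())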
GAT Implementation
 Must be modular to allow access both to current Grid technologies and future Grid technologies.
 Must be modular to be independent of any specific technology.
 Core (the API bindings) must be deployable on all architectures.
 Core must return sensible error codes in the absence of any other part.
 Should be able to provide access to service equivalents, so that the application can work even in the absence of externally-deployed services or behind a firewall.
 Must be able to react to changing Grid conditions and allow access to new services as they become available.
 Must allow access to existing services, as well as the services being developed within the GridLab project.
Status
 Very rough prototype implementation of the core.
 Continuing work to define the set of operations available through the API, and hence the precise API specification.
 Work to develop a set of GridLab services to be accessed via the API.
NeSC Workshop on Applications and Testbeds on the Grid
Collaborative Tools for Grid
Support
Laura F. McGinnis
Pittsburgh Supercomputing Center
Pittsburgh, PA, USA
July 20, 2002
This Talk
The Software Tools Collaboratory, an NIH funded project at the Pittsburgh Supercomputing
Center, has recently completed a protein folding simulation, utilizing 4 major systems at
3 geographically distributed sites. This presentation will discuss the issues related to
setting up and running the simulation, from the perspective of establishing and
maintaining the infrastructure and communication among participants before,
during and after the simulation. As many sites have found, coordinating the resources
for grid computing is more than just a matter of synchronizing batch schedulers. This
presentation will share the collaboratory’s experience in supporting and managing
communication among participants, especially in the back channels, using
common, publicly available tools.
• The Experiment
• The Tools
• Alternatives
lfm@psc.edu
20-July-2002
NeSC Applications Workshop
Glasgow, Scotland
1
The Experiment: The Task
• Simulate protein folding for a 400-point surface using CHARMM
• Using the 4 systems, the elapsed time for this experiment was estimated to take 30 hours
The Experiment: The Cast
• 1 Scientist at The Scripps Research Institute in San Diego, California
• 1 Legion Administrator at the University of Virginia
• 4 machines of different architectures
  • TCS Alphacluster at Pittsburgh Supercomputing Center
  • T3E at Pittsburgh Supercomputing Center
  • IBM SP at San Diego Supercomputing Center
  • Linux Cluster at University of Virginia
• The Chorus: 4 Collaboratory support observers at 2 locations in Pittsburgh
The Experiment: Prep Work
• Establish the protocol
• Test the components
• Including the collaborative tools
• Coordinate dedicated time
• On all platforms
• With all necessary support staff
The Experiment: The Run
• Launch jobs to each machine via
Legion
• Monitor progress of jobs as they run
• Collect results back to the
scientist’s site for evaluation
The Tools: Prep Tools
• Email:
  • Majordomo
  • Mhonarc
• Document Management:
  • Enotes (US Department of Energy, Oak Ridge National Laboratory)
The Tools: Run Tools
• Application Sharing:
  • Microsoft NetMeeting, SGI's SGIMeeting
• Communication:
  • AOL Instant Messenger
• Email:
  • Majordomo
The Tools: AIM
[Chart: chatroom activity over time (messages per hour, roughly 9:00–18:00) for Day 1, Day 2 and Day 3 of the run.]
The Tools: AIM
Participation by Cast Member: Scientist 54%, Legion Admin 16%, Systems 20%, Chorus 10%
The Tools: AIM
Statement Types: Lemieux 39%, Legion 27%, Blue Horizon 16%, Jaromir 7%, Admin 5%, NetMeeting 3%, Misc 2%, UVa 1%
Alternative Tools: Document Management
• DoE Electronic Notebooks:
  • Enote@LBNL
  • Enote@PNNL
  • Enote@ORNL
• DocShare
• UServ
• CVS (Concurrent Versions System)
• Shared Disk Space
  • PCs
  • Unix
Alternative Tools: Document Sharing Evaluation Criteria
• Available from
• Cost
• Ease of Use
• Sharable Document Types
• Web-Based Interface
• Web Client Support
• Cross-Platform
• Client System Requirements
• Server System Requirements
• Shared Editing
• Editing Method
• Object Handling Method
• Source Code Availability
• History Tracking
• Other features/notes
Alternative Tools: Communication
• Instant Messenger Services
  • AOL Instant Messenger
  • Yahoo Instant Messenger
  • Microsoft Instant Messenger
• ICQ
• Internet Relay Chat (IRC)
• Zephyr
• Imici
Alternative Tools: Communications Evaluation Criteria
• Provider
• Cost
• Size of Download
• Ease of Signup
• File Transfer Ability
• Logging
• Chat Rooms
• SPAM Problems
• People Locator
• Information Required
• Auto-ID
• Firewall/Proxy Usage
• Platform
• System Requirements
• Internet Email
• Notes
Lessons Learned
• Running applications on a grid is still a people-intensive activity
• Participants require strong communication tools
• Custom solutions are still the norm
• Use cases should be analyzed to identify commonalities which can be addressed (and differentiators that cannot be)
• Tools are available to support back-channel communications
  • Don't overlook "popular" software
  • There are also low-overhead tools available
  • Know your risk and annoyance threshold
The Cast: Credits
• The Scientist – Mike Crowley
• The Legion Administrator – Glenn Wasson
• The System Administrators
  • TCS Alphacluster and T3E @ PSC – Chad Vizino
  • IBM SP @ SDSC – Kenneth Yoshimoto
  • Linux Cluster @ UVa – Glenn Wasson
• The Chorus
  • Sergiu Sanielivici (PSC)
  • Cindy Gadd (UPMC)
  • Robb Wilson (UPMC)
  • Laura McGinnis (PSC)
Appendix 1: Contact Information for Communication Tools
• AOL Instant Messenger (aol.com)
• Yahoo Instant Messenger (yahoo.com)
• ICQ (ICQ.com)
• Microsoft Instant Messenger (microsoft.com)
• Internet Relay Chat (mirc.com, ircle.com)
• Imici (imici.com)
• Zephyr (mit.edu)
Appendix 2: Contact Information for Document Management Tools
• Enote@LBNL (http://vision.lbl.gov/~ssachs/doe2000/lbnl.download.html)
• Enote@PNNL (http://www.emsl.pnl.gov:2080/docs/collab/)
• Enote@ORNL (http://www.epm.ornl.gov/~geist/java/applets/enote/)
• DocShare (http://collaboratory.psc.edu/tools/docshare/faq.html) (must email nstone@psc.edu)
• UServ (http://userv.web.cmu.edu/userv/Download.jsp)
• CVS (Concurrent Versions System) (http://collaboratory.psc.edu/tools/cvs/faq.html)
Application Web Services and
Event / Messaging Systems
NeSC Glasgow July 20 2002
PTLIU Laboratory for Community Grids
Geoffrey Fox, Shrideep Pallickara, Marlon Pierce
Computer Science, Informatics, Physics
Indiana University, Bloomington IN 47404
http://www.naradabrokering.org
http://grids.ucs.indiana.edu/ptliupages
gcf@indiana.edu
Application Portal in a Minute (box)
• Systems like Unicore, GPDK, Gridport (HotPage), Gateway, Legion provide "Grid or GCE Shell" interfaces to users (user portals)
  • Run a job; find its status; manipulate files
  • Basic UNIX Shell-like capabilities
• Application Portals (Problem Solving Environments) are often built on top of "Shell Portals", but this can be quite time consuming
  • Application Portal = Shell Portal Web Service + Application (factory) Web Service
Application Web Service
• Application Web Service is ONLY metadata
  • Application is NOT touched
• Application Web Service is defined by two sets of schema:
  • First set defines the abstract state of the application
    • What are my options for invoking myapp?
    • Dub these "abstract descriptors"
  • Second set defines a specific instance of the application
    • I want to use myapp with input1.dat on solar.uits.indiana.edu.
    • Dub these "instance descriptors".
• Each descriptor group consists of
  • Application descriptor schema
  • Host (resource) descriptor schema
  • Execution environment (queue or shell) descriptor schema
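For illustration only, the Java classes that a schema-to-Java tool (the talk uses Castor) might generate from the two descriptor levels could look roughly like this; the field names are hypothetical and are not the actual Community Grids schemas.

// Hypothetical sketch of the two descriptor levels described above; the real
// schemas, and the Java that Castor generates from them, will differ.
public class ApplicationDescriptors {

    // Abstract descriptor: what are my options for invoking myapp?
    public static class AbstractApplicationDescriptor {
        public String applicationName;      // e.g. "myapp"
        public String[] inputParameters;    // names of allowed inputs
        public String[] supportedHosts;     // hosts on which the code is installed
    }

    // Instance descriptor: one concrete run of the application.
    public static class ApplicationInstanceDescriptor {
        public String applicationName;      // "myapp"
        public String inputFile;            // "input1.dat"
        public String host;                 // "solar.uits.indiana.edu"
        public String executionEnvironment; // queue or shell to use
    }
}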
Engineering Application WS
• Schema wizard: given a Schema, creates a (JSP) web page with a form to specify XML instances
  • Use for application metadata
• AntiSchema wizard: given an HTML form, creates a Schema
  • Captures input parameters of the application
• Castor converts Schema to Java
  • Use Python if you prefer!
• Apache converts Java into Web Services
• Make this into a portlet for use in your favorite portal
• Being used today in DoD …….. (with and without Globus)
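Assuming the "Apache converts Java into Web Services" step uses something like Apache Axis-style JWS deployment (an assumption; the talk does not name the exact tool), exposing the generated Java can be as simple as dropping a source file into the Axis webapp. The class and method names below are made up for illustration.

// SubmitApp.jws – a toy service dropped into an Apache Axis webapp.
// Axis compiles the file and exposes its public methods as SOAP operations,
// generating WSDL on request; the logic here is illustrative only.
public class SubmitApp {

    public String invoke(String application, String inputFile, String host) {
        // In a real application Web Service this would hand the instance
        // descriptor to the shell/factory service; here we just echo it.
        return "Would run " + application + " with " + inputFile + " on " + host;
    }
}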
Different Web Service Organizations
• Everything is a resource implemented as a Web Service, whether it be:
  • back-end supercomputers and a petabyte of data
  • Microsoft PowerPoint and this file
• Web Services communicate by messages …..
• Grids and Peer-to-Peer (P2P) networks can be integrated by building both in terms of Web Services with different (or in fact sometimes the same) implementations of core services such as registration, discovery, life-cycle, collaboration and event or message transport …..
  • Gives a Peer-to-Peer Grid
• Here we discuss the Event or Message Service linking Web Services together
Peer to Peer Grid
[Diagram: peers and databases exposed through user-facing and resource-facing Web Service interfaces, linked by event/message brokers that integrate P2P and Grid/WS – "a democratic organization".]
[Diagram: software resources and databases, each wrapped in an "XML skin", connected by message- or event-based interconnections.]
e-Science/Grid/P2P networks are XML-specified resources connected by XML-specified messages; the implementation of a resource or connection may or may not be XML.
Role of Event/Message Brokers
• We will use events and messages interchangeably
  • An event is a time-stamped message
• Our systems are built from clients, servers and "event brokers"
  • These are logical functions – a given computer can have one or more of these functions
  • In P2P networks, computers typically multifunction; in Grids one tends to have separate-function computers
  • Event brokers "just" provide message/event services; servers provide traditional distributed object services as Web Services
• There are functionalities that depend only on the event itself and perhaps the data format; they do not depend on details of the application and can be shared among several applications
  • NaradaBrokering is designed to provide these functionalities
  • MPI provided such functionalities for all parallel computing
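Since the broker pattern is illustrated later in this talk by JMS publish-subscribe (and NaradaBrokering is compared against a commercial JMS broker), a minimal JMS example of the pattern follows. The JNDI names and the message text are placeholders; any JMS provider can supply the connection factory and topic.

import javax.jms.*;
import javax.naming.InitialContext;

// Minimal JMS publish-subscribe sketch of the broker pattern described above:
// publishers and subscribers never talk to each other directly, only to the broker.
public class EventBrokerDemo {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        TopicConnectionFactory factory =
                (TopicConnectionFactory) ctx.lookup("TopicConnectionFactory"); // placeholder JNDI name
        Topic topic = (Topic) ctx.lookup("grid/events");                        // placeholder topic

        TopicConnection connection = factory.createTopicConnection();
        TopicSession session =
                connection.createTopicSession(false, Session.AUTO_ACKNOWLEDGE);

        // Subscriber: destination-source matching is done by the broker.
        TopicSubscriber subscriber = session.createSubscriber(topic);
        subscriber.setMessageListener(new MessageListener() {
            public void onMessage(Message message) {
                try {
                    System.out.println("Event: " + ((TextMessage) message).getText());
                } catch (JMSException e) {
                    e.printStackTrace();
                }
            }
        });
        connection.start();

        // Publisher: sends a time-stamped message (an "event") to the broker.
        TopicPublisher publisher = session.createPublisher(topic);
        publisher.publish(session.createTextMessage("simulation checkpoint written"));
    }
}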
NaradaBrokering implements an Event Web Service
[Diagram: Web Service 1 and Web Service 2 connected through their WSDL ports to a broker that provides a (virtual) queue, destination-source matching, routing, filtering and workflow.]
• Filter is mapping to a PDA or slow communication channel (universal access) – see our PDA adaptor
• Workflow implements message processing
• Routing is illustrated by JXTA
• Destination-source matching is illustrated by JMS using the publish-subscribe mechanism
Engineering Issues Addressed by Event / Messaging Service
• Application-level Quality of Service – give audio highest priority
• Tunnel through firewalls
• Filter messages to slow (collaborative or real-time) clients
• Hardware multicast is erratically implemented (the Event service can dynamically use software multicast)
• Scaling of software multicast
• Elegant implementation of collaboration in a Groove Networks (done better) style
• Integrate synchronous and asynchronous collaboration
Features of Event Service I
• MPI nowadays aims at a microsecond latency
• The Event Web Service aims at a millisecond latency
  • Typical distributed-system travel times are many milliseconds (to seconds for geosynchronous satellites)
  • Different performance/functionality trade-off
• Messages are not sent directly from publisher P to subscriber S, but rather from P to broker B and from broker B to subscriber S
  • Actually a network of brokers
• Synchronous systems: B acts as a real-time router/filterer
  • Messages can be archived and software multicast
• Asynchronous systems: B acts as an XML database and workflow engine
• Subscription is, in each case, roughly equivalent to a database query
Features of Event Web Service II
• In principle, message brokering can be virtual and compiled away, in the same way that WSDL ports can be bound in real time to the optimal transport mechanism
  • All Web Services are specified in XML but can be implemented quite differently
  • Audio/video conferencing sessions could be negotiated using SOAP (raw XML) messages and agree to use certain video codecs transmitted by UDP/RTP
• There is a collection of XML Schema – call it GXOS – specifying the event service and the requirements of message streams and their endpoints
  • One can sometimes compile message streams specified in GXOS to MPI or to a local method call
• The Event Service must support dynamic heterogeneous protocols
Features of Event Web Service III
• The event web service is naturally implemented as a dynamic distributed network
  • Required for fault tolerance and performance
• A new classroom joins my online lecture
  • A broker is created to handle the students – it multicasts my messages locally to the classroom and handles local messages between students with high performance
• Company X sets up a firewall
  • The event service sets up brokers on either side of the firewall to optimize transport through the firewall
• Note all message-based applications use the same message service
  • Web Services imply ALL applications are (possibly virtually) message based
Single Server P2P Illusion
[Diagram: traditional collaboration architecture, e.g. commercial WebEx – a central collaboration server backed by a database.]
Narada Broker Network
[Diagram: a network of brokers providing the message/event service, linking several (P2P) communities, resources and databases; software multicast is used within communities.]
NaradaBrokering and JMS (Java Message Service)
[Chart: mean transit delay (milliseconds) for message samples in Narada and SonicMQ (commercial JMS) at low publish rates (0–25 messages/sec) and small payload sizes (100–550 bytes).]
JXTA just got slower
[Chart: mean transit delay (milliseconds) versus message payload size (0–600 bytes) for Client ⇔ JXTA ⇔ JXTA ⇔ Client, Client ⇔ JXTA ⇔ Narada ⇔ JXTA ⇔ Client, Client ⇔ JXTA ⇔ JXTA ⇔ Client multicast, and pure Narada (2 hops).]
Shared Input Port (Replicated WS) Collaboration
[Diagram: collaboration as a Web Service – a session is set up and the master's input events are distributed by the event (message) service to replicated Web Services, each with its own WS viewer and WS display, for the other participants.]
Shared Output Port Collaboration
[Diagram: collaboration as a Web Service – a Web Service message interceptor captures the output port (WSDL) of a single application or content-source Web Service, and the event (message) service delivers it to the WS viewers and displays of the master and the other participants.]
NaradaBrokering Futures
• Higher performance – reduce minimum transit time to around one millisecond
• Substantial operational testing
• Security – allow Grid (Kerberos/PKI) security mechanisms
• Support of more protocols with dynamic switching as in JXTA – SOAP, RMI, RTP/UDP
• Integration of a simple XML database model using JXTA Search to manage distributed archives
• More formal specification of "native mode" and dynamic instantiation of brokers
• General collaborative Web Services
Grid Portals: Bridging the gap between scientists and the Grid
Michael Russell, Jason Novotny, Gabrielle Allen
Max Planck Institute for Gravitational Physics
Golm, Germany
www.gridlab.org
The promises of Grid computing are grand
• Uniform access to heterogeneous resources
• The ability to pool distributed resources together on demand
• Resources are either transparently available to users or users simply don't have to worry about them
• Support virtual organizations of distributed researchers collaborating across institutional, geographic, and political boundaries…
• Support applications with enormous computing and/or data management requirements.
Grid computing is quickly evolving
• In the 2 years I've been here, Globus went from 1.3 to 2.0
• Grid Portals became "the next big thing"
• Luckily, portlets have come to the rescue
• Meanwhile, Globus went pro to bring in IBM and other heavyweights
• Now OGSA is stepping up to bat
• The Global Grid Forum is already a hit
• Can't wait for the video games!
So where is the Grid?
• How can we use it? Or at least that's what so-called users are probably asking.
• The Grid is a work in progress, one of the biggest undertakings in the history of humankind. Most of you are in Scotland to do your part in building the Grid.
• And most of you have reasons why you need the Grid – you want to use it, right? Or perhaps you want to help others to use it.
• The point is, a large gap exists between the Grid and its would-be users.
Grid Portals
• So you want to build a Grid portal to bridge that gap. But what does it take to build a Grid portal? Well, it's going to take a lot…
• It takes a solid understanding of the state-of-the-art in Web portal development.
• It takes a solid understanding of the state-of-the-art in Grid computing.
• And much more…
The Astrophysics Simulation Collaboratory
• The ASC seeks to:
  • Enable astrophysicists to develop better and more powerful simulations with the Cactus Computational Toolkit.
  • Enable astrophysicists to run and analyze simulations on Grids.
  • Build a Grid portal to support these activities and all the rest that comes with Web portals.
The ASC and Cactus
• So we worked with the Cactus Project to develop support for working with Cactus from the ASC Portal.
• We developed application Web components to:
  • Install Cactus software from multiple CVS repositories onto target resources.
  • Build executables with those installations using the autoconf and make capabilities in Cactus.
  • Upload and edit parameter files.
  • Run simulations on target resources, as well as connect to simulations and monitor their progress.
  • Launch the appropriate visualization applications on Cactus data.
• In most cases, the Cactus Project developed extensions to Cactus software to support these components.
The ASC and Globus
• We worked with the Globus Project to develop support for working with Globus from the ASC Portal.
• We developed Grid Web components to:
  • Enable logging on with one or more Grid proxy certificates stored in MyProxy.
  • Submit and monitor jobs with the Globus Gatekeeper and maintain job history.
  • Manipulate files (listing, copying, moving, deleting) with GSIFTP.
  • We tried to use MDS, but at that time MDS did not meet our needs, so we developed our own components for storing static information about resources and polling for whatever dynamic information could be reliably retrieved from MDS.
• We asked Globus to build GSI-CVS and now we're building Web components and services to use and extend GSI-CVS.
• We've added support for GSI authentication in the MindTerm Java SSH implementation.
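A Grid Web component of the kind described above is typically a thin layer of glue around the Grid client libraries. The following is a hypothetical sketch only: PortalJobService, GramClient and MyProxyClient are invented names standing in for whatever the ASC Portal or a CoG-style client library actually provides.

// Hypothetical sketch of an ASC-style "submit and monitor jobs" component.
// GramClient and MyProxyClient are placeholders for the real Grid client
// library; their names and methods are invented for illustration.
public class PortalJobService {

    private final MyProxyClient myProxy;   // retrieves the user's delegated proxy
    private final GramClient gram;         // talks to the Globus Gatekeeper

    public PortalJobService(MyProxyClient myProxy, GramClient gram) {
        this.myProxy = myProxy;
        this.gram = gram;
    }

    // Log the user in with a proxy certificate stored in MyProxy, submit the
    // job to a gatekeeper, and record it so the portal can show job history.
    public String submit(String user, String passphrase,
                         String gatekeeperContact, String rsl) throws Exception {
        Object proxy = myProxy.getProxy(user, passphrase);
        String jobId = gram.submit(gatekeeperContact, rsl, proxy);
        JobHistory.record(user, gatekeeperContact, rsl, jobId);
        return jobId;
    }

    public String status(String jobId) throws Exception {
        return gram.getStatus(jobId);
    }

    // Collaborator interfaces, defined here only so the sketch is self-contained.
    public interface MyProxyClient { Object getProxy(String user, String passphrase) throws Exception; }
    public interface GramClient {
        String submit(String contact, String rsl, Object proxy) throws Exception;
        String getStatus(String jobId) throws Exception;
    }
    static class JobHistory {
        static void record(String user, String contact, String rsl, String jobId) {
            System.out.println(user + " submitted " + jobId + " to " + contact + ": " + rsl);
        }
    }
}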
The ASC and ASC
• We realized we needed to support the people within the ASC directly from our portal, so we built administrative Web components to:
  • Manage user accounts.
  • Assign security roles to users and control which pages users have access to.
  • Manage the proxy certificates users require to authenticate to other services, and determine which certificates authenticate to which services.
  • Manage the resources and services to which we provide access from the ASC Portal.
Problems we faced
• The ASC is a Virtual Organization in every sense of the word, with users, developers, administrators, software, and resources distributed across the U.S. and Europe.
• With our developer(s) at the University of Chicago and our users at Washington University, St. Louis and the Max Planck Institute in Golm, Germany, there wasn't nearly enough direct contact between our users and developers. This made it difficult to meet their needs.
• It was easy enough to prototype Web components, but we needed to build a Grid portal framework that would support future development and sustain a production-quality Grid portal, and that took several months to develop.
• We were closely associated with Globus, but this made us too reliant upon Globus as our interface to resources…
• For while we had identified the resources our users required, the Grid as we knew it then and today just wasn't enough.
It takes more than just a portal
You need to build a virtual organization, or otherwise join a virtual organization and plug into their work!
Other lessons we learned
• Put the needs of scientists at the very top of your list. Be familiar with their research and the day-to-day problems they face in using computing technology to conduct that research.
• Next, consider what it is you really need in the form of:
  • Enhancements to the applications scientists are developing. For new applications, consider how you can build in support for Grid operations. For legacy applications, consider how you can provide better support for their applications with external services.
  • Enhancements to the Grid infrastructure with respect to your applications. These enhancements should build off other…
  • High-level services that coordinate the use of resources. We're beginning to see Globus as the "system" layer with respect to Grids.
Don't forget your infrastructure
• Create a viable testbed that includes both the resources your users need and use in their everyday research as well as resources with which you can experiment.
• In a VO like the ASC, you will not have administrative control over these resources. In fact, this is a very difficult problem to overcome; it takes a lot of effort to see changes applied where and when you need them.
• You need tools for testing things out, and you're going to need to keep track of all your resources and providers, the change requests you make, the problems your users experience, and so forth…
Get into production mode
• In order to build a viable Grid portal, realize you need to build a production system, something your users can rely upon to work every time. For instance, what happens to your production database when your software and data model change?
• What makes a production system? Solid engineering practices and attention to user requirements, security issues, persistence management, quality assurance, release management, performance issues…
• Project management at the VO level is complex; make sure you understand Grid-level management issues before you start writing those cool Grid portal proposals!
Build a team
• It's important to communicate with partners in the Global Grid Forum, but try to cover all the bases within your project. If you don't have the funds to build a large team, then allocate funds towards developing explicit links between your project and others.
• Because you need application experts, Grid portal experts, Grid service experts, Grid testbed administrators, and so on… division of labor is a cornerstone of project management.
GridLab
• Well, we're taking the lessons we learned in the ASC and elsewhere (GPDK, for example!), through the collective experience of everyone involved in GridLab, and we're applying them towards building a…
• Production Grid across Europe (and the U.S.)
• Grid Application Toolkit for developing applications with built-in support for Grid operations, like migrateApplication()
• Grid Portal to support GridLab
Key points
• We're going to work as closely as possible with scientists and their applications throughout the project.
• We've created a testbed of resources that our scientists really want to use.
• We're working to build higher-level services to better coordinate the use of those resources, and the requirements for these services will come either directly from our application groups or indirectly through using a Grid portal.
• We're developing application frameworks to enable scientists to make use of these higher-level services as basic function/method calls within the applications.
• And we're coordinating these activities through the development and use of the GridLab Portal.
The GridLab Portal
• We're using the ASC Portal software to allow us to focus on the needs of scientists from the very beginning of our project development.
• We're simultaneously building a new framework that takes the best of current practices in Web and Grid computing, and we're documenting just about every bit of its requirements, design, and development.
• As we develop this framework, we'll be preparing the Web interfaces we develop with the ASC Portal for migration to the new framework.
Bridging the gap
• Before you begin your Grid portal efforts, look at what's already out there. For example:
  • GPDK – JSP/Servlets; a great way to get an introduction to Grid portal development
  • GridPort – Perl-CGI; a well-managed project, and they have real application groups using NPACI resources.
• Bear in mind that portlets are where it's all heading, but there is no Portlet API quite yet. A lot of people are looking at building Grid portals with the JetSpeed codebase (but not everyone!).
• Or, you should consider working with us…
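To make the portlet idea concrete: the eventual model (still being standardised as JSR 168 at the time of this talk) is a small Java class per piece of portal functionality that the container asks to render as a fragment of a page. The interface below is purely illustrative; it is neither JetSpeed's nor the later JSR 168 API.

import java.io.PrintWriter;
import java.util.Map;

// Purely illustrative portlet-style component; the interface is invented for
// this sketch, since no standard Portlet API existed when the talk was given.
interface SimplePortlet {
    void render(Map userState, PrintWriter out);
}

// A "job list" fragment that a portal page could aggregate alongside other portlets.
class JobListPortlet implements SimplePortlet {
    public void render(Map userState, PrintWriter out) {
        out.println("<h3>My Grid Jobs</h3>");
        Object jobs = userState.get("jobs");   // filled in elsewhere by the portal
        out.println("<p>" + (jobs == null ? "No jobs submitted yet." : jobs) + "</p>");
    }
}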
A possible collaboration with GridLab
[Diagram: a user request ("Run my application!") flows from the GridLab Application Toolkits (Cactus, Triana) through the GridLab Portal to the GridLab Application Manager, GridLab Resource Manager and GridLab Information Service, and on to Grid CVS, Grid Make and the compute resources.]
Help us design better Grid technologies. Plug in your own application and supporting services. Then develop your own Web pages with our Grid portal…
Some online references
• Astrophysics Simulation Collaboratory: http://www.ascportal.org
• Cactus Project: http://www.cactuscode.org
• Globus Project: http://www.globus.org
• GridLab: http://www.gridlab.org
• GridPort: http://www.gridport.org
• Grid Portal Development Toolkit: http://www.doesciencegrid.org/Projects/GPDK
• Jakarta JetSpeed: http://jakarta.apache.org/jetspeed
• OGSA: http://www.globus.org/ogsa
• Portlet Specification: http://www.jcp.org/jsr/detail/168.jsp
A Data Miner for the
Information Power Grid
Thomas H. Hinke
NASA Ames Research Center
Moffett Field, California, USA
Data Mining on the Grid
• What is data mining?
• Why use the grid for data mining?
• Grid miner overview
• Grid miner architecture
• Grid miner implementation
• Current status
What Is Data Mining?
"Data mining is the process by which information and knowledge are extracted from a potentially large volume of data using techniques that go beyond a simple search through the data." [NASA Workshop on Issues in the Application of Data Mining to Scientific Data, Oct 1999, http://www.cs.uah.edu/NASA_Mining/]
Example: Mining for Mesoscale Convective Systems
[Image: results from mining SSM/I data.]
Grid Provides Computational Power
• Grid couples needed computational power to data
– NASA has a large volume of data stored in its
distributed archives
• E.g., In the Earth Science area, the Earth Observing System
Data and Information System (EOSDIS) holds large volume
of data at multiple archives
– Data archives are not designed to support user
processing
– Grids, coupled to archives, could provide such a
computational capability for users
Grid Provides Re-Usable Functions
• Grid-provided functions do not have to be re-implemented for each new mining system
  – Single sign-on security
  – Ability to execute jobs at multiple remote sites
  – Ability to securely move data between sites
  – Broker to determine best place to execute mining job
  – Job manager to control mining jobs
• Mining system developers do not have to re-implement common grid services
• Mining system developers can focus on the mining applications and not the issues associated with distributed processing
Grid Miner
• Developed as one of the early applications on the IPG
  – Helped debug the IPG
  – Provided the basis for satisfying a major IPG milestone
• IPG is NASA's implementation of a Globus-based Grid
• Provides the basis for what could be an on-going Grid Mining Service
Grid Miner Operations
[Figure: the mining pipeline Input → Preprocessing → Analysis → Output, taking raw data through translated data, preprocessed data and patterns/models to results.]
• Input formats: HDF, HDF-EOS, GIF, PIP-2, SSM/I Pathfinder, SSM/I TDR, SSM/I NESDIS Lvl 1B, SSM/I MSFC Brightness Temp, US Rain, Landsat, ASCII Grass, Vectors (ASCII Text), Intergraph Raster, Others...
• Preprocessing: Selection and Sampling (Subsetting, Subsampling, Select by Value, Coincidence Search), Grid Manipulation (Grid Creation, Bin Aggregate, Bin Select, Grid Aggregate, Grid Select, Find Holes), Image Processing (Cropping, Inversion, Thresholding), Others...
• Analysis: Clustering (K Means, Isodata, Maximum), Pattern Recognition (Bayes Classifier, Min. Dist. Classifier), Image Analysis (Boundary Detection, Cooccurrence Matrix, Dilation and Erosion, Histogram Operations, Polygon Circumscript, Spatial Filtering, Texture Operations), Genetic Algorithms, Neural Networks, Others...
• Output formats: GIF Images, HDF-EOS, HDF Raster Images, HDF SDS, Polygons (ASCII, DXF), SSM/I MSFC Brightness Temp, TIFF Images, Others...
Figure thanks to the Information and Technology Laboratory at the University of Alabama in Huntsville
Mining on the Grid
[Diagram: satellite data archives X and Y feed data to Grid Mining Agents running on IPG processors.]
Grid Miner Architecture
[Diagram: Grid Mining Agents on IPG processors pull data from satellite data archives X and Y and interact with the Mining Database Daemon, Mining Operations Repository, Miner Config Server and Control Database, each hosted on IPG processors.]
Starting Point for Grid Miner
• Grid Miner reused code from the object-oriented ADaM data mining system
  – Developed under a NASA grant at the University of Alabama in Huntsville, USA
  – Implemented in C++ as a stand-alone, object-oriented mining system
    • Runs on NT, IRIX, Linux
  – Has been used to support research personnel at the Global Hydrology and Climate Center and a few other sites.
• The object-oriented nature of ADaM provided an excellent base for the enhancements needed to transform ADaM into Grid Miner
Transforming Stand-Alone Data Miner into Grid Miner
• The original stand-alone miner had 459 C++ classes.
• Had to make small modifications to ADaM
  – Modified 5 existing classes
  – Added 3 new classes
• Grid commands added for
  – Staging the miner agent to remote sites
  – Moving data to the mining processor
Staging Data Mining Agent to Remote Processor
globusrun -w -r target_processor '&(executable=$(GLOBUSRUN_GASS_URL)# path_to_agent)(arguments=arg1 arg2 … argN)(minMemory=500)'
Moving Data to be Mined
gsincftpget remote_processor local_directory remote_file
Current Status
• Currently works on the IPG as a prototype system
• User documentation underway
• Data archives need to be grid-enabled
  – Connected to the grid
  – Provide controlled access to data on tertiary storage
    • E.g., by using a system such as the Storage Resource Broker that was developed at the San Diego Supercomputer Center
• Some early-adopter users need to be found to begin using the Grid Miner
  – Willing to code any new operations needed for their applications
  – Willing to work with a system with prototype-level documentation
Backup Slides
Example of Data Being Mined
• 75 MB for one day of global data – Special Sensor Microwave/Imager (SSM/I).
• Much higher resolution data exists, with significantly higher volume.
Grid Will Provide Re-usable Services
• In the future, Grid/Web services will provide the ability to create reusable services that can facilitate the development of data mining systems
  – Builds on the Web Services work from the e-commerce area
    • Service interface is defined through WSDL (Web Services Description Language)
    • Standard access protocol is SOAP (Simple Object Access Protocol)
  – Mining applications can be built by re-using capabilities provided by existing grid-enabled Web Services.
Mining on the IPG
• Now the user must
  – Develop a mining plan
  – Identify data files to be mined and check file URLs into the Control Database
  – Create a mining ticket that has information on
    • Miner Configuration Server – currently an LDAP server, but in future the GIS
    • Executable type – e.g., SGI
    • Sending-host contact information – source of mining plan and agent
    • Mining-database contact information – location of URLs of files to be mined
• Future
  – User could use the current capability or a Grid Mining Portal for all of the above
Mining on the IPG
• The mining agent
  – Acquires configuration information from the Miner Configuration Server
  – Acquires the mining plan from the sending host (in future, a Mining Portal)
  – Acquires the mining operations needed to support the mining plan from the Mining Operations Repository
  – Acquires URLs of data to be mined from the Control Database
  – Transfers data using just-in-time acquisition
  – Mines the data
  – Produces mining output
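The mining ticket described above is essentially a small bundle of contact information. For illustration only, it could be carried as simply as the following; the field names and hostnames are hypothetical, since the talk does not show the real ticket format.

import java.util.Properties;

// Hypothetical representation of the mining ticket described above;
// the actual Grid Miner ticket format is not specified in this talk.
public class MiningTicket {
    public static Properties example() {
        Properties ticket = new Properties();
        ticket.setProperty("configServer", "ldap://ipg-config.example.nasa:389"); // future: GIS
        ticket.setProperty("executableType", "SGI");
        ticket.setProperty("sendingHost", "miner-portal.example.nasa");  // source of plan and agent
        ticket.setProperty("miningDatabase", "controldb.example.nasa");  // holds URLs of files to be mined
        return ticket;
    }
}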
Mining Operator Acquisition
• One possibility for the future is a number of source directories for
  – Public mining operations contributed by practitioners
  – For-fee mining operations from a future mining.com
  – Private mining operations available to a particular mining team
Methodology for Building Grid Applications
Thomas M. Eidson
ICASE at NASA Langley
URL: www.icase.edu/~teidson
email: teidson@icase.edu

Applications Talk Outline
1. Grid Programming Overview
2. Modular Grid Programming
3. Component/Framework Project
4. Summary
Scientific Programming
1. Requirements
2. Programmers and Users
3. Issues

Modern Scientific Programming Features
Composite Applications
1. non-trivial collection of element applications (heterogeneous physics, graphics, databases)
2. including large, data-parallel element applications
3. task-parallel execution with message passing and event signals
4. data located in files and databases distributed over a network
Computing Environment
1. a heterogeneous network of computers (workstations to supercomputers)
2. a variety of OS architectures, languages
3. a variety of sites with different administrations & policies
Users
1. nature: mixture of designers, programmers, and users
2. programming teams
3. trend toward code sharing
Current Research - Opinion
Too much emphasis
1. fancy interfaces
2. access to "existing" services
Not enough emphasis
1. design of grid applications - component approach recommended
2. side effects of complex applications - port metadata
3. application characterization standards

Programming Model
[Diagram-only slide.]

Modular Grid Programming
1. Modular organization: single-focus programming modules
2. Coupling: simple interfaces to complex, interactive dialogs
3. Task-Oriented Programming
4. Programming Entities
5. Programming Components definition
6. Software Components & Ports
7. Aspect-oriented Programming (Filters)
8. Multi-language Support
9. Component/Instance Programming
10. Workflow Program
Programming Component
A Programming Component is an abstraction representing a well-defined programming entity along with metadata that defines the following properties of the entity. The metadata is referred to as a Shared Programming Definition (SPD) to emphasize that Programming Components are independent of any specific framework.
• Identity is necessary to ensure that a program expresses the programmer's desires.
• An interface (port) is needed to allow specific behavior to be accessed.
• State is important to allow a range of functionality so that only a modest number of Programming Components are needed.
• Relationships between Programming Components allow complex behavior to be defined in a hierarchical manner and to support dynamic modification of behavior.
• Behavior describes the computational characteristics of a programming entity.
Programming Component = Programming Entity + SPD
[Several diagram-only slides on components, ports, aspects and instances follow in the original presentation.]
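For illustration only, a Programming Component as defined above could be modelled along the following lines; the interface is hypothetical and is not the Nautilus or CCA API.

import java.util.Map;

// Hypothetical rendering of the Programming Component abstraction above:
// a programming entity plus its Shared Programming Definition (SPD) metadata.
interface ProgrammingComponent {
    String identity();                      // who am I?
    String[] ports();                       // interfaces through which behavior is accessed
    Map state();                            // settable state selecting among behaviors
    ProgrammingComponent[] relationships(); // components this one is composed from or depends on
    Map behavior();                         // SPD metadata describing computational characteristics
}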
Nautilus Programming Framework
1. Building Blocks
2. Current funding: partial funding via NSF, joint with U. Tenn (Dongarra and Eijkhout)
3. Nautilus Programming Model
4. SANS Features

Nautilus Framework: Building Blocks
1. Large Application Working Environment (LAWE), NASA SBIR
   - programming model: programming component + metadata (interface specs) within a framework
2. Self-Adaptive Numerical Software (SANS) and NetSolve, U. Tenn
   - behavioral specifications
3. Common Component Architecture (CCA) Specification
   - modular programming standards
4. Globus Toolkit
   - Grid services and security
5. Relevant Grid Forum Specifications
   - compatibility with other frameworks and Grid interfaces
6. Relevant Web Standards: SOAP, XML
   - interoperability with other frameworks and Grid interfaces
SANS Service Component
1. Problem: Communication gap between numerical terminology and application terminology.
2. Numerics: matrix properties (spectrum, norm)
3. Application: PDE, discretisation (ex: elliptic problem & linear elements <=> M-matrix, hence Alg. Multigrid)
4. Research: Bridge the gap with an Intelligent Agent in a Self-Adaptive System
5. Approach: use behavioural metadata
   - to describe characteristics of problem data and of software components (e.g., elements of linear system solvers)
   - to enable smart service components to match solver components to user data
6. Example:
   - user specifies information about systems
   - Intelligent Agent uses heuristic determination of properties in the absence of user info
   - Intelligent Agent database is enhanced by information from previous runs

Summary of ICASE Grid Research
Support programmers/users in developing Grid Applications
1. Infrastructure: Tidewater Regional Grid Partnership
   - local (WM, ODU, HU, JLab, NASA Langley, military bases)
   - distant (IPG, U. Utah, U. Complutense/Spain, U. Va.)
   - features: PGP user management and Globus application services
   - Target applications:
     - NASA: MultiDisciplinary Optimization (MDO), Probabilistic structures (task farming), impact code, reusable launch vehicles, symmetric web
     - U. Va.: battlefield simulation
     - U. Utah: genetic algorithms
     - Brown: heterogeneous mathematical algorithms
     - JLab: distributed access to data
2. Build Programming Framework to support efficient application development
   - Nautilus Project
   - Targets: linear systems, eigenvalue solvers, information retrieval