Introductory Lectures

e-Science, e-Business, e-Government and their Technologies
Introduction
Bryan Carpenter, Geoffrey Fox, Marlon Pierce
Pervasive Technology Laboratories
Indiana University Bloomington IN 47404
January 12 2004
dbcarpen@indiana.edu
gcf@indiana.edu
mpierce@cs.indiana.edu
http://www.grid2004.org/spring2004
Class Structure

Grading based on mixture of homework and a single
final project
• Up to 2 students can collaborate together on final project



Grade: homework 70%, final project 30%
NO midterm or final
Homework will mainly be programming-based, but reports may be
required either in the final project or in one or two
homework assignments
• Reports will use Internet, book and “Gap Analysis”
What are we doing








This is a semester-long course on Grids (viewed as technologies
and infrastructure) and their applications – mainly to science but
also to business and government
We will assume a basic knowledge of the Java language and then
interweave 6 topic areas – first four cover technologies that will
be used by students
1) Advanced Java: including networking, Java Server Pages and
perhaps servlets
2) XML: Specification, Tools, Linkage to Java
3) Web Services: Basic Ideas, WSDL, Axis and Tomcat
4) Grid Systems: GT3/CoG Kit, Gateway, XSOAP, Portlet
5) Advanced Technology Discussions: CORBA as history, OGSA-DAI, security, Semantic Grid, Workflow
6) Applications: Bioinformatics, Particle Physics, Engineering,
Crises, Computing-on-demand Grid, Earth Science
Course Topics 1 and 2 :
Background/Core

Advanced Java Programming
• We will assume basic Java programming proficiency
• We will cover Java client/server, three-tiered and network
programming.
• Ancillary but interesting Java topics to be covered include
Apache Ant, XML-Beans, and Java Message Service

XML and XML Schema
• We will provide introductory material.
• Necessary to understand Web Service standards
• Examples include RDF (semantic web) and SOAP (Web
services)

XML Tools
• XML Databases (Xindice, Sleepycat)
• Search: XPath, XQuery
Course Topics 3 and 4: Web and
Grid Services

Overview Material
• Grid and Web Service Architectures

Basic Web Service Standards
• WSDL, SOAP: structure and definitions
• Building services in Java: Apache Axis

Advanced Web Services: Emerging capabilities
• WS-ReliableMessaging, WS-Security, WS-Transaction

Computational Grids
• Globus Toolkit 2
• Java COG Kit for Globus programming

Grids Meet Web Services
• Open Grid Service Architecture/Infrastructure
• Implementations: GSX from Indiana University

The Semantic Grid: Information Models for Describing
Resources
• RDF, DAML+OIL, and OWL
Grid Computing: Making The Global
Infrastructure a Reality





Based on work done in preparing the book edited with
Fran Berman and Anthony J.G. Hey,
ISBN: 0-470-85319-0
Hardcover 1080 Pages
Published March 2003
http://www.grid2002.org
Other

See the webcast in an Oracle technology series
http://webevents.broadcast.com/techtarget/Oracle/100303/index.asp?loc=10

See also the “Gap Analysis”
http://grids.ucs.indiana.edu/ptliupages/publications/GapAnalysis30June03v2.pdf
• We can send you nicely printed versions of this
• The end of it has a good collection of references; it gives both
a general survey of current Grids and specific examples from
the UK

Appendix with more details is:
http://grids.ucs.indiana.edu/ptliupages/publications/Appendix30June03.pdf


See also GlobusWorld http://www.globusworld.org/
and the Grid Forum http://www.gridforum.org
e-moreorlessanything and the Grid






e-Business captures an emerging view of corporations as
dynamic virtual organizations linking employees, customers
and stakeholders across the world.
• The growing use of outsourcing is one example
e-Science is the similar vision for scientific research with
international participation in large accelerators, satellites or
distributed gene analyses.
The Grid integrates the best of the Web, traditional
enterprise software, high performance computing and Peer-to-peer systems to provide the information technology e-infrastructure for e-moreorlessanything.
A deluge of data of unprecedented and inevitable size must
be managed and understood.
People, computers, data and instruments must be linked.
On demand assignment of experts, computers, networks and
storage resources must be supported
So what is a Grid?




Supporting human decision making with a network of at least
four large computers, perhaps six or eight small computers,
and a great assortment of disc files and magnetic tape units - not to mention remote consoles and teletype stations - all
churning away. (Licklider 1960)
Coordinated resource sharing and problem solving in
dynamic multi-institutional virtual organizations
Infrastructure that will provide us with the ability to
dynamically link together resources as an ensemble to support
the execution of large-scale, resource-intensive, and
distributed applications.
Realizing the thirty-year dream of science fiction writers who
have spun yarns featuring worldwide networks of
interconnected computers that behave as a single entity.
What is a High Performance Computer?








We might wish to consider three classes of multi-node computers
1) Classic MPP with microsecond latency and scalable internode
bandwidth (tcomm/tcalc ~ 10 or so)
2) Classic Cluster which can vary from configurations like 1) to 3)
but typically have millisecond latency and modest bandwidth
3) Classic Grid or distributed systems of computers around the
network
• Latencies of inter-node communication – 100’s of milliseconds
but can have good bandwidth
All have same peak CPU performance but synchronization costs
increase as one goes from 1) to 3)
Cost of system (dollars per gigaflop) decreases by factors of 2 at
each step from 1) to 2) to 3)
One should NOT use classic MPP if class 2) or 3) suffices unless
some security or data issue dominates over cost-performance
One should not use a Grid as a true parallel computer – it can link
parallel computers together for convenient access etc.
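A rough way to see why synchronization cost grows from class 1) to class 3) is sketched below. It assumes each step of a parallel job computes for a time t_calc and then pays one inter-node latency, so the fraction of time spent computing is t_calc / (t_calc + latency). The latency values are illustrative picks within the ranges quoted above, and the one-message-per-step model is a simplifying assumption, not something stated in the lecture.

```python
# Illustrative latencies in seconds for the three classes (assumed values)
LATENCY = {
    "MPP": 1e-6,      # microsecond latency
    "Cluster": 1e-3,  # millisecond latency
    "Grid": 1e-1,     # 100's of milliseconds
}

def efficiency(t_calc, latency):
    """Fraction of each step spent computing, with one message per step."""
    return t_calc / (t_calc + latency)

def min_grain(latency, target=0.9):
    """Smallest compute time per step that reaches the target efficiency."""
    return latency * target / (1.0 - target)

for name, lat in LATENCY.items():
    print(f"{name}: efficiency at 10 ms steps = {efficiency(1e-2, lat):.4f}, "
          f"grain for 90% = {min_grain(lat):.2e} s")
```

Under these assumptions a Grid needs steps of roughly a second between synchronizations to stay efficient, which is why it suits loosely coupled work rather than true parallel computing.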
e-Science



e-Science is about global collaboration in key areas of
science, and the next generation of infrastructure that
will enable it. This is a major UK Program
e-Science reflects growing importance of international
laboratories, satellites and sensors and their integrated
analysis by distributed teams
CyberInfrastructure is the analogous US initiative
Grid Technology supports e-Science and CyberInfrastructure
It is software (middleware) built on top of networks
[Figure labels: Data Acquisition; Imaging Instruments; Computational Resources; Large-Scale Databases; Advanced Visualization, Analysis]
Global Terabit Research Network

The Grid software and resources run on top of high
performance global networks
USA Network
Terabit Networks



Network performance will increase faster than Moore’s
law – partly because optical fiber has almost unlimited
bandwidth and partly because there are many old
networks to be replaced
Home dial-ups (56 kbit/sec) → DSL/Cable Modem (2
megabits/sec) → FTTP (Fiber to the Premises at gigabit
performance)
2006 goal of the Global Terabit Research Network:
International : National Backbone : Organization :
Optical Desktop : Copper Desktop is
1000 : 1000 : 100 : 10 : 1 Gigabit/sec
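To give a feel for these ratios, the short sketch below computes the ideal time to move one terabyte across each tier. The payload size (a decimal terabyte) is an assumption chosen for illustration, and the calculation ignores protocol overhead and contention.

```python
# Link rates in Gigabit/sec from the 1000:1000:100:10:1 goal above
RATES_GBIT = {
    "International": 1000,
    "National Backbone": 1000,
    "Organization": 100,
    "Optical Desktop": 10,
    "Copper Desktop": 1,
}

def transfer_seconds(n_bytes, rate_gbit):
    """Ideal time to move n_bytes over a link of rate_gbit Gigabit/sec."""
    return n_bytes * 8 / (rate_gbit * 1e9)

TERABYTE = 1e12  # decimal terabyte in bytes (assumed payload size)
for tier, rate in RATES_GBIT.items():
    print(f"{tier}: {transfer_seconds(TERABYTE, rate):.0f} s per terabyte")
```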
e-Business and (Virtual) Organizations





Enterprise Grid supports information system for an
organization; includes “university computer center”, “(digital)
library”, sales, marketing, manufacturing …
Outsourcing Grid links different parts of an enterprise together
(Gridsourcing)
• Manufacturing plants with designers
• Animators with electronic game or film designers and
producers
• Coaches with aspiring players (e-NCAA or e-NFL etc.)
Customer Grid links businesses and their customers as in many
web sites such as amazon.com
e-Multimedia can use secure peer-to-peer Grids to link creators,
distributors and consumers of digital music, games and films
respecting rights
Distance education Grid links teacher at one place, students all
over the place, mentors and graders; shared curriculum,
homework, live classes …
e-Defense and e-Crisis

Grids support Command and Control and provide
Global Situational Awareness
• Link commanders and frontline troops to themselves and to
archival and real-time data; link to what-if simulations
• Dynamic heterogeneous wired and wireless networks
• Security and fault tolerance essential

System of Systems; Grid of Grids
• The command and information infrastructure of each ship is
a Grid; each fleet is linked together by a Grid; the President
is informed by and informs the national defense Grid
• Grids must be heterogeneous and federated

Crisis Management and Response enabled by a Grid
linking sensors, disaster managers, and first responders
with decision support
Classes of Computing Grid Applications




Running “Pleasingly Parallel Jobs” as in United Devices,
Entropia (Desktop Grid) “cycle stealing systems”
Can be managed (“inside” the enterprise as in Condor)
or more informal (as in SETI@Home)
Computing-on-demand in Industry where jobs spawned
are perhaps very large (SAP, Oracle …)
Support distributed file systems as in Legion (Avaki),
Globus with (web-enhanced) UNIX programming
paradigm
• Particle Physics will run some 30,000 simultaneous jobs this
way


Pipelined applications linking data/instruments,
compute, visualization
Seamless Access where Grid portals allow one to choose
one of multiple resources with a common interface
Utility Computing




An important business application of Grids is utility
computing
Namely support a pool of computers to be assigned as
needed to take up extra demand
• Pool shared between multiple applications
This is already common in academia, where different
simulations share resources, while in industry we
have
• Web Servers
• Financial Modeling
• Data-mining
• Simulation response to crisis like forest fire or
earthquake
Architecture is “Farm of Grid Services” connected to
Internet not cluster of computers connected to each other
Resources-on-demand

Computing-on-demand uses dynamically assigned
(shared) pool of resources to support excess demand in
flexible cost-effective fashion
[Figure: Static assignment with redundancy: Programs A to Z run on Computers 1 to 26, with Computers 27 to 52 as spares. Dynamic on-demand assignment: Programs A to Z share a pool of Computers 1 to N, with N < 52.]
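The saving in the figure can be put as simple arithmetic. The sketch below assumes one dedicated spare per program in the static case, and a shared pool sized for the largest number of programs that need extra capacity at the same time in the dynamic case. That pool-sizing rule and the value 5 are assumptions for illustration only.

```python
def static_computers(n_programs):
    """Static assignment with redundancy: one dedicated spare per program."""
    return 2 * n_programs

def pooled_computers(n_programs, max_simultaneous_extra):
    """Dynamic assignment: one computer per program plus a shared pool."""
    return n_programs + max_simultaneous_extra

programs = 26  # Program A to Program Z
print(static_computers(programs))
print(pooled_computers(programs, 5))  # assumes at most 5 need extra at once
```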
Some Important Styles of Grids






Computational Grids were origin of concepts and link computers
across the globe – high latency stops this from being used as
parallel machine
Knowledge and Information Grids link sensors and information
repositories as in Virtual Observatories or BioInformatics
• More detail on next slide
Education Grids link teachers, learners, parents as a VO with
learning tools, distant lectures etc.
e-Science Grids link multidisciplinary researchers across
laboratories and universities
Community Grids focus on Grids involving large numbers of
peers rather than focusing on linking major resources – links
Grid and Peer-to-peer network concepts
Semantic Grid links Grid, and AI community with Semantic web
(ontology/meta-data enriched resources) and Agent concepts
Information/Knowledge Grids


Distributed data sources, 10’s to 1000’s of them (instruments,
file systems, curated databases …)
Data Deluge: from 1 petabyte/year (now) to 100’s of petabytes/year (2012)
• Moore’s law for Sensors




Possible filters assigned dynamically (on-demand)
• Run image processing algorithm on telescope image
• Run Gene sequencing algorithm on compiled data
Needs decision support front end with “what-if”
simulations
Metadata (provenance)
critical to annotate data
Integrate across experiments
as in multi-wavelength
astronomy
Data Deluge comes from pixels/year available
2.4 Petabytes Today
SERVOGrid for e-Geoscience
[Figure labels: Repositories and Federated Databases; Sensor Nets and Streaming Data; Loosely Coupled Filters; Discovery Services; Analysis and Visualization; Closely Coupled Compute Nodes]
SERVOGrid – Solid Earth Research Virtual Observatory will link
Australia, Japan, USA …
SERVOGrid Requirements


Seamless Access to Data repositories and large scale
computers
Integration of multiple data sources including sensors,
databases, file systems with analysis system
• Including filtered OGSA-DAI (Grid database access)




Rich meta-data generation and access with
SERVOGrid specific Schema extending openGIS
(Geography as a Web service) standards and using
Semantic Grid
Portals with component model for user interfaces and
web control of all capabilities
Collaboration to support world-wide work
Basic Grid tools: workflow and notification
DAME: Distributed Aircraft Maintenance Environment
Rolls Royce and UK e-Science Program
[Figure labels: In-flight data from ~5000 engines, about a gigabyte per aircraft per engine per transatlantic flight; Airline; Ground Station; Global Network such as SITA; Engine Health (Data) Center; Maintenance Centre; Internet, e-mail, pager]
NASA Aerospace Engineering Grid
[Figure: coupled sub-system models]
Wing Models: lift capabilities, drag capabilities, responsiveness
Stabilizer Models: deflection capabilities, responsiveness
Airframe Models
Human Models (crew capabilities): accuracy, perception, stamina, reaction times, SOPs
Engine Models: thrust performance, reverse thrust performance, responsiveness, fuel consumption
Landing Gear Models: braking performance, steering capabilities, traction, dampening capabilities
Whole system simulations are produced by coupling all of the sub-system simulations
It takes a distributed virtual organization to design, simulate and build a complex system like an aircraft
Virtual Observatory Astronomy Grid
Integrate Experiments
[Figure panels: Radio; Far-Infrared; Visible; Dust Map; Visible + X-ray; Galaxy Density Map]
e-Chemistry Laboratory
[Figure labels: Experiments-on-demand; Grid-enabled Output Streams; Simulation; Video; Diffractometer; Properties; Analysis; Structures; Database; Grid; Globus; Resources; X-Ray e-Lab; Properties e-Lab]
Fig. 23: A Combinatorial Chemistry Grid (Chapter 42)
CERN LHC Data Analysis Grid
Typical Grid Architecture
[Figure: each blob is a computer program. Labels: User Services; Portal Services; Application Service; Middleware; System Services (repeated around a “Core” Grid); Raw (HPC) Resources; Database]
Sources of Grid Technology







Grids support distributed collaboratories or virtual
organizations integrating concepts from
The Web
Agents
Distributed Objects (CORBA, Java/Jini, COM)
Globus, Legion, Condor, NetSolve, Ninf and other High
Performance Computing activities
Peer-to-peer Networks
With perhaps the Web and P2P networks being the most
important for “Information Grids” and Globus for
“Compute Grids”
The Essence of Grid Technology?



We will start from the Web view and assert that basic
paradigm is
Meta-data rich Web Services communicating via
messages
These have some basic support from some runtime
such as .NET, Jini (pure Java), Apache Tomcat+Axis
(Web Service toolkit), Enterprise JavaBeans,
WebSphere (IBM) or GT3 (Globus Toolkit 3)
• These are the distributed equivalent of operating system
functions as in UNIX Shell
• Called Hosting Environment or platform

The W3C standard WSDL defines the IDL (Interface
Definition Language) for Web Services
Meta-data




Meta-data is usually thought of as “data about data”
The Semantic Web is at its simplest considered as
adding meta-data to web pages
For example, the hospital web-page has meta-data
telling you its location, phone-number, specialties which
can be used to automate Google-style searches to allow
planning of disease/accident treatment from web
Modern trend (Semantic Grid) is meta-data about Web services, e.g. specifying details of interface and usage
• For example, that a bioinformatics service is free, or that
its input bandwidth is limited

Provenance – history and ownership – of data very
important
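As a small illustration of meta-data about services, the sketch below stores a few service descriptions and selects among them by their advertised properties. Every service name and field name here is hypothetical, and a real Semantic Grid registry would use richer descriptions (for example RDF) rather than plain dictionaries.

```python
# Hypothetical service descriptions; every field name here is illustrative
SERVICES = [
    {"name": "blast-search", "domain": "bioinformatics", "free": True, "max_input_mb": 10},
    {"name": "genome-align", "domain": "bioinformatics", "free": False, "max_input_mb": 500},
    {"name": "sky-survey", "domain": "astronomy", "free": True, "max_input_mb": 50},
]

def find_services(services, **required):
    """Return names of services whose meta-data matches every required field."""
    return [s["name"] for s in services
            if all(s.get(key) == value for key, value in required.items())]

def accepts_input(service, size_mb):
    """Check the input limit advertised in the service meta-data."""
    return size_mb <= service["max_input_mb"]

print(find_services(SERVICES, domain="bioinformatics", free=True))
```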
A typical Web Service


In principle, services can be in any language (Fortran .. Java ..
Perl .. Python) and the interfaces can be method calls, Java RMI
messages, CGI Web invocations, or totally compiled away (inlining)
The simplest implementations involve XML messages (SOAP) and
programs written in net friendly languages like Java and Python
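A minimal sketch of what such an XML message looks like, using only the Python standard library. The envelope namespace is the SOAP 1.1 one; the operation name, parameter and service namespace (`urn:example:catalog`) are made up for illustration, and a real toolkit such as Axis would generate and parse these messages from the WSDL instead.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

def build_request(operation, params, service_ns="urn:example:catalog"):
    """Wrap an operation call and its parameters in a SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")

def read_request(xml_text):
    """Extract the operation name and parameters from a SOAP envelope."""
    envelope = ET.fromstring(xml_text)
    call = envelope.find(f"{{{SOAP_NS}}}Body")[0]
    operation = call.tag.split("}")[1]
    return operation, {child.tag: child.text for child in call}

message = build_request("getPrice", {"item": "book-42"})
print(message)
print(read_request(message))
```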
[Figure labels: Portal Service; Security; WSDL interfaces; Web Services: Payment (Credit Card), Catalog, Warehouse, Shipping control]
Services and Distributed Objects


A web service is a computer program running on either the local
or remote machine with a set of well defined interfaces (ports)
specified in XML (WSDL)
Web Services (WS) have many similarities with Distributed
Object (DO) technology but there are some (important) technical
and religious points (not easy to distinguish)
• CORBA, Java and COM are typical DO technologies
• Agents are typically SOA (Service Oriented Architecture)

Both involve distributed entities but Web Services are more
loosely coupled
• WS interact with messages; DO with RPC (Remote Procedure Call)
• DO have “factories”; WS manage instances internally, and interaction-specific state is not exposed and hence need not be managed
• DO have explicit state (stateful services); WS use context in the messages to
link interactions (stateful interactions)

Claim: DO’s do NOT scale; WS build on experience (with
CORBA) and do scale
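The contrast between explicit object state and context carried in messages can be sketched as follows. This is a toy, single-process illustration with hypothetical class and field names; it shows only where the state lives, not any real CORBA or Web Service API.

```python
class CounterObject:
    """Distributed-object style: the instance itself holds explicit state."""
    def __init__(self):
        self.total = 0

    def add(self, amount):
        self.total += amount
        return self.total

class CounterService:
    """Service style: callers only send messages; a context id links them."""
    def __init__(self):
        self._sessions = {}  # managed internally, not exposed to clients

    def handle(self, message):
        context = message["context"]
        total = self._sessions.get(context, 0) + message["amount"]
        self._sessions[context] = total
        return {"context": context, "total": total}

service = CounterService()
service.handle({"context": "order-1", "amount": 2})
reply = service.handle({"context": "order-1", "amount": 3})
print(reply["total"])
```

The client of `CounterService` never holds a reference to a per-interaction object; it only repeats the context value in each message.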
Details of Web Service Protocol Stack







UDDI finds where programs are
• remote (distributed) programs are
just Web Services
• (not clearly a great success)
WSFL links programs together
(under revision as BPEL)
WSDL defines interface (methods,
parameters, data formats)
SOAP defines structure of message
including serialization of information
HTTP is negotiation/transport protocol
TCP/IP is layers 3-4 of OSI
Physical Network is layer 1 of OSI
[Figure: protocol stack, top to bottom: UDDI or WSIL; WSFL; WSDL; SOAP or RMI; HTTP or SMTP or IIOP or RMTP; TCP/IP; Physical Network]
Classic Grid Architecture
[Figure labels: Resources; Databases; Composition; Content Access; Netsolve; Security; Collaboration; Computing; Middle Tier Brokers and Service Providers; Clients: Users and Devices]
Middle Tier becomes Web Services
Grid Services for the Education Process












“Learning Object” XML standards already exist
Registration
Performance (grading)
Authoring of Curriculum
Online laboratories for real and virtual instruments
Homework submission
Quizzes of various types (multiple choice, random parameters)
Assessment data access and analysis
Synchronous Delivery of Curricula including Audio/Video
Conferencing and other synchronous collaborative tools as Web
Services
Scheduling of courses and mentoring sessions
Asynchronous access, data-mining and knowledge discovery
Learning Plan agents to guide students and teachers
Grid Learning Model

Education and Research Grids share some services
both for content and “process”
• For example collaboration services are largely identical
• Research will use much larger simulation engines to get high
resolution results
• Maybe a researcher uses a CAVE to visualize; education a
Macintosh



Both can share data services, but run them through
different filters to select for precision (research) or
pedagogical value (education)
Education has “digital textbook” frontend to resources
of the research Grid
Both use same workflow technologies to link services
together
SERVOGrid for e-Education
[Figure labels: Repositories and Federated Databases; Field Trip Data; Sensors and Streaming Data; Loosely Coupled Filters; Discovery Services; Analysis and Visualization; Coarse grain simulations]
Some Observations





“Traditional” Grids manage and share asynchronous resources
in a rather centralized fashion
Peer-to-peer networks are “just like” Grids with different
implementations of message-based services like registration and
look-up
Collaboration systems like WebEx/Placeware (Application
sharing) or Polycom (audio/video conferencing) can be viewed as
Grids
Computers are fast and getting faster. One can afford many
strategies that used to be unrealistic, including rich, usually
XML-based, messaging
Web Services interact with messages
• Everything (including applications like PowerPoint) will be a
Web Service?
• Grids, P2P Networks, Collaborative Environments are (will
be) managed message-linked Web Services
Peer to Peer Grid
[Figure labels: Peers; Databases; Service Facing Web Service Interfaces; Event/Message Brokers; User Facing Web Service Interfaces]
A democratic organization
System and Application Services?




There are generic Grid system services: security, collaboration,
persistent storage, universal access
• OGSA (Open Grid Service Architecture) is implementing these
as extended Web Services
An Application Web Service is a capability used either by another
service or by a user
• It has input and output ports – data is from sensors or other
services
Consider Satellite-based Sensor Operations as a Web Service
• Satellite management (with a web front end)
• Each tracking station is a service
• Image Processing is a pipeline of filters – which can be grouped
into different services
• Data storage is an important system service
• Big services built hierarchically from “basic” services
Portals are the user (web browser) interfaces to Web services
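The idea of grouping a pipeline of filters into services, and building big services hierarchically from basic ones, can be sketched with ordinary function composition. The filter names and the pixel-list data format below are hypothetical stand-ins; in a real Grid each stage would be a separate Web Service exchanging messages.

```python
from functools import reduce

def compose(*stages):
    """Build one service from several: output of each stage feeds the next."""
    return lambda data: reduce(lambda value, stage: stage(value), stages, data)

# Hypothetical "basic" filter services acting on a list of pixel values
def calibrate(pixels):
    return [p - 1 for p in pixels]

def clip(pixels):
    return [max(0, min(255, p)) for p in pixels]

def threshold(pixels):
    return [1 if p > 100 else 0 for p in pixels]

cleanup = compose(calibrate, clip)    # two filters grouped into one service
detect = compose(cleanup, threshold)  # a bigger service built from services

print(detect([0, 101, 102, 300]))
```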
Satellite Science Grid Environment
[Figure labels: Sensor Data as a Web service (WS); Sensor Management WS; Filter1 WS, Filter2 WS, Filter3 WS (build as multiple Filter Web Services); Prog1 WS, Prog2 WS (build as multiple interdisciplinary Programs); Simulation WS; Data Analysis WS; Visualization WS]
What is Happening?

Grid ideas are being developed in (at least) three
communities
• Web Service – W3C, OASIS
• Grid Forum (High Performance Computing, e-Science)







Service Standards are being debated
Grid Operational Infrastructure is being deployed
Grid Architecture and core software being developed
Particular System Services are being developed
“centrally” – OGSA framework for this in
Lots of fields are setting domain specific standards and
building domain specific services
There is a lot of hype
Grids are viewed differently in different areas
• Largely “computing-on-demand” in industry (IBM, Oracle,
HP, Sun)
• Largely distributed collaboratories in academia
OGSA, OGSI & Hosting Environments



Start with Web Services in a hosting environment
Add OGSI to get a Grid service and a component model
Add OGSA to get Interoperable Grid “correcting” differences in base platform
and adding key functionalities
[Figure: layers, top to bottom]
Domain-specific services (not OGSA)
More specialized services: data replication, workflow, etc. (possibly OGSA)
Broadly applicable services: registry, authorization, monitoring, data access, etc. (OGSA Environment)
OGSI on Web Services
Hosting Environment for WS
Network
Label alongside: “Given to us from on high”
Technical Activities of Note





Look at different styles of Grids such as Autonomic (Robust
Reliable Resilient)
New Grid architectures hard due to investment required
Critical Services Such as
• Security – build message based not connection based
• Notification – event services
• Metadata – Use Semantic Web, provenance
• Databases and repositories – instruments, sensors
• Computing – Submit job, scheduling, distributed file
systems
• Visualization, Computational Steering
• Fabric and Service Management
• Network performance
Program the Grid – Workflow
Access the Grid – Portals, Grid Computing Environments
Issues and Types of Grid Services

1) Types of Grid
• R3
• Lightweight
• P2P
• Federation and Interoperability
2) Core Infrastructure and Hosting
Environment
• Service Management
• Component Model
• Service wrapper/Invocation
• Messaging
3) Security Services
• Certificate Authority
• Authentication
• Authorization
• Policy
4) Workflow Services and Programming
Model
• Enactment Engines (Runtime)
• Languages and Programming
• Compiler
• Composition/Development
5) Notification Services
6) Metadata and Information Services
• Basic including Registry
• Semantically rich Services and metadata
• Information Aggregation (events)
• Provenance





7) Information Grid Services
• OGSA-DAI/DAIT
• Integration with compute resources
• P2P and database models
8) Compute/File Grid Services
• Job Submission
• Job Planning Scheduling
Management
• Access to Remote Files, Storage and
Computers
• Replica (cache) Management
• Virtual Data
• Parallel Computing
9) Other services including
• Grid Shell
• Accounting
• Fabric Management
• Visualization Data-mining and
Computational Steering
• Collaboration
10) Portals and Problem Solving
Environments
11) Network Services
• Performance
• Reservation
• Operations
Technology Components of (Services in) a Computing Grid
[Figure: a Job Management Service (Grid Service interface to user or program client) with Remote Grid Services and Data. Numbered labels as they appear: 1: Job Management Service; 1: Plan Execution; 2: Schedule and control Execution; 3: Access to Remote Computers; 4: Job Submittal; 5: Data Transfer; 6: File and Storage Access; 7: Cache Data Replicas; 8: Virtual Data; 9: Grid MPI; 10: Job Status]
Approach



Build on e-Science methodology and Grid
technology
Science applications with multi-scale models,
scalable parallelism, data assimilation as key
issues
• Data-driven models for earthquakes,
climate, environment …..
Use existing code/database technology
(SQL/Fortran/C++) linked to “Application
Web/OGSA services”
• XML specification of models,
computational steering, scale supported
at “Web Service” level, since “high
performance” is not needed here
• Allows use of Semantic Grid technology
[Figure labels: Application WS; WS linking to user and other WS (data sources); Typical codes]
Grid Computing Environments
[Figure labels: User Services; Portal Services; Application Metadata; Application Service; Actual Application; Middleware; System Services (repeated around a “Core” Grid); Raw (HPC) Resources; Database]
Why we can dream of using HTTP
and that slow stuff





We have at least three tiers in computing
environment
Client (user portal)
“Middle Tier” (Web Servers/brokers)
Back end (databases, files, computers etc.)
In Grid programming, we use HTTP (and used to use
CORBA and Java RMI) in middle tier ONLY to
manipulate a proxy for real job
• Proxy holds metadata
• Control communication in middle tier only uses metadata
• “Real” (data transfer) high performance communication in
back end
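A minimal sketch of the proxy idea, assuming a hypothetical `JobProxy` class: the middle tier handles only small control messages and meta-data, while the data itself stays at a back-end location that the proxy merely names. The `gridftp://` address is a made-up placeholder, not a real endpoint.

```python
from dataclasses import dataclass, field

@dataclass
class JobProxy:
    """Middle-tier stand-in for a job: holds meta-data only, never the data."""
    job_id: str
    executable: str
    input_location: str  # where the data lives, not the data itself
    status: str = "pending"
    history: list = field(default_factory=list)

    def control(self, command):
        """Handle a small control message, the kind that could travel over HTTP."""
        self.history.append(command)
        if command == "submit":
            self.status = "running"
        elif command == "cancel":
            self.status = "cancelled"
        return {"job_id": self.job_id, "status": self.status}

# Hypothetical back-end address; the bulk transfer would happen there
proxy = JobProxy("job-1", "simulate", "gridftp://backend.example.org/data/run1")
print(proxy.control("submit"))
```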
Virtualization








The Grid could and sometimes does virtualize
various concepts – should do more
Location: URI (Uniform Resource Identifier)
virtualizes URL (WS-Addressing goes further)
Replica management (caching) virtualizes file
location, generalized by the GriPhyN virtual data concept
Protocol: message transport and WSDL bindings
virtualize transport protocol as a QoS request
P2P or Publish-subscribe messaging virtualizes
matching of source and destination services
Semantic Grid virtualizes Knowledge as a meta-data
query
Brokering virtualizes resource allocation
Virtualization implies all references can be indirect
and needs powerful mapping (look-up) services – metadata
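Indirect references backed by a look-up service can be sketched as below for the replica-management case. The catalogue contents, the `lfn:` naming scheme and the site names are all hypothetical, and the selection rule (first replica at an available site) is only one possible policy.

```python
# Hypothetical replica catalogue: logical name -> physical locations
REPLICAS = {
    "lfn:survey/image-001": [
        "http://site-a.example.org/store/image-001",
        "http://site-b.example.org/cache/image-001",
    ],
}

def resolve(logical_name, catalogue, available_sites):
    """Map a logical name to the first replica hosted at an available site."""
    for url in catalogue.get(logical_name, []):
        site = url.split("/")[2]  # host part of the URL
        if site in available_sites:
            return url
    raise LookupError(f"no available replica for {logical_name}")

print(resolve("lfn:survey/image-001", REPLICAS, {"site-b.example.org"}))
```

The caller only ever names the logical file; which physical copy is used can change without the caller noticing.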
Integration of Data and Filters



One has the OGSA-DAI Data repository interface
combined with WSDL of the (Perl, Fortran, Python
…) filter
User only sees WSDL not data syntax
Some non-trivial issues as to where the filtering
compute power is
• Microsoft says filter next to data
[Figure labels: WSDL of Filter; Filter; OGSA-DAI Interface; DB]
SERVOGrid Complexity Computing Environment
[Figure labels: Users; Portal Aggregation; CCE Control; Middle Tier with XML Interfaces; XML Meta-data Service; Database Service and Database; Sensor Service; Application Service-1, Application Service-2, Application Service-3; Parallel Simulation Service; Compute Service; Complexity Simulation Service; Visualization Service]
SERVOGrid (Complexity) Computing Model
[Figure labels: OGSA-DAI Grid Services; Grid; Grid Data Assimilation; HPC Simulation; Analysis; Control; Visualize]
This type of Grid integrates with parallel computing
Multiple HPC facilities, but only one is used at a time
Many simultaneous data sources and sinks
Distributed filters massage data for simulation
Two-level Programming I


The paradigm implicitly assumes a two-level
Programming Model
We make a Service (same as a “distributed object” or
“computer program” running on a remote computer)
using conventional technologies
• C++ Java or Fortran Monte Carlo module
• Data streaming from a sensor or Satellite
• Specialized (JDBC) database access

Such services accept and produce data from users, files
and databases
Service

Data
The Grid is built by coordinating such services
assuming we have solved problem of programming the
service
Two-level Programming II




The Grid is discussing the composition of distributed
services with the runtime interfaces to Grid as
opposed to UNIX pipes/data streams
[Figure labels: Service1; Service2; Service3; Service4]
Familiar from use of UNIX Shell, PERL or Python
scripts to produce real applications from core programs
Such interpretative environments are the single
processor analog of Grid Programming
Some projects like GrADS from Rice University are
looking at integration between service and composition
levels but dominant effort looks at each level separately
Conclusions







Grids are inevitable and pervasive
Can expect Web Services and Grids to merge with a common
set of general principles but different implementations with
different scaling and functionality trade-offs
e-Science will grow in importance as Science grows as an
international “team sport”; affects scientists and
organizations
Enough is known that one can start today
We will be flooded with data, information and purported
knowledge
One should be learning about Grids; understanding relevant
Web and Grid standards and developing new domain specific
standards
Note many existing (standards) efforts assume client-server
and not a brokered service model; these will need to change!