Overview of Grid Computing

The Grid, Grid Services and the Semantic
Web: Technologies and Opportunities
Dr. Carl Kesselman
Director
Center for Grid Technologies
Information Sciences Institute
University of Southern California
Outline

What are Grids?

Grid technology
- Globus and the Open Grid Services Architecture

Grids and the Semantic Web
How do we solve problems?

Communities committed to common goals
- Virtual organizations


Teams with heterogeneous members &
capabilities
Distributed geographically and politically
- No location/organization possesses all required skills
and resources

Adapt as a function of the situation
- Adjust membership, reallocate responsibilities,
renegotiate resources
The Grid Vision
“Resource sharing & coordinated problem
solving in dynamic, multi-institutional virtual
organizations”
- On-demand, ubiquitous access to computing, data,
and services
- New capabilities constructed dynamically and
transparently from distributed services
“When the network is as fast as the computer's
internal links, the machine disintegrates across
the net into a set of special purpose appliances”
(George Gilder)
Biomedical Informatics
Research Network (BIRN)


Evolving reference set of
brains provides essential data
for developing therapies for
neurological disorders
(multiple sclerosis,
Alzheimer’s, etc.).
Today
- One lab, small patient base
- 4 TB collection

Tomorrow
- 10s of collaborating labs
- Larger population sample
- 400 TB data collection: more
brains, higher resolution
- Multiple scale data integration and
analysis
National Virtual Observatory
http://virtualsky.org/
From Caltech CACR, Caltech Astronomy, and Microsoft Research
Virtual Sky has 140,000,000 tiles (140 Gbyte)
- Change scale
- Change theme: optical (DPOSS), X-ray (ROSAT)
- Coma cluster
Living in an Exponential World
(1) Computing & Sensors
Moore’s Law: transistor count doubles every 18 months
Magnetohydrodynamics
star formation
Living in an Exponential World:
(2) Storage



Storage density doubles every 12 months
Dramatic growth in online data (1 petabyte =
1000 terabytes = 1,000,000 gigabytes)
- 2000: ~0.5 petabyte
- 2005: ~10 petabytes
- 2010: ~100 petabytes
- 2015: ~1000 petabytes?
Transforming entire disciplines in physical and,
increasingly, biological sciences; humanities
next?
An Exponential World: (3) Networks
(Or, Coefficients Matter …)

Network vs. computer performance
- Computer speed doubles every 18 months
- Network speed doubles every 9 months
- Difference = order of magnitude per 5 years

1986 to 2000
- Computers: x 500
- Networks: x 340,000

2001 to 2010
- Computers: x 60
- Networks: x 4000
Moore’s Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan. 2001) by Cleo Vilett; source: Vinod Khosla, Kleiner Perkins Caufield & Byers.
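The doubling periods quoted above can be checked with a few lines of arithmetic. The sketch below assumes idealized, constant doubling periods (18 months for computers, 9 months for networks, as on the slide); the function names are illustrative only.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Multiplicative improvement after `months` at a fixed doubling period."""
    return 2.0 ** (months / doubling_period_months)


COMPUTER_DOUBLING = 18  # months, from the slide
NETWORK_DOUBLING = 9    # months, from the slide


def gap(months: float) -> float:
    """Factor by which network improvement outpaces computer improvement."""
    return (growth_factor(months, NETWORK_DOUBLING)
            / growth_factor(months, COMPUTER_DOUBLING))


if __name__ == "__main__":
    for label, years in [("5 years", 5), ("2001 to 2010", 9), ("1986 to 2000", 14)]:
        m = years * 12
        print(f"{label}: computers x{growth_factor(m, COMPUTER_DOUBLING):,.0f}, "
              f"networks x{growth_factor(m, NETWORK_DOUBLING):,.0f}, "
              f"gap x{gap(m):,.0f}")
```

Under these assumptions the gap over 5 years is about a factor of 10, which is the "order of magnitude per 5 years" on the slide. Nine years gives x64 and x4096, close to the quoted x60 and x4000; fourteen years gives roughly x650 and x420,000, the same order as the quoted x500 and x340,000.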
The Grid World: Current Status


Dozens of major Grid projects in scientific &
technical computing/research & education
Considerable consensus on key concepts and
technologies
- Open source Globus Toolkit™ a de facto standard for
major protocols & services
- Far from complete or perfect, but out there, evolving
rapidly, and large tool/user base


Industrial interest emerging rapidly
Opportunity: convergence of eScience and
eBusiness requirements & technologies
The Next Step

Globus leverages standard protocols
- TLS, LDAP, X.509, HTTP
- Only TCP in common

Is there a better foundation for Grid functions?
- More unified protocol stack (common base)
- Better support for virtualization
- Leverage commodity infrastructure
“Web Services”

Increasingly popular standards-based framework for
accessing network applications
- W3C standardization; Microsoft, IBM, Sun, others

WSDL: Web Services Description Language
- Interface Definition Language for Web services

SOAP: Simple Object Access Protocol
- XML-based RPC protocol; common WSDL target

WS-Inspection
- Conventions for locating service descriptions

UDDI: Universal Description, Discovery, and Integration
- Directory for Web services
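To make "XML-based RPC protocol" concrete, here is a minimal sketch that wraps a call in a SOAP 1.1 envelope and unpacks it again using only the Python standard library. The envelope namespace is the standard SOAP 1.1 one; the application namespace `urn:example:grid-demo`, the operation `submitJob`, and its parameters are hypothetical and not taken from any real service.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:grid-demo"  # hypothetical application namespace


def build_request(operation: str, params: dict) -> bytes:
    """Wrap an RPC-style call in a SOAP 1.1 envelope."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        # Parameters are serialized as unqualified child elements with text values.
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_request(message: bytes):
    """Recover the operation name and its parameters from an envelope."""
    envelope = ET.fromstring(message)
    call = envelope.find(f"{{{SOAP_ENV}}}Body")[0]  # first child of Body is the call
    operation = call.tag.split("}", 1)[1]           # strip the "{namespace}" prefix
    return operation, {child.tag: child.text for child in call}


if __name__ == "__main__":
    msg = build_request("submitJob", {"executable": "/bin/hostname", "count": 4})
    print(msg.decode("utf-8"))
    print(parse_request(msg))
```

In practice a WSDL document would describe the operation and its message types, and a toolkit would generate this marshalling code; the sketch only shows what travels on the wire. Note that parameter values come back as strings, since the envelope itself carries no type information here.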
Transient Service Instances

“Web services” address discovery & invocation
of persistent services
- Interface to persistent state of entire enterprise

In Grids, must also support transient service
instances, created/destroyed dynamically
- Interfaces to the states of distributed activities
- E.g. workflow, video conferencing, distributed data analysis

Significant implications for how services are
managed, named, discovered, and used
- In fact, much of our work is concerned with the
management of service instances
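The difference between a persistent service and transient instances can be sketched as a factory that creates, names, lists, and destroys instances on demand. This is an illustrative model only: the class names, the handle format, and the host `grid.example.org` are assumptions, loosely inspired by the factory and registry roles described for OGSA rather than by any actual Globus API.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class ServiceInstance:
    handle: str                                # unique name for this transient instance
    kind: str                                  # e.g. "workflow", "data-analysis"
    state: dict = field(default_factory=dict)  # state of the activity it fronts


class Factory:
    """Creates transient service instances and keeps a registry of live ones."""

    def __init__(self, authority: str):
        self._authority = authority
        self._ids = itertools.count(1)
        self._registry = {}

    def create_service(self, kind: str, **state) -> ServiceInstance:
        # Each instance gets a fresh handle so clients can name it later.
        handle = f"{self._authority}/{kind}/{next(self._ids)}"
        instance = ServiceInstance(handle, kind, dict(state))
        self._registry[handle] = instance
        return instance

    def find(self, kind: str) -> list:
        """Discovery: list the live instances of a given kind."""
        return [s for s in self._registry.values() if s.kind == kind]

    def destroy(self, handle: str) -> None:
        """Remove an instance; destroying an unknown handle is a no-op."""
        self._registry.pop(handle, None)


if __name__ == "__main__":
    factory = Factory("grid.example.org")  # hypothetical host name
    wf = factory.create_service("workflow", owner="alice")
    print(wf.handle, [s.handle for s in factory.find("workflow")])
    factory.destroy(wf.handle)
    print(factory.find("workflow"))
```

The point of the sketch is that naming, discovery, and cleanup become part of the service model itself once instances come and go dynamically.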
OGSA Design Principles

Service orientation to virtualize resources
- Everything is a service

From Web services
- Standard interface definition mechanisms: multiple
protocol bindings, local/remote transparency

From Grids
- Service semantics, reliability and security models
- Lifecycle management, discovery, other services

Multiple “hosting environments”
- C, J2EE, .NET, …
The Grid Service =
Interfaces + Service Data
Figure: anatomy of a Grid service
- Bindings: reliable invocation, authentication
- GridService interface: service data access, explicit destruction, soft-state lifetime
- … other interfaces …: notification, authorization, service creation, service registry, manageability, concurrency
- Service data elements (several per service)
- Implementation
- Hosting environment/runtime (“C”, J2EE, .NET, …)
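Soft-state lifetime, explicit destruction, and service data access can be illustrated with a small in-memory model. This is a sketch under assumptions, not the OGSA interface definition: the method names are Python-style stand-ins, and the injectable `clock` parameter exists only so the behaviour can be demonstrated without waiting.

```python
import time


class GridService:
    """Toy service with service data elements and a soft-state lifetime."""

    def __init__(self, lifetime_s: float, clock=time.monotonic):
        self._clock = clock
        self._service_data = {}   # named service data elements
        self._destroyed = False
        self.set_termination_time(lifetime_s)

    def set_termination_time(self, lifetime_s: float) -> None:
        """Keepalive: move the expiry to now + lifetime_s."""
        self._expires_at = self._clock() + lifetime_s

    def destroy(self) -> None:
        """Explicit destruction, independent of the lifetime."""
        self._destroyed = True

    def is_alive(self) -> bool:
        return not self._destroyed and self._clock() < self._expires_at

    def set_service_data(self, name: str, value) -> None:
        self._service_data[name] = value

    def find_service_data(self, name: str):
        """Service data access; refuses once the instance has expired."""
        if not self.is_alive():
            raise RuntimeError("service instance has expired or was destroyed")
        return self._service_data[name]


if __name__ == "__main__":
    now = [0.0]                                   # fake clock for the demonstration
    svc = GridService(10, clock=lambda: now[0])
    svc.set_service_data("status", "running")
    now[0] = 8.0
    svc.set_termination_time(10)                  # client renews interest at t=8
    now[0] = 15.0
    print(svc.is_alive(), svc.find_service_data("status"))
    now[0] = 19.0                                 # no further keepalive: reclaimed
    print(svc.is_alive())
```

The design choice worth noting is that a client which crashes or loses interest never has to clean up: without keepalives the instance simply expires, while well-behaved clients can still destroy it explicitly.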
Given a set of Services?




How do we do a better job of finding out what
services we want to use?
How do we do a better job of configuring
services?
How do we do a better job of composing and
nesting services?
Answer: Do a better job of representing
services
Deeper representation of services

Information is captured via structure
- X.509 certificates, MDS models, CIM schema,
Metadata

Knowledge expresses relationships between
entities
- Concepts and relationships
- Logical framework for inference over relationships
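The step from structured information to knowledge can be sketched with subject-predicate-object triples and a pair of inference rules. The vocabulary below (`type`, `subClassOf`, `LinuxCluster`, `node17`, and so on) is made up for illustration and is not drawn from any real schema.

```python
# Hypothetical facts about Grid resources, as (subject, predicate, object) triples.
TRIPLES = {
    ("node17", "type", "LinuxCluster"),
    ("node17", "cpuCount", "64"),
    ("LinuxCluster", "subClassOf", "ComputeResource"),
    ("ComputeResource", "subClassOf", "GridResource"),
}


def infer(triples):
    """Forward-chain two rules until nothing new is derived:
    (a) subClassOf is transitive;
    (b) an instance of a class is also an instance of its superclasses."""
    facts = set(triples)
    while True:
        new = set()
        for (s, p, o) in facts:
            if p not in ("type", "subClassOf"):
                continue
            for (s2, p2, o2) in facts:
                if p2 == "subClassOf" and s2 == o:
                    new.add((s, p, o2))
        if new <= facts:
            return facts
        facts |= new


def query(facts, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(t for t in facts
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))


if __name__ == "__main__":
    facts = infer(TRIPLES)
    print(query(facts, p="type", o="GridResource"))
```

Nothing in the stored facts says that `node17` is a `GridResource`; that answer exists only because the relationships between concepts are represented and reasoned over.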
Vision
“The Semantic Web is an extension of the current Web in which
information is given a well-defined meaning, better enabling
computers and people to work in cooperation. It is the idea of
having data on the Web defined and linked in a way that it can
be used for more effective discovery, automation, integration
and reuse across various applications. The Web can reach its
full potential if it becomes a place where data can be processed
by automated tools as well as people”
From the W3C Semantic Web Activity statement
Resource Description Framework
Ontologies Everywhere

What happens if knowledge permeates the
Grid?
- Data elements
- Service descriptions (service data elements)
- Protocols (e.g. policy, provisioning)

More dynamic and general model than the
Semantic Web
- OGSA lifetime model
- OGSA SDE model
Cognitive Grid


Grid Services + Ontologies + Knowledge
Driven Services
Examples
- Knowledge driven matchmaking
- Agent based service composition
- High-level planning and resource discovery
- Knowledge based provisioning

Some people are using the term “semantic grid” to
describe Grid Services + Knowledge
SCEC Modeling Environment
KNOWLEDGE REPRESENTATION & REASONING
- Knowledge Server: knowledge base access, inference
- Translation Services: syntactic & semantic translation
- Knowledge Base: Ontologies (curated taxonomies, relations & constraints); Pathway Models (pathway templates, models of simulation codes)
DIGITAL LIBRARIES
- Navigation & Queries: versioning, topic maps
- Mediated Collections: federated access
- Code Repositories: FSM, RDM, AWM, SRM
- Data Collections
- Data & Simulation Products
KNOWLEDGE ACQUISITION
- Acquisition Interfaces: dialog planning, pathway construction strategies
- Pathway Assembly: template instantiation, resource selection, constraint checking
GRID
- Pathway Execution: policy, data ingest, repository access
- Grid Services: compute & storage management, security
- Pathway Instantiations
- Computing, Storage
Users
DOCKER: Publishing SHA Code
Web
Browser
DOCKER
User
Interface
Constraint
Acquisition
Model
Specification
Wrapper
Generation
(WSDL, PWL)
User specifies:
- Types of model parameters
- Format of input messages
- Documentation
- Constraints
AS97: docs, types, msg, constrs
AS97 ontology
(Y. Gil, USC/ISI)
SCEC
ontologies
Recommends other models
Yes
Did you know that [Sadigh97] is a good model for dist >80 miles?
Automatically Generates Interface
Automatically Generates KR
Description
myGrid Project - bioinformatics

Imminent ‘deluge’ of
genomics data
- Highly heterogeneous, highly
complex and inter-related
Convergence of data and
literature archives
1. Database access from the Grid
2. Process enactment on the Grid
3. Personalisation services
4. Metadata services
Grid Services + Ontologies
Carole Goble, U. Manchester
Resource selection: Matchmaking

Providers and requesters describe themselves
- Syntactic description
> Structured or Semi-structured

A Matchmaker matches compatible classads
- Match based on attribute name, simple prioritization

Semantic matchmaking
- Inference based matching (e.g. CIM+relations)
- Automatic classification (e.g. description logic)
- Leverage domain specific ontologies
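The contrast between syntactic and semantic matchmaking can be sketched as follows. This is a deliberately simplified model: real classads support richer expressions, and the `IS_A` mini-ontology, the attribute names, and the cluster names are all hypothetical.

```python
# Hypothetical mini-ontology: each concept maps to its parent concept.
IS_A = {"RedHat": "Linux", "Linux": "Unix"}

# Hypothetical provider descriptions.
OFFERS = [
    {"name": "clusterA", "os": "RedHat", "memory_mb": 2048},
    {"name": "clusterB", "os": "Unix", "memory_mb": 1024},
    {"name": "clusterC", "os": "Windows", "memory_mb": 4096},
]

# Hypothetical request: needs a Unix system, prefers more memory.
REQUEST = {"require": {"os": "Unix"}, "rank": "memory_mb"}


def subsumes(general, specific) -> bool:
    """True if `specific` is `general` or one of its descendants in IS_A."""
    while specific is not None:
        if specific == general:
            return True
        specific = IS_A.get(specific)
    return False


def syntactic_match(request, offer) -> bool:
    """Every required attribute must be present with an identical value."""
    return all(offer.get(k) == v for k, v in request["require"].items())


def semantic_match(request, offer) -> bool:
    """As above, but a more specific concept satisfies a more general one."""
    return all(k in offer and subsumes(v, offer[k])
               for k, v in request["require"].items())


def matchmake(request, offers, match):
    """Return the compatible offers, best first by the request's rank attribute."""
    compatible = [o for o in offers if match(request, o)]
    return sorted(compatible, key=lambda o: o.get(request["rank"], 0), reverse=True)


if __name__ == "__main__":
    print([o["name"] for o in matchmake(REQUEST, OFFERS, syntactic_match)])
    print([o["name"] for o in matchmake(REQUEST, OFFERS, semantic_match)])
```

With exact attribute matching only `clusterB` qualifies; once the matchmaker knows that RedHat is a kind of Linux and Linux a kind of Unix, `clusterA` also qualifies and ranks first on memory.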
Pegasus:
Planning for Execution in Grids

Create workflow to
create virtual data
- Domain specific and
generic rules

Map workflow onto
Grid resources
- System state via Grid
services (MDS, RLS,…)
- Global and local
optimization criteria
Figure: Pegasus planning pipeline
Components:
- Chimera, VDL Generator
- Request Manager
- Current State Generator
- MCS, RLS, MDS
- Transformation Catalog
- Abstract and Concrete Planner (Abstract DAG reduction, Concrete Planner)
- Submit File Generator for Condor-G
- DAGMan Submission and Monitoring
- Condor-G/DAGMan
Numbered flows:
(1) Abstract Workflow (DAG)
(2) Abstract DAG
(3) Logical File Names (LFNs)
(4) Physical File Names (PFNs)
(5) Full Abstract DAG
(6) Reduced Abstract DAG
(7) Logical Transformations
(8) Physical Transformations and Execution Environment Information
(9) Concrete DAG
(10) Concrete DAG
(11) DAGMan files
(12) DAGMan files
(13) DAG
(14) Log files
(15) Monitoring
(18) Results
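The two planning stages named above, abstract DAG reduction and concrete planning, can be sketched in a few functions. This is an illustration of the idea only, not Pegasus code: the job names, file names, site names, URLs, and the dictionaries standing in for the replica and transformation catalogs are all hypothetical.

```python
# Abstract workflow: job -> (input logical file names, output logical file names).
JOBS = {
    "extract": (["raw.dat"], ["events.dat"]),
    "calibrate": (["events.dat"], ["calibrated.dat"]),
    "plot": (["calibrated.dat"], ["plot.png"]),
}

# Stand-in for a replica catalog: logical file name -> physical file names.
REPLICAS = {
    "raw.dat": ["gsiftp://siteA.example.org/data/raw.dat"],
    "events.dat": ["gsiftp://siteB.example.org/data/events.dat"],
}

# Stand-in for a transformation catalog: (job, site) -> executable path.
TRANSFORMATIONS = {
    ("calibrate", "siteB"): "/opt/bin/calibrate",
    ("plot", "siteA"): "/usr/local/bin/plot",
    ("plot", "siteB"): "/opt/bin/plot",
}


def reduce_dag(jobs, replicas, requested):
    """Keep only the jobs needed to produce files that do not exist yet."""
    producer = {out: name for name, (_, outs) in jobs.items() for out in outs}
    needed, stack = set(), list(requested)
    while stack:
        lfn = stack.pop()
        if lfn in replicas:        # already materialized somewhere: reuse it
            continue
        job = producer[lfn]        # KeyError if no job can produce the file
        if job not in needed:
            needed.add(job)
            stack.extend(jobs[job][0])
    return needed


def concretize(jobs, needed, replicas, transformations, site_order):
    """Bind each needed job to a site and executable, and known inputs to PFNs."""
    plan = {}
    for name in sorted(needed):
        inputs, _ = jobs[name]
        # First preferred site that has the executable installed
        # (raises StopIteration if no site does).
        site = next(s for s in site_order if (name, s) in transformations)
        plan[name] = {
            "site": site,
            "executable": transformations[(name, site)],
            # Inputs missing from the catalog are produced by upstream jobs.
            "inputs": {lfn: replicas[lfn][0] for lfn in inputs if lfn in replicas},
        }
    return plan


if __name__ == "__main__":
    needed = reduce_dag(JOBS, REPLICAS, ["plot.png"])
    print(sorted(needed))
    print(concretize(JOBS, needed, REPLICAS, TRANSFORMATIONS, ["siteA", "siteB"]))
```

Because `events.dat` already has a replica, the `extract` job is pruned and only `calibrate` and `plot` are planned. A real planner would also weigh global and local optimization criteria when choosing sites; here site choice is just a fixed preference order.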
Summary

Technology exponentials are changing the
shape of scientific investigation & knowledge
- More computing, even more data, yet more
networking


The Grid: Resource sharing & coordinated
problem solving in dynamic, multi-institutional
virtual organizations
Many potential opportunities for application of
semantic web technologies to Grid services
- OGSA
Partial Acknowledgements

Open Grid Services Architecture design
- Karl Czajkowski @ USC/ISI
- Ian Foster, Steve Tuecke @ANL
- Jeff Nick, Steve Graham, Jeff Frey @ IBM

Semantic/Cognitive Grid
- Yolanda Gil, Ewa Deelman, Jim Blythe, Tom Russ, Hans
Chalupsky
- Conversations with Jim Hendler, Carole Goble, David
De Roure

Strong links with many EU, UK, US Grid projects

Support from DOE, NASA, NSF, Microsoft
For More Information

Grid Book
- www.mkp.com/grids

The Globus Project™
- www.globus.org

OGSA
- www.globus.org/ogsa

Global Grid Forum
- www.gridforum.org