Design decisions: architecture Richard J. White Cardiff School of Computer Science

BiodiversityWorld Grid Workshop
NeSC, Edinburgh, 30 June - 1 July 2005
Richard White
Design decisions: architecture 1 July 2005
BDWorld architecture: user perspective (original ideas)
[Diagram. Components shown: User; Problem Solving Environment user interface (broker agents, facilitator agents, presentation agents); local tools; BDGrid; Proxies; analytic tools; taxonomic index (Species 2000 & ITIS Catalogue of Life) with GSDs; thematic data source; abiotic data source; ontology (metadata, intelligent links, resource and analytic tool descriptions, maintenance tools).]
Design principles 1: the Grid
• Creating a Grid for biodiversity informatics
• Current Grid practice and software keep changing
• Architecture and much of the implementation should be insulated from changes in Grid technology: such changes should require no change to resource software (other than rebuilding wrappers); only our interface to the Grid would need to change
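A minimal sketch of the insulation idea above, in Python. The names GridTransport, LoopbackTransport, BGI.call and the toy lookup_name operation are assumptions made for illustration and are not taken from the BDWorld code. The point is only that components talk to one interface object, so a change of Grid middleware would mean writing a new transport class and nothing else.

```python
from abc import ABC, abstractmethod


class GridTransport(ABC):
    """Hypothetical seam between BDWorld and whichever Grid middleware is in use."""

    @abstractmethod
    def invoke(self, resource: str, operation: str, args: dict) -> dict:
        ...


class LoopbackTransport(GridTransport):
    """Stand-in 'middleware' that calls in-process handlers; illustrative only."""

    def __init__(self, handlers: dict):
        self._handlers = handlers

    def invoke(self, resource: str, operation: str, args: dict) -> dict:
        return self._handlers[(resource, operation)](**args)


class BGI:
    """The only object components talk to; the transport behind it can be swapped."""

    def __init__(self, transport: GridTransport):
        self._transport = transport

    def call(self, resource: str, operation: str, **args) -> dict:
        return self._transport.invoke(resource, operation, args)


def lookup_name(name: str) -> dict:
    # Toy operation standing in for a real taxonomic lookup.
    return {"accepted_name": name.strip().capitalize()}


bgi = BGI(LoopbackTransport({("taxonomic_index", "lookup"): lookup_name}))
result = bgi.call("taxonomic_index", "lookup", name="  vicia faba ")
```

Replacing LoopbackTransport with a class written for a real middleware would leave the last two lines, and every component written against BGI.call, unchanged.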
Architecture 1: interfacing to a Grid
[Diagram: a software component in BDWorld reaches the Grid through the BGI API and the BDWorld-GRID Interface (BGI).]
Design principles 2: services
• Data sources, analytical tools, etc. should be made available as services which can be invoked remotely by clients
• Service-oriented computing is a Good Thing: users do not need to install or adapt resources to their own environments
• Potential for interoperability with other Grids in related domains such as environmental, molecular and genomic biology (e.g. myGrid)
Architecture 2: invocation via the Grid
[Diagram: two software components, each with its own BGI API and BDWorld-GRID Interface (BGI), connected through the Grid.]
Design principles 3: wrappers
• The services are made available as Operations provided by a Resource
• A Resource is connected to the BDWorld Grid through a Wrapper
• Any program could therefore be a resource; only the difficulty of wrapping it would vary
• Resources and wrappers should be implementable in any language
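The Resource, Operation and Wrapper relationship above might look roughly like this in Python. This is a sketch under assumptions: the Wrapper class, its operation decorator and the climate_model example are hypothetical names, not the BDWorld wrapper API.

```python
from typing import Callable, Dict, List


class Wrapper:
    """Hypothetical wrapper: publishes a resource's functions as named Operations."""

    def __init__(self, resource_name: str):
        self.resource_name = resource_name
        self._operations: Dict[str, Callable[..., object]] = {}

    def operation(self, name: str):
        # Decorator that registers a callable under an operation name.
        def register(func):
            self._operations[name] = func
            return func
        return register

    def list_operations(self) -> List[str]:
        return sorted(self._operations)

    def invoke(self, name: str, **args):
        if name not in self._operations:
            raise KeyError(f"{self.resource_name} has no operation {name!r}")
        return self._operations[name](**args)


climate = Wrapper("climate_model")  # illustrative resource name


@climate.operation("mean_temperature")
def mean_temperature(readings):
    # Stand-in for whatever the wrapped program really computes.
    return sum(readings) / len(readings)


mean = climate.invoke("mean_temperature", readings=[10.0, 14.0])
```

The caller sees only operation names and arguments, so the function behind an operation could equally be a call into a program written in another language.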
Architecture 3: wrapped resources
[Diagram: the User's workflow enactment engine and a remote Resource with its Wrapper each connect through a BGI API and BDWorld-GRID Interface (BGI) to the Grid.]
Design principles 4: workflows
• User metaphor based on the concept of workflows, which requires a workflow manager for design and enactment of workflows
• Flexible use and re-use of workflows
• Resource interoperability with heterogeneous data, complex in structure
• Need to be able to select suitable resources which “fit together” in a workflow, which requires metadata
• Need to record activities, data generated, etc.
• Did I say we also need a user interface?
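One way to read "fit together" is that each operation's declared output type must match the next operation's declared input type. The Python sketch below checks that from metadata alone. OperationMeta, check_workflow and the type names are illustrative assumptions, not BDWorld's metadata model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperationMeta:
    """Hypothetical metadata record describing one operation."""
    name: str
    input_type: str
    output_type: str


def check_workflow(steps):
    """Return a list of mismatches between consecutive steps (empty if they fit)."""
    problems = []
    for upstream, downstream in zip(steps, steps[1:]):
        if upstream.output_type != downstream.input_type:
            problems.append(
                f"{upstream.name} -> {downstream.name}: "
                f"{upstream.output_type} != {downstream.input_type}"
            )
    return problems


# Made-up operations and type names, for illustration only.
lookup = OperationMeta("lookup_name", "ScientificName", "AcceptedName")
occurrences = OperationMeta("get_occurrences", "AcceptedName", "OccurrenceRecords")
model = OperationMeta("fit_model", "OccurrenceRecords", "ClimateEnvelope")

good = check_workflow([lookup, occurrences, model])
bad = check_workflow([lookup, model])
```

Here good is empty, while bad reports that lookup_name produces AcceptedName where fit_model expects OccurrenceRecords.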
Architecture 4: (as we planned it)
[Diagram. Components shown: User; User interface; Legacy user interfaces; Presentation layer; Metadata repository; Workflow enactment engine; Local tools (e.g. Input and Output Units); Native BDWorld resources; Wrapped “legacy” resources; BGI API; BDWorld-GRID Interface (BGI); The GRID.]
Design principles 5: desiderata
Extensibility and flexibility are important:
• Minimise the joining ‘cost’ for
  • users (easy installation of local components)
  • providers (adding a new resource of a type not previously encountered)
• Adding attributes to the metadata requires no change to the Metadata Repository (MDR)
• Challenges with handling non-portable resources and inflexible user interfaces …
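One common way to let new metadata attributes appear without changing the repository is to store them as free-form attribute/value pairs per resource instead of fixed columns. The Python sketch below illustrates that idea only. MetadataRepository, describe, find and the example attributes are assumptions; the source does not say how the MDR achieves this.

```python
from collections import defaultdict


class MetadataRepository:
    """Hypothetical attribute/value store; no schema is fixed in advance."""

    def __init__(self):
        self._attrs = defaultdict(dict)

    def describe(self, resource: str, **attributes):
        # New attribute names are accepted without any change to this class.
        self._attrs[resource].update(attributes)

    def find(self, **criteria):
        # Resources whose attributes match every criterion given.
        return sorted(
            name
            for name, attrs in self._attrs.items()
            if all(attrs.get(key) == value for key, value in criteria.items())
        )


mdr = MetadataRepository()
mdr.describe("gsd_legumes", kind="data_source")
mdr.describe("climate_model", kind="analytic_tool")
# Later, a provider introduces an attribute nobody anticipated.
mdr.describe("gsd_legumes", licence="open")

open_sources = mdr.find(kind="data_source", licence="open")
```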
Legacy resource issues addressed
• We planned to deal with resources which:
  • interact with their user locally in real time
  • have not been designed to be scripted
  • cannot support multiple simultaneous invocations
  • run on specific platforms only
  • have other unexpected requirements
• Using techniques such as
  • capturing input and/or output and emulating a real user’s actions
  • providing user access to remote desktops
  • limiting where the user of a workflow can be sited
  • providing instructions for direct user control
  • modifying the source code
  • avoiding the use of the resource altogether
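Two of the techniques above, capturing input and output, and coping with a program that cannot handle simultaneous invocations, can be sketched in Python with the standard subprocess and threading modules. The "legacy program" here is a one-line stand-in that upper-cases a typed line. LEGACY_COMMAND and run_legacy are hypothetical names, and a real wrapper would need far more care.

```python
import subprocess
import sys
import threading

# Stand-in for an interactive legacy program: it reads one line and prints a reply.
LEGACY_COMMAND = [sys.executable, "-c", "print(input().upper())"]

# The program is assumed not to tolerate concurrent use, so calls are serialised.
_one_at_a_time = threading.Lock()


def run_legacy(text: str, timeout: float = 10.0) -> str:
    """Feed `text` to the program as if typed by a user and capture its output."""
    with _one_at_a_time:
        completed = subprocess.run(
            LEGACY_COMMAND,
            input=text + "\n",    # emulate the user's keystrokes
            capture_output=True,  # collect stdout and stderr
            text=True,
            timeout=timeout,
            check=True,           # raise if the program exits with an error
        )
    return completed.stdout.strip()


reply = run_legacy("acacia")
```

The lock only serialises calls within one wrapper process; a resource shared between several hosts would need some external form of locking.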
Architecture (as we built it)
[Diagram. Components shown: User; User interface (Protégé); User interface; Presentation layer; Metadata repository (MDR); Workflow enactment engine; Local tools (e.g. WFDA, Input and Output Units); MDA; Legacy user interfaces; Native BDWorld resources; Wrapped “legacy” resources; BGI API; BDWorld-GRID Interface (BGI); The GRID.]
Glossary of components (existing or Real Soon Now)
• Problem-solving Environment (PSE)
  • Workflow Designer, Enactment Engine, User Interface [Triana]
  • local Units in Toolboxes
    • proxies for remote Operations
    • local functions, including
      • Input and Output Units
      • Workflow Design Assistant (WFDA)
      • Metadata Agent (MDA)
• BDWorld-Grid Interface (BGI)
  • BGI Comms Layer, API, Wrappers
• Remote Resources
  • provide Operations (services)
• Metadata Repository (MDR)
  • BDWorld ontology, metadatabase, user interface
Current evaluation; future flexibility
We believe our architecture ensures that BDWorld is:
• not limited to a specific application domain
• extensible to cope with unanticipated uses and resources
Because:
• new resources can be added
• domain-specific knowledge resides
  • only in the resources and the MDR
  • not in the BGI or the workflow engine or its user interface or the Metadata Agent
• MDR contents come from the resources and from humans customising the MDR to assist in new domains
A dream
• Desktop environment in which scientists “drag & drop” data sources, analysis and modelling tools, and visualisation interfaces into a desired sequence of operations which can be run automatically
• Essentially a component-based visual programming environment for scientific tasks
• With additional features (some described earlier), the environment could be made richer and more productive, and could support research groups
• Not just for biodiversity!
Where do we go from here?
• Present system is a proof of concept
  • Limited
  • Restricted domain of exemplars
• Needs
  • more data resources
  • more PSE functionality (described next)
  • additional features
    • User interaction (described earlier)
    • Virtual organisations (described later)
Extra PSE functionality
Some of these topics are becoming available within the present BDWorld project
• Enhanced metadata
  • Provenance and data lineage
  • Automatic electronic “notebook”
• Stored workflows
  • Repeatability, reproducibility
  • Re-use with different data, changed parameters
• Ontologies
  • Resource discovery and improved selection
• Usability
  • Dynamic interaction of users with resources
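The "notebook" and re-use items above can be illustrated with a small Python sketch that records each step's name, parameters and result, then replays the record with changed parameters. Notebook, run, replay and the filter_counts step are hypothetical. Steps are treated as independent here, whereas a real workflow would pass data between them.

```python
class Notebook:
    """Hypothetical automatic notebook: one entry per executed step."""

    def __init__(self):
        self.entries = []

    def run(self, step_name, func, **params):
        result = func(**params)
        self.entries.append({"step": step_name, "params": params, "result": result})
        return result

    def replay(self, functions, overrides=None):
        """Re-run every recorded step, optionally with changed parameters."""
        overrides = overrides or {}
        fresh = Notebook()
        for entry in self.entries:
            params = {**entry["params"], **overrides.get(entry["step"], {})}
            fresh.run(entry["step"], functions[entry["step"]], **params)
        return fresh


def filter_counts(counts, cutoff):
    # Toy analysis step: keep counts at or above the cutoff.
    return [c for c in counts if c >= cutoff]


notebook = Notebook()
notebook.run("filter", filter_counts, counts=[1, 5, 9], cutoff=5)

# Same stored workflow, one parameter changed.
rerun = notebook.replay({"filter": filter_counts}, {"filter": {"cutoff": 9}})
```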
Virtual organisations
These are not going to be addressed during the present BDWorld project, but would make a good Computer Science component in future proposals
• Collaborative working environments
• Shared and private resources: data, tools
• Shared experimentation
• User authentication
• Access control
• Controlled release of data, tools and results
• Dynamic
  • Membership
  • Resources
The way forward
• New domain exemplars
• Links with national and international organisations, resources
• “End users”
• Applied use, driven by scientific priorities
• Input for planning
• Feedback for evaluation and improvement
• …