Research Process of the Future - Keith Jeffery

Research Workflow Process on the GRIDs Surface
Keith G Jeffery
President euroCRIS
Keith G Jeffery
Director, IT
Agenda
• Introduction
• The R&D Process: Recording
• The Key: Metadata and Data Exchange
Standards
• Workflow on the GRIDs surface
• Conclusion
Research Process Workflow GRIDs 2
© Keith G Jeffery
Director, IT
Nirvana
Commonly used to indicate an optimal state of a
person (professional) or system (suitable)
• Buddhism. “The ineffable ultimate in which one has
attained disinterested wisdom and compassion”
• Hinduism. “Emancipation from ignorance and the
extinction of all attachment”
In a euroCRIS context, best possible CRIS system(s)
for end-users backed by best advice
Nirvana - Retrieval
• An environment where an end-user can:
– Request information and through an intelligent dialogue
generate a ‘job’ which provides it
• Example (Medical R&D planning)
– How many researchers
• expert in GlycoProtein gp120 and CD4 molecule
– are likely to be available in 2015;
– Classify researchers by country, institution;
• order list of researchers by number of refereed
publications to date
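The retrieval example above can be sketched as a small 'job' over researcher records. This is only an illustration: the `Researcher` fields, the `available_until` attribute and the sample data are assumptions, not part of any actual CRIS schema.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Researcher:
    name: str
    country: str
    institution: str
    expertise: frozenset          # topics the person is expert in
    refereed_publications: int
    available_until: int          # hypothetical: last year the person is expected to be active


def plan_query(researchers, topics, year):
    """Count matching researchers, classify them, and rank them by publications."""
    wanted = frozenset(topics)
    matches = [r for r in researchers
               if wanted <= r.expertise and r.available_until >= year]
    # Classify by (country, institution).
    by_group = Counter((r.country, r.institution) for r in matches)
    # Order by number of refereed publications, highest first.
    ranked = sorted(matches, key=lambda r: r.refereed_publications, reverse=True)
    return len(matches), dict(by_group), [r.name for r in ranked]


# Made-up sample data, for illustration only.
SAMPLE = [
    Researcher("A. Example", "UK", "Inst-1", frozenset({"gp120", "CD4"}), 42, 2020),
    Researcher("B. Example", "FR", "Inst-2", frozenset({"gp120", "CD4"}), 57, 2016),
    Researcher("C. Example", "UK", "Inst-1", frozenset({"gp120"}), 90, 2030),
    Researcher("D. Example", "IT", "Inst-3", frozenset({"gp120", "CD4"}), 12, 2010),
]

count, breakdown, ranking = plan_query(SAMPLE, ["gp120", "CD4"], 2015)
```

In the vision described here the end-user would not write such a function; the intelligent dialogue would generate the equivalent job from the request.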
Nirvana – input / update
• An environment where an end-user can:
– Input / update information and through an
intelligent dialogue obtain assistance where
needed and validation of the input
• Example:
– if value input for ‘person’ then possible valid
values for ‘organisational unit’ suggested
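A minimal sketch of this kind of assisted input, assuming the CRIS already holds person-to-organisational-unit links. The `AFFILIATIONS` table and all identifiers are hypothetical.

```python
# Hypothetical table: the organisational units already linked to each person.
AFFILIATIONS = {
    "person-001": ["orgunit-physics", "orgunit-computing"],
    "person-002": ["orgunit-chemistry"],
}


def suggest_org_units(person_id, affiliations=AFFILIATIONS):
    """Suggest valid 'organisational unit' values once a 'person' has been entered."""
    return sorted(affiliations.get(person_id, []))


def validate_entry(person_id, org_unit_id, affiliations=AFFILIATIONS):
    """Return (is_valid, message, suggestions) for one input pair."""
    suggestions = suggest_org_units(person_id, affiliations)
    if person_id not in affiliations:
        return False, f"unknown person {person_id!r}", suggestions
    if org_unit_id not in suggestions:
        return False, f"{org_unit_id!r} is not recorded for {person_id!r}", suggestions
    return True, "ok", suggestions
```

For example, `validate_entry("person-002", "orgunit-physics")` rejects the input and returns `["orgunit-chemistry"]` as the suggested value.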
The Solution is Required:
• To overcome the ‘effort threshold’ to:
• obtain the required answers from the CRIS
• input and update the information in the CRIS
• maintain data quality in the CRIS
• Across
– local stand-alone CRIS
– heterogeneous distributed CRISs
• Thus achieving ‘nirvana’
Agenda
• Introduction
• The R&D Process: Recording
• The Key: Metadata and Data Exchange
Standards
• Workflow on the GRIDs surface
• Conclusion
The R&D Process: Recording
Workprogramme
CRIS
DATABASE
Proposal
Project
Results
Exploitation
WealthCreation
The R&D Process: Feedbacks
Workprogramme
CRIS
DATABASE
Proposal
Project
Results
Exploitation
WealthCreation
The R&D Process: Review
Workprogramme
CRIS
DATABASE
Proposal
Project
Results
Exploitation
WealthCreation
review
review review review
The WorkProgramme Process
Economic factors
Societal factors
CRIS
DATABASE
Technology Foresight
-World / Country State
-World / Country Models
-Technology Prediction
-Solicited Advice
Workprogramme
The Proposal Process
Idea
CRIS
DATABASE
Review Previous Work
Objectives
CRIS
DATABASE
-Previous Results
-Previous Projects
-Human Resources
-Finance
Method
Resources and
dependencies
Proposal
The Project Process
Project
CRIS
DATABASE
Project Management
System
CRIS
DATABASE
-Previous Results
-Previous Projects
-Human Resources
-Finance
The Results Process
Initial Results
CRIS
DATABASE
Internal Review
CRIS
DATABASE
Previous Results
Peer Review
Publication or
Registration
Results
The Exploitation Process
Results
Business Plan
CRIS
DATABASE
Finance
Marketing
Production
Marketing Information
Economic Information
Selling
Exploitation
The Wealth Creation Process
Exploitation
CRIS
DATABASE
marketing
employment
production
Marketing Information
Economic Information
WealthCreation
The R&D Process
Recording WorkProgramme
Workprogramme
ProgrammeName
Funding
OrgUnit
Person responsible
Workprogramme document
CRIS
DATABASE
The R&D Process
Recording Proposal
Proposal
Title
Abstract
Person(s)
OrgUnit(s)
Proposal Document
CRIS
DATABASE
The R&D Process
Recording Project
Project
Title
Abstract
Person(s)
OrgUnit(s)
Funding
Project Plan
CRIS
DATABASE
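The attributes listed on the Project recording slide could be held in a record such as the following. This is a sketch only: the class name, field names and types are assumptions and do not reproduce the CERIF data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProjectRecord:
    title: str
    abstract: str
    person_ids: List[str] = field(default_factory=list)     # Person(s)
    org_unit_ids: List[str] = field(default_factory=list)   # OrgUnit(s)
    funding_source: str = ""                                # Funding
    project_plan_ref: str = ""                              # pointer to the Project Plan document

    def missing_fields(self) -> List[str]:
        """Names of attributes not yet recorded (empty string or empty list)."""
        return [name for name, value in vars(self).items() if not value]


draft = ProjectRecord(title="T", abstract="A", person_ids=["p1"])
todo = draft.missing_fields()
```

A check like `missing_fields()` is one simple way a CRIS could prompt for the remaining attributes at each stage of the process.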
The R&D Process
Recording Results-Product
Person(s)
OrgUnit(s)
Project(s)
Product(s)
Product Description
CRIS
DATABASE
Results
The R&D Process
Recording Results-Patent
Person(s)
OrgUnit(s)
Project(s)
Patent(s)
Patent File
CRIS
DATABASE
Results
The R&D Process
Recording Results-Publication
Person(s)
OrgUnit(s)
Project(s)
Bibliographic Information
Article
CRIS
DATABASE
Results
The R&D Process
Recording Exploitation
Person(s)
OrgUnit(s)
Business plan
Finance Data
Marketing Data
Production Data
Sales Data
CRIS
DATABASE
Exploitation
The R&D Process
Recording Wealth Creation
Person(s)
OrgUnit(s)
Annual Reports/Accounts
Employment Records
Dividends Records
WealthCreation
CRIS
DATABASE
The R&D Process
Workprogramme
Nirvana
Proposal
Project
Results
Exploitation
WealthCreation
Note: some CRIS developers limit recording of outputs from the process to areas indicated
Complete Process ICT Support
• Nirvana is
– a complete,
– integrated,
– end-to-end ICT support
– for the research process
– across heterogeneous distributed CRISs
Agenda
• Introduction
• The R&D Process: Recording
• The Key: Metadata and Data Exchange
Standards
• Workflow on the GRIDs surface
• Conclusion
Metadata and Data Exchange Standards
• Metadata
– a succinct representation of
the object of interest
– Schema, navigational,
associative [descriptive,
restrictive, supportive]
– Used for rapid retrieval of
navigational data to objects
of interest
– Can also be used for
statistical purposes (‘how
many…’, ‘average number
of…’)
[Figure: SCHEMA, NAVIGATIONAL and ASSOCIATIVE metadata in relation to the data (document); figure labels: ‘view to users’, ‘constrain it’]
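The three kinds of metadata named on this slide can be illustrated with a small catalogue entry. The class, its field names and the sample entries are assumptions made for the sketch; they are not a published metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataEntry:
    schema: dict                                        # schema metadata: attribute name -> expected type
    navigational: str                                   # navigational metadata: where the object itself lives
    descriptive: dict = field(default_factory=dict)     # associative: describes the object
    restrictive: dict = field(default_factory=dict)     # associative: constraints on use
    supportive: dict = field(default_factory=dict)      # associative: supporting information


def _matches(entry, criteria):
    return all(entry.descriptive.get(k) == v for k, v in criteria.items())


def how_many(entries, **criteria):
    """Answer a 'how many...' question from the metadata alone."""
    return sum(1 for e in entries if _matches(e, criteria))


def locate(entries, **criteria):
    """Rapid retrieval: return locators of the objects of interest."""
    return [e.navigational for e in entries if _matches(e, criteria)]


# Made-up catalogue with hypothetical locators.
CATALOGUE = [
    MetadataEntry({"title": str}, "store://doc/1", {"subject": "grids", "year": 2004}),
    MetadataEntry({"title": str}, "store://doc/2", {"subject": "grids", "year": 2005}),
    MetadataEntry({"title": str}, "store://doc/3", {"subject": "metadata", "year": 2005}),
]
```

Both functions read only the metadata; the documents themselves are touched only when a returned locator is followed.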
Metadata
• Many kinds and standards exist
• Examples include:
– Publications: MARC, DC (Dublin Core)
– Geospatial: CSDGM (Content standard
for digital geospatial metadata)
– Engineering: STEP
– Education: LOM (learning object
metadata); EDNA (Education Network
Australia metadata)
Metadata and CRISs
• Commonly a CRIS stores the metadata rather
than the object itself
– e.g. result_publicationId which can be used to
access the publication itself (person{author},
title, abstract etc usually stored in the CRIS)
– e.g. projectId which can be used to access
the detailed project documentation (title,
abstract etc usually stored in the CRIS)
Metadata: DCf: Publications
Domain of CERIF
Project
Person
OrgUnit
Person
Descriptive
OrgUnit
UniqueId
UniqueId
Restrictive
Title
Security
Subject
Privacy
Keywords
Quality Assessment
AccessLevel
Description
Charge
Resource Type
Annotation
Coverage Temporal
Classification
Coverage Spatial
ResourceIdentifier
Navigational
Metadata in CRISs
• Used for
– Quality: validation on input / update
– Summarising: overview results
– Retrieval speed (find the list of objects
of potential interest)
– Controlling access
– Rights management
– And……..
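Two of the uses listed above, controlling access and summarising, can be sketched from metadata alone. The access levels, field names and records below are hypothetical.

```python
from collections import Counter

# Hypothetical access levels, from least to most restricted.
ACCESS_ORDER = ["public", "internal", "confidential"]


def visible_records(records, user_level):
    """Controlling access: keep records whose level does not exceed the user's."""
    rank = ACCESS_ORDER.index(user_level)
    return [r for r in records if ACCESS_ORDER.index(r["access_level"]) <= rank]


def summarise(records):
    """Summarising: overview counts per resource type."""
    return dict(Counter(r["resource_type"] for r in records))


# Made-up metadata records.
RECORDS = [
    {"id": "r1", "resource_type": "publication", "access_level": "public"},
    {"id": "r2", "resource_type": "patent", "access_level": "confidential"},
    {"id": "r3", "resource_type": "publication", "access_level": "internal"},
]
```

Here `visible_records(RECORDS, "internal")` returns r1 and r3 only, so the patent record is hidden without opening any underlying document.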
Metadata in Interoperating CRISs
• Metadata essential to allow interoperation of
CRISs, especially heterogeneous distributed
CRISs
• Provides the information necessary to set up
retrieval (or update) automatically over
heterogeneous CRISs
– Catalog technique
– Universal schema technique(s)
– Knowledge-based reconciliation technique(s)
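As an illustration of the universal schema technique, each CRIS can publish a mapping from its local attribute names to a common schema, so that one query runs over all of them. The CRIS names, attribute names and data here are invented for the sketch.

```python
# Hypothetical per-CRIS mappings: local attribute name -> common schema name.
MAPPINGS = {
    "cris_a": {"name": "title", "begin": "year"},
    "cris_b": {"project_title": "title", "start_year": "year"},
}


def to_common(source, record):
    """Translate one local record into the common schema."""
    mapping = MAPPINGS[source]
    return {mapping[k]: v for k, v in record.items() if k in mapping}


def federated_query(sources, predicate):
    """Run one predicate, written against the common schema, over every CRIS."""
    results = []
    for source, records in sources.items():
        for record in records:
            common = to_common(source, record)
            if predicate(common):
                results.append({**common, "source": source})
    return results


# Made-up contents of two heterogeneous CRISs.
SOURCES = {
    "cris_a": [{"name": "Grid study", "begin": 2003}],
    "cris_b": [{"project_title": "Metadata survey", "start_year": 2004},
               {"project_title": "Old project", "start_year": 1998}],
}

recent = federated_query(SOURCES, lambda r: r["year"] >= 2003)
```

The mapping tables are themselves metadata, which is why the slide treats metadata as the prerequisite for interoperation.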
Metadata and Data Exchange Standards
• Data Exchange Standards
– Needed not just for data (file) exchange
– Also for returning results of a retrieval from
one CRIS to another in a form (syntax,
semantics) that is processable
• Metadata plus dataset
– Note data exchange standards used
extensively in e-business, banking, insurance,
medical, engineering, research areas
The Key: Metadata and Data Exchange
Standards
• Nirvana is
– Formal metadata (machine
understandable)
– Query: Metadata describing CRIS
resources to improve queries
– Answer: Metadata attached to Query
result files (data exchange) so the
receiving CRIS or user can understand
the output
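A sketch of 'metadata attached to query result files', assuming a JSON envelope. The envelope layout and the type names are assumptions, not an existing data exchange standard.

```python
import json

# Hypothetical type vocabulary shared by sender and receiver.
TYPES = {"string": str, "integer": int}


def pack_result(rows, field_types, source):
    """Sender: attach metadata (source, field types) to the dataset."""
    envelope = {"metadata": {"source": source, "fields": field_types}, "dataset": rows}
    return json.dumps(envelope)


def unpack_result(message):
    """Receiver: use the attached metadata to check and interpret the dataset."""
    envelope = json.loads(message)
    fields = envelope["metadata"]["fields"]
    for row in envelope["dataset"]:
        for name, type_name in fields.items():
            if not isinstance(row.get(name), TYPES[type_name]):
                raise ValueError(f"field {name!r} is not of type {type_name}")
    return envelope["metadata"], envelope["dataset"]


message = pack_result([{"title": "Grid study", "year": 2003}],
                      {"title": "string", "year": "integer"},
                      "cris_a")
metadata, dataset = unpack_result(message)
```

Because the field descriptions travel with the data, the receiving CRIS does not need prior knowledge of the sender's schema.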
Agenda
• Introduction
• The R&D Process: Recording
• The Key: Metadata and Data Exchange
Standards
• Workflow on the GRIDs surface
• Conclusion
Workflow on the GRIDs surface
• GRIDs ‘surface’ provides
– Computational capabilities of GRID
– Information presentation capabilities of
WWW
– Information management capabilities
• But not yet an environment for workflow
Data to Knowledge
The GRIDs Architecture
Knowledge Layer
Information Layer
Computation / Data Layer
Data to Knowledge
The GRIDs Architecture
E-Business Application
Environmental Application
A POSSIBLE ARCHITECTURE
The GRIDs Environment
U: USER
Um: User Metadata
Ua: User Agent
S: SOURCE
Sm: Source Metadata
Sa: Source Agent
R: RESOURCE
Rm: Resource Metadata
Ra: Resource Agent
brokers
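The role of the brokers in this architecture can be sketched as matching user metadata against source and resource metadata. The matching rule (exact equality on every requested attribute) and all names are assumptions for illustration; the slide does not specify how brokering works.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    # Hypothetical metadata: for a user agent, what is wanted;
    # for a source or resource agent, what is offered.
    metadata: dict


def broker(user_agent, candidates):
    """Return names of candidate agents whose metadata satisfies the user's metadata."""
    wanted = user_agent.metadata
    return [c.name for c in candidates
            if all(c.metadata.get(k) == v for k, v in wanted.items())]


user = Agent("Ua", {"kind": "publications", "language": "en"})
candidates = [
    Agent("Sa-1", {"kind": "publications", "language": "en", "country": "UK"}),
    Agent("Sa-2", {"kind": "datasets", "language": "en"}),
    Agent("Ra-1", {"kind": "publications", "language": "fr"}),
]

offers = broker(user, candidates)
```

The agents negotiate only through metadata, so new sources or resources can join by registering their own description with the brokers.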
A Brief History of GRIDs
• 1G: custom-made architecture machines to user
– Pioneering metacomputing
• 2G: proprietary standards and interfaces
– I-WAY GLOBUS, UNICORE, CONDOR, e-Science
Apps
LEGION AVAKI
• 2.5G: added in FTP, SRB, LDAP, AccessGRID
• 3G: adopted W3C concepts for open interfaces
– OGSA / OGSI: note especially OGSA/DAI e-Science
R&D
– But built on 2.G foundations
But…..
• This comes nowhere near the
requirements as originally defined for
GRIDs
• Too low-level (programmer not end-user
level)
– Insufficient representativity
– Insufficient expressivity
– Insufficient resilience
– Insufficient dynamic flexibility
So…..
• The US GRID is metacomputing plus
extensions
– In 2002 improved with OGSA using
W3C Web Services ideas
• European position is that GRID
architecture (GLOBUS or even UNICORE)
is the wrong starting point for the
European vision
And…..
• EC persuaded of importance of GRIDs
– Started in IST/Environment (early 2000) with
IT architectural framework for FP6 projects
– Set up GRID Unit under Wolfgang Boch (late
2002)
• January 2003: large workshop (GRID Unit)
– (~ 240 participants)
– Keynotes:
• Thierry Priol (INRIA, FR)
• Domenico Laforenza (CNR, IT)
• Keith Jeffery (CCLRC, UK)
NGG Requirements
• Transparent and reliable
• Open to wide user and provider communities
• Pervasive and ubiquitous
• Secure and provide trust across multiple administrative domains
• Easy to use and to program
• Persistent
• Based on standards for software and protocols
• Person-centric
• Scalable
• Easy to configure and manage
Note: 2.5G or even 3G GRID basically meet none of these
NGG
• NGG1: 200301-200306
– Brought together visionary experts
– Defined properties required and research agenda to
achieve them
• NGG2: 200401-200407
– Updated NGG1 vision in the light of funded projects
and evolving requirements and technology
• NGG3: 200509
• http://www.cordis.lu/ist/grids/pub-report.htm
GRIDs Vision and Requirements (1)
• a user interacts with the GRIDs environment
intelligently
• such that the GRIDs environment proposes a
'deal' to the end-user to satisfy her request
• which the user can then decide to execute, involving multiple resources of computation,
information, detectors (for new data collection),
interactions with other users through various
communication devices etc.
GRIDs Vision and Requirements (2)
• interoperation as a seemingly homogeneous
'surface' over a range of devices from smart dust
through detectors to embedded systems
(including controllers), handhelds, laptops,
desktops, departmental servers, corporate
servers and supercomputers.
• the 'surface' depends on self-* (self-managing,
self-repairing, self-tuning...) capability across
arbitrary and dynamic collections of (large
numbers of) nodes to give scalability,
performance, reliability, access, security, privacy
and other features.
NGG1
• NGG1 Properties Required:
– Transparent and reliable
– Open to wide user and provider communities
– Pervasive and ubiquitous
– Secure and provide trust across multiple
administrative domains
– Easy to use and to program
– Persistent
– Based on standards for software and protocols
– Person-centric
– Scalable
– Easy to configure and manage
Call2 (NGG1) Projects Funded
• GRIDCOORD: Building the ERA in Grid research
• inteliGRID: Semantic Grid based virtual organisations
• SIMDAT: Grid-based generic enabling application technologies to facilitate solution of industrial problems
• K-WF Grid: Knowledge based workflow & collaboration
• UniGridS: Extended OGSA Implementation based on UNICORE
• HPC4U: Fault tolerance, dependability for Grid
• NEXTGRID: EU-driven Grid services architecture for business and industry
• OntoGrid: Knowledge Services for the semantic Grid
• AKOGRIMO: Mobile Grid architecture and services for dynamic virtual Organisations
• DataminingGrid: Datamining tools & services
• COREGRID: European-wide virtual laboratory for longer term Grid research, creating the foundation for the next generation Grids
• Provenance: Provenance for Grids
Figure 1: The Call 2 Projects as a ‘house’
NGG2 SWOT(1)
• Ontologies and semantic web technologies will be crucial to provide
scalable support for complex, heterogeneous Grids middleware and
applications.
• The strengths of the European telecommunications industry and the
diversity of its market for electronic control systems have given
Europe a leading position in the areas of mobile and embedded
technology. This is of particular relevance for the realization of the
vision of a Grid as a pervasive, user-centered utility.
• The weakness in hardware and primary software products (e.g.
commodity processors, server and desktop Operating systems,
Programming Languages, etc.) may hamper the development of a
European leadership in Grids Technologies.
NGG2 SWOT(2)
• The convergence between Grids and Web Services provides a
significant opportunity to move to a model of software development
and service provision where the market dominance of particular OS
vendors is no longer a major economic issue.
• The distinctive European vision of a Grids environment that operates
from the level of devices to supercomputers, to serve communities
ranging from individuals to whole industries, including data,
information and knowledge and emphasizing resilience and
scalability could have a significant economic and social impact far
beyond the scope of existing compute and data Grids. This should
be contrasted with the North American Grid vision of programmer-level metacomputing.
• It is vital that any European vision for the evolution of Grids is
accompanied by a clear representation of that vision to the key
standards bodies and technology providers worldwide.
NGG2 Recommendations
• (a) development of a design for a new operating system that provides a
fault-tolerant, scalable, self-healing, self-managing environment upon which
Grids service middleware may ‘sit’;
• (b) development of Grids foundations middleware suitable both for
enhancing existing operating systems and for inclusion within (a);
• (c) development of Grids service middleware in a modular fashion allowing
applications to utilise those services they require;
• (d) research and development in computer science and information
technology required to accomplish (c), (b) and (a), notably new models and
software for transactions and messaging; for scheduling, resource
management and optimisation; for trust, security and privacy; for data,
information and knowledge management; for software development and
deployment including mobile code; and for intelligent and appropriate user
interfaces and device interfaces;
• (e) development of novel applications that are wealth-creating or improve
the quality of life, particularly in the e-business domain, but also in e-health,
e-environment, e-culture, e-science, e-government;
NGG2
[Figure: layered stack, top to bottom]
Applications: Application A, Application B, Application C
Grids Middleware: Services Needed for A, Services Needed for B, Services Needed for C
Grids Foundations: for Operating System X, for Operating System Y
Operating systems: Operating System X, Operating System Y, Grids Operating System (including Foundations)
Label: Modular and dynamically loadable
Workflow on the GRIDs Surface
• Nirvana is
– GRIDs ‘surface’
• Providing computation, information
presentation and information
management
– Plus Self* resilience
– Plus capabilities to support workflow
Agenda
• Introduction
• The R&D Process: Recording
• The Key: Metadata and Data Exchange
Standards
• Workflow on the GRIDs surface
• Conclusion
Overall : The Way Forward
SCIENTIFIC DATASETS
PUBLICATIONS
Data
CRIS
Data
Information
Management of Research
Information
Knowledge
Knowledge
Overall : The Way Forward
Portal with knowledge-assisted user interface
SCIENTIFIC DATASETS
PUBLICATIONS
CRIS
Data
Information
Data
Management of Research
CDR
(CERIF)
Knowledge
Information
Knowledge
Digital Curation Facility
Overall : The Way Forward
Portal with knowledge-assisted user interface
SCIENTIFIC DATASETS
PUBLICATIONS
Data
Data
Information
Information
Knowledge
metadata
Knowledge
Digital Curation Facility
Overall : The Way Forward
Portal with knowledge-assisted user interface
SCIENTIFIC DATASETS
PUBLICATIONS
Data
Data
publish
Information
Knowledge
Information
metadata
validate
Knowledge
Digital Curation Facility
Overall : The Way Forward
Ambient, Pervasive Access
Portal with knowledge-assisted user interface
SCIENTIFIC DATASETS
PUBLICATIONS
Data
Data
publish
Information
Knowledge
Information
metadata
validate
Knowledge
Digital Curation Facility
GRIDs
Three Steps to Nirvana
The Perfect CRIS
Workflow on the GRIDs Surface
Metadata and Data Exchange Standards
Complete Process ICT Support
Prof. Keith G Jeffery
Director, Information Technology
Head, Business & Information Technology Department
CCLRC Rutherford Appleton Laboratory
k.g.jeffery@rl.ac.uk
http://www.bitd.clrc.ac.uk/