Discovery Net
The Discovery Net project (www.discovery-on-the.net) is a UK e-Science Pilot Project funded by
the EPSRC under the UK e-Science programme.
Discovery Net is building the software
infrastructure and tools for providing Knowledge
Discovery Services that allow scientists to conduct
and manage complex data analysis and knowledge
discovery activities on the new generation Internet.
Discovery Net addresses the complexities faced by
scientists from all information-intensive fields
where:
• Modern high throughput devices are routinely
generating and capturing large amounts of data.
• Data sets are not analysed in isolation, but
dynamically integrated during the analysis.
• New data analysis methods and software
components are continually being developed.
• Knowledge discovery procedures are complex multi-step procedures conducted by interdisciplinary teams
of scientists.
Project Goals
Discovery Net defines the standards, architecture and
tools that:
• Allow scientists to plan, manage, share and
execute complex knowledge discovery and data
analysis procedures available as remote services.
• Allow service providers to publish and make
available data mining and data analysis software
components as services to be used in knowledge
discovery procedures.
• Allow data owners to provide interfaces and
access to scientific databases, data stores, sensors
and experimental results as services so that they can
be integrated in knowledge discovery processes.
Discovery Net Testbeds
Discovery Net is a multi-disciplinary project. In addition to developing the software infrastructure for Knowledge
Discovery Services, the project is also developing a series of testbeds and demonstrators for using the
technology in the areas of life sciences, environmental modelling and geo-hazard prediction.
Discovery Net Architecture and Knowledge
Discovery Services
The Discovery Net architecture provides open standards
for specifying:
Knowledge Discovery Adapters: used to declare the
properties of analytical software components and
scientific data stores including their input/output types,
performance and accuracy characteristics.
Knowledge Discovery Services look-up and
registration: allowing scientists to retrieve and
compose Knowledge Discovery Services in their
discovery procedures.
Integrated Scientific Database Access: allowing the
integration of structured and semi-structured data from
different data sources within a discovery procedure
using XML schemas.
Knowledge Discovery Process Management:
including DPML (Discovery Process Markup
Language) as a standard specification language for
constructing and managing knowledge discovery
procedures, as well as recording their history.
[Figure: Application of Discovery Net in Life Sciences]
Knowledge and Discovery Process Storage: allowing discovery procedures to be stored, shared and re-executed.
Knowledge Discovery Process Deployment: allowing users to deploy and publish their existing knowledge
discovery procedures as new services.
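The adapter and look-up ideas above can be illustrated with a minimal sketch. It is not the Discovery Net API: the `Adapter`, `Registry` and `compose` names, and the example service and type names, are hypothetical and chosen only to show how declared input/output types could let a registry check that services fit together in a discovery procedure.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Adapter:
    """Hypothetical adapter record: declares what a component consumes and produces."""
    name: str
    input_type: str
    output_type: str


@dataclass
class Registry:
    """Hypothetical look-up service holding registered adapters by name."""
    adapters: dict = field(default_factory=dict)

    def register(self, adapter):
        self.adapters[adapter.name] = adapter

    def find_by_input(self, input_type):
        # Return the names of all services that accept the given input type.
        return sorted(a.name for a in self.adapters.values()
                      if a.input_type == input_type)


def compose(registry, names):
    """Chain the named services, checking each output type feeds the next input."""
    chain = [registry.adapters[n] for n in names]
    for upstream, downstream in zip(chain, chain[1:]):
        if upstream.output_type != downstream.input_type:
            raise TypeError(
                f"{upstream.name} produces {upstream.output_type}, "
                f"but {downstream.name} expects {downstream.input_type}"
            )
    return chain


# Illustrative services only; these names do not come from the project.
registry = Registry()
registry.register(Adapter("sequence_store", "query", "sequence_table"))
registry.register(Adapter("clusterer", "sequence_table", "cluster_model"))
procedure = compose(registry, ["sequence_store", "clusterer"])
```

In this sketch a mismatched chain, such as placing the clusterer before the data store, is rejected when the procedure is composed rather than when it is run.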
D-NET Grid-based Architecture:
Discovery Net is based on an open architecture re-using common protocols and common infrastructures such
as the Globus Toolkit and the OGSI. It also defines its own protocol for workflows, Discovery Process Markup
Language (DPML), which allows the definition of data analysis tasks to be executed on distributed resources.
Using the Discovery Net architecture, it is easy to build and deploy a variety of fully distributed data intensive
applications.
The D-NET client can connect to multiple servers in
order to build a real Grid application where data,
functionalities and resources can be located and
used on any D-NET server:
• Data: Large tables can be transferred across servers
• Tasks: Tasks implemented in Java can migrate to other
servers
• Resources: A trusted client can use any D-NET server
to execute parts or all of the workflow
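As a rough illustration of a workflow whose parts run on different servers, the sketch below models each step with the server it is assigned to. It is an in-process simulation under stated assumptions: the `Step` and `execute` names and the server labels are hypothetical, no networking is performed, and it does not reflect DPML syntax or the actual D-NET client.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    server: str                   # label of the (hypothetical) server assigned to this step
    run: Callable[[list], list]   # the task itself, taking and returning a list of rows


def execute(steps, data):
    """Run steps in order; return the final data and a log of (step, server) placements."""
    placements = []
    for step in steps:
        data = step.run(data)     # stand-in for sending the data to step.server
        placements.append((step.name, step.server))
    return data, placements


workflow = [
    Step("load", "server-a", lambda rows: rows + [4, 5]),
    Step("filter", "server-b", lambda rows: [r for r in rows if r % 2 == 1]),
    Step("summarise", "server-a", lambda rows: [sum(rows)]),
]
result, placements = execute(workflow, [1, 2, 3])
```

Here the intermediate table produced on one server is handed to the next step on another, which is the pattern the three bullets describe for data, tasks and resources.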
D-NET takes advantage of the OGSA architecture to
present its functionalities to other grid users and
projects.
• Grid Service: The functionalities of D-NET can be
accessed through a simple interface exposed as an
OGSA-compliant Grid service, making all the workflows
usable by other grid application builders.
• OGSI: D-NET uses the OGSI technology preview to deploy its grid service and to have a simple front-end
to use it.
[Figure: Discovery Net Grid-based Architecture]
[Figure: Discovery Net Geo-hazard Prediction Application]
[Figure: Discovery Net Air Pollution Modelling Application]
The Discovery Net project is conducted at Imperial College London.
Principal Investigator: Dr. Yike Guo (Dept of Computing).
Discovery Net Team: Prof. Tony Cass (Dept. of Biological Sciences), Prof. John Darlington (Dept. of Computing), Dr. John Hassard
(Dept. of Physics), Dr. Jian Guo Liu (Dept. of Earth Sciences), Dr. Daniel Ruckert (Dept. of Computing), Prof. Robert Spence (Dept. of
Electrical Engineering).
Discovery Net Collaborations: Discovery Net is currently collaborating with the National Centre for Data Mining (NCDM) at the University
of Illinois at Chicago for the creation of the Global Discovery Net project.
Contact Information:
Dr. Moustafa M. Ghanem, Discovery Net Project Manager, Department of Computing, Imperial College, London SW7 2BZ, United
Kingdom, Email: discoverynet@doc.ic.ac.uk Phone: +44 (0) 20 7594 8357 Fax: +44 (0) 20 7594 8246, www.discovery-on-the.net
London e-Science Centre
Department of Computing
South Kensington Campus
Imperial College London
SW7 2AZ
United Kingdom
tel: +44 (0)20 7594 8360
fax: +44 (0)20 7581 8024
web: www.lesc.imperial.ac.uk
email: lesc-admin@imperial.ac.uk