DRAFT
Defining the Role of the UK e-Science Architectural Task Force
Malcolm Atkinson
19th October 2001
1 Context
The Architectural Task Force (ATF) has been set up by the e-Science Core Programme
Directorate to provide medium-term guidance to the development of UK e-Science. This
general goal is refined below, based on a meeting of the ATF in Cambridge, 19th October
2001.
The ATF will take a fresh look at the issues and strategies for building, maintaining and
operating the large-scale, distributed systems that are needed to support e-Science and
e-Commerce. Our challenge is to identify the frameworks that allow the requirements and
structure of typical applications to be understood and then to identify existing components or
research and development required to fill these frameworks. We believe that at present much
work is still required before the systems that are envisaged can be built economically and
routinely, with sufficient durability and flexibility.
The ATF will not be able to provide immediate advice; that role will be filled by the Grid
Support Team (GST). In the longer term it will provide assessment of project plans against its
system frameworks and identify which components are readily available and which require
R&D. It will also suggest which areas of research are most necessary or have the greatest
potential rewards. The resulting mosaic, which denotes the status of components within
frameworks, will be used to explain the progress achieved by Grid projects and to identify the
issues that must be addressed before the vision of the infrastructure that supports e-Science
and similar commercial activities can be realised.
2 Generic Issues
The large-scale distributed systems that are built today are often developed using bespoke
technology or rely heavily on special properties of a particular application or host
organisation. As a result the intellectual and engineering investment in their construction and
operation does not transfer to other applications. Similarly, they are typically brittle, in the
sense that they cannot easily be changed to extend their function. Encouraging work that will
overcome these limitations is a priority for the ATF.
The ATF’s initial enumeration of the issues that must be considered includes:
1. The intrinsic unreliability of subsystems within a large distributed system. This must
be considered, e.g. through a recognition of trade-offs between the costs of redundancy
and the reduction of risk.
2. Durability is a requirement before systems can support routine use. Premature
dependence on an infrastructure will result in failures that lead to an excessive
backlash against the general and achievable goals.
3. Systems have to operate without components, designers, implementers or operators
having universal knowledge about the total system. Consequently they have to operate
via dynamic knowledge discovery mechanisms that enable local operation in terms of
what can be discovered about relevant properties of the encompassing system.
ATF_Role+MOv1-19Oct01MPA.doc
1 of 4
19/03/2002
4. Heterogeneity is intrinsic to large-scale and distributed systems. Thus they must
support information integration technology and a variety of translation subsystems,
“digital Babel fish”☺.
5. Security and trust have to be established throughout these systems.
6. Some components must support dynamic information discovery and some must use
such information to dynamically optimise operations.
7. Autonomy has to be supported, as local subsystems must change to meet new
management, operating or functional requirements. This generates a concomitant
requirement for evolution mechanisms and dynamic modification of subsystem
specifications.
8. It is often appropriate to aggregate subsystems and services to achieve economic and
accounting benefits. Accounting and economic models are a prerequisite for routine
and sustained use.
9. Systems under consideration interact with other systems that already exist or are
developed independently. Consequently, adequate mechanisms are needed to handle
interfaces at the boundaries of large systems and to handle legacy models, components
and systems.
10. The scale of systems envisaged will raise technological challenges.
11. These large-scale systems require new software development methods and tools.
12. Similarly, tools are required to support the operations and management of these
systems.
These considerations interact and appropriate compositions of choices will differ for different
applications. It is therefore necessary to identify and consider a number of representative
frameworks, which will characterise the major patterns required for the variety of applications
considered.
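As an illustration of the trade-off named in issue 1, the cost of redundancy can be weighed against the reduction of risk with a very simple model. The sketch below is not part of the ATF's work: it assumes that replica failures are independent (often untrue in practice), and the function names, parameters and cost figures are hypothetical.

```python
# Illustrative sketch only. Assumes independent failures of identical
# replicas; all names and parameters are hypothetical.

def failure_probability(p_single: float, replicas: int) -> float:
    """Probability that every replica fails at once, given independence."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return p_single ** replicas


def expected_cost(p_single: float, replicas: int,
                  cost_per_replica: float, cost_of_outage: float) -> float:
    """Cost of running the replicas plus the risk-weighted cost of an outage."""
    risk = failure_probability(p_single, replicas) * cost_of_outage
    return replicas * cost_per_replica + risk


def cheapest_replication(p_single: float, cost_per_replica: float,
                         cost_of_outage: float, max_replicas: int = 10) -> int:
    """Replica count (1..max_replicas) with the lowest expected cost."""
    return min(
        range(1, max_replicas + 1),
        key=lambda n: expected_cost(p_single, n, cost_per_replica, cost_of_outage),
    )
```

Under these assumptions, adding replicas pays off only while the drop in risk-weighted outage cost exceeds the cost of one more replica; beyond that point redundancy costs more than the risk it removes.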
3 ATF Operational Plan
The following activities will be undertaken by the ATF with the outputs indicated. Readers
should be aware that this is a complex field and our programme of work will need to adapt as
our understanding develops.
1. Identify a small set of frameworks that characterise relevant large-scale distributed
systems. Each framework will identify components and show how those components
relate to the issues given above. To validate and motivate these frameworks, they will
be used to describe existing and planned application systems. We do not anticipate
that there will be a useful one-size-fits-all framework.
2. The frameworks will be used to expose problems and issues. This will lead to an
identification of areas warranting a focus of research or development attention. The
frameworks should allow researchers and application developers to focus on system
properties of interest and to leave other issues to other workers or automated
mechanisms.
3. Existing and contemporaneous work, from both current industry practice and a broad
range of research programmes, will be reviewed to identify significant inputs to the
exposed issues and problems.
4. Active liaison will be maintained with GGF teams, relevant standards groups and
practitioners with the goal of influencing the design of future systems via emerging
standards and direct communication with development teams.
5. A summary of the progress and outstanding issues will be produced. This will be
composed from the four activities above and is motivated by two considerations: it is
important that potential users and funders understand what remains to be done, and it
is essential to ensure that adequate investment is directed into addressing infrastructure
research and development.
3.1 ATF Outputs and Services
The ATF will produce outputs available to the UK e-Science Core-Programme Directorate,
the UK e-Science community, computer-science researchers and wider audiences. The output
will vary from the following illustrative list as our understanding develops and as needs are
recognised.
1. Reports describing the distributed system frameworks, their motivation and use will be
produced.
2. Meetings will be held with relevant groups and practitioners. As an example, a
joint meeting with the database architecture task force and with Ian Foster has been
arranged for 12th to 14th December 2001, at the e-Science Institute, Edinburgh.
3. Issues and subsystems will have their status classified to indicate whether solutions are
available in production quality, or whether they require development or research.
4. Topics requiring development or research will be identified together with a review of
potentially significant inputs to their treatment.
5. White papers and other documents will be produced to influence standards. In some
cases, these will need a supporting prototype implementation to validate their design
and to offer a public reference implementation. The mechanisms for achieving such
substantial bodies of work are still a matter of discussion. The ATF could operate
entirely through encouraging others to develop implementations or could manage
some of these developments itself, if granted suitable resources.
6. Publicly understandable reviews of progress and the road ahead will be produced to
help manage expectations, to assist those contemplating using these systems and to
influence funding decisions.
7. From Q2 2002 we will offer a service to pilot e-Science projects and others planning to
use grid-like infrastructure where we review their project plans against our frameworks
and assessment of progress. We will seek a summary of their distributed system
implementation and comment on which components will be needed and their status,
paying particular attention to spotting potential areas of difficulty.
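The status classification in output 3 and the plan review in output 7 could be recorded in a structure along the following lines. This is a minimal sketch under stated assumptions, not the ATF's method: the three status values paraphrase output 3, while the framework name, component names and `review_plan` helper are hypothetical placeholders rather than real assessments.

```python
from enum import Enum


class Status(Enum):
    PRODUCTION = "available in production quality"
    NEEDS_DEVELOPMENT = "requires development"
    NEEDS_RESEARCH = "requires research"


# Hypothetical mosaic: framework -> component -> status.
# Entries are placeholders, not ATF assessments.
MOSAIC = {
    "example-framework": {
        "component-a": Status.PRODUCTION,
        "component-b": Status.NEEDS_DEVELOPMENT,
        "component-c": Status.NEEDS_RESEARCH,
    },
}


def review_plan(framework, needed_components, mosaic=MOSAIC):
    """Group the components a project plan needs by their recorded status.

    Components with no recorded status are listed under "unclassified".
    """
    known = mosaic.get(framework, {})
    report = {status: [] for status in Status}
    report["unclassified"] = []
    for name in needed_components:
        status = known.get(name)
        if status is None:
            report["unclassified"].append(name)
        else:
            report[status].append(name)
    return report
```

A review of a plan would then highlight the components listed under `Status.NEEDS_RESEARCH` or `"unclassified"` as the potential areas of difficulty.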
3.2 ATF Requirements
In order to operate, the ATF has a few requirements.
1. Funds to support meetings, attendance at meetings and meeting administration.
2. Staff time to prepare material, attend meetings and develop outputs.
3. Web site and mail system support. (This will be provided by NeSC.)
3.3 ATF Membership and Life-Cycle
The membership should be chosen to meet requirements for a set of skills and to provide
authoritative links with other bodies, e.g. TAG, the Grid Network Team (GNT), etc. It should
also be kept small to make it feasible to arrange meetings and achieve progress. It should be
kept fresh through a process of renewal and review.
3.3.1 Initial Membership
Name              Initials  e-mail
Malcolm Atkinson  MPA       mpa@nesc.ac.uk
Jon Crowcroft     JC        j.crowcroft@cs.ucl.ac.uk
David De Roure    DDR       dder@ecs.soton.ac.uk
Vijay Dialani     VKD       vkd00r@ecs.soton.ac.uk
Andy Herbert      AH        aherbert@microsoft.com
Ian Leslie        IL        Ian.Leslie@cl.cam.ac.uk
Tony Storey       TS        tony_storey@uk.ibm.com
The above-listed group met on Friday 19th October 2001, without Andy Herbert, who was
unavailable, and this document is a result of that first meeting.
3.3.2 Skills and Contacts Required
The membership of the ATF needs to cover the following list of skills. The table shows the
coverage from the existing group. We have identified people to be approached in order to
better meet our skill requirement.
Skill area
Dependable Computing
Programming Models
Security
W3C
Information Models
Software Engineering
Networks
Operating Systems & Distributed Systems
Databases
Applications (preferably biomedical)
Information Engineering
Existing Coverage
TS & DDER
TS & ?
JC & IL
AH & IL
TS & MPA
The ATF also needs effective communication with important groups in its context. An initial
identification of this requirement is given in the following table.
Group               Existing Links
TAG                 JC & IL
NeSC                MPA
GGF                 MPA
W3C                 DDR
GNT                 JC
GST                 MPA
DB Architecture TF  MPA & ST
3.3.3 Review Cycle
Members of the ATF require fixed-term commitments to provide a framework for scheduling
deliverable outputs, to limit commitments and to ensure that new views and new energy are
brought into the discussion on a regular basis. We therefore propose to operate for 12 months
under an initial regime and then invite a review. We expect after that review to revise our
goals, method of working and membership.