UK-Japan “N+N” Workshop on Grid Computing London, October 3-4, 2003 Summary Report

Executive Summary
The first “UK–Japan N+N Workshop on Grid Computing” was held on October 3-4,
2003 in London, with participation from over 20 grid researchers. The workshop
was jointly organized by the National e-Science Center (NeSC) in Edinburgh and the
Grid Technology Research Center (GTRC) of the Institute for Advanced Industrial
Science and Technology (AIST) in Tsukuba and Tokyo, and was chaired by Malcolm
Atkinson, Ron Perrott, and Satoshi Sekiguchi.
An explicit goal of the workshop was to explore opportunities for collaboration between
existing grid middleware and applications R&D efforts in the UK and Japan. On the
first day of the workshop, 20 presentations organized in five sessions provided an
overview of grid projects in the UK and Japan, as well as of grids and
high-performance computing, developments in standard grid middleware, and grids and
data.
On the second day of the workshop participants were divided into three topical
groups—infrastructure, data, and applications—and asked to identify possibilities for
exchange and cooperation between the UK and Japan in grid research. During a final
panel discussion, potential areas for collaboration were then elaborated. Also, for each
area the most suitable candidates to oversee eventual collaboration activities were
identified. In some cases immediate action items were identified.
A second UK-Japan workshop is planned to take place in Japan in about 18 months. It is
expected that, by then, at least some of the project ideas listed in the following will have
turned into real collaborative ventures. In the meantime, the next steps are:
• Follow up on project proposals at SC2003/GGF9
• Determine date and format for Spring 2005 Workshop in Japan
• Investigate possibilities for funding long-term collaborations
It is now up to participants at the meeting to link up, exchange information, exchange
code, discuss joint projects, and to refine proposals. However, while a number of joint
activities can be undertaken relatively easily, and with existing resources, external
funding will be needed for collaborations on more long-term, fundamental research.
The UK and Japan are perhaps the two countries that spend most on grid research
besides the US, and researchers in the UK and Japan are heavily involved in GGF
activities and international standardization in the grid area. In the case of grids,
fundamental research is often highly relevant to standardization. Supporting long-term
R&D collaborations between the UK and Japan therefore also means strengthening the position
of both countries in an emerging technology that has the potential to affect the entire IT
industry.
1. Introduction
On October 3-4, 2003, over 20 grid researchers from the UK and Japan gathered in
London for the “UK-Japan N+N Workshop on Grid Computing”, a dense two-day
meeting with presentations and discussions covering all aspects of grid computing and its
application in research and industry. The workshop was organized by the National
e-Science Center (NeSC) in Edinburgh and the Grid Technology Research Center
(GTRC) of the Institute for Advanced Industrial Science and Technology (AIST) in
Tsukuba and Tokyo, and was chaired by Malcolm Atkinson, Ron Perrott, and Satoshi
Sekiguchi.¹
After an initial day with 20 presentations on grid projects in the UK and Japan, grids
and high-performance computing, standard grid middleware, and grids and data, on
the morning of the second day the participants split up into three topical discussion groups
(infrastructure, data, and applications) to debate possibilities for exchange and
cooperation between the two countries. The following provides an outline of the
presentations and discussions at the workshop.
1.1 Background of the Meeting
International collaboration in the area of grids does not need justification. Grids, like the
Internet, are inherently global. Within the UK e-Science program, it is recognized
that—in addition to participation in international conferences or the GGF—direct
interaction with “sister programs” in other countries is necessary and important. In
Japan, the AIST Grid Technology Research Center (GTRC) has served as a “gateway”
between the international grid community and Japan. Also, there have been numerous
efforts, such as the APGrid or PRAGMA, to link various grid activities in the
Asia-Pacific Region.
The idea of a joint workshop on grids between the UK and Japan was first discussed in
the context of the IST 2000 meeting in Nice, France, which was attended by John Taylor
and Satoshi Sekiguchi. Several months later, in spring 2001, John Taylor visited Japan
and ideas for cooperation were further investigated. Eventually, a proposal for a
“UK/Japan Grid Laboratory”, drafted by Tony Hey and Satoshi Sekiguchi, was included
in the “4th UK-Japan Agreement on Scientific and Technological Cooperation”, which
was signed by the two countries at a meeting in London in February 2002. That
proposal is still pending.
¹ On the Japanese side, this workshop was organized by the Grid Technology Research Center. However,
it should be noted that, during 2003, Japan has launched a major national Grid research & development
effort, termed the National Research Grid Initiative (NAREGI). It is expected that, in the future, NAREGI
efforts will play an important part in UK-Japan cooperation efforts.
In the meantime, the UK government had launched its e-Science program, a large-scale
grid effort oriented towards applications and valuable scientific output rather than
middleware development or large-scale hardware procurement.² In Japan, the Institute
of Advanced Industrial Science and Technology (AIST) formed the first dedicated grid
research organization, the Grid Technology Research Center (GTRC).
At the same time, the Ministry of Education, Culture, Sports, Science and Technology
(MEXT) funded several medium-scale grid infrastructure and R&D projects, such as
the Information Technology Based Laboratory (ITBL) project, the Osaka Biogrid, and the
Titech Grid at the Tokyo Institute of Technology. In 2003, the Japanese government
launched two large national-level grid initiatives to support middleware development
for scientific and business applications, namely the NAREGI (National Research Grid
Initiative) project and the Business Grid Project.³
1.2 The Agenda: UK-Japan Cooperation
In the past 2-3 years, both the UK and Japan have launched large-scale grid initiatives.
Also, in both countries, the number of individual grid projects as well as the number of
university departments or research centers involved in grid activities have increased
considerably. Commercial interest in grids and grid applications has surged, and grids
have been advertised as the Internet of the future. Leaving aside the US, the UK and
Japan are perhaps the two countries that spend most on grid research. Researchers in the
UK and Japan have been heavily engaged in GGF activities and, given the importance
of international standardization, there is a natural interest in long-term cooperation on
fundamental research that can drive standardization.
With large efforts in place on each side, 2003 thus seemed the right year to further
investigate possibilities for collaboration between the UK and Japan. Plans for a joint
workshop between the UK e-Science program and Japanese grid activities were first
discussed between Malcolm Atkinson and Satoshi Sekiguchi at GGF7 in March 2003 in
Tokyo, and plans for the event were finalized by Ron Perrott, Malcolm Atkinson, and
Satoshi Sekiguchi. The practical aspects of the workshop were arranged by Gill Maddy
(NeSC) and Yuko Oshima (GTRC), who also provided on-site support for the Japanese
participants.
² The UK e-Science programme corresponds to an investment of approximately £250 million over 5
years, of which 75% is invested in problem-driven R&D and 25% is invested in coordination and a core
programme, including middleware R&D and regional/national centres. Investment in HPC is channelled
separately, and includes £55 million on HPCx over 6 years.
³ The National Research Grid Initiative is funded at around ¥ 9 billion over a 5 year period, with an
additional ¥ 4-5 billion in hardware procurements. Public funding within the Business Grid Project is
comparable to NAREGI funding. However, in the case of the Business Grid Project, companies need to
match funds invested by the government.
1.3 Goals of the Meeting
An explicit goal of the workshop was to explore opportunities for collaboration between
existing grid middleware and applications efforts in the UK and Japan, including the
following:
• exchange of information
• exchange of experiences regarding policies for production grids
• mutual sharing of resources
• joint test beds and demonstrations
• joint testing of middleware protocols
• joint specifications to guarantee interoperability of middleware components
• scientific cooperation projects that use the grid as a tool
• commercial grid projects
• cooperation on user identification, certification, and security
• grid economics
Throughout this workshop, the goal was to identify both common interests and
complementary capabilities—leading eventually to proposals for cooperation between
groups in the two countries. To this end, a better understanding and awareness of grid
related activities in each other’s country was deemed necessary. Presenters at the
workshop were asked to emphasize actual experiences, and problems encountered with
software development efforts, implementations, production grids, or test beds—rather
than to focus on “grand schemes” or technical detail (all presentations from the
workshop are available).
Also, throughout the workshop, the emphasis was on discussion. Following a day of
presentations, on the second day of the meeting participants were split into three groups
(infrastructure, middleware, and applications) and asked to come up with possible
topics for cooperation. The following document summarizes the discussions at the
meeting, including a list of possible topics for cooperation efforts between groups in the
UK and Japan.
2. Grid Infrastructure
An important goal of the workshop was to foster awareness of grid activities in both
countries. While a complete survey of grid activities in the UK and Japan would have
been far beyond the capacity of the meeting, several presentations highlighted
important grid efforts in both countries.
2.1 Funding for Grid Research
Judging from the amount of funding made available for grid research, there can be no
doubt that the grid is high on the agenda of policy makers in the UK and Japan. Grid
related funding in the UK has been mainly provided through the e-Science program as
well as through a number of European Union funded activities. The e-Science program
has provided funding by a variety of means. Most importantly, the program has funded a
number of regional e-Science centers across the country as well as the National
e-Science Center (NeSC) in Edinburgh.
Funding for grid activities in Japan is somewhat more fragmented and, over the past
few years, a variety of initiatives have been launched by various ministries and
agencies, including the former Science and Technology Agency (now part of the
Education Ministry), the Ministry of Economy, Trade and Industry (METI), and
the Ministry of Education, Culture, Sports, Science and Technology (MEXT).
The Grid Technology Research Center (GTRC) within the National Institute for
Advanced Industrial Science and Technology (AIST), which developed out of a
research unit on network computing and clusters at the former Electrotechnical
Laboratory, has been the first large-scale grid research center in Japan, and perhaps also
internationally, with a dedicated staff of some 50 full-time employees and offices in
Tsukuba and Tokyo. GTRC also functions as an informal link between the various Grid
activities in Japan and provides support for the Japanese Grid Consortium. GTRC is by
far the largest dedicated research organization in the grid area in Japan, supporting
research on all aspects of grid technology, from infrastructure to applications.
A number of grid efforts in academia have started over the past few years and
organizations such as the Japan Atomic Energy Research Institute (JAERI) have
launched grid efforts. But it was only this year that funding for grid research on a
broader scale became available, with the launch of two national grid projects targeting
science and business applications:
• The National Research Grid Initiative (NAREGI), organized and funded by
MEXT and hosted at the National Institute of Informatics (NII), aims to
develop standard middleware for scientific applications and to demonstrate
the feasibility of a grid-based national computing infrastructure.
• The Business Grid project, supported by METI, aims at the development of
basic underlying grid middleware for business applications and future
e-commerce environments.
Both initiatives are well funded and are at least partially intended to support the
Japanese computer industry in developing grid capabilities. Many academic institutions,
notably Osaka University, Tokyo Institute of Technology, and Kyushu University,
participate in the NAREGI efforts.
2.2 Grid Projects in the UK and Japan
A number of presentations at the meeting covered Grid activities in Japan and the UK.
Peter Coveney presented an overview of activities under the Reality Grid, a large grid
based computational science project funded through the e-Science program. Reality
Grid is mainly focused on computation, including very large-scale computations using
HPCx or systems of a similar size. Within the Reality Grid project, various middleware
tools that give HPC users better control over their jobs were developed, notably
techniques for computational steering.
Satoshi Matsuoka (TITech) provided a broad outline of the NAtional REsearch Grid
Initiative (NAREGI). Sponsored by the Japanese education ministry (MEXT), the focus
of the project is on grid middleware R&D activities and Nanoscience grid applications.
The current focus at NAREGI is mainly on compute grids rather than data grids. Also,
NAREGI was conceived as a software R&D project and not a deployment project. Still,
a 15 Teraflops test bed is presently being built.
Although NAREGI is still in its early stages, collaborations with middleware projects
such as Unicore, Globus, and Condor as well as contributions to various OGSA
standardization efforts are planned. In addition to the National Institute of Informatics
(NII), the Institute for Molecular Science (IMS) and the three large computer vendors,
various academic and research institutions, including GTRC and Tokyo Institute of
Technology, are participating in the project.
Several other presentations provided glimpses of important grid activities in both
countries and internationally. Richard Kenway introduced the International Lattice
Data Grid, a project with participation from both Japan and the UK. Peter Clarke
provided an overview of the networking resources within the UK e-Science program.
Yoshio Tanaka presented the APGrid, a consortium of organizations in the Asia-Pacific
region that aims at developing a regional test bed. Judged by the number of
organizations that have joined, or that provide resources, APGrid has been very
successful at managing a grid consortium without centralized funding and in a region
that is both politically and economically extremely diverse, which is a formidable
challenge. But, as Tanaka pointed out, managing diverse and heterogeneous networks
of organizations is an essential part of grid technology.
2.3 Towards Production Grids
Experiences with large-scale grid test beds, but especially with production grids—such
as the UK Level 2 Grid—seem to converge on a set of issues that are both “technical”
and “social” in nature. Dealing with a heterogeneous infrastructure—composed of
resources from various organizations—remains challenging and demanding. Experience
with the APGrid Test bed, a project without a single funding source, further illustrated
this point.
But, judging from the presentations at the workshop, there are also important
differences between the thematic emphasis of grid efforts in the UK and in Japan.
Activities in the UK seem to be more focused on grids, while in Japan there appears to
be more emphasis on large clusters and computation. Also, the e-Science program is strongly
focused on users and applications—it is not a grid program per se, but a broader
initiative that aims at building an IT infrastructure for scientific inquiry.
While the choice within the UK e-Science program was not to develop any kind of new
middleware, but rather try to put existing packages and toolkits to work, in Japan grid
efforts are characterized by a strong emphasis on new middleware research and
development. To some extent, this might simply be a reflection of the fact that Japan
has a strong domestic IT industry. Conversely, while there exist significant
middleware development efforts in the UK, such as OGSA-DAI, it appears that, at
least at this stage, grid activities in the UK are somewhat more oriented towards working directly
with users.
3. Grid Middleware
Grids are essentially middleware—and middleware is still where the problem is with
grids. Most of the presentations at the meeting were essentially dealing with middleware
issues.
One sentiment, expressed by various participants throughout the meeting, concerns
approaches towards middleware development. In his presentation, and in various
comments, Peter Coveney pointed out that, from the perspective of grid users, it is
more important “to get things to work”, even if that means using unconventional
approaches or improvisation. In other words, searching for pragmatic solutions that
work, even if they may not be optimal in “architectural terms”, is preferable to working
out ideal solutions on paper only. The counterargument was also raised: such
pragmatic solutions demonstrate the required functionality, but longer-term investment is
needed to provide more economic and generic middleware that delivers that
functionality.
3.1 Middleware Development in the UK and Japan
Interestingly, while the UK e-Science program has chosen not to support the development
of new middleware packages, but rather to use existing middleware such as Globus,
Unicore, or Condor, there is more emphasis on middleware development within
Japanese grid R&D programs, perhaps reflecting the presence of several large domestic
computer vendors in Japan.
Grid efforts in both Japan and the UK have chosen not to develop the “base” middleware layer
by themselves, but rather to rely on available de facto middleware standards. For
example, the Japanese NAREGI project will base its middleware development efforts
on Globus, Unicore, and Condor and add new functionalities or layers, but without
duplicating existing efforts. Such an approach opens up many opportunities for
cooperation.
David Snelling reminded participants that, although OGSI has now been published and
is very likely to be adopted as a GGF recommendation, a lot of work still remains
to be done and the overall OGSA scheme is far from complete, with OGSI
representing only the lowest branch of a complex services hierarchy. Also, there remain
many open issues regarding interoperability with respect to OGSI, including how it should be defined.
3.2 Web Portals
Efforts to develop a software toolkit for the automated generation of application portal
sites were discussed by Satoshi Itoh. In a related presentation, Hiroshi Takemiya
discussed issues with “gridifying” legacy applications using standard grid middleware,
such as Globus or Ninf-G. Steven Newhouse discussed ICENI, an integrated grid
middleware presently being developed by the London e-Science Center and built using Java
and Jini. The ICENI architecture includes scheduling, execution, visualization, and
steering services accessible through a highly developed graphical user interface.
3.3 GridMPI
Yutaka Ishikawa, who is responsible for the development of the SCore cluster operating
system, presented GridMPI, a new latency-aware MPI implementation. Fast
communication protocols are key to building efficient cluster operating systems.
Better performance in the communication layer is crucial for running parallel
applications on computational grids.
Several MPI implementations were tested in an emulated WAN environment in which two
clusters are connected by a PC router that artificially adds communication latency.
Using the NAS Parallel Benchmarks and MPI implementations such as
MPICH-P4, MPICH-G2, and MPICH-SCore, it was demonstrated that the distance limit
for running non-embarrassingly parallel applications on the Grid is around 2000 km (i.e.
metropolitan networks). Based on these data, the decision was made to build a new
GridMPI implementation from scratch. Its features include a new latency-aware
implementation and a new TCP/IP protocol implementation.
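To give a feel for why distance matters for tightly coupled parallel jobs, the following minimal sketch estimates the lower bound on round-trip latency imposed by signal propagation alone. It is not taken from the GridMPI work: the function name, the assumed fibre velocity factor of about two thirds of the speed of light, and the neglect of router and protocol delays are all illustrative assumptions.

```python
# Assumption: signals travel through optical fibre at roughly two thirds of
# the speed of light in vacuum; switching and protocol delays are ignored.
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
FIBRE_VELOCITY_FACTOR = 2.0 / 3.0  # assumed textbook value, not a measurement


def round_trip_latency_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over a fibre path of the given length."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_PER_S * FIBRE_VELOCITY_FACTOR)
    return 2.0 * one_way_s * 1000.0


if __name__ == "__main__":
    # Print the propagation-only bound for a few illustrative distances.
    for km in (100, 500, 2000, 10000):
        bound = round_trip_latency_ms(km)
        print(f"{km:>6} km -> at least {bound:.1f} ms round trip")
```

Under these assumptions a 2000 km path costs roughly 20 ms per round trip before any software overhead is counted, which is several orders of magnitude above typical cluster interconnect latencies.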
3.4 Resource Brokering
Accounting and resource brokering are central components of the Grid and, together
with user identification and certification, a major problem for large, public grid projects.
In his presentation on Grid activities related to HPCx, David Henty pointed out that
accounting remains a major issue for all large computer centers and is very difficult to
do on large parallel systems. Typically, centers write their own accounting software, yet
many problems remain, which suggests that accounting will be much more difficult in
heterogeneous environments such as the Grid.
In complex Grid environments, new concepts for resource brokering are therefore
needed—and some ideas were presented in John Brook’s presentation that described
efforts within the GRIP project to develop “federated” resource brokering schemes.
Work on resource brokering is also an important component of middleware
development efforts in Japan and there seem to exist good opportunities for cooperation
between the UK and Japanese teams.
3.5 Data
Issues surrounding data and grids were engagingly summarized by Malcolm Atkinson in
his talk on “data everywhere”. Structured digital data are now becoming ubiquitous and
there is an increasing need to link distributed datasets with computing resources. Isao
Kojima presented promising efforts at AIST towards a grid-based integration of
research databases using an XQuery-based metadata integration system and a grid
proxy/mediator-based database integration approach for hidden web databases based on
the OGSA-DAI framework. Paul Watson, in his presentation, demonstrated a
service-oriented approach for the access and integration of structured data on the Grid based on
the OGSA-DQP framework.
In a beautiful example of a grid application for managing large data sets, Ron Perrott
introduced the GridCast project, a joint effort with the BBC to use grids for the internal
distribution, handling, and storage of digital media files. There are many possible
extensions of this effort, such as the use of grids and IP networks to deliver content to
educational organizations.
There were also discussions regarding Gfarm and the database query efforts by Watson.
The goal of the Gfarm project, which is undertaken jointly by AIST/GTRC, KEK, and
the Tokyo Institute of Technology, is to develop a new file system for wide-scale and
bandwidth-intensive data grid processing. It was suggested that
OGSA-DQP and Gfarm would nicely complement each other for high-bandwidth
processing of distributed data.
Grid Applications
As grid applications mature, the focus has increasingly shifted from computation
towards data—and the integration of large, distributed data sets with high-end
computing. This trend was well in evidence in the presentations related to applications at
the workshop. Structured digital data, as Malcolm Atkinson has pointed out, are now
“everywhere” and there is an urgent need to combine large-scale data management and
high-end computing using distributed resources.
3.6 Life Science and Medical Applications
The integration of many different sources of data remains a major challenge in the
bioinformatics field, according to Steve Oliver, who presented his work on an
object-oriented database system for sequence and functional genomic data,
called the Genome Information Management System (GIMS), as well as a prototype data
repository for proteome data. The grid is well suited to address many of the problems
that bioinformaticians are facing today. The number of new data sources is increasing
continuously and many organizations now provide access to experimental data on their
websites, which further complicates issues of data integration and access. myGrid was
introduced as a first step towards applying the GIMS principles of data integration in a
Grid-based system that would avoid internalising huge amounts of data from around the
world into a single data warehouse.
But there are also important limitations and hurdles to the application of grids,
especially in biomedical research. For example, Steve Oliver pointed out that there
remain difficult issues concerning the validation or correction of experimental data. If
data are distributed to many different locations, how can this be achieved? Also, in areas
where research competition is high, control over data, even after their
release, may be important.
Especially in medical research, data security remains a major hurdle for the application
of grids, a fact that became especially clear in Derek Hill's presentation on medical
imaging as a grid application. An increasingly sophisticated infrastructure for
processing and storing medical images is needed, as data sets and algorithms get more
complex. Also, imaging is increasingly used for guiding and planning therapy, as
opposed to simple diagnosis. Most importantly, medical images are increasingly stored
in digital archives.
While medical imaging would certainly benefit from access to on-demand compute
resources and techniques for dealing with distributed data, there remain fundamental
challenges to the use of grids in medical imaging, notably issues of data security and
privacy. Similar issues are relevant in medical research in general. As Derek Hill
pointed out, it is still common to transfer data from clinical trials physically on a CD or
hard disk, rather than to send them over the Internet.
3.7 Nanoscience and Molecular Simulations
Several presentations at the workshop discussed approaches to extend materials
simulations, such as molecular dynamics, to cluster and grid environments, targeting
applications in basic and applied research, including nanoscale simulations of materials
and biological systems or support systems for drug design and development.
Approaches to molecular simulations on highly parallel systems in cluster and grid
environments were discussed by Masaaki Kawata. The combination of fast and reliable
algorithms for molecular dynamics simulations on clustered systems with innovative
grid-based simulation methodologies that utilize widely distributed computational
resources should yield robust systems for large-scale molecular simulations in a grid
environment. In particular, Kawata discussed a grid-optimized replica exchange method
(termed RAX-MS) for solving optimization problems.
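The basic replica exchange idea can be illustrated with a short sketch. The code below shows the standard Metropolis swap criterion used in generic replica exchange (parallel tempering). It is not the RAX-MS method itself, whose details were not given at the workshop, and the function names are hypothetical.

```python
import math
import random


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis acceptance probability for exchanging two replicas.

    beta_i, beta_j are inverse temperatures; energy_i, energy_j are the
    current energies of the replicas held at those temperatures.
    """
    exponent = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)


def attempt_swap(beta_i, beta_j, energy_i, energy_j, rng=random):
    """Return True if the proposed exchange is accepted."""
    return rng.random() < swap_probability(beta_i, beta_j, energy_i, energy_j)
```

Only a pair of energies and temperatures has to be communicated per swap attempt, which is one reason replica exchange is often considered a good fit for loosely coupled, widely distributed resources.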
Mitsuhisa Sato provided an overview of efforts at integrating various software tools,
such as tools for conformational search and molecular simulations, into a grid-based
system to support drug discovery. In order to implement the system, OmniRPC, a middleware tool
for remote procedure calls (RPC) on the grid, was developed.
The need to integrate databases with various types of simulations makes grids especially
suited for applications such as drug discovery.
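The master-worker pattern behind an RPC-driven parameter search can be sketched as follows. This is only an illustration under stated assumptions: a local thread pool stands in for remote hosts, the energy function is a toy placeholder, and none of the names correspond to the actual OmniRPC interface.

```python
from concurrent.futures import ThreadPoolExecutor


def conformer_energy(torsion_deg: float) -> float:
    """Hypothetical stand-in for an expensive remote simulation call."""
    # Toy energy function with a minimum at 60 degrees; not a real force field.
    return (torsion_deg - 60.0) ** 2


def parallel_search(candidates, max_workers=4):
    """Farm out one call per candidate and return the lowest-energy one."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so energies line up with candidates.
        energies = list(pool.map(conformer_energy, candidates))
    best_index = min(range(len(candidates)), key=energies.__getitem__)
    return candidates[best_index], energies[best_index]
```

In a grid setting each call would be dispatched to a remote resource instead of a local thread, while the master keeps only the candidate list and the returned results.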
Nanoscale simulations are also an important part of the applications development work
within the Reality Grid discussed by Peter Coveney.
3.8 Commercial Applications
Commercial applications of grid technology were not a major focus at this meeting. Still,
commercial applications were discussed in several presentations. Satoshi Itoh
introduced the Grid PSE Builder project at the Grid Technology Research Center
(GTRC), a software kit that enables users to easily build web application services. So
far, portal sites for standard technical computing applications, such as Gaussian or
Phoenics, have been developed using the Grid PSE Builder.
In a beautiful example of a grid application, Ron Perrott presented an on-going
collaboration project with the BBC that aims to use grid technology to manage
video data. The distribution of video data at the BBC is presently managed in a highly
centralized form, with video files distributed from the main office in London to regional
TV stations using dedicated lines.
If successful, grid technology could contribute to a significant change in the way
the organization processes, distributes, and stores video data. Presently, video data are
stored mainly as physical objects and distributed using special-purpose equipment and
via dedicated networks. Using grid technology, it is now possible to store video as data
files on commercial server systems and distribute them over IP-based networks using grid
middleware.
4. Building UK-Japan Collaborations
Following a day of presentations covering many aspects of grid-related research,
middleware development, and grid applications, the goal for the second day of the
workshop was to identify areas of common interests (or complementarities) in grid
efforts in the UK and Japan and, eventually, to define a set of possible topics for future
collaboration between the two countries.
Both the UK and Japan have sizeable R&D efforts on grids and cooperation on
fundamental research is highly desirable. Further, both countries are also involved in
standardization activities and long-term R&D efforts constitute a crucial backdrop to
formulating standards.
4.1 Approaches to Cooperation
It is important to clearly distinguish between various types of collaboration—from the
exchange of software codes and resource sharing to joint test-beds or more long-term
R&D efforts—which come with widely differing needs for funding, personal
investments and efforts, or means of communication.
There are a variety of activities that can be undertaken without need for additional
resources, and by relying upon existing forums for communication. Other activities and
notably more long-term research efforts will need both careful planning and additional
funding resources on both sides.
There was widespread agreement that identifying an application “driver” for each
cooperation is likely to be important, but it was much less clear what application might
really help to “drive” cooperation between the two countries. Research communities in
areas such as high-energy physics or astronomy are well organized and there are many
grid activities that are already well under way globally.
Data integration is a strong need in the life sciences field and grid middleware might
really provide much needed technology to the life sciences community. Yet, it remains a
fact that many computer scientists involved in grids have little knowledge about the life
sciences, which often makes communication—and cooperation—difficult. In addition to
life sciences, materials research might constitute an interesting area for cooperation,
with needs for both high-end computation and data.
Grid researchers in Japan and the UK already cooperate on a number of activities, from
work within GGF working and research groups to existing
collaboration efforts in fields such as high-energy physics or various networking
projects. David Snelling from Fujitsu Europe Laboratories is also involved in
middleware development for Japanese grid efforts.
In order to be successful, efforts to build UK-Japan collaborations should, wherever
possible, take such existing efforts, contacts, and relationships into account.
4.2 Grand Challenge Projects
Grand challenge projects are enablers for cooperation, at least in the grid field, and
are well suited to it: there are clear goals, tight deadlines, and often simply a need
to cooperate. Most importantly, there is also considerable visibility for those who
succeed.
4.2.1 Sharing Compute Resources
Mutual use/sharing of computing resources is a straightforward area for UK-Japan
collaboration. The RealityGrid project in the UK has already concluded an agreement
with the TeraGrid project that will enable RealityGrid users to access TeraGrid resources.
Japan has formidable HPC resources and several grid activities in Japan include
generous hardware funding.
The Grid Technology Research Center (GTRC) is presently installing the AIST
Supercluster system (a set of large cluster systems with AMD Opteron and Itanium II
processors) and hardware has also been (or will be) purchased within various other grid
projects. With HPCx, the UK is presently building a major computing facility that is
focused on capability users.
While HPCx is already operating beyond its capacity, there might be possibilities for
mutual resource sharing between Japanese centers and the HPCx facility, especially if
this is related to grand challenge projects. Eventually, grand challenge projects and
resource sharing agreements might lead to broader cooperation agreements and
experimentation related to topics such as parallel file systems, computational
steering, or scheduling and accounting (see below).
• Actions: Explore possibilities for resource sharing agreements between
AIST/GTRC and the RealityGrid project; evaluate possibilities for other
resource sharing agreements between centers in the UK and Japan
• Resources: No specific resources needed at this stage
• Means: Direct interaction via email & phone
• Target: Discuss draft MoU at SC2003; start by April 2004
• Steering: Peter Coveney (UCL), David Henty (EPCC Edinburgh), Mitsuhisa
Sato (University of Tsukuba), Satoshi Matsuoka (Tokyo Institute of Technology)
4.2.2 Bandwidth Challenge
In both Japan and the UK, grid efforts are linked with programs to build
next-generation optical research networks (SuperJANET in the UK and
SuperSINET in Japan), which will be linked through UKLight and StarLight (Chicago).
This new network infrastructure provides a formidable basis for various grid
test beds, such as the demonstration of very high-speed communication at the
Supercomputing 2004 Conference (“Bandwidth Challenge”).
• Actions: Set milestones for test bed development
• Resources: Essentially bandwidth...
• Means: GGF/SC and other international meetings, AccessGrid, direct
interaction via mail or phone
• Targets: Demonstrate 2.4 Gbps at SC 2004 using StarLight
• Steering: Satoshi Sekiguchi (AIST/GTRC), Peter Clarke (UCL)
4.3 Infrastructure
Various collaborations already exist between the UK and Japan concerning network
infrastructure or IP; grid efforts should build on these existing links.
4.3.1 Managing/Monitoring Production Grids
Grid test beds are one thing, but how does one run a production grid? Over the past two
years, considerable experience with running production grids has accumulated in the
UK. Running production grids, especially if they are very heterogeneous, is a difficult
and messy business. Agreements regarding uniform policies for accounts and firewalls
are crucial, but very difficult to obtain. Also, it is very important to continuously
monitor the “health” of the Grid (and have it displayed on a website).
Many of the experiences with building large-scale production grids in the UK should be
highly relevant to Japan. Further, there should be many opportunities in the near future
to exchange opinions and policies with respect to production grids or else to cooperate
on the development, benchmarking, and optimization of management/monitoring tools
for production grids.
• Actions: Send Level 2 Grid test suite to ApGrid and NAREGI; exchange
documents and draft policy proposals related to production grids; send
ApGrid documentation
• Resources: Initially information exchange only
• Means: At initial stage essentially exchange of information, but consider joint
project workshop on production grids (experiences, monitoring tools, user
management, etc.)
• Steering: Rob Allan (CCLRC Daresbury), Stephen Pickles (CSAR), Yoshio
Tanaka (AIST/GTRC), Satoshi Matsuoka (Tokyo Institute of Technology)
4.3.2 IPv6 Protocol
The UK and Japan are already working together on IPv6 issues within the GGF IPv6
Research Group. This work will continue for some time and there should be good
opportunities for joint experimentation or verification. Eventually, this may lead to
experiments between the UK and Japan, possibly using GT2 and/or UPL.
• Means: GGF research group, video conferences, develop relationship between
GGF and the WIDE project
• Target: 12-18 months
• Steering: Peter Clarke (UCL), Satoshi Sekiguchi (GTRC)
4.4 Middleware
Eventually, most of the opportunities for collaboration between the UK and Japan will
be in the development and testing of grid middleware. While there exist numerous
opportunities, in the following only a few specific projects that were discussed at the
workshop are mentioned.
4.4.1 GridMPI
GridMPI is a latency-aware MPI implementation developed by the University of Tokyo
and AIST. It is suggested to use available bandwidth to jointly undertake application
tests with GridMPI or related MPI implementations, such as PACX-MPI. On the
European side, application experiments could be undertaken jointly with the Distributed
European Infrastructure for Supercomputing Applications (DEISA), a consortium of
various centers in Europe that aims, among other things, at building a virtual 148-node
single-system image SMP cluster.
• Actions: Test implementation, need to select application code!
• Resources: No specific resources needed
• Means: Discussion within existing meetings, direct interaction
• Targets: Demonstration within the next 12 months?
• Steering: David Henty (EPCC), Neil Stringfellow (University of Manchester),
Yutaka Ishikawa (University of Tokyo), Hiroshi Takemiya (AIST/GTRC)
4.4.2 Grid Virtual Machines
Grid Virtual Machines were mentioned as another area for possible collaboration
between Japan and the UK, although no details were discussed.
• Steering: David Snelling (Fujitsu Laboratories of Europe), Kenji Kohno (UEC)
4.4.3 Scheduling and Resource Brokering
Scheduling remains an essential part of any Grid middleware—and an area where much
work still needs to be done. In the UK, scheduling tools are developed within the GRIP
program, a European initiative that builds on the EuroGrid broker. In Japan too, there
are numerous activities on scheduling, and development efforts are intensifying.
Eventually, the aim should be a unified approach in which components are
essentially interoperable. Further, this is an area where joint development efforts are
both conceivable and desirable, and Fujitsu Laboratories of Europe is already involved in
some of the Japanese efforts.
• Actions: UK Scheduling Conference Edinburgh October 21/22; exchange
information and personnel among NaReGI/WP1, the UK Markets Project, and
EuroGrid/GRIP
• Resources: Funding within existing programs
• Means: Dedicated meetings, direct communication, exchange of personnel
• Target: Needs to be defined
• Steering: Dave Snelling (Fujitsu), Jon MacLaren (University of Manchester),
Satoshi Itoh (AIST/GTRC), Kento Aida (TiTech), Yuuji Iguchi (Fujitsu)
4.4.4 Middleware User Applications
Tools to build Problem Solving Environments on the Grid are in development both in
the UK (e.g. within RealityGrid) and in Japan (e.g. at GTRC). Such tools cover a
variety of functions, such as workflow management, visualization, computational
steering, or remote procedure calls.
• Steering: Kerstin Kleese van Dam (CCLRC Daresbury), Satoshi Itoh
(AIST/GTRC), John Brooke (University of Manchester), Hitohide Usami
(Fujitsu), David Henty (EPCC)
4.5 Data
Investigators in Japan and the UK are involved in the OGSA-DAI process and we
believe there is a host of opportunities for joint work related to the OGSA-DAI and
OGSA-DQP standard development process as well as related to OGSA specifications
for web data. Such activities are likely to include both short-term technology
development and integration and more long-term research. Given the importance of
standardization in this area, strong cooperation between the UK and Japan that also
includes long-term R&D is likely to benefit standardization processes, and the
respective roles of the two countries therein.
• Registries: In the near future, the grid is likely to become a fundamental part
of the semantic infrastructure. Database registries will be part of this semantic
grid and designing them is not trivial. At present there are no dedicated efforts
in either country to build such registries which, eventually, should be global in
nature.
• Portals: There has been much interest recently in the automated generation of
portals for data access and various groups in Japan and the UK as well as the
US are actively working on this issue. Also, as the number of databases
available on-line increases, there is an increasing need to support semantic
searching to find databases that contain interesting information.
• Binary Data: The querying of binary data through XML queries and the
XML description of structured data were mentioned as an important area
where much further work is needed. No specific action items were
mentioned on this subject, though.
There is also a host of opportunities for more straightforward collaboration on databases
in specific areas—such as the life sciences.
• Actions: Many opportunities; need to work out proposal
• Resources: Consider long-term R&D cooperation
• Means: Various existing meetings
• Steering: Paul Watson (University of Newcastle), Malcolm Atkinson (NeSC),
Isao Kojima (AIST/GTRC)
4.6 Applications
Application-related collaborative efforts are usually the most difficult to manage, since
they involve other interested parties: the scientists who “provide” the
problem to solve or the corporate clients who want to protect their interests.
Nonetheless, cooperation in the applications field can also be extremely rewarding.
4.6.1 Scientific Computing and HPC
Scientific applications remain an important driver for grid research—yet building
successful collaborations with computational scientists (or experimentalists) is no easy
task.
Building application-centered international collaborations is even more demanding.
High-energy physics is a somewhat special case, since the high-energy physics
community has been used to running large-scale international collaborations for several
decades. Also, in anticipation of a new generation of experiments, the high-energy
physics community has in fact been a driver behind large-scale international
collaborations on grids (such as the Gfarm project, where collaboration involving UK
and Japanese scientists already exists). In astronomy, the situation is somewhat similar.
By contrast, in life sciences, international collaborations and large-scale projects have
provided opportunities for international cooperation on data analysis, data integration,
or simulation. In medical research, data sharing is all but impossible and even in the life
sciences field, the integration of various sources of genomic, proteomic, or metabolomic
data at the national level remains a rather elusive goal in many countries—including
Japan.
While several possibilities for cooperation were specifically mentioned (such as
middleware development for LQCD within the International Lattice QCD Data Grid or
cooperation on grid-enabled MD codes, such as NAMD and REX-MS), precise targets for
collaboration efforts would still need to be selected, apart from a few existing efforts
in particle physics and astronomy. This is especially true for the life sciences field.
• Actions: Further explore interests and possibilities for cooperation projects in
various topical areas
• Steering: Peter Clarke (UCL), Peter Coveney (UCL), Mr. Oishi (NAO),
Mitsuhisa Sato (University of Tsukuba), Masaaki Kawata (AIST/GTRC),
Richard Kenway (University of Edinburgh), Steve Oliver (University of
Manchester), Shinji Shimojo (Osaka University), Osamu Tatebe
(AIST/GTRC)
4.7 Business Applications
As part of the UK e-Science project, a number of commercial projects have been funded.
In Japan, business-related projects are starting now with a new project financed by the
Ministry of Economy, Trade and Industry (METI) and with involvement from
Japanese IT vendors. While direct cooperation on commercial projects may be difficult,
since such projects typically involve industry co-funding, sharing experiences with
commercial grid projects is important in emerging markets.
The following areas were identified as promising:
• GridCast: the use of grids in broadcasting and, more generally, the handling,
distribution, and storage of digital video files
• Commercial computing: application of grids in financial and commercial
computing, such as derivative analysis or billing
In specific cases, direct cooperation involving industry actors may also be possible. At a
somewhat more advanced stage, joint evaluations of commercial implementations and
an assessment of the Total Cost of Ownership (TCO) of grid technology might also be
possible.
• Actions: Define framework for joint evaluation of commercial grid projects,
evaluate other opportunities for cooperation
• Resources: Direct communication
• Steering: Ron Perrott (University of Belfast), Satoshi Itoh (AIST/GTRC)
4.7.1 Grid ASP
Tools for building Grid ASP portals will be increasingly important for business users.
Such tools include functionality for single sign-on, job submission, accounting, and
security. Grid “economics” and the development of pricing models are essential for
ASP services. In the case of the UK, there exists a common currency for computing
resources, but there is no unified model for costing/pricing. In Japan, economic models
and approaches will be developed within the Business Grid project. Given the
complexity of the issues surrounding accounting, exchange of experiences and
information, as well as exchange of personnel, should be highly beneficial for both
sides.
• Steering: Satoshi Itoh (AIST/GTRC), Ron Perrott (University of Belfast)
4.7.2 Grid Security
There is a host of other topics where cooperation and exchange of experiences
between the UK and Japan would be desirable and highly welcome, notably in the area
of security, including data security related to medical data, role-based access control,
and all issues related to Certification Authorities. While no specific projects or action
items were identified, we believe that cooperation in the area of security and
certification is especially important and, in fact, inevitable for all activities that go
beyond demonstrations and test beds and include real production grids or application
work.
5. What Next?
Cooperation does not come easily, and rarely works if there are no shared interests, no
mutual recognition of each other’s work, and, perhaps most importantly, no
perception of benefits that are both mutual and real. Judging from discussions and
comments at the UK-Japan meeting, there are a number of areas where all three
conditions are already fulfilled.
The demonstration of very high-speed data transmission (e.g. by participation in the
“Bandwidth Challenge” at the US SuperComputing Conferences) as well as
demonstration experiments with GridMPI constitutes one area where cooperation
seems straightforward and relatively easy to achieve. There is a host of other areas in
middleware development where cooperation, if not necessarily joint development,
seems relatively easy.
Scheduling is clearly an area where cooperation between the UK and Japan, involving
various scheduling efforts in Japan, scheduling work within the GRIP project, and
Unicore related efforts, seems very straightforward and of considerable mutual benefit.
Scheduling is intimately linked with issues such as Grid economics, and thus is likely to
open up related opportunities for joint research and development.
In the case of data, there are several opportunities to build joint efforts around the
OGSA-DAI and OGSA-DQP specifications and the DAIT projects in the UK and
data-related efforts at the Grid Technology Research Center and other organizations in
Japan. Standardization should, at least ideally, be driven by long-term research, and
this appears to be an area where cooperation between the UK and Japan could have an
impact beyond research alone. Joint engagement in the development of standards in this
area, initially via the DAIS WG at GGF, has already begun. It will benefit from increased
investment as well as from the background research.
But there are also areas where the mutual benefits are somewhat unclear and where
there is still uncertainty about what the other side is “really doing”. For example, the
mutual sharing of compute resources is never easy, and certainly not for centers that
are already operating well beyond capacity, such as HPCx. For the Japanese side, the
same is true, even if hardware spending within grid efforts appears to be
considerably higher in Japan than in the UK.
As concerns applications, further discussion is needed among investigators in the
various areas mentioned at the meeting, such as life sciences or mesoscale/nanoscale
simulations. For example, it was mentioned at the workshop that “life sciences” is far too
broad and diffuse a category to be of much use: are we talking about integrating
various genomics/proteomics databases, about the simulation of biomolecule complexes,
about imaging, or about clinical trials for new pharmaceuticals?
Still, applications are important. If it is true that, according to David Snelling’s
“hype” curve of the grid, a downturn is now imminent, identifying applications with
either a high scientific or a high commercial value may well turn out to be crucial for
successful collaboration efforts between the UK and Japan.
In any case, it is now up to participants at the meeting to link up, exchange information,
exchange code, discuss joint projects, or refine proposals. Conferences and events
like GGF provide ample opportunity to do so. It is unclear at this point what kind of
funding is available to support UK-Japan collaborations, but funding is perhaps not the
limiting factor. During preparatory stages in particular, there is much that can be done
without explicit funding.
A second UK-Japan workshop is planned to take place in Japan in about 18 months.
This should provide enough time to develop at least some of the project ideas listed in
this document. That workshop will include focused discussions around those
developments.
Appendix 1: Workshop Program
Thursday October 2
19:00 Finger Buffet
Friday October 3
8:30 - 9:20 Breakfast
9:20 - 9:40 Introduction, scene setting etc. (chair Ron Perrott)
    Tony Hey (Malcolm Atkinson may have to substitute)
    Satoshi Sekiguchi
9:40 - 11:00 Grid Activities: Surveys & Reports, 4 talks (chair Ron Perrott)
    Peter Coveney: The RealityGrid: A Survey
    Satoshi Matsuoka: Towards a Petascale Research Grid Infrastructure in Japan
    David Henty: Exploiting Terascale Supercomputers: Experiences from HPCx
    Yoshio Tanaka: ApGrid: An Asia Pacific Partnership for Grid Computing
11:00 - 11:15 Coffee Break
11:15 - 12:35 e-Science and HPC (chair Satoshi Matsuoka)
    Yutaka Ishikawa: GridMPI: A Novel Latency-Aware MPI Implementation
    John Brooke: Resource Brokering on Complex Grids
    Hiroshi Takemiya: Developing Scientific Applications Using Standard Grid Middleware
    Steven Newhouse: ICENI: An Integrated Grid Middleware to Support e-Science
12:35 - 13:30 Lunch
13:30 - 14:50 Standard Middleware (chair Richard Kenway)
    Dave Snelling: Beyond OGSI
    Paul Watson: A Grid Data Integration Service
    Satoshi Itoh: Grid ASP Portals and the Grid PSE Builder
    Peter Clarke: Networking Infrastructure and Network Projects in the Context of E-Science in the UK
14:50 - 16:10 Life Sciences and Chemistry Applications (chair Isao Kojima)
    Derek Hill: Grid Technology and Medical Imaging
    Mitsuhisa Sato: Drug Discovery by Grid Technology
    Steve Oliver: Capture, Integration, and Sharing of Functional Genomic Data
    Masaaki Kawata: Molecular Simulations: Toward Grid Based Approaches
16:10 - 16:30 Coffee Break
16:30 - 17:50 Grids and Data (chair Steve Oliver)
    Malcolm Atkinson: Data, Data Everywhere.
    Isao Kojima: Grid Based Database Integration at AIST
    Ron Perrott: The Grid and Media: the Gridcast Project
    Richard Kenway: The International Lattice Data
18:30 Dinner
Saturday October 4
8:00 - 8:30 Breakfast
8:30 - 10:10 Grid Discussion Groups
    1. Infrastructure
    2. Data
    3. Applications
10:10 - 10:30 Break
10:30 - 11:45 Assimilation of Group Feedback (chairs Malcolm Atkinson & Satoshi Sekiguchi)
    Discussion, reflections, etc.
    Opportunities for collaboration on applications, etc.
    Summary of meeting, action items, next steps
Appendix 2: Workshop Participants
Japanese Participants
Yutaka ISHIKAWA; The University of Tokyo; ishikawa@is.s.u-tokyo.ac.jp
Satoshi ITOH; Grid Technology Research Center (GTRC); satoshi.itoh@aist.go.jp
Masaaki KAWATA; National Institute for Advanced Industrial Science and Technology (AIST); m.kawata@aist.go.jp
Isao KOJIMA; Grid Technology Research Center (GTRC); kojima@ni.aist.go.jp
Satoshi MATSUOKA; Global Scientific Information and Computing Center (GSICC), Tokyo Institute of Technology; matsu@is.titech.ac.jp
Mitsuhisa SATO; University of Tsukuba; msato@is.tsukuba.ac.jp
Satoshi SEKIGUCHI; National Institute for Advanced Industrial Science and Technology (AIST); s.sekiguchi@aist.go.jp
Shinji SHIMOJO; Osaka University Cybermedia Center; shimojo@cmc.osaka-u.ac.jp
Hiroshi TAKEMIYA; Grid Technology Research Center (GTRC); h-takemiya@aist.go.jp
Yoshio TANAKA; Grid Technology Research Center (GTRC); yoshio.tanaka@aist.go.jp
Robert TRIENDL; triendl@gol.com
UK Participants
Malcolm ATKINSON; The National e-Science Centre; mpa@nesc.ac.uk
John BROOKE; North-West Regional eScience Centre; j.m.brooke@man.ac.uk
Peter CLARKE; Particle Physics Group, University College London; clarke@hep.ucl.ac.uk
Peter COVENEY; University College London; P.V.Coveney@ucl.ac.uk
David HENTY; EPCC, University of Edinburgh; dsh@epcc.ed.ac.uk
Derek HILL; KCL School of Medicine, Guy’s Hospital; derek.hill@kcl.ac.uk
Richard KENWAY; University of Edinburgh; R.Kenway@ed.ac.uk
Steven NEWHOUSE; Imperial College London; sjn5@doc.ic.ac.uk
Steve OLIVER; University of Manchester; steve.oliver@bso.man.ac.uk
Ron PERROTT; School of Computer Science, Queen's University of Belfast; R.Perrott@qub.ac.uk
David SNELLING; Fujitsu Laboratories of Europe; d.snelling@fle.fujitsu.com
Paul WATSON; North East Regional e-Science Centre; Paul.Watson@newcastle.ac.uk