UK-Japan “N+N” Workshop on Grid Computing
London, October 3-4, 2003
Summary Report

Executive Summary

The first “UK-Japan N+N Workshop on Grid Computing” was held on October 3-4, 2003 in London, with participation from over 20 Grid researchers. The workshop was jointly organized by the National e-Science Center (NeSC) in Edinburgh and the Grid Technology Research Center (GTRC) of the Institute for Advanced Industrial Science and Technology (AIST) in Tsukuba and Tokyo, and was chaired by Malcolm Atkinson, Ron Perrott, and Satoshi Sekiguchi. An explicit goal of the workshop was to explore opportunities for collaboration between existing grid middleware and applications R&D efforts in the UK and Japan.

On the first day of the workshop, 20 presentations organized in five sessions provided an overview of grid projects in the UK and Japan, as well as of grids and high-performance computing, developments in standard grid middleware, and grids and data. On the second day, participants were divided into three topical groups (infrastructure, data, and applications) and asked to identify possibilities for exchange and cooperation between the UK and Japan in grid research. During a final panel discussion, potential areas for collaboration were elaborated, and for each area the most suitable candidates to oversee eventual collaboration activities were identified. In some cases immediate action items were identified.

A second UK-Japan workshop is planned to take place in Japan in about 18 months. It is expected that, by then, at least some of the project ideas listed in the following will have turned into real collaborative ventures.
In the meantime, the next steps are:
• Follow up on project proposals at SC2003/GGF9
• Determine date and format for the Spring 2005 workshop in Japan
• Investigate possibilities for funding long-term collaborations

It is now up to the participants at the meeting to link up, exchange information, exchange code, discuss joint projects, and refine proposals. However, while a number of joint activities can be undertaken relatively easily and with existing resources, external funding will be needed for collaborations on more long-term, fundamental research. The UK and Japan are perhaps the two countries that spend most on grid research besides the US, and researchers in the UK and Japan are heavily involved in GGF activities and international standardization in the grid area. In the case of grids, fundamental research is often highly relevant to standardization. Supporting long-term R&D collaborations between the UK and Japan therefore also means strengthening the position of both countries in an emerging technology that has the potential to affect the entire IT industry.

1. Introduction

On October 3-4, 2003, over 20 Grid researchers from the UK and Japan gathered in London for the “UK-Japan N+N Workshop on Grid Computing”, a dense two-day meeting with presentations and discussion covering all aspects of grid computing and its application in research and industry.
The workshop was organized by the National e-Science Center (NeSC) in Edinburgh and the Grid Technology Research Center (GTRC) of the Institute for Advanced Industrial Science and Technology (AIST) in Tsukuba and Tokyo, and was chaired by Malcolm Atkinson, Ron Perrott, and Satoshi Sekiguchi [1]. After an initial day with 20 presentations on grid projects in the UK and Japan, grids and high-performance computing, standard grid middleware, and grids and data, on the morning of the second day the participants split up into three topical discussion groups (infrastructure, data, and applications) to debate possibilities for exchange and cooperation between the two countries. The following provides an outline of the presentations and discussions at the workshop.

1.1 Background of the Meeting

International collaboration in the area of grids does not need justification. Grids, like the Internet, are inherently global. Within the UK e-Science program, it is recognized that, in addition to participation in international conferences or the GGF, direct interaction with “sister programs” in other countries is necessary and important. In Japan, the AIST Grid Technology Research Center (GTRC) has served as a “gateway” between the international grid community and Japan. Also, there have been numerous efforts, such as APGrid or PRAGMA, to link various grid activities in the Asia-Pacific region.

The idea of a joint workshop on grids between the UK and Japan was first discussed in the context of the IST 2000 meeting in Nice, France, which was attended by John Taylor and Satoshi Sekiguchi. Several months later, in spring 2001, John Taylor visited Japan and ideas for cooperation were further investigated. Eventually, a proposal for a “UK/Japan Grid Laboratory”, drafted by Tony Hey and Satoshi Sekiguchi, was included in the “4th UK-Japan Agreement on Scientific and Technological Cooperation”, which was signed by the two countries at a meeting in London in February 2002.
The status of that proposal is still pending.

[1] On the Japanese side, this workshop was organized by the Grid Technology Research Center. However, it should be noted that, during 2003, Japan launched a major national Grid research and development effort, termed the National Research Grid Initiative (NAREGI). It is expected that, in the future, NAREGI efforts will play an important part in UK-Japan cooperation efforts.

In the meantime, the UK government had launched its e-Science program, a large-scale grid effort oriented towards applications and valuable scientific output rather than middleware development or large-scale hardware procurement [2]. In Japan, the Institute of Advanced Industrial Science and Technology (AIST) formed the first dedicated grid research organization, the Grid Technology Research Center (GTRC). At the same time, the Ministry of Education, Culture, Sports, Science and Technology (MEXT) funded several medium-scale Grid infrastructure and R&D projects, such as the Information Technology Based Laboratory (ITBL) project, the Osaka Biogrid, or the Titech Grid at the Tokyo Institute of Technology. In 2003, the Japanese government launched two large national-level grid initiatives to support middleware development for scientific and business applications, namely the NAREGI (National Research Grid Initiative) project and the Business Grid Project [3].

1.2 The Agenda: UK-Japan Cooperation

In the past 2-3 years, both the UK and Japan have launched large-scale grid initiatives. Also, in both countries, the number of individual grid projects as well as the number of university departments or research centers involved in grid activities have increased considerably. Commercial interest in grids and grid applications has surged, and grids have been advertised as the Internet of the future. Leaving aside the US, the UK and Japan are perhaps the two countries that spend most on grid research.
Researchers in the UK and Japan have been heavily engaged in GGF activities and, given the importance of international standardization, there is a natural interest in long-term cooperation on fundamental research that can drive standardization. With large efforts in place on each side, 2003 thus seemed the right year to further investigate possibilities for collaboration between the UK and Japan. Plans for a joint workshop between the UK e-Science program and Japanese grid activities were first discussed between Malcolm Atkinson and Satoshi Sekiguchi at GGF7 in March 2003 in Tokyo, and plans for the event were finalized by Ron Perrott, Malcolm Atkinson, and Satoshi Sekiguchi. The practical aspects of the workshop were arranged by Gill Maddy (NeSC) and Yuko Oshima (GTRC), who also provided on-site support for the Japanese participants.

[2] The UK e-Science programme corresponds to an investment of approximately £250 million over 5 years, of which 75% is invested in problem-driven R&D and 25% is invested in coordination and a core programme, including middleware R&D and regional/national centres. Investment in HPC is channelled separately, and includes £55 million on HPCx over 6 years.

[3] The National Research Grid Initiative is funded at around ¥9 billion over a 5-year period, with an additional ¥4-5 billion in hardware procurements. Public funding within the Business Grid Project is comparable to NAREGI funding. However, in the case of the Business Grid Project, companies need to match funds invested by the government.
1.3 Goals of the Meeting

An explicit goal of the workshop was to explore opportunities for collaboration between existing grid middleware and applications efforts in the UK and Japan, including the following:
• exchange of information
• exchange of experiences regarding policies for production grids
• mutual sharing of resources
• joint test beds and demonstrations
• joint testing of middleware protocols
• joint specifications to guarantee interoperability of middleware components
• scientific cooperation projects that use the grid as a tool
• commercial grid projects
• cooperation on user identification, certification, and security
• grid economics

Throughout this workshop, the goal was to identify both common interests and complementary capabilities, leading eventually to proposals for cooperation between groups in the two countries. To this end, a better understanding and awareness of grid-related activities in each other’s country was deemed necessary.

Presenters at the workshop were asked to emphasize actual experiences and problems encountered with software development efforts, implementations, production grids, or test beds, rather than to focus on “grand schemes” or technical detail (all presentations from the workshop are available). Also, throughout the workshop, the emphasis was on discussion. Following a day of presentations, on the second day of the meeting participants were split into three groups (infrastructure, middleware, and applications) and asked to come up with possible topics for cooperation. The following document summarizes the discussions at the meeting, including a list of possible topics for cooperation efforts between groups in the UK and Japan.

2. Grid Infrastructure

An important goal of the workshop was to foster awareness of grid activities in both countries.
While a complete survey of grid activities in the UK and Japan would have been far beyond the capacity of the meeting, several presentations highlighted important grid efforts in both countries.

2.1 Funding for Grid Research

Judging from the amount of funding made available for grid research, there can be no doubt that the grid is high on the agenda of policy makers in the UK and Japan. Grid-related funding in the UK has been mainly provided through the e-Science program as well as through a number of European Union funded activities. The e-Science program has provided funding by a variety of means. Most importantly, the program has funded a number of regional e-Science centers across the country as well as the National e-Science Center (NeSC) in Edinburgh.

Funding for Grid activities in Japan is somewhat more fragmented and, over the past few years, a variety of initiatives have been launched by various ministries and agencies, including the former Science and Technology Agency (now part of the Education Ministry) and the Ministry for Economics, Trade, and Industry (METI), as well as various initiatives funded through the Ministry for Education, Culture, Sports, Science, and Technology (MEXT).

The Grid Technology Research Center (GTRC) within the National Institute for Advanced Industrial Science and Technology (AIST), which developed out of a research unit on network computing and clusters at the former Electrotechnical Laboratory, was the first large-scale grid research center in Japan, and perhaps also internationally, with a dedicated staff of some 50 full-time employees and offices in Tsukuba and Tokyo. GTRC also functions as an informal link between the various Grid activities in Japan and provides support for the Japanese Grid Consortium. GTRC is by far the largest dedicated research organization in the grid area in Japan, supporting research on all aspects of grid technology, from infrastructure to applications.
A number of grid efforts in academia have started over the past few years, and organizations such as the Japan Atomic Energy Research Institute (JAERI) have launched grid efforts. But it was only this year that funding for Grid research on a broader scale became available, with the launch of two national Grid projects targeting science and business applications:
• The National Research Grid Initiative (NAREGI), organized and funded by MEXT and hosted at the National Institute of Informatics (NII), aims at developing standard middleware for scientific applications and at demonstrating the feasibility of a grid-based national computing infrastructure.
• The Business Grid project, supported by METI, aims at the development of basic underlying grid middleware for business applications and future e-commerce environments.

Both initiatives are well funded and are at least partially intended to support the Japanese computer industry in developing grid capabilities. Many academic institutions, notably Osaka University, Tokyo Institute of Technology, and Kyushu University, participate in the NAREGI efforts.

2.2 Grid Projects in the UK and Japan

A number of presentations at the meeting covered Grid activities in Japan and the UK. Peter Coveney presented an overview of activities under the Reality Grid, a large grid-based computational science project funded through the e-Science program. Reality Grid is mainly focused on computation, including very large-scale computations using HPCx or systems of a similar size. Within the Reality Grid project, various middleware tools that give HPC users better control over their jobs were developed, notably techniques for computational steering.

Satoshi Matsuoka (TITech) provided a broad outline of the NAtional REsearch Grid Initiative (NAREGI). Sponsored by the Japanese education ministry (MEXT), the project focuses on grid middleware R&D activities and nanoscience grid applications.
The current focus at NAREGI is mainly on compute grids rather than data grids. Also, NAREGI was conceived as a software R&D project and not a deployment project. Still, a 15 Teraflops test bed is presently being built. Although NAREGI is still at an early stage, collaborations with middleware projects such as Unicore, Globus, and Condor as well as contributions to various OGSA standardization efforts are planned. In addition to the National Institute of Informatics (NII), the Institute for Molecular Science (IMS), and the three large computer vendors, various academic and research institutions, including GTRC and Tokyo Institute of Technology, are participating in the project.

Several other presentations provided glimpses of important grid activities in both countries and internationally. Richard Kenway introduced the International Lattice Data Grid, a project with participation from both Japan and the UK. Peter Clarke provided an overview of the networking resources within the UK e-Science program. Yoshio Tanaka presented the APGrid, a consortium of organizations in the Asia-Pacific region that aims at developing a regional test bed. Judged by the number of organizations that have joined, or that provide resources, APGrid has been very successful at managing a grid consortium without centralized funding and in a region that is both politically and economically extremely diverse, which is a formidable challenge. But, as Tanaka pointed out, managing diverse and heterogeneous networks of organizations is an essential part of grid technology.

2.3 Towards Production Grids

Experiences with large-scale grid test beds, and especially with production grids such as the UK Level 2 Grid, seem to converge on a set of issues that are both “technical” and “social” in nature. Dealing with a heterogeneous infrastructure composed of resources from various organizations remains challenging and demanding.
Experience with the APGrid test bed, a project without a single funding source, further illustrated this point.

But, judging from the presentations at the workshop, there are also important differences between the thematic emphasis of grid efforts in the UK and in Japan. Activities in the UK seem to be more focused on grids, while in Japan there appears to be more emphasis on large clusters and computation. Also, the e-Science program is strongly focused on users and applications: it is not a grid program per se, but a broader initiative that aims at building an IT infrastructure for scientific inquiry. While the choice within the UK e-Science program was not to develop any kind of new middleware, but rather to try to put existing packages and toolkits to work, grid efforts in Japan are characterized by a strong emphasis on new middleware research and development. To some extent, this might simply be a reflection of the fact that Japan has a strong domestic IT industry. Conversely, while there exist significant middleware development efforts in the UK, such as OGSA-DAI, it appears that, at least at this stage, grid activities in the UK are somewhat more oriented towards working directly with users.

3. Grid Middleware

Grids are essentially middleware, and middleware is still where the problem is with grids. Most of the presentations at the meeting dealt essentially with middleware issues. One sentiment, expressed by various participants throughout the meeting, concerns approaches towards middleware development. In his presentation, and in various comments, Peter Coveney pointed out that, from the perspective of grid users, it is more important “to get things to work”, even if that means using unconventional approaches or improvisation. In other words, searching for pragmatic solutions that work, even if they may not be optimal in “architectural terms”, is preferable to working out ideal solutions on paper only.
The counter-argument was also raised: such pragmatic solutions demonstrate the required functionality, but longer-term investment is needed to provide more economic and generic middleware that delivers that functionality.

3.1 Middleware Development in the UK and Japan

Interestingly, while the UK e-Science program has opted not to support the development of new middleware packages, but rather to use existing middleware such as Globus, Unicore, or Condor, there is more emphasis on middleware development within Japanese grid R&D programs, perhaps reflecting the presence of several large domestic computer vendors in Japan. Grid efforts in both Japan and the UK have opted not to develop the “base” middleware layer by themselves, but rather to rely on available de-facto middleware standards. For example, the Japanese NAREGI project will base its middleware development efforts on Globus, Unicore, and Condor and add new functionalities or layers, without duplicating existing efforts. Such an approach opens up many opportunities for cooperation.

David Snelling reminded participants that, although OGSI has now been published and is very likely to be adopted as a GGF recommendation, a lot of work still remains to be done and the overall OGSA scheme is far from complete, with OGSI representing only the lowest branch of a complex services hierarchy. Also, there remain many issues with respect to OGSI interoperability and its definition.

3.2 Web Portals

Efforts to develop a software toolkit for the automated generation of application portal sites were discussed by Satoshi Itoh. In a related presentation, Hiroshi Takemiya discussed issues with “gridifying” legacy applications using standard Grid middleware, such as Globus or Ninf-G. Steven Newhouse discussed ICENI, an integrated Grid middleware presently being developed by the London e-Science Center and built using Java and Jini.
The ICENI architecture includes scheduling, execution, visualization, and steering services accessible through a highly developed graphical user interface.

3.3 GridMPI

Yutaka Ishikawa, who is responsible for the SCore cluster operating system development, presented GridMPI, a new latency-aware MPI implementation. Fast communication protocols are a key to building efficient cluster operating systems, and better performance in the communication layer is crucial for running parallel applications on computational grids. Several MPI implementations were tested in an emulated WAN environment in which two clusters are connected by a PC router that artificially adds communication latency. Using the NAS Parallel Benchmarks and MPI implementations such as MPICH-P4, MPICH-G2, and MPICH-SCore, it was demonstrated that the distance limit for running non-embarrassingly parallel applications on the Grid is around 2000 km (i.e. metropolitan networks). Based on these data, the decision was made to build a new GridMPI implementation from scratch. Its features include a new latency-aware implementation and a new TCP/IP protocol implementation.

3.4 Resource Brokering

Accounting and resource brokering are central components of the Grid and, together with user identification and certification, a major problem for large, public grid projects. In his presentation on Grid activities related to HPCx, David Henty pointed out that accounting remains a major issue for all large computer centers and is very difficult to do on large parallel systems. Typically, centers write their own accounting software, yet many problems remain, which suggests that accounting will be much more difficult in heterogeneous environments such as the Grid. In complex Grid environments, new concepts for resource brokering are therefore needed. Some ideas were presented in John Brook’s presentation, which described efforts within the GRIP project to develop “federated” resource brokering schemes.
Work on resource brokering is also an important component of middleware development efforts in Japan, and there seem to exist good opportunities for cooperation between the UK and Japanese teams.

3.5 Data

Issues surrounding data and grids were engagingly summarized by Malcolm Atkinson in his talk on “data everywhere”. Structured digital data are now becoming ubiquitous and there is an increasing need to link distributed datasets with computing resources. Isao Kojima presented promising efforts at AIST towards a grid-based integration of research databases using an XQuery-based metadata integration system and a grid proxy/mediator-based database integration approach for hidden web databases, based on the OGSA-DAI framework. Paul Watson, in his presentation, demonstrated a services-oriented approach for the access and integration of structured data on the Grid based on the OGSA-DQP framework.

In a beautiful example of a grid application for managing large data sets, Ron Perrott introduced the GridCast project, a joint effort with the BBC to use grids for the internal distribution, handling, and storage of digital media files. There are many possible extensions of this effort, such as the use of grids and IP networks to deliver content to educational organizations.

There were also discussions regarding Gfarm and the database query efforts by Watson. The goal of the Gfarm project, which is undertaken jointly by AIST/GTRC, KEK, and the Tokyo Institute of Technology, is to develop a new file system for wide-area and bandwidth-intensive data grid processing. It was suggested that OGSA-DQP and Gfarm would nicely complement each other for high-bandwidth processing of distributed data.

Grid Applications

As grid applications mature, the focus has increasingly shifted from computation towards data, and towards the integration of large, distributed data sets with high-end computing.
This trend was well in evidence in the presentations related to applications at the workshop. Structured digital data, as Malcolm Atkinson pointed out, are now “everywhere” and there is an urgent need to combine large-scale data management and high-end computing using distributed resources.

3.6 Life Science and Medical Applications

The integration of many different sources of data remains a major challenge in the bioinformatics field, according to Steve Oliver, who presented his work on an object-oriented database system for sequence and functional genomic data, called the Genome Information Management System (GIMS), as well as a prototype data repository for proteome data. The grid is well suited to address many of the problems that bioinformaticians are facing today. The number of new data sources is increasing continuously and many organizations now provide access to experimental data on their websites, which further complicates issues of data integration and access. myGrid was introduced as a first step towards applying the GIMS principles of data integration in a Grid-based system that would avoid internalising huge amounts of data from around the world into a single data warehouse.

But there are also important limitations and hurdles to the application of grids, especially in biomedical research. For example, Steve Oliver pointed out that there remain difficult issues concerning the validation or correction of experimental data. If data are distributed to many different locations, how can this be achieved? Also, in areas where competition in research is high, control over data, even after their release, may be important.

Especially in medical research, data security remains a major hurdle for the application of grids, a fact that became especially clear in Derek Hill’s presentation on medical imaging as a grid application.
An increasingly sophisticated infrastructure for processing and storing medical images is needed as data sets and algorithms get more complex. Also, imaging is increasingly used for guiding and planning therapy, as opposed to simple diagnosis. Most importantly, medical images are increasingly stored in digital archives. While medical imaging would certainly benefit from access to on-demand compute resources and techniques for dealing with distributed data, there remain fundamental challenges to the use of grids in medical imaging, notably issues of data security and privacy. Similar issues are relevant in medical research in general. As Derek Hill pointed out, it is still common to transfer data from clinical trials physically on a CD or hard disk, rather than to send them over the Internet.

3.7 Nanoscience and Molecular Simulations

Several presentations at the workshop discussed approaches to extend materials simulations, such as molecular dynamics, to cluster and grid environments, targeting applications in basic and applied research, including nanoscale simulations of materials and biological systems or support systems for drug design and development.

Approaches to molecular simulations on highly parallel systems in cluster and grid environments were discussed by Masaaki Kawata. The combination of fast and reliable algorithms for molecular dynamics simulations on clustered systems with innovative grid-based simulation methodologies that utilize widely distributed computational resources should yield robust systems for large-scale molecular simulations in a grid environment. In particular, Kawata discussed a grid-optimized replica exchange method (termed RAX-MS) for solving optimization problems.

Mitsuhisa Sato provided an overview of efforts at integrating various software tools, such as tools for conformational search and molecular simulations, into a grid-based system to support drug discovery.
In order to implement the system, a middleware tool for remote procedure calls (RPC) on the grid, called OmniRPC, was developed. The need to integrate databases with various types of simulations makes grids especially suited for applications such as drug discovery. Nanoscale simulations are also an important part of the applications development work within the Reality Grid discussed by Peter Coveney.

3.8 Commercial Applications

Commercial applications of grid technology were not a major focus at this meeting. Still, commercial applications were discussed in several presentations. Satoshi Itoh introduced the Grid PSE Builder project at the Grid Technology Research Center (GTRC), a software kit that enables users to easily build web application services. So far, portal sites for standard technical computing applications, such as Gaussian or Phoenics, have been developed using the Grid PSE Builder.

In a beautiful example of a grid application, Ron Perrott presented an ongoing collaboration project with the BBC that aims to use grid technology to manage video data. The distribution of video data at the BBC is presently managed in a highly centralized form, with video files distributed from the main office in London to regional TV stations using dedicated lines. If successful, grid technology could contribute to a significant change in the way the organization processes, distributes, and stores video data. Presently, video data are stored mainly as physical objects and distributed using special-purpose equipment and dedicated networks. Using grid technology, it is now possible to store video as data files on commercial server systems and to distribute them over IP-based networks using grid middleware.

4. Building UK-Japan Collaborations

Following a day of presentations covering many aspects of grid-related research, middleware development, and grid applications, the goal for the second day of the workshop was to identify areas of common interest (or complementarities) in grid efforts in the UK and Japan and, eventually, to define a set of possible topics for future collaboration between the two countries. Both the UK and Japan have sizeable R&D efforts on grids, and cooperation on fundamental research is highly desirable. Further, both countries are also involved in standardization activities, and long-term R&D efforts constitute a crucial backdrop to formulating standards.

4.1 Approaches to Cooperation

It is important to clearly distinguish between various types of collaboration, from the exchange of software codes and resource sharing to joint test beds or more long-term R&D efforts, which come with widely differing needs for funding, personal investment and effort, or means of communication. There are a variety of activities that can be undertaken without need for additional resources, relying upon existing forums for communication. Other activities, notably more long-term research efforts, will need both careful planning and additional funding on both sides.

There was widespread agreement that identifying an application “driver” for each cooperation is likely to be important, but it was much less clear what application might really help to “drive” cooperation between the two countries. Research communities in areas such as high-energy physics or astronomy are well organized and there are many grid activities that are already well under way globally. Data integration is a strong need in the life sciences field, and grid middleware might really provide much needed technology to the life sciences community.
Yet it remains a fact that many computer scientists involved in grids have little knowledge of the life sciences, which often makes communication, and cooperation, difficult. In addition to the life sciences, materials research might constitute an interesting area for cooperation, with needs for both high-end computation and data.

Grid researchers in Japan and the UK already cooperate on a number of activities, ranging from work within GGF working and research groups to existing collaboration efforts in fields such as high-energy physics or various networking projects. David Snelling from Fujitsu Europe Laboratories is also involved in middleware development for Japanese grid efforts. In order to be successful, efforts to build UK-Japan collaborations should, wherever possible, take such existing efforts, contacts, and relationships into account.

4.2 Grand Challenge Projects

Grand challenge projects are enablers for cooperation, at least in the grid field. They are well suited to cooperation: there are clear goals, tight deadlines, and often simply a need to cooperate. Most importantly, there is also considerable visibility for those who succeed.

4.2.1 Sharing Compute Resources

Mutual use and sharing of computing resources is a straightforward area for UK-Japan collaboration. The Reality Grid project in the UK has already concluded an agreement with the Teragrid project that will enable Reality Grid users to access Teragrid resources. Japan has formidable HPC resources and several grid activities in Japan include generous hardware funding. The Grid Technology Research Center (GTRC) is presently installing the AIST Supercluster system (a set of large cluster systems with AMD Opteron and Itanium II processors), and hardware has also been (or will be) purchased within various other grid projects. With HPCx, the UK is presently building a major computing facility that is focused on capability users.
While HPCx is already operating beyond its capacity, there might still be possibilities for mutual resource sharing between Japanese centers and the HPCx facility, especially if this is related to grand challenge projects. Eventually, grand challenge projects and resource sharing agreements might lead to broader cooperation agreements and experimentation related to topics such as parallel file systems, computational steering, or scheduling and accounting (see below).

• Actions: Explore possibilities for resource sharing agreements between AIST/GTRC and the RealityGrid project; evaluate possibilities for other resource sharing agreements between centers in the UK and Japan
• Resources: No specific resources needed at this stage
• Means: Direct interaction via email and phone
• Target: Discuss draft MoU at SC2003; start by April 2004
• Steering: Peter Coveney (UCL), David Henty (EPCC Edinburgh), Mitsuhisa Sato (Tsukuba University), Satoshi Matsuoka (Tokyo Institute of Technology)

4.2.2 Bandwidth Challenge

In both Japan and the UK, grid efforts are linked with programs to build future-generation optical research networks (SuperJANET in the UK and SuperSINET in Japan), which will be linked through UKLight and StarLight (Chicago). This new network infrastructure provides a formidable basis for various grid test beds, such as the demonstration of very high-speed communication at the Supercomputing 2004 Conference (“Bandwidth Challenge”).

• Actions: Set milestones for test bed development
• Resources: Essentially bandwidth...
• Means: GGF/SC and other international meetings, AccessGrid, direct interaction via email or phone
• Targets: Demonstrate 2.4 Gbps at SC 2004 using StarLight
• Steering: Satoshi Sekiguchi (AIST/GTRC), Peter Clarke (UCL)

4.3 Infrastructure

Various collaborations already exist between the UK and Japan concerning network infrastructure or IP; grid efforts should build on these existing linkages.
4.3.1 Managing/Monitoring Production Grids

Grid test beds are one thing, but how does one run a production Grid? Over the past two years, considerable experience with running production grids has accumulated in the UK. Running production grids, especially if they are very heterogeneous, is a difficult and messy business. Agreements on uniform policies for accounts and firewalls are crucial, but very difficult to obtain. It is also very important to continuously monitor the “health” of the Grid (and to have it displayed on a website). Many of the experiences with building large-scale production grids in the UK should be highly relevant to Japan. Further, there should be many opportunities in the near future to exchange opinions and policies with respect to production grids, or to cooperate on the development, benchmarking, and optimization of management and monitoring tools for production grids.

• Actions: Send Level 2 Grid test suite to ApGrid and NAREGI; exchange documents and draft policy proposals related to production grids; send ApGrid documentation
• Resources: Initially information exchange only
• Means: At the initial stage essentially exchange of information, but consider a joint project workshop on production grids (experiences, monitoring tools, user management, etc.)
• Steering: Rob Allan (CCLRC Daresbury), Stephen Pickles (CSAR), Yoshio Tanaka (AIST/GTRC), Satoshi Matsuoka (Tokyo Institute of Technology)

4.3.2 IPv6 Protocol

The UK and Japan are already working together on IPv6 issues within the GGF IPv6 Research Group. This work will continue for some time, and there should be good opportunities for joint experimentation or verification. Eventually, this may lead to experiments between the UK and Japan using GT2 and/or UPL.
• Means: GGF research group, video conferences, develop relationship between GGF and the WIDE project
• Target: 12-18 months
• Steering: Peter Clarke (UCL), Satoshi Sekiguchi (GTRC)

4.4 Middleware

Eventually, most of the opportunities for collaboration between the UK and Japan will be in the development and testing of grid middleware. While numerous opportunities exist, only a few specific projects that were discussed at the workshop are mentioned in the following.

4.4.1 GridMPI

GridMPI is a latency-aware MPI implementation developed by the University of Tokyo and AIST. It was suggested to use the available bandwidth to jointly undertake application tests with GridMPI or related MPI implementations, such as PACX-MPI. On the European side, application experiments could be undertaken jointly with the Distributed European Infrastructure for Supercomputing Applications (DEISA), a consortium of various centers in Europe that aims, among other things, at building a virtual 148-node single-system-image SMP cluster.

• Actions: Test implementation; need to select application code
• Resources: No specific resources needed
• Means: Discussion within existing meetings, direct interaction
• Targets: Demonstration within the next 12 months?
• Steering: David Henty (EPCC), Neil Stringfellow (University of Manchester), Yutaka Ishikawa (University of Tokyo), Hiroshi Takemiya (AIST/GTRC)

4.4.2 Grid Virtual Machines

Grid Virtual Machines were mentioned as another area for possible collaboration between Japan and the UK, although no details were discussed.

• Steering: David Snelling (Fujitsu Laboratories Europe), Kenji Kohno (UEC)

4.4.3 Scheduling and Resource Brokering

Scheduling remains an essential part of any Grid middleware, and an area where much work still needs to be done. In the UK, scheduling tools are developed within the GRIP program, a European initiative that builds on the EuroGrid broker.
In Japan too, there are numerous activities on scheduling, and development efforts are intensifying. Eventually, the aim should be a unified approach in which components are essentially interoperable. Further, this is an area where joint development efforts are both conceivable and desirable, and Fujitsu Laboratories Europe is already involved in some of the Japanese efforts.

• Actions: UK Scheduling Conference, Edinburgh, October 21/22; exchange information and personnel among NAREGI/WP1, the UK Markets Project, and EuroGrid/GRIP
• Resources: Funding within existing programs
• Means: Dedicated meetings, direct communication, exchange of personnel
• Target: To be defined
• Steering: Dave Snelling (Fujitsu), Jon MacLaren (University of Manchester), Satoshi Itoh (AIST/GTRC), Kento Aida (TiTech), Yuuji Iguchi (Fujitsu)

4.4.4 Middleware User Applications

Tools to build Problem Solving Environments on the Grid are in development both in the UK (e.g. within RealityGrid) and in Japan (e.g. at GTRC). Such tools cover a variety of functions, such as workflow management, visualization, computational steering, or remote procedure calls.

• Steering: Kerstin Kleese van Dam (CCLRC Daresbury), Satoshi Itoh (AIST/GTRC), John Brooke (University of Manchester), Hitohide Usami (Fujitsu), David Henty (EPCC)

4.5 Data

Investigators in Japan and the UK are involved in the OGSA-DAI process, and we believe there is a host of opportunities for joint work related to the OGSA-DAI and OGSA-DQP standard development process, as well as to OGSA specifications for web data. Such activities are likely to include both short-term technology development and integration and more long-term research. Given the importance of standardization in this area, strong cooperation between the UK and Japan that also includes long-term R&D is likely to benefit standardization processes, and the respective roles of the two countries therein.
• Registries: In the near future, the grid is likely to become a fundamental part of the semantic infrastructure. Database registries will be part of this semantic grid, and designing them is not trivial. At present there are no dedicated efforts in either country to build such registries, which should eventually be global in nature.
• Portals: There has been much interest recently in the automated generation of portals for data access, and various groups in Japan and the UK as well as the US are actively working on this issue. Also, as the number of databases available on-line increases, there is an increasing need to support semantic searching to find databases that contain interesting information.
• Binary Data: The querying of binary data through XML queries and the XML description of structured data were mentioned as an important area where much further work is needed. No specific action items were mentioned on this subject, though.

There is also a host of opportunities for more straightforward collaboration on databases in specific areas, such as the life sciences.

• Actions: Many opportunities; need to work out proposal
• Resources: Consider long-term R&D cooperation
• Means: Various existing meetings
• Steering: Paul Watson (University of Newcastle), Malcolm Atkinson (NeSC), Isao Kojima (AIST/GTRC)

4.6 Applications

Application-related collaborative efforts are usually the most difficult to manage, since they involve other interested parties: the scientists who “provide” the problem to solve, or the corporate clients who want to protect their interests. Nonetheless, cooperation in the applications field can also be extremely rewarding.

4.6.1 Scientific Computing and HPC

Scientific applications remain an important driver for grid research, yet building successful collaborations with computational scientists (or experimentalists) is no easy task. Building application-centered international collaborations is even more demanding.
High-energy physics is a somewhat special case, since the high-energy physics community has been running large-scale international collaborations for several decades. Also, in anticipation of a new generation of experiments, the high-energy physics community has in fact been a driver behind large-scale international collaborations on grids (such as the Gfarm project, where collaboration involving UK and Japanese scientists already exists). In astronomy, the situation is somewhat similar. By contrast, in life sciences, international collaborations and large-scale projects have provided opportunities for international cooperation on data analysis, data integration, or simulation. In medical research, data sharing is all but impossible, and even in the life sciences field, the integration of various sources of genomic, proteomic, or metabolomic data at the national level remains a rather elusive goal in many countries, including Japan.

While several possibilities for cooperation were specifically mentioned (such as middleware development for LQCD within the International Lattice QCD Data Grid, or cooperation on grid-enabled MD codes such as NAMED and REX-MS), precise targets for collaboration efforts would still need to be selected, apart from a few existing efforts in particle physics and astronomy. This is especially true for the life sciences field.

• Actions: Further explore interests and possibilities for cooperation projects in various topical areas
• Steering: Peter Clarke (UCL), Peter Coveney (UCL), Mr. Oishi (NAO), Mitsuhisa Sato (University of Tsukuba), Masaaki Kawata (AIST/GTRC), Richard Kenway (University of Edinburgh), Steve Oliver (University of Manchester), Shinji Shimojo (Osaka University), Osamu Tatebe (AIST/GTRC)

4.7 Business Applications

As part of the UK e-Science project, a number of commercial projects have been funded.
In Japan, business-related projects are starting now with a new project financed by the Ministry of Economy, Trade and Industry (METI) and with involvement from Japanese IT vendors. While direct cooperation on commercial projects may be difficult, since such projects typically involve industry co-funding, sharing experiences with commercial grid projects is important in emerging markets. The following areas were identified as promising:

• GridCast: the use of Grids in broadcasting and, more generally, the handling, distribution, and storage of digital video files
• Commercial computing: application of grids in financial and commercial computing, such as derivative analysis or billing

In specific cases, direct cooperation involving industry actors may also be possible. At a somewhat more advanced stage, joint evaluations of commercial implementations and an assessment of the Total Cost of Ownership (TCO) of grid technology might also be possible.

• Actions: Define framework for joint evaluation of commercial grid projects; evaluate other opportunities for cooperation
• Resources: Direct communication
• Steering: Ron Perrott (University of Belfast), Satoshi Itoh (AIST/GTRC)

4.7.1 Grid ASP

Tools for building Grid ASP portals will be increasingly important for business users. Such tools include functionality for single sign-on, job submission, accounting, and security. Grid “economics” and the development of pricing models are essential for ASP services. In the UK, there exists a common currency for computing resources, but there is no unified model for costing and pricing. In Japan, economic models and approaches will be developed within the Business Grid project. Given the complexity of the issues surrounding accounting, exchange of experiences and information, as well as exchange of personnel, should be highly beneficial for both sides.
• Steering: Satoshi Itoh (AIST/GTRC), Ron Perrott (University of Belfast)

4.7.2 Grid Security

There exists a host of other topics where cooperation and exchange of experiences between the UK and Japan would be desirable and highly welcome, notably in the area of security, including data security related to medical data, role-based access control, and all issues related to Certification Authorities. While no specific projects or action items were identified, we believe that cooperation in the area of security and certification is especially important and, in fact, inevitable for all activities that go beyond demonstrations and test beds and include real production grids or application work.

5. What Next?

Cooperation does not come easily, and it rarely works without shared interests, mutual recognition of each other’s work, and (perhaps most importantly) the perception of benefits that are both mutual and real. Judging from discussions and comments at the UK-Japan meeting, there are a number of areas where all three conditions are already fulfilled.

The demonstration of very high-speed data transmission (e.g. by participation in the “Bandwidth Challenge” at the US SuperComputing Conferences) as well as demonstration experiments with GridMPI constitute another area where cooperation seems straightforward and relatively easy to achieve. There is a host of other areas in middleware development where cooperation, if not necessarily joint development, seems relatively easy. Scheduling is clearly an area where cooperation between the UK and Japan, involving various scheduling efforts in Japan, scheduling work within the GRIP project, and Unicore-related efforts, seems very straightforward and of considerable mutual benefit. Scheduling is intimately linked with issues such as Grid economics, and thus is likely to open up related opportunities for joint research and development.
In the case of data, there are several opportunities to build joint efforts around the OGSA-DAI and OGSA-DQP specifications, the DAIT projects in the UK, and data-related efforts at the Grid Technology Research Center and other organizations in Japan. Standardization, at least ideally, should be driven by long-term research, and this appears to be an area where cooperation between the UK and Japan could have an impact beyond research alone. Joint engagement in the development of standards in this area, initially via the DAIS WG at GGF, has already begun. It will benefit from increased investment as well as from the background research.

But there are also areas where the mutual benefits are somewhat unclear and where there is still uncertainty about what the other side is “really doing”. For example, the mutual sharing of compute resources is never easy, and certainly not for centers that are already operating well beyond capacity, such as HPCx. For the Japanese side, the same is true, even if hardware spending within grid efforts appears to be considerably higher in Japan than in the UK.

As concerns applications, further discussion among investigators in the various areas mentioned at the meeting (such as life sciences or mesoscale/nanoscale simulations) is needed. For example, it was mentioned at the workshop that “life sciences” is far too broad and diffuse a category to be of much use: are we talking about integrating various genomics/proteomics databases, about the simulation of biomolecule complexes, about imaging, or about clinical trials for new pharmaceuticals? Still, applications are important, and if it is true that, according to David Snelling’s “hype” curve of the grid, a downturn is now imminent, identifying applications with either a high scientific or a high commercial value may well turn out to be crucial for successful collaboration efforts between the UK and Japan.
In any case, it is now up to participants at the meeting to link up, exchange information, exchange code, discuss joint projects, or refine proposals. Conferences and events like GGF provide ample opportunity to do so. It is unclear at this point what kind of funding is available to support UK-Japan collaborations, but funding is perhaps not the limiting factor. Also, during preparatory stages, there is much that can be done without explicit funding.

A second UK-Japan workshop is planned to take place in Japan in about 18 months. This should provide enough time to develop at least some of the project ideas listed in this document. That workshop will include focused discussions around those developments.

Appendix 1: Workshop Program

Thursday October 2
19:00 Finger Buffet

Friday October 3
8:30 - 9:20 Breakfast
9:20 - 9:40 Introduction, scene setting etc. (chair Ron Perrott)
    Tony Hey (Malcolm Atkinson may have to substitute)
    Satoshi Sekiguchi
9:40 - 11:00 Grid Activities: Surveys & Reports, 4 talks (chair Ron Perrott)
    Peter Coveney: The RealityGrid: A Survey
    Satoshi Matsuoka: Towards a Petascale Research Grid Infrastructure in Japan
    David Henty: Exploiting Terascale Supercomputers: Experiences from HPCx
    Yoshio Tanaka: ApGrid: An Asia Pacific Partnership for Grid Computing
11:00 - 11:15 Coffee Break
11:15 - 12:35 e-Science and HPC (chair Satoshi Matsuoka)
    Yutaka Ishikawa: GridMPI: A Novel Latency-Aware MPI Implementation
    John Brooke: Resource Brokering on Complex Grids
    Hiroshi Takemiya: Developing Scientific Applications Using Standard Grid Middleware
    Steven Newhouse: ICENI: An Integrated Grid Middleware to Support e-Science
12:35 - 13:30 Lunch
13:30 - 14:50 Standard Middleware (chair Richard Kenway)
    Dave Snelling: Beyond OGSI
    Paul Watson: A Grid Data Integration Service
    Satoshi Itoh: Grid ASP Portals and the Grid PSE Builder
    Peter Clarke: Networking Infrastructure and Network Projects in the Context of E-Science in the UK
14:50 - 16:10 Life Sciences and Chemistry Applications (chair Isao Kojima)
    Derek Hill: Grid Technology and Medical Imaging
    Mitsuhisa Sato: Drug Discovery by Grid Technology
    Steve Oliver: Capture, Integration, and Sharing of Functional Genomic Data
    Masaaki Kawata: Molecular Simulations: Toward Grid Based Approaches
16:10 - 16:30 Coffee Break
16:30 - 17:50 Grids and Data (chair Steve Oliver)
    Malcolm Atkinson: Data, Data Everywhere.
    Isao Kojima: Grid Based Database Integration at AIST
    Ron Perrott: The Grid and Media: the Gridcast Project
    Richard Kenway: The International Lattice Data
18:30 Dinner

Saturday October 4
8:00 - 8:30 Breakfast
8:30 - 10:10 Grid Discussion Groups
    1. Infrastructure
    2. Data
    3. Applications
10:10 - 10:30 Break
10:30 - 11:45 Assimilation of Group Feedback (chair Malcolm Atkinson & Satoshi Sekiguchi)
    Discussion, reflections, etc.
    Opportunities for collaboration on applications, etc.
    Summary of meeting, action items, next steps

Appendix 2: Workshop Participants

Japanese Participants
Yutaka ISHIKAWA, The University of Tokyo, ishikawa@is.s.u-tokyo.ac.jp
Satoshi ITOH, Grid Technology Research Center (GTRC), satoshi.itoh@aist.go.jp
Masaaki KAWATA, National Institute for Advanced Industrial Science and Technology (AIST), m.kawata@aist.go.jp
Isao KOJIMA, Grid Technology Research Center (GTRC), kojima@ni.aist.go.jp
Satoshi MATSUOKA, Global Scientific Information and Computing Center (GSICC), Tokyo Institute of Technology, matsu@is.titech.ac.jp
Mitsuhisa SATO, University of Tsukuba, msato@is.tsukuba.ac.jp
Satoshi SEKIGUCHI, National Institute for Advanced Industrial Science and Technology (AIST), s.sekiguchi@aist.go.jp
Shinji SHIMOJO, Osaka University Cybermedia Center, shimojo@cmc.osaka-u.ac.jp
Hiroshi TAKEMIYA, Grid Technology Research Center (GTRC), h-takemiya@aist.go.jp
Yoshio TANAKA, Grid Technology Research Center (GTRC), yoshio.tanaka@aist.go.jp
Robert TRIENDL, triendl@gol.com

UK Participants
Malcolm ATKINSON, The National e-Science Centre, mpa@nesc.ac.uk
John BROOKE, North-West Regional eScience Centre, j.m.brooke@man.ac.uk
Peter CLARKE, Particle Physics Group, University College London, clarke@hep.ucl.ac.uk
Peter COVENEY, University College London, P.V.Coveney@ucl.ac.uk
David HENTY, EPCC, University of Edinburgh, dsh@epcc.ed.ac.uk
Derek HILL, KCL School of Medicine, Guy’s Hospital, derek.hill@kcl.ac.uk
Richard KENWAY, University of Edinburgh, R.Kenway@ed.ac.uk
Steven NEWHOUSE, Imperial College London, sjn5@doc.ic.ac.uk
Steve OLIVER, University of Manchester, steve.oliver@bso.man.ac.uk
Ron PERROTT, School of Computer Science, Queen's University of Belfast, R.Perrott@qub.ac.uk
David SNELLING, Fujitsu Laboratories of Europe, d.snelling@fle.fujitsu.com
Paul WATSON, North East Regional e-Science Centre, Paul.Watson@newcastle.ac.uk