The National e-Science Centre Report 2001–2002
e-Science Report

Contents

Director’s Report
NeSC Report
  Overview of NeSC
  Resources
  Activities
e-Science Institute Report
  Overview
  Events
    Opening
    Bluegene
    GGF5/HPDC11
  Visitors to eSI
  Plans
Projects
  OGSA/DAI
  Sun Data and Compute Grids (SunDCG)
  Gridweaver
  eDIKT
Manager’s Report
  Staff
  Finance
  Buildings
  IT infrastructure
Appendix
  Details of Events to 31 July 2002

Director’s Report

“e-Science is about global collaboration in key areas of science and the next generation that will enable it.”
Malcolm Atkinson, Director, National e-Science Centre

“e-Science will change the way in which science is undertaken.” 1

As Director of the National e-Science Centre, I have been privileged to work at the forefront of the UK’s national e-Science programme, providing leadership of UK-based research and promoting UK e-Science within the global community. It has been exciting for me to preside over a period of such widespread uptake and all-inclusive research development.

e-Science will provide technology which can transform research in any discipline. By combining the expertise of the world’s leading experimentalists, theoreticians and computing scientists, we will develop distributed computing systems that are capable of storing and analysing the ever-expanding volumes of data produced by today’s scientific researchers. Extensive global collaborations will become a possibility, advancing the capabilities of computing, communication and visualisation.

At the National e-Science Centre, we have led the UK in developing software to deliver a range of components for database access across the Grid.
We are also engaged in strategically crucial work at the Global Grid Forum (GGF), representing the UK’s interests at GGF working groups and participating in discussions which have led to the deployment of Grid-enabling software. The OGSA-DAI group in particular is working closely alongside its international counterparts to deliver Grid services to the wider scientific community.

Our e-Science Institute has become a focus for expertise in many scientific disciplines, drawing together the world’s leading researchers to realise the potential of e-Science. The focus of meetings and workshops has varied from research in a particular domain, such as the Blue Gene Protein Folding Conference, to metadata and workflow management, which are common to the wider spectrum of scientific disciplines.

For anybody who has witnessed the enthusiasm generated by new possibilities, and the cross-cultural knowledge integration we have fostered here at NeSC, it would be impossible to ignore the widespread impact and potential of e-Science. It is equally impossible to ignore the amount of work which NeSC has done in such a short time towards developing these possibilities and realising this potential. It has been deeply satisfying for me personally to preside over a period of such sustained growth, and as the Director of NeSC I look forward to many future challenges in e-Science and High Performance Computation.

Our Vision

The future of the National e-Science Centre is intertwined with the future of e-Science in the UK, Europe, and globally. We must play a leading role at all levels, responding to needs, researching methodology, and pioneering new technology. We will do this in conjunction with our five foundation departments, with our colleagues in the UK, and through our strong collaborative links with e-Science programmes in other countries.
We will continue to lead high-quality research, concentrating on the ongoing challenge of managing scientific data, supporting the scientists who use it and developing technology for analyses that combine leading-edge data resources and intensive computation. This research will be driven by demanding goals and dedicated operational resources: capability computing, high-volume storage and expert data curation. Our long-term commitment to specific projects will yield better understanding of data life-cycles, and the research patterns that they enable.

1 John Taylor, Director General of Research Councils, Office of Science and Technology.

We also understand that the long-term growth of e-Science depends upon engaging and enabling cohorts of new researchers. This requires that e-Science permeates post-doctoral, doctoral and MSc education within the next three years. NeSC should play a leading role in pioneering and publicising the relevant educational standards, in addition to training, which is already developing. Academic resources will be required. These should capitalise on existing centres of excellence at the National e-Science Centre and at the eight UK regional centres.

The future at NeSC must be holistic. We must combine our constituent ethos – experimentation, theory, computation – integrating existing infrastructure for major application research with our own research and outreach. We should address science of every scale; sciences with small, complex datasets are of equal importance to those with huge quantitative demands. Work done on a desktop PC should transfer fluently to the largest supercomputer when the need arises. All scientific research should be able to capitalise on the potential of e-Science.

The signing of the official agreement.
Front row left to right: Raye Brown, DTI; Tony Hey, DTI; Sir Stewart Sutherland, Principal of the University of Edinburgh; W G Hill, Dean of the Faculty of Science & Engineering, the University of Edinburgh.

NeSC Report

Overview of NeSC

The origins of NeSC lay in the UK government’s decision in 2000 to allocate £120M of public money to e-Science, stimulated by John Taylor’s vision of globally distributed computing and the ‘virtual organisation’. £35M was allocated to the UK e-Science Core Programme, to develop infrastructure, coordination and support for pilot research projects. The remaining £85M was distributed amongst the UK scientific research councils to develop Grid research relevant to UK science.

In March 2001, Tony Hey was appointed Director of the UK e-Science Core Programme. One of his first duties in office was to invite bids for the eight regional e-Science Centres, and one National e-Science Centre. On the 6th of June 2001, a consortium led by Edinburgh and Glasgow Universities won the right to inaugurate the UK’s National e-Science Centre.

NeSC officially came into being on the 1st of August 2001, in temporary accommodation at the Universities of Edinburgh and Glasgow. The location for the e-Science Institute, a converted church at 15 South College Street, Edinburgh, was contributed by the University of Edinburgh.

The National e-Science Centre has defined its mission as: ‘To stimulate and sustain the development of e-Science in the UK, to contribute significantly to its international development and to ensure that its techniques are rapidly propagated to commerce and industry.’ To deliver this mission it has set up an infrastructure, established the e-Science Institute and developed a working programme, as described in this report.

The Structure

The National e-Science Centre presents an integrated external image. Internally it is quite a complex organisation.
Two Universities

In May 2001, the Principals of Edinburgh and Glasgow Universities recognised their growing collaboration in a number of areas of e-Science. As e-Science frequently involves multi-organisation collaborations, it was agreed that the two Universities could substantially accelerate their uptake and development of e-Science by working together. Two collaborative projects that pre-date NeSC are:
• Scottish Centre for Genomic Technology and Informatics
• ScotGrid: an investigation of a Tier2 Data Centre for Particle Physics Experiments

These are both funded by the Scottish Higher Education Funding Council, which has also provided £2.3M of funding under the SRIF scheme to improve communication infrastructure for e-Science, including connections between the Universities.

After the award of the National e-Science Centre to the two Universities, they developed a Collaboration Agreement to manage the project. They also jointly appointed Malcolm Atkinson, previously a Professor in Computing Science at Glasgow, and seconded him to be the Director of NeSC.

The e-Science Institute

The e-Science Institute is more than a building to house the activities of NeSC in Edinburgh. It is a national resource, holding a range of Research, Community Building, and Training events as well as hosting International Collaboration meetings. It also runs an international visitors programme, which is intended to establish eSI as a recognised centre for collaborative research. The resources and activities of the e-Science Institute are described in a dedicated section of this report.

NeSC Projects

One of the main goals of NeSC (as with the regional e-Science Centres) is to develop and manage a wide portfolio of collaborative projects with industry, focused on developing middleware and application solutions in the e-Science arena. In the first instance, funding for these projects has come from the DTI/EPSRC Grid Core Programme.
A grant of £3M over three years has been made available to NeSC for this purpose. The NeSC Grid Core Programme (GCP) funding is designed to fund collaborative projects with industrial and commercial partners to develop open core Grid middleware for the benefit of the project partners and the wider e-Science community. The GCP funding from DTI/EPSRC pays the full costs of the NeSC partners involved, matched by ‘in-kind’ contributions from the project’s industrial and commercial partners. In-kind contributions can be anything from cash, to equipment, to software licences, to company effort.

The Scottish Higher Education Funding Council (SHEFC) has also funded a project at NeSC. Called eDIKT, this project will link leading-edge research into data management techniques with the needs of working e-scientists. The intention is for the eDIKT team to be a catalyst. By identifying problems faced by application scientists, they will provide challenges to computer scientists, and by identifying promising techniques in computer science, they will provide better solutions to the application scientists. The eDIKT team will aid the process by providing software development expertise and capacity.

Five Foundation Departments

NeSC is built on the existing success of five departments at the two Universities: Physics and Astronomy, Edinburgh; Physics and Astronomy, Glasgow; EPCC, Edinburgh; Computing Science, Glasgow; and Informatics, Edinburgh. These foundation departments jointly proposed and established NeSC.

Physics and Astronomy, Edinburgh

Richard Kenway leads the UKQCD Consortium, which is developing a QCD machine on a chip (QCDOC), in a £6.6M collaboration with Columbia University. This architecture is an antecedent of IBM's BlueGene/L computers, which will be announced at Supercomputing 2002. The project will install a 10 Tflop/s peak machine at Edinburgh in 2004 and this will be integrated with the UKQCD Grid which is being developed as part of the GridPP project.
The UKQCD Grid is already in operation, providing data management services for the Consortium. It is planned to evolve into a computation Grid for QCDOC data analysis over the next two years.

Andy Lawrence leads the AstroGrid pilot project and the UK's role in several other international projects to develop a virtual observatory, where data from multiple parts of the spectrum and from different times can be conveniently used by astronomers. This builds on substantial experience building and managing large digital sky surveys at the Institute for Astronomy, housed in the Royal Observatory Edinburgh.

Physics and Astronomy, Glasgow

Tony Doyle leads GridPP, which is developing an operational Grid across the UK in a PPARC-funded £17M three-year project. Middleware developments are being pursued within the EU DataGrid project, where Gavin McCance plays a leading role and is Deputy Manager for the Grid Data Management Work Package, and Paul Millar is the Grid Data Management member of the Integration Team. US-based work at Fermilab is being co-led by Rick St Denis for the CDF experiment's Data Handling Group.

Within the Department, the focus is on meeting the requirements of particle physics experiment, theory, bioinformatics, Grid data management and information retrieval users. This is being performed as part of the JREI SHEFC-funded ScotGrid project to establish a Tier-2 Computing Facility in Scotland, meeting the petabyte-scale requirements of LHC Computing using Grid technology. This builds upon substantial experience building and managing large data servers within the Glasgow Particle Physics Experiment Group.
Tony Doyle, University of Glasgow

EPCC, Edinburgh

EPCC has over 60 staff and 12 years' experience in: running the high-end parallel computers required to address the challenging computations arising in e-Science; research in fundamental novel computing techniques; and training in HPC methods and technology transfer to industry and commerce. EPCC is taking the leading role in managing the new UK supercomputer project, HPCx, a six-year £54M contract, which will deliver one of the world’s ten fastest computers for UK science.

EPCC has a well-established capability in commercialisation and an extensive involvement in European Programmes. These currently include a major role in preparing projects for Framework VI, and EPCC leads two Grid-related Framework V projects:
• GRIDSTART has the specific objective of consolidating Grid technical advances in Europe, encouraging interaction amongst similar activities both in Europe and the rest of the world, and stimulating the early take-up of this technology by industry.
• ENACTS co-ordinates the activities and planning of 14 supercomputer centres and numerous user groups across Europe in Grid and HPC, with the aim of distilling best practice and providing a roadmap for the EC.

Computing Science, Glasgow

Computing Science at Glasgow has focused primarily on developing applications, in collaboration with scientists. For example, Paul Cockshott leads a number of projects involving 3D imaging. These include collaborative research with the cardiac MRI unit on developing techniques to give animated cardiac images showing areas of disease; compression of microscopy images; and 3D television studios. He is also the principal author of a collaborative project on the use of the Grid to support a virtual organisation with the computer-generated animation company Pepper’s Ghost.

David Gilbert leads the newly established Bioinformatics Research Centre, which brings together researchers in computer science, mathematics, and the life sciences.
At Glasgow he leads numerous projects, including a £5.5M Wellcome Trust project on Cardiovascular Functional Genomics with the focus on hypertension. He is Principal Investigator, together with Muffy Calder, in a large DTI-funded Beacon Project (under the Harnessing the Genome Programme) concerned with developing a software tool for analysing biochemical pathways, and is leading research into formalisms and techniques for modelling and analysing concurrency in cell signalling pathways.

Arthur Trew, Director of EPCC

Informatics, Edinburgh

The School of Informatics was one of only six computing departments in the UK to have obtained a 5* ranking in the 2001 UK Research Assessment Exercise. It returned the highest number of research-active staff and was the only 5*A department. It contains world-class research groups in the areas of theoretical computer science, artificial intelligence, cognitive science and bioinformatics, among others.

Professor Don Sannella leads a team investigating the use of proof-carrying code to enforce resource guarantees for mobile code. This should be very useful in a Grid context, for ensuring that services meet their advertised behaviour.

Paul Anderson leads the team that developed the LCFG configuration management system. This is used by the European Data Grid and is being enhanced to meet their needs. He also leads the ‘GridWeaver’ proposal, in conjunction with Hewlett-Packard, to develop this technology for managing Grid systems.

Members of the school are active in two recently approved Interdisciplinary Research Collaborations: Dependability, and Advanced Knowledge Technologies.

Our First Year

The National e-Science Centre has had a productive and exciting first year. We began in temporary accommodation, building a programme in several small steps. By March 2002 we had a dedicated building, fitted out to our requirements, and an event programme that was already more active than we had envisaged.
The year culminated with two major events: the official opening by the Chancellor of the Exchequer, Gordon Brown, and the largest meeting to date of the Global Grid Forum.

Beginning NeSC

During August 2001 the first meetings were held at the e-Science Institute. The UK e-Science Directors’ Forum initiated regular meetings between the directors of the nine e-Science centres, the UK Grid Support Centre, and the e-Science Core Programme Directorate. The Grid Users’ Meeting illustrated many common interests across the sciences. The workshop on Databases and the Grid discussed a paper by Paul Watson and led to the launch of the UK Database Task Force.

In October we attended the Global Grid Forum 3 in Rome and began discussions with Ian Foster, leader of the Globus project, on the potential, for Grid architecture, of the distributed computing model then emerging as Web Services. Steve Tuecke, chief architect of the Globus development team, flew in and energetically launched our training programme with an intensive tutorial on Globus Toolkit 2. Our ‘Getting Going with the Grid’ workshop was the first of our workshops designed to establish collaboration in the UK e-Science community. November saw a number of meetings relating to particular fields of research, such as quantum chromodynamics, experimental particle physics and astronomy.

Steve Tuecke during the Globus Toolkit 2 tutorial.

Building our Programme

In early December the Scottish Higher Education Funding Council awarded us a Strategic Research Development Grant of £2.4M for the eDIKT project, to devise novel software tools for the analysis of existing data. We were also privileged to welcome Ian Foster to the Centre for an extended meeting of the UK Architectural Task Force. This allowed us to develop the newly proposed Open Grid Services Architecture (OGSA).

January 2002 saw us hosting a four-day Globus workshop for UK researchers.
Steve Tuecke led an excellent four days’ tuition, in conjunction with the European Data Grid and his Globus colleagues. This was immediately followed by a meeting of the six e-Science pilot projects funded by the EPSRC.

On to February, and the OGSA design effort was publicly announced at GGF4, whilst the UK Database Task Force led to a successful ‘birds of a feather’ session. We started work on our first major centre project, OGSA-DAI. This is a collaboration with IBM, Oracle, and the e-Science Centres in Newcastle and in Manchester. We produced a UK e-Science technical paper about this work.

Our events and systems support teams heroically prepared for the Blue Gene workshop in just two days. This was a joint venture with IBM, and a stimulating international research meeting on protein science was the result. There followed an intensive month of meetings, including a week of hands-on training workshops for e-scientists pioneering the use of web services and OGSA.

NeSC Officially Open and Operational

On the 25th of April 2002, the National e-Science Centre was officially opened by the Rt Hon. Gordon Brown, Chancellor of the Exchequer. More than 200 of the UK’s leading researchers, politicians and journalists attended the opening, which consisted of an intensive programme of discussion and demonstration, showing the breadth and vitality of the UK e-Science Community.

Freddie Moran of IBM.

June 2002 saw a UK first, with the hands-on training workshop on DiscoveryLink that IBM delivered in partnership with NeSC. This was the first time a UK company had addressed the UK e-Science Community at the e-Science Institute. Another UK first quickly followed, in the form of an international collaboration meeting with Chinese e-Scientists and Bioinformaticians.

July was a month like no other.
We hosted GGF for its first meeting in the UK, in conjunction with the 11th Symposium on High Performance Distributed Computing, at the Edinburgh International Conference Centre. GGF5 was spectacularly well received, attracting over 900 delegates as opposed to 400 at GGF4 in Toronto. The two conferences ran in tandem with a number of co-located workshops and meetings, for two intense weeks of concerted activity, at the height (and vibrancy) of Edinburgh’s tourist season. Our dual site was used to advantage, with the Grid Applications workshop and SUN High Performance Computing consortium being held in Glasgow.

Dave Pearson of Oracle.

Resources

Web Information System

NeSC provides a Web Information System for all e-Science activities in the UK. It includes summaries of all e-Science projects and events, with links to the regional e-Science centres and the Grid Support Centre. It also provides a news service, and a secure area for use by the UK e-Science directorate.

When considering the basis for Web design, we decided early in the process to concentrate on the content rather than the style, and agreed a number of criteria:
• The Web would largely be a repository of information for the UK e-Science community.
• Platform/browser independent (as far as possible)
• Easy to adapt
• Navigation pages fit on screen
• Easy to navigate/search on every page
• Easy to maintain (including by admin staff)
• Each page to offer opportunity for feedback
• A secure area

Central to making the content easy to maintain was our decision to drive the Web pages from a database. Information could either be fed in by Web forms (e.g. registration for events) or entered directly into the database by administrative staff. Information in the database would then be used to generate dynamic (in response to queries) or static web pages, and for management purposes. We opted for SQLServer as the database, with Cold Fusion as the application server to the Web pages.
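The database-driven approach described above can be sketched in a few lines. This is a minimal illustration only, not the NeSC implementation: it uses Python and SQLite as stand-ins for Cold Fusion and SQLServer, and the table names, column names and function names are all hypothetical.

```python
import sqlite3
from html import escape


def make_db():
    """Create an in-memory database with hypothetical event tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE registrations ("
        "event_id INTEGER NOT NULL REFERENCES events(id), "
        "email TEXT NOT NULL, "
        "surname TEXT NOT NULL, "
        "UNIQUE (event_id, email))"
    )
    return conn


def register(conn, event_id, email, surname):
    """Store one web-form submission; a repeat submission is ignored."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO registrations VALUES (?, ?, ?)",
            (event_id, email, surname),
        )


def render_event_page(conn, event_id):
    """Generate a small HTML fragment for one event from the database."""
    row = conn.execute(
        "SELECT title FROM events WHERE id = ?", (event_id,)
    ).fetchone()
    if row is None:
        raise KeyError(f"no such event: {event_id}")
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM registrations WHERE event_id = ?", (event_id,)
    ).fetchone()
    return (
        f"<h1>{escape(row[0])}</h1>\n"
        f"<p>Registered delegates: {count}</p>"
    )
```

The same stored rows serve both the generated page and any management query, which is the maintenance benefit the design aims for: content changes in one place, and the pages follow.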
SQLServer has the advantage that it connects to Microsoft Access via ODBC in an error-free way, allowing admin staff to utilise the user-friendly and flexible querying features of the Access front end. Cold Fusion was a known technology in Glasgow Computer Science, and so we were able to implement web event registration, in particular, very quickly.

Both the database and the hierarchical structure of the web pages underwent a formal design process. Both have grown out of all proportion to what was originally intended. The database originally had 17 tables; it now contains 50, so that it is no longer possible to show the schema in a readable format on A4. Because of the OO nature of the database, it has been possible to extend it in a logical way to accommodate the increasing demands for services.

The web pages currently offer the following information:
• Information about NeSC and links to other e-Science centres.
• Front page News flash and archive (database driven). News items date and archive automatically.
• Web-based registration for events and to join the mailing list. (Note that having registered once, users need only enter their email address and surname to register for future events.)
• Structured archiving of material from previous events.
• Web-based applications for GridNet and the Visitor Programme.
• Areas for groups such as Grid Network Team (GNT) to publish information.
• The UK e-Science programme publishes a technical report series, and these are presented on the Web.
• A secure SSL website for management information for the e-Science Core Programme.
• A web system for Core Programme staff to manage e-Science projects for the UK.

Access Grid

The Access Grid is a collaboration tool for distributed groups. It can be used for meetings, brainstorming sessions, seminars, and similar activities.
Each node includes microphones and cameras to transmit sounds and images of the presenters, and a large display area to show images from other sites. In addition, the display can show computer-based material, allowing collaborative development of documents or other software. For more information on the Access Grid technology, see the Grid Support Centre web pages at www.grid-support.ac.uk/

NeSC currently runs two Access Grid nodes, with a third planned:
• Edinburgh Parallel Computing Centre
• e-Science Institute, Edinburgh
• Glasgow Computer Science Department

Each node uses two Windows machines to manage the display, and two Linux machines for video and audio capture.

The EPCC node was set up in November 2001. It uses three back projectors and four pan & tilt cameras. The node’s four PCs are connected directly to the University’s 100MB multimedia backbone network, and from there out onto JANET.

Access Grid demonstration of Telemedicine.

The e-Science Institute (eSI) node was set up in April 2002. It uses three ceiling-mounted projectors, four pan & tilt cameras, and four microphones. The node’s four PCs pass data to one another via a 100MB network, which is connected directly to the same backbone as the EPCC node.

A Glasgow node is planned, and should be ready in September 2002. It will run on four dual-processor IBM IntelliStations. It will have three ceiling-mounted projectors and two pan & tilt cameras. The node’s four machines will pass data to one another via a Gigabit switch, which in turn hooks into the department’s 100MB backbone, and from there to the Campus and out onto JANET.

We have used the Access Grid for regular cross-site meetings, to join meetings of the UK Engineering Task Force, to broadcast events to the rest of the UK, and for various other activities. We plan to extend the use of the Access Grid in the future.
To this end, we plan to purchase a portable Access Grid node that can be deployed in different locations. This will add more flexibility to the possible use of the Access Grid, for example by allowing larger meetings to join the Grid, or small meetings to use the Grid while the main room is used for other activities.

Research Computing Facilities

NeSC has a small number of machines for research use, in addition to the computing infrastructure for the e-Science Institute and the equipment for running the Access Grid. In the future we plan both to increase the number of machines at NeSC and to use Grid technology to link other computing facilities in the two Universities (and other neighbouring institutes).

Glasgow

In Glasgow, the e-Science team has at its disposal a Power3-based IBM PSeries running AIX 5.1, a dual-processor IBM IntelliStation, a 4-processor IBM Netfinity server running Linux, and a 4-processor Sun Enterprise450 running Solaris 7. The Netfinity and IntelliStation have been generously loaned by IBM, together with over 0.5 Terabytes of disk space. Memory ranges from 512MB to 4GB on the AIX machine, and most of the machines have Gigabit Ethernet cards, providing the capability of fast communications between machines via a local Gigabit switch. This affords a good mix of hardware and Operating Systems and gives us experimental flexibility.

We have installed version 2.2 of the Globus Toolkit on the Sun e450. We are currently in the process of linking this machine to other Globus-enabled machines, which will enable us to send and receive jobs to and from remote sites. Locally, we run Condor and Sun Grid Engine job scheduling software.

In addition to these machines there are numerous other supercomputers and clusters located around the University of Glasgow.
For example, the Department of Electronic and Electrical Engineering has an 8x SGI Origin 2000 processor; a 30x Sun Enterprise processor; a 32x SGI Origin 300 processor; and a 40x IBM pSeries 640.

Edinburgh

Currently, the main research machine at Edinburgh is an IBM Netfinity 7600 on loan from IBM. This machine has four processors, 7GB RAM, and 1TB of disk, and runs Linux. This is used for a variety of purposes.

For collaborative projects, we have access to other facilities at Edinburgh. Foremost among these is the Edinburgh Parallel Computing Centre, one of the founding departments of NeSC. The main EPCC machine is Lomond, a Sun e6800 running Solaris.

In the forthcoming year, we expect to have an IBM p690 ‘Regatta’ server, donated by IBM from their Shared University Research Programme. This machine is likely to have 16 processors, 128GB RAM and 2.1TB of disk.

Our plans include a machine for the eDIKT project. This machine will be funded from that project, and will be put out to tender. We are also planning a Storage Area Network that will link a range of research machines.

Extending the network

ScotGrid is an £800k centre in Scotland for the analysis of data primarily from the ATLAS and LHCb experiments at the Large Hadron Collider. The centre currently consists of a 128-CPU Monte Carlo production facility run by the Glasgow PPE group and a 5TB datastore and associated high-performance server run by Edinburgh Parallel Computing Centre.

SHEFC have funded a 1Gb/sec e-Science network via the SRIF programme. This connects all the NeSC sites, and will enable Grid-related research to use dedicated high-bandwidth connections. The ScotGrid project is also planning to use this network to connect its machines.

This network will form the basis of our integration with other computing facilities in the two Universities. It will also include neighbouring institutions, such as the MRC’s Human Genome Unit.
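As a rough illustration of what a dedicated 1Gb/sec link means for a datastore of the size mentioned above, the following sketch estimates the time needed to move 5TB at two nominal link speeds. It is a back-of-envelope calculation under stated assumptions: decimal units (1TB = 10^12 bytes), the full nominal bandwidth available to a single transfer, and no protocol overhead. Real transfers would be slower, and the function name is purely illustrative.

```python
def transfer_hours(size_terabytes, link_megabits_per_second):
    """Idealised transfer time in hours, ignoring overheads and contention."""
    bits = size_terabytes * 1e12 * 8           # decimal terabytes to bits
    seconds = bits / (link_megabits_per_second * 1e6)
    return seconds / 3600.0


if __name__ == "__main__":
    for mbps in (100, 1000):
        hours = transfer_hours(5, mbps)
        print(f"5 TB over {mbps} Mb/s: about {hours:.0f} hours")
```

Under these assumptions the 5TB datastore would take roughly 11 hours to copy at 1Gb/sec, against roughly 111 hours (more than four days) at 100Mb/sec.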
Activities

This section describes the activities of the National e-Science Centre, except for the e-Science Institute and the NeSC projects. These activities fall into three main areas: Grid Engineering, Research, and International Representation.

Grid Engineering

We use the term ‘Grid Engineering’ to cover our contribution to setting up the UK e-Science Grid, support activities for users of this (and other) Grids, and the general development of Grid middleware.

Contributions to the UK Engineering Task Force

The Engineering Task Force (ETF) of the UK Core Grid Programme consists of representatives from all the e-Science centres in the UK. The ETF co-ordinates Grid engineering activities, and provides a forum where the e-Science centres can learn from each other. Currently the ETF has an ongoing project to build a single, connected Grid for the UK e-Science community. A so-called ‘Level 1’ Grid is establishing basic connectivity between distributed sites.

NeSC has contributed in several ways to the UK ETF over the past year. We have built a Level 1 Grid at both Edinburgh and Glasgow. This was based on version 2.0 of the Globus Toolkit. At Edinburgh, EPCC’s main high-performance computing server, a Sun e3500, was connected to the Grid. At Glasgow, the Grid was installed on a 4-CPU Sun e450.

In addition to installation and development work, Stephen Booth chaired the ETF working group on firewalls, and wrote the ETF report. NeSC members also contributed to other ETF reports. NeSC provided prototype software combining the Globus GSI security model with web service support, which has been used both as a basis for OGSI implementation work and by other projects that needed this functionality.

Contributions to the UK Grid Support Centre

The Grid Support Centre (GSC) of the UK Core Grid Programme provides resources to help e-Science projects to set up grid infrastructure.
The GSC runs a central web site (www.grid-support.ac.uk/), and funds several people to evaluate systems and support projects in their region. NeSC has one member of staff who is fully funded by the Grid Support Centre. As NeSC’s contribution to the GSC, we are developing expertise in installing and configuring the Globus toolkit on Solaris and AIX.

Middleware development

At NeSC, we are undertaking our own Grid engineering work, as well as contributing to the ETF and GSC. In particular, we are supporting our projects and those of our partners, particularly OGSA-DAI, eDIKT, and AstroGrid. These projects make substantial use of the newer Globus technology, based on OGSA, which will be implemented in version 3 of the Globus toolkit. Currently we are using preview releases of this technology. We are also evaluating other Grid technologies. An example here is the Giggle replication toolkit, which is being evaluated by the eDIKT project.

Currently we have a particular interest in creating test workloads for Grids. We plan to create test frameworks for a variety of real applications. These frameworks will record the success and performance of test runs. This will allow us to test and compare a range of Grid technologies and deployments. As with the rest of our Grid engineering work, we are particularly interested in OGSA-based applications.

NeSC has provided many workshops and courses on Grid technology, as described elsewhere in this report. We will continue this programme. In particular, we are planning several events based on OGSA and version 3 of the Globus toolkit.

International Representation

As the UK National e-Science Centre, NeSC has a particular role to represent the UK e-Science programme internationally. We perform this task in conjunction with the e-Science directorate and the national task forces, as well as the regional e-Science Centres. NeSC is fulfilling its international role through numerous avenues.
We have provided members of programme committees for Global Grid Forum meetings 4 to 7, and the High-Performance Distributed Computing conferences 11 and 12. We have also attended major conferences such as SuperComputing 01 and CANARIE 7. We have visited major research centres in the USA, including Argonne National Laboratory, Information Sciences Institute, NCSA and San Diego Supercomputer Centre, as well as meetings of American research projects such as NPACI.

We also take part in so-called ‘N+N’ meetings, in which a number of researchers from the UK meet a similar number of researchers from another country. As an example, more than thirty-five delegates, representing over twenty-four organisations in the UK and China, met at NeSC for an N+N meeting organised by the Biotechnology and Biological Sciences Research Council. Delegates from both countries met to discuss a variety of e-Science topics, including drug discovery, structural genomics and biodiversity.

[Photo: China N+N meeting in Edinburgh.]

Through the hosting of major events such as Global Grid Forum 5 and the 11th Symposium on High Performance Distributed Computing, the international visibility of NeSC as a hosting venue and centre of excellence in e-Science and Grid technologies has increased considerably.

NeSC has gained international recognition for its software development, especially through work such as OGSA-DAI. The OGSA-DAI work is also driving through critical standardisation activities (GGF Data Access and Integration Services Working Group) which will underpin future OGSA-based Grid applications. Members of the Globus team have visited NeSC and given courses.

We work with members of several European projects, including the European Data Grid and the European Virtual Observatory. We have also contributed to the European Framework programme via such projects as GridStart and ENACTS.
Research

We intend the National e-Science Centre to become a thriving research community, producing world-class research. We will host visiting academics, post-doctoral researchers, and PhD students. We will develop their e-Science skills, and extend global understanding of e-Science issues.

Our plan is to begin by building on research activity in the NeSC foundation departments, and on our existing links to other projects. We are keen to reach as broad a range of potential e-Scientists as possible. This activity will attract international visitors, who will in turn stimulate this research further.

We have local discussion groups on aspects of e-Science:
• Astronomical data mining
• Bioinformatics data curation

These groups enable researchers to share their knowledge of the state of the art, and provide forums to suggest new ideas to take the subject forwards.

In the forthcoming year, we intend to deepen our international links to strengthen our research capability. We already have a visit scheduled from Jim Gray of Microsoft Research in San Francisco. We will also submit proposals to the EU’s ‘Framework 6’ programme. NeSC will continue to contribute to international conferences and collaborations. Ultimately and fundamentally, we aim to forge international links by producing world-class research.

e-Science Institute Report

Overview

The original vision of the e-Science Institute was that it would host six two-week events a year. Reality has proven quite different as the graph here shows. So great has been the demand in the UK community for NeSC to host meetings, courses and workshops that eSI has held 49 events in its first year.

[Figure: Event Statistics: August 2001–July 2002]
[Figure legend: Delegates, Delegate Days and Events, by month from August to July]

Our doors were opened to more than 2600 participants from over 500 organisations who arrived to hear 236 speakers on every topic relating to the development of e-Science, including software tutorials, applications, projects, infrastructure, research and international collaborations.

eSI started running events within weeks of the award being announced in August 2001 and before it had even taken occupancy of the refurbished church in South College Street, which has been contributed to the project by the University of Edinburgh. October 2001 saw Steve Tuecke, who has since become a regular visitor to eSI, give his first Globus tutorial. At this point the conferences were managed manually while work was proceeding in parallel to develop an online registration system, without which the small team at eSI could not possibly handle the volume of visitors it does. This system first went live for the AstroGrid workshop in December and has continued to develop as the needs have expanded.

Steve Tuecke returned with Bill Allcock and Charles Bacon in January 2002 to give the expanded Globus tutorial. This was enormously popular and heavily oversubscribed, with all parallel sessions full, and much creative thought was needed to fit as many people into the meeting space as possible.

In February, the building was refurbished with additional sockets and power points in all meeting rooms, improved networking and the construction of a second Access Grid node, with the NeSC team finally moving into the building in March 2002. The first event to be held after the move was the BlueGene meeting in conjunction with IBM, so ensuring that the building was ready in time was paramount.

April was the month for the Official Opening of NeSC by the Rt Hon Gordon Brown.
This was a high-profile event for eSI, attracting much media attention, and one at which it could not afford to fail. Many people and organisations in the UK e-Science community contributed to the success of the day, ensuring that the demonstrators and AccessGrid were operating at peak performance, and that the talks and general events went smoothly.

The ‘grand finale’ for the first year was GGF5/HPDC11. This was the first time the two meetings had been hosted jointly, and the first time HPDC had been held outside the USA; the event attracted some 900 delegates, as well as six co-located meetings and events. The whole event was universally agreed to be a great success.

The e-Science Institute has ‘settled down’ to running four types of events: research, community building, training, and international collaboration meetings. It currently hosts an average of five meetings a month, with an even busier schedule planned for autumn 2002.

[Photo: South College Street reception area.]

[Photo: Malcolm Atkinson, the Rt Hon Gordon Brown and Sir Stewart Sutherland at the Opening of the National e-Science Centre.]

Events

Because these are part of the core business of eSI, the events in the first year are exhaustively summarised in Appendix 1. Further details and presentations for the majority of the talks can be found on the eSI website at: www.nesc.ac.uk/esi/. In this section we focus on three of the major meetings for the year.

Focus on The Official Opening (25 April 2002)

The Rt Hon Gordon Brown, Chancellor of the Exchequer, officially opened the National e-Science Centre (NeSC) in April, proclaiming it a “clear demonstration of the Government’s commitment to science and research, which includes specific funding for genomics, basic technologies and e-Science”. Gordon Brown said: “The Government is committed to maintaining the UK’s leading role in this important area of scientific research, for which we already have an enviable reputation.
I am pleased to open the NeSC, a bold, exciting and worthwhile initiative which provides the e-Science community with a permanent home where it can share resources, ideas and facilities.”

The official opening of the Centre brought together participants from government, academia, Research Councils and industry. This event was more than the opening of the Centre. It also acted as a showcase for the UK e-Science programme and was an opportunity to present its vision for the future to Government and other decision makers. This was achieved by a unified effort from the UK e-Science community that pulled together researchers from all the Centres. It was the first time that all of the demonstrators had been presented together in this way.

More than 180 visitors heard speakers addressing the major issues facing the e-Science community. In the morning they listened to talks on AstroGrid (Dr Nicholas Walton), ‘Grid-based on-line aeroengine diagnostics’ (Professor Jim Austin) and UNICORE (Dr Dietmar Erwin). The longer afternoon session included talks on the European Particle Physics Grid (Mr David Williams), myGrid (Professor Carole Goble), ‘e-Science, e-Commerce and eBusiness’ (Mr David Pearson), Databases (Professor Peter Buneman), ‘IBM and e-Science’ (Mr Freddie Moran) and finally ‘Computer Science Challenges to emerge from e-Science’ (Professor Tom Rodden).

Running in parallel to the talks was a display of pilot projects in action, demonstrating how Grid computing can solve the challenges in e-Science. In a wide collaborative effort, the projects came from across the UK and presented the work being done at all the e-Science centres, as well as universities, hospitals and research laboratories.
The demonstrators on display included:
• 3D OPT Microscopy Grid - Bringing the Grid to the Biomedical Workbench
• A Dynamic Brain Atlas
• A Liquid Crystal Structure Modelling and Visualisation Web Portal
• AstroGrid: Creating the UK’s Virtual Observatory and Defining the EU’s Astrophysical Virtual Observatory
• Building the Grid for BaBar
• ClimatePrediction.com: Public Participation in Climate Simulation of the 21st Century
• Collaborative Data Visualisation of Medical Images via the Grid
• Data Portal: Generic Distributed Database Queries
• Dynamic GRID Optimisation
• e-Science Grid Site Monitor
• e-STAR: e-Science Telescopes for Astronomical Research
• Exploiting the Grid to Simulate and Design the LHCb experiment
• Exploring Chemical Structures
• GEODISE: Grid Enabled Optimisation and Design for Engineering
• GODIVA: Grid for Ocean Diagnostics, Interactive Visualisation and Analysis
• GRAB: Grid and Biodiversity
• Grid-based Collaborative Visualisation & Computational Steering
• HPCGrid Services Portal
• Simulating Radiation Damage to Crystalline waste-storage materials
• Telemedicine on the Grid

Access Grid presentations were scheduled throughout the day and included ‘Databases & the Grid’ (Professor Norman Paton), ‘e-Science, the Grid and Southampton University’ (Professor Andrew Keane), ‘Telemedicine on the Grid’ (Dr Martin Graves and Miss Kate Caldwell), RePhoNet (Dr Thomas Eickermann) and ICENI (Dr Steven Newhouse).

Lord Sutherland of Houndwood (Principal, The University of Edinburgh) pronounced the welcome to the formal opening later in the afternoon. He was followed by Professor Tony Hey speaking on UK e-Science, Mr Roger McClure (SHEFC) on the launch of the eDIKT project and Professor Malcolm Atkinson on NeSC. Gordon Brown then formally opened the Centre, with the response being given by Professor Robin Leake (Vice Principal, University of Glasgow).
After the formal opening, Gordon Brown proceeded to the exhibition area to view some of the demonstrators and talk to the scientists involved. As a finale, he was able to speak live over AccessGrid to David Wallace at Imperial College and Director Fran Berman at the San Diego Supercomputer Centre.

[Photos, from top left: The Rt Hon Gordon Brown and Tony Hey, Director of the e-Science Core Programme, during an Access Grid session; Robin Leake, Vice Principal of the University of Glasgow at the Opening; Andrea Grainger of NeSC badges a delegate; Ray Browne of the DTI and Anne Trefethen, Associate Director of the e-Science Core Programme.]

Media attention was high, with over sixty items of coverage appearing in the press and online media, including articles in the Financial Times, The Times Higher Education Supplement, The Guardian and BBC Online, thus helping to raise the profile of e-Science in the UK.

NeSC would like to formally extend its thanks to all of those involved, who are too numerous to list here, in what was widely regarded as a very successful Opening.

Blue Gene Meeting (15–16 March 2002)

About 100 participants converged on Edinburgh in March 2002 for the second Blue Gene Workshop, held in the new e-Science Centre under the auspices of IBM and NeSC. ‘Blue Gene’ is the name given by IBM to its latest and largest supercomputer, which is being built for use in postgenomic research. This workshop brought together distinguished scientists whose research is in the area of protein folding to help stimulate interactions among the various communities involved.

Generally, the workshop addressed the application of massively parallel computation to the biomolecular sciences. In three sections, overviews and recent results were combined with speculation and novel approaches to the basic question: given unlimited computing resource, what biology would you like to do?
The section on protein folding included both experimental data on protein folding from Alan Fersht (Cambridge, UK) and computational approaches to how proteins are able to achieve their unique 3-D structures given only an amino acid sequence. Computational approaches to individual enzyme reactions and structure-based drug design were linked through the need to consider molecular interactions between protein molecule and ligand. Both quantum mechanical and semi-empirical methods were discussed. Finally, systems biology provided a fascinating glimpse of what can be achieved when metabolic pathways, whole cells (M Tomita, Keio) and even whole organs (Denis Noble, Oxford) are modelled. It was clear that although great strides are being made towards successful modelling of biological processes, there remains enormous scope for both basic biochemistry and ingenious algorithms.

GGF5/HPDC11 (21–26 July 2002)

The Fifth Global Grid Forum and Eleventh IEEE High Performance Distributed Computing Symposium were hosted in Edinburgh by NeSC at the Edinburgh International Conference Centre. This was the first time that these two events had been run as a joint activity, and the first occasion on which HPDC had been held outside the USA. It remains the largest event in each of these series, and was the first occasion on which GGF connected with external participants by Access Grid and made extensive use of wireless networks. The UK e-Science programme and the NeSC and EPCC staff made substantial investments of funds and labour to deliver a very successful conference.

[Photo: Delegates using the Wireless LAN at GGF5.]

The main function of GGF5 was, as always, the development of Grid standards and the dissemination of information about progress in developing and using Grid technology.
The conference was opened by Dr John Taylor, Director General of the Research Councils, Office of Science and Technology, who presented an inspiring vision of the potential of e-Science and characterised the considerable technical and social challenges and opportunities it presented. Professor Tony Hey, Director of the UK e-Science Core Programme, reported on the rapidly growing set of UK e-Science projects that are driving the UK’s Grid requirements and on the engineering work underway to support them.

The invited lecture by Dr Henry Thompson of the University of Edinburgh (editor of the XML schema proposal) used intriguing examples of past knowledge representation failures to expose the challenges that attempting to build a semantic Grid would pose. This was timely, as the first meeting of the community establishing a Semantic Grid GGF research group was held at this meeting, and a tutorial on ontology was very well attended. Both of these were led by UK researchers, Professors Carole Goble of Manchester University and David De Roure of Southampton University.

The DAIS Working Group held its first meetings, undertaking a major task of proposing a specification for data access and integration using grids, led by Professor Norman Paton and Dr Dave Pearson. Dr Rick Stevens of ANL gave an exciting introduction to biogrids and Dr Fran Berman of SDSC reported early work on the construction of the Distributed TeraScale Facility.

[Photos, from top left: John Taylor, Director General of Research Councils, Office of Science and Technology, at GGF5; GGF5 at the Edinburgh International Conference Centre.]

Opening HPDC11, Dr Irving Wladawsky-Berger of IBM spoke of their strong commitment to Grid technology, initially for computing on demand, and of the major investment IBM planned in order to develop and deploy it. The busy programme of research papers at HPDC11 was combined with several provocative keynotes.
Professor Jon Crowcroft of Cambridge University explored the new challenges that Grid computing presents to network providers and encouraged the Grid community to learn much more about the intrinsic properties of networks. In a complementary talk, Professor Joe Sventek of Agilent, now at Glasgow University, spoke of the Grid-scale challenge of monitoring and diagnosing the behaviour of today’s global digital networks. Professor Tom Rodden of Nottingham University and Director of the Equator IRC described the technical, ethical and social challenges of ubiquitous computing. His examples were drawn from plans to equip people at risk of cardiovascular episodes with personal health monitors as they engaged in their daily business in Glasgow and researchers in Antarctica as they collected paleoclimatic data. Dr Gregory Abowd of Georgia Institute of Technology raised similar examples of new kinds of Grid-like distributed computing infrastructure, in his case to support ubiquitous computing in the home.

[Figure: Participants at GGF1 to GGF5]

Some associated events were held in the e-Science Institute, and we drew on our dual university foundation to hold the two-day Grid Applications workshop in Glasgow to find sufficient capacity. This very energetic workshop was led by Professor Simon Cox of Southampton University.

Taken together, the ten-day programme made it self-evident that, as a result of the UK e-Science Programme, the UK is a major contributor to Grid computing and high-performance distributed computing.

Visitors to eSI

David Maier (Oregon Health & Science University, USA, 25 June 2002–3 July 2002)

Dave Maier visited for a little over a week to work with Malcolm Atkinson and Peter Buneman. This was based on prior working relationships concerning scientific and object-oriented databases, in which Dave Maier is a leading expert.
This led to fruitful discussions on how to efficiently handle matrix structured and XML structured data. We also explored issues of provenance tracking, annotation and archiving of such data. This was all planned, based on previous discussions, and we intend that this collaborative work will continue. A surprise was Dave Maier’s project modelling the interaction between the outflow of water from the Columbia River and the Pacific. This involved scheduling major data production workflows with simulation and sensor array inputs on a daily basis. This proved a provocative example of data intensive computation issues.

Professor Gregory A. Riccardi (Florida State University, current visitor)

Greg Riccardi arrived at NeSC on 28 June 2002 for a visit of four months. He is on sabbatical from his position as Professor of Computer Science at Florida State University and has come to Edinburgh to participate in the development of methods of utilizing information and data in ways that are consistent across all Grid applications. The OGSA-DAI project at NeSC is a perfect match for his interests and skills.

Plans

Training events

As a result of the demand for training courses and the cost of hiring equipment to run them, we have decided to set up a training room in the Swanston meeting room at South College Street. This will house twenty PCs configurable with a variety of Windows or Linux operating system builds. The use of Ghost to install images to all machines concurrently will significantly reduce management overhead, and will provide speedy and flexible reconfigurations of the system build. It is anticipated that the provision of the training room will significantly increase the number of workshops that eSI is able to provide and host.

Student Programme

As part of its outreach and training mission, eSI will implement a Summer Scholarship programme for senior undergraduates pursuing an appropriate qualification.
Projects

One of the main goals of NeSC (as for the regional e-Science centres) is to develop and manage a wide portfolio of collaborative projects with industry focused on developing middleware and application solutions in the e-Science arena. In the first instance, funding for these projects has come from the DTI/EPSRC Grid Core Programme. A grant of £3 million over three years has been made available to NeSC for this purpose.

The NeSC Grid Core Programme (GCP) funding is designed to fund collaborative projects with industrial and commercial partners to develop open core Grid middleware for the benefit of the project partners and the wider e-Science community. The GCP funding from DTI/EPSRC pays the full costs of the NeSC partners involved, matched by ‘in-kind’ contributions from the project’s industrial and commercial partners. In-kind contributions can be anything from cash, to equipment, to software licences, to company effort.

We also report on eDIKT, which is fully funded by the Scottish Higher Education Funding Council.

OGSA/DAI

Background

OGSA-DAI is a UK-wide project collaboration between NeSC, e-Science North West at the University of Manchester, the North East Regional e-Science Centre at the University of Newcastle, IBM and Oracle. This project will deliver a kit of components for accessing and integrating data from diverse and distributed databases using the new Open Grid Services Architecture (OGSA). It builds on the early dialogue with the OGSA team at Argonne National Laboratory, the Information Sciences Institute in the University of Southern California and IBM, and it derives from the requirements analysis conducted by the UK Database Task Force. OGSA-DAI is developing designs, demonstrators, prototypes and ultimately reference implementations of OGSA data access and integration services, along with crucial OGSA extensions to support e-Science.
The basic services developed by OGSA-DAI will be released as part of the OGSA-based Globus Toolkit version 3.

The overall project is divided into six work packages:
1. Programme Management: led by Oracle Corporation.
2. Architectural Specification: led by EPCC/NeSC.
3. Grid XML and Semi-Structured Data Services: led by EPCC.
4. Distributed Query Processing: led by e-Science North West and the North East Regional e-Science Centre.
5. Grid e-Science Services and BinX: led by EPCC.
6. Grid RDBMS and Globus-2 Data Services: led by IBM UK.

Goals

The project is concerned with middleware to assist with access and integration of data from separate data sources via the Grid. It is engaged in identifying the requirements, designing solutions and delivering software that will meet this purpose. The project was conceived by the UK Database Task Force and is working closely with the Global Grid Forum DAIS-WG and the Globus team.

Members

Rob Baxter, EPCC
Brian Collins, IBM UK
Inderpal Narang, IBM USA
Paul Watson, NEeSC
Malcolm Atkinson, NeSC
Norman Paton, NWeSC
Dave Pearson, Oracle UK

Status

Phase 1 of the project, which started in February 2002, is concerned with using prototyping techniques to refine functional scope and to develop a common architectural framework for Grid data services. Phase 2, which is scheduled to start at the beginning of October 2002, will focus on implementing Grid data services that provide access and integration to data held in relational database management systems and XML repositories. The software deliverables will be made available to the UK e-Science community and will also provide the basis of standards recommendations on Grid data services that are put forward to the Global Grid Forum through the DAIS working group.
Sun Data and Compute Grids (SunDCG)

Background

SunDCG is a collaboration between NeSC and Sun Microsystems, using Grid Engine and Globus to schedule jobs across a combination of local and remote machines.

The Sun Grid Engine (SGE) is a resource management tool which allows the efficient use of compute resources within an organisation. However, there is some desirable Grid functionality that SGE does not yet provide. Sun classifies Grids at three different levels:
• Cluster Grid: a single team or project and their associated resources.
• Enterprise Grid: multiple teams and projects but within a single organisation, facilitating collaboration of resources across the enterprise.
• Global Grid: linked Enterprise and Cluster Grids, providing collaboration among organisations.

Grid Engine meets the first two levels by allowing a user to transparently make use of any number of compute resources within an organisation. However, Grid Engine alone does not yet meet the third level.

The project will develop a data and compute Grid consisting of Grid Engines linked via a Hierarchical Scheduler. To the user this Hierarchical Scheduler will offer similar functionality and appearance to a Grid Engine but will enable access to resources not necessarily within the organisation. It is perceived that as well as communicating with Grid Engines the Hierarchical Scheduler will be able to communicate with other Hierarchical Schedulers to further enhance scalability.

A prime focus of the project is the application of good software engineering techniques and the use of industry standard analysis and design tools. To ensure the project stays relevant there are ongoing consultations with existing Grid Engine users in industry, commerce and academia. These consultations are being used for requirements capture to drive the design process and ensure that functionality is desired and useful.
Goals

The project aim is to provide users with coherent and co-ordinated access to compute and data resources both within and across organisations. This will be achieved by developing a Hierarchical Scheduler which will allow jobs and their associated data to be passed on to a Sun Grid Engine. A middleware layer will handle the communications and data transfer between the Hierarchical Scheduler and the Grid Engine(s), Globus being a key technology in this area. This will enable the user to have access to a wider range of resources and should improve efficiency for the user and the Grid Engine administrator. The Hierarchical Scheduler will be scalable and allow access to resources not only at the local and organisational level, but also at multi-organisational and international levels.

Members

Paul Graham, EPCC
Geoff Cawood, EPCC
Thomas Seed, EPCC
Ali Anjomshoaa, EPCC
Terry Sloan, EPCC

Status

A prototype based around Grid Engine V5.3 and the Globus Toolkit V2.0 is currently under development. This will enable exploration of the issues involved in utilising remote resources. This prototype will use Grid Engine execution methods to interact with the remote resources. The relevant modules of the Globus Toolkit V2.0 will be used to investigate security and data transfer issues.

The final goal of the project is to use OGSA-compliant Globus Toolkit V3.0 to interact with remote resources. To this end, the design of the Hierarchical Scheduler is currently considering a development path which involves an initial implementation based around Web services followed by a migration to Grid services.

GridWeaver

Background

GridWeaver is a proposed collaboration with HP Laboratories in Bristol, exploring automated configuration and management for Next Generation Grid computing fabrics.
As the world deploys larger and more diverse computational resources as components of computing Grids, there is an emerging need to be able to configure these resources:
• correctly to avoid errors in complex resource configurations
• flexibly so that resources can be rapidly reconfigured for different purposes
• automatically to reduce the labour-intensiveness of resource configuration and management, and to speed it up.

Meeting this need has not been a focus of Grid research to date, but we believe it is a vital issue to address as Grid technologies become more widely deployed.

The project will start in August 2002. It brings together researchers with a long-standing interest in the problems of correctly and automatically configuring and managing large, complex assemblies of resources, and the applications and services that run on them.

Goals

In addition to developing configuration solutions for current Grid infrastructures, GridWeaver aims to anticipate the needs of ‘next-generation’ computing fabrics. We believe that next-generation fabrics will be characterised by increasing scale, diversity, complexity and dynamism. Furthermore, as Grid computing becomes used for commercial as well as scientific applications, security becomes a higher priority.

Imagine a next-generation Grid fabric that includes elements from specialised supercomputers to commodity clusters, with a wide variety of heterogeneous hardware and software, including dedicated resources as well as spare cycles on non-dedicated resources, with permanently connected systems, and intermittently connected systems, and so on. This will give you an idea of the diversity and complexity that must be configured and managed.

The UK e-Science Core programme project ID for the GridWeaver project is ‘HPFabMan’.
Members

Paul Anderson, Carwyn Edwards, Lex Holt (School of Informatics, The University of Edinburgh), George Beckett, Kostas Kavoussanakis (EPCC), Guillaume Mecheneau, Patrick Goldsack, Peter Toft (HP Labs Bristol).

eDIKT

Background

eDIKT – e-Science Data, Information and Knowledge Transformation – is an initiative at the National e-Science Centre to construct novel data management and interpretation software tools, tools which will underpin the seamless linking, management and interpretation of the vast amounts of data available on global networks. eDIKT will enable e-Scientists to harvest the knowledge hidden in the acres of data with which leading researchers work.

eDIKT has been funded through a Research Development Grant by the Scottish Higher Education Funding Council – the largest such grant ever awarded by SHEFC. eDIKT is fully funded for three years, and it is hoped and anticipated that it will be extended for a further three years, during which central funding will tail off to be replaced by additional funding streams from academia and industry.

[Photo: Roger McClure of SHEFC at the unveiling of the eDIKT project, NeSC Opening, April 2002.]

Goals

eDIKT is largely a software development project, but software development at the extremes of the IT world. eDIKT will apply solid software engineering techniques to leading-edge computer science research to produce robust, scalable data management tools that will enable new research areas in e-Science.

eDIKT will initially investigate the use of new database techniques in astronomy, bioinformatics, particle physics and in creating virtual global organisations using the new Open Grid Services Architecture. eDIKT’s realm of enquiry will be at the Grid scale, the terabyte regime of data management; its goal is to strain-test computer science theories and techniques at this scale.
Working over time with a wider range of scientific areas, it is anticipated that eDIKT will develop generic spin-off technologies that may have commercial applications in Scotland and beyond in areas such as drug discovery, financial analysis and agricultural development. For this reason, a key component of the eDIKT team will be a dedicated commercialisation manager who will push out the benefits of eDIKT to industry and business.

Members

By the end of November 2002 eDIKT expects to be at its full strength of eight full-time Software Engineers, a Project Architect, a Project Manager and a Business Development Manager, with additional administrative and systems support staff. At time of writing, the team are: Dr Rob Baxter – Project Manager; Dr Davy Virdee, Dr Stephen Rutherford, Bob Gibbins, Robert Carroll and Ted Wen – Software Engineers; Charles Gadalla – Business Development Manager.

Status

After an initial phase of e-Science requirements definition and computer science technology evaluation, eDIKT will begin its first testbed activities, including:
• enabling interoperability and interchange of binary and XML data in astronomy – tools to provide ‘implicit XML’ representation of pre-existing binary files;
• enabling relational joins across terabyte-sized database tables;
• testing new data replication tools for particle physics;
• engineering industrial-strength tools for indexing the human genome;
• building a data integration testbed using the Open Grid Services Architecture Data Access and Integration components being developed as part of the UK’s core e-Science programme and the Globus-3 Toolkit.

Manager’s Report

Staff

As a result of the rapid growth in eSI activity, the number of staff directly associated with supporting events has expanded from what was originally envisioned.
The newness of the staff and the number of events (described in a recent job advertisement as ‘many in-house opportunities for team building’) have resulted in a strong team spirit. Staff commitment to delivering is exemplary, and they regularly perform beyond what could be reasonably expected of them. We propose to appoint a Receptionist to sit permanently at the front desk in support of the ever-expanding activity of eSI.

Ms Susan Andrews

Susan Andrews joined NeSC in September 2001 and is based in the Department of Computing Science at the University of Glasgow. Her responsibilities include the design, implementation and maintenance of the NeSC database and website. Much of the website is generated using an SQLServer database and a combination of HTML and Cold Fusion, thus providing an up-to-date website which serves as a supportive tool to the e-Science community. A successful online registration system, which plays a key role in the organisation of events, is now established. She has collated e-Science project information to provide a full and informative listing of e-Science projects and a means for Principal Investigators to update their project information.

Professor Malcolm Atkinson

Malcolm Atkinson is the Director of the National e-Science Centre. He joined the effort to locate the centre in Scotland from his post in Computing Science in Glasgow, where he had led the development of the Department. He has committed his research career to addressing the challenges of building very large computing systems that are well engineered, affordable and meet a real need. Early examples were in computer aided design, then health care. His most recent work was with Sun on large-scale persistent versions of Java. These large-scale system challenges abound in e-Science, so he was delighted to engage with the UK and international communities in addressing them.
This included his role in the UK Database Task Force, chairing the UK Architectural Task Force, engaging in international meetings and taking an active role in the design of the new Grid architecture, Open Grid Services. The development of NeSC has required leadership in building the team, setting directions, and establishing collaborative relationships within the two universities, throughout the UK and internationally. The opportunity to steer the e-Science Institute’s programme is an exciting privilege, as the resources permit us to bring together the energetic teams that are building e-Science Infrastructure and the innovative pioneers who are determined to discover its potential in all subjects. Though he is sorry to leave his colleagues, research team and nine PhD students behind (the directorship demands full-time attention), he is delighted to take a leading role in a national effort to demonstrate what can be achieved when theoreticians, computer scientists and experimentalists work well together to build systems to tackle our most intriguing problems.

Professor Muffy Calder

As the effort invested in establishing the e-Science Institute and an international presence took the Director, Malcolm Atkinson, away from Glasgow University, it was essential that e-Science leadership was provided in Glasgow. Professor Muffy Calder stepped into this role. For more than a year, she has led the team, recruiting them and directing their efforts. She developed a dialogue with several telecommunications companies on the analysis of large-scale distributed systems behaviour necessary for dependable Grids, which we had hoped to investigate as Centre projects. Two strands of her work were engaging computing science researchers to address fundamental issues thrown up by e-Science and engaging researchers from other disciplines to explore the potential of e-Science.
This culminated in a DTI-funded project on modelling proteomic processes using the process models developed in her previous research with the communications industry. This also led to significant investment in bioinformatics.

Mr Mark Cavanagh

Mark Cavanagh started work at the e-Science Institute in January 2002 and, since then, has done most of the technical organisation of the NeSC-sponsored events here. With the recruitment of a technical events co-ordinator freeing him from this increasingly onerous task (which was originally envisaged as requiring a small fraction of his time), it is intended that Mark concentrate on Systems and Database Administration.

Mr James Duffy

James Duffy joined NeSC at the end of January 2002 as Conference Assistant, assisting the Conference Manager in all aspects of conference organisation: booking hotels, arranging travel, arranging the equipment for the conferences, liaising with conference counterparts in other institutions, and organising the food and literature for the events. He moved into finance for NeSC in May, processing invoices and expenses claims and looking after the GridNet finances.

Ms Andrea Grainger

Andrea Grainger joined NeSC in October 2001 as Conference Administrator, and had a busy conference schedule from the beginning. As NeSC expanded and hired more staff, her role evolved into her present role of Conference Manager. She is also Secretary to the EPCC/NeSC Directorate and Senior Administrator for a team of four clerical staff. As Conference Manager, she liaises with conference organisers, technical staff, local hotels and restaurants, and University staff to ensure that the vision of the conference organiser is carried out. She has also managed several large projects, such as the move into the South College Street building, the NeSC Grand Opening in April, and GGF5/HPDC11 in July.
In particular, GGF5/HPDC11 involved hosting 1000 guests, coordinating with organisers in the US, and managing 15 co-located meetings in two cities during the conference week.

Dr Anna Kenway

Anna Kenway is one of the founding members of NeSC. In her role as Physics School Administrator at the time, she was one of the people who were involved in putting the proposal together. Her experience as a senior administrator in the University allowed the project to start immediately on announcement, as she was able to call on resources that a new appointee would not.

The role of Manager is wide ranging. As well as building the team at NeSC, and being responsible for the usual administrative tasks such as financial and personnel management, Anna has had the pleasurable task of deciding on the layout and use of the South College Street building, and on the sort of IT services that eSI would provide for visitors. She has a strong commitment to a ‘can do’ approach where the needs of the user are paramount.

With a background in particle physics and databases, Anna has also taken on non-administrative responsibilities. She is the project manager for the NeSC website, which is database driven, and has become responsible for projects involving postgraduate students, in particular the PPARC Summer School. Anna is delighted that she will be moving full-time to NeSC at the beginning of calendar year 2003.

Mr Alastair Knowles

Alastair Knowles joined NeSC as the Director’s Personal Assistant in July 2002. He coordinates all of the Director’s administrative requirements, and assists peripherally in the administration of NeSC events and conferences. He also helps out with the arrangements for visiting academics.

Ms Lee McLeod

Lee McLeod started with the National e-Science Centre in July 2002 as a Conference Assistant.
Currently, she assists the Conference Manager in organising conferences, most of which are held at the National e-Science Centre. This ranges from the arrangement of rooms, catering, accommodation, and materials, to ‘on the day’ requests, and follow-up afterwards.

Mr Stewart Macneill

Following periods of research work and systems support in the University of Glasgow’s Department of Computing Science, Stewart Macneill was appointed as Systems Manager for the Glasgow side of NeSC’s operations. The past year has seen the successful installation of the Glasgow Access Grid node and the advancement of the Grid’s capabilities. It is his job to keep this equipment running smoothly. In addition, Glasgow currently hosts the NeSC web site, so it is important to maintain the server, middleware and underlying database to keep our web site visible.

Mr David McNicol

David McNicol joined NeSC as a Systems Administrator in January 2002, from a similar position in Strathclyde University. He has been responsible for setting up the network, computing infrastructure and services within the e-Science Institute. He has also designed and implemented a web-based registration system which secures our visitors’ network and wireless LAN. During the summer of 2002 he had a hands-on role setting up and running a wireless network for the GGF5 conference, held in the Edinburgh International Conference Centre, which was used by over four hundred delegates. In the coming months he will be working on networking, security, developing an infrastructure for eDIKT and OGSA-DAI, and continuing to support and enhance the NeSC infrastructure.

Dr Mark Parsons

Mark Parsons is the National e-Science Centre’s Commercial Director. As EPCC’s Commercial Manager, he was closely involved in the preparation of the successful NeSC bid last year and assumed the additional role of NeSC Commercial Director from its inauguration.
In this role he has been central to the development of NeSC’s GCP-funded project proposals, working closely with all of the NeSC partner institutes. Three of these proposals have been funded to date: OGSA-DAI, HPFabMan and SunDCG. He has worked to establish a best practice legal framework for these projects in order to establish collaboration agreements and reporting procedures. He has also drawn on this experience to advise a number of the Regional e-Science Centres on their projects and contractual issues when such support has been requested. In addition to his formal business development and contractual role he has taken a lead role in Europe, leading the work of the GRIDSTART project and ensuring the successes of the UK Grid Core Programme have visibility in a European context. His most rewarding endeavour this year was the success of the Global Grid Forum 5 conference, which he took a lead role in organising and running.

Mr Terry Rodgers

Terry Rodgers started working as a temp at NeSC in May 2002, before being hired as the eDIKT Administrator in June. His task is to deal with the paperwork for the team, and he also helps out with other events at the Centre as well as reception duties. He is the secretary of the building’s Management Committee and is also responsible for compiling the information for this first Annual Report. Since arriving, he has assisted with various conferences, including the N+N Collaboration with China and GGF5.

Mr John Watt

John Watt joined NeSC in mid-June 2002 as a research programmer in the Department of Computing Science. His responsibilities involve the implementation, maintenance and support of the Globus Grid software, and the design of NeSC projects using this toolkit.

Finance

Core Funding (DTI)

The table below shows the current state of core funding for NeSC, by Infrastructure, Access Grid and eSI, as well as GCPs applied for and awarded.
GridNet (EPSRC)

‘To provide support for UK Grid developers and researchers to enable them to participate in international standardisation and coordination bodies such as GGF, W3C and IETF.’

Applications to GridNet are reviewed by three members of the GridNet Advisory Board (GNAB), which comprises:
Professor Malcolm Atkinson
Professor Jon Crowcroft
Professor John Darlington
Professor David Hutchison
Professor Andy Keane
Dr Andy Parker
Professor Ron Perrott

GridNet is intended to provide funding on a continuing basis over an extended period of time, and not for individual events, though an exception was made in the case of GGF4 because the announcement of the award was made so close to the event. It has a total value of £595k over three years commencing February 2002.

Application for GridNet funding is web-based and we are in the process of implementing a secure Web area to allow PIs to manage their accounts. It was agreed with EPSRC that NeSC would not handle all expenses claims (not least because most universities now refund their own staff using BACS and this would introduce unnecessary delays), and that institutions with an allocation should set up a suspense account and claim from NeSC on a regular basis.
N/A IBM Oracle Other Academic Industry Collaborator 1 Industry Collaborator 2 N/A Hewlett Packard Other Academic Industry Collaborator 1 NeSC N/A Sun Microsystems Other Academic Industry Collaborator 1 £450,000 £0 £437,500 £143,170 £0 £132,000 £322,000 £1,804,000 £0 £1,035,000 £2,512,197 £54,495 £0 £0 £437,500 £0 £0 £132,000 £135,586 £145,692 £0 £1,035,000 £2,512,197 £54,495 £1,500,000 £117,500 £840,202 Grant
£450,000 £0 £0 £143,170 £0 £0 £161,000 £1,631,000 £0 £0 £0 £0 £0 £0 £0 Industry Contribution
01/08/2002 01/08/2002 01/04/2002 01/04/2002 01/02/2002 01/08/2001 01/08/2001 01/08/2001 01/08/2001 Start Date
24 24 12 12 18 18 18 36 36 36 36 Period of Grant
N/A 05/08/2002 N/A 24/06/2002 Not known Not known 13/06/2002 07/12/2001 01/10/2001 01/10/2001 01/10/2001 Date Offer
N/A 30/08/2002 N/A 30/08/2002 Not known Not known 30/08/2002 20/12/2001 01/10/2001 01/10/2001 01/10/2001 Date Accepted
4 3 2 1 Amendment No.
Centre SunDCG NeSC Centre HPFabMan (Gridweaver) NeSC Total Core Funding Additional Centre OGSA-DAI £117,500 Access Grid £1,500,000 £840,202 Infrastructure eSI Total Proposal Name

Applications & Allocations as at 31 July 2002

GGF4, Various: £16,872
Mr Jonathan Giddy, Cardiff: £1,000
Dr Omer F Rana, Cardiff: £1,000
Dr Peter Clarke, UC London: £1,000
Professor Ken W Brodlie, Leeds: £1,000
Dr Steven Newhouse, Imperial: £30,000
Dr Richard Hughes-Jones, Manchester: £20,200
Professor Ron Perrott, Queens, Belfast: £30,000
Dr Robert Baxter, Edinburgh: £30,000
Dr Roger Philp, Cardiff: £1,000
Professor Simon Cox, Southampton: £1,500
Total: £133,572

Additional Funding attracted by NeSC

NeSC has attracted an additional £8M of funding from:
• the Scottish Higher Education Funding Council;
• IBM, in the form of the secondment of Andy Knox and the planned SUR machine as well as other hardware;
• the Universities of Edinburgh and Glasgow, in the form of accommodation and staff posts.

This is detailed below.
All amounts are in £k.

University of Edinburgh
eSI Building (three years at £350k per year): 1050
Comms infrastructure (SRIF): 1680
Two posts for e-Science Computing: 200
College e-Science Initiative: 200
Refurbishment of eSI & Expansion into Old College: 60
Subtotal: £3190k

University of Glasgow
e-Science Hub - Kelvin Building: 490
Refurbishment of Bioinformatics Centre - Feb 03: 300
e-Science posts for Bioinformatics Centre (£160k each year): 480
Support of PhD students for Bioinformatics Centre: 50
Other e-Science posts (£65k for two years): 130
Subtotal: £1450k

IBM
Blue Dwarf SUR Machine: 700
Secondment of Dr Andy Knox, £100k for 2.5 years: 250
Loan of IBM servers to University of Glasgow, £4437/quarter, start Feb 2001: 62
Loan of IBM servers to NeSC, £3297/quarter, start Apr 2002: 30
Loan of machines for NeSC Opening: 5
Donation of Netfinity server to NeSC: 8
Subtotal: £1055k

SHEFC
RDG eDIKT: 2280
Subtotal: £2280k

Grand Total from all sources: £7.98M

(Note: Unless otherwise stated, time frame is August 2001 to August 2004)

Buildings

South College Street

In March 2002, the National e-Science Centre moved into South College Street. The building, a sensitively refurbished church in central Edinburgh, near to the original University buildings of Old College, comprises the Newhaven lecture theatre seating 108, two large meeting rooms, three reception areas and offices.

To enable the Centre to act as a leading IT organisation it was necessary to undertake some refurbishment of the building. This largely involved improving the networking in the two main meeting rooms (Swanston and Cramond) so that visitors would be able to access networking and power for their laptops. Additional power sockets were also installed in the Newhaven lecture theatre where possible.
The e-Science Institute prides itself on providing a ‘well connected’ and welcoming IT environment for visitors, which enables them to connect to the internet and access their email and other services with a minimum of trouble. We are one of the first organisations in Edinburgh University to provide WirelessLAN, and David McNicol has written and implemented a Web-based WirelessLAN registration system which allows visitors to register themselves. This is particularly useful when running events with a hundred delegates or so. Imagine trying to register them manually to use WirelessLAN (as justifiably required by the University Computing Services) first thing before a meeting. After a number of requests from other Centres to use this system he has made it available at http://homepages.nesc.ac.uk/~david/software/wlan/.

We are particularly proud of ‘the Pod’, which acts as an internet café when events are on, provides space for short term visitors and is also used as a meeting room on occasion. It even acted as a press room for the Opening. An early decision was made to provide the two flavours of platform favoured by different sectors of our community. To satisfy both requirements a low maintenance Sunray system was installed for UNIX users and Windows XP machines for the others. This area is very popular and always fully occupied during events.

The Pod has also been used for training purposes, but has very limited capacity. This limited capacity and an increasing demand for eSI to run training courses requiring ‘hands on’ capability have led to the decision to implement a training room in the Swanston room. Future plans also include the refurbishment of several offices in the Old College, which is just across the road from eSI.

Glasgow

NeSC staff located in Glasgow are currently housed in the Computer Science building of the University of Glasgow, though it is planned to site the facility in a new ‘e-Science Hub’ in the Kelvin Building.
A third AccessGrid node has been installed at Glasgow.

The ‘Pod’ being used for interviews during the Opening.

IT Infrastructure

It was clear from the start that South College Street would require a sophisticated IT infrastructure. The system would have to satisfy a number of contradictory requirements. Firstly, it would need to provide a secure system for NeSC staff; secondly, a more open research environment; and finally, an ‘open’ environment for visitors bringing their own laptops. It would also be necessary to provide the sort of environment that overseas visitors would be accustomed to, such as WirelessLAN. Even when we initially thought that we had finalised the requirements for the infrastructure, conditions continued to change with the addition of extra servers such as the IBM loan machine (Gilmore).

These requirements have led to the extensive network in eSI that provides facilities for training and visitors as well as infrastructure and support for NeSC research. The visitors’ area (called the ‘Pod’) provides five PCs and five Sunrays that eSI visitors can use to access e-mail and the web, or for drafting documents. The Sunrays are X-terminals connected to Clapton, which is a Sun Blade 1000 with an UltraSPARC 3 CPU and 1024MB RAM. The VPN is used by staff who connect to the network from outside the firewall. This includes the wireless network, which can be used throughout the eSI building.

The two key servers for eSI infrastructure are Hendrix and Zappa. Zappa is a Sun e250, running core services such as sendmail, DNS, Samba, printing, etc. It is also responsible for backups. Hendrix is our Windows 2000 domain controller. It handles Windows authentication, and also runs MS Exchange.

A number of important projects are still in the pipeline:
1. Moving the public NeSC web pages and SQLServer database from Glasgow to the NeSC servers and the secure web pages from the EPCC servers.
Santana will run the database, with Page providing the Windows front end. Secure web services will run on Townsend, which is a Linux machine running Apache.
2. Web cast (video streaming) in the Newhaven lecture theatre.
3. A further ‘portable’ AccessGrid node to enable us to broadcast from any location.

Network diagram.

Appendix

Details of Events to 31 July 2002

The e-Science Institute runs four types of event: Research, Community Building, Training and International Collaboration meetings. Many events are multi-purpose, but we have tried to classify each under one major category. We have excluded meetings of a purely managerial nature. Further information and material for the events is available at http://www.nesc.ac.uk/esi/.

First Grid Users Meeting (Community Building)
23rd August, 2001
This was the first event hosted by eSI. Twenty participants from the UK e-Science community, representing twelve organisations, arrived to discuss plans and requirements for the Grid, e-Science events, and priorities for the National e-Science Centre.

Databases and the Grid (Research)
24th August, 2001
Eleven UK participants from eight organisations met to discuss database applications on the Grid.

UK High-End Computing Symposium (Research)
10th–11th September, 2001
One hundred and forty participants attended this symposium, the first large-scale event to be held by NeSC. UKHEC aims to investigate emerging areas of computing and to inform and provide advice to the user community in hardware and software applications. This meeting was an opportunity to increase awareness of who was involved with NeSC, and how the start-up was proceeding, as well as discussing staffing, funding and network issues.
Pre-GGF3 Meeting (Community Building)
1st October, 2001
Nineteen participants met to plan ahead for the UK contributions to the third Global Grid Forum, held in Frascati, near Rome, Italy, as well as reviewing the eSI programme.

Introduction to Grid Computing and the Globus Toolkit (Training)
12th October, 2001
Eighty participants from thirty-five organisations arrived to hear Steve Tuecke, Globus Project lead architect from the Argonne National Laboratory, give a practical introduction both to Grid computing and to the Globus Toolkit.

Getting Going with the Grid (Community Building)
22nd–25th October, 2001
Thirty-five participants from nineteen organisations attended this workshop, designed as a practical introduction both to Grid computing and to the Globus Toolkit, and intended to bring together a wide range of people engaged in the task of developing Grid infrastructure and Grid applications in the UK. The participants were offered introductions to Globus, Condor, SRB, and Access Grid. There were overviews of Grid portals and reports on early experiences from UK Grid projects (European Data Grid and GridPP). Several computer companies talked about their plans, and the current Grid-related work in Europe was reviewed.

Bioinformatics Grid Users’ Meeting (BiGUM1) (Community Building)
30th October, 2001
Fifty-five participants from twenty-six organisations arrived for this first meeting of the Biological Grid Users. Its aims were to help form a community among this group of e-Scientists, to assist in the dialogue between biological Grid users and the implementers of the UK Grid infrastructure, to identify how the e-Science Institute can best help this community, and to identify other actions that might be commissioned or requested. This was followed by a survey of bioinformatics requirements and a presentation of three bioinformatics Grid applications.
Participants were also asked to present their own requirements, interests and contributions in order to identify a follow-up agenda.

Quantum ChromoDynamics Grid (QCDGrid) Workshop (Research)
2nd November, 2001
Nineteen participants representing seven organisations met to discuss establishing a datagrid between Edinburgh, Glasgow, Liverpool and Swansea, which had to be fully functional and robust by summer 2002, when the old centralised data store closed down. The intention was to use 0.5TB RAID disks operating data caching/mirroring across the sites, with Edinburgh as the primary source of data (T3E then QCDOC). UKQCD is loosely attached to the US SciDAC project, which is trying to define standards for QCD software and data. This may impose some external constraints as they evolve towards an international QCDGrid. The purpose of the workshop was to educate the participants about those aspects of XML and SRB which related to their needs, plus any options for achieving similar functionality. NeSC provided introductory talks on XML and SRB. In addition to members of UKQCD, systems staff from the sites concerned and non-UKQCD members of GridPP also attended for discussions of Data Markup, Data Storage and Mirroring, and User Interfaces.

Grid Particle Physics 2nd Collaboration Meeting (Research)
5th–6th November, 2001
Fifty-seven participants from eighteen organisations met to discuss various GridPP projects. The VRVS system was used to broadcast proceedings to members in other locations. After the meeting was opened by Richard Kenway and Steve Lloyd, GridPP Project Status and LHC Computing Project Status updates were given by Tony Doyle and Les Robertson. Highlights of the meeting included ‘Specification, Status Report and Development Plans for Current and Future Experiments’, ‘Grid Overview Technical Status Reports’ and ‘Middleware Interfaces and Development’.
Large Scale System Configuration Workshop (Research)
8th–9th November, 2001
Eighteen participants from seven organisations were invited to this workshop for experienced practitioners to discuss all aspects of installing and maintaining system configurations on large numbers of nodes, including Grid farms, servers, and desktop workstations. A major aim was to bring together people from different backgrounds to share experiences, and inform the development of the new techniques which would be essential for the management of the very large numbers of nodes being envisaged for the next generation of Grid systems. A summary of the discussions and conclusions can be found at: http://homepages.informatics.ed.ac.uk/group/lssconf/config2001//

Grid Architectural Discussions (Community Building)
12th–14th December, 2001
Fourteen participants, mainly from the Architecture Task Force, representing twenty-three organisations met at NeSC to discuss Grid Architecture with Ian Foster, Argonne National Laboratory, Illinois. A Database Task Force Discussion was followed by a talk by Ian Foster, with comments and discussion by the UK ATF to sketch out an Architecture for UK work. During this event, NeSC co-sponsored, with the Division of Informatics, a public lecture by Ian Foster to the University Community in the Reid Concert Hall, on ‘The Anatomy of the Grid, Enabling Scalable Virtual Organisations’. A dinner was held in the Talbot Rice Gallery on the final evening.

AstroGRID Workshop (Research)
13th–14th December, 2001
Twenty participants from ten organisations gathered at NeSC for status and news reports, reviews of actions and External Projects (such as VISTA, AVO, EGSO, NVO, COST, DBTF and GridPP). Reports from working parties were produced, on Preliminary Architecture, workplans, budget, recruitment and an AGLI meeting.
Scottish e-Science Forum (Community Building)
17th January, 2002
Thirty-five participants from six organisations were welcomed to this inaugural event of the Scottish e-Science Community. After an overview of NeSC by Malcolm Atkinson, talks were presented on Funding Industrial Projects by Mark Parsons, and The Scottish Perspective by Stuart Anderson. Working Groups reported on topics such as science application areas and e-Science in education, before planning the work of SeSF.

Grid & Globus Expanded Tutorial (Training)
21st–24th January, 2002
One hundred and fifteen participants, representing sixty-four organisations, came to NeSC to hear speakers Steve Tuecke (Globus Project lead architect), Bill Allcock and Charles Bacon (GRIDS Centre, University of Chicago) from the Argonne National Laboratory. After the Introduction to Grid Computing and Globus Toolkit by Steve Tuecke, the Tutorial split into two tracks for developers and administrators, with a community-building dinner at the neighbouring Chapterhouse Restaurant afterwards.
Track 1 (Developer Tutorial, Steve Tuecke and Bill Allcock): Introduction to Globus Toolkit Administration and Packaging; Introduction, globus_common, globus_io, security; Resource Management; Data Grid: GridFTP, GASS, replication.
Track 2 (Administrator Tutorial, Charles Bacon): Introduction to Globus Toolkit Administration and Packaging; Demo: complete install and customization, Q&A, hands-on.

Meeting for EPSRC e-Science Pilot Projects (Community Building)
25th January, 2002
Forty-six participants from sixteen organisations came to this event, organised by Jim Fleming of EPSRC. Talks were delivered on an overview of the e-Science Centres by Malcolm Atkinson, e-Science Support by David Boyd and Steve Booth, and the Grid Network Team by David Hutchison. Also discussed
Also discussed page 39 page 40 Appendix e-Science Report were web sites and other administrative issues, the Pilot Projects requirements from the UK Grid and how the Pilot Projects can learn from each other. UK Grid Portals/ Grid Services Workshop (Research) 28th–29th January, 2002 Fifteen participants arrived, from seven organisations, to this event, organised by Rob Allan, primarily for UK and European participants. The aim of the meeting was to exchange information about the technology being used in funded e-Science projects to build Grid Service based Application Portals, including discussions of Globus and other middleware, CoG kits, Web Services, event and component schema definitions, etc. The work of the Grid Computing Environments Research Group within the Global Grid Forum in this area is an important step in building the ‘information layer’ of the Grid which will be exploited for complex applications built from integrated computer resources, data, instruments and software components. A number of groups were invited to make short presentations on topics of interest and follow up with in-depth discussions. IBM BlueGene Conference (Research) 15th–16th March, 2002 IBM sponsored this important event for eighty-nine distinguished guests from fifty-four organisations. Organised by Dr Lindsay Sawyer, the IBM Blue Gene Protein Science project and the National e-Science Centre hosted the Protein Science Workshop ‘Blue Gene 2002’. (The first Blue Gene workshop on Protein folding was held in 2001 at San Diego, California, USA.) The purpose of the workshop was to bring together scientists whose research is in selected areas of protein science, with emphasis on computational approaches, to establish the status of the research, discuss open issues, reinforce interactions and create new collaborations. 
The workshop consisted of invited talks divided into three sessions (protein folding, enzymatic reactions and drug design, and systems biology) as well as contributed posters.

Tutorial on XML 1 (ToX1) (Training) 18th March, 2002
Fifty-one participants from ten organisations arrived for this one-day tutorial, intended for people with little experience of defining and using XML schema. It focused on those aspects of XML that are important for describing and communicating data, as these are essential foundations for working on Web Services. The tutorial was open to all who were participating in, or planning to participate in, the UK e-Science programme. It was also recommended for those coming to the Workshop on Web Services (WoWS 1) who were not fluent in XML schema specification and use.

WoWS 1: Workshop on Web Services 1 (Training) 19th–20th March, 2002
Sixty participants from thirty-one organisations came to this workshop, organised by Malcolm Atkinson and Rob Baxter. The workshop, led by Nick Todd of Stilo Technology, assumed familiarity with XML and XML schema, at least to the level presented at ToX 1. As e-Science frequently involves the integration of heterogeneous and distributed systems, one of the goals was to encourage the use of Web Services wherever appropriate in e-Science. The other two goals were to help build a mutually supportive community of web service developers and to prepare the ground for OGSA. The workshop was open to any members of the UK e-Science community and to those engaged on European Union funded Grid projects.

Getting OGSA Going One (GOGO) (Tutorial) 21st–22nd March, 2002
Thirty participants representing twenty-seven organisations came to this event, organised by Malcolm Atkinson & Rob Baxter.
The Open Grid Services Architecture (OGSA) was put forward as a proposal at Global Grid Forum 4, both in a panel and at a Birds of a Feather (BOF) session that is hoping to establish an Open Grid Services Infrastructure Working Group (OGSI WG). The UK e-Science community is proposing to actively engage in the development and exploitation of OGSA through the OGSI WG. The primary goal of this first meeting was to establish a community and working relationships in the UK that will support that proposed UK role. This was a workshop where all participants were actively engaged in the work, to help develop joint understanding, and to plan articulation of various activities. Topics included:
• Discussion of OGSA (led by Malcolm Atkinson)
• Discussion of the OGSA model: Projects, Tools & Components, Progress & Problems (led by Rob Baxter)
• Review of the potential synergy and duplication
• Identification of Technical Challenges for the next 6 months (led by Rob Baxter)
• Discussion of Technical solutions and Plans
• Integrated Planning

Logic, Model Theory and Computation (Research) 4th–5th April, 2002
Fifteen participants represented eight organisations at this workshop, organised by Martin Grohe. This was the third semestrial meeting of the Logic, Model Theory and Computation Workshop. The first two meetings in this series took place in Cambridge and Swansea. The workshop was open to all interested researchers.

Workshop on Distributed Software Management for e-Science (WoDSMeS) (Community Building) 8th–11th April, 2002
Twenty-six participants from twenty organisations came to this workshop, jointly organised by Alan Simpson, Rob Baxter and Malcolm Atkinson. Most e-Science funded projects face the additional challenge of working in multi-organisational distributed teams, which makes project management, quality processes and technical support far more difficult.
Hence this was the first workshop to initiate the process of jointly developing shared knowledge of how best to proceed, given the realities of e-Science projects that involve companies and universities, computer scientists and application scientists, and that often depend on international collaborations. The course components introduced software tools, such as source code version managers, compilers, libraries, build managers, development environments and diagnostic aids, and also introduced software design. Also addressed was project management, with discussions on how to develop and support good practice for project and software management that is realistic for UK e-Science projects. The intended outcome was the initiation of an informal network of interested parties which shares ideas, methods and technical solutions and formulates requirements to support the management of these e-Science projects and their software production. Such formulated requirements should in turn lead to better provision of the relevant software infrastructure.

Terascaling meeting (Research) 12th April, 2002
Twelve people met to discuss how to incorporate terascaling in Grid-based applications.

Database Access and Integration (DAI) (Research) 15th–16th April, 2002
Thirty-two participants, representing fourteen organisations, met for this event, organised by Malcolm Atkinson & Norman Paton. This workshop brought together members of the e-Science community who were currently making, or expecting to make, use of database access and integration facilities. The DBTF presented its current understanding of requirements, its current Grid middleware development plans, and its current view of priorities.
Highlights included:
• Introduction to Database Task Force – Norman Paton
• Database Requirements Analysis – Dave Pearson
• Overview of Planned Development Activities – Norman Paton
• Overview of the Open Grid Services Architecture – Malcolm Atkinson
• Baseline Grid Database Services – Brian Collins
• XML Data Access Services – Rob Baxter
• Web Services and Data Management – Tony Storey
• CERN's EDG Database Activities – Leanne Guy
• IBM Almaden's OGSA Database Activities – Inderpal Narang
• Discussion Group Commissioning – Norman Paton
• Future DBTF, DBTF & GGF Working Group Activities – Dave Pearson

UK Database Task Force (DBTF) and GGF DAG WG Meeting (Community Building) 17th April, 2002
Ten participants from seven organisations arrived at NeSC for this event, organised by Malcolm Atkinson & Norman Paton. The UK e-Science DBTF and members of the Databases and the Grid (DAG) working group met, together with invited guests and co-workers, for the remainder of the DAI meeting and for the following day. This meeting was by invitation only, and included discussions of ongoing and planned development activities involving attendees, planned deliverables for GGF5, a review of the existing ‘Database Access and Integration Services on the Grid’ proposal and a technical discussion of RDB and XML service proposals.

Official Opening of the National e-Science Centre (Community Building) 25th April, 2002
Over two hundred guests arrived, representing the highest levels of seventy-six organisations, companies and institutions, for the Official Opening. We were particularly honoured by the presence of the Right Honourable Gordon Brown, Chancellor of the Exchequer, who officially opened the Centre. Contributions to the opening were made by the Principals of Edinburgh University, Lord Sutherland of Houndwood, and Glasgow University, Professor Sir Graeme Davies. The opening was celebrated with a VIP dinner, sponsored by the University of Glasgow, at St Cecilia’s Hall.
After a welcome to NeSC by Professor Malcolm Atkinson, a series of presentations outlined the use of Grid technology.

Ontology and Astronomy (Research) 29th April, 2002
Eleven participants from six organisations gathered at this NeSC event, in which each spoke about Grid applications for ontological and astronomical researchers.

The Information Grid (Community Building) 30th April, 2002
Sixty-three distinguished guests, representing fifty-three organisations, arrived for this workshop, organised by Dr Liz Lyon. This international workshop was a joint initiative between UKOLN and the UK e-Science Core Programme. It explored common approaches and challenges to the service provision and description of datasets and information resources within the Research Grid/e-Science communities and those working with the JISC-funded Distributed National Electronic Resource (DNER). The event aimed to raise awareness of the service architectures and to describe current activities to progress thinking in this area. The workshop considered the technical standards and protocols which are elements of the service architectures, mechanisms for resource discovery, the role of ‘collection descriptions’, the use of schema and descriptive metadata as well as terminology and semantics, quality assurance, provenance, compressed archives, digital preservation, storage and security issues. The Keynote was delivered by Professor Tony Hey, Director of the e-Science Core Programme. The international line-up of speakers included Dr Reagan Moore (San Diego SCC), Professor Keith Jeffery (RAL) and Andy Powell (UKOLN), and the event was jointly chaired by Dr Liz Lyon, Director UKOLN, and by Professor Malcolm Atkinson.
Highlights included:
• JISC Information Environment and Architecture – Dr Alicia Wise, JISC & Andy Powell, UKOLN
• The European perspective – Professor Keith Jeffery, CLRC, RAL
• The US view – Dr Reagan Moore, San Diego SCC
• Digital preservation and model archiving – Kevin Ashley, ULCC
• Metadata generation, use & ontologies – Professor David De Roure & Dr Jeremy Frey, University of Southampton
• Security and authentication – Dr Alan Robiette, JISC
• Data movement, management and resource discovery – Michael Breaks, Heriot-Watt University
• Data Storage – Dr Richard Durbin, Sanger Institute

Making the Grid work in a Computing Services Environment (Community Building) 1st May, 2002
Fifty-seven participants, representing twenty-seven organisations, attended this event, organised by David Boyd and Paul Jeffreys. The purpose of the meeting was to bring together Computing Service providers and Grid developers to increase awareness of the Grid, to discuss requirements for implementing and using Grid functionality on Computing Service resources and to develop agreed recommendations on how to address these requirements. The meeting began with two talks, the first providing an overview of the Grid vision and current Grid technology, and the second outlining key requirements to be met for successfully implementing and using the Grid on Computing Service resources defined as a function of time over the following twelve months.
Topic highlights included an overview of the Grid and the Globus toolkit, service requirements to enable Grid access to computing resources in Universities and Centres, Grid software installation and support, directory services and other operational components, Authentication, CAs and RAs, User Registration, Authorisation and Accounting, and a discussion of Firewall networking and monitoring.

Bioinformatics Planning Meeting (Community Building) 2nd May, 2002
A small-scale meeting at which four people from two organisations discussed future Bioinformatics applications for the Grid.

IBM Meeting Regarding eDIKT (Community Building) 14th May, 2002
Twelve participants from IBM and Edinburgh University met at NeSC to discuss the Electronic Databases, Information & Knowledge Technology project.

Mobile Resource Guarantees (Community Building) 22nd May, 2002
Seventeen participants from five organisations met to discuss the ability to move code smoothly between execution sites, as a key part of the technological infrastructure of future global computing platforms. The pressure to supply and use mobile code in a global environment aggravates existing security problems and presents altogether new ones. One particular security issue is the maintenance of bounds on quantitative resources. Without some technological foundations for providing such guarantees, global computing will be confined to applications where malfunction due to resource bound violation is accepted as normal and has little consequence, as with internet computation today. With more serious applications, resource awareness will become a crucial asset. This project aims at developing well-founded methods to spur technological progress in this presently under-studied area. The main objective of the project is the development of the infrastructure needed to endow mobile code with independently verifiable certificates describing resource behaviour.
Highlights included:
• Introduction to MRG, participants – Don Sannella, School of Informatics, University of Edinburgh
• Type systems for resource bounds – Martin Hofmann, Institut für Informatik, Ludwig-Maximilians-Universität München
• JVM and .NET comparison – Stephen Gilmore, School of Informatics, University of Edinburgh
• Grail and lambda-Grail – Kenneth MacKenzie, School of Informatics, University of Edinburgh
• The Grail cost model – Lennart Beringer, School of Informatics, University of Edinburgh
• ConCert project – Peter Lee, School of Computer Science, Carnegie-Mellon University
• MRG's High-level language – Olha Shkaravska, Institut für Informatik, Ludwig-Maximilians-Universität München
• Compiling LFPL to JVM – Robert Atkey, School of Informatics, University of Edinburgh
• Formalising Toy MRG – David Aspinall, School of Informatics, University of Edinburgh

Astrogrid Meeting (Research) 31st May, 2002
Organised by Bob Mann, nine invited participants from five organisations around the UK met to discuss this £4 million project, funded from the UK e-Science programme through PPARC, aimed at building a data Grid for UK astronomy, as the UK's initial contribution to constructing an international 'Virtual Observatory'. AstroGrid's three-year programme comprises a one-year Phase A study, followed by a two-year Phase B implementation. As Phase A draws to a close, the project is designing the architecture of the system it will develop in Phase B, and this is one of a series of project meetings finalising that architectural design.

Tutorial on XML: Schema Design, Processing and Storage (ToX2) (Training) 5th–7th June, 2002
Twenty-four participants from ten organisations arrived at NeSC for this second Tutorial on XML, taught by Nick Todd (Stilo Technology).
A three-day intensive tutorial intended for people with little experience of defining and using XML schema, it focused on those aspects of XML that are important for describing and communicating data, as these are widely used in e-Science and are essential foundations for working on Web Services. It covered schema design and schema processing to develop practical skills and considered storage technologies for XML, including Xindices and use of these via the tools being built in the OGSA-Database Access and Integration project. The tutorial also gave an opportunity to develop practical skills using the software tool XML Spy, as well as experience with Xindices and an XML storage service by Rob Baxter and Matt Egbert. Topics covered by the tutorial included well-formed XML Instances, Universal Resource Identifiers (URIs), Composed XML documents, Standard and Complex XML Schema types, reuse of schema definitions, Validity of XML Instances with respect to schema, xsi:type, Design of XML schema and Validity checking, Processing XML, Available software & standards and Categories of XML parsers. The event concluded with examples of XML programming practice. The ‘Pod’ was used as a training lab for all practical sessions.

Bioinformatics Meeting (Research) 6th June, 2002
Organised by Professor Bonnie Webber, this was one in a series of informal bioinformatics meetings, at which people from the local area talk about their work related to bioinformatics.

WoWS 2: Workshop on Web Services 2 (Training) 10th–12th June, 2002
Thirty-two participants from fifteen organisations met at NeSC for this second workshop, taught by Nick Todd (Stilo Technology). This workshop was a three-day intensive practical course with opportunities for hands-on work. The workshop assumed familiarity with XML and XML schema, at least to the level presented at ToX 2. It considered all of the aspects of Web Services in as much depth as could be managed in the time available.
It also reviewed available tools, by sharing the knowledge of the participants and by Access Grid presentations. The first day was allocated predominantly to oral presentation of the concepts of web services and a discussion of their application. The remaining two days also offered time for people to try practical exercises or discuss issues. Topics covered included Motivation, Web Services Application Architectures & Service Stacks, Web Service Description Language (WSDL) & Simple Object Access Protocol (SOAP), Creating Web Services (mostly using Apache Axis), Registering and Discovering Web Services (WSIL & UDDI), Platforms, Tools and Middleware. As with the previous tutorial, the ‘Pod’ was used as a training lab for all practical sessions.

Getting OGSA Going 2 (GOG 2) (Community Building) 13th–14th June, 2002
Nineteen participants met for this event, representing eleven organisations. Organised by Malcolm Atkinson and Steve Newhouse (LeSC), it discussed the Open Grid Services Architecture, put forward as a proposal at GGF4 both in a panel and at a BOF that is hoping to establish an Open Grid Services Infrastructure WG. The UK e-Science community has agreed to actively engage in the development and exploitation of OGSA through the OGSI WG and through its GGF Database Access and Integration WG. The meeting followed an OGSA Early Adopters Workshop at Argonne National Labs. The two goals of the meeting were to develop and share understanding of OGSA and how work should develop, and to develop a community that will sustain the efforts of early UK OGSA adopters.

Edinburgh & Glasgow Bioinformatics Collaboration (Community Building) 17th June, 2002
For this event, members of the Bioinformatics community in Edinburgh and Glasgow met to discuss mutual interests in using Grid technology for collaboration in research.
Proposals for future developments were also put forward, as well as an overview of developments up to this point.

WoDL 1: Workshop on Discovery Link 1 (Training) 24th–25th June, 2002
Twenty-five participants from eighteen organisations arrived for this workshop, organised by Malcolm Atkinson, Andy Knox (IBM) and David White (IBM), featuring a Web Lecture on DiscoveryLink. The workshop started by introducing the capabilities of Discovery Link and showing participants how it may be used to access data from a number of different databases. It showed how adapters are built to access various forms of database and semi-structured data, and developed an understanding of operational issues, such as required platforms, the optimiser and development tools. Through IBM's Scholar's Programme both DB2 and Discovery Link are freely available to researchers. The workshop was divided across two days: the first being general education and lectures, and the second featuring hands-on tutorials. The objectives were to show the capabilities of Discovery Link as an aid to research, to demonstrate how to connect a number of databases so that researchers in their discipline can use them as an integrated resource, and to encourage the formation of a community of e-Science Discovery Link users. Highlights included a lesson in DB2 quick conversion and a DiscoveryLink Technical Overview, with Implementation and wrappers (BLAST, Documentum, etc), finishing with a Review of Federated Query Processing. For this event, the NeSC support staff were able to hire additional workstations and, effectively, construct a temporary computer lab for hands-on training.

N+N Collaboration with China (International) 26th–27th June, 2002
More than thirty-five delegates, representing over twenty-four organisations in the UK and China, met at NeSC for this ground-breaking event, organised by the Biotechnology and Biological Sciences Research Council, and jointly hosted by the BBSRC and the National e-Science Centre.
The event was intended to encourage collaboration between the UK and the Chinese e-Science Communities. Delegates from both countries met to discuss a variety of e-Science topics, including post genomics, drug discovery, structural genomics and biodiversity.

GGF5/HPDC11 – Co-located Workshops (Research) 21st–26th July, 2002
As well as the main GGF5/HPDC11 event described earlier in this report, there were a number of co-located events. These included:

Meeting                      Date         Organisers
SUN HPC Workshop             19–20 July   Brian Hammond, Sun Microsystems, Santa Clara
Applications Workshop        20 July      Ed Seidel, Max-Planck-Institut für Gravitationsphysik, Germany
GridLab                      21 July      Andre Merzky, Zuse Institute, Berlin
Intergrid                    21 July      PPARC
Active Middleware Services   23 July      Dennis Gannon, Salim Hariri
Grid Luminaries              23 July      Andy Knox, IBM, Rochester
JSSPP                        24 July      Dror Feitelson, The Hebrew University
GRIDSTART Cluster Plenary    25 July      Maureen Wilkinson, EPCC
WACE/HPDC                    26 July      Mike Papka, Jason Leigh, Department of Computer Science, Argonne

Further details for these co-located events can be found at the eSI past events website detailed above.

National e-Science Centre
15 South College Street
Edinburgh EH8 9AA
www.nesc.ac.uk