The monthly newsletter from the National e-Science Centre
NeSC News Issue 64 November 2008 www.nesc.ac.uk

Reading the Book of Nature
By Iain Coleman

Since Antiquity, philosophers and scientists have described experimental enquiry as reading the Book of Nature. But what if that’s more than just a metaphor? Can the tools of textual analysis be applied to reading biological data? And can the birth and evolution of a text be analysed as if it were an organism or a species? These are the questions that the workshop on “Living texts: interdisciplinary approaches and methodological commonalities in biology and textual analysis”, held at the e-Science Institute on 16-17 October, set out to explore.

Natural languages form units such as sentences, whose meaning depends on which words are used and on how they relate to one another through word order or inflection. The structure of the amino acid sequence that forms a protein determines how it folds up into a unique three-dimensional shape. The arrangement of base pairs in a DNA sequence determines how an organism will develop and function. In all three cases, we are concerned with the effects of particular forms of composition and structure.

Julia Hockenmaier (University of Illinois at Urbana-Champaign) showed how the complex folding process in proteins can be represented in a tree diagram similar to those used by linguists to analyse sentences, allowing similar algorithms to be employed. When a full simulation of protein folding requires a petaflop supercomputer, this kind of abstraction becomes very attractive. Gene sequence studies are also big science, and Siu-wai Leung (Edinburgh) discussed how determining a formal grammar for the language of DNA can be valuable both in computational analysis and in laboratory experiments.

The analogies can also work the other way round.
Just as a single cell develops into an organism composed of millions of cells working together as part of the whole, so a text grows from an idea into a structure of words, paragraphs and chapters that make up the body of the work. And, like organisms reproducing by creating modified versions of themselves, texts are copied and reproduced down the centuries, often changing in the process.

Caroline Macé and Philippe Baret (University of Louvain) presented an example of textual evolution in the manuscripts of the fourth-century Church Father Gregory the Theologian. His works were copied so extensively between the ninth and sixteenth centuries that more than a thousand copies have survived to this day. The relationships between all these copies can be effectively teased out using methods taken from biological classification.

A more speculative idea was introduced by Ewa Sikora and Monika Szumowska (Polish Academy of Sciences), who proposed that a single text can develop similarly to an organism. Where animals undergo the stages of prenatal development, texts are built up through successive drafts until they are ready for submission to an editor. Paragraphs and chapters are created to serve different functions within a text, and are subject both to linguistic rules and to the principles of plotting and characterisation which hold paragraphs and chapters together. The postnatal development comes when the text is worked on by the editor and proof-reader, when it can be subjected to substantial, sometimes traumatic, changes. This analogy may go some way towards explaining why the methods of biological classification prove fruitful in analysing texts.

[Image: Gregory the Theologian]

But if life is a process of complex self-organisation, then perhaps it is not too much of a stretch to say that the scientific literature itself is being brought to life by new ways of discovering and structuring information.
Markup systems have emerged as a key method of structuring textual information in an ordered hierarchy so that machines can understand it and reason with it. Current systems, though, hit problems when they are faced with texts that have discontinuous or overlapping elements, even such simple sticking points as a sentence that starts on one page and ends on another. Claus Huitfeldt (University of Bergen) and Michael Sperberg-McQueen (W3C) are developing alternative markup systems with features that enable them to overcome these difficulties.

This is just one way of giving users the tools they need to thrive in the era of e-Science. As science becomes increasingly data-driven, researchers want to be able to easily obtain, annotate and share data, integrating it into workflows and web services and creating semantic metadata. Text mining is a key component of this approach, and the National Centre for Text Mining provides resources, tools and services to e-Scientists for just this purpose. As John McNaught (University of Manchester) explained, the centre has begun in the field of biology, applying robust, established techniques from linguistics and developing new techniques for mining the biological literature. The plan is for this effort to expand step by step into other fields, customising the systems for each domain.

This workshop was part of the “e-Science in the Arts and Humanities” theme at the e-Science Institute, and Theme Leader Stuart Dunn wrapped up by sketching out the way ahead. Having discussed the various methodologies, the next step for this disparate new community is to articulate its technological needs and establish a set of use models. These steps will lead to being able to read the Book of Nature in ways the ancients never dreamed possible.
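To make the overlap problem from the markup discussion concrete, here is a minimal sketch in Python. It assumes a simple standoff representation, in which each annotation is a label plus a character range kept apart from the text. This is a general technique chosen for illustration, not the alternative markup systems that Huitfeldt and Sperberg-McQueen are developing, and the names `Span`, `overlaps_without_nesting` and `crossing_pairs`, along with the sample text and offsets, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A standoff annotation: a label over the half-open character range [start, end)."""
    label: str
    start: int
    end: int


def overlaps_without_nesting(a: Span, b: Span) -> bool:
    """True when the spans share characters but neither contains the other."""
    intersect = a.start < b.end and b.start < a.end
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return intersect and not (a_in_b or b_in_a)


def crossing_pairs(spans):
    """Return every pair of spans that a single strict hierarchy could not hold."""
    return [
        (a, b)
        for i, a in enumerate(spans)
        for b in spans[i + 1:]
        if overlaps_without_nesting(a, b)
    ]


# Illustrative data: the page break falls inside the second sentence.
text = "It was late. The rain had stopped."
spans = [
    Span("page-1", 0, 17),
    Span("page-2", 17, 34),
    Span("sentence-1", 0, 12),
    Span("sentence-2", 13, 34),
]

for a, b in crossing_pairs(spans):
    print(a.label, "crosses", b.label)
```

In this toy example the second sentence begins on the first page and ends on the second, so the page and sentence structures cannot both be expressed as one tree of properly nested elements. Keeping the annotations as independent ranges sidesteps that, at the cost of having to check relationships between spans explicitly.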
Slides and other material from this event can be downloaded from http://www.nesc.ac.uk/esi/events/907/

Workshop Programme to launch the 4th International Digital Curation Conference (IDCC08)

The Digital Curation Centre is pleased to announce an innovative programme of pre-conference workshops to be held in Edinburgh on Monday 1 December 2008. These half-day workshops will cover a range of tools and services including the DCC Curation Lifecycle model, the Data Audit Framework (DAF) toolkit, the Digital Repository Audit Method Based on Risk Assessment (DRAMBORA) Interactive toolkit and a demonstration of recently developed DCC curation tools. There will also be a Repository Curation Service Environments (RECURSE) Workshop, jointly supported by OGF-Europe and DReSNET, which will focus on highlighting application environments.

A pre-conference drinks reception will take place after the workshops on the evening of 1st December at Our Dynamic Earth (http://www.dynamicearth.co.uk/) from 6pm.

IDCC08 will open on Tuesday 2nd December with a keynote address from Professor David Porteous, University of Edinburgh, Chair of Human Molecular Genetics & Medicine/Generation Scotland. The programme will then move on to focus on the concept of Radical Sharing with reference to three specific projects: the iPlant Collaborative, the CARMEN Project (Code Analysis, Repository and Modelling for e-Neuroscience), and Open Notebook Science.

After lunch there will be a session on the Sustainability of Curation with input from Dr. Bryan Lawrence, Director of STFC Centre for Environmental Data Archival; Neil Beagrie, Director of Charles Beagrie Ltd; and Brian Lavoie, Co-Chair, Blue Ribbon Task Force on Sustainable Digital Preservation & Access. The final part of the day’s programme will consider the legal issues surrounding the curation and reuse of data, led by John Wilbanks who, as VP of Science, runs the Science Commons project at Creative Commons.
Throughout the conference there will be an exhibition of posters and a room set aside for demonstrations, including the DRAMBORA and DAF toolkits; DCC curation tools; Swirrl, a wiki for data; A.nnotate.com, a collaborative online document annotation tool; the Kultur Project: Repositories for Art Research; and the VidArch project, which will be seeking feedback on ContextMiner, a system for building digital collections based on digital video.

The conference will open on Day 2 with Martin Lewis, Director of Library Services & University Librarian at the University of Sheffield. Martin will address the topic of “University Libraries in the UK Data Curation Landscape”. The programme will then move to the peer-reviewed papers selected by the IDCC Programme Committee. The major themes here will be Infrastructure, Digital Curation in Practice, Lifecycle & Models and Metadata & Tools. The conference will end with a closing address by Malcolm Atkinson, Director of the National e-Science Institute and e-Science Envoy.

This conference is being held in partnership with the National e-Science Centre and supported by the Coalition for Networked Information (CNI) http://www.cni.org/

For further information on the workshops and conference see http://www.dcc.ac.uk/events/dcc-2008/

North East Regional e-Science Centre
By Iain Coleman

NEReSC aims to be a regional centre of excellence in e-Science, but that’s only part of the story. Newcastle is striving to become a “Science City”, in a regional initiative to create technology-based jobs in the North East of England. In support of this scheme, NEReSC is helping to create a science infrastructure for businesses and universities throughout the region.

[Image credit: photograph by Tagishsimon, licensed under Creative Commons]

The coal may be gone and the great ships long departed, but it’s impossible to visit Newcastle without being reminded of its rich history as a pioneering industrial city.
So it’s no surprise that the North East Regional e-Science Centre (NEReSC), based at Newcastle University, pursues its cutting-edge research with an ambitious, pragmatic and commercially minded spirit.

[Image: Newcastle upon Tyne]

Activities at NEReSC span the full spectrum from pure research to commerce. In scientific research, the centre leads the e-Science pilot projects Gold and CARMEN, plays a major role in myGrid and MESSAGE, and is a key partner in the CISBAN systems biology centre. It also maintains collaborations with many leading UK universities. Industrial activities include not only the Newcastle University spin-off company Arjuna, but also strong links with companies such as Red Hat, BT, Oracle and Microsoft. Indeed, NEReSC has been praised in industry for the influence its work has had on vendors’ strategies.

There have been two principles behind NEReSC’s success. The first has been to build long-term relationships with leading researchers in particular scientific disciplines. Each discipline has its own idiosyncratic working practices, and often the first project is really about figuring out how to work together, with the major advances coming in subsequent projects. Given that all the initial projects at NEReSC have had follow-ons, the strategy would seem to be a fruitful one.

The second principle is to build systems on mature, widely adopted software. This means sticking to industry software where possible, and avoiding the trap of building on some attractive piece of grid software that fails to last beyond a few years.

If Newcastle is indeed to reinvent itself as a centre of 21st-century industry, NEReSC will be at the heart of it. And perhaps one day it will even join Stephenson’s locomotive works and the iconic Tyne bridges on the itinerary of industrial heritage tours.
More information about NEReSC is available here: http://www.neresc.ac.uk/

Live Forever, or Die Trying
By Iain Coleman

The Sibyl of Cumae was favoured by Apollo, and he granted her a single wish. Taking a handful of sand, she asked the god to grant her as many birthdays as there were sand-grains in her hand, but she neglected to add the condition of eternal youth. When she had lived many centuries, and was so decrepit that she could do nothing but hang in a jar by her cave, some young boys asked her, “Sibyl, what do you want?” She replied, “I want to die”.

For decades now, lifespans have been increasing, thanks to advances in medical care and public health. But the period of decrepitude late in life has also become extended. The Centre for Integrated Systems Biology of Ageing and Nutrition (CISBAN), involving scientists at NEReSC and researchers at Newcastle General Hospital, is trying to address the health problems that come with old age. For the past three years, it has brought together researchers studying aging in yeast cells, mice, human cultured cells and a cohort of elderly people, with the work on databases and data analysis at NEReSC as the glue that holds the project together.

Aging is a complex problem. It arises from a variety of different factors, including direct damage to genes and feedback loops in which damaged mitochondria create free radicals that cause more mitochondrial damage. There is also a delicate balance between stopping cells from aging and triggering cancer. The systems biology approach pursued by CISBAN seeks to integrate all these aspects of aging, to develop a complete understanding of the process in its entirety.

In pursuing this ambitious goal, CISBAN has created an array of tools and resources, some of which have been taken up by researchers further afield.
For example, the data archive system SyMBA can be widely applied to systems biology research in general, and is already in use by other projects. The centre has also been heavily involved in the standards process: unglamorous work, but vital in such a complex and multifaceted problem.

Of course, we already know how to prevent many of the problems of late middle age. Eat a healthy diet. Don’t drink too much. Don’t smoke. The problem is that people, by and large, aren’t terribly keen to follow this advice.

[Image: CISBAN Lab]

The holy grail of longevity is a pill that will treat the symptoms of normal aging. With all the work that is being done to develop drugs for the diseases of aging, treatments for the normal symptoms of aging may not be far behind. In the next decade or two, health problems that were once thought to be an inevitable part of life may start to become a thing of the past.

If we can, in the long run, conquer aging, the effects on society could be profound. The question is, will any of us live long enough to see it?

CALL FOR ABSTRACTS: THE 4th EGEE USER FORUM (UF4)/OGF25 and OGF-Europe 2nd international event

This combined event, to be held on 2-6 March in Catania, Sicily, will once again strengthen the links between EGEE and the Open Grid Forum, bringing users and standards bodies together to ensure that the future of the Grid is complemented by the establishment of key standards.
The Programme Committee invites abstracts for contributions on one of the following general topics: Scientific results obtained using grid technology; Planned or on-going scientific work using the grid; Experiences from application porting and deployment; Grid Services exploiting and extending gLite middleware (job management, data management, monitoring, workflows etc); Programming environments; End-user environments and portal technologies; Emerging Technologies within the EGEE infrastructure (cloud, virtualization etc).

The Programme Committee kindly requests that submitted abstracts follow the pre-defined template provided online from the User Forum web site at CERN’s Indico (http://indico.cern.ch/conferenceDisplay.py?confId=40435).

Abstract submission opening date: 15th of October
Deadline for abstract submission: 5th of December
Notification of acceptance: 15th of January
Programme committee chair: Vangelis Floros, GRNET (efloros@grnet.gr)
Local organising committee contact: Roberto Barbera, INFN (roberto.barbera@ct.infn.it)

Computing Culture
By Iain Coleman

A living kitchen that works in harmony with its owner. A search system for bodily movements. Jewellery that connects people and places all over the world. These are just some of the ideas flourishing in the fertile ground of Culture Lab at NEReSC, a hothouse of electronic creativity.

One corner of Culture Lab is given over to the Ambient Kitchen, which at first glance looks like any domestic kitchen would look if it had been uprooted and incongruously dropped into a university laboratory. On closer inspection, however, it is a very special kitchen indeed.

[Image: In the Ambient Kitchen]

Sensors embedded in the floor, cupboards, appliances and food containers allow the kitchen to know how it is being used at any moment, and to track the whereabouts of objects and people. Integrated projectors display recipes and food information.
The main application of this pervasive computing environment is in assisting the elderly, particularly those suffering from dementia, by helping them to keep track of the cooking process, and by detecting if a person has got into difficulty in the kitchen. There are as yet no plans for a celebrity-endorsed model in which the recorded voice of Gordon Ramsay shouts at you if your scallops are overcooked.

[Image: Jayne Wallace: Blossom]

Elsewhere in the lab, the AMUC project is building a motion-capture database recording the movements of dancers, jugglers and magicians. One of the problems the team has found is how to search for particular movements within the database. You can’t exactly Google a graceful upwards sweep of the arm. Instead, the project team has developed a system of sketch-based retrieval. Using a digital pen and pad, you draw a sketch of the motion using some expressive gesture. The system characterises this qualitatively, then finds data which has a similar qualitative representation. This can produce a motley set of results, much like the first generation of internet search engines; but, again like those early search engines, it’s a vast improvement on searching by hand.

The potential inherent in a global distributed communications infrastructure is being explored in a very personal way through the design and development of digital jewellery. Artist Jayne Wallace started off with a PhD project in creating custom-made pieces of artistic jewellery that embody the personal lives of their owners. One woman she worked with had deeply felt family roots in Cyprus, but was now living in England. Wallace created for her a piece of jewellery, called “Blossom”, that opens up like a delicate flower in response to rainfall levels detected by a sensor on the family land in Cyprus. This work has now expanded to include neck pieces called “Journeys”. These come in pairs, and contain sensors such that if one is touched, the other responds.
These can transmit to one another anywhere on Earth, given appropriate internet connectivity, and provide a tangible way for people to stay in touch that transcends the limitations of voice, text and image.

[Image: Jayne Wallace: Journeys]

The work of Culture Lab illustrates just how deeply computing is entering into our lives. As more and more designers and artists take hold of the technologies of e-Science, the accoutrements of science fiction become everyday reality. William Gibson once said that the future is here already, it’s just unevenly distributed. If you want it in highly concentrated form, take a trip to Culture Lab.

For more information, contact Cultural Technologies Research Theme Leader Patrick Olivier: p.l.olivier@newcastle.ac.uk

SSOKU09 - 1st European Conference on Software Services and Service Oriented Knowledge Utilities technologies

SSOKU09, to be held in Brussels, Belgium, on 13-14 January 2009, aims to gather over 200 experts in the fields of Software & Services and SOKU, including high-level researchers and top industrial and political representatives from the European Commission and from national authorities. They will discuss the future of Grids & Software and Service oriented Architectures, and evaluate the findings presented in the ECSS White Paper on Software and Service Architectures, Infrastructures and Engineering and in the Challengers’ Research Agenda and Roadmap on Grids.

Participation by European SMEs is particularly encouraged, and attendance from European and international projects and initiatives, universities, media, and commercial, research or governmental organisations is most welcome. Participation is free of charge but subject to online registration.
To learn more about the event, and to register, please visit the SSOKU09 website: www.eu-ecss.eu/conference

Forthcoming Events Timetable

November
3: ECDF ‘Taking Stock’ event (NeSC)
3: MVM Research Symposium (TOE)
6-7: NERIES Data Portal for Seismology: Brainstorming Meeting (NeSC)
10-12: The Chris Date Seminar: A Relational Approach to SQL (eSI)
14: The e-Science Public Lecture - Climate Change (eSI)

December
1: 4th International Digital Curation Conference “Radical Sharing: Transforming Science?” (NeSC) http://www.dcc.ac.uk/events/dcc-2008/programme/
3-4: OMII-UK Operations & Management Meeting (NeSC) http://www.nesc.ac.uk/esi/events/942/
3: Workshop: Experimental Facilities for the Future Internet from a service perspective (NeSC) http://www.nesc.ac.uk/esi/events/925/

Further event pages: http://www.nesc.ac.uk/esi/events/910/ http://www.nesc.ac.uk/esi/events/918/

This is only a selection of events that are happening in the next few months. For the full listing go to the following websites:
Events at the e-Science Institute: http://www.nesc.ac.uk/esi/esi.html
External events: http://www.nesc.ac.uk/events/ww_events.html

If you would like to hold an e-Science event at the e-Science Institute, please contact:
Conference Administrator, National e-Science Centre, 15 South College Street, Edinburgh, EH8 9AA
Tel: 0131 650 9833 Fax: 0131 650 9819 Email: events@nesc.ac.uk

This NeSC Newsletter was edited by Gillian Law. Email: glaw@nesc.ac.uk
The deadline for the December 2008 issue is November 19, 2008.