NeSC News Reading the Book of Nature Issue 64 November 2008

The monthly newsletter from the National e-Science Centre
NeSC News
Issue 64 November 2008 www.nesc.ac.uk
Reading the Book of Nature
By Iain Coleman
Since Antiquity, philosophers
and scientists have described
experimental enquiry as reading the
Book of Nature. But what if that’s
more than just a metaphor? Can the
tools of textual analysis be applied
to reading biological data? And can
the birth and evolution of a text be
analysed as if it were an organism or
a species? These are the questions
that the workshop on “Living texts:
interdisciplinary approaches and
methodological commonalities in
biology and textual analysis”, held
at the e-Science Institute on 16-17
October, set out to explore.
Natural languages form units such
as sentences, whose meaning
depends on which words are used
and on how they relate to one another
through word order or inflection.
The structure of the amino acid
sequence that forms a protein
determines how it folds up into a
unique three-dimensional shape.
The arrangement of base pairs in a
DNA sequence determines how an
organism will develop and function.
In all three cases, we are concerned
with the effects of particular forms
of composition and structure. Julia
Hockenmaier (University of Illinois at
Urbana-Champaign) showed how the
complex folding process in proteins
can be represented in a tree diagram
similar to those used by linguists to
analyse sentences, allowing similar
algorithms to be employed. When
a full simulation of protein folding
requires a petaflop supercomputer,
this kind of abstraction becomes very
attractive. Gene sequence studies
are also big science, and Siu-wai
Leung (Edinburgh) discussed how
determining a formal grammar for the
language of DNA can be valuable
both in computational analysis and in
laboratory experiments.
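To make the idea of a grammar for DNA concrete, here is a minimal sketch in Python. It is an illustration only, not a method presented at the workshop. It assumes a toy context-free grammar, S -> A S T | T S A | C S G | G S C | (empty), which generates exactly the sequences that read the same as their own reverse complement. The function name is hypothetical.

```python
# Toy grammar (an assumption for illustration, not taken from the workshop):
#   S -> A S T | T S A | C S G | G S C | (empty)
# Each rule wraps a complementary pair of bases around a shorter string,
# so the grammar generates reverse-complement palindromes such as GAATTC.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_reverse_palindrome(seq: str) -> bool:
    """Return True if seq can be derived from the toy grammar above."""
    seq = seq.upper()
    if any(base not in COMPLEMENT for base in seq):
        return False
    left, right = 0, len(seq) - 1
    # Peel off one outer pair per step, mirroring one rule application.
    while left < right:
        if COMPLEMENT[seq[left]] != seq[right]:
            return False
        left += 1
        right -= 1
    # An odd-length string leaves one unpaired base, which no rule produces.
    return left > right


if __name__ == "__main__":
    for candidate in ["GAATTC", "GGCC", "GATTACA", "AAAA"]:
        print(candidate, is_reverse_palindrome(candidate))
```

GAATTC, the recognition site of the restriction enzyme EcoRI, is accepted because each base pairs with its complement at the mirrored position. A realistic grammar of DNA would need many more rules, and probably probabilities as well.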
The analogies can also work the
other way round. Just as a single cell
develops into an organism composed
of millions of cells working together
as part of the whole, so a text grows
from an idea into a structure of
words, paragraphs and chapters
that make up the body of the work.
And, like organisms reproducing
by creating modified versions of
themselves, texts are copied and
reproduced down the centuries, often
changing in the process. Caroline
Macé and Philippe Baret (University
of Louvain) presented an example of
textual evolution in the manuscripts
of the fourth-century Church Father,
Gregory the Theologian, that were
extensively copied in the ninth
to sixteenth centuries such that
more than a thousand copies have
survived to this day. The relationships
between all these copies can be
effectively teased out using methods
taken from biological classification.
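As a rough illustration of how a method from biological classification can be turned on manuscripts, the Python sketch below groups copies by how often their readings differ. Everything in it is an assumption made for the example: the four witnesses and their readings are invented, and average-linkage clustering (in the style of UPGMA) stands in for whatever methods the Louvain team actually used.

```python
from itertools import combinations

# Hypothetical data: the reading ("a" or "b") that each copy has at five
# places where the copies disagree. Not real manuscript evidence.
WITNESSES = {
    "A": "aabab",
    "B": "aabaa",
    "C": "bbabb",
    "D": "bbaba",
}


def hamming(x, y):
    """Count the places where two equally long reading strings differ."""
    return sum(1 for p, q in zip(x, y) if p != q)


def leaves(tree):
    """List the witness names contained in a (possibly nested) cluster."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def average_distance(t1, t2, witnesses):
    """Mean pairwise distance between the members of two clusters."""
    pairs = [(a, b) for a in leaves(t1) for b in leaves(t2)]
    return sum(hamming(witnesses[a], witnesses[b]) for a, b in pairs) / len(pairs)


def cluster(witnesses):
    """Repeatedly merge the two closest clusters; return a nested tuple."""
    clusters = sorted(witnesses)
    while len(clusters) > 1:
        t1, t2 = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(pair[0], pair[1], witnesses),
        )
        clusters = [c for c in clusters if c is not t1 and c is not t2]
        clusters.append((t1, t2))
    return clusters[0]


if __name__ == "__main__":
    print(cluster(WITNESSES))
```

With these invented readings, copies A and B fall into one family and C and D into another. A tree built this way only groups similar copies; it does not say which copy was made from which.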
A more speculative idea was
introduced by Ewa Sikora and
Monika Szumowska (Polish Academy
of Sciences), who proposed that a
single text can develop similarly to an
organism. Where animals undergo
the stages of prenatal development,
texts are built up through successive
drafts until they are ready for
submission to an editor. Paragraphs
and chapters are created to serve
different functions within a text, and
are subject both to linguistic rules
and to the principles of plotting
and characterisation which hold
paragraphs and chapters together.
The postnatal development comes
when the text is worked on by the
editor and proof-reader, when it
can be subjected to substantial –
sometimes traumatic – changes. This
analogy may go some way towards
explaining why the methods of
biological classification prove fruitful
in analysing texts.
Gregory the Theologian
But if life is a process of complex
self-organisation, then perhaps it
is not too much of a stretch to say
that the scientific literature itself is
being brought to life by new ways
of discovering and structuring
information. Markup systems
have emerged as a key method of
structuring textual information in an
ordered hierarchy so that machines
can understand it and reason with
it. Current systems, though, hit
problems when they are faced with
texts that have discontinuous or
overlapping elements, even such
simple sticking points as a sentence
that starts on one page and ends on
another. Claus Huitfeldt (University
of Bergen) and Michael Sperberg-McQueen (W3C) are developing
alternative markup systems with
features that enable them to
overcome these difficulties. This
is just one way of giving users the
tools they need to thrive in the era
of e-Science. As science becomes
increasingly data-driven, researchers
want to be able to easily obtain,
annotate and share data, integrating
it into workflows and web services
and creating semantic metadata. Text
mining is a key component of this
approach, and the National Centre
for Text Mining provides resources,
tools and services to e-Scientists for
just this purpose. As John McNaught
(University of Manchester) explained,
the centre has begun in the field of
biology, applying robust, established
techniques from linguistics and
developing new techniques for
mining the biological literature. The
plan is for this effort to expand step
by step into other fields, customising
the systems for each domain.
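The overlap problem mentioned above, a sentence that starts on one page and ends on the next, can be made concrete with a short Python sketch. It uses stand-off annotation, in which each element is recorded as a pair of character offsets rather than as nested tags. This is a common workaround and is shown only as an assumption for illustration. It is not the system being developed by the researchers named above, and the sample text, class and function names are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Span:
    kind: str   # e.g. "page" or "sentence"
    start: int  # offset of the first character (inclusive)
    end: int    # offset just past the last character (exclusive)


def overlaps_without_nesting(a, b):
    """True if two spans share text but neither contains the other."""
    shares_text = a.start < b.end and b.start < a.end
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return shares_text and not (a_in_b or b_in_a)


# Invented sample text; the page break falls in the middle of sentence two.
TEXT = "It was late. The rain fell on and on. Nobody came."
page_break = TEXT.index(" on and")
second_start = TEXT.index("The rain")
second_end = TEXT.index(" Nobody")

spans = [
    Span("page", 0, page_break),
    Span("page", page_break, len(TEXT)),
    Span("sentence", 0, TEXT.index(" The rain")),
    Span("sentence", second_start, second_end),
    Span("sentence", second_end + 1, len(TEXT)),
]

# Pairs of different kinds that a single tree of nested tags cannot hold.
conflicts = [
    (a, b)
    for a, b in combinations(spans, 2)
    if a.kind != b.kind and overlaps_without_nesting(a, b)
]

if __name__ == "__main__":
    for page, sentence in conflicts:
        print(page, "overlaps", repr(TEXT[sentence.start:sentence.end]))
```

Both conflicts involve the middle sentence, which straddles the page break. Because the offsets sit outside the text, both hierarchies can coexist, at the cost of losing the simple tree that ordinary markup tools expect.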
This workshop was part of the “e-Science in the Arts and Humanities”
theme at the e-Science Institute,
and Theme Leader Stuart Dunn
wrapped up by sketching out the
way ahead. Having discussed the
various methodologies, the next step
for this disparate new community is
to articulate the technological needs
and establish a set of use models.
These steps will lead to being able to
read the Book of Nature in ways the
ancients never dreamed possible.
Slides and other material from this
event can be downloaded from
http://www.nesc.ac.uk/esi/events/907/
Workshop Programme to launch the 4th
International Digital Curation Conference
(IDCC08)
The Digital Curation Centre is pleased to announce an innovative programme
of pre-conference workshops to be held in Edinburgh on Monday 1 December
2008.
These half-day workshops will cover a range of tools and services including
the DCC Curation Lifecycle model, the Data Audit Framework (DAF)
toolkit, the Digital Repository Audit Method Based on Risk Assessment
(DRAMBORA) Interactive toolkit and a demonstration of recently developed
DCC curation tools. There will also be a Repository Curation Service
Environments (RECURSE) Workshop, jointly supported by OGF-Europe and
DReSNET, which will focus on highlighting application environments.
A pre-conference drinks reception will take place after the workshops on the
evening of 1st December at Our Dynamic Earth (http://www.dynamicearth.co.uk/)
from 6pm.
IDCC08 will open on Tuesday 2nd December with a keynote address from
Professor David Porteous, University of Edinburgh, Chair of Human Molecular
Genetics & Medicine/Generation Scotland. The programme will then move
on to focus on the concept of Radical Sharing with reference to three specific
projects: the iPlant Collaborative, the CARMEN Project (Code Analysis,
Repository and Modelling for e-Neuroscience), and Open Notebook Science.
After lunch there will be a session on the Sustainability of Curation with input
from Dr. Bryan Lawrence, Director of STFC Centre for Environmental Data
Archival, Neil Beagrie, Director of Charles Beagrie Ltd and Brian Lavoie,
Co-Chair, Blue Ribbon Task Force on Sustainable Digital Preservation &
Access. The final part of the day’s programme will consider the legal issues
surrounding the curation and reuse of data, led by John Wilbanks who, as VP
of Science, runs the Science Commons project at Creative Commons.
Throughout the conference there will be an exhibition of posters and a room set
aside for demonstrations including the DRAMBORA and DAF toolkits, DCC
curation tools, Swirrl (a wiki for data), A.nnotate.com (a collaborative online
document annotation tool), the Kultur Project: Repositories for Art Research,
and the VidArch project, which will be seeking feedback on ContextMiner, a
system for building digital collections based on digital video.
The conference will open on Day 2 with Martin Lewis, Director of Library
Services & University Librarian at the University of Sheffield. Martin will
address the topic of “University Libraries in the UK Data Curation Landscape”.
The programme will then move to the peer-reviewed papers selected by the
IDCC Programme Committee. The major themes here will be Infrastructure,
Digital Curation in Practice, Lifecycle & Models and Metadata & Tools.
The conference will end with a closing address by Malcolm Atkinson, Director
of the National e-Science Institute and e-Science Envoy.
This conference is being held in partnership with the National e-Science
Centre and supported by the Coalition for Networked Information (CNI):
http://www.cni.org/
For further information on the workshops and conference see
http://www.dcc.ac.uk/events/dcc-2008/
North East Regional e-Science Centre
By Iain Coleman
The coal may be gone and the
great ships long departed, but it’s
impossible to visit Newcastle without
being reminded of its rich history as
a pioneering industrial city. So it’s no
surprise that the North East Regional
e-Science Centre (NEReSC),
based in Newcastle University,
pursues its cutting-edge research
with an ambitious, pragmatic and
commercially minded spirit.
NEReSC aims to be a regional
centre of excellence in e-Science,
but that’s only part of the story.
Newcastle is striving to become a
“Science City”, in a regional initiative
to create technology-based jobs in
the North East of England. In support
of this scheme, NEReSC is helping
to create a science infrastructure
for businesses and universities
throughout the region.
Photograph by Tagishsimon. Licensed under Creative Commons
Newcastle upon Tyne
Activities at NEReSC span the full spectrum from pure research to commerce. In scientific research, the centre leads
the e-Science pilot projects Gold and CARMEN, plays a major role in myGrid and MESSAGE, and is a key partner
in the CISBAN systems biology centre. It also maintains collaborations with many leading UK universities. Industrial
activities include not only the Newcastle University spinoff company Arjuna, but also strong links with companies such
as Red Hat, BT, Oracle and Microsoft. Indeed, NEReSC has been praised in industry for the influence its work has had
on vendors’ strategies.
There have been two principles behind NEReSC’s
success. The first has been to build long-term
relationships with leading researchers in particular
scientific disciplines. Each discipline has its own
idiosyncratic working practices, and often the first project
is really about figuring out how to work together, with the
major advances coming in subsequent projects. Given
that all the initial projects at NEReSC have had follow-ons, the strategy would seem to be a fruitful one.
The second key point is to build systems on mature,
widely adopted software. This means sticking to industry
software where possible, and avoiding the trap of building
on some attractive piece of grid software that fails to last
beyond a few years.
If Newcastle is indeed to reinvent itself as a centre of 21st
century industry, NEReSC will be at the heart of it. And
perhaps one day it will even join Stephenson’s locomotive
works and the iconic Tyne bridges on the itinerary of
industrial heritage tours.
More information about NEReSC is available here:
http://www.neresc.ac.uk/
Live Forever, or Die Trying
By Iain Coleman
The Sibyl of Cumae was favoured by
Apollo, and he granted her a single
wish. Taking a handful of sand, she
asked the god to grant her as many
birthdays as there were sand-grains
in her hand – but she neglected to
add the condition of eternal youth.
When she had lived many centuries,
and was so decrepit that she could
do nothing but hang in a jar by her
cave, some young boys asked her
“Sibyl, what do you want?” She
replied “I want to die”.
For decades now, lifespans have
been increasing, thanks to advances
in medical care and public health.
But the period of decrepitude late
in life has also become extended.
The Centre for Integrated Systems
Biology of Ageing and Nutrition
(CISBAN), involving scientists
at NEReSC and researchers at
Newcastle General Hospital, is trying
to address the health problems that
come with old age. For the past
three years, it has brought together
researchers studying aging in yeast
cells, mice, human cultured cells and
a cohort of elderly people, with the
work on databases and data analysis
at NEReSC as the glue that holds the
project together.
Aging is a complex problem. It arises
from a variety of different factors,
including direct damage to genes and
feedback loops in which damaged
mitochondria create free radicals
that cause more mitochondrial
damage. There is also a delicate
balance between stopping cells from
aging, and triggering cancer. The
systems biology approach pursued
by CISBAN seeks to integrate all
these aspects of aging, to develop
a complete understanding of the
process in its entirety.
In pursuing this ambitious goal,
CISBAN has created an array of
tools and resources, some of which
have been taken up by researchers
further afield. For example, the
data archive system SyMBA can be
widely applied to systems biology
research in general, and is already
in use by other projects. The centre
has also been heavily involved in
the standards process: unglamorous
work, but vital in such a complex and
multifaceted problem.
Of course, we already know how
to prevent many of the problems of
late middle age. Eat a healthy diet.
Don’t drink too much. Don’t smoke.
The problem is that people, by and
large, aren’t terribly keen to follow
this advice. The holy grail of longevity
is a pill that will treat the symptoms of
normal aging.
CISBAN Lab
With all the work that is being done
to develop drugs for the diseases
of aging, treatments for the normal
symptoms of aging may not be far
behind. In the next decade or two,
health problems that were once
thought to be an inevitable part of
life may start to become a thing of
the past. If we can, in the long run,
conquer aging, the effects on society
could be profound. The question is,
will any of us live long enough to see
it?
CALL FOR ABSTRACTS: THE 4th EGEE USER FORUM (UF4)/OGF25 and OGF-Europe 2nd international event
This combined event, to be held on 2-6 March in Catania, Sicily, will once again strengthen the links between EGEE
and the Open Grid Forum, bringing users and standards bodies together to ensure that the future of the Grid is
complemented by the establishment of key standards.
The Programme Committee invites abstracts for contributions on one of the following general topics: Scientific
results obtained using grid technology; Planned or on-going scientific work using the grid; Experiences from
application porting and deployment; Grid Services exploiting and extending gLite middleware (job management,
data management, monitoring, workflows etc); Programming environments; End-user environments and portal
technologies; Emerging Technologies within the EGEE infrastructure (cloud, virtualization etc).
The Programme Committee kindly requests that submitted abstracts follow the pre-defined template provided online
from the User Forum web site at CERN’s Indico (http://indico.cern.ch/conferenceDisplay.py?confId=40435).
Abstract submission opening date: 15th of October
Deadline for abstract submission: 5th of December
Notification of acceptance: 15th of January
Programme committee chair: Vangelis Floros, GRNET (efloros@grnet.gr). Local organising committee contact:
Roberto Barbera, INFN (roberto.barbera@ct.infn.it)
Computing Culture
By Iain Coleman
A living kitchen, that works in
harmony with its owner. A search
system for bodily movements.
Jewellery that connects people and
places all over the world. These are
just some of the ideas flourishing in
the fertile ground of Culture Lab at
NEReSC, a hothouse of electronic
creativity.
One corner of Culture Lab is
given over to the Ambient Kitchen,
which at first glance looks like any
domestic kitchen would look if it had
been uprooted and incongruously
dropped into a university laboratory.
On closer inspection, however, it
is a very special kitchen indeed.
Sensors embedded in the floor,
cupboards, appliances and food
containers allow the kitchen to know how it is being used at any moment, and to track the whereabouts of objects and
people. Integrated projectors display recipes and food information. The main application of this pervasive computing
environment is in assisting the elderly, particularly those suffering from dementia, by helping them to keep track of the
cooking process, and detecting if a person has got into difficulty in the kitchen. There are as yet no plans for a
celebrity-endorsed model in which the recorded voice of Gordon Ramsay shouts at you if your scallops are overcooked.
In the Ambient Kitchen
Jayne Wallace: Blossom
Elsewhere in the lab, the AMUC project is building a motion-capture database
recording the movements of dancers, jugglers and magicians. One of the
problems they have found is how to search for particular movements within
the database. You can’t exactly Google a graceful upwards sweep of the arm.
Instead, the project team has developed a system of sketch-based retrieval.
Using a digital pen and pad, you draw a sketch of the motion using some
expressive gesture. The system characterises this qualitatively, then finds
data which has a similar qualitative representation. This can produce a motley
set of results, much like the first generation of internet search engines – but,
again like those early search engines, it’s a vast improvement on searching
by hand.
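The retrieval idea described above, reducing a sketch to a qualitative description and then looking for stored motions with a similar description, can be illustrated with a small Python sketch. It is not the AMUC system. The direction alphabet (U, D, L, R), the use of edit distance to compare descriptions, the assumption that y increases upwards, and the three stored motions are all invented for the example.

```python
def qualitative_code(points):
    """Reduce a stroke to a string of direction symbols, merging repeats.

    Assumes y grows upwards, so a positive change in y means "U" (up).
    """
    symbols = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # the pen did not move between these two samples
        if abs(dx) >= abs(dy):
            symbol = "R" if dx > 0 else "L"
        else:
            symbol = "U" if dy > 0 else "D"
        if not symbols or symbols[-1] != symbol:
            symbols.append(symbol)
    return "".join(symbols)


def edit_distance(a, b):
    """Levenshtein distance between two symbol strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete a symbol
                current[j - 1] + 1,            # insert a symbol
                previous[j - 1] + (ca != cb),  # substitute a symbol
            ))
        previous = current
    return previous[-1]


def search(sketch, database):
    """Rank stored motions by how closely their codes match the sketch."""
    query = qualitative_code(sketch)
    return sorted(
        database,
        key=lambda name: edit_distance(query, qualitative_code(database[name])),
    )


# Hypothetical motion traces, flattened to 2D points for simplicity.
MOTIONS = {
    "arm_sweep_up": [(0, 0), (1, 3), (2, 7), (2, 10)],
    "side_step": [(0, 0), (3, 0), (6, 1)],
    "up_then_across": [(0, 0), (0, 4), (5, 5)],
}

if __name__ == "__main__":
    print(search([(0, 0), (0, 2), (1, 6)], MOTIONS))
```

Because the comparison ignores exact positions and speeds, a rough upward stroke matches the stored upward sweep even though the coordinates differ. The same looseness is what lets unrelated motions creep into the results.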
The potential inherent in a global distributed communications infrastructure is
being explored in a very personal way through the design and development
of digital jewellery. Artist Jayne Wallace started off with a PhD project in
creating custom-made pieces of artistic jewellery that embody the personal
lives of their owners. One woman she worked with had deeply felt family roots
in Cyprus, but was now living in England. Wallace created for her a piece of
jewellery, called “Blossom”, that opens up like a delicate flower in response
to rainfall levels detected by a sensor on the family land in Cyprus. This work
has now expanded to include neck pieces called “Journeys”. These come in
pairs, and contain sensors such that if one is touched, the other responds.
These can transmit to one another anywhere on Earth, given appropriate
internet connectivity, and provide a tangible way for people to stay in touch
that transcends the limitations of voice, text and image.
Jayne Wallace: Journeys
The work of Culture Lab illustrates just how deeply computing is entering into our lives. As more and more designers
and artists take hold of the technologies of e-Science, the accoutrements of science fiction become everyday
reality. William Gibson once said that the future is here already, it’s just unevenly distributed. If you want it in highly
concentrated form, take a trip to Culture Lab.
For more information, contact Cultural Technologies Research Theme Leader Patrick Olivier:
p.l.olivier@newcastle.ac.uk
SSOKU09 - 1st European Conference on Software Services and
Service Oriented Knowledge Utilities technologies
SSOKU09, to be held in Brussels, Belgium, on 13-14 January 2009, aims to gather over 200 experts in the fields of
Software & Services and SOKU, including high-level researchers and top industrial and political representatives from the
European Commission and from national authorities. They will discuss the future of Grids & Software and Service Oriented
Architectures, and evaluate the findings of the ECSS White Paper on Software and
Service Architectures, Infrastructures and Engineering and of the Challengers’ Research Agenda and Roadmap on
Grids.
Participation by European SMEs is particularly encouraged, and attendance from European and international projects and
initiatives, universities, media, and commercial, research or governmental organisations is most welcome.
Participation is free of charge but subject to online registration.
To learn more about the event, and to register, please visit the SSOKU09 website: www.eu-ecss.eu/conference
Forthcoming Events Timetable
November
3: ECDF ‘Taking Stock’ event (NeSC)
3: MVM Research Symposium (TOE)
6-7: NERIES Data Portal for Seismology: Brainstorming Meeting (NeSC)
10-12: The Chris Date Seminar: A Relational Approach to SQL (eSI)
14: The e-Science Public Lecture - Climate Change (eSI)
December
1: 4th International Digital Curation Conference “Radical Sharing: Transforming Science?” (NeSC) http://www.dcc.ac.uk/events/dcc-2008/programme/
3-4: OMII-UK Operations & Management Meeting (NeSC) http://www.nesc.ac.uk/esi/events/942/
3: Workshop: Experimental Facilities for the Future Internet from a service perspective (NeSC)
http://www.nesc.ac.uk/esi/events/925/
http://www.nesc.ac.uk/esi/events/910/
http://www.nesc.ac.uk/esi/events/918/
This is only a selection of events that are happening in the next few months. For the full listing go to the following
websites:
Events at the e-Science Institute: http://www.nesc.ac.uk/esi/esi.html
External events: http://www.nesc.ac.uk/events/ww_events.html
If you would like to hold an e-Science event at the e-Science Institute, please contact:
Conference Administrator,
National e-Science Centre, 15 South College Street, Edinburgh, EH8 9AA
Tel: 0131 650 9833 Fax: 0131 650 9819
Email: events@nesc.ac.uk
This NeSC Newsletter was edited by Gillian Law.
Email: glaw@nesc.ac.uk
The deadline for the December 2008 issue is November 19, 2008