Uptake and Sustainability of e-Research Technologies
Alexander Voss
alex.voss@ncess.ac.uk
National Centre for e-Social Science
and e-Science Institute
25th Oct., 2006
e-Science

…the large scale science that will increasingly be
carried out through distributed global collaborations
enabled by the Internet. Typically, a feature of such
collaborative scientific enterprises is that they will
require access to very large data collections, very large
scale computing resources and high performance
visualisation back to the individual user scientists.
(Research Councils UK)

Goal: to enable better research in all disciplines, to
enable research that was not feasible previously
Drivers

Technical
– Faster, cheaper devices, higher resolutions, increased
throughput, cheaper and higher capacity storage, increased
bandwidth, etc.

Research Process: coping with the data deluge
– Finding and accessing data
– Independent provision and ownership, local policies
– Linking data
– Processing data
– Interpreting data
– Presenting results

Increased international collaboration

Doing what was previously impossible
e-Research in the UK

UK e-Science Programme (since 2001)
International Programmes (esp. US, EU)
Supported data and information services
Access to scientific facilities
Communities developing resources, systems and practices
Pilot projects in most areas
Core middleware development and code repositories
Grid Technologies

‘An infrastructure that enables
flexible, secure, coordinated
resource sharing among
dynamic collections of
individuals, institutions and
resources.’
(Ian Foster and Carl Kesselman)

Like the power grid, the Grid
makes services available
through common interfaces
without the user having to
worry about the details of how
these services are provided.
Grid Technologies

What characterises a grid?
– Coordinated resource sharing
– Standard, open, general-purpose protocols and interfaces
– Delivering non-trivial qualities of service


The vision of “the Grid” has not yet been achieved, and there are
reasons why it may never be, but there are many ‘grids’, each
supporting one or more ‘virtual organisations’: people in different
organisations seeking to cooperate and share resources across their
organisational boundaries.
The Web, the Grid and the Internet

The Web provides access to documents (at least that is what it has been designed for)
Communication between remote servers and person using web browser
The Grid provides access to resources - data, computation, experimental apparatus, etc.
Communication between resources brought together to perform an overall task
Does not replace but rather complements the Web
The Internet is the common underlying infrastructure for both - it provides connectivity and basic management of networks independent of their use.
Grid Middleware

Grids support a vast range of applications accessing a range of different resources.
Grid Middleware provides the ‘glue’ that binds them together.
Open, standardised interfaces reduce the complexity of the interface between resources and applications.
[Diagram: three layers, with Applications on top, Grid Middleware in the middle and Resources at the bottom]
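
The layering on this slide can be sketched in a few lines of Python. This is only an illustration of the idea of a common interface, not the API of any real grid middleware: the class names (`Resource`, `ComputeCluster`, `DataArchive`, `Middleware`) and the `submit` method are hypothetical.

```python
from abc import ABC, abstractmethod


class Resource(ABC):
    """Hypothetical common interface that every resource exposes."""

    @abstractmethod
    def submit(self, task: str) -> str:
        """Carry out a task and return a short description of the result."""


class ComputeCluster(Resource):
    def submit(self, task: str) -> str:
        return f"cluster ran {task}"


class DataArchive(Resource):
    def submit(self, task: str) -> str:
        return f"archive served {task}"


class Middleware:
    """Hypothetical 'glue' layer: applications talk to this, not to resources."""

    def __init__(self) -> None:
        self._registry: dict[str, Resource] = {}

    def register(self, name: str, resource: Resource) -> None:
        self._registry[name] = resource

    def run(self, name: str, task: str) -> str:
        # The application only needs a name and a task; how the resource
        # does the work stays hidden behind the common interface.
        return self._registry[name].submit(task)


middleware = Middleware()
middleware.register("compute", ComputeCluster())
middleware.register("data", DataArchive())

print(middleware.run("compute", "simulation"))
print(middleware.run("data", "query"))
```

Adding a new kind of resource means writing one more `Resource` subclass; the application code that calls `middleware.run` does not change.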
Related (but not identical) concepts






Internet computing (e.g., FightAIDS@Home)
Peer-to-peer computing (e.g., Napster)
Utility computing (e.g., Sun, IBM)
Cluster computing (e.g., supercomputing, reliability)
Distributed systems (e.g., e-Business)
Groupware (e.g., Lotus Notes)
What is e-Research?

Extension of the concept of e-Science into other domains (social sciences, arts & humanities) and an extension from large research institutions into all parts of life where research might be conducted (e.g. in schools or at home).
Recognising the ‘small steps’ that are sometimes crucial in ‘big science’.
Grid technologies are not sufficient on their own to enable the vision of e-Science; other elements are needed, e.g., data sharing agreements, changed reward structures, domain standards, etc.
Focus more on uses of ICTs in research than on the technology per se.
e-Research is an emerging phenomenon - we all make use of modern ICT infrastructures in our daily research activities.
Example 1: Integrative Biology
Overview - Integrative Biology
• IB is an EPSRC-funded e-Science project tackling the UK’s two biggest killers, cancer and heart disease, through large-scale multi-scale simulations.
• Globally distributed and interdisciplinary community: US, Europe, New Zealand
• Developing a web-services-based grid infrastructure providing tailored access to compute and data resources.
Courtesy of Matthew Mascord, Oxford e-Research Centre
Integrative Biology VRE
Heart Modelling
Requires access to compute resources, data management facilities, visualisation capability and collaborative working tools.
Typically solving coupled systems of PDEs (tissue level) and non-linear ODEs (cellular level) for the electrical potential.
Complex three-dimensional geometries
Image: Investigation of how ischemic tissue interacts with electric shocks in order to improve defibrillation efficacy in patients with coronary heart disease. Courtesy of Tulane/Oxford
Image: Part of a study to figure out the arrangement of different cell types in the heart wall that accounts for the shape of the T wave in the ECG. Courtesy of Richard Clayton, Sheffield
Image: Visualization of Cardiac Virtual Tissue
Partners: Oxford, Sheffield, New Orleans, Washington Lee, UCSD, UCLA, Baltimore, Monash, Auckland, Graz, Utrecht
Courtesy of the Integrative Biology Consortium, funded by EPSRC
Example 2:
e-Social Science
(Grid-enabled microeconomic data
analysis)
Background
Social Science Problem and Policy Issue
• Researchers frequently have to use more than one data set in order to obtain a more complete answer to their questions
– What do we know about ethnic minority economic welfare when it is disaggregated by group and geography?
• One data set may provide a large sample of the target population, but offer incomplete coverage of the topics of interest
– Census data can lack direct measures of income
• Another data set with coverage of the topics of interest may not sample the target population adequately
– Survey data yield minority samples that may be too small for meaningful results to be obtained
Courtesy of Simon Peters
Data
• The British Household Panel Survey (BHPS) provides the small-scale survey data.
– BHPS is a longitudinal (panel) study with yearly waves.
• The Sample of Anonymised Records (SARs) provides the large-scale Census data.
– SARs are a random sample of individuals and households from the UK Census
• Uses 1991 data because of projected confidentiality restrictions on the publicly available version of the 2001 SARs.
– 2% sample of individuals, 1% sample of households.
Courtesy of Simon Peters
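
One simple way to combine a small survey that measures income with a large census sample that does not is to average income within cells of variables the two data sets share, then carry those averages across. The slides do not say which method the project used, so the sketch below is an assumption made purely for illustration; the records, the variable names (`region`, `occupation`, `income`) and the helper functions are all hypothetical and are not BHPS or SARs data.

```python
from collections import defaultdict
from statistics import mean

# Made-up survey records: few cases, but income is measured.
survey = [
    {"region": "North", "occupation": "manual", "income": 200.0},
    {"region": "North", "occupation": "manual", "income": 240.0},
    {"region": "South", "occupation": "clerical", "income": 300.0},
]

# Made-up census records: many cases in practice, but no income variable.
census = [
    {"id": 1, "region": "North", "occupation": "manual"},
    {"id": 2, "region": "South", "occupation": "clerical"},
    {"id": 3, "region": "South", "occupation": "manual"},
]

SHARED_KEYS = ("region", "occupation")


def cell_means(records, keys, target):
    """Average the target variable within each combination of shared keys."""
    cells = defaultdict(list)
    for record in records:
        cells[tuple(record[k] for k in keys)].append(record[target])
    return {cell: mean(values) for cell, values in cells.items()}


def impute(records, means, keys, target):
    """Attach the cell mean to each record; None where the survey has no match."""
    return [
        {**record, target: means.get(tuple(record[k] for k in keys))}
        for record in records
    ]


linked = impute(census, cell_means(survey, SHARED_KEYS, "income"),
                SHARED_KEYS, "income")
for row in linked:
    print(row)
```

The third census record falls in a cell the survey never sampled, so its income stays missing. That is the small-sample problem described on the Background slide showing up in miniature.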
Example 3:
Environmental e-Science
(Grid for Ocean Diagnostics
Interactive Visualisation and
Analysis)
Exploring environmental data with Google Maps and Google Earth
• “Godiva2” website provides very quick visualisations of numerical model and satellite data
• Scientists use an interactive website to select dataset to visualise on a draggable, zoomable map
– can view data at large range of scales
• Can then view same data in Google Earth
– 3-D globe
– Lightweight, easy to use GIS tool
– Can visualise alongside other datasets
• Don’t have to download any data!
• Images generated dynamically on the server
• Spin-off from GODIVA project
Courtesy of Jon Blower
Demo
• http://lovejoy.nerc-essc.ac.uk:8080/Godiva2/
Example 4:
Archaeology
(Silchester Roman Town)
Courtesy of Michael Fulford
Integrated Archaeological Database
Courtesy of Michael Fulford
Silchester: A VRE for Archaeology
Examples have shown instances of:













Use of public datasets
Confidentiality issues
Use of high-performance computing
Fieldwork - not all research happens inside!
Mapping geographies
Record linkage (coping with incomplete data)
Use of national infrastructures
Collaborative activities
Various web-based, desktop and mobile user interfaces
Management of large datasets
Meta-data: where is data from, what can be said about it?
Mining data
Using data previously thought worthless or intractable
Research Challenges
Dealing with complexity and heterogeneity



Just four examples have highlighted the complexity and
heterogeneity of what is meant by ‘e-Research’.
There tend to be similarities as well as differences
between the needs of different researchers.
This is where the chance lies for building common
infrastructures while supporting
– a wide range of different research activities
– and different kinds of resources,
– across organisational contexts.


Need to know about the different cultures in, say,
particle physics and sociology.
Teasing out the similarities and differences is an
important part of realising e-Research.
We need to know more about:

The early adopters, the interested, the disengaged
What motivates people to collaborate and share
What the barriers to entry are and how they can be overcome
How e-Science endeavours can be effectively and efficiently managed in different organisational contexts
How we can manage user-designer relations to ensure what we build is useful and usable
How we can ensure people have reasonable expectations of what can and cannot be done
How we encourage future generations of researchers to engage in research in the first place and to make use of the vast potential of e-Research
And other socio-technical issues
Broad themes






Supporting Innovation and Diffusion
Improving usability
Fostering new forms of research and community
Deployability, configurability and sustainability
National and International Comparisons
Measuring Impact of e-Research
Commodification

The process that transforms the market from a collection of individual, proprietary and idiosyncratic products to one that defines open standards and provides competing but interoperable implementations.
Aims are to:
– Flatten the learning curve
– Ease deployment
– Centrally provide functionality
– Overcome / leverage network effects
OGF - engaged in standardisation
OMII - repository of production software
NGS - national service providing a compute grid and operations support
Role that University Computing Services play: centrally and locally provided services will be required.
Project Management

Proposals are sales documents!
Project funding assumes a project plan is in place and work can start soon
This is routinely not the case
e-Research projects tend to differ from other IT development projects, e.g.:
– Multiple stakeholders with only partially aligned agendas
– Raised expectations
– Different ways of working and professional cultures
– Short timelines (funding)
Project management often tells us what to do but not how to do it.
Need to pay attention to the ‘seen but unnoticed’ skills of good project managers:
– e.g., tackling problems arising from people’s different motivations and professional identities and languages
Democratic e-Research


How we communicate with the wider public is crucial
where we touch upon potentially contentious issues or
make use of personal data.
This requires further interdisciplinary work involving,
e.g., ethicists and social science researchers
– EthOx centre in Oxford
– Innogen in Edinburgh

Also, involvement of the wider public as active
participants in research activities.
User-Designer Relations in e-Research

Designers of e-Research systems need to be familiar with the working practices and concerns of researchers
Researchers need to understand what is possible, what is feasible and what is not, and what the trade-offs between different options are
This involves a degree of familiarity with the research domain and e-Research technologies. This can be achieved through:
– Training (e.g., bioinformatics, Grid literacy)
– Boundary spanning (e.g., researchers employed on projects)
– Facilitation (e.g., workplace studies)
– Shared practice (co-location, corealisation)
eSI Theme Activities

Establish and consolidate what we already know
– e-Research BOK: Formulating e-Research practices
– 1st step: Realising e-Research Endeavours, call to be issued
end November, workshop in March ‘07, write-up soon after

Identify major gaps and address through
– Targeted research (focused observational studies, interviews,
surveys, depending on the issue at hand)
– in collaboration with other projects as well as
– seeking additional grants



Workshops and Visitors as input and control mechanism
Raise awareness of e-Research in the communities
Aim to drive technical development
Prior Work




Usability Task Force: Usability Research Challenges in
e-Science
JISC Human Factors Audit of Selected e-Science
Projects
Angela Sasse and Brock Craft: Security and Usability of
Grid Projects: Implications for e-Science
Paul David: Towards a Cyberinfrastructure for Enhanced
Scientific Collaboration: providing its ‘soft’ foundations
may be the hardest part
Related Activities



Projects funded under EPSRC Usability Call
AVROSS (EU Strep): e-Social Science
JISC e-Infrastructure Call (Issued end Sept.)
– Barriers to Uptake
– Service Usage Models (practice templates)


SUPER: informing prioritisation of e-Infrastructure work
Usability Task Force Portal: assembling a network of
people working in this area and disseminating results
Outlook

There is still much to be done to deliver the promise of
e-Science and to extend its uptake. Each step requires
work. More research communities will need to agree
their methods of collaboration. The technology requires
further development, in particular to make it more
usable, versatile and economic. And production support
requires more operational experience and extension of
arrangements for sharing. It is now time to expand its
application across the academic world and to introduce
it to students as well as to academics.
(Malcolm Atkinson writing in THES)
Credits





Malcolm Atkinson, e-Science Envoy
and Director of the e-Science Institute
Anna Kenway, Deputy Director of the e-Science Institute
Rob Procter, Research Director, NCeSS
Tom Rodden, University of Nottingham and Usability
Task Force
Colleagues who have kindly allowed me to use their
slides
Questions, please…