
Sustainability Issues for e-Infrastructure
Services in Arts and Humanities
Rob Procter
rob.procter@ncess.ac.uk
www.ncess.ac.uk
29th November
AHRC ICT Methods Network Seminar
NCeSS Overview


Launched in May 2004 to develop and
promote UK e-Social Science.
Unified Centre with distributed
structure:
– Co-ordinating Hub: Manchester & UKDA
– Seven research Nodes located across UK
– Twelve small projects
NCeSS Aims and Objectives

Applications of e-Social Science:
– Tackling substantive research problems by
enhancing existing research methods and
encouraging new approaches

Social shaping of e-Science:
– Usability of tools and infrastructure
– Socio-technical factors in the design, uptake and
use of e-Science
• eSI theme
– Drivers, policy and socio-economic impacts

Create sustainable digital resources
Digital Resources in a Grid World







The emergence of a new kind of research infrastructure is
helping to redefine what we mean by digital resources.
These new infrastructures are distinctive in being ‘service
oriented’.
They support all the types of digital resources with which we
are familiar plus a new kind: the service.
A service is a resource which performs some useful function,
for example, processing data.
The provision of metadata enables these services and other
kinds of resources to be discovered and used, including being
composed (perhaps automatically) to carry out complex
processing and analysis tasks for users.
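
To make metadata-driven discovery and composition concrete, here is a minimal sketch. It assumes a hypothetical in-memory registry in which each service declares its input and output types as metadata; the names Service, Registry, compose and run_chain are illustrative only and do not correspond to any particular Grid middleware.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Service:
    """A resource that performs a function and is described by metadata."""
    name: str
    metadata: Dict[str, str]          # assumed keys: "input" and "output"
    run: Callable[[object], object]


class Registry:
    """Hypothetical in-memory registry (a stand-in for a networked one)."""

    def __init__(self) -> None:
        self._services: List[Service] = []

    def register(self, service: Service) -> None:
        self._services.append(service)

    def discover(self, **criteria: str) -> List[Service]:
        """Return services whose metadata matches every key/value pair."""
        return [
            s for s in self._services
            if all(s.metadata.get(k) == v for k, v in criteria.items())
        ]


def compose(registry: Registry, start: str, goal: str) -> List[Service]:
    """Greedily chain services by matching output type to next input type."""
    chain: List[Service] = []
    current = start
    seen = {start}                    # avoid looping over the same types
    while current != goal:
        candidates = [
            s for s in registry.discover(input=current)
            if "output" in s.metadata and s.metadata["output"] not in seen
        ]
        if not candidates:
            raise LookupError(f"no service found that accepts {current!r}")
        chain.append(candidates[0])
        current = candidates[0].metadata["output"]
        seen.add(current)
    return chain


def run_chain(chain: List[Service], data: object) -> object:
    """Pass the data through each service in turn."""
    for service in chain:
        data = service.run(data)
    return data
```

The greedy chaining is a deliberate simplification: it takes the first matching service at each step and does not backtrack, which is enough to show how declared metadata lets a user (or a program) assemble a processing pipeline without knowing the services in advance.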
What kinds of sustainability issues does this raise for digital
resources and how can they be tackled?
I will briefly consider these questions at two distinct levels:
– European and UK programmes
– User communities
European e-Infrastructure Sustainability Strategy


The role of the European Strategy Forum on Research Infrastructures (ESFRI)
is to support a coherent approach to policy-making in Europe and to act
as an incubator for international negotiations about specific initiatives.
ESFRI has prepared a European Roadmap for new research
infrastructures of pan-European interest:
– In the arts and humanities (A+H), this involves:
• DARIAH
• CLARIN

Some ESFRI principles:
– Existing e-Infrastructure projects to be superseded by integrated
sustainable services at national and European levels.
– e-Infrastructures to be application neutral and open to all user communities
and resource providers.
– National funding agencies to fund multidisciplinary & inclusive
infrastructures (rather than disciplinary specific alternatives).
– FP7 to facilitate a model which specifically encourages further integration
of national e-Infrastructure initiatives.
Principles for a Sustainable National e-Infrastructure


The UK e-Science programme has been very successful in promoting innovation
in research, but the project orientation of the programme has meant that
sustainability has often been overlooked.
This situation is changing and some principles for sustainability of
infrastructure and services have now been defined:
– A secure national framework for multiple levels of authentication and authorisation to
support both individual institutions and dynamic, cross-boundary ‘Virtual
Organisations’.
– A repository for open source, open standard infrastructure middleware and tools as
well as a software engineering capability to research, support and maintain this.
– A national focus for digital curation providing support, guidance and research into
long-term preservation of both research data and traditional publications.
– Integrated access to national data sets and publications such as those provided
through AHDS, ESDS, EDINA, MIMAS and British Library; and emerging Open
Access subject and institutional repositories.
– Resource discovery mechanisms enabling intelligent searches across the
ever-expanding mass of digital resources of all types.
– A set of national services for data and long-term data archiving.
– A national centre to enhance the creation of a strong culture of multidisciplinary
research and provide training in new technologies.
– Development of tools and services to support collaborative environments, including
portals for access to data and services, national service registries, and workflow and
provenance tools.
– An ongoing relationship with industry to ensure sustainability into the future.

Some elements of this strategy, such as the Digital Curation Centre, are now in
place.
Digital Curation Centre




Actions needed to maintain and utilise digital resources
over entire life-cycle:
– For current and future generations of users.
Digital Preservation:
– Long-run technological/legal accessibility and
usability.
Data curation in science:
– Maintenance of body of trusted data to represent
current state of knowledge in area of research.
Research in tools and technologies:
– Integration, annotation, provenance, metadata,
security.
Sustainability and the Community

In a world of proliferating digital resources – data, training materials,
services – how do we ensure the effort required to sustain them is
available?
– Current methods for communicating and managing the stock of knowledge
don’t scale with knowledge growth.


One possible solution is for wider community involvement to support
the maintenance and evolution of digital resources.
In particular, this means continual effort to keep metadata up-to-date
so that digital resources remain discoverable and useful:
– Organising concepts are dynamic and often contested.
– Resources are often of variable quality: as one researcher put it “We need
an Amazon for datasets.”



How is the proliferation problem being dealt with outside academia?
– Wikipedia, Flickr, MySpace, YouTube, del.icio.us, etc., are examples of the use
of folksonomies and social tagging.
A folksonomy consists of collaboratively generated, open-ended labels
that categorize content.
A folksonomy is most notably contrasted with a more formal taxonomy
or ontology in that authors of the labeling system are often the main
users of the content to which labels are applied.
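
As a rough illustration of how a folksonomy emerges from individual tagging acts, the sketch below aggregates (user, resource, tag) triples into per-resource tag counts. The function names and the normalisation rule (lower-casing and collapsing whitespace) are assumptions made for this example; real social tagging sites differ in how they merge and rank tags.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Tagging = Tuple[str, str, str]  # (user, resource, tag)


def normalise(tag: str) -> str:
    """Fold case and whitespace so trivially different labels merge."""
    return " ".join(tag.lower().split())


def build_folksonomy(taggings: Iterable[Tagging]) -> Dict[str, Counter]:
    """Count, for each resource, how many distinct users applied each tag."""
    seen = set()
    tags_by_resource: Dict[str, Counter] = defaultdict(Counter)
    for user, resource, tag in taggings:
        key = (user, resource, normalise(tag))
        if key in seen:               # one vote per user, resource and tag
            continue
        seen.add(key)
        tags_by_resource[resource][key[2]] += 1
    return dict(tags_by_resource)


def resources_for(folksonomy: Dict[str, Counter], tag: str,
                  min_users: int = 1) -> List[str]:
    """List resources that at least min_users people labelled with tag."""
    tag = normalise(tag)
    return sorted(
        resource for resource, counts in folksonomy.items()
        if counts[tag] >= min_users
    )
```

Raising min_users is one crude way to filter out idiosyncratic or mistaken labels, which hints at the quality control problem that unsupervised tagging raises.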
Linking Ontologies and Folksonomies




Formal approaches to metadata, such as ontologies, are hard to evolve
because they are cumbersome and require technical expertise.
Informal approaches such as folksonomies are easy to use but
vulnerable to quality control problems.
Perhaps a combination of ontologies and folksonomies can deliver low
entry costs, a rich vocabulary that is broadly shared and
comprehensible by the user base, and the capacity to respond quickly
to language change – without the errors that inevitably arise in naive,
unsupervised folksonomies.
The NCeSS PolicyGrid node (Aberdeen University) is exploring
“folktology” solutions which exploit both lightweight ontology and
folksonomy based approaches:
– Squanto (Semantic QUalitative ANnotation TOol) is a Grid-enabled
qualitative analysis tool which allows researchers to code text documents
using both free text codes and so-called structured codes (derived from an
OWL ontology).


Squanto allows users to create lightweight relations between free
codes and structured codes, leading to related resources being
highlighted through this association.
See http://www.ncess.ac.uk/research/nodes/PolicyGrid/ for more
information.
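
The idea of linking free codes to structured codes can be sketched as follows. This is not Squanto's implementation or API: the CodeBook class, its method names and the example codes are hypothetical, and a plain set of strings stands in for concepts that would really come from an OWL ontology.

```python
from collections import defaultdict
from typing import Dict, Set


class CodeBook:
    """Hypothetical store of free codes, structured codes and their links."""

    def __init__(self, structured_codes: Set[str]) -> None:
        # Stand-in for concepts drawn from an ontology.
        self.structured_codes = set(structured_codes)
        # free code -> structured codes it has been linked to
        self.links: Dict[str, Set[str]] = defaultdict(set)
        # document -> codes (free or structured) applied to it
        self.annotations: Dict[str, Set[str]] = defaultdict(set)

    def link(self, free_code: str, structured_code: str) -> None:
        """Record a lightweight relation from a free code to a concept."""
        if structured_code not in self.structured_codes:
            raise KeyError(f"unknown structured code: {structured_code!r}")
        self.links[free_code].add(structured_code)

    def annotate(self, document: str, code: str) -> None:
        self.annotations[document].add(code)

    def concepts_for(self, document: str) -> Set[str]:
        """Structured codes reached directly or through linked free codes."""
        concepts: Set[str] = set()
        for code in self.annotations.get(document, set()):
            if code in self.structured_codes:
                concepts.add(code)
            concepts |= self.links.get(code, set())
        return concepts

    def related(self, document: str) -> Set[str]:
        """Other documents that share at least one structured concept."""
        target = self.concepts_for(document)
        return {
            other for other in self.annotations
            if other != document and target & self.concepts_for(other)
        }
```

In this sketch a document coded only with the free code "council flats" would be reported as related to one coded with a structured housing concept, once a researcher has linked the two codes. Documents whose free codes have no link stay unrelated.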
PolicyGrid

Exploring how Semantic Grid tools can support social
scientists and policy makers using mixed-methods:
– Surveys and interviews, ethnography, case studies,
simulations

Provision of metadata infrastructure in this context
presents challenges:
– Dynamic and contested nature of concepts within social
sciences
– Need to align with existing thesauri (where those exist) and
need to support open, community based efforts

PolicyGrid is exploring “folktology” solutions which
exploit both lightweight ontology and folksonomy
(social tagging) based approaches.
Finally …



Achieving community engagement in
sustainability is not just a matter of
having the right tools.
Is it realistic to expect community-based effort in a research culture
where the people associated with
digital resource creation (and
maintenance) don’t get credit for it?
How can we change this?