
Researchers, power of data and GIS II
Dr Paul S Ell
Centre for Data Digitisation and Analysis
Queen’s University Belfast
paul.ell@qub.ac.uk
Jisc Digitisation Final Programme Meeting
3 July 2013
Or: why isn’t e-content having an impact, and what can GIS,
in the broadest sense, do about it…
GIS in its broadest sense. Historical (and
Humanities?) GIS has failed
• Most researchers aren’t interested in maps. Maps do not form
part of their research process, and many researchers can’t read
or understand them
• Historical GIS traditionally focussed on mapping census data by
administrative units. Almost no one can use the software to do
this and few are interested in the results
• The humanities are not about statistics; they are usually about
text and multimedia content
• So let’s forget about traditional GIS except in its broadest sense
and focus on why e-content has had a limited impact and what
can be done about it
Problem I: Many e-Resources – a deluge
• Historical census data for Britain 1801-2001
• Welsh historical statistics
• Historical census data for the Netherlands
• Historical census data for India
• Digitisation of recent census data for NISRA
• Historical Gazetteers for Britain
• Data on the 1851 Religious Census for Britain
• Digitisation of a sample of place-names from the English Place-Name Survey and linking them to a GIS
• Digitisation of the 1676 Compton Census
• Data extraction from the Winchester Rolls
• Computerisation of medieval manorial crop yield data
• Hearth Tax Data
• Datasets on Irish religion from 1834
• Scottish National Dictionary
• Dictionary of the Older Scottish Tongue
• British Parliamentary Papers in a series of grants with BOPCRIS
• Mortality and hospital admission data for England
• Historical diaries relating to China from an Irish perspective
• Key holdings from QUB Library Special collections
• Various small digitisation projects for NMNI
• Convict database for Down County Museum
• Photographic plate digitisation for Down County Museum
• Digitisation of the Banbridge Almanac for SEELB
• Digitisation of Vital Registration data for Northern Ireland
• Image scans of Latin texts for Ireland for UU
• Database of Irish Historical Statistics
• Act of Union Virtual Library
• Hansard for the Stormont Parliament
• CDDA/JSTOR Ireland Collection
Problem II: Data silos
Problem III: Sustainability
So what’s this got to do with GIS, or rather
gazetteers – lists of places?
• As Humphrey has indicated, place is important. In the Humanities
and Social Sciences almost everything happens somewhere
• Almost every e-resource refers to that somewhere, usually by including
a geographical name, but names change over time
• The importance of location has long been recognised and there are
several existing online gazetteers – Getty Thesaurus of Geographic
Names, GeoNames, Alexandria Digital Library etc.
• However, current gazetteers lack chronological depth and spatial
detail. They are not fit for purpose for those whose interest extends
beyond modern place-names for ‘significant’ places
• But place-names can be used as a resource discovery tool, an
information augmentation tool, an enhanced way of using e-content
and, by linking content, a way to help sustain resources
Place-Name e-infrastructure: Digital Exposure of
English Place-names (DEEP)
• So for England we have resolved the problem of linking places
through a new piece of e-infrastructure.
• DEEP is a £650,000 JISC project under the Strand B call to create
a comprehensive spatio-temporal gazetteer
• The project has digitised the work of the English Place-Name
Society, which, since the mid-1920s, has systematically collected
in excess of 5 million name forms
• There are currently 86 place-name volumes, with £900k funding
from AHRC to complete the final four volumes for Shropshire
• New volumes will be ingested into the gazetteer through a
structured data entry and XML tagging system
DEEP II
• The Survey records names from cities to fields, streets and individual
buildings. Historical variants (forms) are attested with dates, which
critically allows existing e-resources to link directly to the gazetteer.
The Survey also records the linguistic elements which make up the names
and free-text descriptions of their etymologies.
• The digital gazetteer, served via CDDA at Queen’s, will facilitate, for example,
a search of a place-name in all its variant forms by keying in any one variant.
Where a source exists in digital form it will allow a direct link to that source
or sources.
• It will also be possible to submit place-name rich sources to the gazetteer
for semi-automated geo-resolution via the Jisc-funded Unlock service
• The gazetteer has the potential to be associated with key strategic partners
bringing data together in the way Humphrey has outlined
• But, to work, content simply needs to contain a place-name form and be
searchable
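The variant-form search described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the DEEP implementation: the Place class, the build_index and lookup functions, and the "Exampleton" entry with its spellings and dates are all invented for the example, and matching is assumed to be a simple case-insensitive comparison.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Place:
    """One gazetteer entry: a headword plus its dated historical forms."""
    headword: str
    forms: list = field(default_factory=list)  # (spelling, year attested) pairs


def normalise(name):
    # Assumption: case-insensitive matching only; real historical
    # spellings would need more careful handling.
    return name.strip().lower()


def build_index(places):
    """Map every form (and the headword) to the places it can refer to."""
    index = defaultdict(list)
    for place in places:
        keys = {normalise(place.headword)}
        keys.update(normalise(spelling) for spelling, _ in place.forms)
        for key in keys:
            index[key].append(place)
    return index


def lookup(index, query):
    """Return (headword, forms sorted by date) for every place matching the query."""
    return [
        (place.headword, sorted(place.forms, key=lambda form: form[1]))
        for place in index.get(normalise(query), [])
    ]


# Hypothetical entry, invented for illustration only (not taken from the Survey).
places = [Place("Exampleton", [("Exampletun", 1200), ("Essempletone", 1086)])]
index = build_index(places)
results = lookup(index, "exampletun")
```

Keying in any one variant returns the headword together with every dated form, which is the hook an existing e-resource would need in order to link its own spelling of a name to the gazetteer entry.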
New technologies and collaborations to enhance existing content
Crowdsourcing Welsh place names: working with Galaxy Zoo (Oxford), The
University of Wales and the People’s Collection, Wales to build an online gazetteer
of OS 6 inch maps
Oldweather.org
So how does this help resolve the limited use of e-content?
• It allows resources to be discovered through deep linking
• It brings relevant content together
• It is one of very few projects providing vital research infrastructure –
digitisation of journals, or census data might be considered in the same
light
• By allowing resource discovery, it raises site user numbers, helping to
demonstrate the use and impact of resources and so justifying their sustainability
• It represents a key change for Jisc, and one that should be built on. It is
not funding the development of a research or teaching collection but a
tool. There’s a need for more tools including methodological tools –
generic crowdsourcing toolkits, multi-collection metadata for example
• However, it needs to be embedded, first in Jisc collections, to justify its
expense and demonstrate its utility, which in turn deals with Jisc project
data silos
DEEP: Research infrastructure linking disparate content by location