researchers, power of data and GIS II

Dr Paul S Ell
Centre for Data Digitisation and Analysis
Queen’s University Belfast
paul.ell@qub.ac.uk

Jisc Digitisation Final Programme Meeting
3 July 2013

Or
Why isn’t e-content having an impact and what can GIS, in the broadest sense, do about it…

GIS in its broadest sense.

Historical (and Humanities?) GIS has failed
• Most researchers aren’t interested in maps. Maps do not form part of their research process. They can’t read them and don’t understand them
• Historical GIS traditionally focussed on mapping census data by administrative units. Almost no one can use the software to do this and few are interested in the results
• The humanities is not about statistics; it is usually about text and multimedia content
• So let’s forget about traditional GIS except in its broadest sense and focus on why e-content has had a limited impact and what can be done about it

Problem I: Many e-Resources – a deluge
• Historical census data for Britain 1801-2001
• Welsh historical statistics
• Historical census data for the Netherlands
• Historical census data for India
• Digitisation of recent census data for NISRA
• Historical Gazetteers for Britain
• Data on the 1851 Religious Census for Britain
• Digitisation of a sample of place-names from the English Place-Name Survey and linking them to a GIS
• Digitisation of the 1676 Compton Census
• Data extraction from the Winchester Rolls
• Computerisation of medieval manorial crop yield data
• Hearth Tax Data
• Datasets on Irish religion from 1834
• Scottish National Dictionary
• Dictionary of the Older Scottish Tongue
• British Parliamentary Papers in a series of grants with BOPCRIS
• Mortality and hospital admission data for England
• Historical diaries relating to China from an Irish perspective
• Key holdings from QUB Library Special collections
• Various small digitisation projects for NMNI
• Convict database for Down County Museum
• Photographic plate digitisation for Down County Museum
• Digitisation of the Banbridge Almanac for SEELB
• Digitisation of Vital Registration data for Northern Ireland
• Image scans of Latin texts for Ireland for UU
• Database of Irish Historical Statistics
• Act of Union Virtual Library
• Hansard for the Stormont Parliament
• CDDA/JSTOR Ireland Collection

Problem II: Data silos

Problem III: Sustainability

So what’s this got to do with GIS, or rather gazetteers – lists of places?
• As Humphrey has indicated, place is important. In the Humanities and Social Sciences almost everything happens somewhere
• Almost every e-resource refers to that somewhere, usually by including a geographical name, but names change over time
• The importance of location has long been recognised and there are several existing online gazetteers – Getty Thesaurus of Geographic Names, GeoNames, Alexandria Digital Library etc.
• However, current gazetteers lack chronological depth and spatial detail. They are not fit for purpose for those whose interest goes beyond modern place-names for ‘significant’ places
• But place-names can be used as a resource discovery tool, an information augmentation tool and an enhanced way of using e-content, and by linking content they can help sustain resources

Place-Name e-infrastructure: Digital Exposure of English Place-names (DEEP)
• So for England we have resolved the problem of linking places through a new piece of e-infrastructure
• DEEP is a £650,000 Jisc project under the Strand B call to create a comprehensive spatio-temporal gazetteer
• The project has digitised the work of the English Place-Name Society who, since the mid-1920s, have systematically collected in excess of 5 million name forms
• There are currently 86 place-name volumes, with £900k funding from AHRC to complete the final four volumes for Shropshire
• New volumes will be ingested into the gazetteer through a structured data entry and XML tagging system

DEEP II
• The Survey records names from cities to fields, streets and individual buildings.
Historical variants (forms) are attested with dates (critically allowing existing e-resources to link directly to the gazetteer), together with the linguistic elements that make up the names and free-text descriptions of their etymologies
• The digital gazetteer, served via CDDA at Queen’s, will facilitate, for example, a search for a place-name in all its variant forms by keying in any one variant. Where a source exists in digital form it will allow a direct link to that source or sources
• It will also be possible to submit place-name rich sources to the gazetteer for semi-automated geo-resolution via the Jisc-funded Unlock service
• The gazetteer has the potential to be associated with key strategic partners, bringing data together in the way Humphrey has outlined
• But, to work, content simply needs to contain a place-name form and be searchable

New technologies and collaborations to enhance existing content
Crowdsourcing Welsh place names: working with Galaxy Zoo (Oxford), The University of Wales and the People’s Collection, Wales to build an online gazetteer of OS 6 inch maps
Oldweather.org

So how does this help resolve the use of e-content?
• It allows resources to be discovered through deep linking
• It brings relevant content together
• It is one of very few projects providing vital research infrastructure – digitisation of journals or census data might be considered in the same light
• By allowing resource discovery it increases site user numbers, helping to demonstrate use and impact and so justify sustainability
• It represents a key change for Jisc, and one that should be built on. It is not funding the development of a research or teaching collection but a tool. There’s a need for more tools, including methodological tools – generic crowdsourcing toolkits and multi-collection metadata, for example
• However, it needs to be embedded, first in Jisc collections, to justify its expense and demonstrate its utility, and in turn deal with Jisc project data silos

DEEP: Research infrastructure linking disparate content by location
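
Illustration only: the DEEP II slide describes searching for a place-name in all its variant forms by keying in any one variant. The Python sketch below shows one simple way such a lookup could work. The names Place, build_index and all_forms, the normalisation rule, and the sample place-names and dates are all hypothetical, invented for this example; none of them is taken from the DEEP gazetteer or the Survey data.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Place:
    """One gazetteer entry: a headword plus its dated historical forms."""
    headword: str
    # (form, year) pairs; every entry used below is an invented placeholder
    forms: list[tuple[str, int]] = field(default_factory=list)


def normalise(name: str) -> str:
    """Fold case and trim whitespace so lookups tolerate minor differences."""
    return name.strip().casefold()


def build_index(places: list[Place]) -> dict[str, list[Place]]:
    """Map every headword and variant form to the place(s) it may refer to."""
    index: dict[str, list[Place]] = {}
    for place in places:
        names = [place.headword] + [form for form, _ in place.forms]
        for name in names:
            bucket = index.setdefault(normalise(name), [])
            if place not in bucket:  # a form may repeat the headword
                bucket.append(place)
    return index


def all_forms(index: dict[str, list[Place]], query: str) -> dict[str, list[tuple[str, int]]]:
    """Return, for each matching place, all of its forms in date order."""
    results: dict[str, list[tuple[str, int]]] = {}
    for place in index.get(normalise(query), []):
        results[place.headword] = sorted(place.forms, key=lambda f: f[1])
    return results


# Invented sample data, not real attestations.
places = [
    Place("Exampleton", [("Exempeltun", 1086), ("Exampleton", 1610), ("Exampelton", 1327)]),
    Place("Sampleby", [("Sampelbi", 1200)]),
]
index = build_index(places)

# Keying in any one variant returns every recorded form of the same place.
print(all_forms(index, "exampelton"))
```

Storing a date with each form is what would let an existing e-resource link on the name form it actually contains. A real service would also need fuzzy matching and a way to disambiguate places that share a form, both of which this sketch leaves out.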