The National Archives
Taxonomy
Jone Garmendia, Head of Cataloguing
25 November 2011

A taxonomy is a system for naming and organising things into groups that show similar characteristics.
Taxonomies in the Public Sector (TIPS) http://www.nglis.org.uk/tipsben.htm

A taxonomy is a collection of terms that enable the classification and grouping of content in order to support:
• information organisation
• searching
• information discovery/browsing
• website usability
• content reuse / serendipity

Thesaurus:
• controlled vocabulary
• structured hierarchically using BT, NT, RT (broader, narrower and related terms)
• non-preferred terms to handle synonyms
they support:
o information retrieval in controlled bibliographic databases
o authority files
o expansion of search

Ontologies
• high level intellectual frameworks to organise information
• representing concepts, entities and their relationships
• used in information science to support:
o software engineering
o the semantic web

People, Places, Events, Records/Assets
• People are active in Places
• People create/own Records/Assets
• People take part in Events
• Records/Assets are located in Places
• Events take place in Places
• Records/Assets are used in / can trigger Events

http://labs.nationalarchives.gov.uk

• We have not catalogued entries manually
• Subject research: word lists
• Building Boolean queries within the search engine

Word lists and ‘synonyms’
marriage, marital, “marry”, “marries”, “married”
matrimony, matrimonial
jactitation, betrothal, dowry, jointure, alimony, curtesy
wedding, adultery, adulterous, nuptial, spouse, wedlock, conjugal
certificate of no impediment
re-marriage, remarriage…
maintenance order, Maintenance Order Acts…
bigamy, bigamous, polygam*
common law wife/husband, husband/wife
divorce, decree nisi, decree absolute

Tagging process (diagram labels)
• search engine index
• new index
• taxonomy management software
• tagging process
• xml export
• categorisation engine
• data flow back (with tags)
• xml tagged data
• new index
• taxonomist generates query

Testing and tuning
Beta release April 2011

Taxonomy testing
• Multiple tagging
• Precision and relevancy of subject tags
• Recall: working to provide subjects for the untagged
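
The slides describe turning subject word lists into Boolean queries, tagging records with the matching subjects, and then checking precision and recall. A minimal sketch of that idea follows. It is not The National Archives' implementation: the slides do not name the search engine or the categorisation engine, so plain regular-expression matching stands in for both. The subject label, the sample records and all function names are hypothetical, and the query syntax (OR, quoted phrases, trailing * wildcard) is an assumed generic form.

```python
import re

# Hypothetical subject with a few terms taken from the marriage word list.
TAXONOMY = {
    "Marriage and divorce": [
        "marriage", "divorce", "decree nisi", "bigamy", "bigamous", "polygam*",
    ],
}

def build_boolean_query(terms):
    """Join terms with OR, quoting multi-word phrases (assumed generic syntax)."""
    return " OR ".join(f'"{t}"' if " " in t else t for t in terms)

def term_to_regex(term):
    """Whole-word, case-insensitive pattern; a trailing * matches any word ending."""
    wildcard = term.endswith("*")
    words = [re.escape(w) for w in term.rstrip("*").split()]
    body = r"\s+".join(words)  # phrase words may be separated by any whitespace
    if wildcard:
        body += r"\w*"
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)

def tag_records(records, taxonomy):
    """Return {record id: sorted subjects}; a record may receive several subjects."""
    compiled = {
        subject: [term_to_regex(t) for t in terms]
        for subject, terms in taxonomy.items()
    }
    tags = {}
    for record_id, description in records.items():
        tags[record_id] = sorted(
            subject
            for subject, patterns in compiled.items()
            if any(p.search(description) for p in patterns)
        )
    return tags

def precision_recall(assigned, relevant):
    """Precision and recall for one subject, given sets of record ids."""
    true_positives = len(assigned & relevant)
    precision = true_positives / len(assigned) if assigned else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical catalogue descriptions, for illustration only.
records = {
    "R1": "Decree nisi and decree absolute in a divorce case",
    "R2": "Inquiry into a bigamous marriage",
    "R3": "Map of the county boundary",
}

query = build_boolean_query(TAXONOMY["Marriage and divorce"])
tags = tag_records(records, TAXONOMY)
untagged = [record_id for record_id, subjects in tags.items() if not subjects]
```

Here `untagged` collects the records that received no subject at all, which is the group the recall work in the slides is aimed at. Comparing the tagged ids against a hand-checked sample with `precision_recall` gives a rough measure of how well a word list performs before it is tuned.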