Building Taxonomies
Alice Redmond-Neal
Access Innovations, Inc.
Enterprise Search Summit, New York City, May 21, 2006
Copyright © 2006 Access Innovations, Inc.

So what’s a taxonomy?
• Words – controlled vocabulary
• Used as labels for indexing – descriptive metadata
• Attached to documents, digital objects, or physical objects
• Organized to aid retrieval – hierarchical structure
  – Hierarchical presentation of a thesaurus

Perspectives on taxonomies
• Taxonomist (aka Lexicographer, Thesaurus builder)
• Information architect
• Indexer
• Searcher
Each has a different view and need for words in retrieving information. Each need relates to using a taxonomy for indexing.

Taxonomies for information retrieval online
• Conceptual framework for web content – reflects organization of knowledge in a domain
• Foundation for information architecture
• Often 3 levels deep – depends on domain
• May be hidden or displayed

Info retrieval starts with a knowledge organization system
(listed from not complex to highly complex)
• Uncontrolled list
• Name authority file
• Synonym set/ring
• Controlled vocabulary
• Taxonomy
• Thesaurus
• Ontology
• Semantic network
LOTS OF OVERLAP!

Structure of controlled vocabularies
(INCREASING COMPLEXITY from left to right)

List of words       Synonyms           Taxonomy              Thesaurus
Ambiguity control   Synonym control    Ambiguity control     Ambiguity control
                                       Synonym control       Synonym control
                                       Hierarchical rel’s    Hierarchical rel’s
                                                             Associative rel’s

Controlled vocabulary construction standards
• ANSI (American National Standards Institute)
• NISO (National Information Standards Organization)
• ISO (International Organization for Standardization)
• BS (British Standards Institution)
Differences are minor and diminishing. ANSI/NISO Z39.19-2005 revision approved.
Taxonomy defined – ANSI/NISO Z39.19-2005*
“A controlled vocabulary consisting of preferred terms, all of which are connected in a hierarchy or polyhierarchy.”
Key words: controlled, hierarchy.
Missing: equivalence, homographic, and associative relationships and notes – features of a THESAURUS.
* http://www.niso.org/standards/resources/Z39-19-2005.pdf

Taxonomy as an organization system
• Controlled vocabulary
• Hierarchical format
  – Parent-child relationships
• Specific items appear as final leaves on hierarchy branches
• Common on websites
  – Pick list
  – Browsable directory
  – Other variations

Thesaurus as an organization system
• Controlled vocabulary
• Focus on conceptual classes, not specifics
• Hierarchy – implicit if not displayed
  – Parent-child relationships
• Various display formats may be available
• Network of relationships between terms helps user to find information
  – Cousins, friends, aliases
• Scope notes, term history
• More elaborate and informative
Long-established standards

Thesaurus defined – ANSI/NISO Z39.19-1993, -2005
“A controlled vocabulary of terms in natural language that are designed for postcoordination...”
“Terms are arranged…so that various relationships are displayed clearly…”
“The controlled vocabulary is established by information specialists or lexicographers and is generally employed in indexing.”

Thesaurus defined – ANSI/NISO Z39.19-2005
“A controlled vocabulary arranged in a known order in which equivalence, homographic, hierarchical, and associative relationships among terms are clearly displayed and identified by standardized relationship indicators, which must be employed reciprocally. Its purposes are to promote consistency in the indexing of content objects, especially for postcoordinated information storage and retrieval systems, and to facilitate browsing and searching by linking entry terms with terms. Thesauri may also facilitate the retrieval of content objects in free text searching.”

Standards and pragmatism
• Standards are your friends
  – Lead to richer, more informative product
  – Promote interoperability
  – Allow you to adopt or adapt other controlled vocabularies
  – Promote predictability
  – Allow repurposing within your organization and by other organizations
• Follow standards for taxonomy building
  – Incorporate authority files / final nodes as needed
• Your taxonomy or thesaurus must meet your needs

Your taxonomy / thesaurus end product
• Reflects
  – scope of your concern
  – degree of precision you need
• Facilitates
  – data storage and retrieval by vocabulary control
  – discovery of ideas
• Promotes learning
  – preferred terminology
  – relationships among concepts
  – organized guide to your field

Talk about terms and taxonomies
• How to choose terms
• How to ensure term clarity, avoid ambiguity
  – Vocabulary control: why and how
• How to format terms
• Terms within a taxonomy: the big picture

How do you choose terms?
• Importance in the subject area
• Use in the literature, by the organization or community
• Necessary degree of specificity or detail
• Relationship with other controlled vocabularies

Vocabulary control – why?
“The need for vocabulary control arises from two basic features of natural language, namely: two or more words or terms can be used to represent a single concept, and two or more words that have the same spelling can represent different concepts.” ANSI/NISO Z39.19-2005

Vocabulary control through disambiguation
Synonyms – de-duplicate meanings
• Multiple words for the same concept
  – President of the United States, POTUS
  – Biological technology, Biotech
Homographs (polysemes) – eliminate ambiguity
• Same written word used for multiple meanings
  – Balloon: which kind? Box: which kind?
  – Cells, Mercury, Records, Bridge/Bridges, Bush

Vocabulary control – how?
Organize terms
• to show which of two or more synonymous terms is preferred or authorized for use
• to distinguish between homographs
• to indicate hierarchical and associative relationships among terms

Vocabulary control – in practice
• Use unambiguous terms, clear to the user group
• Distinguish between terms that appear similar
• Use Scope Notes when necessary
• Use terms as elements that can be coordinated in a flexible manner
• Create compound terms (noun+modifier) when necessary

One term / one concept
• “Terms in a thesaurus should represent simple or unitary concepts…” (ISO standard)
• “Each descriptor included in a thesaurus should represent a single concept (or unit of thought). …frequently expressed by a single-word term but in many cases a multiword term is required.” (ANSI/NISO Z39.19-2005)

A “term” synonym ring
Term, Descriptor, Node, Category, Subject heading

So what’s a concept?
• “A unit of thought, formed by mentally combining some or all of the characteristics of a concrete or abstract, real or imaginary object. Concepts exist in the mind as abstract entities independent of terms used to express them.”
• Three main categories
  – Abstract concepts
  – Concrete entities
  – Proper nouns

Concrete entities as terms
• Things and their physical parts
  – primates
    • head
  – buildings
    • floors
• Materials
  – cement
  – wood
  – lead

Abstract concepts as terms
• Actions and events
  – evolution, skating, management, ceremonies
• Abstract entities
  – law, theory
• Properties of things, materials, and actions
  – strength, efficiency
• Disciplines and sciences
  – physics, meteorology, mathematics
• Units of measurement
  – pounds, kilograms, miles, meters, nanoseconds

Proper nouns as terms
• Individual entities – “classes of one” – expressed as proper nouns
  – San Francisco, Lake Michigan
Thesaurus standards prefer to exclude proper names, persons, and trade names. Extensive lists go to authority files. Taxonomies include them as final nodes.

Pop quiz – which qualify as terms? (“single unit of thought”)
• rooms
• living rooms
• living room furniture
• schools
• public schools
• public school curricula
• marketing and advertising
• societal issues
    information ethics, plagiarism, credibility
    information literacy, lifelong learning

The term record
• Main Term (MT) = subject term, heading, node, category, descriptor, class
• Top Term (TT)
• Broader Terms (BT)
• Narrower Terms (NT)
(TAXONOMY: the fields above)
• Related Terms (RT) – See also (SA)
• Scope Note (SN)
• History (H)
• NonPreferred Term (NP) – Used for (UF), See (S)
(THESAURUS: all of the fields)
See Lexicographer’s lexicon.

Build a taxonomy – simple steps
• Get paper and pencil
  – Sharpen pencil
• Define subject field
• Collect terms
• Organize terms
• Fill in gaps
• Flesh out and interrelate terms
You’re done!
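The term record fields listed above could be held in a small data structure. The following is a minimal sketch, not a format prescribed by any standard: the class name `TermRecord`, its attribute names, and the helper `is_taxonomy_only` are hypothetical, and the sample values reuse the Politics / Elections example from this deck.

```python
from dataclasses import dataclass, field


@dataclass
class TermRecord:
    """One term record; comments give the abbreviation used on the slide."""
    main_term: str                                            # MT
    top_term: str = ""                                        # TT
    broader: list[str] = field(default_factory=list)          # BT
    narrower: list[str] = field(default_factory=list)         # NT
    related: list[str] = field(default_factory=list)          # RT / See also
    scope_note: str = ""                                      # SN
    history: str = ""                                         # H
    non_preferred: list[str] = field(default_factory=list)    # UF / See

    def is_taxonomy_only(self) -> bool:
        """True when only hierarchy fields are filled (no RT, SN, H, or UF)."""
        return not (self.related or self.scope_note
                    or self.history or self.non_preferred)


# Sample record (hypothetical data, built from the deck's Politics example).
record = TermRecord(
    main_term="Elections",
    top_term="Politics",
    broader=["Politics"],
    narrower=["Presidential elections", "Gubernatorial elections",
              "Mayoral elections"],
)
```

A record that fills only MT, TT, BT, and NT is what the slide brackets as TAXONOMY; adding RT, SN, H, or UF moves it toward a THESAURUS record.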
Define subject field
• Review representative collection of content
• Determine:
  – Core areas
  – Peripheral topics
• Scope can be modified later
(Diagram labels: Sociology, Psychology, Education, Law)

Before you go on: Build or buy?
• Survey existing thesaurus/taxonomy resources for your domain
• Test for
  – Scope
  – Depth
    • Make-or-break terms
  – Cost
Don’t reinvent the wheel!

Collect terms
• Your documents and databases
• Departmental terminology
• Text books and their indexes (indices)
• Book tables of contents and indexes
• Journal quarterly indexes
• Encyclopediae
• Lexicons, glossaries on the topic
• Web resources
• Users and experts
• Search logs

Gather terms from search logs
Beyond the Spider: The Accidental Thesaurus (Richard Wiggins, Information Today, Oct 2002)
• Top ~100 search terms from search logs
• Match to web site with appropriate answer
• Basis for favorites or best bets, presented at the top of results list (AKA behavior-based taxonomy)
Not a thesaurus or taxonomy, but still a useful source of terms.

Organize terms – roughly
• Sort terms into several major categories – logical groups of similar concepts as Top Terms
  – Identify core areas and peripheral topics
  – 10 – 20 to start
  – Consider moving proper names to authority files
• Result: loose collection of terms under several main headings
  – Rough and tentative – see how it fits as you go
  – Initial gap analysis
  – Add / modify / delete as needed

Labelling a concept – cognitive linguistics
• Most-used labels are middle in range from abstract to specific – relates to search
• Linguistic universal – true across cultures
  • Unique beginner
  • Life form
  • Generic (e.g. Insurance)
  • Specific (e.g. Health insurance)
  • Varietal (e.g. Group health insurance)
Practical application?

Craft the Top Terms
• Toughest job and most important step!
• Dictates further organization
• Determines how browsers/searchers perceive the taxonomy
  – Coverage
  – Formality
• Establish the concept first, tweak the wording later

Usefulness of a term – the “duh” factor
• Some terms are so basic for a domain that they have little or no value
  – “Sports” in Sports Illustrated
  – “Technology” in Technology Review
  – “Golf” in Golf Magazine
• How useful will the term be for indexing?
  – Apply to everything in the domain?
  – Distinguish important concepts?
  – If term is needed, specify limited use conditions in Scope Note

Hierarchy structures – variations on a theme
• Not pre-determined
  – Wines → type → variety → region → cost
  – Or Wines → cost → type …
• Varies by user group and needs
  – May have multiple views of same content
  – Standard alpha view or customized notation
• Affects information architecture, i.e. how web site functions

How do terms relate?
• Hierarchical relationships – Parents and their children (TAXONOMY)
• Equivalence relationships – Aliases
• Associative relationships – Cousins
(THESAURUS: all three)

Hierarchical relationships
• Broader Term represents the category
• Narrower Term represents the specific
• Three types:
  – Generic relationship (BTG/NTG)
  – Whole-part relationship (BTP/NTP)
  – Instance relationship (BTI/NTI)
• BTs/NTs have a reciprocal relationship

Broader to Narrower Terms
Politics (Generic)
  Elections (Specific)
    Presidential elections (Varietal)
    Gubernatorial elections
    Mayoral elections
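The reciprocity rule for BT/NT links can be enforced mechanically. The sketch below is one possible way to do it, assuming a simple in-memory store: the class `Hierarchy` and its method names are hypothetical, and the terms come from the Politics example above.

```python
from collections import defaultdict


class Hierarchy:
    """Stores BT/NT links so that every link is recorded in both directions."""

    def __init__(self):
        self.broader = defaultdict(set)    # term -> its Broader Terms
        self.narrower = defaultdict(set)   # term -> its Narrower Terms

    def add(self, broader_term, narrower_term):
        # Writing both sides in one call keeps the relationship reciprocal.
        self.narrower[broader_term].add(narrower_term)
        self.broader[narrower_term].add(broader_term)

    def is_reciprocal(self):
        """Check that every NT link has a matching BT link and vice versa."""
        for bt, nts in self.narrower.items():
            if any(bt not in self.broader.get(nt, set()) for nt in nts):
                return False
        for nt, bts in self.broader.items():
            if any(nt not in self.narrower.get(bt, set()) for bt in bts):
                return False
        return True


h = Hierarchy()
h.add("Politics", "Elections")
for nt in ("Presidential elections", "Gubernatorial elections",
           "Mayoral elections"):
    h.add("Elections", nt)
```

A check like `is_reciprocal` is mainly useful after importing or hand-editing term records, where one side of a link is easy to forget.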
Hierarchy – Generic (genus-species) relationship
• Inheritance or inclusion – what’s true of the parent (BT) is true for all children (NTs)
• Applies to entities, actions, properties, agents – not just biological taxonomies

Value
  Cultural value
  Economic value
  Moral value
  Social value

Teachers
  Adult educators
  School teachers
  Special ed teachers
  Student teachers

Thinking
  Contemplation
  Divergent thinking
  Lateral thinking
  Reasoning

Generic relationship test – 1
• Both terms in same fundamental category
• “All-and-some” test
  – Rodents / Squirrels: SOME rodents are squirrels, ALL squirrels are rodents
  – Pests / Squirrels: SOME pests are squirrels, NOT ALL squirrels are pests

Generic relationship test – 2
Rodents, Pests, Squirrels
✓ ALL squirrels are rodents
x NOT ALL squirrels are pests
x NOT ALL pests are rodents

Hierarchy – Whole-part relationship
• Also known as meronymy or partonomy
• Four types allowed in thesaurus standards
  – Body systems and organs
    • Ear > Middle ear
  – Geographical locations
    • Bernalillo County > Albuquerque
  – Fields of study
    • Geology > Physical geology
  – Hierarchical organizational/corporate/social/political structures
    • Diocese > Parish

Hierarchy – Instance relationship
• General category (common noun) = BT
• Individual example (proper noun) = NT

Seas
  Baltic Sea
  Caspian Sea
  Mediterranean Sea

New York museums
  Guggenheim Museum
  Museum of Modern Art
  Museum of Natural History

Essentially identical to “final node” in taxonomies. Best practice: move a long list to an authority file.

Polyhierarchical relationship
• Term can logically fit under more than one Broader Term – can have Multiple Broader Terms (MBT)
• New to ANSI/NISO standards

Spoons > Sporks                    Forks > Sporks
Nurses > Nurse administrators      Health administrators > Nurse administrators
Finance > Accounting               Careers > Accounting
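Both the “all-and-some” test and polyhierarchy can be illustrated in a few lines. This is a sketch under stated assumptions: the member sets for squirrels, rodents, and pests are invented purely for illustration, and the function names `all_and_some` and `all_broader` are hypothetical. The polyhierarchy data reuses the Sporks, Nurse administrators, and Accounting examples above.

```python
def all_and_some(narrower_members, broader_members):
    """ALL narrower members belong to the broader class, and only SOME
    broader members are narrower ones (i.e. a proper subset)."""
    return set(narrower_members) < set(broader_members)


# Invented member sets, for illustration only.
squirrels = {"gray squirrel", "red squirrel"}
rodents = squirrels | {"house mouse", "beaver"}
pests = {"gray squirrel", "house mouse", "cockroach"}

# Polyhierarchy: a term may list more than one Broader Term.
broader_terms = {
    "Sporks": {"Spoons", "Forks"},
    "Nurse administrators": {"Nurses", "Health administrators"},
    "Accounting": {"Finance", "Careers"},
}


def all_broader(term, links):
    """Collect every ancestor reachable by following BT links upward."""
    seen = set()
    stack = [term]
    while stack:
        for bt in links.get(stack.pop(), ()):
            if bt not in seen:
                seen.add(bt)
                stack.append(bt)
    return seen
```

In practice the test is applied by editorial judgment rather than by enumerating members; the set version only makes the logic of “all” and “some” explicit.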
Equivalence relationship
• Preferred Term
  – Thesaurus term and valid for indexing
  – Thesaurus notation: USE
• NonPreferred Term
  – Not valid for indexing
  – An alias or imposter
  – Entry point, directs user to Preferred Term
  – Thesaurus notation: UF or NPT

Spiders UF Arachnids
Plant pathology USE Phytopathology

Equivalence – when to use
• Synonyms, slang, quasi-synonyms
• Scientific and trade names
  – Ibuprofen UF Motrin™
• Lexical variants
  – Fiber optics UF Fibre optics
  – Mouse UF Mice
• Upward posting of narrow concepts not specified in taxonomy or thesaurus
  – Social class UF Elite, Middle class, Working class
Get equivalent terms from search logs, brainstorming…

Associative relationship
• Related Terms (RTs) ~ cousins
• “…terms related conceptually but not hierarchically, and are not part of an equivalence set” (i.e. not synonyms)
  – Should siblings be Related Terms??
• Both terms are valid thesaurus terms for indexing, and have reciprocal relationship
• Expands user’s awareness, reflects thesaurus coverage of unanticipated areas
• Standards describe specific types (see Lexicon)

Sibling rivalry and facets
• Format and sense of sibling terms should be consistent
• If siblings don’t coexist well, separate them
• Subdivide large groups of terms into facets, mutually exclusive subcategories
• Growing demand with faceted navigation
• Facet examples
  – Properties, Materials, Agents, Actions, Influence
  – Objects, Styles and periods, Color, Shape (Art & Architecture Thesaurus)

Faceted classification
• Pharmaceuticals
  – (by action)
    • Anti-inflammatory agents…
  – (by chemical structure)
    • Alkaloids…
  – (by indication)
    • Pain…
  – (by use)
    • Immunosuppression…
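The USE / UF notation from the equivalence slides amounts to a lookup from any entry term to its Preferred Term. A minimal sketch follows, assuming case-insensitive matching (an assumption, not something the slides specify); the names `use_for`, `build_use_index`, and `resolve` are hypothetical, and the term pairs are taken from the examples above.

```python
# Preferred Term -> its NonPreferred Terms (the UF side of each record).
use_for = {
    "Spiders": ["Arachnids"],
    "Phytopathology": ["Plant pathology"],
    "Fiber optics": ["Fibre optics"],
    "Social class": ["Elite", "Middle class", "Working class"],
}


def build_use_index(uf_map):
    """Invert the UF lists into a USE index: entry term -> Preferred Term."""
    index = {}
    for preferred, variants in uf_map.items():
        index[preferred.lower()] = preferred
        for variant in variants:
            index[variant.lower()] = preferred
    return index


def resolve(entry, index):
    """Return the Preferred Term for an entry term, or None if unknown."""
    return index.get(entry.strip().lower())


use_index = build_use_index(use_for)
```

Building the USE side from the UF side in one pass also guarantees that the two notations stay reciprocal.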
Facet indicators (aka Node labels), not to be used for indexing

Faceting challenge
Propose facet indicators and subgroup these paint varieties into facets.
• Paint
  – Oil paint
  – High-gloss paint
  – Interior paint
  – Matte paint
  – Latex paint
  – Semi-gloss paint
  – Exterior paint

Scope Notes (SN)
• Indicate meaning of the term in the context of this thesaurus, for this audience
  – Stress – Metal, Psychological, Physiological
• Indicate any restriction in meaning
• Indicate range of topics covered
• Provide direction for indexers; for terms often confused, may suggest an alternative term
• Use only as needed – not for every term
• Establish and stick with consistent format
• Be concise

Evaluating terms
• Do terms represent all necessary concepts?
  – Gap analysis
• Do terms capture necessary details?
  – Level of granularity
• Are terms understood by users?
  – Domain expert vs. common user

Talk about terms
• Term format
• Grammatical issues
• Singular and plural forms
• Spelling
• Abbreviations and acronyms
• Capitalization
• Other punctuation
• Consistency

Term format
• KISS – Keep it short and simple
  – 1-2-3 words
• Effect on search
• Factoring, Postcoordination (coming)
• Grammatical issues
  – Nouns and noun phrases
  – Verbish things
  – Adjectives
  – Adverbs
  – Initial articles

Most terms are nouns
• Nouns or simple noun phrases (phrase = compound or bound term)
  – Adj + Noun – Art history (ANSI/NISO standard)
• Noun + Prep + Noun – History of art (ISO standard)
  – Exceptions – Burden of proof, Coats of arms, Prisoners of war, Birds of prey, etc.
Other parts of speech
• Verbs
  – Gerund form: Fishing
• Adjectives
  – Not used in isolation
  – Very rare (lots in Art & Architecture Thesaurus)
  – OK when combined with another term – Dental bridges
• Adverbs
  – No, except as part of proper name – Very Large Array
• Articles
  – No, except as part of proper name – El Salvador, Le Mans

Singular and plural forms
• Plural form for count nouns
  – “how many” clouds, animals, highways
• Singular form for mass nouns
  – “how much” security, oxygen, rain
• Exceptions
  – Body parts in medicine singular (heart, foot)
  – Unique entities singular (Brooklyn Bridge)
  – User warrant plural/singular (fishes)
stocks? fishes? monies?

Term spelling
• Preferred spelling depends on audience
  – Multinational company may need alternative spellings in same taxonomy
• Use most widely accepted spelling
• Use secondary spelling as NonPreferred Term (synonym)
• Exception:
  – Proper names – Labour Party

Abbreviations and acronyms
• Use only when full form is rarely seen
  – SCUBA, LASER, DNA, LASIK
• Use full form if abbreviation is not widely used and understood
  – Automated teller machines – for ATM
  – Driving while intoxicated – for DWI
• Alternative becomes NonPreferred Term
• Use and acceptance always shifting
• Be consistent

Capitalization
• Standards: use all lower case
  – Exceptions:
    • Initialisms – DNA
    • Proper names – Queen Mary
    • Trade names – Thesaurus Master™
    • Taxonomic names – Homo sapiens
• Much variation in practice

Parentheses
• Use only for
  – Parenthetical qualifiers to disambiguate homographs
    • Bridges (Dentistry), Bridges (Roadways), Bridges (Music)
  – Different meanings for singular / plural word forms
    • Bridges [all the above] vs. Bridge (Card game)
    • Wood (Material) vs. Woods (Forest)
    • Damage (Injury) vs. Damages (Law)
  – Facet indicators – Paint (by finish)
  – Part of the term – benzo(a)pyrene
  – Trademark indicator (tm) becomes ™

Hyphens
• Generally avoid – nonfiction
• Use only if
  – Omitting the hyphen would be ambiguous
    • cocitation vs. co-occurrence
  – The hyphen is part of the term
    • n-body problem
    • p-benzoquinone
    • CD-ROM

Other punctuation bits
• Apostrophes
  – Keep for possessive case
• Diacritical marks
  – Keep if possible – Québec
• Other random marks
  – Keep if part of a proper name – A&W Root Beer, Standard & Poor’s

Compound terms (aka bound terms) and factored terms
• Term consisting of more than one word that represents a single concept
• Keep compound term or factor out (split)?

Compound terms are precoordinated
• Elements are bound together to specify a concept at the indexing stage
• Can’t change the parts
  – Water pollution
  – Library science
  – Television influence on preschoolers
Chicken dinner with turnips and rutabagas: no substitutions of menu items!

Factored terms can be postcoordinated
• Elements can be strung together to specify a concept at the search stage
• Elements can be mixed and combined as needed
  – Few clothing pieces → several outfits
• The sum of the elements reflects the concept (usually)

To factor or not to factor
Is each factor a single concept? Is each factor in your thesaurus?
If YES, break term down to factors:
  California highway construction → California + Highways + Construction
If NO, or if factoring would be confusing, retain the compound term:
  Children’s television (Television + Children ??)
  Science library (Library + Science ??)
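The factoring decision above can be partly automated once an editor has proposed the factors. The sketch below checks only the second question (is each factor in your thesaurus?); whether each factor is a single concept, and whether splitting would confuse users, remain editorial judgments. The function name `factor_or_keep` and the sample thesaurus contents are hypothetical.

```python
def factor_or_keep(compound, proposed_factors, thesaurus):
    """Return the proposed factors if all are thesaurus terms;
    otherwise retain the compound term unchanged."""
    if proposed_factors and all(f in thesaurus for f in proposed_factors):
        return list(proposed_factors)
    return [compound]


# Hypothetical thesaurus contents, for illustration only.
thesaurus = {
    "California", "Highways", "Construction",
    "Television", "Children", "Children’s television",
}

factored = factor_or_keep(
    "California highway construction",
    ["California", "Highways", "Construction"],
    thesaurus,
)

# The editor judged that splitting would be confusing, so no factors
# are proposed and the compound term is retained.
kept = factor_or_keep("Children’s television", [], thesaurus)
```

Passing an empty list of factors is how this sketch records the editor's "factoring would be confusing" decision.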
Precoordination positives
• User expectations
  – Rapid transit
  – Occurs commonly in data
  – Splitting would be odd
  – Reflects a single concept for the audience
• Better accuracy – captures specific concepts precisely
• Fewer false drops
• Term information is retained (Related Terms, NonPreferred Terms, Scope Notes, …)

Precoordination negatives
• Poorer total recall
• Term proliferation
  – Combinations and permutations increase thesaurus size
• Higher cost
• Limited flexibility in expressing new concepts

Postcoordination pros and cons
✓ Higher recall
✓ Lower cost
✓ Greater flexibility – enables expression of new concepts through novel combinations
x Lower accuracy, some false drops
  – Library science NOT = Library + Science
  – Art museums NOT = Art + Museums
• Postcoordination is implicit in most online searches (implied AND between search words)

About “and”
• Avoid “and” in terms – not a single concept
  Instead of: Children and television
  Factor and postcoordinate: USE Media influence + Television + Children
• “and” OK when both elements are members of a broader class
  Vessels
    Ships and boats
Your need for granularity may dictate your choice

So far you’ve got
• Hierarchy
• Complete term records
  – Broader and Narrower Terms
    • Polyhierarchies when needed
  – Preferred/NonPreferred Terms (equivalence relationships)
  – Related Terms (associative relationships)
  – Scope Notes
  – Correct term format
  – Compound terms when needed
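The false-drop problem described on the postcoordination slide can be seen with a toy inverted index and an implied AND. This is only a sketch: the document identifiers and postings are invented, and `postcoordinate` is a hypothetical helper, not part of any search product.

```python
# Invented postings: which documents were indexed with which term.
# doc1 is about library science; doc2 is about a science library;
# doc3 is about science in general.
index = {
    "Library": {"doc1", "doc2"},
    "Science": {"doc1", "doc2", "doc3"},
    "Library science": {"doc1"},   # precoordinated compound term
}


def postcoordinate(postings, *terms):
    """Implied AND: return documents indexed with every given term."""
    sets = [postings.get(t, set()) for t in terms]
    return set.intersection(*sets) if sets else set()


post_hits = postcoordinate(index, "Library", "Science")
pre_hits = postcoordinate(index, "Library science")
false_drops = post_hits - pre_hits
```

Here the factored search retrieves both doc1 and doc2, while the compound term retrieves only doc1, which is the trade of accuracy for recall and flexibility that the slide lists.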
Notation
• Symbols (numbers, letters, hyphens, colons…)
  – 1: Apples
    • 1.1: Granny Smith
    • 1.2: Winesap
• Another kind of ordering (non-alphabetic)
  – Chronological, positional, numeric sequence, or other logical sequence for user group
  – Same terms presented differently
  – Different user groups, different purposes
• Adjunct to verbal expression of term
• Secondary to verbal concept organization

Review, edit, test, edit, use, edit, and maintain, i.e. edit
• Review
  – Users
  – Expert reviewers
• Test
  – Index 500+ documents (more for variable writing style; fewer for strict style)
  – Monitor search log
• Edit and maintain
  – Add term
  – Change existing term
  – Change term status
  – Delete term
  – Add term relationship
  – Delete term relationship
  – Add/modify Scope Note
  – Change overall structure
Consider machine automated / assisted indexing software

Automatic taxonomy construction
• Words and phrases from documents
• Based on frequency and co-occurrence of words
• No semantic analysis
• Produces list of possible terms
• Requires editorial analysis
  – hierarchical and conceptual organization
  – association of related concepts
  – identifying and deduplicating equivalent concepts

Show ‘em what you’ve got – displays for every user
• Thesaurus/taxonomy views and functions depend on audience and purpose
  – taxonomists
  – indexers
  – corporate workers
  – public searchers

For the taxonomist
• Hierarchy view
• Alphabetic view
• Permuted (KWIC) view
• Single term record view
• Graphical view
• Notational view
• Deleted terms
• Candidate terms
• Retrieve term record
• Find term in hierarchy view
Taxonomists NEED MOST and WANT even MORE!
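The automatic taxonomy construction slide above describes extracting possible terms by frequency and co-occurrence, with no semantic analysis. A rough sketch of that idea follows. It is not any vendor's algorithm: co-occurrence is simplified here to adjacent word pairs, and the stopword list, the `min_count` threshold, and the sample sentences are all assumptions made for illustration.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "of", "and", "a", "in", "to", "for", "is", "are"}


def candidate_terms(documents, min_count=2):
    """Return (single words, two-word phrases) seen at least min_count times.
    Purely statistical: the output still needs editorial review."""
    words = Counter()
    pairs = Counter()
    for text in documents:
        tokens = re.findall(r"[a-z]+", text.lower())
        words.update(t for t in tokens if t not in STOPWORDS)
        for first, second in zip(tokens, tokens[1:]):
            if first not in STOPWORDS and second not in STOPWORDS:
                pairs[f"{first} {second}"] += 1
    singles = [w for w, n in words.most_common() if n >= min_count]
    phrases = [p for p, n in pairs.most_common() if n >= min_count]
    return singles, phrases


# Invented sample text.
docs = [
    "Water pollution harms rivers.",
    "Water pollution and air pollution are rising.",
    "Rivers carry water.",
]
singles, phrases = candidate_terms(docs)
```

The output is only a list of candidates; deciding hierarchy, related terms, and equivalents is the editorial analysis the slide calls for.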
[Screenshots: Hierarchy, Alphabetical, Permuted (KWIC), Term record, and Notation views]

For the indexer
• Search to retrieve term record
• Access to Scope Notes, Related Terms, NonPreferred Terms
• Hierarchy view for the big picture
• Automated proposal of indexing terms

For the searcher
• Browsable directory (Yahoo.com, MediaSleuth.com)
• Faceted navigation (MOMA.org, LandsEnd.com)
• Alpha term list or terms grouped by letter
• Drop down list with selected terms
• Portal view – complete or partial taxonomy
  – Display terms may be identical to taxonomy terms
  – Display terms may be variants, mapped to taxonomy terms
• Taxonomy may not be accessible – requires random guessing

Display taxonomy categories
Results from sample of 1,100 documents (not all categories are populated)
Reveal Narrower Terms

Select taxonomy category to display titles

Access full bibliographic record

Faceted navigation

SLA website and thesaurus

SLA search

Concept indexing – effect on retrieval
Search query: THESAURUS
• Precision search based on M.A.I. indexing: 3 hits
• Free text, no indexing: 0 hits

Search: kangaroo
Broader Terms, Narrower Terms, Related Terms, Use (synonyms)

Leverage taxonomy term information to aid search
Indexing rule, Term record
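One way term record information might aid search is query expansion: a search word is matched to a term record and the search is widened with that record's synonyms and Narrower Terms. The sketch below assumes this approach; the record contents for the kangaroo example are invented (the slide shows only the field names), and `expand_query` is a hypothetical function, not the behavior of the software shown in the screenshots.

```python
# Invented term record for illustration; keys follow the slide's field names.
records = {
    "Kangaroos": {
        "BT": ["Marsupials"],
        "NT": ["Red kangaroos"],
        "RT": ["Wallabies"],
        "UF": ["kangaroo", "roos"],
    },
}


def expand_query(query, term_records, include=("NT",)):
    """Map a query to its Preferred Term, then add synonyms (UF) and any
    requested relationship types. Unknown queries are returned unchanged."""
    q = query.strip().lower()
    for preferred, rec in term_records.items():
        entry_points = [preferred] + rec.get("UF", [])
        if q in (e.lower() for e in entry_points):
            expanded = [preferred] + rec.get("UF", [])
            for rel in include:
                expanded += rec.get(rel, [])
            return expanded
    return [query]


expanded = expand_query("kangaroo", records)
```

Broader and Related Terms are left out by default in this sketch because adding them widens the search well beyond the original concept; they can be requested through `include`.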
What we’ve covered
• Taxonomy – from different perspectives
• Collecting and organizing concepts
• Term choice and vocabulary control
• Taxonomy structure
• Term relationships
• Term format
• Factored and compound terms
• Constructing a simple taxonomy
• Display variations for different users

“The Computer and the Poet”
“The biggest single need in computer technology is not for improved circuitry, or enlarged capacity, or prolonged memory, or miniaturized containers, but for better questions and better use of answers.”
Norman Cousins, editorial in The Saturday Review, July 23, 1966, special issue on “The New Computer Age”
Through taxonomies, effectively applied through indexing, we aim to efficiently connect the questions and the answers.

Questions? Comments?
Thanks for your attention!
Alice Redmond-Neal
ared@accessinn.com
Access Innovations, Inc. www.AccessInn.com
Data Harmony software www.DataHarmony.com