Choice and Forms of Headings (ISO 999)

1. Personal Names
• Names should be given in as full a form as possible.
• The heading should take the form used in the document, but if the text is not consistent, the indexer should adopt one form.
• Choose the most recent or the most commonly used form of a personal name as the heading and add “see” cross-references from other forms, e.g.
    Clemens, Samuel Langhorne see Twain, Mark
• Where surnames are in common use, the entry should be the surname followed by any given names or initials.
• Where surnames are not used, the name that customarily comes first should be used as the entry word, e.g.
    Imran Khan
• Persons identified only by a given name or forename should be indexed under that name, qualified if necessary by a title of office or other distinguishing epithet, e.g.
    Leonardo da Vinci
    Boudicca, Queen of the Iceni
• Persons normally identified by a title of honor or nobility should be indexed under that title, expanded if necessary by their family name, e.g.
    Dalai Lama
    First Duke of Marlborough, John Churchill
• Compound and multiple surnames, whether hyphenated or not, should be indexed under the first part, e.g.
    Layzell Ward, Patricia
    Perez de Cuellar, Javier

2. Corporate Bodies
• Names of corporate bodies should normally be indexed without transposition, e.g.
    British Museum
• Transposition may, however, be used if it would help the users of the index, e.g.
    Department of Agriculture see Agriculture, Department of
    J. Whitaker & Sons see Whitaker (J.) & Sons
• Choose the most recent or the most commonly used form of the corporate name as the main heading and add “see” cross-references from other forms, e.g.
    John Moores University see Liverpool John Moores University
    Liverpool John Moores University

3. Geographic Names
• Names should be as full as necessary for clarity, with additions to avoid confusion with otherwise identical names, e.g.
    Alaminos (Laguna)
    Alaminos (Pangasinan)
• An article or preposition should be retained in a geographic name of which it forms an integral part, e.g.
    La Paz
    Las Vegas
• Where the article or preposition does not form an integral part of a name, it should be omitted, e.g.
    New Forest rather than The New Forest
    Rheinfall rather than Der Rheinfall

4. Titles of Documents
• Titles should normally be italicized, underlined, or otherwise distinguished. If necessary for identification, names of creators, places of publication, dates, or other qualifiers may be added in parentheses, e.g.
    Ave Maria (Gounod)
    Ave Maria (Schubert)
    Ave Maria (Verdi)
• In an English index, articles in titles are conventionally transposed to the end of the heading so that filing order is explicit, e.g.
    Hunting of the Snark, The
    Kapital, Das
• A preposition at the beginning of a title should be retained, e.g.
    To the Lighthouse

5. First Lines of Poems
Conventionally, in an index of first lines of poems, the article is retained without transposition and is recognized for the purpose of alphabetical arrangement, e.g.
    A little thing in the snow
    The modest Rose puts forth a thorn

Evaluation of Indexes

Guidelines/Criteria

1. Subject error
• Errors in choosing subject descriptors
• Omission errors
• Use of a too broad or too narrow term

2. Generic searching
• Alphabetical indexes have always presented difficulties in promoting generic searching.

3. Terminology

4. Internal guidance
• Cross-references
• Printed instructions on how to use the index

5. Accuracy in referring
• Bibliographic citation
• Cross-references

6. Entry scattering
Example:
    College libraries
    School libraries
    National libraries
    Public libraries
    Special libraries

7. Entry differentiation
Example:
    Libraries, 1-2, 28-31, 42, 53-60, 82, 109-11, 131-40, 310, 342-50

8. Spelling and punctuation

9. Filing
• Letter by letter (Air base, Airborne, Air brake)
• Word by word (Air base, Air brake, Airborne)

10. Layout
• Main headings are in heavy print
• Subheadings are in lighter print and small letters, and are indented
• See references are italicized

11. Length and type
• Index length should be 3-5% of the pages of a typical nonfiction book, about 5-8% for a history or biography, and about 15-20% for reference books

12. Cost

13. Standards

Automatic Indexing
• Refers to indexing by machine, or the analysis of text by means of computer algorithms. The focus is on automatic methods used behind the scenes with little or no input from individual searchers, with the exception of relevance feedback.

Four Types of Approaches (Cleveland & Cleveland, 2001, p. 211)
• Statistical – based on counts of words, statistical associations, and collation techniques that assign weights and cluster similar words.
• Syntactical – stresses grammar and parts of speech, identifying concepts found in designated grammatical combinations, such as noun phrases.
• Semantic – concerned with the context sensitivity of words in the text. What does "cat" mean in terms of its context? House cats? Heavy earthmoving equipment?
• Knowledge-based – systems go beyond thesaurus or equivalence relationships to knowing the relationships between words, e.g. ‘tibia’ is part of a leg, thus the document is indexed under ‘leg injuries’.

Human/Manual Indexing vs. Automatic Indexing

Human/Manual Indexing
• Needs more people
• Costly
• Human error
• Low in production
• Quality can range from excellent to appalling

Automatic Indexing
• Needs less human effort
• Cheaper
• Follows instructions automatically
• Accurate
• Fast in production
• Promotes meticulous problem analysis
• Dependent on human intelligence
• Power lies in how the computer is programmed

Human/Manual Indexing vs. Automatic Indexing (continued)
• Automatic methods have trouble handling synonyms, homonyms, and semantic relations. Their conceptualizing ability is very poor.
• Human indexers go through cognitive processes that may be influenced by their background experience, education, training, intelligence, and common sense.
• Computers can, and humans cannot, organize all the words in a text and in a given database and perform statistical operations on them.

Indexing and the Internet

Search Tools
• Search engines – computer software that scans the Web and selects pages to be indexed for the searching system. They are often referred to as Web indexes since they examine the content of web pages. Examples: HotBot, InfoSeek, and Google.
• Directory-based systems – usually indexed by humans and thus tend to have a higher level of quality in the indexing. Indexing may be based on full text or on the most frequently used words. The material is organized for browsing, in a way similar to traditional library browsing. Examples: Yahoo! Directory and Google Directory.
• Metasearchers – allow the user to search across multiple search tools at once. They take the user’s query and submit it to a number of other search tools. Examples: Metacrawler and Surfmax.
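
The metasearcher behavior described above (take one query, submit it to several search tools, combine what comes back) can be sketched roughly as follows. This is only an illustration under stated assumptions: `tool_a` and `tool_b` are hypothetical stand-ins that return canned result lists, not real search services, and the merging rule (URLs returned by more tools first, then by best rank) is one possible choice, not how any particular metasearcher works.

```python
from typing import Callable, Dict, List

# Assumption: a search tool is modeled as a function that takes a query
# string and returns a ranked list of URLs (best match first).
SearchTool = Callable[[str], List[str]]


def metasearch(query: str, tools: Dict[str, SearchTool]) -> List[dict]:
    """Submit one query to every tool and merge the results.

    Duplicate URLs are collapsed into a single entry that records which
    tools returned it and the best (lowest) rank it received.
    """
    merged: Dict[str, dict] = {}
    for name, tool in tools.items():
        for rank, url in enumerate(tool(query), start=1):
            entry = merged.setdefault(
                url, {"url": url, "sources": [], "best_rank": rank}
            )
            entry["sources"].append(name)
            entry["best_rank"] = min(entry["best_rank"], rank)
    # Illustrative ordering: more sources first, then best rank, then URL.
    return sorted(
        merged.values(),
        key=lambda e: (-len(e["sources"]), e["best_rank"], e["url"]),
    )


# Hypothetical stand-ins for real search tools (canned results only).
def tool_a(query: str) -> List[str]:
    return ["http://example.org/a", "http://example.org/b"]


def tool_b(query: str) -> List[str]:
    return ["http://example.org/b", "http://example.org/c"]


if __name__ == "__main__":
    for hit in metasearch("indexing", {"tool_a": tool_a, "tool_b": tool_b}):
        print(hit["url"], hit["sources"])
```

Collapsing duplicates matters here because the same page is often returned by more than one tool; listing it once, with its sources, keeps the merged result list readable.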