Comparison of Keyword Searching Using FAST vs. Using LCSH
Presentation for the ALCTS CCS Program: FAST: A New System of Subject Access for Cataloging and Metadata
New Orleans, Saturday, June 24, 2006
by Arlene G. Taylor

The Database
OCLC (i.e., Ed O’Neill and team) created a test database of bibliographic records
Each record had both a set of LCSH headings and a set of FAST headings
Records were a subset of WorldCat records
The FAST headings were “translated” from the LCSH headings
Two indexes were created by OCLC’s research team: one to search FAST headings and one to search LCSH headings
© 2006 Arlene G. Taylor

The Project
Participants were students in a Subject Analysis class at the University of Pittsburgh
Two parts:
  Search both the LCSH and FAST indexes for newspapers in the student’s home state
  Search four topics of interest in both the LCSH and FAST indexes
Students were asked to explain differences found in the two indexes

Newspaper searches
Results of searches for “newspapers” and any state that has an authorized AACR2 abbreviation are almost always different in the two indexes
A search retrieves both records for newspapers themselves and records for works about newspapers
The state is abbreviated on some records, using the abbreviations in AACR2, but the searcher almost always spells out the state name (some states, e.g., Ohio and Iowa, have no abbreviation)

Newspaper searches (cont.)
A record about newspapers may have an LCSH subject heading:
  650 0 American newspapers $z Pennsylvania $z Bucks County
In FAST this is translated to:
  650 7 American newspapers $2 fast
  651 7 Pennsylvania $z Bucks County $2 fast
A keyword search for “Newspapers Pennsylvania” will retrieve the record in both LCSH and FAST indexes.

Newspaper searches (cont.)
A record for a newspaper itself may have the LCSH heading:
  651 0 Clearfield (Clearfield County, Pa.) $v Newspapers.
In FAST this is translated to:
  651 7 Pennsylvania $z Clearfield (Clearfield County) $2 fast
  655 7 Newspapers $2 fast
A keyword search for “Newspapers Pennsylvania” will retrieve the record only in the FAST index, because the LCSH heading gives the state only as the abbreviation “Pa.”

Part II of the project
While most students understood on some level the different results they got in Part I, few of them understood their different results in Part II.
Therefore, the result of Part II was to generate 76 topics that I then searched again to determine results and the reasons for differences.

Basic statistics
Number of searches: 76
Number of records found using FAST index: 2371
Number of records found using LCSH index: 2340
Number of records the same using either index: 2200
Number of records not found using LCSH index: 171
Number of records not found using FAST index: 140

Reasons for variation in searching results
Invalid LCSH (or not established) not translated to FAST
$x and/or $v in 600 and 610 fields not indexed in the LCSH index
Word indexed in FAST index because it was in a 650 field with 2nd indicator 7 and a $2 at the end, but the $2 contained a code for a vocabulary other than FAST
Some names (personal or corporate) not translated to FAST
Differences between LCSH and FAST

Invalid LCSH (or not established) not translated to FAST
At the time of creation of the FAST file we were working with, the rule was to convert LCSH (6xx, 2nd indicator 0) to FAST, but then only those headings that matched a FAST authority record were kept as FAST headings in the record.
117 records found using the LCSH index were not found using the FAST index due to this “rule”
An example showing a result of searching for “information literacy” follows:

Search for “information literacy”:
  650 0 Business $x Research.
  650 0 Business $x Research $x Computer network resources.
  650 0 Information retrieval $x Study and teaching.
  650 0 Electronic information resource literacy $x Study and teaching.
  650 7 Business $x Research $2 fast
  650 7 Business $x Research $x Computer network resources $2 fast
  650 7 Information retrieval $x Study and teaching $2 fast

Invalid LCSH (or not established) not translated to FAST (cont.)
“Electronic information resource literacy” is in the FAST authority file, but not with the subdivision “Study and teaching.”
Currently the heading would have the subdivision removed and a match would be made to the heading without the subdivision.
A keyword search for “information literacy” in the future would find this record through the FAST index as well as the LCSH index.

$x and/or $v in 600 and 610 fields not indexed for the LCSH index
At the time of creation of the FAST and LCSH indexes we were working with, only subfields a, b, c, d (and q in 600) in fields 600 and 610 (with 2nd indicator 0) were indexed for the LCSH index.
72 records found using the FAST index were not found using the LCSH index due to this “rule”
An example showing a result of searching for “archives catalogs” follows:

Search for “archives catalogs”:
  610 20 Baptist Missionary Society $x Archives $v Catalogs.
  650 0 Baptists $x Missions $z West Indies.
  650 0 Baptists $x Missions $z Africa.
  650 0 Baptists $x Missions $z Asia.
  610 27 Baptist Missionary Society. $2 fast
  650 7 Archives $2 fast
  650 7 Baptists $x Missions $2 fast
  651 7 Africa $2 fast
  651 7 Asia $2 fast
  651 7 West Indies $2 fast
  655 7 Catalogs $2 fast

$x and/or $v in 600 and 610 fields not indexed for the LCSH index (cont.)
Currently these subfields would be included in the LCSH index.
A keyword search for “archives catalogs” in the future would find this record through the LCSH index as well as the FAST index.

Word indexed in FAST index because it was in a 650 field with 2nd indicator 7 and a $2 at the end
Not all 2nd indicator 7, $2 designated terms are FAST terms; some are from gsafd, nasa, ram, lctgm, etc.
40 records found using the FAST index were not found using the LCSH index due to this oversight
An example showing a result of searching for “dog training” follows:

Search for “dog training”:
  650 0 Dog trainers $z Arkansas $z Blanchard Springs.
  650 7 Animal training $z Arkansas $z Blanchard Springs $y 1950-1960. $2 lctgm
  650 7 Dogs $z Arkansas $z Blanchard Springs $y 1950-1960. $2 lctgm
  650 7 Photojournalism $z Arkansas $z Little Rock $y 1950-1960. $2 lctgm
  650 7 Dog trainers $2 fast
  651 7 Arkansas $z Little Rock $2 fast

Word indexed in FAST index because it was in a 650 field with 2nd indicator 7 and a $2 at the end (cont.)
Currently the indexing program would be refined so as not to include fields with 2nd indicator 7 and $2 unless “fast” is in $2.
A keyword search for “dog training” in the future would not find this record through either the LCSH index or the FAST index.

Some names (personal or corporate) not translated to FAST
The program that translated LC 6xx headings to FAST compared names to the “FAST authority file” and validated only those that were matched in the file.
20 records found using the LCSH index were not found using the FAST index due to this “rule”
An example showing a result of searching for “technical services” follows:

Search for “technical services”:
  610 20 Kansas Real Estate Commission $x Auditing.
  610 10 Kansas. $b State Board of Technical Professions $x Auditing.
  610 10 Kansas. $b Board of Emergency Medical Services $x Auditing.
  610 27 Kansas Real Estate Commission $2 fast
  610 17 Kansas. $b State Board of Technical Professions $2 fast
  650 7 Auditing $2 fast

Some names (personal or corporate) not translated to FAST (cont.)
The corporate name containing “technical” is in the FAST authority file, but not the name containing “services.”
A keyword search for “technical services” in the future would find this record through the FAST index as well as the LCSH index.

Differences between LCSH and FAST
“Politics and government” as a subdivision in LCSH is changed to “Political science” in FAST
“Appropriations and expenditures” as a subdivision in LCSH is changed to “Expenditures, Public” in FAST
“Exhibitions” as a subdivision in LCSH is changed to “Exhibition catalogs” in FAST
“Columbia River Watershed” and “Pacific Coast (U.S.)” were translated to FAST with “United States” as a geographic heading
“Arabic” is a language element in LCSH and is also coded in the 008 field. This is considered redundant in FAST
“Library” as a subdivision in LCSH is changed to “Libraries” in FAST
“Study and teaching (Higher)” as a subdivision in LCSH is changed to “Higher education” in FAST

“Politics and government” as a subdivision in LCSH is changed to “Political science” in FAST
This change affects any keyword search using any one of the words: politics, government, political, or science
1 record found using the LCSH index was not found using the FAST index, and 27 records found using the FAST index were not found using the LCSH index due to this “rule”
Examples showing a result of searching for “government documents” and a result of searching for “religion and science” follow:

Search for “government documents”:
  651 0 Egypt $x Politics and government $y 30 B.C.-640 A.D. $v Sources.
  650 0 Legal documents $z Egypt $x History $v Sources.
  648 7 30 B.C. - 640 A.D. $2 fast
  650 7 Legal documents $2 fast
  650 7 Political science $2 fast
  651 7 Egypt $2 fast
  655 7 History $2 fast
  655 7 Sources $2 fast

Search for “religion and science”:
  650 0 Islam and politics $z Algeria.
  650 0 Religion and politics $z Algeria.
  651 0 Algeria $x Politics and government.
  650 7 Islam and politics $2 fast
  650 7 Political science $2 fast
  650 7 Religion and politics $2 fast
  651 7 Algeria $2 fast

“Appropriations and expenditures” as a subdivision in LCSH is changed to “Expenditures, Public” in FAST
This change affects any keyword search using the word “appropriations” or the word “public”
23 records found using the FAST index were not found using the LCSH index due to this “rule”
An example is the search for “public service”:

Search for “public service”:
  610 10 United States. $b Dept. of the Air Force $x Appropriations and expenditures.
  610 10 United States. $b Defense Finance and Accounting Service. $b Denver Center $x Auditing.
  610 17 United States. $b Defense Finance and Accounting Service. $b Denver Center $2 fast
  610 17 United States. $b Dept. of the Air Force. $2 fast
  650 7 Auditing $2 fast
  650 7 Expenditures, Public $2 fast

“Exhibitions” as a subdivision in LCSH is changed to “Catalogs $v Exhibition catalogs” in FAST
This change affects any keyword search using the words: exhibition, exhibitions, or catalogs
4 records found using the FAST index were not found using the LCSH index due to this “rule”
An example is the search for “archives catalogs”:

Search for “archives catalogs”:
  610 10 United States. $b National Archives and Records Administration $x Photograph collections $v Exhibitions.
  650 0 Photography $z United States $x History $y 20th century $v Exhibitions.
  610 17 United States. $b National Archives and Records Administration $2 fast
  648 7 1900 - 1999 $2 fast
  650 7 Photograph collections $2 fast
  650 7 Photography $2 fast
  651 7 United States $2 fast
  655 7 Catalogs $v Exhibition catalogs $2 fast
  655 7 History $2 fast

“Columbia River Watershed” and “Pacific Coast (U.S.)” were translated to FAST with “United States” as a geographic heading
This change affects any searches qualified by “United States” spelled out
2 records found using the FAST index were not found using the LCSH index due to this “rule”
An example is the search for “endangered species United States”:

Search for “endangered species United States”:
  650 0 Endangered species $z Columbia River Watershed.
  650 0 Logging $x Environmental aspects $z Columbia River Watershed.
  610 20 Plum Creek Timber Company.
  610 27 Plum Creek Timber Company $2 fast
  650 7 Endangered species $2 fast
  650 7 Logging $x Environmental aspects $2 fast
  651 7 United States $z Columbia River Watershed $2 fast

“Arabic” is a language element in LCSH and is also coded in the 008 field, so it is considered redundant in FAST
This change affects any searches using the word “Arabic.”
2 records found using the LCSH index were not found using the FAST index due to this “rule”
An example is the search for “arabic books”:

Search for “arabic books”:
  008 990614s1960 ru 000 0 ara d
  500 In Russian and Arabic.
  650 0 Russian language $v Conversation and phrase books $x Arabic.
  650 7 Russian language $2 fast
  655 7 Conversation and phrase books $2 fast

“Library” as a subdivision in LCSH is changed to “Libraries” in FAST
This change affects any searches using the word “library” or the word “libraries”
2 records found using the FAST index were not found using the LCSH index due to this “rule”
An example is the search for “medical libraries”:

Search for “medical libraries”:
  650 0 Medicine $v Bibliography $v Catalogs.
  610 20 Moody Medical Library $v Catalogs.
  600 10 Blocker, T. G. $q (Truman Graves) $x Library $v Catalogs.
  600 17 Blocker, T. G. $q (Truman Graves) $2 fast
  610 27 Moody Medical Library. $2 fast
  650 7 Libraries $2 fast
  650 7 Medicine $2 fast
  655 7 Bibliography $v Catalogs $2 fast
  655 7 Catalogs $2 fast

“Study and teaching (Higher)” as a subdivision in LCSH is changed to “Higher education” in FAST
This change affects any searches using the words: study, teaching, education
1 record found using the FAST index was not found using the LCSH index due to this “rule”
An example is the search for “education policy”:

Search for “education policy”:
  650 0 Arctic regions $x Research $x Government policy $z Canada.
  650 0 Research $z Arctic regions.
  651 0 Arctic regions $x Study and teaching (Higher) $z Canada.
  650 7 Education, Higher $2 fast
  650 7 Research $2 fast
  650 7 Research $x Government policy $2 fast
  651 7 Arctic regions $2 fast
  651 7 Canada $2 fast

Conclusions
A total of 62 records were affected by real differences between LCSH and FAST (about 3%)
The real differences affected 9 of the 76 searches (about 12%), but only 62 of the records in those 9 searches were affected; 472 records in the 9 searches were the same in both indexes

Thank you!
Arlene G. Taylor
ataylor@mail.sis.pitt.edu
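
Cross-check of the reported counts
The per-cause record counts given on the slides can be tallied against the basic statistics and the conclusions. The following is a minimal sketch in Python and is not part of the original presentation. All variable names, the short cause labels, and the grouping of causes into "processing" causes versus real vocabulary differences are labels introduced here for illustration. Using the FAST total as the denominator for the "about 3%" figure is an assumption, since the slides do not state which total was used.

```python
# Counts copied from the slides. All names below are illustrative labels.
fast_total, lcsh_total, shared = 2371, 2340, 2200

# Records retrieved through the FAST index only, by reported cause.
fast_only = {
    "600/610 $x and $v not indexed for LCSH": 72,
    "non-FAST $2 vocabulary indexed as FAST": 40,
    "Politics and government": 27,
    "Appropriations and expenditures": 23,
    "Exhibitions": 4,
    "United States added as geographic heading": 2,
    "Library": 2,
    "Study and teaching (Higher)": 1,
}

# Records retrieved through the LCSH index only, by reported cause.
lcsh_only = {
    "LCSH heading not matched in FAST authority file": 117,
    "name not matched in FAST authority file": 20,
    "Politics and government": 1,
    "Arabic language element": 2,
}

# Causes that stem from how the test files were built and indexed,
# as opposed to differences between the two vocabularies themselves.
processing_causes = {
    "600/610 $x and $v not indexed for LCSH",
    "non-FAST $2 vocabulary indexed as FAST",
    "LCSH heading not matched in FAST authority file",
    "name not matched in FAST authority file",
}


def real_difference_count():
    """Sum the records attributed to real LCSH/FAST differences."""
    merged = list(fast_only.items()) + list(lcsh_only.items())
    return sum(n for cause, n in merged if cause not in processing_causes)


fast_only_total = sum(fast_only.values())
lcsh_only_total = sum(lcsh_only.values())
real = real_difference_count()

# Each pair printed on one line should agree.
print(fast_only_total, fast_total - shared)
print(lcsh_only_total, lcsh_total - shared)
# Share of records (assumed denominator: FAST total) and of searches affected.
print(real, round(100 * real / fast_total, 1))
print(round(100 * 9 / 76, 1))
```

Under these groupings the per-cause counts sum to the 171 and 140 records reported under Basic statistics, and the causes classed as real differences sum to the 62 records cited in the Conclusions.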