Music to our Eyes: Music Content in Google Books, Google Scholar, and the Open Content Alliance

Kirstin Dougan, Music and Performing Arts Librarian, Music and Performing Arts Library, University of Illinois at Urbana-Champaign

September 10, 2009

This paper is an expansion of a talk given at the Music Library Association Annual Meeting in Chicago, IL, February 2009.

Abstract

The Internet has changed how scholars perform their research. With more and more content available online that is fully searchable (often even for free), it’s not surprising that students and others may turn there first before coming to the library. This study aims to assess the utility, for music research, of three of the largest free access points to print materials online: Google Book Search, the Internet Archive, and Google Scholar. It finds that print music materials are represented in all three of these tools and that music librarians and others researching music topics should not eschew them.

Introduction

Google Book Search, the Internet Archive, and Google Scholar represent a large part of the rapidly expanding digital universe, and are often where students turn first when looking for information sources for research projects in every discipline, including music.1 In students’ minds, the convenience of 24x7 access from anywhere via an internet connection often trumps our carefully selected but sometimes hard-to-use subscription tools. While there are certainly several high-quality digital projects hosted by libraries and others that include music content,2 these may not be the first place students look for resources; they instead often turn to Google because it’s what they know and are comfortable with. Rather than chastising students for not coming to the library, we should instead try to find value in the tools they already use and discover how their use might be successfully integrated with our tools.
Karen Schneider observed in 2006 that “The User is Not Broken,”3 and it follows that librarians should be familiar with the tools our users are using beyond the library’s resources.

This paper will examine two of the largest mass digitization projects in recent years, Google Book Search and the Open Content Alliance’s contributions to the Internet Archive, as well as Google’s attempt to identify scholarly communications, Google Scholar. The goal is to present a survey of the scope and extent of music-related materials located in these three tools. The focus is on printed music (scores), music monographs, and music periodicals; the recordings in the Internet Archive are not considered.

Anecdotal evidence suggests that there is not much, or perhaps not much of quality, in these three tools for music librarians and others looking for music and music research materials to consider. However, this is not the case. While it’s not possible to provide a complete list of all of the music content in these tools, it is possible to outline the major items of significance found in them. Although each of these tools has its drawbacks, and for the most part they are a far cry from the controlled environments in which librarians are most comfortable (e.g., library catalogs and subscription journal databases), they do offer content and features that we should not ignore, because our users certainly aren’t.

What are they?

Google Book Search and the Open Content Alliance’s contributions to the Internet Archive are both access digitization projects, meaning their goal is to digitize large quantities of materials in a cost-effective manner to allow increased discovery of and greater access to these materials than is possible for their print versions alone.4 Both provide scanned images of books and other content that have been made searchable through optical character recognition (OCR).
Typical OCR programs do not know how to encode musical notes, which is likely one reason why more musical scores have not been included in either of these projects. The scores that are included are therefore only searchable by title, composer, publisher, and perhaps instrument and movement names, if this information is included in the metadata. Whereas books and other print materials can be searched by their full text, scores cannot.

An additional problem presented by scanning scores is their size. While most monographs are of a somewhat consistent size, musical scores can be very small or very, very large, with no standard size between publishers (although individual publishers may have their own standard size). Music scores can also have unusual binding configurations not common in books that could present a challenge for the scanning operators. Additionally, a published score may consist of both a full score showing all of the instrument/voice parts and individual parts for each performer. While this could potentially present a problem when digitizing them, Google and OCA’s scanning setups should be flexible enough within cost-effective boundaries to accommodate scores with parts.

Google Book Search

Google Book Search (GBS), launched in beta as Google Print in 2003, renamed in 2004, and graduated from beta in mid-2009, states that its aim is “to help you discover books and learn where to buy or borrow them, not read them online from start to finish.”5 This is worth remembering when evaluating how much of its content is available in full-text and how much is only indicated as a citation or limited preview.
From its earliest days, GBS has included information and (often limited) content for many items that are supplied directly by their publishers, and currently has information from 25,000 publishers in over 100 countries and 100 languages.6 GBS famously partnered with libraries, which then became the primary contributors of full-text content; these include American university libraries and the NYPL, in addition to notable European national and university libraries such as Oxford. The full-text content is in large part out of copyright,7 while entries for in-copyright materials offer only a limited preview or just a citation. For many titles GBS offers the table of contents, information about other editions of the same title, information about related books, references to the book from other books, journals, or websites, and links to reviews. Also very helpful is the “Find in a Library” feature, which is powered by Open WorldCat.

Open Content Alliance/Internet Archive

The Open Content Alliance (OCA) was formed in 2005 specifically to counter GBS.8 It was intended to be a model of transparency as opposed to the perceived secrecy of the Google business model, although there is disagreement about whether a higher level of transparency has indeed been reached.9 It is a consortium of academic and public libraries and organizations contributing to a permanent, publicly-accessible archive of digitized texts. They include the Boston Library Consortium, the California Digital Library, CARLI, Lyrasis, TRLN, and many individual libraries, including the Library of Congress and the NYPL. Also included are GBS project libraries including the University of Illinois, Harvard, and Oxford.10 The contributed content from these libraries forms the Americana Collection in OCA.
Microsoft funded much of the initial OCA scanning, and was primarily interested in Americana; therefore “only those very few libraries who pay for their own scanning with OCA got to choose what was actually scanned. Microsoft didn’t specify individual titles for the libraries they funded, but they certainly specified subject coverage.”11 Other collections in the text archive include over 300,000 items from Canadian Libraries, European Libraries, the Universal Library Project (aka the Million Books Project), Project Gutenberg, and others.

The materials contributed by OCA members are delivered via the Open Access Text Archive section of the Internet Archive (which is why the names OCA and Internet Archive are sometimes used interchangeably). Like GBS, the content is in large part out of copyright, but there are in-copyright materials for which OCA obtained permission from publishers beforehand.12 However, it has been observed that OCA has mis-stated the copyright status for some items (claiming some things are under copyright that are not and vice versa).13 Google is the larger of the two projects, with over seven million items,14 compared to the Open Content Alliance’s Americana Collection’s total of just under one million items, but of course both of these numbers are rapidly increasing.

Google Scholar

Released in beta in 2004 (and unlike GBS still in beta), Google Scholar (GS) is a bit different from GBS and OCA, in that it focuses solely on what it deems “scholarly” literature, including journal content in addition to book and dissertation citations. GS searches the same content as Google.com, but limits results to what it deems scholarly “peer-reviewed papers, theses, books, abstracts and articles, from academic publishers, professional societies, preprint repositories, universities and other scholarly organizations.
Google Scholar helps you identify the most relevant research across the world of scholarly research.”15 OCLC loaded citations for most of WorldCat into Google and Google Scholar, which vastly increases their scholarly content.16 GS claims to help users “identify the most relevant research across the world of scholarly research…weighing the full text of each article, the author, the publication in which the article appears, and how often the piece has been cited in other scholarly literature.”17 This methodology somewhat disadvantages the most recently published research because its newness dictates that it won’t have many, if any, citations yet.

Searches can be limited by date, author, or publication, among other things. Results are sometimes citation only, sometimes a link to a content service provider, and sometimes a link directly to a PDF. The service provider links lead to various sources, not necessarily ones your institution subscribes to, so researchers may need to do another search in a library-owned resource to find full-text availability. Very useful features of GS include an item’s “Cited by” counts and links, “related articles” links, and “Discover” SFX linking (if available at your institution).18 This citation linking feature is especially useful for libraries that don’t have access to citation indexes such as Web of Science.

One of the biggest limitations of GS is that it doesn’t provide a list of journals it indexes. By contrast, for any subscription journal tool we can browse a list of indexed journals by subject to help determine if the tool is worth using for the question at hand. Another issue is that of multiple versions, which is problematic in both GS and GBS, but for different reasons. GBS does not collocate various versions of an item scanned from different sources, and each version may be of vastly different quality, so it is up to the user to determine which version of an item is acceptable for their use.
In the case of GS, there may be a preprint and a published version of the same article; but again, like items are not collocated in the results, and so the user may find one but not know the other exists. This is perhaps less of a problem for music/humanities than the sciences, but is still worth knowing.19

Literature Review

Although Google Book Search has gotten plenty of ink in the popular and trade media,20 especially concerning its legality, there are fewer articles concerning the Open Content Alliance, and fewer still that offer scholarly evaluations of either GBS’s or OCA’s content. Kalev Leetaru and Robert Lackie’s 2008 articles provide a good history of Google Books and OCA and comparisons of their functionality.21 Paul Duguid’s 2007 article attempts to test the levels of quality assurance present in Google Books and finds it somewhat lacking.22 Many articles, including Duguid’s, observe that GBS has some issues with scanning quality. They note that some images are blurred or partially obscured, and that there are examples of the metadata not matching the image files (what’s scanned is not what it says it is). Jonathan Bengtson provides one of the only articles solely about OCA and its history.23

GBS was favorably reviewed for the Music Library Association’s periodical Notes in September of 2008.24 Joseph Grobelny outlined the various strengths and weaknesses of the tool in terms of searching and content related to music. He notes that there are several “historical American music resources, like Louis Elson’s The History of American Music (1915) or O. G. Sonneck’s Bibliography of Early Secular American Music (1905) found in GBS.
There is also full-text access to Grove’s Dictionary of Music and Musicians (1907 and 1920) and Baker’s Biographical Dictionary of Musicians (1905 and 1919), along with numerous other music reference sources.”25 He found that based on searches by LC subject headings, the content is stronger for things found in ML and MT ranges, which makes sense, since those are used for books, while scores (LC class M) receive less coverage. He also determined that the broad categories of music history, musical analysis, music theory, jazz, and rock/popular music and music instruction are well represented, and that there is a weakness in world music.

Studies of Google Scholar vary in their estimations of its quality, content, currency, and search precision. This is in large part due to the methodologies and criteria employed, but also because of the databases used as targets for comparison. Researchers comparing GS to a general abstracting and indexing database, or against subject-specific databases from several disciplines, have found different results than those comparing GS to multiple databases in the same discipline. When it was first launched, the content in GS seemed to focus primarily on the sciences (for example, subject-specific evaluations of GS have been conducted by John Meier and Thomas Conkling (Engineering) and Michael Levine-Clark and Joseph Kraus (Chemistry)),26 but now there is a surprising amount of content in other disciplines, including music.
According to the 2006 article “The Depth and Breadth of Google Scholar: An Empirical Study,” of which one of the authors is a music librarian, Google Scholar contains 6% of the citations in the International Index to Music Periodicals, 30% of JSTOR (which has over 40 music titles), and 94% of the Cambridge Journals Online (which has 14 music and drama titles).27 Based on percentages of journal titles covered, they draw the conclusion that GS is best for research in the sciences and not as good for social sciences or humanities topics. Philipp Mayr and Anne-Kathrin Walter found in their comparison of Google Scholar against the Web of Science (including the citation indices for Arts and Humanities, Social Sciences, and Science) and Open Access journals from the Directory of Open Access Journals, that 80.5% of the journals from the Arts and Humanities Citation Index are included in GS (41.8% via links, 50.73% as citations only, and 7.49% as full-text), and that for titles from the Directory of Open Access Journals (which has 30 music titles), 22.11% are included in full text, 48.2% are available via links, and 29.61% are available as citations only.28 Susan Gardner and Susanna Eng, in their comparison of GS against some of the standard social sciences databases, determined that GS had more content and a greater variety of document types, and that its results had greater relevancy rankings. However, they found that GS was not as current in its indexing as the other tools.29 In his study comparing GS’s recall and precision against other bibliographic databases (mostly general, social science, and science) William Walters found that “the idiosyncrasies of Google Scholar’s search mechanism--the absence of controlled subject terms for example--do not compromise its ability to retrieve relevant results in response to simple keywords searches.
In fact, the GS mechanism performs better than most.”30 He concludes that “These findings suggest that a searcher who is unwilling to search multiple databases or to adopt a sophisticated search strategy is likely to achieve better than average recall and precision by using Google Scholar.”31 However, he does provide the caveat that results may vary if the study were performed with different databases in a different field. In fact, in her article focusing on core literature coverage of ecology in GS, Marilyn Christianson found that GS only indexed 57-77 percent of articles from a sample list of articles from core ecology journals, or, if held to a higher level of citation standards, only about fifty percent of the articles were indexed.32

One of the harshest evaluators of GS, Peter Jacsó, reports in “Google Scholar Revisited” (a 2008 reprise of the studies he did in 2005 soon after it came on the scene) that there are various searching abnormalities that make it difficult to get accurate results.33 He cites specifically problems with “innumeracy” and “illiteracy.” The first term refers to inconsistent results in GS when trying to limit searches (e.g., using the Boolean operator OR should increase the number of results returned, but instead they decreased). In addition, even when large result sets are returned, only the first 1000 results will be listed.34 “Illiteracy” refers to GS’ “deficiencies distinguishing author names from other parts of the text using its parsing algorithm” (e.g., author names such as M Data, R Findings, and N. Vietnam).35 While Jacsó agrees that GS’s sheer size and the fact that it includes many document types not normally covered in A&I databases are good things, he balances that with the observations that it’s impossible to determine GS’s true size and that it does not cover as many of the open access materials (such as PubMed and Nature) as it could.
Discovering what they hold for music scholars: methodology

This study of Google Book Search (GBS) and the Open Content Alliance (OCA) focused on printed music (scores) content, with locating music-related book and journal content a secondary concern. Music scores are unique to the discipline, while every discipline has monographs and journals. The search options and metadata available in these tools made this task challenging, as they are not functionally equivalent to library catalogs. In addition, there are no options for sorting and manipulating the search results in GBS, and only a maximum of 1000 items can be seen at a time. OCA results can be sorted by Average rating, Download count, Date, or Date added, but not by the traditional methods of author, title, or publisher. Results can also be grouped by Relevance, Mediatype (as in file type, not by format of the original item), or Collection (American Libraries, Canadian Libraries, Universal Library, Project Gutenberg, etc.). But perhaps the largest problem is the one of format: there is no way to limit a search in GBS or OCA to printed music.

An inherent difficulty in searching for printed music (or recordings) in any system, whether it is a library online catalog or a digital library, is the need to distinguish whether you want materials written by a particular composer or written about him, and whether you want the particular piece of music itself, or writings about it. In library catalogs this can be accomplished through a combination of limiting by format, and being cognizant of which terms we put in the author, subject, and title search boxes. These exact options, however, are not available in GBS or OCA.

Google Book Search

In order to identify music scores in GBS, the concentration was on items available in at least limited or full view, not citation only.
In the first phase of searching, the advanced search “Return books on subject” option was used to search GBS for the following LC subject headings: Hymns, Piano music, Sonatas piano, Violin music, Sonatas violin, and Orchestral music. These searches revealed that there are machine-assigned keywords for music, piano music, Sonatas (piano), songs with piano, orchestral music, etc., which helped narrow further searches. Unfortunately, GBS’s subject search does not differentiate between the subject term “music” meaning “book about music” or “printed music.” As Grobelny stated in his review of GBS, “…their application of subjects to books is uneven. Many Library of Congress subject headings (LCSH) get results with the subject search […] Many items […] have rudimentary subjects, such as orchestral scores that have only one subject: ‘Music.’” He also notes that due to lack of strict authority control, older forms of LC subject headings are present if they were present in the contributing library’s metadata. While some subject headings are better than none, this might pose a complication for some searchers.

The second and third phases emphasized finding book and journal content but used different search approaches. In the second phase, searches for known items were performed, particularly reference books and historic journals, with a focus on those available in full view. Searches were also conducted on publisher names that occurred more frequently than others in the initial search results for music scores.

In the third phase, broad keyword searches were performed for some basic concepts, much in the manner an undergraduate might use when beginning a research project. The topical search terms “music business,” “music education,” “music and copyright,” “music theory,” “rap music,” and “Sarum gradual” (an important 13th century text) were used, and searches were not limited to full-view only.
Any number of searches could be conducted for various terms relating to music. These were chosen because they represent a wide spectrum of basic searches patrons might make. Of course many real-life searches may include more specific terms and concepts.

OCA

In evaluating OCA the focus was on the American Libraries sub-collection because it was likely to be closest in scholarly content to Google Books, and because this made the searches more manageable. While OCA does have an advanced search, it is a bit harder to use because there are “Custom fields” which have been used by contributors in different ways and “Media types” which are also not consistent. The Media types “audio,” “music,” “Sound,” “sound,” and “au dio [sic]” seem to apply to recorded sound, while “Text,” “Texts,” “text,” and “texts” can be used for limiting search results to text, with different results. A search was run for media type = Texts and subject = music, and the same subject, known-item, and keyword searches were run as in GBS. These media limits were not an option in GBS.

Google Scholar

In searching for music-related journals and materials in Google Scholar (GS), the goal was not, as in other studies, to determine a precise number of titles included (an effort that would be rendered moot if GS would make public its title lists and sources), but rather to determine in a broader sense what is included that is relevant for music scholars. Anecdotal evidence and studies such as Neuhaus et al (2006) and Mayr and Walter (2008) suggested that there were not likely to be large numbers of music titles. Therefore, no effort was made to compare GS to the primary music journal indexing tools; instead, it was compared to the more general journal database Academic Search Premier from Ebsco. Given GS’s scope, music scores were not likely to be found, and therefore those search phases from GBS and OCA were not replicated.
Unlike GBS or OCA, GS allows searches to be limited to broad subject areas. In the normal reference desk setting searches might be limited to “social sciences/arts/humanities,” to make the results more precise and manageable. There is, however, content relevant to music scholars in most if not all of the categories, especially for researchers with an interdisciplinary focus. For the purposes of this study searches were not limited by subject area because there is not a similar feature in Academic Search Premier.

The search for music materials was conducted in two phases. First, because several studies claimed that GS has good coverage of open access journals, searches were performed for publication names of the music titles on the Directory of Open Access Journals (DOAJ) list.36 In a second phase, topical searches for “music business,” “music and copyright,” “music theory,” “rap music,” “music education,” and “Sarum gradual” were conducted. Finally, these same keyword searches were conducted in Academic Search Premier to compare the results against GS’s.

Findings

Google Book Search

The subject search in GBS (see Table 1 for results), while offering some control, still leaves a lot to be desired in accuracy because some of the subject keywords in the item records come from library metadata and some are machine assigned during the OCR process. There are more than two violin sonatas, but they do not necessarily carry subject terms that identify them as such. Additionally, as mentioned earlier, there is no format limit capability in searching and a subject of “violin music” could mean violin music scores or books about violin music, so this is not a perfectly accurate representation of score holdings in GBS.

[insert table 1]

In comparison, a simple keyword search for “piano music” in full view returned over 2,800 items. These include scores, books, song books like “Songs of Columbia,” and vocal scores of operas and operettas.
A few specific examples of scores include Mendelssohn’s Song without Words (Schirmer), the Bach/Czerny Well-tempered clavichord (Schirmer), and MacDowell’s Sonata Tragica (Schirmer). A large percentage of these scores were contributed by Harvard. Hymnals are another form of printed music well-represented in GBS, presumably because they are usually “book” sized, have text as well as music, and don’t pose any special scanning problems.

Not surprisingly, one music publisher well-represented in GBS is Dover, which produces well-known reprints of other publishers’ public domain works. When a search for subject = music and publisher = Dover was conducted, 3,373 items were returned. However, only 151 results were viewable. As mentioned earlier, Jacsó noted that only the first 1,000 search results are viewable, even on searches that return more than 1,000 items. Nevertheless, entire search result sets were consistently not viewable, even in sets numbering fewer than 1,000 items. Perhaps this is related to how the search algorithms function: GBS may be counting occurrences of search terms, not items, so an item that contains the search term three times counts as three results. In trying to create a smaller result set to test, a search for all Dover scores composed by Mozart was performed. GBS claimed to have 750 scores but again, only 318 items could be seen. All of the Dover scores are categorized as “limited preview” but all of the music content seems to be present. What’s missing appears to be introductory matter still under copyright to Dover.

Other publisher names appeared frequently in initial searches, so some searches were performed on the publisher field without limiting to “full view.” There are over 12,000 Schirmer items, including books, scores, and some periodicals. A search for A-R Editions netted over 2,000 scores in “limited preview only” from their Recent Researches series37 and other score publications.
Other music publishers represented include Fischer, Ayer, Oliver Ditson, Presser, Novello, and Universal Editions among others.

It is possible to limit GBS searches to “books” or “magazines,” but unfortunately, “magazines” refers to the popular magazine content they added in late 2008, not scholarly journals.38 This is relevant to music scholars because among the new popular titles added is Billboard, available in full view from 1942 forward, although the run is not complete. It’s unclear why there are gaps in the run, and this would likely be very confusing to users expecting to find issues from the missing years.

Music periodicals are not widely represented in GBS in full text, but there are a few exceptions, including these historically significant titles:

(Leipziger) Allgemeine musikalische Zeitung, 1822-1882 complete, and some earlier issues between 1798 and 1816
Rivista Musicale Italiana, 1894-1908
Several issues of Die Musik from 1901-1908
A dozen issues of the Musical Times from the 1880s through the early 1900s
Two dozen issues of The Musical World ranging from the late 1840s to the 1880s

Through known-title searching and from observing search results during other searches, music monographs, reference works, opera libretti, and ballads all appear to be well-represented in full-text. A few specific examples include the Biographie universelle by Fétis,39 Eitner’s Bibliographie der Musik-Sammelwerke,40 Stanbrook Abbey’s 1897 Gregorian Music: An Outline of Musical Palæography, Music of the Japanese from the 1891 Transcripts of the Asiatic Society of Japan, 100 of the roughly 300 pages of Barry Brook’s Thematic Catalogues in Music, and Grove’s “Dictionary of Music and Musicians.” A title search for this work returned 325 full-view items, but the results screens only listed 12 items.
These volumes come from the first and second editions of Grove, but neither is represented in full.41

Searches were also conducted on “all books” for a few subject concepts using “exact phrase” keyword searching except where noted. See Table 2 for results.

[insert table 2]

As noted, the search for “music business” returned over 3,000 items; the first three screens of results included primarily “limited preview only” items, many of which were less than ten years old. Narrowing the search by clicking on Google’s suggested heading “music / business aspects” returned just over 1,800 items. In contrast, searching WorldCat for “music business” and limiting to books returned only 634 items. A search in GBS for “music education” returned over 3,600 items. The same search in WorldCat nets 9,600 books. The phrase “rap music” in GBS nets over 1,800 items and suggests the following ways to narrow the search: Refine results for rap music: Music / Genres & Styles / Rap & Hip Hop; Music / Ethnic; Biography & Autobiography / Composers & Musicians; Social Science / Popular Culture.

While this study does not attempt to formally evaluate the relevance of the search results, certainly not all of the titles returned are relevant. However, a great number of them will be of use to scholars of all levels, even if the full text of the item is not available. The “Find in a library” function will help lead them to the item or hopefully to the reference desk, if needed.

Open Content Alliance/Internet Archive

The first thing to note about OCA’s text content is that there is some overlap between it and Google Books because some institutions (such as University of Illinois) deposit their Google-scanned content into OCA. An initial search of OCA for media type = Texts AND subject = music resulted in 2,562 items, the vast majority of which were in fact printed music. See Table 3 for results.
[insert table 3]

Perhaps the most notable score content to date is University of Illinois’ contribution of almost 100 opera and musical vocal scores, including:

The Mikado by Gilbert and Sullivan (with 2,247 downloads as of 6/8/09)
Have a Heart by Jerome Kern (with 261 downloads)
Die tote Stadt by Erich Korngold (with 84 downloads)

The NYPL has also deposited many scores (primarily Schirmer and Fischer editions), including piano and violin music of Mozart, Brahms, Bartok, Tchaikovsky, Copland, Ibert, Schubert, Liszt, Debussy, Chopin, Sarasate, and even Bruch’s 2nd violin concerto. Some of these are scores only, but some are scores with parts. In one example OCA has seamlessly addressed the problem of representing a score and part, with the cello part simply following the piano part with a few clicks.42 There are fewer large-scale works included, but there is, for example, a miniature (study) score of Holst’s The Planets.43

As in GBS, hymnals and opera libretti are well-represented in OCA. There are also a fair number of music monographs, reference works, and even periodicals. A keyword search for “music” with media type = texts returned 4,214 items including several issues of Music magazine (but you can’t tell from the results list which ones until you click on them), Oxford Dictionary of Music (1950) 6th printing (with 2,876 downloads), History of Arabian Music from 1929, and volumes from Grove first and second editions, but again, as in GBS, neither edition is represented in full. The Boston Library Consortium has contributed a number of important monographs under the direction of their music librarians, including Eitner and Fétis, and all volumes of the Pazdírek Universal-Handbuch der Musikliteratur (which has no text in Google, just citations for some of the volumes). Incidentally, it was not possible to find this work through an author search if the diacritic mark over the “i” was omitted.
Keyword searching in OCA will have vastly different results than in GBS for several reasons. See Table 4 below. There is simply less material in OCA, and the full text of items is not searched in OCA as it is in GBS (unless the search has been limited to one item); instead, only the item metadata is searched. Also, because OCA focuses more closely on texts in the public domain, there are no records for contemporary materials as there are in GBS. Therefore, modern topics such as “music business” and “rap music” will have few to no hits in OCA.

[insert table 4]

This is a good example of needing to choose the right tool for the job.

Google Scholar

A search for relevant titles from the Directory of Open Access Journals (DOAJ) in Google Scholar showed that of the thirty music titles, five (16.7%) were not represented at all (and all five are foreign-language journals), twelve had ten or fewer results, and only four had a significant number of results (one of which is a Spanish-language journal). GS’s help pages caution the user about searching for journal titles, as they are not consistently recorded.44 GS contains full text and/or direct links to vendor databases for over thirty major music journals, including Psychology of Music, Computer Music Journal, Ethnomusicology, Journal of Research in Music Education, Contemporary Music Review, and Journal of Popular Music Studies. There are several music therapy titles, likely because of the field’s interdisciplinary nature and the larger number of scientific and medical journal titles searched by GS. GS also contains dissertations and conference proceedings, including the Proceedings of the International Computer Music Conference, the International Conference on Music Perception and Cognition, and the International Society for Music Information Retrieval (ISMIR). It also contains entries for patents via Google Patents, such as the one for the Music page score turner.45

The following searches could not be limited to “subject” as was done in GBS and OCA, which is the main reason the result sets are so much larger in GS than in the other tools. See Table 5 for results.

[insert table 5]

The same searches were performed in EBSCO’s Academic Search Premier without applying any limits. See Table 6 for results.

[insert table 6]

Conclusions

There is a lot of music-related material in Google Book Search, Open Content Alliance, and Google Scholar, more than might have been expected. While, for various reasons, there is less printed music than there are books about music or journal articles, these tools can serve a useful purpose for music scholars and others looking for music materials. For those patrons searching for oft-checked-out materials, important historic monographs or journals, reference works that don’t circulate, or interdisciplinary topics, these three tools are worth utilizing. They provide a level of access beyond what our library catalogs can (in terms of full-text searching), and with the “Find in a Library” features in GBS and GS, can serve as a way to guide users to more traditional resources, services, and reference help if needed.

Nevertheless, while these tools seem dazzling (especially to our users) because of their vast amounts of content and apparent ease of use, they are not without their problems, especially for serious scholars who have some familiarity with standard library research catalogs and journal databases. First and foremost, metadata is problematic for both GBS and the OCA. It is often difficult to determine what an item really is (Is it a score or a libretto? What edition is it?) until viewing it. GS also contains very brief citations, often with misinformation, which can make it hard to identify an item.
In some ways these tools are best for known-item searching, at least for items available in limited or full view, especially historic texts (as one would expect given copyright issues). Further problems with GBS include the enormous size of the database; even focused searches can return too many hits to wade through practically. This is compounded by the fact that sometimes what is described isn’t what has been scanned, and some scans are unreadable (blurring, gutter loss, partial obfuscation). The limitations of the searching and results manipulation of all three tools have been mentioned previously.

Where GS, and to some extent GBS, will be of most use to music librarians and scholars is for topics that go beyond individual journals or cross disciplines, such as acoustics, music and popular culture, music and media, music cognition, the psychology of music, and music and marketing or consumerism. One example of this last topic is “Measuring the Effect of Music Downloads on Music Purchases” from the Journal of Law and Economics,46 a prime example of the type of article often requested at the reference desk that wouldn’t likely be found in a traditional music-centered journal database.

Of course the question many music librarians and scholars have is why there isn’t more music material, especially printed music, in GBS and OCA. There are several factors at work, the two primary ones being the inability of OCR to handle musical scores and the vastly inconsistent physical sizes of printed scores (including scores with separate physical parts). Although scanning can be outsourced for larger items, it requires money that could make this impractical on a large scale for some institutions (the outsourcing itself can be quite cost-effective, but it adds another layer to the logistics of the project).
Another reason why more historic popular sheet music (often the focus of music digital library projects because much of it is in the public domain) isn’t included is that most libraries lack the resources to fully catalog individual titles. Both GBS and OCA require that items have some sort of individual record in an online library catalog that can be captured.

Coverage in Google Scholar is also an open question. Since GS does include some of the DOAJ music journals, why not all of them? It also includes music journals via other sources such as JSTOR, Sage, Oxford, and so on; again, why not all of them?

Further studies could more specifically compare the music content in Google Scholar to music journal databases, although there is perhaps not yet enough relevant content to make this worthwhile. Another possibility is to evaluate the academic value of audio music content in the Internet Archive (and perhaps YouTube). Students and others have learned that YouTube can be invaluable for locating performances of obscure or new compositions that may not be available on commercial recordings.

So where does this leave us in our evaluation of these tools for music scholars? Should we turn away from traditional library tools in favor of these three tools? No. Are there times when using these tools will be called for? Yes. Should we be more tolerant of students and patrons using these tools? Yes. Do I hope that these tools will improve and incorporate more of the features so valued by librarians? Yes. There are music scholars who are not yet aware of the benefits to be gained from using these tools. While we as librarians may not recommend these tools to them over our existing library resources and online subscription tools, we should maintain an awareness of what they have to offer and know when to suggest them to our users to further their research.
Appendix: Google Scholar Music Journals List

Action, Criticism and Theory for Music Education (DOAJ)
British Journal of Music Education (via Cambridge Journals*)
Computer Music Journal (via MIT Press, ACM Portal, and JSTOR)
Contemporary Music Review (via IngentaConnect/InformaWorld)
Critical Studies in Improvisation (DOAJ)
Early Music (via Oxford Press)
Empirical Musicology Review (DOAJ)
Ethnomusicology Forum (via InformaWorld)
Ethnomusicology
International Journal of Community Music (DOAJ)
International Journal of Music Education (via Sage)
Journal of Music Theory (via JSTOR)
Journal of Music Therapy (via PubMed)
Journal of New Music Research (via InformaWorld and IngentaConnect)
Journal of Popular Music Studies (via Blackwell Synergy/Wiley InterScience)
Journal of Research in Music Education (via ERIC.ed.gov)
Journal of the Acoustical Society of America (via PubMed)
Journal of the Society for Musicology in Ireland
Journal of Voice (via Elsevier)
Leonardo Music Journal (via MIT Press)
Music and Letters (via Oxford Journals)
Music Education Research (via IngentaConnect/InformaWorld)
Music Educators Journal (via JSTOR)
Music Perception
Music Reference Services Quarterly (via Haworth/Ingenta)
Music Theory Online (DOAJ, mostly citations)
Music Theory Spectrum (via CALIBER/JSTOR)
Music Therapy
Music Therapy Today (DOAJ)
Notes: Quarterly Journal of the Music Library Association (via JSTOR)
Perspectives of New Music (via JSTOR)
Philosophy of Music Education Review (via Muse and ERIC)
Popular Music & Society (via Ingenta)
Popular Music History
Popular Music
Psychology of Music (via Sage)
Psychomusicology (via PsycInfo)
Revista Musical Chilena (DOAJ)
TRANS: TRanscultural Music Review (DOAJ)
Voices: A World Forum for Music Therapy (DOAJ)

*Note that sources may vary by institutional subscription availability.
Notes

1 Cathy DeRosa, “Perceptions of Libraries and Information Resources: A Report to the OCLC Membership” (Dublin, OH: OCLC, 2005).
2 These include Petrucci Music Library http://imslp.org/ (accessed July 9, 2009), the Library of Congress’s Performing Arts Encyclopedia, which contains many digitized pieces of sheet music http://www.loc.gov/performingarts/ (accessed July 9, 2009), and other online sheet music projects listed here: http://library.duke.edu/music/sheetmusic/collections.html (accessed July 9, 2009).
3 Karen Schneider, The Free Range Librarian blog, http://freerangelibrarian.com/2006/06/03/the-user-is-not-broken-a-meme-masquerading-as-a-manifesto/ (accessed June 9, 2009).
4 Preservation digitization, on the other hand, focuses on producing as authentic a digital surrogate of the original as possible. It may also increase access and discovery, but is often more time-consuming and costly to accomplish.
5 Google Book Search Help, “Why can’t I read the entire book?” http://books.google.com/support/bin/answer.py?answer=43729&cbid=1eznom41z7nze&src=cb&lev=index (accessed June 9, 2009).
6 Tom Turvey (Google Book Search), “The Universal Collection” (talk given at the CIC-CLI Off-the-Shelf conference, Bloomington, IN, May 19, 2009).
7 Turvey indicated that ~60% of all content was from 1964-present and ~20% was from 1923-1963.
8 Kalev Leetaru, “Mass Book Digitization: The Deeper Story of Google Books and the Open Content Alliance,” First Monday 13, 10 (October 6, 2008).
9 Ibid.
10 Open Content Alliance, “Contributors,” http://www.opencontentalliance.org/contributors/ (accessed July 9, 2009).
11 Betsy Kruger, University of Illinois at Urbana-Champaign, Head of Digital Content Creation and Illinois’ Google Project Manager, email message to author, June 25, 2009.
12 Leetaru, 2008.
13 Ibid.
14 Ibid.
15 Google Scholar, “About Google Scholar,” http://scholar.google.com/intl/en/scholar/about.html (accessed June 9, 2009).
16 Burton Callicott and Debbie Vaughn, “Google Scholar vs. Library Scholar: Testing the Performance of Schoogle,” Internet Reference Services Quarterly 10, 3/4 (April 2006): 71-88.
17 Google Scholar, “About Google Scholar,” http://scholar.google.com/intl/en/scholar/about.html (accessed June 9, 2009).
18 Jeffrey Young, “100 Colleges Sign Up with Google to Speed Access to Library Resources,” Chronicle of Higher Education 51, 27 (May 20, 2005): A30.
19 Carol Tenopir, “Google in the Academic Library,” Library Journal (February 1, 2005): 32.
20 Charles W. Bailey, Jr., “Google Book Search Bibliography, version 3” (December 8, 2008), http://www.digital-scholarship.org/gbsb/ (accessed June 9, 2009).
21 Robert J. Lackie, “From Google Print to Google Book Search: The Controversial Initiative and Its Impact on Other Remarkable Digitization Projects,” The Reference Librarian 49, 1 (August 2008): 35-53 and Leetaru, 2008.
22 Paul Duguid, “Inheritance and Loss? A Brief Survey of Google Books,” First Monday 12, 8 (August 6, 2007).
23 Jonathan B. Bengtson, “The Birth of the Universal Library,” Library Journal (Spring 2006): 2-4, 6.
24 Joseph Grobelny, “Google Book Search, and: Live Search Books (review),” Notes 65, 1 (September 2008): 136-140.
25 Ibid., 139.
26 John J. Meier and Thomas W. Conkling, “Google Scholar’s Coverage of the Engineering Literature: An Empirical Study,” The Journal of Academic Librarianship 34, 3 (May 2008): 196-201 and Michael Levine-Clark and Joseph Kraus, “Finding Chemistry Information Using Google Scholar: A Comparison with Chemical Abstracts Service,” Science and Technology Libraries 27, 4 (August 2007): 3-17.
27 Chris Neuhaus, Ellen Neuhaus, Alan Asher, and Clint Wrede, “The Depth and Breadth of Google Scholar: An Empirical Study,” portal: Libraries and the Academy 6, 2 (April 2006): 135.
28 Philipp Mayr and Anne-Kathrin Walter, “Studying Journal Coverage in Google Scholar,” Journal of Library Administration 47, 1/2 (September 2008): 93-4.
29 Susan Gardner and Susanna Eng, “Gaga over Google? Scholar in the Social Sciences,” Library Hi Tech News 22, 8 (2005): 42-5.
30 William H. Walters, “Google Scholar Search Performance: Comparative Recall and Precision,” portal: Libraries and the Academy 9, 1 (January 2009): 10.
31 Walters, 16.
32 Marilyn Christianson, “Ecology Articles in Google Scholar: Levels of Access to Articles in Core Journals,” Issues in Science and Technology Librarianship 49 (Winter 2007), http://www.istl.org/07winter/refereed.html (accessed June 9, 2009).
33 Peter Jacsó, “Google Scholar Revisited,” Online Information Review 32, 1 (2008): 102-114.
34 Ibid., 107.
35 Ibid., 110.
36 DOAJ journals belonging to subject “Music,” http://www.doaj.org/doaj?func=subject&cpid=6 (accessed June 9, 2009).
37 Recent Researches in the Music of the Middle Ages and Early Renaissance, Recent Researches in the Music of the Renaissance, Recent Researches in the Music of the Baroque Era, Recent Researches in the Music of the Classical Era, Recent Researches in the Oral Traditions of Music, and Recent Researches in American Music.
38 The Official Google Blog, http://googleblog.blogspot.com/2008/12/search-and-find-magazines-on-google.html (accessed June 9, 2009).
39 François-Joseph Fétis, Biographie universelle des musiciens: et bibliographie générale de la musique (Paris: Firmin Didot frères, fils et cie, 1878-81).
40 Robert Eitner, Bibliographie der Musik-Sammelwerke des XVI. und XVII. Jahrhunderts (Berlin: L. Liepmannssohn, 1877).
41 I could find the following volumes: 1890 v1 (2), 1890 index volume (2), 1889 v4 (2), 1890 v4, 1911 v1, 1911 v5, 1920 v6 “American Supplement” (2) as of February 2009.
42 W. H. Squire, At Twilight = Triste (New York: C. Fischer, 1907), http://www.archive.org/details/attwilighttriste00squi (accessed June 25, 2009).
43 Gustav Holst, The Planets: Suite for Large Orchestra (London: Boosey and Hawkes, 1921), http://www.archive.org/details/Holst_ThePlanets (accessed July 9, 2009).
44 Google Scholar, “Advanced Scholar Search Tips,” http://scholar.google.com/intl/en/scholar/refinesearch.html (accessed June 25, 2009).
45 Music page score turner, RW Edwards, PC Stavrou, US Patent 7,238,872, 2007; Portable page turner for music sheets, Douglas J. Carr et al., Patent number: 5203248, Filing date: Feb 25, 1992; Page turner for music manuscripts and the like, Robert C. Burster, Patent number: 5052266, Filing date: Apr 2, 1990.
46 Alejandro Zentner, “Measuring the Effect of Music Downloads on Music Purchases,” Journal of Law and Economics 49, 1 (April 2006): 63-90.