Research in digital libraries: the big picture
© Tefko Saracevic, Rutgers University

Visionaries
Vannevar Bush (1945)
– “As we may think”, Atlantic Monthly
– machine: Memex
J.C.R. Licklider (1964)
– Libraries of the Future
Both envisioned computers for storing, organizing, finding & retrieving human knowledge

Challenge
– computers are very effective at dealing with well-defined problems
– dealing with human knowledge is not a well-defined problem
Needed: an effective alliance between computers & human knowledge records
– not easy at all
– basic theme of all DL research

Estimation themes
– underestimated: how much can be achieved by brute computer power & cheap computing
– overestimated: the power of artificial intelligence, improvements in computer methods & natural language processing to deal with human knowledge records

The Web
– computers were in libraries long before the Web
– but the Web enabled digital libraries
  • provides a convenient way to distribute information over the Internet
Spurred both practical developments & applied research
– INDEPENDENT of each other

Technical problems
Substantial, larger & more complex than anticipated:
– representing, storing & retrieving library objects
  • particularly if originally designed to be printed & then digitized
– operationally managing large collections: issues of scale
– dealing with diverse & distributed collections

Research issues
– understanding objects in DL
  • representing in many formats
  • non-textual materials
– metadata, cataloging, indexing
– conversion, digitization
– organizing large collections
– managing collections, scaling
– preservation, archiving
– interoperability, standardization

Research issues ...
– user interfaces & human-computer interaction
– information discovery
  • search, retrieval, browsing
– natural language processing
– reliability, robustness
– performance, evaluation
– social, legal & economic issues
– impact on scholarship, education & other areas

Digital Library Initiatives (DLI)
Research consortia under the National Science Foundation
– DLI 1: 1994-98, 3 agencies, $24M, six large projects
– DLI 2: 1999-2006, 8 agencies, $60+M, 77 large & small projects in various categories
‘digital library’ was left undefined, to cover many topics & stretch ideas
– not constrained by practice

Projects in DLI 1
“Informedia Digital Video Library” – images in DL
  • Carnegie Mellon U; continued in DLI 2
“Environmental Planning and Geographic Information Systems” – environmental information
  • U of California at Berkeley; continued in DLI 2
“Spatially-referenced map information” – geospatial sources
  • U of California at Santa Barbara; continued in DLI 2

Projects in DLI 1 (cont.)
“Federating repositories of scientific literature” – interoperability
  • U of Illinois at Urbana-Champaign
“Intelligent agents for information location” – interfaces for providers, mediation & users
  • U of Michigan
“Interoperation mechanisms among heterogeneous services” – Infobus
  • Stanford U; continued in DLI 2

Results?
Mixed bag
– original hoopla high
– no overall evaluation, so it is difficult to gauge what was really accomplished
– some characteristics:
  • 5 of 6 Principal Investigators from computer science
  • four projects continue in DLI 2
  • four have established testbeds
  • two have links with the university library
Is digital library research a province of computer science?

Projects in DLI 2
Larger number & diversity
– reflects more agencies & fields
– computer scientists still dominant
– but others joined in many projects
  • still almost no librarians
A number of projects have definite operational goals
Some projects are oriented toward education

Sample of DLI2 projects
Re-inventing Scholarly Information Dissemination & Use
  • U of California - Berkeley
High Performance Digital Library Classification Systems
  • U of Arizona
Cuneiform Digital Library Initiative
  • U of California at Los Angeles
Data Provenance
  • U of Pennsylvania
Alexandria Digital Library Earth Prototype
  • U of California Santa Barbara
Project Prism: Information Integrity in Digital Libraries
  • Cornell University
Digital Analysis and Recognition of Whale Images on a Network
  • Eckerd College
An Operational Social Science Digital Data Library
  • Harvard
Creating the Digital Music Library
  • Indiana U
Techniques for restoring, searching, & editing humanities collections
  • University of Kentucky
National Gallery of the Spoken Word
  • Michigan State U
Automatic Reference Librarians for the World Wide Web
  • U of Washington
Digital Libraries for Children
  • U of Maryland

DLI2: Results so far…
28 projects in the main part:
– orientation: 18 (64%) domain, 10 technology; in DLI1, 33% domain
– 53% of PIs from computer science; in DLI1, 83%
– 5 include a practical dl in their domain
– 9 show demos of work
– but the amount & quality of results vary greatly from site to site

DL projects in practice
Association of Research Libraries (ARL) database:
– 427 dl projects in 13 countries
– 374 in the US
  • 51% in universities; 24% federal government; 9% historical societies; 6% regional …
  • 84% are explicitly retrospective; 16% technological
  • 1 listed from DLI (Illinois)
  • no connection with DLI projects

Agenda in ARL listed projects
– providing access to specialized materials and collections from an institution(s) that are otherwise not accessible
– covering a topic in an integral way with a range of sources
– providing technological support for specific functions in digital libraries

Internationally: UK
UK Electronic Libraries Programme (eLib)
– UKOLN: The UK Office for Library and Information Networking
– oriented toward practice & development, not directly toward research
– a number of innovative projects
– considerable funding
– heavy library involvement

European Union
DELOS Network of Excellence on Digital Libraries
– many projects throughout the European Union
  • heavily technological
– many meetings, workshops
– resembles the DLIs in the US
– well funded, long range

Agendas
Most dl research agendas are set from the top down
– from funding agencies to projects
– imprint of the computer science community's interests & vision
Most dl practice agendas are set from the bottom up
– from institutions, incl. many libraries
– imprint of institutional missions, interests & vision

Connections
dl research & dl practice are presently conducted
– mostly independently of each other,
– minimally informing each other,
– & having slight or no connection
Parallel universes with little connection & interaction

Conclusions
Most DLI research concentrated on technical, infrastructure issues
– Positive: essential for innovation
– Negative: research & library communities totally separated
Problems harder than thought, results less than expected
– normal for research in general
– how will this be translated into practice?