DELOS Conference (Pisa, Italy, 14 Feb 2007)
Digital Libraries: From Proposals to Projects to Systems to Theory to Curricula
Edward A. Fox
Virginia Tech
Blacksburg, VA 24061 USA

Outline
• Acknowledgments
• Introduction
• Proposals
• Projects
• Systems
• Theory
• Curricula
• Examples
• Summary
• Discussion

Acknowledgements
• Students
• Faculty, Staff
• Collaborators
• Support
• Mentors

Acknowledgements: Students
• Pavel Calado, Yuxin Chen, Fernando Das Neves, Shahrooz Feizabadi, Robert France, Marcos Gonçalves, Doug Gorton, Nithiwat Kampanya, Rohit Kelapure, S.H. Kim, Neill Kipp, Aaron Krowne, Bing Liu, Ming Luo, Paul Mather, Uma Murthy, Sanghee Oh, Ananth Raghavan, Unni Ravindranathan, Ryan Richardson, Rao Shen, Ohm Sornil, Hussein Suleman, Ricardo da Silva Torres, Srinivas Vemuri, Wensi Xi, Seungwon Yang, Baoping Zhang, Qinwei Zhu, …

Acknowledgements: Faculty, Staff
• Lillian Cassel, Lois Delcambre, Debra Dudley, Roger Ehrich, Joanne Eustis, Weiguo Fan, James Flanagan, C. Lee Giles, Sandy Grant, Eric Hallerman, Eberhard Hilf, John Impagliazzo, Filip Jagodzinski, Douglas Knight, Deborah Knox, Alberto Laender, David Maier, Gail McMillan, Claudia Medeiros, Manuel Perez-Quinones, Jeff Pomerantz, Naren Ramakrishnan, Layne Watson, Barbara Wildemuth, …

Other Collaborators (Selected)
• Brazil: FUA, UFMG, UNICAMP
• Case Western Reserve University
• Emory, Notre Dame, Oregon State
• Germany: Univ. Oldenburg
• Mexico: UDLA (Puebla), Monterrey
• College of NJ, Hofstra, Penn State, Villanova
• Portland State University
• University of Arizona, University of Florida, Univ. of Illinois, University of Virginia
• VTLS (slides on digital repositories, NDLTD)

Acknowledgements: Support
ACM, Adobe, AOL, CAPES, CNI, CONACyT, DFG, IBM, IMLS, Microsoft, NASA, NDLTD, NLM, NSF (IIS-9986089, 0080748, 0086227, 0307867, 0325579, 0532825, 0535057, 0535060; ITR0325579; DUE-0121679, 0121741, 0136690, 0333531, 0333601, 0435059), OCLC, SOLINET, SUN, SURA, UNESCO, US Dept. Ed. (FIPSE), VTLS, …

Acknowledgements - Mentors
• JCR Licklider – undergrad advisor (1969-71)
  – Author in 1965 of “Libraries of the Future”
  – Before, at ARPA, funded start of Internet
• Michael Kessler – BS thesis advisor
  – Project TIP (technical information project)
  – Defined bibliographic coupling
• Gerard Salton – graduate advisor (1978-83)
  – “Father of Information Retrieval”
  – Application of Scientific Methods toward Integration of Theory, Systems, Experiments, and Education

Libraries of the Future
JCR Licklider, 1965, MIT Press
(Figure labels: World, Nation, State, City, Community)

Introduction – Mentor Challenges
• Scientific method
  – “Leonardo da Vinci: The first scientist”
• Theory-based -> integration
  – Across computing disciplines
  – Over content, representations, services
• Experimentally proven
  – Evaluation: formative, summative
• Practically useful and beneficial
  – Make the world better (smaller)
  – Task support, effectiveness, efficiency

Digital Libraries --- Objectives
• World Lit.: 24hr / 7day / from desktop
• Integrated “super” information systems: 5S: Table of related areas and their coverage
• Ubiquitous, Higher Quality, Lower Cost
• Education, Knowledge Sharing, Discovery
• Disintermediation -> Collaboration
• Universities Reclaim Property
• Interactive Courseware, Student Works
• Scalable, Sustainable, Usable, Useful

Digital Libraries Shorten the Chain from
(Figure labels: Editor, Reviewer, Publisher, A&I, Consolidator, Library)

DLs Shorten the Chain to
(Figure labels: Author, Teacher, Digital, Reader, Editor, Reviewer, Learner, Library, Librarian)

Introduction – 1991 Workshop
• ACM SIGIR ’91 (Chicago)
• Workshop on Future Directions in IR
• Report planning with
  – Michael McGill
  – Michael Lesk
• How can we accomplish something?
  – Address society’s needs
• What if all undergrads had info. access?
• Funding lobbying leading to: DLI, NSDL

Locating Digital Libraries in Computing and Communications Technology Space
(Figure: axes are Communications (bandwidth, connectivity) and Computing (flops); digital content ranges from less to more. Digital Libraries technology trajectory: intellectual access to globally distributed information.)
Note: we should consider 4 dimensions: computing, communications, content, and community (people)

Challenges, Apps, Projects
• US-Korea Collaboration on DLs Workshop
• Reagan Moore and Ed Fox report
• Chart Headings:
  – Application Domain
  – Related Institutions
  – Examples
  – Technical Challenges
  – Benefit/Impact

Reagan Moore, Ed Fox, June 2002, for NSF
Application Domain | Related Institutions | Examples | Technical Challenges | Benefit / Impact
Publishing | Publishers, Eprint archives | OAI | Quality control, openness | Aggregation, organization
Education | Schools, colleges, universities | NSDL, NCSTRL | Knowledge management, reusability | Access to data
Art, Culture | Museum | AMICO, PRDLA | Digitization, describing, cataloging | Global understanding
Science | Government, Academia, Commerce | NVO, PDG, SwissProt, UK eScience, European Union Commission | Data models | Reproducibility, faster reuse, faster advance
(e) Government | Government Agencies (all levels) | Census | Intellectual property rights, privacy, multi-national | Accountability, homeland security
(e) Commerce, (e) Industry | Legal institutions | Court cases, patents | Developing standards | Standardization, economic development
History, Heritage | Foundations | American Memory | Content, context, interpretation | Long term view, perspective, documentation, recording, facilitating, interpretation, understanding
Crosscutting | Library, Archive | Web, personal collections | Multi-language, preservation, scalability, interoperability, dynamic behavior, workflow, sustainability, ontologies, distributed data, infrastructure | Reduced cost, increased access, preservation, democratization, leveling, peace, competitiveness

Introduction – Alliteration
• 5S
  – Societies
  – Scenarios
  – Spaces
  – Structures
  – Streams
• 3C
  – Content
  – Context
  – Criticism, commentary

Introduction – Alliteration
• 5S
  – Societies
    • Users
    • Collaboration, Web 2.0
  – Scenarios
    • Workflow, Stories
    • Services, Components
  – Spaces: GIS
  – Structures: DBMS
  – Streams: DSMS
• 3C
  – Content
    • Content Management Systems
  – Context
    • Link Structure
    • NLP
    • Mental models
  – Criticism, commentary
    • Annotation, Talmud
    • Cataloging, indexing
    • Abstracting
    • Summarizing
    • Secondary literature

Introduction – Time to:
• Treat DL as a serious field
• Achieve balance
  – Research & Development
  – Systems & Services
  – Practice, Continuous Quality Improvement
  – Use, Benefit
• Train digital librarians
• Achieve sustainability

Introduction - Approach
1. Proposals
2. Projects
3. Systems
4. Theory
5. Curricula

1. Vision
2. Objectives
3. Generality
4. Abstraction, conceptualization
5. Education
  – Structure
  – Pedagogy

Introduction - Proposals
• Early visions
• Providing rationale for funding, programs
• USA
• Europe
• India, China, New Zealand, Australia, …
• Sustainability, follow-on
• Technology transfer
  – Stanford DLI-1 -> Google

Introduction - Projects
• Body of information
• Media type (maps, video, speech, photos)
• Representation (DC, METS, FRBR)
• Architecture (SOA)
• Interoperability (OAI)
• Archiving and Preservation (UVC)
• Devices (SenseCam, PIM)
• Links with other fields

Introduction – Projects -2
• Body of information
  – Person’s works (Cervantes)
  – Content by organization
    • Library (Library of Congress)
    • Publisher (ACM)
    • Million books project
    • Google consortium
  – Content by discipline (Physics, CS, Archaeology)
  – Content by genre (ETDs)
  – Content by target audience (TEL, Learners)

NSDL Information Architecture
Essentially as developed by the Technical Infrastructure Workgroup
(Figure labels: User Interfaces: Portals & Clients. Core NSDL “Bus”. Collection Building: NSDL Collections, referenced items & collections, Special Databases. Core Collection-Building Services: metadata gathering, protocols, harvesting. Usage Enhancement: NSDL Services, Other NSDL Services. Core CI Services: information retrieval, browsing, authentication, personalization, discussion, annotation.)

Digital Library Content
• Content Types: Text Documents; Video; Audio; Geographic Information; Software, Programs; Bio Information; Images and Graphics
• Examples: Articles, Reports, Books; Speech, Music; (Aerial) Photos; Models, Simulations; Genome (Human, animal, plant); 2D, 3D, VR, CAT

Introduction – Projects - 5
• Links with other fields
  – Art, sculpture, music, speech
  – Medicine: images, datasets, genomics
  – Law, government
    • Statutes, regulations
    • Citations, commentaries
  – Supercomputers, Grid
  – HCI, Cognitive Psychology
  – IR, HT, MM

CC2001 Information Management Areas
IM1. Information models and systems*
IM2. Database systems*
IM3. Data modeling*
IM4. Relational DBs
IM5. Database query languages
IM6. Relational DB design
IM7. Transaction processing
IM8. Distributed DBs
IM9. Physical DB design
IM10. Data mining
IM11. Information storage and retrieval
IM12. Hypertext and hypermedia
IM13. Multimedia information & systems
IM14. Digital libraries
* Core components

Introduction - Systems
• IBM DL -> content management system
• MARIAN, ODL, WS-ODL
• Greenstone
• DSpace
• Fedora
• DELOS
  – DLMS
  – ISIS & OSIRIS

Introduction - Theory
• Definitions: Key ideas, concepts
• Taxonomy: Groups, clusters
• Abstraction/generalization: Components
• Models, metamodels
• Proofs: relationships, improvements
• Uses, benefits
  – Interoperability (map, wrap, mediate, harvest)
    • User interface: Explore: browse/search/visualize
  – Automation (lex/yacc -> 5SGraph, 5SGen)

Introduction - Curricula
• Audience
  – LIKES, LIS, CS
  – Developer, implementer, systems librarian
  – D. Librarian (reference, coll. development)
• Core
• Tracks
  – Libraries: public, school/univ., corporation
  – Cultural heritage
  – Science (research, education)
  – Persons (PIM)

Living In the KnowlEdge Society (LIKES): Core surrounded by enabling computing concepts and problem providing disciplines
(Figure labels: Economics, Math, Political Science, Architecture, Marketing, Biology, Algorithms, HCI, Sociology, Visualization, Geography, Database, Social & Ethical, Chemistry, Knowledge Society, Intelligent Systems, Finance, Systems Analysis & Design, Physics, Art, Simulation, Programming, Music, Knowledge Management, Architecture, History, Psychology, Net-Centricity, Healthcare, Engineering, Modeling, Communications, Library & Information Science, English)

DL Curricula
• “Curriculum Development for Digital Libraries”
  – NSF grant to VT, UNC-CH
• Studied body of literature
• Modules: core, related
• Invite collaboration worldwide

Digital Librarian: Needed Skills and Knowledge
• Choi, Y., & Rasmussen, E. (2006). What is needed to educate future digital librarians: A study of current practice and staffing patterns in academic and research libraries. D-Lib Magazine, 12(9). doi:10.1045/september2006-choi
D.Librarian Skills & Knowledge: Technology Related
• DL architecture and software
• Technical and quality standards
• Web markup languages
• Database development and DBMS
• Web design skills

D.Librarian Skills & Knowledge: Library Related
• The needs of users
• Digital archiving and preservation
• Cataloging, metadata
• Indexing
• Collection development

D.Librarian Skills & Knowledge: Other
• Communication and interpersonal skills
• Project management and leadership skills
• Legal issues
• Grant/proposal writing skills
• Teaching and group presentation skills

Development & Evaluation Process
(Figure: process diagram with a Feedback loop; the stages and their items are listed below in the order they appear on the slide.)
• Vision/plan
  – From research team (VT & UNC)
  – From current courses at VT & UNC
  – From Advisory Board
  – From CC 2001
• Analyze
  – Specific strengths
  – Specific weaknesses
  – CC 2001 context
  – Curricular needs
  – Student background
• Products
  – Modules ready for use
  – Lessons ready for use
• Evaluate
  – Inspection by Advisory Board
  – Inspection by external experts
  – Inspection by Doctoral Consortium participants
• Design
  – Modules
  – Lessons
• Evaluate in the field
  – Teacher perceptions
  – Student perceptions
  – Student outcomes
• Revise & Implement
  – At UNC & VT
  – At additional universities (in CS & LIS programs)

Figure 1. Curriculum framework
(Figure: columns are COURSE STRUCTURE, CORE DL TOPICS, and RELATED TOPICS. Semester 1: DL collections: development/creation. Semester 2: DL services and sustainability. The modules shown are listed below in numeric order.)
Module 1: Digitization, Storage, Interchange
Module 2: Digital objects, Composites, Packages
Module 3: Metadata, Cataloging, Author submission
Module 4: Naming, Repositories, Archives
Module 5: Spaces (conceptual, geographic, 2/3D, VR)
Module 6: Architectures (agents, buses, wrappers/mediators), Interoperability
Module 7: Services (searching, linking, browsing, etc.)
Module 8: Intellectual property rights management, Privacy, Protection (watermarking)
Module 9: Archiving and preservation, Integrity
Module 10: Multimedia streams/structures, Capture/representation, Compression/coding
Module 11: Content-based analysis, Multimedia indexing and retrieval
Module 12: Multimedia presentation and rendering
Module 13: Documents, E-publishing, Markup
Module 14: Info. needs, Relevance, Evaluation, Effectiveness
Module 15: Thesauri, Ontologies, Classification, Categorization
Module 16: Bibliographic information, Bibliometrics, Citations
Module 17: Routing, Filtering, Community filtering
Module 18: Search & search strategy, Info seeking behavior, User modeling, Feedback
Module 19: Information summarization, Visualization

Modules
1. Collection Development
2. Digital objects / Composites / Packages
3. Metadata, Cataloging, Author submission
4. Architecture, Interoperability
5. Data visualization
6. Services
7. Intellectual property rights management, Privacy, Protection
8. Social issues / Future of DLs
9. Archiving and Preservation

Conference papers x modules
(Figure: stacked bar chart. x-axis: Module ID (1 to 9); y-axis: Number of conference papers (0 to 200); series: ACM DL 96, ACM DL 97, ACM DL 98, ACM DL 99, ACM DL 00, JCDL 01, JCDL 02, JCDL 03, JCDL 04, JCDL 05.)

Taxonomy of DL Educational Resources

CORE TOPICS
1 Overview
2 Collection Development
3 Digital Objects
4 Info/Knowledge Organization
5 Architecture (agents, mediators)
6 User Behavior/Interactions
7 Services
8 Archiving and Preservation Integrity
9 Management and Evaluation
10 DL education and research

1 Overview
1-a (10-c): Conceptual frameworks, theories

2 Collection Development
2-a: Collection development/selection policies
2-b: Digitization
2-c: Harvesting
2-d: Document and e-publishing/presentation markup

3 Digital Objects
3-a: Text resources
3-b: Multimedia
3-c (8-b): File formats, transformation, migration

4 Info/Knowledge Organization
4-a: Metadata, cataloging, metadata markup, metadata harvesting
4-b: Ontologies, classification, categorization
4-c: Vocabulary control, thesauri, terminologies
4-d: Subject description
4-e: Information architecture (e.g., hypertext, hypermedia)
4-f: Object description and organization for a specific domain

5 Architecture (agents, mediators)
5-a: Architecture overviews/models
5-b: Applications
5-c: Identifiers, handles, DOI, PURL
5-d: Protocols
5-e: Interoperability
5-f: Security

6 User Behavior/Interactions
6-a: Info needs, relevance, evaluation
6-b: Search strategy, info seeking behavior, user modeling
6-c: Sharing, networking, interchange (e.g., social)
6-d: Interaction design, info summarization and visualization, usability assessment

7 Services
7-a: Search engines, IR, indexing methods
7-b: Reference services
7-c: Recommender systems
7-d: Routing, community filtering
7-e: Web publishing (e.g., wiki, rss, Moodle, etc.)

8 Archiving and Preservation Integrity
8-a: Repositories, archives, storage
8-b (3-c): File formats, transformation, migration
8-c: Sustainability

9 Management and Evaluation
9-a: Project management
9-b: DL case studies
9-c: DL evaluation
9-d: Usability assessment, user studies
9-e: Bibliometrics, Webometrics
9-f: Legal issues (e.g., copyright)
9-g: Cost/economic issues
9-h: Social issues

10 DL education and research
10-a: Future of DLs
10-b: Education for digital librarians
10-c (1-a): Conceptual framework, theories
10-d: DL research initiatives

Personalizing A Course Website Using the NSDL
William Cameron (2), Boots Cassel (2), Edward Fox (1), Manuel Perez-Quinones (1), Manas Tungare (1), Xiaoyan Yu (1)
(1) Virginia Tech, (2) Villanova

Syllabus Collection … Towards an intelligent educational system
(Figure labels: Services: Publisher, Recommender, Searcher, Editor. Crawler, Potential Syllabus Text, Syllabus Classifier, Unstructured Syllabus Text, Extractor, Structured Syllabus Text, Syllabus Ontology, Classification Scheme, Resource Classifier, Other NSDL Resources.)

Syllabus Ontology
• Standard, machine understandable
• Ontology Editor: Protégé
• Syllabus Schema: SylVia
• http://doc.cs.vt.edu/ontologies/

Creating new syllabus
• Web-based application to support entry of syllabi into collection
• Moodle Plug-in in the works
• Uses CC 2001 to select topics for a course

Example: CBIR + SI
• Integration of
  – CBIR
  – Superimposed information (annotations …)
• Application to
  – Biodiversity, fisheries and wildlife
  – Archaeology
• Systems
  – CBISC, SIMPEL, SIERRA

EKEY: The electronic key for identifying freshwater fishes
Biodiversity Information Systems
• Retrieve descriptions of all fish whose shape is similar to that shown in the figure below, which belong to genus “Notropis”, which have “large eyes” and “dorsal stripe”, and which have been observed within the catchments of the “Tennessee” river

Here is another scenario …
• An archeologist wants to write commentaries on artifacts discovered in the field
• Using an Archeology digital library in his study, he wants to be able to:
  – Manually annotate images (and parts)
  – Search for images (and parts), and annotations
  – Automatically annotate/tag similar images (and parts)
  – Share annotations and images
Source: http://www.bewegende-plaatjes.net
Sources: http://www.dorsetforyou.com, http://www.archaeology.org

Functionality required
• Digital Library (DL) users need, but get little, assistance with these tasks:
  – Selecting and annotating images and parts of images
    • Preserve original context of information
    • Manual and automated annotation
  – Content-based image retrieval of images and parts of images
  – Combined text- and content-based image retrieval of images and parts of images
  – Sharing selections and annotations

Layers in an SI system
(Figure: a Superimposed Layer holds marks into a Base Layer made up of Information Source 1, Information Source 2, …, Information Source n.)
* Source: ICDE04 presentation by Murthy et al.

Superimposed Applications
• Enhanced CMapTools
• SIMPEL: A SuperImposed Multimedia Presentation Editor and pLayer

Content-Based Image Retrieval (CBIR)
• Retrieve images similar to a user-defined specification or pattern (e.g., shape sketch, image example)
• Goal: To support image retrieval based on content properties (e.g., shape, color or texture), usually encoded into feature vectors

Effective Image Descriptor
(Figure label: Feature Vector)

Image descriptors
• Image Descriptor Example: Histogram
• Frequency count of each individual color
• Most commonly used color feature representation
(Figure: an image and its corresponding histogram)
Source: Andrade, D.
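The histogram descriptor above can be sketched in a few lines. This is a minimal illustration, not the descriptor used in CBISC or SIERRA: the function names, the quantization into `levels` buckets per channel, and the choice of histogram intersection as the similarity measure are all assumptions made for the example, and the pixel data is made up.

```python
from collections import Counter


def color_histogram(pixels, levels=4):
    """Normalized color histogram of a list of (r, g, b) pixels.

    Each 0-255 channel is quantized into `levels` buckets (assumed to
    divide 256 evenly), so the feature vector has at most levels**3 bins.
    """
    if not pixels:
        raise ValueError("an image needs at least one pixel")
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    return {color_bin: n / len(pixels) for color_bin, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of the bin-wise minima of two histograms."""
    return sum(min(share, h2.get(color_bin, 0.0)) for color_bin, share in h1.items())


def rank_by_similarity(query_pixels, database):
    """Rank database images by descending similarity to the query image."""
    query = color_histogram(query_pixels)
    scored = [
        (name, histogram_intersection(query, color_histogram(pixels)))
        for name, pixels in database.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 4-pixel "images" (hypothetical data, not from the slides).
query_image = [(250, 10, 10)] * 3 + [(10, 10, 250)]
database = {
    "blue_image": [(0, 0, 255)] * 4,
    "mostly_red": [(240, 20, 5)] * 2 + [(5, 5, 240)] * 2,
}
ranking = rank_by_similarity(query_image, database)
print(ranking)
```

With this toy data the query is three-quarters red, so `mostly_red` (similarity 0.75) ranks above `blue_image` (similarity 0.25). A real system would extract the feature vectors once at insertion time rather than at query time.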
Texture Descriptors

A typical CBIR system
(Figure labels: Interface: Data Insertion, Query Specification, Visualization. Query Pattern, Feature Vector Extraction. Query-processing Module: Ranking, Similarity Computation. Image Database: Feature Vectors, Images. Similar Images.)

CBISC Architecture

CBISC in ETANA

SIERRA
• A tool that allows users to select parts of images and associate them with text annotations.
• Performs information retrieval over annotations and associated marks in two ways, either for:
  – images or marks similar (in content) to a specified image or mark
  – annotations containing specified query terms

Annotating an image

Searching over annotations

Searching over images/sub-images

Theory

Informal 5S & DL Definitions
DLs are complex systems that
• help satisfy info needs of users (societies)
• provide info services (scenarios)
• organize info in usable ways (structures)
• present info in usable ways (spaces)
• communicate info with users (streams)

5Ss
Ss | Examples | Objectives
Streams | Text; video; audio; image | Describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data
Structures | Collection; catalog; hypertext; document; metadata | Specifies organizational aspects of the DL content
Spaces | Measure; measurable, topological, vector, probabilistic | Defines logical and presentational views of several DL components
Scenarios | Searching, browsing, recommending | Details the behavior of DL services
Societies | Service managers, learners, teachers, etc. | Defines managers, responsible for running DL services; actors, that use those services; and relationships among them

5S and DL formal definitions and compositions (April 2004 TOIS)
(Figure: concept map in which each concept carries its definition number. Basic concepts: relation, sequence, tuple, function, graph, language, grammar, state, event, and measurable, measure, probability, vector, and topological spaces. 5S constructs: streams, structures, spaces, scenarios, societies. DL constructs: services, transmission, structured stream, digital object, structural metadata specification, descriptive metadata specification, metadata catalog, collection, repository, indexing service, searching service, browsing service, hypertext, and the minimal digital library.)

5SL – The Minimal DL Metamodel
(Figure labels: Scenarios (Meta-)Model, Societal (Meta-)Model, Structural (Meta-)Model, Spatial (Meta-)Model, Stream (Meta-)Model; Meta-Models, Primitives. Actor, Community, User, Service, Scenario, Event; Service Manager, Interface Manager, Index Manager, Search Manager, Repository Manager, Browsing Manager; Collection, Index, Catalog, Interface, Document, Metadata, Retrieval Model; Text, Video, Audio, Image. Relations: uses, runs, receiver.)

(Figure: concept map of the 5S constructs. Streams: image, text, audio, video. Structures: metadata specifications, Collection, Catalog, digital object, Index, Repository. Spaces: Measurable, Measure, Topological, Vector, Metric, Probabilistic. Scenarios: Service, Scenario, event. Societies: Service Manager, Actor. Relations: contains, describes, is_version_of/cites/links_to, stores, is_a, employs, produces, inherits_from/includes, runs, extends, reuses, precedes, happens_before, uses, participates_in, recipient, association, operation, executes, redefines, invokes.)

Infrastructure Services
• Repository-Building
  – Creational: Acquiring, Cataloging, Crawling (focused), Describing, Digitizing, Federating, Harvesting, Purchasing, Submitting
  – Preservational: Conserving, Converting, Copying/Replicating, Emulating, Renewing, Translating (format)
• Add Value: Annotating, Classifying, Clustering, Evaluating, Extracting, Indexing, Measuring, Publicizing, Rating, Reviewing (peer), Surveying, Translating (language)
Information Satisfaction Services
• Browsing, Collaborating, Customizing, Filtering, Providing access, Recommending, Requesting, Searching, Visualizing

Ontology: Applications

(Figure: composition of Infrastructure Services (Add_Value): Rating, Indexing, Training, with Information Satisfaction Services: Browsing, Requesting, Recommending, Searching, Filtering, Binding, Visualizing, Expanding query. Edges are labeled p and e. Data labels: {digital object}, {(digital object, actor, rate)}, Index, Collection, Society, actor, handle, anchor, classifier, user model, query/category, query, query’, binder, transformer, space. Legend: fundamental, composite.)

(Figure labels: Formal Theory/Metamodel: 5S. Phases: Requirements, Analysis, Design, Implementation, Test. Tools and artifacts: 5SGraph, 5SL, 5SLGen, OO Classes, Workflow, Components, DL, XML Log, DL Evaluation.)

A Minimal DL in the 5S Framework
(Figure: columns are Streams, Structures, Spaces, Scenarios, Societies. Concepts: Structured Stream, Structural Metadata Specification, Descriptive Metadata Specification, services (indexing, browsing, searching), hypertext, Digital Object, Collection, Metadata Catalog, Repository, Minimal DL.)

A Minimal ArchDL in the 5S Framework
(Figure: columns are Streams, Structures, Spaces, Scenarios, Societies. Concepts: Structured Stream, Descriptive Metadata specification, SpaTemOrg, StraDia, Arch Descriptive Metadata specification, ArchObj, services (indexing, browsing, searching), hypertext, ArchDO, Arch Metadata catalog, ArchColl, ArchDColl, ArchDR, Minimal ArchDL.)

Tools/Applications
(Figure labels: 5S Meta Model, DL Expert, 5SGraph, DL Designer, Practitioner, 5SL DL Model, Teacher, Researcher, component pool (ODLSearch, ODLBrowse, ODLRate, ODLReview, …), 5SLGen, Tailored DL, Logging Module, XML Log.)

5SGen – Version 2: ODL, Services, Scenarios
(Figure: generation pipeline; the numbered elements are listed below. Other labels: DL Designer, Component Pool (ODL Search, ODL Browse, …), Wrapping, import, superclass, binds, 5SLGen, Java.)
(1) 5SL-Societies Model
(2) XPATH/JDOM Transform
(3) XMI:Class Model
(4) Xmi2Java
(5) Java Classes Model
(6) 5SL-Scenario Model
(7) XPath/JDOM Transform
(8) StateChart Model
(9) Scenario Synthesis
(10) Deterministic FSM
(11) SMC
(12) Java Finite State Machine Class Controller
(13) JSP User Interface View

Generated DL Services

5SGraph
(Figure labels: Workspace (instance model), Structured toolbox (metamodel))

Information model

Formal Definition of DL Integration
• DL_i = (R_i, DM_i, Serv_i, Soc_i), 1 ≤ i ≤ n
  – R_i is a network accessible repository
  – DM_i is a set of metadata catalogs for all collections
  – Serv_i is a set of services
  – Soc_i is a society
• UnionRep
• UnionCat
• UnionServices
• UnionSociety
• Given n individual libraries, integrate the n DLs to create a UnionDL.

Taxonomy of Union Services
(Figure: both service categories below are subdivided into Essential and Add_Value columns on the slide.)
• Infrastructure Services: indexing, harvesting, mapping, (Schema registry with analyses & mapping), (data) cleaning, (focused) crawling, copying (replicating), logging, (format) translating, (Service to support annotation), (Metadata validation)
• Information Satisfaction Services: searching, access control, browsing, binding, comparison, (forum) discussion, (query) expansion, filtering, recommendation, visualization
Note: Suggested NSDL services are shown in blue.
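The formal definition of DL integration can be sketched as a small data structure. This is a deliberately naive illustration: the slides do not say how UnionRep, UnionCat, UnionServices, and UnionSociety are constructed, so plain set union is assumed here, and the class name, field names, and record identifiers are hypothetical. Real integration would also need the mapping, cleaning, and harvesting services from the taxonomy of union services.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class DL:
    """One digital library DL_i = (R_i, DM_i, Serv_i, Soc_i)."""
    repository: frozenset  # R_i: identifiers of objects in the repository
    catalogs: frozenset    # DM_i: metadata catalogs for the collections
    services: frozenset    # Serv_i: services offered
    society: frozenset     # Soc_i: communities served


def merge(a, b):
    """Combine two DLs component by component (assumption: plain set union)."""
    return DL(
        a.repository | b.repository,  # UnionRep
        a.catalogs | b.catalogs,      # UnionCat
        a.services | b.services,      # UnionServices
        a.society | b.society,        # UnionSociety
    )


def union_dl(libraries):
    """Integrate n individual DLs into a single UnionDL."""
    libraries = list(libraries)
    if not libraries:
        raise ValueError("need at least one DL to integrate")
    return reduce(merge, libraries)


# Hypothetical contents for two archaeological DLs.
vn = DL(frozenset({"vn:obj1", "vn:obj2"}), frozenset({"VN Catalog"}),
        frozenset({"searching", "browsing"}), frozenset({"archaeologists"}))
hd = DL(frozenset({"hd:obj1"}), frozenset({"HD Catalog"}),
        frozenset({"searching"}), frozenset({"archaeologists", "students"}))
union = union_dl([vn, hd])
print(sorted(union.services))
```

Set union keeps the sketch short, but it silently treats records from different sites as unrelated and metadata formats as compatible, which is exactly the gap that schema mapping and wrappers are meant to close.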
Union Catalog Integration
(Figure labels: Virtual Nimrin (VN): VN Catalog, VN Metadata Format, Mapping Tool, Wrapper. Halif DigMaster (HD): HD Catalog, HD Metadata Format, Mapping Tool, Wrapper. Global Metadata Format, Union Catalog, Union ArchDL.)

(Figure labels: local schema, global schema)

5SQual Tool
Implementing a Tool Aimed at Automatic Quality Assessment in Digital Libraries
Bárbara Lagoeiro Moreira

Quality Base Model
(Numeric indicators, by DL concept)
• Digital Object: Accessibility, Pertinence, Preservability, Relevance, Similarity, Significance, Timeliness
• Metadata: Accuracy, Completeness, Conformance
• Collection: Completeness, Impact Factor
• Catalog: Completeness, Consistency
• Repository: Completeness, Consistency
• Services: Composability, Efficiency, Effectiveness, Extensibility, Reusability, Reliability

DL Success Model
(Figure labels: information quality (IQ), system quality (SQ), performance expectancy (PE), social influence (SI), satisfaction, behavioral intention to (re)use. Other labels: relevance, adequacy, timeliness, reliability, understandability, scope; user interface, ease of use, accessibility, joy of use, reliability.)

Systems

DL Manifesto - 1
• DL Reference Model
• In support of the future European Digital Library
• Developed by team connected with DELOS (Candela, Castelli, Ioannidis, Koutrika, Meghini, Pagano, Ross, Schek, Schuldt)
• Draft 2.2 presented in Frascati, near Rome, June 2006
  – 79 pages
• Could be integrated with work of DLF, JISC, etc.

DL Manifesto – 2: 3 Tiers

DL Manifesto – 3: Main Concepts

DL Manifesto – 4: Actor Roles

SIMILE
Objectives, Current Status, and Demonstration
Stephen J. Garland, MIT CSAIL
Mick Bass, HP Labs
DSpace User Group Meeting
Cambridge, MA
March 11, 2004

Simile Goals
• Make the Semantic Web a reality
  – For libraries and their users
  – Support heterogeneous, multi-community metadata
  – Provide tools for viewing, browsing, searching
• Assess current state of Semantic Web
  – Explore utility of standards (RDF, RDFS, OWL)
  – Extend Semantic Web tool stack for libraries
  – Identify issues, gaps, opportunities, best practices for digital libraries

What is Fedora™?
Flexible Extensible Digital Object Repository Architecture
• Slides courtesy Vinod Chachra of VTLS

Fedora™ Repository
(Figure labels: clients: Client Application, Batch Program, Web Browser, Server Application, each connecting over HTTP or SOAP. Web Service Exposure Layer: Manage, Access, Search, OAI Provider. Session Management, User Authentication. Management Subsystem: Object Mgmt, Component Mgmt, Object Validation, PID Generation. Security Subsystem: Policy Mgmt, Policy Enforcement, Users/Groups, Policies. Access Subsystem: Object Reflection, Object Dissemination. External Content Retriever (HTTP, FTP) reaching External Content Sources. Local Service, Remote Service. Storage Subsystem: Digital Objects, XML Files, Datastreams, Content, Relational DB.)
Adapted from Slide by V. Chachra, VTLS

VITAL / Fedora Relationship

OCKHAM Library Network
(Figure labels: NSDL, NSDL Services, OCKHAM Library Network, OCKHAM Services, Library Services, Teachers, Learners, Librarians)

OCKHAM
• Simplicity (a la Occam’s razor)
• Support by Mellon and DLF
• Four main ideas:
  1. Components
  2. Lightweight protocols
  3. Open reference models (e.g., 5S, OAIS)
  4. Community perspective and involvement
• Funded by NSF in NSDL, with P2P

Summary
• Acknowledgments
• Introduction
• Proposals
• Projects
• Systems
• Theory
• Curricula
• Examples
• Summary
• Discussion

Questions? Comments?
See http://fox.cs.vt.edu/talks/