DL.Org (Digital Library Interoperability, Best Practices and Modeling Foundations)
Functionality Working Group Mtg, 29-30 June 2009, Athens
“Functionality modeling and functionality interoperability, Session 1”

Functionality and Interoperability with 5S
by Edward A. Fox
• fox@vt.edu, http://fox.cs.vt.edu
• Dept. of Computer Science, Virginia Tech
• Blacksburg, VA 24061 USA

Acknowledgements
• Mentors (Licklider, Kessler, Salton)
• Virginia Tech, CS, Digital Library Research Laboratory
• NSF and other sponsors, e.g., grants
  – DUE-0840719, CCF-0722259, IIS-0535057, IIS-0325579
• Students, colleagues, co-investigators
• Robert France, Marcos André Gonçalves, Doug Gorton, Yi Ma, Uma Murthy, Rao Shen, Hussein Suleman, Ricardo da Silva Torres, ...
• Barbara Wildemuth, Jeffrey Pomerantz, Sanghee Oh, Seungwon Yang

Theses and Dissertations
• Douglas Gorton, "Practical Digital Library Generation into DSpace with the 5S Framework", April 2007, MS thesis, http://scholar.lib.vt.edu/theses/available/etd04252007-161736/
• Rao Shen, "Applying the 5S Framework To Integrating Digital Libraries", April 2006, PhD dissertation, http://scholar.lib.vt.edu/theses/available/etd-04212006-135018/
• Ananth Raghavan, "Schema Mapper: A Visualization Tool for Incremental Semiautomatic Mapping-based Integration of Heterogeneous Collections into Archaeological Digital Libraries: The ETANA-DL Case Study", May 2005, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-05182005-114155/
• Marcos Andre Goncalves, "Streams, Structures, Spaces, Scenarios, and Societies (5S): A Formal Digital Library Framework and Its Applications", Nov. 2004, PhD dissertation, http://scholar.lib.vt.edu/theses/available/etd-12052004-135923/
• Rohit Dilip Kelapure, "Scenario-Based Generation of Digital Library Services", June 2003, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-06182003-055012/
• Hussein Suleman, "Open Digital Libraries", Nov.
2002, PhD dissertation, http://scholar.lib.vt.edu/theses/available/etd-11222002-155624/
• Qinwei Zhu, "5SGraph: A Modeling Tool for Digital Libraries", Nov. 2002, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-11272002-210531/
• Jun Wang, "VIDI: A Lightweight Protocol Between Visualization Systems and Digital Libraries", May 2002, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-07012002-145841/

Other Selected References
• Marcos Andre Goncalves, Robert K. France, Edward A. Fox. MARIAN: Flexible Interoperability for Federated Digital Libraries. ECDL 2001, 173-186, 2001
• Hussein Suleman and Edward Fox. The Open Archives Initiative: Realizing Simple and Effective Digital Library Interoperability. J. Library Administration, 35(1/2): 125-145, 2002
• Marcos Andre Goncalves, Edward A. Fox. 5SL - A Language for Declarative Specification and Generation of Digital Libraries. JCDL 2002, 263-272
• Marcos Andre Goncalves, Ming Luo, Rao Shen, Mir Farooq Ali, Edward A. Fox. An XML Log Standard and Tool for Digital Library Logging Analysis. ECDL 2002, 129-143
• Marcos Andre Goncalves, Ganesh Panchanathan, Unnikrishnan Ravindranathan, Aaron Krowne, Edward A. Fox, Filip Jagodzinski, Lillian Cassel. The XML Log Standard for Digital Libraries: Analysis, Evolution, and Deployment. JCDL 2003, 312-314
• Hussein Suleman, Edward A. Fox, Rohit Kelapure, Aaron Krowne, Ming Luo. Building digital libraries from simple building blocks. Online Information Review, 27(5): 301-310, 2003
• M. Goncalves, E. Fox, L. Watson, N. Kipp. Streams, Structures, Spaces, Scenarios, Societies (5S): A Formal Model for Digital Libraries. TOIS, 22(2): 270-312, 2004
• Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Ricardo da S. Torres, E. A. Fox. Exploring Digital Libraries: Integrating Browsing, Searching, and Visualization. JCDL 2006, 1-10
• Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Edward A. Fox. What is a Successful Digital Library?
ECDL 2006, 208-219

Other Selected References - 2
• Jeffrey Pomerantz, Sanghee Oh, Seungwon Yang, Edward A. Fox, Barbara M. Wildemuth. The Core: Digital Library Education in Library and Information Science Programs. D-Lib Magazine, 12(11), Nov. 2006
• Marcos Andre Goncalves, Barbara L. Moreira, Edward A. Fox, Layne T. Watson. "What is a good digital library?" - A quality model for digital libraries. Information Processing and Management, 43(5): 1416-1437, 2007
• Uma Murthy, Douglas Gorton, Ricardo Torres, Marcos Goncalves, Edward Fox, Lois Delcambre. Extending the 5S Digital Library (DL) Framework: From a Minimal DL towards a DL Reference Model. JCDL 2007 Workshop on Digital Library Foundations
• Barbara L. Moreira, Marcos A. Goncalves, Alberto H. F. Laender, Edward A. Fox. Evaluating Digital Libraries with 5SQual. ECDL 2007, 466-470
• Yi Ma, Edward A. Fox, Marcos A. Goncalves. Personal Digital Library: PIM upon 5S Framework. CIKM 2007 Workshop: PIKM07, Lisbon, Nov. 2007, 117-124
• Marcos Andre Goncalves, Edward A. Fox, Layne T. Watson. Towards a Digital Library Theory: A Formal Digital Library Ontology. Int. J. Digital Libraries, 8(2): 91-114, 2008
• Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Edward A. Fox. Integration of Complex Archaeology Digital Libraries: An ETANA-DL Experience. Information Systems, 33(7-8): 699-723, 2008
• Barbara L. Moreira, Marcos Andre Goncalves, Alberto H. F. Laender, Edward A. Fox. Automatic Evaluation of Digital Libraries with 5SQual. J. Informetrics, 3(2): 102-123, 2009

Outline
• Contextual Background
  – DL Definitions, Scope
  – DL Curricula Efforts
  – Interoperability Approaches
• 5S
• 5S Services Work
• International Repository Infrastructure Workshop (Amsterdam, Mar 16-17, 2009)
• Discussion Topics

DL Definitions
• Issues and Spectra
  – Collection vs. Institution
  – Content vs. System
  – Access vs. Preservation
  – “Free” vs. Quality
  – Managed vs. Comprehensive
  – Centralized vs.
Distributed

Information Life Cycle
Borgman et al.: Workshop Report on Social Aspects of Digital Libraries: http://www-lis.gseis.ucla.edu/DL/

Information Life Cycle
[Figure: life-cycle wheel. Labels: Authoring, Modifying, Using, Creating, Retention / Mining, Organizing, Indexing, Accessing, Filtering, Storing, Retrieving, Distributing, Networking]

Digital Libraries Shorten the Chain from
[Figure labels: Editor, Reviewer, Publisher, A&I, Consolidator, Library]

DLs Shorten the Chain to
[Figure labels: Author, Teacher, Reader, Editor, Reviewer, Learner, Librarian, Digital Library]

DL Curric. Project
• NSF awards to VT and UNC-CH
• CS and LIS
• http://curric.dlib.vt.edu/
• http://curric.dlib.vt.edu/wiki/index.php/Main_Page
• http://curric.dlib.vt.edu/modDev/modDev.html

DL Curriculum Framework
[Figure: grid with columns COURSE STRUCTURE, CORE DL TOPICS, RELATED TOPICS]
• Semester 1: DL collections: development/creation
• Semester 2: DL services and sustainability
• Topics, in the order shown: Digitization; Storage; Interchange; Metadata; Cataloging; Author submission; Digital objects; Composites; Packages; Architectures (agents, buses, wrappers/mediators); Interoperability; Spaces (conceptual, geographic, 2/3D, VR); Documents; E-publishing; Markup; Multimedia streams/structures; Capture/representation; Compression/coding; Bibliographic information; Bibliometrics; Citations; Content-based analysis; Multimedia indexing; Naming; Repositories; Archives; Services (searching, linking, browsing, etc.); Archiving and preservation; Integrity; Architectures (agents, buses, wrappers/mediators); Interoperability; Thesauri; Ontologies; Classification; Categorization; Multimedia presentation, rendering; Info. Needs; Relevance; Evaluation; Effectiveness; Intellectual property rights mgmt.; Privacy; Protection (watermarking); Routing; Filtering; Community filtering; Search & search strategy; Info seeking behavior; User modeling; Feedback; Info summarization; Visualization

DL Curric.
Modules - 1
• Module 1-b: History of digital libraries and library automation
• Module 2-c: File Formats, Transformation, and Migration
• Module 3-b: Digitization
• Module 4-b: Metadata
• Module 5-a: Architecture overviews

DL Curric. Modules - 2
• Module 5-b: Application software
• Module 5-d: Protocols
• Module 6-a: Information needs/relevance
• Module 6-b: Online information seeking behaviors and search strategies
• Module 6-d: Interaction design and usability assessment

DL Curric. Modules - 3
• Module 7-b: Reference Services
• Module 7-g: Personalization
• Module 8-b: Web Archiving
• Module 9-c: Digital library evaluation, user studies

Interoperability Approaches
• Browsers (Mosaic)
• Federation
• Heterogeneous, Homogeneous
• Protocols (OAI-PMH)
• Repositories
• Content Standards (XML), Mapping
• Integration (ETANA)
• Services (Superimposed Information)

Integration: Challenges
• “Semantic Web” is vision, not reality.
• How can we integrate without a theory?
• How can we interoperate without a common framework?
• How can we have a science of DLs if we lack agreement on definitions (so we can reason and discuss) and measures of quality (so we can compare and improve)?
Informal 5S & DL Definitions
DLs are complex systems that
• help satisfy info needs of users (societies)
• provide info services (scenarios)
• organize info in usable ways (structures)
• present info in usable ways (spaces)
• communicate info with users (streams)

5S Layers
[Figure: layers, top to bottom: Societies, Scenarios, Spaces, Structures, Streams]

5Ss
• Streams
  – Examples: Text; video; audio; image
  – Objectives: Describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data
• Structures
  – Examples: Collection; catalog; hypertext; document; metadata
  – Objectives: Specifies organizational aspects of the DL content
• Spaces
  – Examples: Measure; measurable, topological, vector, probabilistic
  – Objectives: Defines logical and presentational views of several DL components
• Scenarios
  – Examples: Searching, browsing, recommending
  – Objectives: Details the behavior of DL services
• Societies
  – Examples: Service managers, learners, teachers, etc.
  – Objectives: Defines managers, responsible for running DL services; actors, that use those services; and relationships among them

5S Overview
• 5S and Generating DLs
  – 5S Framework
  – 5S definitions, services taxonomy, ontology
  – 5SL
  – 5SGraph
  – 5SGen (and DL development)
  – DL development of union DL, DL integration
  – 5SGen into DSpace
• 5S Metamodels
  – Minimal DL
  – Archaeology DL
  – CBIR DL
  – Union DL

Streams
[Figure: Digital Library Content, by content type]
• Content Types: Text Documents; Video; Audio; Geographic Information; Software, Programs; Bio Information; Images and Graphics
• Examples shown: Articles, Reports, Books; Speech, Music; (Aerial) Photos; Models; Simulations; Genome; Human, animal, plant; 2D, 3D, VR, CAT

Structure (Degrees, Terminology)
[Figure: spectrum from Web (chaotic) through DLs (organized) to DBs (structured)]

Digital Objects (DOs)
• Born digital
• Digitized version of “real” object
  – Is the DO version the same, better, or worse?
  – Decision for ETDs: structured + rendered
• Surrogate for “real” object
  – Not covered explicitly in metamodel for a minimal DL
  – Crucial in metamodel for archaeology DL

Databases
• 5S perspective: structures, streams, scenarios
• Extending database technology
• Structured and unstructured info
• Multimedia databases
• Link databases
• Performance, transaction processing
• Replicated storage, rollback/recovery

Spaces
User interfaces and visualization
• 2D interfaces
• 3D interfaces
• GIS
• Other paradigms

Scenarios
• Services (see later)
• Scenario based design, use cases
• Functionality
• Representation and processing for humans and machines

Societies
• User communities
  – Authors, editors, teachers, students, readers
  – Personal(ization), group(ware), community, global
  – Accessibility, universal access
• Librarians: reference, acquisition, operations
• Research community
  – Associations, conferences, publications, labs, projects
• Economics
  – Copyright, intellectual property rights, digital rights management, authorization, authentication, security, privacy, self-archiving (eprints)
  – Publishers, catalogers, distributors, sustainability
  – Open source, commercial, hybrid

Higher DL Constructs
• Collections
• Catalogs
• Repositories and Archives
• Services
• Systems
• Case Studies

Collections
• Terminology: set, “database”
• Distributed: basis, efficiency/effectiveness
• Parallelism: federation, harvesting
• Scale: object size, compression, replication, stream splitting
• Intelligence/processing granularity: object, cluster, collection, repository

NSDL Collections
• Discovery of content
• Classification and cataloguing
• Acquisition and/or linking; referencing
• Disciplinary-based themes define a natural body of content, but other possibilities are also encouraged
• Access to massive real-time or archived datasets
• Software tool suites for analysis, modeling, simulation, or visualization
• Reviewed commentary on learning materials and pedagogy
Catalogs
• OPACs
• Distributed vs. centralized
• Coverage, breadth
• Specificity, depth
• Management: versioning, works

Repositories and Archives
• Naming, identifiers
• Architectures, interoperability
  – OAI: harvesting
  – SRU/SRW: federation
• Preservation, archives
  – LOCKSS, UVC, emulation/migration
• Scalability, storage
• Institutional repositories, Open Access

Services
• NSDL Services
• Taxonomy of services
• Ontology, composition, reuse
• Evaluation
• Key services in-depth:
  – Crawling, indexing
  – Clustering, classifying
  – Recommending, using social networks
  – Logging

NSDL Services
• Help services, frequently asked questions, etc.
• Synchronous/asynchronous collaborative learning environments using shared resources
• Mechanisms for building personal annotated digital information spaces
• Reliability testing for applets or other digital learning objects
• Audio, image, and video search capability
• Metadata system translation
• Community feedback mechanisms

Services (taxonomy)
• Infrastructure Services
  – Repository-Building, Creational: Acquiring, Cataloging, Crawling (focused), Describing, Digitizing, Federating, Harvesting, Purchasing, Submitting
  – Repository-Building, Preservational: Conserving, Converting, Copying/Replicating, Emulating, Renewing, Translating (format)
  – Add Value: Annotating, Classifying, Clustering, Evaluating, Extracting, Indexing, Measuring, Publicizing, Rating, Reviewing (peer), Surveying, Translating (language)
• Information Satisfaction Services: Browsing, Collaborating, Customizing, Filtering, Providing access, Recommending, Requesting, Searching, Visualizing

Services Ontology: Applications

Ontology: Applications
• Expand definition of minimal DL by characterizing
  – typical DL services
  – in the context of “employs” and “produces” relationships
• Use characterization to:
  – Reason about how DL services can be built from other DL components
  – As well as be composed with other services through extension or reuse

[Figure: composition of Infrastructure Services (Add_Value) and Information Satisfaction Services. Services shown: Rating, Indexing, Training, Browsing, Requesting, Recommending, Searching, Filtering, Binding, Visualizing, Expanding query. Data items shown: {(digital object, actor, rate)}, Index, Society actor, handle, anchor, classifier, user model, query/category, {digital object}, Collection, query, binder, transformer, space, query’. Other labels: fundamental, composite. Edges are labeled "p" and "e", presumably the "produces" and "employs" relationships.]

5S and DL formal definitions and compositions (April 2004 TOIS)
[Figure: dependency graph of definitions; labels and definition numbers as they appear]
• Mathematical foundations: relation (d. 1), function (d. 2), sequence (d. 3), tuple (d. 4)*, language (d.5), graph (d. 6), grammar (d. 7); measurable (d.12), measure (d.13), probability (d.14), vector (d.15), topological (d.16) spaces; state (d. 18), event (d.10)
• 5S: streams (d.9), structures (d.10), spaces (d.18), scenarios (d.21), societies (d. 24), services (d.22)
• Higher constructs: transmission (d.23), structural metadata specification (d.25), descriptive metadata specification (d.26), structured stream (d.29), digital object (d.30), collection (d. 31), metadata catalog (d.32), repository (d. 33), indexing service (d.34), searching service (d.35), hypertext (d.36), browsing service (d.37), digital library (minimal) (d. 38)

[Figure: 5S ontology. Labels: Streams; Structures; Spaces; Scenarios; Societies; text, image, audio, video; digital object; metadata specifications; Collection; Catalog; Repository; Index; Measurable, Measure, Topological, Vector, Metric, Probabilistic; Service; Scenario; event; operation; Service Manager; Actor. Relations: contains, describes, stores, is_version_of/cites/links_to, is_a, employs, produces, inherits_from/includes, runs, extends, reuses, precedes, happens_before, uses, participates_in, recipient, association, executes, redefines, invokes.]

XML-based DL Log Standard
• Log analysis
  – is a source of information on:
    • How patrons really use DL services
    • How systems behave while supporting user information seeking activities
• Used to:
  – Evaluate and enhance services
  – Guide allocation of resources
• Common practice in the web setting
  – Supported by web servers, proxy caches
• DL Logging can be more detailed

The XML Log Format
[Figure: element tree. Element names shown: Log, Transaction, SessionId, MachineInfo, Timestamp, Event, StatusInfo, Search, SearchBy, SessionInfo, RegisterInfo, Timestamp, Statement, Action, Browse, QueryString, Statement, Update, Collection, Catalog, StoreSysInfo, Timeout, PresentationInfo]

Systems
• Architectures
  – Client-server, service-oriented
  – P2P, Grid
• System descriptions and comparisons
  – Personal DLs; Institutional to global
  – DSpace, Eprints, Fedora, Greenstone, Kepler
• ODL
• 5S Suite: language, visualization, generation, logging

Architectural Issues
• Independent system vs. part of federation
• Centralized vs. distributed vs. open services
• Monolithic vs. modular vs. componentized
• Topologies: bus vs. star vs. hierarchical vs.
network
• Decompositions vary
  – search engine, browser, DBMS, MM support
  – repository, handle server, client
  – information resources + mediators, bus or agent collection + client with workspace/environment

NSDL Information Architecture
Essentially as developed by the Technical Infrastructure Workgroup
[Figure labels: Portals & Clients; User Interfaces; Core NSDL “Bus”; NSDL Collections (referenced items & collections, Special Databases); Collection Building; Core Collection-Building Services (metadata gathering, protocols, harvesting); Other NSDL Services; Usage Enhancement; Core Services / CI Services: information retrieval, browsing, authentication, personalization, discussion, annotation]

5S Modeling -> Systems
[Figure labels: Domain Concepts (theory); Modeling Language (Meta-Model); DL Architecture Model; Running DL; “real” world object; Actors; “Real” World. Relations: represented by, instance of, interpreted as, used to compose, abstracted from.]

Tools/Applications
[Figure labels: 5S Meta Model; DL Expert; 5SGraph; DL Designer; Practitioner; Teacher; Researcher; 5SL DL Model; component pool (ODLSearch, ODLBrowse, ODLRate, ODLReview, …….); 5SLGen; Tailored DL; Logging Module; XML Log]

[Figure: development process. Labels: Formal Theory/Metamodel; 5S; Requirements; 5SGraph; 5SL; Analysis; DL XML Log; 5SLGen; OO Classes; Workflow; Design; Components; Implementation; DL Evaluation; Test]

5SL: a DL design language
• Domain specific languages
  – Address a particular class of problems by offering specific abstractions and notations for the domain at hand
  – Advantages: domain-specific analysis, program management, visualization, testing, maintenance, modeling, and rapid prototyping.
• XML-based realization of 5S
  – Interoperability
  – Use of many sub-languages (e.g., MIME types, XML Schemas, UML notations)

5SL – The Minimal DL Metamodel
[Figure: five linked (meta-)models. Labels: Stream (Meta-)Model (Text, Video, Audio, Image); Structural (Meta-)Model (Document, Metadata, Collection, Catalog, Repository, Index); Spatial (Meta-)Model (User Interface, Retrieval Model); Scenarios (Meta-)Model (Service, Scenario, Event); Societal (Meta-)Model (Actor, Community, Service Manager, Interface Manager, Index Manager, Search Manager, Repository Manager, Browsing Manager). Relations: uses, runs, receiver. Other labels: Meta-Models, Primitives.]

Example of Document declaration in the Structures Model

    <document name='ETD'>
      <stream_enumeration>
        <stream value='ETDText'/>
        <stream value='ETDAudio'/>
        ...
      </stream_enumeration>
      <structured_stream>
        %XMLSchema%
      </structured_stream>
    </document>

Example of Actors declaration in the Societies Model

    <Society>
      <Actor>
        <Community name='Patron'>
          <Attribute name='name' type='String'/>
          <Attribute name='ID' type='Integer'/>
        </Community>
        <Community name='Student'>
          <Service>Converting</Service>
        </Community>
        <Community name='ETDReviewer'>
          <Service>Reviewing</Service>
        </Community>
        <Community name='ETDCataloguer'>
          <Service>Cataloguing</Service>
        </Community>
      </Actor>
      ………

Example of Service declaration in the Scenario Model

    <SERVICE name='Searching'>
      <SCENARIO name='SimpleSearching'>
        <NOTE>Simple scenario for an NDLTD site searching service</NOTE>
        <EVENT>
          <SENDER>Patron</SENDER>
          <RECEIVER>InterfaceManager</RECEIVER>
          <OPERATION name='SearchCriteria'/>
          <PARAMETER>collection</PARAMETER>
          <PARAMETER>query</PARAMETER>
        </EVENT>
        <EVENT>
          <SENDER>InterfaceManager</SENDER>
          <RECEIVER>SearchManager</RECEIVER>
          <OPERATION name='Search'/>
          <PARAMETER>collection</PARAMETER>
          <PARAMETER>query</PARAMETER>
        </EVENT>
        <EVENT>
          <SENDER>SearchManager</SENDER>
          <RECEIVER>InterfaceManager</RECEIVER>
          <PARAMETER name='Results'>WtdSet</PARAMETER>
        </EVENT>
        ….
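To make the Scenario Model example concrete, here is a minimal sketch of how a tool might read the sender/receiver/operation sequence out of a service declaration. It assumes a well-formed, closed version of the first two events of the 'Searching' service shown on the slide; the closing tags, the `scenario_events` helper, and the tuple layout are illustrative assumptions, and the actual 5SL schema and the 5SLGen implementation may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed rendering of the slide's 'Searching' service.
# Element names are copied from the slide; the closing tags are assumed.
SERVICE_XML = """
<SERVICE name="Searching">
  <SCENARIO name="SimpleSearching">
    <NOTE>Simple scenario for an NDLTD site searching service</NOTE>
    <EVENT>
      <SENDER>Patron</SENDER>
      <RECEIVER>InterfaceManager</RECEIVER>
      <OPERATION name="SearchCriteria"/>
      <PARAMETER>collection</PARAMETER>
      <PARAMETER>query</PARAMETER>
    </EVENT>
    <EVENT>
      <SENDER>InterfaceManager</SENDER>
      <RECEIVER>SearchManager</RECEIVER>
      <OPERATION name="Search"/>
      <PARAMETER>collection</PARAMETER>
      <PARAMETER>query</PARAMETER>
    </EVENT>
  </SCENARIO>
</SERVICE>
"""


def scenario_events(xml_text):
    """Return (scenario, sender, receiver, operation, parameters) tuples
    in document order for every EVENT of every SCENARIO."""
    root = ET.fromstring(xml_text)
    events = []
    for scenario in root.findall("SCENARIO"):
        for event in scenario.findall("EVENT"):
            operation = event.find("OPERATION")
            events.append((
                scenario.get("name"),
                event.findtext("SENDER"),
                event.findtext("RECEIVER"),
                # An event may omit OPERATION, as the slide's third event does.
                operation.get("name") if operation is not None else None,
                [p.text for p in event.findall("PARAMETER")],
            ))
    return events


EVENTS = scenario_events(SERVICE_XML)
for ev in EVENTS:
    print(ev)
```

Reading events in document order matters here because a scenario is a sequence of events between actors and service managers; an ordered list of this kind is the natural input to a state-machine synthesis step.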
5SGraph: A DL Modeling Tool
• Help users model their own instances of a digital library (DL) in the 5S language (5SL).
• A simple modeling process which enables rapid generation of digital libraries
• Features
  – 5SGraph loads and displays a metamodel in a structured toolbox.
  – The structured editor of 5SGraph provides a top-down visual building environment for the DL designer.
  – 5SGraph produces syntactically correct 5SL files according to the visual model built by the designer.

Overview of 5SGraph
[Figure: screenshot with a workspace (instance model) and a structured toolbox (metamodel)]

5SGen
• Version 1 -- MARIAN as the target system
  – Focused on rich structures: semantic networks
  – Behavior attached to nodes/links
• Version 2 -- Shifted for later work to componentized (ODL) approach
  – Focused on scenarios/societies
  – Structures/Spaces encapsulated within components (e.g., relational tables, indexes)
  – Only textual streams supported
• Version 3 – Practical DL (w. DSpace) – Doug Gorton

5SLGen – Version 2: ODL, Services, Scenarios
[Figure: generation pipeline; numbered elements as labeled]
(1) 5SL-Societies Model
(2) XPATH/JDOM Transform
(3) XMI:Class Model
(4) Xmi2Java
(5) Java Classes Model
(6) 5SL-Scenario Model
(7) XPath/JDOM Transform
(8) StateChart Model
(9) Scenario Synthesis
(10) Deterministic FSM
(11) SMC
(12) Finite State Machine Class Controller
(13) JSP User Interface View
Other labels: DL Designer; Component Pool (ODL Search, ODL Browse, ...); Wrapping; import; superclass; binds; Java; 5SLGen

Generated DL Services
[Figure: process stages Requirements (1), Analysis (2), Design (3), Implementation (4). Labels: 5S Meta Model; DL Expert; DL Designer; 5SGraph; Practitioner; Teacher; Researcher; 5SL DL Model; component pool (ODLSearch, ODLBrowse, ODLRate, ODLReview, …….); 5SLGen; Tailored DL Services; 5SSuite: 5SGraph, 5SGen, Mapping Tool]

Describing Quality in Digital Libraries
• What’s a “good” digital Library?
  – Central Concept: Quality!
  – Hypotheses of this work:
    • Formal theory can help to define “what’s a good digital library” by:
    • New formalizations of quality indicators for DLs within our 5S framework
    • Contextualizing these measures within the Information Life Cycle

Quality and the Information Life Cycle
[Figure: life-cycle wheel annotated with quality indicators. Phases: Active, Semi-Active, Inactive. Activities: Creation, Authoring, Modifying, Describing, Organizing, Indexing, Storing, Accessing, Filtering, Archiving, Distribution, Networking, Seeking, Searching, Browsing, Recommending, Utilization, Retention, Mining, Discard. Indicators: Accuracy, Completeness, Conformance, Timeliness, Similarity, Preservability, Pertinence, Significance, Accessibility, Relevance.]

Quality Dimensions
DL Concept: Dimensions of Quality
• Digital object: Accessibility, Pertinence, Preservability, Relevance, Similarity, Significance, Timeliness
• Metadata specification: Accuracy, Completeness, Conformance
• Collection: Completeness, Impact Factor
• Catalog: Completeness, Consistency
• Repository: Completeness, Consistency
• Services: Composability, Efficiency, Effectiveness, Extensibility, Reusability, Reliability

Services: Efficiency / Effectiveness
• Effectiveness
  – Very common measures: Precision, Recall, F1, 10-precision, R-Precision
  – Other services may have different measures: e.g., Recommending, etc.
• Efficiency
  – let $t(e)$ be the time of an event $e$
  – let $e_{ix}$ and $e_{fx}$ be the initial and the final event of service $se_x$
  – For service $se_x$, efficiency is defined as:
    • $\mathrm{Efficiency}(se_x) = t(e_{fx}) - t(e_{ix})$

DL Integration
• What is “DL Integration”
  – Hide distribution
  – Hide heterogeneity
  – Enable autonomy of individual component
• Why Integration
  – island-DLs
  – inability to seamlessly and transparently access knowledge across DLs
Utilize various autonomous DLs in concert

Integration: Urgency, Longevity
• If we collect, capture, acquire, or produce information, will it be usable in 100 years?
• NSF Digital Archiving Program
• Library of Congress National Digital Information Infrastructure and Preservation Program

DL integration formalization based on DL interoperability approach
[Figure: taxonomy of approaches. Labels: Intermediary-based (mediator, wrapper, agent); mapping-based (schema mapping: hybrid mapper, composite mapper trained by GA); two architectures (federation, Union Archiving). Relations: consists of, interrelated with, use, used in.]

Union DL Definitions
• A Minimal Union Digital Library integrated from n DLs is given as a four-tuple: MinUnionDL = (Union Repository, Union Catalog, Minimal Union Services, Union Society).
• DL Integration Problem Definition: Given n individual digital libraries (DL1, DL2, …, DLn), each defined as described above, to integrate the n DLs is to create a Union DL.

Union Catalog Quality Measurement
• Complete
  – All the catalogs to be integrated are complete.
• Consistent
  – All the catalogs to be integrated are consistent.
  – Each descriptive metadata specification in the union catalog describes only one digital object.
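The two union catalog conditions can be checked mechanically. The following is a minimal sketch, assuming a catalog is modeled as a set of (handle, metadata specification id) pairs; that representation, the function names, and the sample handles are all hypothetical. Consistency follows the slide's wording (each specification describes only one digital object), while the reading of completeness as "every digital object in the repository has at least one specification" is an assumption made for illustration.

```python
from collections import defaultdict


def union_catalog(*catalogs):
    """Merge member catalogs; each is an iterable of (handle, spec_id) pairs."""
    merged = set()
    for catalog in catalogs:
        merged.update(catalog)
    return merged


def inconsistent_specs(catalog):
    """Map each spec id that describes more than one digital object
    to the set of handles it describes."""
    handles_by_spec = defaultdict(set)
    for handle, spec_id in catalog:
        handles_by_spec[spec_id].add(handle)
    return {spec: handles
            for spec, handles in handles_by_spec.items()
            if len(handles) > 1}


def undescribed(catalog, repository_handles):
    """Handles in the repository that no metadata specification describes
    (assumed reading of catalog completeness)."""
    return set(repository_handles) - {handle for handle, _ in catalog}


# Hypothetical member catalogs; handles and spec ids are made up.
VN_CATALOG = {("vn:bone-1", "md-1"), ("vn:seed-7", "md-2")}
HD_CATALOG = {("hd:pot-3", "md-1")}  # reuses the id "md-1"

UNION = union_catalog(VN_CATALOG, HD_CATALOG)
CONFLICTS = inconsistent_specs(UNION)
MISSING = undescribed(UNION, {"vn:bone-1", "vn:seed-7", "hd:pot-3", "hd:coin-9"})
print(CONFLICTS)
print(MISSING)
```

In this toy data each member catalog is consistent on its own, yet the union is not, because both members use the same specification id. That suggests consistent member catalogs may not be sufficient without an identifier scheme that is unambiguous across members, for example prefixing ids with the member DL's name.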
Member DLs of ETANA-DL
[Figure: four member DLs (Lahav, Madaba, Megiddo, Umayri, …), each with Society: Archaeologists; Service: Database Searching and Browsing; Catalog; Repository]

Architecture of ETANA-DL, with centralized catalog and partially decentralized repository
• Union Society: Archaeologists, General Public
• Union Services: Harvesting, Mapping, Searching, Browsing, Recommendation, Annotation, Object Comparison, Object Sharing, Binding, Visualization
• Union Catalog
• Union Repository
• Mapping confirmation, Mapping history

Union Catalog Integration
[Figure labels: Virtual Nimrin (VN); VN Metadata Format; VN Catalog; Halif DigMaster (HD); HD Metadata Format; HD Catalog; Mapping Tool; Wrapper; Global Metadata Format; Union Catalog; Union ArchDL]

[Figure: generating ETANA-DL. Labels: ArchDL Expert; 5S Archaeology MetaModel; ArchDL Designer; 5SGraph; Structure Sub-model; Scenario Sub-model; ETANA-DL Metadata Format; VN Metadata Format; HD Metadata Format; VN Catalog; HD Catalog; Mapping Tool; Wrapper4VN; Wrapper4HD; ETANA-DL Union Services Descriptions (Harvesting, Mapping, Searching, Browsing, …); 5SGen; Component Pool (Search Service, Browse Service, Other ETANA-DL Services); XOAI; Inverted Files; Browse DB; Services DB; Union Catalog; Web Interface]

5S definitional structure
[Figure labels: Streams; Structures; Spaces; Scenarios; Societies; Structured Stream; Structural Metadata Specification; Descriptive Metadata Specification; services; indexing; browsing; searching; hypertext; Digital Object; Collection; Metadata Catalog; Repository; Minimal DL]

Minimal archaeological DL in the 5S framework (A.i is from minimal DL, j is new)
[Figure labels, numbering not recoverable from the extraction: Streams; Structures; Spaces; Scenarios; Societies; Structured Stream; Descriptive Metadata specification; services; indexing; browsing; searching; hypertext; SpaTemOrg; StraDia; Arch Descriptive Metadata specification; ArchObj; ArchDO; Arch Metadata catalog; ArchColl; ArchDColl; ArchDR; Minimal ArchDL]

Minimal CBIR DL
[Figure labels: Stream; Image Stream; Space; Feature Vector; Image Descriptor; Composite Descriptor; Structure; Service; Society; KNNQ; RQ; User Info Need; Structured Feature Vector; Image Content Description; Image Object; Visualization Operation; Image Digital Object; Image Descriptor Metadata Catalog; Image Collection; Content-based Image Searching Service]

DL Ref. Model Concepts - 5S (see II.4.2)
• User -> Societies
  – Human and machine actors
  – End-users, Designers, Administrators, Application Developers + Librarians (DL curric)
• Content -> Streams, Structures
• Functionality -> Services -> Scenarios
• Quality -> Services (recall 5SQual)
• Policy -> Scenarios, Societies
• Architecture -> Scenarios, Structures, Spaces (components, protocols, standards, specs)

International Repository Infrastructure Workshop (Amsterdam, Mar 16-17, 2009)
• How can we strengthen the infrastructure for repositories: key solvable problems:
• Citation services - making citation data more easily available from repositories
• Repository handshake – talking to each other, user deposit into several at once
• Interoperable identification infrastructure – unambiguous people, documents (FRBR)

International Repository Infrastructure Workshop – and DL.org
• How are these 2 related?
• Can we learn from the Amsterdam meeting and focus on some important and solvable issues immediately?
Discussion Topics
• Faced in MARIAN, NCSTRL, CITIDEL, Ensemble, NSDL, ETANA
• Already solved: OAI-PMH
• Focus
  – Superimposed information / annotation
  – Citation information
• Approaches
  – 5S: 5SL, 5SGen, 5SQual
  – XML representations
  – Protocols (VIDI)

Summary
• Contextual Background
  – DL Definitions, Scope
  – DL Curricula Efforts
  – Interoperability Approaches
• 5S
• 5S Services Work
• International Repository Infrastructure Workshop (Amsterdam, Mar 16-17, 2009)
• Discussion Topics

Questions? Discussion? Thank You!