European Organization for Nuclear Research
Organisation Européenne pour la Recherche Nucléaire

CDS Invenio
CERN’s open source digital library information system
Dr. Tim Smith, IT User and Document Services
(representing Jens Vigen, CERN Scientific Information Officer)

Outline
• CDS Invenio
  – Scope
  – Digital library services
  – Content
• Sharing
  – Open Source
  – Open Access
• Digital Preservation
  – Multi-media records

Tim Smith – 34th INIS LO meeting

Digital Library Scope
• Institutional Repository for CERN
• Subject Repository for Particle Physics
• Physical collections
  – Library and Book Shop
• Digital Collections
  – Born digital
  – Converted from physical media
• More than 1M records
  – 500k with full text
  – 150k with links to external full text
• 30k distinct visitors per month

“One stop shop”
• Preprints, postprints, theses
• Conference proceedings, lecture objects
  – Slides and recordings
• Experiment support material
  – Photos, videos, animations
• Reusable information
  – Data in tables, figures
  – Correlation matrices
  – Data (high-level objects)
• User survey
  – Access to full text, depth of coverage
  – Search accuracy, quality of content

Digital Library Services
• Collection: Aggregation, Conversion, Stamping, Watermarking
• Curation: Cataloguing, Organisation, Enrichment, Preservation
• Access: Indexing, Ranking, Clustering, Classifying

Digital Age Services
• Collaboration “Web2.0”
  – Comments, reviews, baskets
• Immediacy
  – Email alerts, RSS feeds
• Intensive tasks
  – Keyword & reference extraction
  – Citation analysis
  – Full text indexing & ranking
  – Conversion services: multiple download formats
• Flexible formats
  – Remove constraints of print versions
  – Internationalisation

Collaboration: Web2.0

Connections and Statistics

Key Word Extraction
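The slides name keyword extraction as one of the intensive tasks but give no detail on the method. The following is only a toy sketch of one common approach, matching a text against a controlled vocabulary and counting hits. The vocabulary, the function name `extract_keywords` and the sample text are all hypothetical, and nothing here is claimed to be how CDS Invenio does it.

```python
import re
from collections import Counter

# Hypothetical controlled vocabulary (assumption for illustration only);
# a real service would load a curated thesaurus instead.
VOCABULARY = ["Higgs boson", "supersymmetry", "dark matter", "neutrino"]


def extract_keywords(text, vocabulary=VOCABULARY, top_n=3):
    """Count case-insensitive vocabulary matches and return the top terms."""
    counts = Counter()
    for term in vocabulary:
        # Match the term as a whole phrase, tolerating a simple plural "s".
        pattern = r"\b" + re.escape(term) + r"s?\b"
        hits = re.findall(pattern, text, flags=re.IGNORECASE)
        if hits:
            counts[term] = len(hits)
    return counts.most_common(top_n)


SAMPLE_TEXT = (
    "We search for the Higgs boson. Higgs bosons decay quickly. "
    "Neutrino mixing is unrelated to dark matter."
)

if __name__ == "__main__":
    for term, count in extract_keywords(SAMPLE_TEXT):
        print(term, count)
```

Matching against a fixed vocabulary keeps the extracted keywords consistent across records, at the cost of missing terms the vocabulary does not yet contain.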
Open Source
• CDS Invenio is available under GPL http://cdsware.cern.ch/
  – Free download and usage
  – Instances across the globe: sciences & humanities
    • Administrative documents, librettos, an art collection
• Standards
  – MARC21 metadata format http://www.loc.gov/
  – Multi-lingual; Unicode
  – Compliance with all browsers; web standards
• Flexibility
  – Format support
    • SiteMap for GoogleWeb, similar for GoogleScholar
    • Export as: BibTeX, MARC, MARCXML, DC, EndNote, NLM
    • Subscribe to: RSS (& email alerts)
  – Available in 20 languages (external contributions)

Open Access
• CERN Convention (1953) contains what is effectively an early Open Access manifesto:
  – “… the results of its experimental and theoretical work shall be published or otherwise made generally available”
• Signatory of Berlin Declaration
  – Author grants
    • free, irrevocable, worldwide, perpetual right of access, …
  – Store in repository
    • Unrestricted distribution, interoperability, long-term archiving, …

(More) Open Access
• Support interchange protocols: OAI-PMH
• Open Access ≠ no more access management
  – Copyright acceptance workflows
  – Publication workflows; logins, ACLs
• Changing the publishing model
  – Spiralling subscription costs, falling subscriptions
  – Pay to create, not download: Open Access
  – It is feasible with Particle Physics! O(10M€)
  – SCOAP3: http://www.scoap3.org
    • 3-way partnership: scientists, libraries, publishers

Digitisation for Preservation
• Audio, Photo, Video
  – Open reel audio (1950s), U-matic (1970s), Beta SP (1980s), VHS (1980s)
• Deposit in Digital Library
  – Improve access
  – Snapshot before deterioration of objects
• Archiving of knowledge for perennial access
  – Metadata is key to retrieval
    • Retiree photo caption project
  – Storage model (and backups)
    • MultiMedia Archive: 30 TB of data (cf.
<2 TB for 1M docs)

Multi-Media Records: Photos

Multi-Media Records: Videos
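The (More) Open Access slide lists OAI-PMH as a supported interchange protocol. As a rough illustration of what harvesting looks like from the client side, the sketch below builds a `ListRecords` request and parses a Dublin Core response. The endpoint URL, the sample XML and the helper names are hypothetical and only stand in for a real repository; the verb, argument names and XML namespaces are those of OAI-PMH 2.0.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint (assumption): replace with a real repository's base URL.
BASE_URL = "https://repository.example.org/oai2d"

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, set_spec=None, resumption_token=None):
    """Build a ListRecords request URL."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it replaces all others.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier",
                                  default="", namespaces=NS)
        titles = [t.text for t in rec.findall(".//dc:title", NS)]
        records.append({"identifier": identifier, "titles": titles})
    token = root.findtext("oai:ListRecords/oai:resumptionToken",
                          default="", namespaces=NS)
    return records, (token or None)


# Hand-written sample response (assumption), trimmed to the fields used above.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample preprint</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url(BASE_URL, from_date="2007-01-01"))
    records, token = parse_list_records(SAMPLE)
    print(records, token)
```

A harvester would repeat the request with each returned `resumptionToken` until the response no longer carries one, which is how large collections are transferred in pages.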