OSI | WEB SERVICES
World Digital Library
www.wdl.org

“Workshop on Technical Skills and Standards for the World Digital Library”
Doha, Qatar
February 22-23, 2012

Agenda (Day One)
• Introduction: Overview of the World Digital Library and WDL in the Arab Peninsula Region (Jason)
• Participants Introduction and Discussion (All)
• WDL Standards: Purpose and Practice (Sandy / Chris)
• Overview of the WDL Production Process (Jason)
• Workflow for Content and Metadata Preparation (Sandy / Chris / Ted)

Agenda (Day Two)
• Image Files: Digitization and Preparation (Sandy / Chris)
• Expert Descriptions: Nominal Process for Review and Evaluating Quality (Jason / Ted)
• Discussion on “Expert Descriptions based on Arab Heritage Resources” (Mr. Mohammed Hammam Fikri)
• Image Files: Transfer (Sandy / Chris)
• Discussion: Increasing the Usage of WDL in the Arab World (All)
• Summary: (Jason)
  • Review of the WDL Production Process
  • New Developments and Opportunities for Technical Staff at Partner Institutions
  • Improving Communication and Collaboration
  • Feedback Questionnaire, Certificates and Photographs Distribution

WDL Trainers:
Sandy Bostian, WDL Content Manager
Chris Masciangelo, WDL Digital Conversion Specialist
Ted Waddelow, U.S. Fulbright Scholar at University of Bahrain
Jason Yasner, WDL Operations Manager

Introduction: Overview of the World Digital Library and WDL in the Arab Peninsula Region
Jason Yasner
WDL Operations Manager

World Digital Library: Mission and Objectives
Mission: Digitize and make freely available over the Internet, in multilingual format, primary source materials that tell the stories and highlight the achievements of all countries
Objectives:
• Promote international and intercultural understanding and awareness
• Expand multilingual and culturally diverse content on the Internet
• Provide resources to educators and contribute to scholarly research
• Build knowledge and capacity in the developing world; help narrow the digital divide

Key Features of the WDL Website
• Multilingualism
  • Interface in seven (7) languages
  • Content in more than seventy-five (75) languages
• High quality content of cultural and historical importance
• Consistent, high-quality metadata to allow searching and browsing across cultures and time periods
• Item-level descriptions, curator videos to enhance user understanding of the content
• Speed and performance
• Web 2.0 features

Partners and User Statistics (As of February 1, 2012)
• 138 partners from 72 countries
• www.wdl.org launched on April 21, 2009
• Usage since launch:
  • More than 19 million visitors
  • Top countries by number of visitors: Spain, United States, Mexico, Brazil, Argentina, China, France, Russian Federation, Colombia, Portugal, Germany, UK
  • Links from other sites to WDL: 3.5 million
  • Visitors from the Arab World: 454,390

The WDL Governance Structure
• WDL launched as a Library of Congress-UNESCO partnership
• WDL Charter provides for:
  Annual partner meeting
  Executive Council
  Standing Committees for:
    Technical Architecture
    Content Selection
    Translation and Language
  Regional and Subject Sub-committees
    Arabic Scientific Manuscripts
    Chinese Language Content
    Meso-American Codices
    Arab Peninsula Regional Group
• Library of Congress serves as Project Manager (2010-2015)

WDL in the Arab Peninsula Region (and neighboring parts of Africa)
• Abu Dhabi Authority for Culture and Heritage (ADACH)
• Al-Ahgaf Library for Manuscripts, Yemen
• Central Library, Qatar Foundation, Qatar
• King Abdulaziz University Library, Saudi Arabia
• King Abdullah University of Science and Technology (KAUST), Main Library, Saudi Arabia
• King Hamad Library, Bahrain
• National Library of Jordan
• National Library of Sudan
• Omani Digital Library (Kawab Al-Marifa), Sultanate of Oman
• Sultan Qaboos University Libraries, Sultanate of Oman

Partner Introduction and Discussion
• Name, Position, Institution
• What are you digitizing, e.g., books, manuscripts, maps, photos, etc.?
• What content do you plan to contribute to WDL?
• Do you have a preservation policy in place?
• How are you storing your digital items?
• What equipment are you using?
• Do you have dedicated funding for digital production?
• Do you have a dedicated physical space (scanning facility)?

WDL Standards: Purpose and Practice
project.wdl.org
Sandy Bostian
WDL Content Manager

WDL Standards: Why?
• Your images are what people see. Deep Zoom shows flaws. “Put your best foot forward.”
• Your metadata runs the display.

WDL Standards: What is a digital object?
• Digital Content
• Metadata
• Relationships
• Behavior
[Diagram labels: Partner, WDL]

WDL Standards: One vs. Many
• Books, Manuscripts
  Single Volume – 1 object
  Multi-Volume – 1 object
• Newspapers
  Each Issue – 1 object
• Journal
  Each Issue – 1 object
• Photographs
  Single Photograph – 1 object
  Photo Album – ???
• Maps
  Single Map – 1 object
  Atlas – ???

WDL Standards: Digital Object
• 1 Record = 1 Object
• Make Sure the Numbers Match Before Sending
• Put the Digital Identifier in the Record

WDL Standards: Real Life Example
• 806 Records
• 316 Files + 55 Directories = 371 Objects
• No Digital Identifiers in Records

WDL Standards: Filenaming
• ASCII characters, preferably numbers
• Case matters (my_file vs. My_file vs. my_File)
• Three-letter file extensions (.tif NOT .tiff)
• Do not use these characters: < or >   :   "   / or \   |   ?   *   space “ ”   ( or )

Overview of WDL Production Process
• Content Selection
• Content Transfer
• Content Processing
• Cataloging (consistent metadata, descriptions)
• Translation
• Publishing
Jason Yasner
WDL Operations Manager

WDL 2 Workflow
[Workflow diagram]

Content Workflow: Transfer
• Transfer to “landing zone” server
• Malware & virus scans
• “Bag & tag”
• Inventory
• Transfer to tape backup
• Transfer to “work zone” server

Content Workflow: Triage
• Reconcile records & media
• Cursory vetting for obvious problems
  • Missing media files
  • Missing data files/records
  • Corruption issues
• Cataloging/Translation derivatives
• Starter kit prep - item level tracking
• Load to catalog or pre-translation

Content Workflow: Object Building
• Quality review
• Media adjustments as needed (cropping, etc.)
• Color management/sRGB conversion
• Create master reference image
• Object building
  • WDL directory structure
  • Add image header information
  • Handle registration
  • Reference image derivatives
  • PDFs

Content Workflow: Validation
• Run script that checks for missing parts & server permissions issues
• Final media validation (JHOVE based tool)
• Link items to dev team server area
• Update inventory system

Description Process
WDL Description team:
• Professors, scholars, researchers, and other field experts
• Editors
• Description Coordinator
• Partner Reviewers
Function:
• To produce a summary that explains each item and highlights its significance
• To illuminate the material in a way that is accurate, succinct, and engaging, working with information provided by partners, subject specialists, and other authoritative resources
Process:
Pre-description: determine and execute the appropriate course of action for the writing and editing processes, i.e.:
  – how to improve existing descriptions provided by partners
  – pinpoint the appropriate expert to write descriptions, if they are not provided
Writing: supplement pre-existing descriptions or produce original ones
Editing: substantive edit and copy-edit
Partner Review: evaluation of WDL-edited descriptions by partner institution
Finalizing: final approval before description is sent to Metadata team

Metadata Process
Original metadata creation
• Edit English title, read and check the description
• Find or verify names in VIAF
• Verify publication date and language used in the material
• Verify place names in TGN
• Assign appropriate topics and additional subjects
• Clean “Notes” and “Physical Description” fields
• Add collection title if needed
Metadata review
• Review each field to make sure there are no errors
• Verify the names, DDC, and subjects, etc.
Batch metadata creation
• Analyze metadata to develop strategy and workflow
• Create metadata mapping and decide output format
• Create and/or verify generic information, names, places, DDC, and subjects, etc.
• Develop and create macros and scripts
• Process metadata in batch using various tools
• Final review of the metadata in Metadata Management App

Translation and Language Management
Linguistic team
• Three translation vendors who share our six languages (FR, ES, PT, AR, RU, ZH)
• One reviewer per language
Translation process
• Pre-translation when content is submitted to us in a language other than English
• Translation into six languages, using translation management systems such as CAT tools
• In-context review and quality control on our staging website
Related work
• Terminology and style management (glossaries and style guides)
• Query management (direct access to content writers and subject matter experts)
• Feedback loop management between subject matter experts, translators and reviewers

Development Process
• Ongoing supportive role of entire production process
• Ongoing maintenance and scaling of website
• Building tools for content and metadata management
• Troubleshooting

Workflow for Content and Metadata Preparation
Your metadata runs the display!
Sandy Bostian
WDL Content Manager

Metadata Preparation: Identifiers (IDs)
• WDL Identifier
• Record Identifier
• Digital Identifier

Metadata Preparation: Title Information
• Original Title
• Original Title Language
  Title vs. Work
• English Title (if available)

Metadata Preparation: Contributors
• Contributor Name
  Birth/Death Dates (if known)
• Role
  Author
  Scribe/Copyist
  Calligrapher
  Photographer
  Editor/Compiler
  Architect

Metadata Preparation: Publication Info
• Date Created
  Start
  End
  Created vs. Copied
  CE and/or Hijri
• Publisher
• Place of Publication

Metadata Preparation: Subjects
• Place
  Hierarchical Controlled Vocabularies
• Time
  Publication vs. Subject Time
  CE and/or Hijri
• Dewey Decimal Classification (DDC)
• Additional Subjects (keywords)

Metadata Preparation: Type of Item
• Controlled Vocabulary
  Books
  Journals
  Manuscripts
  Maps
  Motion Pictures
  Newspapers
  Prints, Photographs
  Sound Recordings

Metadata Preparation: Descriptions
• Description/Abstract
• Physical Description
• Notes
• Language

Metadata Preparation: Additional Info
• Collection
• Series
• Institution
• URL

Image Files: Digitization and Preparation
Chris Masciangelo
WDL Digital Conversion Specialist

Agenda
Technology
  Hardware
  Software
Pre-capture Set Up
Information Capture / Scanning
File Formats
Storage
Quality Review / Post Processing
Problems
Audio and Video
Other Considerations

Technology
Hardware
- Flatbed scanner
- Overhead/Planetary scanner
- V-shaped book scanner
- Large Format scanning system
- Monitor
- Computer

Hardware
Flatbed Scanners

Hardware
Overhead/Planetary Scanners

Hardware
V-shaped Book Scanners
- Manual [Atiz BookDrive Pro]
- Automatic [Kirtas KABIS III]

Hardware
Large Format Scanning System
- Camera
- Digital Back
- Lighting
- Platform/Easel

Hardware
Monitor
- CRT
- LCD
Calibration – colorimeter / spectrophotometer

Software
Computer/Operating System
- Windows
- Mac

Software
Scanning software/driver
Image Editing software
- Adobe Photoshop Creative Suite
- Adobe Bridge
- Adobe Photoshop Elements
- Capture One
Image Validation software [JHOVE]
Image Metadata Editing [ExifTool]

Pre-capture Set Up
Examine materials to be scanned for issues/problems
Identify balance between adequate informational capture and quality, physical item size, file format and compression, resolution and bit depth, and master file size
Ensure equipment is properly focused, has capture profile set (including no auto-sharpening), and that the monitor is calibrated
Ensure proper and equal distribution of lighting
Good practice to shoot with a commercial color bar or grayscale

Pre-capture Set Up
Risks during imaging
- Physical damage
- Exposure to light
- Exposure to heat
- Disassociation

Information Capture / Scanning
• Targets, color bars, and grayscales

Information Capture / Scanning
• Bit Depth [informational and artifactual value]

Information Capture / Scanning
• Color Modes
  - RGB [red, green, blue]
  - CMYK [cyan, magenta, yellow, key (black)]

Information Capture / Scanning
• Color Space and Profile
  - sRGB
    sRGB is the only appropriate choice for images uploaded to the web since most web browsers don't support any color management.
  - sRGB IEC61966-2.1
  - Adobe RGB (1998)
    Adobe RGB images that are uploaded to websites without conversion to sRGB will generally appear dark and muted.
  - Grayscale
  - Gray Gamma 2.2

Information Capture / Scanning
• Resolution [PPI/DPI]
  - 2400+ ppi (slides)
  - 300-600 ppi (archival)
  - 72/96 ppi (on screen/download)
Rule of thumb: Longest side of the image should be 3000 pixels
4 inch x 5 inch photo @ 600 ppi = 2400 x 3000 pixels
File size = (width pixels x height pixels x bit depth)/8
(2400 x 3000 x 24)/8 = 21,600,000 bytes or 20.6 MB

File Formats
• Master [archival]
  - RAW
  - TIF
  - JPEG2000
• Access [derivatives]
  - JPG
  - PNG
  - PDF
• Compression
  - LZW
  - JPEG2000
  - ZIP

Storage
• Media
  - CD
  - DVD
  - Hard Drive
  - Tape
  - Cloud
• Backups
• Issues
  - Capacity
  - Transfer errors
  - Bit rot
  - Sustainability, obsolescence, data migration

Quality Review / Post-Processing
• Identify digital issues
• Image validation
• Color correction
• Sharpening

Problems
• Noise
• Artifacting
• Vignetting
• Chromatic aberration
• Over-sharpening
• Depth of field
• Color reproduction
• Lens distortion
• Clipping - Overexposure / Underexposure

Problems
• Noise

Problems
• Artifacting

Problems
• Vignetting

Problems
• Chromatic aberration

Problems
• Over-sharpening
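
The file-size arithmetic on the Resolution slide can be reproduced in a few lines of Python. This is only a sketch of that formula: the function names are made up for illustration, and the result is the size of the uncompressed pixel data, so a real TIFF will be slightly larger (headers, metadata) and a compressed file will differ. The slide's "20.6 MB" corresponds to dividing the byte count by 1,048,576 (1024 x 1024).

```python
def pixel_dimensions(width_in, height_in, ppi):
    """Pixel width and height of a scan of a width_in x height_in inch original."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_px, height_px, bit_depth):
    """Uncompressed image data size: (width x height x bit depth) / 8."""
    total_bits = width_px * height_px * bit_depth
    return (total_bits + 7) // 8  # round up to a whole byte


# Worked example from the slide: 4 x 5 inch photo at 600 ppi, 24-bit color.
width_px, height_px = pixel_dimensions(4, 5, 600)
size = uncompressed_bytes(width_px, height_px, 24)

print(width_px, height_px)          # 2400 3000
print(size)                         # 21600000
print(round(size / 1024 ** 2, 1))   # 20.6
```

Changing the bit depth to 48 (16 bits per channel) or the resolution to 300 ppi shows quickly how master file size trades off against capture quality.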

Problems
• Depth of Field

Problems
• Color Reproduction

Problems
• Lens Distortion

Problems
• Clipping - Overexposure / Underexposure

Audio and Video
• The format should be publicly and openly documented.
• The format is not proprietary.
• The format is in widespread use.
• The format can be opened, read, and accessed using readily-available tools.
• WAV or AIFF files [uncompressed]
• MP4 or MOV

Other Considerations
• Copyright
• Funding
• Staffing

Expert Descriptions: Nominal Process for Review and Evaluating Quality
Question: What is an Expert Description on the WDL?
Answer: It is a summary that explains each item and highlights its significance. It illuminates the material in a way that is accurate, succinct, and engaging. It is developed by working with information provided by partners, subject specialists, and other authoritative resources.
Jason Yasner
WDL Operations Manager

Expert Descriptions: Nominal Process
1. Do we have adequate information from the partner, both metadata and what the partner has provided as descriptions, to produce the descriptions we need for the WDL?
Does the description answer the following questions: “What is this object and why does it matter?”

Expert Descriptions: Nominal Process
2. If the initial description (and metadata) are inadequate to create a usable WDL description, is there additional information on the Partner Institution’s Web site directly linked to this item that could be used to do so?
This may not be initially known because there may be different people involved in creating the metadata than were involved in creating these sites.

Expert Descriptions: Nominal Process
3. If the answers to 1 and 2 both turn out to be “no,” i.e., the Partner Institution does not have a description that can be used, is there an existing, authoritative source (online or in hard copy) that contains a usable description for the item or items in question, or the raw material to create such a description?
Description writers need to remember that they are dealing with digitized versions of real objects, many rare and unique. Description writers are trying to highlight the significance of these objects and provide context. They shouldn’t write just about the objects (as in the bibliographic literature) or just about the context (as in general sources such as Wikipedia), but must do a bit of both, relating one to the other.

Expert Descriptions: Nominal Process
4. If the answers to 1, 2, and 3 are all “no,” we have no alternative but to produce a new description from scratch. This can be done either by going back to the Partner Institution and asking them to produce a new description, or, if for some reason the partner cannot produce a description (it may not have available staff or language expertise, for example), by engaging WDL staff and expert contractors to write the description and then running it by the partner for comment/approval.
Examples of research strategies used by WDL.

Discussion on “Expert Descriptions based on Arab Heritage Resources”
Mr. Mohammed Hammam Fikri
Senior Heritage Specialist
Cultural Advisor Office, Qatar Foundation

Image Files: Transfer
Live Demo
Sandy Bostian
WDL Content Manager

Discussion: Increasing the Usage of WDL in the Arab World
• How do we promote WDL to users in the Arab World?
• What can your institution do to support this?
• How do we recruit more institutions to join WDL?
• Other ideas?

SUMMARY
Jason Yasner
WDL Operations Manager

Overview of WDL Production Process
• Content Selection
• Content Transfer
• Content Processing
• Cataloging (consistent metadata, descriptions)
• Translation
• Publishing

New Developments
• January 2011: 1,350 items online
  • Comprising 98,278 images
• November 14, 2011 (Partner Meeting): 4,049 items
  • Comprising approx. 212,000 images
  • 200% increase since January 1, 2011
• End of 2011: 4,550 items
  • 237% increase since January 1, 2011
• Early 2012: Expected approx. 6,000 items
  • Satisfy medium-term goal in WDL Business Plan
  • 344% increase since January 1, 2011

Opportunities for Technical Staff
• New Metadata Management Application
  • Code-named “Cupcake”
  • Improved metadata creation and maintenance
  • Currently, also supports Translation and Quality Review processes
• WDL 2
  • Overhaul of code for better management and optimization
  • Into the cloud…
  • Better user experience worldwide
• Continuous Process Improvement
  • Project Management, standards and best practices
  • Efficiency, reliability, and timeliness

Improve Communication and Collaboration
• Increase Partner Collaboration
  • Enable more Self-Service via online tools to:
    • Increase online content transfers
    • Track transfers with inventory integration
    • Improve content processing and validation
    • Allow Partner access and review of descriptions, metadata, and translations
  • Get more people in the process where needed
• Improve Detailed Standards and Guidelines
  • http://project.wdl.org
  • Content guidelines (digitization, file-naming, object structure)
  • Metadata guidelines (mapping from MARC, MODS, Dublin Core)
• Improve Integration with Authority Files
  • Metadata controlled vocabularies, authority files
  • Translation controlled vocabularies, authority files

Improve Communication and Collaboration
• Improve User Interface (UI) and User Experience (UX)
  • WDL 2 is a starting point
  • Richer user interface and experience
  • Full-Text Search (FTS)
  • Enhanced Web 2.0 Sharing features
• Capacity Building
  • Continue support of digitization centers in Egypt, Uganda, and Iraq
  • Plan meetings and training workshops
  • Use this workshop in Doha, Qatar, as a prototype
• Please follow @WDLorg
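
The growth percentages on the New Developments slide follow from the January 2011 baseline of 1,350 items. The short Python check below recomputes them as (current - baseline) / baseline x 100; the function name and the labels are illustrative only, and the item counts are the ones given on the slide.

```python
def percent_increase(baseline, current):
    """Percentage growth of `current` relative to `baseline`."""
    return (current - baseline) / baseline * 100


BASELINE_ITEMS = 1350  # items online in January 2011, per the slide

milestones = [
    ("November 14, 2011", 4049),
    ("End of 2011", 4550),
    ("Early 2012 (expected)", 6000),
]

for label, items in milestones:
    print(f"{label}: {round(percent_increase(BASELINE_ITEMS, items))}% increase")
# November 14, 2011: 200% increase
# End of 2011: 237% increase
# Early 2012 (expected): 344% increase
```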

Contact Information
Sandy Bostian, WDL Content Manager
sbos@loc.gov
Chris Masciangelo, WDL Digital Conversion Specialist
cmas@loc.gov
Ted Waddelow, U.S. Fulbright Scholar, Univ. of Bahrain
theodore.waddelow@gmail.com
Jason Yasner, WDL Operations Manager
jyas@loc.gov

Thank you!
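
Supplementary example: the rules on the “WDL Standards: Filenaming” slide lend themselves to an automated check before files are transferred. The Python sketch below is not a WDL tool; the function names are hypothetical, and it makes two assumptions that the slide does not state: a "three-letter" extension is read as exactly three ASCII letters or digits (so that .jp2 passes), and the quoted blank on the slide is read as the space character.

```python
import os
from collections import defaultdict

# Characters the Filenaming slide says not to use (the last three are
# space, "(" and ")").
FORBIDDEN = set('<>:"/\\|?* ()')


def filename_problems(name):
    """Return a list of rule violations for one file name (empty if none)."""
    problems = []
    if not name.isascii():
        problems.append("contains non-ASCII characters")
    bad = sorted(set(name) & FORBIDDEN)
    if bad:
        problems.append("forbidden characters: " + " ".join(repr(c) for c in bad))
    ext = os.path.splitext(name)[1]  # e.g. ".tif"
    # Assumption: "three-letter" means three ASCII letters or digits.
    if not (len(ext) == 4 and ext[1:].isascii() and ext[1:].isalnum()):
        problems.append(f"extension {ext!r} is not three characters")
    return problems


def case_collisions(names):
    """Group names that differ only by letter case (my_file vs. My_file)."""
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    return [group for group in groups.values() if len(group) > 1]


for candidate in ["0001.tif", "0001.tiff", "My File(1).tif"]:
    print(candidate, "->", filename_problems(candidate) or "OK")

print(case_collisions(["my_file.tif", "My_file.tif", "0002.tif"]))
```

Running a check like this over a whole delivery, together with a count of records against objects, would catch the kind of mismatch shown on the “Real Life Example” slide before anything is sent.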