metadata management

The Harold B. Lee Library is considered a leader in the library cataloging and authority profession, and it is important to continue this tradition by leading in setting and following metadata standards as well. Digital projects require an expenditure of time, effort, and money to create resources that should be accessible. Find, Identify, Select, Obtain (FISO) is the goal of creating digital resources, and metadata is the key to reaching these steps for access and use. Since metadata is the vital component in providing description and access for digital resources, metadata expertise should always be part of the workflow of digital projects. It needs to be planned and accounted for in the discussion period as well as during metadata creation, field mapping, and technical loading into the digital asset management system chosen for the project. Maintenance continues after the project is completed, and migration to new software and systems needs attention in all digital projects. Decisions by departments can have far-reaching effects on metadata and digital collections because of the high number of digital objects and the variety of collection owners/curators.
Digital resources in the Harold B. Lee Library include:

2.57 million items in CONTENTdm
21,530 items in Internet Archive
6,000 URLs and 26 million documents in Archive-It
Bibliographies in ATOM: 224,357 in Utah County Obituary Index; 14,621 in the Mormon Bibliography; 39,192 in Relief Society Magazine
24 journals in Open Journal Systems (OJS), migrated from CONTENTdm

Digital collections owners/curators: Special Collections, Family History, European Studies, Institute for the Preservation of Ancient Religious Texts, Ancient Textual Imaging Group, Art Department, Humanities Department, Religion Department, Maxwell Institute, Daughters of the Utah Pioneers, Springville, BYU Hawaii, BYU Idaho, LDS Business College.

Examples of metadata problems that developed without appropriate metadata management of projects:

1. LIT reformatted internal copyright links from html to php. No notice was given to the metadata unit until after this action was taken. Understandably, LIT wants to keep internal links as secure as possible, but they were unaware that this affected over 2 million metadata records in CONTENTdm. This required a review of all 180 digital collections to note the copyright link used in each, as many are custom designed and all are written and approved by Carl Johnson. With 16 unique copyright links in CONTENTdm, LIT subsequently needed to create 16 replacement links. Following that was the actual link replacement within each digital collection. This was an unexpected project that was not planned on by the metadata unit. Having a voice within LIT would help both that department and the metadata unit to be aware of the ‘domino effect’ when any action is taken that can affect metadata records.

2. The Relief Society Magazine Index began as a bibliographic database from a curator and evolved into a digital project when it was chosen to be scanned into Internet Archive. The index was used as the source of metadata, but when the index was matched with the digital scans it was not up to standards and needed corrections on over 39 thousand items. Although it received approval as a digital project, there was no metadata preview of the issues involved, which were further reaching than expected and took three times longer than anticipated to complete. It is important for a digital project to keep to its scope definition and not allow ‘scope creep’ as it proceeds. Scope creep will always have an impact on metadata production, maintenance, and review.

3. In connection with the Florence Nightingale exhibition, Special Collections digitized specific historical documents to enhance the physical exhibit with an online exhibit. With the exhibit deadline fast approaching, the metadata unit did not have enough advance notice to complete the metadata for loading into the digital collection. Metadata is more time consuming and detailed than the scanning process, and this must be accounted for when planning projects.

4. A music reference bibliography database was monitored by Special Collections music students who added metadata over several years. The database became corrupted from manipulation and excessive copying and pasting from Excel. Correcting the metadata became a crisis for the curator, who needed it fixed for addition to a music website. Reviewing metadata created by other sources is a vital step in digital project maintenance.

5. Most of the digitization projects come from Special Collections. The curators are the ‘owners’ and subject specialists of these collections and therefore have the responsibility to provide metadata for these projects. Metadata now comes from the archival finding aid. It is mapped into the Dublin Core fields by the DI Lab as they scan the images and then loaded in batches into CONTENTdm. In this workflow, the projects are approved by the curator with a completed finding aid, and the metadata is not seen by the metadata unit until after it is loaded by the lab students.

6. Journals originally digitized into CONTENTdm have been migrated into OJS. When these journals were in CONTENTdm, rich subject analysis metadata was created at the article level to aid in searching and access (FISO). This metadata was lost in the migration to OJS, along with the many thousands of hours of work that produced it.

7. Items scanned into Internet Archive require metadata in order to be loaded into the system. All the metadata comes from our catalog records. Problems in this procedure develop when:
   a. Catalog records are very minimal or are only an acquisition record, so there is not enough information in the record to load into Internet Archive. The missing information is then created by lab students not trained in cataloging.
   b. DI Lab students choose the wrong records to match with the item in hand, thus loading the item with the wrong metadata.
   c. The library does not own the materials, so there is no catalog record to match with the item. The necessary metadata is created by the lab from the item in hand or is taken from outside records for loading into Internet Archive and is not first reviewed by the metadata unit. Examples are the Clarence Dixon Taylor materials, which contain items belonging to donors that were brought in to be digitized, and the Brussels opera music, which contains items scanned in the Brussels Archive and added to the Internet Archive collection, resulting in approximately 1,700 titles added to a cataloging backlog.
   d. Items that belong to the library have not been cataloged yet, so the lab creates the metadata, causing an additional cataloging backlog following the scanning.

Suggestions to improve metadata management:

1. Provide thorough metadata consultation on every digital project, starting with the proposal and following through to the finished project and subsequent monitoring through any other changes. This would include metadata inclusion and consultation in all departments whose work may have any effect upon metadata.

2. Provide more oversight of projects done by units outside of the library as well as departments inside the library.

3. Have the metadata unit be responsible for loading batches of digital objects into CONTENTdm. This would allow the metadata to be reviewed as it is loaded.

4. Have the metadata unit load metadata into the Dublin Core fields prior to uploading into CONTENTdm. Metadata unit students are trained to do this and have the expertise needed to know the correct information and format for each digital object.

5. Eliminate ‘scope creep’ on digital projects that produces changes in the metadata creation.
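
As an illustration of the review step proposed in suggestion 3, the following is a minimal sketch of a check that could be run on a tab-delimited batch before it is loaded. It is not an existing library tool. The field names (Title, Date, Rights), the list of required fields, the example.org links, and the sample rows are all assumptions made for the example; a real check would take its field list from the project's own Dublin Core mapping.

```python
import csv
import io

# Hypothetical required fields; a real project would use its own field mapping.
REQUIRED_FIELDS = ("Title", "Date", "Rights")


def review_batch(tsv_text, required=REQUIRED_FIELDS):
    """Return (row_number, problem) pairs found in a tab-delimited batch."""
    problems = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row_number, row in enumerate(reader, start=1):
        # Report every required field that is absent or blank in this row.
        for field in required:
            value = (row.get(field) or "").strip()
            if not value:
                problems.append((row_number, f"missing {field}"))
        # Flag rights links still in the old .html form (see problem 1 above).
        rights = (row.get("Rights") or "").strip()
        if rights.lower().endswith(".html"):
            problems.append((row_number, "rights link ends in .html"))
    return problems


# Invented sample batch: the second record has no title and an old-style link.
sample = (
    "Title\tDate\tRights\n"
    "Letter to a nurse\t1856\thttp://example.org/copyright/generic.php\n"
    "\t1860\thttp://example.org/copyright/generic.html\n"
)

for row_number, problem in review_batch(sample):
    print(row_number, problem)
```

A report like this would let the metadata unit see incomplete records and outdated copyright links before a batch reaches CONTENTdm, rather than after the lab has loaded it.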
