10.0 Technical Metadata

Technical metadata records the technical attributes of digital objects, including information about
how they were produced or captured: the hardware and software used to acquire the digital object, the
file formats of master and derivative files, resolutions, color profiles, and so on. This information
makes it possible to reproduce digital resources in the future. For this reason, technical metadata is
categorized as both administrative and preservation metadata.
Technical metadata can be included in preservation metadata (PREMIS) or structural metadata (METS).
Where to record it depends on the types of technical metadata that should be preserved, since METS
and PREMIS incorporate technical metadata into their schemas in different ways, as described in the
Preservation Metadata and Structural Metadata sections.
Table of contents
10.1 Minimum Requirements
10.2 Object Types and Related Information
10.3 Tools that Capture Technical Metadata
10.4 Resources
10.1 Minimum Requirements
Requirements for technical metadata differ from one media format to another. Regardless of media
format, all technical metadata should include the following information:
• File Format: Recommended best practice is to select a value from a controlled vocabulary, such
  as the list of Internet Media Types <http://www.iana.org/assignments/media-types/> that
  defines computer media formats (also known as MIME types).
• File Size: For most formats, the recommended best practice is to record file size as bytes (e.g.,
  3,000,000 bytes) and not as kilobytes (KB), megabytes (MB), etc., because it is the most specific
  measurement of file size. It is also best practice to include duration time for multi-media. For
  example: 4,200,000 bytes; 5 minutes, 34 seconds.
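The two minimum elements can be gathered with a few lines of script. The following is a minimal sketch in Python, not a prescribed tool: the function name and dictionary keys are hypothetical, and the MIME type is guessed from the file extension only, so it should be checked against the actual file content before it is recorded.

```python
import mimetypes
import os


def minimum_technical_metadata(path, duration_seconds=None):
    """Return file format (MIME type) and file size in bytes for one file.

    The function name and the dictionary keys are illustrative assumptions.
    """
    # guess_type() looks at the file extension only; it does not inspect content.
    mime_type, _encoding = mimetypes.guess_type(path)
    record = {
        "file_format": mime_type or "application/octet-stream",
        "file_size_bytes": os.path.getsize(path),
    }
    # Duration is supplied by the caller because it cannot be derived from size alone.
    if duration_seconds is not None:
        minutes, seconds = divmod(int(duration_seconds), 60)
        record["duration"] = f"{minutes} minutes, {seconds} seconds"
    return record
```

Recording the size as an integer number of bytes keeps the value exact; any KB or MB figure shown to users can be derived from it later.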
10.2 Object Types and Related Information
10.2.1 Textual Document
textMD is an XML Schema that details technical metadata for text-based digital objects
<http://www.loc.gov/standards/textMD/>. It can be used within both METS (as an extension schema)
and PREMIS, or as a standalone document.
When using the textMD schema, the technical metadata for text-based digital objects may include:
• encoding information (quality, platform, software, agent)
• character information (character set and size, byte order and size, line terminators)
• languages
• fonts
• markup information
• processing and textual notes
• technical requirements for printing and viewing
• page ordering and sequencing
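Some of the character information listed above can be read directly from the bytes of a text file. The sketch below, in Python, reports a byte order mark (if any) and counts line terminators. The function name and result keys are hypothetical and are not textMD element names, and the line terminator counts assume a single-byte or UTF-8 encoding.

```python
import codecs

# UTF-32 marks are listed before UTF-16 because the UTF-32 little-endian mark
# begins with the same two bytes as the UTF-16 little-endian mark.
_BYTE_ORDER_MARKS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF32_LE, "UTF-32, little-endian"),
    (codecs.BOM_UTF32_BE, "UTF-32, big-endian"),
    (codecs.BOM_UTF16_LE, "UTF-16, little-endian"),
    (codecs.BOM_UTF16_BE, "UTF-16, big-endian"),
]


def character_information(data):
    """Report the byte order mark and line terminators found in raw text bytes."""
    byte_order_mark = None
    for mark, label in _BYTE_ORDER_MARKS:
        if data.startswith(mark):
            byte_order_mark = label
            break
    # Count CR+LF pairs first so they are not also counted as lone CR or lone LF.
    crlf = data.count(b"\r\n")
    return {
        "byte_order_mark": byte_order_mark,
        "line_terminators": {
            "CRLF": crlf,
            "LF": data.count(b"\n") - crlf,
            "CR": data.count(b"\r") - crlf,
        },
    }
```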
10.2.2 Still Image
For digital still images, it is recommended to use the NISO Data Dictionary – Technical Metadata for
Digital Still Images at
<http://www.niso.org/kst/reports/standards?step=2&gid=None&project_key=b897b0cf3e2ee526252d9f830207b3cc9f3b6c2c>.
The document includes metadata elements important to the management of image files that could be
captured automatically from scanner or digital camera software. The elements can be expressed in XML
using the MIX schema <http://www.loc.gov/standards/mix/>.
The technical metadata for digital still images may include:
• file format
• file resolution (pixels per inch)
• dimensions (image dimension or size in inches or centimeters)
• bit-depth (e.g., 8-bit, 16-bit, 24-bit, etc.)
• color mode (e.g., RGB, CMYK, or grayscale)
• scanner or digital camera brand, name, and model number
• software used to manipulate or compress the image, including the software name and version.
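Several of these elements are stored in the image file header and can be read without imaging software. As an illustration only, the Python sketch below reads the pixel dimensions, bit depth, and color mode from the IHDR chunk at the start of a PNG file, and derives the physical size from a resolution supplied by the caller. PNG is used here simply because its header is easy to parse; the function name, the result keys, and the choice to pass the resolution in as a parameter are assumptions.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
PNG_COLOR_TYPES = {
    0: "grayscale",
    2: "RGB",
    3: "indexed color (palette)",
    4: "grayscale with alpha",
    6: "RGB with alpha",
}


def png_technical_metadata(header, pixels_per_inch):
    """Describe a PNG image from its first 26 bytes and a known scan resolution."""
    if len(header) < 26 or header[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # The first chunk after the signature is IHDR: 4-byte length, 4-byte type, data.
    _length, chunk_type = struct.unpack(">I4s", header[8:16])
    if chunk_type != b"IHDR":
        raise ValueError("IHDR chunk not found")
    width, height, bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    return {
        "file_format": "image/png",
        "resolution_ppi": pixels_per_inch,
        "width_pixels": width,
        "height_pixels": height,
        "width_inches": width / pixels_per_inch,
        "height_inches": height / pixels_per_inch,
        # PNG records bit depth per sample, so 8-bit RGB is 24 bits per pixel.
        "bit_depth_per_sample": bit_depth,
        "color_mode": PNG_COLOR_TYPES.get(color_type, "unknown"),
    }
```

In practice the header would be obtained by opening the file in binary mode and reading its first 26 bytes.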
10.2.3 Digital Audio Files
For digital audio files (sound only), the technical metadata for the audio quality and the recording
process may include:
• file format, including whether the format is lossless or lossy
• sample rate
• resolution (bit depth)
• number of channels
• software used to manipulate or compress the audio file, including the software name and
  version
• brand, name, and model number of the recording equipment used.
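For uncompressed WAV files, the sample rate, resolution, and number of channels can be read from the file header. The following Python sketch uses the standard library's wave module; it is one possible approach under the assumption that the master file is PCM WAV, and the function name and result keys are hypothetical.

```python
import wave


def wav_technical_metadata(path):
    """Read sample rate, resolution, channels, and duration from a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frame_count = wav.getnframes()
        sample_rate = wav.getframerate()
        return {
            "file_format": "WAV (uncompressed PCM, lossless)",
            "sample_rate_hz": sample_rate,
            # getsampwidth() returns bytes per sample; resolution is stated in bits.
            "resolution_bits": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
            "duration_seconds": frame_count / sample_rate,
        }
```

Equipment and software details are not stored in the WAV header read here and still need to be recorded by hand or taken from the capture workflow.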
10.2.4 Digital Multi-Media Files
For digital multi-media files (including video, sound, and animations), the technical metadata for the
quality and capture process may include:
• file format
• bitrate
• software used to manipulate or compress the multimedia file, including the software name and
  version
• brand, name, and model number of the equipment used.
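When a tool does not report the bitrate, an average value can be estimated from the file size and duration recorded under the minimum requirements. The Python sketch below shows the arithmetic; the function name is hypothetical, and the result is an overall average that includes container overhead, not the encoder's nominal setting.

```python
def average_bitrate_kbps(file_size_bytes, duration_seconds):
    """Estimate the average bitrate in kilobits per second (1 kbps = 1,000 bits/s)."""
    if duration_seconds <= 0:
        raise ValueError("duration must be greater than zero")
    bits = file_size_bytes * 8
    return bits / duration_seconds / 1000
```

For a file of 4,200,000 bytes lasting 5 minutes, 34 seconds (334 seconds), this gives roughly 100.6 kbps.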
10.3 Tools that Capture Technical Metadata
Currently, most technical metadata can be captured automatically by a digital camera or scanner.
The following are the most commonly used tools and standards for extracting metadata from, or
embedding metadata in, digital objects.
10.3.1 JHOVE
JHOVE <http://hul.harvard.edu/jhove/ > provides functions to perform format-specific identification,
validation, and characterization of digital objects. JHOVE generates technical metadata about files,
usually by extracting information from the files themselves.
10.3.2 XMP
Adobe's Extensible Metadata Platform (XMP) is a labeling technology that allows users to embed
metadata into the file itself. XMP is used for both descriptive (based on Dublin Core) and administrative
metadata.
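Because XMP is embedded as XML text, a packet can often be located by scanning the file's bytes. The Python sketch below is a rough illustration, not a full XMP reader: it assumes the packet uses the conventional x:xmpmeta wrapper element, is encoded as UTF-8, and stores the Dublin Core format as a dc:format element. Both function names are hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"


def find_xmp_packet(data):
    """Return the first embedded <x:xmpmeta> element as text, or None if absent."""
    start = data.find(b"<x:xmpmeta")
    if start == -1:
        return None
    end_tag = b"</x:xmpmeta>"
    end = data.find(end_tag, start)
    if end == -1:
        return None
    # Assumption: the packet is UTF-8; undecodable bytes are replaced, not rejected.
    return data[start:end + len(end_tag)].decode("utf-8", errors="replace")


def dc_format(xmp_text):
    """Return the text of the first dc:format element in the packet, or None."""
    root = ET.fromstring(xmp_text)
    element = root.find(f".//{{{DC_NAMESPACE}}}format")
    return element.text if element is not None else None
```

A dedicated XMP toolkit remains the safer choice for production work, since packets may use other encodings or store properties as attributes.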
10.3.3 Exif
Exchangeable image file format (Exif) <http://www.exif.org/> is a specification for the image file format
used by digital cameras. Exif data is embedded within the image file itself and can be edited. The
metadata tags defined in the Exif standard include date and time information; camera settings such as
the camera make and model; and information that varies with each image, such as orientation
(rotation), aperture, shutter speed, focal length, metering mode, and ISO speed. A thumbnail for
previewing the picture on the camera's LCD screen, in file managers, or in photo manipulation software
can also be embedded, and descriptions and copyright information can be captured or added later.
10.4 Resources
• NISO Data Dictionary for Technical Metadata for Digital Still Images:
  http://www.niso.org/kst/reports/standards?step=2&gid=None&project_key=b897b0cf3e2ee526252d9f830207b3cc9f3b6c2c
• NISO Metadata for Images in XML: http://www.loc.gov/standards/mix/
• Technical Metadata for Text: http://www.loc.gov/standards/textMD/
• JHOVE: http://hul.harvard.edu/jhove/
• Exif: http://www.exif.org/
• IPTC Core and Extension: http://photometadata.blogspot.com/2008/07/iptc-core-11extensions-10-released.html
• XMP: http://www.adobe.com/products/xmp/
• NARA Technical Guidelines for Digitizing Archival Materials for Electronic Access:
  http://www.archives.gov/preservation/technical/guidelines.pdf
Selected Resources for Technical Metadata
Source for File Format Extension Information:
http://fileinfo.net
Additional file formats are listed at:
http://www.ace.net.nz/tech/TechFileFormat.html
U.S. National Archives and Records Administration. Technical Guidelines for Digitizing Archival
Materials for Electronic Access: Creation of Production Master Files – Raster Images. (June 2004).
Online Edition
http://www.archives.gov/research/arc/digitizing-archival-materials.html
National Information Standards Organization and AIIM International. NISO Data Dictionary –
Technical Metadata for Digital Still Images
http://www.niso.org/kst/reports/standards?step=2&gid=&project_key=b897b0cf3e2ee526252d9f830207b3cc9f3
California Digital Library
http://www.cdlib.org/inside/diglib/guidelines/bpgimages/
Collaborative Digitization Program (select section on digital imaging)
https://www.bcr.org