3.0 Best Practices for Creating Digital Images

This section describes best practices for digitization of print-based original documents, including
photographs, manuscripts, maps, and text. These guidelines draw heavily on previously
published standards and best practices developed by standards agencies and peer institutions,
particularly those of the California Digital Library. Best practices for newspaper digitization are
covered in Section 5. Best practices for intellectual property rights (IPR) issues, which should be
investigated before scanning materials and making them publicly accessible, are covered in
Section 15.
Table of contents
3.1 Types of files produced
• Master (archival) files
• Access files
• Thumbnails
• Additional files for textual materials
3.2 Minimum image quality requirements for digital masters
• Textual Documents, Graphic Illustrations/Artwork, Maps, and Plans
• Photographs: Transmissive Originals (Film, Slides, and Negatives)
• Photographs: Reflective Originals (Prints)
• Aerials: Transmissive Originals (Film, Slides, and Negatives)
• Aerials: Reflective Originals
3.3 Minimum image quality requirements for digital access and thumbnail image files
3.4 Additional resources
3.1 Types of files produced
At a minimum, digitization of library materials should result in the creation of a master
(archival) image and at least one access derivative for web display. Depending on the format of
the material and its anticipated use and display, other files, such as thumbnails, PDFs, and OCR
text files, may also be produced. Files should be named in accordance with the best practices for
file naming in Section xx of this document. As they are produced, files should be saved to a
workspace that resides on a Library server that is backed up nightly.
• Master (archival) files are the source files for all other digital files and ensure the
long-term usability of the digital information. A digital master file may serve as a surrogate
for the original, may completely replace originals, or may be used as security against
possible loss of originals due to disaster, theft and/or deterioration. Images are
captured at a quality high enough to serve these potential uses via scanning or digital
photography, depending on the attributes of the original. The digital master file should
represent as accurately as possible the visual information in the original object. In
general, decisions about image capture should err towards the highest quality. Files
should use color rather than grayscale when color is an integral attribute of the original,
and any compression applied to the file should be lossless. The goal is accuracy and
consistency in tone and color reproduction through appropriate use of scanner or camera
controls.
• Access files are derived from master files and are used for presentation and
transmission over networks. These images should be of good quality, but because their
spatial resolution (measured in pixels per inch) is lower, the file size is smaller. Some
minor post-scan adjustments to optimize image quality and to bring all images to a
common rendition are acceptable. Such adjustments include the use of appropriate
image processing tools to achieve final color balance and tone distribution and to
sharpen scanned images to match the appearance of the originals.
• Thumbnail files are very small files used in databases or web pages. Clicking on the
thumbnail image will pull up the larger original image, which can be viewed and
downloaded.
• Additional files for textual materials
  • PDF (Portable Document Format) files are generally an appropriate access
    derivative for multi-page text documents and books. PDFs preserve the layout
    and formatting of original documents (including fonts and special characters, like
    formulas). When making the PDF, set Compatibility to Acrobat 5.0 (PDF 1.4),
    embed all fonts, specify color spaces in a device-independent manner, and do
    not use any encryption. Use high-resolution images to create the PDF and then
    optimize the file for web display. PDF files of textual materials should be made
    full-text searchable. See complete best practices for PDF creation in Section 6 of
    this manual.
  • OCR (optical character recognition) text files may also be created. OCR should be
    derived from the high-resolution image files. Adobe Acrobat Professional can
    produce OCR for clear, high-contrast laser printed or typeset documents; however,
    ABBYY FineReader has a much lower error rate and can analyze text prior to
    recognition to produce more accurate results. ABBYY FineReader is the recommended
    software for creating OCR for all other types of textual documents. See complete
    best practices for OCR creation in Section 5 of this manual.
3.2 Minimum image quality requirements for digital masters
The imaging quality requirements for master digital images are given below. These
requirements should be viewed as the minimum necessary to create quality digital images and
may be exceeded when warranted and when storage space permits. Associated technical
metadata should be saved to the file header.
Textual Documents, Graphic Illustrations/Artwork, Maps, and Plans
Features of original:
Clear, high-contrast documents with printed type (e.g., laser printed or typeset)
Digital Master Image File:
File format:
• TIFF or lossless JPEG2000
Pixel array:
• Minimum of 6000 pixels across long dimension for 1-bit bitonal mode.
• Minimum of 4000 pixels across long dimension for 8-bit grayscale.
Resolution and bit depth:
• 1-bit bitonal mode: 600 PPI for documents with smallest significant character of
1.0 mm or larger. The 600 PPI 1-bit files can be produced via scanning or derived
from 400 PPI, 8-bit grayscale images.
- or -
• 8-bit grayscale mode: 400 PPI for documents with smallest significant character of
1.0 mm or larger.

Features of original:
Documents with poor legibility or diffuse characters (e.g., carbon copies,
Thermofax/Verifax), handwritten annotations or other markings, low inherent contrast,
staining, fading, halftone illustrations, or photographs
Digital Master Image File:
File format:
• TIFF or lossless JPEG2000
Pixel array:
• Minimum of 4000 pixels across long dimension.
Resolution and bit depth:
• 8-bit grayscale mode: 400 PPI for documents with smallest significant character of
1.0 mm or larger.

Features of original:
Documents as described for grayscale scanning, and/or where color is important to the
interpretation of the information or content, or where the most accurate representation
is desired
Digital Master Image File:
File format:
• TIFF or lossless JPEG2000
Pixel array:
• Minimum of 4000 pixels across long dimension.
Resolution and bit depth:
• 24-bit RGB mode: 400 PPI for documents with smallest significant character of
1.0 mm or larger.
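Because these minimums may be exceeded only "when storage space permits," it can help to estimate how large a master file will be before scanning. The following is a minimal sketch, not part of the guidelines themselves: it estimates uncompressed image data size for the three capture modes above. The function name and the 8.5 x 11 in. page size are assumptions chosen for illustration, and real files will differ somewhat because of headers, metadata, and any lossless compression.

```python
def uncompressed_bytes(width_in, height_in, ppi, bits_per_pixel):
    """Estimate uncompressed image data size in bytes (headers and metadata excluded)."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    # Rows are padded to a whole number of bytes, which matters for 1-bit bitonal images.
    bytes_per_row = (width_px * bits_per_pixel + 7) // 8
    return bytes_per_row * height_px


# Hypothetical 8.5 x 11 in. page captured in each of the three modes described above.
for label, ppi, bits in [
    ("1-bit bitonal, 600 PPI", 600, 1),
    ("8-bit grayscale, 400 PPI", 400, 8),
    ("24-bit RGB, 400 PPI", 400, 24),
]:
    size = uncompressed_bytes(8.5, 11, ppi, bits)
    print(f"{label}: {size / 1_000_000:.1f} MB")
```

For this example page the estimates come to roughly 4 MB, 15 MB, and 45 MB respectively. The page also satisfies the pixel array minimums: 11 in. at 600 PPI gives 6600 pixels, and at 400 PPI gives 4400 pixels.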
Photographs: Transmissive Originals (Film, Slides, and Negatives)
Features of original:
Format range:
• 35 mm and medium format, up to 4x5 in.
Size range:
• Smaller than 20 square in.
Digital Master Image File:
File format:
• TIFF or lossless JPEG2000
Pixel array:
• 4000 pixels across long dimension of image area, excluding mounts and borders.
Resolution:
• Adjust the scan resolution to meet pixel array specifications, based on the format
of the original object: approximately 2800 PPI for 35 mm originals, ranging down to
approximately 800 PPI for originals approaching 4x5 in.
Bit depth:
• 8-bit grayscale mode for black-and-white; can be produced from a 16-bit grayscale file.
- or -
• 24-bit RGB mode for color and monochrome (e.g., collodion wet-plate negatives,
pyro-developed negatives, stained negatives); can be produced from a 48-bit RGB file.

Features of original:
Format range:
• Equal to or larger than 4x5 in. and up to 8x10 in.
Size range:
• Equal to or larger than 20 square in. and up to 80 square in.
Digital Master Image File:
File format:
• TIFF or lossless JPEG2000
Pixel array:
• 6000 pixels across long dimension of image area, excluding mounts and borders.
Resolution:
• Adjust the scan resolution to meet pixel array specifications, based on the format
of the original object: approximately 1200 PPI for 4x5 in. originals, ranging down to
approximately 600 PPI for 8x10 in. originals.
Bit depth:
• 8-bit grayscale mode for black-and-white; can be produced from a 16-bit grayscale file.
- or -
• 24-bit RGB mode for color and monochrome (e.g., collodion wet-plate negatives,
pyro-developed negatives, stained negatives); can be produced from a 48-bit RGB file.

Features of original:
Format range:
• Equal to or larger than 8x10 in.
Size range:
• Equal to or larger than 80 square in.
Digital Master Image File:
File format:
• TIFF or lossless JPEG2000
Pixel array:
• 8000 pixels across long dimension of image area, excluding mounts and borders.
Resolution:
• Adjust the scan resolution to meet pixel array specifications, based on the format
of the original object: approximately 800 PPI for 8x10 in. originals, ranging down to
the appropriate resolution to produce the desired size file from larger originals.
Bit depth:
• 8-bit grayscale mode for black-and-white; can be produced from a 16-bit grayscale file.
- or -
• 24-bit RGB mode for color and monochrome (e.g., collodion wet-plate negatives,
pyro-developed negatives, stained negatives); can be produced from a 48-bit RGB file.
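The approximate PPI figures quoted for film follow from dividing the required pixel array by the long dimension of the image area. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the 36 mm long dimension for a 35 mm frame (a 24 x 36 mm image area) is an assumption about the original rather than a value stated in this section.

```python
MM_PER_INCH = 25.4


def required_ppi(target_pixels, long_dimension_in):
    """Scan resolution needed for the long dimension of the image area to reach target_pixels."""
    if long_dimension_in <= 0:
        raise ValueError("long dimension must be positive")
    return target_pixels / long_dimension_in


# Assumption: a 35 mm frame has a 36 mm long dimension (24 x 36 mm image area).
frame_35mm_in = 36 / MM_PER_INCH

print(round(required_ppi(4000, frame_35mm_in)))  # 35 mm original, 4000-pixel tier
print(round(required_ppi(4000, 5)))              # 4x5 in. original, 4000-pixel tier
print(round(required_ppi(6000, 10)))             # 8x10 in. original, 6000-pixel tier
print(round(required_ppi(8000, 10)))             # 8x10 in. original, 8000-pixel tier
```

These calls give 2822, 800, 600, and 800 PPI, which agree with the rounded values in the specifications above. Measure the image area only, excluding mounts and borders, before applying the calculation.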
Photographs: Reflective Originals (Prints)
Features of original:
Format range:
• 8x10 in. or smaller
Size range:
• Smaller than or equal to 80 square in.
Digital Master Image File:
File format:
• TIFF or lossless JPEG2000
Pixel array:
• 4000 pixels across long dimension of image area, excluding mounts and borders.
Resolution:
• Adjust the scan resolution to meet pixel array specifications, based on the format
of the original object: approximately 400 PPI for 8x10 in. originals, ranging up to
the appropriate resolution to produce the desired size file from smaller originals
(approximately 570 PPI for 5x7 in. and 800 PPI for 4x5 in. or 3.5x5 in. originals).
Bit depth:
• 8-bit grayscale mode for black-and-white; can be produced from a 16-bit grayscale file.
- or -
• 24-bit RGB mode for color and monochrome (e.g., albumen prints or other historic
print processes); can be produced from a 48-bit RGB file.

Features of original:
Format range:
• Equal to or larger than 8x10 in. and up to 11x14 in.
Size range:
• Equal to or larger than 80 square in. and up to 154 square in.
Digital Master Image File:
File format:
• TIFF or lossless JPEG2000
Pixel array:
• 6000 pixels across long dimension of image area, excluding mounts and borders.
Resolution:
• Adjust the scan resolution to meet pixel array specifications, based on the format
of the original object: approximately 600 PPI for originals approximately 8x10 in.,
ranging down to approximately 430 PPI for 11x14 in. originals.
Bit depth:
• 8-bit grayscale mode for black-and-white; can be produced from a 16-bit grayscale file.
- or -
• 24-bit RGB mode for color and monochrome (e.g., albumen prints or other historic
print processes); can be produced from a 48-bit RGB file.

Features of original:
Format range:
• Equal to or larger than 11x14 in.
Size range:
• Equal to or larger than 154 square in.
Digital Master Image File:
File format:
• TIFF or lossless JPEG2000
Pixel array:
• 8000 pixels across long dimension of image area, excluding mounts and borders.
Resolution:
• Adjust the scan resolution to meet pixel array specifications, based on the format
of the original object: approximately 570 PPI for originals approximately 11x14 in.,
ranging down to the appropriate resolution to produce the desired size file from
larger originals.
Bit depth:
• 8-bit grayscale mode for black-and-white; can be produced from a 16-bit grayscale file.
- or -
• 24-bit RGB mode for color and monochrome (e.g., albumen prints or other historic
print processes); can be produced from a 48-bit RGB file.
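The size ranges for prints can be read as a simple lookup from the area of the original to a pixel array, from which the scan resolution follows. The sketch below illustrates one way to express that lookup; the names are hypothetical. The size ranges above overlap at exactly 80 and 154 square in., so the sketch assumes that an original sitting exactly on a boundary is scanned at the lower (minimum) tier. That choice is an assumption, not a rule stated in this section.

```python
# (upper area limit in square inches, required pixels across the long dimension)
PRINT_TIERS = [(80, 4000), (154, 6000), (float("inf"), 8000)]


def print_scan_settings(width_in, height_in):
    """Return (pixel_array, ppi) for a reflective print with the given image-area size."""
    area = width_in * height_in
    long_side = max(width_in, height_in)
    for upper_limit, pixels in PRINT_TIERS:
        # Assumption: an original exactly on a boundary uses the lower (minimum) tier.
        if area <= upper_limit:
            return pixels, round(pixels / long_side)
    raise ValueError("dimensions must be numeric")


print(print_scan_settings(5, 7))    # small print
print(print_scan_settings(8, 10))   # boundary case, treated as the 4000-pixel tier
print(print_scan_settings(16, 20))  # large print
```

A 5x7 in. print falls in the 4000-pixel tier at about 571 PPI, matching the "approximately 570 PPI" figure above. When higher quality is warranted, a boundary-size original such as 8x10 in. can instead be scanned at the next tier (6000 pixels, 600 PPI).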
Aerials: Transmissive Originals (Film, Slides, and Negatives)
NOTE: If scans of aerial photography will be used for oversized reproduction, follow the
scanning recommendations for the next largest format (e.g., if your original is 70 mm wide,
follow the specifications for 127 mm wide roll film to achieve 8000 pixels across the long
dimension).

Features of original:
Format range:
• 70 mm wide and medium format roll film
Size range:
• Smaller than 10 square in.
Digital Master Image File:
File format:
• TIFF or lossless JPEG2000
Pixel array:
• 6000 pixels across long dimension of image area, excluding mounts and borders.
Resolution:
• Adjust the scan resolution to meet pixel array specifications, based on the format
of the original object: approximately 2700 PPI for 70 mm originals, ranging down to
the appropriate resolution to produce the desired size file from larger originals.
Bit depth:
• 8-bit grayscale mode for black-and-white; can be produced from a 16-bit grayscale file.
- or -
• 24-bit RGB mode for color and monochrome (stained negatives); can be produced from a
48-bit RGB file.

Features of original:
Format range:
• 127 mm wide roll film, 4x5 in. and up to 5x7 in. sheet film
Size range:
• Equal to or larger than 10 square in. and up to 35 square in.
Digital Master Image File:
File format:
• TIFF or lossless JPEG2000
Pixel array:
• 8000 pixels across long dimension of image area, excluding mounts and borders.
Resolution:
• Adjust the scan resolution to meet pixel array specifications, based on the format
of the original object: approximately 1600 PPI for 4x5 in. originals, ranging down to
approximately 1100 PPI for 5x7 in. originals.
Bit depth:
• 8-bit grayscale mode for black-and-white; can be produced from a 16-bit grayscale file.
- or -
• 24-bit RGB mode for color and monochrome (stained negatives); can be produced from a
48-bit RGB file.

Features of original:
Format range:
• Larger than 127 mm wide roll film and larger than 5x7 in. sheet film
Size range:
• Equal to or larger than 35 square in.
Digital Master Image File:
File format:
• TIFF or lossless JPEG2000
Pixel array:
• 10000 pixels across long dimension of image area, excluding mounts and borders.
Resolution:
• Adjust the scan resolution to meet pixel array specifications, based on the format
of the original object: approximately 2000 PPI for 5x5 in. originals, ranging down to
the appropriate resolution to produce the desired size file from larger originals.
Bit depth:
• 8-bit grayscale mode for black-and-white; can be produced from a 16-bit grayscale file.
- or -
• 24-bit RGB mode for color and monochrome (stained negatives); can be produced from a
48-bit RGB file.
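The oversized-reproduction NOTE for aerial film amounts to moving one tier up the pixel array scale. The following sketch shows that rule under stated assumptions: the function and constant names are hypothetical, an original of exactly 35 square in. is placed in the largest tier, and an original already in the largest tier stays at 10000 pixels because this section defines no higher tier.

```python
# Pixel arrays for aerial film, smallest format first (values from the specifications above).
AERIAL_FILM_PIXELS = [6000, 8000, 10000]


def aerial_film_pixels(area_sq_in, oversized_reproduction=False):
    """Return the pixel array (long dimension) for an aerial film original."""
    if area_sq_in < 10:
        tier = 0
    elif area_sq_in < 35:
        tier = 1
    else:
        # Assumption: exactly 35 square in. belongs to the largest tier.
        tier = 2
    if oversized_reproduction:
        # Follow the next largest format; capped at the top tier (an assumption).
        tier = min(tier + 1, len(AERIAL_FILM_PIXELS) - 1)
    return AERIAL_FILM_PIXELS[tier]


print(aerial_film_pixels(5))                               # 70 mm frame
print(aerial_film_pixels(5, oversized_reproduction=True))  # same frame, oversized output
```

The second call reproduces the example in the NOTE: a 70 mm original intended for oversized reproduction is scanned to 8000 pixels rather than 6000.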
Aerials: Reflective Originals
NOTE: If scans of aerial photography will be used for oversized reproduction, follow the
scanning recommendations for the next largest format (e.g., if your original is 8x10 in.,
follow the specifications for formats larger than 8x10 in. to achieve 6000 pixels across
the long dimension).

Features of original:
Format range:
• Smaller than 8x10 in.
Size range:
• Smaller than 80 square in.
Digital Master Image File:
File format:
• TIFF or lossless JPEG2000
Pixel array:
• 4000 pixels across long dimension of image area, excluding mounts and borders.
Resolution:
• Adjust the scan resolution to meet pixel array specifications, based on the format
of the original object: approximately 400 PPI for originals approximately 8x10 in.,
ranging up to the appropriate resolution to produce the desired size file from smaller
originals (approximately 570 PPI for 5x7 in. and 800 PPI for 4x5 in. originals).
Bit depth:
• 8-bit grayscale mode for black-and-white; can be produced from a 16-bit grayscale file.
- or -
• 24-bit RGB mode for color and monochrome (stained negatives); can be produced from a
48-bit RGB file.

Features of original:
Format range:
• Equal to or larger than 8x10 in. and up to 11x14 in.
Size range:
• Equal to or larger than 80 square in. and up to 154 square in.
Digital Master Image File:
File format:
• TIFF or lossless JPEG2000
Pixel array:
• 6000 pixels across long dimension of image area, excluding mounts and borders.
Resolution:
• Adjust the scan resolution to meet pixel array specifications, based on the format
of the original object: approximately 600 PPI for 8x10 in. originals, ranging down to
approximately 430 PPI for 11x14 in. originals.
Bit depth:
• 8-bit grayscale mode for black-and-white; can be produced from a 16-bit grayscale file.
- or -
• 24-bit RGB mode for color and monochrome (stained negatives); can be produced from a
48-bit RGB file.

Features of original:
Format range:
• Equal to or larger than 11x14 in.
Size range:
• Equal to or larger than 154 square in.
Digital Master Image File:
File format:
• TIFF or lossless JPEG2000
Pixel array:
• 8000 pixels across long dimension of image area, excluding mounts and borders.
Resolution:
• Adjust the scan resolution to meet pixel array specifications, based on the format
of the original object: approximately 570 PPI for 11x14 in. originals, ranging down to
the appropriate resolution to produce the desired size file from larger originals.
Bit depth:
• 8-bit grayscale mode for black-and-white; can be produced from a 16-bit grayscale file.
- or -
• 24-bit RGB mode for color and monochrome (stained negatives); can be produced from a
48-bit RGB file.
3.3 Minimum image quality requirements for digital access and thumbnail image files
Access images
File format:
• JPEG (medium to high quality compression, sRGB
profile for color and Gray Gamma 2.2 profile for
monochrome) or JPEG2000 (lossy).
Pixel array:
• 800-3000 pixels across long dimension.
Resolution and bit depth:
• 8-bit grayscale or 24-bit color: 72-200 PPI
NOTE: In creating access images, scanned images should have Unsharp Mask applied to them
in Photoshop. The following settings are recommended:
• Amount: 100% to 200%
• Radius: 1 to 2 pixels
• Threshold: 2 to 8 levels
Thumbnail images
File format:
• GIF (adaptive/perceptual palette, diffusion/noise
dither).
Pixel array:
• GIF images should fit within a boundary of 150-200 pixels across each dimension
(200 pixels preferred).
Resolution and bit depth:
• GIF images should be 4-bit grayscale or 8-bit color, at 72 PPI.
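Access and thumbnail derivatives are specified by pixel dimensions rather than scan resolution, so the target size can be computed directly from the master's pixel array. The sketch below shows only that calculation, assuming a hypothetical helper name; it does not perform the resampling, sharpening, or format conversion, which would be done in the image-editing software.

```python
def fit_within(width_px, height_px, max_long_px):
    """Scale dimensions down so the long side is at most max_long_px, keeping aspect ratio."""
    long_side = max(width_px, height_px)
    if long_side <= max_long_px:
        return width_px, height_px  # never enlarge a smaller image
    new_width = max(1, round(width_px * max_long_px / long_side))
    new_height = max(1, round(height_px * max_long_px / long_side))
    return new_width, new_height


master = (4000, 3200)  # hypothetical master pixel dimensions
print(fit_within(*master, 1200))  # access image within the 800-3000 pixel range
print(fit_within(*master, 200))   # thumbnail within the preferred 200-pixel boundary
```

For the example master this gives 1200 x 960 pixels for the access image and 200 x 160 pixels for the thumbnail.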
3.4 Additional resources
• NARA Technical Guidelines for Digitizing Archival Materials for Electronic Access:
Creation of Production Master Files - Raster Images
(http://www.archives.gov/preservation/technical/guidelines.html)
• California Digital Library Guidelines for Digital Images
(http://www.cdlib.org/inside/diglib/guidelines/bpgimages/cdl_gdi_v2.pdf)
• Moving Theory into Practice: Digital Imaging Tutorial (Cornell)
(http://www.library.cornell.edu/preservation/tutorial/contents.html)
• University of Maryland Best Practice Guidelines for Digital Collections
(http://www.lib.umd.edu/dcr/publications/best_practice.pdf)
• North Carolina ECHO Project Digitization Guidelines
(http://www.ncecho.org/dig/digguidelines.shtml)