Federal Digitization: Moving to Common Guidelines
The U.S. Federal Agencies Digitization Guidelines Initiative (FADGI)
http://www.digitizationguidelines.gov/
PASIG, May 24, 2013
Carl Fleischhauer
cfle@loc.gov
Steve Puglia
spug@loc.gov
Library of Congress
Washington, DC
18 Participating Agencies
http://www.digitizationguidelines.gov/participants/
Often participating, not “official”: NASA, NOAA, National Museum of
Health and Medicine (U.S. Army), U.S. Supreme Court
http://www.digitizationguidelines.gov/stillimages/
http://www.digitizationguidelines.gov/audio-visual/
Guidelines
• Conceptual framework documents
– Content Categories & Digitization Objectives (still image
reproduction; September 3, 2009)
– Digitization Activities – Project Planning (November 4, 2009)
• Capture device performance
– Digital Imaging Framework (high level about scanner
performance metrics; April 2, 2009)
– Audio Analog-to-Digital Converter Performance (August 20,
2012)
– Audio Interstitial Errors (about unwanted dropouts or sample
distortion; work in progress, 2012-13)
• Broad practices guidelines
– Technical Guidelines for the Still Image Digitization of Cultural
Heritage Materials (Many segments from 2004 NARA document;
FADGI update, August 24, 2010)
Guidelines
• Metadata including embedded data and file headers
– TIFF Image Header Metadata (February 10, 2009)
– Minimal Descriptive Embedded Metadata in Digital Still
Images (Smithsonian document embraced by group; March 23,
2012)
– Embedding Metadata in Broadcast WAVE Files, Version 2
(April 23, 2012)
• Associated tool on SourceForge: BWF MetaEdit
– NARA reVTMD video technical metadata (February 2012;
FADGI supporting role)
• Associated tool on GitHub: AVI MetaEdit
• Format analysis and guidelines
– File Format Comparisons (comparing still image and video
formats; under development in 2013)
– MXF Preservation Video Formatting Application Specification
(under development during 2013 in cooperation with AMWA
trade group; versions posted in 2010 and 2012)
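As a rough illustration of what "embedded metadata" means for the Broadcast WAVE item above, here is a minimal sketch that locates the `bext` chunk in a WAVE file and reads its first few fixed-width fields. The field order and widths follow my reading of the EBU Broadcast Wave layout (256-byte Description, 32-byte Originator, and so on) and should be checked against the FADGI guideline and the EBU specification. The function names `find_bext` and `parse_bext` are hypothetical, and this is not how BWF MetaEdit itself is implemented.

```python
import struct

def find_bext(riff_bytes):
    """Return the payload of the first 'bext' chunk in RIFF/WAVE bytes, or None."""
    if riff_bytes[:4] != b"RIFF" or riff_bytes[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(riff_bytes):
        chunk_id = riff_bytes[pos:pos + 4]
        (size,) = struct.unpack("<I", riff_bytes[pos + 4:pos + 8])
        if chunk_id == b"bext":
            return riff_bytes[pos + 8:pos + 8 + size]
        pos += 8 + size + (size & 1)  # RIFF chunks are padded to an even length
    return None

def _text(raw):
    """Decode a NUL-padded fixed-width ASCII field."""
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")

def parse_bext(payload):
    """Read the leading fixed-width fields of a bext payload (assumed layout)."""
    desc, originator, orig_ref, date, time_ = struct.unpack(
        "<256s32s32s10s8s", payload[:338])
    ref_low, ref_high, version = struct.unpack("<IIH", payload[338:348])
    return {
        "description": _text(desc),
        "originator": _text(originator),
        "originator_reference": _text(orig_ref),
        "origination_date": _text(date),
        "origination_time": _text(time_),
        # 64-bit sample count since midnight, stored as two 32-bit halves
        "time_reference": (ref_high << 32) | ref_low,
        "version": version,
    }
```

To use it on a real file, read the file as bytes (`open(path, "rb").read()`), pass the result to `find_bext`, and hand any non-`None` payload to `parse_bext`.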
Still Image
Illustrative Example
Odds and ends about still images
Still image specifications – this is what we all “used to do”
• color/monochromatic
• pixel density (good old “dpi”)
• bit depth
• . . . usually output-referred
We want to move toward more, um, “scientific” specifications
[Slide graphic: performance categories (tone, resolution, color, white balance, uniformity, noise) with example metrics: gamma, luminance, Delta E2000, Delta E(a*b*)2000, Spatial Frequency Response (SFR), sampling efficiency, sampling frequency, channel mis-registration, % lighting non-uniformity, total rms deviation]
From this document:
http://www.digitizationguidelines.gov/guidelines/DIFfinal.pdf
Resolution rethink:
new terms, scanner performance
• SAMPLING RATE
• SPATIAL RESOLUTION
– Spatial Frequency Response (SFR)
• SAMPLING EFFICIENCY
Thanks to Barry Wheeler for his very helpful Signal blogs:
http://blogs.loc.gov/digitalpreservation/2012/12/what-resolution-should-i-use-part-1/
http://blogs.loc.gov/digitalpreservation/2013/01/what-resolution-should-i-use-part-2/
http://blogs.loc.gov/digitalpreservation/2013/03/what-resolution-should-i-use-part-3/
Resolution rethink:
new terms, scanner performance
• SAMPLING RATE. Usually, the scanner’s ppi number is
its sampling rate
– Sensors can only attempt to measure (sample) the brightness at
each point.
– Some light may scatter and miss the sensor, the scanner’s motor
step may not be sufficiently precise, or the collected value may
be inaccurate. Inside every scanner or camera, between the
sensor and the screen is a small, highly specialized computer
called a digital signal processor. This processor must work very
hard to link a dot on the page to a dot on the screen.
• RESOLUTION. ISO standards (e.g., ISO 12233) define
resolution in terms of Spatial Frequency Response
(SFR) -- the actual result on the screen.
• SAMPLING EFFICIENCY. How closely the resolution
actually delivered matches the nominal pixel count,
expressed as a percentage.
From the revised guideline
http://www.digitizationguidelines.gov/guidelines/FADGI_Still_Image-Tech_Guidelines_2010-08-24.pdf
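A small worked example may help with the sampling efficiency term. The sketch below assumes efficiency is read as the resolution a device actually delivers (as found by an SFR measurement) divided by its nominal sampling rate. The function name and the numbers are hypothetical, not taken from the guideline.

```python
def sampling_efficiency(nominal_ppi, resolved_ppi):
    """Percentage of the nominal sampling rate that the device actually resolves."""
    if nominal_ppi <= 0:
        raise ValueError("nominal_ppi must be positive")
    return 100.0 * resolved_ppi / nominal_ppi

# Hypothetical scanner: set to 400 ppi, but an SFR measurement suggests
# it only resolves detail equivalent to 320 ppi.
print(sampling_efficiency(400, 320))  # 80.0
```

On this reading, two scanners with the same ppi setting can differ widely in what they actually resolve, which is why the pixel count alone is a weak specification.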
Tools to Support
Image Performance Measurement
• Digital Image Conformance
Evaluation (DICE) System
– Device Target – Imaging Device Performance
– Object Target – Actual Image Quality
– Software for Evaluation/Validation
• Based in LabVIEW
• Data export for use in SQC/SPC
Device and Object Targets
Object target as
positioned for use
DICE Software – Main Panel
DICE – QC Summary Panel
(Slide from an old version of the software)
DICE – OECF detail page
DICE – SFR detail page
Audio-Visual
Illustrative Example
MXF format specification for
reformatted video
Library of Congress Packard Campus, Culpeper
National Archives, College Park
Smithsonian Institution Archives
SAMMA from Front Porch Digital
Implementations
• SAMMA at LC: Lossless compressed
– Each frame is a JPEG 2000 image
– Lossless (reversible) transform
• Emergent variants
– NARA and other archives prefer uncompressed video
– Other devices come on the market, e.g., from
OpenCube (Belgium), Amberfin (UK), Cube-Tec
(Germany), and others in process (e.g., Archimedia)
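To show what "lossless (reversible)" means in practice, here is a minimal one-dimensional sketch of the integer 5/3 lifting steps that JPEG 2000 Part 1 uses for its reversible wavelet. It handles one decomposition level on an even-length row of samples. It is an illustration only, not SAMMA's or any encoder's actual code, and the function names are hypothetical.

```python
def forward_53(x):
    """One level of integer 5/3 lifting: returns (low-pass s, high-pass d)."""
    if len(x) < 2 or len(x) % 2:
        raise ValueError("expects an even number of samples (at least 2)")
    even, odd = x[0::2], x[1::2]
    half = len(even)
    # Predict step: each odd sample minus the floor-average of its even neighbours.
    # The last neighbour is mirrored at the right edge (symmetric extension).
    d = [odd[i] - (even[i] + even[min(i + 1, half - 1)]) // 2
         for i in range(half)]
    # Update step: each even sample plus a rounded quarter of the adjacent details.
    # The first detail is mirrored at the left edge.
    s = [even[i] + (d[max(i - 1, 0)] + d[i] + 2) // 4
         for i in range(half)]
    return s, d

def inverse_53(s, d):
    """Undo forward_53 exactly, using the same integer arithmetic in reverse."""
    half = len(s)
    even = [s[i] - (d[max(i - 1, 0)] + d[i] + 2) // 4 for i in range(half)]
    odd = [d[i] + (even[i] + even[min(i + 1, half - 1)]) // 2
           for i in range(half)]
    out = []
    for e, o in zip(even, odd):
        out.extend([e, o])
    return out

row = [10, 12, 14, 200, 3, 7, 7, 90]
low, high = forward_53(row)
assert inverse_53(low, high) == row  # bit-exact round trip
```

Because every step adds or subtracts an integer that the decoder can recompute, the original samples come back exactly, which is the property a preservation copy depends on.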
Standards-based format elements
from SMPTE and ISO/IEC
• MXF (SMPTE ST 377 and many more)
• Standard definition uncompressed covered in
ST 377 and also SMPTE ST 384
• JPEG 2000 encoding (ISO/IEC 15444-1)
• JPEG 2000 mapped to MXF (SMPTE ST 422)
• Other standards also play a role, most from
SMPTE, some from EBU
Loose Ends
• MXF, JPEG 2000, and even
“uncompressed” video are complex
standards
• Entities that “conform” to the standards
can be formatted in various ways
– We have some elements that we want to
include in order to produce an “authentic
copy”
– MXF “carriage” can be tricky to sort out
MXF Application Specification
• An MXF AS is what some would call a profile
• Pin down preferred options, reduce the
variables
• Support greater interoperability
• Increase the comfort level for users
• Increase vendor competition
• More adoption means better sustainability
Timecode
• Source recordings may have multiple
timecodes (VITC, LTC, etc.), some on
purpose, some by accident; all may provide
forensic help for future researchers.
• Specify preferred practice for retaining and
tagging multiple timecodes in the file
Audio tracks
• Source may have multiple tracks
• MXF audio track specifications cover
“listing” or “allocation” (tagging) and other
matters of terminology; we need to pin these
down
Metadata
• Basic tech metadata is not an issue
• Needed: specified options for embedding
additional technical metadata:
– process (like METS digiprov)
– about the source item
– about quality review outcomes
– preservation (like PREMIS)
• And some descriptive metadata
– Schools of thought: some prefer minimal data (“just
an identifier”), others would dump everything they
have; the specification should permit a range of actions:
“archivist’s choice”
Closed captioning, subtitles,
ancillary data
• US broadcast standards embed CC as binary data
– “In the image raster” on line 21
– For digital TV, CC also in packets in MPEG stream
– Awkward for future extraction, depends upon availability of decoding
tools
• Desiderata
– Put CC/subtitles in the file for easier access and extraction
– XML rather than binary
– Alas, MXF offers “too many” options for this; we seek to pin down the
best ones
• By extension, this also applies to other ancillary data.
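The case for "XML rather than binary" can be sketched in a few lines. The example below writes caption cues to a plain XML string and reads the text back with a standard parser and no caption-specific decoder. The element and attribute names (`captions`, `cue`, `start`, `end`) are hypothetical and are not the options the application specification will settle on.

```python
import xml.etree.ElementTree as ET

def captions_to_xml(cues):
    """Serialize (start, end, text) caption cues to a simple XML string."""
    root = ET.Element("captions")
    for start, end, text in cues:
        cue = ET.SubElement(root, "cue", start=start, end=end)
        cue.text = text
    return ET.tostring(root, encoding="unicode")

def caption_text(xml_string):
    """Extract the caption text, in order, from the XML produced above."""
    return [cue.text for cue in ET.fromstring(xml_string).iter("cue")]

cues = [
    ("00:00:01.000", "00:00:03.000", "Good evening."),
    ("00:00:03.500", "00:00:06.000", "Here is the news."),
]
xml_string = captions_to_xml(cues)
print(caption_text(xml_string))
```

A future researcher with only a generic XML parser could still get at the words, which is the point of the desideratum.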
An MXF Application Specification is . . .
• A formal industry statement
– Not a “standard”
• Accompanied by a reference
implementation and validation tools
MXF Application Specifications
come from . . .
• Advanced Media Workflow Association (AMWA)
– Broadcast-industry group
– AMWA Application Specifications include:
• AS-10 for production – version for end-to-end digital
production workflow (forthcoming)
• AS-11 for contribution – the high end version contributed by a
producer to a television network (published)
• AS-03 for delivery – the reduced-data version “sent to the
tower for broadcast” (published)
– AS-07 for archiving and preservation will be a sibling
to those
– http://www.amwa.tv/projects/AMWA_AS_overview 04-2013 web.pdf
Role of AMWA
• Key roles played by Turner Broadcasting
veterans and engineering staff
• Members include AVID, BBC, Front Porch
Digital (SAMMA), NARA, PBS, SONY,
Discovery Communications, Fox, NBC
Universal, and more
• http://www.amwa.tv/
• Break into technical committees to push
draft specifications
FADGI’s AMWA status
• March 2012
– AMWA business committee approval to move ahead
– Designate as AS-07
• September 2012
– Technical committee approval
• November 2012
– Team meetings began
• Early 2013
– Churning along
• End of 2013
– Dream of a first draft or better
http://www.digitizationguidelines.gov/
Carl Fleischhauer cfle@loc.gov