Bonus material A
Digital Video Recording
Chapter One defined digital video recording as the process of converting analog video signals
and analog sound signals (i.e., analog-to-digital or A-to-D conversion) into a stream of
discrete binary numbers that represent and correspond to changes in chroma and luminance
values (for video) or dynamic air pressure (for audio) over time. However, more so now than
in the latter part of the 1990s, digital video and audio signals are often already
generated “natively” from the source, removing the added steps of external A-to-D
conversion.
Full-bandwidth, standardized digital video and audio have been prevalent inside many
facilities as linear digital signals, usually carried as SMPTE 259 for standard definition or
SMPTE 292 for high definition, and AES3id (or simply AES/EBU audio) for quite some
time. Full-bandwidth digital video and audio, streamed from cameras, videotape transports,
production equipment, and video storage devices (e.g., video servers), is now well accepted
in the professional video production and broadcast market space. Compressed digital video
formats, such as MPEG-2, MPEG-4, and JPEG 2000, have reached even further than full-bandwidth
digital video because of their file-based nature and their ability to be carried over
conventional networking infrastructures.
Video signal structures beyond the traditional high-definition formats adopted by the
Advanced Television Systems Committee (ATSC) for broadcast applications, identified as 1080
interlaced (1080i) or 720 progressive (720p), are gaining momentum. Signals such as 1080
progressive (1080p60, 1080p24, etc.), carried at 3 Gbits/second per SMPTE 424 and SMPTE 425,
are emerging, not to mention the continual evolution of compression formats at various sample
data rates and structures.
As an introductory foundation to the rest of this book, this chapter looks at the origins of
component analog and digital videotape recording and provides a preface to the concepts of
file-based and digital storage technologies for the moving media industry.
KEY CHAPTER POINTS
• Video coding, moving from analog to digital formats, the early formats
• Component and composite analog video formats
• Disk-based recording and storage in magnetic and optical formats
• Component and composite digital video formats for videotape and file-based
recording, the standardized D-formats, and beyond are explained according to
SMPTE standards and their marketplace trade names
Analog to Digital Video Coding
Coding digital representations of analog video may involve either a standards-based approach
or a proprietary approach. For most of the accepted standards-based methods, color
difference coding (Y′CBCR) is the predominant method. This coding process involves taking
samples of the luma (luminance or brightness) signals and the chroma (color difference)
signals and then properly scaling and interleaving them to create a set of digital values
that corresponds to the prescribed encoding methodology.
Two goals are targeted during the digital conversion process (which is not necessarily
related to the video compression process). The first is to reduce the amount of data required
to represent the image, which in turn reduces the transmission bandwidth (bit rate) and the
storage requirements. The second is to encode the digital representations into a format that
can be transported, interchanged, and properly utilized on a standardized variety of playback
or picture manipulation devices.
Component Video Nuances
When the casual video user speaks of digital video, they probably do not recognize, or
necessarily care about, the various color difference terminologies. The video user often
universally references all analog component signals to the YPBPR scheme most utilized in the
connector designations for consumer DVD players, high-definition displays, or video
projectors. Unfortunately, when looking at the physical connections to such devices, one may
find these signal channels mistakenly labeled with other color difference terminologies such
as YUV, Y/R-Y/B-Y, or Y/B-Y/R-Y. While these signal sets are somewhat similar, their
relationships do not carry over directly to the processes of component video digitization.
As the technologies to capture component analog video signals were developing, variations
of component video began to materialize as manufacturers sought to provide the highest
quality recording and interchange methodologies. Manufacturers sought to have the industry
recognize “their” product offerings as superior to the competitors’, so each primary
manufacturer produced slightly different systems for its particular product line.
These varying systems, some standardized and others more de facto in nature, emerged and
became accepted by certain sets of users as the television industry began its gradual migration
away from the film-based medium.
MII and Betacam Formats
Two of the more recognized analog component variations prominent during the mid-evolution
of professional cassette-based videotape came from Panasonic and Sony. For analog video, the
two successful, yet competing, formats were the MII format (from Matsushita/Panasonic) and
the Betacam format (from Sony). Both systems were built around a half-inch-wide videotape
cassette. These two systems used different offset voltage levels for the color difference
analog channels, requiring different scaling when they were converted to any form of
component digital video signal.
Neither the MII nor the Betacam system ever directly recorded a digital signal onto
videotape. Any analog-to-digital conversion was, for the most part, accomplished by external
A-to-D devices or by the receiving video recording device or live vision mixer/video
production switcher. Care in the electrical setup of the television equipment, per the proper
signal format, was always critical.
Composite or Component Signals
The principal method of video signal interchange was as composite analog (NTSC or PAL)
video on a single coaxial cable or as a component color difference video signal (Y, R-Y,
B-Y). When dubbing from one Betacam transport to another, Sony offered a component “dub
cable” that carried a multiplexed component signal between the two devices.
Betacam and MII formats could not be electrically interchanged, even at the component
analog video level, without additional scaling or signal processing equipment.
Both of these analog component videotape concepts eventually paved the way for the
emergence of digital videotape recording and the later standardization of the digital
encoding parameters that led to the development of the D-1, D-2, and D-3 digital videotape
formats.
Encoding per Recommendation 601
ITU-R BT.601, commonly known as Rec. 601 or BT.601, is the encoding standard published by
the International Telecommunication Union Radiocommunication Sector (ITU-R), formerly known
as the CCIR, for encoding interlaced analog video signals into a digital form. This standard
sets the methods for encoding 525-line 60 Hz and 625-line 50 Hz signals, both with 720
luminance samples and 360 chrominance samples per line.
From the technical side, the digitization process takes the already scaled nonlinear Y′CBCR
analog signal sets and samples (quantizes) them into data sets that are stored in the order
Y1:CB:Y2:CR, whose chrominance samples are co-sited with the first luminance sample. This
signal can be regarded as if it were a digitally encoded analog component video signal and
thus must include data for the horizontal and vertical sync and blanking intervals. Regardless
of the frame rate, the luminance (i.e., luma) sampling frequency is 13.5 MHz at a minimum of
8 bits per sample, with the chrominance (i.e., chroma) signals each sampled at half that rate
(6.75 MHz).
The first version of Rec. 601 defined only a parallel interface, but later versions introduced
the bit-serial family of serial digital video interfaces that are now commonly used in digital
television and postproduction facilities. The 8-bit serial protocol, with a data rate of 216
Mbits/second, was once used in the D-1 format of component digital videotape recording. More
current standards use an encoding table to expand the data to 9 or 10 bits for improved
performance over long transmission lines. The 9-bit serial version has a data rate of 243
Mbits/second. By far, the most familiar version of the interface is the 10-bit serial digital
interface, more commonly known simply as SDI, which was standardized as SMPTE 259M. SDI is
now the ubiquitous interconnect standard for professional video equipment operating on
standard-definition digital video.
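As an illustrative cross-check (an arithmetic sketch, not drawn from the standards documents
themselves), these interface rates follow directly from the 13.5 MHz luma clock and the two
6.75 MHz chroma clocks of the 4:2:2 structure:

    # Illustrative arithmetic for the Rec. 601 serial interface rates above.
    # Assumes 4:2:2 sampling: luma at 13.5 MHz plus two chroma channels at
    # 6.75 MHz each, i.e., 27 million sample words per second.

    luma_hz = 13.5e6
    chroma_hz = 6.75e6
    words_per_second = luma_hz + 2 * chroma_hz   # 27e6 words/second

    for bits in (8, 9, 10):
        rate_mbps = words_per_second * bits / 1e6
        print(f"{bits}-bit serial interface: {rate_mbps:.0f} Mbits/second")

    # 8-bit  -> 216 Mbits/second (the early D-1 era serial protocol)
    # 9-bit  -> 243 Mbits/second
    # 10-bit -> 270 Mbits/second (SMPTE 259M SDI for component video)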
In Rec. 601, for each 8-bit luma sample, the value 16 is used to represent the video black
level and 235 is used to represent the peak video white level. The values 0 and 255 are used
for sync encoding only, so no video information is conveyed at these levels. The CB and CR
samples center around the value 128, which designates an encoded “zero value.” Excursions
above and below 128 represent chroma sample sets that, when mathematically merged with the
luma (Y′) values, make up the set of Y1:CB:Y2:CR digital video values.
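A minimal sketch of this 8-bit quantization, assuming normalized inputs (luma Y′ in the range
0.0 to 1.0 and color difference values in the range -0.5 to +0.5; the scale factors 219 and
224 are the standard Rec. 601 excursions between the levels quoted above):

    def luma_code(y: float) -> int:
        """Map Y' in [0, 1] to the 16 (black) .. 235 (peak white) code range."""
        return round(16 + 219 * y)

    def chroma_code(c: float) -> int:
        """Map a color difference value in [-0.5, +0.5] to codes centered on 128."""
        return round(128 + 224 * c)

    print(luma_code(0.0), luma_code(1.0))                          # 16 235
    print(chroma_code(-0.5), chroma_code(0.0), chroma_code(0.5))   # 16 128 240
    # Codes 0 and 255 never appear as video; they are reserved for sync encoding.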
This standard, including its video raster format, has been reused in a number of later
standards, including MPEG. The SDI SMPTE 259M and the ITU-R BT.601 standards address
different subjects but are closely related in defining modern digital video encoding and
transport methodologies.
Video Disk Recording
As this book focuses mainly on the storage of digital information for spinning media
platforms, a bit of history on the topic of professional digital disk recording is warranted
to set the stage.
For the broadcast industry, the video disk era began in the late 1960s with the Ampex HS-100
system. This product marked the entry point for slow motion and instant replay of recorded
video images. The disk recorder was used mainly for live sports, becoming a household word
among broadcasters. By the end of 1976, broadcasting television sports without slow-motion
replay was nearly unimaginable.
The disk-based device featured not just “slo-mo” (slow motion) but also freeze frames and
reverse motion. Ampex’s model HS-100C (1976), although only a few years old, touted some
exceptional technical features, including a derivative of what would become the Ampex
AVR-series digital time-base corrector. Ampex advertisements described a subsystem that
automatically lifted the heads for disk-surface protection purposes. Operational features
included fast search and a twice-normal-speed playback. Search speed reached a viewable 4.5
times the normal speed.
Considered a portable unit, the HS-100 consisted of three separate units: a disk servo, an
electronics unit, and an output-processing unit. Each unit weighed between 145 and 177
pounds and had a power requirement of 120 V at 20 A. The signal electronics, similar to the
Ampex AVR-2 videotape transport, were already proven worldwide; however, its digital
frame buffer, capable of holding an entire frame of video, was not yet ready for prime time.
Functionally, reproducing a still image required that the heads of the disk recorder
constantly step or retrace over the same physical space on the drive surfaces. One of the
drawbacks to the disk recorder when constantly used in the mobile environment was that
intensive routine maintenance procedures were sometimes required to keep the stepper motors
and end-stop adjustments aligned. Physical damage from head or surface contact was always a
risk, and there was always a possibility of damage due to dirt contamination.
Overall, the Ampex HS-100, with its ability to store up to 1800 fields in NTSC (1496 fields
in PAL), using analog recording and employing four steppers to cover the four surfaces on two
rotating metal disks, was an amazing product. It would not be until optical disc storage
systems were commercialized in the late 1970s that there would be an evolutionary shift
toward producing video on anything other than linear videotape, whether analog or digital in
structure.
Optical Disc Platform
Optical video disc technologies were among the earliest to capitalize on the principles of
video components to produce a removable and transportable storage medium. The videodisc,
originally demonstrated in 1928 by John Logie Baird, evolved from its wax base to polymer
bases, becoming a means of delivering a good-quality color analog video signal. By 1978,
Philips and Pioneer had introduced the videodisc into the commercial landscape. The laser
disc, invented by David Paul Gregg in 1958 (later patented in 1961 and 1990), would become a
popular product but was essentially displaced, first by the VHS cassette and eventually by
the DVD, in North American retail marketplaces.
Optical recording media formats such as the CD-ROM (compact disc read-only memory) and
DVD (known as both the “digital versatile disc” and the “digital video disc”), along with the
high-definition versions, the former HD DVD and the adopted Blu-ray, are examples of optical
disc–based digital video recording technologies that are widely used in the consumer
marketplace. Although the laser disc had only minimal use in the professional broadcast
industry, its feature set offered both a removable recordable medium and a frame-by-frame
optical recording capability that found applications, including interactive content, and
became a significant shaping factor in the future of digital video media.
Digital Video
Early on, when you spoke to computer hacks about “digital video,” they thought mostly from
the perspective of streaming video on a computer or network. Computer folks didn’t seem to
understand a thing about D-1, D-2, D-3, or D-5. The very existence of ITU-R BT.601 (also
known as “601” or “D-1”) was probably so far from their minds that if you told a computer
hack you could put 270 Mbits/second video onto any video recording medium, they’d probably
just laugh. In the computer domain, digital video had no differing encoding formats; it only
had different file formats.
On the other hand, if you spoke to a broadcast professional about the same, you’d get a wide
variety of equally confusing views on digital video. The terminologies associated with
digital video encoding began as early as the introduction of the D-2 and D-1 videotape
transports. Today, it goes without saying that the industry is beginning to feel the impact
of digital video in as many forms as the computer industry.
Despite the fact that computer video made great strides in the past decade, broadcast digital
video has made even greater ones. As history unfolds, we will eventually forget that the early
recording formats of digital tape were understood by only a handful of brave pioneers.
Strangely enough, even today, with multimedia, video servers, and a plethora of file-based
disk recording methodologies, confusion between the component digital video (D-1) and
composite digital video (D-2) formats remains, beginning with the root distinction between
the tape format and the signal format.
Composite versus Component Digital Video
Prior to the introduction of the current ATSC (digital terrestrial television broadcasting)
standards, digital encoders and decoders were designed around low-pass filters that addressed
the analog television transmitter’s requirements more than the capabilities of the developing
all-digital video physical plants. Before digital videotape, the market for the distribution
of a final product was constrained to a quality level benchmarked by either the 1-inch
C-format analog or the ½-in. Betacam tape specifications. Digital video changed that
complexion universally.
Production Effects
Throughout the 1990s, the levels of production and effects work continued to increase.
Digital technologies in video allowed for the multilayering and compositing capabilities that
became the production norm. Digital videotape recording ended the issues of those
ever-familiar dropouts, at least for the production process. Industry competition segregated
the lower cost composite digital video (D-2) systems from the more expensive component
digital video hardware required for D-1. The latter format allowed for the production of very
high-quality, multigenerational imaging that eliminated the NTSC footprint found in the
composite D-2 format.
Composite Digital Recording Formats
The need for a digital recording platform that could “drop in” to a broadcast or
postproduction facility was the impetus for the development of two composite video formats
that would operate in an analog domain yet record a digital signal on their tape medium.
D-2 Format
The composite digital D-2 format, created by Ampex, is a professional digital videotape
product line, standardized by SMPTE and introduced at the 1988 National Association of
Broadcasters (NAB) convention and trade exhibition. The D-2 format was intended to be a
lower cost alternative to the component digital D-1 format introduced by Sony.
Like the D-1 format, D-2 video is uncompressed. Bandwidth is conserved by sampling a fully
encoded NTSC or PAL composite video signal at a data rate of 143 Mbits/second.
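As a rough, illustrative check of the NTSC figure (an assumption-based sketch, not taken
from the SMPTE documents): sampling at four times the color subcarrier, carried as 10-bit
words on the composite serial interface, reproduces the 143 Mbits/second rate quoted above.

    ntsc_subcarrier_hz = 3.579545e6          # NTSC color subcarrier (fsc)
    sample_rate_hz = 4 * ntsc_subcarrier_hz  # 4fsc, about 14.318 MHz
    bits_per_word = 10                       # 10-bit words on the serial interface

    rate_mbps = sample_rate_hz * bits_per_word / 1e6
    print(f"Composite serial rate: {rate_mbps:.1f} Mbits/second")  # ~143.2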
Most of the broadcast television and postproduction companies did not utilize distributed
component analog video formats at the time that D-1 was first introduced. Ampex saw the D-2
as a practical solution for broadcasters since it could be dropped into existing broadcast
chains and studio facilities (as a replacement for the analog C-format one-inch open-reel VTR
transports) without extensive redesign or modifications. D-2 composite digital tape transports
accepted the standard RS-170A analog inputs and outputs.
Four audio channels are available for recording and editing, as well as an analog cue
channel. The Ampex D-2 was the first digital tape format to provide the “read-before-write”
feature, known as “preread” on Sony recorders.
Using frame buffering technologies, read-before-write allowed simultaneous playback and
recording on the same VTR. In production, a graphic title (from an external video keyer)
could be superimposed over existing video that was already recorded on the same videotape.
By playing back the videotape through a production switcher with effects capabilities, one
could add the title and then record the image back onto the tape in the same location on the
tape. This eliminated the requirement for a dedicated playback machine and a second,
dedicated recorder. It further could save considerable time in linear editing, but it had a
serious drawback. Should the external graphic or effect be undesirable (e.g., it had an
unrecognized typo), the original source video recording was lost, and the new, incorrect
video permanently replaced that previous source image.
D-2 (and D-1) digital recorders were first introduced with parallel wired interconnections
for use with a composite (or component) digital production switcher. At that time, the serial
digital interface (SDI) standard had not been ratified, nor were there any commercially
available interface options for either of the recorder formats.
D-2 used a 19 mm (approximately ¾ inch) metal particle tape loaded into three
different-sized cassettes. The nominal tape running speed is 131.7 mm/second. PCM-encoded
audio and linear timecode are also recorded on the tape.
The D-2 tapes are similar in appearance to D-1 tapes; however, they are not interchangeable,
as the two recording formats are entirely different. The Ampex D-2 tape transports are
extremely fast, with a high-speed search at 60 times playback speed producing a recognizable
color picture. Three hours of videotape could be scanned in around just three minutes.
Many production houses adopted the D-2 format, but the broadcasters had firmly planted
their heels in the 1-in. tape format and found value in moving to digital only if it provided
additional revenue. With the electronic news gathering marketplace heating up and no need to
move from Betacam SP to another format, the bets were off that D-2 would remain a mainstay
videotape format for quite some time.
D-2 Downfalls
Over time, the use of composite digital videotape (D-2) proved to be less than satisfactory
for high-end effects and layering. For the D-2 format, or composite digital, it was shown that
after only a few copies (generations) of the same recording, usually between six and eight,
the artifacts from sampling and resampling of the same essential NTSC (4fsc) signal became
quite evident.
The quantizing structure of the D-2 format resulted in an inconsistent scaling of black
levels, in part because there was no defined bit boundary level for the 7.5 IRE setup level.
Chroma levels were still difficult to manage, and most facilities had only analog interfaces
to other equipment, making the benefits of digital transports of little value. These
shortcomings added to the multigenerational degradation handicap at almost as serious a level
as that found in poorly maintained oxide Betacam transports.
D-2 had a relatively brief heyday except in production. By 2003, only a handful of
broadcasters were using the D-2 format, mostly in commercial production environments or in
the Sony LMS (library management system) where the digital video provided an advantage
over the analog Betacam SP format.
D-3 Format
The D-3 format, developed first by NHK in Japan, captured uncompressed composite digital
video in 525-line (standard definition) form onto linear videotape. The format was first
introduced commercially by Panasonic in 1991 to compete with Ampex’s D-2 video format. D-3
uses half-inch metal particle tape running at 83.88 mm/second.
Similar to the D-2 (NTSC) format, in the D-3 format the video signal is sampled at four
times the color subcarrier frequency at 8 bits per sample. Four channels of 48-kHz, 16- to
20-bit PCM audio, plus other ancillary data, are inserted during the vertical blanking
interval. The aggregate net (error corrected) bit rate of the D-3 format is 143 Mbits/second.
Because the codec is lossless, the D-3 format can also be used in data applications.
Camcorders were available that used this format, and at that time, they were the only digital
tape-based camcorders to use a lossless encoding scheme.
The D-3 transport was derived from the previous Panasonic MII tape transport. D-3 (and later
the D-5 for high definition) tapes were available in cassettes of three sizes: small (161 mm ×
96 mm × 25 mm), medium (212 mm × 124 mm × 25 mm), and large (296 mm × 167 mm ×
25 mm). Format-specific recognition holes were used to adjust the spacing on the reel motors
and to enable other transport-related ballistics.
The maximum runtimes for D-3 (using the Fujifilm videotape product line) are 50, 126, and
248 min, respectively, for the three cassette footprints.
Component Digital Recording Formats
On the other hand, for the highest end production, component digital (D-1) could be layered
ostensibly forever. The very nature of component digital recording, sampled as Y′CBCR at
4:2:2, removed the NTSC footprints and kept the constantly degrading properties of encoding
and decoding the chroma subcarrier away from the video image. You could not visually
recognize the degradation of the video after upwards of 10–20 digitally rendered (copied)
D-1 to D-1 replications. This technology was truly revolutionary and was soon recognized as
the benchmark for premium digital quality, still known today as “full bandwidth” for 525-line
(and 625-line) standard definition.
All the confusion of early D-2 digital was indeed unfortunate. Even for those digital
production houses that used D-2 composite digital disk recorders, the images still retained
those venerable NTSC footprint artifacts. It wasn’t until the emergence of MPEG encoding
(which uses the component digital quantizing structures) that the meaning of “digital” would
take on a more defining terminology.
D-5 Formats
Panasonic first introduced the D-5 professional digital video format in 1994. The D-5 format
is an uncompressed digital component system (10 bits) using the same half-inch videotapes as
the Panasonic D-3 digital composite format. A 120-min D-3 tape records 60 minutes in the
D-5/D-5 HD mode.
Before the availability of the D-5 high-definition transports (D-5 HD), a standard-definition
D-5 deck could be retrofitted to record high-definition digital video with the use of an
external HD input/output interface unit. When using the outboard HD unit, the deck’s
conversion capabilities are removed.
The HD version of the D-5 transport uses standard D-5 videotape cassettes to record HD
material by using an intraframe compression with a 4:1 ratio. D-5 HD supports both the 1080i
and the 1035i line standards at both the 60-Hz and the 59.94-Hz field rates. In addition, the
720 progressive line standards and the 1080 progressive line standards at 24, 25, and 30
frame rates are supported. Simultaneous output of HD (hi-vision) signals and SD (standard)
signals can be achieved. In the case of the 1080/23.98p, 1080/24p, and 1080/25p formats, the
1080/23.98 PsF, 1080/24 PsF, and 1080/25 PsF (progressive segmented frame) formats divide the
frames into two segments every 1/47.96, 1/48, and 1/50 of a second, respectively, for
interfacing.
High sound quality with a dynamic range of 100 dB is delivered by 20-bit (or 24-bit) 48-kHz
sampling. Not only can recording, playback, and editing be conducted separately for each of
the four (or eight) channels, but channel mixing is also possible. There is an additional
channel provided for use as an analog (cue) track.
With the 1080/59.94i, 480/59.94i, or 720/59.94p system format, either the four-channel
20-bit format or the eight-channel 24-bit format can be selected. However, caution should be
exercised since the recording format used for the eight-channel audio version and the one
used for the four-channel audio version differ. These differences pertain to their video
recording areas as well. As such, these two system formats are not mutually compatible.
The D-5 HD unit (model AJ-HD3700B) is equipped with hi-vision serial digital interfaces
that meet the BTA S-004A, S-005A, and S-006A standards, enabling both video and audio
signals to be transferred using the same coaxial cable. An SD (525/625-line standard
definition) serial digital interface is provided for HD or SD operations. Shuttle search
capability at up to 50 times normal tape speed in the forward or reverse direction is
standard.
The D-5 transport will run at different data rates for different formats, with a maximum data
rate of 323 Mbits/second for 1080/59.94i, 720/59.94p, and 480/59.94i with eight channels of
audio.
Digital S
The D-9 format, standardized by SMPTE in 1999, carried the official marketing name
Digital S. The D-9 format is a professional digital videotape format developed by JVC in
1995 as a direct competitor to the Sony Digital Betacam format. The D-9 format found
significant usage in Europe and Asia, although it saw some use in the United States as well.
The tape shell for D-9 has the same form factor as the VHS cassette, with the tape being of
metal particle formulation. The digital recording system uses DV compression at a 50
Mbits/second bit rate in the 4:2:2 component format at a variety of standard-definition
resolutions with 16-bit PCM audio at 48-kHz sampling with up to four channels.
The standard-definition recorder allowed for both 4:3 and 16:9 aspect ratios.
D-9 High Definition
JVC developed an extension to the D-9 for high-definition recording. The D-9 HD used twice
the number of recording heads and a data rate of 100 Mbits/second at resolutions of 720p60,
1080i60, and 1080p24. With the tape recording twice the data rate, only half of the recording
time per cassette was achievable. For audio support, the D-9 HD model recorded eight channels
of 16-bit PCM audio at a 48-kHz sample rate.
Digital Betacam
Digital Betacam, launched in 1993, supersedes both Betacam and Betacam SP and was viewed as
having significantly less cost impact than the D-1 component digital format. The Digital
Betacam format (commonly referred to as Digibeta, d-beta, dbc, or Digi) uses a DCT-compressed
component video signal with 10-bit 4:2:2 sampling in the NTSC (720 × 486) or PAL (720 × 576)
line standards at a bit rate of 90 Mbits/second.
Audio is carried in four channels of uncompressed PCM at 20 bits with 48-kHz sampling. There
is a fifth analog audio track available as a cue channel, plus a linear timecode track.
Sony implemented SMPTE 259M serial digital interface (SDI) over coaxial cable on the
Digital Betacam transports, helping facilities to move to digital on their existing wiring
without updating their coaxial cabling infrastructures.
The principal competitor to the Digital Betacam is Panasonic’s DVCPRO50 format, which uses
8-bit luma and color sampling at a 50-Mbits/second data rate.
HDCAM
HDCAM, introduced by Sony in 1997, was the first HD format available in the Betacam (1/2
in.) videotape form factor. The compression format, proprietary to Sony, uses an 8-bit
discrete cosine transform (DCT) algorithm to compress the video using a 3:1:1 sampling
structure. The system uses a 1080-interlace-compatible downsampled resolution of 1440 × 1080
with nonsquare pixels. With the Sony HDCAM codec, the recorded 1440 × 1080 video is upsampled
on playback to 1920 × 1080.
With a recording video bit rate of 144 Mbits/second, there are four channels of 20-bit PCM at
48 kHz digital audio available.
Later HDCAM models added 24p and 23.976 PsF modes.
HDCAM SR
Using a higher-density metal particle videotape, Sony produced the HDCAM SR (for “superior
resolution”) product family in 2003. The recorder is capable of capturing 10-bit video in
either 4:2:2 or 4:4:4 RGB at a bit rate of 440 Mbits/second, referred to as SQ mode. This
increased bit rate (vs. its HDCAM predecessor) allows HDCAM SR to capture the full bandwidth
of the high-definition signal at 1920 × 1080 without the subsampling issues of HDCAM.
The HDCAM SR-HQ model uses a 2-times mode, allowing the recorder to capture the signal at
its highest bit rate of 880 Mbits/second in a 4:4:4 RGB stream at a lower compression ratio.
This mode has been used in motion picture productions that were later scanned back to 35-mm
film following postproduction processes. Currently, SR is the most popular digital videotape
recording format for motion picture work.
HDCAM SR uses the MPEG-4 Part 2 Studio Profile for compression and has up to 12 PCM
channels of 24-bit audio at 48 kHz.
DV Formats
DV (digital video) is a format for recording and playing back digital video, which was
launched in 1995 through the joint efforts of the leading video camcorder producers. The
original DV specification, known as Blue Book, was standardized within the IEC 61834
family of standards, which defines interchange, recording modulation method, magnetization,
and basic system data plus the physical properties of the tape cassettes.
DVCAM
The DVCAM format has the robustness and operability required for professional use while
maintaining full compatibility with the DV recording format. Signal transfers are
accomplished without manipulation of the originally recorded data using i.LINK or SDTI
(QSDI™) interfaces. The DVCAM format takes into account the requirements of existing linear
and nonlinear editing environments. The 15-μm track pitch assures frame-accurate and stable
editing at the tape edit point. The use of this track pitch also realizes full lip-sync audio
and preread capabilities.
The DVCAM/DV formats use metal evaporated tapes with a tape width of 6.350 mm. While the DV
format specifies two tape thicknesses (7.0 μm and 5.3 μm), the DVCAM format uses only the
7.0-μm tape thickness to achieve its professional robustness. There are two cassette sizes:
the 184-min standard size (5 × 3-1/8 × 19/32 in.) and the 40-min minicassette (2-5/8 ×
1-15/16 × 1/2 in.).
The sampling raster of DVCAM is the same as that of ITU-R Rec. 601. Luminance video signals
are sampled at 13.5 MHz, with 720 pixels transmitted per line for both 525-60 and 625-50
systems. In the 525-60 system, each color difference signal (CR/CB) is sampled at 3.375 MHz,
and 180 pixels are transmitted per line (4:1:1). The number of active lines is 480
(interlaced).
In the 625-50 system, each color difference signal is sampled sequentially at 6.75 MHz; that
is, 360 pixels of either color difference signal are transmitted per line (4:2:0). The number
of active lines is 576 (interlaced).
The sampled video data is reduced by a factor of 5:1 using bit rate reduction, resulting in
a transfer rate of 25 Mbits/second. Coding is intraframe DCT (Discrete Cosine Transform) with
VLC (Variable Length Coding). To obtain good picture quality at the 25 Mbits/second data
rate, DV/DVCAM compression adopts a data shuffling technique prior to the encoding process.
This allows the video to be compressed with maximum efficiency and thus keeps a well-balanced
picture quality for any type of image; the data is deshuffled on decoding.
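An illustrative arithmetic sketch of the 5:1 reduction (based on the active-raster figures
quoted above, not on the IEC 61834 documents themselves): active 525-60 video sampled 4:1:1
at 8 bits, reduced by a factor of five, lands at roughly 25 Mbits/second.

    luma_samples = 720 * 480          # active luma samples per frame
    chroma_samples = 2 * (180 * 480)  # CR + CB at 4:1:1
    bits_per_sample = 8
    frame_rate = 30000 / 1001         # 29.97 frames/second

    raw_mbps = (luma_samples + chroma_samples) * bits_per_sample * frame_rate / 1e6
    print(f"Uncompressed active video: {raw_mbps:.0f} Mbits/second")  # ~124
    print(f"After 5:1 reduction: {raw_mbps / 5:.0f} Mbits/second")    # ~25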
Because intraframe compression is used, DVCAM/DV uses two DCT modes that are adaptive to
the amount of picture movement in a frame. The 8-8 DCT mode is selected when there is no
motion and the difference between the odd and even fields is small; the 2-4-8 DCT mode is
selected when there is motion and the difference between the two fields is significant.
There are two modes of audio: a four-channel mode at 32 kHz and a two-channel mode at 48
kHz. In 48-kHz mode, one channel of audio signal is recorded in each of the two audio blocks,
giving one pair of stereo audio sampled at a 48-kHz frequency. The encoded data is expressed
in 2’s complement representation with 16-bit linear resolution. In 32-kHz (four-channel)
mode, two channels of audio signal are recorded in each of the two audio blocks, giving two
pairs of stereo audio sampled at a 32-kHz frequency. The encoded data is expressed in 2’s
complement representation with 12-bit nonlinear quantization.
Timecode, per SMPTE standards, is recorded in the audio area (AAUX), along with closed
captioning and record time and date information in the video area (VAUX) of the coding
sequences. Certain models of DVCAM will support 4:2:2 component digital video signals
and four channels of digital audio signals. The playback video data is decompressed to
baseband 4:1:1 (525)/4:2:0 (625) and then converted to 4:2:2 signals at the video
decompression block.
Compressed Interface
The SDTI (QSDI™) interface (Serial Data Transfer Interface) is for transferring compressed
video data, uncompressed audio data, and system data such as timecode, video, and audio AUX
data. SDTI (QSDI™) is useful for dubbing and for connection to nonlinear editing systems
because the video data is transferred as compressed data with no quality degradation and
reduced codec delay. The SDTI (QSDI™) interface for DVCAM is standardized as SMPTE 322M.
Interconnectivity between systems with different compression formats is highly important to
future operations. Certain versions of DVCAM equipment support the SDTI-CP (Serial Data
Transfer Interface-Content Package) interface to feed MPEG-2-based nonlinear production
systems. In order to interface with MPEG systems, the DVCAM data is first transcoded to
produce MPEG-2 video elementary stream data that is then placed on the SDTI-CP interface
together with audio and system data. This interface not only has the capability to feed
DVCAM-sourced material but also becomes a bridge from the DV family 25 Mbits/second format
to the MPEG world.
DVCPRO
DVCPRO is a professional video recording format developed by Panasonic and introduced in
1995. The format became very successful for electronic news gathering and became a serious
contender against other formats including Sony’s Betacam SP and Digital Betacam. There are
three variants of the DVCPRO family.
DVCPRO25
The first of the DVCPRO family is a 25 Mbits/second version using 4:1:1 chroma subsampling
for both 50 Hz and 60 Hz. Two-channel PCM audio is available only as 16 bits at 48-kHz
sampling. Two extra longitudinal tracks provide audio cue and timecode. DVCPRO uses a wider
track pitch of 18 μm (vs. the 10 μm of baseline DV), which reduces the chances of dropout
errors. The cassette-based videotape is transported 80% faster than baseline DV, resulting
in a shorter recording time.
The SDTI (QSDI™) compressed digital transfer mode allows DVCPRO25 transports to be
interconnected, transferring material four times faster than real-time transfer between
transport decks.
DVCPRO50
This version, introduced by Panasonic in 1997, was intended for higher value ENG and
potential digital cinema applications. The DVCPRO50 not only doubles the coded video data
rate to 50 Mbits/second but also reduces the recording time by half compared with base
DVCPRO25. Chroma resolution is improved by using 4:2:2 chroma sampling.
DVCPRO Progressive
A progressive line scanning version was produced for use in news gathering, sports,
journalism, and digital cinema. It offered 480 (NTSC) or 576 (PAL) lines of progressive scan
recording using 4:2:0 chroma subsampling and four channels of PCM audio at 16 bits with
48-kHz sampling. The format offered six modes for recording and playback: 16:9 progressive
(50 Mbits/second), 4:3 progressive (50 Mbits/second), 16:9 interlaced (50 Mbits/second), 4:3
interlaced (50 Mbits/second), 16:9 interlaced (25 Mbits/second), and 4:3 interlaced (25
Mbits/second). This format was superseded by the introduction of DVCPRO HD.
DVCPRO HD
This was Panasonic’s introduction to HD other than the D-5 format, and it was originally
marketed as DVCPRO100. The video coding is 4:2:2, and the format’s data rate is variable,
dependent on frame rate, ranging from as low as 40 Mbits/second for the 24 frame-per-second
mode up to 100 Mbits/second for 50/60 frame-per-second high definition. Like DVCPRO50,
DVCPRO HD uses 4:2:2 color sampling.
The DVCPRO HD format uses horizontal downsampling (a reduced raster size compared with the
nominal HD rasters) for both 720p and 1080i broadcast-quality high definition. For DVCPRO HD
in 720p, the raster size is 960 × 720 pixels; for 1080/59.94i, it is 1280 × 1080 pixels; and
for 1080/50i, it is 1440 × 1080 pixels. To maintain compatibility with HD-SDI, the DVCPRO100
equipment upsamples the video during playback.
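A small sketch of the horizontal scaling this implies (the recorded widths are those quoted
above; the full widths are the nominal HD-SDI rasters, and the factors simply follow from
the ratio):

    full_width = {"720p": 1280, "1080/59.94i": 1920, "1080/50i": 1920}
    recorded_width = {"720p": 960, "1080/59.94i": 1280, "1080/50i": 1440}

    for fmt in full_width:
        factor = full_width[fmt] / recorded_width[fmt]
        print(f"{fmt}: {recorded_width[fmt]} -> {full_width[fmt]} "
              f"(upsample by {factor:.2f}x on playback)")
    # 720p: 1.33x, 1080/59.94i: 1.50x, 1080/50i: 1.33x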
To provide compatibility and to support the feature sets of Panasonic’s VariCam digital
cinema format camcorders, the DVCPRO100 provides for a variable frame rate from 4 to 60
frames per second. The DVCPRO HD equipment is backward compatible with the 25 Mbits/second
and 50 Mbits/second DVCPRO formats and uses the same 18-μm track pitch as other DVCPRO
formats. DVCPRO HD-LP, a long-play variant, doubles the recording density by using a 9-μm
track pitch.
DVCPRO HD is codified in the SMPTE 370M standard, with its tape format specified in SMPTE
371M. When using the Panasonic P2 solid-state recording card option, the MXF OP-Atom format
is employed.
High-Definition Video
HDV is a tape recording format using DV tape to record HD video. On September 30, 2003,
Canon Inc., Sharp Corporation, Sony Corporation, and Victor Company of Japan, Limited (JVC)
announced that the specifications to realize the recording and playback of high-definition
video on a DV cassette tape had been established. The four companies had proposed the basic
specifications for what became known as the HDV format in July 2003; the actual
specifications became available in October 2003 and were put forward as an international
standard.
The HDV format would again change the complexion of both professional and consumer video
recording, opening the door to high definition for all.
HDV supports both 720p (as HDV720p) and 1080i (as HDV1080i) high-definition scanning
formats. Initially, Sony adopted HDV1080i for its HDV products. Later, the HDV
specification was widened to include native 1080p recording capability.
The HDV1080 interlaced format features 1080 lines with 1440 pixels per line, which is the
same structure that is used by XDCAM HD and XDCAM EX. HDV compresses the full HD
image by using the MPEG-2 Main Profile at High 1440 Level (MP@H-14). The 720p format
records in transport stream (TS) mode, and the 1080i format records in packetized elementary
stream mode. The stream interface is over IEEE 1394 (MPEG2-TS). In the 720p/60 and some
720p/50 models, the MPEG format is MP@HL.
For the 1080p/25 and 1080p/30 progressive modes, as well as the 1080i/50 and 1080i/60
modes, the MPEG group of pictures (GOP) length is 15 frames. For the 1080p/24 modes, the GOP
length is 12, with a 2-3 pull down cadence. HDV uses both intraframe (in which each frame is
individually compressed) and interframe (which compares one frame to another, removing
redundant information) compression modes.
Two-channel audio is recorded as MPEG-1 Layer II at 16-bit quantization and 48-kHz sampling.
There are options for both PCM (as two or four channels) and MPEG-2 Layer II (in four
channels).
Tapeless Digital Video
The evolution of digital video recording has progressed beyond tape-based recording. The
migration to file-based workflows has mandated a means to record on other than a linear
basis. Both solid-state and optical disc–based recording methods have found significant
adoption in professional media recording since the early 2000s.
As we move from standardized implementations of videotape formats to market-driven,
not-necessarily-standardized recording media, manufacturers are pushing their development
efforts toward extensible recording formats for all industries, centered on standardized
video and audio compression formats. This allows the physical media to serve a variety of
purposes beyond just professional or consumer video.
The offerings are numerous, and only a few will be discussed in this chapter. Throughout this
book, both the physical media and the recording formats will be discussed as they apply to
each of the chapter’s topics.
XDCAM
Sony introduced the standard-definition version of this now popular non-tape-based,
compressed digital video format in 2003 using nonlinear media. This launched the era of
tapeless workflow for Sony and added features such as random access capability, thumbnail
search capability, no overwriting of existing footage, and IT-network-centric capabilities
not previously offered in a professional recording format.
The family includes both SD and HD resolutions as XDCAM HD422, XDCAM HD, and
XDCAM SD. This lineup uses an optical disc medium which Sony calls their Professional
Disc media, providing a storage capacity of up to 50 Gbytes. The XDCAM EX product line
uses a solid-state memory card.
Optical Disc Storage Media
The Sony Professional Disc medium adopted by the XDCAM HD422, XDCAM HD, and XDCAM SD
products uses blue-violet laser technology to enable extremely large storage capacity.
The diameter of the Professional Disc media is 12 cm, equal to that of other optical media
such as CDs and DVDs. The media is offered as a dual-layer model (PFD50DLA) storing 50 GB
and a single-layer model (PFD23A) storing 23.3 GB.
Flash Memory Card Media
The SxS PRO memory card adopted by the XDCAM EX series for recording is an
ultracompact nonlinear medium that uses flash memory, based on the SxS™ memory card
specification. The SxS PRO memory card, combined with the moderate bit rates produced by
the efficient MPEG-2 Long GOP compression, records 70 min of HD (25 Mbits/second) on a
single 16-GB card.
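A rough, illustrative capacity check of that figure (real cards also carry audio, metadata,
and file-system overhead, which the raw arithmetic ignores):

    card_bytes = 16e9              # nominal 16-GB card
    video_mbps = 25e6              # HD at 25 Mbits/second

    seconds = card_bytes * 8 / video_mbps
    print(f"Raw video capacity: about {seconds / 60:.0f} min")  # ~85 min
    # Audio, metadata, and file-system overhead bring the usable figure
    # down toward the 70 min quoted for a single 16-GB card.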
Two-card slot systems in the XDCAM EX products can achieve up to 140 min of recording using
two 16-GB memory cards in the SP mode and up to 100 min in the HQ mode. When a video
recording spans two cards, the transition is seamless, without any artifacts or frame loss.
SxS PRO memory cards, which are compatible with the ExpressCard/34 standard, can be
hot-swapped while shooting without interrupting the recording, making the XDCAM EX products
ideal for long-form content-production applications.
XDCAM and XDCAM EX
The families in the XDCAM lineup offer both optical disc and solid-state (flash) memory
recording, which propels the products into the file-based workflow domain. SxS model cards
are currently available in capacities of up to 1 TB (one terabyte).
XDCAM
The standard-definition XDCAM products can record up to 50 Mbits/second using MPEG-2
4:2:2P@ML compression. The recording formats are MPEG IMX and DVCAM, with selectable bit
rates for MPEG IMX of 50, 40, and 30 Mbits/second. The NTSC formats include 29.97p and
23.98p (by recording to disc at a 59.94i rate with a 2-3 pull down). The 45-min recording
time of a single-layer disc, at 50 Mbits/second, includes four-channel audio.
Proxies are recorded at 1.5 Mbits/second (video) and 0.5 Mbits/second (audio), plus
metadata.
XDCAM EX
The XDCAM EX is the solid-state, flash-based memory card product line configured as
explained above. XDCAM EX records high-quality HD video at a data rate of up to 35
Mbits/second using MPEG-2 MP@HL (Long GOP) compression in 1080p, 1080i, and 720p scanning
structures, as well as native 23.98p.
Built-in HD to SD downconversion is available during playback.
Metadata recording is supported, but proxies are not generated in the XDCAM EX products.
XDCAM HD
The XDCAM series continued to demonstrate many advantages of nonlinear recording. In
response to the ever-increasing demands of video production, Sony expanded the XDCAM series
beyond XDCAM SD and XDCAM EX by introducing two new HD product lines, the XDCAM HD422 and
XDCAM HD. To provide more user flexibility, there is a choice of recording formats, recording
bit rates, interlace or progressive modes, and optical disc or memory card recording media.
File Transfer–Based Recording Decks
The Sony PDW-1500 transport has high-speed file-transfer capabilities of 50× real time for
proxies, 5× for DVCAM, and 2.5× for MPEG IMX (50 Mbits/second) over a Gigabit Ethernet
connection. Metadata recording, plus the ability to write an EDL (clip list) back onto the
disc, is an included feature to round out the file-based workflow system.
P2 format
Introduced in 2003, this solid-state memory card based recording format from Panasonic,
called P2 (for “professional plug-in”), is the company’s introduction into file-based
workflow, starting at the camcorder end.
Panasonic P2 records a DVCPRO50/DVCPRO/DV signal to a PCMCIA-sized card. The PCMCIA card is
actually an array of secure digital cards designed to work swiftly and in harmony to record
large amounts of data. Early adopters paid a healthy price for the media (a 4-GB card
costing in excess of $1600), which recorded a meager 4 min of HD video.
P2 cards record in universally interchangeable Material eXchange Format (MXF) data files,
making them immediately usable by properly configured Windows and Macintosh
computers, as well as other nonlinear editing platforms compliant with MXF using the
operational pattern OP-Atom.
Mixing Formats on P2 Cards
Users may freely intermix any type of footage, in high definition and standard definition, as
625 (PAL) and 525 (NTSC), as DVCPRO and AVC-Intra, in interlaced, progressive, or
variable-frame-rate footage on the same P2 card. P2 cards are viewed by the system as
removable storage devices and are thus format agnostic.
P2 Card Structure
P2 cards are high-precision microcomputers, with an integral processor, format firmware, a
RAID controller, and multiple gigabytes of high-quality, zero-fault, solid-state memory
chips. The P2 card is an intelligent device that manages the data files; its processor
performs a write-verification step for every byte of memory written to the card, assuring
fault-free operation.
Early P2 cards were manufactured using actual secure digital (SD) memory cards in a striped
RAID array, thus increasing the performance far beyond the speed of an individual memory
chip. The latest generation of P2 cards dispensed with individual SD memory cards and uses
the core memory components directly.
P2 cards (circa 2008) are capable of transferring data at a rate of 640 Mbits/second (80
Mbytes/second), fast enough to allow real-time editing of six streams of full-bandwidth
DVCPRO HD (100 Mbits/second data rate) simultaneously. Transfer speeds are governed by
the IT-storage devices and their hardware configurations.
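An illustrative check of the six-stream editing claim, using only the figures quoted above:

    card_rate_mbps = 640           # 640 Mbits/second (80 Mbytes/second)
    stream_rate_mbps = 100         # DVCPRO HD data rate per stream

    streams = card_rate_mbps // stream_rate_mbps
    print(f"Full-bandwidth DVCPRO HD streams supported: {streams}")  # 6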
Recording times depend on the format being recorded.
PC Card Origin
The PC Card is the form factor for the P2 card. The original name was developed by a
consortium of companies called the Personal Computer Memory Card International Association
(PCMCIA). The US computer industry created the PCMCIA to challenge the Japanese JEIDA memory
card devices by offering a competing standard for memory-expansion cards. In 1991, the two
standards merged as JEIDA 4.1/PCMCIA 2.0 (or “PC Card”).
P2 HD
The P2 HD format is compatible with PCs and existing file-based IT infrastructures. Content
is recorded as independent frames; it can be randomly accessed and easily transferred to, or
archived onto, low-cost consumer media (hard drives) or other affordable current IT storage
technology.
24PN Mode
A native, P2-only mode, 24PN is a special space-saving recording mode designed specifically
for use with P2 cards, with no loss in quality when recording in 24PN mode. Normally,
DVCPRO HD 720p is stored at 60 frames per second, and 24p footage is “embedded” in a 60p
data stream using 2:3 pull down.
P2 HD camcorders support 24p recording within a 60p data stream. When recording to the P2
card, however, it is unnecessary to record the full 60p data stream. The 24PN mode allows
the recording and storage of just the 24 native frames, resulting in a 2.5× increase in
recording time on any given card.
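The 2.5× figure follows directly from the frame counts (a simple illustrative calculation):

    recorded_fps_normal = 60   # 24p embedded in a 60p stream via 2:3 pull down
    recorded_fps_native = 24   # 24PN stores only the native frames

    gain = recorded_fps_normal / recorded_fps_native
    print(f"Recording-time gain: {gain}x")  # 2.5x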
The camcorder or playback device automatically takes care of inserting the 24PN data stream
back into a 60p data stream using 2:3 pull down directly on playback. 24PN recordings can
thus be displayed on any conventional HD monitor without needing special conversion
hardware.
The 24PN and 30PN modes will not transmit data over the IEEE 1394 (“FireWire”) interface;
the interface is disabled when operating in these modes. The P2 HD camcorders transmit only
SMPTE-compliant data streams over this interface, and the 24PN and 30PN recordings are
unique modes that are not compliant with established streaming protocols. The 30PN mode
works like the 24PN mode, at 30 or 25 frames per second.
Proxy and Metadata Recording
The P2 HD system supports the creation and the management of proxy files at bit rates of 192
Kbits/second, 768 Kbits/second, or 1.5 Mbits/second. Proxy information is recorded via the
optional MPEG4 encoder simultaneously with the high-resolution video that is being
recorded to the same P2 card. Proxy video can be recorded to both the P2 card and a separate
SD memory card, allowing for the low-resolution files to be moved to a network (internal or
Web-based) for logging and viewing of content ahead of transferring the high-resolution
content into an editing platform.
All P2 camcorders record video with some standard metadata fields, including individual
camera type, camera serial number, and unique user clip ID. Additional support for up to 30
user-definable (descriptive) metadata fields, such as shooter, reporter, location, scene,
text memo, and GPS coordinates, may be selectively added.
Not all P2 HD camcorders can create or play proxy files; only camcorders that accept the
optional proxy card can do so.
AVC-INTRA
Expanding on the established DVCPRO HD compression, Panasonic adopted AVC compression,
using intraframe-only coding, for its P2 HD camcorder products.
Advanced Video Coding
Following the completion of the MPEG-4 Visual standard, the Joint Video Team (JVT) of ITU-T
VCEG and ISO/IEC MPEG was established to develop an even more efficient compression scheme.
The efforts of the JVT experts resulted in the new coding name H.264/AVC, formally known as
ITU-T Recommendation H.264 or as ISO/IEC 14496-10 (MPEG-4 Part 10) Advanced Video Coding.
H.264/AVC offers significant improvements in coding efficiency; however, in order to meet
the demand for coding of higher fidelity video content, the first amendment of H.264/AVC,
known as Fidelity Range Extension (FRExt), was created with new High profiles.
AVC-Intra Implementation
The AVC-Intra implementation of H.264/AVC is offered on select P2 products. These products
have the capability to switch between the AVC-Intra 100 mode and the more economical
AVC-Intra 50 mode. The AVC-Intra 100 mode provides the full resolution in HD using 4:2:2
studio-quality 10-bit sampling (i.e., without subsampling). The AVC-Intra 50 mode offers an
economical advantage by using 4:2:0 sampling, also with 10-bit depth. The coded data size is
fixed on a frame-by-frame basis, which eases the frame-accurate editing issues found in
MPEG-2 Long GOP coding.
AVC-Intra is not supported over IEEE 1394 (FireWire) as a live data stream.
Further Readings
A comprehensive listing of the SMPTE standards documents pertaining to the videotape and
various “D” (digital) formats can be found in the Appendix available on the companion
website.