Bonus Material A: Digital Video Recording

Chapter One defined digital video recording as the process of converting analog video signals and analog sound signals (i.e., analog-to-digital or A-to-D conversion) into a stream of discrete binary numbers that represent and correspond to changes in chroma and luminance values (for video) or dynamic air pressure (for audio) over time. However, far more now than in the latter part of the 1990s, digital video and audio signals are often generated "natively" at the source, removing the added step of external A-to-D conversion. For quite some time, full-bandwidth, standardized digital video and audio have been prevalent inside many facilities as linear digital signals, usually carried as SMPTE 259 for standard definition or SMPTE 292 for high definition, and as AES3id (or simply AES/EBU) for audio. Full-bandwidth digital video and audio, streamed from cameras, videotape transports, production equipment, and video storage devices (e.g., video servers), is now well accepted in the professional video production and broadcast market space. Compressed digital video formats, such as MPEG-2, MPEG-4, and JPEG 2000, have reached even further than full-bandwidth digital video because of their file-based nature and their ability to be carried over conventional networking infrastructures.

Video signal structures with resolutions beyond the traditional high-definition formats adopted by the Advanced Television Systems Committee (ATSC) for broadcast applications, 1080 interlaced (1080i) and 720 progressive (720p), are gaining momentum. Signals such as 1080 progressive (1080p60, 1080p24, etc.), carried at 3 Gbits/second per SMPTE 424 and SMPTE 425, are emerging, not to mention the continual evolution of compression formats at various sample data rates and structures. As an introductory foundation to the rest of this book, this chapter looks at the origins of component analog and digital videotape recording and provides a preface to the concepts of file-based and digital storage technologies for the moving media industry.

KEY CHAPTER POINTS
Video coding, the move from analog to digital formats, and the early formats
Component and composite analog video formats
Disk-based recording and storage in magnetic and optical formats
Component and composite digital video formats for videotape and file-based recording, the standardized D-formats, and beyond, explained according to SMPTE standards and their marketplace trade names

Analog to Digital Video Coding

Coding digital representations of analog video may involve either a standards-based approach or a proprietary approach. For most of the accepted standards-based methods, color difference coding (Y′CBCR) is the predominant method. This coding process involves taking samples of the luma (luminance or brightness) signals and the chroma (color difference) signals, and then properly scaling and interleaving them to create a set of digital values that corresponds to the prescribed encoding methodology. Two goals are targeted during the digital conversion process (a process not necessarily related to video compression). The first is to reduce the amount of data required to represent the image, which in turn reduces the transmission bandwidth (bit rate) and the storage requirements.
The second is to encode the digital representations into a format that can be transported, interchanged, and properly utilized on a standardized variety of playback or picture manipulation devices.

Component Video Nuances

When the casual video user speaks of digital video, they probably do not recognize, or necessarily care about, the various color difference terminologies. The video user often universally references all analog component signals to the YPBPR scheme most utilized in the connector designations for consumer DVD players, high-definition displays, or video projectors. Unfortunately, when looking at the physical connections to such devices, one may find these signal channels mistakenly labeled with other color difference terminologies such as YUV, Y/R-Y/B-Y, or Y/B-Y/R-Y. These signal sets, while somewhat similar, are not equivalent, and the distinctions matter in the processes of component video digitization.

As the technologies to capture component analog video signals were developing, variations of component video began to materialize as manufacturers sought to provide the highest quality recording and interchange methodologies. Manufacturers sought to have the industry recognize "their" product offerings as superior to their competitors', so each primary manufacturer produced slightly different systems for its particular product line. These varying systems, some standardized and others more de facto in nature, emerged and became accepted by certain sets of users as the television industry began its gradual migration away from the film-based medium.

MII and Betacam Formats

Two of the more recognized analog component variations prominent during the mid-evolution of professional cassette-based videotape came from Panasonic and Sony. For analog video, the two successful, yet competing, formats were the MII format (from Matsushita/Panasonic) and the Betacam format (from Sony). Both systems were built around a half-inch-wide oxide-based videotape cassette. These two systems used different offset voltage levels for the color difference analog channels, requiring different scaling when they were converted to any form of component digital video signal. Neither the MII nor the Betacam system ever directly recorded a digital signal onto videotape. Any analog-to-digital conversion was, for the most part, accomplished by external A-to-D devices or by the receiving video recording device or live vision mixer/video production switcher. Care in the electrical setup of the television equipment, per the proper signal format, was always critical.

Composite or Component Signals

The principal methods of video signal interchange were composite analog (NTSC or PAL) video on a single coaxial cable or a component color difference video signal (Y, R-Y, B-Y). When dubbing from one Betacam transport to another, Sony offered a component "dub cable" that carried a multiplexed component signal between the two devices. Betacam and MII formats could not be electrically interchanged, even at the component analog video level, without additional scaling or signal processing equipment. Both of these analog component videotape concepts eventually paved the way for the emergence of digital videotape recording and the later standardization of the digital encoding parameters that led to the development of the D-1, D-2, and D-3 digital videotape formats. (Standards marked "ARCH" next to the number are standards that have been archived.)
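For readers who prefer to see the relationship concretely, the sketch below (Python, illustrative only) forms Y and the scaled color difference pair from gamma-corrected R′G′B′ values using the well-known BT.601 luma weights. The PB/PR scale factors shown are the standard analog component scalings; the format-specific level offsets that Betacam and MII each added on top of these are exactly why the two could not interchange without reprocessing.

# Illustrative sketch: deriving Y and the scaled B-Y / R-Y color difference
# components from normalized (0..1) gamma-corrected R'G'B' values,
# using the BT.601 luma weights.
def rgb_to_ypbpr(r, g, b):
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma
    pb = (b - y) / 1.772                    # scaled B-Y
    pr = (r - y) / 1.402                    # scaled R-Y
    return y, pb, pr

# 100% red bar: luma 0.299, with PR at its +0.5 extreme
print(rgb_to_ypbpr(1.0, 0.0, 0.0))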
Encoding per Recommendation 601

ITU-R BT.601, commonly known as Rec. 601 or BT.601, is the encoding standard published by the International Telecommunication Union Radiocommunication Sector (ITU-R), formerly known as the CCIR, for encoding interlaced analog video signals into a digital form. This standard sets the methods for encoding 525-line 60 Hz and 625-line 50 Hz signals, both with 720 luminance samples and 360 chrominance samples per active line. From the technical side, the digitization process takes the already scaled nonlinear Y′CBCR analog signal sets and samples (quantizes) them into data sets that are stored in the order Y1:CB:Y2:CR, whose chrominance samples are co-sited with the first luminance sample. This signal can be regarded as if it were a digitally encoded analog component video signal and thus must include data for the horizontal and vertical sync and blanking intervals. Regardless of the frame rate, the luminance (i.e., luma) sampling frequency is 13.5 MHz, with each chrominance (i.e., chroma) channel sampled at half that rate (6.75 MHz), and the samples are quantized to at least 8 bits.

The first version of Rec. 601 defined only a parallel interface, but later versions introduced the bit-serial family of serial digital video interfaces that are now commonly used in digital television and postproduction facilities. The 8-bit serial protocol with a data rate of 216 Mbits/second was once used in the D-1 format of component digital videotape recording. More current standards expand the data words to 9 or 10 bits for improved function over long transmission lines. The 9-bit serial version has a data rate of 243 Mbits/second. By far, the most familiar version of the interface is the 10-bit serial digital interface more commonly known simply as SDI, which was standardized as SMPTE 259M. SDI is now the ubiquitous interconnect standard for professional video equipment operating on standard-definition digital video.

In Rec. 601, for each 8-bit luma sample, the value 16 is used to represent the video black level and 235 is used to represent the peak video white level. The values 0 and 255 are used for sync encoding only, so no video information is conveyed at these levels. The CB and CR samples center around the value 128, which designates an encoded "zero value." Excursions above and below 128 represent chroma sample sets that, when mathematically merged with the luma (Y′) values, make up the set of Y1:CB:Y2:CR digital video values. This standard, including its video raster format, has been reused in a number of later standards, including MPEG. The SDI SMPTE 259M and the ITU-R BT.601 standards address different subjects but are closely related in defining the encoding and transport methods of modern digital video.
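The interface rates quoted above fall straight out of the Rec. 601 sampling parameters; the following sketch (assuming nothing beyond the figures already given) reproduces the arithmetic.

# Rec. 601 sampling arithmetic.
luma_rate = 13.5e6                    # luma samples per second
chroma_rate = 6.75e6                  # per color difference channel
total = luma_rate + 2 * chroma_rate   # 27 Msamples/second (4:2:2 multiplex)

for bits in (8, 9, 10):
    print(f"{bits}-bit serial rate: {total * bits / 1e6:.0f} Mbits/second")
# prints 216, 243, and 270 Mbits/second; the last is SMPTE 259M SDI

BLACK, WHITE, CHROMA_ZERO = 16, 235, 128   # 8-bit quantization anchors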
Video Disk Recording

As this book focuses mainly on the storage of digital information for spinning media platforms, a bit of history on the topic of professional digital disk recording is warranted to set the stage. For the broadcast industry, the video disk era began in the late 1960s with the Ampex HS-100 system. This product marked the entry point for slow motion and instant replay of recorded video images. The disk recorder was used mainly for live sports, becoming a household word among broadcasters. By the end of 1976, broadcasting television sports without slow-motion replay was nearly unimaginable. The disk-based device featured not just "slo-mo" (slow motion) but also freeze frames and reverse motion.

Ampex's model HS-100C (1976), although only a few years old at that point, touted some exceptional technical features, including a derivative of what would become the Ampex AVR-series digital time-base corrector. Ampex advertisements described a subsystem that automatically lifted the heads for disk-surface protection purposes. Operational features included fast search and twice-normal-speed playback. Search speed reached a viewable 4.5 times normal speed. Considered a portable unit, the HS-100 consisted of three separate units: a disk servo, an electronics unit, and an output-processing unit. Each unit weighed between 145 and 177 pounds and had a power requirement of 120 V at 20 A. The signal electronics, similar to the Ampex AVR-2 videotape transport, were already proven worldwide; however, its digital frame buffer, capable of holding an entire frame of video, was not yet ready for prime time.

Functionally, reproducing a still image required that the heads of the disk recorder constantly step or retrace over the same physical space on the drive surfaces. One of the drawbacks to the disk recorder when constantly used in the mobile environment was the sometimes intense routine maintenance required to keep the stepper motors and end-stop adjustments aligned. Physical damage from head or surface contact was always a risk, and there was always a possibility of damage due to dirt contamination. Overall, the Ampex HS-100, with its ability to store up to 1800 fields in NTSC (1496 fields in PAL), using analog recording and employing four steppers to cover the four surfaces on two rotating metal disks, was an amazing product. Not until optical disc storage systems were commercialized in the late 1970s would there be an evolutionary shift toward producing video on anything other than linear videotape, whether analog or digital in structure.

Optical Disc Platform

Optical video disc technologies were among the earliest to capitalize on the principles of video components to produce removable and transportable storage media. The videodisc, originally demonstrated in 1928 by John Logie Baird, evolved from its wax base to polymer bases, becoming a means of delivering a good-quality color analog video signal. By 1978, Philips and Pioneer had introduced the videodisc into the commercial landscape. The laser disc, invented by David Paul Gregg in 1958 (later patented in 1961 and 1990), would become a popular product but was essentially displaced, first by the VHS cassette and eventually by the DVD, in North American retail marketplaces. Optical recording media formats such as the CD-ROM (compact disc read-only memory) and DVD (known as both the "digital versatile disc" and the "digital video disk"), along with the high-definition versions, the former HD DVD and the adopted Blu-ray, are examples of optical disc–based digital video recording technologies that are widely used in the consumer marketplace. Although the laser disc had only minimal use in the professional broadcast industry, its feature set offered both a removable recordable medium and a frame-by-frame optical recording capability that found applications, including interactive content, and became a significant shaping factor in the future of digital video media.

Digital Video

Early on, when you spoke to computer hacks about "digital video," they thought mostly in terms of streaming video on a computer or network. Computer folks didn't seem to understand a thing about D-1, D-2, D-3, or D-5.
The very existence of ITU-R BT.601 (also known as "601" or "D-1") was probably so far from their minds that if you told a computer hack you could put 270 Mbits/second video onto any video recording medium, they'd probably just laugh. In the computer domain, digital video had no differing encoding formats; it only had different file formats. On the other hand, if you spoke to a broadcast professional about the same, you'd get a wide variety of equally confusing views on digital video. The terminologies associated with digital video encoding began as early as the introduction of the D-2 and D-1 videotape transports. Today, it goes without saying that the industry is beginning to see the impact of digital video in at least as many forms as the computer industry. Despite the fact that computer video made great strides in the past decade, broadcast digital video has made even greater ones. As history unfolds, we will eventually forget that the early recording formats of digital tape were understood by only a handful of brave pioneers. Strangely enough, even today, with multimedia, video servers, and a plethora of file-based disk recording methodologies, confusion between the component digital video (D-1) and composite digital video (D-2) formats remains, beginning with the root distinction between the tape format and the signal format.

Composite versus Component Digital Video

Prior to the introduction of the current ATSC (digital terrestrial television broadcasting) standards, digital encoders and decoders were designed around low-pass filters that addressed the analog television transmitter's requirements more than the capabilities of the developing all-digital video physical plants. Before digital videotape, the market for the distribution of a final product was constrained to a quality level benchmarked by either the 1-inch C-format analog or the ½-in. Betacam tape specifications. Digital video changed that complexion universally.

Production Effects

Throughout the 1990s, the levels of production and effects work continued to increase. Digital technologies in video allowed for the multilayering and compositing capabilities that became the production norm. Digital videotape recording ended the issues of those ever-familiar dropouts, at least for the production process. Industry competition segregated the lower cost composite digital video (D-2) systems from the more expensive component digital video hardware required for D-1. The latter format allowed for the production of very high-quality, multigenerational imaging that eliminated the NTSC footprint found in the composite D-2 format.

Composite Digital Recording Formats

The need for a digital recording platform that could "drop in" to a broadcast or postproduction facility was the impetus for the development of two composite video formats that would interface in the analog domain while recording a digital signal on the tape medium.

D-2 Format

The composite digital D-2 format, created by Ampex, is a professional digital videotape product line, standardized by SMPTE and introduced at the 1988 National Association of Broadcasters (NAB) convention and trade exhibition. The D-2 format was intended to be a lower cost alternative to the component digital D-1 format introduced by Sony. Like the D-1 format, D-2 video is uncompressed. Bandwidth is conserved by sampling a fully encoded NTSC or PAL composite video signal, at a data rate of 143 Mbits/second.
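A quick sketch of the arithmetic behind that figure, assuming the 10-bit serialization of 4fSC NTSC samples used on the composite serial interface (the D-2 tape format itself records 8-bit samples):

# Composite digital (4fSC) sampling arithmetic for NTSC.
f_sc = 315e6 / 88            # NTSC color subcarrier, ~3.579545 MHz
f_sample = 4 * f_sc          # 4fSC sampling, ~14.318 MHz
print(f"sampling frequency: {f_sample / 1e6:.3f} MHz")
print(f"10-bit serial rate: {f_sample * 10 / 1e6:.1f} Mbits/second")  # ~143.2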
Most broadcast television and postproduction companies did not utilize distributed component analog video formats at the time that D-1 was first introduced. Ampex saw the D-2 as a practical solution for broadcasters, since it could be dropped into existing broadcast chains and studio facilities (as a replacement for the analog C-format one-inch open-reel VTR transports) without extensive redesign or modifications. D-2 composite digital tape transports accepted the standard RS-170A analog inputs and outputs. Four channels of audio are available for editing, as well as an analog cue channel.

The Ampex D-2 was the first digital tape format to provide the "read-before-write" feature, known as "preread" on Sony recorders. Using frame buffering technologies, read-before-write allowed simultaneous playback and recording on the same VTR. In production, a graphic title (from an external video keyer) could be superimposed over existing video that was already recorded on the same videotape. By playing back the videotape through a production switcher with effects capabilities, one could add the title and then record the image back onto the tape in the same location on the tape. This eliminated the requirement for a dedicated playback machine and a second, dedicated recorder. It could further save considerable time in linear editing, but it had a serious drawback. Should the external graphic or effect be undesirable (e.g., it had an unrecognized typo), the original source video recording was lost; the new, incorrect video permanently replaced the previous source image.

D-2 (and D-1) digital recorders were first introduced with parallel wired interconnections for use with a composite (or component) digital production switcher. At that time, the serial digital interface (SDI) standard had not been ratified, nor were there any commercially available interface options for either of the recorder formats.

D-2 used a 19 mm (approximately ¾ inch) metal particle tape loaded into three different sized cassettes. The nominal tape running speed is 131.7 mm/second. PCM-encoded audio and linear timecode are also recorded on the tape. The D-2 tapes are similar in appearance to D-1 tapes; however, they are not interchangeable, as the two recording formats are entirely different. The Ampex D-2 tape transports are extremely fast, with a high-speed search at 60 times playback speed producing a recognizable color picture. Three hours of videotape could be scanned in around just three minutes. Many production houses adopted the D-2 format, but the broadcasters had firmly planted their heels in the 1-in. tape format and found value in moving to digital only if it provided additional revenue. With the electronic news gathering marketplace heating up and no need to move from Betacam SP to another format, the odds were against D-2 remaining a mainstay videotape format for quite some time.

D-2 Downfalls

Over time, the use of composite digital videotape (D-2) proved to be less than satisfactory for high-end effects and layering. For the D-2 format, or composite digital, it was shown that after only a few copies (generations) of the same recording, usually between six and eight, the artifacts from sampling and resampling of what was essentially an NTSC (4fSC) signal became quite evident. The quantizing structure of the D-2 format resulted in an inconsistent scaling of black levels, in part because there was no defined bit boundary level for the 7.5 IRE setup level.
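The setup problem can be shown numerically. A sketch, assuming the commonly cited 8-bit 4fSC code assignments (blanking at code 60 and peak white at code 200, so that 100 IRE spans 140 codes):

# Why 7.5 IRE setup lands between integer codes in 8-bit 4fSC quantization.
BLANKING, PEAK_WHITE = 60, 200                  # assumed 8-bit code values
codes_per_ire = (PEAK_WHITE - BLANKING) / 100   # 1.4 codes per IRE
print(BLANKING + 7.5 * codes_per_ire)           # 70.5 -- no clean bit boundary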
Chroma levels were still difficult to manage, and most facilities had only analog interfaces to other equipment, making the benefits of digital transports of little value. These factors added to the multigenerational degradation handicap, at almost as serious a level as that found in poorly maintained oxide Betacam transports. D-2 had a relatively brief heyday except in production. By 2003, only a handful of broadcasters were using the D-2 format, mostly in commercial production environments or in the Sony LMS (library management system), where the digital video provided an advantage over the analog Betacam SP format.

D-3 Format

The D-3 format, produced first by NHK in Japan, captured uncompressed composite digital video in 525-line (standard definition) form onto linear videotape. The format was first introduced commercially by Panasonic in 1991 to compete with Ampex's D-2 video format. D-3 uses half-inch metal particle tape running at 83.88 mm/second. As in the D-2 (NTSC) format, in the D-3 format the video signal is sampled at four times the color subcarrier frequency at 8 bits per sample. Four channels of 48-kHz, 16- to 20-bit PCM audio, plus other ancillary data, are inserted during the vertical blanking interval. The aggregate net (error corrected) bit rate of the D-3 format is 143 Mbits/second. Because the coding is lossless, the D-3 format can also be used in data applications. Camcorders were available which used this format, and at that time, they were the only digital tape–based camcorders to use a lossless encoding scheme.

The D-3 transport was derived from the previous Panasonic MII tape transport. D-3 (and later D-5 for high definition) tapes were available in cassettes of three sizes: small (161 mm × 96 mm × 25 mm), medium (212 mm × 124 mm × 25 mm), and large (296 mm × 167 mm × 25 mm). Format-specific recognition holes were used to adjust the spacing on the reel motors and to enable other transport-related ballistics. The maximum runtimes for D-3 (using the Fujifilm videotape product line) are 50, 126, and 248 min, respectively, for the three cassette footprints.

Component Digital Recording Formats

On the other hand, for the highest-end production, component digital (D-1) could be layered ostensibly forever. The very nature of component digital recording, sampled as Y′CBCR at 4:2:2, removed the NTSC footprints and kept the constantly degrading properties of encoding and decoding the chroma subcarrier away from the video image. You could not visually recognize the degradation of the video after upwards of 10–20 digitally rendered (copied) D-1 to D-1 replications. This technology was truly revolutionary and was soon recognized as the benchmark for premium digital quality, still known today as "full bandwidth" for 525-line (and 625-line) standard definition. All the confusion of early D-2 digital was indeed unfortunate. Even for those digital production houses that used D-2 composite digital disk recorders, the images still retained those venerable NTSC footprint artifacts. It wasn't until the emergence of MPEG encoding (which uses the component digital quantizing structures) that the meaning of "digital" would take on a more definitive sense.

D-5 Formats

Panasonic first introduced the D-5 professional digital video format in 1994. The D-5 format is an uncompressed digital component system (10 bits) using the same half-inch videotapes as the Panasonic D-3 digital composite format. A 120-min D-3 tape records 60 minutes in D-5/D-5 HD mode.
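The halved running time is a straightforward consequence of the data rates involved. A rough sketch, comparing the 10-bit Rec. 601 component rate with the 143 Mbits/second composite rate of D-3 (audio and error-correction overheads are ignored here):

# Why a 120-min D-3 cassette yields roughly 60 min in D-5 mode.
component_rate = 27e6 * 10 / 1e6   # 270 Mbits/second (Rec. 601 at 10 bits)
composite_rate = 143.0             # Mbits/second (4fSC composite, D-3)
ratio = component_rate / composite_rate
print(f"~{ratio:.1f}x the data -> 120 min / {ratio:.1f} = {120 / ratio:.0f} min")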
Before the availability of the D-5 high-definition transports (D-5 HD), a standard-definition D-5 deck could be retrofitted to record high-definition digital video with the use of an external HD input/output interface unit. When using the outboard HD interface, conversion capabilities are removed. The HD version of the D-5 transport uses standard D-5 videotape cassettes to record HD material using intraframe compression at a 4:1 ratio. D-5 HD supports both the 1080i and the 1035i line standards at both the 60-Hz and the 59.94-Hz field rates. In addition, the 720 progressive line standards and the 1080 progressive line standards at 24, 25, and 30 frame rates are supported. Simultaneous output of HD (hi-vision signals) and SD (standard signals) can be achieved. For the 1080/23.98p, 1080/24p, and 1080/25p formats, the 1080/23.98 PsF, 1080/24 PsF, and 1080/25 PsF (progressive segmented frame) formats divide the frames into two segments every 1/47.96, 1/48, and 1/50 of a second, respectively, for interfacing.

High sound quality with a dynamic range of 100 dB is delivered by 20-bit (or 24-bit) 48-kHz sampling. Not only can recording, playback, and editing be conducted separately for each of the four (or eight) channels, but channel mixing is also possible. There is an additional channel provided for use as an analog (cue) track. With the 1080/59.94i, 480/59.94i, or 720/59.94p system format, either the four-channel 20-bit format or the eight-channel 24-bit format can be selected. However, caution should be exercised, since the recording format used for the eight-channel audio version and the one used for the four-channel audio version differ. These differences pertain to their video recording areas as well. As such, these two system formats are not mutually compatible.

The D-5 HD unit (model AJ-HD3700B) is equipped with hi-vision serial digital interfaces that meet the BTA S-004A, S-005A, and S-006A standards, enabling both video and audio signals to be transferred using the same coaxial cable. An SD (525/625-line standard definition) serial digital interface is provided for the HD or the SD operations. Shuttle search at up to 50 times normal tape speed in the forward or reverse direction is standard. The D-5 transport will run at different data rates for different formats, with a maximum data rate of 323 Mbits/second for 1080/59.94i, 720/59.94p, and 480/59.94i with eight channels of audio.

Digital S

The D-9 format, standardized by SMPTE in 1999, carried the official marketing name Digital S. The D-9 format is a professional digital videotape format developed by JVC in 1995 as a direct competitor to the Sony Digital Betacam format. The D-9 format found significant usage in Europe and Asia, although some use occurred in the United States. The tape shell for D-9 has the same form factor as the VHS cassette, with the tape being of metal particle formulation. The digital recording system uses DV compression at a 50 Mbits/second bit rate in the 4:2:2 component format at a variety of standard-definition resolutions, with up to four channels of 16-bit PCM audio at 48-kHz sampling. The standard-definition recorder allowed for both 4:3 and 16:9 aspect ratios.

D-9 High Definition

JVC developed an extension of the D-9 for high-definition recording. The D-9 HD used twice the number of recording heads and a data rate of 100 Mbits/second, supporting resolutions of 720p60, 1080i60, and 1080p24.
With the tape recording twice the data rate, only half of the recording time per cassette was achievable. For audio support, the D-9 HD model recorded eight channels of 16-bit PCM audio at a 48-kHz sample rate.

Digital Betacam

Digital Betacam, launched in 1993, supersedes both Betacam and Betacam SP and was viewed as having significantly less cost impact than the D-1 component digital format. The Digital Betacam format (commonly referred to as Digibeta, d-beta, dbc, or Digi) uses a DCT-compressed component video signal with 10-bit 4:2:2 sampling in the NTSC (720 × 486) or PAL (720 × 576) line standards at a bit rate of 90 Mbits/second. Audio is carried in four channels of uncompressed 20-bit PCM at 48-kHz sampling. There is a fifth analog audio track available as a cue channel, plus a linear timecode track. Sony implemented the SMPTE 259M serial digital interface (SDI) over coaxial cable on the Digital Betacam transports, helping facilities move to digital on their existing wiring without updating their coaxial cabling infrastructures. The principal competitor to Digital Betacam is Panasonic's DVCPRO50 format, which uses 8-bit luma and color sampling at a 50-Mbits/second data rate.

HDCAM

HDCAM, introduced by Sony in 1997, was the first HD format available in the Betacam (1/2 in.) videotape form factor. The compression format, proprietary to Sony, uses an 8-bit discrete cosine transform (DCT) algorithm to compress the video using a 3:1:1 sampling structure. The system uses a 1080-interlace-compatible downsampled resolution of 1440 × 1080 with nonsquare pixels. With the Sony HDCAM codec, the recorded 1440 × 1080 video is upsampled on playback to 1920 × 1080. With a recording video bit rate of 144 Mbits/second, there are four channels of 20-bit PCM digital audio at 48 kHz available. Sony later added 24p and 23.976 PsF modes to subsequent HDCAM models.

HDCAM SR

Using a higher-density metal particle videotape, Sony introduced the HDCAM SR (for "superior resolution") product family in 2003. The recorder is capable of capturing 10-bit video in either 4:2:2 or 4:4:4 RGB at a bit rate of 440 Mbits/second, referred to as SQ mode. This increased bit rate (vs. its HDCAM predecessor) allows HDCAM SR to capture the full bandwidth of the high-definition signal at 1920 × 1080 without the subsampling issues of HDCAM. The HDCAM SR-HQ model uses a 2× mode, allowing the recorder to capture the signal at its highest bit rate of 880 Mbits/second in a 4:4:4 RGB stream at a lower compression ratio. This mode has been used in motion picture productions that were later scanned back to 35-mm film following postproduction processes. Currently, SR is the most popular digital videotape recording format for motion picture work. HDCAM SR uses the MPEG-4 Part 2 Studio Profile for compression and has up to 12 PCM channels of 24-bit audio at 48 kHz.
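The consequence of HDCAM's 3:1:1 structure is easy to quantify. A sketch of the pre-compression payload, assuming 8-bit samples at 30 frames per second (the resulting ratio against the 144 Mbits/second video rate is approximate):

# HDCAM raster arithmetic: 1440 x 1080 luma, 3:1:1 subsampling, 8-bit.
luma_w, lines, fps, bits = 1440, 1080, 30, 8
chroma_w = luma_w // 3                       # 480 samples per chroma channel
samples_per_line = luma_w + 2 * chroma_w     # 2400 samples per line in total
raw_mbps = samples_per_line * lines * fps * bits / 1e6
print(f"pre-compression: ~{raw_mbps:.0f} Mbits/second")   # ~622
print(f"vs. 144 Mbits/second on tape: ~{raw_mbps / 144:.1f}:1 compression")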
DV Formats

DV (digital video) is a format for recording and playing back digital video, launched in 1995 through the joint efforts of the leading video camcorder producers. The original DV specification, known as the Blue Book, was standardized within the IEC 61834 family of standards, which defines the interchange, recording modulation method, magnetization, and basic system data, plus the physical properties of the tape cassettes.

DVCAM

The DVCAM format has the robustness and operability required for professional use while maintaining full compatibility with the DV recording format. Signal transfers are accomplished without manipulation of the originally recorded data using i.LINK or SDTI (QSDI™) interfaces. The DVCAM format takes into account the requirements of existing linear and nonlinear editing environments. The 15-μm track pitch assures frame-accurate and stable editing at the tape edit point. The use of this track pitch also realizes full lip-sync audio and preread capabilities. The DVCAM/DV formats use metal evaporated tapes with a tape width of 6.350 mm. While the DV format specifies two tape thicknesses (7.0 μm and 5.3 μm), the DVCAM format uses only the 7.0-μm tape thickness to achieve its professional robustness. There are two cassette sizes: the 184-min standard size (5 × 3-1/8 × 19/32 in.) and the 40-min minicassette (2-5/8 × 1-15/16 × 1/2 in.).

The sampling raster of DVCAM is the same as that of ITU-R Rec. 601. Luminance video signals are sampled at 13.5 MHz, with 720 pixels transmitted per line for both 525-60 and 625-50 systems. In the 525-60 system, each color difference signal (CR/CB) is sampled at 3.375 MHz, and 180 pixels are transmitted per line (4:1:1). The number of active lines is 480 (interlaced). In the 625-50 system, each color difference signal is sampled sequentially at 6.75 MHz; that is, 360 pixels of one color difference signal are transmitted per line (4:2:0). The number of active lines is 576 (interlaced). The sampled video data is reduced by a factor of 5:1 using bit rate reduction, resulting in a transfer rate of 25 Mbits/second. Coding is intraframe DCT (Discrete Cosine Transform) and VLC (Variable Length Coding).

To obtain a good picture quality at the 25 Mbits/second data rate, DV/DVCAM compression adopts a data shuffling technique prior to the encoding process. This allows the video to be compressed with maximum efficiency and thus keeps a well-balanced picture quality for any type of image; the data is deshuffled on playback. Because intraframe compression is used, DVCAM/DV uses two DCT modes that adapt to the amount of picture movement in a frame. The 8-8 DCT mode is selected when there is no motion and the difference between the odd and even fields is small; the 2-4-8 DCT mode is selected when there is motion and the difference between the two fields is significant.

There are two modes of audio: a four-channel mode at 32 kHz and a two-channel mode at 48 kHz. In 48-kHz mode, one channel of audio signal is recorded in each of the two audio blocks, giving one pair of stereo audio sampled at 48 kHz. The encoded data is expressed in 2's complement representation with 16-bit linear resolution. In 32-kHz (four-channel) mode, two channels of audio signal are recorded in each of the two audio blocks, giving two pairs of stereo audio sampled at 32 kHz. The encoded data is expressed in 2's complement representation with 12-bit nonlinear quantization.

Timecode, per SMPTE standards, is recorded in the audio auxiliary area (AAUX), along with closed captioning and record time and date information in the video auxiliary area (VAUX) of the coding sequences. Certain models of DVCAM will support 4:2:2 component digital video signals and four channels of digital audio signals. The playback video data is decompressed to baseband 4:1:1 (525)/4:2:0 (625) and then converted to 4:2:2 signals in the video decompression block.
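The 25 Mbits/second figure can be sanity-checked from the sampling structures just described; a sketch over the active picture only (both systems conveniently total the same raw rate):

# DV/DVCAM active-picture arithmetic, 8-bit samples.
modes = (("525-60, 4:1:1", 720 + 180 + 180, 480, 30),
         ("625-50, 4:2:0", 720 + 360, 576, 25))
for name, samples_per_line, lines, fps in modes:
    raw = samples_per_line * lines * fps * 8 / 1e6
    print(f"{name}: raw ~{raw:.0f} Mbits/s -> ~{raw / 5:.0f} Mbits/s after 5:1")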
Compressed Interface

The SDTI (QSDI™) interface (Serial Data Transfer Interface) is for transferring compressed video data, uncompressed audio data, and system data such as timecode and video and audio AUX data. SDTI (QSDI™) is useful for dubbing and for connection to nonlinear editing systems, because the video data is transferred as compressed data with no quality degradation and reduced codec delay. The SDTI (QSDI™) interface for DVCAM is standardized as SMPTE 322M.

Interconnectivity between systems with different compression formats is highly important in future operations. Certain versions of DVCAM equipment support the SDTI-CP (Serial Data Transfer Interface-Content Package) interface to feed MPEG-2-based nonlinear production systems. In order to interface with MPEG systems, the DVCAM data is first transcoded to produce MPEG-2 video elementary stream data that is then placed on the SDTI-CP interface together with audio and system data. This interface not only has the capability to feed DVCAM-sourced material but also becomes a bridge from the DV family 25 Mbits/second format to the MPEG world.

DVCPRO

DVCPRO is a professional video recording format developed by Panasonic and introduced in 1995. The format proved very successful for electronic news gathering and became a serious contender against other formats, including Sony's Betacam SP and Digital Betacam. There are three variants of the DVCPRO family.

DVCPRO25

The first of the DVCPRO family is a 25 Mbits/second version using 4:1:1 chroma subsampling for both 50 Hz and 60 Hz. Two-channel PCM audio is available only in 16 bits at 48 kHz sampling. Two extra longitudinal tracks provide audio cue and timecode. DVCPRO uses a wider track pitch of 18 μm (vs. the 10 μm of baseline DV), which reduces the chances of dropout errors. The cassette-based videotape is transported 80% faster than baseline DV, resulting in a shorter recording time. The SDTI (QSDI™) compressed digital transfer mode allows DVCPRO25 transports to be interconnected for transfers at four times faster than real time between transport decks.

DVCPRO50

This version, introduced by Panasonic in 1997, was intended for higher-value ENG and potential digital cinema applications. The DVCPRO50 not only doubles the coded video data rate to 50 Mbits/second but also reduces the recording time by half compared with base DVCPRO25. Chroma resolution is improved by using 4:2:2 chroma sampling.

DVCPRO Progressive

A progressive line scanning version was produced for use in news gathering, sports, journalism, and digital cinema. It offered 480 (NTSC) or 576 (PAL) lines of progressive scan recording using 4:2:0 chroma subsampling and four channels of PCM audio, 16 bits at 48 kHz sampling. The format offered six modes for recording and playback: 16:9 progressive (50 Mbits/second), 4:3 progressive (50 Mbits/second), 16:9 interlaced (50 Mbits/second), 4:3 interlaced (50 Mbits/second), 16:9 interlaced (25 Mbits/second), and 4:3 interlaced (25 Mbits/second). This format was superseded by the introduction of DVCPRO HD.

DVCPRO HD

This was Panasonic's introduction to HD other than the D-5 format and was originally marketed as DVCPRO100. The video coding is 4:2:2, and the format's data rate is variable, dependent on frame rate, ranging from as low as 40 Mbits/second for the 24 frame per second mode up to 100 Mbits/second for 50/60 frame per second high definition. Like DVCPRO50, DVCPRO HD uses 4:2:2 color sampling.
The DVCPRO HD format uses horizontal downsampling (a raster size reduced from the nominal broadcast raster) for both 720p and 1080i broadcast-quality high definition. For DVCPRO HD in 720p, the raster size is 960 × 720 pixels; for 1080/59.94i, it is 1280 × 1080 pixels; and for 1080/50i, it is 1440 × 1080 pixels. To maintain compatibility with HD-SDI, the DVCPRO100 equipment upsamples the video during playback. To provide compatibility and to support the feature sets of Panasonic's VariCam digital cinema format camcorders, the DVCPRO100 provides for a variable frame rate from 4 to 60 frames per second. DVCPRO HD equipment is backward compatible with the 25 Mbits/second and 50 Mbits/second DVCPRO formats and uses the same 18 μm track pitch as the other DVCPRO formats. DVCPRO HD-LP, a long-play variant, doubles the recording density by using a 9 μm track pitch. DVCPRO HD is codified in the SMPTE 370M standard, with its tape format specified in SMPTE 371M. When using the Panasonic P2 solid-state recording card option, the MXF OP-Atom format is employed.

High-Definition Video

HDV is a tape recording format that uses DV tape to record HD video. On September 30, 2003, Canon Inc., Sharp Corporation, Sony Corporation, and Victor Company of Japan, Limited (JVC) announced that the specifications for the recording and playback of high-definition video on a DV cassette tape had been established. The four companies had proposed the basic specifications for what became known as the HDV format in July 2003; the actual specifications became available in October 2003, with the intent that the format be proposed as an international standard. The HDV format would again change the complexion of both professional and consumer video recording, opening the door to high definition for all.

HDV supports both 720p (as HDV720p) and 1080i (as HDV1080i) high-definition scanning formats. Initially, Sony adopted HDV1080i for its HDV products. Later, the HDV specification was widened to include native 1080p recording capability. The HDV1080 interlaced format features 1080 lines with 1440 pixels per line, which is the same structure used by XDCAM HD and XDCAM EX. HDV compresses the full HD image by using the MPEG-2 Main Profile at High 1440 Level (MP@H-14). The 720p format records in transport stream (TS) mode, and the 1080i format records in packetized elementary stream mode. The stream interface is over IEEE 1394 (MPEG2-TS). In the 720p/60 and some 720p/50 models, the MPEG format is MP@HL. For the 1080p/25 and 1080p/30 progressive modes, as well as the 1080i/50 and 1080i/60 modes, the MPEG group of pictures (GOP) is 15 frames. For the 1080p/24 modes, the GOP is 12 frames with a 2-3 pull down cadence. HDV uses both intraframe (in which each frame is individually compressed) and interframe (which compares one frame to another, removing redundant information) compression modes. Two-channel audio is recorded as MPEG-1 Layer II at 16-bit quantization and 48 kHz sampling. There are options for both PCM (as two or four channels) and MPEG-2 Layer II (in four channels).
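One practical way to read those GOP lengths is as the spacing between clean random-access (I-frame) points, on the order of half a second of video in each mode; a small sketch:

# GOP length expressed as time between I-frames for the HDV modes above.
modes = (("1080i/60 and 1080p/30", 15, 29.97),
         ("1080i/50 and 1080p/25", 15, 25.0),
         ("1080p/24 (2-3 pull down)", 12, 23.976))
for name, gop_frames, fps in modes:
    print(f"{name}: {gop_frames} frames = {gop_frames / fps:.2f} s per GOP")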
Tapeless Digital Video

The evolution of digital video recording has progressed beyond tape-based recording. The migration to file-based workflows has mandated a means of recording on other than a linear basis. Both solid-state and optical disc–based recording methods have found significant traction in professional media recording since the early 2000s. As we move from standardized implementations of videotape formats to market-driven, not-necessarily-standardized recording media, manufacturers are pushing their development efforts toward extensible recording formats for all industries, centered on standardized video and audio compression formats. This allows the physical media to serve a variety of purposes beyond just professional or consumer video. The offerings are numerous, and only a few will be discussed in this chapter. Throughout this book, both the physical media and the recording formats will be discussed as they apply to each of the chapters' topics.

XDCAM

Sony introduced the standard-definition version of this now popular non-tape-based, compressed digital video format in 2003, using nonlinear media. This launched the era of tapeless workflow for Sony, adding features such as random access, thumbnail search, no overwriting of existing footage, and IT-network-centric capabilities not previously offered in a professional recording format. The family includes both SD and HD resolutions as XDCAM HD422, XDCAM HD, and XDCAM SD. This lineup uses an optical disc medium that Sony calls Professional Disc media, providing a storage capacity of up to 50 Gbytes. The XDCAM EX product line uses a solid-state memory card.

Optical Disc Storage Media

The Sony Professional Disc medium adopted by the XDCAM HD422, XDCAM HD, and XDCAM SD products uses blue-violet laser technology to enable extremely large storage capacity. The diameter of the Professional Disc media is 12 cm, equal to that of other optical media such as CDs and DVDs. The media is offered as a dual-layer disc (model PFD50DLA) holding 50 GB and a single-layer disc (model PFD23A) holding 23.3 GB.

Flash Memory Card Media

The SxS PRO memory card adopted by the XDCAM EX series for recording is an ultracompact nonlinear medium that uses flash memory, based on the SxS™ memory card specification. The SxS PRO memory card, combined with the moderate bit rates produced by the efficient MPEG-2 Long GOP compression, records 70 min of HD (25 Mbits/second) on a single 16-GB card. Two-card-slot systems in the XDCAM EX products can achieve up to 140 min of recording using two 16-GB memory cards in SP mode and up to 100 min in HQ mode. When a video recording spans two cards, the transition is seamless, without any artifacts or frame loss. SxS PRO memory cards, which are compatible with the ExpressCard/34 standard, can be hot-swapped while shooting without interrupting the recording, making the XDCAM EX products ideal for long-form content-production applications.

XDCAM and XDCAM EX

The XDCAM families offer both optical disc and solid-state (flash) memory recording, which propel the products into the file-based workflow domain. SxS model cards are currently available in capacities of up to 1 TB (one terabyte).

XDCAM

The standard-definition XDCAM products can record up to 50 Mbits/second with MPEG-2 4:2:2P@ML compression. The recording formats are MPEG IMX and DVCAM, with selectable bit rates for MPEG IMX of 50, 40, and 30 Mbits/second. The NTSC formats include 29.97p and 23.98p (by recording to disc at a 59.94i rate with a 2-3 pull down). The 45-min recording time of a single-layer disc, at 50 Mbits/second, includes four-channel audio. Proxies are recorded at 1.5 Mbits/second (video) and 0.5 Mbits/second (audio), plus metadata.
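The quoted recording times follow from simple capacity arithmetic. The sketch below computes the raw upper bounds from the capacities and video bit rates given above; audio channels, proxies, metadata, and filesystem overhead account for the difference from the quoted figures:

# Upper-bound recording time = media capacity / video bit rate.
media = (("Professional Disc, single layer, 50 Mbits/s", 23.3e9, 50e6, "45 min quoted"),
         ("SxS PRO 16-GB card, HD 25 Mbits/s", 16e9, 25e6, "70 min quoted"))
for name, capacity_bytes, rate_bps, quoted in media:
    minutes = capacity_bytes * 8 / rate_bps / 60
    print(f"{name}: raw ~{minutes:.0f} min ({quoted})")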
XDCAM EX

The XDCAM EX is the solid-state, flash memory card–based line configured as explained above. XDCAM EX records high-quality HD video at a data rate of up to 35 Mbits/second using MPEG-2 MP@HL (Long GOP) compression in 1080p, 1080i, and 720p scanning structures, as well as native 23.98p. Built-in HD-to-SD downconversion is available during playback. Metadata recording is supported, but proxies are not generated in the XDCAM EX products.

XDCAM HD

The XDCAM series continued to demonstrate many advantages of nonlinear recording. In response to the ever-increasing demands of video production, Sony expanded the XDCAM SD and XDCAM EX series by introducing two new HD product lines, the XDCAM HD422 and XDCAM HD. To provide more user flexibility, there is a choice of recording formats, recording bit rates, interlace or progressive modes, and optical disc or memory card recording media.

File Transfer–Based Recording Decks

The Sony PDW-1500 transport has high-speed file-transfer capabilities of 50× real time for proxies, 5× for DVCAM, and 2.5× for MPEG IMX (50 Mbits/second) over a Gigabit Ethernet connection. Metadata recording, plus the ability to write an EDL (clip list) back onto the disc, is an included feature that rounds out the file-based workflow system.

P2 Format

Introduced in 2003, P2 (for "professional plug-in") is a solid-state memory card–based recording format from Panasonic and the company's entry into file-based workflows, starting at the camcorder end. Panasonic P2 records a DVCPRO50/DVCPRO/DV signal to a PCMCIA-sized card. The PCMCIA card is actually an array of secure digital cards designed to work swiftly and in harmony to record large amounts of data. Early adopters paid a healthy price for the media (a 4-GB card costing in excess of $1600), which recorded a meager 4 minutes of HD video. P2 cards record in universally interchangeable Material eXchange Format (MXF) data files, making them immediately usable by properly configured Windows and Macintosh computers, as well as other nonlinear editing platforms compliant with MXF using the operational pattern OP-Atom.

Mixing Formats on P2 Cards

Users may freely intermix any type of footage, in high definition and standard definition, as 625 (PAL) and 525 (NTSC), as DVCPRO and AVC-Intra, and in interlaced, progressive, or variable-frame-rate footage on the same P2 card. P2 cards are viewed by the system as removable storage devices and thus are format agnostic.

P2 Card Structure

P2 cards are high-precision microcomputers with an integral processor, format firmware, and a RAID controller, and are available with multiple gigabytes of high-quality, zero-fault, solid-state memory chips. The P2 card is an intelligent device that manages the data files; its processor performs a write-verification step for every byte of memory written to the card, assuring fault-free operation. Early P2 cards were manufactured using actual secure digital (SD) memory cards in a striped RAID array, thus increasing the performance far beyond the speed of an individual memory chip. The latest generation of P2 cards dispenses with individual SD memory cards and uses the core memory components directly. P2 cards (circa 2008) are capable of transferring data at a rate of 640 Mbits/second (80 Mbytes/second), fast enough to allow real-time editing of six streams of full-bandwidth DVCPRO HD (100 Mbits/second data rate) simultaneously, as the check below illustrates. Transfer speeds are governed by the IT-storage devices and their hardware configurations.
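A one-line sanity check on the six-stream claim:

# Six full-bandwidth DVCPRO HD streams against a circa-2008 P2 card.
streams, stream_rate, card_rate = 6, 100e6, 640e6     # bits per second
print(streams * stream_rate <= card_rate)             # True: 600 <= 640 Mbits/s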
Recording times depend on the format being recorded.

PC Card Origin

The PC Card is the form factor for the P2 card. The original name came from a consortium of companies called the Personal Computer Memory Card International Association (PCMCIA). The US computer industry created the PCMCIA to challenge the Japanese JEIDA memory card devices by offering a competing standard for memory-expansion cards. In 1991, the two standards merged as JEIDA 4.1/PCMCIA 2.0 (or "PC Card").

P2 HD

The P2 HD format is compatible with PCs and existing file-based IT infrastructures. Content is recorded as independent frames, can be randomly accessed, and is easily transferred to, or archived onto, low-cost consumer media (hard drives) or other affordable current IT storage technology.

24PN Mode

A native, P2-only mode, 24PN is a special space-saving recording mode designed specifically for use with P2 cards, with no loss in quality. Normally, DVCPRO HD 720p is stored at 60 frames per second, and 24p footage is "embedded" in a 60p data stream using 2:3 pull down. P2 HD camcorders support 24p recording within a 60p data stream, but when recording to the P2 card, it is unnecessary to record the full 60p data stream. The 24PN mode records just the 24 native frames, resulting in a 2.5× (60/24) increase in recording time on any given card. The camcorder or playback device automatically takes care of inserting the 24PN data stream back into a 60p data stream using 2:3 pull down directly on playback. 24PN recordings are capable of being displayed on any conventional HD monitor without needing special conversion hardware. Neither the 24PN mode nor the 30PN mode will transmit data over the IEEE 1394 ("FireWire") interface; the interface is disabled when operating in these modes. The P2 HD camcorders transmit only SMPTE-compliant data streams over this interface, and the 24PN and 30PN recordings are unique modes that are not compliant with established streaming protocols. The 30PN mode works like the 24PN mode at 30 or 25 frames per second.

Proxy and Metadata Recording

The P2 HD system supports the creation and management of proxy files at bit rates of 192 Kbits/second, 768 Kbits/second, or 1.5 Mbits/second. Proxy information is recorded via the optional MPEG-4 encoder simultaneously with the high-resolution video that is being recorded to the same P2 card. Proxy video can be recorded to both the P2 card and a separate SD memory card, allowing the low-resolution files to be moved to a network (internal or Web-based) for logging and viewing of content ahead of transferring the high-resolution content into an editing platform. All P2 camcorders record video with some standard metadata fields, including individual camera type, camera serial number, and unique user clip ID. Additional support for up to 30 user-definable (descriptive) metadata fields such as shooter, reporter, location, scene, text memo, and GPS coordinates may be selectively added. Not all P2 HD camcorders can create or play proxy files; only camcorders that accept the optional proxy card can do so.

AVC-INTRA

Expanding on the established DVCPRO HD compression, Panasonic adopted AVC compression, in intraframe-only form, for its P2 HD camcorder products.

Advanced Video Coding

Following the completion of the MPEG-4 Visual standard, the Joint Video Team (JVT) of ITU-T VCEG and ISO/IEC MPEG was established to develop an even more efficient compression scheme.
The efforts of the JVT experts resulted in the new coding name H.264/AVC, formally known as ITU-T Recommendation H.264 and as ISO/IEC 14496-10 (MPEG-4 Part 10) Advanced Video Coding. H.264/AVC offers significant improvements in coding efficiency; however, in order to meet the demand for coding of higher fidelity video content, the first amendment of H.264/AVC, known as the Fidelity Range Extensions (FRExt), was created with new High profiles.

AVC-Intra Implementation

The AVC-Intra implementation of H.264/AVC is offered on select P2 products. These products have the capability to switch between the AVC-Intra 100 mode and the more economical AVC-Intra 50 mode. The AVC-Intra 100 mode provides the full resolution in HD using 4:2:2 studio-quality 10-bit sampling (i.e., without subsampling). The AVC-Intra 50 mode professes an economical advantage by using 4:2:0 sampling, also with 10-bit depth. The coded data size is fixed on a frame-by-frame basis, which eases the frame-accurate editing issues found in MPEG-2 Long GOP coding. AVC-Intra is not supported over IEEE 1394 (FireWire) as a live data stream.

Further Readings

A comprehensive listing of the SMPTE standards documents pertaining to videotape and the various "D" (digital) formats can be found in the Appendix, available on the companion website.