Today's Agenda
• Welcome!
• Go over Quiz 1 and terminology review
• Split up and discuss Chapters 1-3 (present your chapters)
• Break
• Lecture: Compression Basics
• Lecture: Batch Templates (review Compressor)
• Project #2
• Take breaks as needed
• Take Quiz #2
• For next class: read Chapters 4-6

Chapter Review Presentations
• What was the main point of the chapter?
• What are the key concepts?
• What is the key terminology from the chapter?
• How does it relate to compression?
• What are some questions you still have from the chapter?

Two Types of Compression: Lossless and Lossy
Lossless: eliminates only redundant bits
• Modest reductions (for images, 2-3x on average)
• Decompression reproduces the exact original data
Lossy: eliminates the least important bits as well
• Major reductions (up to 100x or more)
• Decompression reproduces only a similar copy

Compression Techniques
Compression techniques used for digital video fall into three main groups:
• General-purpose compression techniques, for any kind of data
• Intraframe compression techniques, which work on individual images
• Interframe compression techniques, which work on image sequences rather than individual images

Comparison: Lossless versus Lossy Compression
Lossless compression:
• High quality
• Larger file sizes
• Complete control of parameters
Lossy compression:
• Discards redundant (and less important) data
• Small file sizes
• Trade-off between size and quality
• Two types: spatial (intraframe) versus temporal (interframe)

Things to Consider When Compressing…
• Type of media: are you using QuickTime, MPEG-1, …?
• Frames per second: 24, 25, or 29.97
• Audio quality: noise reduction
• De-interlacing: you need to do this, otherwise your video will have artifacts (unclear outlines)
• Audio and video data rates: these determine the file size and your format for distribution
• Frame size/resolution: the smaller the frame size, the smaller the file
• Color fidelity: the accuracy of the colors in the picture, which depends on bit depth

Some Notes for Understanding Formats versus Codecs
• Formats are "container" files. They describe the type of file your project is in (e.g. MOV, AVI, WMV).
• Formats are the "wrapping" for data streams. They describe how streams are stored in the file, NOT the contents of the streams.
• Codecs describe how the data will be compressed and decompressed.
• Erroneous: "The MOV file has good video compression." Formats have nothing to do with how the video is compressed, only with how the streams are stored.
• Correct: "The data streams in this MOV file can be compressed with the H.264 codec, which will allow for a better picture."

Common Formats (Container Files)
• QuickTime (.mov)
• MPEG-1 (.MPG, .MPEG, .MPE)
• MPEG-2 (.VOB)
• MPEG-4 (.MP4)
• AVI
• DivX (.DIVX)
• Windows Media (.WMV, .WMA, .ASF, .ASX)
• Audio Interchange File Format (.AIFF, .AIF) and Windows WAVE audio (.WAV)
• Matroska (.mkv)
• Flash Video (.flv)

Common Video Codecs
• Cinepak
• Sorenson Video
• Motion-JPEG
• WMV
• MPEG-1
• MPEG-2
• MPEG-4
• H.264
• DivX
• XviD
• Intel Indeo Video

Common Audio Codecs
• PCM (pulse-code modulation)
• A-law and Mu-law
• IMA/ADPCM
• QDesign Music and Qualcomm PureVoice
• MPEG-1 audio (MP1, MP2, MP3)
• AC3
• AAC
• WMA
• Ogg Vorbis

Codecs (MPEG-1)
• MPEG-1 (Moving Picture Experts Group)
• Mainly used for CDs; produces quality comparable to a VHS tape.
• MPEG-1 only supports progressive-scan video.

Codecs (MPEG-2)
• One of the most popular codecs in use today.
• Used in satellite TV broadcasts, digital television set-top boxes, and DVDs.
• Provides widescreen support for DVDs and good quality.
• Used for standard-definition digital cable (3-15 Mbit/s) and high-definition television (15-30 Mbit/s).
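The MPEG-2 bitrates above give a feel for how much the data rate ("audio and video data rates" in the list of things to consider) drives the final file size. As a rough illustration only, here is a small Python sketch; the function name and the example numbers (6 Mbit/s video, 224 kbit/s audio) are assumptions made for the example, not figures from the slides.

```python
def estimated_file_size_mb(video_mbps, audio_kbps, duration_s):
    """Rough output size: (video bits/s + audio bits/s) * duration, in megabytes."""
    total_bits = (video_mbps * 1_000_000 + audio_kbps * 1_000) * duration_s
    return total_bits / 8 / 1_000_000   # bits -> bytes -> megabytes

# One hour of SD MPEG-2 video at 6 Mbit/s with 224 kbit/s audio (illustrative numbers)
print(round(estimated_file_size_mb(6, 224, 3600)))   # about 2801 MB, roughly 2.8 GB
```

Doubling the video data rate roughly doubles the file size, which is why the distribution format has to be decided before you pick a data rate.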
Codec (MPEG-4, Parts 2 and 10)
• Comes in two types: MPEG-4 Part 2 and MPEG-4 Part 10.
• MPEG-4 Part 2: the earlier codec; it typically offered an improvement in compression ratio over MPEG-2 and was mainly used for internet and broadcast distribution.
• MPEG-4 Part 2 was too complex, and many professionals waited for the newer version to come out, so this codec never really became popular.

Codec (MPEG-4 Part 10)
• Also known as H.264.
• Offers compression that results in significant data rate savings.
• Efficient transmission of data.
• Presets for a wide variety of applications.
• Has been adopted by many hardware platforms: Sony PlayStation, Apple's iPod, Mac OS, HD DVD, and Blu-ray.

Codec (Sorenson 3)
• Considered one of the best video compression codecs for action-based content, as opposed to CG (computer graphics) based content.
• The Sorenson 3 codec is used by Apple's QuickTime and is also very popular for creating movie trailers for the internet.

Codec (WMV)
• Stands for Windows Media Video; created by Microsoft.
• Aimed mainly at streaming video technology.
• Based on the MPEG-4 format.
• One of the most popular methods of streaming or progressive download of video over the internet.
• The newest addition is WMV HD, the high-definition version of WMV.

Codec (Flash)
• Web programmers have a choice of two codecs for displaying movies on the web in Flash format: the On2 VP6 codec and the Sorenson Spark codec.
• On2 VP6 uses an elaborate compression scheme and requires more processing power to play back its frames, but the quality is outstanding. Better for Flash movies.

Codec (DivX)
• DivX is a video codec created by DivXNetworks, now known as DivX, Inc.
• Specializes in compressing long video footage into small file sizes while the video quality is maintained.
• Based on lossy MPEG-4 Part 2 compression and available for both Mac and PC.

Some Notes
• Apple TV: a device created by Apple to stream content from different sources (YouTube, MobileMe, Flickr, iTunes, Netflix). Works similarly to an iPod.
• QVGA: Quarter VGA (320x240 pixels). VGA (640x480 pixels) is in 4:3 aspect ratio; it is the basic resolution for computers, but below SD standards.
• DV in SD can be in either 4:3 or 16:9 aspect ratio.
• 3GPP: 3rd Generation Partnership Project (telecommunications, cell phones).
• Phone networks: 1) GSM, 2) GPRS, 3) CDMA, 4) EDGE, 5) UMTS (all for mobile devices).

Interlaced versus Progressive
• NTSC: 525-line, 60 fields/30 frames per second, 60 Hz system for the transmission and display of video images.
• PAL: 625-line, 50 fields/25 frames per second, 50 Hz system.

Interlaced Scanning
• Presents the odd lines first and then the even lines.
• Developed as a result of limitations in television technology.
• Causes blur during motion sequences because of the slight delay between the fields.
• Better for broadcast compatibility.

Progressive Scanning
• Lines are scanned in sequential order.
• Less jitter during playback.
• Better for moving images.
• All computer monitors are progressive, so you must de-interlace your video before you can display it.
• Easier editing and compression.
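Since computer monitors are progressive, interlaced footage has to be de-interlaced before it can be displayed cleanly, as the slides above note. Here is a minimal Python sketch of what that involves, assuming a frame is simply a list of scan lines; the function names and the crude "bob" method are illustrative assumptions only, and real deinterlacers (weave, bob, motion-adaptive) are far more sophisticated.

```python
def split_fields(frame):
    """Split an interlaced frame (a list of scan lines) into its two fields."""
    return frame[0::2], frame[1::2]      # lines 1,3,5,... and lines 2,4,6,...

def bob_deinterlace(field):
    """Crude 'bob' deinterlace: repeat every field line to rebuild full frame height."""
    return [line for line in field for _ in range(2)]

frame = ["line%d" % i for i in range(1, 7)]      # a six-line interlaced frame
field_a, field_b = split_fields(frame)
print(field_a)                    # ['line1', 'line3', 'line5']
print(bob_deinterlace(field_a))   # each field line doubled to fill the frame height
```

Each field holds only half of the frame's vertical detail, which is why motion between the two fields shows up as the "unclear outlines" mentioned in the compression checklist.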
Aspect Ratio (Width:Height)
• 4:3: standard definition (actual ratio 1.33:1); 33% wider than it is tall.
• 16:9: HDTV, with a broader viewing field (actual ratio 1.78:1); 78% wider than it is tall.

Film does not adhere to any set standard for aspect ratio, but there are four commonly used aspect ratios for film:
• 1.37:1: close to the 4:3 SD we see on TV (pre-1950s movies)
• 1.66:1: many Disney cartoons and European movies
• 1.85:1: most American movies today (close to 1.78:1, which is the same as HDTV)
• 2.35:1: favored by many epic-movie directors (Star Wars, Lord of the Rings)

Letterboxing
• Occurs when you watch 16:9 content on a 4:3 television screen.
• Two horizontal black bands appear along the top and bottom of the screen.
• Reduces the height of the video whether the source aspect ratio is 16:9, 1.85:1, or 2.35:1; the wider the source, the taller the black bars.

Pillarboxing
• Occurs when you watch 4:3 content on a 16:9 widescreen television.
• Two vertical black bands appear on the sides of the screen.

Windowboxing
• Using both letterboxing and pillarboxing at once.

Example of Scaling

Frame Rates
• Film = 24 frames per second
• PAL = 25 frames per second
• NTSC = 30 frames per second
True frame rates:
• NTSC = 29.97 frames per second
• Film = 23.98 frames per second

Conversions
• 29.97 fps / 30 fps = 0.999 (99.9%), so NTSC runs 0.1% slower than 30 fps.
• Some extra time is required to pass on color information from the input to the output, which requires the actual frame rate to be slightly lower than 30 fps.
• Why do we need telecine? Because if you simply played a one-hour film's 24 fps frames back at NTSC's 30 fps, the film would end in 48 minutes.

Telecine
• The process by which studios add additional frames to the original film to increase the frame rate while converting film to video, so that you watch the film at the correct speed. Telecine is also the name of the device used to perform the process.
• One method of doing this is 3:2 pulldown: the first frame of the film is converted into three fields, the second frame into only two fields, and so on, alternating until the end of the film.

Compression: Chapters 1-3

Chapter 1: Seeing and Hearing
• Compression is the art of converting media into a more compact form while sacrificing the least amount of quality possible.
• How the human brain perceives images and sounds.
• Seeing: light, luminance (how bright objects appear), color, white, space, motion.
• Hearing: sounds, how the ear functions, psychoacoustics.

Chapter 1: What Is Light?
• Light is composed of particles (photons) at various frequencies; the higher the frequency, the shorter the wavelength.
• Visible light falls between 380 and 750 nanometers; red has the lowest frequency and violet the highest (think of a rainbow).
• Our eyes see reflected light: when light hits the retina, we see an image (much like a camera).
• The retina turns the light into impulses (like camera film or the CCD in a camera).

Chapter 1: The Retina
• Made of cones and rods.
• Rods are sensitive to low light and fast motion, but they detect only luminance (brightness), not chrominance (color).
• Cones detect detail and chrominance and come in three varieties, sensitive to blue, red, or green; they don't work well in low light.
• We are most sensitive to green, less to red, and least to blue (one reason we use green screens).

Chapter 1: Main Concepts of Luminance (Y)
• Most colors can be seen as a mixture of green, red, and blue.
• We perceive luminance better than color.
• Our brain processes brightness differently than color; our perception of brightness is based mainly on how much green we see in something.
• Y' = 0.587 Green + 0.299 Red + 0.114 Blue
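The weighted sum above can be written as a one-line function. This is a small illustrative sketch (the function name is an assumption), using the same weights and assuming 8-bit gamma-corrected RGB values in the 0-255 range.

```python
def luma(r, g, b):
    """Luma as the weighted sum from the slide: mostly green, some red, little blue."""
    return 0.587 * g + 0.299 * r + 0.114 * b

print(luma(255, 255, 255))   # white       -> 255.0
print(luma(0, 255, 0))       # pure green  -> 149.685 (reads as bright)
print(luma(0, 0, 255))       # pure blue   -> 29.07   (reads as dark)
```

The output shows why pure green looks far brighter to us than pure blue, and why keying is usually done against a green screen.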
What Is White?
• What we call white varies a lot with context; our brain automatically calibrates our white perception, which makes white balancing difficult.
• Outdoors and bright: about 6500 K.
• Indoor incandescent light bulbs: about 3000 K.

Persistence of Vision
• How many times per second an image needs to change for us to perceive smooth motion rather than a series of unrelated images.
• The sense of motion is achieved by playing back images that change at least 16 times per second (16 fps); anything below that looks like a slide show.

Sound
• Sound is variation in pressure: changes in air pressure create the vibrations we hear.
• The amount of air pressure change determines loudness (amplitude); how quickly the pressure changes determines frequency.
• Enharmonic overtones (percussion, explosions, door slams) are harder to compress than harmonic overtones (harmonics).

Basic Concepts
• We can see brightness better than color.
• Color perception is blurry and slow.
• We can see more detail in luminance than in color.
• We notice motion more than things standing still.
• The fewer harmonics in your audio, the better it will compress.

Chapter 2: Sampling
• Sampling is the process of breaking an image up into discrete pieces; the smaller the square, the more samples there are. Each square is a picture element, or pixel.
• Most web codecs use square pixels, in which the height and width of each pixel are equal, but DV uses non-square (rectangular) pixels.
• Sampling time: at least 15 fps to see motion, 24 fps for video and audio to appear in sync, and 50 fps for fast motion to be clear.
• Sampling sound: the sampling rate is the frequency at which loudness changes are sampled.

Chapter 2: Audio Sampling Rates
• CD: 44.1 kHz.
• Consumer audio: no more than 48 kHz.
• 96 kHz and 192 kHz are used exclusively for authoring, not for delivery.

Chapter 2: Quantization
• Quantization is the process of assigning discrete numeric values to the theoretically infinite possible values of each sample.
• 1 byte = 8 bits; 2 bytes = 16 bits.
• Most video codecs use 8 bits per channel.
• RGB: 256 levels of brightness between black and white.
• Y'CbCr: 219 levels of brightness between black and white.

Color Sampling
• 4:4:4: RGB; the most common form for Y'CbCr.
• 4:2:2: most commonly used in professional video.
• 4:2:0: the ideal color space for compressing progressive-scan video; used in all codecs for broadcast, DVD, Blu-ray, or the web. Also used for PAL DV, HDV, and AVCHD.
• 4:1:1: the color space for NTSC DV25 (DV, DVC, DVCPRO, and DVCAM). You should not encode to 4:1:1; it is for acquisition only. Encoding this way causes blocking or loss of detail.

Basic Concepts
• Resolution is a made-up concept; there is no resolution in nature.
• Video has two spatial dimensions: height and width.
• Audio has only one: loudness (the air pressure at any given moment).
• Both video and audio are sampled, but it is easier to keep audio in a high-quality (uncompressed) format because its data rates are lower.
• 8 bits is the default depth for compression for both RGB and Y'CbCr.

Chapter 3
• Spatial and temporal compression.
• The more redundancy in the content, the more it can be compressed.
• The Shannon limit: there is a limit to how small you can compress a file.
• Blocking and ringing (common compression artifacts).
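To make the "more redundancy, more compression" point concrete, here is a toy lossless coder in Python: simple run-length encoding, used purely as an illustration of redundancy (it is not one of the codecs discussed in these notes, and the sample data is made up).

```python
def run_length_encode(samples):
    """Toy lossless coder: store each run of identical values as [value, count]."""
    runs = []
    for value in samples:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1             # extend the current run
        else:
            runs.append([value, 1])      # start a new run
    return runs

flat_area = [128] * 20                   # very redundant, like a flat patch of sky
noise = [10, 200, 37, 90, 10, 255]       # little redundancy
print(run_length_encode(flat_area))      # [[128, 20]]  -- 20 samples collapse to one run
print(run_length_encode(noise))          # one run per sample -- no savings at all
```

Redundant content collapses dramatically, while noisy content does not; the Shannon limit formalizes the idea that no lossless coder can shrink data below the information it actually contains.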
Two Types of Compression: Lossless and Lossy
Lossless: eliminates only redundant bits
• Modest reductions (for images, 2-3x on average)
• Decompression reproduces the exact original data
Lossy: eliminates the least important bits as well
• Major reductions (up to 100x or more)
• Decompression reproduces only a similar copy

Comparison: Lossless versus Lossy Compression
Lossless compression:
• High quality
• Larger file sizes
• Complete control of parameters
Lossy compression:
• Discards redundant (and less important) data
• Small file sizes
• Trade-off between size and quality
• Two types: spatial (intraframe) versus temporal (interframe)

1) Spatial Compression (Lossy)
• Also called intraframe compression.
• Looks at similarities among the pixels within a single video frame, such as patterns, graphics, etc.
• As you increase the spatial compression, you decrease the data rate and file size.
• A codec that uses spatial compression will use a quality slider control.

2) Temporal Compression
• Also called interframe compression.
• Compression over time.
• Stores only the information that changes between frames.
• Looks for ways to avoid re-storing data that has not changed from one frame to another.
• The codec can reconstruct each image from the information stored in the reference frames.
• It keeps at least one uncompressed frame, known as the keyframe, to be able to reconstruct the other frames.
• You get a smaller file size and better compression than with spatial compression alone.
• However, it is only effective for video that has little movement.
• When there is a lot of movement, the result is frequent keyframe creation.
• (A toy sketch of this keyframe-plus-changes idea appears at the end of these notes.)

Keyframe
• A keyframe is a video frame that is left unprocessed by the video codec.
• It is used as a reference frame by the codec during decompression.
• The first frame is always a keyframe.

Lossy Compression: Spatial versus Temporal
Spatial (intraframe) compression:
• Exploits similarities among pixels "within" the frame.
• Preserves the data in each frame.
Temporal (interframe) compression:
• Exploits similarities "across" time.
• Keeps one keyframe uncompressed and repeats this process at scene changes.
• Creates a keyframe and then stores only the changed data over time.
• Not very effective for motion, because that would mean creating a keyframe for each change.

How Decompression Works
• How the codec reconstructs video frames from keyframes and stored changes.

Project #2: On Your Own
Available footage: Poker Footage 1 and Poker Footage 2
Transcode Poker Footage 1 and 2 (Codec: Apple Devices); all videos should have a text overlay with a description of the specs:
• Poker Footage 1 and 2 with timecode, lower left, fade in/out
• Poker Footage 1 and 2 with watermark, fade in/out
• Poker Footage 1 and 2 with color correction
• Poker Footage 1 and 2 with letterbox, scale, Panavision 2.35:1, fade in/out

Review Questions
• What windows make up the Compressor interface?
• Why is it important to set your destination?
• How do you create a new destination for your target?
• What are the steps for creating an output file?
• What types of presets are available in Compressor?
• In what window do you adjust filters? How do you do this?
• What are the differences between file formats and codecs?
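Finally, here is the toy sketch promised in the temporal-compression notes: a minimal Python illustration of interframe coding as keyframes plus stored differences. All of the names are hypothetical and the scheme is deliberately simplified; real interframe codecs use motion estimation, block-based prediction, and spatially compressed (not raw) keyframes.

```python
def encode_temporal(frames, keyframe_interval=5):
    """Toy interframe coder: store a full keyframe every N frames,
    otherwise store only the pixels that changed since the previous frame."""
    encoded, previous = [], None
    for i, frame in enumerate(frames):
        if previous is None or i % keyframe_interval == 0:
            encoded.append(("key", list(frame)))              # full frame
        else:
            changed = {j: px for j, px in enumerate(frame) if px != previous[j]}
            encoded.append(("delta", changed))                # changed pixels only
        previous = frame
    return encoded

def decode_temporal(encoded):
    """Rebuild every frame from the most recent keyframe plus the stored changes."""
    frames, current = [], None
    for kind, data in encoded:
        if kind == "key":
            current = list(data)
        else:
            current = list(current)
            for j, px in data.items():
                current[j] = px
        frames.append(current)
    return frames

clip = [[0, 0, 0, 0], [0, 9, 0, 0], [0, 9, 0, 0]]    # a mostly static 3-frame "video"
encoded = encode_temporal(clip)
assert decode_temporal(encoded) == clip              # round trip is exact in this toy
print(encoded)   # [('key', [0, 0, 0, 0]), ('delta', {1: 9}), ('delta', {})]
```

A mostly static clip encodes to one keyframe and tiny deltas, while heavy motion would make every delta nearly as large as a keyframe, which matches the slides' point that temporal compression pays off only when little changes from frame to frame.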