IMPLEMENTATION OF VIDEO COMPRESSION ALGORITHM ON THE TEXAS INSTRUMENTS VIDEO PROCESSING BOARD TMS320DM6437

Satpalsinh Parmar
B.E., Gujarat University, India, 2006

PROJECT

Submitted in partial satisfaction of the requirements for the degree of

MASTER OF SCIENCE

in

ELECTRICAL AND ELECTRONIC ENGINEERING

at

CALIFORNIA STATE UNIVERSITY, SACRAMENTO

SPRING 2010

IMPLEMENTATION OF VIDEO COMPRESSION ALGORITHM ON THE TEXAS INSTRUMENTS VIDEO PROCESSING BOARD TMS320DM6437

A Project by Satpalsinh Parmar

Approved by:

__________________________________, Committee Chair
Jing Pang, Ph.D.

__________________________________, Second Reader
Preetham Kumar, Ph.D.

____________________________
Date

Student: Satpalsinh Parmar

I certify that this student has met the requirements for format contained in the University format manual, and that this project is suitable for shelving in the Library and credit is to be awarded for the Project.

__________________________, Graduate Coordinator          ________________
Preetham Kumar, Ph.D.                                      Date
Department of Electrical and Electronic Engineering

Abstract of IMPLEMENTATION OF VIDEO COMPRESSION ALGORITHM ON THE TEXAS INSTRUMENTS VIDEO PROCESSING BOARD TMS320DM6437 by Satpalsinh Parmar

Video compression is essential for multimedia communication. Without it, fast transmission of video over the Internet is nearly impossible. The MPEG coding standard is widely used in digital video applications such as DVD and HDTV. In this project report, I discuss the basics of video compression and the different algorithms available for it, including the MPEG (Moving Picture Experts Group) compression standard in detail. In this project, I implemented an algorithm for the generation of I-frames, B-frames and P-frames, together with the DCT and quantization steps of image compression, as part of the MPEG-1 video compression flow, using MATLAB and TI's video development platform, the TMS320DM6437 DVDP. In addition, I captured video images from a camera and compressed and decompressed the I-frames, B-frames and P-frames, performing the DCT and quantization steps. The design was implemented first in MATLAB and then in the C language, and was successfully run on TI's video development platform.

_______________________, Committee Chair
Jing Pang, Ph.D.

_______________________
Date

ACKNOWLEDGEMENTS

I am very grateful to all the people who made the completion of this project possible and my graduate experience so wonderful. My deepest gratitude goes to my project advisor, Dr. Jing Pang. I am very fortunate to have had the opportunity to work under her guidance, and I wish to express my appreciation for her encouragement, invaluable knowledge, and assistance. I would like to thank Dr. Preetham Kumar for providing helpful suggestions and reviewing my report as second reader. I am thankful to the Department of Electrical and Electronic Engineering for providing MATLAB and lab facilities. Last, but not least, my parents receive my deepest gratitude and love for their dedication and many years of support during my graduate studies, which provided the foundation for this work.

TABLE OF CONTENTS
                                                                         Page
Acknowledgements..........................................................vi
List of Tables.............................................................x
List of Figures...........................................................xi
Chapter
1. INTRODUCTION............................................................1
   1.1 Need for Video Compression..........................................1
   1.2 Goal of the Project.................................................2
   1.3 Organization of Report..............................................2
2. OVERVIEW OF VIDEO COMPRESSION...........................................4
   2.1 Sampling and Quantization...........................................4
   2.2 Image and Video Characteristics.....................................6
   2.3 Image Color Formats.................................................9
       2.3.1 RGB Color Format..............................................9
       2.3.2 YCrCb Color Format...........................................10
   2.4 Image and Video Compression Standards..............................11
   2.5 JPEG Image Compression.............................................12
       2.5.1 Separating Intensity Component...............................13
       2.5.2 Forward DCT and IDCT.........................................13
       2.5.3 Quantization and Dequantization..............................14
       2.5.4 DC Coding and Zig-Zag Sequence...............................16
       2.5.5 Entropy Coding...............................................17
   2.6 JPEG Applications..................................................17
   2.7 MPEG...............................................................18
       2.7.1 MPEG-1.......................................................18
             2.7.1.1 MPEG-1 Video Encoder.................................19
             2.7.1.2 MPEG-1 Video Data Structure..........................20
   2.8 Overview of MPEG-2, MPEG-4 and MPEG-7..............................22
       2.8.1 MPEG-2.......................................................22
       2.8.2 MPEG-4.......................................................23
       2.8.3 MPEG-7.......................................................23
3. VIDEO COMPRESSION STRATEGY.............................................24
   3.1 Block Matching Algorithms..........................................25
       3.1.1 Exhaustive Search Algorithm..................................28
       3.1.2 Three Step Search Algorithm..................................28
   3.2 Prediction Error Coding............................................28
   3.3 Types of Frames....................................................30
4. MPEG ALGORITHM IMPLEMENTATION IN MATLAB AND ON TMS320DM6437 DVDP
   PLATFORM..............................................................32
   4.1 Implementation of MPEG Algorithm in MATLAB.........................32
       4.1.1 I-frame Encoding and Decoding................................35
       4.1.2 P-frame Encoding and Decoding................................36
       4.1.3 B-frame Encoding and Decoding................................37
   4.2 Implementation of MPEG Algorithm on TI's TMS320DM6437 DVDP.........38
       4.2.1 Hardware Components and Interfaces on TMS320DM6437 DVDP......38
       4.2.2 Capturing the Image on TMS320DM6437..........................40
       4.2.3 I-frame Encoding and Decoding................................41
       4.2.4 P-frame Encoding and Decoding................................42
       4.2.5 B-frame Encoding and Decoding................................43
5. CONCLUSION AND FUTURE WORK.............................................44
Appendix A. MATLAB Model for MPEG Video Compression Algorithm.............45
Appendix B. C Code for MPEG Compression Implementation on TMS320DM6437
DVDP......................................................................53
Bibliography..............................................................70

LIST OF TABLES
1. Table 2.2.1: Frame Frequencies for Different Applications...............7
2. Table 2.2.2: Bit Rates for Different Applications.......................7
3. Table 2.4.1: Compression Standards.....................................11

LIST OF FIGURES
1. Figure 2.1.1 (a): Continuous Image......................................5
2. Figure 2.1.1 (b): A Scan Line from A to B in Continuous Image...........5
3. Figure 2.1.1 (c): Sampling and Quantization.............................5
4. Figure 2.1.1 (d): Digital Scan Line.....................................5
5. Figure 2.2.1: Resolutions for Different Standards.......................6
6. Figure 2.2.2: Interlaced Scanning.......................................8
7. Figure 2.2.3: Progressive Scanning......................................8
8. Figure 2.3.2.1: 4:2:2 Subsampled YCrCb Format..........................11
9. Figure 2.5.1: Block Diagram of JPEG Encoder and Decoder................12
10. Figure 2.5.3.1: DCT and Quantization Example..........................15
11. Figure 2.5.4.1: Zig-Zag Ordering of AC Coefficients...................16
12. Figure 2.7.1.1.1: MPEG-1 Video Encoder................................19
13. Figure 2.7.1.2.1: MPEG Video Structure................................20
14. Figure 3.1.1: Motion Vector...........................................25
15. Figure 3.1.2: Block Matching Process..................................27
16. Figure 3.2.1: Reconstruction Using Motion Vector and Prediction
    Frame.................................................................29
17. Figure 3.2.2: Motion Prediction.......................................30
18. Figure 3.3.1: I, P, B Frames..........................................31
19. Figure 4.1.1: Block Diagram for Algorithm of MATLAB Code..............34
20. Figure 4.1.1.1: I Frame, DCT of I Frame, and IDCT of I Frame..........35
21. Figure 4.1.2.1: P Frame Encoding and Decoding.........................36
22. Figure 4.1.3.1: B Frame Encoding and Decoding.........................37
23. Figure 4.2.1.1: Component-Level Schematic of TMS320DM6437 DVDP........39
24. Figure 4.2.1.2: Image of TMS320DM6437 DVDP............................39
25. Figure 4.2.3.1: I-Frame and DCT of I-Frame............................41
26. Figure 4.2.3.2: Inverse DCT of I-Frame................................41
27. Figure 4.2.4.1: I-Frame and P-Frame...................................42
28. Figure 4.2.4.2: Prediction Error and Decoded P-Frame..................42
29. Figure 4.2.5.1: I-, B- and P-Frames...................................43
30. Figure 4.2.5.2: Prediction Error and Decoded B-Frame..................43

Chapter 1
INTRODUCTION

Video has become an essential source of entertainment and part of our lives. Video compression is essential for transmitting video and for storing it efficiently in limited memory. For example, transmitting uncompressed video over the Internet is nearly impossible because of the limited amount of available bandwidth.

1.1 Need for Video Compression

Video compression is needed in Internet video streaming and TV broadcasting, where multimedia signals are transmitted over channels of fixed, limited bandwidth. Video compression is also useful when multimedia data is stored on memory devices such as video CDs or hard disks, which have limited storage capacity. Stored without compression, a CD would hold well under a minute of standard-definition video. As another example, for a PAL system with standard TV resolution of 720x576 pixels, an 8-bit-per-component RGB color scheme, and a frame rate of 25 frames/second, the raw data rate is 720x576x25x8x3, or roughly 249 Mb/s. For HDTV with 1920x1080 pixels per frame, the raw data rate approaches 3 Gb/s. Such bandwidth and storage are impractical even with current powerful computer systems.[6] Fortunately, video contains a great deal of information that is redundant to a human viewer. Video compression essentially discards this unnecessary and repetitive data, allowing video information to be transmitted and stored in a compact and efficient manner. There is a tradeoff between quality and compression: if the compression ratio is increased, the data size decreases but so does the quality; a lower compression ratio retains more information, which means a larger data size but higher video quality. Several standards have been developed for video compression.

1.2 Goal of the Project

The purpose of this project is to understand the algorithms used for video compression. In this project, the MPEG-1 video compression algorithm is implemented both in MATLAB and on TI's DaVinci board, the TMS320DM6437 EVM DVDP platform.
1.3 Organization of Report

Chapter 2 gives a basic introduction to images, video, and compression. It also gives an overview of the different image and video compression standards available in industry. Chapter 3 provides details about how the video compression algorithm works. It discusses the different algorithms available for block matching, and indicates how encoder complexity and compression ratio are related. Chapter 4 contains the results of a MATLAB simulation of the video compression algorithm. It discusses the capabilities and features of the Texas Instruments image and video development platform, the TMS320DM6437 board, and shows the results obtained by implementing the MPEG algorithm on TI's DM6437 platform. Chapter 5 discusses possible directions for further development of this project. Finally, the appendices contain the MATLAB code and C code for the MPEG-1 implementation.

Chapter 2
OVERVIEW OF VIDEO COMPRESSION

Video is actually a sequence of still images; these images are displayed in succession at the desired rate, creating an illusion of motion. To achieve high compression, an efficient method for image compression is therefore required. A digital image is composed of intensity elements at particular locations, called pixels. In a grayscale image only one intensity value is assigned to each pixel. In a color image there are three color components, Red, Green and Blue, and an intensity value for each color component is assigned per pixel. For an 8-bit grayscale image, the intensity value ranges from 0 to 255. In a 24-bit color image, the intensity value of each color component ranges from 0 to 255. Video is the technology of electronically capturing, recording, processing, storing, transmitting and reconstructing a sequence of still images representing a scene in motion.

2.1 Sampling and Quantization

Evaluation of a continuous function at fixed intervals is called sampling. Digital image pixel values are sampled spatially in two dimensions, horizontal and vertical, at finite, discrete coordinate positions. Video has an additional temporal dimension to represent frames. Quantization is the procedure of constraining values from a continuous set, such as the real numbers, to a discrete set, such as the integers. Figure 2.1.1 shows a continuous image, f(x,y), that we want to convert to digital form. To convert it, we have to sample the function in both coordinates and in amplitude; an image may be continuous with respect to the x- and y-coordinates as well as in amplitude. Digitizing the coordinate values is called sampling. Digitizing the amplitude values is called quantization.

Figure 2.1.1 Generating a Digital Image: (a) Continuous Image, (b) A Scan Line from A to B in Continuous Image, (c) Sampling and Quantization, (d) Digital Scan Line

2.2 Image and Video Characteristics

The resolution of an image describes the detail the image holds. The spatial frequency at which a digital image is sampled is a good indicator of resolution; resolution can be increased by increasing the sampling frequency. The term "bit depth" describes the number of bits used to store information about each pixel of an image. The higher the bit depth, the more colors are available for storage: the bit depth of an image determines how many levels of gray (or color) can be represented. For digital video, resolution is the size of the video image measured in pixels. Figure 2.2.1 shows the resolutions of different standards.

Figure 2.2.1: Resolutions for Different Standards
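To make the amplitude quantization of section 2.1 concrete, the toy C sketch below (my own illustration, not part of the project code) maps a continuous amplitude in [0, 1] onto an n-bit level; with a bit depth of 8 this gives the 256 gray levels mentioned above.

    #include <math.h>

    /* Quantize a continuous amplitude in [0.0, 1.0] to an integer code
     * with the given bit depth; 8 bits gives 2^8 = 256 gray levels. */
    unsigned quantize_amplitude(double amplitude, int bits)
    {
        int levels = 1 << bits;                /* number of gray levels */
        int code = (int)floor(amplitude * levels);
        if (code >= levels) code = levels - 1; /* clamp amplitude = 1.0 */
        if (code < 0) code = 0;
        return (unsigned)code;
    }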
Frame rate is the frequency at which an imaging device produces unique consecutive images, called frames. The minimum frame rate needed to create the impression of motion is about 15 frames/second.

Table 2.2.1 Frame Frequencies for Different Applications

    Application        Frames/second
    Surveillance       5
    Multimedia         15
    Video Telephony    10
    PAL, SECAM         25
    NTSC               29.97
    HDTV               25-60

"Bit rate" or "data rate" is the number of bits transferred per unit time.

Table 2.2.2 Bit Rates for Different Applications

    Application                 Bit Rate
    Video Telephony             16 kbps
    Video Conferencing          128-384 kbps
    Blu-ray Disc quality        54 Mbps
    MPEG-1 (VCD quality)        1.5 Mbps
    MPEG-2 (Digital TV, DVD)    16-80 Mbps
    HDTV                        15 Mbps

Aspect ratio is the ratio of image width to height. Standard TV uses 4:3; high-definition TV uses 16:9.[1]

Two different techniques are available to render video: interlaced scanning and progressive scanning. Interlaced scanning uses techniques developed for Cathode Ray Tube (CRT) TV monitors, whose standard display is made up of 576 visible horizontal lines. Interlacing divides these into odd and even lines and refreshes them alternately, at the field rate of the standard. The slight delay between odd and even line refreshes creates some distortion or "jaggedness": only half the lines keep up with the moving image while the other half wait to be refreshed.

Figure 2.2.2 Interlaced Scanning
Figure 2.2.3 Progressive Scanning

Progressive scanning, as opposed to interlaced, scans the entire picture line by line, typically every sixtieth of a second. In other words, captured images are not split into separate fields as in interlaced scanning. Computer monitors do not need interlacing to show the picture on the screen; they display it one line at a time in order, so there is virtually no "flickering" effect. As such, in a surveillance application, progressive scanning can be critical for viewing detail within a moving image, such as a person running away. However, a high-quality monitor is required to get the best out of this type of scan.

2.3 Image Color Formats

There are mainly two types of color image format:
1) RGB color format
2) YCrCb color format

RGB stands for Red, Green, Blue. The RGB format easily generates a color scheme visible to the human eye. The RGB model is used for display on screen but not for printing; RGB is the right choice for displays on websites, PowerPoint presentations, and Acrobat .pdf files. The YCrCb color format is more convenient for image processing.[5]

2.3.1 RGB Color Format

Monitors combine these three colors to generate a color image. A color is represented by indicating how much of each ingredient of red, green, and blue should be included in it. Some examples of 24-bit representation of colors:

(255, 255, 255) represents white
(0, 0, 0) represents black
(255, 0, 0) represents red
(0, 0, 255) represents blue
(0, 255, 0) represents green

2.3.2 YCrCb Color Format

In the RGB color format, human eyes are most sensitive to green, less sensitive to red, and least sensitive to blue. It can therefore be advantageous to describe a picture in terms of luminance and chrominance. Conversion from the color components to luminance (Y) and chrominance (Cb and Cr) separates the brightness and color components. The equations for the transformation of RGB components into YCrCb components are:

    Y  = 0.299 R + 0.587 G + 0.114 B
    Cb = 0.564 (B - Y) + 128
    Cr = 0.713 (R - Y) + 128                                   .........(1)

Several sampling arrangements are available for the YCrCb color format. In the 4:4:4 YCrCb format there are equal numbers of Y, Cr and Cb components.
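As a small illustration of equation (1), the C sketch below converts one 8-bit RGB pixel to YCbCr; the coefficients are the standard ITU-R BT.601 values used above, and the helper name rgb_to_ycbcr is mine, not something taken from the project code.

    #include <stdio.h>

    /* Convert one 8-bit RGB pixel to YCbCr using the BT.601
     * coefficients of equation (1); Cb and Cr are offset by 128
     * so that they fit in an unsigned byte. */
    static void rgb_to_ycbcr(unsigned char r, unsigned char g, unsigned char b,
                             unsigned char *y, unsigned char *cb, unsigned char *cr)
    {
        double yf = 0.299 * r + 0.587 * g + 0.114 * b;
        *y  = (unsigned char)(yf + 0.5);
        *cb = (unsigned char)(0.564 * (b - yf) + 128.5);
        *cr = (unsigned char)(0.713 * (r - yf) + 128.5);
    }

    int main(void)
    {
        unsigned char y, cb, cr;
        rgb_to_ycbcr(255, 0, 0, &y, &cb, &cr);   /* pure red */
        printf("red -> Y=%u Cb=%u Cr=%u\n", y, cb, cr);
        return 0;
    }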
In the 4:2:2 YCrCb format, by contrast, the Cr and Cb components are subsampled and presented in an interleaved fashion, as shown in figure 2.3.2.1. In this project, for the hardware implementation on the TI board, I used the interleaved YCrCb arrangement called the 4:2:2 subsampled YCrCb format: every other byte is Y, every 4th byte is a Cb, and every 4th byte is a Cr, so the byte order is Cb Y Cr Y Cb Y Cr Y. The sampling pattern is shown in the following figure.

    X O X O X O X O
    X O X O X O X O
    X O X O X O X O

    X = Cb/Cr component, O = Y component.

Figure 2.3.2.1 4:2:2 Subsampled YCrCb Format

The notation 4:2:2 indicates that the chrominance components are subsampled in a 2:1 ratio: as the figure shows, every other pixel location carries the Cb and Cr components.[5]

2.4 Image and Video Compression Standards

Table 2.4.1 Compression Standards

    STANDARD     APPLICATION
    JPEG         Still image compression
    JPEG-2000    Improved still image compression
    MPEG-1       Video on digital storage media
    MPEG-2       Digital TV
    MPEG-4       Mobile applications, HD DVDs
    H.261        Video conferencing over ISDN
    H.263        Video telephony over PSTN

2.5 JPEG Image Compression

JPEG stands for "Joint Photographic Experts Group", a committee created to form standards for the compression of still images. Digital images mostly have gradual changes in intensity over most of the image. Human eyes are less sensitive to chrominance information and can differentiate between similar shades of luminance only to a certain extent. So, to discard the information that human eyes cannot differentiate, JPEG applies the DCT (Discrete Cosine Transform) and quantization. A simplified block diagram of the JPEG encoder and decoder is shown in figure 2.5.1.[4]

Figure 2.5.1 Block Diagram of JPEG Encoder and Decoder

The steps for JPEG encoding are:
1) If the color is represented in RGB mode, translate it to YUV.
2) Divide the image into 8x8 blocks.
3) Transform the pixel information from the spatial domain to the frequency domain with the Discrete Cosine Transform.
4) Quantize the resulting values by dividing each coefficient by an integer value and rounding to the nearest integer.
5) Arrange the coefficients in zig-zag order, then run-length encode them, followed by Huffman coding.

2.5.1 Separating Intensity Component

Human eyes are more sensitive to changes in luminance than to changes in color, that is, to chrominance differences. In the JPEG file format, RGB data is translated into YCbCr data by the same transformation given in equation (1):

    Y  = 0.299 R + 0.587 G + 0.114 B
    Cb = 0.564 (B - Y) + 128
    Cr = 0.713 (R - Y) + 128                                   .........(2)

Chroma subsampling is the process whereby the color information in the image is sampled at a lower resolution than the original.

2.5.2 Forward DCT and IDCT

The forward DCT (FDCT) and inverse DCT (IDCT) are applied to each 8x8 block. The following equations are used for the FDCT and IDCT operations:

    F(u,v) = (1/4) C(u) C(v) SUM(x=0..7) SUM(y=0..7) f(x,y)
             cos[(2x+1)u*pi/16] cos[(2y+1)v*pi/16]             .........(3)

And the formula for the IDCT (Inverse Discrete Cosine Transform) is:

    f(x,y) = (1/4) SUM(u=0..7) SUM(v=0..7) C(u) C(v) F(u,v)
             cos[(2x+1)u*pi/16] cos[(2y+1)v*pi/16]             .........(4)

where C(u), C(v) = 1/sqrt(2) for u, v = 0 and C(u), C(v) = 1 otherwise.

The DCT converts the image from the spatial domain to the frequency domain. The DCT itself does not do any compression, but it converts the image into a form in which compression can be done by quantization.[4]
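As an illustrative sketch of equation (3) — not the project's implementation, which appears in the appendices — the C function below computes the 8x8 forward DCT directly from the definition. A real encoder would use a fast factored DCT, since this direct form costs on the order of N^4 operations per block.

    #include <math.h>

    #define PI 3.14159265358979323846

    /* Naive 8x8 forward DCT of equation (3), written only to make
     * the formula concrete. */
    void fdct_8x8(const double in[8][8], double out[8][8])
    {
        int u, v, x, y;
        for (u = 0; u < 8; u++) {
            for (v = 0; v < 8; v++) {
                double cu = (u == 0) ? 1.0 / sqrt(2.0) : 1.0;
                double cv = (v == 0) ? 1.0 / sqrt(2.0) : 1.0;
                double sum = 0.0;
                for (x = 0; x < 8; x++)
                    for (y = 0; y < 8; y++)
                        sum += in[x][y]
                             * cos((2 * x + 1) * u * PI / 16.0)
                             * cos((2 * y + 1) * v * PI / 16.0);
                out[u][v] = 0.25 * cu * cv * sum;
            }
        }
    }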
2.5.3 Quantization and Dequantization

After the FDCT, each of the 64 DCT coefficients is quantized using a 64-element quantization table, which is supplied by the user as an input to the encoder. The elements of the quantization table can be any integer from 1 to 255, and each specifies the quantizer step size for the corresponding DCT coefficient. The purpose of quantization is to throw away unnecessary precision that is not visually significant. The information discarded in quantization cannot be retrieved, which makes quantization the main source of precision loss in DCT-based encoders. Quantization is defined as division of each DCT coefficient by its corresponding quantizer step size, followed by rounding to the nearest integer:

    F_Q(u,v) = IntegerRound( F(u,v) / Q(u,v) )                 .........(5)

The output is thus normalized by the quantizer step size. Dequantization is the inverse function: the normalization is removed by multiplying by the step size,[4]

    F'(u,v) = F_Q(u,v) * Q(u,v)                                .........(6)

Figure 2.5.3.1 DCT and Quantization Example

2.5.4 DC Coding and Zig-Zag Sequence

This step achieves additional lossless compression by encoding the quantized DCT coefficients compactly. After quantization, the DC coefficient is treated separately from the 63 AC coefficients. The DC coefficient is a measure of the average value of the 64 image samples. The quantized DC coefficient is encoded as the difference from the DC coefficient of the previous block in encoding order, because there is strong correlation between the DC coefficients of adjacent blocks. The DC coefficient contains a significant portion of the energy of the image, so this differential coding provides worthwhile compression. After this step, all the quantized AC coefficients are ordered into the "zig-zag" sequence: the zig-zag scan of the 2-D quantized coefficients orders them into a 1-D stream running from lower to higher frequency. This ordering facilitates entropy coding by placing low-frequency coefficients before high-frequency coefficients.[4]

Figure 2.5.4.1 Zig-Zag Ordering of AC Coefficients

2.5.5 Entropy Coding

Entropy coding achieves additional lossless compression by encoding the quantized DCT coefficients more compactly based on their statistical characteristics. In JPEG, two entropy coding methods are available: (1) Huffman coding and (2) arithmetic coding. Entropy coding is a two-step process. In the first step, the zig-zag sequence of quantized coefficients is converted into an intermediate sequence of symbols. The second step converts the symbols into a data stream in which the symbols no longer have externally identified boundaries. Huffman coding uses Huffman tables, defined by the application, to compress an image; the same tables are then used for decompression. These Huffman tables are predefined, or computed specifically for a given image during initialization, prior to compression. Arithmetic coding does not require tables like Huffman coding, because it adapts to the image statistics as it encodes the image. Arithmetic coding is somewhat more complex than Huffman coding for certain implementations, for example the highest-speed hardware implementations. Transcoding between the two methods is possible by simply entropy decoding with one method and entropy recoding with the other.

2.6 JPEG Applications

JPEG's application areas are the Internet, digital photography, and cinema; it also appears frequently inside MPEG video compression. JPEG achieves high compression at an acceptable quality and is the building block of MPEG video compression: DCT, quantization, and Huffman coding, all techniques used in JPEG, are also used in MPEG to compress the individual frames of a video sequence.
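Before moving on to MPEG, here is a short sketch of the zig-zag ordering of section 2.5.4: it flattens an 8x8 block of quantized coefficients into a 1-D array ordered from low to high frequency by walking the anti-diagonals. This traversal is my own illustration of figure 2.5.4.1, not code taken from the project.

    /* Flatten an 8x8 block into zig-zag order: walk the 15
     * anti-diagonals, alternating direction, so low frequencies
     * come first and high frequencies last. */
    void zigzag_8x8(const int block[8][8], int out[64])
    {
        int n = 0, d;
        for (d = 0; d < 15; d++) {
            if (d % 2 == 0) {                 /* even diagonal: go up   */
                int x = (d < 8) ? d : 7;
                int y = d - x;
                while (x >= 0 && y < 8)
                    out[n++] = block[x--][y++];
            } else {                          /* odd diagonal: go down  */
                int y = (d < 8) ? d : 7;
                int x = d - y;
                while (y >= 0 && x < 8)
                    out[n++] = block[x++][y--];
            }
        }
    }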
2.7 MPEG

The International Standards Organization (ISO) established the MPEG standard to meet the demand for exchanging coded representations of moving pictures between different platforms and applications. In 1988 ISO established the Moving Picture Experts Group (MPEG) to develop the standard. The MPEG standards define only the decoding process; the process of creating the coded bit stream from the source video material is not defined by the standard, so different encoders can use different algorithms and methods to generate the bit stream. As long as the coded bit stream conforms to the standard, it is considered a valid bit stream and should be decodable by any MPEG-compliant decoder.

2.7.1 MPEG-1

The MPEG-1 standard is optimized for digital video stored on media such as video CDs. It is designed for progressive video and does not support interlaced scanning, so it is not used for TV broadcasting. Since video is a sequence of still digital images, JPEG coding alone can compress the images; this technique is called Motion JPEG. But Motion JPEG does not take advantage of temporal redundancy, so MPEG achieves much more compression.[6]

2.7.1.1 MPEG-1 Video Encoder

The MPEG-1 video encoder block diagram is shown in figure 2.7.1.1.1. Input video is first converted from RGB to YUV format. If the given frame is an I-frame, it is simply compressed JPEG-style: the frame is passed through DCT and quantization, and then entropy coded. If it is a P-frame or B-frame, motion estimation is done with respect to a reference frame, and the prediction error and motion vector are coded.[6]

Figure 2.7.1.1.1 MPEG-1 Video Encoder

2.7.1.2 MPEG-1 Video Data Structure

The video structure is composed of the following layers:
1) Sequence layer
2) Group of Pictures layer
3) Picture layer
4) Slice layer
5) Macroblock layer
6) Block layer

Figure 2.7.1.2.1 MPEG Video Structure

1) Sequence Layer
A video is a sequence of still digital images displayed successively to create an illusion of motion.

2) Group of Pictures (GOP) Layer
The GOP (Group of Pictures) represents the number of frames between reference frames. There are three types of frames: I-, P- and B-frames. A GOP consists of an I-frame followed by B-frames and P-frames. P-frames are forward coded, while B-frames are bi-directionally coded. The GOP value (the distance between two I-frames) is user defined during encoding. The lower the GOP value, the better the response to movement, but the poorer the compression ratio. The coding order and transmission order differ from the display order: the I-frame is encoded first, the next P-frame is encoded next, and then the intermediate B-frames are encoded. The same order is followed while transmitting the data. Thus the encoding order would be 'IPBBPBBPBB', while the display order would be 'IBBPBBPBBP'.

3) Picture Layer
A GOP contains two kinds of frames: I-frames are intra-frame coded, whereas B- and P-frames are inter-frame coded.

Intra-frame (I-frame) coding
I-frames, or intra frames, are used as reference frames. Intra-frame coding uses information available within the frame only; it does not depend on any other frame, and it reduces the spatial redundancy within the frame. I-frames are coded much like JPEG images.

Inter-frame coding
P-frames and B-frames are inter-coded frames. P-frames are forward predicted from either the previous I-frame or the previous P-frame; P-frame encoding requires motion estimation. A motion vector and prediction error are generated, and only this information is encoded and transmitted. Inter-frame coding achieves more compression than intra-frame coding because it exploits temporal redundancy. B-frames carry the difference from the previous reference frame as well as the next reference frame, and thus achieve the highest compression.
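To make the GOP reordering rule above concrete, the toy sketch below converts a display-order frame pattern into coding order by emitting each reference frame (I or P) before the B-frames that depend on it. It is an illustration I added, not code from the project.

    #include <stdio.h>
    #include <string.h>

    /* Reorder a display-order GOP pattern into coding order:
     * every run of B frames is emitted after the reference
     * frame (I or P) that follows it in display order. */
    void coding_order(const char *display, char *coded)
    {
        char pending[16];
        int np = 0, n = 0;
        size_t i;
        for (i = 0; i < strlen(display); i++) {
            if (display[i] == 'B') {
                pending[np++] = 'B';  /* hold B until its forward reference */
            } else {                  /* I or P: a reference frame          */
                coded[n++] = display[i];
                while (np > 0) { coded[n++] = 'B'; np--; }
            }
        }
        coded[n] = '\0';
    }

    int main(void)
    {
        char coded[32];
        coding_order("IBBPBBPBBP", coded);
        printf("display IBBPBBPBBP -> coded %s\n", coded);  /* IPBBPBBPBB */
        return 0;
    }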
2.8 Overview of MPEG-2, MPEG-4 and MPEG-7

2.8.1 MPEG-2

MPEG-2 is an extension of MPEG-1. Its enhancements over MPEG-1 include:
- support for field-interlaced video
- better quality
- profiles and levels for flexible options
- use as the digital TV broadcasting standard
- several bit rates and compression ratios

2.8.2 MPEG-4

Multimedia and web compression use MPEG-4. In MPEG-4, individual objects are tracked separately and compressed together. MPEG-4 provides very high scalability, from low bit rates to high bit rates. It supports very low bandwidth applications, such as mobile applications; on the other hand, it also supports applications with extremely high quality and generous bandwidth, such as HD DVD and Blu-ray discs.[6]

2.8.3 MPEG-7

MPEG-7 is a multimedia content description standard. It supports search algorithms for images or audio tunes in a database.

Chapter 3
VIDEO COMPRESSION STRATEGY

Two types of redundancy make compression possible: redundancy between pixels, called spatial redundancy, and redundancy between frames, called temporal redundancy. Spatial, or intra-frame, redundancy is removed using the techniques of DCT, quantization, and entropy coding. Temporal, or inter-frame, redundancy is removed using motion estimation. An MPEG video is a sequence of still digital images, and two successive frames in a video differ only slightly. Motion estimation is the process that tracks such changes in order to reduce this inter-frame redundancy: a macroblock in a picture is best correlated to a macroblock in the previous or next picture displaced by the estimated amount of motion. The motion vector stores this amount of motion; correlation with a previous image is stored in a forward motion vector, and correlation with a future image in a backward motion vector. The background of an image generally does not change position, while objects in the foreground move a little relative to the reference frame. Motion estimation determines the displacement of an object in the frame, and this displacement with respect to the reference frame is coded. Many motion estimation algorithms have been implemented in industry; block matching algorithms are the most effective and widely used.

3.1 Block Matching Algorithms

Objects in one frame of a video sequence move within the frame to form the corresponding objects in the subsequent frame, while the background generally remains the same. To find a matching block, each block of the current frame is compared with the past or future frame within a search area. First, the current frame is divided into smaller macroblocks (16x16). These are then compared with the corresponding macroblocks and their adjacent neighbors in the previous or future frame. The MAD (mean absolute difference) is calculated at each position in the search area, and the macroblock with the least cost is the one that matches the current block most closely. A vector is found that indicates the movement of the current block with respect to the matching block in the future or past frame; this vector is called the motion vector. Motion vectors for every macroblock are found in raster-scan order and stored.

Figure 3.1.1 Motion Vector

    MAD = (1/N^2) SUM(i=0..N-1) SUM(j=0..N-1) |C_ij - R_ij|    .........(7)

    MSE = (1/N^2) SUM(i=0..N-1) SUM(j=0..N-1) (C_ij - R_ij)^2  .........(8)

where N = 16 and C_ij and R_ij are the pixels of the current and reference macroblocks being compared. Mean Absolute Difference (MAD) and Mean Squared Error (MSE) are the measures most used for macroblock matching.
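Equation (7) translates directly into C, as in the short sketch below. The project's own MAD routines, in MATLAB and C, appear in the appendices; the frame layout assumed here is a simple width-stride array of 8-bit luminance samples.

    #include <stdlib.h>

    #define N 16   /* macroblock size */

    /* Mean absolute difference (equation 7) between an NxN current
     * block at (ci,cj) and a reference block at (ri,rj). */
    double mad(const unsigned char *cur, const unsigned char *ref,
               int stride, int ci, int cj, int ri, int rj)
    {
        int i, j, sum = 0;
        for (i = 0; i < N; i++)
            for (j = 0; j < N; j++)
                sum += abs(cur[(ci + i) * stride + (cj + j)]
                         - ref[(ri + i) * stride + (rj + j)]);
        return (double)sum / (N * N);
    }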
Block size, search area, and motion vector accuracy define the quality of the prediction. A matching block is more likely to be found if the search area is larger, but a larger search area slows down the encoding process. The usual block size is 16x16. Figure 3.1.2 shows the block matching process with a block size of NxN and a search range of p. Block matching is very time consuming; many faster search algorithms are available that save computation time while still providing a good match.[6]

Figure 3.1.2 Block Matching Process

3.1.1 Exhaustive Search Algorithm (ES)

This is the most basic search algorithm. It searches every possible position in the search area and provides the best match possible; however, it is extremely computation-intensive and very slow, and the computation time grows quadratically with the search range. For example, with a macroblock size of 16 and a search range of +/-7, a total of (2x7+1)^2 = 225 MAD computations are required per block.[6]

3.1.2 Three Step Search Algorithm

This algorithm does not search all possible locations. It starts at the center location with an initial step size, usually S = 4, and finds the cost at the 9 locations spaced S apart around it. The location with the least cost is set as the new origin, and the cost is found again at the 9 locations around it with S = 4/2 = 2. The least-cost location is again set as the origin, and the cost is searched at 9 locations with S = 2/2 = 1. The final least-cost location is selected as the best match and its motion vector is coded. Because TSS takes far fewer steps, it is much faster than ES.[6] A compact sketch of this search appears at the end of the chapter.

3.2 Prediction Error Coding

The motion vector alone is not sufficient to regenerate the predicted macroblock, so the difference between the predicted macroblock and the original macroblock, known as the prediction error, is also coded along with the motion vector. At the decoder, with the help of the motion vector, the predicted frame is generated; adding the prediction error to this predicted frame exactly regenerates the original image.[6]

Figure 3.2.1 Reconstruction Using Motion Vector and Prediction Frame

Motion estimation uses the motion vector to store the movement of each macroblock, so the amount of data to be stored for the image difference is greatly reduced. Figure 3.2.2 shows that the motion-predicted difference image stores far less data than the plain difference image; hence motion estimation is a very efficient means of data compression.[6]

Figure 3.2.2 Motion Prediction (Frame N, Frame N+1, difference image, and motion-predicted difference image)

3.3 Types of Frames

Frames are classified into three types:
1) I-frames (Intra)
2) P-frames (Forward Predicted)
3) B-frames (Bi-Directional Prediction)

I-frames are compressed using the JPEG techniques. I-frames do not have any reference frame; they serve as reference frames for B-frames and P-frames. I-frame compression is not as high as that of B- and P-frames, because it exploits only spatial redundancy and compresses the frame as it is. P-frames are forward-predicted frames: they are predicted with reference to a previous frame, either a P-frame or an I-frame. In P-frame coding, only the motion vector and prediction error are stored, instead of intra-coding the whole frame. B-frames are bi-predicted frames: they are predicted with reference to a previous frame as well as a future frame.

Figure 3.3.1 I, P, B Frames
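As promised in section 3.1.2, here is a compact sketch of the three-step search, written around a mad() cost helper with the signature of the earlier sketch in this chapter. Both the helper and the assumption that every probe stays inside the frame are mine; the project itself uses the exhaustive search instead.

    /* assumed cost helper, as sketched after equation (7) */
    double mad(const unsigned char *cur, const unsigned char *ref,
               int stride, int ci, int cj, int ri, int rj);

    /* Three-step search around block (bi,bj): probe the 3x3 pattern
     * of points spaced S apart, recenter on the cheapest, halve S,
     * and repeat for S = 4, 2, 1. */
    void three_step_search(const unsigned char *cur, const unsigned char *ref,
                           int stride, int bi, int bj, int *mvi, int *mvj)
    {
        int ci = bi, cj = bj;        /* current search center */
        int s;
        for (s = 4; s >= 1; s /= 2) {
            int best_i = ci, best_j = cj;
            double best = mad(cur, ref, stride, bi, bj, ci, cj);
            int di, dj;
            for (di = -s; di <= s; di += s)
                for (dj = -s; dj <= s; dj += s) {
                    double c = mad(cur, ref, stride, bi, bj, ci + di, cj + dj);
                    if (c < best) { best = c; best_i = ci + di; best_j = cj + dj; }
                }
            ci = best_i;             /* recenter on the least-cost point */
            cj = best_j;
        }
        *mvi = ci - bi;              /* vertical component of motion vector   */
        *mvj = cj - bj;              /* horizontal component of motion vector */
    }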
Chapter 4
MPEG ALGORITHM IMPLEMENTATION IN MATLAB AND ON TMS320DM6437 DVDP PLATFORM

4.1 Implementation of MPEG Algorithm in MATLAB

The algorithm was first modeled and implemented in MATLAB, performing intra coding to generate the I-frame and inter coding to generate the P- and B-frames. I used the exhaustive search algorithm for motion estimation. Not all steps of JPEG compression were implemented here; only the DCT and quantization were included. Figure 4.1.1 shows a block diagram of the algorithm implemented in the MATLAB code. For the MATLAB simulations I used my own video source and captured three images as the I-frame, B-frame and P-frame. First, these images are converted into grayscale images; the I-frame is intra coded, with DCT and quantization performed on it. The forward prediction block takes the I-frame and P-frame as inputs and generates the prediction error frame and motion vector; the exhaustive search algorithm is used as the block matching algorithm for generating the prediction error. Similarly, the bidirectional prediction block takes the I-frame, B-frame and P-frame as inputs and generates the prediction error and motion vector for the B-frame; the prediction error for the B-frame is generated using both forward and backward prediction. The prediction error frames are then passed through the DCT and quantization steps. Because a prediction error frame contains only the difference between the predicted frame and the current frame, most of its pixel values are zero, so its compression ratio is high compared with that of the B-frame or P-frame itself. On the decoder side, dequantization and inverse DCT are performed on the compressed I-frame and compressed prediction error frames. Using the prediction error of the P-frame, the motion vector, and the decoded I-frame, the P-frame decoder regenerates the P-frame. Similarly, using the prediction error of the B-frame, the motion vector, and the decoded I- and P-frames, the B-frame decoder regenerates the B-frame.

Figure 4.1.1 Block Diagram for Algorithm of MATLAB Code (capture images from camera; convert to grayscale; store as I-, B- and P-frames; forward and bi-directional prediction producing motion vectors and prediction errors; DCT and quantization of the I-frame and prediction errors; dequantization and inverse DCT; P-frame and B-frame decoders; display of images on TV)

4.1.1 I-frame Encoding and Decoding

Figure 4.1.1.1 I Frame, DCT of I Frame and IDCT of I Frame

Figure 4.1.1.1 shows the I-frame, the DCT of the I-frame, and the IDCT of the I-frame. The IDCT of the I-frame is computed to verify the DCT operation.

4.1.2 P-frame Encoding and Decoding

Figure 4.1.2.1 P-frame Encoding and Decoding

Figure 4.1.2.1 shows the I-frame, the P-frame, the prediction error of the P-frame, and the decoded P-frame. From the figure we can see that the prediction error contains the difference between the P-frame and its motion-predicted version; this difference is smaller than the plain difference between the I-frame and the P-frame. So by coding the prediction error instead of the plain difference, we achieve a noticeably better compression ratio.
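The P-frame decoding step described above can be summarized in a few lines of C. The sketch below is a simplified, illustrative version of what the Appendix B code does for each macroblock; the 720x480 frame size and the 16x16 macroblock constant are taken from that code.

    #define W  720
    #define H  480
    #define MB 16

    /* Reconstruct one decoded P-frame macroblock at (bi,bj):
     * predicted pixels come from the decoded reference frame
     * displaced by the motion vector (k,l); the decoded
     * prediction error is then added. */
    void decode_p_block(const int ref[W][H], const int err[W][H],
                        int out[W][H], int bi, int bj, int k, int l)
    {
        int m, n;
        for (m = 0; m < MB; m++)
            for (n = 0; n < MB; n++)
                out[bi + m][bj + n] =
                    ref[bi + m + k][bj + n + l] + err[bi + m][bj + n];
    }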
4.1.3 B-frame Encoding and Decoding

Figure 4.1.3.1 B-frame Encoding and Decoding

Figure 4.1.3.1 shows the I-, B- and P-frames, the prediction error for the B-frame, and the decoded B-frame. From the figure it can be seen that, because the B-frame uses bidirectional prediction, there is very little difference between the predicted frame and the B-frame. So compression of the prediction error of the B-frame is greater than compression of the prediction error of the P-frame.

4.2 Implementation of MPEG Algorithm on TI's TMS320DM6437 DVDP

4.2.1 Hardware Components and Interfaces on TMS320DM6437 DVDP

TI's DM6437 platform has all the capabilities required for development of complex audio and video applications. Combining the capabilities of Texas Instruments' latest DSP processor core with high-quality audio and video interfaces gives the developer full development power with less time to market. Code Composer Studio v3.3 provides an Integrated Development Environment (IDE) for implementing and debugging image and video applications in the C/C++ programming language.[2]

Some features of the TMS320DM6437 DVDP:
1. TMS320C64x(TM) DSP core operating at 600 MHz.
2. 80 KB L1D and 32 KB L1P cache/SRAM, and 128 KB L2 cache/SRAM memory.
3. Two 32-bit, 133-MHz external memory interfaces (EMIFs).
4. 10/100 Ethernet media access controller (MAC), two UARTs, I2C, SPI, GPIO, McASP and three PWMs.

Figure 4.2.1.1: Component-Level Schematic of TMS320DM6437 DVDP (the DM6437 core surrounded by DDR2, SRAM and NAND flash memories; Ethernet, CAN and RS-232 interfaces; the AIC33 codec with mic in, line in and line out; video in with video decoder; and power)

Figure 4.2.1.1 shows the DM6437 DSP core; memories such as DDR2, SRAM and NAND flash; and various interfaces such as CAN, RS-232, and S-Video in/out on the periphery of the platform.

Figure 4.2.1.2: Image of TMS320DM6437 DVDP

For capturing the image and storing it, the on-board RCA-type connector was used.

4.2.2 Capturing the Image on TMS320DM6437

To capture an image from a camera I used the FVID_exchange function from the standard VPFE driver library. FVID_exchange takes the image from the camera connected to the S-Video IN port; the image captured through this function is in YCrCb format. The luminance component is separated from this image to obtain a grayscale image (a small sketch of this separation appears at the end of this chapter). After capturing the image, intra-frame or inter-frame coding is performed; these steps were explained in detail in the previous chapters. The C code attached in Appendix B performs the encoding and decoding of the I-frame, P-frame and B-frame. In this chapter I show only the results; the code in the appendix is well commented and self-explanatory.[3]

4.2.3 I-frame Encoding and Decoding

Figure 4.2.3.1 I-frame and DCT of I-frame
Figure 4.2.3.2 Inverse DCT of I-frame

Figures 4.2.3.1 and 4.2.3.2 show the results of the DCT and IDCT of the I-frame on the DM6437 DVDP.

4.2.4 P-frame Encoding and Decoding

Figure 4.2.4.1 I-frame and P-frame
Figure 4.2.4.2 Prediction Error and Decoded P-frame

Figures 4.2.4.1 and 4.2.4.2 show the display of the I- and P-frames, the prediction error, and the decoded P-frame on a TV using the DM6437 DVDP.

4.2.5 B-frame Encoding and Decoding

Figure 4.2.5.1 I-, B- and P-frames
Figure 4.2.5.2 Prediction Error and Decoded B-frame

Figures 4.2.5.1 and 4.2.5.2 show the display of the I-, B- and P-frames, the prediction error, and the decoded B-frame on a TV using the DM6437 DVDP.
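As promised in section 4.2.2, here is a small sketch of how the luminance is separated from the interleaved Cb Y Cr Y capture buffer of section 2.3.2. It mirrors the get_frame() routine of Appendix B, where the luma bytes sit at odd byte offsets in the buffer.

    #define W 720
    #define H 480

    /* Extract the Y (luminance) samples from an interleaved 4:2:2
     * Cb Y Cr Y capture buffer into a grayscale frame; the luma
     * bytes are at odd offsets (n = 1, 3, 5, ...). */
    void extract_luma(const unsigned char *buf, int gray[W][H])
    {
        int j, k, n = 1;
        for (j = 0; j < W; j++)
            for (k = 0; k < H; k++) {
                gray[j][k] = buf[n];  /* keep Y, skip the chroma byte */
                n += 2;
            }
    }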
Chapter 5
CONCLUSION AND FUTURE WORK

Video compression is essential for the transmission and storage of multimedia signals. MPEG-1 was the first widely deployed video compression standard and is still used today for distributing content on CDs. In this project, the function blocks used in the MPEG-1 algorithm were verified in MATLAB simulation, and the implementation of this video compression algorithm on hardware (TI's DM6437) was then also completed successfully. The implementation generates I-frames, B-frames and P-frames, and includes all still-image compression steps other than entropy coding. The current implementation works well on stored multimedia data; however, to make it work in real-time applications, modifications such as faster search algorithms are required. Another avenue for future work is to include the audio component in the video compression; implementing MPEG-4 or MPEG-7 would be a more challenging task.

APPENDIX A
MATLAB Model for MPEG Video Compression Algorithm

close all
clc
% read the video file frame by frame; the stored video is used as a
% pointer to access the frames one by one
[video_file,file]=uigetfile({'*.avi';'*.*'},'Open Video');
video=aviread(video_file);
nframes=size(video)
%movie(video,1,473)
% separate the I frame, B frame and P frame and convert them to
% grayscale images
I_frame=double(rgb2gray(video(1).cdata))
B_frame=double(rgb2gray(video(30).cdata))
P_frame=double(rgb2gray(video(80).cdata))
% to see what images have been captured as the I, B and P frames
%imshow(uint8(I_frame))
%imshow(uint8(B_frame))
%imshow(uint8(P_frame))
% this is the prediction function which takes I_frame and P_frame as
% input and returns the motion vector and predicted frame. Using the
% predicted frame the prediction error is generated; on the decoder side
% only the compressed prediction error and motion vector are available.
% Compression of the prediction error is much higher than for a normal
% frame, as it is all zeroes except where there is a difference between
% predicted_frame and I_frame.
% (To check how this function works in detail, look at the .m file for
% this function.)
[motion_vector_P,encoder_predicted_P_frame]=forwardprediction(I_frame,P_frame)
prediction_error_P=P_frame-encoder_predicted_P_frame
% for encoding of B_frame we consider both forward prediction and
% backward prediction and combine the two such that the generated
% prediction error frame compresses well
[forward_motion_vector,forward_predicted_B_frame]=forwardprediction(I_frame,B_frame)
forward_prediction_error=B_frame-forward_predicted_B_frame
% this generates the backward motion vector and backward prediction error
[backward_motion_vector,backward_predicted_B_frame]=backwardprediction(P_frame,B_frame)
backward_prediction_error=B_frame-backward_predicted_B_frame
% this function generates the prediction error and motion vector from
% forward prediction and backward prediction such that the resulting
% prediction error provides a much better compression ratio
% (for detail see the .m file for this function)
[motion_vector_B,prediction_error_B]=encode_B_frame(forward_motion_vector,backward_motion_vector,forward_prediction_error,backward_prediction_error)
% Now the encoder will only send the compressed I frame and the
% compressed prediction error for the P and B frames. Here I have
% performed only the DCT step of the whole compression flow.
I_frame_dct=dct_function(I_frame)
prediction_error_P_dct=dct_function(prediction_error_P)
prediction_error_B_dct=dct_function(prediction_error_B)
% quantization (not included in the C code)
quantize_prediction_error_P=quant_funct(prediction_error_P_dct)
quantize_prediction_error_B=quant_funct(prediction_error_B_dct)
quantize_I_frame=quant_funct(I_frame_dct)
%%%%%%%%%%%%%%%%%%%% decoder side %%%%%%%%%%%%%%%%%%%%
% dequantization (not included in the C code)
dequantize_prediction_error_P_dct=dequant_funct(quantize_prediction_error_P)
dequantize_prediction_error_B_dct=dequant_funct(quantize_prediction_error_B)
dequantize_I_frame_dct=dequant_funct(quantize_I_frame)
% After receiving the compressed frames the decoder decompresses them;
% here only the inverse DCT is performed.
I_frame_idct=idct_function(dequantize_I_frame_dct)
prediction_error_P_idct=idct_function(dequantize_prediction_error_P_dct)
prediction_error_B_idct=idct_function(dequantize_prediction_error_B_dct)
% Once the decompressed prediction error and motion vector are
% available, the decoder generates the P frame.
decoder_predicted_P_frame=deoderprediction(motion_vector_P,I_frame_idct)
decoded_P_frame=prediction_error_P_idct+decoder_predicted_P_frame
% Once the decompressed prediction error and motion vector are
% available, the decoder generates the B frame.
decoder_predicted_B_frame=B_frame_decoder_prediction(motion_vector_B,I_frame_idct,decoded_P_frame)
decoded_B_frame=prediction_error_B_idct+decoder_predicted_B_frame
% plot the frames from the beginning through the intermediate stages
subplot(2,7,1)
imshow(uint8(I_frame))
title('encoder I frame')
subplot(2,7,2)
imshow(uint8(P_frame))
title('encoder P frame')
subplot(2,7,3)
imshow(uint8(B_frame))
title('encoder B frame')
subplot(2,7,4)
imshow(uint8(prediction_error_P))
title('prediction error of P')
subplot(2,7,5)
imshow(uint8(prediction_error_B))
title('prediction error of B')
subplot(2,7,6)
imshow(uint8(I_frame_dct))
title('dct of I_frame')
subplot(2,7,7)
imshow(uint8(prediction_error_P_dct))
title('dct of prediction error of P')
subplot(2,7,8)
imshow(uint8(prediction_error_B_dct))
title('dct of prediction error of B')
subplot(2,7,9)
imshow(uint8(I_frame_idct))
title('inverse dct of I_frame')
subplot(2,7,10)
imshow(uint8(prediction_error_P_idct))
title('inverse dct of prediction error of P')
subplot(2,7,11)
imshow(uint8(prediction_error_B_idct))
title('inverse dct of prediction error of B')
subplot(2,7,12)
imshow(uint8(decoded_P_frame))
title('decoder P frame')
subplot(2,7,13)
imshow(uint8(decoded_B_frame))
title('decoder B frame')

%%%%%%%%%%%%%%%%%%%% Internal Functions %%%%%%%%%%%%%%%%%%%%
function mb_MAD = MAD(P_block,I_block)
global mb
mb_MAD= sum(sum(abs((P_block-I_block))))/mb^2;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function quantize_frame= quant_funct(frame)
qtz_mtrx=[16 11 10 16 24 40 51 61;
          12 12 14 19 26 58 60 55;
          14 13 16 24 40 57 69 56;
          14 17 22 29 51 87 80 62;
          18 22 37 56 68 109 103 77;
          24 35 55 64 81 104 113 92;
          49 64 78 87 103 121 120 101;
          72 92 95 98 112 100 103 99];
[m n]=size(frame);
% add an offset of 200 to eliminate negative pixel values
img_os= frame+200;
for i=1:8:m
    for j=1:8:n
        quantize_frame(i:i+7,j:j+7)=round(img_os(i:i+7,j:j+7)./qtz_mtrx);
    end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function dequantize_frame= dequant_funct(frame)
qtz_mtrx=[16 11 10 16 24 40 51 61;
          12 12 14 19 26 58 60 55;
          14 13 16 24 40 57 69 56;
          14 17 22 29 51 87 80 62;
          18 22 37 56 68 109 103 77;
          24 35 55 64 81 104 113 92;
          49 64 78 87 103 121 120 101;
          72 92 95 98 112 100 103 99];
[m n]=size(frame);
for i=1:8:m
    for j=1:8:n
        dequantize_frame_os(i:i+7,j:j+7)=frame(i:i+7,j:j+7).*qtz_mtrx;
    end
end
% remove the offset of 200 that was added to eliminate negative pixel
% values
dequantize_frame=dequantize_frame_os-200;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function frame_dct=dct_function(frame)
[m n]=size(frame);
for a=1:8:m
    for b=1:8:n
        frame_dct(a:a+7,b:b+7)=dct2(frame(a:a+7,b:b+7));
    end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function frame_idct=idct_function(frame)
[m n]=size(frame);
for a=1:8:m
    for b=1:8:n
        frame_idct(a:a+7,b:b+7)=idct2(frame(a:a+7,b:b+7));
    end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [motion_vector,new_B_frame]=forwardprediction(I_frame,B_frame)
global mb_MAD mb_MAD_3d prediction_error mb range
[m n]=size(I_frame);
threshold=16;
mb=16;
range=20;
vector_size =(m*n)/(mb^2);
motion_vector = zeros(3,vector_size);
mb_MAD=256*ones(15,15);
%-----Block Matching Algorithm-----%
count =1;
for i=1:mb:m-mb+1
    for j=1:mb:n-mb+1
        min=512;
        k1=0;
        l1=0;
        for k= -range:range
            for l = -range:range
                if ((i+k)<1||(j+l)<1||(i+k)+mb-1>m||(j+l)+mb-1>n)
                    continue;
                end
                % MAD stands for Mean Absolute Difference, which decides
                % whether two macroblocks match, depending on the allowed
                % threshold
                min_MAD=MAD(B_frame(i:i+mb-1,j:j+mb-1),I_frame((i+k):(i+k+mb-1),(j+l):(j+l+mb-1)));
                if min_MAD<min
                    min=min_MAD;
                    k1=k;
                    l1=l;
                end
            end
        end
        % compare with the threshold value to check for intra-MBs
        if min > threshold
            new_B_frame(i:i+mb-1,j:j+mb-1) = I_frame(i:i+mb-1,j:j+mb-1);
            motion_vector(1,count) = 0; % horizontal motion vector
            motion_vector(2,count) = 0; % vertical motion vector
            motion_vector(3,count) = 0;
        else
            new_B_frame(i:i+mb-1,j:j+mb-1)= I_frame((i+k1):(i+k1+mb-1),(j+l1):(j+l1+mb-1));
            motion_vector(1,count) = k1; % horizontal motion vector
            motion_vector(2,count) = l1; % vertical motion vector
            motion_vector(3,count) = 0;
        end
        count=count+1;
    end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [motion_vector,new_B_frame]=backwardprediction(I_frame,B_frame)
global mb_MAD mb_MAD_3d prediction_error mb range
[m n]=size(I_frame);
threshold=16;
mb=16;
range=20;
vector_size =(m*n)/(mb^2);
motion_vector = zeros(3,vector_size);
mb_MAD=256*ones(15,15);
%-----Block Matching Algorithm-----%
count =1;
for i=1:mb:m-mb+1
    for j=1:mb:n-mb+1
        min=512;
        k1=0;
        l1=0;
        for k= -range:range
            for l = -range:range
                if ((i+k)<1||(j+l)<1||(i+k)+mb-1>m||(j+l)+mb-1>n)
                    continue;
                end
                min_MAD=MAD(B_frame(i:i+mb-1,j:j+mb-1),I_frame((i+k):(i+k+mb-1),(j+l):(j+l+mb-1)));
                if min_MAD<min
                    min=min_MAD;
                    k1=k;
                    l1=l;
                end
            end
        end
        % compare with the threshold value to check for intra-MBs
        if min > threshold
            new_B_frame(i:i+mb-1,j:j+mb-1) = I_frame(i:i+mb-1,j:j+mb-1);
            motion_vector(1,count) = 0; % horizontal motion vector
            motion_vector(2,count) = 0; % vertical motion vector
            motion_vector(3,count) = 1;
        else
            new_B_frame(i:i+mb-1,j:j+mb-1)= I_frame((i+k1):(i+k1+mb-1),(j+l1):(j+l1+mb-1));
            motion_vector(1,count) = k1; % horizontal motion vector
            motion_vector(2,count) = l1; % vertical motion vector
            motion_vector(3,count) = 1;
        end
        count=count+1;
    end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [motion_vector,prediction_error]= encode_B_frame(forward_motion_vector,backward_motion_vector,forward_prediction_error,backward_prediction_error)
global mb
cnt=1;
mb=16;
[m n]=size(forward_prediction_error);
vector_size =(m*n)/(mb^2);
motion_vector = zeros(3,vector_size);
for i=1:mb:m-mb+1
    for j=1:mb:n-mb+1
        energy_fp=sum(sum(forward_prediction_error(i:i+mb-1,j:j+mb-1).^2))
        energy_bp=sum(sum(backward_prediction_error(i:i+mb-1,j:j+mb-1).^2))
        % the energy decides which block will provide the better
        % compression ratio; the corresponding macroblock is selected for
        % the prediction error
        if energy_fp > energy_bp
            prediction_error(i:i+mb-1,j:j+mb-1) = backward_prediction_error(i:i+mb-1,j:j+mb-1);
            motion_vector(1,cnt) = backward_motion_vector(1,cnt); % horizontal motion vector
            motion_vector(2,cnt) = backward_motion_vector(2,cnt); % vertical motion vector
            motion_vector(3,cnt) = backward_motion_vector(3,cnt);
        else
            prediction_error(i:i+mb-1,j:j+mb-1) = forward_prediction_error(i:i+mb-1,j:j+mb-1);
            motion_vector(1,cnt) = forward_motion_vector(1,cnt); % horizontal motion vector
            motion_vector(2,cnt) = forward_motion_vector(2,cnt); % vertical motion vector
            motion_vector(3,cnt) = forward_motion_vector(3,cnt);
        end
        cnt=cnt+1;
    end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function decoder_predicted_frame=deoderprediction(motion_vector,I_frame)
global mb
cnt=1;
mb=16;
[m n]=size(I_frame);
vector_size=(m*n)/(mb^2);
for i=1:mb:m-mb+1
    for j=1:mb:n-mb+1
        k1=motion_vector(1,cnt); % horizontal motion vector
        l1=motion_vector(2,cnt); % vertical motion vector
        decoder_predicted_frame(i:i+mb-1,j:j+mb-1)= I_frame((i+k1):(i+k1+mb-1),(j+l1):(j+l1+mb-1));
        cnt=cnt+1;
    end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function decoder_predicted_B_frame=B_frame_decoder_prediction(motion_vector,I_frame,P_frame)
global mb
cnt=1;
mb=16;
[m n]=size(I_frame);
vector_size=(m*n)/(mb^2);
for i=1:mb:m-mb+1
    for j=1:mb:n-mb+1
        k1=motion_vector(1,cnt); % horizontal motion vector
        l1=motion_vector(2,cnt); % vertical motion vector
        decide=motion_vector(3,cnt);
        if(decide==0)
            decoder_predicted_B_frame(i:i+mb-1,j:j+mb-1)= I_frame((i+k1):(i+k1+mb-1),(j+l1):(j+l1+mb-1));
        else
            decoder_predicted_B_frame(i:i+mb-1,j:j+mb-1)= P_frame((i+k1):(i+k1+mb-1),(j+l1):(j+l1+mb-1));
        end
        cnt=cnt+1;
    end
end

APPENDIX B
C Code for MPEG Compression Implementation on the TMS320DM6437

/*
 * ======== video_preview.c ========
 */
/* runtime include files */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <math.h>
/* BIOS include files */
#include <std.h>
#include <gio.h>
#include <tsk.h>
#include <trc.h>
/* PSP include files */
#include <psp_i2c.h>
#include <psp_vpfe.h>
#include <psp_vpbe.h>
#include <fvid.h>
#include <psp_tvp5146_extVidDecoder.h>
/* CSL include files */
#include <soc.h>
#include <cslr_sysctl.h>
/* BSL include files */
#include <evmdm6437.h>
#include <evmdm6437_dip.h>
/* Video Params Defaults */
#include <vid_params_default.h>

/* This example supports either PAL or NTSC depending on position of JP1 */
#define STANDARD_PAL  0
#define STANDARD_NTSC 1
#define FRAME_BUFF_CNT 6

/* User defined constants (Satpal) */
#define THRESHOLD 16
#define MB 16
#define RANGE 20
#define VECTOR_SIZE (720*480)/(MB*MB)

/* User defined functions (Satpal) */
void get_frame(void * frame);
void display_frame(void * frame);
void MAD(int i, int j, int k, int l, int current[720][480], int reference[720][480]);
void prediction_function(int current[][480], int reference[][480], int prediction_error[][480], short int motion_vector[][VECTOR_SIZE], short int type);
void dct(int frame[][480], int frame_dct[][480]);
void multiply1();
void multiply2();
void multiply3();
void multiply4();
void idct(int frame_dct[][480], int frame[][480]);
void energy_function(int i, int j, int current[][480]);
void copy(int i, int j, int temp[][480]);

/* User defined variables (Satpal) */
int read_frame[720][480];
int I_frame[720][480];
int P_frame[720][480];
int B_frame[720][480];
int I_frame_dct[720][480];
int I_frame_idct[720][480];
int prediction_error_P_dct[720][480];
int prediction_error_B_dct[720][480];
int prediction_error_P_idct[720][480];
int prediction_error_B_idct[720][480];
int write_frame[720][480];
int decoded_P_frame[720][480];
int decoded_B_frame[720][480];
long double energy, energy_fp, energy_bp;
int prediction_error_P[720][480];
int forward_prediction_error_B[720][480];
int backward_prediction_error_B[720][480];
int prediction_error_B[720][480];
short int forward_motion_vector_B[3][VECTOR_SIZE];
short int backward_motion_vector_B[3][VECTOR_SIZE];
short int motion_vector_B[3][VECTOR_SIZE];
int flag;
short int motion_vector_P[3][VECTOR_SIZE];
float min_MAD;
FILE * fp1;

static int read_JP1(void);

static CSL_SysctlRegsOvly sysModuleRegs = (CSL_SysctlRegsOvly)CSL_SYS_0_REGS;

/*
 * ======== main ========
 */
void main() {
    printf("Video Preview Application\n");
    fflush(stdout);
    /* Initialize BSL library to read jumper switches: */
    EVMDM6437_DIP_init();
    /* VPSS PinMuxing */
    /* CI10SEL - No CI[1:0] */
    /* CI32SEL - No CI[3:2] */
    /* CI54SEL - No CI[5:4] */
    /* CI76SEL - No CI[7:6] */
    /* CFLDSEL - No C_FIELD */
    /* CWENSEL - No C_WEN */
    /* HDVSEL - CCDC HD and VD enabled */
    /* CCDCSEL - CCDC PCLK, YI[7:0] enabled */
    /* AEAW - EMIFA full address mode */
    /* VPBECKEN - VPBECLK enabled */
    /* RGBSEL - No digital outputs */
    /* CS3SEL - LCD_OE/EM_CS3 disabled */
    /* CS4SEL - CS4/VSYNC enabled */
    /* CS5SEL - CS5/HSYNC enabled */
    /* VENCSEL - VCLK,YOUT[7:0],COUT[7:0] enabled */
    /* AEM - 8bEMIF + 8bCCDC + 8 to 16bVENC */
    sysModuleRegs -> PINMUX0 &= (0x005482A3u);
    sysModuleRegs -> PINMUX0 |= (0x005482A3u);
    /* PCIEN = 0: PINMUX1 - Bit 0 */
    sysModuleRegs -> PINMUX1 &= (0xFFFFFFFEu);
    sysModuleRegs -> VPSSCLKCTL = (0x18u);
    return;
}

/*
 * ======== video_preview ========
 */
void video_preview(void) {
    FVID_Frame *frameBuffTable[FRAME_BUFF_CNT];
    FVID_Frame *frameBuffPtr;
    GIO_Handle hGioVpfeCcdc;
    GIO_Handle hGioVpbeVid0;
    GIO_Handle hGioVpbeVenc;
    int status = 0;
    int result;
    int i;
    int standard;
    int width;
    int height;
    /* Set video display/capture driver params to defaults */
    PSP_VPFE_TVP5146_ConfigParams tvp5146Params = VID_PARAMS_TVP5146_DEFAULT;
    PSP_VPFECcdcConfigParams vpfeCcdcConfigParams = VID_PARAMS_CCDC_DEFAULT_D1;
    PSP_VPBEOsdConfigParams vpbeOsdConfigParams = VID_PARAMS_OSD_DEFAULT_D1;
    PSP_VPBEVencConfigParams vpbeVencConfigParams;
    standard = read_JP1();
    /* Update display/capture params based on video standard (PAL/NTSC) */
    if (standard == STANDARD_PAL) {
        width = 720;
        height = 576;
        vpbeVencConfigParams.displayStandard = PSP_VPBE_DISPLAY_PAL_INTERLACED_COMPOSITE;
    }
    else {
        width = 720;
        height = 480;
        vpbeVencConfigParams.displayStandard = PSP_VPBE_DISPLAY_NTSC_INTERLACED_COMPOSITE;
    }
    vpfeCcdcConfigParams.height = vpbeOsdConfigParams.height = height;
    vpfeCcdcConfigParams.width = vpbeOsdConfigParams.width = width;
    vpfeCcdcConfigParams.pitch = vpbeOsdConfigParams.pitch = width * 2;
    /* init the frame buffer table */
    for (i=0; i<FRAME_BUFF_CNT; i++) {
        frameBuffTable[i] = NULL;
    }
    /* create video input channel */
    if (status == 0) {
        PSP_VPFEChannelParams vpfeChannelParams;
        vpfeChannelParams.id = PSP_VPFE_CCDC;
        vpfeChannelParams.params = (PSP_VPFECcdcConfigParams*)&vpfeCcdcConfigParams;
        hGioVpfeCcdc = FVID_create("/VPFE0",IOM_INOUT,NULL,&vpfeChannelParams,NULL);
        status = (hGioVpfeCcdc == NULL ? -1 : 0);
    }
/*
 * ======== video_preview ========
 */
void video_preview(void) {
    FVID_Frame *frameBuffTable[FRAME_BUFF_CNT];
    FVID_Frame *frameBuffPtr;
    GIO_Handle hGioVpfeCcdc;
    GIO_Handle hGioVpbeVid0;
    GIO_Handle hGioVpbeVenc;
    int status = 0;
    int result;
    int i;
    int standard;
    int width;
    int height;

    /* Set video display/capture driver params to defaults */
    PSP_VPFE_TVP5146_ConfigParams tvp5146Params = VID_PARAMS_TVP5146_DEFAULT;
    PSP_VPFECcdcConfigParams vpfeCcdcConfigParams = VID_PARAMS_CCDC_DEFAULT_D1;
    PSP_VPBEOsdConfigParams vpbeOsdConfigParams = VID_PARAMS_OSD_DEFAULT_D1;
    PSP_VPBEVencConfigParams vpbeVencConfigParams;

    standard = read_JP1();

    /* Update display/capture params based on video standard (PAL/NTSC) */
    if (standard == STANDARD_PAL) {
        width = 720;
        height = 576;
        vpbeVencConfigParams.displayStandard = PSP_VPBE_DISPLAY_PAL_INTERLACED_COMPOSITE;
    }
    else {
        width = 720;
        height = 480;
        vpbeVencConfigParams.displayStandard = PSP_VPBE_DISPLAY_NTSC_INTERLACED_COMPOSITE;
    }
    vpfeCcdcConfigParams.height = vpbeOsdConfigParams.height = height;
    vpfeCcdcConfigParams.width = vpbeOsdConfigParams.width = width;
    vpfeCcdcConfigParams.pitch = vpbeOsdConfigParams.pitch = width * 2;

    /* init the frame buffer table */
    for (i=0; i<FRAME_BUFF_CNT; i++) {
        frameBuffTable[i] = NULL;
    }

    /* create video input channel */
    if (status == 0) {
        PSP_VPFEChannelParams vpfeChannelParams;
        vpfeChannelParams.id = PSP_VPFE_CCDC;
        vpfeChannelParams.params = (PSP_VPFECcdcConfigParams*)&vpfeCcdcConfigParams;
        hGioVpfeCcdc = FVID_create("/VPFE0",IOM_INOUT,NULL,&vpfeChannelParams,NULL);
        status = (hGioVpfeCcdc == NULL ? -1 : 0);
    }

    /* create video output channel, plane 0 */
    if (status == 0) {
        PSP_VPBEChannelParams vpbeChannelParams;
        vpbeChannelParams.id = PSP_VPBE_VIDEO_0;
        vpbeChannelParams.params = (PSP_VPBEOsdConfigParams*)&vpbeOsdConfigParams;
        hGioVpbeVid0 = FVID_create("/VPBE0",IOM_INOUT,NULL,&vpbeChannelParams,NULL);
        status = (hGioVpbeVid0 == NULL ? -1 : 0);
    }

    /* create video output channel, venc */
    if (status == 0) {
        PSP_VPBEChannelParams vpbeChannelParams;
        vpbeChannelParams.id = PSP_VPBE_VENC;
        vpbeChannelParams.params = (PSP_VPBEVencConfigParams *)&vpbeVencConfigParams;
        hGioVpbeVenc = FVID_create("/VPBE0",IOM_INOUT,NULL,&vpbeChannelParams,NULL);
        status = (hGioVpbeVenc == NULL ? -1 : 0);
    }

    /* configure the TVP5146 video decoder */
    if (status == 0) {
        result = FVID_control(hGioVpfeCcdc, VPFE_ExtVD_BASE+PSP_VPSS_EXT_VIDEO_DECODER_CONFIG, &tvp5146Params);
        status = (result == IOM_COMPLETED ? 0 : -1);
    }

    /* allocate some frame buffers */
    if (status == 0) {
        for (i=0; i<FRAME_BUFF_CNT && status == 0; i++) {
            result = FVID_allocBuffer(hGioVpfeCcdc, &frameBuffTable[i]);
            status = (result == IOM_COMPLETED && frameBuffTable[i] != NULL ? 0 : -1);
        }
    }

    /* prime up the video capture channel */
    if (status == 0) {
        FVID_queue(hGioVpfeCcdc, frameBuffTable[0]);
        FVID_queue(hGioVpfeCcdc, frameBuffTable[1]);
        FVID_queue(hGioVpfeCcdc, frameBuffTable[2]);
    }

    /* prime up the video display channel */
    if (status == 0) {
        FVID_queue(hGioVpbeVid0, frameBuffTable[3]);
        FVID_queue(hGioVpbeVid0, frameBuffTable[4]);
        FVID_queue(hGioVpbeVid0, frameBuffTable[5]);
    }

    /* grab first buffer from input queue */
    if (status == 0) {
        FVID_dequeue(hGioVpfeCcdc, &frameBuffPtr);
    }

    fp1 = fopen("I_frame.txt","w");

    /* capture three frames in display order: I (flag 0), B (flag 1), P (flag 2) */
    for(flag=0;flag<3;flag++) {
        int i,j;
        /* crude software delay between captures */
        for (i = 0;i < 100000;i++) {
            for (j = 0;j < 3000;j++) {
                continue;
            }
        }
        BCACHE_wbInvAll();
        FVID_exchange(hGioVpfeCcdc, &frameBuffPtr);
        BCACHE_wbInvAll();
        get_frame((frameBuffPtr->frame.frameBufferPtr));
        if(flag==0) {
            memcpy(I_frame,read_frame,sizeof(read_frame));
            memcpy(write_frame,I_frame,sizeof(write_frame));
            fprintf(fp1,"%d\n",flag);
        }
        else {
            if (flag==1) {
                memcpy(B_frame,read_frame,sizeof(read_frame));
                memcpy(write_frame,B_frame,sizeof(write_frame));
                fprintf(fp1,"%d\n",flag);
            }
            else {
                memcpy(P_frame,read_frame,sizeof(read_frame));
                memcpy(write_frame,P_frame,sizeof(write_frame));
                fprintf(fp1,"%d\n",flag);
            }
        }
        display_frame((frameBuffPtr->frame.frameBufferPtr));
        BCACHE_wbInvAll();
        FVID_exchange(hGioVpbeVid0, &frameBuffPtr);
        fprintf(fp1,"%d\n",(Int)TSK_time());
    }
    flag=0;
    fprintf(fp1,"done");
    fclose(fp1);

    while ( status == 0 && flag==0) {
        /* User defined variables Satpal */
        int cnt=0;
        int i,j,m,n,k1,l1;

        /* all three frames are captured; run motion estimation Satpal */
        prediction_function(P_frame,I_frame,prediction_error_P,motion_vector_P,0);
        prediction_function(B_frame,I_frame,forward_prediction_error_B,forward_motion_vector_B,0);
        prediction_function(B_frame,P_frame,backward_prediction_error_B,backward_motion_vector_B,1);

        /* generating B frame motion vector and prediction error */
        for (i=0;i<=(720-MB);i=i+MB) {
            for(j=0;j<=(480-MB);j=j+MB) {
                energy_function(i,j,forward_prediction_error_B);
                energy_fp=energy;
                energy_function(i,j,backward_prediction_error_B);
                energy_bp=energy;
                if(energy_fp>energy_bp) {
                    copy(i,j,backward_prediction_error_B);
                    motion_vector_B[0][cnt]=backward_motion_vector_B[0][cnt];
                    motion_vector_B[1][cnt]=backward_motion_vector_B[1][cnt];
                    motion_vector_B[2][cnt]=backward_motion_vector_B[2][cnt];
                }
                else {
                    copy(i,j,forward_prediction_error_B);
                    motion_vector_B[0][cnt]=forward_motion_vector_B[0][cnt];
                    motion_vector_B[1][cnt]=forward_motion_vector_B[1][cnt];
                    motion_vector_B[2][cnt]=forward_motion_vector_B[2][cnt];
                }
                cnt=cnt+1;
            }
        }
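        /*
         * Descriptive note (added for clarity): for every 16x16 macroblock the
         * B frame keeps whichever prediction error has the smaller energy,
         * i.e. the smaller sum of squared differences over the block. The
         * third motion-vector row carries the `type' flag written by
         * prediction_function(): 0 means the block is predicted from the I
         * frame, 1 means it is predicted from the P frame, and the decoder
         * below uses this flag to pick its reference.
         */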
        /* the decoder receives only this data: the DCT of the I frame and of
           the P and B prediction errors, plus the motion vectors */
        dct(I_frame,I_frame_dct);
        dct(prediction_error_P,prediction_error_P_dct);
        dct(prediction_error_B,prediction_error_B_dct);

        /* Decoder side */
        idct(I_frame_dct,I_frame_idct);
        idct(prediction_error_P_dct,prediction_error_P_idct);
        idct(prediction_error_B_dct,prediction_error_B_idct);

        /* decoding the P frame Satpal */
        cnt=0;
        for (i=0;i<=(720-MB);i=i+MB) {
            for(j=0;j<=(480-MB);j=j+MB) {
                k1=motion_vector_P[0][cnt];
                l1=motion_vector_P[1][cnt];
                for (m=0;m<MB;m++) {
                    for(n=0;n<MB;n++) {
                        decoded_P_frame[i+m][j+n]=I_frame_idct[i+m+k1][j+n+l1]+prediction_error_P_idct[i+m][j+n];
                    }
                }
                cnt=cnt+1;
            }
        }

        /* decoding the B frame */
        cnt=0;
        for (i=0;i<=(720-MB);i=i+MB) {
            for(j=0;j<=(480-MB);j=j+MB) {
                k1=motion_vector_B[0][cnt];
                l1=motion_vector_B[1][cnt];
                if(motion_vector_B[2][cnt]==0) {
                    /* forward prediction from the I frame */
                    for (m=0;m<MB;m++) {
                        for(n=0;n<MB;n++) {
                            decoded_B_frame[i+m][j+n]=I_frame_idct[i+m+k1][j+n+l1]+prediction_error_B_idct[i+m][j+n];
                        }
                    }
                }
                else {
                    /* backward prediction from the decoded P frame */
                    for (m=0;m<MB;m++) {
                        for(n=0;n<MB;n++) {
                            decoded_B_frame[i+m][j+n]=decoded_P_frame[i+m+k1][j+n+l1]+prediction_error_B_idct[i+m][j+n];
                        }
                    }
                }
                cnt=cnt+1;
            }
        }

        memcpy(write_frame,decoded_B_frame,sizeof(write_frame));
        display_frame((frameBuffPtr->frame.frameBufferPtr));

        BCACHE_wbInv((void*)(frameBuffPtr->frame.frameBufferPtr), 480*720*2, 1);
        FVID_exchange(hGioVpbeVid0, &frameBuffPtr);
    }
}

void get_frame (void * frame)
{
    int j,k, n;
    n = 1;
    /* the capture buffer holds interleaved 4:2:2 YCbCr; the luma bytes sit at
       odd offsets, so step by two to extract the Y component only */
    for (j = 0;j < 720;j++) {
        for (k = 0;k < 480;k++) {
            read_frame[j][k]= * (((unsigned char * )frame)+n);
            n=n+2;
        }
    }
}

void display_frame (void * frame)
{
    int j,k, n;
    n = 1;
    /* write the processed luma back and force the chroma bytes to 128, so the
       displayed frame is grayscale */
    for (j = 0;j < 720;j++) {
        for (k = 0;k < 480;k++) {
            * (((unsigned char * )frame)+n)=write_frame[j][k];
            * (((unsigned char * )frame)+n+1)= 128;
            n=n+2;
        }
    }
}

void MAD(int i,int j,int k,int l,int current[][480],int reference[][480])
{
    int m,n;
    /* accumulate the sum of absolute differences for displacement (k,l)
       into the global min_MAD */
    for (m=0;m<MB;m++) {
        for(n=0;n<MB;n++) {
            min_MAD = min_MAD + abs(current[i+m][j+n]-reference[i+k+m][j+l+n]);
        }
    }
}

void prediction_function(int current[][480],int reference[][480],int prediction_error[][480],short int motion_vector[][VECTOR_SIZE],short int type)
{
    int i,j,k,l,k1,l1,count,min,x,y;
    count = 0;
    k1=0;
    l1=0;
    for (i=0;i<=(720-MB);i=i+MB) {
        for(j=0;j<=(480-MB);j=j+MB) {
            min=512;
            k1=0;
            l1=0;
            /* exhaustive search over displacements within +/-RANGE */
            for(k=-RANGE;k<=RANGE;k=k+1) {
                for(l=-RANGE;l<=RANGE;l=l+1) {
                    /* skip candidate blocks that fall outside the frame */
                    if((i+k)<0||(j+l)<0||(i+k+MB)>720||(j+l+MB)>480) {
                        continue;
                    }
                    min_MAD=0;
                    MAD(i,j,k,l,current,reference);
                    if(min_MAD < min) {
                        min=min_MAD;
                        k1=k;
                        l1=l;
                    }
                }
            }
            if(min > THRESHOLD) {
                /* no acceptable match: code the plain block difference with a
                   zero motion vector */
                for (x=0;x<MB;x++) {
                    for(y=0;y<MB;y++) {
                        prediction_error[i+x][j+y]=current[i+x][j+y]-reference[i+x][j+y];
                    }
                }
                motion_vector[0][count]=0;
                motion_vector[1][count]=0;
                motion_vector[2][count]=type;
            }
            else {
                for (x=0;x<MB;x++) {
                    for(y=0;y<MB;y++) {
                        prediction_error[i+x][j+y]=current[i+x][j+y]-reference[i+x+k1][j+y+l1];
                    }
                }
                motion_vector[0][count]=k1;
                motion_vector[1][count]=l1;
                motion_vector[2][count]=type;
            }
            count=count+1;
        }
    }
}
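/*
 * Descriptive note (added for clarity): the 8x8 DCT below is computed as two
 * matrix products. With A an 8x8 pixel block, B the DCT basis matrix b[][]
 * and B^T its transpose bt[][], multiply1() forms A*B^T and multiply2() then
 * forms C = B*(A*B^T), i.e. the 2-D transform C = B A B^T. Because B is
 * (nearly) orthonormal, the inverse is A = B^T C B, which multiply3() and
 * multiply4() compute in the same two-step fashion.
 */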
/* DCT FUNCTIONS Satpal */
float a[8][8];
float ct[8][8];
float c[8][8];

/* CO-EFF Matrix */
float b[8][8] = {
    {0.353553,  0.353553,  0.353553,  0.353553,  0.353553,  0.353553,  0.353553,  0.353553},
    {0.490393,  0.415818,  0.277992,  0.097887, -0.097106, -0.277329, -0.415375, -0.490246},
    {0.461978,  0.191618, -0.190882, -0.461673, -0.462282, -0.192353,  0.190145,  0.461366},
    {0.414818, -0.097106, -0.490246, -0.278653,  0.276667,  0.490710,  0.099448, -0.414486},
    {0.353694, -0.353131, -0.354256,  0.352567,  0.354819, -0.352001, -0.355378,  0.351435},
    {0.277992, -0.490246,  0.096324,  0.4167,   -0.414486, -0.100228,  0.491013, -0.274673},
    {0.191618, -0.462282,  0.461366, -0.189409, -0.193822,  0.463187, -0.460440,  0.187195},
    {0.097887, -0.278653,  0.4167,   -0.490862,  0.489771, -0.413593,  0.274008, -0.092414}
};

float bt[8][8] = {
    {0.353553,  0.490393,  0.461978,  0.414818,  0.353694,  0.277992,  0.191618,  0.097887},
    {0.353553,  0.415818,  0.191618, -0.097106, -0.353131, -0.490246, -0.462282, -0.278653},
    {0.353553,  0.277992, -0.190882, -0.490296, -0.354256,  0.096324,  0.461366,  0.4167},
    {0.353553,  0.097887, -0.461673, -0.278653,  0.352567,  0.4167,   -0.189409, -0.490862},
    {0.353553, -0.097106, -0.462282,  0.276667,  0.354819, -0.414486, -0.193822,  0.489771},
    {0.353553, -0.277329, -0.192353,  0.490710, -0.352001, -0.100228,  0.463187, -0.413593},
    {0.353553, -0.415375,  0.190145,  0.099448, -0.355378,  0.491013, -0.460440,  0.274008},
    {0.353553, -0.490246,  0.461366, -0.414486,  0.351435, -0.274673,  0.187195, -0.092414}
};

/**********************************************************************/
/* DCT                                                                */
/* Details: 2D DCT transform                                          */
/**********************************************************************/
void dct(int frame[][480],int frame_dct[][480])
{
    int i, j;
    int m, n;
    int k, l;
    for (i=0; i<720; i = i+8) {
        for(j=0; j<480; j = j+8) {
            k = 0;
            l = 0;
            /* load the 8x8 block into a[][] */
            for(m = i; m<(i+8); m++) {
                for(n = j; n<(j+8); n++) {
                    a[k][l++] = frame[m][n];
                }
                l = 0;
                k++;
            }
            multiply1();    /* ct = a * bt */
            multiply2();    /* c  = b * ct */
            k = 0;
            l = 0;
            /* write the transformed block back */
            for (m = i; m < (i+8); m++) {
                for(n = j; n < (j+8); n++) {
                    frame_dct[m][n] = c[k][l++];
                    if(l == 8) {
                        l = 0;
                        k++;
                    }
                }
            }
        }
    }
}

void multiply1()
{
    int i, j, k;
    float sum;
    for ( i = 0; i < 8; i++) {
        for ( j = 0; j < 8; j++) {
            sum = 0;
            for (k = 0; k < 8; k++) {
                sum = sum + a[i][k] * bt[k][j];
            }
            ct[i][j] = sum;
        }
    }
}

void multiply2()
{
    int i, j, k;
    float sum;
    for ( i = 0; i < 8; i++) {
        for ( j = 0; j < 8; j++) {
            sum = 0;
            for (k = 0; k < 8; k++) {
                sum = sum + b[i][k] * ct[k][j];
            }
            c[i][j] = sum;
        }
    }
}

/* IDCT */
void idct(int frame_dct[][480],int frame[][480])
{
    int i, j;
    int m, n;
    int k, l;
    for (i=0; i<720; i = i+8) {
        for(j=0; j<480; j = j+8) {
            k = 0;
            l = 0;
            /* load the 8x8 coefficient block into a[][] */
            for(m = i; m<(i+8); m++) {
                for(n = j; n<(j+8); n++) {
                    a[k][l++] = frame_dct[m][n];
                }
                l = 0;
                k++;
            }
            multiply3();    /* ct = a * b   */
            multiply4();    /* c  = bt * ct */
            k = 0;
            l = 0;
            for (m = i; m < (i+8); m++) {
                for(n = j; n < (j+8); n++) {
                    frame[m][n] = c[k][l++];
                    if(l == 8) {
                        l = 0;
                        k++;
                    }
                }
            }
        }
    }
}

void multiply3()
{
    int i, j, k;
    float sum;
    for ( i = 0; i < 8; i++) {
        for ( j = 0; j < 8; j++) {
            sum = 0;
            for (k = 0; k < 8; k++) {
                sum = sum + a[i][k] * b[k][j];
            }
            ct[i][j] = sum;
        }
    }
}

void multiply4()
{
    int i, j, k;
    float sum;
    for ( i = 0; i < 8; i++) {
        for ( j = 0; j < 8; j++) {
            sum = 0;
            for (k = 0; k < 8; k++) {
                sum = sum + bt[i][k] * ct[k][j];
            }
            c[i][j] = sum;
        }
    }
}

void energy_function(int i,int j,int current[][480])
{
    int m,n;
    energy = 0;     /* reset the global accumulator for this macroblock */
    for (m=0;m<MB;m++) {
        for(n=0;n<MB;n++) {
            energy = energy + current[i+m][j+n]*current[i+m][j+n];
        }
    }
}

void copy(int i,int j,int temp[][480])
{
    int m,n;
    /* the selected block becomes the B-frame prediction error that is
       DCT-coded later */
    for (m=0;m<MB;m++) {
        for(n=0;n<MB;n++) {
            prediction_error_B[i+m][j+n]=temp[i+m][j+n];
        }
    }
}

/*
 * ======== read_JP1 ========
 * Read the PAL/NTSC jumper.
 * Retry, as I2C sometimes fails:
 */
static int read_JP1(void)
{
    int jp1 = -1;
    while (jp1 == -1) {
        jp1 = EVMDM6437_DIP_get(JP1_JUMPER);
        TSK_sleep(1);
    }
    return(jp1);
}
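A quick way to sanity-check the transform pair above is to confirm that idct() undoes dct() to within integer rounding. The helper below is illustrative only and is not part of the project code:

/* Illustrative self-test (hypothetical helper, added for this report's
 * reader): dct() followed by idct() should reproduce the input to within
 * rounding error, since b[][] and bt[][] are near-orthonormal DCT-II bases. */
void dct_roundtrip_check(void)
{
    int m, n, mismatches = 0;
    dct(I_frame, I_frame_dct);
    idct(I_frame_dct, I_frame_idct);
    for (m = 0; m < 720; m++) {
        for (n = 0; n < 480; n++) {
            if (abs(I_frame_idct[m][n] - I_frame[m][n]) > 1) {
                mismatches++;
            }
        }
    }
    printf("DCT/IDCT round-trip mismatches: %d\n", mismatches);
}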
BIBLIOGRAPHY

1. Dana H. Ballard and Christopher M. Brown, Computer Vision, New Jersey: Prentice-Hall, Inc., 1982.

2. Texas Instruments, "DaVinci Digital Media Processor," July 27, 2009.

3. Texas Instruments Inc., "How to Use the VPBE and VPFE Driver on TMS320DM643x," Dallas, Texas, November 2007.

4. Digital Compression and Coding of Continuous-tone Still Images, Part 1: Requirements and Guidelines, ISO/IEC JTC1 Draft International Standard 10918-1, Nov. 1991.

5. Keith Jack, "YCbCr to RGB Considerations," Intersil, March 1997.

6. Joan L. Mitchell, William B. Pennebaker, Chad E. Fogg, and Didier J. LeGall, MPEG Video Compression Standard, New York: Chapman & Hall, 1997.