ES154 Communications and Media 2001-2002
Lecture 19
UNIVERSITY OF WARWICK
School of Engineering
Compression
D.D.Udrea
Aims and Objectives
- To explain the concept of data compression and data streaming
- To give some examples of compression algorithms (by no means exhaustive)
Types of compression
As we have discussed so far, the memory and bandwidth requirements of most multimedia sources
exceed the bandwidth and storage space available on current networks and devices. Therefore the
information needs to be compressed in order to reduce its bandwidth.
Compression can be classified according to several criteria:
One criterion is the amount of information being recovered after de-compression takes place:
- lossless compression: in which the entire information is recovered
- lossy compression: in which some details from the original data are not recovered.
A second criterion is the dependency on the type of information:
- independent: this type of compression can be applied to any string of bits, irrespective of whether
they belong to a text file or an image (e.g. entropy encoding)
- dependent: this type of compression takes into account specific characteristics of the media
source and the human perception of it to eliminate minor detail (e.g. source encoding)
A third criterion is a general one, considering solely the type of information at the source.
Examples of sources are text, images, audio and video.
Text compression
All three types of text (unformatted, formatted and hypertext) are represented as strings of
characters from a defined set. They may comprise letters, numbers and signs (called alphanumeric
characters) and formatting characters (called control characters). The control characters are used to
interpret the alphanumeric characters in a certain way.
We can deduce from this that the compression algorithms used for text should be lossless, since any loss
of information may lead to misinterpretation of the original text or its formatting.
One type of lossless encoding, which is also independent of the type of information being
compressed, is entropy encoding. Entropy encoding exploits the frequency (the number of times) with
which a certain symbol appears in the entire file of data.
A particular type of entropy encoding is Huffman encoding (with different variants). Basically,
Huffman encoding assigns to every symbol a code whose length is inversely related to the number of
times the symbol appears (i.e. frequently occurring symbols are coded with shorter codes, rarely
occurring symbols with longer codes). One important restriction on the way the code is defined is
that, when decoding, the codewords must not be mistaken for one another. Therefore the rule is that
no codeword may be the beginning (prefix) of a longer codeword.
For example, if the word BANANA is to be coded without compression, just as plain text:

text             B        A        N        A        N        A
ASCII (decimal)  66       65       78       65       78       65
ASCII (hex)      42       41       4E       41       4E       41
Binary           1000010  1000001  1001110  1000001  1001110  1000001
The number of bits for the word is 6 characters x 7 bits/character = 42 bits.
If the text is to be encoded, we consider the length of the codeword inversely proportional to the
number of occurrences (or frequency of occurrence) of a particular character.
For example, A appears 3 times and thus gets the shortest codeword; N appears 2 times, a longer
codeword; and B appears once, the longest codeword.
text               B    A  N   A  N   A
Binary compressed  001  1  01  1  01  1
The number of bits in the compressed version is 10.
We have also ensured that the decoding is unambiguous, since no codeword is a prefix of another
(all codewords end with a '1').
The compression achieved is 10/42 = 0.24,
or can be expressed as 42/10 = 4.2 : 1,
or, expressed as a percentage saving, (42 - 10)/42 = 76%.
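To make this concrete, here is a minimal Python sketch of the classic Huffman construction, which repeatedly merges the two least frequent subtrees using heapq. One caveat: the codeword table above (1, 01, 001, giving 10 bits) is a valid prefix code used for illustration, but the construction below assigns BANANA codewords of lengths 1, 2 and 2, giving 9 bits; the function name is ours, not from any library.

    import heapq
    from collections import Counter

    def huffman_code(text):
        """Build a prefix-free code: frequent symbols get shorter codewords."""
        freq = Counter(text)
        # Heap entries: (subtree frequency, tie-breaker, {symbol: codeword so far})
        heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freq.items())]
        heapq.heapify(heap)
        tie = len(heap)
        while len(heap) > 1:
            f1, _, c1 = heapq.heappop(heap)   # the two least frequent subtrees
            f2, _, c2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in c1.items()}   # prepend a bit on merge
            merged.update({s: "1" + c for s, c in c2.items()})
            heapq.heappush(heap, (f1 + f2, tie, merged))
            tie += 1
        return heap[0][2]

    code = huffman_code("BANANA")
    encoded = "".join(code[ch] for ch in "BANANA")
    print(code)                          # e.g. {'A': '0', 'B': '10', 'N': '11'}
    print(len(encoded), "bits, versus 6 x 7 = 42 bits uncompressed")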
Dictionary encoding uses whole strings of characters, rather than individual characters, as the
basis for encoding. A particular example is LZW (Lempel-Ziv-Welch), in which the place of a word in
the dictionary, rather than the word itself, is transmitted. Thus, short words achieve low
compression, but long words achieve high compression.
The disadvantage is that both the compressor and the de-compressor must have (or must build
dynamically) the dictionary.
For example, if the text GO TO SCHOOL is to be coded, we have to build a dictionary:

1st place  GO          or, in alphabetical order:  1st place  GO
2nd place  TO                                      2nd place  SCHOOL
3rd place  SCHOOL                                  3rd place  TO
If we are to transmit the text in ASCII representation (7 bits/character, including spaces) we have
12 characters x 7 bits/character = 84 bits
If we are to transmit the text compressed, we should only send the place in the dictionary for each word.
Text                          GO TO SCHOOL
Compressed text               1 2 3      or, in alphabetical order: 1 3 2
Compressed text, binary form  01 10 11   or: 01 11 10

In total 3 words x 2 bits/word = 6 bits.
For a text of at most N different words,
N = 2^n
where N is the maximum number of words in the dictionary and n is the number of bits needed to code
each place.
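A minimal Python sketch of the dictionary idea, using a static, alphabetically ordered dictionary as in the example (full LZW builds its dictionary adaptively; the function name is illustrative):

    from math import ceil, log2

    def dict_encode(text):
        """Encode each word as its (1-based) place in an alphabetical dictionary."""
        words = text.split()
        dictionary = sorted(set(words))          # here: ['GO', 'SCHOOL', 'TO']
        n = ceil(log2(len(dictionary) + 1))      # bits per place (place 0 unused)
        places = [dictionary.index(w) + 1 for w in words]
        return dictionary, places, n

    dictionary, places, n = dict_encode("GO TO SCHOOL")
    print(places)                                # [1, 3, 2]
    print(len(places) * n, "bits, versus", len("GO TO SCHOOL") * 7, "bits in ASCII")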
Image compression
Images can be compressed in either a lossless or a lossy format; lossy compression is acceptable
because the eye is not very sensitive to certain colours and certain image features. However, if the
image represents a source of data (e.g. intensity levels from an apparatus or an identification
code), the format should be lossless.
There is a multitude of image formats, some of which include compression algorithms. Examples of
image formats mostly used in Web applications:
JPEG : high levels of compression on images with 16 or 24 bits of colour (lossy)
GIF : medium levels of compression on images with up to 256 colours (lossless)
PNG : supports larger numbers of colours and more flexible transparency (lossless)
TIFF : up to 24-bit images (supports lossless compression)
BMP : up to 24-bit images (supports lossless compression)
One type of algorithm used for lossless compression is run-length encoding. It is used, for example,
in transmitting digitised documents from scanners and fax machines.
To obtain compression, it is assumed that the original file contains long, continuous fields of
identical data. Therefore, instead of transmitting the value of the data each time it is
encountered, the value is transmitted only once, followed by a number representing the number of
times it is to be repeated.
In the example below, a binary image of 12 pixels, with alternating black and white fields, is to be
transmitted. The normal coding would be
1 1 1 1 1 0 0 0 0 1 1 1
The total number of bits is 12 pixels x 1 bit/pixel = 12 bits.
Instead of individual bits, the number of identical, consecutive bits is transmitted as
(bit value, number of times):
(1,5) (0,4) (1,3)
Moreover, if we make the rule that the start of an image is always a black pixel, one can drop the
actual pixel values from the transmission:
(5) (4) (3), or, in binary, 101 100 011, a total of 9 bits.
The total number of bits now depends on the number of transitions in the image rather than on its
size.
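A minimal Python sketch of run-length encoding applied to the line of pixels above (the function name is illustrative):

    def run_length_encode(pixels):
        """Collapse runs of identical values into (value, run length) pairs."""
        runs = []
        for p in pixels:
            if runs and runs[-1][0] == p:
                runs[-1][1] += 1              # extend the current run
            else:
                runs.append([p, 1])           # start a new run
        return [tuple(r) for r in runs]

    line = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1]
    print(run_length_encode(line))            # [(1, 5), (0, 4), (1, 3)]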
Another class of compression algorithms is the transform algorithms. They are based on the fact
that a signal can be represented in time or space, as well as in frequency.
For example, each line in the image above can be represented as a spatial signal of pixel
amplitudes. Continuous or slowly varying signals result in low frequencies being present in the
signal, whereas frequent or sharp variations result in high frequencies being present.
By performing a transform on the original time or space signal (e.g. the Fourier Transform) one can
obtain the frequency representation of the signal. By performing an inverse transform (e.g. the
Inverse Fourier Transform) on the frequency representation, one can retrieve the original time or
space signal.
One type of transform that can be used is the cosine transform. When applied to the samples of a
signal it is called the Discrete Cosine Transform (DCT). The DCT forms the basis of numerous
compression algorithms, such as those used in the JPEG and MPEG formats.
If the original signal contains the samples
s = [ s(0) s(1) s(2) ... s(M-1) ]
the coefficients of the DCT can be calculated as
C(0) = \sqrt{\frac{1}{M}} \sum_{k=0}^{M-1} s(k)

C(u) = \sqrt{\frac{2}{M}} \sum_{k=0}^{M-1} s(k) \cos\!\left( \frac{(2k+1)\,u\,\pi}{2M} \right)  for u = 1, 2, ..., M-1
Then, instead of sending the samples of the original signal, one sends the coefficients of the cosine
transform.
The compression occurs in two ways. A lossless compression occurs if some frequencies are missing
from the signal and thus their coefficients are 0 and are not transmitted. A lossy compression occurs if
some frequency components are very small compared to others and are not transmitted. Further
compression occurs if the value of the coefficients is set to a lower precision (e.g. 2.55 rounded to 2.5) or
digitised to a lower number of bits.
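To make the formulas concrete, here is a minimal Python sketch of the 1-D DCT as written above (orthonormal scaling assumed), followed by a crude lossy step that zeroes coefficients below a threshold; the sample values and the threshold are arbitrary illustrations:

    from math import cos, pi, sqrt

    def dct(s):
        """1-D DCT as in the formula above (orthonormal scaling)."""
        M = len(s)
        C = [sqrt(1 / M) * sum(s)]                    # C(0)
        for u in range(1, M):                         # C(u), u = 1 .. M-1
            C.append(sqrt(2 / M) * sum(s[k] * cos((2 * k + 1) * u * pi / (2 * M))
                                       for k in range(M)))
        return C

    def drop_small(C, threshold=0.1):
        """Lossy step: coefficients small relative to the rest are not transmitted."""
        return [c if abs(c) > threshold else 0.0 for c in C]

    samples = [4, 3, 3, 2, 1, 0, 2, 7]                # arbitrary 8-sample signal
    print([round(c, 2) for c in drop_small(dct(samples))])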
Audio compression
Unlike text and images, audio and most video signals are originally analog. After digitisation, both
comprise a continuous stream of digital values, each representing the amplitude of a sample of the
signal taken at regular time intervals. The digitisation process uses PCM (pulse code modulation);
see Lecture 16.
PCM involves sampling the analog signal at (at least) twice its maximum frequency. Each sample is
then quantised and the binary code of the quantisation level is transmitted. For example, the
digital signal corresponding to an analog signal with sample amplitudes 4 3 3 2 ... is
100 011 011 010 ...
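A minimal Python sketch of the coding step, assuming each sample has already been quantised to an integer level (3 bits, i.e. 8 levels, as in the example):

    def pcm_encode(levels, bits=3):
        """Code already-quantised amplitude levels as fixed-width binary words."""
        return [format(v, f"0{bits}b") for v in levels]

    print(pcm_encode([4, 3, 3, 2]))   # ['100', '011', '011', '010'], as in the text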
Differential Pulse Code Modulation (D-PCM) is a derivative of standard PCM which exploits the fact
that, for most audio signals, the range of the differences in amplitude between successive samples
is smaller than the range of the samples themselves. In D-PCM, instead of coding the amplitude of
the signal, one codes the difference between successive amplitudes. Hence, either fewer bits are
needed for encoding, or the quantisation error decreases considerably if the same number of bits is
used.
In the example above, the amplitudes are
4 3 3 2 1 0 2 7 4 2 1
The signal has a dynamic range of max(7) - min(0) = 7, i.e. 8 = 2^3 levels, which can be PCM coded
with 3 bits.
The differences in amplitudes are
-1 0 -1 -1 -1 +2 +5 -3 -2 -1
The differences have a dynamic range of max(5) - min(0) = 5, i.e. 6 levels.
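A minimal Python sketch of this idea: code the first sample in full, then only the successive differences; the decoder recovers the samples by a running sum.

    def dpcm_encode(samples):
        """Code the first sample, then only differences between successive samples."""
        return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

    def dpcm_decode(diffs):
        """Recover the samples by a running sum of the differences."""
        out, total = [], 0
        for d in diffs:
            total += d
            out.append(total)
        return out

    amplitudes = [4, 3, 3, 2, 1, 0, 2, 7, 4, 2, 1]
    diffs = dpcm_encode(amplitudes)
    print(diffs)                  # [4, -1, 0, -1, -1, -1, 2, 5, -3, -2, -1]
    assert dpcm_decode(diffs) == amplitudes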
There are various standards which specify the bit rate of the compressed audio signal, the type of
compression code and the quality of the decompressed audio signals. They use a combination of PCM,
DCT or DFT (Discrete Fourier Transform) and quantisation.
The most common are the MPEG (Motion Picture Expert Group) standards, which also specify
parameters for video signals. Each new standard (or layer) brings increased compression and
increased signal quality, at the expense of increased complexity and cost and a higher delay in
processing the signal.
The table below shows a comparison between 3 MPEG standards:
MPEG layer  Application                    Compressed bit rate  Quality     Delay
Layer 1     Digital audio cassette         32-448 kb/s          HiFi        20 ms
Layer 2     Audio/video broadcasting       32-192 kb/s          near CD     40 ms
Layer 3     CD quality over slow channels  64 kb/s              CD quality  60 ms
Video compression
Video (with sound) features in a number of multimedia applications:
- interpersonal (videoconferencing and video telephony)
- interactive (access to stored videos)
- entertainment (digital TV and movies on demand)
The quality of the video used in these applications is determined by the digitisation format and the
frame rate used. In practice, the bandwidth resulting from the digitisation formats is substantially
larger than that of the transmission channels used. Hence, compression is necessary.
Moreover, as no single compression algorithm gives a high enough compression and a small enough
delay to satisfy the requirements for video, combinations of algorithms have to be used.
The MPEG standards (see the table above) define such combinations of algorithms.
The video compression techniques used by these algorithms rely heavily on the fact that successive
video frames are strongly correlated (related to each other); therefore the contents of a frame may
be used to predict some or all of the contents of subsequent frames, as in the sketch below.
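As a deliberately simplified stand-in for the motion-compensated prediction that the MPEG standards actually use, the Python sketch below transmits only the pixels that changed between two frames; real codecs predict whole blocks and motion vectors, not individual pixels.

    def frame_delta(prev, curr):
        """Transmit only (index, new value) pairs for pixels that changed."""
        return [(i, c) for i, (p, c) in enumerate(zip(prev, curr)) if p != c]

    def apply_delta(prev, delta):
        """Rebuild the current frame from the previous frame plus the delta."""
        frame = list(prev)
        for i, v in delta:
            frame[i] = v
        return frame

    frame1 = [0, 0, 5, 5, 5, 0, 0, 0]
    frame2 = [0, 0, 0, 5, 5, 5, 0, 0]     # the "object" moved one pixel right
    delta = frame_delta(frame1, frame2)
    print(delta)                          # [(2, 0), (5, 5)]: 2 of 8 pixels sent
    assert apply_delta(frame1, delta) == frame2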
An example of such algorithms is found in the MPEG-4 standard, which is mostly used in multimedia
applications over the Internet. The difference from other standards is that MPEG-4 has a number of
content-based functionalities (i.e. the algorithms take into account the exact content of the frames
in order to optimise the compression).
In this case, each scene is defined in terms of audio and visual objects and each object is treated
separately.
For example, a scene may contain several visual object planes (VOPs): the background, a car and a
person. Each VOP is then analysed in terms of shape, texture, motion, etc. and is coded separately
to make full use of the most appropriate algorithms.
Streaming
Streaming refers to a mode of operation in which continuously generated media (such as audio and video)
is passed directly to the destination as it is generated and, at the destination, is played out directly as it is
received, then discarded.
The main restriction in this case is that the source must be synchronised with the transmission
channel and the receiver. To overcome small variations in the transmission rate, especially on a
network which does not guarantee a bound on the delay, a playout memory buffer is employed at the
receiver end; it can store several seconds of audio/video and thus re-synchronise the transmission,
as in the sketch below.
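The Python sketch below assumes chunks of media arrive irregularly from the network and playback starts only after a few chunks have accumulated; the class and parameter names are illustrative.

    from collections import deque

    class PlayoutBuffer:
        """Absorb small variations in network delay by pre-buffering media chunks."""

        def __init__(self, prefill=3):
            self.queue = deque()
            self.prefill = prefill            # chunks to accumulate before playing
            self.playing = False

        def receive(self, chunk):
            """Called as chunks arrive (possibly irregularly) from the network."""
            self.queue.append(chunk)
            if len(self.queue) >= self.prefill:
                self.playing = True           # enough margin to ride out jitter

        def next_chunk(self):
            """Called at the regular playout rate; returns None on an under-run."""
            if self.playing and self.queue:
                return self.queue.popleft()   # play out, then discard
            return None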
Below is an example of a transmission of streamed media over the web.
When the client clicks on the link, the same procedure is followed as for any page. A TCP/IP connection
is established between the web server and the client. The server contacts the file server which holds the
media file itself and a meta file containing the description of the media file. The meta file is sent first,
enabling the receiving web browser to set up the appropriate media player. When the media file is sent, it
is sent directly to the media player, without the involvement of the browser itself.
References:
Halsall, Fred - Multimedia Communications: Applications, Networks, Protocols and Standards -
Addison-Wesley, 2001, ISBN 0-201-39818-4 (library shelfmark QA 78.51.H2)