Context-based Data Compression
Xiaolin Wu
Polytechnic University
Brooklyn, NY
Part 3. Context modeling
Context model – estimated symbol probability
Variable length coding schemes need estimates of the probability of each symbol - the model.
The model can be:
  Static - fixed global model for all inputs (e.g. English text)
  Semi-adaptive - computed for the specific data being coded and transmitted as side information (e.g. C programs)
  Adaptive - constructed on the fly (e.g. any source!) - see the sketch below
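To make the adaptive case concrete, here is a minimal sketch (my own illustration, not code from the slides) of an order-0 adaptive model: encoder and decoder start from identical counts and apply the same update after every coded symbol, so no model needs to be transmitted.

```python
class AdaptiveModel:
    """Order-0 adaptive frequency model shared by encoder and decoder."""

    def __init__(self, alphabet):
        # Start every symbol at count 1 so no probability is ever zero.
        self.counts = {s: 1 for s in alphabet}
        self.total = len(alphabet)

    def prob(self, symbol):
        # Current estimate P(symbol), handed to the entropy coder.
        return self.counts[symbol] / self.total

    def update(self, symbol):
        # Applied after coding each symbol; keeps both sides in sync.
        self.counts[symbol] += 1
        self.total += 1


model = AdaptiveModel("ab")
for s in "aababba":
    p = model.prob(s)   # probability used to code s
    model.update(s)
```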
Adaptive vs. Semi-adaptive
Advantages of semi-adaptive
  Simple decoder
Disadvantages of semi-adaptive
  Overhead of specifying the model can be high
  Two passes over the data required
Advantages of adaptive
  One pass, universal, as good if not better
Disadvantages of adaptive
  Decoder as complex as the encoder
  Errors propagate
Adaptation with Arithmetic and Huffman Coding
Huffman coding - manipulate the Huffman tree on the fly. Efficient algorithms are known, but they remain complex.
Arithmetic coding - update a cumulative probability distribution table. Efficient data structures / algorithms are known (see the sketch below); the rest stays essentially the same.
The main advantage of arithmetic over Huffman coding is the ease with which the former can be used in conjunction with adaptive modeling techniques.
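One data structure commonly used for the cumulative-count table of an adaptive arithmetic coder is a Fenwick (binary indexed) tree. The sketch below is my own illustration, not code from the slides: it supports count updates and cumulative-frequency queries in O(log N), which is what the coder needs to map a symbol to its probability interval.

```python
class FenwickTree:
    """Cumulative symbol counts with O(log N) update and query."""

    def __init__(self, n):
        self.tree = [0] * (n + 1)

    def add(self, i, delta=1):
        # Increment the count of symbol index i (0-based).
        i += 1
        while i < len(self.tree):
            self.tree[i] += delta
            i += i & (-i)

    def cumulative(self, i):
        # Sum of counts of symbols 0..i (inclusive).
        i += 1
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & (-i)
        return total


counts = FenwickTree(256)        # byte alphabet
counts.add(65)                   # saw symbol 'A'
low = counts.cumulative(64)      # cumulative count below 'A'
high = counts.cumulative(65)     # cumulative count up to and including 'A'
# (low/total, high/total) is the subinterval the arithmetic coder narrows to.
```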
Context models
If the source is not i.i.d., there are complex dependences between symbols in the sequence.
In most practical situations, the pdf of a symbol depends on neighboring symbol values, i.e. its context.
Hence we condition the encoding of the current symbol on its context.
How to select contexts? A rigorous answer is beyond our scope.
Practical schemes use a fixed neighborhood (see the sketch below).
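As a toy illustration of conditioning on a fixed neighborhood (my own sketch, assuming a binary alphabet and the previous symbol as the context), a separate adaptive count table is kept per context:

```python
from collections import defaultdict

counts = defaultdict(lambda: defaultdict(int))   # counts[context][symbol]

def prob(symbol, context):
    # Estimate P(symbol | context) from the counts seen so far,
    # with +1 (Laplace) smoothing for the binary alphabet.
    table = counts[context]
    total = sum(table.values()) + 2
    return (table[symbol] + 1) / total

data = "0010011011101001"
prev = "0"                        # assumed initial context
for s in data:
    p = prob(s, prev)             # P(symbol | context) handed to the coder
    counts[prev][s] += 1
    prev = s
```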
Context dilution problem
The minimum code length for a sequence x_1 x_2 ... x_n achievable by arithmetic coding is -Σ_i log P(x_i | x_{i-1} x_{i-2} ... x_1), provided these conditional probabilities are known.
The difficulty of estimating P(x_i | x_{i-1} x_{i-2} ... x_1) due to insufficient sample statistics prevents the use of high-order Markov models.
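A small numeric illustration of both points (the probabilities below are invented for the example): the ideal code length is the sum of -log2 of the known conditional probabilities, while the number of contexts to populate grows exponentially with the model order.

```python
import math

# Ideal code length when the conditional probabilities are known.
cond_probs = [0.5, 0.8, 0.9, 0.6]          # P(x_i | x_{i-1} ... x_1) per symbol
code_length = sum(-math.log2(p) for p in cond_probs)   # about 2.1 bits total

# Why high-order models dilute statistics: contexts grow exponentially
# with the order, so each context sees only a few samples.
alphabet_size, order = 256, 3
num_contexts = alphabet_size ** order      # 16,777,216 contexts for order 3
```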
Estimating probabilities in different contexts
Two approaches:
  Maintain symbol occurrence counts within each context
    the number of contexts needs to be modest to avoid context dilution
  Assume the pdf shape within each context is the same (e.g. Laplacian) and only the parameters (e.g. mean and variance) differ - see the sketch below
    estimation may not be as accurate, but a much larger number of contexts can be used
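A rough sketch of the parametric approach, assuming a zero-mean Laplacian shape in every context and estimating only the scale from running absolute sums (the bin probability below is a crude density-times-width approximation; all names are my own, not from the slides):

```python
import math

abs_sum = {}    # running sum of |value| seen in each context
count = {}      # number of samples seen in each context

def laplacian_prob(e, context, bin_width=1.0):
    # Scale b estimated as the mean absolute value observed in this context.
    b = abs_sum.get(context, 1.0) / max(count.get(context, 1), 1)
    # Approximate probability of a bin of width bin_width centred on e
    # under a zero-mean Laplacian with scale b.
    return (bin_width / (2.0 * b)) * math.exp(-abs(e) / b)

def update(e, context):
    abs_sum[context] = abs_sum.get(context, 0.0) + abs(e)
    count[context] = count.get(context, 0) + 1
```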
Entropy (Shannon 1948)
Consider a random variable X with source alphabet {x_0, x_1, ..., x_{N-1}} and probability mass function P(x_i) = Prob(X = x_i).
The self-information of x_i is i(x_i) = -log P(x_i), measured in bits if the log is base 2: an event of lower probability carries more information.
Entropy (self-entropy) is the weighted average of self-information:
  H(X) = -Σ_i P(x_i) log P(x_i)
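A worked example of the formula above, with probabilities chosen only for illustration:

```python
import math

# Three-symbol source; probabilities are made up for the example.
p = {"a": 0.5, "b": 0.25, "c": 0.25}
H = -sum(pi * math.log2(pi) for pi in p.values())   # = 1.5 bits/symbol

# Self-information of each symbol: the rarer the symbol, the more bits.
info = {s: -math.log2(pi) for s, pi in p.items()}   # a: 1, b: 2, c: 2 bits
```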
Conditional Entropy
Consider two random variables X and C.
Alphabet of X: {x_0, x_1, ..., x_{N-1}}
Alphabet of C: {c_0, c_1, ..., c_{M-1}}
The conditional self-information of x_i given c_m is i(x_i | c_m) = -log P(x_i | c_m).
Conditional entropy is the average value of conditional self-information:
  H(X | C) = -Σ_{i=0}^{N-1} Σ_{m=0}^{M-1} P(x_i | c_m) P(c_m) log_2 P(x_i | c_m)
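Evaluating the formula directly for a toy binary source with two contexts (all numbers invented for the illustration):

```python
import math

P_c = {"c0": 0.5, "c1": 0.5}                  # P(c_m)
P_x_given_c = {
    "c0": {"0": 0.9, "1": 0.1},               # P(x_i | c0)
    "c1": {"0": 0.2, "1": 0.8},               # P(x_i | c1)
}

H_X_given_C = -sum(
    P_c[c] * p * math.log2(p)
    for c, dist in P_x_given_c.items()
    for p in dist.values()
)
# About 0.60 bits, whereas H(X) with P(x=0) = 0.55 is about 0.99 bits:
# conditioning on the context reduces the entropy.
```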
Entropy and Conditional Entropy
The conditional entropy H(X | C) can be interpreted as the amount of uncertainty remaining about X, given that we know the random variable C.
The additional knowledge of C should reduce the uncertainty about X:
  H(X | C) ≤ H(X)
Context Based Entropy Coders
Consider a sequence of symbols; the three symbols preceding the symbol to code form the context space C:
  0 0 1 0 0 1 1 0 1 1 1 0 1 0 0 1 0 0 0 0 1 0 0 1 1 1 0 1 0 1 1 0 1 1 0 1 1 0 ...
[Figure: each context C0 = 000, C1 = 001, ..., C7 = 111 routes the current symbol to its own entropy coder (EC), with per-context entropies H(C0), H(C1), ..., H(C7).]
The resulting average rate is R = Σ_i P(C_i) H(C_i).
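A sketch (my own) of estimating R for the binary sequence above, using the previous three symbols as the context as in the figure; the per-context entropies H(C_i) and the weights P(C_i) are estimated from the relative frequencies observed in the sequence.

```python
import math
from collections import Counter, defaultdict

s = "00100110111010010000100111010110110110"   # the sequence on the slide
by_context = defaultdict(Counter)
for i in range(3, len(s)):
    by_context[s[i - 3:i]][s[i]] += 1           # count symbol within its context

total = len(s) - 3
R = 0.0
for ctx, cnt in by_context.items():
    n = sum(cnt.values())
    H = -sum((c / n) * math.log2(c / n) for c in cnt.values())   # H(C_i)
    R += (n / total) * H                         # P(C_i) * H(C_i)
```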
Decorrelation techniques to exploit sample smoothness
Transforms
  DCT, FFT
  wavelets
Differential Pulse Code Modulation (DPCM)
  predict the current symbol from past observations:
    x̂_i = Σ_{t=1}^{d} a_t x_{i-t}
  code the prediction residual rather than the symbol itself:
    e_i = x_i - x̂_i   (the decoder reconstructs x_i = x̂_i + e_i)
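A minimal sketch of first-order DPCM (the order-1 predictor with coefficient a_1 = 1 is an assumption for the example, not a choice made on the slide):

```python
x = [100, 102, 105, 104, 108]          # toy sample values

# Encoder: prediction residuals e_i = x_i - x_hat_i.
residuals = []
prev = 0
for xi in x:
    x_hat = prev                        # order-1 predictor with a_1 = 1
    residuals.append(xi - x_hat)
    prev = xi

# Decoder: exact reconstruction x_i = x_hat_i + e_i from residuals alone.
decoded = []
prev = 0
for e in residuals:
    xi = prev + e
    decoded.append(xi)
    prev = xi

assert decoded == x
```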
Benefits of prediction and transform
A priori knowledge is exploited to reduce the self-entropy of the source symbols.
Higher coding efficiency due to:
  Fewer parameters to be estimated adaptively
  Faster convergence of adaptation
Further Reading
Text Compression - T. Bell, J. Cleary, and I. Witten, Prentice Hall. Good coverage of statistical context modeling, though the focus is on text.
Articles in IEEE Transactions on Information Theory by Rissanen and Langdon.
Digital Coding of Waveforms: Principles and Applications to Speech and Video - Jayant and Noll. Good coverage of predictive coding.