Video Coding with Emphasis on Motion Estimation and the DCT

Motion Compensated Prediction
and the Role of the DCT in
Video Coding
Michael Horowitz
Applied Video Compression
michael@avcompression.com
(c) 2008 Michael Horowitz
Outline
• Overview of block-based hybrid motion compensated predictive video coding
– ITU-T standards: H.261, H.263, H.264
– ISO/IEC standards: MPEG-1, MPEG-2 & MPEG-4
• Survey of motion estimation & compensation
• Discrete cosine transform (DCT)
– Coding efficiency
– Computational complexity
– Perceptual implications
Block-Based Hybrid Motion Compensated
Predictive Coding
• Video picture partitioned into
macroblocks
• Macroblock (MB) has three components
– One luma
• “Y”, represents “lightness”
• 16x16 luma samples
– Two chroma
• “Cb” & “Cr”, represent color
• 16x16, 8x16, or 8x8 chroma samples
Block-Based Hybrid Motion Compensated
Predictive Coding (continued)
– Human Visual System more sensitive to luma
• Chroma frequently sub-sampled
• Sub-sampling examples
[Figure: Y, Cb and Cr sample grids for 4:4:4, 4:2:2 and 4:2:0 sub-sampling]
• Two coding modes for macroblocks
Inter-Picture Macroblock Coding
– Estimate motion of blocks from picture-to-picture
– Search previously coded (reference) pictures
[Figure: reference picture with the search region, the location of the input MB, and the motion estimate]
– Encode
• Location of motion estimate (motion vector)
• Difference between input MB and motion estimate
Intra-Picture Macroblock Coding
• Input MB coded using intra-picture prediction
– Prediction derived from spatially adjacent MBs
– Earlier algorithms offer no intra-picture prediction
• Significantly lower coding efficiency than
inter-coded MBs at low data rates
• Useful when motion estimate is poor
• Can be used to stop error propagation
Block-Based Hybrid Motion
Compensated Predictive Coding
Survey Motion Estimation and
Motion Compensation
• Motion models
– Translational (focus of talk)
• Location of kth motion compensated block
– $(X_k + MV_{x,k},\; Y_k + MV_{y,k})$
– $(X_k, Y_k)$ is location of kth input block
– $(MV_{x,k}, MV_{y,k})$ is motion vector (MV) for kth block
– Affine motion models
• Rotation
• Scaling
• Video standards do not use affine models
Motion Estimation
• Estimate inter-picture block translation
• Luma samples (and sometimes chroma)
• Example
– Distortion: Sum of Absolute Differences (SAD)
• Low complexity
• Commonly used in real-time production encoders
– Find (MVx,k, MVy,k) that minimizes SAD between
• Input block sk(i,j)
• Motion compensated prediction in reference picture r(i,j)
• Subject to search range
Motion Estimation (continued)
$$(MV_{x,k}, MV_{y,k}) = \arg\min_{(x,y)} \sum_j \sum_i \left| s_k(i,j) - r(X_k + x + i,\; Y_k + y + j) \right|$$
subject to $(X_k + x,\; Y_k + y) \in \text{SearchRange}$
[Figure: reference picture r(i,j) showing the sample locations, the block location $(X_k, Y_k)$ and the search range]
• Fast motion estimation algorithms
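The SAD minimization on this slide can be sketched as an exhaustive (full) search, which is the baseline that fast algorithms try to approximate. This is a minimal illustration under stated assumptions, not a production encoder: the names `sad` and `full_search`, the list-of-rows picture layout (`ref[y][x]`), and the choice to skip candidates that fall outside the reference picture are all introduced here for the example.

```python
def sad(block, ref, x0, y0):
    """Sum of absolute differences between block and the region of ref at (x0, y0)."""
    return sum(
        abs(block[j][i] - ref[y0 + j][x0 + i])
        for j in range(len(block))
        for i in range(len(block[0]))
    )


def full_search(block, ref, xk, yk, search_range):
    """Return (mv_x, mv_y, sad) minimizing SAD within +/- search_range of (xk, yk)."""
    bh, bw = len(block), len(block[0])
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = xk + dx, yk + dy
            # Assumption: candidates outside the reference picture are skipped.
            if x0 < 0 or y0 < 0 or x0 + bw > w or y0 + bh > h:
                continue
            cost = sad(block, ref, x0, y0)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

The cost grows with the square of the search range, which is the motivation for the fast search strategies mentioned above.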
Fractional Sample Motion Estimation
• Estimate content between samples
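• Example: bilinear interpolation
$x_1 \le x^* < x_2$ and $y_1 \le y^* < y_2$
$f_x = (x^* - x_1)/(x_2 - x_1)$
$f_y = (y^* - y_1)/(y_2 - y_1)$
[Figure: point $z(x^*, y^*)$ inside the cell with corners $z(x_1,y_1)$, $z(x_2,y_1)$, $z(x_1,y_2)$ and $z(x_2,y_2)$]
$$z(x^*,y^*) = (1-f_x)(1-f_y)\,z(x_1,y_1) + f_x(1-f_y)\,z(x_2,y_1) + f_x f_y\,z(x_2,y_2) + (1-f_x) f_y\,z(x_1,y_2)$$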
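A small sketch of the bilinear formula, assuming unit spacing between integer samples (so $x_2 - x_1 = y_2 - y_1 = 1$) and a picture stored as a list of rows. The function name `bilinear_sample` and the clamping of neighbours at the picture border are assumptions made for this example only.

```python
import math


def bilinear_sample(pic, x, y):
    """Interpolate pic (indexed pic[y][x]) at the fractional position (x, y)."""
    x1, y1 = math.floor(x), math.floor(y)
    # Neighbouring integer positions, clamped at the picture border (assumption).
    x2 = min(x1 + 1, len(pic[0]) - 1)
    y2 = min(y1 + 1, len(pic) - 1)
    fx, fy = x - x1, y - y1  # fractional offsets for unit sample spacing
    return ((1 - fx) * (1 - fy) * pic[y1][x1] + fx * (1 - fy) * pic[y1][x2]
            + fx * fy * pic[y2][x2] + (1 - fx) * fy * pic[y2][x1])
```

At a half-sample position between four samples (`fx = fy = 0.5`) the result is simply the average of the four neighbours.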
Fractional Sample Motion Estimation
(continued)
– H.261
• No fractional sample motion estimation
– MPEG-1, MPEG-2 and H.263
• 1/2-sample, bilinear interpolation
– H.264 | MPEG-4 AVC & SVC
• Luma
– 1/2-sample, 6-tap interpolation
– 1/4-sample, simple average
• Chroma (1/8-sample, bilinear)
Fractional Sample Motion Estimation
(continued)
• Coding efficiency gain for H.263 [from Wang 2002]
Multiple Motion Vectors per MB
• One motion vector for each sub-block
• H.264 results [Bjontegaard 2001]
[Figure: PSNR (dB) versus rate (Kbps) for Foreman, with curves labeled 2 blocks, 4 blocks and 7 blocks]
Multiple Reference Pictures
[Wiegand, Zhang, & Girod 1997]
• Coding gains
– Uncovered areas
– More integer motion vector estimates
[Figure: integer sample locations in pictures t-3, t-2, t-1 and t0, with the direction of motion indicated]
Multiple Reference Pictures
(Continued)
[Figure: PSNR (dB) versus bit rate (bps) for Woman, CIF, 10 fps, QP = 7, 8, 10, 13, 19, 28; curves for Annex U (5 frames) and no Annex U]
• H.263 Annex U [Horowitz 2000]
Multi-Hypothesis Motion
Compensated Prediction
[Flierl, Wiegand & Girod 1998]
• Linear combination of multiple predictions
– One motion vector for each prediction
– Bi-predicted pictures are special case (2 MVs)
– Predictions may be forward & backward in time
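The linear combination of hypotheses can be sketched as a weighted sum of prediction blocks. The function name `combine_hypotheses`, the block layout (lists of rows), and the use of weights that sum to one are assumptions for illustration; the bi-predicted case corresponds to two hypotheses with equal weights.

```python
def combine_hypotheses(predictions, weights):
    """Weighted sum of equally sized prediction blocks (each a list of rows)."""
    rows, cols = len(predictions[0]), len(predictions[0][0])
    return [
        [sum(w * p[j][i] for p, w in zip(predictions, weights)) for i in range(cols)]
        for j in range(rows)
    ]


# Two hypotheses, e.g. one from a past and one from a future reference picture.
forward = [[100, 104], [96, 92]]
backward = [[110, 100], [100, 100]]
bi_predicted = combine_hypotheses([forward, backward], [0.5, 0.5])
```

Each hypothesis would come from its own motion vector; only the combination step is shown here.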
Multi-Hypothesis for H.263
• Sequences Mobile & Calendar and Foreman
• Results [Flierl 1998]
Overlapped Block Motion Compensation
[Orchard & Sullivan 1994]
• Special case of multi-hypothesis coding
• H.263 advanced prediction mode (Annex F)
– Overlapped block motion compensation
• 1 coded + 2 “derived” motion vectors
• Non-uniform spatial weighting of samples
– 4 motion vectors per macroblock

Rate-Distortion Optimization
• MV resulting in lowest distortion often not optimal
• Goal: Find best tradeoff between distortion and rate
• Strategy [Everett III 1963], [Shoham & Gersho 1988]
$$J = D + \lambda R = \sum_k \left( D_k + \lambda R_k \right) = \sum_k J_k$$
where $D$ is the total distortion, $R$ is the total bit-rate, and $D_k$ and $R_k$ are the distortion and rate for block $k$
– Minimize $J_k$ for each block $k$ separately, using common $\lambda$
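The per-block minimization can be sketched as picking, among candidate motion vectors, the one with the smallest Lagrangian cost. The function name `rd_select` and the candidate tuples `(label, distortion, rate_bits)` are hypothetical; how distortion and rate are measured is left to the encoder.

```python
def rd_select(candidates, lam):
    """Return the candidate (label, distortion, rate_bits) with the smallest D + lam * R."""
    return min(candidates, key=lambda c: c[1] + lam * c[2])
```

For example, with candidates `("a", 100, 20)` and `("b", 110, 4)`, a multiplier of 0 selects `a` (lowest distortion), while a multiplier of 1 selects `b`, because 110 + 4 = 114 is smaller than 100 + 20 = 120.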
Perceptual Tuning
• Prevent transparent foreground macroblocks
• Blurring of fast moving objects
• Deblocking filter
• Artifacts in the motion wake
[Figure: moving object, its direction of motion, and the macroblocks in the motion wake]
Coding Summary
• Macroblock-based coding
• Two basic macroblock coding modes
– Inter-coded MB → motion compensated prediction
– Intra-coded MB
1-D Discrete Cosine Transform
• Type II forward DCT [Ahmed et al. 1974]
$$d_m = \sum_{k=0}^{N-1} x_k \cos\left[\frac{\pi m (k + \frac12)}{N}\right], \quad m = 0, \ldots, N-1$$
• Type II inverse DCT
$$x_k = \frac{d_0}{N} + \frac{2}{N} \sum_{m=1}^{N-1} d_m \cos\left[\frac{\pi m (k + \frac12)}{N}\right], \quad k = 0, \ldots, N-1$$
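A direct O(N²) sketch of the forward and inverse transforms on this slide, using the unnormalized scaling given there. The function names `dct_ii` and `idct_ii` are chosen for this example; a real codec would use a fast algorithm instead.

```python
import math


def dct_ii(x):
    """Forward Type II DCT: d_m = sum_k x_k * cos(pi * m * (k + 1/2) / N)."""
    n = len(x)
    return [sum(x[k] * math.cos(math.pi * m * (k + 0.5) / n) for k in range(n))
            for m in range(n)]


def idct_ii(d):
    """Inverse: x_k = d_0 / N + (2 / N) * sum_{m >= 1} d_m * cos(pi * m * (k + 1/2) / N)."""
    n = len(d)
    return [d[0] / n
            + (2.0 / n) * sum(d[m] * math.cos(math.pi * m * (k + 0.5) / n)
                              for m in range(1, n))
            for k in range(n)]
```

Applying `idct_ii(dct_ii(x))` recovers `x` up to floating-point rounding, and a constant input puts all of its energy into `d[0]`.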


2-Dimensional DCT
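• Forward
$$d_{u,v} = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} s_{x,y} \cos\left[\frac{\pi (x + \frac12) u}{N}\right] \cos\left[\frac{\pi (y + \frac12) v}{N}\right]$$
• Inverse
$$s_{x,y} = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} C_u C_v\, d_{u,v} \cos\left[\frac{\pi (x + \frac12) u}{N}\right] \cos\left[\frac{\pi (y + \frac12) v}{N}\right]$$
$$C_t = \begin{cases} \dfrac{1}{N} & \text{for } t = 0, \\[4pt] \dfrac{2}{N} & \text{for } t = 1, 2, \ldots, N-1 \end{cases}$$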
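Because the 2-D kernel is a product of two cosines, the forward transform can be computed by applying a 1-D DCT along one dimension and then along the other. The sketch below checks that row-column approach against the direct double sum; the function names and the `s[x][y]` indexing are assumptions for this example.

```python
import math


def dct_1d(x):
    """Unnormalized 1-D Type II DCT."""
    n = len(x)
    return [sum(x[k] * math.cos(math.pi * m * (k + 0.5) / n) for k in range(n))
            for m in range(n)]


def dct_2d_separable(s):
    """2-D DCT of an N x N block s[x][y] via 1-D transforms along y, then along x."""
    n = len(s)
    along_y = [dct_1d(row) for row in s]                      # along_y[x][v]
    along_x = [dct_1d([along_y[x][v] for x in range(n)])      # along_x[v][u]
               for v in range(n)]
    return [[along_x[v][u] for v in range(n)] for u in range(n)]  # d[u][v]


def dct_2d_direct(s):
    """2-D DCT by evaluating the double sum directly (for comparison only)."""
    n = len(s)
    return [[sum(s[x][y]
                 * math.cos(math.pi * (x + 0.5) * u / n)
                 * math.cos(math.pi * (y + 0.5) * v / n)
                 for x in range(n) for y in range(n))
             for v in range(n)]
            for u in range(n)]
```

The separable form needs 2N one-dimensional transforms for an N x N block, so any fast 1-D DCT carries over directly to the 2-D case.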
Basis Functions for 8x8 DCT
Why Choose the DCT?
• Coding efficiency
• Computational complexity
• Perceptual implications
Coding Efficiency
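[Figure: X split into $X_1$ and $X_2$, quantized by $Q_1$ and $Q_2$ to give $\hat{X}_1$ and $\hat{X}_2$, which form $\hat{X}$]
• Source $X = [X_1, X_2]$
– $X_i$ is a Gaussian random variable
– Mean = 0, Variance = $\sigma_i^2$
• Rate of quantizer $Q_i$ is $R_i$ (bits / index)
– Total rate $R = R_1 + R_2$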
Coding Efficiency (continued)
• Distortion
– Square error
– High-rate assumption
• High-rate implies R ≥ 3 bits / sample
• Often works well for lower rates
• Asymptotic Quantization Theory [Gray & Neuhoff 1998]
$$D_i = h_g\, \sigma_i^2\, 2^{-2R_i}, \quad \text{where } h_g = \frac{\sqrt{3}\,\pi}{2} \text{ (for Gaussian)}$$
– Total distortion $D = D_1 + D_2$

Rate Allocation Problem
• What is smallest $D = D^*$ subject to $R \le R_{max}$?
• Find optimal value for $R_1 = R_1^*$
$$D = D_1 + D_2 = h_g \sigma_1^2\, 2^{-2R_1} + h_g \sigma_2^2\, 2^{-2(R - R_1)}$$
• Minimizing $D$ with respect to $R_1$ yields
$$R_1^* = \frac{R}{2} + \frac12 \log_2\left(\frac{\sigma_1}{\sigma_2}\right)$$
Rate Allocation Problem
(continued)
It follows that
$$R_2^* = \frac{R}{2} + \frac12 \log_2\left(\frac{\sigma_2}{\sigma_1}\right)$$
and
$$D_1^* = D_2^* = h_g\, \sigma_1 \sigma_2\, 2^{-R}$$
which implies
$$D^* = 2 h_g\, \sigma_1 \sigma_2\, 2^{-R}$$
Generalize for k Quantizers
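[Figure: X split into $X_{12}$ and $X_3$; $X_{12}$ is split into $X_1$ and $X_2$, quantized by $Q_1$ and $Q_2$; $X_3$ is quantized by $Q_3$; the outputs $\hat{X}_1$, $\hat{X}_2$ and $\hat{X}_3$ form $\hat{X}$]
• Rate $R = R_{1,2} + R_3 \le R_{max}$
• Distortion $D = D_{1,2} + D_3$
• Recall $D_3 = h_g\, \sigma_3^2\, 2^{-2R_3}$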
Generalization (continued)
• 2 quantizers with $R_{1,2} = R - R_3$ subject to $R \le R_{max}$
$$D_{1,2}^* = 2 h_g\, \sigma_1 \sigma_2\, 2^{-(R - R_3)} \quad \text{(from previous result)}$$
• Minimize $D = D_{1,2}^* + D_3$ with respect to $R_3$
$$R_3^* = \frac{R}{3} + \frac12 \log_2\left[\frac{\sigma_3^2}{\left(\sigma_1^2 \sigma_2^2 \sigma_3^2\right)^{1/3}}\right]$$
Generalization (continued)
• It follows that
$$D^* = 3 h_g \left(\prod_{j=1}^{3} \sigma_j^2\right)^{1/3} 2^{-2R/3}$$
• Generalize to k quantizers by induction

Optimal Rate and Distortion
[Huang & Schultheiss 1963]
• Rate
$$R_i^* = \frac{R}{k} + \frac12 \log_2\left[\frac{\sigma_i^2}{\left(\prod_{j=1}^{k} \sigma_j^2\right)^{1/k}}\right]$$
• Distortion
$$D^* = k\, h_g \left(\prod_{j=1}^{k} \sigma_j^2\right)^{1/k} 2^{-2R/k}$$
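The optimal rate and distortion expressions can be checked numerically. The sketch below is an illustration only: the function names `optimal_rates` and `distortion` and the sample variances are hypothetical, and the constant $h_g = \sqrt{3}\pi/2$ is the high-rate value for a Gaussian source. It ignores the practical constraint that rates be positive integers.

```python
import math

H_G = math.sqrt(3) * math.pi / 2  # high-rate constant for a Gaussian source


def optimal_rates(variances, total_rate):
    """R_i* = R/k + 0.5 * log2(var_i / geometric_mean(variances))."""
    k = len(variances)
    log_gm = sum(math.log2(v) for v in variances) / k  # log2 of the geometric mean
    return [total_rate / k + 0.5 * (math.log2(v) - log_gm) for v in variances]


def distortion(variance, rate):
    """High-rate approximation D_i = h_g * var_i * 2^(-2 R_i)."""
    return H_G * variance * 2.0 ** (-2.0 * rate)


variances = [16.0, 4.0, 1.0]          # hypothetical component variances
rates = optimal_rates(variances, 12.0)
per_quantizer = [distortion(v, r) for v, r in zip(variances, rates)]
equal_split = sum(distortion(v, 4.0) for v in variances)
```

With these variances the allocation gives more bits to the high-variance component, every quantizer ends up with the same distortion, and the total is lower than for an equal split of the same 12 bits.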
Observations and Comments
• #1 Optimal rate for $Q_i$ proportional to $\log_2 \sigma_i$
• #2 Optimal distortion
$$D_i^* = \frac{D^*}{k} = h_g \left(\prod_{j=1}^{k} \sigma_j^2\right)^{1/k} 2^{-2R/k} \quad \text{for all } i$$
• #3 In practice, systems use positive [Segall 1976] integer [Farber & Zeger 2005] $R_i$
Question
• Given Gaussian source X & fixed encoder structure (i.e., k scalar quantizers) how can we minimize D subject to $R \le R_{max}$?
Answer: Transform X to minimize $\prod_{j=1}^{k} \sigma_j^2$.
Transform Coding
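[Kramer & Mathews 1956]
[Figure: $X = (X_1, \ldots, X_k)$ transformed by $T$ into $Y_1, \ldots, Y_k$, quantized by $Q_1, \ldots, Q_k$ to give $\hat{Y}_1, \ldots, \hat{Y}_k$, and inverse transformed by $T^{-1}$ to give $\hat{X}$]
• For orthogonal T
$$D = E\left[\|Y - \hat{Y}\|^2\right] = E\left[\|X - \hat{X}\|^2\right]$$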
Fact 1
• Karhunen-Loeve Transform (KLT) produces smallest $\prod_{j=1}^{k} \sigma_j^2$ [Huang & Schultheiss 1963]
– a) Gaussian input random variables
– b) High-rate quantizers
– c) Rate of each quantizer is arbitrary real value
– d) Square error distortion measure
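As a hedged illustration of why a decorrelating transform helps, consider a two-component case that is not worked in the slides: assume $X_1$ and $X_2$ are zero-mean Gaussian with equal variance $\sigma^2$ and correlation coefficient $\rho$ (both symbols are introduced here). For this source the KLT is the sum/difference transform, and the product of variances shrinks by the factor $1 - \rho^2$. Using the two-quantizer result $D^* = 2 h_g \sigma_1 \sigma_2 2^{-R}$:

```latex
\begin{aligned}
Y_1 &= \tfrac{1}{\sqrt{2}}(X_1 + X_2), & \sigma_{Y_1}^2 &= \sigma^2 (1 + \rho), \\
Y_2 &= \tfrac{1}{\sqrt{2}}(X_1 - X_2), & \sigma_{Y_2}^2 &= \sigma^2 (1 - \rho), \\
\sigma_{Y_1}^2\, \sigma_{Y_2}^2 &= \sigma^4 (1 - \rho^2) \;\le\; \sigma^4, \\
D^*_{\text{transform}} &= 2 h_g\, \sigma^2 \sqrt{1 - \rho^2}\; 2^{-R}
\quad \text{versus} \quad
D^*_{\text{no transform}} = 2 h_g\, \sigma^2\, 2^{-R}.
\end{aligned}
```

For $\rho = 0.95$ the factor $\sqrt{1 - \rho^2}$ is about 0.31, which corresponds to roughly 5 dB lower distortion at the same total rate under the high-rate assumptions.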
Fact 2
• The autocorrelation matrix of the KLT
transform vector is diagonal.
– KLT coefficients are uncorrelated
– There is no general theorem stating
uncorrelated quantities can be more
efficiently quantized than correlated ones
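Fact 3
• If KLT produces $\{\sigma_j^2\}_{j=1}^{k}$ and orthogonal $\tilde{T}$ produces $\{\tilde{\sigma}_j^2\}_{j=1}^{k}$, then
$$\sum_{j=1}^{n} \sigma_j^2 \;\ge\; \sum_{j=1}^{n} \tilde{\sigma}_j^2 \quad \text{for all } n \le k$$
for $\sigma_1^2 \ge \sigma_2^2 \ge \cdots \ge \sigma_k^2$ & $\tilde{\sigma}_1^2 \ge \tilde{\sigma}_2^2 \ge \cdots \ge \tilde{\sigma}_k^2$
→ Energy compaction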
Practical Considerations
• KLT impractical for many systems
– Computational complexity
• Transform is signal dependent
• Compute and apply transform for each input
• Consider Fourier based transforms
– Fast algorithms exist
– Examine loss of coding efficiency resulting
from loss of energy compaction
Energy Compaction of
Some Discrete Transforms
• 1x32 block in natural images [Lohscheller]
2-D Energy Compaction
[from Hedberg & Nilsson 2004]
• KLT
• DCT
• DFT
[Figures: 2-D energy compaction for the KLT, DCT and DFT]
Computational Complexity
• Recall DCT may be derived from DFT
– First N coefficients of 2N-point DFT
– Requires appropriate input sequence symmetry
– Requires scaling [Tseng & Miller 1978]
$$d_m = \frac12 \sec\left(\frac{\pi m}{2N}\right) \mathrm{Re}\{f_m\} \quad \text{for } m = 0, \ldots, N-1$$
where $f_m$ is the $m$th coefficient of the 2N-point DFT
• Leverage FFT to compute DCT
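The DFT route can be sketched as follows. To stay self-contained the example uses a naive O(N²) DFT with the $e^{-j 2\pi m k / 2N}$ sign convention as a stand-in for an FFT, which is an assumption of this sketch; the function names are also chosen here. It takes the real part of each DFT coefficient before the secant scaling and compares the result with the direct DCT sum.

```python
import cmath
import math


def dft(x):
    """Naive DFT, f_m = sum_k x_k * exp(-j * 2 * pi * m * k / len(x)); stand-in for an FFT."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]


def dct_via_dft(x):
    """N-point DCT from the first N coefficients of a 2N-point DFT."""
    n = len(x)
    extended = list(x) + list(reversed(x))  # x'_k = x_{2N-k-1} for N <= k < 2N
    f = dft(extended)
    return [0.5 * f[m].real / math.cos(math.pi * m / (2 * n)) for m in range(n)]


def dct_direct(x):
    """Reference: d_m = sum_k x_k * cos(pi * m * (k + 1/2) / N)."""
    n = len(x)
    return [sum(x[k] * math.cos(math.pi * m * (k + 0.5) / n) for k in range(n))
            for m in range(n)]
```

Replacing `dft` with a real FFT routine would give the O(N log N) behaviour that motivates this construction.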
Computational Complexity
(continued)
• 1-D 8-point DCT from 16-point DFT
– 13 mults, 29 adds [Arai et al. 1988]
– 8 final scaling multiplies rolled into quantization
• Net 5 mults, 29 adds → best known
• Fast 2-D DCT (8x8)
– Separable [from Pennebaker & Mitchell 1992]
• 80 mults, 464 adds → best known
– Non-separable [Feig 1992]
• 54 mults, 416 adds, 6 shifts
Perceptual Implications
• Contrast sensitivity of HVS
– See last page of handout [Barlow & Mollon 1982]
• Perceptually tuned quantization tables [Watson]
• Filter coefficients prior to quantization
– Shape frequency content of source
– Exploit HVS contrast sensitivity
Concluding Summary
• Motion estimation & compensation
– Translation-based motion models
– Fractional sample motion estimation
– Multiple motion vectors per macroblock
– Multiple reference pictures
– Multi-hypothesis motion compensated
prediction
– Overlapped block motion compensation
Concluding Summary
• DCT
– Near optimal R-D performance for wide
range of sources (Gaussian, high-rate assumptions)
– Simple relationship to DFT → fast
– Perceptual relevance
References
• N. Ahmed, T. Natarajan, and K. R. Rao, “Discrete cosine transform,” IEEE Trans. Comput., vol. C-23, pp. 90–93, Jan. 1974.
• Y. Arai, T. Agui, and M. Nakajima, “A Fast DCT-SQ Scheme for Images,” Trans. of the IEICE, vol. E71, no. 11, p. 1095, Nov. 1988.
• E. Feig and S. Winograd, “Fast Algorithms for Discrete Cosine Transform,” IEEE Trans. Signal Proc., vol. 40, pp. 2174–2193, 1992.
• H. B. Barlow and J. D. Mollon, The Senses. Cambridge: Cambridge University Press, 1982.
• G. Bjontegaard, “Objective simulation results,” Document VCEG-M34, Video Coding Experts Group (VCEG), Thirteenth Meeting, Austin, Texas, USA, 2-4 April 2001.
• H. Everett III, “Generalized Lagrangian Multiplier Method for Solving Problems of Optimum Allocation of Resources,” Operations Research, vol. 11, pp. 399–417, 1963.
• B. Farber and K. Zeger, “Quantization of Multiple Sources Using Integer Bit Allocation,” Data Compression Conference (DCC), Salt Lake City, Utah, March 2005 (to appear).
References (continued)
• M. Flierl, T. Wiegand, and B. Girod, “Locally Optimal Design Algorithm for Block-Based Multi-Hypothesis Motion-Compensated Prediction,” Proc. of the IEEE Data Compression Conference (DCC'98), pp. 239–248, Snowbird, USA, Apr. 1998.
• A. Gersho and R. M. Gray, Vector Quantization and Signal Compression, Kluwer Academic Publishers, Boston, 1992.
• B. Girod, Lecture for EE368b, Video and Image Compression, Stanford University.
• R. M. Gray and D. L. Neuhoff, “Quantization,” IEEE Transactions on Information Theory, vol. 44, pp. 2325–2384, Oct. 1998.
• R. M. Haralick, “A Storage Efficient Way to Implement the Discrete Cosine Transform,” IEEE Transactions on Computers, vol. 25, no. 6, pp. 764–765, 1976.
• H. Hedberg and P. Nilsson, “A Survey of Various Discrete Transforms used in Digital Image Compression Algorithms,” Proceedings of the Swedish System-On-Chip Conference 2004, Bastad, Sweden, April 13-14, 2004.
References (continued)
• M. J. Horowitz, “Demonstration of H.263++ Annex U Performance,” Document Q15-J11, Tenth Meeting (Meeting J) of the ITU-T Q.15/16, Advanced Video Coding Experts Group, Osaka, Japan, 16-18 May 2000.
• J.-Y. Huang and P. M. Schultheiss, “Block quantization of correlated Gaussian random variables,” IEEE Trans. Comm., vol. 11, pp. 289–296, September 1963.
• F. Kossentini, Y. Lee, M. Smith, and R. Ward, “Predictive RD Optimized Motion Estimation for Very Low Bit-Rate Video Coding,” Special Issue of the IEEE Journal on Selected Areas in Communications, vol. 15, no. 9, pp. 1752–1763, December 1997.
• H. P. Kramer and M. V. Mathews, “A linear coding for transmitting a set of correlated signals,” IRE Trans. Inform. Theory, vol. 23, no. 3, pp. 41–46, Sept. 1956.
• M. T. Orchard and G. J. Sullivan, “Overlapped block motion compensation: An estimation-theoretic approach,” IEEE Trans. Image Processing, vol. 3, no. 9, pp. 693–699, Sept. 1994.
• W. B. Pennebaker and J. L. Mitchell, JPEG, p. 53, Kluwer Academic Publishers, Norwell, MA, USA, 1992.
References (continued)
• A. Segall, “Bit allocation and encoding for vector sources,” IEEE Trans. Inform. Theory, vol. IT-22, pp. 162–169, March 1976.
• Y. Shoham and A. Gersho, “Efficient Bit Allocation for an Arbitrary Set of Quantizers,” IEEE Trans. on Acoust., Speech, Signal Processing, vol. 36, no. 9, pp. 1445–1453, September 1988.
• B. D. Tseng and W. C. Miller, “On Computing the Discrete Cosine Transform,” IEEE Transactions on Computers, vol. 27, no. 10, pp. 966–968, 1978.
• Y. Wang, “Video Coding Standards,” lecture slides based on text Video Processing and Communications, Prentice Hall, 2002.
• A. B. Watson, “DCT quantization matrices visually optimized for individual images,” Proc. SPIE, vol. 1913, pp. 202–216, 1993.
• T. Wiegand, X. Zhang, and B. Girod, “Block-Based Hybrid Video Coding Using Motion-Compensated Long-Term Memory Prediction,” in Proc. of the Picture Coding Symposium, Berlin, Germany, pp. 153–158, Sept. 1997.
Backup slides
• Little Things → Big Difference
• Motion search over picture boundary
– Reference Gisle’s Austin Contribution
DCT from the DFT [Haralick 1976]
• N-point DCT
$$d_m = \sum_{k=0}^{N-1} x_k \cos\left[\frac{\pi m (k + \frac12)}{N}\right], \quad m = 0, \ldots, N-1$$
• Extend N-point sequence $x_k$ by reflection
$$x'_k = \begin{cases} x_k & 0 \le k < N \\ x_{2N-k-1} & N \le k < 2N \end{cases}$$
Extend N-point Sequence xk by Reflection
• Example
[Figure: example sequence $x_k$ extended by reflection, with axis marks at N and 2N]
Compute 2N-point DFT
$$f_m = \sum_{k=0}^{2N-1} x'_k\, e^{-j\frac{2\pi m k}{2N}} = \sum_{k=0}^{N-1} x'_k\, e^{-j\frac{2\pi m k}{2N}} + \sum_{k=N}^{2N-1} x'_k\, e^{-j\frac{2\pi m k}{2N}} \quad \text{for } m = 0, \ldots, 2N-1$$
• Second sum equals (by symmetry of $x'_k$)
$$\sum_{L=0}^{N-1} x'_L\, e^{j\frac{2\pi m (L+1)}{2N}} \quad \text{where } L = 2N - k - 1$$
Compute 2N-point DFT (continued)
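• It follows that
$$f_m = \sum_{k=0}^{N-1} x'_k \left[ e^{-j\frac{2\pi m k}{2N}} + e^{j\frac{2\pi m (k+1)}{2N}} \right]$$
• Multiply by $e^{j\frac{2\pi m}{2N}\frac12}\, e^{-j\frac{2\pi m}{2N}\frac12}$ & employ Euler’s formula
$$f_m = 2\, e^{j\frac{2\pi m}{2N}\frac12} \sum_{k=0}^{N-1} x'_k \cos\left[\frac{\pi m (k + \frac12)}{N}\right]$$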
Compute 2N-point DFT (continued)
• Recognizing the DCT
$$d_m = \tfrac12\, e^{-j\frac{\pi m}{2N}} f_m \quad \text{for } m = 0, \ldots, N-1$$
• Note $x'_k$ is real ⇒ $d_m$ is real, so $\mathrm{Re}\{f_m\} = 2 \cos\left(\frac{\pi m}{2N}\right) d_m$
• It follows that [Tseng & Miller]
$$d_m = \frac12 \sec\left(\frac{\pi m}{2N}\right) \mathrm{Re}\{f_m\} \quad \text{for } m = 0, \ldots, N-1$$
• First N coeffs of 2N-point DFT → N-point DCT
– with appropriate scaling and $x_k$ symmetry
Energy Compaction of
Some Discrete Transforms
• Transform coefficient variances for N=16, ρ=0.95 [Ahmed 1974]
KLT Computational Complexity
• Transform is signal dependent
• Construct transform
– Compute correlation matrix for input vector
– Find eigenvectors of correlation matrix
• Apply transform
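The construct-then-apply steps can be sketched for the smallest possible case, two-component input vectors, where the eigenvectors of the 2x2 correlation matrix have a closed form (a rotation). Everything here is an assumption for illustration: the function names, the zero-mean sample data, and the restriction to two dimensions; larger blocks would need a general eigenvector routine.

```python
import math


def correlation_matrix(samples):
    """Return (a, b, c) for the 2x2 matrix [[a, b], [b, c]] estimated from (x, y) samples."""
    n = len(samples)
    a = sum(x * x for x, _ in samples) / n
    b = sum(x * y for x, y in samples) / n
    c = sum(y * y for _, y in samples) / n
    return a, b, c


def klt_2x2(a, b, c):
    """Rows are the eigenvectors of [[a, b], [b, c]], largest eigenvalue first."""
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    ct, st = math.cos(theta), math.sin(theta)
    return [[ct, st], [-st, ct]]


def apply_transform(t, samples):
    """Apply the 2x2 transform t to every (x, y) sample."""
    return [(t[0][0] * x + t[0][1] * y, t[1][0] * x + t[1][1] * y)
            for x, y in samples]


# Hypothetical, strongly correlated zero-mean data.
samples = [(1.0, 0.9), (-1.0, -1.1), (2.0, 2.2), (-2.0, -1.8), (0.5, 0.4), (-0.5, -0.6)]
a, b, c = correlation_matrix(samples)
coeffs = apply_transform(klt_2x2(a, b, c), samples)
a2, b2, c2 = correlation_matrix(coeffs)
```

After the transform the off-diagonal term `b2` is zero up to rounding and most of the energy sits in the first coefficient, but the transform had to be derived from the data itself, which is the signal dependence noted above.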
Multiple Motion Vectors per MB
• One motion vector for each sub-block
• H.264 results [Bjontegaard]
[Figure: PSNR (dB) versus rate (Kbps) for Mobile and Calendar (CIF@30fps), with curves labeled 2 blocks, 4 blocks and 7 blocks]
Practical Matters
• 16-bit math for 4x4 transform in H.264: complexity reduction on certain platforms
• 4x4 and 8x8 transforms in H.264
– Exact inverses
• Non-exact specification for inverse DCT
– How is it done?
– Implications
Overlapped Block Motion
Compensation in H.263
• Coding efficiency
– Baseline vs advanced prediction mode [from Girod]
[Figure: PSNR [dB] comparison of baseline and advanced prediction mode]
1-D DFT Energy Compaction Analysis
• Fourier transform of ramp (continuous both domains)
• Sample Fourier domain ↔ Fourier Series ↔ Repetition in time domain
[Figure: ramp, amplitude versus time]
Ramp: First 5 Fourier Terms
[ptolemy.eecs.berkeley.edu/eecs20/week8/examples.html]
• Fourier term decay rate: $1/n$
Better Energy Compaction
• DFT energy compaction not very good
• Better energy compacting Fourier based
transforms exist
• Consider DFT of extended sequence
– Extend input to force even symmetry
– Leads to DCT
Extended Ramp (Triangle)
• 2N-point extended ramp
[Figure: extended ramp (triangle), amplitude versus time]
• Sample Fourier Domain ↔ Fourier Series
– No discontinuities at boundary (symmetrical extension)
– Expect better energy compaction
Triangle: First 5 Fourier Terms
[ptolemy.eecs.berkeley.edu/eecs20/week8/examples.html]
• Fourier term decay rate: $1/n^2$
Compaction Comparison Summary
• DFT coefficient amplitude decay
– Ramp ∝ $1/n$
– Extended ramp ∝ $1/n^2$
• Suggests DCT will compact well
• Fourier Series ↔ DFT
– Sampling in time ↔ repetition in frequency
– “series-based” observations valid for DFT
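The two decay rates can be observed directly on sampled sequences. The sketch below uses a naive DFT, a 64-sample ramp, and its 128-sample reflected extension (all choices made for this example) and compares the third harmonic with the first for each sequence.

```python
import cmath
import math


def dft_magnitudes(x):
    """Magnitudes of the DFT coefficients of x (naive O(N^2) evaluation)."""
    n = len(x)
    return [abs(sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n)))
            for m in range(n)]


N = 64
ramp = list(range(N))
triangle = ramp + ramp[::-1]  # 2N-point extension by reflection

ramp_mag = dft_magnitudes(ramp)
tri_mag = dft_magnitudes(triangle)

# Third harmonic relative to the first: about 1/3 for 1/n decay,
# about 1/9 for 1/n^2 decay.
ramp_ratio = ramp_mag[3] / ramp_mag[1]
tri_ratio = tri_mag[3] / tri_mag[1]
```

The reflected sequence has no jump at the block boundary, so its harmonics fall off much faster than those of the plain ramp.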
Contrast Sensitivity
• Allen B. Poirson & Brian A. Wandell, “Pattern-color separable pathways predict sensitivity to simple colored patterns,” Vision Research, 1995
Uniform Scalar Quantization
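• Distortion of ith cell: $D_i = E\left[(X_i - \hat{x}_i)^2\right]$
• 0th cell:
$$D_0 = \int_{-\Delta/2}^{\Delta/2} (x - 0)^2\, p_{X_0}(x)\, dx$$
– Assume high rate
$$p_{X_0}(x) = \begin{cases} \dfrac{1}{\Delta} & -\dfrac{\Delta}{2} \le x \le \dfrac{\Delta}{2} \\[4pt] 0 & \text{otherwise} \end{cases}$$
$$D_0 = \frac{\Delta^2}{12}, \quad D_i = \frac{\Delta^2}{12}\ \ \forall i \;\Rightarrow\; D = \frac{\Delta^2}{12}$$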
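The $\Delta^2/12$ result can be checked numerically by quantizing inputs that are evenly spread over several cells, which mimics the high-rate assumption of a flat density within each cell. The function names, the step size of 0.5, and the input range are assumptions for this sketch.

```python
import math


def uniform_quantize(x, step):
    """Map x to the centre of its cell; cells are centred on multiples of step."""
    return step * math.floor(x / step + 0.5)


def mse_uniform_inputs(step, lo, hi, n):
    """Mean squared quantization error for n evenly spaced inputs covering [lo, hi]."""
    total = 0.0
    for i in range(n):
        x = lo + (hi - lo) * (i + 0.5) / n
        total += (x - uniform_quantize(x, step)) ** 2
    return total / n


step = 0.5
measured = mse_uniform_inputs(step, -4.0, 4.0, 16000)
predicted = step ** 2 / 12
```

The measured value agrees with the prediction to within the resolution of the input grid.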