Distributed Video Coding - Institute of Information Science, Academia Sinica

Distributed Video Coding for Wireless Visual Sensor Networks

Li-Wei Kang ( 康立威 )

Institute of Information Science, Academia Sinica

Taipei, Taiwan lwkang@iis.sinica.edu.tw

Postdoctoral Research Fellow

Feb. 22, 2008

Outline

• Introduction
• Distributed Source Coding (DSC)
• Distributed Video Coding (DVC)
• DVC for Wireless Visual Sensor Networks (WVSN)
• Concluding Remarks
• References

Distributed Video Coding for Wireless Visual Sensor Networks Feb. 22, 2008 at CSIE/NDHU 2

Introduction

• Conventional video coding
  - MPEG-1/2/4, H.261, H.263, H.26L, H.264/AVC
  - Interframe predictive coding
  - Encoder is 5-10 times more complex than decoder
  - Suitable for video down-link

[Figure: interframe predictive codec. Input X_i → interframe encoder → interframe decoder → output X_i', with the previously decoded frame X'_{i-1} as the reference.] [Girod, 2002]

Conventional Video Coding

[Aramvith]



[Lin, NTHU, 2007]


Transformation and Quantization

[Lin, NTHU, 2007]


Interframe Predictive Video Coding

[Lin, NTHU, 2007]


Motion Estimation

[Lin, NTHU, 2007]


Motion Estimation

[Lin, NTHU, 2007]


Motion Compensated Prediction

[Lin, NTHU, 2007]


Applications of Conventional Video Coding

[Pereira, 2007]


Introduction

• Problem: low-complexity video encoding for resource-limited video devices
• DSC approach: Wyner-Ziv video coding, with low-complexity intraframe encoding and possibly high-complexity interframe decoding; side information is available only at the decoder

[Figure: input X_i → intraframe encoder → interframe decoder → output X_i', with the side information X'_{i-1} supplied to the decoder only.] [Girod, 2002]

Applications of Low-Complexity Video Coding

• Wireless video cameras

• Wireless low-power surveillance

• Mobile document scanner

• Video conferencing with mobile devices

• Mobile video mail

• Disposable video cameras

• Wireless Visual Sensor Networks

• Networked camcorders

• Distributed video streaming

• Multiview video entertainment

• Wireless capsule endoscopy

[Pereira, 2007]



[Pereira, 2007]



[Pereira, 2007]


Wireless Visual Sensor Networks

[Akyildiz, 2007, and Pereira, 2007]


Wireless Visual Sensor Networks

[Akyildiz, 2002]


Introduction

• Requirements of wireless visual sensor networks
  - low-complexity video encoder
  - high compression efficiency
• Current approaches
  - distributed video coding (DVC) based on distributed source coding (DSC)
  - collaborative image coding and transmission
  - hybrid approach (proposed approach)


Distributed Source Coding (DSC)

• Lossless DSC: Slepian and Wolf, 1973
• Lossy DSC: Wyner and Ziv, 1976
• Distributed video coding (DVC) based on DSC
  - Girod, Stanford University, 2002~
    · B. Girod, A. M. Aaron, S. Rane, and D. Rebollo-Monedero, “Distributed video coding,” Proceedings of the IEEE, vol. 93, no. 1, pp. 71-83, Jan. 2005.
    · Special session on Distributed video coding, 2005 IEEE International Conference on Image Processing (ICIP2005), Italy, Sept. 2005
  - Ramchandran, Berkeley, 2002~
    · R. Puri, A. Majumdar, and K. Ramchandran, “PRISM: a video coding paradigm with motion estimation at the decoder,” IEEE Trans. on Image Processing, vol. 16, no. 10, pp. 2436-2448, Oct. 2007.
    · R. Puri, A. Majumdar, P. Ishwar, and K. Ramchandran, “Distributed video coding in wireless sensor networks,” IEEE Signal Processing Magazine, vol. 23, no. 4, pp. 94-106, July 2006.


Distributed Source Coding

• DISCOVER (Distributed Coding for Video Services)
  - 2005~
  - F. Pereira, L. Torres, C. Guillemot, T. Ebrahimi, R. Leonardi, and S. Klomp, “Distributed video coding: selecting the most promising application scenarios,” to appear in Signal Processing: Image Communication.
  - C. Guillemot, F. Pereira, L. Torres, T. Ebrahimi, R. Leonardi, and J. Ostermann, “Distributed monoview and multiview video coding: basics, problems and recent advances,” IEEE Signal Processing Magazine, special issue on signal processing for multiterminal communication systems, vol. 24, no. 5, pp. 67-76, Sept. 2007.
  - M. Maitre, C. Guillemot, and L. Morin, “3-D model-based frame interpolation for distributed video coding of static scenes,” IEEE Trans. on Image Processing, vol. 16, no. 5, pp. 1246-1257, May 2007.
  - Six major European universities: UPC, IST, EPFL, UH, INRIA, UNIBS
  - Special session on Distributed source coding, 2007 IEEE International Conference on Image Processing (ICIP2007), USA, Sept. 2007
  - DISCOVER Workshop on Recent Advances in Distributed Video Coding, Lisbon, Portugal, Nov. 2007
  - http://www.discoverdvc.org/



• X, Y ∈ S = {000, 001, 010, 011, 100, 101, 110, 111}
• H(X) = H(Y) = 3 bits
• If d(X, Y) ≤ 1, where d is the Hamming distance, the rate needed for X can be reduced from H(X) = 3 bits to H(X|Y) = 2 bits
• For example, if Y = 000 and d(X, Y) ≤ 1, then X ∈ {000, 001, 010, 100}, so H(X|Y) = 2
• A possible solution:
  - S is divided into four disjoint sets whose two members differ in all three bits: {000, 111}, {100, 011}, {010, 101}, {001, 110}
  - At the encoder, if X = 100, only the 2-bit index of the set {100, 011} is sent
  - At the decoder, X = 100 is decoded correctly from Y = 000 and the correlation d(X, Y) ≤ 1, because 011 is at distance 2 from Y
• X: source data to be encoded; Y: the side information of X
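The binning scheme above can be sketched in a few lines of Python. This is only an illustration of the slide's example: the function names and the choice of the bin's list position as its 2-bit index are assumptions, not part of the original talk.

```python
# The four bins listed on the slide; a bin's position (0..3) is its 2-bit index.
BINS = [("000", "111"), ("100", "011"), ("010", "101"), ("001", "110")]


def hamming(a, b):
    """Number of bit positions in which the two words differ."""
    return sum(ca != cb for ca, cb in zip(a, b))


def encode(x):
    """Encoder: send only the index of the bin containing x (2 bits, not 3)."""
    for index, members in enumerate(BINS):
        if x in members:
            return index
    raise ValueError("x must be a 3-bit word")


def decode(index, y):
    """Decoder: pick the bin member within Hamming distance 1 of the side information y."""
    candidates = [c for c in BINS[index] if hamming(c, y) <= 1]
    if len(candidates) != 1:
        raise ValueError("side information violates d(X, Y) <= 1")
    return candidates[0]
```

For instance, `decode(encode("100"), "000")` recovers `"100"`. The encoder never sees Y, yet the two members of each bin are far enough apart that Y always singles out one of them.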



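[Figure: X and Y are statistically dependent sources. Each is encoded separately, at rates R_X and R_Y, and a single decoder reconstructs both X and Y.]

Slepian-Wolf Theorem, 1973

$R_X \ge H(X \mid Y)$, $R_Y \ge H(Y \mid X)$, $R_X + R_Y \ge H(X, Y)$

[Figure: X → encoder → decoder → X', at rate $R^{WZ}_{X \mid Y}(d)$. The side information Y is statistically dependent on X and is available at the decoder only.] [Girod, 2002]

Wyner-Ziv Theorem, 1976

$R^{WZ}_{X \mid Y}(d) \ge R_{X \mid Y}(d)$

where $R_{X \mid Y}(d)$ is the rate required at distortion d when Y is also available at the encoder.

Slepian-Wolf Theorem, 1973

[Figure: achievable rate region, with axes R_X [bits] and R_Y [bits], comparing separate encoding and joint decoding of X and Y with separate encoding and decoding of X and Y.] [Girod, 2002]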
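The Slepian-Wolf bounds can be checked numerically on the 3-bit example from the earlier slide. The sketch below assumes a specific joint distribution that the slides do not state explicitly: Y is uniform on S and, given Y, X is uniform on the four words within Hamming distance 1 of Y. All names are illustrative.

```python
from collections import Counter
from itertools import product
from math import log2

S = ["".join(bits) for bits in product("01", repeat=3)]


def hamming(a, b):
    return sum(ca != cb for ca, cb in zip(a, b))


def entropy(dist):
    """Shannon entropy in bits of a {outcome: probability} mapping."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def marginal(joint, axis):
    """Marginal of a joint {(x, y): p} mapping over axis 0 (X) or 1 (Y)."""
    out = Counter()
    for pair, p in joint.items():
        out[pair[axis]] += p
    return out


# Assumed model: Y uniform on S; X uniform on the neighbours of Y (distance <= 1).
joint = {}
for y in S:
    neighbours = [x for x in S if hamming(x, y) <= 1]
    for x in neighbours:
        joint[(x, y)] = (1 / len(S)) * (1 / len(neighbours))

H_XY = entropy(joint)
H_X = entropy(marginal(joint, 0))
H_Y = entropy(marginal(joint, 1))
H_X_given_Y = H_XY - H_Y   # chain rule: H(X|Y) = H(X,Y) - H(Y)
H_Y_given_X = H_XY - H_X
```

Under this model, separate encoding with separate decoding needs H(X) + H(Y) bits in total, while joint decoding needs only H(X, Y), which is the gap the rate-region figure illustrates.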



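[Figure: conventional predictive coding. Input X → predictive interframe encoder → predictive interframe decoder → output X', with the side information Y available to both the encoder and the decoder.] [Girod, 2006]

Distributed Video Coding based on Wyner-Ziv Theorem

[Figure: input X, side information Y, output X'. The side information is available at the decoder only.] [Girod, 2006]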


Wyner-Ziv Video Coding

• K: key frame, coded with conventional intraframe encoding
• X: Wyner-Ziv frame, coded with Wyner-Ziv video encoding
• The side information Y for X is generated at the decoder by interpolation of the previously decoded frames

[Girod, 2003]
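As a rough illustration of decoder-side interpolation, the sketch below forms the side information as the pixel-wise average of two neighbouring decoded frames. This is the simplest possible interpolator and is an assumption made for illustration; practical decoders typically use motion-compensated interpolation, and the function name is hypothetical.

```python
def interpolate_side_information(prev_frame, next_frame):
    """Estimate a Wyner-Ziv frame as the rounded pixel-wise average of two decoded frames.

    Frames are lists of rows of integer pixel values (for example 0..255).
    No motion compensation is performed in this simplified sketch.
    """
    if len(prev_frame) != len(next_frame):
        raise ValueError("frames must have the same number of rows")
    side_info = []
    for row_prev, row_next in zip(prev_frame, next_frame):
        if len(row_prev) != len(row_next):
            raise ValueError("rows must have the same length")
        side_info.append([(a + b + 1) // 2 for a, b in zip(row_prev, row_next)])
    return side_info
```

The closer this estimate Y is to the true frame X, the fewer bits the decoder has to request from the encoder.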


Side Information Generation

[Guo, 2006]


[Ebrahimi, 2006]



Figure: (a) the original frame (X); (b) the corresponding side information (Y) generated at the decoder.

[Girod, 2003]



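[Figure: Wyner-Ziv codec. In general, the Wyner-Ziv encoder is a quantizer followed by a channel encoder, and the Wyner-Ziv decoder is a channel decoder followed by minimum-distortion reconstruction; X and the side information Y are related through a “correlation channel”. In the practical scheme, the encoder is a scalar quantizer followed by a turbo encoder, and the decoder is a turbo decoder followed by reconstruction; the turbo encoder and decoder form the Slepian-Wolf codec. Y is used by the decoder only. Input X, output X'.] [Girod, 2002]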


Pixel-domain Wyner-Ziv Video Coding

[Figure: pixel-domain Wyner-Ziv video codec.
Intraframe encoder: Wyner-Ziv frames X → scalar quantizer → turbo encoder → buffer; key frames K → conventional intraframe encoding.
Interframe decoder: conventional intraframe decoding → K'; interpolation/extrapolation of decoded frames → side information Y; turbo decoder (uses Y and sends “request bits” to the encoder buffer) → reconstruction (uses Y) → X'.] [Girod, 2003]


Scalar Quantization

• Scalar quantization in pixel domain

Figure: (a) the original frame; (b) the corresponding frame quantized to 16 gray levels.

[Girod, 2003]
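A minimal sketch of 16-level uniform scalar quantization of 8-bit pixels follows, together with one simple way a decoder could combine the decoded quantization index with the side information. The reconstruction rule shown (clamping the side-information pixel to the decoded bin) is an assumption for illustration, as the slides only name a "Reconstruction" block; all function names are hypothetical.

```python
LEVELS = 16            # number of gray levels, as in the slide's example
STEP = 256 // LEVELS   # bin width for 8-bit pixels


def quantize(pixel):
    """Map an 8-bit pixel (0..255) to its quantization index (0..15)."""
    return pixel // STEP


def bin_bounds(index):
    """Smallest and largest pixel value that quantize to the given index."""
    low = index * STEP
    return low, low + STEP - 1


def reconstruct(index, side_info_pixel):
    """Assumed rule: keep the side-information pixel if it lies in the decoded
    bin, otherwise move it to the nearest bin boundary."""
    low, high = bin_bounds(index)
    return min(max(side_info_pixel, low), high)
```

With this rule the reconstruction error is bounded by the bin width even when the side information is poor, while a good side-information pixel is passed through unchanged.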


Turbo Encoder

[Figure: turbo encoder. The L input bits X feed one systematic convolutional encoder of rate (n-1)/n directly and a second one through an interleaver of length L. Each encoder produces L/(n-1) parity bits, X_P^1 and X_P^2; the L systematic bits of each encoder are discarded, so 2L/(n-1) bits are output in total.]

$R_X = \frac{2}{n-1}$

• For each input block of n – 1 bits, each convolutional encoder in the turbo encoder produces a codeword of length n, composed of the n – 1 input bits and one parity bit

[Girod, 2002]
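The rate arithmetic of this encoder structure can be written out as a short worked example. The helper names are hypothetical, and the sketch assumes that L is a multiple of n - 1 and that every systematic bit is discarded, as the figure labels indicate.

```python
from fractions import Fraction


def turbo_output_bits(num_input_bits, n):
    """Parity bits sent for L input bits when both constituent encoders have
    rate (n-1)/n and all systematic bits are discarded."""
    if n < 2 or num_input_bits % (n - 1) != 0:
        raise ValueError("L must be a multiple of n - 1, with n >= 2")
    parity_per_encoder = num_input_bits // (n - 1)  # one parity bit per n-1 inputs
    return 2 * parity_per_encoder                   # two constituent encoders


def compression_rate(n):
    """Output bits per input bit: R_X = 2 / (n - 1)."""
    return Fraction(2, n - 1)
```

For example, with n = 5 a block of 1200 input bits yields 600 parity bits, a rate of 1/2 bit per input bit.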


Turbo Decoder

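[Figure: turbo decoder. The two received parity streams, X_P^1 and X_P^2, each of L/(n-1) bits, and the side information Y enter “channel probabilities calculations” blocks, which use P(x|y) to produce P_channel for two SISO decoders. Each SISO decoder takes P_channel and P_a priori and outputs P_extrinsic and P_a posteriori; the extrinsic probabilities of one decoder become the a priori probabilities of the other after interleaving or deinterleaving of length L. The deinterleaved a posteriori probabilities go to a decision block, which outputs L bits.] [Girod, 2002]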


Simulation Results


[Girod, 2003]


After Wyner-Ziv decoding

16-level quantization



[Girod, 2003]


Transform-domain Wyner-Ziv Video Coding

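[Figure: transform-domain Wyner-Ziv video codec.
Intraframe encoder: WZ frames W are transformed; for each transform band k, the coefficients X_k are quantized to q_k and split into bit-plane 1, bit-plane 2, ..., bit-plane M_k. Key frames K → conventional intraframe coding.
Interframe decoder: conventional intraframe decoding → K'; the side information Y is transformed by a DCT into Y_k; q_k' is decoded and, together with Y_k, reconstructed to X_k'; IDCT → decoded WZ frames W'.] [Girod, 2004]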


Transform-domain Wyner-Ziv Video Coding

[Figure: WZ frame W → 4x4 DCT → for each transform band k, X_k → $2^{M_k}$-level quantizer → q_k]

• Each coefficient band is quantized using a scalar quantizer with $2^{M_k}$ levels
  - $M_k$ = number of bit planes for the k-th coefficient band
  - $2^{M_k} \in \{1, 2, 4, \ldots, 256\}$
• The combination of quantizers determines the bit allocation across bands

[Figure: sample quantizers. Values represent the number of quantization levels for each coefficient band.]

[Girod, 2004]


Transform-domain Wyner-Ziv Video Coding

[Figure: q_k → extract bit-planes (bit-plane 1, bit-plane 2, ..., bit-plane M_k) → turbo encoder → buffer → turbo decoder (with side information Y_k) → q_k']

• Bit planes of coefficients are encoded independently but decoded successively
• Rate-compatible punctured turbo code (RCPT)
  - Flexibility for varying statistics
  - Bit rate controlled by decoder through feedback channel
• Turbo decoder can perform joint source-channel decoding

[Girod, 2004]
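The bit-plane step can be illustrated with a small sketch. It assumes the quantization indices q_k are non-negative integers below 2^{M_k} and that planes are ordered from most to least significant; both the ordering and the function names are assumptions made for illustration.

```python
def extract_bit_planes(indices, num_planes):
    """Split quantization indices (each in 0 .. 2**num_planes - 1) into bit
    planes, most significant plane first. Each plane is a list of 0/1 bits."""
    limit = 1 << num_planes
    if any(q < 0 or q >= limit for q in indices):
        raise ValueError("index out of range for the given number of planes")
    return [[(q >> plane) & 1 for q in indices]
            for plane in range(num_planes - 1, -1, -1)]


def merge_bit_planes(planes):
    """Rebuild the indices plane by plane, mirroring successive decoding."""
    indices = [0] * len(planes[0])
    for plane_bits in planes:
        indices = [(q << 1) | bit for q, bit in zip(indices, plane_bits)]
    return indices
```

Each plane returned by `extract_bit_planes` would be passed to the turbo encoder as an independent block, while the decoder rebuilds the indices one plane at a time, as `merge_bit_planes` does.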




[Girod, 2004]


Wyner-Ziv Coding

370 kbps



H.263 Intraframe Coding

330 kbps, 32.9 dB

[Girod, 2004]


Wyner-Ziv Coding

274 kbps, 39.0 dB



H.263 Interframe Coding

145 kbps, 40.4 dB

[Girod, 2004]


Wyner-Ziv Coding 156 kbps, 37.5 dB



3 dB

8 dB

[Girod, 2004]


DISCOVER DVC Codec

• Based on the feedback channel solution from Stanford Univ.

• Based on a split between Wyner-Ziv (WZ) and key frames

• Key frames used with a regular (GOP size) or dynamic periodicity

• Key frames coded with H.264/AVC Intraframe encoding


[Pereira, 2007]



[Pereira, 2007]


DVC for Wireless Visual Sensor Networks (WVSN)

[Figure: WVSN architecture, with labels: sensor field, visual sensor node (VSN), aggregation and forwarding node (AFN), wireless link, Internet or satellite, remote control unit (RCU)]


Conventional Multiview Video Coding

Multiview video coding structure combining inter-view and temporal prediction

[Kubota, 2007]


Global Motion Estimation

[ Lin, NTHU, 2007 ]


[Ebrahimi, 2007]


Multiview Distributed Video Coding

[Ebrahimi, 2006]


Multiview Distributed Video Coding

Temporal side information

Inter-view side information

[Ebrahimi, 2007]



[Ebrahimi, 2007]


Collaborative Image Coding and Transmission

[1] M. Wu and C. W. Chen, “Collaborative image coding and transmission over wireless sensor networks,” EURASIP Journal on Advances in Signal Processing, special issue on Visual Sensor Networks, 2007.

[2] K. Y. Chow, K. S. Lui, and E. Y. Lam, “Efficient on-demand image transmission in visual sensor networks,” EURASIP Journal on Advances in Signal Processing, special issue on Visual Sensor Networks, 2007.


Proposed Multiview DVC

• The proposed low-complexity video codec is based on two ideas
  - motion estimation is shifted to the decoder
  - low-complexity image matching is performed at the encoder, based on image warping and robust media hashing
• L. W. Kang and C. S. Lu, “Low-complexity power-scalable multi-view distributed video encoder,” in Proc. of 2007 Picture Coding Symposium, Lisbon, Portugal, Nov. 2007.
• L. W. Kang and C. S. Lu, “Multi-view distributed video coding with low-complexity inter-sensor communication over wireless video sensor networks,” in Proc. of 2007 IEEE Int. Conf. on Image Processing, special session on Distributed source coding II: Distributed video and image coding and their applications, San Antonio, TX, USA, Sept. 2007, vol. 3, pp. 13-16 (invited paper).
• L. W. Kang and C. S. Lu, “Low-complexity Wyner-Ziv video coding based on robust media hashing,” in Proc. of IEEE Int. Workshop on Multimedia Signal Processing, Victoria, BC, Canada, Oct. 2006, pp. 267-272.

P.S. Co-author: Prof. Chun-Shien Lu, Associate Research Fellow, Institute of Information Science, Academia Sinica


Robust Media Hashing

• A compact representation for a frame
  - Structural digital signature (SDS)

$\mathrm{Diff} = \max_{1 \le k \le 4} \mathrm{Diff}_k = \max_{1 \le k \le 4} \bigl|\, |p| - |c_k| \,\bigr|$

Only the parent-child pair with the maximum magnitude difference (Diff) among the four pairs in a “parent-four children” group is selected.

[Figure: the wavelet decomposition for a frame; a parent p and its four child nodes c_1, c_2, c_3, c_4.]

C. S. Lu and H. Y. M. Liao, “Structural digital signature for image authentication: an incidental distortion resistant scheme,” IEEE Trans. on Multimedia, vol. 5, no. 2, pp. 161-173, June 2003.



• Labeling an SDS
  - the signature symbol sym(p, c) of a parent-child pair (p, c) is defined as follows

$$\mathrm{sym}(p, c) = \begin{cases} +1 & \text{if } |p| \ge |c| \text{ and } p \ge 0, \\ -1 & \text{if } |p| \ge |c| \text{ and } p < 0, \\ +2 & \text{if } |p| < |c| \text{ and } c \ge 0, \\ -2 & \text{if } |p| < |c| \text{ and } c < 0. \end{cases}$$

  - each parent-four children group is represented by one symbol sym(p, c), where (p, c) is the pair with the maximum magnitude difference


Proposed Single-view DVC

[Figure: an illustrated example of encoding with GOP = 4. The hash of a frame is a symbol sequence S such as +1, 0, +1, +2, -2, …; non-key frame bits are produced for each non-key frame F.]

L. W. Kang and C. S. Lu, “Low-complexity Wyner-Ziv video coding based on robust media hashing,” in Proc. of 2006 IEEE Int. Workshop on Multimedia Signal Processing, Victoria, BC, Canada, Oct. 2006, pp. 267-272 (MMSP2006).
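The SDS symbol for one parent and its four children can be sketched as below, following the selection and labeling rules on the Robust Media Hashing slides. The function name is hypothetical, ties between children are broken by taking the first one (an assumption), and the symbol 0 that appears in the example sequence is not covered because the slides do not define it.

```python
def sds_symbol(parent, children):
    """Return the SDS symbol (+1, -1, +2 or -2) for a parent wavelet
    coefficient and its four child coefficients."""
    if len(children) != 4:
        raise ValueError("expected exactly four child coefficients")
    # Select the child whose magnitude differs most from the parent's.
    child = max(children, key=lambda c: abs(abs(parent) - abs(c)))
    # Label the selected pair by which magnitude dominates and by its sign.
    if abs(parent) >= abs(child):
        return 1 if parent >= 0 else -1
    return 2 if child >= 0 else -2
```

Because the symbol depends only on magnitude ordering and sign, it tends to survive mild distortions of the coefficients, which is what makes it usable as a compact, robust hash of a frame.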


Proposed Multiview DVC

• Consider several adjacent VSNs observing the same target scene in a WVSN
• For each VSN, V_s, the input video sequence is divided into several GOPs; each GOP consists of a key frame, K_{s,t}, followed by several non-key frames, W_{s,t}

A simple example of the GOP structure for a WVSN with N_sensor = 3, where GOPS_0 = 1, GOPS_1 = 4, and GOPS_2 = 2:

VSN / Time instant   t         t+1         t+2         t+3         t+4         •••
V_0                  K_{0,t}   K_{0,t+1}   K_{0,t+2}   K_{0,t+3}   K_{0,t+4}   •••
V_1                  K_{1,t}   W_{1,t+1}   W_{1,t+2}   W_{1,t+3}   K_{1,t+4}   •••
V_2                  K_{2,t}   W_{2,t+1}   K_{2,t+2}   W_{2,t+3}   K_{2,t+4}   •••
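The GOP structure in this example can be generated programmatically. The sketch below assumes that every VSN starts a GOP at time instant t, as in the example; the function names are hypothetical.

```python
def frame_type(gop_size, offset):
    """'K' for a key frame, 'W' for a non-key frame, at time instant t + offset."""
    if gop_size < 1:
        raise ValueError("GOP size must be at least 1")
    return "K" if offset % gop_size == 0 else "W"


def gop_structure(gop_sizes, num_instants):
    """One row per VSN, one column per time instant t, t+1, ..."""
    return [[frame_type(size, offset) for offset in range(num_instants)]
            for size in gop_sizes]


if __name__ == "__main__":
    # Reproduces the example with GOPS_0 = 1, GOPS_1 = 4, GOPS_2 = 2.
    for sensor, row in enumerate(gop_structure([1, 4, 2], 5)):
        print(f"V_{sensor}: " + " ".join(row))
```

A GOP size of 1 makes every frame a key frame, so giving different VSNs different GOP sizes is one way to trade encoder effort against bitrate per sensor.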


Key Frame Encoding

• Key frames
  - each key frame is first encoded using the H.264/AVC intraframe encoder
  - global motion estimation between the key frames from adjacent VSNs is performed at the decoder (RCU)
  - the estimated motion parameters between each pair of key frames from adjacent VSNs are sent back to the corresponding VSNs via a feedback channel

Global Motion Estimation between the Key Frames from Adjacent VSNs

[Figure: global motion estimation between the key frames K of two adjacent VSNs, V_i and V_j]


Key Frame Encoding

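[Figure: key frame encoding example. VSNs V_0, V_1, ..., V_k observe the target scene. The decoded key frame K'_{0,48} is warped, and the warped frame is compared with the key frame at V_1 (K'_{1,48} in the figure) in three steps: (a) co-located block MSE calculation and comparison; (b) block-based SDS extraction and comparison; (c) significant wavelet coefficients extraction. The significant wavelet coefficients for K_{1,48} then pass through quantization and entropy encoding to give the compressed bitstream for K_{1,48}.]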


Non-key Frame Encoding

• Based on hash comparisons
• Block coding mode selection (Intra, Inter, or Skip)
  - for each frame, all the blocks are sorted in increasing order of their PSNR values (calculated against their co-located blocks in the reference frame from the same VSN)

PSNR_(1) ≤ PSNR_(2) ≤ ••• ≤ PSNR_(k)

Sorted blocks                      Coding mode
B_(1), B_(2), •••, B_(i)           Intra mode (H.264/AVC intraframe encoding)
B_(i+1), B_(i+2), •••, B_(j)       Inter mode (SDS extraction and comparison)
B_(j+1), •••, B_(k)                Skip mode

T_1 and T_2 mark the Intra/Inter and Inter/Skip boundaries, respectively.
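The block mode selection can be sketched as follows. The slides show T_1 and T_2 only as boundary markers in the sorted block list, so treating them as PSNR thresholds with T_1 < T_2 is an assumption made for illustration, as are the function names and the flat-list block representation.

```python
from math import inf, log10


def block_psnr(block, reference, peak=255):
    """PSNR in dB between a block and its co-located reference block,
    both given as flat lists of pixel values of equal length."""
    mse = sum((a - b) ** 2 for a, b in zip(block, reference)) / len(block)
    return inf if mse == 0 else 10 * log10(peak ** 2 / mse)


def select_modes(psnrs, t1, t2):
    """Sort block indices by increasing PSNR and assign a coding mode to each.

    Assumed rule: PSNR < t1 -> Intra, t1 <= PSNR < t2 -> Inter, otherwise Skip.
    Returns (sorted block indices, {block index: mode}).
    """
    if t1 > t2:
        raise ValueError("expected t1 <= t2")
    order = sorted(range(len(psnrs)), key=lambda i: psnrs[i])
    modes = {}
    for i in order:
        if psnrs[i] < t1:
            modes[i] = "Intra"
        elif psnrs[i] < t2:
            modes[i] = "Inter"
        else:
            modes[i] = "Skip"
    return order, modes
```

Blocks that already match the reference well are skipped, poorly predicted blocks are intra coded, and only the blocks in between incur the cost of SDS extraction and comparison.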


Non-key Frame Encoding for Blocks with Inter Mode

[Figure labels: SDS for K'; coefficients for W; initial significant symbols for W; blocks with Inter mode in W]

[Figure: rate-distortion comparison. x-axis: Bitrate (kbps), 0 to 1000; y-axis: PSNR (dB), 28 to 42; curves: H.264 Inter (GOP = ∞), H.264 Intra (GOP = 1), Proposed (GOP = 4), Multi (GOP = 4), Single (GOP = 4)]


Concluding Remarks

• Low-complexity video coding has become a very active research topic
• Distributed video coding (DVC), based on distributed source coding (DSC), has emerged as a new paradigm for low-complexity video coding
• Further research
  - side information generation
  - transformation and quantization
  - channel coding
  - rate control
  - other DSC-related applications
    · multimedia authentication
    · biometrics security
    · layered video coding
    · error resilience for standard video coding
    · other low-complexity video coding architectures


References

[1] F. Pereira, L. Torres, C. Guillemot, T. Ebrahimi, R. Leonardi, and S. Klomp, “Distributed video coding: selecting the most promising application scenarios,” to appear in Signal Processing: Image Communication.

[2] C. Guillemot, F. Pereira, L. Torres, T. Ebrahimi, R. Leonardi, and J. Ostermann, “Distributed monoview and multiview video coding: basics, problems and recent advances,” IEEE Signal Processing Magazine, vol. 24, no. 5, pp. 67-76, Sept. 2007.

[3] M. Maitre, C. Guillemot, and L. Morin, “3-D model-based frame interpolation for distributed video coding of static scenes,” IEEE Trans. on Image Processing, vol. 16, no. 5, pp. 1246-1257, May 2007.

[4] R. Puri, A. Majumdar, and K. Ramchandran, “PRISM: a video coding paradigm with motion estimation at the decoder,” IEEE Trans. on Image Processing, vol. 16, no. 10, pp. 2436-2448, Oct. 2007.

[5] R. Puri, A. Majumdar, P. Ishwar, and K. Ramchandran, “Distributed video coding in wireless sensor networks,” IEEE Signal Processing Magazine, vol. 23, no. 4, pp. 94-106, July 2006.

[6] B. Girod, A. M. Aaron, S. Rane, and D. Rebollo-Monedero, “Distributed video coding,” Proceedings of the IEEE, vol. 93, no. 1, pp. 71-83, Jan. 2005.

[7] X. Artigas, J. Ascenso, M. Dalai, S. Klomp, D. Kubasov, and M. Ouaret, “The DISCOVER codec: architecture, techniques and evaluation,” in Proc. of 2007 Picture Coding Symposium, Lisbon, Portugal, Nov. 2007.


Our Preliminary Publications

[1] L. W. Kang and C. S. Lu, “Low-complexity power-scalable multi-view distributed video encoder,” in Proc. of Picture Coding Symposium, Lisbon, Portugal, Nov. 2007 (PCS2007).

[2] L. W. Kang and C. S. Lu, “Multi-view distributed video coding with low-complexity inter-sensor communication over wireless video sensor networks,” in Proc. of IEEE Int. Conf. on Image Processing, special session on Distributed Source Coding II: Distributed Image and Video Coding and Their Applications, San Antonio, TX, USA, Sept. 2007, vol. 3, pp. 13-16 (ICIP2007, invited paper).

[3] L. W. Kang and C. S. Lu, “Low-complexity Wyner-Ziv video coding based on robust media hashing,” in Proc. of IEEE Int. Workshop on Multimedia Signal Processing, Victoria, BC, Canada, Oct. 2006, pp. 267-272 (MMSP2006).

[4] L. W. Kang and C. S. Lu, “Wyner-Ziv video coding with coding mode-aided motion compensation,” in Proc. of IEEE Int. Conf. on Image Processing, Atlanta, GA, USA, Oct. 2006, pp. 237-240 (ICIP2006).

