Manipulating Lossless Video in the Compressed Domain William Thies , Steven Hall

Manipulating Lossless Video
in the Compressed Domain
William Thies (1), Steven Hall (2), Saman Amarasinghe (2)
(1) Microsoft Research India
(2) Massachusetts Institute of Technology
ACM Multimedia
October 20, 2009
Processing in the Compressed Domain
• Multimedia archives are growing rapidly
– Monsters vs. Aliens production: 100 TB
– Facebook photos: 400 TB
– YouTube: 600 TB
(Slide annotation: lossless prior to distribution)
• How to analyze or modify the data?
– Typical practice: Compressed Input → Uncompress → Process → Recompress → Compressed Output
– Compressed-domain transformation: Compressed Input → Process → Compressed Output
Prior Work: Focus on Lossy Formats
• DCT-based spatial compression (JPEG, MPEG stills)
– Resizing [Dugad & Ahuja 2001] [Mukherjee & Mitra 2002]
– Edge detection [Shen & Sethi 1996]
– Image segmentation [Feng & Jiang 2003]
– Shearing and rotating inner blocks [Shen & Sethi 1998]
– Linear combinations of pixels [Smith & Rowe 1996]
• DCT-based temporal compression (MPEG video)
– Captioning [Nang, Kwon, & Hong 2000]
– Reversal [Vasudev 1998]
– Distortion detection [Dorai, Ratha, & Bolle 2000]
– Transcoding [Acharya & Smith 1998]
• Almost no work on lossless formats
– Transpose and rotation of black/white images [Shoji 1995; Misra et al. 1999]
– Pattern matching in compressed text [Farach & Thorup 1998; Navarro 2003]
– Modifying pitch and playback of audio [Levine 1998]
Our Focus: Regular Processing of LZ77-Compressed Data Streams
Example
Input:
O O O O L A L A L A
to lowercase
Output:
o o o o l a l a l a
Example
Input: O O O O L A L A L A
Compressed Input: O ⟨3,1⟩ L A ⟨4,2⟩
– ⟨Count, Distance⟩ is a “Repeat Token”: copy Count items, starting Distance items back
Compressed-Domain Transformation (to lowercase)
Compressed Output: o ⟨3,1⟩ l a ⟨4,2⟩
Output: o o o o l a l a l a
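The lowercase example can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' compiler output: the `Repeat` class, `decode`, and `map_compressed` are hypothetical names, and the sketch assumes a stateless filter that consumes one item and produces one item.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class Repeat:
    """Repeat token: copy `count` items, starting `distance` items back."""
    count: int
    distance: int


def decode(tokens: List[Any]) -> List[Any]:
    """Expand literals and repeat tokens into the uncompressed stream."""
    out: List[Any] = []
    for t in tokens:
        if isinstance(t, Repeat):
            for _ in range(t.count):
                out.append(out[-t.distance])  # copy one item at a time
        else:
            out.append(t)
    return out


def map_compressed(tokens: List[Any], f: Callable[[Any], Any]) -> List[Any]:
    """Apply a 1-to-1 stateless filter to literals; repeat tokens pass through."""
    return [t if isinstance(t, Repeat) else f(t) for t in tokens]


compressed = ["O", Repeat(3, 1), "L", "A", Repeat(4, 2)]
lowered = map_compressed(compressed, str.lower)
```

Only three literals are touched instead of ten items, which is where the speedup comes from. Decoding `lowered` gives the same stream as lowercasing the decoded input.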
Our Contributions
• Handle the general case
– Produce and consume more than one data item
– Split and join data streams
• Implement in a compiler
– Programmer thinks in terms of uncompressed data
– Compiler translates to work on compressed data
– Relies on StreamIt programming language
• Evaluate on video processing tasks
– 12 videos in Apple Animation format
– Adjust colors or overlay two videos
– Speedups proportional to compression ratio (median 15x)
In This Talk
• StreamIt Language
• Compressed Domain Transformation
• Experimental Evaluation
The StreamIt Language
void->void pipeline FMRadio(float freq1, float freq2, int N) {
    add AtoD();
    add FMDemod();
    // Duplicate the demodulated signal into N parallel filter branches.
    add splitjoin {
        split duplicate;
        for (int i=0; i<N; i++) {
            // Each branch is a low-pass filter followed by a high-pass filter.
            add pipeline {
                add LowPassFilter(freq1 + i*(freq2-freq1)/N);
                add HighPassFilter(freq2 + i*(freq2-freq1)/N);
            }
        }
        join roundrobin();
    }
    add Adder();
    add Speaker();
}
Stream graph (N = 3): AtoD → FMDemod → Duplicate → {LPF1 → HPF1, LPF2 → HPF2, LPF3 → HPF3} → RoundRobin → Adder → Speaker
The StreamIt Language
• Applications
– DES and Serpent [PLDI 05]
– MPEG-2 [IPDPS 06]
– SAR, DSP benchmarks, JPEG, …
• Programmability
– StreamIt Language (CC 02)
– Teleport Messaging (PPOPP 05)
– Programming Environment in Eclipse (P-PHEC 05)
• Domain Specific Optimizations
– Linear Analysis and Optimization (PLDI 03)
– Optimizations for bit streaming (PLDI 05)
– Linear State Space Analysis (CASES 05)
• Architecture Specific Optimizations
– Compiling for Communication-Exposed Architectures (ASPLOS 02 & 06, dasCMP 07)
– Phased Scheduling (LCTES 03)
– Cache Aware Optimization (LCTES 05)
– Load-Balanced Rendering (Graphics Hardware 05)
• Migrating Legacy Code to a Stream Representation
– Using a Dynamic Analysis (MICRO 07)
Language Primitives
• Filter: pop N items, push M items per execution
• Splitter and Joiner: roundrobin(N,M) (figure examples: roundrobin(1,1), roundrobin(2,2))
Model of computation also known as cyclo-static dataflow
Example: Video Compositing
Source 1, Source 2 → roundrobin(1,1) joiner → MultiplyPixels (pop 2, push 1) → Output
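To make the data flow of this graph concrete, here is a sketch on plain uncompressed Python lists. The helper names `roundrobin_join` and `multiply_pixels` are hypothetical, and the 8-bit formula `a * b // 255` is an assumption, since the slide does not say what MultiplyPixels computes.

```python
from typing import List, Sequence


def roundrobin_join(a: Sequence[int], b: Sequence[int],
                    w1: int = 1, w2: int = 1) -> List[int]:
    """Interleave w1 items from a with w2 items from b, like roundrobin(w1, w2)."""
    out: List[int] = []
    i = j = 0
    while i + w1 <= len(a) and j + w2 <= len(b):
        out.extend(a[i:i + w1])
        out.extend(b[j:j + w2])
        i += w1
        j += w2
    return out


def multiply_pixels(stream: Sequence[int]) -> List[int]:
    """Filter with pop 2, push 1: combine each adjacent pair of pixels.

    Assumed 8-bit arithmetic: 255 acts as 1.0, 0 as 0.0.
    """
    return [stream[k] * stream[k + 1] // 255
            for k in range(0, len(stream) - 1, 2)]


source1 = [255, 128, 0, 255]
source2 = [255, 255, 200, 0]
joined = roundrobin_join(source1, source2)   # pixels of the two sources alternate
output = multiply_pixels(joined)             # one output pixel per input pair
```

The programmer writes the graph in this uncompressed style; the point of the talk is that the compiler then maps the same graph onto compressed tokens.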
In This Talk
• StreamIt Language
• Compressed Domain Transformation
• Experimental Evaluation
Transforming Windows of Data
Input:
O O O O L A L A L A
Hyphenate
Pairs
Output:
O O – O O – L A – L A – L A –
Transforming Windows of Data
(⟨Count, Distance⟩ denotes a repeat token)
Input: O O O O L A L A L A
Compressed Input: O ⟨3,1⟩ L A ⟨4,2⟩
Coarsened, Expanded Input: O O ⟨2,2⟩ L A ⟨4,2⟩
Hyphenate Pairs
Compressed Output: O O – ⟨3,3⟩ L A – ⟨6,3⟩
Output: O O – O O – L A – L A – L A –
General Case: Filters
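Setting: a repeat token ⟨N, D⟩ (count N, distance D) arrives at a filter that pops I items and pushes O items per execution.
• Coarsen: expand the first D’ – D items of the repeat, leaving ⟨N’, D’⟩
– D’ = LCM (D, I)
– N’ = N – (D’ – D)
• Translate: emit the output repeat ⟨N’’·O/I, D’·O/I⟩
– N’’ = N’ – N’ % I
– The remaining N’ % I items are passed to the filter uncompressed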
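The coarsen and translate steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the StreamIt compiler's implementation: the filter is stateless, the names `Repeat`, `decode`, and `transform_filter` are hypothetical, and the whole decoded input is kept in `history` for simplicity. The sketch also expands extra items when a repeat starts in the middle of a filter window, a case the slide does not show.

```python
from dataclasses import dataclass
from math import gcd
from typing import Any, Callable, List, Sequence


@dataclass(frozen=True)
class Repeat:
    count: int      # N: number of items to copy
    distance: int   # D: how far back the copy starts


def decode(tokens: List[Any]) -> List[Any]:
    out: List[Any] = []
    for t in tokens:
        if isinstance(t, Repeat):
            for _ in range(t.count):
                out.append(out[-t.distance])
        else:
            out.append(t)
    return out


def transform_filter(tokens: List[Any],
                     f: Callable[[Sequence[Any]], Sequence[Any]],
                     pop: int, push: int) -> List[Any]:
    """Run a stateless filter (pop I, push O) directly on LZ77-style tokens."""
    history: List[Any] = []   # decoded input so far (unbounded: sketch only)
    pending: List[Any] = []   # items waiting to fill a window of `pop`
    out: List[Any] = []

    def feed(item: Any) -> None:
        history.append(item)
        pending.append(item)
        if len(pending) == pop:
            out.extend(list(f(pending)))
            pending.clear()

    for t in tokens:
        if not isinstance(t, Repeat):
            feed(t)
            continue
        n, d = t.count, t.distance
        d2 = d * pop // gcd(d, pop)          # D' = LCM(D, I)
        for _ in range(min(n, d2 - d)):      # coarsen: expand D' - D items
            feed(history[-d])
        n = max(0, n - (d2 - d))             # N' = N - (D' - D)
        while n > 0 and pending:             # align to a window boundary
            feed(history[-d])
            n -= 1
        whole = n - n % pop                  # N'' = N' - N' % I
        if whole > 0:
            out.append(Repeat(whole * push // pop, d2 * push // pop))
            for _ in range(whole):           # keep history in sync
                history.append(history[-d])
        for _ in range(n - whole):           # leftover N' % I items
            feed(history[-d])
    return out


def hyphenate(pair: Sequence[Any]) -> List[Any]:
    return [pair[0], pair[1], "-"]


result = transform_filter(["O", Repeat(3, 1), "L", "A", Repeat(4, 2)],
                          hyphenate, pop=2, push=3)
```

On the hyphenation example, `result` holds the literals `O O -`, a repeat ⟨3,3⟩, the literals `L A -`, and a repeat ⟨6,3⟩, matching the compressed output on the slide (with an ASCII hyphen standing in for the dash).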
Splitting Streams
(⟨Count, Distance⟩ denotes a repeat token)
Example with a roundrobin(1,1) splitter:
Input: L A L A L A L A L A
Compressed Input: L A ⟨8,2⟩
Compressed Outputs: L ⟨4,1⟩ and A ⟨4,1⟩
Example with a roundrobin(2,2) splitter:
Compressed Input: L A ⟨8,2⟩
Coarsened, Expanded Input: L A L A ⟨6,4⟩
Compressed Outputs: L A ⟨2,2⟩ L A and L A ⟨2,2⟩
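A roundrobin(U, V) splitter can be handled the same way as a filter, with U+V playing the role of the pop rate. The sketch below is an illustration under assumptions, not the authors' code: `Repeat`, `decode`, and `split_roundrobin` are hypothetical names, both weights are assumed positive, and the full decoded input is kept in memory for simplicity.

```python
from dataclasses import dataclass
from math import gcd
from typing import Any, List, Tuple


@dataclass(frozen=True)
class Repeat:
    count: int
    distance: int


def decode(tokens: List[Any]) -> List[Any]:
    out: List[Any] = []
    for t in tokens:
        if isinstance(t, Repeat):
            for _ in range(t.count):
                out.append(out[-t.distance])
        else:
            out.append(t)
    return out


def split_roundrobin(tokens: List[Any], u: int, v: int) -> Tuple[List[Any], List[Any]]:
    """Split a token stream: U items to the first output, then V to the second."""
    cycle = u + v
    history: List[Any] = []              # decoded input so far (sketch only)
    outs: Tuple[List[Any], List[Any]] = ([], [])

    def feed(item: Any) -> None:
        phase = len(history) % cycle     # position inside the current cycle
        outs[0 if phase < u else 1].append(item)
        history.append(item)

    for t in tokens:
        if not isinstance(t, Repeat):
            feed(t)
            continue
        n, d = t.count, t.distance
        d2 = d * cycle // gcd(d, cycle)          # D' = LCM(D, U+V)
        for _ in range(min(n, d2 - d)):          # coarsen: expand D' - D items
            feed(history[-d])
        n = max(0, n - (d2 - d))                 # N'
        while n > 0 and len(history) % cycle:    # align to a cycle boundary
            feed(history[-d])
            n -= 1
        whole = n - n % cycle                    # N'' = N' - N' % (U+V)
        if whole > 0:
            outs[0].append(Repeat(whole * u // cycle, d2 * u // cycle))
            outs[1].append(Repeat(whole * v // cycle, d2 * v // cycle))
            for _ in range(whole):
                history.append(history[-d])
        for _ in range(n - whole):               # leftover items
            feed(history[-d])
    return outs


stream = ["L", "A", Repeat(8, 2)]
first_11, second_11 = split_roundrobin(stream, 1, 1)
first_22, second_22 = split_roundrobin(stream, 2, 2)
```

With weights (1,1) each output is one literal followed by a repeat ⟨4,1⟩. With weights (2,2) the repeat is first coarsened to distance 4, and each output receives a repeat ⟨2,2⟩.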
Splitting and Joining: Transpose
[Figure: step-by-step transpose of the rows "O O O O" and "X O O O", performed by splitting and joining the compressed streams]
General Case: Joiners
Setting: repeat tokens ⟨N1, D1⟩ and ⟨N2, D2⟩ arrive on the two inputs of a roundrobin(W1, W2) joiner.
Output: repeat token ⟨N’, D1·(W1+W2)/W1⟩
If D1 % W1 = 0 and D2 % W2 = 0 and D1/W1 = D2/W2
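The joiner condition and the resulting distance can be checked with a small Python sketch. The function names are hypothetical, and the sketch covers only the distance and the condition, because the slide does not give a formula for N’. The periodicity check is a brute-force sanity test on uncompressed data.

```python
from typing import Any, List, Optional, Sequence


def joined_distance(d1: int, d2: int, w1: int, w2: int) -> Optional[int]:
    """Distance of the joined repeat, or None if the condition does not hold."""
    if d1 % w1 == 0 and d2 % w2 == 0 and d1 // w1 == d2 // w2:
        return d1 * (w1 + w2) // w1
    return None


def roundrobin_join(a: Sequence[Any], b: Sequence[Any], w1: int, w2: int) -> List[Any]:
    """Uncompressed reference joiner: w1 items from a, then w2 items from b."""
    out: List[Any] = []
    i = j = 0
    while i + w1 <= len(a) and j + w2 <= len(b):
        out.extend(a[i:i + w1])
        out.extend(b[j:j + w2])
        i += w1
        j += w2
    return out


def has_period(items: Sequence[Any], d: int) -> bool:
    """True if every item equals the item d positions earlier."""
    return all(items[k] == items[k - d] for k in range(d, len(items)))


# Input 1 repeats at distance 4, input 2 at distance 2, joiner is roundrobin(2, 1).
stream1 = list("abcd") * 3
stream2 = list("xy") * 3
distance = joined_distance(4, 2, 2, 1)
joined = roundrobin_join(stream1, stream2, 2, 1)
```

Here both inputs repeat every two joiner cycles (D1/W1 = D2/W2 = 2), so the joined stream repeats at distance 2·(2+1) = 6, which is the value `joined_distance` returns.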
In This Talk
• StreamIt Language
• Compressed Domain Transformation
• Experimental Evaluation
Implementation
• Implemented subset of transformations in StreamIt
– 1-to-1 filter
– 1-to-1 joiner with 2-to-1 filter
– User can change graph connectivity + filter functions
• Supported file format: Apple Animation (part of .MOV)
– Standard format for interchange of lossless video
– Compression: Run-length encoding within a line +
difference encoding between frames
• Emit executable plugins for MEncoder and Blender
– Allows integration with standard video editing workflow
Experimental Methodology
• Evaluated on 12 videos drawn from Internet video,
computer animation, and stock digital television content
• Two classes of transformations:
1. Color adjustment: inverse, brightness, contrast
2. Composite transformations: alpha-under, multiply
[Figure: example frames for the two composite transformations, alpha-under (A + B) and multiply (A x B), with their stream graphs]
Results: Execution Time
[Figure: log-log scatter plot of Speedup (1x to 1000x) versus Compression Factor Following Re-compression (1x to 1000x), with series for Brightness, Contrast, Inverse, and Compositing]
• Color Adjustment: 2.5x to 471x (median 17x)
• Compositing: 1.1x to 32x (median 6.6x)
• Compression factor was low (≤1.1x) for one of the source videos
Results: File Bloat
[Figure: scatter plot of File Bloat Relative to Recompression (0x to 6x) versus Compression Factor Following Re-compression (1x to 1000x), with series for Brightness, Contrast, Inverse, and Compositing]
• Masked out areas not re-compressed
• Saturated colors not re-compressed
Opportunity: Ignoring “Dead” Data
• Some pixels in composite frames do not depend on both
input frames
– Example: digital television mask (a low-performance case)
• If two data streams are multiplied, and one of them is
repeatedly zero, then the repeat can be copied to the
output (regardless of the values in the other stream)
– We expect this would fix performance of our outlier cases
– Requires pattern matching on stream graph
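The idea can be illustrated on a simplified representation. The sketch below assumes each stream is a non-empty list of run-length pairs `(value, length)` rather than general repeat tokens, and that both streams have the same total length. `multiply_runs` is a hypothetical name, and this is not part of the implementation described in the talk.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (value, run length)


def multiply_runs(a: List[Run], b: List[Run]) -> List[Run]:
    """Multiply two run-length streams item by item, keeping the output in runs."""
    out: List[Run] = []
    i = j = 0
    va, na = a[0]
    vb, nb = b[0]
    while True:
        n = min(na, nb)
        # A zero on either side decides the result; the other value is "dead".
        val = 0 if (va == 0 or vb == 0) else va * vb
        if out and out[-1][0] == val:
            out[-1] = (val, out[-1][1] + n)   # extend the current output run
        else:
            out.append((val, n))
        na -= n
        nb -= n
        if na == 0:
            i += 1
            if i == len(a):
                break
            va, na = a[i]
        if nb == 0:
            j += 1
            if j == len(b):
                break
            vb, nb = b[j]
    return out


mask = [(0, 4), (1, 2)]                            # four masked pixels, two visible
video = [(7, 1), (9, 1), (3, 1), (5, 1), (2, 2)]   # incompressible under the mask
composite = multiply_runs(mask, video)
```

The four distinct video pixels under the mask collapse into a single zero run in the output, so the result stays compressed even though one input was not.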
Extension to Other File Formats
• High-efficiency mappings
– Flic Video
– Microsoft RLE
– Targa (with run-length encoding)
• Medium-efficiency mappings
– OpenEXR
– Planar RGB
→ Re-arranges data by color or by byte
• Low-efficiency mappings
– ZIP
– GZIP
– PNG
→ Performs Huffman coding prior to LZ77
Conclusions
• New method for direct processing of lossless-encoded
data streams
– Relies on LZ77 compression and stream programming model
– Supports operations on windows of data
– Supports splitting, joining, and reordering data
• Preliminary implementation in an automatic compiler
– Write program on uncompressed data, run on compressed data
• Good speedups in the context of video processing
– 15x speedup (median) on color adjustment and compositing
– Across 12 videos in Apple Animation format
– May prove useful as more content authored in lossless formats
• Scope for extending technique, finding new applications
Extra Slides
General Case: Splitters
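Setting: a repeat token ⟨N, D⟩ (count N, distance D) arrives at a roundrobin splitter that sends U items to its first output and V items to its second.
• Coarsen: expand the first D’ – D items of the repeat, leaving ⟨N’, D’⟩
– D’ = LCM (D, U+V)
– N’ = N – (D’ – D)
• Translate: emit one repeat on each output
– N’’ = N’ – N’ % (U+V)
– Second output: ⟨N’’·V/(U+V), D’·V/(U+V)⟩
– First output (by symmetry): ⟨N’’·U/(U+V), D’·U/(U+V)⟩
– The remaining N’ % (U+V) items are passed to the splitter uncompressed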