FYP 0703 - Department of Computer Science and Engineering, CUHK

Department of Computer Science and Engineering, CUHK
2007-2008 Final Year Project Presentation (2nd term)
LYU0703
Electronic Advertisement Guide on PS3
Huang Hiu Fung
Wong Chung Hoi
05700512
05596742
Supervised by Prof. Michael R. Lyu
LYU0703 Electronic Advertisement
Guide on PS3
Agenda
• Background Information
• Project Motivation and Objectives
• Implementation
• Result Analysis
• Performance and Cost Comparison
• Remote Demo
• Q&A
Background Information
Market of Commercial Monitoring
• Verify that advertisements are broadcast as stated in the contract (number of broadcasts, broadcast time, duration)
• E.g. a large-scale advertising campaign of HKD 10,000,000 spends 5% (HKD 500,000) on commercial monitoring
Current Solution
• Manual monitoring
• Inefficient
Project Motivation
• Hundreds of TV channels
• Increasing need for TV commercial monitoring
• PlayStation®3, a cheap parallel machine
Project Objectives
• Develop an accurate algorithm for commercial monitoring
• Parallelize it for better performance
• Generate an Electronic Advertisement Guide (EAG)
Development Environment
• PlayStation®3
• A multi-core machine produced by Sony with the Cell Broadband Engine processor
• Strong computation power
• Runs Linux
• Open platform for different applications and development
Implementation of Commercial Monitoring Algorithm
High Level Description of the Solution
1. Convert raw video data into a series of Hl3 files
2. Process the Hl3 files to produce the final EAG
Hl3 Files
• A 32 x 32 array (1024 integers)
• Frames captured from an analog TV card
• Digest of a 352 x 288 pixel frame
• Time information in the file name, e.g. 2008.03.12.00.10004415.T.1.Hl3
Obtaining Hl3 Files from Raw Video Data
• At first, frames were captured at a constant frequency
• For example, a 1 hr video at 25 fps
• 3600 seconds x 25 = 90,000 frames
[Figure: frames captured at high frequency vs. low frequency]
Obtaining Hl3 Files from Raw Video Data
• Frames are captured on scene change instead
Processing Hl3 Files to Become EAG
• Minimum difference algorithm
• m “Target” files, each P = (p1, p2, …, pk)
• n “Repository” files, each Q = (q1, q2, …, qk)
• Match P to Q such that the difference between P and Q is minimum
• O(m x n x k)
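The matching step above can be sketched as follows. This is only an illustration: the difference metric (sum of absolute differences) is an assumption, since the slide's exact formula is not available, and the names `difference` and `min_difference_match` are hypothetical.

```python
def difference(p, q):
    """Sum of absolute differences between two equal-length digests (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def min_difference_match(targets, repository):
    """For each target digest, return the index of the closest repository digest."""
    matches = []
    for p in targets:
        best = min(range(len(repository)), key=lambda j: difference(p, repository[j]))
        matches.append(best)
    return matches


# Tiny digests with k = 3 values instead of 1024, for illustration only.
targets = [[10, 10, 10], [0, 0, 1], [9, 9, 9]]
repository = [[0, 0, 0], [10, 10, 11]]
print(min_difference_match(targets, repository))  # [1, 0, 1]
```

Two targets map to repository entry 1, and every target is assigned some match regardless of how large the difference is, which is the behaviour the next slide lists as weaknesses.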
Weaknesses of Minimum Difference Algorithm
1. Many-to-one matching
2. Out-of-phase matching
3. Forced matching
Longest Common Subsequence (LCS)
• Example strings: ACGGT and AGCTC
• LCS = AGT or ACT
• Computed by dynamic programming

i\j      A  C  G  G  T
      0  0  0  0  0  0
A     0  1  1  1  1  1
G     0  1  1  2  2  2
C     0  1  2  2  2  2
T     0  1  2  2  2  3
C     0  1  2  2  2  3
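A minimal sketch of the textbook dynamic-programming LCS on the two example strings. The function names are illustrative and this is not the project's code.

```python
def lcs_table(a, b):
    """Fill the (len(a)+1) x (len(b)+1) LCS length table by dynamic programming."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table


def backtrack(table, a, b):
    """Walk back from the bottom-right cell to recover one LCS."""
    i, j, out = len(a), len(b), []
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i, j = i - 1, j - 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


table = lcs_table("AGCTC", "ACGGT")
print(table[-1][-1])                       # 3
print(backtrack(table, "AGCTC", "ACGGT"))  # one LCS of length 3
```

Which of the two valid answers (AGT or ACT) comes out depends only on how ties are broken during backtracking.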
LCS on Character Strings to Hl3 Files
• Alphabet = Hl3 file
• Subsequence = advertisement
1. One-to-one matching
2. In-phase matching
3. No forced matching
Modification of LCS algorithm
• An R frame should not match with a T frame
– T-R combination
• Out-of-phase advertisements
– Multiple-pass LCS
Original LCS algorithm
• Allows an R frame to match with a T frame
• Computation = m x n x k

i\j      R  T  R  T  R  T
      0  0  0  0  0  0  0
R     0  1  1  1  1  1  1
T     0  1  2  2  2  2  2
R     0  1  2  2  3  3  3
T     0  1  2  3  3  4  4
R     0  1  2  3  3  4  4
T     0  1  2  3  3  4  5
T-R Combination
• Saves the comparisons between T and R frames (half the computation)
• Table size reduces to m/2 x n/2
• Computation: m/2 x n/2 x 2k = ½ x m x n x k
[Table: LCS result table indexed by T-R units instead of single frames]
Single Pass LCS
• Cannot recognize crossed advertisement segments
• Only the longer advertisement is recognized
Multiple Pass LCS
• Number of passes depends on the relative complexity of the 2 videos
• Around 7-8 passes for two 1-hour-long videos
• Sped up by caching the “Equality” comparison results from the 1st pass
Program Flow
1. Compute result table
2. Backtrack LCS result
3. Analyze LCS result
4. Synchronize analysis result
5. Propagate result back to input data stream
6. Print output
7. Loop back to 1 until no more new advertisements are detected
Analyze LCS Result
• The LCS result has no information on the start time and end time of an advertisement
• An LCS may contain more than one advertisement, or some mismatched frames
Printing Out Result
Speeding up with PlayStation®3
1. Compute result table → parallelization, SIMD, double buffering, loop unrolling
2. Backtrack LCS result
3. Analyze LCS result
4. Synchronize analysis result
5. Propagate result back to input data stream
6. Print output
7. Loop back to 1 until no more new advertisements are detected → caching comparison result
Step 1: O(m x n x k)
Steps 2-6: O(m + n)
Parallelization on Computing Result Table
• Filling the cell (i, j) needs information from (i-1, j-1), (i, j-1) and (i-1, j)
• Cells with the same step number can be filled in parallel:

1  2  3  4  5
2  3  4  5  6
3  4  5  6  7
4  5  6  7  8
5  6  7  8  9
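The step numbers in the table follow directly from the dependency rule: a cell can be filled one step after its upper and left neighbours. A small sketch (hypothetical helper names, 0-based indices) that reproduces the table and counts how many cells are available at each step:

```python
def wavefront_steps(rows, cols):
    """Earliest step for each cell when (i, j) needs (i-1, j), (i, j-1) and (i-1, j-1)."""
    return [[i + j + 1 for j in range(cols)] for i in range(rows)]


def cells_per_step(rows, cols):
    """Number of independent cells available at each step (anti-diagonal lengths)."""
    counts = {}
    for row in wavefront_steps(rows, cols):
        for step in row:
            counts[step] = counts.get(step, 0) + 1
    return [counts[s] for s in sorted(counts)]


for row in wavefront_steps(5, 5):
    print(row)
print(cells_per_step(5, 5))  # [1, 2, 3, 4, 5, 4, 3, 2, 1]
```

The short anti-diagonals at the beginning and end are why some cores sit idle in the first and last few steps.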
Parallelization on Computing Result Table
• Example: 3 cores, 5 x 5 table
• Cores are not fully utilized in the first 2 steps and the last 3 steps

Core 1   1  2  3  4  5
Core 2   2  3  4  5  6
Core 3   3  4  5  6  7
Core 1   6  7  8  9 10
Core 2   7  8  9 10 11

• Example: 6 cores, 4000 x 4000 table
• Only the first 5 and last 5 steps are not fully utilized
Caching “Equality” Comparison Result
• The 1st pass does all the “Equality” comparisons between all T-R pairs
• Memory requirement: m/2 x n/2 = ¼ x m x n
• 1st pass: m/2 x n/2 x 2k = ½ x m x n x k
• 2nd pass onward: m/2 x n/2 = ¼ x m x n
• In our case, k = 1024 (number of values in an Hl3 file)
• The 2nd pass onward is sped up by 2048X
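The 2048X figure is the ratio of the two operation counts above. A quick check, where the frame counts m and n are hypothetical and cancel out of the ratio:

```python
def first_pass_ops(m, n, k):
    # (m/2) x (n/2) table cells, each comparing one T-R pair: 2k value comparisons.
    return (m // 2) * (n // 2) * (2 * k)


def later_pass_ops(m, n):
    # One cached lookup per table cell.
    return (m // 2) * (n // 2)


m = n = 8000   # hypothetical frame counts, chosen only for illustration
k = 1024       # values per Hl3 file, from the slide
print(first_pass_ops(m, n, k) // later_pass_ops(m, n))  # 2048
```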
Result Analysis
Experiment of Cross-Comparison of 7 Videos
• “TV Easy (宣傳易)” from 7 different days
• About 5 minutes each
• Contains advertisements only
Aim:
• To show that every advertisement appearing twice within the 7 days can be found by cross-comparison
• C(7, 2) = 21 comparisons
Data Information
Flow of Comparison
One of the Detailed Results
• An advertisement missed on one day can still be found on another day
Overall Result
Experiment of Comparing 2 One-Hour-Long Videos
• Two one-hour-long videos
• Including commercials, news reports and drama programs
Aim:
• Show the matching rate
• Show the performance
Overall Result
False Positive
• Blank frames
• Low minimum difference
[Figure: Example of a False Positive Advertisement]
Performance and Cost Comparison
Performance Comparison on PC and PlayStation®3
Running LCS Algorithm in Different Conditions

Running condition    Time (sec)
PPU                  2376
1 SPU                278
6 SPU                99
PC                   562
Running on 6 SPUs of the PlayStation®3 is:
• 24 times faster than using only the PPU on PlayStation®3
• 2.8 times faster than using 1 SPU on PlayStation®3
• 5.7 times faster than running on a PC
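These ratios can be re-derived from the measured times (2376, 278, 99 and 562 seconds). The dictionary layout below is just for illustration:

```python
times = {"PPU": 2376, "1 SPU": 278, "6 SPU": 99, "PC": 562}  # seconds, from the chart

# Speedup of the 6-SPU run relative to each other running condition.
speedups = {name: round(t / times["6 SPU"], 1)
            for name, t in times.items() if name != "6 SPU"}
print(speedups)  # {'PPU': 24.0, '1 SPU': 2.8, 'PC': 5.7}
```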
Cost Comparison
• Comparing 1 hr of video against 6 hr of video
• 6 x 99 sec ≈ 10 mins on PlayStation®3
• 6 x 562 sec ≈ 56 mins on PC

Elapsed time for comparing 1 hr against 1 hr
PS3    99 sec
PC     562 sec

• Cost for PS3: $3000
• Cost for 5 PCs: $20000
• Cost for 3 staff: $20000 per month
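The elapsed-time estimates are simple scaling of the 1 hr against 1 hr measurement. A short check (the helper name is hypothetical):

```python
def total_minutes(seconds_per_hour_pair, repository_hours):
    """Minutes needed to compare one 1 hr video against several 1 hr repository videos."""
    return seconds_per_hour_pair * repository_hours / 60


print(round(total_minutes(99, 6), 1))   # 9.9  (PS3, about 10 mins)
print(round(total_minutes(562, 6), 1))  # 56.2 (PC, about 56 mins)
```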
Conclusion
• Implemented the LCS algorithm for commercial monitoring, better than the 1st term approach
• Parallelized on PS3, 5.7 times faster than PC
• Achieved:
accuracy ~= 95%
performance ~= 99 sec for comparing a 1 hr video with a 1 hr video
• Generated an EAG from the result
Q&A
The End
Supplementary
Limitation on Direct Memory Access (DMA)
• The SPU transfers data between main memory and local store via DMA
• Each transfer must be 16 bytes or a multiple of 16 bytes
• Each element in the result table is an integer (4 bytes)
• So 4 elements are computed at a time

Core 1   1 1 1 1   2 2 2 2   3 3 3 3   4 4 4 4
Core 2   2 2 2 2   3 3 3 3   4 4 4 4   5 5 5 5
Core 3   3 3 3 3   4 4 4 4   5 5 5 5   6 6 6 6
Core 1   5 5 5 5   6 6 6 6   7 7 7 7   8 8 8 8
Core 2   6 6 6 6   7 7 7 7   8 8 8 8   9 9 9 9
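The grouping of cells follows from the two sizes on the slide: 16 bytes per transfer divided by 4 bytes per integer gives 4 cells per block. A sketch of the arithmetic only (constant and function names are hypothetical, and no real DMA is involved):

```python
DMA_ALIGNMENT_BYTES = 16   # minimum DMA transfer size, from the slide
INT_BYTES = 4              # size of one result-table element, from the slide


def dma_blocks(row_length):
    """Split a row of table cells into groups that fill whole 16-byte transfers."""
    per_block = DMA_ALIGNMENT_BYTES // INT_BYTES
    return [list(range(start, min(start + per_block, row_length)))
            for start in range(0, row_length, per_block)]


print(dma_blocks(16))
```

A row of 16 cells therefore takes 4 block steps per core, which matches the 4 groups per row in the table.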
Double buffering
• The Memory Flow Controller (MFC) operates in parallel with the SPU
• Fetch data from main memory to local store and compute results at the same time
• Use an extra buffer to store the prefetched data
SIMD intrinsic function
• Operates on multiple data at the same time
• Mostly works on float or double
• 128-bit registers as input
• 4 float values at a time
• Speedup of 4X
Loop unrolling
• Gets rid of the computation time spent on the loop counter
• The loop counter contributes a lot of computation time when there are few statements within the loop
• Reduced run time from 2 mins to 1 min 35 secs
Analyze LCS result
• Assign a flag to each LCS unit (T-R pair)
• START_FLAG 'S': start point of an advertisement
• END_FLAG 'E': end point of an advertisement
• MIDDLE_FLAG '>': part of an advertisement
• SINGLE_FLAG 'A': an advertisement by itself (Start + End)
• ISOLATE_FLAG 'X': does not belong to any advertisement
Analyze LCS result
• Split the LCS result into segments using a threshold
• Recognize a segment as an advertisement if its duration > 5 seconds
• Assign a character string to the analysis result
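A rough sketch of the segmentation and flagging described above. It assumes each LCS unit carries a timestamp in seconds and that segments are split where the time gap exceeds a threshold. The gap threshold of 2 seconds, the function name, and the omission of SINGLE_FLAG 'A' are simplifications for illustration; only the 5-second rule and the flag characters come from the slides.

```python
def flag_units(times, gap_threshold=2.0, min_duration=5.0):
    """Return one flag per matched unit: 'S' start, '>' middle, 'E' end, 'X' isolated."""
    if not times:
        return ""
    # Split the matched units into segments wherever the time gap is too large.
    segments, current = [], [0]
    for idx in range(1, len(times)):
        if times[idx] - times[idx - 1] > gap_threshold:
            segments.append(current)
            current = []
        current.append(idx)
    segments.append(current)

    # Segments longer than min_duration become advertisements; the rest stay isolated.
    flags = ["X"] * len(times)
    for seg in segments:
        if times[seg[-1]] - times[seg[0]] > min_duration:
            flags[seg[0]] = "S"
            flags[seg[-1]] = "E"
            for idx in seg[1:-1]:
                flags[idx] = ">"
    return "".join(flags)


print(flag_units([0, 1, 2, 4, 6, 30, 31, 60, 62, 64, 66]))  # S>>>EXXS>>E
```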
Synchronizing flags
• Analysis of the LCS on the 2 video streams may give different results
• Synchronization is needed
• Similar to a logical AND operation
• Recognized as an advertisement only if both analysis strings agree with each other
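One way to read the AND-like rule is sketched below, assuming both analyses produce a flag string over the same LCS units. The function name and the choice to fall back to 'X' on disagreement are assumptions, not the project's confirmed logic.

```python
def synchronize(flags_a, flags_b, isolate="X"):
    """Keep a flag only where both analysis strings agree; otherwise mark the unit isolated."""
    if len(flags_a) != len(flags_b):
        raise ValueError("both analyses must cover the same LCS units")
    return "".join(a if a == b else isolate for a, b in zip(flags_a, flags_b))


print(synchronize("S>>EXS>E", "S>>EXXXX"))  # S>>EXXXX
```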
Propagating Flags to Input Data Stream
Advantage of Comparing to 6 Other Videos
• An advertisement missed on one day can still be found on another day
False Negative
[Figure: Comparing 0304 to 0305; Comparing 0305 to 0304]
• False-negative condition caused by the same scenes being used in 2 advertisements
Future Development
• Remove TV promos
• Logo recognition
• Use knowledge of the Electronic Program Guide
• Identify new commercials
“Equality” Comparison
• Computing the result table requires an “Equality” comparison between 2 symbols
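Since the symbols here are Hl3 digests rather than characters, "equal" has to mean "close enough". The sketch below assumes a sum-of-absolute-differences metric with a threshold; both the metric and the threshold value are illustrative assumptions, and the function names are hypothetical.

```python
def digests_equal(p, q, threshold):
    """Treat two digests as the same symbol if their total absolute difference is small."""
    return sum(abs(a - b) for a, b in zip(p, q)) <= threshold


def lcs_length(xs, ys, equal):
    """LCS length where symbol equality is decided by the supplied predicate."""
    prev = [0] * (len(ys) + 1)
    for x in xs:
        cur = [0]
        for j, y in enumerate(ys, start=1):
            cur.append(prev[j - 1] + 1 if equal(x, y) else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


# Digests with 2 values instead of 1024, for illustration only.
video_a = [[0, 0], [50, 50], [90, 90]]
video_b = [[51, 49], [200, 200], [91, 90]]
same = lambda p, q: digests_equal(p, q, threshold=4)
print(lcs_length(video_a, video_b, same))  # 2
```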
Simple Logic System
• Difficult to decide the “best” logic or scoring system
• Keep it as simple as possible
• Gives highly accurate results