Department of Computer Science and Engineering, CUHK
2007-2008 Final Year Project Presentation (2nd term)
LYU0703 Electronic Advertisement Guide on PS3
Huang Hiu Fung (05700512), Wong Chung Hoi (05596742)
Supervised by Prof. Michael R. Lyu

Agenda
• Background Information
• Project Motivation and Objectives
• Implementation
• Result Analysis
• Performance and Cost Comparison
• Remote Demo
• Q&A

Background Information
Market of Commercial Monitoring
• Verify that ads are broadcast as stated in the contract (number of broadcasts, broadcast time, duration)
• E.g. a large-scale ad campaign of HKD$10,000,000 spends 5% (HKD$500,000) on commercial monitoring
Current Solution
• Manual monitoring
• Inefficient

Project Motivation
• Hundreds of TV channels
• Increasing need for TV commercial monitoring
• PlayStation®3, a cheap parallel machine

Project Objectives
• Develop an accurate algorithm for commercial monitoring
• Parallelize it for better performance
• Generate an Electronic Advertisement Guide (EAG)

Developing Environment
• PlayStation®3
• A multi-core machine produced by Sony with the Cell Broadband Engine processor
• Strong computation power
• Runs Linux
• Open platform for different applications and development

Implementation of Commercial Monitoring Algorithm

High Level Description of the Solution
1. Converting raw video data into a series of Hl3 files
2. Processing Hl3 files to become the final EAG

Hl3 Files
• In the form of a 32 x 32 array (1024 integers)
• Frames captured from an analog TV card
• Digest of a frame of 352 x 288 pixels
• Time information in the file name, e.g. 2008.03.12.00.10004415.T.1.Hl3

Obtaining Hl3 Files from Raw Video Data
• At first, frames were captured at a constant frequency
• For example, 25 fps, 1 hr video
• 3600 seconds x 25 = 90,000 frames
[Figure: high-frequency vs. low-frequency capture]

Obtaining Hl3 Files from Raw Video Data
• Frames captured on scene change

Processing Hl3 Files to Become EAG
• Minimum difference algorithm
• m "Target" files P (p1, p2, …, pk)
• n "Repository" files Q (q1, q2, …, qk)
• Match P to Q such that the difference between P and Q is minimum
• O(m x n x k)

Weaknesses of Minimum Difference Algorithm
1. Many-to-one matching
2. Out-of-phase matching
3. Forced matching

Longest Common Subsequence (LCS)
• ACGGT
• AGCTC
• LCS = AGT or ACT
• Dynamic programming

      j   0  1  2  3  4  5
  i          A  C  G  G  T
  0       0  0  0  0  0  0
  1   A   0  1  1  1  1  1
  2   G   0  1  1  2  2  2
  3   C   0  1  2  2  2  2
  4   T   0  1  2  2  2  3
  5   C   0  1  2  2  2  3

LCS on Character Strings to Hl3 Files
• Alphabet = Hl3 file
• Subsequence = advertisement
1. One-to-one matching
2. In-phase matching
3. No forced matching

Modification of LCS Algorithm
• R frames should not be matched with T frames
– T-R combination
• Out-of-phase advertisements
– Multiple-pass LCS

Original LCS Algorithm
• Allows an R frame to match a T frame
• Computation = m x n x k

      j   0  1  2  3  4  5  6
  i          R  T  R  T  R  T
  0       0  0  0  0  0  0  0
  1   R   0  1  1  1  1  1  1
  2   T   0  1  2  2  2  2  2
  3   R   0  1  2  2  3  3  3
  4   T   0  1  2  3  3  4  4
  5   R   0  1  2  3  3  4  4
  6   T   0  1  2  3  3  4  5

T-R Combination
• Saves the comparisons between T and R frames (half the computation)
• Table size reduces to m/2 x n/2
• Computation: m/2 x n/2 x 2k = ½ x m x n x k
[Table: LCS result table indexed by T-R pairs]

Single Pass LCS
• Cannot recognize crossed advertisement segments
• Only the longer advertisement is recognized

Multiple Pass LCS
• The number of passes depends on the relative complexity of the 2 videos
• Around 7-8 passes for two 1-hour-long videos
• Sped up by caching the "Equality" comparison results from the 1st pass

Program Flow
1. Compute result table
2. Backtrack LCS result
3. Analyze LCS result
4. Synchronize analysis result
5. Propagate result back to input data stream
6. Print output
7. Loop back to 1 until no more new advertisements are detected

Analyze LCS Result
• The LCS result has no information on the start time or end time of an advertisement
• An LCS may contain more than one advertisement or some mismatched frames

Printing Out Result
[Figure]

Speeding up with PlayStation®3
1. Compute result table
2. Backtrack LCS result
3. Analyze LCS result
4. Synchronize analysis result
5. Propagate result back to input data stream
6. Print output
7. Loop back to 1 until no more new advertisements are detected
• Speed-up techniques: parallelization, SIMD, double buffering, loop unrolling, caching comparison results
• Step 1: O(m x n x k)
• Steps 2-6: O(m + n)
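
The dynamic-programming table and the backtracking step from the LCS slides can be sketched as follows. This is a minimal illustration on the character strings from the slide, not the project's implementation; the `equal` parameter is a hypothetical stand-in for the "Equality" comparison between two Hl3 digests, whose exact rule the slides do not give.

```python
def lcs_table(a, b, equal=lambda x, y: x == y):
    """Fill the (len(a)+1) x (len(b)+1) LCS result table."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if equal(a[i - 1], b[j - 1]):
                # Symbols match: extend the LCS found on the diagonal.
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # No match: carry over the better of "up" and "left".
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table


def backtrack(table, a, b, equal=lambda x, y: x == y):
    """Walk back from the bottom-right cell to recover one LCS."""
    i, j = len(a), len(b)
    result = []
    while i > 0 and j > 0:
        if equal(a[i - 1], b[j - 1]):
            result.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return result[::-1]


# Rows follow AGCTC and columns follow ACGGT, as in the slide's table.
table = lcs_table("AGCTC", "ACGGT")
lcs = "".join(backtrack(table, "AGCTC", "ACGGT"))
print(table[-1][-1], lcs)
```

For Hl3 files, `a` and `b` would be sequences of frame digests rather than characters, and `equal` would be whatever rule decides that two digests show the same frame (an assumption here).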
Parallelization on Computing Result Table
• Filling the cell (i, j) requires information from (i-1, j-1), (i, j-1) and (i-1, j)
• Step at which each cell of a 5 x 5 table can be computed:

  1  2  3  4  5
  2  3  4  5  6
  3  4  5  6  7
  4  5  6  7  8
  5  6  7  8  9

Parallelization on Computing Result Table
• Example: 3 cores, 5 x 5 table
• Cores are not fully utilized in the first 2 steps and the last 3 steps

  Core 1    1  2  3  4  5
  Core 2    2  3  4  5  6
  Core 3    3  4  5  6  7
  Core 1    6  7  8  9 10
  Core 2    7  8  9 10 11

• Example: 6 cores, 4000 x 4000 table
• Only the first 5 and the last 5 steps are not fully utilized

Caching "Equality" Comparison Result
• The 1st pass does all the "Equality" comparisons between all T-R pairs
• Memory requirement: m/2 x n/2 = ¼ x m x n
• 1st pass: m/2 x n/2 x 2k = ½ x m x n x k
• 2nd pass and onward: m/2 x n/2 = ¼ x m x n
• In our case, k = 1024 (number of values in an Hl3 file)
• The 2nd pass and onward are sped up by 2048X

Result Analysis

Experiment of Cross-Comparison of 7 Videos
• "TV Easy(宣傳易)" from 7 different days
• About 5 minutes each
• Only ads are present
Aim:
• To show that all ads that appeared at least twice within the 7 days can be found by cross-comparison
• C(7, 2) = 21 comparisons

Data Information
[Figure]

Flow of Comparison
[Figure]

One of the Detailed Results
• An ad missed on one day is still found on another day

Overall Result
[Figure]

Experiment of Comparing 2 One-Hour-Long Videos
• Two one-hour-long videos
• Including commercials, news reports and drama programs
Aim:
• Show the matching rate
• Show the performance

Overall Result
[Figure]

False Positive
• Blank frame
• Low minimum difference
[Figure: example of a false positive advertisement]

Performance and Cost Comparison

Performance Comparison on PC and PlayStation®3
Running LCS algorithm in different conditions:

  Running condition    Time (sec)
  PPU                  2376
  1 SPU                278
  6 SPU                99
  PC                   562

Performance Comparison on PC and PlayStation®3
Using 6 SPUs is:
• 24 times faster than using only the PPU on PlayStation®3
• 2.8 times faster than using 1 SPU on PlayStation®3
• 5.7 times faster than running on a PC

Cost Comparison
• Comparing 1 hr of video to 6 hr of video
• 6 x 99 sec ≈ 10 mins on PlayStation®3
• 6 x 562 sec ≈ 56 mins on PC

  Elapsed time for comparing 1 hr to 1 hr (sec)
  PS3    99
  PC     562

• Cost for a PS3: $3000
• Cost for 5 PCs: $20000
• Cost for 3 staff: $20000 per month

Conclusion
• Implemented the LCS algorithm for commercial monitoring, better than the 1st term approach
• Parallelized on PS3, 5.7 times faster than PC
• Achieved: accuracy ≈ 95%, performance ≈ 99 sec for comparing a 1 hr video to a 1 hr video
• Generated an EAG from the result

Q&A

The End

Supplementary

Limitation on Direct Memory Access (DMA)
• SPUs transfer data between main memory and local store via DMA
• Transfers are at least 16 bytes, in multiples of 16 bytes
• Each element in the result table is an integer (4 bytes)
• Compute 4 elements at a time
• Step at which each block of 4 elements is computed:

  Block of 4    Core 1  Core 2  Core 3  Core 1  Core 2
  1st           1       2       3       5       6
  2nd           2       3       4       6       7
  3rd           3       4       5       7       8
  4th           4       5       6       8       9

Double buffering
• The Memory Flow Controller (MFC) operates in parallel with the SPU
• Fetch data from main memory to local store and compute results at the same time
• Use an extra buffer to store data that is prefetched

SIMD intrinsic function
• Operates on multiple data at the same time
• Mostly works on float or double
• 128-bit registers as input
• 4 float values at a time
• Speed up 4X

Loop unrolling
• Gets rid of the computation time for the loop counter in a loop
• Loop overhead accounts for a large share of run time when the loop body has few statements
• Reduced run time from 2 mins to 1 min 35 secs

Analyze LCS result
• Assign a flag to each LCS unit (T-R pair)
• START_FLAG 'S' indicates the start point of an advertisement
• END_FLAG 'E' indicates the end point of an advertisement
• MIDDLE_FLAG '>' indicates part of an advertisement
• SINGLE_FLAG 'A' indicates an advertisement by itself (Start + End)
• ISOLATE_FLAG 'X' indicates it does not belong to any advertisement

Analyze LCS result
• Split the LCS result into different segments by using a threshold
• Recognize a segment as an advertisement if its duration is > 5 seconds
• Assign a character string to the analysis result

Synchronizing flags
• Analysis of the LCS on the 2 video streams may give different results
• Synchronization is needed

Synchronizing flags
• Similar to a logical AND operation
• Recognized as an advertisement only if both analysis strings agree with each other

Propagate flags to input data stream
[Figure]

Advantage of Comparing to 6 Other Videos
• An ad missed on one day is still found on another day

False Negative
• Comparing 0304 to 0305
• Comparing 0305 to 0304
• The false-negative condition is caused by the same scenes being used in 2 ads

Supplementary – Future Development
• Remove TV promos
• Logo recognition
• Use knowledge of the Electronic Program Guide
• Identify new commercials

"Equality" Comparison
• Computing the result table requires an "Equality" comparison between 2 symbols

Simple Logic System
• Difficult to decide the "best" logic or scoring system
• As simple as possible
• Gives high-accuracy results
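
The dependency rule from the "Parallelization on Computing Result Table" slides (cell (i, j) needs only its upper, left and upper-left neighbours) means all cells on one anti-diagonal can be filled independently. The sketch below simulates that wavefront order sequentially in Python; the function names are hypothetical, and it does not model SPUs, DMA, or the per-row core assignment shown in the slides.

```python
def wavefront_schedule(rows, cols):
    """Group 0-based cells (i, j) by anti-diagonal i + j.

    Cells in the same group do not depend on each other, so each
    group corresponds to one parallel step.
    """
    steps = []
    for d in range(rows + cols - 1):
        lo = max(0, d - cols + 1)
        hi = min(rows - 1, d)
        steps.append([(i, d - i) for i in range(lo, hi + 1)])
    return steps


def lcs_length_wavefront(a, b):
    """Fill the LCS table one anti-diagonal at a time."""
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for cells in wavefront_schedule(len(a), len(b)):
        # Each cell in `cells` could be handed to a different core.
        for i, j in cells:
            r, c = i + 1, j + 1  # table has an extra zero row and column
            if a[i] == b[j]:
                table[r][c] = table[r - 1][c - 1] + 1
            else:
                table[r][c] = max(table[r - 1][c], table[r][c - 1])
    return table[-1][-1]


schedule = wavefront_schedule(5, 5)
print([len(step) for step in schedule])
print(lcs_length_wavefront("AGCTC", "ACGGT"))
```

For a 5 x 5 table the schedule has 9 steps, matching the step numbers 1 to 9 on the slide, and the short steps at the beginning and end are where cores sit idle.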