CIS 751 Project in Computer Science Video Analysis Motion Detection using Principal Component Analysis Roland Miezianko rmiezian@temple.edu Advisor Professor Dr. Longin Jan Latecki latecki@temple.edu Temple University Fall 2003 Table of Contents Introduction ......................................................................................................................... 3 Dataset................................................................................................................................. 3 Algorithm Overview ........................................................................................................... 3 Algorithm Steps .................................................................................................................. 4 Step 1 – Reshape the Image Data ............................................................................... 4 Step 2 – Compute the PCA Matrix ............................................................................. 5 Step 3 – Project Image Data onto the PCA Matrix to obtain the Scores Matrix ........ 6 Step 4 – Apply Dynamic Thresholding to the Scores Matrix ..................................... 7 Step 5 – Generate Video Showing Detected Motion .................................................. 8 Data Analysis ...................................................................................................................... 9 Conclusion ........................................................................................................................ 12 Matlab Code ...................................................................................................................... 12 Introduction This project presents concepts and working program to detect motion in a video sequence using Principal Component Analysis (PCA). Prior to using PCA for motion detection, a histogram-based algorithm was used. The histogram-based system did not provide as good results as the PCA based system where speed of computation and sensitivity to motion were of primary concern. Using PCA to provide the feature descriptor of consecutive video frames along with dynamic thresholding provided very good results in detecting object larger than the video segmentation block used. The PCA algorithm presented here used two different PCA matrix generation methods: global and local; and two thresholding methods: static and dynamic. Collected data shows that the local PCA generation with dynamic thresholding may be the best combination of speed and sensitivity. Matlab code along with video showing detected motion is provided as part of this project. Dataset Dataset used in this project consist of individual video frames saved as jpeg files. There are 2688 full RGB color frames in the test video sequence. Each frame was reduced by 50% and converted to gray scale before it was used. Once the frames were analyzed, new video was produced using the reduced grayscale data with motion blocks projected onto the image to highlight the spatial location of detected movement. Algorithm Overview Algorithm follows these basic five steps: 1. 2. 3. 4. Reshape the image data into 8x8 blocks Compute the PCA based on selected number of frames Project the image data onto the PCA matrix to obtain scores for each block Apply dynamic thresholding to the scores matrix for each block of each frame. Create motion detected Yes/No matrix for each block of each frame. 5. Generate video highlighting the spatial locations where motion was detected The Matlab code was written to separate each step and allow for changes to not affect the repetition of previous steps. Each step generated data that is used in the subsequent steps. For example, after completing Step 3, were the scores matrix was generated, experiments may be perform during Step 4, such as performing static thresholding or dynamic thresholding. Algorithm Steps Step 1 – Reshape the Image Data Image frame is loaded from an external jpeg file and Matlab converts it to RGB 3-dimentional matrix. In order to reduce the dataset to more manageable size, each image has been reduced in size by 50% and subsequently converted to grayscale. The two-dimensional grayscale matrix is then reshaped into 8x8 blocks using the im2col Matlab function. This operation produces 64 rows by 1728 columns matrix. There are 64 pixels per each 8x8 block and there are 1728 blocks in each reduced grayscale image. The resulting matrix is then transposed to 1728x64 matrix so each row represents individual block and each columns represents one of the 64 pixels in the block. This will allow for generation of the PCA dataset and for later computation of the scoring matrix. Image Image blocks are generated using a down-up-left-down motion. transposed im2col(Image) im2col(Image) 64 Pix per Block 1728 Blocks 1728 Blocks 64 Pix per Block Converting image into 8x8 blocks and transposing it, yields a 1728 by 64 matrix. Step 2 – Compute the PCA Matrix To compute the Principal Component Analysis matrix a dataset must be used that represents most of the video sequence. Taking every 250th frame of the video sequence and stacking the frames one on top of each other to create a 19008x64 matrix. In the dataset video sequence there are 2688 frames, so every 250th frame, produces about 11 frames that will be used to generate the PCA matrix. Once this matrix is created, a covariance is computed producing a 64x64 matrix. The covariance matrix is then used by the pcacov Matlab function to produce the PCA components. The produced PCA matrix is also 64x64 matrix with each row representing the 64 pixel values per block and the 64 columns represent the 64 principal components, arranged so that the first column represents the first principal component and so forth in descending order. Only the first three principal components will be used in generating the Score Matrix in the following step. 64 Pix per Block Frame 1 Blocks Frame j Blocks Frame N Blocks Append every 250th frame (total of 11 frames) to create a dataset used in computing the principal components. Principal components are computed based on the covariance matrix. 64 PCAs 64 Pix The principal components matrix contains 64 rows representing each pixel in the block and 64 columns representing 64 principal components. Only the first three components are used (first three columns) Step 3 – Project Image Data onto the PCA Matrix to obtain the Scores Matrix Once the principal components are generated, the score matrix must be generated next. Each frame image is projected onto the first 3 principal components (1728x64 * 64x3) to create a 1728x3 score matrix per frame. Each frame has its own score matrix for each block. Once the score matrix is created for each frame, dynamic thresholding may be applied to obtain the motion /no-motion matrix. Frame X 3 PCAs Frame X 3 Scores 64 Pix 1728 Blocks 1728 Blocks 64 Pix per Block Projecting frame data onto the principal components matrix yields a 1728 by 3 score matrix. Step 4 – Apply Dynamic Thresholding to the Scores Matrix Dynamic thresholding is computed on the score matrix generated in the previous step. It is generated by taking a window of 7 frames (3 before, current, and 3 after) and computing a covariance matrix per each block, then taking the larger eigenvalue of the 3 produced. Once the single eigenvalue is computed for each block in the 7 window frame, a dynamic thresholding algorithm is used on this single value. Another window of 7 frames is used to store the mean and standard deviation and compare the value for the block with the mean and determine a Yes/No motion matrix. This motion matrix is 1728x2688, where there are 1728 blocks per frame and 2688 frames per video sequence. A value of 0 represents no motion per block/frame and value of 1 represents detected motion per block/frame. MotionFlag 1728 Blocks 2688 Frames Motion matrix is generated representing a Yes/No flag for each block of each frame. This will be used in Step 5 – Generate Video Showing Detected Motion The motion matrix obtained in Step 4 is then used with the image block data from Step 1 to generate a video sequence highlighting the spatial location of detected motion. Each grayscale image block is checked against the motion matrix, and if motion is detected in the block, the grayscale image is converted to RGB image (still grayscale in presentation) and its red value is set to either 220 or 255, depending on the original. If motion was not detected in the block, the image is unchanged. Final video file is save as an AVI compressed file. No motion detected Motion detected. Red blocks represent spatial locations where motion is present. Data Analysis The scores matrix figure shows the frame locations where motion was detected. The principal components used in generating the scores matrix were created based on 11 frames evenly spaced within the video sequence. The obtained results may be viewed as completed AVI file with red blocks representing motion or as a figure showing the detected motion of each block within the entire video sequence. Additional test was performed on the block 24x28, where the principal components were created by the block data only (from matrix 2688x64). It seems that the scores using the local principal components were higher in magnitude then the ones generated by the global principal components, however, both methods yielded acceptable values for the frames where motion was present. 3 Principal Components of Block 24x28 Block 24x28 score values per frame based on Local PCA values Block 24x28 score values per frame based on Global PCA values Zoom of block 24x28 score values per frame based on Local PCA values Zoom of block 24x28 score values per frame based on Global PCA values Mean gray level values of block 24x28 per frame Conclusion The method of motion detection using principal component analysis combined with dynamic thresholding yields good results in detecting motion. Improvements in generating the PCA matrix include the creation per block PCA matrix, instead of global PCA matrix. A better thresholding algorithm may be used to eliminate sporadic detection of motion where there is no motion. This appears to happen when the gray level values are very high. Future projects will include generating PCA matrix per each block of the video sequence and processing images with variation in size of the blocks. Matlab Code Matlab code is separated into several files to allow for data collection and analysis at each step of the process. Only one file is executed in Matlab, the main.m file. The following files contain the algorithm presented in this project. File main.m gencols.m genpca.m genscore1 genmotion.m genavi.m plotblock.m Description This file is always executed, as it calls all the other functions. Global data definition, such as location of video frames and directory location where data is to be stored. Reshape the image data into 8x8 blocks, where each row represents the block within a frame, and each column represents one of the 64 pixels of the 8x8 block. Store the reshaped frame in an external file. This data will later be use in Steps 2, 3, and 5. Take about every 250th frame within the video sequence giving us about 11 frames which are used to create the Principal Components matrix. There will be 64 principal components, of which only the top three will be used. Store the principal component matrix in an external file. Take each reshaped frame from the video sequence and project it onto the three principal components. Store the score matrix in an external file. This file has two functions: (1) Take the score matrix for each block and generate a motion data based on the third eigenvalue of a 7 frame window. Store the data. (2) Generate a 0/1 matrix of each block-frame combination based on dynamic thresholding method. Store the MotionFlag matrix in an external file. Take the reshaped frame and place red squares in blocks where motion was detected based on the MotionFlag matrix. Generate an AVI compressed video showing the spatial locations of detected motion. Plot the block EV data generated by genmotion.m file. The main file main.m calls all other functions to perform the steps required to detect motion using PCA-based algorithm. generate all blocks for each frame, rows=blocks, cols=pix of block gencols(imageFileNamePrefix, dataFileNamePrefix, FrameStart, FrameEnd, 8, 8) compute the pca matrix based on the frame data and given frame increment genpca(dataFileNamePrefix,FrameInc,FrameStart,FrameEnd,dump) compute scores for each block of each frame using the pcadata, X columns (3 OR 10) genscore1(dataFileNamePrefix,FrameStart,FrameEnd,dump) generate the motion matrix of each block of each frame if dump is 1, then we are creating the threshold matrix, otherwise we are creating the ev vectors of window size 7 genmotion(dataFileNamePrefix,NumBlocks,WindowOffset,FrameStart,Fr ameEnd,dump) generate video based on motion flag genavi(dataFileNamePrefix,NumBlocks,8,8,FrameStart,FrameEnd) plot results of a block BlockRow = 24; BlockCol = 28; BlockIndex = (BlockCol-1)*36 + BlockRow; plotblock(dataFileNamePrefix,BlockIndex,NumBlocks,FrameStart,Fram eEnd) Matlab code is provided in separate files.