Principal Components Analysis (PCA) Examples Colorado School of Mines Image and Multidimensional Signal Processing Example: Recognition of Handwritten Digits • Very important commercially (e.g., postal service) • The best classification algorithms use multiple techniques; achieve accuracy of > 99% • MNIST database – Training set of 60,000 examples, and a test set of 10,000 examples – Images are size normalized to fit in a 20x20 pixel box, then centered in a 28x28 image Colorado School of Mines http://yann.lecun.com/exdb/mnist/ Image and Multidimensional Signal Processing 2 Classification using PCA • The unknown input image is projected onto the top principal components • Its representation is compared to all the images in the database • The closest match is taken to be the identified digit • For more robustness, the closest “k” matches can be found – The identify of the majority of these matches is taken to be the identified digit – This is known as the k-nearestneighbor method Colorado School of Mines Image and Multidimensional Signal Processing Comparison of images in “eigenspace” (from Ryan Crawford, EENG 510 final project report, 2011) 3 Examples of poorly written digits • The problem can be difficult in some cases • Can you tell what these are? Answers (from left to right): 0,2,4,5,7 Colorado School of Mines Image and Multidimensional Signal Processing 4 Example: Analysis of silhouette images • Goal was to extract the contour of the knee implant component as accurately as possible • From the contour we could estimate the pose (position and orientation) of the component • We compared various automatic methods; also had people do the extraction manually Colorado School of Mines Knee implant in fluoroscopy image Image and Multidimensional Signal Processing 5 Knee implant components Flouroscopy x-ray machine Colorado School of Mines Perspective projection model from: Mahfouz, M., W. Hoff, R. Komistek, and D. Dennis, "Effect of segmentation errors on 3D-to-2D registration of implant models in X-ray images," J Biomech., vol. 38, no. 2, pp. 229-39, 2005. Image and Multidimensional Signal Processing 6 Example extracted contour Extracted silhouette Mean of extracted silhouettes First 8 principal components 1 2 3 4 5 6 7 8 Colorado School of Mines Image and Multidimensional Signal Processing 7 Principal Components of Silhouette Images 0.3 E 0.2 Correct silhouette Human edited silhouettes Ideal displaced silhouettes 0 -2 +1-1 -2 -2 -2 -2 +2 -2 H C G B Second principal component amplitude -2 -1 I 0.1 -1 -1 0 -1 +1 -1 +2 -1 -2 0 F 0 +2 0 J -1 0 correct +1 0 -0.1 -2 +1 +2 +1 +1 +1 -0.2 -1 +1 0 +1 A D L -2 +2 +2 +2 +1 +2 -1 +2 0 +2 K -0.3 -0.4 0.159 Colorado School of Mines 0.16 0.161 0.162 0.163 0.164 0.165 0.166 First principal component amplitude Image and Multidimensional Signal Processing 0.167 0.168 0.169 8 Example – PCA to compress color images • Recall that a color image is made up of pixels with red, green, blue (RGB) values. • Essentially, we have a collection of 3D vectors in RGB space. Image “peppers.png’ Colorado School of Mines Pixels in RGB space (only 1% are shown) Image and Multidimensional Signal Processing 9 clear all close all RGB = im2double(imread('peppers.png')); % Convert 3-dimensional array array to 2D, where each row is a pixel (RGB) X = reshape(RGB, [], 3); N = size(X,1); % N is the number of pixels % Plot pixels in color space. To limit the number of points, only plot % every 100th point. figure hold on Code to convert RGB for i=1:100:size(X,1) mycolor = X(i,:); color image to a set of mycolor = max(mycolor, [0 0 0]); vectors, and plot them mycolor = min(mycolor, [1 1 1]); plot3(X(i, 1), X(i, 2), X(i, 3), ... '.', 'Color', mycolor); end xlabel('red'), ylabel('green'), zlabel('blue'); xlim([0 1]), ylim([0 1]), zlim([0 1]); hold off axis equal % Rotate the display. for az=-180:3:180 view(az,30); % set azimuth, elevation in degrees drawnow; end Colorado School of Mines Image and Multidimensional Signal Processing 10 Example (continued) • Can we use fewer than 3 values per pixel? • Do PCA on color vectors ... keep the top two PCs. % Get mean and covariance mx = mean(X); Cx = cov(X); %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% % Get eigenvalues and eigenvectors of Cx. % Produces V,D such that Cx*V = V*D. % So the eigenvectors are the columns of V. [V,D] = eig(Cx); e1 = V(:,3); disp('Eigenvector e1:'), disp(e1); e2 = V(:,2); disp('Eigenvector e2:'), disp(e2); e3 = V(:,1); disp('Eigenvector e3:'), disp(e3); d1 = D(3,3); disp('Eigenvalue d1:'), disp(d1); d2 = D(2,2); disp('Eigenvalue d2:'), disp(d2); d3 = D(1,1); disp('Eigenvalue d3:'), disp(d3); Colorado School of Mines Image and Multidimensional Signal Processing Eigenvector e1: -0.8239 -0.5611 -0.0793 Eigenvector e2: -0.3262 0.3551 0.8761 Eigenvector e3: -0.4634 0.7477 -0.4756 Eigenvalue d1: 0.0912 Eigenvalue d2: 0.0296 Eigenvalue d3: 0.0140 11 Example (continued) • Construct the “A” matrix, whose rows are the eigenvectors of Cx % % % A Construct matrix A such that the 1st row of A is the eigenvector corresponding to the largest eigenvalue, the 2nd row is the eigenvector corresponding to the second largest eigenvalue, etc. = [e1'; e2'; e3']; • Project input vectors onto the PCs % % % % % % % % Y Project input vectors x onto eigenvectors. For each (column) vector x, we will use the equation y = A*(x - mx). To explain the Matlab commands below: X is our (N,3) array of vectors; each row is a vector. mx is the mean of the vectors, size (1,3). We first subtract off the mean using X - repmat(mx,N,1). We then transpose that result so that each vector is a column. We then apply our transform A to each column. = A*(X - repmat(mx,N,1))'; % Y has size 3xN Colorado School of Mines Image and Multidimensional Signal Processing 12 Example (continued) • Display the y vectors as images % Display y vectors as images [height,width,depth] = size(RGB); Y1 = reshape(Y(1,:), height, width); Y2 = reshape(Y(2,:), height, width); Y3 = reshape(Y(3,:), height, width); figure; subplot(1,3,1), imshow(Y1,[]); subplot(1,3,2), imshow(Y2,[]); subplot(1,3,3), imshow(Y3,[]); Colorado School of Mines Image and Multidimensional Signal Processing 13 Example (continued) • Form the Ak matrix by taking only the top k rows of A • Reconstruct the x vectors using only the top k PC’s: x’ = AkTy + mx • Plot the x vectors in RGB space. Colorado School of Mines Image and Multidimensional Signal Processing 14 • Using the top k=2 principal components for reconstruction % Reconstruct image using only Y1 and Y2. For each (column) vector y, % we will use the equation x = A'*y + mx. % To explain the Matlab commands below: % Y is our (3,N) array of vectors; where each column is a vector. % A(1:2,:) is the first two rows of A. % Y(1:2,:) is the first two rows of Y. % A(1:2,:)' * Y(1:2,:) produces our transformed vectors (3xN); we then % transpose that to make an array of size Nx3, and add the mean. Xr = ( A(1:2,:)' * Y(1:2,:) )' + repmat(mx,N,1); % Xr has size Nx3 % Plot reconstructed pixels in color space figure hold on for i=1:100:size(Xr,1) mycolor = Xr(i,:); mycolor = max(mycolor, [0 0 0]); mycolor = min(mycolor, [1 1 1]); plot3(Xr(i, 1), Xr(i, 2), Xr(i, 3), ... '.', 'Color', mycolor); end xlabel('red'), ylabel('green'), zlabel('blue'); xlim([0 1]), ylim([0 1]), zlim([0 1]); hold off axis equal Colorado School of Mines Image and Multidimensional Signal Processing 15 Example (continued) • Put the x vectors back into the form of an RGB image, using “reshape”. Original Colorado School of Mines Reconstructed Image and Multidimensional Signal Processing 16 % Display original image and reconstructed image figure, subplot(1,2,1), imshow(RGB), title('Original'); Ir(:,:,1) = reshape(Xr(:,1), height, width); Ir(:,:,2) = reshape(Xr(:,2), height, width); Ir(:,:,3) = reshape(Xr(:,3), height, width); subplot(1,2,2), imshow(Ir), title('Reconstructed'); Colorado School of Mines Image and Multidimensional Signal Processing 17