Equation Recognizer and Calculator
By Daria Tolmacheva

Introduction
• Allow cameras to solve equations
• Start with simple equations
  • 1 operator (+, -, /, *)
  • 2 operands, each a single digit (1, 2, 3, 4, 5, 6, 7, 8, 9, 0)
• Assumptions
  • Only one equation per image
  • 11 pt Calibri font

Previous Work and Background
• Document Image Understanding
  • Skew detection, noise filtering, segmentation
  • Decomposition into blocks
  • Semantic recognition or logical layout
• Skew Detection
  • Curved surfaces
  • Hough transform
  • Parallel straight lines

Project Description
• Step 1: Creating templates
  • Image containing all the required characters
  • make_templates.m code:

    I = imread('templates.png');
    I1 = im2bw(I, 0.35);              % binarize: dark characters become 0
    figure, imshow(I1, []);
    L = bwlabel(~I1);                 % invert so that characters are the labeled blobs
    blobs = regionprops(L);
    for i = 1:size(blobs,1)
        rectangle('Position', blobs(i).BoundingBox, 'EdgeColor', 'r');
        a = sprintf('%d', blobs(i).Area);
        text(blobs(i).Centroid(1), blobs(i).Centroid(2), a, 'Color', 'b');

        % crop the blob; BoundingBox is [x y width height]
        box = blobs(i).BoundingBox;
        minX = round(box(1));
        maxX = minX + round(box(3)) - 1;   % -1 keeps the crop inside the box
        minY = round(box(2));
        maxY = minY + round(box(4)) - 1;
        subImage = I1(minY:maxY, minX:maxX);

        % pad the shorter side with white to make the blob square, then resize
        [row, col] = size(subImage);
        if row > col
            pad = floor((row - col)/2);
            squareImg = ones(row, row);
            squareImg(:, pad+1:pad+col) = subImage;   % pad+1 avoids index 0
            resize_blob = imresize(squareImg, [25 25]);
        elseif row < col
            pad = floor((col - row)/2);
            squareImg = ones(col, col);
            squareImg(pad+1:pad+row, :) = subImage;
            resize_blob = imresize(squareImg, [25 25]);
        else
            resize_blob = imresize(double(subImage), [25 25]);
        end

        % rescale intensities to the full 0-255 range
        offscale = min(min(resize_blob));
        resize_blob = resize_blob - offscale;
        maxoffscale = max(max(resize_blob));
        resize_blob = resize_blob / maxoffscale;
        resize_blob = uint8(255*resize_blob);

        imshow(resize_blob);
        imsave                             % opens a dialog to save the template
    end

• Step 2: Get each character from the equation image
  • Use regionprops on the binary image:

    % I1 holds the RGB equation image on entry
    I1_grey = rgb2gray(I1);
    %figure, imshow(I1_grey, []);
    I1 = im2bw(I1_grey, 0.35);
    s = strel('disk', 1);
    I1 = imclose(I1, s);              % close, then open, to suppress small noise
    I1 = imopen(I1, s);
    L = bwlabel(~I1);
    blobs = regionprops(L);
    blobs_center_x = zeros(1, size(blobs,1));
    blobs_center_y = zeros(1, size(blobs,1));
    for i = 1:size(blobs,1)
        blobs_center_x(i) = blobs(i).Centroid(1);
        blobs_center_y(i) = blobs(i).Centroid(2);
    end
    [sb, ix] = sort(blobs_center_x);  % ix orders the blobs from left to right
    for i = 1:size(blobs,1)
        rectangle('Position', blobs(i).BoundingBox, 'EdgeColor', 'r');
    end

• Step 3: Prepare eigenfaces from the templates and the mean image (EGGN510 – L18 – PCA)

    % calculate the mean of the templates (one template per column)
    m = uint8(mean(templates, 2));
    % subtract the mean from all templates; cast to single first so that
    % negative differences are not clipped to 0 by uint8 arithmetic
    templates_mean = single(templates) - single(m)*ones(1, size(templates,2), 'single');
    % calculate eigenfaces
    L = templates_mean'*templates_mean;
    [V, D] = eig(L);
    PC = templates_mean*V;
    % calculate image signatures (14 components for the 14 templates)
    signatures = zeros(size(templates,2), size(PC,2));
    for i = 1:size(templates,2)
        signatures(i,:) = templates_mean(:,i)'*PC;   % each row is an image signature
    end

• Step 4: Process the blobs and match them against the eigenfaces (EGGN510 – L18 – PCA)
  • Create a subimage for each blob to compare it against the templates:

    matches = zeros(1, size(blobs,1));
    for i = 1:size(blobs,1)
        rectangle('Position', blobs(i).BoundingBox, 'EdgeColor', 'r');
        %a = sprintf('%d', blobs(i).Area);
        %text(blobs(i).Centroid(1),blobs(i).Centroid(2), a,'Color', 'b');

        % crop the i-th blob from the left; BoundingBox is [x y width height]
        box = blobs(ix(i)).BoundingBox;
        minX = round(box(1));
        maxX = minX + round(box(3)) - 1;   % -1 keeps the crop inside the box
        minY = round(box(2));
        maxY = minY + round(box(4)) - 1;
        subImage = I1(minY:maxY, minX:maxX);

        % pad the shorter side with white to make the blob square, then resize
        [row, col] = size(subImage);
        if row > col
            pad = floor((row - col)/2);
            squareImg = ones(row, row);
            squareImg(:, pad+1:pad+col) = subImage;   % pad+1 avoids index 0
            resize_blob = imresize(squareImg, [25 25]);
        elseif row < col
            pad = floor((col - row)/2);
            squareImg = ones(col, col);
            squareImg(pad+1:pad+row, :) = subImage;
            resize_blob = imresize(squareImg, [25 25]);
        else
            resize_blob = imresize(double(subImage), [25 25]);
        end

        % rescale intensities to the full 0-255 range
        offscale = min(min(resize_blob));
        resize_blob = resize_blob - offscale;
        maxoffscale = max(max(resize_blob));
        resize_blob = resize_blob / maxoffscale;
        resize_blob = uint8(255*resize_blob);

        % subtract the mean template in single precision so that negative
        % differences are not clipped to 0 by uint8 arithmetic
        reshape_blob = single(reshape(resize_blob, img_size, 1)) - single(m);
        reshape_weighted = reshape_blob'*PC;          % signature of this blob

        scores = zeros(1, size(signatures,1));
        for j = 1:size(templates,2)
            % calculate Euclidean distance as score
            scores(j) = norm(signatures(j,:) - reshape_weighted, 2);
        end
        [C, idx] = sort(scores, 'ascend');
        matches(i) = idx(1);                          % closest template wins
    end

• Step 5: Match templates with scores and evaluate the equation:

    if matches(1) == 1
        val1 = 1;
    elseif matches(1) == 2
        val1 = 2;
    elseif matches(1) == 3
        val1 = 3;
    elseif matches(1) == 4
        val1 = 4;
    elseif matches(1) == 5
        val1 = 5;
    elseif matches(1) == 6
        val1 = 6;
    elseif matches(1) == 7
        val1 = 7;
    elseif matches(1) == 8
        val1 = 8;
    elseif matches(1) == 9
        val1 = 9;
    elseif matches(1) == 10
        val1 = 0;
    end

• Another important point: images of the equation taken from an arbitrary viewpoint
  • Fit a line through the blob centroids and derive a transform from it:

    figure, imshow(I1, []), hold on
    plot(blobs_center_x, blobs_center_y);
    coefficients = polyfit(blobs_center_x, blobs_center_y, 1);   % least-squares line
    newy = polyval(coefficients, blobs_center_x);
    plot(blobs_center_x, blobs_center_y, '*', blobs_center_x, newy, ':')

Results
• Template matching did not produce correct matches
• Likely reasons: poor templates; the templates need larger dimensions, better alignment, and higher precision

Future Work
• Make bigger templates with better precision
• Recognize more complicated equations with multiple operators and multi-digit numbers
• Recognize more than one equation per image
• Find a better way to deal with noise
  • Filtering
  • Picking out only the important blobs

Works Cited
[1] M. Ceci, M. Berardi, D. Malerba, “Relational Data Mining and ILP for Document Image Understanding,” Applied Artificial Intelligence, Taylor & Francis Group, LLC, 21:317-342.
[2] I. Guyon, R. M. Haralick, J. J. Hull, “Data Sets for OCR and Document Image Understanding Research,” Handbook of Character Recognition and Document Image Analysis, pp. 779-799, World Scientific Publishing Company, 1997.
[3] S. Lu, B. Chen, C. C. Ko, “A Partition Approach for the Restoration of Camera Images of Planar and Curled Document,” Image and Vision Computing, Vol. 24, Issue 8, pp. 837-848, Electrical and Computer Engineering Department, National University of Singapore, Aug. 2006.
[4] A. Nakhmani, A. Tannenbaum, “A New Distance Measure Based on Generalized Image Normalized Cross-Correlation for Robust Video Tracking and Image Recognition,” Pattern Recognition Letters, Vol. 34, Issue 3, pp. 315-321, February 2013.
[5] R. Cattoni, T. Coianiz, S. Messelodi, C. M. Modena, “Geometric Layout Analysis Techniques for Document Image Understanding: a Review,” Povo, Trento, Italy, January 1998.
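
Appendix sketch for Step 5: the chain of comparisons on matches(1) could also be written as a lookup into a label string, which extends to the second operand and the operator. This is a minimal sketch, not the project's code. It assumes the equation yields exactly three blobs (operand, operator, operand) and that the 14 templates are ordered as digits 1-9, then 0, then the four operators. The digit order follows the Step 5 code; the operator order at indices 11-14 and the example values in matches are hypothetical.

```matlab
% Assumed template order: indices 1-9 are digits 1-9 and index 10 is 0
% (as in Step 5); the operator order at indices 11-14 is hypothetical.
labels = '1234567890+-/*';

% Illustrative match indices for the three blobs, left to right.
% These are made-up values, not results from the project.
matches = [7 14 6];

chars = labels(matches);          % look up the character for each match
val1  = double(chars(1)) - '0';   % left operand as a number
op    = chars(2);                 % operator character
val2  = double(chars(3)) - '0';   % right operand as a number

switch op
    case '+'
        result = val1 + val2;
    case '-'
        result = val1 - val2;
    case '/'
        result = val1 / val2;
    case '*'
        result = val1 * val2;
    otherwise
        error('Middle character "%s" is not an operator.', op);
end

fprintf('%s = %g\n', chars, result);
```

With the example indices above, chars is '7*6' and the script prints 7*6 = 42. The otherwise branch makes a misrecognized middle blob fail loudly rather than produce a silent wrong answer.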