Equation recognizer and calculator By Daria Tolmacheva

Introduction
• Allow cameras to solve equations
• Start with simple equations
• 1 operator (+, -, /, *)
• 2 single-digit operands (1,2,3,4,5,6,7,8,9,0)
• Assumptions
• Only one equation per image
• 11pt. Calibri Font
Previous Work and Background
• Document Image Understanding
• Skew detection, noise filtering, segmentation
• Decomposition into blocks
• Semantic recognition or logical layout
• Skew Detection
• Curved surfaces
• Hough transform
• Parallel straight lines
Project Description
• Step 1: creating templates
• Image containing all the required characters
• make_templates.m code:
% Binarize the template sheet (dark characters on a light background)
I = imread('templates.png');
threshold = 90;  % gray level, about 0.35 on a 0..1 scale; not used below
I1 = im2bw(I, 0.35);
figure, imshow(I1, []);
% Label the dark regions and measure each one
L = bwlabel(~I1);
blobs = regionprops(L);
for i = 1:size(blobs,1)
    rectangle('Position', blobs(i).BoundingBox, 'EdgeColor', 'r');
    a = sprintf('%d', blobs(i).Area);
    text(blobs(i).Centroid(1), blobs(i).Centroid(2), a, 'Color', 'b');
    % Crop the character using its bounding box [x y width height],
    % clamped so the crop never runs past the image border
    box = blobs(i).BoundingBox;
    minX = round(box(1));
    maxX = min(minX + round(box(3)), size(I1,2));
    minY = round(box(2));
    maxY = min(minY + round(box(4)), size(I1,1));
    subImage = I1(minY:maxY, minX:maxX);
    [row, col] = size(subImage);
    % Pad the shorter side with background (ones) to make the crop square
    if row > col
        diff = row - col;
        add_col = floor(diff/2);
        squareImg = ones(row,row);
        % +1 keeps the first index valid when add_col is 0
        squareImg(:, (add_col+1):(add_col+col)) = subImage;
        resize_blob = imresize(squareImg, [25 25]);
    elseif row < col
        diff = col - row;
        add_row = floor(diff/2);
        squareImg = ones(col,col);
        squareImg((add_row+1):(add_row+row), :) = subImage;
        resize_blob = imresize(squareImg, [25 25]);
    else
        resize_blob = imresize(subImage, [25 25]);
    end
    % Stretch the intensities to the full 0..255 range
    offscale = min(min(resize_blob));
    resize_blob = resize_blob - offscale;
    maxoffscale = max(max(resize_blob));
    resize_blob = resize_blob / maxoffscale;
    resize_blob = uint8(255*resize_blob);
    imshow(resize_blob);
    imsave
end
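The padding and contrast-stretch steps of the loop above can be tried without the Image Processing Toolbox. The following is a minimal sketch on a synthetic 5x3 glyph; the variable names `glyph`, `side`, `rowOff`, `colOff` and `stretched` are made up for illustration, and the final resize to 25x25 is left out because `imresize` needs the toolbox.

```matlab
% Synthetic binary glyph: 0 = ink, 1 = background (hypothetical data).
glyph = [1 0 1;
         1 0 1;
         1 0 1;
         1 0 1;
         1 0 1];

[row, col] = size(glyph);
side = max(row, col);

% Pad with background (ones) so the glyph sits centred in a square.
squareImg = ones(side, side);
rowOff = floor((side - row) / 2);
colOff = floor((side - col) / 2);
squareImg(rowOff+1:rowOff+row, colOff+1:colOff+col) = glyph;

% Stretch intensities to the full 0..255 range, as in the loop above.
lo = min(squareImg(:));
hi = max(squareImg(:));
stretched = uint8(255 * (squareImg - lo) / (hi - lo));

disp(size(stretched));
```

Computing both offsets from `side` handles the tall, wide and already-square cases in one code path, which is one possible way to avoid the three-branch `if`.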
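• Step 2: Get each character from the equation image
• Use regionprops on the binary image:
% I1 is assumed to hold the RGB equation image loaded earlier with imread
I1_grey = rgb2gray(I1);
%figure, imshow(I1_grey, []);
threshold = 90;  % gray level, about 0.35 on a 0..1 scale; not used below
I1 = im2bw(I1_grey, 0.35);
% Close, then open, with a small disk to clean up the binary image
s = strel('disk', 1);
I1 = imclose(I1, s);
I1 = imopen(I1, s);
% Label the dark regions and measure each one
L = bwlabel(~I1);
blobs = regionprops(L);
% Collect the centroid coordinates of every blob
blobs_center_x = zeros(1, size(blobs,1));
blobs_center_y = zeros(1, size(blobs,1));
for i = 1:size(blobs,1)
    blobs_center_x(i) = blobs(i).Centroid(1);
    blobs_center_y(i) = blobs(i).Centroid(2);
end
% Sort by x so that ix lists the blobs from left to right
[sb, ix] = sort(blobs_center_x);
for i = 1:size(blobs,1)
    rectangle('Position', blobs(i).BoundingBox, 'EdgeColor', 'r');
end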
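Sorting the centroids by x matters because the label order returned for the blobs need not match the reading order of the equation. A minimal sketch of that ordering step follows, using a hand-made struct array in place of real `regionprops` output; the centroid values and the name `ordered` are hypothetical.

```matlab
% Hypothetical regionprops-style output: three blobs in arbitrary label order.
blobs = struct('Centroid', {[52 10], [12 11], [31 9]});

% Stack all centroids into an N-by-2 matrix, one row per blob.
centroids = vertcat(blobs.Centroid);

% Ascending x coordinate corresponds to left-to-right reading order.
[~, ix] = sort(centroids(:, 1));

% ordered(k) is now the k-th character from the left.
ordered = blobs(ix);
```

Here `ix` plays the same role as in the slide code: `blobs(ix(k))` is the k-th symbol of the equation.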
• Step 3: Prepare eigenfaces from the templates and the mean image (EGGN510 – L18 – PCA)
% calculate the mean of the templates (one template per column)
m = uint8(mean(templates,2));
% subtract the mean from all templates
% (with uint8 data the subtraction saturates at 0, so negative differences are clipped)
templates_mean = templates - uint8(single(m)*single(uint8(ones(1,size(templates,2)))));
% calculate eigenfaces from the small N-by-N matrix (N = number of templates)
L = single(templates_mean)'*single(templates_mean);
[V,D] = eig(L);
PC = single(templates_mean)*V;
% calculate image signatures (14 columns in the original, one per template)
signatures = zeros(size(templates,2), size(PC,2));
for i = 1:size(templates,2)
    signatures(i,:) = single(templates_mean(:,i))'*PC;  % each row is an image signature
end
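The eigenface step relies on the fact that eigenvectors of the small matrix A'*A map to eigenvectors of the large matrix A*A' when multiplied by A. The sketch below checks this on made-up data and, unlike the slide code, keeps everything in floating point so that the signed differences from the mean are preserved; that change is an assumption of this example, not the author's method. The matrix `templates` is synthetic.

```matlab
% Hypothetical data: 4 "templates" of 20 pixels each, one per column.
templates = mod((1:20)' * [1 2 3 5], 11);

% Mean image and centred templates (signed values are kept).
m = mean(templates, 2);
A = templates - m * ones(1, size(templates, 2));

% Eigen-decomposition of the small N-by-N matrix A'*A instead of the
% large pixel-by-pixel matrix A*A'.
[V, D] = eig(A' * A);
PC = A * V;                 % columns are (unnormalised) eigenfaces

% Row i is the signature of template i: its projection on every eigenface.
signatures = A' * PC;

% Each column of PC satisfies (A*A') * pc = lambda * pc.
residual = norm((A * A') * PC - PC * D, 'fro');
```

For 25x25 templates this means a 14x14 eigenproblem rather than a 625x625 one.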
• Step 4: Process blobs and match them against the eigenfaces (EGGN510 – L18 – PCA):
• Create a sub-image for each blob to compare against the templates:
% img_size is the number of pixels per template (25*25 = 625 for 25x25 crops)
for i = 1:size(blobs,1)
    rectangle('Position', blobs(i).BoundingBox, 'EdgeColor', 'r');
    %a = sprintf('%d', blobs(i).Area);
    %text(blobs(i).Centroid(1),blobs(i).Centroid(2), a,'Color', 'b');
    % Crop the i-th blob from the left, clamped to the image border
    box = blobs(ix(i)).BoundingBox;
    minX = round(box(1));
    maxX = min(minX + round(box(3)), size(I1,2));
    minY = round(box(2));
    maxY = min(minY + round(box(4)), size(I1,1));
    subImage = I1(minY:maxY, minX:maxX);
    [row, col] = size(subImage);
    % Pad the shorter side with background (ones) to make the crop square
    if row > col
        diff = row - col;
        add_col = floor(diff/2);
        squareImg = ones(row,row);
        % +1 keeps the first index valid when add_col is 0
        squareImg(:, (add_col+1):(add_col+col)) = subImage;
        resize_blob = imresize(squareImg, [25 25]);
    elseif row < col
        diff = col - row;
        add_row = floor(diff/2);
        squareImg = ones(col,col);
        squareImg((add_row+1):(add_row+row), :) = subImage;
        resize_blob = imresize(squareImg, [25 25]);
    else
        resize_blob = imresize(subImage, [25 25]);
    end
    % Stretch the intensities to the full 0..255 range
    offscale = min(min(resize_blob));
    resize_blob = resize_blob - offscale;
    maxoffscale = max(max(resize_blob));
    resize_blob = resize_blob / maxoffscale;
    resize_blob = uint8(255*resize_blob);
    % Subtract the mean image (uint8, so negative differences are clipped at 0)
    reshape_blob = reshape(resize_blob, img_size, 1) - m;
    % Project onto the eigenfaces to get the blob's signature
    reshape_weighted = single(reshape_blob)'*PC;
    scores = zeros(1, size(signatures,1));
    for j = 1:size(templates,2)
        % calculate Euclidean distance as score
        scores(j) = norm(signatures(j,:) - reshape_weighted, 2);
    end
    % The template with the smallest distance is the match
    [C, idx] = sort(scores, 'ascend');
    matches(i) = idx(1);
end
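The matching at the end of the loop is a nearest-neighbour search in signature space. A minimal sketch with made-up numbers is given below; `signatures` and `probe` are hypothetical values chosen so that the third template is the closest, and `min` is used here in place of the full `sort` since only the best match is needed.

```matlab
% Hypothetical signatures: one row per template, one column per eigenface.
signatures = [ 0  0  0;
              10  0  0;
               0 10  0;
               0  0 10];

% Signature of an unknown blob (made up; closest to template 3).
probe = [1 9 -1];

% Euclidean distance from the probe to every template signature.
diffs  = signatures - repmat(probe, size(signatures, 1), 1);
scores = sqrt(sum(diffs .^ 2, 2));

% The best match is the template with the smallest distance.
[bestScore, bestIdx] = min(scores);
```

Keeping `bestScore` around would also allow rejecting blobs whose best distance is still large, which might help with noise blobs, although the slides do not say whether this was tried.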
• Step 5: Match templates with scores and evaluate the equation:
if matches(1) == 1
    val1 = 1;
elseif matches(1) == 2
    val1 = 2;
elseif matches(1) == 3
    val1 = 3;
elseif matches(1) == 4
    val1 = 4;
elseif matches(1) == 5
    val1 = 5;
elseif matches(1) == 6
    val1 = 6;
elseif matches(1) == 7
    val1 = 7;
elseif matches(1) == 8
    val1 = 8;
elseif matches(1) == 9
    val1 = 9;
elseif matches(1) == 10
    val1 = 0;
end
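The slide shows only the decoding of the first operand. One way the whole equation could be decoded and evaluated is sketched below with a lookup string instead of an if-chain. The digit positions 1 to 10 follow the slide (index 10 is the digit 0); the operator positions 11 to 14 and the three-element `matches` vector are assumptions made for this example.

```matlab
% Template index -> symbol. Positions 1..10 follow the slide (index 10 is
% the digit 0); positions 11..14 are an ASSUMED operator order.
symbols = '1234567890+-/*';

% Hypothetical recognition result for an image showing "7*6".
matches = [7 14 6];

val1 = symbols(matches(1)) - '0';   % character digit -> number
op   = symbols(matches(2));
val2 = symbols(matches(3)) - '0';

switch op
    case '+'
        result = val1 + val2;
    case '-'
        result = val1 - val2;
    case '/'
        result = val1 / val2;       % dividing by 0 gives Inf or NaN
    case '*'
        result = val1 * val2;
    otherwise
        error('Second symbol is not an operator: %s', op);
end

fprintf('%d %s %d = %g\n', val1, op, val2, result);
```

The `otherwise` branch catches the case where the middle blob was matched to a digit template, which is a likely failure mode when matching is unreliable.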
• Another Important Point:
• Images of the equation taken from an arbitrary viewpoint:
• Fit a line through the blob centroids and derive a transform from it:
figure, imshow(I1, []), hold on
plot(blobs_center_x, blobs_center_y);
% Least-squares line through the centroids
coefficients = polyfit(blobs_center_x, blobs_center_y, 1);
newy = polyval(coefficients, blobs_center_x);
% Centroids as stars, fitted line as a dotted line
plot(blobs_center_x, blobs_center_y, '*', blobs_center_x, newy, ':')
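The slope of the fitted line gives an estimate of the in-plane skew angle, which is one possible way to derive the transform mentioned above. The sketch below uses synthetic centroids on a line tilted by 5 degrees; the data and the names `skewDeg`, `R` and `levelled` are hypothetical, and the approach only handles rotation in the image plane, not perspective distortion.

```matlab
% Hypothetical centroids lying on a line tilted by 5 degrees.
x = [10 30 50 70 90];
y = 40 + tand(5) * x;

% Least-squares line through the centroids, as on the slide.
coefficients = polyfit(x, y, 1);

% The slope gives the in-plane skew angle of the text line.
skewDeg = atand(coefficients(1));

% Rotating the centroids by -skewDeg about their mean levels the line.
c = [mean(x); mean(y)];
R = [cosd(-skewDeg) -sind(-skewDeg);
     sind(-skewDeg)  cosd(-skewDeg)];
shift = c * ones(1, numel(x));
levelled = R * ([x; y] - shift) + shift;
```

After the rotation all centroids share the same y coordinate. Applying the same angle to the image itself would need a rotation routine such as `imrotate`, whose sign convention should be checked against the image axes.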
Results
• Templates didn’t seem to match
• Reasons: bad templates, need bigger dimensions, better alignment, higher precision
Future Work
• Make bigger templates with better precision
• Recognize more complicated equations with multiple operators and multiple digit numbers
• Recognize more than one equation from image
• Find a better way to deal with noise
• Filtering
• Picking out only important blobs