CSCI 510/EENG 510 Image and Multidimensional Signal Processing                                    Fall 2015
Homework Assignment 1 - SOLUTIONS
Due Monday, September 7, 2015

Notes: Please email me your solutions for these problems (in order) as a single Word or PDF document. If you do a problem on paper by hand, please scan it in and paste it into the document (although I would prefer it typed!).

I would like to get to know you and your interests so that I can provide the best possible educational experience for you in this course. Please describe yourself – your general background, education, interests, and goals. What specifically would you be interested in learning about image processing? Are there any application areas that you are particularly interested in? Are you currently doing thesis research that might benefit from image processing?

1. (20 pts) It is often useful to generate a synthetic image with known properties that can be used to test algorithms. Generate an image composed of two concentric circles as shown below. The inner circle should have a radius of 50 pixels and a mean value of 192. The outer circle should have a radius of 100 pixels and a mean value of 128. The background should have a mean value of 64. Add uniform random noise to each pixel in the range -16 .. +16 (see Matlab's rand function). Save the image in "tif" format, and make sure the saved image looks correct. Turn in the Matlab program and the image that you generated. (Hint: recall the equation of a circle: x^2 + y^2 = r^2. These are the points on the circle border; to represent points inside the circle, you would use an inequality.)

Solution:

% Generate a synthetic image of concentric circles, and add noise.
clear all
close all

N = 400;
I = 64*ones(N,N);    % An easy way to generate a constant value image
R1 = 100;
R2 = 50;

for r=1:N
    for c=1:N
        if (r-N/2)^2 + (c-N/2)^2 < R2^2
            I(r,c) = 192;
        elseif (r-N/2)^2 + (c-N/2)^2 < R1^2
            I(r,c) = 128;
        end
    end
end

I = I + 32*(rand(N)-0.5);
imshow(I,[0 255]), impixelinfo

% When writing to a "tif" file, make sure you convert to uint8
imwrite(uint8(I), 'test.tif');
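For comparison, the same image can be built without explicit loops by using coordinate grids and logical masks. The sketch below is a minimal vectorized alternative under the same assumptions as the solution above (400x400 image, gray levels 64/128/192, noise in -16 .. +16); the output filename and variable names are just for illustration.

% Vectorized alternative: build the circles with meshgrid and logical masks.
clear all
close all

N = 400;
[C, R] = meshgrid(1:N, 1:N);         % column and row coordinates of every pixel
D2 = (R - N/2).^2 + (C - N/2).^2;    % squared distance of each pixel from the center

I = 64*ones(N,N);                    % background
I(D2 < 100^2) = 128;                 % outer circle (radius 100)
I(D2 < 50^2)  = 192;                 % inner circle (radius 50) overwrites the outer one

I = I + 32*(rand(N) - 0.5);          % uniform noise in the range -16 .. +16
imshow(I, [0 255]);
imwrite(uint8(I), 'test_vectorized.tif');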
2. (20 pts) The "coins.png" image (available in the Matlab example images directory) has light-colored coins against a dark background. As done in class, you can segment the coins from the background using a simple global thresholding technique, such as B = I > t; where t is a value between 0 and 255. (There may be a couple of pixels that are not segmented correctly; don't worry about that.) Write a Matlab program to calculate the maximum, mean and standard deviation of the pixels in the regions corresponding to the coins (you don't need to do this for each coin; just the union of the coin regions). Turn in the Matlab program, the calculated values, and the image of the segmented coins.

Solution:

% HW1, problem 2
% Segment coins (by thresholding) in the "coins.png" image;
% estimate maximum, mean, and standard deviation of the coin pixels.

% Good idea to do this at the beginning of all of your programs
clear all    % Clear out all old variables in the workspace
close all    % Close all open figures and images

I = imread('coins.png');    % Read image
imshow(I, []);
B = I > 80;                 % Threshold image, so that coin pixels = 1
figure, imshow(B);

% Go through image, and accumulate statistics.
% For the calculation of standard deviation, we will use the equation
%   Var(X) = mean(X^2) - mean(X)^2
maxval = 0;
sum = 0;     % Sum of coin values
sum2 = 0;    % Sum of values squared
nPts = 0;    % Number of pixels in coin regions

for r=1:size(I,1)
    for c=1:size(I,2)
        if B(r,c)    % if this is a coin pixel
            nPts = nPts + 1;
            if I(r,c) > maxval
                maxval = I(r,c);
            end
            % Accumulate sums. Convert to double to get better precision.
            sum = sum + double(I(r,c));
            sum2 = sum2 + double(I(r,c))^2;
        end
    end
end

avg = sum/nPts;
var = sum2/nPts - avg^2;
std = sqrt(var);

fprintf('Maximum value in coin regions = %d\n', maxval);
fprintf('Average value in coin regions = %f\n', avg);
fprintf('Standard deviation of pixels in coin regions = %f\n', std);

This prints out:

Maximum value in coin regions = 255
Average value in coin regions = 178.614854
Standard deviation of pixels in coin regions = 31.017945

If you use Matlab a lot, you will find that you can write programs that are much shorter and more efficient. Since Matlab is an interpreted language, "for" loops are generally pretty slow for doing matrix and array operations, so it is much better to write the operations using matrix operators. Below is an alternative solution that gives the same results:

% Alternative solution
I = imread('coins.png');    % Read image
B = I > 80;                 % Threshold image, so that coin pixels = 1

nPts = sum(sum(B));         % Sum entire image to count coin pixels

% Make a mask image where coin pixels = 255
M = 255*uint8(B);

% Do a logical AND of each pixel in the mask image, with I.
% You could instead have used the command: I = I .* uint8(B).
I = bitand(I,M);            % Force background pixels to 0

avg = sum(sum(I))/nPts;
var = sum(sum(double(I).^2))/nPts - avg^2;
std = sqrt(var);
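Another quick way to check these numbers is to use logical indexing together with Matlab's built-in statistics functions. The following is a small sketch using the same threshold of 80; the second argument of 1 to std makes it normalize by N, matching the population formula used above, and the variable name coinPixels is just for illustration.

% Check the hand-computed statistics with built-in functions.
I = imread('coins.png');
B = I > 80;                      % coin mask

coinPixels = double(I(B));       % extract only the coin pixels as a vector
fprintf('Maximum = %d\n', max(coinPixels));
fprintf('Mean    = %f\n', mean(coinPixels));
fprintf('Std     = %f\n', std(coinPixels, 1));   % normalize by N (population std)

If you run this after the loop version in the same workspace, do a clear first, since that script reuses sum, var, and std as variable names, which shadows the built-in functions.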
3. (20 pts) As described in Section 2.1.1 of the textbook, we can treat the human fovea as a square sensor array of size 1.5 mm x 1.5 mm, containing about 337,000 cones. Also, the space between the cones is equal to the width of the cones.

a) What is the field of view (in degrees) of the human fovea?

b) Estimate the distance from Brown Hall to the top of South Table Mountain (you can find this using a map, or a web tool such as Microsoft Bing Maps or Google Earth). What is the minimum size object that you can see with the naked eye on top of the mountain? Can you see a person on top of the mountain? Assume for simplicity that the size of the image of the object must cover at least two receptors (cones).

Solution:

(a) If the fovea covers an area of 1.5 mm x 1.5 mm and contains 337,000 cones, then we can think of the fovea as a square sensor array, which translates into an NxN array of 580x580 cones. Assuming equal spacing between cones, this gives 580 cones and 580 spaces on a line 1.5 mm long. Each cone is s = 1.5 mm / 1160 = 0.0013 mm in diameter. Assuming the focal length of the eye is 17 mm, the field of view of the fovea of the human eye is 2*arctan(0.75/17) = 5 degrees. For comparison, the diameter of the full moon as seen from earth is about 0.5 degree.

(b) I estimated the distance to be about d = 1100 meters. Assume that the size of the imaged object on the retina must cover at least two sensor elements. Allowing for the spaces between the cones, that would be s = 4 * 0.0013 mm, or 0.0052 mm. Then we can use similar triangles to find the minimum size of the object. If we assume that the eye has a focal length of f = 17 mm, then h/d = s/f, or h = d*s/f = (1100 m)*(5.2x10^-6 m)/(17x10^-3 m) = 0.34 m. So you should just barely be able to see a person on top of the mountain from the campus.

4. (20 pts) A pool-playing robot uses an overhead camera to determine the positions of the balls on the pool table. Assume that:

a) We are using a standard billiard table of size 44" x 88".
b) We need at least 100 square pixels per ball to reliably determine the identity of each ball.
c) The center of the ball can be located to a precision of +- one pixel in the image.
d) We need to locate the ball on the table to an accuracy of +- one cm.
e) We are going to mount the camera on the ceiling, looking straight down. The distance from the camera to the table is 2 m.

Determine a configuration of the camera resolution and lens FOV that will meet these requirements. Assume that you can choose from the following parts:

Lenses with field of view 30, 60, 90 degrees
Cameras with resolutions of 256x256, 512x512, or 1024x1024 pixels

Choose the lowest resolution that will meet the requirements.

Solution:

We can calculate the required field of view of the camera. To completely image the longer side of the table (L = 2.235 m) at a distance of D = 2 m, the angle must be

    θ = 2*arctan(L/(2*D)) = 1.019 radians = 58 degrees

So we need to use the lens with a 60 degree field of view.

To determine the required resolution NxN of the camera, we first check the constraint that we need an accuracy of +- 1 cm. At a distance of D = 2 m, with a field of view of θ = 60 degrees, the horizontal length that the camera actually images is

    L = 2*D*tan(θ/2) = 231 cm

So we have N pixels over 231 cm, or N/231 pixels/cm. If we need an accuracy of 1 cm, then the minimum number of pixels N must be 231.

However, we also need to check the constraint that we need a minimum area for the image of the ball. A billiard ball has diameter 5.715 cm. In units of pixels in the image, the diameter is

    5.715 cm * (N pixels / 231 cm) = 0.0247*N pixels

The area of the circular region of the ball in the image is

    A = pi*r^2 = pi*(0.0247*N/2)^2 = 4.79x10^-4 * N^2 square pixels

If we need 100 square pixels per ball, then N^2 = 100 / (4.79x10^-4) = 2.0877x10^5, or N = 456.9 pixels. So we need a resolution of 512x512 pixels.
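These numbers are easy to double-check numerically. The sketch below simply re-evaluates the problem 4 formulas in Matlab; the variable names are only for illustration, and small differences from the values above are expected because of the rounded intermediate quantities used in the hand calculation.

% Numeric check of the problem 4 calculations.
L = 88 * 0.0254;                    % longer table side in meters (88 inches)
D = 2;                              % camera-to-table distance in meters

thetaReq = 2*atan(L/(2*D));         % required field of view, in radians
fprintf('Required FOV = %.3f rad = %.1f degrees\n', thetaReq, thetaReq*180/pi);

theta = 60 * pi/180;                % chosen 60-degree lens, in radians
Lcm   = 100 * 2*D*tan(theta/2);     % imaged length in cm (about 231 cm)

Nacc  = ceil(Lcm);                  % pixels needed for +-1 cm accuracy
dBall = 5.715;                      % billiard ball diameter in cm
Narea = sqrt(100 / (pi*(dBall/Lcm/2)^2));   % pixels needed for 100 square pixels per ball
fprintf('N for accuracy = %d, N for ball area = %.1f\n', Nacc, Narea);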
5. (20 pts) Develop a program to resize the example image "cameraman.tif" from its original size of 256x256 to an enlarged size of 400x400, using bilinear interpolation. For this problem, don't use the Matlab functions "imresize", "interp2", or the equivalent OpenCV function. Turn in your program and the resulting image.

Solution:

clear all
close all

% Source image
I1 = imread('cameraman.tif');
[M1,N1] = size(I1);

% Destination image
M2 = 400;
N2 = 400;
I2 = zeros(M2,N2);

% Scaling factors
sx = N2/N1;
sy = M2/M1;

% The transformation from I2 to I1 is
%   x1 = x2/sx, y1 = y2/sy
% Scan through I2 and interpolate value from I1.
for y2 = 1:M2
    for x2 = 1:N2
        % Compute exact location of corresponding point in I1
        x1 = x2/sx;
        y1 = y2/sy;

        % Get closest integer coordinates less than x1,y1 (i.e., round down)
        x0 = floor(x1);
        y0 = floor(y1);

        % Don't go outside bounds of the image
        if x0 < 1,    x0 = 1;    end
        if y0 < 1,    y0 = 1;    end
        if x0 > N1-1, x0 = N1-1; end
        if y0 > M1-1, y0 = M1-1; end

        % Get the offsets of (x1,y1) from (x0,y0)
        x = x1 - x0;
        y = y1 - y0;

        % Interpolate the value at (x1,y1), using the formula from the
        % lecture notes. Remember when we access a pixel, we have to use
        % (y,x) instead of (x,y).
        I2(y2,x2) = I1(y0,x0)    *(1-x)*(1-y) + ...
                    I1(y0,x0+1)  *x*(1-y) + ...
                    I1(y0+1,x0)  *(1-x)*y + ...
                    I1(y0+1,x0+1)*x*y;
    end
end

imshow(I2,[]);

The original (left) and resized (right) images:
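Although the problem forbids using imresize in the solution itself, it is still a convenient way to sanity-check the result. The following is a small sketch that assumes the script above has already been run and left I2 in the workspace; the two images will not match exactly, since imresize uses a slightly different coordinate mapping, but they should look essentially the same.

% Sanity check: compare the hand-coded result against imresize.
Iref = imresize(imread('cameraman.tif'), [400 400], 'bilinear');

figure;
subplot(1,2,1), imshow(I2, []),   title('Hand-coded bilinear');
subplot(1,2,2), imshow(Iref, []), title('imresize bilinear');

% Report how different the two results are (expect small differences,
% mostly from the differing coordinate conventions).
d = abs(double(Iref) - double(I2));
fprintf('Mean absolute difference = %.2f gray levels\n', mean(d(:)));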