CSCI 510/EENG 510
Image and Multidimensional Signal Processing
Fall 2015
Homework Assignment 1 - SOLUTIONS
Due Monday, September 7, 2015
Notes: Please email me your solutions for these problems (in order) as a single Word or PDF document. If you do
a problem on paper by hand, please scan it in and paste it into the document (although I would prefer it typed!).
I would like to get to know you and your interests so that I can provide the best possible
educational experience for you in this course. Please describe yourself – your general
background, education, interests, and goals. What specifically would you be interested in
learning about image processing? Are there any application areas that you are particularly
interested in? Are you currently doing thesis research that might benefit from image
processing?
1. (20 pts) It is often useful to generate a synthetic image with known properties that can
be used to test algorithms. Generate an image composed of two concentric circles as
shown below. The inner circle should have a radius of 50 pixels and a mean value of
192. The outer circle should have a radius of 100 pixels and a mean value of 128. The
background should have a mean value of 64. Add uniform random noise to each pixel
in the range -16 .. +16 (see Matlab’s rand function). Save the image in “tif” format, and
make sure the saved image looks correct. Turn in the Matlab program and the image
that you generated. (Hint: recall the equation of a circle: x^2 + y^2 = r^2. These are
the points on the circle border; to represent points inside the circle, you would use
an inequality.)
Solution:
% Generate a synthetic image of concentric circles, and add noise.
clear all
close all
N = 400;
I = 64*ones(N,N);    % An easy way to generate a constant value image
R1 = 100;            % Radius of the outer circle
R2 = 50;             % Radius of the inner circle
for r=1:N
    for c=1:N
        if (r-N/2)^2 + (c-N/2)^2 < R2^2
            I(r,c) = 192;
        elseif (r-N/2)^2 + (c-N/2)^2 < R1^2
            I(r,c) = 128;
        end
    end
end
I = I + 32*(rand(N)-0.5);
imshow(I,[0 255]), impixelinfo
% When writing to a "tif" file, make sure you convert to uint8
imwrite(uint8(I), 'test.tif');
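The same image can also be built without explicit loops. The following is only a sketch of one possible vectorized variant, assuming the same image size, radii, and center (N/2, N/2) as the loop version. The names C, R, and D2 and the output file name are illustrative and not part of the original solution.

```matlab
% Vectorized sketch of the concentric-circle image (same geometry as above).
N  = 400;                          % image is N x N
R1 = 100;                          % outer radius
R2 = 50;                           % inner radius
[C, R] = meshgrid(1:N, 1:N);       % C = column index, R = row index of each pixel
D2 = (R - N/2).^2 + (C - N/2).^2;  % squared distance of each pixel from the center
I = 64*ones(N,N);                  % background
I(D2 < R1^2) = 128;                % outer circle
I(D2 < R2^2) = 192;                % inner circle (overwrites part of the outer one)
I = I + 32*(rand(N) - 0.5);        % uniform noise in -16 .. +16
imwrite(uint8(I), 'test_vectorized.tif');   % hypothetical output file name
```

The comparisons D2 < R1^2 and D2 < R2^2 produce logical masks, so each assignment sets all of the matching pixels at once.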
2. (20 pts) The “coins.png” image (available in the Matlab example images directory) has
light-colored coins against a dark background. As done in class, you can segment the
coins from the background using a simple global thresholding technique, such as
B = I > t;
where t is a value between 0 and 255. (There may be a couple of pixels that are not segmented
correctly; don’t worry about that.) Write a Matlab program to calculate the
maximum, mean and standard deviation of the pixels in the regions corresponding to
the coins (you don’t need to do this for each coin; just the union of the coin regions).
Turn in the Matlab program, the calculated values, and the image of the segmented
coins.
Solution:
% HW1, problem 2
% Segment coins (by thresholding) in the "coins.png" image;
% estimate maximum, mean, and standard deviation of the coin pixels.
% Good idea to do this at the beginning of all of your programs
clear all    % Clear out all old variables in the workspace
close all    % Close all open figures and images
I = imread('coins.png');    % Read image
imshow(I, []);
B = I > 80;                 % Threshold image, so that coin pixels = 1
figure, imshow(B);
% Go through image, and accumulate statistics.
% For the calculation of standard deviation, we will use the equation
%    Var(X) = mean(X^2) - mean(X)^2
% The names total, variance, and stddev are used so that Matlab's built-in
% functions sum, var, and std are not shadowed.
maxval = 0;
total = 0;     % Sum of coin values
total2 = 0;    % Sum of values squared
nPts = 0;      % Number of pixels in coin regions
for r=1:size(I,1)
    for c=1:size(I,2)
        if B(r,c)    % if this is a coin pixel
            nPts = nPts + 1;
            if I(r,c) > maxval
                maxval = I(r,c);
            end
            % Accumulate sums. Convert to double to get better precision.
            total = total + double(I(r,c));
            total2 = total2 + double(I(r,c))^2;
        end
    end
end
avg = total/nPts;
variance = total2/nPts - avg^2;
stddev = sqrt(variance);
fprintf('Maximum value in coin regions = %d\n', maxval);
fprintf('Average value in coin regions = %f\n', avg);
fprintf('Standard deviation of pixels in coin regions = %f\n', stddev);
This prints out:
Maximum value in coin regions = 255
Average value in coin regions = 178.614854
Standard deviation of pixels in coin regions = 31.017945
If you use Matlab a lot, you will find that you can write programs that are much shorter and more
efficient. Since Matlab is an interpreted language, “for” loops are generally pretty slow for
doing matrix and array operations, so it is much better to write the operations using matrix
operators. Below is an alternative solution that gives the same results:
% Alternative solution
clear all    % A leftover variable named sum would shadow the built-in function
I = imread('coins.png');    % Read image
B = I > 80;                 % Threshold image, so that coin pixels = 1
nPts = sum(sum(B));         % Sum entire image to count coin pixels
% Make a mask image where coin pixels = 255
M = 255*uint8(B);
% Do a logical AND of each pixel in the mask image, with I.
% You could instead have used the command: I = I .* uint8(B).
I = bitand(I,M);            % Force background pixels to 0
maxval = max(I(:));         % Background is 0, so this is the maximum coin value
avg = sum(sum(I))/nPts;
variance = sum(sum(double(I).^2))/nPts - avg^2;
stddev = sqrt(variance);
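Logical indexing gives an even shorter route to the same statistics. The sketch below assumes the same threshold of 80; the names vals and stddev are illustrative. The call std(vals, 1) normalizes by the number of samples N rather than N-1, which matches the formula Var(X) = mean(X^2) - mean(X)^2 used in the loop version.

```matlab
% Sketch: coin statistics via logical indexing (threshold of 80 as above).
clear                       % remove variables that might shadow built-in functions
I = imread('coins.png');
B = I > 80;                 % logical mask, coin pixels = 1
vals = double(I(B));        % column vector holding only the coin pixel values
maxval = max(vals);
avg    = mean(vals);
stddev = std(vals, 1);      % second argument 1: normalize by N, not N-1
fprintf('max = %d, mean = %f, std = %f\n', maxval, avg, stddev);
```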
3. (20 pts) As described in Section 2.1.1 of the textbook, we can treat the human fovea as a
square sensor array of size 1.5 mm x 1.5 mm, containing about 337,000 cones. Also, the
space between the cones is equal to the width of the cones.
a) What is the field of view (in degrees) of the human fovea?
b) Estimate the distance from Brown Hall to the top of South Table Mountain (you can
find this using a map, or a webtool such as Microsoft Bing Maps, or Google Earth).
What is the minimum size object that you can see with the naked eye on top of the
mountain? Can you see a person on top of the mountain? Assume for simplicity
that the size of the image of the object must cover at least two receptors (cones).
Solution:
(a) If the fovea covers an area of 1.5mm x 1.5mm, and contains 337,000 cones, then we can
think of the fovea as a square sensor array, which translates into an NxN array of size 580x580
cones. Assuming equal spacing between cones, this gives 580 cones and 580 spaces on a line 1.5
mm long. Each cone is s = 1.5 mm / 1160 = 0.0013 mm in diameter.
Assuming the focal length of the eye is 17 mm, then the field of view of the fovea of the human
eye is 2*arctan(0.75/17) = 5 degrees. For comparison, the diameter of the full moon as seen
from earth is about 0.5 degree.
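The numbers in part (a) can be reproduced with a few lines of Matlab. This is only a sketch of the arithmetic; the variable names are illustrative, and the 17 mm focal length is the same assumption made above.

```matlab
% Sketch: reproduce the fovea numbers from part (a).
nCones = 337000;            % cones in the fovea (from the problem statement)
w      = 1.5;               % side length of the fovea, mm
f      = 17;                % focal length of the eye, mm (assumed, as above)
n      = sqrt(nCones);      % cones along one side, about 580
s      = w / (2*n);         % cone diameter, mm (cones and gaps alternate)
fov    = 2*atan((w/2)/f) * 180/pi;   % field of view, degrees
fprintf('n = %.1f cones, s = %.4f mm, FOV = %.2f degrees\n', n, s, fov);
```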
(b) I estimated the distance to be about d = 1100 meters. Assume that the size of the imaged
object on the retina must cover at least two sensor elements. Allowing for the spaces between
the cones, the image must span four cone widths, so s = 4 * 0.0013 mm, or 0.0052 mm (here s is
the image size on the retina, not the cone diameter). Then we can use similar triangles to
find the minimum size of the object. If we assume that the eye has a focal length of f=17 mm,
then
h/d = s/f
or
h = d*s/f = (1100 m)*(5.2x10^-6 m)/(17x10^-3m) = 0.34 m.
So you should just barely be able to see a person on top of the mountain from the campus.
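The similar-triangles calculation in part (b) can be checked the same way. This is a sketch under the assumptions stated above (d = 1100 m, f = 17 mm, cone diameter 0.0013 mm); the name cone is illustrative.

```matlab
% Sketch: minimum visible object size from part (b), by similar triangles.
d    = 1100;         % estimated distance to the top of the mountain, m
f    = 17e-3;        % focal length of the eye, m (assumed, as above)
cone = 0.0013e-3;    % cone diameter from part (a), m
s    = 4*cone;       % two cones plus two gaps on the retina, m
h    = d*s/f;        % minimum object size, m
fprintf('Minimum object size = %.2f m\n', h);
```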
4. (20 pts) A pool-playing robot uses an overhead camera to determine the positions of the
balls on the pool table. Assume that:
a) We are using a standard billiard table of size 44" x 88".
b) We need at least 100 square pixels per ball to reliably determine the identity of
each ball.
c) The center of the ball can be located to a precision of +- one pixel in the image.
d) We need to locate the ball on the table to an accuracy of +- one cm.
e) We are going to mount the camera on the ceiling, looking straight down. The
distance from the camera to the table is 2 m.
Determine a configuration of the camera resolution and lens FOV that will meet these
requirements. Assume that you can choose from the following parts:
Lenses with field of view 30, 60, 90 degrees
Cameras with resolutions of 256x256, 512x512, or 1024x1024 pixels
Choose the lowest resolution that will meet the requirements.
Solution:
We can calculate the required field of view of the camera: To completely image the longer side
of the table (L=2.235 m) at a distance of D=2 m, the angle must be
    θ = 2 tan^-1( (L/2) / D ) = 1.019 radians = 58 degrees
So we need to use the lens with a 60 degree field of view.
To determine the required resolution NxN of the camera, we first check the constraint that we
need an accuracy of +- 1cm. At a distance of D = 2m, with a field of view of 60 degrees, the
horizontal length that the camera actually images is
    L = 2 D tan(θ/2) = 231 cm
where θ = 60 degrees. So we have N pixels over 231 cm, or N/231 pixels/cm. If we need an
accuracy of 1 cm, then the minimum number of pixels N must be 231.
However, we also need to check the constraint that we need a minimum area for the image of the
ball. A billiard ball has diameter 5.715 cm. In units of pixels in the image, the diameter is
5.715cm * (N pixels / 231 cm) = 0.0247 N pixels
The area of the circular region of the ball in the image is
    A = π r^2 = π (0.0247 N / 2)^2 = 4.79x10^-4 N^2 square pixels.
If we need 100 square pixels per ball, then N^2 = 100 / (4.79x10^-4) = 2.0877x10^5,
or N = 456.9 pixels.
So we need a resolution of 512x512 pixels.
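As a cross-check, both constraints can be evaluated for each candidate resolution. The sketch below assumes the 60 degree lens chosen above and a ball diameter of 5.715 cm; the variable names (pxPerCm, ballArea, Nmin, and so on) are illustrative.

```matlab
% Sketch: test each candidate camera resolution with the 60 degree lens.
D     = 200;                  % camera-to-table distance, cm
theta = 60*pi/180;            % lens field of view, radians
ballD = 5.715;                % billiard ball diameter, cm
L     = 2*D*tan(theta/2);     % imaged length at the table, about 231 cm
Nmin  = [];                   % lowest resolution that passes both checks
for N = [256 512 1024]
    pxPerCm    = N / L;                     % image scale, pixels per cm
    ballArea   = pi*(ballD*pxPerCm/2)^2;    % ball area, square pixels
    okAccuracy = (1/pxPerCm) <= 1;          % one pixel spans at most 1 cm
    okArea     = ballArea >= 100;           % at least 100 square pixels
    if isempty(Nmin) && okAccuracy && okArea
        Nmin = N;
    end
    fprintf('N = %4d: ball area = %6.1f square pixels\n', N, ballArea);
end
fprintf('Lowest resolution that meets both constraints: %dx%d\n', Nmin, Nmin);
```

The 256x256 camera satisfies the accuracy constraint but gives a ball area well under 100 square pixels, so the area constraint is the one that decides the answer.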
5. (20 pts) Develop a program to resize the example image “cameraman.tif” from its
original size of 256x256 to an enlarged size of 400x400, using bilinear interpolation. For
this problem, don’t use the Matlab functions “imresize”, “interp2” or the
equivalent OpenCV function. Turn in your program and the resulting image.
Solution:
clear all
close all
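% Source image (converted to double so that the interpolation arithmetic
% below is not rounded and saturated as uint8)
I1 = double(imread('cameraman.tif'));
[M1,N1] = size(I1);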
% Destination image
M2 = 400;
N2 = 400;
I2 = zeros(M2,N2);
% Scaling factors
sx = N2/N1;
sy = M2/M1;
% The transformation from I2 to I1 is
%    x1 = x2/sx, y1 = y2/sy
% Scan through I2 and interpolate value from I1.
for y2 = 1:M2
    for x2 = 1:N2
        % Compute exact location of corresponding point in I1
        x1 = x2/sx;
        y1 = y2/sy;
        % Get closest integer coordinates less than x1,y1 (ie, round down)
        x0 = floor(x1);
        y0 = floor(y1);
        % Don't go outside bounds of the image
        if x0 < 1,      x0 = 1;      end
        if y0 < 1,      y0 = 1;      end
        if x0 > N1-1,   x0 = N1-1;   end
        if y0 > M1-1,   y0 = M1-1;   end
        % Get the offsets of (x1,y1) from (x0,y0)
        x = x1 - x0;
        y = y1 - y0;
        % Interpolate the value at (x1,y1), using the formula from the
        % lecture notes. Remember when we access a pixel, we have to use
        % (y,x) instead of (x,y).
        I2(y2,x2) = I1(y0,x0) * (1-x)*(1-y) + ...
            I1(y0,x0+1)*x*(1-y) + ...
            I1(y0+1,x0)*(1-x)*y + ...
            I1(y0+1,x0+1)*x*y;
    end
end
imshow(I2,[]);
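To see what the interpolation formula does at a single point, it helps to apply it by hand to one 2x2 neighborhood. The values in P and the offsets below are made up purely for illustration.

```matlab
% Sketch: bilinear interpolation of one point inside a 2x2 neighborhood.
P = [10 20;      % values at (y0,   x0) and (y0,   x0+1)
     30 40];     % values at (y0+1, x0) and (y0+1, x0+1)
x = 0.25;        % horizontal offset from x0, between 0 and 1
y = 0.50;        % vertical offset from y0, between 0 and 1
v = P(1,1)*(1-x)*(1-y) + P(1,2)*x*(1-y) + ...
    P(2,1)*(1-x)*y     + P(2,2)*x*y;
fprintf('Interpolated value = %.2f\n', v);   % prints 22.50
```

The four weights always sum to 1, so the result is a weighted average of the four neighbors whenever both offsets lie between 0 and 1.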
The original (left) and resized (right) images: