Stat 407 Quiz 2 Fall 2001 Name

1. (2pts) Which of the following linkage methods for hierarchical clustering is the most likely to give the
following grouping of cases into cluster 1 and cluster 2? Explain.
(a) Single,
(b) Complete,
(c) Ward's,
(d) Average.
(a) Single linkage, due to its tendency for chaining.
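The chaining behaviour can be illustrated with a minimal sketch. The quiz plot is not reproduced here, so the data below are hypothetical: two long, thin strips at x1 = 0 and x1 = 1, with neighbouring points closer within a strip (0.5) than across strips (1.0). The function `single_linkage` is an illustrative helper, not a library routine. It repeatedly merges the two clusters whose closest members are nearest, which is the single linkage rule.

```python
from itertools import combinations
from math import dist

# Hypothetical data (assumption): two thin vertical strips, 21 points each.
points = [(x1, 0.5 * i) for x1 in (0.0, 1.0) for i in range(21)]

def single_linkage(points, n_clusters):
    """Merge clusters by smallest nearest-neighbour distance until
    n_clusters remain; returns one cluster label per point."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the cluster representative.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Every pair of points, from closest to farthest.
    pairs = sorted(
        combinations(range(len(points)), 2),
        key=lambda ij: dist(points[ij[0]], points[ij[1]]),
    )
    remaining = len(points)
    for i, j in pairs:
        if remaining == n_clusters:
            break
        ri, rj = find(i), find(j)
        if ri != rj:  # the pair links two different clusters: merge them
            parent[ri] = rj
            remaining -= 1
    return [find(i) for i in range(len(points))]

labels = single_linkage(points, 2)
left = {lab for p, lab in zip(points, labels) if p[0] == 0.0}
right = {lab for p, lab in zip(points, labels) if p[0] == 1.0}
print(left, right)
```

On this toy data each strip is chained together point by point, so the two clusters recovered are the two strips.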
2. (2pts) On the above plot draw your guess at the solution that would be provided by k-means clustering,
with k = 2. Explain.
The k-means algorithm tends to break data into spherical clusters, hence it would divide the data in two
roughly at the average value for x2. It would mix up the two obvious clusters.
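A small sketch of this behaviour, under stated assumptions: the data are hypothetical (two thin strips at x1 = 0 and x1 = 1, since the quiz plot is not available), the `kmeans` function is a plain Lloyd's iteration written for illustration, and the starting centroids are chosen by hand at the two extreme points. Other starting values can converge to different local optima.

```python
# Hypothetical data (assumption): two thin vertical strips, 21 points each.
points = [(x1, 0.5 * i) for x1 in (0.0, 1.0) for i in range(21)]

def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = list(centroids)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels, centroids

# Assumed starting centroids: the lowest and highest points in the data.
labels, centroids = kmeans(points, [points[0], points[-1]])
for k in (0, 1):
    members = [p for p, lab in zip(points, labels) if lab == k]
    print(k, sorted({p[0] for p in members}), min(p[1] for p in members),
          max(p[1] for p in members))
```

On this toy data the two k-means clusters split near x2 = 5, and each one contains points from both strips.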
3. (1pt) Fuzzy c-means is a variant of the k-means clustering algorithm. (T)
4. (2pts) From this formula for a bivariate normal density, state the mean and the variance-covariance
matrix of the distribution.
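\[
f(x_1, x_2) = \frac{1}{2\pi \begin{vmatrix} 4 & 2 \\ 2 & 5 \end{vmatrix}^{1/2}}
\times \exp\left\{ -\frac{1}{2}
\begin{pmatrix} x_1 + 5 & x_2 - 2 \end{pmatrix}
\begin{bmatrix} 5/16 & -1/8 \\ -1/8 & 1/4 \end{bmatrix}
\begin{pmatrix} x_1 + 5 \\ x_2 - 2 \end{pmatrix}
\right\}
\]
The mean is $(-5 \;\; 2)'$ and the variance-covariance is
\[
\begin{bmatrix} 4 & 2 \\ 2 & 5 \end{bmatrix}.
\]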
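The answer can be checked with a short sketch: the matrix in the exponent is the inverse of the variance-covariance matrix, so inverting it should give back the matrix with rows (4, 2) and (2, 5), whose determinant appears in the normalising constant. The helper `inverse_2x2` is written here for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction as F

# Matrix read off the exponent of the density (the inverse covariance).
precision = [[F(5, 16), F(-1, 8)],
             [F(-1, 8), F(1, 4)]]

def inverse_2x2(m):
    """Invert a 2x2 matrix using the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

covariance = inverse_2x2(precision)
det_cov = (covariance[0][0] * covariance[1][1]
           - covariance[0][1] * covariance[1][0])

# The exponent uses (x1 + 5) and (x2 - 2), so the mean is (-5, 2).
mean = (-5, 2)

print([[int(v) for v in row] for row in covariance])  # [[4, 2], [2, 5]]
print(det_cov)                                         # 16
```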
5. (2pts) Which of the following is most likely not a sample from a normal distribution? Explain.
(d), because there is strong curvature at the bottom, indicating strong skewness.
6. (1pt) Mahalanobis (or statistical) distance derives from the exponent of the multivariate normal density
function. (T)
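As a minimal sketch of this connection, the squared Mahalanobis distance is the quadratic form (x − mean)' S⁻¹ (x − mean), and the exponent of the normal density is −1/2 times that value. The example below assumes the mean and inverse covariance from question 4. The observation `x` and the helper `mahalanobis_sq` are hypothetical, chosen only for illustration.

```python
from math import sqrt

# Assumed for illustration: mean and inverse covariance from question 4.
mean = (-5.0, 2.0)
precision = [[5 / 16, -1 / 8],
             [-1 / 8, 1 / 4]]

def mahalanobis_sq(x, mean, precision):
    """Squared Mahalanobis distance (x - mean)' S^{-1} (x - mean)."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    n = len(d)
    return sum(d[i] * precision[i][j] * d[j]
               for i in range(n) for j in range(n))

x = (-1.0, 4.0)  # hypothetical observation
d2 = mahalanobis_sq(x, mean, precision)
distance = sqrt(d2)
exponent = -0.5 * d2  # the term inside exp{...} of the normal density

print(distance, exponent)  # 2.0 -2.0
```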