ClusterExam

Cluster Analysis
True/False Questions
1. Cluster analysis does not classify variables as dependent or independent.
(True)
2. Cluster analysis is the obverse of factor analysis in that it reduces the number of
objects, not the number of variables, by grouping them into a much smaller number of
clusters.
(True)
3. If cluster analysis is used as a general data reduction tool, subsequent multivariate
analysis can be conducted on the clusters rather than on the individual observations.
(True)
4. The dendrogram is read from right to left.
(False)
5. Clustering should be done on samples of 300 or more.
(False)
6. In cluster analysis, objects with larger distances between them are more similar to
each other than are those at smaller distances.
(False)
7. The average linkage method of hierarchical clustering is preferred to the single and
complete linkage methods.
(True)
8. The centroid method is a variance method of hierarchical clustering in which the
distance between two clusters is the distance between their centroids (means for all
the variables).
(True)
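Note on question 8: the centroid distance described there can be sketched in a few lines of Python. This is a minimal illustration, not part of the exam; the function names `centroid` and `centroid_distance` and the sample data are assumptions made for the example.

```python
import math

def centroid(cluster):
    """Return the mean of each variable across the objects in a cluster."""
    n = len(cluster)
    return tuple(sum(obj[i] for obj in cluster) / n for i in range(len(cluster[0])))

def centroid_distance(cluster_a, cluster_b):
    """Euclidean distance between the centroids of two clusters."""
    return math.dist(centroid(cluster_a), centroid(cluster_b))

# Hypothetical data: two clusters of objects measured on two variables.
a = [(1.0, 2.0), (3.0, 4.0)]   # centroid is (2, 3)
b = [(6.0, 5.0), (8.0, 7.0)]   # centroid is (7, 6)
print(centroid_distance(a, b))  # square root of 5**2 + 3**2
```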
9. Nonhierarchical clustering is faster than hierarchical methods.
(True)
10. It is helpful to profile the clusters in terms of variables that were not used for
clustering.
(True)
11. One method of assessing reliability and validity of clustering is to use different
methods of clustering and compare the results.
(True)
12. To reduce the number of variables, a large set of variables can often be replaced by
the set of cluster components.
(True)
Multiple Choice Questions
25. Which method of analysis does not classify variables as dependent or independent?
a. regression analysis
b. discriminant analysis
c. analysis of variance
d. cluster analysis
(d)
26. Which statement is not true about cluster analysis?
a. Objects in each cluster tend to be similar to each other and dissimilar to objects in
the other clusters.
b. Cluster analysis is also called classification analysis or numerical taxonomy.
c. Groups or clusters are suggested by the data, not defined a priori.
d. Cluster analysis is a technique for analyzing data when the criterion or dependent
variable is categorical and the independent variables are interval in nature.
(d)
30. A _____ or tree graph is a graphical device for displaying clustering results. Vertical
lines represent clusters that are joined together. The position of the line on the scale
indicates the distances at which clusters were joined.
a. dendrogram
b. scattergram
c. scree plot
d. icicle diagram
(a)
31. The most important part of _____ is selecting the variables on which clustering is
based.
a. interpreting and profiling clusters
b. selecting a clustering procedure
c. assessing the validity of clustering
d. formulating the clustering problem
(d)
32. The most commonly used measure of similarity is the _____ or its square.
a. Euclidean distance
b. city-block distance
c. Chebychev’s distance
d. Manhattan distance
(a)
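Note on question 32: the distance measures named in the options can be compared side by side. The sketch below is illustrative only; the function names and the two sample points are assumptions. City-block and Manhattan distance are two names for the same measure, so a single function covers both.

```python
import math

def euclidean(x, y):
    """Square root of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def squared_euclidean(x, y):
    """Sum of squared differences (the Euclidean distance squared)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def city_block(x, y):
    """Sum of absolute differences (also called Manhattan distance)."""
    return sum(abs(a - b) for a, b in zip(x, y))

def chebychev(x, y):
    """Largest absolute difference on any single variable."""
    return max(abs(a - b) for a, b in zip(x, y))

# Hypothetical objects measured on three variables; differences are 3, 4, 0.
p, q = (1, 2, 3), (4, 6, 3)
print(euclidean(p, q), squared_euclidean(p, q), city_block(p, q), chebychev(p, q))
```

In every one of these measures a smaller value means the two objects are more alike, which is the point of true/false question 6.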
33. _____ is a clustering procedure characterized by the development of a tree-like
structure.
a. Non-hierarchical clustering
b. Hierarchical clustering
c. Divisive clustering
d. Agglomerative clustering
(b)
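Note on question 33: the tree-like structure arises because each step joins two existing clusters into one. The sketch below shows the agglomerative (bottom-up) variant on one-dimensional data, using single linkage as an assumed merge rule; the function name `agglomerate` and the sample values are hypothetical. The recorded merge distances are the values a dendrogram would plot.

```python
def agglomerate(values):
    """Agglomerative clustering of 1-D values with single linkage.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [[v] for v in values]   # every object starts in its own cluster
    history = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest minimum pairwise distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(x - y) for x in clusters[i] for y in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        history.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return history

# Hypothetical data: five objects measured on one variable.
history = agglomerate([1.0, 2.0, 6.0, 7.0, 15.0])
for left, right, dist in history:
    print(left, "+", right, "joined at distance", dist)
```

Divisive clustering (question 38) builds the same kind of tree in the opposite direction, starting from one cluster and splitting it.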
38. _____ is a clustering procedure where all objects start out in one giant cluster.
Clusters are formed by dividing this cluster into smaller and smaller clusters.
a. Non-hierarchical clustering
b. Hierarchical clustering
c. Divisive clustering
d. Agglomerative clustering
(c)
43. The _____ method uses information on all pairs of distances, not merely the
minimum or maximum distances.
a. single linkage
b. medium linkage
c. complete linkage
d. average linkage
(d)
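Note on question 43: the three linkage rules differ only in how they summarize the pairwise distances between two clusters. A minimal sketch, assuming Euclidean distance between objects; the function names and sample clusters are illustrative.

```python
import math
from itertools import product

def pair_distances(cluster_a, cluster_b):
    """All distances between one object in cluster_a and one in cluster_b."""
    return [math.dist(x, y) for x, y in product(cluster_a, cluster_b)]

def single_linkage(cluster_a, cluster_b):
    """Minimum pairwise distance (nearest neighbors)."""
    return min(pair_distances(cluster_a, cluster_b))

def complete_linkage(cluster_a, cluster_b):
    """Maximum pairwise distance (farthest neighbors)."""
    return max(pair_distances(cluster_a, cluster_b))

def average_linkage(cluster_a, cluster_b):
    """Mean of all pairwise distances."""
    d = pair_distances(cluster_a, cluster_b)
    return sum(d) / len(d)

# Hypothetical clusters of two objects each, measured on two variables.
a = [(0.0, 0.0), (0.0, 1.0)]
b = [(3.0, 0.0), (4.0, 0.0)]
print(single_linkage(a, b), average_linkage(a, b), complete_linkage(a, b))
```

Because the average uses every pair, it always falls between the single and complete linkage values.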
44. _____ is frequently referred to as k-means clustering.
a. Non-hierarchical clustering
b. Optimizing partitioning
c. Divisive clustering
d. Agglomerative clustering
(a)
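Note on question 44: a basic k-means procedure alternates between assigning each object to its nearest cluster center and recomputing the centers as cluster means. The sketch below is one simple variant, not a reference implementation; the function name `k_means`, the fixed starting centers, and the sample data are assumptions made for the example.

```python
import math

def k_means(points, centers, max_iter=100):
    """Basic k-means: assign points to the nearest center, then update centers."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest center for each point.
        labels = [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: each center becomes the mean of its assigned points.
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(centers[j])  # an empty cluster keeps its center
        if new_centers == centers:
            break  # no center moved, so the assignments are stable
        centers = new_centers
    return labels, centers

# Hypothetical data: six objects on two variables, with two assumed starting centers.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.0), (8.0, 9.5)]
labels, centers = k_means(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(labels, centers)
```

Unlike the hierarchical methods, the number of clusters is fixed in advance and no tree is built, which is one reason the procedure tends to be faster on large samples.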