Uploaded by cihavay714

New Microsoft Word Document

Q#
1.
Questions
i
Data mining is
a) The actual discovery phase of a knowledge discovery process
b) None of these
c) The stage of selecting the right data for a KDD process
d) A subject-oriented, integrated, time-variant, non-volatile collection of data
in support of management
ii
Cluster is
a) Group of similar objects that differ significantly from other
objects
b) Operations on a database to transform or simplify data in order to
prepare it for a machine-learning algorithm
c) Symbolic representation of facts or ideas from which information can
potentially be extracted
d) None of these
iii
OLAP stands for
a) Online analytical processing
b) Online analysis processing
c) Online transaction processing
d) Online aggregate processing
iv
To integrate heterogeneous databases, how many approaches are there in Data
Warehousing?
a) 2
b) 3
c) 4
d) 5
v
__________ is a system where operations like data extraction, transformation, and
loading are executed.
a) Data staging
b) Data integration
c) ETL
d) None of the mentioned
vi
What is the use of data cleaning?
a) To remove the noisy data
b) To correct the inconsistencies in data
c) To apply transformations that correct the wrong data
d) All of the above
vii
Data Mining System Classification consists of?
a) Database Technology
b) Machine Learning
c) Information Science
d) All of the above
viii
Data selection is
a) The actual discovery phase of a knowledge discovery process
b) The stage of selecting the right data for a KDD process
c) A subject-oriented, integrated, time-variant, non-volatile collection of data in
support of management
d) None of these
ix
Background knowledge refers to
a) Additional acquaintance used by a learning algorithm to facilitate the
learning process
b) A neural network that makes use of a hidden layer
c) It is a form of automatic learning
d) None of these
x
Data that can be modeled as dimension attributes and measure attributes are
called _______ data.
a) Multidimensional
b) Single dimensional
c) Measured
d) Dimensional
Q#
1.
Questions
i
Can the FP-growth algorithm be used if the FP-tree cannot fit in memory?
a. Yes
b. No
ii
What are maximal frequent itemsets?
a. A frequent itemset with no frequent super-itemset
b. A frequent itemset whose super-itemset is also frequent
c. A non-frequent itemset whose super-itemset is frequent
d. None of the above.
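As a study aid, the standard definition (an itemset is maximal frequent when it is frequent and none of its proper supersets is frequent) can be sketched in a few lines of Python. The function name and the sample itemsets are hypothetical, chosen only for illustration.

```python
def maximal_frequent_itemsets(frequent):
    """Return the itemsets in `frequent` that have no frequent proper superset."""
    frequent = [frozenset(s) for s in frequent]
    # `s < other` is Python's proper-subset test on sets.
    return [s for s in frequent if not any(s < other for other in frequent)]


# Hypothetical list of itemsets already known to be frequent.
frequent = [{"a"}, {"b"}, {"c"}, {"a", "b"}, {"a", "c"}]
maximal = maximal_frequent_itemsets(frequent)
```

Here {a}, {b} and {c} are dropped because each is contained in a larger frequent itemset, leaving {a, b} and {a, c}.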
iii
What are Max_confidence, Cosine similarity, All_confidence?
a. Frequent pattern mining algorithms
b. Measures to improve efficiency of apriori
c. Pattern evaluation measure
d. None of the above
iv
Which technique finds the frequent itemsets in just two database scans?
a. Partitioning
b. Sampling
c. Hashing
d. Dynamic itemset counting
v
How do you calculate Confidence(A -> B)?
a. Support(A B) / Support (A)
b. Support(A B) / Support (B)
c. Support(A B) / Support (A)
d. Support(A B) / Support (B)
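For reference, the usual textbook definition is confidence(A -> B) = support(A ∪ B) / support(A), where support(A ∪ B) is the fraction of transactions containing every item of both A and B. A minimal sketch, assuming transactions are Python sets and using made-up grocery data:

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(A ∪ B) / support(A); assumes support(A) is non-zero."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


# Hypothetical transaction database.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
conf = confidence({"bread"}, {"milk"}, transactions)
```

Bread appears in three of the four transactions and bread with milk in two, so the rule bread -> milk has confidence 2/3.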
vi
How many cells does an iceberg cube have if each dimension has exactly two
distinct values and only base cuboid does not satisfy iceberg condition?
a. 2^n
b. 3^n
c. 3^n - 2^n
d. 3^n - 1
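The counting behind this question can be checked by brute force. The sketch below assumes every combination of dimension values occurs in the data, writes the two values of each dimension as the placeholder names "v1" and "v2", and uses "*" for an aggregated dimension; all of these names are illustrative.

```python
from itertools import product


def cube_cell_counts(n):
    """Return (cells in the full cube, cells in the base cuboid) for n dimensions."""
    # Each dimension takes one of its two values or the aggregate marker "*".
    cells = list(product(("v1", "v2", "*"), repeat=n))
    # Base-cuboid cells are those with no aggregated dimension.
    base = [c for c in cells if "*" not in c]
    return len(cells), len(base)


total, base = cube_cell_counts(3)
remaining = total - base  # cells left if only the base cuboid fails the condition
```

For n = 3 the full cube has 27 cells and the base cuboid has 8, so 19 cells remain once the base cuboid is excluded.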
vii
Which of the following algorithms comes under classification?
a. Apriori
b. Brute force
c. DBSCAN
d. K-nearest neighbor
viii
______ consists of formal definitions, such as a COBOL layout or a database
schema.
a. Classical metadata.
b. Transformation metadata.
c. Historical metadata.
d. Structural metadata.
ix
Detail data in a single fact table is otherwise known as __________.
a. monoatomic data.
b. diatomic data.
c. atomic data.
d. multiatomic data.
x
The data set {brown, black, blue, green, red} is an example of a
a. Continuous attribute
b. Ordinal attribute
c. Numeric attribute
d. nominal attribute
Q#
1.
Questions
i
Agglomerative clustering uses a
a. bottom-up approach
b. top down approach
c. both
d. none
ii
Divisive clustering uses a
a. None
b. top down approach
c. bottom-up approach
d. both
iii
Which of the following statements is true for k-NN classifiers?
a. The classification accuracy is better with larger values of k
b. The decision boundary is smoother with smaller values of k
c. The decision boundary is linear
d. k-NN does not require an explicit training step
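To make the k-NN options concrete, here is a minimal sketch of a k-nearest-neighbour classifier using Euclidean distance and majority voting. The function name, the toy points and the labels are all hypothetical. Note that there is no fitting step: the stored training pairs are consulted directly at prediction time.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest points."""
    # Sort the stored (point, label) pairs by distance to the query; keep k.
    neighbours = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical labelled 2-D points.
train = [
    ((1.0, 1.0), "red"),
    ((1.2, 0.8), "red"),
    ((5.0, 5.0), "blue"),
    ((5.5, 4.5), "blue"),
]
label = knn_predict(train, (1.1, 0.9), k=3)
```

With k = 3 the query near (1, 1) picks up both red points and one blue point, so the vote goes to "red".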
iv
To detect fraudulent usage of credit cards, the following data mining task
should be used
a. outlier analysis
b. Prediction
c. association analysis
d. feature selection
v
Data scrubbing can be defined as
a. Check field overloading
b. Delete redundant tuples
c. Use simple domain knowledge (e.g., postal code, spell-check) to detect
errors and make corrections
d. Analyzing data to discover rules and relationship to detect violators
vi
Which of the following is also referred to as an overlaid 1D plot?
a. lattice
b. Barplot
c. Gplot
d. all of the mentioned
vii
_______ are numeric measurements or values that represent a specific business
aspect or activity
a. Dimensions
b. Schemas
c. Facts
d. Tables
viii
A fact is said to be fully additive if __________.
a. it is additive over at least one of the dimensions
b. Only numeric measures are used
c. All possible summaries are stored
d. it is additive over every dimension of its dimensionality
ix
Background knowledge refers to
a) Additional acquaintance used by a learning algorithm to facilitate
the learning process
b) A neural network that makes use of a hidden layer
c) It is a form of automatic learning
d) None of these
x
Data that can be modeled as dimension attributes and measure attributes are called
_______ data.
a) Multidimensional
b) Single dimensional
c) Measured
d) Dimensional