WWW.UandiSTAR.ORG

for more :- http://UandiStar.net
IV B.Tech I Semester – Computer Science & Engineering
Data Warehousing & Data Mining
1. Which of the following is the most popularly available and rich information repository?
a. Temporal databases
b. Relational databases
c. Transactional databases
d. Spatial databases
2. Which of the following databases is used to store time-related data?
a. Spatial databases
b. Text databases
c. Multimedia databases
d. Temporal databases
3. From a DWH perspective, data mining can be viewed as an advanced stage of
a. On-Line Transaction Processing
b. On-Line Data Processing
c. On-Line Analytical Processing
d. On-Line Electronic Processing
4. A _ _ _ _ _ _ is a group of heterogeneous databases.
a. Time series databases
b. Object oriented databases
c. Legacy databases
d. Spatial databases
5. Spatial databases include
a. Legacy databases
b. Time series databases
c. Satellite image databases
d. Temporal databases
6. Many people treat data mining as a synonym for another popularly used term
a. Knowledge discovery in databases
b. Knowledge inventory in databases
c. Knowledge acceptance in databases
d. Knowledge disposal in databases
7. A database is a collection of
a. Related data
b. Interrelated data
c. Irrelevant data
d. Distributed data
8. A relational database is a collection of
a. tables
b. events
c. attributes
d. values
9. A _ _ _ _ _ _ _ is a repository of information collected from multiple sources, stored under a unified schema, and which usually resides at a single site.
a. Data mining
b. Database
c. Data warehouse
d. Legacy databases
10. Which of the following databases is used to store image, audio, and video data?
a. Heterogeneous databases
b. Temporal databases
c. Legacy databases
d. Multimedia databases
11. What is the single-dimensional association rule for the following rule, which is written in the predicate notation used for multidimensional association rules?
contains(T, "computer") => contains(T, "software")
a. Computer => software
b. Software => computer
c. Software => computer
d. Computer => software
12. Which of the following analyses attempts to identify attributes that do not contribute to the classification or prediction process?
a. Cluster analysis
b. Outlier analysis
c. Relevance analysis
d. Evolution analysis
13. Which of the following is a summarization of the general characteristics or features of a target class of data?
a. Data discrimination
b. Data characterization
c. Data compression
d. Meta data
14. _ _ _ _ _ _ _ is a comparison of the general features of target class data objects with general features of objects from one or a set of contrasting classes.
a. Data characterization
b. Data summarization
c. Data discrimination
d. Meta data
15. _ _ _ _ _ _ _ interestingness measures are based on user beliefs in the data.
a. Objective
b. Descriptive
c. Collective
d. Subjective
16. _ _ _ _ _ _ mining tasks characterize the general properties of the data in the databases.
a. Descriptive
b. Predictive
c. Metadata
d. Data
17. _ _ _ _ _ mining tasks perform inference on the current data in order to make predictions.
a. Descriptive
b. Predictive
c. Data
d. Metadata
18. The derived model may be represented in the form of
a. ER model
b. Flow chart
c. Decision trees
d. DFD
19. Which of the following is the
classification of data mining
systems?
a. Summarization
b. Visualization
c. Discrimination
d. Characterization
20. _ _ _ _ _ _ _ analysis describes
and models regularities or trends
for
objects whose behavior changes
over time.
a. Data evolution
b. Cluster
c. Outlier
d. Summarization
21. Which of the following issues
relates to the diversity of
database types?
a. Handling noisy or incomplete data
b. Incorporation of background
knowledge
c. Handling of relational and
complex types of data
d. Efficiency and scalability of data
mining algorithms
22. Which of the following is not a
major issue in data mining?
a. Mining methodology and user
interaction issues
b. Performance issues
c. Issues relating to the diversity of
database types
d. Issues relating to the
Measurement
23. Processing _ _ _ _ _ queries in
operational databases would
substantially
degrade the performance of
operational tasks.
a. On-Line Transaction Processing
b. On-Line Electronic Processing
c. On-Line Data Processing
d. On-Line Analytical Processing
24. An _ _ _ _ _ _ System typically
adopts either a star or snow flake
model
and subject oriented database
design.
a. On-Line Transaction Processing
b. On-Line Electronic Processing
c. On-Line Analytical Processing
d. On-Line Data Processing
25. The access patterns of an _ _ _
_ system consist mainly of short,
atomic
transactions.
a. On-Line Analytical Processing
b. On-Line Transaction Processing
c. On-Line Electronic Processing
d. On-Line Data Processing
26. Which of the following approaches requires complex information filtering and integration processes, and competes for resources with processing at local sources?
a. Update-driven approach
b. Integrate-driven approach
c. Query-driven approach
d. Data-driven approach
27. Mining different kinds of knowledge in databases is an issue in
a. Performance issue
b. Mining methodology and user interaction issues
c. Diversity of database types issues
d. time complexity
28. Pattern evolution is an issue related to
a. Mining methodology and user interaction issues
b. Performance issues
c. Issues relating to the diversity of database types
d. Issues relating to the Measurement
29. A DWH is a subject oriented, integrated, time-variant, and _ _ _ _ _ _ collection of data in support of management's decision-making process.
a. Nonvolatile
b. Volatile
c. Disintegrated
d. Object-oriented
30. An _ _ _ system focuses mainly on the current data within an enterprise or department, without referring to historical data or data in different organizations.
a. On-Line Analytical Processing
b. On-Line Data Processing
c. On-Line Electronic Processing
d. On-Line Transaction Processing
31. The basic characteristic of On-Line Analytical Processing is
a. Informational processing
b. Operational processing
c. Data processing
d. Data cleaning
32. Which of the following cuboids holds the highest level of summarization?
a. Cuboid
b. Base cuboid
c. Non-base cuboid
d. Apex cuboid
33. _ _ _ _ _ _ _ _ _ _ is a visualization operation that rotates the data axes in view in order to provide an alternative presentation of the data.
a. Rollup
b. Drill down
c. Pivot
d. Slice & dice
34. _ _ _ _ _ _ tables can be specified by users or experts, or automatically generated and adjusted based on data distributions.
a. Fact
b. Summarized
c. Dimension
d. Relational
35. _ _ _ _ _ _ _ executes queries involving more than one fact table.
a. Drill-through
b. Drill-across
c. Drill-down
d. Rotate
36. A _ _ _ _ _ allows data to be modeled and viewed in multiple dimensions.
a. Meta data
b. Data cube
c. Database
d. Fact table
37. The major difference between the snowflake and star schema models is that the dimension tables of the snowflake model may be kept in _ _ _ _ form.
a. Standard
b. De-normalized
c. Normalized
d. Multi dimensional
38. Which of the following is not a
category of measure, where the
categories are based on the kind of
aggregate function used?
a. Cumulative
b. Distributive
c. Algebraic
d. Holistic
39. A concept hierarchy that is a
total or partial order among
attributes in
database schema is called a _ _ _
_ _ _ _ _ _ _ _ hierarchy.
a. Set-grouping
b. Grouping
c. Decision
d. Schema
40. Which of the following focuses
on socioeconomic applications?
a. Statistical database systems
b. Online Analytical Processing systems
c. Spatial database systems
d. Temporal database systems
41. A _ _ _ _ _ _ _ _ _ model
consists of radial lines emanating
from a central
point, where each line represents
a concept hierarchy for a
dimension
a. Cube net
b. Triangle net
c. Square net
d. Star net
42. Which of the following is constructed where the enterprise warehouse is the sole custodian of all warehouse data, which is then distributed to the various dependent data marts?
a. Enterprise DWH
b. Two-tier DWH
c. Multi-tier DWH
d. Virtual warehouse
43. Which of the following is a Multidimensional Online Analytical Processing server?
a. Ess base
b. Database
c. Swiss base
d. Red brick
44. The _ _ _ _ _ _ view includes fact tables and dimension tables.
a. DWH
b. Top-down
c. Data source
d. Business Query
45. Which of the following is a Hybrid OLAP server?
a. MS SQL server 1.0
b. MS SQL 5.0
c. MS SQL server 7.0
d. MS SQL server 3.0
46. ETL stands for
a. Evaluate, Transport and Link
b. Extract Transfer and Load
c. Error, Tracking and Load
d. Extract, Transient and Load
47. To architect the DWH, the major driving factor to support is
a. An inability to cope with requirements evolution
b. Not populating the warehouse
c. Day-to-day management of the warehouse
d. Supporting Online Transaction processing
48. A _ _ _ _ _ _ _ contains a subset of corporate-wide data that is of value to a specific group of users.
a. Enterprise warehouse
b. Virtual warehouse
c. Data warehouse
d. Data mart
49. A _ _ _ _ _ _ _ is a set of views over operational databases.
a. Enterprise warehouse
b. Virtual warehouse
c. Data warehouse
d. Data mart
50. What kind of servers are the intermediate servers that stand between a relational back-end server and client front-end tools?
a. Hybrid OLAP servers
b. Multidimensional OLAP server
c. Relational OLAP servers
d. Specialized SQL servers
51. Choose the _ _ _ _ _ _ _ _ _ that will populate each fact table record.
a. Measures
b. Dimensions
c. Grain
d. Business Process
52. How many cuboids are there in an n-dimensional data cube?
a.
b.
c.
d.
53. Meta data repository contains
a. Operational meta data
b. Data irrelevant to system performance
c. The mapping from the DWH to the operational environment
d. Summarized data
54. Which of the following support the bitmap indices?
a. Sybase IQ
b. Oracle 7
c. CoBoL
d. SQL
55. _ _ _ _ _ _ _ are created for the
data names and definitions of the
given
warehouse
a. Data cube
b. Summarized data
c. Meta data
d. Detailed Information
56. Chunking technique involves
"overlapping" some of the
aggregation
computations, it is referred to as _
_ _ _ _ aggregation in data cube
computation
a. Two way array
b. Three way array
c. Multi way array
d. Sparse array
57. The _ _ _ _ _ _ _ operator
computes aggregates over all
subsets of the
dimensions specified in the
operation.
a. Data base
b. Compute cube
c. Define cube
d. Group by
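As a rough illustration of the idea behind question 57, the sketch below aggregates a measure over every subset of a set of dimensions, which is what a cube-style operator does. The rows in `sales`, the dimension names, and the helper `cube_aggregates` are hypothetical, and concept hierarchies are ignored, so n dimensions give 2^n group-bys under that assumption.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (city, item, year, units_sold)
sales = [
    ("Delhi", "TV", 2010, 5),
    ("Delhi", "PC", 2010, 3),
    ("Pune", "TV", 2011, 4),
]
DIMENSIONS = ("city", "item", "year")


def cube_aggregates(rows, dimensions):
    """Sum the last column of each row over every subset of the dimensions."""
    result = {}
    for k in range(len(dimensions) + 1):
        for subset in combinations(range(len(dimensions)), k):
            totals = defaultdict(int)
            for row in rows:
                # Group-by key: the values of the chosen dimensions only.
                key = tuple(row[i] for i in subset)
                totals[key] += row[-1]
            names = tuple(dimensions[i] for i in subset)
            result[names] = dict(totals)
    return result


cuboids = cube_aggregates(sales, DIMENSIONS)
print(len(cuboids))        # one group-by per subset of the dimensions
print(cuboids[()])         # empty subset: the fully summarized (apex) cuboid
print(cuboids[("city",)])  # totals per city
```

The empty subset corresponds to the apex cuboid and the full set of dimensions to the base cuboid.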
58. Which of the following is a
subcube that is small enough to fit
into the
memory available for cube
computation?
a. Bulk
b. Array
c. Structure
d. Chunk
59. The bit mapped join indices
method is an integrated form of
a. Composite join indexing and bitmap
indexing
b. Join indexing and composite join
indexing
c. Join indexing and bitmap
indexing
d. Bitmap indexing and outer join
indexing
60. A set of attributes in a relation
schema that forms a primary key
for
another relation schema is called
a_______
a. Primary key
b. Foreign key
c. Secondary key
d. Composite key
61. Which of the following typically
gathers data from multiple,
heterogeneous, and external
sources?
a. Data cleaning
b. Load
c. Refresh
d. Data extraction
62. OLAM is particularly important
for the following reason
a. High quality of data in DWH
b. Data processing
c. OLTP-based exploratory data
analysis
d. Online selection of data mining
functions
63. Which of the following sets a
good example for interactive data
analysis
and provides the necessary
preparations for exploratory data
mining?
a. OLP
b. OLAP
c. OLTP
d. OLDP
64. Which of the following is not an exception indicator?
a. Out Exp
b. Self Exp
c. In Exp
d. Path Exp
65. _ _ _ _ _ _ _ _ _ can help business managers find and reach more suitable customers, as well as gain critical business insights that may help to drive market share and raise profits.
a. Data warehouse
b. Data mining
c. Data summarization
d. Data processing
66. _ _ _ _ _ _ _ _ _ _ _ is an alternative approach in which precomputed measures indicating data exceptions are used to guide the user in the data analysis process at all levels of aggregation.
a. Hypothesis-driven exploration
b. Inventory-driven exploration
c. Discovery-driven exploration
d. Exception-driven exploration
67. Which of the following is an exception indicator that indicates the degree of surprise of the cell value, relative to other cells at the same level of aggregation?
a. Out Exp
b. In Exp
c. Path Exp
d. Self Exp
68. _ _ _ _ _ is a powerful paradigm that integrates OLAP with data mining technology.
a. Online Analytical Modeling
b. Online Analytical Machine
c. Online Analytical Mining
d. Online Analytical Monitoring
69. Data warehouse application is _________
a. Data Processing
b. Transaction Processing
c. Datacube
d. Datamining
70. _ _ _ _ _ _ _ _ _ cubes compute
complex queries involving
multiple
dependent aggregates at multiple
granularities
a. Multi feature
b. Data
c. Meta
d. Solid
71. Which of the following
performs a linear transformation
on the original
data?
a. Z-score normalization
b. Normalization with decimal scaling
c. Zero-standard deviation
d. Min-max normalization
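Question 71 refers to min-max normalization as a linear transformation of the original data. A minimal sketch of that mapping follows, assuming the usual formula v' = (v - min) / (max - min) * (new_max - new_min) + new_min. The function name and the sample income values are illustrative and are not part of the question bank.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values from [min, max] to [new_min, new_max]."""
    old_min, old_max = min(values), max(values)
    if old_max == old_min:
        raise ValueError("all values are equal, so the range is zero")
    scale = (new_max - new_min) / (old_max - old_min)
    return [(v - old_min) * scale + new_min for v in values]


incomes = [12000, 73600, 98000]  # hypothetical attribute values
normalized = min_max_normalize(incomes)
print([round(v, 3) for v in normalized])
```

Because the mapping is linear, the relative spacing between the original values is preserved in the new range.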
72. Which of the following is the
best method for missing values in
data
cleaning?
a. Fill in the missing value manually
b. Use the most probable value to
fill in the missing value
c. Use the attribute mean to fill the
missing value
d. Use a global constant to fill in the
missing value
73. The minimum and maximum
values in a given bin are identified
as the
a. Bin means
b. Bin average
c. Bin medians
d. Bin boundaries
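To make the term in question 73 concrete, here is a small sketch of smoothing by bin boundaries: sorted values are split into equal-frequency bins, and each value is replaced by the closer of its bin's minimum and maximum. The function name, the sample prices, and the choice to send ties to the lower boundary are assumptions made for illustration.

```python
def smooth_by_bin_boundaries(sorted_values, bin_size):
    """Replace each value by the nearest boundary (min or max) of its bin."""
    smoothed = []
    for start in range(0, len(sorted_values), bin_size):
        bin_values = sorted_values[start:start + bin_size]
        low, high = bin_values[0], bin_values[-1]  # the bin boundaries
        for v in bin_values:
            # Ties go to the lower boundary (an arbitrary choice).
            smoothed.append(low if v - low <= high - v else high)
    return smoothed


prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]  # already sorted
print(smooth_by_bin_boundaries(prices, bin_size=3))
```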
74. Which of the following is a data
transformation operation?
a. Normalization
b. Regression
c. Clustering
d. Binning
75. The correlation between attributes A and B can be measured by
a.
b.
c.
d.
76. _ _ _ _ _ methods smooth a sorted data value by consulting its neighborhood, i.e., the values around it.
a. Clustering
b. Binning
c. Regression
d. Data reduction
77. Z-score normalization is also called
a. Min-max normalization
b. Zero-standard deviation normalization
c. Zero-mean normalization
d. Normalization by decimal scaling
78. _ _ _ _ _ _ is a random error or variance in a measured variable.
a. Bin
b. Cluster
c. Noise
d. Regression
79. The process in which data are consolidated into forms appropriate for mining is called
a. Data reduction
b. Data Redundancy
c. Data cleaning
d. Data transformation
80. Which of the following is a decision tree algorithm?
a. C3.2
b. ID3
c. PP2
d. DIM
81. If the tuples in D are grouped into M mutually disjoint clusters, then a simple random sample of m clusters can be obtained, where m < M. Which of the following suits the above sentence?
a. Stratified sample
b. SRS without replacement
c. Cluster sample
d. SRS with replacement
82. Multidimensional index trees include
a. A-trees
b. T-trees
c. P-trees
d. R-trees
83. In which of the following data
reduction strategies may irrelevant,
weakly relevant, or redundant
attributes be detected and removed?
a. Data cube aggregation
b. Dimension reduction
c. Data compression
d. Numerosity reduction
84. In database systems, _ _ _ _ _
are primarily used for providing
fast data
access.
a. Red-black trees
b. Game trees
c. Multidimensional index trees
d. splay trees
85. If the mining task is
classification, and the mining
algorithm itself is used
to determine the attribute subset,
then this is called a _ _ _ _ _ _
approach.
a. Filter
b. Reduction
c. Smoothing
d. Wrapper
86. The discrete wavelet
transformation is closely related
to the _ _ _ _ _ _ _
transform.
a. Discrete fourier
b. Fourier
c. Laplace
d. wavelet
87. Principal components analysis
is also called as
a. Karhunen-Loeve method
b. Kinen-liva method
c. Kruskal-learn method
d. Kutni-lara method
88. _ _ _ _ _ _ can be used as a
data reduction technique since it
allows a
large data set to be represented by
a much smaller random subset of
the
data.
a. Clustering
b. Regression
c. Histograms
d. Sampling
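The idea in question 88, representing a large data set by a much smaller random subset, can be sketched with Python's standard `random` module. The helper names `srswor` and `srswr` (simple random sample without and with replacement) and the toy data are hypothetical.

```python
import random


def srswor(data, s, seed=None):
    """Simple random sample of s items WITHOUT replacement."""
    return random.Random(seed).sample(data, s)


def srswr(data, s, seed=None):
    """Simple random sample of s items WITH replacement."""
    rng = random.Random(seed)
    return [rng.choice(data) for _ in range(s)]


data = list(range(100))            # stand-in for the tuples of a data set
without = srswor(data, 10, seed=1)
with_repl = srswr(data, 10, seed=1)
print(without)
print(with_repl)
```

Without replacement, no tuple can appear twice in the sample. With replacement, a tuple may be drawn more than once.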
89. Log-linear models are
a. Parametric methods
b. Discrete methods
c. Non-parametric methods
d. Non- discrete methods
90. Which of the following is a method
for the generation of concept
hierarchies
for categorical data?
a. Specification of a portion of a
hierarchy by implicit data grouping
b. Specification of their partial ordering,
but not of a set of attributes
c. Specification of a set of
attributes, but not of their partial
order
d. Specification of only a partial set of
entities
91. Which of the following method
uses class information?
a. Histogram analysis
b. Binning
c. Cluster analysis
d. Entropy-based Discretization
92. _ _ _ _ _ _ _ _ _ hierarchies for
categorical attributes or
dimensions
typically involve a group of
attributes
a. Discretization
b. Semantic
c. Index
d. Concept
93. Which of the following is based
on the maximal asset values,
which may
lead to a highly biased hierarchy?
a. Cluster analysis
b. Segmentation
c. Binning
d. Histogram analysis
94. The _ _ _ _ _ can be used to
segment numeric data into
relatively uniform,
"natural" intervals.
a. 1-2-3 rule
b. 2-3-4 rule
c. 3-4-5 rule
d. 4-5-6 rule
95. _ _ _ _ _ _ _ _ hierarchies for
numeric attributes can be
constructed
automatically based on data
distribution analysis
a. Concept
b. Discretization
c. Tree
d. Index
96. _ _ _ _ _ _ _ techniques can be
used to reduce the number of
values for a
given continuous attribute, by
dividing the range of the attribute
into
intervals
a. Concept hierarchy
b. Discretization
c. Tree-based
d. Index
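As one possible illustration of question 96, the sketch below divides the range of a continuous attribute into equal-width intervals and replaces each value by its interval index. Equal-width partitioning is only one of several ways to form the intervals, and the function name and sample values are assumptions.

```python
def equal_width_discretize(values, n_intervals):
    """Map each value to the index of the equal-width interval containing it."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("all values are equal, so the range is zero")
    width = (high - low) / n_intervals
    labels = []
    for v in values:
        # The maximum value would fall just past the last interval, so clamp it.
        index = min(int((v - low) / width), n_intervals - 1)
        labels.append(index)
    return labels


readings = [0, 5, 10, 15, 20]  # hypothetical continuous attribute
print(equal_width_discretize(readings, n_intervals=2))
```

Five distinct values are reduced to two interval labels, which is the reduction in the number of values that the question describes.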
97. A _ _ _ _ _ _ _ _ _ algorithm can be applied to partition data into groups.
a. Binning
b. Histogram
c. Clustering
d. Entropy-based
98. An information-based measure called _ _ _ _ can be used to recursively partition the values of a numeric attribute A, resulting in a hierarchical discretization.
a. Entropy
b. Cluster
c. Binning
d. Segmentation
99. The kinds of knowledge include
a. Image analysis
b. Query process
c. Association
d. Multimedia analysis
100. Which of the following is a simplicity measure?
a. Rule strength
b. Rule quality
c. Rule reliability
d. Rule length
101. _ _ _ _ _ _ hierarchies can be used to refine or enrich schema-defined hierarchies when the two types of hierarchies are combined.
a. Schema
b. Set-grouping
c. Operation-derived
d. rule-based
102. _ _ _ _ _ _ _ are those that contribute new information or increased performance to the given pattern set.
a. Utility patterns
b. Certainty patterns
c. Novelty patterns
d. Simplicity patterns
103. Certainty factor is also known as
a. Rule length
b. Noise threshold
c. Minable view
d. Rule strength
104. Which of the following primitives specifies the data mining functions to be performed?
a. Task-relevant data
b. The kind of knowledge to be mined
c. Background knowledge
d. Interestingness measures
105. _ _ _ _ _ _ _ may be used to guide the mining process or, after discovery, to evaluate the discovered patterns.
a. Task-relevant data
b. The kind of knowledge to be mined
c. Background knowledge
d. Interestingness measures
106. A _ _ _ _ _ hierarchy is a total
or partial order among attributes
in the
database schema.
a. Schema
b. Set-grouping
c. Operation-derived
d. rule-based
107. Given a set of task-relevant
data tuples, the confidence of "A => B"
is
defined as
a.
b.
c.
d.
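The answer options for the confidence formula did not survive extraction. As a hedged reminder of the standard definitions, support(A => B) is the fraction of transactions containing both A and B, and confidence(A => B) is support(A and B together) divided by support(A). The sketch below computes both on a hypothetical set of transactions; the function names and data are illustrative only.

```python
# Hypothetical task-relevant transactions, each a set of items.
transactions = [
    {"computer", "software"},
    {"computer", "printer"},
    {"computer", "software", "printer"},
    {"printer"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(A and B) / support(A); fails if A never occurs."""
    both = support(set(antecedent) | set(consequent), transactions)
    return both / support(antecedent, transactions)


rule_support = support({"computer", "software"}, transactions)
rule_confidence = confidence({"computer"}, {"software"}, transactions)
print(rule_support, round(rule_confidence, 3))
```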
108. _ _ _ _ _ hierarchies include
the decoding of information-encoded
strings, information extraction from
complex data objects, and data
clustering.
a. Rule-based
b. Operation-derived
c. Schema
d. Set grouping
109. For association rules of the form "A => B", where A and B are sets of items, support is defined as
a.
b.
c.
d.
110. Which of the following clauses is the task-relevant data primitive?
a. In relevance to
b. Use for warehouse
c. Analysis
d. Order by
111. Mining with the use of _ _ _ _ allows additional flexibility for ad hoc rule mining.
a. Image patterns
b. Data patterns
c. Information patterns
d. Meta patterns
112. Which of the following clauses lists the attributes or dimensions for exploration?
a. Order by
b. group by
c. having
d. in relevance to
113. Which of the following clauses uses the meta pattern?
a. Analyze
b. In relevance to
c. Matching
d. Use data warehouse
114. Which of the following clauses is used for discrimination?
a. Mine characteristics
b. Mine discriminant
c. Mine association
d. Mine comparison
115. The expansion of DMQL is
a. Data Modeling Queue Level
b. Design Modeling Query Language
c. Data Mining Query Language
d. Data & Meta data Query Language
116. The _ _ _ _ _ clause, when
used for characterization, specifies
aggregate
measures, such as count, sum, or
count%.
a. Use database
b. Analyze
c. Matching
d. Use hierarchy
117. Which of the following clause
specifies the condition by which
groups of
data are considered relevant?
a. Having
b. Group by
c. Order by
d. analyze
118. The _ _ _ _ _ _ _ _ statement
is used to specify the kind of
knowledge to
be mined.
a. Knowledge-mine-specification
b. Mine-knowledge-specification
c. Knowledge-specification-mine
d. Specification-mine-knowledge
119. An example of
interestingness measures and
threshold values is
a. Without support threshold=
b. With confidence threshold=
c. Without Confidence threshold=
d. With support threshold=
120. CRISP-DM addresses an
issue as
a. Mapping from datamining problems to
business issues
b. Capturing and misunderstanding the
data
c. Disintegrating datamining results
within the business context
d. Deploying and maintaining data
mining results
121. An example of a set-grouping hierarchy is
a. Define hierarchy age-hierarchy for age as customer on level1:{young, middle-aged, senior} level10:all level2:{20 39} level1: young level2:{20 59} level1: middle-aged level2:{60 89} level1: senior
b. Define hierarchy age-hierarchy as age for customer on level1:{young, middle-aged, senior} level10:all level2:{20 39} level1: young level2:{20 59} level1: middle-aged level2:{60 89} level1: senior
c. Define hierarchy age-hierarchy for age on customer as level1:{young, middle-aged, senior} level10:all level2:{20 39} level1: young level2:{20 59} level1: middle-aged level2:{60 89} level1: senior
d. Define hierarchy age-hierarchy on age for customer as level1:{young, middle-aged, senior} level10:all level2:{20 39} level1: young level2:{20 59} level1: middle-aged level2:{60 89} level1: senior
122. Which of the following data
mining language uses SQL-like
syntax and
serves as rule generation queries
for mining association rules.
a. MINE RULE operator
b. RULE MINE operator
c. DATA MINE operator
d. DWH operator
123. Which of the following is not
a data mining language?
a. DMQL
b. MSQL
c. PSQL
d. OLE DB for
124. The syntax of a schema hierarchy definition is
a. Define hierarchy location-hierarchy on address as [street, city, country]
b. Define hierarchy location-hierarchy as address on [street, city, country]
c. Define hierarchy location-hierarchy from address to [street, city, country]
d. Define hierarchy location-hierarchy for address all [street, city, country]
125. The DMQL statement syntax
is
a. display as result _ from
b. display result _ from
c. display on result _ from
d. display for result _ from
126. Which of the following is a
data mining query language
a. PSQL
b. QSQL
c. MSQL
d. RSQL
127. _ _ _ _ _ is used for efficient
implementations of a few essential
data
mining primitives.
a. No coupling
b. Loose coupling
c. Tight coupling
d. Semi tight coupling
128. _ _ _ _ _ _ _ is a compromise
between loose and tight coupling.
a. No coupling
b. Loose coupling
c. Tight coupling
d. Semi tight coupling
129. Which of the following
coupling schemes is used to fetch
data from a data
repository managed by database
systems?
a. No coupling
b. Loose coupling
c. Tight coupling
d. Semi tight coupling
130. A well designed data mining
system should offer _ _ _ _ _ _ _
with a
data warehouse system
a. Semi tight coupling
b. No coupling
c. Loose coupling
d. Normal coupling
131. With which of the following is
it difficult to achieve high scalability
and good
performance with large data sets?
a. No coupling
b. Tight coupling
c. Semi tight coupling
d. Loose coupling
132. _ _ _ _ _ _ _ _ means that a
Data mining system will not utilize
any
function of a data warehouse
system
a. Loose coupling
b. Semi tight coupling
c. Loose coupling
d. No coupling
133. _ _ _ _ _ _ _ _ means that a
data mining system is smoothly
integrated
into the database system.
a. No coupling
b. Loose coupling
c. Tight coupling
d. Semi tight coupling
134. Which of the following
provides a concise and succinct
summarization of
the given collection of data?
a. Comparison
b. Characterization
c. Summarization
d. Aggregation
135. _ _ _ _ _ _ _ _ data mining
describes the data set in a concise
and summarative manner and presents
interesting general properties of
the data.
a. Descriptive
b. Predictive
c. Active
d. Constructive
136. _ _ _ _ _ _ data mining
analyzes the data in order to
construct one or a
set of models and attempts to
predict the behavior of new data
sets.
a. Descriptive
b. Predictive
c. Active
d. Constructive
137. Attribute removal is based on
the following rule: If there is a
large set of
distinct values for an attribute of
the initial working relation but,
a. There is generalization operator on
the attribute
b. There is no generalization operand on
the attribute
c. There is no generalization
operator on the attribute
d. There is no aggregation operator on
the attribute
138. On-line analytical processing
in data warehouses is a purely
_ _ _ _ -controlled process
a. Machine
b. database
c. Developer
d. User
139. Which of the following
approach is used to control
generalization
process?
a. Generalized relation threshold
control
b. Generalized class threshold control
c. Generalized dimension threshold
control
d. Generalized query threshold control
140. Many current OLAP systems
confine dimensions to _ _ _ _ _ _ _
___
data
a. Numeric
b. Non numeric
c. Meta
d. Summarized
141. _ _ _ _ _ _ _ is a process that
abstracts a large set of task-relevant data
in a database from a relatively low
conceptual level to higher
conceptual
levels.
a. Data realization
b. Data characterization
c. Data summarization
d. Data generalization
142. The _ _ _ _ _ _ approach can
be considered as a data
warehouse-based
pre-computation-oriented,
materialized-view approach.
a. Object-oriented induction
b. Data cube
c. Attribute-oriented induction
d. Data square
143. Which of the following
approach is a relational database
query-oriented,
generalization-based, on-line data
analysis technique?
a. Attribute-oriented induction
b. object-oriented approach
c. Data cube
d. Data square
144. _ _ _ _ _ _ _ _ performs off-line aggregation before an OLAP or data mining query is submitted for processing.
a. Object-oriented induction
b. Data cube
c. Attribute-oriented induction
d. Data square
145. The range of t-weight is
a.
b.
c.
d.
146. How can the t-weight and interestingness measures in general be used by the data mining system to display only the concept descriptions that it objectively evaluates as interesting?
a. By threshold
b. By generalization
c. By comparison
d. By characterization
147. The data cube implementation of attribute-oriented induction can be performed by
a. Using defined data cube
b. Using a predefined data cube
c. Using a generalized data cube
d. Using a quantified data cube
148. A _ _ _ _ _ can be represented by a 3-D data cube.
a. Cross-tab
b. Bar chart
c. Pie chart
d. Flow chart
149. Step one of the attribute-oriented induction algorithm is essentially a relational query to collect the task-relevant data into the _ _ _ _ _ _ _ _ _ _ _.
a. Prime relation
b. Secondary relation
c. Working relation
d. Analyzing relation
150. Which of the following relations collects the statistics of the attribute-oriented induction algorithm?
a. Working relation
b. Prime relation
c. Secondary relation
d. Analyzing relation
151. Descriptions can also be visualized in the form of _ _ _ _ _ _ _ _.
a. Cross-relations
b. Cross-checks
c. Cross-boards
d. Cross-tabs
152. Step three of attribute-oriented induction derives the _ _ _ _ _ _ _ relation.
a. Working
b. Prime
c. Secondary
d. Analysing
153. The _ _ _ _ _ _ is an interestingness measure that describes the typicality of each disjunct in the rule, or of each tuple in the corresponding generalized relation.
a. Quantitative rule
b. Quantitative characteristic rule
c. c-weight
d. t-weight
154. The information gain is obtained by
a. Expected information + entropy
b. Entropy - Expected information
c. Expected information entropy
d. Entropy Expected information
155. The expected information
needed to classify a given sample
is
a. I(s1, s2, ..., sm) = \sum_{i=1}^{n} ( /s) ( /s)
b. I(s1, s2, ..., sm) = ( /s) ( /s)
c. I(s1, s2, ..., sm) = - \sum_{i=1}^{n} ( /s) ( /s)
d. I(s1, s2, ..., sm) = - \sum_{i=1}^{n} ( /s) ( /s)
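For revision, the two quantities asked about in questions 154 and 155 can be sketched in a few lines of Python. This is a minimal sketch that assumes the usual textbook definitions: the expected information I(s1, ..., sm) = -Σ (si/s) log2(si/s), the entropy E(A) of an attribute as the weighted average of I over the subsets that A induces, and the gain as their difference. The function names and the toy counts are hypothetical, not taken from the question paper.

```python
import math

def expected_information(class_counts):
    """I(s1, ..., sm) = -sum((si/s) * log2(si/s)) over the non-empty classes."""
    total = sum(class_counts)
    return -sum((c / total) * math.log2(c / total) for c in class_counts if c > 0)

def entropy_of_attribute(partitions):
    """E(A): weighted average of I(...) over the subsets induced by attribute A.

    `partitions` holds one list of per-class counts for each value of A.
    """
    total = sum(sum(p) for p in partitions)
    return sum((sum(p) / total) * expected_information(p) for p in partitions)

def information_gain(class_counts, partitions):
    """Gain(A) = I(s1, ..., sm) - E(A)."""
    return expected_information(class_counts) - entropy_of_attribute(partitions)

# Hypothetical toy data: 4 samples, 2 per class, split by a binary attribute.
overall = [2, 2]
perfect_split = [[2, 0], [0, 2]]   # attribute separates the classes completely
useless_split = [[1, 1], [1, 1]]   # attribute tells us nothing about the class

print(information_gain(overall, perfect_split))  # 1.0
print(information_gain(overall, useless_split))  # 0.0
```

An attribute that separates the classes completely recovers the full one bit of expected information, while an attribute that leaves every subset evenly mixed gains nothing.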
156. Class comparison is also
called
a. composition
b. aggregation
c. discrimination
d. characterization
157. _ _ _ _ _ _ can be used to
perform some preliminary
relevance analysis on the data by
removing or generalizing
attributes having a very large
number of distinct values.
a. Object-oriented induction
b. Attribute-oriented induction
c. Batch-oriented induction
d. Class-oriented induction
158. Class characterization that
includes the analysis of
attribute/dimension relevance is
called _ _ _ _ _ .
a. Analytical comparison
b. Analytical measurement
c. Analytical characterization
d. Analytical difference
159. _ _ _ _ _ _ _ irrelevant and
weakly relevant attributes using
the selected relevance analysis
measure.
a. Insert
b. Update
c. Modify
d. Remove
160. The _ _ _ _ _ class is the
class to be characterized
a. base
b. target
c. contrasting
d. sub
161. The _ _ _ _ _ _ class is the set
of comparable data that are not in
the
target class.
a. base
b. target
c. contrasting
d. sub
162. Generalization is performed
on the _ _ _ _ _ _ _ _ to the level
controlled
by a user or expert-specified
dimension threshold, which
results in a _ _ _ _
___
a. Target class, Prime target class
relation
b. Contrasting class, Prime contrasting
class relation
c. Target class, Secondary target class
relation
d. Contrasting class, Secondary
contrasting class relation
163. Let q_a be a generalized tuple,
and C_j be the target class. The
d-weight is defined as
a. d-weight = condition( ) / count( )
b. d-weight = condition( ) / \sum_{i=1}^{m} count( )
c. d-weight = condition( ) / count( )
d. d-weight = condition( ) / count( )
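The d-weight in question 163 can be illustrated with a short Python sketch. It assumes the usual textbook definition: the number of target-class tuples covered by the generalized tuple, divided by the number of tuples it covers across all classes (target and contrasting). The function name, class labels, and counts are hypothetical.

```python
def d_weight(counts_by_class, target):
    """Share of a generalized tuple's covered tuples that come from `target`.

    `counts_by_class` maps each class name to the number of tuples of that
    class covered by the generalized tuple.
    """
    total = sum(counts_by_class.values())
    if total == 0:
        raise ValueError("the generalized tuple covers no tuples")
    return counts_by_class[target] / total

# Hypothetical counts for one generalized tuple across two classes.
counts = {"graduate": 90, "undergraduate": 210}

print(d_weight(counts, "graduate"))       # 0.3
print(d_weight(counts, "undergraduate"))  # 0.7
```

Because the denominator sums over every class, the d-weights of one generalized tuple across all classes add up to 1.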
164. Can class comparison mining
be implemented efficiently using
data cube techniques?
a. yes
b. no
c. limited
d. difficult
165. Class discrimination is also
called as
a. class comparison
b. class hierarchy
c. class aggregation
d. class concept
166. The set of relevant data in the
database is collected by query
processing and is partitioned
respectively into a target class and
one or a set of _ _ _ _ _ class(es)
a. discrimination
b. contrasting
c. comparable
d. target
167. The range for the d-weight is
a.
b.
c.
d.
168. A _ _ _ _ _ _ d-weight in the
target class indicates that the
concept represented by the
generalized tuple is primarily
derived from the target class
a. Low
b. High
c. Average
d. Middle
169. A _ _ _ _ _ _ d-weight implies
that the concept is primarily
derived from the contrasting class
a. Low
b. High
c. Average
d. Middle
170. A quantitative discriminant rule
for the target class of a given
comparison description is written
in the form
a. ∀x, target_class(x) ⇐ compare(x) [d: d-weight]
b. ∀x, contrasting_class(x) ⇐ condition(x) [d: d-weight]
c. ∀x, contrasting_class(x) ⇐ compare(x) [d: d-weight]
d. ∀x, target_class(x) ⇐ condition(x) [d: d-weight]
171. In d-weight, d stands for
a. divide
b. dead
c. discrimination
d. degree
172. Interquartile range is defined as
a. First quartile - Third quartile
b. First quartile + Third quartile
c. Third quartile + First quartile
d. Third quartile - First quartile
173. One common rule of thumb
for identifying suspected outliers
is to single out values falling at
least _ _ _ _ _ _ _ above the third
quartile or below the first quartile.
a.
b.
c.
d.
174. The most commonly used
percentiles other than the median
are _ _ _ _ _ _
a. Outliers
b. Boxplots
c. Quartiles
d. Modes
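The quartile questions above (172 to 174) can be tried out with Python's standard `statistics` module. This is a sketch under stated assumptions: the sample is made up, the "inclusive" quantile method is just one of several conventions for computing quartiles, and the 1.5 × IQR multiplier is the widely used rule-of-thumb fence rather than something stated in the question paper.

```python
import statistics

# Hypothetical sample with one suspiciously large value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]

# quantiles(..., n=4) returns the three quartile cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1                   # interquartile range
lower_fence = q1 - 1.5 * iqr    # assumed rule-of-thumb multiplier: 1.5
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q2, q3)   # 3.25 5.5 7.75
print(iqr)          # 4.5
print(outliers)     # [50]
```

Other quartile conventions (for example the module's default "exclusive" method) give slightly different cut points on small samples, so the exact fences depend on the convention chosen.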
175. A popularly used visual
representation of a distribution is
the _ _ _ _ _ _
a. Boxplot
b. Outlier
c. Quartile
d. Histogram
176. Dispersion is also called as
a. Mean
b. Variance
c. Median
d. mode
177. Which of the following is a
central tendency measure?
a. Outliers
b. Variance
c. Quartiles
d. Mode
178. Which of the following is a
data dispersion measure?
a. Mean
b. Variance
c. Mode
d. Median
179. The average of the largest
and smallest values in a data set
is called as
a. Median
b. Mean
c. Mid range
d. Mode
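The measures named in questions 176 to 179 can be computed side by side as a quick check. The sketch below uses Python's standard `statistics` module on a made-up sample; the population variance (`pvariance`) is used here as an assumption, and `statistics.variance` would give the sample variance instead.

```python
import statistics

data = [2, 4, 4, 5, 7, 8]  # hypothetical sample

# Measures of central tendency
mean = statistics.mean(data)              # arithmetic average
median = statistics.median(data)          # middle value of the sorted data
mode = statistics.mode(data)              # most frequently occurring value
midrange = (min(data) + max(data)) / 2    # average of smallest and largest

# Measure of dispersion
variance = statistics.pvariance(data)     # population variance

print(mean, median, mode, midrange, variance)
```

For this sample the mean and midrange are both 5, the median is 4.5, the mode is 4, and the population variance is 4.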
180. The _ _ _ _ _ _ _ _ for a set of
data is the value that occurs most
frequently in the set.
a. Median
b. Mean
c. Mid range
d. Mode
181. Which of the following is not a
central tendency measure?
a. Variance
b. Mean
c. Median
d. Mode
182. A _ _ _ _ _ _ _ _ is one of the
most effective graphical methods
for determining whether there is a
relationship, pattern, or trend
between two quantitative variables.
a. q-q plot
b. scatter plot
c. quantile plot
d. q-q-q plot
183. A _ _ _ _ _ _ _ _ is another
important exploratory graphic aid
that adds a smooth curve to a
scatter plot in order to provide
better perception of the
pattern of dependence.
a. Loess curve
b. Scatter curve
c. Bar chart
d. Quantile plot
184. Histograms are also called
_ _ _ _ _ _ _ _ _ histograms.
a. frequency
b. variance
c. quartile
d. outlier
185. The word loess is short for
a. Load compression
b. Local compression
c. Load regression
d. Local regression
186. A _ _ _ _ _ _ _ _ _ consists of
a set of rectangles that reflect the
counts of the classes present in
the given data.
a. Quartile plot
b. q-q plot
c. Histogram
d. Loess curves
187. A _ _ _ _ _ _ is a simple and
effective way to have a first look at
a univariate data distribution.
a. q-q plot
b. scatter plot
c. histogram
d. quantile plot
188. A _ _ _ _ _ _ _ _ _ graphs the
quantiles of one univariate
distribution against the
corresponding quantiles of another.
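The idea in question 188, pairing the quantiles of one univariate distribution with the corresponding quantiles of another, can be sketched without any plotting library. The helper name `qq_pairs`, the samples, and the choice of quartiles with the "inclusive" method are all assumptions made for illustration; each returned pair is one point that such a plot would draw.

```python
import statistics

def qq_pairs(sample_x, sample_y, n=4):
    """Pair the n-1 cut points of sample_x with those of sample_y."""
    qx = statistics.quantiles(sample_x, n=n, method="inclusive")
    qy = statistics.quantiles(sample_y, n=n, method="inclusive")
    return list(zip(qx, qy))

# Hypothetical samples: y is x scaled by 10, so the paired points lie on a line.
x = [1, 2, 3, 4, 5]
y = [10, 20, 30, 40, 50]

print(qq_pairs(x, y))  # [(2.0, 20.0), (3.0, 30.0), (4.0, 40.0)]
```

When the paired points fall along a straight line, the two distributions have the same shape up to a shift or scale.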