for more :- http://UandiStar.net WWW.UandiSTAR.ORG
IV B.Tech I Semester – Computer Science & Engineering
Data Warehousing & Data Mining

1. Which of the following is among the most popularly available and rich information repositories?
a. Temporal databases b. Relational databases c. Transactional databases d. Spatial databases
2. Which of the following databases is used to store time-related data?
a. Spatial databases b. Text databases c. Multimedia databases d. Temporal databases
3. From a DWH perspective, data mining can be viewed as an advanced stage of
a. On-Line Transaction Processing b. On-Line Data Processing c. On-Line Analytical Processing d. On-Line Electronic Processing
4. A _ _ _ _ _ _ is a group of heterogeneous databases.
a. Time series databases b. Object oriented databases c. Legacy databases d. Spatial databases
5. Spatial databases include
a. Legacy databases b. Time series databases c. Satellite image databases d. Temporal databases
6. Many people treat data mining as a synonym for another popularly used term,
a. Knowledge discovery in databases b. Knowledge inventory in databases c. Knowledge acceptance in databases d. Knowledge disposal in databases
7. A database is a collection of
a. Related data b. Interrelated data c. Irrelevant data d. Distributed data
8. A relational database is a collection of
a. Tables b. Events c. Attributes d. Values
9. A _ _ _ _ _ _ _ is a repository of information collected from multiple sources, stored under a unified schema, and which usually resides at a single site.
a. Data mining b. Database c. Data warehouse d. Legacy databases
10. Which of the following databases is used to store image, audio, and video data?
a. Heterogeneous databases b. Temporal databases c. Legacy databases d. Multimedia databases
11. What is the single-dimensional association rule for the following rule, which is written in the predicate notation of a multidimensional association rule?
contains(T, "computer") ⇒ contains(T, "software")
a. Computer == software b. Software == computer c. Software == computer d. Computer == software
12. Which of the following analyses attempts to identify attributes that do not contribute to the classification or prediction process?
a. Cluster analysis b. Outlier analysis c. Relevance analysis d. Evolution analysis
100% free SMS: ON<space>UandiStar to 9870807070 for JNTUK Alerts,GATE,CAT… Alerts,Hacking Tips/Tricks… more directly to ur mobile http://Education.UandiStar.net :: for Jntu Previous Papers, Study Materials, Lab Manuals, Online Bits and more..
13. Which of the following is a summarization of the general characteristics or features of a target class of data?
a. Data discrimination b. Data characterization c. Data compression d. Meta data
14. _ _ _ _ _ _ _ is a comparison of the general features of target class data objects with the general features of objects from one or a set of contrasting classes.
a. Data characterization b. Data summarization c. Data discrimination d. Meta data
15. _ _ _ _ _ _ _ interestingness measures are based on user beliefs in the data.
a. Objective b. Descriptive c. Collective d. Subjective
16. _ _ _ _ _ _ mining tasks characterize the general properties of the data in the database.
a. Descriptive b. Predictive c. Metadata d. Data
17. _ _ _ _ _ mining tasks perform inference on the current data in order to make predictions.
a. Descriptive b. Predictive c. Data d. Metadata
18. The derived model may be represented in the form of
a. ER model b. Flow chart c. Decision trees d. DFD
19. Which of the following is a classification of data mining systems?
a. Summarization b. Visualization c. Discrimination d. Characterization
20. _ _ _ _ _ _ _ analysis describes and models regularities or trends for objects whose behavior changes over time.
a. Data evolution b. Cluster c. Outlier d. Summarization
21.
Which of the following issues relates to the diversity of database types?
a. Handling noisy or incomplete data b. Incorporation of background knowledge c. Handling of relational and complex types of data d. Efficiency and scalability of data mining algorithms
22. Which of the following is not a major issue in data mining?
a. Mining methodology and user interaction issues b. Performance issues c. Issues relating to the diversity of database types d. Issues relating to the measurement
23. Processing _ _ _ _ _ queries in operational databases would substantially degrade the performance of operational tasks.
a. On-Line Transaction Processing b. On-Line Electronic Processing c. On-Line Data Processing d. On-Line Analytical Processing
24. An _ _ _ _ _ _ system typically adopts either a star or snowflake model and a subject-oriented database design.
a. On-Line Transaction Processing b. On-Line Electronic Processing c. On-Line Analytical Processing d. On-Line Data Processing
25. The access patterns of an _ _ _ _ system consist mainly of short, atomic transactions.
a. On-Line Analytical Processing b. On-Line Transaction Processing c. On-Line Electronic Processing d. On-Line Data Processing
26. Which of the following approaches requires complex information filtering and integration processes, and competes for resources with processing at local sources?
a. Update-driven approach b. Integrate-driven approach c. Query-driven approach d. Data-driven approach
27. Mining different kinds of knowledge in databases is an issue in
a. Performance issues b. Mining methodology and user interaction issues c. Diversity of database types issues d. Time complexity
28. Pattern evolution is an issue related to
a. Mining methodology and user interaction issues b. Performance issues c. Issues relating to the diversity of database types d. Issues relating to the measurement
29. A DWH is a subject-oriented, integrated, time-variant, and _ _ _ _ _ _ collection of data in support of management's decision-making process.
a. Nonvolatile b. Volatile c. Disintegrated d. Object-oriented
30. An _ _ _ system focuses mainly on the current data within an enterprise or department, without referring to historical data or data in different organizations.
a. On-Line Analytical Processing b. On-Line Data Processing c. On-Line Electronic Processing d. On-Line Transaction Processing
31. The basic characteristic of On-Line Analytical Processing is
a. Informational processing b. Operational processing c. Data processing d. Data cleaning
32. Which of the following cuboids holds the highest level of summarization?
a. Cuboid b. Base cuboid c. Non-base cuboid d. Apex cuboid
33. _ _ _ _ _ _ _ _ _ _ is a visualization operation that rotates the data axes in view in order to provide an alternative presentation of the data.
a. Rollup b. Drill down c. Pivot d. Slice & dice
34. _ _ _ _ _ _ tables can be specified by users or experts, or automatically generated and adjusted based on data distributions.
a. Fact b. Summarized c. Dimension d. Relational
35. _ _ _ _ _ _ _ executes queries involving more than one fact table.
a. Drill-through b. Drill-across c. Drill-down d. Rotate
36. A _ _ _ _ _ allows data to be modeled and viewed in multiple dimensions.
a. Meta data b. Data cube c. Database d. Fact table
37. The major difference between the snowflake and star schema models is that the dimension tables of the snowflake model may be kept in _ _ _ _ form.
a. Standard b. De-normalized c. Normalized d. Multidimensional
38. Which of the following is not a category of measure, based on the kind of aggregate function used?
a. Cumulative b. Distributive c. Algebraic d. Holistic
39. A concept hierarchy that is a total or partial order among attributes in a database schema is called a _ _ _ _ _ _ _ _ _ _ _ hierarchy.
a. Set-grouping b. Grouping c. Decision d. Schema
40. Which of the following focuses on socioeconomic applications?
a. Statistical database systems b. On-Line Analytical Processing systems c. Spatial database systems d. Temporal database systems
41. A _ _ _ _ _ _ _ _ _ model consists of radial lines emanating from a central point, where each line represents a concept hierarchy for a dimension.
a. Cube net b. Triangle net c. Square net d. Star net
42. Which of the following is constructed where the enterprise warehouse is the sole custodian of all warehouse data, which is then distributed to the various dependent data marts?
a. Enterprise DWH b. Two-tier DWH c. Multi-tier DWH d. Virtual warehouse
43. Which of the following is a Multidimensional On-Line Analytical Processing server?
a. Essbase b. Database c. Swiss base d. Red Brick
44. The _ _ _ _ _ _ view includes fact tables and dimension tables.
a. DWH b. Top-down c. Data source d. Business query
45. Which of the following is a Hybrid OLAP server?
a. MS SQL Server 1.0 b. MS SQL 5.0 c. MS SQL Server 7.0 d. MS SQL Server 3.0
46. ETL stands for
a. Evaluate, Transport and Link b. Extract, Transform and Load c. Error, Tracking and Load d. Extract, Transient and Load
47. To architect the DWH, the major driving factor to support is
a. An inability to cope with requirements evolution b. Not populating the warehouse c. Day-to-day management of the warehouse d. Supporting On-Line Transaction Processing
48. A _ _ _ _ _ _ _ contains a subset of corporate-wide data that is of value to a specific group of users.
a. Enterprise warehouse b. Virtual warehouse c. Data warehouse d. Data mart
49. A _ _ _ _ _ _ _ is a set of views over operational databases.
a. Enterprise warehouse b. Virtual warehouse c. Data warehouse d. Data mart
50. What kind of intermediate servers stand in between a relational back-end server and client front-end tools?
a. Hybrid OLAP servers b. Multidimensional OLAP servers c. Relational OLAP servers d. Specialized SQL servers
51. Choose the _ _ _ _ _ _ _ _ _ that will populate each fact table record.
a. Measures b. Dimensions c. Grain d. Business process
52. How many cuboids are there in an n-dimensional data cube?
a. b. c. d.
53. A metadata repository contains
a. Operational meta data b. Data irrelevant to system performance c. The mapping from the DWH to the operational environment d. Summarized data
54. Which of the following supports bitmap indices?
a. Sybase IQ b. Oracle 7 c. COBOL d. SQL
55. _ _ _ _ _ _ _ are created for the data names and definitions of the given warehouse.
a. Data cube b. Summarized data c. Meta data d. Detailed information
56. The chunking technique involves "overlapping" some of the aggregation computations; it is referred to as _ _ _ _ _ aggregation in data cube computation.
a. Two way array b.
Three way array c. Multi way array d. Sparse array
57. The _ _ _ _ _ _ _ operator computes aggregates over all subsets of the dimensions specified in the operation.
a. Data base b. Compute cube c. Define cube d. Group by
58. Which of the following is a subcube that is small enough to fit into the memory available for cube computation?
a. Bulk b. Array c. Structure d. Chunk
59. The bitmapped join indexing method is an integrated form of
a. Composite join indexing and bitmap indexing b. Join indexing and composite join indexing c. Join indexing and bitmap indexing d. Bitmap indexing and outer join indexing
60. A set of attributes in a relation schema that forms a primary key for another relation schema is called a _ _ _ _ _ _ _
a. Primary key b. Foreign key c. Secondary key d. Composite key
61. Which of the following typically gathers data from multiple, heterogeneous, and external sources?
a. Data cleaning b. Load c. Refresh d. Data extraction
62. OLAM is particularly important for the following reason:
a. High quality of data in DWH b. Data processing c. OLTP-based exploratory data analysis d. Online selection of data mining functions
63. Which of the following sets a good example for interactive data analysis and provides the necessary preparations for exploratory data mining?
a. OLP b. OLAP c. OLTP d. OLDP
64. Which of the following is not an exception indicator?
a. Out Exp b. Self Exp c. In Exp d. Path Exp
65. _ _ _ _ _ _ _ _ _ can help business managers find and reach more suitable customers, as well as gain critical business insights that may help to drive market share and raise profits.
a. Data warehouse b. Data mining c. Data summarization d. Data processing
66. _ _ _ _ _ _ _ _ _ _ _ is an alternative approach in which precomputed measures indicating data exceptions are used to guide the user in the data analysis process at all levels of aggregation.
a. Hypothesis-driven exploration b. Inventory-driven exploration c. Discovery-driven exploration d. Exception-driven exploration
67. Which of the following is an exception indicator that indicates the degree of surprise of the cell value, relative to other cells at the same level of aggregation?
a. Out Exp b. In Exp c. Path Exp d. Self Exp
68. _ _ _ _ _ is a powerful paradigm that integrates OLAP with data mining technology.
a. Online Analytical Modeling b. Online Analytical Machine c. Online Analytical Mining d. Online Analytical Monitoring
69. Data warehouse application is _ _ _ _ _
a. Data processing b. Transaction processing c. Data cube d. Data mining
70. _ _ _ _ _ _ _ _ _ cubes compute complex queries involving multiple dependent aggregates at multiple granularities.
a. Multi feature b. Data c. Meta d. Solid
71. Which of the following performs a linear transformation on the original data?
a. Z-score normalization b. Normalization with decimal scaling c. Zero-standard deviation d. Min-max normalization
72. Which of the following is the best method for handling missing values in data cleaning?
a. Fill in the missing value manually b. Use the most probable value to fill in the missing value c. Use the attribute mean to fill in the missing value d. Use a global constant to fill in the missing value
73. The minimum and maximum values in a given bin are identified as the
a. Bin means b. Bin average c. Bin medians d. Bin boundaries
74.
Which of the following is a data transformation operation?
a. Normalization b. Regression c. Clustering d. Binning
75. The correlation between attributes A and B can be measured by
a. b. c. d.
76. _ _ _ _ _ methods smooth a sorted data value by consulting its neighborhood, i.e., the values around it.
a. Clustering b. Binning c. Regression d. Data reduction
77. Z-score normalization is also called
a. Min-max normalization b. Zero-standard deviation normalization c. Zero-mean normalization d. Normalization by decimal scaling
78. _ _ _ _ _ _ is a random error or variance in a measured variable.
a. Bin b. Cluster c. Noise d. Regression
79. The process in which data are consolidated into forms appropriate for mining is called
a. Data reduction b. Data redundancy c. Data cleaning d. Data transformation
80. Which of the following is a decision tree algorithm?
a. C3.2 b. ID3 c. PP2 d. DIM
81. If the tuples in D are grouped into M mutually disjoint clusters, then a simple random sample of m clusters can be obtained, where m < M. Which of the following suits the above sentence?
a. Stratified sample b. SRS without replacement c. Cluster sample d. SRS with replacement
82. Multidimensional index trees include
a. A-trees b. T-trees c. P-trees d. R-trees
83. In which of the following data reduction strategies may irrelevant, weakly relevant, or redundant attributes be detected and removed?
a. Data cube aggregation b. Dimension reduction c. Data compression d. Numerosity reduction
84. In database systems, _ _ _ _ _ are primarily used for providing fast data access.
a. Red-black trees b. Game trees c. Multidimensional index trees d. Splay trees
85. If the mining task is classification, and the mining algorithm itself is used to determine the attribute subset, then this is called a _ _ _ _ _ _ approach.
a. Filter b. Reduction c. Smoothing d. Wrapper
86. The discrete wavelet transform is closely related to the _ _ _ _ _ _ _ transform.
a. Discrete Fourier b. Fourier c. Laplace d. Wavelet
87. Principal components analysis is also called the
a. Karhunen-Loeve method b. Kinen-liva method c. Kruskal-learn method d. Kutni-lara method
88. _ _ _ _ _ _ can be used as a data reduction technique since it allows a large data set to be represented by a much smaller random subset of the data.
a. Clustering b. Regression c. Histograms d. Sampling
89. Log-linear models are
a. Parametric methods b. Discrete methods c. Non-parametric methods d. Non-discrete methods
90. Which of the following is a method for generating concept hierarchies for categorical data?
a. Specification of a portion of a hierarchy by implicit data grouping b. Specification of their partial ordering, but not of a set of attributes c. Specification of a set of attributes, but not of their partial ordering d. Specification of only a partial set of entities
91. Which of the following methods uses class information?
a. Histogram analysis b. Binning c. Cluster analysis d. Entropy-based discretization
92. _ _ _ _ _ _ _ _ _ hierarchies for categorical attributes or dimensions typically involve a group of attributes.
a. Discretization b. Semantic c. Index d. Concept
93. Which of the following is based on the maximal asset values, which may lead to a highly biased hierarchy?
a. Cluster analysis b. Segmentation c. Binning d. Histogram analysis
94. The _ _ _ _ _ can be used to segment numeric data into relatively uniform, "natural" intervals.
a. 1-2-3 rule b. 2-3-4 rule c. 3-4-5 rule d. 4-5-6 rule
95.
_ _ _ _ _ _ _ _ hierarchies for numeric attributes can be constructed automatically based on data distribution analysis.
a. Concept b. Discretization c. Tree d. Index
96. _ _ _ _ _ _ _ techniques can be used to reduce the number of values for a given continuous attribute, by dividing the range of the attribute into intervals.
a. Concept hierarchy b. Discretization c. Tree-based d. Index
97. A _ _ _ _ _ _ _ _ _ algorithm can be applied to partition data into groups.
a. Binning b. Histogram c. Clustering d. Entropy-based
98. An information-based measure called _ _ _ _ can be used to recursively partition the values of a numeric attribute A, resulting in a hierarchical discretization.
a. Entropy b. Cluster c. Binning d. Segmentation
99. The kinds of knowledge include
a. Image analysis b. Query process c. Association d. Multimedia analysis
100. Which of the following is a simplicity measure?
a. Rule strength b. Rule quality c. Rule reliability d. Rule length
101. _ _ _ _ _ _ hierarchies can be used to refine or enrich schema-defined hierarchies, when the two types of hierarchies are combined.
a. Schema b. Set-grouping c. Operation-derived d. Rule-based
102. _ _ _ _ _ _ _ are those that contribute new information or increased performance to the given pattern set.
a. Utility patterns b. Certainty patterns c. Novelty patterns d. Simplicity patterns
103. Certainty factor is also known as
a. Rule length b. Noise threshold c. Minable view d. Rule strength
104. Which of the following primitives specifies the data mining functions to be performed?
a. Task-relevant data b. The kind of knowledge to be mined c. Background knowledge d. Interestingness measures
105. _ _ _ _ _ _ _ may be used to guide the mining process or, after discovery, to evaluate the discovered patterns.
a. Task-relevant data b. The kind of knowledge to be mined c. Background knowledge d. Interestingness measures
106. A _ _ _ _ _ hierarchy is a total or partial order among attributes in the database schema.
a. Schema b. Set-grouping c. Operation-derived d. Rule-based
107. Given a set of task-relevant data tuples, the confidence of "A ⇒ B" is defined as
a. b. c. d.
108. _ _ _ _ _ hierarchies include the decoding of information-encoded strings, information extraction from complex data objects, and data clustering.
a. Rule-based b. Operation-derived c. Schema d. Set-grouping
109. For association rules of the form "A ⇒ B", where A and B are sets of items, support is defined as
a. b. c. d.
111. Mining with the use of _ _ _ _ allows additional flexibility for ad hoc rule mining.
a. Image patterns b. Data patterns c. Information patterns d. Meta patterns
112. Which of the following clauses lists the attributes or dimensions for exploration?
a. Order by b. Group by c. Having d. In relevance to
113. Which of the following clauses uses the meta pattern?
a. Analyze b. In relevance to c. Matching d. Use data warehouse
114. Which of the following clauses is used for discrimination?
a. Mine characteristics b. Mine discriminant c. Mine association d. Mine comparison
115. The expansion of DMQL is
a. Data Modeling Queue Level b. Design Modeling Query Language c. Data Mining Query Language d. Data & Meta data Query Language
110. Which of the following clauses is a task-relevant data primitive?
a. In relevance to b.
Use for warehouse c. Analysis d. Order by
116. The _ _ _ _ _ clause, when used for characterization, specifies aggregate measures, such as count, sum, or count%.
a. Use database b. Analyze c. Matching d. Use hierarchy
117. Which of the following clauses specifies the condition by which groups of data are considered relevant?
a. Having b. Group by c. Order by d. Analyze
118. The _ _ _ _ _ _ _ _ statement is used to specify the kind of knowledge to be mined.
a. Knowledge-mine-specification b. Mine-knowledge-specification c. Knowledge-specification-mine d. Specification-mine-knowledge
119. An example of interestingness measures and threshold values is
a. Without support threshold= b. With confidence threshold= c. Without confidence threshold= d. With support threshold=
120. CRISP-DM addresses an issue such as
a. Mapping from data mining problems to business issues b. Capturing and misunderstanding the data c. Disintegrating data mining results within the business context d. Deploying and maintaining data mining results
121. An example of a set-grouping hierarchy is
a. Define hierarchy age-hierarchy for age as customer on level1: {young, middle-aged, senior} < level0: all; level2: {20, ..., 39} < level1: young; level2: {40, ..., 59} < level1: middle-aged; level2: {60, ..., 89} < level1: senior
b. Define hierarchy age-hierarchy as age for customer on level1: {young, middle-aged, senior} < level0: all; level2: {20, ..., 39} < level1: young; level2: {40, ..., 59} < level1: middle-aged; level2: {60, ..., 89} < level1: senior
c.
Define hierarchy age-hierarchy for age on customer as level1: {young, middle-aged, senior} < level0: all; level2: {20, ..., 39} < level1: young; level2: {40, ..., 59} < level1: middle-aged; level2: {60, ..., 89} < level1: senior
d. Define hierarchy age-hierarchy on age for customer as level1: {young, middle-aged, senior} < level0: all; level2: {20, ..., 39} < level1: young; level2: {40, ..., 59} < level1: middle-aged; level2: {60, ..., 89} < level1: senior
122. Which of the following data mining languages uses SQL-like syntax and serves as a rule generation query for mining association rules?
a. MINE RULE operator b. RULE MINE operator c. DATA MINE operator d. DWH operator
123. Which of the following is not a data mining language?
a. DMQL b. MSQL c. PSQL d. OLE DB for
124. The syntax for a schema hierarchy is
a. Define hierarchy location-hierarchy on address as [street, city, country]
b. Define hierarchy location-hierarchy as address on [street, city, country]
c. Define hierarchy location-hierarchy from address to [street, city, country]
d. Define hierarchy location-hierarchy for address all [street, city, country]
125. The DMQL statement syntax is
a. display as result_form b. display result_form c. display on result_form d. display for result_form
126. Which of the following is a data mining query language?
a. PSQL b. QSQL c. MSQL d. RSQL
127. _ _ _ _ _ is used for efficient implementations of a few essential data mining primitives.
a. No coupling b. Loose coupling c. Tight coupling d. Semi tight coupling
128. _ _ _ _ _ _ _ is a compromise between loose and tight coupling.
a. No coupling b. Loose coupling c. Tight coupling d.
Semi tight coupling
129. Which of the following coupling schemes is used to fetch data from a data repository managed by database systems?
a. No coupling b. Loose coupling c. Tight coupling d. Semi tight coupling
130. A well-designed data mining system should offer _ _ _ _ _ _ _ with a data warehouse system.
a. Semi tight coupling b. No coupling c. Loose coupling d. Normal coupling
131. With which of the following is it difficult to achieve high scalability and good performance with large data sets?
a. No coupling b. Tight coupling c. Semi tight coupling d. Loose coupling
132. _ _ _ _ _ _ _ _ means that a data mining system will not utilize any function of a data warehouse system.
a. Loose coupling b. Semi tight coupling c. Loose coupling d. No coupling
133. _ _ _ _ _ _ _ _ means that a data mining system is smoothly integrated into the database system.
a. No coupling b. Loose coupling c. Tight coupling d. Semi tight coupling
134. Which of the following provides a concise and succinct summarization of the given collection of data?
a. Comparison b. Characterization c. Summarization d. Aggregation
135. _ _ _ _ _ _ _ _ data mining describes the data set in a concise and summarative manner and presents interesting general properties of the data.
a. Descriptive b. Predictive c. Active d. Constructive
136. _ _ _ _ _ _ data mining analyzes the data in order to construct one or a set of models, and attempts to predict the behavior of new data sets.
a. Descriptive b. Predictive c. Active d. Constructive
137. Attribute removal is based on the following rule: if there is a large set of distinct values for an attribute of the initial working relation, but
a.
There is a generalization operator on the attribute b. There is no generalization operand on the attribute c. There is no generalization operator on the attribute d. There is no aggregation operator on the attribute
138. On-line analytical processing in data warehouses is a purely _ _ _ _ -controlled process.
a. Machine b. Database c. Developer d. User
139. Which of the following approaches is used to control the generalization process?
a. Generalized relation threshold control b. Generalized class threshold control c. Generalized dimension threshold control d. Generalized query threshold control
140. Many current OLAP systems confine dimensions to _ _ _ _ _ _ _ _ data.
a. Numeric b. Non numeric c. Meta d. Summarized
141. _ _ _ _ _ _ _ is a process that abstracts a large set of task-relevant data in a database from a relatively low conceptual level to higher conceptual levels.
a. Data realization b. Data characterization c. Data summarization d. Data generalization
142. The _ _ _ _ _ _ approach can be considered as a data warehouse-based, pre-computation-oriented, materialized-view approach.
a. Object-oriented induction b. Data cube c. Attribute-oriented induction d. Data square
143. Which of the following approaches is a relational database query-oriented, generalization-based, on-line data analysis technique?
a. Attribute-oriented induction b. Object-oriented approach c. Data cube d. Data square
144. _ _ _ _ _ _ _ _ performs off-line aggregation before an OLAP or data mining query is submitted for processing.
a. Object-oriented induction b. Data cube c. Attribute-oriented induction d. Data square
145. The range of t-weight is
a. b. c. d.
146. How can the t-weight and interestingness measures in general be used by the data mining system to display only the concept descriptions that it objectively evaluates as interesting?
a. By threshold b. By generalization c. By comparison d. By characterization
147. The data cube implementation of attribute-oriented induction can be performed by
a. Using defined data cube b. Using a predefined data cube c. Using a generalized data cube d. Using a quantified data cube
148. A _ _ _ _ _ can be represented by a 3-D data cube.
a. Cross-tab b. Bar chart c. Pie chart d. Flow chart
149. Step one of the attribute-oriented induction algorithm is essentially a relational query to collect the task-relevant data into the _ _ _ _ _ _ _ _
a. Prime relation b. Secondary relation c. Working relation d. Analyzing relation
150. Which of the following relations collects the statistics of the attribute-oriented induction algorithm?
a. Working relation b. Prime relation c. Secondary relation d. Analyzing relation
151. Descriptions can also be visualized in the form of _ _ _ _ _ _ _
a. Cross-relations b. Cross-checks c. Cross-boards d. Cross-tabs
152. Step three of attribute-oriented induction derives the _ _ _ _ _ relation.
a. Working b. Prime c. Secondary d. Analyzing
153. The _ _ _ _ _ _ is an interestingness measure that describes the typicality of each disjunct in the rule, or of each tuple in the corresponding generalized relation.
a. Quantitative rule b. Quantitative characteristic rule c. c-weight d. t-weight
154. The information gain is obtained by
a. Expected information + entropy b. Entropy - Expected information c. Expected information entropy d. Entropy Expected information
155.
The expected information needed to classify a given sample is
a. I(s1, s2, ..., sm) = Σ_{i=1}^{n} ( /s) ( /s)
b. I(s1, s2, ..., sm) = ( /s) ( /s)
c. I(s1, s2, ..., sm) = - Σ_{i=1}^{n} ( /s) ( /s)
d. I(s1, s2, ..., sm) = - Σ_{i=1}^{n} ( /s) ( /s)
156. Class comparison is also called
a. composition b. aggregation c. discrimination d. characterization
157. _ _ _ _ _ _ can be used to perform some preliminary relevance analysis on the data by removing or generalizing attributes having a very large number of distinct values.
a. Object-oriented induction b. Attribute-oriented induction c. Batch-oriented induction d. Class-oriented induction
158. Class characterization that includes the analysis of attribute/dimension relevance is called _ _ _ _ _.
a. Analytical comparison b. Analytical measurement c. Analytical characterization d. Analytical difference
159. _ _ _ _ _ _ _ irrelevant and weakly relevant attributes using the selected relevance analysis measure.
a. Insert b. Update c. Modify d. Remove
160. The _ _ _ _ _ class is the class to be characterized.
a. base b. target c. contrasting d. sub
161. The _ _ _ _ _ _ class is the set of comparable data that are not in the target class.
a. base b. target c. contrasting d. sub
162. Generalization is performed on the _ _ _ _ _ _ _ _ to the level controlled by a user- or expert-specified dimension threshold, which results in a _ _ _ _ _ _.
a. Target class, Prime target class relation b. Contrasting class, Prime contrasting class relation c. Target class, Secondary target class relation d. Contrasting class, Secondary contrasting class relation
163. Let be a generalized tuple, and be the target class; the d-weight is defined as
a. d-weight = condition( ) / count( )
b. d-weight = condition( ) / Σ_{i=1}^{m} count( )
c. d-weight = condition( ) / count( )
d. d-weight = condition( ) / count( )
164. Can class comparison mining be implemented efficiently using data cube techniques?
a.
yes b. no c. limited d. difficult
165. Class discrimination is also called
a. class comparison b. class hierarchy c. class aggregation d. class concept
166. The set of relevant data in the database is collected by query processing and is partitioned respectively into a target class and one or a set of _ _ _ _ class(es).
a. discrimination b. contrasting c. comparable d. target
167. The range for the d-weight is
a. b. c. d.
168. A _ _ _ _ _ _ d-weight in the target class indicates that the concept represented by the generalized tuple is primarily derived from the target class.
a. Low b. High c. Average d. Middle
169. A _ _ _ _ _ _ d-weight implies that the concept is primarily derived from the contrasting class.
a. Low b. High c. Average d. Middle
170. A quantitative discriminant rule for the target class of a given comparison description is written in the form
a. x, target_class(x) compare(x) [d: d-weight]
b. x, contrasting_class(x) condition(x) [d: d-weight]
c. x, contrasting_class(x) compare(x) [d: d-weight]
d. x, target_class(x) condition(x) [d: d-weight]
171. In d-weight, d stands for
a. divide b. dead c. discrimination d. degree
172. The interquartile range is defined as
a. First quartile - Third quartile b. First quartile + Third quartile c. Third quartile + First quartile d. Third quartile - First quartile
173. One common rule of thumb for identifying suspected outliers is to single out values falling at least _ _ _ _ _ _ _ above the third quartile or below the first quartile.
a. b. c. d.
174. The most commonly used percentiles other than the median are _ _ _ _ _ _.
a. Outliers b. Boxplots c. Quartiles d. Modes
175.
A popularly used visual representation of a distribution is the _ _ _ _ _ _.
a. Boxplot b. Outlier c. Quartile d. Histogram
176. Dispersion is also called
a. Mean b. Variance c. Median d. Mode
177. Which of the following is a central tendency measure?
a. Outliers b. Variance c. Quartiles d. Mode
178. Which of the following is a data dispersion measure?
a. Mean b. Variance c. Mode d. Median
179. The average of the largest and smallest values in a data set is called the
a. Median b. Mean c. Mid-range d. Mode
180. The _ _ _ _ _ _ _ _ for a set of data is the value that occurs most frequently in the set.
a. Median b. Mean c. Mid-range d. Mode
181. Which of the following is not a central tendency measure?
a. Variance b. Mean c. Median d. Mode
182. A _ _ _ _ _ _ _ _ is one of the most effective graphical methods for determining whether there is a relationship, pattern, or trend between two quantitative variables.
a. q-q plot b. scatter plot c. quantile plot d. q-q-q plot
183. A _ _ _ _ _ _ _ _ is another important exploratory graphic aid that adds a smooth curve to a scatter plot in order to provide better perception of the pattern of dependence.
a. Loess curve b. Scatter curve c. Bar chart d. Quantile plot
184. Histograms are also called _ _ _ _ _ _ _ _ _ histograms.
a. frequency b. variance c. quartile d. outlier
185. The word loess is short for
a. Load compression b. Local compression c. Load regression d. Local regression
186. A _ _ _ _ _ _ _ _ _ consists of a set of rectangles that reflect the counts of the classes present in the given data.
a. Quartile plot b. q-q plot c. Histogram d. Loess curves
187. A _ _ _ _ _ _ is a simple and effective way to have a first look at a univariate data distribution.
a. q-q plot b. scatter plot c. histogram d. quantile plot
188. A _ _ _ _ _ _ _ _ _ graphs the quantiles of one univariate distribution against the corresponding quantiles of another.
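Questions 172 to 181 turn on a handful of descriptive statistics (quartiles, interquartile range, outliers, mean, median, mode, mid-range, variance). As a minimal sketch for checking such answers numerically, the following uses Python's standard statistics module. The sample values and the describe helper are hypothetical, and the 1.5 x IQR outlier fence is the conventional rule of thumb, assumed here rather than taken from this question bank.

```python
import statistics


def describe(values):
    """Return central-tendency and dispersion measures for a list of numbers."""
    data = sorted(values)

    # Quartiles: the three cut points that split the sorted data into four parts.
    # "inclusive" treats the data as the whole population (min and max included).
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")

    # Interquartile range = third quartile - first quartile.
    iqr = q3 - q1

    # Assumed rule of thumb: flag values more than 1.5 * IQR outside the quartiles.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr

    return {
        # Central tendency measures
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "midrange": (data[0] + data[-1]) / 2,  # average of smallest and largest
        # Dispersion measures
        "variance": statistics.pvariance(data),
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }


# Hypothetical sample with one suspiciously large value.
sample = [1, 2, 2, 3, 4, 5, 6, 7, 30]
summary = describe(sample)

for name, value in summary.items():
    print(f"{name}: {value}")
```

In this sample the single large value pulls the mean and the mid-range upward while leaving the median and mode unchanged, which is why the median and quartiles are usually preferred for skewed data.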