1. The data field "ethnic group" can be best described as
   a. Nominal data
2. Converting continuous valued numerical variables to ranges and categories is referred to as discretization.
   a. True
3. Clustering partitions a collection of things into segments whose members share
   a. Similar characteristics
4. A data mining study is specific to addressing a well-defined business task, and different business tasks require
   a. Different sets of data
5. Which of the following is a data mining myth?
   a. Data mining requires a separate, dedicated database.
6. In the Target case study, why did Target send maternity ads to a teen?
   a. Target's analytic model suggested she was pregnant based on her buying habits.
7. Data that is collected, stored, and analyzed in data mining is often private and personal. There is no way to maintain individuals' privacy other than being very careful about physical data security.
   a. False
8. All of the following statements about data mining are true EXCEPT
   a. the process aspect means that data mining should be a one-step process to results.
9. What does the robustness of a data mining method refer to?
   a. its ability to overcome noisy data to make somewhat accurate predictions
10. What is the main reason parallel processing is sometimes used for data mining?
   a. because of the massive data amounts and search efforts involved
11. Third-party providers of publicly available data sets protect the anonymity of the individuals in the data set primarily by
   a. removing identifiers such as names and social security numbers.
12. What does the scalability of a data mining method refer to?
   a. its ability to construct a prediction model efficiently given a large amount of data
13. Which data mining process/methodology is thought to be the most comprehensive, according to kdnuggets.com rankings?
   a. CRISP-DM
14. Which broad area of data mining applications partitions a collection of objects into natural groupings with similar features?
   a. Clustering
15. In the terrorist funding case study, an observed price ________ may be related to income tax avoidance/evasion, money laundering, or terrorist financing.
   a. Deviation
16. Understanding customers better has helped Amazon and others become more successful. The understanding comes primarily from
   a. analyzing the vast data amounts routinely collected.
17. In estimating the accuracy of data mining (or other) classification models, the true positive rate is
   a. the ratio of correctly classified positives divided by the total positive count.
18. The data mining in cancer research case study explains that data mining methods are capable of extracting patterns and ________ hidden deep in large and complex medical databases.
   a. Relationships
19. In the Influence Health case study, what was the goal of the system?
   a. Increasing service use
20. Prediction problems where the variables have numeric values are most accurately defined as
   a. Regressions
21. Which broad area of data mining applications analyzes data, forming rules to distinguish between defined classes?
   a. Classification
22. All of the following statements about data mining are true EXCEPT:
   a. The ideas behind it are relatively new.
23. In the cancer research case study, data mining algorithms that predict cancer survivability with high predictive power are good replacements for medical professionals.
   a. False
24. Identifying and preventing incorrect claim payments and fraudulent activities falls under which type of data mining applications?
   a. Insurance
25. In data mining, finding an affinity of two products to be commonly together in a shopping cart is known as
   a. Association rule mining
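
Worked example. Three of the definitions in this answer key are simple enough to compute by hand: discretization (converting a continuous value to a category), the true positive rate (correctly classified positives divided by the total positive count), and the affinity between two products in a shopping cart, which association rule mining usually measures with support and confidence. The sketch below is only an illustration in plain Python. The function names, bin edges, labels, and sample data are all made up for this example and do not come from the case studies mentioned in the questions.

```python
def discretize(value, edges, labels):
    """Map a continuous value to a category using ascending bin edges.

    A value below edges[0] gets labels[0], a value below edges[1] gets
    labels[1], and so on; anything at or above the last edge gets labels[-1].
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    for edge, label in zip(edges, labels):
        if value < edge:
            return label
    return labels[-1]


def true_positive_rate(actual, predicted, positive=1):
    """Correctly classified positives divided by the total positive count."""
    total_positives = sum(1 for a in actual if a == positive)
    if total_positives == 0:
        raise ValueError("no positive cases in the actual labels")
    true_positives = sum(
        1 for a, p in zip(actual, predicted) if a == positive and p == positive
    )
    return true_positives / total_positives


def support_and_confidence(baskets, antecedent, consequent):
    """Support of the item pair and confidence of antecedent -> consequent."""
    both = sum(1 for b in baskets if antecedent in b and consequent in b)
    with_antecedent = sum(1 for b in baskets if antecedent in b)
    support = both / len(baskets)
    confidence = both / with_antecedent if with_antecedent else 0.0
    return support, confidence


if __name__ == "__main__":
    # Hypothetical age bins: below 18, 18 to 64, 65 and above.
    for age in (15, 34, 70):
        print(age, discretize(age, [18, 65], ["minor", "adult", "senior"]))

    # Hypothetical labels: 3 actual positives, 2 of them predicted correctly.
    actual = [1, 1, 1, 0, 0]
    predicted = [1, 0, 1, 1, 0]
    print("TPR:", true_positive_rate(actual, predicted))

    # Hypothetical shopping carts.
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"milk"},
        {"bread", "milk", "butter"},
    ]
    print("bread -> milk:", support_and_confidence(baskets, "bread", "milk"))
```

In the sample data the classifier finds 2 of the 3 actual positives, so the true positive rate is 2/3. Bread and milk appear together in 2 of 4 carts (support 0.5), and milk appears in 2 of the 3 carts that contain bread (confidence 2/3).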