Detailed Table of Contents

Volume I: Discovering Knowledge in Data: An Introduction to Data Mining
Brief Table of Contents
Preface
Chapter 1. An Introduction to Data Mining
Chapter 2. Data Preprocessing
Chapter 3. Exploratory Data Analysis
Chapter 4. Statistical Approaches to Estimation and Prediction
Chapter 5. K-Nearest Neighbor
Chapter 6. Decision Trees
Chapter 7. Neural Networks
Chapter 8. Hierarchical and K-Means Clustering
Chapter 9. Kohonen Networks
Chapter 10. Association Rules
Chapter 11. Model Evaluation Techniques
Epilogue
Detailed Table of Contents

Preface

Chapter 1. An Introduction to Data Mining
o What is Data Mining?
o Why Data Mining?
o The Need for Human Direction of Data Mining
o The Cross-Industry Standard Process: CRISP-DM
o Case Study 1: Analyzing Automobile Warranty Claims
o Fallacies of Data Mining
o What Tasks Can Data Mining Accomplish?
o Case Study 2: Predicting Abnormal Stock Market Returns Using Neural Networks
o Case Study 3: Mining Association Rules from Legal Databases
o Case Study 4: Predicting Corporate Bankruptcies Using Decision Trees
o Case Study 5: Profiling the Tourism Market Using K-Means Clustering
o Chapter 1 Bibliography
o Chapter 1 Exercises

Chapter 2. Data Preprocessing
o Why Do We Need to Preprocess the Data?
o Data Cleaning
o Handling Missing Data
o Identifying Misclassifications
o Graphical Methods for Identifying Outliers
o Data Normalization
o Min-Max Normalization
o Z-Score Standardization
o Numerical Methods for Identifying Outliers
o Using Z-Scores for Identifying Outliers
o Robust Detection of Outliers
o Chapter 2 Bibliography
o Chapter 2 Exercises
o Chapter 2 Hands-On Analysis

Chapter 3. Exploratory Data Analysis
o Hypothesis Testing vs. Exploratory Data Analysis
o EDA: Getting to Know the Data Set
o EDA: Dealing with Correlated Variables
o EDA: Exploring Categorical Variables
o Using EDA to Uncover Anomalous Fields
o EDA: Exploring Numeric Variables
o EDA: Exploring Multivariate Relationships
o EDA: Selecting Interesting Subsets of the Data for Further Investigation
o Binning
o Chapter 3 Bibliography
o Chapter 3 Exercises
o Chapter 3 Hands-On Analysis

Chapter 4. Statistical Approaches to Estimation and Prediction
o The Data Mining Tasks in Discovering Knowledge in Data
o Statistical Approaches to Estimation and Prediction
o Univariate Methods: Measures of Center and Spread
o Statistical Inference
o How Confident Are We in Our Estimates?
o Confidence Interval Estimation
o Simple Linear Regression
o The Dangers of Extrapolation
o Confidence Intervals for the Mean Value of y Given x
o Prediction Intervals for a Randomly Chosen Value of y Given x
o Multiple Regression
o Verifying Model Assumptions
o Chapter 4 Bibliography
o Chapter 4 Exercises
o Chapter 4 Hands-On Analysis

Chapter 5. K-Nearest Neighbor
o Supervised Learning vs. Unsupervised Learning
o A Methodology for Supervised Modeling
o The Classification Task
o The K-Nearest Neighbor Algorithm
o The Distance Function
o The Combination Function
o Weighted Voting
o Quantifying Attribute Relevance: Stretching the Axes
o Database Considerations
o K-Nearest Neighbor for Estimation and Prediction
o Choosing K
o Chapter 5 Bibliography
o Chapter 5 Exercises

Chapter 6. Decision Trees
o Decision Trees
o Classification and Regression Trees
o The C4.5 Algorithm
o Decision Rules
o A Comparison of the C5.0 and CART Algorithms Applied to Real Data
o Chapter 6 Bibliography
o Chapter 6 Exercises
o Chapter 6 Hands-On Analysis

Chapter 7. Neural Networks
o Input and Output Encoding
o Neural Networks for Estimation and Prediction
o A Simple Example of a Neural Network
o The Sigmoid Activation Function
o Backpropagation
o The Gradient Descent Method
o The Backpropagation Rules
o An Example of Backpropagation
o Termination Criteria
o The Learning Rate 
o The Momentum Term 
o Sensitivity Analysis
o An Application of Neural Network Modeling
o Chapter 7 Bibliography
o Chapter 7 Exercises
o Chapter 7 Hands-On Analysis

Chapter 8. Hierarchical and K-Means Clustering
o The Clustering Task
o Hierarchical Clustering Methods
o K-Means Clustering
o An Application of K-Means Clustering Using SAS Enterprise Miner
o Using Cluster Membership to Predict Churn
o Chapter 8 Bibliography
o Chapter 8 Exercises
o Chapter 8 Hands-On Analysis

Chapter 9. Kohonen Networks
o Self-Organizing Maps
o Kohonen Networks
o An Example
o Cluster Validity
o An Application of Clustering Using Kohonen Networks
o Interpreting the Clusters
o Cluster Profiles
o Using Cluster Membership as Input to Downstream Data Mining Models
o Chapter 9 Bibliography
o Chapter 9 Exercises
o Chapter 9 Hands-On Analysis

Chapter 10. Data Mining Techniques: Association Rules
o Affinity Analysis and Market Basket Analysis
o Data Representation for Market Basket Analysis
o Support, Confidence, Frequent Itemsets, and the A Priori Property
o How Does the A Priori Algorithm Work (Part 1)? Generating Frequent Itemsets
o How Does the A Priori Algorithm Work (Part 2)? Generating Association Rules
o The Extension from Flag Data to General Categorical Data
o An Information Theoretic Approach: The Generalized Rule Induction Method
o The J-Measure
o An Application of Generalized Rule Induction
o When Not To Use Association Rules
o Do Association Rules Represent Supervised or Unsupervised Learning?
o Local Patterns vs. Global Models
o Chapter 10 Bibliography
o Chapter 10 Exercises
o Chapter 10 Hands-On Analysis

Chapter 11. Model Evaluation Techniques
o Model Evaluation Techniques for the Description Task
o Model Evaluation Techniques for the Estimation and Prediction Tasks
o Model Evaluation Techniques for the Classification Task
o Error Rate, False Positives, and False Negatives
o Misclassification Cost Adjustment to Reflect Real-World Concerns
o Decision Cost / Benefit Analysis
o Lift Charts and Gains Charts
o Interweaving Model Evaluation with Model Building
o Confluence of Results: Applying a Suite of Models
o Chapter 11 Bibliography
o Chapter 11 Exercises
o Chapter 11 Hands-On Analysis

Epilogue. We’ve Only Just Begun: An Invitation to Data Mining Methods and Models
