Table of Contents

Preface  ix

1. Basic Math and Calculus Review  1
    Number Theory  2
    Order of Operations  3
    Variables  5
    Functions  6
    Summations  11
    Exponents  13
    Logarithms  16
    Euler’s Number and Natural Logarithms  18
    Euler’s Number  18
    Natural Logarithms  21
    Limits  22
    Derivatives  24
    Partial Derivatives  28
    The Chain Rule  31
    Integrals  33
    Conclusion  39
    Exercises  39

2. Probability  41
    Understanding Probability  42
    Probability Versus Statistics  43
    Probability Math  44
    Joint Probabilities  44
    Union Probabilities  45
    Conditional Probability and Bayes’ Theorem  47
    Joint and Union Conditional Probabilities  49
    Binomial Distribution  51
    Beta Distribution  53
    Conclusion  60
    Exercises  61

3. Descriptive and Inferential Statistics  63
    What Is Data?  63
    Descriptive Versus Inferential Statistics  65
    Populations, Samples, and Bias  66
    Descriptive Statistics  69
    Mean and Weighted Mean  70
    Median  71
    Mode  73
    Variance and Standard Deviation  73
    The Normal Distribution  78
    The Inverse CDF  85
    Z-Scores  87
    Inferential Statistics  89
    The Central Limit Theorem  89
    Confidence Intervals  92
    Understanding P-Values  95
    Hypothesis Testing  96
    The T-Distribution: Dealing with Small Samples  104
    Big Data Considerations and the Texas Sharpshooter Fallacy  105
    Conclusion  107
    Exercises  107

4. Linear Algebra  109
    What Is a Vector?  110
    Adding and Combining Vectors  114
    Scaling Vectors  116
    Span and Linear Dependence  119
    Linear Transformations  121
    Basis Vectors  121
    Matrix Vector Multiplication  124
    Matrix Multiplication  129
    Determinants  131
    Special Types of Matrices  136
    Square Matrix  136
    Identity Matrix  136
    Inverse Matrix  136
    Diagonal Matrix  137
    Triangular Matrix  137
    Sparse Matrix  138
    Systems of Equations and Inverse Matrices  138
    Eigenvectors and Eigenvalues  142
    Conclusion  145
    Exercises  146

5. Linear Regression  147
    A Basic Linear Regression  149
    Residuals and Squared Errors  153
    Finding the Best Fit Line  157
    Closed Form Equation  157
    Inverse Matrix Techniques  158
    Gradient Descent  161
    Overfitting and Variance  167
    Stochastic Gradient Descent  169
    The Correlation Coefficient  171
    Statistical Significance  174
    Coefficient of Determination  179
    Standard Error of the Estimate  180
    Prediction Intervals  181
    Train/Test Splits  185
    Multiple Linear Regression  191
    Conclusion  191
    Exercises  192

6. Logistic Regression and Classification  193
    Understanding Logistic Regression  193
    Performing a Logistic Regression  196
    Logistic Function  196
    Fitting the Logistic Curve  198
    Multivariable Logistic Regression  204
    Understanding the Log-Odds  208
    R-Squared  211
    P-Values  216
    Train/Test Splits  218
    Confusion Matrices  219
    Bayes’ Theorem and Classification  222
    Receiver Operator Characteristics/Area Under Curve  223
    Class Imbalance  225
    Conclusion  226
    Exercises  226

7. Neural Networks  227
    When to Use Neural Networks and Deep Learning  228
    A Simple Neural Network  229
    Activation Functions  231
    Forward Propagation  237
    Backpropagation  243
    Calculating the Weight and Bias Derivatives  243
    Stochastic Gradient Descent  248
    Using scikit-learn  251
    Limitations of Neural Networks and Deep Learning  253
    Conclusion  256
    Exercise  256

8. Career Advice and the Path Forward  257
    Redefining Data Science  258
    A Brief History of Data Science  260
    Finding Your Edge  263
    SQL Proficiency  263
    Programming Proficiency  266
    Data Visualization  269
    Knowing Your Industry  270
    Productive Learning  272
    Practitioner Versus Advisor  272
    What to Watch Out For in Data Science Jobs  275
    Role Definition  275
    Organizational Focus and Buy-In  276
    Adequate Resources  278
    Reasonable Objectives  279
    Competing with Existing Systems  280
    A Role Is Not What You Expected  282
    Does Your Dream Job Not Exist?  283
    Where Do I Go Now?  284
    Conclusion  285

A. Supplemental Topics  287

B. Exercise Answers  309

Index  323