Final Exam Review Sheet (Math 225 Statistics and Probability)

1. Introduction to Statistics and Data Analysis
   1.1 Language of Statistics
   1.2 Design of Experiments
   1.3 Types of Variables
   1.4 Summary and Graphical Representation of Data
2. Probability
   2.1 Sample Space
   2.2 Events
   2.3 Counting Sample Points
   2.4 Probability of an Event
   2.5 Additive Rules
   2.6 Conditional Probability & Independent Events
   2.7 Bayes' Rule
3. Random Variables and Probability Distributions
4. Mathematical Expectation
5. Some Discrete Probability Distributions
6. Some Continuous Probability Distributions
7.
8. Fundamental Sampling Distributions and Data Descriptions
9. One-Sample Estimation Problems
10. One-Sample Tests of Hypothesis

1. Introduction to Statistics and Data Analysis

1.1 Language of Statistics
- Statistics
- Population (finite / infinite / target / sampled population)
- Unit (subject)
- Data (univariate / bivariate / multivariate data)
- Census
- Sample
- Sampling error
- Non-sampling error / bias
  - Selection bias
  - Nonresponse bias
  - Self-selection bias (voluntary response)
- Sampling
  - Unbiased sampling methods (representative samples)
  - Probability sampling methods
    - Simple random sample (SRS) of size n
    - Stratified random sampling
    - Systematic sampling (1-in-k systematic sampling)
    - Cluster sampling
    - Multistage sampling
- Parameter
- Statistic
- Branches of statistics
  - Descriptive statistics
  - Inferential statistics

1.2 Design of Experiments
- Statistical studies, two groupings:
  - Group 1: comparative study vs. non-comparative (descriptive) study
  - Group 2: experimental study vs. observational study
- Experimental study
  - Simple comparison experiment (treatment group / control group)
  - Reducing bias: random allocation, placebos (placebo effect), single-blind, double-blind
- Observational study
  - Cohort study
    - Retrospective cohort study
    - Prospective cohort study
  - Case-control study (retrospective)

1.3 Types of Variables
Terms:
- Response variable
- Explanatory variable (factor)
  - Levels (possible values)
  - Treatments (specific combinations of levels)
- Lurking variable
- Confounded variables (confounding / confounder)
Types:
- Categorical (qualitative): nominal, ordinal
- Numerical (quantitative): continuous, discrete

1.4 Summary and Graphical Representation of Data
Categorical variables
- Summary: frequency distribution; relative frequency
- Graphical representation: bar graph, Pareto chart, side-by-side bar graph, pie chart
Numerical variables
- Summary (see the second code sketch at the end of Section 2.3)
  - Measures of center
    - Mean (not resistant)
    - Median (resistant)
    - Mode
  - Measures of spread
    - Range
    - Quartiles
    - Variance & standard deviation (not resistant)
      - Deviation; mean absolute deviation (MAD)
      - Sample variance / sample standard deviation; variance / standard deviation
- Graphical representation (descriptive statistics)
  - Histogram (continuous data; discrete data; shape of distribution and data)
  - Stem-and-leaf plot (small quantitative data sets)
  - Box-and-whisker plot (shape of distribution)
  - Dot plot

Chapter 2 Probability

2.1 Sample Space
- Experiment

2.2 Events
- Mutually exclusive = disjoint: A∩B = Ø
- Set algebra:
  - A∩Ø = Ø
  - A∪Ø = A
  - A∩A' = Ø
  - A∪A' = S
  - S' = Ø
  - Ø' = S
  - (A')' = A
- Binary operators: commutative, associative, distributive
- Order of operations: ' before ∩ before ∪
- De Morgan's laws: (A∩B)' = A'∪B'; (A∪B)' = A'∩B'

2.3 Counting Sample Points
- Multiplication rule
- Permutation
  - Theorem 2.2: # of permutations of n distinct objects = n!
  - Theorem 2.3: # of permutations of n distinct objects taken r at a time = nPr = P(n, r) = n!/(n−r)!
  - Theorem 2.4: # of distinct permutations of n things of which n₁ are of one kind, n₂ of a second kind, …, n_k of a k-th kind = n!/(n₁! n₂! ⋯ n_k!)
- Combination
  - Theorem 2.5: # of combinations of n distinct objects taken r at a time = (n choose r) = C(n, r) = nCr = n!/[(n−r)! r!]
  - Theorem 2.6: # of arrangements of a set of n objects into r cells with n₁ elements in the first cell, n₂ in the second, and so forth = (n choose n₁, n₂, …, n_r) = n!/(n₁! n₂! ⋯ n_r!), where n₁ + n₂ + ⋯ + n_r = n
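The counting rules above map directly onto Python's standard library (math.perm, math.comb, math.factorial). A minimal sketch checking Theorems 2.3–2.5; the values of n and r and the letter-count example are made up for illustration:

```python
# Counting rules from Section 2.3, checked against the standard library.
from math import comb, factorial, perm

n, r = 7, 3
print(perm(n, r), factorial(n) // factorial(n - r))                   # Theorem 2.3: nPr
print(comb(n, r), factorial(n) // (factorial(n - r) * factorial(r)))  # Theorem 2.5: nCr

# Theorem 2.4: distinct permutations of n objects with repeated kinds,
# n! / (n1! n2! ... nk!). Example: the letters of "MISSISSIPPI".
counts = [1, 4, 4, 2]                   # counts of M, I, S, P (n = 11)
denom = 1
for c in counts:
    denom *= factorial(c)
print(factorial(sum(counts)) // denom)  # 34650 distinct arrangements
```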
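Looking back at Section 1.4, the resistant vs. not-resistant distinction is easy to see numerically: one outlier drags the mean and standard deviation but barely moves the median. A minimal sketch using Python's statistics module; the data values are hypothetical:

```python
# Section 1.4 summary measures on a hypothetical sample; 90 is an outlier.
import statistics

data = [12, 15, 11, 14, 90, 13, 12]

mean = statistics.mean(data)      # not resistant: pulled toward 90
median = statistics.median(data)  # resistant to the outlier
s2 = statistics.variance(data)    # sample variance (divides by n - 1)
s = statistics.stdev(data)        # sample standard deviation
mad = sum(abs(x - mean) for x in data) / len(data)  # mean absolute deviation

print(mean, median, s2, s, mad)   # mean ~23.9 vs. median 13
```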
2.4 Probability of an Event
- Axiom of Probability 3: if A₁, A₂, … is a sequence of mutually exclusive events, then P(A₁ ∪ A₂ ∪ ⋯) = P(A₁) + P(A₂) + ⋯

2.5 Additive Rules
- Theorem 2.7: P(A∪B) = P(A) + P(B) − P(A∩B)
  - Corollary 1: if A and B are mutually exclusive, then P(A∪B) = P(A) + P(B).
  - Corollary 2: if A₁, A₂, …, A_n are mutually exclusive, then P(A₁ ∪ ⋯ ∪ A_n) = P(A₁) + ⋯ + P(A_n).
  - Corollary 3: if A₁, A₂, …, A_n is a partition of a sample space S, then P(A₁ ∪ ⋯ ∪ A_n) = P(A₁) + ⋯ + P(A_n) = P(S) = 1.
- Theorem 2.8: P(A∪B∪C) = P(A) + P(B) + P(C) − P(A∩B) − P(A∩C) − P(B∩C) + P(A∩B∩C)
- Theorem 2.9: P(A) + P(A') = 1

2.6 Conditional Probability & Independent Events
- Conditional probability: P(A|B) = P(A∩B)/P(B), provided P(B) ≠ 0
- Multiplication rule:
  - P(A∩B) = P(A|B)P(B) = P(B|A)P(A)
  - P(E₁ ∩ E₂ ∩ ⋯ ∩ E_n) = P(E₁) · P(E₂|E₁) · P(E₃|E₁∩E₂) ⋯ P(E_n|E₁ ∩ E₂ ∩ ⋯ ∩ E_{n−1})
- Independent events (if and only if):
  - P(A|B) = P(A)
  - P(A∩B) = P(A) · P(B)
- Theorem 2.13: if the events E₁, E₂, …, E_n are mutually independent, then P(E₁ ∩ E₂ ∩ ⋯ ∩ E_n) = P(E₁) · P(E₂) · P(E₃) ⋯ P(E_n)
- Theorem 2.14: if events A and B are independent, then A' and B' are independent.

2.7 Bayes' Rule
- Law of Total Probability: suppose events B₁, B₂, …, B_n form a partition of the sample space S. Then for any event A of S, P(A) = Σ_{j=1}^{n} P(A|B_j)P(B_j).
- Bayes' Theorem: suppose events B₁, B₂, …, B_n form a partition of the sample space S. Then for any event A of S such that P(A) ≠ 0, P(B_r|A) = P(A|B_r)P(B_r) / Σ_{i=1}^{n} P(A|B_i)P(B_i), for r = 1, 2, ⋯, n. (A numeric example appears in the second code sketch below, after the Chapter 3 single-variable notes.)

Chapter 3 Random Variables and Probability Distributions

Random variable
- Definition: a random variable (rv) is a rule that associates a real number with each outcome in S.
- Notation: X(s) = x, where x is the value the random variable X associates with the outcome s.

Single random variable
- Discrete random variable
  - pmf: f(x) = P(X = x), where f(x) ≥ 0 for all x and Σ_x f(x) = 1
  - cdf: F(x) = P(X ≤ x) = Σ_{t≤x} f(t), for −∞ < x < ∞
- Continuous random variable
  - pdf: f(x) ≥ 0 for all x, and ∫_{−∞}^{∞} f(x) dx = 1
  - cdf: F(x) = P(X ≤ x) = ∫_{−∞}^{x} f(t) dt
  - FCT: P(a ≤ X ≤ b) = ∫_a^b f(x) dx = F(b) − F(a), for all a ≤ b
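The continuous-rv definitions above can be sanity-checked numerically. A minimal sketch assuming the hypothetical pdf f(x) = 3x² on [0, 1], whose cdf is F(x) = x³; the FCT probability F(b) − F(a) should match a direct numeric integral of the pdf:

```python
# Continuous rv check: f integrates to 1, and P(a <= X <= b) = F(b) - F(a).
def f(x):
    return 3 * x**2        # hypothetical pdf on [0, 1]

def F(x):
    return x**3            # its cdf: integral of f from 0 to x

def integrate(g, a, b, n=10_000):
    """Simple midpoint-rule numeric integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

print(integrate(f, 0, 1))      # ~1.0, so f is a valid pdf
print(F(0.5) - F(0.2))         # exact FCT value: 0.117
print(integrate(f, 0.2, 0.5))  # numeric value: ~0.117
```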
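And here is the numeric example promised in Section 2.7: the Law of Total Probability and Bayes' Rule on a hypothetical two-event partition (B₁ = condition present, B₂ = condition absent; A = positive test). All probability values are assumptions for illustration:

```python
# Law of Total Probability and Bayes' Rule (Section 2.7).
priors = {"B1": 0.01, "B2": 0.99}        # P(B_j): partition of S
likelihoods = {"B1": 0.95, "B2": 0.05}   # P(A | B_j)

p_A = sum(likelihoods[b] * priors[b] for b in priors)  # total probability
posterior_B1 = likelihoods["B1"] * priors["B1"] / p_A  # Bayes' rule

print(p_A)           # P(A) = 0.95*0.01 + 0.05*0.99 = 0.059
print(posterior_B1)  # P(B1 | A) ~ 0.161: small prior keeps posterior small
```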
Joint distributed random variables
- Discrete
  - Joint pmf: f(x, y) = P(X = x, Y = y); P[(X, Y) ∈ A] = Σ Σ_{(x,y)∈A} f(x, y)
  - Joint cdf: F(x, y) = P(X ≤ x, Y ≤ y) = Σ_{s≤x} Σ_{t≤y} f(s, t)
- Continuous
  - Joint pdf: P[(X, Y) ∈ A] = ∬_A f(x, y) dx dy (exam question)
  - Joint cdf: F(x, y) = P(X ≤ x, Y ≤ y) = ∫_{−∞}^{x} ∫_{−∞}^{y} f(s, t) dt ds

Marginal distributions
- Discrete: g(x) = f_X(x) = Σ_y f(x, y); h(y) = f_Y(y) = Σ_x f(x, y)
- Continuous: g(x) = f_X(x) = ∫_{−∞}^{∞} f(x, y) dy; h(y) = f_Y(y) = ∫_{−∞}^{∞} f(x, y) dx

Conditional distribution of X given Y
- Formula: f(x|y) = f(x, y)/h(y), for h(y) ≠ 0
- Discrete: P(a < X < b | Y = y) = Σ_{a<x<b} f(x|y)
- Continuous: P(a < X < b | Y = y) = ∫_a^b f(x|y) dx
- Independence: random variables X and Y are independent if and only if f(x, y) = g(x)h(y) for all (x, y) (checked in the first code sketch below)

Multivariate distributions
- Joint marginal distribution:
  - Discrete: g(x₁, x₂) = Σ_{x₃} ⋯ Σ_{x_n} f(x₁, x₂, ⋯, x_n)
  - Continuous: g(x₁, x₂) = ∫_{−∞}^{∞} ⋯ ∫_{−∞}^{∞} f(x₁, x₂, ⋯, x_n) dx₃ dx₄ ⋯ dx_n
- Mutual independence: X₁, X₂, …, X_n are mutually independent if and only if f(x₁, x₂, ⋯, x_n) = f₁(x₁) · f₂(x₂) ⋯ f_n(x_n), for all (x₁, x₂, ⋯, x_n)

Chapter 4 Mathematical Expectation

Mean
- Expectation of X:
  - Discrete: μ = E(X) = Σ_x x f(x)
  - Continuous: μ = E(X) = ∫_{−∞}^{∞} x f(x) dx
- Expectation of g(X):
  - Discrete: μ_{g(X)} = E[g(X)] = Σ_x g(x) f(x)
  - Continuous: μ_{g(X)} = E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx
- Expectation of g(X, Y):
  - Discrete: μ_{g(X,Y)} = E[g(X, Y)] = Σ_x Σ_y g(x, y) f(x, y)
  - Continuous: μ_{g(X,Y)} = E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f(x, y) dx dy

Variance and covariance
- Variance of X:
  - Discrete: σ² = Var(X) = E[(X − μ)²] = Σ_x (x − μ)² f(x)
  - Continuous: σ² = Var(X) = E[(X − μ)²] = ∫_{−∞}^{∞} (x − μ)² f(x) dx
- Standard deviation of X: σ = SD(X) = √Var(X)
- Theorem 4.2: σ² = E(X²) − μ² = E(X²) − [E(X)]²
- Variance of g(X):
  - Discrete: σ²_{g(X)} = E[(g(X) − μ_{g(X)})²] = Σ_x [g(x) − μ_{g(X)}]² f(x)
  - Continuous: σ²_{g(X)} = E[(g(X) − μ_{g(X)})²] = ∫_{−∞}^{∞} [g(x) − μ_{g(X)}]² f(x) dx
- Covariance of X and Y:
  - Discrete: σ_XY = Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)] = Σ_x Σ_y (x − μ_X)(y − μ_Y) f(x, y)
  - Continuous: σ_XY = Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x − μ_X)(y − μ_Y) f(x, y) dx dy
  - Note: covariance measures a type of association between X and Y. If X and Y are positively correlated, Cov(X, Y) tends to be positive; if they are negatively correlated, Cov(X, Y) tends to be negative.
- Theorem 4.4: Cov(X, Y) = σ_XY = E(XY) − μ_X μ_Y
- Theorem 4.5:
  - Cov(aX + bY, Z) = a Cov(X, Z) + b Cov(Y, Z)
  - Cov(aX + b, cY + d) = ac Cov(X, Y)

Linear combinations of rvs
- Mean of a linear combination
  - Theorem 4.6: E(aX + b) = a E(X) + b
    - Corollaries: E(b) = b; E(aX) = a E(X)
  - Theorem 4.7: E[g(X) ± h(X)] = E[g(X)] ± E[h(X)]
  - Theorem 4.8: E[g(X, Y) ± h(X, Y)] = E[g(X, Y)] ± E[h(X, Y)]
    - Corollary 1: E[g(X) ± h(Y)] = E[g(X)] ± E[h(Y)]
    - Corollary 2: E[X ± Y] = E[X] ± E[Y]
  - Theorem 4.9: if X and Y are independent rvs, then E(XY) = E(X)E(Y).
    - Corollary 1: if X and Y are independent rvs, σ_XY = Cov(X, Y) = 0.
- Variance of a linear combination
  - Theorem 4.10: Var(aX + b) = a² Var(X)
    - Corollary 1: Var(X + b) = Var(X)
    - Corollary 2: Var(aX) = a² Var(X)
  - Theorem 4.11: Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y)
    - Corollary 1: if X and Y are independent, Var(aX + bY) = a² Var(X) + b² Var(Y). (Theorems 4.4 and 4.11 are verified numerically in the second code sketch below.)
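The Chapter 3 joint-distribution formulas are mechanical to apply in the discrete case. A minimal sketch with a hypothetical joint pmf f(x, y): it computes the marginals g(x) and h(y), a conditional distribution f(x | y = 1), and runs the independence check f(x, y) = g(x)h(y):

```python
# Marginals, conditionals, and independence for a discrete joint pmf.
f = {(0, 0): 0.10, (0, 1): 0.20,   # hypothetical joint pmf f(x, y),
     (1, 0): 0.30, (1, 1): 0.40}   # keyed by (x, y); entries sum to 1

xs = {x for x, _ in f}
ys = {y for _, y in f}

g = {x: sum(f[(x, y)] for y in ys) for x in xs}  # g(x) = sum_y f(x, y)
h = {y: sum(f[(x, y)] for x in xs) for y in ys}  # h(y) = sum_x f(x, y)

f_given_y1 = {x: f[(x, 1)] / h[1] for x in xs}   # f(x | y) = f(x, y) / h(y)
print(g, h, f_given_y1)

# Independence iff f(x, y) == g(x) * h(y) for every (x, y); here it fails.
print(all(abs(f[(x, y)] - g[x] * h[y]) < 1e-12 for x in xs for y in ys))
```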
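This second sketch verifies Theorem 4.4 (Cov(X, Y) = E(XY) − μ_X μ_Y) and Theorem 4.11 (Var(aX + bY) = a² Var(X) + b² Var(Y) + 2ab Cov(X, Y)) on the same hypothetical joint pmf; a and b are arbitrary constants chosen for the check:

```python
# Theorems 4.4 and 4.11 checked numerically against the joint pmf above.
f = {(0, 0): 0.10, (0, 1): 0.20, (1, 0): 0.30, (1, 1): 0.40}

def E(g):
    """E[g(X, Y)] = sum over all (x, y) of g(x, y) * f(x, y)."""
    return sum(g(x, y) * p for (x, y), p in f.items())

mu_x, mu_y = E(lambda x, y: x), E(lambda x, y: y)
var_x = E(lambda x, y: (x - mu_x) ** 2)
var_y = E(lambda x, y: (y - mu_y) ** 2)
cov = E(lambda x, y: x * y) - mu_x * mu_y            # Theorem 4.4

a, b = 2, -3
lhs = E(lambda x, y: (a * x + b * y - (a * mu_x + b * mu_y)) ** 2)
rhs = a**2 * var_x + b**2 * var_y + 2 * a * b * cov  # Theorem 4.11
print(cov, lhs, rhs)  # cov = -0.02; lhs == rhs == 3.24
```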
- Distribution of a linear combination:

Chapter 5 Some Discrete Probability Distributions

Parameter and parameter family:
- Parameter: a parameter takes values from some set of possible values, with each different value determining a different probability distribution.
- Parameter family: the collection of all probability distributions obtained as the parameter ranges over its possible values.

Discrete probability distributions:
- Bernoulli(p)
  - pmf: f(x) = p if x = 1; 1 − p if x = 0
  - E[X] = p; Var[X] = p(1 − p)
- Binomial(n, p), X ~ Bin(n, p)
  - pmf: f(x) = C(n, x) p^x (1 − p)^(n−x) = b(x; n, p) = P(X = x)
  - cdf: B(r; n, p) = P(X ≤ r) (Table)
  - E[X] = np; Var[X] = np(1 − p)
- Poisson(λ), X ~ Poisson(λ)
  - pmf: f(x) = e^(−λ) λ^x / x! = p(x; λ) = P(X = x)
  - cdf: P(r; λ) = P(X ≤ r) (Table)
  - E[X] = λ; Var[X] = λ

Poisson approximation to the binomial: λ = μ = np; good when n ≥ 100 and np ≤ 10. (Checked in the final code sketch on this sheet.)

Chapter 6 Some Continuous Probability Distributions

- Uniform(a, b), X ~ U(a, b)
  - pdf: f(x) = 1/(b − a) if a ≤ x ≤ b; 0 otherwise
  - cdf: F(x) = (x − a)/(b − a) for a ≤ x ≤ b
  - E[X] = (a + b)/2; Var[X] = (b − a)²/12
- Exponential(λ), X ~ Exp(λ)
  - pdf: f(x) = λ e^(−λx) if x ≥ 0; 0 if x < 0
  - cdf: F(x) = 1 − e^(−λx) if x ≥ 0; 0 if x < 0
  - E[X] = 1/λ; Var[X] = 1/λ²
- Normal(μ, σ), X ~ N(μ, σ)
  - cdf: Table
  - E[X] = μ; Var[X] = σ²
- Standard normal N(0, 1): Z = (X − μ)/σ
  - cdf: Table
  - E[Z] = 0; Var[Z] = 1

Normal approximation to the binomial: a good approximation when np ≥ 5 and n(1 − p) ≥ 5.

Memoryless property: a random variable X is memoryless if P(X > s + t | X > t) = P(X > s), for all s, t ≥ 0.
- Theorem 6.5: if X is exponentially distributed, then X is memoryless; the exponential is the only continuous distribution with this property. (A numeric check follows.)
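A quick numeric check of the memoryless property, using the exponential survival function P(X > x) = e^(−λx); the values of λ, s, and t are hypothetical:

```python
# Memoryless property (Theorem 6.5): P(X > s + t | X > t) = P(X > s).
from math import exp

lam, s, t = 0.5, 2.0, 3.0

def surv(x):
    return exp(-lam * x)       # P(X > x) for X ~ Exp(lam)

lhs = surv(s + t) / surv(t)    # conditional survival, given X > t
rhs = surv(s)
print(lhs, rhs)                # both equal e^(-1.0) ~ 0.3679
```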
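Finally, the Poisson approximation to the binomial promised in Chapter 5: with the hypothetical parameters n = 200 and p = 0.02 (so n ≥ 100 and np = 4 ≤ 10, satisfying the rule of thumb), b(x; n, p) and p(x; λ = np) agree closely:

```python
# Poisson approximation to the binomial, standard library only.
from math import comb, exp, factorial

n, p = 200, 0.02
lam = n * p                    # lambda = mu = np = 4

def binom(x):
    return comb(n, x) * p**x * (1 - p) ** (n - x)   # b(x; n, p)

def poisson(x):
    return exp(-lam) * lam**x / factorial(x)        # p(x; lambda)

for x in range(7):
    print(x, round(binom(x), 4), round(poisson(x), 4))  # close agreement
```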