Chapter 3 Descriptive Statistics

Population Mean
$\mu = \frac{\sum x}{N} = \frac{x_1 + x_2 + x_3 + \cdots + x_N}{N}$

Sample Mean
$\bar{x} = \frac{\sum x}{n} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n}$

Sample Standard Deviation
$s = \sqrt{\frac{\sum(x - \bar{x})^2}{n - 1}}$

Interquartile Range
$IQR = Q_3 - Q_1$

Sum of Deviations from the Arithmetic Mean Is Always Zero
$\sum(x - \mu) = 0$

Mean Absolute Deviation
$MAD = \frac{\sum|x - \mu|}{N}$

Population Variance
$\sigma^2 = \frac{\sum(x - \mu)^2}{N}$

Population Standard Deviation
$\sigma = \sqrt{\frac{\sum(x - \mu)^2}{N}}$

Computational Formulas for Population Variance and Standard Deviation
$\sigma^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N}$
$\sigma = \sqrt{\sigma^2}$

Sample Variance
$s^2 = \frac{\sum(x - \bar{x})^2}{n - 1}$

Computational Formulas for Sample Variance and Standard Deviation
$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$
$s = \sqrt{s^2}$

z Score
$z = \frac{x - \mu}{\sigma}$

Coefficient of Variation
$CV = \frac{\sigma}{\mu}(100)$

Empirical Rule*
Distance from the Mean      Values within the Distance
$\mu \pm 1\sigma$           68%
$\mu \pm 2\sigma$           95%
$\mu \pm 3\sigma$           99.7%
*Based on the assumption that the data are approximately normally distributed.

Chebyshev's Theorem
Within k standard deviations of the mean, $\mu \pm k\sigma$, lie at least $1 - \frac{1}{k^2}$ of the values. Assumption: $k > 1$.

Mean of Grouped Data
$\mu_{\text{grouped}} = \frac{\sum fM}{\sum f} = \frac{f_1 M_1 + f_2 M_2 + \cdots + f_i M_i}{f_1 + f_2 + \cdots + f_i}$
where
i = the number of classes
f = class frequency
M = class midpoint
N = total frequencies (total number of data values)

Median of Grouped Data
$\text{median}_{\text{grouped}} = l + \frac{\frac{N}{2} - F}{f}\,w$
where
l = lower endpoint of the class containing the median
w = width of the class containing the median
f = frequency of the class containing the median
F = cumulative frequency of classes preceding the class containing the median
N = total frequencies (total number of data values)

Formulas for Population Variance and Standard Deviation of Grouped Data
Original Formula: $\sigma^2 = \frac{\sum f(M - \mu)^2}{N}$
Computational Version: $\sigma^2 = \frac{\sum fM^2 - \frac{(\sum fM)^2}{N}}{N}$
$\sigma = \sqrt{\sigma^2}$
where
f = frequency
M = class midpoint
N = $\sum f$, or total of the frequencies of the population
$\mu$ = grouped mean for the population

Formulas for Sample Variance and Standard Deviation of Grouped Data
Original Formula: $s^2 = \frac{\sum f(M - \bar{x})^2}{n - 1}$
Computational Version: $s^2 = \frac{\sum fM^2 - \frac{(\sum fM)^2}{n}}{n - 1}$
$s = \sqrt{s^2}$
where
f = frequency
M = class midpoint
n = $\sum f$, or total of the frequencies of the sample
$\bar{x}$ = grouped mean for the sample

Coefficient of Skewness
$S_k = \frac{3(\mu - M_d)}{\sigma}$
where
$S_k$ = coefficient of skewness
$M_d$ = median

Chapter 4 Probability

Classical Method of Assigning Probabilities
$P(E) = \frac{n_e}{N}$
where
N = total possible number of outcomes of an experiment
$n_e$ = the number of outcomes in which the event occurs out of N outcomes

Range of Possible Probabilities
$0 \le P(E) \le 1$

Probability by Relative Frequency of Occurrence
$\frac{\text{Number of Times an Event Occurred}}{\text{Total Number of Opportunities for the Event to Occur}}$

Mutually Exclusive Events X and Y
$P(X \cap Y) = 0$

Independent Events X and Y
$P(X \mid Y) = P(X)$ and $P(Y \mid X) = P(Y)$

Probability of the Complement of A
$P(A') = 1 - P(A)$

The mn Counting Rule
For an operation that can be done m ways and a second operation that can be done n ways, the two operations can occur, in order, in mn ways. This rule can be extended to cases with three or more operations.

General Law of Addition
$P(X \cup Y) = P(X) + P(Y) - P(X \cap Y)$
where X and Y are events and $X \cap Y$ is the intersection of X and Y.
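The Chapter 3 measures above translate directly into code. What follows is a minimal Python sketch (not from the text) applying several of those formulas to an illustrative data set; the values and variable names are invented, and the MAD line uses the sample mean in place of the population mean $\mu$.

```python
# Minimal sketch: Chapter 3 descriptive measures on an illustrative sample.
from math import sqrt

data = [5, 7, 8, 9, 11, 14]
n = len(data)

mean = sum(data) / n                                   # sample mean: x-bar = (sum x)/n

# Definitional and computational sample variance (they agree):
var_def = sum((x - mean) ** 2 for x in data) / (n - 1)
var_comp = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)
s = sqrt(var_def)                                      # sample standard deviation

# The card defines MAD with the population mean; the sample mean stands in here.
mad = sum(abs(x - mean) for x in data) / n
cv = (s / mean) * 100                                  # coefficient of variation (%)
z_scores = [(x - mean) / s for x in data]              # z = (x - mean)/s

print(mean, var_def, var_comp, s, mad, cv, z_scores)
```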
Special Law of Addition
If X, Y are mutually exclusive, $P(X \cup Y) = P(X) + P(Y)$

General Law of Multiplication
$P(X \cap Y) = P(X) \cdot P(Y \mid X) = P(Y) \cdot P(X \mid Y)$

Special Law of Multiplication
If X, Y are independent, $P(X \cap Y) = P(X) \cdot P(Y)$

Law of Conditional Probability
$P(X \mid Y) = \frac{P(X \cap Y)}{P(Y)} = \frac{P(X) \cdot P(Y \mid X)}{P(Y)}$

Independent Events X, Y
If X and Y are independent events, the following must be true:
$P(X \mid Y) = P(X)$ and $P(Y \mid X) = P(Y)$

Bayes' Rule
$P(X_i \mid Y) = \frac{P(X_i) \cdot P(Y \mid X_i)}{P(X_1) \cdot P(Y \mid X_1) + P(X_2) \cdot P(Y \mid X_2) + \cdots + P(X_n) \cdot P(Y \mid X_n)}$

Chapter 5 Discrete Distributions

Mean or Expected Value of a Discrete Distribution
$\mu = E(x) = \sum[x \cdot P(x)]$
where
E(x) = long-run average
x = an outcome
P(x) = probability of that outcome

Variance of a Discrete Distribution
$\sigma^2 = \sum[(x - \mu)^2 \cdot P(x)]$
where
x = an outcome
P(x) = probability of a given outcome
$\mu$ = mean

Standard Deviation of a Discrete Distribution
$\sigma = \sqrt{\sum[(x - \mu)^2 \cdot P(x)]}$

Assumptions of the Binomial Distribution
- The experiment involves n identical trials.
- Each trial has only two possible outcomes, denoted success or failure.
- Each trial is independent of the previous trials.
- The terms p and q remain constant throughout the experiment, where p is the probability of getting a success on any one trial and q = 1 − p is the probability of getting a failure on any one trial.

Binomial Formula
${}_nC_x \cdot p^x \cdot q^{n-x} = \frac{n!}{x!\,(n-x)!} \cdot p^x \cdot q^{n-x}$
where
n = the number of trials (or the number being sampled)
x = the number of successes desired
p = the probability of getting a success in one trial
q = 1 − p = the probability of getting a failure in one trial

Mean and Standard Deviation of a Binomial Distribution
$\mu = n \cdot p$
$\sigma = \sqrt{n \cdot p \cdot q}$

Poisson Formula
$P(x) = \frac{\lambda^x e^{-\lambda}}{x!}$
where
x = 0, 1, 2, 3, …
λ = long-run average
e = 2.718281…

Hypergeometric Formula
$P(x) = \frac{{}_AC_x \cdot {}_{N-A}C_{n-x}}{{}_NC_n}$
where
N = size of the population
n = sample size
A = number of successes in the population
x = number of successes in the sample; sampling is done without replacement

Chapter 6 Continuous Distributions

Probability Density Function of a Uniform Distribution
$f(x) = \begin{cases} \frac{1}{b-a} & \text{for } a \le x \le b \\ 0 & \text{for all other values} \end{cases}$

Mean and Standard Deviation of a Uniform Distribution
$\mu = \frac{a + b}{2}$
$\sigma = \frac{b - a}{\sqrt{12}}$

Probabilities in a Uniform Distribution
$P(x) = \frac{x_2 - x_1}{b - a}$
where $a \le x_1 \le x_2 \le b$

z Formula
$z = \frac{x - \mu}{\sigma}$, $\sigma \ne 0$

Exponential Probability Density Function
$f(x) = \lambda e^{-\lambda x}$
where
$x \ge 0$
$\lambda > 0$
and e = 2.718281…

Probabilities of the Right Tail of the Exponential Distribution
$P(x \ge x_0) = e^{-\lambda x_0}$
where $x_0 \ge 0$

Chapter 7 Sampling and Sampling Distributions

Determining the Value of k
$k = \frac{N}{n}$
where
n = sample size
N = population size
k = size of interval for selection

Central Limit Theorem
If samples of size n are drawn randomly from a population that has a mean of $\mu$ and a standard deviation of $\sigma$, the sample means, $\bar{x}$, are approximately normally distributed for sufficiently large samples (n ≥ 30) regardless of the shape of the population distribution. If the population is normally distributed, the sample means are normally distributed for any sample size.
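The Chapter 5 probability mass functions above can be checked numerically. Here is a minimal Python sketch (Python 3.8+ for math.comb) written straight from the binomial, Poisson, and hypergeometric definitions; all inputs are illustrative.

```python
# Minimal sketch: Chapter 5 discrete-distribution formulas, written
# directly from the definitions on the formula card.
from math import comb, exp, factorial, sqrt

def binomial_pmf(x, n, p):
    """nCx * p^x * q^(n-x)"""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

def poisson_pmf(x, lam):
    """lambda^x * e^(-lambda) / x!"""
    return lam**x * exp(-lam) / factorial(x)

def hypergeometric_pmf(x, N, A, n):
    """(A C x)(N-A C n-x) / (N C n); sampling without replacement"""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

# Binomial mean and standard deviation: mu = n*p, sigma = sqrt(n*p*q)
n, p = 20, 0.3
mu, sigma = n * p, sqrt(n * p * (1 - p))

print(binomial_pmf(6, n, p), poisson_pmf(2, 3.2), hypergeometric_pmf(1, 24, 8, 5))
print(mu, sigma)
```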
From mathematical expectation, it can be shown that the mean of the sample means is the population mean:
$\mu_{\bar{x}} = \mu$
and the standard deviation of the sample means (called the standard error of the mean) is the standard deviation of the population divided by the square root of the sample size:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

z Formula for Sample Means
$z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}}$

z Formula for Sample Means of a Finite Population
$z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}}$

Sample Proportion
$\hat{p} = \frac{x}{n}$
where
x = number of items in a sample that have the characteristic
n = number of items in the sample

z Formula for Sample Proportions for $n \cdot p > 5$ and $n \cdot q > 5$
$z = \frac{\hat{p} - p}{\sqrt{\frac{p \cdot q}{n}}}$
where
$\hat{p}$ = sample proportion
n = sample size
p = population proportion
q = 1 − p

Chapter 8 Statistical Inference: Estimation for Single Populations

100(1 − α)% Confidence Interval to Estimate μ: σ Known (8.1)
$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
where
α = the area under the normal curve outside the confidence interval area
α/2 = the area in one end (tail) of the distribution outside the confidence interval

Confidence Interval to Estimate μ Using the Finite Correction Factor (8.2)
$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}$

Confidence Interval to Estimate μ: Population Standard Deviation Unknown and the Population Normally Distributed (8.3)
$\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$
df = n − 1

Confidence Interval to Estimate p (8.4)
$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p} \cdot \hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p} \cdot \hat{q}}{n}}$
where
$\hat{p}$ = sample proportion
$\hat{q} = 1 - \hat{p}$
p = population proportion
n = sample size

χ² Formula for Single Variance (8.5)
$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$
df = n − 1

Confidence Interval to Estimate the Population Variance (8.6)
$\frac{(n - 1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2}}$
df = n − 1

Sample Size When Estimating μ (8.7)
$n = \frac{z^2_{\alpha/2}\,\sigma^2}{E^2} = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2$

Sample Size When Estimating p (8.8)
$n = \frac{z^2_{\alpha/2}\; p\, q}{E^2}$
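A short sketch of the Chapter 8 interval formulas follows. It assumes SciPy is available for the normal and t quantiles, and every number in it is illustrative.

```python
# Minimal sketch: Chapter 8 estimation formulas on made-up inputs.
from math import sqrt
from scipy import stats

alpha = 0.05

# (8.1) CI for the mean, sigma known
xbar, sigma, n = 85.0, 8.0, 40
z = stats.norm.ppf(1 - alpha / 2)
ci_mu_z = (xbar - z * sigma / sqrt(n), xbar + z * sigma / sqrt(n))

# (8.3) CI for the mean, sigma unknown (population assumed normal)
s = 7.5
t = stats.t.ppf(1 - alpha / 2, df=n - 1)
ci_mu_t = (xbar - t * s / sqrt(n), xbar + t * s / sqrt(n))

# (8.4) CI for a proportion
p_hat, m = 0.42, 200
half = z * sqrt(p_hat * (1 - p_hat) / m)
ci_p = (p_hat - half, p_hat + half)

# (8.7) sample size to estimate mu within error E (round up in practice)
E = 2.0
n_needed = (z * sigma / E) ** 2

print(ci_mu_z, ci_mu_t, ci_p, n_needed)
```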
where
p = population proportion
q = 1 − p
E = error of estimation
n = sample size

Chapter 9 Statistical Inference: Hypothesis Testing for Single Populations

z Test for a Single Mean (9.1)
$z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}}$

Formula to Test Hypotheses about μ with a Finite Population (9.2)
$z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N - n}{N - 1}}}$

t Test for μ (9.3)
$t = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}}$
df = n − 1

z Test of a Population Proportion (9.4)
$z = \frac{\hat{p} - p}{\sqrt{\frac{p \cdot q}{n}}}$
where
$\hat{p}$ = sample proportion
p = population proportion
q = 1 − p

Formula for Testing Hypotheses about a Population Variance (9.5)
$\chi^2 = \frac{(n - 1)s^2}{\sigma^2}$
df = n − 1

Chapter 10 Statistical Inference: About Two Populations

z Formula for the Difference in Two Sample Means (Independent Samples and Population Variances Known) (10.1)
$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
where
$\mu_1$ = mean of population 1
$\mu_2$ = mean of population 2
$n_1$ = size of sample 1
$n_2$ = size of sample 2

Confidence Interval to Estimate $\mu_1 - \mu_2$ (10.2)
$(\bar{x}_1 - \bar{x}_2) - z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

t Formula to Test the Difference in Means Assuming $\sigma_1^2$ and $\sigma_2^2$ Are Equal (10.3)
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
where
$s_p = \sqrt{\frac{s_1^2(n_1 - 1) + s_2^2(n_2 - 1)}{n_1 + n_2 - 2}}$
df = $n_1 + n_2 - 2$

t Formula to Test the Difference in Means (10.4)
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
$\text{df} = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(\frac{s_1^2}{n_1}\right)^2}{n_1 - 1} + \frac{\left(\frac{s_2^2}{n_2}\right)^2}{n_2 - 1}}$

Confidence Interval to Estimate $\mu_1 - \mu_2$ Assuming the Population Variances Are Unknown and Equal (10.5)
$(\bar{x}_1 - \bar{x}_2) - t\sqrt{\frac{s_1^2(n_1 - 1) + s_2^2(n_2 - 1)}{n_1 + n_2 - 2}}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \le \mu_1 - \mu_2 \le (\bar{x}_1 - \bar{x}_2) + t\sqrt{\frac{s_1^2(n_1 - 1) + s_2^2(n_2 - 1)}{n_1 + n_2 - 2}}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
df = $n_1 + n_2 - 2$

t Formula to Test the Difference in Two Dependent Populations (10.6)
$t = \frac{\bar{d} - D}{\frac{s_d}{\sqrt{n}}}$
df = n − 1
where
n = number of pairs
d = sample difference in pairs
D = mean population difference
$s_d$ = standard deviation of sample differences
$\bar{d}$ = mean sample difference

Formulas for $\bar{d}$ and $s_d$ (10.7 and 10.8)
$\bar{d} = \frac{\sum d}{n}$
$s_d = \sqrt{\frac{\sum(d - \bar{d})^2}{n - 1}} = \sqrt{\frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}}$

Confidence Interval Formula to Estimate the Difference in Related Populations, D (10.9)
$\bar{d} - t\frac{s_d}{\sqrt{n}} \le D \le \bar{d} + t\frac{s_d}{\sqrt{n}}$
df = n − 1

z Formula for the Difference in Two Population Proportions (10.10)
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{p_1 \cdot q_1}{n_1} + \frac{p_2 \cdot q_2}{n_2}}}$
where
$\hat{p}_1$ = proportion from sample 1
$\hat{p}_2$ = proportion from sample 2
$n_1$ = size of sample 1
$n_2$ = size of sample 2
$p_1$ = proportion from population 1
$p_2$ = proportion from population 2
$q_1 = 1 - p_1$
$q_2 = 1 - p_2$

z Formula to Test the Difference in Population Proportions (10.11)
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{(\bar{p} \cdot \bar{q})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
where
$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$
and
$\bar{q} = 1 - \bar{p}$

Confidence Interval to Estimate $p_1 - p_2$ (10.12)
$(\hat{p}_1 - \hat{p}_2) - z\sqrt{\frac{\hat{p}_1 \cdot \hat{q}_1}{n_1} + \frac{\hat{p}_2 \cdot \hat{q}_2}{n_2}} \le p_1 - p_2 \le (\hat{p}_1 - \hat{p}_2) + z\sqrt{\frac{\hat{p}_1 \cdot \hat{q}_1}{n_1} + \frac{\hat{p}_2 \cdot \hat{q}_2}{n_2}}$
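The pooled-variance t statistic (10.3) is straightforward to compute by hand. The sketch below does so on invented data and, assuming SciPy is installed, cross-checks the result against scipy.stats.ttest_ind.

```python
# Minimal sketch: pooled-variance two-sample t statistic (formula 10.3).
from math import sqrt
from statistics import mean, stdev
from scipy import stats

x1 = [22.1, 24.3, 23.5, 25.0, 21.8, 24.7]   # illustrative sample 1
x2 = [20.4, 21.9, 22.6, 20.1, 23.0, 21.4]   # illustrative sample 2
n1, n2 = len(x1), len(x2)
s1, s2 = stdev(x1), stdev(x2)

# Pooled standard deviation, then t with df = n1 + n2 - 2 (H0: mu1 - mu2 = 0)
sp = sqrt((s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / (n1 + n2 - 2))
t = (mean(x1) - mean(x2)) / (sp * sqrt(1 / n1 + 1 / n2))

t_check, p_value = stats.ttest_ind(x1, x2, equal_var=True)
print(t, t_check, p_value)   # t and t_check should agree
```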
F Test for Two Population Variances (10.13)
$F = \frac{s_1^2}{s_2^2}$
$\text{df}_{\text{numerator}} = \nu_1 = n_1 - 1$
$\text{df}_{\text{denominator}} = \nu_2 = n_2 - 1$

Formula for Determining the Critical Value for the Lower-Tail F (10.14)
$F_{1-\alpha,\,\nu_1,\,\nu_2} = \frac{1}{F_{\alpha,\,\nu_2,\,\nu_1}}$

Chapter 11 Analysis of Variance and Design of Experiments

Formulas for computing a one-way ANOVA
$SSC = \sum_{j=1}^{C} n_j(\bar{x}_j - \bar{x})^2$
$SSE = \sum_{j=1}^{C}\sum_{i=1}^{n_j}(x_{ij} - \bar{x}_j)^2$
$SST = \sum_{j=1}^{C}\sum_{i=1}^{n_j}(x_{ij} - \bar{x})^2$
$\text{df}_C = C - 1$
$\text{df}_E = N - C$
$\text{df}_T = N - 1$
$MSC = \frac{SSC}{\text{df}_C}$
$MSE = \frac{SSE}{\text{df}_E}$
$F = \frac{MSC}{MSE}$

Tukey's HSD test
$HSD = q_{\alpha,\,C,\,N-C}\sqrt{\frac{MSE}{n}}$

Tukey-Kramer formula
$q_{\alpha,\,C,\,N-C}\sqrt{\frac{MSE}{2}\left(\frac{1}{n_r} + \frac{1}{n_s}\right)}$

Formulas for computing a randomized block design
$SSC = n\sum_{j=1}^{C}(\bar{x}_j - \bar{x})^2$
$SSR = C\sum_{i=1}^{n}(\bar{x}_i - \bar{x})^2$
$SSE = \sum_{j=1}^{C}\sum_{i=1}^{n}(x_{ij} - \bar{x}_j - \bar{x}_i + \bar{x})^2$
$SST = \sum_{j=1}^{C}\sum_{i=1}^{n}(x_{ij} - \bar{x})^2$
$\text{df}_C = C - 1$
$\text{df}_R = n - 1$
$\text{df}_E = (C - 1)(n - 1) = N - n - C + 1$
$\text{df}_T = N - 1$
$MSC = \frac{SSC}{C - 1}$
$MSR = \frac{SSR}{n - 1}$
$MSE = \frac{SSE}{N - n - C + 1}$
$F_{\text{treatments}} = \frac{MSC}{MSE}$
$F_{\text{blocks}} = \frac{MSR}{MSE}$

Formulas for computing a two-way ANOVA
$SSR = nC\sum_{i=1}^{R}(\bar{x}_i - \bar{x})^2$
$SSC = nR\sum_{j=1}^{C}(\bar{x}_j - \bar{x})^2$
$SSI = n\sum_{i=1}^{R}\sum_{j=1}^{C}(\bar{x}_{ij} - \bar{x}_i - \bar{x}_j + \bar{x})^2$
$SSE = \sum_{i=1}^{R}\sum_{j=1}^{C}\sum_{k=1}^{n}(x_{ijk} - \bar{x}_{ij})^2$
$SST = \sum_{i=1}^{R}\sum_{j=1}^{C}\sum_{k=1}^{n}(x_{ijk} - \bar{x})^2$
$\text{df}_R = R - 1$
$\text{df}_C = C - 1$
$\text{df}_I = (R - 1)(C - 1)$
$\text{df}_E = RC(n - 1)$
$\text{df}_T = N - 1$
$MSR = \frac{SSR}{R - 1}$
$MSC = \frac{SSC}{C - 1}$
$MSI = \frac{SSI}{(R - 1)(C - 1)}$
$MSE = \frac{SSE}{RC(n - 1)}$
$F_R = \frac{MSR}{MSE}$
$F_C = \frac{MSC}{MSE}$
$F_I = \frac{MSI}{MSE}$

Chapter 12 Correlation and Simple Regression Analysis

Pearson product-moment correlation coefficient (12.1)
$r = \frac{\sum(x - \bar{x})(y - \bar{y})}{\sqrt{\sum(x - \bar{x})^2\sum(y - \bar{y})^2}} = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left[\sum x^2 - \frac{(\sum x)^2}{n}\right]\left[\sum y^2 - \frac{(\sum y)^2}{n}\right]}}$

Equation of the simple regression line
$\hat{y} = b_0 + b_1 x$

Sum of squares
$SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$
$SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$
$SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n}$

Slope of the regression line (12.2)
$b_1 = \frac{\sum(x - \bar{x})(y - \bar{y})}{\sum(x - \bar{x})^2} = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$

Alternative formula for slope (12.3)
$b_1 = \frac{SS_{xy}}{SS_{xx}}$

y intercept of the regression line (12.4)
$b_0 = \bar{y} - b_1\bar{x} = \frac{\sum y}{n} - b_1\frac{\sum x}{n}$

Sum of squares of error
$SSE = \sum(y - \hat{y})^2 = \sum y^2 - b_0\sum y - b_1\sum xy$

Standard error of the estimate
$s_e = \sqrt{\frac{SSE}{n - 2}}$

Coefficient of determination (12.5)
$r^2 = 1 - \frac{SSE}{SS_{yy}} = 1 - \frac{SSE}{\sum y^2 - \frac{(\sum y)^2}{n}}$

Computational formula for $r^2$
$r^2 = \frac{b_1^2\,SS_{xx}}{SS_{yy}}$

t test of slope
$t = \frac{b_1 - \beta_1}{s_b}$, where $s_b = \frac{s_e}{\sqrt{SS_{xx}}}$

Confidence interval to estimate $E(y_x)$ for a given value of x (12.6)
$\hat{y} \pm t_{\alpha/2,\,n-2}\,s_e\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}}$

Prediction interval to estimate y for a given value of x (12.7)
$\hat{y} \pm t_{\alpha/2,\,n-2}\,s_e\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{SS_{xx}}}$

Chapter 13 Multiple Regression Analysis

The F value
$F = \frac{MS_{\text{reg}}}{MS_{\text{err}}} = \frac{SS_{\text{reg}}/\text{df}_{\text{reg}}}{SS_{\text{err}}/\text{df}_{\text{err}}} = \frac{SSR/k}{SSE/(N - k - 1)}$

Sum of squares of error
$SSE = \sum(y - \hat{y})^2$

Standard error of the estimate
$s_e = \sqrt{\frac{SSE}{n - k - 1}}$

Coefficient of multiple determination
$R^2 = \frac{SSR}{SS_{yy}} = 1 - \frac{SSE}{SS_{yy}}$

Adjusted $R^2$
$\text{Adjusted } R^2 = 1 - \frac{SSE/(n - k - 1)}{SS_{yy}/(n - 1)}$

Chapter 14 Building Multiple Regression Models

Variance inflation factor (14.1)
$VIF = \frac{1}{1 - R_i^2}$

Chapter 15 Time-Series Forecasting and Index Numbers

Individual forecast error
$e_t = X_t - F_t$

Mean absolute deviation
$MAD = \frac{\sum|e_i|}{\text{Number of Forecasts}}$

Mean square error
$MSE = \frac{\sum e_i^2}{\text{Number of Forecasts}}$

Exponential smoothing
$F_{t+1} = \alpha \cdot X_t + (1 - \alpha) \cdot F_t$
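The simple-regression formulas of Chapter 12 chain together naturally in code. Below is a minimal Python sketch on illustrative x/y values, using only the computational sums from the card.

```python
# Minimal sketch: simple regression via the Chapter 12 computational
# formulas (SSxx, SSyy, SSxy, b1, b0, SSE, r^2, se).
from math import sqrt

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # illustrative data
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
ss_xx = sum(v * v for v in x) - sum_x**2 / n
ss_yy = sum(v * v for v in y) - sum_y**2 / n
ss_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n

b1 = ss_xy / ss_xx                     # (12.3) slope
b0 = sum_y / n - b1 * sum_x / n        # (12.4) intercept

sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))  # sum((y - y_hat)^2)
r2 = 1 - sse / ss_yy                   # (12.5) coefficient of determination
se = sqrt(sse / (n - 2))               # standard error of the estimate

print(b1, b0, r2, se)
```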
Durbin-Watson test
$D = \frac{\sum_{t=2}^{n}(e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$

Chapter 16 Analysis of Categorical Data

χ² goodness-of-fit test (16.1)
$\chi^2 = \sum\frac{(f_o - f_e)^2}{f_e}$
df = k − 1 − c

χ² test of independence (16.2)
$\chi^2 = \sum\sum\frac{(f_o - f_e)^2}{f_e}$
df = (r − 1)(c − 1)

Chapter 17 Nonparametric Statistics

Large-sample runs test
$\mu_R = \frac{2n_1 n_2}{n_1 + n_2} + 1$
$\sigma_R = \sqrt{\frac{2n_1 n_2(2n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2(n_1 + n_2 - 1)}}$
$z = \frac{R - \mu_R}{\sigma_R} = \frac{R - \left(\frac{2n_1 n_2}{n_1 + n_2} + 1\right)}{\sqrt{\frac{2n_1 n_2(2n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2(n_1 + n_2 - 1)}}}$

Mann-Whitney U test (small sample)
$U_1 = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - W_1$
$U_2 = n_1 n_2 + \frac{n_2(n_2 + 1)}{2} - W_2$
$U' = n_1 n_2 - U$

Mann-Whitney U test (large sample) (17.1)
$\mu_U = \frac{n_1 n_2}{2}$
$\sigma_U = \sqrt{\frac{n_1 n_2(n_1 + n_2 + 1)}{12}}$
$z = \frac{U - \mu_U}{\sigma_U}$

Wilcoxon matched-pairs signed rank test (17.2)
$\mu_T = \frac{n(n + 1)}{4}$
$\sigma_T = \sqrt{\frac{n(n + 1)(2n + 1)}{24}}$
$z = \frac{T - \mu_T}{\sigma_T}$

Kruskal-Wallis test (17.3)
$K = \frac{12}{n(n + 1)}\left(\sum_{j=1}^{C}\frac{T_j^2}{n_j}\right) - 3(n + 1)$

Friedman test (17.4)
$\chi_r^2 = \frac{12}{bC(C + 1)}\sum_{j=1}^{C} R_j^2 - 3b(C + 1)$

Spearman's rank correlation (17.5)
$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$

Chapter 18 Statistical Quality Control

x̄ charts
Centreline: $\bar{\bar{x}} = \frac{\sum\bar{x}}{k}$
UCL: $\bar{\bar{x}} + A_2\bar{R}$
LCL: $\bar{\bar{x}} - A_2\bar{R}$
OR
UCL: $\bar{\bar{x}} + A_3\bar{s}$
LCL: $\bar{\bar{x}} - A_3\bar{s}$

R charts
Centreline: $\bar{R} = \frac{\sum R}{k}$
UCL: $D_4\bar{R}$
LCL: $D_3\bar{R}$

p charts
Centreline: $\bar{p} = \frac{\sum\hat{p}}{k}$
UCL: $\bar{p} + 3\sqrt{\frac{\bar{p} \cdot \bar{q}}{n}}$
LCL: $\bar{p} - 3\sqrt{\frac{\bar{p} \cdot \bar{q}}{n}}$

c charts
Centreline: $\bar{c} = \frac{c_1 + c_2 + c_3 + \cdots + c_i}{i}$
UCL: $\bar{c} + 3\sqrt{\bar{c}}$
LCL: $\bar{c} - 3\sqrt{\bar{c}}$

Chapter 19 Decision Analysis

Bayes' Rule
$P(X_i \mid Y) = \frac{P(X_i) \cdot P(Y \mid X_i)}{P(X_1) \cdot P(Y \mid X_1) + P(X_2) \cdot P(Y \mid X_2) + \cdots + P(X_n) \cdot P(Y \mid X_n)}$
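To illustrate the Chapter 18 chart limits, here is a minimal sketch for x̄ and R charts. The subgroup data are invented, and the A₂, D₃, D₄ values used are the usual table constants for subgroups of size 5 (assumed here; verify against a control-chart constants table).

```python
# Minimal sketch: x-bar and R chart limits (Chapter 18) on invented subgroups.
subgroups = [
    [5.1, 5.0, 4.9, 5.2, 5.0],
    [5.3, 5.1, 5.0, 4.8, 5.1],
    [4.9, 5.0, 5.2, 5.1, 5.0],
    [5.0, 5.2, 5.1, 4.9, 5.3],
]
k = len(subgroups)
A2, D3, D4 = 0.577, 0.0, 2.114   # assumed table constants for n = 5

xbars = [sum(g) / len(g) for g in subgroups]     # subgroup means
ranges = [max(g) - min(g) for g in subgroups]    # subgroup ranges

xbarbar = sum(xbars) / k    # centreline of the x-bar chart
rbar = sum(ranges) / k      # centreline of the R chart

ucl_x, lcl_x = xbarbar + A2 * rbar, xbarbar - A2 * rbar
ucl_r, lcl_r = D4 * rbar, D3 * rbar

print((lcl_x, xbarbar, ucl_x), (lcl_r, rbar, ucl_r))
```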