Hypothesis Testing for Population Means and Proportions

Topics
• Hypothesis testing for population means:
– z test for the simple case (in last lecture)
– z test for large samples
– t test for small samples from normal distributions
• Hypothesis testing for population proportions:
– z test for large samples

z-test for Large Sample Tests
• In the simple case, we assumed that the population standard deviation σ is known.
• In general, we do not know the population standard deviation, so we estimate it with the standard deviation s of an SRS from the population.
• When the sample size is large, the z tests are easily modified to yield valid test procedures without requiring either a normal population or a known σ.
• The rule of thumb n > 40 will again be used to characterize a large sample size.

z-test for Large Sample Tests (Cont.)
• Test statistic:
$$Z = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$
• Rejection regions and P-values:
– The same as in the simple case
• Determination of β and the necessary sample size:
– Step I: Specify a plausible value of σ.
– Step II: Use the simple case formulas, plugging in the value of σ from Step I.

t-test for Small Sample Normal Distribution
• z-tests are justified for large samples by the fact that a large n implies that the sample standard deviation s will be close to σ for most samples.
• For small samples, s and σ are no longer that close, so z-tests are no longer valid.
• Let X1, …, Xn be a simple random sample from N(μ, σ). Both μ and σ are unknown, and μ is the parameter of interest.
• The standardized variable
$$T = \frac{\bar{X} - \mu}{s/\sqrt{n}} \sim t_{n-1}$$

The t Distribution
• Facts about the t distribution:
– There is a different distribution for each sample size (that is, for each number of degrees of freedom).
– The density curve of any t distribution is symmetric about 0 and bell-shaped.
– The spread of the t distribution decreases as the degrees of freedom increase.
– The density is similar to the standard normal density curve, but the t distribution has fatter tails.
– Asymptotically, the t distribution is indistinguishable from the standard normal distribution.

Table A.5 Critical Values for t Distributions

Degrees of Freedom   α = 0.1   α = 0.05   α = 0.025   α = 0.01   α = 0.005
1                    3.078     6.314      12.706      31.821     63.657
2                    1.886     2.920      4.303       6.965      9.925
.                    .         .          .           .          .
20                   1.325     1.725      2.086       2.528      2.845
.                    .         .          .           .          .
200                  1.286     1.653      1.972       2.345      2.601
z*                   1.282     1.645      1.960       2.326      2.576

t-test for Small Sample Normal Distribution (Cont.)
• To test the hypothesis H0: μ = μ0 based on an SRS of size n, compute the t test statistic
$$T = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
• When H0 is true, the test statistic T has a t distribution with n − 1 df.
• The rejection regions and P-values for the t tests are obtained in the same way as in the previous cases. Below, t denotes the observed value of T.

Case 1: Ha: μ ≠ μ0 (two-tailed test). H0 should be rejected if x̄ is too far away from μ0.
-- The rejection region is $|T| \ge t_{\alpha/2,\,n-1}$.
-- The P-value is $2P(T \ge |t|)$.

Case 2: Ha: μ > μ0 (upper-tailed test). H0 should be rejected if x̄ is much larger than μ0.
-- The rejection region is $T \ge t_{\alpha,\,n-1}$.
-- The P-value is $P(T \ge t)$.

Case 3: Ha: μ < μ0 (lower-tailed test). H0 should be rejected if x̄ is much smaller than μ0.
-- The rejection region is $T \le -t_{\alpha,\,n-1}$.
-- The P-value is $P(T \le t)$.

Recap: Population Proportion
• Let p be the proportion of “successes” in a population. A random sample of size n is selected, and X is the number of “successes” in the sample.
• If n is small relative to the population size, then X can be regarded as a binomial random variable with
$$E(X) = \mu_X = np, \qquad \mathrm{Var}(X) = \sigma_X^2 = np(1-p), \qquad \sigma_X = \sqrt{np(1-p)}$$

Recap: Population Proportion (Cont.)
• We use the sample proportion $\hat{p} = X/n$ as an estimator of the population proportion.
• We have
$$E(\hat{p}) = \mu_{\hat{p}} = p, \qquad \mathrm{Var}(\hat{p}) = \sigma_{\hat{p}}^2 = \frac{p(1-p)}{n}, \qquad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$
• Hence p̂ is an unbiased estimator of the population proportion.

Recap: Population Proportion (Cont.)
• When n is large, p̂ is approximately normal. Thus
$$z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$$
is approximately standard normal.
• We can use this z statistic to test H0: p = p0 against one of the following alternative hypotheses:
– Ha: p > p0
– Ha: p < p0
– Ha: p ≠ p0

Large Sample z-test for a Population Proportion
• The null hypothesis is H0: p = p0.
• The test statistic is
$$z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$$

Alternative Hypothesis   P-value         Rejection Region for Level α Test
Ha: p > p0               P(Z ≥ z)        z ≥ zα
Ha: p < p0               P(Z ≤ z)        z ≤ −zα
Ha: p ≠ p0               2P(Z ≥ |z|)     |z| ≥ zα/2

Determination of β
• To calculate the probability of a Type II error, suppose that H0 is not true and that p = p′ instead. Then Z still has approximately a normal distribution, but
$$E(Z) = \frac{p' - p_0}{\sqrt{p_0(1-p_0)/n}}, \qquad V(Z) = \frac{p'(1-p')/n}{p_0(1-p_0)/n}$$
• The probability of a Type II error can be computed by using this mean and variance to standardize and then referring to the standard normal cdf Φ.

Case 1: Ha: p ≠ p0.
-- The Type II error probability β(p′) is:
$$\beta(p') = \Phi\!\left(\frac{p_0 - p' + z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}}\right) - \Phi\!\left(\frac{p_0 - p' - z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}}\right)$$

Case 2: Ha: p > p0.
-- The Type II error probability β(p′) is:
$$\beta(p') = \Phi\!\left(\frac{p_0 - p' + z_{\alpha}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}}\right)$$

Case 3: Ha: p < p0.
-- The Type II error probability β(p′) is:
$$\beta(p') = 1 - \Phi\!\left(\frac{p_0 - p' - z_{\alpha}\sqrt{p_0(1-p_0)/n}}{\sqrt{p'(1-p')/n}}\right)$$
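
The large-sample test for a proportion and its Type II error probability can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the lecture: the function names `proportion_z_test` and `type_ii_error`, the `alternative` labels, and the example numbers are hypothetical. It assumes the one-tailed cases use the critical value z_α and the two-tailed case uses z_{α/2}, and it uses only the standard library's `statistics.NormalDist` for Φ and its inverse.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)
ALTERNATIVES = ("greater", "less", "two-sided")  # hypothetical labels for Ha


def proportion_z_test(x, n, p0, alternative="two-sided"):
    """Return (z, p_value) for H0: p = p0 given x successes in n trials."""
    if alternative not in ALTERNATIVES:
        raise ValueError(f"alternative must be one of {ALTERNATIVES}")
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    if alternative == "greater":      # Ha: p > p0, P-value = P(Z >= z)
        p_value = 1 - STD_NORMAL.cdf(z)
    elif alternative == "less":       # Ha: p < p0, P-value = P(Z <= z)
        p_value = STD_NORMAL.cdf(z)
    else:                             # Ha: p != p0, P-value = 2 P(Z >= |z|)
        p_value = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return z, p_value


def type_ii_error(p0, p_alt, n, alpha, alternative="two-sided"):
    """Return beta(p') for the level-alpha test when the true proportion is p_alt."""
    if alternative not in ALTERNATIVES:
        raise ValueError(f"alternative must be one of {ALTERNATIVES}")
    se_null = sqrt(p0 * (1 - p0) / n)        # standard error under H0
    se_alt = sqrt(p_alt * (1 - p_alt) / n)   # standard error when p = p'
    if alternative == "greater":
        z_a = STD_NORMAL.inv_cdf(1 - alpha)
        return STD_NORMAL.cdf((p0 - p_alt + z_a * se_null) / se_alt)
    if alternative == "less":
        z_a = STD_NORMAL.inv_cdf(1 - alpha)
        return 1 - STD_NORMAL.cdf((p0 - p_alt - z_a * se_null) / se_alt)
    z_a2 = STD_NORMAL.inv_cdf(1 - alpha / 2)
    upper = (p0 - p_alt + z_a2 * se_null) / se_alt
    lower = (p0 - p_alt - z_a2 * se_null) / se_alt
    return STD_NORMAL.cdf(upper) - STD_NORMAL.cdf(lower)


if __name__ == "__main__":
    # Made-up data: 60 successes in 100 trials, testing H0: p = 0.5.
    z, p_value = proportion_z_test(60, 100, 0.5, alternative="greater")
    beta = type_ii_error(0.5, 0.6, 100, alpha=0.05, alternative="greater")
    print(f"z = {z:.3f}, P-value = {p_value:.4f}, beta(0.6) = {beta:.4f}")
```

A quick sanity check on the formulas: when p′ equals p0, each case returns 1 − α, which is the probability of not rejecting a true H0.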
Determination of the Sample Size
• If it is desired that the level α test also have β(p′) = β for a specified value of β, this equation can be solved for the necessary n, as in the tests for a population mean.

One-tailed test:
$$n = \left[\frac{z_{\alpha}\sqrt{p_0(1-p_0)} + z_{\beta}\sqrt{p'(1-p')}}{p' - p_0}\right]^2$$

Two-tailed test:
$$n = \left[\frac{z_{\alpha/2}\sqrt{p_0(1-p_0)} + z_{\beta}\sqrt{p'(1-p')}}{p' - p_0}\right]^2$$
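
As a rough illustration of the sample size formulas, the following Python sketch evaluates them and rounds up to the next whole observation. The function name `sample_size_for_proportion_test`, its `two_tailed` flag, and the example values (p0 = 0.5, p′ = 0.6, α = 0.05, β = 0.10) are hypothetical and chosen only for demonstration. Rounding up with `ceil` is an assumption made here so that the achieved β does not exceed the target.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_for_proportion_test(p0, p_alt, alpha, beta, two_tailed=False):
    """Smallest integer n given by the sample size formula for a level-alpha test."""
    if p_alt == p0:
        raise ValueError("p_alt must differ from p0")
    inv_cdf = NormalDist().inv_cdf  # inverse of the standard normal cdf
    # z_alpha for a one-tailed test, z_{alpha/2} for a two-tailed test
    z_alpha = inv_cdf(1 - alpha / 2) if two_tailed else inv_cdf(1 - alpha)
    z_beta = inv_cdf(1 - beta)
    numerator = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p_alt * (1 - p_alt))
    return ceil((numerator / (p_alt - p0)) ** 2)


if __name__ == "__main__":
    # Made-up design: detect p' = 0.6 against p0 = 0.5 with alpha = 0.05, beta = 0.10.
    n_one = sample_size_for_proportion_test(0.5, 0.6, 0.05, 0.10)
    n_two = sample_size_for_proportion_test(0.5, 0.6, 0.05, 0.10, two_tailed=True)
    print(f"one-tailed n = {n_one}, two-tailed n = {n_two}")
```

Because z_{α/2} > z_α, the two-tailed test needs a larger sample than the one-tailed test for the same α, β, p0, and p′.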