The Easiest Solution Isn’t Always the Best Solution, Even in Math
Should we always believe what we are taught in the classroom?

Purpose
• Statisticians use a well-selected sample to estimate an unknown value of a population.
• The unknown value may be the mean income, the proportion of defective products, or the proportion of "yes" responses.
• Estimating an unknown population proportion is the topic of interest here.

Background/symbols
• $p$ = population proportion (unknown)
• $n$ = number of subjects/objects randomly selected
• $x$ = number of subjects/objects in the sample with "yes" responses
• $\hat{p}$ = sample proportion = $x/n$
• Traditionally, $\hat{p}$ is used as an estimate of $p$.
• Is there a better alternative?
• We often provide an interval estimate of $p$: $\hat{p} \pm$ error of estimation. This is a confidence interval.
• A well-known choice is the 95% confidence interval.
• To determine the error, we need to understand how $\hat{p}$ varies from sample to sample.

About $\hat{p}$
• $p$ is a fixed value of the population, while
• $\hat{p}$ varies from sample to sample, and thus it has a distribution.

What we know
• Under certain conditions, $\hat{p}$ is approximately normally distributed with mean $p$ and standard deviation $\sqrt{p(1-p)/n}$.
• A normally distributed value can be converted to a standard normal score, called a z-score.
• A well-known result is that about 95% of z-scores fall between $-2$ and $2$.

Let's standardize
• $\hat{p}$ = (number of "yes" responses)$/n$
• $(\hat{p} - \text{mean}) / \text{std dev} = z$ (standard normal)
• $(\hat{p} - p) / \sqrt{p(1-p)/n} = z$
• $(\hat{p} - p) / \sqrt{p(1-p)/n} = \pm 2$ ($p$ is unknown)
• Note: we need to solve the above equation for $p$.
• Easy approach for non-math majors: solve only for the $p$ in the numerator:
  $p = \hat{p} - 2\sqrt{p(1-p)/n}$, $p = \hat{p} + 2\sqrt{p(1-p)/n}$
• Replacing the $p$ that remains under the square root with $\hat{p}$ gives
  $\left(\hat{p} - 2\sqrt{\hat{p}(1-\hat{p})/n},\ \hat{p} + 2\sqrt{\hat{p}(1-\hat{p})/n}\right)$
• This makes an approximate 95% confidence interval for $p$. (Mathematically incorrect)

Let's try again
• $(\hat{p} - p) / \sqrt{p(1-p)/n} = \pm 2$
• Solve it the right way by squaring both sides and solving the resulting quadratic equation for $p$.
• We get two solutions for $p$. (Mathematically tedious)
• Those two solutions make the 95% interval for $p$. This interval is very tedious to compute and lacks an intuitive explanation.
• If we take the average of those two solutions, we get
• $\tilde{p} = (\text{number of "yes" responses} + 2)/(n + 4)$ = our new estimate of $p$

A new interval of $p$
• Recall the old interval of $p$:
• $\left(\hat{p} - 2\sqrt{\hat{p}(1-\hat{p})/n},\ \hat{p} + 2\sqrt{\hat{p}(1-\hat{p})/n}\right)$
• An alternative interval of $p$:
• $\left(\tilde{p} - 2\sqrt{\tilde{p}(1-\tilde{p})/n},\ \tilde{p} + 2\sqrt{\tilde{p}(1-\tilde{p})/n}\right)$
• Recall $\tilde{p} = (\text{number of "yes" responses} + 2)/(n + 4)$
• $\hat{p} = (\text{number of "yes" responses})/n$

How good is the new interval?
• Simulation results coming soon.
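
The intervals from these slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the presenter's own code: the function names, the default z = 2, the example counts (15 "yes" out of 50), and the simulation settings are all hypothetical. The new interval divides by n under the square root, as written on the slides; the widely used "plus four" version divides by n + 4 instead, so that variant is available through an optional flag. The sketch also solves the squared equation directly, so one can check that the average of its two roots is (number of yes + 2)/(n + 4).

```python
import math
import random


def old_interval(x, n, z=2.0):
    """Old interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def quadratic_interval(x, n, z=2.0):
    """Both roots of (p_hat - p)^2 = z^2 * p * (1 - p) / n, solved for p."""
    p_hat = x / n
    k = z * z / n
    # Rearranged: (1 + k) p^2 - (2 p_hat + k) p + p_hat^2 = 0
    a = 1 + k
    b = -(2 * p_hat + k)
    c = p_hat ** 2
    root = math.sqrt(b * b - 4 * a * c)
    return (-b - root) / (2 * a), (-b + root) / (2 * a)


def new_interval(x, n, z=2.0, use_n_plus_4=False):
    """New interval centered at p_tilde = (x + 2) / (n + 4).

    Divides by n as on the slides; set use_n_plus_4=True (an assumption,
    not from the slides) to divide by n + 4 instead.
    """
    p_tilde = (x + 2) / (n + 4)
    denom = n + 4 if use_n_plus_4 else n
    half = z * math.sqrt(p_tilde * (1 - p_tilde) / denom)
    return p_tilde - half, p_tilde + half


def simulate_coverage(interval_fn, p, n, trials=2000, seed=0):
    """Fraction of simulated samples whose interval contains the true p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and count the "yes" responses.
        x = sum(rng.random() < p for _ in range(n))
        lo, hi = interval_fn(x, n)
        hits += lo <= p <= hi
    return hits / trials


if __name__ == "__main__":
    x, n = 15, 50  # hypothetical sample: 15 "yes" responses out of 50
    lo, hi = quadratic_interval(x, n)
    print("old interval:      ", old_interval(x, n))
    print("quadratic interval:", (lo, hi))
    print("average of roots:  ", (lo + hi) / 2)
    print("(x + 2) / (n + 4): ", (x + 2) / (n + 4))
    print("new interval:      ", new_interval(x, n))
    for fn in (old_interval, new_interval):
        print(fn.__name__, "coverage at p = 0.1, n = 30:",
              simulate_coverage(fn, p=0.1, n=30))
```

Running the script prints the three intervals side by side and a simulated coverage rate for the old and new intervals; changing p, n, trials, or seed in simulate_coverage gives a rough way to compare how often each interval captures the true proportion.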