HSA 523 Health Data Analysis
Dr. Robert Jantzen
Homework 6 Answer Key

NOTE: click here for a link to an MS-Excel spreadsheet that will calculate confidence intervals and sample sizes for means and proportions. Click here for a link to an MS-Excel spreadsheet that will calculate critical z & t scores for differing significance levels.

1. If you use the large sample size formula to compute a confidence interval (CI) for a population proportion, what is the appropriate z critical value for each of the following confidence levels?
a. For 99%, z = 2.58
b. For 95%, z = 1.96
c. For 90%, z = 1.645

2. The use of the large sample size confidence interval for a population proportion requires a sufficient sample size. For each of the following combinations of n (sample size) and ps (sample proportion), test whether the large sample size confidence interval should be used.
Both n(ps) and n(1-ps) must be >= 5 to use the large sample confidence interval calculation.
a. n=50 and ps=.3: OK (n(ps) = 15, n(1-ps) = 35)
b. n=50 and ps=.05: Not OK (n(ps) = 2.5)
c. n=15 and ps=.45: OK (n(ps) = 6.75, n(1-ps) = 8.25)
d. n=100 and ps=.01: Not OK (n(ps) = 1)

3. A study of 39 bartenders who had respiratory problems before smoking was banned in bars in San Francisco found that 21 were symptom free two months after the ban was instituted. Assuming that the sample was a random one, is the sample large enough for constructing a confidence interval (CI) for the population proportion of bartenders with respiratory problems who were symptom free 2 months after the smoking ban? Construct the 95% CI and interpret.

  Sample Size                          39
  Number of Successes                  21
  Confidence Level                     95%
Intermediate Calculations
  Sample Proportion                    0.538461538
  Z Value                              -1.95996279
  Standard Error of the Proportion     0.079826849
  Interval Half Width                  0.156457654
Confidence Interval
  Interval Lower Limit                 0.382003884
  Interval Upper Limit                 0.694919193

The sample is large enough because n(ps) = 21 and n(1-ps) = 18 are both >= 5.
The 95% CI is .382 to .695, which means we are 95% confident that the true proportion of all bartenders with respiratory problems who would be symptom free after the smoking ban lies in that interval.

4. Suppose you want to estimate the fraction of the population of drivers who use seatbelts, with 95% confidence, to within .02 or 2% points. What is the required sample size?
Need a sample size of 2401.

Data
  Estimate of True Proportion          0.5
  Sampling Error                       0.02
  Confidence Level                     95%
Intermediate Calculations
  Z Value                              1.95996279
  Calculated Sample Size               2400.90883
Result
  Sample Size Needed                   2401

5. What is the appropriate critical t value for estimating a population mean for each of the following confidence levels and sample sizes?
You can use either the calculator or the t table at n-1 degrees of freedom to find these answers.
a. 95% and n=17: t = 2.12
b. 90% and n=12: t = 1.80
c. 99% and n=24: t = 2.81
d. 90% and n=25: t = 1.71
e. 95% and n=10: t = 2.26

6. The following summary statistics were obtained from a random sample of 47 students who did not return to college after the first semester and a random sample of 257 who did:

Number of hours worked per week during first semester:
                        Mean     Standard Deviation
  Nonpersisters (47)    25.6     14.4
  Persisters (257)      18.1     15.3

a. Compute 99% confidence intervals for the mean number of hours worked for both groups. Interpret each CI.
b. Do the two CIs suggest that the two group means are different or the same? Why?
                                   Nonpersisters    Persisters
Data
  Sample Standard Deviation        14.4             15.3
  Sample Mean                      25.6             18.1
  Sample Size                      47               257
  Confidence Level                 99%              99%
Intermediate Calculations
  Standard Error of the Mean       2.100455878      0.954387778
  Degrees of Freedom               46               256
  t Value                          2.687011147      2.595170372
  Interval Half Width              5.643948357      2.476798885
Confidence Interval
  Interval Lower Limit             19.96            15.62
  Interval Upper Limit             31.24            20.58

Each 99% CI is the interval in which we are 99% confident that the true population mean lies: 19.96 to 31.24 hours for nonpersisters and 15.62 to 20.58 hours for persisters. Since the intervals overlap (between 19.96 and 20.58), the two population means might be the same.

7. If you wanted to know the average number of days per week that current Iona College students drink alcohol, how could you find out?
a. Describe who you would sample and what question(s) you would ask.
We could survey a random sample of students and ask questions concerning alcohol use, such as "How many days did you drink alcohol last week?"
b. How many students would you survey, assuming you wanted your estimate to be within 1 day of the actual mean frequency with a 95% confidence level? Describe in detail your sample size design.
Since the range is 7 days (max-min), the standard deviation can be approximated as 7/6 or 1.17. The needed sample size is 6.

Data
  Population Standard Deviation        1.17
  Sampling Error                       1
  Confidence Level                     95%
Intermediate Calculations
  Z Value                              -1.95996279
  Calculated Sample Size               5.258566556
Result
  Sample Size Needed                   6

8. The SPSS file Stafserv.sav contains scope of service information for a random sample of 50 US hospitals. Specifically, the file lists how many specialized services (as reported by the American Hospital Directory) each hospital provided in 1996.
i. Double-click here to load the file into SPSS.
If the SPSS program doesn't start up automatically, download and save the Stafserv.sav file by right-clicking on the highlighted file name, clicking on Save Target As, and "maneuvering" the Save In screen to a folder like C:\Temp. Note: this file might be too big to save to a floppy or your U: drive on the Iona network. After saving the file, start up SPSS, then click on Open an Existing File, then OK, then the down-triangle next to the Look In box, and then on the drive and folder where you saved the file (namely C:\Temp). SPSS will then display the files in the selected location, and you can click on your file name and then Open to open it.

Use the file and SPSS to:
a. Generate a 95% confidence interval for the average number of specialized services provided by US private hospitals. To generate the confidence interval, use the Analyze > Descriptive Statistics > Explore sequence of commands and move the stafserv variable into the Dependent List box. Click on Statistics and note that the Descriptives - Confidence Interval for Mean option is already selected with a 95% confidence level. Then click on Continue > OK. Interpret the confidence interval.
b. Using the sample's estimated mean and standard deviation for the number of licensed beds, compute the 95% confidence interval for the population mean. Does your estimate agree with the SPSS calculation?

a. We are 95% confident that the population mean number of services lies between 21.48 and 25.20.
b. The SPSS and Excel CI calculations agree.

SPSS Descriptives: # of staffed services
                                                    Statistic    Std. Error
  Mean                                              23.3400      .92432
  95% Confidence Interval for Mean   Lower Bound    21.4825
                                     Upper Bound    25.1975
  5% Trimmed Mean                                   23.3556
  Median                                            23.0000
  Variance                                          42.719
  Std. Deviation                                    6.53596
  Minimum                                           11.00
  Maximum                                           36.00
  Range                                             25.00
  Interquartile Range                               9.00
  Skewness                                          -.017        .337
  Kurtosis                                          -.641        .662

Excel calculation
Data
  Sample Standard Deviation            6.536
  Sample Mean                          23.34
  Sample Size                          50
  Confidence Level                     95%
Intermediate Calculations
  Standard Error of the Mean           0.924329984
  Degrees of Freedom                   49
  t Value                              2.009574018
  Interval Half Width                  1.857509521
Confidence Interval
  Interval Lower Limit                 21.48
  Interval Upper Limit                 25.20
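
The spreadsheet calculations used throughout this answer key can also be checked by hand or in code. The following is a minimal Python sketch, not the course spreadsheet itself: the function names are made up for illustration, only the standard library is used, and the critical t value is passed in as a number read from the t table or the answer key, because the standard library has no t distribution.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical z value for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def large_sample_ok(n, p):
    """Question 2 rule: n*p and n*(1-p) must both be at least 5."""
    return n * p >= 5 and n * (1 - p) >= 5


def proportion_ci(successes, n, confidence):
    """Large-sample CI for a proportion: p +/- z * sqrt(p(1-p)/n)."""
    p = successes / n
    half_width = z_critical(confidence) * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def sample_size_proportion(p_guess, error, confidence):
    """Sample size for a proportion: z^2 * p(1-p) / e^2, rounded up."""
    z = z_critical(confidence)
    return ceil(z ** 2 * p_guess * (1 - p_guess) / error ** 2)


def mean_ci(mean, sd, n, t_crit):
    """CI for a mean: mean +/- t * sd/sqrt(n); t_crit comes from a t table."""
    half_width = t_crit * sd / sqrt(n)
    return mean - half_width, mean + half_width


def sample_size_mean(sd, error, confidence):
    """Sample size for a mean: (z * sd / e)^2, rounded up."""
    return ceil((z_critical(confidence) * sd / error) ** 2)


# Inputs below are taken from questions 3, 4, 6, 7 and 8.
print(proportion_ci(21, 39, 0.95))              # question 3
print(sample_size_proportion(0.5, 0.02, 0.95))  # question 4
print(mean_ci(25.6, 14.4, 47, 2.687011147))     # question 6, nonpersisters
print(mean_ci(18.1, 15.3, 257, 2.595170372))    # question 6, persisters
print(sample_size_mean(1.17, 1, 0.95))          # question 7
print(mean_ci(23.34, 6.536, 50, 2.009574018))   # question 8
```

Rounding the sample size up rather than to the nearest integer matches the answers above (2400.9 becomes 2401 and 5.26 becomes 6), since a smaller sample would not meet the stated margin of error.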