Exhale Data Analysis Project
Math 1040
Autumn Beckstead
Summer 2014

Part I:
The data set that I chose is the Exhale study. This data set compares forced expiratory volume, in liters, among people of different ages, sexes, heights, and smoking status. Throughout the project, different charts and formulas are used to form and test hypotheses about the data.

Part II:
Pie Chart & Pareto Chart for Entire Gender Population:
In a population of 654 subjects, 318 are female, which corresponds to 48.62% of the population, and 336 are male, which corresponds to 51.38% of the population.

Pie Chart & Pareto Chart for Simple Random Sampling:
To obtain the simple random sample, I first determined that the sample size should be 50. Then I used the StatCrunch sample generator to draw the sample.

Pie Chart & Pareto Chart for Systematic Random Sampling:
To obtain the systematic sample, I first determined that the sample size should be 50. Then I used Excel and selected every 12th number.

Comparing Results:
When comparing the results with the entire population, I found that the simple random sample was the closest to the entire population. The systematic sample deviated more from the results of the entire population.

Part III:
Mean, Standard Deviation, Five-Number Summary, Frequency Histogram, and Box Plot for Entire Height Population:

Column            Mean        Std. dev.
Height - inches   61.143578   5.7035128

Five Number Summary
Column            Min   Q1   Median   Q3     Max
Height - inches   46    57   61.5     65.5   74

Mean, Standard Deviation, Five-Number Summary, Frequency Histogram, and Box Plot for Random Sampling of Height Population:
To obtain the simple random sample, I first determined that the sample size should be 50. Then I used the StatCrunch sample generator to draw the sample.

Column                        Mean    Std. dev.
Height Simple Random Sample   61.11   4.708102

Five Number Summary
Column                        Min   Q1   Median   Q3   Max
Height Simple Random Sample   51    58   61.5     64   70

Mean, Standard Deviation, Five-Number Summary, Frequency Histogram, and Box Plot for Systematic Sampling of Height Population:
To obtain the systematic sample, I first determined that the sample size should be 50. Then I used Excel and selected every 12th number.

Column                     Mean   Std. dev.
Height Systematic Sample   59.6   5.4398379

Five Number Summary
Column                     Min   Q1   Median   Q3   Max
Height Systematic Sample   48    55   60.75    63   71

Comparing Results:
When comparing the results to the entire population, the frequency histogram of the systematic sample looks very similar to that of the entire population, whereas the box plot of the simple random sample looks most similar to the box plot of the entire height population.

Part IV:
Simple Random Sample Categorical Variable (Sample 1): Confidence Interval
We are 95% confident that the population proportion is between 0.38 and 0.66.

Systematic Sample Categorical Variable (Sample 2): Confidence Interval
We are 95% confident that the population proportion is between 0.26 and 0.54.

Simple Random Sample Quantitative Variable (Sample 3): Confidence Interval for Mean
We are 95% confident that the population mean is between 59.77 and 62.45 inches.

Systematic Sample Quantitative Variable (Sample 4): Confidence Interval for Mean
We are 95% confident that the population mean is between 58.05 and 61.15 inches.

Simple Random Sample (Sample 3): Confidence Interval for Standard Deviation
We are 95% confident that the population standard deviation is between 3.13 and 6.29 inches.

Systematic Sample (Sample 4): Confidence Interval for Standard Deviation
We are 95% confident that the population standard deviation is between 3.86 and 7.02 inches.

Interpreting the Data:
Confidence intervals are what we use to estimate the true value of the population parameter.
If we took many different samples and built an interval at the same confidence level from each one, about that percentage of the intervals would contain the population parameter. Here the population values are known from Parts II and III (population proportion 48.62%, mean height 61.14 inches, standard deviation 5.70 inches), and every interval above contains the corresponding population value.

Simple Random Sample Categorical Variable (Sample 1): Hypothesis Test
We fail to reject the null hypothesis; the sample does not provide sufficient evidence against the claim that the true population proportion is 48.62%.

Systematic Sample Categorical Variable (Sample 2): Hypothesis Test
We fail to reject the null hypothesis; the sample does not provide sufficient evidence against the claim that the true population proportion is 48.62%.

Simple Random Sample Quantitative Variable (Sample 3): Hypothesis Test
We fail to reject the null hypothesis; the sample does not provide sufficient evidence against the claim that the true population mean is 61.14 inches.

Systematic Sample Quantitative Variable (Sample 4): Hypothesis Test
We reject the null hypothesis; the sample provides sufficient evidence against the claim that the true population mean is 61.14 inches.
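
The proportion and mean intervals in Part IV, and the test statistics behind the hypothesis tests, can be sketched from the summary statistics alone. The Python below is an illustration under stated assumptions, not the StatCrunch procedure itself: the sample proportions 0.52 and 0.40 are inferred from the midpoints of the reported intervals, the critical value 2.0096 is the table value of Student's t for 49 degrees of freedom, and all function names are made up for this sketch. The report does not state the significance level or the alternative hypothesis, so the sketch stops at the test statistic and does not make a decision.

```python
import math
from statistics import NormalDist

N = 50  # sample size used for every sample in the report

# 97.5th percentile of the standard normal distribution (about 1.96)
Z_STAR = NormalDist().inv_cdf(0.975)
# Assumption: table value of Student's t at the 97.5th percentile, 49 df
T_STAR = 2.0096


def proportion_interval(p_hat, n=N, z=Z_STAR):
    """Normal-approximation 95% interval for a population proportion."""
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def mean_interval(mean, sd, n=N, t=T_STAR):
    """95% t interval for a population mean from summary statistics."""
    margin = t * sd / math.sqrt(n)
    return mean - margin, mean + margin


def z_statistic(p_hat, p0, n=N):
    """One-sample z statistic for H0: population proportion equals p0."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


def t_statistic(mean, sd, mu0, n=N):
    """One-sample t statistic for H0: population mean equals mu0."""
    return (mean - mu0) / (sd / math.sqrt(n))


# Sample proportions are assumptions (interval midpoints); means and
# standard deviations are the values reported in Part III.
print("Sample 1 proportion CI:", proportion_interval(0.52))
print("Sample 2 proportion CI:", proportion_interval(0.40))
print("Sample 3 mean CI:", mean_interval(61.11, 4.708102))
print("Sample 4 mean CI:", mean_interval(59.6, 5.4398379))

print("Sample 1 z:", z_statistic(0.52, 0.4862))
print("Sample 2 z:", z_statistic(0.40, 0.4862))
print("Sample 3 t:", t_statistic(61.11, 4.708102, 61.143578))
print("Sample 4 t:", t_statistic(59.6, 5.4398379, 61.143578))
```

Rounded to two decimals, the four intervals agree with the ones reported in Part IV. The standard deviation intervals are left out because the report does not say how they were computed.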