Exhale Data Analysis Project Math 1040

Autumn Beckstead
Summer 2014
Part I:
The data set that I chose is the Exhale study. It compares forced expiratory volume,
measured in liters, among people of different ages, sexes, and heights, and between
smokers and nonsmokers. Throughout the project, different charts and formulas are
used to describe the data and test hypotheses about it.
Part II:
Pie Chart & Pareto Chart for entire Gender Population:
In a population of 654 subjects, 318 are female, which corresponds to 48.62% of the
population, and 336 are male, which corresponds to 51.38% of the population.
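The percentages above follow directly from the counts. A minimal Python sketch of the arithmetic (the variable names are illustrative, not from the study):

```python
# Counts reported for the entire study population.
female, male = 318, 336
total = female + male

# Share of each group, as a percentage rounded to two decimals.
pct_female = round(100 * female / total, 2)
pct_male = round(100 * male / total, 2)
print(total, pct_female, pct_male)
```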
Pie Chart & Pareto Chart for Simple Random Sampling:
To obtain a simple random sample, I first decided that the sample size should be 50.
Then I used the StatCrunch sample generator to draw the sample.
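The sample here was drawn with StatCrunch. As a rough equivalent, a simple random sample of 50 rows could be sketched in Python as follows. The list `subject_ids` is a hypothetical stand-in for the 654 rows of the data set, and the seed is arbitrary.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n items without replacement so every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical row numbers 1..654 standing in for the study's subjects.
subject_ids = list(range(1, 655))
srs = simple_random_sample(subject_ids, 50, seed=1040)
print(sorted(srs))
```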
Pie Chart & Pareto Chart for Systematic Random Sampling:
To obtain a systematic sample, I first decided that the sample size should be 50. Then I
used Excel and selected every 12th entry.
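The selection was done in Excel. A minimal Python sketch of the same idea is below. The random starting row and the truncation to the first 50 selected rows are assumptions, since the report does not say how those details were handled; `subject_ids` is a hypothetical stand-in for the 654 rows.

```python
import random

def systematic_sample(population, k, n, seed=None):
    """Pick a random start among the first k rows, then take every k-th row, up to n rows."""
    rng = random.Random(seed)
    start = rng.randrange(k)          # assumed: random start in rows 0..k-1
    return population[start::k][:n]   # assumed: keep only the first n selected rows

# Hypothetical row numbers 1..654 standing in for the study's subjects.
subject_ids = list(range(1, 655))
sys_sample = systematic_sample(subject_ids, k=12, n=50, seed=1040)
print(sys_sample)
```

With 654 rows and a step of 12, the stride yields 54 or 55 rows, which is why the sketch truncates to 50.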
Comparing Results:
When comparing the results with the entire population, I found that the simple random
sample was the closest to the entire population. The systematic sample differed more
from the results of the entire population.
Part III:
Mean, Standard Deviation, Five-Number Summary, Frequency Histogram, and Box
Plot for Entire Height Population.
Column            Mean        Std. dev.
Height - inches   61.143578   5.7035128
Five Number Summary
Column            Min   Q1   Median   Q3     Max
Height - inches   46    57   61.5     65.5   74
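The statistics in these tables came from StatCrunch. As an illustration of what is being computed, here is a small Python sketch using the standard `statistics` module. The `heights` list is made up for the example and is not the study's data, and the quartile convention used here may differ slightly from StatCrunch's.

```python
import statistics

def describe(values):
    """Return the mean, sample standard deviation, and five-number summary."""
    data = sorted(values)
    # quantiles(n=4) returns the three quartile cut points (default "exclusive" method).
    q1, median, q3 = statistics.quantiles(data, n=4)
    return {
        "mean": statistics.mean(data),
        "stdev": statistics.stdev(data),  # sample SD, n - 1 in the denominator
        "five_number": (data[0], q1, median, q3, data[-1]),
    }

# Made-up heights in inches, NOT the study's data.
heights = [46, 52, 57, 58, 60, 61, 62, 64, 65.5, 68, 74]
summary = describe(heights)
print(summary)
```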
Mean, Standard Deviation, Five-Number Summary, Frequency Histogram, and Box
Plot for Random Sampling of Height Population
To obtain a simple random sample, I first decided that the sample size should be 50.
Then I used the StatCrunch sample generator to draw the sample.
Column                        Mean    Std. dev.
Height Simple Random Sample   61.11   4.708102
Five Number Summary
Column                        Min   Q1   Median   Q3   Max
Height Simple Random Sample   51    58   61.5     64   70
Mean, Standard Deviation, Five-Number Summary, Frequency Histogram, and Box
Plot for Systematic Sampling of Height Population.
To obtain a systematic sample, I first decided that the sample size should be 50. Then I
used Excel and selected every 12th entry.
Column                     Mean   Std. dev.
Height Systematic Sample   59.6   5.4398379
Five Number Summary
Column                     Min   Q1   Median   Q3   Max
Height Systematic Sample   48    55   60.75    63   71
Comparing Results:
When comparing the results to the entire population, the systematic sample frequency
histogram looks very similar to that of the entire population, whereas the simple
random sample box plot looks most similar to the entire height population box plot.
Part IV:
Simple Random Sample Categorical Variable (Sample 1): Confidence Interval
We are 95% confident that the population proportion is between 0.38 and 0.66.
Systematic Sample Categorical Variable (Sample 2): Confidence Interval
We are 95% confident that the population proportion is between 0.26 and 0.54.
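The two proportion intervals can be reproduced with the usual normal-approximation formula, p-hat plus or minus z times sqrt(p-hat(1 - p-hat)/n). The sketch below assumes sample counts of 26 out of 50 for Sample 1 and 20 out of 50 for Sample 2. Those counts are not stated in the report; they are the ones consistent with the interval midpoints of 0.52 and 0.40.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation (Wald) confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Assumed counts (not given in the report): 26 of 50 and 20 of 50.
ci_sample1 = proportion_ci(26, 50)
ci_sample2 = proportion_ci(20, 50)
print([round(x, 2) for x in ci_sample1])
print([round(x, 2) for x in ci_sample2])
```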
Simple Random Sample Quantitative Variable (Sample 3): Confidence Interval for Mean:
We are 95% confident that the population mean height is between 59.77 and 62.45 inches.
Systematic Sample Quantitative Variable (Sample 4): Confidence Interval for Mean:
We are 95% confident that the population mean height is between 58.05 and 61.15 inches.
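The mean intervals follow from the sample means and standard deviations in Part III using x-bar plus or minus t times s/sqrt(n). In this sketch the critical value 2.0096 (two-sided 95%, 49 degrees of freedom) is taken from a t table and hard-coded, which is an assumption about how the intervals were produced.

```python
from math import sqrt

# Two-sided 95% t critical value for 49 degrees of freedom (table value, assumed).
T_CRIT_DF49 = 2.0096

def mean_ci(sample_mean, sample_stdev, n, t_crit):
    """t-based confidence interval for a population mean."""
    margin = t_crit * sample_stdev / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Sample statistics from the Part III tables.
ci_sample3 = mean_ci(61.11, 4.708102, 50, T_CRIT_DF49)
ci_sample4 = mean_ci(59.6, 5.4398379, 50, T_CRIT_DF49)
print([round(x, 2) for x in ci_sample3])
print([round(x, 2) for x in ci_sample4])
```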
Simple Random Sample (Sample 3): Confidence Interval for Standard Deviation:
We are 95% confident that the population standard deviation is between 3.13 and 6.29.
Systematic Sample (Sample 4): Confidence Interval for Standard Deviation:
We are 95% confident that the population standard deviation is between 3.86 and 7.02.
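One standard way to build an interval for a standard deviation is the chi-square method, which assumes the heights are roughly normally distributed. The report does not say how its intervals were computed, so this sketch may not reproduce the reported bounds. The chi-square critical values for 49 degrees of freedom are hard-coded table values.

```python
from math import sqrt

# Chi-square critical values for 49 degrees of freedom (table values, assumed).
CHI2_LOWER = 31.555  # 2.5th percentile
CHI2_UPPER = 70.222  # 97.5th percentile

def stdev_ci(sample_stdev, n, chi2_lower, chi2_upper):
    """Chi-square confidence interval for a population standard deviation."""
    sum_sq = (n - 1) * sample_stdev ** 2
    return sqrt(sum_sq / chi2_upper), sqrt(sum_sq / chi2_lower)

# Sample 3 standard deviation from the Part III table.
sd_ci_sample3 = stdev_ci(4.708102, 50, CHI2_LOWER, CHI2_UPPER)
print([round(x, 2) for x in sd_ci_sample3])
```

Unlike the intervals for a mean or proportion, this interval is not symmetric around the sample standard deviation.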
Interpreting the Data:
Confidence intervals are used to estimate the true value of a population parameter. If
many different samples were drawn and an interval was built from each at the same
confidence level, about that percentage of the intervals would contain the population
parameter. In this project, each interval above contains the corresponding value
computed from the entire population, so every sample captured the population parameter.
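The "percentage of intervals" idea can be checked by simulation. The sketch below rebuilds the gender variable from the reported counts (318 female, 336 male), draws many simple random samples of 50, and counts how often a 95% interval contains the true proportion. The number of trials and the seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage_rate(population, n, trials, confidence=0.95, seed=0):
    """Fraction of repeated samples whose Wald interval contains the true proportion."""
    rng = random.Random(seed)
    true_p = sum(population) / len(population)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(trials):
        sample = rng.sample(population, n)  # simple random sample, no replacement
        p_hat = sum(sample) / n
        margin = z * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / trials

# 1 = female, 0 = male, using the counts reported for the full study.
population = [1] * 318 + [0] * 336
rate = coverage_rate(population, n=50, trials=2000)
print(rate)
```

The printed rate should land near 0.95, though it will not equal it exactly because the interval is an approximation and the simulation is random.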
Simple Random Sample Categorical Variable (Sample 1): Hypothesis Test
We fail to reject the null hypothesis because there is not enough evidence to conclude
that the true population proportion differs from 48.62%.
Systematic Sample Categorical Variable (Sample 2): Hypothesis Test
We fail to reject the null hypothesis because there is not enough evidence to conclude
that the true population proportion differs from 48.62%.
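A one-proportion z-test is a common way to reach the two decisions above. The sketch assumes a two-sided test at the 0.05 level and sample counts of 26 of 50 and 20 of 50. The counts are not stated in the report; they are inferred from the confidence interval midpoints.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Two-sided one-proportion z-test; returns the z statistic and p-value."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)  # standard error under the null hypothesis
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

P0 = 318 / 654  # null value: population proportion, about 48.62%

# Assumed counts (not given in the report): 26 of 50 and 20 of 50.
z1, p1 = proportion_z_test(26, 50, P0)
z2, p2 = proportion_z_test(20, 50, P0)
print(round(z1, 2), round(p1, 2))
print(round(z2, 2), round(p2, 2))
```

Both p-values exceed 0.05 under these assumptions, which matches the decision to fail to reject in each case.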
Simple Random Sample Quantitative Variable (Sample 3): Hypothesis Test
We fail to reject the null hypothesis because there is not enough evidence to conclude
that the true population mean differs from 61.14 inches.
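For Sample 3, the decision can be illustrated with a one-sample t statistic built from the Part III values. The sketch assumes a two-sided test at the 0.05 level and uses a hard-coded table value for the critical t with 49 degrees of freedom; the report does not state which test was run.

```python
from math import sqrt

# Two-sided 95% t critical value for 49 degrees of freedom (table value, assumed).
T_CRIT_DF49 = 2.0096

def one_sample_t(sample_mean, sample_stdev, n, mu0):
    """t statistic for testing whether the population mean equals mu0."""
    return (sample_mean - mu0) / (sample_stdev / sqrt(n))

# Sample 3 statistics and the population mean from Part III.
t_stat = one_sample_t(61.11, 4.708102, 50, 61.143578)
reject = abs(t_stat) > T_CRIT_DF49
print(round(t_stat, 3), reject)
```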
Systematic Sample Quantitative Variable (Sample 4): Hypothesis Test
We reject the null hypothesis because there is enough evidence to conclude that the
true population mean differs from 61.14 inches.