Statistics 101L – Laboratory 3

The first activity looks at the distribution of the heights of students in this class. The
other activities look at distributions of variables to see if the data could have come from a
Normal model.
Activity 1: On the first day of class we talked about the heights of students in the class.
In last week's lab we measured those heights, in cm. Now we will look at the heights and
see if gender can explain some of the variation in the heights of students. The data from
last week’s lab is on the course web site as a JMP data set called Height.JMP. Open the
JMP data set and use JMP to analyze the distribution of height and the distribution of
height by gender. Be sure to follow the suggestions from previous labs, homework and
class on what should be included in the output and how the output should look. Turn in
this output with your lab.
1. Describe the distribution of heights of students in the class. Make sure to discuss
shape, center and spread and the presence of any outliers. Also, discuss the values
of the numerical summaries for this distribution and what they tell you about the
distribution. Recall that some numerical summaries, e.g. range, are not calculated
by JMP but that you can calculate them from what JMP gives you.
2. Discuss the similarities and differences you see in the distributions of heights of
the male and female students. Again be sure to discuss shape, center and spread
and support your comparisons by referring to graphical and numerical summaries
in the JMP output.
3. What are some other variables that could account for variation in heights? How
could you investigate whether one of these other variables could account for
variation in heights?
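The numerical summaries asked for in question 1, including the range that JMP does not report, can also be checked outside JMP. The following is a minimal Python sketch and not part of the JMP workflow for this lab. The records are made-up values, not the class data in Height.JMP, and the function names summarize and summarize_by_group are hypothetical. Quartiles computed here may differ slightly from JMP's, because software packages use different interpolation rules.

```python
import statistics


def summarize(values):
    """Return numerical summaries for one list of heights (in cm)."""
    ordered = sorted(values)
    # Quartiles by the "inclusive" interpolation rule; JMP may use another rule.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "n": len(ordered),
        "mean": statistics.mean(ordered),
        "std_dev": statistics.stdev(ordered),  # sample standard deviation
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
        "range": ordered[-1] - ordered[0],  # maximum minus minimum
        "iqr": q3 - q1,                     # third quartile minus first quartile
    }


def summarize_by_group(records):
    """Split (height, group) records by group and summarize each group."""
    groups = {}
    for height, group in records:
        groups.setdefault(group, []).append(height)
    return {group: summarize(heights) for group, heights in groups.items()}


# Hypothetical (height_cm, gender) records, NOT the class data.
records = [
    (162, "F"), (158, "F"), (170, "F"), (165, "F"),
    (181, "M"), (175, "M"), (188, "M"), (172, "M"),
]

overall = summarize([height for height, _ in records])
print("All students:", overall)
for gender, summary in summarize_by_group(records).items():
    print(gender, summary)
```

Comparing the per-group centers and spreads with the overall summaries is one way to see how much of the variation in height the grouping variable accounts for.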
Activity 2: As part of this lab you are given JMP output for 3 samples of data, Data Set
#1, Data Set #2 and Data Set #3. Each data set contains 100 observations. Different
models were used to generate these data sets. You are to determine if a Normal model or
some other model was used to generate each sample data set. In other words, could the data in
each set have come from a population that can be modeled using a Normal model?
Because the data sets are generated from known models, the population means and
population standard deviations are known. For each data set the value of the population
mean μ and the population standard deviation σ are given in the table below.
Data Set    #1    #2    #3
μ           1     5     50
σ           1     1     1
For each data set, answer questions 1 – 3 and 5 below.
1. Describe the shape of the histogram. What does this indicate about the probable
shape of the distribution for the entire population?
2. Describe the overall pattern of the normal quantile plot.
3. If a Normal model was used to generate a data set, the 68-95-99.7 rule should
apply, roughly, to the 100 observations. Using μ and σ from the table above,
determine what percentage of the 100 observations are within 1, 2, and 3 standard
deviations of the population mean. What does this indicate about the probable
shape of the distribution for the entire population?
4. After looking at the shapes of the histograms for the three data sets, can you
determine what properties the normal quantile plot will have if the data were
generated from a Normal model? How about if the data were generated from a
different type of model?
5. Based on your answers to questions 1 – 3 above, do you believe a Normal
model was used to generate the 100 observations in each of the data sets? Explain
your answer briefly.
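The count asked for in question 3 can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the two samples are generated here with hypothetical values μ = 10 and σ = 2, which are not the values in the lab's table, and the function names percent_within and empirical_rule_table are made up for this sketch. The skewed sample is a shifted exponential chosen so that it has the same population mean and standard deviation as the Normal sample.

```python
import random


def percent_within(values, mu, sigma, k):
    """Percentage of values within k population standard deviations of mu."""
    lower = mu - k * sigma
    upper = mu + k * sigma
    inside = sum(1 for v in values if lower <= v <= upper)
    return 100.0 * inside / len(values)


def empirical_rule_table(values, mu, sigma):
    """Percentages within 1, 2 and 3 standard deviations of the mean."""
    return {k: percent_within(values, mu, sigma, k) for k in (1, 2, 3)}


# Hypothetical population values, NOT the ones given in the lab's table.
mu, sigma = 10.0, 2.0
rng = random.Random(101)  # fixed seed so the sketch is repeatable

# 100 observations from a Normal model with mean mu and standard deviation sigma.
normal_sample = [rng.gauss(mu, sigma) for _ in range(100)]

# 100 observations from a right-skewed model with the same mean and standard
# deviation: an exponential with mean 2, shifted up by 8.
skewed_sample = [8.0 + rng.expovariate(1 / 2.0) for _ in range(100)]

for name, sample in [("Normal model", normal_sample), ("Skewed model", skewed_sample)]:
    table = empirical_rule_table(sample, mu, sigma)
    print(name, {k: f"{pct:.0f}%" for k, pct in table.items()})
```

For data from a Normal model the three percentages should land near 68, 95 and 99.7, allowing for sampling variation in only 100 observations. For the shifted exponential model the long-run percentage within 1 standard deviation is about 86%, so the first percentage is usually the one that gives the model away.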
Activity 3: You have JMP output for the total weight of a sample of 325 Fun Size Bags
of M&Ms. Could these weights have come from a population that could be modeled
using a Normal model? For this data set, answer the following questions.
1. Describe the shape of the histogram. What does this indicate about the probable
shape of the distribution for the entire population?
2. What is the overall pattern for the normal quantile plot? What does this indicate
about whether the distribution of total weight could be modeled with a Normal
model?
3. Using your answers to questions 1 and 2 above, do you believe the distribution of
the total weight of Fun Size Bags of M&Ms can be modeled with a Normal
model? Explain your answer.
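To make the idea behind a normal quantile plot concrete, the sketch below pairs each sorted observation with the standard Normal quantile for its rank. It is a minimal Python illustration under stated assumptions, not how JMP draws the plot: the plotting position (i - 0.5)/n is one common convention and JMP may use a different one, the function names are hypothetical, and the two short lists are made-up numbers, not the M&M weights. The correlation is a supplementary numerical check that the lab itself does not ask for.

```python
from statistics import NormalDist


def normal_quantile_points(values):
    """Pair each sorted observation with a standard Normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # (i - 0.5) / n is an assumed plotting position for the i-th smallest value.
    return [
        (std_normal.inv_cdf((i - 0.5) / n), x)
        for i, x in enumerate(ordered, start=1)
    ]


def quantile_plot_correlation(points):
    """Pearson correlation between the Normal quantiles and the data values."""
    n = len(points)
    mean_z = sum(z for z, _ in points) / n
    mean_x = sum(x for _, x in points) / n
    sxy = sum((z - mean_z) * (x - mean_x) for z, x in points)
    szz = sum((z - mean_z) ** 2 for z, _ in points)
    sxx = sum((x - mean_x) ** 2 for _, x in points)
    return sxy / (szz * sxx) ** 0.5


# Made-up illustrative values, NOT the M&M weights from the lab.
roughly_symmetric = [46, 48, 49, 50, 50, 51, 51, 52, 53, 55]
right_skewed = [1, 1, 2, 2, 3, 3, 4, 6, 9, 15, 30]

for name, data in [("symmetric", roughly_symmetric), ("skewed", right_skewed)]:
    points = normal_quantile_points(data)
    print(name, round(quantile_plot_correlation(points), 3))
```

Points that fall close to a straight line, and so give a correlation near 1, are consistent with a Normal model. A curved pattern, such as the upward bend produced by a long right tail, suggests some other model. The plot itself remains the main evidence; the correlation is only a rough guide.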
Statistics 101L – Laboratory 3 – Answer Sheet
Names: _______________________
_______________________
_______________________
_______________________
Activity 1:
1. Describe the distribution of heights of students in the class and discuss the values
of the numerical summaries for this distribution and what they tell you about the
distribution.
2. Discuss the similarities and differences you see in the distributions of heights of
male and female students.
3. What are some other variables that could account for variation in heights? How
could you investigate whether one of these other variables could account for
variation in heights?
Activity 2:
                              Data Set #1    Data Set #2    Data Set #3
1. Shape
2. Pattern of Normal plot
3. 68-95-99.7 rule
4. Shape of histogram and properties of Normal quantile plot
5. Normal model?
Activity 3: Total Weight of Fun Size Bags of M&Ms
1. Describe the shape of the histogram. What does this indicate about the probable
shape of the distribution for the entire population?
2. What is the overall pattern for the normal quantile plot? What does this indicate
about whether the distribution of total weight could be modeled with a Normal
model?
3. Using your answers to questions 1 and 2 above, do you believe the distribution of
the total weight of Fun Size Bags of M&Ms can be modeled with a Normal
model? Explain your answer.