Math 311 Minitab Lab 2, Winter 2003

In this lab we will investigate the following question: does the age at which the first word
is spoken predict later performance on aptitude tests?
Let’s begin with some data. A study of the development of young children recorded the age in
months at which each of 21 children spoke their first word and their performance much later in
life on the Gesell Adaptive Test. The results are in the table below and can also be found on the
course webpage http://www.cwu.edu/~englundt/Data.htm
Child   Age (months)   Gesell Score      Child   Age (months)   Gesell Score
  1          15              95            12          9              96
  2          26              71            13         10              83
  3          10              83            14         11              84
  4           9              91            15         11             102
  5          15             102            16         10             100
  6          20              87            17         12             105
  7          18              93            18         42              57
  8          11             100            19         17             121
  9           8             104            20         11              86
 10          20              94            21         10             100
 11           7             113
Question 1: Make a scatterplot of this data. To do so, select Graph>Plot. Remember, we are
using the age at which the first word is spoken to predict later performance on the Gesell exam. So
enter the response variable (you have to determine which is the response variable) in the Y
column and the explanatory variable in the X column. Lastly, select OK.
Looking only at the scatterplot and before performing any further statistical analysis, does it seem
to you that the variables have a strong association? Record your impressions in your report.
Specify exactly what kind of association you think these variables have. Of course, you should
include the graph in your report. We will make your impressions more precise throughout this
worksheet.
Question 2: Compute the correlation (r) of the data. To get MINITAB to compute the
correlation, do the following: select Stat>Basic Statistics>Correlation from the menu. A
dialogue box will then open. Click on the Variables box and then double-click on AGE and
SCORE in the window on the left. Then select OK. The “Pearson correlation of
Age and Score” value is r. Record this value of r in your report. How does the value of r
correspond with your impressions in Question 1?
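If you want to check MINITAB's value of r by hand, here is a minimal sketch in Python. Python is not part of this lab; it is used here only as an assumption for illustration, and the function name pearson_r is made up for this example. The sketch applies the usual definition of the Pearson correlation to the 21 data pairs from the table.

```python
from math import sqrt

# Data transcribed from the table in this lab (children 1 to 21, in order).
age = [15, 26, 10, 9, 15, 20, 18, 11, 8, 20, 7,
       9, 10, 11, 11, 10, 12, 42, 17, 11, 10]
score = [95, 71, 83, 91, 102, 87, 93, 100, 104, 94, 113,
         96, 83, 84, 102, 100, 105, 57, 121, 86, 100]


def pearson_r(x, y):
    """Pearson correlation of two equal-length lists (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products and squares of the deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(age, score)
print(f"r = {r:.3f}")
```

The printed value should agree with the number MINITAB reports, up to rounding.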
Question 3: Now produce a regression plot. To get MINITAB to do this, select
Stat>Regression>Fitted Line Plot from the menu. Since we are trying to predict the youngster’s
score on the Gesell test later in life from the age at which they first speak, choose AGE as the
explanatory variable and SCORE as the response variable. Of course, you should include this
graph in your report.
Next, use the equation given by MINITAB to predict the Gesell score of a kid who spoke her first
words at 30 months by plugging 30 in for AGE and solving for SCORE. Include your answer in
your report.
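The "plug in 30 for AGE" step can also be sketched outside MINITAB. The Python below is an illustration only (Python and the helper name fit_line are assumptions, not part of the lab). It fits the least-squares line SCORE = intercept + slope * AGE using the standard formulas and then evaluates it at AGE = 30.

```python
# Data transcribed from the table in this lab (children 1 to 21, in order).
age = [15, 26, 10, 9, 15, 20, 18, 11, 8, 20, 7,
       9, 10, 11, 11, 10, 12, 42, 17, 11, 10]
score = [95, 71, 83, 91, 102, 87, 93, 100, 104, 94, 113,
         96, 83, 84, 102, 100, 105, 57, 121, 86, 100]


def fit_line(x, y):
    """Least-squares line y = intercept + slope * x (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through the point of means
    return intercept, slope


intercept, slope = fit_line(age, score)
predicted = intercept + slope * 30  # predicted SCORE for a first word at 30 months
print(f"SCORE = {intercept:.3f} + ({slope:.3f}) * AGE")
print(f"Predicted SCORE at AGE = 30: {predicted:.2f}")
```

Compare the printed equation with the one shown on MINITAB's fitted line plot; they should match up to rounding.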
Question 4: To hammer home the point that it matters very much which variable we call
explanatory, repeat Question 3, only this time make SCORE the explanatory variable and AGE
the response variable. Again substitute 30 for AGE and solve for SCORE in this new
equation. How different is this answer from the answer obtained above?
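To see why the two answers differ, it may help to sketch the reversed fit. In the Python below (again an assumption for illustration; fit_line is a made-up helper), AGE is regressed on SCORE, and the resulting equation AGE = a + b * SCORE is solved for SCORE at AGE = 30. The ordinary fit of SCORE on AGE is computed alongside it for comparison.

```python
# Data transcribed from the table in this lab (children 1 to 21, in order).
age = [15, 26, 10, 9, 15, 20, 18, 11, 8, 20, 7,
       9, 10, 11, 11, 10, 12, 42, 17, 11, 10]
score = [95, 71, 83, 91, 102, 87, 93, 100, 104, 94, 113,
         96, 83, 84, 102, 100, 105, 57, 121, 86, 100]


def fit_line(x, y):
    """Least-squares line y = intercept + slope * x (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Fit from Question 3: SCORE as response, AGE as explanatory.
a_fwd, b_fwd = fit_line(age, score)
score_forward = a_fwd + b_fwd * 30

# Reversed fit: AGE as response, SCORE as explanatory, i.e. AGE = a + b * SCORE.
a_rev, b_rev = fit_line(score, age)
score_reversed = (30 - a_rev) / b_rev  # solve 30 = a + b * SCORE for SCORE

print(f"SCORE on AGE, at AGE = 30: {score_forward:.2f}")
print(f"AGE on SCORE, solved at AGE = 30: {score_reversed:.2f}")
```

The two lines are not the same line: each minimizes squared errors in a different direction (vertical distances in SCORE versus vertical distances in AGE), so swapping the roles changes the prediction.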
Influential observations:
Now we’re going to investigate the impact that the outliers in the data have on the regression line.
Notice that the data for Child 18 and Child 19 are not like the others. Child 18 didn’t speak until a
much later age than did the other kids. Child 19 scored much higher than her peers on the Gesell.
To investigate the impact of these points, we’ll delete them (in an orderly fashion, please) from
our data sets.
Child 18: Copy the data from the Age and Score columns on the MINITAB worksheet and paste it
into columns C4 and C5 under the headings Age_1 and Score_1. Now delete child 18’s data from
these columns and have MINITAB plot a regression line for this new set of data. What do you
notice? Record your observations. Pay attention to the value of r (and, consequently, r²).
Child 19: Copy the data from the Age and Score columns on the MINITAB worksheet and paste it
into columns C7 and C8 under the headings Age_2 and Score_2. Now delete child 19’s data from
these columns and have MINITAB plot a regression line for this new set of data. What do you
notice? Record your observations.
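The copy-and-delete steps above can be mirrored in a short sketch. The Python below is an assumption for illustration only (the helpers pearson_r and without_child are invented names, not MINITAB commands). It removes one child at a time from a copy of the data and recomputes r and r² so the three fits can be compared side by side.

```python
from math import sqrt

# Data transcribed from the table in this lab (children 1 to 21, in order).
age = [15, 26, 10, 9, 15, 20, 18, 11, 8, 20, 7,
       9, 10, 11, 11, 10, 12, 42, 17, 11, 10]
score = [95, 71, 83, 91, 102, 87, 93, 100, 104, 94, 113,
         96, 83, 84, 102, 100, 105, 57, 121, 86, 100]


def pearson_r(x, y):
    """Pearson correlation of two equal-length lists (hypothetical helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def without_child(child, x, y):
    """Copies of x and y with one child (numbered from 1) left out."""
    i = child - 1
    return x[:i] + x[i + 1:], y[:i] + y[i + 1:]


r_all = pearson_r(age, score)
r_no18 = pearson_r(*without_child(18, age, score))
r_no19 = pearson_r(*without_child(19, age, score))

for label, r in [("all 21 children", r_all),
                 ("without child 18", r_no18),
                 ("without child 19", r_no19)]:
    print(f"{label}: r = {r:.3f}, r^2 = {r * r:.3f}")
```

Because without_child returns new lists, the original age and score lists stay intact, just as the original Age and Score columns do on the worksheet.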
Question 5: Which child’s data seems to have been more influential? That is, which
child’s data, when deleted, resulted in the biggest change in the regression line? Involve
the values of r² for each of the three regression lines in your conclusion. Explain why
you think this kid’s data is more influential than the other’s.
Question 7: Now that you’ve examined the data (both with and without the outlying
data included), do you feel that the age at which a kid speaks his or her first word is
an accurate predictor of the kid’s performance later in life on aptitude tests? What bits of
data or analysis results would make you feel even more comfortable with your assertion?
Make a clear, convincing, and statistically sound argument. Do not simply say “I think
so” or “I think not.” Use the concepts and vocabulary learned in class to defend your
conclusion.