HLM Results

PRELIMINARY RESEARCH ON STUDENT PERFORMANCE ON
VIRGINIA’S SOL TESTS
Patricia Campbell
University of Maryland
Model Building
The primary research question for these preliminary analyses was whether math
coaches increase student math achievement scores. We used data on roughly 17,000 student test
scores in 33 schools over two school years. To fit the nested nature of the data (students nested
within classrooms, nested within schools) we used Hierarchical Linear Modeling (HLM) to analyze
school-level coaching effects on students’ SOL test scores.¹ For these preliminary models we
separately analyzed third-, fourth-, and fifth-grade students’ scores across two school years
(2005-2006 through 2006-2007). For each grade level we ran parallel models with six dependent
variables. The primary dependent variable was the overall Standards of Learning (SOL) Mathematics
scale score on Virginia’s statewide standardized assessment, as required by No Child Left
Behind. We also examined the five component subscale scores in that assessment: Numbers and
Number Sense, Computation and Estimation, Measurement and Geometry, Probability and
Statistics, and Patterns, Functions and Algebra.
¹ In the future we will use a three-level model and examine students within teachers within schools. However, at this
time the teacher-level data are still being cleaned, so we can only examine the effects coaches have at the school level.

Before running the complete models we ran three fully unconditional models (two-level
models that partition the variance of the primary dependent variable, the overall scale
score, with no other variables added) to determine the intraclass correlation coefficient (ICC).
The ICC is the proportion of variance in student scores that can be attributed to differences
between schools rather than differences between individual students. Put simply, the ICC indicates
how much of the variation in students’ scores is due to the schools they attend. The ICC indicates
that 12.4% of the variance in the third-grade scores was due to schools, 11.2% in fourth grade,
and 9.0% in fifth grade. (See Table 1.) The ICC tells us that on average, roughly 11% of the
difference in student scores on the SOL in mathematics within our sample is due to the schools
they attended.

Table 1. Intraclass Correlations by Grade
Grade (Overall Scale)    ICC      Percent Variability Due to School Differences
Third Grade              0.124    12.4%
Fourth Grade             0.112    11.2%
Fifth Grade              0.090    9.0%
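The ICC arithmetic can be illustrated with a short sketch. This is not the authors' code: the function name `icc` and the variance components passed to it are hypothetical, because the report gives only the resulting ICCs and not the underlying between-school and within-school variances. The second part simply averages the three ICCs reported in Table 1.

```python
def icc(between_school_var: float, within_school_var: float) -> float:
    """Intraclass correlation: share of total variance that lies between schools."""
    return between_school_var / (between_school_var + within_school_var)


# Hypothetical variance components (NOT taken from the report), chosen only
# to illustrate the formula.
example = icc(between_school_var=12.4, within_school_var=87.6)

# ICCs reported in Table 1 for the overall scale score.
reported_icc = {"third": 0.124, "fourth": 0.112, "fifth": 0.090}
mean_icc = sum(reported_icc.values()) / len(reported_icc)

print(f"example ICC: {example:.3f}")
print(f"mean reported ICC: {mean_icc:.3f}")
```

The mean of the three reported ICCs is what underlies the "roughly 11%" summary in the text.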
We specified the same model for every dependent variable so that the results would be comparable
within each grade level. Six independent variables were included in the model. Our primary
independent variable was a school-level variable indicating whether the school had the services of
a mathematics coach. Eleven of the 33 schools in the current sample² were randomly assigned a
coach and therefore identified as treatment schools, meaning a coach served the school the year
of the test. All of the other variables are used as controls, allowing us to determine more clearly
the effects of coaches while holding the control variables constant.
Two school-level independent variables (level two variables in a two-level model) were
used to control for the school’s impact on student scores beyond the effect of math coaches. The
first, a binary indicator of high-poverty schools, marked schools in which a large majority of
students received free or reduced-price meals (≥ 70%). The second binary variable indicated whether
the school had a relatively high proportion of students with special education needs (≥ 20%).
Both of these variables are rather crude proxies for poverty and special education needs, but they
are important because they “clean up” the statistical model and allow us to determine the effects
of coaches more accurately.
Three student characteristics (level one variables in a two-level model) were used to
control for basic differences between students. These three variables were binary controls for
gender, poverty, and special education status. A measure of race/ethnicity is conspicuously
absent from our model. Data on students’ race and ethnicity are not yet available and were not
included in this preliminary model. Adding such a measure in the future will make our models
more accurate. Nonetheless, these preliminary models are sufficient to show some significant
results.
² In the future we will have one more school in this treatment group and two more schools in the control group, but
the data for the one treatment school are currently unavailable, necessitating removal of its control pair.

One of the student-level variables, special education status, had different effects in
different schools (a random slope). Put simply, the effect of being a special education student
was not the same in all the schools in our sample. We modeled this variation with an additional
student-level binary variable identifying whether the school the special education student attended
had a relatively high proportion of students with special education needs. This school
characteristic was consistently significant (a positive impact on special education students) and
helps the model capture the effects of coaches more accurately, but as a control variable in a
preliminary analysis it does not warrant particular attention.
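One way to write the two-level specification described in this section is sketched below in conventional HLM notation. This is a reconstruction from the prose, not an equation taken from the report. The variable names are illustrative, and treating the gender and poverty slopes as fixed is an assumption, since the text states only that the special education slope varied across schools.

```latex
\begin{align*}
\text{Level 1 (student $i$ in school $j$):}\quad
Y_{ij} &= \beta_{0j} + \beta_{1}\,\mathrm{Female}_{ij} + \beta_{2}\,\mathrm{LowSES}_{ij}
          + \beta_{3j}\,\mathrm{SpEd}_{ij} + r_{ij} \\
\text{Level 2 (school $j$):}\quad
\beta_{0j} &= \gamma_{00} + \gamma_{01}\,\mathrm{HighPoverty}_{j}
          + \gamma_{02}\,\mathrm{HighSpEd}_{j} + \gamma_{03}\,\mathrm{Coach}_{j} + u_{0j} \\
\beta_{3j} &= \gamma_{30} + \gamma_{31}\,\mathrm{HighSpEd}_{j} + u_{3j}
\end{align*}
```

Under this reading, $\gamma_{03}$ is the coaching effect of interest, and $\gamma_{31}$ corresponds to the second "High Proportion of Sp. Ed. Students" row that appears beneath "Special Education" in Tables 2, 3, and 4.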
Findings
For each grade the results are presented in a table with models presented in the far left
column. These tables (Tables 2, 3, and 4) present the coefficients (providing a metric of raw
point gains on the SOL assessment) and p-values for each model using robust standard errors.
Normal standard errors generally generate a more conservative test for significance while robust
standard errors produce more sensitive tests. While these models were also assessed in a parallel
analysis for each grade using normal standard errors, those results are not presented here.
Using robust standard errors, coaching had a statistically significant effect on our primary
dependent variable, the overall SOL Math scale score, in the fourth and fifth grades only. In this
model the coefficient for having a coach in a school, holding all else constant, was 24.74 points
(p = 0.01) in the fourth grade, or a little over a quarter of a standard deviation. In the fifth grade,
the impact of a coach, holding all else constant in the model, was 18.81 points (p = 0.04), slightly
more than 0.2 standard deviations. In the third grade, the effect was smaller (11.68 points,
p = 0.22) and not significant at the 0.05 level. Note that the coaching effects in all three grades
have fairly substantial coefficients that are all in the same direction. While the lack of significance
keeps us from making inferences based on the third-grade model, the results of these relatively crude
preliminary models are similar enough to suggest that more accurate future models are
warranted. Further analysis will allow us to capture and model any coaching effects more clearly.
See Tables 2, 3, and 4 below.
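The pattern described above can be checked with a few lines of code. This is only an illustrative sketch: the coefficients and p-values are copied from the "Coach in school" rows for the overall scale score in Tables 2, 3, and 4, and the names `coach_effects` and `ALPHA` are introduced here for the example.

```python
# (coefficient in scale-score points, p-value) for the coach indicator,
# overall scale score, robust standard errors (Tables 2, 3, and 4).
coach_effects = {
    3: (11.68, 0.22),
    4: (24.74, 0.01),
    5: (18.81, 0.04),
}
ALPHA = 0.05  # significance threshold used in the text

# Grades where the coaching coefficient is significant at the 0.05 level.
significant_grades = [
    grade for grade, (_, p_value) in coach_effects.items() if p_value < ALPHA
]

# Whether every coefficient points in the same (positive) direction.
same_direction = all(coef > 0 for coef, _ in coach_effects.values())

print("significant at 0.05:", significant_grades)
print("all coefficients positive:", same_direction)
```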
The subscale analyses reflect the findings of the overall scale models. While coaching had no
significant effect on any of the dependent variables in the third-grade models, all five of the
fourth-grade subscales and two of the fifth-grade subscales showed significant positive effects with
magnitudes comparable to the overall scale effects (roughly a fifth to a quarter of a standard
deviation) using robust standard errors. A noticeable pattern is that the Number and Number
Sense subscale yielded lower coefficients and higher p-values than most of the other subscale
dependent variables in the fourth- and fifth-grade models.
Beyond the primary independent variable, the coefficients of the control variables are
relatively stable across all models. Such stability provides evidence that the findings are not due to
chance but reflect patterns of influence on test scores. The school characteristics that were
used as control variables (level two variables) had few effects. The effects of attending a high-
poverty school or a school with a high proportion of students with special education needs were
not significant on any dependent variable in any grade.
The student (level one) control variables were significant in most models. Gender had a
significant but substantively negligible negative effect in the fifth grade and a more pronounced
influence in the fourth grade. Poverty and special education status each had the
expected, consistently negative effects on student achievement scores. Generally, the effect of
poverty was slightly less than half a standard deviation decrease, and the effect of special
education status was slightly more than half a standard deviation decrease in scale scores.
Discussion. We believe these preliminary results offer convincing evidence that coaches
do have an effect on the schools in which they work. Several factors suggest there is much more
to be learned about how coaches affect student achievement. First, it is important to note that
these are crude preliminary models. All the predictors in the model are simple binary dummy
variables, and historically powerful predictors such as race have yet to be included. If
these crude preliminary models reveal significant findings, it is likely that more fully developed
models will have more to say about how coaching affects schools.
Second, these crude preliminary models show consistent, substantial and reliable results.
While one statistically significant positive coaching effect would be suggestive, the multiple
significant results across both subscales and grades constitute substantial evidence. The fact that
all the coefficients are in the same positive direction and have similar magnitudes further
reinforces this conclusion. Additionally, comparing the effects of coaches to the effects of
typically powerful predictors such as student socioeconomic status and special education needs
makes it clear that coaches can have a substantial effect on student scores. Our preliminary results
show coaches have an effect that is approximately three-fourths (Grade 4) to half (Grade 5) the
size of the effect of students’ socioeconomic status on mathematics achievement and half (Grade 4)
to one-fourth (Grade 5) the size of the effect of students’ special education status on mathematics
achievement.
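The relative effect sizes quoted above can be reproduced from the overall scale score coefficients in Tables 3 and 4. The sketch below is illustrative only: the dictionary layout and the helper `relative_size` are introduced for this example, while the numbers are the "Coach in school", "Low SES", and "Special Education" coefficients for the overall scale score.

```python
# Overall scale score coefficients (robust standard errors) from Tables 3 and 4.
coefficients = {
    4: {"coach": 24.74, "low_ses": -33.78, "special_ed": -50.69},
    5: {"coach": 18.81, "low_ses": -35.11, "special_ed": -65.76},
}


def relative_size(effect: float, reference: float) -> float:
    """Magnitude of one coefficient as a fraction of another's magnitude."""
    return abs(effect) / abs(reference)


ratios = {
    grade: {
        "vs_low_ses": relative_size(c["coach"], c["low_ses"]),
        "vs_special_ed": relative_size(c["coach"], c["special_ed"]),
    }
    for grade, c in coefficients.items()
}

for grade, r in ratios.items():
    print(
        f"Grade {grade}: coach effect is {r['vs_low_ses']:.2f} of the Low SES effect "
        f"and {r['vs_special_ed']:.2f} of the Special Education effect"
    )
```

The Grade 5 ratio against special education status comes out somewhat above one-fourth, so the fractions in the text are best read as rounded approximations.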
Third, these data were also examined with identical models using normal standard errors,
rather than robust standard errors. The similarity of the p-values rendered using normal and
robust standard errors is further evidence of the stability of these models. If the data were less
reliable the differences in the p-values rendered by the two types of standard errors would be
greater.
Finally, it is important to note that these school effects are averaged across all teachers in
the schools. It is reasonable to believe that the effect of having a coach would be different for
different teachers. When data on coaches’ interactions with teachers (PDA data) and the data
linking students to teachers, which permit analysis of the nesting of students within teachers, become
available, it is likely that the effects of coaches in specific classrooms and with specific teachers
will be more pronounced than they are here.
Table 2. Coaches’ Effects on Third-Grade SOL Math Tests and Subscales (using Robust Standard Errors)
Dependent Variable: Overall Scale Score; Number & Number Sense; Computation & Estimation; Measurement & Geometry; Probability & Statistics
487.08
0.00 39.12 0.00 38.73 0.00 40.00 0.00 40.32 0.00
Intercept
High Poverty
School
-4.29
0.77
-0.68
0.65
-0.76
0.62
-0.42
0.73
-0.34
0.81
High Proportion of
Sp. Ed. Students
10.21
0.62
0.84
0.67
1.97
0.41
0.66
0.71
1.24
0.54
Coach in school
11.68
0.22
0.97
0.25
1.00
0.28
1.28
0.18
1.03
0.29
0.06
0.01
Female
-0.35
0.40
-0.09
0.13
-0.03
0.60
-0.06
0.18
-32.82
0.00
-3.09
0.00
-3.14
0.00
-3.33
0.00
-3.51
0.00
Low SES
Special
Education
-45.44
0.00
-4.67
0.00
-3.74
0.00
High Proportion of
30.06
0.05
3.54
0.03
Sp. Ed. Students
2.06
0.13
Bold black coefficients and p-values: p < 0.05. Black text: p < 0.10. Grey text: p > 0.10.
Patterns, Functions & Algebra
40.86 0.00
-1.64
0.24
1.00
0.66
0.00
-2.41
0.61
0.46
0.99
0.00
-5.13
0.00
-5.51
0.00
-5.12
0.00
3.41
0.04
3.44
0.04
2.14
0.12
Table 3. Coaches’ Effects on Fourth-Grade SOL Math Tests and Subscales (using Robust Standard Errors)
Dependent Variable: Overall Scale Score; Number & Number Sense; Computation & Estimation; Measurement & Geometry; Probability & Statistics
459.82
0.00
36.80
0.00
35.56
0.00
35.56
0.00
34.95
0.00
Intercept
High Poverty
School
10.83
0.29
0.92
0.48
1.66
0.10
1.10
0.37
0.87
0.26
High Proportion of
Sp. Ed. Students
8.75
0.45
-0.22
0.87
0.78
0.54
1.19
0.34
1.62
0.08
24.74
0.01
1.74
0.05
2.47
0.02
2.35
0.01
1.53
0.05
Coach in school
-6.77
0.00
-1.40
0.00
-0.70
0.00
-0.94
0.00
Female
0.10
0.57
-33.78
0.00
-3.35
0.00
-2.63
0.00
-2.92
0.00
-3.55
0.00
Low SES
Special
Education
-50.69
0.00
-5.83
0.00
-3.42
Patterns, Functions & Algebra
36.09
0.00
0.02
0.98
0.14
1.93
-0.30
-3.31
0.88
0.01
0.27
0.00
0.00
-4.76
0.00
-4.76
0.00
-4.76
0.00
High Proportion of
3.47
0.03
Sp. Ed. Students
28.40
0.08
1.49
0.20
Bold black coefficients and p-values: p < 0.05. Black text: p < 0.10. Grey text: p > 0.10.
3.58
0.04
2.20
0.24
1.90
0.25
Table 4. Coaches’ Effects on Fifth-Grade SOL Math Tests and Subscales (using Robust Standard Errors)
Dependent Variable: Overall Scale Score; Number & Number Sense; Computation & Estimation; Measurement & Geometry; Probability & Statistics
490.32
0.00
39.73
0.00
38.53
0.00
37.64
0.00
38.88
0.00
Intercept
High Poverty
School
13.64
0.23
0.66
0.67
2.00
0.05
1.43
0.12
0.81
0.38
High Proportion of
2.11
0.04
Sp. Ed. Students
22.65
0.13
2.80
0.08
0.91
0.57
1.82
0.15
18.81
0.04
1.82
0.02
1.69
0.02
Coach in school
1.50
0.14
1.67
0.08
0.41
0.02
Female
0.01
0.92
-0.03
0.27
0.05
0.26
0.07
0.05
-35.11
0.00
-3.08
0.00
-2.83
0.00
-2.89
0.00
-3.00
0.00
Low SES
Special
-65.76
0.00
-6.72
0.00
-4.39
0.00
-5.73
0.00
-7.19
0.00
Education
High Proportion of
57.34
0.00
4.93
0.00
5.78
0.00
3.26
0.00
5.32
0.00
Sp. Ed. Students
Patterns, Functions & Algebra
38.62
0.00
1.36
0.08
1.58
1.48
0.05
-3.02
0.18
0.07
0.02
0.00
-6.27
0.00
4.79
0.00
Bold black coefficients and p-values: p < 0.05. Black text: p < 0.10. Grey text: p > 0.10.