The Appropriate Use of NAPLAN Data
National Symposium, 23 July, 2010
Margaret Wu
University of Melbourne
m.wu@unimelb.edu.au
1
NAPLAN Tests
• Conducted once a year
• About 40 test questions per subject area
• Test scores are used to infer
◦ the achievement levels of students
• How reliably can NAPLAN test scores reflect
◦ Student achievement level?
◦ School performance?
2
Margin of error in measuring student performance
• David, a Grade 5 student in 2008.
• Reading score was 25 out of 40.
• David’s reading test scores could vary between 20 and 30 out of 40
◦ if similar tests are administered (e.g., the 2009 and 2010 tests).
• One test collects only a small sample of performance.
• This variation in scores is called measurement error.
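A rough sketch of where a range of this size can come from, assuming a simple binomial model in which each of the 40 questions is an independent trial. This model is an assumption made for illustration and is not the analysis behind the slide; the function name `score_range` is hypothetical. It gives a range slightly wider than the 20 to 30 quoted above.

```python
import math

def score_range(raw_score, n_items, z=1.96):
    """Approximate 95% range for a raw score under a simple binomial model (an assumption)."""
    p = raw_score / n_items                 # observed proportion correct
    se = math.sqrt(n_items * p * (1 - p))   # standard error of the raw score
    return raw_score - z * se, raw_score + z * se

low, high = score_range(25, 40)
print(f"{low:.0f} to {high:.0f} out of 40")  # prints "19 to 31 out of 40"
```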

3
How big an error size is acceptable?

• The answer is
◦ It depends.
• An example
◦ Effectiveness of a weight loss program
◦ Expect a loss of 0.5 kg after one week.
◦ Measurement scale is accurate to 1 kg.
◦ Not good enough for measuring individual change
◦ OK for a group change, if group size is ‘large’.
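The weight loss example can be sketched numerically. This is a minimal illustration, assuming the 1 kg accuracy behaves like an independent random error for each person, so that the error of a group average shrinks with the square root of the group size. The function name `group_mean_error` and the group sizes are hypothetical, and the sketch ignores the fact that a change involves two weighings.

```python
import math

def group_mean_error(individual_error_kg, group_size):
    """Error of a group mean, assuming independent individual errors."""
    return individual_error_kg / math.sqrt(group_size)

expected_loss_kg = 0.5   # expected loss after one week (from the slide)
scale_error_kg = 1.0     # accuracy of the scale (from the slide)

for n in (1, 4, 25, 100):  # hypothetical group sizes
    err = group_mean_error(scale_error_kg, n)
    detectable = err < expected_loss_kg
    print(f"group of {n:3d}: error {err:.2f} kg, smaller than expected loss: {detectable}")
```

For one person the error (1 kg) is twice the expected loss. Under these assumptions it only falls clearly below 0.5 kg once the group has a few dozen members.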
4
On the NAPLAN scale…
[Figure: NAPLAN 2008 reading scores for grade 3, grade 5, grade 7 and grade 9, showing the mean and the 2.5th and 97.5th percentiles on a scale from 200 to 800.]
5
Measuring Growth
[Figure: NAPLAN 2008 reading scores for grade 3, grade 5, grade 7 and grade 9, showing the mean and the 2.5th and 97.5th percentiles on a scale from 200 to 800. Annotations: "Growth measure?"; expected growth is 50 points; margin of error of growth measure is ± 76 points.]
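A growth measure is the difference between two test scores, so it carries the error of both tests. The sketch below assumes the two errors are independent and equal in size, which is an assumption not stated in the slides, and works backwards from the ± 76 points to the per-test margin that this would imply. The function name `margin_of_difference` is hypothetical.

```python
import math

def margin_of_difference(margin_a, margin_b):
    """Margin of error of (score_b - score_a), assuming independent errors."""
    return math.sqrt(margin_a ** 2 + margin_b ** 2)

growth_margin = 76.0     # margin of error of the growth measure (from the slide)
expected_growth = 50.0   # expected growth (from the slide)

# Per-test margin implied by the growth margin under the equal-error assumption.
implied_single = growth_margin / math.sqrt(2)
print(round(implied_single, 1))                                        # 53.7
print(round(margin_of_difference(implied_single, implied_single), 1))  # 76.0
print(growth_margin > expected_growth)                                 # True
```

The last line restates the slide's point: the margin of error of an individual growth measure is larger than the growth it is meant to detect.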
7
Class mean scores

• Average score for a class
◦ The effect of measurement error is reduced
• New source of error
◦ Sampling error
• Cohort of students changes from year to year
• Variation in class mean score because of the sample of students in a class
• Class mean ± 20 points
◦ (1 year’s growth)
8
Teacher effect

• A high-performing teacher can raise student standards by one more year of growth compared with a low-performing teacher.
[Figure: NAPLAN 2008 reading scores for grade 3, grade 5, grade 7 and grade 9, showing the mean and the 2.5th and 97.5th percentiles on a scale from 200 to 800. Annotations: excellent teacher, average teacher, poor teacher; 50 points; margin of error of teacher effect based on two testing occasions: ± 320 points.]
9
MySchool Website

• It is a league table
◦ It compares and ranks schools
• It is the worst kind of league table
◦ Because it is claimed that the red bars reflect “underperforming schools”
◦ Simple league tables do not make this claim.
10
Summary - 1


• NAPLAN results are NOT suitable for measuring
◦ Student achievement level, beyond rough “lower”, “average”, “higher” groups
◦ Student progress
◦ Teacher effect
◦ School performance

11
Summary - 2

• NAPLAN results are for the systems, e.g.
◦ Compare girls and boys
◦ Compare rural and urban
◦ Trends, if equating design is improved
• NAPLAN results should NEVER be published.
• Parents/caregivers should not be encouraged to use the results to judge schools.

12
Finally…
• Conflicting advice from different experts?
• An easy way to check:
◦ Ask proponents of the MySchool website to publicly name one underperforming school.

13
References

• Wu, M.L. (2010). Measurement, sampling and equating errors in large-scale assessments. Educational Measurement: Issues and Practice, 29(4) (in press).
• Nye, B., Konstantopoulos, S., & Hedges, L. (2004). How Large Are Teacher Effects? Educational Evaluation and Policy Analysis, 26(3), 237-257.
14

• Leigh, A. (2009). Estimating teacher effectiveness from two-year changes in students’ test scores. Economics of Education Review.
• Byrne, Coventry, Olson, Wadsworth, Samuelsson, Petrill, Willcutt and Corley. (2009). Teacher Effects in Early Literacy Development: Evidence From a Study of Twins. Journal of Educational Psychology.
15