Testing in the classroom: Using tests to promote learning
Richard P. Phelps
Universidad Finis Terrae, Santiago, Chile
January 7, 2014

Q. What is a standardized test?
A. An assessment with at least one aspect, in its content or administration, standardized.

Q. What is the key advantage of standardized testing?
A. It is standardized.

Meta-analysis
• A method for summarizing a large research literature with a single, comparable measure.

© 2012, Richard P PHELPS, World Association of Education Research, 17th Congress, Reims, June, 2012

John Hattie’s meta-analyses of meta-analyses

John Hattie’s list
• Student self-assessment/self-grading
• Response to intervention
• Teacher credibility
• Providing formative assessments
• Classroom discussion
• Teacher clarity
• Feedback
• Reciprocal teaching
• Teacher-student relationships fostered
• Spaced vs. mass practice
• Concept mapping
• Cooperative vs individualistic learning
• Direct instruction
• Tactile stimulation programs
• Mastery learning
• Worked examples
• Visual-perception programs
• Peer tutoring
• Cooperative vs competitive learning
• Phonics instruction
• Acceleration
• Classroom behavioral techniques
• Vocabulary programs
• Repeated reading programs
• Creativity programs
• Student prior achievement
• Self-questioning by students
• Study skills
• Problem-solving teaching
• Not labeling students
• Student-centered teaching
• Classroom cohesion
• Pre-term birth weight
• Peer influences
• Classroom management techniques
• Outdoor-adventure programs
• Home environment
• Socio-economic status

The effect of testing on student achievement: 1910-2010
Richard P. PHELPS

The effect of testing on student achievement
• 12-year-long study
• analyzed close to 700 separate studies, and more than 1,600 separate effects
• 2,000 other studies were reviewed and found incomplete or inappropriate
• lacking sufficient time and money, hundreds of other studies will not be reviewed

Studies included in the meta-analyses
2. …when:
• a test is newly introduced, or newly removed
• quantity of testing is increased or reduced
• test stakes are introduced or increased, or removed or reduced

Number of studies of effects, by methodology type

Methodology type                                  Number of studies   Number of effects
Quantitative                                      177                 640
Surveys and public opinion polls (US & Canada)    247                 813
Qualitative                                       245                 245
TOTAL                                             669                 1698

Effect size: Interpretation
• d between 0.25 and 0.50: weak effect
• d between 0.50 and 0.75: medium effect
• d more than 0.75: strong effect

Which predictors matter?

Treatment group…                                            Mean effect size
…is made aware of performance, and control group is not     +0.98
…receives targeted instruction (e.g., remediation)          +0.96
…is tested with higher stakes than control group            +0.87
…is tested more frequently than control group               +0.85

Why tests?
● Students tend to study more, and learn more, when:
• they know they will be tested, but not precisely what will be tested
» (e.g.) Experiment comparing gains of students with “take-home tests” to those with “in-class tests”: the latter learned substantially more.
• there is reinforcement of material already studied
● Mastery learning experiments of the 1960s to 1980s:
» Students learn more when asked to recall what they have learned.
» Up to a point, the more students are made to actively process information, and describe it to others, the better they learn.

Surveys and opinion polls: Regular standardized tests, performance tests

Respondent opinion                          Regular tests (N ≈ 125)   Performance tests (N ≈ 50)
                                            d                         d
Achievement is increased                    1.2                       1.0
…weighted by size of study population       1.9                       0.5
Instruction is improved                     1.0                       1.4
…weighted by size of study population       0.9                       0.9
Tests help align instruction                1.0                       1.0
…weighted by size of study population       0.5                       0.9

Qualitative studies: Effect on student achievement
244 studies conducted in the past century in over 30 countries

Direction of effect   Number of studies   Percent of studies   Percent without the inferred
Positive              204                 84                   93
Positive inferred     24                  10
Mixed                 5                   2                    2
No change             8                   3                    4
Negative              3                   1                    1
TOTAL                 244                 100                  100

“Repeated retrieval during learning is the key to long-term retention.”

10 benefits of testing and their applications to education
Roediger, Putnam and Smith

Direct effects of testing
Retrieval practice during tests enhances retention of the retrieved information (relative to not testing or even to studying): the “testing effect”.
Repeated retrieval produces knowledge that can be retrieved flexibly and transferred to other situations.
On open-ended assessments (e.g., essay tests), retrieval practice induced by tests helps students organize information into a coherent knowledge base.
Repeated retrieval leads to easier retrieval of related information.
SOURCE: Roediger, Putnam, & Smith, Ten benefits of testing and their applications to educational practice, Psychology of Learning and Motivation, 55, 2011.

Indirect effects of testing
Students tested frequently study more and with more regularity.
Tests permit students to discover gaps in their knowledge and adjust their study efforts to focus on difficult material.
Students who study after taking a test learn more than if they had not taken a test.
Students who self-test or are tested more frequently in class learn more.

10 benefits of testing and their applications to education
Roediger, Putnam and Smith
Benefit 1: The Testing Effect: Retrieval Aids Later Retention
Benefit 2: Testing Identifies Gaps in Knowledge
Benefit 3: Testing Causes Students to Learn More from the Next Study Episode
Benefit 4: Testing Produces Better Organization of Knowledge
Benefit 5: Testing Improves Transfer of Knowledge to New Contexts
Benefit 6: Testing can Facilitate Retrieval of Material That was not Tested
Benefit 7: Testing Improves Metacognitive Monitoring
Benefit 8: Testing Prevents Interference from Prior Material when Learning New Material
Benefit 9: Testing Provides Feedback to Instructors
Benefit 10: Frequent Testing Encourages Students to Study

Most teachers should be testing much more frequently,
…with smaller, shorter, less consequential tests.

Students learn more when they are tested, but they learn best when the tests are “spaced”.

What is the optimal lapse of time between tests?
The best time to test again is just before students start forgetting the information.
This time lapse is shorter with discrete material, like mathematics, than with other subjects.
Some studies suggest that math students should be tested at least once a week.

The more high-stakes decision points, the better the student performance?
Figure 1: Average TIMSS Score and Number of Quality Control Measures Used, by Country
[Figure: y-axis, Average Percent Correct (grades 7 & 8); x-axis, Number of Quality Control Measures Used; series, Top-Performing Countries and Bottom-Performing Countries]
SOURCE: Phelps, Benchmarking to the best in mathematics, Evaluation Review, 2001

Quality control has proportionally greater effect in poorer countries
Figure 2: Average TIMSS Score and Number of Quality Control Measures Used (each adjusted for GDP/capita), by Country
[Figure: y-axis, Average Percent Correct (grades 7 & 8) (per GDP/capita); x-axis, Number of Quality Control Measures Used (per GDP/capita)]
SOURCE: Phelps, Benchmarking to the best in mathematics, Evaluation Review, 2001

What testing skills do teachers need…
…for interpreting information from large-scale tests?

Basic understanding of statistics:
- distributions, mean, median, skewness, kurtosis
- sampling error, measurement error
- type 1 / type 2 error, statistical power
- sampling (size, representativeness)

Protocols to help them explain tests to others:
- to students
- to parents
- to the media

What testing skills do teachers need…
…for developing and administering classroom tests?

Practice (with each other) in writing items / prompts / rubrics:
- unambiguous, relevant, un-biased

Understand that useful assessment can be very simple:
- e.g., save the last few minutes of each class to assess by asking students to record 2-3 concepts they learned that day

Learn the optimal frequency, spacing of tests for your subject field and grade level.

It is easy to know what you are teaching. But, you can only know what students are learning if you assess.
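
For readers who want to connect the statistics skills listed above to the effect-size bands quoted earlier in the deck (d between 0.25 and 0.50 weak, 0.50 to 0.75 medium, more than 0.75 strong), here is a minimal Python sketch. It assumes d is computed as Cohen's d with a pooled standard deviation; the slides do not say which estimator the meta-analysis used. The function names, the handling of values that fall exactly on a band boundary, and the classroom scores are all hypothetical and serve only as illustration.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation.

    Assumption: this is one common estimator, not necessarily the one
    used in the meta-analysis described in the slides.
    """
    n_t, n_c = len(treatment), len(control)
    pooled_var = (
        (n_t - 1) * stdev(treatment) ** 2 + (n_c - 1) * stdev(control) ** 2
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


def interpret(d):
    """Label the magnitude of d with the bands from the effect-size slide.

    Assumption: a value exactly on a boundary goes to the band that starts
    there, except 0.75, which stays "medium" because the slide says
    "more than 0.75" for strong. The slide gives no label below 0.25.
    """
    size = abs(d)
    if size > 0.75:
        return "strong"
    if size >= 0.50:
        return "medium"
    if size >= 0.25:
        return "weak"
    return "below the weakest band"


if __name__ == "__main__":
    # Hypothetical end-of-unit scores for a frequently tested class
    # and a comparison class.
    tested_weekly = [72, 75, 78, 81, 84]
    comparison = [70, 73, 76, 79, 82]
    d = cohens_d(tested_weekly, comparison)
    print(f"d = {d:.2f} ({interpret(d)})")
```

With these made-up scores the two-point difference in means works out to d of roughly 0.42, which sits in the deck's weak band; the mean effects reported on the "Which predictors matter?" slide (+0.85 to +0.98) would all be labeled strong.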