Florida Assessments for Instruction in Reading (FAIR) – Using Scores for Growth and Instruction
Dr. Barbara Foorman, Florida Center for Reading Research (FCRR)
Stuart Greenberg, Florida Department of Education

What is FAIR?
• A K-2 assessment system administered to individual students 3 times a year, with electronic scoring, an Adobe AIR version, and PMRN reports linked to instructional resources.
• A 3-12 computer-based system in which students take the assessments 3 times a year. Several tasks are adaptive. PMRN reports are available, linked to instructional resources. A printed toolkit is available.

© 2011 Florida Department of Education

The K-2 “Big Picture” Map
Broad Screen/Progress Monitoring Tool (BS/PMT): “All” students
• Letter Naming & Sounds
• Phonemic Awareness
• Word Reading
Broad Diagnostic Inventory (BDI): “All” students; “Some” students for vocabulary
• Listening Comprehension
• Reading Comprehension
• Vocabulary
• Spelling (2nd grade only)
Targeted Diagnostic Inventory (TDI): “Some” students; some tasks
• K = 9 tasks
• 1st = 8 tasks
• 2nd = 6 tasks
Ongoing Progress Monitoring (OPM): “Some” students
• K – 2 = TDI tasks
• 1 – 2 = ORF

© 2009 Florida Department of Education

K-2 Targeted Diagnostic Inventory (TDI) Map
Kindergarten
• Print Awareness
• Letter name and sound knowledge
• Phoneme Blending
• Phoneme Deletion Word Parts/Initial
• Letter Sound Connection Initial
• Letter Sound Connection Final
• Word Building – Initial Consonants
• Word Building – Final Consonants
• Word Building – Medial Vowels
First Grade
• Letter Sound Knowledge
• Phoneme Blending
• Phoneme Deletion Initial
• Phoneme Deletion Final
• Word Building – Consonants
• Word Building – Vowels
• Word Building – CVC/CVCe
• Word Building – Blends
Second Grade
• Phoneme Deletion Initial
• Phoneme Deletion Final
• Word Building – Consonants
• Word Building – CVC/CVCe
• Word Building – Blends & Vowels
• Multisyllabic Word Reading

The K – 2 “Score” Map
BS/PMT
• PRS = Probability of Reading Success
BDI
• LC = Listening Comprehension: total questions correct (implicit/explicit)
• RC = Reading Comprehension: total questions correct (implicit/explicit), Fluency, Percent Accuracy, Target Passage
• VOC = Vocabulary: Percentile Rank
• SPL = Spelling: Percentile Rank
TDI
• ME = Meets Expectations
• BE = Below Expectations
OPM
• ORF = Adjusted Fluency
• TDI Tasks = ME or BE and Raw Score

Target RC Passages for Grades 1 and 2 (BDI)

Grade 1 PRS Chart (2010-2011)
AP1   0     1     2     3     4     5     6     7     8     9     10
PRS   0.11  0.14  0.20  0.27  0.36  0.46  0.56  0.66  0.75  0.82  0.86

AP2   0     1     2     3     4     5     6     7     8     9     10
PRS   0.01  0.02  0.03  0.05  0.09  0.17  0.28  0.44  0.61  0.76  0.86

AP3   0     1     2     3     4     5     6     7     8     9     10
PRS   0.01  0.02  0.03  0.06  0.12  0.20  0.33  0.49  0.65  0.79  0.88

Probability of Reading Success
Look for trends within grade levels, within teachers, and across grades. Let’s look across grades in our sample data. What do you notice when you compare across grades?
        85% & above (Green)   16% – 84% (Yellow)   15% & below (Red)
K       28%                   61%                  11%
1st     36%                   56%                  8%
2nd     25%                   70%                  5%

Instructional Information from FAIR
In grades K-2:
1. Are my students in the green zone (.85 or above) on the Broad Screen? At or above the 50th percentile in vocabulary? On the target LC or RC passage?
2. If not, look at performance on the Targeted Diagnostic Inventory tasks to see which skills students have mastered (80% or more). Teach the skills they haven’t mastered.

Student Score Detail Box (K-2)
Excellent report to include in a student’s cumulative folder

© 2011 Florida Center for Reading Research

Questions to answer using the School Grade Summary Report
What is the distribution of scores in a particular grade level?
The School Status Report told us the Median Score for Vocabulary was the 24th percentile, but now I see the full distribution and can see that the majority of my students fell within the 11th – 30th percentile range.
This tells me that, overall, the First Grade group is below average in general vocabulary knowledge. I would want to see whether this is true in Kindergarten and Second Grade as well (on the School Status Report):
• Kindergarten Median Vocabulary = 29th percentile
• Second Grade Median Vocabulary = 41st percentile
I might look at how the Core Reading Program addresses vocabulary and whether we have any additional programs addressing vocabulary (maybe we are using something different or additional at second grade?).

Grades 3-12 Assessments Model
• Broad Screen/Progress Monitoring Tool: Reading Comprehension Task (3 Times a Year)
• If necessary, Targeted Diagnostic Inventory: Maze & Word Analysis Tasks
• Diagnostic Toolkit (As Needed)
• Ongoing Progress Monitoring (As Needed)

Purpose of Each 3-12 Assessment
• RC Screen: Helps us identify students who, without additional targeted literacy instruction, may not be able to meet the grade-level literacy standards at the end of the year as assessed by the FCAT.
• Mazes: Helps us determine whether a student has more fundamental problems in text reading efficiency and low-level reading comprehension. Relevant for students below a 6th grade reading level.
• Word Analysis: Helps us learn more about a student's fundamental literacy skills, particularly those required to decode unfamiliar words and to read and write accurately.

How is the student placed into the first passage/item?
Task Placement Rules
Reading Comprehension – Adaptive
• For AP 1, the first passage the student receives is determined by:
  • Grade level and prior year FCAT (if available)
  • If no FCAT, students are placed into a specific grade-level passage
  • All 3rd grade students are placed into the same initial passage
• For AP 2 and 3, the first passage is based on the student’s final ability score from the prior Assessment Period (AP).
Maze – Not adaptive
• Two predetermined passages based on grade level and assessment period (AP).
WA – Adaptive
• AP 1-3 each start with a predetermined set of 5 words based on grade level. Student performance on this first set of 5 words determines the next words the student receives.
• 5-30 words are given at each assessment period, based on ability.

How is the student placed into subsequent passages?
• Based on the difficulty of the questions the student answers correctly on the first passage, the student is then given a harder or easier next passage.
• Difficulty of an item is determined using Item Response Theory (IRT).
• Because of this, a raw score of 7/9 for Student A and 7/9 for Student B on the same passage does not mean they will have the same converted scores: the two students may have answered questions of different difficulty correctly.

The 3-12 “Big Picture” Map
Type of Assessment, followed by Name of Assessment
Broad Screen/Progress Monitoring Tool (BS/PMT) – appropriate for “All” students
• Reading Comprehension (RC)
Targeted Diagnostic Inventory (TDI) – “Some” students
• Maze
• Word Analysis (WA)
Ongoing Progress Monitoring (OPM) – “Some” students
• Maze
• ORF
• RC
Informal Diagnostic Toolkit (Toolkit) – “Some” students
• Phonics Inventory
• Academic Word Inventory
• Scaffolded Discussion Templates

The 3-12 “Score” Map
Reading Comprehension – BS/PMT
• FCAT Success Probability (FSP), color-coded
• Percentile
• Standard Score
• Ability Score and Ability Range
• FCAT Reporting Categories
Maze – TDI
• Percentile
• Standard Score
• Adjusted Maze Score
Word Analysis – TDI
• Percentile
• Standard Score
• Ability Score (WAAS)
OPM
• RC – Ability Score, Ability Range, Reporting Categories
• Maze – Adjusted Maze Score
• ORF (3rd – 5th) – Adjusted Fluency Score

Questions to answer using the School Grade Summary Report
What is the distribution of the “yellow” zone (16-84%)?
Say that 38% of Seventh Graders fall in the yellow zone. Looking at a particular distribution, are more of them at the high end or the low end of the zone?
Say that the distribution is relatively even across the zone, with 10-12 students in each section, until the last two sections: the 65-74% chance section has 26 students and the 75-84% chance section has 45 students. We want to see the bulk of the distribution fall to the right, with more students close to the high success zone of 85% chance or higher.

Assessment/Curriculum Decision Tree for Reading Improvement Using FAIR’s RC’s FSP and Maze and Word Analysis Percentile Ranks

Common Questions
• What score types should be used to measure growth? For RTI decision-making?
• Are scores dropping from AP1 to AP2 this year?
• Is FAIR norm-referenced? Criterion-referenced?
• Why does FAIR have adaptive tests (RC and WA) when we want grade-level information?
• Why doesn’t FAIR report at the benchmark level?
• How do we get instructional information out of FAIR?

Measuring Growth on FAIR
• In K-2, use PRS descriptively: e.g., establish the % of students to be in the green zone at each AP.
• In 3-12, use the reading comprehension ability score (RCAS) because it has the same metric across time (like using FCAT’s DSS rather than SS for growth). FAIR’s RCAS has a mean of 500; SD = 100; range = 150-1000.
• In 3-12, do not use FSP to measure growth in reading comprehension ability, because FSP incorporates the prior FCAT score, which does not change, whereas RCAS can change over time.
• For students without a matched FCAT score, FSP is based solely on RCAS. When a student’s FCAT score becomes available in the PMRN, the FSP may look unexpectedly high or low compared to the previous AP’s FSP.

Why we use Ability Scores
Tom Brady     AP1    AP2    AP3
Raw           12     17     20
Standard      100    100    100
Percentile    50th   50th   50th
Ability       500    550    600

[Table: AP score and PM score for Reading Comprehension, Mazes, and Word Analysis. Entries recoverable from the slide: FSP, RCAS, percentile rank, SS, Adj. Maze, WAAS, student score.]
PM = progress monitoring; SS = standard score

3-12 Instructional Toolkit
• Phonics screener and academic word list can be used to monitor progress.

Progress Monitoring in K-2
• The G2 Broad Screen is the same at each AP: the number of words read correctly in 45 seconds. Thus, monitor progress in timed decoding.
• Equated OPM passages in G1-G5 allow for ORF progress monitoring (with the 40th percentile at each grade, provided at www.fcrr.org, as the criterion).
• Percent of G1 & G2 students who read the target passage with comprehension.
• Percent of students achieving 80% mastery on TDI tasks.

AP1 to AP2 Scores
• Mean FSP and RCAS increased slightly in grades 3-8 and stayed the same in grades 9-10 at the state level.
• Standard deviations are fairly large, which explains why some are reporting drops.
• In grades 3-8, the majority of students tested were at FCAT level 3 or above. This reflects FAIR norms for FCAT 2.0.
• Slightly lower growth from the 09-10 school year to the 10-11 school year is likely due to the demands of FCAT 2.0.

Norm- vs. Criterion-Referenced
• The K-2 Broad Screen predicts to a nationally normed test; percentiles for Vocabulary and G2 Spelling are based on FL grade-level norms.
• The K-2 diagnostic inventory is criterion-referenced (80% of skills mastered).
• The 3-12 Broad Screen predicts to a criterion, passing FCAT (i.e., the FSP).
• Percentiles for the 3-12 Broad Screen, Maze, and WA are based on FL grade-level norms.

Value of Computer-Adaptive Tests
• Provide a more reliable and quicker assessment of student ability than a traditional test, because they create a unique test tailored to the individual student’s ability.
• Provide more reliable assessments, particularly for students at the extremes of ability (extremely low ability or extremely high ability).
• Grade-level percentiles are currently provided; Grade Equivalent scores will be provided next year.
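
The growth guidance earlier in the deck (use the ability score, not the standard score or percentile) can be checked against the Tom Brady example. The following is a minimal Python sketch, not part of FAIR or the PMRN: the `APRecord` class and `change` function are hypothetical names, and it assumes the ability column in that example is on the RCAS metric (mean 500, SD 100).

```python
from dataclasses import dataclass

RCAS_SD = 100  # standard deviation of the RCAS metric, as stated in the deck


@dataclass
class APRecord:
    """Scores for one student at one assessment period (hypothetical type)."""
    period: str
    raw: int
    standard: int
    percentile: int
    ability: int


# Values copied from the "Why we use Ability Scores" example.
TOM_BRADY = [
    APRecord("AP1", raw=12, standard=100, percentile=50, ability=500),
    APRecord("AP2", raw=17, standard=100, percentile=50, ability=550),
    APRecord("AP3", raw=20, standard=100, percentile=50, ability=600),
]


def change(records, score_type):
    """Difference between the last and the first AP for one score type."""
    return getattr(records[-1], score_type) - getattr(records[0], score_type)


for score_type in ("standard", "percentile", "ability"):
    print(score_type, change(TOM_BRADY, score_type))

# Ability-score growth expressed in standard-deviation units.
print("ability growth in SD units:", change(TOM_BRADY, "ability") / RCAS_SD)
```

In this example the standard score and percentile show no change from AP1 to AP3, while the ability score rises by 100 points, which would be one standard deviation under the stated assumption. That is the pattern the deck points to when it recommends the ability score for measuring growth.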

Benchmark Conundrum
• Benchmark tests rarely have enough items to be reliable at the benchmark level.
• Besides, teaching to benchmarks (e.g., “the student will use context clues to determine meanings of unfamiliar words”) results in fragmented skills.
• Teach to the standard(s) (e.g., “The student uses multiple strategies to develop grade appropriate vocabulary”).
• Assess at aggregate levels (e.g., Reporting Categories), if CFA (confirmatory factor analysis) shows the categories are valid.

FCAT 2.0 Reporting Categories
• Reporting Category 1: Vocabulary
• Reporting Category 2: Reading Application
• Reporting Category 3: Literary Analysis – Fiction/Nonfiction
• Reporting Category 4: Informational Text/Research Process

FCAT 2.0: Benchmarks x Grade
Grade   Category 1   Category 2   Category 3   Category 4   Total
3       4            6            3            1            14
4       4            6            3            1            14
5       4            6            3            2            15
6       4            5            3            2            14
7       4            5            3            2            14
8       4            5            3            2            14
9/10    4            5            3            2            14

Possible Benchmark Solutions
• Stop-gap: start each student with a grade-level passage. Provide % correct on Reporting Categories. Then continue to the current adaptive system to obtain reliable, valid FSP and RCAS.
• For the future: Align FAIR to the Common Core. Develop a grade-level CAT that is item adaptive.
• Challenges: Dimensionality; multi-dimensional IRT; testlet effects.

Scarborough (2002)

Dimensions of Word Knowledge
• Knowledge of word’s spoken form (pronunciation)
• Written form (spelling)
• Grammatical behavior (syntactic/morphological features)
• Co-locational behavior (occurs with other words)
• Frequency (orally and in print)
• Stylistic register (e.g., academic language; informal)
• Conceptual meaning (antonyms, synonyms)
• Association with other words (inter-relatedness)
Nation (1990) in Nagy & Scott (2000)

Vocabulary Knowledge Task
[Figure labels: Word Meanings; Text]
• FCRR has developed a 3-minute adaptive vocabulary sentence task that strongly predicts RC.
• Students pick 1 of 3 morphologically related words to best complete a sentence.
• Teachers can use percentiles, standard scores, & ability scores to drive vocabulary instruction.

New Vocabulary Knowledge Task

In Support of FAIR
Other Resources:
• PMRN Help Desk and Just Read, Florida! staff are available 8:00-5:00, Monday through Friday.
• FAIR technical tips, Users’ Guides, and frequently asked questions are available at the FCRR website: www.fcrr.org/fair/ and the Just Read, Florida! website: www.justreadflorida.com/instrreading.asp
• LEaRN (Literacy Essentials and Reading Network) www.justreadflorida.com/LEaRN: FAIR resources for teachers, coaches, and principals, which include training videos and clips of master trainers administering FAIR to students.

For more detailed information regarding the PMRN:
• The PMRN User’s Guides are located at: www.fcrr.org/pmrn/userguides.shtm
  There is a section that provides an Overview of Reports and a section that provides an Explanation of Score Types.
• Complete PowerPoints can be found on the PMRN website: www.fcrr.org/pmrn/
  Topics include:
  • K-2 Data Entry
  • K-2 Electronic Scoring Tool
  • 3-12 Web-based Assessment Module (WAM)
  • School Level Users 1, 2, 3
• Phone: 850-644-0931 or 866-471-5019
• Email: helpdesk@fcrr.org

In summary, remember…
• FAIR was designed to inform instruction.
• FAIR data are just one part of the puzzle. Teachers bring information from daily instruction that provides additional, important information.
• As FCRR gains more experience with the data and collects data statewide, we will be able to provide guidelines on what are “typical” scores at each grade level or what is a “good” percentage to have in the high success zone (green zone).
• We will make comparisons to the State Mean scores and percentages at the end of each AP.
• A new 3-12 Tech Manual will be available at the end of summer 2011.

Thank you for your time and attention!
Any Questions?
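
Backup sketch: the K-2 advice to use PRS descriptively (track the percent of students in the green zone at each AP) can be illustrated with the Grade 1 PRS chart. This is a minimal Python sketch under stated assumptions, not FAIR or PMRN code: the function names and the sample class are made up, and it assumes the top row of the chart (0-10) is the student's Broad Screen score.

```python
from collections import Counter

# Grade 1 PRS chart (2010-2011), copied from the deck. The list index is the
# value in the chart's top row (0-10), assumed here to be the Broad Screen score.
GRADE1_PRS = {
    "AP1": [0.11, 0.14, 0.20, 0.27, 0.36, 0.46, 0.56, 0.66, 0.75, 0.82, 0.86],
    "AP2": [0.01, 0.02, 0.03, 0.05, 0.09, 0.17, 0.28, 0.44, 0.61, 0.76, 0.86],
    "AP3": [0.01, 0.02, 0.03, 0.06, 0.12, 0.20, 0.33, 0.49, 0.65, 0.79, 0.88],
}


def prs_zone(prs):
    """Zone for a PRS: green at .85 or above, red at .15 or below, else yellow."""
    if prs >= 0.85:
        return "green"
    if prs <= 0.15:
        return "red"
    return "yellow"


def zone_percentages(scores, ap):
    """Percent of students in each zone for one assessment period."""
    zones = Counter(prs_zone(GRADE1_PRS[ap][score]) for score in scores)
    return {
        zone: round(100 * zones[zone] / len(scores))
        for zone in ("green", "yellow", "red")
    }


# Hypothetical class of eight first graders at AP1 (made-up scores).
sample_scores = [10, 10, 7, 6, 5, 4, 1, 0]
print(zone_percentages(sample_scores, "AP1"))
# prints {'green': 25, 'yellow': 50, 'red': 25}
```

Running the same function on each AP's scores gives the green-zone percentage over the year, which is the descriptive use of PRS the deck recommends for K-2.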