29 Best Practices in the Analysis of Progress-Monitoring Data and Decision Making

Michael D. Hixson, Central Michigan University
Theodore J. Christ, University of Minnesota
Teryn Bruni, Central Michigan University

This is a pre-print; I apologize if there are some errors. The chapter is now published in Best Practices VI.

OVERVIEW

Progress monitoring is one of the most important tools used by school psychologists to evaluate student response to both academic and behavioral interventions. It is the basis for data-based decision making within a multitiered problem-solving model, and it solidifies the linkage between assessment and intervention. Research and program evaluation is a foundation of service delivery in the National Association of School Psychologists (NASP) Model for Comprehensive and Integrated Psychological Services (NASP, 2010). Progress monitoring is essential in ensuring that student outcomes are tightly linked with services and programs provided within schools. By providing a real-time account of how interventions are effectively (or not effectively) moving students toward predetermined goals, progress monitoring helps to ensure accountability for school personnel and adds a self-correcting feature to intervention efforts. Within a multitiered system, evaluation of student progress within tiers and movement between tiers depends highly on careful and accurate evaluation of student performance over time.

This chapter provides general guidelines for school psychologists for selecting and measuring student behavior, displaying and analyzing ongoing data, and making decisions based on progress-monitoring data. After reading this chapter, school psychologists will be aware of issues related to selecting valid and reliable measures of student behavior, they will be able to analyze characteristics of graphic displays, and they will be able to apply decision guidelines to determine the effectiveness of interventions.

BASIC CONSIDERATIONS

In the 1920s and 1930s, B. F. Skinner took repeated measures of the behavior of individual rats to determine the effects of various environmental manipulations on the frequency of lever pressing. His simple experimental arrangement uncovered the basic principles of behavior and learning. The methods he used were adapted in the 1950s and 1960s to study human behavior, and it is a version of those single-case design methods that is used for progress monitoring in schools. The term, but not the concept, of progress monitoring emerged alongside curriculum-based measurement (CBM) and problem solving, but it is used today in school psychology to reference practices with a variety of measurement methods and domains of behavior.

Progress monitoring is a hallmark feature of problem solving within a multitiered service delivery model. By measuring behavior over time and observing how changes in the environment have an impact on that behavior, school psychologists can make more informed instructional decisions to improve student outcomes. Instructional time is not wasted on ineffective interventions, and specific variables affecting behavior can be systematically tested and monitored (e.g., Batsche et al., 2005; Stecker, Fuchs, & Fuchs, 2005). One of the most commonly used systems to measure academic progress is CBM.
Although progress monitoring using CBM is recommended in the professional literature, the research basis of CBM for progress monitoring is limited, as demonstrated in four publications. First, the report on multitiered services in reading, which was developed by an expert panel convened by the Institute of Education Sciences in the U.S. Department of Education, found that CBM reading progress monitoring does not result in improved student outcomes. Specifically, the expert panel stated:

Of the 11 randomized controlled trials and quasi-experimental design studies that evaluated effects of Tier 2 interventions and that met WWC [What Works Clearinghouse] standards or that met WWC standards with reservations, only 3 reported using mastery checks or progress monitoring in instructional decision making. None of the studies demonstrate[s] that progress monitoring is essential in Tier 2 instruction. However, in the opinion of the panel, awareness of Tier 2 student progress is essential for understanding whether Tier 2 is helping the students and whether modifications are needed. (Gersten, 2009, p. 24)

Second, in their review of the literature, Stecker et al. (2005) concluded that CBM was often ineffective because educators do not implement the procedures with fidelity and often fail to respond to the data. Third, in another review of the literature, Ardoin, Christ, Morena, Cormier, and Klingbeil (2013) concluded that there was poor to moderate support for the frequently reported guidelines associated with progress monitoring using CBM. Neither the guidelines for progress-monitoring duration, such as the number of baseline data points to collect, nor the corresponding decision rules were well supported in the literature. Finally, a number of published studies refute the notion that CBM oral reading fluency progress monitoring is reliable, valid, and accurate when the duration of monitoring is short, procedures are poorly standardized, or instrumentation is of poor quality (Christ, 2006; Christ, Zopluoglu, Long, & Monaghen, 2012; Christ, Zopluoglu, Monaghen, & Van Norman, 2013).

Although the research literature on using CBM for progress monitoring is limited, there is a large body of basic and applied research using direct measures of behavior for progress monitoring. School psychologists need to carefully consider a number of variables when designing progress-monitoring measures.

BEST PRACTICES IN THE ANALYSIS OF PROGRESS-MONITORING DATA AND DECISION MAKING

The following sections describe best practices for school psychologists for collecting and graphically displaying progress-monitoring data, for analyzing data, and for making decisions based on progress-monitoring data. These activities fall within the area of formative evaluation, which is a key component of the problem-solving model. Formative evaluation of progress-monitoring data most commonly occurs within Tiers 2 and 3 of a multitiered system of service delivery. Although there is a large research literature in behavior analysis that has used what could reasonably be referred to as progress-monitoring data as the primary dependent variable, the research on certain types of data for progress monitoring is still evolving. Therefore, specific best practice guidelines are not always possible. Rather, school psychologists need to be aware of the issues related to interpreting and analyzing progress-monitoring data.
Best Practices in Data Collection and Data Display

Identifying a valid and reliable behavior for progress monitoring is challenging. This section presents a summary of best practices for school psychologists for data collection and data display and discusses issues related to selecting valid and reliable progress-monitoring measures.

Validity and Reliability of Progress-Monitoring Data

School psychologists must use valid measures of behavior. Validity refers to whether the assessment instrument measures what it purports to measure. The first step in developing a valid progress-monitoring measure is to define the behavior in observable terms. It is often quite difficult to turn teacher and parent concerns into objective definitions of behavior. In order to have a direct measure of the behavior, the behavior should have a clear beginning and end, and the conditions under which the behavior occurs should be identified. The behavior measured should be the same as the one about which conclusions will be drawn (Johnston & Pennypacker, 2009). For example, if the student frequently does not follow teacher directions, then this behavior can be directly measured by recording the number of instances of noncompliance and dividing it by the number of teacher directions. An indirect measure of the behavior would be the completion of a behavior rating scale by the teacher that measured student compliance. This is indirect because teacher behavior is measured but school psychologists are interested in student behavior. When the appropriate dimension of the behavior of concern is directly measured, then the measure is a valid one. As another example, if the concern is accuracy in solving addition problems, then the number of correct and incorrect problems can be recorded from the student's independent practice worksheets. If accuracy in solving math problems were the concern but the duration of solving the math worksheets was recorded, then the measure would be indirect and invalid.

In addition to directly measuring the appropriate dimension of behavior, the measurement should take place at the appropriate time and place. Student performance can only be understood within a given environmental context. If a student's noncompliance occurs in academic subjects but not during other activities, such as gym class and recess, then the behavior should be measured during the academic subjects. Ideally, all instances of the target behavior are recorded (this is called continuous recording), but if the behavior is only sampled, then whether a representative or valid sample was obtained becomes a concern. Direct measures of specific behaviors provide valid and accurate accounts of behavior and are preferred over indirect measures, particularly when providing more targeted or intensive interventions. Indirect measures are often convenient to obtain, but school psychologists should always be concerned with validity. The validity of indirect measures may be evaluated by correlating scores on the indirect measures with scores on a direct measure. Measures of behavior drawn from curriculum materials for progress monitoring can be considered direct or indirect measures of behavior depending on how precisely they correspond with the behavior of interest.
For example, if the behavior of interest is reading rate and CBM Reading is derived from curriculum materials or closely related materials, then there is a direct measure of the target behavior, which is reading rate. But if the same CBM passages are used as a general outcome measure of reading (i.e., to assess the broader skill of reading or general reading achievement), then the broad behavior of reading is being indirectly measured. CBM Reading and reading rate are only indicators of reading, which is composed of many behaviors. Because general outcome measures are used as indicators of a broader skill set, validity and reliability must first be demonstrated to support their use as formative measures of student progress. Fortunately, a large amount of reliability and validity research supports the use of CBM Reading for screening. However, data collected from broad measures that sample skills, such as CBM Reading, tend to be highly variable when used for progress monitoring.

A reliable measure is one in which the same value is obtained under similar conditions. With indirect measures, traditional psychometric procedures are used to determine the reliability of the measure, such as administering the test two times under the same conditions and correlating the scores in test-retest reliability. Because progress-monitoring measures are often direct measures of behavior or behavior products, the most relevant type of reliability measure is interrater reliability or interobserver agreement. School psychologists should collect interobserver agreement data when there is concern about the reliability of the data or when the data are used to make important decisions. Interobserver agreement is calculated from the data of two observers who have measured the same behavior at the same time. There are various methods to calculate interobserver agreement, some of which depend on the type of data that have been collected. If the data recorded were the number of call-outs in class, one observer might have recorded 8 instances of the behavior and the other 10 instances. With this type of frequency data, interobserver agreement is typically calculated by dividing the smaller number by the larger and multiplying by 100%, in this case yielding 80% interobserver agreement. The same calculation is used with duration data; that is, the smaller duration is divided by the larger duration. This calculation is sometimes referred to as overall agreement or total percent agreement. With interval data, trial data, or permanent product data, the calculation can be based on each instance of agreement or disagreement. This type of interobserver agreement is sometimes called point-by-point agreement. If interval data were collected, then the data from each interval are compared across observers, and the number of intervals in which there were agreements is divided by the number of agreements plus disagreements. The result is multiplied by 100% to yield the percentage of intervals in which the observers agreed. The same approach to calculating interobserver agreement can be used with trial-by-trial data by substituting trials for intervals.
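To make the two calculations concrete, here is a minimal Python sketch (not part of the original chapter) implementing total percent agreement and point-by-point agreement; the observation data are hypothetical.

```python
# A minimal sketch of the two interobserver agreement calculations
# described above, with hypothetical data.

def total_percent_agreement(count_a: float, count_b: float) -> float:
    """Overall agreement for frequency or duration data:
    smaller value divided by larger value, times 100."""
    if count_a == count_b == 0:
        return 100.0
    return min(count_a, count_b) / max(count_a, count_b) * 100

def point_by_point_agreement(obs_a: list, obs_b: list) -> float:
    """Point-by-point agreement for interval or trial data:
    agreements / (agreements + disagreements), times 100."""
    agreements = sum(1 for a, b in zip(obs_a, obs_b) if a == b)
    return agreements / len(obs_a) * 100

# Example from the text: one observer records 8 call-outs, the other 10.
print(total_percent_agreement(8, 10))  # 80.0

# Interval data: 1 = behavior observed in the interval, 0 = not observed.
observer_1 = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
observer_2 = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]
print(point_by_point_agreement(observer_1, observer_2))  # 80.0
```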
Parallel Measurements and Variability

Progress monitoring requires measurements that are approximately parallel across time. That is, the measurement conditions must be consistent so that data are comparable. Thus, school psychologists should conduct observations of classroom behavior while controlling for time of day, activity, educators, and other potentially influential factors. Ensuring parallel measurement conditions maximizes the potential to observe true intervention effects. Christ et al. (2013) engaged in extensive and systematic research to evaluate the effects of assessment conditions on CBM Reading progress-monitoring outcomes, and these general findings may apply to some other progress-monitoring measures. In the case of academic assessment and CBM, alternate forms must be approximately parallel, which means that the content and difficulty of alternate forms should be very similar. As with any progress-monitoring measure, the other conditions of assessment must also be standardized, such as time of day, administrator, and setting.

Display of Progress-Monitoring Data

Once the relevant dimension of behavior is identified and accurate, reliable, parallel, and valid measures of performance are selected, school psychologists must decide how to present data so that changes in performance over time can be effectively interpreted and evaluated. Progress-monitoring data are displayed graphically and interpreted through methods of visual analysis. Graphic displays of data are useful because they provide a clear presentation of behavior change over time and allow for quick and easy interpretation of many different sources of information. Accurate interpretation and analysis of data depend highly on how the data are displayed and organized.

There are many formats that school psychologists can use to display data, including bar graphs, scatterplots, and tables. The most common progress-monitoring format is the line graph. Line graphs are useful because they facilitate visual analysis with or without supplemental graphic or statistical aids. Graphic aids include goal lines, trendlines, level lines, and envelopes of variability. Statistical aids include estimates of the average level or the slope of the trendline of behavior across observations. Graphs typically include data points, data paths, phase changes (i.e., alternate conditions), and axes. Figure 29.1 presents each component using hypothetical data.

Figure 29.1. Basic Components of a Line Graph

The lower horizontal axis corresponds with the time variable, which is often measured in days or weeks. The units along the horizontal axis should be depicted in a manner that preserves the unit of time, rather than a scale that indicates session numbers or observation intervals that do not indicate how much time passed between measurements. The time metric is required to accurately represent changes in student performance and to calculate trend. The vertical axis corresponds with the measurement variable (i.e., the outcome being measured), such as the number of words read correctly per minute. The scale used for the dependent variable will depend on the student's current level of performance, expected gains over time, and the overall goal of the intervention. Performance is then measured repeatedly over time within the different conditions or phases.
Data points are typically represented by small symbols such as open circles, triangles, or squares. Each symbol depicts a data series or outcome measure. The symbols are placed on the graph to represent performance at specific points in time. The lines that connect the data points are called the data path, and data paths connect data points within a condition. New conditions are depicted by solid vertical lines that correspond to a point in time. As shown on the sample graph in Figure 29.1, it is also useful to insert a text box above the graph that defines relevant information about the student (e.g., name, teacher, grade, year), specific measurement procedures, intervention conditions, and a clearly stated measurable objective/goal for the student. Milestones that define intermediate goals (e.g., monthly, quarterly) are often helpful. Such information communicates the purpose, features, goals, and subject of the graphic display, which is useful as an archival record and for sharing the data with others who might be less familiar with the case.

Specific best practices and conventions for constructing line graphs exist within the single-case design literature to facilitate interpretation and analysis, and these can help guide school psychologists in constructing clear and useful graphs to monitor student progress (Cooper, Heron, & Heward, 2007; Gast, 2010; Riley-Tillman & Burns, 2009). First, clarity can be impeded by including more than three behaviors on a single graph (Gast, 2010). If multiple behaviors are represented on the same graphic display, then the behaviors should be related to one another so that comparisons between the data series are meaningful (e.g., a behavior to decrease presented with the corresponding replacement behavior). Also, trendlines and data paths are disconnected between phase changes, extended gaps in data collection, and follow-up data collections. Finally, school psychologists should ensure that the scales used accurately represent the outcome being measured, including the maximum and minimum values that could be obtained. Each axis should be clearly labeled to indicate the variable of interest, the dimension measured (in the case of the vertical axis), and the scale of measurement used (e.g., percent correct on the vertical axis with days on the horizontal axis). A simple plotting sketch that follows these conventions appears below.

Frequency data on the vertical axis are usually displayed with equal intervals; that is, the distance between any two points is always the same. For example, the distance between 3 and 4 words read correctly is the same as between 20 and 21 words on a particular graph. On equal interval graphs, change from one unit to the next across all values of a particular scale is represented by the same distance between each unit.
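As a concrete illustration of these conventions, the following is a minimal Python/matplotlib sketch; the student details, dates, and scores are hypothetical, and this is not the chapter's Figure 29.1.

```python
# A minimal matplotlib sketch showing the conventions described above:
# a time-based horizontal axis, labeled axes, a solid vertical
# phase-change line, and data paths disconnected across the phase change.
import matplotlib.pyplot as plt

baseline_days = [1, 3, 5, 8, 10]
baseline_wrc = [22, 25, 21, 24, 23]           # words read correctly/min
intervention_days = [12, 15, 17, 19, 22, 24]
intervention_wrc = [26, 29, 31, 30, 34, 36]

fig, ax = plt.subplots()
# Separate plot calls keep the data paths disconnected between phases.
ax.plot(baseline_days, baseline_wrc, marker="o", color="black")
ax.plot(intervention_days, intervention_wrc, marker="o", color="black")
ax.axvline(x=11, color="black")               # solid line marks the phase change
ax.set_xlabel("School day")                   # time metric, not session number
ax.set_ylabel("Words read correctly per minute")
ax.set_ylim(0, 60)                            # scale reflects obtainable values
ax.set_title("Student: [name]  Grade 3  Goal: 60 WRC/min by week 12")
plt.show()
```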
Rather than looking at these absolute changes, school psychologists may also find it useful to look at relative changes in performance. Relative changes can be visually analyzed by using a logarithmic or ratio-scaled vertical axis. Although this requires some training and practice initially, it helps the viewer to visually analyze the proportional change in behavior. For example, a change from 10 to 20 words read correctly would appear as the same degree of change as from 40 to 80 words because the proportional change is the same; that is, both are doubling over time. Equal interval graphs by necessity use different scales to measure behaviors that differ greatly in their frequency. It is difficult to use a standard nonlogarithmic graph to depict behaviors with very different frequencies, such as one that occurs a few times per day and another that occurs hundreds of times per day. In contrast, it is easy to accomplish this with a logarithmic graph. Figure 29.2 illustrates the difference between equal interval and logarithmically scaled data when two behaviors of very different frequencies are plotted on the same graph. The graph on the left shows relatively stable correct and incorrect responses, but it is more evident on the right graph that errors are increasing. A particular semi-logarithmic graph called the Standard Celeration Chart (Graf & Lindsley, 2002; Pennypacker, Gutierrez, & Lindsley, 2003) can display behaviors that occur as infrequently as once per day to as often as 1,000 times per minute. Being able to graph almost any human behavior on the same chart is not only convenient but also makes it easy to compare behaviors across charts. Also, school psychologists are often interested in relative/proportional change. A change from talking with peers 20 times per day to 21 times may not be very important, but a change from 0 to 1 could very well be. While a full discussion of the differences between equal interval line graphs and the Standard Celeration Chart falls beyond the scope of this chapter, analysis, interpretation, and communication of student outcomes can be enhanced with the Standard Celeration Chart (Kubina & Yurich, 2012).

Figure 29.2. Hypothetical Student Data Using an Equal Interval Scale Compared to a Semi-Logarithmic Scale
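To illustrate the difference in scales, here is a small Python/matplotlib sketch with hypothetical data in the spirit of Figure 29.2. It uses a generic semi-logarithmic axis, not the Standard Celeration Chart's specific construction.

```python
# A minimal sketch plotting a frequent and an infrequent behavior on an
# equal interval axis and on a semi-logarithmic axis, where proportional
# change is easier to see. Data are hypothetical.
import matplotlib.pyplot as plt

days = list(range(1, 11))
correct = [120, 118, 125, 122, 119, 124, 121, 126, 123, 125]  # per day
errors = [1, 1, 2, 2, 3, 4, 5, 6, 8, 10]                      # per day

fig, (ax_linear, ax_log) = plt.subplots(1, 2, figsize=(9, 4))
for ax, title in [(ax_linear, "Equal interval scale"),
                  (ax_log, "Semi-logarithmic scale")]:
    ax.plot(days, correct, marker="o", label="Correct")
    ax.plot(days, errors, marker="s", label="Errors")
    ax.set_xlabel("School day")
    ax.set_ylabel("Responses per day")
    ax.set_title(title)
    ax.legend()
ax_log.set_yscale("log")  # the steady doubling of errors shows as a straight climb
plt.tight_layout()
plt.show()
```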
Best Practices in Analysis and Decision Making

This section presents a summary of best practices in the analysis of progress-monitoring data to guide educational decisions. In order to make important decisions regarding student progress, school psychologists must engage in careful analysis of student data. Accurate analysis of progress-monitoring data involves evaluating data within and across conditions, setting goals, and analyzing intervention effects and variability in the data to guide decision making.

Analysis of Progress-Monitoring Data in Baseline

Progress-monitoring data are repeatedly collected across time before an intervention is implemented. This preintervention condition is called the baseline condition, and the data in this condition are collected to permit a comparison to the intervention data. Baseline data can also help school psychologists determine whether or not the reported problem is a real concern. If the baseline data indicate that the student's performance is not significantly discrepant from that of peers, then no intervention is warranted.

Baseline data should be collected and charted until a stable pattern of behavior emerges. The characteristics of the data that are considered in determining whether there is a stable pattern of behavior are level, trend, and variability. Level refers to the average performance within a condition. Level is often the characteristic school psychologists are most concerned with; that is, the student may be doing something too often or not enough. To estimate a student's level of behavior, the mean or median can be calculated and illustrated by a horizontal line on the graph. School psychologists should exercise caution when interpreting level if there is a trend in the data or the data are highly variable, because the level might not be substantially representative of the behavior. Trend refers to systematic increases or decreases in behavior over time. It is often estimated with a trendline, which is a graphic aid, or slope, which is a statistical aid. It may take many data points to get an accurate measure of trend, and the addition of a trendline may aid in the analysis of trend. Variability refers to how much a student's data deviate from the level or trendline. It is often estimated with the standard deviation, range, or standard error of the estimate (Christ, 2006; Christ et al., 2013).

There are many influences on behavior that are beyond the school psychologist's control and that can produce highly variable performance. The academic tasks and other classroom factors vary from day to day, as do many factors outside of school. Highly variable data obscure trends and levels and, therefore, hinder interpretation of progress-monitoring data. The sources of variability should be identified and controlled, if possible. The collection of additional baseline data is particularly helpful when the data are variable. There are many kinds of extraneous variables that could be responsible for variable performance, including the measurement system itself. The variability of CBM oral reading fluency tends to be relatively high, as might that of other measures of very broad skills.

School psychologists should ensure that a relatively stable baseline is obtained so that the effects of the intervention can be accurately interpreted. What is considered stable depends on many factors, such as the type of measure being taken and the extent to which clear experimental control is needed. In most cases, school psychologists should consider the data stable if 80% of the data points fall within 20% of the median line (Gast, 2010). The median rather than the mean is used because the mean is more sensitive to extreme scores. If there is an increasing or decreasing trend in the data, then it is helpful to determine the stability of the data around the trendline using the same criterion: If 80% of the data fall within 20% of the trendline, then the trend may be considered stable. See Figure 29.3.

Figure 29.3. Illustration of the 80-20% Stability Rule

For the data in Figure 29.3, a mean line has been drawn that equals 4.9 responses. The mean or level of these data is not their most important characteristic, however, because of the increasing trend in performance, which is summarized by the trendline. A stability envelope (Gast, 2010) has been drawn around the data to determine whether the data meet the 80-20% criterion for stability around a trendline. In this case, 77% of the data points fall within 20% of the trendline, which is close to the stability criterion. Attending to the most prominent characteristic of the data (level, trend, or variability) is important for understanding the target behavior and for comparing performance across baseline and intervention conditions. After sufficient and relatively stable baseline data have been gathered, the intervention can be introduced.
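As an illustration, the following minimal Python sketch (not from Gast, 2010) applies the 80-20% criterion around the median line; the same logic could be applied around trendline-predicted values when a trend is present. The data and function names are hypothetical.

```python
# A minimal sketch of the stability check described above: the data are
# considered stable if at least 80% of points fall within +/-20% of the
# median line.
from statistics import median

def is_stable(scores, envelope=0.20, criterion=0.80):
    """Return (stable?, proportion of points inside the envelope)."""
    med = median(scores)
    lower, upper = med * (1 - envelope), med * (1 + envelope)
    inside = sum(1 for s in scores if lower <= s <= upper)
    proportion = inside / len(scores)
    return proportion >= criterion, proportion

# Hypothetical baseline of words read correctly per minute.
baseline = [24, 26, 22, 25, 23, 27, 25, 24]
stable, prop = is_stable(baseline)
print(f"{prop:.0%} of points within 20% of the median; stable: {stable}")
```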
Analysis of Data Across Conditions

Collecting progress-monitoring data, and thereby obtaining information on the level, trend, and variability of the target behavior before an intervention is implemented, is useful in helping to understand the target behavior and the extent to which the behavior is a problem. If the progress-monitoring data are collected from direct observations, then the observer may also identify environmental events that trigger or consequate the behavior, which is helpful for intervention planning because it may give clues to the function of the behavior, as is often done as part of a functional assessment or functional analysis. In addition to these benefits, collecting progress-monitoring data in baseline allows an objective evaluation of the effects of interventions. If sufficient baseline data have been collected to determine the level, trend, and variability of behavior, then this information can be used to predict what is likely to happen if conditions stay the same. If the intervention data are very different from this predicted pattern, then it is possible that the change is due to the intervention. Any of the three data characteristics (level, trend, or variability) may change because of the intervention.

School psychologists should consider the following factors when trying to determine whether any of the three characteristics changed because of the intervention. First, consider whether there were a sufficient number of data points in each condition to obtain a predictable pattern of behavior under both the baseline and intervention conditions. Second, consider the immediacy of effect, or how closely the change in behavior corresponds with the introduction of the intervention. According to Kratochwill et al. (2010), the last three data points from the baseline condition should be compared to the first three data points of the intervention condition to evaluate immediacy. Third, consider the degree of overlap of data points across conditions. If few data points overlap, then the results are more likely the result of the intervention. High variability increases the probability of overlapping data points across conditions, which can obscure the effects of the intervention. The percentage of nonoverlapping data points across conditions may be calculated as a summary statistic of overlap; a small computational sketch follows this discussion.

To help determine the degree of overlapping data points across conditions, an envelope can be drawn around the data in baseline and projected into the intervention condition. In Figure 29.4 there is no overlap in data across conditions, with the target behavior rapidly decreasing during the intervention and leveling off at around four responses. The envelope encompasses all of the data in the baseline condition and the lines are drawn horizontally. If there is a clear trend in the data, as in Figure 29.5, then the envelope can be drawn using the slope obtained from the trendline. The data in Figure 29.5 also show a clear change in trend across conditions and no overlapping data points when the slope of the trendline is used to extend the data envelope.

Figure 29.4. Illustration of Data Envelope Showing No Overlap Across Conditions

Figure 29.5. Illustration of Data Envelope Defining a Clear Change in Trend Across Conditions
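The following Python sketch shows one common way to compute the percentage of nonoverlapping data. Conventions vary across authors, so treat this definition (intervention points beyond the most extreme baseline point in the expected direction) as one reasonable choice rather than a formula prescribed by the chapter.

```python
# A minimal sketch of the percentage of nonoverlapping data (PND): for a
# behavior expected to decrease, the share of intervention points falling
# below the lowest baseline point.

def pnd(baseline, intervention, expected_direction="decrease"):
    """Percentage of intervention points beyond the most extreme
    baseline point in the expected direction of change."""
    if expected_direction == "decrease":
        nonoverlap = [x for x in intervention if x < min(baseline)]
    else:  # behavior expected to increase
        nonoverlap = [x for x in intervention if x > max(baseline)]
    return len(nonoverlap) / len(intervention) * 100

# Hypothetical daily counts of a behavior targeted for reduction.
baseline = [12, 14, 11, 13, 12]
intervention = [9, 7, 5, 4, 4, 3]
print(f"PND = {pnd(baseline, intervention):.0f}%")  # 100%
```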
More advanced statistical procedures may be used to analyze trends and changes in data with certain types of progress-monitoring data. For example, in the case of CBM Reading progress-monitoring data, Christ (2006) and Christ et al. (2013) have advocated the use of confidence intervals to help guide interpretation, especially if the data are used for high-stakes diagnostic and eligibility decisions. CBM progress-monitoring measures have published estimates of the standard error of measurement (e.g., the median standard error of measurement for CBM Reading is 10; Christ & Silberglitt, 2007). Moreover, regression-based estimates of trend have standard errors that are easily derived with spreadsheet software; the function in MS Excel is "=STEYX(y-values, x-values)." The standard error of the slope (SEb) can be applied in a manner similar to the standard error of measurement. That is, the 68% confidence interval is +/- 1 SEb and the 95% confidence interval is +/- 1.96 x SEb (see Christ & Coolong-Chaffin, 2007, for a more detailed description).
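For readers without spreadsheet software at hand, here is a minimal Python sketch of the same computation with hypothetical data. Note that Excel's STEYX returns the standard error of the estimate, which is converted to the standard error of the slope before building the confidence intervals.

```python
# A minimal sketch of the confidence-interval approach described above:
# fit an ordinary least-squares trendline, derive the standard error of
# the estimate (what Excel's =STEYX returns), convert it to the standard
# error of the slope (SEb), and report 68% and 95% confidence intervals
# for the weekly growth rate. Data are hypothetical.
import math

weeks = [1, 2, 3, 4, 5, 6, 7, 8]
wrc = [25, 27, 26, 30, 31, 33, 32, 36]  # words read correctly/min

n = len(weeks)
mean_x, mean_y = sum(weeks) / n, sum(wrc) / n
sxx = sum((x - mean_x) ** 2 for x in weeks)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, wrc))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Standard error of the estimate (residual standard error), as in =STEYX.
residuals = [y - (intercept + slope * x) for x, y in zip(weeks, wrc)]
see = math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))
seb = see / math.sqrt(sxx)  # standard error of the slope

print(f"Slope = {slope:.2f} WRC/week")
print(f"68% CI: {slope - seb:.2f} to {slope + seb:.2f}")
print(f"95% CI: {slope - 1.96 * seb:.2f} to {slope + 1.96 * seb:.2f}")
```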
The final factor to consider when trying to determine whether or not an intervention was effective is replication of effect. Replication is one of the most powerful ways to show a functional relationship between an intervention and a target behavior. There are various single-case experimental designs that handle replication differently. The two designs most relevant for the progress monitoring of individual students are the withdrawal and alternating treatments designs. In the most common version of a withdrawal design, a no-intervention (i.e., baseline) condition is alternated with an intervention condition. The target behavior is measured repeatedly within each condition. In the alternating treatments design, two conditions, which could be no-intervention and intervention conditions, are rapidly alternated. In a withdrawal design the same condition should have a similar effect across phases. For example, if the behavior were low in the first baseline, then it should also be low when the baseline condition is repeated, and if it were high in the first intervention condition, then it should be high in subsequent intervention conditions. For research purposes, a single-case design should have at least three replications of an effect or, in the case of the alternating treatments design, at least five repetitions of the alternating sequence (e.g., BCBCBCBCBC; Kratochwill et al., 2010). We believe these criteria are also useful for school psychologists when high-stakes decisions need to be made with progress-monitoring data, as is discussed in the next section.

Best Practices in Data-Based Decision Making

Progress monitoring alone does not produce meaningful gains in student performance. Decisions must also be made based on careful analysis of student data (Stecker et al., 2005). Although there has been research on the use of decision rules, few studies have investigated the effects of specific rules. However, research in the areas of single-case designs, CBM, and response to intervention outlines important considerations to help guide school psychologists in making decisions based on student data.

Student performance data should be evaluated in relation to a specific goal. Goals can be based on local norms, benchmark data, classroom norms, or peer comparisons. Goals are set over a specified amount of time, allowing for short-term objectives to be set regarding the student's rate of progress (e.g., acquire two letter sounds per week). A student's expected rate of progress can be represented by an aimline, which is a line drawn from the last or median baseline point to the expected goal (see Figure 29.6). The aimline is used as a guide for determining whether or not the student is making sufficient progress toward the goal. Progress should be evaluated in a manner that takes into consideration all elements of the student's environment, and decision guidelines should be applied flexibly. The following considerations provide some general rules/guidelines for decision making that have been used in research on progress monitoring or have been recommended by experts.

Figure 29.6. Expected Rate of Progress Represented by Aimline and Illustration of Trendline Based on Hypothetical Student Data

Interpreting Changes in Performance

In many cases it may not be vital to determine whether it was the intervention or some other factor that improved student performance. But as students move through Tiers 2 and 3 and are given more intensive instruction, it becomes increasingly important to correctly identify controlling variables. If progress-monitoring data are being used to help determine eligibility for special education, then it is of primary importance that an effective intervention is identified (Riley-Tillman & Burns, 2009). If the effective intervention is highly intensive, then it may require special education placement. Knowing that the intervention is in fact responsible for the improvement is necessary because determining student eligibility for special education is one of the most important decisions school psychologists make. Using an experimental design that involves replication, such as a reversal, multiple-baseline, or alternating treatments design, provides a powerful demonstration of the necessity of that intervention for student success and greatly increases the reliability of the school psychologist's decision. Because the use of a rigorous experimental design takes time to implement, it should ideally be done before a request for a special education evaluation because of the evaluation timelines that begin once a referral is made.

Variability in data and/or the occasional extreme value can be the result of many factors, as previously discussed. Parallelism should be considered if progress-monitoring data are highly variable. Measurement factors are important to evaluate, such as standardization of conditions, instrumentation, and reliability of scoring. In addition, a possible problem with CBM is nonequivalent probes, which could be evaluated by administering different probes under the same conditions (time, assessor) and calculating differences in scores. Variability may also be due to instructional factors, which include the fidelity of implementation and instructor characteristics (e.g., how fluently the intervention is implemented). Student factors should be examined carefully.
These could include biological variables (e.g., student illness), interfering behavior (e.g., noncompliance, failure to scan/attend), or lack of motivation to comply with instructional demands (e.g., insufficient reinforcement for correct responding). If the sources of variability are identified, then they can be controlled to improve the potential for effective, efficient, and accurate decisions.

Using Decision Rules

There are 40 or more years of recommendations to apply a variety of specific decision rules to progress-monitoring data, which are intended to improve decision accuracy. Student outcomes tend to improve as a function of progress monitoring only if decision rules are explicit and implemented with integrity (Stecker et al., 2005). Two commonly used decision rules are data point rules and trendline rules, and neither is supported by substantial evidence (Ardoin et al., 2013). The accuracy of decisions using data point rules has not been researched, but there has been some research on the trendline rule. The data point rule uses a graphically depicted goal line, which displays the expected growth in performance. If three consecutive points are distributed above and below the goal line, then the intervention is continued, presumably because there is evidence of sufficient progress toward the goal. If three consecutive points are all below the goal line, then the intervention is changed because there is evidence of insufficient progress. Finally, if three consecutive points are all above the goal line, then the goal is increased because there is evidence of an insufficiently ambitious goal.

The trendline rule compares an estimate of trend to the intended goal. Prior to the proliferation of computers, the trendline rule was guided by visual analysis or one of a variety of techniques to estimate trend, such as the split-middle technique. These estimates of trend should be replaced with statistical methods such as regression (i.e., ordinary least-squares regression). A regression line is the line of best fit to progress-monitoring data, such that it minimizes the overall distance between the trendline and the data points (see Figure 29.6). Regression-based estimates of trend are readily available in spreadsheet software and from vendors of many progress-monitoring assessment systems (e.g., Formative Assessment System for Teachers, http://fast.cehd.umn.edu). This method is likely to result in the best estimates of growth and predictions of future performance. However, regression is highly sensitive to extreme values. Visual analysis should be used to identify extreme values, and such values should be judiciously removed from the calculation of the trendline lest they have undue influence on regression-based estimates of trend. That being said, extreme scores may be highly informative if the variables responsible for them are identified. A computational sketch of both rules appears below.
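The following Python sketch illustrates both rules with hypothetical data; the aimline construction, goal values, and function names are invented for the example and are not prescribed by the chapter.

```python
# A minimal sketch combining the three-point data point rule and the
# trendline rule against an aimline. All values are hypothetical.

def aimline_value(week, baseline_median, goal, goal_week):
    """Expected score at a given week on a straight aimline drawn
    from the median baseline point (week 0) to the goal."""
    return baseline_median + (goal - baseline_median) * week / goal_week

def data_point_rule(scores, expected):
    """Compare the three most recent scores to the aimline."""
    recent = list(zip(scores, expected))[-3:]
    if all(s < e for s, e in recent):
        return "change the intervention (insufficient progress)"
    if all(s > e for s, e in recent):
        return "raise the goal (insufficiently ambitious)"
    return "continue the intervention"

def ols_slope(xs, ys):
    """Ordinary least-squares slope (the trendline rule's estimate)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
           sum((x - mx) ** 2 for x in xs)

weeks = [1, 2, 3, 4, 5, 6]
scores = [26, 27, 29, 28, 30, 31]          # hypothetical WRC/min
baseline_median, goal, goal_week = 25, 60, 20
expected = [aimline_value(w, baseline_median, goal, goal_week) for w in weeks]

print(data_point_rule(scores, expected))
aim_slope = (goal - baseline_median) / goal_week
print(f"Trendline slope {ols_slope(weeks, scores):.2f} vs aimline slope {aim_slope:.2f}")
```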
Overall, decision rules provide school psychologists with a general guide to help evaluate student progress toward a predetermined goal. Accurate and valid measures of behavior, parallel conditions, identifying and controlling for sources of variability, and skills in visual analysis allow school psychologists to make more accurate decisions based on student data.

SUMMARY

What are the possible consequences of providing an evidence-based treatment in the absence of monitoring the effects of that treatment? Ideally, the treatment would be effective at ameliorating the problem. Unfortunately, school psychologists do not have at their disposal treatments that are 100% effective. Another possibility is that the treatment is ineffective, resulting in wasted time and resources. Finally, the treatment could further impair performance (i.e., have an iatrogenic effect). Because of the risk of the latter two outcomes, school psychologists are ethically obligated to monitor intervention effects and to terminate or change the intervention when appropriate (see Standard 2.2.2 of the NASP Principles for Professional Ethics, http://www.nasponline.org/standards/2010standards/1_EthicalPrinciples.pdf).

Ongoing progress monitoring is an essential component of data-based decision making within a problem-solving model. School psychologists must first select measures that are reliable, accurate, valid, and sensitive to change. Progress-monitoring data are collected repeatedly over time and plotted graphically to allow for systematic interpretation of results. Level, trend, and variability are three characteristics of progress-monitoring data. Prior to intervention, school psychologists should collect baseline data until sufficient data points permit prediction of the student's future performance without an intervention. When analyzing data across phases of intervention, school psychologists should carefully inspect the overlap of data points between conditions, the immediacy of change from one condition to the next, and the results from replication of the intervention. Reversal designs, multiple-baseline designs, and alternating treatments designs allow systematic replication and control of independent variables. How systematic school psychologists are in their analysis depends on the types of decisions to be made and the potential impact a decision would have on a particular student. A careful analysis of controlling variables related to instruction, consistency of implementation, setting, and other extraneous variables is necessary when evaluating student progress.

REFERENCES

Ardoin, S. P., Christ, T. J., Morena, L. S., Cormier, D. C., & Klingbeil, D. A. (2013). A systematic review and summarization of the recommendations and research surrounding curriculum-based measurement of oral reading fluency (CBM-R) decision rules. Journal of School Psychology, 51, 1-18.

Batsche, G., Elliott, J., Graden, J. L., Grimes, J., Kovaleski, J. F., Prasse, D., … Tilly, W. D., III. (2005). Response to intervention. Alexandria, VA: National Association of State Directors of Special Education.

Christ, T. J. (2006). Short-term estimates of growth using curriculum-based measurement of oral reading fluency: Estimating standard error of the slope to construct confidence intervals. School Psychology Review, 35, 128-133.

Christ, T. J., & Coolong-Chaffin, M. (2007). Interpretations of curriculum-based measurement outcomes: Standard error and confidence intervals. School Psychology Forum, 1(2), 75-86.

Christ, T. J., & Silberglitt, B. (2007). Estimates of the standard error of measurement for curriculum-based measures of oral reading fluency. School Psychology Review, 36, 130-146.
Christ, T. J., Zopluoglu, C., Long, J. D., & Monaghen, B. D. (2012). Curriculum-based measurement of oral reading: Quality of progress monitoring outcomes. Exceptional Children, 78, 356-373.

Christ, T. J., Zopluoglu, C., Monaghen, B. D., & Van Norman, E. R. (2013). Curriculum-based measurement of oral reading: Multi-study evaluation of schedule, duration, and dataset quality on progress monitoring outcomes. Journal of School Psychology, 51, 19-57.

Cooper, J. O., Heron, T. E., & Heward, W. L. (2007). Applied behavior analysis (2nd ed.). Upper Saddle River, NJ: Pearson.

Gast, D. L. (2010). Single subject research methodology in behavioral sciences. New York, NY: Routledge.

Gersten, R. M. (2009). Assisting students struggling with reading: Response to intervention and multitier intervention in the primary grades. Washington, DC: U.S. Department of Education, National Center for Education Evaluation and Regional Assistance.

Graf, S., & Lindsley, O. (2002). Standard Celeration Charting 2002. Youngstown, OH: Graf Implements.

Johnston, J. M., & Pennypacker, H. S. (2009). Strategies and tactics of behavioral research. New York, NY: Routledge.

Kratochwill, T. R., Hitchcock, J., Horner, R. H., Levin, J. R., Odom, S. L., Rindskopf, D. M., & Shadish, W. R. (2010). Single-case designs technical documentation. Washington, DC: Institute of Education Sciences. Retrieved from http://ies.ed.gov/ncee/wwc/pdf/wwc_scd.pdf

Kubina, R. M., Jr., & Yurich, K. K. (2012). The precision teaching book. Lemont, PA: Greatness Achieved.

National Association of School Psychologists. (2010). Model for comprehensive and integrated school psychological services. Bethesda, MD: Author. Retrieved from http://www.nasponline.org/standards/2010standards/2_PracticeModel.pdf

Pennypacker, H. S., Gutierrez, A., & Lindsley, O. R. (2003). Handbook of the Standard Celeration Chart. Cambridge, MA: Cambridge Center for Behavioral Studies.

Riley-Tillman, T. C., & Burns, M. K. (2009). Evaluating educational interventions: Single-case design for measuring response to intervention. New York, NY: Guilford Press.

Stecker, P. M., Fuchs, L. S., & Fuchs, D. (2005). Using curriculum-based measurement to improve student achievement: Review of research. Psychology in the Schools, 42, 795-819.