Session 5: Single Subject Research Methodology

◦ Presentation template posted on wiki
◦ Please look at the APA Style website presented by Purdue’s Online Writing Lab (OWL)
◦ Find and review an experimental, quasi-experimental, or single subject design study on an intervention for people with disabilities. Article review due next class, August 3rd!
◦ Research proposal status: you should be at the point of defining your dependent and independent variables and thinking about a research design, although we haven’t discussed all of the designs yet.

Steps in the Research/Scientific Process
1. Identify socially important issue
2. Review current literature
3. Define conceptual model
4. Define specific hypothesis(es) and research question(s)
5. Define dependent variable(s)/measure
6. Identify independent variable(s)/measures
7. Select appropriate research design
8. Obtain consents
9. Collect data
10. Analyze data
11. Communicate results
  ◦ Written presentation
  ◦ Oral presentation

What is Causal-Comparative Research?
◦ Investigators attempt to determine the cause of differences that already exist between or among groups of individuals.
◦ Describes conditions that already exist (a.k.a. ex post facto).
◦ The group difference variable is either a variable that cannot be manipulated or one that might have been manipulated but, for one reason or another, has not been.
◦ Many studies in medicine and sociology are causal-comparative in nature, as are studies of differences between men and women.
Similarities and Differences Between Causal-Comparative and Experimental Research
Similarities
  ◦ Both require at least one categorical variable
  ◦ Both compare group performances to determine relationships
  ◦ Both compare separate groups of subjects
Differences
  ◦ In experimental research, the independent variable is manipulated
  ◦ Causal-comparative studies are likely to provide much weaker evidence for causation
  ◦ In experimental studies, researchers can assign subjects to treatment groups
  ◦ The researcher has greater flexibility in formulating the structure of the design in experimental research

Similarities and Differences Between Causal-Comparative and Correlational Research
Similarities
  ◦ Both are ex post facto research
  ◦ Both attempt to explain phenomena of interest
  ◦ Both seek to identify variables that are worthy of later exploration through experimental research
  ◦ Neither permits the manipulation of variables
  ◦ Both attempt to explore causation
Differences
  ◦ Causal-comparative studies compare two or more groups of subjects
  ◦ Causal-comparative studies involve at least one categorical variable

Examples of the Basic Causal-Comparative Design

Steps to designing, delivering, and analyzing surveys
Step 1: Determine purpose
Step 2: Identify a sampling plan and mode
Step 3: Design survey instrument
Step 4: Test survey instrument
Step 5: Send out a letter of transmittal
Step 6: Deliver the survey
Step 7: Analyze data from survey

Please get into your research groups for the lecture portion. You will be completing the in-class activity together with your group.

Systematic analysis using individual subjects as their own experimental control.
Main message:
  ◦ Single subject research is an approach to rigorous experimentation that involves small numbers of subjects and repeated observations of subjects over time, and that employs research designs that allow each subject to provide his/her own experimental control.
Within-subject analysis
Fine-grained analysis across time and conditions
An experimental research method focused on defining causal (e.g., functional) relations between independent and dependent variables.

Focus is on individuals as unit of analysis
  ◦ Can treat groups as participants with focus on the group as a single unit
Repeated measures of participants’ behavior (DV) over time
Within-subject comparison to analyze effect
  ◦ Observed change in individual’s behavior from “Baseline” to “Intervention”

Focus on an individual rather than group means
  ◦ Interest is in the behavior of a single individual or in within-subject variability
    A “group” may be treated as an “individual”
  ◦ Group descriptive statistics may not "describe" any actual individual
  ◦ Generalizations from a group to an individual are problematic in many instances
    Predicting the behavior of a specific individual is different from predicting that of a “typical” individual
Many populations of interest are low-incidence populations
  ◦ Practically, large numbers of subjects may not be available
  ◦ Assumptions of normal distribution and homogeneity of variance may not be valid
Can be used in clinical practice contexts
  ◦ Single subject research studies may develop out of and be conducted on a specific problem or need of an individual(s) in a practical context
    Scientist-practitioner model

A “practice” may be considered “evidence-based” when:
  ◦ The practice is operationally defined and implemented with fidelity.
  ◦ The outcomes associated with the practice are operationally defined.
  ◦ The context in which the practice is used is operationally defined.
  ◦ Results from the single subject studies used to assess the practice demonstrate experimental control.
  ◦ The effects are replicated across 5 single subject studies conducted in at least 3 locations, and with at least 20 different participants.
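The replication thresholds in the last criterion are simple counts, so they can be checked mechanically. The sketch below is a hypothetical helper, not something from the slides: the `Study` record, its field names, and `meets_replication_criteria` are assumptions made for illustration. It covers only the counting rule (5 studies, 3 locations, 20 different participants), not the qualitative criteria about operational definitions and fidelity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Study:
    """Hypothetical record of one single subject study (field names are assumptions)."""
    location: str
    participant_ids: frozenset
    experimental_control: bool  # did the study demonstrate experimental control?


def meets_replication_criteria(studies, min_studies=5, min_locations=3,
                               min_participants=20):
    """Check the counting rule: >= 5 studies, >= 3 locations, >= 20 participants.

    Only studies that demonstrated experimental control are counted.
    """
    qualifying = [s for s in studies if s.experimental_control]
    locations = {s.location for s in qualifying}
    # Union of IDs so a participant who appears in two studies is counted once.
    participants = set().union(*(s.participant_ids for s in qualifying))
    return (len(qualifying) >= min_studies
            and len(locations) >= min_locations
            and len(participants) >= min_participants)
```

Storing participant IDs as sets is a design choice that keeps the "different participants" wording honest: the same person enrolled in two studies is not double-counted.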
Dependent variable (DV): the behavior (measure) that you are analyzing
  ◦ You want to produce change (variability) in the dependent variable
  ◦ Studies may have multiple DVs
Independent variable (IV): the variable (event, intervention, condition) that is of experimental interest and that the researcher manipulates in an experimental research design
  ◦ May be discrete or continuous
  ◦ May be a single element or multi-component compound
  ◦ Studies may have multiple IVs

Internal validity
The degree to which observed differences/changes in the dependent variable are a direct result of manipulation of the independent variable, and not some other extraneous variable
Extent to which a functional relation can be documented.
Control of extraneous variables that provide alternative explanations for results.
  ◦ It is okay to try to maximize internal validity, especially in initial documentation of a functional relationship
    Doing this may come with a cost, however

Threats to internal validity
History: everything happening outside of the research study
Maturation
Testing: repeated measurement
Instrumentation
  ◦ With human observers, observer bias and drift
Attrition: loss of participants
Multiple treatment interference
Diffusion of treatment: intervention is inadvertently provided when not intended
Loss of baseline through generalization or spread of effects (across settings, behaviors, or participants)
Instability and/or high variability of behavior
  ◦ Cyclical variability
Statistical regression toward the mean
Selection biases with participants
Inconsistent or inaccurate implementation of the IV (Treatment Drift/Treatment Integrity)

External validity
Defined: The extent to which results can be applied to settings, activities, people, etc. other than those involved in the study.
  ◦ Given that you have found an effect for this intervention with this participant under one set of conditions, will it work with other participants, in other settings, when implemented by other interventionists, and when implemented with minor variations in the basic procedures?
  ◦ What can we generalize from this single study?
  ◦ Importance of systematic and direct replication.

Threats to external validity
Reactive experimental arrangements
Hawthorne effect
Reactive assessment: reactivity to observers
Pretest sensitization
Experimenter bias
Interaction between selection bias and treatment effects, i.e., intervention only works if the "right" participants are selected
  ◦ Specificity of effects

In single subject designs the research question typically examines a causal, or “functional”, relation between the independent and dependent variable. As such, the research question should have three features:
  ◦ Identify the dependent variable(s)
  ◦ Identify the independent variable(s)
  ◦ Proclaim intention to determine if change in the IV is functionally related to change in the DV.

Dependent variable is socially important
Independent variable(s) can be controlled (e.g., manipulated) across time.
Both the dependent and independent variable(s) can be operationally described and measured.
For “experimental” research, the question must ask if change in the DV is caused by (or functionally related to) change in the IV.

Is there a functional relation between development of reading fluency and scores on comprehensive reading assessments?
Will walking in water facilitate development of appropriate gait by individuals with “gait imbalance hypertension”?
Is there a functional relation between use of escape-extinction and reduction of escape-motivated food refusal?
Does Jason act out because he has ADHD?
Dependent Variable (Outcome):
Independent Variable (Intervention):
Research question: “Is there a functional relationship between …… and …… ?”

[Graph exercise, shown on two slides: a graph with phases A, B, A, B. For each graph, examine immediacy of effect, level, variability, trend, and overlap, and state the research question.]

There are 3 demonstrations of an effect at 3 points in time.
  ◦ Effect could be: change in trend or level
  ◦ Also want to see immediacy of effect
Good research has at least 5 data points in each phase to establish a consistent pattern in the data.

Baseline: phase in a design that serves as the reference point or comparator for analysis of change in behavior (effect of IV)
  ◦ Used in withdrawal/reversal and multiple baseline designs; may be included in alternating treatments design (but not needed)
  ◦ Generally the first phase, but not always
    Returned to periodically in withdrawal/reversal designs
  ◦ Provides (should provide) a representative picture of behavior under pre-intervention (typical, status quo) conditions
    Baseline is the “control condition” in within-subject analysis
    May involve some alternative intervention/treatment

Collect repeated measures of a DV under “baseline” conditions
  ◦ Goal is to establish the stability of behavior
    Look at level, trend, and variability of data
  ◦ At minimum, Horner et al. (2005) propose 5 data points in baseline phase (at least for initial phase)
    Fewer points can be defended in some situations, e.g., participant cannot perform the behavior (has not learned it) or ethical considerations
  ◦ Variability in DV requires more data points
  ◦ Can go forward with variability, particularly if intervention effect can be documented despite baseline variability

Use baseline phase to do close observation to reveal potential sources of variability
  ◦ Control variability through elimination or holding constant extraneous variable(s)
  ◦ Consider whether sources of variability should be studied as IVs
  ◦ Be alert to dramatic changes within the phase and identify potential causes
Balance logistical and clinical needs with research goal of stability
  ◦ Recognize potential limitations and threats to internal validity if you have high variability

Trends (increasing or decreasing slope) can be accepted if the trend is in the opposite direction of the anticipated effect of the IV
  ◦ Visual analysis does consider changes in trend across/between phases
Trend in the “expected” change direction is problematic
  ◦ Collect more data points
  ◦ Consider whether intervention is warranted
  ◦ If substantial change in slope is expected, you may go forward with intervention
    Statistical analysis may be used to supplement visual analysis

Move from baseline (BL) to intervention when a pattern of BL responding is established.
  ◦ Can you predict the next data points?
  ◦ Will the current BL pattern allow you to document anticipated intervention effects?
Note:
  ◦ High BL variability requires extending BL.
  ◦ Trend in direction of expected effect requires extending BL.
  ◦ If BL level matches expected IV level, then extend BL.

Define research question and dependent variable.
  ◦ Does BL document a predictable pattern of behavior?
  ◦ Does BL document a pattern that will allow comparison with expected effect when Intervention (IV) is implemented?
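The baseline checks above (level, trend, variability, and a minimum of 5 data points) can be sketched numerically as a supplement to visual analysis. This is a minimal illustration under stated assumptions: `review_baseline`, `ols_slope`, and the `max_sd` and `slope_tolerance` thresholds are hypothetical and not taken from Horner et al. (2005) or the slides. Only the 5-point minimum comes from the source. The check for a baseline level that already matches the expected intervention level is left out.

```python
from statistics import mean, pstdev


def ols_slope(values):
    """Least-squares slope of the data path, with sessions numbered 1..n."""
    sessions = range(1, len(values) + 1)
    x_bar = mean(sessions)
    y_bar = mean(values)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(sessions, values))
    denominator = sum((x - x_bar) ** 2 for x in sessions)
    return numerator / denominator


def review_baseline(values, expected_direction, min_points=5,
                    max_sd=3.0, slope_tolerance=0.5):
    """Summarize a baseline phase and list reasons to extend it.

    expected_direction: +1 if the IV should increase the DV, -1 if decrease.
    max_sd and slope_tolerance are illustrative thresholds, not standards.
    """
    if len(values) < 2:
        raise ValueError("need at least two data points to estimate a trend")
    summary = {
        "level": mean(values),          # average height of the data path
        "trend": ols_slope(values),     # change in the DV per session
        "variability": pstdev(values),  # spread around the level
    }
    reasons = []
    if len(values) < min_points:
        reasons.append("fewer than the minimum number of data points")
    if summary["variability"] > max_sd:
        reasons.append("high variability")
    # A trend only counts against the baseline if it moves toward the expected effect.
    if summary["trend"] * expected_direction > slope_tolerance:
        reasons.append("trend in the direction of the expected effect")
    summary["reasons_to_extend"] = reasons
    summary["ready_for_intervention"] = not reasons
    return summary


if __name__ == "__main__":
    # Problem behavior per session; the intervention is expected to reduce it.
    print(review_baseline([20, 22, 19, 21, 20], expected_direction=-1))
```

A baseline that trends away from the expected effect is not flagged for trend, matching the point that such trends can be accepted. Sensible thresholds depend on the measure and its scale, so the defaults here are placeholders only.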
The “traditional” rule: implement one variable at a time
  ◦ Allows for clearest demonstration of a functional relationship
Package interventions create issues
  ◦ May be able to establish relationship between the package and DV, but not know about effects of specific components
    Component analysis designs address this issue
    Demonstrating interaction effects is also a challenge
Implement based on data collected in baseline (or previous phases), rather than on a predetermined schedule that is independent of the data
Establish effects of IV on one baseline (data path) before implementing IV in another baseline (data path) in a multiple baseline design
Collect and report measures of IV implementation fidelity

Phases should be long enough to establish representativeness of data within the phase
  ◦ Reach stability within the phase (at least 5 points)
  ◦ Some have argued that, for power, the number of data points in a single subject design is comparable to the number of subjects in a group design
Researchers often want to use relatively short phases
  ◦ Because of logistical issues, ethical issues, impatience, costs
  ◦ Be aware of limitations and threats to validity
Phases of very different lengths within a design (particularly ABAB) can create issues for visual analysis and interpretation of effects

Data may be collected in sessions that are daily, multiple within a day, or spaced further apart (e.g., weekly)
Consider timing between sessions and phases
  ◦ Avoid carryover effects by spacing sessions or phases
  ◦ Timing between phases can raise potential threats to internal validity, e.g., running all sessions for a phase within a day, and then all sessions for the next phase on the next day

A multiple baseline design involves three or more AB interventions (series) with phase changes staggered across at least three points in time.
Key Features
  ◦ Series are independent of each other
    People, places, materials, behaviors/skills
  ◦ The same IV is applied in each series
  ◦ Staggered implementation of IV

  ◦ Identify Research Question(s)
  ◦ Assess Baselines for each series
    Do the Baselines document a predictable pattern?
    Do Baselines allow opportunity to document IV effect?
    Are Baselines similar?
  ◦ Horizontal Analysis of Effect (per series)
    Level, trend, variability, overlap, immediacy of effect
  ◦ Vertical Analysis
    DV change in one series is associated with NO change in other series?
    Similar effect (consistent effect) across series?
  ◦ Functional Relationship?
    At least three demonstrations of effect at three points in time

[Figure: multiple baseline graph across three participants (Vivian, Tammy, Dr. Cathy). Phases: BL, then Treatment (Lollipop for R+). Y-axis: Percentage of Correct Responding (0 to 100); x-axis: Sessions (0 to 70).]

Withdrawal/Reversal Designs
Sequential phases of data collection involving the implementation and withdrawal of an independent variable(s)
  ◦ Within each phase, multiple data points are collected to establish a representative pattern of behavior
  ◦ Phase change should occur only after stability of behavior within the phase is established
  ◦ Traditionally, the first phase is Baseline, followed by implementation of the IV (Intervention)
    This is not required, however, as you may begin a study with an intervention phase

Behavior measured as DV is “reversible”
  ◦ Learning will not occur
Limited carryover effects between phases
Ethical concerns
  ◦ A reversal can be done if the DV is not a dangerous behavior, or if you can protect the participant
Staff cooperation
Can compare multiple conditions
  ◦ Comparison of too many conditions makes design cumbersome

[Figure: withdrawal/reversal graph with phases Baseline, FCT, Baseline, FCT. Y-axis: Total SIB per minute; x-axis: Sessions (1 to 35).]

Changing Criterion Design
Within-subject analysis
Independent variable needs to have at least four levels (e.g., criteria)
Document baseline performance with one level of the IV
Change the level of the IV and monitor change in DV
  ◦ Immediacy of change important
  ◦ Absence of trend and variability important
Repeat level (criterion) change in IV two more times.

Examine the graphs below
  ◦ 1. What is the research question?
  ◦ 2. Is there a functional relationship?
  ◦ 3. Does the design document three demonstrations of an “effect” at three different points in time? Where?

[Figures: two graphs titled "Changing Criterion Design". Phases: BL: No Reinf, Reinf < 17, Reinf < 12, Reinf < 5. Y-axis: Occurrences of Problem Behavior (0 to 25); x-axis: Days (1 to 29).]

Alternating Treatment (Multi-Element) Designs employ rapid phase reversals across 2 or more conditions to assess sensitivity of change in the dependent variable to change in condition.

[Figure: alternating treatments graph for Student 1 (Hypothesis: Escape Math Work). Conditions: Control, Escape, Attention; IOA marked. Y-axis: Percent Intervals with Occurrence of Problem Behavior (0% to 100%); x-axis: Sessions (1 to 5). Questions on the graph: 1. Is Esc different than Control? 2. Is Esc different than Attn?]

State the research design you would use for your study, and why.

The selection of measures is PART of building a single subject design.
All single subject designs require measures that allow documentation of:
  ◦ A stable pre-intervention pattern of performance, and
  ◦ A rapid and dramatic change in performance following intervention.
Measures must be reliable/consistent enough to document pre-intervention stability, and sensitive enough to document rapid, dramatic change.

Fundamental Dimensions of Behavior
• Frequency:
  – The number of occurrences of a response within an observation period.
• Duration:
  – The total time taken to perform a response (typically indexed as the mean duration)
• Latency:
  – The time between the presentation of the Sd and the initiation of a response.
• Perseveration:
  – The proportion of the observation period/interval in which responding was occurring. (Total time for all occurrences)
• Rate:
  – The frequency of a response divided by the total time for an interval (typically occurrences per minute or occurrences per second).

Measurement Procedures
• Event recording:
  – Observe number of occurrences within an observation period
• Duration recording:
  – Observe the mean time of responding per occurrence (tempo)
• Interval recording:
  – Observe the proportion of intervals in which the behavior occurs.
    » Whole interval versus partial interval recording.
• Time sampling:
  – Proportion of time-sampled moments in which behavior is occurring.
• Permanent product:
  – Count of products from behavior. Note: No direct observation
• Narrative:
  – Continuous description of behavior in real time

Define a research question
For the Dependent Variable
  ◦ Select a measure
  ◦ Select a measurement process
For the Independent Variable
  ◦ Select a measure
  ◦ Select a measurement process

Building Data Collection Forms
• Paper/Pencil or Computer Entry/PDA
• Key Features
  – Logistical information: date, observer, observed
  – Ease of recording (eyes on context)
  – Key strokes or checks instead of writing words
  – Number of variables recorded simultaneously (3 is plenty)
  – Operational definitions
  – Fit the context and range of observed behavior
  – Instructions on setting up a data session
  – System for summarizing session results

Nifty Observation Form
Date: ________________________
Observer: _____________________
Context: ______________________
Request: Statement from teacher requesting response by target student
Compliance: Initiation of requested response within 5 s of request
Noncompliance: Absence of initiation of requested response within 5 s of request.
Problem behavior: Talking out, aggression, property destruction, disruption.

Request / 10 s Interval:               1   2   3   4   5   6   7   8   9   10
Compliance (+) / Noncompliance (0):    _   _   _   _   _   _   _   _   _   _
Problem Behavior:                      _   _   _   _   _   _   _   _   _   _
Comments/Issues:

In-class Activity #7
• Build a data collection form based on how you plan to measure the data.

Inter-observer Agreement (IOA)
• Proxy for reliability but not really a measure of reliability.
• Poor IOA means poor reliability, but good IOA does not prove good reliability.
• Two practical measures
  • Percent agreement (Total, Occurrence Only)
  • Kappa

Percent Agreement
• Defined: The extent to which two independent observers agree they observed the same events at the same time.
  – Operationalized: Given a group of observation intervals, to what extent do the frequencies or interval recordings co-vary across two independent observers? What percent of the intervals index agreement?
• Calculation
  – (Frequency of observations with agreement / total number of observations) * 100%
  – Frequency observed by Observer 1 / Frequency observed by Observer 2 (correlation)
• Advantages
  • Easy to compute
  • Easy to understand
  • Failure to obtain criterion level is informative.
• Disadvantages
  • Is not a measure of reliability
  • Provides an over-estimate of agreement (especially when <10% or >90% of intervals include occurrence).
• Professional Standards
  – 85% agreement is expected for good IOA
• Occurrence Only vs Total Percent Agreement
  – Occurrence/Nonoccurrence Only is used to assess agreement when <10% or >90% of intervals include occurrence.
  – Calculate (use in denominator) using only those intervals in which either of the observers recorded a response (Occurrence Only), or only those intervals in which either of the observers did not record a response (Nonoccurrence Only).
  – Controls for one source of bias.

Cohen’s Kappa
• Purpose of Kappa is to provide an index of observer agreement that controls for chance agreements.
  – Kappa can range from –1.00 to +1.00
    • .40-.60 = fair agreement
    • .60-.75 = “good” agreement
    • .75+ = generally needed for publication in Tier 1 journals
• Calculation
  – Kappa = (Po - Pc) / (1 - Pc)
    • Where Po = the proportion of observed agreements
    • Where Pc = the proportion of agreements expected by chance.
• Recommendation:
  – Report both percent agreement and Kappa.
  – Use Occurrence/Non-occurrence Only when appropriate

Issues related to single subject research design features
Withdrawal/Reversal Designs
  – Implementing withdrawal/reversal phases & length of phases when DV is problematic
  – End study with participants in the "optimal" phase
  – Adequate baseline length
Multiple Baseline Designs
  – Extended baselines & treatment phases
  – No treatment/intervention "control" baselines
Reaction to measurement or other research procedures
  – Set research session termination guidelines & criteria to protect everyone; terminate sessions when criteria are met
  – Have a plan to protect participants and others, and to bring situations under control if crisis occurs

Issues related to applied research in natural settings
• Minimize negative images and stigma
  • Use unobtrusive measurement (as possible)
  • Appropriate selection of DV measures
    • For example, use latency to problem behavior rather than rate in community settings
  • Dignified procedures
  • Responding to "citizen" questions or comments
• Ensuring cooperation and support of others in natural settings
  • Open communication before and during study
  • Obtain appropriate permissions & consents
  • Be courteous & respectful
  • Allow people in the setting (teachers, families, staff) some voice
  • Include community "others" as research partners/collaborators

Exiting research projects gracefully
Plan for exit
Leave participants in "optimal" phase or state of performance
Provide training and support (i.e., plan, materials, etc.) for natural community members to assume and maintain implementation of intervention
Provide information on results and their implications for natural setting
Provide follow-up if necessary
  – Agree on researcher responsibilities on the front end (before study)
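Returning to the inter-observer agreement slides: the percent agreement variants and the Kappa formula, Kappa = (Po - Pc) / (1 - Pc), can be sketched for interval data from two observers. This is a minimal illustration, not a prescribed procedure. The function names are hypothetical, each interval is assumed to be scored as occurrence (1) or nonoccurrence (0), and Pc is computed from each observer's marginal proportions, which is the standard definition for Cohen's Kappa; the slides do not spell it out.

```python
def _paired(obs1, obs2):
    """Pair up the two observers' interval records as booleans."""
    if len(obs1) != len(obs2) or len(obs1) == 0:
        raise ValueError("observers must score the same, non-empty set of intervals")
    return [(bool(a), bool(b)) for a, b in zip(obs1, obs2)]


def total_agreement(obs1, obs2):
    """(Intervals with agreement / total intervals) * 100."""
    pairs = _paired(obs1, obs2)
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def occurrence_only_agreement(obs1, obs2):
    """Agreement over intervals in which either observer recorded a response."""
    scored = [(a, b) for a, b in _paired(obs1, obs2) if a or b]
    if not scored:
        return None  # no occurrences recorded by either observer
    return 100.0 * sum((a and b) for a, b in scored) / len(scored)


def nonoccurrence_only_agreement(obs1, obs2):
    """Agreement over intervals in which either observer recorded no response."""
    scored = [(a, b) for a, b in _paired(obs1, obs2) if (not a) or (not b)]
    if not scored:
        return None  # both observers recorded a response in every interval
    return 100.0 * sum(((not a) and (not b)) for a, b in scored) / len(scored)


def cohens_kappa(obs1, obs2):
    """Kappa = (Po - Pc) / (1 - Pc), with Pc taken from the observers' marginals."""
    pairs = _paired(obs1, obs2)
    n = len(pairs)
    po = sum(a == b for a, b in pairs) / n
    p1 = sum(a for a, _ in pairs) / n  # proportion of intervals Observer 1 scored
    p2 = sum(b for _, b in pairs) / n  # proportion of intervals Observer 2 scored
    pc = p1 * p2 + (1 - p1) * (1 - p2)
    if pc == 1:
        return None  # Kappa is undefined when chance agreement is perfect
    return (po - pc) / (1 - pc)


if __name__ == "__main__":
    observer_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
    observer_2 = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0]
    print(total_agreement(observer_1, observer_2))
    print(occurrence_only_agreement(observer_1, observer_2))
    print(nonoccurrence_only_agreement(observer_1, observer_2))
    print(cohens_kappa(observer_1, observer_2))
```

For the made-up records in the usage example, total agreement is 80% while occurrence-only agreement is 60% and Kappa is about 0.58. This shows how total percent agreement can overstate agreement compared with the stricter indices.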