Running head: JUDGING OUTCOMES IN PSYCHOSOCIAL INTERVENTIONS

Judging Outcomes in Psychosocial Interventions for Dementia Caregivers: The Problem of Treatment Implementation

Lou Burgio, Ph.D.1, Mary Corcoran, Ph.D., OTR/L2, Kenneth L. Lichstein, Ph.D.3, Linda Nichols, Ph.D.4, Sara Czaja, Ph.D.5, Dolores Gallagher-Thompson, Ph.D.6, Michelle Bourgeois, Ph.D.7, Alan Stevens, Ph.D.1, Marcia Ory, Ph.D.8 for the REACH Investigators

1 University of Alabama
2 George Washington University
3 University of Memphis
4 VA Medical Center, Memphis, TN
5 University of Miami
6 VA Medical Center, Palo Alto, CA
7 Florida State University
8 Behavioral and Social Science Program, National Institute on Aging, National Institutes of Health, Washington, D.C.

Abstract

Caring at home for a family member with dementia is associated with numerous and well-documented emotional and physical health issues. Driven by a predicted increase in the incidence of dementia in the U.S., efforts to develop interventions that ease the burden of dementia family caregivers have expanded during the past two decades. However, serious and widespread methodological limitations in published caregiver intervention research threaten the ability to draw inferences to the caregiving population at large. Principal among these limitations is the lack of strategies to measure treatment implementation, a significant problem threatening internal validity (drawing causal inferences) and external validity (generalization of findings). The purpose of this article is to discuss the importance of inducing and assessing treatment implementation in caregiving trials, and to propose Lichstein’s Treatment Implementation (TI) Model as a guide in this endeavor. The efforts of a large cooperative research study of caregiving interventions, REACH, will be used to illustrate induction and measurement of the three components of TI: delivery, receipt, and enactment.
It is estimated that 4 million Americans suffer from Alzheimer's disease and related disorders (ADRD) with 80% of these individuals living at home and cared for by family caregivers (Czaja, Eisdorfer, & Schulz, 2000). The number of people with ADRD is expected to grow exponentially as the U.S. population ages (Schulz & O'Brien, 1994). The tasks and burdens associated with family caregiving are numerous and can include managing behavioral disturbances, attending to physical needs, and providing seemingly constant vigilance (Gold, Cohen, Schulman, Zucchero, Andres, & Etezad, 1995; Vitaliano, Russo, Young, Teri & Maiuro, 1991; Wright, Clipp, & George, 1993). These burdens can be overwhelming. Dementia caregiving has been associated with increased levels of depression and anxiety as well as higher use of psychoactive medications, poorer self-reported physical health, compromised immune function, and increased mortality (Light, Niedereke, & Coughlin, 1994; Schulz & Beach, 1999; Schulz, O'Brien, Bookwala, & Fleissner, 1995).

Over the last 20 years, researchers have examined a plethora of psychosocial interventions aimed at alleviating the burdens associated with dementia caregiving. Intervention programs have been quite varied and have included intensive, personalized counseling, supportive group counseling, the provision of knowledge about ADRD through educational programs, specific therapeutic skills training, enhancing problem solving skills, and improving patient behavior management (see reviews by Bourgeois, Schulz, & Burgio, 1996; Kennet, Burgio, & Schulz, 2000; Knight, Lutszky, & Macofsky-Urban, 1993; Toseland & Rossiter, 1989; Zarit & Teri, 1992). The conclusions from these reviews are varied. However, there is a consensus in the literature that interventions which are comprehensive, intensive, and individually tailored to caregivers' needs are likely to be more effective than those lacking these characteristics (Kennet et al., 2000).
All of the existing reviews noted that dementia caregiving intervention research is fraught with methodological weaknesses including sampling and recruitment issues, inadequate outcome measures, and limitations in experimental design. While acknowledging the importance of these issues, the review by Bourgeois and colleagues (1996) focused on the therapeutic process presented in these studies, including the measurement of treatment intensity and implementation.

Much has been written over the last two decades on therapeutic process, particularly surrounding psychotherapy outcome research. Early writings by Cook and Campbell (1979) and Sechrest and colleagues (Sechrest, West, Phillips, Redner, & Yeaton, 1979; Sechrest & Yeaton, 1981) helped focus researchers' attention on critical variables such as the quality of the therapeutic relationship, therapeutic skill, and treatment duration. There have been significant advances over this period of time, although these issues continue to generate much debate (Howard, Moras, Brill, Martinovich, & Lutz, 1996; Kraus & Howard, 1999). Cook and Campbell (1979) were the first authors to discuss how problematic treatment implementation (i.e., control of the independent variable) could be in non-laboratory contexts.

The nature of the independent variable is one of the least understood aspects of the dementia caregiving literature. With few exceptions (e.g., Lovett & Gallagher, 1988; Toseland, Rossiter, & Labrecque, 1989; Bourgeois, Burgio, Schulz, Beach, & Palmer, 1997), the specific content and procedural details of interventions are reported in abbreviated fashion, making it difficult for the reader to understand what was actually done. Consequently, replication of effective interventions is problematic.
Moreover, even if a clear description of the treatment components is provided (i.e., the intended treatment), we cannot assume perfect congruence between the treatment that was intended to be delivered and the actual treatment delivered. For example, treatment may consist of hour-long, complex psychoeducational workshop sessions. If the therapists are hampered by inadequate training, some therapy components may be omitted and others presented poorly. Sechrest and colleagues have argued that without an assessment showing that the treatment was presented as intended (treatment integrity), conclusions regarding treatment efficacy cannot be made with any confidence (Sechrest et al., 1979; Sechrest & Yeaton, 1981).

Over twenty years have passed since Cook and Campbell, and Sechrest and colleagues, introduced a focus on how the treatment is delivered. Prior to that, clinical outcome research simply assumed, often unjustifiably, that the intended treatment was presented. In the past two decades, researchers have come to recognize that treatment integrity is critically important but insufficient for asserting that a fair test of the treatment was conducted. The participant's mastery of treatment (termed receipt) and the participant's application of treatment beyond the boundaries of the therapy session (enactment) are no less critical. Thus, the path of the independent variable can be partitioned into three components, delivery, receipt, and enactment, and their summative impact on the client may be termed treatment implementation (Lichstein, Riedel, & Grieve, 1994).

When delivery faults prevail, the client may be learning and enacting only part of the treatment or the wrong treatment. Proper delivery of the treatment does not guarantee it was learned to criterion. In the case of receipt faults, the client is again learning and enacting part of the treatment or the wrong treatment.
Even when there is proper delivery and adequate receipt, satisfactory enactment is not assured. When there is insufficient enactment, therapy exposure is often limited to the customary one hour a week, and the remainder of the week is spared the influence of therapy. This is particularly critical in therapies whose efficacy relies on the application of therapy skills throughout the week.

It can be seen that satisfactory implementation of each of the three components is independent of the others. Knowledge of satisfactory implementation of any one component does not inform us of the level of implementation of the others. Similarly, faults in any one component will drain the strength of the treatment. Given faulty treatment implementation, therapy outcomes will usually suffer, within-group variability can be expected to inflate, diminishing statistical power, and predictions of the effectiveness of the application of this treatment to other samples will become increasingly unreliable. Both internal validity (drawing causal inferences) and external validity (estimates of generalization of findings to other samples) will suffer.

The purpose of this paper is to discuss the importance of assessing and reporting treatment implementation in dementia caregiver intervention research. Although we will focus on caregiver interventions, these issues have relevance to all geriatric intervention research (e.g., comprehensive geriatric assessment, psychosocial nursing home interventions). We will propose Lichstein's Treatment Implementation Model (Lichstein et al., 1994) as one solution for conducting fair tests of intervention efficacy, and we will describe the efforts of the REACH cooperative group to apply this model in ongoing caregiver intervention trials.
Treatment Implementation

Treatment implementation (TI) strategies are used to facilitate and monitor activities between two actors, the interventionist and the study participant, so that the action of an intervention can be understood and, if desirable, modified. Typically, the term treatment implementation refers to a class of process measures that document the implementation of individual treatment components. Lichstein differentiates between the induction and assessment of TI components. Induction refers to the methods researchers use to enhance the probability that proper treatment implementation occurs. Assessment refers to either quantitative or qualitative measurement of their occurrence. This distinction is important because clinical researchers can induce several TI strategies, but may only assess a portion of those strategies (Lichstein et al., 1994).

TI measures can be classified according to three fundamental aspects of intervention application: delivery, receipt, and enactment (Lichstein et al., 1994). Treatment delivery targets the actions of the interventionist, and specifically his/her ability to present the intervention to the client as intended. Assessment procedures focus on the therapeutic skills of the therapist and the therapist's ability to engage the client in the treatment protocols. For example, if the intervention involves cognitive behavior therapy (CBT), were the critical components of CBT (e.g., recognition, appreciation of the role, and ability to self-manage negative thoughts) satisfactorily presented by the therapist? A related concept, treatment fidelity (Moncher & Prinz, 1991), asks whether the treatment was presented as intended and whether it is adequately differentiated from other treatments. Thus delivery is concerned with including all intended parts of the treatment but also excluding inadvertent introduction of parts of other treatments.
Treatment receipt refers to the degree to which the client actually received the scheduled treatment, as indicated by mastery of concepts and/or skill development. Actions of the client are reviewed in the assessment of treatment receipt. Treatment receipt is assessed by documenting the client's knowledge or skill level. Continuing the CBT example, caregivers would need to demonstrate acquisition of the key concepts of CBT by completing a CBT Knowledge Test and by demonstrating skills in recognizing and refuting negativistic thinking.

Treatment enactment targets the degree to which the client demonstrates changes in therapeutic behaviors related to intervention in the natural environment, i.e., does the client use appropriately, in their daily lives, the skills and knowledge that define the particular intervention? For example, use of CBT skills would be demonstrated by completion of homework assignments and reductions in negative thoughts as recorded in a Negative Thoughts Diary.

TI assessment. Unlike data obtained from study outcome measures, information from TI assessments is collected continuously and is inspected and interpreted as part of the ongoing intervention. Direct measures of intervention components yield more reliable judgements than indirect measures, but are generally more difficult to obtain. Direct measures are more common in delivery and receipt assessments, while enactment (or adherence) assessment often relies more strongly on indirect assessment.

Delivery assessment focuses on the skills of the interventionist, and his/her ability to deliver the intervention as intended, without additions, and within the amount of time allotted for the intervention. Direct measures of delivery include the frequency, format and content of all interactions between interventionist and client.
One standard methodology for directly assessing delivery is to specify the components of the intervention that are intended, and plausible confounding components that should not occur, and to rate the intervention based on the occurrence of each. Delivery assessment should be obtained through an independent rating of the intervention session, using tapes or an observer.

Treatment receipt is often assessed by documenting the client's knowledge or skill level, frequently through the use of pen-and-paper surveys or questionnaires. However, for individuals who have low literacy, some direct methods of assessing intervention receipt, such as written tests of knowledge or understanding, may be burdensome. Other direct measures of receipt that are less obtrusive than written measures include role playing or asking the client to recall intervention suggestions. Two "soft" indirect measures of treatment receipt, often used as the sole measures of receipt, are to confirm that the client has the intervention materials in his/her possession, and to ask if he/she has any questions about the intervention. Direct and indirect measures of treatment receipt can be scored by the interventionist or an independent rater, using tapes or observation.

Assessment of enactment is more difficult but critical to establishing the internal validity of an intervention. Written measures of client changes in behavior can provide a direct assessment of enactment; however, as in receipt assessment, these may prove difficult for clients with low literacy. Indirect assessment of enactment is more common and can include questioning the client regarding the use of intervention techniques. Enactment assessment is completed more commonly by the interventionist but can be scored by an independent observer.

TI assessment provides information concerning the process of treatment as the study unfolds.
It provides critical information to the investigator regarding the current state of the intervention protocol, and creates the opportunity for corrective action.

TI induction. Formal and informal induction methods for delivery, receipt and enactment can be instituted on an ongoing basis to help ensure appropriate implementation of the intervention. One formal mechanism for assuring appropriate and accurate delivery is a detailed protocol for delivery of the interventions. Interventionist training can combine formal and informal mechanisms, such as role playing, lectures, and discussions of the disease process and each of the interventions. While delivery induction is aimed at improving interventionist skills, receipt and enactment induction focus on encouraging the client's adherence to the intervention. Informal receipt and enactment induction methods include instructions and reminders by the interventionist. Formal methods include written materials related to the intervention, as well as any role-playing with feedback. The nature and presentation of the interventions may enhance comprehension. Interventions that are personally and directly relevant to problems the participant is experiencing, presented in a concrete and useful manner, are more likely to be remembered (Sorrentino, Bobocel, Gitta, Olson, & Hewitt, 1988).

While strategies to induce and assess treatment delivery, receipt and enactment must be customized to the specific intervention, the procedures used in the six REACH intervention projects can illustrate several approaches to TI assessment. However, before we describe specific TI strategies, we will present a brief general description of the REACH cooperative effort. A more detailed description can be found in Coon, Schulz, and Ory (1999).
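Before turning to REACH, the checklist approach to delivery assessment described earlier (rating a session on intended components that should occur and plausible confounding components that should not) can be made concrete with a small scoring sketch. This is only an illustration under stated assumptions: the component names, the `SessionRating` structure, and the two summary proportions are hypothetical and are not taken from any REACH instrument.

```python
from dataclasses import dataclass


@dataclass
class SessionRating:
    """One independent rater's checklist for a single taped session (hypothetical layout)."""
    intended_present: dict[str, bool]    # intended component -> was it delivered?
    proscribed_present: dict[str, bool]  # confounding component -> did it occur?


def proportion_true(flags: dict[str, bool]) -> float:
    """Share of checklist items marked True; 0.0 for an empty checklist."""
    return sum(flags.values()) / len(flags) if flags else 0.0


def delivery_scores(rating: SessionRating) -> tuple[float, float]:
    """Return (adherence, contamination), each a proportion between 0 and 1."""
    return (
        proportion_true(rating.intended_present),
        proportion_true(rating.proscribed_present),
    )


# Illustrative CBT session: two of three intended components were delivered,
# and no component belonging to another treatment was introduced.
rating = SessionRating(
    intended_present={
        "recognition of negative thoughts": True,
        "role of negative thoughts": True,
        "self-management of negative thoughts": False,
    },
    proscribed_present={
        "environmental modification advice": False,
        "technology system training": False,
    },
)
adherence, contamination = delivery_scores(rating)
print(f"adherence={adherence:.2f}, contamination={contamination:.2f}")
```

Keeping the two proportions separate mirrors the distinction drawn above: a session can include every intended component and still fail on differentiation if elements of another treatment were introduced.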
Resources for Enhancing Alzheimer's Caregiver Health (REACH)

Resources for Enhancing Alzheimer's Caregiver Health (REACH) is a unique, five-year program sponsored by the National Institute on Aging (NIA) and the National Institute of Nursing Research (NINR) at the National Institutes of Health (NIH). REACH grew out of an NIH initiative that acknowledged the well-documented burdens associated with dementia caregiving, as well as the emergence of promising dementia caregiver interventions in the literature. These two areas of research provided the foundation for a systematic test of well-specified and theory-based intervention approaches. In 1995, NIH funded six intervention sites and a coordinating center that focus on family caregivers of individuals with ADRD at the moderate level of impairment (see appendix).

This national research program includes the following general types of interventions based on theory-driven models of care and its effectiveness: 1) Individual Information and Support strategies, 2) Group Support and Family Systems efforts, 3) Psychoeducational and Skill-based Training approaches, 4) Home-based Environmental interventions, and 5) Enhanced Technology Systems. In some sites, combinations of these general types are being explored. Because the caregiving experience in ethnic minority families has been particularly neglected in the field, site proposals with substantial minority composition were given special consideration. The six REACH intervention sites funded by NIH yield a multi-site collaborative effort utilizing interdisciplinary groups of professionals to deliver a variety of interventions culturally tailored to meet the needs of a range of ethnic majority and minority populations.
The study goals shared by all the REACH sites include: 1) the specification of all intervention components and an examination of the effectiveness of a variety of psychosocial, behavioral and technological interventions to strengthen family members' capacities to care for individuals with ADRD, 2) the development of standardized outcome measures to assess the impact of comparable strategies on caregivers and their care recipients, and 3) the creation of a common database to help compare the effectiveness of these different interventions across the range of identified populations. Finally, given the lack of well-described and well-controlled studies of this nature, REACH is designed to examine the feasibility and outcomes of different intervention approaches rather than to provide definitive information on the one best intervention strategy for enhancing dementia caregiving.

REACH Treatment Implementation Strategies

Strategies to Induce and Assess Treatment Delivery

Accurate and consistent delivery of the intended treatment is critical to intervention effectiveness as well as to the interpretation of both significant and null findings. While few will disagree with this fundamental statement, mechanisms to ensure the accurate and consistent delivery of a treatment protocol can be difficult to achieve, and are often omitted by investigators. Common induction and assessment strategies used in the REACH projects were designed to combat potential threats to treatment delivery (see Table 1).

There are multiple threats to consistency of treatment delivery, particularly in large, complex intervention trials. For example, all of the REACH interventions include multiple treatment components. Ensuring consistent, accurate application of a single component intervention (e.g., imparting knowledge about ADRD) would be simpler than an intervention involving knowledge plus behavioral skills training.
Similarly, the complexity of treatment delivery assessment multiplies as a function of the number of interventions being compared within a trial. Length of intervention is also a factor. Some of the REACH sites extend their intervention phase over 12 months or longer. It is not uncommon for interventionists to "drift" from intervention protocol when lengthy intervention phases are used (Moncher & Prinz, 1991). Moreover, although the use of multiple interventionists is advisable to control for "extra-therapeutic" factors (e.g., therapist personality), therapist attrition during multi-year therapy trials presents complications for ensuring consistent treatment delivery throughout the trial.

Strategies to induce (i.e., enhance) and assess treatment delivery included 1) using treatment manuals, 2) interventionist training and certification, and 3) monitoring and feedback of performance. Each is further discussed below.

Treatment Manuals (induction only). To guarantee a consistent level of accuracy in treatment delivery, all REACH interventions have been manualized¹. Each of these manuals was examined by the Coordinating Center for consistency across sites in format and level of detail. These extensive manuals describe all aspects of treatment delivery and assessment. Manuals are used as training tools and to maintain accurate delivery over time. Interventionists are given an intervention manual that provides a detailed account of each treatment component and a step-by-step timeline to ensure timely delivery of all intervention activities. In many of the sites, therapists carried delivery checklists into each session to remind them of what needed to be done. Manuals are also a convenient and accurate source of information about special circumstances that may occur during intervention.

Training and Certification. Methodical interventionist training is a critical step to ensure accurate delivery of treatment.
Supplemental to the team meetings, the project coordinators and interventionists are trained to deliver all intervention, comparison, and/or control protocols. A procedure for training interventionists was developed by the investigators at each site. Training consisted of independent readings, didactic instruction, and hands-on demonstration to enable interventionists to implement treatment with AD caregivers. Training was followed by an evaluation procedure, supervised by the Coordinating Center, that certified the individual to serve as an interventionist. Interventionists were provided with certificates indicating that they had acquired the skills necessary for delivering the intervention.

Ongoing Monitoring and Feedback. Although certification procedures confirm that interventionists have a standardized level of expertise prior to delivering interventions, ongoing monitoring of the interventionists’ performance during the trial is critical. Thus, all REACH sites conduct periodic assessments of the interventionists’ performance. This is accomplished largely by audiotaping interactions with participants, either at every therapeutic session or on a random basis. These audiotaped interactions are coded by an individual at the site who is knowledgeable about the intervention to ensure accurate and consistent compliance with protocol. Coding of each interventionist’s performance is guided by a TI checklist on which the coder rates the interventionist’s performance according to previously identified key treatment components. The completed checklist is then used to provide feedback to the interventionist.

In addition, some REACH sites incorporated the practice of 1- to 2-hour weekly group supervision to allow very careful and consistent monitoring and feedback of their interventionists, right from the inception of the project.
This was done particularly at sites where multiple interventionists were used for each condition, with high turnover over time, to ensure that treatments were delivered in a consistent manner by all involved throughout the life of the project. Turnover of interventionists occurred at one site in particular (Palo Alto, CA), which used therapists in training: conducting interventions in the REACH program was incorporated into the supervised training programs of psychology pre-doctoral interns and postdoctoral fellows. In general, their commitment ranged from 6 months to one year, thus necessitating frequent training of new personnel. The staff had varied levels of training and experience at the start of their involvement each year. Weekly group supervision (led by the site PI) provided this staff with an opportunity to discuss how to handle particularly problematic cases within the boundaries of the specific intervention protocol. It also helped to ensure therapeutic consistency across the various personnel involved.

As an example, three of the sites used a similar "Minimal Support Condition" (MSC). In this condition, therapists contacted caregivers by telephone to offer limited social support; however, only very general therapeutic information was provided. In the MSC, the handling of serious issues (such as possible elder abuse) often required a delicate balance between the constraints of the protocol itself and the ethical mandate to respond appropriately to the problem under discussion. In group supervision all current and prospective interventionists would discuss possible ways to handle that situation, and a plan would be agreed upon for follow-up, depending on the nature of the problem. Attendance was required for all staff, including those functioning in outlying areas, who were connected by phone to the face-to-face weekly group meetings.
Despite the distance, this proved to be a very successful way to involve all interventionists and to keep them abreast of each other's work, so that leaders did not "drift" from the intent of each protocol in use at the site. Although the intense phase of REACH intervention delivery is now complete, monthly supervision groups are still being held as REACH sites complete the booster or follow-up meetings specified in treatment protocols. These are open to new trainees as well, so that they can observe and learn from the experience of other interventionists. In this way the project continues to train clinical researchers of the future.

Strategies to Induce and Assess Treatment Receipt

Even if treatment was delivered by the interventionist in exemplary fashion, the investigator should not assume that the client has received the intervention as intended. Numerous threats to treatment receipt are present in caregiver intervention research. For example, (1) there may be problems in communication due to the interventionist's use of jargon or due to cultural differences in communication style between the interventionist and client (Gallagher-Thompson, Arean, Menendez, Takagi, Haley, Arguelles, Rubert, Loewenstein, & Szapocznik, 2000), (2) burdened caregivers may be distracted and inattentive in training sessions, and (3) older adult caregivers may have diminished hearing, vision, or memory abilities that may hamper their learning.

The REACH cooperative group induced and assessed treatment receipt through various methods including maintaining a record of intervention contacts, assessing caregiver knowledge, documenting intervention sessions, and eliciting caregiver feedback. Each is discussed in more detail below.

Record of Contacts and System Utilization. Across all six REACH sites, a standard form is used to document several pieces of information related to contact with clients.
This information includes the number of contacts, duration, and method (e.g., telephone, face-to-face, group, access to computerized information system). The form is completed by the staff member involved in the contact; this individual also documents whether the contact was scheduled or unscheduled, who initiated the contact, who was involved in the contact (e.g., other family members, other professionals, the care recipient), and if the contact was "off-protocol". This information is entered into the REACH core database immediately by data entry staff. A record of contacts allows investigators to analyze outcomes based on type, number, and duration of contacts, and gives immediate feedback about the degree to which clients are receiving the intervention.

Assessing Caregiver Knowledge of Key Treatment Concepts and Skills. Although the measurement of treatment receipt requires the recording of contacts with the client, it is critically important to assess changes in the client's knowledge of the key concepts and skills involved in the targeted intervention. Some of the sites that include didactic instruction in group format use a formal pre-post knowledge test. Most of the sites achieve this goal by audiotaping intervention contacts. Audiotaped intervention contacts are scored for caregiver understanding of fundamental knowledge of therapeutic behavior using a standard form.

Interventionist Documentation. A rich source of information about treatment receipt in REACH is the interventionists’ field notes and documentation. Both qualitative and quantitative methods are used by the REACH sites and include interventionists’ Likert-type ratings of intervention compliance, written summaries of dementia management strategies developed during therapy sessions, and documentation of behavioral or environmental problems addressed during treatment.
One REACH site conducts bi-monthly "debriefing" meetings between an investigator and caregiver that are audiotaped and analyzed for indications of treatment receipt using formal qualitative methods (Weiss, 1994). Another site has developed a rating scale to document the degree to which a collaborative therapeutic relationship has developed between the caregiver and interventionist. The scale, which is completed by the interventionist, includes a list of caregiver therapeutic behaviors that suggest treatment receipt, such as “To what extent did this caregiver modify suggestions to fit his/her preferences or needs?” Although not always quantitative, interventionist documentation provides important information about the treatment process that enhances understanding of change mechanisms.

Feedback from the Caregiver. To collect information about caregiver perspectives on the intervention, all REACH sites use a standard 17-item survey. The questions ask the participants to rate their experiences in several topical areas, including education about dementia, caregiver skill building, and perceived benefit of the intervention. In addition, each site has included up to ten site-specific questions to gain information about topics that are unique to their intervention. For instance, two of the sites use technology to deliver the treatment; these sites ask caregivers to rate their experiences of learning to use the system and frequency of problems with the system. The survey is administered to caregivers after the 12-month assessment. Use of this standard form provides direct information about the caregivers’ level of treatment receipt, and provides a mechanism for comparing receipt across sites.
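As an illustration of how the record of contacts described earlier in this section might be summarized for a single caregiver, the sketch below tallies number, duration, and method of contacts and flags off-protocol contacts. It is a minimal sketch under stated assumptions: the field names, the `Contact` structure, and the example records are hypothetical and do not reproduce the actual REACH form or database.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Contact:
    """One row of a hypothetical contact log (field names are assumptions)."""
    caregiver_id: str
    minutes: int
    method: str          # e.g. "telephone", "face-to-face", "group", "computer"
    scheduled: bool
    initiated_by: str    # e.g. "interventionist" or "caregiver"
    off_protocol: bool


def summarize(contacts: list[Contact]) -> dict:
    """Summarize contacts by number, on-protocol duration, and method."""
    on_protocol = [c for c in contacts if not c.off_protocol]
    return {
        "n_contacts": len(contacts),
        "n_off_protocol": len(contacts) - len(on_protocol),
        "on_protocol_minutes": sum(c.minutes for c in on_protocol),
        "by_method": dict(Counter(c.method for c in contacts)),
    }


# Invented example records for one caregiver.
log = [
    Contact("CG-001", 60, "face-to-face", True, "interventionist", False),
    Contact("CG-001", 15, "telephone", False, "caregiver", False),
    Contact("CG-001", 10, "telephone", False, "caregiver", True),
]
summary = summarize(log)
print(summary)
```

Separating on-protocol minutes from the raw contact count is one way such a record could support analyses of outcomes by type, number, and duration of contacts while keeping off-protocol contacts visible.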
Strategies to Induce and Assess Treatment Enactment

Strategies for inducing and assessing treatment enactment are designed to assess the level at which caregivers actually use the knowledge and skills acquired in treatment and apply these new skills to situations outside of the therapy session. Enactment suggests that mechanisms of change are in operation. It is preferable to assess enactment by collecting data from various perspectives. Specifically, combining direct observations with reports from both the caregivers and interventionists is most advantageous.

Direct Observation of the Caregiver. Observation of caregivers is the most direct and reliable means of determining the level of treatment enactment. However, for some interventions, such as home-based treatment, this method might also be costly for the researchers and burdensome for the caregiver. At the Alabama REACH site, one of the goals of intervention is to improve caregiver communication skills. Consequently, all caregiver/care recipient dyads are asked to participate in structured, staged social activities. Each dyadic interaction is videotaped for a total of 90 minutes: 30 minutes during baseline, followed by four 15-minute observations completed over the first ten weeks of the intervention. Project staff then complete a detailed assessment of these videotaped social interactions. Observing the caregiver's behaviors during these staged social activities provides an opportunity to examine the effects of skill training on the caregiver's verbal and nonverbal social behaviors (i.e., communication skills).

Group and office-based interventions provide greater opportunities for observation of caregivers, and several REACH sites use this method. Intervention utilization is documented on a checklist or rating scale in these instances.
Other methods include several types of pen-and-paper recording, including recording of system utilization, placing orders for adaptive equipment in environmental-based interventions, and use of behavioral logs reporting how the caregiver responded to a behavioral disturbance. In the example of ordering adaptive equipment, caregivers consult with the interventionist to decide which adaptive equipment will address their management issues. The purchase of the adaptive equipment is then recorded on a log sheet. Information from these methods for assessing enactment is augmented by interventionist and caregiver reports, as described below.

Caregiver Self-Report. Investigators can use caregiver self-reports to collect data regarding which aspects of the intervention are enacted and the frequency with which they are used. Self-reports can also provide valuable information regarding the barriers to enactment and the length of time strategies were effective (this is particularly important in the case of a progressive condition such as dementia). Consequently, the REACH investigators used several types of self-reports. These included caregivers’ reports concerning frequency of strategy use, ability to generalize skills and knowledge to newly emerging situations, number of weeks that a strategy was in use, and evaluations of strategy effectiveness. Self-report provided an opportunity for caregivers to reflect on the benefits or consequences of the intervention, to make suggestions for changes, and to comment on the usefulness of specific components of the treatment. Self-report is particularly informative when analyzed in conjunction with direct observations.

Interventionist Documentation. Assessing the interventionists’ perspectives on enactment may be as simple as obtaining information similar to clients' self-reported use of interventions, or as complex as a thematic analysis of field notes.
REACH sites used various strategies, including the interventionists’ records of caregiver compliance, ratings of intervention effectiveness, and progress notes documenting use of treatment strategies. At several sites, interventionists are also given an opportunity to comment on the degree to which caregivers use knowledge and skills to address newly emerging caregiving issues.

Cultural Diversity and Language Issues

As noted earlier in this paper, the samples of the REACH project were deliberately selected to reflect the ethnic diversity available in the populations at each site. For example, four of the six sites heavily recruited African American caregivers (Birmingham/Tuscaloosa, Boston, Memphis, Philadelphia) and two focused on Hispanic/Latino caregivers (Miami, Palo Alto). At all sites, diversity issues were often complex and needed special attention. For example, recruitment and retention of caregivers tended to be more challenging when working with the non-Anglo groups. However, once adaptations for cultural diversity had been incorporated into the treatment intervention protocols, all of them were implemented in essentially the same way across the various ethnic groups at each site. Two publications from the REACH team discuss these issues in more detail (Gallagher-Thompson, Haley, Guy, Rubert, Arguelles, Tennstedt, & Ory, 1999; Gallagher-Thompson, Menendez, Takagi, Haley, Arguelles, Rubert, Loewenstein, & Szapocznik, 2000). Here we wish to make the point that at some sites (i.e., those serving the Hispanic/Latino populations), language issues, notably language preference and translation, added to the complexity of delivering the REACH interventions and maintaining quality control.
We raise this issue, and describe our experiences in handling it, to alert other researchers to some important considerations, since more intervention programs are likely to be designed for diverse cultural and ethnic groups in the future, including some that will be delivered in languages other than English. Language preference and translation issues go hand in hand: interventions cannot be delivered in caregivers' language of choice unless suitable translations of the material are available, along with bilingual (and preferably bicultural) staff to offer the programs. In REACH we had the added challenge of translating not only the treatment manuals and protocols themselves, but also all evaluation forms and scoring/coding instructions for all of the core REACH outcome measures. Since two of the six sites were working with Spanish-speaking caregivers (in Miami, Cuban Americans were the majority of Hispanics seen; in Palo Alto, Mexican Americans were the majority), efforts had to be made to coordinate translation processes across these two sites.

The multiple translation processes involved were lengthy, complex, expensive, and often frustrating for the investigators, for a variety of reasons. First, accepted practices of forward and back translation needed to be implemented to arrive at a consensus regarding the meaning and intent of the various questions, questionnaires, rating scales, and treatment protocols. This was very time consuming and costly, since a professional translation company had to be used first for the forward translations, to get them into "generic Spanish" that would provide a culturally appropriate starting point. Then panels of bilingual and bicultural Hispanics representing different Hispanic subgroups were convened to do the back translations. Discrepancies in meaning had to be conferenced until consensus was achieved.
Some use of idioms (which vary regionally in their meaning) was permitted to facilitate accurate comprehension, but this was kept to a minimum. Overall, the process took about a year to accomplish. Second, once these translations had been accomplished effectively, bilingual (and, in most instances, bicultural) interventionists had to be selected and trained to use the protocols, manuals, and forms for treatment delivery, receipt, and enactment. These were relatively unfamiliar concepts for most of the staff at these two sites: even those with a significant background in the social sciences were unaccustomed to following very detailed protocols when interacting with people in distress. Significantly more training and supervision were needed with the Hispanic interventionists compared to their Anglo counterparts to ensure that the protocols were being followed throughout the project. Third, many culturally sensitive issues arose over the course of the REACH project which these interventionists had to handle. For example, depression was very common among the caregivers enrolled at both Spanish-speaking sites; at times it was present to a clinically significant degree, yet management of significant clinical depression was beyond the scope of the REACH protocols. Therefore, these interventionists had to be trained to locate appropriate referral sources, make the actual referral, and follow up to encourage the caregiver to accept the referral. In summary, providing standardized, manual-driven interventions in languages other than English poses challenges in treatment delivery, specifically in the adequacy of translation, the effective training and supervision of staff, and the handling of off-protocol topics and situations.
Despite the challenges involved, however, this type of work will become more common in the future, particularly in areas of the U.S. with large immigrant populations, many of whom do not speak English well enough to communicate their distress or to benefit from help that is not delivered in their native language.

Summary and Conclusions

Dementia caregiving research has progressed from investigations of factors contributing to the stress and burden of caregivers to the development of interventions to alleviate these burdens. Generally, reviews of the efficacy of these interventions have been equivocal, although comprehensive, intensive, and individually tailored interventions appear to be more efficacious than those lacking these components. Given the increasing prevalence of dementia, it is imperative to develop interventions that enhance the quality of life for both caregivers and care recipients. In an attempt to address this issue, the REACH cooperative group is investigating the efficacy of various interventions that have potential for providing support and relief to family caregivers. A particular strength of the REACH program is the set of strategies adopted at the intervention sites to induce and assess treatment implementation. To date, few caregiver intervention studies have addressed the issue of treatment implementation, making it difficult to interpret findings and draw conclusions regarding treatment efficacy. In fact, one of the most significant problems with comparative psychosocial clinical intervention studies with all populations is the failure of researchers to document the degree to which the independent variables (i.e., treatment components) are related to outcome (Kazdin, 1994). The induction and assessment of treatment implementation should be an essential component of any psychosocial intervention study.
Findings suggesting that there are no significant differences among intervention approaches on a chosen set of outcomes may be due to inherent problems with treatment implementation. Large variations in the implementation of a treatment may create within-group variability that obscures group differences. Similarly, an unintended diffusion of treatments because of lack of adherence to a treatment protocol may create an overlap in treatment conditions which outweighs treatment differences. Implementation of a treatment can only be ascertained by a plan for careful and continuous monitoring. It also requires that treatment protocols be well defined so that standards for judging departures from treatment can be applied. It should be possible to estimate the degree to which a planned intervention was actually carried out in the field. Decisions also need to be made regarding what constitutes a "faithful rendering" of a treatment or, conversely, what departures fall within an acceptable range (e.g., attendance at 75% of treatment sessions). Furthermore, there must be provisions to correct deviations from those standards during the intervention period (Sechrest et al., 1979).

The proper induction and assessment of treatment implementation can be difficult, and requires careful planning on the part of research investigators. This paper summarizes the efforts by the REACH cooperative group to induce and assess treatment implementation. The approaches taken by the sites vary with the different intervention protocols and include using treatment manuals, training and certification of the interventionists, and continuous monitoring of actual implementation via audiotaping or videotaping of interactions with clients. The difficulties in implementing these strategies are also discussed. A unique feature of the REACH approach is the use of Lichstein's Treatment Implementation Model across sites.
Thus, all of the sites are capturing three fundamental aspects of treatment implementation: delivery, receipt, and enactment (Lichstein et al., 1994). Few intervention studies have reported systematic methods for capturing all three of these components. A strength of this approach is that it provides strategies not only for inducing TI but also for assessing the precise degree to which actual treatment components are implemented. Furthermore, the interventionists are continually given feedback during the intervention period so that departures from treatment protocol can be corrected. The availability of treatment implementation data will ultimately allow us to have greater insight into findings regarding the efficacy of the various interventions. It will also allow us to assess the impact of variations in TI on targeted outcome measures. For example, we will be able to examine the differential impact of intended vs. actual delivery characteristics of treatments (e.g., number of sessions) on outcomes. This type of information will ultimately provide insight into guidelines for required treatment strength.

The importance of treatment implementation assessment cannot be overstated. Funders, researchers, and study participants pay high costs in time and money to develop and test interventions. Without an accurate assessment of whether the intervention was delivered as intended, received by the client, and enacted by the client, any conclusions regarding outcomes are suspect. Interventions that hold great promise may be discarded, or the intervention that was described may not be the intervention that made the difference in clients' lives. Particularly in multi-site trials with multiple populations, it is critical to know that the same intervention can be delivered, received, and enacted successfully. TI assessment can help determine correct dosage, ensuring that cost-benefit ratios are as favorable as possible.
As the health care and social service systems struggle to find ways to help dementia caregivers and care recipients, the induction and assessment of treatment implementation help ensure that our investments are well spent. When TI assessments confirm adequate delivery, receipt, and enactment, the experimenter can assert with confidence that (1) it is known what treatment was tested, (2) treatment impact penetrated the participants, and (3) treatment exposure extended to the natural environment of the participants. Under these conditions, the independent variable received a fair test, and positive or negative outcomes are rightly associated with the treatment of interest. Conclusions of both causal influence of the treatment and expected generalization of the treatment to other samples are reinforced. When assessments find faults in any one of the TI components or when TI assessments are omitted, the validity of the clinical trial is threatened or indeterminate, and the confidence that can be placed in conclusions about efficacy, causal influence, and generalization is compromised.
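A standard for acceptable departures from a treatment protocol, of the kind discussed above, can be made operational in a few lines of code. The Python sketch below is only an illustration under stated assumptions: the 75% attendance threshold is the example given earlier in this section, while the record fields, function names, and sample values are hypothetical and are not drawn from REACH protocols or data.

```python
from dataclasses import dataclass
from typing import List

# Illustrative standard only: the text gives attendance at 75% of sessions
# as an example of an acceptable range, not as a REACH requirement.
ACCEPTABLE_ATTENDANCE = 0.75


@dataclass
class DeliveryRecord:
    """Hypothetical per-caregiver record of planned vs. attended sessions."""
    caregiver_id: str
    sessions_planned: int
    sessions_attended: int

    @property
    def attendance_rate(self) -> float:
        if self.sessions_planned <= 0:
            raise ValueError("sessions_planned must be positive")
        return self.sessions_attended / self.sessions_planned


def within_acceptable_range(record: DeliveryRecord,
                            threshold: float = ACCEPTABLE_ATTENDANCE) -> bool:
    """True when attendance meets or exceeds the chosen standard."""
    return record.attendance_rate >= threshold


def flag_departures(records: List[DeliveryRecord],
                    threshold: float = ACCEPTABLE_ATTENDANCE) -> List[str]:
    """Return IDs of caregivers whose attendance falls below the standard."""
    return [r.caregiver_id for r in records
            if not within_acceptable_range(r, threshold)]


# Made-up example data for three caregivers, each with 12 planned sessions.
records = [
    DeliveryRecord("cg01", sessions_planned=12, sessions_attended=12),
    DeliveryRecord("cg02", sessions_planned=12, sessions_attended=9),
    DeliveryRecord("cg03", sessions_planned=12, sessions_attended=6),
]

if __name__ == "__main__":
    print(flag_departures(records))  # prints ['cg03']
```

Running such a check during the intervention period, rather than only at the end of the trial, would fit the requirement that deviations from the standard be corrected while treatment is still under way.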
TABLE 1: OUTLINE OF REACH TREATMENT IMPLEMENTATION STRATEGIES
Treatment Component: Treatment Delivery; Treatment Receipt; Treatment Enactment
Methods: Treatment Manuals; Training and Certification; Ongoing Monitoring and Feedback; Record of Contacts and System Utilization; Assessing Caregiver Knowledge of Key Treatment Concepts and Skills; Interventionist Documentation; Feedback from the Caregiver; Direct Observation of the Caregiver; Interventionist Documentation; Caregiver Self-Reports

Author's Note

1 Copies of treatment manuals and training protocols for all sites can be obtained by writing to the REACH Coordinating Center (Richard Schulz, Director), UCSUR, University of Pittsburgh, 121 University Place, Pittsburgh, PA 15260.

The Resources for Enhancing Alzheimer's Caregiver Health (REACH) project is supported by cooperative agreements NR04261, AG13255, AG13313, AG13297, AG13289, AG13265, and AG13305.

Address reprint requests to: Lou Burgio, Ph.D., Applied Gerontology Program, University of Alabama, Box 870315, Tuscaloosa, AL 35487-0315.

References

Bourgeois, M. S., Burgio, L. D., Schulz, R., Beach, S., & Palmer, B. (1997). Modifying repetitive verbalizations of community-dwelling patients with AD. The Gerontologist, 37, 30-39.
Bourgeois, M. S., Schulz, R., & Burgio, L. (1996). Intervention for caregivers of patients with Alzheimer's disease: A review and analysis of content, process, and outcomes. International Journal of Aging and Human Development, 43, 35-92.
Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Chicago: Rand-McNally.
Coon, D. W., Schulz, R., & Ory, M. G. (1999). Innovative intervention approaches for Alzheimer's disease caregivers. In D. Beigel & A. Blum (Eds.), Innovations in practice and service delivery across the lifespan (pp. 295-325). New York: Oxford.
Czaja, S. J., Eisdorfer, C., & Schulz, R. (2000). Future directions in caregiving: Implications for intervention research. In R. Schulz (Ed.), Handbook of dementia caregiving intervention research. New York: Springer Publishing Company.
Gallagher-Thompson, D., Arean, P., Menendez, A., Takagi, K., Haley, W., Arguelles, T., Rubert, M., Loewenstein, D., & Szapocznik, J. (2000). Development and implementation of intervention strategies for culturally diverse caregiving populations. In R. Schulz (Ed.), Handbook of dementia caregiving interventions. New York: Springer Publishing Company.
Gallagher-Thompson, D., Haley, W., Guy, D., Rubert, M., Arguelles, T., Tennstedt, S., & Ory, M. (1999). Tailoring psychological interventions for ethnically diverse dementia caregivers. Manuscript under editorial review.
Gold, D. P., Cohen, C., Shulman, K., Zucchero, C., Andres, D., & Etezad, J. (1995). Caregiving and dementia: Predicting negative and positive outcomes for caregivers. International Journal of Aging and Human Development, 41, 183-201.
Howard, K. I., Moras, K., Brill, P. L., Martinovich, Z., & Lutz, W. (1996). Evaluation of psychotherapy: Efficacy, effectiveness, and patient progress. American Psychologist, 51(10), 1059-1064.
Kazdin, A. E. (1994). Methodology, design, and evaluation in psychotherapy research. In A. E. Bergin & S. L. Garfield (Eds.), Handbook of psychotherapy and behavior change (4th ed., pp. 19-71). New York: John Wiley and Sons.
Kennet, J., Burgio, L., & Schulz, R. (2000). Interventions for in-home caregivers: A review of research 1990 to present. In R. Schulz (Ed.), Handbook of dementia caregiving intervention research. New York: Springer Publishing Company.
Knight, B. G., Lutzky, S. M., & Macofsky-Urban, F. (1993). A meta-analytic review of interventions for caregiver distress: Recommendations for future research. The Gerontologist, 33, 240-248.
Krause, M. S., & Howard, K. I. (1999). Between-group psychotherapy outcome research and basic science revisited. Journal of Clinical Psychology, 55(2), 159-169.
Lichstein, K. L., Riedel, B. W., & Grieve, R. (1994). Fair tests of clinical trials: A treatment implementation model. Advances in Behavior Research and Therapy, 16, 1-29.
Lovett, S., & Gallagher, D. (1988). Psychoeducational interventions for family caregivers: Preliminary efficacy data. Behavior Therapy, 19, 321-330.
Moncher, F. J., & Prinz, R. J. (1991). Treatment fidelity in outcome studies. Clinical Psychology Review, 11, 247-266.
Schulz, R., & Beach, S. (1999). Caregiving as a risk factor for mortality: The caregiver health effects study. Journal of the American Medical Association, 282(23), 2215-2219.
Schulz, R., & O'Brien, A. T. (1994). Alzheimer's disease caregiving: An overview. Seminars in Speech and Language, 15, 185-193.
Schulz, R., O'Brien, A. T., Bookwala, J., & Fleissner, K. (1995). Psychiatric and physical morbidity effects of dementia caregiving: Prevalence, correlates, and causes. The Gerontologist, 35, 771-791.
Schulz, R., Visintainer, P., & Williamson, G. M. (1990). Psychiatric and physical morbidity effects of caregiving. Journal of Gerontology: Psychological Sciences, 45, P181-P191.
Sechrest, L., West, S. G., Phillips, M. A., Redner, R., & Yeaton, W. (1979). Some neglected problems in evaluation research: Strength and integrity of treatments. In L. Sechrest, S. G. West, M. A. Phillips, R. Redner, & W. Yeaton (Eds.), Evaluation studies review annual (Vol. 4, pp. 15-35). Beverly Hills, CA: Sage.
Sechrest, L., & Yeaton, W. E. (1981). Assessing the effectiveness of social programs: Methodological and conceptual issues. In S. Ball (Ed.), New directions for program evaluation: Assessing and interpreting outcomes (pp. 41-56). San Francisco: Jossey-Bass.
Sorrentino, R., Bobocel, D., Gitta, M., Olson, J., & Hewitt, E. (1988). Uncertainty orientation and persuasion: Individual differences in the effects of personal relevance on social judgements. Journal of Personality and Social Psychology, 55, 357-371.
Toseland, R. W., & Rossiter, C. M. (1989). Group interventions to support family caregivers: A review and analysis. The Gerontologist, 29, 438-448.
Toseland, R. W., Rossiter, C. M., & Labrecque, M. D. (1989). The effectiveness of peer-led and professionally led groups to support family caregivers. The Gerontologist, 29, 465-471.
Vitaliano, P. P., Russo, J., Young, H. M., Teri, L., & Maiuro, R. D. (1991). Predictors of burden in spouse caregivers of individuals with Alzheimer's disease. Psychology and Aging, 6, 392-402.
Weiss, R. S. (1994). Learning from strangers: The art and method of qualitative interview studies. New York: The Free Press.
Wright, L., Clipp, E., & George, L. (1993). Health consequences of caregiver stress. Medicine, Exercise, Nutrition, and Health, 2, 181-195.
Zarit, S., & Teri, L. (1992). Interventions and services for family caregivers. In K. W. Schaie & M. Powell Lawton (Eds.), Annual review of gerontology and geriatrics (Vol. 11, pp. 287-310). New York: Springer.

Appendix: REACH Research Group-Participating Institutions and Principal Staff

University of Alabama (Tuscaloosa and Birmingham, Alabama): Louis Burgio, Ph.D. (Principal Investigator - Tuscaloosa), Alan Stevens, Ph.D. (Principal Investigator - Birmingham), Alfred Bartolucci, Ph.D., Delois Guy, Ph.D., William Haley, Ph.D., David Roth, Ph.D., David Vance, M.A.
Hebrew Rehabilitation Center for Aged Research and Training Institute (Boston, Massachusetts): Diane Mahoney, Ph.D. (Principal Investigator), Robert Friedman, M.D., Brooke Harrow, Ph.D., Timothy Heeren, Ph.D. (former participant), Ting Lin, Ph.D., Barbara Tarlow, Ph.D., Sharon Tennstedt, Ph.D., Ladislav Volicer, M.D., Ph.D.
University of Tennessee, Memphis (Memphis, Tennessee): Robert Burns, M.D. (Principal Investigator), Marshall Graney, Ph.D., Kenneth Lichstein, Ph.D., Jennifer Martindale-Adams, Ed.D., Linda Nichols, Ph.D., Grant Somes, Ph.D.
University of Miami (Miami, Florida): Carl Eisdorfer, M.D., Ph.D. (Principal Investigator), Soledad Arguelles, Ph.D., Trinidad Arguelles, M.S., Sara Czaja, Ph.D., David Loewenstein, Ph.D., Mark Rubert, Ph.D., Jose Szapocznik, Ph.D.
Veterans Affairs Medical Center (Palo Alto, California): Dolores Gallagher-Thompson, Ph.D. (Principal Investigator), David Coon, Ph.D., Helena Kraemer, Ph.D., Ana Menendez, M.S., Larry Thompson, Ph.D.
Thomas Jefferson University (Philadelphia, Pennsylvania): Laura N. Gitlin, Ph.D. (Principal Investigator), Mary Corcoran, Ph.D., Susan Klein, Ph.D., Sue Marcus, Ph.D., Laraine Winter, Ph.D.
Project Office, National Institutes of Health: Marcia Ory, Ph.D., Mary Leveck, Ph.D.
Coordinating Center, University of Pittsburgh (Pittsburgh, Pennsylvania): Richard Schulz, Ph.D. (Principal Investigator), Steven H. Belle, Ph.D., Joy Herrington, M.Ed., Jason Newsom, Ph.D. (former participant), Galen Switzer, Ph.D. (former participant), Stephen R. Wisniewski, Ph.D.
External Advisory Committee: Patricia Archbold, DNSC, Oregon Health Sciences University; University of California, Santa Barbara; Joel Greenhouse, Ph.D., Carnegie Mellon University; J. Neil Henderson, Ph.D., University of South Florida; Ira Katz, M.D., Ph.D., University of Pennsylvania; Powell Lawton, Ph.D., Philadelphia Geriatric Center; Len Pearlin, Ph.D., University of Maryland; May Wykle, Ph.D., Case Western Reserve University.