What Evaluating Evidence Based Training Curricula Can Teach Us about Evaluating Child Welfare Training

Anita P. Barbee, MSSW, Ph.D.
Becky Antle, MSSW, Ph.D.
Kent School of Social Work, University of Louisville
NHSTES in Pittsburgh, PA
May 22, 2013

Louisville Child Welfare Training Evaluation Model (Antle, Barbee & van Zyl, 2008)

Zooming in on the Training Cycle
• We have mostly focused on individual and organizational predictor variables affecting training outcomes of satisfaction, learning, transfer of learning, and organizational and client outcomes.
• We continue to measure all of those aspects of training, but when we had the opportunity to conduct efficacy trials to test evidence informed and evidence based training interventions, we began to focus more on the training cycle.

Which Slides Contain Background Information and Which Are the Focus
• The next slides in your handout are background information about two grants that allowed us to test the efficacy of several EITs and EBTs (Within My Reach, Love Notes and Reducing the Risk) (slides 5-11).
• We will focus our talk on how to measure fidelity to the training curriculum, trainer competence, and two training variables that we are finding impact outcomes (trainer alliance and group cohesion), and will give you all copies of our measures (slide 12).
• Finally, slides 13-22 give findings from one analysis. These are for future reference; we can summarize them quickly.
• We will end our time leading a discussion about implications of that analysis (slides 23-30) and how we can transfer these findings to child welfare training evaluation (slide 31).

First Grant that Evaluated Efficacy of Evidence Based or Evidence Informed Curricula
• We received $2.5M in funding from the Office of Family Assistance (OFA) for the grant Relationship Education Across Louisville (REAL) from 2006-2011, which evaluated the efficacy of Within My Reach in lowering Interpersonal Violence (IPV) in a population of 900 low income adults served by Neighborhood
Places in Louisville, KY, as well as the efficacy of Love Notes on 450 youth who had dropped out of high school in Louisville, KY (Antle was PI, Barbee was Evaluator).

Second Grant
• We received $4.8M in funding from the Office of Adolescent Health (OAH) for a teen pregnancy prevention research study we call CHAMPS (Creating Healthy Adolescents Through Meaningful Prevention Services) from 2010-2015. It is a three arm randomized controlled trial and longitudinal study comparing the effects of three training interventions. 1300 urban, refugee and foster youth ages 14-19 are being randomly assigned to either Love Notes, Reducing the Risk or The Power of We (which serves as a control condition) and followed at the 3, 6, 12 and 24 month points. These trainings are delivered in 20 community based organizations serving high risk youth in Louisville, KY (Barbee is PI, Antle is Co-PI).

Training Interventions: Within My Reach
• Within My Reach (Pearson, Stanley, & Kline, 2005) is a 16-hour healthy relationships program, a relationship education intervention designed for low-income populations that teaches communication and conflict resolution skills, relationship decision making strategies, and relationship safety/violence prevention content.
• The WMR program has many modules that promote group interaction via discussions, activities, and processing of topics as they relate to the participants’ lives. The WMR program promotes group interaction and has several group based activities wherein group members share parts of their lives. For more information about WMR see Antle et al. (2011) and Pearson et al. (2005). This program was provided in four 4-hour sessions (typically conducted weekly) at neighborhood-based social service sites.
• WMR was developed to reach a broader audience (individuals who may or may not be in a romantic relationship) than couple-focused relationship education programs.
Specifically, the WMR curriculum was designed for individuals (vs. couples) in order to be a primary prevention method (e.g., assist individuals in making sound relationship choices, regardless of relationship status) as well as to safeguard participants when sensitive issues are discussed, in particular intimate partner violence (where including both partners in the session may be contraindicated).

Love Notes
• Love Notes was developed to educate participants about healthy relationships, including issues of decision-making, communication and conflict resolution, and overall safety, including the prevention of pregnancy and sexually transmitted disease (Pearson, 2009).
• This strategy may provide more long-term prevention of interpersonal violence, as participants are engaged in a thorough self-assessment of relationship values, needs, and models of safety that will assist them with future decision-making and commitment in intimate relationships. Love Notes is a derivative of the Prevention and Relationship Enhancement Program (PREP; Stanley, Markman, & Jenkins, 2009), which is a relationship and marriage education program listed as an evidence-based practice (EBP) by SAMHSA (www.samhsa.gov).
• This curriculum builds on social exchange theory and meets the needs of youth who are alienated and in need of loving personal relationships. Thus, it is an excellent counterpart to the pregnancy prevention curriculum and allows a test of a theoretical model of what combination of information is best in preventing high risk behaviors, pregnancy and transmission of STIs.
• Research by Antle et al. (2011) found that participants in Love Notes experience significant gains in relationship knowledge, communication and conflict resolution skills, relationship self-efficacy and attitudes toward violence.

Reducing the Risk
• Reducing the Risk: Building Skills to Prevent Pregnancy, STD and HIV (RtR) was developed by Richard Barth, MSW, Ph.D. in California.
The training manual is in its 5th Edition and was last published in 2011. This curriculum is one of the first that was evaluated using an experimental design with a longitudinal follow up (6 months and 18 months) and tested on a large group of high school students (N = 758). It is also one of the first programs to show an impact on beliefs about the prevalence of adolescent sexual behavior and on actual behavior, as well as increasing parent-child communication about abstinence and contraception (Kirby, Barth, Leland, & Fetro, 1991). For those who were virgins at the pre-test, the curriculum significantly reduced the onset of intercourse at 18 months, and those who did have sex were more likely than controls to use contraceptives. These effects held for members of several ethnic groups (Caucasian, African American, Hispanic and Asian), both genders, and for lower and higher risk youth. For females and lower-risk groups who had initiated intercourse before the pre-test and curriculum delivery, contraceptive use increased after the training, and significantly more so than for controls (Kirby et al., 1991). The youth in the comparison group did receive a traditional sexuality education intervention; thus, for RtR to significantly improve outcomes for participants above and beyond another intervention means that it is particularly effective.

RtR continued
• Another study that tested the effectiveness of RtR was conducted in Arkansas with rural and urban youth (Hubbard, Giese, & Raney, 1998). This study found that RtR delayed the initiation of sex among youth who were virgins at the pre-test and increased condom use among youth who did initiate intercourse after the training.
• A third study evaluated the impact of RtR in Kentucky and Ohio (Zimmerman et al., 2008). It found that RtR significantly delayed the initiation of sex, but condom and contraception use was not increased.

The Power of We (POW)
• The control group participants receive training in community organizing and community building that is delivered by trainers from the Network Center for Community Change (NC3), a nonprofit organization in Louisville.
• This ensures that participants are receiving some service and filling the same number of hours (approximately 13) interacting in a training environment, just on a different and unrelated topic.

Measurement Issues When Using Evidence Based or Evidence Informed Curricula
• Because of the importance of testing the efficacy of evidence based curricula on outcomes, we placed special emphasis in our evaluation on measuring and ensuring fidelity to the curricula, using a special tool and objective evaluators who observed each training session and rated fidelity to the curriculum.
• For CHAMPS we have also included a quantitative measure of facilitator engagement, which each facilitator completes about their own performance and their partner’s performance, and which the objective evaluator completes about each facilitator’s performance.
• For a portion of the REAL grant and all of the CHAMPS grant we have included measures of facilitator engagement and group cohesion along with our usual measures of participant satisfaction.

One Example of Effects of Fidelity, Engagement and Group Cohesion on Outcomes in WMR Training
• A sample of 126 participants
– 98 women
– 21 men
– 7 unknown gender
• Average age was 33.33 years (s.d. = 12 years)
• Average number of children was 2
• Median education level was high school or GED
• Racial background of participants
– 66.1% African American
– 27.8% Euro American
– 6% Hispanic, Asian American or other

Measure of Fidelity
• Adherence to the program was monitored each session by trained raters.
• Adherence to the WMR training manual was generally high (i.e., 94% of the proposed material, including discussions, activities, and content, was covered).
• We can argue that those training participants who attended all 4 training days (92% of participants) received a full dose of the intervention, given attendance and fidelity to the curriculum by facilitators.

Effects of Adherence on Outcomes
• We conducted one analysis on 559 participants in WMR and found a significant interaction between adherence, participant reactions to the training, and relational outcomes (communication skills and couple relationship quality).
• For participants who reported very positive reactions to the training, there was a negative effect of strict adherence on relational outcomes.
• Moderate adherence to the curriculum allows for more discussion and interactions with the trainer and other participants, which is likely more desirable when participants have a very positive view of the facilitator and the training.
• In contrast, for participants who reported less favorable reactions to the training, there was a positive effect of strict adherence on relational outcomes.
• This suggests that in a situation where a trainer was viewed as less competent or engaging, it was better for participants when he/she followed the curriculum content very precisely. This finding goes beyond Barber et al. (2006), who found no effect of adherence on outcomes when clients are engaged: it suggests a negative effect of adherence on outcomes when participants are otherwise engaged or satisfied with services.

How We Analyzed the Impact of Adherence on Outcomes
• Multilevel modeling was utilized to test the hypotheses (time nested within participants, nested within groups). At level 1, we included the repeated measures for the relational outcomes (i.e., the dependent variables, DAS and CPQ at pre, post, and 6-month). The Time variable was coded -2 = pre, -1 = post, and 0 = 6-month follow-up; thus, the intercept values reflect participants’ scores at 6-month follow-up.
At level 2, or the participant level, we included sex and race/ethnicity (both uncentered) as control variables and participants’ reactions to the program at post (grand-mean centered). At level 3, or the group level, we included ratings of adherence (grand-mean centered for both the linear and quadratic effects). We created two models, one for each outcome variable (i.e., DAS and CPQ). We examined whether participants’ outcomes would be associated with adherence (both linear and quadratic effects) and participants’ reactions to the program, after controlling for sex (Men = 1, Women = 0) and race (White = 1, Racial/Ethnic Minority = 0). We also included the cross-level interaction between participants’ reactions and adherence on the intercept.

Facilitator Working Alliance with Participants
• Working Alliance Inventory-Short Form (WAI-S; Tracey & Kokotovic, 1989). The WAI-S is a client rated measure of working alliance that consists of 12 items that assess goals and tasks for therapy as well as the relational bond between the client and therapist. These items were rated on a seven-point scale ranging from 1 (Strongly Disagree) to 7 (Strongly Agree), with higher scores indicating a better working alliance.
• The WAI-S is a commonly used measure of working alliance, and its reliability and validity have been demonstrated in numerous studies comparing the WAI-S to other working alliance scales and therapy outcomes (see Horvath et al., 2011 for a review).
• The language was adjusted to reflect the premarital education context, consistent with Owen et al. (2011). Some couples took premarital education with two leaders; working alliance scores for the two leaders were highly correlated (r = .78), so we used an average of the scores in our analyses. This measure was completed at post intervention and the Cronbach alpha was .85.

Group Cohesion of Participants
• Group Climate Questionnaire (GCQ; MacKenzie, 1983).
The GCQ has 12 items that are rated on a 7-point Likert scale, ranging from 0 (not at all) to 6 (extremely). The GCQ assesses group members’ perceptions of the group environment. The GCQ has three empirically based factors: (a) Engagement, or the degree of self-disclosure, involvement, and investment among group members; (b) Avoidance, or the lack of addressing key issues between members or avoidance of responsibility to change; and (c) Conflict, which captures the overt struggles and covert sense of distrust among group members (MacKenzie, 1983).
• Support for the internal consistency of the GCQ has been shown in previous studies (e.g., alpha = .88 to .94; Kivlighan & Goldfine, 1991; MacKenzie, 1983). Construct validity of the GCQ has been demonstrated by its relationship with therapy outcomes, therapist ratings of the group process, and other client rated group process measures (e.g., Kivlighan & Goldfine, 1991; Quirk, Miller, Duncan, & Owen, in press).
• In the current study, the Cronbach alphas for the three subscales were .77 (Engagement, five items), .84 (Conflict, four items), and .56 (Avoidance, three items). Similar to the WAI, this measure was only completed at post intervention.

Outcome Measure: Relationship Confidence
• Relationship Confidence. We used the 5-item Confidence Scale (Stanley, Hoyer, & Trathen, 1994) to measure couples’ confidence that they can effectively manage their relationship and stay together. Relationship confidence is a key outcome measure for relationship prevention programs (e.g., Stanley et al., 2001).
• Items were rated on a seven-point scale ranging from 1 (Strongly Disagree) to 7 (Strongly Agree), with higher scores indicating more confidence in the relationship.
Confidence Scale scores have been shown to be related to other relationship and individual characteristics and outcomes in meaningful ways (see Rhoades, Stanley, & Markman, 2009; Whitton et al., 2007) and have been utilized in relationship education programs (e.g., Owen et al., 2011). Cronbach alphas were .70 and .75 at pre- and post-assessment, respectively.

Outcome Measure: Communication
• Relationship Dynamics Scale (RDS; Stanley & Markman, 1997). We utilized the 7-item version of the RDS to assess the frequency of negative communication, including escalation, invalidation, and withdrawal, which are typically associated with negative relationship outcomes. Respondents rate each item on a scale from 1 (Almost never) to 3 (Frequently), with higher scores indicating more frequent negative communication. In a variety of samples, the measure has demonstrated adequate reliability and validity (e.g., Kline et al., 2004; Markman et al., 2010). The Cronbach alphas were .82 at pre-assessment and .85 at post-assessment.

Outcome Measure: Relationship Quality
• Dyadic Adjustment Scale-7 (DAS; Spanier, 1976). The quality of intimate relationships was measured using the seven-item version of the DAS. Relationship quality is a hallmark outcome in REPs and is considered a potential outcome for individuals who want to strengthen their relationship. The DAS-7 assesses the quality of the couple relationship, including facets such as satisfaction, cohesion, consensus, affection and expression. The DAS is widely utilized in clinical and research settings. Previous studies have supported the DAS’s internal consistency (e.g., alphas ranged from .76 to .96) as well as validity (e.g., discriminate sensitivity between distressed and non-distressed individuals, and correlations with other relationship measures such as commitment and communication quality; see Sabourin et al., 2005). Cronbach alphas were .86 and .86 at pre- and post-assessment, respectively.
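
The Cronbach alphas reported for the measures above summarize internal consistency: how closely a scale's items vary together. As a minimal sketch only, assuming item-level responses are available as one row of scores per respondent, alpha can be computed as k/(k-1) times one minus the ratio of summed item variances to the variance of total scores. The function name and the sample responses are hypothetical illustrations, not the study's data or software.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a scale.

    rows: one list of item scores per respondent (hypothetical layout),
    every row having the same number of items.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Hypothetical 1-7 ratings from five respondents on a three-item scale.
    responses = [
        [6, 5, 6],
        [4, 4, 5],
        [7, 6, 7],
        [3, 4, 3],
        [5, 5, 6],
    ]
    print(round(cronbach_alpha(responses), 2))
```

Alpha depends on the number of items as well as on how strongly they correlate, which is one reason a three-item subscale can show a lower value than a longer scale.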

Results
• Multilevel modeling was utilized to examine the associations of participants’ alliance and group cohesion with relationship outcomes at post intervention.
• Multilevel models are advantageous for many reasons; most notably, they account for the interdependencies among observations that can occur when multiple individuals are included in the same group. Thus, individuals (level 1) were nested within groups (level 2).
• Note there were 10 groups (average 18.6 individuals per group, range 5 to 21). The small number of groups inhibited our ability to test group differences (Maas & Hox, 2005); however, conducting MLM can still correct for the interdependencies for individuals who attended the same group.
• Statistically, a small number of groups (level 2 units) can produce biased random effect estimates, yet the fixed effects (e.g., the association between alliance and DAS) typically demonstrate less bias (Maas & Hox, 2005).
• We created two multilevel models with relationship variables at post (i.e., DAS and RDS) as the outcome variables, respectively. The predictor variables included pre-functioning for the relationship variables, working alliance, and the three subscales for group cohesion (all grand-mean centered).
• Note we tested whether gender and/or race/ethnicity would affect the results when included as control variables. The covariates were not statistically significant and were subsequently dropped from the final models. Multilevel models were conducted using Hierarchical Linear Modeling Version 6 (HLM6) (Raudenbush, Bryk, Cheong, & Congdon, 2005).

Means and Standard Deviations for Key Variables
Notes. DAS = Dyadic Adjustment Scale, RDS = Relationship Dynamics Scale, WAI = Working Alliance Inventory, GCQ-S = Group Climate Questionnaire-Short Form. The changes from pre to post on the DAS, RDS, and Confidence were statistically significant (ps < .05). Ns ranged from 123 to 126.
Cohen’s d benchmarks: small-sized effect = 0.20, medium-sized effect = 0.50, large-sized effect = 0.80.

Measure          Pre M (SD)      Post M (SD)     Effect Size (Cohen’s d)
DAS              20.65 (7.00)    21.84 (7.07)    0.22
RDS              2.00 (0.52)     1.82 (0.55)     0.42
Confidence       4.78 (1.27)     5.22 (1.26)     0.38
WAI              --              5.68 (0.88)     --
GCQ-S Engage     --              4.41 (1.10)     --
GCQ-S Conflict   --              1.08 (1.35)     --
GCQ-S Avoid      --              3.56 (1.30)     --

Results
• Working Alliance-Relationship Outcomes. Our first three hypotheses related to the association between working alliance and relationship outcomes. None of the associations between alliance and relationship outcomes were statistically significant.
• Group Cohesion-Relationship Outcomes. Group Engagement was significantly associated with less negative communication with partners (RDS; d = -0.16), suggesting that individuals who felt more connected with other group members reported improvements in their communication quality with their romantic partner.
• Less group avoidance was associated with better relationship adjustment at post (DAS; d = -0.24).
• Group Effects for Group Cohesion. Given that we only have 10 groups, we were limited in testing group effects. However, a series of exploratory analyses were conducted. First, we estimated the shared variance (ICC) in group engagement, conflict, and avoidance scores among participants who were in the same group. The ICCs for Engagement, Conflict, and Avoidance were .12, .05, and .17, respectively. This suggests that groups accounted for 5% to 17% of the variance in the group cohesion subscale scores.
• In context, therapists typically account for 5% to 9% of the variance in clients’ alliance scores (Baldwin & Imel, in press).

Implications for Preliminary Results
• This is the first known study to examine group cohesion in the context of Relationship Education Trainings and is consistent with the general findings within the psychotherapy literature (cf. Burlingame et al., 2011), suggesting that group interventions may have common change mechanisms.
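
The ICCs reported above express the share of variance in cohesion scores that lies between groups rather than between individuals within a group. The sketch below illustrates the idea with the classical one-way ANOVA estimator, which is a stand-in chosen for simplicity and not the variance-component estimation a multilevel package such as HLM6 performs. The function name and the example scores are hypothetical.

```python
from statistics import mean


def icc_oneway(groups):
    """One-way ANOVA estimate of the intraclass correlation, ICC(1).

    groups: list of lists, one list of individual scores per training group.
    Handles unequal group sizes through the adjusted average size n0.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and some within-group replication")
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    # Adjusted average group size for unbalanced designs.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)
    # The estimate can fall below zero when groups differ less than chance predicts.
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


if __name__ == "__main__":
    # Hypothetical engagement scores (0-6 scale) for three small groups.
    scores = [
        [4.2, 4.8, 5.0, 4.4],
        [3.6, 4.0, 3.2],
        [4.6, 5.2, 4.0, 4.8, 4.4],
    ]
    print(round(icc_oneway(scores), 2))
```

With only a handful of groups, an estimate like this is imprecise, which is consistent with the caution in the slides about testing group effects with 10 groups.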

Implications for Participants
• More group engagement, or the degree to which group members support one another and provide a facilitative environment to explore key issues in their relationships, may be a change mechanism that enhances relationship functioning (e.g., relationship adjustment, communication quality, relationship confidence).
• Potentially, these group dynamics may help participants gain new insights about how they are able to navigate the complexities of their relationship. Interpersonal learning is a core aspect of group processes (Yalom & Leszcz, 2005), which typically includes group members providing corrective feedback, self-disclosing about their own experiences, and discussing themes related to what is and is not healthy and safe in romantic relationships.

Implications for Participants
• Group conflict also seems to have an association with changes in participants’ relationship adjustment. Specifically, participants who reported higher levels of conflict among group members reported better relationship adjustment at post intervention.
• Potentially, higher levels of conflict among group members may represent the challenges among group members to help promote a pro-relationship stance (as it relates to the participant’s current relationship).
• Although group conflict may arise differently in Relationship Education Trainings, conflict, or at least tension, may be sufficient to encourage some participants to make improvements in their relationship functioning.

Implications for Trainers
• Group dynamics can be related to important outcomes, which suggests that training of trainer programs should include instruction on managing group dynamics.
• Training of trainer programs tend to focus on the content of the curriculum, giving future facilitators a broad overview of the concepts and activities of the program.
However, as this and previous research suggests (Burlingame et al., 2011), group cohesion may be an important contributing factor to relationship outcomes for participants.
• Facilitators could be trained in the stages of group formation, how to manage group conflict effectively, how to manage group participants who are over- or under-involved, and other aspects of group processes.

Implications for Trainers/Facilitators
• Facilitators must learn to attend to process as much as content. There is wide variability in the style of facilitators, as this team’s research has identified through its adherence assessment process (Owen, under review).
• Given a comprehensive curriculum that may exceed the amount of time available for delivery, some facilitators will choose to try to cover every core concept and activity while others will encourage more discussion and cover less content.
• This research suggests that group discussion and activities that promote group cohesion (even “negative” aspects such as conflict) should be prioritized, as they have significant benefits for outcomes.

Implications for Trainers/Facilitators
• Facilitators should monitor the conflict within the group to be sure that the challenges are beneficial for participants.
• The tendency of many facilitators, particularly in time-limited groups, is to provide a “positive” experience, which may actually shut down important discussions due to conflict.
• Relationship Education Trainings have historically tried to focus on strengths and skill building because they are time-limited and do not want to leave couples feeling more distressed than before the program.
• However, in this case, where the Relationship Education is for individuals who are not attending with their partners, there was benefit to the “negative” side of group cohesion: conflict.
• Facilitators must learn to balance the need to create a safe environment, where participants feel free to share and have an overall positive experience, with genuine group process that includes conflict and confrontation of problem areas.

Potential to Link Fidelity, Facilitator Engagement and Group Cohesion to Outcomes
• In both studies we also have measures of participant learning and attitudinal and behavioral outcomes, so we can create a chain of evidence from these training and participant measures to participant learning and transfer of training.
• In our work in child welfare staff and supervisor training we rarely assessed fidelity to the curriculum and never assessed facilitator engagement and group cohesion.
• Preliminary findings from the REAL data show that group cohesion had an effect on attitudinal and behavioral changes (transfer of learning).

Lessons Learned
1) It is important to measure fidelity and to ensure fidelity in order to reach desired outcomes.
2) It is also important for trainers and facilitators to engage participants and create group cohesion among trainees for learning and training transfer to occur.
3) Attitude change and behavioral intention can be one way to measure transfer of training.
4) Relationship skills can be taught.
5) The field of Public Health expects measurement of fidelity and facilitator engagement as well as training transfer and outcomes in training interventions.

Implications for Child Welfare
• The Children’s Bureau is moving to the creation of Evidence Based Curricula for:
– Teaching states how to create CQI systems
– Trauma Informed Interventions and creating trauma informed organizations
• More Child Welfare training needs to establish itself as Evidence Based and include fidelity measures and other measures of what happens in the training room that can influence transfer