What Evaluating Evidence Based
Training Curricula Can Teach Us
about Evaluating Child Welfare
Anita P. Barbee, MSSW, Ph.D.
Becky Antle, MSSW, Ph.D.
Kent School of Social Work
University of Louisville
NHSTES in Pittsburgh, PA
May 22, 2013
Louisville Child Welfare Training Evaluation Model
(Antle, Barbee & van Zyl, 2008)
Zooming in on the Training Cycle
• We have mostly focused on individual and
organizational predictor variables affecting
training outcomes of satisfaction, learning,
transfer of learning, and organizational and client
outcomes
• We continue to measure all of those aspects of
training, but when we had the opportunity to
conduct efficacy trials to test evidence informed
and evidence based training interventions we
began to focus more on the training cycle
Which Slides Contain Background
Information and Which Are the Focus
• The next slides in your handout are background
information about two grants that allowed us to test the
efficacy of several evidence informed trainings (EITs) and evidence
based trainings (EBTs) (Within My Reach, Love
Notes and Reducing the Risk) (slides 5-11)
• We will focus our talk on how to measure fidelity to the
training curriculum, trainer competence and two training
variables that we are finding impact outcomes (trainer
alliance and group cohesion) and will give you all copies of
our measures (slide 12)
• Finally, slides 13-22 give findings from one analysis; these
are for future reference, and we can summarize them quickly
• We will end our time leading a discussion about
implications of that analysis (slides 23-30) and how we can
transfer these findings to child welfare training evaluation (slide 31)
First Grant that Evaluated Efficacy of
Evidence Based or Evidence Informed
• We received $2.5M in funding from the Office of
Family Assistance (OFA) for the grant Relationship
Education Across Louisville (REAL) from 2006-2011, which evaluated the efficacy of Within My
Reach in lowering Interpersonal Violence (IPV) in
a population of 900 low income adults served by
Neighborhood Places in Louisville, KY, as well as
the efficacy of Love Notes on 450 youth who had
dropped out of high school in Louisville, KY (Antle
was PI, Barbee was Evaluator)
Second Grant
• We received $4.8M in funding from the Office of
Adolescent Health (OAH) for a teen pregnancy
prevention research study we call CHAMPS (Creating
Healthy Adolescents Through Meaningful Prevention
Services) from 2010-2015. It is a three arm randomized
controlled trial and longitudinal study comparing the
effects of three training interventions. 1300 urban,
refugee and foster youth ages 14-19 are being
randomly assigned to either Love Notes, Reducing the
Risk or The Power of We (that serves as a control
condition) and followed at the 3, 6, 12 and 24 month
points. These trainings are delivered in 20 community
based organizations serving high risk youth in
Louisville, KY (Barbee is PI, Antle is Co-PI).
Training Interventions:
Within My Reach
• Within My Reach (Pearson, Stanley, & Kline, 2005) is a 16 hour healthy
relationships program, a relationship education intervention designed for
low-income populations that teaches communication and conflict
resolution skills, relationship decision making strategies, and relationship
safety/violence prevention content.
• The WMR program has many modules that promote group interaction via
discussions, activities, and processing of topics as they relate to the
participants’ lives, including several group based activities in which group
members share parts of their lives. For more information about WMR see Antle et al. (2011) and
Pearson et al. (2005). This program was provided in four 4-hour sessions
(typically conducted weekly) at neighborhood-based social service sites.
• WMR was developed to reach a broader audience (individuals who may or
may not be in a romantic relationship) than couple-focused relationship
education programs. Specifically, the WMR curriculum was designed for
individuals (vs. couples) in order to be a primary prevention method (e.g.,
assisting individuals in making sound relationship choices, regardless of
relationship status) as well as to safeguard participants when sensitive
issues are discussed, in particular, intimate partner violence (wherein
including both partners in the session may be contraindicated).
Love Notes
• Love Notes was developed to educate participants about healthy
relationships, including issues of decision-making, communication and
conflict resolution, and overall safety, including the prevention of
pregnancy and sexually transmitted disease (Pearson, 2009).
• This strategy may provide more long-term prevention of interpersonal
violence, as participants are engaged in a thorough self-assessment of
relationship values, needs, and models of safety that will assist them with
future decision-making and commitment in intimate relationships. Love
Notes is a derivative of the Prevention and Relationship Enhancement
Program (PREP; Stanley, Markman, & Jenkins, 2009), which is a relationship
and marriage education program listed as an evidence-based practice (EBP) by
SAMHSA (www.samhsa.gov).
• This curriculum builds on social exchange theory and meets the needs of
youth who are alienated and in need of loving personal relationships.
Thus, it is an excellent counterpart to the pregnancy prevention
curriculum and allows a test of a theoretical model of what combination of
information is best in preventing high risk behaviors, pregnancy and
transmission of STIs.
• Research by Antle et al. (2011) found that participants in Love Notes
experience significant gains in relationship knowledge, communication
and conflict resolution skills, relationship self-efficacy and attitudes
toward violence.
Reducing the Risk
Reducing the Risk: Building Skills to Prevent Pregnancy, STD and HIV (RtR) was
developed by Richard Barth, MSW, Ph.D. in California. The training manual is in its
5th Edition and was last published in 2011.
This curriculum is one of the first that was evaluated using an experimental design,
with a longitudinal follow up (6 months and 18 months) and tested on a large
group of high school students (N = 758). It is also one of the first programs to show
an impact on beliefs of adolescent sexual behavior prevalence and actual behavior
as well as increasing parent-child communication about abstinence and
contraception (Kirby, Barth, Leland, and Fetro, 1991). For those who were virgins
at the pre-test, the curriculum significantly reduced the onset of intercourse at 18
months and those who did have sex were more likely than controls to use
contraceptives. These effects held for members of several ethnic groups
(Caucasian, African American, Hispanic and Asian), both genders, and for lower
and higher risk youth.
For females and lower-risk groups who had initiated intercourse before the pre-test and curriculum delivery, contraceptive use was increased after the training
and significantly more so than for controls (Kirby et al., 1991). The youth in the
comparison group did receive a traditional sexuality education intervention, thus
for RtR to significantly improve outcomes for participants above and beyond
another intervention means that it is particularly effective.
RtR continued
• Another study that tested the effectiveness of RtR was
conducted in Arkansas with rural and urban youth
(Hubbard, Giese and Raney, 1998). This study found
that RtR delayed the initiation of sex among youth who
were virgins at the pre-test and increased condom use
among youth that did initiate intercourse after the
• A third study evaluated the impact of RtR in Kentucky
and Ohio (Zimmerman et al., 2008). It found that RtR
significantly delayed the initiation of sex, but condom
and contraception use was not increased.
The Power of We (POW)
• The control group participants receive training
in community organizing and community
building that is delivered by trainers from the
Network Center for Community Change (NC3),
a nonprofit organization in Louisville.
• This ensures that participants are receiving
some service and filling the same number of
hours (approximately 13) interacting in a
training environment, just on a different and
unrelated topic.
Measurement Issues When Using
Evidence Based or Evidence Informed
• Because of the importance of testing the efficacy of
evidence based curricula on outcomes, we placed special
emphasis in our evaluation on measuring and ensuring
fidelity to the curricula using a special tool and objective
evaluators who observed each training session and rated
fidelity to the curriculum.
• For CHAMPS we have also included a quantitative measure
of facilitator engagement, which each facilitator completes
about their own performance and their partner’s
performance, and which the objective evaluator completes
about each facilitator’s performance.
• For a portion of the REAL grant and all of the CHAMPS
grant we have included measures of facilitator
engagement and group cohesion along with our usual
measures of participant satisfaction.
One Example of Effects of Fidelity,
Engagement and Group Cohesion on
Outcomes in WMR Training
• A sample of 126 participants
– 98 women
– 21 men
– 7 unknown gender
• Average age was 33.33 years (s.d. = 12 years)
• Average number of children was 2
• Median education level was high school or GED
• Racial background of participants
– 66.1% African American
– 27.8% Euro American
– 6% Hispanic, Asian American or other
Measure of Fidelity
• Adherence to the program was monitored
each session by trained raters.
• Adherence to the WMR training manual was
generally high (i.e., 94% of the proposed
material, including discussions, activities, and
content were covered).
• We can argue that those training participants
who participated in all 4 training days (92% of
participants) received a full dose of the
intervention given attendance and fidelity to
the curriculum by facilitators.
Effects of Adherence on Outcomes
• We conducted one analysis on 559 participants in WMR and found a
significant interaction between adherence, participant reactions to the
training, and relational outcomes (communication skills and couple
relationship quality).
• For participants who reported very positive reactions to the training, there
was a negative effect of strict adherence on relational outcomes.
• Moderate adherence to the curriculum allows for more discussion and
interactions with the trainer and other participants, which is likely more
desirable when participants have a very positive view of the facilitator and
the training.
• In contrast, for participants who reported less favorable reactions to the
training, there was a positive effect of strict adherence on relational
outcomes.
• This suggests that in a situation where a trainer was viewed as less
competent or engaging, it was better for participants when he/she
followed the curriculum content very precisely. This finding goes
beyond Barber et al. (2006), who found no effect of adherence on
outcomes when clients are engaged: it suggests a negative effect of
adherence on outcomes when participants are otherwise engaged or
satisfied with services.
How We Analyzed the Impact of
Adherence on Outcomes
• Multilevel modeling was utilized to test the hypotheses (time
nested within participants, nested within groups). At level 1, we
included the repeated measures for the relational outcomes (i.e.,
the dependent variables, DAS and CPQ at pre, post, and 6-month).
The Time variable was coded -2 = pre, -1 = post, and 0 = 6-month
follow-up, thus, the intercept values reflect participants’ scores at
6-month follow-up. At level 2 or the participant level, we included
sex and race/ethnicity (both uncentered) as control variables and
participants’ reactions to the program at post (grand-mean
centered). At level 3 or the group level, we included ratings of
adherence (grand mean centered for both the linear and quadratic
effects). We created two models, one for each outcome variable
(i.e., DAS and CPQ). We examined whether participants’ outcomes
would be associated with adherence (both linear and quadratic
effects) and participants’ reactions to the program, after controlling for
sex (Men = 1, Women = 0) and race (White = 1, Racial/Ethnic
Minority = 0). We also included the cross-level interaction between
participants’ reactions and adherence on the intercept.
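The predictor coding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code used for the study: the function names and the example adherence ratings are hypothetical, and the quadratic term is assumed to be the square of the grand-mean-centered linear term.

```python
from statistics import mean

# Time codes from the slide: with 6-month follow-up coded 0,
# the model intercept is the predicted score at follow-up.
TIME_CODES = {"pre": -2, "post": -1, "6-month": 0}


def grand_mean_center(values):
    """Subtract the overall mean so that 0 stands for an average score."""
    overall = mean(values)
    return [v - overall for v in values]


def adherence_terms(group_adherence):
    """Return (linear, quadratic) adherence predictors, one pair per group.

    Assumption: the quadratic term is the square of the centered linear term.
    """
    linear = grand_mean_center(group_adherence)
    quadratic = [x * x for x in linear]
    return linear, quadratic


def cross_level_interaction(reaction_centered, adherence_centered):
    """Product term for participant reaction x group adherence."""
    return reaction_centered * adherence_centered


# Hypothetical adherence ratings (proportion of material covered) for four groups.
example_adherence = [0.90, 0.94, 0.98, 0.94]
linear, quadratic = adherence_terms(example_adherence)
```

Centering before squaring keeps the linear and quadratic terms less correlated with each other, which is one common reason for coding the predictors this way.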
Facilitator Working Alliance with
• Working Alliance Inventory-Short Form (WAI-S, Tracey &
Kokotovic, 1989). The WAI-S is a client rated measure of working
alliance that consists of 12 items that assess goals and tasks for
therapy as well as the relational bond between the client and
therapist. These items were rated on a seven-point scale ranging
from 1 (Strongly Disagree) to 7 (Strongly Agree) with higher scores
indicating a better working alliance.
• The WAI-S is a commonly used measure of working alliance, and its
reliability and validity have been demonstrated in numerous studies
comparing the WAI-S to other working alliance scales and therapy
outcomes (see Horvath et al., 2011 for a review).
• The language was adjusted to reflect the premarital education
context, consistent with Owen et al. (2011). Some couples took
premarital education with two leaders; working alliance scores for
the two leaders were highly correlated (r = .78), so we used an
average of the scores in our analyses. This measure was completed
at post intervention and the Cronbach alpha was .85.
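For readers who want to reproduce a reliability estimate like the one reported above, here is a minimal Python sketch of Cronbach's alpha using the standard formula, plus the simple averaging of two leaders' alliance scores. The function names and the example ratings are hypothetical; the slides do not say which software produced the reported alpha.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*responses))  # one tuple of scores per item
    item_var_sum = sum(variance(item) for item in items)
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def average_leader_alliance(leader_a, leader_b):
    """Average one participant's alliance ratings of two co-leaders."""
    return (leader_a + leader_b) / 2


# Hypothetical 1-7 ratings from four respondents on three items.
example = [[5, 6, 5], [3, 3, 4], [7, 6, 7], [4, 5, 4]]
alpha = cronbach_alpha(example)
```

Alpha rises as items covary more strongly relative to their individual variances, which is why a short subscale with loosely related items can fall well below a longer, more homogeneous one.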
Group Cohesion of Participants
• Group Climate Questionnaire (GCQ; MacKenzie, 1983). The GCQ has 12
items that are rated on a 7-point Likert scale, ranging from 0 (not at all) to
6 (extremely). The GCQ assesses group members’ perceptions of the group
environment. The GCQ has three empirically based factors:
(a) Engagement, or the degree of self-disclosure, involvement, and
investment among group members
(b) Avoidance, or the lack of addressing key issues between members or
avoidance of responsibility to change
(c) Conflict, which captures the overt struggles and covert sense of
distrust among group members (MacKenzie, 1983).
• Support for the internal consistency of the GCQ has been shown in
previous studies (e.g., alpha = .88 to .94; Kivlighan & Goldfine, 1991;
MacKenzie, 1983). Construct validity of the GCQ has been demonstrated
by its relationship with therapy outcomes, therapist ratings of the group
process, and other client rated group process measures (e.g., Kivlighan &
Goldfine, 1991; Quirk, Miller, Duncan, & Owen, in press).
• In the current study, the Cronbach alphas for the three subscales were .77
(Engagement, five items), .84 (Conflict, four items), and .56 (Avoidance,
three items). Similar to the WAI, this measure was only completed at post
intervention.
Outcome Measure: Relationship
• Relationship Confidence. We used the 5-item Confidence
Scale (Stanley, Hoyer, & Trathen, 1994) to measure couples’
confidence that they can effectively manage their
relationship and stay together. Relationship confidence is a
key outcome measure for relationship prevention programs
(e.g., Stanley et al., 2001).
• Items were rated on a seven-point scale ranging from 1
(Strongly Disagree) to 7 (Strongly Agree) with higher scores
indicating more confidence in the relationship. Confidence
Scale scores have been shown to be related to other
relationship and individual characteristics and outcomes in
meaningful ways (see Rhoades, Stanley, & Markman, 2009;
Whitton et al., 2007) and have been utilized in relationship
education programs (e.g., Owen et al., 2011). Cronbach
alphas were .70 and .75 at pre- and post-assessment,
respectively.
Outcome Measure: Communication
• Relationship Dynamics Scale (RDS; Stanley &
Markman, 1997). We utilized the 7-item version
of the RDS to assess the frequency of negative
communication, including escalation,
invalidation, and withdrawal that are typically
associated with negative relationship outcomes.
Respondents rate each item on a 1 (Almost never)
to 3 (Frequently) scale with higher scores
indicating more frequent negative
communication. In a variety of samples, the
measure has demonstrated adequate reliability
and validity (e.g., Kline et al., 2004; Markman et
al., 2010). The Cronbach alphas were .82 at pre-assessment and .85 at post-assessment.
Outcome Measure: Relationship
• Dyadic Adjustment Scale-7 (DAS; Spanier, 1976). The
quality of intimate relationships was measured using the
seven-item version of the DAS. Relationship quality is a
hallmark outcome in relationship education programs (REPs) and is considered a potential
outcome for individuals who want to strengthen their
relationship. The DAS-7 assesses the quality of the couple
relationship including facets such as satisfaction, cohesion,
consensus, affection and expression. The DAS is widely
utilized in clinical and research settings. Previous studies
have supported the DAS’s internal consistency (e.g., alphas
ranged from .76 to .96) as well as validity (e.g., discriminate
sensitivity between distressed and non-distressed
individuals, and correlations with other relationship
measures – such as commitment, communication quality;
see Sabourin et al., 2005). Cronbach alphas were .86 and
.86 at pre- and post-assessment, respectively.
Multilevel modeling was utilized to examine the associations of participants’
alliance and group cohesion with relationship outcomes at post intervention.
Multilevel models are advantageous for many reasons; most notably, they account
for the interdependencies among observations that can occur when multiple
individuals are included in the same group. Thus, individuals (level 1) were nested
within groups (level 2). Note there were 10 groups (average 18.6 individuals per
group, range 5 to 21). The small number of groups inhibited our ability to test
group differences (Maas & Hox, 2005); however, conducting MLM can still correct
for the interdependencies for individuals who attended the same group.
Statistically, a small number of groups (level 2 units) can produce biased random
effect estimates, yet the fixed effects (e.g., the association between alliance and
DAS, for example) typically demonstrate less bias (Maas & Hox, 2005).
We created two multilevel models with relationship variables at post (i.e., DAS
and RDS) as the outcome variables, respectively.
The predictor variables included pre-functioning for the relationship variables,
working alliance, and the three subscales for group cohesion (all grand-mean
centered).
Note that we tested whether gender and/or race/ethnicity would affect the results
when included as control variables. The covariates were not statistically significant
and were subsequently dropped from the final models. Multilevel models were
conducted using Hierarchical Linear Modeling Version 6 (HLM6) (Raudenbush,
Bryk, Cheong, & Congdon, 2005).
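Written out, a random-intercept version of the two-level model described above might look like the following. This is a sketch rather than the authors' exact specification: the symbols are assumptions introduced here, as is the choice to let only the intercept vary across groups.

```latex
% Level 1: individual i in group j (all predictors grand-mean centered)
Y^{\text{post}}_{ij} = \beta_{0j}
  + \beta_{1}\, Y^{\text{pre}}_{ij}
  + \beta_{2}\, \text{WAI}_{ij}
  + \beta_{3}\, \text{Engage}_{ij}
  + \beta_{4}\, \text{Conflict}_{ij}
  + \beta_{5}\, \text{Avoid}_{ij}
  + r_{ij}, \qquad r_{ij} \sim N(0, \sigma^{2})

% Level 2: group j
\beta_{0j} = \gamma_{00} + u_{0j}, \qquad u_{0j} \sim N(0, \tau_{00})
```

Here the fixed effects (the beta coefficients for alliance and the cohesion subscales) are the quantities of interest, while the group-level term u_0j absorbs the dependence among individuals who attended the same group.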
Means and Standard Deviations for Key Variables
Variable          Pre M (SD)      Post M (SD)     Effect Size (Cohen’s d)
DAS               20.65 (7.00)    21.84 (7.07)
RDS               2.00 (0.52)     1.82 (0.55)
Confidence        4.78 (1.27)     5.22 (1.26)
WAI                               5.68 (0.88)
GCQ-S Engage                      4.41 (1.10)
GCQ-S Conflict                    1.08 (1.35)
GCQ-S Avoid                       3.56 (1.30)
Notes. DAS = Dyadic Adjustment Scale, RDS = Relationship Dynamics Scale, WAI = Working Alliance Inventory, GCQ-S = Group Climate
Questionnaire-Short Form. The changes from pre to post on the DAS, RDS, and Confidence were statistically significant (ps < .05). Ns ranged
from 123 to 126. For Cohen’s d, small-sized effect = 0.20, medium-sized effect = 0.50, large-sized effect = 0.80.
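As a rough illustration of how a pre-to-post effect size can be computed from means and standard deviations like those in the table, the Python sketch below uses one common form of Cohen's d. The formula choice is an assumption: the slides do not say which version was used, and a repeated-measures formula that adjusts for the pre-post correlation would give a different value. The function names are hypothetical.

```python
from math import sqrt


def cohens_d(mean_pre, sd_pre, mean_post, sd_post):
    """Standardized mean difference using the average of the two variances.

    Assumption: equal n at both time points and no adjustment for the
    pre-post correlation; other repeated-measures formulas give other values.
    """
    pooled_sd = sqrt((sd_pre ** 2 + sd_post ** 2) / 2)
    return (mean_post - mean_pre) / pooled_sd


def effect_size_label(d):
    """Label |d| with the benchmarks quoted in the table notes."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "below small"


# First pre/post pair listed in the table: 20.65 (7.00) and 21.84 (7.07).
d_first_row = cohens_d(20.65, 7.00, 21.84, 7.07)
```

With post minus pre in the numerator, a measure on which lower scores are better (such as negative communication) yields a negative d when participants improve.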
• Working Alliance-Relationship Outcomes. Our first three hypotheses
related to the association between working alliance and relationship
outcomes. None of the associations between alliance and relationship
outcomes were statistically significant.
• Group Cohesion-Relationship Outcomes. Group Engagement was
significantly associated with less negative communication with a
partner (RDS; d = -0.16), suggesting that individuals who felt more
connected with other group members reported improvements in their
communication quality with their romantic partner.
• Less group avoidance was associated with better relationship adjustment
at post (DAS; d = -0.24).
• Group Effects for Group Cohesion. Given that we only have 10 groups, we
were limited in testing group effects. However, a series of exploratory
analyses were conducted. First, we estimated the shared variance (ICC) in
the group engagement, conflict, and avoidance scores of participants who were
in the same group. The ICCs for Engagement, Conflict, and Avoidance were
.12, .05, and .17, respectively. This suggests that groups accounted for 5% to 17% of the
variance in the group cohesion subscale scores.
• In context, therapists typically account for 5% to 9% of the variance in
clients’ alliance scores (Baldwin & Imel, in press).
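To make the ICC concrete, the sketch below estimates the share of variance attributable to group membership using a one-way ANOVA estimator. This is a swapped-in approximation for illustration: multilevel software such as HLM typically derives the ICC from the variance components of an unconditional model, so values can differ somewhat. The function name and the example scores are hypothetical.

```python
def icc_oneway(groups):
    """ICC(1) from a one-way ANOVA; groups is a list of lists of scores.

    Uses the standard adjustment for unequal group sizes. The estimate can be
    negative when groups differ less than chance would predict.
    """
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    n_groups = len(groups)
    if n_groups < 2 or n_total <= n_groups:
        raise ValueError("need at least two groups and some within-group replication")
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_total - n_groups)

    # Effective group size when groups are unbalanced.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (n_groups - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


# Hypothetical engagement scores for three training groups.
example_groups = [[4.0, 4.5, 5.0], [3.0, 3.5], [5.0, 5.5, 6.0, 5.5]]
icc = icc_oneway(example_groups)
```

An ICC of .12, for example, would mean that about 12% of the variation in a subscale score lies between groups rather than between individuals within the same group.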
Implications for Preliminary Results
• This is the first known study to examine group
cohesion in the context of Relationship
Education Trainings and is consistent with the
general findings within the psychotherapy
literature (cf. Burlingame et al., 2011) –
suggesting that group interventions may have
common change mechanisms.
Implications for Participants
• More group engagement, or the degree to which group
members support one another and provide a facilitative
environment to explore key issues in their relationships,
may be a change mechanism that enhances relationship
functioning (e.g., relationship adjustment, communication
quality, relationship confidence).
• Potentially, these group dynamics may help participants
gain new insights about how they are able to navigate the
complexities of their relationship. Interpersonal learning is
a core aspect of group processes (Yalom & Leszcz, 2005),
which typically includes group members providing
corrective feedback, self-disclosing about their own
experiences, and discussing themes related to what is and
is not healthy and safe in romantic relationships.
Implications for Participants
• Group conflict also seems to have an association with
changes in participants’ relationship adjustment.
Specifically, participants who reported higher levels of
conflict among group members reported better
relationship adjustment at post intervention.
• Potentially, higher levels of conflict among group
members may represent the challenges among group
members to help promote a pro-relationship stance (as
it relates to the participant’s current relationship).
• Although group conflict may arise differently in
Relationship Education Trainings, conflict, or at least
tension, within the group may be sufficient to
encourage some participants to make improvements in
their relationship functioning.
Implications for Trainers
• Group dynamics can be related to important outcomes,
which may suggest that training of trainer programs should
include instruction on managing group dynamics.
• Training of trainer programs tend to focus on the content of
the curriculum, giving future facilitators a broad overview
of the concepts and activities of the program. However, as
this and previous research suggests (Burlingame et al.,
2011), group cohesion may be an important contributing
factor to relationship outcomes for participants.
• Facilitators could be trained in the stages of group
formation, how to manage group conflict effectively, how
to manage group participants who are over- or under-involved, and other aspects of group processes.
Implications for Trainers/Facilitators
• Facilitators must learn to attend to process as much as
content. There is wide variability in the style of
facilitators, as this team’s research has identified
through its adherence assessment process (Owen,
under review).
• Given a comprehensive curriculum that may exceed
the amount of time available for delivery, some
facilitators will choose to try to cover every core
concept and activity while others will encourage more
discussion and cover less content.
• This research suggests that group discussion and
activities that promote group cohesion (even
“negative” aspects such as conflict) should be
prioritized as they have significant benefits for
participants.
Implications for Trainers/Facilitators
• Facilitators should monitor the conflict within the group, to be sure
that the challenges are beneficial for participants.
• The tendency of many facilitators, particularly in time-limited groups,
is to provide a “positive” experience, which may actually shut down
important discussions due to conflict.
• Relationship Education Trainings have historically tried to focus on
strengths and skill building because they are time-limited and do
not want to leave couples feeling more distressed than before the
• However, in this case where the Relationship Education is for
individuals who are not attending with their partners, there was a
benefit to the “negative” side of group cohesion: conflict.
• Facilitators must learn to balance the need to create a safe
environment where participants feel free to share and have an
overall positive experience, with genuine group process that
includes conflict and confrontation of problem areas.
Potential to Link Fidelity, Facilitator
Engagement and Group Cohesion to
• In both studies we also have measures of participant
learning and attitudinal and behavioral outcomes so
we can create a chain of evidence from these training
and participant measures to participant learning and
transfer of training.
• In our work in child welfare staff and supervisor
training we rarely assessed fidelity to the curriculum
and never assessed facilitator engagement and group
cohesion.
• Preliminary findings from the REAL data show that
group cohesion had an effect on attitudinal and
behavioral changes (transfer of learning).
Lessons Learned
1) It is important to measure fidelity and to ensure
fidelity in order to reach desired outcomes
2) It is also important for trainers and facilitators to
engage participants and create group
cohesion among trainees for learning and
training transfer to occur
3) Attitude change and behavioral intention can be
one way to measure transfer of training
4) Relationship skills can be taught
5) The field of Public Health expects measurement
of fidelity and facilitator engagement as well as
training transfer and outcomes in training
Implications for Child Welfare
• The Children’s Bureau is moving to the
creation of Evidence Based Curricula for:
– Teaching states how to create CQI systems
– Trauma Informed Interventions and creating
trauma informed organizations
• More Child Welfare training needs to establish
itself as Evidence Based and include fidelity
measures and other measures of what
happens in the training room that can
influence transfer
