Complexity and evaluation - Community of Practice on Results

Complexity and evaluation
EADI Masterclass
September 2012
Rough agenda
Days 1 and 2
Session 1
Introductions and introduction to the complexity
sciences and their implications.
Session 2
Discussion and implications for our work.
Session 3
Overview of the different approaches to evaluation
Session 4
Discussion and implications for our work.
Theories of causality
 Natural Law – movement is perfectly regular and predictable and the
parts add up to the whole. Everything that is possible is already given
and there is nothing new under the sun. Causality is of an efficient, if-then kind using rules of a timeless kind. Change occurs through single,
isolatable causes giving rise to predictable effects. Nature and human
society as mechanism. Plato/Newton.
 Rationalist causality – movement is toward a goal autonomously chosen
by humans using their reason. The whole is achieved through choice or
design of the parts brought about by human motivation trying to get it
right by universals. The world of Kant.
 Formative causality – parts and whole thinking. Movement towards a
whole which is already contained in the parts, i.e. from an acorn to an
oak, from child to adult. Kant’s dual position – formative causality for
nature (regulative idea), and rationalist causality for the social.
Theories of causality II
 Transformative or generative causality – movement towards
an unknown form in which the process itself is also evolving.
The emergence of identity in a transformative, self-organizing process. Hegel/complexity sciences.
 Adaptionist causality – after Darwin, populations evolve
towards stable states driven by competition and random
fluctuation. Individual entities evolve through competition
which improves individual fitness.
Research methods suggested by the
different theories of causality
 Rationalist causality – project design, evaluation design,
isolating single causes and single effects. RCTs and statistical
methods.
 Formative causality – Theories of Change and parts-whole
thinking found in systems theory (often combined with
rationalist causality). Abstract parts/whole models.
 Transformative/generative causality – macro and micro are
tightly connected and are co-evolving. Realist evaluations,
agent-based modelling, micro-narratives, case of 1 studies.
Definitions of ‘impact’
 OECD
‘positive and negative, primary and secondary long-term effects
produced by a development intervention, directly or indirectly,
intended or unintended.’
 3ie
‘…analyses that measure the net change in outcomes for a
particular group of people that can be attributed to a specific
programme using the best method available, feasible and
appropriate to the evaluation question that is being investigated
and to the specific context.’
Impact definitions II
 Poverty Action Lab
‘The primary purpose of impact evaluation is to determine
whether a programme has impact (on a few key outcomes), and
more specifically, to quantify how large the impact is.’
 World Bank
‘…assessing changes in the well-being of individuals and
households, communities or firms that can be attributed to a
particular project, programme or policy.’
DfID Working Paper 38
Broadening the range of designs and methods for impact evaluations
 …most development interventions are “contributory causes”. They ‘work’ as
part of a causal package with other helping factors such as stakeholder
behaviour, related programmes and policies, institutional capacities, cultural
factors or socio-economic trends.
 It is often more informative to ask ‘did the intervention make a difference?’
which allows for a combination of causes rather than ‘did the intervention
work?’ which expects an intervention to be a cause acting on its own.
 The sheer complexity and ambition of international development programmes means
that there will always be limits to what can be said about the links between
development interventions as ‘causes’ and development results as ‘effects’.
Programmes about governance, empowerment, accountability, climate change
and post-conflict stabilisation are exploring new areas of practice for which
little evidence exists. Many of the problems being addressed are long-term and
inter-generational; some of their impacts will only become clear to policymakers still to be born.
DfID Working Paper 38
 IE is not method-specific – no single design or method can
lay monopoly claim to the production of evidence for policy
learning; and all established methods have difficulty with
many contemporary interventions.
 Where IE is linked to ‘accountability’ and to demonstrating
causal effectiveness, the idea of explanation becomes much
less important.
Evidence-based policy
 What counts as evidence?
 Different kinds of evidence can clash: either they have to be
reconciled, or judgements have to be made between
conflicting policy signposts.
 The evidence and the possibilities of causal inference can be
found through many methods and designs.
 There is a need for theory to permit explanation
Four views of causation
 Regularity frameworks that depend on the frequency of
association between cause and effect – the inferential basis for
statistical approaches to IE.
 Counterfactual frameworks that depend on the difference
between two otherwise identical cases – the basis for experimental and
quasi-experimental methods.
 Multiple causation that depends on combinations of causes
that lead to an effect – the basis for configurational approaches to IE.
 Generative causation that depends on identifying the
‘mechanisms’ which explain the effect – the basis for theory-based
and ‘realist’ approaches to IE.
DfID Working Paper 38
Regularity
Requirements: Many cases.
Strength claims: Uncovering ‘laws’.
Potential weaknesses: Difficulties explaining how and why. Construct validity.
Experiments/counterfactuals
Requirements: Two identical cases for comparison. Ability to ‘control’ the intervention.
Strength claims: Avoiding bias.
Potential weaknesses: Generalisation. Role of contexts.
Multiple causation
Requirements: Sufficient numbers of cases. Cases with comparable characteristics.
Strength claims: Discovery of typologies.
Potential weaknesses: Difficulties interpreting highly complex combinations.
Generative
Requirements: One case with access to multiple data sources.
Strength claims: In-depth understanding of context.
Potential weaknesses: Estimating extent. Risk of bias.
Reasons for looking to the complexity
sciences
 No one single complexity science but a number of strands which
take an interest in non-linear interaction.
 Classical science assumes proportional relationship between one
cause and a predictable effect. Models of stability and
proportionality, if-then causality.
 However, non-linear equations often have no analytical solution, so
models built on them were considered to be of questionable usefulness.
 Non-linear equations can be iterated on powerful computers to
model inter-relationships over time.
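As a rough illustration of the last point, the sketch below iterates one simple non-linear equation, the logistic map x -> r * x * (1 - x). The slides do not name this equation; it is used here only because it is a common textbook example, and the function name, starting value and values of r are assumptions made for illustration.

```python
def iterate_logistic(r, x0=0.2, steps=1000, keep=8):
    """Iterate x -> r * x * (1 - x) and return the last `keep` values."""
    x = x0
    history = []
    for _ in range(steps):
        x = r * x * (1 - x)   # the output of one step is the input to the next
        history.append(x)
    return history[-keep:]

# Same rule, three different values of the single parameter r
for r in (2.8, 3.2, 3.9):
    tail = iterate_logistic(r)
    print(r, [round(v, 3) for v in tail])
```

Under these assumed settings the same rule settles on a single value, alternates between two values, or keeps wandering without repeating, depending only on r. There is no formula to solve; the behaviour is found by running the iteration.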
Ilya Prigogine and transformative
causality
 Prigogine argues for a break with the idea of a predictable
nature. Is the future given in the present?
 He argues against classical science in favour of the
introduction of the concept of time into equations. Some
processes are irreversible. Time is not, as Einstein claimed, an
‘illusion.’
 Unknowable futures emerge from human interaction in the
disorderly present – we can think of this as transformative
causality.
 Human beings forming patterns of interaction at the same
time as they are formed by them – paradoxical.
Prigogine II
 Classical physics takes the individual as the fundamental unit
of analysis, and population-wide effects are averaged or
extrapolated from this.
 Prigogine is suggesting that in complex systems it is
impossible to identify individual trajectories, both because
we cannot measure to infinite precision and because of
resonance.
 We should pay attention to ensembles – whole populations
reach bifurcation points where a system self-organises into an
unpredictable state.
Modelling reality
 Both classical and complexity scientists use statistical models as a
way of simulating what they are talking about.
 Models are products of the human mind, are reductive and are
simplified representations of the reality we are trying to think
about.
 Models traditionally have:
A boundary
Rules for classifying elements of the model
Elements are assumed to be homogeneous, or to have average variations
Individual behaviour is also averaged, eliminating luck, randomness
and noise.
 The reality being modelled is stability or equilibrium.
With four assumptions
 All five assumptions – classical physics
 Remove assumption five. Deterministic chaos. The data from
the last iteration is fed into the next iteration leading to
stability and instability at the same time. Patterning but never
exactly the same each time. Fractal, weather, human heart.
Impossible to make long-term predictions.
 It is only possible to move from one attractor to another if the
parameters are changed – unfolding patterns which are
already enfolded in the design of the system.
 This is not a model of learning and creativity.
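The claim that long-term prediction is impossible under deterministic chaos can be sketched with a small experiment. The example assumes the logistic map x -> r * x * (1 - x) with r = 3.9 as a stand-in for a chaotic model (the slides do not specify one) and compares two runs whose starting points differ by one part in ten billion. All names and values are illustrative assumptions.

```python
def logistic_trajectory(x0, r=3.9, steps=100):
    """Return the successive values of x -> r * x * (1 - x), starting from x0."""
    values = []
    x = x0
    for _ in range(steps):
        x = r * x * (1 - x)   # data from the last iteration feeds the next
        values.append(x)
    return values

a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-10)   # almost identical starting point
gaps = [abs(u - v) for u, v in zip(a, b)]

early_gap = gaps[4]          # gap after 5 iterations
late_gap = max(gaps[60:])    # largest gap over iterations 61 to 100
print(early_gap, late_gap)
```

The two runs are indistinguishable at first and unrelated later, even though the rule is fully deterministic and every value stays between 0 and 1. The pattern is bounded but never exactly repeated, which is the combination of stability and instability described above.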
With three assumptions
 When ‘noise’, i.e. non-average behaviour, is introduced into
the model, it demonstrates an ability to shift from one
attractor to another by itself – the model self-organises.
 However, the system is still fluctuating between one pre-designed
attractor and another. There is no transformed
future for such models.
 Prigogine’s dissipative structures – and weather convection
patterns.
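A toy sketch of this idea, under stated assumptions: a single state variable is pulled towards one of two fixed attractors (+1 or -1) and is jostled by random noise. This is not Prigogine's model of dissipative structures; the equation, the noise level, the seed and the function name are all hypothetical choices made to show noise-driven switching between attractors that were built into the model in advance.

```python
import random

def count_switches(noise, steps=20000, dt=0.05, seed=1):
    """Simulate a state pulled towards +1 or -1 and count sign changes.

    The two attractors are fixed by the equation itself; `noise` only
    controls how strongly the state is jostled at each step.
    """
    rng = random.Random(seed)
    x = 0.5                      # start on the side of the +1 attractor
    switches = 0
    for _ in range(steps):
        pull = x - x ** 3        # deterministic pull towards +1 or -1
        kick = noise * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        new_x = x + dt * pull + kick
        if (new_x > 0) != (x > 0):
            switches += 1        # the state crossed to the other side
        x = new_x
    return switches, x

quiet_switches, quiet_final = count_switches(noise=0.0)
noisy_switches, _ = count_switches(noise=0.7)
print(quiet_switches, round(quiet_final, 3), noisy_switches)
```

Without noise the state settles on one attractor and stays there. With noise it moves between the two by itself, but only ever between the two attractors written into the equation, so nothing new is created.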
With two assumptions
 Introducing diversity at the micro-level introduces the possibility
of the model evolving in novel ways.
 Individual entities have incentives to pursue different strategies in
their interactions with others.
 The collective system conditions the response that any individual
agent can pursue.
 Small individual differences in interactions between diverse agents
can be amplified into population-wide changes in patterning.
Complex adaptive systems.
 Each individual agent is responding and adapting to their local
context with others. The brain.
 No overall blueprint or programme for the collective pattern.
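The amplification of small individual differences can be sketched with a threshold model of collective behaviour in the style of Granovetter. This model is not mentioned in the slides and is swapped in here purely as an illustration: each agent joins in once enough others have already done so, and the thresholds and function name below are assumptions.

```python
def cascade_size(thresholds):
    """Return how many agents end up acting.

    Each agent acts once the number of agents already acting is at least
    its own threshold. There is no central plan: the final pattern comes
    only from agents responding to one another.
    """
    acting = 0
    while True:
        now_acting = sum(1 for t in thresholds if t <= acting)
        if now_acting == acting:
            return acting        # nobody else is prompted to join
        acting = now_acting

diverse = list(range(100))       # thresholds 0, 1, 2, ..., 99
nudged = diverse.copy()
nudged[1] = 2                    # one agent becomes slightly more cautious
uniform = [50] * 100             # everyone replaced by a near-average agent

full = cascade_size(diverse)
stalled = cascade_size(nudged)
flat = cascade_size(uniform)
print(full, stalled, flat)       # prints "100 1 0"
```

The three populations have almost the same average threshold, yet one produces a population-wide change, one stalls after a single agent, and one never starts. Averaging the agents removes exactly the diversity that drives the outcome.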
Self-organising computer models
 Self-organising emergence is not a free-for-all. In
organisational terms we are not ‘just letting things emerge’.
 Individual agents are both constraining and enabling each
other. The patterning emerges as a result of what every agent
is doing and not doing.
 Implications for social control and order.
 The implications of paying attention to the local interplay of
diverse human behaviour.
Two complex adaptive systems
modellers
Peter Allen
 ‘… the landscape of possible advantage itself is produced by the
actors in interaction, and that the detailed history of the
exploration process itself affects the outcome. Paradoxically,
uncertainty is therefore inevitable, and we must face this. Long
term success is not just about improving performance with respect
to the external of a complex society. The “payoff” of any action for
an individual cannot be stated in absolute terms because it
depends on what other individuals are doing. Strategies are
interdependent…Innovation and change occur because of
diversity, non-average individuals with their bizarre initiatives, and
whenever this leads to an exploration into an area where positive
feedback outweighs negative, then growth will occur. Value is
assigned afterwards.’ (1998: 157)
Peter Hedström
 There is no necessary proportionality between the size of a
cause and the size of its effect.
 The structure of the social interaction is of considerable
importance in its own right for the social outcomes that
emerge.
 The effect a given action has on the social can be highly
contingent upon the structural configuration in which the
actor is embedded.
 Aggregate patterns say very little about the micro-level
processes that brought them about. (2005: 99)
Consequences of non-linearity
 Organisational and social change has a limited predictability
 The centrality of local interaction to the emergence of ongoing patterns
of organisational and social life: conflict, power and the exploration of
difference.
 The limits to individual choice.
 Stability through relationships. Power laws.
 Innovation through difference and diversity.
 Limits to planning – creative-destructive dynamic emerges because of
what everyone is doing.
 ‘Success’ is not based on stability, but on the combination of
stable-instability of emergent novelty.
 Identity and difference of individuals and groups rather than
performance.
 Interdependent people, rather than autonomous individuals.
RCTs
Duflo and Kremer (2005)
 ‘All we can hope for is to be able to obtain the average
impact of the programme on a group of individuals by
comparing them to a similar group of individuals who were
not exposed to the programme.’
 Causal regularities are hypothesised, data on a set of apposite
variables is gathered to explore this regularity and according
to the analysis the conjectured uniformity is further
explained or explained away.
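The ‘average impact’ in the quotation above can be sketched as a difference in group means. The data below are invented for illustration only (they are not from Duflo and Kremer or any real programme): half of the simulated individuals gain 4 units from the programme and half lose 2, and all variable names are assumptions.

```python
import random
from statistics import mean

rng = random.Random(42)
n = 2000

# Hypothetical population: a baseline outcome plus an individual effect
baseline = [10 + (i % 5) for i in range(n)]
effect = [4 if i % 2 == 0 else -2 for i in range(n)]

# Random assignment: half the individuals are exposed to the programme
treated_ids = set(rng.sample(range(n), n // 2))
treated = [baseline[i] + effect[i] for i in range(n) if i in treated_ids]
control = [baseline[i] for i in range(n) if i not in treated_ids]

# The quantity an RCT estimates: treated mean minus control mean
estimated_average_impact = mean(treated) - mean(control)
true_average_impact = mean(effect)
print(round(estimated_average_impact, 2), true_average_impact)
```

The comparison recovers an average impact close to the true average of 1, although no individual in this invented population experiences an impact of 1. The estimate says how large the effect is on average, not how or for whom it came about.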
Counterfactuals
 The assumptions of counterfactual thinking do not always
hold (e.g. finding an identical match for the factual world, the
world where the cause and effect have been observed, may be
difficult, or even impossible); and even when they do,
counterfactuals associate a single cause with a single effect.
 Counterfactuals answer contingent, setting-specific causal
questions ‘did it work there and then’ and cannot be used for
generalisation to other settings and timeframes, unless they
are accompanied by more fine-grained knowledge on the
causal mechanisms actually operating within the process
leading from cause to effect.
Theory-based evaluation (ToC)
 A causal map or diagram about how an intervention achieves
its objectives.
 Weak form: a logic model that expresses the intentions of the
policy-makers
 Strong form: taking into account actions and intentions of
stakeholders and the assumptions about conditions.
 ‘A programme theory is an explicit theory or model of how
an intervention contributes to a set of specific outcomes
through a series of intermediate results.’ Funnell and Rogers
2011.
Theories of change
• Theory of change is both a process and a product. It should be seen as an on-going process of
discussion-based analysis and learning that produces powerful insights to support programme design,
strategy, implementation, evaluation and impact assessment, communicated through diagrams and
narratives which are updated at regular intervals.
• The quality of a theory of change process rests on ‘making assumptions explicit’ and making strategic
thinking realistic and transparent. Practical experience highlights that this is not straightforward to do,
as these tap into deeper beliefs, values, worldviews, operational ‘rules of thumb’ and analytical lenses
that all individuals in development bring to their work. It takes time and dialogue to be able to
challenge assumptions. Power relations, both in the programme’s context and within organisations, limit the
ability to challenge established ways of working.
• The time and resources needed to work effectively with theory of change need to be taken seriously.
Staff in donor agencies, country programmes and civil society organisations are all under time pressures
– pragmatic approaches can get theory of change habits seeded, but institutional and funding support
for theory of change processes is needed to get the benefits in terms of more robust log-frames, results
frameworks and better implementation of programmes.
Developing a ToC
There is consensus on the basic elements that make up the theory of change approach. As a minimum,
theory of change is considered to encompass a discussion of the following elements:
• Context for the initiative, including social, political and environmental conditions, the current state
of the problem the project is seeking to influence and other actors able to influence change
• Long-term change that the initiative seeks to support and for whose ultimate benefit
• Process/sequence of change anticipated to lead to the desired long-term outcome
• Assumptions about how these changes might happen, as a check on whether the activities and
outputs are appropriate for influencing change in the desired direction in this context.
• Diagram and narrative summary that captures the outcomes of the discussion.
• Develop a situation analysis.
• Undertake a scoping exercise.
• Develop an outcomes chain.
• Develop a theory of action.
ToC III
• Working with theory of change thinking can be challenging but it can create a
strong organising framework to improve programme design, implementation,
evaluation and learning if some of the following enabling factors can be achieved:
• People are able to discuss and exchange their personal, organisational and
analytical assumptions with an open, learning approach.
• Theory of change thinking is used to explain rationales and how things are
intended to work, but also to explore new possibilities through critical thinking,
discussion and challenging of dominant narratives for the benefit of stakeholders.
• Critical thinking is cross-checked with evidence from research (qualitative and
quantitative) and wider learning that brings other analytical perspectives,
referenced to stakeholders’, partners’ and beneficiaries’ contextual knowledge.
TOCS
 A number of theories of change are identified as relevant ‘pathways’ to impact for any
given initiative, rather than a single pathway, with acknowledgement of the non-linearity
and emergent nature of these.
 Documented theories of change and visual diagrams are acknowledged as subjective
interpretations of the change process and used as evolving ‘organising frameworks’ to guide
implementation and evaluation, not rigid predictions or prescriptions for change.
 Theory of change frameworks and visuals are used to support a more dynamic exchange
between donors, funders, grantees, development partners, programmes and communities,
to help open up new areas and challenge received wisdoms.
 Donors, funders and grant-makers are able to find ways to support justified adaptation
and refocusing of programme strategies during implementation, while there is time to
deliver improvements to stakeholders and communities.
Do ToCs help evaluative work?
 The evidence resulting from this review to repudiate or substantiate many of the claims
put forth by critics of and advocates for theory-driven forms of evaluation is, at best,
modest, and in some instances conflicting….
…In many of the cases reviewed, the explication of a program theory unmistakably was
unnecessary, or almost an afterthought in some instances, and was not visibly used in any
meaningful way for formulating or prioritizing evaluation questions nor for
conceptualizing, designing, conducting, interpreting, or applying the evaluation
reported. In these cases, from a methodological perspective, such evaluations very likely
would have produced the same results and conclusions even in the absence of articulating
or expressing an underlying theory. In other cases, however, the explication of a plausible
program theory noticeably was essential to the planning, design, and execution of the
evaluation (see Donaldson & Gooler, 2002, 2003).
Chris L. S. Coryn, Lindsay A. Noakes, Carl D. Westine and Daniela C. Schröter
A Systematic Review of Theory-Driven Evaluation Practice From 1990 to 2009
American Journal of Evaluation, 2010.
Michael Quinn Patton – developmental
evaluation
 Developmental evaluation is a process of engagement and
privileges relationships and communicative interaction.
 It is a process of co-creation and co-interpretation.
 The evaluator is a stakeholder.
 Principles, not models.
 Methodological diversity.
 In earlier versions of Patton’s work, developmental evaluation is
understood to be questioning, then questioning further.
Disagreements with Quinn Patton
 Despite good intentions, Quinn Patton never really lets go of the idea of
evaluator as rationalist designer, observer – the role of the evaluator is realist
and unproblematised. Looking at things through ‘lenses’.
 It is best suited to five types of social intervention/project. Complex ‘systems’
can be mapped, or be directed, or ‘tipped’ in a different ‘direction’.
 Sometimes, for Quinn Patton, responding to complexity is about reducing uncertainty.
 The unproblematic discussion of ‘data’.
 Diverse and contradictory reflections on what it means to take complexity and
emergence seriously (simple rules, wicked problems, simple, complicated,
complex etc.).
 The usual panoply of managerialist understandings: vision, unity, trust, shared
values, authenticity etc., which presume the evaluator as observer.
 Emergent = flexible
 Little discussion of narrative, reflexivity, power and paradox.
Realistic evaluation – Pawson and
Tilley
 Evaluators to attend to how and why social programmes
work – what are the conditions conducive to making them
work?
 Ontological depth – evaluators to penetrate below the
surface of the observable.
 How do programme mechanisms replace or subvert the
mechanisms the programme is designed to replace?
 Understanding context – for whom and in what
circumstances does the programme work?
 Understanding multiple outcomes and how they are
produced.
Realistic evaluation II
 CMO – context-mechanism-outcome pattern.
 Teacher-learner role with respondents – not treating
programme participants as privileged respondents.
 Programmes take place in ‘open systems’
Alvesson and Sköldberg
 Good qualitative research is not a technical project; it is an
intellectual one.
 Insight-driven research – implying finding a more profound
meaning than that immediately given.
 Encouraging multiplicity – realizing that other
interpretations are possible.
 Identity is constituted by the action of narrating.
 Power: the asymmetry between researchers and ‘natives’.
 Researchers should not hide behind the myth of the
neutrality of their research.
Alvesson and Sköldberg II
 Empirical facts or data are never the rock-solid ground envisaged
by positivists and empiricists, but always a tenuous network.
 Have the researchers demonstrated why we should believe them?
 Does the research have practical and theoretical relevance?
 Awareness of the ambiguity of language.
 Interpretations which are ‘rich in points’. Data enables and
supports interpretation rather than necessarily leading up to it.
 Generative capacity enabling a qualitatively new interpretation of
fragments of social reality.
Diagram: two contrasting poles of evaluation practice
Abstract, detached, positivist, individual-based, bounded. Approaches listed: experimental and quasi-experimental methods. People and what they are doing disappear from view.
Particular, involved, constructivist, group-based, multiple. Approaches listed: Theories of Change, realistic evaluation, case of one/case studies. People and what they are doing/saying are at the centre.
Case studies
Case study research that focuses on what Wittgenstein called “the
epistemology of the particular” works by expanding and sharpening
the vocabulary and expressions as they are used by researchers,
practitioners, policymakers, and citizens to talk about social
practices, a process that Tsoukas calls heuristic generalization
(Tsoukas 2009). In this way, they are able to draw ever more subtle
distinctions between this instance of a particular social practice and
that one.
Flyvbjerg – Making Social Science
Matter
(1) Where are we going?
(2) Is this development desirable?
(3) What, if anything, should we do about it?
(4) Who gains and who loses, and by which
mechanisms of power?
The ‘we’ here consists of those organization
researchers asking the questions and those who
share the concerns of the researchers, including
people in the organization under study.
Flyvbjerg II
Phronetic organization research focuses on practical activity
and practical knowledge in everyday situations in organizations.
It may mean, but is certainly not limited to, a focus on known
sociological, ethnographic, and historical phenomena such as
‘everyday life’ and ‘everyday people’, with their focus on the so-called ‘common’. What it always means, however, is a focus on
the actual daily practices – common or highly specialized or
rarefied – which constitute a given organizational field of
interest, regardless of whether these practices constitute a stock
exchange, a grassroots organization, a neighbourhood, a
multinational corporation, an emergency ward, or a local
school board.
Flyvbjerg III
 Looking at practice
 Studying cases and contexts
 Asking ‘how?’, doing narrative
 Moving beyond agency and structure
 Dialoguing with a polyphony of voices
‘Case of one’ studies
‘… is best conceptualized not as a blueprint and implementation plan
for a state-of-the-art technical system but as a series of overlapping,
conflicting, and mutually misunderstood language games that
combine to produce a situation of ambiguity, paradox,
incompleteness, and confusion. But going beyond technical
“solutions” and engaging with these language games would clash with
the bounded rationality that policymakers typically employ to make
their eHealth programs manageable.’
Greenhalgh et al 2011.
 The holy grail of theoretical generalization by abstraction.
 We must resist the temptation to begin with a closed definition of
what “a case of X” comprises and then proceed to study how the
case under investigation aligns with this (Tsoukas 2009).
Complex responsive processes view of
organisations
 Embodied, interdependent human person. A social and
relational view of psychology
 Process as responsive acts of mutual recognition by persons.
 Patterns of interaction produce further patterns of
interaction and nothing else. These constitute individual and
collective identities.
 Causality is transformative: both continuity and
potential transformation emerge at the same time. Non-linear interaction holds the potential for the amplification of
small differences.
Complex responsive processes II
 The perpetual construction of the future in the present in
consideration of the past.
 No spatial metaphors – nothing is inside or outside.
 Emergence as the interplay of human intentions.
 No-one can take an external view of process to intervene on it.
 Practice as the local, social activity of communication, power
relating and evaluative choice.
 Experience as the social process of consciousness and
self-consciousness in interaction with others.
 Organisation as patterns of relating in which one can only
participate.