Complexity and evaluation
EADI Masterclass, September 2012

Rough agenda, Days 1 and 2
Session 1: Introductions and introduction to the complexity sciences and their implications.
Session 2: Discussion and implications for our work.
Session 3: Overview of the different approaches to evaluation.
Session 4: Discussion and implications for our work.

Theories of causality
Natural Law – movement is perfectly regular and predictable and the parts add up to the whole. Everything that is possible is already given and there is nothing new under the sun. Causality is of an efficient, if-then kind using rules of a timeless kind. Change occurs through single, isolatable causes giving rise to predictable effects. Nature and human society as mechanism. Plato/Newton.
Rationalist causality – movement is toward a goal autonomously chosen by humans using their reason. The whole is achieved through choice or design of the parts brought about by human motivation trying to get it right by universals. The world of Kant.
Formative causality – parts and whole thinking. Movement towards a whole which is already contained in the parts, i.e. from an acorn to an oak, from child to adult.
Kant’s dual position – formative causality for nature (regulative idea), and rationalist causality for the social.

Theories of causality II
Transformative or generative causality – movement towards an unknown form in which the process itself is also evolving. The emergence of identity in a transformative, self-organizing process. Hegel/complexity sciences.
Adaptionist causality – after Darwin, populations evolve towards stable states driven by competition and random fluctuation. Individual entities evolve through competition which improves individual fitness.

Research methods suggested by the different theories of causality
Rationalist causality – project design, evaluation design, isolating single causes and single effects. RCTs and statistical methods.
Formative causality – Theories of Change and parts-whole thinking found in systems theory (often combined with rationalist causality). Abstract parts/whole models.
Transformative/generative causality – macro and micro are tightly connected and are co-evolving. Realist evaluations, agent-based modelling, micro-narratives, ‘case of one’ studies.

Definitions of ‘impact’
OECD: ‘positive and negative, primary and secondary long-term effects produced by a development intervention, directly or indirectly, intended or unintended.’
3ie: ‘…analyses that measure the net change in outcomes for a particular group of people that can be attributed to a specific programme using the best method available, feasible and appropriate to the evaluation question that is being investigated and to the specific context.’

Impact definitions II
Poverty Action Lab: ‘The primary purpose of impact evaluation is to determine whether a programme has impact (on a few key outcomes), and more specifically, to quantify how large the impact is.’
World Bank: ‘…assessing changes in the well-being of individuals and households, communities or firms that can be attributed to a particular project, programme or policy.’

DfID Working Paper 38: Broadening the Range of Designs and Methods for Impact Evaluations
…most development interventions are “contributory causes”. They ‘work’ as part of a causal package with other helping factors such as stakeholder behaviour, related programmes and policies, institutional capacities, cultural factors or socio-economic trends.
It is often more informative to ask ‘did the intervention make a difference?’, which allows for a combination of causes, rather than ‘did the intervention work?’, which expects an intervention to be a cause acting on its own.
The sheer complexity and ambition of international development programmes means that there will always be limits to what can be said about the links between development interventions as ‘causes’ and development results as ‘effects’.
Programmes about governance, empowerment, accountability, climate change and post-conflict stabilisation are exploring new areas of practice for which little evidence exists.
Many of the problems being addressed are long-term and inter-generational; some of their impacts will only become clear to policymakers still to be born.

DfID Working Paper 38
IE is not method-specific – no single design or method can lay monopoly claim to the production of evidence for policy learning; and all established methods have difficulty with many contemporary interventions.
The link between ‘accountability’ and demonstrating causal effectiveness – the idea of explanation becomes much less important.

Evidence-based policy
What counts as evidence?
There are different kinds of evidence which clash, and either they have to be reconciled or judgements have to be made between conflicting policy signposts.
The evidence and the possibilities of causal inference can be found through many methods and designs.
There is a need for theory to permit explanation.

Four views of causation
Regularity frameworks that depend on the frequency of association between cause and effect – inference basis for statistical methods to IE.
Counterfactual frameworks that depend on the difference between two otherwise identical cases – experimental and quasi-experimental methods.
Multiple causation that depends on combination of causes that lead to an effect – configurational approaches to IE.
Generative causation that depends on identifying the ‘mechanisms’ which explain the effect – basis for theory-based and ‘realist’ approaches to IE.

DfID Working Paper 38
Regularity
Requirements: many cases.
Strength claims: uncovering ‘laws’.
Potential weaknesses: difficulties explaining how and why; construct validity.
Experiments/counterfactuals
Requirements: two identical cases for comparison; ability to ‘control’ the intervention.
Strength claims: avoiding bias.
Potential weaknesses: generalisation; role of contexts.
Multiple causation
Requirements: sufficient numbers of cases; cases with comparable characteristics.
Strength claims: discovery of typologies.
Potential weaknesses: difficulties interpreting highly complex combinations.
Generative
Requirements: one case with access to multiple data sources.
Strength claims: in-depth understanding of context.
Potential weaknesses: estimating extent; risk of bias.

Reasons for looking to the complexity sciences
No one single complexity science but a number of strands which take an interest in non-linear interaction.
Classical science assumes a proportional relationship between one cause and a predictable effect. Models of stability and proportionality, if-then causality.
However, models with non-linear equations have no solution, and so were considered to be of questionable usefulness.
Non-linear equations can be reiterated on powerful computers to model inter-relationships over time.

Ilya Prigogine and transformative causality
Prigogine argues for a break with the idea of a predictable nature. Is the future given in the present?
He argues against classical science in favour of the introduction of the concept of time into equations. Some processes are irreversible. Time is not, as Einstein claimed, an ‘illusion.’
Unknowable futures emerge from human interaction in the disorderly present – we can think of this as transformative causality.
Human beings forming patterns of interaction at the same time as they are formed by them – paradoxical.

Prigogine II
Classical physics takes the individual as the fundamental unit of analysis, and population-wide effects are averaged or extrapolated from this.
Prigogine is suggesting that in complex systems it is impossible to identify individual trajectories, both because we cannot measure to infinite precision and because of resonance.
We should pay attention to ensembles – whole populations reach bifurcation points where a system self-organises into an unpredictable state.
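A minimal sketch of the points above about iterating non-linear equations and the limits of measurement precision. It is not taken from the masterclass materials: the logistic map is used here as an assumed stand-in for a simple non-linear model, and the function name, the parameter r = 4 and the starting values are illustrative choices. Two runs whose starting points differ by one part in ten billion begin indistinguishable and end up unrelated, which is one way of seeing why individual trajectories cannot be identified without infinite precision.

```python
def logistic_trajectory(x0, r=4.0, steps=80):
    """Iterate the non-linear rule x_next = r * x * (1 - x).

    Each output is fed back in as the next input, so the states
    returned are the successive iterations starting from x0.
    """
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two starting points that differ far below any realistic
# measurement precision (an illustrative assumption).
a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-10)

# Distance between the two runs at each iteration.
gaps = [abs(p - q) for p, q in zip(a, b)]

print("initial gap:", gaps[0])
print("largest gap over the run:", max(gaps))
```

The rule itself is fully deterministic; the unpredictability comes only from the amplification of a difference too small to measure.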
Modelling reality
Both classical and complexity scientists use statistical models as a way of simulating what they are talking about.
Models are products of the human mind, are reductive and are simplified representations of the reality we are trying to think about.
Models traditionally have:
1. A boundary.
2. Rules for classifying elements of the model.
3. Elements are assumed to be homogenous, or have average variations.
4. Individual behaviour is also averaged, eliminating luck, randomness and noise.
5. The reality being modelled is stability or equilibrium.

With four assumptions
All five assumptions – classical physics.
Remove assumption five. Deterministic chaos.
The data from the last iteration is fed into the next iteration, leading to stability and instability at the same time. Patterning, but never exactly the same each time. Fractal, weather, human heart.
Impossible to make long-term predictions.
Only possible to move from one attractor to another if the parameters are changed – unfolding patterns which are already enfolded in the design of the system.
This is not a model of learning and creativity.

With three assumptions
When ‘noise’, i.e. non-average behaviour, is introduced into the model, it demonstrates an ability to shift from one attractor to another by itself – the model self-organises.
However, the system is still fluctuating between one predesigned attractor and another. There is no transformed future for such models.
Prigogine’s dissipative structures – and weather convection patterns.

With two assumptions
Introducing diversity at the micro-level introduces the possibility of the model evolving in novel ways.
Individual entities have incentives to pursue different strategies in their interactions with others.
The collective system conditions the response that any individual agent can pursue.
Small individual differences in interactions between diverse agents can be amplified into population-wide changes in patterning.
Complex adaptive systems.
Each individual agent is responding and adapting to their local context with others. The brain.
No overall blueprint or programme for the collective pattern.

Self-organising computer models
Self-organising emergence is not a free-for-all. In organisational terms we are not ‘just letting things emerge’.
Individual agents are both constraining and enabling each other.
The patterning emerges as a result of what every agent is doing and not doing.
Implications for social control and order.
The implications of paying attention to the local interplay of diverse human behaviour.

Two complex adaptive systems modellers
Peter Allen
‘… the landscape of possible advantage itself is produced by the actors in interaction, and that the detailed history of the exploration process itself affects the outcome. Paradoxically, uncertainty is therefore inevitable, and we must face this. Long term success is not just about improving performance with respect to the external of a complex society. The “payoff” of any action for an individual cannot be stated in absolute terms because it depends on what other individuals are doing. Strategies are interdependent…Innovation and change occur because of diversity, non-average individuals with their bizarre initiatives, and whenever this leads to an exploration into an area where positive feedback outweighs negative, then growth will occur. Value is assigned afterwards.’ (1998: 157)

Peter Hedström
There is no necessary proportionality between the size of a cause and the size of its effect.
The structure of the social interaction is of considerable importance in its own right for the social outcomes that emerge.
The effect a given action has on the social can be highly contingent upon the structural configuration in which the actor is embedded.
Aggregate patterns say very little about the micro-level processes that brought them about.
(2005: 99)

Consequences of non-linearity
Organisational and social change has a limited predictability.
The centrality of local interaction to the emergence of ongoing patterns of organisational and social life: conflict, power and the exploration of difference.
The limits to individual choice.
Stability through relationships. Power laws.
Innovation through difference and diversity.
Limits to planning – creative-destructive dynamic emerges because of what everyone is doing.
‘Success’ is not based on stability, but on the combination of stable instability of emergent novelty.
Identity and difference of individuals and groups rather than performance.
Interdependent people, rather than autonomous individuals.

RCTs
Duflo and Kremer (2005): ‘All we can hope for is to be able to obtain the average impact of the programme on a group of individuals by comparing them to a similar group of individuals who were not exposed to the programme.’
Causal regularities are hypothesised, data on a set of apposite variables are gathered to explore this regularity, and according to the analysis the conjectured uniformity is further explained or explained away.

Counterfactuals
The assumptions of counterfactual thinking do not always hold (e.g. finding an identical match for the factual world, the world where the cause and effect have been observed, may be difficult, or even impossible); and even when they do, counterfactuals associate a single cause with a single effect.
Counterfactuals answer contingent, setting-specific causal questions (‘did it work there and then?’) and cannot be used for generalisation to other settings and timeframes, unless they are accompanied by more fine-grained knowledge on the causal mechanisms actually operating within the process leading from cause to effect.

Theory-based evaluation (ToC)
A causal map or diagram about how an intervention achieves its objectives.
Weak form: a logic model that expresses the intentions of the policy-makers.
Strong form: taking into account actions and intentions of stakeholders and the assumptions about conditions.
‘A programme theory is an explicit theory or model of how an intervention contributes to a set of specific outcomes through a series of intermediate results.’ Funnell and Rogers 2011.

Theories of change
Theory of change is both a process and a product. It should be seen as an on-going process of discussion-based analysis and learning that produces powerful insights to support programme design, strategy, implementation, evaluation and impact assessment, communicated through diagrams and narratives which are updated at regular intervals.
• The quality of a theory of change process rests on ‘making assumptions explicit’ and making strategic thinking realistic and transparent. Practical experience highlights that this is not straightforward to do, as these tap into deeper beliefs, values, worldviews, operational ‘rules of thumb’ and analytical lenses that all individuals in development bring to their work. It takes time and dialogue to be able to challenge assumptions. Power relations, both in the programme’s context and within organisations, limit the ability to challenge established ways of working.
• The time and resource needed to work effectively with theory of change needs to be taken seriously. Staff in donor agencies, country programmes and civil society organisations are all under time pressures – pragmatic approaches can get theory of change habits seeded, but institutional and funding support for theory of change processes is needed to get the benefits in terms of more robust log-frames, results frameworks and better implementation of programmes.

Developing a ToC
There is consensus on the basic elements that make up the theory of change approach.
As a minimum, theory of change is considered to encompass a discussion of the following elements:
• Context for the initiative, including social, political and environmental conditions, the current state of the problem the project is seeking to influence and other actors able to influence change
• Long-term change that the initiative seeks to support and for whose ultimate benefit
• Process/sequence of change anticipated to lead to the desired long-term outcome
• Assumptions about how these changes might happen, as a check on whether the activities and outputs are appropriate for influencing change in the desired direction in this context.
• Diagram and narrative summary that captures the outcomes of the discussion.
Develop a situation analysis.
Undertake a scoping exercise.
Develop an outcomes chain.
Develop a theory of action.

ToC III
• Working with theory of change thinking can be challenging but it can create a strong organising framework to improve programme design, implementation, evaluation and learning if some of the following enabling factors can be achieved:
• People are able to discuss and exchange their personal, organisational and analytical assumptions with an open, learning approach.
Theory of change thinking is used to explain rationales and how things are intended to work, but also to explore new possibilities through critical thinking, discussion and challenging of dominant narratives for the benefit of stakeholders.
Critical thinking is cross-checked with evidence from research (qualitative and quantitative) and wider learning that brings other analytical perspectives, referenced to stakeholders’, partners’ and beneficiaries’ contextual knowledge.

ToCs
A number of theories of change are identified as relevant ‘pathways’ to impact for any given initiative, rather than a single pathway, with acknowledgement of the non-linearity and emergent nature of these.
Documented theories of change and visual diagrams are acknowledged as subjective interpretations of the change process and used as evolving ‘organising frameworks’ to guide implementation and evaluation, not rigid predictions or prescriptions for change.
Theory of change frameworks and visuals are used to support a more dynamic exchange between donors, funders, grantees, development partners, programmes and communities, to help open up new areas and challenge received wisdoms.
Donors, funders and grant-makers are able to find ways to support justified adaptation and refocusing of programme strategies during implementation, while there is time to deliver improvements to stakeholders and communities.

Do ToCs help evaluative work?
The evidence resulting from this review to repudiate or substantiate many of the claims put forth by critics of and advocates for theory-driven forms of evaluation is, at best, modest, and in some instances conflicting….
…In many of the cases reviewed, the explication of a program theory unmistakably was unnecessary, or almost an afterthought in some instances, and was not visibly used in any meaningful way for formulating or prioritizing evaluation questions nor for conceptualizing, designing, conducting, interpreting, or applying the evaluation reported. In these cases, from a methodological perspective, such evaluations very likely would have produced the same results and conclusions even in the absence of articulating or expressing an underlying theory. In other cases, however, the explication of a plausible program theory noticeably was essential to the planning, design, and execution of the evaluation (see Donaldson & Gooler, 2002, 2003).
Chris L. S. Coryn, Lindsay A. Noakes, Carl D. Westine and Daniela C. Schröter, A Systematic Review of Theory-Driven Evaluation Practice From 1990 to 2009, American Journal of Evaluation, 2010.
Michael Quinn Patton – developmental evaluation
Developmental evaluation is a process of engagement and privileges relationships and communicative interaction.
It is a process of co-creation and co-interpretation. The evaluator is a stakeholder.
Principles, not models. Methodological diversity.
In previous versions of Patton, developmental evaluation is understood to be questioning, then questioning further.

Disagreements with Quinn Patton
Despite good intentions, Quinn Patton never really lets go of the idea of evaluator as rationalist designer, observer – the role of the evaluator is realist and unproblematised. Looking at things through ‘lenses’.
It is best suited to five types of social intervention/project.
Complex ‘systems’ can be mapped, or be directed, or ‘tipped’ in a different ‘direction’.
Sometimes for Quinn Patton, responding to complexity is about reducing uncertainty.
The unproblematic discussion of ‘data’.
Diverse and contradictory reflections on what it means to take complexity and emergence seriously (simple rules, wicked problems, simple, complicated, complex etc).
The usual panoply of managerialist understandings: vision, unity, trust, shared values, authenticity etc, which presume the evaluator as observer.
Emergent = flexible.
Little discussion of narrative, reflexivity, power and paradox.

Realistic evaluation – Pawson and Tilley
Evaluators to attend to how and why social programmes work – what are the conditions conducive to making them work?
Ontological depth – evaluators to penetrate below the surface of the observable.
How do programme mechanisms replace or subvert the mechanisms the programme is designed to replace?
Understanding context – for whom and in what circumstances does the programme work?
Understanding multiple outcomes and how they are produced.

Realistic evaluation II
CMO – context-mechanism-outcome pattern.
Teacher-learner role with respondents – programme participants are not treated as privileged respondents.
Programmes take place in ‘open systems’.

Alvesson and Sköldberg
Good qualitative research is not a technical project; it is an intellectual one.
Insight-driven research – implying finding a more profound meaning than that immediately given.
Encouraging multiplicity – realizing that other interpretations are possible.
Identity is constituted by the action of narrating.
Power: the asymmetry between researchers and ‘natives’. Researchers should not hide behind the myth of the neutrality of their research.

Alvesson and Sköldberg II
Empirical facts or data are never the rock-solid ground envisaged by positivists and empiricists, but always a tenuous network.
Have the researchers demonstrated why we should believe them?
Does the research have practical and theoretical relevance?
Awareness of the ambiguity of language.
Interpretations which are ‘rich in points’. Data enables and supports interpretation rather than necessarily leading up to it.
Generative capacity enabling a qualitatively new interpretation of fragments of social reality.

Abstract, detached, positivist, individual-based, bounded: experimental and quasi-experimental methods. People and what they are doing disappear from view.
Particular, involved, constructivist, group-based, multiple: Theories of Change, realistic evaluation, case of one/case studies. People and what they are doing/saying are at the centre.

Case studies
Case study research that focuses on what Wittgenstein called “the epistemology of the particular” works by expanding and sharpening the vocabulary and expressions as they are used by researchers, practitioners, policymakers, and citizens to talk about social practices, a process that Tsoukas calls heuristic generalization (Tsoukas 2009). In this way, they are able to draw ever more subtle distinctions between this instance of a particular social practice and that one.

Flyvbjerg – Making Social Science Matter
(1) Where are we going?
(2) Is this development desirable?
(3) What, if anything, should we do about it? The ‘we’ here consists of those organization researchers asking the questions and those who share the concerns of the researchers, including people in the organization under study.
(4) Who gains and who loses, and by which mechanisms of power?

Flyvbjerg
Phronetic organization research focuses on practical activity and practical knowledge in everyday situations in organizations. It may mean, but is certainly not limited to, a focus on known sociological, ethnographic, and historical phenomena such as ‘everyday life’ and ‘everyday people’, with their focus on the so-called ‘common’. What it always means, however, is a focus on the actual daily practices – common or highly specialized or rarefied – which constitute a given organizational field of interest, regardless of whether these practices constitute a stock exchange, a grassroots organization, a neighbourhood, a multinational corporation, an emergency ward, or a local school board.

Flyvbjerg III
Looking at practice
Studying cases and contexts
Asking ‘how?’, doing narrative
Moving beyond agency and structure
Dialoguing with a polyphony of voices

‘Case of one’ studies
‘… is best conceptualized not as a blueprint and implementation plan for a state-of-the-art technical system but as a series of overlapping, conflicting, and mutually misunderstood language games that combine to produce a situation of ambiguity, paradox, incompleteness, and confusion. But going beyond technical “solutions” and engaging with these language games would clash with the bounded rationality that policymakers typically employ to make their eHealth programs manageable.’ Greenhalgh et al 2011.
The holy grail of theoretical generalization by abstraction. We must resist the temptation to begin with a closed definition of what “a case of X” comprises and then proceed to study how the case under investigation aligns with this (Tsoukas 2009).
Complex responsive processes view of organisations
Embodied, interdependent human person. A social and relational view of psychology.
Process as responsive acts of mutual recognition by persons.
Patterns of interaction produce further patterns of interaction and nothing else. These constitute individual and collective identities.
Causality is transformative: both continuity and potential transformation emerge at the same time.
Nonlinear interaction holds the potential for the amplification of small differences.

Complex responsive processes II
The perpetual construction of the future in the present in consideration of the past.
No spatial metaphors – nothing is inside or outside.
Emergence as the interplay of human intentions.
No-one can take an external view of process to intervene on it.
Practice as the local, social activity of communication, power relating and evaluative choice.
Experience as the social process of consciousness and self-consciousness in interaction with others.
Organisation as patterns of relating in which one can only participate.