Evaluating public engagement: Deliberative democracy and the science museum

Angie Boyce, Senior Research/Evaluation Assistant
Museum of Science, Boston

In the wake of 9/11, the role of the museum is shifting from that of a mere "cultural symbol, economic engine, and provider of educational experience" to that of an institution that can "learn and master the process of civic engagement". By expanding the museum's role as an institution that engages visitors not only at an individual level but also at a larger social level, museums are playing an active role in the "reinvention of our democracy". This shift in the museum's role is a response to a larger movement to empower citizens and include more diverse voices in the shaping of our communities and societies. With declining public trust in governmental and corporate institutions, the need for citizens to have opportunities to be included in decision-making is pressing. As an institution that continually engenders public trust, the museum has a special opportunity and responsibility to assist in the revitalization of civic engagement.

Paralleling the shift in the relationship between museums and society is a shift in the relationship between science, technology, and society. The mid-1990s saw an increased interest in clarifying and broadening the definition of technological literacy with the creation of the Committee on Technological Literacy (CTL) under the auspices of the National Academy of Engineering and the National Research Council. The CTL released a report on technological literacy in 2002, arguing that American adults and children have a poor understanding of the essential characteristics of technology, how it influences society, and how people can and do affect its development, and that neither the educational system nor the policy-making apparatus in the United States has recognized the importance of technological literacy. The CTL embraced a broader definition of technological literacy, in which citizens become active agents rather than mere knowledge repositories, and in which knowledge is not limited to the technical but also includes the social, ethical, and political dimensions of technology.

Citizen engagement is also reflected in changes in the relationship between science and society. Traditionally, that relationship has been characterized by the expression "Public Understanding of Science" (PUS), a philosophy that focuses on communicating scientific facts to the public in ways that are accessible and understandable to the laity. By increasing public scientific knowledge, it was hoped, a more "modern, industrial, and self-critical society" would emerge, along with greater public trust in and support for science. In recent years, however, this philosophy has been criticized for its "deficit model" of the public: the assumption that the public is deficient in its scientific knowledge and that this ignorance causes mistrust of science. The remedy, then, is to correct the scientific illiteracy of the ignorant public. The new trend is characterized by the expressions "Public Engagement in/with Science and Technology" (PEST) and "Public Understanding of Current Research" (PUR), which abandon the deficit model in favor of involving citizens in actual science and technology decision-making.
Rather than trying to enhance public trust in and support for science through knowledge dissemination, PEST and PUR philosophies involve the creation of "opportunities for experts and lay audiences to learn from each other—partly so that lay audiences can learn the science but also very much so that researchers will understand the public's concerns about their work and may even take those concerns into account". While knowledge dissemination remains a concern, the more important point is that the public's voice will also be heard and potentially incorporated into the decisions made about science and technology policy. This move toward citizen engagement is not limited to the scientific realm; within political science, two movements have advocated increased public engagement in political decision-making: participatory public policy analysis and deliberative democracy. Within participatory public policy analysis, scholars have agreed that traditional policy-making is flawed and that better policies can be made by involving the public. Likewise, in deliberative democracy, public deliberation is defined and propounded as an ideal political mechanism for including the public voice in democratic decision-making.

At the confluence of these ideological shifts of science, technology, democracy, museums, and society lies the science museum. In response to these social conditions, science centers are becoming important sites for the creation and execution of public engagement opportunities. The Museum of Science, Boston (MOS) has launched such a program, called Technology Forum, which strives to engage the public in dialogue, discussion, and deliberation around current issues in science, technology, and society. Citizens are drawn together in the museum's program to enhance technological literacy through collaborative group learning. As we move forward on this endeavor, an important question arises: how can we use evaluation to improve public participation programs in the museum? What counts as a "good" or "successful" public engagement opportunity in the context of the science museum? This paper explores this question by reviewing evaluation criteria and methodologies within the deliberative democracy literature, and discusses the contingencies of the museum context in order to consider what sharing can take place between museums and the dialogue and deliberation community. Since evaluation criteria and methodologies are still being developed, this paper presents and addresses an important question for evaluation: what normative standards for good public participation methods can be preserved as these methods are employed in different contexts, such as the museum, and what contingent issues emerge?

Within the museum context are mitigating factors that affect the creation and adoption of public participation programs. In the past, the public perception of the museum as a neutral space has made it an attractive place in which to hold public participation exercises, but as the museum takes a more active role in the design and implementation of these programs, other factors about the museum context will need to be taken into consideration. Of particular importance is the tension between the museum's institutional focus on learning and education as the outcome and the ideal of a deliberative policy outcome within public participation. Because the museum is an informal learning environment, the learning that happens there is not fully captured by traditional models of conceptual learning.
Falk and Dierking propose a "contextual model of learning," which posits that learning is "constructed over time as the individual moves through his sociocultural and physical world; over time, meaning is built up, layer upon layer". Documenting and assessing learning in this model focuses on the subjective meaning-making that emerges within various contexts. The museum field has widely embraced sociocultural perspectives on learning and, in doing so, has shifted the relevant unit of analysis from the individual to the group within a specific context. Learning is a social phenomenon; people learn in museums through interaction with group members. The group, then, becomes the essential link between the museum's educative objective and deliberative democracy's policy objective.

Public participation and evaluation criteria and methodologies

A diverse array of public participation methods has been developed, both deliberative (citizens juries, citizens panels, planning cells, consensus conferences, deliberative polling) and non-deliberative (focus groups; consensus building exercises; surveys; public hearings; open houses; citizen advisory committees; community planning; visioning; notification, distribution, and solicitation of comments; and referenda). However, the evaluation literature on public participation and deliberative democracy is still in its infancy, and evaluation is only beginning to be considered a critical component of the development process.

Webler develops an evaluative framework based on two "metacriteria": competence, which he defines in terms of "psychological heuristics, listening and communication skills, self-reflection, and consensus building," and fairness, which occurs when "people are provided equal opportunities to determine the agenda, the rules for discourse, to speak and raise questions, and equal access to knowledge and interpretations". Webler qualifies competence and fairness as criteria by identifying conditions under which they are most likely to occur. Additionally, he categorizes and describes various types of discourse, which provides insight into individual psyches. Webler's evaluation framework is based largely on the theories of the German political philosopher Jürgen Habermas.

Rowe and Frewer claim that Webler's evaluation framework bears some similarities to theirs, but that Webler is concerned with discourse analysis in groups, while they are more concerned with general issues. They divide evaluation criteria into two parts: acceptance criteria, which relate to whether the public will accept a procedure, and process criteria, which relate to how the procedure is constructed and implemented. More specifically, the acceptance criteria are representativeness, independence, early involvement, influence, and transparency, while the process criteria are resource accessibility, task definition, structured decision-making, and cost-effectiveness. They use these criteria to compare public participation methods such as citizens panels, citizens juries, and consensus conferences. However, they conclude that no single method emerges as the best, and that the best techniques will probably emerge from "hybrids of more traditional methods".

In her work on citizen consensus conferences, Macoubrie theorizes various ways to improve conditions for deliberation by investigating the nature of deliberation and developing frameworks for how groups make decisions.
She develops the "Three Level Model," which looks at three levels of democratic deliberation: the political system process level, the group communication level, and the interpersonal level. Macoubrie's model is sensitive to the greater social context because it includes the political system process level. She finds that the following conditions make deliberation most likely: process openness, heterogeneous opinion groups, substantive reasons and informational issues addressed, inclusion of diverse views, consensus-seeking subtasks, and vigilant and systematic interaction. If one accepts these conditions, they provide possible variables by which the success of an event can be measured: was the process open? Were the groups heterogeneous?
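For illustration only, the sketch below records such conditions as simple yes/no variables for a single event and summarizes how many were met; the variable names and the scoring scheme are assumptions made for this example, not an instrument drawn from Macoubrie's work.

# Illustrative sketch only: recording deliberation conditions as yes/no
# variables for a single event and summarizing how many were met.
# The variable names and scoring scheme are assumptions for this example.
from dataclasses import dataclass, asdict

@dataclass
class DeliberationConditions:
    process_openness: bool
    heterogeneous_opinion_groups: bool
    substantive_reasons_addressed: bool
    diverse_views_included: bool
    consensus_seeking_subtasks: bool
    vigilant_systematic_interaction: bool

def summarize(conditions: DeliberationConditions) -> str:
    """List which conditions were observed and report how many were met."""
    observed = asdict(conditions)
    met = [name for name, value in observed.items() if value]
    return f"{len(met)}/{len(observed)} conditions met: {', '.join(met)}"

# Hypothetical observations from a single forum event.
event = DeliberationConditions(
    process_openness=True,
    heterogeneous_opinion_groups=True,
    substantive_reasons_addressed=False,
    diverse_views_included=True,
    consensus_seeking_subtasks=False,
    vigilant_systematic_interaction=True,
)
print(summarize(event))  # e.g. "4/6 conditions met: process_openness, ..."

A tally of this kind would, of course, only supplement the qualitative judgment of whether each condition was actually present.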
Einsiedel's work on Canadian citizens juries on xenotransplantation found that participant frustration largely centered on the "issue of the utility of the exercise as it connects (or does not connect) to a policy question or decision". She also notes that "evaluations of… [public participation] processes have been infrequent and unsystematic," and, furthermore, that "appraisal… has been hampered by the deficiency of frameworks for analysis". She therefore developed evaluation criteria from the literature on constructive technology assessment (which is front-end and design focused) and deliberative democracy (Habermas's rules for discourse). She divided evaluation into three components: institutional/organizational criteria (which focus on how the opportunity for public participation emerged and was shaped), process criteria (which focus on what procedures were used as part of the participatory process), and outcome criteria (which focus on the impacts on participants, the community, the larger public, and the policy process in general). She found that her criteria were generally effective, but one important point of interest is the "disjunction between the organizers' narrow definition of the task and that of the citizen participants" (327). As Einsiedel's work shows, it is important in evaluation not only to look at criteria but also to capture any trends that emerge in the actual event. Two aspects of her work are especially important for the museum context: participant learning is included as an outcome criterion, and the participant is a central focus. Looking at the participants, Einsiedel measured whether and how the program encouraged deliberation along the dimensions of "equality" and "discussion opportunities," analyzing the comments (both positive and negative) provided by participants at the end of the fora, as well as the arguments participants themselves presented, in order to gauge the complexity of the "elaborate considerations and reasoning behind each position" (323, 324).

Perhaps the most extensive evaluation effort published to date is by Horlock-Jones et al. in their evaluation of the GM Nation? public debate on genetic modification sponsored by the British government. They used three sets of criteria: the aims and objectives of the Steering Board (in charge of implementing the debate); normative criteria (transparency, well-defined tasks, lack of bias, inclusiveness, sufficient resources, and effective and fair dialogue); and participant views of success, gathered using surveys. By using three different sets of criteria, they show that normative criteria must co-exist with stakeholder goals and participant perceptions. Evaluation is always a social process, "firmly embedded in and inextricably tied to particular social and institutional structures and practices". In their article they provide in-depth details on their methodologies for each component of the debate, which lasted for multiple days and involved a large number of people in different areas around the nation. In order to measure whether and how deliberation (as well as other desirable outcomes) had occurred, they developed not only "normative" criteria but also "participants' evaluation" criteria (25, 26). They measured whether participants felt they had "opportunity for dialogue and debate, learning about the views of others, raised the profile of the issue, broad and representative participation, an opportunity to have one's say, well-organised and facilitated, sensitive timing, good quality materials available in advance, sufficient time available to run the event effectively, presence of experts for consultation, and whether the event was perceived as meaningful" (26). They also included a copy of all instruments used, an extremely helpful move that not only increases the transparency and credibility of the evaluation methodology itself but also opens up the opportunity for evaluators to improve upon these methodologies in the analysis of future public participation methods.

Joss describes several approaches to evaluating consensus conferences: efficiency (organization and management), effectiveness (external impact and outcomes), formative study (concurrent examination of structure and process with possible intervention), cross-cultural studies (wider cultural context comparisons), and cost-benefit analysis (cost-effectiveness). His evaluation work is especially relevant to the museum context because the particular model on which he based his work was a consensus conference held at the Science Museum in London. One interesting finding concerns how differences in institutional philosophies affected the conference: the conference was sponsored by a government agency and held at the museum, and because both parties advertised and promoted the conference, it was unclear to the participants who the conference organizer was. This is especially important because of public distrust of perceived stakeholders and the danger of bias (Joss, 1995). Additionally, the museum and the sponsor viewed the purpose of the exercise differently: the sponsor viewed it as a chance to raise public awareness and enhance public trust of biotechnology, while the museum viewed it more broadly, as both educational and participatory. The sponsor advocated the topic of plant biotechnology and did not wish to fund a conference on animal biotechnology, while the museum viewed animal biotechnology as an "ideal topic for a future conference" (96). Joss's work underscores the importance of looking at stakeholders in an evaluation to gain a more holistic view of the success of the event, rather than focusing only on normative criteria or participant views.

Interestingly, while scholars have developed different evaluative frameworks, the methodologies used in evaluation are largely similar. They examine discourse, documentation, and social relationships, using some quantitative but mostly qualitative methods. Indeed, it could be said that evaluation has taken an ethnographic turn. Webler advocates discourse analysis with a particular focus on the participant perspective.
Einsiedel conducted participant observations, collected materials used by participants, distributed questionnaires, recorded questions to the facilitator, and conducted interviews with randomly selected citizens and experts of interest. Horlock-Jones et al. used some of the same methodologies as Einsiedel and also conducted media analysis and public opinion surveys. In addition, they divided their observations into structured observations (looking for specific behaviors) and ethnographic recording. Joss described his methodologies most specifically of the scholars reviewed in this paper: he kept a log book and a document/files archive, conducted group discussions, distributed questionnaires, conducted interviews, asked participants to keep diaries, conducted a literature search, monitored conferences in other settings, and audio-taped all of the proceedings. Future evaluation work should discuss the merits and drawbacks of the methodologies used in order to inform and improve methodological procedures for the evaluation community.

Evaluating Forums in Museums: The MOS Prototype

Within the museum field, evaluation has become a critical part of museum work by focusing on the visitor experience and integrating visitor perspectives and needs into the exhibits and programs that the museum creates. At the Museum of Science, we strive to continue providing the Technology Forum program development team with participant feedback and other evaluative resources to continually improve the forum experience. We are exploring the criteria and methodologies used in public participation and modifying them to fit the museum context in a sustainable way. Evaluation, as Joss emphasizes, is a contextual, contingent practice. Like evaluators of deliberative democracy, we emphasize qualitative and ethnographic methodologies. Our methodologies and instruments include participant questionnaires, post-program interviews, participant and non-participant observation, and audio and video recording. While we have not yet formally adopted normative criteria from the evaluative frameworks presented by deliberative democratic theorists, we have drawn largely from the evaluation work of Horlock-Jones et al. Their inclusion of sponsor criteria is relevant to the museum context, where the educational goals of the content developer are often the focus of the evaluation. In addition, the criteria and subcriteria that they provide have been extremely useful in the design of our instruments, in particular "task relevance," "independence," "resources," and "structured dialogue."
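To make this concrete, here is a purely illustrative sketch of how responses to post-forum questionnaire items keyed to such criteria might be tabulated into a mean score per criterion; the item wording, the five-point scale, and the item-to-criterion mapping are hypothetical assumptions for the example, not the Museum of Science's actual instruments.

# Illustrative sketch only: tabulating hypothetical post-forum questionnaire
# responses by evaluation criterion. Item wording, the 1-5 agreement scale,
# and the criterion mapping are assumptions for this example.
from collections import defaultdict
from statistics import mean

# Hypothetical mapping of questionnaire items to criteria.
ITEM_CRITERION = {
    "The forum task was clearly relevant to the topic": "task relevance",
    "The facilitation felt independent and unbiased": "independence",
    "Background materials gave me what I needed": "resources",
    "The dialogue was well structured": "structured dialogue",
}

def mean_score_by_criterion(responses):
    """Average 1-5 agreement ratings for each criterion across participants."""
    scores = defaultdict(list)
    for response in responses:
        for item, rating in response.items():
            scores[ITEM_CRITERION[item]].append(rating)
    return {criterion: round(mean(ratings), 2) for criterion, ratings in scores.items()}

# Hypothetical responses from three participants (1 = strongly disagree, 5 = strongly agree).
responses = [
    {"The forum task was clearly relevant to the topic": 5,
     "The facilitation felt independent and unbiased": 4,
     "Background materials gave me what I needed": 3,
     "The dialogue was well structured": 4},
    {"The forum task was clearly relevant to the topic": 4,
     "The facilitation felt independent and unbiased": 5,
     "Background materials gave me what I needed": 4,
     "The dialogue was well structured": 5},
    {"The forum task was clearly relevant to the topic": 5,
     "The facilitation felt independent and unbiased": 3,
     "Background materials gave me what I needed": 4,
     "The dialogue was well structured": 4},
]
print(mean_score_by_criterion(responses))
# {'task relevance': 4.67, 'independence': 4.0, 'resources': 3.67, 'structured dialogue': 4.33}

A summary of this kind would sit alongside, not replace, the qualitative analysis of interviews, observations, and recordings described above.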
The shared focus of deliberative democratic theorists and museum evaluators on discourse will be fruitful ground for the documentation of both deliberative process and group learning. Within the museum field, scholars have increasingly focused on discourse in museums and on conversation as an object of study. As Allen's work shows, learning can be documented in visitor talk. Webler's work suggests a similar possibility for documenting deliberation in participant talk. Future questions could draw from a melding of these two frameworks: how does deliberation foster learning, and how does a learning environment enhance the possibility for deliberation?

References

Abelson, J., P.-G. Forest, et al. (2003). "Deliberations about deliberative methods: issues in the design and evaluation of public participation processes." Social Science & Medicine 57: 239-251.

Allen, S. (2002). Looking for Learning in Visitor Talk: A Methodological Exploration. In Learning Conversations in Museums. G. Leinhardt, K. Crowley and K. Knutson, Eds. London, Lawrence Erlbaum Associates.

Borun, M., J. Dritsas, et al. (1998). Family Learning in Museums: The PISEC Perspective. Philadelphia, PA, PISEC.

Einsiedel, E. F. (2002). "Assessing a controversial medical technology: Canadian public consultation on xenotransplantation." Public Understanding of Science 11(4): 315-331.

Falk, J. H. and L. D. Dierking (2000). Learning from Museums: Visitor Experiences and the Making of Meaning. Oxford, AltaMira Press.

Gates, C. (2001). Democracy and the Civic Museum. Museum News 80: 47-55.

Hamlett, P. W. (2003). "Technology Theory and Deliberative Democracy." Science, Technology, & Human Values 28(1): 112-140.

Hirzy, E. (2002). Mastering Civic Engagement: A Report from the American Association of Museums. In Mastering Civic Engagement: A Challenge to Museums. Washington, D.C., American Association of Museums: 9-20.

Horlock-Jones, T., J. Walls, et al. (2004). A Deliberative Future? An Independent Evaluation of the GM Nation? Public Debate about the Possible Commercialisation of Transgenic Crops in Britain, 2003.

House, E. R. and K. R. Howe (2000). "Deliberative Democratic Evaluation." New Directions for Evaluation 85: 3-12.

Joss, S. (1995). Evaluating consensus conferences: necessity or luxury? In Public Participation in Science: The Role of Consensus Conferences in Europe. S. Joss and J. Durant, Eds. London, Science Museum.

Lewenstein, B. and R. Bonney (2004). Different Ways of Looking at the Public Understanding of Research. In Creating Connections: Museums and the Public Understanding of Current Research. D. Chittenden, G. Farmelo and B. Lewenstein, Eds. Oxford, AltaMira Press.

Macoubrie, J. (2003). Conditions for Deliberation.

Pearson, G. and A. T. Young, Eds. (2002). Technically Speaking: Why All Americans Need to Know More About Technology. Washington, D.C., National Academy Press.

Pitrelli, N. (2003). "The crisis of the 'Public Understanding of Science' in Great Britain." Journal of Science Communication 2(1).

Rowe, G. and L. Frewer (2000). "Public Participation Methods: A Framework for Evaluation." Science, Technology, & Human Values 25(1): 3-29.

Sturgis, P. and N. Allum (2004). "Science in society: Re-evaluating the deficit model of public attitudes." Public Understanding of Science 13(1): 55-74.

Thelen, D. (2001). Learning Community: Lessons in Co-Creating the Civic Museum. Museum News 80.

Webler, T. (1995). "Right" Discourse in Citizen Participation: An Evaluative Yardstick. In Fairness and Competence in Citizen Participation: Evaluating Models for Environmental Discourse. O. Renn, T. Webler and P. Wiedemann, Eds. Boston, Kluwer Academic Publishers.