In Press, Journal of Risk Research

Making sense of uncertainty: Advantages and disadvantages of providing an evaluative structure

Nathan F. Dieckmann
Decision Research, Eugene, Oregon

Ellen Peters
Department of Psychology, Ohio State University, Columbus, Ohio

Robin Gregory
Decision Research, Eugene, Oregon

Martin Tusler
Department of Psychology, Ohio State University, Columbus, Ohio

Correspondence concerning this article should be addressed to Nathan F. Dieckmann, Decision Research, 1201 Oak Street, Suite 200, Eugene, OR 97401. Telephone (541) 485-2400, fax (541) 485-2403. E-mail: ndieckmann@decisionresearch.org

Acknowledgements

We would like to acknowledge the generous support of the National Science Foundation that made this work possible: NSF Award #0725025 to Decision Research (Robin Gregory, PI) and NSF Award #0925008 to Decision Research (Nathan Dieckmann, PI). All views expressed in this paper are those of the authors alone.

Abstract

In many decision contexts, there is uncertainty in the assessed probabilities and expected consequences of different actions. The fundamental goal for information providers is to present uncertainty in a way that is not overly complicated, yet sufficiently detailed to prompt decision makers to think about the implications of this uncertainty for the decision at hand. In two experiments, we assess the pros and cons of providing an evaluative structure to facilitate the comprehension and use of uncertainty information and explore whether people who vary in numeracy perceive and use uncertainty in different ways. Participants were presented with scenarios and summary tables describing the anticipated consequences of different environmental-management actions. Our results suggest that different uncertainty formats may lead people to think in particular ways. Lay people had an easier time understanding the general concept of uncertainty when an evaluative label was presented (e.g., uncertainty is High or Low).
However, when asked about a specific possible outcome for an attribute, participants performed better when presented with numerical ranges. Our results also suggest that there appear to be advantages to using evaluative labels in that they can highlight aspects of uncertainty information that may otherwise be overlooked in more complex numerical displays. However, the salience of evaluative labels appeared to cause some participants to put undue weight on this information, which resulted in value-inconsistent choices. The simplicity and power of providing an evaluative structure is a double-edged sword.

Keywords: uncertainty; decision analysis; risk communication; ambiguity; evaluability.

Word count: 6,992

Introduction

In many decision contexts, there is uncertainty in the assessed probabilities and expected consequences of different events. It is important that these uncertainties are communicated to decision makers in a manner that is easily evaluated and integrated into the decision-making process. Good communication should facilitate a thorough deliberation about the risks and benefits of different options so that decision makers can make informed, value-consistent choices. The fundamental issue for information providers is deciding how to present uncertainty in a way that is not overly complicated (which can cause some users to misunderstand or ignore uncertainty), yet sufficiently detailed to prompt decision makers to think about the implications of uncertainty for the decision at hand. There are several alternatives for communicating uncertainty, which include numbers (e.g., probabilities, ranges, or distributions), verbal probability statements (e.g., highly unlikely), and evaluative labels (which might characterize uncertainty as “High” or “Low”; Peters et al. 2009).
We focus on two common presentation formats: numerical uncertainty ranges and evaluative labels. In two experiments, we assess the pros and cons of providing an evaluative structure to facilitate the comprehension and use of uncertainty information and explore whether people who vary in numeracy perceive and use uncertainty in different ways. Participants were presented with scenarios and summary tables describing the anticipated consequences of different environmental-management actions. These decision problems were designed to mimic typical deliberative decision-making contexts.

1.1 Evaluative Labels

In general, decision makers will tend to use information to the extent that they can map it to some evaluative scale (e.g., good/bad; Hsee et al. 1999). Some pieces of information will be easier to evaluate because salient comparisons can be made from experience (e.g., costs) or with other choice options in the decision context (e.g., alternative 1 costs twice as much as alternative 2). However, unfamiliar numeric information can often be difficult to evaluate, and may, consequently, be misinterpreted or underweighted in the decision-making process. Overly complex representations of uncertainty often fall into this category of unfamiliar, not easily evaluable numeric information. Thus, providing an evaluative context could help decision makers make better use of this unfamiliar information.

There are many real-world examples of simplifying evaluative structures being used to increase the understanding and use of uncertainty information. In the weather forecasting domain, color codes are often used to distinguish different levels of uncertainty in weather forecasts (e.g., Joslyn et al. 2005). Presumably, color coding simplifies the comparison of different forecasts and improves comprehension. In the intelligence domain, evaluative verbal labels have been used to communicate analytic uncertainty.
In a recent National Intelligence Estimate (NIE) on Iran’s nuclear intentions and capabilities, analysts used (albeit inconsistently) verbal evaluative labels to express analytic confidence in their estimates and assessments (i.e., High confidence, Medium confidence, or Low confidence). In the environmental risk management domain, legal mandates for increased public participation have amplified the role of deliberative processes in risk management decisions. Risk managers and decision makers need to communicate effectively with a variety of stakeholders including the lay public. Consequently, verbal evaluative labels are often used to simplify communication about the uncertainty associated with proposed management actions. However, to our knowledge there has been relatively little experimental work focusing on how these evaluative structures affect the perception and use of uncertainty information by end users.

In the health communication domain, evaluative labels have been shown to improve the use of unfamiliar numeric information (Peters et al. 2009). In one study, participants were presented with a hospital judgment task in which different hospitals (i.e., the alternatives) were described by several different hospital quality indices (i.e., the attributes). When these quality indicators were presented on numeric scales only (e.g., Ease of getting a referral on a 0–100 point scale), decision makers gave the information less weight than when the attributes were also marked with evaluative labels that classified the score as Poor, Fair, Good, or Excellent. These results were found to be most consistent with an affective explanation. Specifically, the affect derived from the manipulation appeared to either “spotlight” particular information for use in choice, or to directly motivate choice.
In another study, interpretive labels for test results (the test came back “positive” or “abnormal”) induced larger changes to risk perceptions and behavioral intentions than did numeric results alone (Zikmund-Fisher et al. 2007).

1.2 The Use of Numerical Ranges and Evaluative Labels

Based on Slovic’s (1972) concreteness principle, decision makers will tend to use information in the form in which it is provided. This principle likely also applies to uncertainty information. Numerical uncertainty ranges and evaluative labels may highlight different aspects of uncertainty. As an example, consider Figure 1, which displays a range of different management alternatives along with the key evaluation criteria. In this case, the focus is on the effects of different environmental-management strategies on costs and the local bird population. This type of presentation is commonly referred to as a consequence matrix (Keeney 1982) or facts box, and is extensively used as a decision aid to facilitate individual or group decision making in a variety of domains (e.g., Clemen 2004; Gregory et al. forthcoming). Its purpose is to clarify different alternatives and illustrate how the options differ on important attributes. Clarifying the primary tradeoffs is thought to allow decision makers to integrate their personal values and make an informed decision.

[Figure 1 near here]

In Figure 1, the confidence (or uncertainty) in the estimated bird population is described by a range as well as an evaluative label.¹ In terms of assessing the general confidence in estimating the bird population across the options, both the range and evaluative label are informative and are correlated with each other. As the confidence range increases in width, the evaluative label changes accordingly (e.g., from high confidence to low). The general uncertainty is clearly represented with the evaluative labels, thus simplifying comparison between the options.
Making this same comparison using only the numeric ranges is more difficult given that the range widths need to be compared. The numeric range, however, provides additional information that the evaluative label does not: namely, what the values of the bird population could be given the underlying uncertainty (i.e., the upper and lower bounds of the range). Thus, different uncertainty formats may make different aspects of uncertainty more or less salient, which could affect the information used in the decision-making process. This also has implications for the use of multiple uncertainty presentations. It has been suggested that communicating uncertainty in multiple ways will aid understanding and use (e.g., Budescu, Broomell, and Por 2009). For example, a decision maker could focus on the format that is best suited to their needs. However, in some situations, access to multiple pieces of information can be confusing and can lead to poorer decision making (Schwartz 2004; Peters et al. 2007).

1.3 Numeracy

People’s comfort with and ability to use numerical information may also affect how well they understand and how they use uncertainty information. For instance, numeracy has been shown to affect how much people trust numeric information (Gurmankin, Baron, and Armstrong 2004) and how they respond to different numeric probability formats (e.g., Dieckmann, Slovic, and Peters 2009; Peters, Hart, and Fraenkel 2011). Numeracy is also associated with the use of other more easily evaluated information (e.g., mood or narrative information) in the presence of numeric information that is difficult to evaluate (Dieckmann, Slovic, and Peters 2009; Peters et al. 2009). Less numerate consumers were also helped to a greater extent when evaluative labels were presented in the hospital choice experiment discussed above.
1.4 Hypotheses

We focus on testing four main hypotheses about how lay people perceive and use numeric uncertainty ranges (e.g., Best estimate: 5000; Confidence range: 3000–7000) and evaluative labels (e.g., High or Low). Basic comprehension is important. For instance, lay people should be able to compare different alternatives with respect to attribute uncertainty, and to identify the different possible values of an attribute given that uncertainty exists. However, based on the concreteness principle, we expect that different uncertainty formats will facilitate the comprehension of particular aspects of uncertainty.

H1: Lay people will be better able to compare options with respect to attribute uncertainty when the evaluative labels are present, and they will be better able to identify different possible values of an attribute when the numeric range is present. We expect more numerate participants to show higher comprehension.

Second, information that is perceived as relevant and easy to use will be more likely to be used in decision making. Because numerical uncertainty ranges are difficult to evaluate for many lay people, we expect that they will perceive the evaluative labels as easier to use.

H2: Lay people will find the uncertainty information easier to use when the evaluative labels are present.

Third, the goal of structuring a decision with a device such as a consequence table is to facilitate the understanding of relevant tradeoffs and to help a decision maker make a value-consistent choice. Thus, choice is a primary dependent variable. As we report in a companion paper (Gregory et al. 2011), lay people, as opposed to experts, tended to focus on evaluative labels, even when presented with a numerical uncertainty range. We expect the evaluative labels to be an important determinant of choice.

H3: Participants will be sensitive to the evaluative labels and will tend to choose the option with the most favorable evaluative label (i.e., the highest confidence).
Evaluative labels provide a very salient source of information for a lay consumer. However, given how easy it is to evaluate the options with respect to the labels, there is a danger that people will choose a simple decision strategy even though it may not result in the best choice given their values.

H4: The salience of the evaluative labels will lead some participants into making choices that do not align with their values.

2.0 General Methods

2.1 Sample

Participants (N = 367) were randomly drawn from the Decision Research web panel subject pool. Demographically, the panel members are similar to the US population but differ in three areas: they are younger, better educated, and have a higher proportion of females than the US population as a whole. The mean age of the sample was 40.35 years (range = 19–76 years), and 65.1% of the sample was female. Approximately 22.3% had a high school education or less, 31.9% had attended some college or vocational school, 30% were college graduates, and 15.8% had advanced degrees.

2.2 Procedure

The two experiments reported here were conducted over the web at the same time. Participants were presented with short scenarios and consequence tables describing different environmental-management problems. Each participant responded to the scenarios below in the same order. At the beginning of the experimental session, participants were presented with a short written tutorial that described the different parts of a consequence table. The goal of this tutorial was to ensure that participants had a working understanding of the consequence table comparable to actual decision makers participating in a deliberative decision-making group. Participants then answered three questions about a consequence table that outlined a choice between three different vacation locations. In a previous session, participants completed a measure of numerical ability that was adapted from existing measures (Weller et al. forthcoming).
3.0 Study 1

3.1 Scenario

Figure 2 shows the scenario and consequence table for Study 1. The three alternatives were structured such that the best estimate of the saved moose population was inversely related to the confidence in the best estimate (i.e., the higher the best estimate of the population, the less confidence associated with the estimate). The average cost per moose was constant across the alternatives.

[Figure 2 near here]

3.2 Design and Measures

The format of the confidence information was manipulated in three between-subjects conditions²: (1) numerical range (shown in Figure 1), (2) evaluative label only (e.g., Low, Medium, High), and (3) a combined condition with both the range and an evaluative label. After reading the scenario, participants were asked several questions about the consequence table. First, they were asked to choose a management option. Next, they were asked two comprehension questions: (1) In which option do scientists have the least amount of confidence in the estimated saved moose population? and (2) For which option is a final saved population of 8,500 most likely? Finally, they were asked two questions about how difficult it was to use the consequence table: (1) How easy or hard was it to use the confidence information presented in the consequence table? (1 = very easy to 4 = very hard) and (2) How much effort did it take you to make your choice? (1 = very little effort to 4 = a lot of effort).

3.3 Results and Discussion

3.3.1 Comprehension

Logistic regression models were used to compare performance on the two comprehension questions between the format conditions. The first comprehension question asked participants to identify the option for which scientists had the least amount of confidence in the best estimate.
As expected, participants showed poorer performance when a range was presented alone as compared to the other two conditions (Percent correct: Range = 47.0%, Label only = 83.8%, Combined = 79.4%), B = –1.62, p < .001, Odds Ratio = .20. The label only and combined conditions were not significantly different in the percentage of correct responses, B = .29, p = .38, Odds Ratio = 1.34.

The second comprehension question asked participants for which option the final saved population of 8,500 would be most likely. In this case, we were asking about the implication of uncertainty for the possible values of the moose population. As expected, participants provided the correct response (option #1) less often in the label only condition as compared to the other two conditions (Percent correct: Range = 92.2%, Label only = 52.3%, Combined = 83.7%), B = –1.96, p < .001, Odds Ratio = .14. In addition, participants in the range only condition showed significantly better performance than in the combined condition, B = –.83, p = .04, Odds Ratio = .44, suggesting that the presence of the label hurt performance for some participants.

The continuous numeracy score for each participant was added to the logistic regression model to test for numeracy main effects and interactions with condition. For the first comprehension question, more numerate respondents were better than the less numerate in answering correctly in all conditions, B = .40, p < .001, Odds Ratio = 1.49 (Median split percentage correct for more and less numerate: Range = 64.3% versus 30.5%, Label only = 88.4% versus 80.9%, Combined = 87.9% versus 72.0%). Although this effect appeared to be strongest in the range only condition, the interaction between numeracy and condition was not significant (p = .51).
The more numerate also performed better than the less numerate on the second comprehension question across conditions, B = .22, p = .007, Odds Ratio = 1.24 (Range = 91.1% versus 93.2%, Label only = 60.5% versus 47.1%, Combined = 89.4% versus 78.7%). Although this effect appeared to be stronger when the label was present, the interaction between numeracy and condition was not significant (p = .34).

3.3.2 Perceptions of use

The label only condition was perceived as easier to use than the numeric range only, F(1,360) = 4.46, p = .04, d = .27, and the combined conditions, F(1,360) = 10.69, p = .001, d = .42. Participants also reported using less effort in the label only condition than in the range only, F(1,362) = 4.40, p = .04, d = .28, and combined conditions, F(1,362) = 7.22, p = .007, d = .34. More numerate participants perceived the uncertainty information to be slightly easier to use (r = –.14, p = .007) and to require less effort to make a choice (r = –.11, p = .03) across the conditions. There were no significant interaction effects.

3.3.3 Choice

Figure 3 shows the choice results for each format condition. Multinomial logistic regression models were used to test for differences in choice proportions between the conditions. Examining the pattern of choices across the conditions revealed that the range only condition was significantly different from the label and combined conditions, χ²(2) = 22.73, p < .001, whereas the label and combined conditions did not significantly differ from each other, χ²(2) = 1.43, p = .49. In the label and combined conditions, participants were more likely to choose option #2 (highest confidence, lowest best estimate), Wald(1) = 13.25, p < .001, and option #3 (compromise option), Wald(1) = 15.54, p < .001, as compared to option #1 (lowest confidence, highest best estimate). Option #1 (lowest confidence, highest best estimate) was the most frequent choice in the range only condition.
The continuous numeracy score for each participant was added to the model to test for numeracy main effects and interactions with condition. Numeracy was not a significant predictor of any choice as a main effect nor did it significantly interact with condition.

[Figure 3 near here]

As expected, participants were more likely to choose the option with the highest confidence or the compromise option when an evaluative label was present. The similarity in the pattern of choices for the evaluative label only and the combined condition suggests that lay people tended to use the evaluative labels more than the numeric range (also see companion paper, Gregory et al. 2011). In the range only condition, the majority of the participants chose the option with the highest best estimate but the lowest confidence (largest range). The fewest chose the option with the lowest best estimate but highest confidence (tightest range).

There are several potential explanations of the pattern of choices in the range only condition. First, it is possible that a substantial proportion of participants did not understand the range information, ignored it, and just based their choice on the best estimate. This would explain the large increase in the number of people that chose the lowest confidence option in this condition. However, it is also possible that participants used the range information in a way other than just recognizing that the size of the range was an indication of the overall uncertainty (which is necessary to understand the tradeoff in the choices). For instance, they could have focused on the ends of the range (e.g., with option #1 there is a chance of saving 9,000 moose, much higher than the other options). Although we cannot identify the specific strategies that people used, we can explore one potential explanation by identifying participants who did not demonstrate a basic understanding of the range information.
We separated people who answered the first comprehension question correctly in the range-only condition (e.g., “In which option do scientists have the least amount of confidence?”) from those who did not; recall that 47% answered this question correctly in this condition. We may expect that people who responded incorrectly would be unable to identify the main tradeoff in the options, and may be more likely to ignore the range information or use some other strategy. People who responded correctly, on the other hand, were presumed to be more likely to understand the tradeoff and choose in a manner similar to the participants who had the labels present. There was a significant difference in choice proportions in the range only condition between people who responded incorrectly to this question (Option #1 = 67.2%, Option #2 = 13.1%, Option #3 = 19.7%) and those who responded correctly (Option #1 = 44.4%, Option #2 = 18.5%, Option #3 = 37.0%). As expected, those who answered incorrectly were more likely to choose option #1, B = .94, p = .02, Odds ratio = 2.56 (supporting the notion that incorrect responders may have been more likely to ignore the uncertainty), and those who responded correctly were more likely to choose option #3, B = –.88, p = .04, Odds ratio = .42. However, even those people who answered correctly showed a similar pattern of choices across the different conditions as the total sample (i.e., more choices of option #1 and fewer choices of options #2 and #3 in the range condition). In other words, the different pattern of choices in the range condition does not appear to be the result only of people not understanding that the width of a range is related to the confidence in the estimate.

3.3.4 Discussion

Consistent with H1, people were better able to identify the option with the greatest amount of attribute uncertainty when an evaluative label was provided.
But they were better able to identify the relative likelihood of a particular value with a range. People may be best off in the combined condition because they can focus on the representation that can be most easily used to answer the question at hand. Although not statistically significant, more numerate participants did appear to be more flexible with uncertainty by being better able to answer the comprehension question with the format least suited for the job (i.e., comparisons of general confidence with the range, and identifying the probability of particular outcomes in the label only).

Consistent with H3, people tended to choose options with greater confidence (either the option with the tightest range or the compromise option) when the evaluative labels were present. This makes sense given that the highest confidence option has the superior label, and a simple decision strategy would be to just choose that option. In this choice context, the evaluative labels also help to make the tradeoffs clear, potentially leading more people to choose the compromise option. In other words, participants appeared to weigh the overall uncertainty more when this aspect of uncertainty was made salient through the use of evaluative labels. Interestingly, as we have reported elsewhere (Gregory et al. 2011), people tended to choose in a similar manner when the label and range were present. This was true for both more and less numerate participants.

In addition, consistent with H2, both the more and less numerate found the label only condition to be the easiest to use. This was true even when comparing the label only to the combined condition. Given this difference in perceived difficulty, it was surprising that the choice proportions did not significantly differ between the combined and label only conditions. The decision strategies used in the range only condition are more difficult to identify.
We do show, however, that they are explained in part (but not completely) by people not understanding that the width of the range is related to the uncertainty in estimating an attribute. In sum, both the range and the evaluative labels appear to have advantages. The results suggest that the label, in particular, was a strong and easily evaluated marker for decision makers. In the next study, we replicate these effects in a different decision context and explore the potential unintended consequences of evaluative labels.

4.0 Study 2

Evaluative labels provide a salient source of information for a lay consumer. In Study 1, we demonstrated that the evaluative labels were perceived as easy to use and that they appeared to be used even in the presence of a numerical uncertainty range. However, given how easy it is to evaluate the options based on the labels, there is a danger that people will choose a simple decision strategy based on the labels even though it may not result in the best choice given their values. In this experiment, we examined whether the presence of evaluative labels could lead some participants into making choices that do not align with their values.

4.1 Scenario

Figure 4 shows the scenario and consequence table for Study 2. The three options were structured such that higher costs were associated with less uncertainty. The best estimate of the salmon population was constant across the options. Thus, we might expect a person with strong values favoring economic over environmental goals to choose option #3 because the cost savings are the highest. A person with strong values favoring environmental over economic goals might choose option #1 because experts have the most confidence in the best estimate (meaning that it is unlikely that the salmon population will be very low, or the probability of extinction is unlikely to be very high, depending on condition).
[Figure 4 near here]

4.2 Design and Measures

This experiment was a 2 × 2 between-subjects design. There were two different formats of the confidence information³: (1) numerical range (shown in Figure 2), and (2) combined condition with a range plus evaluative label (Excellent, Good, Poor, respectively). The confidence information was placed on either the best estimate of the population (shown in Figure 2), or the assessed probability of extinction. Because the results were substantively identical between these placement conditions, we pooled the sample across them and report results based only on the format difference.

Participants were asked several questions about the consequence table. First, they were asked to choose a management option. They were then asked two comprehension questions: (1) In which option do scientists have the least amount of confidence in the estimated salmon population [probability of extinction]? and (2) In the best case scenario, which option would result in the most salmon? Finally, they were asked two questions about how easy or hard it was to use the consequence table: (1) How easy or hard was it to use the confidence information presented in the consequence table? (1 = very easy to 4 = very hard) and (2) How much effort did it take you to make your choice? (1 = very little effort to 4 = a lot of effort).

Two weeks prior to participating in this experiment, participants were given a short survey about attitudes and values concerning economic versus environmental goals. The following four survey questions were averaged to create a values index relating to economic versus environmental goals (alpha = .77, mean inter-item correlation = .46), with the items recoded such that higher scores indicated stronger positive attitudes towards environmental concerns:
(1) In general, do you think societies should focus more on environmental or more on economic issues?
(Strongly prefer environmental – Strongly prefer economic)
(2) In general, would you be more likely to support Project A that created jobs in your area but threatened songbird populations, or would you rather support Project B that did not affect songbird populations and did not create jobs in your community? (Strongly prefer A – Strongly prefer B)
(3) In general, would you be more willing to support Project A that reduced your monthly electricity bill but also threatened songbirds, or would you rather support Project B that did not affect songbird populations and did not reduce your monthly electric bills? (Strongly prefer A – Strongly prefer B)
(4) In general, would you be more willing to support Project A that increased your monthly electricity bill but protected endangered salmon, or would you rather support Project B that did not increase your monthly electric bill but did not protect endangered wild salmon? (Strongly prefer A – Strongly prefer B)

4.3 Results and Discussion

4.3.1 Comprehension

Logistic regression models were used to compare performance on the two comprehension questions between the format conditions. The first comprehension question asked participants to identify the option for which scientists had the least amount of confidence in the best estimate. As expected, participants tended to be better at doing this in the combined as compared to the range only condition (Percent correct: Range = 79.3%, Combined = 84.6%), although this difference was not statistically significant, B = –.36, p = .28, Odds ratio = .70. The second comprehension question asked participants which option would result in the most salmon in the best-case scenario. As expected, based on the Study 1 results, participants answered correctly that it was option #3 more often when the range was presented alone despite the same numbers being available in both conditions (Percent correct: Range = 50.5%, Combined = 36.2%), B = .59, p = .03, Odds ratio = 1.80.
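As a rough check on how the odds ratios for the two Study 2 comprehension questions relate to the percent-correct figures, the following minimal sketch recomputes them from the reported percentages. It assumes a logistic model with format condition as the only predictor, so that the odds ratio equals the ratio of the two conditions' odds of a correct answer; the helper names `odds` and `odds_ratio` are illustrative and are not taken from the original analysis.

```python
import math

def odds(p):
    """Convert a proportion correct into the odds of a correct response."""
    return p / (1.0 - p)

def odds_ratio(p_a, p_b):
    """Odds of a correct response in condition A relative to condition B."""
    return odds(p_a) / odds(p_b)

# Question 1 (least confidence): range only 79.3%, combined 84.6%
or_q1 = odds_ratio(0.793, 0.846)
# Question 2 (best-case scenario): range only 50.5%, combined 36.2%
or_q2 = odds_ratio(0.505, 0.362)

# Under the single-predictor assumption, the logistic coefficient B
# is the natural log of the odds ratio.
b_q1 = math.log(or_q1)
b_q2 = math.log(or_q2)

print(f"Q1: odds ratio = {or_q1:.2f}, B = {b_q1:.2f}")
print(f"Q2: odds ratio = {or_q2:.2f}, B = {b_q2:.2f}")
```

Under these assumptions the recomputed values agree, after rounding, with the reported odds ratios of .70 and 1.80 and the coefficients of –.36 and .59.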
The continuous numeracy score for each participant was added to the logistic regression model to test for numeracy main effects and interactions with condition. For the first comprehension question, the numeracy main effect was not significant nor did numeracy interact with condition, Odds ratio = .99, p = .98 (Range = 80.0% versus 78.4%; Combined = 87.3% versus 82.6%). The more numerate performed better on the second comprehension question across the conditions, B = .38, p < .001, Odds ratio = 1.46, (Range = 65.0% versus 33.3%; Combined = 50.9% versus 25.3%). Numeracy did not significantly interact with condition, Odds ratio = 1.08, p = .65.

4.3.2 Perceptions of use

The confidence information was rated as similar in difficulty in the range only and combined conditions, F(1,237) = 1.17, p = .28, d = .14. Participants also did not report any differences in the perceived effort of making the choice between the conditions, F(1,235) = 0.19, p = .89, d = .03. Although across the conditions more numerate participants found the uncertainty information easier to use (η² = .02, p = .02) and perceived less effort in making their choices (η² = .02, p = .02), these effects were small, and there were no significant interaction effects.

4.3.3 Choice

Multinomial logistic regression models were used to test for differences in choice proportions between the conditions. Examining the pattern of choices between the conditions revealed that the range only condition was significantly different from the combined condition, χ²(2) = 16.91, p < .001. In the combined condition (as compared to the range only), participants were more likely to choose option #1 (highest confidence, highest cost), Wald(1) = 15.14, p < .001, and option #2 (compromise option), Wald(1) = 8.46, p = .004, as compared to option #3 (lowest confidence, lowest cost). This pattern of results is consistent with what was found in Study 1.
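The p-values attached to the choice test statistics above can be reproduced from the chi-square distribution. The following is a minimal sketch rather than the authors' code: it assumes the Wald statistics are referred to a chi-square distribution with 1 degree of freedom (as the notation Wald(1) suggests), and it uses the closed-form upper-tail probabilities that exist for 1 and 2 degrees of freedom so that only the standard library is needed. The function names are illustrative.

```python
import math


def chi2_upper_tail_df1(x: float) -> float:
    """P(X > x) for a chi-square variable with 1 degree of freedom.
    A chi-square(1) variable is a squared standard normal, so the tail
    probability equals erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))


def chi2_upper_tail_df2(x: float) -> float:
    """P(X > x) for a chi-square variable with 2 degrees of freedom.
    Chi-square(2) is an exponential distribution with mean 2."""
    return math.exp(-x / 2.0)


# Test statistics as reported in the text.
print(f"Overall test, chi2(2) = 16.91: p = {chi2_upper_tail_df2(16.91):.5f}")
print(f"Option #1, Wald(1) = 15.14:    p = {chi2_upper_tail_df1(15.14):.5f}")
print(f"Option #2, Wald(1) = 8.46:     p = {chi2_upper_tail_df1(8.46):.5f}")
```

Under these assumptions the first two p-values fall below .001 and the third rounds to .004, matching the values reported for the choice analysis.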
As in Study 1, one possible reason that more people chose the economical (and most uncertain) option #3 in the range only condition is that they did not understand that the width of the range provides information about the confidence in the estimate. To address this potential explanation we removed people who did not correctly identify that option #3 had the lowest amount of confidence. The results remained substantively the same as above, providing evidence that even though people could recognize which option had the most confidence in the range condition, they still were more likely to choose option #3 (lowest confidence, lowest cost).

[Figure 5 near here]

To examine the effect of numeracy, we tested main effects and interactions with condition in separate logistic regression models for each choice option. Less numerate participants were more likely to choose option #1 across conditions, B = –.17, p = .05, Odds ratio = .85. The interaction with condition was not significant (Odds ratio = 1.26, p = .20). More numerate participants, in contrast, were more likely to choose the compromise option #2, B = .18, p = .02, Odds ratio = 1.12, and numeracy significantly interacted with condition, B = –.32, p = .04, Odds ratio = .73. The numeracy effect appeared to be stronger in the combined condition. In the combined condition, the more numerate were more likely to choose the compromise option (63.0% versus 41.7%, based on median split), whereas in the range condition the effect was smaller and opposite in direction (43.3% versus 48.0%, based on median split). The more numerate in the combined condition chose option #1 less often than the less numerate (22.2% versus 41.7%). We speculate that the less numerate were more likely to use a simple strategy of choosing the option with the most favorable evaluative label, whereas the label caused the tradeoff structure to be more salient for the more numerate, prompting them to choose the compromise option.
To test H4, we focused on the extent to which people made value-consistent choices. This experiment was designed such that people with strong economic values should choose option #3 (lowest confidence, lowest cost) and people with strong environmental values should choose option #1 more often (highest confidence, highest cost). Participants with average value index scores of 1.0–3.0 were classified as economic leaning, and those with index scores from 4.0–6.0 were classified as environmental leaning. Participants with an average value index score between 3.0–4.0 (3.5 being the midpoint of the scale indicating indifference) were classified as neutral. Figure 6 shows the choice proportions for the economic, neutral, and environmental-leaning participants.

We used logistic regression to examine the pattern of choices for the environmentally and economically friendly options, with condition and value index group as the predictors. As hypothesized, the environmental-leaning group was more likely to choose the environmentally friendly option (#1) across conditions as compared to the neutral and economic leaning groups, B = .63, p = .06, Odds ratio = 1.87. The neutral and economic leaning groups did not differ from each other, B = –.32, p = .41, Odds ratio = .73. There was also a significant interaction between uncertainty condition and value group (contrast on value group was environmental vs. average of economic and neutral), B = 1.40, p = .05, Odds ratio = 4.05, suggesting that the difference in the proportion of environmentally friendly choices between the conditions was not equivalent across the value groups.
For the environmentally leaning group, the proportion of environmentally friendly choices was very similar across the conditions, whereas for the neutral and economically leaning groups the presence of the evaluative labels increased the proportion of environmentally friendly choices (see Note 4).

As hypothesized, the economic and neutral groups were more likely to choose the economically friendly option (#3) across the conditions as compared to the environmentally leaning group, B = –.99, p = .02, Odds ratio = .37. Again, the economic and neutral groups did not differ, B = –.22, p = .55, Odds ratio = .81. For all of the groups there was a similar decrease in the proportion of economically friendly choices in the evaluative label condition. Thus, the interaction between condition and value group was nonsignificant, B = .35, p = .71, Odds ratio = 1.42. For the compromise option (#2), both the main effect of value index group (p = .41) and the interaction between condition and value group (p = .67) were nonsignificant.

For the economically leaning group, in particular, the addition of the evaluative label increased the likelihood of making a choice that was inconsistent with their stated values. This demonstrates an unintended consequence of making some pieces of information very easy to evaluate. Presumably, a subset of the economically leaning participants chose option #1 just because there was a label saying that this was the better option in terms of confidence. However, a favorable confidence label does not necessarily mean that this is the option that best fits a person’s values and goals.

[Figure 6 near here]

4.3.4 Discussion

As in Study 1, participants were more likely to choose the option with the highest confidence when presented with the range and evaluative labels as opposed to the range alone. They were also more likely to choose the lowest confidence, lowest cost option with the range only. Unlike Study 1, however, the labels had a different influence depending on numeracy.
In particular, less numerate individuals were more likely to choose the option with the most favorable label, whereas the more numerate were more likely to choose the compromise option.

In contrast to Study 1, participants were substantially better able to identify the option with the lowest confidence in the range only condition. This may be due to a practice effect from Study 1 to Study 2. However, the overall pattern of effects was consistent across the studies, in that participants were better able to identify the option with the lowest confidence when the label was present, and were better able to identify the option with the highest salmon population in the best-case scenario with the range presented alone. This latter result implies that the presence of the evaluative label somehow interferes with the use of the confidence range to answer these types of questions; more information is not always better (Peters et al. 2007).

We also demonstrate the power of using manipulations like evaluative labels to increase the saliency of uncertainty information. Consistent with H4, we found that people with economically leaning values, who should prefer the option with the best outcomes in terms of costs, were drawn to choose the option that was least favorable in terms of costs. This highlights an unintended consequence of using evaluative labels.

5.0 General Discussion

In many decision contexts, there is substantial uncertainty in estimating the attributes associated with different options. Uncertainty must be represented in some manner such that end users can understand the implications of this uncertainty for decision making. However, laypeople can have difficulty making sense of common representations of uncertainty. Presenting numerical representations along with evaluative labels (verbal or otherwise) has been suggested as a way to simplify the comprehension and use of uncertainty information.
Our results suggest that laypeople are sensitive to evaluative labels, even in the presence of numerical uncertainty ranges. These findings are consistent with previous work on providing evaluative structure to facilitate the use of unfamiliar numerical information (Peters et al. 2009). Although we did not explicitly test the mechanisms behind this effect in this paper, we speculate, based on previous research, that participants derived affective meaning from the evaluative labels that either focused attention on the numerical uncertainty information or acted as a direct motivator of choice (i.e., caused participants to choose the options with the most favorable label). Thus, there appear to be advantages of using evaluative labels in that they can highlight aspects of uncertainty information that may otherwise be overlooked in more complex numerical displays.

However, putting an evaluative label on a particular element of a display may signal to a layperson that this element is paramount to the decision at hand. Thus, even though this element may not be the most important factor with respect to a decision maker’s values, the perceived importance (e.g., “They wouldn’t have made this information element so prominent if it wasn’t important”) may cause people to put undue weight on it. As demonstrated in Study 2, this may be particularly true for the less numerate. The simplicity and power of evaluative labels is a double-edged sword.

There are several important issues that need attention when designing any evaluative structure. The first is that someone needs to decide how to define the evaluative structure. In other words, by what process do we decide what evaluative label to assign to an attribute level? What are the cutoffs for high, medium, and low amounts of uncertainty and how many categories are necessary? These decisions will likely have consequences for choice and the individuals or groups who have vested interests in one option over another.
Another important decision is deciding to what aspect of uncertainty to apply the evaluative structure. Evaluative labels could, in principle, be attached to any aspect of uncertainty that was deemed important. For instance, a “poor-excellent” evaluative structure could be applied to the amount of uncertainty in assessing an outcome or probability, or to some other implication of uncertainty like the presence of low-probability, high-magnitude events. These types of decisions will be made by the experts responsible for developing the decision aid or risk communication. The important point is that these decisions be made explicitly. There may be a temptation to avoid a careful consideration of these issues, but if a coherent evaluative structure is not provided consumers will try to create one for themselves that may or may not acknowledge uncertainty.

Choosing an uncertainty format depends greatly on what you want people to know. One guiding principle for experts is that uncertainty should be presented at the level of detail needed by consumers. In particular, experts should think about how they are presenting the general uncertainty concept (e.g., with evaluative labels) and the states of the world that may come about because of this uncertainty (e.g., range of possible outcomes). Our results show that people have an easier time understanding the general concept of uncertainty when an evaluative label, or conceptually equivalent marker of relative uncertainty, is present. However, when asked about a specific outcome that could happen, ranges provide the needed detail, and conceptually, graphs or more detailed representations of uncertainty distributions could provide the same information.

Providing certain types of information may lead people to think in particular ways.
For instance, the presence of evaluative labels may prompt people to think in terms of overall uncertainty to support a decision, and to ignore potentially important other ways of looking at uncertainty like examining upper and lower values of a distribution. Our choice results suggest that this is the case. Because evaluative labels are easier to digest, they will probably be the focus of the decision strategies used by many laypeople. So, although they can be helpful, they should be used with caution in situations where the different possible values of a parameter are important.

All static uncertainty displays likely have limits as to how well they can communicate the important implications of uncertainty. This may be particularly true given that laypeople do not generally have experience with explicit uncertainty presentations and, consequently, often evaluate this information in idiosyncratic ways. Perhaps more effort should be focused on providing targeted instruction about the important implications of uncertainty within a particular decision context. Of course, these types of instruction aids will only be possible in communication contexts where the attention of the consumer can be captured for more than a few brief moments.

In conclusion, laypeople will tend to use information that is more easily evaluated as the basis for reasoning and decision making. Uncertainty information is particularly difficult to evaluate and communicators need to think hard about how they are presenting uncertainty and how different representations can lead to different decision-making strategies. Both ranges (and other methods that show the possible outcomes due to uncertainty) and providing an evaluative structure are important tools. The best presentation method really depends on what you want people to know and what types of decision strategies you want to prompt.

Notes

1.
The evaluative labels in this example are not probability statements but labels that provide an evaluative structure to help a decision maker understand the goodness or badness of a quantity or quantity range to facilitate comparisons between different options. Thus, this approach differs from adding verbal probability statements to numerical ranges (e.g., Budescu, Broomell, and Por 2009). In this approach, probabilistic belief in a proposition is communicated with an imprecise verbal probability phrase (e.g., moderately likely), and a range of probability is provided to give decision makers a better sense of this imprecision.

2. There were two additional manipulations within each main condition. The range information was displayed with two different layouts (triangular shape without the hyphen, or horizontal with hyphen), and the evaluative labels consisted of two different labeling schemes (High, Medium, Low versus Excellent, Good, Poor). We have averaged these together because there were no significant differences within each main condition. Additional information is available from the first author upon request.

3. There was one additional between-subjects formatting condition that was not implemented properly in the experimental materials. The results from this group are not reported here.

4. We speculate that the evaluative labels did not increase choices of this option in the environmentally leaning group because of the substantial proportion of the group that chose this option with numerical information only (because it was the most environmentally sensitive whether the labels were present or not). Due to the structure of the task, however, we focus on the choices of the economically leaning group where the presence of a favorable evaluative label was attached to an option that they would otherwise not be expected to choose based on their stated values.

References

Budescu, David V., S. Broomell, and H. H. Por. 2009.
Improving communication of uncertainty in the reports of the Intergovernmental Panel on Climate Change. Psychological Science 20: 299–308.

Clemen, Robert T. 2004. Making hard decisions: An introduction to decision analysis. 4th ed. Belmont, CA: Duxbury.

Dieckmann, Nathan F., Paul Slovic, and Ellen Peters. 2009. The use of narrative evidence and explicit likelihood by decision makers varying in numeracy. Risk Analysis 29: 1473–88.

Gregory, Robin, Lee Failing, Michael Harstone, Graham Long, Tim McDaniels, and Don Ohlson. Forthcoming. Structured decision making: A practical guide to environmental management choices. New York: Wiley-Blackwell.

Gregory, Robin, Nathan F. Dieckmann, Ellen Peters, Lee Failing, Graham Long, and Martin Tusler. 2011. Deliberative disjunction: Expert and public understanding of outcome uncertainty. Manuscript submitted for publication.

Gurmankin, A.D., J. Baron, and K. Armstrong. 2004. The effect of numerical statements of risk on trust and comfort with hypothetical physician risk communication. Medical Decision Making 24: 265–71.

Hsee, C.K., G. Loewenstein, S. Blount, and M.H. Bazerman. 1999. Preference reversals between joint and separate evaluations of options: A review and theoretical analysis. Psychological Bulletin 125: 576–90.

Joslyn, Susan, David W. Jones, Karla Schweitzer, John Pyles, and Patrick Tewson. 2005. Designing tools for uncertainty estimation in naval weather forecasting. In Proceedings of the Seventh International NDM Conference, ed. Jan Maarten C. Schraagen. Amsterdam, The Netherlands.

Keeney, Ralph L. 1982. Decision analysis: An overview. Operations Research 30: 803–38.

Peters, Ellen, Nathan F. Dieckmann, Anna Dixon, Judith H. Hibbard, and C.K. Mertz. 2007. Less is more in presenting quality information to consumers. Medical Care Research and Review 64: 169–90.

Peters, Ellen, Nathan F. Dieckmann, Daniel Västfjäll, C.K. Mertz, Paul Slovic, and Judith H. Hibbard. 2009.
Bringing meaning to numbers: The impact of evaluative categories on decisions. Journal of Experimental Psychology: Applied 15: 213–27.

Peters, Ellen, P. Sol Hart, and Liana Fraenkel. 2011. Informing patients: The influence of numeracy, framing, and format of side-effect information on risk perceptions. Medical Decision Making 31: 432–36.

Schwartz, Barry. 2004. The paradox of choice: Why more is less. New York: HarperCollins.

Slovic, Paul. 1972. Information processing, situation specificity, and the generality of risk-taking behavior. Journal of Personality and Social Psychology 22: 128–34.

Weller, Joshua A., Nathan F. Dieckmann, Martin Tusler, C.K. Mertz, and Ellen Peters. Forthcoming. Development and testing of an abbreviated numeracy scale: A Rasch Analysis approach. Journal of Behavioral Decision Making.

Wilson, Edwin B. 1927. Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association 22: 209–12.

Zikmund-Fisher, B.J., A. Fagerlin, K. Keeton, and P.A. Ubel. 2007. Does labeling prenatal screening test results as negative or positive affect a woman’s responses? American Journal of Obstetrics and Gynecology 197 (5): 528e1–6.

Figure Captions

Figure 1. Example consequence table with numerical uncertainty range and evaluative label describing uncertainty in the best estimate of a bird population.

Figure 2. Scenario and consequence table for Study 1 – the range condition. Note. Consequence table shows the range condition. In the verbal only condition, the range was replaced with an evaluative label (shown in brackets), and in the combined condition a verbal label was added below the range.

Figure 3. Study 1: Choice results by uncertainty format condition (N = 367). Note. Confidence intervals for the proportions were calculated using Wilson’s method (Wilson, 1927). Option #1: Lowest confidence, highest best estimate. Option #2: Highest confidence, lowest best estimate. Option #3: Compromise.

Figure 4.
Scenario and consequence table for Study 2 – numerical range condition. Note. Consequence table shows the range condition with an uncertainty range for the population estimate. In the combined condition, a verbal label was added below the range.

Figure 5. Study 2: Choice results by uncertainty format condition (N = 236). Note. Confidence intervals for the proportions were calculated using Wilson’s method (Wilson, 1927). Option #1: Highest confidence, highest cost. Option #2: Compromise. Option #3: Lowest confidence, lowest cost.

Figure 6. Study 2: Choice results by uncertainty condition for participants with different expressed values. Note. Economic (n = 103); Neutral (n = 66); Environmental (n = 64). Confidence intervals for the proportions were calculated using Wilson’s method (Wilson, 1927). Option #1: Most environmentally friendly. Option #3: Most economically friendly.
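For readers who want to reproduce the style of interval shown in the choice figures, the sketch below computes a Wilson score interval for a single proportion. It is an illustration and not the authors' code: the function name is hypothetical, the 95% level is an assumption (the captions do not state the confidence level), and the counts in the usage example are invented for demonstration and are not data from either study.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.959963984540054):
    """Wilson score interval for a binomial proportion.

    z is the standard normal quantile; the default corresponds to a
    two-sided 95% interval (an assumption, not stated in the captions).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must lie between 0 and n")

    p_hat = successes / n
    z2 = z * z
    denom = 1.0 + z2 / n
    # The interval is centred on a value pulled from p_hat toward 0.5.
    center = (p_hat + z2 / (2.0 * n)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1.0 - p_hat) / n + z2 / (4.0 * n * n)
    )
    return center - half_width, center + half_width


# Hypothetical example: 30 of 100 participants choose a given option.
lower, upper = wilson_interval(30, 100)
print(f"proportion = 0.30, interval = ({lower:.3f}, {upper:.3f})")
```

Unlike the simple normal-approximation interval, the Wilson interval stays within 0 and 1 and behaves sensibly for proportions near the boundaries, which is one common reason for preferring it when plotting choice proportions for small groups.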