INTERPRETING POINT PREDICTIONS: Some Logical Issues

Charles F. Manski
Department of Economics and Institute for Policy Research
Northwestern University
October 2015, forthcoming in Foundations and Trends in Accounting
Abstract
Forecasters regularly make point predictions of future events. Recipients of the predictions may use them
to inform their own assessments and decisions. This paper integrates and extends my past analyses of several
simple but inadequately appreciated logical issues that affect interpretation of point predictions. I explain
the algebraic basis for a pervasive empirical finding that the cross-sectional mean or median of a set of point
predictions is more accurate than the individual predictions used to form the mean or median, a phenomenon
sometimes called the "wisdom of crowds." I call attention to difficulties in interpretation of point predictions
expressed by forecasters who are uncertain about the future. I consider the connection between predictions
and reality. In toto, the analysis questions prevalent prediction practices that use a single combined
prediction to summarize the beliefs of multiple forecasters.
I have benefitted from the opportunity to present this work at the December 2014 Stanford conference on
Causality in the Social Sciences, the September 2015 Cornell conference on Computational Social Science,
and in seminars at Collegio Carlo Alberto and University College London. I have also benefitted from the
comments of an anonymous reviewer.
1. Introduction
Persons, firms, and governments regularly make point predictions of future events and estimates of
past conditions. Recipients of predictions and estimates may use them to inform their own assessments and
decisions. This paper integrates and extends my past analyses of several simple but inadequately appreciated
logical issues that affect interpretation of predictions and estimates. I use the prospective term "prediction"
(or forecast) for simplicity of terminology, but the paper applies to retrospective estimates as well.
I first explain the logical basis for a pervasive empirical finding on the performance of combined
predictions of real quantities. Empirical researchers have long reported that the cross-sectional mean or
median of a set of point predictions is more accurate than the individual predictions used to form the mean
or median. This phenomenon is sometimes colloquially called the "wisdom of crowds." It has only
occasionally been recognized that these regularities have algebraic foundations. By Jensen’s inequality, the one concerning mean predictions holds whenever a convex loss function (or concave welfare function) is used to measure prediction accuracy. The one concerning median predictions holds whenever a unimodal loss or welfare function is used to measure accuracy.
I have called attention to the algebra underlying the wisdom of crowds in Manski (2010, 2011).
Here, in Section 2, I paraphrase these earlier discussions and extend them to cover weighted averages of
predictions, as advocated in Bayesian model averaging. I also caution that the algebra has limited scope of
application. In particular, I show that combining predictions of treatment response need not outperform
individual predictions when a planner makes binary treatment decisions.
The algebra in Section 2 implies nothing about the informativeness of the predictions that forecasters
provide. Predictions may imperfectly convey the expectations that forecasters hold for future events and they
may imperfectly anticipate future realities. To interpret predictions requires assumptions on the decision
processes that forecasters use to generate their predictions. Sections 3 and 4 interpret predictions under
various assumptions.
Section 3 concerns interpretation of point predictions of uncertain events. Economists commonly
assume that persons hold probabilistic beliefs about uncertain events. A point prediction at most provides
some measure of the location of a probability distribution—it cannot reveal anything about the shape of the
distribution. Users of predictions typically do not know how forecasters choose points to summarize their
beliefs. This generates an unavoidable problem in interpretation of point predictions, one that arises even
if forecasters seek to honestly convey their beliefs. Other problems, which are avoidable, arise when
researchers make logical errors in their interpretation of point predictions. A frequent error has been to use
the dispersion of point predictions across forecasters to measure the uncertainty that forecasters perceive.
Engelberg, Manski, and Williams (2009) and others have called attention to these matters. Section 3
explains.
Predicting binary outcomes is an important special case that is instructive to study in some depth.
Manski (1990) observed that a point prediction of a binary outcome at most yields a bound on the probability
that the forecaster holds for the outcome. Manski (2006) applied this simple finding to interpret the bids
in prediction markets where traders bet on occurrence of a binary outcome. I summarize here.
The analyses of Sections 2 and 3 imply no conclusions about the connection between predictions and
reality. Assessment of realism requires assumptions about the process generating predictions. Section 4
discusses two cases: unbiased and rationalizable predictions.
Predictions of a real quantity are unbiased if they are generated by a process such that the mean
prediction equals the actual value of the quantity. This assumption, often made in research that combines
forecasts, makes the wisdom of crowds a statistical rather than simply algebraic phenomenon. The
assumption has strong consequences but is rarely credible.
Rationalizable predictions are ones that follow logically from some plausible model, without requiring that the model be accurate. Such predictions pose possible futures. In the absence of knowledge of the process generating predictions, there is no logical reason to average a set of rationalizable predictions
as recommended in research on unbiased prediction. However, if predictions are formed in an adversarial
environment, one might find it reasonable to conclude that the actual value of the quantity of interest lies in
the interval between the smallest and largest prediction.
The concluding Section 5 observes that prevalent practices in prediction of future events summarize
the beliefs of forecasters in two respects. First, individual forecasters commonly provide point predictions
even though they may perceive considerable uncertainty. Second, recipients of predictions from multiple
forecasters often combine them to form the mean or median prediction. I question both practices.
2. The Algebraic Wisdom of Crowds
For over a century, researchers concerned with the accuracy of predictions have studied settings in which multiple agents are asked to give point predictions of an event and their predictions are combined. The
combined prediction is often the cross-sectional mean or median of the individual predictions. Empirical
studies have regularly found that the combined prediction is more accurate than the individual predictions
used to form it. Clemen (1989, p. 559) put it this way in a review article:
"The results have been virtually unanimous: combining multiple forecasts leads to increased forecast
accuracy. This has been the result whether the forecasts are judgmental or statistical, econometric
or extrapolation. Furthermore, in many cases one can make dramatic performance improvements
by simply averaging the forecasts."
A notable early example is Galton (1907), who presented data on the point estimates of the weight
of an ox made in a weight-judging contest. He found considerable dispersion in the estimates, but the median
estimate was close to the actual weight. Galton viewed this as a demonstration of the power of democracy,
writing (p. 450) "According to the democratic principle of 'one vote one value,' the middlemost estimate
expresses the vox populi, every other estimate being condemned as too low or too high by a majority of the
voters." He concluded (p. 451): "This result is, I think, more creditable to the trustworthiness of a democratic
judgment than might have been expected." Close to a century later, Surowiecki (2004) opened his popularscience book with a summary of Galton’s work, pointing to it as a leading early example of the so-called
"wisdom of crowds."
In fact, the empirical regularities that mean and median predictions perform better than individual
predictions are algebraic results that hold whenever the loss function measuring predictive accuracy has
certain properties, convexity and unimodality respectively. With such loss functions, the results hold
regardless of the quality of the individual predictions that are combined. Neither the mean nor median
prediction need perform well in an absolute sense. Either may be a good prediction or a terrible one.
Section 2.1 shows that, whatever the truth may be, the mean prediction performs better than the mean
performance of the individual predictions. Section 2.2 shows that the median prediction performs better than
half of the individual predictions.
2.1. Mean Predictions and Convex Loss Functions
Let (y_n, n = 1, . . . , N) be a set of point predictions of an unknown real quantity θ. Let μ_N ≡ (1/N) Σ_n y_n denote the mean prediction. Let L(·, ·): ℝ × ℝ → [0, ∞) be a loss function used to measure the consequence of prediction error. Research on forecasting has typically used absolute loss L(y, θ) = |y − θ| or square loss L(y, θ) = (y − θ)². When these or any other convex loss function is used, Jensen’s inequality gives L(μ_N, θ) ≤ (1/N) Σ_n L(y_n, θ) for all θ ∈ ℝ. Thus, whatever the actual value of the quantity being forecast, the loss occurring with the mean prediction is no larger than the mean loss of the individual predictions. This simple algebraic result applies to combination of any set of predictions, whatever their source. They could be obtained in surveys, generated by econometric models, or be clinical judgments.
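To make the inequality concrete, here is a minimal numerical check in Python. It is an illustration only, not part of the original analysis; the true value θ and the individual predictions are hypothetical numbers.

```python
import numpy as np

# Minimal check of the algebraic result: for any convex loss L, the loss of
# the mean prediction never exceeds the mean loss of the individual
# predictions, whatever the true value theta happens to be.
rng = np.random.default_rng(0)
theta = 3.7                          # hypothetical true quantity
y = rng.normal(5.0, 2.0, size=25)    # hypothetical individual predictions

mu = y.mean()                        # mean prediction

for name, loss in [("absolute", lambda e: np.abs(e)),
                   ("square",   lambda e: e ** 2)]:
    loss_of_mean = loss(mu - theta)
    mean_of_loss = loss(y - theta).mean()
    assert loss_of_mean <= mean_of_loss + 1e-12   # Jensen's inequality
    print(f"{name}: L(mean, theta) = {loss_of_mean:.4f} "
          f"<= mean loss = {mean_of_loss:.4f}")
```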
The result has long been known in statistical decision theory in the context of parameter estimation from sample data. There θ is a parameter to be estimated and (y_n, n = 1, . . . , N) are sample data. A randomized estimate draws an integer n at random from the set (1, . . . , N) and uses y_n to estimate θ. Suppose that a convex loss function is used to measure precision of estimation. Then Jensen’s inequality implies that loss using the non-randomized estimate μ_N is smaller than expected loss using the randomized estimate. See Hodges and Lehmann (1950).
The result has also been used to compare collective and individualistic decision making under
uncertainty. Manski (2010) studies decision making when N members of a population want to maximize the
same private utility function, which depends on an unknown state of nature. If agents knew the state of
nature, they would make the same decision. However, they may have different beliefs or may use different
decision criteria to cope with incomplete knowledge. Hence, they may choose different actions (y_n, n = 1, . . . , N) even though they share the same objective. Let the set of feasible actions be convex and the utility
function be concave in actions, for all states of nature. Then Jensen’s inequality implies that consensus
choice of the mean privately-chosen action yields a larger mean payoff than does individualistic decision
making, in all states of nature.
Research on combining predictions has largely appeared to be unaware of the result as it has sought
to explain why mean predictions perform better than individual predictions. A notable exception is McNees
(1992), who discussed the matter in the context of absolute and square loss. McNees observed that much
research on forecasting did not acknowledge "these simple, well-known, yet often ignored arithmetic
principles" (p. 705). Larrick and Soll (2006) have elaborated on McNees' observation, showing that
experimental subjects often do not understand what they call the "averaging principle."
The above discussion assumes that the combined prediction is a simple average of the individual predictions. The result extends immediately to weighted averages. Let (w_n, n = 1, . . . , N) be a set of non-negative weights that sum to one. Let μ_wN ≡ Σ_n w_n y_n denote the weighted mean prediction. For any convex
loss function, Jensen’s inequality gives L(μ_wN, θ) ≤ Σ_n w_n L(y_n, θ) for all θ ∈ ℝ. Thus, the loss associated with the weighted mean prediction is no larger than the weighted mean loss of the individual predictions.
Consider, for example, Bayesian model averaging (e.g., Hoeting et al., 1999). In this setting, there are N models. For model n, y_n is the mean of the posterior distribution for θ conditional on the observed data and on correctness of model n. Weight w_n is the posterior probability that one places on correctness of model n conditional on the observed data. Jensen's inequality shows that, for any convex L, the loss using the posterior model-weighted mean of θ as the point prediction is no larger than the mean loss that results if one draws a model at random from the posterior model distribution (w_n, n = 1, . . . , N) and uses the model-specific posterior mean as the point prediction.
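As an illustration of the weighted case, the following Python sketch mimics Bayesian model averaging. The model posterior probabilities w_n and model-specific posterior means y_n are hypothetical numbers chosen for illustration, not output of a fitted model.

```python
import numpy as np

# Weighted Jensen's inequality in the spirit of Bayesian model averaging.
theta = 2.0                                  # hypothetical true quantity
y = np.array([1.2, 2.5, 3.1, 1.8])           # model-specific posterior means
w = np.array([0.4, 0.3, 0.2, 0.1])           # posterior model probabilities

bma_point = np.dot(w, y)                     # model-averaged point prediction

square_loss_bma = (bma_point - theta) ** 2
# Expected loss if one instead draws a model at random from (w_n) and
# reports that model's posterior mean as the point prediction.
expected_loss_random_model = np.dot(w, (y - theta) ** 2)

assert square_loss_bma <= expected_loss_random_model
print(square_loss_bma, expected_loss_random_model)
```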
2.2. Quantile Predictions and Unimodal Loss Functions
McNees (1992) recognized that the median prediction of a real quantity must be at least as close to
the truth as at least half of the individual predictions. He wrote "it is always true that no more than half of
the individual forecasts that define a median forecast can ever be more accurate than the median forecast."
McNees' discussion of this algebraic truism was informal. I proved an extended result in Manski (2011) that
holds for all quantile predictions when L is any unimodal loss function. The derivation is short, so I
reproduce it here.
Let L(·, θ) be unimodal for all values of θ, with L(θ, θ) = 0 and L(t, θ) > 0 for t ≠ θ. Let α ∈ (0, 1), let P_N(y ≤ t) ≡ (1/N) Σ_n 1[y_n ≤ t] denote the empirical probability of the event {y ≤ t}, and let q(α, N) ≡ inf [t: P_N(y ≤ t) ≥ α] denote the α-quantile of P_N. Consider use of q(α, N) as the combined prediction for θ. If q(α, N) = θ, then P_N[L(y, θ) ≥ L(q(α, N), θ)] = P_N[L(y, θ) ≥ 0] = 1. If q(α, N) < θ, then P_N[L(y, θ) ≥ L(q(α, N), θ)] ≥ P_N(y ≤ q(α, N)) ≥ α. If q(α, N) > θ, then P_N[L(y, θ) ≥ L(q(α, N), θ)] ≥ P_N(y ≥ q(α, N)) ≥ 1 − α. Hence, P_N[L(y, θ) ≥ L(q(α, N), θ)] ≥ min(α, 1 − α), whatever the actual value of θ may be.
When α = ½, this result becomes P_N[L(y, θ) ≥ L(q(½, N), θ)] ≥ ½ for all θ. In words, the loss from using q(½, N) to predict θ is less than or equal to the loss from using y_n for at least half of the individual predictions (y_n, n = 1, . . . , N). This formalizes McNees' observation, quoted above.
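A small Python check of the α = ½ case follows. The loss function below is a hypothetical unimodal, non-convex choice and the predictions are simulated, so this is an illustration rather than a proof; the proof is the algebra above.

```python
import numpy as np

# Under a unimodal loss, the median prediction does no worse than at least
# half of the individual predictions, whatever the true theta.
rng = np.random.default_rng(1)
theta = 0.5
y = rng.normal(0.0, 3.0, size=101)

def loss(t, th):
    # unimodal in t with minimum 0 at t = th, but not convex
    return np.sqrt(np.abs(t - th))

med = np.median(y)   # equals q(1/2, N) for odd N
frac_no_better = np.mean(loss(y, theta) >= loss(med, theta))
assert frac_no_better >= 0.5
print(f"fraction of individual predictions doing no better: {frac_no_better:.2f}")
```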
2.3. Choice Between a Status Quo Treatment and an Innovation
The algebraic results of Sections 2.1 and 2.2 are powerful but they are not universal, resting on
convexity and unimodality of the loss function. Whereas unimodality is a fairly innocuous property for a
loss function, convexity is a strong property that does not hold in many realistic decision problems. I show
here that the finding of Section 2.1 does not hold in a simple setting with non-convex loss.
Suppose that a person must choose between two actions, a status quo option and an innovation. The status quo option has a known real value, which I normalize to zero for simplicity. The innovation has an unknown value θ that may be positive or negative. Thus, it is optimal to choose the innovation if θ ≥ 0 and to choose the status quo if θ ≤ 0. Utility from making the optimal choice is max(0, θ).
Not knowing θ, suppose that one uses a point prediction to choose between the status quo and the innovation. Let x denote the prediction. Suppose that one chooses the innovation if x > 0 and the status quo if x ≤ 0. The resulting utility is U(x, θ) = 0 if x ≤ 0 and θ if x > 0. Hence, U(x, θ) = θ·1[x > 0]. This utility function is not concave. Equivalently, the loss function L(x, θ) = −U(x, θ) = −θ·1[x > 0] is not convex.
As in Section 2.1, let (y_n, n = 1, . . . , N) be a collection of point predictions of θ and let μ_N ≡ (1/N) Σ_n y_n denote the mean prediction. Using μ_N to make the choice yields utility θ·1[μ_N > 0]. Using a randomly drawn prediction to make the choice yields expected utility (1/N) Σ_n θ·1[y_n > 0] = θ·P_N(y > 0), where P_N(y > 0) ≡ (1/N) Σ_n 1[y_n > 0] is the fraction of positive predictions. Hence, the difference in expected utility between the two decision criteria is θ·{1[μ_N > 0] − P_N(y > 0)}.
The two criteria yield the same utility θ if all N predictions are positive and the same utility 0 if all
N predictions are non-positive. They yield different expected utility if some predictions are positive, some non-positive, and θ ≠ 0. Consider such a case.
If θ > 0, use of the mean prediction to make the choice performs better than use of a randomly drawn prediction if μ_N > 0 but performs worse if μ_N ≤ 0. Contrariwise, if θ < 0, the mean prediction performs better than a randomly drawn prediction if μ_N ≤ 0 but performs worse if μ_N > 0. Thus, a decision maker who does not know whether θ is positive or negative cannot determine whether use of the mean prediction or a random prediction yields higher expected utility.
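The following Python sketch instantiates this with hypothetical numbers: a prediction set whose mean is negative even though most of the predictions, and the true θ, are positive.

```python
import numpy as np

# With the non-convex loss L(x, theta) = -theta * 1[x > 0], the mean
# prediction can do worse than a randomly drawn individual prediction.
y = np.array([-5.0, 1.0, 1.0, 1.0])    # predictions: one outlier drags the mean
mu = y.mean()                          # -0.5, so the mean says "status quo"

def utility(x, theta):
    # choose the innovation iff x > 0; status quo value is normalized to 0
    return theta if x > 0 else 0.0

theta = 2.0                            # suppose the innovation is truly good
u_mean = utility(mu, theta)                         # 0.0
u_random = np.mean([utility(x, theta) for x in y])  # theta * P_N(y > 0) = 1.5
print(u_mean, u_random)                # the mean prediction does worse here
```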
3. Interpreting the Point Predictions of Uncertain Forecasters
Professional forecasters regularly give point predictions of uncertain future events. A notable
example is Congressional Budget Office scoring of pending legislation. A score is a ten-year-ahead point
prediction of the impact that the legislation would have on the federal debt. CBO scores are not accompanied
by measures of uncertainty. This is so even though legislation often proposes complex changes to federal
law, whose budgetary implications must be difficult to foresee. See Manski (2013) for further discussion.
Similarly, financial analysts offer point predictions of the profit that firms will earn in the quarter
ahead. Macroeconomic forecasters give point predictions of GDP growth and inflation. A notable exception
to point prediction is that meteorologists report the percent chance that it will rain during the next hour, day,
or week. However, they typically give point predictions of future temperatures.
The appropriate interpretation of point predictions depends on what forecasters actually believe and
what they choose to communicate. Point predictions cannot reveal anything about the uncertainty that
forecasters perceive. They at most convey some notion of the central tendency of beliefs. The discussion
in this section assumes that forecasters are neutral scientists or clinicians who do not "spin" their predictions;
they simply seek to be informative. Section 4 will consider settings in which this assumption may not hold.
3.1. Point Predictions Derived From Probabilistic Forecasts
There are two senses in which point predictions may derive from probabilistic forecasts. First,
economists regularly assume that persons use subjective probability distributions to express uncertainty about
future events. Supposing that forecasters have subjective probability distributions for the events they predict,
it would be most informative to have forecasters provide their subjective distributions. There is precedent
for this. For example, the Survey of Professional Forecasters elicits probabilistic forecasts of inflation and
GDP growth from macroeconomic forecasters.
An increasing number of household surveys elicit
probabilistic expectations of various future events from large samples of respondents; see Manski (2004).
Nevertheless, point prediction remains commonplace.
Second, forecasters often use stochastic models to generate predictions. Such models assume that
future events are determined by observed and unobserved state variables. Conjecture of a probability
distribution on the unobserved variables conditional on the observed ones yields a model-based objective
probability distribution for future events. When forecasting is performed in this manner, it would be most
informative to have forecasters report the probability distribution for the future event of interest. Forecasters
sometimes do report probabilistic predictions based on stochastic models but it is common to report only a
point prediction.
Point predictions should somehow be related to subjective or model-based probability distributions.
But how? Forecasters may report the means of their distributions—their best point predictions under square
loss. Or they may report medians—best point predictions under absolute loss. However, forecasters
typically are not asked to report means or medians. They are simply asked to "predict" or "forecast" the
outcome.
In the absence of explicit guidance, forecasters may report different distributional features as their
point predictions. Some may report means, while others report medians or modes. Still others, applying
asymmetric loss functions, may report non-central quantiles of their probability distributions. Research calling attention to and analyzing the potential heterogeneity of response practices includes Elliott, Komunjer, and Timmermann (2005, 2008), Engelberg, Manski, and Williams (2009), and Clements (2009).
Heterogeneous reporting practices are consequential for the interpretation of point predictions.
Forecasters who hold identical probabilistic beliefs may provide different point predictions, and forecasters
with dissimilar beliefs may provide identical point predictions. If so, comparison of point predictions across
forecasters is problematic. Variation in predictions need not imply disagreement among forecasters, and
homogeneity in predictions need not imply agreement.
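To illustrate, the following Python sketch gives three forecasters the identical right-skewed subjective distribution (a hypothetical lognormal) and has them report different distributional features; their point predictions differ although their beliefs are identical.

```python
from scipy import stats

# Three forecasters share one subjective distribution but report
# different features of it as their "point prediction."
belief = stats.lognorm(s=0.8, scale=2.0)   # one shared subjective distribution

mean_report = belief.mean()      # best point prediction under square loss
median_report = belief.median()  # best under absolute loss
q25_report = belief.ppf(0.25)    # best under an asymmetric loss

print(mean_report, median_report, q25_report)
# Identical beliefs, three different point predictions: dispersion here
# reflects reporting practice, not disagreement.
```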
Researchers in macroeconomics, finance, and accounting have sometimes interpreted cross-forecaster
dispersion in point predictions to indicate disagreement in their beliefs. See, for example, Diether, Malloy,
and Scherbina (2002) and Mankiw, Reis, and Wolfers (2003). If forecasters vary in the way they transform
probability distributions into point predictions, this interpretation confounds variation in forecaster beliefs
with variation in the manner that forecasters make point predictions.
A distinct, and more severe, interpretative problem is the longstanding use of cross-sectional
dispersion in point predictions to measure forecaster uncertainty about future outcomes. See, for example,
Cukierman and Wachtel (1979), Levi and Makin (1979, 1980), Makin (1982), Brenner and Landskroner
(1983), Hahm and Steigerwald (1999), Hayford (2000), Giordani and Söderlind (2003), Johnson (2004),
Barron, Stanford, and Yu (2009), and Güntay and Hackbarth (2010). This research practice is suspect on
logical grounds, even if all forecasters make their point predictions in the same way. Even in the best of
circumstances, point predictions provide no information about the uncertainty that forecasters perceive. This
point has been made forcefully by Zarnowitz and Lambros (1987). Nevertheless, researchers have continued
to use the dispersion in point predictions to measure forecaster uncertainty.
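A trivial Python illustration of the logical problem: forecasters who share one diffuse subjective distribution and all report its median produce zero cross-sectional dispersion despite large perceived uncertainty. The numbers are hypothetical.

```python
import numpy as np

# Suppose every forecaster holds the same diffuse belief about theta,
# say N(0, 100), and every forecaster reports its median.
point_predictions = np.zeros(50)       # fifty identical reports of 0.0

dispersion = point_predictions.std()   # 0.0: no disagreement is visible
subjective_sd = 10.0                   # yet each forecaster is very uncertain

print(dispersion, subjective_sd)
# Zero dispersion in point predictions coexists with large perceived
# uncertainty, so dispersion cannot measure uncertainty.
```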
3.2. Prediction of Binary Outcomes
Forecasters are often asked to predict binary outcomes. Respondents to household surveys predict
personal events such as purchase of a consumer durable, job loss, or childbirth. Physicians predict treatment
outcomes. Professional forecasters are asked to predict the onset of a recession or the result of an election.
Predicting binary outcomes is an important special case of general point prediction that is instructive to study
in some depth.
The idea that point predictions of binary outcomes should be related to probability distributions was
suggested early on by Juster (1966). Considering the case in which consumers are asked to give a point
prediction of their buying intentions (buy or not buy), Juster wrote (page 664):
"Consumers reporting that they 'intend to buy A within X months' can be thought of as saying that
the probability of their purchasing A within X months is high enough so that some form of 'yes'
answer is more accurate than a 'no' answer."
Thus, he hypothesized that a consumer facing a yes/no intentions question responds as would a statistician
asked to make a best point prediction of a binary event.
Some logical implications of Juster’s idea were later worked out in Manski (1990). Let y and θ be binary variables denoting the prediction and the subsequent outcome respectively. Thus, y = 1 if a forecaster states that the outcome will occur and y = 0 if he states that it will not occur. Similarly, θ = 1 if the outcome actually occurs and θ = 0 if it does not.
Suppose that one observes y and assumes that it is a best point prediction of the outcome, in the sense of minimizing expected loss. A best point prediction depends on the losses a forecaster associates with the two possible prediction errors, (y = 0, θ = 1) and (y = 1, θ = 0). Whatever the loss function, the best point prediction will satisfy the condition
(1)  y = 1 ⇒ Q(θ = 1|s) ≥ p,
     y = 0 ⇒ Q(θ = 1|s) ≤ p.
Here s denotes the information possessed by the forecaster when making the prediction and Q(θ = 1|s) is the probability he places on outcome θ = 1. The value of p ∈ [0, 1] depends on the loss function that the forecaster uses to form the best point prediction. This formalizes the idea originally stated by Juster (1966).
A researcher who does not know a forecaster's threshold p can conclude nothing about his probability
distribution. On the other hand, (1) shows that y bounds the probability given knowledge of p. In some
settings, a researcher may find it credible to assume that the forecaster uses a symmetric loss function to form
the point predictions. Any symmetric loss function implies that p = ½.
Suppose that N forecasters offer predictions, say (y_n, n = 1, . . . , N). As in Section 2, let μ_N ≡ (1/N) Σ_n y_n denote the mean prediction. Each prediction takes the value 0 or 1, so μ_N is the fraction of forecasters who predict that the event will occur. One should be careful not to interpret μ_N as a consensus "probability" that the event will occur. The value of μ_N has no obvious interpretation if the N forecasters use heterogeneous thresholds (p_n, n = 1, . . . , N) to make their point predictions. If one finds it credible to assume that all forecasters use the same known threshold p, then μ_N is the fraction of forecasters who place probability p or greater on occurrence of the event. Even in this case, μ_N is not interpretable as a probability.
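The following Python sketch simulates rule (1) with hypothetical beliefs and heterogeneous thresholds, showing that the fraction predicting the event need not equal the mean subjective probability.

```python
import numpy as np

# A binary point prediction reveals only that the forecaster's subjective
# probability lies on one side of his threshold p_n.
rng = np.random.default_rng(2)
N = 1000
beliefs = rng.beta(2.0, 5.0, size=N)         # hypothetical Q_n(theta = 1 | s)
thresholds = rng.uniform(0.1, 0.5, size=N)   # hypothetical heterogeneous p_n

predictions = (beliefs >= thresholds).astype(int)   # rule (1)

mu = predictions.mean()          # fraction predicting the event
mean_belief = beliefs.mean()     # the "consensus probability" one might imagine
print(mu, mean_belief)
# With heterogeneous thresholds, mu need not equal the mean subjective
# probability, so it should not be read as a consensus probability.
```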
The above results have implications for interpretation of the bids placed in prediction markets where traders bet on occurrence of a binary outcome. Let p be the price of a bet on the event {θ = 1} and 1 − p be the price of a bet on {θ = 0}. Thus, a dollar bet on {θ = 1} returns 1/p dollars if this event occurs and 0 otherwise, while a dollar bet on {θ = 0} returns 1/(1 − p) if this event occurs and 0 otherwise. If traders maximize expected utility, observation that a trader bets on {θ = 1} reveals that his probability for this event is at least p, while observation that the trader bets on {θ = 0} reveals that his probability for {θ = 1} is no more than p. Thus, observation of the fraction of traders who bet on occurrence of the event reveals the
fraction of them who place probability p or greater on occurrence of the event.
The above reasoning holds however the price p is determined in a prediction market. Suppose that
p is determined in competitive equilibrium. Researchers have often called such a price the "market
probability" that the event will occur and have specifically asserted that it equals the mean probability that
traders place on the event. Manski (2006) shows that this interpretation is generally unfounded. The
equilibrium price is a certain quantile of the cross-sectional distribution of probabilities if all traders are risk
neutral and bet the same amount. More generally, interpretation of p depends jointly on the risk preferences
and wealth of the traders.
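Here is a minimal simulation in the spirit of Manski (2006). It assumes risk-neutral traders who each bet one dollar and bet on {θ = 1} iff their subjective probability exceeds the price; the market-clearing condition used below (price equals the fraction of bets on the event) is a simplifying assumption for illustration, and the belief distribution is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
q = rng.beta(2.0, 5.0, size=10_000)    # hypothetical trader beliefs

def excess_demand(p):
    # fraction of (equal-sized) bets on {theta = 1} minus the price
    return np.mean(q > p) - p

# bisect for the market-clearing price p* solving p = P_N(q > p)
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if excess_demand(mid) > 0 else (lo, mid)
p_star = (lo + hi) / 2

print(p_star, q.mean())
# The clearing price is a quantile-type statistic of the belief distribution
# and generally differs from the mean belief, the so-called "market probability."
```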
4. Connecting Predictions to Reality
The algebra in Section 2 and the analysis in Section 3 are silent about the connection between
predictions and reality. Predictions alone imply nothing about reality. To draw conclusions requires
assumptions that relate observed predictions to reality.
Strong assumptions yield strong conclusions but may lack credibility. A leading case is the
assumption that predictions are unbiased, discussed in Section 4.1. Weaker assumptions yield weaker
conclusions but may be more credible. With this in mind, Section 4.2 discusses rationalizable predictions.
4.1. Unbiased Predictions: The Statistical Wisdom of Crowds
Suppose again that (y_n, n = 1, . . . , N) is a set of point predictions of an unknown real quantity θ.
Researchers frequently assume that individual predictions are generated by a sampling process that makes
them unbiased estimates of è. For example, Bates and Granger (1969) state near the beginning of their
widely cited article (p. 451): "It should be noted that we impose one condition on the nature of the individual
forecasts, namely that they are unbiased."
This assumption implies that all weighted averages of the predictions are themselves unbiased estimates of θ. Bates and Granger consider weighting to minimize the mean square error of a combined prediction.
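For concreteness, a minimal sketch of variance-minimizing weights for two unbiased forecasts follows. The closed form used is the standard one for this problem; the error variances and covariance are hypothetical numbers.

```python
# Variance-minimizing combination in the spirit of Bates and Granger (1969):
# for two unbiased forecasts with error variances s1, s2 and error covariance
# c, the weight on forecast 1 minimizing the combined-forecast variance is
# (s2 - c) / (s1 + s2 - 2c).
s1, s2, c = 4.0, 1.0, 0.5   # hypothetical error variances and covariance

w1 = (s2 - c) / (s1 + s2 - 2 * c)
w2 = 1.0 - w1
combined_var = w1**2 * s1 + w2**2 * s2 + 2 * w1 * w2 * c

print(w1, combined_var)     # combined variance is at most min(s1, s2)
```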
The assumption that predictions are unbiased is often mentioned in popular discussions of the
wisdom of crowds. For example, the Wikipedia entry on the subject states (Wikipedia, 2014):
"An intuitive and often-cited explanation for this phenomenon is that there is idiosyncratic noise
associated with each individual judgment, and taking the average over a large number of responses
will go some way toward canceling the effect of this noise."
This statistical argument for the wisdom of crowds, informally invoking the Law of Large Numbers, is
distinct from and should not be confused with the algebra of Section 2.
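A minimal Python illustration of this statistical argument: when predictions really are θ plus independent zero-mean noise (the unbiasedness assumption is doing all the work), the mean prediction concentrates near θ as N grows. All numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
theta = 10.0

for N in (5, 50, 5000):
    y = theta + rng.normal(0.0, 3.0, size=N)   # unbiased noisy predictions
    print(N, abs(y.mean() - theta))            # error tends to shrink with N
```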
Unfortunately, researchers and others who assume that predictions are unbiased generally do not
comment on the credibility of the assumption. For example, Bates and Granger provide no rationale. They
just pose the assumption and maintain it. It may perhaps be possible to generate persuasive arguments for
unbiasedness of predictions of certain well-understood physical phenomena. However, I am unaware of
arguments that make it credible to assume unbiasedness of predictions of economic and other social
outcomes.
It should be mentioned that unbiased prediction of a binary outcome is logically impossible except in the special case where forecasters always make correct predictions. Suppose that θ can only take the value zero or one. A point prediction y can similarly only take the value zero or one. Let P(y|θ) denote the cross-sectional probability distribution of predictions conditional on the actual value of θ. Unbiasedness means that E(y|θ = 0) = 0 and E(y|θ = 1) = 1. This equation holds only if P(y = 0|θ = 0) = P(y = 1|θ = 1) = 1.
4.2. Rationalizable Predictions
Game theorists define rationalizable strategies to be ones that choose actions which are optimal given
some beliefs about the actions of other rational players, without requiring that the beliefs be accurate. With
some semantic license, I will similarly say that a rationalizable prediction of a future event is one that follows
logically from some plausible model, without requiring that the model be accurate.
When forecasters offer predictions as subjective judgments, without accompanying explanation, it
may be difficult for recipients of predictions to determine whether they are rationalizable. On the other hand,
recipients may be able to assess predictions derived from stochastic models that are adequately described in
research articles or other media. When a recipient determines that a prediction is rationalizable, he can
conclude that the prediction describes a possible future, albeit not necessarily the actual future.
Suppose that a recipient determines that a set (y_n, n = 1, . . . , N) of point predictions are rationalizable, each being derived from a plausible model. How might he use this information? In the absence of knowledge of the process generating predictions, there is no logical reason to form a simple or weighted average of the predictions, as is recommended in research on unbiased prediction. The recipient can only conclude that (y_n, n = 1, . . . , N) pose N possible futures.
There may be circumstances in which a recipient has knowledge that enables him to credibly draw
a stronger conclusion. In Section 3, I assumed that forecasters are neutral persons who do not spin their
predictions. Contrariwise, suppose that controversy surrounds the event being predicted and that some
forecasters want to influence recipients of predictions. In an adversarial environment, forecasters who want
recipients to believe that θ is small or large may respectively provide low or high predictions, each asserting
that their prediction is accurate. I have elsewhere called this practice "dueling certitudes" (Manski, 2013,
Chapter 1).
In an adversarial environment, forecasters who want to encourage low or high beliefs about θ may refrain from providing extreme predictions that lack credibility. They may instead seek to provide the lowest and highest predictions that are rationalizable. If so, the recipient of N rationalizable predictions can conclude that the actual value of θ lies in the interval [min(y_1, . . . , y_N), max(y_1, . . . , y_N)].
5. Conclusion
Prevalent practices in prediction of future events summarize the beliefs of forecasters in two respects.
First, individual forecasters commonly provide point predictions even though they may perceive considerable
uncertainty, which may be expressed through a subjective or model-based probability distribution. Second,
recipients of predictions from multiple forecasters often combine them to form the mean or median
prediction.
This paper questions both practices. Using point predictions to summarize probability distributions
enables the recipients of predictions to observe only a small and ambiguous part of what forecasters perceive
about the future. It would be more informative for forecasters to give probabilistic predictions.
Using the mean or median prediction of a collection of forecasters to summarize their beliefs has a
logical foundation in the algebraic wisdom of crowds if the loss function is convex or unimodal, but algebra
implies nothing about the connection of the combined prediction to reality. The assumption that predictions
are unbiased provides a statistical rationale for averaging predictions, but one that typically lacks credibility.
It generally is more informative to scrutinize the set of predictions that forecasters provide than to combine
them.
References
Barron, O., M. Stanford, and Y. Yu (2009), "Further evidence on the relation between analysts’ forecast
dispersion and stock returns," Contemporary Accounting Research, 26, 329-357.
Bates, J. and C. Granger (1969), "The Combination of Forecasts," Operational Research Quarterly, 20, 451-468.
Brenner, M. and Y. Landskroner (1983), "Inflation Uncertainties and Returns on Bonds," Economica, 50,
463-468.
Clemen, R. (1989), "Combining Forecasts: A Review and Annotated Bibliography," International Journal
of Forecasting, 5, 559-583.
Clements, M. (2009), "Internal Consistency of Survey Respondents' Forecasts: Evidence Based on the
Survey of Professional Forecasters," in The Methodology and Practice of Econometrics, J. Castle and N.
Shephard (editors), Oxford: Oxford University Press.
Cukierman, A. and P. Wachtel (1979), "Differential Inflationary Expectations and the Variability of the Rate
of Inflation: Theory and Evidence," American Economic Review, 69, 595-609.
Diether, K., C. Malloy, and A. Scherbina (2002), "Differences of Opinion and the Cross Section of Stock
Returns," The Journal of Finance, 57, 2113-2141.
Elliott, G., I. Komunjer, and A. Timmermann (2005), "Estimation and Testing of Forecast Rationality under
Flexible Loss," Review of Economic Studies, 72, 1107-1125.
Elliott, G., I. Komunjer, and A. Timmermann (2008), "Biases in Macroeconomic Forecasts: Irrationality or
Asymmetric Loss?" Journal of European Economic Association, 6, 122-157.
Engelberg, J., C. Manski, and J. Williams (2009), "Comparing the Point Predictions and Subjective Probability Distributions of Professional Forecasters," Journal of Business and Economic Statistics, 27, 30-41.
Galton, F. (1907), "Vox Populi," Nature, 75, 450-451.
Giordani, P. and P. Söderlind (2003), "Inflation Forecast Uncertainty," European Economic Review, 47,
1037-1059.
Güntay, L. and D. Hackbarth (2010), "Corporate Bond Credit Spreads and Forecast Dispersion," Journal of
Banking & Finance, 34, 2328-2345.
Hahm, J. and D. Steigerwald (1999), "Consumption Adjustment under Time-Varying Income Uncertainty,"
Review of Economics and Statistics, 81, 32-40.
Hayford, M. (2000), "Inflation Uncertainty, Unemployment Uncertainty, and Economic Activity," Journal
of Macroeconomics, 22, 315-329.
Hoeting, J., D. Madigan, A. Raftery, and C. Volinsky (1999), "Bayesian Model Averaging: A Tutorial,"
Statistical Science, 14, 382-417.
Hodges, J. and E. Lehmann (1950), "Some Problems in Minimax Point Estimation," Annals of Mathematical
Statistics, 21, 182-197.
Johnson, T. (2004), "Forecast dispersion and the cross section of expected returns," Journal of Finance, 59,
1957-1978.
Juster, T. (1966), "Consumer Buying Intentions and Purchase Probability: An Experiment in Survey Design,"
Journal of the American Statistical Association, 61, 658-696.
Larrick, R. and J. Soll (2006), "Intuitions about Combining Opinions: Misappreciation of the Averaging
Principle," Management Science, 52, 111-127.
Levi, M. and J. Makin (1979), "Fisher, Phillips, Friedman and the Measured Impact of Inflation on Interest,"
Journal of Finance, 34, 35-52.
Levi, M. and J. Makin (1980), "Inflation Uncertainty and the Phillips Curve: Some Empirical Evidence,"
American Economic Review, 70, 1022-1027.
Lochner, L. (2007), “Individual Perceptions of the Criminal Justice System,” American Economic Review,
97, 440-460.
Makin, J. (1982), "Anticipated Money, Inflation Uncertainty, and Real Economic Activity," Review of
Economics and Statistics, 64, 126-134.
Mankiw, G., R. Reis, and J. Wolfers (2003), "Disagreement about Inflation Expectations," NBER
Macroeconomics Annual 18, Chicago: University of Chicago Press, 209-248.
Manski, C. (1990), "The Use of Intentions Data to Predict Behavior: A Best Case Analysis," Journal of the
American Statistical Association, 85, 934-940.
Manski, C. (2004), "Measuring Expectations," Econometrica, 72, 1329-1376.
Manski, C. (2006), "Interpreting the Predictions of Prediction Markets," Economics Letters, 91, 425-429.
Manski, C. (2010), "When Consensus Choice Dominates Individualism: Jensen’s Inequality and Collective
Decisions under Uncertainty," Quantitative Economics, 1, 187-202.
Manski, C. (2011), "Interpreting and Combining Heterogeneous Survey Forecasts," in M. Clements and D.
Hendry (editors), Oxford Handbook on Economic Forecasting, Oxford: Oxford University Press, 457-472.
Manski, C. (2013), Public Policy in an Uncertain World, Cambridge, MA: Harvard University Press.
McNees, S. (1992), "The Uses and Abuses of 'Consensus' Forecasts," Journal of Forecasting, 11, 703-710.
Surowiecki, J. (2004), The Wisdom of Crowds, New York: Random House.
Wikipedia (2014), Wisdom of the Crowd,
en.wikipedia.org/wiki/Wisdom_of_the_crowd#cite_note-WoC-Combi-1, accessed August 19, 2014.
Zarnowitz, V. and L. Lambros (1987), "Consensus and Uncertainty in Economic Prediction," Journal of
Political Economy, 95, 591-621.