>> Host: My name is [indiscernible] and I run the EETALK series for
MSR. And these are a set of talks that we do regularly, pretty
regularly, twice a month or so. Just to reach out to everyone in
Microsoft, the product groups in particular, the business groups. And
it’s mostly talking about technologies, interesting trends, and
innovations coming from MSR that we think have a broader impact. This
of course has a broader impact to all of us. And it’s a very
interesting talk. So David is new to Microsoft, he came from Yahoo
Research just recently. We have a whole new lab in New York City that
is a new Microsoft research lab, all housing about 15 researchers?
>> David Rothschild: That’s correct.
>> Host: Several of whom are some economists, some behavioral
psychologists and some machine learning people. Yeah, that’s the sort
of expertise there. And they are all applied; you know they have been
working for various lengths at Yahoo. David in particular has been
there only a year at Yahoo.
>> David Rothschild: That’s correct.
>> Host: And prior to that got his PhD from the Wharton School in
applied economics. With that I will hand it over to David, take it
away, thank you.
>> David Rothschild: Excellent. Well thank you guys and thank you guys
for coming today. It has been a pleasure being at Microsoft. Of
course we have only been here for a few months but it has been working
out very nicely, so. Our entire lab is excited about it.
Today’s talk is about forecasting, in particular focusing in the end on
the 2012 election, but really on the idea of forecasting for economics
and related topics as well as politics. And I will get into that in a
second.
But when it comes to elections this is big business, upwards of 10
billion dollars are going to be spent on the 2012 election cycle. So
this is a very serious, serious industry. And not only that: the same
techniques developed for thinking about economics and about how to
gather information for forecasting politics are coming into political
economy questions, marketing-type questions, economic indicators, etc.
Pretty much anything I am talking about for the upcoming election, you
could instead be thinking about how many copies of Halo 4 are going to
be sold, or what kind of jeans are going to be popular next year, or
whether or not people are going to use the new initiatives in
Obamacare. For any of these types of questions, in which individuals
are making some sort of collective action, the same sort of
forecasting techniques can be utilized.
And the kind of scary thing about it, though, is that the forecasting
that is used to make really serious investment decisions hasn’t
changed very much since 1936.
So in 1936 George Gallup came up with the idea of randomly polling a
representative group of people and using that as a basis for forecast
of what is going to happen in an election. And if you froze him after
the 1936 election and thawed him out today he would be pretty
comfortable with what is being done to find data for a 10 billion
dollar industry.
But that’s going to change very quickly. And it’s going to change in
the next 5-10 years. And I am going to talk today about how a lot of
those innovations are going to be coming out of our lab and other
related labs in Microsoft Research, and how it’s going to impact not
just politics, but also hopefully have some really neat implications
for Microsoft as well.
So when you think about forecasting and you think about forecasting the
elections the first data point that comes to mind is going to be the
polls of voter intention. So if an election were held today, who would
you vote for? This is the standard ubiquitous data point that’s been
around since 19, well it’s been around since the 1920s and perfected by
George Gallup in the 30s.
But in addition to that there are kind of four other groups where I
have put data that you do hear about, though sometimes you don’t think
about it in the same sort of way. Fundamental data is all this stuff
that people talk about where they imply that it has some correlation
with the outcome, but a lot of times they don’t really go into how, or
the exact mechanism.
That’s past election results, incumbency, and presidential approval
ratings, even for Senate and House races and governors’ races, plus
economic indicators like the latest jobs report. Then prediction
markets: for those of you who are not familiar with them, prediction
markets are actual real-money markets where people can buy and sell
contracts which pay off if a candidate wins and don’t pay off if a
candidate loses.
And they become very strong indicators of the probability of an event
happening. So if it costs 60 cents for a contract that pays 1 dollar
if Barack Obama wins, which is essentially the case right now, that is
very closely aligned with about a 60 percent probability from the
collective people in the market. Experts, they can say whatever they
want. And social media: social media has just started to come into its
own, Twitter data, Facebook data, search data, all sorts of real time
data for which people are trying to think of how they correlate with
outcomes of different events.
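As a minimal sketch of that price-to-probability reading (the function
name is illustrative, and this is before any debiasing):

```python
# Minimal sketch: a winner-take-all contract that pays $1 maps its price
# directly to an implied probability, before any debiasing.
def price_to_probability(price_cents: float, payout_cents: float = 100.0) -> float:
    return price_cents / payout_cents

# 60 cents for a $1 contract on Obama ~ a 60 percent collective probability.
print(price_to_probability(60.0))  # 0.6
```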
And so I am going to focus first in this talk on the [indiscernible]
poll. And I am going to be talking about how stuff that we have
learned in research about prediction markets is going to help shape how
the poll is going to transform over the next few cycles.
So this kind of outlines the key differences between prediction markets
and polls. And I am going to walk you through this, but I want to keep
something in mind too: research into polling, which has been fairly
stagnant over the years, is on a totally different track than research
into prediction markets. And it’s kind of weird, because essentially
these are just two different ways to engage individuals, gather
information from them, aggregate it together, and then say something
about some upcoming event, get a data point. But no one has really
worked to bring the two together and think about how they differ and
how they share similarities.
Polling likes to think about gathering a random sample of a
representative group of something. So in elections it’s likely voters.
For marketing it’s going to be likely people in the market for Xbox or
jeans or whatever you are talking about.
Prediction markets rely on a self-selected group of people who have
superior information to join a market, bid on contracts, and provide
that information: a self-selecting group. Questions: prediction
markets focus in on asking the expectation of what’s going to happen.
They are asking people whether or not something is going to hit,
whether or not someone is going to win an election, whether or not
sales are going to cross a certain threshold. And polling focuses on
intentions.
It’s the same thing in marketing as it is in politics. Will you buy
this product? Will you go see this movie? Will you vote in this
election? Who will you vote for in this election?
Aggregation; polls focus in on simple aggregation methods. Generally
averaging or some sort of weighted average based on some prior
demographics. Prediction markets use how much money you are willing to
spend as a proxy for confidence. So if you feel very confident, you
can put more money in and bid up in a prediction market.
And finally, incentives; polling is generally not incentive controlled,
so the incentives may align for people to be telling the truth, they
may not, and there’s a lot of questions about that, versus prediction
markets, which are directly designed to have properly aligned
incentives, so that when you place a purchase order you are supposed
to bid up to about the probability you feel that something will
happen.
Now I am going to focus first on this question. Polling focuses on
intention questions; prediction markets focus on expectation
questions. Is there something here where you can go into polling and
see if maybe expectation questions make a difference?
So when polling individuals in order to forecast an upcoming election,
generally you have this, voter intention. Who would you vote for if
the election were held today? The wording is pretty much identical
across any of the major news sources that you see, from Rasmussen, to
Gallup, to Fox News polls, to ABC polls and MSNBC; it’s all the same.
Voter expectation question; voter expectation question would look
something like this, “Who do you think will win the election?
Regardless of who you are going to vote for, who do you think is going
to win the election”?
And the reason that this interests us a lot is this idea that everyone
has a pool of information, whether about an upcoming event such as an
election or whether or not it’s a marketing question. So on elections
we know a lot of things. We know who we are going to vote for. We
know who our friends and family are going to vote for. We have seen
the news; we know what other people are saying about it.
And when you ask someone their intentions you are getting a small
sliver, a very small sliver, of that data, right? You are getting just
that individual’s intention. An expectation response captures
something about their individual information, as well as information
about their social network, as well as other centralized information.
And this is the key motivating factor here.
And so in order to compare them, and this is work with a co-author of
mine, Justin Wolfers, who is now at the Wharton School at the
University of Pennsylvania, we were able to find a collective data set
of essentially 345 instances where the same people at the same time
were asked both who they were going to vote for in an election and who
they thought was going to win. Specifically, state-by-state
electoral-vote questions from 1952 until 2008.
And this is the basis of the data. In 345 races, 217 times, or 63
percent of the time, both were greater than 50 percent: both the
expectation and the intention pointed towards the winner of the
election. In 45 races, or 13 percent of the time, they both pointed to
the eventual loser. In 83 races, or 24 percent of the time, they
split: in the same poll, with the same people at the same time, more
than 50 percent of people intended to vote for one candidate while
more than 50 percent of people expected the other candidate to win.
And by the way, just to be clear, we are talking here about the major
party candidates, so we dropped people who intended to vote for a
third party candidate and/or expected a third party candidate to win;
no third party candidates have won, so it wasn’t relevant.
In those split races, three-quarters of the time the expectation
question pointed to the eventual winner and one-quarter of the time
the intention question did. So just on this kind of binary outcome
it’s pretty clear that there is a lot of information coming from the
expectation question. But to head off some very obvious questions
about this: first of all, there are many examples in here, and many
more examples in life, where there is no intention poll publicized
before the election.
So one of the first things you are thinking of is, “Oh people can do
expectations well because they probably saw the intention poll”. But a
lot of these states you are not really seeing publicized intention
polls before the election. The second thing I would mention is that,
in part of the research that I am not going to show today, we gathered
several hundred instances where this type of question was asked prior
to intention polls.
So prior to 1936 we gathered a data set from 1880 to 1932 in which
people were asked who they thought was going to win in certain
elections. And we show that it actually provided a lot of information,
even prior to the availability of any polls of voter intention.
But more importantly I think for kind of this example and thinking
about major races where there are intention polls what we are able to
show here is that when both intention polls and expectation polls are
available, there are times when intention polls are very, very poor;
say there is a very small sample size or there isn’t good sample
selection. There, expectation polls dominate and provide just tons
more information.
In the most advantageous position for intention polls, where they are
perfectly run, with large sample sizes, from major polling companies,
etc., expectation polls still carry more weight than the intention
polls. Also, we have individual-level data for these 345 races, and we
were able to show that there is an extreme amount of heterogeneity. So
if you think about it, if everyone just expected whoever they voted
for to win, then the intention polls and expectation polls would be
the same. If everyone was seeing the same thing from the media, then
everyone would expect the same thing, regardless of their intention.
What we see is a mass amount of heterogeneity, which shows there is
some middle ground. There is information coming from the center,
coming from this kind of social network data, coming from something
besides just your intention and the central signal. And finally, and I
think this is the most important for some reasons, even if there is an
extremely well publicized intention poll, which question would you
rather ask somebody? If you could only ask one person their intention,
in a close race there is about a 55 percent chance they point you
right and a 45 percent chance they point you wrong. But if you ask
them their expectation there is a much higher likelihood you are going
to be right. And this basic idea carries over through this idea of
asking this question.
So, now of course the binary outcome is nice, but that’s not really
what a lot of people care about at the end of the day, especially when
it comes to forecasting, people think about expected vote share. So
what I have done here on the X-axis is the proportion of people who
intend to vote Democratic. So this is the polls that we just talked
about on the last slide and I have dropped 2008 from here because that
becomes out of sample. And so this is 311 different elections, this is
the proportion of people who intend to vote Democratic and the Y-axis
is the actual vote share.
Now this is what you see a lot on the news. It’s a kind of naive,
implicit vote share. And what we are going to see is that there
definitely is an upward slope here. Definitely the higher the poll the
more votes they end up getting, but it’s clearly not a one for one
ratio right. So this naive vote share is not that great. And so we do
some very clean and transparent calculations where we say that the poll
essentially equals the actual vote share plus some sort of house bias,
plus some [indiscernible] party or [indiscernible] bias, plus some sort
of time bias. Most of these are actually clear. We don’t have much
time variation. They are all taken about 30 days before the election.
Plus some certain error turn and we are able to translate the raw
polling data into an expected vote share and it looks something like
this.
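The decomposition just described, poll = actual vote share + house
bias + party bias + time bias + error, can be sketched as a simple
least-squares debiasing step. This is illustrative only, on synthetic
data; the actual calculations are the transparent versions described
above.

```python
import numpy as np

# Sketch: model each historical poll as
#   poll = actual_vote_share + house_bias + incumbency_bias + error,
# estimate the bias terms on past races, then subtract them from a new
# raw poll to get an expected vote share. All data here is synthetic.
rng = np.random.default_rng(0)
n = 311
actual = rng.uniform(0.35, 0.65, n)            # true two-party vote shares
incumbent = rng.integers(0, 2, n)              # 1 if the polled party is the incumbent
poll = actual - 0.02 * incumbent + rng.normal(0, 0.03, n)  # synthetic anti-incumbent bias

# Regress the poll error on the bias covariates (intercept = house bias).
X = np.column_stack([np.ones(n), incumbent])
coef, *_ = np.linalg.lstsq(X, poll - actual, rcond=None)
house_bias, incumbency_bias = coef

def debias(raw_poll: float, is_incumbent: int) -> float:
    """Translate a raw poll number into an expected vote share."""
    return raw_poll - house_bias - incumbency_bias * is_incumbent
```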
Now the interesting thing though is when you look at expectations. So
on the X-axis is the proportion of people who expect the Democrat to
win and the Y-axis is the actual vote share. It doesn’t look that
clean. It doesn’t show very much. And this is one of the reasons why
people have not been using this poll for very much because if there is
90 percent of people who expect a candidate to win what does that mean?
I mean it could be a close race, but everyone still knows what the
outcome is going to be or it could be a very wide race.
And so implicitly it doesn’t look like expected vote share and it
doesn’t look like a probability of victory. It doesn’t look like much
at all actually. There is an upward slope to it: the higher the
expectation, the more people expect a candidate to win, the more votes
they end up getting, but clearly not one to one. But we do the same
transformation, the exact transformation we use for intentions, and
this is what we get.
And it may not look clean because you didn’t see the other one right
next to it. So if you look at the two of them side by side, being on
the 45 degree line means parity, which means the expected vote share
coming from that poll equals the outcome. And you can see just looking
at these that the expectation data translates into a much cleaner
looking forecast of the actual vote share.
For the numbers people, we can just show that when you take the
expectation data, which is the second column, and turn that into an
expected vote share, versus turning the first column, the intention
data, into an expected vote share, you have a lower root mean squared
error, a lower mean absolute error, the forecast is closer to the
answer more often, a higher correlation, and a better encompassing
regression. But the thing I like to focus on is this last number here.
If we were to estimate weights on which forecast to take, you get tons
of significance and a lot of weight on the expectation, and you get a
little bit, with no significance, on the intention. And to be clear
here, what major polling companies use is 100 percent and 0, right? So
even if you don’t totally buy it, right here we have a very, very
strong finding, which is essentially 100 percent and 0 in the other
direction.
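A sketch of what an encompassing regression like this looks like, with
illustrative variable names (not the paper’s actual code):

```python
import numpy as np

# Regress the actual vote share on both forecasts at once; the fitted
# weights say how much each forecast adds once the other is in hand.
# The finding described above is that nearly all the weight lands on
# the expectation-based forecast.
def encompassing_weights(outcome, intention_fc, expectation_fc):
    X = np.column_stack([np.ones_like(outcome), intention_fc, expectation_fc])
    coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return {"intercept": coef[0], "intention": coef[1], "expectation": coef[2]}
```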
So we wanted to think more about what this means and how this could
help us beyond just this question. And in many ways we wanted to kind
of interpret the power of it as well. So let me just give you a quick
interpretation. Imagine if, when people were asked their expectation
of what was going to happen in an election, they went out and just
polled random voters on the street.
The expectation data is as powerful as if they went out, interviewed
10 random voters, and then included themselves. So essentially, they
gave us a binary answer, you know, Democratic or Republican, based off
a poll of 11 people: themselves, plus 10 additional people. We don’t
think they are actually doing that, obviously, but that just shows the
power of asking this question versus the other question.
So there’s this massive multiplicative effect. It’s like having a poll
of 10 times as many people. But not just 10 times as many people, it
also solves one of the hardest questions in polling which is, “What’s
the representative sample”? So when polls report a margin of error,
when you see Gallup has a margin of error of plus or minus 2, or plus
or minus 3, what they are reporting to you is the random sample error
as if they knew exactly what the likely voters were going to look
like: as if they knew the demographic breakdown on Election Day of who
was going to show up, and then they went out and polled 1000 people.
That plus or minus 3 percent, plus or minus 2 percent, is the random
sample error based off the fact that they only interviewed 1000 people
versus, you know, a million or everyone. But it misses this other
major source of error, which is that they don’t know who is going to
vote, and their likely voter model involves a lot of guesswork. And
that’s why in the 2008 election, the major polls that came out the day
before the election, from pretty much every major polling company,
ranged from Obama up by 2 to Obama up by 11. And it wasn’t just from
random sampling error; it had a lot to do with this hard problem of
thinking about likely voters.
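For concreteness, the reported margin of error is just the 95 percent
sampling half-width for a proportion, which a quick calculation
reproduces:

```python
import math

# The reported margin of error covers only the random-sampling piece:
# the 95% half-width of a proportion estimated from n respondents,
# assuming the likely-voter screen is exactly right.
def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    return z * math.sqrt(p * (1 - p) / n)

print(round(100 * margin_of_error(1000), 1))  # ~3.1 points for n = 1000
# The spread from "Obama +2" to "Obama +11" across firms is far wider
# than this, which is the likely-voter-model error the headline omits.
```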
But that’s not what actually happens, so we want to figure out more
about what does. As we said, we know that people’s intentions are
involved, we know there is something about this localized information
that they have, and we know that there is something about the central
signal.
So, one of the things that we are working on is that we actually have a
bunch of stuff in the field, actually with Gallup, which I have been
working on prior to getting here. And also we will be asking these
questions inside Microsoft’s network, with Xbox and with some other
platforms, in which we are going to be asking people not just their
expectation and intention, but who their friends and family are voting
for and who they think the media is saying is going to win.
And trying to put all this together to actually break down the
information we are getting, because the question about expectation is
the question we had to use, because it was asked before. As Donald
Rumsfeld said, you go to war with what you have, not with what you
want. What we want doesn’t involve the central signal, because I am
the leading expert in the world on what the central signal is.
So I don’t want people regurgitating to me what they think the polls
are saying because I can do that with less noise than they can. I want
to know what they are providing me about their friends and family. I
want to be able to ask a poll and get that information and think about
the social network data that’s coming from it. And that would be the
ideal and that’s what we are experimenting on this year.
Let me show you a little bit of what is in the field this year. So
like I said, I
have been working with Gallup, so Gallup has been kind enough to be
asking these questions for us. This was taken in November during the
Republican primary and what you will see is at this point Romney and
Cain, if you remember him, were in a dead heat, but when we asked them
their expectations, Romney dominated it.
Later, one month later, you had Romney getting crushed by Cain, Cain
was, oh I am sorry not Cain, Gingrich, Gingrich, sorry it just kept on
floating around so I forget who was leading at different points. There
were actually, in the intention polls, about 15 different lead
changes. It was like one leader, then Romney, then another leader,
then Romney. And this went back and forth. The expectation poll, even
at the lowest point, and this would be about the lowest point, never
wavered. So every time we asked the expectation poll they always said
Romney, even when he was being dominated in the intention poll.
So there is clearly something that people knew when they were coming
into these that was providing us meaningful information. And so what
do we have right now? We have a mix of polls up there. So if you look
at Real Clear Politics or Pollster, which collect the latest polls,
and I think this was Real Clear Politics, I pulled it down two days
ago, they had Obama up in four different intention polls, they had
Romney up in three and they had a tie in two. Not all of them had
expectation polls, but everyone we found so far in this entire cycle
has Obama up. A good example would be the latest Washington Times poll
in which they have Romney up 42.8 percent to 42, almost a tie in the
intention, but they still have Obama up big time in expectation.
So let me talk about sample selection now. This is a pretty
interesting thing, because the way polls work, and this is for
everything, we spend a massive amount of money and time thinking about
the representative sample. The basis of all of this kind of polling
has always been to gather representative samples; random groups from
representative samples. But what we are able to show with this same
sort of data is that I can take the most biased, un-representative
sample possible, just those people who said they were going to vote
Democratic, or just those people who said they were going to vote
Republican, and actually create a more accurate forecast than from the
whole group.
And why is this important? Declining land-line penetration,
unrepresentative online surveys, difficulty contacting working
families; 20 to 30 percent of some demographic groups don’t have
land-lines anymore. Pew ran a survey recently which showed that
response rates for standard random digit dialing polling, which is the
gold standard, are averaging around 9 percent now. They did a kind of
gold-plated one where they got up into the 20s by calling 35 times or
something crazy like that.
If you could take the non-representative samples that Microsoft and
other major corporations handle, millions or billions of
non-representative people in any given day who ping them for different
things, ask them simple questions, and make it work for you, then you
have got something going. And another thing to think about is that, you
know, in contrast to the polls and just going back to prediction
markets, prediction markets are dominated by generally elderly or kind
of people in their 40s or 50s, presumably wealthy, and white, and male;
this is a very un-representative group and they have done a very good
job in predicting outcomes of elections.
Just to be really clear about what we are doing here, the standard is:
you take the whole poll, the people who said they were going to vote
Democratic and the people who said Republican; that’s your voter
intention. For expectations, likewise, take the whole poll: all the
people involved, the whole sample.
What we are able to do here, by accounting for the correlation between
intention and expectations, is take roughly half the sample, either
just those people who were going to vote Democratic or just those
people who were going to vote Republican, and get a more accurate
forecast of the upcoming election.
So I am not going to go into the math because I am not going to have
time here, but this was really big because this kind of showed us that
we don’t need to do polling on fully representative samples. If we
have some historical data and we have some understanding about
debiasing we can actually make un-representative samples work for us.
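A rough sketch of that idea on synthetic data; the real method
described above also exploits the intention/expectation correlation,
which this toy version omits:

```python
import numpy as np

# Sketch: calibrate the share of *Democrats only* who expect the
# Democrat to win against historical outcomes, then apply that mapping
# to a new, equally biased sample. Everything here is synthetic.
rng = np.random.default_rng(1)
hist_expect_dem = rng.uniform(0.5, 1.0, 200)   # Dems expecting a Dem win (historical)
hist_outcome = 0.35 + 0.25 * hist_expect_dem + rng.normal(0, 0.02, 200)

# Fit the historical mapping (here a simple line) and use it to debias.
slope, intercept = np.polyfit(hist_expect_dem, hist_outcome, 1)

def forecast_from_biased_sample(expect_share: float) -> float:
    """Expected Dem vote share from a Democrats-only expectation share."""
    return intercept + slope * expect_share
```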
I am going to go over this real quickly because I will be low on time,
which I am, aggregation methods and incentives. So what are we
thinking about here? A combination of gamification and new graphical
interfaces. And so this is what we call
the balls and buckets method, which we have been working on, in which
we take laypeople, you know, non-experts, ask them a question, and
have them fill up buckets across the different answer ranges. And they
actually end up creating a probability distribution.
So we asked them in this case about the quantity of something, their
[indiscernible], and we asked them, say, “Where is the answer likely
to fall”? They pushed these up until they had distributed 100 percent
likelihood across the ranges, and we have been able to show that
people can create really amazingly good distributions: normal
distributions, uniform distributions, right-skewed distributions,
left-skewed distributions. They are really able to show their
confidence and their kind of full understanding of where the answer
may lie.
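A minimal sketch of reading one respondent’s bucket allocation as a
probability distribution; the bucket ranges and the allocation are
made up for illustration:

```python
import numpy as np

# A respondent allocates 100 "balls" across answer ranges, which we
# read as a discrete probability distribution. The standard deviation
# of that distribution is a revealed-confidence signal that can later
# weight respondents (tighter = historically more accurate, per the talk).
buckets = np.array([10, 30, 50, 70, 90])       # midpoints of the answer ranges
balls = np.array([5, 20, 50, 20, 5])           # one respondent's allocation

p = balls / balls.sum()                         # normalize to a distribution
mean = (p * buckets).sum()
std = np.sqrt((p * (buckets - mean) ** 2).sum())
print(mean, std)                                # point forecast + confidence
```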
More importantly we have been also developing other things which are
actually less cumbersome which are just kind of fun games that people
have been playing in order to provide information for us. And what
it’s doing is two things. Number one, it’s allowing people to reveal
their confidence over their answer without it being a money-based
thing, without having money as the proxy, because what we are showing
is that confidence, measured as the standard deviation of their
probability distributions, is highly correlated with how accurate they
have been.
So we have had people play these games like this. The tighter their
standard deviation, the tighter their distribution, the more accurate
they have been, so we have been able to use that to weight forecasts
based on it. And the second thing we have been able to show is that if
we make the games fun, if people have incentives beyond money, we can
actually incentivize them properly to give us meaningful and truthful
answers without having to make it a money based game.
And so in short, when we are thinking about polling and how we are
going to transform it: previously, basically everything has been based
on random representative samples. We are switching to the idea that we
can use random, or even self-selected, samples of non-representative
groups. On questions, we are trying to think of the intersection
between intentions, expectations, and people’s social network, rather
than just relying on people’s intentions. We are working on weighting
by revealed confidence. And we are working on making things incentive
compatible by gamifying them, rather than having to make sure that
everything is money based.
And so redefining polling by having polls meet prediction markets. And
where this is going to be useful moving forward too is that obviously
Microsoft has massive amounts of engaged, non-representative users.
And so in this election cycle we are going to be asking a lot of these
questions and pushing them forward with experiments on the Xbox, where
people are going to have the opportunity to answer polls, as well as
on some other platforms. The idea being that we hope to be able to
create very meaningful forecast predictions, as well as some analysis
of sentiment and interest, from this non-representative group.
Furthermore, we are also running some experimental games, which are
running outside of Microsoft right now because they are not advanced
enough, but we have been toying with them and with the idea of making
even more advanced prediction markets, which we are really excited
about and which could really be a lot of fun for people.
But I should also put a caveat here on this question of
non-representativeness, which is that our user base is not necessarily
representative of likely voters, but it is representative of Xbox
users and it is representative of Microsoft users. It is also somewhat
representative of a lot of demographics that are pored over by a lot
of people.
So the ability to also do some really cool market research, some
advertising research, and some other very useful and very direct
things is not lost on us. The fact that we don’t need perfectly
representative samples, just reasonably representative ones, hit
randomly or even self-selectively, means that once we can start
debiasing that data, we can think, “Well, I would rather just have
this many more hits and not be representative”. If we can debias it
properly we can actually answer some really awesome questions, and at
massive scale.
So let me start putting this into a little more context. So I am going
to sort of put things into three categories here. So what I just
talked about is kind of active Microsoft data. This is data that we
are going to start collecting in different realms with experimental
polling and experimental games. On the far left, this is kind of
passive outside data that is generally used in making predictions for
both kind of marketing type questions as well as election type
questions. In the middle is what I will refer to as passive Microsoft
data. This is Microsoft data that we have but don’t actively collect
for this use.
Now the really interesting thing is when you start thinking about this
in terms of election forecasting. A lot of stuff stops at this kind of
passive level, right? You hear a lot of people report the latest daily
poll, but they don’t translate that into expected vote share or
probability of victory or anything. People talk about the latest
jobless numbers and say they are important, but there are ways to
translate that into what it really means. Like, a 10 percent tick up
in the unemployment rate, what does that do?
Same thing with Twitter; everyone likes to report raw Twitter numbers,
you know: there were this many re-Tweets of something. What does that
mean? I mean, put that in context. How many Tweets are normal? And so
the idea of the greater project that this falls into is taking all of
this data, this passive outside data, this passive Microsoft data,
this active data as we are calling it, and really working it into real
time data visualizations and tables: internal dashboards that people
can utilize to learn about different things, as well as for external
market research type things.
And then external facing charts and tables that try to take all of
this data and aggregate it together into very clean predictions of
things, social media interest in things, and sentiment around things.
So those are the overall goals. This is the second stage. Before, I
was talking about the first stage; let’s talk a little bit about this
second stage.
And so when you think about this in terms of forecasts, the idea is to
gather this information and aggregate it. And I kind of think about it
in four kind of key things. One is thinking about accuracy, two is
thinking about timeliness, three is thinking about the relevancy of the
forecast, and four is about the economic efficiency.
And so what do I mean by these things? So let’s think about accuracy.
I am going to talk a lot about combining data. And what you see a lot
of is people providing you individual data streams, but I don’t
inherently care about Twitter. I care about what Twitter can
forecast for me or what it tells me about sentiment or interests. You
know I don’t inherently care about Gallup’s polls verses Rasmussen’s
polls right. I care about the forecast they are trying to project.
So it doesn’t make much sense that people don’t think about this. The
first thing to think about is within-data-type aggregation: you get a
benefit from, you know, aggregating Rasmussen’s polls and Gallup’s
polls and taking an average. It’s pretty much always better than
grabbing one versus the other. Then debiasing it: with polling we know
that there are biases; for instance, there is something called the
anti-incumbency bias, where incumbents poll lower around Labor Day, a
couple of months before the election, than they actually do in the
outcome. These are very simple things you can debias. Then combining
different data streams together, combining prediction markets and
polls: is there an advantage to it? And then debiasing the whole thing
together. I will give you more examples.
So this is data from 2008. And so essentially what you are seeing here
in the blue line is all polling for the national election, and I have
aggregated the different polls together by taking a simple linear
trend, on any given day, of the previous polls. And then I did a
simple debiasing method, which is very transparent, and created a
probability of victory. That’s the blue line. And this is for the
incumbent party,
which was the Republican Party, so this is McCain. And what you see is
that he floats around 40 percent, pretty stable, and then Lehman goes
under and his likelihood of winning the election just crashes towards
zero.
Now what if I had done the same method, but I had just taken the daily
polls on any given day? So rather than taking a trend of the previous,
all the previous polls together, I just took whatever the latest poll
that came out was? That’s what the exact same thing would look like.
And this is what you are getting basically right, when you turn on the
evening news? People are just talking about that latest poll about
Obama, or maybe there are two polls out today and they will average
them together for you. But there is really no reason to believe that
the underlying value of this election was bouncing around like that on
a daily basis. I think it’s pretty safe to say that it looks a lot
more like this blue line. And so this is what happens when you just
kind of do some basic within data aggregation.
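A sketch of the within-data aggregation just described, with made-up
poll numbers:

```python
import numpy as np

# Instead of reporting whatever poll came out today, fit a simple linear
# trend through all previous polls and read off today's value. This is
# what smooths the jumpy daily line into the stable one described above.
def trend_estimate(days, values, today):
    slope, intercept = np.polyfit(days, values, 1)
    return intercept + slope * today

days = np.array([1, 4, 7, 9, 12, 15])                    # poll release days
polls = np.array([0.41, 0.43, 0.40, 0.42, 0.39, 0.41])   # incumbent share
print(trend_estimate(days, polls, today=16))
```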
So let me talk about prediction markets. People like to report
prediction market prices, and it’s been pretty clear from the research
that prediction market prices, as far as raw data go, are really a
strong indication of probability of victory. But this was pulled about
two days ago. What you will see on the top line there, in yellow, is
Michigan: essentially a contract that pays off, let’s say at 100,
should Obama carry Michigan, and pays off at 0 if he loses. But what
you will see is that there is a pretty big difference here between the
bids. People are willing to spend 75, and when asked, people were
willing to sell it for 86. And the last purchase was at 73.
So it’s not exactly obvious what the price is and what number to pull
from this. What we generally take, in order to do this in real time,
is actually a pretty long string of if-and-but conditions, which comes
out to something roughly in the middle of the bid and the ask, unless
the spread gets too large.
But then there is the question of debiasing. You will see Maine here
sitting at, you know, somewhere around 97 or 96. That will never go
any higher, because with transaction costs and opportunity costs no
one is actually going to bid Maine all the way up to 100. We know this
and we see it systematically, so we have ways to debias what we call
the favorite-[indiscernible] bias, which is that really low contracts
tend to move up slightly higher than they should, and really high
contracts don’t go all the way to 100.
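A sketch of that price-extraction-plus-debiasing logic; the spread
threshold and shrink factor are illustrative stand-ins for the long
string of conditions described above:

```python
# Take the bid/ask midpoint unless the spread is too wide (then fall
# back to the last trade), and shrink extreme prices toward the center
# to offset the favorite-longshot-style bias described above.
def market_probability(bid, ask, last_trade, max_spread=0.10, shrink=0.05):
    if bid is not None and ask is not None and (ask - bid) <= max_spread:
        price = (bid + ask) / 2.0
    else:
        price = last_trade                      # no quotes, or spread too large
    # Debias: longshots trade too high, favorites too low.
    return shrink * 0.5 + (1 - shrink) * price

print(market_probability(0.75, 0.86, 0.73))     # the Michigan example above
```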
This is from another set of research which I have done using kind of
fundamental data. So this is created based off of past election
results. Past election results combined with economic indicators,
incumbency and other things like that. And what you see is that pretty
early before the election you can actually take all this data,
aggregate all of these different data sources together, and create a
pretty accurate forecast of what the expected vote shares could be.
So the X-axis is the forecast that I can create on June 15th of an
election year, the Y-axis is the actual outcome, and what you see is
that it’s a pretty nice line. But it does speak to this question of
when people just talk about job numbers and things like this. We spend
a lot of time on how they actually correlate. If you think about it,
you can break it down: I can see the trend in job numbers and how
important that is, or how much incumbency buys you. And that’s why you
want to create these models and think about it in this kind of fully
debiased form.
With that being said, this is taking that kind of fundamental data,
adding it to prediction market data, and polling data and this is from
the last few cycles, both presidential and senate races. This is the
kind of main forecast model I use. This is how accurate it can be 103
days before the election. 103 days before the election is obviously
today; I didn’t pick that number randomly. So this is the forecast
that my model created for the last couple of election cycles, about
202 races, the last three senatorial cycles and the last two
presidential cycles, and you can get pretty darn accurate if you start
pooling all this data together. And this will just get more and more
accurate as the election approaches, because more and more information
is added to the prediction markets, polls, and fundamentals which are
the main inputs going into this model.
So now, on the question of timeliness: this has always been something
that just bugged me, in the sense that academia and the press tend to
judge forecasts for almost anything, including economic indicators,
right before the thing comes out, when the forecast is nearly
meaningless. Whether or not you can forecast the jobless rate, you
know, three minutes before the jobless rate comes out, or generally a
day to five days beforehand: you can’t make major decisions based off
that, right?
Essentially, you need that data coming when you need it. And the
earlier you can make these predictions, the more valuable they are for
actually making investment decisions.
And so I work on making sure that all these forecasts are updated in
real time, so they are relevant when you need them. That actually adds
a massive layer of complexity that we are working with, because of
things like prediction markets breaking down. It can get confusing,
when there is no bid and ask in some prediction market, how you write
the right code to make all this stuff work.
And second of all the granular nature of creating forecasts that update
all the time makes it so that you can do some really awesome research
about the effect of different things. So if I have a forecast that
literally moves in real time and someone announces a surprise VP
candidate you can actually see how quickly and how precisely this
moves, these forecasts moves. And that allows you to study things that
you can’t study if you have a forecast that comes out once a month or
something.
And so it’s the question of relevancy too. So, generally as you know
with elections, the first thing that people talk about is the raw poll
numbers. That is raw data, right? And hopefully you can see from what
I was talking about today that it is very easy to make it into
something a lot more meaningful. But yet what people generally report
is raw data: the raw number of Tweets, the raw number on a poll, the
price on a prediction market, the jobless number, and this stuff
doesn’t help most people. This is raw data. Then you have got to
transform it; the most popular transformation is into some sort of
estimation of vote share.
But most people don’t actually care about estimates of vote share. The
reason that estimation of vote share has been the granddaddy of
forecasting for elections is that it’s the easiest forecast to make
and polls implicitly look like it. But most stakeholders don’t care by
how many points Obama wins, as long as he wins the election. And they
don’t care if he wins by a thousand. What they care about is who wins.
So probability of victory is actually what most stakeholders have
shown they care about, but what very few forecasts focus on, because
it’s a little harder to make and a little different. But trying to
think about these things is part of what we have been thinking about a
lot, trying to make whatever we create, whether it’s a sentiment index
or an interest index or predictions, more relevant.
And so finally, social media data, and I apologize if I did not talk
enough about social media data today, but the bottom line is that
there is a lot of ongoing work on it. Currently, people embarrass
themselves frequently when they talk about social media data, in the
popular press and in academia too. And the reason is that it is very
hard to calibrate social media data at this point, because social
media changes so rapidly and there are so few outcomes to correlate it
with.
And the bigger problem, though, comes with this very, very hard
problem of people taking this raw data and then confusing it with many
different types of outcomes; not being clear what they are talking
about. Are they talking about how much people are interested in
something, what they think is going to happen, or the sentiment around
it?
A good example would be Ron Paul: a lot of the news during the primary
was the fact that Ron Paul was dominating social media. And so a lot
of media people kind of made an ass of themselves by talking about
this as maybe some sort of indication that he was going to win. Well,
it was no indication of whether he was going to win; it had nothing to
do with that. It was some indication that people on Twitter, which is
a very small self-selected group of people, of which we can actually
quantify an even smaller group that you can actually pull sentiment
from, happened to like Ron Paul a lot.
But people weren’t putting that into the proper context. They also
weren’t separating the question of sentiment from the question of
interest. All these types of things are very difficult to do, and
these are the kinds of things we are working on. What we are trying to
do is pull out all this social media data, combine it with other
rapidly moving data, and put it into proper context: into kind of
three main buckets.
One is a prediction of what things are going to happen. Two is an
interest index which kind of says something about how big something is
and how long we believe that this event is going to last.
So if an issue pops up during a debate can we put that into context of
earlier debates? How big the Twitter reaction or the Facebook reaction
to it was, and how long we feel it is going to last.
And then finally putting it into a proper sentiment index, which shows
us things like the fact that pretty much every candidate has massively
negative sentiment on Twitter. So unless you put that into context,
you are not supplying anything meaningful.
So if I take those first two steps then we are going to try and bring
in the next two steps which are kind of infographics and stuff.
Thinking about how do we get laypeople to understand this in a very
clean and meaningful manner? And then supply this to outside media.
And the reason why we are interested in doing that is to kind of build
this feedback loop. And this feedback loop is that the more we can
talk about the really meaningful and interesting data that’s coming
from Microsoft, the more we can get people to supply us with data and
utilize our systems.
But also interesting is the other, inner feedback loop, which is that
the more we can figure out how people understand things, if I can get
people to understand a probability distribution, then I can get them
to supply me more interesting information. So the work we are doing to
create data visualizations that people can understand is the same work
we turn around into these graphical interfaces to get people to supply
that information.
And so let me talk about the 2012 election then. Let’s see here. I
know it’s always tricky to go live during a talk, but. So the outcome
of this, and these are currently sitting on my personal blogs but they
are going to transfer over to a Microsoft space sometime soon, is that
we are able to turn around real time, and what we believe are
extremely accurate, forecasts of the election.
So this table, along with accompanying map form, is something which is
updated every two minutes or so and it’s updated with the latest
prediction market data and other data that’s coming into real time.
And we feel, like I said, that this is very accurate, but also very
meaningful. That it comes in when you need it.
And what you are going to see from this list of the Electoral College
is that you can focus in on this middle section, these are the swing
states. So if you add up everything above that point, so that’s
everything up to Tennessee, you are going to give Romney 191 electoral
votes. And then everything Wisconsin or below is 247 electoral votes
for the president.
Now, I want to make it extremely clear that these things are well
calibrated. So when I say there is a 77 percent chance of carrying
Wisconsin, I do mean that there is a better than 1 in 5 chance, almost
a 1 in 4 chance, that the president will lose it. But, and this is
kind of tricky, and this is really interesting research we are also
doing as well, the correlation is such that if the president loses
Wisconsin, most likely it is not the swing state at that point,
because most likely if he loses Wisconsin he has lost a lot of other
states as well.
And so that’s something to start thinking about. Rather than thinking
about these independently, your best bet is to think of them almost as
a ranking, where it’s very unlikely that outcomes jump across too many
states. If the president carries Alabama, he has pretty much won the
country.
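One simple way to formalize that ranking intuition is a perfectly
rank-correlated simulation; the state probabilities below are loosely
shaped like the table shown, not the actual forecasts:

```python
import numpy as np

# Draw one shared national shock and let every state fall to the same
# side of it, so a candidate who carries a safe opposition state has
# almost surely carried everything ranked above it.
states = {"AL": 0.02, "TN": 0.10, "FL": 0.45, "VA": 0.55,
          "OH": 0.60, "WI": 0.77, "NY": 0.95}   # illustrative P(Obama carries)

def simulate_map(rng):
    u = rng.uniform()                            # one shared national draw
    return {s: p > u for s, p in states.items()}  # perfectly rank-correlated

rng = np.random.default_rng(2)
print(simulate_map(rng))  # if AL flips to Obama, so has everything above it
```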
But the swing states point to a really interesting thing, which is
that the fundamental data, as I said, the fundamental data that comes
up through mid-June or so, says Romney should be winning on all
accounts, even with incumbency against him, if you look at the
economic conditions and some of this other raw data that is not
directly related to the campaign. But Romney is getting killed in
favorability ratings compared to the president. Romney’s favorability
is still sitting around 43, where the president is closer to 50. And
the president has extremely high presidential approval for someone in
these economic conditions.
And what that translates into is an uphill battle. So if I kind of
refocus on here again, as I said, once you get up through Tennessee
that means that Romney is only up to 191 right there. He pretty much
needs to carry Florida, which he has a pretty good shot at, but he
needs to take Ohio and Virginia and then he needs to take one of the
following four states. So that means the president needs to defend
either Ohio or Virginia, or hold these other states: Iowa, New
Hampshire, Colorado and Nevada. Which translates into what we have
right now, a roughly 60 percent chance of the president winning the
election.
Now, I will say one thing, which I was just talking to Jennifer in the
audience about a few seconds ago: it’s kind of scary, because rather
than adding things to watch, it really subtracts them down to a
scarily small segment of the country.
The Senate list, though, is a little interesting as well. The
Democrats currently control 53 seats, and that includes 2 seats which
are technically controlled by independents: Bernie Sanders, who is a
socialist and so is not in danger of [indiscernible] with the
Republicans, and Lieberman, who is an independent but is retiring. And
if you take a look at this list right now, you have to keep in mind
that despite the fact that the Democrats control 53 seats right now,
they have 23 seats that are up for election and the Republicans have
just 10.
So, the Democrats need to think about it in a different way; it’s that
they are going into this election with 30 guaranteed seats and the
Republicans, despite being down, are going into this election with 37
guaranteed seats. So, 33 seats are up for grabs, with the Democrats
guaranteed 30 and the Republicans 37. And the really interesting thing
is that if you just think binary right now, so if you think the
Democrats capture Virginia through Washington and the Republicans
capture Montana through Mississippi, and I will ignore Maine for one
second, that puts the Democrats at 49 seats and the Republicans at 50
seats.
Maine right now is highly likely to go to an independent, which I have
sitting out here because there is only one of them, there you go. His
name is Augustus King, Angus King, sorry, and he refuses to say who he
is going to caucus with. Although he says that he is going to vote for
the president, so it’s likely he will caucus with the Democrats. And
then of course there is a 51st tiebreaker, which is the presidency. So
you have to slide that in there: as I said, about a 60 percent chance
that the Democrats control that tiebreaker and about a 40 percent
chance the Republicans control it.
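Folding the tiebreaker into a chamber-control probability is simple
arithmetic; the seat probabilities here are illustrative, not the
actual model outputs:

```python
# If the chamber splits 50-50, control goes to whichever party holds
# the vice presidency, here a 60% Democratic chance per the talk.
def p_dem_control(p_51_plus, p_exactly_50, p_dem_presidency=0.60):
    return p_51_plus + p_exactly_50 * p_dem_presidency

# e.g. a 30% chance of 51+ Democratic seats, 35% chance of exactly 50:
print(p_dem_control(0.30, 0.35))   # 0.51 -> roughly the 50/50 race described
```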
What this translates into is simply as tight a race to control the
Senate as pretty much possible, essentially 50/50. With the races to
watch, not that surprisingly, being Massachusetts and Virginia, the
two races people have been talking about. Massachusetts has incumbent
Scott Brown, who in an upset took Kennedy’s old seat after his death,
against Elizabeth Warren, the popular professor from Harvard who
recently ran the Consumer Financial Protection Bureau. And Virginia
has George Allen, who lost to Webb in 2006 after his racial slur
incident, running against the former governor Tim Kaine; both of these
are elections which are essentially toss-ups.
So I think that about covers the predictions. So I ran a little long,
but does anyone have any questions?
>>: So the debiasing that you are talking about requires a ton of prior
data? Okay. Otherwise it’s like a magic variable that you just --.
>> David Rothschild: That’s right. And this is something I didn’t go
into too much: a lot of this requires a massive amount of historical
data, which is one of the major issues when it comes to new data,
social media data, Twitter, Facebook, etc. With polling data, of
course, we are looking at a tremendous amount of historical
information, but on
the other hand you look at something like even prediction markets and
you only have a few cycles of them. So it is something that is
continuously updated with each new cycle of information that comes in,
which will make us more and more accurate. But it is something that
requires historical information.
Yes?
>>: [indiscernible]
>> David Rothschild: So some of it is going to be kind of straight
linear type progressions for very simple things. A lot of things
though on the probability side are looking at probits or logits, which
are fairly straightforward as well. Some of them do go a little beyond
that. A lot of them have polynomial terms or kind of factorial terms
in order to show how essentially the effects differ as you move away
from the middle.
So let me give you an example. When it comes to polling, per se, there
is this idea of regression to the mean: essentially, polls which show
10 point leads turn into something like 3 point or 2 point victories
on average. But that kind of factor only hits the very high lead
polls; it doesn’t hit the small lead ones. So there you have a square
term and some other things like that to handle it. Mainly you would
think of debiasing as looking at incumbency, party, but then thinking
about how it changes over the days before the election, which is very
key. How does it change based off of different election types? And
then how does it change, as I am saying, if something is very close or
not very close?
And so we work in all those factors, but the days before the election
is an especially interesting one, because when you are thinking about
combining, you see fundamental data is extremely strong 140 days out,
but essentially provides no additional information 10 days out. There
you are focusing only on the other types of data.
>>: Yeah, I was trying to say [indiscernible], because when it came to
historical data [indiscernible].
>> David Rothschild: Uh huh.
>>: [indiscernible].
>> David Rothschild: Right.
>>: [indiscernible] and it works very well up until now, so how do you
--?
>> David Rothschild: Right. So I think there are a couple of ways in
which we have been doing that, depending on the different types of
elections. So for something like the fundamentals or the
polling individually we do have many, many years and there we are able
to very easily drop years and do in sample and out of sample
examinations of that. And the same thing even with prediction markets
though, we have in sample and out of sample for anything that we have
done, anything that I have done.
But more importantly, being very cognizant of this out-of-sample
concern, I do end up, depending on whether I am working with someone
or not, dropping variables, or dropping terms and different
conditions, that look very strong in sample if they are not working
out of sample.
And so we have been very strong on that, making sure that we are not
overfitting. It has been a big concern. And in the basic paper that I
have on this last combined model, I talk a lot about overfitting and
things that I dropped and scaled back on because I was concerned about
it.
Yes?
>>: You made a comment about, the only thing you were interested in was
the probability of victory. And the only thing that mattered was
[indiscernible].
>> David Rothschild: Uh huh.
>>: But if I look at, if I am a working manager at Xbox, I am really not interested in victory or [indiscernible]. I am really interested in actual unit sales.
>> David Rothschild: Yeah, so let me answer that question. That's essentially for elections, and for elections we are not as concerned with levels, although everything is also run that way for historical comparison. What we are working on, and what I am doing a lot more internal research on when it comes to marketing, is actually full probability distributions, because I think that is the answer, more than just levels.

So again, it's more thinking, "What's the relevant thing for the relevant condition?" I could be thinking about forecasting the number of Xbox unit sales. Maybe a level and a standard deviation is the relevant thing there, but if the relevant thing is a full probability distribution, that is what we are actually gunning for, especially if the outcome has a skewed distribution of some sort. So we have been working towards that: just what's most relevant for the right context?
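As a sketch of the difference, here is one simple way, among many, to turn a point forecast into a full distribution: bootstrap residuals from past cases so any skew carries over. The pre-order signal and sales numbers are invented:

    import numpy as np

    rng = np.random.default_rng(2)
    # Hypothetical pre-order signal and realized unit sales, 50 past titles.
    signal = rng.normal(100, 20, 50)
    sales = 3 * signal + 10 * rng.lognormal(0, 0.5, 50)  # skewed noise

    coef = np.polyfit(signal, sales, 1)
    resid = sales - np.polyval(coef, signal)

    # Forecast for a new title: point prediction plus resampled residuals,
    # so the history's skew shows up in the fan of outcomes.
    draws = np.polyval(coef, 120) + rng.choice(resid, 10_000, replace=True)
    print(np.percentile(draws, [10, 50, 90]))  # a distribution, not one number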
>>: Second question. [indiscernible] intriguing piece of information about the comparison between [indiscernible] markets and polling data, where I think expectations tend to win roughly by a factor of 3 to 1, and that's for a set of about 80 elections. So is there anything common among those three-quarters of the 80 elections, the 60 or so, or something common about the others; characteristics of the elections that make one method better than the other?
>> David Rothschild: Right. Well, the thing which I am happy to say is that it doesn't --. There are varying sample sizes that occurred on different days before the election, and the result was not reliant on those kinds of basic variables. It really cut through fairly consistently. I didn't show a slide, but when we broke it up by those major factors we didn't see anything very strong.
>>: I just want to make sure I understood correctly the bias toward the higher intent, for example those much more likely to [indiscernible] Democratic, for example. Did you say that their expectation is better?
>> David Rothschild: No, so to be clear: I could take the expectations of just those people who were going to vote Democratic, or just those people who vote Republican, and debias that group based on historical data to create a more accurate forecast of expected vote share and probability of victory than I could by essentially just using the intentions of the whole group. But both Republicans and Democrats were capable of doing it.
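A toy version of that subgroup debiasing: simulated Democratic and Republican respondents whose expectations carry opposite, invented leans, with a separate intercept and slope fitted per group to soak up the lean:

    import numpy as np

    rng = np.random.default_rng(3)
    n = 120  # hypothetical past races
    true_share = rng.normal(0.50, 0.05, n)  # Democratic two-party vote share
    # Each group tracks the truth plus its own invented optimism offset.
    expect = {"Dem": true_share + 0.20 + rng.normal(0, 0.03, n),
              "Rep": true_share - 0.20 + rng.normal(0, 0.03, n)}

    # Fresh race where the truth is 0.55; each group reports with its lean.
    report = {"Dem": 0.75, "Rep": 0.35}
    for grp, past in expect.items():
        X = np.column_stack([np.ones(n), past])
        (a, b), *_ = np.linalg.lstsq(X, true_share, rcond=None)
        print(grp, round(a + b * report[grp], 3))  # both land near 0.55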
>>: When you are talking about gamification, are you getting information from [indiscernible] from winning or losing?
>> David Rothschild: How do you know if you are winning or losing? So, for what we are doing right now, we are running samples generally using Mechanical Turk or other outside groups, in which you know whether you are winning or losing based on the bonuses you are receiving, or based on the points you are getting in a given game.
>>: So how do you know what points to give me if I am [indiscernible] information to you, right?
>> David Rothschild: Right, so generally in these types of things you are providing information either on outcomes that have already happened but that you don't know about, or on things that will be randomly drawn in the future, and then we take that data to assign the points.
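One standard way to set such points, though not necessarily the exact rule used here, is a proper scoring rule such as the quadratic (Brier-style) score, under which reporting your honest probability maximizes your expected points:

    def quadratic_score(p, outcome):
        # p: reported probability the event happens; outcome: 1 if it did.
        # Expected score is maximized by reporting your true belief.
        return 1 - (outcome - p) ** 2

    # Points for a respondent who said 70% on a question:
    print(quadratic_score(0.7, 1))  # 0.91 if it happened
    print(quadratic_score(0.7, 0))  # 0.51 if it did not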
>>: [indiscernible] in the news [indiscernible] is starting to go towards this new method, that we can kind of watch as it starts to emerge?
>> David Rothschild: Yeah, so some of them are, including --. I tend to blog on some news outlets, and the New York Times has a statistician in house who does some things in probability, although I am not a big fan --. No, I was, but he has moved away from some of those things. And some other news outlets are starting to refer to the latest aggregated forecasts, or to Intrade prices, or [indiscernible] prices, or things like that.
But I think it's slow, because essentially there was always a kind of loss leader in talking about the latest poll that they are conducting; they want to conduct these polls and they want to talk about them. So I think what you are going to see quicker and more aggressively is people talking about just raw social media data, because that's kind of the prestigious thing for people to talk about now; whereas it used to be the prestigious thing to drop 100,000 dollars on a couple of polls during the course of an election, now people need to have their Twitter guy talking about the latest Twitter reaction to a given event.

So I think people are still far away from it. People know that it is out there; they are just still reluctant to kind of switch over.
>>: So I have heard that the polls that are out these days are mostly of registered voters, and that in the future they are going to start doing likely voters.
>> David Rothschild: Yeah, so it transfers over --.
>>: [indiscernible] change the poll results?
>> David Rothschild: Essentially that transfers over sometime around now, where early in the cycle you have polling of registered voters and then you start building in the likely voter models. It actually switches over kind of organically through the end of the summer. And what you will see is going to be a mixed bag, because different companies use different likely voter models, and so it's very hard for them to predict who's going to come out.

A lot of the bias between different polling companies comes in there. So you have Rasmussen and some other companies on the right that tend to come out with polls showing Romney slightly leading, and then you have some polls which are known to have slightly less bias. It's all coming from their estimation of who's going to come out and vote, and that's the way for them to kind of cycle it through.
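A crude sketch of how such house effects can be estimated and removed, using each firm's average deviation from the per-race, cross-pollster mean; the firms and their leans here are hypothetical:

    import numpy as np

    rng = np.random.default_rng(4)
    house = {"A": 2.0, "B": -1.0, "C": 0.0}  # invented leans, in points
    n_races = 60
    true_margin = rng.normal(0, 5, n_races)

    # One poll per firm per race: truth + house lean + sampling noise.
    polls = {r: [(f, true_margin[r] + lean + rng.normal(0, 2))
                 for f, lean in house.items()] for r in range(n_races)}

    # A firm's house effect: its mean deviation from the per-race average.
    dev = {f: [] for f in house}
    for readings in polls.values():
        avg = np.mean([m for _, m in readings])
        for f, m in readings:
            dev[f].append(m - avg)
    for f in house:
        print(f, round(float(np.mean(dev[f])), 2))
    # Recovers the leans up to a common shift (about +1.7, -1.3, -0.3);
    # subtracting these before averaging puts the firms on one scale.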
>>: How do the two political camps use their statistics?
>> David Rothschild: Well, they use it increasingly meaningfully is the short of it. You know, 5 or 10 years ago the basic statistic that people worked from was primary voters, so they used a triple-D or triple-R, meaning someone had voted in the primaries in previous election cycles, as a strong indication. That was kind of the extent of the data mining people were doing. Now it has become ridiculously sophisticated in how they shift tens of millions of dollars among different states, whether based on prediction markets or polling, but more likely on the internal pollsters they are hiring, as well as teams of kids, especially on the Obama side, who are translating Twitter data and other things for whatever they claim they can do.
But my guess is, and I haven't seen inside any of the campaigns, that they probably have a long way to go to actually make it meaningful, though it is definitely impactful. Sorry, meaningful in the sense of whether or not they are correct, but impactful in the sense that a huge amount of investment decisions are being based off of what they decide from these.
>>: Wouldn't they try to hire you and say, "Quit your job and work for us"?
>> David Rothschild: Right, so, um, I think Microsoft would not want me to work for one camp or the other. But you know, anyone who wants to take our publicly available data is more than welcome to translate that into investment decisions, I guess.
>>: Are you expanding the expectations model to include the size of my social network and [indiscernible]?
>> David Rothschild: Yeah.
>>: That’s my expectation as opposed to someone who is a social hub?
>> David Rothschild: No question. That's kind of our future work, which we are really excited about: weighing people by how many people they know. Right now we have kind of been torn on it. We have asked people, "How many people do you talk to about politics?" and the numbers actually come out fairly consistently. But then, you know, some people are going to say 1,000 people and some people are going to say 1, and obviously we are not going to weigh people directly on that.

We have also been working on deducing it from how accurate people have been, and that has been a fun challenge. But that is somewhere we would definitely like to go.
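A small sketch of that accuracy-based weighting on simulated respondents; weighting by inverse historical error rather than by self-reported network size is the idea from the talk, while all the numbers are invented:

    import numpy as np

    rng = np.random.default_rng(5)
    n_people, n_past = 50, 20
    truth = rng.uniform(0.3, 0.7, n_past)      # past quantities to forecast
    skill = rng.uniform(0.02, 0.20, n_people)  # each person's noise level
    history = truth + rng.normal(0, 1, (n_people, n_past)) * skill[:, None]

    # Weight each respondent by inverse historical mean squared error,
    # not by a self-reported "how many people do you talk to" count.
    mse = np.mean((history - truth) ** 2, axis=1)
    w = (1.0 / mse) / np.sum(1.0 / mse)

    reports = 0.6 + rng.normal(0, 1, n_people) * skill  # new question
    print(np.average(reports, weights=w))  # accuracy-weighted estimate
    print(np.mean(reports))                # unweighted, for comparison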
>>: Or their Facebook friends.
>> David Rothschild: Yeah, or their Facebook friends. And you know, at the end of the day the idea is to be able to poll this information more organically. I think the key thing is that in 5 or 10 years the idea of an actual poll won't make much sense to people, because it's all going to be ongoing data collection, a combination of passive and active data in which you won't be able to tell the difference. You are not going to realize you were just polled, per se, in the same way that Google and Microsoft and Yahoo and all these companies are, you know, predicting things all the time on an individual level. Why wouldn't we be able to do that for elections and other things like that? And so polling from social network data would definitely be part of it as well.
>>: So is one way you could determine a likely voter to ask some kind of question that somebody who pays attention to the news might know the answer to? Like you could ask them about this Obama sector and how he crashed his car. And if the person knew about that, maybe they are more likely to be a voter [indiscernible].
>> David Rothschild: There is a mass of literature on breaking down political information and what it means, and it's actually a very tricky question, because it's not always clear which questions translate into which outcomes. But there are actually a lot of people doing research on those types of things: which questions can we ask, and what does it imply if people understand them or don't understand them? Thinking about people's confidence in those questions is actually where my research has tended, though. We show that people who are extremely confident in their responses to these political questions are also less likely to be influenced by certain things. And we try to figure out exactly where all of that lies and what we know.
>>: That’s good.
Thank you [indiscernible].