>> Kim Ricketts: Good afternoon, everyone, and welcome. My name is Kim Ricketts. I'm here today to introduce and welcome Stephen Baker, who is visiting us as part of the Microsoft Research Visiting Speaker series. Stephen is here today to discuss the world of The Numerati, a global elite of computer scientists and mathematicians (don't you love being called elite?) who are involved in every realm of human affairs, whether it be creating new political groupings, upping our consumer power or transforming healthcare by diagnosing illnesses before you even have symptoms. The Numerati are here with us to stay. Stephen Baker has written for BusinessWeek for over 20 years, covering Latin America, the Rust Belt, European technology and a host of other topics, including blogs, math and nanotechnology. Baker has written for the Wall Street Journal, the Los Angeles Times, the Boston Globe and many other publications. His portrait of the rising Mexican auto industry won an Overseas Press Club award. He's the coauthor of blogspotting.net, featured by the New York Times as one of the 50 blogs to watch. Join me in welcoming Stephen Baker to Microsoft Research. [Applause] >> Stephen Baker: Thanks a lot. It's nice to be here. You know, the PR people at Houghton Mifflin put some of these impressive sounding newspapers in my biography because once upon a time I wrote tiny little dispatches for the Wall Street Journal, from places like Venezuela. But the real newspapers where I got a lot of experience are now both defunct newspapers, like the El Paso Herald Post and Black River Tribune in Ludlow, Vermont. Anyway, a little bit of promo sometimes goes into those things. One of the things that -- I'm on this tour for this book. I've been on it for two weeks. I have another half week to go. And one of the things that I keep getting asked, for some reason or another, is what these people that I call the Numerati have to do with the financial problems we have in the world right now. 
And it's funny that they ask, because when I started this book, you know, it was clear, one of the areas where they're most important is in finance. And it was so clear that it didn't seem fresh or new, and we just decided everybody knew about the quants in finance, so I wasn't going to say anything new. Let's junk that and go to sexier stuff like elections and voting and shopping and computer dating and things like that. But I still get asked. And just a couple days ago the people from Mifflin said: Come up with something about finance. You can tie it to the book because then we can get you on TV. And if they get me on TV, then my Amazon ranking will go like this and I'll get on the Today Show or something. So if any of you have any ideas to help me figure out what these people had to do with the mess we're in or, better yet, how they can help us get out of it, I'm all ears and I will channel you as I go on the Today Show or the Colbert Report. The one thing -- I've sent the few contacts I've had at Goldman and Lehman Brothers frantic e-mails asking for their input. For some reason they're not answering them. I don't know why. But one thing I did hear is if you look at credit applications or mortgage applications in around the year 2000, they asked for a lot of details about people: what their employment background was and how much money they made and their credit history and things like that. And as the years passed, they asked for less and less information. And this kind of goes against the whole theme of my book, which is that there's all of this information available about all of us so that people in every realm can find all kinds of data about us and understand us and sell to us and give us advertising and figure us out as voters, and yet in finance they were moving in the other direction. They were taking rounded people and turning them into ants. And so it kind of goes back -- it kind of takes things the other way. 
I thought that was -- that's all I can say right now if they get me on the Colbert Report. Anyway, I'll tell you about the genesis of this book. I was working -- I worked at BusinessWeek, and I pitched this cover story in the summer of '05. And the idea was that the U.S. tech industry might be heading into a decline because of fewer -- graduating fewer engineers and scientists, behind in broadband, behind in wireless, 9/11 visa regulations. I went on and on, and the editors all yawned and said we've kind of heard that before. It sounds like Thomas Friedman's book. I was like, okay. It's kind of an important theme, though. Is there any other way we can discuss this? And one of the science editors said math is at the heart of all of these competitive issues. And the editor in chief said why don't we do a cover story about math. Nobody writes about math. Incidentally I have something I want to show you. So I don't have audio visuals but I have a couple of props. Anyway, so he said let's write a cover story on math and let's get somebody who is not too brainy to do it, and he appointed me. [laughter]. And I didn't really know much about math at all, and I still don't. But I went around and I talked to people at MIT and I called up the usual suspects and asked them about math in the most general terms. I learned all kinds of interesting things, but I had no idea what my story was going to be. Then I went to IBM Research in Yorktown, New York. And the head of their stochastic analysis division, Simon Dacreedy, told me he and his team of 40 were embarked on this project to build mathematical models of 50,000 of their colleagues. This was modeling the consultants at IBM, a group of them. And they were going to get data from all of these different sources, the e-mails, the calendars and all that, and resumes, and try to learn what people were allergic to, what airports they lived near, and build models of them so they could be deployed more efficiently. 
I thought if he can do that with workers, then other people can do that with shoppers, voters, et cetera, et cetera, and that's how the idea came together. And I did this cover story. And it really didn't have that much to do with math, full disclosure. It was much more data mining and computer science, really. But this was the cover story. And it really sold well, because even if it's not math, if it says it on the cover, there's a crowd of people that are interested in things like that. And later I pitched this as a book and I got this book contract. But as I was working on this cover story, I came out of the IBM interview and I called up my roommate from college, who has a Ph.D. in computer science. And I called him and I said: I am going to do the most exciting cover story you can imagine. I'm going to do this mathematical modeling of humanity. I was full of the passion of the ignorant, but it was excitement. And I raved on in this e-mail for a couple of minutes and then forgot about it. Then a couple of weeks later I got a phone call from him. He said I'm really concerned about that cover story of yours. And he said have you ever heard of garbage in, garbage out. I said I've heard of it. He said have you heard the story about the drunk and the light and looking for the key? I'm sure you've all heard of that story, the guy's looking for the key because that's where the light is even though that's not where the key is. So he gave me sort of a 101 on what to watch out for in this world. And I went on, and I've told this story once or twice before, and he actually called me as I was on my way to an interview with Google, and I was thinking this morning should I cut out the Google bit for the Microsoft talk? But then I thought, well, it's not that flattering to Google so I think I'll go ahead and tell it. So I went to Google and I talked to Craig Silverstein, who is, I guess, the first employee. 
And I said that story about the drunk and the key, is that something I should be keeping in mind? And he said: When I was in middle school this was a science fair project and I came up with this experiment and I came up with all this terrific data, and then I realized the experiment was flawed and so I tried to come up with a new experiment that I could hitch up to the data that I had already generated. And it was at that point that I said: You know, this mathematical modeling of humanity, it might happen, but if it does, it's going to happen first in areas where people can afford to make a whole lot of mistakes. And so marketing and advertising are two key areas for that thing. Anyway, so my book is about this nascent effort, this modeling of humanity. And it's just looking at where we are in trying to figure out patients, trying to figure out voters, trying to understand blogs and use them for market research. Modeling, went to IBM, did the modeling of the workers, just this tour through this world, and a lot of these efforts are really, you know, I think we'll look back and say they're pretty primitive. They make lots of mistakes. They really don't understand us in a lot of meaningful ways. But the standard isn't whether they're true or not or whether they understand humans in all of our complexity. The standard is whether they understand us just a little bit better than the status quo did before, enough so they can make money. And if they can, then they keep on doing it and they learn a little bit more and it progresses. And that's where I think we are in this thing. And so I'm not here to make any tremendous promises that I'm sure you would -- I don't need to make them to you anyway. But anyway. And the other thing is the important thing isn't that it's true, it's that it provides incredible scale and efficiency so you can deal with millions of people at the same time. And that's why these schemes that I'm talking about have such tremendous power. 
For the first time we can compare people to a million or 10 million or 100 million other people. I thought I'd walk you through a few of the case studies that I did. One has to do -- a lot of them put us into new tribes. They take a look at the old ones where we were understood by our demographics or our region or our race and replace it with new ones that are based more on our behavior. One of these is in politics. I went to this political consultancy in Washington called Spotlight and like so many others they're trying to micro target swing voters. And one of their ideas is that what are we September 30th? If on September 30th an American voter doesn't know what he or she is going to vote for, that person really isn't terribly engaged in the political process and isn't thinking about the issues the way that the politicians and the politically involved people are. They're thinking about things in another way. But those are the people who are going to likely swing the elections in key states like Ohio, Wisconsin, New Mexico, Nevada. So how do you understand those people? You don't do it by the issues that they don't really spend a lot of time thinking about. How do you find those people? So they did -- they basically co-opted corporate marketing techniques. They took about 4,000 people that they thought represented a cross section of the American voters, and they gave them lengthy interviews, where they talked about all kinds of things that they were, what are you scared of? What do you hope for? What do you want your kids to do? Sort of looking at the future through their eyes at what scared them, what were they excited about, but not politics. And then to fill out these profiles, they, of course, asked them a lot of questions about politics that they could look at the correlations. So they had 4,000 people. They gave them to Yankolovich and Partners, a company that analyzes consumer behavior. 
They said are these people divided and can you divide these people into any sort of recognizable groupings? And they could. And they did. And they said these are five tribes and you can divide each tribe into a more zealous and a less zealous. So a total of 10 tribes. And some were people who focused on righteousness. And some were people who focused on community and those are pretty clearly Democrat and Republican. But they were interested in the ones in the middle who really cared deeply about freedom. Pie in the sky term, but they found it was something about these people around freedom. And there was one group of them that they call barn-raisers. They have names for all of these people. Right clicks [indiscernible]. But these barn-raisers care deeply, playing by the rules, those sorts of things. They care about morality but they're not deeply religious as a rule. They're swing voters, represent 8 percent of the population, which is 14 million voters and they voted for President Bush by 90 percent, 90 to 10 in '04 and two years later they went Democrat 50 to 60 percent in the congressional elections. So they think that they have their eyes on a new swing voting group. It's not a demographic, it's this tribe that exists only in their database. But how do they get 14 -- they have the 3,000 that they know about, and so the barn-raisers are 8 percent of those 3,000, but how do they find barn-raisers in the rest of the country? They have to do a model based on demographics and consumer behavior of the barn-raisers that they have. They test it against the control group that they know, and then they take that model and they run it across 175 million voters to pick out the 14 million barn-raisers. So they've done that with every one of us, everyone here who is a U.S. voter exists in one of these tribes. And that's just for Spotlight. I'm sure we exist in other tribes for other political consultants. 
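The look-alike step described above (fit a model on the surveyed voters, check it against voters whose tribe is already known, then run it across the full voter file) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not Spotlight's actual method: the features, the toy data and the nearest-centroid rule are all hypothetical. Only the 8 percent share and the 175 million voters come from the talk.

```python
from statistics import mean

def centroid(rows):
    """Average each feature column to get one representative point."""
    return tuple(mean(col) for col in zip(*rows))

def fit(labeled):
    """Build one centroid per tribe from (features, tribe) pairs."""
    by_label = {}
    for features, label in labeled:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}

def predict(model, features):
    """Assign the tribe whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(model, key=lambda label: sq_dist(model[label]))

# Hypothetical surveyed voters: (age, owns_home, church_visits_per_month).
train = [
    ((52, 1, 1), "barn_raiser"), ((48, 1, 2), "barn_raiser"), ((55, 1, 0), "barn_raiser"),
    ((24, 0, 0), "other"), ((29, 0, 4), "other"), ((71, 1, 8), "other"),
]
# Held-out surveyed voters play the role of the control group.
control = [((50, 1, 1), "barn_raiser"), ((33, 0, 0), "other")]
# Unlabeled records stand in for the national voter file.
voter_file = [(49, 1, 1), (22, 0, 0), (53, 1, 2)]

model = fit(train)
accuracy = mean(predict(model, f) == label for f, label in control)
predictions = [predict(model, f) for f in voter_file]

# Figures from the talk: 8 percent of 175 million voters.
projected_barn_raisers = 175_000_000 * 8 // 100
```

A real model would use far more features and a properly sized control group; the point here is only the shape of the pipeline: label a small sample, validate, then score everyone else.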
What they want to do is hit those barn-raisers with specific ads in places like Milwaukee, Santa Fe, swing states, that emphasize the points that they seem to care about. They think that their technique is 75 percent accurate. So three out of four people that they call barn-raisers are barn-raisers and the other 25 percent are something kind of close. One of those freedom tribes but not one of the community or righteousness tribes. So a lot of people that I talk to complain about this. They think that it's kind of weird and scary and it's the automation of American politics and we're being treated like things. But I say we've always been treated like herd animals. They've looked at us as one ethnic group or another or one urban group or another or voting precinct. So they're actually trying to understand us as something closer to the people we are, even though they use strange statistical techniques. Another area that I covered was medicine. I went to Intel down in Portland. And they've wired the homes of several scores of elderly people with all kinds of sensors. And they're trying to measure absolutely everything these people do in their homes. The nature of their strides. They've got sensors under their tiles to measure how they shift their weight on the kitchen floor. The strength of their voice. The length of time it takes them to recognize a voice on the telephone. All kinds of things. They establish baselines for each of those behaviors. If they see a deviation from the baseline, that points to some problem, and eventually they want to be able to diagnose it automatically or at least come up with a suggested diagnosis automatically, and they're looking at things like Alzheimer's, Parkinson's Disease. Oh, loss of muscle mass in the legs or loss of balance that would lead to a catastrophic fall. 
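The baseline idea above can be illustrated with a minimal sketch: record a person's own normal range for one behavior, then flag readings that fall far outside it. This is an assumption-laden illustration, not Intel's system. The function name, the stride-length readings and the three-standard-deviation threshold are all hypothetical.

```python
from statistics import mean, stdev

def flag_deviations(baseline, recent, threshold=3.0):
    """Return (index, value) for recent readings far from this person's baseline.

    A reading is flagged when it lies more than `threshold` standard
    deviations from the baseline mean (a simple z-score rule).
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # No variation in the baseline: any different reading counts as a deviation.
        return [(i, x) for i, x in enumerate(recent) if x != mu]
    return [(i, x) for i, x in enumerate(recent) if abs(x - mu) > threshold * sigma]

# Hypothetical stride lengths in centimeters for one resident.
baseline_strides = [62, 61, 63, 62, 60, 62, 61, 63]
recent_strides = [62, 60, 55, 61]

flagged = flag_deviations(baseline_strides, recent_strides)
```

Comparing each person against their own history, rather than against a population average, is what lets a small change (a shorter stride, a slower response on the phone) stand out.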
Their theory is that a lot of people who -- well, right now a lot of middle aged people have aging parents that they're hard-pressed to keep track of and take care of. And that as this generation ages, we're going to need more and more of this home healthcare. I'm sure you people at Microsoft have lots of projects using the same, following the same ideas. I think eventually this is going to raise all kinds of questions for society about insurance, what happens if the insurance company calls you and says I'll give you a 30 percent discount on your health insurance if you put a few sensors in your house. I think, increasingly, we may be faced with those sorts of questions which will raise further questions about the very nature of insurance, which is an industry that relies on a certain amount of ignorance. And as we learn more we're not going to be as ignorant in what happens to the insurance industry. In the auto industry, there's a company called Progressive that's offering people discounts to put black boxes in their car. And they measure where they go, how they drive, which neighborhoods they go in, what times they drive. They're trying to assess their risk. And I talked about this to one group, and I asked if anybody would be interested in that? And a guy said not for me but for my kid. And I think that that's going to happen more and more, is that middle-aged people are going to impose these surveillance systems on their parents and their kids and those are going to be the test populations. And if it works, and the results are good, then I think more and more of us are going to embrace it for ourselves for the life enhancing qualities. But like so many others, the business case for this starts out with really basic things that have less to do with the Numerati and more to do with just reporting simple facts. One of the things is weighing people. My mother actually participated in this Intel study in Portland. 
And she was 90 and suffering from congestive heart failure and extremely weak and frail. And they told her she should weigh herself every day and report her weight every day. Well, she wouldn't remember to weigh herself every day at that point in her life. But I bought a scale for her, one of these digital scales. And as soon as I gave it to her I realized it was absolutely futile because it takes a strong tap to activate that, and she couldn't double click -- she had a hell of a time with a mouse, double clicking. And tapping that scale was beyond her. And then even if she would remember to weigh herself and successfully tap the scale, she wouldn't be able to see the numbers. So there were like three data collection obstacles right there with my mother. Well, at Intel they've wired people's beds so they can weigh them in bed. And that's a useful thing, you know? People might pay for that. It's very primitive but these things start with primitive hookups. The only trouble was there was one case where a woman gained eight pounds in the middle of the night. They thought she was taking on fluids; should they get an ambulance over there? It turned out her little dog jumped on her bed. So the data is not always that clean. When I went to IBM -- well, I did this cover story, and the nice thing about writing cover stories for a magazine like BusinessWeek is I can go to IBM and they can tell me we're going to model 50,000 workers, and I can say IBM is going to model 50,000 workers, picking up data from e-mail and blah, blah, and lay it out in a paragraph and maybe even a second paragraph. After that, I don't need to know much about it. I don't have to know how they do it because I'm off to my next example. But when I'm writing my book, I had to go to IBM and say, you know that thing I spent two paragraphs in the cover story talking about, could you walk me through that and tell me how you plan to model 50,000 consultants? 
So they did. They walked me through it at some length, and they use old -- they use hand-me-down tools from different disciplines. For example, they used financial tools to analyze the skills, to put a value on the skills that people have so that they can do a business plan and say: This is where we project our company's going to be in five years and these are the skills that we're going to need. So how much are these skills worth and how many skills do we need? And so looking at skills, valuing people's contacts, valuing, trying to create some kind of value for where they sit in the network according to their e-mail patterns, all of these things go into numbers which each person becomes sort of like a mutual fund of different skills going up and down. And it's not at all what people are. But it gives them some way to try to get a handle on how to evaluate them and project their value in the future. So it's not -- it's not that close, but it might work to some degree. And then the other one that they use that's a big hand-me-down, is operations research. And during World War II, the convoys were crossing the North Atlantic to arm Britain they kept getting sunk by German U-boats so the U.S. and Britain put together teams of mathematicians that turned the north Atlantic into an entire mathematical battleground, if you call an ocean a battleground. Anyway, they figured out how to optimize the convoys to minimize the damage, how many destroyers should surround each convoy, how many boats should be in each convoy. They figured which routes they should take, they optimized it, lowered the casualties along the way, and it was very successful. And later, after the war, IBM used that same science to optimize its own supply chain. So they developed all kinds of efficiencies. They saved a lot of money and then they used that knowledge to create a new service business and they sold their supply chain smarts to the rest of the world. 
And everybody optimized their supply chain either with IBM's science or somebody else's. And now IBM has moved to a much more service company for manufacturing, and if they were to try to optimize their supply chain, it would be its people. And so that's what the Cready team is trying to do is to sort of optimize their people and they're using a lot of the hand-me-down techniques from operations research. And again it doesn't really -- this wasn't built for people. But if it works and provides some kind of incremental improvement then they'll go with it. And I guess one of my questions for you, and I'd be interested in hearing what you have to say about this, is if we have a system -- if we have systems that improve because they get better results and they try to analyze people, and through the years and through the decades we fine tune them and fine tune them but the very platforms that they're built upon were built for financial instruments and for machine parts, is this the wrong way to try to understand people? I don't really know. But that's the way that I think a lot of people are heading because that's the way that works for today and tomorrow. And this whole industry is based on today and tomorrow not based on a clean sheet of paper that might work in five years. Maybe it's being done in universities. Maybe it's being done in research departments like this one. But I think it's going to be basically built on the same systems that understand finance and machine parts. I went to Yahoo! and I asked the head of research there [Paraga Rafaca] about the challenges of trying to dig through these mountains of data, trying to understand consumers and building services for them. And he gave me kind of a primer on managing massive amounts of data. And he told me about overfitting and all these other problems that you have with data and somehow you can get overwhelmed by it, you can dive down rat holes chasing various correlations that turn out to not have any meaning. 
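The rat-hole problem mentioned above is easy to reproduce: search enough meaningless variables and one of them will appear to correlate with whatever you are trying to predict. The sketch below is illustrative only, using pure random noise. The sample sizes, the seed and all names are assumptions chosen for the demonstration, not anything described by Yahoo!.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)      # fixed seed so the run is repeatable
n_people = 40               # first 20 used to search, last 20 held out
n_features = 200            # many candidate "signals", all pure noise

target = [rng.gauss(0, 1) for _ in range(n_people)]
features = [[rng.gauss(0, 1) for _ in range(n_people)] for _ in range(n_features)]

half = n_people // 2

def in_sample_strength(feature):
    """Absolute correlation with the target on the first half only."""
    return abs(pearson(feature[:half], target[:half]))

# Pick whichever noise feature looks best on the first 20 people...
best = max(features, key=in_sample_strength)
in_sample = in_sample_strength(best)
# ...then check the same feature on the 20 people it was not selected on.
held_out = abs(pearson(best[half:], target[half:]))

print(f"best in-sample |r| = {in_sample:.2f}, held-out |r| = {held_out:.2f}")
```

Because the winning feature was picked from 200 tries on only 20 people, its apparent strength is mostly selection luck. Checking it on people it was not selected on is the standard guard against chasing a correlation like this.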
So then I went to the National Security Agency, and I met with the chief mathematician there. This is in the summer of '06, and they had gotten into a lot of trouble -- well, a lot of controversy, because they had been consuming immense streams of Internet and telephone data. And so I was very worried about my interview with him. I was worried he would object to my questions and storm out and slam the door. So I was a little bit tentative when I asked him questions. I started out by saying, you know, these people at Yahoo! were telling me that sometimes you get too much data. Is that a problem for you? And he said the people at Yahoo! might not know how to store their data and they might not ask the right questions and they might get confused by the data, but no, you can never have too much data. And so that is my story. I've got this book. I'm happy to answer any questions or talk to you more about the Numerati if you would like to. >>: So your discussion about how you divided the political people into tribes, how that was being done, was any follow-up done to see how successful they were in converting undecided into a decided, number one? And number two, the [inaudible] because I assume they had an agenda. [inaudible]. >> Stephen Baker: Right. The question, I don't know if the question comes through. >>: Not as well. >> Stephen Baker: The question is about the conversion rate in the political thing. I would say if they are doing those conversion studies, they're doing them and will only publicize them if they benefit their consultancy. You know? I think come December there's going to be a lot of chest thumping by whichever was the winning side, and a lot of claims about having swung Ohio or Wisconsin for one candidate or the other. And there is a lot of hype in this field. There's some truth and a whole lot of hype and a lot of marketing. I don't really know. It will be interesting to find out. Maybe somebody will give me the inside look. 
But I don't know about their luck in that. Any others? >>: I enjoyed hearing you talk, brought up a lot of stuff. I was just thinking about one place where they've been very successful doing this, which is sort of the credit rating industry, taking a whole lot of potentially unrelated information and turning it into a credit rating. And then in my mind they failed to adapt and they failed to change but they still have so much power. I mean they're so powerful that no matter how bad they are at this point, they're going to maintain, at least until they get so bad that the economy collapses, a dominance in the industry. And all sorts of -- seems to some extent you're talking about the democratization of predictions or something. So I just thought that was interesting. It's a case of potential abuse of power and modeling. >> Stephen Baker: You're talking about [Fair Isaac] or Standard & Poor's? >>: More like individuals, I was thinking, not corporate. >> Stephen Baker: Fair Isaac. FICO score. >>: [inaudible] is one. The other thing is you said they predict like 75 percent of the time and we think that's good. But what about the other 25 percent of the people? Isn't the foundation upon which our country was based, isn't that just [inaudible] people who don't fit in models or treat them differently? >> Stephen Baker: Well, you do that when you're running a standard political operation and you think that there's a Democrat -- like if you go into Philadelphia, which is a highly Democratic city, you run commercials for the Democrats to get them out to vote and you just forget about all the Republicans that are there, because they're a minority that you're not paying attention to. >>: This is going well beyond politics. This is diving down into people's lives, we're talking healthcare. >> Stephen Baker: Right. 
>>: I mean monitoring their homes, and what if there's those 25 percent of the people and the healthcare company looks at the data rather than the person? >> Stephen Baker: Well, I think it's made for areas where it doesn't matter. If they think you're a barn-raiser and Obama sends you an ad saying I really care about right and wrong and nobody's been playing by the rules and blah, blah, blah, you know that's not a big deal. You just get the wrong advertisement that's not micro fitted to you, but in medicine it's a whole different game. So I think it's going to be longer before these people make great strides in medicine. That's where they need it the most. But yeah. Yeah? >>: Have there been social and kind of legal responses to the ability to profile people, crunch numbers to find an idea of this group or that group will go this way or that way? Have there been lots of ways of wrong way and right way of using that and what's the responsibility of the company to do that? >> Stephen Baker: You mean to privacy issues? >>: Privacy or manipulation in some respect as well. >> Stephen Baker: I don't know. Does anybody else have any thoughts on that? I can't say. >>: Any government regulation or attempt at -- >> Stephen Baker: There's a lot of talk about different regulations. But there are very strict ones about medicine. But as far as profiling for things like advertising and marketing, I don't think there's much -- I don't think there's much regulation at all. There are much stricter regulations in Europe than there are here, I know that. I don't really have specifics on that, I'm sorry to say. >>: So one of the key issues here when we are talking about this mathematical model of humanity is people have a lot of privacy concerns. I think that's one thing that's maybe stopping a lot of things from already being modeled more. How do you see that panning out? 
Do you see the Numerati becoming more aware of people's privacy and building in maybe new, I guess, ways to protect people's privacy, or do you see people becoming less concerned about privacy? >> Stephen Baker: I see both. I see people redefining privacy and trying to come to grips with what -- consider the secrets that you have and then which secrets should you keep in the future or should you attempt to keep in the future. And there's some secrets that traditionally you've kept but you don't really need to. Then there are other ones you want to keep. And I think a role for a company like Microsoft, and I know you're at work on it, is to create tools for people to protect themselves and for industries to provide services where you can get the benefits from sharing information without the costs of exposing yourself to loss of privacy or loss of money. I mean, especially important in -- I talked to one of your colleagues, Cynthia Dwork, I don't know if she still works for Microsoft down in San Francisco, and she was talking to me about medical data and how, if you zeroed in on it, it was impossible to see the individual. I mean that's the real key, is it's a real opportunity for companies like this one, I would think. Now, I said that to Google. I went to Google two weeks ago. And I said: People at IBM are really concerned about this article. The book excerpt ran as a cover, and it was about the IBM chapter. It was about modeling workers and whatnot. And it took out the most noteworthy stuff and so it didn't have some of the softening elements of the book. And the IBM people were very upset about it. It made it look like they were a Big Brother company, and they wanted me in my talks about this to say that that was a pilot project and their surveillance of employees is done on an opt-in basis now. But then I went to Google and I said: You know these people at IBM were really concerned about that. 
But I would assume at Google where all data is just considered information to be analyzed that you would assume that people are looking at your patterns, your workplace patterns and trying to figure you out and make you more productive or help you come up with better ideas or whatever. And they were horribly offended by that idea. A couple of them denounced me for even suggesting it. And the word "evil" came up into the conversation more than once. And so it would be interesting. I don't know what the thinking is here at Microsoft. But I just assumed that if companies aren't looking at that kind of data, it's just because they haven't gotten around to it yet. But I mean not that you can build predictive models of IBM, I mean Microsoft Researchers, but there's something to be learned. I don't know what you think about that. Yeah? >>: I wonder if you've come across any projects where people are trying to do real-time profiling for instance I'm on vacation and I bought lunch maybe now I'm more prone to go buy dessert, based on history. >>: I don't think you need a model to figure that out. [laughter]. >> Stephen Baker: Well, the one company that I talked to that I thought was doing something interesting in that area, do you know Sense Networks they come out of MIT. And they've put this software into telephones so that they can track all these people's movements. They're doing it in San Francisco. Coolest thing in the world to look at this map of San Francisco and see these various people moving through it. And so they think that if you look at a city as sort of like the physical Internet, then the corner, such and such a corner of Lombard Street and something else in San Francisco is like a web page. And if you stand on that corner between, let's say, 9:00 and 10:30 p.m., then you and everybody else who stands on that corner at that time have something in common just like people who visit a certain web page. 
And so then if you look back at their patterns, like where do most of the people who go to that corner sleep? And you might see that a certain number of them come from this area. And where are they at 2:00 in the morning? You might see they're in this certain club. And then you can define the people who have those patterns as a tribe or a group you can market to. And that could bring real-time marketing, the kind you're talking about. Interestingly, this is just beginning, this thing just launched a couple months ago, Sense Networks, but the investors in Sense Networks aren't VCs, it's a hedge fund. You figure for a hedge fund it's four million bucks, which is nothing, even in today's climate, and they get this raw data of people's movements within New York, San Francisco and other cities, and if they can use that to try to understand something about what consumers are up to, they might be able to understand the economy just $4 million better. >>: Have you thought about generational or cohort differences with regard to even the privacy or acceptance of this? Because based on what I see, I think like the younger teens and things who are brought up with technology and are much more comfortable and familiar with it would be more comfortable with having this information used to -- >> Stephen Baker: That's right. If you look at the blogs and social networks, there are many parts of our society that are spilling much of their lives, including intimate details of their lives, for the whole world to see. It's a treasure trove for the data miners who want to figure out, do sentiment analysis on these people or people in general, because they just, they get a big enough sample and they adjust for the age. But one of the companies I visited was Umbria Communications, which was going through those blog posts and coming up with sentiment analysis for marketing companies. And right now all they're doing is the thumbs up or thumbs down for a new, the Jerry Seinfeld commercials.
But in the future they're going to be able to understand those messages, those writings, with a lot more nuance and context. >>: You suggested earlier on that politics was borrowing from marketing. But I've done some -- I've worked on some marketing campaigns for the company. I kind of got the impression it came the other way, which is that there are certain industries like politics where marketing wannabes can get a start, and other places, and they go to conservative companies that are established and have a lot to lose. But when you're on a campaign that has pretty much no downside and big upside, it's a chance for some young kid with some innovation to make a mark. And they're almost never going to do the safe, reliable thing that anybody sensible would do. That is, if you're going to make a mark, you have to do something where you've got a candidate who is probably not going to win. And you've got to bet the farm on double 0 and come up with something truly creative. And if it works, you've now got a career. I think it's different when you're in the home stretch. >> Stephen Baker: Yes. Just from my own experience, having kids and friends who go into politics, they often get discouraged because of the old pros who know all the precincts and know the way things are done. I find that young, sharp 20-year-olds or 22-year-olds often get stuck stuffing envelopes and not having that kind of input. >>: If you look at the creative new things coming out of the more recent campaign, they're things that the old pros never would have done and also they're things that the marketing companies have never done. >> Stephen Baker: You're right. Certainly in terms of the Internet stuff. >>: So, for example, let's take: give you a five-minute head start on knowing who the vice presidential pick is going to be, and in exchange for that I give you permission to text message me. And they get a massive number of opt-ins by this method.
This is something that an old pro never would have thought of. >> Stephen Baker: Right. >>: I'm saying this campaign [inaudible] with this kind of thing. If you look at the CVs of the people in the conservative places that are doing marketing now, where they came from, they almost all came from one of those places, something like politics, where you didn't need a long CV to get in. >> Stephen Baker: Interesting, I really didn't think about it. >>: Modern research is born from politics. I think that's pretty much the genesis of it. >>: Furthermore, one of the big marketing things now is turning your most loyal customers into your marketers, which is what politics has always been about. >> Stephen Baker: That's true. >>: And you have to go out there. >>: It's about changing attitudes. I mean with politics it's easier because you have a very discrete choice and a very discrete period. It's the election, and you have to impact their position before that, and then it's ubiquitous after that, the marketing of products. >>: I have a question. How do you think the census, the American census [inaudible] you know, the way the government categorizes. >> Stephen Baker: Yeah, I'm thinking about it. What do you think? >>: Well, it just seems archaic, the questions they ask. It's not very far-reaching. But they must use it to make decisions about us. >> Stephen Baker: It's just that when you fool around with the census, I can imagine if people, if there were an open source movement to try to figure out what the best census would be, it would be fascinating. >>: Right. >> Stephen Baker: But then it would come up with all these privacy implications and there would be a massive debate about it, and it would seem that the census, for all of the potential that it gives us, is one of these areas that's going to be really hard to change. But I don't know. >>: Before the 2000 census, there was a proposal to actually do it by sampling.
And all the mathematicians who worked for the Census Bureau swore up and down it would be more accurate if they were allowed to use sampling rather than individually trying to count everybody, and the politicians just would not hear of it. >> Stephen Baker: Right. >>: So it was voted down. >> Stephen Baker: Yeah. >>: I think a good example or one of the best examples of data privacy is the whole RFID thing, where they were going to put a radio frequency ID tag maybe in a piece of clothing at a retail store or something. And the benefits are great, right? Because potentially you could push your cart through the register and pick up everything all at once. Nobody was having it. Putting RFID chips in licenses, and you have to be able to disable it, and passports now have them in there. There are websites telling you, like, take a hammer and smash it in a certain spot so that it doesn't work. If you do it any other way you're defacing the passport and it's illegal. >>: Carry it wrapped in aluminum foil. [laughter]. >>: [inaudible]. >> Stephen Baker: Did you want to say something back there? >>: I was going to mention that I was looking into some of these bills that were being passed, and I was researching some of the bills for some reasons, and found out that for some of them they were actually going back three decades to pull up census numbers to support the bill as opposed to using the modern census. So it's kind of ironic that our government will say oh no, go count everybody in that census, because at times when it's not beneficial to them they'll go use an older census and quote it in the bill. >> Stephen Baker: Right. That reminds me of my time as a steel reporter when I was in Pittsburgh. I went to this cutting edge rolling facility, a steel rolling mill in Indiana that was half Japanese. It cost more than a billion dollars, which was incredible for the steel industry.
And they take a band of steel, flat rolled steel that's this flat, and in one continuous process they roll it until it's like tin foil. And it's just the band goes on forever and ever. But the tricky part is that you have to weld together the end of one band and the beginning of another, and that's done -- that was done electronically, and it was a high tech thing they were very proud of. And so they show it to me. They tell me how it's welded together really fast and really strong. And I saw this hammer next to it. And I said what's that hammer doing there? And they said oh, those guys never trust the automatic welds. So I think those -- >>: I just recently purchased your book. Although I haven't had a chance to read it yet -- it's a joke. [laughter] I did notice that you had something in there related to RFID along the lines of what Josh was just mentioning. I was wondering if you had given any thought to anything that related to evening the value propositions with some of these instruments that were designed initially or primarily for supply chain, from the perspective of the consumer. These devices in products that you purchased that still actually would work when you have them in the home, have you thought about something that -- I know there's the long, long set of discussions around privacy related to some of these things, but perhaps maybe if the value was sort of balanced out a little bit, giving the consumer some value in having these in their products. >> Stephen Baker: Yeah, I haven't come up with great ideas about it. But it just seems to me that increasingly we're going to be making deals where if we agree to use services and provide people with our data, we're going to be getting more and more -- they're going to have to offer us more -- as people become aware of how valuable their data is, they're going to be in a position to ask for great deals and great services in exchange for it.
And I think that's where there are going to be a lot of business opportunities for companies that figure that out. But I don't have specific examples of that. >>: Obviously the fear factor for people is huge. I mean when you're in the radio and stuff. >> Stephen Baker: That's what people are calling that. They want to go off the grid. >>: So but do you think as data collection becomes more sophisticated, if we have simultaneously much more sophisticated security, that that will allay some of it? >> Stephen Baker: I think people a decade ago would never put their credit card information online for e-commerce, that was a big deal. Something happened to convince people that that was okay. And I think there will be some demystification that goes on. There's real scary stuff and there's fake scary stuff and they can divide the two, hopefully. >>: One thing -- I'm in market research so I'm cheating a bit, but talking about data privacy is to think about the privacy and data at the point of collection rather than at the point when you're going to use it or just stuff it in a new database somewhere, so you can have separated it from identifying information. So a lot of times this data is collected inadvertently or sort of, I don't want to say subconsciously, but they're not really thinking about it as collecting data, they're thinking about it as part of the process. And if they had built in thoughts about privacy and retaining that information at the point where it was collected, you stand a much better chance of maintaining privacy. >> Stephen Baker: Right. >>: Anyway. >> Stephen Baker: No, I think that's true. I think this has been just a random helter-skelter evolution so far, without many rules guiding it, without best practices, and with a ton of really valuable wasted data that's not analyzed. And I think people are going to get, figure things out. >>: On the subject of privacy, I found your NSA example intriguing.
What are your thoughts on the amount of infrastructure that they have in place to profile people, et cetera? Because they're not out to sell people anything. They're trying to find the breadth of America and mop them up. Are they miles ahead of the private sector? Are they behind? What are your thoughts on that? >> Stephen Baker: I don't know. I know they have real recruiting challenges because they're competing with companies like this one when it comes to recruiting the top mathematicians and computer scientists, and it's really hard for them, on a civil service pay scale, to compete with these big web companies. And plus they're limited to American citizens, which further handicaps them. And they have had to turn their mission from cryptography and code breaking during the entire Cold War into data mining and this type of analysis. And I can't imagine that they're ahead of the cutting edge private sector companies. But that said, I got no details from the guy. He said they supposedly have the biggest math shop in the world. He went on and on about the huge challenge of weeding, of finding truth in these mountains of data and leads and all the rest. And I mean we haven't had a terrorist attack in this country since 2001. So far be it from me to say they're not doing an effective job, but somehow -- >>: Anthrax. >> Stephen Baker: You're right. >>: That was the wrong guy, though. >> Stephen Baker: No, but it was a terrorist -- it was sort of a terrorist attack. I mean I don't know. It just seems to me that that was one chapter I really had trouble writing because it was not -- they're not successful -- it's not a good science for them, really, for the most part. They don't have any -- they don't have good patterns of behavior or known patterns of behavior of terrorists the way they do Cheerios buyers or home buyers. They don't have that kind of data. And it's a little bit like the space shuttle, you know?
The space shuttle has had two accidents and so it doesn't give them much of a sample to work with. And so what I did in that chapter is I looked at the two different approaches you can have toward it. One is the statistical analysis of just data mining and the other is going through databases looking for correlation, looking for phone numbers that overlap, names that overlap, aliases and things like that. There's this software called NORA, which is non-obvious relationship awareness or something. Anyway, it goes through and tries to find people. The guy who created NORA thinks that he could have stopped 9/11, they could have stopped it if they used something like his, because it was clear that two guys who were associated with the bombing of the U.S.S. Cole were living in Los Angeles and they were in the phone book; the data was there. So any other questions? I appreciate you coming down today and spending some time with me. Be happy to sign any books if anybody wants one. >> Kim Ricketts: And the blog. >> Stephen Baker: This is my blog, thenumerati.net. My contact information is there. If you want to get in touch with me, my e-mail is there. Feel free to leave all kinds of comments on the blog. I hate having zero comments on a post. [laughter] It's depressing. >> Kim Ricketts: Thank you. [applause]