>> Lee Dirks: Thank you so much, so much for taking part of your day to come out here. We're thrilled to have you. Welcome to Seattle, more specifically welcome to Redmond and to the Microsoft Corporate Campus. My name is Lee Dirks. I'm the director for education and scholarly communication on the Microsoft Research Connections team. And a little bit of background about myself. I'm a librarian. Yay. [laughter]. Actually I went to two library schools. One was not enough. So I got my master's degree at Chapel Hill. Do we have any Tar Heels in here? One, two, three. All right. I'll see you later at the Duke game. Okay. We're going to be watching some basketball tonight. Got my master's at Chapel Hill and at the time when Columbia still had a library school, I went up to Columbia and got a degree, post master's in preservation, preservation administration specifically. And I'll spare you my whole career progression, but I was very lucky enough in 1996 to be named Microsoft's corporate archivist. So I started here about 15 years ago as the corporate archivist, ended up managing the corporate library for a few years, moved into marketing, if you can believe that, and worked in market research for about five years. And then I was lucky enough to run into this guy, Dr. Tony Hey, at I think it was a library fundraiser event at the University of Washington. And he said what's a librarian doing over in marketing, you need to come over here into research. So I was lucky that he lured me over. And I've been in his team about a little bit over four years now. Just to quickly tell you a little bit about the team. Microsoft Research, Peter Lee is going to give you a more in-depth overview of Microsoft Research, but Tony's team is technically responsible for our external research collaboration.
So all of the things that Peter will be talking about Tony is very interested in across a variety of different theme areas or kind of a portfolio of looking out and working and engaging with the community on specific research projects or research areas that Microsoft is interested in. My specific area, as I mentioned earlier, is education and scholarly communication. And that is where we look at the libraries playing a huge role. Traditionally they've played a role more, I have to tell you, at the end of the life cycle of collecting these materials, but I think we're doing a fantastic job, but we need to do a better job of working in earlier in the research process and starting to gather data, collect that data, curate that data, et cetera. So that's a little bit about me. We've been -- Microsoft Research has been a sponsor of the iConference I think three out of the last four years. It's something we see a lot. We're very interested in this community. We see a vital role for librarians and information professionals in a world of data intensive science. We're going to be speaking a little bit about that. And this is a community we want to continue to engage with. Over the course of the day, you're going to hear a lot about Microsoft Research. You're going to see and meet some of our researchers and get a little kind of deep view into their research, researchers we hope that you'll be interested in. It's going to be a pretty broad palette. There's going to be things kind of all over the map. But I think you'll see the applications and the potential. And then in the last session in the afternoon, we have folks that are going to be coming from our product groups, from different specific products or technologies that I think you'll see some interesting cutting edge things. So we have folks working with Surfaces, folks working with Kinect. You'll see a user experience guy from Kinect. I think you'll find it very, very interesting. So logistics, just very quickly.
I think everyone's got a lunch. Drinks, you'll see that there was water in the back. And again, there's soft drinks down the hall, coffee, tea and still water if you want that as well. So feel free. The other thing I'm just going to say, we're going to point the fire hose at you for the next four hours. So if you need a bio break, feel free to duck out, or if you want to get a drink, we're going to keep talking, and we're going to ask you guys -- we are going to take one break from about 1:15 or 1:20 until 1:30. Hopefully by now you've seen the wireless codes and you know that there's power in the floor if you need to plug in your laptop. And I think I might have covered all of the logistical points. So with that done, I welcome you again. I was very, very excited to have all of you here. And I'd like to invite Peter Lee to come up and tell us about Microsoft Research. [applause].
>> Peter Lee: Great. Thanks, Lee. It's really great to be here and to welcome you to Microsoft and Microsoft Research. My name is Peter Lee, and I think I know -- I can recognize a couple of familiar faces. You probably if you do know about me, I was last time you saw me a computer science professor and a department head at Carnegie Mellon, and then I had a short stint at DARPA. But for the past three and a half months, I've fallen over to the dark side and am now the managing director of the lab here in Redmond. What I've been asked to do is to give you a little bit of an introduction to Microsoft Research and what we do and what we're about. And so let me just do that. We are just in a nutshell a lab dedicated to expanding the frontiers of possibilities in computing and expanding the modes of human experience as they relate to computing. And that is something that's very important for advancing the field. It's also important for the future of this company that is completely committed to computing and computing technologies. We have a mission statement.
And this mission statement has been the same for 19 years. We are 19 years old, in fact, going on 20 years, and for the entire history of Microsoft Research, this mission statement has been the same. And it's worth going through. It's remarkable in its simplicity. First, we have the mission to advance the field in the areas we do research. That statement is actually a bit subtle. Advancing the field means that we're committing to doing basic research even in the foundations. And as I'll get to in a little bit, you'll see that we do really a broad range of different kinds of research activities, some very, very much blue skies, others really directed at helping our product teams. The other is in areas we do research. And what that implies is a level of autonomy and independence. No one in the company is allowed to tell us who to work on or on what problems we choose to tackle. Microsoft Research within the context of Microsoft is an autonomous, independent lab that sets its own agenda and pursues the research in the manner that it chooses. The second part of the mission is to transfer research results as aggressively as we can into products. And as I'll also explain a little bit later, you'll see that this is a constant concern of ours. How can we bring more and more great technology and research advances into the things that Microsoft sells? And then finally, to ensure that this company has a future and more broadly that the computing field has a future. And so some of the things you'll hear from Tony Hey and external research really pertain very directly to ensuring that the computing field very broadly has a future and we engage in this as well. And that's important to us because the more that we expand the possibilities for computing, the more that Microsoft has opportunities. So this mission statement has really been a remarkably simple and direct and consistent part of the 19-year history of MSR. Now, I'm managing director. And I have a little bit of a joke.
You know at Microsoft -- at Carnegie Mellon I was a department head and so I, as you know, if you're at university, I had no authority there and lots of cats to herd. I then went to DARPA and I found it stunning. I had absolute military authority, but there were no cats. [laughter]. Now here at Microsoft, I have a tremendous amount of authority in the corporate hierarchy and lots and lots of cats, but I'm not allowed to do any herding. And why is that? Well, we have a culture here at Microsoft Research that roughly speaking goes like this. What's my job? Hire the best people, make sure they are fully resourced, and then get out of the way. And in reality I have a prime directive from my boss not ever to tell researchers what to do. And so it, as much as possible, is a bottom-up organization. Now, Microsoft Research is a pretty big and complicated place. We have eight locations around the world, and so the sun never sets on Microsoft Research. There are three big labs. The big one, the mothership, is here in Redmond, which is my responsibility. The two other big labs are in Beijing and in Cambridge, England. Then there are three smaller labs in the Silicon Valley, in the Boston area, and in Bangalore, India. And then there are two much smaller research centers, one in Aachen, Germany, and the other in Cairo, Egypt. All of these labs really work together. And in total, we end up having about 900 PhD researchers and research engineers. Here in Redmond, we have about 350 of those 900. 260 of them being researchers, PhD level researchers, and about 90 being research engineers. So from the perspective of a university-like structure, you can think of Microsoft Research Redmond as like a school or college at university. And, in fact, we're organized into 10 departments here in Redmond. Now, what do we do here in Microsoft Research? And to explain that, I use a kind of a rubric that I also used at DARPA.
Which is a quad chart that maps out the space of basic research investments and activities. On the X axis here, we have research activities that span short-term, managed-risk research activities to longer-term activities. Research activities that require some patience in order to bear fruit. Sometimes a very long time, you know, a decade or more. And then on the Y axis we have a span of research problems spanning reactive types of problems. Pain points for product groups or societal problems that we understand today and know that we want to solve all the way up to the more open ended search for truth and beauty. The search for disruptions. And the thing that's important to understand about Microsoft Research is that we try to cover this whole space. And if we map out the quadrants of this, we have in our organizational structure here a demand that the lab actually creates significant impact across all of these quadrants. And these three blue quadrants in particular are the things that I refer to as the lanes of basic research. Let me describe a little bit about what these are. On this lower left quadrant we have short-term reactions to ongoing problems. So, for example, and I refer to this as mission-focused activities. So for example you might have heard that we're engaged in a holy war of sorts in search. And that is a technical problem that can be broken down into ideas and approaches that really hit on the leading edge of the state of knowledge in machine learning, in large-scale distributed systems, in economic models. And so for example on the specific problem of the quality of the relevance of search results, which is a measurable activity, we know that we want to beat Google -- and, by the way, let me brag that we are beating Google [laughter] and furthermore that that's the kind of activity that really requires bringing absolutely state-of-the-art concepts in computer science to bear.
And so, we have mission-focused activities that react to those kinds of known problems and try to make progress. On the upper right corner we have pure blue sky, curiosity-driven research. And we have a substantial amount of activity that is like that. Some of it is purely theoretical, some of it is applied, but blue sky in the sense that the company doesn't necessarily have an obvious interest in the outcomes. To give you an example of that, let me just brag a little bit. Yuval Peres in our theory group in this lab just won the David Robbins prize, which is a very prestigious prize in algebra, for being part of a team that computed the perfect arrangement of wooden blocks to get maximum overhang over the edge of a table and furthermore proved a lower bound on how far you can go given a number of blocks. And there you can see, I think that's a 30 block arrangement, and you can see you have to make a very judicious use of counterbalance kind of arrangements. And yes, we do try these in reality. In the orange behind is what's the previous known best solution, but now we know actually how to optimize this. Why does Microsoft care about this? Well, today Microsoft probably does not care about this. But the point here is that sometimes -- in fact, more often than you would think, somewhere down the line it becomes important for Microsoft to know how to solve these kinds of problems. In the applied space, we were very deeply involved in cognitive radio and in solving key technical problems in using the dynamically available spectrum for much, much more extended WiFi access. The fact that we didn't have an obvious business interest in this made it possible for us to influence the FCC chairman to change government policy to allow white space spectrum to be used. And so we engage in quite a bit of that blue sky activity. On the upper left quadrant we have a very, very interesting set of activities. We're looking at our understanding of the world today. So that's a short-term focus.
How we understand the world today. In what ways can we be completely disruptive? And so, for example, when we think about how people should interact with computing, we think very hard about a domain that we here refer to as natural user interaction or NUI. And that kind of thinking can be very disruptive in the short term. So, for example, in a very short amount of time we were able to engage with the Xbox product team in order to develop the technology around Kinect. And so the -- both the 3D image processing as well as the steerable microphone array technologies are ideas that at one point, 15 years ago in this lab, were blue sky, but a decade and a half passes and suddenly there's an opportunity, based on the understanding of the world today, to be completely disruptive. And then in the lower right, which I grayed out -- and I grayed it out because it's an area that has somewhat less investment but is also important for the company. And these are areas where there is a less time critical but still important process of continuous improvement or sustaining of capabilities. So some of you might know that we ship a very large number of language translation technologies in our various products like Office, and every day we have to make them a little bit better. Just two weeks ago I received a report that our English to Russian translation BLEU scores were not up to the same level as some of our other language translation products. That's a kind of a problem that requires a roomful of PhD researchers to fix. You can't just expect a product group to do that. On the other hand, it's not going to be the end of Microsoft's business if we don't fix it tomorrow. It's something that we just work on every single day in order to make improvements. And internally we do a huge amount for developer tools. The world outside sees the technologies in things like Visual Studio.
Internally the amount that we do in order to support developers and help them have predictive power on where problems are going to be is enormous. And these are things we work on every day to continuously improve what we do. So in Microsoft Research, these lanes of basic research are all equally valued, and we really push hard on all of them. And as a lab director, my demand of the lab and of my direct reports is to show significant impact in all these quadrants. Microsoft Research Redmond overall is a very interesting place. It's a big place. Just Redmond is 350 computer scientists and engineers. Divided across seven research departments, two mission-focused research centers and a pretty substantial engineering team. And, you know, at least from my experience, this is a real challenge to try to just even understand everything that's going on. On the other hand, within Microsoft, we're tiny. The whole of Microsoft Research is just one percent of the company. Of course we are highly visible within the company. And so, just given our visibility, particularly with the senior leadership of the company, you would expect we're 20 percent of Microsoft. But we're not. We're just one percent, but we're a very, very high impact one percent. And it's a very interesting thing that we are just so different. So you could view Microsoft as being a very execution-driven company and then it's carved off just this little tiny bit to try to think differently. And we're impactful. It is literally the case that every single product and service that Microsoft offers has had key contributions by Microsoft Research. So as small as we are, the impact is enormous. So what I'd like to do now is do something. It's a little silly. And some of you might have seen it if you've seen me speak before, and if you have, don't ruin it for everyone else. But one question that I often get from outsiders and visitors but also from the CEO and division presidents, is why do we have research?
You know, what's the value of research within a company like Microsoft? And that's a really hard thing to explain. By the way, I even get that question from our researchers. And so it's something that is so hard to understand that even if you realize it at one point, it's easy to forget. And so I've developed several ways to explain this. And the cutest way is through this test. So what I'm going to do here is show you a picture, and the picture will have a bunch of different colored dots. And so I will give you five seconds to count the red dots. Okay? Ready. Here we go. Five, four, three, two, one. All right. So how many blue dots were there? [laughter]. All right. What's the point? We here at Microsoft, it's stunning how disciplined and execution oriented this company is. 99 percent of this company is exceptionally disciplined and focused on understanding consumer needs, doing the product development, lining up the marketing and sales, and making sure that all of that ships on time. And the level of discipline and focus in this company is truly, truly amazing. And that's analogous to counting the red dots. That's what this company does. Microsoft Research is the tiny one percent of this company, though, that wonders about the rest. How many blue dots are there? What are all the different colors? Why are there dots? Are there other things going on? And even though none of that is directly relevant to counting the red dots, what you don't know is down the line whether conditions change where it suddenly becomes important to know how many blue dots there are. So 12 years ago when this lab started to engage in computer vision research, there was absolutely no reason for Microsoft to be interested in computer vision. But 12 years later, it suddenly becomes the biggest selling product -- hardware product that this company has ever produced. And it's that sort of idea that we produce.
We look at the outside, try to understand where the field is going and try to expand the possibilities. So within Microsoft, we are that special one percent of the company that takes a more questioning culture and flows ideas from the bottom up, not from executives and managers on down. We fail a lot. I mean, most of the things we do don't pan out. And that's oftentimes frustrating for researchers. They invent what they think is the next great thing and they usually are great things but for various reasons, especially in the marketplace, the world might not be ready for it. And that kind of failure happens over and over again. But that's what we're able to do. And luckily in a research environment you can fail and not get fired, which is not true for product groups. And we push all of the lanes of research. So with that, let me just say this is what we do and welcome again to Microsoft Research. I'm super excited to be here. I'm still in my honeymoon period. Very excited to see all of you here. And happy to hear from any of you about your thoughts. If you want, you can contact me through that e-mail address. I show that because I'm the eighth Peter Lee at Microsoft right now. So it's actually hard to find me. So thank you very much. [applause]
>> Lee Dirks: Thank you, Peter. Are there any questions? We can take one or two questions.
>> Peter Lee: Yes?
>>: Are the blocks [inaudible].
>> Peter Lee: The blocks? So there's -- if you look at the math, the paper is actually quite beautiful. So there's a parameter. So for a coefficient of friction, yes. It's actually -- it's a very, very nice paper.
>> Lee Dirks: Any other questions for Dr. Lee?
>> Peter Lee: Yes?
>>: Would you be willing to talk a little bit more about the concept of disruption?
>> Peter Lee: Yes. So disruption is something that I think is extremely challenging for any research organization, including if you're from a university, including your departments. There are two comfort zones.
One comfort zone is the zone where there are large communities of researchers that are publishing and really, you know, focused on those certain kinds of areas. We have that comfort zone. We publish openly almost everything that we do. Another comfort zone is if there are societal or in our case company needs that are well known. Disruption is an uncomfortable place to be because generally speaking it means striking out in new directions that don't really connect in our case with product groups and don't really connect with where the academic community is going. And so how you encourage and push those things is really a challenge. There are a few things that I found are extremely important there. One is concepts that impinge on the idea of democratization, of mobilizing very large numbers of people or putting technology in the hands of very large numbers of people to enable surprising outcomes. That's one key strategy. A second is to constantly ask what is your vision for the application of a research result? Now, oftentimes as researchers we're susceptible to thinking about the technical problems and then dismissing what we think we want to do with it as just the application. But the vision for the application can sometimes be more impactful and inspirational in a disruptive sense than the research itself. Let me pick a non-Microsoft example of this. Some of you have heard the news about IBM Research getting the green light to go on TV with their Jeopardy-playing robot called Watson. If you look at the foundational research there, it's completely core computer science research in information retrieval, and there are many applications. In fact, there are direct applications of that type of research, say, to the Bing search engine. But the vision to say we're now going to go take this to play Jeopardy is a disruptive concept. Because it gives you a better chance to stumble across some surprising result or idea.
So another thing that we do here is to try to look at the projects that we have and ask again, have you thought enough about what are the possible applications of what you do and try not to fall into the trap of just worrying about what the product groups would bless but really try to be expansive. Yes?
>>: Just to follow up on that. The obvious alternative to reactive is proactive.
>> Peter Lee: Yes.
>>: But instead of using proactive, you're using disruptive.
>> Peter Lee: Disruptive.
>>: [inaudible].
>> Peter Lee: So, you know, my belief is that -- so I made a joke here. One of the things that happened my first month here, and I'll bet many of you at the university departments have felt the same way, because I think I encountered this at Carnegie Mellon, also. But it's just really vivid here. I -- at the end of the first month, I offered to give up any performance bonus at the end of the year in exchange for a nickel every time someone told me a story about some great thing that they invented that later became a multi-billion-dollar industry that Microsoft didn't act on. So there's a search engine, there's, you know, the precursors to MySpace and Facebook, you know, slates and so on. And so there is no problem, at least for this lab and I think for academics, to be proactive. That's what we do. We dream about the future. And we think about that future increasingly in very practical terms. So I don't view that as a problem. That just comes naturally to us. Disruptive is harder because disruptive is really trying to imagine changing world views or, you know, shifting paradigms to use a buzz phrase. And that's -- that's much, much harder. And so on the -- being proactive, we do have a challenge in trying to get people to listen and have the imagination that yes, this could be a big business for us. But that's something that we're just wired to do, and we do that every single day. The disruptive thing is, I think, where the truly important things will take place.
Yeah?
>> Lee Dirks: Maybe one last question. We need to stay on schedule.
>>: [inaudible] and my question is, is there a percentage of project [inaudible]? A second question is it's known that large corporations are not particularly creative but they tend to become kind of an elephant. So the question is, is there any particular process within Microsoft to get your juices flowing and to get you guys, you know, to come out with creative ideas?
>> Peter Lee: So, let's see. So this is a great couple of questions. And it's one of these questions that's impossible to answer without sounding a little bit defensive. So let me just say that and very defensively say we've got plenty of creativity. But I think there's a -- in a lab like Microsoft Research we do a lot of things that are disruptive. And disruptive can have a dark side. Disruptive can mean inventing things that really could emerge as viable threats to the health of the company. And so it is true that there is a very delicate and complicated and constantly changing interaction that we have with the company. So for example if we invent say new browser technologies, which happens, there are very, very interesting discussions that take place with the Internet Explorer team, just to take that as one example that might be relevant here. And so -- and my observation is that no two discussions are the same. The business conditions and the challenges for each of our product groups are different. But it is important that those discussions take place. And so one thing that is important here at Microsoft is that every product group listens to Microsoft Research. Even though we're so tiny, we're engaged with literally every single part of this company. Next month Tony and I will be involved in a show called TechFest. TechFest will be a big conference where we will show off demonstrations of some of our most exciting research and 5 to 7,000 Microsoft developers will flock to come and see what's going on.
I mean, it's just one sign of the kind of sort of interactions that we have. But then how we take that to reality is always just a one-off kind of thing.
>>: [inaudible] such organizations need to renew themselves, as you say, to continually be relevant. And I think what Rick Rashid has done is trying to make us a little less comfortable and that he's deliberately done that by introducing a disruptive influence. So I think Peter's really great news.
>> Peter Lee: Yeah. So there are product groups that just love us and then some of the same people in the same product groups also just hate that we're doing -- that we're making their lives harder. And that's exactly what we're supposed to do. All right. Thank you very much. [applause].
>>: Thanks, Peter. That was great. That really was good.
>> Lee Dirks: And now I'd like to introduce Dr. Tony Hey, who is the corporate vice president for the Microsoft Research Connections group.
>> Tony Hey: Thanks very much, Lee. Thank you very much, Peter. So what I'm going to do briefly is tell you a little bit about the paradigm change which is the nature of the fourth paradigm. So let me just explain a little bit about what it is. We've been looking at -- my Research Connections team engages not only with the computer science community, the library community and the iSchools, but also with scientists. And in science we're seeing a veritable change of scale in the amount of data. We're going to generate more data in the next year than we've generated in the whole of human history and all that stuff. It's really true. And scientists are going to be drowning in data. And how they analyze it and how they report it and how they do their science is going to change. So this was Jim Gray, my colleague who alas is no longer with us. This was the subject of his last talk to the National Science Board. And he had this vision of science which is sort of relevant to the iSchool vision of X-Info.
So he saw that many fields like chemistry or biology were developing different fields like computational chemistry, computational biology, but at the same time as emphasizing the computing, there was also a data and information side, bioinformatics -- chemical informatics and so on. And I now see there's astroinformatics and archaeoinformatics. So -- and he felt that scientists really are going to have to struggle to deal with all this data. And you have all this sort of stuff coming in, and you want to get questions and answers but you have to take the data in, you have to manage -- could be large amounts of data, petabytes, but it could also be just an order of magnitude more than you're used to, so going from megabytes to gigabytes, gigabytes to terabytes and so on. Knowing about schemas, how to organize it, how to reorganize it, how to share it. And how actually to write it up, how to document all these things. So that was what he saw that scientists were now in many cases being faced with. So this is a slide from Jim's last talk, as was the last thing. He saw that we have experimental science for thousands of years. And encapsulating in the traditional ways of doing science since Isaac Newton and Galileo and people like that, we've encapsulated that into laws of physics which we express mathematically and we can explore that mathematically and that's theoretical physics and Maxwell's Equations, the Navier-Stokes equations, [inaudible] equation and so on -- we can make predictions and do experiments. And that's the way we've done science for hundreds of years. But in the last few decades, since about 1950, we've had computers. And they actually are, as Ken Wilson, who won the Nobel Prize in the '70s for his work in particle physics, called them, the third paradigm. Because how can you do investigations of climate change, how can you do investigations of galaxy formation other than doing simulations?
And so he was doing a type of simulation which was only possible on the computer. And so he was doing if you like computer experiments. And so to do that, you needed scientists with -- yes, you needed to know experiment, you need to know theory. It doesn't replace them, but you need to know a new set of skills: numerical methods, computational algorithms, computer architectures and parallel algorithms. So all these sort of skills were different from the previous ones. So he, Ken Wilson, saw that as a third paradigm. And what Jim was pointing out was now we're seeing in biology and for example the genome, we're all going to have our genome sequenced if we want in a few years' time for maybe less than $1,000. And that's going to give huge amounts of data, the possibility of personalized medicine and so on. So we have all this data coming from all sorts of new types of sensor networks, simulations, all sorts of high throughput instruments. And we need a set of tools and technologies. Now, in the UK I used to run what was called the eScience program, and eScience if you like is a set of tools and technologies to support, you know, combining data, managing data, sharing it and analyzing it, visualizing it and exploring it. And you'll see some examples of some of the tools we've been looking at applied to data. So my colleagues and I produced this book which you'll find in your bag. If you've already got a copy and would like not to take one back, please leave your copy at the back because we're happy to keep them. But what it is is it's a collection of essays looking at biological healthcare sciences, environmental sciences, the infrastructure you need, and it is looking at a scientist with an IT or computer scientist looking at where their field would be five years from now. So they're visionary accounts, not particularly Microsoft based, but just looking ahead as to where science is going.
But in parallel with this revolution, there is another revolution which is also captured in the book in a section edited by Lee, which is on scholarly communication. We're in the midst of a revolution in scholarly communication. And I'll come back to that. So you can get this free. It's the only Microsoft thing I think, as far as I know, published under a Creative Commons license. So you can download it from our website, articles, the whole book if you want, or you can get a version on Kindle or you can go and get a print on demand copy from Amazon. So just to give you a quick couple of examples of data intensive -- I've called it research rather than science because I believe this is not just science, this data explosion will also apply to the social sciences, humanities and so on. So I'm guilty of calling it eScience but really it's eResearch if you like. So this is -- the NSF are funding the Ocean Observatories Initiative, which is -- this is the Juan de Fuca Plate, and this is where it goes under the North American plate. So it's a highly earthquake prone zone. I live on Lake Washington here. And I look to the left on a good day like this, I see Mt. Rainier, on the right if I lean out far enough, I can see Mt. Baker, another volcano, and of course there's earthquake zones through the lake. So it's very reassuring to me that they're actually going to monitor the earthquake plates so we might get some data. But what this is doing, they're putting fiber-optic cable down on the floor here, and they have all sorts of instruments which could be just temperatures, it could be salinity, but it could also be more sophisticated robots, video cameras, all this sort of stuff.
And instead of actually being able to observe what goes on down on the ocean floor just occasionally when you have a ship going across the top, what they now have is data coming in not -- you know, you go on this ship, you take the data, you bring it back to the lab and you spend time with your grad student analyzing it. Here data is coming in 24 hours a day, 7 days a week, 365 days a year. The field will go from being data poor to being data rich overnight. And we've been investigating with the oceanographers how you deal with that. Do you use the cloud, do you use workflows, all these sort of things. So that's an example of an eScience application. And you'll hear later about some tools for visualizing how you begin geological time scales which is from the Big Bang which is cosmological time scales, but how you put all that together and how you can look at civilizations, the biology development, the historical development and so on. So you'll hear a talk about that. So I do believe tools for helping manage and visualize and put together different datasets in other fields in science are important. But I talked about the other revolution which is perhaps most interesting to this audience, so let me just say a few words about that. So this is based on the slide that I got in 2005 from a talk that Bill Gates gave, all right? And it was this slide. And it's very prescient in my view that you have a document and rather than this static document you really want it to be interactive data, you want to be able to find the data, you want to be able to have documents that can change if things get updated, maybe collaborative so other people can work on it, and you may want also to have rankings. You have peer review, but you might have some other rankings like for example they have Faculty of 1000 where academics can rank.
This was the most interesting paper last month, and you can actually choose [inaudible] network of those so, you know, what Tony thinks is the most interesting is never as good as what Lee thinks. So you could choose your own social network. And that gives you an alternative to peer review and maybe there were new ways of doing things. And of course, you actually ultimately also want to be able to reproduce the research. And it's a sad fact that for most of the papers that you publish now it's very difficult to get hold of the actual data that the research was based on. And that's going to be a problem if we really want to be able to build and reproduce the research. So this is, you know, a vision as to where you could go with this: be able to mash up data, collect them, use all sorts of technologies to view them. And you'll see some of those technologies today. So this is an example. Here is a typical paper. And the way chemists get data off graphs, they make a Xerox copy and they take a ruler and they measure it. And that's ridiculous because you should be able to click on it and go to the actual data. And you also should be able to click and go to where the curve -- you may want to plot something different, you may want to try a different fit. So there are all sorts of richnesses that you can add to make publications live documents. So Jim Gray in his talk -- in this book, we wrote up his last talk given two weeks before he died. And this -- he made a call, and this was the call he made about scholarly communication. We should establish digital libraries that support other sciences just like the National Library of Medicine does for medicine. And I'll come back to that. We should fund the development of new authoring tools and publication models. And that's something really that the iSchools and the iConference surely must be looking at.
And he felt very strongly we need to explore the development of digital data libraries that contain not just metadata but also contain the actual data which is curated and actually has a preservation mechanism to do it. And this must be integrated with the published literature so you can go from the paper to the data and so on. So that was Jim's call to action. And his vision was in the end that the world of science and indeed research will be just one gigantic digital library, all scientific data online and you could -- information at your fingertips everywhere, the cloud will enable you to access this from wherever you are on the planet, you'll be able to get your work environment wherever you are and take it with you. So this was Jim's vision. So I would just like to say in my last few minutes where I think universities and perhaps iSchools actually should fit in here. And I was the dean of engineering at Southampton. I was a professor for 25 years there. And I'd like to tell you some of my experiences about the library, of which I was a supporter in general, although I didn't agree with the librarian's insistence on expanding space for shelving. That seemed to be unnecessary. And a few other little things. But nonetheless I think that we have evolved. So this is PubMed Central. What is PubMed Central? It's now a legal requirement if you get funding from the National Institutes of Health that you deposit -- you can publish wherever you like, it could be in Science, it could be in Nature, it could be whatever journal, PNAS, whatever, PLoS, but you're required after I think it's 12 months that you have a full text copy available in PubMed Central that anybody, you, me, our taxes have paid for this research, after all, we can all access this research. And that's the principle of open access. And in this case, PubMed Central embodies that and the publishers have complied with it.
But it does more than that because there's also databases in this sort of walled garden of the National Library of Medicine. And a tool like Entrez enables you to go from a paper to the data, the paper, look at other data, go back to other papers. And you can do this sort of search. And David Lipman, who is at NCBI, which is responsible for PubMed Central, believes that he can sit down with a medical researcher for a couple of hours and actually help find a new discovery by navigating all this data. David Lipman is one of the most published medical researchers. So this is an example. This is an example which is well funded by the federal government. Not all fields are going to be able to have a centralized location. So we also I think have to consider federated systems and how could those come about? Well, as dean of engineering, I was responsible for monitoring 200 faculty and about 500 postdocs and grad students, so a large number of people producing lots of stuff, and I'm supposed to be responsible for their research, to monitor, see what's happening. And since the library could no longer afford to subscribe, every year we are saying which ones do you want to cancel? That's typical. And furthermore, if I want to start a new field of nano-engineering or whatever, you couldn't subscribe to new journals because there was no budget. So the whole system was clearly broken, all right. So what we insisted was that we keep a digital copy of everything anybody produces, be it research paper, research report, thesis, PowerPoint presentation, data that goes with it. So we set up an institutional repository. And the person who did it, incidentally, was my wife, who is actually a real digital librarian, unlike me. I steal all her stories. Okay. So this is an example of Open Access. It's Green Open Access. Most publishers will allow you to put up some version of your paper on your website, the departmental website. You have to read the small print.
And maybe they don't allow it at all, in which case you keep it internally, you don't let it be viewed externally. But you produce this stuff, you were paid to produce it, you have a right to keep a copy. All right. And that's what we insisted on. Okay. So that's what that said. So it isn't necessarily the final version, but it is a version that people can see. And that I think is important and powerful for the following reasons: This is an example which still amazes me from Virginia Tech. I'm sure most of you know it. Electronic theses and dissertations. In 1997, their dean of engineering insisted they keep a digital version of all theses and dissertations, master's theses and PhD theses. And in 1997 you had 200,000 requests for PDFs. And in 10 years after it was put up, you have over 20 million. I do find that surprising. Why did 20 million people want to look at Virginia Tech's theses? But apparently they do. What it illustrates is that having things available, having things accessible makes people be able to find it, they can use it, they can access it and makes your research much more visible. Another example which is a result of the repository we put up -- now, this is -- there's various rankings of universities. I'm sure you're aware of that. There's a Shanghai ranking, there's the UK Times ranking of world universities. There's also this thing called Webometrics. Now, I don't subscribe to Webometrics methodologies but I'm just looking at one of their criteria, which is what Google Scholar says are the number of papers you've got there and the citations. So they lump all this together and then they rank universities by the Google Scholar hits that they get. Now, this is the result you get. If you look at other things you'll see a different list. But here it's not so unreasonable, Harvard and MIT and Stanford are up there. But there's a few oddities.
One of the things is that Southampton is clearly better than Oxford and Cambridge in the UK, which I know is not true, but nonetheless our research is more visible and people can see it and people cite it. So having your research up there visible is actually something to do with the reputation of your university. So it's reputation management. And the research repository I think is important. This is not a perfect metric. We can do better. You can do better. You devise whatever metric you like. But certainly measuring the visibility and how often it's cited of your research papers is something of relevance. So what's the future? Well, it will not only have the full text versions of research papers and grey literature, but also it will contain data, images, software and so on. And what you need are federated databases and to be able to search across these. And I think this is a big opportunity for major libraries to play a major role and partner between them to look at the data. So we've seen the example of the centralized one. There's also WorldWideScience, which -- we had an ICSTI conference this week and they have a WorldWideScience where they access databases from something like 80 countries now, I think it is. And they have a translation service so you could type in a query in English and it goes to a Chinese or a Russian database, is translated and then you get the results back and it's translated back. So these are some examples. And if you're interested, there's -- in Europe there's much more emphasis on repositories and there's the coalition for open access repositories. And it's very difficult, for example, to find a US university on that list. Where is the future? Well, I think the cloud is going to be part of everybody's future. And cyberinfrastructure is a vision where you have some of the services locally, you may have copies of the data, you may have part of the data, but you may have access in the cloud. You may have compute services, you may have storage services.
You certainly will have identity services. We already do have that. We have blogs and social networking. So a mixture of resources local and in the cloud will, I think, be part of the future. What about the library future? I'd just like to end with this comment from Paul Ginsparg. He created the particle physics archive called arXiv, which everybody uses. Every particle physicist -- every physicist and astronomer, mathematicians, every morning they go and look at what's being newly put on arXiv, rather than the archival papers that are published. So he says on the one-decade time scale, it is likely that more research communities will join some form of global unified archive system without the current partitioning and access restrictions familiar from the paper medium, for the simple reason that it's the best way to communicate knowledge and hence to create new knowledge. And secondly, ironically, it is also possible that the technology of the 21st Century will allow the traditional players from a century ago, namely the professional societies and the institutional libraries, to return to their dominant role in support of the research enterprise. And I think that's a noble vision. Thank you very much for listening. [applause].

>> Lee Dirks: Are there any questions for Dr. Hey?

>>: Just before coming here our faculty had a very interesting e-mail discussion about a repository that the library at our university is trying to get off the ground. And the discussion was about central repositories versus the local repositories of, you know, universities like the Virginia Tech one you mentioned or the PubMed Central repository that you're talking about. And the question is, if you have a central repository that all of the researchers in the world know about, why would you do a local repository, for example, right?
And then the challenges of managing all of the publications from all your institutions and making sure that they are up to date and that people have access, and of course the fight with the publishers when they don't want you to put them up there.

>> Tony Hey: Okay. The question was, you know, if I summarize it correctly, should we have a centralized repository, what's the point of having local ones if we have a big centralized one, and what about the publishers? The point is I don't know what should replace them. I have no -- I edit a journal for Wiley, all right, so in the interest of full disclosure. I don't wish to put publishers out of business particularly, but I do know that the present system is broken. And I don't know what's going to replace it. But it's clearly ridiculous that, you know, they get all the material electronically and now if you want a paper plus an electronic version you pay more, the prices have gone up 10 percent for the last 15 years, library budgets haven't. Something's got to break, and I don't know what it is. So I don't want to put them out of business, but things are going to change. So I'm going to put publishers on one side because we're in the midst of a revolution. What about centralized ones? Well, even if you have a centralized resource like the National Library of Medicine. My wife worked on a project on oceanographic data in Southampton, and there is a national oceanographic archiving service in the UK. And they take some of the data. But there's also other bits of data which are not taken. And I think the institution has a responsibility to do that. And I think, you know, my students never went in the library for doing research type things, engineering students; they went there because it was a warm place to work, they could get WiFi, they could chat to their friends. But that's confusing the library with Starbucks, all right? And actually what's the purpose of a research library?
Well, actually it must be to work with the researchers you have to guard the output of the institution. And similarly with the teaching role. So libraries need to reinvent themselves to be relevant. And I see that you will need -- yes, you may have centralized ones but it won't contain everything, and the universities must be responsible for what they have. But I do see the possibilities of federations of major libraries, for example, in the US, where they have a lot of resources. The resources are already there. And they have money. The money at the moment just goes to the publishers but it will change. I don't know how it will change, but it will.

>> Lee Dirks: One more question for Dr. Hey, please.

>>: Just a quick question. You had mentioned, or maybe somewhat of a comment as a question, you had mentioned the theses and dissertations, why it was confusing that a lot of people would want that. And I know from one of my researches on cybervictimization, and that's a very evolving -- especially in relation to disabilities when I look at the intersection of those two very much evolving, I like to cite dissertations sometimes because the latest work is sometimes being done by grad students and is not yet out in a journal. And it seems like dissertations and theses are even more walled off than the journals because ProQuest kind of keeps them behind this walled garden that you can only often get in your own university unless you're at like Virginia Tech or some other schools. Now, do you see that changing also in maybe the next 10, 20 years where they have an open library where all the dissertations and theses instead of sitting on a dusty shelf, or the PDFs that are already out there that are sitting behind ProQuest's wall, can be accessible to researchers? Because that would benefit I know a lot of my research.

>> Tony Hey: Well, apart from PDFs I agree with you, all right. I think PDFs actually you lose a lot of the information if you lock it up in a PDF.
>>: [inaudible].

>> Tony Hey: But, yes, I was -- my son just finished a PhD at Berkeley. And I said you got to make your thesis available online. So he said I want to do that. And they said okay. It will cost you 80 bucks and you have to pay for it. And I said this is ridiculous. I talked to ProQuest and they said oh, that's surprising because we only charge Berkeley 50 bucks. So Berkeley were making a profit. [laughter]. Some numbers like that. Anyway, but that has changed apparently. So what he did was put it up on his website. So I do think that universities need to be more proactive. And I agree with you. This is the latest research. Last example, then I'll get off the floor. I came from particle physics where we have a -- we have this preprint thing. ArXiv is an example of instant research, you can find out on it what's going on. In computer science we have a very strange system that our journals take two years to accept a paper and it takes two years to acquire some hits. So if you want to find out what the most significant paper was four years ago in computer science you can find out from Web of Science. But if you go to the astronomers you can find out what the most significant cited paper was on black holes last week because it's all accessible. And I'm absolutely sure that what Paul Ginsparg said at the end, you know, it is the way to do research better. So I think it will come. But I think it's -- you know, this community is going to help change the world. I don't know where it will go quite, but I think it's your job to find out. Thanks very much, Lee.

>> Lee Dirks: Thank you very much, Tony. [applause].

>> Lee Dirks: Our next speaker -- make sure my mic is on -- our next speaker is Curtis Wong, principal researcher on the eScience team actually in Tony's organization. But Curtis has been with Microsoft for several -- many years and has done some amazing, amazing work, a broad variety of different projects.
And every single one of them has been a tremendous -- in and of itself a tremendous achievement. Most recently he's been working on -- with a couple of other colleagues on WorldWide Telescope. And I thought it would be very interesting to have Curtis walk you through some -- through that tool and through some of the latest developments they've done with that. So I'll hand it over to you, Curtis.

>> Curtis Wong: Thank you, Lee. So a little bit of history. Tony was talking about Jim Gray. About nine years ago Jim and I met and talked about a project he was doing around the data from the Sloan Digital Sky Survey. And they were thinking about creating a site which was -- ultimately became SkyServer where they would bring together all the Sloan Digital Sky Survey data and make it available to everybody, essentially the first sort of public astronomical data archive. And I went to him, I said, you know, having the data available is great for researchers, but there's also another opportunity, I think, to really inspire and educate people. I think having the data there -- the imagery there is a great start. But what I'd really love to see is I'd love to see a really rich visual environment that essentially emulates the night sky where kids that live in major metropolitan areas -- like I grew up in LA, and I never saw the Milky Way, and 70 percent of the kids live in major metropolitan areas -- can be able to start to explore the night and to be able to see everything from the broad swath of the sky to the highest resolution imagery. And not only that, but I also wanted the opportunity to be able to hear from astronomers about what it is that they're seeing. Because everybody's seen these beautiful Hubble images but most kids have no idea how big that is or where it is in the sky or really what's going on with it. So part of this was, you know, I made a pitch to our VP of research at that time. I wanted to build the biggest telescope on the Internet.
And he says well, I'm not quite sure exactly how that's going to help Microsoft's business, but we're in research and I'm looking forward to seeing what you're going to do with this. And so we started this project about 2006 and started with me and a developer Jonathan Fay. And so what I'm going to talk about a little bit today is take you a little bit into WorldWide Telescope, probably show you some things you may not have seen. How many people have seen WorldWide Telescope? Oh, not too many. Excellent. Well, maybe you'll be surprised. So I'm going to go into WorldWide Telescope. And if you don't mind I'm going to turn the lights off since we're going to be looking at the night sky. And there maybe you can see that. So this is the Milky Way in visible light. This is the largest continuous color image of the visible light of the night sky that exists. It's a trillion pixel image. So you can see [inaudible] go right into the center of a galaxy. And in addition to the visible light view, we have the ability to look at the night sky through microwave, the dust, the hydrogen, essentially 50 different wavelengths of light in the night sky. I could pick this as a foreground layer. And this is a supernova remnant. So if we zoom into the supernova remnant you can see the signature of that. But if I go back and cross fade, I can see the visible light view behind it and see the debris clouds from the result of that supernova remnant. And this was actually one of the important principles that we wanted to do in bringing this all together. Because in astronomy much of the imagery is sort of contained within the different silos of the -- you know, the particular survey, you know, the Hubble concentrates on visible light, the Chandra on x-ray and Spitzer on infrared and how do we bring all that together to provide sort of a larger context for a lot of this imagery?
And within the capabilities of WorldWide Telescope now we have the ability to bring in you know surround work and have their own images displayed on the sky at the exact right location so that they can again compare something that they're looking at across other spectrum. You can also look at other image studies such as these from the Hubble. And we have many, many of these. If I clicked on M42 up there, I notice that's off the screen. So down here in the lower right you can sort of see that blue rectangle. Kind of like Powers of Ten. We're giving you the sense of scale and where the -- where that object is in the sky in the celestial sphere and what the field of view is. And because these are very high resolution images you can continue to zoom in right down to this little black grain of rice which is an edge-on view of a disk of dust which is forming into a solar system and a star is being born in the center there. So I mentioned before the idea of these guided tours. And so guided tours are being produced by kids, by educators, by professors and astrophysicists. And there are different categories of them here. Here's one that's done by a 6-year-old right next to one from Harvard University. And this is --

>>: The Milky Way is a spiral galaxy, but it's hard to see that because we're inside of it. Here is a spiral galaxy not far from us, about 12 million light years away called M81. If we look at it in optical light, we see billions of stars shining together in a spiral pattern. If we look at the heat from M81, rather than the light, it looks like the false color R and G image we see here. This Spitzer Space Telescope image uses long wavelength cameras that can see heat just like the one that took the picture of this cat.
Galaxies are filled with tiny particles called dust that absorb the light of stars --

>> Curtis Wong: So I can pause this, and even though it feels like a video, it has that [inaudible] narration and music, it's actually just a path within this visual and data environment that we have. So if I wanted to see what the context of that image was, I could zoom back out. If I wanted to see what that image looked like from the Chandra X-ray telescope, and there are many sort of other contextual images that would overlay that particular object, you can see it. If I wanted to branch off to a different tour that talked about the interaction between this galaxy and a nearby galaxy right there, M82, we could do that. But another really important capability is that behind every object I can right click and pull up a little information window where I can say -- if I was a kid doing some research on a paper, I might want to look in Wikipedia. Or if I was an astronomer and wanted to find out what's the latest paper that was written that cites this particular object in ADS, which is the Smithsonian astrophysical database? And here you can see it did a realtime query and it found 2,625 papers of which the most recent one was, you know, last month. You could also get original source imagery. As I mentioned before, a lot of imagery was contained in the specific telescope areas and different surveys. If I wanted to find a DSS image, I don't have to know where the database is or the query, I can just say go get that image for me, and it would bring it right here for me. I could get science level imagery as well. In addition to this particular rendering of the image environment, we also have a full 3D model of the solar system that goes all the way out to the local universe. And you could sort of see the orbit of Pluto right there. And we could go right into an individual planet, but in the interest of time, I'm going to go back out a little bit.
Here's the constellation Orion behind it. If we pull back away from Orion, you'll notice the local stars in our arm of the Milky Way start to change. Because if you look out this way, you can see that the constellation, the stars in the constellation Orion are actually at different distances. As we exit the Milky Way and go out into the Sloan Digital Sky Survey, these are the million galaxies from the Sloan. Essentially the 3D manifestation of the original work that Jim Gray was starting to do back in 2002 with Alex Szalay and that team, but even though this is a really interesting visualization it's actually a connected view like we were talking about earlier. If I wanted to know what was behind that particular galaxy just to randomly pick one, we go right into the Sloan site. Here's the image of the galaxy. You could see the redshift. I could download the data for that. I can pull up the spectra for the chemical composition of it. It's all right there. So one other thing I'd like to show you is the more recent work, which is around bringing in data within this particular space. And to do that, we'll start with something a little bit more familiar, which is around the Earth. So I'm going to go over here and bring up the Earth. And got this up here. And bring up some data. So here's some data in Excel. It's of earthquakes from the year 2002. So you can look here and see that there's about 40,000 rows of earthquakes here. Kind of difficult to comprehend given the amount of information in there. But what we can do is we can bring that data into WorldWide Telescope, I can bring up a layer manager here, and you can start to see what that data will look like. Here we go. So here's 40,000 earthquakes in that particular year. Now, we can play around with some of these properties here by perhaps changing the scale of them so we can see them a little bit more densely. But when you start to notice these are -- we actually have depth data in here.
And you're starting to see these weird clusterings of the earthquakes happening in the middle of the ocean. But if we pick a different underlying image layer behind it, you can start to see the context of that in that a lot of these quakes here are really in the subduction zones between ocean and the plates. Now, we have data in this environment. One thing we'd like to be able to do when you have lots and lots of data is to be able to annotate them fairly richly. And so within WorldWide Telescope just like the tours you saw earlier we can create tours that are essentially within this environment that can help us take a look at data in this particular space to call out interesting anomalies. So I'm going to take this particular tour here and open it up and I'm going to play it. [music played].

>> Curtis Wong: So I could easily be annotating this with [inaudible] what it is [inaudible] or something that it wanted to call out. I could add in any kind of graphical elements. Any element that's in here I can have it link to a website outside or some other data source. So you can imagine when kids see something like this and they start talking about earthquakes it becomes very, very obvious about some of the things that were happening. We just saw the area of Seattle about what Tony was talking about earlier. But here of course you can see the clustering of earthquakes along that subduction zone along the coast of central California -- Central America. And so even though this is like a video in a sense, we can compress time. And we're just looking at those 40,000 rows of data over about 10 or 15 seconds. But these tours are also fully interactive as well. You can sort of pause and explore because you're essentially in that data environment. I could hover over any of these data points too and you could start to see the metadata about that particular quake. And here is the coast of Chile with all of its aftershocks.
So I'm going to show you one more thing which is another tour which is actually a tour that was created by Harvard in commemoration of an astronomer named John Huchra, who worked -- many years ago on the large-scale structure of the universe. And what's interesting about this particular tour is that it -- it's a very, very rich data tour within the environment. But everything that's in here, they mention things like papers and other things, every paper, everything there is linked to the original paper. So while you're getting the story of John Huchra's work, you can pause and drill in to all of the underlying material related to the data and research around what John Huchra did. [video played] >>: Lucky for us John Huchra saw our ignorance as a fantastic opportunity. As an astronomer his principal goal was to learn the three-dimensional distribution of matter in the universe much as we know and understand the surface of the Earth. Here is a three-dimensional view of the top of Mt. Hopkins in Arizona, the site of many of John Huchra's astronomical observations. Adding altitude measurements to a two-dimensional map of this part of Arizona gives a 3D view in the same way adding distances inferred from galaxy redshifts to a two-dimensional map of the sky can make a three-dimensional map of the universe in this [inaudible]. >> Curtis Wong: I'm going to jump -- >>: [inaudible]. >> Curtis Wong: This is their first survey they plotted against the sky. So this is really data plotted against the visual imagery of this -- of WorldWide Telescope. [video played]. >>: It's easier to understand this diagram if we look at it in context. So let's put it before we dissect its meaning. Shown here again are the galaxies in the first strip as projected on the sky in two dimensions.
If we color code the same first strip galaxies in a 3D view of the universe given to us by the modern Sloan Digital Sky Survey we can see how a strip on the sky translates to a wedge in three dimensions. >> Curtis Wong: So you can see the scale of it is much smaller than the modern day Sloan [inaudible]. So I'm jumping to another part of the tour. So here are the 18,000 galaxies of the CfA2 survey which are color coded and plotted in the sky, color coded by the redshift of that particular galaxy. So again, this is a fully interactive experience in that you can get to any individual galaxy behind each one of those plots and see the underlying redshift behind it. So that sort of gives you an idea of an example of that. And I want to make sure that we have time for questions, so I'm going to go back here. But I think -- thinking about big data and the opportunity around big data around astronomy I think some of the opportunities are first making the data available and making it broadly available to the public. The second is bringing together tools that are really simple and intuitive to use for the public to have access to that data, allow for rich annotation and a collective way to sort of surface it and make the feedback connection back to the scientists to sort of make them aware, kind of like what happened with Hanny's Voorwerp, which you may have heard about, where a Dutch school teacher identified a new object in the night sky. So, anyway, thank you. >> Lee Dirks: Thank you very much. [applause]. >> Lee Dirks: Maybe a question or two for Curtis while the next speaker comes up. We do have a question. For a second I thought you were stunned into silence after that demonstration. >>: So thank you very much for that presentation.
One of the -- one of the issues that I think all of the visualization that you've shown us today raises from a science perspective, you can do the visualizations but a lot of the predictions, for instance the National Science Foundation is fast forming teams for research in distributed modes and the collaboration aspects, trust aspects of the scientists who may not fully know each other. >> Curtis Wong: Right. >>: In human dimensions. >> Curtis Wong: Yes. >>: So -- >> Curtis Wong: So how do you contain that? >>: Yes. So thinking about how that all plugs into this, because eScience won't be successful without this kind of fast forming -- >> Curtis Wong: Right. Where sort of the integrity of everybody that's involved in the early stages of science. I mean, I think what Alissa is doing with this, as she did in that earlier tour, is she has her datasets, and she can publish these paths and annotation about her datasets in this environment that are of course private. Because what I showed you in these tours, they're just a stream of metadata, relatively small path that shows access to the data as well as these paths in the data as well as the rich annotation about the data and that she can share that with a number of her colleagues and then they also have access, takes them to the data, and is contained within themselves. And so when you're ready to publish, it's sort of a different story of how you then sort of make something a little more analogous to the last thing that I showed you and perhaps that's an idea. I'm not saying this is the solution, but I'm hoping to try some of these new ideas and see what people think and hopefully it will inform the process a little bit. So another question. >>: [inaudible] accessibility you've shown some examples [inaudible]. >> Curtis Wong: Yeah. >>: So it looks like a pretty complex interface in general. How accessible is it and who -- you know, to I guess people with varying levels and talent. >> Curtis Wong: Right.
Well, if I had time I could show you. The authoring is built into WorldWide Telescope. And it's very analogous to PowerPoint. So all you're doing is if you can see something, you can create a slide and it's the starting point for that. And if you go somewhere else, like if I zoom into a portion of the sky, you just say end -- the end should be right here and you've created a camera move into that space. Then you create a new slide and then where you go from there and then it interpolates the next space. And then if you want to add narration to it, you just can click and say add narration and you can drop in an audio file. And it spans across those things. So a lot of the things that you've seen like the kids tour is just him narrating these little places that we wanted to go. And the thing you saw with data is a bit more complex. Of course you have to bring data into it. But, you know, it retains sort of the metadata about that data when you're creating it. I hope that answers your question. >>: Yeah. So would you say -- so it is accessible regardless of -- >> Curtis Wong: Oh, yes it's accessible and it's free. It's at worldwidetelescope.org, so it's a free resource. Thank you. >> Lee Dirks: Please join me again in thanking Curtis. [applause]. >> Lee Dirks: And next I'd like to introduce my colleague Donald Brinkman. He's actually from my team. And he's going to be showing a demonstration about something called Chronozoom and a little bit of the work we're doing with big history professors from around the world. Donald. >> Donald Brinkman: Thank you. Hi, everyone. So, yes, my name is Donald Brinkman. I work for -- I've worked for Lee. This is an excerpt from a longer presentation. It's going to be more about the big history and the deep zoom than the long tail. And let me just jump into it. So first of all, I just want to kind of give a little context for what we're talking about today and just talk a little bit about big history.
This is probably the most words you'll see on any of the slides I'll show you today. I try to give you something nice and light for after lunch. Big history is a field of study that looks at history in a context larger than just human history. We look at Earth history, life history, and the history of the universe. So how did Microsoft get involved with this? We were -- there's this guy, his name is Walter Alvarez. He's pictured here. He's the strapping young lad over there in the -- standing there. He's holding this little vial. The vial has -- it's full of iridium. And he's a geologist. And he's very well known for the fact that back in the 1970s he was part of a team that was looking at the KT boundary, which is the boundary between the Cretaceous and Tertiary layers of the Earth's crust. And he found these high levels of iridium in this boundary. He made this hypothesis that the reason this was here is because there had been a giant asteroid impact. And there was another interesting quality of this space that on the Cretaceous side there was lots and lots of fossils and on the Tertiary side there was very little. And so he came up with this hypothesis that the dinosaurs had been made extinct by a giant asteroid impact which at the time was considered to be an insane thought and no one believed him for about 10 years or so until we found a giant asteroid impact dating exactly to that time that had a whole bunch of iridium in it. So, and Walter was, you know, validated and he went on to continue teaching at Berkeley to much acclaim in his field. Fast forward about 30 years and we have just last year Walter giving the 97th annual faculty lecture at Berkeley, which is one lecture chosen per year to give a sort of capstone lecture. And you can see here that in his talk he's talking about big history. And he credits Microsoft Live Labs and Microsoft Research. And that's what I'm here to talk about.
So Walter maybe about four years ago started getting interested in big history, and he wanted to teach it in his classes. And he wasn't sure how to go about doing it. Now, I mean, he knew how to talk about these things. He had wonderful resources at Berkeley in terms of people from all sorts of different fields to come in and give guest lectures. But he had a really difficult time conveying time scales. And it's very hard for people to just really understand what it means to go from human history to life history and then, you know, Earth history and the history of the universe. And really it has to do with -- he had this idea to zoom. So he asked his students, he said how should we go about and do this? And one of his students, Roland Saekow, raises his hand and says why don't we just go inside of Photoshop and we'll make a giant canvas, this huge image and we'll just kind of, you know, click and zoom into it and then people can see, you know, what's going on. Which is a really great idea. And Walter says, wonderful idea, let's go and do it. Very early on, though, they discovered this problem. And the problem really had to do with zoom factor. And the problem is that when you go from 13 billion years to a single day, you're looking at a zoom level of somewhere around 500 trillion percent. This is a significant amount of zoom. And for all the wonderful things that Photoshop is capable of bringing to our lives, it only has a maximum zoom of 6,250 percent, which can't get Walter and his team very far. Although they did make an admirable effort to do it. We found out about this project. And Darren Green, who is the general manager of my team, he said wow, this is a great application for this technology called Seadragon.
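[Editor's sketch] The zoom figure quoted above can be checked with a little arithmetic. The sketch below is only an order-of-magnitude estimate: it assumes a 365.25-day year, treats the zoom factor as the simple ratio of the widest time span to the narrowest, and takes the 6,250 percent limit from the talk as given. The function name `zoom_percent` is made up for illustration.

```python
DAYS_PER_YEAR = 365.25  # assumption: Julian year, good enough for an estimate


def zoom_percent(widest_span_days, narrowest_span_days):
    """Zoom needed to go from the widest view to the narrowest, in percent."""
    return widest_span_days / narrowest_span_days * 100.0


# Widest view: roughly 13 billion years. Narrowest view: a single day.
universe_days = 13e9 * DAYS_PER_YEAR
required = zoom_percent(universe_days, 1.0)

QUOTED_MAX_PERCENT = 6250.0  # maximum zoom figure quoted in the talk

print(f"required zoom: {required:.2e} percent")
print(f"that is {required / QUOTED_MAX_PERCENT:.2e} times the quoted maximum")
```

Under these assumptions the required zoom comes out a little under 5 x 10^14 percent, which is consistent with the "somewhere around 500 trillion percent" figure and about ten orders of magnitude beyond the quoted limit.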
Seadragon is essentially this -- it's this thing these guys dreamed up where they found if you take a really, really giant image and you break it into what we call an image pyramid, so essentially you have a low res version of the image and then you cut that into four pieces or a certain number of pieces and then have the same resolution as you go down in it, the resulting total data storage of that image is about 1.3 times the original image. So it's fairly space conservative. And it also allows us to do some really amazing things with a little bit of animation and some seam stitching to create this effect of seamlessly zooming into an infinite canvas. And what I'd like to show you -- I normally show an overview of -- there's a YouTube video that you can see if you search for Walter Alvarez or if you search for Chronozoom on YouTube. It's an hour-long lecture. I show a little small excerpt from it. I'm a little time constrained, so I'm going to jump right into another video. Because I really want to show -- in talking to Walter and his team, we discovered that not only did we give them the capability to teach big history in all its magnificence, and there's a working prototype on the Web, if you go to Chronozoomtimescale.org you can go see it. But we also kind of made them into human interaction researchers. They started playing around with the ideas and what was available in thinking about education using zoom as a metaphor. And let me just pull this up. So this is Roland Saekow. He's one of Walter's students who's narrating this. And he's going to show off an experiment looking at an Italy timeline and some of the ideas they've had about how to use deep zoom in teaching class and thinking about research as well. And I just need to figure out how to turn -- there we go. I got it. Maybe. I got it. [Video played]. >>: Here we're looking at an example of [inaudible] history. In this example we're using Italy.
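[Editor's sketch] The "about 1.3 times the original image" storage figure that Donald Brinkman gives for an image pyramid follows from a geometric series: each coarser level has a quarter of the pixels of the one below it, so the total approaches 1 + 1/4 + 1/16 + ... = 4/3. The sketch below counts raw pixels only and is not Seadragon's actual implementation; it assumes each level halves the width and height, and it ignores tiling, tile overlap, and compression. The function names are hypothetical.

```python
def pyramid_levels(width, height):
    """Return the (width, height) of every pyramid level, from full size to 1x1.

    Assumes each level halves both dimensions, rounding up.
    """
    levels = [(width, height)]
    while width > 1 or height > 1:
        width = max(1, (width + 1) // 2)
        height = max(1, (height + 1) // 2)
        levels.append((width, height))
    return levels


def storage_ratio(width, height):
    """Total pixels across all levels divided by pixels in the original image."""
    total_pixels = sum(w * h for w, h in pyramid_levels(width, height))
    return total_pixels / (width * height)


# A hypothetical 65,536 x 65,536 pixel canvas.
levels = pyramid_levels(65536, 65536)
print(f"levels: {len(levels)}")
print(f"storage ratio: {storage_ratio(65536, 65536):.4f}")
```

For any large image the ratio settles very close to 4/3, which is why the whole pyramid costs only about a third more than the full-resolution image alone.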
And the first big division you can see here is pre-Roman Italy, Roman Italy, and post-Roman Italy. So thanks to zoom technology, we can zoom into any one of these boxes to look at them in more detail. So for example in Roman Italy, if we zoom in here, we'll see that's divided into early Rome, the Roman Republic and the Roman Empire. This is just an example. So we're only providing one source. But in the future we hope to provide different sources of different interpretations of history. Let's take a closer look at the Roman Empire. So what we've done down here is we've represented all the different emperors with little pictures or statues. So if we go into one of these, for example, and look at Hadrian, Hadrian is also a really good example of how there's really no limit to the kind of resolution that these images can be. So here's a quite large resolution image which is really just a small part of this map here. If we look over here, for example, here's a panoramic. And these can really be gigapixel images and that is images that are greater than 1,000 megapixels or one gigapixel. But again, that's just one small part of this entire document. Another fun thing we can do is we can embed papers. So we look down here we've got a paper talking about periodization of history. This is an entire PDF. And any part of it you can zoom into. And you can do so without any delay of having to scroll through it tediously through a PDF. And of course one of the great benefits is that figures remain sharp but they're also very quick and easy to browse, more so than using this in a standard PDF reader. High resolution images again can take place. Here we are looking at an outcrop. And if we had a gigapixel image on this particular outcrop, we could zoom right down, right on to the ground and look at an object right on the floor. In this case, the marker. For now, though, that's just an illustration.
And one last fun thing we can do, we go back in this paper here, every paper has references. So here's the references cited in this paper. The tedious thing about finding additional references, you have to first open them and download, perhaps, and open in its own reader. With zoom technology, let's say we're looking at this paper here, also by Alvarez, what you do is you slide down here, you'll see that the entire paper is also embedded right here also. So without any delay we can keep zooming. And this is a full PDF and a full resolution. We can look at any one of these diagrams if we wish. And again, we zoom out, we can see it's just a small portion. You can watch it there going away on the original document. And the original document is just part of this historical Italy example. So I hope you can understand why I wanted to share that with you guys. I'm not saying that this is a good idea. There's a lot of, you know, questions to be answered about this, but what it does is it offers us -- that's not me. What it offers us is a vision of what's possible and the sorts of things that we can start thinking about if we start thinking about not just searching for documents but thinking about semantic zoom as a method for exploring and researching. So that's the Berkeley big history course which is currently being taught this semester. There's an undergraduate course and a graduate course. And the graduate course we're actually working with a lot of graduate students to get data -- data inputs for Chronozoom V2, which is really what I'm here to talk to you about. Before I get there, I'd like to just talk -- talk a little bit about this idea of search. And I really feel that I'm going to make a prophetic statement here and say that zoom is the new search.
And essentially what I'm saying here is not that zoom replaces search but in the same way that search became very important in the first decade of the 21st century and we saw that with Google rising to, you know, its ascendancy in our culture, we're going to see zoom take on a lot of importance as a complement to what search provides to us. And I see this when I go out and I manage our digital humanities programs and I talk to a lot of people in humanities fields. When I see people -- you know, we talk a lot about books, and we talk about Google Books. And it's interesting. Because books are really complicated technology and they're very -- they're very sophisticated. And they're very long lasting. And we've actually developed these things called libraries which are another technology that are composed of all kinds of other little pieces of technology, card catalogs, you know, databases that help us to organize and access these incredibly valuable pieces of technology in our lives. Now we have with the advent of search, we've gone in this strange direction. We have this thing, you know, it's the 10 blue links. I search for a book and what I get back is it's sort of like I'm teleported to an empty room and there's -- there's, you know, the book -- well, hopefully the book I wanted. And that's the thing, search is a little imprecise. It never knows if it got quite what you wanted. So what a search algorithm will tend to do, it will give you what it thinks you want and then it will give you 10 of the things that seem very similar to the thing you want in case those are one of those things. And this is neat. And it's very powerful. But when I talk to people, humanities researchers especially, one thing that comes up over and over again is that they say I miss browsing the stacks, how do I browse the stacks like this. You know, how do I have the sort of incidental discovery that occurs.
When I go in looking for one thing and there might be something that isn't a general, you know, coupling with this, but it's a sort of incidental discovery that can't be enabled through search as we know it right now. And this is what we're trying to do, is in a sense think about semantic zooming and how we can enable a sort of, you know, Matrix-like zooming bookshelf or multimedia experience. And so what we're working on now is Chronozoom V2. What we showed you, that image -- and I do encourage you to see the full demo where you can zoom in from, you know, January -- or December 31st, 2000, all the way out to 13 billion years ago and see the entirety of the universe. It's really great stuff. But it's still a very primitive prototype. We're talking about, it's what we would call a raster graphic. It's a static image. If you zoom in on the letter A far enough it will become the size of the entire solar system and occlude all other data. Which isn't really that meaningful to you. And so what we're doing right now is we're adding in significant additional capabilities. We're adding in scalable vector graphics, continuous algorithms, dynamic text and data. What we're trying to do here is essentially provide these sorts of rich deep zoom experiences combined with visualizations of scientific data, continuously scaling algorithms and, you know, dynamic multimedia from all kinds of Internet data resources. Combining that with collaborative authoring tools, the ability to create tours in a very similar way to what we saw in WorldWide Telescope and rich multimedia experiences that are embedded inside of the authoring and presentation experience. And we're doing this -- actually we're -- I'm jumping ahead. See, I've edited my own presentation. I realize I have logical gaps. How are we doing on time? Should I -- >>: [inaudible]. >> Donald Brinkman: Three minutes. Okay. So we want to do this to enable research, but we also want to do it to enable high school education.
And there's a good reason for this. We found that with many of our most successful experiments they're best if they have a higher education research application but also a more common application. And you can see that in WorldWide Telescope which can be used in astronomy research but can also be used in museums, you know, in libraries to teach kids about astronomy, in Chem4Word, which can be used for professional authoring of chemistry articles but can also be used by high school students. And in the same way, we're really trying to look at this as something that can be incorporated into high school big history curriculums. And why? I mean, what's the point of this? I mean, one is that we can provide a more versatile tool. But there's other reasons too. If you think about this, and this is based on a little article that some of you might have seen on Gizmodo, if this is the whole of human knowledge here, so we have everything, down here in the middle we have sort of that little nugget of elementary school. So this is basic literacy, basic math that we all learn. If we surround that a little bit through high school, we have more common knowledge that we sort of join together. And then if we go into college, you get even more common knowledge and maybe a little bit of specialization. As you move into master's courses you specialize even further, start reading journal articles. You get into doctoral work and you achieve very highly specialized work and you start working on your dissertation. Eventually you push and push and now you're up against the boundaries of human knowledge. You're pushing and pushing and pushing against those boundaries until finally at some point you break through. And that thing right there is your doctorate. [laughter]. And here's the thing, I'm not belittling doctorates and if anything actually I'm overemphasizing them because I would say that this boundary here is logarithmic in nature.
It's actually quite a bit larger than just this sort of a linear progression here. This is a tiny little dot on this vast here that we're all working on. And when we think about how Microsoft Research makes its investment, we make point investments all over this boundary. And this is great. If your goal here is to advance human knowledge, then this is great because we're trying to find the right places to go and make things -- just push things forward a little bit. And this could be something, you know, that is very, very specialized or it could be something like a cure for HIV, for example. I mean these things can be very important. But here, you know, there's another way of looking at making investments in knowledge and education. So if we go back to education in high school, if we think about high school, you know, this area where we kind of have a lot of people learning common knowledge and we insert some big history, then what we can do is we can make one point investment that has the potential to teach an entire generation of kids a new way of looking at life, an empirical way of looking at life. Something that one of my favorite writers, Philip K. Dick, says is that reality is that which when you stop believing in it doesn't go away. Help people to really understand what is out there, what is real, how to try to understand things in an empirical sense. And, you know, I'm going to have to cut this short because I could go -- well, I've got a little bit more? Okay. He's going to give me some rope. I'm trying to single-handedly make up for our slippage in schedule. >>: I think when cameras are snapping [inaudible]. >> Donald Brinkman: Go forward. I will do it. I will do it. So just I want to talk a little bit about teaching and research, because we're saying here on the one hand we want a teaching tool and we want a research tool. How does that all work out? And so what we want is we want to make a tool for teaching.
And this is how I tend to think of teaching. It's kind of like white water rafting. You can see some older more experienced people out there in the back kind of guiding things down. We got some young and hopefully excited rather than horrified people in front [laughter], you know, going down this what we would hope to be an adventure and not a disaster. So, you know, if you look here, I love this picture. Because look at this kid. You know. I mean, this is like the natural love for learning that we should all have. [laughter]. And sometimes it seems like our school systems sort of turn this inside out, you know, and they provide a much different experience. And so you look at that and here we have education, you know, and then you start to think about well what about research, what about this as a research tool. Is that the same kind of experience? And sometimes we think of research in sort of a strange kind of way. We think of a guy, he's wearing, you know, last year's fashion. He's got some good technology but it's sort of a monastic experience, he's by himself. But what I'd like to say is that ultimately these two experiences don't have to be so separated. They might be right now, but they don't have to be. And one thing I talk about is games. Games is a really -- you know, almost everybody plays games, especially now digital games are obviously getting a lot more, you know, exposure out there. But these are things that go far, far back in time. Even now like for a lot of people who are, you know, with work and students, people go home and they play video games after they get done with their real life. And many of us learn games from a -- from a very early age, you know. And it's a very collaborative environment at a very fundamental level. So what we're trying to do is create a more game-like multimedia exciting experience of education. But we think we can take that also into the research fields as well.
So just to sum this up, I just want to say what Chronozoom is and what it isn't. So we're a platform to store and visualize time-stamped datasets. We're an application for authoring interactive tours of these datasets. And a tool to compare disparate datasets and enable incidental insights. We really want to bring multi-disciplinary knowledge into one place so perhaps a geologist can find out things about other fields like biology that they may not know of and find entirely new discoveries which could be meaningful to us. What Chronozoom is not is a platform for collecting and visualizing arbitrary datasets. We're really looking at a very specific application, looking at historical data in various types. It's not a platform for authoring -- for publishing authoritative information. We see this as something that we imagine in a classroom setting where a student could author a presentation and also watch a lecture. But it's -- and it might be able to create some sort of visualization that could be put into something like, you know, a scholarly article but it is not a route towards a scholarly article itself. It's not a tool to record time-stamped data. So what we need is we need authoritative and seminal datasets from a vast variety of disciplines. And this is what I was trying to talk about at the beginning. We're working with Walter Alvarez and his graduate students to sort of bring together all these -- like at least a seed set of seminal datasets. We're looking for novel visualization techniques. We're really building out a platform. The original Chronozoom was an end user application. We're developing Web services that will enable not just a single siloed application but a multiplicity of applications that might surface in HTML 5, Silverlight, you know, Facebook, iPhones, Xbox, it could be anywhere that has an Internet connection. And we really need a passionate community that's interested in using Chronozoom as a tool for education and research.
So we need you. And I hope this was entertaining. I encourage you to go out and check out the prototype, check out the YouTube lecture. Walter's lecture is an hour long but it's thoroughly worth it. He gives a brief big history lesson. And thank you very much. >> Lee Dirks: Thank you, Donald. [applause]. >> Lee Dirks: And I would like to invite the next speaker to come up and maybe get set up. And are there any questions for Donald? We'll field one or two. >>: This is kind of off, kind of a side question. But for Chronozoom, what are the storage capacity requirements for that? I mean I know storage is growing but I was wondering if there's any kind of limits or boundaries because it seemed like when you're zooming and having all these really great looking images and I know vector is going to help with that, but it just seemed like even when you had videos and complex games, you know, they take up a lot of space. So is that any challenges that you foresee now or you're kind of really hitting that head on and that's not really going to become much of a problem? >> Donald Brinkman: So it's a great question. A lot of the datasets we're looking at are actually -- especially as we go further back in time are actually not that big or that dense. It's really just being able to put them in context which is sort of a big challenge. Now, that being said, yes, there's tremendous multimedia out there, right? I mean, if you have these gigapixel images, the -- >>: [inaudible]. >> Donald Brinkman: The original Chronozoom being one enormous static image was -- is very huge. What we're doing is we're developing new technology that allows us to actually dynamically generate image pyramids on the fly using this abstract data. So you could think of it as streaming -- a very efficient way of streaming particular data and we'll cache it a little bit.
So if someone -- if like a lot of people are using the same data over and over again just for efficiency sake but a lot of this stuff sort of exists in an abstract logical space outside of -- yeah. >> Lee Dirks: Any other questions? All right. [inaudible] easy. >> Donald Brinkman: All right. >> Lee Dirks: Thank you very much. >> Donald Brinkman: Thank you, guys. [applause]. >> Lee Dirks: I would like to introduce Andy Wilson to be talking to us about a lot of his research with touch and human computer interaction. Andy is, I'll call you this, you may not claim to be this, but Mr. Surface. He's the -- he is the man responsible for that fantastic interface. So with no further ado, we'll hand it over to Andy. >> Andy Wilson: Thanks, Lee. My name is Andy Wilson. So I've been doing a lot of projects around Surface and Kinect depth sensing cameras and finding new ways to interact with computing. And what I'd like to do is show you a couple of videos of some of the things that I work on. Because I think that rather than sort of giving you a bunch of PowerPoint I think that actually speaks better to what I do. And they're a lot of fun to look at. And, you know, there's definitely a theme. But I'm just going to kind of hop through. >>: [inaudible]. >> Andy Wilson: Okay. Is that better? No. There we go. It looks like it just came on. Thank you whoever did that. Yeah. So I like to build prototypes, working prototypes as a way to investigate whether or not an idea works. So we spend a lot of time building -- building systems. And I just went through my collection of videos, and I pulled out some things that I realize now that I really like. And some of them are actually getting a little old. But I think they are still fun to show anyway. So as Lee mentioned, I had some role in the development of the Surface products.
And some of the things that I'm going to show you today are actually riffs on that or things that we were doing at the same time that we were developing the product. Different kinds of interactions, different feels, things that you don't see in the product but are still nonetheless interesting. This is a networked version of two surface like units where we use top-down projection. And I think the thing that's really interesting about this is that while you see the other person's hands, but there's no -- unlike most teleconferencing scenarios, there's no buttons or icons or menu, this is just like a video conferencing for your desktop, if you will. And so you get this sort of interesting effect where it's very analog, it's a very analog experience. And I think that that's actually a theme that's starting to come out of all of the things that I do. I tend to -- I didn't realize this until a few months ago, that I'm -- this was kind of what I like. So having this sort of analog experience. We're using these very digital systems, very powerful sensors, video cameras and interesting sensors and phones and so forth. And at the end of the day we're recreating some of the things that we do already in the real world, a very analog experience. Of course it's a little bit different. The person's not actually sitting across from you. We're stabilizing the background here or the actual chess board. That's really the only sort of smart processing we're doing with this. But you get the sort of intangible quality of the -- you know, the users' gestures, maybe some of their thinking. You can imagine coupling this with more traditional face-based video conferencing. Here's kind of a fun one. Instead of having the person come in from the opposite side, you actually have them come in around you, if you will, or from the same side like that and so you get a very different kind of feeling. 
And when I first did this, I thought oh, this is going to be really strange, it's going to feel like this person's violating your space when they come in and interact. Really not a concern. Something you get used to right away. The funny thing about this is when you're done making this drawing of course you don't actually have a drawing, you have exactly half a drawing [laughter]. You know, we could rectify that in software. That's not a real problem. Here's another one that I really like. This is something that we -- we've been picking up on and actually I think we're going to see this become more pervasive in real products before -- before too long with NFC and all the other kinds of standards. Put the phone down. The system identifies the phone and finds it, does a handshake over Bluetooth and then pulls over your camera phone photos, right? Well, when is the last time you thought oh, I'm going to transfer something to my friend using my phone and then you realize that there's this kind of impossibility to it because it's so complex? If you could just put it down on a tabletop, your buddy puts down their phone and you just drag from one phone to the next and you break the connection by picking up the device. That's something that I think is magical and makes a lot of sense. It makes -- the idea being let's make the user experience so easy that you actually start thinking oh, I can do these kinds of things effortlessly. This is just a little detail. This is what the camera sees above the surface. You can see that this is actually picking up an infrared signal off the phone. This was done in the days when we had IrDA ports on our phones if you remember that. There are other ways to do that now, NFC, near field communication being probably the -- one of the front runners. Another direction of my work is just thinking about completely different interaction modes, different ways to think about it. We're all used to cursors, right, and pointers.
And multi-touch is just multiple cursors, really. This system is a Surface-based system where we think about how can we leverage the full fidelity of the inputs? Surface is all about using video cameras to detect what's going on, detecting shape, making use of shape. And what we do here is actually bring all of the input, all of the data from the image into a physics engine, a real-time gaming physics engine. Things move because we're simulating friction forces and collision forces. So if you have any experience with video games, you know, you know that objects tend to behave as if they're real-world objects. Things collide and sometimes even friction force is simulated. It gives you a different feel for interaction. Here's just a really simple demo that I think says it all. You can lightly touch this ball to influence its direction or you can trap the ball with your hand. You can spin it. You can set the moment of inertia, the masses, the frictions -- coefficients of friction, dynamic and static. And just completely change the behavior of this ball as if it were a real object. Stacking objects is kind of a fun thing. This is a 2D -- you know, a projector showing a 2D image but it's rendered in 3D. It looks really 3D and it sort of almost feels 3D sometimes when you're interacting with it. This has a very different feel. Because we're using a physics engine, we can do all sorts of fun things with simulating cloth, different kinds of things that you may have never really interacted with before on a touch display. And again, leveraging all the kinds of expertise you have interacting in the real world. I'm going to skip ahead just a little bit. There's some kind of, you know, funny little demos that we've been playing around with, you know, just pushing the limits of what we can do. We can't really do origami with this system. We can do, you know, a single fold and then it's -- yeah, it's pretty much all we can do. [laughter]. And then tearing -- this is a fun one.
When you're doing this, there's almost a kinesthetic quality -- it's a little hard to see this. There's almost a kinesthetic quality to it, you really do feel like fruit rollups. You remember fruit rollups? That's what it feels like. Even looks like a fruit rollup a little bit.
>>: Can you turn the lights down a little bit?
>> Andy Wilson: Can we turn the lights down back there? Just -- I'm going to skip ahead. I'm going to give you a little bit of a peek as to how this actually works. This down here is a representation of what Surface sees after a bit of filtering. And then what we're doing is bringing in these red objects. These are actually rigid bodies that are being manipulated in the physics engine and so these look rather different than cursors, right, we're actually throwing down as many of these little red objects as we need to approximate the contours that we see in the image. And so we never had to decide about, you know, what makes a contact or what makes a cursor. We just put it all in there and control it appropriately. So I mentioned at the start that I've been involved in some depth camera technology. This is actually one of the very first systems that we did at Microsoft that used depth camera technology. The camera here is made by a company called 3DV, which Microsoft acquired a couple years ago. So we combine this with a projector. There's the projector sort of more or less in the same place. And then you -- these little cars here are -- they are projected obviously. And then you can drive them around with an Xbox controller. Okay. Nothing unusual yet. Then you put down this piece of construction paper and you can drive the car over the construction paper, right. And so the car takes the jump appropriately because there's a physics engine running behind all of this, and basically this is the view from the computer. This is how the computer sees the world. We're getting the range map from the camera.
Now, things are a little bit soft and fuzzy because you have to do a lot of smoothing to make this really work. But the system doesn't know anything about red objects or construction paper or these ramps. It's just taking the color data and the range data to build up this little course that's basically -- it's something that you put together and we'll show you in a few minutes here that you can actually do some things with your hands which is kind of exciting. This is the view again from the computer's point of view. You know, this is just an exploration into extending the boundary of what we know with regards to the surface and sort of how we think about surfaces. What if you really did know something about the objects that are on top of the surface in a really tangible way? And this is sort of your arcade view. I mean, once you sort of get into this sort of mindset you start thinking wow this would be really cool if I do this and, you know, you do this game and, you know, you kind of go nuts with this kind of stuff. It's really -- now you might understand why we had to do so much smoothing. A little bit of noise in this creates a giant pothole and it's really difficult. Here's a little bit of interaction where I'm actually reaching into the scene and dropping these blocks by this pinching operation. The pinching operation there's the -- I'll come back to that later. Here is the 3DV range image. Brighter things are closer to the camera in this image. It's rather similar to what Kinect gives you. Kinect's data is actually rather cleaner than that. Here is me driving the car up on to my buddy's hand, capturing that and we actually took the system to the Maker Faire a couple years ago and it was really fun to watch all the -- you know, I had all the 12-year-old boys in the crowd completely mesmerized. They were fighting over who gets to drive and who gets to knock the cars off the table. This is again the system's view.
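The range-map-to-terrain idea described in this part of the talk can be illustrated with a small sketch. It is not the system shown in the video: the function names, the overhead-camera geometry, and the simple box filter are all assumptions chosen for illustration. It converts per-pixel distances into heights above the table and then averages neighbouring cells, so that a single noisy reading becomes a gentle bump rather than a sharp spike.

```python
def range_to_height(range_map, camera_height):
    """Turn per-pixel distance from an overhead camera into height above the table.

    Assumes (hypothetically) that the camera looks straight down from camera_height.
    """
    return [[max(0.0, camera_height - d) for d in row] for row in range_map]


def box_smooth(height, radius=1):
    """Average each cell with its neighbours in a (2*radius+1) square window."""
    rows, cols = len(height), len(height[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total, count = 0.0, 0
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    total += height[rr][cc]
                    count += 1
            out[r][c] = total / count
    return out


# A flat table seen from 100 units up, with one noisy reading in the middle.
noisy_range = [[100.0, 100.0, 100.0],
               [100.0, 91.0, 100.0],
               [100.0, 100.0, 100.0]]
terrain = box_smooth(range_to_height(noisy_range, camera_height=100.0))
```

A heavier filter gives a smoother course for the simulated cars at the cost of softer edges, which matches the "soft and fuzzy" look mentioned in the talk.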
Now, of course it's really not a 3D camera, it's sort of a 2 and a half D camera, right? You can't see around objects with the camera, right? So you get the sort of effect that's almost like you threw a bedspread over your desk, right, and you're working with the -- the resultant shape. Yeah, and there's the same thing again, so you get this kind of weird effect. So that's -- that was some really early stuff. I guess we did this in 2006 already. Shortly thereafter we did a little bit more work on this physics part of it and started thinking about what more can we do and around looking at hands and physics models, getting a little bit closer. This is now the PrimeSense camera which is actually the camera that's in the Kinect. It's an early version of Kinect essentially. And the thing that I really like about this is it -- it -- because you're using the depth information and you're using a lot of the depth information, almost all of it, in fact, you get all the fidelity, all those sort of weird shapes in your hand and sort of the way that your hand sort of falls off, you know, as you go sort of toward your fingers and sort of all these things that you actually really have internalized over -- you know, over the many years of being a human. Why can't we use that today in our interfaces, right? The subtlety of human motion is brilliant. The way that you grasp objects, the way you pick up objects, these are things that we completely ignore when we're just talking about a mouse and a mouse cursor, right? And so I believe that with some of these interesting kinds of sensors that we're playing with now we can actually get back to some of those -- some of that fidelity, some of that subtlety.
This is a project that we recently completed I guess this past year and what we did was ask the question, you know, what if we took three or four of these Kinect cameras and actually covered the entire space, the entire room with these cameras and so what we did is we built like this kind of chandelier almost. So we had these three projectors, these guys here or these cameras. Those are PrimeSense cameras. Again, proto-Kinect, if you will. And we basically hung them in this kind of chandelier in the middle of the room and the idea is you would turn it on and it would transform the entire space into Surface. Surface everywhere, right? So this is a 3D model that you get when you do a little bit of calibration of those cameras, bring them into the same world coordinate -- same coordinate system. This part of the video I'm just showing a couple of little interactions that are -- that are fun. And we'll go into each of these interactions a little more deeply. So the first one is all about thinking about multiple surfaces or multiple displays. How do we move things from one display to the next, right? In the future we're going to have displays everywhere, right? That's the idea. And how do we get data from one place to the next? Well, there's our Surface kind of thing. We're doing this right now. That's just a piece of foam core. It's not a very special piece of hardware at all. And this is just a regular desk, the kind you get at Microsoft. And so when you -- so you can sense when you're touching the desk and do these kind of various surface manipulations, but then you'll notice that when he touches the one display and then touches the next, the object he's touching on the first one is transmitted almost as if like through his body to the other display. Here's another idea. You can then pick or pull the object into your hand. It turns into this red ball. It has a bit of physics behavior to it. You can drop it on to this -- on to anything really.
It doesn't need to be a piece of paper. It's just looking at the raw mesh data from the camera. And then you could pass it back and it has a sort of physical tangible quality to it. Of course there's nothing there, it's just light. So you don't really feel it there, but -- and you pass it back and forth and do these kind of manipulations. Here's another one. It's basically a menu. It's a very simple idea that, you know, you can maybe select items from a menu if you hold your hand over that spot on the floor and then pull up and raise your hand up and down to select an option. This is just one idea. You know, maybe we can do more things with widgets in space or projecting it on to your body. Maybe the menu rolls out on to your arm when you punch an object. There's transmitting through two people actually. It's showing that it's really -- you know, maybe you shake hands and the system detects that and you transfer information when you shake hands. This is a little bit of a view of how the system actually works. We have three depth cameras. This is again -- this is like the third time I've shown these kinds of images already. The -- in this case, the range is indicated by the pixel intensity. So the darker ones are closer except for where it's black. So this is sort of a base level geometry. Here's my associate Herbert Banco [phonetic] just walked in. I can recognize people pretty well just by their mesh at this point, at least people that I work with regularly. And so he's walking in. And there's a little close-up of that manipulation of that little ball that's rendered. You can see that things are a little bit noisy but, you know, with appropriate amounts of smoothing you can actually do some manipulation. There's what the menu looks like, similar kind of effect.
And then this whole business of moving things from one display to the next is a little complicated but what we're trying to do with the 3D cameras is actually emulate what Surface would see if there were a Surface unit there. Okay? So we're taking a projection of the 3D data and constructing an actual image that very roughly approximates what a camera would see if there were a camera under the table and doing the same thing for the wall and so you get these -- this kind of effect and so you can do all kinds of surface processing or surface manipulations that we're familiar with but on -- potentially any surface in the -- in the -- or any actual surface in the real world. This is a plan view of the -- of the space and that can also be used to relate one surface to another. Basically you connect from here to here through here and through three different views. So that's a system I call LightSpace. And let's see. I think I'm probably running a little bit short on time. I want to squeeze one more in here just to move back to Curtis's presentation a little bit and also show sort of completely different -- I'm going to turn down this funky audio. So this is -- this is a system we've been playing around with. Now, I showed you that pinching gesture, right? So imagine that you can actually pinch the graphics in this omnidirectional presentation, right? This is data right from WorldWide Telescope and we did a special version of WorldWide Telescope that allows you to project into a dome, something that was done with the principal developers of WorldWide Telescope. And then we basically -- if you think about a multi-touch system, you know, like your iPad or your Surface, what have you, when you touch you get a contact and it tracks that contact. You put down two, you get two contacts. Well, here we're doing this pinch gesture. It's like we're doing multi-touch in space. For every time you pinch you actually create a contact. And there it is right there.
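The idea described earlier in this part of the talk, of building the image that a camera under the table would see from 3D data, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the LightSpace implementation: it assumes the depth points have already been calibrated into one world coordinate system, that the surface is horizontal at a known height, and that a point counts as "touching" when it lies within a thin band above the surface. The function name, grid size, and band thickness are hypothetical.

```python
def virtual_touch_image(points, surface_z, width, height, cell=1.0, touch_band=0.02):
    """Orthographically project points lying just above a horizontal surface
    into a 2D grid, roughly emulating the image of an under-table camera.

    points: iterable of (x, y, z) in a shared world frame (units are assumed).
    surface_z: height of the table top in that frame.
    touch_band: how far above the surface still counts as contact (assumed value).
    """
    image = [[0] * width for _ in range(height)]
    for x, y, z in points:
        if 0.0 < z - surface_z <= touch_band:
            col, row = int(x // cell), int(y // cell)
            if 0 <= col < width and 0 <= row < height:
                image[row][col] = 1
    return image


sample_points = [
    (0.5, 0.5, 0.01),   # fingertip resting on the table
    (1.5, 0.5, 0.30),   # hand hovering well above the table
    (5.0, 5.0, 0.01),   # touching, but outside the region of interest
]
touch_image = virtual_touch_image(sample_points, surface_z=0.0, width=2, height=2)
```

Once such an image exists, ordinary 2D contact tracking could in principle be run on it for a table, a wall, or any other plane, which is the appeal of the approach described in the talk.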
This is the camera. The camera actually is embedded with the projector. They share the same optical axis and, in fact, it makes calibration very easy. The idea really of this system was that we wanted to build this thing and wheel it into the room. You turn it on and it transforms that entire room into an interactive surface, right? But of course if you did that with one projector with today's kind of technology, today's projectors, it's not going to be bright enough. So we said hey, what we're going to do is we're going to put it in a dome. The first one we built was out of cardboard and latex paint and two by fours. This one is a little bit -- a little bit more upscale. It's something that you can blow up. It's like the bouncy toys that kids jump on at parties. And there is -- here is a view of when we take RoundTable, which is a 360 degree video teleconferencing product that came out of some work done in research, and actually put it in the dome. It's really interesting. Right now there's all kinds of interesting sources of omnidirectional data, right. Here WorldWide Telescope, RoundTable, all the stuff the Photosynth guys have done, you know, all the sort of gaming stuff can be ported directly into these kinds of environments. The idea that you could bring somebody into this environment and, you know, point -- interact with and point at various kinds of omnidirectional data I think is quite real. Not every household will have one, I suppose, but -- yeah. And a few other things we've been playing around with. So that's -- I think that's all for now. I think I've probably overstayed my welcome. Any questions? Yes? [applause].
>> Lee Dirks: Thank you, Andy.
>>: Yeah, I was just [inaudible] the microphone.
>> Andy Wilson: Just go ahead.
>>: Just go ahead? I was wondering with the way the surface [inaudible] with the cameras do you see any -- do you see any integration with this with some of the new feeling surface technology like the TeslaTouch technology?
Are you familiar with that --
>> Andy Wilson: Yeah, I'm --
>>: At Disney?
>> Andy Wilson: Yeah, I'm familiar with it. It's -- that is sort of really the big problem right now everyone's having, is how do you do those kinds of things and scale them up. Right? It's one thing to do the electrostatic thing you're talking about. Something that's going to work on, you know, a device like this. Oh, by the way you have to be moving in order to feel it, right. So there's a bit of a limitation there. Some of the [inaudible] stuff is another way of doing it; it doesn't work at the scale that we're talking about. It's the kind of thing that works on a handheld device so --
>>: [inaudible] maybe in the future maybe like 10 years from now maybe with that touch technology [inaudible].
>> Andy Wilson: Yeah. I hope so. That was one of -- that's one of the objections to the physics work that we've been doing is we've been doing some stuff where we're trying to emulate the physics of grasping behavior. The problem is there's nothing there. So nothing pushes back. And so that's -- and that's the holy grail, I believe, of some of this haptic stuff. Just to do that last little bit. Any more?
>>: [inaudible] HP and their cameras. Having trouble detecting people [inaudible] is there any possibility that these installations [inaudible] distribution will include a [inaudible] camera so you can [inaudible] color not have to worry about so much whether somebody is on a color -- bright enough to be picked up by the camera?
>> Andy Wilson: Almost all of the stuff that I showed you here is using infrared cameras. They don't actually sense heat, by the way.
>>: I'll be retarded for that. [laughter].
>> Andy Wilson: It's the same frequency that your infrared remote control uses on your TV. So that doesn't get hot, right? But you can't see it. It's just beyond visible. And it just so happens that most materials in the natural world reflect infrared light. There are some exceptions.
Asian folks' hair is an interesting one. No IR off of that. And that's actually a problem. But for the most part it's not, once you go to infrared. It's a good question. Thank you.
>> Lee Dirks: Maybe one more question for Andy.
>>: Thank you for your presentation. I have done a little bit of experimenting with the Surface table but somehow your presentation went well beyond my engagement. I'm going back to an earlier question I asked about an earlier presentation about big data, eScience and collaboration over time and space by best forming teams. And you really got towards the data end at the end of your presentation. Now I'm trying to figure out how does this scale up to the kinds of things where we really want to have the kind of blended worlds between virtual and physical and use the analog ideas that you're dealing with as a way of building the human trust to form collaboration around big data and eScience. [laughter].
>> Andy Wilson: Wow. Yeah. I'm going to pass on the big data and eScience questions but I will say that we have spent some time on a very large version of our tabletop system. I didn't show that. But it is something that we believe would be very useful for showing lots of -- lots of information and getting, you know, six to ten people around something that we can do. Interestingly one of the problems that we have is that the displays that we have do not have the resolution, right. I have a 65 inch LCD panel in my lab. It's beautiful. It's still 1920 by 1080, right? It would be nice if one of the new 4K standards would come around. That's a real problem I think. And the projectors are falling way behind on that curve. And so you notice that the Surface 2.0 product is more of a like a 40 inch HD LCD.
>>: So did you -- I'm sure you've run across Ted Nelson's Xanadu from 30 years ago probably before you were born, and it -- a lot of what you're doing really, it -- I recall that very, very clearly in case you -- maybe you haven't --
>> Andy Wilson: You mean the resolution problem?
>>: No, the whole Surface manipulation idea that you're working on. It's very reminiscent of what was then a very futuristic idea.
>> Andy Wilson: I mean we owe a -- you know, a terrible debt to Myron Krueger and, you know, these guys. In many ways what we're doing is just rediscovering, reinventing. I think one of the things that's different about it this time of course is that the fidelity of what we're doing is just, you know, very different. So thank you.
>> Lee Dirks: Well, thank you very much Andy for joining us today. [applause].
>> Lee Dirks: On a side note, at the iConference back at the Renaissance we've got -- Microsoft will have a suite that will be open tomorrow and Friday, and we're going to have a Surface there. Surface 1.0; Surface 2.0 are still kind of being manufactured right now. But it's a Surface 1.0. And I encourage you to drop by and play with it. We've got a few apps that came out with it and a few apps that our team at MSR has developed. Not quite as cool as the stuff that Andy was just showing you, that's kind of pushing the envelope. But we have a few things that I think will entertain you. Now I'd like to turn it over to Adnan Mahmud who is from Microsoft Research Asia. He's actually based here in Redmond but spends a lot of his time back and forth and represents a lot of the projects and a lot of the research that's happening in our lab in Beijing. And specifically Adnan and myself have been working on something called Microsoft Academic Search which some of you may have heard of, but if you haven't I think you'll find it very, very interesting. So over to Adnan.
>> Adnan Mahmud: Can you guys hear me? I think you can right? All right. Good. All right.
So we have looked at all those cool presentations and you're wondering, man, I'd like to find out more about Andy Wilson or you're wondering, I'd like to find out more about Microsoft Research because they are doing some really, really cool work. So how do you do that? How do you find scholars in different domains and find out the work they have done, papers they have written, people they've worked with, other researchers they've worked with. So in the interest of time, I am going to do a five, maybe three-minute version of the demo to show you what Lee referred to as Academic Search. All right. So it is a beta service right now and it is a public service, with a long address. See if you can remember this. It's academic.research.microsoft.com and right now we only surface really this computer science data. We have about eight million publications, papers, and seven million authors. So clearly we don't have seven million authors in computer science. But we have other domains but really right now the experience is focused only on computer science. So I'll walk through that. So let's start. And we'll start let's say with -- let's start with Tony Hey. So typing Tony's name. So here is a profile of Tony and this talks about some of the interests that he has and I think Tony said earlier that he did a lot of work in particle physics before which is not clustered here. It's almost again focused on computer science. So I can go through here and I can see all the publications that Tony was involved with. And then I can also see how many times they were cited. Let me go back to -- okay. So and Tony is part of Microsoft. So let me look at Microsoft here. And here are some of the computer science researchers in Microsoft. And you can see for each of them how many publications they have, how many citations they have. So I can again drill into one of these guys and then here on the chart you can see the number of publications cumulative and then number of citations cumulative.
Probably a better view might be Daniel's view. And you can see the number of papers he has published per year. And how often they were cited. An interesting view here is the co-author graph which will show you all the people that Mr. Lamport has worked with. I can zoom out through the whole graph here. And I can actually dig into any one -- any of these relationships. So when I hover over any of these lines I can see, between Leslie and Robert Shostak, six papers, right. And I can click on that. And I can click on the six and here are the six papers that they worked on together, right. And I can use this visual explorer to look at any of these other people, navigate around; if I click on the person's name it will go and re-center the graph for that person. So now let's look at the paper side. So here are the papers that these people have worked on. And let's drill into one of these papers. So here is the paper, number of times it was cited, abstract from the paper, where it was published, location for you to get the paper. We don't actually store the full text of the paper, we direct you off to the original source. And then here is some of the context where the citations were made and then the list of references for this paper. Very, very rich information. I showed you the author information and now I'm showing you the information about the specific paper. You can find keywords related to the paper, other related papers as well. Now, all the fun things -- so those are some of the fun things. Let's go back to the organization. So here you can see the list of organizations by number of citations that papers published from the organization have received, right. So Stanford -- again [inaudible] science Stanford is one, MIT is two, Berkeley is three and you can talk to Tony afterwards about whether he agrees with this ranking or not. Cambridge and Oxford are higher than Southampton.
>>: [inaudible].
>> Adnan Mahmud: Right. So I can actually dig even deeper.
So I can say like filter this list for let's say just Europe, right. So here is the list in Europe. Or I could say filter it for like South America, for example, and you can see the list or something like that. Right. Let's go back to all continents and let's look at Stanford. So one of the things that we did was we built this visualization for domain trends. So basically what this is showing you is how has the research focus in Stanford changed since 1960, right? And this is only focused on Stanford. Now, what we can do is I can go back to our homepage, and I can look at all domains and look at that one as well. This is across all computer science, all organizations, right. There's some interesting things that this points out. You can see that programming languages has kind of decreased its share as time has gone on, right, whereas data distributed and parallel computing has clearly gone up significantly in the last four decades, five decades. Now what's really interesting is that again if I go back to Stanford University -- sorry. And you want to do some comparisons of how does the trend look for some of the bigger organizations, right. So let's look at Stanford. Let's look at domain trend of Stanford. And let's do another one. Let's do MIT. Actually I should have done this a little differently. There we go. So let's compare Stanford and MIT side by side. So you can see like they are very similar trends that go across those two universities. It looks like the one where they are today and you can see how the focuses of the two organizations are very similar to each other in terms of where the publications are happening. So I think I'm going to stop it right there. Questions?
>>: Question about the citation tools.
>> Adnan Mahmud: Yes.
>>: The challenge we have in using them is that they include self citations. Is there a way, if you got your tool, that we could pull out self citations to analyze the scholar's impact?
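The self-citation filtering raised in this question can be illustrated with a short sketch. It does not use any Microsoft Academic Search API; the data layout and the function name are hypothetical, and it assumes one common definition of a self citation: the citing paper and the cited paper share at least one author.

```python
def citation_counts(papers, citations, exclude_self=True):
    """Count citations received per paper, optionally dropping self citations.

    papers: dict mapping paper id -> set of author names (hypothetical layout).
    citations: iterable of (citing_id, cited_id) pairs.
    A citation is treated as a self citation when the two papers share an author.
    """
    counts = {paper_id: 0 for paper_id in papers}
    for citing, cited in citations:
        if cited not in counts:
            continue  # citation points at a paper outside the data set
        shared_authors = papers.get(citing, set()) & papers[cited]
        if exclude_self and shared_authors:
            continue
        counts[cited] += 1
    return counts


example_papers = {
    "A": {"author_x", "author_y"},
    "B": {"author_x"},
    "C": {"author_z"},
}
example_citations = [("B", "A"), ("C", "A"), ("A", "C")]
without_self = citation_counts(example_papers, example_citations)
with_self = citation_counts(example_papers, example_citations, exclude_self=False)
```

Other definitions are possible, for example counting only citations where the first authors match, so the choice would need to be stated alongside any ranking built on it.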
>> Adnan Mahmud: So by that you mean like how often the author cites his own papers?
>>: Right.
>> Adnan Mahmud: Or within the organization?
>>: No. How often an -- when an author cites their own paper it kind of mucks up the data when you're trying to understand.
>> Adnan Mahmud: Right. Well, one of the things we're working on right now is an extensibility plan on really [inaudible] APIs. So you'll be able to actually access a lot of this data through APIs and create your own rankings and create your own calculations. So we might not provide a lot of those things on the site itself, because we don't think we can address all of the needs, but that one you will be able to do through APIs and you can make your own calculations.
>>: [inaudible] in the field [inaudible] a lot [inaudible] tenure cases and that's the big common problem with using most of these tools is we have to manually strip out self citations. So just throw that out at some --
>> Adnan Mahmud: That's great. So I'll take that back to our team and we'll throw that in our [inaudible].
>> Lee Dirks: You've got a question over there next.
>>: So I've always been amused by the fact that faculty have different views of these kinds of things, depending if they're a candidate for promotion and tenure or serving on the promotion and tenure committee. University presidents, provosts and deans have different views of these things, depending on what the ranking results were for their unit. But what's really I think interesting, and there are a couple of companies like Academic Analytics and some others, are really building towards what-if tools. So when you're recruiting somebody, you know, you mentally try and say well if you get this person and put them with this cluster of faculty maybe good things will happen.
So to be able to drag that person in based on publication history and actually drop them in and find out what unusual effects that person could have in your current research cluster is clearly a tool that about three different companies are heading towards. Do you have any -- any interest in those areas?

>> Adnan Mahmud: I think our interest is to make sure that we provide some of the most commonly used tools through our site. And if -- and again, we're just starting this out and we're still in beta, so we'll -- our focus is to make sure that we provide for 90 percent of the scenarios, right. And then, going back to the earlier point, with APIs, for the other 10 percent that we might not be able to serve, you will be able to do those on your own and create those calculations or what-if analyses.

>> Lee Dirks: I thought there was another -- yes, question there.

>>: Yeah. Getting back to those last two questions. What are your plans for expansion? You mentioned that currently your focus --

>> Adnan Mahmud: Right. That's a good question. So we're -- we definitely want to get to multiple domains, all domains, so [inaudible] computer science, and we're working hard with publishers, repositories and libraries to get more content into our pipeline. So we hope that in the next few months, if not earlier, we will be able to put more domains into the system and not just computer science. Does that answer your question? The question was about the domains, right?

>>: About expansion in general. Do you have an idea of what type of services, but also domain falls into that.

>> Adnan Mahmud: So domain expansion, yes. So that's definitely in our plans. In terms of expansion, like APIs, as I said, that's also, you know -- so those are the two places where we're focused.

>>: And as you see, I've got to get my h-index up and my g-index. And what this doesn't count are all the references I have to eScience from people who aren't computer scientists.
So I'm seriously worried about this. No, to be serious, what we would like to do is to make this as open as possible with -- you know, we can do open APIs so you can build your own things. And so you can do, you know, your own analysis, and tell us other things we should be doing that we're not doing. And I'll try to beat on poor old Adnan and his team. So what we'd like to do is make this responsive and collaborative and to try to produce something that is genuinely useful, perhaps even for [inaudible] decisions, who knows. But something that is actually of value to you guys. So please, please, send suggestions to Lee, me, or Adnan.

>> Adnan Mahmud: Yes, absolutely. And I'm sure we'll share the contact info, but we'll also have the contact info on the website itself. So we always like suggestions. A bunch of our features come from suggestions from the community as well. So if you have ideas once you've tried it out, definitely send them to us.

>> Lee Dirks: Well, thank you very much, Adnan. Great work. [applause].
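
The self-citation question raised in this session can be made concrete with a small sketch. It does not use the API discussed above, whose details were not given in the talk; instead it assumes hypothetical in-memory data (a mapping from paper id to author set, and a list of citing/cited pairs) and treats a citation as a self citation when the citing and cited papers share at least one author. Under those assumptions it computes an author's h-index and g-index with and without self citations.

```python
# Hypothetical data model (an assumption, not the real service's format):
#   authors_by_paper: paper id -> set of author names
#   citations: list of (citing paper id, cited paper id) pairs

def citation_counts(authors_by_paper, citations, exclude_self=False):
    """Count citations per paper, optionally skipping self citations."""
    counts = {pid: 0 for pid in authors_by_paper}
    for citing, cited in citations:
        if cited not in counts:
            continue  # cited paper is outside our data set
        shared = authors_by_paper.get(citing, set()) & authors_by_paper[cited]
        if exclude_self and shared:
            continue  # citing and cited papers share an author
        counts[cited] += 1
    return counts

def h_index(counts):
    """Largest h such that h papers have at least h citations each."""
    h = 0
    for rank, c in enumerate(sorted(counts, reverse=True), start=1):
        if c < rank:
            break
        h = rank
    return h

def g_index(counts):
    """Largest g such that the top g papers total at least g*g citations."""
    g, total = 0, 0
    for rank, c in enumerate(sorted(counts, reverse=True), start=1):
        total += c
        if total >= rank * rank:
            g = rank
    return g

def author_metrics(author, authors_by_paper, citations, exclude_self=False):
    """h-index and g-index over the papers that list `author`."""
    counts = citation_counts(authors_by_paper, citations, exclude_self)
    own = [n for pid, n in counts.items() if author in authors_by_paper[pid]]
    return {"h": h_index(own), "g": g_index(own)}

# Made-up example: author "A" wrote p1, p2 and p3.
AUTHORS = {
    "p1": {"A"}, "p2": {"A", "B"}, "p3": {"A"},
    "q1": {"C"}, "q2": {"D"},
}
CITATIONS = [
    ("p2", "p1"), ("p3", "p1"), ("q1", "p1"), ("q2", "p1"),
    ("p3", "p2"), ("q1", "p2"),
    ("p1", "p3"), ("p2", "p3"),
]

if __name__ == "__main__":
    print("with self citations:   ", author_metrics("A", AUTHORS, CITATIONS))
    print("without self citations:", author_metrics("A", AUTHORS, CITATIONS, exclude_self=True))
```

The shared-author rule is only one possible definition of a self citation; a tenure committee might instead want to exclude only citations where the candidate personally appears on the citing paper, which would mean comparing against a single author rather than the whole author set.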
