Lewis Shepherd
Chief Technology Officer
Microsoft Institute for Advanced Technology in Governments
www.ShepherdsPi.com

FUTURE TRENDS: INFORMATION MANAGEMENT IN 2015

A Move from DIA to Microsoft

America’s Long Tradition of Government Supporting Research
Simon Cameron, U.S. Senator from Pennsylvania
A leading voice during 1861 debate over Smithsonian Institution funding, during run-up to War
“I am tired of all this thing called science.... We have spent millions in that sort of thing for the last few years, and it is time it should be stopped.”
• Later in 1861: Named by Abraham Lincoln to head the 19th Century’s military-industrial complex (Secretary of War)
• 1862: Ousted for corruption, censured by the House of Representatives for “contract manipulations”
• 1866: Re-elected to U.S. Senate

Why I Joined this Small West-Coast Startup
• Business Division
• Platform and Online Services Division
• Entertainment & Devices Division
• Research & Development
  ◦ Security
  ◦ Quantum Computing & Cryptography
  ◦ Collaboration
  ◦ AIDS Vaccine
  ◦ Robotics
• R&D Budget: 2008: $8 Billion; 2009: $9 Billion

MSR Growth
Research PhD’s
Research lab locations:
• Redmond, Washington (Sep, 1991)
• San Francisco, California (Jun, 1995)
• Cambridge, United Kingdom (July, 1997)
• Mountain View, California (July, 2001)
• Bangalore, India (Jan, 2005)
MSR Cambridge
MSR India

“Institute” in Organizational Context

The Breadth of MS Research
Platform Elements
◦ Networking, Distributed systems, Operating systems
◦ Cellphone and other Devices
◦ Sensor networks
◦ Security, Protection against Malware, Identity
Web
◦ Search and Advertising
◦ Knowledge management
◦ Cybersecurity
Data and Documents
◦ Database Architectures, Data Mining
◦ Machine learning, Fighting SPAM
◦ Meta data extraction, authoring
User Interfaces, Social Computing, and Collaboration
◦ New UI – Speech, Ink, Gesture, Natural Language, Large Displays, Surface Computing
◦ Meetings and Collaboration
◦ Modeling of People and Groups
◦ Technologies for Emerging Segments
Connecting Developer and IT
◦ Languages, tools, compilers
◦ Revolutionizing software and services
Media
◦ Graphics and Multimedia
◦ Digital Photography and Video
Science
◦ AIDS Vaccine, Quantum Computing
◦ eScience – Bioinformatics, Astronomy
◦ Algorithms, Cryptography
◦ Economic models

Rapidly Changing Technology
Predicting is hard
• Computational power
• Multi/Many-core CPUs
• Graphics 3x per year
• Storage 2x per year
• Networking 4x per year
• New devices
• Ubiquitous connectivity
• Nano Technology
• The Web

Microsoft Inc. as an Enterprise Example
• 141,000 end users
• 260,000 computers
• 550 Buildings in 98 countries
• 358,000 SharePoint sites
• 435 million unique users
• 2,500 internal applications
• 280 billion page views/day
• 3 million internal emails/day
• 20 million incoming emails per day (97% filter)
• 29 billion emails sent/day
• 42,000,000 remote connections per month
• 6 billion instant messages (IMs) per day

Technology Themes Benefiting from Increased Speed and Scale
• Cloud Computing
• Security & Mobile
• Human-Computer Interaction
• Immersive Data
• Geospatial/Robotics/Social Networks
• Semantic Computing

2015: What will continue to be important to I.M. leaders?
• Right information -> ALL information
• Right people -> ALL people
• Right time -> ALL the time

All Information
• Explosion of data-rich Social Media
• Semantic techniques & semantic computing
• Hybrid machine/human translation

Semantic Computing
Leveraging our positions as world’s largest hosting company, world’s largest email provider, world’s largest IM platform
Finding Meaning in our Search & our Hosted Social Platforms
Microsoft Live Search index = 25 TB content, 3500 queries/sec

Mining Social Media: the Twitterverse
http://www.TweetGrid.com

Social Streams: Real-Time Social Media Mining
• A robust acquisition and mining platform supporting research and product exploration in social-media analysis.
• Platform acquires social-media data, such as blogs, Usenet, and Twitter, in real time or near real time and provides a stream of this content.
• Stream is consumed by real-time mining components that can be assembled into compelling desktop applications.
• Platform provides a content store that gives access to the textual content of the media, as well as statistics and other metadata describing the publications and authors serving and creating the content.
• Real-time mining application analyzes social media for references to news articles, allowing ranking of news & opinions as discussed online.
• Facilitates discovery, browsing, and sharing of online info, keeping the user up to date and informed about events & the attention being paid to them.

“Political Streams” – OSINT Early Warning System
Online Info/Blogosphere Viz & Data Mining
http://socialstreams.livelabs.com/

Semantically Enabled “Research Desktop”
• Activities: users can add semantic tags & labels to related documents, images, e-mails, and other items. Using the semantic labels, users can easily activate a particular task or switch between multiple tasks.
• Dedicated information spaces:
  ◦ Personal Library to collect books, manuscripts, relevant articles, and media
  ◦ Notes to enable simple storage of and access to content snippets, URLs, and other bits of information that can easily be misplaced or can be difficult to find
• Tools and services that can be used in various contexts. Users can easily analyze individual books or collections of publications, or create a co-author network. Easier discovery of trends in data.
• Research Desktop augments the standard desktop environment with concepts and designs that enable new ways of working and managing resources.
• RD provides semantic enabling within four key areas: Activities, Tools, Library and Notes.
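
The Activities and Tools ideas above can be illustrated with a minimal sketch. It is not Research Desktop code: the `TagIndex` class, the `coauthor_network` function, and all file names, labels, and author names are hypothetical, chosen only to show how semantic labels could group items into a task and how a co-author network could be derived from a small library.

```python
from collections import defaultdict
from itertools import combinations


class TagIndex:
    """Hypothetical index mapping semantic labels to desktop items."""

    def __init__(self):
        self._items_by_label = defaultdict(set)

    def tag(self, item, *labels):
        # Attach one or more semantic labels to an item (document, image, e-mail).
        for label in labels:
            self._items_by_label[label].add(item)

    def activate(self, label):
        # "Activating" a task is modeled as listing every item carrying its label.
        return sorted(self._items_by_label.get(label, set()))


def coauthor_network(publications):
    """Build an undirected co-author graph: author -> set of co-authors."""
    graph = defaultdict(set)
    for authors in publications.values():
        # Every pair of distinct authors on one publication shares an edge.
        for a, b in combinations(sorted(set(authors)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


# Illustrative data only.
index = TagIndex()
index.tag("budget.xlsx", "grant-proposal")
index.tag("reviewer-mail.eml", "grant-proposal", "peer-review")
index.tag("figure1.png", "peer-review")

library = {"Paper A": ["Ada", "Grace"], "Paper B": ["Grace", "Alan", "Ada"]}

print(index.activate("grant-proposal"))
print(coauthor_network(library)["Grace"])
```

Switching between tasks then amounts to calling `activate` with a different label; an item tagged with several labels shows up in each of its tasks.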

Consumer/Web-User Semantics
• Creating overlays over the World Wide Web
• Storing a location-transparent digital memory that works to personalize, semantically relate, and socially enhance experiences of the web
• Designed to integrate with the traditional desktop
• Will first be packaged and projected via the browser, but that is just a delivery channel
• About putting the human back in the center of the experience of technology
• Every aspect of our UI and technology is subordinated to creating experiences that enhance human community building and interaction.
Research Desktop demo

Translation Research
• MSR’s Natural Language Processing group has developed a hybrid Human/Machine Translation (MT) system
• Has both data-driven and rule-based components
• Learns translation mappings automatically from bilingual sentence pairs (Microsoft product TMs)
• Allows semi-automated human-in-the-loop with a wiki
• Google’s MT group is using speed of first-pass
• API allows integration by third parties
• The system has been used successfully by the internal Customer Support group to translate knowledge-base articles
• Its use is being extended to localization work for selected Microsoft products

Google-Powered “Nice Translator”
www.NiceTranslator.com

MS-Powered Real-Time Translation in Browser

All People
• Social Networks
• Presence
• Assured Identity

Human Terrain Analysis: “Largest Social Network Ever Analyzed”
• “Planetary-Scale Views on a Large Instant-Messaging Network” – Eric Horvitz
• One month of traffic on MS Messenger (May 2007)
• Dataset contained “summary properties” of 30 billion conversations among 240 million people
• The communication graph constructed includes 180 million nodes, 1.3 billion undirected edges

Visualizing the Human Terrain
Looks like Tom Friedman was right…

The More Things Change…
“This is the first time a planetary-scale social network has been available to validate the well-known “6 degrees of separation” finding by Travers and Milgram [1969]. The earlier work employed a sample of 64 people and found that the average number of hops for a letter to travel from Nebraska to Boston was 6.2 (mode 5, median 5), which is popularly known as the “6 degrees of separation” among people.”
“We used a population sample that is more than two million times larger than the group studied earlier and confirmed the classic finding.”
Some Findings:
• “We find that people tend to communicate more with each other when they have similar age, language, and location”
• “Cross-gender conversations are both more frequent and of longer duration than conversations with the same gender.”

MashupOS: Security in Cloud Services
• Today’s mashups turn the browser into a multi-user system
• Mutually distrusting domains become co-users
• No control on content integrated from different domains
MashupOS will apply operating system principles to mashups
• Service-based resource isolation
• Protected, data-only, message-based communication between services
Invokes well-understood Secure OS principles to provide a stable security foundation to replace today's mashup anarchy

All the Time
• Cloud Computing
• Live Mesh
• Software + Services -> Immersive “augmented reality”

“Cloud Computing”: Different Definitions
Technically, they may all be “Off-premise, Virtualized, Scalable (up and down)”
But Different Business Models
• Utility computing - Virtual hosting (e.g. Rackspace Cloud)
• Cloud storage - Data hosting (e.g. Flickr, Amazon S3)
• SaaS - Hosted services, email, IMs (e.g. Salesforce.com)
• PaaS (“Platform as a Service”) - Hosted apps (e.g. Google App Engine)
We are bringing these elements together into a cohesive platform: Windows Azure

Azure: Enterprise-Class Cloud Services
Windows Azure Services Platform
Internet-scale cloud computing and services platform
• Hosted in Microsoft data centers.
• Provides a range of functionality to build applications that span from individual mashups to enterprise scenarios.
• Includes a cloud operating system and a set of developer services.
• Fully interoperable through the support of industry standards and web protocols such as REST and SOAP.
• You can use the Azure services individually or together, either to build new applications or to extend existing ones.

Azure: The Web as an Application Platform
• Blogging, social networking
• Data processing/transformation
• Content upload, sharing, discovery
• Storage, computation, messaging
• Identity
• Mashups: composing data and applications

SensorMap
• Functionality: Map navigation
• Data: sensor-generated temperature, video camera feed, traffic feeds, etc.

Microsoft Institute-sponsored development: Semantic Virtual Earth
Integrates real-time “real-world” data from VE into rich 3D immersive simulation

PhotoSynth: Beyond “Image-Stitching”
A technology that analyzes related images and links them together appropriately, to re-create physical environments in a navigable virtual space.

GeoSynth: the Semantic Metaverse
• Hostable behind a secure enterprise firewall
• Useful on huge datasets (e.g. Flickr, individual hard drives)
• PhotoSynth captures all metadata with images
• Will enable semantic image browsing, searching, geolocation

Microsoft Tag
• Small, colorful codes that can be printed, displayed, emailed, disseminated anywhere
• Simple software works on any smartphone (yes, even Apple’s iPhone)
• Link to online information
• Simplify personal or business contacts
http://www.microsoft.com/tag

Tying it All Together: Live Mesh
Access to all your data, anywhere
http://www.mesh.com

more information
http://research.microsoft.com
lewiss@microsoft.com
www.ShepherdsPi.com

© 2007 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.