A Snapshot Of MSR: 2005
Daniel T. Ling
Corporate Vice President
Microsoft Research
Microsoft Corporation

Microsoft Research 2005
  Founded in 1991
  Staff of 750 in over 55 areas
  Internationally recognized research teams
  Small part of overall R&D (~$6 billion)
  Research locations
    Redmond, Washington
    San Francisco, California
    Cambridge, United Kingdom
    Beijing, People’s Republic of China
    Mountain View, California
    Bangalore, India

MSR Mission Statement
  Expand the state of the art in each of the areas in which we do research
  Rapidly transfer innovative technologies into Microsoft products
  Ensure that Microsoft products have a future

Wide Range Of Activities
  Work with product groups
    Program management team
    TechFest in March
  Participate in Research Community
    Extensive publication and conference participation
    Professional service - DARPA, NSF, NRC
  Strong ties with universities
    Joint research projects
    Extensive visitor and speaker program
      Students, faculty, research scientists
      Post-docs, sabbaticals, interns

Community Events
  Faculty Summit
  DC Tech Fair
  Open Day at MSR Cambridge
  21st Century of Computing in Beijing
  Workshops in specific areas
    Email and Anti-Spam Conference (Stanford)
    Social Computing Symposium (Redmond)
    Conference on Converging Sciences (Trento, Italy)
  UW / MSR Summer Institute
    Data Mining, Invisible Computing, Software Tools, Specifications, Security, Testing
    2005: Biological and Computation Perspectives on Intelligent Systems; Handling Imprecise Information

Inventing The Future…
  Platform Elements
    Networking, Distributed systems, Operating systems
    Cellphone and other Devices
    Sensor networks
    Security, Protection against Malware
  Reinventing Software Development
    Languages, tools, compilers
  Data and Documents
    Data Solutions for a Terabyte World
    Search
    Fighting SPAM
  UI and Collaboration
    New UI – Speech, Ink, Gesture, Natural Language
    Meetings and Collaboration
    Modeling of People and Groups
  Media
    Graphics and Multimedia
    Digital Photography and Video
  Science
    AIDS Vaccine, Quantum Computing, Astronomy
    Algorithms, Cryptography

Speakers
  Susan Dumais
  Shaz Qadeer
  Ken Hinckley / Johnson Apacible
  Nebojsa Jojic
  Microsoft Research

Personalized Search
Susan Dumais
Senior Researcher
Adaptive Systems and Interaction

Search … Your Way
  Stuff I’ve Seen (SIS)
    Unified search over your content (mail, files, web, calendar, contacts, music, notes, rss, etc.)
    Try (something like) it, MSN Desktop Search (http://toolbar.msn.com)
    Memory landmarks
    [Screenshots: Memory Landmarks, Stuff I’ve Seen, MSN-DS]
  NewsJunkie
    Monitoring ongoing news events
    Identify stories that are novel, given what you’ve already read
    [Figure: NewsJunkie, pizza delivery man w/ bomb incident. Novelty Score over Articles Ordered by Time; labeled stories: Friends say Wells is innocent, Looking for two people, Copycat case in Missouri, Gun disguised as cane]
  Personalized Web Search … Today’s focus

Personalized Web Search (PS) (w/ Jaime Teevan and Eric Horvitz)
  Web Search
    All users get the same results, independent of previous search history, current context, etc.
  Personalized Web Search
    Personalize search results, using rich client-side information
      Personal content (e.g., MSN-DS index), activities
    No profile setup or maintenance required
    All profile storage and processing client-side, for improved privacy
    User control over amount of personalization
  [Screenshots: Web Search, Personalized Web Search]

Personalized Search Demo

PS: Overview
  Step 1: Retrieve web search results, n>>10
  Step 2: Compute similarity (result, user)
    User Model
  Step 3: Re-rank search results

PS: Theoretical Framework
  $\text{Score} = \sum_i tf_i \cdot w_i$
  World (N, n_i):
    $w_i = \log \frac{N}{n_i}$
  Client (R, r_i):
    $w_i = \log \frac{(r_i+0.5)(N-n_i-R+r_i+0.5)}{(n_i-r_i+0.5)(R-r_i+0.5)}$
    $w_i = \log \frac{(r_i+0.5)(N'-n'_i-R+r_i+0.5)}{(n'_i-r_i+0.5)(R-r_i+0.5)}$
    Where: $N' = N+R$, $n'_i = n_i+r_i$

PS: Evaluation
  How well does it work?
  Rich space of algorithmic and UI possibilities
  Experiment: Participants judge top 50 results, 137 queries
  User Model
    No Profile < Query history < Web SIS < Recent SIS < All SIS
  Document Model
    Full document in results set < Snippets in results set
  PS score + Web rank, even better
  Internal deployment ongoing

Search … Your Way
  Example systems
    Stuff I’ve Seen -> MSN Desktop Search
    NewsJunkie
    Personalized Web Search

Questions / Comments ?
  Contact information:
    Susan Dumais
    Senior Researcher
    Adaptive Systems and Interaction Group
    sdumais@microsoft.com
    http://research.microsoft.com/~sdumais

Finding Concurrency Bugs in Systems Software
Shaz Qadeer
Software Productivity Tools

Concurrency Is Important
  Critical software
    Operating systems, databases, embedded software (e.g., flight control, handheld devices, cell phones)
  Single-chip multiprocessors will become common
    Software running on these chips will be even more concurrent

Concurrent Systems Code Is Complicated!
  Shared memory between threads
  Race conditions: some interleaving of concurrently enabled actions causes an error
    Data races: data being read by a thread might be trashed by concurrent write of another thread
    Reference counting bugs: a thread might access a resource already freed by another thread, memory leaks
    IRP (I/O Request Packet) cancellation bugs: unexpected cancellation of an IRP violates the state machine of IRP

Concurrency Analysis Is Difficult (1)
  Finite-data single-procedure program
    n lines
    m states for global data variables
  1 thread
    n * m states
  K threads
    n^K * m states

Concurrency Analysis Is Difficult (2)
  Finite-data program with procedures
    n lines
    m states for global data variables
  1 thread
    Infinite number of states
    Can still decide assertions in O(n * m^3)
  K ≥ 2 threads
    Undecidable!

KISS: A Static Checker For Concurrent Software
  Has found a number of concurrency errors in NT device drivers
  Key new ideas
    Technique to use any sequential checker to perform concurrency analysis
      Current implementation on top of Static Driver Verifier
    Find all errors that can manifest in a small number of context-switches
  [Diagram: execution timeline with two context switches; segments labeled "Many steps later…", "Many steps later…", "A few steps later…"]

Data Races In DDK Drivers
  Device extension shared among threads
  Data races on device extension fields
    two threads concurrently accessing a field
    at least one access is a write

  Driver              #fields without races   #fields with races
  Tracedrv             3                      0
  Moufiltr             7                      0
  Kbfiltr              7                      0
  Imca                 4                      1
  Startio              9                      0
  Toaster/toastmon     7                      1
  Diskperf            14                      0
  1394diag            17                      1
  1394vdev            17                      1
  Fakemodem           31                      6
  Toaster/bus         22                      0
  Serenum             21                      2
  Toaster/func        17                      5
  Mouclass            32                      1
  Kbdclass            33                      1
  Mouser              27                      1
  Fdc                 54                      9

KISS: A static checker for concurrent software
  [Diagram: Concurrent program P -> KISS -> Sequential program Q -> SDV; outcomes: No error found, or Error in Q indicates error in P]

KISS Insight
  Many subtle concurrency errors manifest themselves in executions with few context switches
  Analyze all executions with a small number of context switches

KISS Strategy
  [Diagram: Concurrent program P -> KISS -> Sequential program Q -> SDV]
  Q encodes executions of P with small number of context switches
    instrumentation introduces lots of extra paths to mimic context switches
  Leverage all-path analysis of sequential checkers

What does KISS stand for?

Smartphlow for Smartphone
Eric Horvitz (on sabbatical)
Ken Hinckley
Johnson Apacible
Adaptive Systems & Interaction

Smartphlow For Smartphone
  Uses machine learning techniques to predict traffic flow
    Predict how long until jams will appear
    Predict how long before traffic jams will disappear
  Prototype has ~3,000 active users

Smartphlow Fuses Multiple Sources
  Traffic data
  Weather
  Holidays & Major Events
  Incident reports
    INCIDENT INFORMATION Cleared 1637: I-405 SB JS I-90 ACC BLK RL CCTV 1623 – WSP, FIR ON SCENE
  Event store
  Learning
  Reasoning

From Data to Predictive Models
  Data store, user logs
  Predictive models
  system-wide status & dynamics
  Incident reports
  sporting events
  weather
  time of day
  day of week
  season
  holidays
  UAI paper to appear next week
  Search over directed acyclic graph using Bayesian information criterion

Smartphlow User Interface
  UI designed for quick glances at screen
  Quick overview of traffic status
  Red clock
    how long for jam to dissipate
    full circle = 1 hour
  Surprise (!) jam notification

Small Screen Navigation
  9 keys zoom to 9 zones of screen
  Zoom & flow animations maintain context

Bayesphone: Context-sensitive communications
  Caller ID
  Context
  Call handling cost-benefit analysis
  Ring
  Voice Mail

The epitome of a virus: Combating HIV with machine learning
Nebojsa Jojic
Microsoft Research

Collaborators
  Vladimir Jojic, Microsoft/U Toronto
  Carl Kadie, Microsoft
  Jennifer Listgarten, Microsoft/U Toronto
  Chris Meek, Microsoft
  Brendan Frey, Microsoft/U Toronto
  Bette Korber, Los Alamos National Laboratory
  Christian Brander, Harvard/MGH
  Nicole Frahm, Harvard/MGH
  Simon Mallal, Royal Perth Hospital
  Jim Mullins, University of Washington

Epitome as a model of diversity in natural signals
  A set of image patches
  Input image
  Epitome

Using the epitome for recognition
  The smiling point
  Epitome of 295 face images
  Images with the highest total posterior at the “smiling point”
  Images with the lowest total posterior at the “smiling point”

Epitomes May Also Allow Some Variability
  Epitome e: Mean, Variances

Epitomes Can Be Computed For Ordered Datasets (e.g., 1-D Arrays, or 2-D, 3-D, or N-D Matrices) With Arbitrary Measurement Types:
  Intensities
  R, G, B values
  Gradient values
  Wavelet coefficients
  Spectral energies
  Nucleotide or amino acid content
  …
  We even played with text and MIDI files

An Epitope Presented By An MHC-I Molecule
  MHC-I Molecule
  Peptide
  Immune System Response

The Map Of HIV
  From http://www.mcld.co.uk/hiv (a simplified version of the LANL detailed map)

HIV diversity (LANL database)
  HIV is encoded in an RNA sequence of about 10000 nucleotides, divided into several genes. NEF is one of the shorter and moderately variable ones.
  The NEF length in the strain
  The 73 nucleotides of the NEF gene
  Note the insertions, deletions and mutations.
  A triplet of nucleotides encodes one amino acid. A change in a single amino acid may lower the cellular immunity to the virus in one patient and increase it in the other.
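
The point that a triplet of nucleotides encodes one amino acid, and that a single change can alter the resulting peptide, can be illustrated with a small sketch. It uses the standard genetic code; the RNA string below is a hypothetical sequence chosen only so that it translates to KKKYKLKHIVW (a fragment that appears in the deck's sequence data), not real NEF or Gag data, and the helper names `translate` and `substitute` are illustrative.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate an RNA or DNA string into one-letter amino acids ('*' = stop)."""
    dna = seq.upper().replace("U", "T")   # treat RNA and DNA alike
    usable = len(dna) - len(dna) % 3      # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def substitute(seq, position, base):
    """Return a copy of seq with the nucleotide at a 0-based position replaced."""
    return seq[:position] + base + seq[position + 1:]


# Hypothetical RNA (an assumption for illustration), built codon by codon.
codons = ["AAA", "AAA", "AAA", "UAU", "AAA", "CUG", "AAA", "CAU", "AUU", "GUG", "UGG"]
rna = "".join(codons)

print(translate(rna))                       # original peptide
print(translate(substitute(rna, 12, "C")))  # AAA -> CAA changes K to Q
print(translate(substitute(rna, 13, "G")))  # AAA -> AGA changes K to R
```

Each of the two single-nucleotide substitutions changes exactly one amino acid, giving the K/Q/R variants at the fifth position that distinguish the short peptides listed in the sequence data.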
Known Epitopes In A Part Of HIV’s Gag Protein
  Epitopes In Variable Regions
  Colors signify different human immune types

A Vaccine For HIV/AIDS
  Typical vaccines are near copies of the virus that is being vaccinated against
  HIV mutates at a high rate – can’t use traditional techniques
  Machine learning allows us to build compact forms of “pseudo-virus” that cover the diversity of HIV (or rather a pseudo-protein that covers the diversity of a particular HIV protein)
  This pseudo-protein, which we call the epitome, is much shorter than the concatenation of all strains

The Epitome Of A Virus
  Colors: Different patients
  Sequence data
    VLSGGKLDKWEKIRLRPGGKKKYKLKHIVWASRELERF
    LSGGKLDRWEKIRLR
    KKKYQLKHIVW
    KKKYRLKHIVW
  Epitome

Machine Learning Approach to Vaccine Design
  Use sample HIV strains from multiple patients
  Build models that compactly encode as many epitopes (or likely epitopes) as possible
  Learning techniques
    Myopic
    Split and merge
    Expectation Maximization
  Coverage of all 10aa blocks from 245 Gag proteins (Perth data)

We Are Also Working On:
  Epitope prediction
  Evolution and immune pressure modeling and inference
  Wet lab confirmation experiments (Harvard)

Looking Forward
  Moore’s Law, bandwidth improvements mean continued dramatic improvements in computing
  MSR is an environment for collaboration and excellence in computer science research
  MSR works actively with the research community
  MSR researchers are building technologies for MS products that will enable this future

© 2004 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.