Learning in the Wild in the Meeting Room Scenario
Satanjeev "Bano" Banerjee
Dialogs on Dialog
March 18th, 2005

Introduction
- Project CALO: building agents that support professionals in their work life
  - Smart management of emails, calendar, etc.
  - Organizing information from meetings for easier/faster access
- Inter-disciplinary, distributed research: speech, vision, multi-modal integration, machine learning, reasoning, etc.

CMU's Piece of the Pie
- Automatic meeting understanding, specifically:
  - Perform automatic speech recognition
  - Detect different meeting states (presentations, discussions, briefings)
  - Detect major topics of discussion
  - Detect decisions and action items
  - Detect personnel skills (vision expert vs. speech expert vs. statistics expert...)
  - What else?
- The program's main aim: Learning in the Wild!

Learning in the Wild!
- Improve performance with continued use and without additional engineering!
- Typical learning scenario: engineers deliver the system to company X, and never touch it again.
- Compare the % of action items automatically detected during the 1st and 10th weeks of use: the % detected during the 10th week must be greater than during the 1st!

How Can We Do LITW?
- Purely unsupervised learning techniques:
  - Improve ASR through adaptation
  - Clustering to create a better understanding of topics / personnel skills...
- "Implicitly supervised" learning:
  - Learn from users doing the same task (hints)
  - Ask a human for "labels" in particularly "useful" cases (nudges)

Description of Software
- Meeting Recorder
  - Captures speech, video, notes, slides, whiteboard markings
  - All synchronized using an NTP server, and auto-stored in a central database
- Mock Brow
  - Plays back media recorded with the Meeting Recorder
- Offline "understanding" component
  - Speech recognizer
  - Meeting state and topic detector

Out-of-the-Box Topic Detection
- Look for sudden, drastic changes in:
  - Vocabulary
  - The subset of active speakers
- Look for cue phrases and intonation: "So...", "Moving on..."

Can We Improve with Supervision?
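The vocabulary cue behind the out-of-the-box detector above can be illustrated with a minimal sketch: compare bag-of-words vectors for adjacent windows of utterances, and flag a boundary wherever their cosine similarity drops sharply. The window size, threshold, and whitespace tokenization are illustrative assumptions, not the actual CALO detector.

```python
from collections import Counter
import math


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def topic_boundaries(utterances, window=2, threshold=0.2):
    """Flag a topic boundary before utterance i whenever the vocabulary
    of the preceding and following windows changes drastically
    (i.e., their cosine similarity falls below the threshold)."""
    boundaries = []
    for i in range(window, len(utterances) - window + 1):
        before = Counter(w for u in utterances[i - window:i] for w in u.split())
        after = Counter(w for u in utterances[i:i + window] for w in u.split())
        if cosine(before, after) < threshold:
            boundaries.append(i)
    return boundaries
```

A real detector would combine this signal with the active-speaker and cue-phrase evidence the slide lists; this sketch only shows the vocabulary-shift idea.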
- If a user were to manually mark topics, can we improve beyond a trained system?
- Yes, by acquiring the vocabulary of the topics typical of this company
  - Training data may not cover the company's domain
- Yes, by learning the active speakers for each topic
  - Training data cannot cover this at all
- Yes, by learning the idiosyncrasies of this group
  - Topic shifts happen only when Bob is speaking
  - Bob always says "Now, the next thing..." before he shifts

How to Elicit Supervision?
- But the user will not mark topics for us, because the cost is too high and the value too low!
- So: can we design an interaction such that topic markup falls out of other user activity?
  - That is, an activity with a "viable" cost/value ratio
- Idea: a better-designed notes box

"Smart" Notes Box
- First, allow the user to enter the agenda, one item per box
- The user then enters notes on the discussion of each agenda item in the appropriate box
- From the times at which notes are taken, we can infer topic boundaries!
- Example agenda boxes with notes:
  - Report on integration: "Inputs from sphinx ok"; "Trouble with output of adapter"
  - Discuss web page: "Web space set up"; "Bob needs access to CGI server"
  - Plan roadmap

Cost / Value of the Smart Notes Box
- Cost of interaction: nearly the same as a "flat" notes box
- Immediate value: users can "one-click email" the notes to the other participants (useful for the meeting scribe)
- Longer-term value:
  - The notes will be "media augmented": users will be able to use Mock Brow to play back the media, using the notes as an index
  - At future meetings, the Meeting Recorder will "suggest" agenda items: even less typing than with a "flat" notes box!

Steps to Create Instances of Learning in the Wild
1. Identify a task to be performed by the system (e.g., topic detection)
2. Create an "out-of-the-box" system (e.g., decision trees trained on manually labeled data)
3. Show that supervision can help
4. Design an interface to extract implicit supervision

My Research Goals
- For the chosen tasks (topic detection, action item detection, etc.):
  - Create out-of-the-box technologies
  - Implement/develop algorithms that rapidly adapt using sparse implicit supervision
  - Design interfaces that extract maximal implicit supervision
- Wish: somehow generalize all of this...

How to Evaluate?
- Out-of-the-box technology: standard machine-learning train/test evaluation
- Improvement with adaptation: show the performance delta between the system with and without supervision
- The interface's cost/value viability: a long, situated user study showing that regular users will indeed use the interface, and often!

User Study: Testing the Smart Notes Box Interface
- Goals:
  1. Will users use the "smart" notes box?
  2. Will their use indeed line up cleanly with actual topic boundaries?
- Cannot run a "one-off" lab test:
  - Users may use the box the first few times, but will they give up after a few meetings?
  - Will users use the notes box when they are deeply involved in the meeting?

User Study Design
- Invite multiple groups to use the Meeting Recorder for their regular meetings
- At the first meeting, describe the interface
  - Point out the value of taking notes in the smart notes box instead of in, say, Emacs
- Observe the groups over a couple of months, noting their use of the text box

User Study Success Conditions
- Will declare victory if users use the smart text box often
  - "Often" = some percentage of the maximum use we can hope for, i.e., we get supervision for all the major topics discussed!
- Will pop the champagne if their use of the text box increases over time
  - This may happen as they experience the long-term value of the interaction
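The supervision the user study hopes to validate (that notes typed into an agenda box line up with topic boundaries) can be sketched as follows: given timestamped note events tagged with the agenda box they were typed into, place a boundary wherever note-taking switches to a different box. The event tuples, field layout, and timings below are hypothetical illustrations, not the Meeting Recorder's actual data format.

```python
def boundaries_from_notes(events):
    """Given (timestamp_seconds, agenda_box) note events sorted by time,
    return (boundary_time, new_topic) pairs: a topic boundary is assumed
    wherever note-taking switches to a different agenda box."""
    boundaries = []
    current_box = None
    for t, box in events:
        if box != current_box:
            boundaries.append((t, box))
            current_box = box
    return boundaries


# Hypothetical note events: seconds into the meeting, using the agenda
# items from the Smart Notes Box example earlier in the talk.
events = [
    (40, "Report on integration"),
    (95, "Report on integration"),
    (210, "Discuss web page"),
    (260, "Discuss web page"),
    (400, "Plan roadmap"),
]
```

One design caveat: this takes the first note in a new box as the boundary estimate, but the true topic shift presumably precedes that note by however long the user took to start typing, so a learned offset (or alignment against the speech) would be needed in practice.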