Learning in the Wild
In the Meeting Room Scenario
Satanjeev “Bano” Banerjee
Dialogs on Dialog
March 18th, 2005
Introduction

- Project CALO: Building agents that support professionals in their work life
  - Smart management of emails, calendar, etc.
  - Organizing information from meetings for easier/faster access
- Interdisciplinary, distributed research:
  - Speech, vision, multi-modal integration, machine learning, reasoning, etc…

CMU’s Piece of the Pie

- Automatic meeting understanding, specifically:
  - Perform automatic speech recognition
  - Detect different meeting states
    - Presentations, discussions, briefings
  - Detect major topics of discussion
  - Detect decisions and action items
  - Detect personnel skills
    - Vision expert vs. speech expert vs. statistics expert…
  - What else?
- Program’s main aim: Learning in the Wild!

Learning in the Wild!

- Improve performance with continued use, and without additional engineering!
- Typical learning scenario:
  - Engineers deliver the system to company X, and never touch the system again.
  - Compare the % of action items automatically detected during the 1st and 10th weeks of use.
  - The % of action items detected during the 10th week must be greater than during the 1st week!

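A toy illustration of this success criterion (the numbers are invented, not CALO results):

```python
# LITW success criterion: the detection rate must rise with continued use,
# with no re-engineering in between. Numbers are made up for illustration.
def detection_rate(detected: int, actual: int) -> float:
    return detected / actual

week1 = detection_rate(detected=12, actual=40)   # 30% in week 1
week10 = detection_rate(detected=22, actual=40)  # 55% in week 10
assert week10 > week1, "no learning in the wild!"
```
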
How can we do LITW?

- Purely unsupervised learning techniques (see the clustering sketch after this list):
  - Improve ASR through adaptation
  - Clustering to create a better understanding of topics / personnel skills…
- “Implicitly supervised” learning:
  - Learn from users doing the same task (hints)
  - Ask for “labels” from humans in particularly “useful” cases (nudges)

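As a purely illustrative sketch of the clustering route, one could cluster meeting utterances into candidate topics with TF-IDF and k-means; the talk does not name a specific clustering method:

```python
# Sketch: cluster utterances into candidate topics, fully unsupervised.
# Method choice (TF-IDF + k-means) is an assumption for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

utterances = ["inputs from sphinx are ok",
              "trouble with the output of the adapter",
              "the web space is set up",
              "bob needs access to the cgi server"]
X = TfidfVectorizer().fit_transform(utterances)
labels = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print(labels)  # utterances grouped into 2 candidate topics
```
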
Description of Software

- Meeting Recorder
  - Records speech, video, notes, slides, whiteboard markings
  - All synchronized using an NTP server, and auto-stored in a central database
- Mock Brow
  - Plays back media recorded using the Meeting Recorder
- Offline “understanding” component
  - Speech recognizer
  - Meeting state and topic detector

Out-of-the-Box Topic Detection

- Look for sudden, drastic changes in (see the sketch below):
  - Vocabulary
  - Subset of active speakers
- Look for cue phrases and intonation:
  - “So…”
  - “Moving on…”

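A minimal sketch of the vocabulary-shift idea (TextTiling-style); this is not the actual CMU implementation, and the window size and threshold are illustrative:

```python
# Sketch: flag a topic boundary where adjacent word windows share
# little vocabulary (a sudden, drastic vocabulary change).
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def vocab_shift_boundaries(words, window=50, threshold=0.1):
    """Return word positions where adjacent windows diverge sharply."""
    boundaries = []
    for i in range(window, len(words) - window + 1, window):
        left = Counter(words[i - window:i])
        right = Counter(words[i:i + window])
        if cosine(left, right) < threshold:  # sudden vocabulary change
            boundaries.append(i)
    return boundaries
```
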
Can we Improve with Supervision?

- If a user were to manually mark topics, could we improve beyond a trained system? (A sketch of what marked topics buy us follows this list.)
  - Yes, by acquiring the vocabulary of the various topics typical of this company!
    - Training data may not cover the company’s domain
  - Yes, by learning the active speakers for each topic
    - Training data cannot cover this at all
  - Yes, by learning the idiosyncrasies of this group
    - Topic shifts happen only when Bob is speaking
    - Bob always says “Now, the next thing…” before he shifts

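A sketch of what user-marked topics could buy us: each marked segment feeds a per-topic vocabulary and active-speaker profile. The data structures and smoothing here are assumptions for illustration, not the system's actual representation:

```python
# Sketch: adapt per-topic vocabulary and speaker profiles from
# user-marked segments, so detection can use company-specific cues.
import math
from collections import Counter, defaultdict

topic_vocab = defaultdict(Counter)     # topic -> word counts
topic_speakers = defaultdict(Counter)  # topic -> active-speaker counts

def absorb_marked_segment(topic, words, speakers):
    """Update a topic's profile from one user-marked segment."""
    topic_vocab[topic].update(words)
    topic_speakers[topic].update(speakers)

def topic_score(topic, words, alpha=1.0):
    """Add-alpha-smoothed unigram log-likelihood of words under a topic."""
    counts = topic_vocab[topic]
    total = sum(counts.values())
    vocab = len(counts) + 1
    return sum(math.log((counts[w] + alpha) / (total + alpha * vocab))
               for w in words)
```
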
How to Elicit Supervision?

- But users will not mark topics for us
  - Because the cost is too high, and the value too low!
- So: Can we design an interaction so that topic markup falls out of other user activity?
  - An activity with a “viable” cost/value ratio
- Idea: A better-designed notes box

“Smart” Notes Box

- First allow the user to enter the agenda, one item per box
- The user will enter notes on the discussion of each agenda item in the appropriate box
- From the times of note-taking we can tell topic boundaries! (A sketch follows the example below.)

Example notes box contents:

  Report on integration
    Inputs from Sphinx OK
    Trouble with output of adapter
  Discuss web page
    Web space set up
    Bob needs access to CGI server
  Plan roadmap

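A minimal sketch of the boundary inference, assuming each note event records a timestamp and the agenda box it was typed into (the Meeting Recorder's actual event format is not described here):

```python
# Sketch: infer topic boundaries from the times notes are taken.
def infer_boundaries(note_events):
    """Return (time, topic) pairs where note-taking moves to a new box.

    note_events: list of (timestamp_seconds, agenda_item), time-sorted.
    """
    boundaries = []
    current = None
    for t, item in note_events:
        if item != current:            # user switched agenda boxes
            boundaries.append((t, item))
            current = item
    return boundaries

events = [(120, "Report on integration"), (150, "Report on integration"),
          (610, "Discuss web page"), (900, "Plan roadmap")]
print(infer_boundaries(events))
# [(120, 'Report on integration'), (610, 'Discuss web page'),
#  (900, 'Plan roadmap')]
```
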
Cost / Value of Smart Notes Box

- Cost of interaction: nearly the same as a “flat” notes box
- Immediate value:
  - Users can “one-click email” the notes to the other participants – useful for the meeting scribe
- Longer-term value:
  - Notes will be “media augmented” – users will be able to use Mock Brow to play back the media, using the notes as an index
  - At future meetings, the Meeting Recorder will “suggest” agenda items – even less typing needed than with a “flat” notes box!

Steps to Create Instances of Learning in the Wild

1. Identify the task to be performed by the system
   - E.g.: Topic detection
2. Create an “out-of-the-box” system (a sketch follows this list)
   - E.g.: Decision trees trained on manually labeled data
3. Show that supervision can help
4. Design an interface to extract implicit supervision

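A hedged sketch of step 2, using a scikit-learn decision tree on hypothetical hand-labeled boundary features; the actual feature set is not specified in the talk:

```python
# Sketch of an "out-of-the-box" topic-boundary detector: a decision tree
# trained offline on manually labeled meetings. Feature names and data
# below are hypothetical, for illustration only.
from sklearn.tree import DecisionTreeClassifier

# Per-utterance features: [vocabulary_shift_score, speaker_set_changed,
#                          starts_with_cue_phrase]
X_train = [[0.9, 1, 1],   # big vocab shift, new speakers, "Moving on..."
           [0.1, 0, 0],
           [0.7, 1, 0],
           [0.2, 0, 1]]
y_train = [1, 0, 1, 0]    # 1 = topic boundary, 0 = not a boundary

clf = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)
print(clf.predict([[0.8, 1, 1]]))  # likely a boundary
```
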
My Research Goals

- For the chosen tasks (topic detection, action item detection, etc.):
  - Create out-of-the-box technologies
  - Implement/develop algorithms that rapidly adapt using sparse implicit supervision
  - Design interfaces that extract maximal implicit supervision
- Wish: Somehow generalize all this…

How to Evaluate?

- Out-of-the-box technology: standard machine learning train/test evaluation
- Improvement with adaptation: show the delta in performance between the system with and without supervision
- Evaluating the interface’s cost/value viability: a long, situated user study showing that regular users will indeed use the interface, and often!

User Study: Testing the Smart Notes Box Interface

- Goals:
  1. Will users use the “smart” notes box?
  2. Will their use indeed line up cleanly with actual topic boundaries?
- Cannot run a “one-off” lab test:
  - Users may use the box the first few times – but will they give up after a few meetings?
  - Will users use the notes box when they are deeply involved in the meeting?

User Study Design

- Invite multiple groups to use the Meeting Recorder for their regular meetings
- At the first meeting, describe the interface
  - Point out the value of taking notes in the smart notes box instead of in, say, Emacs
- Observe the groups over a couple of months, noting their use of the text box

User Study Success Conditions

- Will declare victory if users use the smart text box often
  - “Often” defined as some percentage of the maximum use we can hope for, i.e., we get supervision for all the major topics discussed!
- Will pop the champagne if their use of the text box increases over time
  - This may happen as they experience the long-term value of the interaction