Tactical Language Training System
Natalie MacConnell
April 21, 2005
Organization of Talk

What is the Tactical Language Training System?

Objectives and Quick Facts

System Architecture

Mission Skill Builder

Mission Practice Environment

Speech Recognition and Error Modeling

Demonstration Video

Summary
What is the Tactical Language Training System (TLTS)?

Intelligent tutoring system designed to help military personnel rapidly acquire the language and cultural skills needed for peaceful and effective communication in foreign countries

Focuses on “tactical languages”: subsets of linguistic, gestural, and
cultural knowledge and skills necessary to accomplish the task at
hand

Currently developed for Levantine and Iraqi Arabic

Virtual tutor coaches learners in pronunciation, assesses their
mastery, and provides assistance

Learners then apply their language skills to perform missions in an interactive story environment, where they communicate with autonomous, animated, Arabic-speaking characters
Quick Facts about TLTS

Developed by the Center for Advanced Research in Technology for Education (CARTE) at the University of Southern California

$7.4 million project funded by DARPA

Led by Dr. Lewis Johnson, director of CARTE, a linguist and A.I. expert

Being developed as part of the Training Superiority Program (DARWARS): "DARWARS seeks to transform military training by providing continuously-available, on-demand mission-level training for all forces at all echelons"

To be deployed late in 2005

Full program to include about 80 hours of instruction with a vocabulary of around 500 carefully chosen words
Objectives of the TLTS

Help military and civilian personnel gain an understanding of a
foreign language and culture so they can learn to communicate
peacefully and effectively with foreigners in their native
language

Eliminate heavy reliance on language experts

Deemphasize written language -- focus on spoken communication
skills for immediate application

Teach the role of nonverbal communication

Develop a more engaging and motivating learning environment
compared to traditional language instruction

Provide training in less commonly taught, difficult-to-learn languages

Yield rapid acquisition of foreign language skills
System Architecture

Three main components:

Mission Skill Builder (MSB): interactive exercises that introduce the learner to the vocabulary and pronunciation of the language

Mission Practice Environment (MPE): story-based, interactive video game environment where learners advance through game levels by using their newly acquired linguistic and cultural skills to accomplish particular tasks and missions

Medina Authoring Tool: used to develop curriculum and game content

These are supported by a common set of services and content databases: Curriculum Database, Pedagogical Agent, Learner Model, and Language Model

The Language Model consists of:

Speech Recognizer: used by both the MSB and the MPE

Natural Language Parser: annotates phrases with structural information and refers to relevant grammatical explanations

Error Model: finds and analyzes syntactic and phonological mistakes in the learner's speech
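
To make the division of labor concrete, here is a minimal Python sketch of how these shared Language Model services could be chained. Every class and method name below is invented for illustration; it is not taken from the actual TLTS code.

    # Illustrative only: the real system wraps an HTK-based recognizer (C++)
    # behind services of roughly this shape.
    class SpeechRecognizer:
        def recognize(self, audio):
            # Would return the best-matching utterance hypothesis for the audio clip.
            return "marHaba"  # placeholder hypothesis

    class NLParser:
        def annotate(self, utterance):
            # Would attach structural information and point to relevant grammar notes.
            return {"utterance": utterance, "structure": "greeting", "grammar_ref": "lesson 1"}

    class ErrorModel:
        def analyze(self, utterance):
            # Would flag syntactic and phonological deviations in learner speech.
            return []  # no errors found for this placeholder

    class LanguageModel:
        """Bundles the three shared services used by both the MSB and the MPE."""
        def __init__(self):
            self.recognizer = SpeechRecognizer()
            self.parser = NLParser()
            self.error_model = ErrorModel()

        def process(self, audio):
            utterance = self.recognizer.recognize(audio)
            return self.parser.annotate(utterance), self.error_model.analyze(utterance)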
System Architecture Diagram
[Architecture diagram showing the MEDINA Authoring Tool, the Mission Skill Builder (MSB), the Mission Practice Environment (MPE), and the shared services: Pedagogical Agent, Curriculum Material, Learner Model, and the Language Model (NLP Parser, Speech Recognizer, Error Model)]

Source: Johnson, W.L., S. Marsella, N. Mote, H. Vilhjalmsson, S. Narayanan, and S. Choi, "Tactical Language Training System: Supporting the Rapid Acquisition of Foreign Language and Cultural Skills"
Mission Skill Builder


Intensive and "intelligent" version of traditional language lab programs, where students are exposed to words and phrases pronounced by native speakers, which they imitate and practice

Important innovations:

Speech Recognizer is tailored for learner speech, so it is able to evaluate the learner's pronunciation and detect common errors

Pedagogical Agent provides the learner with tailored performance feedback

Learner Model tracks what the learner has mastered and what areas the learner needs to improve

Learning process involves the following steps (see the code sketch below):

The learner hears the Pedagogical Agent pronounce a phrase

The learner records himself speaking the phrase

The Speech Recognizer analyzes the recording and passes the result to the Pedagogical Agent, which provides appropriate feedback based on the pronunciation errors and the learner history kept in the Learner Model

Also instructs students in non-verbal communication
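
The practice loop above can be summarized in a short sketch. The class names, threshold, and feedback wording below are assumptions made for illustration; the actual MSB drives an HTK-based recognizer and a much richer learner model.

    class LearnerModel:
        """Tracks which pronunciation errors the learner makes repeatedly."""
        def __init__(self):
            self.error_counts = {}

        def update(self, errors):
            for e in errors:
                self.error_counts[e] = self.error_counts.get(e, 0) + 1

        def is_recurring(self, error, threshold=3):
            return self.error_counts.get(error, 0) >= threshold

    def agent_feedback(detected_errors, learner_model):
        """Pedagogical-agent style feedback based on current errors and learner history."""
        learner_model.update(detected_errors)
        if not detected_errors:
            return "Well done, that phrase was pronounced correctly."
        if learner_model.is_recurring(detected_errors[0]):
            return f"The sound '{detected_errors[0]}' keeps causing trouble; let's drill it on its own."
        return f"Listen once more and pay attention to the sound '{detected_errors[0]}'."

    # Example turn: the recognizer (not shown) reports one mispronounced sound.
    model = LearnerModel()
    print(agent_feedback(["H"], model))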
Mission Skill Builder
Mission Practice Environment

Story-based, interactive video game environment designed to
give students an unscripted, unpredictable, and challenging test
of their mastery of the skills learned in the MSB

Learner moves a uniformed figure through a videogame-like
Lebanese village

Learner speaks into a microphone to control the speech of his
character and selects from gestures for nonverbal communication

Can carry on free-form conversation with AI-animated, Arabic-speaking characters, who understand what is said, provided it is intelligible Arabic, and respond accordingly

Learner must be careful to use appropriate phrases and gestures

Tests the learner’s ability to carry on two-way communication
Mission Practice Environment
Initial Game Scenario:
“In a scene in a café, Sergeant Smith must try
to find out who the village headman is. If he
doesn’t act properly, one of the café patrons
will jump up and demand to know who he really
is. If tensions escalate, the patron will
eventually accuse the sergeant of being a CIA
agent. Standing in the background is the
pedagogical agent, here in the role of aide, who
can assist the learner by translating phrases or
offering suggestions of what to say.”
Mission Practice Environment
Mission Practice Environment

The Learner Model maintained by the Pedagogical Agent controls
the aide’s behavior in the game

Adapts to each individual, noting consistent errors or difficulties,
which can be targeted for remedial practice in the MSB

Based on the graphics capabilities of Unreal Tournament:

Implemented as a Total Conversion Mod to Unreal Tournament 2003

Removed all the combat elements

Added a speech recognition engine

Added intelligent agents that react to the learner's speech and pronunciation

The UnrealWorld component renders the game on screen and provides the user interface
Mission Practice Environment
Mission Practice Environment

MissionEngine

Controls what happens in the game, while the
UnrealWorld renders it on the screen

Represents each character in the story as an agent
with its own goals, relationships, and private beliefs

High-level director agent influences the character agents: it controls how the story unfolds and ensures the pedagogical and dramatic goals are met

Backend written in Python
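
As a rough illustration of this design (again in Python, matching the back end, but with invented class names and rules rather than the real PsychSim/MissionEngine API), the characters and the director might look like this:

    class CharacterAgent:
        """A story character with its own goals, relationships, and private beliefs."""
        def __init__(self, name, trust, beliefs):
            self.name = name
            self.trust = trust          # goal-related state, e.g. trust in the visitor
            self.beliefs = beliefs      # private beliefs about the world
            self.relationships = {}     # attitudes toward other characters

        def react(self, polite_and_correct):
            # A character warms up when addressed correctly and politely, cools off otherwise.
            self.trust = max(0.0, min(1.0, self.trust + (0.1 if polite_and_correct else -0.2)))
            return "cooperative" if self.trust > 0.5 else "suspicious"

    class DirectorAgent:
        """High-level agent that nudges characters so pedagogical and dramatic goals are met."""
        def __init__(self, characters):
            self.characters = characters

        def step(self, polite_and_correct):
            reactions = {c.name: c.react(polite_and_correct) for c in self.characters}
            # If tension stays high, the director can cue the aide to intervene.
            if all(r == "suspicious" for r in reactions.values()):
                reactions["aide"] = "suggests a phrase the learner could try"
            return reactions

    cafe = DirectorAgent([CharacterAgent("patron", trust=0.2, beliefs={"visitor_is_cia": False})])
    print(cafe.step(polite_and_correct=True))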
MissionEngine System Architecture

Pedagogical Agent: intelligent agent that provides feedback and encouragement to the learner based on pronunciation correctness and learner history; implemented in Python

Automatic Speech Recognizer: speech recognition system built on top of the Cambridge Hidden Markov Model Toolkit (HTK); implemented as a C++ library

PsychSim: decision-making framework of the virtual characters; models the goals, motivations, and world beliefs of the characters; implemented in Python

SocialPuppets: module that controls physical character behavior in the environment given a description of the character's intent from PsychSim; implemented in Python

Gamebots: interface that allows Unreal Tournament bots to be controlled

DataManager: storage module used for all data in the system; implemented in C++ as an XML database
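
How these modules might hand data to one another during a single learner turn is sketched below. The module names come from the list above, but their interfaces are invented here for illustration; this is not the real integration code.

    def mission_turn(audio, asr, psychsim, social_puppets, gamebots):
        """One MPE turn: recognize speech, decide character intent, act it out in Unreal."""
        utterance = asr.recognize(audio)              # C++ HTK recognizer behind a Python wrapper
        intents = psychsim.decide(utterance)          # goals/beliefs -> each character's intent
        behaviors = social_puppets.realize(intents)   # intent -> concrete speech acts and gestures
        for bot_name, behavior in behaviors.items():
            gamebots.send(bot_name, behavior)         # drive the Unreal Tournament bots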
MissionEngine Architecture
http://www.python.org/pycon/2005/papers/4/MissionEngine.WhitePaper.pdf
Speech Recognition

Hidden Markov Model Automatic Speech Recognizer bootstrapped from English and Modern Standard Arabic speech and enhanced with data from native and learner Lebanese Arabic speech

Implemented using the Cambridge HTK

Trained on a Modern Standard Arabic dataset with around 10 hours of native speech, as well as approximately one hour of non-native speech samples

Learner speech data is being collected to train the ASR

Non-native pronunciation variants were generated for every utterance in the system and loaded into the Arabic ASR

Hypothesis Rejection Module compares HMM likelihoods from an
Arabic recognizer, English recognizer, and pronunciation variants to
detect whether the user has spoken the right utterance and provide
correct feedback
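
The rejection logic can be pictured with a small sketch: compare the acoustic score of the expected Arabic utterance against an English model and against known mispronunciation variants. The score values, margin, and function names are invented for this illustration, not the project's actual HTK code.

    def judge_utterance(target_score, english_score, variant_scores, margin=0.0):
        """Decide whether the learner said the target phrase, a known error, or something else."""
        best_variant = max(variant_scores.values()) if variant_scores else float("-inf")
        if target_score >= max(english_score, best_variant) + margin:
            return ("accept", None)                   # right utterance, acceptably pronounced
        if best_variant > english_score:
            worst = max(variant_scores, key=variant_scores.get)
            return ("accept_with_error", worst)       # intended phrase, but with a known error
        return ("reject", None)                       # sounds more like English / off-task speech

    # Example: the "th pronounced as s" variant scores highest, so the learner meant
    # the phrase but made that substitution and should get feedback about it.
    print(judge_utterance(-310.0, -305.0, {"th_as_s": -298.0, "H_as_h": -320.0}))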
Speech Recognition

Dynamic switching of recognition grammars allows the recognizer to
focus on recognizing the words and phrases that are likely to occur
in a given learning context

For the MSB, the recognizer is constrained to recognize only the pronunciation variants of the utterances being taught

For the MPE, the recognizer uses a finite state graph that contains all the utterances from the MSB as parallel paths

Focuses on recognizing the most likely utterance from among a set of
utterances that are appropriate for a given scene

Enables the system to simulate dialogue with other characters

If the recognizer recognizes a phrase that doesn’t fit into the current
context, the character indicates that he does not understand

If the recognizer fails to recognize an utterance, the aide makes a
suggestion to the learner
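
A toy version of this grammar switching is sketched below. The phrases and the matching rule are placeholders; the real system compiles finite-state recognition graphs over pronunciation variants rather than matching strings.

    # Each learning context activates its own small recognition grammar.
    SCENE_GRAMMARS = {
        "msb_lesson_1": {"marHaba", "SabaaH il-kheer"},   # only the phrases being taught
        "mpe_cafe": {"marHaba", "SabaaH il-kheer", "shu ismak?", "ween il-mukhtaar?"},
    }

    def respond(hypothesis, scene):
        """Return the system's reaction given the active scene grammar."""
        if hypothesis is None:
            return "aide suggests a phrase the learner could try"   # recognizer found nothing usable
        if hypothesis in SCENE_GRAMMARS[scene]:
            return f"character responds to '{hypothesis}'"
        return "character indicates he does not understand"         # out-of-context phrase

    print(respond("shu ismak?", "mpe_cafe"))   # in-grammar: the character answers
    print(respond(None, "mpe_cafe"))           # nothing recognized: the aide steps in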
Error Detection and Modeling

The ASR detects learner errors and passes them to the Pedagogical Agent, which provides feedback to the learner

Aims to recognize (1) what the learner intended to say and (2) the deviations the learner made from what he intended to say

For each lesson or exercise, a recognition grammar is loaded that detects
correct responses for that context as well as likely learner errors

Speech Recognizer must recognize both true Arabic words and
mispronounced Arabic words since it is dealing with learner speech

The variability of learner language makes robustness difficult to achieve

Inaccuracies in the speech analysis algorithms caused utterances that were pronounced correctly but slowly to be rejected; the system has since been modified to give these utterances higher scores so they are not rejected as errors

Recognition vocabulary size can be kept small because the learner is taught only a small subset of the language
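
One way to picture the error grammar is as the taught phrase expanded with likely learner substitutions, so the recognizer can name the error instead of simply failing. The substitution rules and phone symbols below are examples invented for this sketch, not the project's actual error model.

    # Example substitutions an American English speaker might make (illustrative only).
    SUBSTITUTIONS = {
        "H": ["h"],        # pharyngeal H produced as plain h
        "q": ["k"],        # uvular q produced as k
        "kh": ["k", "h"],  # kh reduced to k or h
    }

    def pronunciation_variants(phones):
        """Expand one target phone sequence into the variants to add to the recognition grammar."""
        variants = [[]]
        for p in phones:
            options = [p] + SUBSTITUTIONS.get(p, [])
            variants = [v + [o] for v in variants for o in options]
        return [" ".join(v) for v in variants]

    # The correct form plus its likely mispronunciations all become recognizable.
    print(pronunciation_variants(["m", "a", "r", "H", "a", "b", "a"]))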
Video Clip
http://www.isi.edu/~jmoore/Mankin/TLMankin256.wmv
Summary

Help people gain an understanding of a foreign language and
culture so they can communicate peacefully and effectively with
foreigners in their native language

Focus on “tactical languages” to accomplish specific missions

Focus on spoken communication skills

Rapid acquisition of foreign language skills → save time

Remove need for interpreters → save money

Model learner speech and common errors, including English
language utterances

More engaging learning experience (and video games are fun!)
References

"DARPA Tactical Language Training Project": http://www.isi.edu/isd/carte/proj_tactlang/tactical_lang_overview.pdf

"Experts Use AI to Help GIs Learn Arabic": http://www.usc.edu/uscnews/stories/10321.html

HTK Speech Recognition Toolkit: http://htk.eng.cam.ac.uk/

Johnson, W. L., C. Beal, A. Fowles-Winkler, U. Lauper, S. Marsella, S. Narayanan, D. Papachristou, and H. Vilhjalmsson, "Tactical Language Training System: An Interim Report"

Johnson, W. L., S. Marsella, N. Mote, H. Vilhjalmsson, S. Narayanan, and S. Choi, "Tactical Language Training System: Supporting the Rapid Acquisition of Foreign Language and Cultural Skills"

"Mission to Arabic: It's Not Your Father's Language Lab": http://www.isi.edu/stories/print/78.html

"MissionEngine: Multi-system integration using Python in the Tactical Language Project": http://www.python.org/pycon/2005/papers/4/MissionEngine.WhitePaper.pdf

Mote, N., W. L. Johnson, A. Sethy, J. Silva, and S. Narayanan, "Tactical Language Detection and Modeling of Learner Speech Errors: The Case of Arabic Tactical Language Training for American English Speakers"

"The Tactical Language Project at CARTE": http://www.isi.edu/isd/carte/proj_tactlang/
Additional Resources

DARPA Training Superiority Program (DARWARS): http://www.darpa.mil/dso/thrust/biosci/training_super.htm

Mission Rehearsal Exercise Project: http://www.ict.usc.edu/disp.php?bd=proj_mre

NPR, "A Virtual Course in Iraqi Arabic": http://www.npr.org/templates/story/story.php?storyId=4503426

Newsweek, "Arabic: High-Tech Tutor": http://www.msnbc.msn.com/id/5146254/site/newsweek/

The Pulse Journal, "Researchers tame violent video game to keep troops safe in Iraq": http://www.pulsejournal.com/news/content/shared/news/nation/stories/0222_TRAINING_GAME.html

Wired Magazine, "The War Room": http://www.wired.com/wired/archive/12.09/warroom.html