Sports Scores Speech Recognition System

Major League Baseball Score System
Development Team Members

Dan Corkum
 Jason NguyenTrieu
 Dan Ragland
 Quang Vu
 Andrew Wagner
(Director)
(Producer)
Sponsor: Jim Larson, Intel Corporation
Goals & Objectives

Develop a compelling Speech Recognition
Application for Retrieval of Sports
Information.
 Incorporate Ease of Use Techniques
including: Tapered Prompts, Global
Commands, Barge-In, Repair Dialogs, and
others.
 Develop an Architecture that is both Robust
and Modular. Design for Reuse.
Example Application
Cellular Phone Application
– Using Wireless Web
– Embedded Windows CE (Auto PC)
Core Modules

“Web Viking” – Parse Internet Web Pages to retrieve
sports information.

Data Warehousing & Querying – Database for
storage of searchable information.

Client and Server Communication – Enables
communication between Server and remote Clients.

VUI (Voice User Interface) Voice Prompts
and Response System – The core engine that
controls the entire VUI.

Dialog Database – Contains the content for the text-to-speech prompts and response criteria.
Architecture - Server
Architecture - Client
Web Viking

The purpose of the Web Viking is to retrieve data
from web sites, then parse and format it so that
the database interface can understand it.
 There are three data collection scripts: Schedule,
Scores, and Standing/Ranking
 The data comes from 2 sources:
– Major League Baseball
– ESPN

Two chances to get the right data:
– First, we get data from the MLB web site and parse it.
If that fails for any reason, we try to get the data from
the ESPN web site.
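
A minimal sketch of the two-source fallback described above. The project's collection scripts were written in Perl; this Python version only illustrates the control flow. The function names and the sample result string are hypothetical stand-ins, not the project's code or real data.

```python
from typing import Callable, Optional, Sequence


def fetch_with_fallback(sources: Sequence[Callable[[], str]],
                        is_valid: Callable[[str], bool]) -> Optional[str]:
    """Try each source in order and return the first usable result."""
    for fetch in sources:
        try:
            data = fetch()
        except Exception:
            # Any failure (network, parsing, ...) moves on to the next source.
            continue
        if is_valid(data):
            return data
    return None  # every source failed


# Hypothetical stand-ins for the MLB and ESPN collection scripts.
def fetch_from_mlb() -> str:
    raise ConnectionError("primary site unreachable")  # simulated failure


def fetch_from_espn() -> str:
    return "Team A 5, Team B 3"  # made-up sample data


result = fetch_with_fallback([fetch_from_mlb, fetch_from_espn], is_valid=bool)
print(result)
```

Keeping the sources in an ordered list means a third source could be added without changing the retry logic.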
Web Viking

How is the data retrieved?
– We used library modules available from CPAN (the
Comprehensive Perl Archive Network).
– The HTTP::Request module: packages up the URL
request.
– The HTTP::Response module: handles the data coming
back.
How the data is parsed:
1. Match and strip off unnecessary data.
2. Regular expression
3. Split
4. Format data and check result.
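
The four parsing steps above can be sketched as follows. This is an illustration in Python, not the original Perl scripts; the page fragment, the `<pre>` wrapper, and the "Team N, Team N" line layout are made-up assumptions about what a scores page might look like.

```python
import re

# Made-up page fragment; the real MLB/ESPN markup is not given in the slides.
RAW = """<html><body><!-- navigation -->
<pre>
Seattle 5, Oakland 3
New York 2, Boston 7
</pre></body></html>"""


def parse_scores(page):
    # 1. Match and strip off unnecessary data: keep only the score block.
    block = re.search(r"<pre>(.*?)</pre>", page, re.DOTALL)
    if block is None:
        raise ValueError("no score block found")
    body = block.group(1)

    # 2. Regular expression: pick out every "Team N, Team N" line.
    game_lines = re.findall(r"[^\n,]+ \d+, [^\n,]+ \d+", body)

    games = []
    for line in game_lines:
        game = []
        # 3. Split: separate the two sides, then the name from the runs.
        for side in line.split(", "):
            name, runs = side.rsplit(" ", 1)
            # 4. Format data: normalise the name and convert runs to int.
            game.append((name.strip(), int(runs)))
        games.append(game)

    # 4. (continued) Check result: an empty list means parsing failed.
    if not games:
        raise ValueError("no games parsed")
    return games


games = parse_scores(RAW)
print(games)
```

Raising an error when nothing is parsed gives the caller a clear signal to try the second data source.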
Database & Queries

The Database was implemented using MS Access.
 It functions as a storage site keeping track of team
names, scores associated with each team,
league/division ranking information, and the
schedules for each game.

The Database Handler was written in Java.
 Its primary purpose is to query the database and
return the results to the Sports Score server.
Client & Server Communication

Being an Internet based application, the server is
designed to support multiple clients
simultaneously.
 Communication is implemented using TCP
(Transmission Control Protocol), a reliable and
widely used Internet protocol.
 The maximum number of clients supported by the
Sports Score server is administrator configurable
based on the performance needs of the server.
Client & Server Communication

Both server and client-side communications are
data independent.
 Data is encapsulated in a packet before
transmission. The data wrapper contains information
about what type of data is encapsulated
and its size.
 Data packeting allows for multiple information
types (ping, data request, communications
termination, etc.)
 Labeling each packet with a type allows for quick
identification and routing of information to
necessary destinations within the server/client.
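
A minimal sketch of the type-and-size wrapper described above. The slides do not give the actual wire format, so the 5-byte header layout (1-byte type, 4-byte length), the type codes, and all names here are assumptions for illustration.

```python
import struct
from enum import IntEnum


class PacketType(IntEnum):
    # Hypothetical type codes; the slides only name the kinds of packet.
    PING = 1
    DATA_REQUEST = 2
    TERMINATE = 3


# Assumed header: 1-byte type, 4-byte payload size, network byte order.
HEADER = struct.Struct("!BI")


def pack(ptype, payload=b""):
    """Wrap a payload with its type and size."""
    return HEADER.pack(ptype, len(payload)) + payload


def unpack(buffer):
    """Return (type, payload, remaining bytes), or None if incomplete."""
    if len(buffer) < HEADER.size:
        return None
    type_code, size = HEADER.unpack_from(buffer)
    end = HEADER.size + size
    if len(buffer) < end:
        return None  # wait for more bytes from the socket
    return PacketType(type_code), buffer[HEADER.size:end], buffer[end:]


# Two packets back to back, as they might arrive on a TCP stream.
stream = pack(PacketType.PING) + pack(PacketType.DATA_REQUEST, b"<query/>")
first_type, first_payload, rest = unpack(stream)
second_type, second_payload, rest = unpack(rest)
print(first_type.name, second_type.name, second_payload)
```

Because the header carries the size, the receiver never has to inspect the payload to find packet boundaries, which is what keeps the communication layer data independent.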
VUI (Voice User Interface)
Voice Prompts and Response System
User Interface and Underlying Logic
VUI
Design Considerations
Two Options For Design:
1. Dialog logic written directly into the program code.
2. Dialog logic entered into a data structure
and presented by separate internal logic.
VUI
Advantages & Disadvantages
of Hard-Coded Dialogs

Advantages:
– Fast initial implementation
– Ultimate flexibility of features

Disadvantages:
– Duplicated code
– Difficult to provide consistent global functionality
– Hard-coded grammars
VUI
Advantages & Disadvantages
of Dialog Database

Advantages:
– Good design: Data separated from presentation
– Consolidation of code
– Easy to create and maintain dialogs
– Features aided by use of recursion
– Computer-generated grammars

Disadvantages:
– Much work required before any results seen
– Difficult to customize specific components
VUI
Decision: Dialog Database

Sports Score dialogs all follow the same
basic pattern
 Implementation could be modularized by
separating the dialogs from their
presentation logic
 The gains made by the ease of entry and
flexibility for the end-user outweighed the
losses in implementation time
 Some features require recursion
VUI
VUI Features

Tapered, User-Level Sensitive Prompts
 Tapered, User-Level Sensitive Help
 Barge-In capability
 User shortcut capability (users can answer future
prompts from any prompt)
 Navigational user commands (“back”, “quit”, etc.)
 Enumerated user commands to allow the user to say a
number as an alternative to the command
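
Two of the features listed above, tapered prompts and enumerated commands, can be sketched in a few lines. This assumes that "tapered" means the prompt wording gets shorter as the user's experience level rises; the prompt texts, level numbering, and function names are hypothetical, not taken from the project's dialog database.

```python
# Hypothetical prompt table: index 0 is for novices, later entries are terser.
PROMPTS = {
    "main_menu": [
        "Welcome to Sports Scores. You can ask for scores, schedules, "
        "or rankings. Which would you like?",
        "Scores, schedules, or rankings?",
        "Main menu.",
    ],
}

NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}


def tapered_prompt(prompt_id, user_level):
    """Pick the prompt wording that matches the user's experience level."""
    versions = PROMPTS[prompt_id]
    index = min(max(user_level, 0), len(versions) - 1)  # clamp to the table
    return versions[index]


def resolve_command(utterance, options):
    """Accept either an option's name or its 1-based number."""
    word = utterance.strip().lower()
    if word in options:
        return word
    number = NUMBER_WORDS.get(word)
    if number is None and word.isdigit():
        number = int(word)
    if number is not None and 1 <= number <= len(options):
        return options[number - 1]
    return None  # unrecognised: the caller can replay the prompt or help


menu = ["scores", "schedules", "rankings"]
print(tapered_prompt("main_menu", 1))
print(resolve_command("two", menu))
```

Because the prompt texts live in a table rather than in the logic, new levels or wordings can be added without touching the selection code.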
VUI
Queries

All query parameters are accumulated in an
XML document
 When a query occurs, the document is sent
to the server
 The server returns an XML document
containing results
 The results are read to the user based on
administrator-defined result strings
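
A minimal sketch of accumulating query parameters in an XML document and reading an XML reply. The project used IBM's XML4J from Java; this uses Python's standard library instead, and the element names (`query`, `type`, `team`, `results`) and the sample reply are hypothetical, since the slides do not show the schema.

```python
import xml.etree.ElementTree as ET


class QueryBuilder:
    """Accumulates query parameters as child elements of one document."""

    def __init__(self):
        self.root = ET.Element("query")

    def set_param(self, name, value):
        # Overwrite the parameter if the user already answered this prompt.
        node = self.root.find(name)
        if node is None:
            node = ET.SubElement(self.root, name)
        node.text = value

    def to_xml(self):
        return ET.tostring(self.root, encoding="unicode")


def parse_results(xml_text):
    """Turn a flat results document into a dict of parameter values."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


query = QueryBuilder()
query.set_param("type", "score")
query.set_param("team", "Team A")
query.set_param("team", "Team B")  # user corrected an earlier answer
request_xml = query.to_xml()
print(request_xml)

# Made-up reply standing in for what the server might return.
reply = "<results><status>final</status><winner>Team B</winner></results>"
results = parse_results(reply)
print(results)
```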
Why XML?

XML is fast becoming the industry standard
for data transfer over the Internet
 XML’s hierarchical structure lends itself to
this application
 Several XML parsers already exist for
various platforms (we used IBM’s XML4J)
 The HTML-like nature of XML makes
results easy to read, even for a human.
How Query Results Are Read

The administrator defines parameter-value
pairs as criteria for which response is read
 Each response consists of segments of
literal text along with parameter values
(which can be drawn either from the client
or server)
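
One way to picture this: each response carries its criteria plus a list of segments, and the first response whose criteria all match is assembled and read out. The data layout, parameter names, and sample values below are assumptions for illustration, not the project's dialog database format.

```python
# Hypothetical administrator-defined responses. A segment is either literal
# text (str) or ("param", name), which is filled in from the parameters.
RESPONSES = [
    {
        "criteria": {"type": "score", "status": "final"},
        "segments": [("param", "winner"), " beat ", ("param", "loser"), ", ",
                     ("param", "winner_runs"), " to ",
                     ("param", "loser_runs"), "."],
    },
    {
        "criteria": {"type": "score", "status": "none"},
        "segments": [("param", "team"), " did not play on that date."],
    },
]


def render(responses, client_params, server_params):
    """Return the text of the first response whose criteria all match."""
    params = {**client_params, **server_params}  # values from either side
    for response in responses:
        criteria = response["criteria"]
        if all(params.get(key) == value for key, value in criteria.items()):
            return "".join(
                seg if isinstance(seg, str) else str(params[seg[1]])
                for seg in response["segments"]
            )
    return None  # no response defined for this combination


# Made-up sample values.
client = {"type": "score", "team": "Team A"}
server = {"status": "final", "winner": "Team A", "loser": "Team B",
          "winner_runs": 5, "loser_runs": 3}
spoken = render(RESPONSES, client, server)
print(spoken)
```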
VUI
The Results

The front-end is very customizable
 Dialogs can be built simply and quickly
 The system administrator needs no
knowledge of programming concepts
 The overall behavior of the system could be
changed without changing each prompt
 The computer speech engine is accessed in
only one area of code, so it could be
swapped with minimal effort
Dialog Structure

The Dialog System consists of:
– Prompts
– Responses
– Help System
 All Dialogs are tapered (Prompts, Responses, & Help)
 Repair Dialogs – Example: Two teams from the same city
(New York: Mets and Yankees)
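
A minimal sketch of the repair dialog for the two-teams-in-one-city case. The lookup table, the wording of the follow-up question, and the `ask` callback (standing in for a prompt plus speech recognition) are hypothetical.

```python
# Small illustrative table; a real system would draw this from the database.
TEAMS_BY_CITY = {
    "new york": ["Mets", "Yankees"],
    "seattle": ["Mariners"],
}


def resolve_team(city, ask):
    """Return one team name, asking a follow-up question if ambiguous."""
    teams = TEAMS_BY_CITY.get(city.strip().lower())
    if not teams:
        return None  # unknown city
    if len(teams) == 1:
        return teams[0]  # no repair needed
    # Ambiguous: run the repair dialog.
    answer = ask("Did you mean the " + " or the ".join(teams) + "?")
    for team in teams:
        if team.lower() in answer.lower():
            return team
    return None  # still unclear; the caller may repeat the prompt


asked = []


def fake_user(prompt):
    asked.append(prompt)  # record the prompt the system would speak
    return "the Yankees"


team = resolve_team("New York", fake_user)
print(asked[0], "->", team)
```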
Dialog Structure Overview
[Diagram: the Main Menu branches to Score, Schedule, and Rank, which lead to Score Info., Scheduling Info., and Ranking Info. (Info by League). HELP is available from the Main Menu and from each branch.]
Summary

We not only developed a powerful Speech
Recognition Application for Retrieval of Sports
Information, we also developed a reusable
framework which can be easily modified for use in
other applications.

We incorporated Ease of Use Techniques
including: Tapered Prompts, Global Commands,
Barge-In, Repair Dialogs, and others.
More Information is available on the Web:
http://www.cs.pdx.edu/~danr/public/capstone/