About VoiceXML 2.0

Stefanie Shriver

Much of this material is taken directly from the VoiceXML 2.0 spec: http://www.w3.org/TR/voicexml20/

Why use VoiceXML?

Minimizes client/server interactions by specifying multiple interactions per document

Shields application authors from low-level and platform-specific details

Separates user interaction code (in VoiceXML) from service logic (CGI scripts)

Promotes service portability across implementation platforms. VoiceXML is a common language for content providers, tool providers, and platform providers

Easy to use for simple interactions, yet provides language features to support complex dialogs
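As a rough illustration of the first and third points, here is a minimal sketch modeled loosely on the introductory example in the 2.0 spec. One document collects two answers without a server round trip, then hands both to the service logic with a single submit. The grammar files, field names, and CGI URL are placeholders invented for this sketch.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="order">
    <!-- Both fields are collected by the interpreter; no server contact yet -->
    <field name="drink">
      <prompt>Would you like coffee, tea, or milk?</prompt>
      <!-- drink.grxml is a hypothetical grammar file -->
      <grammar src="drink.grxml" type="application/srgs+xml"/>
    </field>
    <field name="size">
      <prompt>What size would you like?</prompt>
      <!-- size.grxml is a hypothetical grammar file -->
      <grammar src="size.grxml" type="application/srgs+xml"/>
    </field>
    <!-- One request sends both answers to the service logic (placeholder URL) -->
    <block>
      <submit next="http://www.example.com/cgi-bin/order.cgi" namelist="drink size"/>
    </block>
  </form>
</vxml>
```

The dialog wording lives entirely in the VoiceXML document, while the CGI script only sees the two variables named in namelist.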

VoiceXML has features to handle:

- Output of synthesized speech (text-to-speech)

- Output of audio files

- Recognition of spoken input

- Recognition of DTMF input

- Recording of spoken input

- Control of dialog flow

- Telephony features such as call transfer and disconnect
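The features listed above can be sketched in a single small document. This is only an illustration under assumed names: the audio file, the option values, the form item names, and the telephone number are all hypothetical, and real platforms differ in which audio formats and transfer types they support.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="demo">
    <block>
      <!-- Audio file output; the enclosed text is spoken by TTS if the file cannot be played -->
      <prompt><audio src="welcome.wav">Welcome to the demo line.</audio></prompt>
    </block>

    <!-- Spoken and DTMF input: each option accepts its spoken text or its key -->
    <field name="dept">
      <prompt>Say sales or support, or press 1 or 2.</prompt>
      <option dtmf="1" value="sales">sales</option>
      <option dtmf="2" value="support">support</option>
      <noinput>I did not hear anything. <reprompt/></noinput>
      <nomatch>I did not understand. <reprompt/></nomatch>
    </field>

    <!-- Recording, visited only for support callers (dialog flow via cond) -->
    <record name="msg" cond="dept == 'support'" beep="true" maxtime="10s" dtmfterm="true">
      <prompt>Please leave a short message after the beep.</prompt>
    </record>

    <!-- Call transfer, visited only for sales callers (placeholder number) -->
    <transfer name="call" cond="dept == 'sales'" dest="tel:+15555550100" bridge="true">
      <prompt>Transferring you now.</prompt>
    </transfer>

    <!-- Disconnect once the earlier items are done -->
    <block>
      <prompt>Goodbye.</prompt>
      <disconnect/>
    </block>
  </form>
</vxml>
```

The cond attributes are one simple way to steer the flow; goto and if elements inside a filled handler are alternatives.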

What can you do with VoiceXML?

- Create simple dialogs, simply.

- Good for prototyping (hmm, would this have worked for USI keyword experiment?)

- Create more complex dialogs with some work.

- "VoiceXML supports a limited type of mixed initiative. VoiceXML does NOT support the user asking arbitrary questions during a dialog."

- I think it can actually be more arbitrary than this, though, with more complex grammars.
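For reference, the limited mixed initiative described in the spec is usually built from a form-level grammar plus an initial element. The sketch below assumes a hypothetical sports-score form with team and day fields; the grammar files and the submit URL are invented, and the form-level grammar is assumed to return results whose keys match the field names.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="game">
    <!-- Form-level grammar (hypothetical file): one utterance may fill team, day, or both -->
    <grammar src="game.grxml" type="application/srgs+xml"/>

    <!-- Open prompt; the caller takes the initiative here -->
    <initial name="start">
      <prompt>Which game would you like to hear about?</prompt>
      <nomatch count="1">
        Please say something like yesterday's Pirates game.
        <reprompt/>
      </nomatch>
      <nomatch count="2">
        Sorry, let's take this one step at a time.
        <!-- Marking the initial item as done falls back to directed, field-by-field prompts -->
        <assign name="start" expr="true"/>
        <reprompt/>
      </nomatch>
    </initial>

    <!-- Any field the open utterance did not fill is prompted for individually -->
    <field name="team">
      <prompt>Which team?</prompt>
      <grammar src="team.grxml" type="application/srgs+xml"/>
    </field>
    <field name="day">
      <prompt>Today's game, yesterday's game, or the last game?</prompt>
      <grammar src="day.grxml" type="application/srgs+xml"/>
    </field>

    <!-- Runs once both fields are filled (placeholder URL) -->
    <filled mode="all" namelist="team day">
      <submit next="http://www.example.com/cgi-bin/scores.cgi" namelist="team day"/>
    </filled>
  </form>
</vxml>
```

How much the caller can say in one utterance is then bounded by what game.grxml covers, which is where more complex grammars come in.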

[ SUB_PLST:plst {<option strcat($plst "^d=na^ps=true")>}

SUB_TMST:tmst {<option strcat($tmst "^d=na^ps=false")>}

SUB_HELP:hst {<option strcat($hst "^d=na^ps=hh")>}

SUB_STARTOVER:sst {<option strcat($sst "^d=na^ps=hh")>}

SUB_QUIT:qst {<option strcat($qst "^d=na^ps=hh")>}

SUB_UPDATE:updt {<option strcat($updt "^d=na^ps=updt")>}

SUB_LEADER:ldr {<option strcat($ldr "^d=na^ps=ldr")>}

SUB_NEWGAME:ngm {<option strcat($ngm "^ps=go")>}]

SUB_STARTOVER

[ (start over) {return (strcat("event=" "home"))}

(?(go) home) {return (strcat("event=" "home"))}]

SUB_NEWGAME

[ (?(i want to) go to SUB_ALLTEAMS:t ?(game)) {return (strcat(strcat("team=" "")strcat($t "^d=today")))}

([(tell me) what] about the SUB_ALLTEAMS:t ?(game)) {return (strcat(strcat("team=" "")strcat($t "^d=today")))}

(?(i want to) go to the last SUB_ALLTEAMS:t ?(game)) {return (strcat(strcat("team=" "")strcat($t "^d=last")))}

([(tell me) what] about the last SUB_ALLTEAMS:t ?(game)) {return (strcat(strcat("team=" "")strcat($t "^d=last")))}

(?(i want to) go to yesterday's SUB_ALLTEAMS:t ?(game)) {return (strcat(strcat("team=" "")strcat($t "^d=yesterday")))}

([(tell me) what] about yesterday's SUB_ALLTEAMS:t ?(game)) {return (strcat(strcat("team=" "")strcat($t "^d=yesterday")))}]

SUB_QUIT

[ (quit) {return (strcat("event=" "quit"))}

(good bye) {return (strcat("event=" "quit"))}]

SUB_PLST

[ [ (SUB_PLAYER:p SUB_STAT:s)

(SUB_STAT:s SUB_PLAYER:p)

(?[(give me) (tell me ?(about)) (what is)] ?(the) SUB_STAT:s ?(for) SUB_PLAYER:p)

(?(SUB_TELLME) how many SUB_STAT:s SUB_PLAYER:p has ?(had))

Resources

http://www.w3.org/TR/voicexml20/

http://www.voicexml.org/

Development platforms:

- http://studio.tellme.com/

- http://cafe.bevocal.com

- http://freespeech.heyanita.com

- http://developer.voicegenie.com/

See http://www.commweb.com/article/COM20010129S0003 for an article comparing these platforms.

What about SALT?

http://www.saltforum.org/

SALT developed/promoted by Microsoft, Philips, SpeechWorks, Intel, Cisco, Comverse

VoiceXML developed/promoted by AT&T, Lucent, IBM, Motorola

SALT features

Focus on multi-modal development

Supports XML form of SRGS

Parallel tasks

Applications are DOM based

Uses SSML for speech synthesis

Call Control

Applications are scripted in ECMAScript (aka JavaScript)

Uses fewer XML elements

(see http://www.voicexmlplanet.com/articles/saltspec.html)

Multi-modality in SALT

<xhtml xmlns:salt="urn:schemas.saltforum.org/2002/02/SALT">

<!-- HTML -->

...

<input name="txtBoxCity" type="text" onpendown="listenCity.Start()"/>

...

<!-- SALT -->

<salt:listen id="listenCity">

<salt:grammar name="gramCity" src="./city.xml" />

<salt:bind targetelement="txtBoxCity" value="//city" />

</salt:listen>

</xhtml>
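The ./city.xml grammar referenced by the listen element is not shown here. Assuming it is written in the XML form of SRGS, it might look roughly like the sketch below. The city names are invented, and the step that turns a match into the //city node targeted by salt:bind is platform-dependent and left out.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical contents of city.xml: a minimal XML-form SRGS grammar -->
<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" mode="voice" root="city">
  <!-- Public root rule: the caller may say exactly one of the listed cities -->
  <rule id="city" scope="public">
    <one-of>
      <item>Pittsburgh</item>
      <item>Seattle</item>
      <item>New York</item>
    </one-of>
  </rule>
</grammar>
```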

Discussion

Can something this "simple" really handle user- or mixed-initiative that well?

What are the implications of having a standard, but having different development platforms with different supported & proprietary features?

What do we really need to solve dialog system development? (per Alex…)

Can multi-modality be successfully integrated (i.e. via SALT)?
