Much of this material is drawn directly from the VoiceXML 2.0 spec: http://www.w3.org/TR/voicexml20/
Minimizes client/server interactions by specifying multiple interactions per document
Shields application authors from low-level and platform-specific details
Separates user interaction code (in VoiceXML) from service logic (CGI scripts)
Promotes service portability across implementation platforms. VoiceXML is a common language for content providers, tool providers, and platform providers
Easy to use for simple interactions, yet provides language features to support complex dialogs
Output of synthesized speech (text-to-speech)
Output of audio files
Recognition of spoken input
Recognition of DTMF input
Recording of spoken input
Control of dialog flow
Telephony features such as call transfer and disconnect
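A minimal sketch of how several of the features listed above might appear in a single VoiceXML 2.0 document. The audio file name, the menu choices, and the telephone number are made up for illustration, and platforms differ in which features they fully support.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="demo">
    <block>
      <!-- Audio file output; the enclosed text is spoken via TTS if the
           (hypothetical) file chime.wav cannot be played -->
      <prompt><audio src="chime.wav">Welcome to the demo.</audio></prompt>
    </block>

    <!-- Spoken or DTMF input: each option contributes to both grammars -->
    <field name="choice">
      <prompt>Say sales or support, or press 1 or 2.</prompt>
      <option dtmf="1" value="sales">sales</option>
      <option dtmf="2" value="support">support</option>
    </field>

    <!-- Recording of spoken input -->
    <record name="msg" beep="true" maxtime="10s">
      <prompt>Leave a short message after the beep.</prompt>
    </record>

    <!-- Telephony: bridge transfer, only if the caller chose sales.
         The destination number is a placeholder. -->
    <transfer name="call" dest="tel:+15555550100" bridge="true"
              cond="choice == 'sales'"/>

    <block>
      <disconnect/>
    </block>
  </form>
</vxml>
```

Dialog flow here is just the default form behavior: items are visited top to bottom, skipping any whose cond is false or whose variable is already filled.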
Create simple dialogs, simply.
Good for prototyping (hmm, would this have worked for USI keyword experiment?)
Create more complex dialogs with some work.
"VoiceXML supports a limited type of mixed initiative. VoiceXML does NOT support the user asking arbitrary questions during a dialog."
I think it can actually be more arbitrary than this, though, with more complex grammars.
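A rough sketch of the limited mixed initiative described in the quote: a form-level grammar lets the caller fill several fields in one utterance, and the individual fields pick up whatever is still missing. The grammar files (game.grxml, team.grxml, day.grxml) are hypothetical, and the sketch assumes the form-level grammar returns slots named team and day.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="game">
    <!-- Form-level grammar (hypothetical file): active throughout the form,
         assumed to return "team" and/or "day" slots -->
    <grammar src="game.grxml" type="application/srgs+xml"/>

    <!-- Open-ended first prompt; the caller may answer with one or both slots -->
    <initial name="start">
      <prompt>Which game would you like?</prompt>
      <nomatch count="2">
        <!-- Give up on the open prompt and fall back to directed questions -->
        <assign name="start" expr="true"/>
        <reprompt/>
      </nomatch>
    </initial>

    <!-- Directed fields, visited only for slots that are still unfilled -->
    <field name="team">
      <grammar src="team.grxml" type="application/srgs+xml"/>
      <prompt>Which team?</prompt>
    </field>
    <field name="day">
      <grammar src="day.grxml" type="application/srgs+xml"/>
      <prompt>Today's game, yesterday's, or the last one?</prompt>
    </field>

    <!-- Runs once both fields have values -->
    <filled mode="all">
      <prompt>Getting the <value expr="team"/> game for <value expr="day"/>.</prompt>
    </filled>
  </form>
</vxml>
```

How "arbitrary" the caller can be then depends mostly on how much the form-level grammar covers.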
[ SUB_PLST:plst {<option strcat($plst "^d=na^ps=true")>}
SUB_TMST:tmst {<option strcat($tmst "^d=na^ps=false")>}
SUB_HELP:hst {<option strcat($hst "^d=na^ps=hh")>}
SUB_STARTOVER:sst {<option strcat($sst "^d=na^ps=hh")>}
SUB_QUIT:qst {<option strcat($qst "^d=na^ps=hh")>}
SUB_UPDATE:updt {<option strcat($updt "^d=na^ps=updt")>}
SUB_LEADER:ldr {<option strcat($ldr "^d=na^ps=ldr")>}
SUB_NEWGAME:ngm {<option strcat($ngm "^ps=go")>}]
SUB_STARTOVER
[ (start over) {return (strcat("event=" "home"))}
(?(go) home) {return (strcat("event=" "home"))}]
SUB_NEWGAME
[ (?(i want to) go to SUB_ALLTEAMS:t ?(game)) {return (strcat(strcat("team=" "")strcat($t "^d=today")))}
([(tell me) what] about the SUB_ALLTEAMS:t ?(game)) {return (strcat(strcat("team=" "")strcat($t "^d=today")))}
(?(i want to) go to the last SUB_ALLTEAMS:t ?(game)) {return (strcat(strcat("team=" "")strcat($t "^d=last")))}
([(tell me) what] about the last SUB_ALLTEAMS:t ?(game)) {return (strcat(strcat("team=" "")strcat($t "^d=last")))}
(?(i want to) go to yesterday's SUB_ALLTEAMS:t ?(game)) {return (strcat(strcat("team=" "")strcat($t "^d=yesterday")))}
([(tell me) what] about yesterday's SUB_ALLTEAMS:t ?(game)) {return (strcat(strcat("team=" "")strcat($t "^d=yesterday")))}]
SUB_QUIT
[ (quit) {return (strcat("event=" "quit"))}
(good bye) {return (strcat("event=" "quit"))}]
SUB_PLST
[ [ (SUB_PLAYER:p SUB_STAT:s)
(SUB_STAT:s SUB_PLAYER:p)
(?[(give me) (tell me ?(about)) (what is)] ?(the) SUB_STAT:s ?(for) SUB_PLAYER:p)
(?(SUB_TELLME) how many SUB_STAT:s SUB_PLAYER:p has ?(had))
http://studio.tellme.com/
http://cafe.bevocal.com
http://freespeech.heyanita.com
http://developer.voicegenie.com/
See http://www.commweb.com/article/COM20010129S0003 for an article comparing these platforms
Focus on multi-modal development
Supports XML form of SRGS
Parallel tasks
Applications are DOM based
Uses SSML for speech synthesis
Call Control
Applications are scripted in ECMAScript (aka JavaScript)
Uses fewer XML elements
(see http://www.voicexmlplanet.com/articles/saltspec.html)
<xhtml xmlns:salt="urn:schemas.saltforum.org/2002/02/SALT">
<!-- HTML -->
...
<input name="txtBoxCity" type="text" onpendown="listenCity.Start()"/>
...
<!-- SALT -->
<salt:listen id="listenCity">
<salt:grammar name="gramCity" src="./city.xml" />
<salt:bind targetelement="txtBoxCity" value="//city" />
</salt:listen>
</xhtml>
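The listen element above loads a grammar from ./city.xml, which is not shown. As an illustration only, a grammar in the XML form of SRGS might look like the sketch below; the city names are made up, and how a recognized city ends up at the //city path used by the bind element depends on the platform's result format, which this sketch does not cover.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-US" mode="voice" root="city">
  <!-- Root rule: matches exactly one of the listed city names -->
  <rule id="city" scope="public">
    <one-of>
      <item>Boston</item>
      <item>Pittsburgh</item>
      <item>Seattle</item>
    </one-of>
  </rule>
</grammar>
```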
Can something this "simple" really handle user- or mixed-initiative that well?
What are the implications of having a standard, but having different development platforms with different supported & proprietary features?
What do we really need to solve dialog system development? (per Alex…)
Can multi-modality be successfully integrated (i.e. via SALT)?