ITCS 6010
SALT

Speech Application Language Tags (SALT)

Speech interface markup language

Extension of HTML and other markup languages

Adds speech and telephony features to Web applications
and services for both voice only and multimodal
browsers
SALT Overview

SALT


Small set of XML elements
Elements have:
Attributes
 DOM (Document Object Model) object properties
 Events
 Methods


Applies speech to source page when used in
conjunction with source markup document
SALT Design Principles

Clean integration of speech with Web
pages



Leverages event-based DOM execution
model of Web pages
Integrates cleanly into visual markup pages
Reuses knowledge and skill of Web
developers

Does not reinvent page execution or programming
models
SALT Design Principles (cont’d)

Separation of speech interface from business
logic and data



Individual markup language not directly extended
Provides separate layer extensible across different
markup languages
Allows for loose or tight coupling of speech interface
to underlying data structure

Enables reuse of speech and dialog components across
pages and applications
SALT Design Principles (cont’d)

Power and flexibility of programming
model



SALT elements are simple and intuitive
Offer fine-level control of dialog execution
through DOM event and scripting model
Leverages benefits of rich and well-understood execution environment
SALT Design Principles (cont’d)

Reuses existing standards for grammar,
speech output and semantic results

Range of devices


Designed for range of architectural scenarios
Not for particular device type
SALT Design Principles (cont’d)

Minimal cost of authoring across modes
and devices

Enables 2 important classes of application
scenario
1) Multimodal
o Visual page enhanced with speech interface on same
device
2) Cross-modal
o Single application page reused for different modes on
different devices
Top-level Elements

There are 4 main top-level elements:

<prompt …>
For speech synthesis and prompt playing
<listen …>
For speech recognition
<dtmf …>
For configuration and control of DTMF collection
<smex …>
For general purpose communication with platform
components
Top-level Elements (cont’d)

listen and DTMF elements


May contain <grammar> and <bind>
elements
listen element

May contain <record> element
<listen> Element

Used for speech input






Specifies grammars
Specifies means of dealing with speech recognition
results
Used for recording spoken input
Handles speech events and configures recognizer
properties
Activates/deactivates grammars
Starts/stops recognition
<listen> Element (cont’d)

<listen> example
<salt:listen id="travel">
<salt:grammar src="./city.xml" />
<salt:bind targetElement="txtBoxOriginCity"
value="/result/origin_city" />
</salt:listen>
<listen> Element (cont’d)

<listen> element



Can be executed with Start() method in script
Can be executed declaratively in scriptless environment
Handlers include events for:




Successful recognitions
Misrecognitions
Timeouts
Each recognition event can be configured via attributes
for:


Timeout periods
Confidence thresholds
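The handler categories above can be sketched in plain JavaScript. This is a hypothetical model, not the SALT API: the function classifyOutcome, the option names timeoutMs and confidenceThreshold, and the outcome labels are all assumptions made for illustration.

```javascript
// Hypothetical model, not the SALT API: decide which kind of handler
// a single recognition attempt would trigger.
function classifyOutcome(attempt, options) {
  // No speech started within the timeout period.
  if (attempt.speechStartMs === null || attempt.speechStartMs > options.timeoutMs) {
    return "timeout";
  }
  // Speech was heard, but confidence is below the configured threshold.
  if (attempt.confidence < options.confidenceThreshold) {
    return "misrecognition";
  }
  return "recognition";
}

// Sample settings and attempts (invented values).
const listenOptions = { timeoutMs: 3000, confidenceThreshold: 0.4 };
console.log(classifyOutcome({ speechStartMs: 800, confidence: 0.55 }, listenOptions));
console.log(classifyOutcome({ speechStartMs: 800, confidence: 0.2 }, listenOptions));
console.log(classifyOutcome({ speechStartMs: null, confidence: 0 }, listenOptions));
```

In a real page these decisions are made by the recognizer; the author only sets the attributes and supplies the handlers.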
<grammar> Element

Used to specify grammars




Inline or referenced
Multiple grammar elements may be used in single
<listen>
Individual grammars may be activated/deactivated
before recognition begins
Independent of grammar format

Will support at minimum XML form of W3C Speech
Recognition Grammar Specification
<bind> Element


Used to inspect result of recognition
Conditionally copies relevant portions to
values in page


Multiple bind elements may be used in single
<listen>
Recognition result returned in XML document
form
Uses XPath syntax in value attribute
Uses an XML pattern query in test attribute

<bind> Element (cont’d)

Value attribute


To reference particular node of result
Test attribute


To specify binding conditions
If condition evaluates to true, node content
bound to page element specified by
targetElement attribute
<bind> Element Example

Recognition example
<result text="I’d like to go to London, please" confidence="0.45">
<dest_city text="to London" confidence="0.55"> London</dest_city>
</result>
<bind> code
<input name="txtBoxDestCity" type="text" />
<salt:listen ….>
<salt:bind targetElement="txtBoxDestCity"
value="/result/dest_city" test="/result/dest_city[@confidence >
0.4]" />
</salt:listen>
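The binding step in this example can be imitated in plain JavaScript. This is a simplified sketch and not how a SALT browser is implemented: the recognition result is modelled as nested objects rather than an XML document, the value path is a bare slash-separated path instead of full XPath, and the test condition is a predicate function. The names selectNode and applyBind are hypothetical.

```javascript
// Simplified stand-in for the XML recognition result shown above.
const londonResult = {
  name: "result",
  text: "I'd like to go to London, please",
  attrs: { confidence: 0.45 },
  children: {
    dest_city: { text: "London", attrs: { confidence: 0.55 } },
  },
};

// Walks a slash-separated path such as "/result/dest_city".
// Returns the matching node, or null if any step is missing.
function selectNode(result, path) {
  const parts = path.split("/").filter(Boolean);
  let node = { children: { [result.name]: result } };
  for (const part of parts) {
    if (!node.children || !node.children[part]) return null;
    node = node.children[part];
  }
  return node;
}

// Copies the selected node's text into the page only if the test passes.
function applyBind(result, bind, page) {
  const node = selectNode(result, bind.value);
  if (node === null) return false;
  if (bind.test && !bind.test(result)) return false;
  page[bind.targetElement] = node.text;
  return true;
}

const page = {};
applyBind(
  londonResult,
  {
    targetElement: "txtBoxDestCity",
    value: "/result/dest_city",
    // Mirrors test="/result/dest_city[@confidence > 0.4]"
    test: (r) => r.children.dest_city.attrs.confidence > 0.4,
  },
  page
);
console.log(page.txtBoxDestCity);
```

Because the dest_city confidence (0.55) exceeds 0.4, the text box receives the city; with a stricter threshold the page would be left untouched.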

<record> Element


Used to specify audio recording
parameters
Results may be processed with bind or
scripted code
<prompt> Element


Used to specify system output
Content may include:





Text
Speech output markup
Variable values
Links to audio files
Mix of any of the above
<prompt> Element (cont’d)
Executed in 2 ways:

1) Declaratively on scriptless browser
2) By object methods in script
Contains methods to start, stop, pause and resume
prompt playback, and alter speed and volume
Handlers include events for user barge-in, prompt completion and internal ‘bookmarks’
<prompt> Element Example
<salt:prompt id="ConfirmTravel">
So you want to travel from
<salt:value targetElement="txtBoxOriginCity"
targetAttribute="value" />
to
<salt:value targetElement="txtBoxDestCity"
targetAttribute="value" />
?
</salt:prompt>
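What this prompt would say once the variable values are filled in can be sketched as follows. This is an illustrative model only: renderPrompt is a hypothetical helper, the prompt is represented as a list of literal strings and field references standing in for the value elements, and "Seattle" is an invented sample value.

```javascript
// Hypothetical helper: builds the spoken text of a prompt from literal
// strings and references to page fields (standing in for <salt:value>).
function renderPrompt(parts, page) {
  return parts
    .map((part) =>
      typeof part === "string"
        ? part
        : String(page[part.targetElement][part.targetAttribute])
    )
    .join(" ")
    // Remove the space that joining leaves before punctuation.
    .replace(/\s+([?.,!])/g, "$1");
}

const confirmTravel = [
  "So you want to travel from",
  { targetElement: "txtBoxOriginCity", targetAttribute: "value" },
  "to",
  { targetElement: "txtBoxDestCity", targetAttribute: "value" },
  "?",
];

// Sample page state (invented values).
const travelPage = {
  txtBoxOriginCity: { value: "Seattle" },
  txtBoxDestCity: { value: "London" },
};

console.log(renderPrompt(confirmTravel, travelPage));
// prints "So you want to travel from Seattle to London?"
```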
<dtmf> Element



Used to specify DTMF grammars in
telephony applications
Deals with keypress input and other
events
Executed declaratively or
programmatically with start and stop
commands
<dtmf> Element (cont’d)




Main elements include <grammar> and <bind>
Holds resources for configuring DTMF collection
process
Configured via attributes for timeouts
Handlers include events for keypresses, valid DTMF
sequences and out-of-grammar input
<dtmf> Element Example
<salt:dtmf id="dtmfPhoneNumber">
<salt:grammar
src="7digits.gram" />
<salt:bind value="/result/phoneNumber"
targetElement="iptPhoneNumber" />
</salt:dtmf>
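The three kinds of DTMF handler (keypress, valid sequence, out-of-grammar input) can be illustrated with a small stand-alone collector. This is a sketch under stated assumptions: createDtmfCollector and the handler names are hypothetical, and the rule "exactly seven digits" is only a guess at what a grammar such as 7digits.gram might accept.

```javascript
// Hypothetical DTMF collector, not the SALT API.
// Assumes the grammar accepts exactly seven digits (0-9).
function createDtmfCollector(handlers) {
  let digits = "";
  return {
    press(key) {
      handlers.onKeypress(key); // fired for every key
      if (!/^[0-9]$/.test(key)) {
        digits = ""; // discard the partial sequence
        handlers.onOutOfGrammar(key);
        return;
      }
      digits += key;
      if (digits.length === 7) {
        const complete = digits;
        digits = "";
        handlers.onValidSequence(complete);
      }
    },
  };
}

const dtmfLog = [];
const collector = createDtmfCollector({
  onKeypress: (key) => dtmfLog.push("key:" + key),
  onValidSequence: (seq) => dtmfLog.push("valid:" + seq),
  onOutOfGrammar: (key) => dtmfLog.push("out-of-grammar:" + key),
});

// Sample keypresses (invented phone number).
for (const key of "5551234") collector.press(key);
collector.press("*");
console.log(dtmfLog.join("\n"));
```

In the markup above, the bind element would play the role of onValidSequence by copying the collected number into iptPhoneNumber.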
Event writing

SALT elements contain methods,
properties and event handlers accessible
to script


Enable interaction with other events and
processes in Web page
Because SALT elements are XML objects in
DOM of page
Event writing (cont’d)

Top-level elements contain asynchronous
methods for initiation and completion of
execution

Contain properties


For configuration and result storing
Event handlers

For events associated with speech
Event writing (cont’d)

onReco


Event fired when recognition results
successfully returned
onBargein

Event fired on prompt element if user input
received during prompt playback
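The barge-in behaviour can be modelled with a small state machine. This is a hypothetical sketch, not the SALT object model: createPrompt, userInput, finish and isPlaying are invented names, and stopping playback when the user barges in is an assumption of this sketch rather than something stated on the slide.

```javascript
// Hypothetical prompt model: fires onBargein only when user input
// arrives while the prompt is playing.
function createPrompt(handlers) {
  let playing = false;
  return {
    start() {
      playing = true;
    },
    // Called when the user speaks or presses a key.
    userInput() {
      if (!playing) return false; // input outside playback is not barge-in
      playing = false; // assumption: barge-in stops playback
      handlers.onBargein();
      return true;
    },
    // Called when playback reaches the end on its own.
    finish() {
      if (playing) {
        playing = false;
        handlers.onComplete();
      }
    },
    isPlaying() {
      return playing;
    },
  };
}

const promptEvents = [];
const askCity = createPrompt({
  onBargein: () => promptEvents.push("bargein"),
  onComplete: () => promptEvents.push("complete"),
});
askCity.start();
askCity.userInput(); // user interrupts the prompt
console.log(promptEvents);
```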
Code Examples
<input name="txtBoxDestCity" type="text"
onclick="recoDestCity.Start()" />
<salt:listen id="recoDestCity">
<salt:grammar src="city.xml" />
<salt:bind targetElement="txtBoxDestCity"
value="/result/city" />
</salt:listen>
Code Examples (cont’d)
<input type="button" onclick="recoFromTo.Start()" value="Say From
and To Cities" />
<input name="txtBoxOriginCity" type="text" />
<input name="txtBoxDestCity" type="text" />
<salt:listen id="recoFromTo">
<salt:grammar src="FromToCity.xml" />
<salt:bind targetElement="txtBoxOriginCity"
value="/result/originCity" />
<salt:bind targetElement="txtBoxDestCity"
value="/result/destCity" />
</salt:listen>
<!-- HTML -->
<html xmlns:salt="urn:saltforum.org/schemas/020124">
<body onload="RunAsk()">
<form id="travelForm">
<input name="txtBoxOriginCity" type="text" />
<input name="txtBoxDestCity" type="text" />
</form>
<!-- Speech Application Language Tags -->
<salt:prompt id="askOriginCity"> Where would you like to leave from? </salt:prompt>
<salt:prompt id="askDestCity"> Where would you like to go to?
</salt:prompt>
<salt:prompt id="sayDidntUnderstand" onComplete="RunAsk()">
Sorry, I didn't understand.
</salt:prompt>
<salt:listen id="recoOriginCity" onReco="procOriginCity()" onNoReco="sayDidntUnderstand.Start()">
<salt:grammar src="city.xml" />
</salt:listen>
<salt:listen id="recoDestCity" onReco="procDestCity()" onNoReco="sayDidntUnderstand.Start()">
<salt:grammar src="city.xml" />
</salt:listen>
<!-- script -->
<script>
function RunAsk() {
if (travelForm.txtBoxOriginCity.value=="") {
askOriginCity.Start();
recoOriginCity.Start();
} else if (travelForm.txtBoxDestCity.value=="") {
askDestCity.Start();
recoDestCity.Start();
}
}
function procOriginCity() {
travelForm.txtBoxOriginCity.value = recoOriginCity.text;
RunAsk();
}
function procDestCity() {
travelForm.txtBoxDestCity.value = recoDestCity.text;
travelForm.submit();
}
</script>
</body>
</html>