Didier Perroud
Raynald Seydoux
Frédéric Barras

Abstract

Objectives

Modalities
◦ Project modalities
◦ CASE/CARE

Implementation
◦ VICI, iPhone, Voice recognition, Network

Demonstration

Conclusion

Two players coordinate to move a ball through a labyrinth

The board can be rotated about the x and y axes

Gates can be opened with voice and gesture commands

Coordinate the following technologies:
◦ Augmented reality with tags
◦ Gesture detection (with iPhone accelerometers)
◦ Voice recognition (words)
◦ Collaborative environments
◦ Physics engine

Inputs
◦ Hand rotation about the x and y axes (one axis per player): direct manipulation of the labyrinth board
◦ Hand pumping to open gates
◦ Voice recognition (words) to select the gate to open and to start the game
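
To illustrate how the two hand rotations could drive the ball, here is a minimal sketch. It assumes a frictionless ball, treats the two tilt axes independently, and uses a simple Euler integration step. The function names, sign conventions, and units are hypothetical and are not taken from the project's physics engine.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def ball_acceleration(tilt_x_deg, tilt_y_deg):
    """Return (ax, ay) in m/s^2 for a board tilted about the x and y axes.

    Assumption: tilting about the x axis moves the ball along y, and
    tilting about the y axis moves it along x (signs are arbitrary here).
    """
    ax = G * math.sin(math.radians(tilt_y_deg))
    ay = G * math.sin(math.radians(tilt_x_deg))
    return ax, ay


def step(pos, vel, tilt_x_deg, tilt_y_deg, dt):
    """Advance the ball by one time step of dt seconds (Euler integration)."""
    ax, ay = ball_acceleration(tilt_x_deg, tilt_y_deg)
    vx = vel[0] + ax * dt
    vy = vel[1] + ay * dt
    return (pos[0] + vx * dt, pos[1] + vy * dt), (vx, vy)
```

Each player contributes one of the two tilt angles, so the ball only follows a useful path when both players coordinate their rotations.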

Outputs
◦ Image on the projector
◦ iPhone vibrations

CASE
◦ Semantic level of abstraction

CARE
◦ Gesture orientation: assignment
◦ Gesture pumping + voice selection: complementarity (both are needed to open a gate)
◦ Voice commands: assignment


Decision-level fusion
Fission: image, vibration
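
As an illustration of decision-level fusion for the complementary gate command, here is a minimal sketch. It assumes the voice selection arrives before the pumping gesture and that both must fall within a fixed time window. The class name, event fields, window length, and output string are hypothetical, not the project's actual protocol.

```python
from dataclasses import dataclass
from typing import Optional

FUSION_WINDOW_S = 2.0  # assumed maximum delay between voice and gesture


@dataclass
class Event:
    modality: str     # "voice" or "gesture"
    value: str        # e.g. "open gate one" or "pump"
    timestamp: float  # seconds


class GateFusion:
    """Fuse a voice selection and a pumping gesture into one command."""

    def __init__(self, window: float = FUSION_WINDOW_S) -> None:
        self.window = window
        self.pending_gate: Optional[str] = None
        self.pending_time = 0.0

    def feed(self, event: Event) -> Optional[str]:
        """Return a fused command, or None if nothing can be fused yet."""
        prefix = "open gate "
        if event.modality == "voice" and event.value.startswith(prefix):
            # Remember which gate was selected and when.
            self.pending_gate = event.value[len(prefix):]
            self.pending_time = event.timestamp
            return None
        if event.modality == "gesture" and event.value == "pump":
            gate = self.pending_gate
            self.pending_gate = None  # a pump always consumes the selection
            if gate is not None and event.timestamp - self.pending_time <= self.window:
                return f"OPEN_GATE:{gate}"
        return None
```

Feeding `Event("voice", "open gate one", 0.0)` followed by `Event("gesture", "pump", 1.0)` yields `"OPEN_GATE:one"`, while a pump with no recent selection yields nothing.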

Augmented reality application

Blocks
◦ Webcam, Tag detection
◦ OpenGL, Physics engine

Multimodality Management
◦ State machine
◦ Event based

Messages from the gateway
◦ Voice events
◦ Gesture events (orientation X and Y, shake)

Messages to the gateway
◦ Vibration events
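
The state-machine, event-based management listed above could look roughly like the following sketch. The states, the transition table, and the choice to make "pause" toggle are assumptions made for illustration. Only the command words come from the voice grammar used in this project.

```python
# Hypothetical transition table: (current state, event) -> next state.
TRANSITIONS = {
    ("idle", "new game"): "playing",
    ("playing", "pause"): "paused",
    ("paused", "pause"): "playing",   # assumed: "pause" toggles
    ("paused", "new game"): "playing",
    ("playing", "exit"): "idle",
    ("paused", "exit"): "idle",
}


class GameStateMachine:
    """Event-driven state machine fed by recognized voice commands."""

    def __init__(self) -> None:
        self.state = "idle"

    def handle(self, event: str) -> str:
        """Apply one event; unknown events leave the state unchanged."""
        key = (self.state, event.lower())
        self.state = TRANSITIONS.get(key, self.state)
        return self.state
```

Events that have no entry for the current state are ignored, so a misrecognized word cannot push the game into an undefined state.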



Handle the UIAccelerometer interface
Generate motionEvent when shaking
Messages to the gateway
◦ Orientations (X or Y)
◦ Shake

Messages from the gateway
◦ Vibrate
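
The orientation and shake detection can be illustrated with a language-neutral sketch in Python (the real client uses the iPhone SDK). It assumes accelerometer samples in units of g, normalized so that a device lying flat reports (0, 0, 1). The threshold value and the function names are hypothetical.

```python
import math

SHAKE_THRESHOLD_G = 2.0  # assumed: total acceleration above this is a shake


def tilt_angles(ax, ay, az):
    """Return (tilt_x, tilt_y) in degrees from one accelerometer sample."""
    tilt_x = math.degrees(math.atan2(ay, az))  # rotation about the x axis
    tilt_y = math.degrees(math.atan2(ax, az))  # rotation about the y axis
    return tilt_x, tilt_y


def is_shake(ax, ay, az, threshold=SHAKE_THRESHOLD_G):
    """Report a shake when the acceleration magnitude exceeds the threshold."""
    return math.sqrt(ax * ax + ay * ay + az * az) > threshold
```

Each player's device would send only one of the two angles to the gateway, plus a shake message whenever `is_shake` fires.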

Windows speech API

SDK Features:
◦ API definition files
◦ Runtime component
◦ Control Panel applet
◦ Text-To-Speech engines in multiple languages
◦ Speech Recognition engines in multiple languages
◦ Redistributable components
◦ Sample application code
◦ Sample engines
◦ Documentation

Our System


A speech recognition engine
A grammar
<grammar xmlns="http://www.w3.org/2001/06/grammar"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.w3.org/2001/06/grammar
http://www.w3.org/TR/speech-grammar/grammar.xsd"
xml:lang="en-EN" version="1.0">
<rule id="Labyrinth" scope="public">
<one-of>
<item>New game</item>
<item>Pause</item>
<item>Exit</item>
<item>Open gate one</item>
<item>Open gate two</item>
<item>Close gate one</item>
<item>Close gate two</item>
</one-of>
</rule>
</grammar>
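
To show how an application might read the accepted phrases out of such a grammar, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It is not part of the Windows speech API. The helper name `load_phrases` is hypothetical, and the embedded grammar is a copy of the one above without the schema-location attributes.

```python
import xml.etree.ElementTree as ET

SRGS_NS = {"g": "http://www.w3.org/2001/06/grammar"}

GRAMMAR_XML = """<grammar xmlns="http://www.w3.org/2001/06/grammar"
xml:lang="en-EN" version="1.0">
<rule id="Labyrinth" scope="public">
<one-of>
<item>New game</item>
<item>Pause</item>
<item>Exit</item>
<item>Open gate one</item>
<item>Open gate two</item>
<item>Close gate one</item>
<item>Close gate two</item>
</one-of>
</rule>
</grammar>"""


def load_phrases(xml_text, rule_id):
    """Return the phrases listed under <one-of> in the given rule."""
    root = ET.fromstring(xml_text)
    rule = root.find(f"g:rule[@id='{rule_id}']", SRGS_NS)
    if rule is None:
        raise ValueError(f"rule {rule_id!r} not found")
    items = rule.findall("g:one-of/g:item", SRGS_NS)
    return [item.text.strip() for item in items if item.text]


phrases = load_phrases(GRAMMAR_XML, "Labyrinth")
```

The resulting list can be used to check that every phrase the recognizer may return has a handler in the game logic.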

Recognition comparison before and after training


Live
Videos

Problems with the physics engine
◦ Coordinating user movements with physics-engine movements

Voice recognition OK

High-level programming
Heterogeneity not a problem

Functional prototype


Thank you