MediaHub: An Intelligent MultiMedia Distributed Platform Hub
Glenn Campbell, Tom Lunney, Paul Mc Kevitt
School of Computing and Intelligent Systems
Faculty of Engineering
University of Ulster, Magee Campus
Northland Road, Derry
{Campbell-g8, TF.Lunney, P.McKevitt}@ulster.ac.uk
PGNET, Liverpool JMU, June 2005
Outline
  Goals and objectives
  Key research problems
  Distributed Processing
  Distributed Platforms
  Architecture of MediaHub
  Tools and future development
Project goals
The primary objectives of this research are to:
  Interpret/generate semantic representations of multimodal input/output
  Perform decision-making (fusion and synchronisation) over multimodal data
  Implement MediaHub, a multimodal platform hub
Project objectives
Focus on the following research questions:
  Will MediaHub use frames for semantic representation, or XML or one of its derivatives?
  How will MediaHub communicate with the various elements of a platform?
  Will MediaHub constitute a blackboard or non-blackboard model?
  What mechanism will be implemented for decision-making within MediaHub?
Key research problems
  Semantic Representation
    Represent language and vision
    Frames or XML?
  Semantic Storage
    Blackboard model?
    Non-blackboard model?
  Decision-making
    Fusion and synchronisation
    AI technique
Semantic representation

Frames (CHAMELEON)
[MODULE
 INPUT: input
 INTENTION: intention-type
 TIME: timestamp]
[SPEECH-RECOGNISER
 UTTERANCE: (Point to Hanne’s office)
 INTENTION: instruction!
 TIME: timestamp]
[GESTURE
 GESTURE: coordinates (3, 2)
 INTENTION: pointing
 TIME: timestamp]
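As a rough illustration of the frame notation above, the sketch below models a frame as a module name plus a set of named slots and prints it back in the bracketed layout. The `Frame` class, its `render` method and the numeric timestamps are hypothetical names and values chosen for this example; they are not part of CHAMELEON.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One CHAMELEON-style frame: a module name plus named slots (hypothetical class)."""
    module: str
    slots: dict = field(default_factory=dict)

    def render(self) -> str:
        # Reproduce the bracketed layout used in the frames above.
        lines = [f"[{self.module}"]
        lines += [f" {name}: {value}" for name, value in self.slots.items()]
        return "\n".join(lines) + "]"


# Timestamps are placeholder values assumed for the example.
speech = Frame("SPEECH-RECOGNISER", {
    "UTTERANCE": "(Point to Hanne's office)",
    "INTENTION": "instruction!",
    "TIME": 1001,
})
gesture = Frame("GESTURE", {
    "GESTURE": "coordinates (3, 2)",
    "INTENTION": "pointing",
    "TIME": 1002,
})

print(speech.render())
print(gesture.render())
```

Keeping the slots in a plain dictionary means a new module can add its own slot names without changing the class.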
XML (M3L, SmartKom)
<presentationTask>
<presentationGoal>
<inform> <informFocus> <RealizationType>list </RealizationType>
</informFocus> </inform>
<abstractPresentationContent>
<discourseTopic> <goal>epg_browse</goal> </discourseTopic>
<informationSearch id="dim24"><tvProgram id="dim23">
<broadcast><timeDeictic id="dim16">now</timeDeictic>
<between>2003-03-20T19:42:32 2003-03-20T22:00:00</between>
<channel><channel id="dim13"/> </channel>
</broadcast></tvProgram>
</informationSearch>
<result> <event>
<pieceOfInformation>
<tvProgram id="ap_3">
<broadcast> <beginTime>2003-03-20T19:50:00</beginTime>
<endTime>2003-03-20T19:55:00</endTime>
<avMedium> <title>Today’s Stock News</title></avMedium>
<channel>ARD</channel>
</broadcast>……..
</event> </result>
</presentationGoal> </presentationTask>
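One practical attraction of an XML representation is that a standard parser can read it directly. The sketch below uses Python's `xml.etree.ElementTree` on a trimmed, well-formed fragment that reuses element names from the M3L excerpt above (the excerpt itself is elided, so it is not parsed as-is). The function name `summarise_broadcast` and the shape of the returned dictionary are assumptions made for this example, not part of M3L or SmartKom.

```python
import xml.etree.ElementTree as ET

# A trimmed, well-formed fragment reusing element names from the excerpt above.
M3L_FRAGMENT = """
<tvProgram id="ap_3">
  <broadcast>
    <beginTime>2003-03-20T19:50:00</beginTime>
    <endTime>2003-03-20T19:55:00</endTime>
    <avMedium><title>Today's Stock News</title></avMedium>
    <channel>ARD</channel>
  </broadcast>
</tvProgram>
"""


def summarise_broadcast(xml_text: str) -> dict:
    """Pull the main fields of one tvProgram element into a flat dictionary."""
    root = ET.fromstring(xml_text.strip())
    broadcast = root.find("broadcast")
    return {
        "id": root.get("id"),
        "title": broadcast.findtext("avMedium/title"),
        "channel": broadcast.findtext("channel"),
        "begin": broadcast.findtext("beginTime"),
        "end": broadcast.findtext("endTime"),
    }


summary = summarise_broadcast(M3L_FRAGMENT)
print(summary)
```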
Semantic storage
  Blackboard or Non-blackboard?
    High coupling – Blackboard?
    Low coupling – distributed architecture?
  Communication
    Via central blackboard?
    Message passing between modules?
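The two communication options can be contrasted in a few lines. The sketch below is a minimal illustration only: `Blackboard`, `Module`, `post`, `read` and `send` are hypothetical names chosen for this example and do not correspond to any of the platforms discussed.

```python
from collections import defaultdict


class Blackboard:
    """Central store: modules post entries and read what others have posted."""

    def __init__(self):
        self._entries = defaultdict(list)

    def post(self, topic, entry):
        self._entries[topic].append(entry)

    def read(self, topic):
        # Return a copy so readers cannot modify the stored entries.
        return list(self._entries[topic])


class Module:
    """Non-blackboard alternative: a module sends messages straight to a peer."""

    def __init__(self, name):
        self.name = name
        self.inbox = []

    def send(self, receiver, message):
        receiver.inbox.append((self.name, message))


# Blackboard style: the writer addresses a topic, not a particular reader.
board = Blackboard()
board.post("gesture", {"intention": "pointing", "coordinates": (3, 2)})
print(board.read("gesture"))

# Message-passing style: the sender addresses a specific receiver.
gesture_module = Module("gesture")
fusion_module = Module("fusion")
gesture_module.send(fusion_module, {"intention": "pointing"})
print(fusion_module.inbox)
```

In the first style every module depends on the shared store; in the second, each module depends on the peers it addresses.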
Decision-making (fusion & synchronisation)
  Rule-based
  Potential for other AI techniques
    Fuzzy Logic
    Neural Networks
    Genetic Algorithms
    Bayesian Networks (CPNs)
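To make the rule-based option concrete, the sketch below fuses a speech input with a gesture input when a single hand-written rule fires: the speech carries an instruction, the gesture is a pointing gesture, and the two timestamps fall within a small window. The function `fuse`, the dictionary keys, the `max_gap` threshold and the timestamp values are all assumptions made for this example; it is not MediaHub's decision-making mechanism.

```python
def fuse(speech, gesture, max_gap=2):
    """Rule: pair an instruction with a pointing gesture if they are close in time."""
    close_in_time = abs(speech["time"] - gesture["time"]) <= max_gap
    if (speech["intention"] == "instruction"
            and gesture["intention"] == "pointing"
            and close_in_time):
        return {
            "intention": "point-to",
            "utterance": speech["utterance"],
            "target": gesture["coordinates"],
            "time": max(speech["time"], gesture["time"]),
        }
    return None  # No rule fired, so the inputs stay unfused.


# Example inputs; time values are arbitrary units assumed for illustration.
speech = {"utterance": "Point to Hanne's office",
          "intention": "instruction", "time": 10}
gesture = {"coordinates": (3, 2), "intention": "pointing", "time": 11}
late_gesture = {"coordinates": (5, 1), "intention": "pointing", "time": 20}

fused = fuse(speech, gesture)
unfused = fuse(speech, late_gesture)
print(fused)
print(unfused)
```

The time window is what provides synchronisation here: a gesture that arrives too long after the utterance is not attached to it.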
Distributed processing
  PVM (Parallel Virtual Machine) (Sunderam 1990, Fink et al. 1995)
  ICE (Amtrup 1995)
  DACS (Fink et al. 1995, 1996)
  Open Agent Architecture (OAA) (Cheyer et al. 1998, OAA 2004)
  JATLite (Kristensen 2001, Jeon et al. 2000)
  JavaSpaces (Freeman 2004)
  CORBA (Vinoski 1993)
Intelligent Multimedia Distributed Platforms
Blackboard Model:
  Ymir (Thórisson 1999)
  CHAMELEON (Brøndsted et al. 1998, 2001)
  SmartKom (Bühler et al. 2002, Wahlster et al. 2001, SmartKom 2004)
  DARBS (Nolle et al. 2001)
  DARPA Galaxy Communicator (Bayer et al. 2001)
  Psyclone (Psyclone 2004)
  Spoken Image/SONAS (Ó Nualláin et al. 1994, Ó Nualláin & Smith 1994, Kelleher et al. 2000)
Intelligent Multimedia Distributed Platforms
Non-blackboard Model:
  WAXHOLM (Carlson et al. 1996)
  AESOPWORLD (Okada 1996)
  COLLAGEN (Rich et al. 1997)
  INTERACT (Waibel et al. 1996)
  Oxygen (Oxygen 2004)
  EMBASSI (Kirste 2001, EMBASSI 2004)
  MIAMM (MIAMM 2004)
CHAMELEON
  Language & vision integration system
    Consists of ten modules, mostly programmed in C and C++
    DACS communication system used for communication
    Blackboard stores semantic representations produced by other modules
    Modules communicate by exchanging semantic representations with each other or with the blackboard
    Semantic representation in the form of input, output and integration frames
Architecture of CHAMELEON
SmartKom
  User adaptive interface for human-computer interaction
    Mobile
    Public
    Home/Office
  Facilitates speech, gestures and facial expression input
  XML-based mark-up language, M3L, used for semantic representation
  Distributed multiple blackboard model
Architecture of SmartKom
Project proposal
  Dialogue Manager
    Acts as a blackboard module
    Facilitates communication between other modules
    Synchronisation
  Semantic Representation Database
    Provides semantic representation of language and vision data
  Decision Making Module
    AI technique for a unique form of decision-making
      Bayesian Networks (CPNs)
      Neural Networks, Genetic Algorithms, Fuzzy Logic
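As a hedged illustration of how a Bayesian approach could weigh multimodal evidence, the sketch below applies Bayes' rule to two candidate user intentions given a gesture observation and a speech observation. It uses the simplest possible network structure (one hidden intention with conditionally independent observations, i.e. naive Bayes), not a full Bayesian network and not the HUGIN tool. The function `posterior`, the intention and observation names, and every probability are invented for this example.

```python
def posterior(prior, likelihoods, evidence):
    """Bayes' rule over a discrete hypothesis, assuming independent observations."""
    scores = {}
    for hypothesis, p in prior.items():
        # Multiply the prior by the likelihood of each observation.
        for observation in evidence:
            p *= likelihoods[hypothesis][observation]
        scores[hypothesis] = p
    total = sum(scores.values())
    # Normalise so that the beliefs sum to one.
    return {h: s / total for h, s in scores.items()}


# All probabilities below are made-up illustrative values.
prior = {"select_object": 0.5, "ask_question": 0.5}
likelihoods = {
    "select_object": {"pointing_gesture": 0.8, "word_that": 0.6},
    "ask_question": {"pointing_gesture": 0.2, "word_that": 0.3},
}

belief = posterior(prior, likelihoods, ["pointing_gesture", "word_that"])
best = max(belief, key=belief.get)
print(belief, best)
```

With these invented numbers, the gesture and the spoken word both favour the same intention, so the combined belief in it is higher than either observation would give alone.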
Architecture of MediaHub
Comparison of Intelligent MultiMedia Platforms
Software Analysis
  Main Programming Language
    Java
    C++
  Semantic Representation
    XML
    XHTML + Voice
    SMIL
    RDF Schema
    MPEG-7
  Decision Making
    HUGIN (Bayesian Networks) (Hugin 2004)
    FuzzyJ Toolkit (Fuzzy Logic) (NRC 2004)
Project Schedule
Conclusion
  An intelligent multimodal distributed platform hub called MediaHub will be developed
  MediaHub will interpret and generate semantic representations of multimodal input and output
  MediaHub will perform fusion and synchronisation of language and vision data
  The unique contribution of MediaHub is to provide a new method of decision-making
  MediaHub will be tested within an existing multimodal platform (e.g. CONFUCIUS)