W3C Web Technology Day
Signal Processing for Multimodal Web
Irek Defée
Department of Signal Processing
Tampere University of Technology

Current status
• The Web was developed for traditional data and computer I/O: text, keyboard, mouse
• This is simple and effective, but it is not a natural way for humans to interact with the world
• Humans interact through their perceptual system

Human Perceptual System
• The human perceptual system has multiple senses: visual, acoustic, haptic (touch, body position, temperature) and actuators (vocal tract, muscles, motor system)
• The perceptual system is intrinsically MULTIMODAL: multiple senses and actuators operate in a perfectly coordinated way

Perceptual Information Technology
• Information technology is evolving towards natural MULTIMODAL human interaction:
  - Touch gestures revolutionized mobile devices
  - Intelligent speech input is available
  - There is more to come: new sensors, cameras and intelligence

Signal Processing Role
• Perceptual Information Technology requires sophisticated signal processing, which is hard due to:
  - Complex input signals
  - Complex information encoding
  - Complex databases of knowledge
• Highly sophisticated algorithms and huge processing power are required

Multimodal Web
• The trend towards perceptual information is noted at the W3C: extending the Web to allow multiple modes of interaction: GUI, Speech, Vision, Pen, Gestures, Haptic interfaces, ...
• Multimodal Interaction Activity:
  - Multimodal Architecture and Interfaces
  - EMMA
  - InkML
  - EmotionML

Multimodal Architecture

EMMA
• Extensible MultiModal Annotation markup language
  - Contains and annotates the interpretation of user input
  - Example: transcription into words of a raw signal, for instance one derived from speech or pen input
  - Interpretations are generated by signal interpretation processes, such as speech and ink recognition and semantic interpreters

Ink Markup Language
• Data format for representing ink
• Input and processing of handwriting, gestures, sketches and music using pen traces
• Trace attributes

Emotion Markup Language
• Annotation of material involving emotionality
• Automatic recognition of emotions from sensors
• Generation of emotion-related system responses: speech, music, colors, gestures, synthetic faces
• Emotion vocabularies and representations:
    <emotion category-set="http://www.w3.org/TR/emotion-voc/xml#big6">
      <category name="surprise" confidence="0.9"/>
    </emotion>

Department of Signal Processing
• Signal processing has a key role as a front end for the Multimodal Web
• The Department is at the forefront of research in natural information processing:
  - Multimedia information analysis, retrieval and databases
  - Audio information analysis: speech and music
  - Media information handling: representation and compression

Thank you for your attention!