A More Effective Way to Label Affective Expressions
Micah Eckhardt
Massachusetts Institute of Technology
20 Ames St, E15-446
Cambridge, MA 02139 USA
micahrye@mit.edu

Rosalind Picard
Massachusetts Institute of Technology
20 Ames St, E15-448
Cambridge, MA 02139 USA
picard@media.mit.edu
Abstract
Labeling videos for affective content such as facial expression is tedious and time consuming. Researchers often spend significant amounts of time annotating experimental data, or simply lack the time required to label their data. For these reasons we have developed VidL, an open-source video labeling system that harnesses the distributed people-power of the internet. Through centralized management, VidL can be used to manage data, apply custom labels to videos, manage workers, visualize labels, and review coders' work. As an example, we recently labeled 700 short videos, approximately 60 hours of work, in 2 days using 20 labelers working from their own computers.
1. Introduction
Video has become an invaluable tool for the exploration
of human affect and social interaction. Video allows for
complex interactions and dynamic situations to be captured
and reviewed for detailed analysis. Unfortunately, the detailed review and labeling of video that is necessary for
scientific investigation is time consuming. Typically, researchers recruit workers to come to their laboratory and
label videos. For a small number of short videos, or if there is no need to have many people label the data for cross-validation, this may be acceptable. With many videos and a need for cross-validation, however, this method becomes a major bottleneck.
There is a growing number of video annotation systems offering varying functionality. These systems range in cost from free to thousands of dollars. Popular free systems are VCode, Anvil, and the Continuous Measurement System (CMS) [3–5]. Anvil runs on Windows XP, Mac OS X 10.4, and GNU/Linux, while VCode runs only on Mac OS X 10.5 and CMS only on Windows XP.
All systems provide different levels of customization and
functionality and are intended to operate on the user’s local
machine. See [3–5] for more detail.
A significant difference between VidL and the others is that it is intended to be run from a web server. This model is operating-system independent and allows a distributed workforce of coders to label the data while an administrator easily manages all data and work-related tasks. VidL can be used to rapidly label videos by many people from any location.
2. VidL: An Online Video Annotation Tool
The VidL framework is intended for use on a web server, which acts as a central repository for the VidL application, video content, and user label data. This allows video coders to annotate video remotely, while a single administrator controls access and the application's appearance, manages workers, and checks label data remotely.
The VidL framework is written in Flex and PHP and uses FFmpeg [1] and MySQL [2] to create a complete set of tools for video annotation. The VidL framework is composed of two main parts: a back-end data and application management system and a front-end video labeling interface.
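As an illustration of how the PHP back end might drive FFmpeg for the video conversion described below, a conversion step could shell out to the ffmpeg binary. The sketch that follows is our assumption, not VidL's actual implementation; the file names are hypothetical.

<?php
// Illustrative sketch only: convert an AVI clip to FLV with FFmpeg.
// Shelling out via shell_exec() and the file names are assumptions;
// the paper does not show VidL's actual conversion code.
function convert_to_flv(string $src, string $dst): bool
{
    // -y overwrites an existing output file; -i names the input.
    $cmd = sprintf('ffmpeg -y -i %s %s 2>&1',
                   escapeshellarg($src), escapeshellarg($dst));
    shell_exec($cmd);
    return file_exists($dst);
}
convert_to_flv('session01.avi', 'session01.flv');
?>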
2.1. VidL Management System
The VidL management system (VMS) is written in PHP. VMS allows the administrator to segment videos and to convert AVI videos to FLV or MOV for use with VidL. Additionally, there are scripts to create new users, control which videos any specific user can label, and check which videos have been labeled by a user. All user label data is stored in plain text files that can be parsed and added to a MySQL database or otherwise stored and manipulated.
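The format of these plain text files is not specified in the paper; assuming one tab-separated record per label (video, coder, start time, end time, label text), a small PHP script along the following lines could load them into a MySQL table. The record layout, file name, table, and column names are all hypothetical.

<?php
// Illustrative sketch only: load a tab-separated VidL label file into
// MySQL. The record layout and table schema are assumptions; the paper
// says only that label data is stored as plain text.
$db = new mysqli('localhost', 'vidl', 'secret', 'vidl_labels');
$stmt = $db->prepare(
    'INSERT INTO labels (video, coder, start_s, end_s, label)
     VALUES (?, ?, ?, ?, ?)');
$fh = fopen('coder01_session01.txt', 'r');
while (($row = fgetcsv($fh, 0, "\t")) !== false) {
    [$video, $coder, $start, $end, $label] = $row;
    $stmt->bind_param('ssdds', $video, $coder, $start, $end, $label);
    $stmt->execute();
}
fclose($fh);
$db->close();
?>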
Application functionality and video meta-labels are controlled by the VidL default configuration XML file. This default configuration file establishes the underlying directory structure and the names of the files VidL requires. It also controls the layout of the labels to be used when labeling videos. Additionally, each video can have a unique configuration file associated with it, enabling each video to have a custom label set.
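The configuration schema itself is not reproduced in the paper. Purely as an illustration, a per-video configuration pairing a custom label set with a labeling mode (Sec. 2.2) might look something like the following, where every element and attribute name is hypothetical:

<!-- Illustrative sketch only; the real VidL schema is not given. -->
<vidl>
  <video src="session01.flv"/>
  <!-- one of the modes described in Sec. 2.2 -->
  <mode>confidence</mode>
  <labels>
    <label name="smile" color="#2e8b57"/>
    <label name="frown" color="#b22222"/>
    <label name="head nod" color="#4169e1"/>
  </labels>
</vidl>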
2.2. VidL Labeling Interface
The VidL labeling interface is designed to allow efficient labeling and visualization of placed labels (Fig. 1). The interface includes standard video controls in the menu bar and below the video. Next to the video display is the label button bar, where all labels that can be associated with the current movie are presented. Labels can be added to and removed from the button bar via the VidL configuration file.
VidL has several modes, including compare, confidence, meta-text, and single label. Compare mode allows the user to compare the labels for a particular video across multiple users (Fig. 2). Confidence mode requires the user to associate a level of confidence with each label. Meta-text mode allows the user to attach additional text to a label, perhaps explaining why a particular label was selected. Single label mode is used to force a single label for a video.
VidL also has several visualization modes. When reviewing labels, the user is presented with both the video-timeline label bar and active-label indicators on the buttons. When a label is encountered along the timeline, the colored circle next to the corresponding label button expands, highlighting the current label. The button color also matches that label's tick marks in the video-timeline label bar (Fig. 1). Additionally, there is a bar-view mode that can be used for visualizing label durations (Fig. 2), and there are visualization charts (Fig. 3).
Figure 1. VidL in standard label mode.
Figure 2. VidL compare mode with bar view selected for label visualization.
Figure 3. VidL label data visualization.
3. VidL Demonstration
The VidL framework will be explained, including setting up the VidL system, creating new users, creating unique labels, manipulating videos, and managing worker labels. ACII attendees will have the opportunity to use the application, ask questions, and provide feedback and suggestions. Additional information can be found at http://vidl.media.mit.edu/
4. Acknowledgements
We are grateful to Rana el Kaliouby, Matthew Goodwin, and Mish Madsen for their helpful suggestions while creating this tool. This material is based upon work supported by the National Science Foundation (NSF) under Grant No. 0555411. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of NSF.
References
[1] http://ffmpeg.org/.
[2] http://www.mysql.com/.
[3] J. Hagedorn, J. Hailpern, and K. Karahalios. VCode and VData: illustrating a new framework for supporting the video annotation workflow. In Proceedings of the Working Conference on Advanced Visual Interfaces, pages 317–321. ACM, New York, NY, USA, 2008.
[4] M. Kipp. Anvil: a generic annotation tool for multimodal dialogue. In Seventh European Conference on Speech Communication and Technology. ISCA, 2001.
[5] D. Messinger, M. Mahoor, S. Chow, and J. Cohn. Automated measurement of facial expression in infant-mother interaction: A pilot study. Infancy, 14, 2009.