18th IAA Humans in Space Symposium (2011)

Development of Optical Computer Recognition (OCR) for Monitoring Stress and Emotions in Space

N. Michael,1 F. Yang,1 D. Metaxas,1 and D.F. Dinges2
1 Center for Computational Biomedicine Imaging and Modeling, Rutgers University, New Brunswick, NJ, USA; 2 Unit for Experimental Psychiatry, Department of Psychiatry, University of Pennsylvania School of Medicine, Philadelphia, PA, USA

INTRODUCTION. While in space, astronauts are required to perform mission-critical tasks on very expensive equipment at a high level of functional capability. Stressors can compromise their ability to do so; it is therefore important to have a system that can unobtrusively and objectively detect neurobehavioral problems involving elevated levels of behavioral stress and negative emotions. Computerized approaches using inexpensive cameras offer an unobtrusive way to detect distress and to monitor observable emotions of astronauts during critical operations in space by tracking and analyzing facial expressions and body gestures in video streams. Such systems have applications beyond space flight, e.g., surveillance, law enforcement, and human-computer interaction.

TECHNOLOGY DEVELOPMENT. We developed a framework [1-9] capable of real-time tracking of faces and of skin blobs corresponding to the head and hands. Face tracking uses a group of deformable statistical models of facial shape variation and local texture distribution to robustly track facial landmarks (e.g., eyes, eyebrows, nose, mouth). The model tolerates partial occlusions, automatically detects and recovers from lost track, and handles head rotations up to full profile view. The skin blob tracker is initialized with a generic skin color model and dynamically learns the subject-specific color distribution online for adaptive tracking. Detected facial landmarks and blobs are filtered online, in terms of both shape and motion, using eigenspace analysis and temporal dynamical models to prune false detections. We then extract geometric and appearance features to learn models that detect relevant gestures and facial expressions. In particular, our method uses the relative intensity ordering of facial expression phases (i.e., neutral, onset, apex, offset) found in the training set to learn a ranking model (RankBoost) for expression recognition and intensity estimation, which improves our average recognition rate (~87.5% on the CMU benchmark database [4,10]). In relation to stress detection, we piloted an experiment to learn subject-specific models of deception detection, using behavioral cues to discriminate stressed from relaxed behaviors. We video recorded 147 subjects in 12-question interviews following a mock crime scenario, tracking their facial expressions and body gestures with our algorithm. Using leave-one-out cross validation, we trained separate Nearest Neighbor models per subject that discriminate deceptive from truthful responses with an average accuracy of 81.6% [7,9]. We are currently experimenting with structured sparsity [14] and super-resolution [11-13] techniques to obtain better-quality image features and improve tracking and recognition accuracy. Validation of detection of emotional states is underway (see abstract by Moreta et al.).
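The sketches below illustrate the main processing steps described above; they are minimal illustrative examples, not the framework's actual implementation. The first shows adaptive skin blob tracking of the kind outlined for the head and hands: a generic hue/saturation skin histogram is backprojected onto each frame, the largest skin-colored blob is taken as the track, and the histogram is then re-estimated from that blob so the color model adapts to the subject online. All library calls (OpenCV/NumPy), bin choices, and thresholds are assumptions.

import cv2
import numpy as np

H_BINS, S_BINS = 30, 32
ALPHA = 0.05                       # online learning rate for the color model (assumed)

def generic_skin_histogram():
    """Broad prior over hue/saturation typical of skin tones (an assumption)."""
    hist = np.zeros((H_BINS, S_BINS), np.float32)
    hist[0:4, 8:28] = 255.0        # low hues (reds/oranges), mid-to-high saturation
    hist[27:30, 8:28] = 255.0      # hue wrap-around
    return hist

def track_skin_blob(frame_bgr, hist):
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    # Per-pixel skin likelihood under the current color model (0..255).
    prob = cv2.calcBackProject([hsv], [0, 1], hist, [0, 180, 0, 256], scale=1)
    _, mask = cv2.threshold(prob, 50, 255, cv2.THRESH_BINARY)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    if n <= 1:
        return None, hist          # lost track; keep the previous color model
    blob = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))   # largest skin blob
    blob_mask = np.uint8(labels == blob) * 255
    # Re-estimate the color distribution inside the tracked blob and blend it
    # into the running model, so the tracker adapts to the subject online.
    new_hist = cv2.calcHist([hsv], [0, 1], blob_mask, [H_BINS, S_BINS], [0, 180, 0, 256])
    cv2.normalize(new_hist, new_hist, 0, 255, cv2.NORM_MINMAX)
    hist = ((1 - ALPHA) * hist + ALPHA * new_hist).astype(np.float32)
    x, y, w, h = (stats[blob, k] for k in (cv2.CC_STAT_LEFT, cv2.CC_STAT_TOP,
                                           cv2.CC_STAT_WIDTH, cv2.CC_STAT_HEIGHT))
    return (x, y, w, h), hist

The returned histogram is fed back in on the next frame, so the color model keeps adapting as lighting and skin appearance vary.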
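The next sketch illustrates the eigenspace and temporal filtering step used to prune false landmark detections: candidate shapes are projected onto a PCA subspace learned from training shapes, and detections with implausible shape parameters, large out-of-subspace residuals, or large jumps from a motion prediction are rejected. The constant-velocity motion model and all thresholds are illustrative assumptions, not the exact dynamical model used in the framework.

import numpy as np

class ShapeFilter:
    def __init__(self, train_shapes, n_components=10, param_sigma=3.0,
                 recon_tol=4.0, motion_tol=20.0):
        X = np.asarray(train_shapes, dtype=float)        # (N, 2*K) stacked x,y coords
        self.mean = X.mean(axis=0)
        U, S, Vt = np.linalg.svd(X - self.mean, full_matrices=False)
        self.basis = Vt[:n_components]                   # principal shape modes
        self.scale = S[:n_components] / np.sqrt(len(X))  # per-mode standard deviation
        self.param_sigma, self.recon_tol, self.motion_tol = param_sigma, recon_tol, motion_tol
        self.prev, self.vel = None, 0.0

    def accept(self, shape):
        s = np.asarray(shape, dtype=float)
        b = self.basis @ (s - self.mean)                 # shape parameters
        recon = self.mean + self.basis.T @ b
        # Shape test: parameters within +/- param_sigma standard deviations,
        # and a small residual outside the learned subspace.
        if np.any(np.abs(b) > self.param_sigma * self.scale):
            return False
        if np.linalg.norm(s - recon) / np.sqrt(len(s)) > self.recon_tol:
            return False
        # Motion test: gate against a simple constant-velocity prediction.
        if self.prev is not None:
            pred = self.prev + self.vel
            if np.linalg.norm(s - pred) / np.sqrt(len(s)) > self.motion_tol:
                return False
            self.vel = s - self.prev
        self.prev = s
        return True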
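The ranking step can be illustrated with a minimal RankBoost-style sketch: frames within each training sequence are ordered by expression intensity (neutral < onset < apex), and threshold weak rankers on single features are boosted so that the learned score respects those pairwise orderings. This follows the generic RankBoost recipe rather than the regularized variant of [4]; the feature representation and round count are assumptions.

import numpy as np

def rankboost(X, pairs, n_rounds=50):
    """X: (N, d) per-frame features; pairs: list of (lo, hi) index pairs meaning
    frame `hi` has higher expression intensity than frame `lo`."""
    D = np.full(len(pairs), 1.0 / len(pairs))             # weights over ordered pairs
    lo = np.array([p[0] for p in pairs])
    hi = np.array([p[1] for p in pairs])
    model = []                                            # (feature, threshold, alpha)
    for _ in range(n_rounds):
        best = None
        for f in range(X.shape[1]):                       # exhaustive weak-ranker search
            for thr in np.unique(X[:, f]):
                h = (X[:, f] > thr).astype(float)         # weak ranker in {0, 1}
                r = np.sum(D * (h[hi] - h[lo]))           # weighted pairwise agreement
                if best is None or abs(r) > abs(best[3]):
                    best = (f, thr, h, r)
        f, thr, h, r = best
        r = np.clip(r, -0.999, 0.999)
        alpha = 0.5 * np.log((1 + r) / (1 - r))
        model.append((f, thr, alpha))
        # Increase the weight of pairs the current ranker still mis-orders.
        D *= np.exp(alpha * (h[lo] - h[hi]))
        D /= D.sum()
    return model

def intensity_score(model, x):
    """Higher score = higher predicted expression intensity for frame features x."""
    return sum(alpha * float(x[f] > thr) for f, thr, alpha in model)

Sorting a sequence's frames by intensity_score recovers the neutral-onset-apex-offset ordering, and the score itself serves as an intensity estimate.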
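Finally, the per-subject leave-one-out evaluation of the deception pilot can be sketched as follows: each of a subject's responses is classified as deceptive or truthful by a 1-nearest-neighbor rule over that subject's remaining responses, and accuracies are averaged across subjects. The feature vectors (e.g., motion-profile statistics) and the Euclidean metric are assumptions for illustration.

import numpy as np

def loo_accuracy_per_subject(features, labels):
    """features: (n_responses, d) for one subject; labels: 0 = truthful, 1 = deceptive."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(labels)
    correct = 0
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                           # leave the test response out
        correct += int(y[np.argmin(d)] == y[i]) # 1-NN vote from remaining responses
    return correct / len(X)

# Average over all subjects, e.g.:
# mean_acc = np.mean([loo_accuracy_per_subject(F[s], L[s]) for s in subjects])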
REFERENCES.
[1] Yang P, et al. Exploring Facial Expressions with Compositional Features. Proc. IEEE Computer Soc. Conf. on Computer Vision & Pattern Recognition, 2010.
[2] Yang P, et al. Ranking Model for Facial Age Estimation. Proc. 20th Internat. Conf. on Pattern Recognition, 2010.
[3] Yang P, et al. Boosting Encoded Dynamic Features for Facial Expression Recognition. Pattern Recognition Letters, 2009.
[4] Yang P, et al. RankBoost with Regularization for Facial Expression Recognition & Intensity Estimation. Proc. IEEE Internat. Conf. on Computer Vision, 2009.
[5] Neidle C, et al. A Method for Recognition of Grammatically Significant Head Movements & Facial Expressions, Developed through Use of a Linguistically Annotated Video Corpus. Proc. Workshop on Formal Approaches to Sign Languages, 21st European Summer School in Logic, Language & Information, Bordeaux, France, July 20-31, 2009.
[6] Michael N, et al. Spatial & Temporal Pyramids for Grammatical Expression Recognition of ASL. Proc. 11th Internat. ACM SIGACCESS Conf. on Computers & Accessibility, Philadelphia, PA, Oct 26-28, 2009.
[7] Burgoon JK, et al. Automated Kinesic Analysis for Deception Detection. Proc. 43rd Hawaii Internat. Conf. on System Sciences, 2010.
[8] Michael N, et al. Computer-Based Recognition of Facial Expressions in ASL: From Face Tracking to Linguistic Interpretation. Proc. 4th Workshop on the Representation & Processing of Sign Languages: Corpora & Sign Language Technologies, Malta, May 22-23, 2010.
[9] Michael N, et al. Motion Profiles for Deception Detection Using Visual Cues. Proc. 11th European Conf. on Computer Vision, September 2010.
[10] Kanade T, et al. Comprehensive Database for Facial Expression Analysis. Proc. 4th IEEE Internat. Conf. on Automatic Face & Gesture Recognition (FG'00), Grenoble, France, pp. 46-53, 2000.
[11] Borman S, Stevenson R. Super-Resolution from Image Sequences - A Review. Proc. Midwest Symposium on Circuits & Systems, 1998.
[12] Park SC, et al. Super-Resolution Image Reconstruction: A Technical Overview. IEEE Signal Processing Mag., 20(3):21-36, May 2003.
[13] Schuon S, et al. LidarBoost: Depth Superresolution for ToF 3D Shape Scanning. Proc. IEEE CVPR, 2009.
[14] Huang J, et al. Learning with Structured Sparsity. Proc. 26th Internat. Conf. on Machine Learning (ICML'09), Montreal, June 2009.

FUNDING. Supported by the National Space Biomedical Research Institute through NASA NCC 9-58.