Project title: Automated Detection of Sign Language Patterns
Faculty: Sudeep Sarkar, Barbara Loeding
Students: Sunita Nayak, Alan Yang
Department of Computer Science and Engineering, Department of Special Education

Goal and Impact Statement

Goal: To advance the design of robust computer representations and algorithms for recognizing American Sign Language from video.

Broader Impact:
• To facilitate communication between the Deaf and hearing populations.
• To bridge the gap in access to next-generation human-computer interfaces.

Intellectual Merit: We are developing representations and approaches that can
• handle hand and face segmentation (detection) errors,
• learn sign models from examples without supervision,
• recognize signs in the presence of movement epenthesis, i.e., hand movements that appear between two signs.

Movement epenthesis is the gesture movement that bridges two consecutive signs. It can last a long time and involve variations in hand shape, position, and movement, which makes these intervening segments hard to model explicitly. This has been a problem when matching individual signs to full sentences. We have overcome it with a novel matching methodology that does not require modeling of movement epenthesis segments.

Representation that does not require tracking
• We have proposed a novel representation that captures the Gestalt configuration of edges and points in an image.
• It can work with fragmented, noisy low-level outputs such as edges and regions.
• It captures the statistics of the relations between the low-level primitives:
    • distance and orientation between edge primitives,
    • vertical and horizontal displacement,
    • relationships between short motion tracks.
• The normalized relational distribution (RD) is an estimate of Prob(any two primitives in the image exhibit a given relationship).
• The shape of the RD changes as parts of the objects move.
• Relational distributions over time model high-level motion patterns.
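To make the relational-distribution idea concrete, here is a minimal sketch, not the authors' implementation. It assumes each low-level primitive is reduced to a single (x, y) point and uses only the vertical and horizontal displacement relation. The function name `relational_distribution`, the bin count, and the `max_disp` range are hypothetical choices made for illustration.

```python
import numpy as np

def relational_distribution(points, bins=8, max_disp=None):
    """Normalized 2-D histogram of pairwise (dx, dy) displacements.

    Illustrative assumption: every primitive is an (x, y) point and the
    only relation measured is horizontal/vertical displacement.
    """
    pts = np.asarray(points, dtype=float)
    if pts.ndim != 2 or pts.shape[0] < 2 or pts.shape[1] != 2:
        raise ValueError("expected at least two (x, y) primitives")

    # Displacements for all ordered pairs (i, j) with i != j.
    diff = pts[:, None, :] - pts[None, :, :]
    off_diagonal = ~np.eye(len(pts), dtype=bool)
    dx = diff[..., 0][off_diagonal]
    dy = diff[..., 1][off_diagonal]

    # Use a fixed displacement range so histograms from different
    # frames are directly comparable.
    if max_disp is None:
        max_disp = max(np.abs(dx).max(), np.abs(dy).max(), 1e-9)
    edges = np.linspace(-max_disp, max_disp, bins + 1)

    hist, _, _ = np.histogram2d(dx, dy, bins=[edges, edges])
    total = hist.sum()
    # Normalizing turns counts into an estimate of the probability that
    # a pair of primitives exhibits a given (dx, dy) relationship.
    return hist / total if total > 0 else hist

# One RD per frame; stacking them gives a sequence over time.
frame_a = [(0, 0), (4, 0), (0, 3), (4, 3)]
frame_b = [(0, 0), (4, 0), (0, 3), (7, 6)]  # one primitive has moved
rd_sequence = np.stack([
    relational_distribution(f, bins=8, max_disp=8.0)
    for f in (frame_a, frame_b)
])
```

Because the histogram is built only from relations between pairs of primitives within a single frame, no correspondence between frames is needed. Shifting all primitives by the same offset leaves the histogram unchanged, while moving a single primitive changes its shape.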

Unsupervised Learning of Sign Models
Learn a sign model from example sentences that have one sign in common. In the following two sentences, the target sign model to be learned is HOUSE (marked in red in the figure):
    SHE WOMAN HER HOUSE FIRE
    fs-JOHN CAN BUY HOUSE FUTURE

Segmentation Aware Matching
Frag-Hidden Markov Models:
• Groups across frames are linked.
• The best match is a path in this induced graph over groups.
• Matching involves optimization over states AND groups for each frame.
[Figure: hidden states S1, S2, S3, ... with observations O1, O2, O3, ...; each frame t contributes candidate groups of primitives, e.g. g_1^t = {p_1^t, p_2^t}, g_2^t = {p_2^t, p_3^t}, ...]

Movement Epenthesis Aware Matching
[Figure: error rates for enhanced Level Building (eLB, our method), which accounts for movement epenthesis, and classical Level Building (LB), which does not.]

Publications and Acknowledgement
• R. Yang, S. Sarkar, and B. Loeding, “Enhanced Level Building Algorithm for the Movement Epenthesis Problem in Sign Language Recognition,” to be presented at IEEE Conference on Computer Vision and Pattern Recognition, 2007.
• R. Yang and S. Sarkar, “Gesture Recognition using Hidden Markov Models from Fragmented Observations,” IEEE Conference on Computer Vision and Pattern Recognition, pp. 766-773, 17-22 June 2006.
• R. Yang and S. Sarkar, “Detecting Coarticulation in Sign Language using Conditional Random Fields,” International Conference on Pattern Recognition, vol. 2, pp. 108-112, 20-24 Aug. 2006.
• S. Nayak, S. Sarkar, and B. Loeding, “Unsupervised Modeling of Signs Embedded in Continuous Sentences,” IEEE Workshop on Vision for Human-Computer Interaction, vol. 3, p. 81, June 2005.
• R. Yang, S. Sarkar, B. L. Loeding, and A. I. Karshmer, “Efficient Generation of Large Amounts of Training Data for Sign Language Recognition: A Semi-automatic Tool,” ICCHP 2006: 635-642.
• B. L. Loeding, S. Sarkar, A. Parashar, A. Karshmer: Progress in Automated Computer Recognition of Sign Language.
  ICCHP 2004: 1079-1087

This work was supported in part by the National Science Foundation under ITR grant IIS 0312993.

Center of Excellence in Pattern Recognition