Cindy Song
Sharena Paripatyadar

• Use vision for HCI
• Determine steps necessary to incorporate vision in HCI applications
• Examine concerns & implications of such applications

In today’s world:
• Many devices with integrated cameras
• Many personal webcams
Our goal:
• To understand how to take advantage of these one-camera systems

Literature Research
Idea Generation
Media Player Wizard of Oz
Literature Research
Re-evaluate Approach
Media Player Implementation
Revised Implementation
Learning Application User Study
Evaluation

Freeman and Roth
• For hand posture analysis
• Creates histograms of local orientation using feature vectors from pixel intensity
• Recognizes 10 gestures in real time
Triesch and von der Malsburg
• Based on Elastic Graph Matching
• Extended for skin color feature detection
• Recognizes 12 gestures
Freeman
• Uses one open hand to control onscreen display
• Real-time application
• Hand may not be prominent in image

• 5 participants, various technical backgrounds, age 20-27
• Using computer with remote control
• Used alternate monitor to show user the video captured

• A small set of user-intuitive gestures is easy to remember, but needs some menu reminder
• Show rationale behind gestures
• Visual feedback to show recognized command before execution
• Concerns with:
  • Low-light conditions
  • Camera field of view
  • Webcam configuration
  • Responsiveness
  • Accuracy

Pros
• Don’t have to search for remote
• Don’t have to touch remote while eating
• No battery to run down
Cons
• Doesn’t have as many features as remote
• Doesn’t work in dark environments
• More ambiguous than remote, more errors possible (with a remote, you know what each button will do)

• Skin Color Training: With images of various lighting
• Calibration: Learn lighting and coloring
• Hand Location: Create bounding box from skin color
• Finger Region Detection: Find connected regions (fingers/palm)
• Pattern Matching: Compare regions with gesture patterns
• Gesture Determination: Using surrounding frames

Skin Color Training
• Trained on 20+ images
• Different lighting & people
• Uses “Lab” color space

Calibration
• Short training based on person’s hand and lighting conditions (< 1 sec needed)
• Determines correct lighting with skin color data
• Learns specific hand features

Hand Location
• Determines hand position in image using skin color
• Fills in missing portions of hand
• Creates bounding box

Finger Region Detection
• Examines bounding box
• Finds connected regions
• Removes small regions

Pattern Recognition
• Created set of patterns based on 10 gestures
• Counts number of finger regions for gestures 1-5
• For gestures 6-10, based on number of regions detected, looks at other patterns, e.g. for gesture 6, determines the ratio of finger width to the space between fingers

Gesture Determination
• 20 frames needed to recognize the gesture
• Avoids recognizing accidental gestures

Complex Backgrounds
• First, skin color analysis
• Then, find large connected regions of fingers and hand

Motion
• Static gestures & frame-by-frame analysis
• Allows for moving camera
• Gesture determination corrects for obscured or out-of-frame hand positions

• 5 participants, various technical backgrounds, age 20-27
• Taught users 2-4 gestures
• Quizzed users on gestures learned
• Ran gesture recognition algorithm to provide feedback
• Asked several follow-up questions

• Useful for learning sign language, teaching kids to count
• Instant feedback necessary
• Nice to know how to correct gesture
• Needs high accuracy
• Other applications
  • Some said Media Player application more useful
  • Or use as security system (hand gestures as a password)

Of implementation
• Real time is difficult
• Pattern recognition for specific gestures vs. technique for all types of gestures
• Complex/moving backgrounds important for real-world applications
Of user studies
• Video is a valuable avenue for many applications
• Accuracy and responsiveness are important
• In one-camera systems, there is a tradeoff between convenience and clarity

• Real-time
• More user studies
• Mobile devices
• Gesture learning application
  • e.g. Chinese cultural gestures
• Media Player plug-in application
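
The Finger Region Detection and Gesture Determination steps from the implementation slides can be sketched roughly as below. This is a minimal illustration, not the presenters' code. It assumes the skin-color step has already produced a binary mask in which finger candidates are separated from the palm, and it assumes "20 frames needed" means the same label must appear in 20 consecutive frames. The names `count_regions`, `GestureFilter`, and `min_area` are hypothetical.

```python
from collections import deque


def count_regions(mask, min_area=3):
    """Count 4-connected regions of truthy cells in a 2D mask.

    Regions with fewer than `min_area` cells are treated as noise and
    dropped (the "remove small regions" step; the threshold is an assumption).
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill to measure this region's area.
            area = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if area >= min_area:
                regions += 1
    return regions


class GestureFilter:
    """Report a gesture only after `needed` consecutive identical frames.

    Assumed reading of the "20 frames" rule: a brief, accidental pose
    never builds a long enough streak to be reported.
    """

    def __init__(self, needed=20):
        self.needed = needed
        self.current = None
        self.streak = 0

    def update(self, label):
        """Feed one per-frame label; return it once stable, else None."""
        if label == self.current:
            self.streak += 1
        else:
            self.current = label
            self.streak = 1
        if label is not None and self.streak >= self.needed:
            return label
        return None


# Three finger-like columns plus one stray pixel of noise.
demo_mask = [
    [1, 0, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 1, 0, 1],
    [1, 0, 1, 0, 1, 0, 0],
]
print(count_regions(demo_mask))  # 3 (the single stray pixel is dropped)
```

For gestures 1-5 the region count could serve directly as the per-frame label passed to `GestureFilter.update`. Gestures 6-10 would need the extra pattern checks the slides mention, which this sketch does not attempt.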