2D, 3D AND MULTI-TOUCH GESTURES MADE EASIER
Joshua Sunshine

What is a gesture?
- "Gesture-based interfaces offer an alternative to traditional keyboard, menu, and direct manipulation interfaces." [Rubine]
- "Pen, finger, and wand gestures are increasingly relevant to many new user interfaces." [Wobbrock]
- "Input paths of recognized shapes." [Me]

What can gesture?
- Mouse: Adobe Illustrator, SKETCH
- Pen: Palm Pilot, interactive whiteboards
- Finger: iPhone
- Body: Minority Report
- Face: ATM PINs
- Wii Remote: Wii games

Why support gestures? Efficient
- A single stroke can indicate the operation, the operand, and additional parameters.
- A proofreader's mark indicates [Rubine]:
  - that a move should occur (the operation),
  - the text that should be moved (the operand),
  - and the new location of the text (an additional parameter).

Why support gestures? Natural
- Chinese brush painting
- Musical scores
- Chemical formulas

Why support gestures? No other choice
- iPhone (touch screens in general)
- Tabletops
- Interactive whiteboards

Videos
- World Builder
- Opera face recognition

Gesture support: two approaches
- Ad hoc = "recognizers that use heuristics specifically tuned to a predefined set of gestures" [Wobbrock 2007]
  - Application-specific: e.g., Chinese brush painting, musical scores, chemical formulas
  - Platform: e.g., iPhone gesture libraries
- Systematic = allows definition of new gestures
  - Toolkit or framework
  - Simple algorithm

Ad hoc vs. systematic
- Ad hoc can be hard to implement because gestures collide [Long 1999].
- Ad hoc doesn't allow definition of new gestures.
- It is harder to perfect gestures in a systematic system.
- Consistency of gestures across applications is better in ad hoc systems.

GRANDMA
- Gesture Recognizers Automated in a Novel Direct Manipulation Architecture
- Co-developed with the Gesture-based Drawing Program (GDP)
- Major feature: build by example
- Extensions: multi-touch, eager gestures

GDP design
- Gesture input handlers are associated with classes.
- Any subclass is automatically associated with the gesture handler as well.
- Examples:
  - Associating the "X"-shaped delete gesture with GraphicalObject automatically associates it with all Rectangle, Line, Circle, etc. objects.
  - Associating the "L"-shaped create-rectangle gesture with GdpTopView associates it with any GDP window.
- Note the difference from the interactor model, which we implemented in our homework: interactors associate input handlers with groups.

GRANDMA recipe
1. Create a new gesture handler and associate it with a class.
2. Draw the gesture ~15 times.
3. Define semantics (in Objective C) at three points:
   1. the gesture is recognized,
   2. mouse movements after recognition,
   3. the gesture finishes.

Formal definitions
- A gesture is represented as an array g of P sample points:
  g_p = (x_p, y_p, t_p), 0 ≤ p < P
- Problem: given an input gesture g and a set {C_1, C_2, ...} of gesture classes, determine which class g belongs to.

GRANDMA algorithm
- 13 features.
- A gesture class is a set of weights, one per feature.
- Gestures are given a grade by the linear evaluation function resulting from the weights.
- A gesture is assigned to the class with the maximum grade.
- Training assigns the weights to the 13 features.
- Gestures are rejected if the grades assigned to two classes are similar.
- (A code sketch of this classifier follows the Limitations slide below.)

Limitations
- Only supports single-stroke gestures.
  - Why? It avoids the segmentation problem.
  - Why else? It is more usable: a single stroke is associated with a single operation.
- Supports only a subset of the gestures that are part of my definition.
- Segmentation problem = the problem of recognizing when one stroke ends and the next one begins.
- Fails to distinguish between some gestures.
- Hard to make size-independent gestures.
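To make the linear classifier on the "GRANDMA algorithm" slide concrete, here is a minimal sketch in Python (GRANDMA itself is written in Objective C). Only three of Rubine's thirteen features are shown; the feature subset, the dictionary-of-weights representation, and the fixed reject_margin are illustrative assumptions of mine. The paper rejects ambiguous gestures with a probability estimate over the grades rather than a fixed margin.

```python
import math

def features(points):
    """Compute a few of Rubine's 13 features for a stroke.
    points: list of (x, y, t) samples; only 3 of the 13 features
    are shown here (the full set is in [Rubine 1991])."""
    (x0, y0, _), (x2, y2, _) = points[0], points[2]
    # f1, f2: cosine and sine of the stroke's initial angle.
    d = math.hypot(x2 - x0, y2 - y0) or 1.0
    f1, f2 = (x2 - x0) / d, (y2 - y0) / d
    # f5: length of the diagonal of the bounding box.
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    f5 = math.hypot(max(xs) - min(xs), max(ys) - min(ys))
    return [f1, f2, f5]

def classify(points, classes, reject_margin=0.5):
    """Assign the stroke to the class with the maximum grade.
    classes: {name: (w0, [w1..wF])}, weights produced by training.
    The grade for class c is the linear function
        v_c = w_c0 + sum_i(w_ci * f_i).
    Returns None when the top two grades are too close (rejection);
    the margin value here is illustrative, not Rubine's test."""
    f = features(points)
    grades = {name: w0 + sum(w * fi for w, fi in zip(ws, f))
              for name, (w0, ws) in classes.items()}
    ranked = sorted(grades, key=grades.get, reverse=True)
    if len(ranked) > 1 and grades[ranked[0]] - grades[ranked[1]] < reject_margin:
        return None  # ambiguous: two classes received similar grades
    return ranked[0]
```

Training (not shown) estimates the per-class weights from the ~15 drawn examples; that step involves the covariance-matrix inversions that the $1 slide below complains about.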
GRANDMA in Amulet
- Amulet was extended to allow gesture input [Landay].
- A GestureInteractor calls a callback function, which decides what to do with the result of classification.
- Classifiers are created with the GRANDMA training algorithm.
- The GDP features I discussed weren't used.

$1 recognizer
- Most recognizers are hard to write:
  - hidden Markov models,
  - neural networks,
  - GRANDMA requires programmers to compute matrix inversions, discriminants, and Mahalanobis distances.
- Toolkits are not available in every setting.
- Therefore creative types (e.g., curious college sophomores) don't implement gestures in their UIs.

$1 goals
- Resilience to variations in sampling
- Require no advanced math
- Small amount of code
- Fast
- 1-gesture training
- Return an N-best list with scores

$1 algorithm (a code sketch follows below)
- Resample the input into N evenly spaced points.
- Rotate: the "indicative angle" is the angle between the centroid and the start point; rotate so it is zero.
- Scale to a reference square.
- Re-rotate and score: the score is built from the average distance between candidate and template points.
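The four steps above are simple enough to code directly, which is $1's selling point. Below is a minimal Python sketch, not the paper's own pseudocode; n = 64 resampled points and a 250-unit reference square follow the values used in the paper, and the golden-section search over candidate rotations that the full recognizer performs while scoring is omitted here for brevity.

```python
import math

N, SIZE = 64, 250.0  # resample count and reference-square side

def path_length(pts):
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))

def resample(pts, n=N):
    """Step 1: resample the stroke into n evenly spaced points."""
    interval = path_length(pts) / (n - 1)
    pts, out, acc, i = list(pts), [pts[0]], 0.0, 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if acc + d >= interval:
            t = (interval - acc) / d
            q = (pts[i-1][0] + t * (pts[i][0] - pts[i-1][0]),
                 pts[i-1][1] + t * (pts[i][1] - pts[i-1][1]))
            out.append(q)
            pts.insert(i, q)  # q becomes the start of the next segment
            acc = 0.0
        else:
            acc += d
        i += 1
    while len(out) < n:       # guard against floating-point round-off
        out.append(pts[-1])
    return out

def centroid(pts):
    return (sum(x for x, _ in pts) / len(pts),
            sum(y for _, y in pts) / len(pts))

def rotate_to_zero(pts):
    """Step 2: rotate so the indicative angle (centroid to first
    point) is zero."""
    cx, cy = centroid(pts)
    theta = math.atan2(pts[0][1] - cy, pts[0][0] - cx)
    cos, sin = math.cos(-theta), math.sin(-theta)
    return [((x - cx) * cos - (y - cy) * sin + cx,
             (x - cx) * sin + (y - cy) * cos + cy) for x, y in pts]

def scale_and_translate(pts, size=SIZE):
    """Step 3: scale to the reference square, centroid to origin."""
    xs, ys = [x for x, _ in pts], [y for _, y in pts]
    w, h = (max(xs) - min(xs)) or 1.0, (max(ys) - min(ys)) or 1.0
    pts = [(x * size / w, y * size / h) for x, y in pts]
    cx, cy = centroid(pts)
    return [(x - cx, y - cy) for x, y in pts]

def normalize(raw):
    return scale_and_translate(rotate_to_zero(resample(raw)))

def recognize(raw, templates):
    """Step 4: score each (already normalized) template by the
    average point-to-point distance to the candidate; return an
    N-best list with scores, 1.0 being a perfect match. (The full
    $1 also refines the rotation with a golden-section search.)"""
    c = normalize(raw)
    half_diag = 0.5 * math.hypot(SIZE, SIZE)
    scores = [(name, 1.0 - sum(math.dist(p, q) for p, q in zip(c, t))
               / N / half_diag)
              for name, t in templates.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)
```

Templates are produced by running normalize on a single recorded example per gesture class, which is what makes the 1-gesture training goal above possible.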
Really $1?
- Hold up the back page of the paper. Thoughts?
- My opinion: the algorithm is simple enough to re-implement; the major barrier is the discovery problem.

Limitations
- Cannot distinguish between gestures whose identities depend on aspect ratios or orientations:
  - a square from a rectangle,
  - an up arrow from a down arrow.
- Gestures cannot be distinguished based on speed.

$1 evaluation
- User study: 10 users, 16 gesture types, 30 entries each (10 slow, 10 medium, 10 fast).
- Compared recognizers. Results:
  - $1: 0.98% errors; GRANDMA: 7.17% errors.
  - Medium speed is best.
  - $1 and GRANDMA were fast enough.

DiamondSpin Video
- Toolkit for efficient prototyping of multi-person shared displays (target = touch-screen tabletops).
- Defines an API for building tabletop applications.
- Gestures are defined in an ad hoc manner.

Conclusion
- Gesture toolkits make building gesture support easy.
- Multi-touch gestures are easy.
- There remain significant challenges to building the gestures of the future:
  - many limitations of current approaches in 2D,
  - 3D gestures are supported only in an ad hoc manner.

References (slide 1 of 2)
- Chia Shen, Frédéric D. Vernier, Clifton Forlines, Meredith Ringel. "DiamondSpin: An Extensible Toolkit for Around-the-Table Interaction." In CHI '04, p. 167-174.
- J. O. Wobbrock, A. D. Wilson, Y. Li. "Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes." In UIST '07, p. 159-168.
- Dean Rubine. "Specifying Gestures by Example." Computer Graphics, vol. 25, no. 4, July 1991, p. 329-337.
- James A. Landay, Brad A. Myers. "Extending an Existing User Interface Toolkit to Support Gesture Recognition." In CHI '93 Extended Abstracts, p. 91-92.
- T. Westeyn, H. Brashear, A. Atrash, T. Starner. "Georgia Tech Gesture Toolkit: Supporting Experiments in Gesture Recognition." In Proceedings of the 5th International Conference on Multimodal Interfaces, p. 85-92.

References (slide 2 of 2)
- Kent Lyons, Helene Brashear, Tracy Westeyn, Jung Soo Kim, Thad Starner. "GART: The Gesture and Activity Recognition Toolkit." In HCI '07.
- Jason I. Hong, James A. Landay. "SATIN: A Toolkit for Informal Ink-Based Applications." In UIST '00: CHI Letters, vol. 2, issue 2, p. 63-72.
- A. C. Long, J. A. Landay, L. A. Rowe. "Implications for a Gesture Design Tool." In CHI '99, p. 40-47.
- B. MacIntyre, M. Gandy, S. Dow, J. D. Bolter. "DART: A Toolkit for Rapid Design Exploration of Augmented Reality Experiences." In UIST '04.
- R. C. Zeleznik, K. P. Herndon, J. F. Hughes. "SKETCH: An Interface for Sketching 3D Scenes." In SIGGRAPH '96, p. 163-170.

SATIN Video