The BoPen: A Tangible Pointer Tracked in Six Degrees of Freedom

by

Daniel Matthew Taub

S.B., EECS, M.I.T., 2006

Submitted to the Department of Electrical Engineering and Computer Science
in partial fulfillment of the requirements for the degree of
Master of Engineering in Electrical Engineering and Computer Science
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
September 2009

© Massachusetts Institute of Technology 2009. All rights reserved.

Author: Department of Electrical Engineering and Computer Science, August 21, 2009

Certified by: Ramesh Raskar, Associate Professor, Thesis Supervisor

Accepted by: Dr. Christopher J. Terman, Chairman, Department Committee on Graduate Theses
The BoPen: A Tangible Pointer Tracked in Six Degrees of
Freedom
by
Daniel Matthew Taub
Submitted to the Department of Electrical Engineering and Computer Science
on August 21, 2009, in partial fulfillment of the
requirements for the degree of
Master of Engineering in Electrical Engineering and Computer Science
Abstract
In this thesis, I designed and implemented an optical system for freehand interaction
in six degrees of freedom. A single camera captures a pen's location and orientation,
including roll, tilt, x, y, and z, by reading information encoded in a pattern at the
infinite focal plane. The pattern design, server-side processing, and application demo
software is written in Microsoft C#.NET, while the client-side pattern recognition is
integrated into a camera with an on-board DSP and programmed in C. We evaluate
a number of prototypes and consider the pen's potential as an input device.
Thesis Supervisor: Ramesh Raskar
Title: Associate Professor
Acknowledgments
I would first like to thank the creators of the Bokode technology: Ankit Mohan, Grace
Woo, Shinsaku Hiura, Quinn Smithwick, and my advisor, Ramesh Raskar. Special
thanks to Ankit for countless instances of support and advice. Thank you to Shahram
Izadi and Steve Hodges of Microsoft for being so interested in a collaboration and
for great conversations about HCI. Without the help of Paul Dietz at Microsoft, the
collaboration would never have happened in the first place.
David Molyneaux at MSRC provided considerable support to me, especially during
long nights at the lab. He also helped with editing, as did Susanne Seitinger.
Though I have never met him, I am indebted to Jean Renard Ward for his extensive
bibliography on the history of pens and handwriting recognition, which made my
related works section both more interesting and immensely more time-consuming.
Special thanks to Daniel Saakes for doing the mock-up BoPen graphic and for his
valuable contributions to the final pen design. Also thanks to Tom Lutz for tolerating
my frequent iterations on the pen design and for training me in all the shop tools, and
thank you to MSRC for footing the bill when I had to make more pens. Bob from FS
Systems graciously lent me his smart camera and served as an excellent liaison with
the team at Vision Components, who were extremely helpful in modifying their data
matrix decoding library to work faster and more accurately with our setup.
For giving me a great background in HCI before I started this project, I would
like to thank Randy Davis and Rob Miller. Professor Davis taught me more than
I ever wanted to know about multimodal interfaces and Professor Miller gave me a
firm-but-fair grounding in traditional HCI.
I take for granted that my family has always supported me, but I shall thank
my parents for trusting that my decisions are sound and for only pushing me just
enough. Thanks to my sister for her sense of humor, and for smiling sometimes.
Unspeakable appreciation to Virginia Fisher for lending her incredible strength and
support. Looking forward to our future together provided immeasurable motivation.
Contents

1 Introduction
  1.1 Motivation
  1.2 Research Question
    1.2.1 Creating real-time Bokode tracking and decoding
    1.2.2 Co-locating Bokode projector with a display
    1.2.3 Enabling interaction with system in place
  1.3 Outline

2 Related Works
  2.1 Pointing Devices
    2.1.1 Mice
    2.1.2 Pens
    2.1.3 Other Pointers
  2.2 Blending Reality and Virtuality
    2.2.1 Tangible
    2.2.2 Mixed-Reality Interfaces
  2.3 Fiducials
  2.4 Bimanual Interaction
  2.5 Conclusion

3 Design Development
  3.1 BoPen Overview
    3.1.1 Bokode Optics
    3.1.2 Preliminary Pattern
  3.2 Building the BoPen
  3.3 Software and System Iteration
    3.3.1 Replicating Bokode
    3.3.2 Diffuser Experiment
    3.3.3 Live Tracker
    3.3.4 Lessons Learned

4 Final Implementation
  4.1 SecondLight
  4.2 Optical Considerations
    4.2.1 Pattern Choices
    4.2.2 Lenses
  4.3 The Pens
  4.4 Hardware and Software
    4.4.1 Smart Camera Approach
    4.4.2 Vision Server Approach
  4.5 Interaction Vision

5 Evaluation
  5.1 Pen Design Verification
    5.1.1 Pattern Comparison
    5.1.2 Optical Idiosyncrasies
    5.1.3 Best Choice
  5.2 Interface Design Verification
    5.2.1 Limitations
    5.2.2 Suggested Improvements
    5.2.3 Interface Potential

6 Conclusions
  6.1 Contribution
  6.2 Relevance
  6.3 Application Extensions
    6.3.1 GUI Extensions
    6.3.2 Tangible and Direct Manipulation
    6.3.3 Multi-user Interaction
  6.4 Device Future
  6.5 Outlook

A MATLAB Code

B C Code

C C Sharp Code
  C.1 Program.cs
  C.2 Form1.cs
  C.3 Tracker.cs
  C.4 BoPen.cs
List of Figures

2-1 Mixed-Reality Continuum
2-2 Two-Dimensional Barcodes
3-1 BoPen: Basic Design
3-2 Basic Optical Models
3-3 Data Matrix Grid Pattern
3-4 Bokode Versus BoPen
3-5 Components of final pen design
3-6 Table-Based Setup
3-7 Software Version One
3-8 Version One Failure
3-9 Diffuser-Based FOV Enhancement
3-10 Mock-up of Diffuser Imaging Setup
3-11 Pipeline for Software Version Two
3-12 Version Two With Multiple Pens
4-1 SecondLight Modifications
4-2 Pattern Designs
4-3 Alternate Pattern Design for Template Matching
4-4 The Pens!
4-5 Pen: Exploded View
4-6 Film Masks
4-7 BoPen Debug Display
5-1 Spaced Data Matrices
5-2 Reflection Chamber
6-1 Vision for Public Display
List of Tables

4.1 Lens Properties for Each Pen
5.1 Images of Various Patterns
5.2 Paper Test Recognition Rates
5.3 Images of Various Bokehs
Chapter 1
Introduction
1.1 Motivation
When Douglas Engelbart and colleagues created the first Graphical User Interface in
1965-1968 [25], few could have predicted that, 30 years later, it would become the
dominant method for humans to interact with computers. It was only a few years later
that Xerox created the Alto, but it was built as a research machine [88]. Not until
Apple's Macintosh XL was released in 1984 did GUI and pointing-based computers
really start taking off [40]. By 1993-almost a decade later-Microsoft® had taken the
lead in software with more than 25 million licensed users of Windows [67]. Today,
users the world over have seen very little change in terms of mainstream interface
design.
Despite the popularization of the Apple iPod and iPhone, 90% of users
operate a mouse on a daily basis, and the mouse and PC pairing has 97% prevalence
[74].
Whilst this 40-year-old technology has grown to widespread use and acceptance,
other interesting pointing interface technologies have been created and are being
used in research, academic, and business settings. Digital pens and (multi-)touch
screens are two of the most popular of these modern interfaces. Most people are
familiar with pens as writing utensils, but many also have come to know their digital
counterparts via signature pads, tablet PCs, stylus-capable phones and PDAs, and
digital art tablets. Our comfort and familiarity with pens for writing, marking, and
gesturing contribute to our cultural familiarity with expressing ourselves with pens
and pen-based devices.
Touch interfaces, on the other hand, draw on our early practice with object manipulation tasks by creating a "magic finger" analogy. Arguably, the touch-activated
screens we are familiar with from ATMs and public kiosks were-initially-extensions
of the physically actuated buttons on mechanical apparatuses, telephones, TV remotes,
and other electromechanical devices. Even with screen-based touch interfaces, this
interaction paradigm has not changed much; with a few exceptions, touch devices are
physically flat and permit surface-based interactions in only two dimensions. Some
gestural interfaces omit the surface entirely [12], while others can make it seem that
any surface is interactive [69]. The magic fingers remain, but the removal of haptic feedback can seriously distance these interfaces from the object manipulation
metaphor.
In contrast, digital pen-based interfaces by definition include a physical
component-the pen-shaped tool-which acts as a liaison between the physical and
digital realms. To accomplish a similar result with touch-based interactions requires
touch-responsive props in addition to or in place of the standard two-dimensional
surface. At a certain point after adding these props, the interface no longer falls
under the category of "touch interfaces". Instead, it is called "tangible," and in many
cases, touch becomes only one of many diverse ways to interact with the object.
Pen-based interfaces have other affordances, in addition to tangibility. Though touch
screens have been increasing in popularity [14], pens are generally understood to be
more accurate [38]. Where speed is concerned, pens are nearly indistinguishable from
mice under the Fitts' paradigm [2]. Pens are also used for more than just mouse-like
pointing. Higher-end commercial pen interfaces support both rolling and tilting for
precise drawing and sculpting actions [96].
Digital pens still fall short compared to tangible and augmented reality where
flexibility is concerned. In augmented reality systems, everyday objects recognized
by vision or tagged by a fiduciary marker may take on an arbitrary meaning that is
conveyed or enhanced using sensory overlays [68]. Likewise, interactions with external
physical objects can be enabled if the pen position can be detected reliably on those
objects. However, pen interfaces are still largely limited to the scenario of a single
user interacting with a nearby tablet or other flat surface.
1.2 Research Question
What if one could combine the affordances of a familiar physical object, like those
of a pen, with the flexibility of an augmented reality system? We believe the result
would be something like the widely popular mouse, but providing a more comfortable
way to exert control in up to six degrees of freedom (DOFs). A small, lightweight,
wireless pen could provide an ideal balance between the fatiguing bulk of a six-DOF
mouse and the lack of haptic feedback in hand gesture and finger-tracking systems.
This thesis presents a unique application of the "Bokode" optical system first
described by [70] to the Microsoft SecondLight multi-touch and multi-display table
[46]
to enable pen-based interaction supporting three-dimensional translation, two-
dimensional tilt, and rotation using no more than a camera, a light source, and some
inexpensive optical components (Figure 3-1). In the Bokode setup, a matrix of codes
in each pen simultaneously provides both identification and location information directly, using a microscopic array of fiducials. Informed by augmented reality research
as well as pen interface technologies, we endeavor to demonstrate the first real-time
application of the Bokode as well as the first application programming interface (API)
for this technology.
1.2.1 Creating real-time Bokode tracking and decoding
As we shall explain in Section 3.3.4, decoding Data Matrices can be a slow process.
Human perceptual psychology demands system response times between 100 and 200
milliseconds to fulfill our expectations of interaction [21], so rapid response dispatching is an essential component of a good interface. We plan to explore both hardware
and software solutions to the decoding delay problem.
1.2.2 Co-locating Bokode projector with a display
Due to the optical nature of the Bokode system (described in Section 3.1.1), traditional diffusive surface-based projection and LED/LCD display technology prevents
the Bokode signal from reaching the camera. We discuss our integration of the Bokode
with the Microsoft SecondLight multi-touch and multi-display system as one solution
to this problem.
1.2.3 Enabling interaction with system in place
Bokode pens encode rotation angle, unique ID, and tilt in the projected patterns.
Creating an API to access this data from higher-level software is necessary to connect
the lower-level aspects of the interface to the applications. We shall describe the
design for an API and give some examples of applications that could utilize it.
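To make the shape of such an API concrete, the sketch below shows one possible per-pen event structure and callback that a client application could consume. It is illustrative only: the names BoPenEvent and BoPenListener are hypothetical, written here in C++ for brevity, whereas the interface actually implemented in this work is the C# code listed in Appendix C.

    // Hypothetical sketch of the per-frame data a BoPen API could expose to applications.
    // Names and fields are illustrative, not the interface implemented in Appendix C.
    #include <functional>

    struct BoPenEvent {
        int   penId;   // identification byte shared by all Data Matrices in one pen
        float x, y;    // position of the bokeh centroid on the interaction surface
        float z;       // estimated height above the surface
        float tiltX;   // tilt component recovered from the decoded DM column
        float tiltY;   // tilt component recovered from the decoded DM row
        float roll;    // rotation about the pen axis, from the orientation of the DM
    };

    // An application registers one callback and receives an event per tracked pen per frame.
    using BoPenListener = std::function<void(const BoPenEvent&)>;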
1.3 Outline
Chapter 2 provides the background and a literature review for this project.
We
convey a history of pointers-mouse, pen, and other-in human-computer interactions
(HCI). Then, we detail a variety of modern interface inventions, relating them back
to the current work. Finally, we describe motivating research from the ubiquitous
computing domain, where fiduciary markers bridge augmented reality and tangible
interfaces.
In Chapter 3, we describe how the system works, optically and computationally,
followed by a presentation of design iterations, starting with our proof-of-concept
system: a replication of the experiments in [70] using pre-recorded high-resolution
video. We detail the evolution of the design with an analysis of our second system,
which provides real-time tracking and delayed-time Data Matrix decoding with a
lower-resolution (and more conventional) CCD sensor.
Chapter 4 describes the structure of our current system and its implementation as
part of a collaboration with the Microsoft SecondLight project. Chapter 5 evaluates
our system in its current form and describes both its limitations and the methods
we've used to enhance performance. We close in Chapter 6 with a description of our
plan to develop the system into a fully-functional interface, laying out some possible
research extensions and promising application areas.
Chapter 2
Related Works
2.1 Pointing Devices

2.1.1 Mice
The most basic and best-known example of a pointing device is the standard two-dimensional mouse, a device which has changed little since its first public demonstration by Engelbart, English, et al. in December 1968 [35]. As modern GUI-based
applications often demand more than the standard two degrees of freedom (DOFs)
for optimum usability, modifications to the 2D mouse tend to address this issue in
similar ways-with a single, small addition. One of the most common additions is
that of the scroll wheel [94], a small rotating disk that is usually placed between the
two mouse buttons and provides one additional DOF. Another similar design, the JSMouse, enables two additional DOFs by placing an IBM TrackPoint III™ miniature
isometric joystick in the familiar scroll wheel position [100]. The three- or four-DOF
devices resulting from these modifications can enable more efficient zooming, rapid
browsing, and virtual object manipulation. Many of the more common modern-day
mice draw heavily from one or both of these examples [3].
The aforementioned devices may reduce the need for menu traversal or key combinations and therefore increase efficiency, but they leave the mouse design essentially
unchanged, save a small addition. Preserving the object manipulation metaphor inherent in the traditional mouse results in more fundamental design changes. Take
for example the Rockin'Mouse [4]. In this device, an internal tilt sensor enables the
same four DOFs but as a more natural extension of the traditional mouse; a rounded
bottom affords the user a means to figuratively and literally grasp the two additional
DOFs, and there is no additional nub or wheel to find; all actions can be performed
by directly manipulating the control object.
This kind of design is compelling-it allows the user to move a mouse in multiple
directions, mapping that movement to the motion of a virtual object. However, the
Rockin'Mouse is limited in the scope of its applications. For example, in the case
where roll (rotation around the axis normal to the table's plane) is more important
than tilt, this particular device would no longer be as intuitive to use, and a new
mouse would need to be created. This new mouse would likely look something like
the prototypes devised by MacKenzie et al. [61] and Fallman et al. [26]. These mice
are both based on the idea that two standard ball- or optical- mouse elements can
be combined in one device, giving the user the option of manipulating two connected
yet fixed-distance cursors at the same time-moving around a line instead of a point.
The direct manipulation of a widget that represents some virtual object is an
attractive interface paradigm-so alluring, in fact, that the development of so-called
"tangible" interfaces has broken off as a field of interaction research in its own right.
(See Section 2.2.1)
Two examples of highly-manipulable mice are the GlobeMouse and the GlobeFish
[30], which provide independent control of translation and rotation in three dimensions
each, for a total of six DOFs. The GlobeMouse places a three-DOF trackball on top of
a rotation-capable mouse, and is meant to be operated with one hand. The GlobeFish
is operated bi-manually and uses three nested frames to implement 3D translation
with a three-DOF trackball for rotation.
The authors report that both of these devices
require considerable force relative to other mice, with the GlobeFish causing fairly
rapid motor fatigue in their users. The authors were careful to consider that direct
manipulation was not necessarily the best rule to follow for all interfaces, reporting
that 3 DOF (translation) + 3 DOF (rotation) is better suited to docking tasks than
a device with the more-direct 1 × 6-DOF configuration.
The VideoMouse [43] is a true 1 × 6-DOF direct manipulation interface; it supports
roll, two-DOF rocking, two-DOF translation, and some amount of height sensing.
This device uses computer vision to process images captured by a camera underneath
the mouse. Additionally, a specially-patterned mousepad can act as a handle for the
non-dominant hand to assist in orientation adjustments, and the same camera can be
used for scanning short segments of text. However, there was no data reported from
user studies, so the usability of the VideoMouse is relatively unknown. However, it
provides considerable insight into how one might design a 1 × 6-DOF pen user interface.
Indeed, the BoPen is very similar to the VideoMouse-except that it replaces the
camera with a projector, the mousepad with a camera, and the focused lenses with
out-of-focus ones.
2.1.2 Pens
The use of digital pens predates the popular introduction of mice by more than a
decade. In the mid-1950s the US military established the Semi-Automatic Ground
Environment (SAGE) to command and control air defenses in the event of a Soviet
attack. The radar consoles used with SAGE included "light-pens" for selecting which
blips to track [24]. Following SAGE, light-pen research continued to be funded (like
most computational technologies of the era) primarily through military contracts [71].
Ivan Sutherland's 1963 work, "Sketchpad: A man-machine graphical communication
system," leveraged the light-pen for drawing and subsequent selection, movement,
and alignment of drawn elements and was completed while Sutherland was working
at MIT's Lincoln Laboratory [86]. As an interesting side-note, Sutherland's work was
one of the first examples of a real-time screen interface, where manipulating elements
on the display directly modified the contents of the computer's memory.
By the time the first mouse was introduced, digital pens had also started moving
off the screen. In a seminal 1964 work, Davis and Ellis of the Research ANd Development (RAND) Corporation wrote about the electrostatics-based RAND Tablet
[79]. This paper was possibly the first to use the phrase "electronic ink," and is
remembered as both the first handwriting recognition system and the first digitizing
tablet
[71].
However, written character recognition had already been studied for some
time, in both electromechanical [22, 73] and light-pen [64] systems. Still, the RAND
Tablet soon became a highly popular platform for researching hand-printed character
recognition [7, 33], virtual button pressing
[91],
and gesture/sketch recognition
[87].
It remains one of the best-known of the early digitizing tablets.
Irrespective of indications that users could quickly learn to work with stylus input
to an off-screen tablet [79], research into the use of digital pens for direct on-screen interactions continued [32]. In a natural extension of this idea, Alan Kay's "Dynabook"
became well known as the earliest vision for a hand-held device not unlike today's
tablet PCs and PDAs [52, 51]. At Xerox PARC, Kay's design brought inspiration
to the development of the Alto-on which many of the ideas for modern GUIs were
developed (it featured a mouse as its pointer) [78]. Despite these early works and
visions, the cost of computers kept tablet- and pen-based interfaces from reaching the
public [93] for quite some time.
Finally, in the early 1980s, the Casio PF-8000 and PenCept Penpad became available as the first tablet-based consumer devices, and around the same time Cadre Systems Limited "Inforite" digital signature pads were being used for identity verification
at a few shops in Great Britain. In the early 1990s, a plethora of PDAs and tablet PCs
began to become available. Since this time, devices based on two-dimensional pen interaction have changed little aside from miniaturization, performance optimizations,
and resolution enhancements. A few well-known products from around the turn of
the millennium deserve brief mention. Wacom® tablets and the Anoto® Pen are
two systems against which many contemporary pen interfaces are compared. Interestingly, save a few models of Wacom's devices, these two-dimensional pen systems
are primarily screen-less interfaces [97].
As the two-DOF mouse relates to the two-DOF pens familiar to users of tablet
PCs and PDAs, the higher-dimensional mice also have analogues in the pen domain.
The addition of a third dimension has been explored with visible feedback to indicate
movement between layers [85].
Bi et al. describe a system that employs rotation
around the center axis, known as "roll," as a way to enhance interaction for both
direct manipulation and mode selection [8]. Changing a pen's tilt-the angle made with
the drawing surface-has been used for menu selection [90], brush stroke modification
[96], and feedback during a steering task [89]. This enhanced functionality is possible
because the additional degrees of freedom can be re-mapped for arbitrary purposes, an
idea also present in Ramos et al.'s "pressure widgets" [80]. Taking the idea of higher-dimensional control more literally, Oshita recognized that the metaphor between pen
and object position works well for bipedal characters [75]. He created a system that
directly changes the posture of a virtual human figure based on pen position, and
this novel application of a stylus will inspire us to more closely examine the boundary
between tangible interfaces and more traditional ones in Section 2.2.1 below.
From the light-pen to the Anoto, optical pen interfaces belong to an expansive
group of interaction technologies.
As we mentioned before, the BoPen system is
based on some very special optics. Both the Anoto and the light-pen contain optical
receiving elements that can determine the pen's location relative to a nearby surface.
The BoPen, however, is a tiny image projector. Projectors have been used only
recently in pens as a way of turning any surface into a display [84], but we will
discuss how to use a projector as a way to indicate position and location in Section
3.1.1.
2.1.3 Other Pointers
Traditional pointing techniques have often been compared to touch [29] and multitouch [23]. However, for the purposes of this section, we will only review devices and
techniques for assisting interaction at a distance (for a good review of multitouch
technologies, see [14]). As early as 1966, three-DOF tracking down to 0.2in resolution
was available for the ultrasonically-sensed Lincoln Wand, and at the time it claimed
to supplant the light-pen [82]. A bit later, magnetic systems were used to facilitate
the use of natural pointing gestures [12]. Currently, similar results can be obtained
with camera-tracked wands [15] and device-assisted gestures [41, 95].
Visible laser pointers are a popular tool for interacting with large, distant dis-
plays, and computer-tracked versions are no exception [54]. Infrared lasers provide
the same pinpoint accuracy but with a virtual cursor replacing the common red dot,
reducing interference with the screen [65].
In these systems, however, hand jitter
and fatigue are frequently-encountered difficulties [72]. Some systems approach the
problem of fatigue by using gestures rather than clicking [19], whereas König's adaptive pointing techniques promise to mitigate the problem of jitter [55]. One might
expect that lighter devices would assist with fatigue, but even hand-only gestures can
become tiring after a while, decreasing recognition accuracy [1]. While the BoPen
technology seeks to eventually support distance interaction with large displays, this
goal is secondary to that of supporting augmented versions of traditional pointing
tasks. We aim, then, to provide a pen-based pointer capable of interacting with the
surface, above it, and with gestures. This description implies a more flexible version
of Grossman et al.'s "hover widgets" [34], which use button activation (rather than
distance from the display) to distinguish between the gesture and selection modes.
2.2 Blending Reality and Virtuality
Our discussion now centers around objects that provide a dynamic interface to the
digital world, blurring the lines between virtual and physical.
In some systems,
physical objects are tagged for computerized recognition. In others, physical objects
themselves are endowed with the ability to "think" and interact. In general, these
examples belong to the category of ubiquitous computing-the pervasive presence of
computational devices in a user's surroundings-and in moving away from purely localized computer access, they enable interaction paradigms based on constant access
to contextually-aware systems [1].
Much of the pioneering work in this area involved immersive virtual reality systems, where every aspect of user context is known because it is provided, and the
earliest examples were so cumbersome as to severely restrict the user's movement [81].
As computers have become smaller and more powerful, integration with the physical
world is now possible. Rather than immersing a user in a virtual world, mixed-reality
and tangible interfaces provide the user with a world in which virtual and physical are
no longer so disparate. Immersed in this environment, a person can simultaneously
respond to both physical and computer-generated realities. This new model results
in an interface that is not as restricted by its physical form as the devices we have
hitherto discussed. These interfaces have the ability to adapt and change in direct
response to context and experience.
2.2.1 Tangible
Oshita's mapping of a pen to a virtual character (as mentioned above in Section
2.1.2) relates to work on tangible user interfaces (TUI), where physical objects "serve
as both representation and controls for their digital counterparts" [44]. Immediately
following the manipulation of a physical object, the kinesthetic memory of its location decays slowly in comparison to visual memory: multimodal systems employing
haptic feedback (along with vision) leverage the kinesthetic sense of body part location (proprioception) to improve performance in object interaction and collaborative
environment tasks
[37].
As we discussed earlier, using a physical object to directly represent and control
a virtual one can restrict the user to the affordances of that object. To perform other
activities or control other (virtual) objects requires building a new interface. Many
tangible interfaces are application-specific, but the associate-manipulate paradigm
enables constrained tangible systems to be used more generally [92].
The ability
to dynamically bind a physical object, or token, to a specific digital representation
or category creates a mutable syntax for informational manipulation. Constraints
in form and placement options for the object directly convey the grammar through
physical structure. Even simply the shape of the object can be used as the structure
for interaction. In a brilliant example of a distributed TUI, Siftables constitute the
first example of a Sensor Network User Interface [66]. A Siftable is square and rests on
a table with a screen pointing upward. This shape limits connections between devices
to one for each of the four sides of the square. Drawing from the field of tangible
interaction, we seek to create an input device that supports assignment of temporary
handles to virtual objects while maintaining the point/click/draw capabilities of a
normal stylus.

Figure 2-1: Adapted version of Milgram and Kishino's MR continuum [68]. Downloaded
from http://en.wikipedia.org/wiki/Projection_augmented_model. [The original figure is
a diagram spanning the continuum from a real environment to a virtual environment,
with panels for Tangible User Interfaces (TUI), Spatial AR, "see-through" AR (optical or
video), Augmented Reality (AR), Augmented Virtuality (AV), and semi-immersive and
immersive Virtual Reality (VR).]
2.2.2 Mixed-Reality Interfaces
The phrase "augmented reality" describes one subcategory of the Mixed-Reality(MR)
interfaces represented in Figure 2-1. Milgram and Kishino describe MR as a perceptual merging of real and virtual worlds. Augmented Reality(AR) develops when a
live camera feed or an object in the physical environment is augmented with computer graphics. Augmented Virtuality(AV) results when virtual environments are
supplemented with live video from some part of the "real world." [68]
Some of the most prevalent MR systems combine physical and digital workstations. In an expansion of his 1991 work "The DigitalDesk calculator: tangible
manipulation on a desk top display," Pierre Wellner created a full "digital desk" that
supported camera digitization of numbers on physical paper for use on the calculator,
which was projected onto the desk [98]. It is now possible to project images onto objects and canvases of arbitrary shape [10], enhancing the similarity between AR and
TUI and increasing the likelihood that AR interfaces will eventually have a haptic
component. Another method to mix realities uses a mobile phone or PDA instead of
a projector to create a "magic window" into the digital version of a scene [60].
On the other side of the MR continuum, Malik and Laszlo bring the user's hands
onto the computer desktop for a compelling implementation of direct manipulation
[62]. In an extension of this project, Malik et al. enable more flexibility in camera
position by placing a fiduciary marker next to the touchpad surface [63]. These types
of markers, common to MR and tangible systems, are the subject of the next section.
2.3 Fiducials
Tagging physical objects gives them a digital identity. It is well known that laser-scanned barcodes are used for inventory management and price lookup in stores and
warehouses. We also know, from the discussion above, that two-dimensional markers,
called fiduciary markers, are common in MR systems. Indeed, the ARTag architecture
provides markers intended for identification and pose estimation in augmented reality
systems [101]. Markers can also be found enabling tangible interfaces-the reacTIVision table uses "amoeba" tags to detect the position and orientation of the tokens [50]
and the TrackMate project aims to simplify the process of building tangible interfaces
with its circular tags [53].
As markers can be distracting, some tracking systems employ computer vision
techniques such as template matching or volume modeling that can reduce or eliminate the need for tags, but according to Lepetit and Fua, "Even after more than
twenty years of research, practical vision-based 3D tracking systems still rely on fiducials because this remains the only approach that is sufficiently fast, robust, and
accurate." [58].
For this reason, one alternative to purely vision-based approaches
uses markers that, while imperceptible to the human eye, remain visible to cameras
in the infrared band [76]. Another option uses time-multiplexing to project markers
such that they are only visible to a camera synchronized at a certain frequency [36].
Figure 2-2: Examples of 2D Barcodes from http://commons.wikimedia.org. (a) QR Code
on a billboard in Japan; photo by Nicolas Raoul. (b) Data Matrix on an Intel wireless
device; photo by Jon Lund Steffensen.
Another benefit of using markers is the ability to provide unique identity to objects or
actors. Even the best computer vision algorithm would have difficulty distinguishing
between two very similar objects, but an imperceptible marker would make it trivial.
Markers, in one form or another, are here to stay.
With the increasing popularity of mobile devices and smart phones, ubiquitous
computing has become truly ubiquitous. Along with these compact computing platforms, fiduciary markers have left the laboratory and are finding their way into common interface technologies. Two-dimensional barcodes are gaining surface area in and
on magazines, t-shirts, graffiti, consumer devices, shipping labels, and advertisements
(see Figure 2-2). When decoding the Quick Response (QR) code-a two-dimensional
barcode used to encode web addresses-a delay is perfectly acceptable; much of the
current research focuses on decoding from any angle and under varied environmental
conditions [18, 20]. Other research seeks to improve performance by utilizing network
connectivity, sending a compressed image and performing the actual decoding on a
remote server [99]. This approach still falls short of real time performance-a vital
problem for interface development [21].
Indeed, one project that focuses on using
mobile cameras for real-time decoding of a Data Matrix code has met with some
difficulty [6].
In contrast with many of these projects, our approach to using fiducials for interaction switches the position of the camera and the barcode. The BoPen can be
produced inexpensively as a stand-alone device or embedded in a PDA or cell phone,
uniquely identifying users to any system with a camera. Furthermore, mobile devices
are limited in display and processing power. While mobile phones may eventually
be able to recognize fiducials for real-time interaction, personal computers and those
driving large displays are likely to arrive there sooner. What's more, some of the
limitations of using a PDA as a small window-like portal into an augmented reality
can be circumvented by intelligent projection, and a combination of the two can yield
a richer interaction experience [83].
2.4 Bimanual Interaction
One of the great draws of tangible interfaces, mixed reality, and multitouch systems
is the ability to simultaneously use both hands for interaction. In 1997, Hinckley et
al. confirmed the predictions of Guiard's kinematic chain model as applied to laterally asymmetric tasks; for three-dimensional physical manipulation with a tool in
one hand and a target object in the other, the dominant hand was best for fine (tool)
manipulation whereas the non-dominant hand was best for orienting the target [42].
This study was performed with right-handed subjects and demonstrated that the observable effects of asymmetry increase with task difficulty. The results also reflect
our everyday behaviors; people are used to holding a book with one hand, and marking it with the other. In one extension of this research to pen user interfaces, Li et al.
reported, based on a keystroke-level analysis of five mode-switching methods, that
using the non-dominant hand to press a button offers the best performance by a wide
margin, even when the placement of the button is non-ideal
[59].
These findings,
reported in 2005, are not surprising. Watching Alan Kay present a video demonstration (http://www.archive.org/details/AlanKeyD1987) of Sutherland's Sketchpad
[86], one realizes that this bimanual model was used in the interface he implemented
over 40 years ago!
Symmetric action with two mice can be more efficient than asymmetric action,
even when the asymmetry is sympathetic to handedness
[56].
Keeping track of two
mouse cursors and the functions they represent can be difficult. Indeed, given the
option of performing symmetric tasks with touch or with a mouse, participants preferred touch, even though the accuracy of touch decreased rapidly with distance and
resulted in more selection errors [29]. It is possible that applying an adaptive pointing technique to a pen-controlled cursor could provide an all-around better means
of interaction. However, if touch is available as well, it might be best to use both.
Brandl et al. showed that a combination of pen and touch is best for speed, accuracy,
and user preference, in comparison to both the pen/pen and touch/touch alternatives
[13]. This finding lends credibility to the notion that the pen-touch combination feels
more natural while providing a more efficient and accurate interface for certain tasks.
2.5 Conclusion
We have seen that modern developments in input technologies enable extending traditional input devices to take advantage of up to six degrees of freedom. Noting that
tangible and mixed-reality interfaces commonly use objects for 6-DOF interaction, we looked at how these interfaces are enhanced by markers and dynamic
binding, as in the associate-manipulate paradigm or by binding virtual objects to
physical fiducials. Finally, we described the potential benefit of using multiple modes
of interaction, especially when interacting bi-manually.
The BoPen is an inexpensive input device that could bind to virtual objects for
direct 6-DOF manipulation. It could also function as a pointer for precise, localized
input. In both cases, it might be easiest to use a system that employs multiple pens
or modalities. Utilizing asymmetric bi-manual action could facilitate rapid mode and
association switching, enabling an interaction that is both flexible and natural.
Chapter 3
Design Development
Figure 3-1: BoPen: Basic Design. [Schematic labels: LED, diffuser, barcode pattern, lens.]
3.1 BoPen Overview
The BoPen contains an LED, a diffuser, a transparency, and a lens, as seen in Figure
3-1. The optics are aligned such that a tiny pattern on the transparency will be
projected into infinity. To the human eye-or a camera focused at the front of the
pen-the pattern is not visible. Instead all that can be seen (if the LED is in the
visible range of the spectrum, in the case of the human eye) is a small point light
source. However, when a camera focuses at the infinite plane, the pattern on the
mask appears in a circle of confusion, or bokeh (pronounced "bouquet"), around the
light. The circle itself is created by defocus blur, and its size and shape are dependent
on the properties of the camera, most notably its aperture.

Figure 3-2: Optical Models: Pinhole (left) and Lenslet (right). (a) A pinhole placed in
front of a barcode pattern encodes directional rays with the pattern. The camera captures
this information by positioning the sensor out-of-focus. An unbounded magnification is
achieved by increasing the sensor-lens distance, limited only by the signal-to-noise ratio.
(b) A small lenslet placed a focal length away from the pattern creates multiple directional
beams (ray bundles) for each position in the barcode pattern. The camera lens, focused at
infinity, images a magnified version of the barcode pattern on the sensor.
3.1.1 Bokode Optics
We will now briefly introduce the enabling optical technology for this system. For a
more complete description, please refer to [70].
With the BoPen's optical configuration, the information of the barcode patterns
is embedded in the angular and not in the spatial dimension. By placing the camera
out-of-focus, the angular information in the defocus blur can be captured by the
sensor. The pinhole is blurred, but the information encoded in the bokeh is sharply
imaged.
Looking at the pinhole setup in Figure 3-2(a), one can imagine that the barcode
image-as seen by a stationary observer-will be made arbitrarily large by simply moving the lens more out-of-focus. When the lens is one focal length away from the
source image, the rays traced from a single point come through the lens collimated
(parallel). In this case, the lens is focused at infinity and the magnification remains
constant despite changing the observer distance: the size of the observed image is
depth-independent. As relatively little light can enter through a pinhole, and the little
that does is extensively diffracted, the Bokode and BoPen utilize a lenslet in the same
position, as seen in Figure 3-2(b).

Figure 3-3: Data Matrix-Based Pattern Design. A tiled arrangement of Data Matrix
(DM) codes encodes identification and angular information. Each 10 x 10 symbol stores
both its physical position in the overall pattern and a unique byte identification code that
is repeated across all DMs in the Bokode. Image from [70].
The camera images a different part of the pattern depending on its position relative to the pen. The viewable region of the transparency is a function of the angle
formed between the camera position and the BoPen's optical axis. Unlike traditional
barcodes, Bokodes have the ability to give different information to cameras in different
positions/orientations [70]. In the BoPen, this feature is used to provide information
about the angular tilt of the pen with respect to the drawing surface.
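The geometry just described can be summarized compactly. This is a minimal sketch under a thin-lens assumption, with the pattern exactly one lenslet focal length f_b behind the lenslet and the camera (focal length f_c) focused at infinity; the symbols f_b, f_c, x, and x' are introduced here only for illustration. A pattern point at lateral offset x from the lenslet's optical axis leaves the pen as a collimated bundle at angle theta, and the camera maps that bundle to sensor offset x':

    \tan\theta = \frac{x}{f_b}, \qquad x' = f_c \tan\theta = \frac{f_c}{f_b}\, x .

The ratio f_c / f_b thus acts as the magnification of the bokeh image and does not depend on the pen-to-camera distance, consistent with the depth-independent image size noted above. Inverting the first relation is what allows the grid position of the currently visible Data Matrix to be mapped back to a tilt angle.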
3.1.2 Preliminary Pattern
We based our initial pattern on the Data Matrix code, and it is identical to the one
described in [70]. The Data Matrix (DM) is a two-dimensional barcode [45] that uses
a matrix of binary cells to store information. As shown in Figure 3-3, the Bokode uses
a tiled array of 10 x 10 DM codes with one row/column of silent cells between adjacent
codes. One 10 x 10 DM encodes 3 bytes of data and another 5 bytes of Reed-Solomon
error correcting code. This error correcting code ensures data integrity even when
up to 30% of the symbol is damaged; we can also rely on this redundancy to help
disambiguate between overlapping patterns. Since which pattern is visible depends
on the camera angle, the tiled DM design offers up to three independent bytes of
information that can vary with tilt to provide both identification and orientation
information based on how the pen is positioned relative to the camera. Two bytes
provide the x and y positions of the currently visible Data Matrix. The remaining
byte is common across all DM codes in a single pen, providing consistent identification
information that is independent of pen orientation.

Figure 3-4: Comparison of the original Bokode design and the BoPen. (a) Original
Bokode; image from [70]. (b) BoPen construction; this image shows both the first and
third prototypes.
We ordered a printed film mask with these patterns sized at 20 x 20 µm per pixel
from PageWorks company in Cambridge, Massachusetts, USA. Each DM in a 128x128
matrix encodes the same unique identifier, and the layout encodes row and column
information as detailed above. The transparencies were cut by hand and positioned
in the pen as described below.
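As a concrete illustration of how a decoded symbol maps to pen state, the sketch below unpacks the three data bytes and converts the grid position of the visible Data Matrix into an approximate tilt angle. The byte ordering, the 11-cell tile pitch (10 cells plus one silent row or column), and the lenslet focal length are assumptions chosen to be consistent with the parameters stated above, not the exact constants of our implementation.

    #include <cmath>
    #include <cstdint>

    struct PenReading {
        int   penId;      // identification byte, identical across the whole mask
        int   col, row;   // grid position of the currently visible Data Matrix
        float tiltXDeg;   // approximate tilt components, in degrees
        float tiltYDeg;
    };

    // Assumed constants, derived from the pattern description above.
    const float CELL_PITCH_M = 20e-6f;                // 20 x 20 um per pattern cell
    const float TILE_PITCH_M = 11 * CELL_PITCH_M;     // 10x10 DM plus one silent row/column
    const int   GRID_SIZE    = 128;                   // 128 x 128 tiled Data Matrices
    const float LENSLET_F_M  = 5e-3f;                 // example lenslet focal length (assumed)

    // bytes[0..2] are the three data bytes recovered by the Data Matrix decoder.
    PenReading interpretSymbol(const std::uint8_t bytes[3]) {
        PenReading r;
        r.col   = bytes[0];    // assumed ordering: column, row, pen ID
        r.row   = bytes[1];
        r.penId = bytes[2];

        // Offset of the visible tile from the pattern centre, in metres.
        float dx = (r.col - GRID_SIZE / 2) * TILE_PITCH_M;
        float dy = (r.row - GRID_SIZE / 2) * TILE_PITCH_M;

        // A point one focal length behind the lenslet emerges at angle theta
        // with tan(theta) = offset / focal length (see Section 3.1.1).
        r.tiltXDeg = std::atan2(dx, LENSLET_F_M) * 180.0f / 3.14159265f;
        r.tiltYDeg = std::atan2(dy, LENSLET_F_M) * 180.0f / 3.14159265f;
        return r;
    }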
3.2 Building the BoPen
We tested several hardware designs before selecting and iterating upon a final version.
Figure 3-4(a) shows Mohan et al.'s original Bokode, while Figure 3-4(b) shows two of
our pen-based prototypes. Prototype I was approximately 10 mm wide and 5 cm long
with a high-intensity red LED connected to a pushbutton and an external battery
case. It was designed with the software program Rhinoceros® and printed in two
pieces on a Dimension 3D printer. A tap and die enabled us to put threads on the
two pieces and fit them together, once we had glued the patterned transparency to
the back of the tip piece. For this to work, the tip's length had to be exactly the focal
length of the lens.

Figure 3-5: Components of final pen design. (a) SolidWorks drawing for 3D printing.
(b) Laser-cutter patterns for adjustable and modular slices.
After our proof-of-concept experiment (described below in Section 3.3.1), we built
our second prototype. Prototype II (not shown) was similar to the first but scaled
up, utilizing a larger lens to increase the amount of light it emitted. Unfortunately,
the walls were too thin to tap without breaking. This problem-combined with the
difficulty of achieving proper focus with the fixed position of the mask-convinced us
to completely re-design the pen to enable a more flexible option for focusing.
Prototype III was designed in SolidWorks® and also printed on the Dimension
printer (Figure 3-5(a)). Rather than an assembly of two pieces, this version utilizes
laser-cut slices that can be moved around during testing and fixed in place with
additional laser-cut spacers. The slices were cut in a variety of shapes to provide
holders for all the component pieces of a BoPen (Figure 3-5(b)). They were cut from
wood and paper of varying thicknesses to facilitate focusing. Prototype III was the
final redesign, but further iterations based on lens choice are described in Section
4.2.2. Most pens created from this design are approximately 20 mm wide and 6
cm long. This larger pen provided room for a wider pattern, enabling an associated
increase in the theoretical angular range for lenses with a relatively small focal length.
The increased size also resulted in a pen that is more natural to hold. Prototype I
was reminiscent of a tiny disposable pencil, whereas Prototype III felt more like a
short marker.

Figure 3-6: Diagram of arrangement for first test. (a) BoPen. (b) Clear tabletop. (c)
50 mm lens. (d) Camera, with bare sensor exposed.
3.3 Software and System Iteration
With our first prototype, we sought to replicate the off-line results obtained in the
original Bokode experiments, but with a web-cam instead of an expensive digital
SLR. We also wished to create a computational pipeline for identifying the location
of a bokeh in an image and extracting and processing the DM codes. Then, we could
optimize this process and work toward a real-time system. We planned to
do this through an iterative process of evaluations and improvements. This section
describes a few cycles of this process.
3.3.1 Replicating Bokode
For our proof-of-concept, we used a Philips SPC-900NC web-cam with the optics
removed to expose the CCD. A Canon® 50 mm camera lens set at infinite focus
was aligned with the CCD, and a piece of transparent acrylic at 0.5 m served as
our drawing surface. Figure 3-6 shows a rough sketch of the physical layout. We
used our first BoPen prototype as described above, and recorded data at 5 frames
per second and 320x240 resolution. This data was interpreted by software written
in C++ that made use of open-source libraries, specifically Intel's Open Computer
Vision (OpenCV) library and the Data Matrix decoding library (libdmtx).

Figure 3-7: First Software Version, running in "Images Visible" mode. A small red dot
indicates the center of the found circle, and the thumbnail in the bottom right shows the
histogram-normalized image sent to the Data Matrix decoder.
First Version Software
Our initial software was designed to interpret data from a video recording. For each
frame, it first performs filter operations followed by a polar Hough transform to find
circles present in the image (Figure 3-7). A predefined number of pixels (the "window") around the circle center is cropped, normalized, and passed to the DM decoding
library for interpretation. The software reports the number of frames processed, the
number of barcodes found, and the total time taken for processing.
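A minimal sketch of this per-frame loop is shown below. It is written against the present-day OpenCV C++ API and libdmtx rather than the exact libraries and parameters of our 2009 implementation (which appears in the appendices); the window size, blur kernel, and Hough thresholds are placeholder values.

    #include <opencv2/opencv.hpp>
    #include <dmtx.h>
    #include <cstdio>
    #include <string>
    #include <vector>

    // Crop a window around a detected circle centre and try to decode a Data Matrix.
    std::string decodeWindow(const cv::Mat& gray, cv::Point center, int window) {
        cv::Rect roi(center.x - window / 2, center.y - window / 2, window, window);
        roi &= cv::Rect(0, 0, gray.cols, gray.rows);        // clip to the image bounds
        cv::Mat patch;
        cv::equalizeHist(gray(roi), patch);                  // normalize contrast before decoding

        DmtxImage*  img = dmtxImageCreate(patch.data, patch.cols, patch.rows, DmtxPack8bppK);
        DmtxDecode* dec = dmtxDecodeCreate(img, 1);
        DmtxRegion* reg = dmtxRegionFindNext(dec, nullptr);

        std::string result;
        if (reg) {
            DmtxMessage* msg = dmtxDecodeMatrixRegion(dec, reg, DmtxUndefined);
            if (msg) {
                result.assign(reinterpret_cast<char*>(msg->output), msg->outputIdx);
                dmtxMessageDestroy(&msg);
            }
            dmtxRegionDestroy(&reg);
        }
        dmtxDecodeDestroy(&dec);
        dmtxImageDestroy(&img);
        return result;
    }

    void processFrame(const cv::Mat& frame, int window) {
        cv::Mat gray, blurred;
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        cv::GaussianBlur(gray, blurred, cv::Size(9, 9), 2.0);   // filter step before the Hough transform

        std::vector<cv::Vec3f> circles;                          // each entry is (x, y, radius)
        cv::HoughCircles(blurred, circles, cv::HOUGH_GRADIENT,
                         1 /*dp*/, 50 /*minDist*/, 100 /*canny*/, 30 /*votes*/, 5, 100);

        for (const auto& c : circles) {
            std::string code = decodeWindow(gray, cv::Point(cvRound(c[0]), cvRound(c[1])), window);
            if (!code.empty())
                std::printf("decoded: %s\n", code.c_str());      // report the found barcode
        }
    }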
Performance Evaluation
For our first test, we processed 268 frames from a 53-second video clip captured in
the manner described above at five frames per second. Using no scaling and a window
size of 120 pixels, we found and interpreted codes in 22% of frames over the course
of 374 seconds. This was unacceptable, so we tried scaling the images (and window
size) down by a factor of two. This action resulted in the same recognition rate, but
the processing took only 112 seconds (about half of real-time speed). By increasing
the size of the window region, we were able to improve the recognition rate to 47.4%
in only 89.7 seconds.

Figure 3-8: Example of an Image Processing Failure. In this example, background noise
in the red channel resulted in incorrect circle identification and, subsequently, an inordinate
delay in DM decoding.
Unexpectedly, using larger window sizes with the scaled-down lower resolution
images resulted in both a higher recognition rate and a shorter processing time.
This result was observed even when sending the entire scaled-down image to the
DM decoding library, instead of using circle detection to determine which frames
warranted interpretation. In some cases the circle detector incorrectly identified the
salient region, as seen in Figure 3-8. In subsequent versions, we removed the circle
detector entirely.
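As a quick sanity check on these numbers, the timings above correspond to the following effective processing rates, computed from the frame counts and durations reported in this section:

    268 frames / 374 s  ≈ 0.72 frames per second   (no scaling, 120-pixel window)
    268 frames / 112 s  ≈ 2.4 frames per second    (images and window scaled down by two)
    268 frames / 89.7 s ≈ 3.0 frames per second    (scaled down, larger window)

All three rates still fall short of the 5 frames per second at which the clip was recorded, which is what motivated the further pipeline changes described below.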
3.3.2 Diffuser Experiment
In our first design, we were only able to move the device around a small portion of the
table before it was no longer visible to the camera. Our first attempt to improve the
system ambitiously sought to enable interaction across the entire table-increasing the
field of view by imaging onto a diffuser. This enabled us to move the BoPen within
a rectangular region of approximately 33 x 22 cm at a distance of 0.5m-more than
enough space for multiple users to interact. Unfortunately, the diffusers degraded the
image quality to the point where we could no longer decode the Data Matrix (Figure
3-9). We made a number of attempts to use smoothing and other image processing
methods to clean up the image, but were unsuccessful in decoding the DM with this
configuration. Though the implementation fell short of our goals, it provided us with a
more in-depth understanding of the optical limitations. In the future, we might try
to design a pattern that projects more clearly onto a diffuser or, alternately, use an
optical taper to scale down without degrading image quality.

Figure 3-9: Diffuser-Based FOV Enhancement. Imaging onto a diffuser, even one designed
for image projection, causes significant artifacts from the grain of the material.

Figure 3-10: Mock-up of Diffuser Imaging Setup. (a) BoPen. (b) Clear tabletop. (c)
50 mm lens. (d) Diffuser. (e) Camera, with lens focused onto the diffuser.
3.3.3 Live Tracker
After our unsuccessful diffuser experiment, we decided to focus on getting the system to operate more quickly. Though it is less likely we could develop a successful
multi-user system with only a portion of the table as input, the popularity of the
mouse and the pen tablet attests to the great number of interesting single-user applications we could explore. The refined system now described is based on a similar
optical configuration to the first version. However, instead of reading images from a
prerecorded file, it processes live video frame-by-frame.
Figure 3-11: Software Pipeline: Version Two. A distinguishing feature of this real-time tracker is that it runs the Data Matrix decoder in a separate thread. This choice
means that although the blob centroid detection is reported in real time, the decoded
ID numbers and angle positions are at least a few frames behind.
Second Software Version
Since DM decoding is essential to 3 of the 6 degrees of freedom we planned to support,
we went back to imaging through a 50mm camera lens onto a bare sensor. This time,
we chose a camera that had the capability of a higher frame rate-an older Point Grey®
Dragonfly™ with better resolution than the Philips web-cam, but still well within
the range of "commodity cameras." The choice to use a bare sensor resulted in a 4-to-5-fold reduction in our field of view and a corresponding increase in the resolution.
As we expected, this enabled us to again decode Data Matrices. We also significantly
altered our pipeline, as shown in Figure 3-11 and described below:
A frame captured from the camera is Bayer-encoded grayscale, so it must first be
converted to an RGB image. Next the image is scaled down by a factor of four and
split into component colors. Due to the observed irrelevance of the circle detection
during our initial experiments, we decided to omit it from this version of the pipeline. Instead, the red and blue images are thresholded to create a binary image in which we find blobs corresponding to bokehs, calculating their centroids.
(a) System detecting a single centroid (green "X") and interpreted barcode data (white "One") from a recent frame. When this image was captured, the label indicating successful DM decoding was unwavering. Note the small, monochrome, edge-detected blob image in the bottom left.
(b) System detecting two centroids and decoding one of the two patterns. This kind of split was often observed.
Figure 3-12: Multiple Simultaneous Detection and Disambiguation.
The regions in the green channel corresponding to discovered blobs in the thresholded image are copied, normalized, smoothed, and sent to a separate thread which
runs the DM decoder. Up to ten of these threads (a configurable limit) can be active at a time,
and frames are dropped (ignored) until there is space in the queue. Because the system is threaded, the blob detection portion of the program continues to run, tracking
centroid location. When a DM has been decoded in this separate thread, the decoding thread is joined with the main thread (usually 2-5 frames later) and the identifier
is displayed.
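To make the threading scheme concrete, the following C# fragment is a minimal sketch of the dispatch logic described above, not the pipeline code itself; the Blob and DmResult types, the decode delegate, and the small demo in Main are illustrative stand-ins for our actual data structures.

    using System;
    using System.Collections.Generic;
    using System.Threading;
    using System.Threading.Tasks;

    // Illustrative stand-ins for the pipeline's real data structures.
    record Blob(double CentroidX, double CentroidY, byte[] GreenRegion);
    record DmResult(int Id, int Row, int Col);

    class BokehTracker
    {
        const int MaxDecodeThreads = 10;   // decode requests beyond this limit are dropped
        int _activeDecodes = 0;

        // Called once per captured frame with the blobs found in the thresholded
        // red/blue channels. Centroids are reported immediately; the slower Data
        // Matrix decode runs on a worker thread and reports a few frames later.
        public void OnFrame(IEnumerable<Blob> blobs, Func<byte[], DmResult> decodeDm)
        {
            foreach (var blob in blobs)
            {
                ReportCentroid(blob.CentroidX, blob.CentroidY);    // real-time path

                if (Interlocked.Increment(ref _activeDecodes) > MaxDecodeThreads)
                {
                    Interlocked.Decrement(ref _activeDecodes);     // queue full: skip this decode
                    continue;
                }
                Task.Run(() =>
                {
                    try
                    {
                        var result = decodeDm(blob.GreenRegion);   // normalized, smoothed green-channel crop
                        if (result != null) ReportDecoded(result);
                    }
                    finally { Interlocked.Decrement(ref _activeDecodes); }
                });
            }
        }

        void ReportCentroid(double x, double y) => Console.WriteLine($"centroid ({x:F1}, {y:F1})");
        void ReportDecoded(DmResult r) => Console.WriteLine($"decoded ID {r.Id} at row {r.Row}, col {r.Col}");
    }

    class Program
    {
        static void Main()
        {
            var tracker = new BokehTracker();
            tracker.OnFrame(new[] { new Blob(320, 240, new byte[64]) },
                            region => new DmResult(1, 8, 8));      // fake decoder for the demo
            Thread.Sleep(100);                                     // let the worker task report
        }
    }

In the real pipeline the decode delegate would wrap whichever Data Matrix library is in use, and the worker's result is folded back into the main thread a few frames later, as noted above.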
Performance Evaluation
The first live system performed very satisfactorily. It could consistently detect:
* How many pens were in the field of view
* The approximate coordinates of each pen
" Each pen's unique identifier
Examples of system output can be seen in Figures 3-12(a) and 3-12(b).
The
system was never observed to confuse one device for the other, and it was observed on
multiple occasions to decode two barcodes simultaneously, from images taken from
the same or directly adjacent frames. However, the performance was inconsistent
between two devices: In a sample of approximately 200 frames where the pens were
at rest, the code in the device labeled "One" was available in 58% of frames. For the
device labeled "Two," that number was substantially less: 13%. This is most likely
the result of device-specific focus or dust/smudging occlusion issues.
3.3.4 Lessons Learned
There were three main weaknesses in this version of the BoPen, which prevented us
from making it into an interface device:
* No detection of roll orientation, and no use of the decoded tilt data. Both
these weaknesses are a matter of software implementation; roll angle can be
calculated based on the orientation of the DM, and the tilt can be calculated
from the decoded row and column information in the Bokode.
* No model for changing position over time. Perhaps the greatest limitation is
that this system has no model for changing position over time. We separately
detect bokeh position and identification, but there is nothing to associate the
two. For this reason, we chose only to use the identification functionality of the
Bokode, and not its tilt-detection capability.
* Lag between position detection and DM decoding. Even if we did model a device
over time, the tilt and roll information would lag a second or two behind the
position data. This prohibits the system from producing tilt and roll data at
a rate adequate for the quick response times required by a direct-manipulation
interface.
Another limitation comes in the form of the light-based noise to which all optical
systems are susceptible: additional emitters in the scene would reduce image contrast.
A related difficulty arises when attempting to use this system with a projection surface-based display; the diffuser obscures the signal, so we cannot use this pen on a back-projected display. One alternative suggested by Han is illumination using infrared light, as it is less occluded by LCDs [39]. A more promising alternative, a collaboration with Microsoft Research, is explored in the next section. The SecondLight system uses
a variable diffusivity surface that is constantly oscillating, enabling the SecondLight
team to display both on and above the surface [46], and enabling the image from the
BoPen to be captured from beneath.
Chapter 4
Final Implementation
As we saw in Section 2, pens for both screenless tablets and interactive displays
enjoy some popularity as computer input devices. The BoPen differs from most other
digital pens in that it provides a greater number of degrees of freedom. An additional
unique feature of the BoPen is that very fine tilting motions result in large and rapid
shifts in the transmitted barcode. Feedback from these tilting motions is somewhat
limited; though the user has a kinesthetic sense of his or her hand position, there is
little haptic information specific to the pen's orientation. According to Balakrishnan
and Hinckley's appraisal of Guiard's Kinematic Chain model, visual feedback can
compensate for lack of kinesthetic feedback, but not vice-versa [5]. This means that
the visual feedback afforded by a concurrent display could be invaluable in enabling
users to derive benefit from the fine movement capabilities of the BoPen.
Projecting an image onto an interactive surface from above (front-projection) requires an opaque or semi-opaque material onto which the image can be focused,
reflected, and later absorbed by the viewer's eye [23]. Projecting from below (rear-projection) requires a translucent diffuser to scatter incident light, resulting in the same effect but without shadows. However, in both these cases, with a BoPen pointed downward it is impossible to recover the image using a camera on the other side of the scattering material. Indeed, when projector-based augmented reality and mixed-reality systems need a transparent object, they simulate transparency by front-projecting an image of what is behind the object [10].
Through luck, effort, and the good graces of Shahram Izadi, Steve Hodges, and
the rest of the Computer-Mediated Living group at Microsoft Research Ltd. in Cambridge, UK, we were able to work on integrating our BoPen with their SecondLight
system. With the affordances of a display through which we can focus a camera, we
sought to enable a tablet-in-picture pen interface on this horizontal display surface.
The goal was and remains to find a method of optimally using this setup to provide
visual feedback that will enhance the user's kinesthetic sense of pen orientation and
motion.
4.1 SecondLight
The Microsoft SecondLight system (Figure 4-1(a)) is a rear-projection multi-touch
surface with a twist; by leveraging a diffuser that can switch to clear, SecondLight
supports projecting and capturing above the surface as well as on it (for a full account, refer to Izadi et al. [46]).
The diffuser is a polymer stabilized cholesteric textured
liquid crystal (PSCT) driven by a 150V signal at 60 Hz. When voltage is applied
across two planes (in alternating polarities), the material becomes clear. When the
voltage is removed, or when the power is turned off, the material is diffuse. Two 60Hz
projectors with alternating shutters, one synchronized to the clear part of the cycle and one to the diffuse part, form two separate images at an effective frame rate of 120Hz, fast
enough to be perceived as continuous by the human eye. While one projector displays
onto the tabletop surface, the other creates an image on any diffusive material held
above the table, and both images appear to be present at the same time.
As with most FTIR multitouch displays, pressing a paper-printed barcode up
against the screen frustrates the light, making the barcode visible. The SecondLight
can also theoretically image a barcode once it moves away from the surface, during
the part of the cycle where the PSCT is clear. However, a printed barcode becomes
smaller and smaller as it is moved away, and the camera needs to be refocused. By
using a Bokode-based device instead, the image remains the same size and the camera
can stay out-of-focus at infinity. Using the BoPen with this setup also provides a
unique opportunity to explore the intersection of pen and multitouch interfaces. This combination has only been explored a few times before, and never with a 6-DOF pen or a display surface capable of projecting multiple layers.
Figure 4-1: SecondLight modifications. (a) The SecondLight rig. (b) With modifications. We placed the camera in a position that would avoid blocking the projectors whilst enabling a "tablet" size of 100mm by 150mm (smart camera shown).
4.2 Optical Considerations
The SecondLight multi-display touch-sensitive screen is 307mm (height) by 406mm
(width). To enable a 100mmx150mm pen interaction area, we placed a camera 340mm
from the screen with a lens of focal length 16mm. The position of the camera is
optimized to provide a small tablet surface to right-handed users. Figure 4-1(b) shows
the rig with our modifications. For our tests, we did not leverage the multi-display
capabilities of the SecondLight, so only one of the projectors is used.
Figure 4-2: Patterns we printed: (a) Data Matrix, (b) spaced Data Matrix, (c) VideoMouse + Surface tag hybrid, (d) ARTags, (e) TrackMate.
Figure 4-3: Alternate pattern design (f).
4.2.1 Pattern Choices
In Figure 4-2 we show a number of different patterns that we tested for use as the tilt- and roll-identifying patterns. The original Bokode pattern consists of many closely tiled (a) Data Matrices, but for better compatibility with the available decoding
software, we printed a similar pattern with slightly (b) spaced-apart DM tiles. Some
other patterns we tried are (c) a hybrid of VideoMouse [43] and Microsoft Surface tags,
(d) ARTag fiducials [101], and (e) TrackMate [53] markers. Figure 4-3 shows another
pattern that we devised for a template-matching approach (f) but never printed. The
designs of patterns (c) and (f) are meant to provide a motion-blur resistant marker
that might one day also take advantage of optical flow. Patterns (a),(b),(d), and (e)
were designed to be most compatible with existing techniques and decoding software.
We will discuss the differences between these patterns alongside brief evaluations of
their performance in Section 5.1.
4.2.2 Lenses
We used the simple code in Appendix A to probe the design space based on the
dimensions of our table computing setup (as described in Section 4.2). With the table dimensions, an understanding of the camera lens geometry, and the CCD resolution and pixel size, we determined a good rule of thumb: for a pattern with a side length of X μm, it is best to have a lens with a focal length of X/10 mm. For 10x10 DM codes, this translates to: an X mm BoPen lens requires patterns with X μm features.
An additional consideration is that of angular range: the smaller the focal length,
the less of the pattern that is traversed by a slight tilt of the pen. This smaller focal
length translates to a larger angular range for a fixed pattern area. A larger focal
length can be used with larger patterns, but at the cost of increasing both motion blur
and the likelihood of running off the end of the pattern during tilt and translation.
These requirements must be balanced against each other, but as we will discuss in
Section 5.1.1, they must also be balanced against the constraints of the film mask
production process; we cannot make masks that clearly depict barcodes smaller than
around 100 μm on a side.
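As a back-of-the-envelope companion to the Appendix A script (this is a sketch, not that script), the C# fragment below applies the rule of thumb above; the 16mm camera lens comes from Section 4.2, while the 7.4 μm pixel pitch and the 5mm usable mask width are assumed values used only for illustration.

    using System;

    class LensDesign
    {
        static void Main()
        {
            const double camFocalMm = 16.0;     // camera lens from Section 4.2
            const double pixelPitchUm = 7.4;    // assumed CCD pixel pitch (not from the text)

            // Rule of thumb: a 10x10 DM with X um features wants an X mm pen lens.
            double featureUm = 10.0;
            double penFocalMm = featureUm;      // X um features -> X mm focal length
            double tagSideUm = 10 * featureUm;  // side length of a 10x10 code

            // Approximate CCD pixels spanned by one pattern feature for an
            // infinity-focused Bokode, whose magnification is camFocal / penFocal.
            double pixelsPerFeature = (camFocalMm / penFocalMm) * (featureUm / pixelPitchUm);

            // Rough angular range for an assumed usable mask width.
            double maskWidthMm = 5.0;
            double rangeDeg = 2 * Math.Atan(maskWidthMm / (2 * penFocalMm)) * 180 / Math.PI;

            Console.WriteLine($"tag side {tagSideUm} um, pen focal length {penFocalMm} mm");
            Console.WriteLine($"~{pixelsPerFeature:F2} CCD pixels per pattern feature");
            Console.WriteLine($"~{rangeDeg:F0} deg angular range for a {maskWidthMm} mm usable mask");
        }
    }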
The optical calculation software informed us that the smallest usable focal length
would be around 4.4mm (with an angular range of approximately 58°), and the largest would be around 22mm (with an angular range of approximately 17°). With
these considerations in mind, we designed and constructed six final variants of our
prototype based on lenses within this range as follows:
Table 4.1: Lens Properties for Each Pen.

Pen   Focal Length   Diameter   Type
#1    8.8mm          19.7mm     Aspheric
#2    17.5mm         19.7mm     Aspheric Condenser
#3    12.5mm         10mm       Planoconvex
#4    4.65mm         7.8mm      Aspheric Condenser
#5    11mm           7.2mm      Aspheric
#6    8mm            6.3mm      Planoconvex

4.3 The Pens
With the exception of Pen #5, we built one pen for each of the lenses listed in Table
4.1. These pens were almost identical to Prototype III described in Section 3.2, and they can be seen next to each other in Figure 4-4.
Figure 4-4: A Multiplicity of Pens. Far right: Prototype III. Top row: P1, P2, P3. Bottom row: P4, P5, P6.
The pens were again designed in
SolidWorks and printed on a Dimension 3D printer. This time, however, the slices
were cut from 2mm and 0.5mm acrylic, as well as from paper. All the BoPens built
included at least the components ordered as shown in Figure 4-5: on the back is
an infrared LED that fits into a reflecting chamber constructed of three adjacent
slices with foil tape inside covering the cylindrical wall. The exit from this chamber
goes through a diffusing material (tracing paper) before finally back-illuminating the
pattern. The pattern is spaced by slices of varying depths in order to ensure that it
is exactly one focal length from the lens and will be projected into infinity.
Figure 4-6 shows the different pattern slices we tested in each pen. We had software
sufficient for preliminary evaluation of all of these designs, and in most cases we tested
designs with a paper printout mock-up before building a pen with the corresponding
film mask. Based on these preliminary tests, described in the next chapter, we decided
to implement our software for pens that use the spaced DM codes.
Figure 4-5: Pen: exploded view. From Top: LED and spacer, mirror chamber,
diffuser, pattern, focusing spacer. The lens (not shown) is at the tip of the pen.
Figure 4-6: Pen slices containing patterns on film. Clockwise, from top left: Data
Matrix (and spaced DM), TrackMate, MS Hybrid (in two sizes), and ARTags.
4.4 Hardware and Software
In our implementation we explored two separate configurations. One uses a smart
camera to do the image processing, sending numerical values to a computer for analysis and display. The other uses a camera attached directly to the computer, and
the "Vision Server" software that runs the SecondLight image processing pipeline.
The speed of the two methods is comparable if GPU shader code is used for the
vision server-based solution. However, as it was not within the scope of this project
to implement decoding on a GPU, the smart camera approach is faster. A benefit
to the vision server approach, though, is that it already includes code for connected
component analysis and simplifies the process of masking out the BoPen signal when
calculating multitouch. For this disambiguation functionality, the smart camera approach requires a later step, leaving the system more susceptible to asynchronous
matching problems.
4.4.1 Smart Camera Approach
The VC4038, a smart camera made by the German company Vision Components
(VC), has an on-board DSP running at 3200 MIPS, 100Mbit Ethernet, and can capture video at a rate of 63 fps (or 126fps with 2x binning) [31].
An infrared-pass
(850nm) filter on the front of the 16mm lens serves to eliminate visible light, ensuring
high contrast for our infrared bokeh signal. Though we are currently using a single
camera at 640x480 resolution, we opted for an extensible client-server architecture
that simplifies the process of using additional cameras to extend the usable surface
size. Each camera simply watches and reports what it sees, whilst the server keeps
track of all devices using multiple instances of a class to represent different BoPens.
This method effectively creates a layer of abstraction between the image processing
and the interface API.
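The C# sketch below illustrates that layer of abstraction; it is not the Tracker.cs or BoPen.cs code from the appendices, and the class and member names are hypothetical.

    using System;
    using System.Collections.Generic;

    class BoPenState
    {
        public int Id;                      // decoded Bokode identifier
        public double X, Y, Z;              // position reported by a camera client
        public double Roll, TiltX, TiltY;   // orientation, when a DM has been decoded
        public DateTime LastSeen;
    }

    class PenRegistry
    {
        readonly Dictionary<int, BoPenState> _pens = new Dictionary<int, BoPenState>();

        // Each camera client just reports what it sees; the server keeps one object
        // per pen, so interface code never touches image-processing details.
        public BoPenState Update(int id, double x, double y, double z)
        {
            if (!_pens.TryGetValue(id, out var pen))
                _pens[id] = pen = new BoPenState { Id = id };
            pen.X = x; pen.Y = y; pen.Z = z;
            pen.LastSeen = DateTime.UtcNow;
            return pen;
        }

        public IEnumerable<BoPenState> ActivePens(TimeSpan maxAge)
        {
            foreach (var p in _pens.Values)
                if (DateTime.UtcNow - p.LastSeen < maxAge) yield return p;
        }
    }

    class RegistryDemo
    {
        static void Main()
        {
            var registry = new PenRegistry();
            registry.Update(id: 1, x: 120, y: 80, z: 10);
            foreach (var p in registry.ActivePens(TimeSpan.FromSeconds(1)))
                Console.WriteLine($"pen {p.Id} at ({p.X}, {p.Y}, {p.Z})");
        }
    }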
Client Side
We borrowed a camera from FS Systems LLP and, using the VC tutorial code as a
guide, developed our client-side software. The code running on the camera (reproduced in Appendix B) utilizes a telnet connection to the camera for configuration, and
it displays debug information on a VGA monitor connected directly to the camera.
The camera uses its Ethernet port to send UDP packets with the following information at a maximum rate of about 50fps: Bokeh X, Bokeh Y, Bokeh Width, Bokeh
Height. When decoding DM codes in the spaced configuration shown in Figure 4-2b,
the tracking rate is limited to between 20 and 30 fps. When the camera decodes
a DM, it additionally outputs the following data: Raw Tilt ρ, Raw Tilt φ, BoPen Identity, BoPen Rotation θ, and (currently disabled but in support of a more accurate tilt calculation in a future version) DM Location relative to bokeh.
The smart camera is synchronized on an optically isolated pulse from the circuit
which drives the SecondLight's switchable diffuser, so the camera only captures images
when the surface is transparent. By leveraging the image processing and DM decoding
libraries provided by Vision Components, we were able to do blob detection and
decoding in 5.5ms and 19ms, respectively. Threading was not required, so the pipeline
of our main loop is very simple:
1. Wait for trigger and capture image (opt. Draw: captured image)
2. Do thresholding and detect biggest blob (Draw: frame around blob region)
3. Search for Data Matrix within the blob region (Draw: square around found
DM)
4. If decoded, extract tilt, roll, and ID parameters (Draw: text with decoded
information)
5. Send UDP packet with all available information for this frame
6. Handle key press for changing options, etc.
Server-Side
A UDP server written in C# receives the packets sent by the camera and performs the additional calculations necessary to give physical meaning to the client's output parameters, including tilt calculation based on a hard-coded and empirically determined
"center marker" of the BoPen's pattern array. The code is reproduced in full in Appendix C, and the main window view is visible in Figure 4-7. This software has
a button to test the UDP server, a text box to display any received packets with
decoded DM data, and an image plot on which all the information available at a given moment is drawn. The program has the capability to dynamically adjust the maximum and minimum values for coordinates based on received packets in order to automatically calibrate the distance that the BoPen traverses as it moves across the tablet space. This means that one can simply use the system, and it will train itself to the information in the UDP packets, scaling and shifting the displayed measurements as needed.
Figure 4-7: BoPen Debug Display. The program receives UDP packets sent by the smart camera and keeps track of maximum, minimum, and current values for all position and orientation information, displaying whatever is available. The wedge in the blue circle indicates the roll angle and the black lines indicate pen tilt. The size of the blue circle indicates distance from the surface, with larger dots being closer. Not shown: a message indicates when the pen is on the surface, based on a simple Z threshold.
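A minimal C# sketch of this receive-and-autocalibrate loop follows; it is not the Appendix C code. The port number and the comma-separated ASCII payload are assumptions made purely for illustration, and only the X/Y fields are handled.

    using System;
    using System.Globalization;
    using System.Net;
    using System.Net.Sockets;

    class BoPenUdpSketch
    {
        // Running min/max used to auto-calibrate raw camera coordinates into a 0..1 tablet space.
        static double _minX = double.MaxValue, _maxX = double.MinValue;
        static double _minY = double.MaxValue, _maxY = double.MinValue;

        static void Main()
        {
            using var udp = new UdpClient(5005);            // port number is an assumption
            var any = new IPEndPoint(IPAddress.Any, 0);
            while (true)
            {
                byte[] datagram = udp.Receive(ref any);
                // Assumed wire format: "x,y,w,h[,tiltRho,tiltPhi,id,roll]" as ASCII text.
                string[] f = System.Text.Encoding.ASCII.GetString(datagram).Split(',');
                double x = double.Parse(f[0], CultureInfo.InvariantCulture);
                double y = double.Parse(f[1], CultureInfo.InvariantCulture);

                _minX = Math.Min(_minX, x); _maxX = Math.Max(_maxX, x);
                _minY = Math.Min(_minY, y); _maxY = Math.Max(_maxY, y);

                double nx = _maxX > _minX ? (x - _minX) / (_maxX - _minX) : 0.5;
                double ny = _maxY > _minY ? (y - _minY) / (_maxY - _minY) : 0.5;
                Console.WriteLine($"pen at ({nx:F2}, {ny:F2}) of the calibrated tablet range");
            }
        }
    }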
4.4.2 Vision Server Approach
The benefits of using the SecondLight's vision server for BoPen processing are twofold.
Firstly, the information about the location of the detected bokeh is immediately available to the multitouch detection routines, which can be used to prevent the connected
components analysis from incorrectly identifying the BoPen as a finger. Secondly, the
vision server integrates GPU shader calls to provide parallel processing of images and
other matrix-like data structures. The disadvantages include the difficulty of multi-camera integration and the relative complexity of the code base.
Since the image server is necessary for any multitouch applications on the SecondLight, even the smart camera-based approach requires some slight modification
of vision server code to integrate the UDP server and pen orientation calculation
described in the previous section. Fortunately, our UDP server code is highly modular, with Tracker.cs (Appendix C.3) and BoPen.cs (Appendix C.4) providing most
of the necessary functionality for interpreting data from the smart camera and calculating pen pose, respectively. In the next chapter we evaluate only the smart camera
approach, as we lack data on the performance of the other method.
4.5 Interaction Vision
Only very recently have multitouch and pen interfaces been combined on a single
display surface [13, 57], and the techniques described so far do not support tilt and roll
sensing with the pen. Creating an additional platform for exploration of simultaneous
touch and pen interaction could generate a multiplicity of new interaction techniques
and applications. Furthermore, the SecondLight provides an opportunity to utilize
multi-touch and multi-display technologies on the same device. As the combination
of touch and pen is faster and less error-prone than either touch-and-touch or pen-and-pen combinations [13], we shall work to design a system that makes optimal use of a
user's innate capacity for bi-manual interaction. In the next chapter, we evaluate our
prototypes based on both their technical performance and their interaction potential.
Chapter 5
Evaluation
In this chapter, we report our observations on overall system performance of different
tags in both paper-printed and BoPen-installed configurations and comment on the
differences between our BoPen prototypes in image quality, angular range, and contrast variation. Further, we describe a few minor changes to our design, how these
changes have enhanced performance, and the limitations that result. We close with
an inquiry into the suitability of this pen as an interface device.
5.1 Pen Design Verification
The first table below (Table 5.1) shows a size comparison for all patterns as projected
from each pen, illustrating how some patterns are simply too small or too large after
magnification to be used in a given optical configuration. Farther down, Table 5.3
illustrates the additional constraints imposed by the distance between BoPen and
camera in our system; this distance, along with the focal lengths and apertures, dictates the size of the bokeh. In many cases, the pattern cannot be decoded since
the bokeh is too small to contain even a single complete marker.
Table 5.1: Pattern size variations due to lens and pattern differences. [Image table: one column per pen (Pen 1, 8.8mm; Pen 2, 17.5mm; Pen 3, 12.5mm; Pen 4, 4.65mm; Pen 5, 11mm; Pen 6, 8mm) and one row per pattern (Trackmate, ARTags, Small MS/V, Big MS/V, DataMatrix, SpacedDM5, SpacedDM10), each cell showing that pattern as projected by that pen.]
5.1.1 Pattern Comparison
Each pattern has unique requirements and limitations. TrackMate and ARTags were
developed to facilitate mixed-reality applications, and they each have their respective
toolkits. Microsoft Tags are proprietary and integrated into the MS Surface SDK.
Data Matrix tags can take some time to decode, but are simple and common enough
that one might quickly conceive of a solution to this problem. All patterns described
below were tested as paper mock-ups (with the camera focused on the surface), and
some were also tested (as infinity-projected patterns) in situ on the SecondLight unit.
TrackMate
The TrackMate Tracker software uses an Open Sound Control (OSC) server to dispatch packets with tag identification and location data. The program also provides
on-demand visual feedback to facilitate the configuration process. Users can perform
camera calibration or background subtraction on-line, and the software provides immediate confirmation of the settings change. In addition to providing an inspired
WYSIWYG (What You See Is What You Get) configuration process that conveys
useful information about intermediary image processing steps, configuration is automatically loaded and can be saved with a single key press.
According to an informal conversation with the creator of the package, resolving TrackMate tags requires at least 60 pixels to an edge, meaning that, even for our longest focal-length pen, we would ideally want to print at (17.5 mm x 10 μm/mm) / 60 px ≈ 2.9 μm/px. The theoretical printing capability of the film masks is around 5 μm/px, so the patterns we printed were slightly too large and could not be resolved in any of the bokehs produced by our pens, not even pen #2, which produces the smallest
magnification. This outcome was unfortunate, as our paper experiments revealed
that the TrackMate Tracker is fairly robust to variations in pattern size. Writing our
own software to generate the tags might enable us to order smaller film masks, but a
pattern re-design would be required to fill the entire mask.
ARTags
ARToolkit+ is a popular software package for mixed-reality research, and its successor
Studierstube preserves the functionality of the ARTags with almost identical code and
configuration options. This means that for people well-versed in the ARToolkit, these
tags should be rather easy to use. The configuration files are written in XML and
we had little trouble finding examples, but we had difficulty creating a configuration
file for all 255 tags. Furthermore, the tags are slightly larger than 10x10 data matrix
codes, so at the two mask sizes we tried, they were either too large to fit in the bokeh
or too small to be printed without smudging.
The slightly smudged codes were clear enough in pens #1 and #6 to be interpreted
some of the time. However, they were not visible in these pens for a large enough
proportion of the time, and we were unable to get the toolkit to recognize them
consistently. In pens #5 and #3, the patterns were more consistently visible, but the
toolkit still did not recognize them.
We conclude that ARTags need further experimentation; they may become a viable
solution if printed slightly larger and used in a pen with a focal length between 11
and 13 millimeters. However, it is important to consider their limitation: ARTags
are recognized as templates in a similar fashion to the unimplemented approach we
mentioned in Section 4.2.1, so a custom pattern-matching solution may be a better
alternative.
VideoMouse/Microsoft Tag Hybrid
Microsoft Surface tags are designed to be placed on the bottom of coffee mugs and
other objects that will be sitting on top of the illuminated multitouch Microsoft
Surface. There are both high- and low-density versions. We used only the low-density tags, which can be micro-printed at about the same size as our data matrix
codes. This fact means that low-density Surface tags can theoretically be used with
all the same pens and lenses as the DM tags. Indeed, the larger masks made from
this pattern were well sized for pen #3, and the smaller versions were visible in pens
#1 and #5.
For our application, however, these tags had a number of disadvantages. They are
designed to only encode a single byte, so the number of distinct tags is limited to 256.
We divided the single byte into two sets of four bits each, encoding row and column
addresses for a 16x16 array of locations and opting not to encode unique identifiers
for each device. Relative to DM tags, this encoding scheme reduces our ability to
uniquely identify locations, which we planned to counter using a pattern of dots for
dead-reckoning: optical flow data would supplement the discrete locations derived
from the decoded byte tags with information about relative motion. In practice, the
dots were far more likely to be seen in the bokeh than the tags. Only in pen #3 did
we observe consistently visible Surface tags, but pen #3 had other problems, which
we will describe in Section 5.1.3.
In an iteration of our hybrid pattern we would recommend making the dots smaller
to increase the noticeable difference between the two features. We would also bring
the Surface tags closer together to ensure that they are more frequently visible. Given
the tilt performance of the pens (Section 5.2.1), it also makes sense to abandon the
absolute location markers in the outermost regions of the film mask, favoring a more
concentrated region of tags in the center.
As a final note, the software for reading these tags is written entirely in GPU
shader code for speed of execution. It is also extremely well optimized for a specialized
optical setup-different from the one we were using. These factors made it relatively
difficult for us to reliably determine any more than just roll information, despite
repeated optical and software reconfiguration attempts. Though we did extensively
test these tags in our paper-based setup, we had neither the time nor the expertise
necessary to properly modify the shader code to work with our setup.
Data Matrix Codes
As there are a number of Data Matrix decoding packages available freely and for
purchase, and as we were relatively familiar with the DM-based mask designs, we
chose to concentrate primarily on these patterns for our final implementation.
Figure 5-1: Spaced Data Matrix Codes (shown at 5px, 7px, and 9px spacings). Pixel-spaced versions of the data matrix patterns allowed us to use the smart camera's rapid decoding library.
In our initial explorations, we used the open-source package LibDMTX, which provided
fairly robust detection of the very tightly packed barcodes in the original Bokode
design. However, the algorithms required a fairly long decoding time, upwards of two
seconds in some cases. The library for the VC smart-camera was much faster, but
it can currently only recognize DM codes with a blank "comfort zone" around the
edges: the original Bokode design is too tightly packed for this library. We evaluated
the real-time performance of the system across 1000 frames with codes of five, seven,
and nine-pixel spacings (Seen in Figure 5-1) in our paper demo setup, and the results
were as follows:
Table 5.2: Paper Test Recognition Rates

Pattern Spacing   Stationary   Moving
Five Pixels       68.3%        12.1%
Seven Pixels      89.2%        30.7%
Nine Pixels       93.6%        29.4%
We believe that the slightly higher recognition rate of the seven-pixel-spaced codes while moving indicates that the choice of seven provides a good tradeoff: it creates enough of a comfort zone around each tag whilst improving the likelihood of seeing a barcode
within the decoding region at a given time. Depending on the optics, it could be
fruitful to explore six- and eight-pixel spacings as well.
The 5 μm feature-sized codes we ordered were unreadable due to the nature of the film on which our masks are produced. The material is dipped into a chemical bath, and portions of it exposed to light become dark. However, the grains of light-reactive material are larger than the intended pixel size, resulting in the overexposure of some regions that are meant to be clear. This is visible in the images of 5 μm DM codes projected by pens #1, #4, and #6. For the 10 μm-feature spaced DM codes, there is also some smudging visible in the images from pens #1 and #6, but it is
far less pronounced. Though the images produced by pens #1 and #6 were of an
acceptable size for the codes to fit within the bokeh, we again found that they were
not consistently visible during motion.
Pens #3 and #5 were the best for decoding. After slight refocusing, we were able
to track pen #5 at 30 frames per second with approximately 10% DM recognition
rate. These numbers come with some caveats, which we will discuss further in Section
5.2.1.
5.1.2 Optical Idiosyncrasies
We observed aberrations such as coma and astigmatism in projections from all our lenses.
After a certain degree of tilt, the focal length changes, causing a blur and rendering
the DM codes unreadable. In some of our lenses, most noticeably #2 and #5 (both
aspheric), pincushion distortion stretched and deformed the outermost edges of the
image, effectively reducing our decoding region to a fraction of the actual bokeh size.
One possible solution would be to geometrically transform the image, based on an
earlier calibration of a test pattern projected by the same lens.
Pen #1's lens displayed a small amount of barrel distortion, but it also created a
strangely shaped bokeh (not visible in our image table) that complicated the process
of determining its location relative to the surface. Though the focal length of the
lens used in pen #1 was only 8.8mm, its diameter was 19.7mm. Most of this area
was not the lens itself, but an extra ring-shaped region of material surrounding the
lens. This region permitted excess infrared light to increase the size of the bokeh and
decrease the image contrast. We designed a smaller channel between lens and mask
to limit the transmission of excess light, but we still observed some degradation in
image quality.
Both our condenser lens-based pens (#2 and #4) produced lower-contrast images
than the other pens, especially pen #4. Since condenser lenses are designed to focus
light, not to magnify images, this result was not entirely unexpected. Overall, we
found that aspheric lenses helped a bit with astigmatism but were more susceptible
to coma when not properly aligned and tended to have either contrast problems,
distortion issues, or strangely-shaped bokehs. In further iterations it would be interesting to explore multi-lens systems (like achromatic doublets) to reduce some of
these problems.
5.1.3 Best Choice
For the display of the 10 μm-feature spaced DM codes, pens #3 and #5 produced images with
optimal magnification for decoding, and one tag was consistently visible in all the
images we captured. Since pen #3 was more susceptible to off-axis blur, decreasing
both its angular range and effective recognition area, we chose pen #5 as the best of
our prototypes. When tested as a stationary device and imaged through the switching
diffuser, we could read pen #5's information at between 29 and 40 frames per second,
decoding Data Matrix information in 15 to 35% of the captured frames. For simplicity,
we will hereafter refer to the prototype pen #5 as "the BoPen."
5.2 Interface Design Verification
Earlier, we briefly mentioned that the BoPen projected DM codes that could be
captured and decoded at 30fps with an approximately 10% recognition rate. In this section we describe the trade-offs necessary to achieve that performance.
5.2.1 Limitations
By far the largest barrier to turning the BoPen into a fully-fledged 6-DOF tracking
system is the small angular range; only rows of 8-12 tags in the center of the spaced
DM pattern can be decoded. After tilting past a ten-degree region containing approximately 144 codes, decoding happens only intermittently. Some of this effect is surely worsened by the print quality of the 10 μm patterns, but the primary causes are nonuniform illumination, which causes the contrast to drop off as a function of
tilt, and astigmatism, where variations in the focal distance between the lens and the pattern cause blur. The problem of angular range is particularly noticeable when trying to use the BoPen for off-surface interaction, as it is difficult to keep one's hand steady without a support.
Table 5.3: Bokehs as seen from our camera setup. [Image table: one column per pen (Pen 1, 8.8mm; Pen 2, 17.5mm; Pen 3, 12.5mm; Pen 4, 4.65mm; Pen 5, 11mm; Pen 6, 8mm) and one row per pattern (Trackmate, ARTags, Small MS/V, Big MS/V, DataMatrix, SpacedDM5, SpacedDM10), each cell showing the bokeh produced by that combination.]
Another important barrier is more specific to our lens/pen choice.
With the
standard infinity-focused optical configuration, the DM codes are not quite large
enough to be decoded when projected onto the CCD. Indeed, the output of our
parameter calculation script (Appendix A) indicates that the pattern magnification
achieved with the current setup is 1.9626, just below our theorized cutoff of 2.0. This
means that a single pixel in our printed pattern is received by no more than 2x2 pixels
on the CCD.
Tradeoffs
To increase the area of the CCD used for capturing Data Matrices, we moved the camera lens slightly closer to the CCD. This refocusing enabled the reliable interpretation
of DM tags at the cost of losing depth-independent pattern projection. Moving the
BoPen away from the surface now increases the size of the pattern within the bokeh,
reducing the distance that the pen can travel from the surface before one full DM
code is no longer visible. Still, with minor modifications to the DM search algorithm,
we preserved rapid DM decoding in the region three to five inches above the surface,
which is sufficient for our envisioned applications.
One additional loss from changing the system's focus deals a stronger blow to our
vision for interaction. Now that the pattern size changes with depth, the tilt calculation depends on an additional variable. Before, it was possible to determine (after
calibration) which DM code is at the origin of a tilting action based on the X and Y
positions of the pen along with the roll angle θ. Now, Z is also an important consideration, complicating our tilt-determination algorithm. We are currently capable of detecting tilt as long as the pen is not rotated, but have yet to generalize this to arbitrary θ.
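The geometry behind that calculation can be sketched as follows: because the pattern sits at the focal plane of the pen's lens, a tile whose center lies a distance d from the calibrated center marker is seen along a ray tilted by roughly atan(d / f) from the optical axis, and the roll angle rotates that offset into table coordinates. The C# fragment below illustrates only this relation, under a hypothetical tile pitch and with the new Z dependence ignored; it is not our implementation.

    using System;

    class TiltSketch
    {
        // rowOffset/colOffset: decoded tile indices relative to the center marker.
        // tilePitchUm: center-to-center tile spacing on the film mask (assumed known).
        // penFocalMm: focal length of the pen's lens (the pattern sits at the focal plane).
        // rollRad: roll angle taken from the DM orientation.
        static (double tiltXDeg, double tiltYDeg) Estimate(
            int rowOffset, int colOffset, double tilePitchUm, double penFocalMm, double rollRad)
        {
            double dxMm = colOffset * tilePitchUm / 1000.0;   // offset on the mask, in mm
            double dyMm = rowOffset * tilePitchUm / 1000.0;

            // Undo the pen's roll so the offset is expressed in camera/table coordinates.
            double cx = dxMm * Math.Cos(rollRad) - dyMm * Math.Sin(rollRad);
            double cy = dxMm * Math.Sin(rollRad) + dyMm * Math.Cos(rollRad);

            // A point at distance d from the optical axis on the focal plane is seen
            // at angle atan(d / f), so the visible tile encodes the tilt direction.
            double tiltX = Math.Atan(cx / penFocalMm) * 180 / Math.PI;
            double tiltY = Math.Atan(cy / penFocalMm) * 180 / Math.PI;
            return (tiltX, tiltY);
        }

        static void Main()
        {
            // Hypothetical numbers: 3 tiles from center, 120 um pitch, 11 mm lens, no roll.
            var (tx, ty) = Estimate(0, 3, 120, 11.0, 0);
            Console.WriteLine($"tilt of roughly ({tx:F1} deg, {ty:F1} deg)");
        }
    }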
Figure 5-2: Reflection Chamber. Realizing that the black acrylic we used for slices is
transmissive to IR light, we designed this second chamber to produce more uniform
illumination via reflection and scattering.
Minor Concessions
Changing the focus as described above also increased the bokeh size at a given distance. However, since variations due to depth remained consistent, our system can
still distinguish between on- and off-surface interaction. We also observed an increase
in coma distortion with large tilts, resulting in blurrier DM codes at the edges of the
mask. However, since astigmatism and light levels already make these codes difficult
to interpret, we felt it an acceptable loss: it is far more important to ensure the functionality of the DM decoding in the center regions, which is essential to the fine-grained
tilt and roll capabilities of the interface.
5.2.2 Suggested Improvements
Figure 5-2 shows an additional reflective chamber that we built to increase the BoPen's
angular range by combatting the problem of nonuniform pattern illumination. Since
black acrylic is transmissive to infrared light (in fact, we use black acrylic as our IR
filter) we can place reflective foil tape on the outside of a series of ring-shaped slices.
Positioned after the first chamber containing the LED and after the diffuser in the
BoPen, this chamber creates more uniform illumination before the light reaches the
pattern mask. Along with software for local histogram normalization, the chamber
provides a good alternative to more expensive uniform emitters.
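As an illustration of the software side of that fix, the C# sketch below performs a simple per-tile min/max contrast stretch, one basic form of local histogram normalization; the 32-pixel tile size is an arbitrary choice, and this is not the code used in our pipeline.

    using System;

    class LocalNormalizeSketch
    {
        // Per-tile min/max contrast stretch on an 8-bit grayscale image.
        static void Normalize(byte[] img, int width, int height, int tile = 32)
        {
            for (int ty = 0; ty < height; ty += tile)
            for (int tx = 0; tx < width; tx += tile)
            {
                int x1 = Math.Min(tx + tile, width), y1 = Math.Min(ty + tile, height);
                byte lo = 255, hi = 0;
                for (int y = ty; y < y1; y++)
                    for (int x = tx; x < x1; x++)
                    {
                        byte v = img[y * width + x];
                        if (v < lo) lo = v;
                        if (v > hi) hi = v;
                    }
                int range = Math.Max(hi - lo, 1);            // avoid dividing by zero in flat tiles
                for (int y = ty; y < y1; y++)
                    for (int x = tx; x < x1; x++)
                        img[y * width + x] = (byte)((img[y * width + x] - lo) * 255 / range);
            }
        }

        static void Main()
        {
            var frame = new byte[640 * 480];                 // dummy frame for the demo
            new Random(1).NextBytes(frame);
            Normalize(frame, 640, 480);
            Console.WriteLine("frame normalized");
        }
    }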
An interesting observation related to the tilt problems of our pens is a lack of
the predicted trade-off between lens size and angular range: smaller lenses afforded
little additional tilt capacity. Experimenting with lens arrays based on microscope
objectives may reduce off-axis aberrations. In another approach to the same problem,
one could create a layered or tiered pattern so that stacking multiple two-dimensional
masks would create a three dimensional pattern that is more resistant to the disparity
between focal length and object distance. This pattern would be composed of concentric circles that decrease in depth-moving closer to the BoPen's lens-as the radius
increases. This design would likely introduce variations in light levels that could be
addressed with precise pattern cuts, intelligent illumination schemes, and histogram
normalization.
One of the most interesting approaches involves the use of flat lenses or holographic
projection techniques to create an optical system where the focal distance is always
constant. Working within the constraints of the current BoPen capabilities, however,
a compound superposition approach like that described in [70] could increase the
angular range at the expense of device size.
5.2.3 Interface Potential
Though our system is still very preliminary, we have a few comments from potential
users that will guide our further refinement of this technology. First, of the people
we talked to, all said they would be interested in a simultaneous pen and touch
interface. Their reactions to the current design rated the BoPen favorably against
the original Bokode for pen-like interaction. However, most preferred the original
Bokode for mixed-reality applications because of its ability to stand up on its own,
pointing toward the camera by default. Some requested that the pen be larger, have
a more ergonomic grip, and function at larger angles, so they could hold it normally,
rather than needing to point it downward all the time. At least one user felt that it
was not cumbersome to hold it pointed downward, as she holds pens in a similar way
normally. All appreciated the visual feedback afforded by our debug console, but they
also seemed interested in testing out some more extensive application designs. We
are currently working on these applications, and shall discuss some of our longer-term
ideas in the next and final chapter.
Chapter 6
Conclusions
In this thesis, we presented continuing work on a system that enables real-time interaction using a special optical barcode-projection system. We reviewed the literature
on similar interfaces and technologies, reported the current state of the art in using
Bokodes for interaction, and discussed the concessions we made to achieve real-time
performance. We outlined the remaining limitations to overcome before the BoPen
becomes a fully viable 6-DOF pen interface.
6.1 Contribution
A significant contribution of this work is our demonstration of real-time blob tracking and Data Matrix decoding, which produces a 30fps data-stream with decoded
information in as many as 90% of packets when imaging paper and up to 35% for
infinity-projected Bokodes printed at 10 μm. The use of a client-server architecture
provides an easy way to extend the system to multiple users or surfaces via many
distinct or one continuous tablet region(s). Additionally, we described integration
with the Microsoft® SecondLight system, using a switchable diffuser to collocate a
multi-tiered display with our optical pen interface. Finally, our system calculates roll
information from decoded Bokodes as well as tilt information for certain orientations.
6.2 Relevance
Like the light-pen, our system is an optoelectronic pen that can be used to interact
with computing devices. Unlike the light-pen, it provides information for up to six
degrees of freedom. In comparison with contemporary pen interfaces, such as the
Wacom tablet
[96],
our system is limited in that it can only provide a small range
of angular tilt information, and it does not currently support pressure sensing or
integrated buttons. However, the BoPen provides depth information that can be
used in lieu of pressure for screen taps, and its compatibility with the identification
of multiple devices is uniquely compelling.
In comparison with Bi et al.'s use of the Vicon motion-tracking system for obtaining roll [8], the BoPen/camera solution is more compact and less expensive. However,
the smart camera approach makes the BoPen solution significantly less affordable, albeit not for the pen device itself. Overall, the BoPen is unlikely to replace the mouse
anytime soon, but as an adjunct to a tabletop computing interface like SecondLight,
the BoPen provides a novel input modality with exciting interaction potential.
6.3 Application Extensions
The applications we are considering to best demonstrate the strengths of BoPen
technology fall within a few different interaction areas. GUI extensions use the additional degrees of freedom of the BoPen for enhancing interaction with the cursor and
menu elements already present in Graphical User Interfaces. Applications for direct
manipulation draw inspiration from mixed-reality interfaces, and the integration of
multiple users with multiple displays engenders a vision for the ubiquitous availability
of BoPens to users at public kiosks.
6.3.1 GUI Extensions
In Chapter 2, we discussed some user interface widgets based on input devices with
additional degrees of freedom. Tilt for marking menus [90] and pressure changes
to assist in target acquisition [80] are only two of the many ways that these extra
degrees of freedom can be used. Tian et al. explored a tilt cursor that provides
the user with visual feedback indicative of pen orientation [89]. For the BoPen, too,
additional degrees of freedom could be expressed with visual feedback to preserve
a WYSIWYG model of interaction. With this consideration in mind, tilt can be
used to change pen modes. For example, in a drawing application, pen depth could
dictate stroke thickness, and tilt could provide direction. When selecting a region of
an image to "cut," a tilt cursor could be displayed to provide a more natural-feeling
razor-blade tool. Finally, it would be interesting to explore variations on the theme
of crossing targets, which are known to be more efficient than standard Fitts targets
for interaction using stylus pens [2, 27].
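As a concrete illustration of the drawing-application idea above, the C# sketch below maps pen depth to stroke thickness and the tilt vector to stroke direction; every constant and range in it is hypothetical.

    using System;

    class BrushMappingSketch
    {
        // Map BoPen pose to brush parameters: closer to the surface means a thicker
        // stroke, and the tilt vector sets the stroke direction (razor-blade style).
        static (double widthPx, double directionDeg) Map(double depthMm, double tiltXDeg, double tiltYDeg)
        {
            double t = Math.Clamp(1.0 - depthMm / 120.0, 0.0, 1.0);   // 0..120 mm assumed working range
            double width = 1 + 19 * t;                                // 1..20 px assumed brush sizes
            double direction = Math.Atan2(tiltYDeg, tiltXDeg) * 180 / Math.PI;
            return (width, direction);
        }

        static void Main()
        {
            var (w, d) = Map(depthMm: 30, tiltXDeg: 5, tiltYDeg: 5);
            Console.WriteLine($"brush {w:F1} px wide, stroke direction {d:F0} deg");
        }
    }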
6.3.2 Tangible and Direct Manipulation
We can take the BoPen even further into the tangible realm: working with digital
paintings (in 2D) or sculptures (in 3D) becomes more natural with the availability of
multiple physical tools that are linked to virtual ones. Paint brushes, knives, pencils,
pens (of course), and a whole host of other physical tools could be built as BoPens
with visibly and tactilely distinct forms. In this way, the function of the device is
immediately conveyed by the form while preserving the aforementioned higher-DOF
modes.
The communication of affordances is a given with physical tools, but for their digital counterparts it must be engineered, an idea pioneered in virtual reality systems
[49]. Like a set of physical tools, a set of specially designed and uniquely-identified
BoPens enables switching between modes of interaction by simply picking up a different device; this spatial search task is far less cognitively demanding than looking
for the right icon on an unfamiliar tool bar, and it leverages the user's kinesthetic
memory as well.
There are a number of interfaces that offer more natural interactions through the
use of both hands [42]. We mentioned the pen/touch combination, but for symmetric
tasks the use of two pens can be almost equally compelling [13]. Video-based multi-touch systems have trouble telling the difference between two users and one user
using two hands [39], but an extension of our system would support multi-point
IDed interaction through the use of bimanual pens. This has implications for the
traditional point-and-click interface as well as tangible and gestural interaction. One
example application is stretching and skewing a 3D object based on manipulation of
pens representing virtual axes. Another is 6-DOF control of multiple virtual objects;
research into simulated docking tasks using two pens would serve as an extension
of the established research on bimanual interaction with physical objects.
6.3.3 Multi-user Interaction
Adding another device is understood to be one of the best (but not the only; see [11])
approaches to solving problems encountered when enabling multiple users [47, 15, 63].
One of the most difficult aspects of this process is establishing ad-hoc communication
protocols between devices [48, 17]. Tools exist to facilitate these inter-device communications [77, 28], but for simple input to a single computer or public display (c.f. [9]),
a simpler (and less expensive) solution like the BoPen is quite powerful.
For a person with a BoPen, interaction is possible at any display equipped with a
suitable camera (Figure 6-1). Under this paradigm, multiple people can collaborate,
play games, and express themselves at a number of kiosks distributed around a city
or building. However, there is an important "registration" component to this kind of
interaction that cannot be ignored; the user would likely have to execute an initial
login either online at home or at a special "Association Kiosk," which would read their
pen and associate it with their digital identity (e.g. e-mail address or username). The
user could then login at home to access statistics and other information, providing
additional depth and breadth to using Bokode-authenticated interfaces. Additionally,
in implementing interaction at a distance, it is important to consider that the users'
hands will not be steady. For this reason, using large gestures is going to be more
effective than precise cursor control. One might refer to Nintendo's popular Wii™ console for examples of how large gestures can effectively be used for interaction.
Figure 6-1: Vision for Public Display. In this vision for interaction, multiple users
are uniquely identified to a public display, and can interact with it in real time. Image
from http://web.media.mit.edu/~ankit/bokode/
6.4 Device Future
Aside from the obvious improvements to be garnered by higher printing resolution
and improved focal length stability, actuation is a major next step in creating a
BoPen interface. The lack of pressure sensitivity in the BoPen complicates the reliable
detection of a tap or click. Rather than using Z-depth, a future pen might employ
capacitive sensing on the pen barrel to detect hand position and finger actuation or
force sensitivity at the tip to determine with fine granularity the pen's pressure on
the display surface. Both these methods potentially increase the power consumption
and cost of the BoPen, but these unwanted results could be minimized by preserving
the optical method of communication. Rather than adding a Bluetooth radio or other
RF transmitter, the existing optical communication infrastructure could be used to
transmit this additional information. For example, a tip could be designed where
exerting pressure would mechanically alter the shape of the tip and hence the bokeh.
The camera could read this shape change using a template-matching approach, and
infer the exerted pressure. Other solutions include modulation of light intensity in a
ring around the code region, pulse width modulation, or color-based conveyance of
button or tip actuation.
6.5 Outlook
The Data Matrix tracking and decoding on which the BoPen's functioning depends is
now available in real time. After some additional refinements to address the problems
we identified earlier, the next important step in the development of BoPen interaction
is the implementation and testing of interface applications. We are actively working
on developing some of the applications described above for three-dimensional object
manipulation as well as novel GUI extensions supported by the BoPen's additional
degrees of freedom. We also hope to one day be able to explore gesture detection,
sketch reading, and, pushing the technology to its limits, handwritten character recognition. Approaching these areas with the BoPen in hand could open up a valuable
avenue of research into Human-Computer Interaction.
Bibliography
[1] Gregory D. Abowd and Elizabeth D. Mynatt. Charting past, present, and
future research in ubiquitous computing. ACM Trans. Comput.-Hum. Interact.,
7(1):29-58, 2000.
[2] Johnny Accot and Shumin Zhai. Beyond Fitts' law: models for trajectory-based HCI tasks. In CHI '97: Proceedings of the SIGCHI conference on Human factors
in computing systems, pages 295-302, New York, NY, USA, 1997. ACM.
[3] Apple. Mighty Mouse. Online at http://www.apple.com/mightymouse/, Accessed August 2009.
[4] Ravin Balakrishnan, Thomas Baudel, Gordon Kurtenbach, and George Fitzmaurice. The rockin'mouse: integral 3d manipulation on a plane. In CHI '97:
Proceedings of the SIGCHI conference on Human factors in computing systems,
pages 311-318, New York, NY, USA, 1997. ACM.
[5] Ravin Balakrishnan and Ken Hinckley. The role of kinesthetic reference frames
in two-handed input performance. In UIST '99: Proceedings of the 12th annual
ACM symposium on User interface software and technology, pages 171-178,
New York, NY, USA, 1999. ACM.
[6] Rafael Ballagas, Michael Rohs, and Jennifer G. Sheridan. Sweep and point and
shoot: phonecam-based interactions for large public displays. In CHI '05: CHI
'05 extended abstracts on Human factors in computing systems, pages 1200-1203, New York, NY, USA, 2005. ACM.
[7] Morton I. Bernstein. Computer recognition of on-line hand-written characters.
Rand Corporation. Memorandum RM-3753-ARPA. Rand Corp., Santa Monica,
CA, USA, 1964.
[8] Xiaojun Bi, Tomer Moscovich, Gonzalo Ramos, Ravin Balakrishnan, and Ken
Hinckley. An exploration of pen rolling for pen-based interaction. In UIST '08:
Proceedings of the 21st annual ACM symposium on User interface software and
technology, pages 191-200, New York, NY, USA, 2008. ACM.
[9] Xiaojun Bi, Yuanchun Shi, Xiaojie Chen, and Peifeng Xiang. Facilitating interaction with large displays in smart spaces. In sOc-EUSAI '05: Proceedings
of the 2005 joint conference on Smart objects and ambient intelligence, pages
105-110, New York, NY, USA, 2005. ACM.
[10] Oliver Bimber and Ramesh Raskar. Modern approaches to augmented reality.
page 1, 2007.
[11] Alan F. Blackwell, Mark Stringer, Eleanor F. Toye, and Jennifer A. Rode.
Tangible interface for collaborative information retrieval. In CHI '04: CHI '04
extended abstracts on Human factors in computing systems, pages 1473-1476,
New York, NY, USA, 2004. ACM.
[12] Richard A. Bolt. "put-that-there": Voice and gesture at the graphics interface.
In SIGGRAPH '80: Proceedings of the 7th annual conference on Computer
graphics and interactive techniques, pages 262-270, New York, NY, USA, 1980.
ACM.
[13] Peter Brandl, Clifton Forlines, Daniel Wigdor, Michael Haller, and Chia Shen.
Combining and measuring the benefits of bimanual pen and direct-touch interaction on horizontal interfaces. In AVI '08: Proceedings of the working conference on Advanced visual interfaces, pages 154-161, New York, NY, USA, 2008.
ACM.
[14] Bill Buxton. Multi-Touch Systems that I Have Known and Loved. Online at
http://www.billbuxton.com/multitouchOverview.html, Accessed March 2009.
[15] Xiang Cao, Clifton Forlines, and Ravin Balakrishnan. Multi-user interaction
using handheld projectors. In UIST '07: Proceedings of the 20th annual ACM
symposium on User interface software and technology, pages 43-52, New York,
NY, USA, 2007. ACM.
[16] Xiang Cao, Michael Massimi, and Ravin Balakrishnan. Flashlight jigsaw: an
exploratory study of an ad-hoc multi-player game on public displays. In CSCW
'08: Proceedings of the ACM 2008 conference on Computer supported cooperative work, pages 77-86, New York, NY, USA, 2008. ACM.
[17] Eduardo Cerqueira, Luis Veloso, Augusto Neto, Marilia Curado, Edmundo
Monteiro, and Paulo Mendes. Mobility management for multi-user sessions
in next generation wireless systems. Comput. Commun., 31(5):915-934, 2008.
[18] Yu-Hsuan Chang, Chung-Hua Chu, and Ming-Syan Chen. A general scheme for
extracting QR code from a non-uniform background in camera phones and applications. In ISM '07: Proceedings of the Ninth IEEE International Symposium
on Multimedia, pages 123-130, Washington, DC, USA, 2007. IEEE Computer
Society.
[19] Kelvin Cheng and Kevin Pulo. Direct interaction with large-scale display
systems using infrared laser tracking devices. In APVis '03: Proceedings of
the Asia-Pacific symposium on Information visualisation, pages 67-74, Darlinghurst, Australia, Australia, 2003. Australian Computer Society, Inc.
[20] Chung-Hua Chu, De-Nian Yang, and Ming-Syan Chen. Image stabilization for
2d barcode in handheld devices. In MULTIMEDIA '07: Proceedings of the 15th
international conference on Multimedia, pages 697-706, New York, NY, USA,
2007. ACM.
[21] James R. Dabrowski and Ethan V. Munson. Is 100 milliseconds too fast? In
CHI '01: CHI '01 extended abstracts on Human factors in computing systems,
pages 317-318, New York, NY, USA, 2001. ACM.
[22] T.L. Diamond. Devices for reading handwritten characters. In IRE-ACM-AIEE
'57 (Eastern): Papers and discussions presented at the December 9-13, 1957,
eastern joint computer conference: Computers with deadlines to meet, pages
232-237, New York, NY, USA, 1958. ACM.
[23] Paul H. Dietz and Darren Leigh. DiamondTouch: A multi-user touch technology. In Proc. of UIST 2001, pages 219-226, 2001.
[24] Paul N. Edwards. The Closed World: Computers and the Politics of Discourse
in Cold War America. MIT Press, Cambridge, MA, USA, 1996.
[25] Engelbart, D.C. et al. A research center for augmenting human intellect (demo).
Online at http://sloan.stanford.edu/MouseSite/1968Demo.html, Accessed August 2009.
[26] Daniel Fallman, Anneli Mikaelsson, and Björn Yttergren. The design of a
computer mouse providing three degrees of freedom. In HCI (2), pages 53-62,
2007.
[27] Clifton Forlines and Ravin Balakrishnan. Evaluating tactile feedback and direct
vs. indirect stylus input in pointing and crossing selection tasks. In CHI, pages
1563-1572, 2008.
[28] Clifton Forlines, Alan Esenther, Chia Shen, Daniel Wigdor, and Kathy Ryall.
Multi-user, multi-display interaction with a single-user, single-display geospatial
application. In UIST '06: Proceedings of the 19th annual ACM symposium on
User interface software and technology, pages 273-276, New York, NY, USA,
2006. ACM.
[29] Clifton Forlines, Daniel Wigdor, Chia Shen, and Ravin Balakrishnan. Direct-touch vs. mouse input for tabletop displays. In CHI '07: Proceedings of the
SIGCHI conference on Human factors in computing systems, pages 647-656,
New York, NY, USA, 2007. ACM.
[30] Bernd Froehlich, Jan Hochstrate, Verena Skuk, and Anke Huckauf. The globefish and the globemouse: two new six degree of freedom input devices for graphics applications. In CHI '06: Proceedings of the SIGCHI conference on Human
Factors in computing systems, pages 191-199, New York, NY, USA, 2006. ACM.
[31] Vision Components GmbH. VC 4038 smart camera. Online at: http://www.vision-components.com/vc-smart-camera-series-and-software/vc-smart-camera-series/vc4038-smart-camera-20060124238/, Accessed August 2009.
[32] John B. Goodenough. A lightpen-controlled program for online data analysis.
Commun. ACM, 8(2):130-134, 1965.
[33] Gabriel F. Groner. Real-time recognition of handprinted text. In AFIPS '66
(Fall): Proceedings of the November 7-10, 1966, fall joint computer conference,
pages 591-601, New York, NY, USA, 1966. ACM.
[34] Tovi Grossman, Ken Hinckley, Patrick Baudisch, Maneesh Agrawala, and Ravin
Balakrishnan. Hover widgets: using the tracking state to extend the capabilities
of pen-operated devices. In CHI '06: Proceedings of the SIGCHI conference on
Human Factors in computing systems, pages 861-870, New York, NY, USA,
2006. ACM.
[35] Barbara J Grosz. Visualizing the process a graph-based approach to enhancing system-user knowledge sharing. Proceedings of the American Philosophical
Society, 149(4):529-543, 2005.
[36] Anselm Grundhöfer, Manja Seeger, Ferry Hantsch, and Oliver Bimber. Dynamic
adaptation of projected imperceptible codes. In ISMAR '07: Proceedings of the
2007 6th IEEE and ACM International Symposium on Mixed and Augmented
Reality, pages 1-10, Washington, DC, USA, 2007. IEEE Computer Society.
[37] Kelly S. Hale and Kay M. Stanney. Deriving haptic design guidelines from human physiological, psychophysical, and neurological foundations. IEEE Comput. Graph. Appl., 24(2):33-39, 2004.
[38] Michael Haller. Pen-based interaction. In SIGGRAPH '07: ACM SIGGRAPH
2007 courses, pages 75-98, New York, NY, USA, 2007. ACM.
[39] Jefferson Y. Han. Low-cost multi-touch sensing through frustrated total internal
reflection. In UIST '05: Proceedings of the 18th annual ACM symposium on
User interface software and technology, pages 115-118, New York, NY, USA,
2005. ACM.
[40] Andy Hertzfeld. Revolution in The Valley (hardcover). O'Reilly & Associates,
Inc., 2004.
[41] Ken Hinckley, Randy Pausch, John C. Goble, and Neal F. Kassell. A survey
of design issues in spatial input. In UIST '94: Proceedings of the 7th annual
ACM symposium on User interface software and technology, pages 213-222,
New York, NY, USA, 1994. ACM.
[42] Ken Hinckley, Randy Pausch, Dennis Proffitt, James Patten, and Neal Kassell.
Cooperative bimanual action. In CHI '97: Proceedings of the SIGCHI conference on Human factors in computing systems, pages 27-34, New York, NY,
USA, 1997. ACM.
[43] Ken Hinckley, Mike Sinclair, Erik Hanson, Richard Szeliski, and Matthew Conway. The videomouse: A camera-based multi-degree-of-freedom input device.
In ACM Symposium on User Interface Software and Technology, pages 103-112,
1999.
[44] Hiroshi Ishii and Brygg Ullmer. Tangible bits: towards seamless interfaces
between people, bits and atoms. In CHI '97: Proceedings of the SIGCHI conference on Human factors in computing systems, pages 234-241, New York, NY,
USA, 1997. ACM.
[45] ISO. Automatic identification and data capture techniques - Data Matrix bar
code symbology specification. ISO/IEC 16022:2006, 2006.
[46] Shahram Izadi, Steve Hodges, Stuart Taylor, Dan Rosenfeld, Nicolas Villar,
Alex Butler, and Jonathan Westhues. Going beyond the display: a surface
technology with an electronically switchable diffuser. In UIST '08: Proceedings
of the 21st annual ACM symposium on User interface software and technology,
pages 269-278, New York, NY, USA, 2008. ACM.
[47] Hao Jiang, Eyal Ofek, Neema Moraveji, and Yuanchun Shi. Direct pointer: direct manipulation for large-display interaction using handheld cameras. In CHI
'06: Proceedings of the SIGCHI conference on Human Factors in computing
systems, pages 1107-1110, New York, NY, USA, 2006. ACM.
[48] Victor P. Jimenez and Ana Garcia Armada. Multi-user synchronisation in
ad hoc ofdm-based wireless personal area networks. Wirel. Pers. Commun.,
40(3):387-399, 2007.
[49] Marcelo Kallmann and Daniel Thalmann. Direct 3d interaction with smart
objects. In VRST '99: Proceedings of the ACM symposium on Virtual reality
software and technology, pages 124-130, New York, NY, USA, 1999. ACM.
[50] Martin Kaltenbrunner and Ross Bencina. reactivision: a computer-vision framework for table-based tangible interaction. In TEI '07: Proceedings of the 1st
international conference on Tangible and embedded interaction, pages 69-74,
New York, NY, USA, 2007. ACM.
[51] Alan Curtis Kay. The reactive engine. PhD thesis, The University of Utah,
1969.
[52] Alan Curtis Kay. A personal computer for children of all ages. Xerox Palo Alto
Research Center: Proceedings of the ACM National Conference, 1972.
[53] Alan Curtis Kay. Trackmate: Large-scale accessibility of tangible user interfaces. Master's thesis, Massachusetts Institute of Technology, 2009.
[54] Werner A. König, Joachim Böttger, Nikolaus Völzow, and Harald Reiterer.
Laserpointer-interaction between art and science. In IUI '08: Proceedings of
the 13th international conference on Intelligent user interfaces, pages 423-424,
New York, NY, USA, 2008. ACM.
[55] Werner A. König, Jens Gerken, Stefan Dierdorf, and Harald Reiterer. Adaptive
pointing: implicit gain adaptation for absolute pointing devices. In CHI EA '09:
Proceedings of the 27th international conference extended abstracts on Human
factors in computing systems, pages 4171-4176, New York, NY, USA, 2009.
ACM.
[56] Celine Latulipe, Stephen Mann, Craig S. Kaplan, and Charlie L. A. Clarke.
symspline: symmetric two-handed spline manipulation. In CHI '06: Proceedings
of the SIGCHI conference on Human Factors in computing systems, pages 349-358, New York, NY, USA, 2006. ACM.
[57] Jakob Leitner, James Powell, Peter Brandl, Thomas Seifried, Michael Haller,
Bernard Dorray, and Paul To. Flux: a tilting multi-touch and pen based surface. In CHI EA '09: Proceedings of the 27th international conference extended
abstracts on Human factors in computing systems, pages 3211-3216, New York,
NY, USA, 2009. ACM.
[58] Vincent Lepetit and Pascal Fua. Monocular model-based 3d tracking of rigid
objects. Found. Trends. Comput. Graph. Vis., 1(1):1-89, 2005.
[59] Yang Li, Ken Hinckley, Zhiwei Guan, and James A. Landay. Experimental
analysis of mode switching techniques in pen-based user interfaces. In CHI '05:
Proceedings of the SIGCHI conference on Human factors in computing systems,
pages 461-470, New York, NY, USA, 2005. ACM.
[60] Wendy E. Mackay, Guillaume Pothier, Catherine Letondal, Kaare Bøegh, and
Hans Erik Sørensen. The missing link: augmenting biology laboratory notebooks. In UIST '02: Proceedings of the 15th annual ACM symposium on User
interface software and technology, pages 41-50, New York, NY, USA, 2002.
ACM.
[61] I. Scott MacKenzie, R. William Soukoreff, and Chris Pal. A two-ball mouse
affords three degrees of freedom. In CHI '97: CHI '97 extended abstracts on
Human factors in computing systems, pages 303-304, New York, NY, USA,
1997. ACM.
[62] Shahzad Malik and Joe Laszlo. Visual touchpad: a two-handed gestural input device. In ICMI '04: Proceedings of the 6th international conference on
Multimodal interfaces, pages 289-296, New York, NY, USA, 2004. ACM.
[63] Shahzad Malik, Abhishek Ranjan, and Ravin Balakrishnan. Interacting with
large displays from a distance with vision-tracked multi-finger gestural input. In
UIST '05: Proceedings of the 18th annual ACM symposium on User interface
software and technology, pages 43-52, New York, NY, USA, 2005. ACM.
[64] T. Marrill, A. K. Hartley, T. G. Evans, B. H. Bloom, D. M. R. Park, T. P.
Hart, and D. L. Darley. Cyclops-1: a second-generation recognition system. In
AFIPS '63 (Fall): Proceedings of the November 12-14, 1963, fall joint computer
conference, pages 27-33, New York, NY, USA, 1963. ACM.
[65] Sergey V. Matveyev and Martin Göbel. The optical tweezers: multiple-point
interaction technique. In VRST '03: Proceedings of the ACM symposium on
Virtual reality software and technology, pages 184-187, New York, NY, USA,
2003. ACM.
[66] David Merrill, Jeevan Kalanithi, and Pattie Maes. Siftables: towards sensor
network user interfaces. In TEI '07: Proceedings of the 1st international conference on Tangible and embedded interaction, pages 75-78, New York, NY,
USA, 2007. ACM.
[67] Microsoft. Key events in Microsoft history. Online at: http://download.microsoft.com/download/7/e/a/7ea5ca8c-4c72-49e9-a694-87ae755e1f58/keyevents.doc, Accessed August 2009.
[68] Paul Milgram and Fumio Kishino. A taxonomy of mixed reality visual displays.
IEICE Transactions on Information Systems, E77-D(12), December 1994.
[69] Pranav Mistry, Pattie Maes, and Liyan Chang. Wuw - wear ur world: a wearable gestural interface. In CHI EA '09: Proceedings of the 27th international
conference extended abstracts on Human factors in computing systems, pages
4111-4116, New York, NY, USA, 2009. ACM.
[70] Ankit Mohan, Grace Woo, Shinsaku Hiura, Quinn Smithwick, and Ramesh
Raskar. Bokode: imperceptible visual tags for camera based interaction from a
distance. In SIGGRAPH '09: ACM SIGGRAPH 2009 papers, pages 1-8, New
York, NY, USA, 2009. ACM.
[71] Brad A. Myers. A brief history of human-computer interaction technology.
interactions, 5(2):44-54, 1998.
[72] Brad A. Myers, Rishi Bhatnagar, Jeffrey Nichols, Choon Hong Peck, Dave
Kong, Robert Miller, and A. Chris Long. Interacting at a distance: measuring
the performance of laser pointers and other devices. In CHI '02: Proceedings of
the SIGCHI conference on Human factors in computing systems, pages 33-40,
New York, NY, USA, 2002. ACM.
[73] Ernie G. Nassimbene. Utensil for writing and simultaneously recognizing the
written symbols. In US Patent, number 3182291, May 1965.
[74] University of Surrey and Loughborough University. Ergonomics of using a
mouse or other non-keyboard input device. HSE Research Report RR045. Health
and Safety Executive, United Kingdom, 2002.
[75] Masaki Oshita. Pen-to-mime: Pen-based interactive control of a human figure.
Computers & Graphics, 29(6):931 - 945, 2005.
[76] Hanhoon Park and Jong-Il Park. Invisible marker tracking for ar. In ISMAR
'04: Proceedings of the 3rd IEEE/ACM International Symposium on Mixed
and Augmented Reality, pages 272-273, Washington, DC, USA, 2004. IEEE
Computer Society.
[77] Jeffrey S. Pierce and Jeffrey Nichols. An infrastructure for extending applications' user experiences across multiple personal devices. In UIST '08: Proceedings of the 21st annual ACM symposium on User interface software and
technology, pages 101-110, New York, NY, USA, 2008. ACM.
[78] Larry Press. The acm conference on the history of personal workstations. SIGSMALL/PC Notes, 12(4):3-10, 1986.
[79] Davis M. R. and Ellis T. O. The rand tablet: a man-machine graphical communication device. In AFIPS '64 (Fall, part I): Proceedings of the October 27-29,
1964, fall joint computer conference, part I, pages 325-331, New York, NY,
USA, 1964. ACM.
[80] Gonzalo Ramos, Matthew Boulos, and Ravin Balakrishnan. Pressure widgets.
In CHI '04: Proceedings of the SIGCHI conference on Human factors in computing systems, pages 487-494, New York, NY, USA, 2004. ACM.
[81] Howard Rheingold. The Virtual Community: Homesteading on the Electronic
Frontier. The MIT Press, 2000.
[82] Lawrence G. Roberts. The lincoln wand. In AFIPS '66 (Fall): Proceedings of
the November 7-10, 1966, fall joint computer conference, pages 223-227, New
York, NY, USA, 1966. ACM.
[83] Johan Sanneblad and Lars Erik Holmquist. Ubiquitous graphics: combining
hand-held and wall-size displays to interact with large images. In AVI '06:
Proceedings of the working conference on Advanced visual interfaces, pages 373-377, New York, NY, USA, 2006. ACM.
[84] Hyunyoung Song, Tovi Grossman, George W. Fitzmaurice, François Guimbretière, Azam Khan, Ramtin Attar, and Gordon Kurtenbach. Penlight: combining a mobile projector and a digital pen for dynamic visual overlay. In CHI,
pages 143-152, 2009.
[85] Sriram Subramanian, Dzimitry Aliakseyeu, and Andres Lucero. Multi-layer
interaction for digital tables. In UIST '06: Proceedings of the 19th annual
ACM symposium on User interface software and technology, pages 269-272,
New York, NY, USA, 2006. ACM.
[86] Ivan E. Sutherland. Sketchpad: a man-machine graphical communication system. In DAC '64: Proceedings of the SHARE design automation workshop,
pages 6.329-6.346, New York, NY, USA, 1964. ACM.
[87] T. O. Ellis, J. F. Heafner, and W. L. Sibley. The GRAIL System Implementation.
Rand Corporation Memorandum RM-6002-ARPA. Rand Corp., Santa Monica,
CA, USA, 1969.
[88] Chuck Thacker. Personal distributed computing: the alto and ethernet hardware. In Proceedings of the ACM Conference on The history of personal workstations, pages 87-100, New York, NY, USA, 1986. ACM.
[89] Feng Tian, Xiang Ao, Hongan Wang, Vidya Setlur, and Guozhong Dai. The tilt
cursor: enhancing stimulus-response compatibility by providing 3d orientation
cue of pen. In CHI '07: Proceedings of the SIGCHI conference on Human factors
in computing systems, pages 303-306, New York, NY, USA, 2007. ACM.
[90] Feng Tian, Lishuang Xu, Hongan Wang, Xiaolong Zhang, Yuanyuan Liu, Vidya
Setlur, and Guozhong Dai. Tilt menu: using the 3d orientation information of
pen devices to extend the selection capability of pen-based user interfaces. In
CHI '08: Proceeding of the twenty-sixth annual SIGCHI conference on Human
factors in computing systems, pages 1371-1380, New York, NY, USA, 2008.
ACM.
[91] Ellis T.O. and Sibley W.L. On the Problem of Directness in Computer Graphics.
Rand Corporation. Report P-3697. Rand Corp., Santa Monica, CA, USA, 1968.
[92] Brygg Ullmer, Hiroshi Ishii, and Robert J. K. Jacob. Token+constraint systems
for tangible interaction with digital information. ACM Trans. Comput.-Hum.
Interact., 12(1):81-118, 2005.
[93] Andries van Dam and David E. Rice. On-line text editing: A survey. ACM
Comput. Surv., 3(3):93-114, 1971.
[94] Dan Venolia. Facile 3d direct manipulation. In INTERCHI, pages 31-36, 1993.
[95] Daniel Vogel and Ravin Balakrishnan. Distant freehand pointing and clicking
on very large, high resolution displays. In UIST '05: Proceedings of the 18th
annual ACM symposium on User interface software and technology, pages 33-42, New York, NY, USA, 2005. ACM.
[96] Wacom. Wacom Intuos3 Art Pen Orientation Guide. Online at http://www.wacom-asia.com/intuos3/spec/intuos3artpen.html, Accessed March 2009.
[97] Jean Renard Ward. An annotated bibliography in pen computing and handwriting character recognition. Online at http://users.erols.com/rwservices/biblio.html, 2009.
[98] Pierre Wellner. Interacting with paper on the DigitalDesk. Commun. ACM,
36(7):87-96, 1993.
[99] Adam Wojciechowski and Konrad Siek. Barcode scanning from mobile-phone
camera photos delivered via mms: Case study. In ER '08: Proceedings of the
ER 2008 Workshops (CMLSA, ECDM, FP-UML, M2AS, RIGiM, SeCoGIS,
WISM) on Advances in Conceptual Modeling, pages 218-227, Berlin, Heidelberg, 2008. Springer-Verlag.
[100] Shumin Zhai, Barton A. Smith, and Ted Selker. Improving browsing performance: A study of four input devices for scrolling and pointing tasks. In
INTERACT '97: Proceedings of the IFIP TC13 International Conference on
Human-Computer Interaction, pages 286-293, London, UK, 1997. Chapman & Hall, Ltd.
[101] Xiang Zhang, Stephan Fronz, and Nassir Navab. Visual marker detection and
decoding in ar systems: A comparative study. In ISMAR '02: Proceedings of
the 1st International Symposium on Mixed and Augmented Reality, page 97,
Washington, DC, USA, 2002. IEEE Computer Society.
Appendix A
MATLAB Code
function quickcalc ()
r  = 7e-3;      % radius of pen (transparency)
fb = 8.8e-3;    % bokode focal length (try 4.5 - 22)
t  = 0.1;       % field width
u  = .3399;     % camera distance from screen
ub = 10e-06;    % print feature size
l  = 15;        % number of features for decoding (try up to 35)
uc = 7.4e-06;   % pixel size
dc = 4.7e-3;    % 5mm CCD
fc = dc/(t/u);
% fc = 16e-3
k = (ub/uc)*(fc/fb);
a = (l*ub)/(fb/u);
disp(['fc = ', num2str(fc*1000), ' mm; k = ', ...
      num2str(k), '; a = ', ...
      num2str(a*1000), ' mm; range = ', ...
      num2str(atan(r/fb)*180/pi), ' degrees']);
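As a quick check on the defaults in the listing above, the first output is fc = dc/(t/u) = 4.7e-3/(0.1/0.3399), or roughly 16 mm, which agrees with the commented-out "% fc = 16e-3" line; the quantities k, a, and the angular range follow from the same substitutions.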
Appendix B
C Code

/*
 * Daniel Taub
 * MIT Media Lab
 * Code for the VC4xxx SmartCamera by Vision Components
 * Based on Example Code by: Klaus Schneider, Vision Components Inc.
 *   and Microsoft Research Ltd.
 *
 * To add: timeout for trigger fail
 */
#include <DM_Header.h>

/* timing */
int bms, ms, sec, fpsv, fps = 0, fpss = 0;
int trigger = 1;
int ovlDisplay = 0;
/* blobs */
int lastx = -1, lasty, lastdx, lastdy, lastw, lasth;
#define MIN_AREA 500

/*******************************************************************/
#define LOCAL_PORT 5005   /* local port 5005       */
#define DEST_PORT  4004   /* udp destination port  */
#define BUF_LEN    32     /* buffer length (srsly) */
/*******************************************************************/
#define fsign(x) sign((int)(ceil(x)))
void main(void)
{
    /* Init params for UDP packets */
    sockaddr_in laddr, raddr;
    unsigned sock;
    unsigned error;
    unsigned rest;
    const char *dest = "157.58.60.69";
    char buf[BUF_LEN];
    int var1 = 0, var2 = 0;
    int var3 = 0, var4 = 0;
    I32 destInHex;                       /* uint_16, for conversion simplicity */
    int remote_addrlen = sizeof(sockaddr_in);

    /******************************************************************/
    /*********************** DATA MATRIX DECODING ********************/
    /******************************************************************/
    I32 ScreenX, ScreenY, ScreenDx, ScreenDy, VideoDispICamera, result;
    I32 OvlScreenDx, OvlScreenDy, OvlSearchAreaDx, OvlSearchAreaDy;
    I32 SearchAreaX, SearchAreaY, SearchAreaDx, SearchAreaDy;
    I32 SearchBorder = 0;                /* 8 */
    char Text[DM_MAX_TEXTLEN];
    DmParameter DmPar;
    image SearchArea, SearchAreaOvl;
    unsigned int dm_count = 0, count = 0, last_dm_count = 0;
    unsigned char dm_id = 0, dm_x = 0, dm_y = 0;

    /******************************************************************/
    /* Command and control                                            */
    /******************************************************************/
    int threshold = 40;
    char kb;
    int display = 0;
    short fullScreenMode = 1;
    short secondpic = 0;
    short streaming = 0;
    int grow = 0;

    /******************************************************************/
    int x, y, dx, dy;
    int Over;
    image wArea;
    int nt = 0;
    char resp;

    /******************************************************************/
    dx = EVEN_S(640);
    dy = EVEN_S(480);
    x = EVEN_S((ScrGetColumns - dx) / 2);
    y = EVEN_S((ScrGetRows - dy) / 2);

    init_licence("T2EC4DB8BF2");
    init_licence("E2EC4E94792");

    SearchBorder = 8;
    result = DmIniMemory(&DmPar);
    if (result)
    {   print("e<%d: Data Matrix Initialization Error>\n", result);
        exit(0);
    }

    /******************************************************************/
    /* ini variables: define display and define image capture         */
    /* position (Screen Size and Position)                            */
    /******************************************************************/
    ScreenX     = 0;
    ScreenY     = 0;
    ScreenDx    = ScrGetColumns;
    ScreenDy    = ScrGetRows;
    OvlScreenDx = OvlGetColumns;
    OvlScreenDy = OvlGetRows;

    /* define search area */
    SearchAreaX     = ScreenX + SearchBorder;
    SearchAreaY     = ScreenY + SearchBorder;
    SearchAreaDx    = ScreenDx - 2 * SearchBorder;
    SearchAreaDy    = ScreenDy - 2 * SearchBorder;
    OvlSearchAreaDx = OvlScreenDx - 2 * SearchBorder;
    OvlSearchAreaDy = OvlScreenDy - 2 * SearchBorder;

    ImageAssign(&SearchArea, ScrByteAddr(SearchAreaX, SearchAreaY),
                SearchAreaDx, SearchAreaDy, ScrGetPitch);
    ImageAssign(&SearchAreaOvl, OvlByteAddr(SearchAreaX, SearchAreaY),
                SearchAreaDx, SearchAreaDy, OvlGetPitch);

    /* Data Matrix input parameters, including all of the licence code. */
    /* Call this initialization function as often as you like.          */
    InitDataMatrixPar(&DmPar, &SearchArea, Text, DM_MAX_TEXTLEN);

    /******************************************************************/
    /* setup overlay */
    set_translucent(1, 250, 0, 0);       /* TRANRED   1   translucent red    */
    set_translucent(2, 250, 250, 0);     /* TRANYEL   2   translucent yellow */
    set_overlay_bit(2, 210, 210, 210);
    set_overlay_bit(3, 250, 0, 0);       /* COLRED    8   red    */
    set_overlay_bit(4, 70, 70, 70);      /* COLGREY   4   grey  (COLBLACK 16 black) */
    set_overlay_bit(5, 255, 255, 250);   /* COLWHITE  32  white  */
    set_overlay_bit(6, 0, 200, 0);       /* COLGREEN  64  green  */
    set_overlay_bit(7, 65, 90, 255);     /* COLBLUE   128 blue   */

    ImageAssign(&SearchAreaOvl, getvar(OVLY_START), DispGetColumns, DispGetRows, OvlGetPitch);
    ImageAssign(&wArea, getvar(DISP_START), DispGetColumns, DispGetRows, DispGetPitch);
    set(&SearchAreaOvl, 0);
    set(&wArea, 0);
    EmptyKeyboardBuffer();

    /******************************************************************/
    sscanf(dest, "%d.%d.%d.%d", &var1, &var2, &var3, &var4);
    destInHex = (var1 << 24) + (var2 << 16) + (var3 << 8) + var4;
    /* destInHex = (157 << 24) + (58 << 16) + (60 << 8) + 69; */

    /* clear buffer */
    memset((char *) &buf,   0, BUF_LEN);
    memset((char *) &laddr, 0, sizeof(sockaddr_in));
    memset((char *) &raddr, 0, sizeof(sockaddr_in));

    laddr.sin_family      = AF_INET;
    laddr.sin_port        = LOCAL_PORT;        /* bind port */
    laddr.sin_addr.s_addr = INADDR_ANY;        /* listen for any IP addr */
    raddr.sin_family      = AF_INET;
    raddr.sin_port        = DEST_PORT;         /* port at dest */
    raddr.sin_addr.s_addr = INADDR_BROADCAST;  /* destInHex;  dest IP from above */

    sock = socket_dgram();
    if (sock == VCRT_SOCKET_ERROR)
    {
        print("\ne<Create UDP socket failed>");
        return;
    }
    /* bind to local address */
    error = bind(sock, &laddr, sizeof(raddr));
    if (error != VCRT_OK)
    {
        print("\ne<UDP bind failed - 0x%x>", error);
        return;
    }
    rest = connect(sock, &raddr, remote_addrlen);
    if (rest != VCRT_OK)
    {
        printf("\nError: connect() failed with error code %lx", rest);
    }

    /******************************************************************/
    /* main loop                                                      */
    /******************************************************************/
    do{
        /* Delete old drawings */
        /* OvlClearAll; */

        /* GET KEYBOARD INPUT */
        kb = KEY_INVALID;
        if (kbhit())
            kb = ReadKeyboard(1);
        EmptyKeyboardBuffer();

        fpsv = 1000 * (getvar(SEC) - fpss) + (getvar(MSEC) - fps);
        fps  = getvar(MSEC);
        fpss = getvar(SEC);

        /* Take pictures */
        if (trigger)
        {
            vmode(vmOvlStill);
            tenable();
            tpict();
            if (secondpic)
            {
                while (!trdy());
            }
        }
        else
        {
            vmode(vmOvlLive);
            /* tpict(); */
        }
        set_ovlmask(255);

        /* Working Page */
        Over = (int)OvlGetPhysPage;
        OvlSetLogPage(Over);

        if (1)   /* kb == KEY_ENTER */
        {
            /* blob detect and display biggest */
            blobtest(x, y, dx, dy, threshold, -1, display);
        }
        if (secondpic)
        {
            /* while (!trdy()); */
            /*
            U16 *rlc;
            long maxlng = 0x020000L;
            rlc = (U16 *) sysmalloc((maxlng * sizeof(U16) + 1) / sizeof(U32), MDATA);
            if (rlc == NULL) {pstr("e<DRAM Memory overrun>\n");}
            rlcmk(&SearchArea, threshold, rlc, 0x40000);
            if (0)                              // 0=no filter  1=with filter
            {
                // slc = erode2(rl1, rl2);
                // slc = dilate(rl1, rl2);
                // slc = rlc_mf(rl1, rl2, 0, 4);
            }
            rlcout(&wArea, rlc, 0, 255);
            rlcfree(rlc);
            // rlcmkf(lx, threshold, input, rlc);
            */
        }
        if (fullScreenMode)
        {
            ImageAssign(&SearchArea, ScrByteAddr(SearchAreaX, SearchAreaY),
                        SearchAreaDx, SearchAreaDy, ScrGetPitch);
            if (ovlDisplay)
            {   ImageAssign(&SearchAreaOvl, OvlByteAddr(SearchAreaX, SearchAreaY),
                            OvlSearchAreaDx, OvlSearchAreaDy, OvlGetPitch);
            }
        }
        else
        {
            ImageAssign(&SearchArea, ScrByteAddr(lastx, lasty),
                        lastdx, lastdy, ScrGetPitch);
            if ((ovlDisplay) && (lastx != -1))
            {   ImageAssign(&SearchAreaOvl, OvlByteAddr(lastx, lasty),
                            lastdx, lastdy, OvlGetPitch);
            }
        }
        DmPar.SearchArea    = &SearchArea;
        DmPar.ClosingFilter = grow;

        /**************************************************************/
        /* start reading data matrix                                  */
        /**************************************************************/
        ms  = getvar(MSEC);
        sec = getvar(SEC);
        /* main data matrix read function */
        DataMatrixReader(&DmPar);
        count++;
        if (count > MAXCOUNT)
        {
            last_dm_count = dm_count;
            dm_count = 0;
            count = 1;
        }
        ms = 1000 * (getvar(SEC) - sec) + (getvar(MSEC) - ms);
        if ((DmPar.DmError == 0) && (DmPar.DmTextLength == 3))
        {
            double avgX = 0, avgY = 0;
            int i, xx, ctr_x, ctr_y;           /* for angle */
            dm_count++;
            dm_id = DmPar.DmText[2];
            dm_x  = DmPar.DmText[1];
            dm_y  = DmPar.DmText[0];

            /********************* calculate angle ********************/
            for (i = 0; i < 4; i++)
            {
                avgX += (double)DmPar.DmPosX[i];
                avgY += (double)DmPar.DmPosY[i];
            }
            avgX = avgX / 4.0;
            avgY = avgY / 4.0;

            /* store center of IDed barcode */
            if (fullScreenMode)
            {
                ctr_x = (int) avgX;
                ctr_y = (int) avgY;
            }
            else
            {
                ctr_x = lastx + (int) avgX;
                ctr_y = lasty + (int) avgY;
            }
            /* calculate angle vector */
            avgY = (DmPar.DmPosY[0] - avgY);
            avgX = (DmPar.DmPosX[0] - avgX);
            if (avgX < 0)
                xx = 270;
            else
                xx = 90;
            avgX = atan(avgY / avgX);
            xx = (int)((avgX / 3.14159265359 * 180) + xx);
            printf("X:%f Y:%f<%d>\n", avgX, avgY, xx);

            if (streaming)
            {
                /* may want to marshall as byte array */
                sprintf(buf, "%d,%d,%d,%d,%d,%d,%d,%d,%d,%d\0",
                        lastx, lasty, lastdx, lastdy, dm_id, dm_x, dm_y, xx, 0, 0);  /* ctr_x, ctr_y */
                nt = send(sock, buf, BUF_LEN, 0);
                /* sendto(sock, buf, BUF_LEN, 0, &raddr, sizeof(sockaddr_in)); */
                if (nt > 100)
                    printf("\ne<send() failed with error %d - %d>", VCRT_geterror(nt), nt);
            }
            print("%d,%d,%d,%d,%d\n\r", lastx, lasty, dm_id, dm_x, dm_y);
        }
        else if (streaming)
        {
            sprintf(buf, "%d,%d,%d,%d\0", lastx, lasty, lastdx, lastdy);
            nt = send(sock, buf, BUF_LEN, 0);
            /* sendto(sock, buf, BUF_LEN, 0, &raddr, sizeof(sockaddr_in)); */
            if (nt > 100)
                printf("\ne<send() failed with error %d - %d>", VCRT_geterror(nt), nt);
            /* print("%d,%d\n\r", lastx, lasty); */
        }

        /* display on screen (monitor text) */
        if (ovlDisplay)
        {
            image ImgTxt;
            char ResTxt[80];
            sprintf(ResTxt, "%d,%d,%d", dm_id, dm_x, dm_y);
            framed(&SearchAreaOvl, COLBLUE);
            ImageAssign(&ImgTxt, OvlByteAddr(lastx, lasty - 12), ScreenDx, 8, OvlGetPitch);
            /* set(&ImgTxt, COLGREY); */
            if (DmPar.DmError == 0)
            {   chprint1(ResTxt, &ImgTxt, 1, 1, COLBLUE);  }
            else
            {   chprint1(ResTxt, &ImgTxt, 1, 1, COLGREY);  }
            /* ChangeColor(ImgTxt, 0, 0, COLGREY); */
        }

        /**************************************************************/
        /* output results: display data matrix and modul position     */
        /**************************************************************/
        if (ovlDisplay)
        {
            I32 ColorDark = COLRED, ColorBright = COLGREEN;
            I32 CrossSize = 1;   /* PosX = SearchAreaX, PosY = SearchAreaY; */
            /* ImageAssign(wOvl, OvlByteAddr(PosX, PosY), OvlGetColumns-PosX,
                           OvlGetRows-PosY, OvlGetPitch); */
            if (DmPar.DmError == 0)
            {
                if (fullScreenMode)
                {   DrawDataMatrix(&DmPar, &SearchAreaOvl, CrossSize, ColorDark, ColorBright);  }
                else
                {   DrawDataMatrixOff(&DmPar, &SearchArea, CrossSize, ColorDark, ColorBright,
                                      lastx - 6, lasty - 6);  }
            }
        }

        /* keyboard commands */
        if (kb == KEY_DOWN)
        {
            threshold = max(0, threshold - 1);
        }
        else if (kb == KEY_UP)
        {
            threshold = min(255, threshold + 1);
        }
        if (kb == KEY_LEFT)
        {
            grow = max(0, grow - 1);
        }
        else if (kb == KEY_RIGHT)
        {
            grow = min(16, grow + 1);
        }
        else if (kb == 'f')
        {
            fullScreenMode = not(fullScreenMode);
        }
        else if (kb == 'd')
        {
            display = not(display);
        }
        else if (kb == 'o')
        {
            OvlClearAll;
            ovlDisplay = not(ovlDisplay);
        }
        else if (kb == 't')
        {
            trigger = not(trigger);
        }
        else if (kb == 'w')
        {
            secondpic = not(secondpic);
        }
        else if (kb == 's')
        {
            pstr("\n\r");
            streaming = not(streaming);
        }
        else if (kb == 'u')      /* udp test */
        {
            int ant = sendto(sock, "test", 4, 0, &raddr, sizeof(sockaddr_in));
            /* sendto(sock, buf, BUF_LEN, 0, &raddr, sizeof(sockaddr_in)); */
            if (ant > 100)
                printf("\nsendto() failed with count %ld and error %lx - %d",
                       count, VCRT_geterror(ant), ant);
        }
        else if (kb == 'p')
        {
            print("\nthreshold at %d", threshold);
            print(" - filter at %d", grow);
            print("\nUseTrigger: %d\t BlobResult: %d\t Streaming: %d",
                  trigger, display, streaming);
            print("\nOverlay: %d\t WaitBufFull: %d\t FullScreen: %d",
                  ovlDisplay, secondpic, fullScreenMode);
            print("\nblob <%dms>", bms);
            print("  dm <%d>ms", ms);
            print("  %f.2 fps", 1000.0 / (double)fpsv);
            print("\n %d / %d DM found\n", last_dm_count, MAXCOUNT);
        }
        else if (kb == 'h')      /* help! */
        {
            pstr("\n HELP MENU");
            pstr("\n ");
            pstr("\n change threshold with <up> and <down> arrow keys");
            pstr("\n change DM filler with <left> and <right> arrow keys");
            pstr("\n enable/disable overlay: <o>");
            pstr("\n enable/disable trigger: <t>");
            pstr("\n enable/disable taking (second) threshold after blob detection: <n>");
            pstr("\n enable/disable waiting for buffer to fill before blob detection: <w>");
            pstr("\n enable/disable fullscreen data matrix decoding: <f>");
            pstr("\n enable/disable blob display (?): <d>");
            pstr("\n enable/disable streaming output: <s>");
            pstr("\n print this menu: <h>");
            pstr("\n print status screen: <p>\n");
        }
    }
    while (kb != KEY_ESC);

    /* graceful exit */
    shutdown(sock, FLAG_ABORT_CONNECTION);
    vmode(vmLive);
}
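The comma-separated packets assembled in the streaming branches above are what the C# UDPListener application in Appendix C consumes: Tracker.parseString (Section C.3) accepts either the 4-field blob-only message or the 10-field message that also carries the decoded Data Matrix values and the roll angle.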
/*****************************************************************/
void EmptyKeyboardBuffer(void)
{
    while (kbready())
        kbdrcv();
    while (rbready())
        rs232rcv();
}
#define TimeOut 10

int ReadKeyboard(int wait)
{
    int i, ch1, ch2, ch3, ch4, src;

    if (wait || kbready() || rbready())
    {
        while (1)
        {
            if (kbready())  {src = 1; break;}
            if (rbready())  {src = 0; break;}
            time_delay(1);
        }
        if (src)
            ch1 = kbdrcv();
        else
            ch1 = rs232rcv();

        if (ch1 != 0x1b)
        {
            if (ch1 == 0x0d || ch1 == 0x0a)
                return(KEY_ENTER);
            else
                return(ch1);
        }
        ch1 = KEY_ESC;

        for (i = 0; i < TimeOut; i++)
        {
            time_delay(1);
            if (src)  {if (kbready()) break;}
            else      {if (kbhit())   break;}
        }
        if (i == TimeOut)                    /* we've been waiting for too long */
            return(ch1);
        if (src)
            ch2 = kbdrcv();
        else
            ch2 = rs232rcv();

        for (i = 0; i < TimeOut; i++)
        {
            time_delay(1);
            if (src)  {if (kbready()) break;}
            else      {if (kbhit())   break;}
        }
        if (i == TimeOut)                    /* we've been waiting for too long */
            return(ch1);
        if (src)
            ch3 = kbdrcv();
        else
            ch3 = rs232rcv();

        if (ch2 == 0x4f)                     /* 2/3 of a F1 or F2 received? */
        {
            if ((ch3 == 0x71) || (ch3 == 0x50))   /* F1 received? */
                return(KEY_F1);
            if ((ch3 == 0x72) || (ch3 == 0x51))   /* F2 received? */
                return(KEY_F2);
            return(KEY_INVALID);
        }
        if (ch2 == 0x5B)                     /* 2/3 of a Cursor received? */
        {
            if (ch3 == 0x31)
            {
                for (i = 0; i < TimeOut; i++)
                {
                    time_delay(1);
                    if (src)  {if (kbready()) break;}
                    else      {if (kbhit())   break;}
                }
                if (i == TimeOut)            /* we've been waiting for too long */
                    return(ch1);
                if (src)
                    ch4 = kbdrcv();
                else
                    ch4 = rs232rcv();
                if (ch4 == 0x31)             /* F1 received? */
                    return(KEY_F1);
                if (ch4 == 0x32)             /* F2 received? */
                    return(KEY_F2);
                return(KEY_INVALID);
            }
            else
            {
                switch (ch3)
                {
                    case 0x41:
                    case 0x42:
                    case 0x43:
                    case 0x44:
                        return(ch3 - 61);
                    default:
                        return(KEY_INVALID);
                }
            }
        }
    }
    return(KEY_INVALID);
}
/******************************************************************/
/*********************** BLOB SUBROUTINES ************************/
/******************************************************************/
#define MAXOBJ 100

void blobtest(int x, int y, int dx, int dy, int threshold, int color, int display)
{
    int i, j = 0, nobj;
    image area;
    ftr f[MAXOBJ];
    ftr *f_largest = NULL;
    long side = 1, line, col, sidex = 1, sidey = 1;

    /* Search area is the whole image */
    ImageAssign(&area, ScrByteAddr(x, y), dx, dy, ScrGetPitch);

    /* feature extraction */
    nobj = find_objects(&area, f, MAXOBJ, threshold, display);
    /* print("\n\nFound %d objects (-1 means more than %d objects)", nobj, MAXOBJ); */
    /* if (color)                         white and black
           print("\nwhite objects:");
       else
           print("\nblack objects:");     */

    for (i = 0; i < nobj; i++)
    {
        /* looking for color objects (black=0, white=-1) */
        if (f[i].color == color)
        {
            if ((f_largest == NULL) || (f_largest->area < f[i].area))
            {   f_largest = &f[i];  }
            /* print via seriell at 9600 Baud takes a long time */
            /* print("\nArea %d: Size=%d Center of Gravity =%d,%d",
                     ++j, f[i].area, f[i].x_center + x, f[i].y_center + y); */
            /* cross_image(EVEN((int)f[i].y_center+y), EVEN((int)f[i].x_center+x), 10, 0xFF); */
        }
    }
    if ((f_largest != NULL) && (f_largest->area > MIN_AREA))
    {
        /* side = (long) sqrt((double)f_largest->area); */
        /* eventually use max and min */
        sidex = (f_largest->x_max - f_largest->x_min);
        sidey = (f_largest->y_max - f_largest->y_min);
        col   = f_largest->x_center;
        line  = f_largest->y_center;
        mark_overlay(col - (sidex >> 1), line - (sidey >> 1), sidex, sidey);
    }
    else if (lastx != -1)   /* remove lines and SEND SIGNAL 'no longer seen' */
    {
        image wOvl;
        ImageAssign(&wOvl, OvlByteAddr(lastx, lasty), lastdx + 1, lastdy + 1, OvlGetPitch);
        set(&wOvl, 0);
        pstr("\ne<no light visible>");
        lastx = -1;
    }
}
void mark_overlay(int x, int y, int dx, int dy)
{
    /* assumes only one overlay done last time.. */
    image wOvl;

    /* Display pictures */
    if (ovlDisplay)
    {
        if (lastx != -1)
        {
            ImageAssign(&wOvl, OvlByteAddr(lastx, lasty - 15),
                        lastdx + 1, lastdy + 16, OvlGetPitch);
            set(&wOvl, 0);
        }
        ImageAssign(&wOvl, OvlByteAddr(x, y), dx, dy, OvlGetPitch);
        framed(&wOvl, COLRED);
        markerd(&wOvl, COLBLUE);
        ellipsed(&wOvl, TRANRED);
        /* set overlay colors */
        set_overlay_bit(7, 255, 0, 0);
        set_overlay_bit(6, 0, 255, 0);
        set_overlay_bit(4, 0, 0, 255);
        set_ovlmask(255);
    }
    lastx  = max(0, x);
    lasty  = max(0, y);
    lastdx = dx;
    lastdy = dy;
}
int find_objects(image *a, ftr *f, int maxobjects, int th, int display)
{
    long maxlng = 0x020000L;
    U16 *rl1, *rl2, *slc;
    int objects;
    /* printf("\nentered find\n"); */

    /* allocate DRAM Memory */
    rl1 = rlcmalloc(maxlng);
    /* rl1 = (U16 *) sysmalloc((maxlng * sizeof(U16) + 1) / sizeof(U32), MDATA); */
    if (rl1 == NULL)  {pstr("e<DRAM Memory overrun>\n"); return(-1);}
    /* printf("\nbefore create rlc\n"); */

    /* create RLC */
    bms = getvar(MSEC);
    rl2 = rlcmk(a, th, rl1, maxlng);
    if (rl2 == NULL)  {pstr("e<RLC overrun>\n"); rlcfree(rl1); return(-1);}
    /* printf("\nafter overrun check\n"); */

    /* you can use some filter (dilate, erode) */
    if (0)                                /* 0=no filter  1=with filter */
    {
        slc = erode2(rl1, rl2);
        /* slc = dilate(rl1, rl2);       */
        /* slc = rlc_mf(rl1, rl2, 0, 4); */
    }
    else
    {
        slc = rl2;
        rl2 = rl1;
    }

    /* object labelling */
    if (sgmt(rl2, slc) == 0L)
    {   pstr("e<object number overrun>\n"); rlcfree(rl1); return(-1);  }
    /* printf("\nafter label\n"); */

    /* feature extraction */
    objects = rl_ftr2(rl2, f, maxobjects);
    /* printf("\nafter obj\n"); */
    bms = getvar(MSEC) - bms;
    if (bms < 0)  bms += 1000;

    /* display RLC */
    if (display)
        rlcout(a, rl1, 0, 255);
    /* printf("\nafter display\n"); */

    /* free allocation */
    rlcfree(rl1);
    /* printf("\nafter free\n"); */
    return(objects);
}
/******************************************************************/
void cross_image(int line, int column, int size, int color)
{
    int i;
    for (i = -size; i <= size; i++)
    {
        /* wpix(color, (U8 *)ScrByteAddr(column, line + i)); */
        /* wpix(color, (U8 *)ScrByteAddr(column + i, line)); */
        *(ScrByteAddr(column, line + i)) = color;
        *(ScrByteAddr(column + i, line)) = color;
    }
}
/******************************************************************/
/******************** DATA MATRIX SUBROUTINES ********************/
/******************************************************************/
I32 InitDataMatrixPar(DmParameter *DmPar, image *SearchArea,
                      char *OutputText, I32 OutputTextLength)
{
    /* ini input parameters                                          */
    /* please set variable Licence Code to the value offered from    */
    /* Vision Components                                             */
    DmPar->LicenceCode1  = 0x0D444D18;
    DmPar->LicenceCode2  = 0x274B6669;

    /* define data matrix search area */
    DmPar->SearchArea    = SearchArea;

    /* define decode text space */
    DmPar->DmText        = OutputText;
    DmPar->MaxTextLength = OutputTextLength;

    /* define the data matrix size */
    DmPar->DmDx          = 34;   /* pixel */
    DmPar->DmDy          = 33;   /* pixel */
    DmPar->DmDeltaSize   = 40;   /* tolerance +/- k% (0% = exactly DmDx x DmDy size, -1 = any size) */
    DmPar->PropOppLine   = 70;   /* difference of opposite lines in percent
                                    (50 = smaller line must be at least 50% of larger) */
    DmPar->PropNextLine  = 70;   /* difference of next lines in percent
                                    (20 = smaller line must be at least 20% of larger) */

    /* define data matrix parameter */
    DmPar->DmSearchColor = 0;    /* DM color (0=black / 1=white / -1=all) */
    DmPar->ModulContrast = 4;    /* contrast between modul and background in grey values */
    DmPar->DmModulSelect = 50;   /* in percent (0% = dark color / 100% = bright color) */
    DmPar->ModulNrMin    = 10;   /* minimum is 8 modules */
    DmPar->ModulNrMax    = 10;   /* 144 modules is maximum */
    DmPar->ModulSizeMin  = 1;    /* minimum is 1 pixel */
    DmPar->ModulSizeMax  = -1;

    /* define data matrix options */
    DmPar->DmRectangle   = 0;
    DmPar->DmMirrorMode  = 0;
    DmPar->ClosingFilter = 5;    /* pre filter for closing (pixel) */
    DmPar->FindZoomMin   = 0;    /* pyramid zoom in order to speed up the system for DM finding */
    DmPar->FindZoomMax   = 1;
    DmPar->ReadZoomMin   = 0;    /* pyramid zoom in order to speed up the system for DM reading */
    DmPar->ReadZoomMax   = 1;
    DmPar->DeltaAngleL   = 20;   /* max angle delta to the data matrix "L" (works only if
                                    DmSearchColor is not -1); 90 degree means 90 +/- 20 */
    DmPar->MaxReadTime   = 500;  /* max allowed reading time in ms; after the time is elapsed,
                                    the program returns with a time out error (-1 = unlimited) */
    return 0;
}
/******************************************************************/
I32 DrawDataMatrix(DmParameter *DmPar, image *Source,
                   I32 CrossSize, I32 ColorDark, I32 ColorBright)
{
    return DrawDataMatrixOff(DmPar, Source, CrossSize, ColorDark, ColorBright, 0, 0);
}
/******************************************************************/
I32 DrawDataMatrixOff(DmParameter *DmPar, image *Source,
                      I32 CrossSize, I32 ColorDark, I32 ColorBright,
                      int off_x, int off_y)
{
    I32 i, x, y, Color, ModulX, ModulY, PosX[4], PosY[4];
    I32 **ModulPosX, **ModulPosY;
    U8  **ModulValue, **ModulThresh;

    if (DmPar->DmError)
    {
        return -1;
    }

    /* ini variables */
    ModulPosX   = DmPar->ModulPosX;
    ModulPosY   = DmPar->ModulPosY;
    ModulValue  = DmPar->ModulValues;
    ModulThresh = DmPar->ModulThresh;
    if (DmPar->DmColor == 0)
        Color = ColorDark;
    else
        Color = ColorBright;

    for (i = 0; i < 4; i++)  PosX[i] = DmPar->DmPosX[i] + off_x;
    for (i = 0; i < 4; i++)  PosY[i] = DmPar->DmPosY[i] + off_y;
    /* for (i = 0; i < 4; i++) print("PosX[%d]=%3d  PosY[%d]=%3d\n", i, PosX[i], i, PosY[i]); */

    /* draw data matrix border */
    for (i = 0; i < 4; i++)
        linex(Source, PosX[i], PosY[i], PosX[(i + 1) & 0x03], PosY[(i + 1) & 0x03], ColorBright);

    /* draw data matrix point 0 */
    MarkCross(Source, PosX[0], PosY[0], ColorDark, 4 * CrossSize);

    /* draw data matrix moduls */
#ifdef TESTING_VERSION2
    for (y = DmPar->ModulStartY; y <= DmPar->ModulEndY; y++)
    {
        for (x = DmPar->ModulStartX; x <= DmPar->ModulEndX; x++)
        {
            ModulX = ModulPosX[y][x];
            ModulY = ModulPosY[y][x];
            if (ModulValue[y][x] > ModulThresh[y][x])
                Color = ColorDark;
            else
                Color = ColorBright;
            MarkCross(Source, ModulX, ModulY, Color, CrossSize);
        }
    }
#endif
    return 0;
}

/******************************************************************/
void ChangeColor(image *area, I32 start, I32 end, I32 newcol)
{
    I32 i, j, dx, dy, pitch;
    U8 *restrict ppix, *addr;

    addr  = area->st;
    dx    = area->dx;
    dy    = area->dy;
    pitch = area->pitch;
    for (j = 0; j < dy; j++)
    {
        ppix = addr + (pitch * j);
        for (i = 0; i < dx; i++)
        {
            if ((*ppix >= start) && (*ppix <= end))
                *ppix = newcol;
            ppix++;
        }
    }
}
/******************************************************************/
Appendix C
C Sharp Code
C.1 Program.cs

using System;
using System.Collections.Generic;
using System.Linq;
using System.Windows.Forms;

namespace UDPListener
{
    static class Program
    {
        /// <summary>
        /// The main entry point for the application
        /// </summary>
        static Form1 mainForm;

        [STAThread]
        static void Main()
        {
            Application.EnableVisualStyles();
            Application.SetCompatibleTextRenderingDefault(false);
            mainForm = new Form1();
            Application.Run(mainForm);
        }
    }
}
C.2 Form1.cs

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Net;
using System.Net.Sockets;
using System.Threading;
using System.Collections;

namespace UDPListener
{
    public partial class Form1 : Form
    {
        Tracker myTracker;
        // static UdpClient server;
        DateTime timestarted;
        public delegate void SetTextCallback(string txt);
        byte[] them = { 157, 58, 60, 63 };
        byte[] us   = { 157, 58, 60, 69 };

        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            UdpClient uSend = new UdpClient(5005);   // 4004, AddressFamily.Unix
            IPAddress ipa = new System.Net.IPAddress(us);
            System.Net.IPEndPoint iep = new System.Net.IPEndPoint(ipa, 4004);
            // byte[] temp = server.Receive(ref iep);
            // System.Windows.Forms.MessageBox.Show(temp.ToString());
            uSend.EnableBroadcast = true;
            uSend.Send(System.Text.Encoding.ASCII.GetBytes(textBox1.Text.ToCharArray()),
                       textBox1.Text.Length, iep);
            uSend.Close();
        }

        private void button2_Click(object sender, EventArgs e)
        {
            textBox2.Text = "";
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            myTracker = new Tracker(pictureBox1, timer1);
            timestarted = DateTime.Now;
            backgroundWorker1.RunWorkerAsync();
        }

        public void safeAppend(string txt)
        {
            if (textBox2.InvokeRequired)
            {
                SetTextCallback c = new SetTextCallback(safeAppend);
                this.Invoke(c, new object[] { txt });   // "I_" + txt
            }
            else
            {
                int ret = 0;
                try
                {
                    ret = myTracker.parseString(txt);
                }
                catch (Exception e)
                {
                    txt = "<unparsed>" + txt;   // nothing yet.
                }
                if (textBox2.Text.Length > 2000)
                    textBox2.Text = "";
                textBox2.Text += txt;
                if (ret != 3)
                    textBox2.Text += "\r\n";
                textBox2.Select(textBox2.Text.Length - 2, 1);
                textBox2.ScrollToCaret();
            }
        }

        private void backgroundWorker1_DoWork(object sender, DoWorkEventArgs e)
        {
            int sampleUdpPort = 4004;
            IPHostEntry localHostEntry;
            timestarted = DateTime.Now;
            try
            {
                // Create a UDP socket.
                Socket soUdp = new Socket(AddressFamily.InterNetwork,
                                          SocketType.Dgram, ProtocolType.Udp);
                try
                {
                    localHostEntry = Dns.GetHostByName(Dns.GetHostName());
                }
                catch (Exception)
                {
                    Console.WriteLine("Local Host not found");   // fail
                    return;
                }
                IPEndPoint localIpEndPoint =
                    new IPEndPoint(localHostEntry.AddressList[0], sampleUdpPort);
                soUdp.Bind(localIpEndPoint);
                while (true)
                {
                    Byte[] received = new Byte[256];
                    IPEndPoint tmpIpEndPoint =
                        new IPEndPoint(localHostEntry.AddressList[0], sampleUdpPort);
                    EndPoint remoteEP = (tmpIpEndPoint);
                    int bytesReceived = soUdp.ReceiveFrom(received, ref remoteEP);
                    String dataReceived = System.Text.Encoding.ASCII.GetString(received);
                    safeAppend(dataReceived);
                    /* String returningString = "The Server got your message through UDP:"
                                                + dataReceived;
                       Byte[] returningByte =
                           System.Text.Encoding.ASCII.GetBytes(returningString.ToCharArray());
                       soUdp.SendTo(returningByte, remoteEP); */
                }
            }
            catch (SocketException se)
            {
                Console.WriteLine("A Socket Exception has occurred!" + se.ToString());
            }
        }

        private void timer1_Tick(object sender, EventArgs e)
        {
            myTracker.drawPoint();
        }

        private void pictureBox1_Resize(object sender, EventArgs e)
        {
            myTracker.resizeControl(this.pictureBox1);
        }
    }
}
C.3 Tracker.cs

using System;
using System.Collections.Generic;
using System.Collections;
using System.Linq;
using System.Text;
using System.Drawing;

namespace UDPListener
{
    class Tracker   // essentially the controller
    {
        const int ROLL_DISP_FACTOR = 2;
        const int TILT_DISP_FACTOR = 3;
        Point home, angle;
        const int vertSpan = 16, horizSpan = 20;
        // private ArrayList _pens;
        BoPen myPen = null;
        System.Windows.Forms.Control myControl;
        System.Windows.Forms.Timer myTimer;
        Graphics myG = null;

        public void drawPoint()
        {
            if (myPen.x == -1)
                return;
            int x = myPen.x;
            int y = myPen.y;
            int z = myPen.z * ROLL_DISP_FACTOR;
            home = new Point(x, y);
            myG.Clear(Color.White);
            // test
            myG.FillEllipse(Brushes.LightBlue, home.X - z, home.Y - z, z << 1, z << 1);
            if (!myPen.angleOutdated)   // still reading DMs
            {
                angle = new Point(
                    (int)((z) * Math.Cos(myPen.lastTheta * Math.PI / 180)),
                    (int)((z) * Math.Sin(myPen.lastTheta * Math.PI / 180)));
                angle.Offset(home.X, home.Y);
                Point tri1 = home;
                tri1.Offset(1 + angle.Y >> 3, 1 - angle.X >> 3);
                Point tri2 = home;
                tri2.Offset(1 - angle.Y >> 3, 1 + angle.X >> 3);
                Point tri3 = angle;
                Point[] triangle = { tri1, tri2, tri3 };
                myG.FillPolygon(Brushes.White, triangle);   // Draw Orientation
            }
            // Draw tilt
            Point tiltX = home;
            Point tiltY = home;
            tiltX.Offset(myPen.tiltX * TILT_DISP_FACTOR, 0);
            tiltY.Offset(0, myPen.tiltY * TILT_DISP_FACTOR);
            myG.DrawLine(Pens.Black, home, tiltX);
            myG.DrawLine(Pens.Black, home, tiltY);
            myG.DrawRectangle(Pens.Black, home.X - 1, home.Y - 1, 2, 2);   // Draw Center
        }

        public bool resizeControl(System.Windows.Forms.Control p)
        {
            myG = Graphics.FromHwnd(myControl.Handle);
            if (myPen == null)
            {
                home = new Point(p.Width / 2, p.Height / 2);
                angle = home;
                return false;
            }
            myPen.screenResize(p.Width, p.Height);
            return true;
        }

        public Tracker(System.Windows.Forms.Control p, System.Windows.Forms.Timer t)
        {
            // set up view
            myControl = p;
            myTimer = t;
            resizeControl(p);
        }

        public int parseString(string given)
        {
            given = given.Remove(given.IndexOf('\0'));
            string[] parseMe = given.Split(',');
            if (parseMe.Length == 4)
            {
                if (myPen == null)
                {
                    myPen = new BoPen(int.Parse(parseMe[0]), int.Parse(parseMe[1]),
                                      int.Parse(parseMe[2]), int.Parse(parseMe[3].TrimEnd()));
                    myPen.screenResize(myControl.Width, myControl.Height);
                    myTimer.Enabled = true;
                    // myTimer.Start();
                }
                else
                    myPen.update3(int.Parse(parseMe[0]), int.Parse(parseMe[1]),
                                  int.Parse(parseMe[2]), int.Parse(parseMe[3].TrimEnd()));
                return 3;
            }
            else if (parseMe.Length == 10)
            {
                if (myPen == null)
                {
                    myPen = new BoPen(int.Parse(parseMe[0]), int.Parse(parseMe[1]),
                                      int.Parse(parseMe[2]), int.Parse(parseMe[3]),
                                      int.Parse(parseMe[4]), int.Parse(parseMe[5]),
                                      int.Parse(parseMe[6]), int.Parse(parseMe[7]),
                                      int.Parse(parseMe[8]), int.Parse(parseMe[9].TrimEnd()));
                    myPen.screenResize(myControl.Width, myControl.Height);
                    myTimer.Enabled = true;
                    // myTimer.Start();
                }
                else
                    myPen.update7(int.Parse(parseMe[0]), int.Parse(parseMe[1]),
                                  int.Parse(parseMe[2]), int.Parse(parseMe[3]),
                                  int.Parse(parseMe[4]), int.Parse(parseMe[5]),
                                  int.Parse(parseMe[6]), int.Parse(parseMe[7]),
                                  int.Parse(parseMe[8]), int.Parse(parseMe[9].TrimEnd()));
                return 6;
            }
            else
            {
                throw(new Exception("unexpected number of datum in update string"));
            }
        }
    }
}
C.4 BoPen.cs

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace UDPListener
{   // 294,169,119,36,88,46
    class BoPen
    {
        public delegate void LimitsChangedDelegate(object sender, EventArgs e);
        const int moveToleranceX = 3;
        const int moveToleranceY = 3;
        const int moveToleranceZ = 3;
        int screenWidth;
        int screenHeight;
        int screenDepth;
        private static int _numPens = 0;
        private int _id = -1;
        private bool _currentlyVisible = false;
        private DateTime _lastSeen = DateTime.MinValue,
                         _lastMoved = DateTime.MinValue;
        private TimeSpan _dwellTime = TimeSpan.MinValue;
        private int _lastX = -1 - moveToleranceX,
                    _lastY = -1 - moveToleranceY,
                    _lastZ = -1 - moveToleranceZ,
                    _lastPhi = -1, _lastRho = -1, _lastTheta = -1,
                    _tiltX = -1, _tiltY = -1;
        private int basePhi = 6, baseRho = 5;
        private int _maxX = 624, _minX = 0;
        private int _maxY = 468, _minY = 0;
        private int _maxZ = 140, _minZ = 45;    // 190 and 45 observed max
        private bool autoResize = false;
        private double radTheta = -1;

        /*** NOTE THE USE OF THIS Fx ***/
        public void screenResize(int width, int height)
        {
            screenHeight = height;
            screenWidth = width;
            screenDepth = Math.Max(height, width) / 20;
        }

        public BoPen(int x, int y, int z1, int z2)
        {
            update3(x, y, z1, z2);
            _BoPen();
        }

        public void update3(int x, int y, int z1, int z2)
        {
            if (x == -1)
            {
                _currentlyVisible = false;
            }
            else
            {
                int z = Math.Max(z1, z2);
                x += z1 / 2;   // convert to center of blob
                y += z2 / 2;
                _currentlyVisible = true;
                _lastSeen = DateTime.Now;
                if (((_lastX > 0) & (Math.Abs(_lastX - x) < moveToleranceX)) &&
                    ((_lastY > 0) & (Math.Abs(_lastY - y) < moveToleranceY)) &
                    ((_lastZ > 0) & (Math.Abs(_lastZ - z) < moveToleranceZ)))   // test if have moved
                {
                    _dwellTime = _lastSeen - _lastMoved;
                    if (_dwellTime.Seconds > 1)
                    {
                        // dispatch click
                    }
                    // still use the data in determining other info like extent and vel
                }
                // else
                lastX = x;
                lastY = y;
                lastZ = z;
                _lastMoved = DateTime.Now;
                // dispatch update
                // this way dwell time is determined by how long you stay within an
                // initial position, cant jus...
            }
        }

        private int scalePatternDistance(int dist)
        {
            return dist << 2;
        }

        private void updateTilt(int phi, int rho)
        {
            // eventually baseRho and basePhi are functions of position
            _lastPhi = phi;
            _lastRho = rho;
            _tiltX = (int)((scalePatternDistance(rho - baseRho) * Math.Sin(radTheta)) +
                           (scalePatternDistance(phi - basePhi) * Math.Cos(radTheta)));
            _tiltY = (int)((scalePatternDistance(phi - basePhi) * Math.Sin(radTheta)) +
                           (scalePatternDistance(rho - baseRho) * Math.Cos(radTheta)));
            // phi - basePhi;   // rho - baseRho;
        }

        private void updateRoll(int theta)
        {
            _lastTheta = theta;
            radTheta = (Math.PI * (double)(theta + 45) / 180.0);
        }

        public void update7(int x, int y, int z1, int z2, int id, int phi,
                            int rho, int theta, int dm_x, int dm_y)
        {
            _currentlyVisible = true;
            if (_id == -1)
            {
                _id = id;
            }
            if (_id == id)   // decoded info is valid
            {
                updateRoll(theta);
                updateTilt(phi, rho);
                update3(x, y, z1, z2);
            }
            // else: bad -- the first ID is incorrect, b/c there's no way to change it.
        }

        public BoPen(int x, int y, int z1, int z2, int id, int phi,
                     int rho, int theta, int dm_x, int dm_y)
        {
            update7(x, y, z1, z2, id, phi, rho, theta, dm_x, dm_y);
            _BoPen();
        }

        public BoPen()
        {
            _BoPen();
        }

        private void _BoPen()
        {
            _numPens++;   // increment number of pens
        }

        ~BoPen()
        {
            _numPens--;
        }

        public int lastX
        {
            get { return _lastX; }
            private set
            {
                _lastX = value;
                if (autoResize)
                {
                    _maxX = Math.Max(_lastX, _maxX);
                    _minX = Math.Min(_lastX, _minX);
                }
            }
        }

        public int lastY
        {
            get { return _lastY; }
            private set
            {
                _lastY = value;
                if (autoResize)
                {
                    _maxY = Math.Max(_lastY, _maxY);
                    _minY = Math.Min(_lastY, _minY);
                }
            }
        }

        public int lastZ
        {
            get { return _lastZ; }
            private set
            {
                _lastZ = value;
                if (autoResize)
                {
                    _maxZ = Math.Max(_lastZ, _maxZ);
                    _minZ = Math.Min(_lastZ, _minZ);
                }
            }
        }

        public int x
        {
            get
            {
                if (_maxX != _minX)
                    return (screenWidth -
                            (int)(((float)(_lastX * screenWidth) / (float)(_maxX - _minX))));
                else
                    return screenWidth / 2;
            }
        }

        public int y
        {
            get
            {
                if (_maxY != _minY)
                    return ((int)((float)(_lastY * screenHeight) / (float)(_maxY - _minY)));
                else
                    return screenHeight / 2;
            }
        }

        public int z
        {
            get
            {
                if (_maxZ != _minZ)
                    return (screenDepth / 2 +
                            ((int)(100 * Math.Log10((float)(_lastZ) / (float)(_maxZ - _minZ)))));
                else
                    return screenDepth / 2;
            }
        }

        public int lastTheta
        {
            get { return _lastTheta; }
        }

        /* public int lastRho
        {
            get { return _lastRho; }
        }
        public int lastPhi
        {
            get { return _lastPhi; }
        } */

        public int tiltX
        {
            get { return _tiltX; }
        }

        public int tiltY
        {
            get { return _tiltY; }
        }

        public bool angleOutdated
        {
            // in future use time-based approach
            get
            {
                return (!(_currentlyVisible && (_lastPhi > 0)));
            }
        }
    }
}
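To make the packet format concrete, the short stand-alone C# harness below splits a sample message the same way Tracker.parseString does. It is only an illustrative sketch: the sample string and the PacketFormatDemo class are hypothetical and not part of the thesis software, and the field order follows the streaming sprintf call in Appendix B as reconstructed above.

using System;

// Hypothetical console harness (not part of the BoPen software).
// Fields: blob x, y, dx, dy, then Data Matrix id, dm_x, dm_y, roll angle, two unused fields.
class PacketFormatDemo
{
    static void Main()
    {
        string sample = "294,169,119,36,88,46,12,90,0,0\0";    // assumed example packet
        string[] fields = sample.Remove(sample.IndexOf('\0')).Split(',');
        Console.WriteLine("field count: " + fields.Length);    // 10, so the update7() path is taken
        foreach (string f in fields)
            Console.Write(int.Parse(f.TrimEnd()) + " ");
        Console.WriteLine();
    }
}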