The BoPen: A Tangible Pointer Tracked in Six Degrees of Freedom

by

Daniel Matthew Taub

S.B., EECS, M.I.T., 2006

Submitted to the Department of Electrical Engineering and Computer Science
in partial fulfillment of the requirements for the degree of
Master of Engineering in Electrical Engineering and Computer Science
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
September 2009

© Massachusetts Institute of Technology 2009. All rights reserved.

Author: Department of Electrical Engineering and Computer Science, August 21, 2009

Certified by: Ramesh Raskar, Associate Professor, Thesis Supervisor

Accepted by: Dr. Christopher J. Terman, Chairman, Department Committee on Graduate Theses
The BoPen: A Tangible Pointer Tracked in Six Degrees of
Freedom
by
Daniel Matthew Taub
Submitted to the Department of Electrical Engineering and Computer Science
on August 21, 2009, in partial fulfillment of the
requirements for the degree of
Master of Engineering in Electrical Engineering and Computer Science
Abstract
In this thesis, I designed and implemented an optical system for freehand interaction
in six degrees of freedom. A single camera captures a pen's location and orientation,
including roll, tilt, x, y, and z, by reading information encoded in a pattern at the
infinite focal plane. The pattern design, server-side processing, and application demo
software is written in Microsoft C#.NET, while the client-side pattern recognition is
integrated into a camera with an on-board DSP and programmed in C. We evaluate
a number of prototypes and consider the pen's potential as an input device.
Thesis Supervisor: Ramesh Raskar
Title: Associate Professor
Acknowledgments
I would first like to thank the creators of the Bokode technology: Ankit Mohan, Grace
Woo, Shinsaku Hiura, Quinn Smithwick, and my advisor, Ramesh Raskar. Special
thanks to Ankit for countless instances of support and advice. Thank you to Shahram
Izadi and Steve Hodges of Microsoft for being so interested in a collaboration and
for great conversations about HCI. Without the help of Paul Dietz at Microsoft, the
collaboration would never have happened in the first place.
David Molyneaux at MSRC provided considerable support to me, especially during
long nights at the lab. He also helped with editing, as did Susanne Seitinger.
Though I have never met him, I am indebted to Jean Renard Ward for his extensive
bibliography on the history of pens and handwriting recognition, which made my
related works section both more interesting and immensely more time-consuming.
Special thanks to Daniel Saakes for doing the mock-up BoPen graphic and for his
valuable contributions to the final pen design. Also thanks to Tom Lutz for tolerating
my frequent iterations on the pen design and for training me in all the shop tools, and
thank you to MSRC for footing the bill when I had to make more pens. Bob from FS
Systems graciously lent me his smart camera and served as an excellent liaison with
the team at Vision Components, who were extremely helpful in modifying their data
matrix decoding library to work faster and more accurately with our setup.
For giving me a great background in HCI before I started this project, I would
like to thank Randy Davis and Rob Miller. Professor Davis taught me more than
I ever wanted to know about multimodal interfaces and Professor Miller gave me a
firm-but-fair grounding in traditional HCI.
I take for granted that my family has always supported me, but I shall thank
my parents for trusting that my decisions are sound and for only pushing me just
enough. Thanks to my sister for her sense of humor, and for smiling sometimes.
Unspeakable appreciation to Virginia Fisher for lending her incredible strength and
support. Looking forward to our future together provided immeasurable motivation.
Contents

1 Introduction
  1.1 Motivation
  1.2 Research Question
    1.2.1 Creating real-time Bokode tracking and decoding
    1.2.2 Co-locating Bokode projector with a display
    1.2.3 Enabling interaction with system in place
  1.3 Outline

2 Related Works
  2.1 Pointing Devices
    2.1.1 Mice
    2.1.2 Pens
    2.1.3 Other Pointers
  2.2 Blending Reality and Virtuality
    2.2.1 Tangible
    2.2.2 Mixed-Reality Interfaces
  2.3 Fiducials
  2.4 Bimanual Interaction
  2.5 Conclusion

3 Design Development
  3.1 BoPen Overview
    3.1.1 Bokode Optics
    3.1.2 Preliminary Pattern
  3.2 Building the BoPen
  3.3 Software and System Iteration
    3.3.1 Replicating Bokode
    3.3.2 Diffuser Experiment
    3.3.3 Live Tracker
    3.3.4 Lessons Learned

4 Final Implementation
  4.1 SecondLight
  4.2 Optical Considerations
    4.2.1 Pattern Choices
    4.2.2 Lenses
  4.3 The Pens
  4.4 Hardware and Software
    4.4.1 Smart Camera Approach
    4.4.2 Vision Server Approach
  4.5 Interaction Vision

5 Evaluation
  5.1 Pen Design Verification
    5.1.1 Pattern Comparison
    5.1.2 Optical Idiosyncrasies
    5.1.3 Best Choice
  5.2 Interface Design Verification
    5.2.1 Limitations
    5.2.2 Suggested Improvements
    5.2.3 Interface Potential

6 Conclusions
  6.1 Contribution
  6.2 Relevance
  6.3 Application Extensions
    6.3.1 GUI Extensions
    6.3.2 Tangible and Direct Manipulation
    6.3.3 Multi-user Interaction
  6.4 Device Future
  6.5 Outlook

A MATLAB Code

B C Code

C C Sharp Code
  C.1 Program.cs
  C.2 Form1.cs
  C.3 Tracker.cs
  C.4 BoPen.cs
List of Figures

2-1 Mixed-Reality Continuum
2-2 Two-Dimensional Barcodes
3-1 BoPen: Basic Design
3-2 Basic Optical Models
3-3 Data Matrix Grid Pattern
3-4 Bokode Versus BoPen
3-5 Components of final pen design
3-6 Table-Based Setup
3-7 Software Version One
3-8 Version One Failure
3-9 Diffuser-Based FOV Enhancement
3-10 Mock-up of Diffuser Imaging Setup
3-11 Pipeline for Software Version Two
3-12 Version Two With Multiple Pens
4-1 SecondLight Modifications
4-2 Pattern Designs
4-3 Alternate Pattern Design for Template Matching
4-4 The Pens!
4-5 Pen: Exploded View
4-6 Film Masks
4-7 BoPen Debug Display
5-1 Spaced Data Matrices
5-2 Reflection Chamber
6-1 Vision for Public Display
List of Tables

4.1 Lens Properties for Each Pen
5.1 Images of Various Patterns
5.2 Paper Test Recognition Rates
5.3 Images of Various Bokehs
Chapter 1
Introduction
1.1 Motivation
When Douglas Engelbart and colleagues created the first Graphical User Interface in
1965-1968 [25], few could have predicted that, 30 years later, it would become the
dominant method for humans to interact with computers. It was only a few years later
that Xerox created the Alto, but it was built as a research machine [88]. Not until
Apple's Macintosh XL was released in 1984 did GUI and pointing-based computers
really start taking off [40]. By 1993-almost a decade later-Microsoft® had taken the
lead in software with more than 25 million licensed users of Windows [67]. Today,
users the world over have seen very little change in terms of mainstream interface
design.
Despite the popularization of the Apple iPod and iPhone, 90% of users
operate a mouse on a daily basis, and the mouse and PC pairing has 97% prevalence
[74].
Whilst this 40-year-old technology has grown to widespread use and acceptance,
other interesting pointing interface technologies have been created and are being
used in research, academic, and business settings. Digital pens and (multi-)touch
screens are two of the most popular of these modern interfaces. Most people are
familiar with pens as writing utensils, but many also have come to know their digital
counterparts via signature pads, tablet PCs, stylus-capable phones and PDAs, and
digital art tablets. Our comfort and familiarity with pens for writing, marking, and
gesturing contribute to our cultural familiarity with expressing ourselves with pens
and pen-based devices.
Touch interfaces, on the other hand, draw on our early practice with object manipulation tasks by creating a "magic finger" analogy. Arguably, the touch-activated
screens we are familiar with from ATMs and public kiosks were-initially-extensions
of the physically actuated buttons on mechanical apparatuses, telephones, TV remotes,
and other electromechanical devices. Even with screen-based touch interfaces, this
interaction paradigm has not changed much; with a few exceptions, touch devices are
physically flat and permit surface-based interactions in only two dimensions. Some
gestural interfaces omit the surface entirely [12], while others can make it seem that
any surface is interactive [69]. The magic fingers remain, but the removal of haptic feedback can seriously distance these interfaces from the object manipulation
metaphor.
In contrast, digital pen-based interfaces by definition include a physical
component-the pen-shaped tool-which acts as a liaison between the physical and
digital realms. To accomplish a similar result with touch-based interactions requires
touch-responsive props in addition to or in place of the standard two-dimensional
surface. At a certain point after adding these props, the interface no longer falls
under the category of "touch interfaces". Instead, it is called "tangible," and in many
cases, touch becomes only one of many diverse ways to interact with the object.
Pen-based interfaces have other affordances, in addition to tangibility. Though touch
screens have been increasing in popularity [14], pens are generally understood to be
more accurate [38]. Where speed is concerned, pens are nearly indistinguishable from
mice under the Fitts' paradigm [2]. Pens are also used for more than just mouse-like
pointing. Higher-end commercial pen interfaces support both rolling and tilting for
precise drawing and sculpting actions [96].
Digital pens still fall short compared to tangible and augmented reality where
flexibility is concerned. In augmented reality systems, everyday objects recognized
by vision or tagged by a fiduciary marker may take on an arbitrary meaning that is
conveyed or enhanced using sensory overlays [68]. Likewise, interactions with external
physical objects can be enabled if the pen position can be detected reliably on those
objects. However, pen interfaces are still largely limited to the scenario of a single
user interacting with a nearby tablet or other flat surface.
1.2 Research Question
What if one could combine the affordances of a familiar physical object, like those
of a pen, with the flexibility of an augmented reality system? We believe the result
would be something like the widely popular mouse, but providing a more comfortable
way to exert control in up to six degrees of freedom (DOFs). A small, lightweight,
wireless pen could provide an ideal balance between the fatiguing bulk of a six-DOF
mouse and the lack of haptic feedback in hand gesture and finger-tracking systems.
This thesis presents a unique application of the "Bokode" optical system first
described by [70] to the Microsoft SecondLight multi-touch and multi-display table
[46]
to enable pen-based interaction supporting three-dimensional translation, two-
dimensional tilt, and rotation using no more than a camera, a light source, and some
inexpensive optical components (Figure 3-1). In the Bokode setup, a matrix of codes
in each pen simultaneously provides both identification and location information directly, using a microscopic array of fiducials. Informed by augmented reality research
as well as pen interface technologies, we endeavor to demonstrate the first real-time
application of the Bokode as well as the first application programming interface (API)
for this technology.
1.2.1 Creating real-time Bokode tracking and decoding
As we shall explain in Section 3.3.4, decoding Data Matrices can be a slow process.
Human perceptual psychology demands system response times between 100 and 200
milliseconds to fulfill our expectations of interaction [21], so rapid response dispatching is an essential component of a good interface. We plan to explore both hardware
and software solutions to the decoding delay problem.
1.2.2 Co-locating Bokode projector with a display
Due to the optical nature of the Bokode system (described in Section 3.1.1), traditional diffusive surface-based projection and LED/LCD display technology prevents
the Bokode signal from reaching the camera. We discuss our integration of the Bokode
with the Microsoft SecondLight multi-touch and multi-display system as one solution
to this problem.
1.2.3 Enabling interaction with system in place
Bokode pens encode rotation angle, unique ID, and tilt in the projected patterns.
Creating an API to access this data from higher-level software is necessary to connect
the lower-level aspects of the interface to the applications. We shall describe the
design for an API and give some examples of applications that could utilize it.
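To make the shape of such an API concrete, the sketch below shows one possible per-pen event structure and callback that a client application could consume. It is illustrative only: the names BoPenEvent and BoPenListener are hypothetical, written here in C++ for brevity, whereas the interface actually implemented in this work is the C# code listed in Appendix C.

    // Hypothetical sketch of the per-frame data a BoPen API could expose to applications.
    // Names and fields are illustrative, not the interface implemented in Appendix C.
    #include <functional>

    struct BoPenEvent {
        int   penId;   // identification byte shared by all Data Matrices in one pen
        float x, y;    // position of the bokeh centroid on the interaction surface
        float z;       // estimated height above the surface
        float tiltX;   // tilt component recovered from the decoded DM column
        float tiltY;   // tilt component recovered from the decoded DM row
        float roll;    // rotation about the pen axis, from the orientation of the DM
    };

    // An application registers one callback and receives an event per tracked pen per frame.
    using BoPenListener = std::function<void(const BoPenEvent&)>;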
1.3 Outline
Chapter 2 provides the background and a literature review for this project.
We
convey a history of pointers-mouse, pen, and other-in human-computer interactions
(HCI). Then, we detail a variety of modern interface inventions, relating them back
to the current work. Finally, we describe motivating research from the ubiquitous
computing domain, where fiduciary markers bridge augmented reality and tangible
interfaces.
In Chapter 3, we describe how the system works, optically and computationally,
followed by a presentation of design iterations, starting with our proof-of-concept
system: a replication of the experiments in [70] using pre-recorded high-resolution
video. We detail the evolution of the design with an analysis of our second system,
which provides real-time tracking and delayed-time Data Matrix decoding with a
lower-resolution (and more conventional) CCD sensor.
Chapter 4 describes the structure of our current system and its implementation as
part of a collaboration with the Microsoft SecondLight project. Chapter 5 evaluates
our system in its current form and describes both its limitations and the methods
we've used to enhance performance. We close in Chapter 6 with a description of our
plan to develop the system into a fully-functional interface, laying out some possible
research extensions and promising application areas.
Chapter 2
Related Works
2.1 Pointing Devices

2.1.1 Mice
The most basic and best-known example of a pointing device is the standard two-dimensional mouse, a device which has changed little since its first public demonstration by Engelbart, English, et al. in December 1968 [35]. As modern GUI-based
applications often demand more than the standard two degrees of freedom (DOFs)
for optimum usability, modifications to the 2D mouse tend to address this issue in
similar ways-with a single, small addition. One of the most common additions is
that of the scroll wheel [94], a small rotating disk that is usually placed between the
two mouse buttons and provides one additional DOF. Another similar design, the JSMouse, enables two additional DOFs by placing an IBM TrackPoint III™ miniature
isometric joystick in the familiar scroll wheel position [100]. The three- or four-DOF
devices resulting from these modifications can enable more efficient zooming, rapid
browsing, and virtual object manipulation. Many of the more common modern-day
mice draw heavily from one or both of these examples [3].
The aforementioned devices may reduce the need for menu traversal or key combinations and therefore increase efficiency, but they leave the mouse design essentially
unchanged, save a small addition. Preserving the object manipulation metaphor inherent in the traditional mouse results in more fundamental design changes. Take
for example the Rockin'Mouse [4]. In this device, an internal tilt sensor enables the
same four DOFs but as a more natural extension of the traditional mouse; a rounded
bottom affords the user a means to figuratively and literally grasp the two additional
DOFs, and there is no additional nub or wheel to find; all actions can be performed
by directly manipulating the control object.
This kind of design is compelling-it allows the user to move a mouse in multiple
directions, mapping that movement to the motion of a virtual object. However, the
Rockin'Mouse is limited in the scope of its applications. For example, in the case
where roll (rotation around the axis normal to the table's plane) is more important
than tilt, this particular device would no longer be as intuitive to use, and a new
mouse would need to be created. This new mouse would likely look something like
the prototypes devised by MacKenzie et al. [61] and Fallman et al. [26]. These mice
are both based on the idea that two standard ball- or optical- mouse elements can
be combined in one device, giving the user the option of manipulating two connected
yet fixed-distance cursors at the same time-moving around a line instead of a point.
The direct manipulation of a widget that represents some virtual object is an
attractive interface paradigm-so alluring, in fact, that the development of so-called
"tangible" interfaces has broken off as a field of interaction research in its own right.
(See Section 2.2.1)
Two examples of highly-manipulable mice are the GlobeMouse and the GlobeFish
[30], which provide independent control of translation and rotation in three dimensions
each, for a total of six DOFs. The GlobeMouse places a three-DOF trackball on top of
a rotation-capable mouse, and is meant to be operated with one hand. The GlobeFish
is operated bi-manually and uses three nested frames to implement 3D translation
with a three-DOF trackball for rotation.
The authors report that both of these devices
require considerable force relative to other mice, with the GlobeFish causing fairly
rapid motor fatigue in their users. The authors were careful to consider that direct
manipulation was not necessarily the best rule to follow for all interfaces, reporting
that 3 DOF (translation) + 3 DOF (rotation) is better suited to docking tasks than
a device with the more-direct 1 × 6-DOF configuration.
The VideoMouse [43] is a true 1 × 6-DOF direct manipulation interface; it supports
roll, two-DOF rocking, two-DOF translation, and some amount of height sensing.
This device uses computer vision to process images captured by a camera underneath
the mouse. Additionally, a specially-patterned mousepad can act as a handle for the
non-dominant hand to assist in orientation adjustments, and the same camera can be
used for scanning short segments of text. However, there was no data reported from
user studies, so the usability of the VideoMouse is relatively unknown. However, it
provides considerable insight into how one might design a 1 × 6-DOF pen user interface.
Indeed, the BoPen is very similar to the VideoMouse-except that it replaces the
camera with a projector, the mousepad with a camera, and the focused lenses with
out-of-focus ones.
2.1.2 Pens
The use of digital pens predates the popular introduction of mice by more than a
decade. In the mid-1950s the US military established the Semi-Automatic Ground
Environment (SAGE) to command and control air defenses in the event of a Soviet
attack. The radar consoles used with SAGE included "light-pens" for selecting which
blips to track [24]. Following SAGE, light-pen research continued to be funded (like
most computational technologies of the era) primarily through military contracts [71].
Ivan Sutherland's 1963 work, "Sketchpad: A man-machine graphical communication
system," leveraged the light-pen for drawing and subsequent selection, movement,
and alignment of drawn elements and was completed while Sutherland was working
at MIT's Lincoln Laboratory [86]. As an interesting side-note, Sutherland's work was
one of the first examples of a real-time screen interface, where manipulating elements
on the display directly modified the contents of the computer's memory.
By the time the first mouse was introduced, digital pens had also started moving
off the screen. In a seminal 1964 work, Davis and Ellis of the Research ANd Development (RAND) Corporation wrote about the electrostatics-based RAND Tablet
[79]. This paper was possibly the first to use the phrase "electronic ink," and is
remembered as both the first handwriting recognition system and the first digitizing
tablet
[71].
However, written character recognition had already been studied for some
time, in both electromechanical [22, 73] and light-pen [64] systems. Still, the RAND
Tablet soon became a highly popular platform for researching hand-printed character
recognition [7, 33], virtual button pressing
[91],
and gesture/sketch recognition
[87].
It remains one of the best-known of the early digitizing tablets.
Irrespective of indications that users could quickly learn to work with stylus input
to an off-screen tablet [79], research into the use of digital pens for direct on-screen interactions continued [32]. In a natural extension of this idea, Alan Kay's "Dynabook"
became well known as the earliest vision for a hand-held device not unlike today's
tablet PCs and PDAs [52, 51]. At Xerox PARC, Kay's design brought inspiration
to the development of the Alto-on which many of the ideas for modern GUIs were
developed (it featured a mouse as its pointer) [78]. Despite these early works and
visions, the cost of computers kept tablet- and pen-based interfaces from reaching the
public [93] for quite some time.
Finally, in the early 1980s, the Casio PF-8000 and PenCept Penpad became available as the first tablet-based consumer devices, and around the same time Cadre Systems Limited "Inforite" digital signature pads were being used for identity verification
at a few shops in Great Britain. In the early 1990s, a plethora of PDAs and tablet PCs
began to become available. Since this time, devices based on two-dimensional pen interaction have changed little aside from miniaturization, performance optimizations,
and resolution enhancements. A few well-known products from around the turn of
the millennium deserve brief mention. Wacom® tablets and the Anoto® Pen are
two systems against which many contemporary pen interfaces are compared. Interestingly, save a few models of Wacom's devices, these two-dimensional pen systems
are primarily screen-less interfaces [97].
As the two-DOF mouse relates to the two-DOF pens familiar to users of tablet
PCs and PDAs, the higher-dimensional mice also have analogues in the pen domain.
The addition of a third dimension has been explored with visible feedback to indicate
movement between layers [85].
Bi et al. describe a system that employs rotation
around the center axis, known as "roll," as a way to enhance interaction for both
direct manipulation and mode selection [8]. Changing a pen's tilt-the angle made with
the drawing surface-has been used for menu selection [90], brush stroke modification
[96], and feedback during a steering task [89]. This enhanced functionality is possible
because the additional degrees of freedom can be re-mapped for arbitrary purposes, an
idea also present in Ramos et al.'s "pressure widgets" [80]. Taking the idea of higher-dimensional control more literally, Oshita recognized that the metaphor between pen
and object position works well for bipedal characters [75]. He created a system that
directly changes the posture of a virtual human figure based on pen position, and
this novel application of a stylus will inspire us to more closely examine the boundary
between tangible interfaces and more traditional ones in Section 2.2.1 below.
From the light-pen to the Anoto, optical pen interfaces belong to an expansive
group of interaction technologies.
As we mentioned before, the BoPen system is
based on some very special optics. Both the Anoto and the light-pen contain optical
receiving elements that can determine the pen's location relative to a nearby surface.
The BoPen, however, is a tiny image projector. Projectors have been used only
recently in pens as a way of turning any surface into a display [84], but we will
discuss how to use a projector as a way to indicate position and location in Section
3.1.1.
2.1.3 Other Pointers
Traditional pointing techniques have often been compared to touch [29] and multitouch [23]. However, for the purposes of this section, we will only review devices and
techniques for assisting interaction at a distance (for a good review of multitouch
technologies, see [14]). As early as 1966, three-DOF tracking down to 0.2in resolution
was available for the ultrasonically-sensed Lincoln Wand, and at the time it claimed
to supplant the light-pen [82]. A bit later, magnetic systems were used to facilitate
the use of natural pointing gestures [12]. Currently, similar results can be obtained
with camera-tracked wands [15] and device-assisted gestures [41, 95].
Visible laser pointers are a popular tool for interacting with large, distant dis-
plays, and computer-tracked versions are no exception [54]. Infrared lasers provide
the same pinpoint accuracy but with a virtual cursor replacing the common red dot,
reducing interference with the screen [65].
In these systems, however, hand jitter
and fatigue are frequently-encountered difficulties [72]. Some systems approach the
problem of fatigue by using gestures rather than clicking [19], whereas König's adaptive pointing techniques promise to mitigate the problem of jitter [55]. One might
expect that lighter devices would assist with fatigue, but even hand-only gestures can
become tiring after a while, decreasing recognition accuracy [1]. While the BoPen
technology seeks to eventually support distance interaction with large displays, this
goal is secondary to that of supporting augmented versions of traditional pointing
tasks. We aim, then, to provide a pen-based pointer capable of interacting with the
surface, above it, and with gestures. This description implies a more flexible version
of Grossman et al.'s "hover widgets" [34], which use button activation (rather than
distance from the display) to distinguish between the gesture and selection modes.
2.2 Blending Reality and Virtuality
Our discussion now centers around objects that provide a dynamic interface to the
digital world, blurring the lines between virtual and physical.
In some systems,
physical objects are tagged for computerized recognition. In others, physical objects
themselves are endowed with the ability to "think" and interact. In general, these
examples belong to the category of ubiquitous computing-the pervasive presence of
computational devices in a user's surroundings-and in moving away from purely localized computer access, they enable interaction paradigms based on constant access
to contextually-aware systems [1].
Much of the pioneering work in this area involved immersive virtual reality systems, where every aspect of user context is known because it is provided, and the
earliest examples were so cumbersome as to severely restrict the user's movement [81].
As computers have become smaller and more powerful, integration with the physical
world is now possible. Rather than immersing a user in a virtual world, mixed-reality
and tangible interfaces provide the user with a world in which virtual and physical are
no longer so disparate. Immersed in this environment, a person can simultaneously
respond to both physical and computer-generated realities. This new model results
in an interface that is not as restricted by its physical form as the devices we have
hitherto discussed. These interfaces have the ability to adapt and change in direct
response to context and experience.
2.2.1 Tangible
Oshita's mapping of a pen to a virtual character (as mentioned above in Section
2.1.2) relates to work on tangible user interfaces (TUI), where physical objects "serve
as both representation and controls for their digital counterparts" [44]. Immediately
following the manipulation of a physical object, the kinesthetic memory of its location decays slowly in comparison to visual memory: multimodal systems employing
haptic feedback (along with vision) leverage the kinesthetic sense of body part location (proprioception) to improve performance in object interaction and collaborative
environment tasks
[37].
As we discussed earlier, using a physical object to directly represent and control
a virtual one can restrict the user to the affordances of that object. To perform other
activities or control other (virtual) objects requires building a new interface. Many
tangible interfaces are application-specific, but the associate-manipulate paradigm
enables constrained tangible systems to be used more generally [92].
The ability
to dynamically bind a physical object, or token, to a specific digital representation
or category creates a mutable syntax for informational manipulation. Constraints
in form and placement options for the object directly convey the grammar through
physical structure. Even simply the shape of the object can be used as the structure
for interaction. In a brilliant example of a distributed TUI, Siftables constitute the
first example of a Sensor Network User Interface [66]. A Siftable is square and rests on
a table with a screen pointing upward. This shape limits connections between devices
to one for each of the four sides of the square. Drawing from the field of tangible
interaction, we seek to create an input device that supports assignment of temporary
handles to virtual objects while maintaining the point/click/draw capabilities of a
normal stylus.

Figure 2-1: Adapted version of Milgram and Kishino's MR continuum [68]. Downloaded
from http://en.wikipedia.org/wiki/Projection_augmented_model. [The original figure is
a diagram spanning the continuum from a real environment to a virtual environment,
with panels for Tangible User Interfaces (TUI), Spatial AR, "see-through" AR (optical or
video), Augmented Reality (AR), Augmented Virtuality (AV), and semi-immersive and
immersive Virtual Reality (VR).]
2.2.2 Mixed-Reality Interfaces
The phrase "augmented reality" describes one subcategory of the Mixed-Reality(MR)
interfaces represented in Figure 2-1. Milgram and Kishino describe MR as a perceptual merging of real and virtual worlds. Augmented Reality(AR) develops when a
live camera feed or an object in the physical environment is augmented with computer graphics. Augmented Virtuality(AV) results when virtual environments are
supplemented with live video from some part of the "real world." [68]
Some of the most prevalent MR systems combine physical and digital workstations. In an expansion of his 1991 work "The DigitalDesk calculator: tangible
manipulation on a desk top display," Pierre Wellner created a full "digital desk" that
supported camera digitization of numbers on physical paper for use on the calculator,
which was projected onto the desk [98]. It is now possible to project images onto objects and canvases of arbitrary shape [10], enhancing the similarity between AR and
TUI and increasing the likelihood that AR interfaces will eventually have a haptic
component. Another method to mix realities uses a mobile phone or PDA instead of
a projector to create a "magic window" into the digital version of a scene [60].
On the other side of the MR continuum, Malik and Laszlo bring the user's hands
onto the computer desktop for a compelling implementation of direct manipulation
[62]. In an extension of this project, Malik et al. enable more flexibility in camera
position by placing a fiduciary marker next to the touchpad surface [63]. These types
of markers, common to MR and tangible systems, are the subject of the next section.
2.3 Fiducials
Tagging physical objects gives them a digital identity. It is well known that laser-scanned barcodes are used for inventory management and price lookup in stores and
warehouses. We also know, from the discussion above, that two-dimensional markers,
called fiduciary markers, are common in MR systems. Indeed, the ARTag architecture
provides markers intended for identification and pose estimation in augmented reality
systems [101]. Markers can also be found enabling tangible interfaces-the reacTIVision table uses "amoeba" tags to detect the position and orientation of the tokens [50]
and the TrackMate project aims to simplify the process of building tangible interfaces
with its circular tags [53].
As markers can be distracting, some tracking systems employ computer vision
techniques such as template matching or volume modeling that can reduce or eliminate the need for tags, but according to Lepetit and Fua, "Even after more than
twenty years of research, practical vision-based 3D tracking systems still rely on fiducials because this remains the only approach that is sufficiently fast, robust, and
accurate." [58].
For this reason, one alternative to purely vision-based approaches
uses markers that, while imperceptible to the human eye, remain visible to cameras
in the infrared band [76]. Another option uses time-multiplexing to project markers
such that they are only visible to a camera synchronized at a certain frequency [36].
Figure 2-2: Examples of 2D Barcodes from http://commons.wikimedia.org. (a) QR Code
on a billboard in Japan; photo by Nicolas Raoul. (b) Data Matrix on an Intel wireless
device; photo by Jon Lund Steffensen.
Another benefit of using markers is the ability to provide unique identity to objects or
actors. Even the best computer vision algorithm would have difficulty distinguishing
between two very similar objects, but an imperceptible marker would make it trivial.
Markers, in one form or another, are here to stay.
With the increasing popularity of mobile devices and smart phones, ubiquitous
computing has become truly ubiquitous. Along with these compact computing platforms, fiduciary markers have left the laboratory and are finding their way into common interface technologies. Two-dimensional barcodes are gaining surface area in and
on magazines, t-shirts, graffiti, consumer devices, shipping labels, and advertisements
(see Figure 2-2). When decoding the Quick Response (QR) code-a two-dimensional
barcode used to encode web addresses-a delay is perfectly acceptable; much of the
current research focuses on decoding from any angle and under varied environmental
conditions [18, 20]. Other research seeks to improve performance by utilizing network
connectivity, sending a compressed image and performing the actual decoding on a
remote server [99]. This approach still falls short of real time performance-a vital
problem for interface development [21].
Indeed, one project that focuses on using
mobile cameras for real-time decoding of a Data Matrix code has met with some
difficulty [6].
In contrast with many of these projects, our approach to using fiducials for interaction switches the position of the camera and the barcode. The BoPen can be
produced inexpensively as a stand-alone device or embedded in a PDA or cell phone,
uniquely identifying users to any system with a camera. Furthermore, mobile devices
are limited in display and processing power. While mobile phones may eventually
be able to recognize fiducials for real-time interaction, personal computers and those
driving large displays are likely to arrive there sooner. What's more, some of the
limitations of using a PDA as a small window-like portal into an augmented reality
can be circumvented by intelligent projection, and a combination of the two can yield
a richer interaction experience [83].
2.4 Bimanual Interaction
One of the great draws of tangible interfaces, mixed reality, and multitouch systems
is the ability to simultaneously use both hands for interaction. In 1997, Hinckley et
al. confirmed the predictions of Guiard's kinematic chain model as applied to laterally asymmetric tasks; for three-dimensional physical manipulation with a tool in
one hand and a target object in the other, the dominant hand was best for fine (tool)
manipulation whereas the non-dominant hand was best for orienting the target [42].
This study was performed with right-handed subjects and demonstrated that the observable effects of asymmetry increase with task difficulty. The results also reflect
our everyday behaviors; people are used to holding a book with one hand, and marking it with the other. In one extension of this research to pen user interfaces, Li et al.
reported, based on a keystroke-level analysis of five mode-switching methods, that
using the non-dominant hand to press a button offers the best performance by a wide
margin, even when the placement of the button is non-ideal
[59].
These findings,
reported in 2005, are not surprising. Watching Alan Kay present a video demonstration (http://www.archive.org/details/AlanKeyD1987) of Sutherland's Sketchpad
[86], one realizes that this bimanual model was used in the interface he implemented
over 40 years ago!
Symmetric action with two mice can be more efficient than asymmetric action,
even when the asymmetry is sympathetic to handedness
[56].
Keeping track of two
mouse cursors and the functions they represent can be difficult. Indeed, given the
option of performing symmetric tasks with touch or with a mouse, participants preferred touch, even though the accuracy of touch decreased rapidly with distance and
resulted in more selection errors [29]. It is possible that applying an adaptive pointing technique to a pen-controlled cursor could provide an all-around better means
of interaction. However, if touch is available as well, it might be best to use both.
Brandl et al. showed that a combination of pen and touch is best for speed, accuracy,
and user preference, in comparison to both the pen/pen and touch/touch alternatives
[13]. This finding lends credibility to the notion that the pen-touch combination feels
more natural while providing a more efficient and accurate interface for certain tasks.
2.5 Conclusion
We have seen that modern developments in input technologies enable extending traditional input devices to take advantage of up to six degrees of freedom. Noting that
tangible and mixed-reality interfaces commonly use objects for 6-DOF interaction, we looked at how these interfaces are enhanced by markers and dynamic
binding, as in the associate-manipulate paradigm or by binding virtual objects to
physical fiducials. Finally, we described the potential benefit of using multiple modes
of interaction, especially when interacting bi-manually.
The BoPen is an inexpensive input device that could bind to virtual objects for
direct 6-DOF manipulation. It could also function as a pointer for precise, localized
input. In both cases, it might be easiest to use a system that employs multiple pens
or modalities. Utilizing asymmetric bi-manual action could facilitate rapid mode and
association switching, enabling an interaction that is both flexible and natural.
Chapter 3
Design Development
Figure 3-1: BoPen: Basic Design. [Schematic labels: LED, diffuser, barcode pattern, lens.]
3.1 BoPen Overview
The BoPen contains an LED, a diffuser, a transparency, and a lens, as seen in Figure
3-1. The optics are aligned such that a tiny pattern on the transparency will be
projected into infinity. To the human eye-or a camera focused at the front of the
pen-the pattern is not visible. Instead all that can be seen (if the LED is in the
visible range of the spectrum, in the case of the human eye) is a small point light
source. However, when a camera focuses at the infinite plane, the pattern on the
mask appears in a circle of confusion, or bokeh (pronounced "bouquet"), around the
light. The circle itself is created by defocus blur, and its size and shape are dependent
on the properties of the camera, most notably its aperture.

Figure 3-2: Optical Models: Pinhole (left) and Lenslet (right). (a) A pinhole placed in
front of a barcode pattern encodes directional rays with the pattern. The camera captures
this information by positioning the sensor out-of-focus. An unbounded magnification is
achieved by increasing the sensor-lens distance, limited only by the signal-to-noise ratio.
(b) A small lenslet placed a focal length away from the pattern creates multiple directional
beams (ray bundles) for each position in the barcode pattern. The camera lens, focused at
infinity, images a magnified version of the barcode pattern on the sensor.
3.1.1 Bokode Optics
We will now briefly introduce the enabling optical technology for this system. For a
more complete description, please refer to [70].
With the BoPen's optical configuration, the information of the barcode patterns
is embedded in the angular and not in the spatial dimension. By placing the camera
out-of-focus, the angular information in the defocus blur can be captured by the
sensor. The pinhole is blurred, but the information encoded in the bokeh is sharply
imaged.
Looking at the pinhole setup in Figure 3-2(a), one can imagine that the barcode
image-as seen by a stationary observer-will be made arbitrarily large by simply moving the lens more out-of-focus. When the lens is one focal length away from the
source image, the rays traced from a single point come through the lens collimated
(parallel). In this case, the lens is focused at infinity and the magnification remains
constant despite changing the observer distance: the size of the observed image is
depth-independent. As relatively little light can enter through a pinhole, and the little
that does is extensively diffracted, the Bokode and BoPen utilize a lenslet in the same
position, as seen in Figure 3-2(b).

Figure 3-3: Data Matrix-Based Pattern Design. A tiled arrangement of Data Matrix
(DM) codes encodes identification and angular information. Each 10 x 10 symbol stores
both its physical position in the overall pattern and a unique byte identification code that
is repeated across all DMs in the Bokode. Image from [70].
The camera images a different part of the pattern depending on its position relative to the pen. The viewable region of the transparency is a function of the angle
formed between the camera position and the BoPen's optical axis. Unlike traditional
barcodes, Bokodes have the ability to give different information to cameras in different
positions/orientations [70]. In the BoPen, this feature is used to provide information
about the angular tilt of the pen with respect to the drawing surface.
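The geometry just described can be summarized compactly. This is a minimal sketch under a thin-lens assumption, with the pattern exactly one lenslet focal length f_b behind the lenslet and the camera (focal length f_c) focused at infinity; the symbols f_b, f_c, x, and x' are introduced here only for illustration. A pattern point at lateral offset x from the lenslet's optical axis leaves the pen as a collimated bundle at angle theta, and the camera maps that bundle to sensor offset x':

    \tan\theta = \frac{x}{f_b}, \qquad x' = f_c \tan\theta = \frac{f_c}{f_b}\, x .

The ratio f_c / f_b thus acts as the magnification of the bokeh image and does not depend on the pen-to-camera distance, consistent with the depth-independent image size noted above. Inverting the first relation is what allows the grid position of the currently visible Data Matrix to be mapped back to a tilt angle.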
3.1.2 Preliminary Pattern
We based our initial pattern on the Data Matrix code, and it is identical to the one
described in [70]. The Data Matrix (DM) is a two-dimensional barcode [45] that uses
a matrix of binary cells to store information. As shown in Figure 3-3, the Bokode uses
a tiled array of 10 x 10 DM codes with one row/column of silent cells between adjacent
codes. One 10 x 10 DM encodes 3 bytes of data and another 5 bytes of Reed-Solomon
error correcting code. This error correcting code ensures data integrity even when
up to 30% of the symbol is damaged; we can also rely on this redundancy to help
disambiguate between overlapping patterns. Since which pattern is visible depends
on the camera angle, the tiled DM design offers up to three independent bytes of
information that can vary with tilt to provide both identification and orientation
information based on how the pen is positioned relative to the camera. Two bytes
provide the x and y positions of the currently visible Data Matrix. The remaining
byte is common across all DM codes in a single pen, providing consistent identification
information that is independent of pen orientation.

Figure 3-4: Comparison of the original Bokode design and the BoPen. (a) Original
Bokode; image from [70]. (b) BoPen construction; this image shows both the first and
third prototypes.
We ordered a printed film mask with these patterns sized at 20 x 20 µm per pixel
from PageWorks company in Cambridge, Massachusetts, USA. Each DM in a 128x128
matrix encodes the same unique identifier, and the layout encodes row and column
information as detailed above. The transparencies were cut by hand and positioned
in the pen as described below.
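As a concrete illustration of how a decoded symbol maps to pen state, the sketch below unpacks the three data bytes and converts the grid position of the visible Data Matrix into an approximate tilt angle. The byte ordering, the 11-cell tile pitch (10 cells plus one silent row or column), and the lenslet focal length are assumptions chosen to be consistent with the parameters stated above, not the exact constants of our implementation.

    #include <cmath>
    #include <cstdint>

    struct PenReading {
        int   penId;      // identification byte, identical across the whole mask
        int   col, row;   // grid position of the currently visible Data Matrix
        float tiltXDeg;   // approximate tilt components, in degrees
        float tiltYDeg;
    };

    // Assumed constants, derived from the pattern description above.
    const float CELL_PITCH_M = 20e-6f;                // 20 x 20 um per pattern cell
    const float TILE_PITCH_M = 11 * CELL_PITCH_M;     // 10x10 DM plus one silent row/column
    const int   GRID_SIZE    = 128;                   // 128 x 128 tiled Data Matrices
    const float LENSLET_F_M  = 5e-3f;                 // example lenslet focal length (assumed)

    // bytes[0..2] are the three data bytes recovered by the Data Matrix decoder.
    PenReading interpretSymbol(const std::uint8_t bytes[3]) {
        PenReading r;
        r.col   = bytes[0];    // assumed ordering: column, row, pen ID
        r.row   = bytes[1];
        r.penId = bytes[2];

        // Offset of the visible tile from the pattern centre, in metres.
        float dx = (r.col - GRID_SIZE / 2) * TILE_PITCH_M;
        float dy = (r.row - GRID_SIZE / 2) * TILE_PITCH_M;

        // A point one focal length behind the lenslet emerges at angle theta
        // with tan(theta) = offset / focal length (see Section 3.1.1).
        r.tiltXDeg = std::atan2(dx, LENSLET_F_M) * 180.0f / 3.14159265f;
        r.tiltYDeg = std::atan2(dy, LENSLET_F_M) * 180.0f / 3.14159265f;
        return r;
    }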
3.2 Building the BoPen
We tested several hardware designs before selecting and iterating upon a final version.
Figure 3-4(a) shows Mohan et al.'s original Bokode, while Figure 3-4(b) shows two of
our pen-based prototypes. Prototype I was approximately 10 mm wide and 5 cm long
with a high-intensity red LED connected to a pushbutton and an external battery
case. It was designed with the software program Rhinoceros® and printed in two
pieces on a Dimension 3D printer. A tap and die enabled us to put threads on the
two pieces and fit them together, once we had glued the patterned transparency to
the back of the tip piece. For this to work, the tip's length had to be exactly the focal
length of the lens.

Figure 3-5: Components of final pen design. (a) SolidWorks drawing for 3D printing.
(b) Laser-cutter patterns for adjustable and modular slices.
After our proof-of-concept experiment (described below in Section 3.3.1), we built
our second prototype. Prototype II (not shown) was similar to the first but scaled
up, utilizing a larger lens to increase the amount of light it emitted. Unfortunately,
the walls were too thin to tap without breaking. This problem-combined with the
difficulty of achieving proper focus with the fixed position of the mask-convinced us
to completely re-design the pen to enable a more flexible option for focusing.
Prototype III was designed in SolidWorks® and also printed on the Dimension
printer (Figure 3-5(a)). Rather than an assembly of two pieces, this version utilizes
laser-cut slices that can be moved around during testing and fixed in place with
additional laser-cut spacers. The slices were cut in a variety of shapes to provide
holders for all the component pieces of a BoPen (Figure 3-5(b)). They were cut from
wood and paper of varying thicknesses to facilitate focusing. Prototype III was the
final redesign, but further iterations based on lens choice are described in Section
4.2.2. Most pens created from this design are approximately 20 mm wide and 6
cm long. This larger pen provided room for a wider pattern, enabling an associated
increase in the theoretical angular range for lenses with a relatively small focal length.
The increased size also resulted in a pen that is more natural to hold. Prototype I
was reminiscent of a tiny disposable pencil, whereas Prototype III felt more like a
short marker.

Figure 3-6: Diagram of arrangement for first test. (a) BoPen. (b) Clear tabletop. (c)
50 mm lens. (d) Camera, with bare sensor exposed.
3.3 Software and System Iteration
With our first prototype, we sought to replicate the off-line results obtained in the
original Bokode experiments, but with a web-cam instead of an expensive digital
SLR. We also wished to create a computational pipeline for identifying the location
of a bokeh in an image and extracting and processing the DM codes. Then, we could
optimize this process and work toward a real-time system. We planned to
do this through an iterative process of evaluations and improvements. This section
describes a few cycles of this process.
3.3.1 Replicating Bokode
For our proof-of-concept, we used a Philips SPC-900NC web-cam with the optics
removed to expose the CCD. A Canon® 50 mm camera lens set at infinite focus
was aligned with the CCD, and a piece of transparent acrylic at 0.5 m served as
our drawing surface. Figure 3-6 shows a rough sketch of the physical layout. We
used our first BoPen prototype as described above, and recorded data at 5 frames
per second and 320x240 resolution. This data was interpreted by software written
in C++ that made use of open-source libraries, specifically Intel's Open Computer
Vision (OpenCV) library and the Data Matrix decoding library (libdmtx).

Figure 3-7: First Software Version, running in "Images Visible" mode. A small red dot
indicates the center of the found circle, and the thumbnail in the bottom right shows the
histogram-normalized image sent to the Data Matrix decoder.
First Version Software
Our initial software was designed to interpret data from a video recording. For each
frame, it first performs filter operations followed by a polar Hough transform to find
circles present in the image (Figure 3-7). A predefined number of pixels (the "window") around the circle center is cropped, normalized, and passed to the DM decoding
library for interpretation. The software reports the number of frames processed, the
number of barcodes found, and the total time taken for processing.
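A minimal sketch of this per-frame loop is shown below. It is written against the present-day OpenCV C++ API and libdmtx rather than the exact libraries and parameters of our 2009 implementation (which appears in the appendices); the window size, blur kernel, and Hough thresholds are placeholder values.

    #include <opencv2/opencv.hpp>
    #include <dmtx.h>
    #include <cstdio>
    #include <string>
    #include <vector>

    // Crop a window around a detected circle centre and try to decode a Data Matrix.
    std::string decodeWindow(const cv::Mat& gray, cv::Point center, int window) {
        cv::Rect roi(center.x - window / 2, center.y - window / 2, window, window);
        roi &= cv::Rect(0, 0, gray.cols, gray.rows);        // clip to the image bounds
        cv::Mat patch;
        cv::equalizeHist(gray(roi), patch);                  // normalize contrast before decoding

        DmtxImage*  img = dmtxImageCreate(patch.data, patch.cols, patch.rows, DmtxPack8bppK);
        DmtxDecode* dec = dmtxDecodeCreate(img, 1);
        DmtxRegion* reg = dmtxRegionFindNext(dec, nullptr);

        std::string result;
        if (reg) {
            DmtxMessage* msg = dmtxDecodeMatrixRegion(dec, reg, DmtxUndefined);
            if (msg) {
                result.assign(reinterpret_cast<char*>(msg->output), msg->outputIdx);
                dmtxMessageDestroy(&msg);
            }
            dmtxRegionDestroy(&reg);
        }
        dmtxDecodeDestroy(&dec);
        dmtxImageDestroy(&img);
        return result;
    }

    void processFrame(const cv::Mat& frame, int window) {
        cv::Mat gray, blurred;
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        cv::GaussianBlur(gray, blurred, cv::Size(9, 9), 2.0);   // filter step before the Hough transform

        std::vector<cv::Vec3f> circles;                          // each entry is (x, y, radius)
        cv::HoughCircles(blurred, circles, cv::HOUGH_GRADIENT,
                         1 /*dp*/, 50 /*minDist*/, 100 /*canny*/, 30 /*votes*/, 5, 100);

        for (const auto& c : circles) {
            std::string code = decodeWindow(gray, cv::Point(cvRound(c[0]), cvRound(c[1])), window);
            if (!code.empty())
                std::printf("decoded: %s\n", code.c_str());      // report the found barcode
        }
    }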
Performance Evaluation
For our first test, we processed 268 frames from a 53-second video clip captured in
the manner described above at five frames per second. Using no scaling and a window
size of 120 pixels, we found and interpreted codes in 22% of frames over the course
of 374 seconds. This was unacceptable, so we tried scaling the images (and window
size) down by a factor of two. This action resulted in the same recognition rate, but
the processing took only 112 seconds (about half of real-time speed). By increasing
the size of the window region, we were able to improve the recognition rate to 47.4%
in only 89.7 seconds.

Figure 3-8: Example of an Image Processing Failure. In this example, background noise
in the red channel resulted in incorrect circle identification and, subsequently, an inordinate
delay in DM decoding.
Unexpectedly, using larger window sizes with the scaled-down lower resolution
images resulted in both a higher recognition rate and a shorter processing time.
This result was observed even when sending the entire scaled-down image to the
DM decoding library, instead of using circle detection to determine which frames
warranted interpretation. In some cases the circle detector incorrectly identified the
salient region, as seen in Figure 3-8. In subsequent versions, we removed the circle
detector entirely.
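As a quick sanity check on these numbers, the timings above correspond to the following effective processing rates, computed from the frame counts and durations reported in this section:

    268 frames / 374 s  ≈ 0.72 frames per second   (no scaling, 120-pixel window)
    268 frames / 112 s  ≈ 2.4 frames per second    (images and window scaled down by two)
    268 frames / 89.7 s ≈ 3.0 frames per second    (scaled down, larger window)

All three rates still fall short of the 5 frames per second at which the clip was recorded, which is what motivated the further pipeline changes described below.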
3.3.2 Diffuser Experiment
In our first design, we were only able to move the device around a small portion of the
table before it was no longer visible to the camera. Our first attempt to improve the
system ambitiously sought to enable interaction across the entire table-increasing the
field of view by imaging onto a diffuser. This enabled us to move the BoPen within
a rectangular region of approximately 33 x 22 cm at a distance of 0.5m-more than
enough space for multiple users to interact. Unfortunately, the diffusers degraded the
image quality to the point where we could no longer decode the Data Matrix (Figure
3-9). We made a number of attempts to use smoothing and other image processing
methods to clean up the image, but were unsuccessful in decoding the DM with this
configuration. Though the implementation fell short of our goals, it provided us with a
more in-depth understanding of the optical limitations. In the future, we might try
to design a pattern that projects more clearly onto a diffuser or, alternately, use an
optical taper to scale down without degrading image quality.

Figure 3-9: Diffuser-Based FOV Enhancement. Imaging onto a diffuser, even one designed
for image projection, causes significant artifacts from the grain of the material.

Figure 3-10: Mock-up of Diffuser Imaging Setup. (a) BoPen. (b) Clear tabletop. (c)
50 mm lens. (d) Diffuser. (e) Camera, with lens focused onto the diffuser.
3.3.3 Live Tracker
After our unsuccessful diffuser experiment, we decided to focus on getting the system to operate more quickly. Though it is less likely we could develop a successful
multi-user system with only a portion of the table as input, the popularity of the
mouse and the pen tablet attests to the great number of interesting single-user applications we could explore. The refined system now described is based on a similar
optical configuration to the first version. However, instead of reading images from a
prerecorded file, it processes live video frame-by-frame.
Figure 3-11: Software Pipeline: Version Two. A distinguishing feature of this real-time tracker is that it runs the Data Matrix decoder in a separate thread. This choice
means that although the blob centroid detection is reported in real time, the decoded
ID numbers and angle positions are at least a few frames behind.
Second Software Version
Since DM decoding is essential to 3 of the 6 degrees of freedom we planned to support,
we went back to imaging through a 50mm camera lens onto a bare sensor. This time,
we chose a camera that had the capability of a higher frame rate-an older Point Grey®
Dragonfly™ with better resolution than the Philips web-cam, but still well within
the range of "commodity cameras." The choice to use a bare sensor resulted in a 4-to-5-fold reduction in our field of view and a corresponding increase in the resolution.
As we expected, this enabled us to again decode Data Matrices. We also significantly
altered our pipeline, as shown in Figure 3-11 and described below:
A frame captured from the camera is Bayer-encoded grayscale, so it must first be
converted to an RGB image. Next the image is scaled down by a factor of four and
split into component colors. Due to the observed irrelevance of the circle detection
during our initial experiments, we decided to omit it from this version of the pipeline. Instead, the red and blue images are thresholded to create a binary image in which we find blobs corresponding to bokehs, calculating their centroids.
(a) System detecting a single centroid (green "X") and interpreted barcode data (white "One") from a recent frame. When this image was captured, the label indicating successful DM decoding was unwavering. Note the small, monochrome, edge-detected blob image in the bottom left.
(b) System detecting two centroids and decoding one of the two patterns. This kind of split was often observed.
Figure 3-12: Multiple Simultaneous Detection and Disambiguation.
The regions in the green channel corresponding to discovered blobs in the thresholded image are copied, normalized, smoothed, and sent to a separate thread which
runs the DM decoder. Up to ten of these threads (a configurable limit) can be active at a time,
and frames are dropped (ignored) until there is space in the queue. Because the system is threaded, the blob detection portion of the program continues to run, tracking
centroid location. When a DM has been decoded in this separate thread, the decoding thread is joined with the main thread (usually 2-5 frames later) and the identifier
is displayed.
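To make the threading scheme concrete, the following C# fragment is a minimal sketch of the dispatch logic described above, not the pipeline code itself; the Blob and DmResult types, the decode delegate, and the small demo in Main are illustrative stand-ins for our actual data structures.

    using System;
    using System.Collections.Generic;
    using System.Threading;
    using System.Threading.Tasks;

    // Illustrative stand-ins for the pipeline's real data structures.
    record Blob(double CentroidX, double CentroidY, byte[] GreenRegion);
    record DmResult(int Id, int Row, int Col);

    class BokehTracker
    {
        const int MaxDecodeThreads = 10;   // decode requests beyond this limit are dropped
        int _activeDecodes = 0;

        // Called once per captured frame with the blobs found in the thresholded
        // red/blue channels. Centroids are reported immediately; the slower Data
        // Matrix decode runs on a worker thread and reports a few frames later.
        public void OnFrame(IEnumerable<Blob> blobs, Func<byte[], DmResult> decodeDm)
        {
            foreach (var blob in blobs)
            {
                ReportCentroid(blob.CentroidX, blob.CentroidY);    // real-time path

                if (Interlocked.Increment(ref _activeDecodes) > MaxDecodeThreads)
                {
                    Interlocked.Decrement(ref _activeDecodes);     // queue full: skip this decode
                    continue;
                }
                Task.Run(() =>
                {
                    try
                    {
                        var result = decodeDm(blob.GreenRegion);   // normalized, smoothed green-channel crop
                        if (result != null) ReportDecoded(result);
                    }
                    finally { Interlocked.Decrement(ref _activeDecodes); }
                });
            }
        }

        void ReportCentroid(double x, double y) => Console.WriteLine($"centroid ({x:F1}, {y:F1})");
        void ReportDecoded(DmResult r) => Console.WriteLine($"decoded ID {r.Id} at row {r.Row}, col {r.Col}");
    }

    class Program
    {
        static void Main()
        {
            var tracker = new BokehTracker();
            tracker.OnFrame(new[] { new Blob(320, 240, new byte[64]) },
                            region => new DmResult(1, 8, 8));      // fake decoder for the demo
            Thread.Sleep(100);                                     // let the worker task report
        }
    }

In the real pipeline the decode delegate would wrap whichever Data Matrix library is in use, and the worker's result is folded back into the main thread a few frames later, as noted above.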
Performance Evaluation
The first live system performed very satisfactorily. It could consistently detect:
* How many pens were in the field of view
* The approximate coordinates of each pen
" Each pen's unique identifier
Examples of system output can be seen in Figures 3-12(a) and 3-12(b).
The
system was never observed to confuse one device for the other, and it was observed on
multiple occasions to decode two barcodes simultaneously, from images taken from
the same or directly adjacent frames. However, the performance was inconsistent
between two devices: In a sample of approximately 200 frames where the pens were
at rest, the code in the device labeled "One" was available in 58% of frames. For the
device labeled "Two," that number was substantially less: 13%. This is most likely
the result of device-specific focus or dust/smudging occlusion issues.
3.3.4 Lessons Learned
There were three main weaknesses in this version of the BoPen, which prevented us
from making it into an interface device:
* No detection of roll orientation, and no use of the decoded tilt data. Both
these weaknesses are a matter of software implementation; roll angle can be
calculated based on the orientation of the DM, and the tilt can be calculated
from the decoded row and column information in the Bokode.
* No model for changing position over time. Perhaps the greatest limitation is
that this system has no model for changing position over time. We separately
detect bokeh position and identification, but there is nothing to associate the
two. For this reason, we chose only to use the identification functionality of the
Bokode, and not its tilt-detection capability.
* Lag between position detection and DM decoding. Even if we did model a device
over time, the tilt and roll information would lag a second or two behind the
position data. This prohibits the system from producing tilt and roll data at
a rate adequate for the quick response times required by a direct-manipulation
interface.
Another limitation comes in the form of the light-based noise to which all optical
systems are susceptible: additional emitters in the scene would reduce image contrast.
A related difficulty arises when attempting to use this system with a projection surface-based display; the diffuser obscures the signal, so we cannot use this pen on a back-projected display. One alternative suggested by Han is illumination using infrared light, as it is less occluded by LCDs [39]. A more promising alternative, a collaboration with Microsoft Research, is explored in the next section. The SecondLight system uses
a variable diffusivity surface that is constantly oscillating, enabling the SecondLight
team to display both on and above the surface [46], and enabling the image from the
BoPen to be captured from beneath.
Chapter 4
Final Implementation
As we saw in Section 2, pens for both screenless tablets and interactive displays
enjoy some popularity as computer input devices. The BoPen differs from most other
digital pens in that it provides a greater number of degrees of freedom. An additional
unique feature of the BoPen is that very fine tilting motions result in large and rapid
shifts in the transmitted barcode. Feedback from these tilting motions is somewhat
limited; though the user has a kinesthetic sense of his or her hand position, there is
little haptic information specific to the pen's orientation. According to Balakrishnan
and Hinckley's appraisal of Guiard's Kinematic Chain model, visual feedback can
compensate for lack of kinesthetic feedback, but not vice-versa [5]. This means that
the visual feedback afforded by a concurrent display could be invaluable in enabling
users to derive benefit from the fine movement capabilities of the BoPen.
Projecting an image onto an interactive surface from above (front-projection) requires an opaque or semi-opaque material onto which the image can be focused,
reflected, and later absorbed by the viewer's eye [23]. Projecting from below (rear-projection) requires a translucent diffuser to scatter incident light, resulting in the same effect but without shadows. However, in both these cases, with a BoPen pointed downward it is impossible to recover the image using a camera on the other side of the scattering material. Indeed, when projector-based augmented reality and mixed-reality systems need a transparent object, they simulate transparency by front-projecting an image of what is behind the object [10].
Through luck, effort, and the good graces of Shahram Izadi, Steve Hodges, and
the rest of the Computer-Mediated Living group at Microsoft Research Ltd. in Cambridge, UK, we were able to work on integrating our BoPen with their SecondLight
system. With the affordances of a display through which we can focus a camera, we
sought to enable a tablet-in-picture pen interface on this horizontal display surface.
The goal was and remains to find a method of optimally using this setup to provide
visual feedback that will enhance the user's kinesthetic sense of pen orientation and
motion.
4.1 SecondLight
The Microsoft SecondLight system (Figure 4-1(a)) is a rear-projection multi-touch
surface with a twist; by leveraging a diffuser that can switch to clear, SecondLight
supports projecting and capturing above the surface as well as on it (for a full account, refer to Izadi et al. [46]).
The diffuser is a polymer stabilized cholesteric textured
liquid crystal (PSCT) driven by a 150V signal at 60 Hz. When voltage is applied
across two planes (in alternating polarities), the material becomes clear. When the
voltage is removed, or when the power is turned off, the material is diffuse. Two 60Hz
projectors with alternating shutters, one synchronized to the clear part of the cycle and one to the diffuse part, form two separate images at an effective frame rate of 120Hz, fast
enough to be perceived as continuous by the human eye. While one projector displays
onto the tabletop surface, the other creates an image on any diffusive material held
above the table, and both images appear to be present at the same time.
As with most FTIR multitouch displays, pressing a paper-printed barcode up
against the screen frustrates the light, making the barcode visible. The SecondLight
can also theoretically image a barcode once it moves away from the surface, during
the part of the cycle where the PSCT is clear. However, a printed barcode becomes
smaller and smaller as it is moved away, and the camera needs to be refocused. By
using a Bokode-based device instead, the image remains the same size and the camera
can stay out-of-focus at infinity. Using the BoPen with this setup also provides a
unique opportunity to explore the intersection of pen and multitouch interfaces. This combination has only been explored a few times before, and never with a 6-DOF pen or a display surface capable of projecting multiple layers.
Figure 4-1: SecondLight modifications. (a) The SecondLight rig. (b) With modifications. We placed the camera in a position that would avoid blocking the projectors whilst enabling a "tablet" size of 100mm by 150mm (smart camera shown).
4.2 Optical Considerations
The SecondLight multi-display touch-sensitive screen is 307mm (height) by 406mm
(width). To enable a 100mmx150mm pen interaction area, we placed a camera 340mm
from the screen with a lens of focal length 16mm. The position of the camera is
optimized to provide a small tablet surface to right-handed users. Figure 4-1(b) shows
the rig with our modifications. For our tests, we did not leverage the multi-display
capabilities of the SecondLight, so only one of the projectors is used.
Figure 4-2: Patterns we printed: (a) Data Matrix, (b) spaced Data Matrix, (c) VideoMouse + Surface tag hybrid, (d) ARTags, (e) TrackMate.
Figure 4-3: Alternate pattern design (f).
4.2.1 Pattern Choices
In Figure 4-2 we show a number of different patterns that we tested for use as the tilt- and roll-identifying patterns. The original Bokode pattern consists of many closely tiled (a) Data Matrices, but for better compatibility with the available decoding
software, we printed a similar pattern with slightly (b) spaced-apart DM tiles. Some
other patterns we tried are (c) a hybrid of VideoMouse [43] and Microsoft Surface tags,
(d) ARTag fiducials [101], and (e) TrackMate [53] markers. Figure 4-3 shows another
pattern that we devised for a template-matching approach (f) but never printed. The
designs of patterns (c) and (f) are meant to provide a motion-blur resistant marker
that might one day also take advantage of optical flow. Patterns (a),(b),(d), and (e)
were designed to be most compatible with existing techniques and decoding software.
We will discuss the differences between these patterns alongside brief evaluations of
their performance in Section 5.1.
4.2.2 Lenses
We used the simple code in Appendix A to probe the design space based on the
dimensions of our table computing setup (as described in Section 4.2). With the table dimensions, an understanding of the camera lens geometry, and the CCD resolution and pixel size, we determined a good rule of thumb: for a pattern with a side length of X μm, it is best to have a lens with a focal length of X/10 mm. For 10x10 DM codes, this translates to: an X mm BoPen lens requires patterns with X μm features.
An additional consideration is that of angular range: the smaller the focal length,
the less of the pattern that is traversed by a slight tilt of the pen. This smaller focal
length translates to a larger angular range for a fixed pattern area. A larger focal
length can be used with larger patterns, but at the cost of increasing both motion blur
and the likelihood of running off the end of the pattern during tilt and translation.
These requirements must be balanced against each other, but as we will discuss in
Section 5.1.1, they must also be balanced against the constraints of the film mask
production process; we cannot make masks that clearly depict barcodes smaller than
around 100 μm on a side.
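As a back-of-the-envelope companion to the Appendix A script (this is a sketch, not that script), the C# fragment below applies the rule of thumb above; the 16mm camera lens comes from Section 4.2, while the 7.4 μm pixel pitch and the 5mm usable mask width are assumed values used only for illustration.

    using System;

    class LensDesign
    {
        static void Main()
        {
            const double camFocalMm = 16.0;     // camera lens from Section 4.2
            const double pixelPitchUm = 7.4;    // assumed CCD pixel pitch (not from the text)

            // Rule of thumb: a 10x10 DM with X um features wants an X mm pen lens.
            double featureUm = 10.0;
            double penFocalMm = featureUm;      // X um features -> X mm focal length
            double tagSideUm = 10 * featureUm;  // side length of a 10x10 code

            // Approximate CCD pixels spanned by one pattern feature for an
            // infinity-focused Bokode, whose magnification is camFocal / penFocal.
            double pixelsPerFeature = (camFocalMm / penFocalMm) * (featureUm / pixelPitchUm);

            // Rough angular range for an assumed usable mask width.
            double maskWidthMm = 5.0;
            double rangeDeg = 2 * Math.Atan(maskWidthMm / (2 * penFocalMm)) * 180 / Math.PI;

            Console.WriteLine($"tag side {tagSideUm} um, pen focal length {penFocalMm} mm");
            Console.WriteLine($"~{pixelsPerFeature:F2} CCD pixels per pattern feature");
            Console.WriteLine($"~{rangeDeg:F0} deg angular range for a {maskWidthMm} mm usable mask");
        }
    }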
The optical calculation software informed us that the smallest usable focal length
would be around 4.4mm (with an angular range of approximately 58°), and the largest would be around 22mm (with an angular range of approximately 17°). With
these considerations in mind, we designed and constructed six final variants of our
prototype based on lenses within this range as follows:
Table 4.1: Lens Properties for Each Pen.

Pen   Focal Length   Diameter   Type
#1    8.8mm          19.7mm     Aspheric
#2    17.5mm         19.7mm     Aspheric Condenser
#3    12.5mm         10mm       Planoconvex
#4    4.65mm         7.8mm      Aspheric Condenser
#5    11mm           7.2mm      Aspheric
#6    8mm            6.3mm      Planoconvex

4.3 The Pens
With the exception of Pen #5, we built one pen for each of the lenses listed in Table
4.1. These pens were almost identical to Prototype III described in Section 3.2, and they can be seen next to each other in Figure 4-4.
Figure 4-4: A Multiplicity of Pens. Far right: Prototype III. Top row: P1, P2, P3. Bottom row: P4, P5, P6.
The pens were again designed in
SolidWorks and printed on a Dimension 3D printer. This time, however, the slices
were cut from 2mm and 0.5mm acrylic, as well as from paper. All the BoPens built
included at least the components ordered as shown in Figure 4-5: on the back is
an infrared LED that fits into a reflecting chamber constructed of three adjacent
slices with foil tape inside covering the cylindrical wall. The exit from this chamber
goes through a diffusing material (tracing paper) before finally back-illuminating the
pattern. The pattern is spaced by slices of varying depths in order to ensure that it
is exactly one focal length from the lens and will be projected into infinity.
Figure 4-6 shows the different pattern slices we tested in each pen. We had software
sufficient for preliminary evaluation of all of these designs, and in most cases we tested
designs with a paper printout mock-up before building a pen with the corresponding
film mask. Based on these preliminary tests, described in the next chapter, we decided
to implement our software for pens that use the spaced DM codes.
Figure 4-5: Pen: exploded view. From Top: LED and spacer, mirror chamber,
diffuser, pattern, focusing spacer. The lens (not shown) is at the tip of the pen.
Figure 4-6: Pen slices containing patterns on film. Clockwise, from top left: Data
Matrix (and spaced DM), TrackMate, MS Hybrid (in two sizes), and ARTags.
4.4 Hardware and Software
In our implementation we explored two separate configurations. One uses a smart
camera to do the image processing, sending numerical values to a computer for analysis and display. The other uses a camera attached directly to the computer, and
the "Vision Server" software that runs the SecondLight image processing pipeline.
The speed of the two methods is comparable if GPU shader code is used for the
vision server-based solution. However, as it was not within the scope of this project
to implement decoding on a GPU, the smart camera approach is faster. A benefit
to the vision server approach, though, is that it already includes code for connected
component analysis and simplifies the process of masking out the BoPen signal when
calculating multitouch. For this disambiguation functionality, the smart camera approach requires a later step, leaving the system more susceptible to asynchronous
matching problems.
4.4.1 Smart Camera Approach
The VC4038, a smart camera made by the German company Vision Components
(VC), has an on-board DSP running at 3200 MIPS, 100Mbit Ethernet, and can capture video at a rate of 63 fps (or 126fps with 2x binning) [31].
An infrared-pass
(850nm) filter on the front of the 16mm lens serves to eliminate visible light, ensuring
high contrast for our infrared bokeh signal. Though we are currently using a single
camera at 640x480 resolution, we opted for an extensible client-server architecture
that simplifies the process of using additional cameras to extend the usable surface
size. Each camera simply watches and reports what it sees, whilst the server keeps
track of all devices using multiple instances of a class to represent different BoPens.
This method effectively creates a layer of abstraction between the image processing
and the interface API.
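The C# sketch below illustrates that layer of abstraction; it is not the Tracker.cs or BoPen.cs code from the appendices, and the class and member names are hypothetical.

    using System;
    using System.Collections.Generic;

    class BoPenState
    {
        public int Id;                      // decoded Bokode identifier
        public double X, Y, Z;              // position reported by a camera client
        public double Roll, TiltX, TiltY;   // orientation, when a DM has been decoded
        public DateTime LastSeen;
    }

    class PenRegistry
    {
        readonly Dictionary<int, BoPenState> _pens = new Dictionary<int, BoPenState>();

        // Each camera client just reports what it sees; the server keeps one object
        // per pen, so interface code never touches image-processing details.
        public BoPenState Update(int id, double x, double y, double z)
        {
            if (!_pens.TryGetValue(id, out var pen))
                _pens[id] = pen = new BoPenState { Id = id };
            pen.X = x; pen.Y = y; pen.Z = z;
            pen.LastSeen = DateTime.UtcNow;
            return pen;
        }

        public IEnumerable<BoPenState> ActivePens(TimeSpan maxAge)
        {
            foreach (var p in _pens.Values)
                if (DateTime.UtcNow - p.LastSeen < maxAge) yield return p;
        }
    }

    class RegistryDemo
    {
        static void Main()
        {
            var registry = new PenRegistry();
            registry.Update(id: 1, x: 120, y: 80, z: 10);
            foreach (var p in registry.ActivePens(TimeSpan.FromSeconds(1)))
                Console.WriteLine($"pen {p.Id} at ({p.X}, {p.Y}, {p.Z})");
        }
    }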
Client Side
We borrowed a camera from FS Systems LLP and, using the VC tutorial code as a
guide, developed our client-side software. The code running on the camera (reproduced in Appendix B) utilizes a telnet connection to the camera for configuration, and
it displays debug information on a VGA monitor connected directly to the camera.
The camera uses its Ethernet port to send UDP packets with the following information at a maximum rate of about 50fps: Bokeh X, Bokeh Y, Bokeh Width, Bokeh
Height. When decoding DM codes in the spaced configuration shown in Figure 4-2b,
the tracking rate is limited to between 20 and 30 fps. When the camera decodes
a DM, it additionally outputs the following data: Raw Tilt ρ, Raw Tilt φ, BoPen Identity, BoPen Rotation θ, and (currently disabled but in support of a more accurate tilt calculation in a future version) DM Location relative to bokeh.
The smart camera is synchronized on an optically isolated pulse from the circuit
which drives the SecondLight's switchable diffuser, so the camera only captures images
when the surface is transparent. By leveraging the image processing and DM decoding
libraries provided by Vision Components, we were able to do blob detection and
decoding in 5.5ms and 19ms, respectively. Threading was not required, so the pipeline
of our main loop is very simple:
1. Wait for trigger and capture image (opt. Draw: captured image)
2. Do thresholding and detect biggest blob (Draw: frame around blob region)
3. Search for Data Matrix within the blob region (Draw: square around found
DM)
4. If decoded, extract tilt, roll, and ID parameters (Draw: text with decoded
information)
5. Send UDP packet with all available information for this frame
6. Handle key press for changing options, etc.
Server-Side
A UDP server written in C# receives the packets sent by the camera and performs the additional calculations necessary to give physical meaning to the client's output parameters, including tilt calculation based on a hard-coded and empirically determined
"center marker" of the BoPen's pattern array. The code is reproduced in full in Appendix C, and the main window view is visible in Figure 4-7. This software has
a button to test the UDP server, a text box to display any received packets with
decoded DM data, and an image plot on which all the information available at a given moment is drawn. The program has the capability to dynamically adjust the maximum and minimum values for coordinates based on received packets in order to automatically calibrate the distance that the BoPen traverses as it moves across the tablet space. This means that one can simply use the system, and it will train itself to the information in the UDP packets, scaling and shifting the displayed measurements as needed.
Figure 4-7: BoPen Debug Display. The program receives UDP packets sent by the smart camera and keeps track of maximum, minimum, and current values for all position and orientation information, displaying whatever is available. The wedge in the blue circle indicates the roll angle and the black lines indicate pen tilt. The size of the blue circle indicates distance from the surface, with larger dots being closer. Not shown: a message indicates when the pen is on the surface, based on a simple Z threshold.
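A minimal C# sketch of this receive-and-autocalibrate loop follows; it is not the Appendix C code. The port number and the comma-separated ASCII payload are assumptions made purely for illustration, and only the X/Y fields are handled.

    using System;
    using System.Globalization;
    using System.Net;
    using System.Net.Sockets;

    class BoPenUdpSketch
    {
        // Running min/max used to auto-calibrate raw camera coordinates into a 0..1 tablet space.
        static double _minX = double.MaxValue, _maxX = double.MinValue;
        static double _minY = double.MaxValue, _maxY = double.MinValue;

        static void Main()
        {
            using var udp = new UdpClient(5005);            // port number is an assumption
            var any = new IPEndPoint(IPAddress.Any, 0);
            while (true)
            {
                byte[] datagram = udp.Receive(ref any);
                // Assumed wire format: "x,y,w,h[,tiltRho,tiltPhi,id,roll]" as ASCII text.
                string[] f = System.Text.Encoding.ASCII.GetString(datagram).Split(',');
                double x = double.Parse(f[0], CultureInfo.InvariantCulture);
                double y = double.Parse(f[1], CultureInfo.InvariantCulture);

                _minX = Math.Min(_minX, x); _maxX = Math.Max(_maxX, x);
                _minY = Math.Min(_minY, y); _maxY = Math.Max(_maxY, y);

                double nx = _maxX > _minX ? (x - _minX) / (_maxX - _minX) : 0.5;
                double ny = _maxY > _minY ? (y - _minY) / (_maxY - _minY) : 0.5;
                Console.WriteLine($"pen at ({nx:F2}, {ny:F2}) of the calibrated tablet range");
            }
        }
    }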
4.4.2 Vision Server Approach
The benefits of using the SecondLight's vision server for BoPen processing are twofold.
Firstly, the information about the location of the detected bokeh is immediately available to the multitouch detection routines, which can be used to prevent the connected
components analysis from incorrectly identifying the BoPen as a finger. Secondly, the
vision server integrates GPU shader calls to provide parallel processing of images and
other matrix-like data structures. The disadvantages include the difficulty of multi-camera integration and the relative complexity of the code base.
Since the image server is necessary for any multitouch applications on the SecondLight, even the smart camera-based approach requires some slight modification
of vision server code to integrate the UDP server and pen orientation calculation
described in the previous section. Fortunately, our UDP server code is highly modular, with Tracker.cs (Appendix C.3) and BoPen.cs (Appendix C.4) providing most
of the necessary functionality for interpreting data from the smart camera and calculating pen pose, respectively. In the next chapter we evaluate only the smart camera
approach, as we lack data on the performance of the other method.
4.5 Interaction Vision
Only very recently have multitouch and pen interfaces been combined on a single
display surface [13, 57], and the techniques described so far do not support tilt and roll
sensing with the pen. Creating an additional platform for exploration of simultaneous
touch and pen interaction could generate a multiplicity of new interaction techniques
and applications. Furthermore, the SecondLight provides an opportunity to utilize
multi-touch and multi-display technologies on the same device. As the combination
of touch and pen is faster and less error-prone than either touch-and-touch or pen-and-pen combinations [13], we shall work to design a system that makes optimal use of a
user's innate capacity for bi-manual interaction. In the next chapter, we evaluate our
prototypes based on both their technical performance and their interaction potential.
Chapter 5
Evaluation
In this chapter, we report our observations on overall system performance of different
tags in both paper-printed and BoPen-installed configurations and comment on the
differences between our BoPen prototypes in image quality, angular range, and contrast variation. Further, we describe a few minor changes to our design, how these
changes have enhanced performance, and the limitations that result. We close with
an inquiry into the suitability of this pen as an interface device.
5.1 Pen Design Verification
The first table below (Table 5.1) shows a size comparison for all patterns as projected
from each pen, illustrating how some patterns are simply too small or too large after
magnification to be used in a given optical configuration. Farther down, Table 5.3
illustrates the additional constraints imposed by the distance between BoPen and
camera in our system; this distance, along with the focal lengths and apertures, dictates the size of the bokeh. In many cases, the pattern cannot be decoded since
the bokeh is too small to contain even a single complete marker.
Table 5.1: Pattern size variations due to lens and pattern differences. [Image table: one column per pen (Pen 1, 8.8mm; Pen 2, 17.5mm; Pen 3, 12.5mm; Pen 4, 4.65mm; Pen 5, 11mm; Pen 6, 8mm) and one row per pattern (Trackmate, ARTags, Small MS/V, Big MS/V, DataMatrix, SpacedDM5, SpacedDM10), each cell showing that pattern as projected by that pen.]
5.1.1 Pattern Comparison
Each pattern has unique requirements and limitations. TrackMate and ARTags were
developed to facilitate mixed-reality applications, and they each have their respective
toolkits. Microsoft Tags are proprietary and integrated into the MS Surface SDK.
Data Matrix tags can take some time to decode, but are simple and common enough
that one might quickly conceive of a solution to this problem. All patterns described
below were tested as paper mock-ups (with the camera focused on the surface), and
some were also tested (as infinity-projected patterns) in situ on the SecondLight unit.
TrackMate
The TrackMate Tracker software uses an Open Sound Control (OSC) server to dispatch packets with tag identification and location data. The program also provides
on-demand visual feedback to facilitate the configuration process. Users can perform
camera calibration or background subtraction on-line, and the software provides immediate confirmation of the settings change. In addition to providing an inspired
WYSIWYG (What You See Is What You Get) configuration process that conveys
useful information about intermediary image processing steps, configuration is automatically loaded and can be saved with a single key press.
According to an informal conversation with the creator of the package, resolving TrackMate tags requires at least 60 pixels to an edge, meaning that, even for our longest focal-length pen, we would ideally want to print at (17.5 mm x 10 μm/mm) / 60 px ≈ 2.9 μm/px. The theoretical printing capability of the film masks is around 5 μm/px, so the patterns we printed were slightly too large and could not be resolved in any of the bokehs produced by our pens, not even pen #2, which produces the smallest
magnification. This outcome was unfortunate, as our paper experiments revealed
that the TrackMate Tracker is fairly robust to variations in pattern size. Writing our
own software to generate the tags might enable us to order smaller film masks, but a
pattern re-design would be required to fill the entire mask.
ARTags
ARToolkit+ is a popular software package for mixed-reality research, and its successor
Studierstube preserves the functionality of the ARTags with almost identical code and
configuration options. This means that for people well-versed in the ARToolkit, these
tags should be rather easy to use. The configuration files are written in XML and
we had little trouble finding examples, but we had difficulty creating a configuration
file for all 255 tags. Furthermore, the tags are slightly larger than 10x10 data matrix
codes, so at the two mask sizes we tried, they were either too large to fit in the bokeh
or too small to be printed without smudging.
The slightly smudged codes were clear enough in pens #1 and #6 to be interpreted
some of the time. However, they were not visible in these pens for a large enough
proportion of the time, and we were unable to get the toolkit to recognize them
consistently. In pens #5 and #3, the patterns were more consistently visible, but the
toolkit still did not recognize them.
We conclude that ARTags need further experimentation; they may become a viable
solution if printed slightly larger and used in a pen with a focal length between 11
and 13 millimeters. However, it is important to consider their limitation: ARTags
are recognized as templates in a similar fashion to the unimplemented approach we
mentioned in Section 4.2.1, so a custom pattern-matching solution may be a better
alternative.
VideoMouse/Microsoft Tag Hybrid
Microsoft Surface tags are designed to be placed on the bottom of coffee mugs and
other objects that will be sitting on top of the illuminated multitouch Microsoft
Surface. There are both high- and low-density versions. We used only the low-density tags, which can be micro-printed at about the same size as our data matrix
codes. This fact means that low-density Surface tags can theoretically be used with
all the same pens and lenses as the DM tags. Indeed, the larger masks made from
this pattern were well sized for pen #3, and the smaller versions were visible in pens
#1 and #5.
For our application, however, these tags had a number of disadvantages. They are
designed to only encode a single byte, so the number of distinct tags is limited to 256.
We divided the single byte into two sets of four bits each, encoding row and column
addresses for a 16x16 array of locations and opting not to encode unique identifiers
for each device. Relative to DM tags, this encoding scheme reduces our ability to
uniquely identify locations, which we planned to counter using a pattern of dots for
dead-reckoning: optical flow data would supplement the discrete locations derived
from the decoded byte tags with information about relative motion. In practice, the
dots were far more likely to be seen in the bokeh than the tags. Only in pen #3 did
we observe consistently visible Surface tags, but pen #3 had other problems, which
we will describe in Section 5.1.3.
In an iteration of our hybrid pattern we would recommend making the dots smaller
to increase the noticeable difference between the two features. We would also bring
the Surface tags closer together to ensure that they are more frequently visible. Given
the tilt performance of the pens (Section 5.2.1), it also makes sense to abandon the
absolute location markers in the outermost regions of the film mask, favoring a more
concentrated region of tags in the center.
As a final note, the software for reading these tags is written entirely in GPU
shader code for speed of execution. It is also extremely well optimized for a specialized
optical setup-different from the one we were using. These factors made it relatively
difficult for us to reliably determine any more than just roll information, despite
repeated optical and software reconfiguration attempts. Though we did extensively
test these tags in our paper-based setup, we had neither the time nor the expertise
necessary to properly modify the shader code to work with our setup.
Data Matrix Codes
As there are a number of Data Matrix decoding packages available freely and for
purchase, and as we were relatively familiar with the DM-based mask designs, we
chose to concentrate primarily on these patterns for our final implementation.
Figure 5-1: Spaced Data Matrix Codes (shown at 5px, 7px, and 9px spacings). Pixel-spaced versions of the data matrix patterns allowed us to use the smart camera's rapid decoding library.
In our initial explorations, we used the open-source package LibDMTX, which provided
fairly robust detection of the very tightly packed barcodes in the original Bokode
design. However, the algorithms required a fairly long decoding time, upwards of two
seconds in some cases. The library for the VC smart-camera was much faster, but
it can currently only recognize DM codes with a blank "comfort zone" around the
edges: the original Bokode design is too tightly packed for this library. We evaluated
the real-time performance of the system across 1000 frames with codes of five, seven,
and nine-pixel spacings (Seen in Figure 5-1) in our paper demo setup, and the results
were as follows:
Table 5.2: Paper Test Recognition Rates

Pattern Spacing   Stationary   Moving
Five Pixels       68.3%        12.1%
Seven Pixels      89.2%        30.7%
Nine Pixels       93.6%        29.4%
We believe that the slightly higher recognition rate of the seven-pixel-spaced codes while moving indicates that the choice of seven provides a good tradeoff: it creates enough of a comfort zone around each tag whilst improving the likelihood of seeing a barcode
within the decoding region at a given time. Depending on the optics, it could be
fruitful to explore six- and eight-pixel spacings as well.
The 5 μm feature-sized codes we ordered were unreadable due to the nature of the film on which our masks are produced. The material is dipped into a chemical bath, and portions of it exposed to light become dark. However, the grains of light-reactive material are larger than the intended pixel size, resulting in the overexposure of some regions that are meant to be clear. This is visible in the images of 5 μm DM codes projected by pens #1, #4, and #6. For the 10 μm-feature spaced DM codes, there is also some smudging visible in the images from pens #1 and #6, but it is
far less pronounced. Though the images produced by pens #1 and #6 were of an
acceptable size for the codes to fit within the bokeh, we again found that they were
not consistently visible during motion.
Pens #3 and #5 were the best for decoding. After slight refocusing, we were able
to track pen #5 at 30 frames per second with approximately 10% DM recognition
rate. These numbers come with some caveats, which we will discuss further in Section
5.2.1.
5.1.2 Optical Idiosyncrasies
We observed aberrations such as coma and astigmatism in projections from all our lenses.
After a certain degree of tilt, the focal length changes, causing a blur and rendering
the DM codes unreadable. In some of our lenses, most noticeably #2 and #5 (both
aspheric), pincushion distortion stretched and deformed the outermost edges of the
image, effectively reducing our decoding region to a fraction of the actual bokeh size.
One possible solution would be to geometrically transform the image, based on an
earlier calibration of a test pattern projected by the same lens.
Pen #1's lens displayed a small amount of barrel distortion, but it also created a
strangely shaped bokeh (not visible in our image table) that complicated the process
of determining its location relative to the surface. Though the focal length of the
lens used in pen #1 was only 8.8mm, its diameter was 19.7mm. Most of this area
was not the lens itself, but an extra ring-shaped region of material surrounding the
lens. This region permitted excess infrared light to increase the size of the bokeh and
decrease the image contrast. We designed a smaller channel between lens and mask
to limit the transmission of excess light, but we still observed some degradation in
image quality.
Both our condenser lens-based pens (#2 and #4) produced lower-contrast images
than the other pens, especially pen #4. Since condenser lenses are designed to focus
light, not to magnify images, this result was not entirely unexpected. Overall, we
found that aspheric lenses helped a bit with astigmatism but were more susceptible
to coma when not properly aligned and tended to have either contrast problems,
distortion issues, or strangely-shaped bokehs. In further iterations it would be interesting to explore multi-lens systems (like achromatic doublets) to reduce some of
these problems.
5.1.3 Best Choice
For the display of the 10 μm-feature spaced DM codes, pens #3 and #5 produced images with
optimal magnification for decoding, and one tag was consistently visible in all the
images we captured. Since pen #3 was more susceptible to off-axis blur, decreasing
both its angular range and effective recognition area, we chose pen #5 as the best of
our prototypes. When tested as a stationary device and imaged through the switching
diffuser, we could read pen #5's information at between 29 and 40 frames per second,
decoding Data Matrix information in 15 to 35% of the captured frames. For simplicity,
we will hereafter refer to the prototype pen #5 as "the BoPen."
5.2 Interface Design Verification
Earlier, we briefly mentioned that the BoPen projected DM codes that could be
captured and decoded at 30fps with an approximately 10% recognition rate. In this section we describe the trade-offs necessary to achieve that performance.
5.2.1 Limitations
By far the largest barrier to turning the BoPen into a fully-fledged 6-DOF tracking
system is the small angular range; only rows of 8-12 tags in the center of the spaced
DM pattern can be decoded. After tilting past a ten-degree region containing approximately 144 codes, decoding happens only intermittently. Some of this effect is surely worsened by the print quality of the 10 μm patterns, but the primary causes are nonuniform illumination, which causes the contrast to drop off as a function of
tilt, and astigmatism, where variations in the focal distance between the lens and the pattern cause blur. The problem of angular range is particularly noticeable when trying to use the BoPen for off-surface interaction, as it is difficult to keep one's hand steady without a support.
Table 5.3: Bokehs as seen from our camera setup. [Image table: one column per pen (Pen 1, 8.8mm; Pen 2, 17.5mm; Pen 3, 12.5mm; Pen 4, 4.65mm; Pen 5, 11mm; Pen 6, 8mm) and one row per pattern (Trackmate, ARTags, Small MS/V, Big MS/V, DataMatrix, SpacedDM5, SpacedDM10), each cell showing the bokeh produced by that combination.]
Another important barrier is more specific to our lens/pen choice.
With the
standard infinity-focused optical configuration, the DM codes are not quite large
enough to be decoded when projected onto the CCD. Indeed, the output of our
parameter calculation script (Appendix A) indicates that the pattern magnification
achieved with the current setup is 1.9626, just below our theorized cutoff of 2.0. This
means that a single pixel in our printed pattern is received by no more than 2x2 pixels
on the CCD.
Tradeoffs
To increase the area of the CCD used for capturing Data Matrices, we moved the camera lens slightly closer to the CCD. This refocusing enabled the reliable interpretation
of DM tags at the cost of losing depth-independent pattern projection. Moving the
BoPen away from the surface now increases the size of the pattern within the bokeh,
reducing the distance that the pen can travel from the surface before one full DM
code is no longer visible. Still, with minor modifications to the DM search algorithm,
we preserved rapid DM decoding in the region three to five inches above the surface,
which is sufficient for our envisioned applications.
One additional loss from changing the system's focus deals a stronger blow to our
vision for interaction. Now that the pattern size changes with depth, the tilt calculation depends on an additional variable. Before, it was possible to determine (after
calibration) which DM code is at the origin of a tilting action based on the X and Y
positions of the pen along with the roll angle θ. Now, Z is also an important consideration, complicating our tilt-determination algorithm. We are currently capable of detecting tilt as long as the pen is not rotated, but have yet to generalize this to arbitrary θ.
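The geometry behind that calculation can be sketched as follows: because the pattern sits at the focal plane of the pen's lens, a tile whose center lies a distance d from the calibrated center marker is seen along a ray tilted by roughly atan(d / f) from the optical axis, and the roll angle rotates that offset into table coordinates. The C# fragment below illustrates only this relation, under a hypothetical tile pitch and with the new Z dependence ignored; it is not our implementation.

    using System;

    class TiltSketch
    {
        // rowOffset/colOffset: decoded tile indices relative to the center marker.
        // tilePitchUm: center-to-center tile spacing on the film mask (assumed known).
        // penFocalMm: focal length of the pen's lens (the pattern sits at the focal plane).
        // rollRad: roll angle taken from the DM orientation.
        static (double tiltXDeg, double tiltYDeg) Estimate(
            int rowOffset, int colOffset, double tilePitchUm, double penFocalMm, double rollRad)
        {
            double dxMm = colOffset * tilePitchUm / 1000.0;   // offset on the mask, in mm
            double dyMm = rowOffset * tilePitchUm / 1000.0;

            // Undo the pen's roll so the offset is expressed in camera/table coordinates.
            double cx = dxMm * Math.Cos(rollRad) - dyMm * Math.Sin(rollRad);
            double cy = dxMm * Math.Sin(rollRad) + dyMm * Math.Cos(rollRad);

            // A point at distance d from the optical axis on the focal plane is seen
            // at angle atan(d / f), so the visible tile encodes the tilt direction.
            double tiltX = Math.Atan(cx / penFocalMm) * 180 / Math.PI;
            double tiltY = Math.Atan(cy / penFocalMm) * 180 / Math.PI;
            return (tiltX, tiltY);
        }

        static void Main()
        {
            // Hypothetical numbers: 3 tiles from center, 120 um pitch, 11 mm lens, no roll.
            var (tx, ty) = Estimate(0, 3, 120, 11.0, 0);
            Console.WriteLine($"tilt of roughly ({tx:F1} deg, {ty:F1} deg)");
        }
    }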
Figure 5-2: Reflection Chamber. Realizing that the black acrylic we used for slices is
transmissive to IR light, we designed this second chamber to produce more uniform
illumination via reflection and scattering.
Minor Concessions
Changing the focus as described above also increased the bokeh size at a given distance. However, since variations due to depth remained consistent, our system can
still distinguish between on- and off-surface interaction. We also observed an increase
in coma distortion with large tilts, resulting in blurrier DM codes at the edges of the
mask. However, since astigmatism and light levels already make these codes difficult
to interpret, we felt it an acceptable loss: it is far more important to ensure the functionality of the DM decoding in the center regions, which is essential to the fine-grained
tilt and roll capabilities of the interface.
5.2.2 Suggested Improvements
Figure 5-2 shows an additional reflective chamber that we built to increase the BoPen's
angular range by combatting the problem of nonuniform pattern illumination. Since
black acrylic is transmissive to infrared light (in fact, we use black acrylic as our IR
filter) we can place reflective foil tape on the outside of a series of ring-shaped slices.
Positioned after the first chamber containing the LED and after the diffuser in the
BoPen, this chamber creates more uniform illumination before the light reaches the
pattern mask. Along with software for local histogram normalization, the chamber
provides a good alternative to more expensive uniform emitters.
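As an illustration of the software side of that fix, the C# sketch below performs a simple per-tile min/max contrast stretch, one basic form of local histogram normalization; the 32-pixel tile size is an arbitrary choice, and this is not the code used in our pipeline.

    using System;

    class LocalNormalizeSketch
    {
        // Per-tile min/max contrast stretch on an 8-bit grayscale image.
        static void Normalize(byte[] img, int width, int height, int tile = 32)
        {
            for (int ty = 0; ty < height; ty += tile)
            for (int tx = 0; tx < width; tx += tile)
            {
                int x1 = Math.Min(tx + tile, width), y1 = Math.Min(ty + tile, height);
                byte lo = 255, hi = 0;
                for (int y = ty; y < y1; y++)
                    for (int x = tx; x < x1; x++)
                    {
                        byte v = img[y * width + x];
                        if (v < lo) lo = v;
                        if (v > hi) hi = v;
                    }
                int range = Math.Max(hi - lo, 1);            // avoid dividing by zero in flat tiles
                for (int y = ty; y < y1; y++)
                    for (int x = tx; x < x1; x++)
                        img[y * width + x] = (byte)((img[y * width + x] - lo) * 255 / range);
            }
        }

        static void Main()
        {
            var frame = new byte[640 * 480];                 // dummy frame for the demo
            new Random(1).NextBytes(frame);
            Normalize(frame, 640, 480);
            Console.WriteLine("frame normalized");
        }
    }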
An interesting observation related to the tilt problems of our pens is a lack of
the predicted trade-off between lens size and angular range: smaller lenses afforded
little additional tilt capacity. Experimenting with lens arrays based on microscope
objectives may reduce off-axis aberrations. In another approach to the same problem,
one could create a layered or tiered pattern so that stacking multiple two-dimensional
masks would create a three dimensional pattern that is more resistant to the disparity
between focal length and object distance. This pattern would be composed of concentric circles that decrease in depth-moving closer to the BoPen's lens-as the radius
increases. This design would likely introduce variations in light levels that could be
addressed with precise pattern cuts, intelligent illumination schemes, and histogram
normalization.
One of the most interesting approaches involves the use of flat lenses or holographic
projection techniques to create an optical system where the focal distance is always
constant. Working within the constraints of the current BoPen capabilities, however,
a compound superposition approach like that described in [70] could increase the
angular range at the expense of device size.
5.2.3 Interface Potential
Though our system is still very preliminary, we have a few comments from potential
users that will guide our further refinement of this technology. First, of the people
we talked to, all said they would be interested in a simultaneous pen and touch
interface. Their reactions to the current design rated the BoPen favorably against
the original Bokode for pen-like interaction. However, most preferred the original
Bokode for mixed-reality applications because of its ability to stand up on its own,
pointing toward the camera by default. Some requested that the pen be larger, have
a more ergonomic grip, and function at larger angles, so they could hold it normally,
rather than needing to point it downward all the time. At least one user felt that it
was not cumbersome to hold it pointed downward, as she holds pens in a similar way
normally. All appreciated the visual feedback afforded by our debug console, but they
also seemed interested in testing out some more extensive application designs. We
are currently working on these applications, and shall discuss some of our longer-term
ideas in the next and final chapter.
Chapter 6
Conclusions
In this thesis, we presented continuing work on a system that enables real-time interaction using a special optical barcode-projection system. We reviewed the literature
on similar interfaces and technologies, reported the current state of the art in using
Bokodes for interaction, and discussed the concessions we made to achieve real-time
performance. We outlined the remaining limitations to overcome before the BoPen
becomes a fully viable 6-DOF pen interface.
6.1 Contribution
A significant contribution of this work is our demonstration of real-time blob tracking and Data Matrix decoding, which produces a 30fps data-stream with decoded
information in as many as 90% of packets when imaging paper and up to 35% for
infinity-projected Bokodes printed at 10 μm. The use of a client-server architecture
provides an easy way to extend the system to multiple users or surfaces via many
distinct or one continuous tablet region(s). Additionally, we described integration
with the Microsoft® SecondLight system, using a switchable diffuser to collocate a
multi-tiered display with our optical pen interface. Finally, our system calculates roll
information from decoded Bokodes as well as tilt information for certain orientations.
6.2 Relevance
Like the light-pen, our system is an optoelectronic pen that can be used to interact
with computing devices. Unlike the light-pen, it provides information for up to six
degrees of freedom. In comparison with contemporary pen interfaces, such as the
Wacom tablet
[96],
our system is limited in that it can only provide a small range
of angular tilt information, and it does not currently support pressure sensing or
integrated buttons. However, the BoPen provides depth information that can be
used in lieu of pressure for screen taps, and its compatibility with the identification
of multiple devices is uniquely compelling.
In comparison with Bi et al.'s use of the Vicon motion-tracking system for obtaining roll [8], the BoPen/camera solution is more compact and less expensive. However,
the smart camera approach makes the BoPen solution significantly less affordable, albeit not for the pen device itself. Overall, the BoPen is unlikely to replace the mouse
anytime soon, but as an adjunct to a tabletop computing interface like SecondLight,
the BoPen provides a novel input modality with exciting interaction potential.
6.3 Application Extensions
The applications we are considering to best demonstrate the strengths of BoPen
technology fall within a few different interaction areas. GUI extensions use the additional degrees of freedom of the BoPen for enhancing interaction with the cursor and
menu elements already present in Graphical User Interfaces. Applications for direct
manipulation draw inspiration from mixed-reality interfaces, and the integration of
multiple users with multiple displays engenders a vision for the ubiquitous availability
of BoPens to users at public kiosks.
6.3.1 GUI Extensions
In Chapter 2, we discussed some user interface widgets based on input devices with
additional degrees of freedom. Tilt for marking menus [90] and pressure changes
to assist in target acquisition [80] are only two of the many ways that these extra
degrees of freedom can be used. Tian et al. explored a tilt cursor that provides
the user with visual feedback indicative of pen orientation [89]. For the BoPen, too,
additional degrees of freedom could be expressed with visual feedback to preserve
a WYSIWYG model of interaction. With this consideration in mind, tilt can be
used to change pen modes. For example, in a drawing application, pen depth could
dictate stroke thickness, and tilt could provide direction. When selecting a region of
an image to "cut," a tilt cursor could be displayed to provide a more natural-feeling
razor-blade tool. Finally, it would be interesting to explore variations on the theme
of crossing targets, which are known to be more efficient than standard Fitts targets
for interaction using stylus pens [2, 27].
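As a concrete illustration of the drawing-application idea above, the C# sketch below maps pen depth to stroke thickness and the tilt vector to stroke direction; every constant and range in it is hypothetical.

    using System;

    class BrushMappingSketch
    {
        // Map BoPen pose to brush parameters: closer to the surface means a thicker
        // stroke, and the tilt vector sets the stroke direction (razor-blade style).
        static (double widthPx, double directionDeg) Map(double depthMm, double tiltXDeg, double tiltYDeg)
        {
            double t = Math.Clamp(1.0 - depthMm / 120.0, 0.0, 1.0);   // 0..120 mm assumed working range
            double width = 1 + 19 * t;                                // 1..20 px assumed brush sizes
            double direction = Math.Atan2(tiltYDeg, tiltXDeg) * 180 / Math.PI;
            return (width, direction);
        }

        static void Main()
        {
            var (w, d) = Map(depthMm: 30, tiltXDeg: 5, tiltYDeg: 5);
            Console.WriteLine($"brush {w:F1} px wide, stroke direction {d:F0} deg");
        }
    }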
6.3.2 Tangible and Direct Manipulation
We can take the BoPen even further into the tangible realm: working with digital
paintings (in 2D) or sculptures (in 3D) becomes more natural with the availability of
multiple physical tools that are linked to virtual ones. Paint brushes, knives, pencils,
pens (of course), and a whole host of other physical tools could be built as BoPens
with visibly and tactilely distinct forms. In this way, the function of the device is
immediately conveyed by the form while preserving the aforementioned higher-DOF
modes.
The communication of affordances is a given with physical tools, but for their digital counterparts it must be engineered, an idea pioneered in virtual reality systems
[49]. Like a set of physical tools, a set of specially designed and uniquely-identified
BoPens enables switching between modes of interaction by simply picking up a different device; this spatial search task is far less cognitively demanding than looking
for the right icon on an unfamiliar tool bar, and it leverages the user's kinesthetic
memory as well.
There are a number of interfaces that offer more natural interactions through the
use of both hands [42]. We mentioned the pen/touch combination, but for symmetric
tasks the use of two pens can be almost equally compelling [13]. Video-based multi-touch systems have trouble telling the difference between two users and one user
using two hands [39], but an extension of our system would support multi-point
IDed interaction through the use of bimanual pens. This has implications for the
traditional point-and-click interface as well as tangible and gestural interaction. One
example application is stretching and skewing a 3D object based on manipulation of
pens representing virtual axes. Another is 6-DOF control of multiple virtual objects;
research into simulated docking tasks using two pens would serve as an extension
of the established research on bimanual interaction with physical objects.
6.3.3 Multi-user Interaction
Adding another device is understood to be one of the best (but not the only; see [11])
approaches to solving problems encountered when enabling multiple users [47, 15, 63].
One of the most difficult aspects of this process is establishing ad-hoc communication
protocols between devices [48, 17]. Tools exist to facilitate these inter-device communications [77, 28], but for simple input to a single computer or public display (c.f. [9]),
a simpler (and less expensive) solution like the BoPen is quite powerful.
For a person with a BoPen, interaction is possible at any display equipped with a
suitable camera (Figure 6-1). Under this paradigm, multiple people can collaborate,
play games, and express themselves at a number of kiosks distributed around a city
or building. However, there is an important "registration" component to this kind of
interaction that cannot be ignored; the user would likely have to execute an initial
login either online at home or at a special "Association Kiosk," which would read their
pen and associate it with their digital identity (e.g. e-mail address or username). The
user could then login at home to access statistics and other information, providing
additional depth and breadth to using Bokode-authenticated interfaces. Additionally,
in implementing interaction at a distance, it is important to consider that the users'
hands will not be steady. For this reason, using large gestures is going to be more
effective than precise cursor control. One might refer to Nintendo's popular Wii™ console for examples of how large gestures can effectively be used for interaction.
Figure 6-1: Vision for Public Display. In this vision for interaction, multiple users
are uniquely identified to a public display, and can interact with it in real time. Image
from http://web.media.mit.edu/~ankit/bokode/
6.4 Device Future
Aside from the obvious improvements to be garnered by higher printing resolution
and improved focal length stability, actuation is a major next step in creating a
BoPen interface. The lack of pressure sensitivity in the BoPen complicates the reliable
detection of a tap or click. Rather than using Z-depth, a future pen might employ
capacitive sensing on the pen barrel to detect hand position and finger actuation or
force sensitivity at the tip to determine with fine granularity the pen's pressure on
the display surface. Both these methods potentially increase the power consumption
and cost of the BoPen, but these unwanted results could be minimized by preserving
the optical method of communication. Rather than adding a Bluetooth radio or other
RF transmitter, the existing optical communication infrastructure could be used to
transmit this additional information. For example, a tip could be designed where
exerting pressure would mechanically alter the shape of the tip and hence the bokeh.
The camera could read this shape change using a template-matching approach, and
infer the exerted pressure. Other solutions include modulation of light intensity in a
ring around the code region, pulse width modulation, or color-based conveyance of
button or tip actuation.
6.5 Outlook
The Data Matrix tracking and decoding on which the BoPen's functioning depends is
now available in real time. After some additional refinements to address the problems
we identified earlier, the next important step in the development of BoPen interaction
is the implementation and testing of interface applications. We are actively working
on developing some of the applications described above for three-dimensional object
manipulation as well as novel GUI extensions supported by the BoPen's additional
degrees of freedom. We also hope to one day be able to explore gesture detection,
sketch reading, and, pushing the technology to its limits, handwritten character recognition. Approaching these areas with the BoPen in hand could open up a valuable
avenue of research into Human-Computer Interaction.
Bibliography
[1] Gregory D. Abowd and Elizabeth D. Mynatt. Charting past, present, and
future research in ubiquitous computing. ACM Trans. Comput.-Hum. Interact.,
7(1):29-58, 2000.
[2] Johnny Accot and Shumin Zhai. Beyond Fitts' law: models for trajectory-based HCI tasks. In CHI '97: Proceedings of the SIGCHI conference on Human factors
in computing systems, pages 295-302, New York, NY, USA, 1997. ACM.
[3] Apple. Mighty Mouse. Online at http://www.apple.com/mightymouse/, Accessed August 2009.
[4] Ravin Balakrishnan, Thomas Baudel, Gordon Kurtenbach, and George Fitzmaurice. The rockin'mouse: integral 3d manipulation on a plane. In CHI '97:
Proceedings of the SIGCHI conference on Human factors in computing systems,
pages 311-318, New York, NY, USA, 1997. ACM.
[5] Ravin Balakrishnan and Ken Hinckley. The role of kinesthetic reference frames
in two-handed input performance. In UIST '99: Proceedings of the 12th annual
ACM symposium on User interface software and technology, pages 171-178,
New York, NY, USA, 1999. ACM.
[6] Rafael Ballagas, Michael Rohs, and Jennifer G. Sheridan. Sweep and point and
shoot: phonecam-based interactions for large public displays. In CHI '05: CHI
'05 extended abstracts on Human factors in computing systems, pages 1200-1203, New York, NY, USA, 2005. ACM.
[7] Morton I. Bernstein. Computer recognition of on-line hand-written characters.
Rand Corporation. Memorandum RM-3753-ARPA. Rand Corp., Santa Monica,
CA, USA, 1964.
[8] Xiaojun Bi, Tomer Moscovich, Gonzalo Ramos, Ravin Balakrishnan, and Ken
Hinckley. An exploration of pen rolling for pen-based interaction. In UIST '08:
Proceedings of the 21st annual ACM symposium on User interface software and
technology, pages 191-200, New York, NY, USA, 2008. ACM.
[9] Xiaojun Bi, Yuanchun Shi, Xiaojie Chen, and Peifeng Xiang. Facilitating interaction with large displays in smart spaces. In sOc-EUSAI '05: Proceedings
of the 2005 joint conference on Smart objects and ambient intelligence, pages
105-110, New York, NY, USA, 2005. ACM.
[10] Oliver Bimber and Ramesh Raskar. Modern approaches to augmented reality.
page 1, 2007.
[11] Alan F. Blackwell, Mark Stringer, Eleanor F. Toye, and Jennifer A. Rode.
Tangible interface for collaborative information retrieval. In CHI '04: CHI '04
extended abstracts on Human factors in computing systems, pages 1473-1476,
New York, NY, USA, 2004. ACM.
[12] Richard A. Bolt. "put-that-there": Voice and gesture at the graphics interface.
In SIGGRAPH '80: Proceedings of the 7th annual conference on Computer
graphics and interactive techniques, pages 262-270, New York, NY, USA, 1980.
ACM.
[13] Peter Brandl, Clifton Forlines, Daniel Wigdor, Michael Haller, and Chia Shen.
Combining and measuring the benefits of bimanual pen and direct-touch interaction on horizontal interfaces. In AVI '08: Proceedings of the working conference on Advanced visual interfaces, pages 154-161, New York, NY, USA, 2008.
ACM.
[14] Bill Buxton. Multi-Touch Systems that I Have Known and Loved. Online at
http://www.billbuxton.com/multitouchOverview.html, Accessed March 2009.
[15] Xiang Cao, Clifton Forlines, and Ravin Balakrishnan. Multi-user interaction
using handheld projectors. In UIST '07: Proceedings of the 20th annual ACM
symposium on User interface software and technology, pages 43-52, New York,
NY, USA, 2007. ACM.
[16] Xiang Cao, Michael Massimi, and Ravin Balakrishnan. Flashlight jigsaw: an
exploratory study of an ad-hoc multi-player game on public displays. In CSCW
'08: Proceedings of the ACM 2008 conference on Computer supported cooperative work, pages 77-86, New York, NY, USA, 2008. ACM.
[17] Eduardo Cerqueira, Luis Veloso, Augusto Neto, Marilia Curado, Edmundo
Monteiro, and Paulo Mendes. Mobility management for multi-user sessions
in next generation wireless systems. Comput. Commun., 31(5):915-934, 2008.
[18] Yu-Hsuan Chang, Chung-Hua Chu, and Ming-Syan Chen. A general scheme for
extracting QR code from a non-uniform background in camera phones and applications. In ISM '07: Proceedings of the Ninth IEEE International Symposium
on Multimedia, pages 123-130, Washington, DC, USA, 2007. IEEE Computer
Society.
[19] Kelvin Cheng and Kevin Pulo. Direct interaction with large-scale display
systems using infrared laser tracking devices. In APVis '03: Proceedings of
the Asia-Pacific symposium on Information visualisation, pages 67-74, Darlinghurst, Australia, Australia, 2003. Australian Computer Society, Inc.
[20] Chung-Hua Chu, De-Nian Yang, and Ming-Syan Chen. Image stabilization for
2d barcode in handheld devices. In MULTIMEDIA '07: Proceedings of the 15th
international conference on Multimedia, pages 697-706, New York, NY, USA,
2007. ACM.
[21] James R. Dabrowski and Ethan V. Munson. Is 100 milliseconds too fast? In
CHI '01: CHI '01 extended abstracts on Human factors in computing systems,
pages 317-318, New York, NY, USA, 2001. ACM.
[22] T.L. Diamond. Devices for reading handwritten characters. In IRE-ACM-AIEE
'57 (Eastern): Papers and discussions presented at the December 9-13, 1957,
eastern joint computer conference: Computers with deadlines to meet, pages
232-237, New York, NY, USA, 1958. ACM.
[23] Paul H. Dietz and Darren Leigh. DiamondTouch: A multi-user touch technology. In Proc. of UIST 2001, pages 219-226, 2001.
[24] Paul N. Edwards. The Closed World: Computers and the Politics of Discourse
in Cold War America. MIT Press, Cambridge, MA, USA, 1996.
[25] Engelbart, D.C. et al. A research center for augmenting human intellect (demo).
Online at http://sloan.stanford.edu/MouseSite/1968Demo.html, Accessed August 2009.
[26] Daniel Fallman, Anneli Mikaelsson, and Björn Yttergren. The design of a
computer mouse providing three degrees of freedom. In HCI (2), pages 53-62,
2007.
[27] Clifton Forlines and Ravin Balakrishnan. Evaluating tactile feedback and direct
vs. indirect stylus input in pointing and crossing selection tasks. In CHI, pages
1563-1572, 2008.
[28] Clifton Forlines, Alan Esenther, Chia Shen, Daniel Wigdor, and Kathy Ryall.
Multi-user, multi-display interaction with a single-user, single-display geospatial
application. In UIST '06: Proceedings of the 19th annual ACM symposium on
User interface software and technology, pages 273-276, New York, NY, USA,
2006. ACM.
[29] Clifton Forlines, Daniel Wigdor, Chia Shen, and Ravin Balakrishnan. Direct-touch vs. mouse input for tabletop displays. In CHI '07: Proceedings of the
SIGCHI conference on Human factors in computing systems, pages 647-656,
New York, NY, USA, 2007. ACM.
[30] Bernd Froehlich, Jan Hochstrate, Verena Skuk, and Anke Huckauf. The globefish and the globemouse: two new six degree of freedom input devices for graphics applications. In CHI '06: Proceedings of the SIGCHI conference on Human
Factors in computing systems, pages 191-199, New York, NY, USA, 2006. ACM.
[31] Vision Components GmbH. VC 4038 smart camera. Online at: http://www.vision-components.com/vc-smart-camera-series-and-software/vc-smart-camera-series/vc4038-smart-camera-20060124238/, Accessed August 2009.
[32] John B. Goodenough. A lightpen-controlled program for online data analysis.
Commun. ACM, 8(2):130-134, 1965.
[33] Gabriel F. Groner. Real-time recognition of handprinted text. In AFIPS '66
(Fall): Proceedings of the November 7-10, 1966, fall joint computer conference,
pages 591-601, New York, NY, USA, 1966. ACM.
[34] Tovi Grossman, Ken Hinckley, Patrick Baudisch, Maneesh Agrawala, and Ravin
Balakrishnan. Hover widgets: using the tracking state to extend the capabilities
of pen-operated devices. In CHI '06: Proceedings of the SIGCHI conference on
Human Factors in computing systems, pages 861-870, New York, NY, USA,
2006. ACM.
[35] Barbara J Grosz. Visualizing the process a graph-based approach to enhancing system-user knowledge sharing. Proceedings of the American Philosophical
Society, 149(4):529-543, 2005.
[36] Anselm Grundhöfer, Manja Seeger, Ferry Hantsch, and Oliver Bimber. Dynamic
adaptation of projected imperceptible codes. In ISMAR '07: Proceedings of the
2007 6th IEEE and ACM International Symposium on Mixed and Augmented
Reality, pages 1-10, Washington, DC, USA, 2007. IEEE Computer Society.
[37] Kelly S. Hale and Kay M. Stanney. Deriving haptic design guidelines from human physiological, psychophysical, and neurological foundations. IEEE Comput. Graph. Appl., 24(2):33-39, 2004.
[38] Michael Haller. Pen-based interaction. In SIGGRAPH '07: ACM SIGGRAPH
2007 courses, pages 75-98, New York, NY, USA, 2007. ACM.
[39] Jefferson Y. Han. Low-cost multi-touch sensing through frustrated total internal
reflection. In UIST '05: Proceedings of the 18th annual ACM symposium on
User interface software and technology, pages 115-118, New York, NY, USA,
2005. ACM.
[40] Andy Hertzfeld. Revolution in The Valley (hardcover). O'Reilly & Associates,
Inc., 2004.
[41] Ken Hinckley, Randy Pausch, John C. Goble, and Neal F. Kassell. A survey
of design issues in spatial input. In UIST '94: Proceedings of the 7th annual
ACM symposium on User interface software and technology, pages 213-222,
New York, NY, USA, 1994. ACM.
[42] Ken Hinckley, Randy Pausch, Dennis Proffitt, James Patten, and Neal Kassell.
Cooperative bimanual action. In CHI '97: Proceedings of the SIGCHI conference on Human factors in computing systems, pages 27-34, New York, NY,
USA, 1997. ACM.
[43] Ken Hinckley, Mike Sinclair, Erik Hanson, Richard Szeliski, and Matthew Conway. The videomouse: A camera-based multi-degree-of-freedom input device.
In ACM Symposium on User Interface Software and Technology, pages 103-112,
1999.
[44] Hiroshi Ishii and Brygg Ullmer. Tangible bits: towards seamless interfaces
between people, bits and atoms. In CHI '97: Proceedings of the SIGCHI conference on Human factors in computing systems, pages 234-241, New York, NY,
USA, 1997. ACM.
[45] ISO. Automatic identification and data capture techniques - Data Matrix bar
code symbology specification. ISO/IEC 16022:2006, 2006.
[46] Shahram Izadi, Steve Hodges, Stuart Taylor, Dan Rosenfeld, Nicolas Villar,
Alex Butler, and Jonathan Westhues. Going beyond the display: a surface
technology with an electronically switchable diffuser. In UIST '08: Proceedings
of the 21st annual ACM symposium on User interface software and technology,
pages 269-278, New York, NY, USA, 2008. ACM.
[47] Hao Jiang, Eyal Ofek, Neema Moraveji, and Yuanchun Shi. Direct pointer: direct manipulation for large-display interaction using handheld cameras. In CHI
'06: Proceedings of the SIGCHI conference on Human Factors in computing
systems, pages 1107-1110, New York, NY, USA, 2006. ACM.
[48] Victor P. Jimenez and Ana Garcia Armada. Multi-user synchronisation in
ad hoc ofdm-based wireless personal area networks. Wirel. Pers. Commun.,
40(3):387-399, 2007.
[49] Marcelo Kallmann and Daniel Thalmann. Direct 3d interaction with smart
objects. In VRST '99: Proceedings of the ACM symposium on Virtual reality
software and technology, pages 124-130, New York, NY, USA, 1999. ACM.
[50] Martin Kaltenbrunner and Ross Bencina. reactivision: a computer-vision framework for table-based tangible interaction. In TEI '07: Proceedings of the 1st
international conference on Tangible and embedded interaction, pages 69-74,
New York, NY, USA, 2007. ACM.
[51] Alan Curtis Kay. The reactive engine. PhD thesis, The University of Utah,
1969.
[52] Alan Curtis Kay. A personal computer for children of all ages. Xerox Palo Alto
Research Center: Proceedings of the ACM National Conference, 1972.
[53] Alan Curtis Kay. Trackmate: Large-scale accessibility of tangible user interfaces. Master's thesis, Massachusetts Institute of Technology, 2009.
[54] Werner A. König, Joachim Böttger, Nikolaus Völzow, and Harald Reiterer.
Laserpointer-interaction between art and science. In IUI '08: Proceedings of
the 13th international conference on Intelligent user interfaces, pages 423-424,
New York, NY, USA, 2008. ACM.
[55] Werner A. König, Jens Gerken, Stefan Dierdorf, and Harald Reiterer. Adaptive
pointing: implicit gain adaptation for absolute pointing devices. In CHI EA '09:
Proceedings of the 27th international conference extended abstracts on Human
factors in computing systems, pages 4171-4176, New York, NY, USA, 2009.
ACM.
[56] Celine Latulipe, Stephen Mann, Craig S. Kaplan, and Charlie L. A. Clarke.
symspline: symmetric two-handed spline manipulation. In CHI '06: Proceedings
of the SIGCHI conference on Human Factors in computing systems, pages 349-358, New York, NY, USA, 2006. ACM.
[57] Jakob Leitner, James Powell, Peter Brandl, Thomas Seifried, Michael Haller,
Bernard Dorray, and Paul To. Flux: a tilting multi-touch and pen based surface. In CHI EA '09: Proceedings of the 27th international conference extended
abstracts on Human factors in computing systems, pages 3211-3216, New York,
NY, USA, 2009. ACM.
[58] Vincent Lepetit and Pascal Fua. Monocular model-based 3d tracking of rigid
objects. Found. Trends. Comput. Graph. Vis., 1(1):1-89, 2005.
[59] Yang Li, Ken Hinckley, Zhiwei Guan, and James A. Landay. Experimental
analysis of mode switching techniques in pen-based user interfaces. In CHI '05:
Proceedings of the SIGCHI conference on Human factors in computing systems,
pages 461-470, New York, NY, USA, 2005. ACM.
[60] Wendy E. Mackay, Guillaume Pothier, Catherine Letondal, Kaare Bøegh, and
Hans Erik Sørensen. The missing link: augmenting biology laboratory notebooks. In UIST '02: Proceedings of the 15th annual ACM symposium on User
interface software and technology, pages 41-50, New York, NY, USA, 2002.
ACM.
[61] I. Scott MacKenzie, R. William Soukoreff, and Chris Pal. A two-ball mouse
affords three degrees of freedom. In CHI '97: CHI '97 extended abstracts on
Human factors in computing systems, pages 303-304, New York, NY, USA,
1997. ACM.
[62] Shahzad Malik and Joe Laszlo. Visual touchpad: a two-handed gestural input device. In ICMI '04: Proceedings of the 6th international conference on
Multimodal interfaces, pages 289-296, New York, NY, USA, 2004. ACM.
[63] Shahzad Malik, Abhishek Ranjan, and Ravin Balakrishnan. Interacting with
large displays from a distance with vision-tracked multi-finger gestural input. In
UIST '05: Proceedings of the 18th annual ACM symposium on User interface
software and technology, pages 43-52, New York, NY, USA, 2005. ACM.
[64] T. Marrill, A. K. Hartley, T. G. Evans, B. H. Bloom, D. M. R. Park, T. P.
Hart, and D. L. Darley. Cyclops-1: a second-generation recognition system. In
AFIPS '63 (Fall): Proceedings of the November 12-14, 1963, fall joint computer
conference, pages 27-33, New York, NY, USA, 1963. ACM.
[65] Sergey V. Matveyev and Martin Göbel. The optical tweezers: multiple-point
interaction technique. In VRST '03: Proceedings of the ACM symposium on
Virtual reality software and technology, pages 184-187, New York, NY, USA,
2003. ACM.
[66] David Merrill, Jeevan Kalanithi, and Pattie Maes. Siftables: towards sensor
network user interfaces. In TEI '07: Proceedings of the 1st international conference on Tangible and embedded interaction, pages 75-78, New York, NY,
USA, 2007. ACM.
[67] Microsoft. Key events in Microsoft history. Online at: http://download.microsoft.com/download/7/e/a/7ea5ca8c-4c72-49e9-a694-87ae755e1f58/keyevents.doc, Accessed August 2009.
[68] Paul Milgram and Fumio Kishino. A taxonomy of mixed reality visual displays.
IEICE Transactions on Information Systems, E77-D(12), December 1994.
[69] Pranav Mistry, Pattie Maes, and Liyan Chang. Wuw - wear ur world: a wearable gestural interface. In CHI EA '09: Proceedings of the 27th international
conference extended abstracts on Human factors in computing systems, pages
4111-4116, New York, NY, USA, 2009. ACM.
[70] Ankit Mohan, Grace Woo, Shinsaku Hiura, Quinn Smithwick, and Ramesh
Raskar. Bokode: imperceptible visual tags for camera based interaction from a
distance. In SIGGRAPH '09: ACM SIGGRAPH 2009 papers, pages 1-8, New
York, NY, USA, 2009. ACM.
[71] Brad A. Myers. A brief history of human-computer interaction technology.
interactions, 5(2):44-54, 1998.
[72] Brad A. Myers, Rishi Bhatnagar, Jeffrey Nichols, Choon Hong Peck, Dave
Kong, Robert Miller, and A. Chris Long. Interacting at a distance: measuring
the performance of laser pointers and other devices. In CHI '02: Proceedings of
the SIGCHI conference on Human factors in computing systems, pages 33-40,
New York, NY, USA, 2002. ACM.
[73] Ernie G. Nassimbene. Utensil for writing and simultaneously recognizing the
written symbols. In US Patent, number 3182291, May 1965.
[74] University of Surrey and Loughborough University. Ergonomics of using a
mouse or other non-keyboard input device. HSE Research Report RR045. Health
and Safety Executive, United Kingdom, 2002.
[75] Masaki Oshita. Pen-to-mime: Pen-based interactive control of a human figure.
Computers & Graphics, 29(6):931 - 945, 2005.
[76] Hanhoon Park and Jong-Il Park. Invisible marker tracking for ar. In ISMAR
'04: Proceedings of the 3rd IEEE/ACM International Symposium on Mixed
and Augmented Reality, pages 272-273, Washington, DC, USA, 2004. IEEE
Computer Society.
[77] Jeffrey S. Pierce and Jeffrey Nichols. An infrastructure for extending applications' user experiences across multiple personal devices. In UIST '08: Proceedings of the 21st annual ACM symposium on User interface software and
technology, pages 101-110, New York, NY, USA, 2008. ACM.
[78] Larry Press. The acm conference on the history of personal workstations. SIGSMALL/PC Notes, 12(4):3-10, 1986.
[79] Davis M. R. and Ellis T. O. The rand tablet: a man-machine graphical communication device. In AFIPS '64 (Fall, part I): Proceedings of the October 27-29,
1964, fall joint computer conference, part I, pages 325-331, New York, NY,
USA, 1964. ACM.
[80] Gonzalo Ramos, Matthew Boulos, and Ravin Balakrishnan. Pressure widgets.
In CHI '04: Proceedings of the SIGCHI conference on Human factors in computing systems, pages 487-494, New York, NY, USA, 2004. ACM.
[81] Howard Rheingold. The Virtual Community: Homesteading on the Electronic
Frontier. The MIT Press, 2000.
[82] Lawrence G. Roberts. The lincoln wand. In AFIPS '66 (Fall): Proceedings of
the November 7-10, 1966, fall joint computer conference, pages 223-227, New
York, NY, USA, 1966. ACM.
[83] Johan Sanneblad and Lars Erik Holmquist. Ubiquitous graphics: combining
hand-held and wall-size displays to interact with large images. In AVI '06:
Proceedings of the working conference on Advanced visual interfaces, pages 373-377, New York, NY, USA, 2006. ACM.
[84] Hyunyoung Song, Tovi Grossman, George W. Fitzmaurice, François Guimbretière, Azam Khan, Ramtin Attar, and Gordon Kurtenbach. Penlight: combining a mobile projector and a digital pen for dynamic visual overlay. In CHI,
pages 143-152, 2009.
[85] Sriram Subramanian, Dzimitry Aliakseyeu, and Andres Lucero. Multi-layer
interaction for digital tables. In UIST '06: Proceedings of the 19th annual
ACM symposium on User interface software and technology, pages 269-272,
New York, NY, USA, 2006. ACM.
[86] Ivan E. Sutherland. Sketchpad: a man-machine graphical communication system. In DAC '64: Proceedings of the SHARE design automation workshop,
pages 6.329-6.346, New York, NY, USA, 1964. ACM.
[87] T. O. Ellis, J. F. Heafner, and W. L. Sibley. The GRAIL System Implementation.
Rand Corporation Memorandum RM-6002-ARPA. Rand Corp., Santa Monica,
CA, USA, 1969.
[88] Chuck Thacker. Personal distributed computing: the alto and ethernet hardware. In Proceedings of the ACM Conference on The history of personal workstations, pages 87-100, New York, NY, USA, 1986. ACM.
[89] Feng Tian, Xiang Ao, Hongan Wang, Vidya Setlur, and Guozhong Dai. The tilt
cursor: enhancing stimulus-response compatibility by providing 3d orientation
cue of pen. In CHI '07: Proceedings of the SIGCHI conference on Human factors
in computing systems, pages 303-306, New York, NY, USA, 2007. ACM.
[90] Feng Tian, Lishuang Xu, Hongan Wang, Xiaolong Zhang, Yuanyuan Liu, Vidya
Setlur, and Guozhong Dai. Tilt menu: using the 3d orientation information of
pen devices to extend the selection capability of pen-based user interfaces. In
CHI '08: Proceeding of the twenty-sixth annual SIGCHI conference on Human
factors in computing systems, pages 1371-1380, New York, NY, USA, 2008.
ACM.
[91] Ellis T.O. and Sibley W.L. On the Problem of Directness in Computer Graphics.
Rand Corporation. Report P-3697. Rand Corp., Santa Monica, CA, USA, 1968.
[92] Brygg Ullmer, Hiroshi Ishii, and Robert J. K. Jacob. Token+constraint systems
for tangible interaction with digital information. ACM Trans. Comput.-Hum.
Interact., 12(1):81-118, 2005.
[93] Andries van Dam and David E. Rice. On-line text editing: A survey. ACM
Comput. Surv., 3(3):93-114, 1971.
[94] Dan Venolia. Facile 3d direct manipulation. In INTERCHI, pages 31-36, 1993.
[95] Daniel Vogel and Ravin Balakrishnan. Distant freehand pointing and clicking
on very large, high resolution displays. In UIST '05: Proceedings of the 18th
annual ACM symposium on User interface software and technology, pages 33-42, New York, NY, USA, 2005. ACM.
[96] Wacom. Wacom Intuos3 Art Pen Orientation Guide. Online at http://www.wacom-asia.com/intuos3/spec/intuos3artpen.html, Accessed March 2009.
[97] Jean Renard Ward. An annotated bibliography in pen computing and handwriting character recognition. Online at http://users.erols.com/rwservices/biblio.html, 2009.
[98] Pierre Wellner. Interacting with paper on the DigitalDesk. Commun. ACM,
36(7):87-96, 1993.
[99] Adam Wojciechowski and Konrad Siek. Barcode scanning from mobile-phone
camera photos delivered via mms: Case study. In ER '08: Proceedings of the
ER 2008 Workshops (CMLSA, ECDM, FP-UML, M2AS, RIGiM, SeCoGIS,
WISM) on Advances in Conceptual Modeling, pages 218-227, Berlin, Heidelberg, 2008. Springer-Verlag.
[100] Shumin Zhai, Barton A. Smith, and Ted Selker. Improving browsing performance: A study of four input devices for scrolling and pointing tasks. In
INTERACT '97: Proceedings of the IFIP TC13 International Conference on
Human-Computer Interaction, pages 286-293, London, UK, 1997. Chapman & Hall, Ltd.
[101] Xiang Zhang, Stephan Fronz, and Nassir Navab. Visual marker detection and
decoding in ar systems: A comparative study. In ISMAR '02: Proceedings of
the 1st International Symposium on Mixed and Augmented Reality, page 97,
Washington, DC, USA, 2002. IEEE Computer Society.
Appendix A
MATLAB Code
function quickcalc ()
r  = 7e-3;      % radius of pen (transparency)
fb = 8.8e-3;    % bokode focal length (try 4.5 - 22)
t  = 0.1;       % field width
u  = .3399;     % camera distance from screen
ub = 10e-06;    % print feature size
l  = 15;        % number of features for decoding (try up to 35)
uc = 7.4e-06;   % pixel size
dc = 4.7e-3;    % 5mm CCD
fc = dc/(t/u);
% fc = 16e-3
k = (ub/uc)*(fc/fb);
a = (l*ub)/(fb/u);
disp(['fc = ', num2str(fc*1000), ' mm; k = ', ...
      num2str(k), '; a = ', ...
      num2str(a*1000), ' mm; range = ', ...
      num2str(atan(r/fb)*180/pi), ' degrees']);
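As a quick check on the defaults in the listing above, the first output is fc = dc/(t/u) = 4.7e-3/(0.1/0.3399), or roughly 16 mm, which agrees with the commented-out "% fc = 16e-3" line; the quantities k, a, and the angular range follow from the same substitutions.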
Appendix B
C Code

/*
 * Daniel Taub
 * MIT Media Lab
 * Code for the VC4xxx SmartCamera by Vision Components
 * Based on Example Code by: Klaus Schneider, Vision Components Inc.
 *   and Microsoft Research Ltd.
 *
 * To add: timeout for trigger fail
 */
#include <DM_Header.h>

/* timing */
int bms, ms, sec, fpsv, fps = 0, fpss = 0;
int trigger = 1;
int ovlDisplay = 0;
/* blobs */
int lastx = -1, lasty, lastdx, lastdy, lastw, lasth;
#define MIN_AREA 500

/*******************************************************************/
#define LOCAL_PORT 5005   /* local port 5005       */
#define DEST_PORT  4004   /* udp destination port  */
#define BUF_LEN    32     /* buffer length (srsly) */
/*******************************************************************/
#define fsign(x) sign((int)(ceil(x)))
void main(void)
{
    /* Init params for UDP packets */
    sockaddr_in laddr, raddr;
    unsigned sock;
    unsigned error;
    unsigned rest;
    const char *dest = "157.58.60.69";
    char buf[BUF_LEN];
    int var1 = 0, var2 = 0;
    int var3 = 0, var4 = 0;
    I32 destInHex;                       /* uint_16, for conversion simplicity */
    int remote_addrlen = sizeof(sockaddr_in);

    /******************************************************************/
    /*********************** DATA MATRIX DECODING ********************/
    /******************************************************************/
    I32 ScreenX, ScreenY, ScreenDx, ScreenDy, VideoDispICamera, result;
    I32 OvlScreenDx, OvlScreenDy, OvlSearchAreaDx, OvlSearchAreaDy;
    I32 SearchAreaX, SearchAreaY, SearchAreaDx, SearchAreaDy;
    I32 SearchBorder = 0;                /* 8 */
    char Text[DM_MAX_TEXTLEN];
    DmParameter DmPar;
    image SearchArea, SearchAreaOvl;
    unsigned int dm_count = 0, count = 0, last_dm_count = 0;
    unsigned char dm_id = 0, dm_x = 0, dm_y = 0;

    /******************************************************************/
    /* Command and control                                            */
    /******************************************************************/
    int threshold = 40;
    char kb;
    int display = 0;
    short fullScreenMode = 1;
    short secondpic = 0;
    short streaming = 0;
    int grow = 0;

    /******************************************************************/
    int x, y, dx, dy;
    int Over;
    image wArea;
    int nt = 0;
    char resp;

    /******************************************************************/
    dx = EVEN_S(640);
    dy = EVEN_S(480);
    x = EVEN_S((ScrGetColumns - dx) / 2);
    y = EVEN_S((ScrGetRows - dy) / 2);

    init_licence("T2EC4DB8BF2");
    init_licence("E2EC4E94792");

    SearchBorder = 8;
    result = DmIniMemory(&DmPar);
    if (result)
    {   print("e<%d: Data Matrix Initialization Error>\n", result);
        exit(0);
    }

    /******************************************************************/
    /* ini variables: define display and define image capture         */
    /* position (Screen Size and Position)                            */
    /******************************************************************/
    ScreenX     = 0;
    ScreenY     = 0;
    ScreenDx    = ScrGetColumns;
    ScreenDy    = ScrGetRows;
    OvlScreenDx = OvlGetColumns;
    OvlScreenDy = OvlGetRows;

    /* define search area */
    SearchAreaX     = ScreenX + SearchBorder;
    SearchAreaY     = ScreenY + SearchBorder;
    SearchAreaDx    = ScreenDx - 2 * SearchBorder;
    SearchAreaDy    = ScreenDy - 2 * SearchBorder;
    OvlSearchAreaDx = OvlScreenDx - 2 * SearchBorder;
    OvlSearchAreaDy = OvlScreenDy - 2 * SearchBorder;

    ImageAssign(&SearchArea, ScrByteAddr(SearchAreaX, SearchAreaY),
                SearchAreaDx, SearchAreaDy, ScrGetPitch);
    ImageAssign(&SearchAreaOvl, OvlByteAddr(SearchAreaX, SearchAreaY),
                SearchAreaDx, SearchAreaDy, OvlGetPitch);

    /* Data Matrix input parameters, including all of the licence code. */
    /* Call this initialization function as often as you like.          */
    InitDataMatrixPar(&DmPar, &SearchArea, Text, DM_MAX_TEXTLEN);

    /******************************************************************/
    /* setup overlay */
    set_translucent(1, 250, 0, 0);       /* TRANRED   1   translucent red    */
    set_translucent(2, 250, 250, 0);     /* TRANYEL   2   translucent yellow */
    set_overlay_bit(2, 210, 210, 210);
    set_overlay_bit(3, 250, 0, 0);       /* COLRED    8   red    */
    set_overlay_bit(4, 70, 70, 70);      /* COLGREY   4   grey  (COLBLACK 16 black) */
    set_overlay_bit(5, 255, 255, 250);   /* COLWHITE  32  white  */
    set_overlay_bit(6, 0, 200, 0);       /* COLGREEN  64  green  */
    set_overlay_bit(7, 65, 90, 255);     /* COLBLUE   128 blue   */

    ImageAssign(&SearchAreaOvl, getvar(OVLY_START), DispGetColumns, DispGetRows, OvlGetPitch);
    ImageAssign(&wArea, getvar(DISP_START), DispGetColumns, DispGetRows, DispGetPitch);
    set(&SearchAreaOvl, 0);
    set(&wArea, 0);
    EmptyKeyboardBuffer();

    /******************************************************************/
    sscanf(dest, "%d.%d.%d.%d", &var1, &var2, &var3, &var4);
    destInHex = (var1 << 24) + (var2 << 16) + (var3 << 8) + var4;
    /* destInHex = (157 << 24) + (58 << 16) + (60 << 8) + 69; */

    /* clear buffer */
    memset((char *) &buf,   0, BUF_LEN);
    memset((char *) &laddr, 0, sizeof(sockaddr_in));
    memset((char *) &raddr, 0, sizeof(sockaddr_in));

    laddr.sin_family      = AF_INET;
    laddr.sin_port        = LOCAL_PORT;        /* bind port */
    laddr.sin_addr.s_addr = INADDR_ANY;        /* listen for any IP addr */
    raddr.sin_family      = AF_INET;
    raddr.sin_port        = DEST_PORT;         /* port at dest */
    raddr.sin_addr.s_addr = INADDR_BROADCAST;  /* destInHex;  dest IP from above */

    sock = socket_dgram();
    if (sock == VCRT_SOCKET_ERROR)
    {
        print("\ne<Create UDP socket failed>");
        return;
    }
    /* bind to local address */
    error = bind(sock, &laddr, sizeof(raddr));
    if (error != VCRT_OK)
    {
        print("\ne<UDP bind failed - 0x%x>", error);
        return;
    }
    rest = connect(sock, &raddr, remote_addrlen);
    if (rest != VCRT_OK)
    {
        printf("\nError: connect() failed with error code %lx", rest);
    }

    /******************************************************************/
    /* main loop                                                      */
    /******************************************************************/
    do{
        /* Delete old drawings */
        /* OvlClearAll; */

        /* GET KEYBOARD INPUT */
        kb = KEY_INVALID;
        if (kbhit())
            kb = ReadKeyboard(1);
        EmptyKeyboardBuffer();

        fpsv = 1000 * (getvar(SEC) - fpss) + (getvar(MSEC) - fps);
        fps  = getvar(MSEC);
        fpss = getvar(SEC);

        /* Take pictures */
        if (trigger)
        {
            vmode(vmOvlStill);
            tenable();
            tpict();
            if (secondpic)
            {
                while (!trdy());
            }
        }
        else
        {
            vmode(vmOvlLive);
            /* tpict(); */
        }
        set_ovlmask(255);

        /* Working Page */
        Over = (int)OvlGetPhysPage;
        OvlSetLogPage(Over);

        if (1)   /* kb == KEY_ENTER */
        {
            /* blob detect and display biggest */
            blobtest(x, y, dx, dy, threshold, -1, display);
        }
        if (secondpic)
        {
            /* while (!trdy()); */
            /*
            U16 *rlc;
            long maxlng = 0x020000L;
            rlc = (U16 *) sysmalloc((maxlng * sizeof(U16) + 1) / sizeof(U32), MDATA);
            if (rlc == NULL) {pstr("e<DRAM Memory overrun>\n");}
            rlcmk(&SearchArea, threshold, rlc, 0x40000);
            if (0)                              // 0=no filter  1=with filter
            {
                // slc = erode2(rl1, rl2);
                // slc = dilate(rl1, rl2);
                // slc = rlc_mf(rl1, rl2, 0, 4);
            }
            rlcout(&wArea, rlc, 0, 255);
            rlcfree(rlc);
            // rlcmkf(lx, threshold, input, rlc);
            */
        }
        if (fullScreenMode)
        {
            ImageAssign(&SearchArea, ScrByteAddr(SearchAreaX, SearchAreaY),
                        SearchAreaDx, SearchAreaDy, ScrGetPitch);
            if (ovlDisplay)
            {   ImageAssign(&SearchAreaOvl, OvlByteAddr(SearchAreaX, SearchAreaY),
                            OvlSearchAreaDx, OvlSearchAreaDy, OvlGetPitch);
            }
        }
        else
        {
            ImageAssign(&SearchArea, ScrByteAddr(lastx, lasty),
                        lastdx, lastdy, ScrGetPitch);
            if ((ovlDisplay) && (lastx != -1))
            {   ImageAssign(&SearchAreaOvl, OvlByteAddr(lastx, lasty),
                            lastdx, lastdy, OvlGetPitch);
            }
        }
        DmPar.SearchArea    = &SearchArea;
        DmPar.ClosingFilter = grow;

        /**************************************************************/
        /* start reading data matrix                                  */
        /**************************************************************/
        ms  = getvar(MSEC);
        sec = getvar(SEC);
        /* main data matrix read function */
        DataMatrixReader(&DmPar);
        count++;
        if (count > MAXCOUNT)
        {
            last_dm_count = dm_count;
            dm_count = 0;
            count = 1;
        }
        ms = 1000 * (getvar(SEC) - sec) + (getvar(MSEC) - ms);
        if ((DmPar.DmError == 0) && (DmPar.DmTextLength == 3))
        {
            double avgX = 0, avgY = 0;
            int i, xx, ctr_x, ctr_y;           /* for angle */
            dm_count++;
            dm_id = DmPar.DmText[2];
            dm_x  = DmPar.DmText[1];
            dm_y  = DmPar.DmText[0];

            /********************* calculate angle ********************/
            for (i = 0; i < 4; i++)
            {
                avgX += (double)DmPar.DmPosX[i];
                avgY += (double)DmPar.DmPosY[i];
            }
            avgX = avgX / 4.0;
            avgY = avgY / 4.0;

            /* store center of IDed barcode */
            if (fullScreenMode)
            {
                ctr_x = (int) avgX;
                ctr_y = (int) avgY;
            }
            else
            {
                ctr_x = lastx + (int) avgX;
                ctr_y = lasty + (int) avgY;
            }
            /* calculate angle vector */
            avgY = (DmPar.DmPosY[0] - avgY);
            avgX = (DmPar.DmPosX[0] - avgX);
            if (avgX < 0)
                xx = 270;
            else
                xx = 90;
            avgX = atan(avgY / avgX);
            xx = (int)((avgX / 3.14159265359 * 180) + xx);
            printf("X:%f Y:%f<%d>\n", avgX, avgY, xx);

            if (streaming)
            {
                /* may want to marshall as byte array */
                sprintf(buf, "%d,%d,%d,%d,%d,%d,%d,%d,%d,%d\0",
                        lastx, lasty, lastdx, lastdy, dm_id, dm_x, dm_y, xx, 0, 0);  /* ctr_x, ctr_y */
                nt = send(sock, buf, BUF_LEN, 0);
                /* sendto(sock, buf, BUF_LEN, 0, &raddr, sizeof(sockaddr_in)); */
                if (nt > 100)
                    printf("\ne<send() failed with error %d - %d>", VCRT_geterror(nt), nt);
            }
            print("%d,%d,%d,%d,%d\n\r", lastx, lasty, dm_id, dm_x, dm_y);
        }
        else if (streaming)
        {
            sprintf(buf, "%d,%d,%d,%d\0", lastx, lasty, lastdx, lastdy);
            nt = send(sock, buf, BUF_LEN, 0);
            /* sendto(sock, buf, BUF_LEN, 0, &raddr, sizeof(sockaddr_in)); */
            if (nt > 100)
                printf("\ne<send() failed with error %d - %d>", VCRT_geterror(nt), nt);
            /* print("%d,%d\n\r", lastx, lasty); */
        }

        /* display on screen (monitor text) */
        if (ovlDisplay)
        {
            image ImgTxt;
            char ResTxt[80];
            sprintf(ResTxt, "%d,%d,%d", dm_id, dm_x, dm_y);
            framed(&SearchAreaOvl, COLBLUE);
            ImageAssign(&ImgTxt, OvlByteAddr(lastx, lasty - 12), ScreenDx, 8, OvlGetPitch);
            /* set(&ImgTxt, COLGREY); */
            if (DmPar.DmError == 0)
            {   chprint1(ResTxt, &ImgTxt, 1, 1, COLBLUE);  }
            else
            {   chprint1(ResTxt, &ImgTxt, 1, 1, COLGREY);  }
            /* ChangeColor(ImgTxt, 0, 0, COLGREY); */
        }

        /**************************************************************/
        /* output results: display data matrix and modul position     */
        /**************************************************************/
        if (ovlDisplay)
        {
            I32 ColorDark = COLRED, ColorBright = COLGREEN;
            I32 CrossSize = 1;   /* PosX = SearchAreaX, PosY = SearchAreaY; */
            /* ImageAssign(wOvl, OvlByteAddr(PosX, PosY), OvlGetColumns-PosX,
                           OvlGetRows-PosY, OvlGetPitch); */
            if (DmPar.DmError == 0)
            {
                if (fullScreenMode)
                {   DrawDataMatrix(&DmPar, &SearchAreaOvl, CrossSize, ColorDark, ColorBright);  }
                else
                {   DrawDataMatrixOff(&DmPar, &SearchArea, CrossSize, ColorDark, ColorBright,
                                      lastx - 6, lasty - 6);  }
            }
        }

        /* keyboard commands */
        if (kb == KEY_DOWN)
        {
            threshold = max(0, threshold - 1);
        }
        else if (kb == KEY_UP)
        {
            threshold = min(255, threshold + 1);
        }
        if (kb == KEY_LEFT)
        {
            grow = max(0, grow - 1);
        }
        else if (kb == KEY_RIGHT)
        {
            grow = min(16, grow + 1);
        }
        else if (kb == 'f')
        {
            fullScreenMode = not(fullScreenMode);
        }
        else if (kb == 'd')
        {
            display = not(display);
        }
        else if (kb == 'o')
        {
            OvlClearAll;
            ovlDisplay = not(ovlDisplay);
        }
        else if (kb == 't')
        {
            trigger = not(trigger);
        }
        else if (kb == 'w')
        {
            secondpic = not(secondpic);
        }
        else if (kb == 's')
        {
            pstr("\n\r");
            streaming = not(streaming);
        }
        else if (kb == 'u')      /* udp test */
        {
            int ant = sendto(sock, "test", 4, 0, &raddr, sizeof(sockaddr_in));
            /* sendto(sock, buf, BUF_LEN, 0, &raddr, sizeof(sockaddr_in)); */
            if (ant > 100)
                printf("\nsendto() failed with count %ld and error %lx - %d",
                       count, VCRT_geterror(ant), ant);
        }
        else if (kb == 'p')
        {
            print("\nthreshold at %d", threshold);
            print(" - filter at %d", grow);
            print("\nUseTrigger: %d\t BlobResult: %d\t Streaming: %d",
                  trigger, display, streaming);
            print("\nOverlay: %d\t WaitBufFull: %d\t FullScreen: %d",
                  ovlDisplay, secondpic, fullScreenMode);
            print("\nblob <%dms>", bms);
            print("  dm <%d>ms", ms);
            print("  %f.2 fps", 1000.0 / (double)fpsv);
            print("\n %d / %d DM found\n", last_dm_count, MAXCOUNT);
        }
        else if (kb == 'h')      /* help! */
        {
            pstr("\n HELP MENU");
            pstr("\n ");
            pstr("\n change threshold with <up> and <down> arrow keys");
            pstr("\n change DM filler with <left> and <right> arrow keys");
            pstr("\n enable/disable overlay: <o>");
            pstr("\n enable/disable trigger: <t>");
            pstr("\n enable/disable taking (second) threshold after blob detection: <n>");
            pstr("\n enable/disable waiting for buffer to fill before blob detection: <w>");
            pstr("\n enable/disable fullscreen data matrix decoding: <f>");
            pstr("\n enable/disable blob display (?): <d>");
            pstr("\n enable/disable streaming output: <s>");
            pstr("\n print this menu: <h>");
            pstr("\n print status screen: <p>\n");
        }
    }
    while (kb != KEY_ESC);

    /* graceful exit */
    shutdown(sock, FLAG_ABORT_CONNECTION);
    vmode(vmLive);
}
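The comma-separated packets assembled in the streaming branches above are what the C# UDPListener application in Appendix C consumes: Tracker.parseString (Section C.3) accepts either the 4-field blob-only message or the 10-field message that also carries the decoded Data Matrix values and the roll angle.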
/*****************************************************************/
void EmptyKeyboardBuffer(void)
{
    while (kbready())
        kbdrcv();
    while (rbready())
        rs232rcv();
}
#define TimeOut 10

int ReadKeyboard(int wait)
{
    int i, ch1, ch2, ch3, ch4, src;

    if (wait || kbready() || rbready())
    {
        while (1)
        {
            if (kbready())  {src = 1; break;}
            if (rbready())  {src = 0; break;}
            time_delay(1);
        }
        if (src)
            ch1 = kbdrcv();
        else
            ch1 = rs232rcv();

        if (ch1 != 0x1b)
        {
            if (ch1 == 0x0d || ch1 == 0x0a)
                return(KEY_ENTER);
            else
                return(ch1);
        }
        ch1 = KEY_ESC;

        for (i = 0; i < TimeOut; i++)
        {
            time_delay(1);
            if (src)  {if (kbready()) break;}
            else      {if (kbhit())   break;}
        }
        if (i == TimeOut)                    /* we've been waiting for too long */
            return(ch1);
        if (src)
            ch2 = kbdrcv();
        else
            ch2 = rs232rcv();

        for (i = 0; i < TimeOut; i++)
        {
            time_delay(1);
            if (src)  {if (kbready()) break;}
            else      {if (kbhit())   break;}
        }
        if (i == TimeOut)                    /* we've been waiting for too long */
            return(ch1);
        if (src)
            ch3 = kbdrcv();
        else
            ch3 = rs232rcv();

        if (ch2 == 0x4f)                     /* 2/3 of a F1 or F2 received? */
        {
            if ((ch3 == 0x71) || (ch3 == 0x50))   /* F1 received? */
                return(KEY_F1);
            if ((ch3 == 0x72) || (ch3 == 0x51))   /* F2 received? */
                return(KEY_F2);
            return(KEY_INVALID);
        }
        if (ch2 == 0x5B)                     /* 2/3 of a Cursor received? */
        {
            if (ch3 == 0x31)
            {
                for (i = 0; i < TimeOut; i++)
                {
                    time_delay(1);
                    if (src)  {if (kbready()) break;}
                    else      {if (kbhit())   break;}
                }
                if (i == TimeOut)            /* we've been waiting for too long */
                    return(ch1);
                if (src)
                    ch4 = kbdrcv();
                else
                    ch4 = rs232rcv();
                if (ch4 == 0x31)             /* F1 received? */
                    return(KEY_F1);
                if (ch4 == 0x32)             /* F2 received? */
                    return(KEY_F2);
                return(KEY_INVALID);
            }
            else
            {
                switch (ch3)
                {
                    case 0x41:
                    case 0x42:
                    case 0x43:
                    case 0x44:
                        return(ch3 - 61);
                    default:
                        return(KEY_INVALID);
                }
            }
        }
    }
    return(KEY_INVALID);
}
/******************************************************************/
/*********************** BLOB SUBROUTINES ************************/
/******************************************************************/
#define MAXOBJ 100

void blobtest(int x, int y, int dx, int dy, int threshold, int color, int display)
{
    int i, j = 0, nobj;
    image area;
    ftr f[MAXOBJ];
    ftr *f_largest = NULL;
    long side = 1, line, col, sidex = 1, sidey = 1;

    /* Search area is the whole image */
    ImageAssign(&area, ScrByteAddr(x, y), dx, dy, ScrGetPitch);

    /* feature extraction */
    nobj = find_objects(&area, f, MAXOBJ, threshold, display);
    /* print("\n\nFound %d objects (-1 means more than %d objects)", nobj, MAXOBJ); */
    /* if (color)                         white and black
           print("\nwhite objects:");
       else
           print("\nblack objects:");     */

    for (i = 0; i < nobj; i++)
    {
        /* looking for color objects (black=0, white=-1) */
        if (f[i].color == color)
        {
            if ((f_largest == NULL) || (f_largest->area < f[i].area))
            {   f_largest = &f[i];  }
            /* print via seriell at 9600 Baud takes a long time */
            /* print("\nArea %d: Size=%d Center of Gravity =%d,%d",
                     ++j, f[i].area, f[i].x_center + x, f[i].y_center + y); */
            /* cross_image(EVEN((int)f[i].y_center+y), EVEN((int)f[i].x_center+x), 10, 0xFF); */
        }
    }
    if ((f_largest != NULL) && (f_largest->area > MIN_AREA))
    {
        /* side = (long) sqrt((double)f_largest->area); */
        /* eventually use max and min */
        sidex = (f_largest->x_max - f_largest->x_min);
        sidey = (f_largest->y_max - f_largest->y_min);
        col   = f_largest->x_center;
        line  = f_largest->y_center;
        mark_overlay(col - (sidex >> 1), line - (sidey >> 1), sidex, sidey);
    }
    else if (lastx != -1)   /* remove lines and SEND SIGNAL 'no longer seen' */
    {
        image wOvl;
        ImageAssign(&wOvl, OvlByteAddr(lastx, lasty), lastdx + 1, lastdy + 1, OvlGetPitch);
        set(&wOvl, 0);
        pstr("\ne<no light visible>");
        lastx = -1;
    }
}
void mark_overlay(int x, int y, int dx, int dy)
{
    /* assumes only one overlay done last time.. */
    image wOvl;

    /* Display pictures */
    if (ovlDisplay)
    {
        if (lastx != -1)
        {
            ImageAssign(&wOvl, OvlByteAddr(lastx, lasty - 15),
                        lastdx + 1, lastdy + 16, OvlGetPitch);
            set(&wOvl, 0);
        }
        ImageAssign(&wOvl, OvlByteAddr(x, y), dx, dy, OvlGetPitch);
        framed(&wOvl, COLRED);
        markerd(&wOvl, COLBLUE);
        ellipsed(&wOvl, TRANRED);
        /* set overlay colors */
        set_overlay_bit(7, 255, 0, 0);
        set_overlay_bit(6, 0, 255, 0);
        set_overlay_bit(4, 0, 0, 255);
        set_ovlmask(255);
    }
    lastx  = max(0, x);
    lasty  = max(0, y);
    lastdx = dx;
    lastdy = dy;
}
int find_objects(image *a, ftr *f, int maxobjects, int th, int display)
{
    long maxlng = 0x020000L;
    U16 *rl1, *rl2, *slc;
    int objects;
    /* printf("\nentered find\n"); */

    /* allocate DRAM Memory */
    rl1 = rlcmalloc(maxlng);
    /* rl1 = (U16 *) sysmalloc((maxlng * sizeof(U16) + 1) / sizeof(U32), MDATA); */
    if (rl1 == NULL)  {pstr("e<DRAM Memory overrun>\n"); return(-1);}
    /* printf("\nbefore create rlc\n"); */

    /* create RLC */
    bms = getvar(MSEC);
    rl2 = rlcmk(a, th, rl1, maxlng);
    if (rl2 == NULL)  {pstr("e<RLC overrun>\n"); rlcfree(rl1); return(-1);}
    /* printf("\nafter overrun check\n"); */

    /* you can use some filter (dilate, erode) */
    if (0)                                /* 0=no filter  1=with filter */
    {
        slc = erode2(rl1, rl2);
        /* slc = dilate(rl1, rl2);       */
        /* slc = rlc_mf(rl1, rl2, 0, 4); */
    }
    else
    {
        slc = rl2;
        rl2 = rl1;
    }

    /* object labelling */
    if (sgmt(rl2, slc) == 0L)
    {   pstr("e<object number overrun>\n"); rlcfree(rl1); return(-1);  }
    /* printf("\nafter label\n"); */

    /* feature extraction */
    objects = rl_ftr2(rl2, f, maxobjects);
    /* printf("\nafter obj\n"); */
    bms = getvar(MSEC) - bms;
    if (bms < 0)  bms += 1000;

    /* display RLC */
    if (display)
        rlcout(a, rl1, 0, 255);
    /* printf("\nafter display\n"); */

    /* free allocation */
    rlcfree(rl1);
    /* printf("\nafter free\n"); */
    return(objects);
}
/******************************************************************/
void cross_image(int line, int column, int size, int color)
{
    int i;
    for (i = -size; i <= size; i++)
    {
        /* wpix(color, (U8 *)ScrByteAddr(column, line + i)); */
        /* wpix(color, (U8 *)ScrByteAddr(column + i, line)); */
        *(ScrByteAddr(column, line + i)) = color;
        *(ScrByteAddr(column + i, line)) = color;
    }
}
/******************************************************************/
/******************** DATA MATRIX SUBROUTINES ********************/
/******************************************************************/
I32 InitDataMatrixPar(DmParameter *DmPar, image *SearchArea,
                      char *OutputText, I32 OutputTextLength)
{
    /* ini input parameters                                          */
    /* please set variable Licence Code to the value offered from    */
    /* Vision Components                                             */
    DmPar->LicenceCode1  = 0x0D444D18;
    DmPar->LicenceCode2  = 0x274B6669;

    /* define data matrix search area */
    DmPar->SearchArea    = SearchArea;

    /* define decode text space */
    DmPar->DmText        = OutputText;
    DmPar->MaxTextLength = OutputTextLength;

    /* define the data matrix size */
    DmPar->DmDx          = 34;   /* pixel */
    DmPar->DmDy          = 33;   /* pixel */
    DmPar->DmDeltaSize   = 40;   /* tolerance +/- k% (0% = exactly DmDx x DmDy size, -1 = any size) */
    DmPar->PropOppLine   = 70;   /* difference of opposite lines in percent
                                    (50 = smaller line must be at least 50% of larger) */
    DmPar->PropNextLine  = 70;   /* difference of next lines in percent
                                    (20 = smaller line must be at least 20% of larger) */

    /* define data matrix parameter */
    DmPar->DmSearchColor = 0;    /* DM color (0=black / 1=white / -1=all) */
    DmPar->ModulContrast = 4;    /* contrast between modul and background in grey values */
    DmPar->DmModulSelect = 50;   /* in percent (0% = dark color / 100% = bright color) */
    DmPar->ModulNrMin    = 10;   /* minimum is 8 modules */
    DmPar->ModulNrMax    = 10;   /* 144 modules is maximum */
    DmPar->ModulSizeMin  = 1;    /* minimum is 1 pixel */
    DmPar->ModulSizeMax  = -1;

    /* define data matrix options */
    DmPar->DmRectangle   = 0;
    DmPar->DmMirrorMode  = 0;
    DmPar->ClosingFilter = 5;    /* pre filter for closing (pixel) */
    DmPar->FindZoomMin   = 0;    /* pyramid zoom in order to speed up the system for DM finding */
    DmPar->FindZoomMax   = 1;
    DmPar->ReadZoomMin   = 0;    /* pyramid zoom in order to speed up the system for DM reading */
    DmPar->ReadZoomMax   = 1;
    DmPar->DeltaAngleL   = 20;   /* max angle delta to the data matrix "L" (works only if
                                    DmSearchColor is not -1); 90 degree means 90 +/- 20 */
    DmPar->MaxReadTime   = 500;  /* max allowed reading time in ms; after the time is elapsed,
                                    the program returns with a time out error (-1 = unlimited) */
    return 0;
}
/******************************************************************/
I32 DrawDataMatrix(DmParameter *DmPar, image *Source,
                   I32 CrossSize, I32 ColorDark, I32 ColorBright)
{
    return DrawDataMatrixOff(DmPar, Source, CrossSize, ColorDark, ColorBright, 0, 0);
}
/******************************************************************/
I32 DrawDataMatrixOff(DmParameter *DmPar, image *Source,
                      I32 CrossSize, I32 ColorDark, I32 ColorBright,
                      int off_x, int off_y)
{
    I32 i, x, y, Color, ModulX, ModulY, PosX[4], PosY[4];
    I32 **ModulPosX, **ModulPosY;
    U8  **ModulValue, **ModulThresh;

    if (DmPar->DmError)
    {
        return -1;
    }

    /* ini variables */
    ModulPosX   = DmPar->ModulPosX;
    ModulPosY   = DmPar->ModulPosY;
    ModulValue  = DmPar->ModulValues;
    ModulThresh = DmPar->ModulThresh;
    if (DmPar->DmColor == 0)
        Color = ColorDark;
    else
        Color = ColorBright;

    for (i = 0; i < 4; i++)  PosX[i] = DmPar->DmPosX[i] + off_x;
    for (i = 0; i < 4; i++)  PosY[i] = DmPar->DmPosY[i] + off_y;
    /* for (i = 0; i < 4; i++) print("PosX[%d]=%3d  PosY[%d]=%3d\n", i, PosX[i], i, PosY[i]); */

    /* draw data matrix border */
    for (i = 0; i < 4; i++)
        linex(Source, PosX[i], PosY[i], PosX[(i + 1) & 0x03], PosY[(i + 1) & 0x03], ColorBright);

    /* draw data matrix point 0 */
    MarkCross(Source, PosX[0], PosY[0], ColorDark, 4 * CrossSize);

    /* draw data matrix moduls */
#ifdef TESTING_VERSION2
    for (y = DmPar->ModulStartY; y <= DmPar->ModulEndY; y++)
    {
        for (x = DmPar->ModulStartX; x <= DmPar->ModulEndX; x++)
        {
            ModulX = ModulPosX[y][x];
            ModulY = ModulPosY[y][x];
            if (ModulValue[y][x] > ModulThresh[y][x])
                Color = ColorDark;
            else
                Color = ColorBright;
            MarkCross(Source, ModulX, ModulY, Color, CrossSize);
        }
    }
#endif
    return 0;
}

/******************************************************************/
void ChangeColor(image *area, I32 start, I32 end, I32 newcol)
{
    I32 i, j, dx, dy, pitch;
    U8 *restrict ppix, *addr;

    addr  = area->st;
    dx    = area->dx;
    dy    = area->dy;
    pitch = area->pitch;
    for (j = 0; j < dy; j++)
    {
        ppix = addr + (pitch * j);
        for (i = 0; i < dx; i++)
        {
            if ((*ppix >= start) && (*ppix <= end))
                *ppix = newcol;
            ppix++;
        }
    }
}
/******************************************************************/
Appendix C
C Sharp Code
C.1 Program.cs

using System;
using System.Collections.Generic;
using System.Linq;
using System.Windows.Forms;

namespace UDPListener
{
    static class Program
    {
        /// <summary>
        /// The main entry point for the application
        /// </summary>
        static Form1 mainForm;

        [STAThread]
        static void Main()
        {
            Application.EnableVisualStyles();
            Application.SetCompatibleTextRenderingDefault(false);
            mainForm = new Form1();
            Application.Run(mainForm);
        }
    }
}
C.2 Form1.cs

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using System.Net;
using System.Net.Sockets;
using System.Threading;
using System.Collections;

namespace UDPListener
{
    public partial class Form1 : Form
    {
        Tracker myTracker;
        // static UdpClient server;
        DateTime timestarted;
        public delegate void SetTextCallback(string txt);
        byte[] them = { 157, 58, 60, 63 };
        byte[] us   = { 157, 58, 60, 69 };

        public Form1()
        {
            InitializeComponent();
        }

        private void button1_Click(object sender, EventArgs e)
        {
            UdpClient uSend = new UdpClient(5005);   // 4004, AddressFamily.Unix
            IPAddress ipa = new System.Net.IPAddress(us);
            System.Net.IPEndPoint iep = new System.Net.IPEndPoint(ipa, 4004);
            // byte[] temp = server.Receive(ref iep);
            // System.Windows.Forms.MessageBox.Show(temp.ToString());
            uSend.EnableBroadcast = true;
            uSend.Send(System.Text.Encoding.ASCII.GetBytes(textBox1.Text.ToCharArray()),
                       textBox1.Text.Length, iep);
            uSend.Close();
        }

        private void button2_Click(object sender, EventArgs e)
        {
            textBox2.Text = "";
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            myTracker = new Tracker(pictureBox1, timer1);
            timestarted = DateTime.Now;
            backgroundWorker1.RunWorkerAsync();
        }

        public void safeAppend(string txt)
        {
            if (textBox2.InvokeRequired)
            {
                SetTextCallback c = new SetTextCallback(safeAppend);
                this.Invoke(c, new object[] { txt });   // "I_" + txt
            }
            else
            {
                int ret = 0;
                try
                {
                    ret = myTracker.parseString(txt);
                }
                catch (Exception e)
                {
                    txt = "<unparsed>" + txt;   // nothing yet.
                }
                if (textBox2.Text.Length > 2000)
                    textBox2.Text = "";
                textBox2.Text += txt;
                if (ret != 3)
                    textBox2.Text += "\r\n";
                textBox2.Select(textBox2.Text.Length - 2, 1);
                textBox2.ScrollToCaret();
            }
        }

        private void backgroundWorker1_DoWork(object sender, DoWorkEventArgs e)
        {
            int sampleUdpPort = 4004;
            IPHostEntry localHostEntry;
            timestarted = DateTime.Now;
            try
            {
                // Create a UDP socket.
                Socket soUdp = new Socket(AddressFamily.InterNetwork,
                                          SocketType.Dgram, ProtocolType.Udp);
                try
                {
                    localHostEntry = Dns.GetHostByName(Dns.GetHostName());
                }
                catch (Exception)
                {
                    Console.WriteLine("Local Host not found");   // fail
                    return;
                }
                IPEndPoint localIpEndPoint =
                    new IPEndPoint(localHostEntry.AddressList[0], sampleUdpPort);
                soUdp.Bind(localIpEndPoint);
                while (true)
                {
                    Byte[] received = new Byte[256];
                    IPEndPoint tmpIpEndPoint =
                        new IPEndPoint(localHostEntry.AddressList[0], sampleUdpPort);
                    EndPoint remoteEP = (tmpIpEndPoint);
                    int bytesReceived = soUdp.ReceiveFrom(received, ref remoteEP);
                    String dataReceived = System.Text.Encoding.ASCII.GetString(received);
                    safeAppend(dataReceived);
                    /* String returningString = "The Server got your message through UDP:"
                                                + dataReceived;
                       Byte[] returningByte =
                           System.Text.Encoding.ASCII.GetBytes(returningString.ToCharArray());
                       soUdp.SendTo(returningByte, remoteEP); */
                }
            }
            catch (SocketException se)
            {
                Console.WriteLine("A Socket Exception has occurred!" + se.ToString());
            }
        }

        private void timer1_Tick(object sender, EventArgs e)
        {
            myTracker.drawPoint();
        }

        private void pictureBox1_Resize(object sender, EventArgs e)
        {
            myTracker.resizeControl(this.pictureBox1);
        }
    }
}
C.3 Tracker.cs

using System;
using System.Collections.Generic;
using System.Collections;
using System.Linq;
using System.Text;
using System.Drawing;

namespace UDPListener
{
    class Tracker   // essentially the controller
    {
        const int ROLL_DISP_FACTOR = 2;
        const int TILT_DISP_FACTOR = 3;
        Point home, angle;
        const int vertSpan = 16, horizSpan = 20;
        // private ArrayList _pens;
        BoPen myPen = null;
        System.Windows.Forms.Control myControl;
        System.Windows.Forms.Timer myTimer;
        Graphics myG = null;

        public void drawPoint()
        {
            if (myPen.x == -1)
                return;
            int x = myPen.x;
            int y = myPen.y;
            int z = myPen.z * ROLL_DISP_FACTOR;
            home = new Point(x, y);
            myG.Clear(Color.White);
            // test
            myG.FillEllipse(Brushes.LightBlue, home.X - z, home.Y - z, z << 1, z << 1);
            if (!myPen.angleOutdated)   // still reading DMs
            {
                angle = new Point(
                    (int)((z) * Math.Cos(myPen.lastTheta * Math.PI / 180)),
                    (int)((z) * Math.Sin(myPen.lastTheta * Math.PI / 180)));
                angle.Offset(home.X, home.Y);
                Point tri1 = home;
                tri1.Offset(1 + angle.Y >> 3, 1 - angle.X >> 3);
                Point tri2 = home;
                tri2.Offset(1 - angle.Y >> 3, 1 + angle.X >> 3);
                Point tri3 = angle;
                Point[] triangle = { tri1, tri2, tri3 };
                myG.FillPolygon(Brushes.White, triangle);   // Draw Orientation
            }
            // Draw tilt
            Point tiltX = home;
            Point tiltY = home;
            tiltX.Offset(myPen.tiltX * TILT_DISP_FACTOR, 0);
            tiltY.Offset(0, myPen.tiltY * TILT_DISP_FACTOR);
            myG.DrawLine(Pens.Black, home, tiltX);
            myG.DrawLine(Pens.Black, home, tiltY);
            myG.DrawRectangle(Pens.Black, home.X - 1, home.Y - 1, 2, 2);   // Draw Center
        }

        public bool resizeControl(System.Windows.Forms.Control p)
        {
            myG = Graphics.FromHwnd(myControl.Handle);
            if (myPen == null)
            {
                home = new Point(p.Width / 2, p.Height / 2);
                angle = home;
                return false;
            }
            myPen.screenResize(p.Width, p.Height);
            return true;
        }

        public Tracker(System.Windows.Forms.Control p, System.Windows.Forms.Timer t)
        {
            // set up view
            myControl = p;
            myTimer = t;
            resizeControl(p);
        }

        public int parseString(string given)
        {
            given = given.Remove(given.IndexOf('\0'));
            string[] parseMe = given.Split(',');
            if (parseMe.Length == 4)
            {
                if (myPen == null)
                {
                    myPen = new BoPen(int.Parse(parseMe[0]), int.Parse(parseMe[1]),
                                      int.Parse(parseMe[2]), int.Parse(parseMe[3].TrimEnd()));
                    myPen.screenResize(myControl.Width, myControl.Height);
                    myTimer.Enabled = true;
                    // myTimer.Start();
                }
                else
                    myPen.update3(int.Parse(parseMe[0]), int.Parse(parseMe[1]),
                                  int.Parse(parseMe[2]), int.Parse(parseMe[3].TrimEnd()));
                return 3;
            }
            else if (parseMe.Length == 10)
            {
                if (myPen == null)
                {
                    myPen = new BoPen(int.Parse(parseMe[0]), int.Parse(parseMe[1]),
                                      int.Parse(parseMe[2]), int.Parse(parseMe[3]),
                                      int.Parse(parseMe[4]), int.Parse(parseMe[5]),
                                      int.Parse(parseMe[6]), int.Parse(parseMe[7]),
                                      int.Parse(parseMe[8]), int.Parse(parseMe[9].TrimEnd()));
                    myPen.screenResize(myControl.Width, myControl.Height);
                    myTimer.Enabled = true;
                    // myTimer.Start();
                }
                else
                    myPen.update7(int.Parse(parseMe[0]), int.Parse(parseMe[1]),
                                  int.Parse(parseMe[2]), int.Parse(parseMe[3]),
                                  int.Parse(parseMe[4]), int.Parse(parseMe[5]),
                                  int.Parse(parseMe[6]), int.Parse(parseMe[7]),
                                  int.Parse(parseMe[8]), int.Parse(parseMe[9].TrimEnd()));
                return 6;
            }
            else
            {
                throw(new Exception("unexpected number of datum in update string"));
            }
        }
    }
}
C.4 BoPen.cs

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace UDPListener
{   // 294,169,119,36,88,46
    class BoPen
    {
        public delegate void LimitsChangedDelegate(object sender, EventArgs e);
        const int moveToleranceX = 3;
        const int moveToleranceY = 3;
        const int moveToleranceZ = 3;
        int screenWidth;
        int screenHeight;
        int screenDepth;
        private static int _numPens = 0;
        private int _id = -1;
        private bool _currentlyVisible = false;
        private DateTime _lastSeen = DateTime.MinValue,
                         _lastMoved = DateTime.MinValue;
        private TimeSpan _dwellTime = TimeSpan.MinValue;
        private int _lastX = -1 - moveToleranceX,
                    _lastY = -1 - moveToleranceY,
                    _lastZ = -1 - moveToleranceZ,
                    _lastPhi = -1, _lastRho = -1, _lastTheta = -1,
                    _tiltX = -1, _tiltY = -1;
        private int basePhi = 6, baseRho = 5;
        private int _maxX = 624, _minX = 0;
        private int _maxY = 468, _minY = 0;
        private int _maxZ = 140, _minZ = 45;    // 190 and 45 observed max
        private bool autoResize = false;
        private double radTheta = -1;

        /*** NOTE THE USE OF THIS Fx ***/
        public void screenResize(int width, int height)
        {
            screenHeight = height;
            screenWidth = width;
            screenDepth = Math.Max(height, width) / 20;
        }

        public BoPen(int x, int y, int z1, int z2)
        {
            update3(x, y, z1, z2);
            _BoPen();
        }

        public void update3(int x, int y, int z1, int z2)
        {
            if (x == -1)
            {
                _currentlyVisible = false;
            }
            else
            {
                int z = Math.Max(z1, z2);
                x += z1 / 2;   // convert to center of blob
                y += z2 / 2;
                _currentlyVisible = true;
                _lastSeen = DateTime.Now;
                if (((_lastX > 0) & (Math.Abs(_lastX - x) < moveToleranceX)) &&
                    ((_lastY > 0) & (Math.Abs(_lastY - y) < moveToleranceY)) &
                    ((_lastZ > 0) & (Math.Abs(_lastZ - z) < moveToleranceZ)))   // test if have moved
                {
                    _dwellTime = _lastSeen - _lastMoved;
                    if (_dwellTime.Seconds > 1)
                    {
                        // dispatch click
                    }
                    // still use the data in determining other info like extent and vel
                }
                // else
                lastX = x;
                lastY = y;
                lastZ = z;
                _lastMoved = DateTime.Now;
                // dispatch update
                // this way dwell time is determined by how long you stay within an
                // initial position, cant jus...
            }
        }

        private int scalePatternDistance(int dist)
        {
            return dist << 2;
        }

        private void updateTilt(int phi, int rho)
        {
            // eventually baseRho and basePhi are functions of position
            _lastPhi = phi;
            _lastRho = rho;
            _tiltX = (int)((scalePatternDistance(rho - baseRho) * Math.Sin(radTheta)) +
                           (scalePatternDistance(phi - basePhi) * Math.Cos(radTheta)));
            _tiltY = (int)((scalePatternDistance(phi - basePhi) * Math.Sin(radTheta)) +
                           (scalePatternDistance(rho - baseRho) * Math.Cos(radTheta)));
            // phi - basePhi;   // rho - baseRho;
        }

        private void updateRoll(int theta)
        {
            _lastTheta = theta;
            radTheta = (Math.PI * (double)(theta + 45) / 180.0);
        }

        public void update7(int x, int y, int z1, int z2, int id, int phi,
                            int rho, int theta, int dm_x, int dm_y)
        {
            _currentlyVisible = true;
            if (_id == -1)
            {
                _id = id;
            }
            if (_id == id)   // decoded info is valid
            {
                updateRoll(theta);
                updateTilt(phi, rho);
                update3(x, y, z1, z2);
            }
            // else: bad -- the first ID is incorrect, b/c there's no way to change it.
        }

        public BoPen(int x, int y, int z1, int z2, int id, int phi,
                     int rho, int theta, int dm_x, int dm_y)
        {
            update7(x, y, z1, z2, id, phi, rho, theta, dm_x, dm_y);
            _BoPen();
        }

        public BoPen()
        {
            _BoPen();
        }

        private void _BoPen()
        {
            _numPens++;   // increment number of pens
        }

        ~BoPen()
        {
            _numPens--;
        }

        public int lastX
        {
            get { return _lastX; }
            private set
            {
                _lastX = value;
                if (autoResize)
                {
                    _maxX = Math.Max(_lastX, _maxX);
                    _minX = Math.Min(_lastX, _minX);
                }
            }
        }

        public int lastY
        {
            get { return _lastY; }
            private set
            {
                _lastY = value;
                if (autoResize)
                {
                    _maxY = Math.Max(_lastY, _maxY);
                    _minY = Math.Min(_lastY, _minY);
                }
            }
        }

        public int lastZ
        {
            get { return _lastZ; }
            private set
            {
                _lastZ = value;
                if (autoResize)
                {
                    _maxZ = Math.Max(_lastZ, _maxZ);
                    _minZ = Math.Min(_lastZ, _minZ);
                }
            }
        }

        public int x
        {
            get
            {
                if (_maxX != _minX)
                    return (screenWidth -
                            (int)(((float)(_lastX * screenWidth) / (float)(_maxX - _minX))));
                else
                    return screenWidth / 2;
            }
        }

        public int y
        {
            get
            {
                if (_maxY != _minY)
                    return ((int)((float)(_lastY * screenHeight) / (float)(_maxY - _minY)));
                else
                    return screenHeight / 2;
            }
        }

        public int z
        {
            get
            {
                if (_maxZ != _minZ)
                    return (screenDepth / 2 +
                            ((int)(100 * Math.Log10((float)(_lastZ) / (float)(_maxZ - _minZ)))));
                else
                    return screenDepth / 2;
            }
        }

        public int lastTheta
        {
            get { return _lastTheta; }
        }

        /* public int lastRho
        {
            get { return _lastRho; }
        }
        public int lastPhi
        {
            get { return _lastPhi; }
        } */

        public int tiltX
        {
            get { return _tiltX; }
        }

        public int tiltY
        {
            get { return _tiltY; }
        }

        public bool angleOutdated
        {
            // in future use time-based approach
            get
            {
                return (!(_currentlyVisible && (_lastPhi > 0)));
            }
        }
    }
}
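To make the packet format concrete, the short stand-alone C# harness below splits a sample message the same way Tracker.parseString does. It is only an illustrative sketch: the sample string and the PacketFormatDemo class are hypothetical and not part of the thesis software, and the field order follows the streaming sprintf call in Appendix B as reconstructed above.

using System;

// Hypothetical console harness (not part of the BoPen software).
// Fields: blob x, y, dx, dy, then Data Matrix id, dm_x, dm_y, roll angle, two unused fields.
class PacketFormatDemo
{
    static void Main()
    {
        string sample = "294,169,119,36,88,46,12,90,0,0\0";    // assumed example packet
        string[] fields = sample.Remove(sample.IndexOf('\0')).Split(',');
        Console.WriteLine("field count: " + fields.Length);    // 10, so the update7() path is taken
        foreach (string f in fields)
            Console.Write(int.Parse(f.TrimEnd()) + " ");
        Console.WriteLine();
    }
}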