STEREOSCOPIC TRACKING IN 3 DIMENSIONS WITH NEURAL NETWORK HARDWARE

Adam Ruggles
B.S., California State University, Sacramento 2003

PROJECT

Submitted in partial satisfaction of the requirements for the degree of
MASTER OF SCIENCE
in
COMPUTER SCIENCE
at
CALIFORNIA STATE UNIVERSITY, SACRAMENTO

SPRING 2012

STEREOSCOPIC TRACKING IN 3 DIMENSIONS WITH NEURAL NETWORK HARDWARE

A Project
by
Adam Ruggles

Approved by:

__________________________________, Committee Chair
Dr. V. Scott Gordon

__________________________________, Second Reader
Dr. Akihiko Kumagai

____________________________
Date

Student: Adam Ruggles

I certify that this student has met the requirements for format contained in the University format manual, and that this project is suitable for shelving in the Library and credit is to be awarded for the project.

__________________________, Graduate Coordinator    ___________________
Dr. Nikrouz Faroughi                                 Date

Department of Computer Science

Abstract
of
STEREOSCOPIC TRACKING IN 3 DIMENSIONS WITH NEURAL NETWORK HARDWARE
by
Adam Ruggles

The V1KU is a product by Cognimem Technologies, Inc. that combines a hardware neural network chip, a Micron/Aptina monochrome CMOS sensor (camera), and the CogniSight image recognition engine. It is capable of learning a target either from pre-captured example images of the object to track or by using the included camera. This project extends the functionality of the V1KU module for the purposes of tracking. Two V1KU modules are inserted into a camera mounting system that allows them to tilt vertically in a coordinated manner, similar to the way human eyes move up and down together. The harness also allows for horizontal movement of each module individually. In this configuration, the V1KU modules are able to stereoscopically track an object in all 3 dimensions.

The application combines the facilities to teach the V1KU modules a given object and to track that object. The program calculates the position of the identified object using the pixel coordinates and servo angles. It then uses that information to keep the target in the center of the camera for each module. A Kalman filter tracking algorithm is used to predict the next location of the object in case the tracked object becomes obstructed or unidentified for a short period. The result is a tracking solution that can follow any learned target using its cameras alone.

_______________________, Committee Chair
Dr. V. Scott Gordon

_______________________
Date

DEDICATION

I dedicate this project to my wife and two girls for their patience and understanding during this long and time-consuming process.

ACKNOWLEDGEMENTS

I would first like to thank Dr. V. Scott Gordon for allowing me to work on such an interesting and fun project. He was instrumental to my success in this project. I would like to acknowledge Dr. Akihiko Kumagai for his mechanical expertise, which was invaluable in this project. I would also like to add a special thanks to Bill Nagel and Anne Menendez from Cognimem Technologies Inc. for their code and assistance. Finally, I would like to acknowledge Graham Ryland from Barobo Inc. for his help in drawing up my harness design in SolidWorks and laying out the parts to be laser cut.

TABLE OF CONTENTS

Dedication vi
Acknowledgements vii
List of Tables x
List of Figures xi

Chapter
1. INTRODUCTION 1
2. BACKGROUND 3
   2.1 V1KU 3
   2.2 Architecture of the CM1K chip 5
   2.3 Phidget Servo Controller 8
   2.4 Tracking Systems 8
3. CAMERA MOUNTING HARDWARE 12
   3.1 Background 12
   3.2 Design 13
4. SOFTWARE DESIGN 19
   4.1 Definition of Terms 19
   4.2 Algorithms 19
   4.3 Application 28
   4.4 Internal Architecture Overview 32
5. RESULTS 37
6. FUTURE WORK 43
Appendix A. Source Code 44
Appendix B. User Guide 86
Appendix C. Kalman Filter Test Results 94
References 99

LIST OF TABLES

Table 1 Product Specifications for the Servo Controller 8
Table 2 Torque Requirement Calculation 14
Table 3 Kalman Filter Test Results 38
Table 4 Complete Kalman Filter Test Results 94
LIST OF FIGURES

Figure 1 CM1K Functional Diagram [6] 5
Figure 2 TLD Tracking Algorithm 9
Figure 3 HiTec Servo with servo to shaft couplers 13
Figure 4 Rendered harness with perspective view 15
Figure 5 Rendered harness with supports 16
Figure 6 Rendered harness with top down view 16
Figure 7 Servo mount with horn 17
Figure 8 High Level Flow Diagram of the Tracking Algorithm 25
Figure 9 Horizontal Triangulation View 26
Figure 10 Main Class Diagram 32
Figure 11 Kalman Filter Class Diagram 34
Figure 12 Learning Algorithm Class Diagram 35
Figure 13 Transform Learner Class Diagram 36
Figure 14 Results of the Kalman filter during a test run 37
Figure 15 Anne's Algorithm heat map 40
Figure 16 Engine Conservative heat map 40
Figure 17 Engine Moderate heat map 41
Figure 18 Region Learner heat map 41
Figure 19 Transform Learner heat map 42
Figure 20 Engine Normal heat map 42
Figure 21 Settings Tab 87
Figure 22 Learning Tab 89
Figure 23 Track Tab 92

Chapter 1
INTRODUCTION

The V1KU board made by Cognimem Technologies, Inc. provides the perfect platform to construct a complete tracking system. The V1KU contains a neural network chip, camera, FPGA, and USB port all in a single module. The V1KU module, utilizing a radial basis function artificial neural network, is able to efficiently learn and recognize any target that can be captured on the camera. It is capable of learning the target either by giving it examples of the object to track using pre-captured images or by using the included camera.
Several systems utilizing the V1KU have demonstrated its abilities, such as a fish inspection system [14] and a vehicle license plate recognition system [13]. A tracking system based on the V1KU module was developed in 2011 by Hitesh Wadhwani [1]. The application developed, called “BirdView,” used the V1KU in a static configuration to track any learned object that passed in front of a camera. It was later extended [2] to support two V1KUs, each attached to a servo, which allowed the cameras to pan left and right as well as triangulate the location of the tracked object along a horizontal plane about the camera lens.

This project extends the two previous projects in three fundamental ways. First, it provides a camera mounting harness that allows the cameras, in addition to panning, to tilt. The harness also provides the ability to change the distance between the two V1KU modules. Second, the project implements a sophisticated tracking algorithm that is able to predict the location of the tracked object. Third, it processes both cameras in parallel, allowing for maximum performance. Four different learning algorithms are implemented, and more can be easily added in future revisions. The application gives the user all the tools necessary to train and track any object that the cameras can see. In addition, project files obtained from other Cognimem tools can also be loaded into each camera.

Chapter 2
BACKGROUND

This project builds off an existing prototype [1][2] stereoscopic tracking system that is able to track along a two-dimensional plane. The prototype uses a pair of V1KU CogniMem modules. The V1KU is an evaluation board used for video and image recognition. Each module is mounted to a Hitec HS-322HD servo motor using Velcro and rubber bands. The servo motor is rated at a speed of 0.15 seconds per 60 degrees and a torque of 3.7 kg/cm (51 oz/in). Both servo motors are mounted to a wooden board using L-brackets, felt padding and Velcro. The two motors are spaced 11.5 inches apart, measured from the center of the servos. They are then connected to a Phidget 8-motor servo controller. Each V1KU module and the Phidget servo controller are then connected to a computer via a USB 2.0 cable.

The prototype includes an application written in C# with the ability to display images coming from the camera, allow the user to specify the region of interest (ROI), learn and unlearn the selected ROI, run a simple tracking algorithm, and triangulate the location of the learned object. The triangulation in the prototype only takes into account the positions of the servo motors, so it is only accurate if the object is positioned in the center of the camera horizontally.
2.1 V1KU

The feature set for the V1KU module [3]:

- Aptina/Micron MT9V022 Video Sensor
  o Monochrome, progressive scan
  o 752x480 pixel, 60 frames per second
  o Global shutter for fast moving objects
  o 6mm M7 lens with holder
- CM1K Neural Network Chip
  o 1024 silicon neurons working in parallel
  o Classify vectors of up to 256 bytes
  o Up to 16382 categories
  o Up to 127 sub-networks per chip
  o Category readout in 36 clock cycles per firing neuron (1.4 μsec @ 24 MHz clock)
  o Radial Basis Function or K-NN classifier
  o Real-time self-adaptive model generator
- I/O Busses
  o Miniature USB Hi-Speed (480 Mbps)
  o I2C serial interface (100–400 kbit)
  o 2 RS485 serial outputs
  o 1 opto-isolated input line
  o 2 opto-isolated output lines (<60 V, 500 mA)
  o Two 10-pin headers
- CogniSight Recognition Engine on FPGA
  o Simple Read/Write protocol to access all components via USB or RS485
  o Learning and recognition of a fixed region
  o Finding of objects in a region of search
  o Grab video to memory (area and line scan)
  o Load images from and transfer to host
  o Output to opto-isolated relays
- Mechanical and Electrical
  o Powered through USB or external supply
  o 6 V to 36 V, 1 Watt
  o 27 x 27 mm, 120 grams

2.2 Architecture of the CM1K chip

The CM1K is a high-performance pattern recognition chip with a network of 1024 neurons operating in parallel. The chip also contains an embedded recognition engine that is capable of classifying a digital signal received from a sensor. The CM1K uses a Restricted Coulomb Energy network as a non-linear classifier, in combination with a hardwired parallel architecture, to enable high-speed pattern recognition [5]. The CM1K, as shown in Figure 1, is composed of top control logic (NSR and RSR registers, Read and Busy control signals), clusters of 16 neurons, a recognition stage, and an I2C slave.

Figure 1: CM1K Functional Diagram [6]

2.2.1 Top Controller Logic [6]

- Synchronizes communication between the clusters of neurons, the recognition state machine and the I2C slave
- Inter-module communication is made through a bi-directional parallel bus of 25 wires: data strobe (DS), read/write (RW_), 5-bit register (REG), 16-bit data (Data), read (RDY)
- Inter-neuron communication also uses two additional lines indicating the global status of the neural network: identified recognition (ID), uncertain recognition (UNC)
- Communication with an external control unit can be made through the same parallel bus or the serial I2C bus

2.2.2 Cluster of Neurons [6]

- 16 identical neurons operating in parallel
- All neurons have the same behavior and execute the instructions in parallel, independent of the cluster or even chip they belong to
- No controller or supervisor
- Selection of one out of two classifiers: K-Nearest Neighbor (KNN) or Radial Basis Function (RBF)
- Recognition time is independent of the number of neurons in use
  o Recognition status in 2 clock cycles after the broadcast of the last vector component
  o Distance and Category readout in 36 clock cycles per firing neuron
- Automatic model generator built into the neurons
- Save and Restore of the contents of the neurons in 258 clock cycles per neuron
- Simple Register Transfer Level instruction set through 15 registers
- Most operations execute in 1 clock cycle, except for Write LCOMP, Write CAT, Read CAT and Read DIST, which can take up to 19 clock cycles
- Daisy-chain connectivity between the neurons of multiple CM1K chips to build networks with thousands of neurons

2.2.3 Recognition stage (optional usage) [6]

- Enabled physically with the RECO_EN pin and activated programmatically via a control command
- Vectors received through the digital input bus are continuously recognized, and the response can be snooped directly from control lines or is readable through registers
- Recognition is made in 37 clock cycles from the receipt of the last component of a vector
- If the input signal is a video signal, the vector is extracted by the recognition stage from a user-defined region of interest

2.2.4 I2C slave controller (optional usage) [6]

- Enabled physically with the I2C_EN pin
- Receives the serial signal on the I2C_CLK and I2C_DATA lines and converts it into a combination of DS, RW_, REG and DATA signals compatible with the parallel neuron bus

2.3 Phidget Servo Controller

Table 1 contains the product specifications of the 1061_0 - PhidgetAdvancedServo 8-Motor servo controller [15].

Table 1: Product Specifications for the Servo Controller

Pulse Code Period: Typical 20ms – Maximum 25ms
Minimum Pulse Width: 83.3ns
Maximum Pulse Width: 2.7307ms
Output Controller Update Rate: Typical 31 updates/second
Output Impedance (control): 600 Ohms
Position Resolution: 0.0078125° (15-bit)
Lower Position Limit: -22.9921875°
Upper Position Limit: 233°
Velocity Resolution: 0.390625°/s (14-bit)
Velocity Limit: 6400°/s
Acceleration Resolution: 19.53125°/s² (14-bit)
Acceleration Limit: 320000°/s²
Time Resolution: 83.3ns
Minimum Power Supply Voltage: 6V
Maximum Power Supply Voltage: 15V
Max Motor Current Continuous (individual): 1.6A
Max Motor Current (Surge): 3A
Motor Overcurrent Trigger (combined): 12A
Operating Motor Voltage: 5.0V
Device Current Consumption: 26mA max
Operating Temperature: 0 – 70°C

2.4 Tracking Systems

A great deal of research is being conducted in the field of computer vision and in how it relates to tracking systems. Some of this research is summarized in this section.

2.4.1 TLD

Zdenek Kalal describes [16] a real-time algorithm for tracking unknown objects in a video stream. The system uses a learning approach referred to as P-N Learning, a semi-supervised algorithm that guides the learning by generating positive and negative examples as the structural constraints. The algorithm is high performance and learns from errors. The current implementation of the algorithm limits it to tracking a single object using a single monocular camera.

Figure 2: TLD Tracking Algorithm

2.4.2 Unsupervised Tracking of Stereoscopic Video Objects Employing Neural Networks Retraining

Anastasios D. Doulamis, Klimis S. Ntalianis, Nikolaos D. Doulamis, Kostas Karpouzis and Stefanos D.
Kollias describe [17] a Recursive Shortest Spanning Tree implementation. The procedure includes a retraining algorithm for adapting the network weights to the current conditions, semantically meaningful object extraction that plays the role of the retraining set, and a decision mechanism for determining when network retraining should be activated. Object extraction is accomplished by utilizing depth information from the stereoscopic video and incorporating a multi-resolution implementation of the Recursive Shortest Spanning Tree segmentation algorithm.

2.4.3 Neural Network for Real-Time Object Tracking

Javed Ahmed, M. N. Jafri, J. Ahmad, and Muhammad I. Khan describe [18] using a back-propagation neural network (BPNN) for real-time object tracking. The objective of the application is to locate a specific airplane in the frames grabbed from a movie clip playing at a speed of 25 frames/second. The BPNN uses one sigmoid-type hidden layer and a linear output layer. The input layer is determined from the red (r), green (g), and blue (b) values of each pixel and the resolution of the movie. Each pixel is converted to gray scale using the formula $y = (0.212671)r + (0.71516)g + (0.072169)b$. Then the image is down-sampled by extracting only specific rows and columns from the original image. Finally, each input is normalized to a value in [0.0, 1.0].

2.4.4 Vision-Based Tracking System

Michael Baker and Holly A. Yanco describe the progress towards implementing a street-crossing system for an outdoor mobile robot. The application applies a series of operations to the images from the on-board cameras. First, image differencing is applied using inter-frame differencing to extract the regions of motion from frame to frame. Next, a 3x3 median filter is applied to remove camera noise and any interference from background motion. A Sobel edge detector is then used to delineate the motion edges. Then a Mori “sign pattern” scan is used for vehicle detection; this technique uses the shadow underneath a vehicle for detection. The algorithm then attempts to detect lines, which correspond to the roofline of a vehicle. The line-finding algorithm tolerates a specified percentage of outlier pixels; any frame that has too many rejected lines is thrown away as too noisy. The results of the line-finding algorithm and the Mori scan are combined to produce the bounding box for the vehicle. The algorithm includes a history component that helps reduce the noise and smooth the motion results.

2.4.5 Motion-Information-Based Vision-Tracking System

Jaehong Park, Chang-hun Lee, Tae-il Kim, Teajae Lee, M.M. Shaikh, Kwang-soo Kim, and Dong-il Cho describe [20] a vision-tracking system for a mobile robot using robot motion and stereovision data. The pair of cameras is fixed and unable to move independently in their implementation. The application uses a 3-axis gyroscope with the vision system to calculate the target object's position. The vision detection system uses a face recognition algorithm based on a face certainty map. This limits the application to human face recognition; however, the paper explains that any tracking system that gives a pixel location for the target object could be used. The following formulas are used to find the location of the target from the image:

$\alpha_{image} = \arctan\left(\frac{x_0}{\lambda}\right), \quad \beta_{image} = \arctan\left(\frac{y_0}{\lambda}\right)$

$X_i = d \sin(\alpha_{image}) \cos(\beta_{image})$
$Y_i = d \sin(\beta_{image})$
$Z_i = d \cos(\alpha_{image}) \cos(\beta_{image})$

In the above formulas, $\lambda$ is defined as the focal length and d is the distance to the target.
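To make the cited formulas concrete, the following is a minimal sketch of the coordinate computation. The parameter names mirror the symbols above (x0 and y0 are the image-plane offsets from the optical center, lambda the focal length, d the measured distance); this is an illustration, not code from the system described in [20].

```csharp
using System;

// Sketch of the position formulas from [20] quoted above. Illustrative only.
public struct Point3D
{
    public double X, Y, Z;
}

public static class GyroVisionPosition
{
    public static Point3D Locate(double x0, double y0, double lambda, double d)
    {
        double alpha = Math.Atan(x0 / lambda);   // horizontal image angle
        double beta  = Math.Atan(y0 / lambda);   // vertical image angle

        Point3D p;
        p.X = d * Math.Sin(alpha) * Math.Cos(beta);
        p.Y = d * Math.Sin(beta);
        p.Z = d * Math.Cos(alpha) * Math.Cos(beta);
        return p;
    }
}
```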
Chapter 3
CAMERA MOUNTING HARDWARE

The camera mounting system (harness) design must meet specific design criteria. First, the harness must allow the cameras to tilt vertically at the point of the camera lens, and both cameras need to move together. This mimics the way the human eye works, which allows for smooth camera movements as well as making the targeting calculation easier. The design also needs to be simple and rugged. A simple design will be easier to manufacture, and a rugged one will handle the stress of holding up the cameras and servo motors in addition to the mounting materials. The harness must also allow for maximum visibility in the horizontal direction. The addition of a servo and mounting material should not block the vertical viewing angles of the servos. While full 180-degree movement may not be achievable, the design must maximize this value as much as possible. Finally, the design should allow for different positions of the cameras. In the prototype [2] the distance between the cameras is approximately 1 foot; the new design should allow for the same configuration as well as closer configurations.

3.1 Background

There are many camera mounts of various shapes and designs already on the market, and initially I attempted to buy off-the-shelf parts to build the initial design. I found camera mounts similar to the LensMaster Gimbal RH-2 [7], the GigaPan Epic [8] and the PT-2100 Pan & Tilt System [9] to be ideal, but all of those designs were for a single camera. That would require some additional complexity in synchronizing the two cameras. Because no off-the-shelf components could be found to construct the harness, it was necessary to design and build a custom solution.

In the stereoscopic prototype [1][2], the servos that moved the cameras were fastened horizontally using Velcro and rubber bands. That was sufficiently strong when the forces on the fasteners were only pushing down, but as the module is tilted back it is not strong enough to hold them with sufficient stability.

3.2 Design

The first problem the design needs to solve is securing the V1KU modules to the servos that move horizontally. The initial supports of Velcro and rubber bands are not sufficient to secure the modules when they are being tilted. To solve this problem a ¼” servo to shaft coupler is attached to the servo, as seen in Figure 3. A ¼” threaded rod 0.9” in length is used to fasten the V1KU module to the servo. One side of the threaded rod is screwed into the camera mounting hole on the module and the other slides into the servo to shaft coupler. Since the ¼” threaded rod is slightly undersized for the servo to shaft coupler, it needs to be expanded. To accomplish this, two layers of masking tape are applied around the bottom of the threaded rod.

Figure 3: HiTec Servo with servo to shaft couplers

There were no readily available off-the-shelf components that would meet the requirements of the project for the harness, so it was necessary to design and build a custom solution. The servo motors, camera module, servo to shaft couplers and attaching threads were accurately measured to calculate the distance between the camera lens and the mounting arm that the horizontal servos would rest on. The weight of the parts and materials was also measured, and it was determined that a second vertical servo motor would need to be added to ensure a smooth movement at the maximum stress level.
From Table 2 we can see that the calculated torque needed when moving the arm out to a 90-degree angle is 43.06 oz-in, and each servo has a rated torque of 41.6 oz-in. With two servos the available torque is 2 × 41.6 = 83.2 oz-in, so the factor of safety (FOS) is 83.2 oz-in / 43.06 oz-in, which gives a FOS of approximately 1.9.

Table 2: Torque Requirement Calculation

Component   | Description                                        | Calculated Torque
V1KU Module | 4.24 oz (weight) @ 1.14” (distance from the hub) x2 | 9.64 oz-in
Servo motor | 1.5 oz @ 4.40” x2                                   | 13.20 oz-in
Table       | 6.0 oz @ 3.37”                                      | 20.22 oz-in
Total       |                                                     | 43.06 oz-in

The harness was modeled in SolidWorks (3D CAD design software), as can be seen in Figure 4, Figure 5 and Figure 6. Two types of materials were evaluated for the harness: aluminum and acrylic. It was determined that acrylic would be the easiest to work with, since it could be cheaply and precisely laser cut. The final design used 1/8” transparent gray acrylic and was cut by Pololu Robotics and Electronics. The harness allows for three different positions for the camera mounts: 12 inches, 9 inches and 6 inches apart. When in the 12-inch configuration there is a loss of 35 degrees from the field of view. This means that in that configuration the right servo (pointing away from the viewer) can move between 5 degrees and 145 degrees before the harness blocks the center of the camera image. The left servo (pointing away from the viewer) would be blocked from 35 degrees to 175 degrees.

Figure 4 and Figure 6 show the renderings without the supports from the perspective and top-down views. From the perspective view of Figure 4 and Figure 5, you can see the forward-leaning support arms that give the cameras the best possible field of view. In Figure 5, you can see the back support behind the cameras that removes the flex when the cameras are placed in the inner position slots. Also visible are the triangular supports that make the structure easy to assemble, ensuring the proper right angles of the support arms for the cradle and servo mounts.

Figure 4: Rendered harness with perspective view

Figure 5: Rendered harness with supports

Figure 6: Rendered harness with top down view

The harness was originally designed to have the cradle arm connect directly with the shoe of the servo motor. However, even with an oversized screw there was not enough friction to properly secure the servo motor to the cradle. A circular servo horn was glued to the outside of the cradle arm to ensure the proper connection. Washers were added to push the servo motors out so that the cradle arms were still parallel to the base, as seen in Figure 7.

Figure 7: Servo mount with horn

The final hardware configuration consists of the harness made out of 1/8” transparent gray acrylic, with each piece glued together using GOOP. The arm has two circular servo horns glued to the ends of the arms on the bed, and these are connected directly to the servos. The two vertical tilt servos are then mounted to the sides of the table harness using #8-32 x 3/8” machine screws with #8 hex nuts; 4 screws/nuts for each servo motor. Washers were added to adjust the servo back to fit with the servo horns properly. The two horizontal servos, controlling the left and right motion of the V1KU modules, were inserted in one of the three available mounting positions and secured using #8-32 x 3/8” machine screws with #8 hex nuts; 4 screws/nuts for each servo motor. Each servo to shaft coupler is attached to the two horizontal servos, and then the threaded rods are screwed into the cameras.
Finally, the cameras are attached to the servo to shaft couplers and tightened. All 4 servos are connected to an 8-Motor PhidgetAdvancedServo controller. The two vertical tilt servos are connected to a single port, the third position, using a servo splitter. The servo on the left side of Figure 6 is designated as “Servo 1” and inserted into the first position on the controller, while “Servo 2” on the right side is put in the second position on the controller. The servo controller and the two V1KUs are then attached to a computer using USB 2.0 cables.

Chapter 4
SOFTWARE DESIGN

4.1 Definition of Terms

ROI – The region of interest to learn and recognize.
ROS – The region to search for the item of interest.
Distance – A value that indicates the amount of drift between the signature of the ROI and the model of the closest neuron.
BWIDTH – Width of a primitive block inside the Region of Interest in pixels, used by the feature extraction.

4.2 Algorithms

The application can be in 1 of 4 different states: Learning, Recognizing, Tracking and Unlearning. When learning and unlearning, the Cognimem engine is learning how to identify the target within the ROI. When in recognition mode, the user can move the ROI around the current camera view, obtaining the confidence value that the target is identified within the ROI. When in tracking mode, the Cognimem engine will search the ROS. If the target is found, the location is calculated and displayed to the user, and the servos are moved to center the target within the camera. When unlearning, the Cognimem engine is learning the background. Each of the algorithms used to perform those operations is described in the following sections.

4.2.1 Learning

There are two learning categories used for the current design. The two categories are defined as 1 and 0. Learning in category 1 adds a neuron or neurons to learn the ROI. Learning in category 0 shrinks (does not remove) a neuron and is defined as unlearning. While the Cognimem engine is capable of learning more than one category, the application is only concerned with tracking a single object, so only one category is required.

The application contains 4 learning algorithms: Anne's Algorithm, Engine (Conservative, Moderate and Normal), Region Learner, and Transform Learner. The Engine algorithms are derived from Cognimem's SDK examples. Anne's algorithm is based on a code example [21] from Anne Menendez (a founder of Cognimem). At the ROI the V1KU learns the specified category. If the category is zero it ends after performing the first step; otherwise it unlearns at an offset of 8 pixels to the North West (NW), North East (NE), South West (SW), and South East (SE). The NW location is defined as the ROI at x and y with an offset of -8 pixels, and the SE location is defined as the ROI at x and y with an offset of +8 pixels.

The Engine Conservative algorithm uses the Cognimem DLL's built-in LearnROI(Int32, Boolean, Int32) method. It learns the ROI with the specified category and then also learns four neighboring positions, NE, NW, SE, and SW, with the same category at an offset of 2 pixels. The Engine Moderate algorithm uses the Cognimem DLL's built-in LearnROI(Int32); this algorithm only learns the specified area. The Engine Normal algorithm uses the same method as the Conservative Engine but uses BWIDTH as the pixel offset value.
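To make the neighborhood offsets concrete, the sketch below shows the general shape of Anne's algorithm as described above. The learnAt callback is a hypothetical stand-in for the device learn call (the project itself goes through the Cognimem SDK's LearnROI methods); the 8-pixel offsets and the category-zero shortcut follow the text.

```csharp
using System;
using System.Drawing;

// Sketch of the four-corner learn/unlearn pattern described for Anne's algorithm.
// learnAt(x, y, category) is a hypothetical stand-in for the device call that
// learns the ROI placed at (x, y) with the given category (0 = background).
public static class AnnesAlgorithmSketch
{
    public static void Learn(Point roi, int category, Action<int, int, int> learnAt)
    {
        // Step 1: learn the ROI itself with the requested category.
        learnAt(roi.X, roi.Y, category);
        if (category == 0)
            return; // unlearning stops after the first step

        // Step 2: unlearn (category 0) at the four diagonal neighbors, 8 pixels away.
        const int offset = 8;
        learnAt(roi.X - offset, roi.Y - offset, 0); // NW
        learnAt(roi.X + offset, roi.Y - offset, 0); // NE
        learnAt(roi.X - offset, roi.Y + offset, 0); // SW
        learnAt(roi.X + offset, roi.Y + offset, 0); // SE
    }
}
```

The Engine Conservative variant has the same shape, but learns, rather than unlearns, the four neighbors at a 2-pixel offset.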
The Region Learner is used to learn a bit more than the region of interest. It first performs the same operation as the Engine Conservative algorithm at the region of interest. It then attempts to learn an area 1.5 times the size of the ROI with a 4-pixel stepping. At each point it uses the same algorithm as the Engine Normal algorithm. After learning each 4-pixel region, it then unlearns the area surrounding the ROI with a 2-pixel stepping. It does this by calculating an offset that is twice the size of the ROI and then stepping 2 pixels for the North, West, East and South regions. The Region Learner is useful when performing tests in a single time slice, as it learns the target very precisely. The problem with the Region Learner is that it is not very useful when tracking a target, as simply changing the perspective or shading causes the recognition engine to fail to identify the target.

The Transform Learner attempts to automatically learn a region at different light levels and perspectives. First, the algorithm grabs a frame from the camera and performs the same algorithm as the Engine Conservative algorithm. It then crops the ROI and performs a series of 3 transformations: contrast, rotation, and perspective. First, a contrast transformation is applied to the image, both increasing and decreasing the contrast by a level of magnitude. This produces 4 new images, two with the contrast increased and two with the contrast decreased. The images are combined with the background and learned using the ROI and the Engine Normal algorithm. Next, a rotation transformation is applied. The rotation transformation is applied to the original image as well as the 4 previous contrast images. Each image is rotated 5 degrees and 10 degrees in each direction, and after each image generation the Engine Normal algorithm is used to learn the ROI. Finally, the perspective transformation is applied. This time the perspective transformation is only applied to the original cropped ROI and the contrast-transformed images. The perspective transformations adjust the image by performing a shear matrix transformation at 1/10 and 2/10 in both directions for both x and y.

4.2.2 Recognition and Searching

When performing a recognition operation, the ROI is first moved by the user to the area of the image to be checked for the learned object. The V1KU is queried to perform a recognition operation at the position of the ROI. A distance value and category are then retrieved. The confidence is then calculated using the formula $confidence = 100 - \frac{distance}{100}$; if the resulting value is less than zero, it is treated as a confidence of zero.

When performing a search operation, the Cognimem engine performs a similar operation as when recognizing an object, but it searches the region of search (ROS) with a user-definable stepping. The ROS by default is defined as half the camera window width and height; however, it can be adjusted to the full camera window width and height. The ROI is then moved in a raster pattern from the upper left to the lower right, moving the ROI by the value of the search stepping. When a region is identified as recognized, it is captured as a VObject. At the end of the search, each VObject is then queried for the distance and a confidence is calculated. If the confidence is greater than 50, the point is recorded and compared to the best point. If a better point is found, the best point is updated, until all the VObjects have been reviewed. The best point is defined as the point with the highest confidence value. If a best point is found, a flag is set to let the tracking system know that the target has been identified.
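The search loop just described can be summarized by the following sketch. RecognizeAt is a hypothetical stand-in for the V1KU recognition call that returns a distance for the ROI placed at a given point; the raster stepping, the confidence formula, and the greater-than-50 cutoff follow the text.

```csharp
using System.Drawing;

// Sketch of the ROS raster search described above. RecognizeAt(x, y) is a
// hypothetical stand-in that places the ROI at (x, y) and returns the distance
// reported by the recognition engine (a very large value when nothing fires).
public class RosSearchSketch
{
    public delegate int RecognizeAt(int x, int y);

    public static bool Search(Rectangle ros, int step, RecognizeAt recognize,
                              out Point bestPoint, out int bestConfidence)
    {
        bestPoint = Point.Empty;
        bestConfidence = 0;
        bool found = false;

        // Raster scan the region of search from the upper left to the lower right.
        for (int y = ros.Top; y <= ros.Bottom; y += step)
        {
            for (int x = ros.Left; x <= ros.Right; x += step)
            {
                int distance = recognize(x, y);
                int confidence = 100 - distance / 100;   // formula from Section 4.2.2
                if (confidence < 0) confidence = 0;

                // Only candidates above the 50% cutoff are considered; keep the best.
                if (confidence > 50 && confidence > bestConfidence)
                {
                    bestConfidence = confidence;
                    bestPoint = new Point(x, y);
                    found = true;
                }
            }
        }
        return found; // flag consumed by the tracking loop
    }
}
```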
4.2.3 Tracking

A Kalman filter is used as the tracking algorithm due to its ability to deal with the noise/variance introduced when measuring the target's location and its capability of estimating the target location based on previous positions. If the target cannot be detected, the Kalman filter provides a mechanism to estimate the location of the target. The Kalman filter also provides a mechanism to handle measurement inaccuracies that occur due to the ROS stepping, as well as other mechanical variances, when calculating the position of the target.

The Kalman filter is a recursive two-stage filter. At each time interval it performs a predict step and, if there is a new measurement, an observe step.

1) Predict performs the following operations:
   a. Predicted state: $X_t = F_t X_{t-1} + B_t U_t$
   b. Predicted covariance estimate: $P_t = F_t P_{t-1} F_t^T + Q_t$
2) Observe performs the following operations:
   a. Innovation or measurement residual: $Y_t = Z_t - H_t X_t$
   b. Innovation or residual covariance: $S_t = H_t P_t H_t^T + R_t$
   c. Optimal Kalman gain: $K_t = P_t H_t^T S_t^{-1}$
   d. Updated state estimate: $X_t = X_{t-1} + K_t Y_t$
   e. Updated estimate covariance: $P_t = (I - K_t H_t) P_{t-1}$

Definition of parameters [12]:
- $X_t$ is the current state vector at time t
- $Z_t$ is the measurement vector at time t
- $P_t$ measures the estimated accuracy of $X_t$ at time t
- $F_t$ is the state transition matrix
- $H_t$ defines the mapping from the state vector to the measurement vector
- $Q_t$ and $R_t$ define the Gaussian process and measurement noise respectively
- $B_t$ and $U_t$ are control-input parameters
- $I$ is the identity matrix

Definition of the system:

$Z = [x, y, z]$
$X = [x, y, z, dx, dy, dz]$

$F = \begin{bmatrix} 1&0&0&1&0&0 \\ 0&1&0&0&1&0 \\ 0&0&1&0&0&1 \\ 0&0&0&1&0&0 \\ 0&0&0&0&1&0 \\ 0&0&0&0&0&1 \end{bmatrix} \qquad H = \begin{bmatrix} 1&0&0&0&0&0 \\ 0&1&0&0&0&0 \\ 0&0&1&0&0&0 \end{bmatrix}$

$Q = \begin{bmatrix} .25&0&0&.5&0&0 \\ 0&.25&0&0&.5&0 \\ 0&0&.25&0&0&.5 \\ .25&0&0&1&0&0 \\ 0&.25&0&0&1&0 \\ 0&0&.25&0&0&1 \end{bmatrix} * 4 \qquad R = \begin{bmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{bmatrix} * 16 \qquad P = I_{6} * 16$

U is set to a 1x1 matrix with a value of 0.
B is set to a 6x1 matrix with all elements set to 0.

Figure 8: High Level Flow Diagram of the Tracking Algorithm

Figure 8 shows the flow of the tracking algorithm. When tracking is enabled, two worker threads are spawned; each thread performs a search in the ROS. The main thread then blocks, waiting for both threads to complete. Once the search is completed, an estimated value is queried from the Kalman filter. If both V1KUs recognize the object within the ROS, the Kalman filter is updated with the new measurement and the V1KUs are then moved to a position where the target is in the center of the camera. If only one of the V1KUs recognized the target, the Kalman filter is not updated with a new measurement. Instead, the estimated value is used to adjust the V1KU that was not able to find the target. The other V1KU is updated using the recognized target location. If neither V1KU is able to recognize the target, the estimates from the Kalman filter are used to update the positions of the V1KUs. If the Kalman filter is still in its initial state, the V1KUs remain in their current positions. This process is repeated until tracking is no longer enabled.
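Because F, H, Q, R, and P above are block-diagonal per axis, the six-state filter decouples into three independent position–velocity filters, one each for x, y, and z. The following is a minimal sketch of one such axis (not the project's Filter::KalmanFilter class), assuming the symmetric constant-velocity form of the process noise shown above and a time step of one search iteration.

```csharp
// Minimal per-axis Kalman filter sketch (position p and velocity v for one axis).
// This mirrors the predict/observe equations in Section 4.2.3 with
// F = [[1,1],[0,1]], H = [1,0], Q = 4*[[0.25,0.5],[0.5,1]], R = 16, P0 = 16*I.
public class AxisKalmanSketch
{
    private double p, v;                 // state estimate (position, velocity)
    private double p11 = 16, p12 = 0,    // covariance P (2x2)
                   p21 = 0, p22 = 16;

    private const double Q11 = 1.0, Q12 = 2.0, Q22 = 4.0;  // 4 * [[.25,.5],[.5,1]]
    private const double R = 16.0;                         // measurement noise

    public double Estimate { get { return p; } }

    // Predict: X = F*X, P = F*P*F' + Q (B*U is zero in this system).
    public void Predict()
    {
        p = p + v;
        double n11 = p11 + p21 + p12 + p22 + Q11;
        double n12 = p12 + p22 + Q12;
        double n21 = p21 + p22 + Q12;
        double n22 = p22 + Q22;
        p11 = n11; p12 = n12; p21 = n21; p22 = n22;
    }

    // Observe: measurement update with a scalar measurement z for this axis.
    public void Observe(double z)
    {
        double y = z - p;            // innovation, Y = Z - H*X
        double s = p11 + R;          // innovation covariance, S = H*P*H' + R
        double k1 = p11 / s;         // Kalman gain, K = P*H'/S
        double k2 = p21 / s;

        p += k1 * y;                 // updated state estimate
        v += k2 * y;

        double n11 = (1 - k1) * p11;          // P = (I - K*H) * P
        double n12 = (1 - k1) * p12;
        double n21 = p21 - k2 * p11;
        double n22 = p22 - k2 * p12;
        p11 = n11; p12 = n12; p21 = n21; p22 = n22;
    }
}
```

Three such filters side by side reproduce the six-state system defined above; the project's Filter::KalmanFilter class instead keeps the full matrices explicit (see Figure 11).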
4.2.4 Triangulation

Figure 9: Horizontal Triangulation View (servo angles S1 and S2, baseline t, sides a and b, distance d, angles A and T)

Figure 9 shows a horizontal view represented as a triangle. S2 represents the angle of the second servo and S1 represents the angle of the first servo. The dotted line represents the horizontal distance line. From the horizontal view, we can calculate the x and y positions of the tracked object. The coordinate system is relative to a point exactly between the two servos, at the height of the center of the camera lens. This point is referred to as the reference point for the rest of this section. We know the distance between S1 and S2 as well as their angles read from the servo controller. The values from the servo controller are in the range of 40 to 200, which are converted to values from 0 to 180.

$y = y_0 + (x - x_0) \cdot \frac{y_1 - y_0}{x_1 - x_0} \qquad (1)$

To convert the servo values, the interpolation function (1) is used, where $(y_0, y_1)$ are set to (0, 180), $(x_0, x_1)$ are set to (40, 200), and x is the position of the servo angle read from the servo controller. This gives us the value y, which is the angle in degrees.

In addition to the servo positions, the application must also take account of where the object was found in the camera image and estimate the value the servo would need to be to center the image. The value of the servo motor offset was derived by positioning various objects around the camera, centering the camera on the object, and measuring the angle of change. For an object at the edge of the horizontal view the offset would be -22.5 for the left and 22.5 for the right. For the vertical view it was calculated to be -16.0 for the top and 16.0 for the bottom. These values are valid for the full resolution of the camera. If we set the camera window to half the resolution, centered within the view, those values need to be adjusted by half. To calculate the offset along the horizontal dimension, the formula $\frac{x - 376}{376} \cdot 22.5$ is used, where 376 is half the camera width in pixels and 22.5 is the measured maximum offset described above. For the vertical dimension the offset is calculated with the formula $\frac{y - 240}{240} \cdot 16.0$, where 240 is half the camera height in pixels and 16.0 is the measured maximum offset. The values calculated for each camera are then added to the servo angles to derive our final angles. Since both cameras affect the third servo, the value from the two cameras is averaged to calculate its offset.

With the two known servo angles, we can calculate the third angle by taking 180 and subtracting the other two servo angles. Now we can calculate the length of a or b using the law of sines: $\frac{a}{\sin S2} = \frac{b}{\sin S1} = \frac{t}{\sin T}$. In this implementation, side a was used for all of the remaining calculations. With the law of cosines, $d^2 = a^2 + \left(\frac{t}{2}\right)^2 - 2a \cdot \frac{t}{2} \cdot \cos S1$, we can calculate the distance of the target from our reference point, where d is the length of the dotted line. Finally, we can calculate the angle A, again using the law of cosines: $A = \cos^{-1}\left(\frac{d^2 + \left(\frac{t}{2}\right)^2 - a^2}{2 \cdot d \cdot \frac{t}{2}}\right)$. From there we draw an imaginary right angle and complete the equations to solve for the other sides of the triangle. Some special situations also have to be considered: if angle A is equal to 90 degrees, then y = d and x is zero. If angle A is greater than 90 degrees, then x is defined as being negative; otherwise we know x is a positive value.
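The horizontal triangulation above can be written out as the following sketch (an illustration of the formulas, not the project's calculateLocation method). The servo readings are assumed to already include the pixel-based camera offsets just described, and the servo-to-degree conversion uses the (40, 200) to (0, 180) mapping of equation (1).

```csharp
using System;

// Sketch of the horizontal triangulation described in Section 4.2.4.
// servo1Raw/servo2Raw are controller readings in the 40..200 range (already
// adjusted by the pixel-based camera offsets); baseline t is the distance
// between the two servos in inches.
public static class TriangulationSketch
{
    // Equation (1): interpolate a 40..200 controller value to 0..180 degrees.
    public static double ToDegrees(double raw)
    {
        return 0 + (raw - 40) * ((180.0 - 0.0) / (200.0 - 40.0));
    }

    public static void LocateHorizontal(double servo1Raw, double servo2Raw, double t,
                                        out double x, out double y, out double d)
    {
        double s1 = ToDegrees(servo1Raw) * Math.PI / 180.0;
        double s2 = ToDegrees(servo2Raw) * Math.PI / 180.0;
        double T  = Math.PI - s1 - s2;              // third angle of the triangle

        // Law of sines: a / sin(S2) = t / sin(T); a is the side from servo 1 to the target.
        double a = t * Math.Sin(s2) / Math.Sin(T);

        // Law of cosines: distance d from the reference point (midpoint of the baseline).
        double half = t / 2.0;
        double d2 = a * a + half * half - 2.0 * a * half * Math.Cos(s1);
        d = Math.Sqrt(d2);

        // Angle A at the reference point, then resolve x and y with the sign
        // convention from the text (A greater than 90 degrees makes x negative).
        double A = Math.Acos((d2 + half * half - a * a) / (2.0 * d * half));
        y = d * Math.Sin(A);
        x = d * Math.Cos(A);
    }
}
```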
The calculation of z is much simpler. Using the value y calculated earlier as one side and the angle from the third servo position, we can draw an imaginary right triangle with the 90-degree angle between sides z and y. Knowing two angles and y, we can use the law of sines to calculate the length of z. A couple of special situations have to be considered when calculating the z value as well. If the angle for the third servo is 90 degrees, then we know the value of z is zero. If the angle is greater than 90 degrees, then we define z as being positive; otherwise z is negative. To calculate the true distance of the target, the following equation is used: $distance = \sqrt{d^2 + z^2}$.

The above equations can be reversed to obtain the servo positions from a target position. This is necessary for using the Kalman filter. When a target cannot be found, the value predicted by the Kalman filter can be used. This allows the tracking to continue even if another object obstructs the target for a short period.

4.3 Application

The application “Stereoscopic3DTracking” was developed using Microsoft Visual Studio 2010, utilizing the .NET 4.0 framework. The application is heavily inspired by sample applications from the Cognimem Technologies, Inc. C# Software Development Kit [11], “BirdView” [1], and “V1KU_Stereo” [2]. The application extends the functionality of its predecessors, allowing tracking of an object in 3 dimensions. The application utilizes the capabilities of the V1KU to quickly search for and recognize a learned object, then return the best possible position of that object in real time. The V1KU utilizes a radial-basis neural network architecture to first learn and then recognize any target under the ROI.

There are two primary modes in the application: Learning and Tracking. Each operation is allocated its own tab with just the functionality needed to perform the task. This way the user is not overwhelmed by toggles, buttons and switches unrelated to the activity they are performing. When on the learning tab the following operations are available:

- Learn – Learns the object at the ROI using the selected learning algorithm, but only once for each time the button is pressed.
- Continuous Learn – Learns the object at the ROI using the selected algorithm, but will learn as fast as the V1KU and algorithm allow for as long as the button is down.
- Unlearn – Learns the background at the ROI location using the selected algorithm, but only once for each time the button is pressed.
- Continuous Unlearn – Learns the background at the location of the ROI using the selected learning algorithm. The continuous unlearn works like the continuous learn, as it will continue to learn the background for as long as the button is held down.
- Find – Searches the entire camera view and displays to the user a point at each location where it believes the learned object exists. It draws the points as a heat map: blue for cold and red for the amount of heat (confidence) it has at that point.
- Reset – Sets both V1KUs back to their initial state with zero learned objects.
- View Neurons – Allows the user to see the information stored in each neuron for the object learned by the V1KU.
- Swap Cameras – Allows the user to train each camera individually, and is also used to map the correct camera to the correct servo without having to swap the modules to different USB ports.
- Increase/Decrease ROI – Increases or decreases the ROI by 16 pixels.
- Learning algorithm – Allows the user to select one of the learning algorithms described in Section 4.2.1.
- Search Stepping – Allows the user to adjust the ROS stepping value from 1 to 25.
- Adjusting the ROI – By clicking or dragging the mouse in the camera image the user can select the location of the ROI.
- Copy – Copies the neurons' “knowledge” from the selected V1KU to the other.
- Save – Saves the neurons' “knowledge” from the current camera to a chosen file location and name.
- Load – Loads the neurons' “knowledge” from a previous save, either from a previous training session or from another Cognimem Technologies, Inc. tool.

When on the tracking tab the following operations are available:

- Enable/disable a live feed – This option allows the user to disable the live camera feed. This speeds up the tracking.
- Enable/disable tracking – Switches the application between recognition mode and tracking mode.
- Reset servos – Resets all the servos back to their default positions.
- Manually adjust servo – This option allows the user to adjust a track bar linked to one of three servos. The servos will adjust their position to match the value of the track bar.
- Visually inspect the location of the target and search info – In this view the user can see the X, Y, Z, and Distance from the target in inches. There are also top and horizontal views of the servos. In addition, while in tracking mode a red box is drawn in the live feed indicating where the target was last recognized.

There is also a third tab for camera settings. It is recommended to use the presets when changing the available options. The settings allow the user to adjust the camera height and width, the minimum/maximum values of the servos, as well as the amount of change it takes to center the servo in one time step if the target is found at the edge of the camera screen.

4.4 Internal Architecture Overview
Figure 10: Main Class Diagram (MainApplication, V1KUTracker, Tracking::Location, Tracking::ServoPosition, Filter::KalmanFilter, FrmNeuronContent, and the V1KU_STATE and VIEW_STATE enumerations)

The diagram in Figure 10 shows the core architecture of the application. The MainApplication is responsible for initializing all of the objects, configuring the Kalman filter, the V1KUs, and the servo controller. After everything is configured, a main worker thread is started. The main worker thread is responsible for starting a worker thread for each V1KU to handle the 4 application states (LEARN, RECOGNIZE, TRACK, UNLEARN). After each worker thread has completed its task, the main worker thread synchronizes the threads. If the application is in the TRACK state, the work flow shown previously in Figure 8 is performed in the main worker thread.

The V1KUTracker class is responsible for containing all the state data of the V1KU and the methods to perform operations related to the V1KU. The recognize, search, moveROI, moveROS, adjustROS, learn and reset methods are not thread safe, and the object must be locked before performing those operations within a thread. Accessing the V1KU property is also not thread safe, and the object must be locked before accessing that property.

Figure 11 shows the class structure of the Kalman filter. Parts of the Matrix class and the two LUDecomposition and QRDecomposition classes derive from the Jama Java implementation [22], which is in the public domain. Figure 12 displays the learning algorithm class structure. Each implementation extends the LearnInterface, whose doLearn method is used by the V1KUTracker class. This allows different learning implementations to be swapped in and out. Figure 13 shows the class structure of the Transform Learner algorithm. This implementation uses the ModelSynthesizer class library from Bill Nagel at Cognimem Technologies, Inc.
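The plug-in arrangement can be pictured with the following reduced sketch (the real signatures live in Appendix A; the tracker stub and the SingleShotLearner class here are only stand-ins so the example compiles on its own). The point is that a V1KUTracker holds a Teacher of type LearnInterface, so a new learning algorithm only has to implement doLearn.

```csharp
// Reduced sketch of the learning plug-in design shown in Figures 12 and 13.
// The tracker below is a stand-in with just enough surface to show the idea;
// the actual V1KUTracker exposes the ROI, the CogniSight handle, and more.
public interface LearnInterface
{
    void doLearn(int category, TrackerStub tracker);
}

// Stand-in for V1KUTracker: holds the pluggable Teacher and delegates to it.
public class TrackerStub
{
    public LearnInterface Teacher { get; set; }

    public void learn(int category)
    {
        Teacher.doLearn(category, this);   // whichever algorithm is currently selected
    }
}

// One hypothetical algorithm; EngineLearn, RegionTeacher, LearnTransform, etc.
// would each be another implementation of the same interface.
public class SingleShotLearner : LearnInterface
{
    public void doLearn(int category, TrackerStub tracker)
    {
        // learn only the current ROI with the given category
        // (device calls omitted in this sketch)
    }
}

// Swapping algorithms is then a single assignment, e.g.:
//     tracker.Teacher = new SingleShotLearner();
```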
Figure 11: Kalman Filter Class Diagram (Filter::KalmanFilter, Filter::Matrix, Filter::LUDecomposition, Filter::QRDecomposition)

Figure 12: Learning Algorithm Class Diagram (LearnInterface with the EngineLearn, RegionTeacher, DefaultLearnImpl and LearnTransform implementations, and V1KUTracker)
Learn::LearnTransform, «interface»Stereoscopic3DTracking::LearnInterface, ModelSynthesizer::RotateTransform, ModelSynthesizer::ContrastTransform, ModelSynthesizer::PerspectiveTransform, «interface»ModelSynthesizer::ITransform
Figure 13: Transform Learner Class Diagram
37
Chapter 5 RESULTS
The project required a solid harness to mount the cameras and servos. There were very specific requirements on how the cameras should behave when panning and tilting. The designed harness required only one modification to meet all of the specified requirements: circular servo horns and lock washers were added to the servos. The horns allowed for a solid connection between the swinging bed and the servos, and the lock washers were required to push the servos out to make room for the added servo horns. The two servos were easily able to handle the weight of the system, even in the worst-case scenario of +/- 90 degrees from the rest position.
Figure 14: Results of the Kalman filter during a test run (predicted distance PD and measured distance MD plotted against time step)
38
Table 3: Kalman Filter Test Results
T 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 PX PY PZ PD 8.464248 10.10626 12.36098 9.175526 9.960176 8.111338 7.66872 7.607709 7.359861 7.267969 7.359173 7.314268 7.690673 7.579441 7.400706 7.35686 7.313014 7.269169 7.865785 7.709048 7.644054 7.569329 7.555236 7.487743 7.449714 7.411684 7.343795 7.59285 7.738489 7.676457 7.392832 7.165322 7.308795 36.44295 44.28579 54.23477 38.54506 41.56582 33.83067 30.0749 29.12561 29.82021 31.15448 32.28936 32.66398 32.45877 30.83535 29.73958 29.18057 28.62156 28.06255 32.41423 34.57016 33.98323 31.21876 29.02345 28.8914 28.10519 27.31897 31.16102 35.65085 36.49873 32.80905 28.16245 26.08202 29.02912 -9.07748 -10.4933 -12.8037 -9.59331 -10.4118 -9.9437 -9.00927 -9.07718 -7.11383 -6.95272 -7.91439 -7.94361 -9.5955 -8.97791 -7.8643 -7.62496 -7.38561 -7.14627 -9.68672 -10.5599 -9.57938 -7.57377 -6.22843 -6.35463 -5.80753 -5.26043 -8.81441 -10.0336 -9.87942 -8.42342 -6.79275 -5.90029 -6.85644 38.49848 46.62056 57.0801 40.76694 43.99238 36.18266 32.31835 31.44159 31.52806 32.73783 34.04993 34.40254 34.71011 32.99803 31.63954 31.04463 30.45031 29.8566 34.73306 36.95994 36.12556 33.00406 30.63063 30.51492 29.65008 28.79117 33.20594 37.80617 38.59591 34.73206 29.89848 27.68441 30.71024 MX MY 6.998 30.13 7.579 33.4 MZ -7.505 -7.785 MD 31.875 35.171 7.52 31.42 -7.891 33.307 7.574 31.82 7.532 29.57 -9.246 -8.422 34.058 31.731 7.439 30.61 7.381 31.79 7.482 32.35 -7.243 -7.458 -8.419 32.374 33.52 34.31 7.695 32.14 7.497 30.48 7.367 29.99 -9.41 -8.429 -7.601 34.436 32.562 31.86 7.814 7.603 7.621 7.574 7.586 7.509 32.24 34.28 32.96 30.58 29.48 30.05 -9.511 -10.283 -9.001 -7.434 -6.801 -7.285 34.587 36.657 35.065 32.42 31.243 31.876 7.386 7.662 7.709 7.595 7.339 7.22 7.449 7.735 31.3 35.71 34.97 31.17 28.11 27.69 31.22 35.62 -8.788 -9.697 -9.18 -7.902 -6.874 -6.52 -7.721 -9.228 33.407 37.844 37.022 33.098 29.913 29.403 33.068 37.652
39
Figure 14 and Table 3 display the test results of using the Kalman filter. MD represents the measured distance calculated from the pixel positions on the cameras and the servo positions. PD is the predicted distance from the Kalman filter.
The table includes the predicted and measured X, Y and Z values at time T. After the first six measured values, the Kalman filter appears to be a very reliable source for predicted results, even though the implementation uses such a simplistic model of motion.
The learning algorithms were a mixed bag. The transform learner and the region learner performed the worst. The region learner worked well as long as the object never moved, but it used too many neurons to allow further training to improve the results. The transform learner did not work as well as was hoped. It was unable to learn the object under different lighting conditions and rotations. It also had a similar issue to the region learner, and because of its longer training time it could not use the continuous training option. Out of the four algorithms, Anne's algorithm performed the best. After learning a target in one location, it was better able to detect the object when it was moved to a different location. It did, however, produce many false positives, so it required extra unlearn operations. This entails moving the ROI over the misidentified locations and telling the V1KU to unlearn them (learn them as the background). Using the continuous unlearn option worked the best. The "Engine Moderate" algorithm produced the most false positives and required the most time to train an unknown object. The figures below show how each algorithm performed after learning an unknown object with a single learn operation. The results of the tests are displayed as dots on the screen, red for areas of high confidence and blue for areas of low confidence.
40
Figure 15: Anne's Algorithm heat map
Figure 16: Engine Conservative heat map
41
Figure 17: Engine Moderate heat map
Figure 18: Region Learner heat map
42
Figure 19: Transform Learner heat map
Figure 20: Engine Normal heat map
43
Chapter 6 FUTURE WORK
There are two fundamental areas in which this project can be extended. The first involves improving the learning algorithms. The current approach is time consuming, requiring the repositioning of the tracked object as well as the repositioning of the servos to change the perspective of the camera. This learning process can take anywhere from 20 minutes to a couple of hours to properly train. It would be useful if a training method could be introduced that would automatically perform those operations. In addition to learning the object ahead of time, it would be advantageous for the system to learn while it was tracking in order to improve its results.
The second fundamental extension would be to add the tracking system to a moving robot. Fast real-time visual tracking lends itself well to robotics. One example is soccer- or basketball-playing robots. With the ability to gauge the distance of a tracked object and give the coordinates of its location, the tracking system would be ideally suited for such a task.
One other area of improvement is in the tracking algorithm. The Kalman filter, while a good tracking algorithm, is only suited for linear problems. An obvious improvement would be to implement an extended Kalman filter or the invariant extended Kalman filter. It might also be possible to improve the search speed of the V1KU by searching only areas of movement. Filters could be applied to the camera image to detect only the regions that differ from the background. Those regions are much smaller than the entire camera space, so it should be faster for the V1KU to search only those specific areas.
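To illustrate this last idea, the following is a minimal sketch of a motion-gated search. It is not part of the project; it assumes two consecutive grayscale frames of the camera window and only the moveROS and adjustROS methods of the V1KUTracker class listed in Appendix A. The MotionGate class name, the threshold parameter and the GetPixel-based loop are purely illustrative (a real implementation would use LockBits for speed).

using System;
using System.Drawing;

namespace Stereoscopic3DTracking
{
    // Sketch only: restricts the V1KU's region of search to the area that
    // changed between two consecutive frames. Not part of the project.
    static class MotionGate
    {
        public static void RestrictSearchToMotion(
            Bitmap previous, Bitmap current, V1KUTracker tracker, int threshold)
        {
            int minX = int.MaxValue, minY = int.MaxValue;
            int maxX = int.MinValue, maxY = int.MinValue;

            // Find the bounding box of pixels that differ between the two frames.
            for (int y = 0; y < current.Height; y++)
            {
                for (int x = 0; x < current.Width; x++)
                {
                    int diff = Math.Abs(
                        current.GetPixel(x, y).R - previous.GetPixel(x, y).R);
                    if (diff < threshold) continue;
                    if (x < minX) minX = x;
                    if (x > maxX) maxX = x;
                    if (y < minY) minY = y;
                    if (y > maxY) maxY = y;
                }
            }

            if (maxX < minX)
            {
                // Nothing moved; fall back to searching the whole camera window.
                tracker.adjustROS(false);
                return;
            }

            // Limit the region of search to the changed area for the next search.
            tracker.moveROS(minX, minY, maxX - minX + 1, maxY - minY + 1);
        }
    }
}

Because moveROS already clamps the requested rectangle to the camera window, the next call to the tracker's search method would then scan only the changed region, which should usually be considerably smaller than the full frame.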
44 APPENDIX A Source Code //MainApplication.cs (partial) namespace Stereoscopic3DTracking { public enum VIEW_STATE { TAB_LEARN, TAB_TRACK } public enum V1KU_STATE { UNLEARN, LEARN, RECOGNIZE, TRACK } public partial class MainApplication : Form { private ManualResetEvent[] doneEvents = new ManualResetEvent[2]; private volatile int fps; private volatile bool liveVideo = true; private volatile V1KU_STATE v1kuState; private volatile VIEW_STATE viewState; private volatile bool running = true; private volatile bool done = false; private volatile bool resetROS = false; // This is used on the FormClose method to make sure the uninitialization isn't called twice. private bool uninitialized = false; private Thread mainWorkerThread = null; // Array to hold the tracking information. private V1KUTracker[] trackers = { new V1KUTracker(0), new V1KUTracker(1) }; // Camera Settings. private int binning = 1; // 1 = full resolution, 2 = half resolution. private int cameraWidth = 376; private int cameraHeight = 240; private int halfCameraWidth = 376 / 2; private int halfCameraHeight = 240 / 2; private double servoDeltaX = 22.5 / 2; private double servoDeltaY = 16 / 2; // Default Object Size values. private int objectSizeWidth = 16; private int objectSizeHeight = 16; // Default Region of Search step value. private const int ROS_STEP = 8; // values for indexing the servos and cameras. private volatile int cam1 = 0, cam2 = 1; private volatile int servo1 = 0, servo2 = 1, servo3 = 2; private double servoDistance = 12.0; private double minServoPos = 40.0; 45 private double maxServoPos = 200.0; // Servo Controller. private AdvancedServo advServo = new AdvancedServo(); // Filter used to predict the next state. private KalmanFilter filter = new KalmanFilter(); private Matrix measurement = new Matrix(new double[1][] { new double[] { 0, 0, 0 } }).Transpose(); private int filterCounter = 0; public MainApplication() { InitializeComponent(); // Init state information. viewState = VIEW_STATE.TAB_LEARN; v1kuState = V1KU_STATE.RECOGNIZE; tabMain.SelectedIndex = 0; running = true; // Setup Servo Controller advServo.Attach += new AttachEventHandler(advServo_Attach); advServo.Detach += new DetachEventHandler(advServo_Detach); advServo.Error += new Phidgets.Events.ErrorEventHandler(advServo_Error); advServo.PositionChange += new PositionChangeEventHandler(advServo_PositionChange); advServo.VelocityChange += new VelocityChangeEventHandler(advServo_VelocityChange); advServo.open(); advServo.waitForAttachment(); // Setup the servos. configureServos(servo1); configureServos(servo2); configureServos(servo3); // Configure the Kalman Filter for predictive tracking. configureFilter(); // Set the max/min range for the servo. tbrCam1.SetRange( (int)advServo.servos[servo1].PositionMin, (int)advServo.servos[servo1].PositionMax); tbrCam2.SetRange( (int)advServo.servos[servo2].PositionMin, (int)advServo.servos[servo2].PositionMax); tbrVertical.SetRange( (int)advServo.servos[servo3].PositionMin, (int)advServo.servos[servo3].PositionMax); // Initialize settings page cboBinning.SelectedIndex = binning - 1; txtCameraWidth.Text = Convert.ToString(cameraWidth); txtCameraHeight.Text = Convert.ToString(cameraHeight); txtServoDeltaX.Text = Convert.ToString(servoDeltaX); txtServoDeltaY.Text = Convert.ToString(servoDeltaY); txtMinServoPosition.Text = Convert.ToString(minServoPos); txtMaxServoPosition.Text = Convert.ToString(maxServoPos); txtServoDistance.Text = Convert.ToString(servoDistance); 46 // Create the Main Worker Thread. 
mainWorkerThread = new Thread(mainThread); mainWorkerThread.Name = "Main Worker Thread"; mainWorkerThread.IsBackground = false; // Configure the cameras if (trackers[cam1].V1KU.DeviceFound && trackers[cam2].V1KU.DeviceFound) { trackers[cam1].V1KU.Comm.Connect( CogniMemEngine.Platforms.V1KU_board, cam1); trackers[cam2].V1KU.Comm.Connect( CogniMemEngine.Platforms.V1KU_board, cam2); cboLearningAlgorithm.SelectedIndex = 0; trackers[cam1].reset(); trackers[cam2].reset(); configureV1KU(cam1); configureV1KU(cam2); // Start the main working thread. mainWorkerThread.Start(); // Enable the timers. timerCamera.Enabled = true; timerFPS.Enabled = true; } else { Console.Error.WriteLine("No camera detected! Program will exit."); MessageBox.Show("No camera detected! Program must exit."); done = true; Application.Exit(); } } private void configureServos(int servoIndex) { advServo.servos[servoIndex].Engaged = true; advServo.servos[servoIndex].Position = interpolate( 90.0, 0, 180, minServoPos, maxServoPos); advServo.servos[servoIndex].Acceleration = 180000; advServo.servos[servoIndex].VelocityLimit = 316; } private void configureCameraWindow(V1KUTracker tracker) { if (binning == 1) { if (cameraWidth == 752 && cameraHeight == 480) { tracker.V1KU.Camera.SetWindow( 0, 0, cameraWidth, cameraHeight); } else { tracker.V1KU.Camera.SetWindow( halfCameraWidth / 2, halfCameraHeight / 2, cameraWidth, cameraHeight); } 47 } else { tracker.V1KU.Camera.SetBinning(binning); } tracker.WindowHeight = cameraHeight; tracker.WindowWidth = cameraWidth; tracker.adjustROS(true); } private void configureV1KU(int index) { V1KUTracker tracker = trackers[index]; lock (tracker) { Debug.WriteLine(tracker.V1KU.Platform.ToString()); tracker.V1KU.CogniMem.FORGET = 0; configureCameraWindow(tracker); //tracker.V1KU.CSR = 0; tracker.V1KU.Camera.AGC = false; tbrShutter.Value = tracker.V1KU.Camera.SHUTTER; // min=0 max=480 tbrGain.Value = tracker.V1KU.Camera.GAIN; // min=0 max=64 // Set the ROI. int roiX = halfCameraWidth - (objectSizeWidth / 2); int roiY = halfCameraHeight - (objectSizeHeight / 2); tracker.moveROI(roiX, roiY, objectSizeWidth, objectSizeHeight); // Make the ROS 4 times bigger than the ROI. 
int rosX = halfCameraWidth - ((objectSizeWidth * 2) / 2); int rosY = halfCameraHeight - ((objectSizeHeight * 2) / 2); tracker.adjustROS(true); tracker.V1KU.ROSSTEPX = ROS_STEP; tracker.V1KU.ROSSTEPY = ROS_STEP; cboSearchStep.SelectedIndex = ROS_STEP - 1; } } private void configureFilter() { filter.X = new Matrix(new double[1][] { new double[] { 0, 0, 0, 0, 0, 0} }).Transpose(); filter.F = new Matrix(new double[6][] { new double[] { 1.0, 0.0, 0.0, 1.0, 0.0, 0.0}, new double[] { 0.0, 1.0, 0.0, 0.0, 1.0, 0.0}, new double[] { 0.0, 0.0, 1.0, 0.0, 0.0, 1.0}, new double[] { 0.0, 0.0, 0.0, 1.0, 0.0, 0.0}, new double[] { 0.0, 0.0, 0.0, 0.0, 1.0, 0.0}, new double[] { 0.0, 0.0, 0.0, 0.0, 0.0, 1.0} }); filter.B = new Matrix(new double[6][] { new double[] { 0.0, }, new double[] { 0.0, }, new double[] { 0.0, }, new double[] { 0.0, }, new double[] { 0.0, }, new double[] { 0.0, } }); filter.U = new Matrix(new double[1][] { new double[] { 0.0 } }); filter.Q = new Matrix(new double[6][] { new double[] { 0.25, 0.00, 0.00, 0.5, 0.0, 0.0}, 48 new double[] { 0.00, 0.25, 0.00, 0.0, 0.5, 0.0}, new double[] { 0.00, 0.00, 0.25, 0.0, 0.0, 0.5}, new double[] { 0.25, 0.00, 0.00, 1.0, 0.0, 0.0}, new double[] { 0.00, 0.25, 0.00, 0.0, 1.0, 0.0}, new double[] { 0.00, 0.00, 0.25, 0.0, 0.0, 1.0} }).Multiply(4); filter.H = new Matrix(new double[3][] { new double[] { 1.0, 0.0, 0.0, 0.0, 0.0, 0.0}, new double[] { 0.0, 1.0, 0.0, 0.0, 0.0, 0.0}, new double[] { 0.0, 0.0, 1.0, 0.0, 0.0, 0.0}}); // Measured noise. filter.R = Matrix.Identity(3).Multiply(16); filter.P = Matrix.Identity(6).Multiply(16); } private static void advServo_Attach(object sender, AttachEventArgs e) { Debug.WriteLine("AdvancedServo controller attached: {0}", e.Device.SerialNumber); AdvancedServo controller = (AdvancedServo)sender; for (int i = 0; i < controller.servos.Count; i++) { controller.servos[i].VelocityLimit = 500.0; // min = 0, max = 1500 controller.servos[i].Acceleration = 2000.0; // min = 0, max = 4590 Debug.WriteLine("Servo #:{0} attached", i); } } private static double interpolate(double pos, double minServoPos, double maxServoPos) { return (pos - minServoPos) * (180.0 / (maxServoPos - minServoPos)); } private static double interpolate(double pos, double x0, double x1, double y0, double y1) { return y0 + (pos - x0) * ((y1 - y0) / (x1 - x0)); } private double findSideWithTwoSidesAndAngle(double b, double c, double radianAngleA) { return Math.Sqrt((b * b) + (c * c) - (2 * b * c * Math.Cos(radianAngleA))); } private double findSideWithTwoAnglesAndSide(double a, double radianAngleA, double radianAngleB) { return (a * Math.Sin(radianAngleB)) / Math.Sin(radianAngleA); } private double findAngleWithTwoSidesAndAngle(double a, double b, double radianAngleB) { return Math.Asin((Math.Sin(radianAngleB) * a) / b); } private double findAngleWithThreeSides(double a, double b, double c) { 49 return Math.Acos((a * a + b * b - c * c) / (2 * a * b)); } private double degreeToRadian(double angle) { return Math.PI * angle / 180.0; } private double radianToDegree(double radian) { return radian * (180.0 / Math.PI); } private ServoPosition calculateServoPosition(Location loc) { double halfServoDistance = servoDistance / 2.0; ServoPosition pos = new ServoPosition(); // Calcualte Servo 1 Position. 
double s1x = halfServoDistance - loc.X; if (s1x < 0) { pos.Servo1 = 180.0 - radianToDegree(findAngleWithTwoSidesAndAngle( loc.Y, Math.Sqrt(s1x * s1x + loc.Y * loc.Y), (Math.PI / 2))); } else { pos.Servo1 = radianToDegree(findAngleWithTwoSidesAndAngle( loc.Y, Math.Sqrt(s1x * s1x + loc.Y * loc.Y), (Math.PI / 2))); } // Calculate Servo 2 Position. double s2x = halfServoDistance + loc.X; if (s2x > 0) { pos.Servo2 = 180.0 - radianToDegree(findAngleWithTwoSidesAndAngle( loc.Y, Math.Sqrt(s2x * s2x + loc.Y * loc.Y), (Math.PI / 2))); } else { pos.Servo2 = radianToDegree(findAngleWithTwoSidesAndAngle( loc.Y, Math.Sqrt(s2x * s2x + loc.Y * loc.Y), (Math.PI / 2))); } // Calculate Servo 3 Position. double s3z = Math.Abs(loc.Z); if (loc.Z == 0) { pos.Servo3 = 90.0; } else if (loc.Z < 0.0) { pos.Servo3 = 90.0 - radianToDegree(findAngleWithTwoSidesAndAngle( s3z, Math.Sqrt(s3z * s3z + loc.Y * loc.Y), (Math.PI / 2))); 50 } else { pos.Servo3 = 90.0 + radianToDegree(findAngleWithTwoSidesAndAngle( s3z, Math.Sqrt(s3z * s3z + loc.Y * loc.Y), (Math.PI / 2))); } pos.Servo1 = interpolate(pos.Servo1, 0.0, 180.0, 40, 200); pos.Servo2 = interpolate(pos.Servo2, 0.0, 180.0, 40, 200); pos.Servo3 = interpolate(pos.Servo3, 0.0, 180.0, 40, 200); return pos; } private Location calculateLocation(ServoPosition pos) { Location loc = new Location(); loc.Distance = double.NaN; // Degree based angles. double angle1Deg = interpolate( pos.Servo1 + pos.CameraOffset1, minServoPos, maxServoPos); double angle2Deg = 180 - interpolate( pos.Servo2 + pos.CameraOffset2, minServoPos, maxServoPos); if ((angle1Deg + angle2Deg) >= 180.0) { // Cannot triangluate distance. return loc; } double angle3Deg = interpolate( pos.Servo3 + pos.CameraOffset3, minServoPos, maxServoPos); // Radian Based angles. double angle1Rad = degreeToRadian(angle1Deg); // horizontal plane double angle2Rad = degreeToRadian(angle2Deg); // horizontal plane double angle3Rad = degreeToRadian(angle3Deg); // vertical plane // Calculate the third angle. double cAngle = 180 - angle1Deg - angle2Deg; double cRad = degreeToRadian(cAngle); // Converted to radians. // Calculate the distance for the servos in inches. double servo1Dist = findSideWithTwoAnglesAndSide( servoDistance, cRad, angle2Rad); // Calculate the distance from the center of the module. double halfServoDistance = servoDistance / 2; // a^2 = b^2 + c^2 - 2bc cos A double d1 = findSideWithTwoSidesAndAngle( servo1Dist, halfServoDistance, angle1Rad); // Returns the angle opposite servo1Distance side. double d1AngleDeg = Math.Round(radianToDegree( findAngleWithThreeSides(d1, halfServoDistance, servo1Dist)), 3); if (d1AngleDeg == 90.0) { loc.X = 0; loc.Y = d1; } else if (d1AngleDeg > 90.0) { double angle1 = 180.0 - d1AngleDeg; 51 loc.Y = Math.Round(findSideWithTwoAnglesAndSide( d1, degreeToRadian(90.0), degreeToRadian(angle1)), 3); // x is negative. loc.X = Math.Round(findSideWithTwoAnglesAndSide( d1, degreeToRadian(90.0), degreeToRadian(90.0 - angle1)) * -1, 3); } else { loc.Y = Math.Round(findSideWithTwoAnglesAndSide( d1, degreeToRadian(90.0), degreeToRadian(d1AngleDeg)), 2); // x is positive. loc.X = Math.Round(findSideWithTwoAnglesAndSide( d1, degreeToRadian(90.0), degreeToRadian(90.0 - d1AngleDeg)), 3); } // Calculate the z angle. 
if (angle3Deg == 90.0) { loc.Z = 0; loc.Distance = d1; } else if (angle3Deg > 90.0) { double angle1 = 180 - angle3Deg; double yAngleDeg = 90.0 - angle1; loc.Z = Math.Round(findSideWithTwoAnglesAndSide( loc.Y, degreeToRadian(angle1), degreeToRadian(90.0 - angle1)), 3); loc.Distance = Math.Round(findSideWithTwoAnglesAndSide( d1, degreeToRadian(angle1), degreeToRadian(90.0)), 3); } else { loc.Z = Math.Round(findSideWithTwoAnglesAndSide( loc.Y, degreeToRadian(angle3Deg), degreeToRadian(90.0 - angle3Deg)) * -1, 3); loc.Distance = Math.Round(findSideWithTwoAnglesAndSide( d1, degreeToRadian(angle3Deg), degreeToRadian(90.0)), 3); } return loc; } private void mainThread() { done = false; doneEvents[0] = new ManualResetEvent(false); doneEvents[1] = new ManualResetEvent(false); while (running) { if (viewState == VIEW_STATE.TAB_TRACK) { for (int i = 0; i < 2; i++) 52 { ThreadPool.QueueUserWorkItem(new WaitCallback(DoWork), (object)i); } WaitHandle.WaitAll(doneEvents); doneEvents[0].Reset(); doneEvents[1].Reset(); if (v1kuState == V1KU_STATE.TRACK) { V1KUTracker t1 = trackers[cam1]; V1KUTracker t2 = trackers[cam2]; ServoPosition pos = new ServoPosition(); pos.Servo1 = advServo.servos[servo1].Position; pos.Servo2 = advServo.servos[servo2].Position; pos.Servo3 = advServo.servos[servo3].Position; double x1 = 0.0; double x2 = 0.0; double y1 = 0.0; double y2 = 0.0; filterCounter++; filter.Predict(); Location loc = getEstimateLocation(filter.Estimate); Debug.WriteLine("Predicted {0}", loc); writeLocation("predicted.csv", loc); resetROS = false; if (t1.Found) { // Map the dx and dy values in the range of [-1,1]. double dx = ((double)(t1.BF.X - halfCameraWidth)) / (double)halfCameraWidth; double dy = ((double)(t1.BF.Y - halfCameraHeight)) / (double)halfCameraHeight; y1 = dy * -1; // flip the value for the y axis. x1 = dx * servoDeltaX; pos.CameraOffset1 = x1; } if (t2.Found) { // Map the dx and dy values in the range of [-1,1]. double dx = ((double)(t2.BF.X - halfCameraWidth)) / (double)halfCameraWidth; double dy = ((double)(t2.BF.Y - halfCameraHeight)) / (double)halfCameraHeight; y2 = dy * -1; // flip the value for the y axis. x2 = dx * servoDeltaX; pos.CameraOffset2 = x2; } if (t1.Found && t2.Found) { resetROS = true; double dy = (y1 + y2) / 2; double value = dy * servoDeltaY; pos.CameraOffset3 = value; loc = calculateLocation(pos); 53 Debug.WriteLine("Measured {0}", loc); writeLocation("measured.csv", loc); // Take a measurement the kalman filter. measurement.Data[0][0] = loc.X; measurement.Data[1][0] = loc.Y; measurement.Data[2][0] = loc.Z; filter.Observe(measurement); ServoPosition servoPos = calculateServoPosition(loc); this.setMotor(servo1, servoPos.Servo1); this.setMotor(servo2, servoPos.Servo2); this.setMotor(servo3, servoPos.Servo3); } else if (t1.Found) { double value = y1 * servoDeltaY; pos.CameraOffset3 = value; if (loc.Y > 0.0) { ServoPosition servoPos = calculateServoPosition(loc); this.setMotor(servo2, servoPos.Servo2); } this.updateMotor(servo1, x1); this.updateMotor(servo3, value); } else if (t2.Found) { double value = y2 * servoDeltaY; pos.CameraOffset3 = value; if (loc.Y > 0.0) { ServoPosition servoPos = calculateServoPosition(loc); this.setMotor(servo1, servoPos.Servo1); } this.updateMotor(servo2, x2); this.updateMotor(servo3, value); } else { // Use kalman filter to predict location. 
if (loc.Y > 0.0) { ServoPosition servoPos = calculateServoPosition(loc); this.setMotor(servo1, servoPos.Servo1); this.setMotor(servo2, servoPos.Servo2); this.setMotor(servo3, servoPos.Servo3); } } } } else { int camIndex = cam1; ThreadPool.QueueUserWorkItem(new WaitCallback(DoWork), (object)camIndex); 54 doneEvents[camIndex].WaitOne(); doneEvents[camIndex].Reset(); } Application.DoEvents(); fps++; } done = true; Debug.WriteLine("Completed"); } private Location getEstimateLocation(Matrix estimate) { Location loc = new Location(); loc.X = estimate.ValueAt(0, 0); loc.Y = estimate.ValueAt(1, 0); loc.Z = estimate.ValueAt(2, 0); return loc; } private void DoWork(object o) { int index = (int)o; if (!running) { Debug.WriteLine("Not Execuing work {0}, run is false", index); doneEvents[index].Set(); return; } try { V1KUTracker tracker = trackers[index]; lock (tracker) { // Perform operation. switch (v1kuState) { case V1KU_STATE.LEARN: tracker.learn(1); break; case V1KU_STATE.RECOGNIZE: tracker.recognize(); break; case V1KU_STATE.TRACK: tracker.adjustROS(resetROS); tracker.search(); break; case V1KU_STATE.UNLEARN: tracker.learn(0); break; } } } catch (Exception e) { Console.Error.WriteLine( "Error occurred performing tracking operations.", e); 55 } finally { doneEvents[index].Set(); } } // FrmNeuronContent.cs using using using using using using using using System; System.Collections.Generic; System.ComponentModel; System.Data; System.Drawing; System.Linq; System.Text; System.Windows.Forms; namespace Stereoscopic3DTracking { public partial class FrmNeuronContent : Form { // Max Vector Length private const int MAX_VEC_LEN = 256; private byte[] model = new byte[MAX_VEC_LEN]; private int hb, vb; // number of blocks in teh 2D featuer model. private int plotCursor = 0; private V1KUTracker tracker; public FrmNeuronContent(V1KUTracker tracker) { InitializeComponent(); this.tracker = tracker; } private void frmNeuronContent_Activated(object sender, EventArgs e) { txtTotalNeurons.Text = Convert.ToString(tracker.V1KU.CogniMem.NCOUNT); neuronUpDown.Minimum = (tracker.V1KU.CogniMem.NCOUNT != 0) ? 
1 : 0; neuronUpDown.Maximum = tracker.V1KU.CogniMem.NCOUNT; neuronUpDown.Value = 1; showSelectedNeuron(); } private void showSelectedNeuron() { lock (tracker) { tracker.V1KU.CogniMem.NSR = 16; tracker.V1KU.CogniMem.RESETCHAIN = 0; int temp; for (int i = 0; i < (int)neuronUpDown.Value - 1; i++) temp = tracker.V1KU.CogniMem.CAT; 56 txtContext.Text = Convert.ToString(tracker.V1KU.CogniMem.NCR); txtInfluenceField.Text = Convert.ToString(tracker.V1KU.CogniMem.AIF); for (int i = 0; i < MAX_VEC_LEN; i++) { model[i] = (byte)tracker.V1KU.CogniMem.COMP; } txtCategory.Text = Convert.ToString(tracker.V1KU.CogniMem.CAT); tracker.V1KU.CogniMem.NSR = 0; //2d and 1d display hb = tracker.ROI.Width / tracker.V1KU.BWIDTH; vb = tracker.ROI.Height / tracker.V1KU.BHEIGHT; Bitmap bm = null; tracker.V1KU.ReadModel((int)neuronUpDown.Value - 1, out bm, 1); picNeuron.Image = bm; } picPlotNeuron.Refresh(); } private void picPlotNeuron_Paint(object sender, PaintEventArgs e) { Graphics g = e.Graphics; Pen pen = new Pen(Color.Yellow, 2); for (int i = 0; i < MAX_VEC_LEN; i++) { g.DrawLine(pen, i, 92, i, 92 - (model[i] / 3)); } pen = new Pen(Color.Blue, 2); g.DrawLine(pen, plotCursor, 0, plotCursor, 92); } private void picPlotNeuron_MouseMove(object sender, MouseEventArgs e) { plotCursor = e.X; txtComponent.Text = Convert.ToString(plotCursor); txtValue.Text = Convert.ToString(model[plotCursor]); picPlotNeuron.Refresh(); } private void nueronUpDown_ValueChanged(object sender, EventArgs e) { showSelectedNeuron(); } } } // Servo Position.cs using using using using System; System.Collections.Generic; System.Linq; System.Text; 57 namespace Stereoscopic3DTracking.Tracking { class ServoPosition { private double servo1; private double servo2; private double servo3; private double offset1; private double offset2; private double offset3; public double Servo1 { get { return servo1; } set { servo1 = value; } } public double Servo2 { get { return servo2; } set { servo2 = value; } } public double Servo3 { get { return servo3; } set { servo3 = value; } } public double CameraOffset1 { get { return offset1; } set { offset1 = value; } } public double CameraOffset2 { get { return offset2; } set { offset2 = value; } } public double CameraOffset3 { get { return offset3; } set { offset3 = value; } } } } // Location.cs using System; using System.Collections.Generic; using System.Linq; using System.Text; namespace Stereoscopic3DTracking.Tracking { class Location { private double x; 58 private double y; private double z; private double distance; public Location() { x = 0.0; y = 0.0; z = 0.0; distance = 0.0; } public Location(double x, double y, double z) { this.x = x; this.y = y; this.z = z; } public Location(double x, double y, double z, double distance) { this.x = x; this.y = y; this.z = z; this.distance = distance; } public double X { get { return x; } set { x = value; } } public double Y { get { return y; } set { y = value; } } public double Z { get { return z; } set { z = value; } } public double Distance { get { return distance; } set { distance = value; } } public override string ToString() { return "Location: X=" + x + ", Y=" + y + ", Z=" + z + ", Distance=" + distance; } } } // LearnInterface.cs using System; 59 using using using using System.Collections.Generic; System.Linq; System.Text; System.Drawing; namespace Stereoscopic3DTracking { public interface LearnInterface { void doLearn(int category, V1KUTracker tracker); } } // DefaultLearnImpl.cs using using using using using System; System.Collections.Generic; System.Linq; System.Text; 
CogniMemEngine.Cognisight; namespace Stereoscopic3DTracking.Learn { class DefaultLearnImpl : LearnInterface { public void doLearn(int category, V1KUTracker tracker) { /** * Learning only takes place on camera 1 (V1KU) which should be on the left hand side. * Anne's suggestion: * 1. learn ROI (cat=1, then CSR=4) * 2. offset and unlearn (cat=0, then CSR=4 - shrinks neuron) * 3. learn same offset (cat=1, then CSR=4 - adds neuron for offset) * 4. repeat for other offsets * 5. return ROI back to original values */ CogniSight v1ku = tracker.V1KU; v1ku.CSR = 1; v1ku.CATL = category; v1ku.CSR = 4; if (category == 0) return; int learnOffset = 8; // NW (North West) v1ku.ROILEFT = tracker.ROI.X - learnOffset; v1ku.ROITOP = tracker.ROI.Y - learnOffset; v1ku.CATL = 0; v1ku.CSR = 4; //v1ku.CATL = category; v1ku.CSR = 4; // NE (North East) 60 v1ku.ROILEFT = tracker.ROI.X + learnOffset; v1ku.ROITOP = tracker.ROI.Y - learnOffset; v1ku.CATL = 0; v1ku.CSR = 4; //v1ku.CATL = category; v1ku.CSR = 4; // SW (South West) v1ku.ROILEFT = tracker.ROI.X - learnOffset; v1ku.ROITOP = tracker.ROI.Y + learnOffset; v1ku.CATL = 0; v1ku.CSR = 4; //v1ku.CATL = category; v1ku.CSR = 4; // SE (South East) v1ku.ROILEFT = tracker.ROI.X + learnOffset; v1ku.ROITOP = tracker.ROI.Y + learnOffset; v1ku.CATL = 0; v1ku.CSR = 4; //v1ku.CATL = category; v1ku.CSR = 4; // return region of interest to center v1ku.ROILEFT = tracker.ROI.X; v1ku.ROITOP = tracker.ROI.Y; } } } // EngineLearn.cs using using using using System; System.Collections.Generic; System.Linq; System.Text; namespace Stereoscopic3DTracking.Learn { class EngineLearn : LearnInterface { private int _algorithm = 0; public EngineLearn() { _algorithm = 0; } public EngineLearn(int algorithm) { _algorithm = algorithm; } public void doLearn(int category, V1KUTracker tracker) { if (_algorithm == 1) { // Conservative tracker.V1KU.LearnROI(category, true, 2); } else if (_algorithm == 2) 61 { // Moderate. tracker.V1KU.LearnROI(category); } else { // Normal tracker.V1KU.LearnROI(category, true, tracker.V1KU.BWIDTH); } } } } // LearnTransform.cs using using using using using using using using System; System.Collections.Generic; System.Linq; System.Text; System.Diagnostics; System.Drawing; ModelSynthesizer; System.Drawing.Imaging; namespace Stereoscopic3DTracking.Learn { class LearnTransform : LearnInterface { private RotateTransform rotateTransform = new RotateTransform(2, 2); private ContrastTransform contrastTransform = new ContrastTransform(2, 2); private PerspectiveTransform perspectiveTransform = new PerspectiveTransform(2, 2); private Image originalImage = null; private int originalWidth; private int originalHeight; private int saveCount = 0; CogniMemEngine.Cognisight.CogniSight _v1ku = null; public void doLearn(int category, V1KUTracker tracker) { if (category == 0) { tracker.V1KU.LearnROI(0); return; } saveCount = 0; Debug.WriteLine("Starting the Transform Learner"); _v1ku = tracker.V1KU; originalHeight = tracker.ROI.Height; originalWidth = tracker.ROI.Width; try { 62 // Grab an image from the v1ku and crop it. 
_v1ku.GrabImage(); _v1ku.LearnROI(category, true, 2); originalImage = (Image) _v1ku.Bitmap.Clone(); Image img = Crop(tracker.ROI, originalImage); SaveSourceImage(img); DoProcess(category, img, tracker.ROI); } catch (Exception e) { Console.Error.WriteLine(e.ToString()); } finally { _v1ku.GrabImage(); _v1ku = null; } Debug.WriteLine("Ending the Transform Learner"); } private void DoProcess(int category, Image src, Rectangle roi) { // Apply the contrast transformation to the original image. foreach (Image img in contrastTransform.ApplyTransform(src)) { Bitmap b = (Bitmap) originalImage.Clone(); using (Graphics g = Graphics.FromImage(b)) { g.FillRectangle(new SolidBrush(Color.White), roi); int midX = roi.X + (roi.Width / 2); int midY = roi.Y + (roi.Height / 2); int x = midX - (img.Width / 2); int y = midY - (img.Height / 2); g.DrawImageUnscaled(img, x, y); } SaveSourceImage(b); LearnImage(category, b); // Apply the rotate transformation to the contrasted image. foreach (Image rotImage in rotateTransform.ApplyTransform(img)) { Bitmap b2 = (Bitmap)originalImage.Clone(); using (Graphics g = Graphics.FromImage(b2)) { g.FillRectangle(new SolidBrush(Color.White), roi); int midX = roi.X + (roi.Width / 2); int midY = roi.Y + (roi.Height / 2); int x = midX - (rotImage.Width / 2); int y = midY - (rotImage.Height / 2); g.DrawImageUnscaled(rotImage, x, y); } SaveSourceImage(b2); LearnImage(category, b2); } // Apply the perspective transformation to the contrasted image. 63 foreach (Image perspImage in perspectiveTransform.ApplyTransform(img)) { Bitmap b2 = (Bitmap)originalImage.Clone(); using (Graphics g = Graphics.FromImage(b2)) { g.FillRectangle(new SolidBrush(Color.White), roi); int midX = roi.X + (roi.Width / 2); int midY = roi.Y + (roi.Height / 2); int x = midX - (perspImage.Width / 2); int y = midY - (perspImage.Height / 2); g.DrawImageUnscaled(perspImage, x, y); } SaveSourceImage(b2); LearnImage(category, b2); } } // Apply the rotate transformation to the original image. foreach (Image img in rotateTransform.ApplyTransform(src)) { Bitmap b = (Bitmap)originalImage.Clone(); using (Graphics g = Graphics.FromImage(b)) { g.FillRectangle(new SolidBrush(Color.White), roi); int midX = roi.X + (roi.Width / 2); int midY = roi.Y + (roi.Height / 2); int x = midX - (img.Width / 2); int y = midY - (img.Height / 2); g.DrawImageUnscaled(img, x, y); } SaveSourceImage(b); LearnImage(category, b); } // Apply the perspective transformation to the original image. 
foreach (Image img in perspectiveTransform.ApplyTransform(src)) { Bitmap b = (Bitmap)originalImage.Clone(); using (Graphics g = Graphics.FromImage(b)) { g.FillRectangle(new SolidBrush(Color.White), roi); int midX = roi.X + (roi.Width / 2); int midY = roi.Y + (roi.Height / 2); int x = midX - (img.Width / 2); int y = midY - (img.Height / 2); g.DrawImageUnscaled(img, x, y); } SaveSourceImage(b); LearnImage(category, b); } } [Conditional("DEBUG")] private void SaveSourceImage(Image img) { EncoderParameters encoderParameters = new EncoderParameters(1); 64 encoderParameters.Param[0] = new EncoderParameter(System.Drawing.Imaging.Encoder.Quality, 100L); String fileName = "src_" + (saveCount++) + ".jpg"; img.Save(fileName, GetEncoder(ImageFormat.Jpeg), encoderParameters); } public ImageCodecInfo GetEncoder(ImageFormat format) { ImageCodecInfo[] codecs = ImageCodecInfo.GetImageDecoders(); foreach (ImageCodecInfo codec in codecs) { if (codec.FormatID == format.Guid) { return codec; } } return null; } private void LearnImage(int category, Image img) { _v1ku.Bitmap = (Bitmap)img; _v1ku.LearnROI(category, true, _v1ku.BWIDTH); } private Image Crop(Rectangle roi, Image src) { Bitmap bmp = (Bitmap)src; return (Image) bmp.Clone(roi, src.PixelFormat); } } } // RegionTeacher.cs using using using using using System; System.Collections.Generic; System.Linq; System.Text; CogniMemEngine.Cognisight; namespace Stereoscopic3DTracking.Learn { class RegionTeacher : LearnInterface { public void doLearn(int category, V1KUTracker tracker) { CogniSight v1ku = tracker.V1KU; // Learn the ROI location. v1ku.CSR = 1; v1ku.LearnROI(category, true, 2); int xOffset = tracker.ROI.Width / 2; int yOffset = tracker.ROI.Height / 2; for (int x = tracker.ROI.X - xOffset; x < tracker.ROI.X + xOffset; x += 4) 65 { v1ku.ROILEFT = x; for (int y = tracker.ROI.Y - yOffset; y < tracker.ROI.Y + yOffset; y += 4) { v1ku.ROITOP = y; v1ku.LearnROI(category, true, v1ku.BWIDTH); } } // North v1ku.ROITOP = tracker.ROI.Y - tracker.ROI.Height; for (int x = tracker.ROI.X - tracker.ROI.Width; x < tracker.ROI.X + tracker.ROI.Width; x += 2) { v1ku.ROILEFT = x; v1ku.CATL = 0; v1ku.CSR = 4; } // South v1ku.ROITOP = tracker.ROI.Y + tracker.ROI.Height; for (int x = tracker.ROI.X - tracker.ROI.Width; x < tracker.ROI.X + tracker.ROI.Width; x += 2) { v1ku.ROILEFT = x; v1ku.CATL = 0; v1ku.CSR = 4; } // West v1ku.ROILEFT = tracker.ROI.X - tracker.ROI.Width; for (int y = tracker.ROI.Y - tracker.ROI.Height; y < tracker.ROI.Y + tracker.ROI.Height; y += 2) { v1ku.ROITOP = y; v1ku.CATL = 0; v1ku.CSR = 4; } // East v1ku.ROILEFT = tracker.ROI.X + tracker.ROI.Width; for (int y = tracker.ROI.Y - tracker.ROI.Height; y < tracker.ROI.Y + tracker.ROI.Height; y += 2) { v1ku.ROITOP = y; v1ku.CATL = 0; v1ku.CSR = 4; } // return region of interest to center v1ku.ROILEFT = tracker.ROI.X; v1ku.ROITOP = tracker.ROI.Y; } } } // KalmanFilter.cs 66 using using using using System; System.Collections.Generic; System.Linq; System.Text; namespace Filter { public class KalmanFilter { private Matrix _X; // State vector. private Matrix _F, _FTranspose; // State transition. private Matrix _B; // Input gain matrix. private Matrix _H, _HTranspose; // Observation/Measurement matrix. private Matrix _Q; // Estimated process error covariance. private Matrix _R; // Estimated measurement error/noise covariance. private Matrix _P, _PIdentity; // The covariance matrix. private Matrix _U; // Control/Input vector. 
public KalmanFilter() { } public void Predict() { _X = _F.Multiply(_X).Add(_B.Multiply(_U)); _P = _F.Multiply(_P).Multiply(_FTranspose).Add(_Q); } public void Observe(Matrix Z) { Matrix Y = Z.Sub(_H.Multiply(_X)); Matrix S = _H.Multiply(_P).Multiply(_HTranspose).Add(_R); Matrix K = _P.Multiply(_HTranspose).Multiply(S.Inverse()); _X = _X.Add(K.Multiply(Y)); _P = _PIdentity.Sub(K.Multiply(_H)).Multiply(_P); } public Matrix X { get { return _X; } set { _X = value; } } public Matrix F { get { return _F; } set { _F = value; _FTranspose = _F.Transpose(); } } public Matrix B { get { return _B; } 67 set { _B = value; } } public Matrix H { get { return _H; } set { _H = value; _HTranspose = _H.Transpose(); } } public Matrix Q { get { return _Q; } set { _Q = value; } } public Matrix P { get { return _P; } set { _P = value; _PIdentity = Matrix.Identity(_P.Rows); } } public Matrix U { get { return _U; } set { _U = value; } } public Matrix R { get { return _R; } set { _R = value; } } public Matrix Estimate { get { return _H.Multiply(_X); } } } } // Matrix.cs using using using using System; System.Collections.Generic; System.Linq; System.Text; namespace Filter { public class Matrix { 68 private double[][] data; private int rows; private int columns; public Matrix(int row, int column) { // Initialize the matrix A data = new double[row][]; for (int i = 0; i < row; i++) { data[i] = new double[column]; } this.rows = row; this.columns = column; } public Matrix(double[][] B) { rows = B.Length; columns = B[0].Length; data = B; } public int Rows { get { return rows; } } public int Columns { get { return columns; } } public double[][] DataCopy() { double[][] C = new double[rows][]; for (int i = 0; i < rows; i++) { C[i] = new double[columns]; for (int j = 0; j < columns; j++) { C[i][j] = data[i][j]; } } return C; } public double[][] Data { get { return data; } } public double ValueAt(int row, int column) { if (row >= this.rows || row < 0 || column >= this.columns || column < 0) { return double.NaN; } 69 return data[row][column]; } public Matrix Transpose() { double[][] B = new double[columns][]; for (int i = 0; i < columns; i++) { B[i] = new double[rows]; for (int j = 0; j < rows; j++) { B[i][j] = data[j][i]; } } return new Matrix(B); } public static Matrix Identity(int m) { // Initialize the matrix A double[][] B = new double[m][]; for (int i = 0; i < m; i++) { B[i] = new double[m]; for (int j = 0; j < m; j++) { if (i == j) { B[i][j] = 1.0; } else { B[i][j] = 0.0; } } } return new Matrix(B); } public Matrix Multiply(double scalar) { Matrix C = new Matrix(rows, columns); for (int i = 0; i < rows; i++) { for (int j = 0; j < columns; j++) { C.data[i][j] = scalar * data[i][j]; } } return C; } public Matrix Multiply(Matrix B) { // Check to see if the matrix is compatible if (columns != B.Rows) { throw new System.ArgumentException( 70 "The column of this matrix and the row of the other matrix are not compatible"); } Matrix C = new Matrix(rows, B.Columns); for (int i = 0; i < rows; i++) { for (int j = 0; j < B.Columns; j++) { C.data[i][j] = 0.0; for (int k = 0; k < columns; k++) { C.data[i][j] += data[i][k] * B.data[k][j]; } } } return C; } public Matrix Add(Matrix B) { // Check to see if the matrix is compatible. 
if (rows != B.Rows && columns != B.columns) { throw new System.ArgumentException( "Matrix is not compatible for addition"); } Matrix C = new Matrix(rows, columns); for (int i = 0; i < rows; i++) { for (int j = 0; j < columns; j++) { C.data[i][j] = data[i][j] + B.data[i][j]; } } return C; } public Matrix Sub(Matrix B) { // Check to see if the matrix is compatible. if (rows != B.Rows && columns != B.columns) { throw new System.ArgumentException( "Matrix is not compatible for addition"); } Matrix C = new Matrix(rows, columns); for (int i = 0; i < rows; i++) { for (int j = 0; j < columns; j++) { C.data[i][j] = data[i][j] - B.data[i][j]; } 71 } return C; } public Matrix SubMatrix(int row0, int row1, int column0, int column1) { Matrix M = new Matrix(row1 - row0 + 1, column1 - column0 + 1); for (int i = row0; i <= row1; i++) { for (int j = column0; j <= column1; j++) { M.Data[i - row0][j - column0] = data[i][j]; } } return M; } public Matrix SubMatrix(int[] r, int column0, int column1) { Matrix M = new Matrix(r.Length, column1 - column0 + 1); for (int i = 0; i < r.Length; i++) { for (int j = column0; j <= column1; j++) { M.Data[i][j - column0] = data[r[i]][j]; } } return M; } public Matrix Solve(Matrix B) { return (rows == columns) ? (new LUDecomposition(this)).solve(B) : (new QRDecomposition(this)).solve(B); } public Matrix Inverse() { return Solve(Identity(rows)); } public override string ToString() { string value = ""; for (int i = 0; i < rows; i++) { for (int j = 0; j < columns; j++) { value += string.Format("[{0,3}]", data[i][j]); } value += "\n"; } return value; } } } 72 // LUDecomposition.cs using using using using System; System.Collections.Generic; System.Linq; System.Text; namespace Filter { /// <summary> /// LU Decomposition. /// This class is a C# port of the JAMA Java Library. /// JAMA is in the public domain. /// See http://wordhoard.northwestern.edu/userman/thirdparty/jama.html /// </summary> public class LUDecomposition { private double[][] lu; private int m; private int n; private int pivsign; private int[] piv; public LUDecomposition(Matrix A) { lu = A.DataCopy(); m = A.Rows; n = A.Columns; piv = new int[m]; for (int i = 0; i < m; i++) { piv[i] = i; } pivsign = 1; double[] luRowI; double[] luColJ = new double[m]; // Outer loop. for (int j = 0; j < n; j++) { // Make a copy of the j-th column to localize references. for (int i = 0; i < m; i++) { luColJ[i] = lu[i][j]; } // Apply previous transformations. for (int i = 0; i < m; i++) { luRowI = lu[i]; // Most of the time is spent in the following dot product. int kmax = Math.Min(i, j); 73 double s = 0.0; for (int k = 0; k < kmax; k++) { s += luRowI[k] * luColJ[k]; } luRowI[j] = luColJ[i] -= s; } // Find pivot and exchange if necessary. int p = j; for (int i = j + 1; i < m; i++) { if (Math.Abs(luColJ[i]) > Math.Abs(luColJ[p])) { p = i; } } if (p != j) { for (int k = 0; k < n; k++) { double t = lu[p][k]; lu[p][k] = lu[j][k]; lu[j][k] = t; } int pivp = piv[p]; piv[p] = piv[j]; piv[j] = pivp; pivsign = -pivsign; } // Compute multipliers. 
if (j < m & lu[j][j] != 0.0) { for (int i = j + 1; i < m; i++) { lu[i][j] /= lu[j][j]; } } } } public bool IsNonSingular() { for (int j = 0; j < n; j++) { if (lu[j][j] == 0) return false; } return true; } public Matrix L { get { Matrix X = new Matrix(m, n); double[][] L = X.Data; for (int i = 0; i < m; i++) 74 { for (int j = 0; j { if (i > j) { L[i][j] = } else if (i == { L[i][j] = } else { L[i][j] = } } < n; j++) lu[i][j]; j) 1.0; 0.0; } return X; } } public Matrix U { get { Matrix X = new Matrix(n, n); double[][] U = X.Data; for (int i = 0; i < n; i++) { for (int j = 0; j < n; j++) { if (i <= j) { U[i][j] = lu[i][j]; } else { U[i][j] = 0.0; } } } return X; } } public int[] Pivot { get { int[] p = new int[m]; for (int i = 0; i < m; i++) { p[i] = piv[i]; } return p; 75 } } public double[] DoublePivot { get { double[] vals = new double[m]; for (int i = 0; i < m; i++) { vals[i] = (double)piv[i]; } return vals; } } public double Determinant { get { if (m != n) { throw new System.ArgumentException("Matrix must be square."); } double d = (double)pivsign; for (int j = 0; j < n; j++) { d *= lu[j][j]; } return d; } } public Matrix solve(Matrix B) { if (B.Rows != m) { throw new System.ArgumentException("Matrix row dimensions must agree."); } if (!this.IsNonSingular()) { throw new Exception("Matrix is singular."); } // Copy right hand side with pivoting int nx = B.Columns; Matrix Xmat = B.SubMatrix(piv, 0, nx - 1); double[][] X = Xmat.Data; // Solve L*Y = B(piv,:) for (int k = 0; k < n; k++) { for (int i = k + 1; i < n; i++) 76 { for (int j = 0; j < nx; j++) { X[i][j] -= X[k][j] * lu[i][k]; } } } // Solve U*X = Y; for (int k = n - 1; k >= 0; k--) { for (int j = 0; j < nx; j++) { X[k][j] /= lu[k][k]; } for (int i = 0; i < k; i++) { for (int j = 0; j < nx; j++) { X[i][j] -= X[k][j] * lu[i][k]; } } } return Xmat; } } } // QRDecomposition.cs using using using using System; System.Collections.Generic; System.Linq; System.Text; namespace Filter { /// <summary> /// QR Decomposition. /// This class is a C# port of the JAMA Java Library. /// JAMA is in the public domain. /// See http://wordhoard.northwestern.edu/userman/thirdparty/jama.html /// </summary> public class QRDecomposition { private double[][] qr; private int m; private int n; private double[] rdiag; public QRDecomposition(Matrix A) { qr = A.DataCopy(); m = A.Rows; n = A.Columns; 77 rdiag = new double[n]; // Main loop. for (int k = 0; k < n; k++) { // Compute 2-norm of k-th column without under/overflow. double nrm = 0; for (int i = k; i < m; i++) { nrm = hypot(nrm, qr[i][k]); } if (nrm != 0.0) { // Form k-th Householder vector. if (qr[k][k] < 0) { nrm = -nrm; } for (int i = k; i < m; i++) { qr[i][k] /= nrm; } qr[k][k] += 1.0; // Apply transformation for (int j = k + 1; j < { double s = 0.0; for (int i = k; i < { s += qr[i][k] * } s = -s / qr[k][k]; for (int i = k; i < { qr[i][j] += s * } } to remaining columns. 
n; j++) m; i++) qr[i][j]; m; i++) qr[i][k]; } rdiag[k] = -nrm; } } public static double hypot(double a, double b) { double r; if (Math.Abs(a) > Math.Abs(b)) { r = b / a; r = Math.Abs(a) * Math.Sqrt(1 + r * r); } else if (b != 0) { r = a / b; r = Math.Abs(b) * Math.Sqrt(1 + r * r); 78 } else { r = 0.0; } return r; } public bool IsFullRank() { for (int j = 0; j < n; j++) { if (rdiag[j] == 0) return false; } return true; } public Matrix getH() { Matrix X = new Matrix(m, n); double[][] H = X.Data; for (int i = 0; i < m; i++) { for (int j = 0; j < n; j++) { if (i >= j) { H[i][j] = qr[i][j]; } else { H[i][j] = 0.0; } } } return X; } public Matrix getR() { Matrix X = new Matrix(n, n); double[][] R = X.Data; for (int i = 0; i < n; i++) { for (int j = 0; j < n; j++) { if (i < j) { R[i][j] = qr[i][j]; } else if (i == j) { R[i][j] = rdiag[i]; } else { 79 R[i][j] = 0.0; } } } return X; } public Matrix getQ() { Matrix X = new Matrix(m, n); double[][] Q = X.Data; for (int k = n - 1; k >= 0; k--) { for (int i = 0; i < m; i++) { Q[i][k] = 0.0; } Q[k][k] = 1.0; for (int j = k; j < n; j++) { if (qr[k][k] != 0) { double s = 0.0; for (int i = k; i < m; i++) { s += qr[i][k] * Q[i][j]; } s = -s / qr[k][k]; for (int i = k; i < m; i++) { Q[i][j] += s * qr[i][k]; } } } } return X; } public Matrix solve(Matrix B) { if (B.Rows != m) { throw new System.ArgumentException( "Matrix row dimensions must agree."); } if (!this.IsFullRank()) { throw new Exception("Matrix is rank deficient."); } // Copy right hand side int nx = B.Columns; double[][] X = B.DataCopy(); // Compute Y = transpose(Q)*B for (int k = 0; k < n; k++) { for (int j = 0; j < nx; j++) 80 { double s = 0.0; for (int i = k; i < m; i++) { s += qr[i][k] * X[i][j]; } s = -s / qr[k][k]; for (int i = k; i < m; i++) { X[i][j] += s * qr[i][k]; } } } // Solve R*X = Y; for (int k = n - 1; k >= 0; k--) { for (int j = 0; j < nx; j++) { X[k][j] /= rdiag[k]; } for (int i = 0; i < k; i++) { for (int j = 0; j < nx; j++) { X[i][j] -= X[k][j] * qr[i][k]; } } } return (new Matrix(B.Data).SubMatrix(0, n - 1, 0, nx - 1)); } } } // V1KUTracker.cs using using using using using System.Drawing; Stereoscopic3DTracking.Learn; System; CogniMemEngine.Cognisight; System.Diagnostics; namespace Stereoscopic3DTracking { public enum ROIStatuses { UNKNOWN, UNCERTAINTY, RECOGNIZED } public class V1KUTracker { private const int MAX_STEP = 16; private const int MIN_STEP = 1; private CogniSight v1ku; private Rectangle ros; 81 private private private private private private private private private private private Rectangle roi; int confidence; int distance; int roiCategory; ROIStatuses roiStatus; LearnInterface teacher; int windowWidth; int windowHeight; bool found; // True if anything was found in the last search. Rectangle rf; // The region found in the last search. Point bf; // The point of the best found in the last search. 
public V1KUTracker() { v1ku = new CogniSight(CogniMemEngine.Platforms.V1KU_board); teacher = new DefaultLearnImpl(); windowHeight = 240; windowWidth = 376; reset(); } public V1KUTracker(int deviceId) { v1ku = new CogniSight(CogniMemEngine.Platforms.V1KU_board, deviceId); teacher = new DefaultLearnImpl(); windowHeight = 240; windowWidth = 376; reset(); } public CogniSight V1KU { get { return v1ku; } } public Rectangle ROS { get { return ros; } } public Rectangle ROI { get { return roi; } } public Rectangle RF { get { return rf; } } public bool Found { get { return found; } } public int Distance { get { return distance; } } 82 public Point BF { get { return bf; } } public int Confidence { get { return confidence; } } public int Category { get { return roiCategory; } } public ROIStatuses ROIStatus { get { return roiStatus; } } public LearnInterface Teacher { get { return teacher; } set { this.teacher = value; } } public int WindowWidth { get { return windowWidth; } set { this.windowWidth = value; } } public int WindowHeight { get { return windowHeight; } set { this.windowHeight = value; } } public void reset() { roi = new Rectangle(); ros = new Rectangle(); rf = new Rectangle(); bf = new Point(); v1ku.CogniMem.FORGET = 0; confidence = 0; roiCategory = 0; roiStatus = 0; roiStatus = ROIStatuses.UNKNOWN; found = false; } public void learn(int category) { teacher.doLearn(category, this); } public void search() { if (v1ku.CogniMem.NCOUNT == 0) return; v1ku.CSR = 1; v1ku.SearchROS(); Point xPoint = new Point(int.MaxValue, int.MinValue); 83 Point yPoint = new Point(int.MaxValue, int.MinValue); distance = int.MaxValue; found = false; confidence = 0; if (v1ku.VObjects.Count > 0) { foreach (CogniMemEngine.Cognisight.CogniSight.VOBJECT vobject in v1ku.VObjects) // for each hit, check distance { // Only accept values that have > 50% confidence int locConfidence = 100 - (vobject.Distance / 100); if (locConfidence < 0) locConfidence = 0; if (locConfidence < 50) continue; int x = vobject.X; // always grab X first int y = vobject.Y; // when grabbed, they are at center int tempDist = vobject.Distance; // Store the min and max values for X. xPoint.X = (x < xPoint.X) ? x : xPoint.X; xPoint.Y = (x > xPoint.Y) ? x : xPoint.Y; // Store the min and max values for Y. yPoint.X = (y < yPoint.X) ? y : yPoint.X; yPoint.Y = (y > yPoint.Y) ? y : yPoint.Y; // Store the best point hit point. if (tempDist < distance) { bf.X = x; bf.Y = y; distance = tempDist; } } rf.X = xPoint.X; rf.Width = xPoint.Y - xPoint.X; rf.Y = yPoint.X; rf.Height = yPoint.Y - yPoint.X; // Calculate confidence confidence = 100 - (distance / 100); if (confidence < 0) confidence = 0; if (confidence > 50) found = true; } } public void adjustROS(bool reset) { if (reset) { moveROS(windowWidth / 4, windowHeight / 4, windowWidth / 2, windowHeight / 2); } else { moveROS(0, 0, windowWidth, windowHeight); } } public void movedWindowX(int value) 84 { bf.X += value; rf.X += value; } public void movedWindowY(int value) { bf.Y += value; rf.Y += value; } public void recognize() { if (v1ku.CogniMem.NCOUNT == 0) return; v1ku.CSR = 1; v1ku.CSR = 2; // Get recognition information. int roidist = v1ku.ROIDIST; int roistate = v1ku.CogniMem.NSR; roiCategory = v1ku.ROICAT; // Calculate confidence confidence = 100 - (roidist / 100); if (confidence < 0) confidence = 0; // Calculate status. 
switch (roistate) { case 0: roiStatus = ROIStatuses.UNKNOWN; break; case 4: roiStatus = ROIStatuses.UNCERTAINTY; break; case 8: roiStatus = ROIStatuses.RECOGNIZED; break; default: roiStatus = ROIStatuses.UNKNOWN; break; } } public void moveROI(int x, int y, int width, int height) { // The the checks makes sure that in valid bounds are ignored. int newX = (x <= 0) ? 0 : x; int newY = (y <= 0) ? 0 : y; int newWidth = (newX + width >= windowWidth) ? (windowWidth - newX) : width; int newHeight = (newY + height >= windowHeight) ? (windowHeight - newY) : height; if (newWidth <= 0 || newHeight <= 0) { return; } roi.X = newX; roi.Y = newY; roi.Width = newWidth; roi.Height = newHeight; 85 // set the region of search on the V1KU v1ku.ROILEFT = roi.X; v1ku.ROITOP = roi.Y; v1ku.ROIWIDTH = roi.Width; v1ku.ROIHEIGHT = roi.Height; int value = (int)Math.Ceiling(Math.Sqrt( Convert.ToDouble(roi.Height) * Convert.ToDouble(roi.Width)) / 16.0d); v1ku.BHEIGHT = value; v1ku.BWIDTH = value; } public void moveROS(int x, int y, int width, int height) { // The the checks makes sure that invalid bounds are ignored. ros.X = (x <= 0) ? 1 : x; ros.Y = (y <= 0) ? 1 : y; ros.Width = (ros.X + width >= windowWidth) ? (windowWidth - ros.X) - 1 : width; ros.Height = (ros.Y + height >= windowHeight) ? (windowHeight - ros.Y) - 1 : height; // set the region of search on the V1KU v1ku.ROSLEFT = ros.X; v1ku.ROSTOP = ros.Y; v1ku.ROSWIDTH = ros.Width; v1ku.ROSHEIGHT = ros.Height; } } } 86 APPENDIX B User Guide 1. Installation Prerequisites: First, you will need to obtain the USB Drivers for the Cognimem V1KU. The drivers can be obtained from the cognimem website (http://www.cognimem.com/v1ku/index.html). There is a link on supplied page to download the Drivers for Windows and Linux. Inside of the downloaded archive are instructions on how to install the drivers. Note: The cognimem drivers are unsigned, so on Windows 7 and Vista you will need to boot Windows with driver signature enforcement disabled. This is accomplished by pressing the F8 key before the windows logo is displayed on boot. This will present you with boot options, select the one that disables driver signature enforcement. Next, you need to obtain the USB Drivers for the Phidgets servo controller. The drivers can be obtained from the Phidgets website (http://www.phidgets.com/drivers.php). Choose the driver appropriate for your operating system. Finally, you will need the .NET 4.0 Framework. This can be obtained through Microsoft’s website (http://www.microsoft.com/download/en/details.aspx?id=17851) or through automatic updates. If you are compiling the source, you will need Microsoft’s Visual Studio 2010. From Source: In the root source folder there is a file called Stereoscopic3DTracking.sln, either double clicking or selecting that file from within Visual Studio will load the project. Once the project is loaded you will have the option to “Build Solution” under the build menu. Once you have successfully 87 built the project you can run it within Visual Studio by clicking the play button on the default toolbar. From executable: The executable includes the required DLL libraries, so if you have the executable simply double clicking it will launch the application. 2. Settings Figure 21: Settings Tab It is usually not necessary to adjust the settings screen, these options are only available so slight changes to the hardware can be adjusted without code changes. Setting some values can causes issues with the hardware. 
88 Presets: These load the settings with preset values; it recommended using the presets. Binning: Adjusts the binning mode of the camera from full resolution and half resolution. When in half resolution the camera width and height need to be set to 376 and 240 respectively. Camera Width: Adjust the width of the view window. It will be automatically centered within the full resolution of the camera. Camera Height: Adjust the height of the view window. It will be automatically centered within the full resolution of the camera. Servo Change in X: This value represents the maximum change in degrees needed to center the camera on a pixel at the left and right edges of the camera view. Servo Change in Y: This value represents the maximum change in degrees need to center the camera on a pixel at the top and bottom edges of the camera view. Minimum Servo Position: This is the minimum value that the servo control will set the servo to. Maximum Servo Position: This is the maximum value that the servo control will set the servo to. Servo Distance: This is the measured distance between each camera. The measurement is taken at the center of the camera lens. The default value is 12 inches. The unit of measurement used here is the unit of measurement in the tracking output. By default, the unit of measurement is inches. Apply Settings: Any changes made will not take effect until the apply settings button is pressed. It is recommended to make any setting changes first before performing any learning or tracking operations. 89 3. Learning Figure 22: Learning Tab The Learning tab provides all of the functions required to teach the V1KU modules the target object. The operations are broken up into three groups, the functions on the right of the view, the Learning Functions, and the save, load and copy functions. The picture displayed is from the selected camera. A typical learning scenario would be to first adjust the ROI size by clicking the +/buttons on the right. Next, the ROI is positioned over the object to be learned by either clicking the camera picture or dragging the ROI to the needed location on the camera picture. Then select the Learning Algorithm you would like to use. Next, click either the Learn or click and hold 90 down the Continuous Learn buttons. This will cause the V1KU to learn the object. You can then test how well the object was learned by clicking the Find button. This will show a dot colored from blue to red at each point it thinks the learned object exists. Blue represents areas of low confidence and red represents areas of high confidence. If there are any parts of the background that are being incorrectly identified you can move the ROI to the area and use the Unlearn or Continuous Unlearn buttons to teach the V1KU that this area is part of the background. When clicking the Find button the search time is also displayed. This value should be 150ms or less to properly track an object. Adjusting the Search stepping on the left will improve the search speed. After training the first camera you can either copy that information to the other camera or save the knowledge so it can be loaded during another session. To train the other camera press the swap camera button and you are now able to perform the same operations to teach the other camera. 
Functional description of the controls:

Reset V1KUs: Sets both V1KUs back to the default settings and clears all neurons (0 neurons).

View Neurons: Allows the user to see the neuron information stored on the current V1KU.

Shutter: Allows the user to adjust the shutter value on the camera (when Automatic Gain Control is disabled).

Gain: Allows the user to adjust the gain value on the camera (when Automatic Gain Control is disabled).

Automatic Gain Control: Turns the camera's Automatic Gain Control on and off.

Swap Camera: Changes the currently active V1KU module.

+ / -: Increases or decreases the size of the Region of Interest (ROI).

Learning Algorithm: Allows the user to select the learning algorithm.

Search Stepping: Allows the user to adjust the Region of Search (ROS) stepping value.

Continuous Learn: Uses the selected learning algorithm and learns the object at the ROI continuously for as long as the button is held down.

Learn: Uses the selected learning algorithm and learns the object at the ROI.

Continuous Unlearn: Uses the selected learning algorithm and learns the background at the ROI continuously for as long as the button is held down.

Unlearn: Uses the selected learning algorithm and learns the background at the ROI.

Find: Performs a search and displays a dot at any location where the learned object is found. The dot will be a shade between blue and red depending on the confidence; the higher the confidence, the redder the dot.

Save Knowledge: Saves the state of the V1KU module to a file.

Load Knowledge: Loads a saved state into the V1KU module.

Copy: Copies the state of the selected V1KU into the other V1KU module.

4. Tracking

Figure 23: Track Tab

The Tracking tab contains all of the functions needed to track an object. You will need to have trained each of the V1KU modules before you will be able to track; the neuron count is shown under each camera, and if the value is 0 then that V1KU is not trained. Before tracking, confirm that camera 1 is on the left-hand side. If it is not, go back to the Learning tab and click the Swap Camera button.

Each of the servos can be manually adjusted on this tab by dragging its track bar. Do not adjust the servos manually while tracking is enabled. The live video feed checkbox turns the updating of the camera views on and off; when it is turned off, tracking speeds up since the current camera image does not have to be retrieved from each V1KU. To start tracking, press the Enable Tracking button. The application will begin searching for the learned object and will give X, Y, Z and distance values based on the current positions of the servos. The unit of these values is the unit used for the servo distance; by default this is inches, so the X, Y, Z and distance values are also in inches. When you enable tracking, the Kalman filter is also reset to its initial state.
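The X, Y, Z and distance values follow from the geometry of the two cameras. One standard way to recover them from the two pan angles and the camera baseline is simple triangulation; the sketch below illustrates that geometry under simplifying assumptions (pan angles measured from the forward direction, a single shared tilt angle, baseline equal to the Servo Distance setting). It is an illustration of the idea, not the application's exact computation.

    // Illustrative triangulation only (not the application's exact math).
    // Both cameras sit on the X axis, a baseline apart; pan angles are measured
    // in degrees from the forward (Z) direction, positive toward +X.
    using System;

    static class StereoGeometry
    {
        public static void Triangulate(
            double leftPanDeg, double rightPanDeg, double tiltDeg, double baseline,
            out double x, out double y, out double z, out double distance)
        {
            double tl = Math.Tan(leftPanDeg * Math.PI / 180.0);
            double tr = Math.Tan(rightPanDeg * Math.PI / 180.0);

            // The two camera rays cross where their X coordinates agree
            // (assumes the cameras converge toward the target, i.e. tl > tr).
            z = baseline / (tl - tr);            // depth in front of the cameras
            x = -baseline / 2.0 + z * tl;        // lateral offset from the midpoint between the cameras
            y = z * Math.Tan(tiltDeg * Math.PI / 180.0);
            distance = Math.Sqrt(x * x + y * y + z * z);
        }
    }

With the default 12-inch baseline, pan angles of +10 and -10 degrees, and no tilt, this gives a depth of 12 / (2 * tan 10 degrees), roughly 34 inches.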
APPENDIX C

Kalman Filter Test Results

Table 4: Complete Kalman Filter Test Results

[Table 4 lists, for each time step T from 2 to 166, the Kalman filter's predicted target position and distance (columns PX, PY, PZ, PD) alongside the corresponding measured values where a measurement was available (columns MX, MY, MZ, MD), in the tracking output's units (inches by default).]
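For context, the predicted (P) columns in Table 4 are the kind of output a constant-velocity Kalman filter produces between measurement updates, while the measured (M) columns correspond to the stereo measurements. The sketch below shows a generic one-dimensional form of that predict/update cycle; it is an illustration of the technique only, not the project's actual filter implementation.

    // Generic 1-D constant-velocity Kalman filter: predict position and velocity,
    // then correct with a position measurement. Illustration only.
    using System;

    class ConstantVelocityKalman1D
    {
        // State: x = position, v = velocity. P is the symmetric 2x2 state covariance.
        double x, v;
        double p00 = 1, p01 = 0, p10 = 0, p11 = 1;
        readonly double q;  // process noise strength (simplified: added to the diagonal)
        readonly double r;  // measurement noise variance

        public ConstantVelocityKalman1D(double initialPosition, double processNoise, double measurementNoise)
        {
            x = initialPosition;
            v = 0;
            q = processNoise;
            r = measurementNoise;
        }

        // Predict the state dt time units ahead (used when the target is obstructed).
        public double Predict(double dt)
        {
            x += v * dt;
            // P = F P F^T + Q, with F = [[1, dt], [0, 1]]
            double n00 = p00 + dt * (p01 + p10) + dt * dt * p11 + q;
            double n01 = p01 + dt * p11;
            double n10 = p10 + dt * p11;
            double n11 = p11 + q;
            p00 = n00; p01 = n01; p10 = n10; p11 = n11;
            return x;
        }

        // Fold in a new position measurement z.
        public void Update(double z)
        {
            double s = p00 + r;     // innovation covariance (H = [1, 0])
            double k0 = p00 / s;    // Kalman gain for position
            double k1 = p10 / s;    // Kalman gain for velocity
            double y = z - x;       // innovation
            x += k0 * y;
            v += k1 * y;
            // P = (I - K H) P
            double n00 = (1 - k0) * p00;
            double n01 = (1 - k0) * p01;
            double n10 = p10 - k1 * p00;
            double n11 = p11 - k1 * p01;
            p00 = n00; p01 = n01; p10 = n10; p11 = n11;
        }
    }

In a tracker of this kind, a filter of this form (one per axis, or an equivalent joint three-dimensional state) can keep supplying predicted positions for the short periods when the target is obstructed or not recognized, as described in the tracking section of the user guide.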