
Fast 3D Object Recognition
In Real-World Environments
Ken Lee, CEO
May 29, 2014
Copyright © 2014 VanGogh Imaging
Company Background
• Founded in 2007
• Located in McLean, VA
• Mission: “Provide Real-time 3D computer vision technology for
embedded and mobile applications”
• Product: ‘Starry Night’ 3D-CV Middleware
• Operating System: Android and Linux
• 3D Sensor: PrimeSense & Kinect & SoftKinetic
• Processors: ARM & Xilinx Zynq
• Applications
• 3D Printing, Parts Inspection, Robotics
• Security, Automotive, Augmented Reality
• Medical, Gaming
Starry Night 3D Middleware
The ‘Starry Night’ Middleware (Unity Plugin)
• Busy real-world environment
• Real-time processing
• Tolerant to noise from low-cost scanners
• Efficient
• Fully automated
• Mobile or portable embedded platform (ARM & Xilinx Zynq FPGA)
• Released on the Avnet Embedded Software Store: June 2014
Starry Night Video:
https://www.youtube.com/watch?v=Ro1mv007MHo&feature=youtu.be
The ‘Starry Night’ Middleware Blocks
The ‘Starry Night’ Shape-Based Registration
• Reliable — The output is always a fully-formed 3D model with known
feature points despite noisy or partial scans
• Easy to use — Fully automated process
• Powerful — Known data structure for easy analysis and measurement
• Fast — Single step process (Not iterative)
Input Scan (Partial) + Reference Model = Full 3D Model
Object Recognition Algorithm
Challenges — Scene
• Busy scene, object orientation, and occlusion
Challenges — Platform
• Mobile and Embedded Devices
• ARM — A9 or A15, <1 GB RAM
• Existing libraries were built for laptop/desktop platforms
• GPU processing is not always available
• Therefore, we need a very efficient algorithm
Previous Approaches
• Texture-based methods
  • Color-based → depends heavily on lighting and the color of the object
  • Machine learning → robust, but requires training for each object
  • Neither method provides a transform (i.e., orientation)
• 3D methods
  • Hough transform → slow
  • Geometric hashing → even slower
  • Tensor matching → not good for noisy, sparse scenes
  • Correspondence-based methods using rigid geometric descriptors → the models must have distinctive feature points, which is not true for many models (e.g., a cylinder)
General Concept
[Diagram: the reference object's descriptor (point-pair distance and normals) is compared against the distance and normals of randomly sampled point pairs from the scene; candidates that pass the match criteria are fine-tuned to recover the object's orientation, location, and transform.]
Block Diagram — Example for One Model
Model Descriptor (Pre-processed)
• Sample all point pairs in the model that are separated by the same distance D
• Use the surface normals of the pair to group them into a hash table
Note: In the bear example, D = 5 cm, which resulted in 1000 pairs.
Note: The keys are angles derived from the normals of the points:
  alpha (α) = angle from the first point's normal to the second point
  beta (β) = angle from the second point's normal to the first point
  omega (Ω) = angle of the plane between the two points
Key           Point pairs
(α1, β1, Ω1)  P1,P2   P3,P4
(α2, β2, Ω2)  P5,P6   P7,P8
(α3, β3, Ω3)  P9,P10  P11,P12  P13,P14
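The pre-processing step above lends itself to a short code illustration. Below is a minimal C++ sketch, not the actual Starry Night implementation: the angle definitions follow the note above (with Ω approximated as the angle between the two normals), and all names (Vec3, makeKey, buildModelTable), the tolerance parameter, and the 16-bin quantization are assumptions for illustration only.

// Minimal sketch (not the actual Starry Night code): build a hash table of
// model point pairs separated by roughly distance D, keyed on angles derived
// from the pair's surface normals.
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <unordered_map>
#include <vector>

struct Vec3 { float x, y, z; };

static Vec3  sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static float norm(Vec3 a)        { return std::sqrt(dot(a, a)); }

// Angle between two vectors, in radians.
static float angleBetween(Vec3 a, Vec3 b) {
    float c = dot(a, b) / (norm(a) * norm(b));
    return std::acos(std::fmax(-1.0f, std::fmin(1.0f, c)));
}

// Quantize the three angles (alpha, beta, omega) into one integer key so that
// similarly oriented pairs land in the same hash bucket.
static uint32_t makeKey(float alpha, float beta, float omega, int bins = 16) {
    const float step = 3.14159265f / bins;
    auto q = [&](float a) { return std::min(bins - 1, (int)(a / step)); };
    return (uint32_t)((q(alpha) * bins + q(beta)) * bins + q(omega));
}

struct PointPair { int i, j; };  // indices into the model point cloud

// Pre-processing: hash every model point pair separated by ~D.
std::unordered_map<uint32_t, std::vector<PointPair>>
buildModelTable(const std::vector<Vec3>& pts, const std::vector<Vec3>& normals,
                float D, float tol) {
    std::unordered_map<uint32_t, std::vector<PointPair>> table;
    for (int i = 0; i < (int)pts.size(); ++i) {
        for (int j = i + 1; j < (int)pts.size(); ++j) {
            Vec3 d = sub(pts[j], pts[i]);
            if (std::fabs(norm(d) - D) > tol) continue;  // keep only pairs at distance ~D
            float alpha = angleBetween(normals[i], d);                    // first normal to second point
            float beta  = angleBetween(normals[j], sub(pts[i], pts[j]));  // second normal to first point
            float omega = angleBetween(normals[i], normals[j]);           // relative angle of the normals (approximation of Ω)
            table[makeKey(alpha, beta, omega)].push_back({i, j});
        }
    }
    return table;
}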
Object Recognition of the Model (Real-time)
• Grab the scene
• Sample a point pair separated by distance D using RANSAC
• Generate a key using the same hash function as the model descriptor
• Use the key to retrieve similarly oriented point pairs in the model and a rough transform
• Apply the match criteria to find the best match
• Use ICP to refine the transform
Note: The example scene has around 16K points.
Note: We iterated this sampling process 100 times.
Note: The entire process can be easily parallelized.
Very important: Multiple models can be found using a single hash table for each sampled point pair in the scene.
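A hedged C++ sketch of this run-time loop follows, continuing the hypothetical helpers (Vec3, PointPair, makeKey, buildModelTable) from the model-descriptor sketch above. The vote-count scoring stands in for the real match criteria, and the rough-transform estimation and ICP refinement are only indicated in comments; this illustrates the flow, not the shipped middleware.

// Illustrative sketch of the run-time recognition loop (reuses the helpers
// from the model-descriptor sketch). The bucket size stands in for the real
// match criteria; a full implementation would also estimate the rigid
// transform from each pair correspondence, verify it against the scene, and
// refine the winner with ICP.
#include <random>

struct MatchResult {
    PointPair scenePair{-1, -1};  // sampled scene pair that matched best
    PointPair modelPair{-1, -1};  // a corresponding model pair
    int votes = 0;                // crude match score
};

MatchResult recognize(const std::vector<Vec3>& scenePts,
                      const std::vector<Vec3>& sceneNormals,
                      const std::unordered_map<uint32_t, std::vector<PointPair>>& modelTable,
                      float D, float tol, int iterations = 100) {
    std::mt19937 rng(std::random_device{}());
    std::uniform_int_distribution<int> pick(0, (int)scenePts.size() - 1);
    MatchResult best;

    for (int it = 0; it < iterations; ++it) {
        // RANSAC-style sampling: draw two scene points separated by ~D.
        int i = pick(rng), j = pick(rng);
        if (i == j) continue;
        Vec3 d = sub(scenePts[j], scenePts[i]);
        if (std::fabs(norm(d) - D) > tol) continue;

        // Generate the key with the same hash function used for the model.
        float alpha = angleBetween(sceneNormals[i], d);
        float beta  = angleBetween(sceneNormals[j], sub(scenePts[i], scenePts[j]));
        float omega = angleBetween(sceneNormals[i], sceneNormals[j]);
        auto bucket = modelTable.find(makeKey(alpha, beta, omega));
        if (bucket == modelTable.end()) continue;

        // Every model pair in the bucket is a candidate correspondence.
        int votes = (int)bucket->second.size();
        if (votes > best.votes) {
            best.scenePair = {i, j};
            best.modelPair = bucket->second.front();
            best.votes = votes;
        }
    }
    // best.scenePair / best.modelPair would seed the rough transform,
    // which ICP then refines into the final pose.
    return best;
}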
Implementation
• Result
Object Recognition Video:
https://www.youtube.com/watch?v=h7whfei0fTw&feature=youtu.be
Performance
Reliability (w/ bear model)
• Reliability
• % False positives — depends on the scene
• Clean scene — <1%
• Noisy scene — 15%
• % false negatives (cannot find the object)
• Clean scene — <1%
• Noisy scene — 25% (also takes longer)
• Effect of orientation on success ratio
• Model facing front — > 99%
• Model facing backward — > 99%
• Model facing sideways — 65%
[Image: example of a false positive]
Performance — Mobile
• Performance on a 2 GHz ARM Cortex-A15 (Android mobile)
• Time to find one object
  • Single-thread — 4 seconds
  • Multi-thread & NEON — 1 second
• Time to find two objects
  • Single-thread — 5.2 seconds
  • Multi-thread & NEON — 1.4 seconds
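The gap between the single-thread and multi-thread numbers comes largely from the fact that the sampling iterations are independent (as noted earlier, the whole process parallelizes easily). Below is a minimal sketch of that idea using std::thread and the hypothetical recognize() routine from the earlier sketch; the NEON-intrinsic optimizations that the measured numbers also rely on are not shown.

// Illustrative only: split the independent sampling iterations across
// hardware threads and keep the best-scoring candidate. recognize() is the
// hypothetical routine from the earlier sketch; NEON optimizations are not
// shown.
#include <algorithm>
#include <thread>

MatchResult recognizeParallel(const std::vector<Vec3>& scenePts,
                              const std::vector<Vec3>& sceneNormals,
                              const std::unordered_map<uint32_t, std::vector<PointPair>>& modelTable,
                              float D, float tol, int totalIterations = 100) {
    unsigned n = std::max(1u, std::thread::hardware_concurrency());
    std::vector<MatchResult> partial(n);
    std::vector<std::thread> workers;

    for (unsigned t = 0; t < n; ++t) {
        workers.emplace_back([&, t] {
            // Each worker runs its own share of the sampling iterations.
            partial[t] = recognize(scenePts, sceneNormals, modelTable,
                                   D, tol, (int)(totalIterations / n) + 1);
        });
    }
    for (auto& w : workers) w.join();

    // Reduce: keep the candidate with the highest vote count.
    MatchResult best;
    for (const auto& r : partial)
        if (r.votes > best.votes) best = r;
    return best;
}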
Hardware Acceleration — FPGA (Xilinx Zynq)
• Select Functions to Be Implemented in Zynq
• FPGA — Matrix operations
• Dual-core ARM — Data management + Floating point
• Entire implementation done in C++ (Xilinx Vivado-HLS)
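For flavor, here is a small Vivado-HLS-style C++ sketch of the kind of matrix operation that maps well onto the FPGA fabric: applying a 4x4 rigid transform to a fixed-size block of points. The function name, block size, and pragma choices are illustrative assumptions, not the functions actually offloaded in Starry Night; a plain C++ compiler simply ignores the pragmas.

// Illustrative HLS-style sketch: apply a 4x4 rigid transform to a block of
// 3D points. BLOCK size and pragma choices are assumptions for illustration.
#define BLOCK 256

void transform_points(const float T[4][4],
                      const float in[BLOCK][3],
                      float out[BLOCK][3]) {
#pragma HLS ARRAY_PARTITION variable=T complete dim=2
    for (int p = 0; p < BLOCK; ++p) {
#pragma HLS PIPELINE II=1
        for (int r = 0; r < 3; ++r) {
            // out = R * in + t, with the translation in the fourth column of T.
            out[p][r] = T[r][0] * in[p][0]
                      + T[r][1] * in[p][1]
                      + T[r][2] * in[p][2]
                      + T[r][3];
        }
    }
}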
Performance — Embedded using FPGA
• Note: Currently, only 30% of the computationally intensive functions are implemented on the FPGA, with the rest still running on the ARM A9. It should therefore be much faster once most of these functions are moved to the FPGA.
• Performance on Xilinx Zynq (800 MHz Cortex-A9 + FPGA)
• Time to find one object
  • Zynq 7020 — 6 seconds
  • Zynq 7045 (est.) — <1 second
• No test results for two objects yet, but they should scale the same way as on the ARM.
Lessons Learned
• The object recognition implementation is quite reliable
• The algorithm does a great job of recognizing multiple models with minimal penalty
• More improvement is needed for noisy environments and certain object orientations
• Additional improvement in performance is needed:
  • Algorithm — application-specific parameters (e.g., size of the model descriptor)
  • ARM — NEON
  • Further algorithm improvements
  • Optimize the use of the FPGA core
Summary
Summary
• Key implementation issues
• Model descriptor
• Data structure
• Sampling technique
• Performance
• IMPORTANT
• Both ARM & FPGA provide scalability
• Therefore
  • Real-time object recognition was very difficult, but it was successfully implemented on both mobile and embedded platforms
• LIVE DEMO AT THE BOOTH!
Resources
• www.vangoghimaging.com
• Android 3D printing: http://www.youtube.com/watch?v=7yCAVCGvvso
• "Challenges and Techniques in Using CPUs and GPUs for Embedded Vision," Ken Lee, VanGogh Imaging: http://www.embedded-vision.com/platinum-members/vangogh-imaging/embedded-vision-training/videos/pages/september-2012-embedded-vision-summit
• "Using FPGAs to Accelerate Embedded Vision Applications," Kamalina Srikant, National Instruments: http://www.embedded-vision.com/platinum-members/national-instruments/embedded-vision-training/videos/pages/september-2012-embedded-vision-summit
• "Demonstration of Optical Flow algorithm on an FPGA": http://www.embedded-vision.com/platinum-members/bdti/embedded-vision-training/videos/pages/demonstration-optical-flow-algorithm-fpg
• Reference: "An Efficient RANSAC for 3D Object Recognition in Noisy and Occluded Scenes," Chavdar Papazov and Darius Burschka, Technische Universitaet Muenchen (TUM), Germany
Thank you