CS A485
Computer and Machine Vision
Lecture 1 - Introduction
January 14, 2014
 Sam Siewert
The Course
An introductory course on computer vision and machine
vision. Topics covered include the difference between computer
and machine vision, image capture and processing, filtering,
thresholds, edge detection, shape analysis, shape detection,
pattern matching, digital image stabilization, stereo ranging,
3D models from images, real-time vision systems, recognition
of targets, and applications including inspection, surveillance,
search and rescue, and machine vision navigation.
http://www.cse.uaa.alaska.edu/~ssiewert/a485.html
Richard Szeliski, Computer Vision: Algorithms and Applications,
Springer, 2011. (ISBN 978-1-84882-934-3)
Gary Bradski, Adrian Kaehler, Learning OpenCV, 2nd Edition,
O’Reilly, 2012. (ISBN 978-1449314651)
Sam Siewert
UC Berkeley – National Research University, Philosophy/Physics (1984-85)
University of Notre Dame, BS – Private, Aerospace/Mechanical (1985-89)
Johnson Space Center, U. of Houston – UHCL Computer Engineering, National R&D Center, Mission Control Center (1989-92)
U. of Colorado, Boulder, MS/PhD – Growing Research University, Gov’t Labs, Start-Ups, Computer Science (1992-today)
Interdisciplinary Teaching & Research – Aerospace/Mechanical, Computer Science, Computer Engineering
CU Boulder Senior Instructor, Adjunct Professor
CTO, Architect, Developer/Engineer
Related Industry Background
General Experience (~24 Years in Embedded and Scalable Systems)
– Intel Architecture Group (Atom, Scalable Cloud Solutions)
– CTO at Atrato Inc., a Digital Media Storage Start-up in Broomfield
– Consulting with Numerous Digital Media Firms
– 12 Years NASA JSC, NASA JPL / CU, NASA JPL / Ball Aerospace
– 12 Years Commercial Telecomm, Storage/Networks, Embedded, Digital Video
Machine Vision
– Spitzer Space Telescope – Sky-scan Mosaics, Super-resolution, PeakUp
– Optical Navigation – JPL
– Robotics at CU-Boulder
Computer Graphics
– UAV/UAS Digital Video and Graphics Overlays for Frame Latency
Indication
– Space Station/Shuttle Mission Control Real-time Avionics Displays
Digital Media
– Real-Time Digital Video Frame Transformation (1080p, 60Hz), Color
Enhancement
– Digital Media Storage and Networking
Course Topics
Computer Vision
– Emulation and Replication of Human-like Vision with Computers
– Goal is to Understand, Model, and Augment Human Vision
– Provide Robotics with Human-like Vision Capability
Machine Vision
– Using Digital Cameras to Automate a Process (E.g. Printed Circuit
Board Inspection, Sorting Recycling, Security Cameras)
– Compared to Computer Vision (More Practical Application)
– Applications (Optical Navigation, Sorting, Segmentation and
Recognition)
Digital Media
– Digital Video Encoding/Decoding
– Some Overlap with A490 Course, but Only to Extent Needed to
Understand CV/MV Cameras and Transport
Linux-based Labs
Computer Vision
http://en.wikipedia.org/wiki/File:CVoverview2.svg
Why Linux? …
From Game Consoles to Super-Computing
[Figure: PS3, Blue Gene, GPGPU, Tianhe-1 Pflop http://www.nscc-tj.gov.cn/en/]
From Android Mobiles to GIS and Digital Video Services
Huge Value in Open Source Drivers, Tools, and
Applications – Speeds Up Time to Market
Focus on Leveraging Linux for Desktop and Embedded
Systems for Machine Vision and Graphics
Why OpenCV
Long History of Computer Vision Captured in One C/C++
Library
Open Source
Runs on Linux (Easy Ubuntu install)
Students and Instructors Love it!
Abstracts Low-Level Algorithmic Details
– We will Leverage, but Also Implement Ground-Up to Learn
– Provides Algorithm Comparison and Abstracted CV Design
Well Documented on the Web and in Books
Machine Vision Systems
Camera Basics – Extrinsic and Intrinsic
Embedded Systems for Machine Vision
Fundamentals
– Background Elimination
– Edge Enhancement and Other Convolutions
Optical Navigation
– Segmentation Methods
– Tracking (Centroid of Object)
Stereo Vision
– Distance Estimation Methods (Disparity)
– Binocular Vision vs. RGB-Depth Mappers (PrimeSense, Asus Xtion,
Creative Cam)
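The disparity-based distance estimation listed above can be sketched with the standard pinhole relation for a rectified stereo pair, Z = f * B / d, where f is the focal length in pixels, B is the baseline between the two cameras, and d is the disparity in pixels. The function name and the rig numbers below are illustrative assumptions, not values from the course.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Return depth in meters for a rectified stereo pair (Z = f * B / d)."""
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical rig: 700-pixel focal length, 10 cm baseline.
    for d in (70.0, 35.0, 7.0):
        z = depth_from_disparity(700.0, 0.10, d)
        print(f"disparity {d:5.1f} px -> depth {z:.2f} m")
```

Because depth varies as 1/d, a one-pixel disparity error matters far more for distant objects than for near ones, which is one reason passive stereo depth maps get noisy at range.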
Digital Media Systems
Embedded Media Devices
– Set-Top Boxes (Linux)
– Mobile Media Systems: Smart Phones, Tablet Computing, Readers,
Notebooks, DVD Players, iPods, etc.
– Digital Camera Systems (SD, HD, HD-SDI, 2K, 4K, 6K)
Resolutions/Formats http://en.wikipedia.org/wiki/File:Vector_Video_Standards2.svg
– Game Consoles: Xbox, PS3, etc.
– Gesture Recognition, Augmented Reality
– SD, HD Cameras and Interfaces: Composite, S-Video, Component,
DVI, HDMI
Scalable Digital Media Server Systems (Head End)
– Post Production for Digital Cinema, TV, Web
2K, 4K, 6K Streams from Digital Cameras
Frame/Color Editing, CGI (Computer Generated Imagery), Soundtrack,
Write to Distribution Media
– Digital Cinema: HD Digital Projectors, 3D Digital Projectors
– Closed Circuit Security Systems: Multi-Camera NTSC/HD
MV vs CV vs Video Analytics
Machine Vision – Photometers Used in Process Control
– Successful History
– Industrial Automation and Robotics
– Controlled Environments
– Inspection, Optical Navigation, Medical
[Figure: CU-Boulder ECEN 5623]
Computer Vision – Emulate Human Vision System
– Early Underestimation – Marvin Minsky Summer Project
– Challenge of Un-controlled Environments
– 50 Years Later, Challenges Better Understood
– Vision Prosthetics, General Automation
– Recent Breakthroughs – USC, DARPA Artificial Retina, Google Car
– Efficiency and Generalization?
[Figure: Spitzer – JPL/Caltech]
Video Analytics – CV from RT or Stored/Networked Video
If Possible, CV => MV Conversion – Cheat!
Practical Solution – Convert CV to MV Problem
– Loss of Generalization (Solves One Problem Rather than Class)
– Controlling Environment May Be Difficult
– Use Non-Visible Spectrum to Advantage (e.g. FLIR)
– Sensor Fusion (Visible + FLIR, RADAR, GPS, …)
– Models and Prior-Knowledge of Problem Exploited
[Figure: CU-Boulder ECEN 5623: Overhead Camera, Dark Background, Overhead Lighting, Game State / Grid Known, Shape Database, Active RGB-D (e.g. Kinect)]
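As a rough illustration of why a controlled scene (overhead camera, dark background) turns a hard CV problem into an easy MV one, the sketch below segments a bright object with a single fixed threshold and tracks it by centroid. It is a minimal pure-Python sketch under assumed conditions: the image is a list of rows of 0-255 gray values, and the function name, threshold, and test frame are hypothetical.

```python
def segment_and_centroid(gray, threshold=128):
    """Threshold a grayscale image (list of rows, values 0-255).

    Returns (mask, centroid): mask holds 1 for pixels brighter than
    threshold, centroid is the (x, y) mean of those pixels or None.
    """
    mask = [[1 if pixel > threshold else 0 for pixel in row] for row in gray]
    count = sum_x = sum_y = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                count += 1
                sum_x += x
                sum_y += y
    if count == 0:
        return mask, None
    return mask, (sum_x / count, sum_y / count)


if __name__ == "__main__":
    # Hypothetical 5x6 frame: dark background (10), one bright 2x2 object (200).
    frame = [[10] * 6 for _ in range(5)]
    for y in (1, 2):
        for x in (3, 4):
            frame[y][x] = 200
    _, centroid = segment_and_centroid(frame)
    print("object centroid (x, y):", centroid)
```

A fixed threshold only works because the lighting and background are controlled; in an uncontrolled scene the same code would fail, which is the loss of generalization noted above.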
Why is Human Vision > Computer?
1. Neuron > Transistor
2. Better Programming? ROM?
3. More Richly Interconnected
4. Storage + Processing
Human Brain: Cortex = 10 Billion Neurons (High Fan-out), > 1 Trillion Synapses, Total = 100 Billion Neurons
Human Eye: Approximately 100+ Mega-Pixel (Rod/Cone Count)
Camera: Red Epic 645, 63 Mega-Pixel
CPU: 5 to 7 Billion Transistors
[Figure: CPU, Local Bus, Memory Controller, I/O Bus (x16 5Gbps = 8GB/sec), Camera, Link, Interface Card]
Neuroscience. 2nd edition. Purves D, Augustine GJ, Fitzpatrick D, et al., editors. Sunderland (MA): Sinauer Associates; 2001. http://www.ncbi.nlm.nih.gov/books/NBK10848/
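The "x16 5Gbps = 8GB/sec" figure for the I/O bus can be checked with a short calculation. The sketch below assumes a PCIe 2.0-style serial link with 8b/10b line coding (8 payload bits per 10 bits on the wire); the slide does not name the bus or the coding, so both are assumptions, as is the function name.

```python
def link_bandwidth_bytes(lanes, gbps_per_lane, payload_bits=8, line_bits=10):
    """Usable bytes per second for a serial link.

    Assumes payload_bits of data are carried per line_bits on the wire
    (8b/10b by default, an assumption about the bus on the slide).
    """
    raw_bits_per_sec = lanes * gbps_per_lane * 1e9
    return raw_bits_per_sec * payload_bits / line_bits / 8


if __name__ == "__main__":
    usable = link_bandwidth_bytes(lanes=16, gbps_per_lane=5)
    print(f"x16 at 5 Gbps per lane: {usable / 1e9:.1f} GB/sec usable")
```

Under these assumptions the 80 Gbps raw rate drops to 64 Gbps of payload, which is the 8 GB/sec quoted on the slide.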
Biological Vision vs. Machine Vision
(Why A Honey Bee is Better than HPC for CV)
Honey Bee: 960K Neurons
– In flight: Learns locations, complex odors, colors, and shapes with high efficiency (500 Watt/Kg), 0.218g
Brain plasticity for learning, connectedness, concurrency, integrated sensing, power efficiency, and resiliency
http://en.wikipedia.org/wiki/List_of_animals_by_number_of_neurons
Humans - 100 million Photoreceptors
– 10 billion Neurons (Cerebral Cortex)
– Brain with 100 billion Neurons
– Millisecond Transfer
– Massively Parallel Analog + Digital Computation
Synapse Match is a Challenge
– 7000 Connections from 10 Billion Neurons
– 3 Year Olds Have 10^15 Synapses
CPU to Digital Camera/HDD
– Connects 10’s of millions of pixels
– to Several Billion transistors
– Through Sequential Logic and I/O Bus
[Figure: NVIDIA GK110, 28nm (7.1 billion); Intel MICA, 22nm (5 billion); 2012 – 8 billion?]
http://en.wikipedia.org/wiki/File:Transistor_Count_and_Moore%27s_Law_-_2011.svg
How We’ll Do It
Assessment of Theoretical Learning
– Two Mid-term Exams (1/2 way, 7/8 way)
– FINAL
Practice – 5 Labs
Application – 1 Extended Lab with your Own Design
Linux Lab and Desktop Options
Native Linux Installation – Ubuntu
Logitech C200 or C270 Camera(s)
OpenCV
ffmpeg
GIMP
Transformer.uaa.alaska.edu – available to all remotely
and in A219
Virtual-Box Ubuntu Installation
Beagle xM Ubuntu, Intel Terasic Atom Yocto Linux
Administrivia
Lectures – PowerPoint with Camtasia – Recorded on Wednesdays in
ENGR 227C, Distributed via Blackboard by Thurs Morning
Introductions
– Instructor (Office Hours) http://www.cse.uaa.alaska.edu/~ssiewert/Schedule-Spring-2014.pdf
– Students (Introductions) – Let’s all join Google+ Circle (I will create and
invite you)
UAA Blackboard
– http://www.uaa.alaska.edu/classes/
Personal Lab – You MUST Have Native Linux, and I Recommend VirtualBox Linux Too
– Either using your own Laptop
– Or Using A219 Lab at UAA
UAA Beagle xM Linux Lab – A219,
http://www.cse.uaa.alaska.edu/~ssiewert/cpal.html
Linux Digital Video and CV
Processing Skills
Introduction Session
January 14, 2014
Basic Lab Observations
CV is Compute Intensive
– Lower Resolution and Frame Rates (e.g. 640x480 or 320x240 at
30Hz)
– High-End is Really Intense (HPC) – E.g. 1000 Hz 4K Cameras
like http://www.idtvision.com/, or http://www.photron.com/
– Humans Seem to Saturate at 60Hz (Current Theory)
– 60Hz Stereo in HD is still a Massive Data Rate (1920x1080 x 3
bytes x 60 x 2), or about 746 MB/sec!!
We will work at Low Resolution and 30Hz, but with Both
2D and 3D
Both Binocular 3D, and RGB-Depth
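The data-rate arithmetic above is easy to reproduce for any camera configuration. The sketch below assumes uncompressed 24-bit color (3 bytes per pixel, as on the slide); the function name and the list of configurations are illustrative.

```python
def raw_video_rate(width, height, bytes_per_pixel, fps, cameras=1):
    """Return the uncompressed video data rate in bytes per second."""
    return width * height * bytes_per_pixel * fps * cameras


if __name__ == "__main__":
    # (label, width, height, bytes per pixel, frames per second, cameras)
    configs = [
        ("320x240 at 30Hz, one camera", 320, 240, 3, 30, 1),
        ("640x480 at 30Hz, one camera", 640, 480, 3, 30, 1),
        ("1920x1080 at 60Hz, stereo pair", 1920, 1080, 3, 60, 2),
    ]
    for label, w, h, bpp, fps, n in configs:
        rate = raw_video_rate(w, h, bpp, fps, n)
        print(f"{label}: {rate / 1e6:.1f} MB/sec")
```

The stereo HD case works out to 746,496,000 bytes per second (about 746 MB/sec in decimal units, or about 712 MiB/sec), while 640x480 at 30Hz needs under 30 MB/sec, which is why the labs stay at low resolution.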
Tutorial CV Papers – IBM DeveloperWorks
Build a compute node or small cluster application
and scale with HPC http://www.ibm.com/developerworks/cloud/library/clcloudscaling1-hpcondemand/index.html
Explore video analytics in the cloud http://www.ibm.com/developerworks/cloud/library/clcloudscaling3-videoanalytics/
Machine data analytics http://www.ibm.com/developerworks/library/bdmdasecurity/index.html
Labs
I will POST to BB and External Website on Thursdays
Read, Review, Start and Question that Weekend
Bring Questions to Office Hours Mon, Tues, Wed the
Following Week
Lab Due one Week Later
This Works Great if YOU Keep Up
I will POST Lab #1 on 1/15/2014, Due on 1/26 for Full
Credit, Accepted Late Until 1/30 (10% Penalty)
OpenCV Demos
Overview Session – Passive Computer
Vision Methods
January 14, 2014
2D & 3D Passive Computer Vision
[Figure panels: 2D Skeletal Transform, 3D Disparity & Depth Map, Canny Edge Finding, Face Detection/Recognition, Linear Hough Transform]
[Figure: Analog Camera #1 LEFT (NIR, Visible) and Analog Camera #2 RIGHT (NIR, Visible); USB 2.0, PCIe Host Channels; Linux with OpenCV (x86, TI OMAP, Atom); Storage]
OpenNI
Overview Session – Active Computer
Vision Methods
January 14, 2014
3D Active Computational Photometry
TI DLP Light-crafter Kit http://www.ti.com/tool/dlplightcrafter
[Figure: IR Pattern Projection; Analog Camera #1 RGB (Visible); Analog Camera #2 (Near Infrared); HD Digital Camera Port (Snapshot); Altera FPGA CVPU (Computer Vision Processing Unit); USB 2.0, PCIe Host Channels; Depth Map; Mobile Sensor Network Processor (TI OMAP, Atom); Flash SD Card; Networked Video Analytics]
Photo credits and reference: Dr. Daniel Aliaga, Purdue University
https://www.cs.purdue.edu/homes/aliaga/
https://www.cs.purdue.edu/homes/aliaga/cs635-10/lec-structured-light.pdf
3D Computer Vision Transforms
Long Range ( > 5 meters) Using Passive Binocular Methods
– Impractical to Project from a UAV or Long Range Observer
– Requires Image Registration
– Requires Accurate Camera Intrinsics (Camera Characteristics) & Extrinsics (e.g. Baseline)
Short Range ( < 5 meters), Structured IR Light Projection for RGB-D
– Compare to ASUS Xtion and PrimeSense – Off-the-Shelf
– Robust Depth Maps with Less Noise
– Showing Significant Promise to Improve CV Scene Segmentation and Object
Recognition Compared to 2D
– “Change Their Perception”, By Xiaofeng Ren, Dieter Fox, and Kurt Konolige,
IEEE RAS, December 2013.
[Figures: Noise in Passive Depth Maps; Robust Active Depth Map]
Off-The-Shelf RGB-Depth Mappers
Intel Creative Camera – Windows Perceptual SDK
ASUS Xtion Short and Long Range – OpenNI
PrimeSense (Kinect Old and New) – MS SDK, ROS
Summary
Numerous MV and CV Applications
– Inspection and Process Automation – MV Domain
– Interactive Systems and Augmented Reality – CV Domain
– Robotics – MV and CV
– Study of Human Vision and Vision Prosthetics – CV
2D Image Processing (Machine Vision)
– Capture, Enhancement, Segmentation, Recognition
Passive 3D Computer Vision
– Stereo Capture, Calibration, Enhancement, Registration, Depth
Mapping, Segmentation, Recognition
Active 3D Machine Vision (It’s Cheating!)
– Structured Light Illumination and IR/Visible Capture, IR Analysis and
Depth Mapping, Visible Image Registration
– Works Well Between 0 and 5 Meters