Xiuwen Liu

Research Activities at
Florida State Vision Group
Florida State University
Xiuwen Liu
Department of Computer Science
Florida State University
http://www.cs.fsu.edu/~liux/courses/intro-seminar-10.ppt
Research Statement
 My research goal is to create machines that can “see” with human-level and superhuman performance, and to develop their applications
• This seems a trivial problem, as each of us can do it without any effort
• Computer + Camera = “A See Machine”?
9/11/2010 11:50:27 PM
intro-seminar-10.ppt
Visual Pathway
Visual Illusion
Outline
 Why computer vision and pattern recognition
• Motivating applications
 Some of my research projects
 Related courses
 Contact information
What is the Passion?
Image-Guided Neurosurgery
Computer Vision Applications – cont.
 Military applications
• Automated target recognition
Biometrics
Iris code can achieve zero false acceptance
Computer Vision in Sports
 How was the yellow created?
Social Health
 The coming epidemic – Alzheimer’s
• There is no cure but early detection is the key
• How to do it?
Smart Energy
 U.S. smart grid initiative
Cyber-Physical Systems
http://dpolyakov.com/images/design/smartplanet_040.jpg
Computational Biology
 Life is fundamentally digital and so is biology
Research Projects
 Image and shape representations
• Image modeling
• Video analysis
• Medical image analysis
• Media for all – Automatic video description generation
 Cyber-physical systems – RFID Localization
 Computational Biology
 Classes
Generic Image Modeling
 How can we characterize all these images perceptually?
Spectral Histogram Representation
 Spectral histogram
• Given a bank of filters $F^{(\alpha)}$, $\alpha = 1, \ldots, K$, a spectral histogram is defined as the marginal distribution of filter responses
$$I^{(\alpha)}(v) = F^{(\alpha)} * I(v)$$
$$H_I^{(\alpha)}(z) = \frac{1}{|I|} \sum_{v} \delta\left(z - I^{(\alpha)}(v)\right)$$
$$H_I = \left(H_I^{(1)}, H_I^{(2)}, \ldots, H_I^{(K)}\right)$$
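The definition above can be sketched in a few lines of NumPy. This is a minimal illustration, not the group's implementation: the three-filter bank, the number of bins, the value range, and the function names are all assumptions made for the example, and |I| is taken to be the number of valid response pixels.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def filter_response(image, kernel):
    """Valid-mode 2D convolution F * I using only NumPy."""
    flipped = kernel[::-1, ::-1]  # convolution = correlation with flipped kernel
    windows = sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, flipped)


def spectral_histogram(image, kernels, bins=8, value_range=(-1.0, 1.0)):
    """Concatenate the normalized histograms of all filter responses."""
    parts = []
    for kernel in kernels:
        response = filter_response(image, kernel)
        counts, _ = np.histogram(response, bins=bins, range=value_range)
        parts.append(counts / response.size)  # the 1/|I| normalization
    return np.concatenate(parts)


# Hypothetical filter bank: intensity (delta) filter and two gradient filters.
bank = [
    np.array([[1.0]]),
    np.array([[-1.0, 1.0]]),
    np.array([[-1.0], [1.0]]),
]
rng = np.random.default_rng(0)
image = rng.random((32, 32))          # synthetic image with values in [0, 1)
H = spectral_histogram(image, bank)   # K = 3 histograms of 8 bins each
```

Each of the K sub-histograms sums to one, so images of different sizes can be compared through their spectral histograms.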
Spectral Histogram Representation - continued
 Choice of filters
• Laplacian of Gaussian filters
• Gabor filters
• Gradient filters
• Intensity filter
[Figures: LoG filter; Gabor filter]
Spectral Histogram Representation - continued
Face detection - continued
Face detection - continued
Face detection - continued
Rotation Invariant Face Detection
Rotation Invariant Face Detection - continued
Linear Representations
 Linear representations are widely used in appearance-based object recognition and other applications
• Simple to implement and analyze
• Efficient to compute
• Effective for many applications
$$\alpha(I, U) = U^T I \in \mathbb{R}^d$$
Standard Linear Representations
 Principal Component Analysis
• Designed to minimize the reconstruction error on the training set
• Obtained by calculating eigenvectors of the covariance matrix
 Fisher Discriminant Analysis
• Designed to maximize the separation between class means relative to the within-class scatter
• Obtained by solving a generalized eigenvalue problem
 Independent Component Analysis
• Designed to maximize the statistical independence among coefficients along different directions
• Obtained by solving an optimization problem with an objective function such as mutual information, negentropy, ...
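As an illustration of the first item, the following is a minimal sketch of a PCA basis obtained from the eigenvectors of the covariance matrix, followed by the linear projection U^T I. The synthetic data, the function names, and the mean-centering step before projection are assumptions made for the example.

```python
import numpy as np


def pca_basis(X, d):
    """Return the mean and a D x d matrix U of top-d covariance eigenvectors.

    X holds one vectorized training image per row (n x D).
    """
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:d]    # indices of the d largest
    return mean, eigvecs[:, order]


def represent(I, U, mean):
    """Linear representation U^T I, applied to the mean-centered input."""
    return U.T @ (I - mean)


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))   # 50 synthetic "images" of dimension 10
mean, U = pca_basis(X, d=3)
coeffs = represent(X[0], U, mean)
```

The columns of U are orthonormal, so the d coefficients are simply inner products of the input with each basis vector.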
Optimal Component Analysis
ORL Face Dataset
Performance Comparison
Real-time Scene Interpretation
 Object detection and recognition problem
• Given a set of images, find regions in these images which contain instances of relevant objects
• Here the number of relevant objects is assumed to be large
– For example, the system should be able to handle 30,000 different kinds of objects, an estimate of the human brain’s capacity for basic-level visual categorization [I. Biederman, Psychological Review, vol. 94, pp. 115-147, 1987]
Problem Statement for Scene Interpretation
 Object detection and recognition problem
• Given a set of images, find regions in these images which contain instances of relevant objects
• Here the number of relevant objects is assumed to be large
– For example, the system should be able to handle 30,000 different kinds of objects, an estimate of the human capacity for basic-level visual categorization [I. Biederman, Psychological Review, vol. 94, pp. 115-147, 1987]
 Goal
• Develop a system that can achieve real-time detection and recognition for images of size 640 x 480 with high accuracy
– Say, at a frame rate of 15 frames per second
Proposed Framework
Specifications and Requirements
 We want to detect and recognize at least 30,000 object classes in images
• At four different scales
• Using exhaustive search of local windows, that is, we do not assume segmentation or other pre-processing
• If we assume objects are in some (e.g. 21 x 21) windows, this means that there will be many (18,432,000 per second, i.e. 640 x 480 positions x 4 scales x 15 frames per second) local windows to be classified/processed
• We want to do this on a 3.6 GHz Dell Precision workstation with an estimated performance of 28,665.4 MIPS
• This leaves about 1555 instructions to process each 21 x 21 local window
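The budget above can be reproduced with a short calculation. The sketch assumes one window per pixel position at each scale (image borders ignored), which is what the stated window count implies; the variable names are chosen for the example.

```python
width, height = 640, 480   # image size from the stated goal
scales = 4                 # number of scales searched
fps = 15                   # target frame rate
mips = 28_665.4            # estimated workstation performance

# One local window per pixel position, per scale, per frame.
windows_per_second = width * height * scales * fps
instructions_per_second = mips * 1_000_000
budget_per_window = instructions_per_second / windows_per_second

print(windows_per_second)        # 18432000
print(round(budget_per_window))  # 1555
```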
Requirements – cont.
 To achieve the specifications, we need two critical components
• A classifier that can reduce the average classification time effectively
– Note that on average we have 1555 instructions per window; if we can process 90% of the windows using only 100 instructions each, we have on average 14,650 instructions for each of the remaining 10% of local windows
• Features that can discriminate a large number of objects and can be computed using a few instructions
– Do such features exist?
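The early-rejection arithmetic in the classifier bullet follows from holding the average cost fixed. A minimal sketch, with a hypothetical helper name:

```python
def remaining_budget(average, easy_fraction, easy_cost):
    """Instructions available per hard window when the overall average is fixed.

    Solves: average = easy_fraction * easy_cost + (1 - easy_fraction) * hard.
    """
    hard_fraction = 1.0 - easy_fraction
    return (average - easy_fraction * easy_cost) / hard_fraction


hard_budget = remaining_budget(average=1555, easy_fraction=0.9, easy_cost=100)
print(round(hard_budget))  # 14650
```

The faster the easy windows are rejected, the more instructions remain for the ambiguous ones, which is why the average decision cost of the classifier matters more than its worst case.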
Local Spectral Histograms
 We introduce a new class of features, which we call local spectral histogram (LSH) features
• They are defined relative to a chosen set of filters
• For a given filter, an LSH feature is defined as a histogram of a local window of the filtered image
• One bin of the histogram is given by
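A minimal sketch of how one bin of a local histogram might be computed in a handful of instructions. It assumes the bin is the count of filter responses inside the window that fall in a value interval, and that an integral image (summed-area table) is precomputed per bin. Neither assumption comes from the slides; the function names, the bin interval, and the random stand-in for the filtered image are hypothetical.

```python
import numpy as np


def bin_integral_image(response, low, high):
    """Summed-area table of the indicator low <= response < high."""
    indicator = ((response >= low) & (response < high)).astype(np.int64)
    table = np.zeros((response.shape[0] + 1, response.shape[1] + 1), dtype=np.int64)
    table[1:, 1:] = indicator.cumsum(axis=0).cumsum(axis=1)
    return table


def window_bin_count(table, top, left, size):
    """Count of in-bin pixels inside a size x size window using four look-ups."""
    bottom, right = top + size, left + size
    return (table[bottom, right] - table[top, right]
            - table[bottom, left] + table[top, left])


rng = np.random.default_rng(0)
response = rng.normal(size=(64, 64))   # stand-in for a filtered image
table = bin_integral_image(response, 0.0, 0.5)
count = window_bin_count(table, top=10, left=20, size=21)
```

Under these assumptions the per-window cost is independent of the window size, which fits a budget of only a few instructions per feature.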
Local Spectral Histogram Example
Convolution is implemented using FPGAs
Local Spectral Histogram Features
ORL Face Dataset
Comparison Between Haar and LSH Features
COIL Dataset
Comparison Between Haar and LSH Features
Texture Dataset
Comparison Between Haar and LSH Features
Mixed Dataset
Comparison Between Haar and LSH Features
Comparison Between Haar and LSH Features
Classifier
 To achieve the specification, we also need a classifier that takes only a few instructions to make a decision on average
• At the same time, we need to achieve high accuracy
 We propose to use a look-up table tree classifier
• I.e., a decision tree classifier where each node is implemented by a look-up table
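The idea of a decision tree whose nodes are look-up tables can be sketched as follows. This is an illustrative toy, not the classifier used by the group: the node layout, the quantization of features to four levels, and the labels are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class LutNode:
    feature: int                         # index of the feature to inspect
    table: List[Union["LutNode", str]]   # one entry per quantized feature value


def classify(node, features):
    """Follow table look-ups until a leaf label (a string) is reached."""
    while isinstance(node, LutNode):
        node = node.table[features[node.feature]]
    return node


# Hypothetical two-level tree over features quantized to {0, 1, 2, 3}.
inner = LutNode(feature=1, table=["face", "face", "background", "background"])
root = LutNode(feature=0, table=["background", inner, inner, "background"])

label = classify(root, [1, 0])   # root -> inner -> "face"
```

Each node costs one indexed read rather than a chain of comparisons, and windows that hit a leaf at the root are decided after a single look-up.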
Look-up Table Tree Classifier
Look-up Table Tree Classifier
An Example Path in a Decision Tree
Performance Comparison
RCT – Rapid Classification Tree, implemented by Keith Haynes
Detection and Recognition
Detection and Recognition
Content-based Video Representation, Indexing and Retrieval
 A video is an extrinsic 3D representation of a 4D volume
• 3D spatial space + 1D temporal space = 4D volume
• For video, 2D image space + 1D temporal space = 3D volume
 Our group is working on an intrinsic 4D representation for video
• By first reconstructing the scene using SLAM (simultaneous localization and mapping) and stereopsis
4D Video Representation Example
VeScene System
 VeScene – Voiced Scene System
Illustration by Nan Zhao
Computer Vision for Gerotechnology
 As mobile devices become more powerful, they may serve as an efficient interface to compensate for visual, memory, and other deficiencies due to aging
• Society is aging
– For example, people aged 65 and older are 16.8% of Florida’s population (US Census Bureau, 2005)
• By modifying and enhancing environments, vision technology can be critical for helping people stay active and independent
Early Detection of Alzheimer’s Through Gait
 Alzheimer’s can be detected reliably five years before it can be clinically detected otherwise
 Other indicators include typing/writing and content analysis
• They are controlled by cognitive functions that depend on brain areas affected by Alzheimer’s
• Usage of words is also affected
 Collaborating with Prof. Tyson to do early detection using smart phones
Shape Theory
 We want to quantify the difference between two shapes in a principled way
• We do this by constructing a shape space and then using the geodesic distance between two shapes on the shape manifold as the metric
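To make the notion of a geodesic distance concrete, here is a minimal sketch under a strong simplifying assumption: shapes are landmark sets with translation and scale removed, so they lie on a unit hypersphere where the geodesic is a great-circle arc. The slides do not specify the shape space actually used; rotation and reparametrization are not handled here, and all names are hypothetical.

```python
import numpy as np


def to_preshape(points):
    """Center a k x 2 landmark array and scale it to unit Frobenius norm."""
    centered = points - points.mean(axis=0)
    return centered / np.linalg.norm(centered)


def geodesic_distance(p, q):
    """Great-circle distance between two unit-norm shape vectors."""
    cosine = np.clip(np.sum(p * q), -1.0, 1.0)
    return float(np.arccos(cosine))


square = to_preshape(np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float))
bigger = to_preshape(np.array([[0, 0], [3, 0], [3, 3], [0, 3]], dtype=float))
rect = to_preshape(np.array([[0, 0], [2, 0], [2, 1], [0, 1]], dtype=float))

same = geodesic_distance(square, bigger)   # close to 0: same shape, other scale
diff = geodesic_distance(square, rect)     # clearly positive: different shape
```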
Surface Parametrization
Geodesic Interpolation Between Surfaces
Atlas for Hippocampus
Characterizing Alzheimer’s via Shape Change
Computer Vision for Computational Systems Biology
 The goal of systems biology is to link the molecular and cellular events and properties to physiological functions
Source: “Integration from proteins to organs: The Physiome Project”, Nature Review, Vol. 4, 2007.
FISHFinder@FSU
Source: Gilbert’s group at Biology Department, FSU
High Throughput Nanoscale Localization
 In cellular and molecular biology, a typical problem is that biologists need to localize marked proteins in various areas
Live Cell Imaging at Cellular Level
QUEST Project
 Quantitative Elastic Spatial-Temporal Atlases for Subcellular Structures
Atomic Tomography for Electron Microscopy
 As life is digital, the ultimate necessary resolution for modeling biological processes is atomic resolution
• As life is digital, the ultimate models will be discrete rather than continuous
• It appears that atomic reconstruction is almost within reach using a two-stage tomography algorithm – to be named ATomo
– Being developed by Chaity and others
Cyber Physical Systems
 As ubiquitous computing becomes a reality, location-aware services become a critical component
• An example is GPS-based services
• Currently, with Prof. Zhang, we are studying a dramatically new way of localizing objects through RFID tags with 2 mm accuracy
Fine Granularity Localization of RFIDs
Intelligent Human-Computer Interface
Activity Monitoring for Elderly
 With RFID tags, we can identify and localize many objects
• By integrating with built-in cameras in phones, we can estimate a three-dimensional model of the environment along with the states of the objects
• An envisioned program is that a person can remotely get a summary and other statistics of the daily activities of elderly people who live independently
Courses
 Most Relevant Courses
• CAP 5638 Pattern Recognition – Offered 2011-2012
• CAP 5415 Principles and Algorithms of Computer Vision
– Will be offered Spring 2010
• CAP 6417 Theoretical Foundations of Computer Vision
• STA 5106 Computational Methods in Statistics I
• STA 5107 Computational Methods in Statistics II
• ISC 5935-05/STA 5934-01 Applied Machine Learning
• Seminars and advanced studies
 Related Courses
• CAP 5726 Computer Graphics
• CAP 5600 Artificial Intelligence
Funding of the Group
 National Science Foundation
• DMS
• CISE IIS
• ACT
• CCF
 National Institutes of Health
 Industry
• Harris
Summary
 Computer Vision Group offers interesting research topics/projects
• Effective and intrinsic representations for images and videos
• Real-time detection and recognition of objects
• Computational models for object recognition and image classification
• Medical/biological image analysis
• Motion/video sequence analysis and modeling
• They are challenging, interesting, and exciting
• Now it is a productive and fruitful area to be in
Contact Information
• Name: Xiuwen Liu
• Web sites: http://cavis.fsu.edu and http://www.cs.fsu.edu/~liux
• Email: liux@cs.fsu.edu
• Offices: LOV 166 and Eppes 102
• Phones: 644-0050 and 645-2257
Thank you!
Any questions?