Introduction to C & C++

Object Detection
01 – Advanced Hough Transform
JJCAO
Line and curve detection
• The Hough transform (HT) is a standard tool in image analysis that allows
recognition of global patterns in image space by recognition
of local patterns in a transformed parameter space.
• HT: Elegant method for direct object recognition
– Edges need not be connected
– Complete object need not be visible
– Key Idea: Edges VOTE for the possible model
Detect partially occluded lines
HT for Lines
• Polar representation of lines: ρ = p · n = x cos θ + y sin θ
[Figure: image space with point (x, y) on the line y = mx + b; parameter space with point (b, m)]
• Range of the polar parameters: −N ≤ ρ ≤ √(M² + N²), θ ∈ [0, π)
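The voting step for lines in the polar parameterization can be sketched as follows. This is a minimal illustration, not the lecture's implementation: the names `hough_lines` and `strongest_line`, the bin sizes, and the symmetric ρ range [−ρmax, ρmax] are assumptions made for the example.

```python
import math

def hough_lines(points, n_theta=180, rho_step=1.0):
    """Accumulate votes in (theta, rho) space for a list of (x, y) edge points.

    theta is sampled in [0, pi); rho is binned over [-rho_max, rho_max]
    (a symmetric range chosen here for simplicity).
    """
    rho_max = math.ceil(max(math.hypot(x, y) for x, y in points))
    n_rho = math.ceil(2 * rho_max / rho_step) + 1
    acc = [[0] * n_rho for _ in range(n_theta)]
    for x, y in points:
        # Each edge point votes for every line that could pass through it.
        for t in range(n_theta):
            theta = t * math.pi / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[t][int(round((rho + rho_max) / rho_step))] += 1
    return acc, rho_max

def strongest_line(acc, rho_max, rho_step=1.0):
    """Return (theta, rho, votes) of the accumulator cell with the most votes."""
    votes, t, r = max((v, t, r)
                      for t, row in enumerate(acc)
                      for r, v in enumerate(row))
    return t * math.pi / len(acc), r * rho_step - rho_max, votes
```

For points sampled from the horizontal line y = 5, the strongest cell should sit at θ = π/2, ρ = 5.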
Hough_Grd
• Recall: when we detect an
edge point, we also know its
gradient direction
• But this means that the line
is uniquely determined!
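• Modified Hough transform:
For each edge point (x,y)
    θ = gradient orientation at (x,y)
    ρ = x cos θ + y sin θ
    A(θ, ρ) = A(θ, ρ) + 1
end
• The gradient orientation Θ ranges over [0°, 360°), so it must be converted to the line angle θ ∈ [0, π): for Θ ≥ 180°, use θ = Θ − 180°, which flips the sign of ρ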
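A minimal sketch of the gradient-directed variant, in which each edge point casts a single vote. The function name `hough_lines_with_gradient`, the sparse `Counter` accumulator, and the bin sizes are assumptions for illustration; computing ρ after folding the angle into [0°, 180°) takes care of the sign change automatically.

```python
import math
from collections import Counter

def hough_lines_with_gradient(edges, theta_step_deg=1.0, rho_step=1.0):
    """edges: iterable of (x, y, gradient_angle_deg), angle in [0, 360).

    Returns a Counter keyed by (theta_bin, rho_bin); one vote per edge point.
    """
    acc = Counter()
    for x, y, angle_deg in edges:
        theta_deg = angle_deg % 180.0          # fold [0, 360) into [0, 180)
        theta = math.radians(theta_deg)
        rho = x * math.cos(theta) + y * math.sin(theta)
        key = (int(theta_deg // theta_step_deg), int(round(rho / rho_step)))
        acc[key] += 1
    return acc
```

Edge points on the line y = 5 vote for the same cell whether their gradient points up (90°) or down (270°).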
Hough transform for circles
[Figure: an edge point (x, y) in image space votes in the (x, y, r) Hough parameter space for the two candidate centers (x, y) + r∇I(x, y) and (x, y) − r∇I(x, y)]
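A minimal sketch of circle voting along the gradient direction. It assumes the gradient is normalized to unit length before being scaled by r, and it tries a list of candidate radii; the name `hough_circles` and the integer rounding of centers are illustrative choices.

```python
import math
from collections import Counter

def hough_circles(edges, radii):
    """edges: iterable of (x, y, gx, gy), where (gx, gy) is the image gradient.

    Returns a Counter over (cx, cy, r) cells in Hough parameter space.
    """
    acc = Counter()
    for x, y, gx, gy in edges:
        norm = math.hypot(gx, gy)
        if norm == 0:
            continue                      # no usable direction at this point
        ux, uy = gx / norm, gy / norm     # unit gradient (assumption)
        for r in radii:
            # The center lies at distance r along or against the gradient.
            for sign in (1, -1):
                cx = int(round(x + sign * r * ux))
                cy = int(round(y + sign * r * uy))
                acc[(cx, cy, r)] += 1
    return acc
```

With points sampled on a circle of radius 10 around (20, 30) and outward-pointing gradients, the cell (20, 30, 10) collects one vote per edge point.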
Generalizing the H.T.
• Suppose there are m different gradient orientations (m ≤ n)
[Figure: boundary point Pi with gradient orientation fi, at distance ri and angle ai from the reference point (xc, yc)]
R-table:
f1: (r11,a11), (r12,a12), …, (r1n1,a1n1)
f2: (r21,a21), (r22,a22), …, (r2n2,a2n2)
…
fm: (rm1,am1), (rm2,am2), …, (rmnm,amnm)
xc = xi + ri cos(ai)
yc = yi + ri sin(ai)
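Building an R-table from a model shape can be sketched as below. The name `build_r_table`, the quantization of the gradient orientation into `n_bins` equal bins, and the dictionary layout are assumptions for illustration. Each entry (r, a) is chosen so that xc = x + r cos(a) and yc = y + r sin(a).

```python
import math
from collections import defaultdict

def build_r_table(boundary, reference, n_bins=36):
    """boundary: iterable of (x, y, f_deg), f = gradient orientation in degrees.
    reference: (xc, yc), the reference point of the model shape.

    Returns a dict mapping orientation bin -> list of (r, a) entries.
    """
    xc, yc = reference
    table = defaultdict(list)
    for x, y, f_deg in boundary:
        r = math.hypot(xc - x, yc - y)     # distance to the reference point
        a = math.atan2(yc - y, xc - x)     # direction to the reference point
        bin_index = int((f_deg % 360.0) / (360.0 / n_bins)) % n_bins
        table[bin_index].append((r, a))
    return table
```

Several boundary points that share an orientation bin simply append to the same row, which is why a row can hold more than one (r, a) pair.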
Generalized Hough Transform
Find object center (xc, yc) given edges (xi, yi, fi)
Create accumulator array A(xc, yc)
Initialize: A(xc, yc) = 0 ∀ (xc, yc)
For each edge point (xi, yi, fi)
    For each entry (r_k^i, a_k^i) in table, compute:
        xc = xi + r_k^i cos a_k^i
        yc = yi + r_k^i sin a_k^i
        Increment accumulator: A(xc, yc) = A(xc, yc) + 1
Find local maxima in A(xc, yc)
• Assumption: translation is the only transformation here, i.e., orientation and
scale are fixed
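The voting loop above can be sketched as follows, under the same translation-only assumption. The R-table is taken to be a dictionary from orientation bin to a list of (r, a) pairs; that layout, the name `ght_vote`, and the integer rounding of the center are illustrative assumptions.

```python
import math
from collections import Counter

def ght_vote(edges, r_table, n_bins=36):
    """edges: iterable of (x, y, f_deg); r_table: dict bin -> list of (r, a).

    Returns a Counter over candidate centers (xc, yc).
    """
    acc = Counter()
    for x, y, f_deg in edges:
        bin_index = int((f_deg % 360.0) / (360.0 / n_bins)) % n_bins
        # Every stored displacement for this orientation casts one vote.
        for r, a in r_table.get(bin_index, []):
            xc = int(round(x + r * math.cos(a)))
            yc = int(round(y + r * math.sin(a)))
            acc[(xc, yc)] += 1
    return acc

# Hypothetical model: four boundary points at distance 2 from the reference.
example_table = {
    18: [(2.0, 0.0)],            # left point, reference lies to the right
    0: [(2.0, math.pi)],         # right point, reference lies to the left
    9: [(2.0, -math.pi / 2)],    # upper point, reference lies below
    27: [(2.0, math.pi / 2)],    # lower point, reference lies above
}
```

Voting with the same shape translated to (10, 7) should concentrate all four votes on that cell.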
Voting schemes
• Let each feature vote for all the models
that are compatible with it
• Hopefully the noise features will not vote
consistently for any single model
• Missing data doesn’t matter as long as
there are enough features remaining to
agree on a good model
Application in recognition
• Instead of indexing displacements by gradient orientation,
index by “visual codeword”
[Figure: training image showing a visual codeword with its displacement vectors]
Combined Object Categorization and Segmentation with an Implicit Shape Model_ECCV04
Object Detection Using a Max-Margin Hough Transform_CVPR09
[Figure: test image]
Implicit shape models: Training
1. Build codebook of patches around
extracted interest points using clustering
(more on this later in the course)
2. Map the patch around each interest point
to closest codebook entry
3. For each codebook entry, store all
positions it was found, relative to object
center
Implicit shape models: Testing
1. Given test image, extract patches, match to
codebook entry
2. Cast votes for possible positions of object
center
3. Search for maxima in voting space
4. Extract weighted segmentation mask based on
stored masks for the codebook occurrences
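A rough sketch of the training and voting steps, leaving out interest-point extraction, clustering, and the segmentation masks. Everything here is a simplification for illustration: descriptors are plain tuples, the codebook is given, matching is nearest neighbor in squared Euclidean distance, and each vote is weighted by 1 / (number of stored occurrences), which is an assumed weighting rather than the one from the cited papers.

```python
from collections import defaultdict

def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda k: sum((d - c) ** 2
                                 for d, c in zip(descriptor, codebook[k])))

def train_ism(training_patches, object_center, codebook):
    """training_patches: list of (descriptor, (x, y)).

    Returns a dict: codeword index -> list of displacements to the center.
    """
    cx, cy = object_center
    occurrences = defaultdict(list)
    for descriptor, (x, y) in training_patches:
        k = nearest_codeword(descriptor, codebook)
        occurrences[k].append((cx - x, cy - y))
    return occurrences

def vote_ism(test_patches, codebook, occurrences):
    """Return a list of weighted votes (x, y, weight) for the object center."""
    votes = []
    for descriptor, (x, y) in test_patches:
        k = nearest_codeword(descriptor, codebook)
        stored = occurrences.get(k, [])
        for dx, dy in stored:
            votes.append((x + dx, y + dy, 1.0 / len(stored)))
    return votes
```

The votes are kept as a list of continuous positions rather than binned into a grid, so a separate step is needed to find the maxima.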
Implicit shape models: Details
• Supervised training
– Need reference location and segmentation mask for
each training car
• Voting space is continuous, not discrete
– Clustering algorithm needed to find maxima
• How about dealing with scale changes?
– Option 1: search a range of scales, as in Hough
transform for circles
– Option 2: use scale-covariant interest points
• Verification stage is very important
– Once we have a location hypothesis, we can overlay a
more detailed template over the image and compare
pixel-by-pixel, transfer segmentation masks, etc.
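Because the votes live in a continuous space, maxima have to be found by a mode-seeking procedure. Mean shift is one common choice; the sketch below uses a flat kernel, and the name `mean_shift_mode`, the bandwidth, and the stopping rule are assumptions for illustration.

```python
import math

def mean_shift_mode(votes, start, bandwidth=2.0, n_iter=50, tol=1e-6):
    """votes: list of (x, y, weight). Climb from `start` to a local mode.

    Each step moves to the weighted mean of the votes within `bandwidth`.
    """
    x, y = start
    for _ in range(n_iter):
        sw = sx = sy = 0.0
        for vx, vy, w in votes:
            if math.hypot(vx - x, vy - y) <= bandwidth:
                sw += w
                sx += w * vx
                sy += w * vy
        if sw == 0.0:
            break                          # no votes nearby, stay put
        nx, ny = sx / sw, sy / sw
        if math.hypot(nx - x, ny - y) < tol:
            return nx, ny                  # converged
        x, y = nx, ny
    return x, y
```

In practice the procedure would be started from several seeds (for example, from each vote), and the distinct converged points kept as location hypotheses.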
Hough transform: Discussion
• Pros
– Can deal with non-locality and occlusion
– Can detect multiple instances of a model
– Some robustness to noise: noise points unlikely to
contribute consistently to any single bin
• Cons
– Complexity of search time increases exponentially with
the number of model parameters
– Non-target shapes can produce spurious peaks in
parameter space
– It’s hard to pick a good grid size
• Hough transform vs. RANSAC vs. Geometric
hashing
On Geometric Hashing and the generalized hough transform_tsmc94
Detection of multiple object instances
Olga Barinova (Graphics&Media Lab, Moscow State University)
Victor Lempitsky (Yandex company, Moscow)
Pushmeet Kohli (Machine Learning and Perception, Microsoft Research Cambridge)
Visual Geometry Group, University of Oxford – postdoc
• Slides from CVPR 2010 [zip]
• Talk at CVPR 2010 [link]
• C++ code for pedestrian detection: original Visual Studio 2005 solution, or Linux port by Dr. Rodrigo Benenson
• C++ code for line detection: the latest version, which is much faster and more accurate
Detection of multiple object instances using Hough transform_cvpr10
Major flaw of HT
• Lacks a consistent probabilistic model
– Does not allow hypotheses to explain away the voting
elements
• Maximum in Hough image corresponds to a correctly detected
object
• The voting elements that were generated by this object also
cast votes for other hypotheses
• The strength of those spurious votes is not inhibited, which produces pseudo-maxima
• Various non-maxima suppression (NMS) heuristics have to be
used to localize peaks in the Hough image, which involve
specification and tuning of several parameters:
– sweep-plane approach (Real-time line detection through an improved
Hough transform voting scheme_pr08)
– …
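To make the parameter-tuning issue concrete, here is a sketch of one simple NMS heuristic: greedy peak picking with a vote threshold and a suppression radius. It is not the sweep-plane approach cited above; the name `greedy_nms` and both parameters are assumptions for illustration.

```python
def greedy_nms(accumulator, min_votes, radius):
    """accumulator: dict mapping (x, y) -> votes.

    Returns peaks as (x, y, votes), strongest first. A cell is kept only if
    it has at least `min_votes` and lies farther than `radius` from every
    peak already accepted.
    """
    peaks = []
    for (x, y), v in sorted(accumulator.items(), key=lambda kv: -kv[1]):
        if v < min_votes:
            break                          # all remaining cells are weaker
        if all((x - px) ** 2 + (y - py) ** 2 > radius ** 2
               for px, py, _ in peaks):
            peaks.append((x, y, v))
    return peaks
```

Both parameters trade off against each other: a radius that is too small lets spurious neighbors of a true peak through, while one that is too large merges two nearby objects into a single detection.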