Histograms of Oriented Gradients for Human Detection

Navneet Dalal and Bill Triggs
French National Institute for Research in Computer Science and Control (INRIA)
CVPR 05

peopledetect.cpp




Challenge: highly variable appearance and a wide range of poses
Histograms of Oriented Gradients (HOG) are feature
descriptors used in computer vision and image processing for
object detection.
Basic idea: local object appearance and shape can be
characterized rather well by the distribution of local intensity
gradients or edge directions.
Similar to edge orientation histograms [4,5], SIFT
descriptors [12] and shape contexts [1]
Detection window: 64x128 pixels

INRIA negative images (64x128 samples)
Person / non-person classification
http://quyanyun.com/Files/Viso/%E7%AC%AC%E5%9B%9B%E8%AE%B2Dalal-phd-slides.pdf

Color / gamma normalization
o Grayscale, RGB and LAB color spaces, optionally with power-law
(gamma) equalization
o Effect on performance is modest
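
The power-law (gamma) step can be sketched as follows. This is only an illustration: the function name `power_law`, the exponent 0.5 (square-root compression) and the 0-255 value range are assumptions chosen for the example, not settings taken from the slides.

```python
def power_law(image, gamma=0.5, max_value=255.0):
    """Apply power-law (gamma) compression to every pixel.

    `image` is a list of rows of intensities in [0, max_value].
    gamma=0.5 (square root) is an illustrative choice.
    """
    return [[max_value * (p / max_value) ** gamma for p in row] for row in image]

# Dark values are lifted, the extremes stay fixed.
compressed = power_law([[0.0, 63.75, 255.0]])
```

For a color image the same function would be applied to each channel separately.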

Gradient Computation
o 1-D point derivatives: uncentred [-1, 1], centred [-1, 0, 1]
and cubic-corrected [1, -8, 0, 8, -1]
o 3*3 Sobel masks
o 2*2 diagonal ones
o Gaussian smoothing with σ
o 1-D centred [-1, 0, 1] masks at σ = 0 work best
o The simplest scheme turns out to be the best
Figure: DET (Detection Error Tradeoff) curves
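
A minimal sketch of the simplest scheme, the centred [-1, 0, 1] mask applied along x and y with no smoothing. The function name `gradients` and the border handling (replicating the edge pixel) are assumptions of this example.

```python
import math

def gradients(image):
    """Centred [-1, 0, 1] derivative along x and y, with no smoothing.

    `image` is a list of rows of grayscale values. Borders replicate the
    edge pixel (an assumption of this sketch). Returns two grids:
    gradient magnitude and unsigned orientation in degrees, [0, 180).
    """
    h, w = len(image), len(image[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = image[y][min(x + 1, w - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, h - 1)][x] - image[max(y - 1, 0)][x]
            mag[y][x] = math.hypot(gx, gy)
            # Fold the signed angle into the unsigned range [0, 180).
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, ang

# A vertical step edge: intensity jumps from 0 to 10 between columns 1 and 2.
image = [[0, 0, 10, 10]] * 4
mag, ang = gradients(image)
```

Here the two columns next to the step get magnitude 10 and orientation 0 degrees, while the flat columns get magnitude 0.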

Creating the orientation histograms
o Each pixel casts a weighted vote for an edge orientation histogram
channel, and the votes are accumulated over cells.
o Unsigned gradients used in conjunction with 9 histogram channels
performed best in the authors' human detection experiments
o Weight: gradient magnitude itself, or some function of the magnitude
(square, square root, clipped)
o Gradient magnitude itself generally produces the best results.
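
The voting step for a single cell can be sketched as below. This is a simplified illustration: each pixel votes for its nearest bin only (no interpolation between neighbouring bins), and the function name `cell_histogram` is an assumption of the example.

```python
def cell_histogram(magnitudes, orientations, n_bins=9):
    """Accumulate magnitude-weighted votes into unsigned orientation bins.

    `magnitudes` and `orientations` are flat lists covering the pixels of
    one cell; orientations are in degrees within [0, 180).
    """
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for m, theta in zip(magnitudes, orientations):
        b = int(theta // bin_width) % n_bins  # hard assignment to one bin
        hist[b] += m                          # weight is the magnitude itself
    return hist

# Three pixels: two near 0 degrees, one near 90 degrees.
hist = cell_histogram([2.0, 1.0, 3.0], [5.0, 95.0, 10.0])
```

With 9 bins each bin spans 20 degrees, so the pixels at 5 and 10 degrees share bin 0 and the pixel at 95 degrees lands in bin 4.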

Normalization and descriptor blocks
o Gradient strengths vary widely owing to local variations in illumination
and foreground-background contrast, so local contrast normalization is needed
o Group cells into larger, spatially connected blocks and normalize each
block separately
o Two main block geometries : rectangular R-HOG blocks and circular
C-HOG blocks.
o R-HOG: 3 parameters
• # of cells per block
• # of pixels per cell
• # of channels per cell histogram
• Optimal: 3x3 cell blocks of 6x6 pixel cells with 9 channels.
• Gaussian spatial weight

Normalization and descriptor blocks
o C-HOG: 4 parameters
• # of angular bins
• # of radial bins
• The radius of the center bin
• The expansion factor for the radius of additional radial bins
• Optimal: 4 angular bins, 2 radial bins, center bin radius of 4 pixels,
expansion factor of 2; a Gaussian spatial weight is not needed
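o Block normalization schemes ($v$ is the unnormalized block descriptor
vector, $\epsilon$ a small constant)
• L2-norm: $v \to v / \sqrt{\|v\|_2^2 + \epsilon^2}$
• L2-Hys: L2-norm, clip (limit $v \le 0.2$)
and renormalize
• L1-norm: $v \to v / (\|v\|_1 + \epsilon)$
• L1-sqrt: $v \to \sqrt{v / (\|v\|_1 + \epsilon)}$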
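
The four schemes can be sketched as follows, assuming the usual definitions: L2-norm divides by the regularised Euclidean norm, L1-norm divides by the regularised sum of absolute values, L1-sqrt takes the square root of the L1-normalised vector, and L2-Hys clips the L2-normalised vector at 0.2 and renormalises. The value of `EPS` and the function names are assumptions of this example.

```python
import math

EPS = 1e-3  # small regularising constant; this value is an assumption

def l2_norm(v, eps=EPS):
    n = math.sqrt(sum(x * x for x in v) + eps * eps)
    return [x / n for x in v]

def l2_hys(v, clip=0.2, eps=EPS):
    # L2-normalise, limit every component to at most `clip`, renormalise.
    clipped = [min(x, clip) for x in l2_norm(v, eps)]
    return l2_norm(clipped, eps)

def l1_norm(v, eps=EPS):
    n = sum(abs(x) for x in v) + eps
    return [x / n for x in v]

def l1_sqrt(v, eps=EPS):
    # Histogram entries are non-negative, so the square root is defined.
    return [math.sqrt(x) for x in l1_norm(v, eps)]

block = [3.0, 4.0, 0.0, 0.0]  # toy block vector
```

On the toy vector, `l2_norm(block)` gives roughly 0.6 and 0.8 for the two non-zero entries, while `l2_hys(block)` clips both to 0.2 before renormalising, so they end up equal. This shows how clipping limits the influence of a few very strong gradients.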


R-HOG and C-HOG give near-perfect separation on the MIT database
They produce 1-2 orders of magnitude fewer false positives than other descriptors

Feed the descriptors into a recognition system: SVM classifier
o [-1, 0, 1] gradient filter with no smoothing
o 8*8 pixel cell size => 8*16 cells
o Histograms of edge orientations: 9 unsigned bins =>
9-dimensional vector per cell
o R-HOG, 2*2 block size => 36-dimensional vector per block
o Gaussian spatial window with σ = 8
o L2-Hys normalization, block overlap = 1/2
o 7*15 blocks => descriptor: 3780-dimensional vector
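
The dimension counts above follow from simple arithmetic, sketched here for a 64x128 window. The function name `hog_descriptor_length` and its parameter names are assumptions of this example; a block stride of one cell corresponds to the 1/2 overlap of 2*2 blocks.

```python
def hog_descriptor_length(win_w=64, win_h=128, cell=8, block=2,
                          stride_cells=1, bins=9):
    """Return cell counts, block counts, per-block length and total length."""
    cells_x, cells_y = win_w // cell, win_h // cell
    # Number of block positions when sliding by `stride_cells` cells.
    blocks_x = (cells_x - block) // stride_cells + 1
    blocks_y = (cells_y - block) // stride_cells + 1
    per_block = block * block * bins
    total = blocks_x * blocks_y * per_block
    return cells_x, cells_y, blocks_x, blocks_y, per_block, total

print(hog_descriptor_length())  # (8, 16, 7, 15, 36, 3780)
```

So 7*15 = 105 blocks of 36 values each give the 3780-dimensional descriptor.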

We show experimentally that dense grids of Histograms of
Oriented Gradient (HOG) descriptors significantly outperform
existing feature sets for human detection.

We study the influence of each stage of the computation on
performance.