Multiple-shot Person Re-identification by HPE signature

Loris Bazzani*, Marco Cristani*†, Alessandro Perina*,
Michela Farenzena*, Vittorio Murino*†
*Computer Science Department, University of Verona, Italy
†Istituto Italiano di Tecnologia (IIT), Genova, Italy
This research is funded by the EU project FP7 SAMURAI, grant FP7-SEC-2007-01 No. 217899
Analysis of the problem (1)
• Person Re-identification: Recognizing an individual in
diverse locations over different (non-)overlapping camera
views
[Figure: example frames at T = 1, T = 23, T = 145, and T = 222, grouped under "Different cameras" and "Same camera"]
Analysis of the problem (2)
• We focus on the problem with non-overlapping cameras
• Problems in real scenarios:
– Very low resolution
– Severe occlusions
– Illumination variations
– Pedestrians with very similar clothes
– Pose and viewpoint changes
– No geometry of the environment
• Solution:
- Histogram Plus Epitome (HPE) descriptor, and
- Multiple-shot approach
Outline
• Overview of the proposed method
• Pre-processing: Background Subtraction
• “Images selection” for Multiple-shot
• HPE descriptor
- Global descriptor
- Local descriptors
• HPEs’ Matching
• Results
• Conclusions
Overview of the proposed method
• Employing global and local appearance-based features
• Exploiting temporal consistency to make the descriptor robust
Background Subtraction
• We employ a novel generative model: STEL [Jojic et al. 2009]
• Capture the structure of an image class as a mixture of component segmentations
• Isolate meaningful parts that exhibit tight feature distributions
[Figure: Learned Mixture Components]
“Images selection” for Multiple-shot
• Objective: discard redundant information and images with occlusions
• Gaussian Mixture Model clustering [Figueiredo and Jain 2002] of HSV histograms
• Automatic model selection employing the Bayesian Information Criterion [Figueiredo and Jain 2002]
• Discard the clusters with a low number of instances
• Keep a random instance for each cluster
[Figure: examples of discarded images]
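The last two selection steps can be sketched as follows. This is a minimal illustration, assuming cluster labels have already been produced by the GMM clustering of HSV histograms; the function name `select_key_images`, the `min_size` threshold, and the toy labels are hypothetical and do not come from the slides.

```python
import random
from collections import defaultdict


def select_key_images(labels, min_size=3, seed=0):
    """Return one randomly chosen image index per sufficiently large cluster.

    labels[i] is the cluster id of image i (assumed to come from a prior
    clustering step). min_size is a hypothetical threshold below which a
    cluster is discarded.
    """
    rng = random.Random(seed)

    # Group image indices by cluster id.
    clusters = defaultdict(list)
    for index, label in enumerate(labels):
        clusters[label].append(index)

    selected = []
    for label in sorted(clusters):
        members = clusters[label]
        if len(members) < min_size:
            continue  # small clusters are dropped (e.g. occluded frames)
        selected.append(rng.choice(members))  # one random instance per cluster
    return selected


# Toy labels for ten frames of one tracked person (illustrative only).
labels = [0, 0, 0, 0, 1, 2, 2, 2, 1, 0]
keys = select_key_images(labels, min_size=3)
```

With these toy labels, cluster 1 has only two members and is dropped, so one frame is kept from cluster 0 and one from cluster 2.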
HPE descriptor: Global feature
• Capture chromatic global information
• 36-dimensional HSV histogram (H=16, S=16, V=4)
• Average the histograms of the multiple instances
• Robust to illumination and pose variations, keeping the predominant chromatic information only
[Figure annotation: "Caused by illumination changes"]
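A minimal sketch of the global feature, assuming the 36 dimensions are the concatenation of separate H, S, and V histograms (16 + 16 + 4 bins) and that the input is a list of foreground RGB pixels with components in [0, 1]. The function names and the normalization to unit sum are assumptions made for illustration.

```python
import colorsys

BINS = (16, 16, 4)  # H, S, V bin counts, as stated on the slide


def hsv_histogram(pixels):
    """Concatenated H/S/V histogram (36 bins), normalized to sum to 1.

    pixels: iterable of (r, g, b) tuples with components in [0, 1].
    """
    hist = [0.0] * sum(BINS)
    count = 0
    for r, g, b in pixels:
        offset = 0
        for value, bins in zip(colorsys.rgb_to_hsv(r, g, b), BINS):
            index = min(int(value * bins), bins - 1)  # clamp value == 1.0
            hist[offset + index] += 1.0
            offset += bins
        count += 1
    total = 3.0 * count  # each pixel votes once per channel
    return [h / total for h in hist] if count else hist


def average_histograms(histograms):
    """Element-wise mean of the histograms of multiple instances."""
    n = len(histograms)
    return [sum(column) / n for column in zip(*histograms)]


# Two toy "images" of one pixel each (illustrative only).
red = hsv_histogram([(1.0, 0.0, 0.0)])
blue = hsv_histogram([(0.0, 0.0, 1.0)])
signature = average_histograms([red, blue])
```

Averaging keeps the bins shared by all instances (here saturation and value) at full weight, while bins that differ between instances (here hue) are attenuated.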
HPE descriptor: Local feature (1)
• Epitome [Jojic et al. 2003]: a generative model that analyzes the presence of recurrent, structured local patterns
[Figure: Generic Epitome and Local Epitome]
HPE descriptor: Local feature (2)
• Generic Epitome:
- 36-dimensional HSV histogram of the Epitome
• Local Epitome:
- Keep the patches with a high probability of representing several ingredient patches, i.e., the probability that a patch in the epitome with (i, j) as its upper-left corner represents several ingredient patches
- Discard the patches with low entropy
- Extract a 36-dimensional HSV histogram of the surviving patches
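The entropy filter can be sketched as below. This is an illustration only, assuming each patch is a flat list of quantized pixel values and that entropy means the Shannon entropy of the value distribution inside the patch; the `min_entropy` threshold and function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    total = len(values)
    counts = Counter(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def keep_informative_patches(patches, min_entropy=1.0):
    """Discard patches whose entropy falls below a hypothetical threshold."""
    return [p for p in patches if shannon_entropy(p) >= min_entropy]


# Toy 4x4 patches flattened to 16 quantized values (illustrative only).
flat_patch = [5] * 16               # uniform region, carries no texture
textured_patch = [0, 1, 2, 3] * 4   # four equally frequent values
kept = keep_informative_patches([flat_patch, textured_patch])
```

A uniform patch has zero entropy and is removed, while a patch with several equally frequent values passes the filter.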
HPEs’ Matching
• Re-identification: associating each element in the probe set B with the corresponding element in the gallery set A
• Minimize the following distance:
[Equation not preserved]
where [symbol not preserved] is the Bhattacharyya distance and [remainder not preserved]
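A minimal sketch of nearest-gallery matching with a Bhattacharyya distance. It assumes a single normalized histogram per person rather than the full HPE distance, and it uses the common form sqrt(1 - sum_i sqrt(p_i q_i)); the slides may use a different variant. The function names and toy histograms are hypothetical.

```python
import math


def bhattacharyya_distance(p, q):
    """sqrt(1 - BC), with BC = sum_i sqrt(p_i * q_i).

    Assumes p and q are histograms normalized to sum to 1.
    """
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - coefficient))  # guard against rounding


def match_probe_to_gallery(probe, gallery):
    """Return the gallery identity whose histogram is closest to the probe."""
    return min(
        gallery,
        key=lambda identity: bhattacharyya_distance(probe, gallery[identity]),
    )


# Toy 3-bin histograms (illustrative only).
gallery = {"A": [0.7, 0.2, 0.1], "B": [0.1, 0.2, 0.7]}
probe = [0.6, 0.3, 0.1]
best = match_probe_to_gallery(probe, gallery)
```

Each probe is assigned independently to its closest gallery element; a one-to-one assignment between the two sets would need an additional constraint that this sketch does not impose.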
Results (1)
• iLIDS dataset:
- Multiple images of 119 pedestrians, 128x64 pixels
- Comparison with the context-based method [Zheng et al. 2009]
- Cross-validation: SvsS 10 trials, MvsS/MvsM 100 trials
Results (2)
• ETHZ dataset:
- Three datasets of 83, 35, and 28 pedestrians, 64x32 pixels
- Comparison with the Partial Least Squares (PLS) method [Schwartz and Davis 2009]
- Cross-validation: same settings as for iLIDS
Results (3)
• How many images do we need to perform a “good” person re-identification? N = 5 seems to be the best trade-off (N = number of images for the multi-shot approach)
Conclusions
• We proposed a novel descriptor for the person re-identification problem: the HPE descriptor
• The descriptor is robust to low resolution, occlusions, illumination variations, pedestrians with very similar clothes, and pose changes
• It is based on the accumulation of images to gain robustness
• The person re-identification problem is still far from being solved
• The results suggest that further improvements can be achieved
References
[Jojic et al. 2009] N. Jojic, A. Perina, M. Cristani, V. Murino, and B. Frey, “Stel component analysis: Modeling spatial correlations in image class structure,” IEEE Conference on Computer Vision and Pattern Recognition, pp. 2044–2051, 2009.
[Figueiredo and Jain 2002] M. Figueiredo and A. Jain, “Unsupervised learning of finite mixture models,” IEEE Trans. PAMI, vol. 24, no. 3, pp. 381–396, 2002.
[Jojic et al. 2003] N. Jojic, B. J. Frey, and A. Kannan, “Epitomic analysis of appearance and shape,” in IEEE International Conference on Computer Vision. Washington, DC, USA: IEEE Computer Society, 2003, p. 34.
[Schwartz and Davis 2009] W. Schwartz and L. Davis, “Learning discriminative appearance-based models using partial least squares,” in XXII SIBGRAPI, 2009.
[Zheng et al. 2009] W. Zheng, S. Gong, and T. Xiang, “Associating groups of people,” in BMVC, 2009.