Project BeSafe – SfP 982480 – 4th Progress Report – April 2010
NATO PROGRAMME FOR SECURITY THROUGH SCIENCE
SCIENCE FOR PEACE
NATO Public Diplomacy Division, Bd. Leopold III, B-1110 Brussels, Belgium
fax +32 2 707 4232 : e-mail sfp.applications@hq.nato.int
Progress Report – APRIL 2010
Project SfP 982480 – BE SAFE
(Behavior lEarning in Surveilled Areas with Feature Extraction)
Project Director (PPD):
PROF. NAFTALI TISHBY, ISRAEL
Project Director (NPD):
PROF. RITA CUCCHIARA, ITALY
People involved in the report’s preparation:
Prof. Rita Cucchiara, Dr. Andrea Prati
Prof. Naftali Tishby
Date of completion: 22 April 2010
Table of Contents

List of abbreviations
Participants
Project background and objectives
Overview of the project
Technical Progress
   NPD – University of Modena and Reggio Emilia (UNIMORE)
   PPD – Hebrew University (HUJI)
Financial Status
   PPD Financial Status
   NPD Financial Status
Equipment Inventory Records
Criteria for success table
List of abbreviations
HUJI – The Hebrew University
UNIMORE – Università degli Studi di Modena e Reggio Emilia
MSS – Magal Security Systems, Ltd.
PPD – Partner Project Director
NPD – NATO Project Director
CV – Computer Vision
ML – Machine Learning
FOV – Field of View
OGM – Oscillatory Gait Model
HMM – Hidden Markov Model
SVM – Support Vector Machine
PTZ – Pan-Tilt-Zoom
Participants
(a) Project Director (PPD) (Consult “Definitions”)
Surname/First name/Title: TISHBY / NAFTALI / PROF.
Job Title, Institute and Address: Professor, School of Engineering and Computer Science, The Hebrew University, Ross Building, Givat Ram Campus, 91904 Jerusalem, Israel
Country: ISRAEL
Telephone, Fax and Email: Tel: +972-2-65-84167; Fax: +972-2-658-6440; Email: tishby@cs.huji.ac.il

(b) End-user(s) (Consult “Definitions”)
Surname/First name/Title: DANK / ZVI
Job Title, Company/Organisation and Address: V.P. Engineering, Magal Security Systems, Ltd., P.O. Box 70, Industrial Zone, 56000, Yahud
Country: ISRAEL
Telephone, Fax and Email: Tel: +972-3-5391444; Fax: +972-3-5366245; Email: mglzvi@trendline.co.il

(c) Project Director (NPD) (Consult “Definitions”)
Surname/First name/Title: CUCCHIARA / RITA / PROF.
Job Title, Institute and Address: Full professor, Dipartimento di Ingegneria dell’Informazione, University of Modena and Reggio Emilia, Via Vignolese, 905, 41100 Modena
Country: ITALY
Telephone, Fax and E-mail: Tel: +39 059 2056136; Fax: +39 059 2056129; Email: rita.cucchiara@unimore.it
Project background and objectives
This project is unique since it aims at combining two main areas of research, Computer Vision and
Machine Learning, in an application of automatic surveillance for people detection and tracking
and abnormal behavior recognition. Computer Vision (CV) and Machine Learning (ML) have been
used jointly for many different applications but either using ML as a tool for computer vision
applications or using CV as a case study to proof theoretical advances in ML.
The project aims at exploring how visual features can be automatically extracted from video using
computer vision techniques and exploited by a classifier (generated by machine learning) to detect
and identify suspicious people behavior in public places in real time. In this sense, CV and ML are
jointly developed and studied to provide a better mix of innovative techniques.
Justification of the proposed project is based on two issues of major concern to the state of Israel:
(1) the need for intelligent surveillance in public and commercial areas that are susceptible to
terrorist attacks and (2) lack of automatic and intelligent decision support in existing surveillance
systems.
More specifically, the objectives of the project are: (1) to achieve a better understanding of which
visual features can be used for (1.a) analyzing people activity and (1.b) characterizing people
shape; (2) to suitably adapt ML techniques such as HMM, SVM or methods for “novelty detection”
in order to infer the behavior of the people from the extracted visual features and possibly
classify it as normal or abnormal; (3) to develop a first simple prototype in a specific scenario that
can be considered a threat to security.
The machine learning research is carried out at the Hebrew University’s machine learning lab
utilizing its long experience in temporal pattern recognition and computational learning methods.
Following the meeting in June 2007 in Jerusalem, we decided to focus, in the time available for the
project, on one particular behavior which is both well defined and threatening: people who leave
objects behind them (such as luggage in airports). The machine learning component is based on
the following phases: (1) constructing a generative statistical model of human gait on the basis of
the features provided by the CV group. Such a model is an adaptation of an oscillatory dynamic
model we developed in the past (Singer and Tishby 1994), where different points on the walking
person are assumed to have a drifted oscillatory motion with characteristic frequency and relative
phases; (2) this basic Oscillatory Gait Model (OGM) is then plugged in as the output of a state of an
HMM, yielding a complete statistical model of regular gait; (3) detecting deviations (irregularities)
in the relative phases and amplitudes of the OGM to capture irregular behavior, e.g. halting,
bending, leaving objects, etc. The output of such a statistical model can be classified using
likelihood ratio tests or standard classifiers such as SVM to improve confidence. (4) We also carried
out work on detecting statistical irregularities in multivariate correlated data, as another component
of the project.
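To make the idea behind the OGM concrete, the following minimal sketch (with illustrative function names and parameter values, not taken from the project code) simulates one tracked body point as a linear drift plus a sinusoid with a characteristic frequency and phase, i.e. the kind of signal the gait model is built on.

import numpy as np

def simulate_oscillatory_point(n_frames, freq_hz=1.8, phase=0.0,
                               drift=(1.2, 0.0), amp=(5.0, 12.0),
                               fps=25.0, noise_std=0.5, seed=0):
    """Simulate the 2D position of one body point as drifted oscillatory
    motion: a linear drift plus a sinusoid with a characteristic frequency
    and relative phase (an illustrative stand-in for one OGM component)."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_frames) / fps
    x = drift[0] * t + amp[0] * np.sin(2 * np.pi * freq_hz * t + phase)
    y = drift[1] * t + amp[1] * np.sin(2 * np.pi * freq_hz * t + phase)
    return np.stack([x, y], axis=1) + rng.normal(0, noise_std, (n_frames, 2))

# Regular gait would be modelled by one such oscillator per tracked point;
# deviations in amplitude or phase (halting, bending) lower the model likelihood.
track = simulate_oscillatory_point(200)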
Overview of the project
The work plan covers two years, with activities scheduled by quarter (months 1-3, 4-6, 7-9 and 10-12 of each year):

1. Hybrid and distributed multi-camera people detection and tracking
   - S1.1 People detection and tracking in multi-camera systems
   - S1.2 Camera coordination primitives for static, hybrid and PTZ cameras
2. Feature extraction for people surveillance
   - S2.1 Feature extraction for people activity detection
   - S2.2 Feature extraction for people shape detection
3. Data preparation and symbolic coding
   - S3.1 Data preparation and understanding, per-sensor symbolic coding and state modeling for people activity features
   - S3.2 Data selection, cleaning, formatting, and cases generation for people activity features
   - S3.3 Data preparation and understanding, per-sensor symbolic coding and state modeling for people shape features
   - S3.4 Data selection, cleaning, formatting, and cases generation for people shape features
4. Designing a dynamic gait model based on coupled oscillatory motion
   - S4.1 Using the people activity features to design a statistical classifier of regular gait
   - S4.2 Design a state model/kernel for the people shape features
   - S4.3 Plug in the Gait-Oscillatory Model (GOM) as a state in an HMM for a complete regular gait statistical model
   - S4.4 Use the likelihood of the model for robust classification of regular motion/behaviour
5. Framework for Abnormal Behavior monitoring
   - S5.1 Analysis of requirements and constraints
   - S5.2 Video data collection and annotation
   - S5.3 Testing and refinement of integrated framework

Milestones are the 1st, 2nd and 3rd Progress Reports and the Final Report; in the original Gantt chart each task is marked as completed, as planned, or delayed. The main deliverables associated with the tasks are:
- Research report on testing of people detection and tracking
- Prototype for collection of visual people activity features
- Prototype for collection of visual people shape features
- Research report on symbolic coding for people shape features
- Research report on the end-user empirical tests
- Final report of developed algorithms and techniques for the kernels and SVM classification
- Research report on integrated framework testing
Technical Progress
NPD – University of Modena and Reggio Emilia (UNIMORE)
Description of the research (months 19-24)
The main objective of the UNIMORE unit in the project is to study which visual features can be used
for inferring abnormal people behaviors. These features come from two types of analysis: the
analysis of people activity and the analysis of people shape.
The research activities were mainly concentrated in the first year and in the first semester of the
second year. Consequently, the activities during these last six months were reduced.
In particular, UNIMORE has concentrated mainly on two research activities. The first was devoted
to further investigating the use of circular statistics for modelling people trajectories, an important
feature characterizing people activity. This research was linked to task 2.1 – Feature extraction for
people activity detection – which ended in the first year, but UNIMORE devoted some more effort
to this interesting and promising topic.
The second research activity in this semester further addressed task 5.2 – Video data collection and
annotation – by improving the functionalities of the ViSOR repository already described in previous
reports.
Finally, UNIMORE collaborated with HUJI to develop an integrated framework which combines the
tracking algorithms (UNIMORE) with algebraic graph-theoretical methods (HUJI). This effort is part
of tasks 4.4 and 5.3, described in the HUJI progress section.
Task S2.1 – Feature extraction for people activity detection
Recently, one of the most addressed topics in video surveillance research is the extraction and
analysis of features for behavior understanding. Among the possible features, trajectories
represent a rich source of information which can be robustly extracted from single or multiple
fixed cameras (Calderara, Cucchiara, & Prati, Bayesian-competitive consistent labeling for people
surveillance, 2008). Morris and Trivedi (Morris & Trivedi, 2008) recently surveyed state-of-the-art
techniques for modeling, comparing and classifying trajectories for video surveillance purposes.
The people trajectory projected on the ground plane is a very compact representation of patterns
of movement, normally characterized by a sequence of 2D coordinates {(x1, y1), ..., (xn, yn)}
and often associated with the motion status, e.g. the instantaneous velocity or acceleration.
However, in many cases, instead of analyzing the spatial relationships of single trajectory points,
the focus should be given to a more global descriptor of the trajectory, i.e. the trajectory shape.
The shape is independent of the starting point and can constitute a very effective descriptor of the
movement and the action. In the surveillance of large public spaces, the trajectory shape can
discriminate between different behaviors, such as people moving on a straight path or people
moving in a circle. As an example, Fig. 1 shows a sketch of a real scenario. The bird's-eye view
reconstruction based on three overlapping cameras is reported, and the collected trajectories are
superimposed with different colors corresponding to different trajectory classes. Observing the
trajectory shapes only, regardless of their location, we can infer that a group of people goes
straight on, passing through the monitored area, while other people arrive and move toward the
upper part of the scene. Finally, some people stay close to the benches. To cope with this evident
diversity of behavior, we propose to model
trajectory shapes by means of a representation based on a sequence of angles and we focus the
attention on statistical pattern recognition techniques for angular sequences.
Since angles are periodic variables, the classical approach based on Gaussian distributions is
unsuitable and another distribution should be adopted. By exploiting circular statistics, we
proposed in previous reports and papers the adoption of a new statistical representation
based on a mixture of von Mises (MovM) distributions.
Figure 1
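As an illustration of this representation, the following sketch (assuming 2D ground-plane coordinates and using SciPy's von Mises density rather than the project's own estimator) converts a trajectory into its sequence of direction angles and evaluates a mixture of von Mises densities on them.

import numpy as np
from scipy.stats import vonmises

def trajectory_to_angles(points):
    """Turn a sequence of ground-plane points into the sequence of
    direction angles between consecutive samples (the shape descriptor)."""
    diffs = np.diff(np.asarray(points, dtype=float), axis=0)
    return np.arctan2(diffs[:, 1], diffs[:, 0])          # angles in (-pi, pi]

def movm_pdf(theta, weights, locs, kappas):
    """Mixture of von Mises densities evaluated at the angles theta."""
    theta = np.atleast_1d(theta)
    comps = [w * vonmises.pdf(theta, kappa=k, loc=m)
             for w, m, k in zip(weights, locs, kappas)]
    return np.sum(comps, axis=0)

angles = trajectory_to_angles([(0, 0), (1, 0.1), (2, 0.3), (3, 0.2)])
density = movm_pdf(angles, weights=[0.6, 0.4], locs=[0.0, np.pi / 2], kappas=[4.0, 2.0])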
However, pure shape is not sufficiently discriminative in surveillance scenarios (e.g., the same path
covered at a walk or at a run has a different meaning in terms of behavior). In the further
refinement carried out during this semester we therefore studied a way to add the speed to the
shape description, to provide a more complete analysis of the trajectory. The introduction of the
speed, which is not periodic, requires accounting for the different nature of these features: the
angle θ is directional, while the speed v is linear. Using a statistical model, the resulting bivariate
joint probability p(θ, v) can easily be modelled as the product p(θ)·p(v) if and only if the two
variables turn out to be independent for the considered application. If they are not, the joint
probability must be modelled with a bivariate pdf that combines a directional component for θ and
a linear component for v, and the estimation of the covariance matrix of this bivariate joint pdf can
be quite challenging since the dependency between θ and v must be modelled properly. When a
directional or periodic variable is combined with a linear one, the term semi-directional is often used.
The use of a Gaussian pdf for the linear variable v is straightforward, while the choice of the pdf
for θ is less obvious. One of the most used (due to the properties it shares with the Gaussian) is
the von Mises (vM) distribution. However, in the case of semi-directional statistics, the use of a
wrapped Gaussian distribution (Bahlmann, 2006) (Mardia, 1972) is preferable because, due to its
closeness to its linear counterpart, it is possible to adopt a linear approximation of the variance
parameter even for circular variables. The linear variance approximation allows the employment
of the Gaussian maximum likelihood estimator to calculate, with feasible precision, the
covariance matrix in the case of jointly linear and periodic multivariate variables. The wrapped
Gaussian can be written as:
$$WG(\theta \mid \theta_0, \sigma) = \sum_{w=-\infty}^{+\infty} \mathcal{N}(\theta - 2\pi w \mid \theta_0, \sigma)$$
Nevertheless, parameter estimation in the case of the wrapped Gaussian is not easy. For this reason,
Bahlmann (Bahlmann, 2006) proposed to adopt a multivariate semi-directional distribution in
handwriting recognition by using an approximated wrapped Gaussian (AWG) pdf for the
directional variable (the tangent slope of a written segment) and a linear Gaussian for the linear
variable, thereby defining a semi-wrapped Gaussian distribution which we will refer to
hereinafter as AWLG (Approximated Wrapped and Linear Gaussian). Eventually, both directional
and linear data can be modelled with multi-modal distributions, for example using parametric
mixtures of the corresponding pdfs.
The expression of AWG is the following:
๐ด๐‘Š๐บ(๐œƒ|๐œƒ0 , ๐œŽ) =
1
√2๐œ‹๐œŽ
๐‘’
−
((๐œƒ−๐œƒ0 )mod 2๐œ‹)
2๐œŽ2
2
which can be extended to include also a linear variable as follows:
๐ด๐‘Š๐ฟ๐บ(๐‘‹|๐œ‡, Σ) =
1
1
(๐‘‹−๐œ‡)๐‘‡ Σ−1 (๐‘‹−๐œ‡)
√2๐œ‹|Σ|
๐‘’ −2
๐œƒ
๐œƒ
where ๐‘‹ = [ ] is the observation vector, ๐œ‡ = [ 0 ] is the mean vector, ๐‘‹ − ๐œ‡ =
๐‘ฃ0
๐‘ฃ
๐œŽ๐œƒ,๐œƒ ๐œŽ๐œƒ,๐‘ฃ
(๐œƒ − ๐œƒ0 ) mod 2๐œ‹
[
] the “difference” between them, Σ = [๐œŽ
๐œŽ๐‘ฃ,๐‘ฃ ] is the covariance matrix and
๐‘ฃ − ๐‘ฃ0
๐‘ฃ,๐œƒ
|Σ| its determinant.
Consequently, the mixture of AWLG (MoAWLG) can be defined as:
$$MoAWLG(X \mid \boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \sum_{k=1}^{K} \pi_k \, AWLG(X \mid \mu_k, \Sigma_k)$$
Figure 2. Plots of different circular pdfs, with θ0 = 0 and σ = 1.0 (corresponding to m = 1.54) or σ = 1.5 (m = 0.69).
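A direct transcription of the AWLG and MoAWLG formulas above could look as follows; the wrapping of the angular difference into (-π, π] and the bivariate Gaussian normalization constant are our reading of the definitions, not code from the project.

import numpy as np

def awlg_pdf(theta, v, theta0, v0, Sigma):
    """Approximated Wrapped and Linear Gaussian density for one observation
    X = [theta, v]: the angular difference is wrapped into (-pi, pi] and the
    usual bivariate Gaussian form is then applied."""
    d_theta = (theta - theta0 + np.pi) % (2 * np.pi) - np.pi   # wrapped angular difference
    d = np.array([d_theta, v - v0])
    Sigma = np.asarray(Sigma, dtype=float)
    quad = d @ np.linalg.solve(Sigma, d)
    return np.exp(-0.5 * quad) / (2 * np.pi * np.sqrt(np.linalg.det(Sigma)))

def moawlg_pdf(theta, v, weights, means, covs):
    """Mixture of AWLG components, following the MoAWLG definition above."""
    return sum(w * awlg_pdf(theta, v, m[0], m[1], S)
               for w, m, S in zip(weights, means, covs))

p = moawlg_pdf(0.1, 1.3, weights=[0.7, 0.3],
               means=[(0.0, 1.2), (np.pi, 0.4)],
               covs=[np.diag([0.2, 0.1]), np.diag([0.3, 0.05])])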
The results of this research exploit semi-directional statistics (specifically a mixture of AWLG) to
model and analyze people trajectory shapes, in order to classify path shapes and motion models.
The AWLG model turns out to be the more appropriate, since we measured the mutual information
to test the dependency between the directional and linear variables. Since exact mutual
information is hard to compute in the case of mixtures of pdfs, a variational approximation of it
has been derived. Moreover, an approach for comparing sequences of semi-directional data has been
derived: it exploits the global alignment of sequences of symbols with a distance based on the
Kullback-Leibler divergence. Finally, a complete system for the classification of people trajectories
is proposed, and experiments on both synthetic and real data are provided to demonstrate its
accuracy. Some hours of unconstrained acquisition of people walking around in an open space are
evaluated.
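A minimal sketch of such an alignment-based comparison is given below, assuming that each symbol is summarized by a univariate Gaussian and using an illustrative gap penalty; the actual cost definition used in the project papers may differ.

import numpy as np

def sym_kl_gauss(m1, s1, m2, s2):
    """Symmetric KL divergence between two univariate Gaussians N(m, s^2)."""
    kl12 = np.log(s2 / s1) + (s1**2 + (m1 - m2)**2) / (2 * s2**2) - 0.5
    kl21 = np.log(s1 / s2) + (s2**2 + (m1 - m2)**2) / (2 * s1**2) - 0.5
    return kl12 + kl21

def global_alignment_distance(seq_a, seq_b, gap=1.0):
    """Needleman-Wunsch-style global alignment of two symbol sequences,
    where each symbol is a (mean, std) pair and the substitution cost is
    the symmetric KL divergence; returns the minimal total cost."""
    n, m = len(seq_a), len(seq_b)
    D = np.zeros((n + 1, m + 1))
    D[:, 0] = np.arange(n + 1) * gap
    D[0, :] = np.arange(m + 1) * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = sym_kl_gauss(*seq_a[i - 1], *seq_b[j - 1])
            D[i, j] = min(D[i - 1, j - 1] + sub,    # substitution
                          D[i - 1, j] + gap,        # gap in seq_b
                          D[i, j - 1] + gap)        # gap in seq_a
    return D[n, m]

dist = global_alignment_distance([(0.0, 0.3), (1.5, 0.3)],
                                 [(0.1, 0.3), (1.4, 0.4), (1.5, 0.3)])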
In order to verify the accuracy of our approach, we performed extensive experiments with both
synthetic and real data. Two sets of synthetic trajectories (one with dependent and one with
independent data) were generated with a Matlab simulator in order to evaluate both solutions
(with dependent and with independent variables). We also evaluated the average mutual
information on real data. The average value for real data is high enough to indicate that some
correlation exists between the angles and the speed in the considered scenario. This is not true in
general, but heavily depends on the context of application and the collected data.
Table 1
The robustness of the proposed sequence-comparison algorithm has been tested by performing
two different kinds of experimental campaigns. The first campaign evaluates the performance of
our approach in some specific situations where common approaches tend to fail. First, the
robustness against small fluctuations around the zero value of θ (row "Periodicity" of Table 1) has
been evaluated by generating trajectories composed of a single straight, almost-zero direction
with added noise. In this case, the system is able to cluster together all the trajectories
thanks to the use of circular (i.e. wrapped) statistics to model angular data.
Subsequently, we tested the capability of the system to handle sequences with the same
principal directions or speeds, but given in different order (rows "Sequence" of Table 1). In this
case, both the proposed statistical measure and the alignment technique concur to filter out the
noise and to correctly cluster this kind of data. Then, a specific test was performed to verify the
robustness against severe noise on either angular or speed values (rows "Noise"). The second test
campaign evaluates the accuracy of the proposed approach by performing sequence classification
on a large amount of data. Synthetic and real data are used for testing, and two synthetic sets are
provided with either dependent or independent data (rows 6 and 7 of Table 1). The real test (row
8) is composed of 356 trajectories collected by the system previously mentioned and manually
ground-truthed. The set of trajectories has been divided randomly into 200 trajectories for the
training set and the remaining 156 for the testing set. Examples of the obtained classes (superimposed
on a bird's-eye view of the multiple-camera scenario) are shown in Fig. 3. Please note that
trajectories of the same color belong to the same class.
Figure 3
Further details can be found in several papers on this topic accepted in the last months. More
specifically, the paper presented orally at the International Conference on Advanced Video and
Signal-based Surveillance (AVSS) held in Genova, Italy, in September 2009, and those
presented as posters at the International Workshop on Multimedia in Forensics (MiFOR) held in
Beijing (China) in October 2009 and at the International Conference on Imaging for Crime Detection
and Prevention (ICDP) held in London (UK) in December 2009. Moreover, the result of
this work has recently been accepted for the prestigious International Conference on Pattern Recognition
(ICPR) to be held in Istanbul (Turkey) in August 2010.
References
• Bahlmann, C. (2006). Directional features in online handwriting recognition. Pattern Recognition, 39, 115-125.
• Calderara, S., Cucchiara, R., & Prati, A. (2008). Bayesian-competitive consistent labeling for people surveillance. IEEE Trans. on PAMI, 30(2), 354-360.
• Mardia, K. (1972). Statistics of Directional Data. Academic Press.
• Morris, B., & Trivedi, M. (2008). A survey of vision-based trajectory learning and analysis for surveillance. IEEE Transactions on Circuits and Systems for Video Technology, 1114-1127.
S5.2 Video data collection and annotation
Previous reports of the project described the development and continuous enrichment of the
on-line video repository ViSOR (http://imagelab.ing.unimo.it/visor) developed by UNIMORE. This
semester the UNIMORE activity has concentrated on two main aspects of ViSOR.
The first concerns the creation of a survey on user needs for managing video surveillance data. An
excerpt of questions and answers is reported in Fig. 4. The questionnaire was principally conceived
to highlight the inadequacy of the traditional free-text annotation and query approach when applied
to the surveillance field. Looking at the reported results, it is clear that the video surveillance
community needs new concept-based technologies. In particular, even though almost all of the
interviewees use or develop tools for event, object, and people detection, only a few of them
apply a standard schema, an ontology or even a controlled lexicon to annotate videos. Thus,
queries by concept (desired by more than half of the users) cannot be performed.
Figure 4
The second aspect addressed in this semester regards the study of a multi-dimensional annotation
procedure. Different types of annotation can be generated depending on the drill-down depth
used to annotate the video and on the application goal. To this aim we defined three dimensions
over which an annotation can be differently detailed: Spatial, Temporal, and Conceptual (STC
space). In the graph of Fig. 5 these three dimensions are associated to the Cartesian axes and each
point corresponds to a particular annotation type. For each dimension we have identified some
significant values.
Figure 5
Thus, three temporal levels of description are defined:
• none or video-level: no temporal information is given;
• clip: the video is partitioned into clips and each of them is described by the set of descriptor instances;
• frame: the annotation is given frame by frame.
Moreover, we can have the following four spatial levels:
• none or image-level: no spatial information is given and the concept refers to the whole frame;
• position: the location of the concept is specified by a single point, e.g. the centroid;
• ROI: the region of the frame containing the concept is reported, for example using the bounding box;
• mask: a pixel-level mask is reported for each concept instance.
Eventually, we have also defined four conceptual levels:
• none (syntactical level): no semantic information is provided; free-text keywords and a title can be provided;
• one concept: only one particular concept is considered and annotated; other concepts can be added but they are not the focus of the annotation itself;
• subset: only a subset of the ViSOR surveillance concepts is considered, and the subset adopted should be indicated;
• whole ontology: all the ViSOR surveillance concepts are considered.
A minimal illustrative encoding of a point in this STC space is sketched below.
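The sketch assumes nothing about ViSOR's actual annotation schema (which is not reproduced here); it only shows one possible way to record, for a single annotation, the level chosen along each of the three STC dimensions.

from dataclasses import dataclass, field
from typing import List, Optional, Tuple

TEMPORAL_LEVELS = ("video", "clip", "frame")
SPATIAL_LEVELS = ("image", "position", "roi", "mask")
CONCEPTUAL_LEVELS = ("syntactical", "one_concept", "subset", "whole_ontology")

@dataclass
class STCAnnotation:
    """One annotation, positioned in the Spatial-Temporal-Conceptual space."""
    temporal: str                                        # one of TEMPORAL_LEVELS
    spatial: str                                         # one of SPATIAL_LEVELS
    conceptual: str                                      # one of CONCEPTUAL_LEVELS
    concepts: List[str] = field(default_factory=list)    # e.g. ["person", "abandoned_object"]
    frame: Optional[int] = None                          # set when temporal == "frame"
    bbox: Optional[Tuple[int, int, int, int]] = None     # set when spatial == "roi"

ann = STCAnnotation(temporal="frame", spatial="roi", conceptual="one_concept",
                    concepts=["abandoned_object"], frame=1340, bbox=(120, 80, 60, 90))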
As an example, Fig. 6 shows the web interface of ViSOR with the different levels of annotation.
Figure 6. The ViSOR Web interface for (a) the syntactical annotation and (b) the concept list annotation.
(c) Screen shot of the ViPER-GT annotation tool
PPD – Hebrew University (HUJI)
Description of the research (months 19-24)
The main objective of the HUJI unit in the project is to study novel machine learning techniques
capable of dealing with the features extracted by UNIMORE in order to extract knowledge,
specifically abnormal or suspicious behaviors.
In particular, HUJI has concentrated on the development of graph-theoretical approaches for
detecting anomalies and on their application to trajectory analysis (performed on real scenes
provided by UNIMORE).
Task S4.4 – Use the likelihood of the model for robust classification of regular
motion/behaviour
The combined UNIMORE/HUJI approach is composed of three steps:
1. from video to discrete time-series: the results of the image processing techniques developed
during the project by Imagelab are the trajectories produced by moving people in a wide
scene monitored by multiple cameras;
2. from time-series to graph: following an innovative partitioning process described below,
we transform the trajectories into a graph describing the probability of a person moving in a
certain manner in the scene; this graph actually models "normality";
3. anomaly detection on graphs using Laplacian filtering: the learned graph
describing "normality" is compared with the current graph (which may or may not contain
anomalous trajectories) by means of an innovative similarity measure based on the Laplacian.
The first step has been described in depth in UNIMORE's past reports. Regarding step 2, the time
series (trajectories) are transformed into a weighted graph, where a node v represents a movement
from one location in the scene to another, while the weight of edge e_{i,j} is the probability of
observing movement i followed by movement j. The direct use of the (x, y) samples is unfeasible since
it would result in a very high number of nodes (in the extreme case, the square of the number of pixels),
which severely compromises the use of graph-based approaches due to computational cost, as
well as the robustness of the representation. Moreover, the (x, y) data are often affected by noise
and tracking errors, and thus need to be filtered before use. The simplest solution to both
problems is the quantization of the image (x, y) plane, which in this context translates into dividing the
scene into a fixed number of cells and assigning each data point to its containing cell.
The naive scheme is to divide the scene using a fixed-size grid. Unfortunately, the grid size is a
crucial parameter in this approach. Let N×M be the size of the image, and Nr and Nc be the
number of rows and columns of the grid, respectively. The direct use of the coordinates would
result in (N×M)² nodes, reduced to (Nr×Nc)² using a uniform grid: if Nr and Nc are too high (say
Nr = 100 and Nc = 100 for a 1000×1000 image), the approximation is good but the computational
load can still be too high (100,000,000 nodes in the example); if they are reduced (e.g., 10,000
nodes), the complexity becomes more acceptable (but still not practical) at the price of the risk of
an overly coarse quantization of the data. Another disadvantage of a uniform grid is the
uneven statistics of cell occupation in natural scenes, yielding a suboptimal statistical
quantization of the trajectories.
Thus, we use a density-sensitive variable-geometry grid scheme. Moreover, graph nodes are
assigned only to observed transitions. By not forcing any specific geometry of the cells (as in the
case of a regular grid), the task of finding an adequate partition into cells is reduced to finding
appropriate center points. Having established the centers, the cells' boundaries are determined by
the locus of points that are at the same distance from two centers, hence creating a Voronoi
tessellation. In order to select the centers, certain properties can be considered: first, an area that
is rarely traversed needs only a rough description; conversely, busy areas require a high-resolution
partitioning in order to distinguish between normal and abnormal walks. A natural solution is to
use as centers points that are randomly sampled on the training trajectories, taking into account
small-sample-size effects. In that way we can use fewer cells (i.e., nodes) but still maintain high
resolution in the "most populated" areas.
Figure 7
The procedure is summarized in Fig. 7. Given a training set composed of normal trajectories, the
image is first divided into Nr×Nc rectangular cells of fixed size (Fig. 7(a)). For this preliminary step, Nr
and Nc are less critical parameters and can be high (we used 250×500 in our experiments). Using
this division, a 2D histogram H can be built (Fig. 7(b) and (c)), where H(i, j) represents the number
of trajectory points falling in the cell at row i and column j. The 2D histogram represents the 2D
distribution of the samples in the scene, with peaks (Fig. 7(c)) in the major areas (cells) of activity
in the scene. To obtain the best coverage (relative to the training data) and the most suitable
partition of the scene with the fewest cells, the cells need to be distributed according to the discrete
distribution described by H. Thus, given Ncenter as the number of cells/nodes used, we draw
Ncenter samples from the distribution approximated by H, which increases the likelihood of
sampling from the peaks of H while avoiding sampling from areas where no points are present in the
training set. These samples are the seeds of a Voronoi tessellation of the scene (Fig. 7(d)).
The adjacency map and the transition matrix are then computed on these cells, making the graph
treatable in computational terms.
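A minimal sketch of this center-sampling and quantization step, with illustrative parameter values and helper names, could be:

import numpy as np

def sample_cell_centers(points, img_w, img_h, n_rows=250, n_cols=500,
                        n_centers=300, seed=0):
    """Draw Voronoi seeds according to the 2D histogram of trajectory points,
    so that densely traversed areas get more (and hence smaller) cells."""
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    H, xedges, yedges = np.histogram2d(pts[:, 0], pts[:, 1],
                                       bins=[n_cols, n_rows],
                                       range=[[0, img_w], [0, img_h]])
    probs = H.ravel() / H.sum()
    idx = rng.choice(probs.size, size=n_centers, p=probs)
    cx, cy = np.unravel_index(idx, H.shape)
    # use the center of each sampled preliminary cell as a Voronoi seed
    centers = np.stack([(xedges[cx] + xedges[cx + 1]) / 2,
                        (yedges[cy] + yedges[cy + 1]) / 2], axis=1)
    return centers

def quantize(points, centers):
    """Replace each point by the index of its closest center (Voronoi cell)."""
    pts = np.asarray(points, dtype=float)
    d2 = ((pts[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)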
Having established the centers, each point is replaced by the center closest to it, so that the
trajectories are transformed into sequences of centers. A node is assigned to each observed
transition from cell a to cell b. Let node i represent the transition from cell a to cell b, and in the
same manner let node j represent the transition from cell b to cell c. Then the edge e_{i,j} represents
the occurrence of moving from cell a to cell b and then to cell c, while the weight of edge e_{i,j} is
the probability of such a movement.
This procedure transforms a collection of trajectories into a graph. When a new trajectory is
encountered, the cell centers are determined according to the above scheme, but using the new
trajectory together with all the normal ones. Using these centers, one graph is constructed from
the new trajectory, while another is built using only the normal trajectories. Once these two
graphs have been constructed, anomalies are detected by searching for substantial differences
between them.
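The transformation from quantized trajectories to the transition graph can be sketched as follows (an illustrative implementation of the construction just described, not the project's code):

from collections import defaultdict

def build_transition_graph(cell_sequences):
    """Build the 'normality' graph: nodes are observed cell-to-cell transitions
    (a, b); a weighted edge links (a, b) to (b, c) with the empirical
    probability of seeing a->b followed by b->c."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in cell_sequences:
        # collapse repeated cells so each step is an actual transition
        steps = [c for k, c in enumerate(seq) if k == 0 or c != seq[k - 1]]
        transitions = list(zip(steps[:-1], steps[1:]))         # nodes (a, b)
        for u, v in zip(transitions[:-1], transitions[1:]):    # edges (a,b)->(b,c)
            counts[u][v] += 1
    graph = {}
    for u, nbrs in counts.items():
        total = sum(nbrs.values())
        graph[u] = {v: c / total for v, c in nbrs.items()}     # edge weights
    return graph

g = build_transition_graph([[0, 0, 1, 2, 2, 3], [0, 1, 2, 4]])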
In order to compute differences between graphs, the crucial issue is to find a proper similarity
function. Our specific approach is motivated by the algebraic properties of similarity matrices
that have proved highly successful in spectral clustering. Let W be a symmetric matrix that represents
the edge weights, and let its eigenvalues be ordered in increasing order; the corresponding
eigenvectors will be denoted as $\phi_1, \dots, \phi_N$.
Let us now define the spectral gap as the maximum of the difference between two consecutive
eigenvalues, and k as the index corresponding to that maximum. Given two graphs $G$ and $\tilde{G}$,
the smallest k eigenvectors of one graph are projected on the other by means of the matrix M:

$$M = \begin{cases} \phi_{[1 \dots k]}^{T} \, \tilde{\phi}_{[1 \dots \tilde{k}]} & \text{if } k \geq \tilde{k} \\ \tilde{\phi}_{[1 \dots \tilde{k}]}^{T} \, \phi_{[1 \dots k]} & \text{otherwise} \end{cases}$$

The eigenvalues of the matrix M are called the canonical angles, and are the most intuitive way to
measure the distance between subspaces. Accordingly, two measures can be defined on the matrix M:

$$\Delta_{\det}(G, \tilde{G}) = 1 - \det\left(M^{T} M\right), \qquad \Delta_{\mathrm{trace}}(G, \tilde{G}) = k - \mathrm{Trace}\left(M^{T} M\right)$$
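A sketch of these spectral distance measures, assuming the two weight matrices are built over the same set of nodes and simplifying the choice of k to the smaller of the two spectral-gap indices, is:

import numpy as np

def spectral_distances(W1, W2):
    """Compare two graphs through the subspaces spanned by the leading
    eigenvectors of their symmetric weight matrices, following the trace-
    and determinant-based measures above."""
    def eig_sorted(W):
        return np.linalg.eigh(np.asarray(W, dtype=float))   # ascending eigenvalues

    def spectral_gap_index(vals):
        return int(np.argmax(np.diff(vals))) + 1             # number of eigenvectors kept

    l1, phi1 = eig_sorted(W1)
    l2, phi2 = eig_sorted(W2)
    k = min(spectral_gap_index(l1), spectral_gap_index(l2))
    M = phi1[:, :k].T @ phi2[:, :k]       # projection of one eigenbasis on the other
    MtM = M.T @ M
    d_trace = k - np.trace(MtM)
    d_det = 1.0 - np.linalg.det(MtM)
    return d_trace, d_det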
Task S5.3 – Testing and refinement of integrated framework
In the last period of the project, HUJI collaborated with UNIMORE to test the proposed integrated
approach in a real testbed. People trajectories were collected for a month from the set-up
reported in Fig. 8.
Figure 8
For assessing the capabilities of our detection algorithm, a corpus of 1131 trajectories was
collected by the surveillance system. For testing our algorithm we treat this corpus as the normal
behavior of the scene. In order to simulate abnormal events we collected 9 abnormal trajectories,
partially shown in Fig. 9.
Detection of anomalies must be preceded by learning the normal behavior. Hence we divided our
normal trajectories into two sets: the first 900 trajectories were used in the learning phase, while
the remaining 231 normal trajectories were used in the test phase.
Figure 10 plots the trace distance measure (see above) for the test set. The system detected 8 out
of the 9 abnormal trajectories (the instants the system declares abnormal are marked by squares
on the distance line). These results suggest that the proposed anomaly detection framework is able
to detect anomalies in complex scenarios, such as the one considered here, where the data present
structural complexity and noise due to the automatic tracking techniques. It is nevertheless important
to stress that good performance depends strongly on the stable and long trajectories that can be
extracted using the surveillance system described here.
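A simple way to turn the distance curve of Fig. 10 into abnormal/normal decisions is to threshold it against the distances observed on the normal training trajectories; the rule below is only an illustrative choice, not the project's decision criterion.

import numpy as np

def flag_abnormal(train_distances, test_distances, n_std=3.0):
    """Flag test trajectories whose graph distance exceeds a threshold set
    from the distances observed on normal training trajectories
    (mean + n_std * std is an illustrative rule)."""
    train = np.asarray(train_distances, dtype=float)
    thr = train.mean() + n_std * train.std()
    return np.asarray(test_distances, dtype=float) > thr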
Figure 9
Figure 10
Accomplishments achieved
Here is a comprehensive list of the accomplishments achieved so far compared to the Project Plan
in the first 24 months (1-24):
• Development of a new approach for modeling human gait (GOM) and of its statistical formulation using autoregressive processes (concluded);
• Use of the GOM as a state-output model of an HMM for a complete statistical model of human motion (concluded);
• Use of the graph Laplacian formulation, which proved very successful for detecting irregularities in multivariate data (concluded);
• Development of a complete tool for extracting visual features (people detection and tracking with corresponding features) from a system of multiple cameras with partially overlapped FOVs (concluded);
• Further enhancement of solutions for analyzing people trajectories to account for multi-modal and sequential trajectories in order to infer behaviors (concluded);
• Study of a system for people shape analysis based on action signatures (concluded);
• Creation of a video repository for annotated surveillance videos (concluded);
• Development of a system for people tracking with freely moving cameras (concluded);
• Development of a system for markerless modeling of human actions from multiple cameras (concluded);
• Organization of the first ACM International Workshop on Vision Networks for Behaviour Analysis (ACM VNBA 2008) – http://imagelab.ing.unimore.it/vnba08 – Vancouver, BC (Canada) – October 31, 2008 (concluded).
Actions taken to ensure the implementation of the end-results
UNIMORE has moved forward with the development of real prototypes, both for people detection
and tracking from fixed multi-camera systems and for trajectory analysis.
Involvement of young scientists
At UNIMORE, five young scientists have been involved in the project:
• Simone Calderara (post-doc at UNIMORE): involved in the study of people trajectories and in the research on people shape detection and markerless segmentation of human body parts; he has been sent to international schools and conferences on these topics to acquire the necessary knowledge and experience for the project;
• Roberto Vezzani (assistant professor at UNIMORE): involved in the development and maintenance of the ViSOR system; he also participated in a meeting in Italy to disseminate the ViSOR system and the BESAFE project;
• Giovanni Gualdi (former PhD student and recently post-doc at UNIMORE): involved in the study of methods for object tracking with freely moving cameras;
• Daniele Borghesani (2nd year PhD student at UNIMORE): involved in the study of biometric features that can be applied to model people shape; he has been sent to international schools on biometrics to acquire the necessary knowledge and experience for the project;
• Paolo Piccinini (2nd year PhD student at UNIMORE): involved in the development of the people trajectory analysis system; he has attended international schools on the fundamentals of computer vision and pattern recognition useful for the BESAFE project.
At HUJI, five graduate students have been involved:
• Amos Goldman, Yizhar Shay, Nili Rubinstein, Dan Rosenbaum, Uri Heinemann: involved in the development of the Oscillatory Gait Model (OGM) within a Multivariate Auto-Regressive Hidden Markov Model (MAR-HMM) and in the development of the graph-theoretical approaches.
Major travels
• Participation in the 10th Machine Learning Summer School (MLSS) at Ile de Re (France) between 30th August 2008 and 15th September 2008 (http://mlss08.futurs.inria.fr/): Giovanni Gualdi attended this school to acquire complete and in-depth knowledge of machine learning fundamentals.
• Participation in the First International Workshop on Tracking Humans for the Evaluation of their Motion in Image Sequences (THEMIS'2008), in conjunction with BMVC 2008, Leeds, UK, September 1st-4th, 2008: Roberto Vezzani presented the work on tracking of humans robust to occlusions.
• Participation in the IEEE International Conference on Image Processing (ICIP) 2008, San Diego, CA (USA) - http://www.icip08.org/: Simone Calderara presented the work on smoke detection and feature detection for behavior analysis.
• Meeting (Lisboa, Portugal): Rita Cucchiara and Roberto Vezzani attended a meeting to evaluate possible collaborations within the BESAFE project.
• Project meeting (Jerusalem, Israel): Andrea Prati and Simone Calderara attended a meeting at HUJI with Naftali Tishby and Uri Heinemann to discuss how to integrate the two approaches and to lay the basis for the joint paper.
Visibility of the project
Scientific publications in conferences with specific acknowledgment
[1] S. Calderara, A. Prati, R. Cucchiara
“Mixtures of von Mises Distributions for Trajectory Shape Analysis”, under review in
IEEE Transactions on Circuits and Systems for Video Technology
[2] S. Calderara, A. Prati, R. Cucchiara
“Learning People Trajectories using Semi-directional Statistics”, under review in
IEEE International Conference on Advanced Video and Signal Based Surveillance
(IEEE AVSS 2009)
[3] S. Calderara, A. Prati, R. Cucchiara
“Video surveillance and multimedia forensics: an application to trajectory analysis”,
in Proceedings of 1st ACM International Workshop on Multimedia in Forensics
(MiFOR 2009), pp. 13-18
[4] S. Calderara, C. Alaimo, A. Prati, R. Cucchiara
“A Real-Time System for Abnormal Path Detection”, in Proceedings of 3rd IEE
International Conference on Imaging for Crime Detection and Prevention (ICDP
2009)
[5] S. Calderara, A. Prati, R. Cucchiara
“Alignment-based Similarity of People Trajectories using Semi-directional
Statistics”, in Proceedings of International Conference on Pattern Recognition (IAPR
ICPR 2010)
Scientific publications in conferences on topic related to the project
[6] R. Vezzani, R. Cucchiara, "AD-HOC: Appearance Driven Human tracking with
Occlusion Handling" First International Workshop on Tracking Humans for the
Evaluation of their Motion in Image Sequences (THEMIS'2008), in conjunction with
BMVC 2008. ISBN: 978-84-935251-9-4, Leeds, UK, September 1st-4th, 2008
(WINNER OF THE BEST PAPER AWARD)
Other events
None.
Technical and administrative difficulties encountered
None.
Changes in project personnel
UNIMORE included in the project staff Daniele Borghesani and Paolo Piccinini. HUJI included in the
project staff Amos Goldman, Yizhar Shay, Nili Rubinstein, Dan Rosenbaum, Uri Heinemann.
Financial Status
PPD Financial Status
Annex 4a
Science for Peace - Project Management Handbook
SfP NATO BUDGET TABLE
Please provide one sheet per Project Co-Director
ATTENTION: Project Co-Directors from NATO countries (except Bulgaria and Romania) are only eligible for NATO funding for items f-g-h !
Project number: SfP - 982480
Report date: 20/10/2008
Project Co-Director: Prof. Naftali Tishby
ACTUAL
EXPENDITURES
Detailed Budget Breakdown
BE SAFE
04/07-03/09
Project short title: SfP Duration of the Project 1 :
(1) from start until
30.09.08
FORECAST EXPENDITURES
(2) for the
following six
months
(3) for the
following period
until project's end
Comments on changes, if any, in the financial
planning compared to the approved Project Plan
(a) Equipment
2 Samsung SHC 740 D/N cameras
1 Samsung SPD 3300 (PTZ), 1 Samsung SHC 750
1" D/N camera, 1 Samsung SVR 950E recorder for
cameras
Miscellaneous equipments
1.892
2.903
10.358
4.565
12.922
professional rack for DVD recording
ink-jet printer for rack
Thermal-Eye 250D w/150mm Lens
Subtotal "Equipment"
16.815
Upgrades in the brand of the cameras
2 PTZ cameras changed in 1 PTZ plus one high-quality D/N
camera
5.400
2.760
19.750
43.735
equipment moved to following period
equipment moved to following period
(b) Computers - Software
Sun Fire X2200 M2 x64 Server, DS14 Shelf with 7TB
SATA, 6 Imacs plus upgrades
Laptop, PC, other equipment
46.764
14.492
Accessories, external storage, printers, peripherals
Software: productivity applications, Data storage and statistics
Subtotal "Computers - Software"
510
61.766
0
5.744
4.000
5.290
15.034
0
0
0
(c) Training
30
10000
10.000
7.110
800
7.910
0
5.113
2.500
2.500
4.887
5.113
4.887
0
1.094
1.094
8.906
8.906
0
159
2.841
4.000
519
1.800
1.800
10.960
103.932
0
0
International for meetings in Italy
Subtotal "Training "
(d1) Books and Journals (global figure)
(d2) Publications (global figure)
Subtotal "Books - Publications"
30
0
0
(e) Experts - Advisors
security consultant, anti-terror experts
Subtotal "Experts - Advisors "
(f) Travel for conference, workshops, domestic
International
for
meetings and setup scenarios
Subtotal "Travel"
(g) Consumables - Spare parts: software, maintenance, computing equipment, network, servers
Subtotal "Consumables - Spare parts"
(h) Other costs and (i) stipends (specify)
telecommunication, printing, desk-top
Miscellaneous
1.281
Graduate student (to be identified)
Master's student (to be identified)
Master's student (to be identified)
Subtotal "Other costs"
TOTAL (1), (2), (3) :
CURRENT COST OUTLOOK
=(1)+(2)+(3)
1.440
86.258
190.190
NPD Financial Status
Annex 4a
Science for Peace - Project Management Handbook
SfP NATO BUDGET TABLE
Please provide one sheet per Project Co-Director
ATTENTION: Project Co-Directors from NATO countries (except Bulgaria and Romania) are only eligible for NATO funding for items f-g-h !
Project number: SfP - 982480
Report date: 20/04/2009
Project Co-Director: Prof. Rita Cucchiara
ACTUAL
EXPENDITURES
Detailed Budget Breakdown
(to be completed in EUR 3 )
BE SAFE
04/07-03/09
Project short title: SfP Duration of the Project 1 :
(1) from start until
30.09.08
FORECAST EXPENDITURES
(2) for the following
six months
(3) for the following
period until
project's end
Comments on changes, if any, in the financial
planning compared to the approved Project Plan
(a) Equipment
Subtotal "Equipment"
(b) Computers - Software
Subtotal "Computers - Software"
(c) Training
Subtotal "Training "
(d1) Books and Journals (global figure)
1.752
0
(d2) Publications (global figure)
844
2.595
0
19.344
0
Subtotal "Books - Publications"
books' quota has been increased slightly (approx. 64 euro)
Costs for publishing journal papers and publications of events
(e) Experts - Advisors
Subtotal "Experts - Advisors "
(f) Travel
Travels for PhD students involved in the project
(increased by approx. 5000 euro)
Subtotal "Travel"
19.344
0
3.277
0
3.277
0
2.740
200
844
(g) Consumables - Spare parts:
Subtotal "Consumables - Spare parts"
Reduced to compensate for increases in books and travel
0
(h) Other costs and (i) stipends (specify)
other costs
stipends
Subtotal "Other costs"
2.940
844
0
TOTAL (1), (2), (3) :
28.156
844
0
CURRENT COST OUTLOOK
=(1)+(2)+(3)
29.000
SFP NATO BUDGET SUMMARY TABLE
Project number: SfP - 982480
Project short title: SfP -
Report date: 20/04/2009
Duration of the Project:
Be Safe
04/07-03/09
The Project is in the year 2
ACTUAL
Breakdown per Project Co-Director (to be completed in EUREXPENDITURES
3)
Project Co-Director's name, city, country
Naftali Tishby,Israel
Rita Cucchiara, Modena, Italia
TOTAL (must be identical with
TOTALs given in 'Breakdown per item'):
APPROVED
BUDGET:
Total year 1-5
CURRENT COST
OUTLOOK:
Total year 1 - 5
(a) Equipment
(b) Computers - Software
(c) Training
(d) Books - Publications
(e) Experts - Advisors
(f) Travel
(g) Consumables - Spare parts:
(h) Other costs and (i) stipends
TOTAL :
1 Give month/year when the Project started and expected ending date.
for the following 6
months
190.190
29.000
190.190
29.000
86.258
28.156
103.932
844
219.190
219.190
114.414
104.776
ACTUAL
EXPENDITURES
Breakdown per item (to be completed in EUR 3)
Project Co-Director's name, city, country
since start until
30.09. of current
year 2
FORECAST EXPENDITURES
APPROVED
BUDGET:
Total year 1-5
CURRENT COST
OUTLOOK:
Total year 1 - 5
60.550
76.800
10.000
7.940
2.500
20.000
19.000
22.400
219.190
2 Choose the appropriate date and complete the year.
60.550
76.800
10.000
9.628
2.500
23.306
15.006
21.400
219.190
since start until
30.09. of current
year 2
16.815
61.766
1.718
18.419
3.153
3.553
105.424
for the following
period until
project's end
Comments on changes, if any, in financial planning compared to
the approved Project Plan
FORECAST EXPENDITURES
for the following 6
months
43.735
15.034
10.000
7.910
2.500
4.887
11.853
17.847
113.766
for the following
period until
project's end
Comments on changes, if any, in financial planning compared to
the approved Project Plan
books are necessary for the added staff members
Travels for participation in schools for the added PhD students
reduced to compensate increased costs of books and travels
reduced to compensate increased costs of books and travels
0
3 As of January 2002, grants will be made in Euro (EUR) and all figures should be given in EUR.
Equipment Inventory Records
The completion of the equipment inventory records has been delayed since we never received the inventory labels.
Inventory Label No. | Property Item | Manufacturer | Model Number | Serial Number | Date of Purchase | Cost (EUR) | Location
0745 | Sun Fire X2200 M2 x64 Server | Sun | Fire X2200 | | 10/9/2007 | 19670,00 | Machine Learning Lab, HUJI
0746 | iMac | Apple | iMac | | 10/9/2007 | 13700,00 | Machine Learning Lab, HUJI
0747 | DS14 Shelf with 7TB SATA | NetApp | DS14 | | 10/9/2007 | 13100,00 | Machine Learning Lab, HUJI
0748 | DVR SVR950H160 | Samsung | SVR950H160 | | 22/10/2007 | 1861,20 | Imagelab, UNIMORE
0749 | PTZ Camera SPD 3300P | Samsung | SPD 3300P | | 22/10/2007 | 1619,75 | Imagelab, UNIMORE
0750 | Laptop Sony VAIO | Sony | VGNTZ21MN/N.IT1 | | 18/10/2007 | 1700,00 | Machine Learning Lab, HUJI
Criteria for success table
The Project is in the year: 2

Criteria for Success as approved with the first Grant Letter on 24/10/2006 (changes should be reflected here), together with the achievements as at 30.09.08:

1) Abnormal behavior: defined, scenarios of motion capture video are collected, data is acquired and annotated (25%).
Achievement: partial definition; defined scenario of abandoned baggage; acquired several annotated videos; acquired additional video with MoCAP (25%).

2) People detection and tracking: techniques for multiple cameras and PTZ defined; detection and tracking evaluated (20%).
Achievement: techniques for overlapped multiple cameras defined and deeply tested; preliminary techniques for PTZ studied; detection and tracking evaluated; preliminary studies for freely moving cameras; moving toward an integrated system (20%).

3) People activity: features extracted, symbolic coding for trajectories defined, data prepared, per-sensor classification is evaluated (15%).
Achievement: features partially extracted, symbolic coding for trajectories defined, data prepared (15%).

4) People shape: features extracted, symbolic coding defined, data prepared, per-sensor classification is evaluated (15%).
Achievement: initial study on feature extraction and representation through action signatures; markerless system for human body part tracking (15%).

5) Kernel design and SVM learning: kernels are mathematically defined, their evaluation algorithm is implemented, experimental tests and accuracy evaluated (25%).
Achievement: statistical framework designed and tested (25%).

TOTAL: 100% (approved criteria) – 100% (achievements)

Notes: 1 Give month/year when the project started and expected ending date; 2 Please underline the appropriate year; 3 Choose the appropriate date and complete the year; 4 At the end of the Project, the TOTAL should be 100% if all criteria were successfully met.