Project BeSafe – SfP 982480 – 4th Progress Report – APRIL Progress Report - 2010 NATO PROGRAMME FOR SECURITY THROUGH SCIENCE SCIENCE FOR PEACE NATO Public Diplomacy Division, Bd. Leopold III, B-1110 Brussels, Belgium fax +32 2 707 4232 : e-mail sfp.applications@hq.nato.int Progress Report – APRIL 2010 Project SfP 982480 – BE SAFE (Behavior lEarning in Surveilled Areas with Feature Extraction) Project Director (PPD): PROF. NAFTALI TISHBY, ISRAEL Project Director (NPD): PROF. RITA CUCCHIARA, ITALY People involved in the report’s preparation: Prof. Rita Cucchiara, Dr. Andrea Prati Prof. Naftali Tishby Date of completion: 22 April 2009 1 Project BeSafe – SfP 982480 – 4th Progress Report – APRIL Progress Report - 2010 Table of Content Table of Content................................................................................................................................... 2 List of abbreviations ............................................................................................................................. 3 Participants .......................................................................................................................................... 4 Project background and objectives...................................................................................................... 5 Overview of the project ....................................................................................................................... 6 Technical Progress ............................................................................................................................... 7 PPD – Hebrew University (HUJI) ...................................................................................................... 7 NPD – University of Modena and Reggio Emilia (UNIMORE) .......................................................... 7 Financial Status .................................................................................................................................. 24 PPD Financial Status ....................................................................................................................... 24 NPD Financial Status ...................................................................................................................... 25 Equipment Inventory Records ........................................................................................................... 27 Criteria for success table .................................................................................................................... 28 2 Project BeSafe – SfP 982480 – 4th Progress Report – APRIL Progress Report - 2010 List of abbreviations HUJI – The Hebrew University UNIMORE – Università degli Studi di Modena e Reggio Emilia MSS – Magal Security Systems, Ltd. PPD – Partner Project Director NPD – NATO Project Director CV – Computer Vision ML – Machine Learning FOV – Field of View OGM – Oscillatory Gait Model HMM – Hidden Markov Model SVM – Support Vector Machine PTZ – Pan-Tilt-Zoom 3 Project BeSafe – SfP 982480 – 4th Progress Report – APRIL Progress Report - 2010 Participants (a) Project Director (PPD) (Consult ”Definitions”) Surname/First Job Title, Institute and Address Country name/Title TISHBY/NAFTALI Professor, School of Engineering ISRAEL /PROF. 
and Computer Science, The Hebrew University, Ross Building, Givat Ram Campus, 91904 Jerusalem, Israel (b) End-user(s) (Consult “Definitions”) Surname/First Job Title, Company/Organisation name/Title and Address DANK/ZVI V.P. Engineering, Magal Security Systems, Ltd., P.O. Box 70, Industrial Zone, 56000, Yahud (c) Project Director (NPD) (Consult “Definitions”) Surname/First Job Title, Institute and Address name/Title CUCCHIARA/RITA/ Full professor, Dipartimento di PROF. Ingegneria dell’Informazione, University of Modena and Reggio Emilia, Via Vignolese, 905, 41100 Modena Country ISRAEL Telephone, Fax and Email Tel: +972-2-65-84167 Fax: +972-2-658-6440 Email: tishby@cs.huji.ac.il Telephone, Fax and Email Tel: +972-3-5391444 Fax: +972-3-5366245 Email: mglzvi@trendline.co.il Country Telephone, Fax and E-mail ITALY Tel: +39 059 2056136 Fax: +39 059 2056129 Email: rita.cucchiara@unimore.it 4 Project BeSafe – SfP 982480 – 4th Progress Report – APRIL Progress Report - 2010 Project background and objectives This project is unique since it aims at combining two main areas of research, Computer Vision and Machine Learning, in an application of automatic surveillance for people detection and tracking and abnormal behavior recognition. Computer Vision (CV) and Machine Learning (ML) have been used jointly for many different applications but either using ML as a tool for computer vision applications or using CV as a case study to proof theoretical advances in ML. The project aims at exploring how visual features can be automatically extracted from video using computer vision techniques and exploited by a classifier (generated by machine learning) to detect and identify suspicious people behavior in public places in real time. In this sense, CV and ML are jointly developed and studied to provide a better mix of innovative techniques. Justification of the proposed project is based on two issues of major concern to the state of Israel: (1) the need for intelligent surveillance in public and commercial areas that are susceptible to terrorist attacks and (2) lack of automatic and intelligent decision support in existing surveillance systems. More specifically, the objectives of the project are: (1) to achieve a better understanding of which visual features can be used for (1.a) analyzing people activity and (1.b) characterizing people shape; (2) to suitably adapt ML techniques such as HMM, SVM or methods for “novelty detection” in order to infer from the visual features extracted the behavior of the people and possible classifying it as normal or abnormal; (3) develop a first simple prototype in a specific scenario that can be considered as a threat for security. The machine learning research is carried out at the Hebrew University’s machine learning lab utilizing its long experience in temporal pattern recognition and computational learning methods. Following the meeting in June 2007 in Jerusalem, we decided to focus in the available time for the project on one particular behavior which is both well defined and threatening: people who leave objects behind them (such as luggage in airports). The machine learning component is based on the following phases: (1) constructing a generative statistical model of human gait on the basis of the features provided by the CV group. 
Such a model is an adaptation of an oscillatory dynamic model we developed in the past (Singer and Tishby 1994), where different points on the walking person are assumed having a drifted oscillatory motion with characteristic frequency and relative phases; (2) this basic Oscillatory Gait Model (OGM) is then plugged as the output of a state of an HMM, yielding a complete statistical model of regular gait; (3) detecting deviations (irregularities) in the relative phases and amplitudes of the OGM to capture irregular behavior, e.g. halting, bending, leaving objects, etc. The output of such a statistical model can be classified using likelihood ratio tests or standard classifiers as SVM to improve confidence. (4) We also carried work on detecting statistical irregularities in multivariate correlated data, as another component of the project. 5 Project BeSafe – SfP 982480 – 4th Progress Report – APRIL Progress Report - 2010 Overview of the project 1st year 1-2–3 Month: 1. Hybrid and distributed multi-camera people detection and tracking 2nd year 4-5-6 7-8–9 10 - 11 - 12 1-2-3 4-5-6 7-8-9 10 - 11 - 12 S1.1 People detection and tracking in multi-camera systems S1.2 Camera coordination primitives for static, hybrid and PTZ cameras 2. Feature extraction for people surveillance S2.1 Features extraction for people activity detection S2.2 Features extraction for people shape detection 3. Data preparation and symbolic coding S3.1 Data preparation and understanding, per-sensor symbolic coding and state modeling for people activity features S3.2 Data selection, cleaning, formatting, and cases generation for people activity features S3.3 Data preparation and understanding, per-sensor symbolic coding and state modeling for people shape features S3.4 Data selection, cleaning, formatting, and cases generation for people shape features 4. Designing a dynamic gait model based on coupled oscillatory motion S4.1 Using the people activity features to design a statistical classifier of regular gait S4.2 Design a state model/kernel for the people shape features S4.3 Plug in the Gait-Oscillatory model (GOM) as a state in an HMM for a complete regular gate statistical model S4.4 Use the likelihood of the model for robust classification of regular motion/behaviour 5. Framework for Abnormal Behavior monitoring S5.1 Analysis of requirements and constraints S5.2 Video data collection and annotation S5.3 Testing and refinement of integrated framework Final Report Delayed: 3rd Progress Report 2nd Progress Report 1st Progress Report Completed: Research report on integrate d framewor k testing - Final report of develope d algorithm Research s and report on technique the s for the kernels end-user and SVM empirical classificat ion tests Prototype for collection of visual people shape Prototype features for Research collection on report of visual symbolic people coding activity for Research features people report on shape testing of features people detection and tracking As planned: 6 Project BeSafe – SfP 982480 – 4th Progress Report – APRIL Progress Report - 2010 Technical Progress NPD – University of Modena and Reggio Emilia (UNIMORE) Description of the research (months 19-24) The main objective of UNIMORE unit in the project is to study which visual features can be used for inferring people abnormal behaviors. These features have been considered coming from two types of analysis: the analysis of the people activity and the people shape. 
The research activities have been mainly concentrated in the first year and the first semester of the second year; consequently, the activities during these six months were reduced. In particular, UNIMORE concentrated mainly on two research activities. The first was devoted to further investigating the use of circular statistics for modelling people trajectories as an important feature characterizing people activity. This research was linked to Task 2.1 – Feature extraction for people activity detection – which ended in the first year, but UNIMORE devoted additional effort to this interesting and promising topic. The second research activity in this semester further addressed Task 5.2 – Video data collection and annotation – by improving the functionalities of the ViSOR repository already described in the previous reports. Finally, UNIMORE collaborated with HUJI to develop an integrated framework which combines the tracking algorithms (UNIMORE) with algebraic graph-theoretical methods (HUJI). This effort is part of Tasks 4.4 and 5.3 described in the section on HUJI progress.

Task S2.1 – Feature extraction for people activity detection

One of the most addressed topics in recent video surveillance research is the extraction and analysis of features for behavior understanding. Among the possible features, trajectories represent a rich source of information which can be robustly extracted from single or multiple fixed cameras (Calderara, Cucchiara, & Prati, 2008). Morris and Trivedi (Morris & Trivedi, 2008) proposed a recent survey of state-of-the-art techniques for modeling, comparing and classifying trajectories for video surveillance purposes. A people trajectory projected on the ground plane is a very compact representation of patterns of movement, normally characterized by a sequence of 2D coordinates {(x1, y1), ..., (xn, yn)} and often associated with the motion status, e.g. the instantaneous velocity or acceleration. However, in many cases, instead of analyzing the spatial relationships of single trajectory points, the focus should be placed on a more global descriptor of the trajectory, i.e. the trajectory shape. The shape is independent of the starting point and can constitute a very effective descriptor of the movement and the action. In the surveillance of large public spaces, the trajectory shape can discriminate between different behaviors, such as people moving on a straight path versus people moving in a circle. As an example, Fig. 1 shows a sketch of a real scenario: the bird's-eye view reconstruction based on three overlapped cameras is reported, and the collected trajectories are superimposed with different colors corresponding to different trajectory classes. Observing the trajectory shapes only, regardless of their location, we can infer that a group of people walks straight ahead, passing through the monitored area, while other people arrive and move toward the upper part of the scene; finally, some people stay close to the benches. To cope with this evident diversity of behavior, we propose to model trajectory shapes by means of a representation based on a sequence of angles, and we focus our attention on statistical pattern recognition techniques for angular sequences (a small sketch of this angular encoding is given below).
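To make this angular encoding concrete, the following minimal sketch (illustrative only; the function name and the time-step parameter are our assumptions, not part of the delivered prototype) converts a ground-plane trajectory {(x1, y1), ..., (xn, yn)} into the sequence of direction angles θ and speeds v used in the rest of this section.

```python
import numpy as np

def trajectory_to_angle_speed(points, dt=1.0):
    """Turn an (n, 2) array of ground-plane positions into (n-1) pairs
    (theta, v): the direction angle of each displacement (periodic,
    directional feature) and its speed (linear feature)."""
    points = np.asarray(points, dtype=float)
    deltas = np.diff(points, axis=0)                 # displacement vectors
    theta = np.arctan2(deltas[:, 1], deltas[:, 0])   # angles in [-pi, pi)
    speed = np.linalg.norm(deltas, axis=1) / dt      # assumes a constant sampling step
    return np.column_stack([theta, speed])

# tiny usage example on a synthetic straight-then-turning path
traj = np.array([[0, 0], [1, 0], [2, 0], [2, 1], [2, 2]])
print(trajectory_to_angle_speed(traj))
```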
Since angles are periodic variables, the classical approach based on Gaussian distributions is unsuitable and another distribution should be adopted. By exploiting circular statistics, we proposed in the previous reports and papers the adoption of a new statistical representation based on a mixture of von Mises (MovM) distributions.

Figure 1

However, pure shape is not sufficiently discriminative in surveillance scenarios (e.g., the same path covered by walking or by running has a different meaning in terms of behavior). In the further refinement carried out during this semester, we therefore studied a way to add the speed to the shape description, in order to provide a more complete analysis of the trajectory. The introduction of the speed, which is not periodic, requires accounting for the different nature of the two features: the angle θ is directional, while the speed v is linear. Using a statistical model, the resulting bivariate joint probability p(θ, v) can be modelled as the product p(θ)·p(v) if and only if the two variables turn out to be independent for the considered application. If they are not, the joint probability must be modelled directly, combining a directional (univariate) component for θ and a linear (univariate) component for v in a single bivariate pdf. The estimation of the covariance matrix for this bivariate joint pdf can be quite challenging, since the dependency between θ and v must be modelled properly. When a directional or periodic variable is combined with a linear one, the term semi-directional is often used. The use of a Gaussian pdf for the linear variable v is straightforward, while the choice of the pdf for θ is less obvious. One of the most used (due to the properties it shares with the Gaussian) is the von Mises (vM) distribution. However, in the case of semi-directional statistics, the use of a wrapped Gaussian distribution (Bahlmann, 2006) (Mardia, 1972) is preferable because, thanks to its closeness to its linear counterpart, it is possible to adopt a linear approximation of the variance parameter even for circular variables. The linear variance approximation allows the Gaussian maximum likelihood estimator to be employed to calculate, with acceptable precision, the covariance matrix in the case of joint linear and periodic multivariate variables. The wrapped Gaussian can be written as:

$$WG(\theta \mid \theta_0, \sigma) = \sum_{w=-\infty}^{+\infty} \mathcal{N}(\theta - 2\pi w \mid \theta_0, \sigma)$$

Nevertheless, parameter estimation for the wrapped Gaussian is not easy. For this reason, Bahlmann (Bahlmann, 2006) proposed to adopt a multivariate semi-directional distribution in handwriting recognition by using an approximated wrapped Gaussian (AWG) pdf for the directional variable (the tangent slope of a written segment) and a linear Gaussian for the linear variable, thereby defining a semi-wrapped Gaussian distribution which we will refer to hereinafter as AWLG (Approximated Wrapped and Linear Gaussian). Eventually, both directional and linear data can be modelled with multi-modal distributions, for example using parametric mixtures of the corresponding pdfs (a small numerical sketch of the wrapped Gaussian defined above is given below).
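As a numerical illustration of the wrapped Gaussian above, the following sketch truncates the infinite sum over wraps to a few terms; the truncation bound is an illustrative assumption of ours, not a value taken from the report.

```python
import numpy as np

def wrapped_gaussian_pdf(theta, theta0, sigma, n_wraps=5):
    """Approximate WG(theta | theta0, sigma) by truncating the sum over the
    integer wraps w; for moderate sigma a handful of terms is enough."""
    theta = np.asarray(theta, dtype=float)
    total = np.zeros_like(theta)
    for w in range(-n_wraps, n_wraps + 1):
        z = theta - 2.0 * np.pi * w - theta0
        total += np.exp(-0.5 * (z / sigma) ** 2) / (np.sqrt(2.0 * np.pi) * sigma)
    return total

# the density is periodic: theta and theta + 2*pi get (numerically) the same value
angles = np.array([0.0, 1.0, 1.0 + 2.0 * np.pi])
print(wrapped_gaussian_pdf(angles, theta0=0.0, sigma=1.0))
```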
The expression of the AWG is the following:

$$AWG(\theta \mid \theta_0, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{((\theta-\theta_0)\bmod 2\pi)^2}{2\sigma^2}}$$

which can be extended to include also a linear variable as follows:

$$AWLG(\mathbf{x} \mid \boldsymbol{\mu}, \Sigma) = \frac{1}{\sqrt{2\pi\,|\Sigma|}}\, e^{-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{T}\Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})}$$

where $\mathbf{x} = [\theta, v]^{T}$ is the observation vector, $\boldsymbol{\mu} = [\theta_0, v_0]^{T}$ is the mean vector, $\mathbf{x}-\boldsymbol{\mu} = [(\theta-\theta_0)\bmod 2\pi,\; v-v_0]^{T}$ is the "difference" between them, $\Sigma = \begin{pmatrix} \sigma_{\theta,\theta} & \sigma_{\theta,v} \\ \sigma_{v,\theta} & \sigma_{v,v} \end{pmatrix}$ is the covariance matrix and $|\Sigma|$ its determinant. Consequently, the mixture of AWLG (MoAWLG) can be defined as:

$$MoAWLG(\mathbf{x} \mid \boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \sum_{k=1}^{K} \pi_k\, AWLG(\mathbf{x} \mid \boldsymbol{\mu}_k, \Sigma_k)$$

Figure 2. Plots of different circular pdfs, with θ0 = 0 and σ = 1.0 (corresponding to m = 1.54) or σ = 1.5 (m = 0.69).

The results of this research exploit semi-directional statistics (specifically a mixture of AWLG) to model and analyze people trajectory shapes in order to classify path shapes and motion models. The AWLG model turned out to be the more appropriate choice: we measured the mutual information between the directional and linear variables to test their dependency, and since exact mutual information is hard to compute in the case of mixtures of pdfs, a variational approximation of it has been derived. Furthermore, an approach for comparing sequences of semi-directional data has been derived: it exploits the global alignment of sequences of symbols with a distance based on the Kullback-Leibler divergence. Finally, a complete system for the classification of people trajectories is proposed, and experiments on both synthetic and real data are provided to demonstrate its accuracy; some hours of unconstrained acquisition of people walking around in an open space were evaluated.

In order to verify the accuracy of our approach, we performed extensive experiments with both synthetic and real data. Two sets of synthetic trajectories (one with dependent and one with independent data) were generated with a Matlab simulator in order to evaluate both solutions (with dependent and with independent variables). We also evaluated the average mutual information on real data: the average value is high enough to indicate that some correlation exists between the angles and the speed in the considered scenario. This is not true in general, but heavily depends on the context of application and the collected data.

Table 1

The robustness of the proposed sequence-comparison algorithm has been tested through two different experimental campaigns. The first campaign evaluates the performance of our approach in some specific situations where common approaches tend to fail. First, the robustness against small fluctuations around the zero value of θ (row "Periodicity" of Table 1) has been evaluated by generating trajectories composed of a single straight, almost-zero direction with added noise. In this case, the system is able to cluster all the trajectories together thanks to the use of circular (i.e., wrapped) statistics to model angular data. Subsequently, we tested the capability of the system to handle sequences of the same principal directions or speeds, but given in different order (rows "Sequence" of Table 1). In this case, both the proposed statistical measure and the alignment technique concur to filter out the noise and to correctly cluster this kind of data. Then, a specific test was performed to verify the robustness against severe noise on either angular or speed values (rows "Noise").
The second test campaign evaluates the accuracy of the proposed approach by performing sequence classification on a large amount of data. Synthetic and real data are used for testing, and two synthetic sets are provided with either dependent or independent data (rows 6 and 7 of Table 1). The real test (row 8) is composed of 356 trajectories collected by the system previously mentioned and manually ground-truthed. The set of trajectories has been divided randomly into 200 trajectories for the training set and the remaining ones for the testing set. Examples of the obtained classes (superimposed on a bird's-eye view of the multi-camera scenario) are shown in Fig. 3; note that trajectories of the same color belong to the same class.

Figure 3

Further details can be found in several papers on this topic accepted in the last months: more specifically, the paper presented orally at the International Conference on Advanced Video and Signal-based Surveillance (AVSS) held in Genova, Italy, in September 2009, and those presented as posters at the International Workshop on Multimedia in Forensics (MiFOR) held in Beijing (China) in October 2009 and at the International Conference on Imaging for Crime Detection and Prevention (ICDP) held in London (UK) in December 2009. Moreover, the result of this work has recently been accepted at the prestigious International Conference on Pattern Recognition (ICPR) to be held in Istanbul (Turkey) in August 2010.

References
• Bahlmann, C. (2006). Directional features in online handwriting recognition. Pattern Recognition, 39, 115-125.
• Calderara, S., Cucchiara, R., & Prati, A. (2008). Bayesian-competitive consistent labeling for people surveillance. IEEE Trans. on PAMI, 30(2), 354-360.
• Mardia, K. (1972). Statistics of directional data.
• Morris, B., & Trivedi, M. (2008). A survey of vision-based trajectory learning and analysis for surveillance. IEEE Transactions on Circuits and Systems for Video Technology, 1114-1127.

S5.2 Video data collection and annotation

Previous reports of the project described the development and continuous enrichment of the on-line video repository ViSOR (http://imagelab.ing.unimo.it/visor) developed by UNIMORE. In this semester the UNIMORE activity has concentrated on two main aspects of ViSOR. The first concerns the creation of a survey on user needs for managing video surveillance; an excerpt of questions and answers is reported in Fig. 4. The questionnaire was principally conceived to highlight the inadequacy of the traditional free-text annotation and query approach when applied to the surveillance field. Looking at the reported results, it is clear that the video surveillance community needs new concept-based technologies: in particular, even if almost all of the interviewees use or develop tools for event, object, and people detection, only a few of them apply a standard schema, an ontology or even a controlled lexicon to annotate videos. Thus, queries by concept (desired by more than half of the users) cannot be performed.

Figure 4

The second aspect addressed in this semester regards the study of a multi-dimensional annotation procedure. Different types of annotation can be generated depending on the drill-down depth used to annotate the video and on the application goal.
To this aim we defined three dimensions over which an annotation can be differently detailed: Spatial, Temporal, and Conceptual (STC space). In the graph of Fig. 5 these three dimensions are associated with the Cartesian axes and each point corresponds to a particular annotation type. For each dimension we have identified some significant values.

Figure 5

Thus, three temporal levels of description are defined:
• none or video-level: no temporal information is given;
• clip: the video is partitioned into clips and each of them is described by the set of descriptor instances;
• frame: the annotation is given frame by frame.

Moreover, we can have the following four spatial levels:
• none or image-level: no spatial information is given and the concept refers to the whole frame;
• position: the location of the concept is specified by a single point, e.g. the centroid;
• ROI: the region of the frame containing the concept is reported, for example using the bounding box;
• mask: a pixel-level mask is reported for each concept instance.

Eventually, we have also defined four conceptual levels:
• none (syntactical level): no semantic information is provided; free-text keywords and a title can be provided;
• one concept: only one particular concept is considered and annotated; other concepts can be added but they are not the focus of the annotation itself;
• subset: only a subset of the ViSOR surveillance concepts is considered, and the subset adopted should be indicated;
• whole ontology: all the ViSOR surveillance concepts are considered.

As an example, Fig. 6 shows the web interface of ViSOR with the different levels of annotation.

Figure 6. The ViSOR Web interface for (a) the syntactical annotation and (b) the concept list annotation. (c) Screen shot of the ViPER-GT annotation tool.

PPD – Hebrew University (HUJI)

Description of the research (months 19-24)

The main objective of the HUJI unit in the project is to study novel machine learning techniques capable of dealing with the features extracted by UNIMORE in order to extract knowledge, specifically abnormal or suspicious behaviors. In particular, HUJI has concentrated on the development of graph-theoretical approaches for detecting anomalies and on their application to trajectory analysis (on real scenes provided by UNIMORE).

Task S4.4 – Use the likelihood of the model for robust classification of regular motion/behaviour

The combined UNIMORE/HUJI approach is composed of three steps (a schematic sketch of the overall pipeline is given after this list):
1. from video to discrete time-series: the results of the image processing techniques developed during the project by Imagelab are the trajectories produced by moving people in a wide scene monitored by multiple cameras;
2. from time-series to graph: following an innovative partitioning process described below, we transform the trajectories into a graph describing the probability that a person moves in a certain manner in the scene; this graph models "normality";
3. anomaly detection on graphs using Laplacian filtering: the learned graph describing "normality" is compared with the current graph (which may or may not contain anomalous trajectories) by means of an innovative similarity measure based on the Laplacian.

The first step has been described in depth in UNIMORE's past reports.
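The following interface-level outline condenses these three steps; it is only a schematic sketch reflecting our reading of the report (all function names are hypothetical placeholders, not the project code).

```python
# Schematic outline of the combined UNIMORE/HUJI chain (hypothetical names).

def extract_trajectories(video_streams):
    """Step 1 (UNIMORE): multi-camera people detection and tracking, returning
    ground-plane trajectories as ordered lists of (x, y) points."""
    raise NotImplementedError  # provided by the Imagelab tracking system

def trajectories_to_graph(trajectories):
    """Step 2: partition the scene into cells (see below), code each trajectory
    as a sequence of cells and build the weighted transition graph."""
    raise NotImplementedError  # the graph built on normal data models "normality"

def laplacian_similarity(graph_a, graph_b):
    """Step 3 (HUJI): Laplacian-based spectral comparison of two graphs;
    large values indicate a substantial difference between them."""
    raise NotImplementedError  # see the measures Delta and Gamma below

def check_trajectory(new_trajectory, normal_trajectories, threshold):
    """Flag the new trajectory as anomalous when the graph built with it
    deviates too much from the graph built on the normal trajectories alone."""
    normal_graph = trajectories_to_graph(normal_trajectories)
    current_graph = trajectories_to_graph(normal_trajectories + [new_trajectory])
    return laplacian_similarity(normal_graph, current_graph) > threshold
```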
Regarding step 2, the time series (trajectories) are transformed into a weighted graph, where a node represents a movement from one location in the scene to another, while the weight of edge e_{i,j} is the probability of observing movement i followed by movement j. The direct use of the (x, y) samples is unfeasible since it would result in a very high number of nodes (in the extreme case, the square of the number of pixels), which severely compromises the use of graph-based approaches due to computational cost, as well as the robustness of the representation. Moreover, the (x, y) data are often affected by noise and tracking errors, and thus need to be filtered before use. The simplest solution to both problems is the quantization of the image (x, y) plane, which in this context translates into dividing the scene into a fixed number of cells and assigning each data point to its containing cell. The naive scheme is to divide the scene using a fixed-size grid. Unfortunately, the grid size is a crucial parameter in this approach. Let N×M be the size of the image, and N_r and N_c be the number of rows and columns of the grid, respectively. The direct use of the coordinates would result in (N·M)² nodes, reduced to (N_r·N_c)² using a uniform grid: if N_r and N_c are too high (say N_r = 100 and N_c = 100 for a 1000×1000 image), the approximation is good but the computational load can still be too high (100,000,000 nodes in the example); if they are reduced (e.g., to 10,000 nodes), the complexity becomes more acceptable (but still not practical) at the price of the risk of an overly coarse quantization of the data. Another disadvantage of a uniform grid is the uneven statistics of cell occupation in natural scenes, yielding a suboptimal statistical quantization of the trajectories. Thus, we use a density-sensitive variable-geometry grid scheme; moreover, graph nodes are assigned only to observed transitions. By not forcing any specific geometry of the cells (as in the case of a regular grid), the task of finding an adequate partition into cells is reduced to finding appropriate center points. Having established the centers, the cells' boundaries are determined by the locus of points that are at the same distance from two centers, hence creating a Voronoi tessellation. In order to select the centers, certain properties can be considered: first, an area that is rarely traversed needs only a rough description; conversely, busy areas require a high-resolution partitioning in order to distinguish between normal and abnormal walks. A natural solution is to use as centers points that are randomly sampled on the training trajectories, taking into account small-sample-size effects. In this way we can use fewer cells (i.e., nodes) but still maintain high resolution in the "most populated" areas (a minimal sketch of this quantization is given below).

Figure 7

The procedure is summarized in Fig. 7. Given a training set composed of normal trajectories, the image is first divided into N_r × N_c rectangular cells of fixed size (Fig. 7(a)). For this preliminary step, N_r and N_c are less critical parameters and can be high (we used 250×500 in our experiments). Using this division, a 2D histogram H can be built (Fig. 7(b) and (c)), where H(i, j) represents the number of trajectory points falling in the cell at row i and column j. The 2D histogram represents a 2D distribution of the samples in the scene, with peaks (Fig. 7(c)) in the major areas (cells) of activity in the scene.
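A minimal sketch of this density-sensitive quantization follows, under a simplified reading of the procedure: here the centers are drawn directly from the training trajectory points rather than from the histogram H described next, and the function names and the number of centers are illustrative assumptions.

```python
import numpy as np

def sample_centers(train_points, n_centers, seed=0):
    """Pick cell centers among the training trajectory points, so that densely
    traversed areas receive more (and hence smaller) Voronoi cells."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(train_points), size=n_centers, replace=False)
    return train_points[idx]

def quantize_trajectory(points, centers):
    """Replace each (x, y) point by the index of its closest center, i.e.
    assign it to a cell of the Voronoi tessellation induced by the centers."""
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# usage on synthetic data: twenty noisy, roughly straight walks
rng = np.random.default_rng(0)
trajs = [np.column_stack([np.linspace(0, 10, 50), rng.normal(0, 0.2, 50)])
         for _ in range(20)]
centers = sample_centers(np.vstack(trajs), n_centers=50)   # 50 is illustrative
codes = quantize_trajectory(trajs[0], centers)              # per-point cell indices
print(codes[:10])
```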
To obtain the best coverage (with respect to the training data) and the most suitable partition of the scene with the fewest cells, the centers need to be distributed according to the discrete distribution described by H. Thus, given N_centers as the number of cells/nodes to be used, we draw N_centers samples from the distribution approximated by H, which increases the likelihood of sampling from the peaks of H while avoiding sampling from areas where no points are present in the training set. These samples represent the seeds of a Voronoi tessellation of the scene (Fig. 7(d)). The adjacency map and the transition matrix are then computed on these cells, making the graph tractable in computational terms.

Having established the centers, each point is replaced by the center closest to it, so that the trajectories are transformed into sequences of centers. A node is assigned to each observed transition from cell a to cell b. Let node i represent the transition from cell a to cell b, and in the same manner let node j represent the transition from cell b to cell c. Hence, the edge e_{i,j} represents the occurrence of moving from cell a to cell b and then to cell c, while the weight of edge e_{i,j} is the probability of such a movement. This procedure transforms a collection of trajectories into a graph. When a new trajectory is encountered, the cell centers are determined according to the above scheme, but using the new trajectory together with all the normal ones. Using these centers, one graph is constructed from the new trajectory, while another is built using only the normal trajectories. Once these two graphs have been constructed, anomalies are detected by searching for substantial differences between them.

In order to compute differences between graphs, the crucial issue is to find a proper similarity function. Our specific approach is motivated by the algebraic properties of similarity matrices that have proved highly successful in spectral clustering. Let W be a symmetric matrix representing the edge weights, with eigenvalues ordered increasingly, and let the corresponding eigenvectors be denoted as φ_1, ..., φ_n. Let us now define the spectral gap (the maximum difference between two consecutive eigenvalues) and the index k corresponding to that maximum. Given two graphs G and G̃ (with spectral-gap indices k and k̃), the smallest k eigenvectors of one graph are projected on the other by means of the matrix M:

$$M = \begin{cases} \Phi_{[1\ldots k]}^{T}\,\tilde{\Phi}_{[1\ldots k]} & \text{if } k \le \tilde{k} \\ \Phi_{[1\ldots \tilde{k}]}^{T}\,\tilde{\Phi}_{[1\ldots \tilde{k}]} & \text{otherwise} \end{cases}$$

where $\Phi_{[1\ldots k]}$ and $\tilde{\Phi}_{[1\ldots \tilde{k}]}$ collect the first k (respectively k̃) eigenvectors of G and G̃ as columns. The eigenvalues of the matrix M are called the canonical angles, and are the most intuitive way to measure the distance between subspaces. Accordingly, two measures can be defined on the matrix M (a compact numerical sketch of these measures is given below):

$$\Delta(G, \tilde{G}) = k - \mathrm{Trace}\left(M^{T}M\right) \qquad \Gamma(G, \tilde{G}) = 1 - \det\left(M^{T}M\right)$$

Task S5.3 – Testing and refinement of integrated framework

In the last period of the project, HUJI collaborated with UNIMORE to test the proposed integrated approach on a real testbed. People's trajectories were collected for a month from the set-up reported in Fig. 8.

Figure 8

To assess the capabilities of our detection algorithm, a corpus of 1131 trajectories was collected by the surveillance system. For testing our algorithm we treat this corpus as the normal behavior of the scene. In order to simulate abnormal events we collected 9 abnormal trajectories, partially shown in Fig. 9.
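Before moving to the evaluation, here is a compact numerical sketch of the spectral measures Δ and Γ defined above for two graphs given as symmetric weight matrices; the rule used to pick k from the spectral gap and the variable names are our illustrative simplification of the text, not the project implementation.

```python
import numpy as np

def leading_eigenvectors(weights):
    """Eigen-decompose a symmetric weight matrix; return the eigenvectors
    (columns, sorted by increasing eigenvalue) and the index k of the largest
    gap between consecutive eigenvalues (the spectral gap)."""
    evals, evecs = np.linalg.eigh(weights)     # ascending eigenvalues
    k = int(np.argmax(np.diff(evals))) + 1     # keep the first k eigenvectors
    return evecs, k

def graph_distances(w_a, w_b):
    """Delta(G, G~) = k - Trace(M^T M) and Gamma(G, G~) = 1 - det(M^T M),
    with M projecting the smallest-k eigenvectors of one graph on the other."""
    phi_a, k_a = leading_eigenvectors(w_a)
    phi_b, k_b = leading_eigenvectors(w_b)
    k = min(k_a, k_b)                          # simplified choice of k
    m = phi_a[:, :k].T @ phi_b[:, :k]
    mtm = m.T @ m
    return k - np.trace(mtm), 1.0 - np.linalg.det(mtm)

# toy usage: comparing a small weighted graph with itself gives ~zero distances
w = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 2.0], [0.0, 2.0, 0.0]])
print(graph_distances(w, w))
```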
Detection of anomalies must be preceded by learning the normal behavior. Hence we divided our normal trajectories into two sets: the first 900 trajectories were used in the learning phase, while the other 231 normal trajectories were used in the test phase. Figure 10 plots the Trace distance measure (see above) for the test set. It is clear that the system detected 8 out of the 9 abnormal trajectories (the instants at which the system declares an abnormality are marked by squares on the distance line). These results suggest that the proposed anomaly detection framework is able to detect anomalies in complex scenarios such as the one considered, where the data present structural complexity and noise due to the automatic tracking techniques. It is nevertheless important to stress that good performance depends strongly on the stable and long trajectories that can be extracted using the surveillance system described here.

Figure 9

Figure 10

Accomplishments achieved

Here is a comprehensive list of the accomplishments achieved so far compared to the Project Plan in the first 24 months (1-24):
• Development of a new approach for modeling human gait (GOM) and its statistical modeling using autoregressive processes (concluded).
• Use of the GOM as the state-output model of an HMM for a complete statistical model of human motion (concluded).
• Use of the graph Laplacian formulation, which proved very successful for detecting irregularities in multivariate data (concluded).
• Development of a complete tool for extracting visual features (people detection and tracking with corresponding features) from a system of multiple cameras with partially overlapped FOVs (concluded).
• Further enhancement of solutions for analyzing people trajectories to account for multi-modal and sequential trajectories in order to infer behaviors (concluded).
• Study of a system for people shape analysis based on action signatures (concluded).
• Creation of a video repository for annotated surveillance videos (concluded).
• Development of a system for people tracking with freely moving cameras (concluded).
• Development of a system for markerless modeling of human actions from multiple cameras (concluded).
• Organization of the first ACM International Workshop on Vision Networks for Behaviour Analysis (ACM VNBA 2008) – http://imagelab.ing.unimore.it/vnba08 – Vancouver, BC (Canada) – October 31, 2008 (concluded).

Actions taken to ensure the implementation of the end-results

UNIMORE has moved forward with the development of real prototypes both for people detection and tracking from fixed multi-camera systems and for trajectory analysis.
Involvement of young scientists

At UNIMORE five young scientists have been involved in the project:
• Simone Calderara (post-doc at UNIMORE): involved in the study of people trajectories and in the research on people shape detection and markerless segmentation of human body parts; he has been sent to international schools and conferences on these topics to acquire the necessary knowledge and experience for the project;
• Roberto Vezzani (assistant professor at UNIMORE): involved in the development and maintenance of the ViSOR system; he also participated in a meeting in Italy to disseminate the ViSOR system and the BESAFE project;
• Giovanni Gualdi (former PhD student and recently post-doc at UNIMORE): involved in the study of methods for object tracking with freely moving cameras;
• Daniele Borghesani (2nd year PhD student at UNIMORE): involved in the study of biometric features that can be applied to model people shape; he has been sent to international schools on biometrics to acquire the necessary knowledge and experience for the project;
• Paolo Piccinini (2nd year PhD student at UNIMORE): involved in the development of the people trajectory analysis system; he has participated in international schools on the fundamentals of computer vision and pattern recognition useful for the BESAFE project.

At HUJI, five graduating students have been involved:
• Amos Goldman, Yizhar Shay, Nili Rubinstein, Dan Rosenbaum, Uri Heinemann: involved in the development of the Oscillatory Gait Model (OGM) within a Multivariate Auto-Regressive Hidden Markov Model (MAR-HMM) and in the development of the graph-theoretical approaches.

Major travels
• Participation in the 10th Machine Learning Summer School (MLSS) at Ile de Ré (France) between 30th August 2008 and 15th September 2008 (http://mlss08.futurs.inria.fr/): Giovanni Gualdi attended this school to acquire complete and in-depth knowledge of machine learning fundamentals.
• Participation in the First International Workshop on Tracking Humans for the Evaluation of their Motion in Image Sequences (THEMIS'2008), in conjunction with BMVC 2008, Leeds, UK, September 1st-4th, 2008: Roberto Vezzani presented the work on tracking of humans robust to occlusions.
• Participation in the IEEE International Conference on Image Processing (ICIP) 2008, San Diego, CA (USA) – http://www.icip08.org/: Simone Calderara presented the work on smoke detection and feature detection for behavior analysis.
• Meeting (Lisboa, Portugal): Rita Cucchiara and Roberto Vezzani attended a meeting to evaluate possible collaborations within the BESAFE project.
• Project meeting (Jerusalem, Israel): Andrea Prati and Simone Calderara attended a meeting at HUJI with Naftali Tishby and Uri Heinemann to discuss how to integrate the two approaches and to lay the basis for the joint paper.

Visibility of the project

Scientific publications in conferences with specific acknowledgment

[1] S. Calderara, A. Prati, R. Cucchiara, "Mixtures of von Mises Distributions for Trajectory Shape Analysis", under review in IEEE Transactions on Circuits and Systems for Video Technology
[2] S. Calderara, A. Prati, R. Cucchiara, "Learning People Trajectories using Semi-directional Statistics", under review in IEEE International Conference on Advanced Video and Signal Based Surveillance (IEEE AVSS 2009)
[3] S. Calderara, A. Prati, R.
Cucchiara “Video surveillance and multimedia forensics: an application to trajectory analysis”, in Proceedings of 1st ACM International Workshop on Multimedia in Forensics (MiFOR 2009), pp. 13-18 [4] S. Calderara, C. Alaimo, A. Prati, R. Cucchiara “A Real-Time System for Abnormal Path Detection”, in Proceedings of 3rd IEE International Conference on Imaging for Crime Detection and Prevention (ICDP 2009) [5] S. Calderara, A. Prati, R. Cucchiara “Alignment-based Similarity of People Trajectories using Semi-directional Statistics”, in Proceedings of International Conference on Pattern Recognition (IAPR ICPR 2010) Scientific publications in conferences on topic related to the project [6] R. Vezzani, R. Cucchiara, "AD-HOC: Appearance Driven Human tracking with Occlusion Handling" First International Workshop on Tracking Humans for the 22 Project BeSafe – SfP 982480 – 4th Progress Report – APRIL Progress Report - 2010 Evaluation of their Motion in Image Sequences (THEMIS'2008), in conjunction with BMVC 2008. ISBN: 978-84-935251-9-4, Leeds, UK, September 1st-4th, 2008 (WINNER OF THE BEST PAPER AWARD) Other events None. Technical and administrative difficulties encountered None. Changes in project personnel UNIMORE included in the project staff Daniele Borghesani and Paolo Piccinini. HUJI included in the project staff Amos Goldman, Yizhar Shay, Nili Rubinstein, Dan Rosenbaum, Uri Heinemann. 23 Project BeSafe – SfP 982480 – 4th Progress Report – APRIL Progress Report - 2010 Financial Status PPD Financial Status Annex 4a Science for Peace - Project Management Handbook SfP NATO BUDGET TABLE Please provide one sheet per Project Co-Director ATTENTION: Project Co-Directors from NATO countries (except Bulgaria and Romania) are only eligible for NATO funding for items f-g-h ! Project number: SfP - 982480 Report date: 20/10/2008 Project Co-Director: Prof. 
Naftali Tishby ACTUAL EXPENDITURES Detailed Budget Breakdown BE SAFE 04/07-03/09 Project short title: SfP Duration of the Project 1 : (1) from start until 30.09.08 FORECAST EXPENDITURES (2) for the following six months (3) for the following period until project's end Comments on changes, if any, in the financial planning compared to the approved Project Plan (a) Equipment 2 Samsung SHC 740 D/N cameras 1 Samsung SPD 3300 (PTZ), 1 Samsung SHC 750 1" D/N camera, 1 Samsung SVR 950E recorder for cameras Miscellaneous equipments 1.892 2.903 10.358 4.565 12.922 professional rack for DVD recording ink-jet printer for rack Thermal-Eye 250D w/150mm Lens Subtotal "Equipment" 16.815 Upgrades in the brand of the cameras 2 PTZ cameras changed in 1 PTZ plus one high-quality D/N camera 5.400 2.760 19.750 43.735 equipments moved to following period equipments moved to following period (b) Computers - Software Sun Fire X2200 M2 x64 Server, DS14 Shelf with 7TB SATA, 6 Imacs plus upgrades Lapto, PC, other equipments 46.764 14.492 Accessories, external storage, printers, peripherals Software: productivity applications, Data storage and statistics Subtotal "Computers - Software" 510 61.766 0 5.744 4.000 5.290 15.034 0 0 0 (c) Training 30 10000 10.000 7.110 800 7.910 0 5.113 2.500 2.500 4.887 5.113 4.887 0 1.094 1.094 8.906 8.906 0 159 2.841 4.000 519 1.800 1.800 10.960 103.932 0 0 International for meetings in Italy Subtotal "Training " (d1) Books and Journals (global figure) (d2) Publications (global figure) Subtotal "Books - Publications" 30 0 0 (e) Experts - Advisors security consultant, anti-terror experts Subtotal "Experts - Advisors " (f) Travel for conference, workshops, domestic International for meetings and setup scenarios Subtotal "Travel" (g) Consumables - Spare parts: software, maintenance computing equpiment, network, servers Subtotal "Consumables - Spare parts" (h) Other costs and (i) stipends (specify) telecommunication, printing, desk-top Miscellaneous 1.281 Graduate student (to be identified) Master's student (to be identified) Master's student (to be identified) Subtotal "Other costs" TOTAL (1), (2), (3) : CURRENT COST OUTLOOK =(1)+(2)+(3) 1.440 86.258 190.190 24 Project BeSafe – SfP 982480 – 4th Progress Report – APRIL Progress Report - 2010 NPD Financial Status Annex 4a Science for Peace - Project Management Handbook SfP NATO BUDGET TABLE Please provide one sheet per Project Co-Director ATTENTION: Project Co-Directors from NATO countries (except Bulgaria and Romania) are only eligible for NATO funding for items f-g-h ! Project number: SfP - 982480 Report date: 20/04/2009 Project Co-Director: Prof. 
Rita Cucchiara ACTUAL EXPENDITURES Detailed Budget Breakdown (to be completed in EUR 3 ) BE SAFE 04/07-03/09 Project short title: SfP Duration of the Project 1 : (1) from start until 30.09.08 FORECAST EXPENDITURES (2) for the following six months (3) for the following period until project's end Comments on changes, if any, in the financial planning compared to the approved Project Plan (a) Equipment Subtotal "Equipment" (b) Computers - Software Subtotal "Computers - Software" (c) Training Subtotal "Training " (d1) Books and Journals (global figure) 1.752 0 (d2) Publications (global figure) 844 2.595 0 19.344 0 Subtotal "Books - Publications" books' quote has been increased a little (approx 64 euro) Costs for publishing journal papers and publications of events (e) Experts - Advisors Subtotal "Experts - Advisors " (f) Travel Travels for PHD student involved in the project (increased of approx 5000 euro) Subtotal "Travel" 19.344 0 3.277 0 3.277 0 2.740 200 844 (g) Consumables - Spare parts: Subtotal "Consumables - Spare parts" Reduced to compensate to increases in books and travel 0 (h) Other costs and (i) stipends (specify) other vosts stipends Subtotal "Other costs" 2.940 844 0 TOTAL (1), (2), (3) : 28.156 844 0 CURRENT COST OUTLOOK =(1)+(2)+(3) 29.000 25 Project BeSafe – SfP 982480 – 4th Progress Report – APRIL Progress Report - 2010 SFP NATO BUDGET SUMMARY TABLE Project number: SfP - 982480 Project short title: SfP - Report date: 20/04/2009 Duration of the Project: Be Safe 04/07-03/09 The Project is in the year 2 ACTUAL Breakdown per Project Co-Director (to be completed in EUREXPENDITURES 3) Project Co-Director's name, city, country Naftali Tishby,Israel Rita Cucchiara, Modena, Italia TOTAL (must be identical with TOTALs given in 'Breakdown per item'): APPROVED BUDGET: Total year 1-5 CURRENT COST OUTLOOK: Total year 1 - 5 (a) Equipment (b) Computers - Software (c) Training (d) Books - Publications (e) Experts - Advisors (f) Travel (g) Consumables - Spare parts: (h) Other costs and (i) stipends TOTAL : 1 Give month/year when the Project started and expected ending date. for the following 6 months 190.190 29.000 190.190 29.000 86.258 28.156 103.932 844 219.190 219.190 114.414 104.776 ACTUAL EXPENDITURES Breakdown per item (to be completed in EUR 3) Project Co-Director's name, city, country since start until 30.09. of current year 2 FORECAST EXPENDITURES APPROVED BUDGET: Total year 1-5 CURRENT COST OUTLOOK: Total year 1 - 5 60.550 76.800 10.000 7.940 2.500 20.000 19.000 22.400 219.190 2 Choose the appropriate date and complete the year. 60.550 76.800 10.000 9.628 2.500 23.306 15.006 21.400 219.190 since start until 30.09. of current year 2 16.815 61.766 1.718 18.419 3.153 3.553 105.424 for the following period until project's end Comments on changes, if any, in financial planning compared to the approved Project Plan FORECAST EXPENDITURES for the following 6 months 43.735 15.034 10.000 7.910 2.500 4.887 11.853 17.847 113.766 for the following period until project's end Comments on changes, if any, in financial planning compared to the approved Project Plan books are necessary to the added staff member Travels for participating to schools for added PhD students reduced to compensate incresed costs of books and travels reduced to compensate incresed costs of books and travels 0 3 As of January 2002, grants will be made in Euro (EUR) and all figures should be given in EUR. 
26 Project BeSafe – SfP 982480 – 4th Progress Report – APRIL Progress Report - 2010 Equipment Inventory Records The completion of the equipment inventory records has been delayed since we never received the inventory labels. Date of Purchase Cost (EUR1) Location Fire X2200 10/9/2007 19670,00 Apple iMac 10/9/2007 13700,00 DS14 Shelf with 7TB SATA NetApp DS14 10/9/2007 13100,00 0748 DVR SVR950H160 Samsung SVR950H160 22/10/2007 1861,20 0749 PTZ Camera SPD 3300P Samsung SPD 3300P 22/10/2007 1619,75 0750 Laptop Sony VAIO Sony VGNTZ21MN/N .IT1 18/10/2007 1700,00 Machine Lab, HUJI Machine Lab, HUJI Machine Lab, HUJI Imagelab, UNIMORE Imagelab, UNIMORE Machine Lab, HUJI Inventory Label No. Manufacturer Model Number 0745 Property Item Sun Fire X2200 M2 x64 Server Sun 0746 iMac 0747 Serial Number Learning Learning Learning Learning 27 Project BeSafe – SfP 982480 – 4th Progress Report – APRIL Progress Report - 2010 Criteria for success table The Project is in the year: 2 Criteria for Success as approved Criteria for Success: Achievements as at 30.09.08 with the first Grant Letter on: 24/10/2006 % (changes should be refleced here) % 1) Abnormal behavior: defined, scenarios of motion capture video are collected, data is acquired and annotated 1) Abnormal behavior: partial definition, 25% defined scenario of abandoned baggage, acquired several annotated videos , acquired additional video with MoCAP 2) People detection and tracking: techniques for multiple cameras and PTZ defined; detection and tracking evaluated 2) People detection and tracking: techniques for overlapped multiple cameras 20% defined and deeply tested; preliminary techniques for PTZ studied; detection and tracking evaluated; preliminary studies for freely moving cameras; going forward an integrated system 25% 20% 3) People activity: features extracted, symbolic coding for trajectories defined, data prepared, per-sensor classification is evaluated 3) People activity: features partially extracted, symbolic coding for 15% trajectories defined, data prepared 15% 4) People shape: features extracted, symbolic coding defined, data prepared, per-sensor classification is evaluated 4) People shape: initial study on feature extraction and representation 15% through action signatures; markerless system for human body part tracking 15% 5) Kernel design and SVM learning: kernels are mathematically defined, their evaluation algorithm is implemented, experimental tests and accuracy evaluated 25% 25% 5) Statistical framework designed and tested .) .) TOTAL : 1 Give month/year when the project started and expected ending date; 4 2 At the end of the Project, the TOTAL should be 100% if all criteria were successfully met. 100% Please underline the appropriate year; 3 4 TOTAL : 100% Choose the appropriate date and complete the year; 28