BANOS Arnaud, BANOS Florence, 2003 : « Towards integrated spatial surveillance tools of urban risks : space-time analysis of road accidents in Lille » , in Géographie des risques des transports, BANOS Arnaud, BANOS Florence, BROSSARD Thierry, LASSARRE Sylvain, Editeurs, Paradigme, Orléans, pp. 9-22
Towards integrated spatial surveillance tools of urban risks :
space-time analysis of road accidents in Lille (France)1
Arnaud Banos, INRETS/DERA ; Florence Banos, Université Paris IV Sorbonne
arnaud.banos@inrets.fr, florence.banos@tiscali.fr
Résumé
L’accident de la circulation peut être interprété comme un événement ponctuel discret,
localisé dans le temps et dans l’espace. Rare de par sa probabilité d’occurrence, il comporte
par ailleurs une large part d’aléas dans son avènement. L’identification de configurations
spatio-temporelles significatives, au sein d’une information aussi bruitée, représente donc un
enjeu réel. Le point de vue défendu dans cet article est le suivant : une stratégie basée sur la
coopération homme-machine, exploitant les capacités de chacune des parties, est à même de
permettre des avancées intéressantes dans cette perspective de recherche de structures. A
partir du semis de points des accidents survenus à Lille, entre 1993 et 1997, un exemple de
mise en œuvre sera proposé. Animation de carte, exploration spatio-temporelle interactive et
recherche automatique de structures seront combinées, au sein d’une stratégie de data-mining
spatial interactif.
Mots-Clés : Accidents de la route, Analyse spatiale exploratoire, Data-mining spatial, Lille,
Semis de Points
Abstract
Road accidents may be seen as discrete point events, localised in both space and time.
Rare in terms of their probability of occurrence, they are also largely subject to
random fluctuations. Identifying significant space-time configurations within such noisy
information is therefore a real challenge. In this paper, we defend the idea that a strategy
based on close human/computer cooperation, exploiting the specific capacities of each side,
may be very useful. The point pattern of accidents that occurred in Lille from 1993 to 1997 is
used to illustrate this point of view. Map animation, interactive spatio-temporal exploration
and automated search for patterns are combined within an interactive spatial data-mining
strategy.
Key Words : Exploratory spatial analysis, Lille (France), Point pattern analysis, Road
accidents, Spatial data-mining
1 Most of the developments this article is based on were carried out in collaboration with Florence Banos (Huguenin-Richard), while the author was finishing his PhD thesis at ThéMA, UMR 6049 CNRS, Besançon. More information is available at : http://thema.univ-fcomte.fr/banos/banos-page.htm
Introduction
Spatial surveillance tools, aimed at revealing and detecting significant space-time patterns,
may be of great value in urban planning and risk analysis. By allowing detailed, accurate
and possibly continuous monitoring of the pulse of the city, these tools would provide a strong
and reliable basis for urban management. In our opinion, such tools may be based on a strategy
combining interactivity, animation and automation within a dynamic framework. This point of
view, set out in previous work (Banos, 1999, 2001a,b,c,d ; Banos and Huguenin-Richard, 2000),
draws on three different traditions : exploratory spatial data analysis, geovisualisation and
geocomputation. While the first is firmly anchored in the statistical tradition, with the
original work of John W. Tukey (1977), the other two rely more fundamentally on developments
in computer science (Openshaw and Openshaw, 1997 ; Openshaw and Abrahart, 2000 ; Kraak, 1998 ;
Kwan, 2000). The key idea of this paper is that we should take full advantage of the
combination of these three fields, even if methods and tools are still to be assessed.
Examples of such developments are presented, focused on road risk analysis and developed
mainly within the free statistical programming environment Xlisp-Stat (Tierney, 1990).
1. Outline of the spatial data base
The data set used for this synthesis comes from the territorial administration of Lille (France),
which kindly provided a very rich spatial database :
• each accident is a localised event, known by its geographic co-ordinates ;
• each accident is described by qualitative and quantitative information (day, month,
hour of occurrence, as well as seriousness of the event, users involved...).
The Geographic Information System ArcView (ESRI) is used to store and manage the
database.
On the pinpoint map below, each point locates one or several traffic accidents, which may
overlap one another.
Figure 1 : Location of accidents in Lille, 1993-1997
From that point, different alternatives can be pursued : visual exploration through interactive
graphs, space-time animation through a smoothing algorithm, and automated search for clusters.
2. Space-time visual exploration…with a mouse
The idea of interactivity is not new (Cleveland, 1993 ; Haslett et al., 1991 ; Monmonier,
1989) but has attracted increasing attention with the development and diffusion of interactive
environments2.
Dynamic linked graphics are thus tools as simple as they are powerful for investigating a data
set, particularly in a space-time perspective, by simply linking the spatial and temporal
dimensions through appropriate graphs. We exploit here the dynamic graphics capabilities of
Xlisp-Stat, a statistical programming environment provided “for free” by Luke Tierney
(Tierney, 1990).
The spatial distribution of road accidents in Lille can be displayed in a two-dimensional
point plot, while the temporal distribution (for a given time lag) can be displayed in a
so-called “stacked plot”, enhanced with a kernel density function revealing the global trend.
2 See for example the study by Carolina Tobon, http//www.casa.ucl.ac.uk/paper31.pdf
Figure 2 : Interactive space-time exploration of accidents data within a dynamic environment
The existence of a «hot link» between the two windows allows interactive space-time
analysis to be carried out, as every individual or group of individuals selected on one of the
graphs (here the time plot) is instantaneously highlighted on the other graph. The user is thus
invited to investigate one dimension conditionally on the other, focusing on local rather than
global trends, the spatial or temporal scale of investigation being highly customisable.
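The logic of such a link can be illustrated with a minimal Python sketch. It is not the Xlisp-Stat implementation used in this work: the `Accident` record, the function names and the sample values are hypothetical, and only the selection logic is shown, not the graphical display.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accident:
    """Hypothetical accident record: map coordinates and hour of occurrence."""
    x: float
    y: float
    hour: int  # 0-23


def brush_time(accidents, start_hour, end_hour):
    """Indices of the accidents selected on the time plot (inclusive bounds)."""
    return [i for i, a in enumerate(accidents)
            if start_hour <= a.hour <= end_hour]


def linked_points(accidents, selected):
    """Coordinates to highlight on the map for a selection made on the time plot."""
    return [(accidents[i].x, accidents[i].y) for i in selected]


# Invented sample data, for illustration only.
accidents = [
    Accident(10.0, 20.0, 8),
    Accident(12.0, 21.0, 17),
    Accident(50.0, 60.0, 18),
    Accident(11.0, 19.0, 23),
]

evening = brush_time(accidents, 17, 19)
print(linked_points(accidents, evening))
```

Because both views share the same event indices, the symmetric operation (selecting on the map and highlighting on the time plot) only requires a spatial predicate in place of the temporal one.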
While this very simple strategy has proved successful in revealing meaningful local space-time
patterns, it cannot be considered a stand-alone solution, as its strength is also its
weakness :
• While focusing on local trends, the user may miss global ones. Moreover, fully
disaggregated information, although desirable in many respects, may not be sufficient and
should lead to user-driven aggregations ;
• Being an active part of the investigative process through mouse manipulation, the
user may miss patterns that are not expected at that moment ;
• The rough graphical display, which may be improved through real-time display of
numeric indicators (Banos, 2001a), may not be sufficient to distinguish
between different patterns ;
• The combination of mouse manipulations that led to a given pattern may be hard
to reproduce, even for the user who performed it.
For this reason, additional strategies may be imagined, based on both animation and
automation processes.
3. Animating maps to reveal complexity
It is often worth aggregating information in order to provide additional insights. For example,
a pinpoint map, even an interactive one, says little about the varying intensities of the
underlying phenomenon, not only because points can overlap one another but also because of the
limited capacity of human vision. Point information, such as the location of
accidents, can be usefully aggregated using a kernel density algorithm (Bailey and Gatrell, 1995 ;
Brunsdon, 1991 ; Silverman, 1986). A moving three-dimensional window of a chosen radius
“r” scans the study area, counting the events “Xi” included in its circular area and
weighting them according to their distance to the centre “X” of the window :
λˆ X  
n
 X  X i  
1

 k
i 1 
r X i 
 rX i  
2
The function k(d), which is the kernel, may be defined in several ways. Here, a bi-square
function is used :
3
2 2
 1  d  if d  1
k (d )   π
0

 X  X i  
with d  

r


otherwise
The densities estimated in this way are finally interpolated3, producing a smoothed surface in
two or three dimensions. This computer-intensive procedure allows map animations to be
produced, as figure 3 shows, providing valuable communication tools. The main message
conveyed by this kind of visual display is indeed that we had better learn to manage
complexity, as we cannot always reduce it.
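As an illustration of the smoothing step, the following Python sketch estimates the density on a regular grid with the bi-square kernel. It assumes a single fixed radius `r` for all events, and the function names, grid parameters and sample coordinates are hypothetical. It is a sketch of the technique, not the procedure actually used to produce the animation.

```python
import math


def bisquare(d):
    """Bi-square kernel: (3/pi) * (1 - d^2)^2 for d <= 1, and 0 otherwise."""
    return (3.0 / math.pi) * (1.0 - d * d) ** 2 if d <= 1.0 else 0.0


def kernel_density(x, y, events, r):
    """Density estimate at (x, y) from events [(xi, yi), ...], fixed radius r."""
    total = 0.0
    for xi, yi in events:
        d = math.hypot(x - xi, y - yi) / r  # scaled distance to the window centre
        total += bisquare(d) / (r * r)
    return total


def density_grid(events, r, xmin, ymin, step, nx, ny):
    """Density on a regular nx-by-ny grid; one such grid per period gives one frame."""
    return [[kernel_density(xmin + i * step, ymin + j * step, events, r)
             for i in range(nx)]
            for j in range(ny)]


# Invented coordinates (metres), for illustration only.
events = [(100.0, 100.0), (110.0, 105.0), (400.0, 380.0)]
grid = density_grid(events, r=50.0, xmin=0.0, ymin=0.0, step=25.0, nx=20, ny=20)
print(max(max(row) for row in grid))
```

Each grid node requires a pass over all events, so the cost grows with the number of nodes, the number of events and the number of frames, which is consistent with the computational burden discussed in this section.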
3 Most of the available algorithms provide good results, as the underlying grid is regular and complete.
Figure 3 : 3D map animation of observed insecurity in Lille (surface deformation is
proportional to the density of accidents)4
Nevertheless, while very useful for revealing space-time patterns, this smoothing method
suffers from several drawbacks. The first relates to the large volume of computation involved,
which may dramatically reduce the level of interactivity of the process. Ideally, the user
should be able to adjust dynamically the spatial and temporal scales on which the animation is
based, so as to evaluate the influence of these major parameters on the visual rendering.
Unfortunately, the amount of time needed to produce such an animation makes this goal
unreachable, at least for the moment.
A major drawback, especially when dealing with road risk analysis, can also be pointed out.
Traffic accidents happen mainly on the road network (except for those happening in car parks),
whereas the kernel density algorithm operates on a surface basis (a circular window) and, what
is more, in an isotropic way for a given distance to the centroid of the window.
Anisotropic smoothing algorithms based on the network could be investigated, but we
believe the surface rendering is essential for visual efficiency. More importantly, the densities
estimated in this way may be seen as indicators of unsafety rather than of risk, as they are not
related to any exposure indicator. A relevant improvement would then be to weight each
localised density estimate by the local length of the network included in the window
(figure 4).
Figure 4 : Integration of the road network to enhance kernel density algorithm
As a feasibility test, this GIS-based procedure was applied to a small area only,
since it involves very large computations.
Figure 5 : Application of the modified algorithm to a small study area
The map obtained shows the estimated densities of accidents weighted by the density of the road
network in each moving window. This process provides rough but spatially differentiated
estimates of the risk of accidents in a given area. It shows the benefits we could obtain by
integrating spatial analysis methods within a common framework : GIS should play
this integrating role, as much as they usually do in terms of data management.
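The weighting principle can be sketched in Python under simplifying assumptions that are ours and not those of the GIS procedure: the network is approximated by short pieces attributed to the window containing their midpoint, and a raw count of accidents replaces the kernel-weighted estimate. All names and values are hypothetical.

```python
import math


def split_segment(p, q, piece):
    """Split the road segment p-q into short pieces: (mid_x, mid_y, length)."""
    length = math.dist(p, q)
    n = max(1, math.ceil(length / piece))
    pieces = []
    for k in range(n):
        t = (k + 0.5) / n  # relative position of the piece midpoint
        pieces.append((p[0] + t * (q[0] - p[0]),
                       p[1] + t * (q[1] - p[1]),
                       length / n))
    return pieces


def network_length_in_window(pieces, cx, cy, r):
    """Approximate road length inside the circular window centred on (cx, cy)."""
    return sum(l for x, y, l in pieces if math.hypot(x - cx, y - cy) <= r)


def accidents_per_unit_length(accidents, pieces, cx, cy, r):
    """Accidents in the window divided by the road length in the window."""
    length = network_length_in_window(pieces, cx, cy, r)
    if length == 0.0:
        return None  # no road in the window: the ratio is undefined
    count = sum(1 for x, y in accidents if math.hypot(x - cx, y - cy) <= r)
    return count / length


# Invented data: one straight 100 m road and three accidents.
pieces = split_segment((0.0, 0.0), (100.0, 0.0), piece=10.0)
accidents = [(40.0, 0.0), (60.0, 0.0), (90.0, 0.0)]
print(accidents_per_unit_length(accidents, pieces, 50.0, 0.0, 20.0))
```

Windows containing no road return `None` rather than a value, which keeps them distinguishable from windows with roads but no accidents.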
4 Visual rendering was obtained with MapInfo. Full animation can be viewed from http://thema.univ-fcomte.fr/banos/banos-page.htm
This «user oriented» approach may not be an end in itself. The interactive space-time analysis
sketched out previously can indeed be enriched with relevant local indicators, obtained
through more automated processes.
4. Automated identification of accidents clusters, in space and time
While interactivity and visualisation are of crucial importance, spatial surveillance cannot be
reduced to these tasks. Indeed, the burden of discovery can hardly be left entirely to what
might be called a perfect user, never missing meaningful patterns nor being misled by
incorrect knowledge. Some processes may therefore be automated, in the spirit of spatial
data-mining (Zeitouni and Yeh, 1999). Two strategies are introduced below, allowing
clusters to be identified in a very precise way.
Detection of spatial clusters in time
As Stan Openshaw recalls, “the original idea of a geographical analysis machine was that of
a something that could be left running permanently processing important databases in a
never ending quest for patterns or relationships that matter” (Openshaw, 1995, p. 10). The
method used below is a recent adaptation of the GAM proposed by Fotheringham and Zhan
(1996). It compares locally the spatial pattern of a “case” population with its “at-risk”
population (figure 6).
Figure 6 : Defining significant clusters of child pedestrian accidents (right map) among the
whole data set (left map)
Here, the spatial pattern of all accidents, minus the “case” ones, is taken to form the
“at-risk” population : “...the knowledge of the spatial distribution of the at-risk population allows
more interesting clusters to be distinguished from those that arise purely from spatial
variations in the density of the at-risk population” (Fotheringham and Zhan, 1996).
The study area is scanned with many randomly located circular windows, and a statistical
test is computed each time. In each window, the observed number of “case” events is
compared with its expected number, which is the product of the number of “at-risk”
events in the window and the mean proportion of “case” events in the whole area. The Poisson
probability distribution is then used to assess the significance of the observed difference.
The windows recording significantly high proportions of child pedestrian accidents, for a
specified alpha-level, are mapped. Clusters, or relative concentrations of “case” accidents,
are then revealed by the overlapping of multiple circles.
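The scanning procedure described above can be sketched as follows in Python. This is a simplified illustration, not the code used for the study: a fixed window radius is used, the mean proportion of cases is taken as the ratio of case events to at-risk events (an assumption), windows without any at-risk event are skipped (another assumption), and all names and data are hypothetical.

```python
import math
import random


def poisson_upper_tail(k, mean):
    """P(N >= k) for N following a Poisson distribution with the given mean."""
    if k <= 0:
        return 1.0
    term = math.exp(-mean)  # P(N = 0)
    cdf = term
    for n in range(1, k):
        term *= mean / n    # P(N = n) from P(N = n - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def in_circle(p, c, r):
    return math.hypot(p[0] - c[0], p[1] - c[1]) <= r


def scan(cases, at_risk, radius, n_windows, alpha, bbox, seed=0):
    """Centres of the random windows with a significant excess of case events."""
    rng = random.Random(seed)
    rate = len(cases) / len(at_risk)  # assumed definition of the mean proportion
    xmin, ymin, xmax, ymax = bbox
    flagged = []
    for _ in range(n_windows):
        c = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        observed = sum(in_circle(p, c, radius) for p in cases)
        exposure = sum(in_circle(p, c, radius) for p in at_risk)
        if observed == 0 or exposure == 0:
            continue
        expected = exposure * rate
        if poisson_upper_tail(observed, expected) <= alpha:
            flagged.append(c)
    return flagged


# Invented data: at-risk events on a regular grid, cases concentrated near (50, 50).
at_risk = [(5.0 + 10.0 * i, 5.0 + 10.0 * j) for i in range(10) for j in range(10)]
cases = [(50.0 + 0.2 * i, 50.0 - 0.2 * i) for i in range(8)]
flagged = scan(cases, at_risk, radius=15.0, n_windows=500, alpha=0.05,
               bbox=(0.0, 0.0, 100.0, 100.0))
print(len(flagged), "significant windows")
```

Mapping the flagged circles then reveals the clusters through their overlap, as described above. Since many overlapping windows are tested, the alpha-level applies to each window separately and not to the map as a whole.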
The modified version of the “GAM” thus makes it possible to detect significantly high
proportions of child pedestrian accidents, for a given period and a specified alpha-level
(figure 7).
Figure 7 : Clusters of child pedestrian accidents identified in Lille (1993-1997) by the
modified GAM
This clustering strategy has proved successful on many occasions (Banos and Huguenin-Richard,
2000 ; Fotheringham and Zhan, 1996 ; Huguenin-Richard, 2000 ; Openshaw et al., 1987)
but remains fundamentally sequential in spirit, being based on a loose interaction between
the spatial and temporal dimensions. For that reason, we found it useful to propose an
alternative strategy, dealing more explicitly with space-time interactions.
Detection of space-time clusters
The approach we propose here is based on the now classical literature on space-time modelling
(Bailey and Gatrell, 1995 ; Wakefield et al., 2000). The main idea is to extend the global
indicator of Nathan Mantel (1967), so as to provide a local indicator of space-time clustering.
Mantel’s indicator is expressed as the sum of the products of the distances between events in
space and in time.
$$I_{Mantel} = \sum_{i} \sum_{j} x_{ij}\, y_{ij} \qquad \text{with } x_{ij} = \frac{1}{k_d + d_{ij}} \text{ and } y_{ij} = \frac{1}{k_t + t_{ij}}$$
$d_{ij}$ is the Euclidean distance and $t_{ij}$ the time interval between events. Using inverse
functions makes it possible to switch from distance to proximity, allowing more intuitive
interpretation of the values obtained. Finally, $k_d$ and $k_t$ are constants, allowing events
occurring at the same place and/or moment to be taken into account (mainly when k = 1). The
bias introduced by the combination of different metrics can be reduced by transforming each
metric to a common interval [0, 100], using the following function :
$$\frac{distance_{ij} - \mathrm{Min}(distance_{ij})}{\mathrm{Max}(distance_{ij}) - \mathrm{Min}(distance_{ij})} \times 100$$
From that point, two last issues have to be dealt with. First, it is worth questioning whether
events occurring in distant places should be compared. Second, the question of the practical
use of this indicator must be addressed. Let us recall that we are looking for fairly detailed
space-time clusters, rather than for global warnings. In this spirit, we proposed (Banos, 1999,
2001a) to estimate this indicator on a local basis, using a buffering process (Figure 8).
Figure 8 : A local adaptation of Mantel’s indicator of space-time clustering
A local Monte Carlo permutation test is then used to assess whether the values obtained may
be seen as significant or could have occurred “by chance”. Significant values, for a specified
alpha-level, are finally mapped, allowing space-time clusters to be easily detected
(Figure 9).
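One possible reading of this local test is sketched below in Python. The details are assumptions on our part: the buffer is a circle around a given centre, the local indicator is computed on raw (not rescaled) distances and time intervals for all pairs of events inside the buffer, and significance is assessed by permuting the times among these events. Names and data are hypothetical.

```python
import math
import random


def pair_sum(points, times, k_d=1.0, k_t=1.0):
    """Sum of spatial proximity times temporal proximity over all pairs."""
    total = 0.0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            t = abs(times[i] - times[j])
            total += 1.0 / (k_d + d) * 1.0 / (k_t + t)
    return total


def local_test(points, times, centre, radius, n_perm=999, seed=0):
    """Local indicator inside a circular buffer and its Monte Carlo p-value."""
    idx = [i for i, p in enumerate(points) if math.dist(p, centre) <= radius]
    if len(idx) < 3:
        return None  # too few events in the buffer to test anything
    pts = [points[i] for i in idx]
    tms = [times[i] for i in idx]
    observed = pair_sum(pts, tms)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        shuffled = tms[:]
        rng.shuffle(shuffled)  # break the space-time link, keep both margins
        if pair_sum(pts, shuffled) >= observed:
            hits += 1
    return observed, (1 + hits) / (1 + n_perm)


# Invented data: two groups of accidents, each tight in both space and time (days).
points = [(-20.0 + i, 0.0) for i in range(5)] + [(16.0 + i, 0.0) for i in range(5)]
times = [1, 2, 3, 4, 5, 201, 202, 203, 204, 205]
observed, p_value = local_test(points, times, centre=(0.0, 0.0), radius=50.0)
print(round(observed, 3), p_value)
```

Repeating the test for a buffer around each accident, and mapping the buffers whose p-value falls below the chosen alpha-level, would give a map of the kind described above.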
Figure 9 : Space-time clusters of road accidents in Lille (1993-1997) identified using buffers
of size 50m (left) and 100m (right) and a 0.05 alpha-level
Conclusion
Although still “in progress”, this work shows how spatial surveillance tools can be
implemented within specific environments dedicated to the management of urban risks.
Space-time strategies combining methods from exploratory spatial data analysis, geocomputation
and geovisualisation can indeed be imagined, leading to promising results.
Nevertheless, a major shift in GIS, from data management to data analysis, has to be achieved
if we want to carry on : as Stan Openshaw recalled, “analysis is no longer an option” (Openshaw,
1995). In that spirit, locally based methods and models are still to be defined, leading to more
realistic and efficient identification of underlying patterns. While being of concern for the
entire spatial analysis field, this challenge may sound particularly appealing to geographers,
given their capacity to assume a major role in the process.
References
BAILEY T. and GATRELL A., 1995 : Interactive spatial data analysis, Longman Scientific &
Technical, Harlow, 413 p.
BANOS A. and HUGUENIN-RICHARD F., 2000 : “Spatial distribution of road accidents in
the vicinity of point sources : application to child pedestrian accidents”, Geography and
Medicine, Elsevier, pp. 54-64
BANOS A., 1999 : “Quelle implication de l’utilisateur dans une stratégie de data mining
spatial ? Illustration à partir de l’appréhension spatio-temporelle des accidents de la route en
milieu urbain”, Revue Internationale de Géomatique, Vol. 9, n° 4, pp. 441-456.
BANOS A., 2001a : Le lieu, le moment, le mouvement : pour une exploration spatio-temporelle désagrégée de la demande de transport en commun en milieu urbain, Thèse de
géographie, Besançon, 356 p.
BANOS A., 2001b : “A propos de l’analyse exploratoire des données”, Cybergéo
http://www.cybergeo.presse.fr, n° 197, 15 p.
BANOS A., 2001c : “Enhancing mobility behaviour analysis using spatial interactive tools
and computer intensive methods”, Geographic Information Sciences, Vol. 7, n° 1, pp. 35-41.
BANOS A., 2001d : “Localizing people during surveys : a versatile strategy”, Actes de
Colloque, Geocomputation, 6th International Conference on Geocomputation, Brisbane, 8 p.
BRUNSDON C., 1991 : “Estimating probability surfaces in GIS : an adaptive technique”,
EGIS’91 Proceedings, Second European Conference in GIS, Brussels, Vol. 1, pp. 155-164.
BRUNSDON C., 1998 : “Exploratory spatial data analysis and local indicators of spatial
association with Xlisp-Stat”, The Statistician, Vol. 47, n° 3, pp. 471-484.
GAHEGAN M., 2000 : “Visualization as a tool for geocomputation”, Geocomputation, Taylor
& Francis, London, pp. 253-274.
CLEVELAND, W., 1993 : Visualizing Data, Murray Hill, AT&T Bell Laboratories/Hobart
Press.
FOTHERINGHAM S. and ZHAN B., 1996 : “A comparison of three exploratory methods for
cluster detection in spatial point patterns”, Geographical Analysis, Vol. 28, n° 3, pp. 200-218
HASLETT J., BRADLEY R., CRAIG P., UNWIN A., WILLS G., 1991 : “Dynamic graphics
for exploring spatial data with application to locating global and local anomalies”, The
American Statistician, Vol. 45, pp. 234-242
HUGUENIN-RICHARD F., 1999 : “Identifier les sites routiers dangereux : application de
méthodes d’analyse utilisant la localisation géographique des accidents”, Revue
Internationale de Géomatique, Vol. 9, n° 4, pp. 471-487.
HUGUENIN-RICHARD F., 2000 : Approche géographique des accidents de la circulation :
proposition de modes opératoires de diagnostic. Application au territoire de la métropole
lilloise, Thèse de géographie, Besançon, 322 p.
KRAAK M.J., 1998 : “The cartographic visualisation process : from presentation to
exploration”, The Cartographic Journal, Vol. 35, n° 1, pp. 11-15
KWAN M., 2000 : “Interactive geovisualization of activity-travel patterns using three-dimensional geographical information systems : a methodological exploration with a large
data set”, Transportation Research C, Vol. 8, pp. 185-203
MANTEL, N., 1967: “The detection of disease clustering and a generalized regression
approach”, Cancer Research, Vol. 27, pp. 209-220.
MONMONIER M., 1989 : “Geographic brushing, enhancing exploratory analysis of the
scatterplot matrix”, Geographical Analysis, Vol. 21, pp. 81-84
OPENSHAW S., 1995 : “Developing automated and smart spatial pattern exploration tools
for geographical information systems applications”, The Statistician, Vol. 44, n° 1, pp. 3-16.
OPENSHAW S. and ABRAHART R., 2000 : Geocomputation, Taylor & Francis, London,
413 p.
OPENSHAW S., CHARLTON M., WYMER C. and CRAFT A., 1987 : “A mark I
geographical analysis machine for the automated analysis of point data sets”, International
Journal of Geographical Information Systems, Vol. 1, n° 4, pp. 335-358.
OPENSHAW S. and OPENSHAW C., 1997 : Artificial intelligence in geography, John Wiley
& Sons, Chichester, 329 p.
SILVERMAN B., 1986 : Density estimation for statistics and data analysis, Chapman and
Hall, London
TIERNEY, L., 1990 : Lisp-Stat : an object-oriented environment for statistical computing and
dynamic graphics, John Wiley & Sons, New-York, http://www.stat.umn.edu/
TUKEY J.-W., 1977 : Exploratory data analysis, Addison-Wesley, Reading, Massachusetts,
688 p.
WAKEFIELD J., KELSALL J. and MORRIS S., 2000 : “Clustering, cluster detection and
spatial variation in risk”, Spatial epidemiology, methods and applications, pp. 128-152.
ZEITOUNI K. and YEH L., 1999 : “Le data mining spatial et les bases de données spatiales”,
Revue Internationale de Géomatique, Vol. 9, n° 4, pp. 389-423.