An Evaluation of NIJ’s Evaluation Methodology for Geographic Profiling Software
D. Kim Rossmo
Research Professor and Director
Center for Geospatial Intelligence and Investigation
Department of Criminal Justice
Texas State University
March 9, 2005
Executive Summary
This is a response to the National Institute of Justice’s A Methodology for Evaluating
Geographic Profiling Software: Final Report (Rich & Shively, 2004). The report contains
certain errors, the most critical of which involves suggested performance measurements. Output
accuracy is the single most important criterion for evaluating geographic profiling software. The
report discusses various performance measures; unfortunately only one of these (hit score
percentage/search cost) accurately captures how police investigations actually use geographic
profiling. This response addresses the various problems associated with the other measures.
Geographic profiling evaluation methodologies must respect the limitations and assumptions
underlying geographic profiling, and accurately measure the actual function of a geographic
profile. Geographic profiling assumes: (1) the case involves a series of at least five crimes; (2)
the offender has a single stable anchor point; (3) the offender is using an appropriate hunting
method; and (4) the target backcloth is reasonably uniform. Additionally, for various theoretical
and methodological reasons, not all crime locations in a given series can be used in the analysis.
The most appropriate measure of geographic profiling performance is “hit score
percentage/search cost.” It is the ratio of the area searched (following the geographic profiling
prioritization) before the offender’s base is found, to the total hunting area; the smaller this ratio,
the better the geoprofile’s focus. There are no intrinsic disadvantages to this measure.
The other evaluation measures discussed in the NIJ report are all linked to the problematic
“error distance.” “Top profile area” is the ratio of the total area of the top profile region (which
is not defined) to the total search area. It is not a measure by itself. “Profile error distance” is
the distance from the offender’s base to the nearest point in the top profile region (undefined).
“Profile accuracy” indicates whether the offender’s base is within the “top profile area”
(undefined); it fundamentally misrepresents the prioritization nature of geographic profiling.
“Error distance” is the distance from the offender’s actual to predicted base of operations.
While it is easily applied to centrographic measures, the complex probability surfaces produced
by geographic profiling software must be reduced to a single (usually the highest) point. Several
researchers have unfortunately adopted this technique because of its simplicity. There are three
major problems with error distance. First, it is linear, while the actual error is nonlinear. Area,
rather than distance, is the relevant and required measure. Population (and therefore suspects)
increases with area size, which is a function of the square of the radius (error distance). The
second problem with error distance is that it is not a standardized measure because of its
sensitivity to scale. The third and most serious analytic problem with error distance is that it fails
to capture how geographic profiling software actually works. Criminal hunting algorithms
produce probability surfaces that outline an optimal search strategy. As an offender’s search is
rarely uniformly concentric, simplifying a geoprofile to a single point from which to base an
error distance is invalid. The use of error distance ignores most of the output from geographic
profiling software and undermines the very mechanics of how the process functions.
A more comprehensive approach to evaluating geographic profiling as an investigative
methodology needs to consider applicability and utility, as well as performance. Applicability
refers to how often geographic profiling is an appropriate investigative methodology. Utility
refers to how useful or helpful geographic profiling is in a police investigation.
To evaluate geographic profiling properly requires analyzing only those cases and crimes
appropriate for the technique, and measuring performance by mathematically sound methods.
Hit score percentage/search cost is the only measure that meets NIJ’s standard of a “fair and
rigorous methodology for evaluating geographic profiling software.”
Introduction
In January 2005, the National Institute of Justice (NIJ) released A Methodology for
Evaluating Geographic Profiling Software: Final Report (Rich & Shively, 2004). While the
intent of this document is laudable, it is necessary to respond to certain significant errors that are
contained in the report. Some of these may be the result of the advisory expert panel not
including professional geographic profilers (defined as police personnel whose full-time function
involves geographic profiling), “customers” of geographic profiling (police investigators), or
developers of geographic profiling software. A crime analyst (trained in geographic profiling
analysis for property crime) was the sole law enforcement practitioner on the advisory panel.
The most critical error in the NIJ report involves suggested performance measurements.
The expert panel correctly concluded that output accuracy – “the extent to which each software
application accurately predicts the offender’s ‘base of operations’” (p. 14) – is the single most
important criterion for evaluating geographic profiling software (p. 15). The report discusses
various performance measures, providing short definitions, advantages, and disadvantages (p.
16). Only one of these measures (hit score percentage/search cost), however, accurately captures
how police investigations actually use geographic profiling. This response addresses the various
problems associated with the other measures.
Background
Geographic profiling is a criminal investigative methodology that analyzes the locations of a
connected crime series to determine the most probable area of offender residence. It is primarily
used as a suspect prioritization and information management tool (Rossmo, 1992a, 2000).
Geographic profiling was developed at Simon Fraser University’s School of Criminology, and
first implemented in a law enforcement agency, the Vancouver Police Department, in 1995.1
Geographic profiling embraces a theory-based framework rooted in environmental
criminology. Crime pattern (Brantingham & Brantingham, 1981, 1984, 1993), routine activity
(Cohen & Felson, 1979; Felson, 2002), and rational choice (Clarke & Felson, 1993; Cornish &
Clarke, 1986) theories provide the major foundations. While there are several techniques used
by geographic profilers, the main tool is the Rigel software program, built in 1996 around the
Criminal Geographic Targeting (CGT) algorithm developed at SFU in 1991.
After discussions in the mid-1990s with senior police executives and managers of the
Vancouver Police Department (VPD) and the Royal Canadian Mounted Police (RCMP), it was
concluded that several components would be necessary for the successful implementation of
geographic profiling within the policing profession. These included:
 creating personnel selection, training, and testing standards;
 following mentoring and monitoring practices;
 developing usable and functional software;
 establishing case policies and procedures;
 identifying supporting investigative strategies;
1 Examples of the use of spatial statistics for investigative support can be identified in the
literature as far back as 1977. During the Hillside Stranglers investigation, the Los Angeles Police
Department analyzed where the victims were abducted, their bodies dumped, and the distances between
these locations, correctly identifying the area of offender residence (Gates & Shah, 1992). None of these
isolated applications, however, led to sustained police use or organizational program implementation.
 building awareness and knowledge in the customer (police investigator) community; and
 committing to evaluation, research, and improvement.
Over the course of the next few years these components were developed, first for major
crime investigation, and then for property crime investigation. Personnel from various
international police agencies were trained in geographic profiling. Their agencies signed
memoranda of understanding agreeing to follow the established protocols, and to assist other
police agencies needing investigative support. Training standards for geographic profilers were
eventually adopted by the International Criminal Investigative Analysis Fellowship (ICIAF), an
independent professional organization first started by the Federal Bureau of Investigation (FBI)
in the 1980s.
For a geoprofile to be more than just a map, it must be integrated with specific strategies
investigators can use. Examples of strategies identified for geographic profiling include: (1)
suspect and tip prioritization; (2) database searches (e.g., police information systems, sex
offender registries, motor vehicle registrations, etc.); (3) patrol saturation and surveillance; (4)
neighborhood canvasses; (5) information mail outs; and (6) DNA dragnets. The level of
resources required by these strategies is directly related to the size of the geographic area in
which they are conducted.
While not used by professional geographic profilers, there are two derivative geographic
profiling tools also mentioned in the NIJ report: NIJ’s own CrimeStat JTC (journey-to-crime)
module; and The University of Liverpool’s Dragnet. Both of these systems were developed in
1998. Neither is a commercial product, and training in their use, beyond software instruction
manuals, is currently unavailable. Little is known about a fourth geographic profiling software
program, Predator, first mentioned in 1998 on Maurice Godwin’s investigative psychology
website.
Geographic Profiling Evaluation
There are three methods for testing the efficacy of geographic profiling software. The first
uses Monte Carlo simulation techniques. These test the expected performance of the software on
various point patterns representative of serial crime sites. The major advantage of this approach
is the ability to generate large numbers of data cases (e.g., 10,000). The major disadvantage is
the likelihood the site generation algorithm’s underlying assumptions do not accurately reflect
the geographic patterns of all serial crime cases. In addition, the extra information
associated with an actual case that can help refine a geoprofile is not present.
The second and most common method of evaluating geographic profiling software
performance involves examining solved cases. This technique has been used by Rossmo (1995a,
2000), Canter, Coffey, Huntley, and Missen (2000), Levine (2002), Snook, Taylor, and Bennell
(2004), and Paulsen (2004). The major advantage of research using historical (cold) cases is that
with sufficient effort a reasonably sized sample of cases can be collected. Disadvantages include
sampling bias problems and the need for extensive data review.
The third method tracks geographic profiling performance in unsolved criminal
investigations. This approach is the best of the three as it measures actual – not simulated –
performance under field conditions (Rossmo, 2001). It also serves as a blind test as the “answer”
is not known at the time of the analysis. Monitoring actual case performance is slow, however,
as it is necessary for a case to be solved before it can be included in the data sample.
Every trained geographic profiler is required to keep a case file that records the details of
their work. The log includes fields for case number, sequential number, date, crime type, city,
region, law enforcement agency, investigator, number of crimes, number of locations, type of
analysis, report file name, case status, and result (when solved). This file has both administrative
and research purposes. It was encouraging to see the NIJ report recommend the use of logs and
journals by individuals involved in geographic profiling. However, considering how much there
is to learn with any new police technology (especially in regards to investigator utility versus
software performance), it seems more prudent for all users, and not just a sample, to keep
detailed records.
Geographic Profiling Evaluation Methodology
Evaluation Premises
NIJ’s purpose was “to develop a fair and rigorous methodology for evaluating geographic
profiling software” (p. 4), and their report identifies law enforcement officials as the key
audience for the evaluation. With this in mind, the following premises are used as the basis for
the discussion in this response.
Any geographic profiling evaluation methodology should:
 follow the limitations and assumptions underlying geographic profiling;
 analyze exactly what the geographic profiling software produces, and not a simplification
or generalization of its output;
 measure, as accurately as reasonably possible, the actual function of a geographic profile;
 use the highest level measurements possible (i.e., ratio/interval/ordinal/nominal); and
 be based on validity and reliability concerns (and not on tangential factors such as “it is
easier,” “it has been done that way before,” or “the software has limitations”).
It is tempting, in the effort to increase a study’s sample size, to collect cases from large
databases derived from records management systems (RMS). However, if the details of the
crimes are overlooked, inappropriate series will be included: GIGO – garbage in, garbage out.
Wilpen Gorr, Michael Maltz, and John Markovic have warned us of the importance of data
integrity and specificity issues. “You really need to know the capacities and limitations of this
less then perfect [crime] data before you dump it into a model” (John Markovic, International
Association of Chiefs of Police, NIJ CrimeMap listserve, January 31, 2005).
To prepare a geographic profile properly involves first making sure the case does not violate
any underlying assumptions. Furthermore, only those crime locations in the series that meet
certain criteria can be used in the analysis. This is one of the reasons why a geoprofile requires
anywhere from half a day for a property crime case to up to two weeks for a serial murder case.
A significant portion of the geographic profiling training program is spent learning to understand
these issues so the methodology is not improperly applied. These complexities are why testing,
monitoring and mentoring, and review exist.
Geographic Profiling Assumptions
Any algorithm or mathematical function is only a model of the real world. The
appropriateness and applicability of weather forecasting techniques, multiple linear regression,
the spatial mean, or horserace odds are all premised on various assumptions. If those
assumptions are violated, or if the processes of interest are not accurately replicated, the model
has little value. Using atheoretical algorithms for police problems is tantamount to fast food
crime analysis.
There are four major theoretical and methodological assumptions required for geographic
profiling (Rossmo, 2000):
1. The case involves a series of at least five crimes, committed by the same offender. The
series should be relatively complete, and any missing crimes should not be spatially
biased (such as might occur with a non-reporting police jurisdiction).
2. The offender has a single stable anchor point2 over the time period of the crimes.
3. The offender is using an appropriate hunting method.
4. The target backcloth is reasonably uniform.
Geographic profiling is fundamentally a probabilistic form of point pattern analysis. Every
additional point (i.e., offense location) in a crime series adds information, and results in greater
precision. A minimum of five crime locations is necessary for stable pattern detection and an
acceptable level of investigative focus; the mean in operational cases has been 14 (Rossmo,
2000, 2001). Monte Carlo testing shows with only three crimes the expected hit score
percentage (defined below) is approximately 25%. By comparison, the expected search area
drops to 5% with 10 crimes.3 The resolution of any method will be poor if tested on series of
only a few crimes.
The NIJ evaluation methodology recommends analyzing cases with as low as three crimes in
the series. While there may be some research interest in studying performance for small-number
crime series, the report is supposed to lay out guidelines for evaluation methodologies. The
document seems at times to be confused as to its role. Research and evaluation are separate
processes. At a minimum, any research results should be reported separately, and it should be
made clear they do not represent operational geographic profiling performance.
For evaluation purposes, it is inappropriate to include cases that fall outside the
recommended operational parameters. As small-number crime series are easier to obtain than
large-number crime series, there is a risk they will inappropriately drive the findings. For
example, the distribution for the number of crimes in Paulsen’s (2004) analysis was heavily
skewed to small-number series. Of 150 cases, only 37 (25%) meet the minimum specified
requirement in geographic profiling of 5 crime locations – and 22 of those were on the borderline
(6-7 crimes). Only 15 cases (10%) involved more than 7 crimes.
If the offender is nomadic or transient then there may not be a residence to locate. If the
offender is constantly moving residence, then multiple anchor points could be involved in a
single crime series, confusing the analysis, and possibly resulting in a violation of the first
assumption. It should also be remembered that what constitutes a residence for a street criminal
might vary from middle-class expectations. Two geoprofiled burglary cases illustrate this point.
In the first, the “home” for a group of transient gypsies was a motel where they temporarily
stayed while they committed their crimes. In the second, the homeless offender’s base was a
bush in a vacant lot where he slept at night. It is important in geographic profiling to consider
the details of the case, the timing of the crimes, and the nature of the area where the peak
geoprofile is located. Like all investigative tools, it should be used intelligently. See the
discussion below regarding applicability, performance, and utility, in the section General
Methodological Comments.
2 The offender’s anchor point appeared to be their residence in the majority (about 85%) of the
cases analyzed by professional geographic profilers. Other examples of anchor points include the
offender’s workplace or (for students) school, past residence (if the move was recent), and surrogate
residences (homes of family members and friends where the offender actually lives).
3 The benchmark is 50%, what you would expect from a uniform (i.e., non-prioritized) search.
Offender hunting method is defined as the search for, and attack on, a victim or target
(Rossmo, 1997, 2000). Geographic profiling is inappropriate for certain search and attack
methods. The residence of a poacher (an offender commuting into an area to commit crimes) by
definition will not be within the hunting area of the crimes (though he may be using some other
anchor point, such as his workplace or a “fishing hole”). The NIJ methodology suggests that
“commuters” be included in any evaluations. How that is to be done, however, is not made
clear, as the report acknowledges CrimeStat and Dragnet are unable to handle this type of
offender (nor can Rigel, as this is an assumption violation). Gorr (2004) presents some
interesting and useful ideas for expanding geographic profiling systems to such cases and these
should be explored (perhaps with the addition of a directionality component). As discussed
above, it is important to distinguish research from evaluation, and report the results of each
separately.
While most burglars identify targets during their routine activities in areas of familiarity,
others watch for news of estate sales or use accomplices who read luggage nametags at airports.
Stalkers (offenders who do not attack victims upon encounter) are also problematic. In one case
example, the offenders in a series of armed robberies of elderly victims in Los Angeles went to
hospitals and shopping malls, selected suitable victims, and then followed them home where the
robbery occurred. In this situation, the victims were choosing the crime sites – not the offenders.
A geographic profile based on the robbery locations would therefore be wrong. Instead, the
victim encounter sites (the shopping malls and hospitals) should be used because these are the
locations the robbers had control over.
All criminal profiling involves the inference of offender characteristics from offense
characteristics. But this assumes freedom of offender choice; the more constrained the behavior,
the less valid the inferences. A uniform (isotropic) target backcloth is thus a necessary
assumption for geographic profiling. Certain offenders hunt victims whose spatial opportunity
structure is patchy. A predator attacking prostitutes is one such example; he has little freedom of
spatial choice as he must hunt in red light districts where potential victims are located.
A final caution is necessary regarding what a geographic profile actually produces. It shows
the most likely area for an offender’s anchor point (“base of operations”). While this is most
often their residence, in other cases it may be their former residence, workplace, a freeway
access point, or other significant activity site. This underlines why it is inappropriate to analyze
cases without consideration of the underlying theory and environmental background. For
example, the geoprofile in a series of bank robberies that occurred from 12:30 pm to 1:00 pm fell
onto a commercially zoned section of the city. This led to the correct inference the offender was
committing the crimes during his lunch break; while the geoprofile said nothing about where he
lived, it accurately pointed to where he worked.
For testing purposes, the use of numerous anchor points as prediction sites can quickly
produce tautological results. Absent significant a priori information (as in the above bank
robbery case), it represents multiple “kicks at the can,” and performance results need to be
statistically adjusted accordingly.
Crime Site Selection
The selection of which crimes to use in a geoprofile is an important process called scenario
creation. For various theoretical and methodological reasons, not all crime locations4 should be
used in the analysis. Independence of site selection is one issue. An arsonist walks down the
street, sets fire to a dumpster, then walks around the corner and sets fire to another dumpster.
We have two crimes, two addresses, two times, and potentially two victims. However, for
geographic profiling purposes we would eliminate the second crime location, as it is not
independent of the first location.
A second issue is spatial displacement. If the offender moves his or her area of operations
because of police saturation patrolling, for instance, it would likely be inappropriate to combine
post-displacement with pre-displacement crime locations.
NIJ Performance Measures
The NIJ report lists several geographic profiling performance measures in a table (p. 16),
explaining “Individual panelists also presented what they felt was their ‘favorite’ output
measure” (p. 9). Reviewing the transcript in the report’s appendix shows much debate amongst
the members of the expert panel on this particular topic. Employing multiple performance
definitions, unless they all reflect actual use conditions, is confusing and disadvantageous.
Geographic profiling is an information management strategy, primarily used for suspect and area
prioritization. The proper way to test its efficacy is to employ a method that simulates what
happens in the real world. The hit score percentage (or search cost) is a measure that
satisfactorily accomplishes this goal. Their “popularity” notwithstanding, the other measures
discussed in the report do not. The problems with these measures are discussed below.
Hit Score/Search Cost
The NIJ report incorrectly states “there are no existing standard for measuring accuracy” (p.
16). The hit score percentage (also referred to as search cost) is such a standard, one that
accurately and correctly captures geographic profiling performance. It was established in the
early 1990s (Rossmo, 1993b), and is used by all professional geographic profilers to quantify
geoprofile accuracy. The report lists two “disadvantages” associated with the hit score
percentage, both of which are carefully scrutinized below. However, let us first closely examine
what hit score percentage measures.
The hit score percentage is a measure of geographic profiling search efficiency. It is defined
as the ratio of the area searched (following the geographic profiling prioritization) before the
offender’s base is found, to the total hunting area; the smaller this ratio, the better the
geoprofile’s focus. It is calculated by first adding the number of pixels with a hit score
(likelihood value) higher than that of the pixel containing the offender’s residence to half the
number of “ties” (pixels with the same hit score), and then dividing by the total number of pixels
in the hunting area (40,000 for Rigel).
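As a minimal sketch of this calculation, the following assumes the geoprofile is available as a two-dimensional array of per-pixel hit scores and that the pixel containing the offender's residence is known; the function name and grid are illustrative, not Rigel's actual implementation.

```python
import numpy as np

def hit_score_percentage(scores, offender_pixel):
    """Search cost: fraction of the hunting area that must be searched,
    in descending hit-score order, before the offender's pixel is reached.
    Pixels scoring higher than the offender's count in full; ties
    (including the offender's own pixel) count half."""
    offender_score = scores[offender_pixel]
    higher = np.count_nonzero(scores > offender_score)
    ties = np.count_nonzero(scores == offender_score)
    return (higher + 0.5 * ties) / scores.size
```

Note that a completely flat surface, where every pixel ties, yields 0.5, matching the 50% non-prioritized benchmark in footnote 3.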
The hunting area is defined as the rectangular zone oriented along the street grid containing
all crime locations. These locations may be victim encounter points, murder scenes, body
dump sites, or some combination thereof. The term hunting area is therefore used broadly in
the sense of the geographic region within which the offender chose – after some form of
search or hunting process – a series of places for criminal action. Locations unknown to
authorities, including those where the offender searched for victims or dump sites but was
unsuccessful or chose not to act, are obviously not included…. A priori, we do not know
where this hunting area is – we only know the locations of the reported, and connected,
crimes. Technically, a geoprofile stretches to infinity; the hunting area is only a
standardized method of displaying results so that important information is shown, and
unimportant information is not. (Rossmo, 2000)
4 There are other issues for multiple location crimes but these more typically occur in violent
sexual offenses. A rape, for example, may involve separate encounter, attack, assault, and victim release
sites (Rossmo, Davies, & Patrick, 2004).
In the nomenclature of point pattern analysis, the crimes are events and the hunting area is the
study region (Gatrell, Bailey, Diggle, & Rowlingson, 1996).
Two factors influence the hit score. First, the resolution of the pixel grid slightly influences
precision as more pixels result in more decimal places (e.g., 4.87% vs. 4.9%). Second, the size
of the hunting area determines the denominator of this ratio. For example, a larger hunting area
produces a larger denominator, therefore reducing the hit score percentage or search cost. Rigel
uses a minimal bounding rectangle for its hunting area, with a small addition of a guard area for
edge effects.5 Dragnet also employs a rectangular search area, defined by the offenses but
magnified by 20%, displayed in a grid of 3,300 pixels (Canter et al., 2000).
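The guard-area rule in footnote 5 can be sketched as follows. Reading "half the mean x and y interpoint distances" as half the mean over all pairwise coordinate differences is an assumption on my part, and the function name is hypothetical; the exact Rigel rule may differ.

```python
import itertools

def hunting_area(crime_sites):
    """Minimal bounding rectangle of the crime sites, padded on each side
    by half the mean interpoint distance along that axis (one reading of
    the guard-area rule in footnote 5).  Requires at least two sites;
    operational geographic profiling requires at least five crimes."""
    xs = [x for x, _ in crime_sites]
    ys = [y for _, y in crime_sites]
    pairs = list(itertools.combinations(range(len(crime_sites)), 2))
    pad_x = sum(abs(xs[i] - xs[j]) for i, j in pairs) / len(pairs) / 2
    pad_y = sum(abs(ys[i] - ys[j]) for i, j in pairs) / len(pairs) / 2
    return (min(xs) - pad_x, max(xs) + pad_x, min(ys) - pad_y, max(ys) + pad_y)
```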
While the NIJ report is correct in stating this measure is dependent upon how the search area
is defined, this is not a disadvantage for the purposes of comparative evaluation. As long as the
same method is used consistently for all analyses in the comparison, relative performance will be
the same. When you divide one ratio by another with a common denominator, the denominator
is cancelled out. Therefore, the hunting area can be defined as a rectangle, circle, convex hull
polygon, or any other logical shape. As long as the same procedure is consistently used,
numerator comparisons will be independent of its size.
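The common-denominator point can be checked in a few lines: two analyses of the same case keep their relative ranking whatever the size of the shared hunting area. The pixel counts below are invented purely for illustration.

```python
import math

# Pixels searched before the hit for two competing geoprofiles of the
# same case (illustrative numbers), scored under two different
# hunting-area sizes.
def hit_score(searched, total):
    return searched / total

for total in (3_300, 40_000):  # Dragnet-sized vs. Rigel-sized grids
    ratio = hit_score(120, total) / hit_score(300, total)
    # The shared denominator cancels, so the ranking is unchanged.
    assert math.isclose(ratio, 120 / 300)
```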
The second disadvantage listed in the report for the hit score percentage is a claim that it is
“Subject to severe changes in output display based on method of thematically mapping the
output” (p. 16). NIJ Mapping and Analysis for Public Safety (MAPS) personnel were queried to clarify the
meaning of this. It turns out that this is not an inherent disadvantage of the hit score
percentage/search cost measure, but rather is a program-specific problem for NIJ’s CrimeStat
software. Apparently, its output, when loaded into a geographic information system (GIS), can
be dramatically altered depending on the thematic classification used. Nevertheless, as
CrimeStat produces a “hit score map” (p. 5) this issue should be solvable (otherwise, the
software could not prioritize a list of suspects or areas – the main function of geographic
profiling).
The equivalent of hit score percentage or search cost can also be calculated for first-order
centrographic statistics (e.g., spatial mean, spatial median, “eyeball” estimates, etc.). A search
process from a single point estimate involves an outward spiral. The square of the error distance
divided by the total area covered by the crimes provides the appropriate search performance
measure. If this is being calculated on a grid, half the “ties” (the number of pixels located
exactly the error distance away) must be added to the numerator.
5 Rigel calculates the boundaries of the hunting area by adding half the mean x and y interpoint
distances to the most eastern and western, and northern and southern crime sites, respectively (Boots &
Getis, 1988; Gatrell et al., 1996). The geoprofile is generated within the hunting area.
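The spiral-search equivalent for a single-point prediction described above can be sketched on a grid as follows; the grid dimensions and function name are illustrative assumptions.

```python
import numpy as np

def centrographic_search_cost(predicted, offender, nx, ny):
    """Equivalent search cost for a single-point (centrographic) estimate:
    an outward spiral from the predicted point.  Pixels nearer to the
    prediction than the offender's pixel count in full; pixels at exactly
    the error distance ("ties") count half."""
    ys, xs = np.mgrid[0:ny, 0:nx]
    dist = np.hypot(xs - predicted[0], ys - predicted[1])
    error = np.hypot(offender[0] - predicted[0], offender[1] - predicted[1])
    nearer = np.count_nonzero(dist < error)
    ties = np.count_nonzero(dist == error)
    return (nearer + 0.5 * ties) / (nx * ny)
```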
Previous Evaluations of Geographic Profiling Performance
Three reported studies have examined geographic profiling performance using hit score
percentage or search cost. In research on 13 solved serial murder cases from the United States,
Canada, and Great Britain, involving 15 serial murderers, 178 victims, and 347 crime locations,
Rossmo (1995a) observed the CGT algorithm located the offender’s residence in the top 6% of
the hunting area.
Canter et al. (2000) found similar results in their study of 70 U.S. serial killers using
Dragnet. The mean search cost was 11%; 87% of the offenders’ home bases were found in the
top 25% of the search area, 51% in the top 5%, and 15% in the top 1%.
In 2001, a review was conducted of all now-solved (but initially unsolved) cases analyzed
by full-time police geographic profilers (from the Vancouver Police Department, Royal
Canadian Mounted Police, Ontario Provincial Police, and British National Crime Faculty). The
study combined the 39 hot cases with 31 cold cases, for a total of 70 serial crime cases,
representing 1,426 offenses and 1,726 crime locations (Rossmo, 2001). These involved, in order
of frequency, murder, rape, arson, robbery, and sexual assault. The hot cases came from police
agencies in North America, Europe, and Africa, and occurred from 1991 to 2001.
There was an average of 20 crimes (25 locations), and a median of 14 crimes (17 locations)
per case. The mean hit score percentage was 4.7% (SD = 4.4%), and the median was 3.0%. The
performance on hot cases was about 1% better than for the cold cases, likely the result of the
extra information and analysis time involved.
Top Profile Area
The top profile area is defined as the ratio of the total area of the top profile region to the
total search area. It is used with profile error distance. This is actually not a distinct performance
measure (at least as explained in the report). It forms an integral part of “profile error distance”
and “profile accuracy,” but does not stand alone as a performance measure. Apparently top
profile area is similar to search cost, but the mathematical relationship is unclear. The “top
profile region” is not defined in the NIJ report beyond the vague “the predicted most likely
region containing the base of operations” (p. 15). Is this the top 10% of the area? The top five
square miles? The top profile area is apparently a subjective and crude measure.
Profile Error Distance
This is defined as the distance from the offender’s base to the nearest point in the top profile
region. The report claims it takes advantage of the whole profile, but this is incorrect. It only
takes into account the “top profile area,” which is not defined. The profile error distance cannot
distinguish between offender locations within the “top profile area” – they are either in or out of
this region. Moreover, for offender locations outside the “top profile area,” we are back to the
use of a linear measure, with all the problems associated with error distance.
Profile Accuracy
This is defined as a dichotomous measure of whether the offender’s base is within the “top
profile area,” and provides a simple indication of whether the geoprofile was “correct.” The very
use of such terms as “correct” or “accurate” shows a fundamental misunderstanding of
geographic profiling. This is like saying a golf drive from the tee was “wrong” because it only
landed the ball on the green, and not in the hole.6 Like grenades and slow dancing, close does
matter here. “Top profile area” is also undefined.
Error Distance
Error distance is defined in the NIJ report as the distance from the offender’s actual to
predicted base of operations. While it is easily applied to centrographic measures, the complex
probability surfaces produced by geographic profiling software must be reduced to a single
(usually the highest) point. Several researchers have unfortunately adopted this technique
because of its simplicity (Paulsen, 2004; Levine, 2002; Snook, 2000, 2004; Snook, Canter, &
Bennell, 2002; Snook et al., 2004). However, common usage is not synonymous with correct
usage. There are three major problems with error distance in this context (Rossmo, in press).
While error distance is linear, the actual error is nonlinear. Area, rather than distance, is the
relevant and required measure. Population (and therefore suspects) increases with area size,
which is a function of the square of the radius (error distance) – when distance doubles, all else
being equal, area and population quadruple. Another way of looking at this is to imagine a
database containing 1,000 suspects in a sexual homicide case with crime scene DNA evidence.
How many of these suspects need to be tested before the offender is identified? This is the
appropriate test of a geographic profile’s accuracy.
Table 1 compares error distance (d) and search area (πd²) results for two hypothetical
geographic profiling systems (A and B). Even though system B shows better performance in
terms of error distance (5% less per case on average), it is significantly worse in terms of search
area (20% more per case on average). Generally, the greater the relative standard deviation, the
more problematic the use of error distance. This means the error distance underestimates the
impact of poor performance, favoring geographic profiling systems with higher variability, even
when they do not do as well as systems that are more consistent.
6 If so desired, hit score percentage can be converted into binned results of desired area sizes or
categories of performance (though this reduces a ratio-level measure to a nominal one).
                  GP System A                     GP System B
Case #     Error Distance  Search Area     Error Distance  Search Area
  1              2             12.57             2             12.57
  2              2             12.57             2             12.57
  3              6            113.10             7            153.94
  4              4             50.27             2             12.57
  5              4             50.27             3             28.27
  6              7            153.94             8            201.06
  7              3             28.27             1              3.14
  8              5             78.54             3             28.27
  9              5             78.54             1              3.14
 10              2             12.57             9            254.47
Mean          4.00             59.06          3.80             71.00
SD            1.76             47.46          3.01             94.61
Total        40.00            590.62         38.00            710.00
System B/A ratio: error distance = 0.95; search area = 1.20
Table 1. Error Distance versus Search Area.
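The arithmetic behind Table 1 is straightforward to verify: each search area is simply πd². A short check using the table’s own error distances reproduces the totals and the B/A ratios:

```python
import math

# Error distances per case for the two hypothetical systems in Table 1.
a = [2, 2, 6, 4, 4, 7, 3, 5, 5, 2]
b = [2, 2, 7, 2, 3, 8, 1, 3, 1, 9]

def search_areas(distances):
    # A circular search of radius d covers an area of pi * d**2.
    return [math.pi * d ** 2 for d in distances]

area_a, area_b = search_areas(a), search_areas(b)

print(sum(b) / sum(a))            # 0.95: B looks 5% better by error distance
print(sum(area_b) / sum(area_a))  # ~1.20: B is 20% worse by search area
```

The ten-line check makes the squaring effect explicit: B’s one bad case (d = 9) contributes more search area than its seven good cases combined.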
The second problem with error distance is that it is not a standardized measure. While the
algorithms used in Rigel and Dragnet are calibrated from point pattern statistics, and are
therefore blind to the actual scale of the map, error distance is sensitive to scale. It has been
reported in units ranging from millimeters to kilometers, precluding inter-study comparisons.7
For example, in a burglary series covering 1.5 square miles in Carmel, California, an error
distance of 2.1 miles is not very impressive; but it is in a robbery series covering 1,200 square
miles in greater Los Angeles. Figures 1 and 2 illustrate this problem. Figure 1 displays 12 crime
sites (red dots), and two offender residence scenarios (blue squares), all on a featureless
background. The small black arrow in the upper left of the figure represents a short error
distance of about 0.25 miles (good performance), while the green arrow represents a much less
accurate error distance of about 2 miles (poor performance). Now compare these results to those
from Figure 2. Again, there are 12 crime sites, but this time over a much larger scale. The error
distance represented by the black arrow is 5 miles, while the error distance represented by the
green arrow is 25 miles. Given the provided information (the point pattern), the geoprofile’s
match to the offender’s residence is still very good, yet the error distance does not capture this.
Furthermore, the results from the two cases are not comparable.8 Finally, averaging results in
this manner is particularly problematic because error distances from crime series occurring over
large scales overwhelm error distances from crime series occurring on smaller scales.
7 For example, both Paulsen (2004) and Snook et al. (2004) used the negative exponential
function in CrimeStat for their research. Paulsen reports a mean error distance of 6.176 kilometers, while
Snook et al. report 55.9 millimeters.
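Footnote 8’s suggestion – expressing error distance as a proportion of the crime pattern’s standard distance – can be sketched directly. The coordinates below are hypothetical, chosen only to show that the normalized measure comes out identical at two map scales:

```python
import math

def standard_distance(points):
    """Standard distance of a point pattern: the square root of the mean
    squared distance of the points from their centroid."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)

def normalized_error(error_distance, crime_sites):
    # Scale-free error: a proportion of the pattern's standard distance.
    return error_distance / standard_distance(crime_sites)

# The same relative prediction at two map scales (hypothetical coordinates):
small = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]        # ~1 mile extent
large = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]    # ~10 mile extent

print(normalized_error(0.25, small))  # identical value to the line below
print(normalized_error(2.5, large))   # scale no longer matters
```

This makes the raw measure comparable across studies, though it still inherits error distance’s other problems (linearity and the single-point reduction).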
Figure 1. Crime Sites, GeoProfile, and Error Distance (1 Mile Scale).
8 Expressing the error distance as a proportion of the standard distance or the mean of the sum of
all k-order nearest neighbor distances would be one way around this problem.
Figure 2. Crime Sites, GeoProfile, and Error Distance (10 Mile Scale).
The third and most serious analytic problem with error distance is that it fails to capture how
geographic profiling software actually works. Criminal hunting algorithms produce probability
surfaces that look like colored topographic maps (see Figures 1, 2, 3, or Exhibit 1.1 in the NIJ
report, p. 2). These geoprofiles outline the optimal search strategy. As an offender’s search is
rarely uniformly concentric, simplifying a geoprofile to a single point from which to base an
error distance is invalid. The use of error distance ignores most of the output from geographic
profiling software and in effect undermines the very mechanics of how the process functions.
Error distance simply does not accurately measure how such systems operate in actual practice.
Figure 3 shows an example of this problem. It presents a geoprofile for a series of 11 sexual
assaults that occurred in Mississauga, Ontario. Nine locations were used in the analysis. It was
prepared by the Vancouver Police Department’s Geographic Profiling Section at the request of
the Peel Regional Police prior to the case being solved. The top 4% of the geoprofile is
displayed. The crime sites are indicated by red circles, and the offender’s residence by a blue
square. The geoprofile was used, amongst other criteria (e.g., uncorroborated alibi, similarity to
a composite sketch, etc.), to help investigators prioritize an initial list of 312 suspects for DNA
testing. The individual ultimately convicted of the crime series was ranked sixth out of 312
based on the geoprofile alone (he actually scored higher because of the other criteria).
Figure 3. Sexual Assault GeoProfile.
The offender resided in the top 2.2% of the hunting area of the original geoprofile, and was
in the top 1.9% of the list of suspects.9 But using error distance (0.50 miles, shown as an arrow
in Figure 3) places the offender in the top 53.8% of the hunting area – a proportion almost 25
times as large. Error distance does not even closely measure the performance of this particular
geoprofile.
9 It is worth mentioning that the number of suspects generated in a criminal investigation is not
strictly a function of the geographic area involved. Equally important factors include the type of crime,
number of offenses, length of investigation, volume of police resources, number of jurisdictions, and
levels of public fear and media interest. A rape series covering five square miles is likely to have
more suspects than a burglary series covering 10 square miles.
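The “almost 25 times” figure can be checked by converting the error distance into the concentric search area it implies. The hunting area is not stated in the text, so it is back-solved here from the reported 53.8% – the point of the sketch is only the arithmetic relationship, not the actual case geometry:

```python
import math

def implied_search_fraction(error_distance, hunting_area):
    # Treat the single-point error distance as the radius of a
    # concentric search around the predicted point.
    return min(math.pi * error_distance ** 2 / hunting_area, 1.0)

# Hypothetical hunting area back-solved from the reported 53.8%:
hunting_area = math.pi * 0.50 ** 2 / 0.538   # ~1.46 square miles
f = implied_search_fraction(0.50, hunting_area)
print(f)           # 0.538 by construction
print(f / 0.022)   # ~24.5 – almost 25 times the geoprofile's 2.2% hit score
```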
This same problem with error distance can be seen with the geographic profile displayed in
Exhibit 1.1 (p. 2) of the NIJ report. A careful inspection of the jeopardy surface and the
geoprofile in the inset shows two separate peak areas. The highest point of the geoprofile is in
the upper peak area, but the offender’s residence was in the lower peak (his probation office was
in the upper peak). The error distance in this case does not represent the geoprofile’s search
efficiency. Generally, the extent to which the peak area of any geoprofile deviates in shape from
a perfect circle is related to the degree that the spatial distribution of crime sites is distorted from
the concentric. This is typically the result of an asymmetrical offender activity space and a less
than uniform (anisotropic) target backcloth.
General Methodological Comments
The report contains discussion of both evaluation methodology and research initiatives for
geographic profiling. There needs to be a clear distinction between the two – though having said
that, there are several excellent research ideas in the report. For example, varying the values of
different input parameters and conducting sensitivity analyses (p. 8) – provided good datasets are
used – might fine-tune applications to environmental conditions. Deleting non-inhabitable areas from
the analysis would produce more accurate search cost estimates. Randomly dropping crimes (p.
17) from series could help determine robustness, though crimes should be eliminated both
randomly, to simulate non-reporting or linkage failures, and in reverse chronological order, to
test performance over series progression (Brantingham & Brantingham, 1981).
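The two deletion schemes described above can be wired into a simple test harness. The `evaluate` callable and the dummy scorer below are placeholders – any geographic profiling system that returns a hit score percentage could be dropped in:

```python
import random

def robustness_trials(crimes, evaluate, n_random_trials=100, drop=1, seed=1):
    """Run both deletion schemes: random drops (simulating non-reporting
    or linkage failure) and chronological truncation (series progression).
    `crimes` are assumed sorted in time order; `evaluate` maps a crime
    list to a hit score percentage."""
    rng = random.Random(seed)
    random_scores = [
        evaluate(rng.sample(crimes, len(crimes) - drop))
        for _ in range(n_random_trials)
    ]
    # Performance after only the first k crimes (minimum series of five).
    progression_scores = [evaluate(crimes[:k]) for k in range(5, len(crimes) + 1)]
    return random_scores, progression_scores

# Placeholder scorer: pretends more crimes yield a tighter (smaller) hit score.
def dummy_hit_score(kept):
    return 1.0 / (1 + len(kept))

crimes = list(range(12))   # stand-in crime records, chronological
rand_scores, prog_scores = robustness_trials(crimes, dummy_hit_score)
print(len(rand_scores), len(prog_scores))   # 100 8
```

The progression scores start at five crimes, matching the minimum series length assumed by geographic profiling.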
A caveat is required here. There are three desirable qualities in model building: (1) realism;
(2) generality; and (3) precision (Levins, 1966; see also Winterhalder & Smith, 1992, p. 13).
Tradeoffs are usually required as improvements in one model quality generally result in the
deterioration of one or both of the other qualities. Greater precision, for example, might well
require a loss of robustness. Striking the optimal balance will be an issue for any efforts to
improve geographic profiling models.
CrimeStat allows the user to select from multiple functions (five standard options, plus any
specifically calibrated functions). Only one of these should be chosen a priori for evaluation purposes.
Communication with some of CrimeStat’s 1,500 law enforcement users (p. 11) should help NIJ
narrow down what works best. Testing on more than one function is equivalent to allowing for
repeated tries, and performance results would have to be statistically adjusted downwards.
Any testing of CrimeStat’s jurisdiction and crime-specific calibrated function obviously
requires separate learning and testing datasets.10
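A minimal hold-out split makes the point concrete. The case records and split fraction here are hypothetical, but the principle – calibrate on one subset, test on the disjoint other, and never reuse the learning data – is the one footnote 10 illustrates by counter-example:

```python
import random

def split_cases(cases, test_fraction=0.3, seed=7):
    """Partition solved cases into disjoint learning and testing sets
    before calibrating any jurisdiction- or crime-specific function."""
    rng = random.Random(seed)
    shuffled = list(cases)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

learning, testing = split_cases(range(50))
print(len(learning), len(testing))   # 35 15
print(set(learning) & set(testing))  # set() – no leakage between the sets
```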
Evaluation of the costs associated with a geographic profiling system (e.g., CrimeStat)
should include all federal funding grants and other public taxpayers’ dollars.
Applicability, Performance, and Utility
The evaluation of geographic profiling as an investigative methodology is not addressed in
the NIJ report, which focused strictly on associated software performance. Only a part of the
10 See Canter and Gregory (1994), for an egregious example of this type of error. After
calibrating the parameters for a multivariable geographic profiling system for stranger rape, the model
was tested on its learning data. Not surprisingly, it achieved a 100% “success” rate.
process of geographic profiling (and not the most important) is therefore discussed. A more
comprehensive approach needs to consider applicability, performance, and utility (APU).
Applicability refers to how often geographic profiling is an appropriate investigative
methodology. It would be beneficial to know the frequency of serial crime cases that meet the
criteria and assumptions for geographic profiling, and the ability of law enforcement agencies to
recognize such patterns. Utility refers to how useful or helpful geographic profiling is in a police
investigation. Geographic profiling only plays a supporting role as it cannot directly solve a
crime – only a witness, confession, or physical evidence can do that (Klockars & Mastrofski,
1991). How it plays that role, how often, how well, and how it compares to other available
offender location techniques are important research questions.
Let us consider an APU evaluation in the context of the anchor point/base/residence
discussion. We will use the example of a homeless sex offender who has molested several
women. This case does not fit the criteria for geographic profiling because there is no anchor
point to geoprofile. But what if the offender lived in his car, which was always parked in the
same location? Now we have applicability – a stable offender anchor point. Next, let us assume
the geoprofile locates the offender’s car in the top 4% of the five square mile hunting area, so we
have reasonably good performance. However, this does not mean we have utility. There is no
relevant database for investigators to search in this situation. On the other hand, if the victims
reported the molester was unkempt in appearance and accurately described his vehicle and
contents, an astute detective may conclude the offender is living out of his car and direct patrol
search efforts accordingly. Now we have utility.
Alternatively, a geoprofile could accurately predict the residence of a rapist, but lose utility
because the police jurisdiction did not keep its sex offender registry updated. An argument could
be made in this situation for potential utility, as the problem area is the record keeping system,
and other law enforcement agencies may have better maintenance. Finally, for an APU
assessment to have meaning, it must be considered within a cost/benefit framework and
compared to similar analyses for other investigative techniques (e.g., fingerprinting, canvassing,
DNA, tip lines, behavioral profiling, etc.).
Conclusion
We must be absolutely certain that the way in which we think we are empirically evaluating
our research question is the way in which we actually are evaluating the question. That is,
the statistical model must maximally correspond to the theoretical model, and the degree to
which these diverge correspondingly threatens the validity of the empirically based
inferences we can draw … (Curran & Willoughby, 2003, pp. 581-582)
Replication is an integral and critical part of the scientific process. However, it must be
done properly. In the case of geographic profiling, this involves analyzing those cases and
crimes, and only those cases and crimes, appropriate for the technique, and measuring
performance by mathematically sound methods. The hit score percentage/search cost measure
accurately captures how these systems work in the real world, while possessing no intrinsic
disadvantages. Error distance, top profile area, and their other associated measures do not reflect
actual performance, and do not meet NIJ’s standard of a “fair and rigorous methodology for
evaluating geographic profiling software.”
References
Boots, B. N., & Getis, A. (1988). Point pattern analysis. Sage university paper series on
scientific geography, 8. Beverly Hills: Sage.
Brantingham, P. J., & Brantingham, P. L. (Eds.). (1981). Environmental criminology. Beverly
Hills: Sage.
Brantingham, P. J., & Brantingham, P. L. (1984). Patterns in crime. New York: Macmillan.
Brantingham, P. L., & Brantingham, P. J. (1981). Notes on the geometry of crime. In P. J.
Brantingham & P. L. Brantingham (Eds.), Environmental criminology (pp. 27-54). Beverly
Hills: Sage.
Brantingham, P. L., & Brantingham, P. J. (1993). Environment, routine and situation: Toward a
pattern theory of crime. In R. V. Clarke & M. Felson (Eds.), Routine activity and rational
choice (pp. 259-294). New Brunswick, NJ: Transaction.
Canter, D. V., Coffey, T., Huntley, M., & Missen, C. (2000). Predicting serial killers’ home base
using a decision support system. Journal of Quantitative Criminology, 16, 457-478.
Canter, D. V., & Gregory, A. (1994). Identifying the residential location of rapists. Journal of the
Forensic Science Society, 34, 169-175.
Canter, D. V., & Larkin, P. (1993). The environmental range of serial rapists. Journal of
Environmental Psychology, 13, 63-69.
Canter, D. V., & Snook, B. (1999, December). Modelling the home location of serial offenders.
Paper presented at the meeting of the Crime Mapping Research Center, Orlando, FL.
Clarke, R. V., & Felson, M. (Eds.). (1993). Routine activity and rational choice. New
Brunswick, NJ: Transaction.
Cohen, L., & Felson, M. (1979). Social change and crime rate trends: A routine activity
approach. American Sociological Review, 44, 588-608.
Cornish, D. B., & Clarke, R. V. (Eds.). (1986). The reasoning criminal: Rational choice
perspectives on offending. New York: Springer-Verlag.
Curran, P. J., & Willoughby, M. T. (2003). Implications of latent trajectory models for the study
of developmental psychopathology. Development and Psychopathology, 15, 581-612.
Felson, M. (2002). Crime and everyday life (3rd ed.). Thousand Oaks, CA: Sage.
Gates, D. F., & Shah, D. K. (1992). Chief. New York: Bantam Books.
Gatrell, A. C., Bailey, T. C., Diggle, P. J., & Rowlingson, B. S. (1996). Spatial point pattern
analysis and its application in geographical epidemiology. Transactions of the Institute of
British Geographers, 21, 256-274.
Gorr, W. L. (2004). Framework for validating geographic profiling using samples of solved
serial crimes. Unpublished manuscript, Carnegie Mellon University, H. J. Heinz III School
of Public Policy and Management, Pittsburgh, PA.
Harries, K. (1999). Mapping crime: Principle and practice (NCJ 178919). Washington, DC:
National Institute of Justice.
Klockars, C. B., & Mastrofski, S. D. (Eds.). (1991). Thinking about police: Contemporary
readings (2nd ed.). New York: McGraw-Hill.
Levine, N. (2002). CrimeStat: A spatial statistics program for the analysis of crime incident
locations (v. 2.0). Washington, DC: National Institute of Justice.
Levins, R. (1966). The strategy of model-building in population biology. American Scientist, 54,
421-431.
MacKay, R. E. (1999, December). Geographic profiling: A new tool for law enforcement. The
Police Chief, pp. 51-59.
Paulsen, D. J. (2004, March). Geographic profiling: Hype or hope? – Preliminary results into the
accuracy of geographic profiling software. Paper presented at the UK Crime Mapping
Conference, London, UK.
Paulsen, D. J., & Robinson, M. B. (2004). Spatial aspects of crime: Theory and practice. Boston:
Allyn and Bacon.
Rich, T., & Shively, M. (2004). A methodology for evaluating geographic profiling software.
Cambridge, MA: Abt Associates.
Rossmo, D. K. (1992a, April). Targeting victims: Serial killers and the urban environment.
Paper presented at the First International Conference on Serial and Mass Murder: Theory,
Research and Policy, Windsor, ON.
Rossmo, D. K. (1992b, November). Target patterns of serial murderers: A methodological
model. Paper presented at the meeting of the American Society of Criminology, New
Orleans, LA.
Rossmo, D. K. (1993a). Geographic profiling: Locating serial killers. In D. Zahm & P. F.
Cromwell (Eds.), Proceedings of the International Seminar on Environmental Criminology
and Crime Analysis (pp. 14-29). Coral Gables, FL: Florida Criminal Justice Executive
Institute.
Rossmo, D. K. (1993b). Multivariate spatial profiles as a tool in crime investigation. In C. R.
Block & M. Dabdoub (Eds.), Workshop on crime analysis through computer mapping;
Proceedings: 1993 (pp. 89-126). Chicago: Illinois Criminal Justice Information Authority.
Rossmo, D. K. (1993c). Target patterns of serial murderers: A methodological model. American
Journal of Criminal Justice, 17(2), 1-21.
Rossmo, D. K. (1995a). Geographic profiling: Target patterns of serial murderers. Unpublished
doctoral dissertation, Simon Fraser University, Burnaby, BC.
Rossmo, D. K. (1995b). Place, space, and police investigations: Hunting serial violent criminals.
In J. E. Eck & D. A. Weisburd (Eds.), Crime and place: Crime prevention studies, Vol. 4
(pp. 217-235). Monsey, NY: Criminal Justice Press.
Rossmo, D. K. (1997). Geographic profiling. In J. L. Jackson & D. A. Bekerian (Eds.), Offender
profiling: Theory, research and practice (pp. 159-175). Chichester: John Wiley & Sons.
Rossmo, D. K. (2000). Geographic profiling. Boca Raton, FL: CRC Press.
Rossmo, D. K. (2001, November). Evaluation of geographic profiling search strategies. Paper
presented at the meeting of the American Society of Criminology, Atlanta, GA.
Rossmo, D. K. (in press). Geographic heuristics or shortcuts to failure?: Response to Snook et al.
Applied Cognitive Psychology.
Rossmo, D. K., Davies, A., & Patrick, M. (2004). Exploring the geo-demographic and distance
relationships between stranger rapists and their offences (Special Interest Series: Paper 16).
London: Research, Development and Statistics Directorate, Home Office.
Smith, E. A., & Winterhalder, B. (Eds.). (1992). Evolutionary ecology and human behavior.
New York: Aldine de Gruyter.
Snook, B. (2000, December). Utility or futility? A provisional examination of the utility of a
geographical decision support system. Paper presented at the meeting of the Crime Mapping
Research Center, San Diego, CA.
Snook, B. (2004). Individual differences in distance travelled by serial burglars. Journal of
Investigative Psychology and Offender Profiling, 1, 53-66.
Snook, B., Canter, D. V., & Bennell, C. (2002). Predicting the home location of serial offenders:
A preliminary comparison of the accuracy of human judges with a geographic profiling
system. Behavioral Sciences and the Law, 20, 109-118.
Snook, B., Taylor, P. J., & Bennell, C. (2004). Geographic profiling: The fast, frugal, and
accurate way. Applied Cognitive Psychology, 18, 105-121.
Winterhalder, B., & Smith, E. A. (1992). Evolutionary ecology and the social sciences. In E. A.
Smith & B. Winterhalder (Eds.), Evolutionary ecology and human behavior (pp. 3-23). New
York: Aldine de Gruyter.