Chapter 9 Space-Time Analysis

In this chapter, we discuss four techniques that are used to analyze the relationship
between space and time. Up to this point, we have analyzed the distribution of incidents
irrespective of the order or the time frame in which they appeared. The only temporal
analysis that was conducted was in Chapter 4, where several spatial description indices,
including the standard deviational ellipse, were compared for different time periods.
As police departments usually know, however, incidents do not occur uniformly throughout
the year, but instead are often clustered together during short time periods. At certain
times, a rash of incidents will occur in certain neighborhoods and the police often have to
respond quickly to these events. In other words, there is both clustering in time as well
as clustering in space. This area of research has been developed mostly in the field of
epidemiology (Knox, 1963, 1988; Mantel, 1967; Mantel and Bailar, 1970; Besag and Newell,
1991; Kulldorff and Nagarwalla, 1995; Bailey and Gatrell, 1995). However, most of these
techniques are applicable to crime analysis and criminal justice research as well.
CrimeStat includes four space-time techniques: the Knox Index, the Mantel Index, the
Spatial-Temporal Moving Average, and Correlated Walk Analysis. Figure 9.1 shows the
Space-Time Analysis screen.
Measurement of Time in CrimeStat
Time can be defined as hours, days, weeks, months, or years. The default is days.
However, please note that for any of these techniques, in CrimeStat, time must be
measured as an integer or real variable, as mentioned in Chapter 3. Time cannot be
defined by a formatted date code (e.g., 11/06/01, July 30, 2002). Each of the space-time
routines expects time to be an integer or real variable (e.g., 1, 2, 34527, 2.8). If given
formatted dates, they will calculate an answer, but the result will not be correct.
If the time unit is days, a simple transformation is to use the number of days since
January 1, 1900. Most spreadsheet and database programs assign an integer number from
this reference point. For example, November 12, 2001 has the integer value of 37207 while
January 30, 2002 has the integer value of 37286. These are the number of days since
January 1, 1900. Any spreadsheet program (e.g., Excel or Lotus 1-2-3) can convert a date
format into a real number with the Value function. Also, any arbitrary numbering system
will work (e.g., 1, 2, 3).
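For readers who prepare their data outside a spreadsheet, the conversion can be sketched in a few lines of Python. This is an illustration only and not part of CrimeStat; the function name `to_serial_day` and the constant `SERIAL_ORIGIN` are hypothetical. The origin of December 30, 1899 is an assumption chosen because it reproduces the spreadsheet serial values quoted in the text for modern dates.

```python
from datetime import date

# Assumption: spreadsheet-style serial days. Counting from 1899-12-30
# reproduces the values quoted in the text for modern dates.
SERIAL_ORIGIN = date(1899, 12, 30)


def to_serial_day(d: date) -> int:
    """Convert a calendar date to an integer day count usable as a time variable."""
    return (d - SERIAL_ORIGIN).days


print(to_serial_day(date(2001, 11, 12)))  # 37207
print(to_serial_day(date(2002, 1, 30)))   # 37286
```

Only the differences between the values matter for the space-time routines, so any fixed origin would serve equally well.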
Space-Time Interaction
There are different types of interaction that could occur between space and time.
Four distinctions can be made. First, there could be spatial clustering all the time.
Certain communities are prone to certain events. For example, robberies are often
concentrated in particular locations, as are vehicle thefts. The hot spot methods that were
discussed in chapters 6, 7 and 8 are useful for identifying these concentrations. In this
case, there is no space-time interaction since the clustering occurs all the time.

Figure 9.1: Space-Time Analysis Screen

Second, there could be spatial clustering within a specific time period. Hot spots
could occur during certain time periods. For example, motor vehicle crashes tend to occur
with much higher frequencies in the late afternoon and early evening, often as a by-product
of congestion on the roads. Crash hot spots will tend to appear at certain times because of
the congestion. At most other times, the concentration does not occur because the
congestion levels are lower.
Third, there could be space-time clustering. A number of events could occur within a
short time period within a concentrated area. This type of effect is very common with
motor vehicle thefts. A car thief gang may decide to attack a particular neighborhood.
After a binge of car thefts, they move on to another neighborhood. In this instance, there
are a number of theft incidents that occur within a limited period in a limited
location. The cluster moves from one location to another. In this case, there is an
interaction between space and time in that spatial hot spots appear at particular times,
but are temporary. The ability to detect this type of shift is very important to police
departments since it affects their ability to respond.
Fourth, there could be space-time interaction in which the relationship between
space and time is more complex. The interaction could be concentrated, as in the spatial
clustering mentioned above, or it could follow a more complex pattern. For example, there
could be a diffusion of drug sales from a central location to a more dispersed area.
Whereas initially the drug dealing is concentrated in a few locations, it starts to diffuse to
other areas. However, the diffusion may occur at different times of the year (e.g.,
Christmas and New Year's). Alternatively, vehicle thefts may shift towards seaside
communities during the summer months when the number of vacationers increases. We
saw an example of this in chapter 4 where the ellipse of motor vehicle thefts shifted
between June and July to the communities along the Chesapeake Bay near Baltimore.
This type of diffusion is not clustering per se, in that it may be spread over a very large
coastline. But it is a distinct space-time interaction.
The importance of these distinctions is that many of the space-time tests that exist
only measure gross space-time interaction, rather than space-time clustering. For
example, the Knox and Mantel tests that follow test for space-time interaction. The
interaction could be the result of spatial clustering, but doesn't necessarily have to be. The
interaction could occur in a very complex way that would not easily lend itself to more
focused intervention by the police. Still, the ability to identify the interaction is an
important step in planning an intervention strategy.
Knox Index
The Knox Index is a simple comparison of the relationship between incidents in
terms of distance (space) and time (Knox, 1963; 1964). That is, each individual pair is
compared in terms of distance and in terms of time interval. Since each pair of points is
being compared, there are N*(N-1)/2 pairs. The distance between points is divided into two
groups - Close in distance and Not close in distance, and the time interval between points
is also divided into two groups - Close in time and Not close in time. The definitions of
'Close' and 'Not close' are left to the user.
A simple 2 x 2 table is produced that compares closeness in distance with closeness
in time. The number of pairs that fall in each of the four cells is compared (Table 9.1).

Table 9.1
Logical Structure of Knox Index

                           Close in time    Not close in time    Total
Close in distance               O1                 O2             S1
Not close in distance           O3                 O4             S2
Total                           S3                 S4             N

where N  = O1 + O2 + O3 + O4
      S1 = O1 + O2
      S2 = O3 + O4
      S3 = O1 + O3
      S4 = O2 + O4

The actual number of pairs that falls into each of the four cells is then compared to
the expected number if there were no relationship between closeness in distance and
closeness in time. The expected number of pairs in each cell under strict independence
between distance and the time interval is obtained by the cross-products of the column
and row totals (Table 9.2).

Table 9.2
Expected Frequencies for Knox Index

                           Close in time    Not close in time
Close in distance               E1                 E2
Not close in distance           E3                 E4

where E1 = S1 * S3 / N
      E2 = S1 * S4 / N
      E3 = S2 * S3 / N
      E4 = S2 * S4 / N

The difference between the actual (observed) number of pairs in each cell and the
expected number is measured with a Chi-square statistic (equation 9.1).

$$\chi^2 = \sum_{i=1}^{4} \frac{(O_i - E_i)^2}{E_i} \qquad \text{with 1 degree of freedom} \tag{9.1}$$

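The calculation described above (pairwise comparison, the 2 x 2 table of observed counts, expected counts from the row and column totals, and the Chi-square statistic) can be sketched in Python as follows. This is a minimal illustration, not CrimeStat's code: the function names are hypothetical, distances are assumed to be straight-line distances on projected coordinates, and a pair is treated as 'Close' when its value is less than or equal to the cut-off, which is an assumption.

```python
import math
from itertools import combinations


def pairwise_distances_and_gaps(xy, t):
    """Return distances and time intervals for all N*(N-1)/2 pairs of incidents."""
    dists, gaps = [], []
    for i, j in combinations(range(len(xy)), 2):
        dists.append(math.dist(xy[i], xy[j]))   # straight-line distance (assumed)
        gaps.append(abs(t[i] - t[j]))           # time interval between the pair
    return dists, gaps


def knox_chi_square(dists, gaps, dist_cut, time_cut):
    """Build the 2 x 2 Knox table; return (observed, expected, chi_square)."""
    observed = [0, 0, 0, 0]                     # O1, O2, O3, O4
    for d, g in zip(dists, gaps):
        row = 0 if d <= dist_cut else 1         # 0 = close in distance
        col = 0 if g <= time_cut else 1         # 0 = close in time
        observed[2 * row + col] += 1
    n = sum(observed)
    s1, s2 = observed[0] + observed[1], observed[2] + observed[3]   # row totals
    s3, s4 = observed[0] + observed[2], observed[1] + observed[3]   # column totals
    expected = [s1 * s3 / n, s1 * s4 / n, s2 * s3 / n, s2 * s4 / n]
    chi_square = sum((o - e) ** 2 / e
                     for o, e in zip(observed, expected) if e > 0)
    return observed, expected, chi_square


# Toy data: two tight clusters that are also separated in time.
xy = [(0, 0), (0, 1), (10, 10), (10, 11)]
t = [1, 2, 30, 31]
dists, gaps = pairwise_distances_and_gaps(xy, t)
observed, expected, chi_square = knox_chi_square(dists, gaps, 5.0, 7.0)
print(observed, round(chi_square, 2))  # [2, 0, 0, 4] 6.0
```

In the toy data every pair that is close in distance is also close in time, so all of the pairs fall in cells O1 and O4.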
Monte Carlo Simulation of Critical Chi-square
Unfortunately, the usual probability test associated with the Chi-square statistic
cannot be applied since the observations are not independent. Interaction between space
and time tends to be compounded when calculating the Chi-square statistic. For example,
we've noticed that the Chi-square statistic tends to get larger with increasing sample size,
a condition that would normally not be true with independent observations. To handle
the issue of interdependency, there is a Monte Carlo simulation of the chi-square value for
the Knox Index under spatial randomness (Dwass, 1957; Barnard, 1963). If the user
selects a simulation, the routine randomly selects M pairs of a distance and a time interval,
where M is the number of pairs in the data set (M = N * [N-1]/2), and calculates the Knox
Index and the chi-square test. Each pair of a distance and a time interval is selected from
the range between the minimum and maximum values for distance and time interval in
the data set using a uniform random generator.
The random simulation is repeated K times, where K is specified by the user.
Usually, it is wise to run the simulation 1000 or more times. The output includes:
1. The sample size
2. The number of pairs
3. The calculated chi-square value of the Knox Index from the data
4. The minimum chi-square value of the Knox Index from the simulation
5. The maximum chi-square value of the Knox Index from the simulation
6. Ten percentiles from the simulation:
   a. 0.5%
   b. 1%
   c. 2.5%
   d. 5%
   e. 10%
   f. 90%
   g. 95%
   h. 97.5%
   i. 99%
   j. 99.5%
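The simulation logic described above can be sketched in Python. This is an illustration under stated assumptions and not CrimeStat's implementation: the function names are hypothetical, each simulated sample is split at its own mean distance and mean time interval (the text does not say which cut-off the simulation uses), and percentiles are taken by the nearest-rank rule.

```python
import math
import random
from statistics import mean


def knox_chi_square(dists, gaps, dist_cut, time_cut):
    """Chi-square of the 2 x 2 Knox table for paired distances and time intervals."""
    cells = [0, 0, 0, 0]
    for d, g in zip(dists, gaps):
        cells[2 * (d > dist_cut) + (g > time_cut)] += 1
    n = sum(cells)
    rows = (cells[0] + cells[1], cells[2] + cells[3])
    cols = (cells[0] + cells[2], cells[1] + cells[3])
    chi2 = 0.0
    for r in range(2):
        for c in range(2):
            expected = rows[r] * cols[c] / n
            if expected > 0:
                chi2 += (cells[2 * r + c] - expected) ** 2 / expected
    return chi2


def simulate_knox(m, dist_range, time_range, k=1000, seed=0):
    """Draw m uniform (distance, interval) pairs k times; return sorted chi-squares."""
    rng = random.Random(seed)
    results = []
    for _ in range(k):
        dists = [rng.uniform(*dist_range) for _ in range(m)]
        gaps = [rng.uniform(*time_range) for _ in range(m)]
        # Assumption: split each simulated sample at its own means.
        results.append(knox_chi_square(dists, gaps, mean(dists), mean(gaps)))
    return sorted(results)


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list."""
    rank = math.ceil(p / 100 * len(sorted_values))
    return sorted_values[max(0, min(len(sorted_values) - 1, rank - 1))]


sims = simulate_knox(m=500, dist_range=(0.0, 20.0), time_range=(0.0, 365.0), k=200)
print("95th percentile of simulated chi-square:", round(percentile(sims, 95), 2))
```

An observed chi-square value would then be compared against the simulated percentile rather than against the usual Chi-square table.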
Methods for Dividing Distance and Time
In the CrimeStat implementation of the Knox Index, the user can divide distance
and time interval based on three criteria:
1. The mean (mean distance and mean time interval). This is the default.
2. The median (median distance and median time interval).
3. User-defined criteria for distance and time separately.
There are advantages to each of these methods. The mean is the center of the
distribution; it denotes a balance point. The median will divide both distance and time
interval into approximately equal numbers of pairs. The division is approximate since the
data may not easily divide into two equal-numbered groups. User-defined criteria can fit
a particular need of an analyst. For example, a police department may only be interested
in incidents that occur within two miles of each other within a one-week period. Those
criteria would be the basis for dividing the sample into 'Close' and 'Not close' distances and
time intervals.
Example of the Knox Index
For an example, vehicle thefts in Baltimore County for 1996 were taken. There
were 1855 vehicle thefts for which a date was recorded in the database. The database
was further broken down into twelve separate monthly subsets. Using the median for both
distance and time interval, the Knox Index was calculated for the entire set of 1855
incidents. Then, using the median distance for the entire year but a month-specific median
time interval, the Knox Index was calculated for each of the twelve months. Table 9.3
presents the Chi-square values and their pseudo-significance levels.
To produce a better test of the significance of the results, 1000 random simulations
were calculated for the vehicle thefts for the entire year. Table 9.3 below shows the results.
Because an extreme value could be obtained by chance with a random distribution,
reasonable cut-off points are usually selected from the simulation. In this case, we want a
cut-off point that approximates a 5% significance level. Since the Knox Index is a
one-tailed test (i.e., only a high chi-square value is indicative of space-time interaction), we
adopt an upper threshold of the 95th percentile. In other words, only if the observed
chi-square test for the Knox Index is larger than the 95th percentile threshold will the null
hypothesis of a random distribution between space and time be rejected.

Table 9.3
Knox Index for Baltimore County Vehicle Thefts
Median Split
N = 1,855 with 1,719,585 comparisons

                   Actual         95th Percentile           Approx.
Month              Chi-square     Simulation Chi-square     p
January               0.26              6.95                n.s.
February              0.00              6.61                n.s.
March                 0.00              6.86                n.s.
April                 0.50              6.56                n.s.
May                   1.04              7.25                n.s.
June                  0.01              6.02                n.s.
July                  9.96              9.05                .05
August                5.91              5.55                .05
September             0.27              5.41                n.s.
October               3.33              6.43                n.s.
November             10.79              8.91                .01
December              0.00              6.87                n.s.
--------------------------------------------------------------------
All of 1996           8.69             41.89                n.s.

For the entire year, there was not a significant clustering between space and time.
Approximately 26.7% of the incidents were both close in distance (i.e., closer than the
median distance between pairs of incidents) and close in time (i.e., closer than the median
time interval between pairs of incidents). However, when individual months are examined,
only three show significant relationships: July, August, and November. During these
months, there is an interaction between space and time. Typically, incidents that cluster
together spatially tend also to cluster together temporally. However, it could be the
opposite (i.e., events that cluster together temporally tend to be far apart spatially).
The next step would be to identify whether there are particular clusters that occur
within a short time period. Using one of the 'hot spot' analysis methods discussed in
chapters 6 and 7, an analyst could take the events for the three months and try to identify
whether there is spatial clustering during those three months that does not normally occur.
We won't do that here, but the point is that the Knox Index is useful to identify when there
is spatial clustering.
Problems with the Knox Index
The Knox Index is a simple measure of space-time clustering. However, because it
is only a 2 x 2 table, different results can be obtained by varying the cut-off points for
distance or time. For example, using the mean as the cut-off, the overall Chi-square
statistic for all vehicle thefts was 8.67, reasonably close to the 8.69 obtained with the
median. However, when a cut-off point for distance of 1000 meters and a cut-off point for
time of 80 days were used, the Chi-square statistic dropped to 3.16. In other words, the
Knox Index will produce different results for different cut-off points.
A second problem has to do with the interpretation. As with any Chi-square test,
differences between the observed and expected frequencies could occur in any cell or any
combination of cells. Finding a significant relationship does not automatically mean that
events that were close in distance were also close in time; it could have been the opposite
relationship. However, a simple inspection of the table can indicate whether the
relationship is as expected or not. In the above example, all the significant relationships
showed a higher proportion of events that were both close in distance and close in time.
Mantel Index
The Mantel Index resolves some of the problems of the Knox Index. Essentially, it
is a correlation between distance and time interval for pairs of incidents (Mantel, 1967).
More formally, it is a general test for the correlation between two dissimilarity matrices
that summarize comparisons between pairs of points (Mantel and Bailar, 1970). It is
based on a simple cross-product of two interval variables (e.g., distance and time interval):

$$T = \sum_{i=1}^{N} \sum_{j=1}^{N} (X_{ij} - \bar{X})(Y_{ij} - \bar{Y}) \tag{9.2}$$

where $X_{ij}$ is an index of similarity between two observations, i and j, for one variable
(e.g., distance) while $Y_{ij}$ is an index of similarity between the same two observations, i
and j, for another variable (e.g., time interval), and $\bar{X}$ and $\bar{Y}$ are the means
(Mean X and Mean Y) of the two variables.
The cross-product is then normalized by dividing each deviation by its standard
deviation:

$$r = \frac{1}{N-1} \sum_{i=1}^{N} \sum_{j=1}^{N} \frac{(X_{ij} - \bar{X})}{S_x} \cdot \frac{(Y_{ij} - \bar{Y})}{S_y} \tag{9.3a}$$

$$r = \sum_{i=1}^{N} \sum_{j=1}^{N} Z_x \cdot Z_y \, / \, (N-1) \tag{9.3b}$$

where $X_{ij}$ and $Y_{ij}$ are the original variables for comparing two observations, i and j,
$\bar{X}$ and $\bar{Y}$ are their means, $S_x$ and $S_y$ are their standard deviations, and $Z_x$
and $Z_y$ are the normalized variables.
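As a rough illustration of equations 9.2 and 9.3, the index can be sketched in Python as the product-moment correlation between the distance and the time interval of every pair of incidents. This is a sketch and not CrimeStat's code: the function name `mantel_index` is hypothetical, each unordered pair is counted once, and the divisor is the number of pairs minus one, both of which are assumptions about how the double sum is evaluated.

```python
import math
from itertools import combinations


def mantel_index(xy, t):
    """Correlation between pairwise distance and pairwise time interval."""
    dists, gaps = [], []
    for i, j in combinations(range(len(xy)), 2):
        dists.append(math.dist(xy[i], xy[j]))   # X_ij: distance between i and j
        gaps.append(abs(t[i] - t[j]))           # Y_ij: time interval between i and j
    m = len(dists)
    mean_x, mean_y = sum(dists) / m, sum(gaps) / m
    # Sample standard deviations of the two pairwise variables.
    s_x = math.sqrt(sum((d - mean_x) ** 2 for d in dists) / (m - 1))
    s_y = math.sqrt(sum((g - mean_y) ** 2 for g in gaps) / (m - 1))
    # Sum of products of the normalized deviations (Z_x * Z_y), as in 9.3b.
    cross = sum(((d - mean_x) / s_x) * ((g - mean_y) / s_y)
                for d, g in zip(dists, gaps))
    return cross / (m - 1)


# Toy data: an offender moving steadily along a street, one incident per day.
xy = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
t = [0, 1, 2, 3]
print(round(mantel_index(xy, t), 4))  # 1.0
```

In the toy data the distance between any two incidents equals the number of days between them, so the correlation is perfect. Real incident data produce much smaller values.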
Example of the Mantel Index
In CrimeStat, the Mantel Index routine calculates the correlation between distance
and time interval. To illustrate, Table 9.4 examines the Mantel correlation for the 1996
vehicle thefts in Baltimore County that were illustrated above. As seen, the correlations are
all low. However, as with the Knox Index, July, August and November produce relatively
higher correlations. If used as an index, rather than an estimate of variance explained, the
Mantel Index can identify time periods when spatial interaction is occurring.

Table 9.4
Mantel Index for Baltimore County Vehicle Thefts
Median Split
N = 1,855 and 1,719,585 Comparisons

                               Simulation    Simulation    Approx.
Month              r           2.5%          97.5%         p-level
January           -0.0047      -0.033         0.033        n.s.
February          -0.0023      -0.037         0.042        n.s.
March             -0.0245      -0.032         0.039        n.s.
April              0.0077      -0.040         0.041        n.s.
May                0.0018      -0.038         0.043        n.s.
June               0.0043      -0.035         0.041        n.s.
July               0.0348      -0.034         0.033        .025
August             0.0544      -0.034         0.035        .01
September          0.0013      -0.044         0.046        n.s.
October            0.0409      -0.037         0.043        n.s.
November           0.0630      -0.042         0.040        .001
December           0.0086      -0.035         0.038        n.s.
--------------------------------------------------------------------
All of 1996        0.0015      -0.009         0.010        n.s.

Monte Carlo Simulation of Confidence Intervals
Even though the Mantel Index is a Pearson product-moment correlation between
distance and time interval, the measures are not independent and, in fact, are highly
interdependent. Consequently, the usual significance test for a correlation coefficient is
not appropriate. Instead, the Mantel routine offers a simulation of the confidence intervals
around the index. If the user selects a simulation, the routine randomly selects M pairs of
a distance and a time interval, where M is the number of pairs in the data set
(M = N * [N-1]/2), and calculates the Mantel Index. Each pair of a distance and a time
interval is selected from the range between the minimum and maximum values for distance
and time interval in the data set using a uniform random generator.
The random simulation is repeated K times, where K is specified by the user.
Usually, it is wise to run the simulation 1000 or more times. The output includes:
1. The sample size
2. The number of pairs
3. The calculated Mantel Index from the data
4. The minimum Mantel value from the simulation
5. The maximum Mantel value from the simulation
6. Ten percentiles from the simulation:
   a. 0.5%
   b. 1%
   c. 2.5%
   d. 5%
   e. 10%
   f. 90%
   g. 95%
   h. 97.5%
   i. 99%
   j. 99.5%
To illustrate, 1000 random simulations were calculated for each month using the
same sample size as the monthly vehicle theft totals. Table 9.4 above shows the results.
Because an extreme value could be obtained by chance with a random distribution,
reasonable cut-off points are usually selected from the simulation. In this case, we want
cut-off points that approximate a 5% significance level. Since the Mantel Index is a
two-tailed test (i.e., one could just as easily get dispersion between space and time as
clustering), we adopt a lower threshold of the 2.5 percentile and an upper threshold of the
97.5 percentile. Combined, the two cut-off points ensure that approximately 5% of the cases
would be either lower than the lower threshold or higher than the upper threshold under
random conditions.¹ In other words, only if the observed Mantel Index is smaller than the
lower threshold or larger than the upper threshold will the null hypothesis of a random
distribution between space and time be rejected.
In Table 9.4, for the entire year, the observed Mantel Index (correlation between
space and time) was 0.0015. The 2.5 percentile was -0.009 and the 97.5 percentile was 0.010.
Since the observed value is between these two cut-off points, we cannot reject the null
hypothesis of no relationship between space and time. However, for the individual months,
again, July, August and November have correlations above the upper cut-off threshold.
Thus, for those three months only, the amount of space-time clustering in the vehicle theft
data is most likely greater than what would be expected on the basis of a chance
distribution. One would, then, have to explore the data further to find out where those
vehicle thefts were occurring, using one of the hot spot routines in Chapter 6.
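The two-tailed decision rule can be sketched in Python. The sketch is illustrative only and not CrimeStat's implementation: the function names are hypothetical, the simulated distances and time intervals are drawn uniformly between user-supplied minimum and maximum values as the text describes, and percentiles use the nearest-rank rule, which is an assumption.

```python
import math
import random


def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_mantel(m, dist_range, time_range, k=1000, seed=0):
    """Correlate m uniform random (distance, interval) pairs, k times; return sorted."""
    rng = random.Random(seed)
    sims = []
    for _ in range(k):
        dists = [rng.uniform(*dist_range) for _ in range(m)]
        gaps = [rng.uniform(*time_range) for _ in range(m)]
        sims.append(pearson_r(dists, gaps))
    return sorted(sims)


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list."""
    rank = math.ceil(p / 100 * len(sorted_values))
    return sorted_values[max(0, min(len(sorted_values) - 1, rank - 1))]


def reject_null(observed, lower, upper):
    """Two-tailed rule: reject only outside the 2.5 and 97.5 percentile cut-offs."""
    return observed < lower or observed > upper


sims = simulate_mantel(m=500, dist_range=(0.0, 20.0), time_range=(0.0, 30.0), k=200)
print(round(percentile(sims, 2.5), 3), round(percentile(sims, 97.5), 3))

# Decision rule applied to values reported in Table 9.4:
print(reject_null(0.0348, -0.034, 0.033))  # July: True
print(reject_null(0.0015, -0.009, 0.010))  # All of 1996: False
```

The last two lines apply the rule to the July and full-year rows of Table 9.4 and agree with the conclusions in the text.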
Limitations of the Mantel Index
The Mantel Index is a useful measure of the relationship between space and time.
But it does have limitations. First, because it is a Pearson-type correlation coefficient, it is
prone to the same types of problems that befall correlations. Extreme values of either
space or time could distort the relationship, either positively, if there are one or two
observations that are extreme in both distance and time interval, or negatively, if there are
only one or two observations that are extreme in either distance or in time interval.
Second, because the test is a comparison of all pairs of observations, the correlations
tend to be small, as noted above. This makes it less intuitive as a measure than a
traditional correlation coefficient, which varies between -1 and +1 and in which high values
are expected. For most analysts, it is not very intuitive to have an index where 0.05 is a
high value. This doesn't fault the statistic so much as make it a little non-intuitive for users.
Third, as with any correlation coefficient, the sample size needs to be fairly large to
produce a stable estimate. In the above example, one could further break down monthly
vehicle thefts by week or even day. However, the number of cases will decrease
considerably. In the above example, with 1,855 vehicle thefts over a year, the weekly
average would be around 36, which is a small sample. Intuitively, a crime analyst wants to
know when space-time clustering is occurring and a short time frame is critical for
detection; a week would be the largest time interval that would be useful. However, as the
sample size gets small, the index becomes unstable. For one thing, the sample size makes
the index volatile. While the Monte Carlo simulation will adjust for the sample size, the
range of the cut-off thresholds will vary considerably from one week to another with small
sample sizes. The analyst will have to run the simulation repeatedly to adjust for the
varying sample sizes. For another thing, the shortened time frame allows fewer
distinctions in time; if one takes a very narrow time frame (e.g., a day), there can be
virtually no time differences observed. One would have to switch to an hourly analysis to
produce meaningful differences.
One way to get around this is to have a moving average where the time frame is
adjusted to fit a constant number of days (e.g., a 14-day moving average). The advantage is
that the sample size tends to remain fairly constant; one could therefore reduce the
number of recalculations of the cut-off thresholds since they would not vary much from one
day to another. To make this work, however, the database must be set up to produce the
appropriate number of incidents for a moving average analysis.
Nevertheless, the Mantel Index remains a useful tool for analysts. It is still widely
used for space-time analysis and it has been generalized to many other types of
dissimilarity analyses than just space and time. If used carefully, the index can be a
powerful tool for detection of clusters that are also concentrated in time.
Spatial-Temporal Moving Average
The Spatial-Temporal Moving Average is a simple statistic. It is the moving mean
center of M observations where M is a sub-set of the total sample, N. By 'moving', the
observations are sequenced in order of occurrence. Hence, there is a time dimension
associated with the sequence. The M observations are called the span and the default span
is 5 observations. The span is centered on each observation so that there are an equal
number on both sides. Because there are no data points prior to the first event and after
the last event, the first and last few mean centers will have fewer observations than the
rest of the sequence. For example, with a span of 5, the first and last mean centers will
have only three observations, the second and next-to-last will have 4 observations, while all
others will have 5. In general, it's a good idea to choose an odd number since the middle of
the span will be centered on a real observation rather than having to fall between two in the
case of an even span.
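The centered span and its truncation at the two ends of the sequence can be sketched in Python as follows. This is a minimal illustration rather than CrimeStat's code: the function name `moving_mean_centers` is hypothetical and the incidents are assumed to be already sorted in order of occurrence.

```python
def moving_mean_centers(points, span=5):
    """Return (mean_x, mean_y, n_used) for a span centered on each incident.

    `points` is a list of (x, y) tuples sorted in order of occurrence.
    """
    half = span // 2
    centers = []
    for i in range(len(points)):
        # The window is cut short at the start and end of the sequence.
        window = points[max(0, i - half): i + half + 1]
        mean_x = sum(x for x, _ in window) / len(window)
        mean_y = sum(y for _, y in window) / len(window)
        centers.append((mean_x, mean_y, len(window)))
    return centers


# Twelve incidents in sequence (toy coordinates).
events = [(float(i), 2.0 * i) for i in range(12)]
centers = moving_mean_centers(events, span=5)
print([n for _, _, n in centers])  # [3, 4, 5, 5, 5, 5, 5, 5, 5, 5, 4, 3]
```

The printed counts match the pattern described in the text for a span of 5: three observations at each end, four for the second and next-to-last, and five elsewhere.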
Though simple, the Spatial-Temporal Moving Average is very useful for detecting
changes in behavior by serial offenders. In the next chapter, we will examine
journey-to-crime models that attempt to estimate the likely origin location of a serial
offender based on the distribution of incidents committed by the offender. However, if the
serial offender has either moved residences or else moved the field of operation, then the
technique will err because it is assuming a stable field of operations when, in fact, it isn't.
The moving average can suggest whether the offender's behavior is stable or not.
As an example, Figure 9.2 below shows the Spatial-Temporal Moving Average of an
offender who committed 12 offenses before being arrested. The individual committed eight
thefts from vehicles, two thefts from stores, one residential burglary and one highway
robbery. The actual incidents are shown as red circles with the sequence number
displayed. The moving average is shown as blue squares with the sequence number
displayed. The path of the moving average is shown as a green line.
As seen, there is a definite shift in the field of operation by this offender. The mean
center moves about a mile during this period, but the consistency of the trend would
suggest that something fundamental changed for the offender; either the person moved
residences or the nature of the committed crimes changed. In using the Journey-to-crime
tools, an analyst would probably want to focus on the latter events since these are more
geographically circumscribed. Notice that the last two moving averages are relatively close
to the actual residence location of the offender when arrested (less than three-quarters of a
mile away).
In short, the Spatial-Temporal Moving Average simply plots the changes in the mean center of the span and is useful for detecting changes in the behavior pattern of serial offenders.
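The idea of tracking the mean center over a moving span can be sketched in a few lines of Python. This is a minimal illustration and not CrimeStat's implementation: the function name `moving_mean_center`, the centered window, the truncation of the window at both ends of the sequence, and the sample coordinates are all assumptions made for the example.

```python
def moving_mean_center(points, span=5):
    """Return the mean center of a window of incidents around each incident.

    points: list of (x, y) tuples in the order the incidents occurred.
    span:   window size (assumed odd); the window is truncated at both ends.
    """
    half = span // 2
    centers = []
    for i in range(len(points)):
        window = points[max(0, i - half): i + half + 1]
        mean_x = sum(p[0] for p in window) / len(window)
        mean_y = sum(p[1] for p in window) / len(window)
        centers.append((mean_x, mean_y))
    return centers


# Hypothetical sequence: four incidents in one area, then a shift to another.
incidents = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 10), (11, 10), (12, 10)]
path = moving_mean_center(incidents, span=3)
```

Plotting `path` in sequence order would show the mean center jumping once the window starts to include the later incidents, which is the kind of shift the routine is meant to reveal.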
[Figure 9.2: Moving Path of Serial Offender. Sequence of 12 crimes. Legend: residence when arrested, incidents, moving average, path of moving average. Scale bar: 0 to 2 miles.]
Correlated Walk Analysis
Correlated Walk Analysis (CWA) is a tool that is aimed at analyzing the spatial and temporal sequencing of incidents committed by a single serial offender. In this sense, it is the 'flip side' of journey-to-crime analysis (see chapter 10). Whereas journey-to-crime analysis makes guesses about the likely origin location for a serial offender, based on the spatial distribution of the incidents committed by the offender, the CWA routine makes guesses about the time and location of a next event, based on both the spatial distribution of the incidents and the temporal sequencing of them. In effect, it is a Spatial-Temporal Moving Average with a prediction of a next event.
The statistical origin of CWA is Random Walk Theory. Random Walk Theory was developed by physicists to explain the distribution of molecules in a rapidly changing environment (e.g., the movements of a particle in a gas which is diffusing, known as Brownian movement). Sometimes called a 'drunkard's walk', the theory starts with the premise that movement is random in all directions. From an arbitrary starting point, a particle (or person) moves in any direction in a series of steps. The direction of each step is independent of the previous steps. After each step, a random decision is made and the person moves in a random direction. This process is repeated ad infinitum until an arbitrary stopping point is selected (i.e., the observer quits looking). It has been shown mathematically that all one- and two-dimensional random walks must eventually return to their original starting point (Spitzer, 1963; Henderson, Renshaw, and Ford, 1983).² This is called a recurrent random walk. On the other hand, independent random walks in more than two dimensions are not necessarily recurrent, a state called a transient random walk.
Figure 9.3 illustrates a random walk of 2000 steps. For a large number of steps in a two-dimensional walk, the likely distance of a person (or particle) from the starting point is

$$E(d) = d_{rms} \sqrt{N} \qquad (9.4)$$

where $d_{rms} = \sqrt{\sum d_i^2 / N}$. The term $d_{rms}$ is the root mean square of distance.
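As a worked illustration of equation 9.4, the sketch below computes the root mean square step length and the expected distance from the origin. It assumes that the d_i are the lengths of the individual steps and that N is the number of steps; the function names and the sample values are hypothetical.

```python
import math


def rms_distance(step_lengths):
    """Root mean square of the step lengths (d_rms)."""
    return math.sqrt(sum(d * d for d in step_lengths) / len(step_lengths))


def expected_walk_distance(step_lengths):
    """Expected distance from the start after N random steps: d_rms * sqrt(N)."""
    n_steps = len(step_lengths)
    return rms_distance(step_lengths) * math.sqrt(n_steps)


# Four unit steps: d_rms = 1 and sqrt(N) = 2, so the expected distance is 2.
unit_steps = [1.0, 1.0, 1.0, 1.0]
expected_unit = expected_walk_distance(unit_steps)

# Two steps of length 3 and 4: d_rms = sqrt(12.5), times sqrt(2) gives 5.
expected_mixed = expected_walk_distance([3.0, 4.0])
```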
There are a number of different types of random walks. The simplest is a movement of uniform distance only along a grid cell (i.e., a Manhattan geometry). The person can only move North, South, East or West for a unit distance of 1. A more complex random walk allows angular distances, and an even more complex random walk allows varying distances (e.g., normally distributed random distances, uniformly random distances). The walk in figure 9.3 was of this latter type. X and Y values were selected randomly from a range of -1 to +1 using a uniform random number generator. For a conceptual understanding of Random Walk Theory, see Chaitin (1990) and, for a mathematical treatment, see Spitzer (1976). Malkiel (1999) applied the concepts of Random Walk Theory to stock price fluctuations in a book that has now become a classic.
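A walk of the kind described, with X and Y increments drawn uniformly from -1 to +1, can be simulated as follows. This is only a sketch under stated assumptions: the seed, the function name `simulate_uniform_walk`, and the choice of the origin as the starting point are arbitrary, and the comparison of the actual and expected distance is added for illustration.

```python
import math
import random


def simulate_uniform_walk(n_steps, seed=0):
    """Simulate a 2-D walk whose X and Y increments are uniform on [-1, 1]."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        x += rng.uniform(-1.0, 1.0)
        y += rng.uniform(-1.0, 1.0)
        path.append((x, y))
    return path


path = simulate_uniform_walk(2000)

# Step lengths, root mean square step length, and expected distance (d_rms * sqrt(N)).
step_lengths = [math.dist(a, b) for a, b in zip(path, path[1:])]
d_rms = math.sqrt(sum(d * d for d in step_lengths) / len(step_lengths))
expected_distance = d_rms * math.sqrt(len(step_lengths))

# Actual distance between the start and finish of this particular walk.
actual_distance = math.dist(path[0], path[-1])
```

For any single walk the actual distance can differ considerably from the expected one; the formula describes what is likely over many walks.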
Henderson, Renshaw and Ford (1983; 1984) have introduced the concept of a correlated random walk. In a correlated random walk, momentum is maintained. If a person is moving in a certain direction, they are more likely to continue in that direction than to reverse direction or travel orthogonally. In other words, at any one decision point, the probabilities of traveling in any direction are not equal; the same direction has a higher probability than an orthogonal change (i.e., turning 90 degrees) and those, in turn, have a higher probability than completely reversing direction. By implication, the same is true for distance and time interval. A longer step than average is likely to be followed by another longer step than average while a shorter step than average is likely to be followed by another short step. Similarly, there is consistency in the time interval between events; a short interval is also likely to be followed by a short interval. In other words, a correlated random walk is a random walk with momentum (Chen and Renshaw, 1992; 1994). These authors have applied the theory to the analysis of the branching of tree roots (Henderson, Ford, Renshaw, and Deans, 1983; Renshaw, 1985).
[Figure 9.3: A Random Walk. 2000 random steps of -1.0 to +1.0 in the X and Y directions, with the start and finish points marked.]
Correlated Walk Analysis
Correlated Walk Analysis is a set of tools that can help an analyst understand the sequencing of sequential events in terms of time interval, distance and direction. In CrimeStat, there are three CWA routines. The first two help the analyst understand whether there are patterns in time, distance or direction, while the last routine allows the analyst to make a guess about the next likely event, when it will occur and where it will occur. The three routines are:
1. CWA - Correlogram
2. CWA - Diagnostics
3. CWA - Prediction
CWA - Correlogram
The Correlogram routine calculates the correlation in time interval, distance, and bearing (direction) between events. It does this through lags. A lag is a separation in the intervals between events. The difference between the first and second event is the first interval. The difference between the second and third events is the second interval. The difference between the third and fourth events is the third interval, and so forth. For each successive interval, there is a time difference, a distance, and a direction. One could extend this to all the intervals, comparing each interval with the next one; that is, we compare the first interval with the second, the second interval with the third, the third interval with the fourth, and so on until the sample is complete. When comparing successive intervals, this is called a lag of 1. It is important to keep in mind the distinction between an event (e.g., an incident) and an interval. It takes two events to create an interval. Thus, for a lag of 1, there are M = N-1 intervals where N is the number of events (e.g., for 3 incidents, there are 2 intervals).
A lag of two compares every other event. Thus, the first interval is compared to the third interval; the second interval is compared to the fourth; the third interval is compared to the fifth; and so on until there are no more intervals left in the sample. Again, the comparison is for time difference, distance, and direction separately. We can extend this logic to a lag of 3 (every third event), a lag of 4 (every fourth event), and so forth.
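The distinction between events and intervals, and the pairing of intervals by lag, can be sketched as follows. The function names `intervals` and `lag_pairs` and the three sample events are hypothetical; bearings are assumed to be measured in degrees clockwise from due North, and coordinates are assumed to be planar.

```python
import math


def intervals(events):
    """Turn N events (x, y, time) into N-1 intervals of time, distance, and bearing."""
    times, distances, bearings = [], [], []
    for (x0, y0, t0), (x1, y1, t1) in zip(events, events[1:]):
        times.append(t1 - t0)
        distances.append(math.hypot(x1 - x0, y1 - y0))
        # Bearing in degrees clockwise from due North (0 degrees).
        bearings.append(math.degrees(math.atan2(x1 - x0, y1 - y0)) % 360.0)
    return times, distances, bearings


def lag_pairs(values, lag):
    """Pair each interval with the interval `lag` positions later."""
    return list(zip(values[:-lag], values[lag:]))


# Three hypothetical events give two intervals.
events = [(1, 1, 1), (2, 2, 3), (4, 4, 7)]
times, distances, bearings = intervals(events)

# With four intervals, a lag of 1 gives three pairs and a lag of 2 gives two.
pairs_lag1 = lag_pairs([10, 20, 30, 40], 1)
pairs_lag2 = lag_pairs([10, 20, 30, 40], 2)
```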
The CWA - Correlogram routine calculates the Pearson Product-Moment correlation coefficient between successive events. For a lag of 1, it compares successive events and correlates the time interval, distance, and bearing separately for these successive events. For a lag of 2, it compares every other event and correlates the time interval, distance, and bearing separately for these events. The routine does this until it reaches a maximum of 7 lags (i.e., every seventh event). However, if the sample size is very small, it may not be able to calculate all lags. It will require 12 incidents (events) to calculate all seven lags since it requires at least four observations per lag (i.e., N - L - 1 ≥ 4 where N is the number of events and L is the maximum number of lags calculated).
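A lagged correlogram of this kind can be sketched with a plain Pearson correlation. This is an illustration under stated assumptions rather than CrimeStat's code: the function names are hypothetical, the minimum of four pairs per lag follows the text, and returning `None` for a constant series (where the Pearson coefficient is undefined) is a choice made for this sketch.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation; None if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def correlogram(values, max_lag=7, min_pairs=4):
    """Correlate each interval with the interval `lag` positions later, by lag."""
    result = {}
    for lag in range(1, max_lag + 1):
        xs, ys = values[:-lag], values[lag:]
        if len(xs) < min_pairs:
            break  # too few observations for this and all higher lags
        result[lag] = pearson(xs, ys)
    return result


# Hypothetical time intervals that alternate between 2 and 4 days (12 intervals).
time_intervals = [2, 4] * 6
corr = correlogram(time_intervals)
```

For this alternating series the odd lags correlate at -1.0 and the even lags at +1.0, so the first positive peak, at lag 2, marks the periodicity.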
Adjusted Correlogram
The Correlogram calculates the raw correlation between intervals by lag for time, distance, and bearing. One of the problems that may appear, especially with small samples, is for higher-order lags to be very high, either positive or negative. There are probably two reasons for this. For one thing, with each lag, the sample size decreases by one; with a very small sample size, correlations can become very volatile, jumping from positive to negative, and from low to high. Another reason is that periodicity in the data set is compounded with higher-order lags in the form of 'echoes'. For example, if a lag of 2 is high, then a lag of 4 will also be somewhat high since there is a compounding of the lag 2 effect. When combined with a small sample size, it is not uncommon to have higher-order lags with very high correlations, sometimes approaching +/- 1.0. The user must be careful in selecting a higher-order lag because there is an apparent effect which may be due to the above reasons, rather than any real predictability. One of the key signs of a spurious higher-order effect is a sudden jump in the strength of the correlation from one lag to the next (though sometimes a high higher-order lag can be real; see examples below).
To minimize these effects, the output also includes an adjusted correlogram that adjusts for the loss of degrees of freedom. The formula is:

$$A = \frac{M - L - 1}{M - 1} \qquad (9.5)$$

where M is the number of intervals (N-1) and L is the number of lags. For example, for a sample size of 13, there will be 12 intervals (M). For a lag of 1, the adjustment will be

$$A = \frac{12 - 1 - 1}{12 - 1} = \frac{10}{11} = 0.909$$

The effect of the adjustment is to reduce the correlation for higher-order lags. It won't completely eliminate the effect, but it should help minimize spurious effects. As will be shown below, however, sometimes high higher-order lags are real.
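The adjustment in equation 9.5 is simple to compute. In the sketch below, `adjustment_factor` follows the formula directly; `adjusted_correlation` assumes that the factor is applied by multiplying the raw correlation, which the text implies but does not state explicitly. Both function names are hypothetical.

```python
def adjustment_factor(n_intervals, lag):
    """Degrees-of-freedom adjustment A = (M - L - 1) / (M - 1)."""
    return (n_intervals - lag - 1) / (n_intervals - 1)


def adjusted_correlation(r, n_intervals, lag):
    """Shrink a raw lagged correlation (assumption: multiplicative adjustment)."""
    return r * adjustment_factor(n_intervals, lag)


# The worked example from the text: 13 events, 12 intervals, lag of 1.
a_lag1 = adjustment_factor(12, 1)

# The factor shrinks steadily as the lag increases.
factors = [adjustment_factor(12, lag) for lag in range(1, 8)]
```

With 12 intervals the factor falls from 10/11 at lag 1 to 4/11 at lag 7, so a raw correlation of 1.0 at lag 7 would be reported as about 0.36 under the multiplicative assumption.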
CWA - Correlogram Output
The routine outputs 10 parameters:
1. The sample size (number of events);
2. Number of intervals;
3. Information on the units of time, distance, and bearing;
4. Final distance to origin in meters (distance between last and first event);
5. Expected random walk distance from origin (if sequence was strictly random);
6. Drift (the ratio of actual distance from origin to expected random walk distance);
7. Final bearing from origin (direction between last event and first event);
8. Expected random walk bearing. Defined as 0 because there is no expected direction;
9. Correlations by lag for time, distance, and bearing (up to 7 lags); and
10. Adjusted correlations by lag for time, distance, and bearing (up to 7 lags).
The aim of the CWA - Correlogram is to examine repetitive sequences, whether for time interval, distance or direction. It is possible to have separate repetitions for time, distance and direction. For example, an offender may commit crimes every 7 days or so, say, on the weekend. In this case, the individual is repeating himself/herself about once every week. Similarly, an individual may alternate directions, first going East, then going West, then going back to the East, and so forth. In other words, what we're asking with the routine is whether there are any repetitions in the sequence of incidents committed by a serial offender. Does he/she repeat the crimes in time? If so, what is the periodicity (the repetitious sequence)? Does he/she repeat the crimes in distance? If so, what is the periodicity? Finally, does he/she repeat the crimes in direction? If so, what is the periodicity? The CWA - Correlogram, therefore, analyzes the sequence of incidents committed by an individual and does this separately for time interval, distance, and direction.
Offender repetition
Why is this important? Most crime analysis is predicated on the assumption that offenders (people in general) repeat themselves, consciously or unconsciously. That is, individuals have specific behavior patterns that tend to be repeated. If an individual acts in a certain way (e.g., committing a burglary), then, most likely, the person will repeat himself/herself. There is no guarantee, of course. But, because human beings do not behave randomly in space or time but tend to operate in somewhat consistent ways, there is a likelihood that the individual will act in a similar manner again.
This assumption is the basis of profiling, which aims at understanding the MO of an offender. If offenders were totally random in their behavior, detection and apprehension would be made much more difficult than they already are. So, between the two extremes of a totally random individual (the 'random walk person') and a totally predictable individual (the 'algorithmic person'), we have the bulk of human behavior, at least in terms of time, distance and direction.
CWA - Diagnostics
The Diagnostics routine is similar to the CWA - Correlogram except that it calculates an Ordinary Least Squares autoregression for a particular lag. That is, it regresses each interval against a previous interval. The user enters the lag number (the default is 1) and the routine produces three regression models with the successive event as the dependent variable and the prior event as the independent variable. There are three equations, for time interval, distance, and bearing separately. The output includes:
1. The sample size (number of events);
2. The number of intervals;
3. Information on the units of time, distance, and bearing;
4. The multiple correlation coefficient;
5. The squared multiple correlation coefficient (i.e., R²);
6. The overall standard error of estimate;
7. The regression coefficient for the constant and for the prior event;
8. The standard error of the regression coefficients;
9. The t-values for the regression coefficients;
10. The p-value (two-tail) for the regression coefficients;
11. An analysis of variance test for the full model. This includes sum of squares for the regression term and for the residual;
12. The ratio of the regression mean square to the residual mean square (the F-ratio); and
13. The p-value associated with the F-value.
What the regression diagnostics provide is an indicator of the amount of predictability in the lag. It has the same information as the Correlogram (since the square of the correlation, r², is the same as R² for a single independent variable regression equation), but it is easier to interpret. Essentially, it is argued below that, unless the R² in the regression equation is sufficiently high, one is better off using the mean or median lag for prediction. Conversely, if the R² is very high, then the user should be suspicious about the data.
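The lagged autoregression can be sketched with the closed-form solution for a single predictor. This is a minimal illustration, not the routine's actual code: the function name `lagged_regression`, the returned dictionary keys, and the sample distances are assumptions, and only the constant, coefficient, and R² are computed (no standard errors, t-values, or F-test).

```python
def lagged_regression(values, lag=1):
    """Regress each interval on the interval `lag` positions earlier (OLS)."""
    xs, ys = values[:-lag], values[lag:]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("constant series: the regression is undefined")
    coefficient = sxy / sxx
    constant = mean_y - coefficient * mean_x
    ss_residual = sum((y - (constant + coefficient * x)) ** 2
                      for x, y in zip(xs, ys))
    return {
        "constant": constant,
        "coefficient": coefficient,
        "r_squared": 1.0 - ss_residual / syy,
        "n_pairs": n,
    }


# Hypothetical distances that repeat every third interval.
distances = [1.0, 2.0, 3.0] * 4
fit_lag3 = lagged_regression(distances, lag=3)
fit_lag1 = lagged_regression(distances, lag=1)
```

At the lag that matches the periodicity (3), the fit has a constant of 0, a coefficient of 1, and an R² of 1; at a lag of 1 the R² is much lower, which is the contrast the diagnostics are meant to expose.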
CWA - Prediction
Finally, after having analyzed the sequential pattern of events, the user can make a prediction about the time and place of the next event. There are three methods for making a prediction, each with a separate lag:
1. Mean difference
2. Median difference
3. Regression equation
The method is applied to the last event in the data set. The mean difference applies the mean interval of the data for the specified lag to the last event. For example, for time interval and a lag of 1, the routine calculates the interval between each event and takes the average. It then applies the mean time interval to the last time in the data set as the prediction. The median difference applies the median interval of the data for the specified lag to the last event. For example, for bearing and a lag of 1, the routine calculates the direction (bearing) between each event, calculates the median bearing, and applies that median bearing to the location of the last event in the data set as the predicted value.
The regression equation calculates a regression coefficient and constant for the specified lag and uses the data value for the last interval as input into the regression equation; the result is the predicted value. For example, for distance and a lag of 1, the routine calculates the regression coefficient and constant for a regression equation in which each event is compared to the previous event. The last distance in the data set (i.e., between the last event and the previous event) is used as an input for the regression equation and the predicted distance is marked off from the coordinates of the last event.
In other words, the routine takes the time and location of the last event and adds a time interval, a direction, and a distance as a predicted next event (next time, next location). The method by which this prediction is made can be the mean interval, the median interval, or the regression equation. If the user specifies a lag other than 1, that lag is applied to the last event. For example, for time with a mean difference and a lag of 2, the routine calculates the time interval between each event and every other event, calculates the average and applies that average to the last event in the data set.
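The three prediction methods and the projection of the next event can be sketched as follows. Several details here are assumptions made for illustration and may differ from CrimeStat: the function names, the sample data, the decision to ignore the lag for the mean and median methods, the use of the interval `lag` positions before the predicted one as the regression input, and the fallback to the constant value when a series has no variation. Bearings are taken as degrees clockwise from due North on planar coordinates.

```python
import math
import statistics


def predict_interval(values, method="regression", lag=1):
    """Predict the next interval value from the list of past interval values."""
    if method == "mean":
        return statistics.mean(values)      # lag ignored in this sketch
    if method == "median":
        return statistics.median(values)    # lag ignored in this sketch
    if method != "regression":
        raise ValueError(f"unknown method: {method}")
    xs, ys = values[:-lag], values[lag:]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return mean_y  # constant series: reuse the constant value (assumption)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    constant = mean_y - slope * mean_x
    # Assumption: the input is the interval `lag` positions before the predicted one.
    return constant + slope * values[-lag]


def next_event(x, y, t, time_step, distance, bearing_deg):
    """Add a time step, distance, and bearing to the last event."""
    theta = math.radians(bearing_deg)
    return (x + distance * math.sin(theta),
            y + distance * math.cos(theta),
            t + time_step)


# Hypothetical intervals: time repeats every 2nd, distance every 3rd, bearing is fixed.
time_steps = [2, 4, 2, 4, 2, 4]
distances = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
bearings = [45.0] * 6

prediction = next_event(
    10.0, 10.0, 19,
    predict_interval(time_steps, lag=2),
    predict_interval(distances, lag=3),
    predict_interval(bearings, lag=1),
)
```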
CWA - Prediction Graphical Output
The CWA - Prediction routine outputs five graphical objects in 'shp', 'mif', or 'bna' formats. The routine adds five prefixes to the file name of the output object:
1. Events - a line indicating the sequence of events. If the user also brings in the points in the data set, it will be possible to number each of these steps;
2. PredDest - the predicted location for the next event;
3. Path - a line from the last location in the data set to the predicted location;
4. POrigL - a point representing the center of minimum distance of the data set. The center of minimum distance is taken as a proxy for the origin location of the offender; and
5. PW - a line from the expected origin to the predicted destination.
For example, if the user provides the file name 'NightRobberies' and specifies a 'shp' output, there will be five objects output:
EventsNightRobberies.shp
PathNightRobberies.shp
PWNightRobberies.shp
PredDestNightRobberies.shp
POrigLNightRobberies.shp
Example 1: A Completely Predictable Individual
The simplest way to illustrate the logic of the CWA is to start with a completely predictable individual. This individual commits crimes on a completely systematic basis. Table 9.5 illustrates the behavior of this individual.
Starting at an arbitrary origin with an X coordinate of 1 and a Y coordinate of 1 and on day 1, the individual commits 13 incidents in total. In the table, these are numbered events 1 through 13. Let's start with direction and distance. From the origin, the individual always travels in a Northeast direction of 45 degrees (measured clockwise from due North, which is 0 degrees). The individual's second incident is at coordinate X=2, Y=2. Thus, the individual traveled at 45 degrees from the previous incident and for a distance of 1.4142 (the hypotenuse of the right triangle created by traveling one unit in the X direction and one unit in the Y direction). The individual commits the third incident at X=4, Y=4. Thus, the direction is also at 45 degrees from the previous location but the distance is now 2.8284 (or the square root of 8, which comes from a step of 2 along the X axis and a step of 2 along the Y axis). For the fourth incident, the individual commits the crime at X=7, Y=7. Again, the direction is 45 degrees, but the distance is 4.2426 (or the square root of 18, which comes from a step of 3 along the X axis and a step of 3 along the Y axis).
Table 9.5
Example of a Predictable Serial Offender: 1
(N = 13 incidents)

Event     X     Y    Distance    Days    Time Interval
--------------------------------------------------------
   1      1     1       -          1          -
   2      2     2     1.4142       3          2
   3      4     4     2.8284       7          4
   4      7     7     4.2426       9          2
   5      8     8     1.4142      13          4
   6     10    10     2.8284      15          2
   7     13    13     4.2426      19          4
   8     14    14     1.4142      21          2
   9     16    16     2.8284      25          4
  10     19    19     4.2426      27          2
  11     20    20     1.4142      31          4
  12     22    22     2.8284      33          2
  13     25    25     4.2426      37          4
--------------------------------------------------------
Logical prediction for next event:
  14     26    26     1.4142      39          2
--------------------------------------------------------
For the fifth incident, again the individual traveled at 45 degrees to the previous incident, but repeated himself/herself with a step of only 1 unit in both the X and Y directions. The individual then continued the sequence, always traveling in a 45 degree orientation to due North. For distance, a step of 1 in both the X and Y directions is followed by a step of 2 in both directions, which is followed by a step of 3 in both directions. In other words, the individual repeats direction every time and repeats distance every third time. There is a periodicity of 1 for direction and 3 for distance.
For time interval, this individual repeats himself/herself every other time. The second event occurs 2 days after the first event. The third event occurs 4 days after the second event; the fourth event occurs 2 days after the third event; the fifth event occurs 4 days after the fourth event; and so forth. In other words, for time interval, the individual repeats himself/herself every other interval (i.e., the periodicity is 2). Figure 9.4 illustrates the sequence; the number at each event location is the number of the day that the individual committed the offense (starting at an arbitrary day 1).
Since this fictitious individual is completely predictable, we can easily guess when and where the next event will occur (see table 9.5 above). The direction will, of course, be at 45 degrees from the previous location. Looking at the last known event (event 13), the distance traveled was 4.2426. Thus, we predict that the individual will revert to a move of 1 in the X direction and 1 in the Y direction, or coordinates X=26, Y=26. Finally, for time interval, since the last known time interval was 4 days, this individual will commit the next event 2 days later, or day number 39.
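The logical prediction can be checked directly from the coordinates and days in table 9.5. The sketch below is only a verification of the arithmetic, not the CWA routine: it repeats the interval one full period back (2 for time, 3 for distance, 1 for bearing), and the variable names are hypothetical.

```python
import math

# Events from table 9.5 as (x, y, day).
events = [(1, 1, 1), (2, 2, 3), (4, 4, 7), (7, 7, 9), (8, 8, 13),
          (10, 10, 15), (13, 13, 19), (14, 14, 21), (16, 16, 25),
          (19, 19, 27), (20, 20, 31), (22, 22, 33), (25, 25, 37)]

time_steps, distances, bearings = [], [], []
for (x0, y0, t0), (x1, y1, t1) in zip(events, events[1:]):
    time_steps.append(t1 - t0)
    distances.append(math.hypot(x1 - x0, y1 - y0))
    # Bearing in degrees clockwise from due North.
    bearings.append(math.degrees(math.atan2(x1 - x0, y1 - y0)) % 360.0)

# Repeat the interval one full period back.
next_time_step = time_steps[-2]   # periodicity of 2 for time
next_distance = distances[-3]     # periodicity of 3 for distance
next_bearing = bearings[-1]       # periodicity of 1 for bearing

last_x, last_y, last_day = events[-1]
theta = math.radians(next_bearing)
predicted = (last_x + next_distance * math.sin(theta),
             last_y + next_distance * math.cos(theta),
             last_day + next_time_step)
```

Rounded to four decimal places, `predicted` is X=26, Y=26 on day 39, matching the last row of the table.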
Example 1: Analysis
The first step is to analyze the sequencing of the events. There are 13 events and 12 intervals. The correlogram produces the following output (table 9.6).
Looking at the unadjusted correlations, it can be seen that time shows an alternating pattern of perfect correlations. The first repeating positive 1.0 correlation is for lag 2, which is the exact periodicity that was specified in the example. This offender repeats the time sequence every other time. Thus, if the individual alternates between committing offenses 2 and 4 days after the last, then knowing the time interval for the last offense, it can be assumed that the next event will repeat the next-to-the-last time interval.
For distance, the highest correlation is for a lag of 3. This offender repeats himself/herself every third time, which is exactly what was programmed into the example. Thus, knowing the location of the last event, it can be assumed that the individual will choose the same distance for the next interval as three intervals earlier. Finally, all lags show a perfect 1.0 correlation for bearing. The lowest one is taken, which is a lag of 1. That is, this individual repeats the direction every single time (i.e., he/she always travels in the same direction). Thus, in summary, the correlogram shows that the individual repeats the time interval every other time, the distance every third time, and the direction every time.
[Figure 9.4: Example of a Predictable Serial Offender: 1 (N=13 Incidents). The date of each incident is shown at the event location.]
Table 9.6
Correlogram of Predictable Serial Offender: 1
The CWA - Diagnostics routine merely confirms these correlations. The regression equations yield an R² of 1.0 (unadjusted) for each of the three variables, for the appropriate lag. For example, table 9.7 below shows the regression results for distance for a lag of 3.
Table 9.7
Regression Results for Serial Offender 1: Distance
==================================================================
Variable: distance
Standard error of estimate:  0.00000
Multiple R: 1.00000          Squared multiple R: 1.00000

               Coefficient   Std Error   t         P (2 Tail)
Constant       0.000000      0.00000     0.00000   0.00000
Coefficient    1.000000      0.00000     0.00000   0.00000

Analysis of Variance
Source       Sum-of-Squares   df   Mean-Square   F-ratio   P
Regression   12.00000         1    12.00000      0.00000   0.00000
Residual     0.00000          8    0.00000
Total        12.00000         9
==================================================================
The adjusted correlogram shows a similar pattern, though the absolute correlations have been reduced. The best decision would still be for a lag of 2 for time, a lag of 3 for distance, and a lag of 1 for bearing. Figure 9.5 shows a graph of the correlogram. CrimeStat has a built-in graph function for the correlogram and adjusted correlogram.
Example 1: Prediction
Finally, for prediction, it is apparent that the best method would be to use a regression equation with lags of 2 for time, 3 for distance, and 1 for bearing. Table 9.8 shows the output. As can be seen, the routine predicts exactly the next time and location. The next event for this completely predictable serial offender will be on day 39 at the location with coordinates X=26, Y=26.
Table 9.8
Predicted Results for Serial Offender 1
Regression Equation with
Lags of 2 for Time, 3 for Distance, 1 for Bearing

Variable             Predicted value   From event   Method       Lag
----------------------------------------------------------------------
Time interval             2.00000          13       Regression    2
Distance interval         1.41421          13       Regression    3
Bearing interval         44.99997          13       Regression    1

Predicted time .........: 39.00000
Predicted X coordinate .: 26.00000
Predicted Y coordinate .: 26.00000
----------------------------------------------------------------------
The regression equation is the best model in this case. The other methods produce reasonably close approximations, however. Table 9.9 shows the results of using other methods for prediction. As seen, a model where all three components (time, distance, bearing) were lagged by 1 as well as a model where all three components were lagged by 3 also produce the expected correct answer. The mean interval and median interval methods also produce reasonably close, though not exact, answers. In this particular case, the regression method with the best lags produced the optimal solution.
Example 2: Another Completely Predictable Individual
A second example is also a perfectly predictable individual. This time, the directional component changes. The directional trend is northward, but with changes in angle every third event. The time pattern is completely consistent, with subsequent events occurring every two days. Table 9.10 presents the pattern and the logical next event while figure 9.6 displays the pattern.
Figure 9.5: Correlogram of Serial Offender: 1
[Figure: lagged correlation (-1.0 to 1.0) by lag (0 to 7) for time, distance, and bearing]
Table 9.9
Comparison of Methods for Predictable Serial Offender 1

Table 9.10
Example of a Predictable Serial Offender: 2
(N = 12 incidents)
Event    X     Y    Distance    Days    Time Interval
------------------------------------------------------
  1      3     1                  1
  2      1     3     2.8284       3          2
  3      1     5     2.0000       5          2
  4      3     7     2.8284       7          2
  5      1     9     2.8284       9          2
  6      1    11     2.0000      11          2
  7      3    13     2.8284      13          2
  8      1    15     2.8284      15          2
  9      1    17     2.0000      17          2
 10      3    19     2.8284      19          2
 11      1    21     2.8284      21          2
 12      1    23     2.0000      23          2
------------------------------------------------------
Logical prediction for next event:
 13      3    25     2.8284      25          2
------------------------------------------------------
Figure 9.6: Example of a Predictable Serial Offender: 2 (N=12 Incidents)
[Figure: map of incident locations labeled by date of incident (1 to 25), with compass rose]
The correlogram reveals that both distance and bearing repeat themselves every
third event while the time interval is repeated every time. The regression diagnostics show
that there is perfect predictability for time and for distance, and high predictability for
bearing (not shown). Finally, a regression model is used for prediction with lags of 1 for
time, 3 for distance, and 3 for bearing. The model correctly predicts the expected time
(days=25) and location (X=3, Y=25). Table 9.11 shows the results.
Methodology for CWA
These two examples illustrate what the CWA routine is doing. There are three
steps. First, the sequential pattern is analyzed with the correlogram. This shows which
lags have the strongest correlations for time, distance, and bearing separately.
Second, the pattern is tested with a regression model. The purpose is to determine how
strong a relationship any particular model captures. As will be suggested below, if a model
is too weak or, conversely, too strong, it most likely will not predict very well. Third, a
prediction model is selected. The user can utilize the regression model or use the mean
interval or median interval.
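
The three steps can be sketched on the data of Example 2 (Table 9.10). This is a simplified illustration under stated assumptions, not the CrimeStat implementation: the function names are hypothetical, bearing is assumed to be measured in degrees clockwise from north and is treated as a plain number (ignoring the wraparound at 360 degrees), and a constant series falls back to its mean interval because a regression on a constant predictor is undefined.

```python
import numpy as np

def intervals(x, y, t):
    """Time, distance, and bearing intervals between successive events."""
    dx, dy = np.diff(x), np.diff(y)
    distance = np.hypot(dx, dy)
    # Assumed convention: bearing in degrees clockwise from north.
    bearing = np.degrees(np.arctan2(dx, dy)) % 360.0
    return np.diff(t), distance, bearing

def lagged_correlation(series, lag):
    """Step 1: correlation between each interval and the one `lag` steps earlier."""
    later, earlier = series[lag:], series[:-lag]
    if np.ptp(later) == 0 or np.ptp(earlier) == 0:
        return float("nan")  # undefined for a constant series
    return float(np.corrcoef(earlier, later)[0, 1])

def predict_interval(series, lag):
    """Steps 2 and 3: regress interval[i] on interval[i - lag], then predict the next one."""
    later, earlier = series[lag:], series[:-lag]
    if np.ptp(earlier) == 0:
        return float(np.mean(series))  # constant series: use the mean interval
    slope, intercept = np.polyfit(earlier, later, 1)
    return float(intercept + slope * series[-lag])

# Events 1 to 12 from Table 9.10.
x = np.array([3, 1, 1, 3, 1, 1, 3, 1, 1, 3, 1, 1], dtype=float)
y = np.arange(1, 24, 2, dtype=float)
t = np.arange(1, 24, 2, dtype=float)

dt, dist, bearing = intervals(x, y, t)
dist_r3 = lagged_correlation(dist, 3)   # distance repeats every third event

# Lags of 1 for time, 3 for distance, and 3 for bearing, as in the text.
next_dt = predict_interval(dt, 1)
next_dist = predict_interval(dist, 3)
next_bearing = predict_interval(bearing, 3)

theta = np.radians(next_bearing)
next_x = x[-1] + next_dist * np.sin(theta)
next_y = y[-1] + next_dist * np.cos(theta)
next_day = t[-1] + next_dt
print(round(next_x, 3), round(next_y, 3), round(next_day, 3))
```

Under these assumptions the sketch reproduces the prediction reported for Example 2 (day 25 at X=3, Y=25). With real offenders the lagged relationships are far weaker, so the same steps will not return exact answers.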
Table 9.11
Comparison of Methods for Predictable Serial Offender 2

Example 3: A Real Serial Offender
How well does the CWA routine work with real serial offenders? People are not as
predictable as these examples; the examples are algorithmic and people don’t work like
algorithms. But, to the extent to which there is some predictability in human behavior, the
CWA routine can be a useful tool for crime analysis, detection, and apprehension.
To illustrate this, a serial offender was identified from a large data set obtained from
Baltimore County. The individual committed 16 offenses between 1992 and 1997, when he
was eventually apprehended. The profile of crimes committed by this individual was quite
diverse. There were 11 larceny incidents (shoplifting and bicycle theft), 1 residential
burglary, 1 commercial burglary, 2 assaults, and 1 robbery.

To test the model, the first 15 incidents were used to predict the 16th. This allowed
the error between the observed and predicted values for time and location to be used for
evaluation. Figure 9.7 shows the sequencing of actions of the first 15 incidents committed
by this individual, most of which occurred in the eastern part of Baltimore County.
The correlogram revealed a complicated pattern (figure 9.8). The adjusted matrix
was used because of the high correlations at higher-order lags. Nevertheless, the optimal
lags appeared to be 1 for time, 3 for distance, and 6 for bearing. A regression model was
used to test these parameters. Figure 9.7 also shows the predicted location of the next
event (the red plus sign) and the location where the individual actually committed
the 16th event (green triangle). The prediction was good. The distance between the
actual and predicted locations was 1.8 miles and the error in predicting the time of the next
event was 3.9 days. Overall, the model did quite well for this individual.
Event Sequence as an Analogy to a Correlated Walk
Nevertheless, there are problems in the model for this case. First, this is not a true
sequence of actions, but a pseudo-sequence. The individual doesn’t go from the first event to
the second event to the third event, and so forth. A considerable time may elapse between
events. Similarly, distance and direction are conceptual only, not real. For example, in
figure 9.7, the individual did not actually travel across the inlets of the Chesapeake Bay as
the lines indicate. Distance between the events was actually much greater than estimated
by the model and direction was more complex. Nevertheless, to the extent to which an
individual makes a spatial decision about where to go, implicitly he or she is making a
directional and distance decision. In other words, the decision making process may take
into account prior locations. In this case, the CWA routines would be useful.
Example 4: A Second Real Serial Offender
A second real example confirms that the method can produce reasonably close
predictions. An offender committed 13 crimes, including three incidents of shoplifting, eight
incidents of theft from a vehicle, one residential burglary, and one highway robbery. The
correlogram showed that a lag of 1 was strongest for time, distance, and bearing (figure 9.9).
The R-squares were moderate (0.45 for time; 0.18 for distance; 0.18 for bearing). Using the
regression method with a lag of 1 for each component, the likely location of the next event
was predicted (Figure 9.10). The error between the predicted event and the actual event
was, again, reasonable with a difference in time of 3.3 days and a difference in distance of
2.4 miles.
Figure 9.8: Correlogram of Actual Offender
[Figure: lagged correlation (-1 to 1) by lag (0 to 7) for time, distance, and bearing]

Figure 9.9: Correlogram of Another Offender
[Figure: lagged correlation (-1 to 1) by lag (0 to 7) for time, distance, and bearing]

Figure 9.10: Likely Location for Next Crime: Another Serial Offender in Baltimore County
[Figure: map showing committed incidents, sequence of incidents, predicted path, predicted
next location, actual next location, major streets, and streets; scale bar 0 to 2 miles]

Tracking a Burglary Gang with the Correlated Walk Analysis
Bryan Hill
Glendale Police Department
Glendale, AZ
The space-time analysis tools provided with CrimeStat II add an important
element to an analyst’s review of a tactical prediction effort. Although the method
for calculating the Correlated Walk Analysis (CWA) is still more experimental than
proven, it allows the analyst to see potential patterns in relation to a suspect’s crime
travel in terms of time, distance, and direction. In a recent burglary series involving
several jurisdictions in our county, the CWA technique was used as part of an
aggregate process referred to as the Probability Grid Method. That method
combines results from several models to predict the next likely area for a new hit in
a crime series. One of the most confusing aspects of these burglaries was the fact
that several jurisdictions were involved and the offenders seemed to bounce back
and forth from one jurisdiction to the next.
There were also 219 offenses in the series, providing considerable complexity.
Because there were so many events, and the distances could be anywhere from 0.5 miles
to 20 miles, I could never really put my finger on what direction or distance the
offender would hit next, but I was confident a pattern existed and was likely changing
over time. The following map shows the probability grid areas predicted and the
CWA points predicted. The triangles shown represent the last four hits. The first hit
was near the probability grid prediction in the northern portion of the map; however,
the subsequent hits were all very close to where the CWA routine predicted they
would be. This was also a brand new area for these offenders and was a surprise to
the department investigating these incidents. This area was not what was expected
based on the SD ellipses and other methods used to predict the next event. The
CWA tool requires more testing to determine the accuracy of its predictions; however,
it may turn out to be a valuable tool in a crime analyst’s arsenal.
Accuracy of Predictions
However, it’s important not to be overly optimistic about the technique. It is always
possible to find cases that fit a method very well. The above mentioned cases appear to do
that. Unfortunately, the method is not a magic elixir for predicting serial offenders. Like
any method, it has error. It is also a fairly new tool in crime analysis so that we don’t have
a lot of experience with it. The one example of its use was by Helms (1999), who also is
cautious about its utility.

Therefore, at this point, I cannot give conclusive results about whether the method is
accurate or not and under what conditions it is best used. It will take some experience to
know how effective it is for crime analysis.
To explore the accuracy of the method, 50 serial offenders were identified from a
large database of more than 41,000 incidents in Baltimore County between 1993 and 1997
(see Chapter 10 for details). The 50 offenders were identified based on knowing the dates
on which they committed crimes, or at least on which they committed crimes for which they
were charged and eventually tried. The number of incidents varied from a low of 7
incidents to a high of 38 incidents. An attempt was made to produce balance in the number
of incidents, though the actual distribution of cases did reflect the availability of candidates
in the database. For the fifty individuals, the distribution of incidents was 7 (five
individuals), 8 (four individuals), 9 (six individuals), 10 (two individuals), 11 (five
individuals), 12 (five individuals), 13 (six individuals), 14 (three individuals), 15 (six
individuals), 17 (two individuals), and one individual each for 20, 21, 24, 29 and 38
incidents.

To test the CWA model, the last event committed by these individuals was removed
so that N-1 events could be used to predict event N. In this way, it is possible to evaluate
the accuracy of the method.
Ten methods were compared:

1.  The optimal regression method for time, with the lag having the strongest
    relationship being selected;
2.  The optimal regression method for location (distance and bearing), with the
    lags for distance and bearing having the strongest relationship being selected;
3.  A regression model for time with a lag of 1;
4.  A regression model for location with a lag of 1 (for both distance and bearing);
5.  The mean interval for time;
6.  The mean interval for location (distance and bearing);
7.  The median interval for time;
8.  The median interval for location (distance and bearing);
9.  The mean center of the incidents (for location only); and
10. The center of minimum distance of the incidents (for location only).

The latter two methods were used for reference. For journey to crime estimation, the
center of minimum distance is the best at predicting the origin location of serial offenders
(see chapter 10). The reason is that this statistic minimizes the distance to all incident
locations. The mean center was close behind, though not quite as good. As an estimate, the
center of minimum distance is a very good index when there is a single origin that is being
predicted. On the other hand, where the purpose is to predict the location of a next event,
the center of minimum distance and mean center may be less than useful since they will not
generally predict the actual next location. They minimize error, but are rarely accurate.
For example, in the above mentioned cases (two theoretical and two real), these statistics
did not predict accurately the location of the next event. Instead, they identified a point in
the middle of the distribution where the sum of the distances to all incident locations was
small.
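
The two reference statistics can be illustrated with a short sketch. It is offered under assumptions rather than as the CrimeStat algorithm: the center of minimum distance is approximated here with the Weiszfeld iteration, a standard method for finding the point that minimizes the summed distance to a set of points, and the function names and incident coordinates are hypothetical.

```python
import numpy as np

def mean_center(points):
    """Arithmetic mean of the X and Y coordinates."""
    return points.mean(axis=0)

def center_of_minimum_distance(points, tol=1e-9, max_iter=1000):
    """Point minimizing the summed distance to all incidents (Weiszfeld iteration)."""
    estimate = points.mean(axis=0)  # start from the mean center
    for _ in range(max_iter):
        d = np.linalg.norm(points - estimate, axis=1)
        weights = 1.0 / np.maximum(d, 1e-12)  # guard against division by zero
        updated = (points * weights[:, None]).sum(axis=0) / weights.sum()
        if np.linalg.norm(updated - estimate) < tol:
            return updated
        estimate = updated
    return estimate

def total_distance(points, center):
    """Sum of straight-line distances from a center to every incident."""
    return float(np.linalg.norm(points - center, axis=1).sum())

# Hypothetical incident coordinates with one outlying event.
incidents = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [10.0, 10.0]])
mc = mean_center(incidents)
cmd = center_of_minimum_distance(incidents)
print(mc, total_distance(incidents, mc))
print(cmd, total_distance(incidents, cmd))
```

The outlying event pulls the mean center well away from the main group, while the center of minimum distance stays close to it. Neither point coincides with any single incident, which is why both serve as references rather than as predictions of the next location.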
Error Analysis
Each of the models was compared to the actual time and location of the last, removed
incident. For time, the error measure was in days (the absolute difference between the
actual day and the predicted day). For location, the error measure was in miles (i.e.,
absolute distance between the actual and predicted location). The results were mixed.
Overall, error was moderate. Table 9.12 summarizes the overall error.
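
The two error measures are simple to compute. The following sketch uses made-up values for three offenders, not data from the study; it assumes coordinates are already expressed in miles on a flat plane, and the function names are hypothetical.

```python
import math
from statistics import mean, median

def time_error_days(actual_day, predicted_day):
    """Absolute difference between the actual and predicted day."""
    return abs(actual_day - predicted_day)

def location_error_miles(actual_xy, predicted_xy):
    """Straight-line distance between actual and predicted locations (miles assumed)."""
    return math.dist(actual_xy, predicted_xy)

# Hypothetical (actual, predicted) pairs; these numbers are illustrative only.
cases = [
    {"day": (100.0, 96.0), "xy": ((3.0, 4.0), (0.0, 0.0))},
    {"day": (50.0, 80.0), "xy": ((1.0, 1.0), (1.0, 3.0))},
    {"day": (200.0, 190.0), "xy": ((0.0, 0.0), (6.0, 8.0))},
]

time_errors = [time_error_days(*c["day"]) for c in cases]
dist_errors = [location_error_miles(*c["xy"]) for c in cases]

# Report both the average and the median, as Table 9.12 does.
print(mean(time_errors), median(time_errors))
print(mean(dist_errors), median(dist_errors))
```

Reporting the median alongside the average is useful because a single badly missed prediction can inflate the average considerably.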
Overall, the center of minimum distance and the mean center do produce, as
expected, smaller errors for distance than any of the CWA methods; as noted above,
locations in the middle of the distribution of incidents will minimize error, but they won’t
predict accurately the location of a next event nor indicate in which direction it will occur
from the last event. On the other hand, the CWA methods are not particularly accurate,
either. They work very well for a completely predictable offender, as was seen in the
examples above, but not necessarily for real offenders.

Among the CWA methods, the mean interval, median interval and the lag 1
regression appear to give better results for time than the optimal regression. Overall, the
median interval produces the lowest median error, which is about a month and a half. In
terms of location, the mean interval and median interval produce slightly better results
than the optimal regression, though the lag 1 regression was just as good.
Comparison of CWA Methods
At this point, it is unclear when it is best to use this technique. Three variables
seem to explain part of the error variation. First, a larger sample size leads to better
prediction, as would be expected (Table 9.13).

For time, there is definitely an improvement in predictability with larger sample
sizes. Among these methods, the mean interval and lag 1 regression show the smallest
error for the largest samples (14 or more cases). For distance, on the other hand, the
error generally increases with increasing sample size. The one exception is the optimal
regression method, where medium-sized samples (10-13 cases) produce the lowest error.
Table 9.12
Average and Median Error for CWA Methods
50 Serial Offenders

Method                               Average Error   Median Error
------------------------------------------------------------------
Time (days)
  Optimal regression: time               112.2           79.8
  Lag 1 regression: time                  88.1           70.0
  Mean interval: time                     89.7           64.9
  Median interval: time                   91.2           45.5

Distance (miles)
  Optimal regression: location             6.4            5.4
  Lag 1 regression: location               5.7            4.2
  Mean interval: location                  5.8            4.7
  Median interval: location                5.3            3.9

Reference Location (miles)
  Mean center                              3.3            1.7
  Center of minimum distance               3.1            1.2
------------------------------------------------------------------

Variables Affecting Predictability
Long time span
There are a variety of reasons for these strange results, but one reason may be the
time span of the events. Some of these offenders committed crimes over a long period, up to
five years. Sample size is intrinsically related to the time span (r=0.55). The longer the
time span that an offender commits crimes, the more incidents he/she will perpetrate. With
increasing time, the individual’s behavior patterns may change.

For those offenders with many incidents, a separate analysis was conducted of the
events occurring within the last year. Many of these individuals appeared to have moved
their base of operation over time, so the isolation of the most recent events was done in
order to produce a clearer behavior pattern. The results, while promising, were not
dramatic. Accuracy was improved a little compared to using the full sequence, particularly
spatial accuracy. However, even with the last few events, these frequently occurred over a
long time period (up to two years). Consequently, the idea of isolating a ‘clean’ set of events
did not materialize, at least with these data. On the other hand, with a data set of only
recent events, it may be possible to improve predictability.
Table 9.13
Sample Size and Prediction Error
(Average Error)

Time (days)
Sample     Optimal       Lag 1         Mean        Median
Size       Regression    Regression    Interval    Interval
------------------------------------------------------------
6-9          143.4         108.5        116.4       120.8
10-13        108.2          86.8         83.4        79.5
14+           79.8          65.1         65.7        71.2

Distance (miles)
Sample     Optimal       Lag 1         Mean        Median
Size       Regression    Regression    Interval    Interval
------------------------------------------------------------
6-9            7.4           5.2          5.0         4.4
10-13          5.5           6.0          5.7         5.5
14+            6.1           5.9          6.8         6.1

Centrographic: Distance (miles)
Sample     Mean          Center of
Size       Center        Minimum Distance
------------------------------------------------------------
6-9            2.9           2.4
10-13          2.9           3.1
14+            4.3           4.1

Table 9.14
Regression Diagnostics and Prediction Error
Comparison of CWA Regression Methods

                   Time (days)                Distance (miles)
               Optimal       Lag 1         Optimal       Lag 1
R-Square       Regression    Regression    Regression    Regression
--------------------------------------------------------------------
0-0.29            93.7          90.9           6.7           6.3
0.30-0.59         89.3          33.8           6.0           5.0
0.60+            164.3         122.7           6.3           5.2
--------------------------------------------------------------------

Strength of predictability
A second variable that appears to have an effect is the strength of predictability,
based on the first N-1 cases. For the diagnostics routine, as the overall R-square for the
regression equation increases, the regression equation does better. However, with very high
R-square coefficients, the error is worse. Table 9.14 shows the relationship.

The lowest error is obtained with moderate R-square coefficients, for both time and
distance. This is why one has to be careful with very high lagged correlations in the
correlogram and high R-squares in the diagnostics. Unless one is dealing with a perfectly
predictable individual (as the two theoretical examples illustrated), high correlations may
be a result of a very small sample size, rather than any inherent predictability.
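
The R-square referred to here can be computed for any lag, and it helps to report how many pairs of intervals it rests on. The sketch below assumes a simple one-predictor regression, for which R-square equals the squared correlation between an interval and the interval `lag` steps earlier; the function name is hypothetical and the series is the distance column of Table 9.10.

```python
import numpy as np

def lagged_r_square(series, lag):
    """Return (R-square, number of pairs) for interval[i] regressed on interval[i - lag]."""
    series = np.asarray(series, dtype=float)
    later, earlier = series[lag:], series[:-lag]
    if np.ptp(earlier) == 0 or np.ptp(later) == 0:
        return float("nan"), len(later)  # undefined for a constant series
    r = np.corrcoef(earlier, later)[0, 1]
    return float(r * r), len(later)

# Distance intervals between the 12 events of Table 9.10 (11 intervals).
dist = np.tile([2.8284, 2.0, 2.8284], 4)[:11]

r2_lag1, n_lag1 = lagged_r_square(dist, 1)
r2_lag3, n_lag3 = lagged_r_square(dist, 3)
print(r2_lag1, n_lag1)
print(r2_lag3, n_lag3)
```

For this algorithmic offender the lag 3 R-square is essentially 1 because the pattern truly repeats. For a real offender with a similar number of events, an R-square that high would rest on only a handful of pairs and should be read with the caution described above.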
Limitations of the Technique
In short, users should be careful about using the CWA technique. It can be useful for
identifying repeating patterns by an offender, but it won’t necessarily predict accurately the
offender’s next actions. There are a variety of reasons for the lack of predictability. First,
there may be intermediate events that are unknown. With each of these offenders in the
Baltimore County database, there is always the possibility that the individuals committed
other crimes for which they were not charged. The sequential analysis assumes that all the
events are known. But this may not be the case.

A simulation on several cases was conducted by removing events and then
rerunning the correlogram and prediction models. Removing one event did not appreciably
alter the relationship, but removing more than one event did. In other words, if there are
unknown events, the true sequential behavior pattern of the offender may not be properly
identified. Considering that most offenders commit fewer than 10 incidents before they get
caught, the statistical effect of missing information may be critical.
A second reason has been alluded to already. In applying the model to crime events,
it is not a true sequential model, but a pseudo-sequential model since much time may
intervene between events. Distance and direction are conceptual in the sense that the
individual doesn’t directly orient from one event to the other, but returns to his/her living
patterns. Thus, what may appear to be a repeating pattern may not be. Here, the issue of
sample size is critical. If there are only a few incidents on which to base an analysis, one
could see a pattern which actually doesn’t exist. One has to be careful about drawing
inferences from very small samples.

A third reason is that people are inherently unpredictable. The two algorithmic
examples produced excellent results, but few persons are that systematic about their
behavior. Therefore, we must be cautious in expecting too much out of the model.
Conclusion
Nevertheless, the model has utility. First, it can help police identify whether there is
a pattern in an offender’s behavior. Knowing that there is a pattern can help in planning
an arrest strategy. Even if the strategy does not pay off every time, it may improve police
effectiveness. In short, the CWA can help a police department analyze the sequential
behavior of an offender they are trying to catch. They may be able to anticipate a new event
and may be able to warn people who are more likely to be attacked by this individual. If
used carefully, the model can be useful for crime analysis and detection.
Second, it can encourage the development of additional predictor tools for
individuals. As mentioned above, the center of minimum distance produces a ‘best guess’
estimate in the sense that it minimizes the distance to the next event. It usually doesn’t
predict the next event, but it does produce a minimal error. If used in conjunction with the
CWA, it may be possible to narrow the search area for the next event.

Third, the CWA model can stimulate research into crime prediction. Police are
always trying to predict the next event by an offender and will use multiple techniques and
a lot of intuition in trying to ‘out-guess’ an offender. It is hoped that the CWA model will
stimulate more research into predicting the sequence of offender behavior as well as into
how those sequences aggregate into a large spatial pattern. Most of this text has been
devoted to analyzing the spatial patterns of a large number of events. The statistics have,
perhaps naively, assumed that each of those events was independent. In reality, they
aren’t, since many crimes are committed by the same individuals. In theory, a distribution
of crime incidents could be disaggregated into a distribution of sequences of events
committed by the same offenders, if we had enough information. Understanding how
aggregate distributions are a by-product of the behavior of a limited number of individuals
is an important research goal that needs to be addressed.

In the next chapter, we’ll look at Journey-to-crime modeling and at the issue of
modeling criminal travel behavior.
Endnotes for Chapter 9

1.  It would be possible to make a one-tailed test with the simulation. For example, if
    one is only interested in the degree of clustering, one could adopt the 95 percentile
    as the threshold. An observed Mantel value that was lower than this threshold
    would be consistent with the null hypothesis.

2.  Henderson, Renshaw and Ford (1981) defined the correlated walk as a
    two-dimensional walk where the sum of the probabilities in four directions along a
    lattice is:

        P = p + q + 2r = 1

    where P is the total probability (1), p is the probability of continuing in the same
    direction, q is the probability of moving in an opposite direction, and r is the
    probability of moving one unit to the right or to the left. The advantage of this
    formulation is that the probabilities do not have to be equal (i.e., p could exceed q or
    r). Nevertheless, the individual steps can be considered a special case of a
    correlated random walk in the plane (Henderson, 1981).

    The non-lattice two-dimensional case can also be considered a recurrent random
    walk since a step in any direction (not just along a lattice) can be considered the
    result of two steps, one in the X direction and one in the Y (or, alternatively, a
    pairing of all steps in the X direction with all steps in the Y direction).
    Unfortunately, this logic does not apply to more than two dimensions. Such
    multidimensional walks do not have to return to their origin. However, Spitzer
    (1963) has shown that an independent walk is recurrent if the second moment
    around the origin is finite.
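
The lattice formulation in endnote 2 can be simulated directly. The sketch below is an illustration under stated assumptions, not code from Henderson, Renshaw and Ford: the function name, the starting heading (north), and the example probabilities are hypothetical. At each step the walker continues with probability p, reverses with probability q, or turns right or left with probability r each.

```python
import random

def correlated_lattice_walk(n_steps, p, q, r, seed=0):
    """Simulate a correlated walk on a square lattice with p + q + 2r = 1."""
    if abs(p + q + 2 * r - 1.0) > 1e-9:
        raise ValueError("probabilities must satisfy p + q + 2r = 1")
    rng = random.Random(seed)
    x, y = 0, 0
    dx, dy = 0, 1  # assumed initial heading: north
    path = [(x, y)]
    for _ in range(n_steps):
        u = rng.random()
        if u < p:
            pass                     # continue in the same direction
        elif u < p + q:
            dx, dy = -dx, -dy        # reverse direction
        elif u < p + q + r:
            dx, dy = dy, -dx         # turn right
        else:
            dx, dy = -dy, dx         # turn left
        x, y = x + dx, y + dy
        path.append((x, y))
    return path

# Hypothetical probabilities: strong persistence, rare reversals.
walk = correlated_lattice_walk(200, p=0.5, q=0.1, r=0.2, seed=1)
print(walk[-1])
```

Setting p above q makes successive steps positively correlated in direction, which is the property that distinguishes a correlated walk from a simple random walk, where all four directions would be equally likely.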