Development of Accident Prediction Models for the Highways of Thailand
Lalita Thakali
Transportation Engineering

Outline of Presentation
• Statement of Problems
• Objective
• Methodology
• Preliminary analysis
• Model development
• Identification of hazardous location
• Conclusion
• Recommendation

Statement of Problems
• In 2002, the economic loss due to road accidents was estimated at approximately 115,932 million baht, or 2.13% of GDP.
• 18% of annual road accidents occur on the highways of Thailand.
(Source: Annual Report (2005) of the Bureau of Traffic Safety)

Trend of accidents on highways of Thailand
Year   Accidents   Fatalities   Injuries   Property Damage (THB)
2001   15,341      2,212        12,712     352,851,000
2002   15,066      2,265        13,285     445,236,000
2003   15,171      2,023        12,984     464,248,000
2004   18,547      2,324        18,381     425,623,000
2005   16,287      2,169        15,300     405,248,000

Budget allocation for road safety for highways of Thailand
Year   Budget (million baht)
2002   1,400.000
2003   1,400.000
2004   1,770.000
2005   1,644.999

Causes of Accidents in Thailand
[Figure: overlapping shares of accident causes for Human, Vehicle, and Road & Environment; values shown: 63%, 5%, 1%, 20%, 4%, 3%, 6%]

How to Address Road Safety Problem
By accident modeling:
1 Descriptive model
2 Predictive model
3 Risk model
4 Accident consequences model

Objective of Study
1 Identify existing accident characteristics
2 Develop generalized accident prediction models for highways using different statistical techniques
3 Identify hazardous locations

Methodology
[Flowchart]
• Literature review
• Site selection
• Data collection: DoH historical accident data, DoH traffic data, meteorological data, video data
• Homogeneous sections l1, l2, ..., ln-1, ln (i = 1, 2, ..., n)
• Monthly accident data (λij): accident, fatality, injury, property damage
• Explanatory variables (Xij)
• Preliminary data analysis (characteristics of accidents and severities)
• Identification of possible variables

Site Selection
1. Why route no. 4
Year      Accidents    Accidents (major      Accidents    % of accidents    % of accidents (route 4
          (highways)   routes 1, 2, 3, 4)    (route 4)    (major routes)    w.r.t. major routes)
2001      15341        3228                  800          21.04             24.783
2002      15066        3142                  869          20.85             27.658
2003      15171        2982                  949          19.66             31.824
2004      18547        3534                  993          19.05             28.098
2005      16287        3016                  861          18.52             28.548
2006      10597        2077                  552          19.60             26.577
Average   15168        2997                  837          19.79             27.91

2. Why route no. 4 in Ratcha Buri & Nakhonpathom (values in %)
Year      Accident   Fatality   Injury   Property Damage
2001      29.9       15.1       10.4     15.6
2002      23.5       10.0       8.6      13.8
2003      19.0       5.0        5.2      7.3
2004      34.4       11.4       24.0     10.3
2005      29.6       10.0       14.9     7.3
2006      29.2       3.0        22.3     5.9
Average   27.59      9.10       14.24    10.04

3. Relatively high number of AADT count stations

27.59% of accidents occur in the selected site, although it covers only 8.56% of the total road length of route no. 4.

Study Area
Nakhonpathom and Ratcha Buri, route no. 4
Total length = 117.93 km

Data Collection
Sources:
1. DoH historical accident data
2. DoH traffic data
3. Meteorological data
4. Video data

Data collected (source numbers in parentheses):
• Accident, 2001-2006, DoH (1): total accidents, fatality, injury, property damage
• Traffic: AADT (2), % of heavy vehicles (2)
• Weather: rainfall (3)
• Calendar: month
• Geometric: number of lanes (4, 1), type of median (4, 1), shoulders available (4), number of curves (4), number of intersections (4), number of accesses (4)
Data: years 2001 to 2006

Accident Rate (MVK)
Fatality rate in the study area: 3.08 per MVK (million vehicle-kilometres)
Country   Rate
Canada    0.01
France    0.02
Germany   0.02
Italy     0.01
UK        0.01
USA       0.001
Bahrain   0.002
Egypt     0.44
Oman      0.04
Yemen     0.11
The average fatality rate in this study area is much higher than the rate in other countries.
It is seven times greater than that of Egypt.

Causes of Accident (Objective 1)
S.N     Type of cause                                  Total   %
1       Exceeding maximum speed limit                  1051    76.10
2       Exceeding maximum speed limit + others         31      2.24
3       Improper passing                               47      3.40
4       Improper passing + others                      4       0.29
5       Failure to yield right                         1       0.07
6       Failure to temporary stop, slow down, turn     1       0.07
7       Disregarding traffic signal marking            7       0.51
8       Vehicle defective                              24      1.74
9       Drunkenness                                    1       0.07
A       Sleepy                                         8       0.58
B       Others                                         206     14.92
Total                                                  1381    100
Exceeding the maximum speed is mostly due to the human-vehicle interaction with the geometric features of the road; this can be addressed in the model by including geometric variables.

Location of Accident (Objective 1)
[Figure: accident, fatality, injury and property damage split by location. Series 1: 45%, 78%, 67%, 73%; series 2: 55%, 22%, 33%, 27%. Legend: intersection, segment.]

Weather-related Accident
Weather   Accident   %     Fatality   %     Injury   %     PD      %
Clear     1018       74    59         92    421      65    14295   87
Fog       1          0     0          0     0        0     22      0
Rain      125        9     0          0     40       6     1654    10
Other     237        17    5          8     187      29    398     2
Total     1381       100   64         100   648      100   16369   100
Note: PD = Property Damage (1000 baht)

Surface Condition
Surface   Accident   %     Fatality   %     Injury   %      PD      %
Dry       990        72    59         92    394      61     14045   86
Dirty     1          0     0          0     0        0.00   27      0
Wet       125        9     0          0     40       7      1628    10
Other     265        19    5          8     214      33     669     4
Total     1381       100   64         100   648      100    16369   100

Vehicle Involvement
Vehicle              Total   %
Pedestrian           21      1.04
Bicycle              3       0.15
Tricycle             2       0.10
Motorcycle           392     19.34
Trimotcycle          480     23.68
Passenger car        605     29.85
Light bus            59      2.91
Light truck          98      4.83
Heavy vehicle (HV):
Heavy bus            218     10.75
Medium truck         1       0.05
Heavy truck          92      4.54
Farm vehicle         56      2.76
HV are involved in only 16% of the total number of accidents, while they represent 22.39% of total traffic volume.

Accident Distribution Based on Month (Objective 1)
[Figure: accident distribution by month]

Variables (per month) (Objective 2)
Dependent variables
Variable   Total   Mean    Std     Units
Accident   1220    1.22    1.89    Number
Fatality   61      0.06    0.33    Person
Injury     578     0.54    1.68    Person
PD         15596   15.66   50.05   Thousand baht

Independent variables
Variable            Units
AADT                Number (*1000)
HV                  %
Length              km
Lane                Number
Access (A')         Number/km
Intersection (I')   Number/km
Curve (C')          Number/km
Rain (R)            mm

Categorical variables
Variable        Category
Median (MD)     Divided (1), Undivided (0)
Shoulder (S)    No (1), Yes (0)
Month (M)       Others (1), April (0)

• AADT and lane are highly correlated, so lane has been excluded.
• Data from 2001-2005 were used in model development.
• Variables were identified from the literature, the preliminary analysis and data availability.
• Addition of variables: forward selection.

Pearson Correlation (variables: Accident, Fatality, Injury, PD, AADT (*1000), HV, Length (km), Lane, A, I, C, R)
Dependent Variables Accident 1 Fatality Injury 0.217 1 PD 0.65 0.386 0.251 0.189 1 0.201 AADT HV 0.488 0.083 0.132 0.006 0.258 0.071 0.149 0.035 0.109 0.461 0.117 0.238 0.363 0.899 0.379 -0.294 0.536 0.145 0.487 0.075 0.276 0.54 0.019 0.028 -0.048 0.014 -0.126 0.141 -0.012 0.083 0.128 0.113 -0.025 -0.078 -0.077 0.002 0.052 -0.05 0.172 0.084 0.004 Length Lane A I C R 1 Independent Variables 0.295 1 0.118 0.335 1 -0.148 -0.07 0.005 1 0.587 0.249 -0.023 1 0.229 1 0.367 1 0.247 0.357 0.724 1 0.067 0.053 -0.02 1 0.002 0.019

Pearson Correlation (lane excluded; A', I', C' per km)
           Accident   Fatality   Injury   PD       AADT     HV       Length   A'       I'       C'       R
Dependent variables
Accident   1
Fatality   0.217      1
Injury     0.650      0.251      1
PD         0.386      0.189      0.201    1
Independent variables
AADT       0.487      0.132      0.258    0.295    1
HV         0.090      0.008      0.074    0.122    0.341    1
Length     0.149      0.035      0.109    -0.005   -0.148   -0.066   1
A'         0.382      0.123      0.375    0.043    0.387    0.222    -0.243   1
I'         -0.196     -0.031     -0.131   -0.097   -0.183   -0.276   -0.402   0.100    1
C'         -0.074     -0.023     -0.051   -0.156   -0.158   -0.442   -0.229   -0.168   0.699    1
R          -0.025     -0.078     -0.077   0.002    0.052    -0.005   -0.023   0.021    -0.002   -0.024   1

Methodology cont.
[Flowchart]
• Forward selection of variables
• Model development: GLM Poisson regression, GLM NB regression
  $E(\lambda) = \exp\left(\sum_j \beta_j X_{ij}\right)$
  where λ = accidents per month, βj = parameter coefficient, Xij = explanatory variable
• Is the included variable significant, and is the goodness of fit better?
  If yes: continue to include. If not: exclude the variable.
• Any explanatory variables remaining?
  Yes: return to forward selection of variables. No: selection of model (Poisson or NB).
• Accident data 2006 (visual validation)
• Identification of hazardous location

Generalized Linear Model? (Objective 2)
• Accident modeling: descriptive model, predictive model, risk model, accident consequences model
• Techniques: Empirical Bayes, fuzzy logic, multivariate model, artificial neural network
• Linear model: assumes a normal distribution of accidents with constant mean and variance
• GLM: only Poisson / negative binomial regression models (Poisson trial)
• Accidents normally follow Poisson trials rather than binomial trials

Generalized Linear Model cont. (Objective 2)
Link function
• Linear model: $E(\lambda) = \mu = \sum_j \beta_j X_{ij}$
• Generalized linear model: $\eta = \sum_j \beta_j X_{ij}$, $\lambda(i,t) = e^{\eta} = e^{\sum_j \beta_j X_{ij}}$
• The link function used gives non-negative values, which complies with the nature of accident counts.
• Parameters are estimated by the maximum likelihood method rather than by OLS.

Significance & Goodness-of-Fit Tests
• Estimation of parameters β: maximum log-likelihood method, SPSS (16)
• Significance of parameters: Wald value W, based on the estimate and its standard deviation; 95% confidence interval
• Goodness of fit: assessed with the log-likelihood ratio (LR) test and the further tests tabulated with their criteria and purpose
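The goodness-of-fit quantities used to compare forward-selection steps can be reproduced from a model's log-likelihood and deviance. The sketch below assumes the standard textbook definitions (LR = 2(LL_model - LL_null), AIC = -2 LL + 2k, BIC = -2 LL + k ln n, R2D = 1 - D_model / D_null); the deck does not spell these formulas out, so their use here is an assumption. The inputs are the constant-only and step 10 values reported for the Poisson accident model. The parameter count k = 8 and the sample size n = 1000 are inferred for illustration and are not stated in the slides.

```python
import math

def lr_statistic(ll_model, ll_null):
    """Likelihood-ratio statistic of a model against the constant-only model."""
    return 2.0 * (ll_model - ll_null)

def aic(ll, k):
    """Akaike information criterion for log-likelihood ll and k parameters."""
    return -2.0 * ll + 2.0 * k

def bic(ll, k, n):
    """Bayesian information criterion for n observations."""
    return -2.0 * ll + k * math.log(n)

def r2_deviance(dev_model, dev_null):
    """Share of the null deviance explained by the model (R2D)."""
    return 1.0 - dev_model / dev_null

# Reported values: step 1 (constant only) and step 10 of the Poisson accident model.
LL_NULL, DEV_NULL = -1796.0, 2281.0
LL_STEP10, DEV_STEP10 = -1346.0, 1382.0

K_STEP10 = 8   # assumption: constant + 7 retained variables
N_OBS = 1000   # assumption: roughly total accidents / mean (1220 / 1.22)

print("LR  :", lr_statistic(LL_STEP10, LL_NULL))
print("AIC :", aic(LL_STEP10, K_STEP10))
print("BIC :", round(bic(LL_STEP10, K_STEP10, N_OBS), 1))
print("R2D :", round(r2_deviance(DEV_STEP10, DEV_NULL), 2))
```

Under these assumptions the results sit close to the reported step 10 figures (LR 899, AIC 2708, BIC 2748, R2D 0.39); small differences are expected because the reported log-likelihoods are rounded.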
Goodness-of-fit tests (Objective 2)
Test                                          Criteria                                      Purpose
Log-likelihood ratio (LR) test                P value > 0.05                                To select step model
AIC                                           The lower the value, the better the model     To select step model
BIC                                           The lower the value, the better the model     To select step model
Deviance, total explained variation (R2D)     The greater the value, the better the model   To select either Poisson or NB

Detailed forward selection procedure for Accident model (Poisson)
Coefficients by step; p-values are 0.000 unless shown in parentheses; "-" means the variable is not in the model at that step.
Step   Constant   AADT    Length   Access   HV               Median          Shoulder   Month    Intersection     Curve           Rain
1      0.203      -       -        -        -                -               -          -        -                -               -
2      -0.656     0.018   -        -        -                -               -          -        -                -               -
3      -1.13      0.019   0.063    -        -                -               -          -        -                -               -
4      -1.426     0.016   0.078    0.118    -                -               -          -        -                -               -
5      -1.019     0.017   0.079    0.125    -0.021           -               -          -        -                -               -
6      -1.147     0.017   0.08     0.126    -0.025           0.233 (0.083)   -          -        -                -               -
7      -1.132     0.016   0.101    0.101    -0.024           -               0.639      -        -                -               -
8      -0.851     0.016   0.101    0.101    -0.024           -               0.639      -0.348   -                -               -
9      -0.739     0.016   0.097    0.101    -0.025           -               0.601      -0.348   -0.055 (0.380)   -               -
10     -1.235     0.015   0.112    0.103    -0.017 (0.004)   -               0.769      -0.348   -                0.298 (0.020)   -
11     -1.189     0.015   0.112    0.103    -0.017 (0.004)   -               0.769      -0.338   -                0.294 (0.020)   0.0 (0.059)

Goodness of fit by step
Step   Deviance   Pearson Chi   LL      AIC    BIC    LR ratio (p)   R2D
1      2281       2886          -1796   3593   3598   0 (0.000)      -
2      1719       1866          -1515   3034   3044   562 (0.000)    0.25
3      1610       1724          -1460   2927   2942   670 (0.000)    0.29
4      1491       1561          -1401   2809   2829   790 (0.000)    0.35
5      1474       1531          -1392   2795   2819   806 (0.000)    0.35
6      1471       1529          -1391   2794   2823   810 (0.000)    0.35
7      1415       1533          -1363   2738   2767   865 (0.000)    0.38
8      1392       1490          -1351   2716   2750   889 (0.000)    0.39
9      1391       1490          -1351   2717   2756   890 (0.000)    0.39
10     1382       1481          -1346   2708   2748   899 (0.000)    0.39
11     1378       1479          -1344   2707   2751   903 (0.000)    0.39

Selected models: variables and goodness-of-fit tests
Models: Accident: Poisson (1), Negative Binomial (2); Fatality: Poisson (3), Negative Binomial (4); Injury: Poisson (5), Negative Binomial (6); Property Damage (PD) *1000 baht: Poisson (7), Negative Binomial (8). P-values in parentheses.
Row                  (1)              (2)              (3)              (4)              (5)              (6)              (7)               (8)
Selected step        10               10               11               11               11               11               10                10
Constant             -1.235 (0.000)   -1.162 (0.027)   -2.855 (0.000)   -2.576 (0.000)   -2.153 (0.00)    -1.911 (0.000)   -0.449 (0.00)     2.100 (0.000)
AADT (1000)          0.015 (0.000)    0.014 (0.000)    0.016 (0.000)    0.017 (0.00)     0.015 (0.000)    0.013 (0.00)     0.018 (0.00)      0.031 (0.00)
Length (km)          0.112 (0.00)     0.121 (0.000)    0.102 (0.002)    0.093 (0.003)    0.148 (0.000)    0.155 (0.00)     0.087 (0.00)      -0.056 (0.055)
Access (per km)      0.103 (0.000)    0.115 (0.000)    0.097 (0.017)    0.104 (0.029)    0.210 (0.000)    0.224 (0.00)     -0.115 (0.00)     -0.21 (0.000)
Pearson Chi-Square   1481             751              1567             1480             3494             2459             96300             9800
Scaled Pearson       1481             751              1567             1480             3494             2459             96300             9800
LL                   -1346            -1332            -211             -200             -980             -793             -19300            -3092
AIC                  2708             2680             441              417              1978             1604             38620             6198
BIC                  2748             2719             485              456              2023             1648             38670             6232
LR ratio             899 (0.000)      386 (0.000)      72 (0.000)       66 (0.000)       726 (0.000)      410 (0.000)      23509 (0.000)     1350 (0.000)
R2D                  0.39             0.34             0.18             0.19             0.33             0.32             0.39              0.29

Other rows (values in slide order):
HV (%), Median, Shoulder, Month: -0.017 (0.004), -0.017 (0.053), -0.009 (0.00), -1.054 (0.030), -1.217 (0.008), -0.506 (0.003), 0.769 (0.000), 0.884 (0.00), 0.963 (0.030), 0.791 (0.042), 0.561 (0.001), -0.348 (0.000), -0.552 (0.001), -0.916 (0.000), -0.906 (0.011), -0.714 (0.000), -0.654 (0.004), 0.767 (0.0017), -0.814 (0.00)
Intersection (per km), Curve (per km): 0.298 (0.020), 0.333 (0.013), 0.43 (0.005), 1.086 (0.00), -0.206 (0.00), 0.589 (0.00), -0.342 (0.00), -1.427 (0.00), -0.33 (0.004), 0.381 (0.037)
Rain (mm): -0.002 (0.000), -0.003 (0.012), -0.005 (0.016), -0.003 (0.00); further values 1.567 (0.00), 1.783 (0.00)
Deviance / Scaled Deviance: 1382, 739, 1459, 329, 267, 841, 36730, 3182

Prediction Models (Objective 2)
[Equations: prediction models for accident, fatality, injury and property damage; unit: per month]

Multiplier Factors (Objective 2)
The factor is computed for changes in the magnitude of each predicting variable while holding all other variables constant.
[Figures: multiplier factors for annual average daily traffic, percent of heavy vehicles, length, and number of accesses per km]
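With a log-link model of the form E(λ) = exp(Σ βj Xij), changing one variable by Δx while holding the others constant multiplies the expected count by exp(βj Δx). The sketch below assumes the multiplier factors were obtained this way (the slides do not give the formula) and uses coefficients reported for the Poisson accident model at selected step 10. The function name, dictionary keys and chosen increments are illustrative only.

```python
import math

# Coefficients reported for the Poisson accident model (selected step 10).
ACCIDENT_POISSON = {
    "aadt_1000": 0.015,      # per 1000 vehicles of AADT
    "length_km": 0.112,      # per km of section length
    "access_per_km": 0.103,  # per access point per km
    "hv_percent": -0.017,    # per percentage point of heavy vehicles
}

def multiplier_factor(beta, delta):
    """Factor applied to the expected monthly count when a variable changes
    by `delta` and all other variables stay constant (log-link assumption)."""
    return math.exp(beta * delta)

# Illustrative increments, not taken from the slides.
for name, delta in [("aadt_1000", 10), ("length_km", 1),
                    ("access_per_km", 1), ("hv_percent", 10)]:
    factor = multiplier_factor(ACCIDENT_POISSON[name], delta)
    print(f"{name}: +{delta} -> x{factor:.3f}")
```

For example, an AADT increase of 10,000 vehicles per day gives exp(0.015 × 10) ≈ 1.16, that is, about 16% more expected accidents per month under this model. A negative coefficient, as for the percent of heavy vehicles, gives a factor below 1.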
[Figures: multiplier factors for median, shoulder, intersection, number of curves per km, and rainfall (Objective 2)]

Comparative Study: Actual vs. Model Prediction
• Total road section of 26.98 km
• Road section divided into constant lengths of 2 km, with a few sections shorter than 2 km

Frequency
Predicting   Mean               Standard Deviation   Critical Frequency
Variables    Actual   Model     Actual   Model       Actual   Model
Accident     0.42     1.65      0.76     1.20        1.18     2.85
Fatality     0.01     0.11      0.07     0.12        0.07     0.23
Injury       0.33     0.94      0.74     1.32        1.06     2.26
PD           1.83     22.79     7.20     23.13       9.03     45.92

Rate (visual validation)
Predicting   Mean               Standard Deviation   Critical Rate
Variables    Actual   Model     Actual   Model       Actual   Model
Accident     3.51     13.63     6.48     10.21       9.99     23.84
Fatality     0.07     0.94      0.54     1.01        0.62     1.95
Injury       2.78     7.86      6.24     11.33       9.02     19.19
PD           14.77    183.38    57.08    182.15      71.85    365.52

Hazardous Locations for Accident
Control    Chainage            AADT      Length   A'      HV      MD   S     I'      C'
Section    From      To                  (km)
201.1.1    26+420    27+700    117.187   1.28     12.5    36.1    1    No    0       0
201.1.2    27+700    29+700    117.187   2        12      36.1    1    No    1       0.5
201.2.1    29+700    31+700    117.187   2        5       36.1    1    Yes   1       1
201.2.2    31+700    33+700    117.187   2        4       36.1    1    Yes   0.5     0.5
201.2.3    33+700    35+700    117.187   2        3.5     36.1    1    Yes   0.5     1
201.2.4    35+700    37+700    117.187   2        2.5     36.1    1    Yes   1       0.5
201.2.5    37+700    39+700    117.187   2        3       36.1    1    Yes   0.5     0.5
201.2.6    39+700    41+700    117.187   2        3       36.1    1    Yes   0       0
202.1.1    41+700    43+830    126.068   2.13     3.28    25.14   1    No    1.406   1
202.1.2    43+830    45+830    126.068   2        2.5     25.14   1    No    1       0.5
202.1.3    45+830    47+830    126.068   2        2       25.14   1    No    2       0.5
202.1.4    47+830    49+830    126.068   2        3       25.14   1    No    1       0.5
202.1.5    49+830    51+830    126.068   2        2       25.14   1    No    0.5     0.5
202.2.1    51+830    53+830    126.068   1.57     1.28    25.14   1    Yes   1.276   0.319

Hazardous locations identified (Objective 3)
Control    Month       Chainage            AADT    Length   Hazardous location
Section                From      To                (km)     Frequency   Rate
201.1.1    Jan-Dec     26+420    27+700    117.2   1.28     Yes         Yes
201.1.2    Jan-Dec     27+700    29+700    117.2   2        Yes         Yes
202.1.1    April       41+700    43+830    126.1   2.13     Yes         Yes

Hazardous Locations for Accident (Objective 3)
[Figure: map of hazardous locations in Nakhonpathom, marked for April and for Jan-Dec]

Conclusions
Characteristics of accidents
• The accident trend is highly dependent on the exposure factor (MVK).
• 76% of accidents are attributed to exceeding the speed limit.
• Light vehicles have a comparatively greater influence on accidents than HV.
• April shows a higher level of accidents and severity than the rest of the months.

Model development
Total explained variation (%)
S.N   Variable          Poisson   Negative Binomial
1     Accident          39        34
2     Fatality          18        19
3     Injury            33        32
4     Property Damage   39        29

Significant variables (model numbers refer to S.N above)
• AADT (1, 2, 3, 4)
• Length (1, 2, 3, 4)
• Access per km (1, 2, 3, 4)
• HV % (1, 4)
• Median (2, 3)
• Shoulder (1, 2, 3, 4)
• Month (1, 2, 4)
• Intersection per km (4)
• Curve per km (1, 3, 4)
• Rainfall (2, 3)

Conclusions cont.
• The total explained variation is not surprising, as the data exclude detailed traffic count stations, detailed geometric data such as lane width and shoulder width, and human behavior. It is comparable to Caliendo et al. (2007).
• The significant variables for the different accident severities agree with the preliminary analysis, i.e. the methodology implemented for the model formulation is appropriate.

Identification of hazardous location
• Using the accident prediction models as the tool for identifying hazardous sections, road sections with high traffic volume, a high number of curves per km and no shoulder were found to be hazardous.

Recommendations
• From the preliminary analysis and the models, accidents are most prominent in April.
Hence, more immediate safety measures should be taken to reduce the number of accidents during this period, which would save both lives and economic losses.
• Accidents are aggravated by light vehicles, as shown in the results. Traffic management that enforces rules and regulations should be implemented, such as the provision of separate lanes.
• The developed accident prediction models should be integrated with GIS tools to develop an interface that explicitly presents the hazardous road sections.

Future Research
• Develop separate accident prediction models for intersections.
• Develop models that include more detailed geometric data such as lane width, shoulder width, speed limit, etc.

Recommendations cont.
• Develop separate accident prediction models, such as for vehicle-to-vehicle collisions, vehicle turnover, etc.
• A real-time crash prediction model could be developed for links and intersections, provided that the data are available in real time.
• The real-time crash prediction could be integrated with a simulation package for network traffic assignment, using a safety factor in addition to the delay factors.
• The real-time traffic model could be integrated with GIS or Google Earth to display the risk of a particular section.
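As a starting point for the interface that presents hazardous road sections, the screening step can be sketched in a few lines. In the comparative tables of this deck, each critical value equals the mean plus one standard deviation (for example, 0.42 + 0.76 = 1.18 for the actual accident frequency). Treating a section as hazardous when its value exceeds that critical value is an assumption about the procedure, as is the use of the sample standard deviation. The section names and per-section values below are hypothetical.

```python
from statistics import mean, stdev

def critical_value(values, k=1.0):
    """Mean plus k standard deviations (k = 1 matches the deck's tables)."""
    return mean(values) + k * stdev(values)

def flag_hazardous(section_values, k=1.0):
    """Return the sections whose value exceeds the critical value.

    Assumption: 'exceeds the critical value' is the hazard criterion.
    """
    crit = critical_value(list(section_values.values()), k)
    return [name for name, value in section_values.items() if value > crit]

# Hypothetical predicted accidents per month for six road sections.
predicted = {"S1": 3.1, "S2": 2.9, "S3": 1.0, "S4": 1.2, "S5": 0.9, "S6": 1.1}

print("critical value:", round(critical_value(list(predicted.values())), 2))
print("hazardous sections:", flag_hazardous(predicted))
```

The same function can be applied to frequencies or to rates, and to actual or model-predicted values, which mirrors the frequency and rate columns of the hazardous-location table.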