Auto Accidents: What’s responsible?
Group 8
Janelle Chang
Helena Jeanty
Rhiana Quail

DISCLAIMER!!!
• Weather conditions
• Drivers’ mental health
• Drivers’ physical health
• Time of day
• Time of year

Sorting the Data...
• National vs. Regional
  – National:
    • all 50 states (not including DC)
  – Regional:
    • Region 1 ~ North East
    • Region 2 ~ South East
    • Region 3 ~ South MidWest
    • Region 4 ~ North MidWest
    • Region 5 ~ South West
    • Region 6 ~ North West
• Reasoning…
  – Allows one to view any type of national behavior
  – Allows for comparisons to be made within the United States

Normalizing Data
• Reason:
  – Every entry needs to be expressed in a “standard” proportion so that the data can be evaluated equally.
    • State populations differ
    • The number of states per region differs
  – Basic assumption: more people = more cars = higher number of automobile fatalities.

Testing
#1: Does alcohol affect the number of drivers killed in car accidents?
• Assumption
  – Alcohol affects the number of people killed in car accidents BUT is not the only contributing factor.
  – Younger people probably drink more irresponsibly, so they are more likely to be involved in, and responsible for, fatal car accidents.
#2: Does a combination of age and alcohol affect the number of people (including drivers) killed in car accidents?
#3: Do individual regions mimic national data?

t-Test
For each region:
• H0: tot. drivers killed = drunk drivers killed
• H1: tot. drivers killed ≠ drunk drivers killed
• t-Test:
  » α = 0.05, 95% confidence
  » 2-sided test
  » df = (# obs) - 2 (the residual df of the regression, as used in the critical values t15 and t2 below)

t-Test (#1) Rejecting H0

      Source |       SS       df       MS              Number of obs =      17
-------------+------------------------------           F(  1,    15) =   40.43
       Model |  1.1071e-12     1  1.1071e-12           Prob > F      =  0.0000
    Residual |  4.1075e-13    15  2.7383e-14           R-squared     =  0.7294
-------------+------------------------------           Adj R-squared =  0.7114
       Total |  1.5179e-12    16  9.4868e-14           Root MSE      =  1.7e-07

------------------------------------------------------------------------------
reg1normki~d |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
reg1normdr~k |   125.8276   19.78864     6.36   0.000     83.64916    168.0061
       _cons |   1.37e-07   4.70e-08     2.91   0.011     3.66e-08    2.37e-07
------------------------------------------------------------------------------

Reject H0: |t| > t15, i.e. 6.36 > 2.131
Regression: driverskilled = 1.37e-07 + 125.8276 * drunkdriverskilled

t-Test Accepting H0

      Source |       SS       df       MS              Number of obs =       4
-------------+------------------------------           F(  1,     2) =   10.11
       Model |  6.5849e-14     1  6.5849e-14           Prob > F      =  0.0863
    Residual |  1.3030e-14     2  6.5150e-15           R-squared     =  0.8348
-------------+------------------------------           Adj R-squared =  0.7522
       Total |  7.8879e-14     3  2.6293e-14           Root MSE      =  8.1e-08

------------------------------------------------------------------------------
reg5normki~d |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
reg5normdr~k |   90.53328   28.47673     3.18   0.086    -31.99218    213.0587
       _cons |   1.85e-07   5.11e-08     3.62   0.069    -3.50e-08    4.05e-07
------------------------------------------------------------------------------

Accept H0: |t| < t2, i.e. -4.303 < 3.18 < 4.303
Regression: driverskilled = 1.85e-07 + 90.53328 * drunkdriverskilled
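The decision rule used in the two t-tests can be sketched in a few lines of Python. This is only an illustration, not the group's original Stata workflow: the function names are hypothetical, the coefficients and standard errors are copied from the Region 1 and Region 5 regression output, and the critical values (2.131 for 15 df, 4.303 for 2 df) are the ones quoted on the slides rather than computed here.

```python
def t_statistic(coef, std_err):
    """t statistic for a regression slope: coefficient divided by its standard error."""
    return coef / std_err


def reject_h0(coef, std_err, t_crit):
    """Two-sided test: reject H0 when |t| exceeds the critical value."""
    return abs(t_statistic(coef, std_err)) > t_crit


# Slope estimates and standard errors taken from the regression output.
# Critical values are the two-sided alpha = 0.05 values quoted on the slides.
region1_rejects = reject_h0(125.8276, 19.78864, t_crit=2.131)  # 15 residual df
region5_rejects = reject_h0(90.53328, 28.47673, t_crit=4.303)  # 2 residual df

print("Region 1 rejects H0:", region1_rejects)
print("Region 5 rejects H0:", region5_rejects)
```

With only 4 observations, Region 5 has 2 residual degrees of freedom, so its critical value is much larger and a t statistic of about 3.18 is not enough to reject H0.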
F-Test
For each region:
• H0: β1 = β2 = 0
• H1: at least one βi ≠ 0 (i = 1, 2)
• F-Test:
  » α = 0.05, 95% confidence
  » 1-sided test

F-Test (#2) Rejecting H0

      Source |       SS       df       MS              Number of obs =      11
-------------+------------------------------           F(  2,     8) =    8.22
       Model |  7.0241e-13     2  3.5121e-13           Prob > F      =  0.0115
    Residual |  3.4191e-13     8  4.2739e-14           R-squared     =  0.6726
-------------+------------------------------           Adj R-squared =  0.5908
       Total |  1.0443e-12    10  1.0443e-13           Root MSE      =  2.1e-07

------------------------------------------------------------------------------
reg1normki~d |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
reg1normdr~k |   102.7064   39.55021     2.60   0.032     11.50342    193.9093
personskil~d |  -.0043674   .0144079    -0.30   0.770     -.037592    .0288572
       _cons |   2.69e-07   2.30e-07     1.17   0.275    -2.61e-07    7.99e-07
------------------------------------------------------------------------------

Reject H0: F > F0.05, 2, 8 = 4.46, i.e. 8.22 > 4.46
peoplekilled = 2.69e-07 + 102.7064 * drunkdrivers - .0043674 * agekilled

F-Test Accepting H0

      Source |       SS       df       MS              Number of obs =       8
-------------+------------------------------           F(  2,     5) =    3.64
       Model |  3.2398e-14     2  1.6199e-14           Prob > F      =  0.1059
    Residual |  2.2270e-14     5  4.4540e-15           R-squared     =  0.5926
-------------+------------------------------           Adj R-squared =  0.4297
       Total |  5.4668e-14     7  7.8097e-15           Root MSE      =  6.7e-08

------------------------------------------------------------------------------
reg3normki~d |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
reg3normdr~k |   394.0596   146.2681     2.69   0.043     18.06551    770.0537
   agekilled |  -.0011894   .0022405    -0.53   0.618    -.0069489    .0045701
       _cons |   1.38e-07   4.85e-08     2.85   0.036     1.36e-08    2.63e-07
------------------------------------------------------------------------------

Accept H0: F < F0.05, 2, 5 = 5.79, i.e. 3.64 < 5.79
driverskilled = 1.38e-07 + 394.0596 * drunkdrivers - .0011894 * agekilled

Confidence Intervals (#3)
Confidence Interval of the mean for the National Data
National Mean of drivers killed: 2.96763E-07
Confidence Interval:
(2.96763E-07 - 8.46641E-08, 2.96763E-07 + 8.46641E-08)
= (2.12099E-07, 3.81428E-07)

Region Results with Confidence Intervals

Region   Mean           Lies within National CI
1        2.92972E-07    Yes
2        1.082E-07      No
3        1.40915E-07    No
4        3.847E-07      No
5        2.8484E-07     Yes
6        6.16645E-07    No

Graph of National Data

ANOVA Test
• H0: μnational = μreg 1 = μreg 2 = ... = μreg 6
• The number of drivers killed in car accidents is independent of the region in which they occur.
• Reject H0 if F > F0.95, 3, 2 = 19.2
• F = 7.7631 < 19.2, so accept H0

Conclusions
• Nationally, 4 out of the 6 regions rejected the F-test null hypothesis => there is a correlation between age, BAC, and the number of drivers killed.
• Regionally, 4 out of 6 supported the national data trend. The regressions carried out confirm that the number of people killed depends on the number of drunk drivers.
• Regions do not reflect the national trend for the average number of drivers killed.
• The number of drivers killed does not depend on the region in which they occur.
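The comparison of each regional mean against the national confidence interval can be reproduced with a short Python sketch. The variable names are hypothetical, and the national mean, the interval half-width, and the regional means are copied from the slides rather than recomputed from the raw data.

```python
# Values copied from the slides (normalized drivers killed).
national_mean = 2.96763e-07
half_width = 8.46641e-08  # half-width of the national confidence interval

region_means = {
    1: 2.92972e-07,
    2: 1.082e-07,
    3: 1.40915e-07,
    4: 3.847e-07,
    5: 2.8484e-07,
    6: 6.16645e-07,
}

# Confidence interval: mean minus / plus the half-width.
lower = national_mean - half_width
upper = national_mean + half_width

# A region "mimics" the national data if its mean falls inside the interval.
inside = {region: lower <= mean <= upper for region, mean in region_means.items()}
regions_inside = [region for region, ok in inside.items() if ok]

print(f"National CI: ({lower:.5e}, {upper:.5e})")
for region, ok in inside.items():
    print(f"Region {region}: {'Yes' if ok else 'No'}")
```

Region 4 is the closest call: its mean sits just above the upper bound, so it is counted as outside the interval along with Regions 2, 3, and 6.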