Regression of NFL Scores on Vegas Line – 2007 Regular Season

Problem Description
• Odds makers place a Point Spread (differential) and an Over/Under (total) on all National Football League games
• Combining these two quantities, we can obtain a prediction for the final score of the game
• Let PA and PH be the odds makers' predicted scores for the Away and Home teams, respectively
• Spread [wrt Home Team] (PS) = PA – PH (a negative spread means the Home team is favored, i.e. "giving" points)
• Over/Under (OU) = PA + PH
• ⇒ PA = (OU+PS)/2 and PH = (OU-PS)/2
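The conversion from the two betting quantities to a pair of predicted scores can be sketched in a few lines. The function name `predicted_scores` is only illustrative; the sign convention (PS = PA – PH, negative when the home team is favored) is the one defined above.

```python
def predicted_scores(over_under, spread_home):
    """Return (PA, PH), the predicted Away and Home scores.

    spread_home is PS = PA - PH, so a negative value means the
    Home team is favored.
    """
    away = (over_under + spread_home) / 2
    home = (over_under - spread_home) / 2
    return away, home


# Week 1, NO at IND: Over/Under 49.5, Home spread -6.5
pa, ph = predicted_scores(49.5, -6.5)
print(pa, ph)  # 21.5 28.0
```

Adding the two returned values recovers the Over/Under, and subtracting Home from Away recovers the spread.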
Data/Model Description
• Point Spreads, Over/Under, and Actual Scores obtained
for all n=256 NFL games from 2007 season
• Predicted Scores obtained for each team in each game
• Regression is fit for each team’s actual score (n=512 team
games) as a function of predicted score, and home team
indicator
• Residuals checked to see if errors are independent within
games for the two teams
• Tests conducted to determine:
  ➢ If Home Team effect is sufficiently accounted for by odds makers
  ➢ If Odds makers are “unbiased” in their point predictions
  ➢ If relation between actual and predicted scores is linear
Week 1 Data
Week  Away  Home  Open Spread (HT)  Open Over/Under  Expected Home  Expected Visitor  Observed Home  Observed Visitor
1     NO    IND   -6.5              49.5             28             21.5              41             10
1     ATL   MIN   -2.5              36               19.25          16.75             24             3
1     CAR   STL   -1                42.5             21.75          20.75             13             27
1     DEN   BUF   3.5               37               16.75          20.25             14             15
1     KC    HOU   -3                38               20.5           17.5              20             3
1     MIA   WAS   -2.5              35               18.75          16.25             16             13
1     NE    NYJ   7                 41               17             24                14             38
1     PHI   GB    2.5               43.5             20.5           23                16             13
1     PIT   CLE   4.5               37               16.25          20.75             7              34
1     TEN   JAX   -7                37.5             22.25          15.25             10             13
1     CHI   SD    -5.5              42.5             24             18.5              14             3
1     DET   OAK   -1.5              40               20.75          19.25             21             36
1     TB    SEA   -6                41               23.5           17.5              20             6
1     NYG   DAL   -5.5              44               24.75          19.25             45             35
1     BAL   CIN   -2.5              40.5             21.5           19                27             20
1     ARI   SF    -3.5              45               24.25          20.75             20             17
Note for the first game:
• Spread = PA – PH = -6.5 (IND was favored to beat NO by 6.5 points)
• Over/Under = PA + PH = 49.5 (predicted total score was 49.5 points)
• ⇒ PA = (49.5 + (-6.5))/2 = 21.5 and PH = (49.5 - (-6.5))/2 = 28
Observed vs Odds Makers Predicted Score
[Figure: scatter plot of Observed score (y-axis, 0 to 60) against Predicted score (x-axis, 0 to 60), with separate series for Home and Away teams]
Regression Model
$Y_i = \beta_0 + \beta_P P_i + \beta_H H_i + \beta_{PH} P_i H_i + \varepsilon_i, \quad i = 1, \ldots, 512$
where:
$Y_i$ = Observed score for the i-th team-game
$P_i$ = Odds makers predicted score for the i-th team-game
$H_i$ = 1 if the i-th team-game is a Home game, 0 if Away
$\varepsilon_i \sim NID(0, \sigma^2)$
Away Teams ($H_i = 0$): $E\{Y_i\} = \beta_0 + \beta_P P_i$
Home Teams ($H_i = 1$): $E\{Y_i\} = (\beta_0 + \beta_H) + (\beta_P + \beta_{PH}) P_i$
Regression Results
Away: $\hat{Y}_i = -0.38 + 1.05 P_i$
Home: $\hat{Y}_i = (-0.38 + 4.45) + (1.05 - 0.19) P_i = 4.07 + 0.86 P_i$
$s^2 = 97.67 \;\Rightarrow\; s = 9.88$
X'X
512       10696      256      5664.5
10696     232904.8   5664.5   129445.4
256       5664.5     256      5664.5
5664.5    129445.4   5664.5   129445.4

X'Y
11104
241207.8
5919
134504.5

(X'X)^-1
 0.08846   -0.00430   -0.08846    0.00430
-0.00430    0.00022    0.00430   -0.00022
-0.08846    0.00430    0.21157   -0.00969
 0.00430   -0.00022   -0.00969    0.00046

Beta-hat
-0.377
 1.050
 4.453
-0.189

sse = 49615.22    mse = 97.66775    s = 9.8827
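As a rough check on these matrices, the coefficients can be recovered by solving the normal equations (X'X) b = X'Y directly. The sketch below uses a small hand-written Gauss-Jordan solver (`solve_linear_system` is an illustrative helper, not something from the slides) so that it runs with the standard library alone. Because the printed sums are rounded, the results agree with Beta-hat only to about three decimals.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Bring the largest remaining pivot into position
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so the pivot equals 1
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * pv for v, pv in zip(m[r], m[col])]
    return [row[n] for row in m]


# X'X and X'Y as printed (columns: intercept, P, H, P*H)
xtx = [
    [512.0, 10696.0, 256.0, 5664.5],
    [10696.0, 232904.8, 5664.5, 129445.4],
    [256.0, 5664.5, 256.0, 5664.5],
    [5664.5, 129445.4, 5664.5, 129445.4],
]
xty = [11104.0, 241207.8, 5919.0, 134504.5]

b0, b_p, b_h, b_ph = solve_linear_system(xtx, xty)
print([round(v, 3) for v in (b0, b_p, b_h, b_ph)])

# Home-team line: intercept b0 + b_h, slope b_p + b_ph
home_intercept, home_slope = b0 + b_h, b_p + b_ph
print(round(home_intercept, 2), round(home_slope, 2))
```

In practice a linear algebra library would replace the hand-written solver; it is spelled out here only to keep the example self-contained.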
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.3942
R Square            0.1554
Adjusted R Square   0.1504
Standard Error      9.8827
Observations        512

ANOVA
            df    SS         MS        F       Significance F
Regression  3     9128.78    3042.93   31.16   0.0000
Residual    508   49615.22   97.67
Total       511   58744

               Coefficients  Standard Error  t Stat    P-value  Lower 95%  Upper 95%
Intercept      -0.3767       2.9393          -0.1281   0.8981   -6.1513    5.3980
X1=Vegas        1.0497       0.1462           7.1792   0.0000    0.7624    1.3369
X2=Home         4.4533       4.5457           0.9797   0.3277   -4.4773    13.3840
X3=Vegas*Home  -0.1890       0.2125          -0.8893   0.3742   -0.6065    0.2285
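The standard errors, t statistics, R Square and F in this output follow from quantities already shown in the deck. A minimal sketch, assuming the diagonal of (X'X)^-1 is taken from the more precise K'(X'X)^-1K listing that accompanies the test results later in the slides; small differences in the last digit come from rounding of the printed inputs.

```python
import math

s = 9.8827  # residual standard error (Standard Error in the output)

# Diagonal of (X'X)^-1, order: Intercept, Vegas, Home, Vegas*Home
diag_inv = [0.088456472, 0.000218877, 0.211567097, 0.000462]
coefs = [-0.3767, 1.0497, 4.4533, -0.1890]
names = ["Intercept", "X1=Vegas", "X2=Home", "X3=Vegas*Home"]

# SE(b_j) = s * sqrt(j-th diagonal element of (X'X)^-1);  t_j = b_j / SE(b_j)
std_errors = [s * math.sqrt(d) for d in diag_inv]
t_stats = [b / se for b, se in zip(coefs, std_errors)]

for name, b, se, t in zip(names, coefs, std_errors, t_stats):
    print(f"{name:<14} coef={b:8.4f}  se={se:7.4f}  t={t:8.4f}")

# ANOVA quantities
ss_reg, ss_res, ss_tot = 9128.78, 49615.22, 58744.0
df_reg, df_res = 3, 508
r_square = ss_reg / ss_tot
f_stat = (ss_reg / df_reg) / (ss_res / df_res)
print(round(r_square, 4), round(f_stat, 2))
```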
Actual Score versus Odds Makers Predictions
[Figure: scatter plot of Actual Score (y-axis) against Predicted Score (x-axis) for Home and Away teams, with fitted lines Linear (Away): y = 1.0497x - 0.3767 and Linear (Home): y = 0.8607x + 4.0767]
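Test of No Home Effect and “Unbiasedness”
No Home Effect and "Unbiasedness" $\Rightarrow E\{Y_i\} = P_i$
$\Rightarrow \beta_0 = \beta_H = \beta_{PH} = 0, \quad \beta_P = 1$
$\Rightarrow K'\beta = m$ where:
$K' = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad \beta = \begin{bmatrix} \beta_0 \\ \beta_P \\ \beta_H \\ \beta_{PH} \end{bmatrix} \qquad m = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}$
$H_0: K'\beta - m = 0 \qquad H_A: K'\beta - m \neq 0$
Test Statistic: $F_{obs} = \dfrac{(K'\hat{\beta} - m)^T \left[ K'(X'X)^{-1}K \right]^{-1} (K'\hat{\beta} - m) / \mathrm{rank}(K)}{s^2} = \dfrac{Q/k}{s^2}$
Under $H_0$: $F \sim F_{k,\,n-p'} = F_{4,508}$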
Results of Test of No Home Effects and Unbiasedness

K'
1  0  0  0
0  1  0  0
0  0  1  0
0  0  0  1

K'(X'X)^-1K
 0.088456472   -0.00430187    -0.088456472    0.004302
-0.00430187     0.000218877    0.00430187    -0.00022
-0.088456472    0.00430187     0.211567097   -0.00969
 0.00430187    -0.000218877   -0.009689162    0.000462

K'B        m    K'B-m
-0.37666   0    -0.37666
 1.049672  1     0.049672
 4.453325  0     4.453325
-0.18898   0    -0.18898

(K'(X'X)^-1K)^-1
512       10696      256      5664.5
10696     232904.8   5664.5   129445.4
256       5664.5     256      5664.5
5664.5    129445.4   5664.5   129445.4

Q(K)      436.0329611
df(K)     4
df(E)     508
F_obs     1.1161
F(0.05)   2.3895
P-value   0.3481

No evidence to conclude that E(Y) ≠ P
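The test statistic can be reproduced from the printed pieces. Since K' is the identity here, (K'(X'X)^-1K)^-1 is simply X'X, so Q is the quadratic form of K'B-m in X'X and no matrix inversion is needed. The sketch below uses an illustrative helper `quadratic_form`; the critical value is the F(0.05) quoted in the results rather than one computed by the code, and Q matches the printed value only approximately because the inputs are rounded.

```python
def quadratic_form(d, a):
    """Return d' a d for a vector d and a square matrix a."""
    n = len(d)
    return sum(d[i] * a[i][j] * d[j] for i in range(n) for j in range(n))


xtx = [
    [512.0, 10696.0, 256.0, 5664.5],
    [10696.0, 232904.8, 5664.5, 129445.4],
    [256.0, 5664.5, 256.0, 5664.5],
    [5664.5, 129445.4, 5664.5, 129445.4],
]
beta_hat = [-0.37666, 1.049672, 4.453325, -0.18898]
m = [0.0, 1.0, 0.0, 0.0]  # hypothesized values: intercept 0, slope 1, no home terms

diff = [b - target for b, target in zip(beta_hat, m)]  # K'B - m

q = quadratic_form(diff, xtx)   # Q(K)
k, mse = 4, 97.66775            # rank of K and s^2
f_obs = (q / k) / mse
f_crit = 2.3895                 # F(0.05; 4, 508) as quoted in the results

print(round(q, 2), round(f_obs, 4), "reject H0" if f_obs > f_crit else "do not reject H0")
```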
Fit of Simple Regression of Actual on Predicted Score
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.3919
R Square            0.1536
Adjusted R Square   0.1519
Standard Error      9.8738
Observations        512

ANOVA
            df    SS         MS        F       Significance F
Regression  1     9023.01    9023.01   92.55   0.0000
Residual    510   49720.99   97.49
Total       511   58744

            Coefficients  Standard Error  t Stat   P-value  Lower 95%  Upper 95%
Intercept   1.2836        2.1653          0.5928   0.5536   -2.9705    5.5377
X1=Vegas    0.9767        0.1015          9.6204   0.0000    0.7772    1.1762
Note, we clearly do not reject H0 that the intercept is 0 and slope is 1, but will use
this model to obtain Confidence Intervals for Mean Score and Prediction Intervals
for Individual Game Scores at various levels of predicted scores
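The simple-regression estimates can be recovered from the totals already printed in X'X and X'Y (n, the sums of P and P^2, and the sums of Y and P*Y). The sketch below also computes a t statistic for a slope of 1 as a one-parameter illustration; that individual t statistic is an addition for illustration and is not the joint test referred to in the note.

```python
import math

n = 512
sum_p, sum_p2 = 10696.0, 232904.8   # sum of P and of P^2 (first block of X'X)
sum_y, sum_py = 11104.0, 241207.8   # sum of Y and of P*Y (first two entries of X'Y)
s = 9.8738                          # Standard Error from the simple regression output

s_pp = sum_p2 - sum_p ** 2 / n      # corrected sum of squares of P
s_py = sum_py - sum_p * sum_y / n   # corrected sum of cross-products

slope = s_py / s_pp
intercept = sum_y / n - slope * sum_p / n
se_slope = s / math.sqrt(s_pp)

# Illustrative only: t statistic for H0: slope = 1
t_slope_is_one = (slope - 1) / se_slope

print(round(intercept, 4), round(slope, 4), round(se_slope, 4), round(t_slope_is_one, 3))
```

The magnitude of that t statistic is well below the t(.025) = 1.9646 quoted elsewhere in the deck for 510 degrees of freedom.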
Joint 95% Confidence Ellipsoid for β0, β1
[Figure: joint confidence ellipse with Beta0 on the x-axis (-6 to 8) and the slope on the y-axis (0 to 1.4)]
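Confidence Intervals and Prediction Intervals
Point Predictions of Score When $P = P_0$:
$\hat{Y}_0 = x_0'\hat{\beta} = \begin{bmatrix} 1 & P_0 \end{bmatrix} \begin{bmatrix} \hat{\beta}_0 \\ \hat{\beta}_P \end{bmatrix} = \hat{\beta}_0 + \hat{\beta}_P P_0$
$(1-\alpha)100\%$ Confidence Interval for Mean of all Games when $P = P_0$:
$\hat{Y}_0 \pm t_{\alpha/2,\,n-p'}\, s \sqrt{x_0'(X'X)^{-1}x_0} \;=\; \hat{Y}_0 \pm t_{\alpha/2,\,n-p'}\, s \sqrt{\dfrac{1}{n} + \dfrac{(P_0 - \bar{P})^2}{\sum_{i=1}^{n}(P_i - \bar{P})^2}}$
$(1-\alpha)100\%$ Prediction Interval for a Single Game when $P = P_0$:
$\hat{Y}_0 \pm t_{\alpha/2,\,n-p'}\, s \sqrt{1 + x_0'(X'X)^{-1}x_0} \;=\; \hat{Y}_0 \pm t_{\alpha/2,\,n-p'}\, s \sqrt{1 + \dfrac{1}{n} + \dfrac{(P_0 - \bar{P})^2}{\sum_{i=1}^{n}(P_i - \bar{P})^2}}$
For this analysis: $n = 512$, $\bar{P} = 20.89$, $\sum_{i=1}^{n}(P_i - \bar{P})^2 = 9458.625$, $\hat{Y} = 1.2836 + 0.9767P$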
Fitted Scores, 95% CI and 95% PI
[Figure: scores plotted against Odds Makers Predicted score (x-axis, 0 to 35), with series Actual, P-hat, LB(PI), LB(CI), UB(CI) and UB(PI)]
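The interval bounds for the simple regression can be computed from the summary quantities reported for this analysis. A minimal sketch: the function name `intervals` is illustrative, and the t critical value 1.9646 is the t(.025) quoted elsewhere in the deck for 510 degrees of freedom rather than a value computed here.

```python
import math

N, P_BAR, S_PP = 512, 20.89, 9458.625   # n, mean of P, sum of (P_i - P_bar)^2
B0, B1, S = 1.2836, 0.9767, 9.8738      # fitted intercept, slope, residual std error
T_CRIT = 1.9646                         # t(.025, 510), taken from the slides


def intervals(p0):
    """Return (fit, ci_low, ci_high, pi_low, pi_high) at predicted score p0."""
    fit = B0 + B1 * p0
    leverage = 1 / N + (p0 - P_BAR) ** 2 / S_PP
    ci_half = T_CRIT * S * math.sqrt(leverage)       # mean of all games at p0
    pi_half = T_CRIT * S * math.sqrt(1 + leverage)   # a single game at p0
    return fit, fit - ci_half, fit + ci_half, fit - pi_half, fit + pi_half


for p0 in (10, 20, 30):
    fit, ci_lo, ci_hi, pi_lo, pi_hi = intervals(p0)
    print(f"P0={p0:2d}  fit={fit:5.2f}  CI=({ci_lo:5.2f}, {ci_hi:5.2f})  PI=({pi_lo:6.2f}, {pi_hi:5.2f})")
```

The prediction interval is far wider than the confidence interval because the single-game error variance dominates the leverage term when n is this large.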
Residual Analysis
• Are the residuals consistent with the model assumptions:
  ➢ Normally Distributed
    • Histogram, Normal Probability Plot, Shapiro-Wilk Test
  ➢ Linear relation between Actual and Predicted Scores
    • Plot of Residuals versus Fitted, Lack-of-Fit F-test
  ➢ Constant Error Variance
    • Plot of Residuals versus Fitted, Regress |resid| vs fitted
  ➢ Independent (e.g. Within Games and Within Teams Over Time)
    • Correlation between Home/Away within games
    • Non-Independent errors within Teams (Random Team effects)
    • Autocorrelation among errors over time within teams
Normal Distribution of Residuals
[Figure: Histogram of Residuals. x-axis: e (bins from -25 to 25, plus "More"); y-axis: Frequency]
[Figure: Normal Probability Plot (Residuals from Simple Regression). x-axis: Normal Score; y-axis: Residual]
Correlation between residuals and their corresponding normal scores = .9952
Linearity of Regression
[Figure: Residuals versus Fitted Values. x-axis: Fitted Values; y-axis: Residuals]
F-Test for Lack-of-Fit ($n_j$ observations at $c$ distinct levels of "X")
$H_0: E\{Y_i\} = \beta_0 + \beta_P P_i \qquad H_A: E\{Y_i\} = \mu_i \neq \beta_0 + \beta_P P_i$
Lack-of-Fit: $SS(LF) = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (\bar{Y}_j - \hat{Y}_j)^2 \qquad df_{LF} = c - 2$
Pure Error: $SS(PE) = \sum_{j=1}^{c} \sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j)^2 \qquad df_{PE} = n - c$
Test Statistic: $F_{LOF} = \dfrac{SS(LF)/(c-2)}{SS(PE)/(n-c)} \sim F_{c-2,\,n-c}$ under $H_0$
For this example: n = 512, c = 81

             df    SS           MS         F        F(.05)   P-value
Lack of Fit  79    9039.0655    114.4186   1.2122   1.3097   0.1200
Pure Error   431   40681.9251   94.3896
Residual     510   49720.9905   97.4921

No evidence to reject the hypothesis of a linear relation between Actual and Predicted scores
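The lack-of-fit decomposition can be sketched for raw data as follows. The function `lack_of_fit_f` and the six-point data set are made up purely to exercise the arithmetic; the last lines recompute the F statistic from the sums of squares reported in the table.

```python
from collections import defaultdict


def lack_of_fit_f(x, y):
    """Return (F, df_lf, df_pe) for the lack-of-fit test of a straight line."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar

    # Group the responses by distinct level of x
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    c = len(groups)

    ss_lf = ss_pe = 0.0
    for level, ys in groups.items():
        mean_j = sum(ys) / len(ys)
        fitted_j = intercept + slope * level
        ss_lf += len(ys) * (mean_j - fitted_j) ** 2    # group mean vs fitted line
        ss_pe += sum((yi - mean_j) ** 2 for yi in ys)  # spread within the group

    df_lf, df_pe = c - 2, n - c
    return (ss_lf / df_lf) / (ss_pe / df_pe), df_lf, df_pe


# Hypothetical data: the middle group sits far above any straight line
toy_f, toy_df_lf, toy_df_pe = lack_of_fit_f([1, 1, 2, 2, 3, 3], [0, 2, 5, 7, 0, 2])
print(round(toy_f, 4), toy_df_lf, toy_df_pe)

# F statistic recomputed from the sums of squares in the table (c = 81, n = 512)
f_slide = (9039.0655 / (81 - 2)) / (40681.9251 / (512 - 81))
print(round(f_slide, 4))
```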
Equal (Homogeneous) Variance - I
[Figure: Residuals versus Fitted Values. x-axis: Fitted Values; y-axis: Residuals]
No overwhelming evidence of unequal variance based on graph
Equal (Homogeneous) Variance - II
Brown-Forsythe Test:
$H_0$: Equal Variance Among Errors, $V\{\varepsilon_i\} = \sigma^2 \;\; \forall i$
$H_A$: Unequal Variance Among Errors (Increasing or Decreasing in X)
1) Split Dataset into 2 groups based on levels of X with sample sizes: $n_1, n_2$
2) Compute the median residual in each group: $\tilde{e}_1, \tilde{e}_2$
3) Compute absolute deviation from group median for each residual: $d_{ij} = |e_{ij} - \tilde{e}_j|, \quad i = 1, \ldots, n_j, \quad j = 1, 2$
4) Compute the mean and variance for each group of $d_{ij}$: $\bar{d}_1, s_1^2$ and $\bar{d}_2, s_2^2$
5) Compute the pooled variance: $s^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
Test Statistic: $t_{BF} = \dfrac{\bar{d}_1 - \bar{d}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{n_1 + n_2 - 2}$ under $H_0$

Group  X_Low   X_High  n(i)  med(e)    dbar(i)  s2(i)
1      11      20.5    257   -0.8875   7.5886   31.3643
2      20.75   37.75   255   -1.4802   8.3643   38.3184

s2        34.8277
t(BF)     -1.4870
t(.025)   1.9646
P-value   0.1376

No evidence to reject the null hypothesis of equal variance among errors
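The Brown-Forsythe steps translate directly into code. In this sketch `pooled_t` works from the group summaries and reproduces the t(BF) above, while `brown_forsythe_t` carries out the steps on raw residuals; both names are illustrative, and the two short residual lists are invented for the demonstration.

```python
import math
from statistics import mean, median, variance


def pooled_t(d1_bar, s1_sq, n1, d2_bar, s2_sq, n2):
    """Two-sample pooled-variance t statistic from group summaries."""
    pooled = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    return (d1_bar - d2_bar) / (math.sqrt(pooled) * math.sqrt(1 / n1 + 1 / n2))


def brown_forsythe_t(e1, e2):
    """t_BF from two lists of residuals (low-X group, high-X group)."""
    med1, med2 = median(e1), median(e2)
    d1 = [abs(e - med1) for e in e1]   # absolute deviations from group median
    d2 = [abs(e - med2) for e in e2]
    # statistics.variance is the sample variance (divisor n - 1)
    return pooled_t(mean(d1), variance(d1), len(d1), mean(d2), variance(d2), len(d2))


# Group summaries from the table
t_bf = pooled_t(7.5886, 31.3643, 257, 8.3643, 38.3184, 255)
print(round(t_bf, 4), abs(t_bf) > 1.9646)

# Hypothetical residuals, second group twice as spread out as the first
t_toy = brown_forsythe_t([-2, -1, 0, 1, 2], [-4, -2, 0, 2, 4])
print(round(t_toy, 4))
```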
Equal (Homogeneous) Variance
Breusch-Pagan (aka Cook-Weisberg) Test:
$H_0$: Equal Variance Among Errors, $V\{\varepsilon_i\} = \sigma^2 \;\; \forall i$
$H_A$: Unequal Variance Among Errors, $\sigma_i^2 = \sigma^2 h(\gamma_1 X_{i1} + \ldots + \gamma_p X_{ip})$
1) Let $SSE = \sum_{i=1}^{n} e_i^2$
2) Fit Regression of $e_i^2$ on $X_{i1}, \ldots, X_{ip}$ and obtain $SS(Reg^*)$
Test Statistic: $X^2_{BP} = \dfrac{SS(Reg^*)/2}{\left( \sum_{i=1}^{n} e_i^2 / n \right)^2} \sim \chi^2_p$ under $H_0$

ANOVA (Regression of e^2 on X)
            df    SS
Regression  1     93238.13
Residual    510   8461614
Total       511   8554852

SS(Reg*)       93238.13
SSE            49720.99
SS(Reg*)/2     46619.07
SSE/512        97.11131
X2(BP)         4.943379
X2(.05,df=1)   3.841459
P-value        0.026191

[Figure: Regression of e^2 on X]
There is some evidence of unequal variance, but keep in mind the sample size is huge. See plot for how weak the association is
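For a single predictor the Breusch-Pagan statistic needs only a simple regression of the squared residuals on X. A minimal sketch: `bp_statistic` applies the formula to the reported sums, `breusch_pagan` computes everything from raw values, and both names as well as the four-point data set are illustrative. The chi-square critical value is the one quoted above.

```python
def bp_statistic(ss_reg_star, sse, n):
    """X^2_BP = [SS(Reg*) / 2] / (SSE / n)^2."""
    return (ss_reg_star / 2) / (sse / n) ** 2


def breusch_pagan(x, e):
    """X^2_BP for one predictor x and residuals e."""
    n = len(x)
    e2 = [ei ** 2 for ei in e]
    x_bar, e2_bar = sum(x) / n, sum(e2) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (vi - e2_bar) for xi, vi in zip(x, e2))
    ss_reg_star = s_xy ** 2 / s_xx   # regression SS when e^2 is regressed on x
    return bp_statistic(ss_reg_star, sum(e2), n)


# Sums reported for the NFL data
x2_bp = bp_statistic(93238.13, 49720.99, 512)
print(round(x2_bp, 4), x2_bp > 3.841459)

# Hypothetical residuals whose size grows with x
print(round(breusch_pagan([1, 2, 3, 4], [1, 1, 2, 2]), 3))
```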
Independence Between Home/Away Residuals Within Games
[Figure: Residuals of Away versus Home Team. x-axis: e(Home); y-axis: e(Away)]
Test for Correlation Between Home and Away Team Residuals:
$H_0: \rho = 0$ (Errors within Games are Independent) $\qquad H_A: \rho \neq 0$
Test Statistic: $t_r = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} \sim t_{n-2}$ under $H_0$

r         0.0577
1-r^2     0.9967
n         512
t_r       1.3052
t(.025)   1.9646
P-value   0.1924

No evidence of association between residuals within games
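The correlation test is a one-line formula once r is known. In the sketch below, `pearson_r` and `correlation_t` are illustrative helper names, the four-point data set is invented, and the last lines plug in the r and n reported above.

```python
import math


def pearson_r(x, y):
    """Sample correlation between paired values x and y."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def correlation_t(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))


# Hypothetical paired residuals, just to exercise pearson_r
r_toy = pearson_r([1, 2, 3, 4], [1, 3, 2, 4])
print(round(r_toy, 4))

# Values reported for the NFL data (n taken as printed)
t_r = correlation_t(0.0577, 512)
print(round(t_r, 4), abs(t_r) > 1.9646)
```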
Testing For Random Team Effects - I
[Figure: Residuals by Team. x-axis: Team # (1 to 32); y-axis: Residuals]
No overwhelming evidence of team random effects
Testing for Random Team Effects - II
$\varepsilon_{ij}$ = Residual for team $i$ on week $j$, $\quad i = 1, \ldots, g, \quad j = 1, \ldots, n_i \quad (g = 32,\; n_i = 16)$
$\varepsilon_{ij} = \alpha_i + u_{ij} \qquad \alpha_i \sim NID(0, \sigma_\alpha^2) \qquad u_{ij} \sim NID(0, \sigma^2) \qquad \{\alpha_i\} \perp \{u_{ij}\}$
$\Rightarrow COV\{\varepsilon_{ij}, \varepsilon_{ij'}\} = \sigma_\alpha^2, \quad j \neq j'$
$H_0: \sigma_\alpha^2 = 0 \;\Rightarrow\; COV\{\varepsilon_{ij}, \varepsilon_{ij'}\} = 0$ (Residuals are independent within teams)
$H_A: \sigma_\alpha^2 > 0 \;\Rightarrow\; COV\{\varepsilon_{ij}, \varepsilon_{ij'}\} > 0$ (Residuals not independent within teams)
Test based on 1-Way Random Effects ANOVA on Residuals
$SS(Teams) = \sum_{i=1}^{g} n_i (\bar{e}_i - \bar{e})^2 \qquad df_{Teams} = g - 1$
$SS(Error) = \sum_{i=1}^{g} \sum_{j=1}^{n_i} (e_{ij} - \bar{e}_i)^2 \qquad df_{Err} = N - g$
Test Statistic: $F_{obs} = \dfrac{SS(Teams)/df_{Teams}}{SS(Error)/df_{Err}} \sim F_{df_{Teams},\,df_{Err}}$ under $H_0$

Source  SS         df    MS       F        F-Crit   P-value
Team    2997.58    31    96.70    0.9335   1.4752   0.5724
Error   49720.99   480   103.59

No evidence of random Team Effects
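The one-way ANOVA on residuals can be sketched for raw data as below. The function `team_effect_f` and the two-team data set are illustrative only; the final lines recompute the F statistic from the sums of squares and degrees of freedom printed in the table.

```python
def team_effect_f(residuals_by_team):
    """Return (F, df_teams, df_error) from a dict {team: [residuals]}."""
    all_e = [e for es in residuals_by_team.values() for e in es]
    n_total, g = len(all_e), len(residuals_by_team)
    grand_mean = sum(all_e) / n_total

    ss_teams = ss_error = 0.0
    for es in residuals_by_team.values():
        team_mean = sum(es) / len(es)
        ss_teams += len(es) * (team_mean - grand_mean) ** 2   # between teams
        ss_error += sum((e - team_mean) ** 2 for e in es)     # within teams

    df_teams, df_error = g - 1, n_total - g
    return (ss_teams / df_teams) / (ss_error / df_error), df_teams, df_error


# Hypothetical residuals for two teams
toy_f, toy_df_t, toy_df_e = team_effect_f({"A": [1, 2, 3], "B": [4, 5, 6]})
print(toy_f, toy_df_t, toy_df_e)

# F statistic recomputed from the table as printed
f_teams = (2997.58 / 31) / (49720.99 / 480)
print(round(f_teams, 4), f_teams > 1.4752)
```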
Durbin-Watson Test Within Teams over Weeks
$Y_t = \beta_0 + \beta_1 X_t + \varepsilon_t \qquad \varepsilon_t = \rho\,\varepsilon_{t-1} + u_t \qquad u_t \sim NID(0, \sigma^2) \qquad |\rho| < 1$
$H_0: \rho = 0 \Rightarrow$ Errors are uncorrelated over time $\qquad H_A: \rho > 0 \Rightarrow$ Positively correlated
1) Obtain Residuals from Regression
2) Compute Durbin-Watson Statistic: $DW = \dfrac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$
3) If $DW < d_L(p, n)$ Reject $H_0$; If $DW > d_U(p, n)$ Conclude $H_0$; Otherwise Inconclusive
For NFL teams, we use $e^*_{it} = e_{it} - \bar{e}_i$
For n = 16 (weeks/team) and p = 1 (predictor): $d_L = 1.10$, $d_U = 1.37$

Team  DW      Team  DW      Team  DW      Team  DW
1     1.87    9     0.79    17    1.36    25    1.62
2     0.97    10    2.22    18    2.04    26    2.33
3     1.95    11    2.51    19    1.72    27    1.67
4     2.74    12    2.04    20    1.58    28    2.21
5     2.42    13    2.21    21    1.55    29    2.20
6     1.21    14    2.16    22    3.05    30    2.61
7     1.78    15    1.83    23    1.54    31    2.84
8     1.60    16    2.61    24    2.87    32    2.04

Teams 2 and 9 have small DW values (positive autocorrelation). Team 22 displays negative autocorrelation (value above 4-dL). Most teams show no autocorrelation
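The per-team statistic and the decision rule can be sketched as follows. `durbin_watson` centers the residuals within a team before applying the formula, as described above; `classify` is an illustrative helper whose upper-tail rules (comparison with 4-dL and 4-dU) extend the slide's one-sided rule by symmetry and should be read as an assumption. The DW values are those listed in the table.

```python
def durbin_watson(e):
    """DW statistic for one team's residuals, centered at the team mean."""
    mean_e = sum(e) / len(e)
    c = [v - mean_e for v in e]
    num = sum((c[t] - c[t - 1]) ** 2 for t in range(1, len(c)))
    den = sum(v ** 2 for v in c)
    return num / den


def classify(dw, d_l=1.10, d_u=1.37):
    """Label a DW value using bounds for n = 16, p = 1."""
    if dw < d_l:
        return "positive"
    if dw > 4 - d_l:          # assumption: mirror image of the lower bound
        return "negative"
    if d_u < dw < 4 - d_u:
        return "none"
    return "inconclusive"


# DW values for teams 1..32 as listed in the table
dw_by_team = [
    1.87, 0.97, 1.95, 2.74, 2.42, 1.21, 1.78, 1.60,
    0.79, 2.22, 2.51, 2.04, 2.21, 2.16, 1.83, 2.61,
    1.36, 2.04, 1.72, 1.58, 1.55, 3.05, 1.54, 2.87,
    1.62, 2.33, 1.67, 2.21, 2.20, 2.61, 2.84, 2.04,
]
labels = {team: classify(dw) for team, dw in enumerate(dw_by_team, start=1)}
positive = [t for t, lab in labels.items() if lab == "positive"]
negative = [t for t, lab in labels.items() if lab == "negative"]
print(positive, negative)

# Hypothetical residual series: a perfectly alternating sign pattern
print(durbin_watson([1, -1, 1, -1]))
```

Values near 2 indicate no first-order autocorrelation, values near 0 indicate positive autocorrelation, and values near 4 indicate negative autocorrelation.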