Bradley-Terry Models Stat 557 Heike Hofmann

Outline
• Definition: Bradley-Terry
• Fitting the model
• Extension: Order Effects
• Extension: Ordinal & Nominal Response
• Repeated Measures
Bradley-Terry Model
(1952)
• Idea: based on pairwise comparisons, find overall ranking
• e.g. sports teams, wine tasting, ...
$$\log \frac{\Pi_{ab}}{\Pi_{ba}} = \beta_a - \beta_b$$
If $\beta_a = \beta_b$ the two products are equal, i.e. $\Pi_{ab} = \Pi_{ba} = 0.5$; if $\beta_a > \beta_b$ then $\Pi_{ab} > 0.5 > \Pi_{ba}$.
To make the model identifiable, we use the constraint $\beta_I = 0$. Then
$$\Pi_{ab} = \frac{\exp(\beta_a - \beta_b)}{1 + \exp(\beta_a - \beta_b)} = \frac{\exp(\beta_a)}{\exp(\beta_a) + \exp(\beta_b)}$$
Since we have $\binom{I}{2}$ values $\Pi_{ab}$ to estimate $(I-1)$ parameters $\beta_a$, $a = 1, ..., I-1$, the degrees of freedom of the Bradley-Terry model are
$$df = \binom{I}{2} - (I-1) = I(I-1)/2 - (I-1) = (I-1)(I/2 - 1) = (I-1)(I-2)/2$$
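As a minimal sketch of the formula above, the win probability is the inverse logit of the ability difference. The helper name `bt_prob` and the example abilities are illustrative assumptions, not part of the course material.

```r
# Probability that a beats b under the Bradley-Terry model,
# given abilities beta_a and beta_b (hypothetical helper name).
bt_prob <- function(beta_a, beta_b) {
  # plogis(d) = exp(d) / (1 + exp(d)) with d = beta_a - beta_b
  plogis(beta_a - beta_b)
}

bt_prob(0, 0)        # equal abilities give 0.5
bt_prob(1, 0) > 0.5  # the stronger item is favoured
# the two orderings of a pair are complementary
bt_prob(1, 0) + bt_prob(0, 1)
```

Only differences of abilities enter the probability, which is why one ability has to be fixed (here $\beta_I = 0$).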
Example: American League Baseball
• 1987 Season
• each team played every other 13 times
Teams of Milwaukee, Detroit, Toronto, New York, Boston, Cleveland, and Baltimore are compared pairwise (data from 1987). Each team played each of the other teams 13 times; wins and losses are given in the table below:
                                    losing team
winning team  Baltimore  Cleveland  Boston  NY  Toronto  Detroit  Milwaukee
Baltimore             0          7       1   3        1        4          2
Cleveland             6          0       6   6        5        4          4
Boston               12          7       0   7        6        2          6
NY                   10          7       6   0        6        8          6
Toronto              12          8       7   7        0        6          4
Detroit               9          9      11   5        7        0          6
Milwaukee            11          9       7   7        9        7          0
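The win/loss table above can be entered as a matrix and checked for consistency. This is a small sketch; the object names `wins` and `games` are assumptions made for illustration.

```r
teams <- c("Baltimore", "Cleveland", "Boston", "NY", "Toronto", "Detroit", "Milwaukee")

# rows: winning team, columns: losing team (values copied from the table)
wins <- matrix(c(
   0, 7,  1, 3, 1, 4, 2,
   6, 0,  6, 6, 5, 4, 4,
  12, 7,  0, 7, 6, 2, 6,
  10, 7,  6, 0, 6, 8, 6,
  12, 8,  7, 7, 0, 6, 4,
   9, 9, 11, 5, 7, 0, 6,
  11, 9,  7, 7, 9, 7, 0),
  nrow = 7, byrow = TRUE, dimnames = list(winner = teams, loser = teams))

# games played per pair: wins of a over b plus wins of b over a
games <- wins + t(wins)
all(games[upper.tri(games)] == 13)   # every pair met 13 times

# total wins per team, a first crude ranking
sort(rowSums(wins), decreasing = TRUE)
```

The ordering by total wins is only a descriptive summary; the model-based ranking comes from the fitted abilities.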
This table translates into a matrix of $\binom{I}{2} = 21$ columns:
http://www.baseball-reference.com/leagues/AL/1987standings.shtml
          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] ...
Milwaukee   -1   -1   -1   -1   -1   -1    0    0    0     0     0     0     0     0     0     0     0     0 ...
Detroit      1    0    0    0    0    0   -1   -1   -1    -1    -1     0     0     0     0     0     0     0 ...
Toronto      0    1    0    0    0    0    1    0    0     0     0    -1    -1    -1    -1     0     0     0 ...
NY           0    0    1    0    0    0    0    1    0     0     0     1     0     0     0    -1    -1    -1 ...
Bradley-Terry Model
• Let πab = probability that a beats b
• assume πab + πba = 1
i.e. no ties are allowed (for now)
• logit model
log πab/πba = µa - µb
with µ1 = 0 (estimability)
Bradley-Terry Model
• πab = exp(µa)/(exp(µa)+exp(µb))
• πab > 0.5, if µa > µb
• Bradley-Terry Model is a quasi-symmetric model
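The step from the logit form to the closed form for πab is short; written out (using only πab + πba = 1 from the slide):

```latex
\log\frac{\pi_{ab}}{\pi_{ba}} = \mu_a - \mu_b,
\qquad \pi_{ba} = 1 - \pi_{ab}
\;\Longrightarrow\;
\frac{\pi_{ab}}{1 - \pi_{ab}} = \exp(\mu_a - \mu_b)
\;\Longrightarrow\;
\pi_{ab} = \frac{\exp(\mu_a - \mu_b)}{1 + \exp(\mu_a - \mu_b)}
         = \frac{\exp(\mu_a)}{\exp(\mu_a) + \exp(\mu_b)} .
```

The last equality multiplies numerator and denominator by exp(µb).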
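ABL - logit model
data
library(plyr)
# fsym() is not defined on these slides; it is assumed to be a helper that
# builds an order-free label for each pair of teams
abl$pair <- fsym(abl$winner, abl$loser)
# one row per pair: +1 / -1 coding of the two teams, followed by both win counts
abl.new <- ddply(abl, .(pair), function(x) {
  dummy <- as.numeric(teams == x$winner[1]) - as.numeric(teams == x$loser[1])
  return(c(dummy, x$times))
})
names(abl.new) <- c("pair", as.character(teams), "scoreA", "scoreB")
# Baltimore is left out as reference team; no intercept
abl.tb <- glm(cbind(scoreA, scoreB) ~ Milwaukee + Detroit + Toronto +
    NY + Boston + Cleveland - 1, data = abl.new, family = binomial(link = logit))
summary(abl.tb)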
ABL - logit model
data
> head(abl.new)
                 pair Milwaukee Detroit Toronto NY Boston Cleveland Baltimore scoreA scoreB
1    Baltimore,Boston         0       0       0  0      1         0        -1     12      1
2 Baltimore,Cleveland         0       0       0  0      0         1        -1      6      7
3   Baltimore,Detroit         0       1       0  0      0         0        -1      9      4
4 Baltimore,Milwaukee         1       0       0  0      0         0        -1     11      2
5        Baltimore,NY         0       0       0  1      0         0        -1     10      3
6   Baltimore,Toronto         0       0       1  0      0         0        -1     12      1
glm(formula = cbind(scoreA, scoreB) ~ Milwaukee + Detroit + Toronto +
NY + Boston + Cleveland - 1, family = binomial(link = logit),
data = abl.new)
ABL - logit model
Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.50067  -0.52962  -0.06604   0.16281   2.06170
Coefficients:
          Estimate Std. Error z value Pr(>|z|)
Milwaukee   1.5814     0.3433   4.607 4.09e-06 ***
Detroit     1.4364     0.3396   4.230 2.34e-05 ***
Toronto     1.2945     0.3367   3.845 0.000121 ***
NY          1.2476     0.3359   3.715 0.000203 ***
Boston      1.1077     0.3339   3.318 0.000908 ***
Cleveland   0.6839     0.3319   2.061 0.039345 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 49.699  on 21  degrees of freedom
Residual deviance: 15.737  on 15  degrees of freedom
AIC: 87.324
Number of Fisher Scoring iterations: 4
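The fitted abilities translate directly into estimated win probabilities. The sketch below copies the estimates from the summary output by hand (Baltimore is the reference team with ability 0); the helper name `p_win` is an assumption for illustration.

```r
# Estimates copied from the summary output; Baltimore is the reference (0).
beta <- c(Milwaukee = 1.5814, Detroit = 1.4364, Toronto = 1.2945,
          NY = 1.2476, Boston = 1.1077, Cleveland = 0.6839, Baltimore = 0)

# Estimated probability that team a beats team b
p_win <- function(a, b) unname(plogis(beta[a] - beta[b]))

p_win("Milwaukee", "Detroit")    # about 0.536
p_win("Milwaukee", "Baltimore")  # about 0.829

# Full matrix of estimated win probabilities (row team beats column team)
round(plogis(outer(beta, beta, "-")), 3)
```

Even the best and second-best teams are close to a coin flip against each other, which matches the small gap between their coefficients.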
ABL - QS model
data
> head(abl)
      loser    winner times                pair
2   Detroit Milwaukee     7   Detroit,Milwaukee
3   Toronto Milwaukee     9   Milwaukee,Toronto
4        NY Milwaukee     7        Milwaukee,NY
5    Boston Milwaukee     7    Boston,Milwaukee
6 Cleveland Milwaukee     9 Cleveland,Milwaukee
7 Baltimore Milwaukee    11 Baltimore,Milwaukee
glm(formula = times ~ pair - 1 + winner, family = poisson(link = log),
data = abl)
ABL - QS model
Coefficients:
                        Estimate Std. Error z value Pr(>|z|)
pairBaltimore,Boston      2.7532     0.3973   6.930 4.22e-12 ***
pairBaltimore,Cleveland   3.0539     0.3981   7.671 1.71e-14 ***
.
.
.
pairMilwaukee,NY          2.0248     0.3061   6.615 3.71e-11 ***
pairMilwaukee,Toronto     2.0050     0.3076   6.518 7.13e-11 ***
pairNY,Toronto            2.1818     0.3870   5.638 1.72e-08 ***
winnerDetroit            -0.1449     0.3111  -0.466  0.64131
winnerToronto            -0.2869     0.3103  -0.925  0.35520
winnerNY                 -0.3337     0.3102  -1.076  0.28198
winnerBoston             -0.4737     0.3105  -1.525  0.12718
winnerCleveland          -0.8975     0.3166  -2.835  0.00458 **
winnerBaltimore          -1.5814     0.3433  -4.607 4.09e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for poisson family taken to be 1)
    Null deviance: 609.702  on 42  degrees of freedom
Residual deviance:  15.737  on 15  degrees of freedom
AIC: 222.05
Number of Fisher Scoring iterations: 5
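The `winner` effects of the quasi-symmetry fit and the abilities of the logit fit differ only in the reference team (Milwaukee here, Baltimore in the logit model). A short check with the printed estimates, copied by hand, illustrates this; the vector names are assumptions.

```r
# winner effects from the Poisson (QS) fit; Milwaukee is the reference level
winner <- c(Milwaukee = 0, Detroit = -0.1449, Toronto = -0.2869, NY = -0.3337,
            Boston = -0.4737, Cleveland = -0.8975, Baltimore = -1.5814)

# abilities from the logit fit; Baltimore is the reference team
logit <- c(Milwaukee = 1.5814, Detroit = 1.4364, Toronto = 1.2945, NY = 1.2476,
           Boston = 1.1077, Cleveland = 0.6839, Baltimore = 0)

# shift the QS effects so that Baltimore becomes the reference
shifted <- winner - winner["Baltimore"]

round(shifted - logit, 4)   # differences are rounding error only
```

The standard error of `winnerBaltimore` (0.3433) also equals that of `Milwaukee` in the logit fit, since both describe the same contrast.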
ABL - TB model
data
> library(BradleyTerry2)
>
> data(baseball, package = "BradleyTerry2")
> head(baseball)
  home.team away.team home.wins away.wins
1 Milwaukee   Detroit         4         3
2 Milwaukee   Toronto         4         2
3 Milwaukee  New York         4         3
4 Milwaukee    Boston         6         1
5 Milwaukee Cleveland         4         2
6 Milwaukee Baltimore         6         0
BTm(outcome = cbind(home.wins, away.wins), player1 = home.team,
player2 = away.team, id = "team", data = baseball)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.6539  -0.0508   0.4133   0.9736   2.5509
Coefficients:
              Estimate Std. Error z value Pr(>|z|)
teamBoston      1.1077     0.3339   3.318 0.000908 ***
teamCleveland   0.6839     0.3319   2.061 0.039345 *
teamDetroit     1.4364     0.3396   4.230 2.34e-05 ***
teamMilwaukee   1.5814     0.3433   4.607 4.09e-06 ***
teamNew York    1.2476     0.3359   3.715 0.000203 ***
teamToronto     1.2945     0.3367   3.845 0.000121 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 78.015  on 42  degrees of freedom
Residual deviance: 44.053  on 36  degrees of freedom
AIC: 140.52
Number of Fisher Scoring iterations: 4
ABL
• Three different solutions:
• same fits
• different residual/null deviances
• different degrees of freedom
Bradley-Terry Model
• Assume, X has J categories (number of
teams)
• There are a total of J(J-1)/2 pairs of
categories
• (J-1) parameters are fit
• degrees of freedom: (J-1)(J-2)/2
ABL
• For 7 teams we have
• 21 pairs of teams
• we fit 6 parameters
• resulting in 15 degrees of freedom
ABL
• logit has correct deviance and degrees of freedom
• BTm uses an extended data set (comes with the package, distinguishes home/away teams), i.e. 42 rows instead of 21
• loglinear model computes the null deviance and its degrees of freedom differently; residual deviance and degrees of freedom are the same as with the logit model (i.e. correct)
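The degrees of freedom reported by the three fits can be reproduced by counting rows and parameters. This is a bookkeeping sketch; the variable names are assumptions.

```r
n_teams  <- 7
n_pairs  <- choose(n_teams, 2)   # 21 unordered pairs
n_params <- n_teams - 1          # 6 ability parameters (one team is the reference)

# logit model: one binomial row per pair
df_logit <- n_pairs - n_params

# BTm on the package data: one binomial row per ordered (home, away) pair
df_btm <- 2 * n_pairs - n_params

# loglinear model: 42 counts, 21 pair terms plus 6 winner terms
df_loglinear <- 2 * n_pairs - (n_pairs + n_params)

c(logit = df_logit, BTm = df_btm, loglinear = df_loglinear)   # 15, 36, 15
```

The 36 degrees of freedom of the BTm fit therefore reflect the finer home/away split of the data, not a different model for the abilities.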
$$\log \frac{\Pi_{ab}}{\Pi_{ba}} = \underbrace{(\lambda^A_a - \lambda^B_a)}_{\beta_a} - \underbrace{(\lambda^A_b - \lambda^B_b)}_{\beta_b}.$$
Home Advantage
Home Team Advantage
In some comparisons the order of comparison makes a difference, e.g. teams do have an advantage if they play at home; at a wine tasting the first wine tasted is usually thought better than the other. To account for this home team advantage, we extend the Bradley-Terry model to:
$$\log \frac{\Pi_{ab}}{\Pi_{ba}} = \alpha + (\beta_a - \beta_b).$$
• most sports show a home advantage
• 1987 season
If $\alpha$ is significantly > 0 we do have a home team advantage.
Example: American Baseball League
For the 1987 baseball season we have a table containing wins and losses for home and away team (each entry gives wins-losses of the home team):
                                         Away Team
Home Team   Milwaukee  Detroit  Toronto  New York  Boston  Cleveland  Baltimore
Milwaukee       –        4-3      4-2      4-3      6-1      4-2        6-0
Detroit        3-3        –       4-2      4-3      6-0      6-1        4-3
Toronto        2-5       4-3       –       2-4      4-3      4-2        6-0
New York       3-3       5-1      2-5       –       4-3      4-2        6-1
Boston         5-1       2-5      3-3      4-2       –       5-2        6-0
Cleveland      2-5       3-3      3-4      4-3      4-2       –         2-4
Baltimore      2-5       1-5      1-6      2-4      1-6      3-4         –
E.g. New York vs Boston: New York lost 4–2 at Boston, and won 4–3 at New York.
Fitting the extended Bradley-Terry model yields:
Bradley Terry with Order Effects
• assume that first team plays at home
• let πab be the probability that team a beats team b when team a goes first
• logit model
log πab/πba = µ + µa - µb
(here πba = 1 - πab is the probability that b wins when a goes first)
• if µ is significantly > 0 there is a home advantage
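ABL
BradleyTerry2 package
# baseballModel1 is the BTm fit with formula ~ team (no order effect)
# replace the team columns by data frames carrying an at.home indicator
baseball$home.team <- data.frame(team = baseball$home.team, at.home = 1)
baseball$away.team <- data.frame(team = baseball$away.team, at.home = 0)
baseballModel2 <- update(baseballModel1, formula = ~ team + at.home)
summary(baseballModel2)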
> anova(baseballModel1, baseballModel2)
Analysis of Deviance Table
Response: cbind(home.wins, away.wins)
Model 1: ~team
Model 2: ~team + at.home
  Resid. Df Resid. Dev Df Deviance
1        36     44.053
2        35     38.643  1   5.4106
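The deviance table above is printed without a p-value. Assuming the usual chi-square approximation for the likelihood-ratio statistic, it can be obtained as follows (the variable names are illustrative):

```r
# drop in deviance when adding at.home, and its degrees of freedom (from the table)
lr_stat <- 5.4106
lr_df   <- 36 - 35

# upper tail probability of the chi-square distribution
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
p_value   # about 0.02
```

Calling `anova(baseballModel1, baseballModel2, test = "Chisq")` should add the same p-value to the table directly.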
BTm(outcome = cbind(home.wins, away.wins), player1 = home.team,
player2 = away.team, formula = ~team + at.home, id = "team",
data = baseball)
ABL
Coefficients:
              Estimate Std. Error z value Pr(>|z|)
teamBoston      1.1438     0.3378   3.386 0.000710 ***
teamCleveland   0.7047     0.3350   2.104 0.035417 *
teamDetroit     1.4754     0.3446   4.282 1.85e-05 ***
teamMilwaukee   1.6196     0.3474   4.662 3.13e-06 ***
teamNew York    1.2813     0.3404   3.764 0.000167 ***
teamToronto     1.3271     0.3403   3.900 9.64e-05 ***
at.home         0.3023     0.1309   2.308 0.020981 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 78.015  on 42  degrees of freedom
Residual deviance: 38.643  on 35  degrees of freedom
AIC: 137.11
Number of Fisher Scoring iterations: 4
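To see what an `at.home` effect of 0.3023 means on the probability scale, the estimates can be plugged into the order-effect model. The sketch copies two team estimates from the output by hand; the variable names are assumptions.

```r
alpha <- 0.3023                                   # at.home estimate
beta  <- c(Milwaukee = 1.6196, Detroit = 1.4754)  # team estimates

# Milwaukee hosting Detroit: home team is Milwaukee
p_mil_home <- unname(plogis(alpha + beta["Milwaukee"] - beta["Detroit"]))

# Milwaukee visiting Detroit: home team is Detroit, Milwaukee wins otherwise
p_mil_away <- unname(1 - plogis(alpha + beta["Detroit"] - beta["Milwaukee"]))

# two equally strong teams: the home team wins with probability plogis(alpha)
p_equal <- plogis(alpha)

round(c(home = p_mil_home, away = p_mil_away, equal = p_equal), 3)
```

Under these estimates the stronger team is no longer the favourite when it plays away against a team of similar strength.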
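ABL - logit with order
# wins of the home team (first column) and of the away team (second column),
# one entry per ordered (home, away) pair
response <- cbind(
  c(4,4,4,6,4,6, 3,4,4,6,6,4, 2,4,2,4,4,6, 3,5,2,4,4,6, 5,2,3,4,5,6,
    2,3,3,4,4,2, 2,1,1,2,1,3),
  c(3,2,3,1,2,0, 3,2,3,0,1,3, 5,3,4,3,2,0, 3,1,5,3,2,1, 1,5,3,2,2,0,
    5,3,4,3,2,4, 5,5,6,4,6,4))
# 42 pair sets
# teams is assumed to be a factor with levels in the order
# Milwaukee, Detroit, Toronto, NY, Boston, Cleveland, Baltimore
xabl <- expand.grid(teamB = teams, teamA = teams)
idx <- with(xabl, which(teamA == teamB))
xabl <- xabl[-idx, ]
# design matrix: +1 for the home team (teamA), -1 for the away team (teamB)
X <- matrix(0, nrow = nrow(xabl), ncol = length(teams))
for (i in 1:nrow(X)) {
  X[i, as.numeric(xabl$teamA)[i]] <- 1
  X[i, as.numeric(xabl$teamB)[i]] <- -1
}
X <- data.frame(X)
names(X) <- as.character(teams)
# step not shown on the slide, inferred from the data shown next:
# attach the scores and the design columns
xabl <- cbind(xabl, scoreA = response[, 1], scoreB = response[, 2], X)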
ABL - home advantage
      teamB     teamA scoreA scoreB Milwaukee Detroit Toronto NY Boston Cleveland Baltimore
2   Detroit Milwaukee      4      3         1      -1       0  0      0         0         0
3   Toronto Milwaukee      4      2         1       0      -1  0      0         0         0
4        NY Milwaukee      4      3         1       0       0 -1      0         0         0
5    Boston Milwaukee      6      1         1       0       0  0     -1         0         0
6 Cleveland Milwaukee      4      2         1       0       0  0      0        -1         0
7 Baltimore Milwaukee      6      0         1       0       0  0      0         0        -1
fit.BTO<-glm(cbind(scoreA, scoreB)~1+Milwaukee + Detroit + Toronto
+ NY + Boston + Cleveland, family=binomial(link=logit), data=xabl)
glm(formula = cbind(scoreA, scoreB) ~ 1 + Milwaukee + Detroit +
Toronto + NY + Boston + Cleveland, family = binomial(link = logit),
data = xabl)
ABL - home advantage
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   0.3023     0.1309   2.308 0.020981 *
Milwaukee     1.6196     0.3474   4.662 3.13e-06 ***
Detroit       1.4754     0.3446   4.282 1.85e-05 ***
Toronto       1.3271     0.3403   3.900 9.64e-05 ***
NY            1.2813     0.3404   3.764 0.000167 ***
Boston        1.1438     0.3378   3.386 0.000710 ***
Cleveland     0.7047     0.3350   2.104 0.035417 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 73.516  on 41  degrees of freedom
Residual deviance: 38.643  on 35  degrees of freedom
AIC: 137.11
Number of Fisher Scoring iterations: 4
Bradley Terry Extensions
• Ordinal Response:
cumulative logit model
logit P(Y ≤ j) = µj + µa - µb
e.g. “loss”, “tie”, “win”
• Nominal Response:
baseline categorical model
log P(Y = j)/P(Y = J) = µj + µaj - µbj
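For the ordinal extension, category probabilities follow from differences of the cumulative probabilities. The sketch below uses made-up cutpoints and abilities purely for illustration (all numbers and names are hypothetical), and takes the sign convention of the formula on the slide as given.

```r
# hypothetical cutpoints mu_j for three ordered categories and two abilities
mu_cut <- c(-0.5, 0.5)   # J - 1 = 2 cutpoints, must be increasing
mu_a   <- 0.3
mu_b   <- 0.1

# cumulative probabilities P(Y <= j) = plogis(mu_j + mu_a - mu_b)
cum_prob <- plogis(mu_cut + mu_a - mu_b)

# category probabilities P(Y = j) as successive differences
cat_prob <- diff(c(0, cum_prob, 1))
cat_prob
sum(cat_prob)   # the three probabilities add up to 1
```

Because the ability difference shifts all cutpoints by the same amount, this is a proportional odds structure.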
Repeated Measures
Models
• Extension of matched pairs data
• Multiple (T ≥ 3) measurements observed
for same individual, e.g. individuals’ weekly
progress
• Measurements for cluster of individuals (T
≥ 3), e.g. one litter, teeth at dentist’s visit, ...
Example: Drug Comparisons
• Cross-over effect of drugs A, B, C
• Interested in marginal distributions P(A=S), P(B=S), P(C=S)
Example: Crossover-Study of Drugs. Drugs A, B, and C are tested on 46 individuals with a disease in a cross-over study, i.e. each individual is treated for some time with each of the drugs. The response is binary (success/failure) for each drug, giving a data set of
A  B  C  count
S  S  S      6
S  S  F     16
S  F  S      2
S  F  F      4
F  S  S      2
F  S  F      4
F  F  S      6
F  F  F      6
Total       46
One question of interest for these data is whether all of the drugs are equally effective or whether there is a difference. This question translates to whether marginal homogeneity holds.
From the raw data we get estimates for the effectiveness of each drug.
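The marginal success proportions can be computed directly from the table of counts. This is a minimal sketch; the data frame `drugs` and the vector `p_success` are names chosen for illustration.

```r
# all 8 response patterns; the first factor of expand.grid varies fastest,
# so C changes fastest, matching the row order of the table
drugs <- expand.grid(C = c("S", "F"), B = c("S", "F"), A = c("S", "F"))
drugs$count <- c(6, 16, 2, 4, 2, 4, 6, 6)

# marginal proportion of successes for each drug
p_success <- sapply(c("A", "B", "C"), function(d) {
  sum(drugs$count[drugs[[d]] == "S"]) / sum(drugs$count)
})
round(p_success, 3)

# the same quantities on the logit scale
round(qlogis(p_success), 4)
```

Drugs A and B both succeed for 28 of the 46 individuals, drug C for 16.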
Repeated Response Data
Multiple Binary Response
A lot of studies observe individuals repeatedly, e.g. longitudinal studies. For these data we will be mainly interested in the marginal distributions.
Let $(Y_1, Y_2, ..., Y_T)$ be a tuple of binary response variables observed at (time) points $t = 1, ..., T$. We are interested in the probability of success for each $t$, i.e. $P(Y_t = 1)$.
A logit model is then defined as
$$\text{logit } P(Y_t = 1) = \alpha + \beta_t,$$
with constraint $\beta_T = 0$ (or $\alpha = 0$) for estimability.
If $\beta_1 = \beta_2 = ... = \beta_T = 0$ we observe marginal homogeneity, i.e. then $P(Y_1 = 1) = ... = P(Y_T = 1)$.
Drugs Crossover
> head(drugs.m)
  count id variable value
1     6  1        A     Y
2    16  2        A     Y
3     2  3        A     Y
4     4  4        A     Y
5     2  5        A     N
6     4  6        A     N
glm(formula = value ~ variable - 1, family = binomial(link = logit),
data = drugsm, weights = count)
Deviance Residuals:
   Min      1Q  Median      3Q     Max
-3.698  -2.740  -0.220   2.152   3.986
Coefficients:
          Estimate Std. Error z value Pr(>|z|)
variableA   0.4418     0.3021   1.462   0.1436
variableB   0.4418     0.3021   1.462   0.1436
variableC  -0.6286     0.3096  -2.031   0.0423 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 191.31  on 24  degrees of freedom
Residual deviance: 182.60  on 21  degrees of freedom
AIC: 188.60
Number of Fisher Scoring iterations: 4
Marginal Homogeneity
> anova(drugs.null, drugs.mh, test="Chisq")
Analysis of Deviance Table
Model 1: value ~ 1
Model 2: value ~ variable - 1
  Resid. Df Resid. Dev Df Deviance P(>|Chi|)
1        23     191.05
2        21     182.60  2    8.451   0.01462 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
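The p-value in the deviance table can be reproduced by hand from the drop in deviance and the difference in residual degrees of freedom, assuming the chi-square reference distribution that `test="Chisq"` uses (variable names are illustrative):

```r
dev_drop <- 8.451     # deviance difference between the two models (from the table)
df_drop  <- 23 - 21   # difference in residual degrees of freedom

p_value <- pchisq(dev_drop, df = df_drop, lower.tail = FALSE)
round(p_value, 5)     # 0.01462, as in the table
```

At the 5% level this speaks against marginal homogeneity, i.e. against equal success probabilities for the three drugs.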