Paradoxes in Colley Matrix Sports Rankings T. S. Michael

Paradoxes in Colley Matrix Sports Rankings
T. S. Michael
U. S. Naval Academy
Annapolis, Maryland
tsm@usna.edu
http://www.usna.edu/Users/math/tsm/
joint work with Thomas Quint
University of Nevada
Joint National Mathematics Meetings, Boston, January 2012
T. S. Michael (U. S. Naval Academy)
Paradoxes in Colley Matrix Sports Rankings
January 2012 - Boston JMM
1 / 15
Summary
The Colley matrix sports rating method
produces extreme and shocking forms
of Simpson’s paradox.
Ratings and Rankings
Rating: a real number that measures sports performance
Ranking: sort the ratings to produce the ranks (ties allowed)
Example: Baseball batting averages are ratings.

$$\text{batting average} = \frac{\text{number of hits}}{\text{number of at-bats}}$$

2011 season
rank   player             batting average
1      Miguel Cabrera     .344
2      Adrian Gonzalez    .338
...
Which Y∗nkee Was the Better Batter in 2011?

                   Alex R∗driguez            Eric Ch∗vez
                   hits   at-bats   ave.     hits   at-bats   ave.
pre-All-Star        90     305      .295      10      33      .303
post-All-Star       13      68      .191      32     127      .252

Ch∗vez was better both before and after the All-Star break.

                   Alex R∗driguez            Eric Ch∗vez
                   hits   at-bats   ave.     hits   at-bats   ave.
whole season       103     373      .276      42     160      .263

R∗driguez was better for the whole season!

$$\frac{90}{305} < \frac{10}{33} \quad\text{and}\quad \frac{13}{68} < \frac{32}{127} \quad\text{but}\quad \underbrace{\frac{90+13}{305+68} > \frac{10+32}{33+127}}_{\text{baseball addition}}.$$
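The three comparisons in the batting table can be checked with exact arithmetic. The following is a minimal sketch; the helper names `average` and `season_average` are introduced here for illustration, and the hit and at-bat counts are copied from the table.

```python
from fractions import Fraction

# (hits, at-bats) for each half of the 2011 season, copied from the table.
rodriguez = {"pre": (90, 305), "post": (13, 68)}
chavez = {"pre": (10, 33), "post": (32, 127)}


def average(hits, at_bats):
    """Exact batting average as a fraction."""
    return Fraction(hits, at_bats)


def season_average(halves):
    """Pool hits and at-bats over both halves ("baseball addition")."""
    hits = sum(h for h, _ in halves.values())
    at_bats = sum(ab for _, ab in halves.values())
    return Fraction(hits, at_bats)


# Each flag is True when the stated player has the strictly higher average.
chavez_better_pre = average(*chavez["pre"]) > average(*rodriguez["pre"])
chavez_better_post = average(*chavez["post"]) > average(*rodriguez["post"])
rodriguez_better_season = season_average(rodriguez) > season_average(chavez)

print(chavez_better_pre, chavez_better_post, rodriguez_better_season)
```

All three flags come out true: the ordering within each half is the opposite of the ordering for the pooled season. Using `Fraction` avoids any doubt about rounding in the three-digit averages.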
Simpson’s Paradox
Simpson’s paradox occurs when

$$\frac{m}{M} < \frac{n}{N} \quad\text{and}\quad \frac{p}{P} < \frac{q}{Q} \quad\text{but}\quad \frac{m+p}{M+P} > \frac{n+q}{N+Q}.$$

Simpson’s paradox:
is not a genuine paradox
  - there is no contradiction
  - better name: reversal of rankings phenomenon
contradicts most people’s intuition about averages
arises in many statistical contexts
cannot occur when batters are consistent throughout the season:

$$\frac{m}{p} = \frac{n}{q} \quad\text{and}\quad \frac{M}{P} = \frac{N}{Q}$$
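A short derivation suggests why the consistency condition rules out a reversal. It reads the condition as $m/p = n/q$ and $M/P = N/Q$, assumes all counts are positive, and introduces the constants $c$ and $d$ purely for illustration.

```latex
% Assumed reading of the consistency condition; c and d are illustrative names.
\[
  c = \frac{m}{p} = \frac{n}{q}, \qquad d = \frac{M}{P} = \frac{N}{Q}
  \quad\Longrightarrow\quad
  m = cp,\; n = cq,\; M = dP,\; N = dQ .
\]
% First-half and whole-season averages are positive multiples of the
% second-half averages:
\[
  \frac{m}{M} = \frac{c}{d}\cdot\frac{p}{P}, \qquad
  \frac{n}{N} = \frac{c}{d}\cdot\frac{q}{Q}, \qquad
  \frac{m+p}{M+P} = \frac{c+1}{d+1}\cdot\frac{p}{P}, \qquad
  \frac{n+q}{N+Q} = \frac{c+1}{d+1}\cdot\frac{q}{Q} .
\]
% Hence all three comparisons have the same direction as p/P versus q/Q,
% and no reversal is possible.
```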
College Football Rankings
Problem: How do we rank college football teams based on the results of the games in a season?

Small Example:
team   beats      # wins   # losses
1      2, 3       2        1
2      4, 6       2        1
3      5, 6, 7    3        2
4      3          1        1
5      1          1        2
6      5          1        2
7      -          0        1

[Figure: directed graph of the ten games among teams 1 through 7]
Colley Matrix Rankings
ranks college football teams
  - based on wins, losses, and schedule
one of the six computer rankings used for BCS bowl games
method is revealed to the public
  - other five computer rankings are secret
well-motivated
start:

$$\widehat{r}_i = \frac{1 + d_i^+}{2 + d_i^+ + d_i^-} = \frac{1 + \frac{d_i^+ - d_i^-}{2} + \frac{d_i^+ + d_i^-}{2}}{2 + d_i^+ + d_i^-}$$
Colley’s website has details
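The slide leaves the next step to Colley's website. As a sketch of how the starting formula is usually turned into a linear system, the term $\frac{d_i^+ + d_i^-}{2}$ is read as a sum of $\frac12$ over team $i$'s games, and each $\frac12$ is then replaced by the rating of the actual opponent. The set $O_i$ of opponents of team $i$ (counted once per game) is notation introduced here, not taken from the slides.

```latex
% Each game contributes 1/2, the rating of an "average" opponent.
\[
  \frac{d_i^+ + d_i^-}{2} \;=\; \sum_{j \in O_i} \tfrac12 .
\]
% Replace each 1/2 by the opponent's own rating r_j.
\[
  r_i \;=\; \frac{1 + \frac{d_i^+ - d_i^-}{2} + \sum_{j \in O_i} r_j}{2 + d_i^+ + d_i^-} .
\]
% Clear the denominator to get one linear equation per team.
\[
  \bigl(2 + d_i^+ + d_i^-\bigr)\, r_i \;-\; \sum_{j \in O_i} r_j
  \;=\; 1 + \frac{d_i^+ - d_i^-}{2} .
\]
```

Collecting these equations for all teams gives a system whose coefficient matrix has $2 + d_i^+ + d_i^-$ on the diagonal and minus the number of games between teams $i$ and $j$ off the diagonal.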
Colley Matrix Ratings
Colley ratings are easy to compute. Solve a linear system

$$C r = \tfrac{1}{2}\,\bigl(d^+ - d^- + \mathbf{2}\bigr)$$

Colley matrix $C$ records the schedule of games
vector $r$ is the ratings vector
vectors $d^+$ and $d^-$ count wins and losses
vector $\mathbf{2}$ has all components equal to 2
Colley Ratings: Small Example
$$C r = \tfrac{1}{2}\,\bigl(d^+ - d^- + \mathbf{2}\bigr)$$

$$
\begin{pmatrix}
 5 & -1 & -1 &  0 & -1 &  0 &  0 \\
-1 &  5 &  0 & -1 &  0 & -1 &  0 \\
-1 &  0 &  7 & -1 & -1 & -1 & -1 \\
 0 & -1 & -1 &  4 &  0 &  0 &  0 \\
-1 &  0 & -1 &  0 &  5 & -1 &  0 \\
 0 & -1 & -1 &  0 & -1 &  5 &  0 \\
 0 &  0 & -1 &  0 &  0 &  0 &  3
\end{pmatrix}
\begin{pmatrix} r_1 \\ r_2 \\ r_3 \\ r_4 \\ r_5 \\ r_6 \\ r_7 \end{pmatrix}
= \frac{1}{2}
\begin{pmatrix} (2-1)+2 \\ (2-1)+2 \\ (3-2)+2 \\ (1-1)+2 \\ (1-2)+2 \\ (1-2)+2 \\ (0-1)+2 \end{pmatrix}
$$

$$
\begin{pmatrix} r_1 \\ r_2 \\ r_3 \\ r_4 \\ r_5 \\ r_6 \\ r_7 \end{pmatrix}
=
\begin{pmatrix} .6157 \\ .6144 \\ .5482 \\ .5407 \\ .4159 \\ .4157 \\ .3494 \end{pmatrix}
$$
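The ratings for the small example can be reproduced from the list of game results alone. The sketch below is one possible way to do it, not the authors' code: it assumes the ten games are encoded as (winner, loser) pairs read off the "beats" table, and the names `GAMES`, `colley_system` and `solve` are introduced here for illustration. Exact fractions are used so that no rounding enters before the final printout.

```python
from fractions import Fraction

# (winner, loser) pairs of the small example, read off the "beats" table.
GAMES = [(1, 2), (1, 3), (2, 4), (2, 6), (3, 5), (3, 6), (3, 7),
         (4, 3), (5, 1), (6, 5)]
N_TEAMS = 7


def colley_system(games, n):
    """Return (C, b): C has 2 + games played on the diagonal and -1 per
    game off the diagonal; b_i = 1 + (wins_i - losses_i) / 2."""
    C = [[Fraction(2 if i == j else 0) for j in range(n)] for i in range(n)]
    b = [Fraction(1) for _ in range(n)]
    for winner, loser in games:
        w, l = winner - 1, loser - 1
        C[w][w] += 1
        C[l][l] += 1
        C[w][l] -= 1
        C[l][w] -= 1
        b[w] += Fraction(1, 2)
        b[l] -= Fraction(1, 2)
    return C, b


def solve(A, y):
    """Solve A x = y exactly by Gauss-Jordan elimination."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


C, b = colley_system(GAMES, N_TEAMS)
ratings = solve(C, b)
for team, r in enumerate(ratings, start=1):
    print(f"r{team} = {float(r):.4f}")
```

The printed values should agree with the ratings listed above to four decimal places, and the exact ratings sum to 7/2, so the average rating is 1/2.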
Connection to Algebraic Graph Theory

$$
C =
\underbrace{\begin{pmatrix}
 3 & -1 & -1 &  0 & -1 &  0 &  0 \\
-1 &  3 &  0 & -1 &  0 & -1 &  0 \\
-1 &  0 &  5 & -1 & -1 & -1 & -1 \\
 0 & -1 & -1 &  2 &  0 &  0 &  0 \\
-1 &  0 & -1 &  0 &  3 & -1 &  0 \\
 0 & -1 & -1 &  0 & -1 &  3 &  0 \\
 0 &  0 & -1 &  0 &  0 &  0 &  1
\end{pmatrix}}_{L \,=\, \text{Laplacian matrix}}
+
\underbrace{\begin{pmatrix}
2 &   &   &   &   &   &   \\
  & 2 &   &   &   &   &   \\
  &   & 2 &   &   &   &   \\
  &   &   & 2 &   &   &   \\
  &   &   &   & 2 &   &   \\
  &   &   &   &   & 2 &   \\
  &   &   &   &   &   & 2
\end{pmatrix}}_{2\,(\text{identity matrix})}
$$

$$C = L + 2I$$

$$(L + 2I)\, r = \tfrac{1}{2}\,\bigl(d^+ - d^- + \mathbf{2}\bigr)$$

$d^+$ is the out-degree vector (wins)
$d^-$ is the in-degree vector (losses)
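Two standard consequences of the decomposition $C = L + 2I$ may be worth spelling out. They are not stated on the slide; the symbol $n$ for the number of teams and the test vector $x$ are introduced here.

```latex
% (1) L is positive semidefinite: for any vector x, summing over all games
%     {i,j} played,
\[
  x^{\mathsf T} L\, x \;=\; \sum_{\text{games } \{i,j\}} (x_i - x_j)^2 \;\ge\; 0 ,
\]
%     so C = L + 2I is positive definite and the ratings exist and are unique.
%
% (2) The rows of L sum to zero and total wins equal total losses, so
%     multiplying (L + 2I) r = (1/2)(d^+ - d^- + 2) on the left by the
%     all-ones row vector gives
\[
  2 \sum_{i=1}^{n} r_i \;=\; \frac12 \sum_{i=1}^{n} \bigl(d_i^+ - d_i^-\bigr) + n \;=\; n ,
  \qquad\text{hence}\qquad
  \frac{1}{n} \sum_{i=1}^{n} r_i \;=\; \frac12 .
\]
```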
College Football 2011: Colley Ratings
Ignore Division 1-AA teams. Use the original Colley method.
120 teams
680 games

rank   team              Colley rating
1      LSU               1.0433
2      Oklahoma State    0.9678
3      Alabama           0.9140
4      Kansas State      0.8744
5      Stanford          0.8713
6      Oregon            0.8485
7      Oklahoma          0.8380
8      South Carolina    0.8369
9      Arkansas          0.8318
10     Boise State       0.8312
11     Southern Cal      0.8298
12     Virginia Tech     0.8232
13     Baylor            0.8220
College Football 2011: Colley and Simpson
dozens of pairs of teams exhibit the reversal of rankings phenomenon with respect to the 1st and 2nd halves of the season
... and also with respect to even and odd weeks (for connectivity)
about half of the teams are involved in a reversal pair
Question: What happens to the rankings if we duplicate the season, so that every game is played twice with the same outcome?
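To see why duplicating the season can matter at all, it helps to write out the system for a season replicated $k$ times. This is a sketch built on $C = L + 2I$; the symbols $k$ and $r^{(k)}$ are introduced here and do not appear on the slides.

```latex
% Playing every game k times multiplies the Laplacian and the win/loss
% counts by k, but leaves the "2I" and the all-ones terms unchanged:
\[
  \bigl(kL + 2I\bigr)\, r^{(k)} \;=\; \mathbf{1} + \frac{k}{2}\bigl(d^+ - d^-\bigr) .
\]
% Dividing by k shows that the system itself depends on k:
\[
  \Bigl(L + \frac{2}{k}\, I\Bigr)\, r^{(k)} \;=\; \frac{1}{k}\,\mathbf{1} + \frac12\bigl(d^+ - d^-\bigr) .
\]
% A batting average, by contrast, is unchanged by replication:
\[
  \frac{k\,m}{k\,M} \;=\; \frac{m}{M} .
\]
```

Since the ratings $r^{(k)}$ solve a different system for each $k$, nothing forces their order to stay the same.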
College Football 2011: Replicated Seasons
rank   team              duplicate   triplicate   quadruplicate
1      LSU               1           1            1
2      Oklahoma State    2           2            2
3      Alabama           3           3            3
4      Kansas State      4           4            4
5      Stanford          5           5            6
6      Oregon            6           8            8
7      Oklahoma          7           6            5
8      South Carolina    8           9            9
9      Arkansas          9           10           10
10     Boise State       12          13           14
11     Southern Cal      11          11           11
12     Virginia Tech     13          12           12
13     Baylor            10          7            7
Small Example ... Duplicated
[Figure: directed graph of the games among teams 1 through 7]

rank   team   duplicate rank
1      1      2
2      2      1
3      3      4
4      4      3
5      5      6
6      6      5
7      7      7
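The rank swaps in this table can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' computation: the games are encoded as (winner, loser) pairs from the small example, the function names and the `copies` and `sweeps` parameters are hypothetical, and the system is solved by Gauss-Seidel sweeps (each team's rating is repeatedly recomputed from its opponents' current ratings) instead of a direct solve.

```python
# (winner, loser) pairs of the small example, read off the "beats" table.
GAMES = [(1, 2), (1, 3), (2, 4), (2, 6), (3, 5), (3, 6), (3, 7),
         (4, 3), (5, 1), (6, 5)]
N_TEAMS = 7


def colley_ratings(games, n, copies=1, sweeps=500):
    """Colley ratings when every game is played `copies` times.

    Gauss-Seidel sweeps on C r = b; they converge because the Colley
    matrix is strictly diagonally dominant.
    """
    played = [0] * n
    margin = [0] * n                    # wins minus losses
    opponents = [[] for _ in range(n)]  # one entry per game played
    for winner, loser in games:
        w, l = winner - 1, loser - 1
        for _ in range(copies):
            played[w] += 1
            played[l] += 1
            margin[w] += 1
            margin[l] -= 1
            opponents[w].append(l)
            opponents[l].append(w)
    r = [0.5] * n
    for _ in range(sweeps):
        for i in range(n):
            total = 1 + margin[i] / 2 + sum(r[j] for j in opponents[i])
            r[i] = total / (2 + played[i])
    return r


def ranking(ratings):
    """Teams (numbered from 1) sorted from highest to lowest rating."""
    return sorted(range(1, len(ratings) + 1), key=lambda t: -ratings[t - 1])


original = ranking(colley_ratings(GAMES, N_TEAMS))
duplicated = ranking(colley_ratings(GAMES, N_TEAMS, copies=2))
print(original)    # [1, 2, 3, 4, 5, 6, 7]
print(duplicated)  # [2, 1, 4, 3, 6, 5, 7]
```

Playing every game twice swaps teams 1 and 2, teams 3 and 4, and teams 5 and 6, in agreement with the table. Passing a larger `copies` value gives the triplicated and quadruplicated seasons.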
Summary
The Colley matrix sports rating method
produces extreme and shocking forms
of Simpson’s paradox.