processing of sports information

Alberto Palacios Pawlovsky
SPM
1
Introduction
Soccer is one of the most popular sports in the world and, together with baseball, one
of the two most popular sports in Japan. In soccer, two teams of eleven players each
try to put a ball into the opposing goal, which is defended by a goalkeeper, the only
player who may touch the ball with the hands, and only within a restricted zone of
the playing area. The other players may only kick or head the ball, anywhere in the
playing area. In almost all soccer tournaments a team is awarded three points for a
win and zero points for a defeat, and each team is given one point if the match ends
in a tie (Brillinger (2010)).
The distribution of goals in soccer has long been a focus of research. Reep, Pollard,
and Benjamin (1971) showed that the number of goals scored by a team follows a
Negative Binomial distribution. Maher (1982) contested this finding and instead used
a Poisson distribution, defining the mean of the goals scored by a team as the product
of its attack strength and the opposing team's defense weakness. His model can also
be used to predict scores. For game outcome prediction, we can also use ranking.
Ranking systems are used in some sports to select or seed teams for pre- or post-season
tournaments (Harville (2003)). Ranking teams before a match is also valuable
to managers because it can help them choose defense and offense strategies
or the starting players for a game. The problem of rating teams and forming a
ranking has been studied for a long time, and people from diverse disciplines and
backgrounds have proposed several methods.
The creation and open availability of databases for almost any popular sport has
also fostered the development of many computer-based ranking systems. One that
has attracted attention is the method proposed by Colley (2002), since it is one of
the computer rankings used for college football's Bowl Championship Series
(BCS). His ranking system can also be used for prediction. Ingram's (2007) work
using linear algebra is the basis of the ODM (Offense Defense Model) of Govan,
Langville, and Meyer (2009). ODM is a model based on defense and offense ratings
that can also be used for prediction. Stefani (2008) developed a least squares
approach to predict the scores in rugby and soccer games that, like Maher's, uses
defense and offense ratings of the teams.
Ranking is itself a constantly evolving area. There are models in the area of
information processing, like that of Callaghan, Mucha, and Porter (2007), that
could also be applied to sports. For soccer, there is the work of Hallinan (2005),
which ranks national soccer teams using a modified Bradley-Terry model (Bradley
and Terry (1952)). There are even works that use computer voters to determine
rating and rank (Gleich and Lim (2011)).
One characteristic common to almost all computer-based methods is that they
use the information currently available on sports association sites. This paper
introduces two metrics we have developed to rate soccer teams. We evaluated the quality
of these metrics by rating and ranking Japanese university soccer teams and using the
resulting rank to predict the outcome of soccer games in the first and second
divisions of JUFA (Japanese University Football Association).
2
Scores and Points Metrics (SPM)
In the case of soccer, the usual and minimal information available on association
sites for tournament games is the date of the matches and the corresponding scores.
We have been studying several metrics based on this basic information in order to
rate teams, rank them, and predict the outcome of future games. We propose
two performance metrics and one way to rate a team. One of the metrics uses the
goals scored by a team and the other the points that the team has earned up to a given
match, so we will call them scores and points metrics (SPM) in what follows.
2.1
SPM and Rating
We will explain our metrics using two teams, i and a, that face each other on the
k-th match day. We denote the points gained before this game by team i and by
team a (the adversary) by $p_i^{k-1}$ and $p_a^{k-1}$, respectively. Their initial
values, before the start of the season, are $p_i^0 = 0$ and $p_a^0 = 0$.
We use the point rules of soccer tournaments, so if team i wins the k-th match it
earns three points, and its points are given by $p_i^k = p_i^{k-1} + 3$. If the game
ends in a tie, its points are given by $p_i^k = p_i^{k-1} + 1$, and if it loses, by
$p_i^k = p_i^{k-1}$. The total number of points that any team can earn up to the
k-th match is given by equation (1).
$$p_{total}^{k} = k \times 3 \qquad (1)$$
One of our performance metrics evaluates the points gained by a team relative to all
the points that it could have earned. We call it the points metric; it is defined by
the following equation.
$$p_{i,k} = \frac{p_i^{k}}{p_{total}^{k}} \qquad (2)$$
We also measure the performance of a team by its goals. The goals of team i in
the j-th game are denoted by $g_{i,j}$, and all its goals up to the k-th match are given
by equation (3).
$$g_i^{k} = \sum_{j=1}^{k} g_{i,j} \qquad (3)$$
In the same way, we denote the goals conceded by a team in the j-th game by $cg_{i,j}$
and their total up to the k-th game by equation (4).
$$cg_i^{k} = \sum_{j=1}^{k} cg_{i,j} \qquad (4)$$
So the total number of goals scored and conceded by team i is given by the following
equation. If $tg_i^{k}$ is 0, it is set to 0.1 to avoid division by zero in some special
cases.
$$tg_i^{k} = g_i^{k} + cg_i^{k} \qquad (5)$$
Our second metric measures the performance of team i, up to the k-th game, using
what we call the scores metric (s) of a team, which is defined by equation (6).
$$s_{i,k} = \frac{g_i^{k}}{tg_i^{k}} \qquad (6)$$
We use the above two metrics (equations (2) and (6)) to rate a team, up to the k-th
game, according to the formula given by equation (7).
$$r_i^{k} = s_{i,k} \times p_{i,k} \qquad (7)$$
We use the rating of a team to compare it to other teams and, if needed, rank them.
We have also studied and evaluated the individual effects of the scores and
points metrics when rating. When only the scores metric is used, equation
(7) becomes
$$r_i^{k} = s_{i,k} \qquad (8)$$
and when only the points metric is used, the rating is given by
$$r_i^{k} = p_{i,k} \qquad (9)$$
We evaluated our metrics by combining them, as in equation (7), but using a weighting
factor (w) to measure their effect on rating. For the evaluation, we used the
following (modified) rating.
$$r_i^{k} = (s_{i,k})^{(1-w)} \times (p_{i,k})^{w} \qquad (10)$$
When w is 0 we have equation (8), and when it is 1 we get equation (9). The
evaluation was carried out by setting w to 11 values, from 0 to 1 in increments of 0.1,
and using the ratings for game outcome prediction. The results are detailed in the
following subsections.
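The metrics and the weighted rating can be sketched in a few lines of Python. This is only an illustrative sketch of equations (2), (5), (6) and (10); the function and parameter names are assumptions made for the example and are not taken from the programs used in the evaluation. It assumes the team has played at least one game.

```python
def points_metric(points, games_played):
    """Equation (2): points earned over the maximum possible (3 per game)."""
    return points / (3 * games_played)


def scores_metric(goals_for, goals_against):
    """Equation (6): goals scored over goals scored plus goals conceded."""
    total = goals_for + goals_against
    if total == 0:
        total = 0.1  # substitution described in the text for equation (5)
    return goals_for / total


def rating(points, games_played, goals_for, goals_against, w=0.5):
    """Equation (10): weighted combination of the scores and points metrics."""
    s = scores_metric(goals_for, goals_against)
    p = points_metric(points, games_played)
    return s ** (1 - w) * p ** w


# Hypothetical team: 2 wins and 1 tie in 3 games (7 points), 5 goals for, 3 against.
r_scores_only = rating(7, 3, 5, 3, w=0.0)   # reduces to equation (8)
r_points_only = rating(7, 3, 5, 3, w=1.0)   # reduces to equation (9)
r_equal = rating(7, 3, 5, 3, w=0.5)
```

With w = 0.5 the rating is the square root of the product in equation (7), so it orders teams exactly as equation (7) does.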
2.2
Weighted SPM Evaluation : Prediction
As indicated above, we evaluated our metrics using the data of the last
twelve years (1999-2010) of the first and second divisions of the Japanese University
Football Association (JUFA, Kanto League). It has the characteristic that
almost all its games are played in neutral stadiums, with no or negligible home
advantage. We used the public data on the site of JUFA (2011).
The rules governing JUFA have changed over the years, and the data collected
have the following characteristics. The first and second divisions of JUFA had only
8 teams in 1999 and 2000, and each team played only one game against every other
team in those seasons. From 2001 to 2004, the teams also played a return game and
the season's games were divided into two terms. Since the 2005 season, the number
of teams per division has been 12. In all these seasons, all the teams had played
the same number of games before facing an adversary. The only exception in
the data is the 2006 season of the second division. In this season, one team was
suspended in the middle of the second term, and that year we had only 119 games in
that division.
The data gathering process required parsing and processing all the corresponding
match day pages. For this we used tailored programs written in Python.
Table 1: Prediction Results: Detailed Example (by match date, w = 0.5)
1999 Season : 1st Division
match          w = 0.5
League 1 2     0/4      0.00%
League 1 3     2/4     50.00%
League 1 4     1/4     25.00%
League 1 5     1/4     25.00%
League 1 6     3/4     75.00%
League 1 7     2/4     50.00%
Total:         9/24    37.50%
JUFA's games are all scheduled weekly, so for all the weights (w) the scores
and points metrics were computed using weekly results. We predicted only the
results of the games after the first match day (from the second game onward). One
sample of the detailed results, for one season, is shown in Table 1. Match days are
represented in a "League x y" format, where x is the division and y is the match day.
We used equation (10) to rate each team before its k-th match using data up to
its previous, (k-1)-th, match. We then used those ratings to determine the winners of
the k-th match day. We used neither pre-season data nor any other data to improve
the predictions.
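The foresight procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the evaluation: match days are given as lists of hypothetical (team_a, team_b, goals_a, goals_b) tuples, every team has played on each earlier match day, and equal ratings are predicted as a tie.

```python
from collections import defaultdict


def new_table():
    # Running totals per team: points, games played, goals for, goals against.
    return defaultdict(lambda: {"pts": 0, "games": 0, "gf": 0, "ga": 0})


def update(table, match):
    a, b, goals_a, goals_b = match
    for team, scored, conceded in ((a, goals_a, goals_b), (b, goals_b, goals_a)):
        row = table[team]
        row["games"] += 1
        row["gf"] += scored
        row["ga"] += conceded
        row["pts"] += 3 if scored > conceded else 1 if scored == conceded else 0


def rating(row, w):
    total = (row["gf"] + row["ga"]) or 0.1   # equation (5) with its 0.1 fallback
    s = row["gf"] / total                     # equation (6)
    p = row["pts"] / (3 * row["games"])       # equation (2)
    return s ** (1 - w) * p ** w              # equation (10)


def outcome(x, y):
    return "a" if x > y else "b" if x < y else "tie"


def foresight_hits(match_days, w=0.5):
    """Count correct predictions from the second match day onward."""
    table = new_table()
    hits = total = 0
    for day, matches in enumerate(match_days):
        if day > 0:  # the first match day is not predicted
            for a, b, goals_a, goals_b in matches:
                predicted = outcome(rating(table[a], w), rating(table[b], w))
                hits += predicted == outcome(goals_a, goals_b)
                total += 1
        for match in matches:  # only now add this match day to the data
            update(table, match)
    return hits, total


# Toy data (hypothetical teams and scores), two match days.
toy_season = [
    [("A", "B", 2, 0), ("C", "D", 1, 1)],
    [("A", "C", 1, 0), ("B", "D", 0, 2)],
]
toy_result = foresight_hits(toy_season, w=0.5)
```

The ratings are read before the current match day is added to the table, which is what keeps the prediction a foresight one.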
The total results for the first division and all weights are shown in Figure 1. It
shows that the best setting is w = 0.5 for the twelve-year span (an equal weight for
the s and p metrics). However, the details in Table 2 show that for the 1999 and 2000
seasons all the weights between 0.1 and 0.6 give the same highest prediction
percentage. Also, for the seasons between 2001 and 2004, the best weight is 0.9.
Moreover, for the contemporary data (2005 onward) the best figures are obtained
with w set to 0.0 (only using the scores metric) or 0.1.
Figure 1: Weighted SPM : Foresight Prediction Percentages (1st Division JUFA).
Figure 2 shows the results for the second division. Those results show a slightly
different distribution. The best setting is w = 0.6 and the second best is w = 0.5.
From the details of Table 3, we can determine that for the 1999 and 2000 seasons the
best prediction percentages are obtained with values of w between 0.0 and 0.6.
Also, for the seasons between 2001 and 2004 the highest prediction percentages
are obtained with w set to 0.6 or 1.0 (the second best values are for 0.5 and 0.7). And
if we limit the span to the seasons from 2005 onward, the best values are obtained with
w set to 0.3, 0.5 and 0.6. The best weight values for both divisions hint that the
best combination of the scores and points metrics is an equal weight for both
metrics.
Figure 2: Weighted SPM : Foresight Prediction Percentages (2nd Division JUFA).
Table 2 shows the details per season of the prediction results for the first division
of JUFA. As already shown in Figure 1, for all twelve seasons the
best total is obtained with w = 0.5. If we look at the best values per season (in
boldface) in this table, the setting with the highest number of seasons with best
values is w = 1. It has five seasons with best values (1999, 2001, 2003, 2005,
and 2009), two of them in the contemporary range. The next best setting
is w = 0.7, with three seasons with best values, all of them in the contemporary
range. Three values of w share the third position: 0, 0.5 and 0.6. Each one has three
best-value seasons, two of them in the contemporary range.
Table 3 shows the details per season of the prediction results for the second
division of JUFA. The best values per season are also highlighted in this table. For
the span of twelve seasons, the best setting is w = 0.6. It has six seasons with best
values, three of which are in the contemporary range (2005 onward). The
next best setting for w is 0.7, with five seasons with best values, two of them
in contemporary seasons. Third place goes to w = 0.3, with four seasons of best
values, three of them in the contemporary seasons. The values for w in fourth place
are 0.5, 0.8 and 0.9, each having four seasons of best values with two of them from
2005 onward. If we limit the analysis to the contemporary data, the best settings for w
are 0.3 and 0.6. All other settings except w = 1 come second.

Table 2: Prediction Results (1st Division JUFA) : correctly predicted games/all games, and corresponding percentage.
                                                  weight (w)
Year   0         0.1       0.2       0.3       0.4       0.5       0.6       0.7       0.8       0.9       1
1999   8/24      8/24      8/24      8/24      9/24      9/24      9/24      7/24      7/24      7/24      9/24
       33.33     33.33     33.33     33.33     37.50     37.50     37.50     29.17     29.17     29.17     37.50
2000   12/24     13/24     13/24     13/24     12/24     12/24     12/24     12/24     12/24     12/24     11/24
       50.00     54.17     54.17     54.17     50.00     50.00     50.00     50.00     50.00     50.00     45.83
2001   19/52     19/52     20/52     19/52     19/52     20/52     19/52     19/52     19/52     20/52     21/52
       36.54     36.54     38.46     36.54     36.54     38.46     36.54     36.54     36.54     38.46     40.38
2002   24/52     24/52     24/52     25/52     24/52     23/52     23/52     23/52     23/52     23/52     22/52
       46.15     46.15     46.15     48.08     46.15     44.23     44.23     44.23     44.23     44.23     42.31
2003   23/52     22/52     23/52     24/52     26/52     26/52     26/52     27/52     28/52     28/52     28/52
       44.23     42.31     44.23     46.15     50.00     50.00     50.00     51.92     53.85     53.85     53.85
2004   30/52     30/52     30/52     29/52     30/52     29/52     29/52     29/52     29/52     29/52     27/52
       57.69     57.69     57.69     55.77     57.69     55.77     55.77     55.77     55.77     55.77     51.92
2005   56/126    55/126    56/126    55/126    56/126    56/126    57/126    57/126    57/126    56/126    57/126
       44.44     43.65     44.44     43.65     44.44     44.44     45.24     45.24     45.24     44.44     45.24
2006   69/126    67/126    64/126    64/126    63/126    65/126    64/126    64/126    64/126    65/126    66/126
       54.76     53.17     50.79     50.79     50.00     51.59     50.79     50.79     50.79     51.59     52.38
2007   66/126    65/126    65/126    64/126    64/126    63/126    62/126    61/126    60/126    60/126    57/126
       52.38     51.59     51.59     50.79     50.79     50.00     49.21     48.41     47.62     47.62     45.24
2008   59/126    62/126    61/126    60/126    60/126    61/126    60/126    59/126    58/126    58/126    53/126
       46.83     49.21     48.41     47.62     47.62     48.41     47.62     46.83     46.03     46.03     42.06
2009   62/126    62/126    62/126    62/126    63/126    63/126    63/126    63/126    62/126    62/126    63/126
       49.21     49.21     49.21     49.21     50.00     50.00     50.00     50.00     49.21     49.21     50.00
2010   63/126    64/126    64/126    64/126    64/126    65/126    64/126    65/126    64/126    64/126    63/126
       50.00     50.79     50.79     50.79     50.79     51.59     50.79     51.59     50.79     50.79     50.00
Total  491/1012  491/1012  490/1012  487/1012  490/1012  492/1012  488/1012  486/1012  483/1012  484/1012  477/1012
       48.52     48.52     48.42     48.12     48.42     48.61     48.22     48.02     47.73     47.83     47.13

Table 3: Prediction Results (2nd Division JUFA) : correctly predicted games/all games, and corresponding percentage.
                                                  weight (w)
Year   0         0.1       0.2       0.3       0.4       0.5       0.6       0.7       0.8       0.9       1
1999   12/24     13/24     13/24     13/24     13/24     13/24     13/24     13/24     13/24     13/24     12/24
       50.00     54.17     54.17     54.17     54.17     54.17     54.17     54.17     54.17     54.17     50.00
2000   13/24     12/24     12/24     12/24     12/24     12/24     12/24     11/24     11/24     11/24     10/24
       54.17     50.00     50.00     50.00     50.00     50.00     50.00     45.83     45.83     45.83     41.67
2001   20/52     21/52     21/52     20/52     20/52     20/52     20/52     19/52     20/52     20/52     22/52
       38.46     40.38     40.38     38.46     38.46     38.46     38.46     36.54     38.46     38.46     42.31
2002   27/52     28/52     28/52     28/52     28/52     29/52     29/52     29/52     27/52     27/52     25/52
       51.92     53.85     53.85     53.85     53.85     55.77     55.77     55.77     51.92     51.92     48.08
2003   21/52     22/52     22/52     21/52     22/52     22/52     22/52     22/52     22/52     22/52     24/52
       40.38     42.31     42.31     40.38     42.31     42.31     42.31     42.31     42.31     42.31     46.15
2004   21/52     20/52     20/52     21/52     22/52     22/52     23/52     23/52     23/52     23/52     23/52
       40.38     38.46     38.46     40.38     42.31     42.31     44.23     44.23     44.23     44.23     44.23
2005   63/126    64/126    64/126    64/126    64/126    64/126    64/126    63/126    63/126    63/126    58/126
       50.00     50.79     50.79     50.79     50.79     50.79     50.79     50.00     50.00     50.00     46.03
2006   61/119    61/119    60/119    59/119    59/119    60/119    59/119    59/119    59/119    59/119    61/119
       51.26     51.26     50.42     49.58     49.58     50.42     49.58     49.58     49.58     49.58     51.26
2007   73/126    71/126    71/126    70/126    71/126    70/126    70/126    70/126    69/126    69/126    68/126
       57.94     56.35     56.35     55.56     56.35     55.56     55.56     55.56     54.76     54.76     53.97
2008   70/126    70/126    71/126    74/126    72/126    71/126    71/126    71/126    71/126    72/126    73/126
       55.56     55.56     56.35     58.73     57.14     56.35     56.35     56.35     56.35     57.14     57.94
2009   76/126    78/126    78/126    79/126    78/126    81/126    82/126    82/126    82/126    82/126    78/126
       60.32     61.90     61.90     62.70     61.90     64.29     65.08     65.08     65.08     65.08     61.90
2010   58/126    58/126    59/126    59/126    59/126    59/126    59/126    59/126    59/126    59/126    58/126
       46.03     46.03     46.83     46.83     46.83     46.83     46.83     46.83     46.83     46.83     46.03
Total  515/1005  518/1005  519/1005  520/1005  520/1005  523/1005  524/1005  521/1005  519/1005  520/1005  512/1005
       51.24     51.54     51.64     51.74     51.74     52.04     52.14     51.84     51.64     51.74     50.95
Based on all the above, and taking as reference the number of games correctly
predicted for contemporary data in both divisions, we chose w = 0.5 as the best
setting. For this value of w we have in the first division 373 games (of 756) correctly
predicted. The highest number is obtained with w = 0 or w = 0.1, but the difference
is only two games. For the second division, the number of games correctly
predicted with this setting is 405 (of 749). The same number is obtained
with w = 0.6, but there is a difference of three games in the first division (w = 0.6
has only 370 games correctly predicted).
Setting w to 0.5 means that the scores and points metrics must have the same
weight. In other words, if we combine them for rating teams and use this
rating for ranking or prediction, we should use equation (7). It also seems to be the
best tradeoff for the whole span of data and for both divisions. Of course, another
possible choice would be to use different weight settings for each division.
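The contemporary totals quoted above can be checked directly against the per-season figures. The short script below is only a verification aid; the numbers are the 2005 to 2010 counts of correctly predicted games copied from Tables 2 and 3, and the variable names are chosen for this example.

```python
# Correctly predicted games per season, 2005-2010, keyed by weight w.
first_division = {
    0.0: [56, 69, 66, 59, 62, 63],
    0.1: [55, 67, 65, 62, 62, 64],
    0.5: [56, 65, 63, 61, 63, 65],
    0.6: [57, 64, 62, 60, 63, 64],
}
second_division = {
    0.5: [64, 60, 70, 71, 81, 59],
    0.6: [64, 59, 70, 71, 82, 59],
}

games_first = 6 * 126          # six seasons of 126 predicted games
games_second = 5 * 126 + 119   # the 2006 season has only 119 games

totals_first = {w: sum(v) for w, v in first_division.items()}
totals_second = {w: sum(v) for w, v in second_division.items()}

print(games_first, totals_first)    # 756 {0.0: 375, 0.1: 375, 0.5: 373, 0.6: 370}
print(games_second, totals_second)  # 749 {0.5: 405, 0.6: 405}
```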
2.3
Weighted SPM Evaluation : Fitting
We also measured the fitting of the predictions obtained with our weighted metrics.
Some authors call it hindsight prediction. It is a way of measuring how well
the models used fit the target data (the higher the hindsight prediction rate, the
smaller the error).
For hindsight prediction, we could use the data (rating, ranking) at the end of
the season to predict the outcomes of all its games, but we opted for measuring it
incrementally, one match day at a time. We used the data up to and including the
k-th game to predict the results of that match day. The total results for all weights
are shown in Fig. 3 for the first division of JUFA and in Fig. 4 for the second
division. As expected, since the team with more points is the top team at
the end of a season, or a match day, the best fitting is obtained with large weights for
the points metric. However, we can also see from both figures that the best weight
is not equal to 1. For the first division, and the twelve-season span, the best fitting
is obtained with w set to 0.9, and for the second division this value is 0.8.
The best hindsight percentages for the contemporary data (2005 season onward)
are also obtained with these (best) weight settings.
Figure 3: Weighted SPM : Hindsight Prediction Percentages (1st Division JUFA).
Figure 4: Weighted SPM : Hindsight Prediction Percentages (2nd Division JUFA).
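The incremental hindsight measurement differs from foresight prediction only in which games feed the rating. The sketch below illustrates this under stated assumptions: matches are hypothetical (team_a, team_b, goals_a, goals_b) tuples, the rating is equation (10), and the first match day is skipped. The last point is an inference from the game totals in the comparison tables, which match the foresight totals; it is not stated explicitly in the text.

```python
def spm_rate(history, team, w=0.9):
    """Equation (10) computed from a list of already played matches."""
    pts = games = gf = ga = 0
    for a, b, goals_a, goals_b in history:
        if team not in (a, b):
            continue
        scored, conceded = (goals_a, goals_b) if team == a else (goals_b, goals_a)
        games += 1
        gf += scored
        ga += conceded
        pts += 3 if scored > conceded else 1 if scored == conceded else 0
    s = gf / ((gf + ga) or 0.1)
    p = pts / (3 * games)
    return s ** (1 - w) * p ** w


def sign(x, y):
    return (x > y) - (x < y)   # 1, -1 or 0 (tie)


def hindsight_hits(match_days, w=0.9):
    history, hits, total = [], 0, 0
    for day, matches in enumerate(match_days):
        history.extend(matches)   # include the match day being "predicted"
        if day == 0:
            continue              # assumed: first match day is not scored
        for a, b, goals_a, goals_b in matches:
            predicted = sign(spm_rate(history, a, w), spm_rate(history, b, w))
            hits += predicted == sign(goals_a, goals_b)
            total += 1
    return hits, total


# Toy data (hypothetical teams and scores), two match days.
toy_days = [
    [("A", "B", 2, 0), ("C", "D", 1, 1)],
    [("A", "C", 1, 0), ("B", "D", 0, 2)],
]
toy_hindsight = hindsight_hits(toy_days)
```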
3
SPM Evaluation : Comparison to Other Methods
We also evaluated the performance of the ratings based on our metrics by comparing
their prediction results to those we can derive using the methodology of Maher
(1982), the score prediction method of Stefani (2008), the ranking method of Colley
(2002), and the ODM of Govan et al. (2009).
Maher supported the theory that the number of goals in soccer follows a Poisson
distribution, and defined parameters to represent the defensive and offensive
characteristics of a team. His approach defines four parameters for each team (two
at home and two when playing away). However, he studied the importance of all
four values and found that two of them, the offensive strength and the defensive
weakness, suffice to describe the quality of a team (without differentiating them
for home and away games). Using his method we can calculate the mean of the
goals distribution of each team and determine the number of goals most likely to be
scored in a given game. Once we know the scores, we can predict the result of a
game between any two teams.
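The last step can be illustrated with a small sketch. It assumes the two-parameter form of the model, in which the mean number of goals of a team is its offensive strength times the opponent's defensive weakness, and it takes the most likely score to be the mode of the Poisson distribution, which is the floor of the mean (when the mean is an integer, that value and the one below it are equally likely). The parameter values are hypothetical, and the estimation of the parameters from match data is not shown.

```python
import math


def most_likely_goals(mean):
    """Mode of a Poisson distribution with the given mean."""
    return math.floor(mean)


def predict_score(attack, weakness, team, opponent):
    """Most likely score of a game between team and opponent."""
    mean_team = attack[team] * weakness[opponent]
    mean_opponent = attack[opponent] * weakness[team]
    return most_likely_goals(mean_team), most_likely_goals(mean_opponent)


# Hypothetical offensive strengths and defensive weaknesses.
attack = {"A": 1.8, "B": 0.9}
weakness = {"A": 0.7, "B": 1.3}

score = predict_score(attack, weakness, "A", "B")
# Team A is expected to score 1.8 * 1.3 = 2.34 goals and team B 0.9 * 0.7 = 0.63,
# so the predicted score is 2-0 and the predicted outcome is a win for A.
```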
Colley proposed a method for ranking college (American) football teams
that uses only the number of games won and the number of games played as input.
In this method, the ratings can be calculated either by an iterative scheme or by
solving a system of linear equations in matrix form. The ratings can then be used to
rank teams and predict the winner of a game. In our implementation of this method
we used an initial rating of 0.5 for all teams.
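The iterative scheme can be sketched as below. The update rule is the standard form of Colley's method: each rating is (1 + effective wins) / (2 + games played), where the effective wins are (wins - losses) / 2 plus the sum of the opponents' current ratings. The sketch takes hypothetical (winner, loser) pairs as input and leaves tied games out, since how ties were handled in the implementation used here is not described; the function name and the fixed number of iterations are also assumptions.

```python
def colley_ratings(games, iterations=200):
    """Iterate Colley's rating update, starting from 0.5 for every team."""
    teams = {team for game in games for team in game}
    wins = {t: 0 for t in teams}
    losses = {t: 0 for t in teams}
    opponents = {t: [] for t in teams}
    for winner, loser in games:
        wins[winner] += 1
        losses[loser] += 1
        opponents[winner].append(loser)
        opponents[loser].append(winner)

    ratings = {t: 0.5 for t in teams}
    for _ in range(iterations):
        # All ratings are updated together from the previous iteration's values.
        ratings = {
            t: (1 + (wins[t] - losses[t]) / 2 + sum(ratings[o] for o in opponents[t]))
            / (2 + len(opponents[t]))
            for t in teams
        }
    return ratings


# Toy example: A beats B and C, B beats C.
toy_ratings = colley_ratings([("A", "B"), ("A", "C"), ("B", "C")])
# Converges to A = 0.7, B = 0.5, C = 0.3, the solution of the matrix form.
```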
Stefani developed a least squares and an exponential smoothing method for
predicting scores and applied it to English Premier Soccer League and Super 12/14
rugby union competitions. His model predicts the scores of the home team and
away team using the offensive and defensive ratings of each team. In his method,
these ratings have a smoothing factor that puts more weight on more recent game
results. In our implementation of this method, we used 0.5 as the initial value of the
offensive and defensive ratings for all teams.
The Offense-Defense Model (ODM) of Govan et al. (2009) uses a matrix to
define the offensive and defensive ratings of a team. Our implementation follows
the details given in that paper.
3.1
Foresight Prediction
We used our metrics and the rating based on them for game outcome foresight
prediction. All the predictions, for all the methods, are based on the data available
before the game to be predicted. All the methods used the same information. There
is no home advantage in the games, so no method uses it.
Maher's and Stefani's methods predict the scores, while all other methods, including
ours, compute ratings to compare teams and decide which one will win. If
the ratings of both teams are equal, the game is predicted as a tie.
11
Alberto Palacios Pawlovsky
SPM
Table 4 and Table 5 show the prediction results for JUFA’s first and second
divisions, respectively. From these tables, we can see that SPM gives the best percentages for both divisions for the whole twelve seasons.
Table 4: Foresight Prediction Results Comparison: JUFA’s 1st Division.
First Division
Season
1999
2000
2001
2002
2003
2004
2005
2006
2007
2008
2009
2010
Total
Maher
3/24
12.50%
8/24
33.33%
21/52
40.38%
23/52
44.23%
18/52
34.61%
26/52
50.00%
56/126
44.44%
60/126
47.61%
43/126
34.12%
47/126
37.30%
47/126
37.30%
50/126
39.68%
402/1012
39.72%
Stefani
5/24
20.83%
8/24
33.33%
23/52
44.23%
21/52
40.38%
24/52
46.15%
24/52
46.15%
49/126
38.88%
53/126
42.06%
47/126
37.30%
48/126
38.09%
52/126
41.26%
52/126
41.26%
406/1012
40.11%
Method
ODM
11/24
45.83%
12/24
50.00%
17/52
32.69%
22/52
42.30%
24/52
46.15%
28/52
53.84%
57/126
45.23%
67/126
53.17%
63/126
50.00%
63/126
50.00%
63/126
50.00%
62/126
49.20%
489/1012
48.32%
Colley
9/24
37.50%
11/24
45.83%
19/52
36.53%
23/52
44.23%
28/52
53.84%
29/52
55.76%
57/126
45.23%
62/126
49.20%
63/126
50.00%
52/126
41.26%
61/126
48.41%
61/126
48.41%
475/1012
46.93%
SPM
9/24
37.50%
12/24
50.00%
20/52
38.46%
23/52
44.23%
26/52
50.00%
29/52
55.76%
56/126
44.44%
65/126
51.58%
63/126
50.00%
61/126
48.41%
63/126
50.00%
65/126
51.58%
492/1012
48.61%
However, if we look only at the first division table, ODM's method has seven
seasons of best values, with five of them in the contemporary data range. It is
followed by SPM's method, with six seasons with best values, three of them in
the contemporary seasons.
For the second division, Colley's method has six seasons with best values, with
four in the contemporary years (2005 onward). Second place corresponds to
SPM's method, with four seasons with best values, three of them in contemporary
seasons.
Table 5: Foresight Prediction Results Comparison : JUFA's 2nd Division.
Second Division
                                             Method
Season  Maher              Stefani            ODM                Colley             SPM
1999    8/24      33.33%   14/24     58.33%   11/24     45.83%   12/24     50.00%   13/24     54.16%
2000    9/24      37.50%   13/24     54.16%   10/24     41.66%   9/24      37.50%   12/24     50.00%
2001    18/52     34.61%   9/52      17.30%   21/52     40.38%   23/52     44.23%   20/52     38.46%
2002    23/52     44.23%   23/52     44.23%   28/52     53.84%   25/52     48.07%   29/52     55.76%
2003    16/52     30.76%   14/52     26.92%   21/52     40.38%   23/52     44.23%   22/52     42.30%
2004    13/52     25.00%   20/52     38.46%   24/52     46.15%   19/52     36.53%   22/52     42.30%
2005    54/126    42.85%   52/126    41.26%   65/126    51.58%   65/126    51.58%   64/126    50.79%
2006    45/119    37.81%   56/119    47.05%   58/119    48.73%   59/119    49.57%   60/119    50.42%
2007    55/126    43.65%   55/126    43.65%   67/126    53.17%   71/126    56.34%   70/126    55.55%
2008    59/126    46.82%   58/126    46.03%   71/126    56.34%   73/126    57.93%   71/126    56.34%
2009    54/126    42.85%   67/126    53.17%   73/126    57.93%   77/126    61.11%   81/126    64.28%
2010    46/126    36.50%   49/126    38.88%   53/126    42.06%   60/126    47.61%   59/126    46.82%
Total   400/1005  39.80%   430/1005  42.78%   502/1005  49.95%   516/1005  51.34%   523/1005  52.03%
From Figure 5 and Figure 6, we can see that no method reaches the 60% line for
the first division or the 70% line for the second division.
For the first division and the 1999 and 2000 seasons, where the number of games is
small, Maher's and Stefani's methods did not give good results. From 2001 to 2004,
when the number of games doubled, these methods improved their predictions, but for
contemporary data (2005 onward) they hardly reached the 40% line.
ODM's, Colley's and the SPM-based predictions show the best results for
almost all these seasons, with only one exception in 2001, where Stefani's method
shows the highest prediction percentage.
Figure 5: Foresight Prediction Results Comparison (1st Division JUFA).
For the second division of JUFA and the 1999 and 2000 seasons, the predictions
based on Stefani's method show high figures, but its results for all other seasons are
low. Maher's results start with low values, but they seem to be more stable for seasons
with a larger number of games. Its results for contemporary data average
around the 40% line over all those seasons.
In the first division and for all seasons between 2000 and 2004, the predictions
based on SPM are as good as or better than those given by ODM's method. For
contemporary data, both methods alternate in giving the best results, but without a
clear difference between them. For the second division and the years 2001, 2004 and
2005, ODM's method gives better results, but for all other seasons SPM's method is
better.
If we compare only Colley's and SPM's results for the first division, Colley's
method gives the best results for seasons 2003 and 2005, but SPM's results are
better for all other seasons (Fig. 5). When comparing the results of the second
division, we cannot see a clear predominance of one of these methods. SPM gives
better results for seasons with a small number of games (1999 and 2000), but both
methods alternate in giving the best results for almost all other seasons (Fig. 6).
Figure 6: Foresight Prediction Results Comparison (2nd Division JUFA).
3.2
Fitting : Hindsight Prediction
We also measured the fitting of all the methods we compared. The results obtained
are shown in Figure 7 and Figure 8. They are also detailed in Table 6 and Table 7,
respectively.
For the first division, Maher's method gives the lowest results for almost all
seasons, with one exception in the 2004 season, where it has one of the highest
values. For the same data, Stefani's method is almost stable and its results
move around an average of 50% for all the contemporary seasons. These two methods
show almost the same total prediction percentage. Once again, the ODM, Colley
and SPM based methods stand above these methods, with little total difference
between any two of them. ODM's results stay around the 55%
line, while Colley's and SPM's results move around the 60% line. Their hindsight
prediction percentages are the best for the first division of JUFA (Fig. 7). In
the total figures of Table 6, these methods show a difference of almost 10%
compared to Maher's or Stefani's results. For the first division, Colley's method has
nine seasons with the best percentages, five of them in the contemporary data (2005
onward). It is followed by SPM's method, which has five seasons of best values,
two of them in the contemporary range.
Figure 7: Hindsight Prediction Results Comparison (1st Division JUFA).
Figure 8: Hindsight Prediction Results Comparison (2nd Division JUFA).
Table 6: Hindsight Prediction Results Comparison : JUFA's 1st Division.
First Division
                                             Method
Season  Maher              Stefani            ODM                Colley             SPM
1999    6/24      25.00%   12/24     50.00%   17/24     70.83%   16/24     66.66%   17/24     70.83%
2000    15/24     62.50%   15/24     62.50%   15/24     62.50%   17/24     70.83%   17/24     70.83%
2001    31/52     59.61%   32/52     61.53%   27/52     51.92%   27/52     51.92%   28/52     53.84%
2002    28/52     53.84%   26/52     50.00%   28/52     53.84%   31/52     59.61%   30/52     57.69%
2003    21/52     40.38%   27/52     51.92%   30/52     57.69%   34/52     65.38%   34/52     65.38%
2004    38/52     73.07%   33/52     63.46%   35/52     67.30%   39/52     75.00%   38/52     73.07%
2005    61/126    48.41%   54/126    42.85%   65/126    51.58%   71/126    56.34%   68/126    53.96%
2006    73/126    57.93%   68/126    53.96%   77/126    61.11%   84/126    66.66%   81/126    64.28%
2007    54/126    42.85%   60/126    47.61%   73/126    57.93%   81/126    64.28%   81/126    64.28%
2008    58/126    46.03%   60/126    47.61%   70/126    55.55%   79/126    62.69%   77/126    61.11%
2009    55/126    43.65%   66/126    52.38%   75/126    59.52%   76/126    60.31%   78/126    61.90%
2010    61/126    48.41%   61/126    48.41%   77/126    61.11%   84/126    66.66%   82/126    65.07%
Total   501/1012  49.50%   514/1012  50.79%   589/1012  58.20%   639/1012  63.14%   631/1012  62.35%
For the second division (Fig. 8, Table 7), Maher's and Stefani's methods alternate,
for almost all seasons, in giving the lowest results. For this division, these methods
give total results in almost the same range. Above the 60% line are, again, ODM's,
Colley's and SPM's results. They are better than Maher's and Stefani's results by
almost 10%.
For both divisions, Colley's and SPM's methods give very close results. One
season to be noted is the second division's 2000 season. In this season, Stefani's
method shows the best result, with a difference of more than 10% to any other method.
Colley's method also shows a similar value for the 2002 season.
Table 7: Hindsight Prediction Results Comparison : JUFA's 2nd Division.
Second Division
                                             Method
Season  Maher              Stefani            ODM                Colley             SPM
1999    13/24     54.16%   17/24     70.83%   16/24     66.66%   17/24     70.83%   17/24     70.83%
2000    15/24     62.50%   18/24     75.00%   14/24     58.33%   14/24     58.33%   15/24     62.50%
2001    25/52     48.07%   21/52     40.38%   35/52     67.30%   32/52     61.53%   31/52     59.61%
2002    33/52     63.46%   31/52     59.61%   35/52     67.30%   39/52     75.00%   38/52     73.07%
2003    24/52     46.15%   24/52     46.15%   30/52     57.69%   29/52     55.76%   30/52     57.69%
2004    24/52     46.15%   28/52     53.84%   35/52     61.53%   29/52     55.76%   30/52     57.69%
2005    69/126    54.76%   58/126    46.03%   75/126    59.52%   81/126    64.28%   80/126    63.49%
2006    54/119    45.37%   62/119    52.10%   70/119    58.82%   80/119    67.22%   73/119    61.34%
2007    63/126    50.00%   73/126    57.93%   80/126    63.49%   80/126    63.49%   82/126    65.07%
2008    72/126    57.14%   75/126    59.52%   81/126    64.28%   86/126    68.25%   87/126    69.04%
2009    68/126    53.96%   77/126    61.11%   86/126    68.25%   88/126    69.84%   88/126    69.84%
2010    54/126    42.85%   58/126    46.03%   65/126    51.58%   69/126    54.76%   63/126    50.00%
Total   514/1005  51.14%   542/1005  53.93%   622/1005  61.89%   644/1005  64.07%   634/1005  63.08%
If we compare ODM's and SPM's results for the first division, SPM has a clear
advantage over ODM. The fitting of the predictions of SPM for all the seasons of
this division is better than that of ODM. The differences between their results
fall between 2.5% and 6%. However, this is not the case for the results of the
second division (Fig. 8). For seasons with a small number of games (1999 and
2000) SPM's results are better. For the seasons between 2001 and 2004 there is
no clear difference, and for the contemporary data SPM is better for all but the last
season (2010). For the seasons between 2005 and 2009, the difference in the results
lies between 1.5% and 4%.
If we compare Colley's and SPM's results for the first division data, there is no
clear predominance of one of these methods. However, we could say that Colley's
results are slightly better. Both methods give the same results for three seasons,
and Colley's results are better than those of SPM in six seasons (SPM's results are
better than Colley's in only three seasons). Again, SPM's results seem to
be better for seasons with a small number of games (the 1999 and 2000 seasons). For
the contemporary data, between 2005 and 2010, they alternate in giving the best
results.
For the second division (Table 7), Colley's method gives the best result in six
seasons, four of which are in the contemporary range (2005 onward). It is followed
by SPM, which gives the best result in five seasons, three of them in the
contemporary range. Colley's results are better than SPM's in three of
the six years of contemporary data. Also, for 2006 and the last season (2010),
Colley's results fit better, with an improvement that ranges from 5% to 6%. For
all other seasons, neither method clearly dominates and the differences
between their results are small.
From the comparisons above, we can say that the SPM, ODM, and Colley-based
methods have a slight advantage over the other methods we compared. One
way of improving the overall prediction percentages, in both divisions, would be to
combine these methods using rank aggregation (Govan et al. (2009)).
4 Conclusions
We have shown two metrics and one way of combining them for rating soccer
teams. One of the metrics uses the goals scored by a team and the other the points
earned by it. We evaluated the combined use of these metrics using a weighted
rating to rank the teams and predict the results of the games of the first and second
division of the Japanese University Football Association (Kanto League). Based on
the results of this evaluation, we determined that our metrics should be used with
the same weight when combined for rating. This rating seems to be the only one
using these metrics combined in this way.
We also compared the game outcome prediction results of our metrics to those
obtained with four other methods. The comparison shows that SPM could be
a good alternative to the Colley-based method or the Offense-Defense Model (ODM)
for ranking teams and for prediction.
Our metrics are easy to implement and can also be used in other sports. Rugby
(targeted in Stefani’s method) and Basketball and Football (targeted in ODM’s
method) could probably use them without major changes. There are works, like
the one of Pasteur (2010), that aim to improve prediction results. Similar
approaches, and others tailored to SPM, could also be topics for further study.
Annex: Brief Description of Other Methods.
In this paper we compare SPM's predictions to those we can derive using
the methodology of Maher (1982), the score prediction method of Stefani (2008),
the ranking method of Colley (2002), and the ODM method of Govan et al. (2009).
We briefly describe these methods in the following subsections.
Maher (1982) Based Prediction
Maher supported the theory that the number of goals in soccer follows a Poisson
distribution, and defined parameters to represent the defensive and offensive
characteristics of a team. In his model, two teams $i$ (home team) and $j$ (away team)
face each other in a game that ends with a score $(x_{ij}, y_{ij})$. He treats these
scores as realizations of variables $X_{ij}$ and $Y_{ij}$ that have Poisson distributions
with means $\alpha_i \beta_j$ and $\gamma_i \delta_j$, where $\alpha_i$ is the offensive
strength of (home) team $i$, $\beta_j$ the defensive weakness of (away) team $j$,
$\gamma_i$ the defensive weakness of team $i$, and $\delta_j$ the offensive strength of
team $j$. Taking the log-likelihood of the scores, the maximum
likelihood estimators (MLE) for team $i$ are given by Equation (11) (the values of $\gamma$
and $\delta$ can be determined in the same way).
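$$
\hat{\alpha}_i = \frac{\sum_{j \neq i} x_{ij}}{\sum_{j \neq i} \hat{\beta}_j}
\quad \text{and} \quad
\hat{\beta}_i = \frac{\sum_{j \neq i} x_{ji}}{\sum_{j \neq i} \hat{\alpha}_j}
\tag{11}
$$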
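Since $\hat{\alpha}$ depends on the values of $\hat{\beta}$ and vice versa, Maher suggests
the following initial values.
$$
\hat{\alpha}_i = \frac{\sum_{j \neq i} x_{ij}}{\sqrt{S_x}}
\quad \text{and} \quad
\hat{\beta}_i = \frac{\sum_{j \neq i} x_{ji}}{\sqrt{S_x}}
\tag{12}
$$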
Here $S_x$, which appears under the square root in the denominators, is given by
Equation (13) and is the total number of goals scored by all teams.
$$
S_x = \sum_{i} \sum_{j \neq i} x_{ij}
\tag{13}
$$
He studied the importance of these values and found that two of them, the offensive
strength and the defensive weakness, suffice to describe the quality of a team
(without distinguishing between home and away games). Equations (11) to (13) can be
applied to each match day k based on the data up to match day k-1. After determining
the α̂ and β̂ for each team we can calculate the mean of its goals distribution and
determine the number of goals most likely to be scored in game k. Knowing the
scores we can predict the result of a game between any two teams.
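The estimation and prediction steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Maher's or the author's implementation: it uses the simplified model with one offensive strength and one defensive weakness per team, assumes a goals matrix x in which x[i][j] holds the goals team i scored against team j, and runs a fixed number of alternating updates. The function names, the iteration count, and the use of the Poisson mode as the most likely number of goals are choices made for this sketch.

```python
import math


def maher_fit(x, iterations=50):
    """Alternate the updates of Eq. (11), starting from the values of Eq. (12).

    x[i][j] is the number of goals team i scored against team j
    (diagonal entries are ignored). Returns (alpha, beta).
    """
    n = len(x)
    scored = [sum(x[i][j] for j in range(n) if j != i) for i in range(n)]
    conceded = [sum(x[j][i] for j in range(n) if j != i) for i in range(n)]
    root = math.sqrt(sum(scored))        # square root of S_x, Eq. (13)
    alpha = [s / root for s in scored]   # initial values, Eq. (12)
    beta = [c / root for c in conceded]  # initial values, Eq. (12)
    for _ in range(iterations):
        alpha = [scored[i] / sum(beta[j] for j in range(n) if j != i)
                 for i in range(n)]
        beta = [conceded[i] / sum(alpha[j] for j in range(n) if j != i)
                for i in range(n)]
    return alpha, beta


def most_likely_goals(mean):
    """A mode of the Poisson distribution with the given mean."""
    return math.floor(mean)


def predict_score(alpha, beta, i, j):
    """Most likely score of a game between teams i and j."""
    return (most_likely_goals(alpha[i] * beta[j]),
            most_likely_goals(alpha[j] * beta[i]))


# Hypothetical goals matrix for three teams (illustrative data only).
goals = [[0, 2, 1],
         [1, 0, 0],
         [3, 1, 0]]
alpha, beta = maher_fit(goals)
home_goals, away_goals = predict_score(alpha, beta, 0, 1)
```

Comparing the two predicted goal counts gives the predicted outcome (win, loss, or tie) of the game.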
Colley (2002) Based Prediction
Colley proposed a method for ranking college football teams that uses only the
number of games won $n_w$ and the number of games played $n_{tot}$ as input. He uses
the modified winning percentage shown in equation (14) as the rating of a team.
$$
r = \frac{1 + n_w}{2 + n_{tot}}
\tag{14}
$$
He also rewrites the number of wins as in equation (15), where $n_l$ is the number of
games lost.
$$
n_w = \frac{n_w - n_l}{2} + \frac{n_{tot}}{2}
    = \frac{n_w - n_l}{2} + \sum_{j=1}^{n_{tot}} \frac{1}{2}
\tag{15}
$$
He then modifies the second term to define an adjustment for strength of schedule
based on the ratings of the opponents of team $i$, as given by equation (16).
$$
n^{\mathrm{eff}}_{w,i} = \frac{n_{w,i} - n_{l,i}}{2} + \sum_{j=1}^{n_{tot,i}} r^{i}_{j}
\tag{16}
$$
It gives the effective number of wins of team $i$. Here $r^{i}_{j}$ is the rating of the $j$th
opponent of $i$. In this method we can calculate the ratings by an iterative scheme or
by solving a system of linear equations. The ratings can then be used to rank
teams and predict the winner of a given game.
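The iterative scheme mentioned above can be sketched as follows. This is an illustrative sketch rather than Colley's or the author's code: it assumes games are given as (winner, loser) pairs, leaves ties out entirely, starts every rating at 0.5, and repeats the update a fixed number of times. The function name and these defaults are assumptions of the sketch.

```python
def colley_ratings(teams, games, iterations=100):
    """Iterate Eq. (16) and Eq. (14) until the ratings settle.

    teams: list of team names.
    games: list of (winner, loser) pairs; ties are not handled here.
    """
    wins = {t: 0 for t in teams}
    losses = {t: 0 for t in teams}
    opponents = {t: [] for t in teams}
    for winner, loser in games:
        wins[winner] += 1
        losses[loser] += 1
        opponents[winner].append(loser)
        opponents[loser].append(winner)

    ratings = {t: 0.5 for t in teams}  # assumed starting value
    for _ in range(iterations):
        updated = {}
        for t in teams:
            # Effective wins, Eq. (16): win/loss margin plus opponents' ratings.
            n_eff = (wins[t] - losses[t]) / 2 + sum(ratings[o] for o in opponents[t])
            # Modified winning percentage, Eq. (14), using the effective wins.
            updated[t] = (1 + n_eff) / (2 + len(opponents[t]))
        ratings = updated
    return ratings


# Hypothetical mini-season: A beats B and C, B beats C.
ratings = colley_ratings(["A", "B", "C"], [("A", "B"), ("A", "C"), ("B", "C")])
ranking = sorted(ratings, key=ratings.get, reverse=True)
```

For prediction, the team with the higher rating before a game would be picked as the winner.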
Stefani (2008) Based Prediction
Stefani developed a least-squares and an exponential smoothing method for predicting
scores and applied them to the English Premier Soccer League and Super 12/14 rugby
union competitions. His model predicts the score of home team $i$ ($s_{ij}$) and away
team $j$ ($s_{ji}$) using the formulas in equations (17) and (18).
$$
s^{P}_{ij} = \mathrm{ro}_i + \mathrm{rd}_j
\tag{17}
$$
$$
s^{P}_{ji} = \mathrm{ro}_j + \mathrm{rd}_i
\tag{18}
$$
Here ro and rd are the offensive and defensive ratings of each team. For team $i$
they are given by equations (19) and (20) (the ratings of team $j$ are calculated in a
similar way).
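$$
\mathrm{ro}_i^{n} = \mathrm{ro}_i^{n-1}
  + \left[\frac{m-1}{nm-1}\right]
    \left(s_{ij} - \left(\mathrm{ro}_i^{n-1} + \mathrm{rd}_j^{m-1}\right)\right)
\tag{19}
$$
$$
\mathrm{rd}_i^{n} = \mathrm{rd}_i^{n-1}
  + \left[\frac{m-1}{nm-1}\right]
    \left(s_{ji} - \left(\mathrm{ro}_j^{m-1} + \mathrm{rd}_i^{n-1}\right)\right)
\tag{20}
$$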
Here $n$ is the number of games of team $i$ and $m$ the number of games of team $j$. The
fraction in the second term of these equations is the smoothing factor. We used 0.5
as the initial value of ro ($\mathrm{ro}^0$) and rd ($\mathrm{rd}^0$) for all teams to
predict the scores of the first games when using this method.
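A short sketch of the prediction and update steps may help make the recursion concrete. It is only an illustration: the smoothing factor is passed in by the caller as a plain number k instead of being computed from the game counts, and the function names and dictionary layout are assumptions of this sketch, not part of Stefani's method as published.

```python
def predict_scores(ro, rd, i, j):
    """Predicted scores of teams i and j, Eq. (17) and Eq. (18)."""
    return ro[i] + rd[j], ro[j] + rd[i]


def updated_ratings(ro, rd, i, j, s_ij, s_ji, k):
    """New (ro, rd) of team i after it scores s_ij and concedes s_ji against j.

    k is the smoothing factor applied to the prediction error, following
    the shape of Eq. (19) and Eq. (20).
    """
    predicted_ij, predicted_ji = predict_scores(ro, rd, i, j)
    new_ro = ro[i] + k * (s_ij - predicted_ij)
    new_rd = rd[i] + k * (s_ji - predicted_ji)
    return new_ro, new_rd


# All ratings start at 0.5, as in the text; the score and k are hypothetical.
ro = {"A": 0.5, "B": 0.5}
rd = {"A": 0.5, "B": 0.5}
new_ro_a, new_rd_a = updated_ratings(ro, rd, "A", "B", s_ij=3, s_ji=1, k=0.5)
```

A team that scores more than predicted sees its offensive rating rise, and a team that concedes more than predicted sees its defensive rating rise, so a larger rd indicates a weaker defense.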
ODM Based Prediction
The Offense-Defense Model (ODM) of Govan et al. (2009) uses a matrix $A = [a_{ij}]$,
where $a_{ij}$ is the score of team $j$ against team $i$. It also defines two ratings. The
offensive rating of team $j$ is given by the following equation.
$$
o_j = a_{1j}\left(\frac{1}{d_1}\right) + \dots + a_{nj}\left(\frac{1}{d_n}\right)
\tag{21}
$$
And the defensive rating of team $i$ is given by equation (22).
$$
d_i = a_{i1}\left(\frac{1}{o_1}\right) + \dots + a_{in}\left(\frac{1}{o_n}\right)
\tag{22}
$$
For convergence they define a new matrix $P = A + \varepsilon e e^{T}$, where $e$ is a vector of all
ones, which is also used as the initial value of all the defensive ratings ($d^{(0)} = e$).
This makes possible the calculation of $o$ (all the offensive ratings) as follows.
$$
o^{(k)} = P^{T} \frac{1}{d^{(k-1)}}
\tag{23}
$$
And then of all the defensive ratings $d$.
$$
d^{(k)} = P \frac{1}{o^{(k)}}
\tag{24}
$$
The overall rating of team $i$ is given by the following equation.
$$
r_i = \frac{o_i}{d_i}
\tag{25}
$$
For prediction we used this overall rating to rank teams and determine the winner
of a game.
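The ODM iteration described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation of Govan et al. or of this paper: the reciprocals in Equations (23) and (24) are taken element by element, the value of epsilon and the fixed iteration count are arbitrary choices, and the function name and example scores are hypothetical.

```python
def odm_ratings(a, eps=1e-4, iterations=100):
    """Offense, defense and overall ratings from a score matrix.

    a[i][j] is the score of team j against team i.
    eps is the small constant added to every entry (P = A + eps * e * e^T).
    """
    n = len(a)
    p = [[a[i][j] + eps for j in range(n)] for i in range(n)]
    d = [1.0] * n  # d(0) = e
    o = [1.0] * n
    for _ in range(iterations):
        # Eq. (23): o_j = sum over i of p_ij / d_i
        o = [sum(p[i][j] / d[i] for i in range(n)) for j in range(n)]
        # Eq. (24): d_i = sum over j of p_ij / o_j
        d = [sum(p[i][j] / o[j] for j in range(n)) for i in range(n)]
    r = [o[i] / d[i] for i in range(n)]  # Eq. (25)
    return o, d, r


# Hypothetical scores: team 0 scores 3 against teams 1 and 2, and so on.
scores = [[0, 1, 1],
          [3, 0, 1],
          [3, 2, 0]]
offense, defense, overall = odm_ratings(scores)
ranking = sorted(range(len(overall)), key=lambda t: overall[t], reverse=True)
```

A high offensive rating and a low defensive rating both raise the overall rating, so the team with the larger overall rating is taken as the predicted winner.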
References
Bradley, R. A. and M. E. Terry (1952): “Rank Analysis of Incomplete Block Designs I : The Method of Paired Comparisons,” Biometrika, 39, 324–345.
Brillinger, D. R. (2010): Wiley Encyclopedia of Operations Research and Management Science, John Wiley and Sons, Inc., chapter Soccer/World Football.
Callaghan, T., P. J. Mucha, and M. A. Porter (2007): “Random Walker Ranking for
NCAA Division I-A Football,” American Mathematical Monthly, 114, 761–777.
Colley, W. N. (2002): “Colley’s bias free college football ranking method.”
Gleich, D. F. and L.-H. Lim (2011): “Rank Aggregation via Nuclear Norm Minimization,” in Proceedings of the Conference on Knowledge Discovery and Data
Mining, ACM, KDD 11, 60–68.
Govan, A. Y., A. N. Langville, and C. D. Meyer (2009): “Offense-Defense Approach to Ranking Team Sports,” Journal of Quantitative Analysis in Sports, 5,
1–17.
Hallinan, S. E. (2005): “Paired Comparison Models for Ranking National Soccer
Teams,” Technical report, Worcester Polytechnic Institute.
Harville, D. A. (2003): “The Selection or Seeding of College Basketball or Football
Teams for Postseason Competition,” Journal of the American Statistical Association, 98, 17–27.
Ingram, L. C. (2007): Ranking NCAA Sports Teams with Linear Algebra, Master’s
thesis, The Graduate School of the College of Charleston.
JUFA (2011): http://www.jufa-kanto.jp/.
Maher, M. J. (1982): “Modelling Association Football Scores,” Statistica Neerlandica, 36, 109–118.
Pasteur, R. D. (2010): Extending the Colley Method to Generate Predictive Football Rankings, number 43 in Dolciani Mathematical Expositions, Mathematical
Association of America, chapter 10, 117–129.
Reep, C., R. Pollard, and B. Benjamin (1971): “Skill and Chance in Ball Games,”
Journal of the Royal Statistical Society. Series A, 134, 623–629.
Stefani, R. T. (2008): “Predicting Score Difference Versus Score Total in Rugby
and Soccer,” IMA Journal of Management Mathematics, 20, 147–158.