Bruce McCurdy - The Reusch Blog

The evolution of hockey statistics
– an ongoing story
Bruce McCurdy
Analytics, Big Data, and the Cloud
2012 April 25
Traditional game summaries
1967-68 Plus/minus formally introduced, as
well as individual shots on goal / Shooting %
1983-84 Goaltender save percentage added
Grant Fuhr
1998-99 Time on ice published, opening the
door for rate stats
Chris Pronger
1998: NHL introduces Zone Time
… but turfs it in 2002. Why?!
1998: NHL starts to (sporadically) maintain
Real Time Scoring System (RTSS)
…but there remain huge problems due to
lack of standardization & rink bias
Oilers have twice as many giveaways as Florida … or do they?
• Ranking of teams’ RTSS home and away yields results
that might as well be randomized for giveaways and
takeaways, and very nearly so for hits and blocked
shots.
• Whereas the same exercise for Goals For yields a
crudely similar ordering home to away.
• Significant home scorer bias in turnover stats. 45%
more giveaways and 33% more takeaways by home
teams league-wide!
• As a result RTSS is highly unreliable, serving to rank
players within a given team but almost useless for
comparing players from different clubs.
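A rough sketch of how the home scorer bias above could be measured: compare league-wide home and road counts for the same stat. The function name and the totals below are hypothetical, chosen only to reproduce the 45% and 33% figures; they are not actual RTSS data.

```python
def home_bias(home_count, away_count):
    """Fractional excess of events credited to teams at home vs. on the road."""
    if away_count <= 0:
        raise ValueError("away_count must be positive")
    return home_count / away_count - 1.0


# Hypothetical league-wide totals (home, away), for illustration only.
league_totals = {
    "giveaways": (14500, 10000),
    "takeaways": (13300, 10000),
}

bias = {stat: round(home_bias(h, a), 2) for stat, (h, a) in league_totals.items()}
print(bias)
```

If scorers in every rink used one standard, both values should sit near zero, since every team plays half its games at home and half away.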
2002-03: NHL introduces play-by-play reports
… though problems remain with accuracy of some data, e.g. shot distance
“Stripping” of PxP data allows detailed
on-ice analysis of individual players
Even-strength shots / Fenwick / Corsi from timeonice.com
Head-to-head match-ups (timeonice.com)
Customizable, sortable stats from behindthenet.ca
Available stats:
Even strength / powerplay / shorthanded
Scoring per 60 minutes
On/off ice plus/minus per 60
On/off ice shots / Fenwick / Corsi per 60
On-ice Sh% / Sv% / PDO
QualComp / QualTeam
Penalties drawn / taken
ZoneStart / ZoneFinish
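A minimal sketch of how a few of these stats are commonly computed, assuming the usual definitions: Corsi counts all shot attempts (on goal, missed, blocked), Fenwick counts unblocked attempts, and PDO is on-ice Sh% plus on-ice Sv%. Function names and sample numbers are illustrative, not taken from behindthenet.ca.

```python
def per_60(count, toi_minutes):
    """Convert a raw count into a rate per 60 minutes of ice time."""
    return 60.0 * count / toi_minutes


def corsi(shots_on_goal, missed, blocked):
    """All shot attempts: on goal + missed + blocked."""
    return shots_on_goal + missed + blocked


def fenwick(shots_on_goal, missed):
    """Unblocked shot attempts: on goal + missed."""
    return shots_on_goal + missed


def pdo(goals_for, shots_for, goals_against, shots_against):
    """On-ice shooting % plus on-ice save %, expressed here on a 0-to-1 scale."""
    sh_pct = goals_for / shots_for
    sv_pct = 1.0 - goals_against / shots_against
    return sh_pct + sv_pct


# Made-up on-ice totals for one player.
points_rate = per_60(30, 600)          # 30 points in 600 minutes
attempts = corsi(50, 20, 15)
unblocked = fenwick(50, 20)
luck_index = pdo(8, 100, 8, 100)
```

Some sites publish PDO scaled to 100 or 1000 instead; the scale is a presentation choice.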
• Many stats need to be parsed in terms of positive
/ negative / neutral game states, e.g.:
• Leading / trailing / tied (score effects are HUGELY
important)
• PP / PK / EV
• O-zone / D-zone / neutral zone
• Taken in isolation without context, modern stats will be distorted;
e.g. “soft minutes” players used in offensive situations should be
expected to have positive numbers in things like Relative Corsi
Scoring chances
"A chance is counted any time a
team directs a shot cleanly on-net
from within home-plate. Shots on
goal and misses are counted, but
blocked shots are not (unless the
player who blocks the shot is “acting
like a goaltender”). Generally
speaking, we are more generous
with the boundaries of home-plate if
there is dangerous puck movement
immediately preceding the scoring
chance, or if the scoring chance is
screened. If you want to get a visual
handle on home-plate, check this
image."
One weakness of the current method is that
“home plate” isn’t the best template for the scoring area
Another is that scoring chances are just 1’s and 0’s – no extra weight
for first class chances as suggested by heat map colour coding
Actually, scoring areas … which vary for different types of shots and manpower situations.
The scoring chance model is greatly simplified from this reality.
Common SC errors and outcomes
• NHL data doesn’t properly record on-ice players: +1 or -1 for selected players
• Scoring chance improperly credited (or missed): +1 or -1 for 10 players
• Scoring chance recorded at wrong game time: +1 or -1 for up to 20 players
• Scoring chance recorded but for wrong team: +2 or -2 for 10 players
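To see why a chance credited to the wrong team swings the on-ice differential by 2 for all 10 skaters, here is a small sketch. The player labels and the data layout (one credited team name per chance, one on-ice list per team) are assumptions made for illustration.

```python
from collections import Counter


def chance_differential(credited_teams, on_ice):
    """On-ice scoring chance differential per player.

    credited_teams: list of team names, one per recorded chance.
    on_ice: dict mapping team name to the skaters on the ice for that chance.
    """
    diff = Counter()
    for credited in credited_teams:
        for team, players in on_ice.items():
            for player in players:
                diff[player] += 1 if team == credited else -1
    return diff


on_ice = {
    "EDM": ["E1", "E2", "E3", "E4", "E5"],
    "FLA": ["F1", "F2", "F3", "F4", "F5"],
}

truth = chance_differential(["EDM"], on_ice)      # the chance as it happened
recorded = chance_differential(["FLA"], on_ice)   # same chance, wrong team

# Error per player = recorded value minus true value.
errors = {p: recorded[p] - truth[p] for p in truth}
print(errors)
```

Each Edmonton skater loses the +1 he earned and picks up a -1 he did not, and the reverse holds for each Florida skater.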
Neilson Numbers
• Based on ideas of Roger Neilson
• Assignment of individual responsibility on scoring chances
for and against
• Requires an extra degree of qualitative judgement over and
above deciding whether a scoring chance has occurred
• Eliminates false positives/negatives, however individual
numbers don’t reconcile to team totals
• Fewer recording errors than on-ice scoring chances as
players are identified as part of the process
• Same system can be used to assign unofficial assists on GF
or errors on GA
• Reliant on a knowledgeable scorer, but as with other
scoring chance systems, would work better if 3 or 5 scorers
worked independently, then pooled results.
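One possible way to pool independent scorers, sketched under two assumptions the slides do not specify: a chance is kept when a majority of scorers logged it, and the logs have already been aligned to a common key (period, game clock, team). The sample logs are invented.

```python
from collections import Counter


def pool_by_majority(scorer_logs):
    """Keep the events logged by more than half of the scorers."""
    votes = Counter()
    for log in scorer_logs:
        votes.update(set(log))  # each scorer gets one vote per event
    needed = len(scorer_logs) // 2 + 1
    return sorted(event for event, n in votes.items() if n >= needed)


# Three hypothetical scorers watching the same game.
scorer_a = [(1, "05:12", "EDM"), (1, "11:40", "FLA"), (2, "03:05", "EDM")]
scorer_b = [(1, "05:12", "EDM"), (2, "03:05", "EDM")]
scorer_c = [(1, "05:12", "EDM"), (1, "11:40", "FLA"), (3, "17:30", "FLA")]

pooled = pool_by_majority([scorer_a, scorer_b, scorer_c])
print(pooled)
```

An odd number of scorers (3 or 5) avoids ties under this rule. In practice the logs would first need fuzzy matching on game time, which this sketch leaves out.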
Sample box:
Zone Start:
fad or trend?
Possession
• “Hockey is a transition game: offense to defense,
defense to offense, one team to another. Hundreds of
tiny fragments of action, some leading somewhere,
most going nowhere. Only one thing is clear. A
fragmented game must be played in fragments. Grand
designs do not work. … Before offense turns to
defense, or defense to offense, there is a moment of
disequilibrium when a defense is vulnerable, when a
game’s sudden, unexpected swings can be turned to
advantage. It is what you do at this moment, when
possession changes, that makes the difference.”
-- Ken Dryden, The Game
• “It is noteworthy that in general … our teamwork was
considerably above our main contenders. In the game
against the Canadian team, the players of the USSR
squad made 110 passes, while the Canadians made 60
passes; in the game against Czechoslovakia we made
106 passes, they made 70; in the game against Sweden
we made 49 more passes than they did. … This is an
indication of quite stable habits and a high culture of
playing, a correct understanding of the game by the
Soviet players.”
-- Anatoli Tarasov, Road to Olympus
Tarasov Numbers
Good pass: plus.
Bad pass: minus.
Good clearance: plus.
Bad clearance: minus.
Good rush: plus.
Bad rush: minus.
Good shoot in: plus.
Bad shoot in: minus.
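The tally above can be sketched as a running plus/minus per player. Equal weighting of every action type is an assumption here, as is the event format (player, action, good or bad); the slides do not specify either.

```python
from collections import defaultdict


def tarasov_numbers(events):
    """Sum +1 for each good play and -1 for each bad play, per player."""
    totals = defaultdict(int)
    for player, action, good in events:
        totals[player] += 1 if good else -1
    return dict(totals)


# Invented sequence of tracked plays.
events = [
    ("Player A", "pass", True),
    ("Player A", "clearance", False),
    ("Player A", "rush", True),
    ("Player B", "shoot in", False),
]

tally = tarasov_numbers(events)
print(tally)
```

Keeping the action type in each event would also allow separate totals for passes, clearances, rushes and shoot-ins.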
…and many more advanced ideas
• Goals Versus Threshold (GVT)
• Defence Independent Goalie Rating (DIGR)
• Shot Quality (SQF / SQA)
• Predicted Goals Scored (PGS)
• Zone Start Adjusted Corsi (ZSAC)
• Etc. …
No time to do them all justice here
Thanks for listening!