Public Sports Data Sets
Introduction
The following public data sets are available for the MIS 580 class project (“Sport Data Mining
& Knowledge Management”). Students are strongly encouraged to identify other public sports
data sets for research.
Baseball
1. http://www.baseball1.com/
This database contains pitching, hitting, and fielding statistics for Major League Baseball
from 1871 through 2005; the site's maintainers plan to add the 2006 data soon. The data set
includes data from the two current leagues (American and National), the four other "major"
leagues (American Association, Union Association, Players League, and Federal League), and the
National Association of 1871-1875. The data are provided in Microsoft Access, CSV, and
other formats.
A copy of the data set (in Microsoft Access format) can be downloaded from:
http://ai.arizona.edu/hchen/chencourse/SportsData/baseball1_MSAccess.zip
The data set includes 17,609 players and 2,505 teams. Each player is assigned a unique
identifier (playerID), and player-related records in the other tables are tagged with the
corresponding playerID.
Tables:
MASTER - Player names, DOB, and biographical info
Batting - batting statistics
Pitching - pitching statistics
Fielding - fielding statistics
AllStar - All-Star appearances
HallOfFame - Hall of Fame voting data
Managers - managerial statistics
Teams - yearly stats and standings
BattingPost - post-season batting statistics
PitchingPost - post-season pitching statistics
TeamFranchises - franchise information
FieldingOF - outfield position data
FieldingPost - post-season fielding data
ManagersHalf - split season data for managers
TeamsHalf - split season data for teams
Salaries - player salary data
SeriesPost - post-season series information
AwardsManagers - awards won by managers
AwardsPlayers - awards won by players
AwardsShareManagers - award voting for manager awards
AwardsSharePlayers - award voting for player awards
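Because the player-level tables are tagged with playerID, records from different tables can be combined with a simple key lookup. The Python sketch below illustrates the idea on two tiny in-memory CSV samples standing in for exports of the MASTER and Batting tables. Only playerID is confirmed by the description above; the other column names (nameFirst, nameLast, yearID, HR) and all sample rows are assumptions made for illustration, not actual data.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV exports of the MASTER and Batting tables.
# Column names other than playerID, and every row, are made up for illustration.
MASTER_CSV = """playerID,nameFirst,nameLast
p001,Alex,Example
p002,Sam,Sample
"""

BATTING_CSV = """playerID,yearID,HR
p001,2004,10
p001,2005,15
p002,2005,3
"""


def career_totals(master_file, batting_file, stat="HR"):
    """Sum one batting statistic per playerID and attach the player's name."""
    sums = defaultdict(int)
    for row in csv.DictReader(batting_file):
        # Treat an empty cell as zero rather than failing on int("").
        sums[row["playerID"]] += int(row[stat] or 0)

    result = {}
    for row in csv.DictReader(master_file):
        pid = row["playerID"]
        full_name = f'{row["nameFirst"]} {row["nameLast"]}'
        result[pid] = (full_name, sums.get(pid, 0))
    return result


totals = career_totals(io.StringIO(MASTER_CSV), io.StringIO(BATTING_CSV))
for pid, (name, hr) in sorted(totals.items()):
    print(pid, name, hr)
```

With a real export, the two StringIO objects would be replaced by files opened with open(path, newline=""), and the column names adjusted to match the actual headers.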
2. http://www.baseball-reference.com/
The website contains statistics on players, teams, leagues, managers, leaders, awards,
postseason, and box scores from 1871 onward. This data set contains 16,804 players and 2,535
teams.
A copy of the data set (in MySQL dump file format) can be downloaded from:
http://ai.arizona.edu/hchen/chencourse/SportsData/baseball-databank_MYSQL.zip
Although the website does not explicitly document its tables, most of the table names are
self-explanatory.
Tables:
Allstar
AwardsManagers
AwardsPlayers
AwardsShareManagers
AwardsSharePlayers
AwardsVotes
AwardsWinners
Batting
BattingPost
Fielding
FieldingOF
FieldingPost
HallOfFame
HOFold
IDxref
IDxrefSchools
IDxrefTeams
Managers
ManagersHalf
Master
AmexicanStates
Pitching
PitchingPost
Salaries
Schools
SchoolsPlayers
SeriesPost
Teams
TeamsFranchises
TeamsHalf
TmpCollege
Transactions
xref_stats
Basketball
1. http://databaseBasketball.com/
The website contains NBA data from 1947 to 2005 and ABA data from 1968 to 1976
on players, teams, leagues, all-star games, awards, and coaches. The data set has 3,572
players and 96 teams.
A copy of the data set (in comma-delimited text format) can be downloaded from:
http://ai.arizona.edu/hchen/chencourse/SportsData/databasebasketball_TXT.zip
Tables:
players.txt - list of all players
player_regular_season.txt - regular season player stats
player_regular_season_career.txt - regular season career totals by player
player_playoffs.txt - playoff stats for all players
player_playoffs_career.txt - career playoff stats by player
player_allstar.txt - all-star stats by player
teams.txt - list of all teams
team_season.txt - regular season team stats
draft.txt - NBA and ABA draft results by year
coaches_season.txt - NBA coaching records by season
coaches_career.txt - NBA career coaching records
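A quick sanity check after downloading the comma-delimited files is to count distinct values in the key columns and compare them with the totals quoted above. The Python sketch below does this on a small in-memory sample. It assumes the file starts with a header row, and the column names (player_id, year, team) are hypothetical; inspect the real file's header before relying on them.

```python
import csv
import io

# Hypothetical stand-in for player_regular_season.txt. The column names
# (player_id, year, team) are assumptions; check the real header first.
SAMPLE = """player_id,year,team
PLAYER01,2004,AAA
PLAYER01,2005,BBB
PLAYER02,2005,AAA
"""


def distinct_counts(handle, columns):
    """Return the number of distinct non-empty values in each named column."""
    seen = {name: set() for name in columns}
    for row in csv.DictReader(handle):
        for name in columns:
            value = (row.get(name) or "").strip()
            if value:
                seen[name].add(value)
    return {name: len(values) for name, values in seen.items()}


counts = distinct_counts(io.StringIO(SAMPLE), ["player_id", "team"])
print(counts)
```

On the real files, the same function could be run on a handle from open("players.txt", newline="") and the result compared with the 3,572 players and 96 teams stated above. A mismatch may simply mean the chosen file or column does not cover every player or team.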
2. http://web1.ncaa.org/stats/StatsSrv/careersearch
The website contains NCAA basketball data from 1998 to 2005. Although the data set
cannot be downloaded directly, the web pages can be retrieved and parsed automatically to
extract the data.
3. Other websites, such as http://www.sportsstats.com/jazzyj/ and
http://www.infoplease.com/ipsa/A0003203.html, also contain NCAA data, but these websites
are not as comprehensive as the NCAA’s website.
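For sources that cannot be downloaded directly, the data usually has to be pulled out of HTML tables. The Python sketch below shows one way to do that with the standard library's html.parser module. The page fragment is made up for illustration; the real NCAA pages will have different markup, so the parser would need to be adapted after inspecting them.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


# Made-up page fragment; a real statistics page will have different markup.
SAMPLE_HTML = """
<table>
  <tr><th>Player</th><th>Season</th><th>Points</th></tr>
  <tr><td>Alex Example</td><td>2004-05</td><td>512</td></tr>
</table>
"""

parser = TableRowParser()
parser.feed(SAMPLE_HTML)
parser.close()
print(parser.rows)
```

In practice the HTML would come from a page fetched with urllib.request.urlopen rather than a string literal. It is worth checking a site's terms of use and spacing out requests before crawling it.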
Football
1. http://www.pro-football-reference.com/
The website contains game data from 1995 to 2006. The data set contains 4,327
players and the games they played in.
A copy of the data set (in CSV format) can be downloaded from:
http://ai.arizona.edu/hchen/chencourse/SportsData/Pro-football-refernce_CSV.zip
Tables:
Master - information about players
Seasons - player statistics by season
Games - player statistics by game
2. http://www.databasefootball.com/
The website contains NFL data from 1922 to 2005 and AFL data from 1960 to 1969
on players, teams, leagues, awards, and coaches. Although the data set cannot be
downloaded directly, the web pages can be retrieved and parsed automatically to extract the
data.
3. http://www.jt-sw.com/football/
The website contains NFL player and coach statistics from 1920 to the present and
AFL statistics from 1960 to 1969. Although the data set cannot be downloaded directly,
the web pages can be retrieved and parsed automatically to extract the data.
Other Websites
1. http://www.amstat.org/sections/sis/
Statistics in Sports, a section of the American Statistical Association. The website
contains some links to sports data and statistics.
2. http://sportsillustrated.cnn.com/baseball/mlb/stats/
The CNN/SI website provides statistics for players and teams. The data are provided by
STATS LLC (http://biz.stats.com/internet.asp).
STATS offers historical data for Major League Baseball, the National Football League,
the National Basketball Association, and the National Hockey League, with data going
back to 1876. STATS can deliver its database as standard column-delimited data files.
3. http://www.bballsports.com/
The website contains basketball data from 1937 to 2004, baseball data from 1871
to 2003, and football data from 1920 to 2002. The data are packaged in a standalone
application.
A copy of the application can be downloaded from:
http://ai.arizona.edu/hchen/chencourse/SportsData/bballsports_app.exe