
SpDA SP23 G syllabus

STAT 9073-01. Analysis of Sports Big Data
Spring 2023
Instructor: Sangwook Kang, Ph.D.
Associate Professor, Department of Applied Statistics
DWHMB 529, Phone: (02) 2123-2538; Email: kanggi1@yonsei.ac.kr
Class Hours: Wednesday 9:00 - 11:50am
Class Room: DWHMB 535
Office Hours: TBD
Textbook:
Analyzing Baseball Data with R, 2nd Edition, by Max Marchi, Jim Albert, and Benjamin S.
Baumer, CRC Press, 2018.
References:
(i) Curve Ball: Baseball, Statistics, and the Role of Chance in the Game, by
Jim Albert and Jay Bennett, Copernicus; 2001.
(ii) Full House: The Spread of Excellence from Plato to Darwin, by Stephen
J. Gould, Harmony, 1996.
(iii) Moneyball: The Art of Winning an Unfair Game, by Michael Lewis, W. W.
Norton & Company, 2004.
(iv) Basketball Data Science: With Applications in R, by Paola Zuccolotto and Marica
Manisera, Chapman and Hall/CRC, 2020.
Course Description:
Overview:
• Are you a fan of baseball? Do you like numbers? Then this course is for you. In
this course, we talk about numbers, sports, and numbers in sports, especially in
baseball. Many people have already used numbers to answer their own questions of
interest about sports, and we will go over their work. Then it will be your turn to
apply statistical concepts and techniques to answer your own questions of interest.
For example, will the Doosan Bears be the champions this year? If time permits, we
will also talk about basketball data analysis!
• This course is designed for graduate students in the Department of Statistics and Data
Science. Other students may still take this course if certain requirements are met
(see “Prerequisites” for details).
Topics:
• Introduction to baseball, Introduction to Sabermetrics, Exploring databases,
Season-by-Season data, Game-by-Game data, Play-by-Play data and Pitch-by-Pitch data,
Relation between runs and wins, Value of plays using run expectancy, Balls and strikes
effects, Career trajectories, Computing park factors, Other topics in baseball data
analysis, Some basketball data science, and Examples of data analysis in other sports
Prerequisites:
• A passion for sports and numbers, plus introductory statistics, is required as a
prerequisite. If you are not a fan of baseball (or other sports), or do not want to
become one, but still want to take this course, you need to contact the instructor
and discuss why you want to take this course!
Software: R (and Python, if needed)
Course Materials:
We use LearnUs. The course syllabus, lecture notes, homework assignments, and other
related course materials will be posted on the course website at LearnUs. Lecture notes
will be available before each class. Students are responsible for printing out all required
course materials.
Course Requirements for Grading Purposes:
Homework:
• There will be homework assignments roughly every two weeks.
• All homework assignments are due at the beginning of the class period on the date on
which they are due. Late submissions will not be accepted.
• Students are encouraged to work together (maybe online?) on homework, but copying
someone else’s work is always an academic honesty violation.
Exams:
• There will be one take-home exam.
Project:
• There will be one team project.
• The project guidelines will be posted later in the semester.
• The project report is due during the last class of the semester.
Topical Outline:
Date                       Topic of Class
Week 1: Mar 2 - 8          Introduction to baseball / Introduction to Sabermetrics
Week 2: Mar 9 - 15         Exploring databases: Season-by-Season data, Game-by-Game data,
                           Play-by-Play data and Pitch-by-Pitch data
Week 3: Mar 16 - 22        Introduction to R / Graphics
Week 4: Mar 23 - 29        Relation between runs and wins
Week 5: Mar 30 - Apr 5     Value of plays using run expectancy
Week 6: Apr 6 - 12         Balls and strikes effects / Career trajectories
Week 7: Apr 13 - 19        Exploring streaky performances / Computing park factors
Week 8: *Apr 20 - 26       (Midterm week) Take-home examination
Week 9: Apr 27 - May 3     Extinction of 0.400 hitters in MLB
Week 10: May 4 - 10        Other topics in baseball data analysis
Week 11: May 11 - 17       Some basketball data science
Week 12: May 18 - 24       Examples of data analysis in other sports
Week 13: May 25 - 31       Students’ presentations
Week 14: Jun 1 - 7         Students’ presentations
Week 15: Jun 8 - 14        Final project due (*Jun 14)
*: The dates of the midterm and final exams will be determined by the school during the
exam weeks.
Grading:
The grades will be assigned as follows:
Homework Assignments                40%
Course Project                      35%   Due: Jun 14
Midterm Exam                        20%   Apr 20 - 26
Class Attendance & Participation     5%
The distribution of grades will be as follows:
Grade         Proportions (%)
A+ ∼ A−       60% + α1* %
B+ ∼ B−       35% + α2 %
*: α1 is a non-negative number.
To obtain a good grade, you need to successfully complete all assignments, the
course project, and the exam, show the effort you put into the class, and attend every
lecture.
Make-Up Policy:
You must contact the instructor in advance if you are unable to take an exam at its
scheduled time. Arrangements may then be made for a make-up exam.
Attendance Policy:
Attending all classes is highly recommended. Those who miss classes too often may
not receive the full 5% credit.
General Disclaimers:
The course syllabus is a general plan for the course; deviations announced to the class by
the instructor may be necessary.