an analysis of bowling scores and handicap systems

AN ANALYSIS
OF
BOWLING SCORES AND HANDICAP SYSTEMS
by
Wenjun Chen
M.Sc. Beijing Institute of Technology
A PROJECT SUBMITTED IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE
in the Department of Mathematics and Statistics
of
Simon Fraser University
© Wenjun Chen 1991
SIMON FRASER UNIVERSITY
August, 1991
All rights reserved. This work may not be
reproduced in whole or in part, by photocopy
or other means, without the permission of the author.
APPROVAL
Name:
Wenjun Chen
Degree:
Master of Science
Title of project:
An Analysis of Bowling Scores and Handicap Systems
Examining Committee: Dr. A. Lachlan
Chair
Dr. R. Routledge, Committee Member
Dr. C. Dean, External Examiner
Date Approved:
November 5, 1991
Abstract
This project uses the Box-Cox transformation and goodness of fit techniques to find a
model describing the distribution of bowling scores based on the analysis of actual bowling
data. We have found that the logarithm of bowling scores is approximately normally
distributed with a constant variance. The simulation of bowling scores based on a Fortran
program confirms this model. The project also uses Monte Carlo methods to investigate
the effect of various handicap systems based on the proposed model.
Acknowledgements
My sincere thanks to Dr. Tim Swartz for his invaluable advice and guidance and time
that he spent with me in the preparation of this project. I would also like to thank him
for his supervision during my studies.
Thanks also go to Dr. R. Routledge, Dr. K. L. Weldon and Dr. C. Dean for their
assistance. I would also like to express my thanks to Dr. M. A. Stephens and to F.
Bellavance for their guidance and advice in doing the project. I would also like to thank
my fellow students and friends for any help that they gave me.
I would also like to extend my thanks to Ms. Sherry Swartz who supplied the data set
used in this project. It was a pleasure to work with such a "good" data set.
Finally, I acknowledge with humble gratitude the constant encouragement and every
possible help from my mother, father, brothers and sisters. Without them, I would never
have been what I am today. I would also like to acknowledge all the help and support
extended by my husband.
Contents
Abstract
Acknowledgements
Dedication
Contents
List of Tables
List of Figures
1 Introduction
2 Description of Five Pin Bowling and The Data Set
2.1 Five Pin Bowling
2.2 The Data Set
3 Characterizing The Bowling Scores
3.1 Initial Exploration of the Data Set
3.2 Mean-Variance Relationship
3.3 Profile Analysis of Bowling Scores
3.4 Normality of the Bowling Scores
4 Modelling The Bowling Scores
4.1 Box-Cox Transformation
4.2 Goodness of Fit Technique in Testing for Normality
4.3 The Results of Goodness of Fit for Bowling Scores
4.4 The Proposed Model for Bowling Scores
4.5 The Property of Equal Variances
5 Simulated Bowling Scores
5.1 Assumptions of Simulation
5.2 The Results of Simulation
6 Handicap Systems
6.1 The Remington Rand Study
6.2 The Monte Carlo Study
6.3 The Results of Our Study
Appendix
A The Data Set
B A Program which Simulates Bowling Scores
Bibliography
List of Tables
2.1 Two Scoring Sheets
3.1 Brief Summary of The Data Set
4.1 The Results of Bowling Scores
4.2 The Results of Logarithms of Bowling Scores
4.3 The Sample Averages and Sample Variances of Logarithm Data
5.1 The Results of Simulation (N = 10000)
6.1 Estimated Probabilities of the Favourite Winning
List of Figures
2.1 Pin Count
3.1 The Scatter Plots of Scores vs Games for Players 1 to 20
3.2 The Scatter Plots of Scores vs Games for Players 21 to 40
3.3 The Plot of Standard Deviation vs Average
3.4 The Samples of SL and SH
3.5 Possible Standard Deviation vs Average Curve
3.6 Overall Average Plots
3.7 Plots Based on Scores of All Players
3.8 The Histograms of Bowling Scores for Players 1 to 20
3.9 The Histograms of Bowling Scores for Players 21 to 40
3.10 The Distribution of Standardized Bowling Scores
4.1 Normality of the Logarithms of Bowling Scores
4.2 The Sample Variances vs Averages Plot for Logarithm Data
4.3 Normality of the Logarithm of Bowling Scores ($\sigma^2$ = 0.0328)
5.1 Tree Diagram of Possible Outcomes
5.2 The Results of p1=0.15, p2=0.15, p3=0.25, p4=0.45
5.3 The Results of p1=0.15, p2=0.15, p3=0.1, p4=0.6
5.4 The Results of p1=0.3, p2=0.3, p3=0.1, p4=0.3
6.1 The Probability of the Favourite Winning the Game
Chapter 1
Introduction
In the fall semester of 1990, I contacted Dr. Tim Swartz about analyzing a data set for
my M.Sc. project. Several days later, he mentioned that he had been watching television
and noticed that unlike most major sports, in professional bowling very little quantitative
analysis is provided to the viewing audience. In particular it seemed that all that could
be said about a bowler X was that he/she maintained a bowling average of Y.
This observation inspired us to ask the following questions: Can more be said about a
bowler's tendencies beyond reporting his/her bowling average? Do bowling scores follow
a particular distribution? What impact might this have on handicapping? What are the
handicap systems currently used today? Are they fair?
In order to pursue these questions we required a practical data set of bowling scores.
Sherry Swartz, Dr. Tim Swartz's sister, has participated in a bowling league for several
years. She mailed us a total of approximately 2300 five pin bowling scores from an actual
league. These are the scores upon which our analysis is based.
In our attempt to gain a quantitative understanding of bowling scores we also came
in contact with several Canadian bowling agencies and individuals. We describe these
helpful encounters throughout the project.
Chapter 2 introduces the game of five pin bowling; we describe the rules, the equipment
and the scoring procedure. In this chapter we also describe the data set on which my
project is based.
In Chapter 3 we carry out exploratory data analysis. Through the use of simple
descriptive statistics, plots and tests we gain some feeling for the data set. This exploratory
work also provides ideas regarding future directions for our analysis.
Chapter 4 reviews the Box-Cox transformation
$$x^{(a)} = \begin{cases} (x^{a} - 1)/a & \text{if } a \neq 0 \\ \log x & \text{if } a = 0 \end{cases}$$
which is used in my project to maximize the p-value in a test of normality involving
bowling scores. The main result of this chapter is that the logarithms of bowling scores
are approximately normally distributed.
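As a minimal sketch of the transformation just mentioned, the Python function below maps a positive score x to (x^a - 1)/a when a is non-zero and to log x when a = 0. The function name `box_cox` and the sample scores are illustrative assumptions and are not part of the project's own programs.

```python
import math


def box_cox(x, a):
    """Box-Cox transform of a positive value x with power parameter a."""
    if x <= 0:
        raise ValueError("the Box-Cox transformation requires a positive value")
    if a == 0:
        return math.log(x)          # limiting case of (x**a - 1) / a as a -> 0
    return (x ** a - 1.0) / a


# Illustrative scores only (not taken from the data set).
scores = [120, 160, 208]
log_scores = [box_cox(s, 0) for s in scores]      # a = 0 gives the logarithm
sqrt_like = [box_cox(s, 0.5) for s in scores]     # a = 0.5 is a square-root-type transform
```

For small non-zero a the transformed value is close to log x, which is why the logarithm is taken as the a = 0 case.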
In Chapter 5 a simulation experiment based on a Fortran program is presented. It is
used to verify the proposed model obtained in Chapter 4. Adjustable parameters which
describe different bowling skills are considered.
In Chapter 6, the Remington Rand handicap study is described. We use our model to
investigate the effect of various handicap systems on the probability of winning and then
compare our results with the Remington Rand study.
Chapter 2
Description of Five Pin Bowling
and The Data Set
This chapter contains an introduction to the sport of five pin bowling with an emphasis
on the scoring procedure. We also describe the data set upon which our study is based.
2.1 Five Pin Bowling
Bowling is one of the most popular sports for participants of all ages, regardless of sex,
shape or physical condition. The two most popular types of bowling in Canada are ten pin
bowling and five pin bowling. For five pin bowling, five pins are used in the game. A game
of five pin bowling consists of ten frames and should be played with regulation equipment
on regulation lanes. Each frame consists of a maximum of three legally delivered balls
rolled by the same bowler down the lane in succession. If a bowler should knock down all
5 pins in fewer than 3 attempts the frame is considered complete. An exception occurs in the
tenth frame where 3 balls are always delivered. If in the tenth frame all 5 pins are knocked
down during the first or second attempts then the 5 pins are reset. The object of the
game is to score as many points as possible in ten frames. The score is the total number
of points corresponding to the pins knocked down in the ten frames (plus bonuses). The
scores assigned to each pin are recorded in Figure 2.1.
Figure 2.1: Pin Count
The following are some common scoring terms:
(1) strike: All pins are knocked down by the first ball bowled in a frame.
(2) spare: All pins are knocked down by the first two balls bowled in a frame.
(3) corner pin: All pins are knocked down by the first ball with the exception of a single
corner pin.
(4) head pin: The head pin is picked out by the first ball bowled in a frame.
The basic rules for scoring a game of bowling are as follows:
(1) no strike or spare: Merely add the total points corresponding to the pins knocked
down on the three balls.
(2) strike: Fifteen points plus a bonus of the number of points accumulated by the next
two balls rolled.
(3) spare: Fifteen points plus a bonus of the number of points accumulated by the next
ball rolled.
A perfect game of 450 is scored by recording strikes in each of the ten frames. In
addition this requires knocking down all 5 pins on both the second and third balls of the
tenth frame. An example of the scoring for a typical bowling game is given below.
frame 1 The player knocks down all pins except the headpin using three balls.
Score: 10 points.
frame 2 The player knocks down all the pins using three balls.
Count: 15 points.
Score: 25 points.
frame 3 The player uses only two balls to knock down all pins. This is called a spare.
Count: 15 points. However, as each frame's count is for three balls, the player adds
the count from his first ball in the next frame.
Score: INCOMPLETE.
frame 4 The player knocks down the 3 pin with his first ball. We therefore add 3 points
to the 15 points of his spare making a count of 18 for the third frame.
Score: 43 points.
The player then knocks down the headpin (5) and the left 2 pin. This makes his
count for the fourth frame 3+5+2=10 points.
Score: 53 points.
frame 5 The player records a strike. Count: 15 points plus points scored with the next
two balls bowled.
Score: INCOMPLETE.
frame 6 The player makes another strike and credits the fifth frame with 15 points. Fifth
frame score: still incomplete. Sixth frame count: 15 points plus points scored with
the next two balls bowled.
Score: INCOMPLETE.
frame 7 With the first ball, the player picks the headpin (5). This completes the fifth
frame. Count: 15+15+5=35 points, fifth frame score: 88. On the second try he
scores 5 points. The player adds the 10 points to the strike count of the sixth frame.
Sixth frame count 25, score: 113. Then the player knocks down the 2 pin, making his count
for the seventh frame 12 points.
Score: 125 points.
frame 8 A strike is recorded. Count: 15 points plus points scored with the next two balls
bowled.
Score: INCOMPLETE.
frame 9 With the first ball, the player picks the 3 pin. With his second ball, he counts 12
points knocking down all remaining pins for a spare. This completes the 8th frame.
Count: 15+3+12=30 points. Eighth frame score 155.
Score: INCOMPLETE.
frame 10 A strike is recorded and 15 points are credited to the ninth frame. Ninth frame
count: 15+15=30. Score 185. The strike in the tenth frame permits two additional
attempts. The player gets the three pin on his second ball and the five on his third
ball. Player's tenth frame count: 15 (for the strike)+3+5=23.
Game score: 208 points.
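The scoring rules and the worked example above can be checked with a short Python sketch. The function name `score_game` and the frame-by-frame list layout are assumptions made for illustration. The split of points between individual balls in frames 1 to 3 (for example 5, 5 and a miss in frame 1) is also assumed; only the frame totals affect the result.

```python
def score_game(frames):
    """Return the cumulative score after each of the ten frames.

    frames is a list of ten lists of points per ball. Frames 1-9 hold one
    ball (strike), two balls (spare) or three balls (open frame); frame 10
    always holds three balls.
    """
    if len(frames) != 10:
        raise ValueError("a game has exactly ten frames")
    rolls = [points for frame in frames for points in frame]
    totals, total, start = [], 0, 0
    for number, frame in enumerate(frames, start=1):
        if number == 10:
            total += sum(frame)                                # three balls, pins reset after 15
        elif frame[0] == 15:
            total += 15 + rolls[start + 1] + rolls[start + 2]  # strike: bonus of next two balls
        elif len(frame) == 2 and sum(frame) == 15:
            total += 15 + rolls[start + 2]                     # spare: bonus of next ball
        else:
            total += sum(frame)                                # no strike or spare
        start += len(frame)
        totals.append(total)
    return totals


# The ten frames of the worked example.
example = [[5, 5, 0], [3, 2, 10], [5, 10], [3, 5, 2], [15],
           [15], [5, 5, 2], [15], [3, 12], [15, 3, 5]]
print(score_game(example))   # [10, 25, 43, 53, 88, 113, 125, 155, 185, 208]

# Twelve consecutive strikes give the perfect game.
perfect = [[15]] * 9 + [[15, 15, 15]]
print(score_game(perfect)[-1])   # 450
```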
Scoring Sheet
Frame   1        2         3      4        5    6    7        8    9      10
Balls   5 5 -    3 2 10    5 /    3 5 2    X    X    5 5 2    X    3 /    X 3 5
Score   10       25        43     53       88   113  125      155  185    208
Here "X" means a strike and "/" means a spare.
Handicapping:
In a league in which the range of abilities is wide, the league may adopt handicap
rules. A handicap attempts to "even up" the chance of winning between opposing teams.
There are two basic kinds of handicapping currently used by the Canada Five Pin Bowlers'
Association.
A. Individual handicap systems:
(1) 80% of the difference between the bowler's average and a base figure of 225.
(2) 66% of the difference between the bowler's average and a base figure of 200.
(3) 75% of the difference between the bowler's average and a base figure of 200.
(4) 75% of the difference between the bowler's average and a base figure of 220.
The various handicap methods mentioned above are applied to a 160 average bowler
to produce the following handicaps.
System        Handicap
80% of 225    52
66% of 200    27
75% of 200    30
75% of 220    46
These handicaps are then added to a bowler's gross score at the end of each game to
give a net score. In the case of a bowler whose average exceeded the base figure, a zero
handicap would be assigned.
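The individual handicap rule can be sketched in Python as follows. The function names are assumptions for illustration, and the sketch returns the unrounded product. For a 160 average bowler it reproduces the tabulated 52 and 30 exactly, while the unrounded values for the 66% and the 75%-of-220 systems (26.4 and 45) differ slightly from the tabulated 27 and 46, so whatever rounding convention the league applies is not captured here.

```python
def individual_handicap(average, base, percent):
    """percent% of the difference between the base figure and the bowler's average.

    A bowler whose average exceeds the base figure receives a zero handicap.
    """
    return max(0.0, percent * (base - average) / 100.0)


def net_score(gross, average, base, percent):
    """Gross game score plus the bowler's individual handicap."""
    return gross + individual_handicap(average, base, percent)


# The four systems applied to a 160 average bowler.
for percent, base in [(80, 225), (66, 200), (75, 200), (75, 220)]:
    print(percent, base, individual_handicap(160, base, percent))
```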
B. Team handicap systems:
(1) Team handicaps are determined by adding the averages of the team players for each
of two opposing teams. Then 80% of the difference between the team totals is taken
as the handicap for the weaker team in each individual game.
(2) In deciding the three game handicap total, multiply the single game team handicap
by 3. This total would be added to the total team score for the three games.
(3) When a team's strength is not identical for the three games, the three game handicap
shall be the total of the handicap allowed for each of the three games. This may
happen for example when a team has 6 bowlers, only 5 are permitted to bowl in a
given game and some rotation scheme is used.
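The team rules (1) to (3) can be sketched in the same way. The function names and the sample averages are hypothetical; the 80% figure is the one stated in rule (1).

```python
def team_handicap(averages_a, averages_b, percent=80):
    """Single-game handicaps (team A, team B); only the weaker team receives one."""
    total_a, total_b = sum(averages_a), sum(averages_b)
    handicap = percent * abs(total_a - total_b) / 100.0
    return (handicap, 0.0) if total_a < total_b else (0.0, handicap)


def three_game_handicap(single_game_handicaps):
    """Rule (3): total the handicap allowed in each of the three games.

    When the line-up is the same in all three games this equals rule (2),
    three times the single game handicap.
    """
    return sum(single_game_handicaps)


# Hypothetical teams of five bowlers each.
weaker = [150, 150, 150, 150, 150]
stronger = [160, 160, 160, 160, 160]
per_game = team_handicap(weaker, stronger)[0]     # 80% of (800 - 750)
print(per_game, three_game_handicap([per_game] * 3))
```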
2.2 The Data Set
The data set of bowling scores on which my project is based came from an actual five
pin bowling league in Kitchener/Waterloo. The scoring sheets were provided to us by the
league scorer (Ms. Sherry Swartz). Each scoring sheet is a weekly summary showing the
results between two competing teams. The scorer recorded the league name, the date and
the particular lane used by the competing teams. For each team the scoring sheet provides
us with the team number, team players, individual and team handicaps and individual
and team totals. From the scoring sheet one can also determine the number of points that
each team scores for the week. A team obtains two points for every game won and a single
point for having the largest grand total. Therefore 7 points are shared by two competing
teams and the maximum number of points that a team can score in one week is 7 points.
Table 2.1 illustrates the format of two sheets.
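The weekly points rule described above (two points per game won, one point for the larger grand total, seven points in all) can be sketched as follows. The function name and the team totals are made up for illustration, and the treatment of ties, which the text does not specify, is an assumption: a tied game or tied grand total awards no points here.

```python
def weekly_points(totals_a, totals_b):
    """Points earned by two teams from their three single-game team totals."""
    points_a = points_b = 0
    for a, b in zip(totals_a, totals_b):
        if a > b:
            points_a += 2          # two points for each game won
        elif b > a:
            points_b += 2
    grand_a, grand_b = sum(totals_a), sum(totals_b)
    if grand_a > grand_b:
        points_a += 1              # one point for the larger grand total
    elif grand_b > grand_a:
        points_b += 1
    return points_a, points_b


# Hypothetical team totals for one evening.
print(weekly_points([1100, 1150, 1200], [1120, 1100, 1180]))   # (5, 2)
```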
There are 4 scoring sheets provided each week. This represents a league consisting
of 8 teams with 5 players per team. However players change teams from time to time
and new players occasionally join the league. For example, Mary was a member of team
8 on December 19, 1990 and became a member of team 5 on January 2, 1991. Also
Diana happened to join team 4 on October 31, 1990. It is also possible that some players
might choose to leave the league. The total number of players recorded in the scoring
sheets exceeds 40. However some players only participated in three or six games. The data
set was collected weekly from September 5, 1990 to January 9, 1991 excluding the week
of Christmas. This resulted in a total of 19 weeks. The players bowled 3 games every
Wednesday evening over the span of 19 weeks giving a maximum of 57 bowling scores per
player.
The scoring sheets illustrate all information about individual and team scores. However, because the members of teams varied from week to week, team scoring will not be
the focus of this project. We will instead concentrate on individual scores. In our study
we have selected the forty players who have been the most active in the league. Therefore
for each player we will have a maximum of 57 game scores plus possibly some missing
values. This results in approximately 2300 game scores. Appendix A lists all data used in
the analysis.
Table 2.1: Two Scoring Sheets
[Two weekly scoring sheets dated Dec. 19, 1990, both for lanes 19 & 20: Team 8 vs Team 7 and Team 6 vs Team 5. For each team the sheet lists the players with their averages and handicaps, the scores for Game 1, Game 2 and Game 3 with totals, the team handicap, the grand total, the games won and the points won.]
The players in the study vary in sex, age and experience. Unfortunately, we could not
get information concerning these variables and we will therefore not take these factors into
account. The only discriminating factors which we will consider in our analysis of bowling
scores are overall ability, changes over weeks and changes between the 3 games bowled in
the same evening.
Chapter 3
Characterizing The Bowling
Scores
Before a formal analysis is given in each section of this chapter, we present an exploratory
analysis by using simple descriptive statistics and plots. We begin by carrying out an
informal graphical exploration of the data as this is often helpful in highlighting special
features of the data and may be helpful in determining the direction in which we should
continue our analysis. We then attempt to characterize the data set formally by considering
the mean-variance relationship, the profile analysis and the property of normality.
3.1 Initial Exploration of the Data Set
The scatter plots of scores vs games for each of the forty bowlers are given in Figure 3.1
and Figure 3.2.
Figure 3.1: The Scatter Plots of Scores vs Games for Players 1 to 20
Figure 3.2: The Scatter Plots of Scores vs Games for Players 21 to 40
Figures 3.1 and 3.2 are given using the same vertical and horizontal scale. From
studying the two figures, we see that the bowling game scores vary widely for different
players. The maximum score is about 330 points and the minimum is about 70 points.
More precisely, the maximum score is 329, obtained by player 25 in game 40, and the
minimum score is 71, obtained by player 17 in game 2. The bowling skills are
quite different from player to player. For example players 2, 11, 12 and 17 tend to have
lower bowling scores, having average scores of 140, 120, 131 and 126; players 5, 10, 25
and 29 have higher bowling scores, having average scores of 204, 198, 197 and 204. The
variations of scores are also quite different for different players. For instance players 1, 10,
22 and 25 have ranges of 165, 157, 138 and 189. Players 11, 14, and 16 have ranges of 74,
99 and 87. It also seems to be the case that if a player has a high average, then he/she
also tends to have a high variation. We will investigate the mean-variance relationship in
more detail in Section 3.2. Missing values can also be noted from the plots. Some players
such as players 1, 5 and 6 bowled from the beginning to the end of the bowling season.
They do not have any missing values. Other players were occasionally absent even though
I chose the 40 players with fewest missing values. For example the scores of player 20
are missing between games 34 and 54. In order to get a more quantitative understanding
of the bowling scores for different players, Table 3.1 gives a brief summary of the data
set including the average, sample variance and actual number of games that each player
bowled.
Table 3.1: Brief Summary of The Data Set
Player  Games  Avg.    S^2       Player  Games  Avg.    S^2
1       57     195.9   1415.5    21      48     190.5   968.7
2       54     138.8   492.5     22      51     203.5   1198.8
3       48     160.2   432.7     23      45     188.6   1026.4
4       48     187.7   911.0     24      54     180.0   937.5
5       57     204.3   1130.8    25      51     196.7   1699.0
6       57     145.0   672.4     26      57     157.7   689.0
7       51     155.9   938.6     27      57     140.9   712.8
8       54     180.3   948.3     28      45     119.9   694.6
9       57     162.4   1088.8    29      54     203.8   1233.1
10      57     197.5   1237.2    30      57     162.2   576.4
11      51     119.7   312.4     31      51     163.2   703.1
12      57     131.2   639.3     32      54     144.7   608.2
13      54     162.3   918.8     33      57     151.6   1015.1
14      51     142.6   447.1     34      54     164.2   928.8
15      48     153.0   1016.8    35      48     136.6   856.0
16      51     143.1   411.4     36      57     171.6   891.0
17      57     126.1   705.0     37      54     128.1   800.8
18      54     163.6   797.8     38      57     169.2   1339.1
19      51     161.5   942.8     39      42     141.0   703.9
20      39     155.4   839.3     40      51     160.0   1056.4
From Table 3.1 we see that 13 players participated in all 57 games. Player 20 participated in the fewest games (39) amongst the 40 bowlers. The averages of game scores vary
from 119.7 (player 11) to 204.3 (player 5). The variances of scores also vary widely from
312.4 (player 11) to 1699 (player 25).
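The three summary quantities reported for each player (number of games bowled, average and sample variance) can be computed as in the sketch below. The function name, the use of `None` for a missed game and the sample scores are assumptions for illustration; the actual scores are listed in Appendix A.

```python
from statistics import mean, variance


def summarize(scores):
    """Return (games bowled, average, sample variance), skipping missing games."""
    played = [s for s in scores if s is not None]
    return len(played), mean(played), variance(played)   # variance uses the n - 1 divisor


# Hypothetical record of four scheduled games with one absence.
games, average, s2 = summarize([100, None, 110, 120])
print(games, average, s2)
```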
3.2 Mean-Variance Relationship
We briefly mentioned the relationship between mean and variance in Section
3.1. We now give a more detailed investigation of the relationship. Figure 3.3 gives a
plot of standard deviations vs averages for each of the 40 players. The smoothing line was
obtained by using the lowess command in S-plus.
As mentioned in Section 3.1, Figure 3.3 suggests that standard deviations tend to
increase with average. In this section we use formal statistical methods to test the
hypothesis that the standard deviation increases with the average, using three different tests.
In order to test this hypothesis, we divided the standard deviations into two groups:
SL and SH. SL is the set of standard deviations corresponding to the 20 lowest averages
and SH is the set of standard deviations corresponding to the 20 highest averages. Figure
3.4 illustrates the two samples.
Method 1: Mann-Whitney Test
The distribution of standard deviations of bowling scores is unknown. Therefore a
non-parametric method may be preferable here.
The main assumptions of the Mann-Whitney test:
(1) The data consist of a random sample of observations $S_{L_1}, S_{L_2}, \ldots, S_{L_{n_1}}$ from a
population with unknown median $M_L$ and an independent random sample of observations $S_{H_1}, S_{H_2}, \ldots, S_{H_{n_2}}$ from a population with unknown median $M_H$.
(2) The distribution functions of the two populations differ only with respect to location,
if they differ at all.
Figure 3.3: The Plot of Standard Deviation vs Average (panels: The Plot without Smoothing Line; The Plot with Smoothing Line)
Figure 3.4: The Samples of SL and SH
The hypothesis:
$H_0: M_L = M_H$
$H_1: M_L < M_H$
The test statistic is $T = S - n_1(n_1+1)/2 = 164$, where $S$ is the sum of the ranks assigned to
the sample observations from the first population.
Decision rules: We reject $H_0$ at the $\alpha$ level if the computed $T$ is less than the critical
value given below.
Critical Values of the Mann-Whitney test statistic ($n_1 = 20$)
p \ n2   15    16    17    18    19    20
.01      81    88    94    101   108   115
.025     91    99    106   113   120   128
.05      101   108   116   124   131   139
.10      11    120   128   136   144   152
We cannot reject $H_0$ even at the $\alpha$ = 10% level.
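The Mann-Whitney statistic used in Method 1 is the rank sum S of the first sample, computed in the pooled sample, minus n1(n1 + 1)/2. A minimal Python sketch is given below; the function names and the toy samples are assumptions, and tied values are given the average of their rank positions, a convention the text states only for Spearman's test.

```python
def average_ranks(values):
    """Ranks 1..n, with tied values sharing the average of their rank positions."""
    return [sum(v > w for w in values) + (sum(v == w for w in values) + 1) / 2.0
            for v in values]


def mann_whitney_t(first, second):
    """T = S - n1 (n1 + 1) / 2, where S is the rank sum of the first sample."""
    ranks = average_ranks(list(first) + list(second))
    n1 = len(first)
    s = sum(ranks[:n1])
    return s - n1 * (n1 + 1) / 2.0


# Toy samples: every value in the first sample is below every value in the second.
print(mann_whitney_t([1, 2, 3], [4, 5, 6]))   # 0.0, the smallest possible T
```

In the project the first sample would be the 20 standard deviations in SL and the second the 20 in SH; small values of T support the alternative that SL tends to be smaller.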
Method 2: Two Sample t-Test
Assumptions of t-test:
(1) SL and SH are independent samples from normal populations.
(2) The standard deviations of SL and SH are identical.
Assumption (1) is clearly violated here. However the t-test is robust for samples
from certain non-normal distributions. Graphically, there seems to be no reason to doubt
assumption (2).
The hypothesis:
$H_0: E(S_L) = E(S_H)$
$H_1: E(S_L) < E(S_H)$
We calculate $T = (\bar{S}_L - \bar{S}_H) / (S_p \sqrt{1/n_1 + 1/n_2}) = -1.40$ with 38 degrees of freedom and
p-value = .085, which leads us to not reject $H_0$ (there is some mild evidence against $H_0$).
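The pooled two-sample t statistic of Method 2 can be sketched as follows, under the equal-variance assumption (2). The function name and the toy samples are illustrative assumptions; the p-value would be read from a t table with the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Two-sample t statistic with pooled standard deviation, and its degrees of freedom."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df   # pooled variance
    t = (mean(x) - mean(y)) / (sqrt(sp2) * sqrt(1.0 / n1 + 1.0 / n2))
    return t, df


# Toy samples; a negative t indicates that the first sample has the smaller mean.
t, df = pooled_t([1, 2, 3], [2, 3, 4])
print(round(t, 4), df)
```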
Method 3: Spearman Rank Correlation Coefficient
Method 1 and method 2 tested for a difference between means in SL and SH. Both
methods were simple to use but gave only mild evidence of a difference between means.
We now use Spearman's rank correlation method to determine whether there is evidence
of increasing standard deviation with respect to mean. This test imposes more structure
than the previous 2 methods which simply divided the data in half.
The assumptions of Spearman's test are that the data consist of a random sample of
n pairs of observations and that each pair of observations represents two measurements
taken on the same object or individual. Let $(X_i, Y_i)$ denote the average and standard
deviation of player $i$ and let $R(X_i)$ ($R(Y_i)$) be the rank of $X_i$ ($Y_i$) relative to all other
values of $X$ ($Y$). If ties occur among the $X$'s ($Y$'s) each tied value is assigned the average of
the rank positions for which it is tied.
The hypothesis:
$H_0$: $X$ and $Y$ are independent
$H_1$: There is a direct relationship between $X$ and $Y$
The test statistic is $r_s = 1 - 6 \sum_{i=1}^{n} d_i / (n(n^2-1)) = 0.724$, where $d_i = [R(X_i) - R(Y_i)]^2$ and $n = 40$.
Decision rules: We reject $H_0$ at the $\alpha$ level if the computed value of $r_s$ is greater than
the critical value which is given below.
Critical Values ($n = 40$)
$\alpha$       .25    .10    .05    .01    .005
$r(\alpha)$    .110   .207   .264   .368   .507
Therefore we reject $H_0$ even at the $\alpha = 0.005$ level.
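Spearman's coefficient as used in Method 3 is one minus six times the sum of squared rank differences, divided by n(n^2 - 1), with tied values given the average of their rank positions. A minimal Python sketch follows; the function names and the toy data are assumptions for illustration.

```python
def average_ranks(values):
    """Ranks 1..n, with tied values sharing the average of their rank positions."""
    return [sum(v > w for w in values) + (sum(v == w for w in values) + 1) / 2.0
            for v in values]


def spearman_rs(x, y):
    """Spearman rank correlation computed from squared rank differences."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of observations")
    n = len(x)
    d = sum((rx - ry) ** 2 for rx, ry in zip(average_ranks(x), average_ranks(y)))
    return 1.0 - 6.0 * d / (n * (n * n - 1))


# Toy data: a perfectly increasing and a perfectly decreasing relationship.
print(spearman_rs([1, 2, 3, 4], [10, 20, 30, 40]))   # 1.0
print(spearman_rs([1, 2, 3, 4], [40, 30, 20, 10]))   # -1.0
```

In the project the inputs would be the 40 player averages and the 40 player standard deviations.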
From the exploratory graphical approach and the third test there seems to be some
evidence that the standard deviation increases as the average increases. Possible reasons
for the phenomenon might be as follows: (1) In bowling, strikes and spares affect scores
dramatically. Players with high averages have more chance of getting a strike or a spare.
Therefore the variations of scores are wider than for players with lower average scores.
(2) Bowling scores range from 0 to 450. If a player obtains 0 for every game, the average
score of the player is 0 and the variance is also 0. The same holds if a player obtains 450
for every game: the average score of the player is 450 and the variance is 0. Using these
two fixed points we might therefore expect the standard deviation versus average curve to
appear as in Figure 3.5. The maximum of the curve is not intended to occur at any
specific point along the horizontal axis. For our case study, we collected bowling scores
with average scores between 120 and 205. Therefore corresponding to Figure 3.5 there is
an increasing relationship between standard deviation and average as we expected.
Figure 3.5: Possible Standard Deviation vs Average Curve
3.3 Profile Analysis of Bowling Scores
The bowling season of the league from which the data set was collected took place from
September 1990 to April 1991. As described in Section 2.2 the players bowled a maximum
of 57 games. Three games are bowled in one evening (Wednesday evening) every week.
We are therefore interested in answering the following questions: Do the players improve
their bowling skills from week to week or maybe from game to game each week? The
answer to these questions is important as it gives some indication of whether the scores
for each bowler are identically distributed.
Before proceeding further, we introduce some convenient terminology. In every bowling
evening, the players bowled three games (ordered games). The corresponding scores are
called the first game score, the second game score and the third game score respectively.
By the week average we mean the average game score for each week. Therefore we have
at most 57 game scores, 19 first game scores, 19 second game scores, 19 third game scores
and 19 week averages for each player.
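Let $x_{ijk}$ be the bowling score for the $i$th player ($i = 1, 2, \ldots, 40$) in the $j$th game
($j = 1, 2, 3$) in the $k$th week ($k = 1, 2, \ldots, 19$). Then
$\bar{x}_{\cdot jk} = \frac{1}{40} \sum_{i=1}^{40} x_{ijk}$ is the overall average game score of game $j$ in week $k$,
$\bar{x}_{\cdot\cdot k} = \frac{1}{120} \sum_{i=1}^{40} \sum_{j=1}^{3} x_{ijk}$ is the overall average of week $k$ and
$\bar{x}_{\cdot j \cdot} = \frac{1}{760} \sum_{i=1}^{40} \sum_{k=1}^{19} x_{ijk}$ is the overall average of the $j$th game.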
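The three overall averages can be computed with one small helper, sketched below in Python. The dictionary layout keyed by (player, game, week), the function name and the sample scores are assumptions for illustration. The sketch also assumes that a missing score is simply left out of the dictionary and that each average is taken over the scores actually recorded, which coincides with the fixed divisors 40, 120 and 760 only when no scores are missing.

```python
from statistics import mean


def average_over(scores, game=None, week=None):
    """Average of all recorded scores matching the given game and/or week."""
    selected = [s for (i, j, k), s in scores.items()
                if (game is None or j == game) and (week is None or k == week)]
    return mean(selected)


# Hypothetical scores keyed by (player i, game j, week k).
scores = {(1, 1, 1): 100, (1, 2, 1): 120, (2, 1, 1): 140,
          (2, 2, 1): 160, (1, 1, 2): 200}

print(average_over(scores, game=1, week=1))   # overall average of game 1 in week 1
print(average_over(scores, week=1))           # overall average of week 1
print(average_over(scores, game=1))           # overall average of the first game
```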
In order to investigate the questions posed earlier we give the plots of overall averages
of the 40 players. If players improved their bowling skills significantly from week to week
or from game to game, we could see it from the overall average plots. Figure 3.6 gives the
plots of overall week averages vs weeks and overall average game scores vs games.
Figure 3.6: Overall Average Plots (panels: Overall Average Week Scores vs Weeks; Overall Average Game Scores vs Games)
Figure 3.7: Plots Based on Scores of All Players (panels: Scores vs Weeks; Scores vs Games)
From Figure 3.6 we see that there is some indication of improvement over the three
games in a given week (2nd plot) and possibly some very mild evidence of improvement
over the season (1st plot). Figure 3.7 gives the plots of scores vs games and scores vs weeks.
We cannot see any indication of improvement over the three games nor improvement over
weeks from these plots. However we will need to test these conjectures formally as these
plots do not convey the variability associated with each observation. Because the players
have quite different average bowling scores, we stratify our population by choosing each
player as a subject.
For $i = 1,2,\ldots,40$ we test:
$H_{i0}$: $\mu_{i1} = \mu_{i2} = \mu_{i3}$
$H_{i1}$: not the case that $\mu_{i1} = \mu_{i2} = \mu_{i3}$
where $\mu_{i1}$ = mean score of player $i$ for the first game,
$\mu_{i2}$ = mean score of player $i$ for the second game and
$\mu_{i3}$ = mean score of player $i$ for the third game.
In a similar manner we also test for a difference over weeks. The null hypothesis states
that there is no difference between the means of the 19 weekly scores.
Because we are interested in both the game effect and the week effect for each player
we use the two factor analysis of variance model for repeated measurement designs. We
note that analysis of variance models are robust with respect to small departures from
normality. Using the statistical package SAS we obtain a p-value of 4.2 % for a difference
between games and a p-value of 17.5 % for a difference between weeks. Therefore there
seems to be at most mild evidence for an effect due to games and no evidence for an effect
due to weeks. We will therefore assume these effects do not exist.
3.4 Normality of the Bowling Scores
As mentioned earlier it seems that currently the only quantitative comment that is routinely
made concerning a bowler's ability is the reporting of his/her bowling average. We
would like to do more than this by possibly describing the distribution from which bowling
scores arise. Initially we conjectured that bowling scores for each individual are approximately
normally distributed with unknown mean and variance. We made this conjecture
as bowling scores arise as the sum of scores of 10 frames which suggests that the central
limit theorem may approximately hold here. Figures 3.8 and 3.9 show the histograms of
bowling scores for each player.
From Figures 3.8 and 3.9, it seems that some histograms are approximately normally
distributed (e.g. histogram for player 29). However more often than not, the histograms
seem to have a long right tail (e.g. histogram for player 22). We will use a graphical method
to pool the results of each of the 40 bowlers to determine whether the bowling scores
are approximately normally distributed. The null hypothesis is: $H_0: x_{ijk} \sim N(\mu_i, \sigma_i^2)$,
$i = 1,2,\ldots,40$.
Pierce[5] has suggested a method of combining tests based on several samples, for
testing $H_0$: the sample comes from a distribution $F(x_i, \theta_i)$, with $\theta_i$ containing unknown
location and/or scale parameters $\mu_i$ and $\sigma_i$. The true value of these parameters may be
different for each test. For sample $i$, let $\hat{\mu}_i$ and $\hat{\sigma}_i$ be the maximum likelihood estimates.
Define standardized values $w_{ijk} = (x_{ijk} - \hat{\mu}_i)/\hat{\sigma}_i$, $i = 1,2,\ldots,40$ (in our case). Based
on the proposal of Pierce, the $w_{ijk}$ for all 40 samples should be pooled to form one large
sample of size $n = \sum_{i=1}^{40} n_i$ where $n_i$ is the size of the $i$th sample. The limiting distribution
of $w_{ijk}$ will be the same as its limiting distribution for individual samples. Therefore the
above hypothesis is changed to the null hypothesis: $H_0: w_{ijk} \sim N(0, 1)$.
Figure 3.10 shows the histogram of standardized and pooled bowling scores $w_{ijk}$ for all
players and the q-q plot with the standardized normal distribution. A line of zero intercept
and unit slope is added to the q-q plot in order to measure easily whether the $w_{ijk}$ are
approximately normally distributed. Both the histogram and the q-q plot suggest that
bowling scores are skewed to the right. In the q-q plot the left tail departs much more from
the straight line than the right tail. Therefore the bowling scores are not approximately
normally distributed. In Chapter 4 we will use the Box-Cox transformation and formal
statistical methods to find a better model for the bowling scores.
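The pooling step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `standardize_and_pool`, the dictionary layout and the three simulated players are hypothetical and are not taken from the project's data or programs.

```python
import math
import random


def standardize_and_pool(scores_by_player):
    """Standardize each player's scores with that player's own MLEs, then pool."""
    pooled = []
    for scores in scores_by_player.values():
        n = len(scores)
        mu_hat = sum(scores) / n
        # The maximum likelihood estimate of sigma divides by n, not n - 1.
        sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in scores) / n)
        pooled.extend((x - mu_hat) / sigma_hat for x in scores)
    return pooled


# Hypothetical data: three players with different averages, 57 games each.
rng = random.Random(1991)
scores_by_player = {
    player: [rng.gauss(mean, 30.0) for _ in range(57)]
    for player, mean in {1: 150.0, 2: 180.0, 3: 210.0}.items()
}

w = standardize_and_pool(scores_by_player)
pooled_mean = sum(w) / len(w)
pooled_var = sum(v * v for v in w) / len(w)
```

By construction the pooled values have mean zero and unit variance, so any departure from normality shows up only in the shape of the histogram and the q-q plot.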
[Figure 3.8: The Histograms of Bowling Scores for Players 1 to 20. One histogram per player; horizontal axis: Bowling Scores.]
[Figure 3.9: The Histograms of Bowling Scores for Players 21 to 40. One histogram per player; horizontal axis: Bowling Scores.]
[Figure 3.10: The Distribution of Standardized Bowling Scores. Panels: Histogram of Standardized Values (horizontal axis: Standardized Bowling Score); Q-Q Plot (horizontal axis: Quantiles of Standard Normal).]
Chapter 4
Modelling The Bowling Scores
In Chapter 3, we mentioned that the bowling scores are not approximately normally
distributed. In this chapter, we will use the Box-Cox transformation technique to find an
improved model for the bowling scores.
4.1 Box-Cox Transformation
In general, the Box-Cox family of transformations is given by
$$y^{(a)} = \begin{cases} \dfrac{x^{a} - 1}{a} & \text{if } a \neq 0 \\ \ln(x) & \text{if } a = 0 \end{cases}$$
where the transformed $y^{(a)}$ is "more normal" than $x$. In our case let
$$y_{ijk}^{(a)} = \begin{cases} \dfrac{x_{ijk}^{a} - 1}{a} & \text{if } a \neq 0 \\ \ln(x_{ijk}) & \text{if } a = 0 \end{cases}$$
where $i$ stands for one of the players 1 through 40, $j$ stands for one of the games 1 through
3, $k$ stands for one of the weeks 1 through 19. We consider different parameters $a$ in order
to maximize the p-value in testing the hypotheses that $y_{ijk}^{(a)} \sim N(\mu_i, \sigma_i^2)$, $i = 1,2,\ldots,40$,
$j = 1,2,3$ and $k = 1,2,\ldots,19$.
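As a small illustration of the transformation family, the following sketch evaluates it for a single score. The function name `box_cox` and the example score of 180 are illustrative choices, not part of the project's own program.

```python
import math


def box_cox(x, a):
    """Box-Cox transform of a positive value x with parameter a."""
    if x <= 0:
        raise ValueError("the Box-Cox transformation requires a positive value")
    if a == 0:
        return math.log(x)
    return (x ** a - 1.0) / a


score = 180.0                      # hypothetical bowling score
at_one = box_cox(score, 1.0)       # a = 1 only shifts the score by one
at_zero = box_cox(score, 0.0)      # a = 0 is the natural logarithm
near_zero = box_cox(score, 1e-6)   # small a approaches the logarithm
```

The case a = 0 is the limit of the case a ≠ 0 as a tends to zero, which is why the family is continuous in a.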
4.2 Goodness of Fit Technique in Testing for Normality
Suppose that a given random sample of size $n$ is given by $X_1, X_2, \ldots, X_n$, and let
$X_{(1)} < X_{(2)} < \cdots < X_{(n)}$ be the order statistics. Suppose that the distribution of $X$ is
$F(x)$. The empirical distribution function (edf) for the sample is defined by
$$F_n(x) = \frac{\text{number of observations} \le x}{n}, \qquad -\infty < x < \infty.$$
Edf statistics are a class of goodness of fit statistics which measure the difference between
$F_n(x)$ and $F(x)$. The Anderson-Darling edf statistic is defined by
$$A^2 = n \int_{-\infty}^{\infty} [F_n(x) - F(x)]^2 \, [F(x)(1 - F(x))]^{-1} \, dF(x).$$
The A2 statistic is a general purpose (omnibus) goodness of fit statistic although on the
whole it is most powerful when F(x) departs from the true distribution in the tail.
A modified statistic $A^{2*}$ is used in the test for normality with $\mu$ and $\sigma$ unknown and
estimated by the mles. It is given by
$$A^{2*} = A^2 (1.0 + 0.75/n + 2.25/n^2).$$
Tables providing significance levels for the Anderson-Darling test for normality can be
found in D'Agostino and Stephens[1].
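For readers who wish to compute the statistic, the sketch below uses the usual computing formula for $A^2$ based on the ordered, standardized observations. The computing formula itself is not stated in the text, and the choice of the n - 1 divisor for the standard deviation is an assumption. The function name and both samples are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(sample):
    """Return (A2, A2*) for a normality test with estimated mean and sd."""
    n = len(sample)
    m = mean(sample)
    s = stdev(sample)  # assumption: n - 1 divisor for the standard deviation
    # z[i] is the fitted normal cdf evaluated at the i-th order statistic.
    z = sorted(NormalDist().cdf((x - m) / s) for x in sample)
    total = sum(
        (2 * i - 1) * (math.log(z[i - 1]) + math.log(1.0 - z[n - i]))
        for i in range(1, n + 1)
    )
    a2 = -n - total / n
    a2_star = a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)
    return a2, a2_star


# Hypothetical samples of size 57: normal quantiles, and their exponentials
# (a strongly right-skewed sample).
n = 57
normal_like = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
skewed = [math.exp(v) for v in normal_like]

a2_norm, a2_star_norm = anderson_darling_normal(normal_like)
a2_skew, a2_star_skew = anderson_darling_normal(skewed)
```

The right-skewed sample yields a much larger statistic than the normal-like sample, which is the behaviour the test relies on.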
Fisher's method is a method for combining independent tests from several samples.
Suppose that $k$ tests are to be made of the null hypotheses $H_{01}, H_{02}, \ldots, H_{0k}$. Let $H_0$ be
the composite hypothesis that all $H_{0i}$ are true. Let $p_i$ be the p-value corresponding to the
$i$th test. Then when $H_{0i}$ is true, $p_i$ is $U(0,1)$ as long as the test statistics are continuous.
The statistic
$$P = -2 \sum_{i=1}^{k} \log(p_i)$$
under $H_0$ has the $\chi^2_{2k}$ distribution.
We will use the modified Anderson-Darling statistic $A^{2*}$ to test the hypothesis of
normality for the individual bowling scores by finding the p-value $p_i$, $i = 1,2,\ldots,40$. We
then use Fisher's method to test the composite hypothesis that all $H_{0i}$ are true.
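Fisher's method is simple to compute. The sketch below is a minimal version, assuming independent tests. It uses the closed form of the chi-square upper tail that is available when the degrees of freedom are even, which is always the case here since the statistic has 2k degrees of freedom. The function names and the example p-values are hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    # For df = 2m: P(X > x) = exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def fisher_combined(p_values):
    """Return Fisher's statistic P and its overall p-value."""
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return statistic, chi2_sf_even_df(statistic, 2 * len(p_values))


# Hypothetical p-values from four independent tests.
stat, overall_p = fisher_combined([0.68, 0.09, 0.21, 0.45])

# A single test must return its own p-value unchanged.
_, single_p = fisher_combined([0.5])

# Upper tail for a statistic of 99.8 on 80 degrees of freedom (40 tests).
tail_40_tests = chi2_sf_even_df(99.8, 80)
```

A p-value reported as .00 cannot be passed to this function, since its logarithm is undefined; unrounded p-values are needed in that case.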
4.3 The Results of Goodness of Fit for Bowling Scores
In Section 3.4, we used a graphical method to show that the bowling scores are not
approximately normally distributed. Here we give the results of testing the normality of
bowling scores by using the modified Anderson-Darling statistic and Fisher's method. The
hypotheses are:
$H_{0i}$: $x_{ijk} \sim N(\mu_i, \sigma_i^2)$, $i = 1,2,\ldots,40$
$H_1$: not all $H_{0i}$ are true
where $\mu_i$ and $\sigma_i^2$ are unknown and are estimated by the sample average $\bar{x}_i$ and the sample
variance $s_i^2$ respectively.
Table 4.1 gives the modified Anderson-Darling statistic $A^{2*}$ and the p-value for each
of the above hypotheses.
Table 4.1: The Results of Bowling Scores
player   A^{2*}   p-value      player   A^{2*}   p-value
1         .27     .68          21        .39     .38
2         .32     .53          22        .65     .09
3         .32     .54          23*       .86     .03
4         .39     .39          24        .33     .51
5         .27     .67          25*      1.34     .00
6         .65     .09          26        .56     .14
7*       1.70     .00          27*      1.69     .00
8         .50     .21          28        .54     .16
9*       1.09     .01          29        .24     .78
10        .36     .45          30        .29     .63
11        .63     .10          31        .16     .94
12        .66     .08          32        .61     .11
13        .40     .37          33        .65     .09
14        .54     .17          34        .54     .17
15*      1.11     .01          35        .15     .97
16        .29     .60          36        .26     .72
17        .58     .13          37*      1.31     .00
18*      1.36     .00          38        .33     .52
19        .45     .28          39        .26     .71
20        .31     .57          40*       .82     .04
"*" means that the p-value is smaller than .05.
From Table 4.1 we observe that nine out of forty players have bowling scores which
are significantly different from the normal population at the 5% significance level. We
now use Fisher's method for combining the tests of forty samples and get the statistic
$P = -2\sum_{i=1}^{40} \log(p_i) = 176.9$ with an overall p-value of approximately 0.
We therefore reject the null hypothesis
that the bowling scores are approximately normally distributed.
4.4 The Proposed Model for Bowling Scores
The Box-Cox family of transformations of bowling scores is given by
$$y_{ijk}^{(a)} = \begin{cases} \dfrac{x_{ijk}^{a} - 1}{a} & \text{if } a \neq 0 \\ \ln(x_{ijk}) & \text{if } a = 0. \end{cases}$$
We want to find the value of the parameter $a$ so that the transformed $y_{ijk}^{(a)}$ are approximately
normally distributed. We test the hypotheses $H_{0i}$: $y_{ijk}^{(a)} \sim N(\mu_i, \sigma_i^2)$, $i = 1,2,\ldots,40$
for a specified $a$ by using the modified Anderson-Darling statistic and Fisher's method.
We do this rather than the traditional maximum likelihood approach since the maximum
likelihood method requires a constant variance amongst individuals and we have no prior
reason to believe this. The "best" model is the one whose value $a$ gives the maximum
overall p-value. Trying different $a$-values ranging from -2 to 2, we find that the logarithms of
bowling scores ($a = 0$) have a nearly maximal overall p-value of 0.07. The maximum is attained
at $a = -0.05$, where $p_{max} = 0.07$.
However we would like to choose $a = 0$, because the difference in p-values is small. We
mention that the log transformation is the variance stabilizing transformation resulting
from a model where the standard deviation is proportional to the mean. Table 4.2 gives
the results of testing the normality of the logarithms of bowling scores.
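The variance stabilizing remark can be checked with a first-order (delta method) sketch. Assume, for illustration only, that a score $X$ has mean $\mu$ and standard deviation $c\mu$, where the constant $c$ is notation introduced here and does not appear elsewhere in the project. Then

```latex
\[
\operatorname{Var}(\log X)
  \approx \left( \left. \frac{d \log x}{dx} \right|_{x=\mu} \right)^{2} \operatorname{Var}(X)
  = \frac{1}{\mu^{2}} \, c^{2} \mu^{2}
  = c^{2},
\]
```

so to first order the variance of the logarithm does not depend on the bowler's mean.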
We see that the logarithms of bowling scores for players 7, 8, 23 and 27 are significantly
different from the normal distribution at the 5% significance level. We reject four out of
forty hypotheses $H_{0i}$: $y_{ijk}^{(0)} \sim N(\mu_i, \sigma_i^2)$, $i = 1,2,\ldots,40$. Using Fisher's method for
combining tests of forty samples, we get the statistic $P = -2\sum_{i=1}^{40} \log(p_i) = 99.8$ with
overall p-value 0.07. We therefore tentatively accept the hypothesis that the logarithms
of bowling scores are approximately normally distributed.
We mention that the p-values obtained above are appropriate for a specified value $a$.
We have optimally determined $a$ and then computed the p-value as though $a$ was specified.
Table 4.2: The Results of Logarithms of Bowling Scores
player   A^{2*}   p-value      player   A^{2*}   p-value
1         .39     .39          21        .26     .72
2         .33     .51          22        .43     .32
3         .23     .82          23*       .93     .02
4         .67     .53          24        .35     .48
5         .15     .96          25        .61     .11
6         .32     .53          26        .25     .74
7*        .92     .02          27*       .88     .02
8*       1.10     .01          28        .35     .47
9         .54     .17          29        .51     .19
10        .26     .71          30        .17     .93
11        .53     .17          31        .23     .80
12        .30     .58          32        .31     .56
13        .18     .92          33        .30     .58
14        .25     .74          34        .43     .31
15        .44     .29          35        .28     .64
16        .39     .39          36        .42     .33
17        .42     .33          37        .70     .07
18        .72     .06          38        .10     .99
19        .39     .38          39        .18     .92
20        .21     .86          40        .61     .11
"*" means that the p-value is smaller than .05.
Technically this is not ideal and the true p-value should be smaller than the reported
p-value of 0.07. Despite this we believe that the log normal approximation is good as will
be seen in the q-q plot.
To compare the results for the bowling scores obtained from Section 3.4 by using
standardized and pooled data, we give the histogram and standardized q-q plot for the
logarithms of bowling scores in Figure 4.1. We see that the histogram of standardized
and pooled logarithms of bowling scores is approximately normally distributed. The
q-q plot is almost a straight line through the origin and with unit slope. The only place
of departure is in the tail. It seems that the tails of the normal distribution may be
slightly thicker than the tails of the logarithms of bowling scores. However as mentioned
earlier, the Anderson-Darling statistic is very sensitive in detecting departures from the
true distribution in the tail. We therefore accept the hypothesis that the logarithms of
bowling scores are approximately normally distributed.
4.5 The Property of Equal Variances
After having found that the logarithms of bowling scores are approximately normally
distributed, we would like to know whether the equality of variances holds. Table 4.3 lists
the averages and the sample variances of the logarithms of bowling scores. The sample
variances vary from 0.017 to 0.050.
Table 4.3: The Sample Averages and Sample Variances of Logarithm Data
Player   Averages   Sample Var.      Player   Averages   Sample Var.
1        5.26       .038             21       5.24       .027
2        4.92       .026             22       5.30       .028
3        5.07       .017             23       5.23       .030
4        5.22       .029             24       5.18       .029
5        5.31       .027             25       5.26       .039
6        4.96       .030             26       5.05       .027
7        5.03       .035             27       4.93       .032
8        5.18       .034             28       4.76       .045
9        5.07       .039             29       5.30       .033
10       5.27       .031             30       5.08       .022
11       4.77       .022             31       5.08       .027
12       4.88       .037             32       4.96       .029
13       5.07       .034             33       5.00       .044
14       4.95       .021             34       5.08       .035
15       5.01       .038             35       4.89       .050
16       4.95       .020             36       5.13       .033
17       4.82       .043             37       4.83       .046
18       5.08       .027             38       5.11       .045
19       5.07       .036             39       4.93       .035
20       5.03       .034             40       5.05       .043
Figure 4.2 gives a plot of the sample variances against the averages.
We see that the sample variances of logarithms of bowling scores are approximately the
same amongst all players. We will use Bartlett's method for testing whether the variances
are approximately equal for the logarithms of bowling scores.
[Figure 4.1: Normality of the Logarithms of Bowling Scores. Panels: Histogram of Standardized Values (horizontal axis: Standardized Logarithms of Bowling Scores); Q-Q Plot (horizontal axis: Quantiles of Standard Normal).]
[Figure 4.2: The Sample Variances vs Averages Plot for Logarithm Data. Horizontal axis: Averages.]
The assumptions of the Bartlett test are:
(1) Each of the k populations is normal.
(2) Independent random samples are obtained from each population.
The hypothesis is:
$H_0$: $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2$
$H_1$: not all of the $\sigma_i^2$ are equal
Let $s_1^2, \ldots, s_k^2$ denote the sample variances from the $k$ normal populations and let $df_i$
denote the degrees of freedom associated with the sample variance $s_i^2$. Then the mean
square error is given by
$$MSE = \frac{1}{df_T} \sum_{i=1}^{k} df_i \, s_i^2$$
where
$$df_T = \sum_{i=1}^{k} df_i.$$
The test statistic is
$$B = \frac{1}{C} \left[ (df_T) \log(MSE) - \sum_{i=1}^{k} (df_i) \log(s_i^2) \right]$$
where
$$C = 1 + \frac{1}{3(k-1)} \left[ \left( \sum_{i=1}^{k} \frac{1}{df_i} \right) - \frac{1}{df_T} \right].$$
Under $H_0$, $B$ is approximately distributed as $\chi^2_{k-1}$.
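The Bartlett statistic defined above can be computed directly from the sample variances and their degrees of freedom. The sketch below is illustrative only: the function names, the three variances and the common 56 degrees of freedom are hypothetical, and the p-value uses the Wilson-Hilferty normal approximation to the chi-square tail as a stand-in for chi-square tables, which is not necessarily how the p-values in this project were obtained.

```python
import math
from statistics import NormalDist


def bartlett_statistic(sample_vars, dfs):
    """Return Bartlett's B and its degrees of freedom k - 1."""
    k = len(sample_vars)
    df_total = sum(dfs)
    mse = sum(df * s2 for df, s2 in zip(dfs, sample_vars)) / df_total
    c = 1.0 + (sum(1.0 / df for df in dfs) - 1.0 / df_total) / (3.0 * (k - 1))
    weighted_logs = sum(df * math.log(s2) for df, s2 in zip(dfs, sample_vars))
    b = (df_total * math.log(mse) - weighted_logs) / c
    return b, k - 1


def chi2_sf_wilson_hilferty(x, df):
    """Approximate chi-square upper tail (Wilson-Hilferty approximation)."""
    centre = 1.0 - 2.0 / (9.0 * df)
    scale = math.sqrt(2.0 / (9.0 * df))
    z = ((x / df) ** (1.0 / 3.0) - centre) / scale
    return 1.0 - NormalDist().cdf(z)


# Hypothetical example: three players, 57 games each (56 degrees of freedom).
dfs = [56, 56, 56]
b, df = bartlett_statistic([0.017, 0.030, 0.050], dfs)
p_value = chi2_sf_wilson_hilferty(b, df)

# Identical variances give a statistic of zero up to rounding error.
b_equal, _ = bartlett_statistic([0.03, 0.03, 0.03], dfs)

# Approximate tail probability for a statistic of 57.6 on 39 degrees of freedom.
p_39 = chi2_sf_wilson_hilferty(57.6, 39)
```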
In studying the equality of variances for the logarithms of bowling scores, the above
two assumptions hold. We have $df_i = n_i - 1$ where $n_i$ is the number of games in which
player $i$ participated. The sample variance of the logarithm of bowling scores for player $i$
is $s_i^2$ and $k = 40$. The test statistic $B = 57.6$ yields a p-value of .03. A Spearman Rank
test was also carried out as in Section 3.2 and the p-value was found to be insignificant.
Therefore there seems to be mild evidence of differences amongst the variances. However
all that we care about is that the differences are not too big. Therefore we suggest
that the variances are approximately equal amongst players and that the logarithms of
bowling scores are approximately normally distributed with a constant variance estimated
by $\hat{\sigma}^2 = \sum_{i=1}^{40} s_i^2 / 40 = 0.0328$. An approximate 95% confidence interval for $\sigma^2$ based on
normality is (0.0220, 0.0541).
Having revised our model we standardize logarithms of bowling scores as in Section
3.4 by using the sample average for each player together with $\hat{\sigma}^2 = 0.0328$ obtained above. We then
construct the standardized q-q plot and histogram for pooled values based on this new
model. Figure 4.3 gives the plot of normality of logarithms of bowling scores with constant
variance.
Comparing Figure 4.1 and Figure 4.3, we observed that Figure 4.3 is more normal than
Figure 4.1. The reason might be that in Figure 4.1 fewer data are in the tail compared
with the standard normal. In Figure 4.3 the $i$th bowler contributes the terms
$(x_{ijk} - \bar{x}_i)/\hat{\sigma}$ to the pooled data, where $x_{ijk}$ here denotes the logarithm of the score.
Therefore those bowlers which have a small $s_i$ are going to contribute
terms that are clustered more tightly about zero and those bowlers which have a larger $s_i$
are now going to contribute terms that are more spread out about zero. The net effect is
a longer tailed distribution which is what we observed.
[Figure 4.3: Normality of the Logarithm of Bowling Scores ($\hat{\sigma}^2 = 0.0328$). Panel: Histogram of Standardized Values.]
Note that some very strong modelling assumptions have been conjectured; i.e. that
logarithms of bowling scores are approximately normally distributed with a constant variance. However this conclusion is based on a single league and it may be unreasonable
to extend this inference to populations in general. In the next chapter we hope to show
through simulation that the result is approximately valid for a wide range of bowling
abilities.
Chapter 5
Simulated Bowling Scores
In Chapter 4, we found that the logarithm of bowling scores is approximately normally
distributed. However we have some concern over the adequacy of the approximation due
to the 7% p-value. Perhaps the approximation is quite good and the questionable p-value
can be attributed to the effect of sample size on the meaning of significance tests. For
example, it is well known that with a very large data set a precise $H_0$ will almost always
be rejected (see Royall[10]). In any case we would like to confirm the adequacy of the
approximation. In this chapter, we use a Fortran program to simulate bowling scores to
confirm the model.
5.1 Assumptions of Simulation
As we know, the actual mechanism underlying a bowling game is impossible to describe
and to simulate. We give some simplifications concerning the bowling mechanism in order
to make the simulation easy.
(A) There is no curve on each ball bowled.
(B) A bowler always aims directly at the middle of the pin of interest and can miss the
pin by no more than 1 pin to the right or to the left.
(C) The bowler has equal accuracy to the left or to the right; the chance of hitting either
the adjacent right pin or the adjacent left pin is the same.
(D) There is no learning effect. Every ball bowled is independent of every other.
We also simplify the outcomes of each ball bowled. These outcomes are described below.
We mention that other outcomes can arise in practice other than those described. However
they are far less probable. They also have a similar structure to one of the described
possibilities. That is, they count approximately the same number of points and have nearly the
same implications for successive balls in the frame.
1. The outcomes resulting from the first ball bowled in a frame are one of the following
four possible results:
(a) Strike: A bowler knocks down all pins with probability p1 and the score is 15 points.
(b) Corner: A bowler knocks down all pins except a single corner pin with probability p2
and the score is 13 points.
(c) Headpin: A bowler knocks down the headpin with probability p3 and the score is 5
points.
(d) 3-2: A bowler knocks down the 3-pin and the 2-pin on the same side with probability
p4 = 1 - (p1 + p2 + p3) and the score is 5 points.
2. The outcomes resulting from the second ball bowled in a frame are conditional on
the result of the first ball.
(a) A strike is recorded and the frame is completed. No second ball is available.
(b) A corner pin is left standing after the first ball. There are 2 possible outcomes for
the second ball:
(1) Spare: The bowler knocks down the corner pin. The probability p1+p2+p3
relates to a ball which does not miss its intended pin. A spare is recorded and
the score is 15 points.
(2) The corner pin remains. The bowler does not knock down the corner pin
with probability p4 and the score remains unchanged.
(c) A headpin is picked as the result of the first ball. There are four possible outcomes
for the second ball:
(1) 3-2: The bowler knocks down the 3-pin and 2-pin on either the right or left side.
The probability p1+p2 relates to a ball which is aimed at the 3-pin and hits but
does not punch out the 3-pin and the score is 5 + 5 = 10.
(2) 3-pin: The bowler knocks down the 3-pin with probability p3 and the score is
5 + 3 = 8.
(3) 2-pin: The bowler knocks down the 2-pin and the score is 5 + 2 = 7. The
probability p4/2 relates to the bowler's tendency to miss to the left or to the
right with equal probability.
(4) Miss: The bowler rolls the ball through the headpin channel with probability
p4/2 and the score is still 5 points.
(d) The 3-2 combination is picked as the result of the first ball. There are five possible
outcomes for the second ball:
(1) Headpin: The bowler picks the headpin with the second ball with probability
p3 and the score is 5 + 5 = 10.
(2) Spare: The bowler knocks down all remaining pins with the second ball with
probability p1+p2/2. A spare is recorded and the score is 15.
(3) Miss: The bowler rolls the ball through the 3-2 channel with probability p4/2
and the score is unchanged.
(4) 3-2: The bowler knocks down the other 3-pin and 2-pin with probability p4/2
and the score is 5 + 5 = 10.
(5) hp-3: The bowler knocks down both the head pin and the 3-pin with probability
p2/2 and the score is 5 + 5 + 3 = 13.
3. For the third ball of the frame there are 14 different results based on the second
ball. Figure 5.1 graphically depicts all outcomes and probabilities. The numbers within
circles represent the cumulative scores and the letters A, B, C and D after the second ball
indicate that the same situations have occurred at other places in the chart.
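The first-ball and second-ball rules listed above can be sketched as follows. This is not the Fortran program of Appendix B: it covers only the first two balls of a single frame and returns the points for pins knocked down, ignoring the third ball (whose outcomes are given only in Figure 5.1) and any bonus scoring for strikes and spares. The function names `pick` and `two_ball_pinfall` are hypothetical.

```python
import random


def pick(rng, outcomes):
    """Draw (label, points) from a list of (probability, label, points)."""
    u = rng.random()
    cumulative = 0.0
    for prob, label, points in outcomes:
        cumulative += prob
        if u < cumulative:
            return label, points
    # Guard against rounding when the probabilities sum to just under 1.
    return outcomes[-1][1], outcomes[-1][2]


def two_ball_pinfall(rng, p1, p2, p3, p4):
    """Return (first outcome, second outcome, points after two balls)."""
    first, _ = pick(rng, [(p1, "strike", 15), (p2, "corner", 13),
                          (p3, "headpin", 5), (p4, "3-2", 5)])
    if first == "strike":
        return first, None, 15  # frame is complete, no second ball
    if first == "corner":
        table = [(p1 + p2 + p3, "spare", 15), (p4, "miss", 13)]
    elif first == "headpin":
        table = [(p1 + p2, "3-2", 10), (p3, "3-pin", 8),
                 (p4 / 2, "2-pin", 7), (p4 / 2, "miss", 5)]
    else:  # the 3-2 combination was picked with the first ball
        table = [(p3, "headpin", 10), (p1 + p2 / 2, "spare", 15),
                 (p4 / 2, "miss", 5), (p4 / 2, "3-2", 10),
                 (p2 / 2, "hp-3", 13)]
    second, total = pick(rng, table)
    return first, second, total


rng = random.Random(1991)
frames = [two_ball_pinfall(rng, 0.15, 0.15, 0.25, 0.45) for _ in range(20000)]
strike_rate = sum(1 for first, _, _ in frames if first == "strike") / len(frames)
```

In each branch the second-ball probabilities sum to p1 + p2 + p3 + p4 = 1, which is a useful check on the listed outcomes.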
[Figure 5.1: Tree Diagram of Possible Outcomes. Legend: hp: headpin; 3-2: 3-pin and 2-pin knocked down; 3: 3-pin knocked down; 2: 2-pin knocked down; hp-3: headpin and 3-pin knocked down.]
5.2 The Results of Simulation
Based on the simplifications and the scoring rules, a Fortran program has been coded to
simulate bowling games. Appendix B lists the computer program.
We use the Fortran program to simulate bowling scores by choosing different parameters
p1, p2, p3 and p4 roughly based on the observations of actual bowlers and on the
considerations of variability of abilities where p1 + p2 + p3 + p4 = 1. Table 5.1 lists
some typical results of the simulation where averages range between 120 and 305. The
results include the sample averages and sample variances of the bowling scores based on
N simulations for a set of chosen parameters.
Table 5.1: The Results of Simulation (N = 10000)
p1      p2      p3      p4      Average   Variance
0.3     0.3     0.2     0.2     250       1388
0.2     0.2     0.3     0.3     199       1157
0.2     0.3     0.2     0.3     219       1197
0.4     0.3     0.2     0.1     282       1507
0.25    0.25    0.2     0.3     224       1335
0.1     0.1     0.3     0.5     149       731
0.15    0.15    0.25    0.45    173       978
0.15    0.15    0.1     0.6     171       1015
0.05    0.05    0.35    0.65    125       412
0.1     0.2     0.3     0.4     169       855
0.2     0.2     0.15    0.45    197       1218
0.25    0.15    0.1     0.5     202       1329
0.4     0.4     0.1     0.1     302       1355
0.25    0.2     0.25    0.3     214       1338
0.25    0.25    0.1     0.4     223       1356
0.3     0.3     0.1     0.3     250       1412
We also give some standardized q-q plots and histograms based on the logarithms of
the simulated bowling scores in Figures 5.2-5.4. The sample variances of the logarithms of
bowling scores are 0.0327, 0.0361 and 0.0234 for the data sets which contributed Figures
5.2, 5.3 and 5.4 respectively. They belong to the range of variances of logarithms of actual
bowling scores. The plots in Figures 5.2, 5.3 and 5.4 are obtained by using the constant
variance $\hat{\sigma}^2 = 0.0328$. Figure 5.2 gives one of the "best" amongst the simulated scores,
Figure 5.3 gives the result of a typical simulation and Figure 5.4 gives one of the "worst"
results. By "best" we mean that it fits the normal model best and by "worst" we mean
that it fits the normal model worst. Note that in the worst case the average of simulated
bowling scores is 250, which extends beyond the range of averages of the actual data. However
it is still not clear which values of the parameters p1, p2, p3 and p4 lead to a good
approximation.
It has been observed in Figures 5.2-5.4 that in each case the left sample quantiles
fall below the normal quantiles. A possible explanation for this is the inadequacy of the
simulation model. The simplified model may make it unrealistically easy to obtain a low
score.
From Figures 5.2-5.4, we find that the logarithms of simulated bowling scores are
approximately normally distributed with a constant variance. We have therefore confirmed
the proposed model of Chapter 4 by using a Fortran program to simulate bowling scores.
[Figure 5.2: The Results of p1=0.15, p2=0.15, p3=0.25, p4=0.45. Panels: Histogram of Standardized Values (horizontal axis: Standardized Logarithms of Simulated Bowling Scores); Q-Q Plot (horizontal axis: Quantiles of Standard Normal).]
[Figure 5.3: The Results of p1=0.15, p2=0.15, p3=0.1, p4=0.6. Panels: Histogram of Standardized Values (horizontal axis: Standardized Logarithms of Simulated Bowling Scores); Q-Q Plot (horizontal axis: Quantiles of Standard Normal).]
[Figure 5.4: The Results of p1=0.3, p2=0.3, p3=0.1, p4=0.3. Panels: Histogram of Standardized Values (horizontal axis: Standardized Logarithms of Simulated Bowling Scores); Q-Q Plot (horizontal axis: Quantiles of Standard Normal).]
Chapter 6
Handicap Systems
From preceding chapters we have concluded that the logarithm of bowling scores is approximately
normally distributed with a constant variance $\hat{\sigma}^2 = 0.0328$. In this chapter we
will use Monte Carlo methods to investigate the effect of various handicap systems based
on the proposed model of Chapter 4 and then compare our results with the Remington
Rand study.
6.1 The Remington Rand Study
There are various handicap systems currently used in league and tournament play. Detailed
information about handicap systems is given in Section 2.1. The Remington
Rand study[8] processed over 100,000 league bowling scores and the results suggested that
the individual handicap system of 80% of the difference between the bowler's average and
a base figure of 225 is the fairest handicap system. We tried to get more information about
the Remington Rand study and their criteria of "fairest". Unfortunately we did not get
any reply from the Ontario 5 Pin Bowlers' Association on this matter. In the remainder
of this chapter, we use our model to investigate the effects of various handicap systems on
the probability of winning. By a fair handicap system we mean one which tries to "even
up" the chance of winning in a match between competitors of various strengths.
6.2 The Monte Carlo Study
Let $x_{ij}$ be the $j$th bowling score of player $i$ with mean $m_i$. In this way we can compare
bowlers of various abilities by changing the value of $m_i$. We know from our model in the
previous chapters that $y_{ij} = \log(x_{ij}) \sim N(u_i, \sigma^2)$ where $\sigma^2 = 0.0328$. Since $x_{ij} = e^{y_{ij}}$,
by completing the square it is easy to show that $m_i = E(x_{ij}) = E(e^{y_{ij}}) = e^{u_i + \sigma^2/2}$ and
therefore $u_i = \log(m_i) - \sigma^2/2$. We easily simulate bowling scores for player $i$ by generating
random variates from the $N(\log(m_i) - \sigma^2/2, \sigma^2)$ distribution with $\sigma^2 = 0.0328$ and then
exponentiating.
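For completeness, the completing-the-square step can be written out as follows, for a generic $Y \sim N(u, \sigma^2)$ with the player subscript dropped:

```latex
\begin{align*}
E(e^{Y})
  &= \int_{-\infty}^{\infty} e^{y} \, \frac{1}{\sqrt{2\pi\sigma^{2}}}
     \exp\!\left( -\frac{(y-u)^{2}}{2\sigma^{2}} \right) dy \\
  &= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^{2}}}
     \exp\!\left( -\frac{\left(y-(u+\sigma^{2})\right)^{2}}{2\sigma^{2}}
                  + u + \frac{\sigma^{2}}{2} \right) dy \\
  &= e^{u + \sigma^{2}/2},
\end{align*}
```

since $y - (y-u)^2/(2\sigma^2) = -\left(y-(u+\sigma^2)\right)^2/(2\sigma^2) + u + \sigma^2/2$ and the remaining integrand is a normal density that integrates to one.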
In more detail the method for generating bowling scores is as follows:
(1) Create average bowling scores $m_i$ and $m_j$ for player $i$ and player $j$. The values of $m_i$
and $m_j$ ($m_i \le m_j$) are between 100 and 240 to reflect realistic abilities. We consider
all possible combinations with averages ranging by 10 point intervals. By restricting
$m_i \le m_j$ we refer to player $i$ as the underdog and player $j$ as the favourite.
(2) For each pair of players we generate 10,000 bowling scores $x_{ik}$ and $x_{jk}$, $k = 1,2,\ldots,10000$,
according to the log normal distribution described above.
(3) We then add a handicap to both $x_{ik}$ and $x_{jk}$ based on the handicap system currently
under study and obtain the total game scores. We consider handicap systems 1, 2,
3 and 4 corresponding to the descriptions in Section 2.1.
(4) We then estimate the probability of the favourite defeating the underdog in a given
game by considering the fraction of the 10,000 games won by the favourite over the
underdog.
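The four steps above can be sketched as a short simulation. This is an illustration rather than the program used for the study. The handicap rule shown is an assumption based on the description in Section 6.1 (80% of the difference between a base figure of 225 and the bowler's average); the definitions of the four systems in Section 2.1 are not reproduced here, and whether handicaps are rounded or floored at zero is not known, so the example keeps both averages below the base. All function names are hypothetical.

```python
import math
import random

SIGMA2 = 0.0328  # constant variance of the log scores (Chapter 4)


def simulate_score(rng, m):
    """One bowling score for a player whose mean score is m."""
    mu = math.log(m) - SIGMA2 / 2.0
    return math.exp(rng.gauss(mu, math.sqrt(SIGMA2)))


def handicap(average, fraction=0.8, base=225.0):
    """Assumed rule: a fraction of the gap between the base and the average."""
    return fraction * (base - average)


def prob_favourite_wins(m_underdog, m_favourite, n_games=10000, seed=0):
    """Fraction of simulated games won by the favourite after handicaps."""
    rng = random.Random(seed)
    h_underdog = handicap(m_underdog)
    h_favourite = handicap(m_favourite)
    wins = 0
    for _ in range(n_games):
        favourite_total = simulate_score(rng, m_favourite) + h_favourite
        underdog_total = simulate_score(rng, m_underdog) + h_underdog
        if favourite_total > underdog_total:
            wins += 1
    return wins / n_games


p_equal = prob_favourite_wins(180.0, 180.0)
p_unequal = prob_favourite_wins(180.0, 220.0)

# Check that the simulated scores have the intended mean.
check_rng = random.Random(1)
mean_180 = sum(simulate_score(check_rng, 180.0) for _ in range(20000)) / 20000
```

With equal averages the estimate should be close to one half, which gives a simple check on the simulation before other pairs of averages are compared.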
6.3 The Results of Our Study
Table 6.1 lists the bowlers' averages and the probabilities of the favourite defeating the
underdog based on the four handicap systems. This has been done using 20 point intervals.
As we expected, from Table 6.1 we observe that the stronger player always has an
advantage under each of the four handicap systems. We see this as a good thing as it
offers incentive to improve one's bowling skills. On the other hand the advantage of the
stronger player should not be so great as to discourage the weaker player. From this point
of view Table 6.1 indicates that handicap system 1 may be the most preferable as the
advantage of the favourite over the underdog is not as dramatic as with handicap systems
2, 3 and 4. For example the favourite with an average of 220 has probability 0.54, 0.69,
0.67 and 0.55 of defeating the underdog with an average of 180 under handicap systems
1, 2, 3 and 4 respectively.
Table 6.1: Estimated Probabilities of the Favourite Winning
Underdog   Favourite   Handicap1   Handicap2   Handicap3   Handicap4
100        100         0.50        0.50        0.50        0.50
100        120         0.56        0.60        0.57        0.57
100        140         0.60        0.68        0.63        0.63
100        160         0.63        0.73        0.67        0.67
100        180         0.67        0.78        0.71        0.61
100        200         0.68        0.80        0.73        0.73
100        220         0.70        0.90        0.86        0.75
100        240         0.81        0.96        0.93        0.87
120        120         0.50        0.50        0.50        0.50
120        140         0.54        0.57        0.55        0.55
120        160         0.58        0.64        0.60        0.60
120        180         0.61        0.70        0.64        0.64
120        200         0.64        0.73        0.67        0.67
120        220         0.66        0.86        0.82        0.71
120        240         0.77        0.93        0.90        0.83
140        140         0.50        0.50        0.50        0.50
140        160         0.54        0.57        0.55        0.55
140        180         0.58        0.64        0.60        0.60
140        200         0.59        0.67        0.62        0.62
140        220         0.64        0.82        0.78        0.67
140        240         0.74        0.90        0.87        0.79
160        160         0.51        0.51        0.51        0.51
160        180         0.53        0.56        0.54        0.54
160        200         0.56        0.61        0.58        0.58
160        220         0.59        0.76        0.73        0.61
160        240         0.70        0.85        0.83        0.74
180        180         0.50        0.50        0.50        0.50
180        200         0.53        0.55        0.53        0.53
180        220         0.55        0.70        0.68        0.56
180        240         0.67        0.81        0.80        0.71
200        200         0.49        0.49        0.49        0.49
200        220         0.53        0.65        0.65        0.54
200        240         0.64        0.76        0.76        0.67
220        220         0.50        0.50        0.50        0.50
220        240         0.61        0.64        0.64        0.64
240        240         0.51        0.51        0.51        0.51
To gain a better understanding of Table 6.1 we present it in a graphical manner in Figure 6.1. Figure 6.1 gives plots of the probabilities of the favourite defeating the underdog
under four handicap systems (H1, H2, H3 and H4) for an underdog with a fixed average.
Notice that any lack of smoothness in the plots is due to sampling error in our estimates and should
be ignored. The standard error of each estimated probability is less than or equal to
$\sqrt{(0.5)(0.5)/10000} = 0.005$.
The estimation errors are also clearly seen in Table 6.1, where the probability of the favourite
winning should always be 0.50 when $m_i = m_j$.
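The bound on the standard error can be written out as a short derivation. The indicator $W_k$ (equal to 1 when the favourite wins simulated game $k$) and the symbols $p$ and $\hat{p}$ are notation introduced here for illustration; the derivation assumes the $n = 10000$ simulated games are independent.

```latex
\hat{p} = \frac{1}{n}\sum_{k=1}^{n} W_k ,
\qquad W_k \sim \mathrm{Bernoulli}(p) \ \text{independently},
\qquad
\mathrm{SE}(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}
\;\le\; \sqrt{\frac{(0.5)(0.5)}{10000}} = 0.005 ,
```

since $p(1-p)$ attains its maximum of $0.25$ at $p = 0.5$.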
Figure 6.1 shows clearly that handicap system 1 is the fairest. This gives the same result
as the Remington Rand study: the individual handicap system of 80% of the difference
between the bowler's average and a base figure of 225 is the fairest handicap system to
use in league or tournament play.
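As a rough numerical check on this reading, the sketch below averages the favourite's winning probability over the rows of Table 6.1 with an underdog average of 100. The values are copied from Table 6.1; the summary measure (a plain mean over the eight favourite averages) and the names used are assumptions chosen for illustration, not a criterion used in the study.

```python
# Rows of Table 6.1 for an underdog average of 100
# (favourite averages 100, 120, ..., 240).
TABLE_UNDERDOG_100 = {
    "H1": [0.50, 0.56, 0.60, 0.63, 0.67, 0.68, 0.70, 0.81],
    "H2": [0.50, 0.60, 0.68, 0.73, 0.78, 0.80, 0.90, 0.96],
    "H3": [0.50, 0.57, 0.63, 0.67, 0.71, 0.73, 0.86, 0.93],
    "H4": [0.50, 0.57, 0.63, 0.67, 0.61, 0.73, 0.75, 0.87],
}


def mean_favourite_probability(table):
    """Mean estimated probability of the favourite winning, per system."""
    return {name: sum(probs) / len(probs) for name, probs in table.items()}


def fairest_system(table):
    """System whose mean favourite probability is closest to 0.5."""
    means = mean_favourite_probability(table)
    return min(means, key=lambda name: abs(means[name] - 0.5))
```

Calling `fairest_system(TABLE_UNDERDOG_100)` selects the system with the smallest average advantage for the favourite in these rows.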
[Figure 6.1: The Probability of the Favourite Winning the Game. Four panels, one for each
underdog average (100, 130, 160 and 190), each plotting the probability of the favourite
winning against the average of the favourite under handicap systems H1, H2, H3 and H4.]
Appendix A
The Data Set
Player
Game
11
12
1
173
165
182
159
147
194
221
153
159
183
141
252
13
14
181
210
15
205
16
17
18
19
20
163 118
189 162
189 120
234 139
205 146
1
2
3
4
5
6
7
8
9
10
2
3
152 161
164 172
148 204
130 NA
4
5
215
200
210
182
141
6
125
103
128
133
173
171
136
147
151
187
184 248
180 229
147 177
197 159
191 138
161 167
141
150
135
119
135
161
158
137
122
147
133
170
147
188 185
219 204 167
178 198 146
205
193
221
NA
NA
NA
NA
152 NA NA
88
172 NA NA 176 122
153
167
149
113
143
184
137
122
124 149
NA 149 164 146
NA 222 196 162
NA 200 184 154
198
126
139 257
213 260
146
116
7
NA
NA
NA
NA
NA
NA
10
8
9
127 148 257
185 107 161
205 167 203
159 155 178
190 148 142
128 187 170
176 138 295
211 136 200
212 110 185
195 139 224
209 121 146
206 138 227
150 215
137 107 136
106 230 212
147 193 126
153 145 155
123 160 143
129 205 126
143 106 135
193
165
230
162
161
176
147
199
Game
Player
4
3
179 126 162 190
199 130 166 187
238 161 162 220
133 161 169 230
199 130 166 187
238 161 162 220
133 161 169 230
1
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
223
235
239
207
214
163
221
242
196
209
139
250
146
256
297
224
2
125 NA
139 NA
125 NA
101 137
124 155
5
234
173
201
214
173
201
214
10
8
9
192 132 188
191 202 163
182 156 255
176 218 211
191 202 163
182 156 255
176 218 211
187 168 200
188 134 164
171 167 205
242
229
252
222
190
184
214
299
202
133 226
209 157
159 146
116 163
156 127
113 144 220
157 120 129
111 124 173
177 113 203
131 173 192
126 115 165
142 211 178
159 191 181
149 156 180
236
142
173 198
202 195
159 199
178
124
141 131 183
122 162 190
173 171 186
121 151 113
113 137 225
146 181 162
146 146 255
NA 146 196
7
6
172 149
167 147
133 226
209 157
167 147
152
200
170 210
145 217
248 253
220 145
162 184
173 188
136 200
159 266
116 225
184
235
NA 130 224 162 158 152 144 211 171
NA 151 197 180 144 136 144 168 205
101
148 235
188 177 148 165
166 187 203 190
235 119 187 223
132 102 202 142
194 121 164 204
206 144 156 NA
232 140 178 NA
259 136 149 NA
178 119 154 148
~2
169 195
181
187
124
179
194
219
229
246
147
135
141
165
138
160
182
193
141
167
149
NA
NA
NA
156
217 190
166 228
185 200
157 138
132 154
162
209
148
172
239
195
285
205
159
132
135
163 137
113 130
152 142
119 213
138
148
173
234
231
174
118
145
193
211
228
199
175
185
157
Game
54
55
56
57
2
1
156 105
146 96
212 142
196 163
3
140
182
4
183
165
126
185
178
184
Game
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
12 13 14
11
85 123 119 NA
109 109 201 NA
95 111 137 NA
123 140 144 132
158 117 177 153
131 109 146 111
123 106 151 NA
153 160 188 NA
155 124 158 NA
84 137 148 154
114 152 132 149
119 136 143 148
135 118 173 180
158
112
110
110
129
126
109
123
108
112
135
157
117
166
119
124
124
Player
6
7
8
5
234 176 143 220
208 120 124 192
213 155 141 241
208 232 207 177
9
155
134
10
229
172
149
169
216
240
Player
15 16
170 125
153 147
157 153
19
151
196
158
20
200
154
156
115
157
115
128
136
159
186
144 122
137 237
158 136
187
193
121
162
132
135
171
151
159
181
197
190
164
139
121
195
164
138
121
170
163
183
132
129
176
139
172
143
154
129
147
103
NA
NA 111
NA 112
123
140
190
190
154
136
115
123
157
156
135
125
156
163
132
108
184
122
141
213
136
126
154
151
135
113
146
125
170
179
119
142
147
150
144
158
158
151
154
162
17 18
114 NA
71 NA
102 NA
103 154
102 144
91 158
103 205
114 213
94 217
99 171
168
124
136
126
141
130
87
138
145
120
132
153
145
188
174
138
118
168
175
184
NA 117 125 171 135 173
NA 151 133 155 183 131
136
115 194 176 171 NA 157 140 234 174 149
150 157
122 205
160
144
Game
25
26
27
28
29
30
31
32
33
34
12
14
160
144
Player
15
16
11
108
136
13
150 157
122 205
115
111
108
104
124 145
130 125
125 136
124 131
144 101
207 177
189 133 208
169 143 128
108 135 116
131 137 157
146 136 136
186 118 131
195 182 143
169 155 143
156 135 176
168 141 180
17
117 125
151 133
18
171
155
19
135
183
174
145
146
148 223
194 126
155 110
154 128
132 163
204 117
152 105
160 166
180 221
215 NA
218 NA
170 NA
134 NA
216 NA
185 NA
183 NA
NA
NA
115 194 176 171 NA 157 140 234 174 149
NA
NA
NA
NA
NA
NA
148
142
124
134
136
139
125
155
101
147
163
163
126
123
148
118
153
124
147
179
142
178
135
164
137
168
138
158
206
145
42
43
129
86
44
111
146
164
112
152
167
114
136
119
45
46
47
48
49
50
51
52
107
99
118
155
107
157
141
162
148 116
261 119
203 167
106 142
142 134
133 171
275
137
164
136
122
116
104
104
102
150
170
156
145
142
202
228
163
162
131
144
128
147
190
188
112
175
119
107
127
132
102
NA
NA
NA
113
124
NA
NA
NA
96
131
53
54
136
111
132
151
55
56
57
109
128
138
108
113
113
97
122
128
132
161
135
190
185
115
156
142
146
133
155
103
158
155
218
182
128
75
148
162
96
180
139
131
115
103
121
121
142
146
94
35
36
37
38
39
40
41
20
173
131
172 153
NA 146
177
146
186
130
130
192
104
NA 125 155
NA 132 214
188
239
150
144
160
133
117
183
153
119
155
121
158
125
115
161
178
102
105
196
80
135
128
136
111
124
131
NA
NA
NA
135
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA 189
164 NA 148
136 NA 141
Game
Player
21
1
2
3
4
5
6
7
8
9
10
11
12
22
23
24
NA NA NA NA
NA NA NA NA
NA NA NA NA
148 187 NA 153
172 176 NA 133
152 240 NA 191
25
152
145
187
151
216
222
182
175
211
13
14
15
16
17
18
19
20
21
22
23
233
166
210
24
237 151 161 212 177
166 221 210 192 181
210 193 204 140 158
237 151 161 212 177
195 236 215 171 159
152 213 186 158 199
233 263 155 149 207
25
26
27
28
29
30
28
29
30
164
186
NA
NA NA
NA NA
NA
NA 156 108 NA NA 208
174 155
181 227
160 232
163 204
177 246
190 289
172 188
180 182
211 184
157 205
176 179
212 261
201 178
207 153
212 174
260 156
199 223
162 181
233 163
141 133
26 27
162 114
172 105
218 135 164
168 174 171
197 167 205
192 162 212
161 148 201
228 182 241
234 191 204
142 202 156
276 152 210 169
221 210 192 181
193 204 140 158
177
144
92
119
130
151
131
153
140
175
161
110
130
206
221
129
154
179
112
133
185
165
156
142
217
129
123
144
139
132
212
165
107
140
78
90
121
112
85
152
103
123
119
104
234
211
188
221
200
263
237
191
169
246
100 186
126 188
NA 148
NA 191
NA 281
151 223
145 200
152 160 118 229
145 176 123 228
165 161 111 187
142 135 124 230
145 176 123 228
165 161 111 187
142 135 124 230
134 132 82 273
128 166 146 246
216 142 101 242
153
186
148
140
183
169
109
157
152
161
164
145
154
138
134
145
146
200
135
177
147
135
177
147
141
134
161
Game
Player
21
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
22 23 24
183 149 151
162 200 231
179 225 226
189 209 212
179 226 141
NA
NA
NA
NA
NA
NA 185
244 NA
189 NA
223 NA
233
216
185
167
193
182
142
215
189
129
179
162
199
208
180
210
253
238
218
218
164 187
131 254
187 172
226 193
NA 204
NA 225
NA 153
212 195
157 166
208 137
210 180
237 147
161 128
143 168
175 219
185 165
235 237 155 180
162 152 NA 183
140 243 NA 202
25
196
140
220
26
136
153
161
27
162
152
199
28 29
117 204
115 128
133 183
246 123
183 144
189 222
271 195
156 190
272 186
329 167
182 176
298 201
NA 170
NA 166
NA 121
215 157
149 169
205 147
184 209
243 163
184 124
171 131
158 126
132
127
123
127
113
132
155
127
134
125
128
150
128
147
142
126
149
95
169
240
277
165
125
123
117
133
142
114
111
96
81
76
136
107
131
138
NA
NA
NA
109
221
128
30
138
166
186
179 169
214 175
146 157
158 216
209 154
135 181
254 156
226 128
203 147
156 134
218 192
212 183
231 201
211 169
190 190
174 174
239 163
186 138
198 118
235 217
NA
NA
263 203 NA 176 200 122 121 NA 141 194
216
185
179
250
170 198
192 157 217
188 180 229
140
187
178
129
132
150
119
137
172
172
163
184
163
171
Game
Player
31
1
2
3
4
5
6
7
8
9
10
11
12
NA NA 221 NA NA
NA NA 165 NA NA
NA NA 189 NA NA
167 125 138 138 NA
139 133 198 147 NA
131 127 158 224 NA
173 122 96 181 NA
129 114 100 177 NA
127 150 127 163 NA
159
197
171
21
22
23
185
174
199
24
25
26
27
28
29
30
34
36
18
19
20
16
17
33
35
184 125
128 153
151 103
NA 134
NA 182
NA 179
158 168
173 141
159 141
170 135
185 149
13
14
15
...
32
163
134
219 119
135 158
161 141
115
156
138
137 147 132
128 128 128
196 190 138
109 176 81
148 120 71
131
213
125
194
151
166
193
190
134
153
134
178
140
140
179
181
144
155
183
132
37
119
181
164
148
110
113
114
107
38
176
146
177
294
128
174
168
151
118 101
112 141
139 152
91 207
146 138
162 195
106 233
NA 213
NA 153
39
40
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
119 158
137 157
95 126
110 132
140 218
107 95
132 156
131 195
165 132
166
153
155
112
90 150 NA 128 123
107 147 128 147 155
191 204 115 160 124
120 183 115 165 158
106 106 126 193 183
125 218 83 111 206
156 165 169 132 157
106 106 126 193 183
125 218 83 111 206
156 165 169 132 157
136 191 103 118 131
158
212
148
153 168
107 124
187 120 200
196 131 166 203
174 134 107 124
199 187 120 200
196 131 166 203
115 119 117 125
146 153 165 150 108
191 106 131 208 150
194
176
113
111
169
163
118
139
137
103
155
120
103
155
120
125
145
138
Game
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
32
31
121
149
184
147
211
135
156
169
185
133
139
133
199
137
163
128
140
148
166
148
160
159
169
214
225
179
112
145
139
145
141
145
166
176
155
157
202
138
99
106
154
126
203
182
154
140
143
160
115
169
134
189 143 230
181
209
136
124
163
187
190
121
170 138 186
135 200 211
141 104 150
121 203
129 232
186 155
162 206
107 120
121 187
127 206
51
52
53
54
55
187
144
155
161
117
137
179
141
177
149
56
57
169
185
120
119
33
134
158
188
144
136
121
186
34
212
132
Player
35 36 37
119 132 102
95 131 155
101 180 199
118 224 111
119 181 84
155 124 114
141 152 90
188 167 121
127 198 129
165 186 142
164 170 152
149 227 124
162 176 95
138 167 125
126 146 115
147 175 179
145 208 150
177 166 120
191 174 105
198
172
172
229
122
188
118
154
142
155
214
121
157
162
142
173 208
146 184
147
170
156
154
159 227
152 151
124
174
176
180
38
153
191
39 40
151 191
105 189
191 105 150
144 134 187
146 168 150
154 136 154
162 149 148
174 145 159
174 141 142
185 146 195
242 160 154
222 119 166
136 128 210
181 118 220
134 93 137
168 NA 165
159 NA 193
191 NA 202
202 NA 195
159 NA 142
NA
NA
NA
NA
119
148
161
138
206
188
214
201
144
166
Appendix B
A Program which Simulates
Bowling Scores
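The program listed in this appendix reads four first-ball probabilities, converts them to cumulative form, and compares a uniform random number with the cumulative bounds to decide whether the ball is a strike, a corner, a headpin or a 3-2. The Python sketch below illustrates only that selection step, under the assumption that the four probabilities are given in that order. The function and constant names are hypothetical; the point values are those credited for each first ball in the listing.

```python
import math

# Order in which the four first-ball probabilities are assumed to be given.
OUTCOMES = ("strike", "corner", "headpin", "3-2")

# Points credited for each first-ball outcome in the listing.
FIRST_BALL_POINTS = {"strike": 15, "corner": 13, "headpin": 5, "3-2": 5}


def cumulative(probs):
    """Turn four outcome probabilities into cumulative bounds.

    Raises ValueError if the probabilities do not sum to 1.
    """
    if len(probs) != 4:
        raise ValueError("expected four probabilities")
    bounds = []
    total = 0.0
    for q in probs:
        total += q
        bounds.append(total)
    if not math.isclose(bounds[-1], 1.0, abs_tol=1e-8):
        raise ValueError("probabilities must sum to 1")
    return bounds


def first_ball_outcome(u, bounds):
    """Map a uniform draw u in [0, 1] to a first-ball outcome."""
    for name, bound in zip(OUTCOMES[:-1], bounds[:-1]):
        if u <= bound:
            return name
    return OUTCOMES[-1]
```

With probabilities `[0.2, 0.3, 0.3, 0.2]`, a draw of 0.1 maps to a strike and a draw of 0.95 to a 3-2. The later balls of a frame are handled by the subroutines of the listing and are not reproduced in this sketch.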
c     This is a program which simulates bowling scores
c**********************************************************************
c     Simplifying Assumptions:
c     (1) no curve on ball
c     (2) aim straight on
c     (3) equal accuracy to left or right
c     (4) no learning effect (balls are independent)
c     (5) misses target by at most 1 pin
c**********************************************************************
      program main
      parameter (n=10000)
      dimension gscore(n),fscore(10),p(4),t(3)
      real p,mscore,sscore
      double precision drand,pp
      integer gscore,fscore,tstrike,tspare,i,j,t,k
      open(8,file='output')
      mscore=0
      sscore=0
c     Read the four first-ball probabilities and make them cumulative
      read(*,*) (p(i),i=1,4)
      p(2)=p(1)+p(2)
      p(3)=p(2)+p(3)
      p(4)=p(3)+p(4)
c     The probabilities must sum to 1 (.or. is needed for this test)
      if (p(4) .lt. .99999999 .or. p(4) .gt. 1.00000001) then
        print*, 'NOT proper selection of probability'
      endif
c     Loop over the n simulated games
      do 99 i=1,n
      do 20 k=1,10
   20 fscore(k)=0
      t(1)=0
      t(2)=0
      t(3)=0
      tstrike=0
      tspare=0
c     Loop over the 10 frames of a game
      do 88 j=1,10
   40 continue
      pp=drand(0)
c     This is a strike
      if (pp .le. p(1)) then
        fscore(j)=fscore(j)+15
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+15
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+15
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        tstrike=tstrike+1
c     This is a corner
      elseif (pp .gt. p(1) .and. pp .le. p(2)) then
        fscore(j)=fscore(j)+13
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+13
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+13
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          call corner(fscore,t,tspare,tstrike,p,j)
        else
          goto 30
        endif
c     This is a headpin
      elseif (pp .gt. p(2) .and. pp .le. p(3)) then
        fscore(j)=fscore(j)+5
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+5
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+5
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          call headpin(fscore,t,tspare,tstrike,p,j)
        else
          goto 30
        endif
c     This is a 3-2
      elseif (pp .gt. p(3) .and. pp .le. 1) then
        fscore(j)=fscore(j)+5
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+5
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+5
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          call bov32(fscore,t,tspare,tstrike,p,j)
        else
          goto 30
        endif
      endif
c     After three balls shift the counters; otherwise bowl again
   30 if (t(1) .ge. 3) then
        t(1)=t(2)
        t(2)=t(3)
        t(3)=0
        tstrike=0
      elseif (t(1) .lt. 3) then
        goto 40
      endif
   88 continue
c     The game score is the sum of the frame scores
      gscore(i)=0
      do 12 j=1,10
   12 gscore(i)=gscore(i)+fscore(j)
   99 continue
      write(8,101) (gscore(i),i=1,n)
c     Sample mean and sample variance of the simulated scores
      do 19 i=1,n
   19 mscore=mscore+gscore(i)
      mscore=mscore/n
      do 29 i=1,n
   29 sscore=sscore+(gscore(i)-mscore)**2
      sscore=sscore/(n-1)
      write(*,*) mscore,sscore
  101 format(1x,40i5)
      stop
      end
c     This subroutine calculates the outcomes of a corner pin
      subroutine corner(fscore,t,tspare,tstrike,p,j)
      dimension fscore(10),p(4),t(3)
      real p
      double precision drand,pp
      integer fscore,tstrike,tspare,j,t
   10 pp=drand(0)
      if (pp .le. p(3)) then
        fscore(j)=fscore(j)+2
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+2
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+2
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          tspare=tspare+1
        endif
      elseif (pp .gt. p(3) .and. pp .le. 1) then
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
        endif
        if (tstrike .eq. 2 .and. j .le. 8) then
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          goto 10
        endif
      endif
      return
      end
c     This subroutine calculates the outcomes of a headpin
      subroutine headpin(fscore,t,tspare,tstrike,p,j)
      dimension fscore(10),p(4),t(3)
      real p
      double precision drand,pp
      integer fscore,tstrike,tspare,j,t
   10 pp=drand(0)
      if (pp .le. p(2)) then
        fscore(j)=fscore(j)+5
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+5
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+5
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          call hp32(fscore,t,tspare,tstrike,p,j)
        endif
      elseif (pp .gt. p(2) .and. pp .le. p(3)) then
        fscore(j)=fscore(j)+3
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+3
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+3
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          call hp32(fscore,t,tspare,tstrike,p,j)
        endif
      elseif (pp .gt. p(3) .and. pp .lt. (1+p(3))/2) then
        fscore(j)=fscore(j)+2
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+2
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+2
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          call hp32(fscore,t,tspare,tstrike,p,j)
        endif
      elseif (pp .gt. (p(3)+p(4))/2) then
        t(1)=t(1)+1
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
        endif
        if (tstrike .eq. 2 .and. j .le. 8) then
          t(3)=t(3)+1
        endif
        if (t(1) .lt. 3) then
          goto 10
        endif
      endif
      return
      end
c     This subroutine calculates the outcomes of a 3-2
      subroutine bov32(fscore,t,tspare,tstrike,p,j)
      dimension fscore(10),p(4),t(3)
      real p
      double precision drand,pp
      integer fscore,tstrike,tspare,j,t
   10 pp=drand(0)
      if (pp .lt. (p(1)+p(2))/2) then
        fscore(j)=fscore(j)+10
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+10
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+10
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .le. 3) then
          tspare=tspare+1
        endif
        return
      elseif (pp .gt. (p(1)+p(2))/2 .and. pp .le. p(2)) then
        fscore(j)=fscore(j)+8
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+8
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+8
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          call corner(fscore,t,tspare,tstrike,p,j)
        endif
        return
      elseif (pp .gt. p(2) .and. pp .le. p(3)) then
        fscore(j)=fscore(j)+5
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+5
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+5
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          call hp32(fscore,t,tspare,tstrike,p,j)
        endif
        return
      elseif (pp .gt. p(3) .and. pp .le. (p(3)+p(4))/2) then
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
        endif
        if (tstrike .eq. 2 .and. j .le. 8) then
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          goto 10
        endif
      else
        fscore(j)=fscore(j)+5
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+5
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+5
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          call b3232(fscore,t,tspare,tstrike,p,j)
        endif
      endif
      return
      end
c     This subroutine calculates the outcomes of head pin and a 3-2
      subroutine hp32(fscore,t,tspare,tstrike,p,j)
      dimension fscore(10),p(4),t(3)
      real p
      double precision drand,pp
      integer fscore,tstrike,tspare,j,t
   10 pp=drand(0)
      if (pp .le. p(2)) then
        fscore(j)=fscore(j)+5
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+5
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+5
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        return
      elseif (pp .gt. p(2) .and. pp .le. p(3)) then
        fscore(j)=fscore(j)+3
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+3
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+3
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
      elseif (pp .gt. p(3) .and. pp .le. (p(3)+p(4))/2) then
        fscore(j)=fscore(j)+2
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+2
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+2
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
      else
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
      endif
      return
      end
c     This subroutine calculates the outcomes of a 3-2 on both sides
      subroutine b3232(fscore,t,tspare,tstrike,p,j)
      dimension fscore(10),p(4),t(3)
      real p
      double precision drand,pp
      integer fscore,tstrike,tspare,j,t
      pp=drand(0)
      if (pp .le. p(3)) then
        fscore(j)=fscore(j)+5
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+5
        endif
        t(1)=t(1)+1
        return
      elseif (pp .gt. p(3) .and. pp .le. p(4)) then
        t(1)=t(1)+1
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
        endif
        return
      endif
      return
      end
Bibliography
[1] D'Agostino, R. B. and Stephens, M. A., Goodness of Fit Techniques, New York,
Marcel Dekker, 1986.
[2] Daniel, Wayne W., Applied Nonparametric Statistics, Second Edition, Boston,
PWS-Kent, 1990.
[3] Box, G. E. P., Hunter, W. G. and Hunter, J. S., Statistics for Experimenters, New
York, John Wiley & Sons, 1978.
[4] Neter, J., Wasserman, W. and Kutner, M. H., Applied Linear Statistical Models,
Second Edition, Homewood, Richard D. Irwin, 1985.
[5] Pierce, D. A. and Kopecky, R. J., Testing goodness of fit for the distribution of
errors in regression models, Technical Report Symp. 16, Department of Statistics,
Stanford University, 1978.
[6] Huynh, H., Some Approximate Tests for Repeated Measurement Designs,
Psychometrika, Vol. 43, No. 2, June, 1978, 161-175.
[7] Let's Go Bowling, Canadian 5 Pin Bowlers' Association.
[8] Official Rules and Regulations Governing The Sport of 5 Pin Bowling, Canadian
5 Pin Bowlers' Association, 1987.
[9] 5 Pin Bowling Specifications and Standards Manual, Canadian 5 Pin Bowlers'
Association, 1987.
[10] Royall, R. M., The effect of sample size on the meaning of significance tests, The
American Statistician, Vol. 40, No. 4, November, 1986, 313-315.