AN ANALYSIS OF BOWLING SCORES AND HANDICAP SYSTEMS

by

Wenjun Chen
M.Sc. Beijing Institute of Technology

A PROJECT SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in the Department of Mathematics and Statistics of Simon Fraser University

© Wenjun Chen 1991
SIMON FRASER UNIVERSITY
August, 1991

All rights reserved. This work may not be reproduced in whole or in part, by photocopy or other means, without the permission of the author.

APPROVAL

Name: Wenjun Chen
Degree: Master of Science
Title of project: An Analysis of Bowling Scores and Handicap Systems
Examining Committee:
Dr. A. Lachlan, Chair
Dr. R. Routledge, Committee Member
Dr. C. Dean, External Examiner
Date Approved: November 5, 1991

Abstract

This project uses the Box-Cox transformation and goodness of fit techniques to find a model describing the distribution of bowling scores, based on the analysis of actual bowling data. We have found that the logarithm of bowling scores is approximately normally distributed with a constant variance. A simulation of bowling scores based on a Fortran program confirms this model. The project also uses Monte Carlo methods to investigate the effect of various handicap systems under the proposed model.

Acknowledgements

My sincere thanks to Dr. Tim Swartz for his invaluable advice and guidance and for the time that he spent with me in the preparation of this project. I would also like to thank him for his supervision during my studies. Thanks also go to Dr. R. Routledge, Dr. K. L. Weldon and Dr. C. Dean for their assistance. I would also like to express my thanks to Dr. M. A. Stephens and to F. Bellavance for their guidance and advice in doing the project. I would also like to thank my fellow students and friends for any help that they gave me. I would also like to extend my thanks to Ms. Sherry Swartz, who supplied the data set used in this project. It was a pleasure to work with such a "good" data set.
Finally, I acknowledge with humble gratitude the constant encouragement and every possible help from my mother, father, brothers and sisters. Without them, I would never have become what I am today. I would also like to acknowledge all the help and support extended by my husband.

Contents

Abstract
Acknowledgements
Dedication
Contents
List of Tables
List of Figures
1 Introduction
2 Description of Five Pin Bowling and The Data Set
  2.1 Five Pin Bowling
  2.2 The Data Set
3 Characterizing The Bowling Scores
  3.1 Initial Exploration of the Data Set
  3.2 Mean-Variance Relationship
  3.3 Profile Analysis of Bowling Scores
  3.4 Normality of the Bowling Scores
4 Modelling The Bowling Scores
  4.1 Box-Cox Transformation
  4.2 Goodness of Fit Technique in Testing for Normality
  4.3 The Results of Goodness of Fit for Bowling Scores
  4.4 The Proposed Model for Bowling Scores
  4.5 The Property of Equal Variances
5 Simulated Bowling Scores
  5.1 Assumptions of Simulation
  5.2 The Results of Simulation
6 Handicap Systems
  6.1 The Remington Rand Study
  6.2 The Monte Carlo Study
  6.3 The Results of Our Study
Appendix
  A The Data Set
  B A Program which Simulates Bowling Scores
Bibliography

List of Tables

2.1 Two Scoring Sheets
3.1 Brief Summary of The Data Set
4.1 The Results of Bowling Scores
4.2 The Results of Logarithms of Bowling Scores
4.3 The Sample Averages and Sample Variances of Logarithm Data
5.1 The Results of Simulation (N = 10000)
6.1 Estimated Probabilities of the Favourite Winning

List of Figures

2.1 Pin Count
3.1 The Scatter Plots of Scores vs Games for Players 1 to 20
3.2 The Scatter Plots of Scores vs Games for Players 21 to 40
3.3 The Plot of Standard Deviation vs Average
3.4 The Samples of SL and SH
3.5 Possible Standard Deviation vs Average Curve
3.6 Overall Average Plots
3.7 Plots Based on Scores of All Players
3.8 The Histograms of Bowling Scores for Players 1 to 20
3.9 The Histograms of Bowling Scores for Players 21 to 40
3.10 The Distribution of Standardized Bowling Scores
4.1 Normality of the Logarithms of Bowling Scores
4.2 The Sample Variances vs Averages Plot for Logarithm Data
4.3 Normality of the Logarithm of Bowling Scores ($\sigma^2 = 0.0328$)
5.1 Tree Diagram of Possible Outcomes
5.2 The Results of p1=0.15, p2=0.15, p3=0.25, p4=0.45
5.3 The Results of p1=0.15, p2=0.15, p3=0.1, p4=0.6
5.4 The Results of p1=0.3, p2=0.3, p3=0.1, p4=0.3
6.1 The Probability of the Favourite Winning the Game

Chapter 1

Introduction

In the fall semester of 1990, I contacted Dr. Tim Swartz about analyzing a data set for my M.Sc. project. Several days later, he mentioned that he had been watching television and noticed that, unlike most major sports, professional bowling provides very little quantitative analysis to the viewing audience. In particular, it seemed that all that could be said about a bowler X was that he/she maintained a bowling average of Y. This observation inspired us to ask the following questions: Can more be said about a bowler's tendencies beyond reporting his/her bowling average? Do bowling scores follow a particular distribution? What impact might this have on handicapping? What are the handicap systems currently in use? Are they fair?

In order to pursue these questions we required a practical data set of bowling scores. Sherry Swartz, Dr. Tim Swartz's sister, has participated in a bowling league for several years. She mailed us a total of approximately 2300 five pin bowling scores from an actual league. These are the scores upon which our analysis is based. In our attempt to gain a quantitative understanding of bowling scores we also came in contact with several Canadian bowling agencies and individuals.
We describe these helpful encounters throughout the project.

Chapter 2 introduces the game of five pin bowling; we describe the rules, the equipment and the scoring procedure. In this chapter we also describe the data set on which my project is based.

In Chapter 3 we carry out an exploratory data analysis. Through the use of simple descriptive statistics, plots and tests we gain some feeling for the data set. This exploratory work also provides ideas regarding future directions for our analysis.

Chapter 4 reviews the Box-Cox transformation

$$x^{(\lambda)} = \begin{cases} (x^{\lambda} - 1)/\lambda & \text{if } \lambda \neq 0 \\ \log x & \text{if } \lambda = 0 \end{cases}$$

which is used in my project to maximize the p-value in a test of normality involving bowling scores. The main result of this chapter is that the logarithms of bowling scores are approximately normally distributed.

In Chapter 5 a simulation experiment based on a Fortran program is presented. It is used to verify the proposed model obtained in Chapter 4. Adjustable parameters which describe different bowling skills are considered.

In Chapter 6, the Remington Rand handicap study is described. We use our model to investigate the effect of various handicap systems on the probability of winning and then compare our results with the Remington Rand study.

Chapter 2

Description of Five Pin Bowling and The Data Set

This chapter contains an introduction to the sport of five pin bowling with an emphasis on the scoring procedure. We also describe the data set upon which our study is based.

2.1 Five Pin Bowling

Bowling is one of the most popular sports for participants of all ages, regardless of sex, shape or physical condition. The two most popular types of bowling in Canada are ten pin bowling and five pin bowling. As the name suggests, five pin bowling is played with five pins. A game of five pin bowling consists of ten frames and should be played with regulation equipment on regulation lanes.
Each frame consists of a maximum of three legally delivered balls rolled by the same bowler down the lane in succession. If a bowler knocks down all 5 pins in fewer than 3 attempts, the frame is considered complete. An exception occurs in the tenth frame, where 3 balls are always delivered. If in the tenth frame all 5 pins are knocked down during the first or second attempt, then the 5 pins are reset.

The object of the game is to score as many points as possible in ten frames. The score is the total number of points corresponding to the pins knocked down in the ten frames (plus bonuses). The points assigned to each pin are recorded in Figure 2.1.

Figure 2.1: Pin Count

The following are some common scoring terms:

(1) strike: All pins are knocked down by the first ball bowled in a frame.
(2) spare: All pins are knocked down by the first two balls bowled in a frame.
(3) corner pin: All pins are knocked down by the first ball with the exception of a single corner pin.
(4) head pin: The head pin is picked out by the first ball bowled in a frame.

The basic rules for scoring a game of bowling are as follows:

(1) no strike or spare: Merely add the total points corresponding to the pins knocked down on the three balls.
(2) strike: Fifteen points plus a bonus of the number of points accumulated by the next two balls rolled.
(3) spare: Fifteen points plus a bonus of the number of points accumulated by the next ball rolled.

A perfect game of 450 is scored by recording strikes in each of the ten frames. In addition this requires knocking down all 5 pins on both the second and third balls of the tenth frame.

An example of the scoring for a typical bowling game is given below.

frame 1: The player knocks down all pins except the headpin using three balls. Score: 10 points.

frame 2: The player knocks down all the pins using three balls. Count: 15 points. Score: 25 points.

frame 3: The player uses only two balls to knock down all pins. This is called a spare. Count: 15 points. However, as each frame's count is for three balls, the player adds the count from his first ball in the next frame. Score: INCOMPLETE.

frame 4: The player knocks down the 3 pin with his first ball. We therefore add 3 points to the 15 points of his spare, making a count of 18 for the third frame. Score: 43 points. The player then knocks down the headpin (5) and the left 2 pin. This makes his count for the fourth frame 3+5+2=10 points. Score: 53 points.

frame 5: The player records a strike. Count: 15 points plus points scored with the next two balls bowled. Score: INCOMPLETE.

frame 6: The player makes another strike and credits the fifth frame with 15 points. Fifth frame score: still incomplete. Sixth frame count: 15 points plus points scored with the next two balls bowled. Score: INCOMPLETE.

frame 7: With the first ball, the player picks the headpin (5). This completes the fifth frame. Count: 15+15+5=35 points. Fifth frame score: 88. On the second try he scores 5 points. The player adds these 10 points to the strike count of the sixth frame. Sixth frame count: 25. Score: 113. Then the player knocks down the 2 pin, making his count for the seventh frame 12 points. Score: 125 points.

frame 8: A strike is recorded. Count: 15 points plus points scored with the next two balls bowled. Score: INCOMPLETE.

frame 9: With the first ball, the player picks the 3 pin. With his second ball, he counts 12 points, knocking down all remaining pins for a spare. This completes the eighth frame. Count: 15+3+12=30 points. Eighth frame score: 155. Ninth frame score: INCOMPLETE.

frame 10: A strike is recorded and 15 points are credited to the ninth frame. Ninth frame count: 15+15=30. Score: 185. The strike in the tenth frame permits two additional attempts. The player gets the three pin on his second ball and the five on his third ball. Player's tenth frame count: 15 (for the strike)+3+5=23. Game score: 208 points.

Scoring Sheet

Frame   Mark   Count   Score
1               10      10
2               15      25
3       /       18      43
4               10      53
5       X       35      88
6       X       25     113
7               12     125
8       X       30     155
9       /       30     185
10      X       23     208

Here "X" means a strike and "/" means a spare.

Handicapping: In a league in which the range of abilities is wide, the league may adopt handicap rules. A handicap attempts to "even up" the chance of winning between opposing teams. There are two basic kinds of handicapping currently used by the Canada Five Pin Bowlers' Association.

A. Individual handicap systems:

(1) 80% of the difference between the bowler's average and a base figure of 225.
(2) 66% of the difference between the bowler's average and a base figure of 200.
(3) 75% of the difference between the bowler's average and a base figure of 200.
(4) 75% of the difference between the bowler's average and a base figure of 220.

The handicap methods listed above are applied to a 160 average bowler to produce the following handicaps.

System        Handicap
80% of 225    52
66% of 200    27
75% of 200    30
75% of 220    46

These handicaps are then added to a bowler's gross score at the end of each game to give a net score. A bowler whose average exceeds the base figure is assigned a zero handicap.

B. Team handicap systems:

(1) Team handicaps are determined by adding the averages of the team players for each of two opposing teams. Then 80% of the difference between the team totals is taken as the handicap for the weaker team in each individual game.
(2) In deciding the three game handicap total, multiply the single game team handicap by 3. This total would be added to the total team score for the three games.
(3) When a team's strength is not identical for the three games, the three game handicap shall be the total of the handicap allowed for each of the three games.
This may happen, for example, when a team has 6 bowlers, only 5 are permitted to bowl in a given game and some rotation scheme is used.

2.2 The Data Set

The data set of bowling scores on which my project is based came from an actual five pin bowling league in Kitchener/Waterloo. The scoring sheets were provided to us by the league scorer (Ms. Sherry Swartz). Each scoring sheet is a weekly summary showing the results between two competing teams. The scorer recorded the league name, the date and the particular lanes used by the competing teams. For each team the scoring sheet provides us with the team number, team players, individual and team handicaps and individual and team totals. From the scoring sheet one can also determine the number of points that each team scores for the week. A team obtains two points for every game won and a single point for having the larger grand total. Therefore 7 points are shared by the two competing teams, and the maximum number of points that a team can score in one week is 7. Table 2.1 illustrates the format of two sheets.

There are 4 scoring sheets provided each week. This represents a league consisting of 8 teams with 5 players per team. However, players change teams from time to time and new players occasionally join the league. For example, Mary was a member of team 8 on December 19, 1990 and became a member of team 5 on January 2, 1991. Also, Diana joined team 4 on October 31, 1990. It is also possible that some players might choose to leave the league. The total number of players recorded in the scoring sheets exceeds 40. However, some players only participated in three or six games.

The data set was collected weekly from September 5, 1990 to January 9, 1991 excluding the week of Christmas. This resulted in a total of 19 weeks. The players bowled 3 games every Wednesday evening over the span of 19 weeks, giving a maximum of 57 bowling scores per player.
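The weekly points rule described earlier in this section (two points per game won, one point for the larger grand total) can be illustrated with a minimal sketch. The function names and the team totals are hypothetical, and the even split of points on a tie is an assumption, since the text does not say how ties are handled.

```python
def award(points, total_a, total_b):
    """Give `points` to the higher total; split evenly on a tie (assumption)."""
    if total_a > total_b:
        return points, 0
    if total_b > total_a:
        return 0, points
    return points / 2, points / 2


def weekly_points(totals_a, totals_b):
    """Return (points_a, points_b) for one week of three games.

    totals_a and totals_b hold the three team game totals (handicap
    included) of the two competing teams, in the order bowled.
    """
    points_a = points_b = 0
    # Two points are at stake in each of the three games.
    for game_a, game_b in zip(totals_a, totals_b):
        won_a, won_b = award(2, game_a, game_b)
        points_a += won_a
        points_b += won_b
    # One further point goes to the team with the larger grand total.
    won_a, won_b = award(1, sum(totals_a), sum(totals_b))
    return points_a + won_a, points_b + won_b


# Hypothetical team totals, for illustration only.
result = weekly_points([1100, 1210, 1150], [1160, 1050, 1270])
```

Whatever the outcome, the two entries of the result always sum to 7, which matches the statement that 7 points are shared by the two teams each week.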
The scoring sheets contain all the information about individual and team scores. However, because the members of teams varied from week to week, team scoring will not be the focus of this project. We will instead concentrate on individual scores. In our study we have selected the forty players who have been the most active in the league. Therefore for each player we will have a maximum of 57 game scores, possibly with some missing values. This results in approximately 2300 game scores. Appendix A lists all data used in the analysis.

Table 2.1: Two Scoring Sheets

[Two scoring sheets, both for Lanes 19 & 20 on Dec. 19, 1990: Team 8 vs Team 7 and Team 6 vs Team 5. Each sheet lists, for every player, the average, handicap, Game 1, Game 2 and Game 3 scores and total, followed by the team Total, Team Handicap, Grand Total, Games Won and Points Won.]

The players in the study vary in sex, age and experience.
Unfortunately, we could not get information concerning these variables and we will therefore not take these factors into account. The only discriminating factors which we will consider in our analysis of bowling scores are overall ability, changes over weeks and changes between the 3 games bowled in the same evening.

Chapter 3

Characterizing The Bowling Scores

Before a formal analysis is given in each section of this chapter, we present an exploratory analysis using simple descriptive statistics and plots. We begin with an informal graphical exploration of the data, as this is often helpful in highlighting special features of the data and in determining the direction in which we should continue our analysis. We then attempt to characterize the data set formally by considering the mean-variance relationship, the profile analysis and the property of normality.

3.1 Initial Exploration of the Data Set

The scatter plots of scores vs games for each of the forty bowlers are given in Figure 3.1 and Figure 3.2. The two figures use the same vertical and horizontal scales. From studying the two figures, we see that the bowling game scores vary widely for different players. The maximum score is 329, which was obtained by player 25 in game 40, and the minimum score is 71, which was obtained by player 17 in game 2. The bowling skills are quite different from player to player. For example players 2, 11, 12 and 17 tend to have lower bowling scores, having average scores of 140, 120, 131 and 126; players 5, 10, 25 and 29 have higher bowling scores, having average scores of 204, 198, 197 and 204. The variations of scores are also quite different for different players. For instance players 1, 10, 22 and 25 have ranges of 165, 157, 138 and 189. Players 11, 14 and 16 have ranges of 74, 99 and 87. It also seems to be the case that if a player has a high average, then he/she also tends to have a high variation. We will investigate the mean-variance relationship in more detail in Section 3.2.

[Figure 3.1: The Scatter Plots of Scores vs Games for Players 1 to 20 (one panel per player)]

[Figure 3.2: The Scatter Plots of Scores vs Games for Players 21 to 40 (one panel per player)]

Missing values can also be noted from the plots. Some players, such as players 1, 5 and 6, bowled from the beginning to the end of the bowling season. They do not have any missing values. Other players were occasionally absent even though I chose the 40 players with the fewest missing values. For example the scores of player 20 are missing between games 34 and 54.

In order to get a more quantitative understanding of the bowling scores for different players, Table 3.1 gives a brief summary of the data set including the average, sample variance and actual number of games that each player bowled.

Table 3.1: Brief Summary of The Data Set

Player  Games  Avg.    S^2       Player  Games  Avg.    S^2
1       57     195.9   1415.5    21      48     190.5    968.7
2       54     138.8    492.5    22      51     203.5   1198.8
3       48     160.2    432.7    23      45     188.6   1026.4
4       48     187.7    911.0    24      54     180.0    937.5
5       57     204.3   1130.8    25      51     196.7   1699.0
6       57     145.0    672.4    26      57     157.7    689.0
7       51     155.9    938.6    27      57     140.9    712.8
8       54     180.3    948.3    28      45     119.9    694.6
9       57     162.4   1088.8    29      54     203.8   1233.1
10      57     197.5   1237.2    30      57     162.2    576.4
11      51     119.7    312.4    31      51     163.2    703.1
12      57     131.2    639.3    32      54     144.7    608.2
13      54     162.3    918.8    33      57     151.6   1015.1
14      51     142.6    447.1    34      54     164.2    928.8
15      48     153.0   1016.8    35      48     136.6    856.0
16      51     143.1    411.4    36      57     171.6    891.0
17      57     126.1    705.0    37      54     128.1    800.8
18      54     163.6    797.8    38      57     169.2   1339.1
19      51     161.5    942.8    39      42     141.0    703.9
20      39     155.4    839.3    40      51     160.0   1056.4

From Table 3.1 we see that 13 players participated in all 57 games. Player 20 participated in the fewest games (39) amongst the 40 bowlers. The averages of game scores vary from 119.7 (player 11) to 204.3 (player 5). The variances of scores also vary widely, from 312.4 (player 11) to 1699.0 (player 25).

3.2 Mean-Variance Relationship

We touched on the relationship between mean and variance in Section 3.1. We now investigate the relationship in more detail. Figure 3.3 gives a plot of standard deviations vs averages for each of the 40 players. The smoothing line was obtained by using the lowess command in S-plus. As suggested in Section 3.1, Figure 3.3 shows that standard deviations tend to increase with the average.

[Figure 3.3: The Plot of Standard Deviation vs Average. Two panels, "The Plot without Smoothing Line" and "The Plot with Smoothing Line"; horizontal axis: Average, vertical axis: Standard Deviation]

In this section we use formal statistical methods to test for this phenomenon, applying several different statistical tests. To do so, we divided the standard deviations into two groups: SL and SH. SL is the set of standard deviations corresponding to the 20 lowest averages and SH is the set of standard deviations corresponding to the 20 highest averages. Figure 3.4 illustrates the two samples.

[Figure 3.4: The Samples of SL and SH; horizontal axis: Average, vertical axis: Standard Deviation]

Method 1: Mann-Whitney Test

The distribution of standard deviations of bowling scores is unknown. Therefore a non-parametric method may be preferable here.

The main assumptions of the Mann-Whitney test are:

(1) The data consist of a random sample of observations $SL_1, SL_2, \ldots, SL_{n_1}$ from a population with unknown median $M_{SL}$ and an independent random sample of observations $SH_1, SH_2, \ldots, SH_{n_2}$ from a population with unknown median $M_{SH}$.
(2) The distribution functions of the two populations differ only with respect to location, if they differ at all.

The hypotheses:

$H_0: M_{SL} = M_{SH}$
$H_1: M_{SL} < M_{SH}$

The test statistic is

$$T = S - \frac{n_1(n_1 + 1)}{2} = 164$$

where $S$ is the sum of the ranks assigned to the sample observations from the first population.

Decision rule: We reject $H_0$ at the $\alpha$ level if the computed $T$ is less than the critical value given below.

Critical Values of the Mann-Whitney test statistic ($n_1 = 20$)

p \ n2   15      16    17    18    19    20
.01      81      88    94    101   108   115
.025     91      99    106   113   120   128
.05      101     108   116   124   131   139
.10      11[?]   120   128   136   144   152

Since $T = 164$ exceeds the critical value of 152 for $n_2 = 20$, we cannot reject $H_0$ even at the $\alpha = 10\%$ level.

Method 2: Two Sample t-Test

Assumptions of the t-test:

(1) SL and SH are independent samples from normal populations.
(2) The standard deviations of SL and SH are identical.

Assumption (1) is clearly violated here. However, the t-test is robust for samples from certain non-normal distributions. Graphically, there seems to be no reason to doubt assumption (2).

The hypotheses:

$H_0: E(SL) = E(SH)$
$H_1: E(SL) < E(SH)$

We calculate

$$T = \frac{\overline{SL} - \overline{SH}}{S_p \sqrt{1/n_1 + 1/n_2}} = -1.40$$

with 38 degrees of freedom and p-value = .085, which leads us to not reject $H_0$ (there is some mild evidence against $H_0$).
Method 3: Spearman Rank Correlation Coefficient

Methods 1 and 2 tested for a difference in location between SL and SH. Both methods were simple to use but gave only mild evidence of a difference. We now use Spearman's rank correlation method to determine whether there is evidence of increasing standard deviation with respect to the mean. This test imposes more structure than the previous 2 methods, which simply divided the data in half.

The assumptions of Spearman's test are that the data consist of a random sample of $n$ pairs of observations and that each pair of observations represents two measurements taken on the same object or individual.

Let $(X_i, Y_i)$ denote the average and standard deviation of player $i$, and let $R(X_i)$ ($R(Y_i)$) be the rank of $X_i$ ($Y_i$) relative to all other values of $X$ ($Y$). If ties occur among the $X$'s ($Y$'s), each tied value is assigned the average of the rank positions for which it is tied.

The hypotheses:

$H_0$: $X$ and $Y$ are independent
$H_1$: There is a direct relationship between $X$ and $Y$

The test statistic is

$$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} = 0.724$$

where $d_i^2 = [R(X_i) - R(Y_i)]^2$ and $n = 40$.

Decision rule: We reject $H_0$ at the $\alpha$ level if the computed value of $r_s$ is greater than the critical value given below.

Critical Values ($n = 40$)

$\alpha$      .25    .10    .05    .01    .005
$r(\alpha)$   .110   .207   .264   .368   .507

Therefore we reject $H_0$ even at the $\alpha = 0.005$ level.

From the exploratory graphical approach and the third test there seems to be some evidence that the standard deviation increases as the average increases. Possible reasons for the phenomenon might be as follows:

(1) In bowling, strikes and spares affect scores dramatically. Players with high averages have more chance of getting a strike or a spare. Therefore the variations of their scores are wider than for players with lower average scores.

(2) Bowling scores range from 0 to 450.
If a player obtains 0 for every game, the average score of the player is 0 and the variance is also 0. The same holds if a player obtains 450 for every game: the average score of the player is 450 and the variance is 0. Using these two fixed points we might therefore expect the standard deviation versus average curve to appear as in Figure 3.5. The maximum point of variation is not intended to occur at any specific point along the horizontal axis. For our case study, we collected bowling scores with average scores between 120 and 205. Therefore, in terms of Figure 3.5, there is an increasing relationship between standard deviation and average, as we expected.

[Figure 3.5: Possible Standard Deviation vs Average Curve; horizontal axis: Average (0 to 450), vertical axis: Standard Deviation]

3.3 Profile Analysis of Bowling Scores

The bowling season of the league from which the data set was collected took place from September 1990 to April 1991. As described in Section 2.2 the players bowled a maximum of 57 games. Three games are bowled in one evening (Wednesday evening) every week. We are therefore interested in answering the following questions: Do the players improve their bowling skills from week to week, or from game to game each week? The answers to these questions are important as they give some indication of whether the scores for each bowler are identically distributed.

Before proceeding further, we introduce some convenient terminology. In every bowling evening, the players bowled three games (ordered games). The corresponding scores are called the first game score, the second game score and the third game score respectively. By the week average we mean the average game score for each week. Therefore we have at most 57 game scores, 19 first game scores, 19 second game scores, 19 third game scores and 19 week averages for each player.

Let $x_{ijk}$ be the bowling score for the $i$th player ($i = 1, 2, \ldots, 40$) in the $j$th game ($j = 1, 2, 3$) in the $k$th week ($k = 1, 2, \ldots, 19$). Then

$$\bar{x}_{\cdot jk} = \frac{1}{40} \sum_{i=1}^{40} x_{ijk}$$

is the overall average game score of game $j$ in week $k$,

$$\bar{x}_{\cdot\cdot k} = \frac{1}{120} \sum_{i=1}^{40} \sum_{j=1}^{3} x_{ijk}$$

is the overall average of week $k$, and

$$\bar{x}_{\cdot j \cdot} = \frac{1}{760} \sum_{i=1}^{40} \sum_{k=1}^{19} x_{ijk}$$

is the overall average of the $j$th game.

In order to investigate the questions posed earlier we give the plots of overall averages of the 40 players. If players improved their bowling skills significantly from week to week or from game to game, we could see it from the overall average plots. Figure 3.6 gives the plots of overall week averages vs weeks and overall average game scores vs games. From Figure 3.6 we see that there is some indication of improvement over the three games in a given week (2nd plot) and possibly some very mild evidence of improvement over the season (1st plot).

[Figure 3.6: Overall Average Plots. Two panels, "Overall Average Week Scores vs Weeks" and "Overall Average Game Scores vs Games"]

Figure 3.7 gives the plots of scores vs games and scores vs weeks. We cannot see any indication of improvement over the three games, nor of improvement over weeks, from these plots. However we will need to test these conjectures formally, as these plots do not convey the variability associated with each observation.

[Figure 3.7: Plots Based on Scores of All Players. Two panels, "Scores vs Weeks" and "Scores vs Games"]
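The overall week and game averages defined earlier in this section can be computed along the following lines. This is only a sketch: the dictionary layout and the function name are hypothetical, the scores are invented for illustration, and missing games are handled by averaging over the scores that are present, which is an assumption (the formulas in the text divide by the full counts).

```python
from collections import defaultdict
from statistics import fmean


def overall_averages(scores):
    """Return (week averages, game averages) pooled over all players.

    `scores` maps (player, game, week) to a score; a missing game is
    simply absent from the dictionary (assumption).
    """
    by_week = defaultdict(list)
    by_game = defaultdict(list)
    for (player, game, week), score in scores.items():
        by_week[week].append(score)   # pools players and games within a week
        by_game[game].append(score)   # pools players and weeks within a game
    week_avg = {k: fmean(v) for k, v in sorted(by_week.items())}
    game_avg = {j: fmean(v) for j, v in sorted(by_game.items())}
    return week_avg, game_avg


# Hypothetical scores for two players, not taken from Appendix A.
scores = {
    (1, 1, 1): 150, (1, 2, 1): 160, (1, 3, 1): 170,
    (2, 1, 1): 130, (2, 2, 1): 140, (2, 3, 1): 150,
    (1, 1, 2): 180, (1, 2, 2): 190, (1, 3, 2): 200,
}
week_avg, game_avg = overall_averages(scores)
```

Plotting `week_avg` against the week number and `game_avg` against the game number gives the kind of display described for Figure 3.6.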
Because the players have quite different average bowling scores, we stratify our population by choosing each player as a subject. For $i = 1, 2, \ldots, 40$ we test:

$H_{i0}$: $u_{i1} = u_{i2} = u_{i3}$
$H_{i1}$: not the case that $u_{i1} = u_{i2} = u_{i3}$

where $u_{i1}$ = mean score of player $i$ for the first game, $u_{i2}$ = mean score of player $i$ for the second game and $u_{i3}$ = mean score of player $i$ for the third game. In a similar manner we also test for a difference over weeks. The null hypothesis states that there is no difference between the means of the 19 weekly scores.

Because we are interested in both the game effect and the week effect for each player we use the two factor analysis of variance model for repeated measurement designs. We note that analysis of variance models are robust with respect to small departures from normality. Using the statistical package SAS we obtain a p-value of 4.2% for a difference between games and a p-value of 17.5% for a difference between weeks. Therefore there seems to be at most mild evidence for an effect due to games and no evidence for an effect due to weeks. We will therefore assume these effects do not exist.

3.4 Normality of the Bowling Scores

As mentioned earlier it seems that currently the only quantitative comment that is routinely made concerning a bowler's ability is the reporting of his/her bowling average. We would like to do more than this by possibly describing the distribution from which bowling scores arise. Initially we conjectured that bowling scores for each individual are approximately normally distributed with unknown mean and variance. We made this conjecture as bowling scores arise as the sum of scores of 10 frames, which suggests that the central limit theorem may approximately hold here. Figures 3.8 and 3.9 show the histograms of bowling scores for each player. From Figures 3.8 and 3.9, it seems that some histograms are approximately normally distributed (e.g. histogram for player 29).
However more often than not, the histograms seem to have a long right tail (e.g. histogram for player 22). We will use a graphical method to pool the results of each of the 40 bowlers to determine whether the bowling scores are approximately normally distributed. The null hypothesis is:

$H_0$: $x_{ijk} \sim N(\mu_i, \sigma_i^2)$, $i = 1, 2, \ldots, 40$.

Pierce[5] has suggested a method of combining tests based on several samples, for testing $H_0$: the sample comes from a distribution $F(x_i, \theta_i)$, with $\theta_i$ containing unknown location and/or scale parameters $\mu_i$ and $\sigma_i$. The true value of these parameters may be different for each test. For sample $i$, let $\hat{\mu}_i$ and $\hat{\sigma}_i$ be the maximum likelihood estimates. Define standardized values

$$w_{ijk} = (x_{ijk} - \hat{\mu}_i)/\hat{\sigma}_i, \quad i = 1, 2, \ldots, 40 \text{ (in our case)}.$$

Based on the proposal of Pierce, the $w_{ijk}$ for all 40 samples should be pooled to form one large sample of size $n = \sum_{i=1}^{40} n_i$ where $n_i$ is the size of the $i$th sample. The limiting distribution of $w_{ijk}$ will be the same as its limiting distribution for individual samples. Therefore the above hypothesis is changed to the null hypothesis:

$H_0$: $w_{ijk} \sim N(0, 1)$.

Figure 3.10 shows the histogram of standardized and pooled bowling scores $w_{ijk}$ for all players and the q-q plot against the standard normal distribution. A line of zero intercept and unit slope is added to the q-q plot so that it is easy to judge whether the $w_{ijk}$ are approximately normally distributed. Both the histogram and the q-q plot suggest that bowling scores are skewed to the right. In the q-q plot the left tail departs much more from the straight line than the right tail. Therefore the bowling scores are not approximately normally distributed. In Chapter 4 we will use the Box-Cox transformation and formal statistical methods to find a better model for the bowling scores.
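The standardize-and-pool step described in Section 3.4 can be sketched in Python as follows. This is a minimal illustration, not the procedure actually run for the project: the function name `standardize_and_pool` and the toy scores are assumptions, and the maximum likelihood estimate of the standard deviation is taken to be the one that divides by $n_i$.

```python
import math


def standardize_and_pool(samples):
    """Standardize each player's scores with that player's own MLEs and pool.

    samples: list of per-player score lists (each with at least two
    distinct values). Returns one flat list of w values.
    """
    pooled = []
    for xs in samples:
        n = len(xs)
        mu_hat = sum(xs) / n
        # Maximum likelihood estimate of sigma: divide by n, not n - 1.
        sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in xs) / n)
        pooled.extend((x - mu_hat) / sigma_hat for x in xs)
    return pooled


# Hypothetical scores for two players of different ability.
w = standardize_and_pool([[100, 120, 140], [200, 210, 250, 260]])
```

The pooled values `w` are what would be passed to a histogram or q-q plot of the kind shown in Figure 3.10.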
[Figure 3.8: The Histograms of Bowling Scores for Players 1 to 20 (one panel per player; horizontal axis: Bowling Scores)]

[Figure 3.9: The Histograms of Bowling Scores for Players 21 to 40 (one panel per player; horizontal axis: Bowling Scores)]

[Figure 3.10: The Distribution of Standardized Bowling Scores (panels: Histogram of Standardized Values, horizontal axis Standardized Bowling Score; Q-Q Plot, horizontal axis Quantiles of Standard Normal)]

Chapter 4

Modelling The Bowling Scores

In Chapter 3, we found that the bowling scores are not approximately normally distributed. In this chapter, we will use the Box-Cox transformation technique to find an improved model for the bowling scores.

4.1 Box-Cox Transformation

In general, the Box-Cox family of transformations is given by

$$y^{(a)} = \begin{cases} \dfrac{x^{a} - 1}{a} & \text{if } a \neq 0 \\ \ln(x) & \text{if } a = 0 \end{cases}$$

where the transformed $y^{(a)}$ is "more normal" than $x$.
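For concreteness, the transformation just defined can be written as a short Python function. This is only a sketch; the name `box_cox` is an assumption of the example. The case $a = 0$ is the limit of $(x^a - 1)/a$ as $a$ tends to 0, which is why the two branches join continuously.

```python
import math


def box_cox(x, a):
    """Box-Cox transform of a positive value x with parameter a."""
    if x <= 0:
        raise ValueError("the Box-Cox transformation requires x > 0")
    if a == 0:
        return math.log(x)
    return (x ** a - 1.0) / a


# A score of 200 under the log transformation (a = 0) and the identity-like a = 1.
log_score = box_cox(200.0, 0)
shifted_score = box_cox(200.0, 1)
```

With `a = 1` the transform is just a shift by one unit, so it leaves the shape of the distribution unchanged; values of `a` below 1 pull in the right tail.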
In our case let

$$y_{ijk}^{(a)} = \begin{cases} \dfrac{x_{ijk}^{a} - 1}{a} & \text{if } a \neq 0 \\ \ln(x_{ijk}) & \text{if } a = 0 \end{cases}$$

where $i$ stands for one of the players 1 through 40, $j$ stands for one of the games 1 through 3, and $k$ stands for one of the weeks 1 through 19. We consider different parameters $a$ in order to maximize the p-value in testing the hypotheses that $y_{ijk}^{(a)} \sim N(\mu_i, \sigma_i^2)$, $i = 1, 2, \ldots, 40$, $j = 1, 2, 3$ and $k = 1, 2, \ldots, 19$.

4.2 Goodness of Fit Technique in Testing for Normality

Suppose that a random sample of size $n$ is given by $X_1, X_2, \ldots, X_n$, and let $X_{(1)} < X_{(2)} < \cdots < X_{(n)}$ be the order statistics. Suppose that the distribution of $X$ is $F(x)$. The empirical distribution function (edf) for the sample is defined by

$$F_n(x) = \frac{\text{number of observations} \le x}{n}, \quad -\infty < x < \infty.$$

Edf statistics are a class of goodness of fit statistics which measure the difference between $F_n(x)$ and $F(x)$. The Anderson-Darling edf statistic is defined by

$$A^2 = n \int_{-\infty}^{\infty} [F_n(x) - F(x)]^2 \, [F(x)(1 - F(x))]^{-1} \, dF(x).$$

The $A^2$ statistic is a general purpose (omnibus) goodness of fit statistic, although on the whole it is most powerful when $F(x)$ departs from the true distribution in the tail. A modified statistic $A^{2*}$ is used in the test for normality with $\mu$ and $\sigma$ unknown and estimated by the mles. It is given by

$$A^{2*} = A^2 (1.0 + 0.75/n + 2.25/n^2).$$

Tables providing significance levels for the Anderson-Darling test for normality can be found in D'Agostino and Stephens[1].

Fisher's method is a method for combining independent tests from several samples. Suppose that $k$ tests are to be made of the null hypotheses $H_{01}, H_{02}, \ldots, H_{0k}$. Let $H_0$ be the composite hypothesis that all $H_{0i}$ are true. Let $p_i$ be the p-value corresponding to the $i$th test. Then when $H_{0i}$ is true, $p_i$ is $U(0, 1)$ as long as the test statistics are continuous. The statistic

$$P = -2 \sum_{i=1}^{k} \log(p_i),$$

under $H_0$, has the $\chi^2_{2k}$ distribution.
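The modified Anderson-Darling statistic can be evaluated without numerical integration. The sketch below uses the standard computing form $A^2 = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)[\ln z_{(i)} + \ln(1 - z_{(n+1-i)})]$, where $z_{(i)}$ is the fitted normal distribution function evaluated at the $i$th order statistic. The function names are assumptions of this example, and the parameters are estimated here by the sample mean and the sample standard deviation with divisor $n - 1$, which is one of several possible conventions. Converting $A^{2*}$ to a p-value requires the published tables and is not attempted.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def modified_anderson_darling(xs):
    """Return A^{2*} for testing normality with mean and variance estimated."""
    n = len(xs)
    mean = sum(xs) / n
    # Assumption of this sketch: sample standard deviation with divisor n - 1.
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    z = sorted(normal_cdf((x - mean) / sd) for x in xs)
    total = sum(
        (2 * i - 1) * (math.log(z[i - 1]) + math.log(1.0 - z[n - i]))
        for i in range(1, n + 1)
    )
    a2 = -n - total / n
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)
```

Large values of the returned statistic indicate departure from normality; a call such as `modified_anderson_darling(scores_of_one_player)` would be made once per player, where `scores_of_one_player` is a hypothetical list of that player's scores.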
We will use the modified Anderson-Darling statistic $A^{2*}$ to test the hypothesis of normality for the individual bowling scores by finding the p-value $p_i$, $i = 1, 2, \ldots, 40$. We then use Fisher's method to test the composite hypothesis that all $H_{0i}$ are true.

4.3 The Results of Goodness of Fit for Bowling Scores

In Section 3.4, we used a graphical method to show that the bowling scores are not approximately normally distributed. Here we give the results of testing the normality of bowling scores by using the modified Anderson-Darling statistic and Fisher's method. The hypotheses are:

$H_{0i}$: $x_{ijk} \sim N(\mu_i, \sigma_i^2)$, $i = 1, 2, \ldots, 40$
$H_1$: not all $H_{0i}$ are true

where $\mu_i$ and $\sigma_i^2$ are unknown and are estimated by the sample average $\bar{x}_i$ and the sample variance $s_i^2$ respectively. Table 4.1 gives the modified Anderson-Darling statistic $A^{2*}$ and the p-value for each of the above hypotheses.

Table 4.1: The Results of Bowling Scores

Player   A^{2*}   p-value
1        .27      .68
2        .32      .53
3        .32      .54
4        .39      .39
5        .27      .67
6        .65      .09
7*       1.70     .00
8        .50      .21
9*       1.09     .01
10       .36      .45
11       .63      .10
12       .66      .08
13       .40      .37
14       .54      .17
15*      1.11     .01
16       .29      .60
17       .58      .13
18*      1.36     .00
19       .45      .28
20       .31      .57
21       .39      .38
22       .65      .09
23*      .86      .03
24       .33      .51
25*      1.34     .00
26       .56      .14
27*      1.69     .00
28       .54      .16
29       .24      .78
30       .29      .63
31       .16      .94
32       .61      .11
33       .65      .09
34       .54      .17
35       .15      .97
36       .26      .72
37*      1.31     .00
38       .33      .52
39       .26      .71
40*      .82      .04

"*" means that the p-value is smaller than .05.

From Table 4.1 we observe that nine out of forty players have bowling scores which are significantly different from the normal population at the 5% significance level. We now use Fisher's method for combining the tests of forty samples and get the statistic $P = -2\sum_{i=1}^{40} \log(p_i) = 176.9$ with an overall p-value of approximately 0. We therefore reject the null hypothesis that the bowling scores are approximately normally distributed.
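Fisher's combination, as used above, can be sketched in a few lines of Python. The function names are assumptions of this example. Because the reference distribution $\chi^2_{2k}$ always has an even number of degrees of freedom, its upper tail has the closed form $e^{-x/2}\sum_{i=0}^{k-1}(x/2)^i/i!$, so no statistical library is needed. Every p-value passed in must be strictly positive; values reported as .00 in a table are rounded and cannot be used as they stand.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(chi-square with even df exceeds x), using the closed-form series."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i      # term is now (x/2)^i / i!
        total += term
    return math.exp(-half) * total


def fisher_combined(p_values):
    """Return Fisher's statistic P and its overall p-value."""
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return statistic, chi2_upper_tail_even_df(statistic, 2 * len(p_values))


# Overall p-value implied by the reported statistic for forty tests (80 df).
overall_p = chi2_upper_tail_even_df(176.9, 80)
```

Evaluating the tail at the reported statistic with 80 degrees of freedom gives a value that is negligibly small, in line with the overall p-value of approximately 0 quoted above.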
4.4 The Proposed Model for Bowling Scores

The Box-Cox family of transformations of bowling scores is given by

$$y_{ijk}^{(a)} = \begin{cases} \dfrac{x_{ijk}^{a} - 1}{a} & \text{if } a \neq 0 \\ \ln(x_{ijk}) & \text{if } a = 0. \end{cases}$$

We want to find the value of the parameter $a$ so that the transformed $y_{ijk}^{(a)}$ are approximately normally distributed. We test the hypotheses

$H_{0i}$: $y_{ijk}^{(a)} \sim N(\mu_i, \sigma_i^2)$, $i = 1, 2, \ldots, 40$

for a specified $a$ by using the modified Anderson-Darling statistic and Fisher's method. We do this rather than the traditional maximum likelihood approach since the maximum likelihood method requires a constant variance amongst individuals and we have no prior reason to believe this. The "best" model is the one whose value of $a$ gives the maximum overall p-value. Trying different values of $a$ ranging from -2 to 2, we find that the logarithms of bowling scores ($a = 0$) have a nearly maximal p-value of 0.07. When $a = -.05$, $p_{max} = .07$. However we would like to choose $a = 0$, because the difference in p-values is small. We mention that the log transformation is the variance stabilizing transformation resulting from a model where the standard deviation is proportional to the mean.

Table 4.2 gives the results of testing the normality of the logarithms of bowling scores. We see that the logarithms of bowling scores for players 7, 8, 23 and 27 are significantly different from the normal distribution at the 5% significance level. We reject four out of forty hypotheses $H_{0i}$: $y_{ijk} \sim N(\mu_i, \sigma_i^2)$, $i = 1, 2, \ldots, 40$. Using Fisher's method for combining tests of forty samples, we get the statistic $P = -2\sum_{i=1}^{40} \log(p_i) = 99.8$ with overall p-value 0.07. We therefore tentatively accept the hypothesis that the logarithms of bowling scores are approximately normally distributed.

We mention that the p-values obtained above are appropriate for a specified value of $a$. We have optimally determined $a$ and then computed the p-value as though $a$ was specified.
Table 4.2: The Results of Logarithms of Bowling Scores

Player   A^{2*}   p-value
1        .39      .39
2        .33      .51
3        .23      .82
4        .67      .53
5        .15      .96
6        .32      .53
7*       .92      .02
8*       1.10     .01
9        .54      .17
10       .26      .71
11       .53      .17
12       .30      .58
13       .18      .92
14       .25      .74
15       .44      .29
16       .39      .39
17       .42      .33
18       .72      .06
19       .39      .38
20       .21      .86
21       .26      .72
22       .43      .32
23*      .93      .02
24       .35      .48
25       .61      .11
26       .25      .74
27*      .88      .02
28       .35      .47
29       .51      .19
30       .17      .93
31       .23      .80
32       .31      .56
33       .30      .58
34       .43      .31
35       .28      .64
36       .42      .33
37       .70      .07
38       .10      .99
39       .18      .92
40       .61      .11

"*" means that the p-value is smaller than .05.

Technically this is not ideal and the true p-value should be smaller than the reported p-value of 0.07. Despite this we believe that the log normal approximation is good, as will be seen in the q-q plot.

To compare with the results for the bowling scores obtained in Section 3.4 by using standardized and pooled data, we give the histogram and standardized q-q plot for the logarithms of bowling scores in Figure 4.1. We see that the histogram of standardized and pooled logarithms of bowling scores is approximately normally distributed. The q-q plot is almost a straight line through the origin with unit slope. The only place of departure is in the tail. It seems that the tails of the normal distribution may be slightly thicker than the tails of the logarithms of bowling scores. However, as mentioned earlier, the Anderson-Darling statistic is very sensitive in detecting departures from the true distribution in the tail. We therefore accept the hypothesis that the logarithms of bowling scores are approximately normally distributed.

4.5 The Property of Equal Variances

After having found that the logarithms of bowling scores are approximately normally distributed, we would like to know whether the equality of variances holds. Table 4.3 lists the averages and the sample variances of the logarithms of bowling scores.
The sample variances vary from 0.017 to 0.050.

Table 4.3: The Sample Averages and Sample Variances of Logarithm Data

Player   Average   Sample Var.
1        5.26      .038
2        4.92      .026
3        5.07      .017
4        5.22      .029
5        5.31      .027
6        4.96      .030
7        5.03      .035
8        5.18      .034
9        5.07      .039
10       5.27      .031
11       4.77      .022
12       4.88      .037
13       5.07      .034
14       4.95      .021
15       5.01      .038
16       4.95      .020
17       4.82      .043
18       5.08      .027
19       5.07      .036
20       5.03      .034
21       5.24      .027
22       5.30      .028
23       5.23      .030
24       5.18      .029
25       5.26      .039
26       5.05      .027
27       4.93      .032
28       4.76      .045
29       5.30      .033
30       5.08      .022
31       5.08      .027
32       4.96      .029
33       5.00      .044
34       5.08      .035
35       4.89      .050
36       5.13      .033
37       4.83      .046
38       5.11      .045
39       4.93      .035
40       5.05      .043

Figure 4.2 gives a plot of the sample variances against the averages. We see that the sample variances of logarithms of bowling scores are approximately the same amongst all players.

[Figure 4.1: Normality of the Logarithms of Bowling Scores (panels: Histogram of Standardized Values, horizontal axis Standardized Logarithms of Bowling Scores; Q-Q Plot, horizontal axis Quantiles of Standard Normal)]

[Figure 4.2: The Sample Variances vs Averages Plot for Logarithm Data (horizontal axis: Averages)]

We will use Bartlett's method for testing whether the variances are approximately equal for the logarithms of bowling scores. The assumptions of the Bartlett test are:

(1) Each of the $k$ populations is normal.
(2) Independent random samples are obtained from each population.

The hypothesis is:

$H_0$: $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2$
$H_1$: not all of the $\sigma_i^2$ are equal

Let $s_1^2, \ldots, s_k^2$ denote the sample variances from the $k$ normal populations and let $df_i$ denote the degrees of freedom associated with the sample variance $s_i^2$. Then the mean square error is given by

$$MSE = \frac{1}{df_T} \sum_{i=1}^{k} df_i \, s_i^2$$

where

$$df_T = \sum_{i=1}^{k} df_i.$$

The test statistic is

$$B = \frac{1}{C}\left[ df_T \log(MSE) - \sum_{i=1}^{k} df_i \log(s_i^2) \right]$$

where

$$C = 1 + \frac{1}{3(k-1)}\left[ \sum_{i=1}^{k} \frac{1}{df_i} - \frac{1}{df_T} \right].$$

Under $H_0$, $B$ is approximately distributed as $\chi^2_{k-1}$.

In studying the equality of variances for the logarithms of bowling scores, the above two assumptions hold. We have $df_i = n_i - 1$ where $n_i$ is the number of games in which player $i$ participated. The sample variance of the logarithm of bowling scores for player $i$ is $s_i^2$ and $k = 40$. The test statistic $B = 57.6$ yields a p-value of .03. A Spearman Rank test was also carried out as in Section 3.2 and the p-value was found to be insignificant. Therefore there seems to be only mild evidence of differences amongst the variances. However all that we care about is that the differences are not too big. Therefore we suggest that the variances are approximately equal amongst players and that the logarithms of bowling scores are approximately normally distributed with a constant variance estimated by

$$\hat{\sigma}^2 = \frac{\sum_{i=1}^{40} s_i^2}{40} = 0.0328.$$

An approximate 95% confidence interval for $\sigma^2$ based on normality is (0.0220, 0.0541).

Having revised our model we standardize the logarithms of bowling scores as in Section 3.4 by using the sample average $\bar{y}_i$ of the logarithms for each player together with $\hat{\sigma}^2 = 0.0328$ obtained above. We then construct the standardized q-q plot and histogram for pooled values based on this new model. Figure 4.3 gives the plot of normality of logarithms of bowling scores with constant variance. Comparing Figure 4.1 and Figure 4.3, we observed that Figure 4.3 is more normal than Figure 4.1. The reason might be that in Figure 4.1 fewer data are in the tail compared with the standard normal. In Figure 4.3 the $i$th bowler contributes the terms $(y_{ijk} - \bar{y}_i)/\hat{\sigma}$ to the pooled data.
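The Bartlett statistic defined earlier in this section, and the simple average of the forty sample variances used for the standardization, can be sketched in Python as follows. The function name `bartlett_statistic` and the toy inputs are assumptions of this example. The degrees of freedom of the individual players are not listed in this section, so the sketch does not attempt to reproduce the reported value of $B$; it only checks the pooled estimate against the variances of Table 4.3.

```python
import math


def bartlett_statistic(variances, dfs):
    """Return (B, MSE) for sample variances with their degrees of freedom."""
    k = len(variances)
    df_total = sum(dfs)
    # Degrees-of-freedom weighted mean of the sample variances.
    mse = sum(d * s2 for d, s2 in zip(dfs, variances)) / df_total
    c = 1.0 + (sum(1.0 / d for d in dfs) - 1.0 / df_total) / (3.0 * (k - 1))
    b = (df_total * math.log(mse)
         - sum(d * math.log(s2) for d, s2 in zip(dfs, variances))) / c
    return b, mse


# Sample variances of the logarithms, copied from Table 4.3 (players 1 to 40).
table_4_3_variances = [
    .038, .026, .017, .029, .027, .030, .035, .034, .039, .031,
    .022, .037, .034, .021, .038, .020, .043, .027, .036, .034,
    .027, .028, .030, .029, .039, .027, .032, .045, .033, .022,
    .027, .029, .044, .035, .050, .033, .046, .045, .035, .043,
]
# Unweighted average of the forty variances, as in the estimate of sigma^2.
pooled_variance = sum(table_4_3_variances) / len(table_4_3_variances)
```

Note that `mse` weights each variance by its degrees of freedom, whereas `pooled_variance` is the unweighted average; the two coincide only when every player has bowled the same number of games.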
Therefore those bowlers which have a small $s_i$ are going to contribute terms that are clustered more tightly about zero and those bowlers which have a larger $s_i$ are now going to contribute terms that are more spread out about zero. The net effect is a longer tailed distribution, which is what we observed.

[Figure 4.3: Normality of the Logarithm of Bowling Scores ($\hat{\sigma}^2 = 0.0328$) (panels: Histogram of Standardized Values; Q-Q Plot)]

Note that some very strong modelling assumptions have been conjectured; i.e. that logarithms of bowling scores are approximately normally distributed with a constant variance. However this conclusion is based on a single league and it may be unreasonable to extend this inference to populations in general. In the next chapter we hope to show through simulation that the result is approximately valid for a wide range of bowling abilities.

Chapter 5

Simulated Bowling Scores

In Chapter 4, we found that the logarithm of bowling scores is approximately normally distributed. However we have some concern over the adequacy of the approximation due to the 7% p-value. Perhaps the approximation is quite good and the questionable p-value can be attributed to the effect of sample size on the meaning of significance tests. For example, it is well known that with a very large data set a precise $H_0$ will almost always be rejected (see Royall[10]). In any case we would like to confirm the adequacy of the approximation. In this chapter, we use a Fortran program to simulate bowling scores to confirm the model.

5.1 Assumptions of Simulation

The actual mechanism underlying a bowling game is impossible to describe and to simulate exactly. We therefore make some simplifications concerning the bowling mechanism in order to make the simulation easy.

(A) There is no curve on each ball bowled.
(B) A bowler always aims directly at the middle of the pin of interest and can miss the pin by no more than 1 pin to the right or to the left.

(C) The bowler has equal accuracy to the left or to the right; the chance of hitting either the adjacent right pin or the adjacent left pin is the same.

(D) There is no learning effect. Every ball bowled is independent of every other.

We also simplify the outcomes of each ball bowled. These outcomes are described below. We mention that outcomes other than those described can arise in practice. However they are far less probable. They also have a similar structure to one of the described possibilities. That is, they count approximately the same number of points and have nearly the same implications for successive balls in the frame.

1. The outcome resulting from the first ball bowled in a frame is one of the following four possible results:

(a) Strike: A bowler knocks down all pins with probability p1 and the score is 15 points.

(b) Corner: A bowler knocks down all pins except a single corner pin with probability p2 and the score is 13 points.

(c) Headpin: A bowler knocks down the headpin with probability p3 and the score is 5 points.

(d) 3-2: A bowler knocks down the 3-pin and the 2-pin on the same side with probability p4 = 1 - (p1 + p2 + p3) and the score is 5 points.

2. The outcomes resulting from the second ball bowled in a frame are conditional on the result of the first ball.

(a) A strike is recorded and the frame is completed. No second ball is available.

(b) A corner pin is left standing after the first ball. There are 2 possible outcomes for the second ball:

(1) Spare: The bowler knocks down the corner pin. The probability p1+p2+p3 relates to a ball which does not miss its intended pin. A spare is recorded and the score is 15 points.

(2) The corner pin remains. The bowler does not knock down the corner pin with probability p4 and the score remains unchanged.
(c) A headpin is picked as the result of the first ball. There are four possible outcomes for the second ball:

(1) 3-2: The bowler knocks down the 3-pin and 2-pin on either the right or left side. The probability p1+p2 relates to a ball which is aimed at the 3-pin and hits but does not punch out the 3-pin, and the score is 5 + 5 = 10.

(2) 3-pin: The bowler knocks down the 3-pin with probability p3 and the score is 5 + 3 = 8.

(3) 2-pin: The bowler knocks down the 2-pin and the score is 5 + 2 = 7. The probability p4/2 relates to the bowler's tendency to miss to the left or to the right with equal probability.

(4) Miss: The bowler rolls the ball through the headpin channel with probability p4/2 and the score is still 5 points.

(d) The 3-2 combination is picked as the result of the first ball. There are five possible outcomes for the second ball:

(1) Headpin: The bowler picks the headpin with the second ball with probability p3 and the score is 5 + 5 = 10.

(2) Spare: The bowler knocks down all remaining pins with the second ball with probability p1+p2/2. A spare is recorded and the score is 15.

(3) Miss: The bowler rolls the ball through the 3-2 channel with probability p4/2 and the score is unchanged.

(4) 3-2: The bowler knocks down the other 3-pin and 2-pin with probability p4/2 and the score is 5 + 5 = 10.

(5) hp-3: The bowler knocks down both the headpin and the 3-pin with probability p2/2 and the score is 5 + 5 + 3 = 13.

3. For the third ball of the frame there are 14 different results based on the second ball.

Figure 5.1 graphically depicts all outcomes and probabilities. The numbers within circles represent the cumulative scores and the letters A, B, C and D after the second ball indicate that the same situations have occurred at other places in the chart.
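The first-ball and second-ball outcomes listed above can be sampled as in the following Python sketch. It is not the Fortran program of Appendix B: the function names `pick` and `two_balls` are assumptions of this example, the third ball is omitted because its outcomes are given only in Figure 5.1, and the number returned is the pin count after two balls, without the bonus scoring that strikes and spares attract over a full game.

```python
import random


def pick(rng, outcomes):
    """Draw one value from a list of (probability, value) pairs."""
    u = rng.random()
    cumulative = 0.0
    for probability, value in outcomes:
        cumulative += probability
        if u < cumulative:
            return value
    return outcomes[-1][1]  # guard against floating point round-off


def two_balls(p1, p2, p3, rng):
    """Return (first-ball outcome, second-ball outcome, pin count after two balls)."""
    p4 = 1.0 - (p1 + p2 + p3)
    first = pick(rng, [(p1, "strike"), (p2, "corner"), (p3, "headpin"), (p4, "3-2")])
    if first == "strike":
        return first, None, 15  # frame is complete, no second ball
    if first == "corner":
        second = pick(rng, [(p1 + p2 + p3, ("spare", 15)), (p4, ("miss", 13))])
    elif first == "headpin":
        second = pick(rng, [(p1 + p2, ("3-2", 10)), (p3, ("3-pin", 8)),
                            (p4 / 2, ("2-pin", 7)), (p4 / 2, ("miss", 5))])
    else:  # the 3-2 combination was picked with the first ball
        second = pick(rng, [(p3, ("headpin", 10)), (p1 + p2 / 2, ("spare", 15)),
                            (p4 / 2, ("miss", 5)), (p4 / 2, ("3-2", 10)),
                            (p2 / 2, ("hp-3", 13))])
    return first, second[0], second[1]


# Example: ten thousand two-ball sequences for one hypothetical bowler.
rng = random.Random(0)
results = [two_balls(0.3, 0.3, 0.2, rng) for _ in range(10000)]
```

In each branch the listed probabilities add up to p1 + p2 + p3 + p4 = 1, which is a useful check on the outcome lists in items 2(b) to 2(d).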
[Figure 5.1: Tree Diagram of Possible Outcomes. Legend: hp: headpin; 3-2: 3-pin and 2-pin knocked down; 3: 3-pin knocked down; 2: 2-pin knocked down; hp-3: headpin and 3-pin knocked down]

5.2 The Results of Simulation

Based on the simplifications and the scoring rules, a Fortran program has been coded to simulate bowling games. Appendix B lists the computer program. We use the Fortran program to simulate bowling scores by choosing different parameters p1, p2, p3 and p4, where p1 + p2 + p3 + p4 = 1, roughly based on the observations of actual bowlers and on the considerations of variability of abilities. Table 5.1 lists some typical results of the simulation where averages range between 120 and 305. The results include the sample averages and sample variances of the bowling scores based on N simulations for a set of chosen parameters.

Table 5.1: The Results of Simulation (N = 10000)

p1      p2      p3      p4      Average   Variance
0.3     0.3     0.2     0.2     250       1388
0.2     0.2     0.3     0.3     199       1157
0.2     0.3     0.2     0.3     219       1197
0.4     0.3     0.2     0.1     282       1507
0.25    0.25    0.2     0.3     224       1335
0.1     0.1     0.3     0.5     149       731
0.15    0.15    0.25    0.45    173       978
0.15    0.15    0.1     0.6     171       1015
0.05    0.05    0.35    0.65    125       412
0.1     0.2     0.3     0.4     169       855
0.2     0.2     0.15    0.45    197       1218
0.25    0.15    0.1     0.5     202       1329
0.4     0.4     0.1     0.1     302       1355
0.25    0.2     0.25    0.3     214       1338
0.25    0.25    0.1     0.4     223       1356
0.3     0.3     0.1     0.3     250       1412

We also give some standardized q-q plots and histograms based on the logarithms of the simulated bowling scores in Figures 5.2-5.4. The sample variances of the logarithms of bowling scores are 0.0327, 0.0361 and 0.0234 for the data sets which contributed Figures 5.2, 5.3 and 5.4 respectively. They belong to the range of variances of logarithms of actual bowling scores. The plots in Figures 5.2, 5.3 and 5.4 are obtained by using the constant variance $\sigma^2 = 0.0328$. Figure 5.2 gives one of the "best" amongst the simulated scores, Figure 5.3 gives the result of a typical simulation and Figure 5.4 gives one of the "worst" results. By "best" we mean that it fits the normal model best and by "worst" we mean that it fits the normal model worst. Note that in the worst case the average of simulated bowling scores is 250, which extends the range of averages of the actual data. However it is still not clear which values of the parameters p1, p2, p3 and p4 lead to a good approximation.

It has been observed in Figures 5.2-5.4 that in each case the left sample quantiles fall below the normal quantiles. A possible explanation for this is the inadequacy of the simulation model. The simplified model may make it unrealistically easy to obtain a low score.

From Figures 5.2-5.4, we find that the logarithms of simulated bowling scores are approximately normally distributed with a constant variance. We have therefore confirmed the proposed model of Chapter 4 by using a Fortran program to simulate bowling scores.

[Figure 5.2: The Results of p1=0.15, p2=0.15, p3=0.25, p4=0.45 (panels: Histogram of Standardized Values, horizontal axis Standardized Logarithms of Simulated Bowling Scores; Q-Q Plot, horizontal axis Quantiles of Standard Normal)]

[Figure 5.3: The Results of p1=0.15, p2=0.15, p3=0.1, p4=0.6 (panels: Histogram of Standardized Values, horizontal axis Standardized Logarithms of Simulated Bowling Scores; Q-Q Plot, horizontal axis Quantiles of Standard Normal)]
[Figure 5.4: The Results of p1=0.3, p2=0.3, p3=0.1, p4=0.3 (panels: Histogram of Standardized Values, horizontal axis Standardized Logarithms of Simulated Bowling Scores; Q-Q Plot, horizontal axis Quantiles of Standard Normal)]

Chapter 6

Handicap Systems

From preceding chapters we have concluded that the logarithm of bowling scores is approximately normally distributed with a constant variance $\sigma^2 = 0.0328$. In this chapter we will use Monte Carlo methods to investigate the effect of various handicap systems based on the proposed model of Chapter 4 and then compare our results with the Remington Rand study.

6.1 The Remington Rand Study

There are various handicap systems currently used in league and tournament play. Detailed information about handicap systems is described in Section 2.1. The Remington Rand study[8] processed over 100,000 league bowling scores and the results suggested that the individual handicap system of 80% of the difference between the bowler's average and a base figure of 225 is the fairest handicap system. We tried to get more information about the Remington Rand study and their criteria of "fairest". Unfortunately we did not get any reply from the Ontario 5 Pin Bowlers' Association on this matter.

In the remainder of this chapter, we use our model to investigate the effects of various handicap systems on the probability of winning. By a fair handicap system we mean one which tries to "even up" the chance of winning in a match between competitors of various strengths.

6.2 The Monte Carlo Study

Let $x_{ij}$ be the $j$th bowling score of player $i$ with mean $m_i$. In this way we can compare bowlers of various abilities by changing the value of $m_i$. We know from our model in the previous chapters that $y_{ij} = \log(x_{ij}) \sim N(u_i, \sigma^2)$ where $\sigma^2 = 0.0328$. Since $x_{ij} = e^{y_{ij}}$, by completing the square it is easy to show that

$$m_i = E(x_{ij}) = E(e^{y_{ij}}) = e^{u_i + \sigma^2/2}$$

and therefore $u_i = \log(m_i) - \sigma^2/2$. We can easily simulate bowling scores for player $i$ by generating random variates from the $N(\log(m_i) - \sigma^2/2, \sigma^2)$ distribution with $\sigma^2 = 0.0328$ and then exponentiating.
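The sampling step just described, together with a simple estimate of how often the favourite wins, can be sketched in Python as follows. This is a minimal illustration and not the code used for the study. The function names are assumptions, and the handicap shown is the one quoted in Section 6.1 (80% of the difference between a base figure of 225 and the bowler's average); flooring the handicap at zero for averages above 225 is a further assumption of this sketch.

```python
import math
import random

SIGMA2 = 0.0328  # constant variance of the logarithm of bowling scores


def simulate_score(mean_score, rng, sigma2=SIGMA2):
    """Draw one score whose logarithm is N(log(m) - sigma2/2, sigma2)."""
    y = rng.gauss(math.log(mean_score) - sigma2 / 2.0, math.sqrt(sigma2))
    return math.exp(y)  # back-transform from the log scale


def handicap(average, base=225.0, rate=0.8):
    """Handicap of 80% of (base - average); floored at zero (assumption)."""
    return max(0.0, rate * (base - average))


def favourite_win_probability(m_underdog, m_favourite, rng, n_games=10000):
    """Fraction of simulated games won by the favourite after handicaps."""
    wins = 0
    for _ in range(n_games):
        underdog_total = simulate_score(m_underdog, rng) + handicap(m_underdog)
        favourite_total = simulate_score(m_favourite, rng) + handicap(m_favourite)
        wins += favourite_total > underdog_total
    return wins / n_games


# Example: an underdog averaging 180 against a favourite averaging 220.
estimate = favourite_win_probability(180.0, 220.0, random.Random(3))
```

Subtracting $\sigma^2/2$ on the log scale is what makes the simulated scores average to `mean_score` on the original scale; without it the average would be too high by a factor of $e^{\sigma^2/2}$.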
In more detail the method for generating bowling scores is as follows:

(1) Create average bowling scores $m_i$ and $m_j$ for player $i$ and player $j$. The values of $m_i$ and $m_j$ ($m_i \le m_j$) are between 100 and 240 to reflect realistic abilities. We consider all possible combinations with averages ranging by 10 point intervals. By restricting $m_i \le m_j$ we refer to player $i$ as the underdog and player $j$ as the favourite.

(2) For each pair of players we generate 10,000 bowling scores $x_{ik}$ and $x_{jk}$, $k = 1, 2, \ldots, 10000$, according to the log normal distribution described above.

(3) We then add a handicap to both $x_{ik}$ and $x_{jk}$ based on the handicap system currently under study and obtain the total game scores. We consider handicap systems 1, 2, 3 and 4 corresponding to the descriptions in Section 2.1.

(4) We then estimate the probability of the favourite defeating the underdog in a given game by considering the fraction of the 10,000 games won by the favourite over the underdog.

6.3 The Results of Our Study

Table 6.1 lists the bowlers' averages and the probabilities of the favourite defeating the underdog based on the four handicap systems. This has been done using 20 point intervals. As we expected, from Table 6.1 we observe that the stronger player always has an advantage under each of the four handicap systems. We see this as a good thing as it offers incentive to improve one's bowling skills. On the other hand the advantage of the stronger player should not be so great as to discourage the weaker player. From this point of view Table 6.1 indicates that handicap system 1 may be the most preferable as the advantage of the favourite over the underdog is not as dramatic as with handicap systems 2, 3 and 4. For example the favourite with an average of 220 has probability 0.54, 0.69, 0.67 and 0.55 of defeating the underdog with an average of 180 under handicap systems 1, 2, 3 and 4 respectively.

Table 6.1: Estimated Probabilities of the Favourite Winning

Underdog   Favourite   Handicap1   Handicap2   Handicap3   Handicap4
100        100         0.50        0.50        0.50        0.50
100        120         0.56        0.60        0.57        0.57
100        140         0.60        0.68        0.63        0.63
100        160         0.63        0.73        0.67        0.67
100        180         0.67        0.78        0.71        0.61
100        200         0.68        0.80        0.73        0.73
100        220         0.70        0.90        0.86        0.75
100        240         0.81        0.96        0.93        0.87
120        120         0.50        0.50        0.50        0.50
120        140         0.54        0.57        0.55        0.55
120        160         0.58        0.64        0.60        0.60
120        180         0.61        0.70        0.64        0.64
120        200         0.64        0.73        0.67        0.67
120        220         0.66        0.86        0.82        0.71
120        240         0.77        0.93        0.90        0.83
140        140         0.50        0.50        0.50        0.50
140        160         0.54        0.57        0.55        0.55
140        180         0.58        0.64        0.60        0.60
140        200         0.59        0.67        0.62        0.62
140        220         0.64        0.82        0.78        0.67
140        240         0.74        0.90        0.87        0.79
160        160         0.51        0.51        0.51        0.51
160        180         0.53        0.56        0.54        0.54
160        200         0.56        0.61        0.58        0.58
160        220         0.59        0.76        0.73        0.61
160        240         0.70        0.85        0.83        0.74
180        180         0.50        0.50        0.50        0.50
180        200         0.53        0.55        0.53        0.53
180        220         0.55        0.70        0.68        0.56
180        240         0.67        0.81        0.80        0.71
200        200         0.49        0.49        0.49        0.49
200        220         0.53        0.65        0.65        0.54
200        240         0.64        0.76        0.76        0.67
220        220         0.50        0.50        0.50        0.50
220        240         0.61        0.64        0.64        0.64
240        240         0.51        0.51        0.51        0.51

To gain a better understanding of Table 6.1 we present it in a graphical manner in Figure 6.1. Figure 6.1 gives plots of the probabilities of the favourite defeating the underdog under four handicap systems (H1, H2, H3 and H4) for an underdog with a fixed average. Notice that any lack of smoothness in the plot is due to errors in our estimates and should be ignored. The standard error of the probabilities is less than or equal to $\sqrt{(0.5)(0.5)/10000} = 0.005$. The estimation errors are also clearly seen in Table 6.1, where the probability of the favourite winning should always be .50 when $m_i = m_j$. Figure 6.1 shows clearly that handicap system 1 is the fairest.
The conclusion that handicap system 1 is the fairest agrees with the Remington Rand study: the individual handicap system of 80% of the difference between the bowler's average and a base figure of 225 is the fairest handicap system to use in league or tournament play.

[Figure 6.1: The Probability of the Favourite Winning the Game. Four panels (Average of Underdog = 100, 130, 160 and 190), each plotting the probability of the favourite winning against the Average of Favourite for handicap systems H1, H2, H3 and H4.]

Appendix A

The Data Set

Player Game 11 12 1 173 165 182 159 147 194 221 153 159 183 141 252 13 14 181 210 15 205 16 17 18 19 20 163 118 189 162 189 120 234 139 205 146 1 2 3 4 5 6 7 8 9 10 2 3 152 161 164 172 148 204 130 NA 4 5 215 200 210 182 141 6 125 103 128 133 173 171 136 147 151 187 184 248 180 229 147 177 197 159 191 138 161 167 141 150 135 119 135 161 158 137 122 147 133 170 147 188 185 219 204 167 178 198 146 205 193 221 NA NA NA NA 152 NA NA 88 172 NA NA 176 122 153 167 149 113 143 184 137 122 124 149 NA 149 164 146 NA 222 196 162 NA 200 184 154 198 126 139 257 213 260 146 116 7 NA NA NA NA NA NA 10 8 9 127 148 257 185 107 161 205 167 203 159 155 178 190 148 142 128 187 170 176 138 295 211 136 200 212 110 185 195 139 224 209 121 146 206 138 227 150 215 137 107 136 106 230 212 147 193 126 153 145 155 123 160 143 129 205 126 143 106 135 193 165 230 162 161 176 147 199
Game Player 4 3 179 126 162 190 199 130 166 187 238 161 162 220 133 161 169 230 199 130 166 187 238 161 162 220 133 161 169 230 1 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 223 235 239 207 214 163 221 242 196 209 139 250 146 256 297 224 2 125 NA 139 NA 125 NA 101 137 124 155 5 234 173 201 214 173 201 214 10 8 9 192 132 188 191 202 163 182 156 255 176 218 211 191 202 163 182 156 255 176 218 211 187 168 200 188 134 164 171 167 205 242 229 252 222 190 184 214 299 202 133 226 209 157 159 146 116 163 156 127 113 144 220 157 120 129 111 124 173 177 113 203 131 173 192 126 115 165 142 211 178 159 191 181 149 156 180 236 142 173 198 202 195 159 199 178 124 141 131 183 122 162 190 173 171 186 121 151 113 113 137 225 146 181 162 146 146 255 NA 146 196 7 6 172 149 167 147 133 226 209 157 167 147 152 200 170 210 145 217 248 253 220 145 162 184 173 188 136 200 159 266 116 225 184 235 NA 130 224 162 158 152 144 211 171 NA 151 197 180 144 136 144 168 205 101 148 235 188 177 148 165 166 187 203 190 235 119 187 223 132 102 202 142 194 121 164 204 206 144 156 NA 232 140 178 NA 259 136 149 NA 178 119 154 148 ~2 169 195 181 187 124 179 194 219 229 246 147 135 141 165 138 160 182 193 141 167 149 NA NA NA 156 217 190 166 228 185 200 157 138 132 154 162 209 148 172 239 195 285 205 159 132 135 163 137 113 130 152 142 119 213 138 148 173 234 231 174 118 145 193 211 228 199 175 185 157
Game 54 55 56 57 2 1 156 105 146 96 212 142 196 163 3 140 182 4 183 165 126 185 178 184 Game 1 2 3 4 5 6 7 7 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 12 13 14 11 85 123 119 NA 109 109 201 NA 95 111 137 NA 123 140 144 132 158 117 177 153 131 109 146 111 123 106 151 NA 153 160 188 NA 155 124 158 NA 84 137 148 154 114 152 132 149 119 136 143 148 135 118 173 180 158 112 110 110 129 126 109 123 108 112 135 157 117 166 119 124 124 Player 6 7 8 5 234 176 143 220 208 120 124 192 213 155 141 241 208 232 207 177 9 155 134 10 229 172 149 169 216 240 Player 15 16 170 125 153 147 157 153 19 151 196 158 20 200 154 156 115 157 115 128 136 159 186 144 122 137 237 158 136 187 193 121 162 132 135 171 151 159 181 197 190 164 139 121 195 164 138 121 170 163 183 132 129 176 139 172 143 154 129 147 103 NA NA 111 NA 112 123 140 190 190 154 136 115 123 157 156 135 125 156 163 132 108 184 122 141 213 136 126 154 151 135 113 146 125 170 179 119 142 147 150 144 158 158 151 154 162 17 18 114 NA 71 NA 102 NA 103 154 102 144 91 158 103 205 114 213 94 217 99 171 168 124 136 126 141 130 87 138 145 120 132 153 145 188 174 138 118 168 175 184 NA 117 125 171 135 173 NA 151 133 155 183 131 136 115 194 176 171 NA 157 140 234 174 149 150 157 122 205 160 144
Game 25 26 27 28 29 30 31 32 33 34 12 14 160 144 Player 15 16 11 108 136 13 150 157 122 205 115 111 108 104 124 145 130 125 125 136 124 131 144 101 207 177 189 133 208 169 143 128 108 135 116 131 137 157 146 136 136 186 118 131 195 182 143 169 155 143 156 135 176 168 141 180 17 117 125 151 133 18 171 155 19 135 183 174 145 146 148 223 194 126 155 110 154 128 132 163 204 117 152 105 160 166 180 221 215 NA 218 NA 170 NA 134 NA 216 NA 185 NA 183 NA NA NA 115 194 176 171 NA 157 140 234 174 149 NA NA NA NA NA NA 148 142 124 134 136 139 125 155 101 147 163 163 126 123 148 118 153 124 147 179 142 178 135 164 137 168 138 158 206 145 42 43 129 86 44 111 146 164 112 152 167 114 136 119 45 46 47 48 49 50 51 52 107 99 118 155 107 157 141 162 148 116 261 119 203 167 106 142 142 134 133 171 275 137 164 136 122 116 104 104 102 150 170 156 145 142 202 228 163 162 131 144 128 147 190 188 112 175 119 107 127 132 102 NA NA NA 113 124 NA NA NA 96 131 53 54 136 111 132 151 55 56 57 109 128 138 108 113 113 97 122 128 132 161 135 190 185 115 156 142 146 133 155 103 158 155 218 182 128 75 148 162 96 180 139 131 115 103 121 121 142 146 94 35 36 37 38 39 40 41 20 173 131 172 153 NA 146 177 146 186 130 130 192 104 NA 125 155 NA 132 214 188 239 150 144 160 133 117 183 153 119 155 121 158 125 115 161 178 102 105 196 80 135 128 136 111 124 131 NA NA NA 135 NA NA NA NA NA NA NA NA NA NA NA NA 189 164 NA 148 136 NA 141
Game Player 21 1 2 3 4 5 6 7 8 9 10 11 12 22 23 24 NA NA NA NA NA NA NA NA NA NA NA NA 148 187 NA 153 172 176 NA 133 152 240 NA 191 25 152 145 187 151 216 222 182 175 211 13 14 15 16 17 18 19 20 21 22 23 233 166 210 24 237 151 161 212 177 166 221 210 192 181 210 193 204 140 158 237 151 161 212 177 195 236 215 171 159 152 213 186 158 199 233 263 155 149 207 25 26 27 28 29 30 28 29 30 164 186 NA NA NA NA NA NA NA 156 108 NA NA 208 174 155 181 227 160 232 163 204 177 246 190 289 172 188 180 182 211 184 157 205 176 179 212 261 201 178 207 153 212 174 260 156 199 223 162 181 233 163 141 133 26 27 162 114 172 105 218 135 164 168 174 171 197 167 205 192 162 212 161 148 201 228 182 241 234 191 204 142 202 156 276 152 210 169 221 210 192 181 193 204 140 158 177 144 92 119 130 151 131 153 140 175 161 110 130 206 221 129 154 179 112 133 185 165 156 142 217 129 123 144 139 132 212 165 107 140 78 90 121 112 85 152 103 123 119 104 234 211 188 221 200 263 237 191 169 246 100 186 126 188 NA 148 NA 191 NA 281 151 223 145 200 152 160 118 229 145 176 123 228 165 161 111 187 142 135 124 230 145 176 123 228 165 161 111 187 142 135 124 230 134 132 82 273 128 166 146 246 216 142 101 242 153 186 148 140 183 169 109 157 152 161 164 145 154 138 134 145 146 200 135 177 147 135 177 147 141 134 161
Game Player 21 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 22 23 24 183 149 151 162 200 231 179 225 226 189 209 212 179 226 141 NA NA NA NA NA NA 185 244 NA 189 NA 223 NA 233 216 185 167 193 182 142 215 189 129 179 162 199 208 180 210 253 238 218 218 164 187 131 254 187 172 226 193 NA 204 NA 225 NA 153 212 195 157 166 208 137 210 180 237 147 161 128 143 168 175 219 185 165 235 237 155 180 162 152 NA 183 140 243 NA 202 25 196 140 220 26 136 153 161 27 162 152 199 28 29 117 204 115 128 133 183 246 123 183 144 189 222 271 195 156 190 272 186 329 167 182 176 298 201 NA 170 NA 166 NA 121 215 157 149 169 205 147 184 209 243 163 184 124 171 131 158 126 132 127 123 127 113 132 155 127 134 125 128 150 128 147 142 126 149 95 169 240 277 165 125 123 117 133 142 114 111 96 81 76 136 107 131 138 NA NA NA 109 221 128 30 138 166 186 179 169 214 175 146 157 158 216 209 154 135 181 254 156 226 128 203 147 156 134 218 192 212 183 231 201 211 169 190 190 174 174 239 163 186 138 198 118 235 217 NA NA 263 203 NA 176 200 122 121 NA 141 194 216 185 179 250 170 198 192 157 217 188 180 229 140 187 178 129 132 150 119 137 172 172 163 184 163 171 Game Player 31 1 2 3 4 5 6 7 8 9 10 11 12 NA NA 221 NA NA NA NA 165 NA NA NA NA 189 NA NA 167 125 138 138 NA 139 133 198 147 NA 131 127 158 224 NA 173 122 96 181 NA 129 114 100 177 NA 127 150 127 163 NA 159 197 171 21 22 23 185 174 199 24 25 26 27 28 29 30 34 36 18 19 20 16 17 33 35 184 125 128 153 151 103 NA 134 NA 182 NA 179 158 168 173 141 159 141 170 135 185 149 13 14 15 ...
32 163 134 219 119 135 158 161 141 115 156 138 137 147 132 128 128 128 196 190 138 109 176 81 148 120 71 131 213 125 194 151 166 193 190 134 153 134 178 140 140 179 181 144 155 183 132 37 119 181 164 148 110 113 114 107 38 176 146 177 294 128 174 168 151 118 101 112 141 139 152 91 207 146 138 162 195 106 233 NA 213 NA 153 39 40 NA NA NA NA NA NA NA NA NA NA NA NA 119 158 137 157 95 126 110 132 140 218 107 95 132 156 131 195 165 132 166 153 155 112 90 150 NA 128 123 107 147 128 147 155 191 204 115 160 124 120 183 115 165 158 106 106 126 193 183 125 218 83 111 206 156 165 169 132 157 106 106 126 193 183 125 218 83 111 206 156 165 169 132 157 136 191 103 118 131 158 212 148 153 168 107 124 187 120 200 196 131 166 203 174 134 107 124 199 187 120 200 196 131 166 203 115 119 117 125 146 153 165 150 108 191 106 131 208 150 194 176 113 111 169 163 118 139 137 103 155 120 103 155 120 125 145 138 Game 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 32 31 121 149 184 147 211 135 156 169 185 133 139 133 199 137 163 128 140 148 166 148 160 159 169 214 225 179 112 145 139 145 141 145 166 176 155 157 202 138 99 106 154 126 203 182 154 140 143 160 115 169 134 189 143 230 181 209 136 124 163 187 190 121 170 138 186 135 200 211 141 104 150 121 203 129 232 186 155 162 206 107 120 121 187 127 206 51 52 53 54 55 187 144 155 161 117 137 179 141 177 149 56 57 169 185 120 119 33 134 158 188 144 136 121 186 34 212 132 Player 35 36 37 119 132 102 95 131 155 101 180 199 118 224 111 119 181 84 155 124 114 141 152 90 188 167 121 127 198 129 165 186 142 164 170 152 149 227 124 162 176 95 138 167 125 126 146 115 147 175 179 145 208 150 177 166 120 191 174 105 198 172 172 229 122 188 118 154 142 155 214 121 157 162 142 173 208 146 184 147 170 156 154 159 227 152 151 124 174 176 180 38 153 191 39 40 151 191 105 189 191 105 150 144 134 187 146 168 150 154 136 154 162 149 148 174 145 159 174 141 142 185 146 195 242 160 154 222 119 166 136 128 210 181
118 220 134 93 137 168 NA 165 159 NA 193 191 NA 202 202 NA 195 159 NA 142 NA NA NA NA 119 148 161 138 206 188 214 201 144 166

Appendix B

A Program which Simulates Bowling Scores

c This is a program which simulates bowling scores
c**********************************************************************
c Simplifying Assumptions:
c (1) no curve on ball
c (2) aim straight on
c (3) equal accuracy to left .or. right
c (4) no learning effect (balls are independent)
c (5) misses target by at most 1 pin
c**********************************************************************
      program main
      parameter (n=10000)
      dimension gscore(n),fscore(10),p(4),t(3)
      real p,mscore,sscore
      double precision drand,pp
      integer gscore,fscore,tstrike,tspare,i,j,t,k
      open(8,file='output')
      mscore=0
      sscore=0
      read(*,*) (p(i),i=1,4)
c Convert the four input probabilities to cumulative probabilities
      p(2)=p(1)+p(2)
      p(3)=p(2)+p(3)
      p(4)=p(3)+p(4)
c Warn when the probabilities do not sum to 1. The original test
c joined its two bounds with .and. and so could never be true.
      if (abs(p(4)-1.0) .gt. 1.0e-6) then
        print*, 'NOT proper selection of probability'
      endif
      do 99 i=1,n
        do 20 k=1,10
 20     fscore(k)=0
        t(1)=0
        t(2)=0
        t(3)=0
        tstrike=0
        tspare=0
        do 88 j=1,10
 40       continue
c drand is assumed to be a compiler-supplied uniform (0,1) generator
          pp=drand(0)
c This is a strike
          if (pp .le. p(1)) then
            fscore(j)=fscore(j)+15
            if ((tstrike .ne. 0 .or. tspare .ne. 0)
     &          .and. j .le. 9) then
              t(2)=t(2)+1
              fscore(j+1)=fscore(j+1)+15
            endif
            continue
            if (tstrike .eq. 2 .and. j .le. 8) then
              fscore(j+2)=fscore(j+2)+15
              t(3)=t(3)+1
            endif
            t(1)=t(1)+1
            tstrike=tstrike+1
c This is a corner
          elseif (pp .gt. p(1) .and. pp .le. p(2)) then
            fscore(j)=fscore(j)+13
            if ((tstrike .ne. 0 .or. tspare .ne. 0)
     &          .and. j .le. 9) then
              t(2)=t(2)+1
              fscore(j+1)=fscore(j+1)+13
            endif
            continue
            if (tstrike .eq. 2 .and. j .le. 8) then
              fscore(j+2)=fscore(j+2)+13
              t(3)=t(3)+1
            endif
            t(1)=t(1)+1
            if (t(1) .lt. 3) then
              call corner(fscore,t,tspare,tstrike,p,j)
            else
              goto 30
            endif
c This is a headpin
          else if (pp .gt. p(2) .and. pp .le. p(3)) then
            fscore(j)=fscore(j)+5
            if ((tstrike .ne. 0 .or.
     &          tspare .ne. 0) .and. j .le. 9) then
              t(2)=t(2)+1
              fscore(j+1)=fscore(j+1)+5
            endif
            continue
            if (tstrike .eq. 2 .and. j .le. 8) then
              fscore(j+2)=fscore(j+2)+5
              t(3)=t(3)+1
            endif
            t(1)=t(1)+1
            if (t(1) .lt. 3) then
              call headpin(fscore,t,tspare,tstrike,p,j)
            else
              goto 30
            endif
c This is a 3-2
          elseif (pp .gt. p(3) .and. pp .le. 1) then
            fscore(j)=fscore(j)+5
            if ((tstrike .ne. 0 .or. tspare .ne. 0)
     &          .and. j .le. 9) then
              t(2)=t(2)+1
              fscore(j+1)=fscore(j+1)+5
            endif
            continue
            if (tstrike .eq. 2 .and. j .le. 8) then
              fscore(j+2)=fscore(j+2)+5
              t(3)=t(3)+1
            endif
            t(1)=t(1)+1
            if (t(1) .lt. 3) then
              call bov32(fscore,t,tspare,tstrike,p,j)
            else
              goto 30
            endif
          endif
 30       if (t(1) .ge. 3) then
            t(1)=t(2)
            t(2)=t(3)
            t(3)=0
            tstrike=0
          elseif (t(1) .lt. 3) then
            goto 40
          endif
 88     continue
        gscore(i)=0
        do 12 j=1,10
 12       gscore(i)=gscore(i)+fscore(j)
 99   continue
      write(8,101) (gscore(i),i=1,n)
      do 19 i=1,n
 19   mscore=mscore+gscore(i)
      mscore=mscore/n
      do 29 i=1,n
 29   sscore=sscore+(gscore(i)-mscore)**2
      sscore=sscore/(n-1)
      write(*,*) mscore,sscore
 101  format(1x,40i5)
      stop
      end
c This subroutine calculates the outcomes of a corner pin
      subroutine corner(fscore,t,tspare,tstrike,p,j)
      dimension fscore(10),p(4),t(3)
      real p
      double precision drand,pp
      integer fscore,tstrike,tspare,j,t
 10   pp=drand(0)
      if (pp .le. p(3)) then
        fscore(j)=fscore(j)+2
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+2
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+2
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          tspare=tspare+1
        endif
      elseif (pp .gt. p(3) .and. pp .le. 1) then
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
        endif
        if (tstrike .eq. 2 .and. j .le. 8) then
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt.
     &      3) then
          goto 10
        endif
      endif
      return
      end
c This subroutine calculates the outcomes of a headpin
      subroutine headpin(fscore,t,tspare,tstrike,p,j)
      dimension fscore(10),p(4),t(3)
      real p
      double precision drand,pp
      integer fscore,tstrike,tspare,j,t
 10   pp=drand(0)
      if (pp .le. p(2)) then
        fscore(j)=fscore(j)+5
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+5
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+5
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          call hp32(fscore,t,tspare,tstrike,p,j)
        endif
      elseif (pp .gt. p(2) .and. pp .le. p(3)) then
        fscore(j)=fscore(j)+3
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+3
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+3
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          call hp32(fscore,t,tspare,tstrike,p,j)
        endif
      elseif (pp .gt. p(3) .and. pp .lt. (1+p(3))/2) then
        fscore(j)=fscore(j)+2
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+2
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+2
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          call hp32(fscore,t,tspare,tstrike,p,j)
        endif
      elseif (pp .gt. (p(3)+p(4))/2) then
        t(1)=t(1)+1
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
        endif
        if (tstrike .eq. 2 .and. j .le. 8) then
          t(3)=t(3)+1
        endif
        if (t(1) .lt. 3) then
          goto 10
        endif
      endif
      return
      end
c This subroutine calculates the outcomes of a 3-2
c (the dummy argument tspare was misspelled tsapre in the listing)
      subroutine bov32(fscore,t,tspare,tstrike,p,j)
      dimension fscore(10),p(4),t(3)
      real p
      double precision drand,pp
      integer fscore,tstrike,tspare,j,t
 10   pp=drand(0)
      if (pp .lt. (p(1)+p(2))/2) then
        fscore(j)=fscore(j)+10
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le.
     &      9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+10
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+10
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .le. 3) then
          tspare=tspare+1
        endif
        return
      elseif (pp .gt. (p(1)+p(2))/2 .and. pp .le. p(2)) then
        fscore(j)=fscore(j)+8
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+8
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+8
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          call corner(fscore,t,tspare,tstrike,p,j)
        endif
        return
      elseif (pp .gt. p(2) .and. pp .le. p(3)) then
        fscore(j)=fscore(j)+5
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+5
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+5
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          call hp32(fscore,t,tspare,tstrike,p,j)
        endif
        return
      elseif (pp .gt. p(3) .and. pp .le. (p(3)+p(4))/2) then
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
        endif
        if (tstrike .eq. 2 .and. j .le. 8) then
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          goto 10
        endif
      else
        fscore(j)=fscore(j)+5
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+5
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+5
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        if (t(1) .lt. 3) then
          call b3232(fscore,t,tspare,tstrike,p,j)
        endif
      endif
      return
      end
c This subroutine calculates the outcomes of head pin and a 3-2
      subroutine hp32(fscore,t,tspare,tstrike,p,j)
      dimension fscore(10),p(4),t(3)
      real p
      double precision drand,pp
      integer fscore,tstrike,tspare,j,t
 10   pp=drand(0)
      if (pp .le. p(2)) then
        fscore(j)=fscore(j)+5
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+5
        endif
        continue
        if (tstrike .eq. 2 .and.
     &      j .le. 8) then
          fscore(j+2)=fscore(j+2)+5
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
        return
      elseif (pp .gt. p(2) .and. pp .le. p(3)) then
        fscore(j)=fscore(j)+3
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+3
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+3
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
      elseif (pp .gt. p(3) .and. pp .le. (p(3)+p(4))/2) then
        fscore(j)=fscore(j)+2
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+2
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          fscore(j+2)=fscore(j+2)+2
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
      else
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
        endif
        continue
        if (tstrike .eq. 2 .and. j .le. 8) then
          t(3)=t(3)+1
        endif
        t(1)=t(1)+1
      endif
      return
      end
c This subroutine calculates the outcomes of a 3-2 on both sides
      subroutine b3232(fscore,t,tspare,tstrike,p,j)
      dimension fscore(10),p(4),t(3)
      real p
      double precision drand,pp
      integer fscore,tstrike,tspare,j,t
      pp=drand(0)
      if (pp .le. p(3)) then
        fscore(j)=fscore(j)+5
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
          fscore(j+1)=fscore(j+1)+5
        endif
        t(1)=t(1)+1
        return
      elseif (pp .gt. p(3) .and. pp .le. p(4)) then
        t(1)=t(1)+1
        if ((tstrike .ne. 0 .or. tspare .ne. 0) .and. j .le. 9) then
          t(2)=t(2)+1
        endif
        return
      endif
      return
      end

Bibliography

[1] D'Agostino, R. B. and Stephens, M. A., Goodness of Fit Techniques, New York, Marcel Dekker, 1986.
[2] Daniel, Wayne W., Applied Nonparametric Statistics, Second Edition, Boston, PWS-Kent, 1990.
[3] Box, G. E. P., Hunter, W. G. and Hunter, J. S., Statistics for Experimenters, New York, John Wiley & Sons, 1978.
[4] Neter, J., Wasserman, W. and Kutner, M. H., Applied Linear Statistical Models, Second Edition, Homewood, Richard D.
Irwin, 1985.
[5] Pierce, D. A. and Kopecky, R. J., Testing goodness of fit for the distribution of errors in regression models, Technical Report Symp. 16, Department of Statistics, Stanford University, 1978.
[6] Huynh, H., Some Approximate Tests for Repeated Measurement Designs, Psychometrika, Vol. 43, No. 2, June, 1978, 161-175.
[7] Let's Go Bowling, Canadian 5 Pin Bowlers' Association.
[8] Official Rules and Regulations Governing The Sport of 5 Pin Bowling, Canadian 5 Pin Bowlers' Association, 1987.
[9] 5 Pin Bowling Specifications and Standards Manual, Canadian 5 Pin Bowlers' Association, 1987.
[10] Royall, R. M., The effect of sample size on the meaning of significance tests, The American Statistician, Vol. 40, No. 4, November, 1986, 313-315.