Comparison of Nuclear Decay Data against Gaussian and Poisson Distribution Models

H. Potter and E. Kager
(Completed 24 October 2005)

The Poisson distribution, with a p-value of 25.55%, was determined to be more likely to provide a valid theoretical model for random nuclear decay count data than the Gaussian distribution, which had a p-value of 5.36%.

I. Introduction

In the 1800s several mathematicians began thoroughly investigating random processes. The fruit of their labors was the development of a new field of mathematics now known as statistics. Two such pioneers in statistics were Carl Friedrich Gauss, who analyzed what is now known as the Gaussian distribution, and Siméon Denis Poisson, who analyzed what is now known as the Poisson distribution.

II. Experiment

A distribution of random numbers was created by recording the number of counts in a radiation detector for 320 four-second intervals. A radioactive source with an extremely long half-life was used, so it was safe to assume that the counts would be randomly distributed about a constant mean value for every observation. The data were then analyzed statistically to determine whether the Gaussian or the Poisson distribution was more likely to provide an accurate statistical model for predicting such radioactive decay counts.

III.
Results

Value   Observed   Normal    Poisson
10      0          0.424     0.174
11      0          0.736     0.384
12      0          1.228     0.780
13      2          1.969     1.459
14      0          3.033     2.536
15      4          4.489     4.113
16      7          6.385     6.255
17      13         8.727     8.953
18      12         11.462    12.102
19      16         14.464    15.497
20      17         17.539    18.853
21      22         20.436    21.844
22      22         22.880    24.159
23      31         24.614    25.557
24      35         25.444    25.910
25      23         25.273    25.217
26      28         24.122    23.598
27      6          22.122    21.266
28      19         19.495    18.479
29      14         16.508    15.504
30      11         13.432    12.575
31      10         10.501    9.870
32      7          7.889     7.504
33      5          5.695     5.533
34      4          3.950     3.960
35      2          2.633     2.753
36      5          1.686     1.860
37      3          1.038     1.223
38      0          0.614     0.783
39      1          0.349     0.489
40      1          0.190     0.297
41      0          0.100     0.176

Summary statistics:
n      320
Mean   24.3313
s²     24.9808
s      4.9981

Table 1: The observed frequency of each data value, including several values that were not observed but were needed for the later statistical analysis, alongside the expected frequency of each value according to the Gaussian (Normal) and Poisson distributions. The summary statistics calculated from the data set and used to compute the expected values for the two theoretical distributions are given at the bottom.

IV. Analysis and Discussion

Bins    Observed   Expected (Normal)   Expected (Poisson)   χ² (Normal)   χ² (Poisson)
10-17   26         26.991              24.654               0.036         0.074
18-25   178        162.111             169.138              1.557         0.464
26-33   100        119.764             114.329              3.261         1.796
34-41   16         10.559              11.542               2.804         1.722
Sums    320        319.425             319.662              7.659         4.056

             Normal    Poisson
p-value      5.36%     25.55%
Reduced χ²   2.553     1.352

Table 2: Binning of the data and the associated observed and expected values, along with p-values for the Gaussian and Poisson distributions from a χ² analysis. The χ² and reduced χ² values are also given to further illustrate the difference in fit between the two distributions.

To compare how well the Gaussian and Poisson distributions fit the data using a χ² goodness-of-fit analysis, the data had to be binned so that the observed and expected values were above 5 for each bin.
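The binning and χ² sums in Table 2 can be reproduced with a short script. The following is a minimal sketch in Python, not the authors' actual procedure: it takes the observed counts from Table 1 and the binned expected frequencies from Table 2 as given, and the names used (observed, bin_sums, chi_square) are chosen here purely for illustration.

```python
# Observed frequencies for the values 10 through 41, copied from Table 1.
observed = [0, 0, 0, 2, 0, 4, 7, 13,
            12, 16, 17, 22, 22, 31, 35, 23,
            28, 6, 19, 14, 11, 10, 7, 5,
            4, 2, 5, 3, 0, 1, 1, 0]

# Expected frequencies per bin, copied from Table 2.
expected_normal = [26.991, 162.111, 119.764, 10.559]
expected_poisson = [24.654, 169.138, 114.329, 11.542]

BIN_WIDTH = 8  # bins 10-17, 18-25, 26-33, 34-41


def bin_sums(counts, width):
    """Sum consecutive groups of `width` entries."""
    return [sum(counts[i:i + width]) for i in range(0, len(counts), width)]


def chi_square(obs, exp):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over the bins."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


observed_bins = bin_sums(observed, BIN_WIDTH)
chi2_normal = chi_square(observed_bins, expected_normal)
chi2_poisson = chi_square(observed_bins, expected_poisson)

print("Observed per bin:", observed_bins)
print("Chi-square (Normal): ", round(chi2_normal, 3))
print("Chi-square (Poisson):", round(chi2_poisson, 3))
```

The binned observations and the two χ² sums should agree with the corresponding entries of Table 2 to within rounding.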
To keep all of the bins the same width, the lowest and highest bins include some values that were never actually observed.

A first indication of which distribution is more likely to provide a better theoretical fit for future observed data comes from comparing the reduced χ² values for each distribution. The closer a reduced χ² value is to 1, the better the data fit that particular distribution; thus, the Poisson distribution (reduced χ² of 1.352) is seen to fit the data better than the Gaussian distribution (reduced χ² of 2.553).

A more quantitative comparison of how well each distribution fits the observed data can be drawn from the p-values. In this context, the p-value is the probability, assuming the data were actually governed by the specified theoretical distribution, of obtaining a χ² value at least as large as the one calculated from the observed data; therefore, the larger a p-value is, the more consistent the observed data are with the specified distribution. The Poisson distribution, with a p-value of 25.55%, is thus more likely to provide a better theoretical model for the observed data than the Gaussian distribution, which had a p-value of 5.36%.

A more intuitive sense of which model better fits the data can be had from Figure 1 below, which plots both the theoretical Gaussian and Poisson distributions along with each observed data point. Note that the bins used in the calculations above are indicated by the scale on the x-axis.

[Figure: "Distribution Comparison". y-axis: Frequency (0 to 40); x-axis: Values (ticks at 10, 18, 26, 34); series: Observed, Normal, Poisson.]

Figure 1: A graph of both theoretical distributions against the observed data points.

V.
Conclusion

Every indicator considered shows that the Poisson distribution fits the observed data better than the Gaussian distribution: the Poisson distribution has a reduced χ² value closer to 1, a larger p-value, and a graph that is in slightly better agreement with the observed data. It is thus concluded that the Poisson distribution is more likely to be a better theoretical model for predicting radioactive decay counts than the Gaussian distribution.
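The reduced χ² values and p-values cited in this report follow from the χ² sums in Table 2. The sketch below is an illustration under stated assumptions rather than the authors' own calculation: it assumes 3 degrees of freedom for both fits (inferred from the ratio of each χ² sum to its reduced value in Table 2) and uses the closed-form upper-tail probability of the χ² distribution that holds only for exactly 3 degrees of freedom. The function name chi2_sf_3dof is hypothetical.

```python
import math

DOF = 3  # assumption: inferred from Table 2 (chi-square sum / reduced value)


def chi2_sf_3dof(x):
    """Upper-tail probability P(X >= x) for a chi-square variable with 3 dof.

    Closed form valid only for 3 degrees of freedom:
        erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
    """
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


# Chi-square sums copied from Table 2.
chi2_sums = {"Normal": 7.659, "Poisson": 4.056}

# Map each model to (reduced chi-square, p-value).
results = {}
for name, chi2 in chi2_sums.items():
    results[name] = (chi2 / DOF, chi2_sf_3dof(chi2))
    reduced, p_value = results[name]
    print(f"{name}: reduced chi-square = {reduced:.3f}, p-value = {p_value:.2%}")
```

Under the 3-degrees-of-freedom assumption, the printed values should match the reduced χ² values and p-values in Table 2 to within rounding.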