Chapter 2-18. One Sample Tests

< This chapter is under construction >

Source: Stoddard GJ. Biostatistics and Epidemiology Using Stata: A Course Manual [unpublished manuscript]. University of Utah School of Medicine, 2010. Chapter 2-18 (revision 16 May 2010).

Sometimes you will have only one group, or one sample, and you want to compare it to hypothesized values. That situation is the focus of this chapter.

Nominal Scale Variable

If the variable is nominal scaled, having 3 or more unordered categories, and you want to compare against known population or hypothesized values, the most commonly used statistic is the chi-square goodness-of-fit test. This statistic is not available in Stata, but can be added. To add it, in the Stata Command window, run the command

    findit mgof

which displays

    SJ-8-2  st0142 . Multinomial GOF: Large-sample svy & small-sample exact tests
            (help mgof if installed) . . . . . . . . . . . . . . . . . . B. Jann
            Q2/08   SJ 8(2):147--169
            computes distributional tests for discrete (categorical,
            multinomial) variables and supports large-sample tests for
            complex survey designs and exact tests for small samples

Click on st0142, which displays

    TITLE
          SJ8-2 st0142.  Multinomial goodness-of-fit tests

    DESCRIPTION/AUTHOR(S)
          Multinomial goodness-of-fit tests
          by Ben Jann, ETH Zurich
          Support:  jann@soz.gess.ethz.ch
          After installation, type
              help mgof
          You must type
              . mata mata mlib index
          to include lmgof.mlib in the list of Mata libraries
          to be searched for the mgof command to work.

    INSTALLATION FILES                             (click here to install)
          st0142/mgof.hlp
          st0142/mgof.ado
          st0142/mgofi.ado
          st0142/lmgof.mlib
          st0142/mgof.mata

Click on "click here to install", which installs the multinomial goodness-of-fit (mgof) command.
Finally, following the installation instructions shown above when you clicked on st0142, run the following command in the Command window:

    mata mata mlib index

The mgof command uses the matrix programming facility in Stata, called Mata. This last command tells Mata which libraries to search, which is required for the mgof command to run properly.

If you want to see the help screen for mgof, you cannot get to it from the Help button on the menu bar. You can get it, however, from the Command window, using

    help mgof

Example. Suppose you have created a variable where you categorized birthweight into:

    1 = low birthweight (< 10th percentile)
    2 = normal birthweight (10th-90th percentile)
        (between 2,750 and 4,250 grams, or between 6 pounds and 9 pounds, 4 ounces)
    3 = high birthweight (> 90th percentile)

with the following frequency table:

    . tab birthweight

    birthweight |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              1 |         40       32.00       32.00
              2 |         80       64.00       96.00
              3 |          5        4.00      100.00
    ------------+-----------------------------------
          Total |        125      100.00

You want to demonstrate that your type of patient has a greater frequency of low birthweight births than the normal population. If your n=125 birthweights were distributed the same as the normal population, you would expect frequencies of:

    . display 125*0.1
    12.5

    . display 125*0.8
    100

where 10% are 1's, 80% are 2's, and 10% are 3's. Thus your observed and expected (normal population) frequencies are:

                Observed      Expected*
    low         40 (32%)      13 (10%)
    normal      80 (64%)     100 (80%)
    high         5 ( 4%)      12 (10%)

    * conservatively rounded 12.5 to 13 in the low category and 12.5 to 12 in the high category, so the frequencies would sum to n=125. This was "conservative" since this rounding makes the expected values more like the observed values.
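As a hand check on the observed and expected frequencies just tabulated, Pearson's chi-square statistic is the sum over the categories of (observed - expected)^2 / expected. The lines below are only a sketch of that arithmetic, using Stata's built-in display command and chi2tail() function. The expected counts 13, 100, and 12 are the rounded values from the table, and the 2 degrees of freedom (one less than the 3 categories) are entered by hand as an assumption of the ordinary large-sample test.

```stata
* Pearson chi-square by hand: sum of (observed - expected)^2 / expected
display (40-13)^2/13 + (80-100)^2/100 + (5-12)^2/12

* upper-tail p value of that statistic on 2 degrees of freedom
* (df = 3 categories - 1, entered by hand)
display chi2tail(2, (40-13)^2/13 + (80-100)^2/100 + (5-12)^2/12)
```

The first line should give a value of about 64.16, which can be compared with the Pearson's X2 value that mgofi reports for these data, and the second a p value far below 0.05.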
Now we use the "immediate" form of the mgof command, mgofi, which tells Stata that the data follow the command on the same line. It has the syntax

    mgofi f1 f2 ... [ / p1 p2 ... ]

We list the observed frequencies, then a "/", then the expected frequencies:

    mgofi 40 80 5 / 13 100 12

                                     Number of obs =     125
                                     N of outcomes =       3
                                     Chi2 df       =       2
    ----------------------------------------------
         Goodness-of-fit  |      Coef.     P-value
    ----------------------+-----------------------
             Pearson's X2 |   64.16026      0.0000
     Log likelihood ratio |   45.45675      0.0000
    ----------------------------------------------

The "Pearson's X2" line is the chi-square goodness-of-fit test. We see that the test is significant (p<0.05), so we can conclude that the birthweight distribution in our patient sample differs from that of the normal population. Comparing the observed and expected frequencies shows that the difference is driven by an excess of low birthweights (40 observed versus 13 expected).

Just as you have to decide between a chi-square test and a Fisher's exact test with an ordinary crosstabulation of two variables, you have to decide between the ordinary, or asymptotic, chi-square goodness-of-fit test displayed here and the exact version of this test. The ordinary test shown here assumes a sufficiently large sample size to give an accurate p value. For small sample sizes, you should obtain an exact p value.

The ordinary chi-square p value can be used if the minimum expected frequencies rule-of-thumb is met. This is the same rule-of-thumb presented in Ch 2-4 "Comparison of 2 Independent Groups". In Ch 2-4, the expected frequencies were the cell frequencies that were consistent with the two variables being independent, or consistent with the hypothesis of the groups not being different. In the one sample case, the expected frequencies are the cell frequencies consistent with the hypothesized population cell frequencies. In both situations, the chi-square statistic itself has exactly the same formula, and hence uses the same rule-of-thumb,

$$\chi^2 = \sum \frac{(O - E)^2}{E},$$

where the sum is over all categories of the variable, and

    O = observed cell frequency
    E = expected cell frequency (hypothesized cell frequency)

Minimum Expected Cell Frequency Rule-of-Thumb for Chi-Square Goodness-of-Fit Test

Siegel and Castellan (1988, p.49) state the rule-of-thumb as follows:

    For two categories, each expected frequency should be at least 5. For three or more categories, the one-sample chi-square goodness-of-fit test should not be used if more than 20 percent of the expected frequencies are less than 5 or when any expected frequency is less than 1.

If the minimum expected cell frequency rule is not met, then you should obtain an exact p value for the test. This is sometimes called the exact chi-square test, but that term confuses most readers. (See Article Suggestion below for a less confusing way to describe it.)

Exact P Value

The mgof command computes an exact p value using the "exhaustive enumeration" method. If the sample size is sufficiently small, it can be obtained with the "ee" option. Try that first.

    mgofi 40 80 5 / 13 100 12 , ee

                                     Number of obs =     125
                                     N of outcomes =       3
                                     Compositions  =    8001
    ----------------------------------------------
                          |         Exact
         Goodness-of-fit  |      Coef.     P-value
    ----------------------+-----------------------
             Pearson's X2 |   64.16026      0.0000
     Log likelihood ratio |   45.45675      0.0000
    ----------------------------------------------
    exhaustive enumeration exact tests

If the sample size is large, the "ee" option will fail or take impractically long, because the number of compositions to enumerate grows rapidly with the sample size (for 3 categories, 8,001 compositions at n=125 but 783,126 at n=1,250). In that case, use the Monte Carlo simulation approach, which approximates the exact p value, using the "mc" option. For example, if the sample size were 10 times larger, you would get tired of waiting for the "ee" option to finish. If that occurs, click the Break button (the circle with an x on the menu bar) to abort the command. Then use the mc option:

    mgofi 400 800 50 / 130 1000 120 , mc

                                     Number of obs =    1250
                                     N of outcomes =       3
                                     Replications  =   10000
    ----------------------------------------------------------------------
                          |                       Exact
         Goodness-of-fit  |      Coef.     P-value    [99% Conf. Interval]
    ----------------------+-----------------------------------------------
             Pearson's X2 |   641.6026      0.0000      0.0000      0.0005
     Log likelihood ratio |   454.5675      0.0000      0.0000      0.0005
    ----------------------------------------------------------------------

Article Suggestion

If you report the ordinary test, use something like:

    The observed birthweights, categorized into low (<10th percentile of normal population birthweight), middle (between 10th and 90th percentiles), and high (>90th percentile), were compared to the expected frequencies of the normal population using a one-sample chi-square goodness-of-fit test.

If you report the exact p value, use something like:

    The observed birthweights, categorized into low (<10th percentile of normal population birthweight), middle (between 10th and 90th percentiles), and high (>90th percentile), were compared to the expected frequencies of the normal population using a one-sample chi-square goodness-of-fit test with an exact p value (Jann, 2008).

References

Jann B. (2008). Multinomial goodness-of-fit: large sample tests with survey design correction and exact tests for small samples. The Stata Journal 8(2):147-169.

Radlow R, Alf EF Jr. (1975). An alternative multinomial assessment of the accuracy of the χ2 test of goodness of fit. Journal of the American Statistical Association 70:811-813.

Siegel S, Castellan NJ Jr. (1988). Nonparametric Statistics for the Behavioral Sciences, 2nd ed. New York, McGraw-Hill.