CHAPTER 12 – SAMPLING DISTRIBUTIONS, SAMPLING DISTRIBUTION OF THE MEAN, THE NORMAL DEVIATE (Z) TEST

I. INTRODUCTION

At the heart of hypothesis testing is the ability to answer the question: What is the probability of getting the obtained result, or results even more extreme, if chance alone is responsible for the differences between the experimental and control scores?

The answer to the above question involves two steps:

1. Calculating the appropriate statistic, and
2. Evaluating the statistic based on its sampling distribution.

II. SAMPLING DISTRIBUTIONS

What is a sampling distribution?

Definition (p. 268): The sampling distribution of a statistic gives (1) all the values that the statistic can take and (2) the probability of getting each value under the assumption that it resulted from chance alone.

A sampling distribution of a statistic is a theoretical frequency distribution of the scores for, or values of, a statistic, such as a mean. Any statistic that can be computed for a sample has a sampling distribution. A sampling distribution is the distribution of statistics that would be produced in repeated random sampling (with replacement) from the same population. It is all possible values of a statistic and their probabilities of occurring for a sample of a particular size (Vogt, 1999).

If the probability of getting the obtained value of the statistic, or any value more extreme, is equal to or less than the alpha level, we reject H0 and accept H1. If not, we retain H0. If we reject H0 and it is true, we have made a Type I Error. If we retain H0 and it is false, we have made a Type II Error.

The above process applies to all experiments involving hypothesis testing. What changes from experiment to experiment is the statistic that is used and its accompanying sampling distribution.

A.
GENERATING SAMPLING DISTRIBUTIONS

A sampling distribution has been defined as a probability distribution of all the possible values of a statistic under the assumption that chance alone is operating. One way of deriving sampling distributions is from basic probability considerations. Sampling distributions can also be derived from an empirical sampling approach. Here we have an actual or theoretical set of population scores that exists if the independent variable has no effect. We derive the sampling distribution of the statistic by:

1. Determining all the possible different samples of size N that can be formed from the population of scores (that is, all unique samples of the same size).
2. Calculating the statistic for each of the samples.
3. Calculating the probability of getting each value of the statistic if chance alone is operating.

Definition (p. 269): The null-hypothesis population is an actual or theoretical set of population scores that would result if the experiment were done on the entire population and the independent variable had no effect. It is called the null-hypothesis population because it is used to test the validity of the null hypothesis.

Definition (p. 272): A sampling distribution gives all the values a statistic can take, along with the probability of getting each value if sampling is random from the null-hypothesis population.

A sampling distribution is constructed by assuming that an infinite number of samples of a given size have been drawn from a particular population and that their distributions have been recorded. Then the statistic, such as the mean, is computed for the scores of each of these hypothetical samples; then this infinite number of statistics is arranged in a distribution in order to arrive at the sampling distribution. The sampling distribution is compared with the actual sample statistic to determine if that statistic is or is not likely to be the way it is due to chance (Vogt, 1999).

III.
THE NORMAL DEVIATE (Z) TEST

The normal deviate (z) test is a test that is used when we know the parameters of the null-hypothesis population, that is, when we know the population mean ($\mu$) and standard deviation ($\sigma$). The z test uses the mean of the sample ($\bar{X}_{obt}$) as a basic statistic.

A. SAMPLING DISTRIBUTION OF THE MEAN

Definition (p. 273): The sampling distribution of the mean gives all the values the mean can take, along with the probability of getting each value if sampling is random from the null-hypothesis population.

The sampling distribution of the mean can be determined empirically and theoretically, the latter through the use of a theorem called the Central Limit Theorem. The Central Limit Theorem tells us that, regardless of the shape of the population of raw scores, the sampling distribution of the mean approaches a normal distribution as sample size N increases.

Empirically, we can determine the sampling distribution of the mean by actually taking a specific population of raw scores having a mean ($\mu$) and standard deviation ($\sigma$) and:

1. drawing all possible different samples of a fixed size N,
2. calculating the mean of each sample, and
3. calculating the probability of getting each mean value if chance alone were operating.

The sampling distribution of the mean has the following general characteristics (for samples of any size N):

1. The sampling distribution of the mean is made up of sample mean scores. As such, it, too, must have a mean and a standard deviation. The mean of the distribution is symbolized as $\mu_{\bar{X}}$ (mean of the sampling distribution of the mean). The standard deviation of the distribution is symbolized as $\sigma_{\bar{X}}$ (standard deviation of the sampling distribution of the mean, also called the standard error of the mean because each sample mean can be considered an estimate of the mean of the raw-score population).
Variability between sample means then occurs due to errors in estimation, hence the phrase standard error of the mean for $\sigma_{\bar{X}}$.

2. The mean of the sampling distribution of the mean is equal to the mean of the raw scores ($\mu_{\bar{X}} = \mu$). Recall that each sample mean is an estimate of the mean of the raw-score population (differing by chance).

3. The standard deviation of the sampling distribution of the mean is equal to the standard deviation of the raw-score population divided by $\sqrt{N}$ ($\sigma_{\bar{X}} = \sigma / \sqrt{N}$). The standard deviation of the sampling distribution of the mean varies directly with the standard deviation of the raw-score population and inversely with $\sqrt{N}$. If the scores in the population are more variable, $\sigma$ goes up and so does the variability between the means based on these scores.

4. The sampling distribution of the mean is normally shaped, depending on the shape of the raw-score population and on sample size. That is, if the population of raw scores is normally distributed, the sampling distribution of the mean will also be normally distributed, regardless of sample size. However, if the population of raw scores is not normally distributed, the shape of the sampling distribution of the mean depends on the sample size. If N is sufficiently large, the sampling distribution of the mean is approximately normal. If N > 30, it is usually assumed that the sampling distribution of the mean will be normally shaped. If N > 300, the shape of the population of raw scores is no longer a consideration: with this size N, regardless of the shape of the raw-score population, the sampling distribution of the mean will deviate so little from normality that, for statistical considerations, we can consider it normally distributed.

There are two factors that determine the shape of the sampling distribution of the mean:

1.
the shape of the population of raw scores (if the population of raw scores is normally distributed, the sampling distribution of the mean will also be normally distributed; if the population of raw scores is not normally distributed, the shape of the sampling distribution depends on the sample size)
2. the sample size, N (if N is sufficiently large, the sampling distribution of the mean is approximately normal; the further the raw scores deviate from normality, the larger the sample size must be for the sampling distribution of the mean to be normally shaped)

The formula for the normal deviate (z) test (the equation for $z_{obt}$) is very similar to the z equation, but instead of using raw scores, we use mean values:

$$z_{obt} = \frac{\bar{X}_{obt} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}$$

where $\mu_{\bar{X}} = \mu$ and, since $\sigma_{\bar{X}} = \sigma / \sqrt{N}$, the equation simplifies to

$$z_{obt} = \frac{\bar{X}_{obt} - \mu}{\sigma / \sqrt{N}}$$

B. ALTERNATIVE SOLUTION USING ZOBT AND THE CRITICAL REGION FOR REJECTION OF H0

Definition (p. 281): The critical region for rejection of the null hypothesis is the area under the curve that contains all the values of the statistic that allow rejection of the null hypothesis.

Definition (p. 281): The critical value of a statistic is the value of the statistic that bounds the critical region.

To analyze the data using the alternative method, all we need to do is calculate $z_{obt}$, determine the critical value of z ($z_{crit}$), and assess whether $z_{obt}$ falls within the critical region for rejection of H0.

Based on the data from our question (study), we calculate $z_{obt}$:

$$z_{obt} = \frac{\bar{X}_{obt} - \mu}{\sigma / \sqrt{N}}$$

We find $z_{crit}$ for the region of rejection by using the area under the normal curve table.

The critical region for rejection of H0 is determined by the alpha level. Values are determined depending on whether we are testing a one-tailed (directional) hypothesis or a two-tailed (non-directional) hypothesis.

To reject H0, the obtained sample mean ($\bar{X}_{obt}$) must have a z-transformed value ($z_{obt}$) that falls within the critical region of rejection. That is, the value falls in the tail.
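The calculation of $z_{obt}$ and the critical-region check described here can be sketched in a few lines of Python. This is only an illustrative sketch, not a procedure taken from the textbook: the function name `z_test`, its `tail` argument, and the numbers in the usage lines are all assumptions made up for the example. It also assumes the conditions for the z test hold ($\mu$ and $\sigma$ known, normally shaped sampling distribution of the mean) and obtains $z_{crit}$ from the standard library's `statistics.NormalDist` rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu, sigma, n, alpha=0.05, tail="two"):
    """Hypothetical helper: single-sample z test with mu and sigma known.

    tail is "two" (non-directional), "upper", or "lower" (directional).
    Returns (z_obt, z_crit, reject_h0).
    """
    standard_error = sigma / sqrt(n)              # sigma_Xbar = sigma / sqrt(N)
    z_obt = (sample_mean - mu) / standard_error   # z-transformed sample mean
    std_normal = NormalDist()                     # mean 0, standard deviation 1

    if tail == "two":
        # alpha is split between the two tails
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        reject_h0 = abs(z_obt) >= z_crit
    elif tail == "upper":
        # all of alpha sits in the upper tail
        z_crit = std_normal.inv_cdf(1 - alpha)
        reject_h0 = z_obt >= z_crit
    elif tail == "lower":
        # all of alpha sits in the lower tail (z_crit is negative)
        z_crit = std_normal.inv_cdf(alpha)
        reject_h0 = z_obt <= z_crit
    else:
        raise ValueError("tail must be 'two', 'upper', or 'lower'")

    return z_obt, z_crit, reject_h0


# Made-up data: null-hypothesis population with mu = 75 and sigma = 16,
# and a sample of N = 100 with an obtained mean of 72.
for tail in ("two", "lower"):
    z_obt, z_crit, reject_h0 = z_test(72, 75, 16, 100, alpha=0.05, tail=tail)
    print(tail, round(z_obt, 3), round(z_crit, 3), reject_h0)
```

With these made-up numbers the same $z_{obt}$ is retained under the two-tailed test but rejected under the one-tailed (lower) test, because the directional test places the whole alpha level in one tail and so has a less extreme critical value.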
If $|z_{obt}| < |z_{crit}|$ (or $|t_{obt}| < |t_{crit}|$): retain the null hypothesis.
If $|z_{obt}| \geq |z_{crit}|$ (or $|t_{obt}| \geq |t_{crit}|$): reject the null hypothesis.

Review Practice Problem 12.1 on pages 284-285 (two-tailed, non-directional) and Practice Problem 12.2 on pages 285-286 (one-tailed, directional) for examples of hypothesis testing.

REFER TO CLASS HANDOUT (HYPOTHESIS TESTING AND TYPE I AND TYPE II ERROR) FOR FURTHER DISCUSSION.

E. CONDITIONS UNDER WHICH THE Z TEST IS APPROPRIATE

The z test (z statistic) is appropriate when the experiment involves a single sample mean ($\bar{X}_{obt}$) and the parameters of the null-hypothesis population are known (i.e., when $\mu$ and $\sigma$ are known).

To use this test, the sampling distribution of the mean should be normally distributed. This is the mathematical assumption underlying the z test. It requires that N > 30 or that the null-hypothesis population itself be normally distributed.

F. POWER AND THE Z TEST

Conceptually, power is the sensitivity of the experiment to detect a real effect of the independent variable, if there is one. Power is defined mathematically as the probability that the experiment will result in rejecting the null hypothesis if the independent variable has a real effect.

Power + Beta ($\beta$) = 1.00. Thus, power varies inversely with beta (Power = 1 – $\beta$).

Power varies directly with N. Increasing N increases power.

Power varies directly with the size of the real effect of the independent variable. The power of an experiment is greater for large effects than for small effects.

Power varies directly with the alpha level ($\alpha$). If alpha is made more stringent (conservative, e.g., from 0.05 to 0.01), power decreases.

Power of a Test: broadly, the ability of a technique, such as a statistical test, to detect relationships or differences. Specifically, the probability of rejecting a null hypothesis when it is false and therefore should be rejected (i.e., a correct decision).
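As a rough illustration of how power responds to N, to the size of the real effect, and to the alpha level, the power of a two-tailed single-sample z test can be computed directly from the normal curve. The sketch below is an assumption-laden example rather than a method given in the text: the function name `z_test_power`, the parameter `mu_real` (the population mean if the independent variable has a real effect), and all numbers are hypothetical, and $\sigma$ is assumed to be the same with or without the effect.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu_null, mu_real, sigma, n, alpha=0.05):
    """Hypothetical helper: power of a two-tailed single-sample z test.

    mu_null is the mean of the null-hypothesis population; mu_real is the
    assumed mean of the population when the effect is real.
    """
    std_normal = NormalDist()
    standard_error = sigma / sqrt(n)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)

    # When the effect is real, z_obt is centered on this value instead of 0.
    shift = (mu_real - mu_null) / standard_error

    # Probability that z_obt lands in the lower or the upper critical region.
    lower_tail = std_normal.cdf(-z_crit - shift)
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    return lower_tail + upper_tail


# Made-up scenario: mu_null = 75, sigma = 16, real mean of 80.
baseline = z_test_power(75, 80, 16, n=25)
print("baseline power:      ", round(baseline, 3))
print("larger N (100):      ", round(z_test_power(75, 80, 16, n=100), 3))
print("larger effect (85):  ", round(z_test_power(75, 85, 16, n=25), 3))
print("stricter alpha (.01):", round(z_test_power(75, 80, 16, n=25, alpha=0.01), 3))
```

Beta for any of these scenarios is one minus the returned value. When `mu_real` equals `mu_null` there is no real effect, and the function returns the alpha level, the probability of a Type I error.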
The power of a test is calculated by subtracting the probability of a Type II error ($\beta$) from 1.0. The maximum total power a test can have is 1.0 and the minimum is zero; .80 is often considered an acceptable level for a particular test in a particular study. Power is also called statistical power (Vogt, 1999).

References

Pagano, R. R. (2007). Understanding Statistics in the Behavioral Sciences (8th ed.). Belmont, CA: Wadsworth.

Vogt, W. P. (1999). Dictionary of Statistics & Methodology: A Nontechnical Guide for the Social Sciences (2nd ed.). Thousand Oaks, CA: Sage Publications.