the standard error

Standard Error Inferential statistics are based on the concept of making decisions using distributions of sample means (AKA sampling distributions). The reason to use samples is that populations are usually too large or impossible to test. If one sets up a distribution of means, you can judge the relative position of your sample mean compared to the population mean. With this information, you can make a judgment called a hypothesis test. In any case - there are different shapes of sampling distributions. Some are bell shaped like the t-distribution. Some have asymmetrical shapes like the Chi-square and F-distributions (both related). You generate sampling distribution by:  Defining a population  Generating all possible samples of a given size from the population  Plotting that distribution. We will use an artificial population of five numbers, which are 1, 2, 3, 4, 5. If you want, assume that it is after World War III and only 5 people survive, they are the population. We test them on some variable. One person gets 1, another gets 2, etc. Note that the mean () for this population is 3 and its standard deviation is 1.414. I decide to take a sample size of 2. I pick a person at random. Then, I pick again. It is possible that I pick the same person twice. These would be all my possible samples. I have also calculated the mean of each of the samples. There are 25 possible samples. Subject 1 1.00 1.00 1.00 1.00 1.00 2.00 2.00 2.00 Subject 2 1.00 2.00 3.00 4.00 5.00 1.00 2.00 3.00 Average of The Sample* 1.00 1.50 2.00 2.50 3.00 1.50 *Average = (2 + 1)/2 = 1.50 2.00 2.50 2.00 2.00 3.00 3.00 3.00 3.00 3.00 4.00 4.00 4.00 4.00 4.00 5.00 5.00 5.00 5.00 5.00 4.00 5.00 1.00 2.00 3.00 4.00 5.00 1.00 2.00 3.00 4.00 5.00 1.00 2.00 3.00 4.00 5.00 3.00 3.50 2.00 2.50 3.00 3.50 4.00 2.50 3.00 3.50 4.00 4.50 3.00 3.50 4.00 4.50 5.00 To have my distribution of means, I prepare a frequency distribution of the sample means (my last column of figurers). Sample Average 1.00 1.50 2.00 2.50 3.00 3.50 4.00 4.50 5.00 Frequency 1 2 3 4 5 4 3 2 1 Percent 4.0 8.0 -> Look for the two 1.50 values above 12.0 16.0 20. 16.0 12.0 8.0 4.0 Total Number of Samples = 25 Now, I graph this distribution and calculate its mean and standard deviation. - This is a distribution of means. Its mean has to be the same as that of the population mean. Its standard deviation is the standard error of the mean. Look at the Graphic below. Thus, we can see that most samples are close to 3.0 but there are some extreme sample means with values of 1.0, 1.5 and 4.5 and 5.0. Now I will redo the exercise with a sample size of 3. Here is the list of all possible samples of size 3, with averages for each sample: Subject 1 Subject 2 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 2.00 1.00 2.00 1.00 2.00 1.00 2.00 1.00 2.00 1.00 3.00 1.00 3.00 1.00 3.00 1.00 3.00 1.00 3.00 Etc., etc., etc. 1.00 5.00 1.00 5.00 1.00 5.00 1.00 5.00 2.00 1.00 2.00 1.00 2.00 1.00 2.00 1.00 2.00 1.00 2.00 2.00 2.00 2.00 2.00 2.00 Etc., etc., etc. 2.00 4.00 2.00 4.00 2.00 4.00 2.00 4.00 Subject 3 1.00 2.00 3.00 4.00 5.00 1.00 2.00 3.00 4.00 5.00 1.00 2.00 3.00 4.00 5.00 Average of Sample 1.00 1.33 1.67 2.00 2.33 1.33 1.67 2.00 2.33 2.67 1.67 2.00 2.33 2.67 3.00 2.00 3.00 4.00 5.00 1.00 2.00 3.00 4.00 5.00 1.00 2.00 3.00 2.67 3.00 3.33 3.67 1.33 1.67 2.00 2.33 2.67 1.67 2.00 2.33 1.00 2.00 3.00 4.00 2.33 2.67 3.00 3.33 2.00 4.00 2.00 5.00 2.00 5.00 2.00 5.00 2.00 5.00 2.00 5.00 3.00 1.00 3.00 1.00 3.00 1.00 3.00 1.00 3.00 1.00 Etc., etc., etc. 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 5.00 Sample Average 1.00 1.33 1.67 2.00 2.33 2.67 3.00 3.33 3.67 4.00 4.33 4.67 5.00 5.00 1.00 2.00 3.00 4.00 5.00 1.00 2.00 3.00 4.00 5.00 3.67 2.67 3.00 3.33 3.67 4.00 1.67 2.00 2.33 2.67 3.00 1.00 2.00 3.00 4.00 5.00 3.67 4.00 4.33 4.67 5.00 Frequency 1 3 6 10 15 18 19 18 15 10 6 3 1 Percent .8 2.4 4.8 8.0 12.0 14.4 15.2 14.4 12.0 8.0 4.8 2.4 .8 Now, I present a graphic which compares the sampling distribution with a sample size of two versus a sample size of three. The frequencies of the means are presented on a percentage basis for easy comparison. 1. Fewer Extremes: With the larger sample size (3) - note that there are fewer extreme sample means. Look at the number of samples means with a mean of 5. This is an extreme and not very representative mean. The percentage is dramatically less for the N=3 sample. Thus, with larger samples - you don't get wacky means that much. 2. Tighter distributions - note that the standard deviation of all the sample means (the standard error) is smaller than with a sample size of 2. It's mean is again 3 but the standard deviation of this distribution of means is equal to .82. Sampling distributions of the mean and those of some other statistics have a particular shape. They are bell shaped, like the normal curve, but less peaked and with fatter tails. This particular shape is called leptokurtic (from leptokurtosis). The fatness of the tails is controlled by a parameter called the degrees of freedom (df). DF are related to sample size. In our example, df = Nsample -.1. So for example with 3 subjects in the sample, df =2. The graphics above are actual frequency distributions. However, the t-distributions are theoretical mathematical functions. Here are some examples comparing the t- distributions with df = 3 or 6. Note the fatter tails and cut off scores for 5% total extremes (2.5% in each tail). t-distributions have cut offs like the z-score in Workshop 2. Let's look at the tails of the distributions. We've marked the cut offs for the 5% two tailed level. . This reflects our example above where the smaller sample size gave more extreme sample means. Thus, with big samples it's hard to get weird means. Depending on your situation, you don't have to actually construct a distribution. The value of the standard error can be calculated. equals the standard deviation calculated from your sample. The standard error formula can vary for different sample statistics. One can determine the standard error for a proportion, different between means, a correlation coefficient, slope of a regression line, intercept and other items. Your texts can supply these. The important idea is that you have a sample statistic and want to build a frequency distribution of all the possible samples. Then, you want to describe the variability of these samples. This is the use of the standard error. Here are some important points about samples: Samples Representative? You want your sample to mirror the important characteristics of the population. What Size Sample Do I Need? How Many Samples Do I Need? The larger the sample, the more likely it will be representative. For most situations, there are methods to calculate the appropriate sample size. A Sampling Distribution is based on the idea of all possible samples. However, you only need to calculate one sample to get the statistics you need to use the sampling distribution.

the standard error

Related documents

Products

Support

the standard error

Related documents

Add this document to collection(s)

Add this document to saved

Suggest us how to improve StudyLib