Measures of Association
POWER CALCULATOR
David J. Pittenger
Marietta College
MS-DOS and Windows are the registered trademarks of Microsoft Corporation.
IBM is the registered trademark of International Business Machines Corporation.
Copyright © by David J. Pittenger. All rights reserved. Except as permitted under the United
States Copyright Act of 1976, no part of this publication or the accompanying software may be
reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written permission of the author.
TABLE OF CONTENTS
Chapter 1: Introduction to Power Calculator ........................................... 7
Computer Hardware Requirements............................................................... 7
Making Backups ............................................................................................... 7
Installing the Programs to the Hard Drive .................................................... 7
Starting the Program ....................................................................................... 7
The Mouse .................................................................................................................. 8
The Keyboard ............................................................................................................. 8
The Main Menu ................................................................................................ 8
Setup ........................................................................................................................... 8
Background Color .................................................................................................. 9
Highlight Color ....................................................................................................... 9
Chapter 2: Power ~ One Sample t-Ratio................................................. 11
Introduction .................................................................................................... 11
Power Estimation ........................................................................................... 12
The Effect Size Index: d1 .......................................................................................... 12
Estimating Power ........................................................................................... 13
Alpha Level .......................................................................................................... 14
Number of Tails.................................................................................................... 14
Degrees of Freedom and Sample Size .................................................................. 14
Example of Power Estimation .................................................................................. 15
Chapter 3: Power ~ Two Sample Independent t-Ratio ........................... 17
Introduction .................................................................................................... 17
Independent Groups t-Ratio ..................................................................................... 17
Dependent Groups t-Ratio ........................................................................................ 17
Statistical Issues Related to the Independent Groups t-Ratio ................................... 18
Effect Size ................................................................................................................ 19
Assumptions for t-ratio and power ........................................................................... 19
Estimating Power ........................................................................................... 20
Alpha Level .......................................................................................................... 21
Number of Tails.................................................................................................... 21
Degrees of Freedom and Sample Sizes ................................................................. 21
Examples of Power Estimation ................................................................................. 22
t-Ratio When Variances Are Not Equal ....................................................... 25
Chapter 4: Power ~ Two Sample Dependent t-Ratio.............................. 27
Introduction .................................................................................................... 27
Estimating Power ........................................................................................... 29
Examples of Power Estimation ................................................................................. 29
Chapter 5: Power ~ Pearson Correlation Coefficient ............................ 31
Introduction .................................................................................................... 31
Estimating Power ........................................................................................... 32
Alpha Level .......................................................................................................... 32
Number of Tails ................................................................................................... 32
Degrees of Freedom and Sample Sizes ................................................................ 32
Examples of Power Estimation ................................................................................ 33
Chapter 6: Power ~ Difference Between Correlations: ρ1 = ρ2 ............. 37
Introduction .................................................................................................... 37
Estimating Power ........................................................................................... 38
Alpha Level .......................................................................................................... 38
Number of Tails ................................................................................................... 38
Sample Sizes ........................................................................................................ 38
Examples of Power Estimation ................................................................................ 39
Chapter 7: Power ~ Multiple Regression ................................................ 41
Introduction .................................................................................................... 41
Estimating Power ........................................................................................... 41
Alpha Level .......................................................................................................... 41
Number of Predictors: U ...................................................................................... 41
Sample Size: N ..................................................................................................... 41
Examples of Power Estimation ................................................................................ 42
Chapter 8: Power ~ Sign Test and P = .50 .............................................. 45
Introduction .................................................................................................... 45
Estimating Power ........................................................................................... 45
Alpha Level .......................................................................................................... 45
Number of Tails ................................................................................................... 45
Sample Size .......................................................................................................... 46
Examples of Power Estimation ................................................................................ 46
Chapter 9: Power ~ Difference Between Proportions: P1 = P2 .............. 49
Introduction .................................................................................................... 49
Estimating Power ........................................................................................... 49
Alpha Level .......................................................................................................... 50
Number of Tails ................................................................................................... 50
Degrees of Freedom and Sample Sizes ................................................................ 50
Examples of Power Estimation ................................................................................ 50
Chapter 10: Power ~ Analysis of Variance ............................................. 53
Introduction .................................................................................................... 53
Foundation of the ANOVA ...................................................................................... 54
Interpreting the F-ratio ............................................................................................. 54
F-ratio ...................................................................................................................... 55
The Correlation Ratio: η² ......................................................................... 55
Effect Size: f ............................................................................................................. 55
Special Issues for Power Estimation for the ANOVA .............................................. 56
Measurement Error and Power ............................................................................. 56
Factorial Designs and Power ................................................................................ 57
Analysis of Covariance and Power ........................................................................... 57
ANOVA vs. ANCOVA ............................................................................................ 58
Estimating Power ........................................................................................... 59
Between-Subjects Factors..................................................................................... 59
Within-Subjects Factors ....................................................................................... 59
Levels of a Factor ................................................................................................. 59
Sample Size .......................................................................................................... 59
Chapter 11: Power ~ χ² .......................................................................... 65
Introduction .................................................................................................... 65
Estimating Power ........................................................................................... 66
Alpha Level .......................................................................................................... 66
Degrees of Freedom ............................................................................................. 66
Sample Size .......................................................................................................... 66
Examples of Power Estimation ................................................................................. 67
Chapter 12: Random Number Generator ............................................... 69
Introduction .................................................................................................... 69
Random Integers ....................................................................................................... 69
Random Normal Distribution ....................................................................... 71
Random Assignment of Subjects .................................................................. 71
Latin Square Generator ................................................................................ 72
Whole Number Generator ............................................................................. 73
Generating Samples with Specified Means and Standard Deviations ....................
Frequency Distributions ............................................................................
Correlation and Regression ........................................................................
Analysis of Variance ..................................................................................
One-Way ANOVA ................................................................................
Two-Way ANOVA ................................................................................
Running the Program ................................................................................................ 73
Lower and Upper Sample Size ............................................................................. 80
Lower and Upper Standard Deviations ................................................................. 81
Denominator for SD ............................................................................................. 81
Replications in Set ................................................................................................ 81
Compute ............................................................................................................... 81
Print ...................................................................................................................... 81
Exit ....................................................................................................................... 81
Settings ................................................................................................................. 81
Chapter 13: Statistical Tables Generator ................................................. 85
Introduction .................................................................................................... 85
Critical Values: t-ratio ............................................................................. 85
Critical Values: F-ratio ........................................................................... 86
Critical Values: χ² .................................................................................... 86
Critical Values: r ...................................................................................................... 86
r To z Transformation .............................................................................................. 86
Normal Distribution ................................................................................................. 86
Binomial Distribution ............................................................................................... 87
Chapter 14: ANOVA — Monte Carlo Simulator.................................... 89
Introduction .................................................................................................... 89
Design ANOVA Model ........................................................................................ 89
Change Parameters ............................................................................................... 90
Plot Factors .......................................................................................................... 90
Start Demonstration.............................................................................................. 90
Iterations............................................................................................................... 90
Significant Digits.................................................................................................. 90
Print Summary Tables to Screen .......................................................................... 91
Print Summary Tables to Disk ............................................................................. 91
Print Summary Tables to Printer .......................................................................... 91
Print Raw Data with Tables .................................................................................. 91
Create F-Ratio File ............................................................................................... 91
Other Features: ..................................................................................................... 92
Practice Session:................................................................................................... 93
Examining Power ..................................................................................................... 96
Robustness of the ANOVA .................................................................................... 102
References .............................................................................................. 105
CHAPTER 1: INTRODUCTION TO POWER CALCULATOR
Power Calculator is a collection of computer programs that allow you to
estimate the power of various statistical tests and to create frequently
used statistical tables. Power Calculator is fully interactive and offers
considerable flexibility. The program will produce tabular as well as
graphical representations of the power estimates. In addition, you can
vary all the essential parameters of the calculations to create the estimated power of a statistic.
I wrote the program with the hope that researchers and students with
different abilities in statistics can use it in many different contexts.
Specifically, I hope that the program will be as useful for the professional
researcher who needs quick power estimates as for the student who is
learning about the foundations of inferential statistics, confidence intervals, and statistical power.
COMPUTER HARDWARE REQUIREMENTS
To operate Power Calculator you will need an IBM®-compatible computer. The program operates within MS-DOS® 2.0 or greater. The program
will work within either the DOS or WINDOWS® environments.
MAKING BACKUPS
The programs are not copy protected. The programs and manual are
copyrighted, however. Please do not distribute the program without permission.
INSTALLING THE PROGRAMS TO THE HARD DRIVE
The programs must be stored on your computer’s hard drive. The install
program allows you to create a new directory on your hard drive for storing the programs, copy all the necessary files, and then prepare the programs to run on your system.
To start the installation of Power Calculator, double click your
mouse on the POWER1 icon. The program will begin by asking you to
identify the directory you want to use for storing the programs. The default directory is C:\POWER. You should use this directory unless you
have a specific need to store the program elsewhere.
To accept the default directory press the ENTER key. To identify a
new directory, type the new directory and press the ENTER key. Be sure
that you follow DOS conventions for identifying the directory.
STARTING THE PROGRAM
When the program begins, you will see a large menu of options. As
you can see, the program offers an array of computational alternatives
ranging from estimation of power for various statistical procedures, to
generation of random numbers, to the creation of frequently used statistical tables.
The options listed in the Main Menu of the program can be selected
using either a mouse or the keyboard.
THE MOUSE
To activate an option with the mouse, place the mouse cursor over
the button representing the desired option and click the left button.
You do not need to hold the button down other than to click it once.
The program will immediately start the option you selected.
THE KEYBOARD
To activate a menu with the keyboard, press the highlighted letter associated with the option you wish to run. For example, to select the
statistical tables, press the letter “L.” To estimate the power of an
Analysis of Variance, press the letter “F.”
THE MAIN MENU
The Main Menu contains a list of all the options for this program. This
menu serves as the central control for the program. In essence, you will
control the operation of the program from this location.
Each of the computational options will be explained in a subsequent
section in this manual. The following is a brief review of the general options in this and many of the specific routines.
HELP
The Help button is present in all the programs. Pressing this button will
present a small text window that contains information about the program
and the specific routine you are using. To scroll through the help information, use the N, P, and E keys to move to the Next page, Previous page,
or Exit the help routine.
PRINT
Once the program computes the power table, you can click on this button to have the program print a version of the table to the printer.
EXIT
As its name suggests, the Exit button causes the program to leave the
current application. If you are using one of the specific applications in
Power Calculator pressing the Exit button will return you to the Main
Menu. If you are in the Main Menu, pressing the Exit button will return
your computer to DOS or Windows.
SETUP
This set of routines allows you to customize your program. You can
change the color of the background and the color of the highlighted text.
Background Color
You can vary the color of the background screen by changing the
proportions of Red, Green, and Blue. To increase or decrease a specific color, place the mouse on the appropriate arrow and click the left
button. The colored slider bar will change to represent the proportion of that color in the background screen. At the same
time, the background color will change.
Highlight Color
This option is useful to increase the contrast of the Highlight Color.
You can select any color to complement the color of the background
screen by varying the amount of Red, Green, and Blue. To increase
or decrease a specific color, place the mouse on the appropriate arrow
and click the left button. The colored slider bar will change to represent the proportion of that color in the highlighted text.
At the same time, the highlighted text color will change.
CHAPTER 2: POWER ~ ONE SAMPLE T-RATIO
INTRODUCTION
William Gosset (Student, 1908) initiated a new generation of statistical
testing when he described the t-ratio in 1908 under the pseudonym
“Student.” The t-ratio has become the familiar friend and beast of burden for contemporary researchers who depend upon inferential statistics
for their work. Furthermore, this statistic served as the inspiration for
more complex analytic tools such as the analysis of variance. In this
chapter we will examine the power of the single sample t-ratio. The two
sample t-ratios are examined in the next two chapters.
Perhaps the simplest of the inferential statistics is the t-ratio for a
single sample. For this test we are interested in whether the mean of a
single sample should be considered a member of a specific population or
not. In general, the null hypothesis for this test is:
H0: μ1 = μ0                                                  2.1
If the null hypothesis is true, then any difference between the sample
and population means is due to sampling error or random effects. In
other words, the null hypothesis states that there is no meaningful difference between these two numbers; random effects created the difference
between the means. According to the central limit theorem, the mean of
any sample drawn from the population may deviate from the population
mean within specific parameters. Therefore, our acceptance of the null
hypothesis indicates that the difference between the sample and population means is within these parameters.
The alternative hypothesis can take one of two forms. One form of the
alternative hypothesis is:
H1: μ1 ≠ μ0                                                  2.2
This alternative hypothesis is the non-directional hypothesis or a
two-tailed test. We use these names because the researcher has predicted that there will be a difference between the sample and population
means, but has not specified the direction of the difference.
The directional, or one-tailed, test is another form of the alternative
hypothesis, used when the researcher has reason to predict that the
sample mean will be greater than (>) or less than (<) the population
mean. Based on the researcher’s prediction, the directional hypothesis is:
H1: μ1 > μ0
or
H1: μ1 < μ0                                                  2.3
An advantage of a two-tailed test is that it allows a researcher to find
significant differences between the means without having to specify the
direction of the difference. For example, if a researcher used a one-tailed
test and predicted that the sample mean would be greater than the population mean and the sample mean is really LESS than the population
mean, the researcher could not reject the null hypothesis. By contrast, if
the researcher used a non-directional test, the difference could have been
considered statistically significant.
The disadvantage of the two-tailed test is that it is less powerful than
the one-tailed test. Therefore, it is important to specify beforehand the
type of hypothesis testing you will use and then follow through with power estimation.
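The power trade-off between one- and two-tailed tests can be illustrated numerically. The sketch below is not Power Calculator's own algorithm; it uses a normal approximation to the noncentral t distribution (adequate for moderate to large samples) and only Python's standard library.

```python
import math
from statistics import NormalDist

def approx_power(d1, n, alpha=0.05, tails=2):
    """Approximate power of a one-sample t test via the normal distribution.

    d1 is the effect size and n the sample size.  For a two-tailed test
    the tiny probability of rejecting in the wrong tail is ignored.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / tails)   # critical value z(alpha) or z(alpha/2)
    return z.cdf(d1 * math.sqrt(n) - crit)

# At the same alpha, the one-tailed test is the more powerful of the two.
one_tailed = approx_power(0.40, 50, alpha=0.05, tails=1)
two_tailed = approx_power(0.40, 50, alpha=0.05, tails=2)
```

With d1 = .40 and n = 50, the one-tailed power is roughly .88 versus roughly .81 for the two-tailed test, which is the advantage described above.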
POWER ESTIMATION
The t-ratio for the one sample case is similar to the equation for the
z-score. The t-ratio is:
t = (X̄ - μ) / (ŝ / √n)                                      2.4
The t-ratio can be negative, 0, or positive. The sign is essential if you are
conducting a one-tailed test: if the sign of the t-ratio does not match the
sign predicted in the alternative hypothesis, one cannot reject the null
hypothesis.
Three factors affect the magnitude of the t-ratio. The first factor is the
difference between the sample and population means. All else being
equal, the larger the difference between these values, the greater the
magnitude of the t-ratio.
The second factor that influences the t-ratio is ŝ, the unbiased estimate of the standard deviation of the population, σ. Larger values of ŝ
will decrease the size of the t-ratio when all other factors are constant.
Similarly, smaller values of ŝ increase the magnitude of the t-ratio.
Therefore, power increases as ŝ decreases.
The last factor that affects the size of the t-ratio is n, the number of
subjects in the sample. According to the central limit theorem, the spread
of the distribution of sample means drawn from the population decreases
as a function of the square root of n. In other words, as n increases, the
theoretical spread of sample means drawn from a population decreases.
Therefore, larger sample sizes produce a smaller standard error of the
mean. The consequence for the t-ratio is that larger samples will, if all
other things are equal, produce t-ratios of greater power.
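To make these three factors concrete, here is a minimal sketch (not part of Power Calculator) that computes Equation 2.4 with Python's standard library; the sample scores are hypothetical.

```python
import math
from statistics import mean, stdev  # stdev uses the n - 1 denominator, i.e. s-hat

def one_sample_t(sample, mu):
    """Compute t = (X-bar - mu) / (s-hat / sqrt(n)), as in Equation 2.4."""
    n = len(sample)
    return (mean(sample) - mu) / (stdev(sample) / math.sqrt(n))

scores = [102, 98, 105, 110, 99, 104, 101, 107]   # hypothetical sample
t1 = one_sample_t(scores, 100)

# Doubling the sample (same mean, nearly the same s-hat) shrinks the
# standard error by roughly sqrt(2), so the magnitude of t grows.
t2 = one_sample_t(scores * 2, 100)
```

Note how the t-ratio roughly scales with the square root of n: doubling the data raises |t| even though the difference between the sample and population means is unchanged.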
THE EFFECT SIZE INDEX: d1
Cohen (1988) devised the effect size index as a way to characterize the
relative difference between two population means. The statistic is the
ratio of the difference between the population means to the estimate of
the population standard deviation. In mathematical terms, the effect size
is determined by
d1 = (X̄ - μ) / ŝ                                            2.5
In this equation, X̄ and μ represent the sample and population means,
respectively. The denominator, ŝ, is the unbiased estimate of the standard deviation of the population.
If the null hypothesis is correct, then X̄ and μ will be identical, the
numerator of the statistic will be 0, and therefore d1 will equal 0. In
words, when d1 = 0 there is no difference between the two population means.
Values of d1 that are not equal to 0 represent a difference between the
two populations. The larger the absolute value of d1, the greater the relative difference between the two populations. Cohen (1988) recommended
some general benchmarks for evaluating the magnitude of d1.
0.0 d1 < 0.20:
No Effect to Little Effect: The difference between
the means are nonexistent or trivial. There may be many reasons for
this condition. First, the population means may be different, but relative to the amount of variation within the population, the effect is
difficult to detect without extremely large samples.
0.20 d1 < 0.50: Little Effect to Moderate Effect: This range of
effect sizes represents a difference between the means that is difficult
to detect without large samples. The difference among scores is large
and contributes much noise to the data. Cohen (1988) suggested
that effect sizes close to d1 = .20 is equivalent to the difference in
heights between 15- and 16-year-old women.
0.50 d1 < 0.80: Moderate Effect to Large Effect: This range of
effect sizes represents a difference between the means that can be
seen in a graph of the data. In other words, effect sizes in this range
allow the researcher to find significant effects with fewer subjects.
Such effects are better suited for laboratory studies.
0.80 d1 < :
Large Effect: Effects of this magnitude are extremely easy to observe and require few subjects to effectively estimate the difference between the population means. In essence, there
is little overlap of the two sampling populations. The difference in
height between 13- and 18-year-old women represents a large effect.
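As an illustration (this code is not part of Power Calculator), Equation 2.5 and Cohen's benchmarks can be sketched in a few lines of Python; the reading scores and the comparison mean are hypothetical.

```python
from statistics import mean, stdev

def effect_size_d1(sample, mu):
    """Cohen's d1 = (X-bar - mu) / s-hat, as in Equation 2.5."""
    return (mean(sample) - mu) / stdev(sample)

def label_d1(d1):
    """Classify |d1| using Cohen's (1988) benchmarks listed above."""
    d = abs(d1)
    if d < 0.20:
        return "no effect to little effect"
    if d < 0.50:
        return "little effect to moderate effect"
    if d < 0.80:
        return "moderate effect to large effect"
    return "large effect"

# Hypothetical reading scores compared against a population mean of 101.
scores = [102, 98, 105, 110, 99, 104, 101, 107]
d = effect_size_d1(scores, 101)
```

Because the power distribution is symmetrical in d1, the classification takes the absolute value and applies equally to positive and negative effect sizes.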
ESTIMATING POWER
You can use Power Calculator to estimate the power of the single sample t-ratio for various sample sizes, α-levels, and directionality of the test.
When you select this option, you will see a screen similar to the one presented in Figure 2.1. As you can see, you can change several parameters
of the statistic. Let’s look at each of these in turn.
Figure 2.1: The first screen for the program that calculates the
power of a single sample t-ratio.
Alpha Level
This option allows you to vary the α-level you plan to use for your research. Although the default value is set as α = 0.05, you can increase or
decrease the value of α. You can enter values between .50 and .00001.
Number of Tails
This option allows you to toggle between a one- and two-tailed test.
Remember that when you use a two-tailed test you divide α between the
two extremes of the sampling distribution; thus the proportion of the distribution at either extreme is α/2.
Sample Size
You can enter sample sizes as small as 5 and as large as 9999. Recall
that the degrees of freedom for the single sample t-ratio are n - 1.
COMPUTE
This function creates the power table for the parameters you have entered. The effect sizes for the table will range between 0 and 1.80. Note
that the power distribution is symmetrical. Therefore, you can apply this
information to positive and negative values of d1.
GRAPH POWER
This alternative offers a graph of the relation between sample size, effect
size, and power for the α-level and type of hypothesis you are using. The
graph option is useful when you want a quick estimate of the sample size
your study may require.
HELP
The help function calls a help screen that should provide you with general information that will help you understand the features of this computational option. The help screens are a greatly abridged version of this
manual.
PRINT
Once the program computes the power table, you can click on this button to have the program print a version of the table to the printer.
EXIT
This function returns you to the Main Menu.
EXAMPLE OF POWER ESTIMATION
An educational psychologist wants to compare the reading scores of students in a special education program to national norms. Because the
psychologist uses a popular reading test, good estimates of the population mean and standard deviation are available.
For specific reasons, the researcher decides to use an α-level of .01
and a two-tailed test. Therefore, we can estimate the number of
students required for a fair test of the hypothesis.
After changing the basic parameters, select the Graph Power option.
You will see a graph like the one presented in Figure 2.2. As you can see,
the power of the study increases as effect size and sample size increase.
How many subjects should the psychologist use?
Figure 2.2: The power graph produced by the Power Calculator.
The parameters are: α = .01 for a two-tailed test.
Obviously, using 500 subjects will provide good power, but at what
cost? Testing this many subjects will be expensive and time consuming.
If the effect size is large then the psychologist will actually be wasting
time and money testing this many subjects. In essence, the researcher
will be committing statistical overkill. Therefore, the researcher will need to find a balance between the competing needs of conducting cost-effective research and obtaining useful results.
One benchmark is to set power to .80. Using this standard, we can use the graph to estimate the optimal sample size across the range of effect sizes. The researcher may estimate that the effect size is d1 = .40. From the graph, it appears that a sample size between 70 and 80 will produce the desired power. For a more accurate estimate, we can return to the previous screen. Therefore, exit the graph and return to the calculation screen.
Increase the sample size to 75 and select the calculate option. You will see a screen similar to the one presented in Figure 2.3. As you can see, 75 subjects will afford power of approximately .80 when d1 = .40 and α = .01, two-tailed.
Figure 2.3: The power table produced by the Power Calculator.
The parameters are: N = 75, α = .01 for a two-tailed test.
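The manual does not show the program's internal algorithm, but the power table it produces can be sketched with the noncentral t distribution. The function below is a minimal Python sketch of that computation; the names (`one_sample_power`, `d`, `n`, `tails`) are mine, not the program's.

```python
# Sketch of the power computation behind the COMPUTE option for a
# one-sample t-ratio, assuming the standard noncentral-t approach.
from scipy import stats

def one_sample_power(d, n, alpha=0.05, tails=2):
    """Power of a one-sample t test for effect size d = (mu1 - mu0) / sigma."""
    df = n - 1
    nc = d * n ** 0.5                      # noncentrality parameter
    if tails == 2:
        crit = stats.t.ppf(1 - alpha / 2, df)
        return (1 - stats.nct.cdf(crit, df, nc)) + stats.nct.cdf(-crit, df, nc)
    crit = stats.t.ppf(1 - alpha, df)
    return 1 - stats.nct.cdf(crit, df, nc)

# The worked example above: d1 = .40, n = 75, alpha = .01, two-tailed
print(round(one_sample_power(0.40, 75, alpha=0.01), 2))
```

Running this with the example's parameters reproduces the approximately .80 power reported in the text.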
CHAPTER 3: POWER ~ TWO SAMPLE
INDEPENDENT T-RATIO
Introduction
The most common use of the t-ratio is to compare the means of two
groups. In its typical application, the data may be discrete or continuous and represent interval or ratio scales. The goal of the t-ratio is to determine whether the difference between the two means represents chance factors or a meaningful difference. In addition to hypothesis testing, the t-ratio allows us to determine the strength of the relation between the independent and dependent variables, that is, the effect size of the statistic.
There are two essential forms of the t-ratio. The first is the independent groups t-ratio, which is described in this chapter. The second t-ratio
is the dependent groups t-ratio, which is described in the next chapter.
The difference between these two t-ratios depends upon how subjects are
assigned to the two groups.
INDEPENDENT GROUPS t-RATIO
We use the independent groups t-ratio whenever we directly compare two freestanding groups of subjects.
Thus, the independent groups t-ratio can be used for either a true experiment or an intact groups design. In the true experiment the researcher
assigns subjects to one of two groups such as a control or an experimental condition. For the intact group design, the researcher randomly
selects subjects from two different and preexisting populations for comparison. The essential element of the independent groups t-ratio is that
the behavior of subjects in one group has no effect on subjects in the
other group. Similarly, the selection of subjects for one condition is unrelated to, or independent of, the selection of subjects for the alternate
condition.
DEPENDENT GROUPS t-RATIO
Researchers use the dependent groups t-ratio when the two sets of
data are in some way related to each other. For example, many researchers use a matched groups design to increase the power of the experiment. In this type of experiment the researcher identifies a significant subject variable that is related to the purpose of the research. The
researcher then uses this subject variable to assign subjects to the
treatment conditions. The goal of a matched groups design is to equate
the groups before beginning the experiment.
Another form of dependent groups design is to measure the same subjects under different treatment conditions. Such a design is called a repeated measures design. For example, in a study of forgetting, a psychologist may test the subject’s memory of specific material over the
course of several days. In the repeated measures design the researcher
tests the same subjects under different levels of the independent variable.
The essential element in the dependent groups design is that subjects
are assigned to the treatment conditions in a predetermined manner.
The advantage of dependent groups design is that it tends to increase
power.
STATISTICAL ISSUES RELATED TO THE INDEPENDENT GROUPS t-RATIO
Equation 3.1 is a common version of the t-ratio for independent groups.

t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\left(\dfrac{\left(\sum X_1^2 - \frac{(\sum X_1)^2}{n_1}\right) + \left(\sum X_2^2 - \frac{(\sum X_2)^2}{n_2}\right)}{n_1 + n_2 - 2}\right)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}    (3.1)
Equation 3.2 is another version of the same equation. Notice that the denominator uses the variances of the two groups.

t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{s_{\bar{X}_1}^2 + s_{\bar{X}_2}^2}}    (3.2)
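As a quick numerical check, the two forms of the t-ratio can be computed side by side. The data below are invented for illustration, and μ1 − μ2 is assumed to be 0; with equal group sizes the two equations agree exactly.

```python
# A sketch comparing equations 3.1 and 3.2 on invented data.
import math

g1 = [12, 15, 11, 14, 13, 16, 12, 15]
g2 = [10, 11, 9, 12, 10, 13, 11, 10]
n1, n2 = len(g1), len(g2)
m1, m2 = sum(g1) / n1, sum(g2) / n2

# Equation 3.1: pooled sum-of-squares form
ss1 = sum(x * x for x in g1) - sum(g1) ** 2 / n1
ss2 = sum(x * x for x in g2) - sum(g2) ** 2 / n2
pooled = (ss1 + ss2) / (n1 + n2 - 2)
t_31 = (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))

# Equation 3.2: squared standard errors of the means
v1 = ss1 / (n1 - 1)            # unbiased variance estimates
v2 = ss2 / (n2 - 1)
t_32 = (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)

print(round(t_31, 4), round(t_32, 4))   # identical because n1 == n2
```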
Many statistics textbooks do not report the full numerator as I have done in equations 3.1 and 3.2. As you can see, you can include the estimated difference between the population means. Because most researchers assume that the means are equal (i.e., μ1 − μ2 = 0), the term is removed from the equation as superfluous. There may be conditions where the researcher knows that there is a difference between the populations and wishes to determine whether the data from an experiment exceed that difference. For example, the researcher may have reason to assume that μ1 = 10.0 and μ2 = 5.0; therefore μ1 − μ2 = 5.0. The researcher may wish to determine whether the difference between the two sample means is greater than 5.0.
The denominator for this equation is the estimated standard error of
the difference between means. As you can see, the denominator combines the sum of squares for the two sets of data to form the estimate.
Statisticians call this process pooling because the equation combines the
estimated variances of the two groups into a single estimate.
The magnitude of the independent groups t-ratio depends upon several
general factors. First, the difference between the means directly affects
the value of the t-ratio. As the difference between the two means increases, the absolute value of the t-ratio also increases. When planning a
research project, it is important to ensure that the two groups are as different from each other as possible. For a true experiment, one should
select an independent variable that maximally influences the data. For
Power: Two Sample Independent t-Ratio
19
the intact group design study, one should select from populations that
are clearly defined and substantively different from each other.
Another factor that influences the magnitude of the independent
groups t-ratio is the amount of variability within the groups. With all else
being equal, the less intersubject variability the greater the t-ratio and
the power of the statistic. Again, reducing the factors that affect intersubject variability can increase the power of the statistic.
Finally, all else being equal, the sample size influences the size of the
independent groups t-ratio and the power of the statistic. As a generality,
larger sample sizes decrease the size of the standard error of the difference between means. Therefore, increasing sample size will increase
power.
EFFECT SIZE
The effect size of the independent groups t-ratio is defined as:

d_2 = \dfrac{\bar{X}_1 - \bar{X}_2}{\hat{s}}    (3.3)

The numerator contains the sample means, X̄1 and X̄2. The denominator of the equation represents an estimate of the common intersubject variability, or sampling error. We estimate ŝ using the simple equation

\hat{s} = \dfrac{\hat{s}_1 + \hat{s}_2}{2}    (3.4)
Equation 3.4 is valid only when the two sample sizes are the same.
The absolute value of d2 can range between 0 and infinity. For practical purposes, however, most statisticians limit themselves to discussing effect sizes that range between 0 and 2. As noted in the previous chapter, Cohen (1988) listed these benchmarks as general guidelines.

0.0 ≤ d2 < 0.20 : No to Little Effect
0.2 ≤ d2 < 0.50 : Little Effect to Moderate Effect
0.5 ≤ d2 < 0.80 : Moderate Effect to Large Effect
0.8 ≤ d2 : Large Effect
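Equations 3.3 and 3.4 and Cohen's benchmarks can be sketched in a few lines. The function names and the means and standard deviations below are mine, invented for illustration; the simple average in equation 3.4 assumes equal sample sizes.

```python
# A sketch of equations 3.3 and 3.4 with Cohen's (1988) benchmarks.
def effect_size_d2(mean1, mean2, s1, s2):
    s_hat = (s1 + s2) / 2                  # equation 3.4 (equal n assumed)
    return abs(mean1 - mean2) / s_hat      # equation 3.3

def cohen_label(d):
    if d < 0.20: return "No to Little Effect"
    if d < 0.50: return "Little Effect to Moderate Effect"
    if d < 0.80: return "Moderate Effect to Large Effect"
    return "Large Effect"

d2 = effect_size_d2(105.0, 100.0, 9.0, 11.0)   # hypothetical means and SDs
print(round(d2, 2), cohen_label(d2))
```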
ASSUMPTIONS FOR t-RATIO AND POWER
The accuracy of the t-ratio is dependent upon meeting several mathematical assumptions. The independent groups t-ratio requires independence of groups, normally distributed data, and homogeneity of variance.
Although each of these assumptions is important, we will focus specifically on the assumption of homogeneity of variance.
Because the denominator of the t-ratio relies upon the variance of
each group, a great difference between the variances will compromise the
accuracy of the t-ratio. The accuracy of the test is greatly compromised
when the variances are radically different from each other and the sample sizes are not equal.
20
Power: Two Sample Independent t-Ratio
A quick way to determine whether two variances are equal is to conduct a simple test called the Fmax test. The Fmax test is the larger variance divided by the smaller variance.

F_{max} = \dfrac{\hat{s}^2_{larger}}{\hat{s}^2_{smaller}}    (3.5)
The degrees of freedom correspond to the sample sizes for the two
groups: The sample size represented in the numerator determines the
first degrees of freedom. The sample size represented in the denominator
determines the second degrees of freedom. The F-ratio is then tested against a table of F-ratios for α/2. This table represents the upper and lower critical values of the F-distribution. If the F-ratio exceeds the critical value, the variances cannot be considered equal.
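The Fmax check above can be sketched as follows. The function name and the variances are mine, invented for illustration; equal sample sizes n are assumed, and the critical value is taken at α/2 to match the two-sided use of the ratio.

```python
# A sketch of the Fmax homogeneity-of-variance check (equation 3.5).
from scipy import stats

def fmax_equal_variances(var1, var2, n, alpha=0.05):
    """Return (Fmax, critical value, True if the variances may be treated as equal)."""
    f_max = max(var1, var2) / min(var1, var2)
    crit = stats.f.ppf(1 - alpha / 2, n - 1, n - 1)
    return f_max, crit, f_max <= crit

fm, crit, ok = fmax_equal_variances(16.0, 9.0, 20)   # invented variances
print(round(fm, 3), round(crit, 3), ok)
```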
If the variances are not equal, then the t-ratio may not provide an accurate estimate of the statistic. A powerful alternative to use in these
conditions is the t-ratio developed by Welch (1936, 1938, 1947, 1951).
We will consider this statistic at the end of this chapter.
For this program we will assume that the variances can be considered
equivalent. If the variances are not equal, then you must use an averaged denominator to estimate d2. The following equation shows how to
average the variances of the groups for estimating the effect size.
d_2 = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\hat{s}_1^2 + \hat{s}_2^2}{2}}}    (3.6)
If you use this estimate of d2 the sample sizes must be equal. If the
sample variances are not equal and the sample sizes are not equal, then
power cannot be accurately estimated.
ESTIMATING POWER
We can use Power Calculator to calculate the power of the independent
groups t-ratio for various sample sizes, α-levels, and directionality of the
test. When you select this option, you will see a screen similar to the one
presented in Figure 3.1. As you can see, you can change several parameters of the statistic. Let’s look at each of these in turn.
Figure 3.1: The initial screen for the program to calculate the
power of a two-sample t-ratio.
Alpha Level
This option allows you to vary the α-level you plan to use for your research. Although the default value is set as α = 0.05, you can increase or decrease the value of α. You can enter values between .50 and .00001.
Number of Tails
This option allows you to toggle between a one- and two-tailed test.
Remember that when you use a two-tailed test you divide α between the
two extremes of the sampling distribution.
Degrees of Freedom and Sample Sizes
The program allows you to set the sample size of each group; it will then determine the degrees of freedom. You can enter sample sizes as small as 5 and as large as 9999. Recall that the degrees of freedom for the independent groups t-ratio are determined by (N1 − 1) + (N2 − 1). If the sample sizes are not equal, the program calculates a common sample size using:
\tilde{N} = \dfrac{2 N_1 N_2}{N_1 + N_2}    (3.7)
If the sample sizes are not equal, the sample variances must be equal
in order to create accurate power estimates!
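The common sample size of equation 3.7 is simply the harmonic mean of the two group sizes. A one-line sketch, with invented group sizes:

```python
# Equation 3.7: the common (harmonic-mean) sample size for unequal groups.
def common_n(n1, n2):
    return 2 * n1 * n2 / (n1 + n2)

print(common_n(30, 60))   # -> 40.0
```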
COMPUTE
This function causes the program to create a power table for the parameters you have entered. The effect sizes for the table will range between 0
and 1.80.
GRAPH POWER
This option draws a graph of the relation between sample size, effect size, and power for the α-level and type of hypothesis you are using. The
graph option is useful when you want a quick estimate of the sample size
your study may require.
HELP
The help function calls a help screen that should provide you with general information that will help you understand the features of this computational option. The help screens are a greatly abridged version of this
manual.
PRINT
Once the program computes the power table, you can click on this button to have the program print a version of the table to the printer.
EXIT
This function returns you to the Main Menu.
EXAMPLES OF POWER ESTIMATION
EXAMPLE 1
Assume that an experimental psychologist wants to examine two different
conditions that are thought to evoke altruistic behavior. The researcher
will randomly assign some subjects to a control group and the other subjects to an experimental group. Based on previous research, the psychologist believes that the effect size for this research will be moderate at
best. Therefore, the researcher decides to set d2 = 0.40 and use a conventional α-level of α = .05 and a two-tailed test.
Figure 3.2: A graph of the power curves for a two sample t-ratio where α = .05, two-tailed.
Using these parameters, have the computer create a graph of the data.
Looking at this graph, we can see that there should be approximately 100
subjects in each group to achieve a power of .80. In other words, the researcher will require 200 subjects randomly assigned to one of the two
groups in order to find a significant result 80% of the time.
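The two-sample estimate in this example can be sketched with the noncentral t distribution, as for the one-sample case. The function name and parameters are mine; the program's exact algorithm is not shown in the manual.

```python
# Sketch of the two-sample power estimate behind Example 1, assuming
# equal group sizes and the standard noncentral-t approach.
from scipy import stats

def two_sample_power(d, n_per_group, alpha=0.05, tails=2):
    df = 2 * n_per_group - 2
    nc = d * (n_per_group / 2) ** 0.5       # noncentrality for equal n
    crit = stats.t.ppf(1 - alpha / tails, df)
    power = 1 - stats.nct.cdf(crit, df, nc)
    if tails == 2:
        power += stats.nct.cdf(-crit, df, nc)
    return power

# Example 1: d2 = 0.40, alpha = .05, two-tailed, 100 subjects per group
print(round(two_sample_power(0.40, 100), 2))
```

With these parameters the sketch reproduces the approximately .80 power read from the graph, and with 75 subjects per group and a one-tailed test it reproduces the figure discussed next.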
Figure 3.3: A graph of the power curves for a two sample t-ratio where α = .05, one-tailed.
What would happen to the power of the statistic if the researcher decided to convert the test from a two-tailed to a one-tailed test? Such a
decision may be made in the planning phase of the experiment if the researcher can make a clear rationale for such a test. Following the same
steps, we find that 75 subjects in each group (150 subjects total) are needed for the study. Figure 3.3 represents the power curves for the one-tailed test, and Figure 3.4 represents the power estimates for n = 75.
Therefore, changing from a two-tailed test to a one-tailed test allows the
researcher to use fewer subjects to obtain the same level of power. This
tactic has risks, however. Using a one-tailed test requires that the sign
of the t-ratio match the predicted outcome. If the relation between the
means is opposite to the predicted relation, then the null hypothesis
cannot be rejected.
Figure 3.4: A table of power estimates for a two sample t-ratio where N = 150, α = .05, one-tailed.
EXAMPLE 2
A psychologist wants to conduct a study examining the effects of reinforcement on the speed with which a behavior is learned. Two groups of
subjects will learn a multi-step task. One group of subjects will receive
no extrinsic rewards for completing the task. The other subjects will be
paid $5.00 each time they complete the task correctly.
The researcher believes that d2 = 0.20 and wishes to use an α-level of α = .05, two-tailed. Look at Figure 3.2 to estimate the number of subjects
required for adequate power. Given these conditions, the researcher
should use 400 subjects in each group to have power close to .80. This
prediction is confirmed by using the power calculator. Figure 3.5 presents these computations.
Figure 3.5: A table of power estimates for a two sample t-ratio where N = 800, α = .05, two-tailed.
t-RATIO WHEN VARIANCES ARE NOT EQUAL
There are many cases when the variances for the two samples will be unequal. Although the t-ratio tends to be robust against violations of the homogeneity of variance principle, the test will produce spurious results when the difference between the variances is large and the sample sizes are unequal. The problem of unequal variances has been recognized for quite some time and is known as the Behrens-Fisher problem. Welch (1936, 1938, 1947, 1951) provided a solution to the problem when he devised an alternative form for calculating the t-ratio and its
degrees of freedom.
\hat{t} = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{1}{n_1}\left(\dfrac{SS_1}{n_1 - 1}\right) + \dfrac{1}{n_2}\left(\dfrac{SS_2}{n_2 - 1}\right)}}    (3.8)

df' = \dfrac{\left[\dfrac{1}{n_1}\left(\dfrac{SS_1}{n_1 - 1}\right) + \dfrac{1}{n_2}\left(\dfrac{SS_2}{n_2 - 1}\right)\right]^2}{\dfrac{\left[\dfrac{1}{n_1}\left(\dfrac{SS_1}{n_1 - 1}\right)\right]^2}{n_1 - 1} + \dfrac{\left[\dfrac{1}{n_2}\left(\dfrac{SS_2}{n_2 - 1}\right)\right]^2}{n_2 - 1}}    (3.9)
In these equations, SS represents the sum of squares for each group and
n represents the number of subjects in each group. Although the equations do require considerable computational effort, one is rewarded with
a parametric test that is powerful and robust. Indeed, there are several
advantages to using these equations when the homogeneity assumption
cannot be met. First, the Welch t-ratio uses the same sampling distribution as the conventional Student's t-ratio. Second, Kohr and Games (1974), Scheffé (1970), Wang (1971), and Zimmerman and Zumbo (1993)
demonstrated that the Welch t-ratio has favorable features for protecting
against Type I and Type II errors when sample variances and sizes are
not equal. Therefore, one may apply power estimates generated by this
program to the Welch version of the t-ratio.
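Equations 3.8 and 3.9 can be sketched directly in terms of the sums of squares. The data below are invented, and the result is compared against SciPy's Welch implementation (`ttest_ind` with `equal_var=False`), which uses the same method.

```python
# A sketch of the Welch t-ratio and adjusted degrees of freedom
# (equations 3.8 and 3.9) on invented, unequal-variance data.
import math
from scipy import stats

g1 = [23.0, 25.0, 21.0, 30.0, 28.0, 26.0]
g2 = [18.0, 19.0, 17.0, 20.0, 18.5, 19.5, 18.0, 17.5]

def welch(g1, g2):
    n1, n2 = len(g1), len(g2)
    m1, m2 = sum(g1) / n1, sum(g2) / n2
    ss1 = sum((x - m1) ** 2 for x in g1)
    ss2 = sum((x - m2) ** 2 for x in g2)
    a = (1 / n1) * (ss1 / (n1 - 1))          # s1^2 / n1
    b = (1 / n2) * (ss2 / (n2 - 1))          # s2^2 / n2
    t = (m1 - m2) / math.sqrt(a + b)                             # equation 3.8
    df = (a + b) ** 2 / (a * a / (n1 - 1) + b * b / (n2 - 1))    # equation 3.9
    return t, df

t_w, df_w = welch(g1, g2)
t_sp = stats.ttest_ind(g1, g2, equal_var=False).statistic
print(round(t_w, 4), round(df_w, 2), round(t_sp, 4))
```

Note that df' falls between the smaller group's n − 1 and the pooled n1 + n2 − 2, which is why the conventional t tables still apply.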
CHAPTER 4: POWER ~ TWO SAMPLE DEPENDENT
T-RATIO
Introduction
The essential difference between the independent and dependent groups t-ratio is the manner by which the researcher assigns subjects to the groups. For the independent groups test, we assume that the researcher randomly assigns subjects or that the subjects are members of
preexisting groups. By contrast, for the dependent groups t-ratio, the
researcher purposefully assigns subjects to each of the groups. There
are two general experimental procedures where a researcher will use a
dependent groups t-ratio. The first is the matched group design, the
second is the repeated measures design.
For the matched group design, the researcher evaluates the subjects
for some basic characteristic and then rank orders the subjects based on
their scores. Next, the researcher randomly assigns the highest scoring
subject to one group and the next subject to the second group. The researcher repeats this procedure until all subjects are assigned to the two
groups. Using the matched groups design the researcher can increase
the equivalence of the groups before the experiment begins.
The other form of dependent groups design is the repeated measures
design. For this design, the researcher tests the same subject under
more than one condition. Stated simply, the researcher tests the same
subject under both the control and experimental conditions. Consequently, the subjects serve as their own control condition.
The advantage of using a dependent groups design is that the systematic variance among subjects can be estimated and statistically removed
from the denominator of the t-ratio. The dependent groups t-ratio is
t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{s_{\bar{X}_1}^2 + s_{\bar{X}_2}^2 - 2 r\, s_{\bar{X}_1} s_{\bar{X}_2}}}    (4.1)
As you can see in the denominator, the size of the standard error of the differences between means is reduced by the correlation between the two groups. As a generality, a dependent groups design is more powerful than the equivalent independent groups design if there is a correlation between the treatment conditions.
There are some interesting points about the difference between the
independent- and dependent-groups designs that require additional attention. The first issue to consider is the degrees of freedom for a dependent groups design. Because the subject’s scores are treated as
pairs, the degrees of freedom are the number of pairs less one (n – 1 not
n1 + n2 – 2). Therefore, the dependent groups t-ratio will have fewer degrees of freedom than the comparable independent groups design.
For example, if a researcher used 20 subjects in an independent
groups design, the degrees of freedom would be 18 = (10 -1) + (10 -1). By
contrast, the dependent groups design treats the 20 scores as 10 pairs of
data. Therefore the degrees of freedom will be 9 = (10 - 1). Although the
degrees of freedom are smaller for the dependent groups design, the power will not necessarily be smaller. Indeed, the power of the dependent
groups design will be equal to or greater than the power of an equivalent
independent groups design. The reason the power does not decrease has
to do with the assumption of the dependent groups test.
In the independent groups design there are two means, each representing a separate population. Consequently, each population contributes its own sampling error. In the dependent groups design there is only one mean, which represents the differences between the paired scores. Thus, there is a single source of variation, and the dependent groups test has half the amount of variance.
If there is no correlation between the two groups, the power of the dependent groups t-ratio will be no greater than the independent groups tratio. Therefore, researchers using a matched groups design should select a matching procedure that is relevant to the study.
When there is a correlation between the groups, the consequence on
power can be dramatic. Consider the graph presented in Figure 4.1
which represents the increase in power as the correlation between groups
varies between 0 and 1.0. This graph was generated using the Comparison of ts option in the Power Calculator program. The lines represent effect sizes ranging between 0.1 and 1.4. As you can see, when the correlation between groups is 0, the effect size of the dependent groups design is no greater than that of the independent groups design. However, as the size of the correlation between the groups increases, effect size increases. The increase in power is especially dramatic for larger effect sizes.
Consider, for example, a moderate effect size of d = .40. An independent groups t-ratio will produce extremely low power, approximately 1 − β = .12. If a matched groups design is used, and the correlation between the groups is .90, the power jumps to approximately 1 − β = .58.
Figure 4.1: A graphic illustration of the relation between power, effect size, and the correlation between two groups in a dependent groups t-ratio. When the correlation between the groups is 0, the power of the dependent groups t-ratio is equal to that of the equivalent independent groups design. As the size of the correlation increases, the power of the test increases.
ESTIMATING POWER
To estimate the power of a dependent groups t-ratio, you will need to
conduct an intermediate statistical calculation. You will be converting
the conventional measure of d, used for the independent groups test, to a
form of d that reflects the correlation between the groups. This conversion is easy to perform using the following equation.
d_3 = \dfrac{\left(\dfrac{\bar{X}_1 - \bar{X}_2}{\hat{s}}\right)}{\sqrt{1 - r_{12}}}    (4.2)
Note that the numerator is our measure of effect size for the independent
groups test, d2. The denominator is the correction factor created by the
correlation between the two groups, r12. Once you convert the effect size,
you can use the procedure described in the previous chapter to examine
the power of the statistic.
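Equation 4.2 amounts to dividing the independent-groups effect size d2 by the square root of 1 − r12. A one-line sketch (the function name is mine):

```python
# Equation 4.2: converting d2 to d3 for a dependent-groups design.
import math

def d3(d2, r12):
    return d2 / math.sqrt(1 - r12)

print(round(d3(0.20, 0.70), 4))   # -> 0.3651
```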
EXAMPLES OF POWER ESTIMATION
EXAMPLE 1
A researcher who specializes in the study of math education wishes to examine the effectiveness of a new mathematics education program for middle school students. The researcher plans to conduct a preliminary
research program that compares the new procedure with a conventional
mathematics curriculum. The researcher has access to 120 students
who are available for the research project. We will assume that the effect size is small, thus d2 = .2. In addition we will assume that α = .05, two-tailed. If the researcher were to use an independent groups design, 60 subjects would be randomly assigned to each group (120 = 60 + 60). Under these conditions, the power would be 1 − β = .18.
Because middle school mathematics requires students to solve word
problems, the researcher decides to use a matched groups design where
students are matched based on a combined mathematics and verbal
skills achievement test. If the correlation between the groups is moderate, r = .70, we can re-estimate the power of the dependent groups design. Specifically,
d_3 = \dfrac{.20}{\sqrt{1 - .70}} = \dfrac{.20}{\sqrt{.30}} = \dfrac{.20}{.5477} = .3651
We can round d3 to .35 and leave α = .05, two-tailed; N remains 60 for each group. Using a matched groups design with these conditions will produce a power of 1 − β = .46. Therefore, the power increased by 28 percentage points (.46 − .18 = .28).
CHAPTER 5: POWER ~ PEARSON CORRELATION
COEFFICIENT
INTRODUCTION
Sir Francis Galton, the famous British hereditarian, first conceived the concept of correlation. It was the mathematician Karl Pearson, however, who established the descriptive statistic that we currently recognize as the correlation coefficient or, more formally, the Pearson Product Moment Correlation Coefficient. The definitional equation for the correlation
coefficient is:
r_{XY} = \dfrac{\sum z_X z_Y}{N}    (5.1)
The correlation coefficient is an index of the relatedness between the
two variables. Perfect correlations are represented by r = -1.00 and
r = 1.00. A correlation of 0 indicates no linear* or systematic relation between the variables. When the correlation coefficient is squared, we calculate the coefficient of determination, r2. Specifically, the coefficient
of determination indicates the proportion of variance in one variable that
is shared with the other variable.
There are many ways to interpret the correlation coefficient. One is to
examine the size of r and r2. Cohen (1988) suggested that the magnitude
of the correlation can be divided into four major categories.
0.0 ≤ r < 0.10 : No to Little Effect
0.1 ≤ r < 0.30 : Little Effect to Moderate Effect
0.3 ≤ r < 0.50 : Moderate Effect to Large Effect
0.5 ≤ r : Large Effect
The correlation coefficient can also be subjected to hypothesis testing.
A common hypothesis to test is whether the correlation for the population is significantly different from 0. This test is accomplished by converting the correlation coefficient to a t-ratio.
t = \dfrac{r_{XY}\sqrt{df}}{\sqrt{1 - r_{XY}^2}}    (5.2)
* The correlation coefficient assumes that the relation between the two variables
is linear. Therefore, it is possible that a non-linear relation may exist between the
two variables and that the correlation coefficient will be close to or equal to 0.
Where rXY is the correlation coefficient and df = n - 2. Because of the nature of the t-ratio, we can plan to conduct either directional or nondirectional tests of the null hypothesis.
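Equation 5.2 can be sketched directly; the function name and sample values below are mine, invented for illustration.

```python
# Equation 5.2: converting a correlation r to a t-ratio with df = n - 2.
import math

def r_to_t(r, n):
    df = n - 2
    return r * math.sqrt(df) / math.sqrt(1 - r * r)

print(round(r_to_t(0.70, 20), 3))
```

With r = .70 and n = 20 the resulting t-ratio is well beyond conventional critical values, consistent with the high power reported for this effect size later in the chapter.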
ESTIMATING POWER
We can use Power Calculator to calculate the power of the correlation coefficient for various sample sizes, α-levels, and directionality of the test.
When you select this option, you will see that you can change several parameters of the statistic. Let’s look at each of these in turn.
Alpha Level
This option allows you to vary the α-level you plan to use for your research. Although the default value is set as α = 0.05, you can increase or decrease the value of α. You can enter values between .50 and .00001.
Number of Tails
This option allows you to toggle between a one- and two-tailed test.
Remember that when you use a two-tailed test you divide α between the
two extremes of the sampling distribution.
Sample Size
You can enter sample sizes as small as 5 and as large as 9999. Recall that the degrees of freedom are determined by n − 2.
COMPUTE
This function causes the program to create a power table for the parameters you have entered. The correlations in the table will range between 0
and 1.00. Because the correlation and t-ratio are symmetrical for this
hypothesis, the information presented in the table applies equally to negative correlation coefficients.
GRAPH POWER
This alternative offers a graph of the relation between sample size, effect size, and power for the α-level and type of hypothesis you are using. The
graph option is useful when you want a quick estimate of the sample size
your study may require.
HELP
The help function calls a help screen that should provide you with general information that will help you understand the features of this computational option. The help screens are a greatly abridged version of this
manual.
PRINT
Once the program computes the power table, you can click on this button to have the program print a version of the table to the printer.
EXIT
This function returns you to the Main Menu.
EXAMPLES OF POWER ESTIMATION
A psychologist who studies personality created a new personality inventory that measures extroversion and introversion. In order to determine
the validity of the new test, the psychologist decides to compare the new
inventory to a commonly used measure of extroversion. How many subjects will the psychologist need in order to find a significant correlation
between the two personality measures?
Based on previous research, the psychologist believes that extroversion is a relatively easy construct to measure and that the effect size is
large, r = .70. Using α = .05, two-tailed, we can have the computer create
a graph of the relation between effect size, sample size, and power.
Figure 5.1: The power graph produced by the Power Calculator. The parameters are: α = .05 for a two-tailed test.
With an extremely large effect size, we can see that the researcher will not require many subjects to detect the effect. Indeed, with 20 subjects, the power is 1 − β = .95.
Figure 5.2: The power table produced by the Power Calculator. The parameters are: α = .05 for a two-tailed test.
A health psychologist wants to determine if there is a correlation between
a person's environmental stress and physical health. To test this correlation, the psychologist develops an environmental stress inventory that
asks people to describe the frequency of stressful events in their lives
(e.g., death of a family member, a promotion at work, or buying a new
car). The participants will also complete a questionnaire about their
health (e.g., blood pressure, number of times ill, and a general appraisal of their health). The psychologist believes that the effect size will be small,
r = .20.
Because the correlation is small, the researcher will require a larger
sample in order to detect the effect. As you can see in the next figure,
the researcher will require approximately 200 subjects in order to detect
the effect.
Figure 5.4: The power graph produced by the Power Calculator. The parameters are: α = .01 for a two-tailed test.
CHAPTER 6: POWER ~ DIFFERENCE BETWEEN
CORRELATIONS: ρ1 = ρ2
INTRODUCTION
In the previous chapter we examined the method for determining the power of the statistical test of whether or not a correlation coefficient equals 0. In this chapter we will examine a different test of the correlation: whether or not two correlations equal each other.
When examining the hypothesis ρ = 0, we first convert the correlation to a t-ratio and then test the size of the t-ratio. To determine whether ρ1 = ρ2, we must convert the correlation coefficients to z-scores and then compare the difference between the z-scores.
The first step to compare two correlations is to convert the correlations to z-scores using Fisher’s z-transformation:
zr = .5[loge(1 + r) − loge(1 − r)]    (6.1)
The Power Calculator program contains a routine that will print a table of
r to z transformations.
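The transformation can be sketched in a few lines of Python (an illustration only; the program generates its r-to-z table internally):

```python
import math

def r_to_z(r):
    """Fisher's r-to-z transformation (Eq. 6.1)."""
    return 0.5 * (math.log(1 + r) - math.log(1 - r))

# Print a small r-to-z table
for r in (0.1, 0.2, 0.3, 0.4, 0.5):
    print(f"r = {r:.1f}  ->  z = {r_to_z(r):.4f}")
```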
Once the correlations are transformed, we can proceed with a test of
their difference using the equation:
z = (z1 − z2) / √[ 1/(N1 − 3) + 1/(N2 − 3) ]    (6.2)
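The test can be sketched as follows. This is a hypothetical illustration (the sample correlations and the function name are mine, not the program's):

```python
import math
from statistics import NormalDist

def z_two_correlations(r1, n1, r2, n2):
    """z test of H0: rho1 = rho2 (Eq. 6.2) on Fisher-transformed r's."""
    fisher = lambda r: 0.5 * math.log((1 + r) / (1 - r))
    z1, z2 = fisher(r1), fisher(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed p-value
    return z, p

z, p = z_two_correlations(0.60, 103, 0.30, 103)
print(f"z = {z:.2f}, p = {p:.4f}")  # z is about 2.71, significant at .05
```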
The effect size for the difference between correlations is q, which is determined as
q = z1 − z2    (6.3)
for directional tests and as
q = |z1 − z2|    (6.4)
for nondirectional tests.
0.00 ≤ q < 0.10 : No to Little Effect
0.10 ≤ q < 0.30 : Little Effect to Moderate Effect
0.30 ≤ q < 0.50 : Moderate Effect to Large Effect
0.50 ≤ q : Large Effect
ESTIMATING POWER
We can use Power Calculator to calculate the power of the differences
between two independent correlation coefficients for various sample sizes, α-levels, and directionality of the test. When you select this option,
you will see that you can change several parameters of the statistic. Let’s
look at each of these in turn.
Alpha Level
This option allows you to vary the α-level you plan to use for your research. Although the default value is set as α = 0.05, you can increase or
decrease the value of α. You can enter values between .50 and .00001.
Number of Tails
This option allows you to toggle between a one- and two-tailed test.
Remember that when you use a two-tailed test you divide α between the
two extremes of the sampling distribution.
Sample Sizes
You can enter sample sizes as small as 5 and as large as 9999. When
comparing unequal sample sizes, the program uses the following equation to create a balanced sample size estimate:
n′ = 2(n1 − 3)(n2 − 3) / (n1 + n2 − 6) + 3    (6.5)
COMPUTE
This function causes the program to create a power table for the parameters you have entered.
GRAPH POWER
This alternative offers a graph of the relation between sample size, effect
size and power for the α-level and type of hypothesis you are using. The
graph option is useful when you want a quick estimate of the sample size
your study may require.
HELP
The help function calls a help screen that should provide you with general information that will help you understand the features of this computational option. The help screens are a greatly abridged version of this
manual.
PRINT
Once the program computes the power table, you can click on this button to have the program print a version of the table to the printer.
EXIT
This function returns you to the Main Menu.
EXAMPLES OF POWER ESTIMATION
A researcher wishes to examine the correlations among different variables.
One test will be to determine whether two correlations are equivalent.
The researcher believes that the effect is moderate to large, therefore she
sets q = .40. She wants to know the power of her statistical test if she
uses 100 pairs for each correlation. According to the following table, the
power of the test is 1 - β = .80.
Figure 6.1: The power table produced by the Power Calculator.
The parameters are: α = .05 for a two-tailed test.
CHAPTER 7: POWER ~ MULTIPLE REGRESSION
INTRODUCTION
The process of multiple regression is a logical extension of simple linear regression. Specifically, the researcher hopes to use two or more variables to predict a criterion. The goal of multiple regression is, therefore,
to offer a better model or method of predicting the criterion. Although
there are many ways to characterize the size of the effect, the more common form is R². For simple linear regression, r² is an estimate of effect size as it indicates the proportion of the criterion variance that is accounted for by the predictor variable. For multiple regression, R² estimates the proportion of the criterion variance that is predicted by the
model.
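The link between R² and power runs through Cohen's f² effect size and the noncentrality parameter. The sketch below illustrates that conversion under Cohen's (1988) conventions; the function names are mine, and this is not the program's own routine:

```python
def r2_to_f2(r2):
    """Convert R-squared to Cohen's f-squared effect size."""
    return r2 / (1 - r2)

def noncentrality(r2, n, u):
    """Noncentrality parameter L = f^2 * (u + v + 1), where u is the
    number of predictors and v = n - u - 1 is the error df."""
    v = n - u - 1
    return r2_to_f2(r2) * (u + v + 1)

print(round(r2_to_f2(0.20), 2))              # -> 0.25
print(round(noncentrality(0.20, 50, 2), 1))  # u = 2 predictors, n = 50 -> 12.5
```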
0.00 ≤ R² < 0.02 : No to Little Effect
0.02 ≤ R² < 0.13 : Little Effect to Moderate Effect
0.13 ≤ R² < 0.26 : Moderate Effect to Large Effect
0.26 ≤ R² : Large Effect
ESTIMATING POWER
We can use Power Calculator to calculate the power of the multiple linear regression with different numbers of predictors, α-levels, and sample
sizes. When you select this option, you will see that you can change several parameters of the statistic. Let’s look at each of these in turn.
Alpha Level
This option allows you to vary the α-level you plan to use for your research. Although the default value is set as α = 0.05, you can increase or
decrease the value of α. You can enter values between .50 and .00001.
Number of Predictors: U
This variable determines the number of predictor variables you will use in
the model.
Sample Size: N
You can enter sample sizes as small as 5 and as large as 9999.
COMPUTE
This function causes the program to create a power table for the parameters you have entered.
GRAPH POWER
This alternative offers a graph of the relation between sample size, effect
size and power for the α-level and type of hypothesis you are using. The
graph option is useful when you want a quick estimate of the sample size
your study may require.
HELP
The help function calls a help screen that should provide you with general information that will help you understand the features of this computational option. The help screens are a greatly abridged version of this
manual.
PRINT
Once the program computes the power table, you can click on this button to have the program print a version of the table to the printer.
EXIT
This function returns you to the Main Menu.
EXAMPLES OF POWER ESTIMATION
A researcher wants to use two measures to predict an outcome. The researcher believes that the effect size is moderate to large. Therefore, R² is
approximately .20. How many subjects should the researcher use in the
study? Set U = 2 and then click the mouse over the GRAPH POWER button.
According to Figure 7.1, the researcher will need approximately 50 subjects to find the effect.
Figure 7.1: The power graph produced by the Power Calculator.
The parameters are: α = .05 for a two-tailed test.
When you return to the computational format, you can test the accuracy
of the prediction. As you can see in Figure 7.2, if U = 2 and n = 50, the
power is 1 - β = .78.
Figure 7.2: The power table produced by the Power Calculator.
The parameters are: α = .05 for a two-tailed test.
CHAPTER 8: POWER ~ SIGN TEST AND P = .50
INTRODUCTION
There are many situations where the researcher wishes to determine
a proportion in the population. Consider a political election. A candidate
for an elected office may wish to know the proportion of registered voters who
will offer their support. If there are two candidates, Smith and Jones, a
pollster may want to determine the proportion of the voters who will vote
for Smith rather than Jones. If Smith and Jones are perceived equally by
the electorate, then P = .50. However, if Smith has an advantage over
Jones, then P > .50. By contrast, if more voters favor Jones, then P <
.50.
The effect size for the proportion test is g, which is determined as
g = P − .50    (8.1)
for directional tests, and as
g = |P − .50|    (8.2)
for nondirectional tests.
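The power of this test can be approximated with the normal approximation to the binomial. The program likely uses its own tables; the sketch below is only an illustration, and the function name is mine:

```python
import math
from statistics import NormalDist

def sign_test_power(p, n, alpha=0.05, tails=2):
    """Approximate power of the test of H0: P = .50, using the normal
    approximation to the binomial distribution."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / tails)
    se0 = 0.5 / math.sqrt(n)            # standard error under H0 (P = .50)
    se1 = math.sqrt(p * (1 - p) / n)    # standard error under H1
    g = abs(p - 0.5)                    # effect size g
    return nd.cdf((g - z_crit * se0) / se1)

print(round(sign_test_power(0.53, 500), 2))   # badly underpowered
print(round(sign_test_power(0.53, 2100), 2))  # close to .80
```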
0.00 ≤ g < 0.05 : No to Little Effect
0.05 ≤ g < 0.15 : Little Effect to Moderate Effect
0.15 ≤ g < 0.25 : Moderate Effect to Large Effect
0.25 ≤ g : Large Effect
ESTIMATING POWER
We can use Power Calculator to calculate the power of the sign test of the hypothesis that P = .50 for various sample sizes, α-levels, and directionality of
the test. When you select this option, you will see a screen similar to the
one presented in Figure X. As you can see, you can change several parameters of the statistic. Let's look at each of these in turn.
Alpha Level
This option allows you to vary the α-level you plan to use for your research. Although the default value is set as α = 0.05, you can increase or
decrease the value of α. You can enter values between .50 and .00001.
Number of Tails
This option allows you to toggle between a one- and two-tailed test.
Remember that when you use a two-tailed test you divide α between the
two extremes of the sampling distribution.
Sample Size
You may vary either of these values; when you change one, the other
is updated. You can enter sample sizes as small as 5 and as large as
9999.
COMPUTE
This function causes the program to create a power table for the parameters you have entered.
GRAPH POWER
This alternative offers a graph of the relation between sample size, effect
size and power for the α-level and type of hypothesis you are using. The
graph option is useful when you want a quick estimate of the sample size
your study may require.
HELP
The help function calls a help screen that should provide you with general information that will help you understand the features of this computational option. The help screens are a greatly abridged version of this
manual.
PRINT
Once the program computes the power table, you can click on this button to have the program print a version of the table to the printer.
END
This function returns you to the Main Menu.
EXAMPLES OF POWER ESTIMATION
A researcher wants to conduct a survey of local voters. He believes that
one party has a slight edge over the other in a specific county (e.g.,
P = .53). Therefore, g = .03. How many subjects should he sample in order to detect the effect? The graph presented in Figure 8.1 suggests that 500 will not be enough if α = .05, two-tailed.
Figure 8.1: The power graph produced by the Power Calculator.
The parameters are: α = .05 for a two-tailed test.
Figure 8.2 illustrates the power table when N = 2100. As you can see,
with this many subjects, we will have an 80% chance of detecting the difference if P = .53.
Figure 8.2: The power table produced by the Power Calculator.
The parameters are: α = .05 for a two-tailed test.
CHAPTER 9: POWER ~ DIFFERENCE BETWEEN
PROPORTIONS: P1 = P2
INTRODUCTION
When sampling from different populations or comparing the results of
two samples, one may want to determine whether the two proportions are
equal or significantly different. The difference between proportions can
be converted to a z-score using:
z = (P1 − P2) / √[ p̄(1 − p̄)/n1 + p̄(1 − p̄)/n2 ]    (9.1)
where p̄ is
p̄ = (n1P1 + n2P2) / (n1 + n2)    (9.2)
In both equations, n represents the sample size and P represents the observed proportion.
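Equations 9.1 and 9.2 can be sketched together. The sample values are hypothetical, and the function name is mine, not the program's:

```python
import math
from statistics import NormalDist

def z_two_proportions(p1, n1, p2, n2):
    """z test of H0: P1 = P2 (Eqs. 9.1 and 9.2)."""
    p_bar = (n1 * p1 + n2 * p2) / (n1 + n2)  # pooled proportion (Eq. 9.2)
    se = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed
    return z, p_value

z, p = z_two_proportions(0.60, 100, 0.45, 100)
print(f"z = {z:.2f}")  # z is about 2.12
```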
The first step is to convert the proportions to a common metric, φ:
φ = 2 arcsin √p    (9.3)
With this transformation we can determine the effect size, h, using:
h = φ1 − φ2    (9.4)
for directional tests and
h = |φ1 − φ2|    (9.5)
for nondirectional tests. When the sample sizes used to determine the
two proportions are unequal, we can create a harmonic sample size using:
n′ = 2n1n2 / (n1 + n2)    (9.6)
ESTIMATING POWER
We can use Power Calculator to calculate the power of the test of the difference between two proportions for various sample sizes, α-levels, and directionality of the test. As
you can see, you can change several parameters of the statistic. Let’s
look at each of these in turn.
Alpha Level
This option allows you to vary the α-level you plan to use for your research. Although the default value is set as α = 0.05, you can increase or
decrease the value of α. You can enter values between .50 and .00001.
Number of Tails
This option allows you to toggle between a one- and two-tailed test.
Remember that when you use a two-tailed test you divide α between the
two extremes of the sampling distribution.
Sample Sizes
You may vary either of these values; when you change one, the computer updates the other sample size. You can enter sample sizes as
small as 5 and as large as 9999.
COMPUTE
This function causes the program to create a power table for the parameters you have entered. The power estimates in the table will range between 0
and 1.00.
GRAPH POWER
This alternative offers a graph of the relation between sample size, effect
size and power for the α-level and type of hypothesis you are using. The
graph option is useful when you want a quick estimate of the sample size
your study may require.
HELP
The help function calls a help screen that should provide you with general information that will help you understand the features of this computational option. The help screens are a greatly abridged version of this
manual.
PRINT
Once the program computes the power table, you can click on this button to have the program print a version of the table to the printer.
EXIT
This function returns you to the Main Menu.
EXAMPLES OF POWER ESTIMATION
A researcher wishes to compare the proportion of women who support a
candidate against the proportion of men who support the same candidate. The researcher believes that the difference between the proportions
will produce a medium effect size (e.g., h = .50). Using the graph option
(Figure 9.1), you can see that the researcher will need about 70 subjects
in each group.
Figure 9.1: The power graph produced by the Power Calculator.
The parameters are: α = .05 for a two-tailed test.
We can confirm this estimation by entering the appropriate sample sizes
for n1 and n2. Using 70 men and 70 women will create a power of .84.
Figure 9.2: The power table produced by the Power Calculator.
The parameters are: α = .05 for a two-tailed test.
CHAPTER 10: POWER ~ ANALYSIS OF VARIANCE
INTRODUCTION
Much like the ubiquitous t-ratio, the Analysis of Variance (ANOVA) has
become the indispensable statistical tool for the contemporary researcher. Since its introduction in the 1920s by the great statistician, Sir
Ronald A. Fisher, the ANOVA has been cultivated in many disciplines.
Lovie (1979) for example, documented the introduction of the ANOVA to
psychology. In essence, the ANOVA freed researchers from the use of
haphazard single factor experiments whose results were analyzed by visual inspection, a hodgepodge of rudimentary descriptive statistics, and
considerable guesswork and subjective appraisal. Crutchfield and Tolman, two of the first researchers to use the ANOVA, noted that:
In this paper we wish to indicate the unique significance of multiple-variable designs in the study of those areas of behavior where it is
known or suspected that complex interaction of variables exist. ... According to [Tolman’s] system, direct study of the isolated relationships
between each of the independent variables or condition, on the one
hand, and the dependent variable-resultant behavior-on the other, is
considered not feasible. Instead, certain mediate conceptual constructs
(intervening variables) are developed, and these bridge the operational
gap between the independent variables and the dependent variables.
The underlying presupposition of this system is that the combination of
variables is not simple and direct in nature, but is a complex synthesis
of field-relations (1940, p. 39).
Crutchfield and Tolman recognized that the ANOVA afforded the opportunity to design experiments where one could simultaneously examine
the effects of several independent variables and their interaction upon
behavior. This is an essential insight for any science where the phenomenon of interest is affected by many variables operating in a complex
manner.
By the 1940s, statisticians had developed many of the forms of
ANOVA currently in use. Reading any advanced text on the ANOVA will
reveal that the statistic can be applied to univariate and multivariate
procedures, randomized and fixed models, Latin square designs, and
mixed models, to list but a few experimental models. With the advent of
post hoc test procedures that controlled experimentwise error rates (e.g.,
Tukey’s HSD, or the Scheffé test), the ANOVA now offers the researcher a
full complement of statistical tools.
The general logic of the ANOVA is quite simple. The total variance
among all subjects is subdivided into identifiable components. In the
simple single-factor design, the total variance is divided, or partitioned,
into variance due to differences among treatment conditions and variance
due to error. If the variance among groups is sufficiently larger than the
variance within groups, one can reject the null hypothesis that the two
variances are equal to each other.
54
Power: Analysis of Variance
A complete review of how the ANOVA partitions the total sum of
squares is beyond the agenda of this text. Rather than attempt to repeat
what is done well elsewhere, I will focus on how one can estimate the
power of a specific experimental design. The following section is a greatly
abridged review of the ANOVA test. This information will allow us to
knowledgeably discuss the power of the statistic.
FOUNDATION OF THE ANOVA
There are many ways to write the null and alternative hypotheses for
the F-ratio. One of the more common is to present the hypotheses as:
H0: 1 = 2 = 3 = k
10.1
H1: Not H0
10.2
In this form of the hypothesis, all group means are said to be equivalent
to each other. This hypothesis is satisfactory for simple main effects, but
can be more cumbersome when examining the relation among means for
an interaction. Consequently, I prefer to think of the null and alternative
hypotheses in a different form. Specifically, I note that:
H0: Variance Effect = Variance Residual    (10.3)
H1: Variance Effect ≠ Variance Residual    (10.4)
I am an advocate for this form of hypothesis testing for several reasons.
First, this hypothesis reminds us that the F-ratio is a nondirectional test.
The statistic merely indicates whether the variance among the means is
within the range predicted by sampling error. The F-ratio does not indicate the location of significant differences among the means (unless we
are comparing two groups).
A second reason that I prefer this form of hypothesis is that the
statements make clear what we are comparing in the statistic. The F-ratio is the ratio of the variance attributed to the effect (main effect or interaction) to the variance attributed to random effects. Therefore, the hypothesis can be readily applied to the analysis of main effects and interactions.
Of course, the hypothesis can be written in a more elementary form.
H0: FEffect = 1    (10.5)
H1: FEffect ≠ 1    (10.6)
INTERPRETING THE F-RATIO
Although the purpose of the ANOVA is to determine whether the null
hypothesis can be rejected, there are additional and important bits of information we can extract from the summary table. This information enhances our interpretation of the data and the experimental results. Besides the F-ratio, the ANOVA summary table offers two additional statistics that we need to use. These statistics are the (1) Measures of Association, and (2) Measure of Effect Size. Let's consider each of these statistics in turn.
F-RATIO
The F-ratio is an inferential statistic that allows us to determine
whether to reject the null hypothesis. If the size of F is sufficiently large,
we can reject the null hypothesis at a specified α-level. With proper
planning, we can design experiments that will yield data that will have
sufficient statistical power to reject the null hypothesis.
Aside from determining whether or not to reject the null hypothesis, the F-ratio affords no other direct comparison or interpretation. Unfortunately, many who interpret the F-ratio mistakenly believe that the
size of the statistic indicates the importance of the data, the likelihood
that the results of the experiment can be replicated, or the relation between the independent and dependent variables. Such interpretations
are incorrect. However, other statistical tools address these issues.
THE CORRELATION RATIO: η²
There are several ways to determine the relation between the independent and dependent variables. One of these is the correlation ratio,
which is represented as η². The η² is determined by dividing the sum of
squares for the effect by the total sum of squares. In mathematical
terms,
2 
2
 Effect
2
 Total
or as
2 
SSEffect
SSTotal
10.7
The numerator is the sum of squares for the effect and the denominator
is the sum of squares for the total variance. We can interpret the correlation ratio in the same manner as r², the coefficient of determination.
Specifically, η² will range between 0 and 1.0. Small values of η² indicate
that a small proportion of the total variance among the observations is
due to the treatment effect(s) and that the majority of the variance is due to
other factors such as error. Larger values of η² indicate that a greater portion of the differences among scores can be attributed to the treatment effect(s).
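The computation in Equation 10.7 is trivial to sketch (the summary-table values below are hypothetical):

```python
def eta_squared(ss_effect, ss_total):
    """Correlation ratio (Eq. 10.7): the proportion of the total
    variance attributable to the treatment effect."""
    return ss_effect / ss_total

# Hypothetical summary table values: SS_Effect = 30, SS_Total = 200
print(eta_squared(30, 200))  # -> 0.15
```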
EFFECT SIZE: f
Another useful statistic is the effect size of the F-ratio. The effect size
is the ratio of the standard deviation of the effect divided by the standard deviation of the error.
Mathematically, we define f as
f = σEffect / σError    (10.8)
Of course, 2 and f are interrelated as is indicated in the following
equations.
56
Power: Analysis of Variance
2
1 2
f 
10.9
f2
10.10
1 f 2
We can interpret f in the same way that we interpret d for the t-ratio.
Indeed, the t-ratio is a special case of the F-ratio when there is one degree of freedom for the numerator of the F-ratio. That is, t²(N − 2) = F(1, N − 2). Therefore, when there is one degree of freedom in the numerator of
the F-ratio, d = 2f.
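The conversions in Equations 10.9 and 10.10, and the d = 2f relation, can be sketched as follows (function names are mine):

```python
import math

def f_from_eta2(eta2):
    """Eq. 10.9 rearranged: f = sqrt(eta^2 / (1 - eta^2))."""
    return math.sqrt(eta2 / (1 - eta2))

def eta2_from_f(f):
    """Eq. 10.10: eta^2 = f^2 / (1 + f^2)."""
    return f ** 2 / (1 + f ** 2)

f = f_from_eta2(0.20)
print(round(f, 3))               # -> 0.5
print(round(eta2_from_f(f), 3))  # round-trips back -> 0.2
print(round(2 * f, 3))           # d = 2f in the two-group case -> 1.0
```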
Following Cohen’s (1988) lead, we can characterize the magnitude as
small, medium, and large.
2 
0.00
0.10
0.25
0.40




f < 0.10:
f < 0.25:
f < 0.40:
f
No to Little Effect
Little Effect to Moderate Effect
Moderate Effect to Large Effect
Large Effect
We can now use this basic information to proceed with a purposeful
study of the power analysis of the ANOVA.
SPECIAL ISSUES FOR POWER ESTIMATION FOR THE ANOVA
When conducting any empirical research, one must be concerned with
the accuracy with which the dependent variable is measured. Measurement error refers to the fact that the measurement of the dependent variable is subject to random variation. This variation, in combination with
variance created by intrasubject differences and other sources of random
error, increases the error term of the ANOVA and degrades the power of the
statistic. Therefore, increasing the accuracy with which measurement is
made can increase power.
Intrasubject variability is another issue that influences the power of
any statistic. As the average difference among subjects increases the
ability to detect differences among the groups decreases. Fortunately,
researchers have access to a number of methodological and statistical
procedures that reduce or statistically control for intrasubject variability.
Among the more common statistical procedures that will be examined are blocking, within-subjects designs, and the analysis of covariance.
Measurement Error and Power
Measurement error reflects the reliability with which a construct is
measured. Reliable tests produce consistent results; unreliable tests do
not. When the reliability of the measurement techniques is perfect the
correlation between two sets of measurements will be rXX = 1.00. A
measurement procedure with no reliability will produce measurements
that can be best described as a series of random numbers. In such cases
rXX = 0.00.
Hopkins and Hopkins (1979), and Rogers and Hopkins (1988) demonstrated that the relation between power and measurement error can be
expressed using the following equation.
fρYY = √(ρYY) f1.0    (10.11)
In this equation, fρYY is the estimated effect size, ρYY is the estimate of test-retest reliability, and f1.0 is the effect size assuming perfect
measurement.
Here is a simple example of how this statistic can be used. Assume
that you are designing a study that involves a measure of academic
achievement. According to published results, the test-retest reliability of
the test is rYY = .64. You have reason to believe that the effect size of the
study is moderate, f = .35. If your estimate of effect size did not include
an estimate of measurement error, you will need to adjust your effect size
measure as f = .28 = √.64 × .35. In essence, measurement error reduces
power.
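The adjustment in the example above can be sketched directly (the function name is mine):

```python
import math

def attenuated_f(f_perfect, r_yy):
    """Eq. 10.11: effect size after accounting for measurement error."""
    return math.sqrt(r_yy) * f_perfect

# The example above: test-retest reliability .64, assumed effect f = .35
print(round(attenuated_f(0.35, 0.64), 2))  # -> 0.28
```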
This drop in power can be overcome through several strategies. The
first would be to increase the sample size to ensure sufficient power. Another alternative would be to increase the measurement accuracy. For
example, the test could be lengthened to increase reliability or you could
add an additional test to increase measurement accuracy. The choice will depend upon the relative cost of each. Using longer and
more complex tests may tax the patience of the participants and burden
the budget of the study.
Factorial Designs and Power
The factorial design has many advantages. Among these is the ability to
examine the interaction among two or more independent variables. Another advantage is that factorial designs are considered more cost effective than comparable single factor experiments. In essence, the factorial
design has the potential to make better use of subjects than simpler designs. Because of these two advantages, the factorial ANOVA has
the potential of increasing the power of a research project.
ANALYSIS OF COVARIANCE AND POWER
Another method for controlling intrasubject variation is the analysis of
covariance or ANCOVA. The ANCOVA is a sophisticated statistical technique that systematically estimates intrasubject variability due to a specific subject variable. This variability is then removed from the general
residual estimate, thus increasing the power of the measure.
Rogers and Hopkins (1988) provided an estimate of the effect size
when one uses an analysis of covariance. Specifically, they noted that
f '
XX
 YY

YY f 1 .0
1   
2
X
YY  XX
Y

10.12
In the equation, 2X  Y represents the true score correlation between the
covariate and the dependent variable. The variables XX and YY represent the reliability of the measure of the covariate and the dependent variable, respectively.
Other than this transformation, the same procedures for estimating
the power of the conventional ANOVA can be used to estimate the power
of the ANCOVA. In other words, the following examples may be used for
either the ANOVA or ANCOVA.
ANOVA VS ANCOVA
As a generality, the ANCOVA is more powerful than the ANOVA for the
simple reason that the ANCOVA identifies an additional source of error
variance that is then removed from the general error term. The consequence is a larger F-ratio. Although it is tempting to assume that the
ANCOVA is always the superior research tool to the ANOVA, there are
instances where the ANOVA may be preferred.
Maxwell, Cole, Arvey, and Salas (1991) demonstrated that there are
clear instances where the ANOVA may be the preferred statistical tool.
As they noted in their paper, the noncentrality parameter for the
ANCOVA is:
λANCOVA = n Σαi² / [a σ²W (1 − ρ²XY)]    (10.13)
where n is the sample size, αi is the deviation between the mean of a
treatment condition and the grand mean, a is the number of levels for
the factor, σ²W is the within-group variance for the dependent variable,
and ρXY is the population correlation between the pre-test and post-test.
The noncentrality parameter for the ANOVA is:
λANOVA = n′k Σαi² / [a σ²W (1 + (k − 1)ρYY)]    (10.14)
Here, n′ is the sample size of the groups, ρYY is the test-retest reliability of the
dependent variable measure, and k is the factor by which the test is
lengthened or shortened.
There are several important aspects of these equations. First, the
noncentrality parameter is central to the estimation of power. For example, we can write the equation in its most simple form as:
f = √(λ / n)    (10.15)
Using these basic mathematical principles, Maxwell et al. (1991)
demonstrated that under special conditions, λANCOVA = λANOVA; that is,
n Σαi² / [a σ²W (1 − ρ²XY)] = n′k Σαi² / [a σ²W (1 + (k − 1)ρYY)]
Specifically, one can adjust the size of the dependent measure test in the
ANOVA to equal the power in the ANCOVA. Maxwell et al. demonstrated that
when ρYY and ρXY are small, the ANOVA with a longer post-test will be as
powerful and require fewer subjects than the comparable ANCOVA. For
larger values of ρYY and ρXY, the ANCOVA requires fewer subjects to
achieve the same power.
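The two noncentrality parameters (Eqs. 10.13 and 10.14) can be sketched and compared directly. The numerical values below are hypothetical, chosen so that the two parameters coincide (ρXY = ρYY = .5 with k = 2), which illustrates the trade-off Maxwell et al. describe:

```python
def lam_ancova(n, sum_alpha_sq, a, var_w, rho_xy):
    """Eq. 10.13: noncentrality parameter for the ANCOVA."""
    return n * sum_alpha_sq / (a * var_w * (1 - rho_xy ** 2))

def lam_anova(n_prime, k, sum_alpha_sq, a, var_w, rho_yy):
    """Eq. 10.14: noncentrality for the ANOVA with the dependent
    measure lengthened by a factor of k."""
    return n_prime * k * sum_alpha_sq / (a * var_w * (1 + (k - 1) * rho_yy))

# Hypothetical values: 3 groups, sum of squared treatment effects 4,
# within-group variance 16
print(round(lam_ancova(20, 4, 3, 16, 0.5), 2))    # -> 2.22
print(round(lam_anova(20, 2, 4, 3, 16, 0.5), 2))  # -> 2.22
```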
ESTIMATING POWER
Before the computer can estimate the power of your statistic, it must
know the specifics of the design you are studying. Specifically, you will
need to enter the number of factors used in the study and the number of
levels within each factor. It is essential to understand the terminology
used in this manual to use the program effectively. Here are general definitions of the terms the program uses.
Between-Subjects Factors
When the program begins, you will be asked to enter the number of
between-subjects factors that are used in the study. A factor is an
independent variable. A between-subjects factor is an independent
variable wherein the subjects are exposed to only one level of the
variable.
Within-Subjects Factors
The program will also ask you to enter the number of within-subjects factors in the design. A within-subjects factor is an independent variable where subjects are exposed to all levels of the
treatment condition. In other words, if the same subject is measured under several treatment conditions or tested on several occasions, then the variable is a within-subjects factor.
Levels of a Factor
For the analysis of variance, each factor will have two or more levels. Each level represents a unique condition within that factor.
Consider an experiment that examines the relation between drug
dosage and response. The drug treatment represents a factor.
Each dosage represents a level of the factor. If separate groups of
subjects are randomly assigned to each dosage condition, then drug
dosage is a between-subjects variable. If all subjects experience
each drug dosage during the course of the experiment, then drug
dosage is a within-subjects variable.
Sample Size
The sample size represents the number of subjects assigned to each
treatment condition. The program will assume that you intend to
assign equal numbers of subjects to all treatment conditions.
Once you have entered the relevant information, the program will create a type of ANOVA summary table. The table will identify each of the
major terms in the ANOVA, its degrees of freedom, the adjusted sample
size (N′), and the power of the effect for various effect sizes. Let's consider several examples as a way to illustrate the use of the program.
Example 1: One Way ANOVA
A researcher wants to conduct a single factor ANOVA with five levels of
the factor. Subjects are to be randomly assigned to each of the five
treatment conditions. Therefore, there is one between-subjects factor
with five levels. We will assume that there are to be 5 subjects in each
group and that α = .05. In response to the computer's questions we enter
the following information:
Number of Between-Subjects Factors: 1
Levels of B-S Factor 1: 5
Number of Within-Subjects Factors: 0
Sample Size: 5
α: .05
The computer will generate a table similar to the one in Figure 10.1. If
you want to examine larger effect sizes, use the arrow pointing to the
right. The arrow pointing to the left reveals smaller effect sizes.
Figure 10.1 An example of the output for the ANOVA power estimate. For this example the program estimated the power of a
one-way ANOVA with five levels of the independent variable, five
subjects in each group and α = .05.
You can experiment with the power calculator by changing the sample
size and α-level. After you change these values, activate the REDO option for a revised set of power estimates.
Example 2: One Way ANOVA with repeated measures
A researcher is interested in how rapidly people will forget information
and arranges to test subjects once a week for five weeks after they have
memorized specific information. Because the researcher is testing the
same subjects once a week, time is a within-subjects variable. Therefore,
there is one within-subjects factor with five levels. We will assume that
there are to be 5 subjects in the study and that α = .05. In response to
the computer's questions we enter the following information:

    Number of Between-Subjects Factors:  0
    Number of Within-Subjects Factors:   1
    Levels of W-S Factor 1:              5
    Sample Size:                         5
    α:                                   .05
The computer will generate a table similar to the one in Figure 10.2. If
you want to examine larger effect sizes, use the arrow pointing to the
right. The arrow pointing to the left reveals smaller effect sizes.
Note that in the “Model” line the 5 is surrounded by brackets ( [5] ).
The program does this to indicate that the factor is a within-subjects variable.
Figure 10.2 An example of the output for the ANOVA power estimate.
For this example the program estimated the power of a one-way
repeated-measures ANOVA with five levels of the independent variable,
five subjects in each group, and α = .05.
You can experiment with the power calculator by changing the sample
size and α-level. After you change these values, activate the REDO
option for a revised set of power estimates.
Example 3: Two-Way Factorial ANOVA
A developmental psychologist wants to study children’s reaction to
strangers. The researcher decides to study the interaction between the
age of the child and the sex of the stranger. The first factor, sex, has two
levels. The researcher decides to make this a between-subjects variable:
half of the children meet a stranger who is male, the others meet a
stranger who is female. The second factor is also a between-subjects
variable because each child is tested only once. The researcher decides to
use four age groups representing children who are 4-8 months, 8-12
months, 12-16 months, and 16-24 months. Therefore, the researcher
has a 2 × 4 factorial design.
    Number of Between-Subjects Factors:  2
    Levels of B-S Factor 1:              2
    Levels of B-S Factor 2:              4
    Number of Within-Subjects Factors:   0
    Sample Size:                         5
    α:                                   .05
Figure 10.3 An example of the output for the ANOVA power estimate.
For this example the program estimated the power of a two-way ANOVA
with two levels of the first variable, four levels of the second factor,
five subjects in each group, and α = .05.
Example 4: Two-Way Factorial ANOVA With One Repeated Factor
Assume that a clinical researcher wants to examine the long-term
effectiveness of three forms of psychotherapy across time. First, subjects
are randomly assigned to one of three treatment programs: psychoanalytic,
humanistic, or behavioral. After a fixed number of sessions, the
treatment is terminated. The participants in the study are contacted 3, 6,
and 12 months after the end of the treatment for assessment. The first
factor is a between-subjects factor and has three levels. The second
factor is a within-subjects variable with three levels. Assume that there are
21 subjects (7 in each group). Therefore, the information supplied to the
computer is:
    Number of Between-Subjects Factors:  1
    Levels of B-S Factor 1:              3
    Number of Within-Subjects Factors:   1
    Levels of W-S Factor 1:              3
    Sample Size:                         7
    α:                                   .05
Figure 10.4 An example of the output for the ANOVA power estimate.
For this example the program estimated the power of a two-way ANOVA
with three levels of the first variable, three levels of the second factor,
seven subjects in each group, and α = .05.
Note that in the "Model" line, the second 3 is surrounded by brackets
( [3] ). The program does this to indicate the factor that is the
within-subjects variable.
Power: 2
CHAPTER 11:
65
POWER ~ 2
INTRODUCTION
Where the analysis of variance is the statistic of preference for data
described as continuous and interval or ratio, the χ² is the statistic of
preference for categorical data. This statistic is a frequently used
inferential procedure in both the natural and social sciences. Thus, it is
not uncommon to find extensive references to the statistic made by
geneticists studying population characteristics, political scientists
studying voting patterns, and psychologists studying the relation between
attribution and behavior.
Since its introduction by Karl Pearson in 1900, many excellent
accounts of the use and misuse of the statistic have been written. I refer
you to several of the more lucid for your review so that we may proceed
with a review of the power of the χ².
As a brief review, the χ² for a contingency table having I rows and J
columns is determined using the following equation:

    χ² = Σ Σ (Oij - Eij)² / Eij        (11.1)

where the sums run over the I rows and J columns, and Oij and Eij
represent the Observed and Expected frequencies for each cell in the χ²
table. The degrees of freedom for the χ² are simply

    df = (Number of Rows - 1)(Number of Columns - 1)        (11.2)
An alternative application of the χ² is the goodness of fit test. This
test is used to determine whether a single row of frequencies conforms to
some predetermined set of frequencies. When using the goodness of fit
test, the degrees of freedom are simply the number of cells less 1.
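Equations 11.1 and 11.2 can be sketched in a few lines of Python. This is an illustration of the formulas, not the program's own code; the function name `chi_square` is made up for the example.

```python
def chi_square(table):
    """Pearson chi-square statistic and df for an I x J contingency table."""
    rows = [sum(r) for r in table]              # row totals
    cols = [sum(c) for c in zip(*table)]        # column totals
    n = sum(rows)                               # grand total
    # Expected frequency for cell (i, j) is row total * column total / N.
    chi2 = sum((table[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
               for i in range(len(rows)) for j in range(len(cols)))
    df = (len(rows) - 1) * (len(cols) - 1)      # equation 11.2
    return chi2, df
```

For a 2 × 2 table such as [[10, 20], [20, 10]], every expected frequency is 15, so the statistic is 4 × (25/15) ≈ 6.67 with 1 degree of freedom.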
To determine whether or not to reject the null hypothesis, the observed
χ² is compared to a critical value of χ² based on the degrees of freedom
and α-level selected.
If you reject the null hypothesis for χ², you can calculate several
descriptive statistics that aid in the interpretation of the statistic. Of the
many that are available, the contingency coefficient, C, is the most useful
to us in our exploration of power. In brief, C is a measure of association
that indicates the degree to which the row and column variables are
related to each other. We can determine C using
    C = √( χ² / (χ² + N) )        (11.3)
Power: 2
66
This statistic is important because we can use it to develop a measure
of effect size, w. Technically, the effect size of the χ² is determined by
    w = √( Σ Σ [P(Oij) - P(Eij)]² / P(Eij) )        (11.4)

where P(Oij) and P(Eij) represent the proportion of the total frequency
represented in each cell. A more convenient method of calculating w,
however, is
    w = √( C² / (1 - C²) )        (11.5)
As with other measures of effect size, Cohen (1988) has recommended
several benchmarks for interpreting w. Specifically,

    0.0 ≤ w < 0.10 :  No to Little Effect
    0.1 ≤ w < 0.30 :  Little Effect to Moderate Effect
    0.3 ≤ w < 0.50 :  Moderate Effect to Large Effect
    0.5 ≤ w        :  Large Effect
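Equations 11.3 and 11.5 are straightforward to compute. The sketch below is illustrative only; the function names are made up for the example. Note that substituting 11.3 into 11.5 reduces w to √(χ²/N), which is a handy check.

```python
import math

def contingency_C(chi2, n):
    """Contingency coefficient, equation 11.3."""
    return math.sqrt(chi2 / (chi2 + n))

def effect_w(chi2, n):
    """Effect size w from C, equation 11.5 (algebraically sqrt(chi2 / n))."""
    c = contingency_C(chi2, n)
    return math.sqrt(c * c / (1 - c * c))
```

For example, an observed χ² of 8 on N = 50 observations gives w = √(8/50) = 0.40, a large effect by Cohen's benchmarks.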
ESTIMATING POWER
We can use Power Calculator to calculate the power of the χ² test for
various sample sizes, degrees of freedom, and α-levels. When you select
this option, you will see a screen similar to the one presented in Figure X.
As you can see, you can change several parameters of the statistic. Let's
look at each of these in turn.
Alpha Level
This option allows you to vary the α-level you plan to use for your
research. Although the default value is set as α = 0.05, you can increase
or decrease the value of α. You can enter values between .50 and .00001.
The χ², like the F-ratio, is considered a nondirectional test.
Degrees of Freedom
As noted previously, the degrees of freedom depend upon the size of
the table used to conduct the test. For a goodness of fit test, the degrees
of freedom are the number of cells less 1. For the contingency table, the
degrees of freedom are (R - 1)(C - 1).
Sample Size
As the name implies, the sample size reflects the total number of
observations used to make up the table of data.
COMPUTE
This function causes the program to create a power table for the parameters you have entered.
Power: 2
67
GRAPH POWER
This alternative offers a graph of the relation between sample size, effect
size, and power for the α-level you are using. The graph option is useful
when you want a quick estimate of the sample size your study may
require.
HELP
The help function calls a help screen that should provide you with general information that will help you understand the features of this computational option. The help screens are a greatly abridged version of this
manual.
PRINT
Once the program computes the power table, you can click on this button
to have the program print a version of the table to the printer.
EXIT
This function returns you to the Main Menu.
EXAMPLES OF POWER ESTIMATION
Two researchers believe that the effect size for a study they wish to
conduct is relatively large (e.g., w = .40). Will an N of 50 observations be
sufficient? According to the results presented in Figure 11.1, the answer
is yes. The power is 1 - β = .82.
Figure 11.1: The power table produced by the Power Calculator.
The parameters are: α = .05.
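A χ² power estimate of the kind tabled here can be approximated with the noncentral χ² distribution, with noncentrality N·w². The sketch below is an illustration, not the program's own algorithm, and the function name is made up; with w = .40, N = 50, and df = 1 it gives a power of roughly .81, close to the value read from the table.

```python
from scipy.stats import chi2, ncx2

def chi2_power(w, n, df, alpha=0.05):
    """Approximate power of a chi-square test for effect size w and N observations."""
    crit = chi2.ppf(1 - alpha, df)       # critical value at the chosen alpha
    return ncx2.sf(crit, df, n * w ** 2) # P(chi2 > crit) under noncentrality N*w^2
```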
CHAPTER 12: RANDOM NUMBER GENERATOR
INTRODUCTION
This chapter describes five separate options built into the Power
Calculator program. Each of these options generates a table of some form
of random number. Using these options, you can (a) generate random
integers between two extremes, (b) generate a normally distributed random
sample with a specified mean and standard deviation, (c) create a random
sequence for assigning subjects to treatment conditions, (d) generate a
Latin Square of a specified size, and (e) generate residual arrays that
produce whole numbers. Each option will print the results to either the
computer screen or the printer.
RANDOM INTEGERS
Random numbers are essential for work in statistics and experimental
design. Consider a political scientist who wishes to poll the registered
voters of a county. In order to produce useful data, the researcher will
need to ensure that sample is not biased. One protection against selection bias is to use random selection. Let’s look at how the researcher
could use the program.
When the Random Integers option is selected, you will see a screen
similar to the one presented in Figure 12.1. Options available are:
Figure 12.1 The initial screen that you will see when you select
the Random Integer option.
Lowest Value in Sample
This number represents the lowest integer that can potentially
be in your sample. You may include positive as well as negative numbers. The only requirements are that this number be
less than the highest potential integer in the sample and a
whole number.
Highest Value in Sample
This number represents the highest integer that can potentially be in your sample. You may include positive as well as
negative numbers. The only requirements are that this number be greater than the lowest potential integer in the sample
and a whole number.
Sample Size
This tells the computer how many numbers to generate for the
sample.
Print to Screen
You may print the sample of integers to the screen or your
printer.
For the sake of illustration, assume that we want to generate 100
numbers between 0 and 25,000, inclusive. After you enter the appropriate
values, press the Compute button. The computer will generate a screen
that looks something like the following.
Figure 12.2 An example of the printout created by the Random
Integer program. For this example, the computer generated 100
random numbers between 0 and 25,000.
The program prints the numbers in the order it creates them; the
numbers are not sorted, and there is a chance that the program will
produce duplicate values. The probability of duplicates increases as the
difference between the highest and lowest numbers decreases and as the
number of values in the sample increases.
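The behavior described above, uniform sampling with replacement over an inclusive range, can be sketched as follows. This is an illustration in Python, not the program's own code, and the function name is made up.

```python
import random

def random_integers(lowest, highest, size, seed=None):
    """Sample `size` integers uniformly, with replacement, from [lowest, highest]."""
    rng = random.Random(seed)
    # randint is inclusive at both ends, matching the program's behavior.
    return [rng.randint(lowest, highest) for _ in range(size)]
```

Because sampling is with replacement, duplicates can appear; for 100 draws from 0 to 25,000 they are unlikely, but they become common as the range shrinks or the sample grows.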
RANDOM NORMAL DISTRIBUTION
This option is similar to the previous option. The primary difference is
that the numbers that are generated are normally distributed and are
selected from a population with a mean and standard deviation that you
define. When this program begins, you may alter the value of the population mean, the population standard deviation, and the number of observations to include in the sample. The table can be printed to the
screen or the printer.
The following figure is an example of the printout that the program
generates. These data were generated from a population where
μ = 100 and σ = 15.
Figure 12.3 An example of the printout created by the Random Normal
Distribution program. For this example, 10 numbers were selected from a
population where μ = 100 and σ = 15.
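Drawing from a normal population with a user-specified mean and standard deviation can be sketched like this. The Python code is illustrative only; the function name is made up for the example.

```python
import random

def normal_sample(mu, sigma, size, seed=None):
    """Draw `size` values from a normal population with mean mu and SD sigma."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(size)]
```

With μ = 100 and σ = 15, the sample mean of a large sample should fall very close to 100.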
RANDOM ASSIGNMENT OF SUBJECTS
One of the key elements of a true experiment is the random
assignment of subjects to the control and treatment conditions. Many
researchers believe that valid conclusions about cause and effect can be
drawn only when random assignment determined the treatment conditions
the subjects experienced. This program will create a table that will allow
you to randomly assign subjects to treatment conditions.
Let's look at some examples of how the program could be used.
Assume that a researcher is conducting an experiment with four
treatment conditions. The experiment could be a single-factor design with
four levels of the factor, or a 2 × 2 factorial design. Each treatment
condition is designated by a number between 1 and 4, inclusive. We will
also assume that the researcher plans to put five subjects into each
treatment condition. Therefore, the number of groups is 4 and the number
of subjects is 5. Enter this information into the computer and then click
on the Compute button. The program will generate a table something like
the following figure.
The program created five sets of numbers ranging between 1 and 4.
Looking at Set 1, we can see that the first subject is assigned to Group 2,
the second subject is assigned to Group 3, the third subject is assigned
to Group 1, and the fourth subject is assigned to Group 4. The researcher
will follow this table until all the subjects are assigned to the appropriate
treatment conditions.
Immediately following is a test of randomness. This test determines
whether the distribution of numbers across the treatment conditions
(positions) is random. The computer generates a χ² test for each position.
As a general benchmark, if the p value is greater than .05, the column of
numbers can be considered random. For all practical purposes, the
numbers in this example appear to be reasonably random. If you are
suspicious of the table, generate a new one by returning to the previous
screen.
Figure 12.4 An example of a table of random numbers that can be used
to randomly assign subjects to treatment conditions. In this example
there were four treatment conditions with five subjects in each treatment
condition. The test of randomness is a χ² for the numbers in each
column, or treatment condition.
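A table of this kind, one shuffled block of group labels per subject "set", can be sketched as follows. This is an illustration, not the program's own code; it omits the χ² randomness test, and the function name is made up.

```python
import random

def assignment_table(groups, per_group, seed=None):
    """Build `per_group` sets; each set contains each group label exactly once."""
    rng = random.Random(seed)
    sets = []
    for _ in range(per_group):
        labels = list(range(1, groups + 1))
        rng.shuffle(labels)       # random order of the group labels
        sets.append(labels)
    return sets
```

Because every set is a permutation of the labels, following the table guarantees exactly `per_group` subjects per condition.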
LATIN SQUARE GENERATOR
Latin Squares are an important control and analysis procedure. In its
simplest form, the Latin Square is a counterbalancing procedure. The
Latin Square design has many methodological and statistical advantages.
To learn more about the Latin Square, consult an advanced text on
statistics such as Winer, Brown, and Michels (1991).
The Latin Square is a matrix of numbers. Each number represents a
specific condition. The distinguishing feature of the Latin Square is that
each number appears in each position only once. In other words, no
number will have the same position more than once. We can look at an
example to see how the Latin Square is constructed. The Latin Square in
Figure 12.5 is a 10 × 10 matrix. Each row and each column contains the
numbers between 1 and 10, inclusive. If you look closely at the second
number in the second row (7), you will see that this number is not
repeated again in that row or column.
Figure 12.5 An example of a 10 × 10 printout produced by the Latin
Square program. The numbers in the square represent the individual
treatment conditions. Each number resides in each row and column
location only once; there are no replications. The numbers at the ends of
the rows and columns are the totals of the row or column and thus a
check of the square.
The computer generates the Latin Square by selecting numbers at
random. The implication of this fact is twofold. First, each time you
generate a Latin Square it will be different from the previous Latin
Square.* The second implication is that with large Latin Squares, the
computer will spend some time generating the square. I cannot offer you a
good estimate of the time it will take to generate a Latin Square. Factors
such as the size of the square and the operating speed of your computer
interact to make it impossible to offer useful predictions. However, if you
require a large Latin Square (e.g., greater than 10 × 10), be patient. If the
program seems to be "stuck," wait; the program is looking for a valid
solution for your request. If you become impatient, click on the Exit
button and then restart the program.
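The program searches for a square by random trial and error, which is why large squares take time. A common shortcut, sketched below as an illustration (this is not the program's algorithm, and the function name is made up), starts from a cyclic square and randomizes its rows, columns, and symbols; this always terminates immediately, though it does not sample uniformly from all possible squares.

```python
import random

def latin_square(n, seed=None):
    """Random n x n Latin square via a randomized cyclic construction."""
    rng = random.Random(seed)
    # Cyclic base square: row i is (i, i+1, ..., i+n-1) mod n, plus 1.
    rows = [[(i + j) % n + 1 for j in range(n)] for i in range(n)]
    rng.shuffle(rows)                      # permute the rows
    cols = list(range(n))
    rng.shuffle(cols)                      # permute the columns
    symbols = list(range(1, n + 1))
    rng.shuffle(symbols)                   # relabel the treatment numbers
    return [[symbols[rows[r][c] - 1] for c in cols] for r in range(n)]
```

Each row and column of the result contains every number from 1 to n exactly once, the defining property checked by the row and column totals in Figure 12.5.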
* In fact, when the number of rows is three or fewer, there is only one
possible solution. When there are four rows, there are four potential
solutions. Five rows produce 58 potential solutions. The number of
potential solutions increases rapidly as the number of rows is increased.

WHOLE NUMBER GENERATOR
In statistics, as in all disciplines, doing is learning. Homework
assignments, worked examples, and computational examples are effective
methods for teaching simple and complex statistical techniques.
Requiring students to work through examples allows them to practice their
computational skills as well as examine how a statistical test works. One
problem that students often encounter is computational error. These
errors are most likely to occur when an intermediate step produces a
number with a remainder. The student may transpose numbers, round
inappropriately, or make some other mistake that produces the wrong
answer. Finding the error in computation can be daunting and frustrating, and may make statistical work aversive for students. A partial solution to this problem is to create data sets that produce whole numbers
for the desired statistical test. These data sets allow students to work
through assignments that do not produce an intimidating collection of
numbers to manipulate. These experiences may help the student who is
easily threatened by numbers to develop a measure of confidence in his
or her abilities.
GENERATING SAMPLES WITH SPECIFIED MEANS AND STANDARD DEVIATIONS
A residual is ei = Xi - X̄. The program will allow you to generate
residual arrays with different sample sizes and standard deviations. Here
is an example of the arrays in the table.
–––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
Residuals for whole numbers: SD = (SS/(n–1))^.5
repeats = 5
–––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
Residual Arrays for N = 5 and s = 4
Array: e
  1:   7  -3  -2  -1  -1   :SK =  2.0
  2:   6  -5   1  -1  -1   :SK =  0.6
  3:   6  -4   2  -2  -2   :SK =  0.9
  4:   5  -5   3  -2  -1   :SK =  0.1
  5:   5  -5  -3   2   1   :SK = -0.1
  6:   4   4  -4  -4   0   :SK =  0.0
–––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
In this example, the standard deviation of each array is 4.0, using n - 1
in the denominator of the equation. The mean of each array is 0.0. The
table also includes the skew of the array. You can use these arrays to
generate data sets for students. All you need to do is add a constant to
each residual. The result will produce an array of data whose mean will
equal the constant. Specifically,

    Xi = ei + X̄
For example,

               e          X                e         X
               6 + 10 =  16                4 + 6 =  10
              -5 + 10 =   5                4 + 6 =  10
               1 + 10 =  11               -4 + 6 =   2
              -1 + 10 =   9               -4 + 6 =   2
              -1 + 10 =   9                0 + 6 =   6

    ΣX        0.0       50.0             0.0      30.0
    Mean      0.0       10.0             0.0       6.0
    Median   -1.0        9.0             0.0       6.0
    ΣX²      64.0      564.0            64.0     244.0
    (ΣX)²     0.0     2500.0             0.0     900.0
    ŝ         4.0        4.0             4.0       4.0
You can use the information about skew when you wish to illustrate how
outliers affect the skew of the data and the difference between the mean
and median. The greater the absolute value of the skew, the greater the
difference between the mean and median.
To reverse the sign of the skew, multiply the residuals by -1. For
example, the array

    e = (7  -3  -2  -1  -1)

has a skew of 2.0. Multiplying the array by -1 produces a new array

    e' = (-7  3  2  1  1)

which has a skew of -2.0.
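The manual does not state the exact skew formula the program uses; the standard adjusted skewness shown below reproduces the printed SK values closely (an assumption for illustration). Because residuals have mean zero, no centering step is needed.

```python
def skew(e):
    """Sample skew of a zero-mean residual array (n-1 denominator for the SD)."""
    n = len(e)
    s = (sum(x * x for x in e) / (n - 1)) ** 0.5          # SD of the residuals
    return n * sum(x ** 3 for x in e) / ((n - 1) * (n - 2) * s ** 3)
```

Negating every residual negates every cubed term, which is why multiplying an array by -1 flips the sign of its skew.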
FREQUENCY DISTRIBUTIONS
Using arrays of different sizes, you can create data sets for frequency
distributions, stem-and-leaf plots, or other forms of exploratory data
analysis. For example, with N = 10 and ŝ = 2, two of the many residual
arrays are

    e = (4  -4   1   1  -1  -1   0   0   0   0)
    e = (3   3  -3  -3   0   0   0   0   0   0)

The following graph of the data is a simple frequency distribution. The
two sample means are 9 and 10, respectively.
[Figure: Frequency distribution of two arrays; Frequency of X plotted
against Values of X from 1 to 15.]
CORRELATION AND REGRESSION
Using the residuals, we can generate data for correlation coefficients.
Although the final correlation coefficient will have fractional values, the
intermediate steps will be whole numbers. The following data represent
an extended example. Here are four sets of residuals where N = 6 and
ŝ = 4. For the sake of illustration, I rearranged the order of the residuals
to create different patterns of correlations. In this example, all the means
equal zero and every set has the same sum of squares. Therefore, the
correlation is the sum of the cross products divided by the sum of
squared residuals:

    r(XY) = Σ eX eY / Σ e²
Data For Correlations

    Subject      1      2      3      4
       1         7     -5     -3     -2
       2        -4     -4     -3     -4
       3        -3     -1     -3     -4
       4         2      2      0      6
       5        -1      3      2      2
       6        -1      5      7      2
    Σe²         80     80     80     80

Cross Products

    Subject    1·2    1·3    1·4    2·3    2·4    3·4
       1       -35     21    -14     15     10      6
       2        16     12     16     12     16     12
       3         3      9     12      3      4     12
       4         4      0     12      0     12      0
       5        -3     -2     -2      6      6      4
       6        -5     -7     -2     35     10     14
    ΣeXeY      -20     33     22     71     58     48

Correlations

              1         2         3         4
    1      1.000
    2     -0.250     1.000
    3      0.412     0.888     1.000
    4      0.275     0.725     0.600     1.000
By adding a constant to each array of residuals, you can eliminate negative values and present data that are more realistic.
You can, of course, use pairs of residual arrays that have different
standard deviations. Take care in your selection, however. The
denominator of the correlation coefficient is the geometric mean (the
square root of the product) of the sums of squares for the two samples. If
the product of the two sums of squares is a perfect square, then the
denominator will be a whole number. Other combinations of sums of
squares will produce denominators with fractional values.
ANALYSIS OF VARIANCE
One–Way ANOVA
Assume that you want students to complete a single-factor ANOVA with
four levels and six observations in each group. To create the data, we
need to follow several specific steps.

Step 1: Generate Group Means
For this example, there are four group means. We will set ŝ = 2.0. The
residuals for the group means are:
    e = (3  -1  -1  -1)
With these residuals, we can select the grand mean. The grand mean
determines the mean of each group. Let's set the grand mean to 10.
Therefore, the four group means are
    Ms = (13  9  9  9)
Recall that the estimate of the between-groups variance is:

    ŝ²(between) = Σ nj (X̄j - X̄)² / (k - 1)

As you can see, we need to recognize that sample size affects the sum of
squares between groups. In this example, the sum of squares among the
means is SS = 12.0, and the sample size for each group is 6. Therefore,
the sum of squares between groups is 6 × 12.0 = 72.0.
Step 2: Generate Individual Scores
We can now use these means to generate the data for the groups. For
this example, I selected ŝ = 2.0.

    X1 = (4  -1  -1  -1  -1   0) + 13
    X2 = (3  -3   1  -1   0   0) +  9
    X3 = (3   2  -2  -1  -1  -1) +  9
    X4 = (3  -2  -2   1   1  -1) +  9

                                          Totals
                                        ΣXi    ΣXi²
    X1 = (17  12  12  12  12  13)        78    1034
    X2 = (12   6  10   8   9   9)        54     506
    X3 = (12  11   7   8   8   8)        54     506
    X4 = (12   7   7  10  10   8)        54     506
                                        240    2552

    Source     Sum of Squares    df    Mean Square    F–Ratio
    Between         72.00         3       24.00        6.00
    Within          80.00        20        4.00
    Total          152.00        23
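The one-way ANOVA for data built this way can be verified with a few lines of Python. This sketch is illustrative, not the program's code; the function name is made up. Run on the four groups generated above, it reproduces the summary table exactly: SS between = 72, SS within = 80, F = 6.00.

```python
def one_way_anova(groups):
    """SS between, SS within, and F for equal-n groups of scores."""
    k = len(groups)
    n = len(groups[0])
    grand = sum(sum(g) for g in groups) / (k * n)
    ss_between = sum(n * (sum(g) / n - grand) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / n) ** 2 for x in g) for g in groups)
    f_ratio = (ss_between / (k - 1)) / (ss_within / (k * (n - 1)))
    return ss_between, ss_within, f_ratio
```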
Two–Way ANOVA
For this example, I will generate data for a 3 × 4 ANOVA with five scores
in each cell. As in the previous example, the first thing we need to do is
select a grand mean. For the sake of simplicity, we will use a grand mean
of 10.0.
Step 1: Generate Group Means
For this example, I chose the following residuals.

    eA = (5  -5   0)
    eB = (6  -2  -2  -2)
By adding 10 to each residual, we have the means for the main effects.

    MA = (15   5  10)
    MB = (16   8   8   8)

We can now create the mean for each cell in the factorial block. To
calculate the cell mean, multiply the row and column means and divide by
the grand mean. For example, the mean of A1B1 is 24 = (15 × 16)/10.
           b1    b2    b3    b4
    a1     24    12    12    12     15
    a2      8     4     4     4      5
    a3     16     8     8     8     10
           16     8     8     8     10
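The cell-mean rule described above, row mean times column mean divided by the grand mean, is a one-liner. The sketch is illustrative and the function name is made up.

```python
def cell_means(row_means, col_means, grand):
    """Cell mean = (row mean * column mean) / grand mean."""
    return [[r * c / grand for c in col_means] for r in row_means]
```

With MA = (15, 5, 10), MB = (16, 8, 8, 8), and a grand mean of 10, this reproduces the block of cell means shown above, e.g., A1B1 = 15 × 16 / 10 = 24.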
Step 2: Generate Individual Scores
For a sample size of n = 5 and ŝ = 3.0 there are only three residual arrays
that produce whole numbers.

    e1 = (5  -3  -1  -1   0)
    e2 = (4  -3  -3   1   1)
    e3 = (3   3  -3  -3   0)
We can randomly assign the arrays to the different cell means to generate
the data for the individual cells. For example:

              b1      b2      b3      b4    Totals
    a1        29      15      16      17
              21      15       9       9
              23       9       9      11
              23       9      13      11
              24      12      13      12
             120      60      60      60      300

    a2        13       7       8       9
               5       7       1       1
               7       1       1       3
               7       1       5       3
               8       4       5       4
              40      20      20      20      100

    a3        21      11      12      13
              13      11       5       5
              15       5       5       7
              15       5       9       7
              16       8       9       8
              80      40      40      40      200

    Totals   240     120     120     120      600
The calculations for the ANOVA produce whole numbers until we reach
the F-ratio. The F-ratios would have been whole numbers had I selected
arrays for the main effects with standard deviations whose ratios are
whole numbers. In fact, for Factor A, s = 5; for Factor B, s = 4; and s = 3
for each group.
    Source     Sum of Squares    df    Mean Square    F–Ratio
    A              1000.00        2      500.00       55.556
    B               720.00        3      240.00       26.667
    AB              120.00        6       20.00        2.222
    Within          432.00       48        9.00
    Total          2272.00       59
RUNNING THE PROGRAM
The program is very easy to use. Just enter the parameters for the data
you desire and then click a button; the computer does all the work. When
you start the program, the screen will be similar to the one in Figure 12.7.
As you can see, you can change several parameters. Let's review each in
turn.
Figure 12.7 Sample screen of the Whole Number Generator
Lower and Upper Sample Size
These parameters determine the size of the arrays that the computer will
generate. In the default mode, the values are 2 and 5. Therefore, the
computer will generate arrays of residuals for sample sizes of 2, 3, 4, and
5. You can set the upper sample size to 35. To generate arrays of only
one sample size, set the lower and upper values to the same value.
Lower and Upper Standard Deviations
These values determine the standard deviations of the arrays that the
computer will generate. In the default mode, the values are 2 and 5.
Therefore, the computer will generate arrays of residuals with standard
deviations of 2, 3, 4, and 5. You should note that not all sample size and
standard deviation combinations will produce an array. The computer will
indicate in the printout those combinations that are impossible to solve.
The largest standard deviation that you can enter is 20. Use large
standard deviations with caution, however. The larger the standard deviation the greater the number of arrays the computer will produce. By
large, I mean on the order of more than 20,000!
Denominator for SD
The computer will create data using N or N - 1 for the denominator of the
standard deviation. You can switch between 0, for N, and 1, for N - 1.
Replications in Set
This is an important parameter. In essence, you can determine the
frequency of equal values in the array of residuals. The number of
replications can be set to any number between 1 and the Upper Sample
Size. Restricting the number of replications will reduce the number of
arrays. In some cases, there may be no solution for N and s with a
specified maximum number of replications.
Compute
Clicking the mouse over this button causes the program to start generating the numbers. The computer will print the information to the screen,
printer, or the disk drive. If you print to the screen, the computer will
pause after each page of information until you press any key to continue
the report. Press the ESC key to quit the current run and start over.
Print
Each time you click on this button, the program will send the information to a different destination. The computer's screen is the default
device. You can also print to the printer or to the disk drive. If you print
to the drive, the computer will create a file named FILE####.NMS. The #s
represent numbers that the computer will generate for the data. Each
time you print to the disk drive, the computer will generate a new file.
The files are in ASCII format and can be easily read by any word processor.
Exit
This button returns control of the program to the main menu.
Settings
This option gives you access to a set of routines that allow you to change
the color of the screen and text as well as other information concerning
the program.
The following table is an example of the data generated by the computer.
For this example, I set the upper and lower sample size to 8, the standard deviations to 4, the denominator to N - 1, and the number of replications to 5. The computer found 70 unique arrays that produce whole
numbers.
------------------------------------------------------------------------------Residuals for whole numbers: SD = (SS/(n-1))^.5
repeats = 5
------------------------------------------------------------------------------Residual Arrays for N = 8 and s = 4
Array: e
1:
9 -2 -2 -2 -1 -1 -1
0
:SK = 2.08
2:
8 -5 -2
1 -1 -1
0
0
:SK = 1.12
3:
8 -4 -4
0
0
0
0
0
:SK = 1.14
4:
8 -4 -3
2 -1 -1 -1
0
:SK = 1.26
5:
8 -4 -3 -2
1
1 -1
0
:SK = 1.23
6:
8 -4
2 -2 -2 -2
0
0
:SK = 1.28
7:
8 -3 -3 -3
2 -1
0
0
:SK = 1.3
8:
8
3 -3 -2 -2 -2 -1 -1
:SK = 1.44
9:
8 -3 -3
2 -2 -2
1 -1
:SK = 1.33
10:
7 -6 -3
1
1
0
0
0
:SK = .3
11:
7 -6
2 -2
1 -1 -1
0
:SK = .37
12:
7 -6 -2 -2
1
1
1
0
:SK = .33
13:
7 -5 -4
2
1 -1
0
0
:SK = .48
14:
7 -5
3 -3 -2
0
0
0
:SK = .62
15:
7 -5
3 -3
1 -1 -1 -1
:SK = .64
16:
7 -5 -3 -3
1
1
1
1
:SK = .5
17:
7 -5
3 -2 -2 -2
1
0
:SK = .66
18:
7 -5 -3
2
2 -2 -1
0
:SK = .58
19:
7
4 -4 -3 -2 -1 -1
0
:SK = .91
20:
7 -4 -4
3 -2
1 -1
0
:SK = .69
21:
7 -4 -4 -3
2
1
1
0
:SK = .58
22:
7
4 -3 -3 -3 -2
0
0
:SK = .94
23:
7
4 -3 -3 -2 -2 -2
1
:SK = .98
24:
7 -4
3 -3
2 -2 -2 -1
:SK = .8
25:
7 -4 -3 -3
2
2 -2
1
:SK = .69
26:
7
3 -3 -3 -3 -3
1
1
:SK = .78
27:
6 -6
4 -2 -2
0
0
0
:SK = .14
28:
6 -6 -4
2
2
0
0
0
:SK = -.15
29:
6 -6
4 -2
1 -1 -1 -1
:SK = .16
30:
6 -6 -4
2
1
1
1 -1
:SK = -.17
31:
6 -6
3 -3
2 -1 -1
0
:SK = .01
32:
6 -6
3 -3 -2
1
1
0
:SK = -.02
33:
6 -6
2
2
2 -2 -2 -2
:SK = 0
34:
6 -5 -5
3
1
0
0
0
:SK = -.02
35:
6
5 -5 -2 -2 -1 -1
0
:SK = .58
36:
6 -5 -5
2
2
1 -1
0
:SK = -.06
37:
6
5 -4 -4 -1 -1 -1
0
:SK = .62
38:
6 -5
4 -4
1 -1 -1
0
:SK = .26
39:
6
5 -4 -3 -3 -1
0
0
:SK = .66
40:
6 -5
4 -3 -3
1
0
0
:SK = .3
41:
6
5 -4 -3 -2 -2
1 -1
:SK = .69
42:
6 -5
4 -3
2 -2 -1 -1
:SK = .37
43:
6 -5
4 -3 -2 -2
1
1
:SK = .33
44:
6 -5 -4
3
2 -2
1 -1
:SK = .16
45:
6 -5 -4 -3
2
2
1
1
:SK = .05
46:
6 -5
3
3 -3 -2 -2
0
:SK = .3
47:
6
4 -4 -4 -3
1
1 -1
:SK = .37
48:
6 -4 -4 -4
3
1
1
1
:SK = .16
49:
6
4 -4 -4
2 -2 -2
0
:SK = .42
50:
6 -4 -4 -4
2
2
2
0
:SK = .14
51:
6 -4 -4
3
3 -3 -1
0
:SK = .33
52:
6
4
3 -3 -3 -3 -2 -2
:SK = .62
53:
6 -4
3 -3 -3 -3
2
2
:SK = .33
54:
5
5 -5 -4 -2
1
0
0
:SK = .16
55:
5 -5 -5
4
2 -1
0
0
:SK = -.17
56:
5
5 -5 -3 -3
1
1 -1
:SK = .21
57:
5 -5 -5
3
3
1 -1 -1
:SK = -.22
58:
5
5 -5 -3
2 -2 -2
0
:SK = .26
59:
5 -5 -5
3
2
2 -2
0
:SK = -.27
60:
5
5 -4 -4 -3
2 -1
0
:SK = .3
61:
5 -5
4 -4
3 -2 -1
0
:SK = .05
62:
5 -5
4 -4 -3
2
1
0
:SK = -.06
63:
5
5 -4
3 -3 -2 -2 -2
:SK = .48
64:
5 -5
4 -3 -3
2
2 -2
:SK = .05
65:
5 -5 -4
3
3
2 -2 -2
:SK = -.06
Random Number Generator
83
------------------------------------------------------------------------------Residuals for whole numbers: SD = (SS/(n-1))^.5:
repeats = 5
Page
2
------------------------------------------------------------------------------66:
5
5
3 -3 -3 -3 -3 -1
:SK = .5
67:
5
4
4 -4 -3 -3 -2 -1
:SK = .37
68:
5
4 -4 -4
3 -3 -2
1
:SK = .16
69:
5 -4 -4 -4
3
3
2 -1
:SK = -.02
70:
4
4
4 -4 -4 -4
0
0
:SK = 0
------------------------------------------------------------------------------===============================================================================
Program Finished
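The listing's header states that each residual set's SD follows SD = (SS/(n-1))^.5. A minimal sketch of recomputing one entry's SD, plus a standard moment-based skew (this assumes each listed value appears once; the program's exact skew formula and rounding convention are not documented, so the listed SK values may differ slightly in the last decimal):

```python
import math

# Entry 24 from the listing above (assuming each value appears once)
residuals = [7, -4, 3, -3, 2, -2, -2, -1]
n = len(residuals)

ss = sum(x * x for x in residuals)   # sum of squares; the residuals sum to zero
sd = math.sqrt(ss / (n - 1))         # SD = (SS/(n-1))^.5, per the listing header

# Moment-based skew: m3 / m2^(3/2)
m2 = ss / n
m3 = sum(x ** 3 for x in residuals) / n
sk = m3 / m2 ** 1.5

print(f"SD = {sd:.4f}, SK = {sk:.2f}")
```

For entry 24 this yields an SD near 3.70 and a skew near the listed .8.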
Statistical Tables Generator
85
CHAPTER 13: STATISTICAL TABLES GENERATOR
INTRODUCTION
The goal of this option is to provide accurate tables that researchers frequently use when conducting statistical tests. Specifically, the program
will generate a table of values for a specific sampling distribution or table
of critical values based on the parameters that you supply. The obvious
advantage of this program is that it supplies a table on demand that
meets your particular needs. Because you can specify the parameters of
the table, the computer can create a specific table of values that may not
otherwise be readily available.
When you select this option, you will see a menu of alternatives like the
one in the following figure. As you can see, the program offers eight alternatives, including tables for t-ratios, F-ratios, χ², the correlation coefficient, and the normal and binomial distributions. Each option will print
the table to the screen or your computer’s printer.
Figure 13.1 Menu of options available for the Statistical Tables
option.
CRITICAL VALUES: t-RATIO
In this program you will have the opportunity to create the critical
values required to reject the null hypothesis for Student’s t-ratio. To use
the program you will need to specify the α-level to be used and whether
you wish a one- or a two-tailed test. The lower and upper degrees of freedom establish the size of the table. You can make the table small by selecting a small range between the upper and lower limits. Similarly, you
can make the table extremely large by setting the upper limit at an extremely large value.
86
Statistical Tables Generator
As a generality, the critical values of t vary considerably as the
degrees of freedom increase. Once the degrees of freedom reach approximately 120, the amount of change decreases to within
one or two parts per thousand. Indeed, with degrees of freedom as large
as these there is little difference between the t-distribution and the normal distribution.
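That convergence can be checked numerically. The sketch below is a from-scratch, illustrative helper (the program's own algorithm is not documented): it integrates the t density with Simpson's rule and bisects on the tail area to find a critical t.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2))
    return c / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=4000):
    """P(T > t), via Simpson's rule on [0, t]."""
    h = t / steps
    area = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - area * h / 3

def t_critical(alpha, df, two_tailed=True):
    """Critical t whose tail probability equals alpha (or alpha/2)."""
    target = alpha / 2 if two_tailed else alpha
    lo, hi = 0.0, 200.0
    for _ in range(60):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if upper_tail(mid, df) > target else (lo, mid)
    return (lo + hi) / 2
```

At α = .05 (two-tailed), the tabled values are about 2.228 at df = 10 but only about 1.980 at df = 120, already close to the normal distribution's 1.960.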
CRITICAL VALUES: F-RATIO
This program generates the critical values of the F-ratio for a specified
α-level and range of degrees of freedom. The program will generate a table of F-ratios that identifies the degrees of freedom and critical value. To
use the table, compare the observed F-ratio to the appropriate critical
value. If the observed value is greater than the critical value, the null
hypothesis may be rejected.
On some occasions, you may find that the observed F-ratio is less
than 1.0. You may want to know if the F-ratio is significantly less than
1. To conduct such a test, take the reciprocal of the F-ratio and reverse
the degrees of freedom. Then test the revised F-ratio against the tabled
values. For example, if the F-ratio were 0.436 with degrees of freedom of
Numerator = 4 and Denominator = 30, the revised F-ratio is
2.294 = 1/0.436 with degrees of freedom of Numerator = 30 and Denominator = 4.
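The reciprocal trick is easy to script; this small helper (illustrative only, not part of the program) returns the revised ratio and the swapped degrees of freedom:

```python
def reciprocal_f(f, df_num, df_den):
    """Convert an F-ratio below 1.0 into an equivalent upper-tail test:
    take the reciprocal and swap the degrees of freedom."""
    return 1.0 / f, df_den, df_num

# The example from the text: F = 0.436 with df (4, 30)
revised, num_df, den_df = reciprocal_f(0.436, 4, 30)
```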
CRITICAL VALUES: 2
This program produces critical values required to reject the null hypothesis for the 2 test. You may enter the -level and the range of degrees of freedom that the table should contain.
CRITICAL VALUES: r
When using the Pearson Product Moment Correlation Coefficient, one
may determine whether the correlation coefficient is different from 0. In
essence, the coefficient can be converted to a t-ratio that is then tested
against critical values. This program will generate the minimum size of r
required to reject the null hypothesis. You may set the -level, whether
the test is a one- or two-tailed test, and the range of degrees of freedom
to include in the table.
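The conversion described above follows the standard formula t = r·√df / √(1 − r²), with df = n − 2 pairs; a sketch (the function name is illustrative, and the program's internals are not shown):

```python
import math

def r_to_t(r, df):
    """Convert a Pearson r to a t-ratio, where df = n - 2 pairs of scores."""
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r)
```

For example, the tabled critical r of about .576 at df = 10 (two-tailed, α = .05) converts to roughly t = 2.23, the matching critical t.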
r TO z TRANSFORMATION
This program creates a table representing the transformation of correlation coefficients to z-scores using Fisher’s transformation. There are no
parameters to set. The creation of the table is automatic.
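Fisher's transformation itself is z' = ½·ln((1 + r)/(1 − r)), so a sketch of the table's computation (the function name is illustrative):

```python
import math

def fisher_z(r):
    """Fisher's r-to-z transformation; equivalent to math.atanh(r)."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

for r in (0.0, 0.25, 0.50, 0.75):
    print(f"r = {r:.2f}  z' = {fisher_z(r):.4f}")
```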
NORMAL DISTRIBUTION
The goal of this program is to create a table of z-scores and the corresponding proportion of the distribution under specific sections of the
curve. Once you set the range of the table, the program will create a table of z-scores and the proportion of the distribution above and below each
z-score.
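Such a table can be reproduced with Python's standard library; a small sketch of the same computation (the layout here only approximates the program's):

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal: mean 0, sd 1
print("z      below    above")
for i in range(0, 21, 5):          # z = 0.0, 0.5, 1.0, 1.5, 2.0
    z = i / 10
    below = nd.cdf(z)
    print(f"{z:4.2f}  {below:.4f}  {1 - below:.4f}")
```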
BINOMIAL DISTRIBUTION
The Binomial Distribution option creates the values for a Binomial distribution. The program allows you to determine P and the number of
events in the population. The program prints a table with four columns.
When the program creates the table, it provides you with basic information about the binomial distribution you created. The program will
print the values of P; Q (1 - P); and the mean, standard deviation, skew,
and kurtosis of the distribution. The table also includes four columns.
Column 1 represents the specific event that is represented by X(i).
The column will range in value between 0 and the number of events in
the distribution. The next column, headed pX(i), is the proportion of
the curve at the value of X. The final two columns are the cumulative proportions of the distribution. The first of these columns represents the cumulative proportion at and below X(i). The second represents the cumulative proportion above X(i).
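The four columns can be reproduced with a short sketch (math.comb supplies the binomial coefficient; the names and layout here are illustrative, not the program's):

```python
from math import comb, sqrt

def binomial_table(n, p):
    """Rows of (X(i), p at X(i), cumulative at/below X(i), cumulative above)."""
    q = 1.0 - p
    rows, below = [], 0.0
    for x in range(n + 1):
        px = comb(n, x) * p ** x * q ** (n - x)
        below += px
        rows.append((x, px, below, 1.0 - below))
    return rows

# Summary values the program also reports:
n, p = 10, 0.5
mean, sd = n * p, sqrt(n * p * (1 - p))
table = binomial_table(n, p)
```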
CHAPTER 14: ANOVA — MONTE CARLO SIMULATOR
INTRODUCTION
The ANOVA Monte Carlo Simulator is a program that has many functions
and uses. You can use the program as a tutorial or as a utility to generate data. As a tutorial, you can use the program to examine many of the
general principles of the Analysis of Variance. Like the other tutorials in
this package, the program allows you to vary many of the essential parameters of the ANOVA and then generate random data that fit those parameters. In this capacity you can learn more about important concepts
such as the robustness of the ANOVA when the population parameters
violate the specific assumptions of the statistic. As a utility, you can
generate data for homework assignments or other projects where one
needs data that fit within specific parameters. Let’s begin by looking at
how the program works. We will then examine different applications of
the program.
When the program begins, you will see a screen similar to the one presented in Figure 14.1. As you can see, there are many buttons. Each
button controls the operation of the program. The following is a brief description of each button.
Figure 14.1: Primary screen for the ANOVA Monte Carlo Simulator.
Design ANOVA Model
When you select this option, the program will allow you to design the
type of ANOVA you wish to examine. You can control the number of Between-Subject and Within-Subject factors, and the levels of each factor.
Therefore, you will be able to model simple one-way ANOVAs, factorial
ANOVAs, and mixed-model ANOVAs.
Change Parameters
This function allows you to change the parameters of the ANOVA you
are examining. Specifically, the program allows you to change the mean
(μ), standard deviation (σ), and sample size of each treatment condition
within the ANOVA. When the program begins, all parameters are set as:
μ = 5.0, σ = 2.5, and n = 5.
Plot Factors
Once you have generated data, the program will allow you to plot several graphs. These graphs represent each of the tests conducted in the
ANOVA. That is, you will be able to examine the results for the main effects and interactions conducted for the model you created.
Start Demonstration
This button starts the program. After you have designed the ANOVA,
set the parameters and the number of iterations, the program will generate data, conduct the ANOVA, and save the information to the computer’s
memory.
Iterations
You can use this option to set the number of trials to generate. For
quick demonstrations, you can use a small number of trials (e.g., 100).
For more comprehensive tests of the ANOVA, you can increase the iterations to a large value (e.g., 10,000). For homework projects, you can set
the iterations to the number of students in the class.
The program always starts by setting the number of iterations to 10.
You can change the number of iterations using the arrows to increase or
decrease the value, or by clicking the mouse over the number of iterations and then entering the desired value.
A word of caution is in order. Selecting a large iteration value may
commit you to a long computer run. Remember that the computer must
generate new random numbers for each iteration. If you are examining a
2 × 4 ANOVA with 10 observations within each cell, the computer must
generate 80 numbers and then conduct the appropriate ANOVA. These
steps, multiplied by the number of iterations you select, can create a significantly time-consuming task for the computer. Of course, your computer’s speed will affect the time required to complete the analysis. Similarly, a math coprocessor will enhance the speed of the intermediate calculations.
Significant Digits
The program generates data at random and will then round each
number to represent the number of significant digits (places to the right
of the decimal point). You can instruct the program to generate whole
numbers (0 significant digits) or numbers with up to 4 significant digits
(e.g., 2.1234).
The number of significant digits only affects the random numbers that
the computer generates, not the calculations. The computer uses the
highest level of precision to calculate the ANOVA and descriptive statistics and reports all statistics to the third significant digit.
Print Summary Tables to Screen
When you select this option, the program will print the ANOVA summary tables to the screen. This will allow you to see the summary tables
as the computer generates the data.
Print Summary Tables to Disk
This option allows you to make a permanent record of the ANOVA
summary tables. The program saves each completed summary table to
the disk drive. You can then retrieve the file using a word processor for
review or printing.
For each Monte Carlo Simulation, the computer will create a new file
in the Statistics Tutor subdirectory. The general format for these files
is FILEXXXX.AOV where the XXXXs represent the file number. Each
time you run a simulation, the computer creates a new file. These files
are numbered sequentially.
Print Summary Tables to Printer
This option allows you to make a permanent record of the ANOVA
summary tables. The program prints each summary table to the printer.
This option will slow the time it takes the computer to complete the Monte Carlo simulation.
Print Raw Data with Tables
If you print the summary tables to the disk or the printer, activating
this option causes the program to print the raw data used for each
ANOVA. You would use this option if you want a permanent copy of the
data for later analysis. The raw data are never printed to the computer
screen.
This option is especially useful for instructors who wish to create
homework assignments. Specifically, the instructor can create an
ANOVA design, set the population parameters, and then set the iterations
to the number of students in the class. Therefore, each student will produce a different data set for the assignment. Because the computer
prints the data along with the summary tables, the instructor has a
ready mechanism for checking the student’s work.
Create F-Ratio File
This option creates a special file that contains only the F-Ratios and
their p-values. You might use this option if you want a permanent table
of the F-Ratios and their p-values that you can use with a spread sheet
program.
For each Monte Carlo Simulation, the computer will create a new file
in the Statistics Tutor subdirectory. The general format for these files
is TABLEXXX.TAB where the XXXs represent the file number. Each
time you run a simulation, the computer creates a new file. These files
are numbered sequentially.
The following table is an example of the file that the computer created.
As you can see, the computer provides general information about the design of the ANOVA and the population parameters. For each data set,
the computer records the label and degrees of freedom, F-ratio, and probability value for each F-ratio in the ANOVA.
Table 14.1: Example of the Table Generated by the Monte Carlo Simulator.
ANOVA Model: 2
Number of Between-Groups Factors: 1
Factor 1: Name = 1   Levels = 2
Number of Within-Subjects Factors: 0

Parameters for ANOVA
Name   MU   SIGMA   n
A1     5    2.5     5
A2     5    2.5     5

Name        F-Ratio        p-value
A ( 1, 8)   1.214707       .3024517
A ( 1, 8)   .2932584       .6028903
A ( 1, 8)   8.424465E-02   .7790068
A ( 1, 8)   1.672475       .2320189
A ( 1, 8)   .2825279       .6094869
A ( 1, 8)   .1869996       .676846
A ( 1, 8)   .3089594       .59352
A ( 1, 8)   .5579816       .4764526
A ( 1, 8)   4.054998       7.882446E-02
A ( 1, 8)   3.627503       9.330428E-02
Other Features:
You will notice that there are two additional bits of information on the
screen. The first is the line ANOVA MODEL:. This line represents the
type of ANOVA you have designed. The program always begins by conducting a one-way ANOVA with two independent groups.
The next line is a graphic that represents how many ANOVAs the program has generated. When the program begins, it will draw a blue rectangle to show the proportion of the total iterations it has generated.
Figure 14.2: Example of ANOVA Monte Carlo Screen after
the program has generated the data for the simulation.
Practice Session:
Let’s begin with a simple practice session. We will use the default
ANOVA model and the default parameters. Click the mouse over the
Start button or press the letter “S” on your keyboard. The program will
immediately begin the process of generating the 10 ANOVAs. After the
program generates the data, the screen will look like the one in Figure
14.2.
Now that you have generated the 10 ANOVAs, let's look at the results.
Click the mouse over the Plot Factors button, or press the letter “P” on
the keyboard. You will see the screen change to the one presented in
Figure 14.3.
Figure 14.3: Listing of the available effects that can be examined
using the Plot Factors option.
As you can see, there is only one effect to examine, the Main Effect for
Factor A. If you had entered a factorial design, the program would present separate buttons for each Main Effect and interaction of factors.
Because we have only the one option, press the return key, or click the
mouse over the “A” button.
The program will then tell you that it is working. In essence, the program is looking at all the F-ratios it generated and preparing the information for the following graphs. Once it has analyzed the data, you will
see a graph similar to the one in Figure 14.4.
Figure 14.4: Sample screen of the frequency distribution of F-ratios generated by the Monte Carlo simulator. Note that the program generates all
numbers at random. Therefore, each sampling distribution will be different from all others.
This graph represents the frequency distribution of the F-Ratios generated for the ANOVA. Specifically, this is the frequency distribution of
F-Ratios for Factor A. The horizontal axis represents the F-ratios. These
values will range from 0.0 (on the left of the scale) to a large value of F.
The graph also represents the probability level of the F-ratios. In this
graph, the computer plotted the location of p = .5, p = .1, and p = .05.
The vertical axis represents the observed frequency of each F-Ratio.
You can now move the mouse through the graph. As you do, the
numbers on the lower right of the screen will change. These numbers
represent the location of the mouse in the graph. For example, the mouse
in Figure 14.4 is at an F-Ratio of 4.00. There was only one F-Ratio of this
value.
In essence, the data represented in Figure 14.4 represent the sampling distribution for the F-ratio under the following conditions. First, the null hypothesis is a true statement. Specifically, the two population means are
equal: μ1 = μ2. Second, the degrees of freedom are 1 for the numerator
degrees of freedom, and 8 for the denominator degrees of freedom. Of
course our sample size is small. Had we run more iterations, say 10,000,
the distribution of F-ratios would look more like the distributions created
by the Sampling Distributions tutorial.
You can also generate other graphs. Press the letter “P” or click the
mouse on the button to the left of the line “Press P to view Cumulative
Probabilities.” When you do, you will see a graph like the one in Figure
14.5.
Figure 14.5: Example of the cumulative probabilities graph. The horizontal axis represents the probability of the F-ratios. The vertical axis
represents the cumulative probability. The dark curved line represents
the ideal cumulative probability when the null hypothesis is true. The
lighter histogram represents the observed data.
This graph represents the cumulative frequency of the probabilities for
the tests you conducted. Along the horizontal axis are the probabilities.
The probabilities range from 1.0 (on the right of the screen) to .0009 (on
the left of the screen). The probabilities are plotted on a log
scale. The vertical axis represents the cumulative percentage of the F-ratios. The percentages range from 0% (the bottom of the scale) to 100%
(the top of the scale).
There are two elements in the graph. The first is a black curved line.
This line represents the cumulative probability that would occur for the
null hypothesis. If the null hypothesis were a correct statement, all the
probabilities should fit under this line. The second component is the
blue bars. These bars represent the observed cumulative frequency of
the probabilities. As you can see, for the 10 ANOVAs we conducted, the
blue bars are close to the black line. This occurred because the null hypothesis is a true statement - the two population means are equal to
each other.
Move the mouse around the graph. As you do, you will see the numbers in
the lower right of the screen change. These numbers represent the location of your mouse. In Figure 14.5, the mouse sits at p = .05 and a cumulative percentage of 5%.
In this example, we set the population parameters so that the null hypothesis was correct. That is, we ensured that μ1 = μ2. As we would expect, the size of the F-ratios followed what we would expect given this situation. We are now ready to begin experimenting with the ANOVA and
this program. Specifically, we can use the program to examine such issues as power and the robustness of the ANOVA.
EXAMINING POWER
You should recall that power is the ability to reject the null hypothesis
when the null hypothesis is a false statement. Power is affected by several variables including the sample size, the difference among the populations, the amount of within-group error, and the alpha level the researcher uses to test the null and alternative hypotheses. Researchers
attempt to maximize the power of their experiments by optimizing each of
these variables.
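These ideas can also be tried outside the program. The sketch below is a simplified stand-in for the simulator (all names are illustrative): it estimates power by brute force, first simulating a true-null run to find an empirical critical F, then counting how often samples drawn from unequal means exceed it.

```python
import random

def one_way_f(groups):
    """F-ratio for a one-way between-subjects ANOVA."""
    scores = [x for g in groups for x in g]
    grand = sum(scores) / len(scores)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_b = len(groups) - 1
    df_w = len(scores) - len(groups)
    return (ss_between / df_b) / (ss_within / df_w)

def simulate_f(means, sigma, n, iterations, seed):
    rng = random.Random(seed)
    return [one_way_f([[rng.gauss(m, sigma) for _ in range(n)] for m in means])
            for _ in range(iterations)]

# Empirical .05 critical value from a true-null run (all means equal)
null_fs = sorted(simulate_f([5, 5, 5, 5], 2.5, 5, 2000, seed=1))
f_crit = null_fs[int(0.95 * len(null_fs))]

# Power: proportion of F-ratios beyond f_crit when the means are 5, 6, 7, 8
alt_fs = simulate_f([5, 6, 7, 8], 2.5, 5, 2000, seed=2)
power = sum(f > f_crit for f in alt_fs) / len(alt_fs)
```

With these parameters the estimate lands in the neighborhood of the figure reported later in this chapter (roughly a quarter); exact values vary from run to run because the data are random.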
Let’s look at how we can use the program to examine the concept of
power as it relates to the ANOVA. In this example we will use a one-way
ANOVA with four levels of the independent variable. To set up the experiment, begin at the first page of the Monte Carlo program and click over
the Design ANOVA Model button. The program will begin with the request:
Enter Number of Between-S Variables
#B/S: 1
Because we are using a simple one-way ANOVA, just press the ENTER
key to indicate that there is one between-subject variable. The program
will then ask:
Enter LEVELS of Between-S Variable 1
LEVELS 2
The analysis we want to conduct calls for four levels of the independent
variable. Therefore, type 4 and press the ENTER key. The program will
then ask for the name of Between-Subject variable 1. You can either
press the ENTER key or type a short name and then press ENTER. The
program now asks for the number of within-subjects variables.
Enter Number of Within-S Variables
#W/S: 0
There are no within-subject variables in this example. Enter a 0 and
press the ENTER key. The program will automatically return to the first
page. Notice that at the bottom of the screen, the program prints:
ANOVA MODEL: 4
This indicates that we have entered the design of the ANOVA as a one-way ANOVA with four levels of the independent variable.
Now that we have entered the design of the ANOVA, we can change
the parameters of the population variables. Click the mouse over the
Change Parameters button. You will now see a screen like the one in
Figure 14.6.
Figure 14.6: Example of the screen for changing the parameters of a
simulation.
Note that all the population means () all equal 5.000, the population
standard deviations () equal 2.500, and the sample sizes are 5. For this
example, let’s set the four means as 5.0, 6.0, 7.0, and 8.0. Enter the appropriate value for the mean and press the ENTER key. You have now
changed the population means for the four groups. Click the mouse over
the Exit button to return to the first screen.
For the next step, set the number of iterations to 100. Click the
mouse over the 10, press the backspace key to clear space for the new
number, and then type 100. When you press the ENTER key, the number of iterations will be reset. You are now ready to start the demonstration. Click the mouse over the Start Demonstration button. The program
will automatically begin to generate the numbers for the separate ANOVAs.
When the program is done, select the Plot Factors option, and then
Factor A. You will then see a graph similar to the one in Figure 14.7.
Your graph will be somewhat different from the one you see below because the computer is generating the numbers at random. Therefore,
each run of the simulation will produce a slightly different pattern of results.
Figure 14.7: Frequency distribution of F-ratios produced by the Monte
Carlo Simulator.
When you click the mouse over the Cumulative Probabilities option you
will see a graph similar to the one in Figure 14.8. This is an important
graph because it allows us to examine the power of the ANOVA design.
Recall that the black curved line represents the cumulative probability that would be expected if the null hypothesis were true. That is,
the line represents the condition where μ1 = μ2 = μ3 = μ4. As you can see,
the actual cumulative probability level is above this line. This fact suggests that the probability of rejecting the null hypothesis is greater than
α. According to the graph, the cumulative probability is about 26%. In
other words, the probability of rejecting the null hypothesis is approximately 26%.
Figure 14.8: Cumulative probability distribution created by Monte Carlo
simulator.
By most standards, a power of 26% is small. What can the researcher
do to increase the power of a research design? One alternative is to increase the level of α. Although this is an easy and effective strategy,
most researchers do not like to use an α larger than .05. The options,
therefore, are to reduce the within-group variation, increase the differences among the means, and increase the sample size.
As a quick experiment, return to the main page of the ANOVA Monte
Carlo simulator and select the Change Parameters option and then increase the sample size of the groups from 5 to 10. Once you have
changed the sample sizes, rerun the simulation and determine the
change in power. As you will see, a small increase in sample size greatly
increases the power of the research design.
Figure 14.9: Cumulative probability distribution created by Monte Carlo
simulator.
You should note that this estimate of power is just that, an estimate.
True estimates of power can be determined mathematically. Cohen’s
book (19XX) contains a complete list of power tables for various statistics
and designs. That book, however, examines power under the condition
that the assumptions of the statistic are met. This program will allow
you to experiment with violations of the assumptions of the ANOVA.
As we noted above, there are many ways to experiment with the power
of a research design. So far, we have examined the effects of increasing sample size. You can continue to experiment with the program by
systematically altering the different parameters of the population. For
example, all else being equal, how does increasing or decreasing the within-group variation in each treatment condition affect the results? Similarly, how much difference must there be among the groups to increase
power to 80%?
Another dramatic means of affecting the power of the ANOVA is altering the design of the research methods. One of the more dramatic alterations involves using a within-subjects research design. Within-subjects
research designs are important in the behavioral sciences because they
offer the researcher increased power and the ability to detect interesting
effects. Let’s take a moment and examine how these designs work.
There are two general ways we can use a within-subjects design. The
first is known as a repeated-measures design. As the name implies, the
repeated-measures design includes many measures taken from the same
subject. Here is a simple example. A researcher may be interested in
how quickly people forget information. Therefore, she has people memorize a list of words. Once the people memorize the list, the researcher
tests their memory once every 3 hours for the next 12 hours. In this
case, the researcher has 4 observations for each subject. The independent variable is the passage of time and the dependent variable is the individual’s performance on the memory test.
The second general method for using a within-subjects design is the
matched-groups design. In a matched-groups design, the participants are
randomly selected but assigned to one of the treatment conditions on the
basis of some important characteristic. The researcher assigns the subjects to the groups in such a way that each group has nearly identical, or
matched, individuals. The researcher first selects a significant subject
variable that he or she believes affects the outcome of the results. Next,
the researcher ranks the individuals from highest to lowest on this important variable. Now comes the important part. The researcher begins
with the highest ranking individuals and randomly assigns each to one of
the treatment conditions. If there were four treatment conditions, the
researcher would take the four highest scoring subjects and randomly
assign each to one of the four treatment conditions.
Whether one uses a repeated-measures design or a matched-groups
design, the net result is the same. We can estimate the proportion of variance that is due to subject variables. In the within-subjects design, each
person serves as his or her own control. In the matched-groups design,
the matching variable allows us to identify an important contributing factor to total variance.
The net result is that a within-subjects design has the potential of being more powerful than the equivalent between-subjects design. The increase in power is due to the fact that the within-subjects design allows
us to estimate the portion of the total variance that is unique to differences among subjects. Because we have identified a new source of variation, the total error term of the ANOVA is reduced.
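The error-term reduction can be seen directly in the repeated-measures partition of the sum of squares. A sketch (an illustrative helper, not the program's code): the subject sum of squares is removed from what a between-subjects analysis would treat as error.

```python
def repeated_measures_f(data):
    """F for a one-way repeated-measures ANOVA.
    data: one row per subject, one column per treatment level."""
    n, k = len(data), len(data[0])
    grand = sum(x for row in data for x in row) / (n * k)
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    row_means = [sum(row) / k for row in data]
    ss_treat = n * sum((m - grand) ** 2 for m in col_means)
    ss_subj = k * sum((m - grand) ** 2 for m in row_means)   # removed from error
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_error = ss_total - ss_treat - ss_subj
    return (ss_treat / (k - 1)) / (ss_error / ((n - 1) * (k - 1)))
```

The more the subjects' scores correlate across conditions, the larger ss_subj becomes and the smaller the error term, which is why the within-subjects run above shows greater power.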
Here is a simple experiment you can perform. Return to the original
page of the program and select the Design ANOVA Model option. Then
enter the following information:
Enter Number of Between-S Variables
#B/S: 0
Enter Number of Within-S Variables
#W/S: 1
Enter LEVELS of Within-S Variable 1
LEVELS 4
After the program asks for the name of the within-subjects variable, it
will ask for the correlations among the groups. Accept the default value
of .750 by pressing the ENTER key.
Enter r = .750
When you are finished, the program will print a note at the bottom of your screen:
ANOVA MODEL: [4]
The model indicates that you entered a one-way ANOVA with repeated
measures. Now change the four population means to 5, 6, 7, and 8.
What we have done is create a situation similar to the earlier between-subjects design. The only difference is that we now know that there is a
significant correlation among the subjects’ scores across the treatment conditions.
Run 100 iterations and then look at the cumulative probability. The
following graph is from a simulation using the same parameters. Compare this graph to the one in Figure 14.9. As you can see, the power for
the within-subjects design is much greater.
Figure 14.10: Cumulative probability distribution created by Monte Carlo simulator.
ROBUSTNESS OF THE ANOVA
Robustness is a statistical term that refers to the ability of an inferential
statistic to afford accurate inferences when the mathematical assumptions of the statistic cannot be met. More specifically, robustness refers
to the degree to which the rate of Type I errors is held constant when
the assumptions are violated. Let’s look at an example for the sake of
illustration.
One of the key assumptions of the ANOVA is homogeneity of variance.
When we conduct an ANOVA we assume that the variances of all groups
are equal. This is an important assumption because the denominator of
the F-Ratio is known as a pooled error term. This phrase means that the
mean-squares error is really a type of average variance. If there are large
differences among the variances, then the pooled error term may not accurately reflect the typical variance for any group. If there are large differences among the variances, will the ANOVA continue to provide useful
information about the variance among groups? From the perspective of
hypothesis testing, will the ANOVA continue to create too many or too
few Type I and Type II errors? If the ANOVA is robust, then the rate of
Type I errors will remain relatively constant. If the test is not robust,
then the rate of Type I errors will be greatly increased or decreased.
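The "pooled error term" is simply a df-weighted average of the group variances, which is why large variance differences can distort it; a one-function sketch (illustrative names):

```python
def pooled_error(variances, ns):
    """MS-error as the df-weighted average of the group variances."""
    return (sum((n - 1) * v for v, n in zip(variances, ns))
            / sum(n - 1 for n in ns))

# With equal n this reduces to the plain mean of the variances. For the
# heterogeneous SDs of 1, 2, 4, and 6 used below (variances 1, 4, 16, 36)
# and n = 10 per group, the pooled term is 14.25, far from the variance
# of the most homogeneous group (1).
```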
Indeed, let’s see what would happen if we violated the assumption of
homogeneity. Return to the first page and design a one-way ANOVA with
four levels of the independent variable. In the Change Parameters option
set all population means to 5 and the sample sizes to 10. Now change
the four standard deviations to 1, 2, 4, and 6. This arrangement will create a considerable amount of heterogeneity of variance. How will this
arrangement affect the robustness of the ANOVA?
Figure 14.11: Cumulative probability distribution created by
Monte Carlo simulator.
As you can see in Figure 14.11, the violations of the homogeneity assumption had minimal influence on the rate of Type I errors around the
conventional testing area of  = .05. In this example, there was a slight
inflation of Type I errors, but these appear to be minimal. Let’s look at what
happens, however, when the sample sizes are not equal. Return to the
Change Parameters option and change the sample sizes to 4, 6, 8, and
10. When you run a simulation under these conditions you may produce
a cumulative percentage graph like the one in Figure 14.12.
Figure 14.12: Cumulative probability distribution created by Monte Carlo
simulator.
The results are dramatic! The violation of the homogeneity principle
coupled with unequal sample sizes greatly reduced the frequency of Type
I errors.
REFERENCES
Hopkins, K. D., & Hopkins, B. R. (1979). The effect of the reliability of the dependent variable on power. Journal of Special Education, 13,
463-466.
Kohr, R. L., & Games, P. A. (1974). Robustness of the analysis of
variance, the Welch procedure, and a Box procedure to heterogeneous
variables. Journal of Experimental Education, 43, 61-69.
Lovie, A. D. (1979). The analysis of variance in experimental psychology: 1934-1945. British Journal of Mathematical and Statistical Psychology, 32, 151-178.
Maxwell, S. E., Cole, D. A., Arvey, R. D., & Salas, E. (1991). A
comparison of methods for increasing power in randomized betweensubjects designs. Psychological Bulletin, 110, 328-337.
Rogers, W. T., & Hopkins, K. D. (1988). Power estimates in the
presence of covariate and measurement error. Educational and Psychological Measurement, 48, 647-656.
Scheffé, H. (1970). Practical solutions of the Behrens-Fisher
problem. Journal of the American Statistical Association, 65, 1501-1508.
Student. (1908). The probable error of the mean. Biometrika, 6,
1-25.
Wang, Y. Y. (1971). Probabilities of the type I errors of the Welch
tests for the Behrens-Fisher problem, Journal of the American Statistical
Association, 66, 605-608.
Welch, B. L. (1936). Specification of rules for rejecting too variable a product with particular reference to an electric lamp problem.
Journal of the Royal Statistical Society, 3, 29-48.
Welch, B. L. (1938). The significance of the difference between
two means when the population variances are unequal. Biometrika, 29,
350-362.
Welch, B. L. (1947). The generalization of ‘Student’s’ problem
when several different population variances are involved. Biometrika, 34,
28-35.
Welch, B. L. (1951). On the comparison of several mean values:
An alternative approach. Biometrika, 38, 330-336.
Winer, B. J., Brown, D. R., & Michels, K. M. (1991). Statistical
principles in experimental design (3rd ed.). Boston: McGraw-Hill.
Zimmerman, D. W., & Zumbo, B. D. (1993). The relative power of
the Wilcoxon-Mann-Whitney test and Student t test under simple bounded transformations. The Journal of General Psychology, 117, 425-436.