Using SPSS, Chapter 11: Additional Hypothesis Tests

Here we see how to use SPSS to perform Chi-Squared and ANOVA tests.
• Chapter 11.1 - Chi-Squared Test for Goodness of Fit
If your data is in a frequency table, be sure to first weight your data by frequency.
Data → Weight Cases...
Either way, you proceed by
Analyze → Nonparametric Tests → One Sample...
– Assuming equal probabilities.
– Assuming non-equal probabilities.
• Chapter 11.2 - Chi-Squared Test of Independence
If your data is in a frequency table, be sure to first weight your data by frequency.
Data → Weight Cases...
Either way, you proceed by
Analyze → Descriptive Statistics → Crosstabs...
• Chapter 11.3 - ANOVA
Your data must be in standard SPSS format (rows = cases & columns = variables).
Analyze → Compare Means → One-Way ANOVA...
• Creating and Importing Data
Chapter 11.1 - Chi-Squared Test for Goodness of Fit
• Assuming Equal Probabilities
This video demonstrates the Preliminary Example from Chapter 11.1. Here, the assumed probabilities
in the null hypothesis are all equal. There are two parts to this video. In part 1, the data is in the form
of a frequency table (first table below). In part 2, the data is in standard format (second table below).
Part 1: Data in a Frequency Table
Example 1, Hypothesized Even Distribution
Sample Distribution for 60 Rolls of a Single Die
Outcome (# on die)   Observed Frequency (Oi)   Assumed Probability (pi)
1                    7                         1/6
2                    6                         1/6
3                    11                        1/6
4                    15                        1/6
5                    13                        1/6
6                    8                         1/6
Part 2: Data in Standard Format
Rows = Cases and Columns = Variables
Roll #   Outcome
1        6
2        3
3        2
4        2
5        1
...      ...
60       5
Results Calculated in Textbook:
χ² = 6.400, P-value = 0.2692
Fail to reject the null hypothesis.
Conclusion: There is not enough evidence to conclude that the die is not fair.
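The textbook result can be cross-checked outside SPSS. The following is a minimal Python sketch, not part of the original handout; the names `chi_square_gof` and `chi2_sf` are illustrative and are not SPSS functions. It assumes the usual statistic χ² = Σ (Oi − Ei)² / Ei with Ei = n·pi and df = k − 1, and obtains the P-value from the standard closed-form chi-square tail probability for integer degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_{k=1}^{(df-1)/2} x^(k-1) / (1*3*...*(2k-1))
    term, total = 1.0, 0.0
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= x / (2 * k - 1)
        total += term
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total)

def chi_square_gof(observed, probs):
    """Return (statistic, df) for a goodness-of-fit test."""
    n = sum(observed)
    expected = [n * p for p in probs]            # Ei = n * pi
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1

# Die example: 60 rolls, each face assumed to have probability 1/6.
observed = [7, 6, 11, 15, 13, 8]
probs = [1 / 6] * 6

stat, df = chi_square_gof(observed, probs)
p_value = chi2_sf(stat, df)
print(f"chi-square = {stat:.3f}, df = {df}, P-value = {p_value:.4f}")
```

The printed values should agree with the textbook's χ² = 6.400 and P-value = 0.2692 up to rounding.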
• Assuming Unequal Probabilities
This video demonstrates Your Turn problem #2 from Chapter 11.1. Here, the assumed probabilities in
the null hypothesis are not all equal. There are two parts to this video. In part 1, the data is in the
form of a frequency table (first table below). In part 2, the data is in standard format (second table below).
Part 1: Data in a Frequency Table
Your Turn 2, Hypothesized Distribution
Sample Distribution for 800 Blood Donors
Blood Type   Observed Frequency (Oi)   Assumed Probability (pi)
O+           310                       0.38
O−           71                        0.07
A+           235                       0.34
A−           64                        0.06
B+           68                        0.09
B−           12                        0.02
AB+          36                        0.03
AB−          4                         0.01
Part 2: Data in Standard Format
Rows = Cases and Columns = Variables
Donor #   Blood Type
1         O+
2         O+
3         A+
4         AB+
5         B−
6         A+
...       ...
800       B+
Results Calculated in Textbook:
χ² = 23.724, P-value = 0.00127
Reject the null hypothesis.
Conclusion: The regional distribution does not seem to fit the national distribution.
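When the assumed probabilities differ, each category has its own expected count Ei = n·pi. The short Python sketch below (an illustration, not part of the original handout; the variable names are arbitrary) lists the expected counts and each category's contribution (Oi − Ei)² / Ei for the blood donor data, which makes it easier to see which blood types drive the rejection.

```python
# Observed counts and assumed (national) probabilities from the Your Turn 2 table.
blood_types = ["O+", "O-", "A+", "A-", "B+", "B-", "AB+", "AB-"]
observed = [310, 71, 235, 64, 68, 12, 36, 4]
probs = [0.38, 0.07, 0.34, 0.06, 0.09, 0.02, 0.03, 0.01]

n = sum(observed)                                   # 800 donors
expected = [n * p for p in probs]                   # Ei = n * pi
contrib = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
stat = sum(contrib)                                 # chi-square statistic
df = len(observed) - 1

for bt, o, e, c in zip(blood_types, observed, expected, contrib):
    print(f"{bt:>3}  O = {o:3d}  E = {e:6.1f}  (O-E)^2/E = {c:.3f}")
print(f"chi-square = {stat:.3f}, df = {df}")

# Category with the largest contribution to the statistic.
largest = blood_types[contrib.index(max(contrib))]
print("largest contribution:", largest)
```

The unrounded statistic is about 23.725, which matches the textbook's 23.724 up to rounding in the last digit.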
Chapter 11.2 - Chi-Squared Test of Independence
This video demonstrates Example 1 from Chapter 11.2. Below is the distribution of grades by gender in a
class with 72 students, in the form of a contingency table. Test whether or not there is a significant
dependent relationship between grade and gender in this class. Use a 0.05 significance level.
Observed Frequencies - Contingency Table
          A    B    C    D    F
Male      8   10    6    9    9
Female    4    6    9    6    5
Results Calculated in Textbook:
χ² = 2.724, P-value = 0.605
Fail to reject the null hypothesis.
Conclusion: Grade and gender are not significantly dependent in this class.
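The test of independence uses expected counts Eij = (row total × column total) / n and df = (rows − 1)(columns − 1). The Python sketch below is an illustration, not part of the original handout; `chi_square_independence` and its `round_places` parameter are hypothetical names. With unrounded expected counts the statistic for this table comes out near 2.753, so software that keeps full precision would be expected to report a value close to that. The textbook's 2.724 is reproduced when the expected counts are first rounded to one decimal place; that rounding is an inference from the arithmetic, not something the handout states. Either way the conclusion is the same.

```python
import math

# Observed contingency table from Example 1 (columns are grades A, B, C, D, F).
observed = {
    "Male":   [8, 10, 6, 9, 9],
    "Female": [4, 6, 9, 6, 5],
}

def chi_square_independence(table, round_places=None):
    """Return (statistic, df); optionally round expected counts half-up first."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    n = sum(row_totals)
    stat = 0.0
    for row, rt in zip(rows, row_totals):
        for o, ct in zip(row, col_totals):
            e = rt * ct / n                       # expected count for this cell
            if round_places is not None:
                scale = 10 ** round_places
                e = math.floor(e * scale + 0.5) / scale   # round half up
            stat += (o - e) ** 2 / e
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df

exact, df = chi_square_independence(observed)
rounded, _ = chi_square_independence(observed, round_places=1)

# df = 4 is even, so the chi-square tail probability is exp(-x/2) * (1 + x/2).
p_exact = math.exp(-exact / 2) * (1 + exact / 2)
p_rounded = math.exp(-rounded / 2) * (1 + rounded / 2)

print(f"unrounded expected counts: chi-square = {exact:.3f}, P-value = {p_exact:.3f}")
print(f"expected counts to 1 d.p.: chi-square = {rounded:.3f}, P-value = {p_rounded:.3f}")
```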
There are two parts to this video. In part 1, the data is in the form of a frequency table (first table below). In
part 2, the data is in standard format (second table below).
Part 1: Data in a Frequency Table
Gender   Grade   Frequency
Male     A       8
Female   A       4
Male     B       10
Female   B       6
Male     C       6
Female   C       9
Male     D       9
Female   D       6
Male     F       9
Female   F       5
Part 2: Data in Standard Format
Rows = Cases and Columns = Variables
Student #   Gender   Grade
1           Male     B
2           Female   C
3           Female   A
4           Male     B
5           Female   F
6           Male     D
7           Male     F
8           Female   C
...         ...      ...
72          Male     B
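The two layouts carry the same information. Weighting cases by frequency tells SPSS to treat each row of the frequency table as that many identical cases. The Python sketch below mimics the idea by physically expanding the frequency table into one row per student and then tabulating it again. It is only a conceptual illustration (the variable names are arbitrary, and it does not describe how SPSS implements weighting); the expanded rows are grouped by category rather than listed in the student order shown in Part 2.

```python
from collections import Counter

# Frequency table from Part 1: (gender, grade, frequency).
freq_table = [
    ("Male", "A", 8), ("Female", "A", 4),
    ("Male", "B", 10), ("Female", "B", 6),
    ("Male", "C", 6), ("Female", "C", 9),
    ("Male", "D", 9), ("Female", "D", 6),
    ("Male", "F", 9), ("Female", "F", 5),
]

# Expand to standard format: one (gender, grade) row per student.
cases = [(gender, grade)
         for gender, grade, freq in freq_table
         for _ in range(freq)]

# Cross-tabulate the expanded cases to recover the contingency table.
crosstab = Counter(cases)

print("number of cases:", len(cases))
for gender in ("Male", "Female"):
    counts = [crosstab[(gender, grade)] for grade in "ABCDF"]
    print(f"{gender:<7}", counts)
```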
Chapter 11.3 - ANOVA
This video demonstrates the Over-Simplified Examples (Case 1 and Case 2) in Chapter 11.3. For each case,
there are three samples. For each case, test the claim that population means are not all equal.
Case 1: Similar Means
Sample 1   Sample 2   Sample 3
3          3          4
3          5          5
4          5          6
5          5          7
5          7          8
Results Calculated in Textbook:
F = 2.73, P-value = 0.106
Fail to reject the null hypothesis.
Conclusion: There is not enough evidence to conclude that the population means are not all equal.
Case 2: Disparate Means
Sample 1   Sample 2   Sample 3
3          3          8
3          5          9
4          5          10
5          5          11
5          7          12
Results Calculated in Textbook:
F = 28.18, P-value = 0.000029
Reject the null hypothesis.
Conclusion: There is sufficient evidence to conclude that the population means are not all equal.
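Both F statistics can be reproduced by hand from the sums of squares: F = (SS between / (k − 1)) / (SS within / (n − k)). The Python sketch below is an illustration, not part of the original handout; the function names are hypothetical. The P-value shortcut it uses, (1 + 2F / df within)^(−df within / 2), is valid only when the numerator degrees of freedom equal 2, which holds here because there are exactly three samples.

```python
def one_way_anova(samples):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(samples)
    n = sum(len(s) for s in samples)
    grand_mean = sum(sum(s) for s in samples) / n
    means = [sum(s) / len(s) for s in samples]
    # Variation of the sample means around the grand mean.
    ss_between = sum(len(s) * (m - grand_mean) ** 2
                     for s, m in zip(samples, means))
    # Variation of the scores around their own sample mean.
    ss_within = sum((x - m) ** 2
                    for s, m in zip(samples, means) for x in s)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

def f_tail_prob_numerator_df_2(f_stat, df_within):
    """P(F > f_stat); valid ONLY when the numerator df is 2 (three groups)."""
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)

case1 = [[3, 3, 4, 5, 5], [3, 5, 5, 5, 7], [4, 5, 6, 7, 8]]
case2 = [[3, 3, 4, 5, 5], [3, 5, 5, 5, 7], [8, 9, 10, 11, 12]]

results = {}
for name, samples in (("Case 1", case1), ("Case 2", case2)):
    f_stat, df_b, df_w = one_way_anova(samples)
    assert df_b == 2  # the shortcut below relies on this
    p_value = f_tail_prob_numerator_df_2(f_stat, df_w)
    results[name] = (f_stat, p_value)
    print(f"{name}: F({df_b}, {df_w}) = {f_stat:.2f}, P-value = {p_value:.6f}")
```

The within-sample variation is the same in both cases; only the spread of the sample means changes, which is why F grows so sharply in Case 2.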
Format the Data First: The data must be put into the standard rows = cases and columns = variables
format. Notice that each entry from each sample represents a different case, so we need to set up the data so that
there are 15 cases. Each case has a sample number and a score.
Case 1: Similar Means
Sample   Score
1        3
1        3
1        4
1        5
1        5
2        3
2        5
2        5
2        5
2        7
3        4
3        5
3        6
3        7
3        8
Case 2: Disparate Means
Sample   Score
1        3
1        3
1        4
1        5
1        5
2        3
2        5
2        5
2        5
2        7
3        8
3        9
3        10
3        11
3        12
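The reshaping from three side-by-side samples to the two-column layout above can be sketched in a few lines of Python. This is an illustration only, with arbitrary variable names: each score becomes its own case, tagged with the number of the sample it came from.

```python
# Case 1 data with one list of scores per sample (the side-by-side layout).
case1_wide = {
    1: [3, 3, 4, 5, 5],
    2: [3, 5, 5, 5, 7],
    3: [4, 5, 6, 7, 8],
}

# Standard format: one (sample, score) row per case.
long_rows = [(sample, score)
             for sample, scores in case1_wide.items()
             for score in scores]

print("Sample   Score")
for sample, score in long_rows:
    print(f"{sample:<8} {score}")
print("number of cases:", len(long_rows))
```

In SPSS terms, the first element of each row plays the role of the grouping variable (the factor) and the second is the dependent variable.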
Creating and Importing Data
• There are two ways to get data into SPSS.
– You can enter the data by typing it directly into the data editor.
– You can open an existing data file by selecting the File tab, then Open, then Data...
Then select the type of file from the list of options. If it is not already an SPSS (.sav) data file,
you will be prompted to answer some questions. For example, if you open an Excel file it may ask
which worksheet and whether or not the first row contains labels.
• Make sure your data is formatted as described below.
– Rows = Cases
Each row represents a case such as each respondent to a questionnaire.
– Columns = Variables
Each column represents a variable being tracked or measured. For example, the answers to a specific
question on a questionnaire define their own variable (column). As such, each row holds the values of
all variables for one individual case.
– Cells contain values
Each cell contains a single value of a variable for a case.
It is possible to enter data in the form of a frequency table, but then you must first weight the cases
by frequency (Data → Weight Cases...) before analyzing such data.
• Once you have the data opened in the data editor, click the Variable View tab at the bottom of the
data editor. In this view, each variable is now a row and you must make sure all your variables are
defined appropriately. The most important distinctions are
– Type : The most common types are
∗ Numeric: Used for quantitative data. These are numbers with no commas and a period
delimiting the decimal places. SPSS will not allow you to enter non-numeric characters into a
cell of numeric type.
∗ Date: Used for dates or times from a menu of formats.
∗ String: Used for qualitative data. Avoid symbols such as *, -, +, ?, etc.
– Measure : There are three levels of measurement.
∗ Scale is for ratio or interval levels of measurement.
∗ Ordinal is for ordinal or ranked data.
∗ Nominal is for qualitative data.
– Values : If you have numeric values representing qualitative data, such as 1 = male and 0 = female,
you will probably want these to be labelled accordingly in graphs and outputs. Click on the cell in
the Values column for that variable and assign labels for each value.
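Value labels are essentially a lookup table from stored codes to display text. The Python sketch below illustrates the idea using the 1 = male, 0 = female coding mentioned above; the list of codes is made up purely for illustration and does not come from any data set in this handout.

```python
from collections import Counter

# Value labels: stored numeric code -> label shown in output.
value_labels = {1: "male", 0: "female"}

# Hypothetical coded responses (made-up data for illustration only).
codes = [1, 0, 0, 1, 1, 0, 1]

# Tabulate using the labels instead of the raw codes.
counts = Counter(value_labels[c] for c in codes)
for label, count in counts.items():
    print(f"{label:<7} {count}")
```

The stored data stays numeric; only the way it is displayed changes.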