Chapter 11
Contingency Table Analysis
Nonparametric Systems
• Another method of examining the
relationship between independent (X) and
dependent (Y) variables is contingency
table analysis.
• Up to this point we have used parametric
statistics. These methods make a number
of assumptions about the way that the
population that serves as the basis for
your research sample is distributed.
Nonparametric Systems
• Correlation and regression assume that
the independent and dependent variables
are linearly related.
• Other assumptions behind the use of
parametric statistics include:
– Independent observations: the measurement or selection
of one case does not affect the measurement or
selection of another case
– The level of measurement for the variables is at least
interval in nature
Nonparametric Systems
• If these assumptions cannot be met,
the researcher has the option of
using nonparametric statistics.
• They are useful when the data are nominal
or ordinal, and they require no
assumptions about the population
parameters.
Nonparametric Systems
• Parametric statistics are usually preferred
because they are more powerful. The power of a
statistic is the probability of rejecting a false
null hypothesis. Accepting a false null
hypothesis (reaching the conclusion that there
are no differences between the sample and the
population when in fact there are) is a Type II
error.
• The greater the power of the statistic, the less
likely the researcher is to commit a Type II error.
Nonparametric Systems
• In this chapter we will consider one particular
type of nonparametric measure, the chi-square
(χ²), and some of the measures of association
that are used with nominal and ordinal data.
• Chi-squared is most appropriate when the data
are divided into mutually exclusive categories that
cannot be legitimately summed up: data at the
nominal or ordinal level.
• Chi-squared tells us whether the observed
distribution is significantly different from the one
that we would expect to occur by chance.
Constructing Contingency Tables
• A contingency table is a joint frequency
distribution- a frequency distribution with
two categorical variables.
• Again we are concerned with the
relationship between the independent and
dependent variables.
• The contingency table is also known as a
crosstabulation, because it counts the
cases that fall into each pairing of the
categories of the two variables.
Constructing Contingency Tables
• The cells contain those cases that fall into each
pairing of the variables: the number of cases
that fit the categories described by the cross
listing of the variables.
• The joint frequencies fall within the cells of the
table under the categories for the independent
and dependent variables. It is called a
contingency table because the cases contained
along the rows (the categories of the dependent
variable) are contingent upon what is contained
along the columns (the independent variable).
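A minimal sketch of how a crosstabulation counts cases, using only Python's standard library. The six cases below are hypothetical and are not taken from the National Crime Survey data set; the variable names are illustrative assumptions.

```python
from collections import Counter

# Hypothetical cases, each recorded as (independent X, dependent Y).
cases = [
    ("White", "Favor"), ("White", "Favor"), ("White", "Oppose"),
    ("Minority", "Favor"), ("Minority", "Oppose"), ("Minority", "Oppose"),
]

# Joint frequencies: one count per pairing of an X category with a Y category.
cells = Counter(cases)

columns = ["White", "Minority"]  # independent variable along the columns
rows = ["Favor", "Oppose"]       # dependent variable along the rows

# table[row][column] holds the frequency of that cell.
table = {r: {c: cells[(c, r)] for c in columns} for r in rows}

for r in rows:
    print(r, [table[r][c] for c in columns])
```

Each printed line is one row of the contingency table, with one cell per category of the independent variable.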
Constructing Contingency Tables
• Consider a crosstabulation of race and attitudes
toward capital punishment from the National
Crime Survey data set. Our research hypothesis
is that whites are more likely to favor capital
punishment than are minorities.
• The examination begins with a look at the
frequency distribution of a variable, including the
percentages within each category as it relates
to the entire group.
Constructing Contingency Tables
• Marginals are the total frequencies, so called
because they are presented at the margins of
the table.
• However, we are usually concerned with
subgroup analysis. We examine the breakdown
of frequencies and percentages of the
dependent variable as they are categorized
under the independent variable. As with
correlation, the assumption is that the
independent variable produces an effect on the
dependent variable.
Constructing Contingency Tables
• The table lists the independent and dependent
variables. The usual procedure is to construct the
table so that the independent variable is listed
along the columns and the dependent variable
follows the rows. We are interested in the
impact of the independent variable on the
dependent variable, so we read the table by
comparing the percentage values of the columns
(independent variable) for the subgroups under
the dependent variable (rows).
Constructing Contingency Tables
• The examination of a relationship typically
begins with a look at the frequency
distribution of a variable, including the
percentages within each category as it
relates to the entire group.
• The total frequencies are called marginals,
because they are found in the
margins of the table.
Constructing Contingency Tables
• With this method we are usually concerned
with subgroup analysis: the
breakdown of frequencies and
percentages of the dependent variable as
they are categorized under the
independent variable. As with correlation,
the assumption is that the independent
variable produces an effect on the
dependent variable.
Constructing Contingency Tables
• The usual procedure is to construct the
table so the independent variable is listed
along the columns and the dependent
variable is listed along the rows.
• We read the table by comparing the
percentage value of the column
(independent variable) for the subgroups
under the dependent variable (rows)
Rules for the Construction and
Interpretation of Tables
• 1. Divide the sample into categories based upon the
values of the independent variable.
• 2. The table should be fully labeled. The categories of
the independent and dependent variable should be
clearly presented. The variable headings should describe
what is contained in the table.
• 3. The independent variable follows the columns of the
table. The dependent variable follows the rows of the
table.
• 4. Each subgroup is described in terms of the categories
of the dependent variable.
• 5. To read the table, compare the percentages of the
independent variable subgroups in terms of the
percentages of the subgroups of the dependent variable.
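The rules above can be sketched as a column-percentage calculation. This is only an illustration under stated assumptions: the counts and the labels "Group A" and "Group B" are hypothetical, not figures from the chapter's tables.

```python
# Hypothetical cell counts, stored as observed[row][column]:
# rows are categories of the dependent variable,
# columns are categories of the independent variable.
observed = {
    "Favor":  {"Group A": 30, "Group B": 10},
    "Oppose": {"Group A": 10, "Group B": 30},
}
columns = ["Group A", "Group B"]

# Column marginals: total cases in each independent-variable subgroup.
col_totals = {c: sum(observed[r][c] for r in observed) for c in columns}

# Column percentages: each cell as a share of its own column total.
col_pct = {
    r: {c: 100 * observed[r][c] / col_totals[c] for c in columns}
    for r in observed
}

# Read across a row to compare the subgroups.
for r in observed:
    print(r, {c: round(col_pct[r][c], 1) for c in columns})
```

Percentaging within columns makes subgroups of different sizes comparable, which is why the table is read across each row.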
Chi-Square Test for Independent
Samples
• In statistical analysis, conclusions typically result
from a description of the findings. Inferential
statistics then allow us to make a decision about
the null hypothesis and whether this finding
would hold true if we had the data from the
entire population.
• In the previous example a statistical test is
needed to determine whether we can assume
that this difference in attitudes on capital
punishment between racial groups also exists in
the entire population.
Chi-Square Test for Independent
Samples
• The data are at the nominal (race) and ordinal
(support for capital punishment) level of
measurement.
• The groups and the choices fall into different
categories. We can use chi-squared to tell us the
probability that the frequencies we observed in
our survey results (observed frequencies) differ
from an expected (hypothesized) set of
frequencies.
• With chi-squared, these expected frequencies
represent what we could expect to occur by
chance.
Chi-Square Test for Independent
Samples
• Chi-squared is based upon the differences
between observed and expected frequencies. It
tells us the probability of obtaining differences
this large between the observed and expected
frequencies by chance.
• If the observed frequencies (the survey results)
differ from the expected frequencies enough to be
significant at the .05 level, then the null
hypothesis can be rejected.
• If they do not substantially differ, the difference
between the two sets of frequencies could be
due to sampling error.
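For reference, the standard Pearson chi-squared formula can be written as follows. The symbols O, E, R, C and N are notation introduced here and do not appear in the slides: O is an observed cell frequency, E the expected frequency for the same cell, R and C the row and column totals, and N the total number of cases.

```latex
\chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},
\qquad
E_{ij} = \frac{R_i \, C_j}{N}
```

The larger the gaps between observed and expected frequencies, the larger the statistic and the smaller the probability that the gaps arose by chance.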
Limitations of Chi-Squared
• 1. The sample must be randomly selected.
• 2. The observations must be independent: the
way in which one response is categorized does
not influence the way that another response is
listed. In our example, the opinion of one
respondent did not affect another in terms of
his/her attitude toward the death penalty.
• 3. Each cell must have an expected frequency of
no less than five.
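The third limitation can be checked before running the test. The sketch below is an illustration only: the helper name `min_expected` and the small table are hypothetical.

```python
def min_expected(observed):
    """Return the smallest expected frequency in a table of counts.

    `observed` is a list of rows; each expected frequency is
    (row total * column total) / total number of cases.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return min(rt * ct / n for rt in row_totals for ct in col_totals)


# Hypothetical small table: too few cases for chi-squared.
small = [[8, 2], [3, 1]]
lowest = min_expected(small)
print(round(lowest, 3), "meets the rule of five:", lowest >= 5)
```

A table that fails this check usually needs more cases or fewer, combined categories before chi-squared is used.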
Calculation of Chi-Squared
• Chi-squared is relatively easy to calculate
by hand with a calculator. In table 11.2 we
show how to calculate chi-squared by
hand in our example of the relationship
between race and attitude toward capital
punishment.
• Insert table 11.2
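The hand calculation can also be sketched in code. Table 11.2 is not reproduced in this section, so the counts below are hypothetical and the function name `chi_square` is an assumption made for illustration.

```python
def chi_square(observed):
    """Pearson chi-squared for a table given as a list of rows.

    Returns the statistic, the degrees of freedom, and the
    table of expected frequencies.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    stat = 0.0
    expected = []
    for i, row in enumerate(observed):
        exp_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected by chance
            exp_row.append(e)
            stat += (o - e) ** 2 / e               # this cell's contribution
        expected.append(exp_row)

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df, expected


# Hypothetical 2 x 2 table (rows: dependent, columns: independent).
observed = [[30, 10], [10, 30]]
stat, df, expected = chi_square(observed)
print(stat, df, expected)
```

Here every expected frequency is 20, each cell contributes (10 squared) / 20 = 5, and the four cells sum to a chi-squared of 20 with one degree of freedom.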
Calculation of Chi-Squared using
SPSS
• To calculate chi-squared using SPSS, take
the following steps.
• 1. On the Menu bar, click on “Analyze”
• 2. On the drop down menu, click on
“Descriptive Statistics”
• 3. On the next menu, click on “Crosstabs”.
These steps are shown in Figure 11.1.
Calculation of Chi-Squared using
SPSS
• 4. In the Crosstabs menu, the variables are
listed in the left-hand window. Highlight “Favor:
Death Penalty for Murderers” and paste it into
the “Row” window by clicking on the arrow
button. This is your dependent variable (Y): the
respondent's attitude toward capital punishment.
Remember that Y is always the row variable in a
contingency table.
• This is shown in Figure 11.2.
Calculation of Chi-Squared using
SPSS
• 5. In the same window, highlight the
independent variable (X), “Race Recode”, and
paste it into the “Columns” window by clicking
on the arrow button.
• 6. In the “Crosstabs” window, click on the
“Statistics” button. The “Crosstabs: Statistics”
menu then appears. Click on the box next to
“Chi-square” to include a checkmark. Then click
on the “Continue” button.
• This is shown in Figure 11.3.
Calculation of Chi-Squared using
SPSS
• 7. When you return to the “Crosstabs” window,
click the “Cells” button. The “Crosstabs: Cell
Display” window appears. In this window, in the
“Counts” section, click the box next to
“Observed” to make a checkmark. Then in the
“Percentages” section, click on the box next to
“Column”. Now your contingency table will give
you the observed frequencies for each cell. The
table will contain the percentages for the
independent variable.
• This is shown in Figure 11.4.
Calculation of Chi-Squared using
SPSS
• 8. Click on the “Continue” button. You
then return to the “Crosstabs” window.
Click on “OK” to generate your
contingency table and the chi-squared
statistic.
• This is shown in figure 11.5
Figure 11.5
• The Crosstabs printout contains the contingency
table and statistics. The first table tells us the
number of cases in the sample that had valid
information for these variables. The second table
mirrors our Table 11.1.
• Our conclusion is that a higher percentage of
whites favor the death penalty in murder cases,
by a difference of 30 percentage points (whites,
78.5 percent; minorities, 48.5 percent).
SPSS Output
• What we still need is a decision as to whether
this conclusion would be true if we had data
from the entire U.S. population. This is where
chi-squared, as an inferential statistic, comes in.
• One major limitation of chi-squared is that no
cell can have an expected frequency of less than
five. In our case the lowest expected frequency
is 17.46, so we can assume that our chi-squared
statistic is valid.
SPSS Output
• In the chi-squared tests table, we see that
with two degrees of freedom the Pearson
Chi-Squared value of 86.304 has a reported
significance of .000 (that is, p < .001).
• Because this is less than .05, we reject
the null hypothesis. Our research
conclusion is statistically significant.
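A short sketch of what the reported significance means. It relies on a special case: with exactly two degrees of freedom, the probability of a chi-squared value at least as large as x is exp(-x / 2). The variable names are illustrative, and the shortcut does not apply to other degrees of freedom.

```python
import math

chi2_value = 86.304  # Pearson chi-squared reported in the output
df = 2               # degrees of freedom reported in the output
alpha = 0.05

# Upper-tail probability; this closed form is valid only for df = 2.
p_value = math.exp(-chi2_value / 2)

# Critical value at the .05 level, again only for df = 2.
critical_05 = -2 * math.log(alpha)

print(f"critical value at .05: {critical_05:.3f}")
print(f"p-value: {p_value:.2e}")
print("reject the null hypothesis:", p_value < alpha)
```

The p-value is far smaller than three decimal places can show, which is why the printout displays it as .000.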
Measures of Association with Chi-Squared
• Another aspect of chi-squared analysis involves
measures of association. These measures
indicate the strength of a relationship between
the independent and dependent variable.
• The measures of association available under
SPSS Studentware are listed in the “Crosstabs:
Statistics” screen. The following measures are
listed under the “Nominal” section.
Cramer’s V
• It is useful with nominal data.
• It is probably the most popular of
the three measures we have discussed
because it has a lower limit of 0 (no
relationship) and an upper limit of 1
(perfect relationship). Unlike C and Phi,
there is no need to do further calculations
to determine the upper limit of Cramer’s V.
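A minimal sketch of the usual formula for Cramer's V: the square root of chi-squared divided by N times the smaller of (rows - 1) and (columns - 1). The function name and the input values are hypothetical; the chapter's sample size is not given in this section, so its V is not computed here.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V from a chi-squared value, sample size and table shape."""
    k = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi2 / (n * k))


# Hypothetical values: chi-squared of 20 from 80 cases in a 2 x 2 table.
v = cramers_v(20.0, 80, 2, 2)
print(v)
```

Dividing by N and by the table dimension is what keeps the result between 0 and 1 regardless of sample size or table shape.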
Introducing a Third Variable
• We are introducing a third variable as a
second independent or control variable.
• We reexamine the relationship between
the original two variables (X and Y) within
each of the categories of the control
variable and then compare the results
across the categories of the control
variable.
Introducing a Third Variable
• Returning to our examination of the forces
that influence attitudes toward capital
punishment, another key independent
variable is sex.
• Table 11.5 shows this reexamination.
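Controlling for a third variable can be sketched as building one partial table per category of the control variable. The records below are hypothetical and do not reproduce Table 11.5; the labels "A" and "B" stand in for categories of the independent variable.

```python
from collections import Counter, defaultdict

# Hypothetical records: (control variable, independent X, dependent Y).
records = (
    [("Male", "A", "Favor")] * 3 + [("Male", "A", "Oppose")] * 1
    + [("Male", "B", "Favor")] * 1 + [("Male", "B", "Oppose")] * 1
    + [("Female", "A", "Favor")] * 1 + [("Female", "A", "Oppose")] * 1
    + [("Female", "B", "Favor")] * 1 + [("Female", "B", "Oppose")] * 3
)

# One partial table of joint frequencies per control category.
partial = defaultdict(Counter)
for control, x, y in records:
    partial[control][(x, y)] += 1


def percent_favor(cells, x):
    """Column percentage favoring, within one partial table."""
    favor = cells[(x, "Favor")]
    total = favor + cells[(x, "Oppose")]
    return 100 * favor / total


# Compare the X-Y relationship across categories of the control variable.
for control, cells in partial.items():
    for x in ("A", "B"):
        print(control, x, percent_favor(cells, x))
```

If the gap between the X subgroups is similar in every partial table, the original relationship holds up under the control; if it shrinks or disappears, the control variable accounts for some of it.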
Conclusion
• Categorical data measured at the nominal and ordinal
level are very common in criminal justice research.
• A contingency table is an excellent method to summarize
and highlight research findings. Conclusions are drawn
from the table and its results.
• Chi-Squared and its accompanying measures of
association provide a method to determine statistical
significance.
• In order to address complex problems such as crime,
multivariate analysis must be conducted.
• Usually there is more than one contributing factor to
social problems such as crime.