Log-Linear Analysis

I. Log-Linear Analysis or Likelihood Ratio Chi Square
     A. Steps in Log-Linear Analysis
     B. Warnings About Log-Linear Analysis
II. SPSS
Alternatives to Chi Square for Frequency Data
There are times when researchers need alternatives to chi square analyses.
Log-Linear Analysis or Likelihood Ratio Chi Square
Also known as the G test, this method permits analysis of nominal level dependent variables in ways that are similar to analysis of variance for interval or ratio level dependent variables. In fact, Sokal and Rohlf (1981) recommend that it be used as a wholesale replacement for the chi square test (p. 692). In the case of 2 x 2 tables, it is a simple alternative to chi square that provides much of the same information. Its chief advantage, however, emerges when researchers can examine more than two independent variables and look for interactions on some nominal level dependent variable. The researcher may examine different main effects and then add effects together to make meaningful interpretations.
There are few assumptions in the use of this method. First, it is assumed that the data are frequencies from a multinomial distribution, rather than a normal distribution. This requirement is usually handled as a result of identifying classification categories whose probabilities sum to 1.0 within the range of the study. A second assumption is randomization. Beyond these two, log-linear analysis imposes few other serious requirements.
The approach of log-linear analysis is a bit different from that of other statistical tools. Though most analyses explore whether a collection of variables makes a contribution significantly different from zero, log-linear analysis attempts to fit a collection of models to the data. In these cases, the researchers (in most cases) are interested in finding that there are no significant differences between the model and the data. In addition, the analysis usually produces more than one model that shows no difference from the data. Hence, researchers need to know how to compare one model to another.
The language of log-linear analysis sometimes puts off researchers, but the concepts themselves are relatively straightforward. Three terms often are used without explanation in the method.
•  Hierarchical models are simply models that include all lower level effects as well. For instance, if a researcher examines a hierarchical model of the interaction among age, communicator competence, and amount of training in communication, the hierarchical model also includes the possible two-variable interactions (age by competence, age by amount of training in communication, and competence by amount of training in communication). The model also includes contributions made by the main effects of age, communicator competence, and amount of training in communication. Though this type of model may seem initially confusing, its approach actually permits researchers to partial out the sources of variance in a fairly direct way.
•  Saturated models include all possible effects within them. When you do not know the contours of a model, you start by assuming a saturated model. Then, the reduced elements may be separated out for subsequent analysis.
•  Restricted models refer to models that involve highly limited numbers of parameters. For instance, a model that includes only one variable is the most restrictive. Models that include all possible effects among multiple variables (a.k.a. saturated models) are called "least restrictive" models.
To test if the model fails to deviate from the data, the -2 Log Likelihood statistic (sometimes called the deviance) is used. If this statistic shows statistical significance, it means that the model does not fit the data. Hence, researchers typically are looking for statistically insignificant results. Comparing the observed and the expected frequencies, researchers use applications of the formula

$G^2 = 2 \sum_{j=1}^{k} O_j \ln\!\left(\frac{O_j}{E_j}\right)$

(Upton & Cook, 2002, p. 201).
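To make the formula concrete, here is a minimal Python sketch (an illustration, not part of the chapter's SPSS workflow) that computes $G^2$ from observed and expected cell counts:

    import math

    def g_squared(observed, expected):
        """Likelihood ratio chi square (G^2) from paired observed and
        expected cell counts; cells with zero observations contribute 0."""
        return 2.0 * sum(o * math.log(o / e)
                         for o, e in zip(observed, expected) if o > 0)

    # Hypothetical 2 x 2 table flattened row by row, with expected counts
    # computed under independence of rows and columns
    print(round(g_squared([30, 10, 20, 40], [20, 20, 30, 30]), 3))  # 17.261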
Steps in Log Linear Analysis. In essence, log-linear analysis involves identifying a model to fit the data. Then, the researchers attempt to identify the specific parameters. Though these statistics can be computed by hand, the author has never met a researcher who does so. Hence, in this discussion, SPSS output will (for the most part) be used to illustrate the process.
Step 1: Choose a method to select models. When there are three or more variables in the log linear model, there are many effects that could be tested and included in a final model: three variables produce seven effects, four variables produce 15 effects, and five variables produce 31 effects (in general, k variables produce $2^k - 1$ possible effects). In fact, researchers often find that there is more than one model that could be fit to the data. Thus, there are different suggestions about ways to complete log linear analysis studies. The most popular approach, sometimes called a two-stage analysis, involves two passes through the data. The second major approach involves the use of "stepwise backward elimination" of statistically insignificant effects. The first of these approaches will be described here. Though stepwise methods often are criticized for capitalizing on chance findings, they remain popular; the stepwise approach will be illustrated in the section of this chapter that describes using SPSS to complete log-linear analyses.
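To see where those effect counts come from, a minimal Python sketch (an illustration only; the factor names anticipate the example below) enumerates every possible effect for a set of factors:

    from itertools import combinations

    def possible_effects(factors):
        """Every main effect and interaction for the named factors."""
        effects = []
        for order in range(1, len(factors) + 1):
            effects.extend(combinations(factors, order))
        return effects  # 2**k - 1 effects for k factors

    print(len(possible_effects(["EVID", "SEX", "COMMAPP", "CREDIB"])))  # 15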
The two-stage model follows the suggestions of Brown (1976). In this approach, highest
order interactions are tested for significance of their contributions to the model (in this case the
researcher is looking for a statistically significant effect). After finding significant interactions,
the researcher looks for partial and marginal associations and retains only those for which the
partial and marginal associations are significant. Though actually a separate approach, this
method sometimes is combined with the method of backward elimination of effects. Then, to
obtain general parameter information, the general log-linear analysis is employed.

For example, one researcher prepared an analysis of whether individuals were willing to sign up for a seminar in presentation skills (or not) based on whether they were men or women with high or low communication apprehension, and whether they received persuasive information in flyers containing (or not) evidence from highly and lowly credible sources. To develop the model, the researchers needed to follow two steps: first, they needed to determine a model and, second, they needed to get parameter estimates. To secure a fit to the model, the researchers chose Loglinear from the Analyze menu. Then, they chose Model Selection. They transferred evidence use ("evid"), sex of respondent, communication apprehension level (low and high), and source credibility (high and low credibility of the source of the message) to the field identified as "Factor(s)." In the "Cell Weights:" field, the variable that constituted the dependent variable, the count of the number of participants willing to sign up for a seminar in presentation skills (or not), was included. To select a model, the researcher clicked on the Model button and assured that the saturated model was identified. Then, the researcher clicked Continue and selected Options. In the Options window the researchers selected "Association table" to get a display of partial associations. There was no need to include residuals since, with a completely saturated model, they would all be zero. The Delta value was set at 0 given the sample sizes involved. Then the researcher clicked Continue and OK.
Step 2: Check to see if the goodness of fit statistics reveal a fit of the data to the model. The
researchers were looking for coefficients that indicated no significant differences between the
observed data and the model. Both the Likelihood ratio chi square ($G^2$) and the Pearson $\chi^2$ were
not statistically significant at the .05 level. With small samples, the Pearson chi square is more powerful than the Likelihood ratio chi square. In this case, the two statistics were zero since a saturated model was examined.

    Goodness-of-fit test statistics

      Likelihood ratio chi square = .00000   DF = 0   P = 1.000
      Pearson chi square          = .00000   DF = 0   P = 1.000
Step 3: Examine global tests to identify differences between the saturated model and models without various interaction and main effects. Researchers look at assessments of classes of effects. The impact of removing these effects is shown in the resulting likelihood ratio chi square and the Pearson chi square. If the effects of each of these sources of variation were zero, the coefficients reveal how poorly the model would fit the data. In this case the K = 4 effect corresponded to the single four-variable interaction of all the predictor variables. The K = 3 effect indicated the three-variable interactions and all higher interactions, including the previous four-variable interaction. As can be seen, only the three-variable interaction effects and above were not significant contributors.
    Tests that K-way and higher order effects are zero.

     K    DF   L.R. Chisq   Prob    Pearson Chisq   Prob    Iteration
     4     1       .063     .8016        .063       .8015       2
     3     5      6.553     .2561       6.502       .2603       3
     2    11     28.360     .0029      29.601       .0018       2
     1    15     31.445     .0077      33.189       .0044       0
To assess whether individual effects contributed to the model’s good fit, a comparison was
made to what the Pearson chi square and the Likelihood ratio statistic would be if each category
of effect were omitted one at a time. To identify a good fit of the model to the data, researchers
looked for significant differences (any significant difference means that without this source of
variation, the model would deviate significantly from the data). Hence, these elements were
important ones to retain. In this case, the results showed that only without the six two-variable interactions would the model fail. Hence, a model composed of these types of interactions explains the data. Nevertheless, which two-variable interactions explain the effects?

    Tests that K-way effects are zero.

     K    DF   L.R. Chisq   Prob    Pearson Chisq   Prob    Iteration
     1     4      3.085     .5437       3.588       .4646       0
     2     6     21.807     .0013      23.099       .0008       0
     3     4      6.490     .1654       6.439       .1687       0
     4     1       .063     .8016        .063       .8015       0
Step 4: Check the association table to identify the specific effects defining the model's fit. At this point, since many effects have been tested, researchers usually desire to reduce the number of specific effects by looking at specific sources. In the example we are describing, individual tests will remain at alpha = .05, though an argument for dropping them to alpha = .01 would be reasonable and will be made shortly in another, related context. The association table presents the partial likelihood ratio chi square statistic. It considers the hierarchical model and, for each source of variation, computes the effects of the same order minus the contribution of the specific element. So, for the two-variable interaction between evidence and sex, it computed the effects of the other two-variable interactions and the contribution of each of the main effects (since a hierarchical design includes all lower level effects) minus the contribution of the specific evidence x sex interaction. Since the global contribution of the three-way and four-way interactions was not found statistically significant, it becomes clear that significant two-variable interactions exist at .05 for evidence in the message and participant's level of communication apprehension, respondent sex and level of communication apprehension, and evidence in the message and credibility of the source of the message. The researcher would need to include these three interactions and the main effects, since this analysis is hierarchical and lower level effects are a subset of those observed.
    * * * * * * * *  H I E R A R C H I C A L   L O G   L I N E A R  * * * * * * * *

    Tests of PARTIAL associations.

     Effect Name            DF   Partial Chisq   Prob    Iter
     EVID*SEX*COMMAPP        1       5.544       .0185     3
     EVID*SEX*CREDIB         1        .168       .6815     3
     EVID*COMMAPP*CREDIB     1        .007       .9352     2
     SEX*COMMAPP*CREDIB      1        .091       .7628     2
     EVID*SEX                1        .033       .8553     2
     EVID*COMMAPP            1       7.072       .0078     3
     SEX*COMMAPP             1       5.397       .0202     3
     EVID*CREDIB             1       6.434       .0112     3
     SEX*CREDIB              1       1.194       .2745     3
     COMMAPP*CREDIB          1        .156       .6930     3
     EVID                    1        .676       .4109     2
     SEX                     1        .243       .6218     2
     COMMAPP                 1        .433       .5107     2
     CREDIB                  1       1.733       .1880     2
Step 5: Obtain the parameters by comparing the effects of viable models. To do so, the researcher may use a different routine in SPSS. In this case, the researcher selected Analyze followed by Loglinear. Selecting General from the menu options, the researcher moved the four variables into the Factor(s): box and "freq" into the Cell Structure: box. Following selection of this structure, the model must be specified.

By clicking on Model, the researcher opened the General Loglinear Analysis: Model dialog box and clicked on the Custom radio button. The researcher entered the three-variable interaction that was found significant in the table of tests of partial associations. Some might exclude the three-variable interaction since the global test of all three-way interactions was not statistically significant at .05, but such interactions may warrant "further scrutiny" because in global tests large individual effects sometimes become overwhelmed by a host of small statistically insignificant effects (Stevens, 2002, p. 587). Clicking on the Continue button and then the Options… button, the researcher selected displays of the design matrix, model estimates, and deviance residuals. The delta was set at 0.
After clicking Continue and OK, the researchers interpreted the results that included the
following elements:
    Goodness-of-fit Statistics

                          Chi-Square   DF    Sig.
     Likelihood Ratio        9.981      8   .2664
     Pearson                 9.7082     8   .2861
As can be seen, the goodness of fit test indicated that the model was not significantly different from the data. The researchers noted the parameters associated with effects to determine the direction of the effects. Also useful were plots of residuals. For custom models, the researcher also examined various plots, including the very helpful Normal Q-Q plot of adjusted residuals. This chart takes the residuals and represents them as if they were units under the standard normal curve. Then, the residuals are plotted against the expected normal values. If the residuals form a generally normal distribution, as they do in this case, the points will tend to fall near a straight line. Overall, the emergent model was a relatively good fit.

[Figure: Normal Q-Q Plot of Adjusted Residuals. Adjusted residuals (horizontal axis, roughly -3 to 3) are plotted against expected normal values (vertical axis, roughly -2.0 to 2.0); the points fall close to a straight line.]
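To see what such a plot encodes, the following Python sketch (illustrative only; the residual values are hypothetical) pairs sorted residuals with the normal quantiles against which they are plotted:

    from statistics import NormalDist

    def qq_points(residuals):
        """Pair each sorted residual with the value expected at its rank
        under the standard normal curve; a near-normal batch of residuals
        yields pairs that fall close to a straight line."""
        n = len(residuals)
        expected = [NormalDist().inv_cdf((rank - 0.5) / n) for rank in range(1, n + 1)]
        return list(zip(sorted(residuals), expected))

    # Hypothetical adjusted residuals, for illustration only
    for observed, expected in qq_points([-1.2, 0.3, 0.8, -0.4, 1.5, -0.9]):
        print(f"{observed:6.2f}  vs  {expected:6.2f}")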
Step 6: Complete Multiple Comparison Tests if Desired. In many cases, researchers also need to test for comparisons among specific cells. Unlike simple chi square analyses, there is a way to complete multiple comparisons with log linear analysis. Thus, akin to the multiple comparisons described in chapter 8, the researcher may explore contrasts using Goodman's $\hat{L}$ (1970). The chief formula is

$z = \frac{\hat{L}}{s_{\hat{L}}}$.

Obviously, this formula requires computing two other terms. The first of these elements is

$\hat{L} = c_1 \ln o_1 + c_2 \ln o_2 + \ldots + c_k \ln o_k$,

where $c_1$ is a contrast coefficient (for instance, as described in chapter 8, a contrast between two means could be computed as $\psi$ ["psi"] $= (1)\mu_1 + (-1)\mu_2$) and $\ln o_1$ is the natural log of the number of observations in the first condition to be contrasted, and so forth through all k conditions. The second element to be computed is the standard error of the contrast:

$s_{\hat{L}} = \sqrt{\sum \frac{c_k^2}{o_k}}$.

Then, the coefficient is compared to the values on the z table¹ to see if differences are beyond chance expectations at the given alpha risk.
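To make these formulas concrete, here is a minimal Python sketch (an illustration; goodman_contrast is a hypothetical helper name, not an SPSS routine) that computes $\hat{L}$, its standard error, and z:

    import math

    def goodman_contrast(coefficients, counts):
        """Goodman's contrast over log frequencies: returns
        (L-hat, its standard error, and the z statistic)."""
        l_hat = sum(c * math.log(o) for c, o in zip(coefficients, counts))
        std_error = math.sqrt(sum(c ** 2 / o for c, o in zip(coefficients, counts)))
        return l_hat, std_error, l_hat / std_error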
As an example, the researcher noted a significant interaction between evidence use and
credibility. By examining the portion of output below, it seemed that when evidence was not
used in the message (evidence value of 1), having a highly credible source enhanced
respondents’ willingness to sign up for a course in presentation skills, just as was the case when a
message with evidence was presented by a source who was not highly credible. These conditions
were in contrast to situations in which lowly credible sources used no evidence and highly
credible sources used evidence. It seemed that evidence benefited most those whose credibility
had not already “topped out” in persuasiveness. When the eight conditions that produced the
highest effects were contrasted with the remaining conditions, the contrast took the following form:

$\hat{L} = \frac{1}{8}(\ln 11 + \ln 7 + \ln 5 + \ln 15 + \ln 15 + \ln 11 + \ln 12 + \ln 8) - \frac{1}{8}(\ln 9 + \ln 8 + \ln 4 + \ln 15 + \ln 6 + \ln 4 + \ln 8 + \ln 5)$.

Filling in the natural logs and multiplying by the contrast coefficients, one gets the answer: $\hat{L} = .3867$.
    Table Information

    Factor               Value   Observed Count (%)   Expected Count (%)
    EVID                   1
      SEX                  1
        COMMAPP            1
          CREDIB           1        9.00 ( 6.29)         8.26 ( 5.78)
          CREDIB           2       11.00 ( 7.69)         8.72 ( 6.10)
        COMMAPP            2
          CREDIB           1        8.00 ( 5.59)         9.00 ( 6.29)
          CREDIB           2        7.00 ( 4.90)         9.50 ( 6.64)
      SEX                  2
        COMMAPP            1
          CREDIB           1        4.00 ( 2.80)         5.84 ( 4.09)
          CREDIB           2        5.00 ( 3.50)         6.17 ( 4.31)
        COMMAPP            2
          CREDIB           1       15.00 (10.49)        12.90 ( 9.02)
          CREDIB           2       15.00 (10.49)        13.61 ( 9.52)
    EVID                   2
      SEX                  1
        COMMAPP            1
          CREDIB           1       15.00 (10.49)        16.01 (11.20)
          CREDIB           2        6.00 ( 4.20)         8.00 ( 5.60)
        COMMAPP            2
          CREDIB           1       11.00 ( 7.69)         7.67 ( 5.36)
          CREDIB           2        4.00 ( 2.80)         3.84 ( 2.68)
      SEX                  2
        COMMAPP            1
          CREDIB           1       12.00 ( 8.39)        11.32 ( 7.92)
          CREDIB           2        8.00 ( 5.59)         5.66 ( 3.96)
        COMMAPP            2
          CREDIB           1        8.00 ( 5.59)        11.00 ( 7.69)
          CREDIB           2        5.00 ( 3.50)         5.50 ( 3.84)
The standard error of the contrast that must be divided into this amount was

$s_{\hat{L}} = \sqrt{\sum \frac{c_k^2}{o_k}}$,

which is

$s_{\hat{L}} = \sqrt{\frac{(1/8)^2}{11} + \frac{(1/8)^2}{7} + \frac{(1/8)^2}{5} + \frac{(1/8)^2}{15} + \frac{(1/8)^2}{15} + \frac{(1/8)^2}{11} + \frac{(1/8)^2}{12} + \frac{(1/8)^2}{8} + \frac{(-1/8)^2}{9} + \frac{(-1/8)^2}{8} + \frac{(-1/8)^2}{4} + \frac{(-1/8)^2}{15} + \frac{(-1/8)^2}{6} + \frac{(-1/8)^2}{4} + \frac{(-1/8)^2}{8} + \frac{(-1/8)^2}{5}}$.

The result of such computations was .1837. When $\hat{L}$ was divided by its standard error, the result was

$\frac{.3867}{.1837} = 2.105$.

This statistic is distributed as z, or as a unit under the standard normal curve. By looking at the table of z, one finds that a z of 2.11 corresponds to .4826, which means that only .0174, or 1.74%, of the area lies above this point. Assuming that the researcher was using a two-tailed test at an alpha risk of .05, the critical value to exceed would be 1.96 in either direction. In this case, 2.11 exceeds 1.96 and, hence, the difference between the set of conditions that explains the interaction was statistically significant.
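Using the hypothetical goodman_contrast helper sketched earlier, the worked example can be verified directly (cell counts taken from the table above):

    # Eight conditions with the highest effects (+1/8) against the other eight (-1/8)
    coefficients = [1/8] * 8 + [-1/8] * 8
    counts = [11, 7, 5, 15, 15, 11, 12, 8, 9, 8, 4, 15, 6, 4, 8, 5]

    l_hat, std_error, z = goodman_contrast(coefficients, counts)
    print(round(l_hat, 4), round(std_error, 4), round(z, 3))  # 0.3867 0.1837 2.105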
One might wonder if it would be acceptable to “collapse” the cells in this design from their
sixteen original conditions to eight or four basic conditions. Though sometimes researchers have
used this method, it appears to be suitable only if higher order interactions involving variables in
the collapsed categories were not statistically significant and if all the variables in the
interactions are independent of other variables. Research indicates that inappropriately collapsing levels across a third variable often leads to misleading results (see Bishop, Fienberg, & Holland, 1975, pp. 41-42).
Warnings About Log-linear Analysis. Though its usefulness is undeniable, log linear analysis can create several problems that prudent researchers should take into account:
•  Overfitting models can be a problem. The method of log linear analysis can produce models that seem to fit simply because they include too many parameters. Indeed, with all the effects included in a saturated model, the fit is perfect, but it is just an artifact. Some writers have warned researchers to be particularly suspicious about overfitting (Bishop, Fienberg, & Holland, 1975, p. 324), especially when the chi square value is actually smaller than the degrees of freedom (Marascuilo & Busk, 1987, p. 452). In fact, when more than one model seems to produce a chi square suggesting a fit to the data, one bit of advice is to select the model whose chi square value is closest to the degrees of freedom (Marascuilo & Busk, 1987, p. 452; see also Stevens, 2002, p. 590).
•  There is no best method for model selection (see Fienberg, 1980, p. 56; see also Stevens, 2002, p. 590). The researcher uses statistics to make an argument (based on high quality circumstantial evidence) for the tenability of a model, but the methods themselves do not automatically produce the best model. Researchers should be guided by some hypotheses and/or theories to promote their efforts. The two-stage method of model development and the use of stepwise backward elimination cannot be shown superior to each other in identifying the best models. Ultimately, deciding on the best model requires the researcher to do some sound thinking based on good reasons to prefer one model over another (after Freeman, 1987, p. 214).
•  Type I error can get out of hand through thoughtless use of log linear analysis. Especially when using the stepwise method with backward elimination, researchers test a series of effects, not because they have predictions or hypotheses, but because the method goes on the equivalent of a "search and destroy" mission, snooping for all significant effects. Rather than having a theoretic foundation for testing an effect, many tests are completed only because some higher order effects have been found to be statistically insignificant. "Unless one can argue that the tests are all of interest a priori, one needs to assess the overall probability of making a type I error" (Sokal & Rohlf, 1981, p. 762). One response is to follow the advice developed by Whittaker and Aitkin (1978) to control alpha risk for individual tests. If each test is announced by the researchers at an alpha risk of .05, one may determine the effect of computing three nonindependent significance tests by computing total experimentwise (or "studywise") alpha risk as

$\alpha_{\text{experimentwise}} = \alpha_{\text{first test}} + (1 - \alpha_{\text{first test}})\alpha_{\text{second test}} + (1 - \alpha_{\text{first test}})(1 - \alpha_{\text{second test}})\alpha_{\text{third test}}$.²

To bring this alpha risk back under control, with experimentwise alpha risk at .05, the researcher could set the individual alpha risk at

$\alpha_{\text{individual}} = 1 - \sqrt[\text{number of statistical tests}]{1 - \alpha_{\text{experimentwise}}}$.

So, if a researcher wants to limit alpha risk to .05 for a study with three such tests, individual alpha should be set at .017.
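A short Python sketch (illustrative only) confirms both calculations:

    def experimentwise_alpha(alphas):
        """Total type I error risk accumulated over a series of tests."""
        miss_all = 1.0
        for alpha in alphas:
            miss_all *= (1.0 - alpha)
        return 1.0 - miss_all

    def individual_alpha(experimentwise, n_tests):
        """Per-test alpha that holds total risk at the experimentwise level."""
        return 1.0 - (1.0 - experimentwise) ** (1.0 / n_tests)

    print(round(experimentwise_alpha([.05, .05, .05]), 4))  # 0.1426
    print(round(individual_alpha(.05, 3), 3))               # 0.017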
Special Discussion: Logit Analysis

A variation of log-linear analysis is the use of LOGIT models (also sometimes called multinomial logit models) in which two or more independent variables (sometimes called explanatory variables) predict dependent variables (sometimes called response variables). The dependent variable actually analyzed is a logit. "Logit" is short for "logistic probability unit," or the natural "log of the odds" (Vogt, 1999, p. 164). The expression

$\ln\!\left(\frac{p}{1 - p}\right)$

is the actual formula for the "log odds" (Upton & Cook, 2002, p. 207). In essence, the method reports a coefficient indicating the probability of events expressed as a number ranging from 0 to 1.
Independent variables can be factors (including categories created from continuous variables broken into separate levels). In addition, the method may include covariates used to adjust cells by the weighted covariate mean. When using SPSS, two major tests of fit are employed: the likelihood ratio and the Pearson chi square (simply labeled "Pearson" in the SPSS output). Other analyses of dispersion are used as well, including measures of "entropy" and "concentration." Entropy (also known as the diversity index) is a measure of "the extent to which the different types in a population are unequally common. A value of 0 is attained if there is only one type in the population" (Upton & Cook, 2002, p. 110). Thus researchers most often look for models that distinguish among the categories of the dependent variable. Researchers also examine measures of concentration, such as the Gini index (often called the coefficient of concentration). This measure is computed by taking "the mean difference between all pairs of values and dividing that by twice the population mean" (Vogt, 1999, p. 123). The coefficient produces a value from 0 to 1, with increased coefficients indicating increased dispersion.
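To make the transformation concrete, a minimal Python sketch (illustrative only) converts a probability to a logit and back:

    import math

    def logit(p):
        """Natural log of the odds for probability p (0 < p < 1)."""
        return math.log(p / (1.0 - p))

    def inverse_logit(value):
        """Map a logit back to a probability between 0 and 1."""
        return 1.0 / (1.0 + math.exp(-value))

    print(round(logit(0.75), 3))           # 1.099 (odds of 3 to 1)
    print(round(inverse_logit(1.099), 2))  # 0.75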
•  Large samples can lead researchers to make some misinterpretations. As with most statistical significance tests, when samples get large almost any effect will be identified as statistically significant in the partial associations list. This problem, however, is particularly pronounced in log linear analysis. "Overrestricted models tend to be selected in very small samples and underrestricted models tend to be selected in very large samples" (Bonett & Bentler, 1983, p. 156). An alternative is the use of Goodman's (1970) improvement statistic, which reveals how much of a percentage improvement is created in the fit of any model over the base model. This statistic takes the goodness of fit chi square values and makes a comparison using the formula

$\frac{\chi^2_{\text{base model}} - \chi^2_{\text{advanced model}}}{\chi^2_{\text{base model}}}$.

The closer this statistic is to 1, the greater the improved fit. Though there is no criterion for determining how high a percentage should be for retaining an improved model, the researcher often finds these percentages usefully interpreted by looking at experience with such data.
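As a quick Python sketch (an illustration only; it treats the chapter's one-way-and-higher chi square of 31.445 as a base model and the final stepwise model's 1.545 as the advanced model, which is an assumption about the choice of baseline):

    def fit_improvement(chisq_base, chisq_advanced):
        """Proportional improvement in fit of an advanced model over a base model."""
        return (chisq_base - chisq_advanced) / chisq_base

    print(round(fit_improvement(31.445, 1.545), 3))  # 0.951, about a 95% improvement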
SPSS
Log Linear Analysis Using Stepwise Methods. As previously described, there are different ways to identify a model. This example will illustrate using SPSS to create a model with stepwise methods and backward elimination. From the Analyze menu, the researcher chooses Loglinear followed by Model Selection. The researcher transfers the variable "freq" to the Cell Weights: field and the remaining variables to the Factor(s): field. After this step is completed, the range of levels for each factor must be specified. In this case each variable has two levels, 1 and 2. When the ranges are indicated in the Loglinear Analysis: Define Range dialog box, the researcher clicks the Continue button. To use the stepwise method, the researcher then needs to assure that the radio button for "Use backward elimination" is selected.

After clicking the Options button, the researcher sets delta at 0 and requests an association table. The association table would be used to help interpret the model, rather than to construct it directly. Afterward, the researcher clicks the Continue and OK buttons.
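Before turning to the output, it may help to see the mechanics the program relies on. Hierarchical log-linear models of this kind can be fit by iterative proportional fitting, and each "L.R. Chisq Change" reported below is the difference between the likelihood ratio chi squares of two such fits. The following Python sketch (an illustration of the general technique using numpy, not the SPSS routine itself) shows the idea:

    import numpy as np

    def ipf_fit(observed, generating_class, tol=1e-10, max_iter=500):
        """Fit a hierarchical log-linear model by iterative proportional
        fitting: scale expected counts until they match every margin in
        the generating class. Assumes the observed margins used here are
        positive, as they are in this chapter's example."""
        fitted = np.full(observed.shape, observed.sum() / observed.size)
        for _ in range(max_iter):
            previous = fitted.copy()
            for axes in generating_class:
                summed_out = tuple(i for i in range(observed.ndim) if i not in axes)
                ratio = observed.sum(axis=summed_out) / fitted.sum(axis=summed_out)
                fitted = fitted * np.expand_dims(ratio, axis=summed_out)
            if np.abs(fitted - previous).max() < tol:
                break
        return fitted

    def likelihood_ratio_chisq(observed, fitted):
        """Likelihood ratio chi square comparing observed and fitted counts."""
        kept = observed > 0
        return 2.0 * np.sum(observed[kept] * np.log(observed[kept] / fitted[kept]))

    # Example: the final stepwise model EVID*SEX*COMMAPP + EVID*CREDIB on a
    # 2 x 2 x 2 x 2 table with axes ordered (EVID, SEX, COMMAPP, CREDIB):
    #   fitted = ipf_fit(counts, [(0, 1, 2), (0, 3)])
    #   likelihood_ratio_chisq(counts, fitted)  # compare with the DF = 6 value below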
The chart below shows the first portion of the stepwise analysis. The model begins by testing the four-variable interaction. As can be seen, deleting this four-variable interaction from the model would not produce a statistically significant deviation of the model from the data. The next lowest effects (in this case, three-variable interactions) are then tested to determine which one contributed the least and may be deleted from the model.
    If Deleted Simple Effect is      DF   L.R. Chisq Change   Prob    Iter
      EVID*SEX*COMMAPP*CREDIB         1         .063          .8016      2

    Step 1

      The best model has generating class
         EVID*SEX*COMMAPP
         EVID*SEX*CREDIB
         EVID*COMMAPP*CREDIB
         SEX*COMMAPP*CREDIB
      Likelihood ratio chi square = .06314   DF = 1   P = .802

    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    If Deleted Simple Effect is      DF   L.R. Chisq Change   Prob    Iter
      EVID*SEX*COMMAPP                1        5.544          .0185      3
      EVID*SEX*CREDIB                 1         .168          .6815      3
      EVID*COMMAPP*CREDIB             1         .007          .9352      2
      SEX*COMMAPP*CREDIB              1         .091          .7628      2
In this case, the evidence by communication apprehension by source credibility interaction
produces only a .007 change in the chi square value. Hence, it was deleted in the next step as
shown below.
    Step 2

      The best model has generating class
         EVID*SEX*COMMAPP
         EVID*SEX*CREDIB
         SEX*COMMAPP*CREDIB
      Likelihood ratio chi square = .06975   DF = 2   P = .966

    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    If Deleted Simple Effect is      DF   L.R. Chisq Change   Prob    Iter
      EVID*SEX*COMMAPP                1        5.573          .0182      4
      EVID*SEX*CREDIB                 1         .189          .6640      3
      SEX*COMMAPP*CREDIB              1         .095          .7582      3
The third step involved deleting the three-variable interaction contributing the least to the change in the chi square value. At this point, once the candidate three-variable interactions were exhausted, the stepwise procedure introduced consideration of two-way interactions. The remaining steps appear below.
    Step 3

      The best model has generating class
         EVID*SEX*COMMAPP
         EVID*SEX*CREDIB
         COMMAPP*CREDIB
      Likelihood ratio chi square = .16452   DF = 3   P = .983

    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    If Deleted Simple Effect is      DF   L.R. Chisq Change   Prob    Iter
      EVID*SEX*COMMAPP                1        6.187          .0129      4
      EVID*SEX*CREDIB                 1         .139          .7092      4
      COMMAPP*CREDIB                  1         .105          .7459      2

    Step 4

      The best model has generating class
         EVID*SEX*COMMAPP
         EVID*SEX*CREDIB
      Likelihood ratio chi square = .26950   DF = 4   P = .992

    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    If Deleted Simple Effect is      DF   L.R. Chisq Change   Prob    Iter
      EVID*SEX*COMMAPP                1        6.239          .0125      3
      EVID*SEX*CREDIB                 1         .200          .6544      3

    Step 5

      The best model has generating class
         EVID*SEX*COMMAPP
         EVID*CREDIB
         SEX*CREDIB
      Likelihood ratio chi square = .46994   DF = 5   P = .993

    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    If Deleted Simple Effect is      DF   L.R. Chisq Change   Prob    Iter
      EVID*SEX*COMMAPP                1        6.239          .0125      4
      EVID*CREDIB                     1        6.305          .0120      2
      SEX*CREDIB                      1        1.075          .2999      2

    Step 6

      The best model has generating class
         EVID*SEX*COMMAPP
         EVID*CREDIB
      Likelihood ratio chi square = 1.54453   DF = 6   P = .956

    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    If Deleted Simple Effect is      DF   L.R. Chisq Change   Prob    Iter
      EVID*SEX*COMMAPP                1        6.239          .0125      3
      EVID*CREDIB                     1        6.703          .0096      2

    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    Step 7

      The best model has generating class
         EVID*SEX*COMMAPP
         EVID*CREDIB
      Likelihood ratio chi square = 1.54453   DF = 6   P = .956

    * * * * * * * *  H I E R A R C H I C A L   L O G   L I N E A R  * * * * * * * *

    The final model has generating class

       EVID*SEX*COMMAPP
       EVID*CREDIB
Eventually, the result of the work was identification of the three-variable interaction of evidence use by sex of respondent by communication apprehension and the two-variable interaction between evidence use and source credibility. Though this procedure produced a slightly different model than did the two-stage approach (which found a three-variable interaction and three two-variable interactions), the significant three-variable interaction and the interaction between evidence and credibility were shared by both models. As can be seen, researchers need to rely on sound judgment in interpreting the results of log-linear analyses if those results are to replicate in related inquiry.
End Notes
¹ With a large sample size, it has been observed that $\hat{L}$ divided by the standard error of the contrast is distributed as z (the standard normal curve) when the null hypothesis is true. This fact also means that there is an alternative way to compute the standard error of the contrast. It may be remembered that, with one degree of freedom, $\sqrt{\chi^2} = z$. Hence, for large sample sizes, one may use the chi square distribution to compute such matters with the formula $s_{\hat{L}} = \sqrt{\chi^2_{df}}$, where the degrees of freedom are those for the effect examined. The degrees of freedom often reduce to the number of conditions contrasted.
² An equivalent (and simpler) formula was provided in this book as $\alpha_{\text{experimentwise}} = 1 - (1 - \alpha_{\text{for each individual test}})^{\text{number of nonindependent tests}}$.