Autoregressive (simplex) models

Autoregressive models estimate the stability of a construct over time by
regressing later measures of the construct onto earlier measures of the same construct.
These constructs or variables can be observed or latent, and can even be cross-lagged with
other variables. In the simple observed case, a variable is an additive function of its
immediate predecessor and a random disturbance. With little extra difficulty, latent
variables can be used to remove measurement error; this is known as a quasi-simplex
model.
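As a rough sketch in equation form (the symbols are my own labelling rather than taken from a particular source, waves are indexed 0 to T-1, and means are left out for simplicity), the observed simplex model and its quasi-simplex extension could be written as:

```latex
% Observed simplex: each measure is its immediate predecessor plus a disturbance
y_t = \tau_t \, y_{t-1} + \zeta_t, \qquad t = 1, \dots, T-1

% Quasi-simplex: the autoregression runs among latent scores \eta_t,
% and each observed measure adds its own measurement error \varepsilon_t
y_t = \lambda_t \, \eta_t + \varepsilon_t
\eta_t = \tau_t \, \eta_{t-1} + \zeta_t
```

Here the \tau_t play the role of the stability coefficients, and there is one of them for each pair of adjacent waves.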
These across-time (within-construct) predictions are referred to as stability
coefficients. A stability coefficient indicates how well standing relative to the group
mean is preserved from one time to another. A large positive stability coefficient would
therefore indicate that people reporting above the mean at time 1 tended to report above
the mean at time 2. In other papers the coefficients are discussed as correlations, which
works well since, in standardized form, they can range from -1 to 1. The single stability
parameter characterizes the stability of each construct over two time points for all
subjects in the sample. This is an important distinction from latent growth curve models,
which estimate characteristics of individual change. So, in the autoregressive model, if
the group mean increased 2 units from time 1 to time 2, an individual who started just
above the time 1 mean but reported the same score at both time periods would fall below
the time 2 mean, and so would make a negative contribution to the covariance between the
two time points.
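A small simulation may make this concrete. It is only a sketch on made-up data: the sample size, means, and variances are assumptions, and the hypothetical person is placed one unit above the time 1 mean and left unchanged at time 2.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Simulated sample (assumed values): everyone gains about 2 units,
# so rank order is largely preserved while the group mean shifts upward.
time1 = rng.normal(10.0, 2.0, size=n)
time2 = time1 + 2.0 + rng.normal(0.0, 1.0, size=n)

# Hypothetical person: one unit above the time 1 mean, same score at time 2.
person_t1 = time1.mean() + 1.0
person_t2 = person_t1


def cross_product(x1, x2):
    """One case's contribution to the covariance: the product of its
    deviations from the time 1 mean and from the time 2 mean."""
    return (x1 - time1.mean()) * (x2 - time2.mean())


stability = np.corrcoef(time1, time2)[0, 1]
contribution = cross_product(person_t1, person_t2)

print("stability (correlation):", round(stability, 3))
print("unchanged person's cross-product:", round(contribution, 3))
```

The group-level stability is high even though every score changed, while the person whose raw score did not change at all pulls the covariance down, because the comparison is always with the wave-specific mean.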
In the simplex models, time is not treated as continuous; instead there are t-1 discrete
time-to-time relationships, where t is the number of time points. This allows a different
model to operate for every time period. So certain factors may influence the first wave,
and, depending on the stability coefficients, may affect later scores through the first
score. However, we can also examine how the effects of different variables change across
time. So, for example, minority status may contribute to a high score at time 1, and
therefore may keep scores higher than average because of the high initial score. However,
we could also let minority status affect later scores directly, and might find that
minority status affects scores at times 2 and 3, but not at time 4. Note that these are not
time-varying covariates, but rather time-invariant variables whose effects change over
time. We can also add time-varying covariates.
Even if set up like the standard simplex model, the quasi-simplex model is not
identified; more specifically, the parameters for the first two and the last two time
periods cannot be identified. To overcome this problem, researchers typically impose two
restrictions: first, set the first and second error variances equal, and second, set the
last and second-to-last error variances equal. Of course, in the minimal three-period
model, all the error variances would be set equal. For a one-indicator quasi-simplex model
there is an additional assumption that is not as easily handled or justified: the errors
of the indicators are assumed to be uncorrelated over time. This is, of course,
problematic, since we might expect errors of the same measure to be correlated over
time. The only way to adjust for this is to have multiple indicators for the latent construct.
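To see why the equal-error restriction is enough in the minimal three-period case, here is a sketch on simulated data. All population values are assumptions chosen for illustration, the loadings are fixed at 1, and the closed-form solution simply solves the covariance equations by hand rather than using an SEM package.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# Assumed population values (illustrative only).
tau1, tau2 = 0.8, 0.7   # stability coefficients between adjacent waves
var_eta0 = 1.0          # variance of the first latent score
var_zeta = 0.3          # disturbance variance at the second and third waves
var_eps = 0.4           # measurement error variance, equal at every wave

# Latent scores follow a first-order autoregression.
eta0 = rng.normal(0.0, np.sqrt(var_eta0), n)
eta1 = tau1 * eta0 + rng.normal(0.0, np.sqrt(var_zeta), n)
eta2 = tau2 * eta1 + rng.normal(0.0, np.sqrt(var_zeta), n)

# One indicator per wave: latent score plus uncorrelated measurement error.
y = np.column_stack(
    [eta + rng.normal(0.0, np.sqrt(var_eps), n) for eta in (eta0, eta1, eta2)]
)
S = np.cov(y, rowvar=False)

# Solve the covariance structure under equal error variances.
tau2_hat = S[0, 2] / S[0, 1]          # cov(y0, y2) / cov(y0, y1)
var_eta1_hat = S[1, 2] / tau2_hat     # cov(y1, y2) = tau2 * var(eta1)
var_eps_hat = S[1, 1] - var_eta1_hat  # var(y1) = var(eta1) + var(eps)
var_eta0_hat = S[0, 0] - var_eps_hat  # uses the equality restriction
tau1_hat = S[0, 1] / var_eta0_hat     # cov(y0, y1) = tau1 * var(eta0)

# For comparison: the same regression on observed scores is attenuated.
naive_tau1 = S[0, 1] / S[0, 0]

print("tau1:", round(tau1_hat, 3), "tau2:", round(tau2_hat, 3))
print("error variance:", round(var_eps_hat, 3))
print("naive observed-score tau1:", round(naive_tau1, 3))
```

Without the equality restriction there is no way to split the first and last observed variances into true-score and error parts, which is exactly where the identification problem sits. The sketch also relies on the errors being uncorrelated over time, the assumption questioned above.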
Our Current Analysis
As is evident from the figure below, our current analysis is a quasi-simplex
model. We make all the necessary assumptions about errors to make the model identified.
Interestingly, our stability coefficients (from time 1 to time 3) for attendance (.72),
religious importance (.82), and the different values (.51 to .81, mean .7) are
impressively high, especially given that this is really a four-year time lag. For
comparison, the stability of alcohol use among kids is around .53-.59 over one year. The
highest stabilities are for intelligence, which are around .87-.9 over one year. The good
thing about this is that by knowing the general trend (which I don’t remember discussing
in either of our papers) we typically know the trend of all the individuals. As for the
controls and college major at time 1, two things are being estimated simultaneously. One
is the stability of the construct from one time to another, and the other is the effect of
the variable. So, by way of illustration, after removing the stability from a score, a
control with a positive coefficient increases the score relative to the mean at that time
period. Here we take advantage of the quasi-simplex model by allowing the controls and
college major to affect multiple time periods. This reflects that we want to allow for
different models of change at different times.
In the paper we talk about the effects being interpreted as maintenance of attitudes. This
may or may not be true, since we do not know what the actual score is. In fact, a positive
effect could be evidence of either maintenance or change, because we are always talking
about scores in relation to the group average or the average group change.
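[Figure: path diagram of the quasi-simplex model. Labels shown: indicators A0, A1, A3
with errors ε0, ε1, ε3 and loadings λ0, λ1, λ3; latent constructs η0, η1, η3 with
disturbances ζ1, ζ3; paths τ1, τ2; College Major at Time 1 (β1); Controls (β2).]
Constraints: Var ε0 = Var ε1 = Var ε3; λ0 = λ1 = λ3 = 1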
Latent growth curve models
Latent growth curve models allow separate trajectories over time using repeated
measures. They require at least three time periods of measures for a linear trajectory,
and more for nonlinear trajectories. They model change for each individual over the time
period covered. So instead of comparing time-adjacent relations of variables, they use the
repeated measures to estimate an underlying or latent trajectory for each person. Two
latent factors, the intercept and the slope, represent the growth over time. Thus the
model estimates the group-level (fixed) means as well as the random (individual) effects.
For more details see pages 341-343 of Bollen and Curran 2004, or Curran 2000 (pp. 14-23).
As with the simplex models, the addition of latent constructs is quite straightforward and
commonly done (in psychology).
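The idea of a separate trajectory per person can be illustrated with a simulation. This is a stand-in, not the latent variable estimator itself: it fits an ordinary least squares line to each simulated person and then summarizes the results, whereas a latent growth curve model estimates the intercept and slope factors jointly. All the numbers are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
times = np.array([0.0, 1.0, 2.0])  # three waves, the minimum for a linear trajectory

# Assumed population: each person has their own starting level and rate of change.
true_intercepts = rng.normal(5.0, 1.0, n)
true_slopes = rng.normal(0.5, 0.3, n)
noise = rng.normal(0.0, 0.5, (n, len(times)))
y = true_intercepts[:, None] + true_slopes[:, None] * times + noise

# Fit a straight line to each person's repeated measures in one least squares call.
X = np.column_stack([np.ones_like(times), times])  # columns: intercept, time
coefs, *_ = np.linalg.lstsq(X, y.T, rcond=None)    # shape (2, n)
est_intercepts, est_slopes = coefs

# Group-level (fixed) part: average starting level and average change per wave.
mean_intercept = est_intercepts.mean()
mean_slope = est_slopes.mean()

# Individual (random) part: people differ around those averages.
# Note these case-by-case variances also contain measurement noise.
print("mean intercept:", round(mean_intercept, 3))
print("mean slope:", round(mean_slope, 3))
print("spread of individual slopes:", round(est_slopes.std(), 3))
```

The two averages correspond to the means of the intercept and slope factors, and the spread across people corresponds to their variances.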
The assumption of continuous change is important, as we assume a certain
functional relationship over time. We also assume that time-invariant variables can
affect the change only once across time. So, if we thought minority status had little
effect early but a major effect later, the latent growth model would not be able to test,
or control for, this situation. However, we can easily include time-varying covariates. The
interpretation would be something along the lines of: after removing the underlying
trajectory, the independent variable affected the dependent variable by _____ units.
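In equation form, and only as a sketch in my own notation (x_i stands for a time-invariant variable such as minority status, w_it for a time-varying covariate), the distinction looks roughly like this:

```latex
% Repeated measures: underlying trajectory plus a wave-specific covariate effect
y_{it} = \alpha_i + \lambda_t \, \beta_i + \gamma_t \, w_{it} + \varepsilon_{it}

% Time-invariant variables enter only through the intercept and slope factors
\alpha_i = \mu_\alpha + \gamma_\alpha \, x_i + \zeta_{\alpha i}
\beta_i  = \mu_\beta  + \gamma_\beta  \, x_i + \zeta_{\beta i}
```

Because x_i reaches the scores only through \alpha_i and \beta_i, it cannot have one effect early and a different effect later, while each \gamma_t is read as the effect of w_it after the trajectory has been taken out.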
Clearly an important issue with these models is missing data. In addition, people are
assumed to be interviewed at roughly equal intervals. However, imputation using
available data to complete the trajectories is commonly used.
Our Possible Analysis
Below is a very rough, but accurate, illustration of what our analysis would
probably look like if done using latent growth techniques. Although I added time 2, it
would not necessarily be required (although I think we should include it). Notice that all
of the time periods contribute (with the regression weight set to 1) to the intercept, and
the last three periods contribute to the slope (S1, S2, S3). It is by manipulating S1, S2,
and S3 that we would test for nonlinear relationships, as well as specify time intervals.
It is here that we would run into a major decision. Because some students were
interviewed on a 0, 1, 3, 5 schedule and others on a 0, 2, 4, 6 schedule, we would either
need to run two separate models for the two schedules, or we would need to impute the
missing years from the observed trajectories, which would leave us with a model with 7
time periods. Although the model would run with the data set as it is now (waves 0, 1,
and 3), that really violates the continuous change assumption of the model. I am not sure
where my preference lies, but I would try running two separate groups before imputing, no
matter what the final decision is. Whatever our choice, we would be estimating
trajectories for each individual over the time period. This would lead to some very nice
results, which would let us talk about change over a very important time. Many studies
have used just the unconditional model to talk about different trajectories (increasing,
staying the same, decreasing, nonlinear). There is already some work using growth models
for religiosity, although ours has different measures, a longer time period, and more
controls. However, there is nothing I am aware of like this for the attitudes we talk
about in the other paper. In fact, these may be very nice corollaries to the current
papers.
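The scheduling decision can be sketched in terms of the loadings. The helper `loading_matrix` and the growth means below are hypothetical and only illustrate how the slope loadings would encode years since the first interview under each schedule.

```python
import numpy as np


def loading_matrix(years):
    """Rows are waves. Column 0 is the intercept loading (fixed at 1);
    column 1 is the slope loading, coded as years since the first wave."""
    years = np.asarray(years, dtype=float)
    return np.column_stack([np.ones_like(years), years - years[0]])


schedule_a = [0, 1, 3, 5]
schedule_b = [0, 2, 4, 6]

lambda_a = loading_matrix(schedule_a)
lambda_b = loading_matrix(schedule_b)

# Hypothetical mean intercept and mean change per year, shared by both groups.
growth_means = np.array([3.0, 0.25])

# Same underlying linear trajectory, but observed at different years.
implied_means_a = lambda_a @ growth_means
implied_means_b = lambda_b @ growth_means

# Putting both schedules on one yearly grid gives the 7-period model.
pooled_years = sorted(set(schedule_a) | set(schedule_b))

print("schedule A slope loadings:", lambda_a[:, 1])
print("schedule B slope loadings:", lambda_b[:, 1])
print("pooled occasions:", pooled_years)
```

Running two groups would amount to using one loading matrix per schedule, while the imputation route would use a single seven-row matrix with the unobserved years filled in for each person.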
Turning to the different effects, let me again remind you that our controls (currently
set to be time invariant) would operate only once on the trajectories (Ci, Cs). Since most
of our controls (with the exception of the R’s political views) are temporally prior to the
respondent’s attitude, we shouldn’t be in too much trouble there. A nice feature of these
models is that the controls can predict not only starting levels, but also how the change
will occur. So while minority status may affect the intercept, parent SES may affect both
the intercept and the slope. I find this particularly interesting, and in some ways I think
this is what we are seeing with the current setup of allowing the controls to affect both
time 1 and time 3 in the autoregressive model. However, again, this is outside the scope of
this paper. Major would be included as a time-varying covariate. We would not need
to have multiple measures of major, but as illustrated below we could have it affect both
time 2 and time 3 (M2, M3). A positive coefficient for M3, for example, would suggest
that having a certain major raises an individual’s religiosity, even after taking out the
conditional trajectory of that individual. Interestingly, by doing trajectory-group (i.e.,
increasing, staying the same, etc.) analyses we could essentially do interactions of the
major with the trajectory type (assuming we have enough data).
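[Figure: path diagram of the latent growth curve model. Labels shown: indicators A0, A1,
A2, A3 with errors ε0, ε1, ε2, ε3 and loadings λ0, λ1, λ2, λ3; latent constructs η0, η1,
η2, η3 with disturbances ζ1, ζ2, ζ3; an intercept factor with loadings fixed at 1; a slope
factor with loadings S1, S2, S3; Controls predicting the intercept (Ci) and the slope (Cs);
College Major at Time 1 with paths M2 and M3.]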
Other possible models
One model that is relatively new is addressed in Bollen and Curran’s recent book
on latent growth curve models; it uses a method effect for multiple indicators. For
both the quasi-simplex and latent growth curve models we have to assume the errors
are uncorrelated over time, an assumption that is almost certainly incorrect. However,
when there are multiple indicators for the latent constructs, a method effect construct
can be added for each of the indicators over time. This can, in fact, be done for the
autoregressive models as well. Certainly for the family, career and society paper we
might want to look at this.
There have been some attempts to combine the autoregressive and growth curve
models. The most recent, and apparently most successful, attempt is the autoregressive
latent trajectory model (starting on page 345 of Bollen and Curran 2004, see attached). It
may be best summarized by looking at one of their figures:
As you can see, it tries to combine the elements of both, and it seems to do a very good
job. However, at this point they have not tried to include time-varying covariates. In their
recent book, these authors argue that this model acts much like the method effect model,
where the stability coefficients are really just getting rid of correlated measurement
error over time. As much as I like this idea, I have to be honest that I am not completely
confident of what each regression or stability coefficient really means. Nevertheless, it
is cutting edge, and may be something we want to look into further. By the way, in the
Bollen and Curran 2004 article attached they also do a nice job of writing a set of
equations that can be used for both autoregressive and latent curve models, thus showing
many of the similarities.
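As I understand it, the general form of the combined model is roughly the following. This is a sketch in my own notation, leaving out how the first wave is handled, so the article should be checked for the exact specification.

```latex
% Autoregressive latent trajectory form: a latent curve plus a lagged-score term
y_{it} = \alpha_i + \lambda_t \, \beta_i + \rho_{t,t-1} \, y_{i,t-1} + \varepsilon_{it},
\qquad t = 1, \dots, T-1

% \rho_{t,t-1} = 0 for all t           : reduces to the latent curve model
% no random \alpha_i or \beta_i        : reduces to the autoregressive model
```

Seen this way, the two models we have been comparing are special cases that differ in which terms are set to zero.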
Conclusion
As of right now I think our paper attempts to answer this question (substitute
values for religiosity for the other paper): Does being involved in a specific major have
an effect on religiosity during young adulthood? To determine what model we need to
use, I think we need to answer two important questions. 1) Do we want to examine change
relative to the group or change relative to the individual? 2a) Do we think the change is
continuous or discrete? 2b) Are there different models for different time points? There are
good things about each model, but in the end I think it comes down to these two major
issues.