
A Three-Dimensional Latent Variable Model for Attitude Scales
Keywords: latent variable models, multidimensional models, factor analysis, binary items, polytomous items, attitude scales
Abstract
We propose a three-dimensional latent variable (trait) model for analyzing attitude-scale data. It is successfully applied to two examples: one with twelve binary items, and the other with eight items of five categories each. The models are exploratory rather than confirmatory, and the sub-scales from which the data were drawn are clearly identified. For binary items the model gives results similar to those of factor analysis. For polytomous items, it can estimate category scores simultaneously with the internal structure, and from that a further dimension on the degree of taking moderate views is extracted. This is possible because conventional analyses usually fix category scores at numerical values, whereas in latent variable models they are free to vary. With these findings, more than three dimensions become feasible given today's computing power and tailor-made methods such as adaptive quadrature points for numerical integration.
Introduction
In analyzing attitudinal data, questionnaires usually consist of many questions or items, and each item consists of categories such as "strongly agree", "agree", and so on. The most popular approach is factor analysis in well-known software such as SPSS. This tells us the dimensionality of the data, as well as the internal structure through the factor loadings. Category scores, however, are fixed at numerical values, e.g., 1 for "strongly disagree", 2 for "disagree", and so on. Some log-linear models have been used to estimate category scores, and Sobel (1998) used log-linear and log-nonlinear models to investigate the nature of the midpoint of categories. However, that approach says nothing about the internal structure.
Another way is to use item response theories and related methods. These methods very often start as uni-dimensional, but can be extended to multi-dimensional models. In particular, De Boeck & Wilson (2004) proposed a generalized model which includes both fixed and random effects, but these effects are external rather than internal. Chapter eight of that book deals with the internal structure, which is more relevant to the present paper.
Another branch is the compensatory multi-dimensional item response model (Bolt & Lall 2003). These models are compensatory because latent constructs, or attitudes in this paper, on one dimension can be compensated for by attitudes on other dimensions. In non-compensatory models, attitudes on one dimension cannot be compensated for by other dimensions. (Technically, compensatory models are additive in mathematical form and non-compensatory models are multiplicative. For non-compensatory models, an increase in the probability on one dimension increases the overall probability but not the conditional probabilities of the other dimensions.)
However, few of the multi-dimensional models in item response theory include three or more latent dimensions. Moreover, these models very often assume specific characteristics of items or categories, e.g. difficulty or discrimination, and the purpose is to estimate them. Latent variable models, on the other hand, have different starting points and are largely exploratory. Although the mathematical forms of latent variable models for categorical data analysis and of item response theories are much the same, there are differences in orientation and implications. In this paper, we apply a three-dimensional latent variable model to attitudinal data.
The latent variable model this paper refers to dates back to Bartholomew (1980). A detailed description can be found in Bartholomew and Knott (1999) and recent developments in Moustaki (2005). The first reason to use this model is data reduction. Given a number of observed items x, the aim is to find a much smaller number of latent variables z that can explain the inter-dependence among the x's. We start with one z, but there is nothing to stop us from using more. Comparing results from one z with those from two or more z's gives us an idea of the differences between models. The second reason is to quantify latent constructs. In attitude scaling this fits nicely with the purpose of measuring individuals' social attitudes. More important, the third reason is to achieve conditional independence. Here, the aim is to explain the dependencies among the x's by the z's. Note that this is treated as the aim rather than as a characteristic. If one dimension is not enough, we can use two or more. We will see later that this gives important interpretations to our findings. Finally, the sufficiency principle enables a very simple form of data reduction. It states that under the most usual conditions, the z's depend on the x's only through simple linear combinations. In our examples below, we shall show that the simple sum correlates highly with the weighted sum, and hence it does not matter which is used. Details of these reasons can be found in Bartholomew and Knott (1999, chapter 1).
There is much work on latent variable models. For metrical data, well-established models in factor analysis are equivalent to latent variable models (Bartholomew & Knott 1999, chapter 3). If the observed items are ordinal or categorical, latent trait models can be used (Bartholomew & Knott 1999, chapter 5; Leung 1992). Moustaki and Joreskog (2001) compared this approach with the underlying response variable approach. Moustaki & Knott (2000b) present a framework for a mixture of manifest binary, nominal and metrical variables. For ordinal data, Moustaki (2000) presents a general framework in which the latent variable model arises as a special case depending on which link function is chosen; however, parameters are assigned to items rather than to categories. In this paper parameters are assigned to each category of each item, so that category scores can be estimated. Moustaki & Knott (2000a) assume two latent variables, one for attitude and the other for propensity to respond, and use latent variable models with observed covariates to compute response propensities. Here, we extend further to include another dimension measuring the degree of taking moderate views. But we want to emphasize that this is not the purpose of our model choice, but rather a consequence.
In sum, latent variable models aim to use a much lower dimension to explain the dependencies among observed items, and thereby achieve data reduction. The model is largely exploratory in nature because it does not assume any specific characteristics or internal structure, as item response theories or factor analysis do. The next section looks at the model in more detail.
Model
Suppose we have p items, each having c categories. There is no need for all items to have the same number of categories, but the assumption eases the reading and fits the examples better. We label a single observation by x' = (x11, x12, ..., xpc), with xij = 1 if category j of item i is chosen, and zero otherwise. Let πij(z) = Pr(Xij = 1 | z) be the conditional probability of choosing category j of item i, given that the continuous latent variables take the value z. The dimension of the latent space can be any value smaller than p, but here we deal only with three latent dimensions, i.e., z' = (z1, z2, z3). With the aim of achieving conditional independence, Pr(X = x | z) can be expressed as follows:
$$\Pr(\mathbf{X}=\mathbf{x}\mid \mathbf{z}) \;=\; \Pr(\mathbf{X}=\mathbf{x}\mid z_1,z_2,z_3) \;=\; \prod_{i=1}^{p}\prod_{j=1}^{c}\pi_{ij}(z_1,z_2,z_3)^{\,x_{ij}}$$
By the sufficiency principle set out in Bartholomew & Knott (1999, chapter 5), a generalized
logit function for πij(z) is chosen, i.e.,
$$\pi_{ij}(\mathbf{z}) \;=\; \exp(\alpha_{0ij} + \alpha_{1ij}z_1 + \alpha_{2ij}z_2 + \alpha_{3ij}z_3)\,/\,A_i,$$
where $A_i = \sum_{k=1}^{c}\exp(\alpha_{0ik} + \alpha_{1ik}z_1 + \alpha_{2ik}z_2 + \alpha_{3ik}z_3)$ is the normalizing function in the α's and z's.
Since adding the same constant to α0ij for all values of j leaves πij(z) unchanged, we fix the first category at zero, i.e., α0i1 = 0 for all i, and similarly for α1i1, α2i1 and α3i1. The parameters α0ij, α1ij, α2ij and α3ij serve functions similar to those of factor loadings in factor analysis, and are responsible for locating the categories of items in the latent space. The conditional probability depends on the z's through α1ij, α2ij and α3ij, not through α0ij. The magnitude of α1ij determines how big the effect of z1 is on the probability, and similarly for z2 and z3. We will see later that the relative values of the α's identify the sub-scales behind the data.
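As an illustration, a minimal sketch of this response function in Python (the array names are ours, not the author's; the original programs are available from the author on request):

```python
import numpy as np

def category_probs(alpha0, alpha, z):
    """pi_ij(z) for one item with c categories under the generalized logit.

    alpha0 : (c,) intercepts alpha_0ij, with alpha0[0] = 0 for identification
    alpha  : (c, 3) slopes (alpha_1ij, alpha_2ij, alpha_3ij), with alpha[0] = 0
    z      : (3,) latent position (z1, z2, z3)
    """
    eta = alpha0 + alpha @ z   # linear predictor for each category
    eta = eta - eta.max()      # guard against overflow; cancels in the ratio
    num = np.exp(eta)
    return num / num.sum()     # divide by the normalizer A_i
```

Note that subtracting a constant from every linear predictor cancels in the ratio, which is exactly the identification argument used above to fix the first category at zero.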
In the one-dimensional model with binary data, the α0i's and α1i's are respectively the difficulty and discrimination parameters of item response theory. In latent variable models, we do not use the α's to describe any specific characteristics, though they serve similar functions. For example, there are debates over whether the discrimination parameters are constant across items or categories. Here, they are free to take any value, and this illustrates the exploratory nature of the approach.
Without loss of generality, and as a conventional choice, we assume the prior distribution of the latent z's to be independent identical standard normal. The overall probability of observing a specific response pattern h, say Ph, is the marginal density of the x's, and can now be written as follows:
Ph  f  x      z1    z2    z3 
p
c
  z , z , z 
xij
ij
i 1
1
2
3
dz1dz2 dz3
j 1
where  is the standard normal density function.
The integrations are over z1, z2 and z3.
From the sufficiency principle described earlier, the components, denoted by y's and defined by
$$y_1 = \sum_{i=1}^{p}\sum_{j=2}^{c} x_{ij}\,\alpha_{1ij}, \qquad
y_2 = \sum_{i=1}^{p}\sum_{j=2}^{c} x_{ij}\,\alpha_{2ij}, \qquad
y_3 = \sum_{i=1}^{p}\sum_{j=2}^{c} x_{ij}\,\alpha_{3ij},$$
are sufficient for the z's, as the x's depend on the z's only through the y's. The summation starts from j = 2 because the α's are zero for j = 1. These components are linear combinations of the x's and α's, and are used to locate individuals in the latent space. In item response theories, these are the sums of the discrimination parameters. We will see later that they serve an important function in scaling.
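As a sketch, the components for one respondent reduce to a couple of lines, assuming the responses are stored as a p x c indicator array (names are ours):

```python
import numpy as np

def component_scores(x, alphas):
    """Sufficient components (y1, y2, y3) for one respondent.

    x      : (p, c) 0/1 indicators; x[i, j] = 1 if category j of item i chosen
    alphas : (3, p, c) estimated slopes; alphas[:, :, 0] is zero under the
             identification constraint, so summing from j = 0 is harmless
    """
    # y_k = sum_i sum_j x_ij * alpha_kij, for k = 1, 2, 3
    return np.tensordot(alphas, x, axes=([1, 2], [0, 1]))
```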
Parameter Estimation
For parameter estimation we use marginal maximum likelihood; details can be found in Bock and Aitkin (1981) and Leung (1992). We give only a brief description here; computer programs are available from the author upon request. First, the latent space is divided into a grid of points; in our case the grid lies in a three-dimensional latent space. At the l-th point, say $(l_1, l_2, l_3)$, we calculate the expected number of persons choosing category j of item i, say $r_{l_1 l_2 l_3 ij}$, and the expected sample size, say $N_{l_1 l_2 l_3}$. They are "expected" figures because they depend on the estimated parameters α. However, summing these values over the latent space gives, respectively, the overall number of persons choosing category j of item i and the total sample size.
The algorithm is divided into two major steps, the expectation step and the maximization step. In the expectation step, we compute the r's and N's with the α's held fixed. In the maximization step it is the other way around: we solve the maximum likelihood equations for the α's with the r's and N's held fixed. The two steps are iterated until convergence is obtained.
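A compressed sketch of the expectation step follows, assuming the item probabilities have already been evaluated at every grid point and stored in an array pi; all names are ours, and the maximization step is omitted:

```python
import numpy as np

def e_step(patterns, counts, pi, prior_w):
    """Expected counts r and N over a fixed grid of latent points.

    patterns : (H, p) distinct response patterns, one category index per item
    counts   : (H,) observed frequency of each pattern
    pi       : (G, p, c) pi_ij evaluated at each of the G grid points
    prior_w  : (G,) prior (normal) weight attached to each grid point
    """
    G, p, c = pi.shape
    H = patterns.shape[0]
    lik = np.ones((H, G))                    # likelihood of pattern h at point g
    for i in range(p):
        lik *= pi[:, i, patterns[:, i]].T
    post = lik * prior_w                     # joint with the prior weights
    post /= post.sum(axis=1, keepdims=True)  # posterior over grid points
    N = counts @ post                        # (G,) expected sample sizes
    r = np.zeros((G, p, c))                  # expected category counts
    for i in range(p):
        for j in range(c):
            r[:, i, j] = (counts * (patterns[:, i] == j)) @ post
    return r, N
```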
Unlike in some item response theories, the latent z's are treated as variables rather than as parameters. Hence, numerical integration over the z's is unavoidable, which is why marginal maximum likelihood is particularly useful here. Methods like the Gauss-Hermite quadrature formula are needed. There are computational problems in high dimensions, and we address this issue in more detail in the discussion section.
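For instance, a fixed-point Gauss-Hermite evaluation of the marginal probability Ph of a single response pattern might be sketched as follows; this is a simplified illustration under the standard normal prior, not the author's program:

```python
import numpy as np
from itertools import product

def pattern_prob(pattern, alpha0, alpha, n_nodes=10):
    """Marginal probability P_h of one response pattern, by fixed-point
    Gauss-Hermite quadrature over (z1, z2, z3) with N(0, 1) priors.

    pattern : (p,) chosen category index for each item
    alpha0  : (p, c) intercepts
    alpha   : (p, c, 3) slopes on the three latent dimensions
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    z_nodes = np.sqrt(2.0) * nodes   # change of variables for the N(0,1) weight
    w = weights / np.sqrt(np.pi)     # rescaled weights sum to one
    total = 0.0
    for (a, z1), (b, z2), (d, z3) in product(enumerate(z_nodes), repeat=3):
        z = np.array([z1, z2, z3])
        lik = 1.0
        for i, j in enumerate(pattern):
            eta = alpha0[i] + alpha[i] @ z   # (c,) linear predictors
            eta = eta - eta.max()
            p_cat = np.exp(eta)
            lik *= p_cat[j] / p_cat.sum()    # pi_ij(z) at the chosen category
        total += w[a] * w[b] * w[d] * lik
    return total
```

With ten nodes per dimension this already means 1,000 evaluations per pattern, which is why the discussion section returns to adaptive quadrature.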
Examples with binary data
The data used are part of a study on the mathematics attitudes of secondary school students in Macau (Wong 2004), using a Chinese version of the Fennema-Sherman Mathematics Attitude Scales (Fennema & Sherman 1976). There are 108 items divided into nine sub-scales of twelve items each. A five-point Likert scale is used, with categories labeled "strongly agree", "agree", "no comment", "disagree" and "strongly disagree". Here, we choose only four items from each of three sub-scales for illustration; all twelve items are listed in the Appendix. The first four items belong to the sub-scale "usefulness of mathematics". The next four and the last four belong to the sub-scales "attitude towards success in mathematics" and "confidence in learning mathematics" respectively. Having said that, we pretend that we do not know which items belong to which sub-scales, and let the parameter estimation tell us. Hence, there is no index for sub-scales.
We shall start with binary items. The five categories are grouped into two, so that we have twelve binary items. The total sample size is 519 after listwise deletion of missing values, i.e., the whole record is treated as missing if any item is missing.
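As a small data-preparation sketch (the paper does not state where the five categories were cut, so the threshold below is purely illustrative, as is the missing-value coding):

```python
import numpy as np

def dichotomize(responses, cut=4):
    """Collapse 1-5 Likert codes to 0/1 and drop listwise-missing records.

    responses : (n, p) integer codes 1-5, with 0 marking a missing answer
                (an assumed coding, for illustration only)
    cut       : hypothetical threshold; codes >= cut become 1
    """
    complete = (responses > 0).all(axis=1)   # listwise deletion
    return (responses[complete] >= cut).astype(int)
```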
Table 1 reports
estimates of α’s.
---------------
Table 1 about here
---------------

In Table 1, the α's for the first category are all set to zero, so only the α's of the second category are reported. As we noted before, the magnitudes of the α's determine the effects of the z's on the response probabilities. In the first dimension, relatively large α's are found for the items of the second sub-scale, "attitude towards success in mathematics". Similarly, the second and third dimensions match the third and first sub-scales respectively. In fact, in terms of magnitude, the maximum of the non-underlined α's is 1.04, while the underlined α's range from 1.72 up to 3.78. That is, the smaller α's all lie below 1.04 and the larger ones between 1.72 and 3.78. Having said that, a more formal treatment would be to set the corresponding α's to zero and test the difference between the restricted and unrestricted models. We will come back to this in the discussion section.
To compare with other methods, we ran a conventional factor analysis on these twelve binary items. The default estimation method (which SPSS calls the factor extraction method) is principal component analysis. After varimax rotation, the factor loadings are reported in Table 1 for comparison. For the factor loadings, the small values lie below 0.18 and the large values range from 0.71 to 0.80. In sum, the three latent dimensions can be interpreted as the second, third and first sub-scales respectively, the same as in the latent variable model.

We can see that the α's from the latent variable model and the factor loadings from factor analysis are very strongly related. In fact, the correlations between the α's and the factor loadings are 0.95, 0.93 and 0.91 for the first, second and third dimensions respectively. This implies that the two models provide the same scaling. Indeed, Bartholomew and Knott (1999, pp. 87-89) show the equivalence between latent variable models for binary data and the underlying variable approach using threshold parameters. The only difference is that the normal ogive response function is used in their proof whereas the logistic function is used in this paper, but the two are very similar.
Goodness-of-fit
It is well known that in high-way contingency tables the number of possible cells far exceeds the sample size. In our example with binary items, the number of cells and the sample size are 2^12 = 4,096 and 519 respectively. Bartholomew and Tzamourani (1999) show that traditional goodness-of-fit tests are often invalid for sparse 2^p tables; any traditional measure cannot provide a true picture but only initial views (Joreskog & Moustaki 2001). An initial analysis can be provided by the traditional log-likelihood ratio statistic, G2, which is -2 log(likelihood ratio). This is reported in Table 2.
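In terms of observed and expected pattern frequencies Oh and Eh, G2 = 2 Σh Oh log(Oh / Eh); a minimal sketch, summing only over observed patterns since empty cells contribute nothing:

```python
import numpy as np

def g_squared(observed, expected):
    """G2 = -2 log(likelihood ratio) = 2 * sum_h O_h * log(O_h / E_h),
    taken over response patterns with O_h > 0 (zero cells contribute 0)."""
    o = np.asarray(observed, dtype=float)
    e = np.asarray(expected, dtype=float)
    keep = o > 0
    return 2.0 * np.sum(o[keep] * np.log(o[keep] / e[keep]))
```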
---------------
Table 2 about here
---------------

In Table 2, moving from one dimension to two and then three, the marginal decrease in G2 is far bigger than the number of parameters added. But the proportional decrease in G2 does not match the number of parameters added: doubling the number of parameters is accompanied by a decrease of only one third in G2. Adding parameters did give a better fit, but it is difficult to say whether this is enough. Another way is to use statistics based on two-way margins only, because the data are very sparse in high-way tables.
We suggest using the "Y-statistic" introduced by Bartholomew and Leung (2002) and Leung (2005), as it is based on two-way margins. Since the focus of this paper is on applications, only a brief description is given here; interested readers can refer to the above papers for details. The method goes through all possible pairs of items and, for each pair, calculates the observed and expected frequencies in the corresponding two-way marginal table. A statistic is computed by summing the chi-squared statistics over all pairs. The distribution of this statistic, called the "Y-statistic", is very complicated: the individual terms inside the summation are chi-squared distributed, but they are highly associated among themselves, and hence the sum is not chi-squared distributed. However, the dependences can be adjusted for through the second and third moments. For any fully specified model, Bartholomew and Leung (2002) give a formula for calculating the first three moments. From these, an approximation to the percentage points of the tail probabilities is obtained by matching the moments computed above with those of a linear function of a chi-squared variable with an adjustable degree of freedom.
The p-values for the one-, two- and three-dimensional latent variable models are reported in the last column of Table 2.
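A rough sketch of the raw statistic follows; the moment-matching adjustment of Bartholomew and Leung (2002) is not reproduced, and the array names are ours:

```python
import numpy as np
from itertools import combinations

def y_statistic(data, expected_margins):
    """Pearson chi-squared summed over all two-way margins.

    data             : (n, p) observed category indices per respondent
    expected_margins : dict mapping an item pair (i, k) to its (c, c) table
                       of expected frequencies under the fitted model
    """
    n, p = data.shape
    total = 0.0
    for i, k in combinations(range(p), 2):
        exp_tab = expected_margins[(i, k)]
        obs = np.zeros_like(exp_tab, dtype=float)
        for a, b in zip(data[:, i], data[:, k]):
            obs[a, b] += 1.0                     # observed two-way counts
        total += np.sum((obs - exp_tab) ** 2 / exp_tab)
    return total
```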
From Table 2, the one-dimensional model is significant, i.e., it is rejected. The two-dimensional model gives a good fit, and the fit is further improved by a third dimension. But from the last section on parameter estimation, three dimensions corresponding to the three sub-scales were clearly identified, so we might have expected a significant result for the two-dimensional model and a non-significant result for the three-dimensional model. Although the results from the overall goodness-of-fit and the parameter estimates do not match perfectly, we offer the following considerations. First, the Y-statistic deals only with fully specified models: parameters are estimated but treated as known, so adding more parameters will always provide a better fit. Second, and perhaps more important here, the Y-statistic deals only with two-way margins. There may be undetected misfit in higher-way margins, particularly when we apply a two-dimensional model to three-dimensional data. Further analysis of this problem is needed, though it is not the central theme of this paper.
Scaling for binary items
For binary items, in most practical situations scaling is done by summing the zeros and ones over all items. We call these raw scores; they are very convenient but lack statistical justification. Alternatively, the usual factor analysis in SPSS gives factor scores for each subject. In the latent variable model, we use the components y defined above. Table 3 shows the correlations among these different types of scores.
---------------
Table 3 about here
---------------

All correlations are very high, and the three scoring systems give very similar scalings of mathematics attitude. From latent variable modeling, the dependences among items can largely be accounted for by the component scores, and hence by the raw scores. This, perhaps, justifies the use of raw scores in practice. In this example with binary items, latent variable models work just like factor analysis. In the next example with polytomous items, we shall show that category scores can be estimated simultaneously with the latent structure.
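As a small illustration of this near-equivalence, one can correlate the simple sums with the α-weighted components; a sketch for binary items (hypothetical arrays; with binary items only the second-category α's enter the weighted sum):

```python
import numpy as np

def score_correlation(binary_items, alpha1):
    """Correlation between raw scores (simple sums) and the first
    component (alpha-weighted sums) for binary items.

    binary_items : (n, p) 0/1 responses
    alpha1       : (p,) estimated second-category slopes on dimension 1
    """
    raw = binary_items.sum(axis=1)
    weighted = binary_items @ alpha1
    return np.corrcoef(raw, weighted)[0, 1]
```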
Examples with polytomous items
In the second example, the five categories remain ungrouped, but we work with only two of the three sub-scales, namely "usefulness of mathematics" and "success in mathematics". We remove one sub-scale but allow five categories in order to see how the three-dimensional model works. The estimates of the α's are reported in Table 4.

---------------
Table 4 about here
---------------
The way to read the α's is different here. For binary items, we only need to look at the α's of one category, as the other is fixed, and the effect of an item depends on the magnitude of a single α. Here we have five categories instead of two, so even though we fix one, four remain. In the first dimension of Table 4, the first four items show a general trend of increasing α's as we move from the first to the fifth category. Though there is a slight decrease in the α's from the first to the second category for items two, three and four, this does not affect the general trend, which is in line with the category labels. Moreover, in the fifth category, the minimum over the first four items is 2.54, while the maximum over the last four is only 1.88; the fifth-category values are relatively larger for the first four items than for the last four. Larger α's imply a more positive attitude towards the "usefulness of mathematics". Hence, we can name the first dimension after the first sub-scale.
A similar phenomenon occurs in the second dimension. There, the last four items show a gradual increase in the α's from the first to the fifth category; the only exception is item six, moving from the second to the third category, and the difference is negligible. For the fifth category, the maximum over the first four items is only 0.96, and the minimum over the last four is 2.87. So we can name the second dimension after the second sub-scale, i.e., attitude towards "success in mathematics".
In the third dimension, we have underlined the α's of the middle categories of all items because they are the biggest in most cases. The exceptions are items one, five and seven, but there the maximum occurs at an adjacent category and the differences are negligible. More importantly, the general trend is that categories at both ends of the continuum from "strongly disagree" to "strongly agree" have lower values, while those near the middle have higher values. Hence, this dimension can be interpreted as the degree to which moderate views are taken: larger α's imply more moderate responses and smaller α's more extreme ones. This cannot be identified by conventional methods, since there the category scores are fixed at numerical values. Note that this analysis refers only to category scores, not to the scoring of individuals. However, from the sufficiency principle, we know that we can use the y's of the third dimension to approximate an individual's tendency to take moderate views.
In sum, no assumptions about the internal structure or the category scores are made in the model or at the start of the analysis. Yet, at the end, the model successfully identifies the two sub-scales and estimates the category scores simultaneously. This illustrates the exploratory nature of the model.
For goodness-of-fit, the sparseness problem is now much more serious. We have only eight items, but five categories each: the number of cells is huge (5^8 = 390,625), and the total sample size is 529. To save space, the conventional G2 statistics and the p-values of the Y-statistics are also reported in Table 2. For the polytomous items, moving from one dimension to three brings a decrease of only one fifth in G2 while the number of parameters doubles; the corresponding figure for the binary items was one third. This is an important feature of G2 and needs further study. For the Y-statistics, the pattern is similar to the binary case: one dimension is significant, two dimensions are not, and three dimensions give a nearly complete fit. In sum, the implication is again that adding more dimensions improves the fit, but it is difficult to tell whether that is enough. The problems here are largely the same as with binary items: the traditional G2 suffers from sparseness, and the new Y-statistic may have a problem with estimated parameters. We cannot handle all these problems at once, but will return to them in further research.

Even though the goodness-of-fit results are not very promising, there are interesting findings in scaling.
Scaling for polytomous items
The correlations among the different types of scores for polytomous items are reported in Table 3. The situation here is largely the same as for binary items: all correlations are very high, and in practice the three scoring systems give very similar scalings of mathematics attitude. Relatively speaking, the component scores from the latent variable model have lower correlations with the other two. This is because both the raw scores and the factor scores rest on the same assumption that category scores are fixed at numerical values, whereas in latent variable models the category scores are estimated, and hence we obtain a third dimension on the degree of taking moderate views. The correlations between the third component and the other two are very low and are not reported, as this dimension has nothing to do with mathematics attitude. Instead of reporting correlations, we would like to see the relationship between mathematics attitude and the tendency to take moderate views. This is done by a simple plot of the components.
To get a further picture of how individuals are distributed in the latent space, we plot the third component against the first and the second in Figures 1 and 2 respectively. In both figures, the y-axis represents the degree of taking moderate views, with larger values representing more moderate views. The x-axes of Figures 1 and 2 represent attitudes towards the "usefulness of mathematics" and "success in mathematics" respectively, with more positive values representing more positive attitudes.

---------------
Figure 1 about here
---------------

---------------
Figure 2 about here
---------------

In Figures 1 and 2, most people are concentrated at large values on the y-axis. This indicates that most people prefer to take moderate views and that fewer people take extreme ones, which is quite typical of Chinese respondents. The distributions are quite skewed because they are not the prior standard normal but the posterior distribution of the z's, which has taken the data into account. In both figures, people with very positive attitudes towards mathematics have a greater chance of taking extreme views, whereas those with very negative attitudes behave much like everyone else and are more likely to take moderate views. This pattern is the same in both figures. These figures give us an idea of how individuals are distributed in the latent space, and hence of the relationship between mathematics attitude and the degree of taking moderate views.
Discussion
In this paper, we extend latent variable models to three latent dimensions, and apply them to examples with binary and polytomous items. Although the model presented here is similar to compensatory item response models and to models handling internal structure within mixed-effects item response models, those models seldom extend to three latent dimensions. More importantly, latent variable models have different orientations and motivations. Their aim is to explain the dependencies among the observables by a much lower latent dimension, in addition to data reduction and the quantification of latent constructs. The approach is largely exploratory by nature, and hence can estimate category scores and latent structure simultaneously. We can identify the dimension of taking moderate views because of this exploratory nature, and not because of the "no comment" label on the third category, since the algorithm is blind to the labels. This also explains why the model differs from other latent variable models for the propensity to respond.
To see whether alternative methods like factor analysis can do the same things, we carried out a small-scale study using dummy variables in factor analysis. Since there are eight items with five categories each, forty dummy variables were created, each equal to one when a particular category of an item is chosen and zero otherwise. Factor analysis applied to these forty dummy variables identified fifteen factors with eigenvalues bigger than one, and there was no clear pattern in the factor loadings after varimax rotation. But this is only a small-scale study, and further work is necessary. In particular, it would help to simulate data from hypothesized models and then fit them both with latent variable models and with factor analysis on dummy variables.
Another important area is the confirmatory use of the models. In our example with binary items we know what the sub-scales are, though we pretend that we do not. By setting the corresponding α's to zero, we can investigate whether the restricted models are very different. In most applications, researchers know what the latent constructs are, and it would be straightforward to estimate the parameters of confirmatory models given the exploratory estimates.
With so many possibilities ahead, there are computational problems that limit applicability at present. Most applications have many items and a few categories. There are two types of limitation: (a) the numbers of items and categories, which determine the total number of possible cells; and (b) the number of latent dimensions. The numbers of items and categories are not a problem for parameter estimation, because we do not need to go through all possible cells, only through the observed ones, i.e., the sample. In most applications, sample sizes are seldom larger than a few thousand, whereas the number of possible cells can run to millions. Even so, it took the author only one to two days to compute the Y-statistics on an IBM ThinkPad notebook computer, and a desktop computer would be even faster. So the limitation from the number of possible cells is a problem, but not a very serious one. The more serious problem is the number of latent dimensions.
Whenever multiple integrals are involved, as in marginal maximum likelihood for multi-dimensional latent variable models or item response models, estimation can be practically impossible because computation may take days, weeks or even months. Recently, Schilling and Bock (2005) pointed out that the usual fixed-point Gauss-Hermite quadrature for marginal maximum likelihood suffers not only from heavy computational demands but also from inaccuracy in evaluating the likelihood, because the likelihood is so sparse. They also showed that substantial improvements in both accuracy and speed can be obtained if adaptive quadrature points are used; the method is adaptive because the points are not fixed but depend on each distinct response pattern in the data. Looking forward, with future advances in computing and more tailor-made algorithms, the computational issues can be resolved to a large extent.
References

Bartholomew, D. J. (1980). Factor analysis for categorical data. Journal of the Royal Statistical Society, Series B, 42, 293-321.

Bartholomew, D. J. & Knott, M. (1999). Latent variable models and factor analysis (2nd ed.). London: Arnold.

Bartholomew, D. J. & Leung, S. O. (2002). A goodness-of-fit test for sparse 2^p contingency tables. British Journal of Mathematical and Statistical Psychology, 55, 1-15.

Bartholomew, D. J. & Tzamourani, P. (1999). The goodness-of-fit of latent trait models in attitude measurement. Sociological Methods and Research, 27, 525-546.

Bock, R. D. & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443-459.

Bolt, D. M. & Lall, V. F. (2003). Estimation of compensatory and non-compensatory multi-dimensional item response models using Markov chain Monte Carlo. Applied Psychological Measurement, 27(6), 395-414.

De Boeck, P. & Wilson, M. (2004). Explanatory item response models: A generalized linear and nonlinear approach. New York: Springer.

Fennema, E. & Sherman, J. A. (1976). Fennema-Sherman Mathematics Attitudes Scales: Instruments designed to measure attitudes towards the learning of mathematics by males and females. JSAS Catalog of Selected Documents in Psychology, 6(1), 3b.

Joreskog, K. G. & Moustaki, I. (2001). Factor analysis for ordinal variables: A comparison of three approaches. Multivariate Behavioral Research, 36, 347-387.

Leung, S. O. (1992). Estimation and application of latent variable models in categorical data analysis. British Journal of Mathematical and Statistical Psychology, 45, 311-328.

Leung, S. O. (2005). On full and limited information statistics for goodness-of-fit of sparse 2^p contingency tables. Paper presented at the 70th Annual Meeting of the Psychometric Society and the 14th International Meeting of the Psychometric Society (IMPS2005), Tilburg University, The Netherlands, July 5-8, 2005.

Moustaki, I. (2000). A latent variable model for ordinal variables. Applied Psychological Measurement, 24(3), 211-223.

Moustaki, I. (2005). Latent variable models for categorical responses: New developments and trends. Paper presented at the 70th Annual Meeting of the Psychometric Society and the 14th International Meeting of the Psychometric Society (IMPS2005), Tilburg University, The Netherlands, July 5-8, 2005.

Moustaki, I. & Knott, M. (2000a). Weighting for item non-response in attitude scales using latent variable models with covariates. Journal of the Royal Statistical Society, Series A, 163(3), 445-459.

Moustaki, I. & Knott, M. (2000b). Generalised latent trait models. Psychometrika, 65(3), 391-411.

Schilling, S. & Bock, R. D. (2005). High-dimensional maximum marginal likelihood item factor analysis by adaptive quadrature. Psychometrika, 70(3), 533-556.

Sobel, M. (1998). Some log-linear and log-nonlinear models for ordinal scales with midpoints, with an application to public opinion data. Sociological Methodology, 28, 263-292.

Wong, I. M. (2004). A research on mathematical attitude and related factors for junior secondary students in Macao. M.Ed. thesis, University of Macau.
Appendix: Questions used in the mathematics attitude scale
1. I will need mathematics for my future work.
2. I study mathematics because I know how useful it is.
3. Mathematics is of no relevance to my life.
4. I will need a firm mastery of mathematics for my future work.
5. It would make me happy to be recognized as an excellent student in mathematics.
6. I would be proud to be the outstanding student in mathematics.
7. I am happy to get top grades in mathematics.
8. Being first in a mathematics competition would make me pleased.
9. I am no good at mathematics.
10. I am sure I could do advanced work in mathematics.
11. I don't think I could do advanced mathematics.
12. For some reason even though I study, mathematics seems unusually hard for me.
Table 1: Parameter α's estimated from the latent variable model and factor loadings from factor analysis for the example with binary items

        Latent variable model (α's)     Factor analysis (loadings)
Item     Dim 1    Dim 2    Dim 3         Dim 1    Dim 2    Dim 3
  1       0.51    -0.14     2.10          0.16     0.05     0.79
  2       0.94     0.04     3.04          0.17     0.14     0.80
  3       0.08     0.01     1.68         -0.02     0.13     0.71
  4       0.67    -0.03     1.75          0.16     0.12     0.73
  5       2.11    -0.20     0.30          0.72     0.01     0.16
  6       3.24    -0.40     1.01          0.80     0.10     0.17
  7       2.38    -0.47     0.69          0.80     0.04     0.13
  8       1.87    -0.29     0.13          0.74     0.08     0.00
  9       0.02     2.02     0.48          0.03     0.78     0.10
 10       0.79     3.78     1.04          0.12     0.76     0.18
 11       0.26     3.72     0.99          0.09     0.79     0.14
 12      -0.16     1.72     0.22         -0.01     0.75     0.03

(Note: in the original table, the larger α's identifying each sub-scale were underlined.)
Table 2: Log-likelihood ratio statistic G2 and p-values of the Y-statistics for the examples with binary and polytomous items

                                G2              No. of parameters    Y-statistic
Example      Dimensions   Value   Difference    Number   Difference   p-value
Binary       One           1993       --          24        --         0.001
items        Two           1448      545          35        11         0.611
             Three         1249      199          47        12         1.000
Polytomous   One           3949       --          64        --         0.000
items        Two           3418      531          95        31         0.345
             Three         3146      272         127        32         1.000

(Note: a p-value of 1.000 is obtained by rounding, and does not mean a perfect fit.)
Table 3: Correlations among raw scores, factor scores from factor analysis (FA) and component scores from the latent variable model (LVM) for the examples with binary and polytomous items

Example      Sub-scale   Raw vs FA   Raw vs LVM   FA vs LVM
Binary       First         0.983       0.879        0.813
items        Second        0.985       0.952        0.908
             Third         0.995       0.963        0.964
Polytomous   First         0.987       0.939        0.901
items        Second        0.987       0.928        0.900

(Note: the third sub-scale is not included in the example with polytomous items.)
Table 4: Parameter α's estimated under the three-dimensional latent variable model for the example with polytomous items

                         Categories
Item          1       2       3       4       5
Dimension 1
  1         0.00    0.63    2.61    4.09    5.76
  2         0.00   -0.16    2.12    4.44    6.57
  3         0.00   -0.26    0.09    1.29    2.54
  4         0.00   -0.39    0.66    1.88    3.31
  5         0.00    0.65    1.15    1.30    1.49
  6         0.00    0.39    1.16    2.02    1.88
  7         0.00    0.10    0.96    1.58    1.32
  8         0.00    0.36    1.05    1.11    0.79
Dimension 2
  1         0.00    0.48    0.02    0.38    0.65
  2         0.00    0.77    0.43    1.05    0.96
  3         0.00    0.07   -0.20   -0.07    0.03
  4         0.00    0.73    0.17    0.70    0.88
  5         0.00    1.87    1.99    3.86    4.18
  6         0.00    1.05    0.90    3.96    4.82
  7         0.00    0.44    0.59    2.87    3.58
  8         0.00    0.72    0.80    2.37    3.07
Dimension 3
  1         0.00    2.37    2.32    1.63    0.14
  2         0.00    1.61    2.09    1.23   -0.95
  3         0.00    1.63    1.79    1.55    0.38
  4         0.00    1.50    1.57    1.11   -0.54
  5         0.00    2.47    3.21    3.33    1.27
  6         0.00    2.11    3.44    2.23   -0.73
  7         0.00    1.37    3.03    3.11    0.08
  8         0.00    2.60    3.88    3.31    1.00

(Note: in the original table, the α's of the middle categories in the third dimension were underlined.)
Figure 1: Plot of the first and third components (x-axis: first component; y-axis: third component). [Figure not reproduced.]
Figure 2: Plot of the second and third components (x-axis: second component; y-axis: third component). [Figure not reproduced.]

End of paper