Lecture: Wang Zhanli

A TIME TO LEARN AND SHARE
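Testing Measurement Invariance of the English Paper of the CIFA International Freight Forwarder Examination
Testing cross-gender and cross-region construct validity in the CIFA
English test for certification of freight forwarders
Wang Zhanli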
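Measurement Invariance (MI)
Measurement invariance is a measurement term first introduced by Drasgow, who
borrowed a similar concept from item response theory (IRT). It means that,
under different conditions of observing and studying phenomena, measurement
operations yield measures of the same attribute. Depending on what is tested,
measurement invariance comprises four levels, from lowest to highest:

Configural invariance
Weak invariance
Strong invariance
Strict invariance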
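Levels of invariance
Configural invariance, also called structural invariance, means that the basic structural relations between latent and observed variables are the same across groups: each latent variable is measured by the same observed variables, but the corresponding parameters need not be equal.
Weak invariance, also called factor loading invariance, means that the factor loadings are equal across groups. Each observed variable then has the same unit in every group: a one-unit change in the latent variable produces the same amount of change in the observed variable in each group.
Strong invariance, also called intercept invariance, means that the intercepts of the observed variables, when predicted from the latent variables, are equal across groups. The measurement then has an equivalent reference point in every group, so cross-group differences in the observed variables fully reflect cross-group differences in the latent variables being measured. In other words, comparing means across groups is meaningful.
Strict invariance, also called error invariance, means that the measurement error of each observed variable has the same variance across groups. At this level, cross-group tests of homogeneity of variance are meaningful. Statistically, the four levels are hierarchically nested: testing a higher level is meaningful only after the level below it has been established. The steps of invariance testing therefore follow this order.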
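The nesting of the four levels can be written as successive equality constraints on a multi-group factor model. The following is a sketch in conventional MGCFA notation. The symbols (intercepts $\tau$, loadings $\Lambda$, error covariance $\Theta$, group index $g$) are assumed here for illustration and do not appear on the original slides.

```latex
% Measurement model for group g (notation assumed, not from the slides)
x^{(g)} = \tau^{(g)} + \Lambda^{(g)} \xi^{(g)} + \delta^{(g)},
\qquad \operatorname{Cov}\bigl(\delta^{(g)}\bigr) = \Theta^{(g)}

% Each level keeps the constraints of the level above it in this list
\begin{aligned}
\text{Configural:} &\quad \text{same pattern of fixed and free loadings in every } \Lambda^{(g)} \\
\text{Weak:}       &\quad \Lambda^{(1)} = \Lambda^{(2)} = \dots = \Lambda^{(G)} \\
\text{Strong:}     &\quad \text{weak, plus } \tau^{(1)} = \tau^{(2)} = \dots = \tau^{(G)} \\
\text{Strict:}     &\quad \text{strong, plus } \Theta^{(1)} = \Theta^{(2)} = \dots = \Theta^{(G)}
\end{aligned}
```

Read this way, weak invariance corresponds to what the English-language literature calls metric invariance, and strong invariance to scalar invariance.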
Construct
Literature review
We often have several groups in our analyses: different
cultures, regions or countries.
In order to compare relationships between constructs or
means across groups, we need a certain level of invariance of
the constructs across those groups.
The meaning of invariance is “whether or not, under different
conditions of observing and studying phenomena,
measurement operations yield measures of the same attribute”
(Horn and McArdle 1992, 117).
Techniques to test invariance


Various techniques have been developed to
test measurement invariance (De
Beuckelaer, 2005).
Multiple-group confirmatory factor analysis
(MGCFA: Jöreskog 1971) is among the most
powerful.
Configural Invariance (1)



The lowest level of invariance is ‘configural’
invariance.
Configural invariance requires that the items in the
measuring instrument exhibit the same
configuration of loadings in each of the different
countries.
That is, the confirmatory factor analysis
confirms that the same items measure each
construct in all countries (or groups) in the
study.
Configural Invariance (2)
Configural invariance is supported if
(a) a single model specifying which items measure each construct fits the data well,
(b) all item loadings are substantial and significant,
(c) there are no large modification indices, and
(d) the correlations between the factors are less than one.
The last requirement guarantees discriminant validity between the factors
(Steenkamp and Baumgartner 1998).
Measurement invariance (1)



Configural invariance does not ensure that the
people in different nations understand the items in
the same way.
The factor loadings may still be different across
countries.
The test of the next higher level of invariance,
‘measurement’ or ‘metric’ invariance, requires that
the factor loadings between items and constructs
are invariant across nations.
Measurement invariance (2)


It is tested by constraining the factor loading
of each item on its corresponding construct
to be the same across groups.
Measurement invariance is supported if the
model cannot be significantly improved by
releasing some of the constraints.
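One common way to decide whether releasing the constraints "significantly improves" the model is a chi-square difference test between the constrained model and the freer model it is nested in. The sketch below assumes maximum-likelihood chi-square statistics, for which the plain difference is itself chi-square distributed. The function names and the fit statistics in the usage lines are hypothetical and are not results from this study.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with integer df.

    Uses the closed-form finite series for integer degrees of freedom,
    so only the standard library is needed.
    """
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term = 1.0
        total = 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(chi) * sum_r chi^(2r-1) / (1*3*...*(2r-1))
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2))
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term = chi
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + 2 * density * total


def chi_square_difference(chisq_constrained, df_constrained,
                          chisq_free, df_free, alpha=0.05):
    """Compare a constrained model with the freer model it is nested in.

    Returns (delta_chisq, delta_df, p_value, invariance_supported).
    Invariance is treated as supported when freeing the constraints
    does not improve fit significantly (p >= alpha).
    """
    delta_df = df_constrained - df_free
    if delta_df <= 0:
        raise ValueError("the constrained model must have more degrees of freedom")
    delta_chisq = max(chisq_constrained - chisq_free, 0.0)
    p_value = chi2_sf(delta_chisq, delta_df)
    return delta_chisq, delta_df, p_value, p_value >= alpha


# Hypothetical fit statistics, for illustration only
delta, ddf, p, supported = chi_square_difference(112.4, 58, 105.1, 52)
print(f"delta chi2 = {delta:.1f}, delta df = {ddf}, p = {p:.3f}, supported = {supported}")
```

With robust (scaled) chi-square statistics the plain difference is not appropriate and a scaled difference test would be needed instead.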
Partial measurement invariance (1)



However, for cross-cultural comparison to be
allowed, it is not necessary that all factor loadings
are equal.
Several scholars have suggested that it is enough
to have at least two equal factor loadings per construct
across countries to allow comparison of effects.
They termed it partial measurement (metric)
invariance (Byrne, Shavelson, and Muthén 1989;
Steenkamp and Baumgartner 1998).
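The "two equal loadings per construct" rule is easy to check once each item has been flagged as invariant or not. The following is a minimal sketch. The construct names, item names, and flags are hypothetical, and how each flag is obtained (for example, by a constraint test per item) is outside the sketch.

```python
def partial_metric_invariance(invariant_loadings, minimum=2):
    """Check the partial metric invariance rule for each construct.

    invariant_loadings maps construct -> {item: True if the item's loading
    was found equal across groups}. A construct passes when at least
    `minimum` of its items have invariant loadings.
    """
    return {
        construct: sum(flags.values()) >= minimum
        for construct, flags in invariant_loadings.items()
    }


# Hypothetical flags, for illustration only
flags = {
    "reading": {"item1": True, "item2": True, "item3": False},
    "vocabulary": {"item4": True, "item5": False, "item6": False},
}
result = partial_metric_invariance(flags)
print(result)
```

Here "reading" would pass the rule and "vocabulary" would not, because only one of its loadings is invariant.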
Scalar invariance (1)




A third level of invariance is necessary to allow
mean comparison of the underlying constructs
across countries.
This is often a central goal of cross-national
research.
Such comparisons are meaningful only if ‘scalar’
invariance of the items is ensured.
Scalar invariance guarantees that cross-country
differences in the means of the observed items are
a result of differences in the means of their
corresponding constructs.
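The reason equal intercepts are needed can be seen from the expected value of an item. The following sketch assumes a single-factor item with zero-mean error. The notation (intercept $\tau_i$, loading $\lambda_i$, latent mean $\kappa$, group index $g$) is introduced here for illustration.

```latex
% Expected value of item i in group g (notation assumed)
E\bigl(x_i^{(g)}\bigr) = \tau_i^{(g)} + \lambda_i^{(g)} \kappa^{(g)}

% With equal loadings (\lambda_i^{(1)} = \lambda_i^{(2)} = \lambda_i)
% and equal intercepts (\tau_i^{(1)} = \tau_i^{(2)} = \tau_i):
E\bigl(x_i^{(1)}\bigr) - E\bigl(x_i^{(2)}\bigr)
  = \lambda_i \bigl(\kappa^{(1)} - \kappa^{(2)}\bigr)
```

If the intercepts differed, a term $\tau_i^{(1)} - \tau_i^{(2)}$ would remain on the right-hand side, and the observed mean difference would no longer reflect the latent mean difference alone.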
Scalar invariance (2)


To assess scalar invariance, one constrains
the intercepts of the underlying items to be
equal across countries.
It is supported if the model fit to the data is
good and if it cannot be improved by
releasing some of the equality constraints.
Invariance - summary



Meaningful comparison of construct means across
countries requires three levels of invariance:
configural, metric, and scalar.
Meaningful comparison of relationships between
constructs requires two levels of invariance,
configural and metric.
Only if all these types of invariance are supported
can we confidently carry out comparisons.
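The summary amounts to a small decision rule, which can be sketched as follows. The function name and the string labels are hypothetical. The rule assumes full invariance at each level and enforces the hierarchy, so a level counts only if every lower level also holds.

```python
LEVELS = ("configural", "metric", "scalar")


def permitted_comparisons(supported):
    """Return the cross-group comparisons justified by the supported levels.

    `supported` is a collection of level names. A level only counts if
    all lower levels are also supported.
    """
    highest = -1
    for index, level in enumerate(LEVELS):
        if level in supported:
            highest = index
        else:
            break
    comparisons = []
    if highest >= 1:  # configural + metric
        comparisons.append("relationships between constructs")
    if highest >= 2:  # configural + metric + scalar
        comparisons.append("construct means")
    return comparisons


print(permitted_comparisons({"configural", "metric"}))
print(permitted_comparisons({"configural", "metric", "scalar"}))
```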
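Introduction to the CIFA Examination
The CIFA international freight forwarder examination is a professional
certification examination commissioned by the former Ministry of Foreign Trade
and Economic Cooperation (now the Ministry of Commerce) and organized and
administered by the China International Freight Forwarders Association (CIFA).
Since its launch in 2002, nearly 160,000 people have taken the examination, and
nearly 60,000 of them have obtained the certificate (China International Freight
Forwarders Association, 2011). Test centers are located in provinces and cities
throughout the country, and the examination is highly regarded and widely
recognized in the industry. Participating colleges often compare their results
with one another, and examination results have a strong washback effect on
English teaching at those institutions. Because the examination is
authoritative, large in scale, and high-stakes, it must be scientific and
rigorous. In particular, it must be fair and impartial to candidates in
different groups (gender, region, and so on), with good cross-group measurement
invariance and cross-group validity. Only then are score interpretation and
between-group comparisons meaningful.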
AMOS


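Structural equation modeling (SEM) covers a range of
statistical techniques, such as path analysis,
confirmatory factor analysis, causal models with latent
variables, and even analysis of variance and multiple
linear regression.
AMOS is a software package for structural equation modeling.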
Amos is short for Analysis of Moment
Structures. It implements the general approach
to data analysis known as structural equation
modeling (SEM), also known as analysis of
covariance structures, or causal modeling. This
approach includes, as special cases, many well-known
conventional techniques, including the
general linear model and common factor
analysis.

The value 0.49 is the correlation between Education and Income. The
values 0.72 and 0.11 are standardized regression weights. The value
0.60 is the squared multiple correlation of SAT with Education and
Income.
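The three kinds of values are linked: with two predictors, the squared multiple correlation follows from the standardized weights and the correlation between the predictors. The sketch below applies that standard identity to the rounded values quoted above. The function name is hypothetical.

```python
def r_squared_two_predictors(beta1: float, beta2: float, r12: float) -> float:
    """Squared multiple correlation for a regression on two predictors.

    beta1, beta2 are standardized regression weights and r12 is the
    correlation between the two predictors:
        R^2 = beta1^2 + beta2^2 + 2 * beta1 * beta2 * r12
    """
    return beta1 ** 2 + beta2 ** 2 + 2 * beta1 * beta2 * r12


# Rounded values quoted for Education (0.72), Income (0.11) and their correlation (0.49)
r2 = r_squared_two_predictors(0.72, 0.11, 0.49)
print(round(r2, 3))  # 0.608
```

The result is close to the reported 0.60. The small gap is plausibly due to the inputs being rounded to two decimals.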
Model comparison
Analysis steps
Collecting and treating data
Theoretical model (setting up the model and fitting it to the various subpopulations of the test takers)
Nested model testing
Model assessment
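Implications
Cross-gender invariance is ideal: strong invariance holds.
Cross-region invariance is fairly good: weak invariance holds.
The causes of the DIF items remain to be investigated.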
Caution
Recent studies suggest that when full or partial
measurement invariance is not guaranteed, it may still be
the case that constructs are equivalent.
Saris and Gallhofer (2007, chapter 16) indicate that the test of
measurement invariance is too strict and may fail although
cognitive equivalence still holds.
Thank you very much for your attention!
I would appreciate your comments and advice.