Application of Generalizability Theory
to Concept-Map Assessment Research
Yue Yin & Richard J. Shavelson
Stanford Educational Assessment Laboratory (SEAL)
Stanford University
& CRESST
AERA 2004, San Diego CA
Overview
• Part 1: Feasibility of applying G-theory to
concept-map assessment (CMA) research
- Examining the dependability of CMA scores
- Designing a CMA for a particular application
- Narrowing down alternatives
• Part 2: Empirical study of using G-theory to
compare two CMAs:
- Construct-a-map with created linking phrases (C)
- Construct-a-map with selected linking phrases (S)
A Concept-map
Concepts/Terms
Linking lines
Linking Phrases
Proposition
Variations in CMA
Components
Variation Examples
Task
-Topic only
-Topic and concepts (C)
-Topic, concepts and linking phrases (S)
-Topic, incomplete concepts or incomplete
linking phrases (fill-in-the-nodes or fill-in-the-lines)
Response
-Computer
-Paper-pencil
Scoring System
-Link score
-Concept score
-Proposition score
-Structure score
Part 1
Feasibility of Applying
G Theory to CMA Research
Viewing CMA with G theory
• Basic idea
A particular type of score, given by a particular rater,
based on a particular type of concept map, on a
particular occasion, … is a sample from a multifaceted
universe.
• Object of measurement
People—the variation in students’ knowledge structure
• Facets
Task (concept & proposition), response format, scoring
system, rater, occasion, …
G theory vs. CTT
Similarity
G theory                    CTT
• Concept-term sampling     • Equivalence of alternate forms
• Proposition sampling      • Internal consistency
• Rater sampling            • Inter-rater reliability
• Occasion sampling         • Stability over time
G Theory’s Advantage
• Conceptually integrates and simultaneously evaluates all the
technical properties above
• Estimates not only the effects of individual facets, but also
their interaction effects
• Permits us to optimize an assessment’s technical quality
Examining Technical Properties
& Designing Assessments
• Examining dependability (G study)
How well can a measure of students’ declarative
knowledge structure be generalized across concept map
tasks? scoring systems? occasions? raters?
propositions? different concept samples?
• Designing an assessment (D study)
How many concept map tasks, scoring systems,
occasions, raters, propositions, and/or different concept
samples will be needed to obtain a reliable measurement
of students’ declarative knowledge structure?
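As a sketch of how the D-study question is answered, consider a reduced design in which persons are crossed with only two random facets, occasions (o) and propositions (i). The other facets listed above (tasks, scoring systems, raters, concept samples) are left out purely to keep the illustration short. Under that assumption, standard G theory gives the relative error variance and the relative G coefficient below, where n'_o and n'_i are the numbers of occasions and propositions chosen for the D study:

```latex
\sigma^2_{\delta}
  = \frac{\sigma^2_{po}}{n'_o}
  + \frac{\sigma^2_{pi}}{n'_i}
  + \frac{\sigma^2_{poi,e}}{n'_o \, n'_i},
\qquad
E\rho^2 = \frac{\sigma^2_{p}}{\sigma^2_{p} + \sigma^2_{\delta}}
```

Increasing n'_o or n'_i shrinks the error terms that involve that facet, which is why the number of conditions sampled can be traded off against the desired reliability.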
Narrowing Down Alternatives
• Task
- Which task type is more reliable over raters,
occasions, propositions, concept samples?
- Accordingly, the more reliable task type needs fewer raters, occasions,
propositions, and concept samples.
• Scoring system
- Which scoring system is more reliable over raters,
occasions, propositions, concept samples?
- Accordingly, the more reliable scoring system needs fewer raters,
occasions, propositions, and concept samples.
Part 2
Empirical Study of Using
G-theory to Compare
Two CMAs
Two Frequently Used CMAs
• Construct-a-map with created linking
phrases (C): provides a cognitively valid
measure of knowledge structure (e.g., Ruiz-Primo
et al., 2001; Yin et al., 2004)
• Construct-a-map with selected linking
phrases (S): provides an efficient way to
measure knowledge structure (e.g., Klein et al.,
2001)
Method
• Concept-map task
- 9 Concepts (for C & S)
water, volume, cubic centimeter,
wood, density, mass, buoyancy,
gram, and matter
- 6 Linking phrases (for S only)
is a measure of…
has a property of…
depends on…
is a form of…
is mass divided by…
divided by volume equals…
• Participants
- 92 eighth-graders
- 46 girls
- previously studied a
related unit
- no related instruction
between two occasions
• Procedures
C → S (n = 22)
S → C (n = 23)
C → C (n = 26)
S → S (n = 21)
Criterion Map
[Figure: criterion concept map connecting Water, Wood, Mass, Matter, Density, Gram, CC, Buoyancy, and Volume with the linking phrases "has", "has a property of", "is a form of", "depends on", "is a unit of", "is mass divided by", and "divided by volume equals"]
Mandatory Propositions
Source of Variation
CS & SC                     CC & SS
• Person (P)                • Person (P)
• Proposition/Item (I)      • Proposition/Item (I)
• Format (F)                • Occasion (O)
• P x F                     • P x O
• P x I                     • P x I
• F x I                     • O x I
• P x F x I, e              • P x O x I, e
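The seven sources listed for the CC & SS groups correspond to a fully crossed person x occasion x item design. The following is a minimal sketch of how such variance components could be estimated from expected mean squares, assuming random facets, one score per cell, and no missing data. The function name `g_study_p_o_i` and the array layout are illustrative assumptions, not the authors' procedure or software. The same arithmetic applies to the CS & SC groups with format in place of occasion.

```python
import numpy as np


def g_study_p_o_i(scores):
    """Estimate variance components for a crossed p x o x i random design.

    scores: array of shape (n_p, n_o, n_i), one score per cell
            (hypothetical layout: persons x occasions x items).
    Returns a dict keyed by source of variation.
    """
    x = np.asarray(scores, dtype=float)
    n_p, n_o, n_i = x.shape
    grand = x.mean()

    # Marginal means for main effects and two-way combinations.
    m_p = x.mean(axis=(1, 2))
    m_o = x.mean(axis=(0, 2))
    m_i = x.mean(axis=(0, 1))
    m_po = x.mean(axis=2)   # shape (n_p, n_o)
    m_pi = x.mean(axis=1)   # shape (n_p, n_i)
    m_oi = x.mean(axis=0)   # shape (n_o, n_i)

    # Sums of squares for each source.
    ss_p = n_o * n_i * np.sum((m_p - grand) ** 2)
    ss_o = n_p * n_i * np.sum((m_o - grand) ** 2)
    ss_i = n_p * n_o * np.sum((m_i - grand) ** 2)
    ss_po = n_i * np.sum((m_po - m_p[:, None] - m_o[None, :] + grand) ** 2)
    ss_pi = n_o * np.sum((m_pi - m_p[:, None] - m_i[None, :] + grand) ** 2)
    ss_oi = n_p * np.sum((m_oi - m_o[:, None] - m_i[None, :] + grand) ** 2)
    ss_total = np.sum((x - grand) ** 2)
    ss_poi = ss_total - ss_p - ss_o - ss_i - ss_po - ss_pi - ss_oi

    # Mean squares (sum of squares divided by degrees of freedom).
    ms_p = ss_p / (n_p - 1)
    ms_o = ss_o / (n_o - 1)
    ms_i = ss_i / (n_i - 1)
    ms_po = ss_po / ((n_p - 1) * (n_o - 1))
    ms_pi = ss_pi / ((n_p - 1) * (n_i - 1))
    ms_oi = ss_oi / ((n_o - 1) * (n_i - 1))
    ms_poi = ss_poi / ((n_p - 1) * (n_o - 1) * (n_i - 1))

    # Solve the expected-mean-square equations for each component.
    raw = {
        "p": (ms_p - ms_po - ms_pi + ms_poi) / (n_o * n_i),
        "o": (ms_o - ms_po - ms_oi + ms_poi) / (n_p * n_i),
        "i": (ms_i - ms_pi - ms_oi + ms_poi) / (n_p * n_o),
        "po": (ms_po - ms_poi) / n_i,
        "pi": (ms_pi - ms_poi) / n_o,
        "oi": (ms_oi - ms_poi) / n_p,
        "poi_e": ms_poi,
    }
    # Negative estimates are set to zero here, a common convention
    # (an assumption of this sketch, not stated in the slides).
    return {source: max(value, 0.0) for source, value in raw.items()}
```

Each of the three dimensions needs at least two levels, otherwise a degrees-of-freedom term is zero and the corresponding mean square is undefined.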
Variance Component Estimate
G Study in SC & CS
[Chart: percent of total variability (0% to 70%) by source (P, F, I, PF, PI, FI, PFI,e), with separate series for CS and SC]
G Study in CC & SS
[Chart: percent of total variability (0% to 70%) by source (P, O, I, PO, PI, OI, POI,e), with separate series for CC and SS]
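The "percent of total variability" scale used in the G-study charts is each estimated variance component divided by the sum of all components. A minimal sketch follows. The component values are invented placeholders for illustration only and are not the study's estimates, and `percent_of_total` is a hypothetical helper name.

```python
def percent_of_total(components):
    """Express each variance component as a percentage of their sum."""
    total = sum(components.values())
    if total <= 0:
        raise ValueError("total variance must be positive")
    return {source: 100.0 * value / total
            for source, value in components.items()}


# Placeholder values, NOT results from the study.
hypothetical_components = {
    "P": 0.05, "O": 0.00, "I": 0.02,
    "PO": 0.01, "PI": 0.03, "OI": 0.00, "POI,e": 0.09,
}

percentages = percent_of_total(hypothetical_components)
for source, pct in percentages.items():
    print(f"{source:6s} {pct:5.1f}%")
```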
D Study for C CMA
[Chart: relative G coefficient (0 to 1) against item/proposition number (0 to 32), with three curves labeled 1, 2, and 3]
D Study for S CMA
[Chart: relative G coefficient (0 to 1) against item/proposition number (0 to 32), with three curves labeled 1, 2, and 3]
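Curves like those in the two D-study charts can be generated from G-study variance components. The sketch below assumes a crossed person x occasion x item design with random facets, and it reads the curve labels 1, 2, and 3 as numbers of occasions, which is an interpretation rather than something the slides state. The variance components are invented placeholders, the 0.80 target is a conventional threshold chosen for illustration, and the function names are hypothetical.

```python
def relative_g(vc, n_o, n_i):
    """Relative G coefficient for n_o occasions and n_i items.

    Relative error variance:
        po / n_o + pi / n_i + poi_e / (n_o * n_i)
    Coefficient: p / (p + relative error variance).
    """
    rel_error = (vc["po"] / n_o
                 + vc["pi"] / n_i
                 + vc["poi_e"] / (n_o * n_i))
    return vc["p"] / (vc["p"] + rel_error)


def min_items(vc, n_o, target=0.80, max_items=32):
    """Smallest number of items reaching the target, or None if none does."""
    for n_i in range(1, max_items + 1):
        if relative_g(vc, n_o, n_i) >= target:
            return n_i
    return None


# Placeholder variance components, NOT estimates from the study.
hypothetical_vc = {"p": 0.04, "po": 0.01, "pi": 0.03, "poi_e": 0.10}

# One curve per number of occasions, sampled at the charts' x-axis ticks.
for n_o in (1, 2, 3):
    curve = [round(relative_g(hypothetical_vc, n_o, n_i), 2)
             for n_i in range(4, 33, 4)]
    print(f"occasions={n_o}: {curve}")
```

Running the same calculation on the components from each mapping format would show how many occasions and propositions each format needs to reach a given coefficient, which is the comparison the D study makes between C and S.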
Conclusions
• G study pinpoints multiple sources of measurement
error, thereby giving insight into how to improve the
reliability and applicability of CMA via a D study
• C and S mapping tasks are not equivalent in their
technical properties
• Fewer occasions and propositions are needed in S
than C to get a reliable evaluation of students’
declarative knowledge structure
Thank You for Your Interest!
To get the complete paper, please either
contact Yue Yin at yyin@stanford.edu
or download the file directly at
http://www.stanford.edu/dept/SUSE/SEAL/