CHAPTER 14 ITEM ANALYSIS

UNIT IV
ITEM ANALYSIS IN TEST DEVELOPMENT
CHAP 14: ITEM ANALYSIS
CHAP 15: INTRODUCTION TO ITEM RESPONSE THEORY
CHAP 16: DETECTING ITEM BIAS
CHAPTER 14 ITEM ANALYSIS
*The goal of test construction is to create a test with minimum length and good reliability and validity.
*Item Analysis is the computation and examination of any statistical property of an item response distribution.
*Item Analysis is a process that we go through when constructing a new test or subtest with good reliability and validity from a pool of items.
*Categories of Item Parameters
Item parameters (indices) fall into 3 categories:
1. Indices that describe the distribution of responses to a single item (e.g., mean and variance of item responses).
2. Indices that describe the degree of relationship between the response to the item and some criterion of interest.
Ex.
The relationship between the questions (items) and the criterion of interest (e.g., depression) in Factor Analysis.
3. Indices that are a function of both item variance (or mean) and the relationship to a criterion of interest.
Ex. First, find the variance/mean of your items; then calculate the relationship between the items’ variance and the criterion of interest (e.g., depression) for two groups.
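As a rough illustration of the second category, the relationship between one item and a criterion can be sketched as a Pearson correlation between scored item responses and criterion scores. The data and the helper name `pearson_r` are hypothetical and only serve the example; the slides do not prescribe a particular statistic.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data for 6 examinees: 1 = endorsed the item, 0 = did not
item_responses = [1, 0, 1, 1, 0, 0]
# Hypothetical criterion scores (e.g., a depression total score)
criterion = [20, 10, 18, 22, 12, 8]

# Item-criterion correlation for this single item
item_criterion_r = pearson_r(item_responses, criterion)
print(round(item_criterion_r, 3))
```

A value near 1 would suggest that examinees who endorse the item also tend to score high on the criterion.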
Item Difficulty “P”
P = f/N, i.e., the number of examinees who answered the item correctly / the total number of examinees
(See your midterm item analysis and Chap 5.)
The higher the P value, the easier the item.
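A minimal sketch of the P = f/N computation, assuming dichotomously scored items (1 = correct, 0 = incorrect). The score matrix is hypothetical, and the item variance line uses the standard p(1 - p) result for dichotomous items.

```python
def item_difficulty(responses):
    """P = f / N: proportion of examinees who answered the item correctly."""
    return sum(responses) / len(responses)

# Hypothetical scored responses: rows = examinees, columns = items
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]

n_items = len(scores[0])
# One P value per item (column)
p_values = [item_difficulty([row[i] for row in scores]) for i in range(n_items)]
# Variance of a dichotomous item is p * (1 - p)
variances = [p * (1 - p) for p in p_values]

print(p_values)
print(variances)
```

Here the first item is answered correctly by everyone (the easiest item), while the third is the hardest.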
*Steps in Item Analysis
In a typical item analysis the test developer will take 7 steps (they are similar to the process of test construction in Chapter 4).
FYI: PROCESS OF TEST CONSTRUCTION (CHAP 4)
1- Identifying the purposes of test score use
2- Identifying behaviors to represent the construct
3- Preparing test specifications, e.g., Bloom’s Taxonomy
4- Item construction
5- Item review
6- Preliminary item tryouts
7- Field test
8- Statistical analysis
9- Reliability and validity
10- Guidelines
*7 Steps in Item Analysis
1. Describe which properties of the test scores are of greatest importance.
Ex. When I select questions for your midterm/final exam, I look for questions similar to those on the qualifying/comprehensive exam or the EPPP.
2. Identify the item parameters (e.g., mean, variance) most relevant to these properties.
3. Administer the items to a sample of examinees representative of those for whom the test is intended.
Ex. An IQ test for children or a depression test for adults.

4. Estimate for each item the parameters identified in step 2 (e.g., variance).
5. Establish a plan for item selection.
Ex. Use the item difficulties (P) from the item analysis to select the items.

6. Select the final subset of items, or use the data (items in your item analysis) for test revision.
Ex. Take out all questions with very high or very low item difficulties.
7. Conduct a cross-validation (validity) study.
Ex. Use SPSS to compare the results of 2 tests or 2 classes (e.g., this year’s class and last year’s class), e.g., with Confirmatory Factor Analysis.
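The selection rule in steps 5 and 6 can be sketched as a simple filter on item difficulty. The cutoffs 0.2 and 0.8 and the item names are assumptions made for illustration; the slides only say "very high or very low" and do not give numeric thresholds.

```python
def select_items(p_values, low=0.2, high=0.8):
    """Keep items whose difficulty P lies within [low, high].

    The default cutoffs are hypothetical, not values from the course.
    """
    return [name for name, p in p_values.items() if low <= p <= high]

# Hypothetical item difficulties from an item analysis
p_values = {"item1": 0.95, "item2": 0.55, "item3": 0.10, "item4": 0.70}

kept = select_items(p_values)
dropped = [name for name in p_values if name not in kept]
print("keep:", kept)
print("drop:", dropped)
```

In practice the cutoffs would follow from the selection plan established in step 5.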

UNIT V
TEST SCORING AND INTERPRETATION
CHAP 17: CORRECTING FOR GUESSING AND OTHER SCORING METHODS
CHAP 18: SETTING STANDARDS
CHAP 19: NORMS AND STANDARD SCORES
CHAP 20: EQUATING SCORES FROM DIFFERENT TESTS
CHAPTER 19
NORMS AND STANDARD SCORES
*Alfred Binet (1910): Ratio IQ = ratio of MA/CA (mental age / chronological age)
*Lewis Terman: Ratio IQ = MA/CA × 100; he standardized it.
*Deviation IQ: uses norms to estimate the IQ.
We use norms when we want to compare an examinee’s raw score on a test to the distribution of scores (scaled or standard scores) for a sample from a well-defined population.
Ex.
When we want to estimate the IQ of a 20-year-old person, we compare his/her raw score on a subtest of an IQ test with the scores of people of his/her age, which is his/her norm group (standard scores). This technique tells us where this person stands among people of his/her age.
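The two IQ definitions can be contrasted in a short sketch. The ratio IQ follows the MA/CA × 100 formula above. The deviation IQ is written here as a z score against the age-group norm, rescaled to a mean of 100 and a standard deviation of 15. That scale is an assumption for illustration, as are the function names and the numbers.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ = MA / CA x 100."""
    return mental_age / chronological_age * 100

def deviation_iq(raw_score, norm_mean, norm_sd, iq_mean=100, iq_sd=15):
    """Locate a raw score in its age-group norm, then rescale.

    iq_mean and iq_sd are assumed scale values, not taken from the slides.
    """
    z = (raw_score - norm_mean) / norm_sd  # position within the norm group
    return iq_mean + iq_sd * z

# Hypothetical child: mental age 10, chronological age 8
print(ratio_iq(10, 8))           # 125.0
# Hypothetical adult: raw score 60; age-group norm has mean 50, SD 10
print(deviation_iq(60, 50, 10))  # 115.0
```

The deviation IQ depends only on where the person stands relative to people of the same age, which is the point of using norms.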
*9 BASIC STEPS IN CONDUCTING A NORMING STUDY (P.432)
1. Identify the population of interest.
Ex. Students, employees of a company, inmates, patients, etc.
2. Identify the most critical statistics that will be computed for the sample data.
Ex. Standard deviation (σ), variance (σ²), mean (M), sum of squares (SS), p
3. Decide on the tolerable amount of sampling error, that is, the discrepancy between the sample statistic (M) and the population parameter (µ): sampling error = M − µ.
The Central Limit Theorem describes 3 characteristics of the distribution of sample means:
1. Central tendency (the mean of the sample means equals µ)
2. The shape of the distribution (approximately normal)
3. Variability, or the standard error of the mean (σm)
4. Devise a procedure for drawing a sample from the population of interest.
There are 4 types of probability sampling:
I Simple Random Sampling
Give everyone in the population an equal chance of being selected. Ex. Draw names from a hat.
II Systematic Sampling (k = N/n)
Select every kth name on the list. Ex. CAU population N = 1500 and sample size n = 150: N/n = 1500/150 = 10, so select every 10th student.
III Stratified Sampling
“Strata” means different layers. We use stratified sampling when we want to compare 2 different groups (e.g., male and female CAU doctoral students). First we randomly select males, then we randomly select females.
IV Cluster Sampling
We use cluster sampling when the population consists of units rather than individuals, such as classes. Ex. Miami Dade school district: to conduct research with Miami Dade 2nd graders (1000 2nd-grade classes), we randomly select about 10 of these 1000 classes to be in our sample, and then conduct the research.
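The four probability sampling procedures can be sketched with Python's standard `random` module. All names, group sizes, and the 600/900 split between strata are hypothetical; only N = 1500, n = 150, and the 10 out of 1000 classes come from the examples above.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population list, N = 1500
population = [f"student_{i}" for i in range(1, 1501)]
n = 150

# I. Simple random sampling: every member has an equal chance
simple = rng.sample(population, n)

# II. Systematic sampling: every k-th name after a random start, k = N / n
k = len(population) // n
start = rng.randrange(k)
systematic = population[start::k]

# III. Stratified sampling: sample randomly within each stratum
# (the split into strata below is made up for illustration)
strata = {"male": population[:600], "female": population[600:]}
stratified = {name: rng.sample(members, 75) for name, members in strata.items()}

# IV. Cluster sampling: randomly pick whole classes, keep every pupil in them
classes = {
    f"class_{c}": [f"class_{c}_pupil_{p}" for p in range(20)]
    for c in range(1000)
}
chosen_classes = rng.sample(sorted(classes), 10)
cluster = [pupil for c in chosen_classes for pupil in classes[c]]

print(len(simple), len(systematic), len(cluster))
```

Note that cluster sampling selects units (classes) at random, so the number of individuals in the sample depends on the sizes of the chosen units.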
5. Estimate the minimum sample size (n) required to hold the sampling error within the specified limits.
There are different statistical procedures for estimating n; n should be ≥ 30.
1. n = (σ/d)², where d = effect size, d = (M − µ)/σ
2. n = (σ/σm)²
σm = σ/√n: standard error of the mean for a population. Ex. z score
Sm = S/√n: estimated standard error of the mean for a sample. Ex. t-distribution
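As a worked illustration of the second formula, n = (σ/σm)² follows from solving σm = σ/√n for n. The sketch below rounds up to a whole examinee; the function name and the values σ = 15 and a target standard error of 1.5 are assumptions chosen for the example.

```python
from math import ceil

def required_n(sigma, target_se):
    """Smallest n with sigma / sqrt(n) <= target_se, i.e. n = (sigma / target_se)^2."""
    return ceil((sigma / target_se) ** 2)

# Hypothetical values: population SD of 15, tolerable standard error of 1.5
n_needed = required_n(15, 1.5)
print(n_needed)  # 100

# A looser tolerance needs fewer examinees
print(required_n(15, 2))  # 57
```

Halving the tolerable standard error quadruples the required sample size, because n grows with the square of σ/σm.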
THE EFFECT SIZE
Ex. Two independent-samples t-test
6.
Draw the Sample and collect the Data
7. Compute the Values of the Group
Statistics of interest and their standard
error. Sm=S/√n or σm = σ/√n
Calculate the standard error of
measurement, which is the difference
between M and µ. Also known as
sampling error.
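Step 7 can be sketched with the standard `statistics` module: compute M and S from the sample, then the estimated standard error Sm = S/√n. The sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of 9 test scores
sample = [98, 102, 110, 95, 105, 100, 90, 108, 92]

n = len(sample)
m = mean(sample)   # sample mean M
s = stdev(sample)  # sample standard deviation S (n - 1 in the denominator)
s_m = s / sqrt(n)  # estimated standard error of the mean, Sm = S / sqrt(n)

print(m, round(s, 2), round(s_m, 2))
```

`stdev` is the sample estimate; `statistics.pstdev` would be the counterpart when σ is computed for a whole population.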
8. Identify the types of normative scores that will be needed, and prepare the normative score conversion table (see next 2 slides).
9. Prepare written documentation of the normative scores.
Types of Normative Scores
Raw Score: score on a subtest or a test.
Scaled Score: normative score for a specific age.
NORMATIVE SCORES (Wechsler)
*Usefulness of Scaled Scores
Scaled scores are useful for two purposes:
1. Scaled scores relate the examinee’s performance to the percentile rank scores of the norm group and their grade level.
2. In evaluation and research, the mean scaled score is a better estimate of average group performance than the mean raw score.
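To make the link to percentile ranks concrete, the sketch below computes the percentile rank of one score within a norm group. It uses one common definition (percent of scores below, plus half of the ties); the slides do not say which definition is intended, and the norm-group scores are hypothetical.

```python
def percentile_rank(score, norm_scores):
    """Percent of the norm group below the score, counting ties as half."""
    below = sum(1 for s in norm_scores if s < score)
    ties = sum(1 for s in norm_scores if s == score)
    return 100 * (below + 0.5 * ties) / len(norm_scores)

# Hypothetical scaled scores for a norm group of 10 examinees
norm_group = [4, 6, 7, 8, 9, 10, 10, 11, 12, 13]

print(percentile_rank(10, norm_group))  # 60.0
```

A score of 10 here falls above five norm-group members and ties with two, which places it at the 60th percentile under this definition.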

NORMATIVE SCORES
Multiply by 5 to convert to percentile. This means neither the USA nor Iran uses a normal distribution in its grading system: the USA’s is negatively skewed and Iran’s is positively skewed.
*Echternacht (1971): 3-Step Process for Grade and Age Equivalent Scores
1. First, convert the raw scores to scaled scores.
2. Second, calculate the median scaled score for each grade level, and plot the medians on a bivariate scatter plot.
3. Third, connect the points and draw a smooth curve.
It is similar to Deviation IQ, i.e., the child’s performance is compared with that of others at a particular age or grade level.
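The three steps can be approximated in code, with one stated simplification: instead of a hand-drawn smooth curve, the sketch connects the grade medians with straight line segments (linear interpolation). The scaled scores are hypothetical, and the function assumes the medians increase with grade.

```python
from statistics import median

# Step 1 is assumed done: hypothetical scaled scores grouped by grade level
scaled_by_grade = {3: [40, 42, 44], 4: [48, 50, 52], 5: [58, 60, 62]}

# Step 2: median scaled score for each grade level
medians = {grade: median(scores) for grade, scores in scaled_by_grade.items()}

def grade_equivalent(score, medians):
    """Step 3 (simplified): read the grade off straight lines joining the medians.

    Returns None when the score lies outside the range of the medians.
    """
    points = sorted(medians.items())
    for (g0, m0), (g1, m1) in zip(points, points[1:]):
        if m0 <= score <= m1:
            return g0 + (g1 - g0) * (score - m0) / (m1 - m0)
    return None

print(medians)
print(grade_equivalent(46, medians))  # 3.5
```

A scaled score of 46 lies halfway between the grade 3 and grade 4 medians, so its grade equivalent under this simplification is 3.5.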