Classroom Assessment

Classroom Assessment
Standard Test
Interpretation
(This presentation concerns the
interpretation of scores from Norm-Referenced Tests {NRTs})
CRTs vs NRTs
But first, what makes a test a criterion-referenced test?
We’ve talked about this several times in this
class.
Does saying, for instance, that the passing
criterion on this test is 80% of the
items correct make it a CRT?
To be a CRT, the items in a test must be
representative of a well-defined content
domain.
Interpret the following
statements
Marigold reports that she got a 32 on her
history test.
She says she got 80% of the items correct.
Her teacher told her the average score was
34.
You learn that the standard deviation on the
test was 5, and that
Her %tile rank (PR) is 34.
Who had the best performance on
a 25-item test? (The test has a mean of
20 and a standard deviation of 3)
Lars, whose raw score was 20.
Laura, who got 80% of the items correct.
René, whose standard score was 1.
Hildebrand, whose PR was 84.
Manuel, who scored at the 7th stanine.
Standardized Test
Interpretation

Standardized Tests: What are they?
– Tests that are administered, scored, and
interpreted in a standard manner.
– Any type of assessment can be standardized:
Achievement tests.
Aptitude tests.
Diagnostic tests.
Performance assessments.
Portfolios.
– Even classroom tests can be standardized.
Individual Scores Commonly
Reported on Standardized Tests

Raw scores:
– number of items correct.
– percent of items correct.

Derived scores (all require a comparison, or NORM, group):
– Percentile ranks.
– Standard scores.
– Grade equivalent scores.
– Scale scores:
Arbitrary mean and standard deviation.
E.g., EOG developmental scales.
– Lexiles (for standardized reading tests).
Percentile Ranks (PRs)

Scores on different tests, taken by
different groups, can have widely different
means and standard deviations.
– E.g., a reading test with a mean of 55 and a standard deviation of
5, and a math test with a mean of 35 and a standard deviation
of 3.

Percentile ranks (PRs) provide a useful
scale for comparing an individual’s
performance across different tests.
– PRs tell us on which test an individual
performed best relative to other individuals
who have taken the same tests (i.e., a norm
group).
Percentile Ranks vs Percentiles

An individual’s PR tells us what percent
of examinees had lower scores.
I.e., an individual with a PR of 63 scored better
than 63% of the examinees in the comparison
(norm) group.

A percentile is the raw score equivalent
of the percentile rank.
I.e., if an individual whose raw score is 48 has a
PR of 65, then 48 is the 65th percentile.
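A minimal sketch of the definition above, counting the percent of examinees with lower scores. The norm group here is invented for illustration, and the function name is an assumption. Some publishers also credit half of the tied scores, which this sketch does not do.

```python
def percentile_rank(scores, raw_score):
    """Percent of scores in the norm group that fall below raw_score."""
    below = sum(1 for s in scores if s < raw_score)
    return 100 * below / len(scores)

# Hypothetical norm group of 20 raw scores (invented for illustration).
norm_group = [31, 35, 36, 38, 40, 41, 42, 43, 44, 45,
              46, 47, 47, 48, 50, 51, 52, 54, 56, 59]

pr_of_48 = percentile_rank(norm_group, 48)
print(pr_of_48)  # 65.0, so 48 is the 65th percentile in this group
```

In this made-up group, 13 of the 20 scores are below 48, which reproduces the slide's example of a raw score of 48 with a PR of 65.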
Comprehension Check
Roberto’s percentile rank, on a math test,
is 34.
If there are 50 students in his class, how
many students obtained scores lower
than Roberto’s?
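One way to work the check, assuming Roberto's class is itself the norm group. If his PR came from a national norm group instead, the question could not be answered from the class size alone.

```python
# Assumption: the 50 classmates are the norm group for Roberto's PR.
class_size = 50
roberto_pr = 34

# A PR of 34 means 34 percent of the group scored lower.
students_below = roberto_pr * class_size // 100
print(students_below)  # 17
```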
Percentile Ranks

NEVER confuse PRs with percent correct.
– Unlike raw scores (and other types of scores), which
tend to approximate a normal distribution, PRs are
uniformly (or rectangularly) distributed.

PRs exaggerate raw score differences near
the middle of the distribution (the differences
seem larger than they really are), but
reduce differences toward the extremes (the
differences seem smaller than they
really are).
Percentile Ranks
Relation of Percentile Ranks and Raw Scores (Normal Distribution)
 RS    PR   |   PR    RS
100   100   |  100   100
 80    96   |   80    79
 60    73   |   60    63
 40    27   |   40    47
 20     4   |   20    26
  0     0   |    0     0
At the extremes, large score differences give smaller
PR differences.
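The effect can be sketched with a normal curve. The mean of 50 and SD of 15 below are assumptions chosen for illustration, not the values behind the table above.

```python
from statistics import NormalDist

# Hypothetical norm group: raw scores roughly normal, mean 50, SD 15.
norm = NormalDist(mu=50, sigma=15)

def pr(raw_score):
    """Percentile rank of a raw score under the assumed normal curve."""
    return 100 * norm.cdf(raw_score)

# The same 10-point raw score difference, in the middle and at the extreme.
middle_gap = pr(60) - pr(50)
extreme_gap = pr(90) - pr(80)
print(round(middle_gap, 1), round(extreme_gap, 1))
```

Under these assumptions the 10-point difference near the mean is worth about 25 PR points, while the same difference far above the mean is worth about 2.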
Normal Distribution
Standard (z) Scores: Computation
$$z = \frac{\text{RS} - \text{Mean}}{\text{Standard deviation}}$$
That is, z is the distance from the mean in
standard deviation units.
You should REMEMBER this equation!
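The equation translates directly into a short function. The function name and the sample numbers are illustrative assumptions.

```python
def z_score(raw_score, mean, sd):
    """Distance of a raw score from the mean, in standard deviation units."""
    return (raw_score - mean) / sd

# Illustrative numbers: a raw score of 80 on a test with mean 74 and SD 6.
print(z_score(80, 74, 6))  # 1.0
```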
So…what is a standard
deviation?
The standard deviation (S.D.) provides
an index of how far, on average, a
score is likely to be from the mean of
a set of scores.
E.g., if a set (distribution) of scores has a
mean of 74 and a standard deviation of
6, then a typical score lies roughly 6
points from the mean.
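A small sketch of how the standard deviation is computed, using six invented scores. Strictly speaking, the SD is the square root of the average squared distance from the mean, so it is close to, but usually a little larger than, the plain average distance.

```python
import statistics

# Hypothetical set of six test scores (invented for illustration).
scores = [66, 68, 74, 74, 80, 82]

mean = statistics.mean(scores)
# Population SD: square root of the mean squared distance from the mean.
sd = statistics.pstdev(scores)
# Plain average distance from the mean, for comparison.
mean_abs_dev = sum(abs(s - mean) for s in scores) / len(scores)

print(mean, round(sd, 2), round(mean_abs_dev, 2))
```

For these scores the mean is 74, the SD is about 5.8, and the average distance is about 4.7.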
More about the standard
deviation
The standard deviation (SD) tells something
about the variability of the scores in a
distribution (or set) of scores.
A set of scores with a standard deviation of 15 is
more variable (spread out) than is a set of
scores with a standard deviation of 5.
z-scores give the distance between a score and
the mean of the scores in standard deviation
units.
A z-score of 1.2 means that its corresponding raw
score is 1.2 SD units above the mean score.
An Example
Suppose, on a 20-item classroom math
test, Lamont gets 18 items correct.
If the set of scores has a mean of 16
and a standard deviation of 2, then…
Lamont was (18-16)/2 = 1 standard
deviation above the mean for the class.
What else can we tell about Lamont’s
performance in class?
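One thing we can add, if we are willing to assume the class scores are roughly normally distributed, is an approximate percentile rank. The sketch below uses Python's standard normal curve for that step.

```python
from statistics import NormalDist

# Lamont's z-score from the example above.
lamont_z = (18 - 16) / 2

# Assumption: scores are approximately normal, so the normal curve
# gives the share of classmates scoring lower.
lamont_pr = 100 * NormalDist().cdf(lamont_z)
print(lamont_z, round(lamont_pr))  # 1.0 84
```

So Lamont probably outscored about 84 percent of his class, a rough figure for a group as small as one classroom.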

More about Standard Scores
Standard scores, by definition, have a mean of
0 and a standard deviation of 1.
Hence, when the scores are normally distributed, we can easily determine the following:
About 68 percent of the scores in a distribution are between -1 and
+1 standard deviations (between -1z and +1z),
About 95 percent of the scores are between -2z and +2z, and
Nearly all the scores (99.7%) are found between -3z and +3z.
All test scores can be converted to z scores.
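These percentages can be checked against the standard normal curve. They hold only to the extent that the scores are normally distributed. The helper name below is an assumption.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1, like a set of z-scores

def share_within(k):
    """Proportion of a normal distribution between -k and +k SDs."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))
```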
Interpreting Standard Scores
Indicate relative standing (relative to the
comparison, or norm, group taking the
same test).
Converting from raw scores does not
change the meaning of performance.
Can be used to compare a student’s
performance on different tests.
Can be transformed without changing
their interpretation.

Converting Standard Scores

Recall that z-scores have a mean of 0 and an
SD of 1.
We can convert (transform) a set of z-scores
to a new scale having a mean of m by adding
m to all the scores.
For instance, suppose we want a new scale
with a mean of 100. All we have to do is add
100 to all the z-scores.
Now the set of scores will have a mean of
100 and an SD of 1.
Converting Standard Scores

Now, suppose we want the set of scores to
have a standard deviation of 10.
To accomplish this we multiply all the z-scores
by 10 (before adding the new mean of 100).
This gives us a new set of scale scores with a
mean of 100 and an SD of 10.
EOG and EOC scale scores are nothing more
than transformed standard (z) scores.
The Linear Equation for
Converting Standard Scores to
a New Scale
$$\text{New Score} = \text{New Mean} + \text{New SD} \times z$$
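A minimal sketch of the linear conversion, applied to a small invented set of z-scores to confirm that the new scale has the intended mean and SD. The function name and the sample z-scores are assumptions.

```python
import statistics

def to_scale(z, new_mean, new_sd):
    """Convert a z-score to a scale with the given mean and SD."""
    return new_mean + new_sd * z

# Hypothetical z-scores with mean 0 and SD 1 (invented for illustration).
z_scores = [-1.5, -0.5, 0.0, 0.5, 1.5]
scaled = [to_scale(z, 100, 10) for z in z_scores]

print(scaled)  # [85.0, 95.0, 100.0, 105.0, 115.0]
print(statistics.mean(scaled), statistics.pstdev(scaled))
```

Each student keeps the same relative standing: a z-score of 1.5 becomes 115, still one and a half SDs above the mean.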
Using the Standard Scores to
Interpret Test Scores
Example: Mortimer attained a raw score
of 68 on a test having a mean of 62 and
a standard deviation of 4.
Approximately what percent of those
tested attained a raw score lower than
Mortimer’s?
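The question can be worked in two steps, assuming the scores are approximately normally distributed: convert the raw score to a z-score, then read the area below that z-score from the normal curve.

```python
from statistics import NormalDist

# Step 1: raw score to z-score.
mortimer_z = (68 - 62) / 4

# Step 2: percent of a normal distribution below that z-score.
mortimer_pr = 100 * NormalDist().cdf(mortimer_z)
print(mortimer_z, round(mortimer_pr))  # 1.5 93
```

So roughly 93 percent of those tested scored lower than Mortimer.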

Using the Standard Scores to
Interpret Test Scores

Buffy got 75% of the items correct on both a
math test containing 20 items and a spelling
test containing 40 items.
Assume the math test has a mean of 14 and
an S.D. of 2, and the spelling test, a mean of
25 and an S.D. of 5.
Relative to others taking the same tests, in
which area, math or spelling, did Buffy exhibit
the stronger performance?
What are her respective PRs?
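A sketch of one way to work this example. It assumes both score distributions are approximately normal; the helper name is an assumption.

```python
from statistics import NormalDist

std_normal = NormalDist()

def z_and_pr(percent_correct, n_items, mean, sd):
    """Return (z-score, approximate PR) for a percent-correct score."""
    raw = percent_correct / 100 * n_items
    z = (raw - mean) / sd
    return z, 100 * std_normal.cdf(z)

math_z, math_pr = z_and_pr(75, 20, mean=14, sd=2)          # raw score 15
spelling_z, spelling_pr = z_and_pr(75, 40, mean=25, sd=5)  # raw score 30

print(math_z, round(math_pr))          # 0.5 69
print(spelling_z, round(spelling_pr))  # 1.0 84
```

The same percent correct leads to different relative standings: under these assumptions the spelling score is the stronger one.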
Using Percentile Ranks and Standard
Scores to Interpret Test Scores
James scored at the 16th percentile
on a history test.
His sister, Maggie, got 75% of the
items correct on the same test.
Assuming the test has 40 items, a
mean of 36, and a standard
deviation of 3, who exhibited the
stronger performance?
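To compare the two, both scores need to be on the same scale. The sketch below puts both on the z-score scale, assuming an approximately normal distribution so that a percentile rank can be converted back to a z-score.

```python
from statistics import NormalDist

std_normal = NormalDist()
MEAN, SD, N_ITEMS = 36, 3, 40

# James: percentile rank to z-score (inverse of the normal curve).
james_z = std_normal.inv_cdf(0.16)
james_raw = MEAN + SD * james_z

# Maggie: percent correct to raw score to z-score.
maggie_raw = 0.75 * N_ITEMS
maggie_z = (maggie_raw - MEAN) / SD
maggie_pr = 100 * std_normal.cdf(maggie_z)

print(round(james_z, 2), round(james_raw))   # about -0.99, raw score near 33
print(maggie_z, round(maggie_pr))            # -2.0, PR near 2
```

Under these assumptions James, despite his low percentile rank, outperformed Maggie, whose 75% correct falls two SDs below the mean on this easy test.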
Using Percentile Ranks and Standard
Scores to Interpret Test Scores
On a 100-item test having a mean
of 60 and a standard deviation of
10,
– Willie obtained a raw score of 75.
– His friend, Waylon, scored at the 75th
percentile on the same test.
Who had the better score on the
test?
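A sketch of the comparison, again assuming approximately normal scores. Willie's raw score is converted to a percentile rank, and Waylon's percentile is converted to an approximate raw score.

```python
from statistics import NormalDist

std_normal = NormalDist()
MEAN, SD = 60, 10

# Willie: raw score to z-score to percentile rank.
willie_z = (75 - MEAN) / SD
willie_pr = 100 * std_normal.cdf(willie_z)

# Waylon: 75th percentile to z-score to approximate raw score.
waylon_z = std_normal.inv_cdf(0.75)
waylon_raw = MEAN + SD * waylon_z

print(willie_z, round(willie_pr))            # 1.5 93
print(round(waylon_z, 2), round(waylon_raw)) # 0.67 67
```

A raw score of 75 and a percentile of 75 look alike but are not: under these assumptions Willie's score is the better one.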
Stanine Scores

Stanines are much underutilized scores.
Stanines split the distribution of raw scores
into nine intervals:
1  2  3  4  5  6  7  8  9
The middle seven intervals (2 through 8) have
equal widths in terms of standard deviation
units (each interval is ½ standard deviation wide).
The two extreme intervals (Stanine 1 and
Stanine 9) are open-ended.
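A minimal sketch of assigning a stanine from a z-score. It assumes the conventional cut points, with stanine 5 centered on the mean and each of stanines 2 through 8 half an SD wide, leaving stanines 1 and 9 open-ended. How a score exactly on a boundary is treated is also an assumption here.

```python
import bisect

# Assumed z-score boundaries between adjacent stanines (1|2, 2|3, ..., 8|9).
STANINE_CUTS = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]

def stanine(z):
    """Stanine (1 to 9) for a z-score; boundary scores go to the upper stanine."""
    return bisect.bisect_right(STANINE_CUTS, z) + 1

print(stanine(0.0))   # 5
print(stanine(1.0))   # 7
print(stanine(-2.3))  # 1
print(stanine(3.0))   # 9
```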
Some facts about Stanines
The middle three stanines contain a little
more than 50 percent of the scores in a
normal distribution of scores.
1  2  3  4  5  6  7  8  9
Shouldn’t any interval of scores that
includes HALF the population of scores be
considered AVERAGE?
The lower boundary of the 4th stanine has a
percentile rank of 23.
The upper boundary of the 6th stanine has a
percentile rank of 77.
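These figures can be checked against the normal curve, assuming stanines 4 through 6 run from 0.75 SD below the mean to 0.75 SD above it (the conventional boundaries, not stated on the slide).

```python
from statistics import NormalDist

std_normal = NormalDist()

# Assumed boundaries: stanine 4 starts at z = -0.75, stanine 6 ends at z = +0.75.
lower_pr = 100 * std_normal.cdf(-0.75)
upper_pr = 100 * std_normal.cdf(0.75)
middle_three_share = upper_pr - lower_pr

print(round(lower_pr), round(upper_pr))  # 23 77
print(round(middle_three_share, 1))
```

Under this assumption the middle three stanines hold roughly 55 percent of the scores, in line with "a little more than 50 percent".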
Grade Equivalent Scores
Give the median test score for a
particular grade level.
Do not convey achievement in terms of
years and months of schooling.
Do not allow comparisons across
content areas.
Do indicate level of performance
relative to the group being tested.
Lexiles

The Lexile Framework® for Reading matches
reader ability and text difficulty.

It includes the Lexile® measure and the
Lexile scale.
– The Lexile measure is a reading ability or text
difficulty score followed by an “L” (e.g., “850L”).
– The Lexile scale is a developmental scale for
reading ranging from 200L for beginning readers
to above 1700L for advanced text.
Lexiles

Tens of thousands of books and tens of millions of newspaper and magazine
articles have Lexile measures.

More than 450 publishers Lexile their titles.

All major standardized reading tests and
many popular reading programs can report
student reading scores in Lexiles.
Lexiles

To determine the Lexile level of a book or
article, text is split into 125-word slices.
Each slice is compared to a list of nearly 600
million words taken from a variety of sources
and genres, and the words in each sentence
are counted.
These calculations are put into a
psychometric equation.
From this, the Lexile measure for the entire
text is determined.
More information on Lexiles
For general information about Lexiles, go
to:
http://www.lexile.com/
To find the Lexile measure for a title, go to:
http://www.lexile.com/DesktopDefault.aspx?view=ed&tabindex=5&tabid=67
Recap: Interpret the following
statements
Marigold reports that she got a 32 on her
history test.
She says she got 80% of the items correct.
Her teacher told her the average score was
34.
You learn that the standard deviation on the
test was 5.
Her %tile rank (PR) is 34.
Who had the best performance on
a 25-item test? (The test has a mean of
20 and a standard deviation of 3)
Lars, whose raw score was 20.
Laura, who got 80% of the items correct.
René, whose standard score was 1.
Hildebrand, whose PR was 84.
Manuel, who scored at the 7th stanine.
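One way to approach the question is to put all five scores on the z-score scale. The sketch assumes an approximately normal distribution and the conventional stanine boundaries (stanine 7 spanning z = 0.75 to z = 1.25), neither of which the slide states.

```python
from statistics import NormalDist

std_normal = NormalDist()
MEAN, SD, N_ITEMS = 20, 3, 25

def z_from_raw(raw):
    return (raw - MEAN) / SD

lars_z = z_from_raw(20)                  # raw score of 20
laura_z = z_from_raw(0.80 * N_ITEMS)     # 80% of 25 items is 20 correct
rene_z = 1.0                             # already a standard score
hildebrand_z = std_normal.inv_cdf(0.84)  # PR of 84 to z-score
manuel_z_range = (0.75, 1.25)            # assumed span of the 7th stanine

print(lars_z, laura_z, rene_z, round(hildebrand_z, 2), manuel_z_range)
```

Lars and Laura both sit exactly at the mean. René and Hildebrand are essentially tied at about one SD above it, and Manuel's stanine places him in the same neighborhood without telling us whether he was slightly above or below them.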
END