1. Assessment involves the use of empirical data on student learning to refine programs and improve student learning. (Assessing Academic Programs in Higher Education by Allen 2004)
2. Assessment is the process of gathering and discussing information from multiple and diverse sources in order to develop a deep understanding of what students know, understand, and can do with their knowledge as a result of their educational experiences; the process culminates when assessment results are used to improve subsequent learning. (Learner-Centered Assessment on College Campuses: Shifting the Focus from Teaching to Learning by Huba and Freed 2000)
3. Assessment is the systematic basis for making inferences about the learning and development of students. It is the process of defining, selecting, designing, collecting, analyzing, interpreting, and using information to increase students' learning and development. (Assessing Student Learning and Development: A Guide to the Principles, Goals, and Methods of Determining College Outcomes by Erwin 1991)
4. Assessment is the systematic collection, review, and use of information about educational programs undertaken for the purpose of improving student learning and development. (Assessment Essentials: Planning, Implementing, and Improving Assessment in Higher Education by Palomba and Banta 1999)
There are two main types of assessment: summative assessment and formative assessment.
Summative Assessment
Oftentimes, summative assessments can be considered high-stakes. Summative assessments are
used to gauge children's learning against a standard or a benchmark. They are often given at the
end of the year and are sometimes used to make important educational decisions about children.
Summative assessments are a snapshot of students' understanding, which is useful for summarizing
student learning. What helps me remember the difference between the two types is that summative
is like a summary. Summative is the big picture or the grand summary of a child's learning.
Summative assessments aren't used a lot in early childhood programs because they're not really
considered developmentally appropriate as a form of assessment for very young children. One
example that you might see or use in your program is a Kindergarten Readiness Assessment or a
developmental skills assessment that enables the child to move to the next classroom. I've heard
of some programs doing assessments like that, where a child has to have a certain score on this
assessment in order to move up to the next preschool room or the four- or five-year-old room, or whatever
it was for that particular program.
There's a little bit of debate within our field about whether it is developmentally appropriate or not
to test children to move them up to that next level. In my experience, there have been several times
where I felt that a child was ready to move up to the next class even though age-wise or
chronologically he wasn't the right age to move. I'm sure we've all had those children where we're
in the three-year-old class and the child's mentally five, but chronologically he's three. Then there
have been other times where the child was chronologically ready to move up at age five, but
developmentally I really felt like he should have been in the other room for a little bit longer.
There are all kinds of implications to using an assessment of that type for that reason. That's not
to say it's wrong to do; there's just some debate in our profession about using those types
of assessments.
Formative Assessments
That takes us to the second type of assessment, which is formative assessment. These are
considered low-stakes. So summative assessments are high-stakes and formative assessments are low-stakes. They're ongoing
and they tend to be based on teachers' intentional observations of children which are typically
during specific learning experiences and/or during everyday interactions or classroom
involvement. These assessments are most useful for planning learning experiences, activities, and
environments.
These are the everyday interactions that we talked about, where assessment naturally emerges from
the work that you're already doing. Those would be considered more of the formative assessment.
Again, these assessments are used to determine activities for the lesson plan after asking questions
such as:
• What kind of things should I change out in my centers?
• What kind of items in the science center are the kids just throwing?
• What kind of things in the science center are they actually sitting down and investigating, trying to see what they can figure out, and genuinely curious about?
When I was a preschool teacher, I had many four to five-year-old children in my classroom because
at the time, the ratio for our state was one to 15. I had 30 children in my classroom and I had to
really be on top of what my children were interested in and what they had figured out or had gotten
over the excitement of. When you have that many children in the classroom, you have to keep
them engaged, active, and busy. Formative assessments were extremely helpful for me in that way.
Formative assessments are most appropriate for use with young children. Remember, summative
assessments are not necessarily appropriate for children aged five and under, but formative assessments
are definitely appropriate as they're often more authentic, more real, and more holistic. They show
a picture of the whole child as well so they can be more useful. Because young children's learning
can be so varied and sometimes erratic, using multiple sources of assessment information is ideal.
That goes back to what we were just talking about, where children develop across such a wide range
and in a variety of contexts and situations.
There's such a wide range of development when it comes to young children, that even though you
might have a classroom full of three-year-olds, developmentally they're going to be on a spectrum.
That's because development and learning are varied and can be erratic. The term erratic may be a little
bit shocking at first, but young children's learning can be erratic. For example, if you work with
infants, one day you send them home and they can't sit up or roll over and are just lying there
looking at you. Then they come back on Monday and they're rolling and moving and grooving and
doing all kinds of stuff. If you work with toddlers, one day you send them home and they barely
say two or three words, the next week they come back and you can't get all the words down that
they're speaking. In this situation, erratic means sometimes very sudden, but sometimes it's drawn
out. It depends on the child.
Formative assessments can be formal, where you're actually making time to sit down and take
notes during a specific time or a specific center based on a specific child. They can also be informal
such as when you're out on the playground and a child is sitting under the tree with a book and you
just go over and you sit down and say, "Hey, can I read with you?" You notice, wow, this child
knows a lot of words in this book and you make a note of that. That would be more of an informal
type of assessment that you've done.
Formative assessments can be initial or ongoing. The initial formative assessment is usually done
to find out as much as we can about the child, usually at the beginning of the year or as a child
enters a program. It usually involves observing, studying existing information, and reviewing
home background info.
In the program that I supervised, when we had a new child join our program, we had a sheet that
the parents would fill out that asked all kinds of information like, "What's your child's favorite
stuffed animal? How does your child go to sleep at night? What's the bedtime routine? What's your
child's favorite food? What's your child's favorite movie?" It was all background information about
the child so that we could get to know them. That helped us begin those connections that are so
important in early childhood. That home background information would be a part of that initial
formative assessment.
The other type of formative assessment is an ongoing formative assessment. This typically
provides more in-depth information, often because it takes more time. An ongoing formative
assessment isn’t a quick form that you’re through with once. It’s an ongoing thing you will look
at every week, month, three months, or however it is set up in your program.
Here are some examples of published formative assessment tools often used in early childhood
programs:
• The Work Sampling System (WSS): www.worksamplingonline.com
• Teaching Strategies GOLD: www.teachingstrategies.com
• High Scope COR (Child Observation Record): www.onlinecor.net
• The Creative Curriculum Developmental Continuum: www.teachingstrategies.com
Sometimes a state or funding source will mandate that certain early childhood programs use a
specific assessment tool. Sometimes your program itself mandates that. I've had the experience of
working with all of these tools at one time or another in my career. All of them offer definite
benefits and many of them are pretty easy to complete. As you know, in early childhood, time is
not a luxury that we have a lot of. It's always nice to have a tool that's easy to use so that when
you find five minutes to sit down and work on something or do an assessment, it's easy to figure out.
Test items can be divided into two general categories: (1) objective items, which require students to select the correct response from several alternatives or to supply a word or short phrase to answer a question or complete a statement; and (2) subjective or essay items, which permit the student to organize and present an original answer. Objective items include multiple-choice, true-false, matching, and completion, while subjective items include short-answer essay, extended-response essay, problem solving, and performance test items. For some instructional purposes one or the other item type may prove more efficient and appropriate. To begin our discussion of the relative merits of each type of test item, consider some characteristics of objective and subjective tests:
Objective test characteristics:
• They are so definite and clear that a single, definite answer is expected.
• They ensure perfect objectivity in scoring.
• They can be scored objectively and easily (see the sketch after these lists).
• They take less time to answer than an essay test.

Subjective test characteristics:
• Subjective items are generally easier and less time-consuming to construct than most objective test items.
• Different readers can rate identical responses differently, and the same reader can rate the same paper differently over time.
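Because objective items are keyed to a single correct answer, their scoring can be fully automated. Here is a minimal sketch in Python; the answer key and student responses are invented illustration data, not from any real test:

```python
# Minimal sketch: scoring objective (multiple-choice) items against a key.
# The key and responses are hypothetical illustration data.

answer_key = {1: "B", 2: "D", 3: "A", 4: "C", 5: "B"}

def score_objective_test(responses, key):
    """Count correct answers. Any scorer applying the same key to the
    same responses gets the same total -- the objectivity noted above."""
    return sum(1 for item, answer in responses.items() if key.get(item) == answer)

student_responses = {1: "B", 2: "D", 3: "C", 4: "C", 5: "A"}
print(score_objective_test(student_responses, answer_key))  # prints 3
```

A subjective essay item has no such key, which is exactly why identical responses can receive different ratings from different readers.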
Criterion-referenced tests compare a person’s knowledge or skills against a predetermined
standard, learning goal, performance level, or other criterion. With criterion-referenced tests, each
person’s performance is compared directly to the standard, without considering how other students
perform on the test. Criterion-referenced tests often use “cut scores” to place students into
categories such as “basic,” “proficient,” and “advanced.”
If you’ve ever been to a carnival or amusement park, think about the signs that read “You must be
this tall to ride this ride!” with an arrow pointing to a specific line on a height chart. The line
indicated by the arrow functions as the criterion; the ride operator compares each person’s height
against it before allowing them to get on the ride.
Note that it doesn’t matter how many other people are in line or how tall or short they are; whether
or not you’re allowed to get on the ride is determined solely by your height. Even if you’re the
tallest person in line, if the top of your head doesn’t reach the line on the height chart, you can’t
ride.
Criterion-referenced assessments work similarly: An individual’s score, and how that score is
categorized, is not affected by the performance of other students. In the charts below, you can see
the student’s score and performance category (“below proficient”) do not change, regardless of
whether they are a top-performing student, in the middle, or a low-performing student.
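A minimal sketch of that logic in Python; the cut scores (40 and 70) are invented for illustration, since real cut scores come from standard-setting studies:

```python
# Minimal sketch of criterion-referenced categorization.
# The cut scores (40, 70) are hypothetical illustration values.

def categorize(score, cut_proficient=40, cut_advanced=70):
    """Place one score into a category using fixed cut scores only.

    Other students' scores never appear here: the result depends
    solely on the individual's score versus the criterion.
    """
    if score >= cut_advanced:
        return "advanced"
    if score >= cut_proficient:
        return "proficient"
    return "basic"

print(categorize(35))  # basic
print(categorize(55))  # proficient
print(categorize(82))  # advanced
```

Note the design point: the function takes only the individual's score and the fixed criteria as inputs, mirroring the height chart at the carnival ride.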
This means knowing a student’s score for a criterion-referenced test will only tell you how that
specific student compared in relation to the criterion, but not whether they performed below average, above average, or average when compared to their peers.
How to Interpret Norm-Referenced Tests
Norm-referenced measures compare a person’s knowledge or skills to the knowledge or skills of
the norm group. The composition of the norm group depends on the assessment. For student
assessments, the norm group is often a nationally representative sample of several thousand
students in the same grade (and sometimes, at the same point in the school year). Norm groups
may also be further narrowed by age, English Language Learner (ELL) status, socioeconomic
level, race/ethnicity, or many other characteristics.
One norm-referenced measure that many families are familiar with is the baby weight growth
charts in the pediatrician’s office, which show which percentile a child’s weight falls in. A child
in the 50th percentile has an average weight; a child in the 75th percentile weighs more than 75%
of the babies in the norm group and the same as or less than the heaviest 25% of babies in the
norm group; and a child in the 25th percentile weighs more than 25% of the babies in the norm
group and the same as or less than 75% of them. It’s important to note that these norm-referenced
measures do not say whether a baby’s birth weight is “healthy” or “unhealthy,” only how it
compares with the norm group.
For example, a baby who weighed 2,600 grams at birth would be in the 7th percentile, weighing
the same as or less than 93% of the babies in the norm group. However, despite the very low
percentile, 2,600 grams is classified as a normal or healthy weight for babies born in the United
States—a birth weight of 2,500 grams is the cut-off, or criterion, for a child to be considered low
weight or at risk. (For the curious, 2,600 grams is about 5 pounds and 12 ounces.) Thus, knowing
a baby’s percentile rank for weight can tell you how they compare with their peers, but not if the
baby’s weight is “healthy” or “unhealthy.”
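The percentile-rank logic itself is simple to sketch. In the Python below, the norm-group weights are invented illustration values, not actual growth-chart data, and percentile rank is computed as the percentage of the norm group at or below the given value (exact definitions vary slightly across testing programs):

```python
# Minimal sketch of a norm-referenced percentile rank.
# The norm-group weights (grams) are invented illustration values.

def percentile_rank(value, norm_group):
    """Percentage of the norm group at or below the given value."""
    at_or_below = sum(1 for v in norm_group if v <= value)
    return 100 * at_or_below / len(norm_group)

norm_weights = [2400, 2700, 2900, 3100, 3200, 3300, 3400, 3500, 3700, 4000]
print(percentile_rank(3100, norm_weights))  # 40.0
```

Unlike the criterion-referenced sketch earlier, the result here changes whenever the norm group changes, even though the individual value stays the same.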
Norm-referenced assessments work similarly: An individual student’s percentile rank describes
their performance in comparison to the performance of students in the norm group, but does not
indicate whether or not they met or exceeded a specific standard or criterion.
In the charts below, you can see that, while the student’s score doesn’t change, their percentile
rank does change depending on how well the students in the norm group performed. When the
individual is a top-performing student, they have a high percentile rank; when they are a low-performing student, they have a low percentile rank. What we can’t tell from these charts is
whether or not the student should be categorized as proficient or below proficient.
This means knowing a student’s percentile rank on a norm-referenced test will tell you how well
that specific student performed compared to the performance of the norm group, but will not tell
you whether the student met, exceeded, or fell short of proficiency or any other criterion.
3. d. Item analysis is a process which examines student responses to individual test items
(questions) in order to assess the quality of those items and of the test as a whole. Item analysis is
especially valuable in improving items which will be used again in later tests, but it can also be
used to eliminate ambiguous or misleading items in a single test administration. In addition, item
analysis is valuable for increasing instructors’ skills in test construction, and identifying specific
areas of course content which need greater emphasis or clarity. Separate item analyses can be
requested for each raw score.
3.e Test-retest reliability is a measure of reliability obtained by administering the same test twice
over a period of time to a group of individuals. Example: A test designed to assess student learning
in psychology could be given to a group of students twice, with the second administration perhaps
coming a week after the first. The obtained correlation coefficient would indicate the stability of
the scores.
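A minimal sketch of that computation in Python, using made-up scores for two administrations of the same test one week apart:

```python
# Minimal sketch of test-retest reliability: the Pearson correlation
# between two administrations of the same test. Scores are made up.

from statistics import correlation  # available in Python 3.10+

first_administration = [72, 85, 60, 90, 78, 66]
second_administration = [70, 88, 62, 91, 75, 69]  # same students, a week later

r = correlation(first_administration, second_administration)
print(f"Test-retest reliability estimate: r = {r:.2f}")
```

A coefficient near 1.0 indicates stable scores across the two administrations; a low coefficient suggests the test is not measuring consistently.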
Test validity refers to how well a test measures what it is purported to measure. For a test to be
valid, it must also be reliable; a reliable test, however, is not necessarily valid. For example, if
your scale is off by 5 lbs, it reads your weight every day with an excess of 5 lbs. The scale is
reliable because it consistently reports the same weight every day, but it is not valid because it
adds 5 lbs to your true weight. It is not a valid measure of your weight.
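The scale example can be made concrete with a short sketch; the 5 lb bias comes from the example above, while the true weight of 150 lbs is an invented illustration value:

```python
# Minimal sketch of the biased-scale example: readings that are
# perfectly consistent (reliable) yet systematically wrong (not valid).

true_weight = 150                               # lbs, invented value
readings = [true_weight + 5 for _ in range(7)]  # scale adds 5 lbs every day

print(readings)                       # [155, 155, 155, 155, 155, 155, 155]
print(max(readings) - min(readings))  # 0  -> perfectly consistent (reliable)
print(readings[0] - true_weight)      # 5  -> constant error (not valid)
```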
Consider the responses to a multiple-choice item from a class divided into an upper group and a lower group of 25 students each:

Alternative   Upper 25   Lower 25
A             2          5
B             3          2
C             5          15
D             15         3

A) Calculate its item index of difficulty:

item index of difficulty (p) = number of students who got it right / total number of students

For alternative A: p = (2 + 5) / 25 = 7 / 25 = 0.28

B) Discrimination index (D) = number of students in the lower group - number of students in the upper group

Discrimination index for A = 5 - 2 = 3

Applying the same formulas to every alternative gives:

Item   Upper 25   Lower 25   Item index of difficulty (p)   Discrimination index (D)
A      2          5          0.28                           3
B      3          2          0.20                           -1
C      5          15         0.80                           10
D      15         3          0.72                           -12
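A minimal Python sketch of the same computation, following the formulas exactly as this worked example defines them. Note two assumptions carried over from the example: the difficulty index divides by 25, and the discrimination index is computed as lower minus upper, which differs from the more common upper-minus-lower convention:

```python
# Minimal sketch of the item analysis worked above.
# p = (upper + lower) / 25 and D = lower - upper, per this example.

counts = {  # alternative: (upper-group count, lower-group count)
    "A": (2, 5),
    "B": (3, 2),
    "C": (5, 15),
    "D": (15, 3),
}

for alt, (upper, lower) in counts.items():
    p = (upper + lower) / 25   # item index of difficulty
    d = lower - upper          # discrimination index, as defined above
    print(f"{alt}: p = {p:.2f}, D = {d}")

# Output:
# A: p = 0.28, D = 3
# B: p = 0.20, D = -1
# C: p = 0.80, D = 10
# D: p = 0.72, D = -12
```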