Measurement vs. Evaluation / Reliability vs. Validity

Introduction to
Measurement and
Evaluation
PE 254
Test and Data


Test: An instrument or activity used to
accumulate data on a person’s ability to
perform a specified task. In kinesiology,
the content of these tests is usually
cognitive, skill, or fitness related.
Data: The translation of behavior into a
numerical or verbal descriptor which is
then recorded in written form.
Why Administer Tests?


To measure individual differences on
a specific trait (behavior).
Discussion: Is a test “good” if
everyone/anyone scores 100%? Or,
is a test “good” if everyone/anyone
scores 0%?
Use of Tests








Motivation
Achievement
Improvement
Diagnosis
Prescription
Grading
Classification
Prediction
Administrative Concerns in Test Selection







Relevance
Educational value
Economic value
Time
Norms
Bias
Safety
Measurement



A measurement takes place when a
“test” is given and a “score” is
obtained.
If the test collects quantitative data,
the score is a number.
If the test collects qualitative data,
the score may be a phrase or word
such as “excellent.”
Definitions of Measurement

The systematic assignment of
numerical values (quantitative)
or verbal descriptors
(qualitative) to the
characteristics of objects or
individuals; designation of the
status of such characteristics.
Measurement Process Involves Four Steps
1. Define the characteristics that you want
to measure.
2. Select the appropriate test. This may
also mean to select the appropriate
testing instrument.
3. Administer the test. If an instrument is
involved in the testing, this also means
to use the instrument correctly.
4. Collect and record the measurement
from the test.
Considerations When Taking Measurements




Remember that you are measuring a
characteristic of the person—you are
not measuring the person
themselves; thus, make no
judgments about the person.
Make no comical remarks regarding
the collected data.
Maintain high ethical standards when
collecting the data.
Be professional.
Subjective vs. Objective Measurement


A subjective measurement is one
that can possibly be interpreted
differently.
An objective measurement is one
that cannot be interpreted differently
because of numerical values.
Discussion

Every time you go to a doctor’s
office, they weigh you. Let’s say you
weigh 140 pounds.
Did your measurement of 140
pounds come from a test? Why or
why not?
Evaluation


Definition 1: The process of making
judgments about the results of
measurement in terms of the
purpose of the measurement.
Definition 2: The process of obtaining
information (data) and using it to
form judgments, which in turn are
used in decision making.
Steps Involved in Making an Evaluation
1. Define the objective or the purpose of
the test.
2. Measure the performance or administer
the test.
3. Find or develop a standard.
4. Compare a person’s performance on the
test to a standard.
5. Make the evaluation then discuss and
distribute the results in the most
appropriate manner.
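The evaluation steps above can be sketched in a few lines of Python. This is only an illustration, assuming a hypothetical pull-up test: the cutoff values, the name PULL_UP_STANDARD, and the function evaluate are invented for the example and do not come from any published standard.

```python
# Hypothetical criterion standard for a pull-up test (made-up cutoffs).
# Each entry is (minimum score, verbal descriptor), highest cutoff first.
PULL_UP_STANDARD = [
    (10, "excellent"),
    (6, "good"),
    (3, "fair"),
    (0, "needs improvement"),
]


def evaluate(score, standard):
    """Compare a measured score to a standard and return a judgment."""
    if score < 0:
        raise ValueError("score must be non-negative")
    for minimum, descriptor in standard:
        if score >= minimum:
            return descriptor
    raise ValueError("standard does not cover this score")


# The measurement is the number; the evaluation is the judgment about it.
measured_pull_ups = 7  # hypothetical score obtained from the test
print(evaluate(measured_pull_ups, PULL_UP_STANDARD))  # prints "good"
```

The score of 7 is the measurement; the descriptor returned by comparing it to the standard is the evaluation.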
Formative & Summative Evaluation


Formative evaluation: Evaluation that
takes place at the beginning of or
during a program or unit.
Summative evaluation: Evaluation that
takes place at the end.
Norms
Evaluations are often based on norms:



Local norms: Norms based on a relatively small
group of subjects. Ex: Pull-up norms for 7th grade
boys at one school.
State norms: Norms that are representative of all
similar subjects in the state. Ex: CAHPERD fitness
norms for 7th grade boys.
National norms: Norms that are representative of
all similar subjects in the United States. Ex:
AAHPERD fitness norms for 7th grade girls.
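A norm-based evaluation usually places one person's score within the scores of the norm group. The sketch below computes a percentile rank against a small local norm group. It uses one common convention (percentage of the group scoring at or below the score); the data and the function name percentile_rank are hypothetical and not taken from any published norms.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring at or below `score`."""
    if not norm_scores:
        raise ValueError("norm group is empty")
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100.0 * at_or_below / len(norm_scores)


# Hypothetical local norm group: pull-up counts for ten students (made-up data).
local_norm = [0, 1, 2, 2, 3, 4, 5, 5, 6, 8]

# Eight of the ten students scored 5 or fewer pull-ups.
print(percentile_rank(5, local_norm))  # prints 80.0
```

The same score could earn a different percentile rank against state or national norms, which is why the norm group should be named whenever a norm-based evaluation is reported.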
Reliability




Reliability is defined as the consistency of
an individual’s scores when repeatedly
performing the same test.
Example: If a group of people take the
same test on two different days, the
scores obtained should be approximately
the same.
A reliable test will yield data that are
stable, repeatable, and precise.
Reliability of a test refers to the
dependability of test scores.
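One common way to put a number on this kind of consistency is to correlate the scores from two administrations of the same test (test-retest). The sketch below computes a Pearson correlation by hand. The day 1 and day 2 scores are made-up data, the function name pearson_r is chosen for the example, and other reliability coefficients exist.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    sum_xx = sum(a * a for a in dev_x)
    sum_yy = sum(b * b for b in dev_y)
    if sum_xx == 0 or sum_yy == 0:
        # If everyone earns the same score, the correlation is undefined.
        raise ValueError("scores have no variation")
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical scores for the same five people on two different days.
day1 = [12, 15, 9, 20, 17]
day2 = [13, 14, 10, 19, 18]

r = pearson_r(day1, day2)
print(f"test-retest r = {r:.2f}")
```

A coefficient close to 1 suggests that people kept roughly the same relative standing across the two days, which is what "approximately the same" scores would produce.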
Validity



The American Psychological Association
(APA) reported that validity is the most
important characteristic of a test or
measuring instrument.
The validity of each test can only be
evaluated in terms of a particular purpose
and for a particular group.
Example: A strength test that is valid for
college-aged students is not necessarily
valid for sedentary adults.
Group Activities


Identify the reliability and validity of
the 1.5-mile run when administered to
college-aged students.
Identify the reliability and validity of
the standing broad jump when administered
to elementary school students.