Introduction to Measurement and Evaluation
PE 254

Test and Data
Test: An instrument or activity used to accumulate data on a person’s ability to perform a specified task. In kinesiology, the content of these tests is usually cognitive, skill, or fitness related.
Data: The translation of behavior into a numerical or verbal descriptor, which is then recorded in written form.

Why Administer Tests?
To measure individual differences on a specific trait (behavior).
Discussion: Is a test “good” if everyone/anyone scores 100%? Or, is a test “good” if everyone/anyone scores 0%?

Use of Tests
Motivation
Achievement
Improvement
Diagnosis
Prescription
Grading
Classification
Prediction

Administrative Concerns in Test Selection
Relevance
Educational value
Economic value
Time
Norms
Bias
Safety

Measurement
A measurement takes place when a “test” is given and a “score” is obtained.
If the test collects quantitative data, the score is a number.
If the test collects qualitative data, the score may be a phrase or word such as “excellent.”

Definitions of Measurement
The systematic assignment of numerical values (quantitative) or verbal descriptors (qualitative) to the characteristics of objects or individuals; designation of the status of such characteristics.

Measurement Process Involves Four Steps
1. Define the characteristics that you want to measure.
2. Select the appropriate test. This may also mean selecting the appropriate testing instrument.
3. Administer the test. If an instrument is involved in the testing, this also means using the instrument correctly.
4. Collect and record the measurement from the test.

Considerations When Taking Measurements
Remember that you are measuring a characteristic of the person, not the person themselves; thus, make no judgments about the person.
Make no comical remarks regarding the collected data.
Maintain high ethical standards when collecting the data.
Be professional.

Subjective vs. Objective Measurement
A subjective measurement is one that can be interpreted differently by different observers.
An objective measurement is one that cannot be interpreted differently because it is expressed as a numerical value.

Discussion
Every time you go to a doctor’s office, they weigh you. Let’s say you weigh 140 pounds. Did your measurement of 140 pounds come from a test? Why or why not?

Evaluation
Definition 1: The process of making judgments about the results of measurement in terms of the purpose of the measurement.
Definition 2: The process of obtaining information (data) and using it to form judgments, which in turn are used in decision making.

Steps Involved in Making an Evaluation
1. Define the objective or the purpose of the test.
2. Measure the performance or administer the test.
3. Find or develop a standard.
4. Compare a person’s performance on the test to a standard.
5. Make the evaluation, then discuss and distribute the results in the most appropriate manner.

Formative & Summative Evaluation
Formative evaluation: Evaluation that takes place at the beginning of or during the instructional or training period.
Summative evaluation: Evaluation that takes place at the end.

Norms
Evaluations are often based on norms:
Local norms: Norms based on a relatively small group of subjects. Ex: Pull-up norms for 7th grade boys at one school.
State norms: Norms that are representative of all similar subjects in the state. Ex: CAHPERD fitness norms for 7th grade boys.
National norms: Norms that are representative of all similar subjects in the United States. Ex: AAHPERD fitness norms for 7th grade girls.

Reliability
Reliability is defined as the consistency of an individual’s scores when repeatedly performing the same test.
Example: If a group of people take the same test on two different days, the scores obtained should be approximately the same.
A reliable test will yield data that are stable, repeatable, and precise.
Reliability of a test refers to the dependability of test scores.
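
The two-day example under Reliability can be illustrated with a short calculation. These notes do not say how consistency should be quantified; one common option, assumed here, is a test-retest correlation (Pearson r) between the day 1 and day 2 scores, where values near 1 suggest consistent scores. The function name pearson_r and the sit-up scores below are hypothetical and made up for illustration only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for the correlation to be defined")
    return sxy / sqrt(sxx * syy)


# Hypothetical sit-up counts for the same five people on two different days
# (made-up data, not taken from the course notes)
day1 = [30, 42, 25, 38, 50]
day2 = [32, 40, 27, 37, 51]

r = pearson_r(day1, day2)
print(f"test-retest r = {r:.2f}")
```

A correlation only shows that people keep roughly the same rank order from one day to the next; it does not show that the scores themselves are unchanged, so it is a sketch of one way to check the example rather than a complete reliability analysis.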

Validity
The American Psychological Association (APA) reported that validity is the most important characteristic of a test or measuring instrument.
The validity of each test can only be evaluated in terms of a particular purpose and for a particular group.
Example: A strength test that is valid for college-aged students is not necessarily valid for sedentary adults.

Group Activities
Identify the reliability and validity of the 1.5-mile run when administered to college-aged students.
Identify the reliability and validity of the standing broad jump when administered to elementary school students.