SHOWTIME!

EVALUATING KNOWLEDGE

INTRODUCTION
• KNOWLEDGE IS AN OBJECTIVE OF MOST PHYSICAL EDUCATION PROGRAMS
• KNOWLEDGE CAN INCREASE ENJOYMENT AS A SPECTATOR
• KNOWLEDGE IS AN OBJECTIVE IN ADULT FITNESS AND REHABILITATION PROGRAMS: WHY FITNESS IS IMPORTANT, HOW TO DEVELOP AND MAINTAIN FITNESS, THE IMPORTANCE OF A GOOD DIET, WHY AN INJURY OCCURRED, HOW AN INJURY CAN BE AVOIDED, ETC.

PURPOSES OF KNOWLEDGE TESTS
• ASSIGNING A GRADE OR SUMMATIVE EVALUATION
• MEASURING PROGRESS OR FORMATIVE FEEDBACK
• PROVIDING FEEDBACK TO STUDENTS OR PROGRAM PARTICIPANTS AS TO THEIR STATUS AND WHAT THE CLASS OR PROGRAM KNOWLEDGE EXPECTATIONS ARE
• MOTIVATING STUDENTS OR PROGRAM PARTICIPANTS TO LEARN THE MATERIAL TESTED
• ASSESSING TEACHING OR INSTRUCTIONAL EFFECTIVENESS

LEVELS OF KNOWLEDGE
THE FIRST FOUR LEVELS OF BLOOM’S TAXONOMY

TYPES OF KNOWLEDGE TESTS
ESSAY VERSUS OBJECTIVE
• ESSAY TEST - PEOPLE ANSWER EACH ITEM (QUESTION) WITH WHATEVER INFORMATION THEY CHOOSE AND WRITE THEIR ANSWERS IN SENTENCES
• OBJECTIVE TEST - TRUE-FALSE, MULTIPLE-CHOICE, AND MATCHING ITEMS HAVE POTENTIAL ANSWERS PROVIDED WITH EACH QUESTION

TYPES OF KNOWLEDGE TESTS
MASTERY VERSUS DISCRIMINATION
• MASTERY TEST - FORMATIVE EVALUATION WITH CRITERION-REFERENCED STANDARDS, WHICH ARE USED TO DETERMINE WHETHER INDIVIDUALS HAVE MASTERED THE MATERIAL (PASS/FAIL, PROFICIENT/NON-PROFICIENT); TYPICALLY EASIER QUESTIONS ARE ASKED
• DISCRIMINATION TEST - SUMMATIVE EVALUATION WITH NORM-REFERENCED STANDARDS, DESIGNED TO DIFFERENTIATE AMONG STUDENTS IN TERMS OF KNOWLEDGE; TYPICALLY HARDER QUESTIONS ARE ASKED

TEST CONSTRUCTION
• STEP 1: CONSTRUCT A TABLE OF SPECIFICATIONS
• STEP 2: DECIDE ON THE NATURE OF THE TEST OR THE TYPE OF QUESTIONS TO BE USED
• STEP 3: CONSTRUCT THE TEST ITEMS (QUESTIONS)
• STEP 4: DETERMINE THE TEST FORMAT AND ADMINISTRATIVE DETAILS (INSTRUCTIONS, NEATLY TYPED, EASY TO READ, ALL ITEM INFORMATION ON SAME PAGE, TEST ENVIRONMENT, ETC.)

TYPES OF TEST ITEMS (QUESTIONS)

TRUE-FALSE QUESTIONS
ADVANTAGES
• MANY ITEMS CAN BE ON TEST AS THEY CAN
BE ANSWERED QUICKLY
• EASY AND QUICK TO WRITE
• QUICK TO SCORE
• FACTUAL INFORMATION IS EASILY TESTED
• STANDARDIZED ANSWER SHEETS CAN BE USED
DISADVANTAGES
• ONLY THE FIRST LEVEL OF BLOOM’S TAXONOMY, BASIC KNOWLEDGE, CAN BE TESTED
• 50% CHANCE OF GUESSING THE ANSWER
• EASY FOR A PERSON TO CHEAT
• ENCOURAGES MEMORIZATION RATHER THAN UNDERSTANDING OF FACTS
• CAN BE AMBIGUOUS
• MAY TEST TRIVIAL INFORMATION
• REQUIRES MORE QUESTIONS TO ENSURE RELIABILITY
CONSTRUCTION PROCEDURES
• KEEP QUESTION SHORT
• USE ONLY A SINGLE CONCEPT IN EACH QUESTION
• KEEP VOCABULARY SIMPLE
• DO NOT COPY STATEMENTS DIRECTLY FROM THE TEXT
• WHEN POSSIBLE, STATE THE ITEMS POSITIVELY RATHER THAN NEGATIVELY
• AVOID WORDS LIKE ALWAYS, ALL, NEVER, OR NONE
• DO NOT ALLOW MORE THAN 60% OF THE ITEMS TO HAVE THE SAME ANSWER
• AVOID LONG STRINGS OF ITEMS WITH THE SAME ANSWER
• AVOID PATTERNS IN THE ANSWERS
• DO NOT GIVE CLUES IN ONE ITEM TO ANOTHER ITEM
• AVOID INTERDEPENDENT TERMS IN ITEMS

MULTIPLE-CHOICE ITEMS (QUESTIONS)
ADVANTAGES
• MANY ITEMS CAN BE ON TEST AS THEY CAN BE ANSWERED QUICKLY
• QUICKLY SCORED
• ALL LEVELS OF BLOOM’S TAXONOMY (KNOWLEDGE, COMPREHENSION, APPLICATION, ETC.) CAN BE TESTED
• DECREASES CHANCE OF GUESSING CORRECTLY
• STANDARDIZED ANSWER SHEETS CAN BE USED
DISADVANTAGES
• FEWER ITEMS CAN BE ASKED THAN WITH TRUE-FALSE
• TAKES TIME TO THINK OF GOOD DISTRACTOR RESPONSES
• SOME DANGER OF CHEATING
• TO SOME EXTENT, ENCOURAGES MEMORIZATION WITHOUT UNDERSTANDING IMPLICATIONS
• PEOPLE ARE UNABLE TO DEMONSTRATE THE EXTENT OF THEIR KNOWLEDGE AS THEY CAN ONLY RESPOND TO THE ITEMS WRITTEN
CONSTRUCTION
• KEEP STEMS AND RESPONSES SHORT
• MAKE ALL RESPONSES APPROXIMATELY THE SAME LENGTH
• USE APPARENTLY ACCEPTABLE ANSWERS FOR ALL RESPONSES
• USE 3-5 RESPONSES FOR EACH STEM
• IF THE STEM IS AN INCOMPLETE SENTENCE, EACH RESPONSE SHOULD COMPLETE THE SENTENCE
• DO NOT GIVE AWAY THE ANSWER WITH ENGLISH USAGE
• DO NOT GIVE AWAY THE ANSWER TO ONE ITEM IN THE CONTENT OF ANOTHER ITEM
• DO NOT ALLOW THE ANSWER OF
ONE ITEM TO DEPEND ON THE ANSWER TO ANOTHER ITEM
• DO NOT CONSTRUCT A STEM THAT SOLICITS A PERSON’S OPINION
• USE LETTERS (A, B, C, ETC.) TO ENUMERATE RESPONSES TO NUMBERED QUESTIONS
• TRY TO USE EACH LETTER EQUALLY OFTEN AS THE CORRECT RESPONSE
• WHEN POSSIBLE, STATE THE STEM POSITIVELY RATHER THAN NEGATIVELY

MATCHING ITEMS (QUESTIONS)
ADVANTAGES
• SAVES SPACE (AND TREES) BY GIVING THE SAME POTENTIAL ANSWERS FOR SEVERAL ITEMS
• LOWERS THE ODDS OF GUESSING CORRECTLY
• QUICKER TO CONSTRUCT THAN MULTIPLE-CHOICE ITEMS
• STANDARDIZED ANSWER SHEETS CAN BE USED IF THERE ARE FIVE OR FEWER RESPONSES
DISADVANTAGES
• SIMILAR TO TRUE-FALSE ITEMS, USUALLY ONLY TESTS FACTUAL INFORMATION (LOWEST LEVEL OF BLOOM’S TAXONOMY)
• STANDARDIZED ANSWER SHEETS CANNOT BE USED IF THERE ARE MORE THAN FIVE RESPONSES
CONSTRUCTION
• STATE THE ITEMS AND POTENTIAL ANSWERS CLEARLY AND SUCCINCTLY
• NUMBER ITEMS AND LETTER POTENTIAL ANSWERS
• KEEP ALL ANSWERS AND ITEMS ON THE SAME PAGE
• MAKE ALL ITEMS SIMILAR IN CONTENT
• PROVIDE MORE ANSWERS THAN ITEMS TO PREVENT PEOPLE FROM DEDUCING ANSWERS BY ELIMINATION
• IN DIRECTIONS, INDICATE WHETHER OR NOT AN ANSWER CAN BE USED MORE THAN ONCE
• HAVE SEVERAL POTENTIAL ANSWERS FOR EACH ITEM
• IF MORE THAN FIVE RESPONSES EXIST, ARRANGE POTENTIAL ANSWERS IN LOGICAL GROUPINGS (E.G., NUMERICAL ANSWERS TOGETHER, DATES TOGETHER, ETC.)

SHORT-ANSWER AND ESSAY ITEMS (QUESTIONS)
ADVANTAGES
• STUDENTS ARE FREE TO ANSWER ESSAY ITEMS IN THE WAY THAT SEEMS BEST TO THEM
• STUDENTS CAN DEMONSTRATE THE DEPTH OF THEIR KNOWLEDGE
• ENCOURAGES STUDENTS TO RELATE ALL THE MATERIAL TO A TOTAL CONCEPT RATHER THAN JUST LEARN THE FACTS
• ITEMS ARE EASY AND QUICK TO CONSTRUCT
• ALL LEVELS OF BLOOM’S TAXONOMY CAN BE TESTED
DISADVANTAGES
• TIME-CONSUMING TO GRADE
• OBJECTIVITY OF TEST SCORES IS OFTEN LOW
• RELIABILITY AND HENCE VALIDITY OF TEST SCORES ARE OFTEN LOW
• ESSAY ITEMS REQUIRE SOME SKILL IN SELF-EXPRESSION; IF THAT SKILL IS NOT AN INSTRUCTIONAL OBJECTIVE, VALIDITY MAY BE FURTHER LOWERED DUE TO
LACK OF RELEVANCY
• PENMANSHIP AND NEATNESS AFFECT GRADES, WHICH AGAIN LOWERS THE VALIDITY
• THE HALO EFFECT IS PRESENT
CONSTRUCTION
• STATE THE ITEM AS CLEARLY AND CONCISELY AS POSSIBLE
• NOTE ON THE TEST THE APPROXIMATE TIME STUDENTS SHOULD SPEND ON EACH ITEM
• NOTE ON THE TEST THE POINT VALUE FOR EACH ITEM
• CAREFULLY KEY THE TEST BEFORE ADMINISTRATION, WHICH WILL HELP IDENTIFY AMBIGUOUS ITEMS AND IMPROVE OBJECTIVITY (AND HENCE RELIABILITY AND VALIDITY) IN THE GRADING

ADMINISTRATION OF TEST
• TEST SETTING SHOULD BE QUIET, WELL LIGHTED, PROPERLY HEATED, ODOR-FREE, SPACIOUS, AND COMFORTABLE
• STUDENTS SHOULD FACE THE SAME DIRECTION AND BE SPACED OUT
• CONSIDER PARALLEL TESTS IF TESTING MORE THAN ONE CLASS, ALTHOUGH IT IS DIFFICULT AND TIME-CONSUMING TO CONSTRUCT SIMILAR EXAMS THAT TEST THE SAME CONTENT

SCORING PROCEDURES
• OBJECTIVE EXAMS
  – FAST TO GRADE
  – USE COMPUTER OR LAYOVER KEY FOR GRADING
• ESSAY EXAMS
  – USE KEY
  – REMOVE STUDENT’S NAME
  – GRADE EACH QUESTION FOR ALL STUDENTS BEFORE GRADING THE NEXT QUESTION FOR ALL STUDENTS, TO IMPROVE THE EXAM’S OBJECTIVITY

ANALYSIS AND REVISION
• OVERALL DIFFICULTY
• VARIABILITY IN TEST SCORES
• RELIABILITY
• THE DIFFICULTY OF EACH ITEM
• THE DISCRIMINATION, OR VALIDITY, OF EACH ITEM
• QUALITY OF EACH RESPONSE IN A MULTIPLE-CHOICE ITEM

DIFFICULTY AND VARIABILITY
• MEAN REFLECTS OVERALL DIFFICULTY
• THE HIGHER THE MEAN, THE EASIER THE TEST, AND VICE VERSA
• STANDARD DEVIATION REFLECTS VARIABILITY IN TEST SCORES
• THE LARGER THE STANDARD DEVIATION, THE MORE RELIABLE THE TEST AND THE MORE THE TEST DISCRIMINATES AMONG ABILITY LEVELS

RELIABILITY
• RELIABILITY OF TEST SCORES IS USUALLY ESTIMATED USING EITHER THE KUDER-RICHARDSON OR COEFFICIENT ALPHA METHOD (PP.
453-455)

ITEM ANALYSIS
• USED TO DETERMINE THE DIFFICULTY AND VALIDITY OF THE ITEMS (QUESTIONS) AND THE EFFICIENCY OF RESPONSES
• INCLUDES
  – ITEM DIFFICULTY
  – DISCRIMINATION INDEX
  – RESPONSE QUALITY

ITEM DIFFICULTY
• THE PERCENTAGE OF PEOPLE WHO CHOSE THE RIGHT ANSWER
• IT IS LARGE WHEN THE ITEM IS EASY AND SMALL WHEN THE ITEM IS HARD

DISCRIMINATION INDEX (r)
• ITEM VALIDITY, OR ITEM DISCRIMINATION, INDICATES HOW WELL A TEST ITEM DISCRIMINATES BETWEEN THOSE WHO PERFORMED WELL AND THOSE WHO DID POORLY
• POSITIVE DISCRIMINATION INDEX (r) - AN ITEM IS ANSWERED CORRECTLY BY MORE OF THE BETTER PERFORMERS THAN THE WORSE PERFORMERS
• NEGATIVE DISCRIMINATION INDEX (r) - AN ITEM IS ANSWERED CORRECTLY BY MORE OF THE WORSE PERFORMERS THAN THE BETTER PERFORMERS
• DISCRIMINATION INDEX (r) RANGES FROM -1 TO +1
• POSITIVE DISCRIMINATION IS DESIRABLE
• GENERALLY, ITEMS WITH A DIFFICULTY OF ABOUT .50 TEND TO HAVE A GOOD POSITIVE DISCRIMINATION INDEX

RESPONSE QUALITY
• IDEALLY, EACH RESPONSE OF A MULTIPLE-CHOICE ITEM SHOULD BE SELECTED BY AT LEAST SOME OF THE STUDENTS TAKING THE TEST

ITEM ANALYSIS
• VERY TIME-CONSUMING TO DO BY HAND
• THEREFORE, A COMPUTER IS GENERALLY NEEDED TO DO AN ITEM ANALYSIS FOR EACH QUESTION (ITEM) ON A TEST
• A COMPROMISE WOULD BE TO DO AN ITEM ANALYSIS BY HAND ON RANDOMLY SELECTED QUESTIONS OR ON QUESTIONS THAT APPEAR TO BE POOR OR TO HAVE PROBLEMS

REVISING THE TEST
• AFTER CALCULATING THE DIFFICULTY AND DISCRIMINATION INDEX FOR EACH ITEM, THE OVERALL QUALITY OF THE TEST AND OF EACH ITEM MUST BE DETERMINED SO THAT THE TEST CAN BE REVISED AS NECESSARY

STANDARDS FOR TEST REVISION

QUESTIONNAIRES
• QUESTIONNAIRE CONSTRUCTION FOLLOWS PROCEDURES AND STRATEGIES VERY SIMILAR TO THOSE OF KNOWLEDGE TESTS
• BELIEFS, PRACTICES, ATTITUDES, KNOWLEDGE, INSTRUCTOR AND/OR COURSE EVALUATION, PARTICIPANT EVALUATION OF EXERCISE PROGRAM, PARTICIPANT RECALL OF EXERCISE ADHERENCE, BARRIERS TO EXERCISE, ATTITUDES TOWARD
EXERCISE, TENSION REDUCTION, KNOWLEDGE ABOUT BENEFITS OF EXERCISE, SUBSTANCE ABUSE, ETC. ARE OFTEN EXAMINED USING QUESTIONNAIRES

FACTORS AFFECTING SUCCESS OF QUESTIONNAIRES (I.E., COMPLETION AND RETURN OF QUESTIONNAIRES)
• COVER LETTER
• TIMING
• APPEARANCE
• FORM
• LENGTH
• CONTENT
• DEMOGRAPHIC INFORMATION AT END OF QUESTIONNAIRE

QUESTIONNAIRES
• MINIMUM DATA ANALYSIS IS FREQUENCY COUNTS FOR THE RESPONSES TO EACH ITEM (QUESTION)
• OFTEN EACH OF THE DEMOGRAPHIC ITEMS (E.G., MALE OR FEMALE) IS CROSS-TABULATED WITH EACH OF THE NON-DEMOGRAPHIC ITEMS TO SEE IF DIFFERENT CLASSIFICATIONS OF PEOPLE RESPONDED DIFFERENTLY TO THE NON-DEMOGRAPHIC QUESTIONS

QUESTIONS OR COMMENTS?

THANK YOU!
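
SUPPLEMENTARY SKETCH: ITEM ANALYSIS BY COMPUTER
• THE ITEM ANALYSIS SLIDES NOTE THAT A COMPUTER IS GENERALLY NEEDED; THE PYTHON SKETCH BELOW IS ONE POSSIBLE WAY TO COMPUTE ITEM DIFFICULTY, THE DISCRIMINATION INDEX, AND KUDER-RICHARDSON RELIABILITY, NOT NECESSARILY THE METHOD FROM THE TEXT
• ASSUMPTIONS: EVERY ITEM IS SCORED 1 (RIGHT) OR 0 (WRONG); THE DISCRIMINATION INDEX USES A COMMON UPPER-LOWER GROUP FORMULA WITH THE TOP AND BOTTOM 27% OF TOTAL SCORES; RELIABILITY USES THE STANDARD KUDER-RICHARDSON FORMULA 20
• THE FUNCTION NAMES `item_difficulty`, `discrimination_index`, AND `kr20` AND THE SAMPLE DATA ARE HYPOTHETICAL

```python
from statistics import pvariance


def item_difficulty(item_scores):
    """Proportion of people who answered the item correctly (0.0 to 1.0)."""
    return sum(item_scores) / len(item_scores)


def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Upper-lower group index (an assumed, commonly used formula):
    (number correct in top group - number correct in bottom group) / group size.
    The result ranges from -1 to +1."""
    n_group = max(1, round(len(total_scores) * fraction))
    # Rank people by total test score, lowest first.
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    low, high = order[:n_group], order[-n_group:]
    correct_high = sum(item_scores[i] for i in high)
    correct_low = sum(item_scores[i] for i in low)
    return (correct_high - correct_low) / n_group


def kr20(responses):
    """Kuder-Richardson formula 20 for a people-by-items matrix of 0/1 scores.
    Undefined (division by zero) if every person has the same total score."""
    k = len(responses[0])  # number of items
    totals = [sum(person) for person in responses]
    var_total = pvariance(totals)  # population variance of total scores
    sum_pq = 0.0
    for j in range(k):
        p = item_difficulty([person[j] for person in responses])
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical data: 6 people (rows) by 4 items (columns).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
]
totals = [sum(person) for person in responses]

for j in range(len(responses[0])):
    scores = [person[j] for person in responses]
    print(j + 1, item_difficulty(scores), discrimination_index(scores, totals))
print("KR-20:", kr20(responses))
```

• IN THIS SAMPLE, ITEM 2 HAS A DIFFICULTY OF 0.5 AND A DISCRIMINATION INDEX OF 1.0, WHICH MATCHES THE SLIDE'S POINT THAT ITEMS NEAR .50 DIFFICULTY TEND TO DISCRIMINATE WELL
• WITH ONLY SIX PEOPLE THE 27% GROUPS CONTAIN TWO PEOPLE EACH, SO REAL CLASS DATA WOULD GIVE MORE STABLE VALUES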