Grading & Achievement 4-20

SHOWTIME!!
EVALUATING
ACHIEVEMENT
INTRODUCTION
• BOTH CHILDREN AND ADULTS WANT TO KNOW HOW
THEY COMPARE TO OTHERS OR A STANDARD
• PRIMARY ROLE OF TEACHER OR PROGRAM LEADER
IS TO PROMOTE DESIRABLE CHANGES IN PEOPLE
• FOR INSTRUCTIONAL OR PROGRAM PROCESS TO BE
MEANINGFUL:
RELEVANT STATED OBJECTIVES
INSTRUCTION OR PROGRAM MUST BE DESIGNED
TO ACHIEVE OBJECTIVES EFFECTIVELY
RELIABLE AND VALID EVALUATION PROCESS
THAT ASSESSES ACHIEVEMENT
• TESTS ARE ADMINISTERED PRIMARILY TO
FACILITATE THE ACHIEVEMENT OF INSTRUCTIONAL
AND PROGRAM OBJECTIVES
INTRODUCTION
• EDUCATIONAL TESTS CAN BE USED FOR
PLACEMENT, DIAGNOSIS, EVALUATION OF
LEARNING, PREDICTION, PROGRAM
EVALUATION, AND MOTIVATION (CHAPTER 1)
• EVALUATION IS NOT SYNONYMOUS WITH
GRADING AND EVALUATION CAN OCCUR
WITHOUT THE ASSIGNMENT OF GRADES
• A TEACHER WHO PASSES ALL STUDENTS
REGARDLESS OF THEIR LEVEL OF
ACHIEVEMENT, OR A TRAINER WHO DOES
NOT TELL A CLIENT THAT THE CLIENT IS
NOT DOING WELL, IS IGNORING HIS/HER
PROFESSIONAL RESPONSIBILITIES
EVALUATION
• OFTEN FOLLOWS MEASUREMENT TAKING
THE FORM OF A JUDGMENT ABOUT THE
QUALITY OF A PERFORMANCE
• OBJECTIVITY OF EVALUATION INCREASES
WHEN IT IS BASED ON DEFINED STANDARDS
SUCH AS
- REQUIRED LEVELS OF PERFORMANCE
BASED ON TEACHER’S OR TRAINER’S
EXPERIENCE AND/OR CONVICTIONS
- THE RANKED PERFORMANCE OF THE REST
OF THE GROUP
- EXISTING STANDARDS CALLED NORMS
TYPES OF EVALUATION
• FORMATIVE EVALUATION THROUGHOUT THE
PROGRAM MOTIVATES AND INFORMS
PARTICIPANTS OF THEIR PROGRESS AS WELL
AS ALLOWS FOR JUDGMENT REGARDING
THE PROGRAM’S EFFECTIVENESS
• SUMMATIVE EVALUATION IS THE FINAL
MEASUREMENT OF A PARTICIPANT’S
PERFORMANCE AT THE END OF A PROGRAM
WHICH OFTEN INVOLVES COMPARISON
AMONG STUDENTS OR STUDENTS TO NORMS
OR AN IDEAL STANDARD
STANDARDS FOR EVALUATION: CRITERION-REFERENCED STANDARDS
• REPRESENTS THE LEVEL OF
PERFORMANCE THAT ALL INDIVIDUALS
SHOULD BE ABLE TO ACHIEVE GIVEN
PROPER INSTRUCTION
• MUST BE USED WITH EXPLICIT
OBJECTIVES
• USED IN FORMATIVE EVALUATION TO
DIAGNOSE WEAKNESSES AND TO
DETERMINE WHEN PARTICIPANTS ARE
READY TO PROGRESS
• STANDARDS TEND TO BE PASS OR FAIL
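As a minimal sketch of a pass/fail decision against a criterion-referenced standard: the function name `meets_criterion`, the push-up data, and the standard of 20 are all hypothetical and are not taken from the slides.

```python
def meets_criterion(score, standard, higher_is_better=True):
    """Return True when a score meets a criterion-referenced standard."""
    if higher_is_better:
        return score >= standard
    # For timed events (e.g. a mile run), a lower score is the better one.
    return score <= standard


# Hypothetical data: push-up counts judged against an assumed standard of 20.
pushups = {"Ana": 25, "Ben": 18, "Cam": 20}
results = {
    name: ("PASS" if meets_criterion(count, 20) else "FAIL")
    for name, count in pushups.items()
}
print(results)
```

Each participant is judged only against the standard, so every participant can pass (or fail) regardless of how the others perform.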
EXAMPLES OF CRITERION-REFERENCED STANDARDS
PROCEDURES TO DEVELOP CRITERION-REFERENCED
STANDARDS
• IDENTIFY THE SPECIFIC BEHAVIORS THAT MUST BE
ACHIEVED TO ACCOMPLISH A BROAD OBJECTIVE
• DEVELOP CLEARLY DEFINED OBJECTIVES THAT
CORRESPOND TO THE SPECIFIC BEHAVIORS
• DEVELOP STANDARDS THAT GIVE EVIDENCE OF
SUCCESSFUL ACHIEVEMENT OF THE OBJECTIVE;
THESE STANDARDS MAY BE BASED ON LOGIC, EXPERT
OPINION, RESEARCH LITERATURE, AND/OR ANALYSIS
OF TEST SCORES
• TRY THE SYSTEM AND EVALUATE THE STANDARDS;
DETERMINE WHETHER THE STANDARDS MUST BE
ALTERED AND DO SO IF NECESSARY
“IF STANDARDS ARE TOO HIGH, VERY FEW PEOPLE WILL
PASS AND RECEIVE POSITIVE REINFORCEMENT; IF
STANDARDS ARE TOO LOW, MANY WILL PASS THAT
MAY HAVE FALSE ILLUSIONS OF THEIR CAPABILITIES”
STANDARDS OF EVALUATION: NORM-REFERENCED STANDARDS
• COMPARE THE PERFORMANCES OF
PEERS
• USED IN SUMMATIVE EVALUATION TO
DETERMINE IF BROAD PROGRAM
OBJECTIVES HAVE BEEN MET
• LEVELS OF PERFORMANCE ARE
ESTABLISHED THAT DISTINGUISH
BETWEEN ABILITY GROUPS RANGING
FROM “HIGH ABILITY” TO “LOW
ABILITY”
GRADING
• GRADING IS A TWO-FOLD PROCESS - THE SELECTION
OF THE MEASUREMENTS (SUBJECTIVE OR OBJECTIVE)
THAT FORM THE BASIS OF THE GRADE AND
THE ACTUAL CALCULATION
• INSTRUCTIONAL PROCESS BEGINS WITH
INSTRUCTIONAL OBJECTIVES AND CULMINATES
WITH EVALUATION
• GRADES SHOULD BE BASED ON INSTRUCTIONAL
OBJECTIVES AND THE SCORES FROM RELIABLE AND
VALID TESTS
• SELECTION OF TESTING INSTRUMENTS SHOULD
CONSIDER:
WHAT ARE THE INSTRUCTIONAL OBJECTIVES?
WERE THE STUDENTS TAUGHT IN ACCORDANCE
WITH THESE OBJECTIVES?
DOES THE TEST YIELD SCORES THAT REFLECT
ACHIEVEMENT OF THE OBJECTIVES?
GRADING ISSUES
• IS IT A MAJOR OBJECTIVE OF THE PHYSICAL EDUCATION
PROGRAM?
• DO ALL STUDENTS HAVE IDENTICAL OPPORTUNITIES TO
DEMONSTRATE THEIR ABILITY RELATIVE TO THE ATTRIBUTE?
• CAN THE ATTRIBUTE BE MEASURED SO THAT THE TEST SCORES
ARE RELIABLE AND THE INTERPRETATIONS OF THE SCORES
VALID?
• WERE THE GRADING POLICIES EXPLAINED AT THE BEGINNING
OF THE PROGRAM?
• WERE THE GRADES BASED ON A SUFFICIENT AMOUNT OF VALID
EVIDENCE?
• WHAT SHOULD THE RANGE IN GRADING BE?
• SHOULD THE RANGE IN GRADING BE THE SAME FOR A
BEGINNING COURSE COMPARED TO AN ADVANCED COURSE?
• SHOULD THE OVERALL QUALITY OF THE CLASS AFFECT THE
GRADING DISTRIBUTION?
• DOES THE GRADING REPRESENT ONLY ACHIEVEMENT OR
ACHIEVEMENT AND STUDENT EFFORT AS WELL?
• IF PASS-FAIL GRADES ARE ASSIGNED WILL ANYONE FAIL?
GENERALLY ACCEPTED GRADING
PHILOSOPHY
• GRADE A STUDENT RECEIVES SHOULD
NOT DEPEND ON
- THE SEMESTER OR YEAR IN WHICH
THE CLASS IS TAKEN
- THE INSTRUCTOR, PARTICULARLY IF
SEVERAL INSTRUCTORS TEACH THE
COURSE
- OTHER STUDENTS IN THE COURSE
GRADING METHODS
• NATURAL BREAKS
• TEACHER’S STANDARD
• RANK ORDER
• NORMS
GRADING METHODS: NATURAL BREAKS
• SCORES ARE LISTED FROM BEST TO WORST
• EACH BREAK OR GAP IS A CUT-OFF POINT
FOR A LETTER GRADE
• USEFUL METHOD FOR TEACHERS WHO DO
NOT BELIEVE IN SPECIFYING THE POSSIBLE
GRADES AND PERCENTAGES FOR THESE
GRADES
• POOREST METHOD OF ASSIGNING GRADES
• NO SEMESTER-TO-SEMESTER CONSISTENCY
• EACH STUDENT’S GRADE IS DEPENDENT ON
THE PERFORMANCE OF OTHER STUDENTS IN
THE CLASS
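The natural-breaks idea can be sketched in code as follows. The slides leave the choice of gaps to the teacher's eye; picking the largest gaps automatically is an assumption made here for illustration, and the function name and scores are hypothetical.

```python
def natural_break_grades(scores, letters=("A", "B", "C", "D", "F")):
    """List scores best to worst and cut at the largest gaps (assumed rule)."""
    ordered = sorted(scores, reverse=True)
    # Gap between each score and the next lower one; tied scores (gap 0) are never split.
    gaps = [
        (ordered[i] - ordered[i + 1], i)
        for i in range(len(ordered) - 1)
        if ordered[i] > ordered[i + 1]
    ]
    n_cuts = min(len(letters) - 1, len(gaps))
    largest = sorted(gaps, key=lambda gap: (-gap[0], gap[1]))[:n_cuts]
    cut_positions = sorted(position for _, position in largest)

    graded = []
    level = 0
    for i, score in enumerate(ordered):
        graded.append((score, letters[level]))
        # Move to the next letter once the score just before a gap has been graded.
        if level < len(cut_positions) and i == cut_positions[level]:
            level += 1
    return graded


# Hypothetical class scores.
class_scores = [95, 94, 93, 88, 87, 86, 79, 78, 70, 69, 55]
graded = natural_break_grades(class_scores)
for score, letter in graded:
    print(score, letter)
```

Adding or removing a single student can move a gap, which is one way to see why this method gives no semester-to-semester consistency.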
GRADING METHODS: TEACHER’S STANDARD
• GRADES ARE BASED ON THE TEACHER’S PERCEPTION
OF WHAT IS FAIR AND APPROPRIATE, SOMETIMES
WITHOUT ANALYZING ANY DATA
• EX.: 90-100 A, 80-89 B, ETC
• CONSISTENT STANDARDS FROM YEAR TO YEAR ARE
POSSIBLE
• STUDENT’S PERFORMANCE IS NOT DEPENDENT ON
THE PERFORMANCE OF OTHER STUDENTS
• GOOD METHOD FOR EXPERIENCED TEACHERS WHO
HAVE REASONABLE STANDARDS OR EXPECTATIONS
OF STUDENTS’ ABILITIES
• NORM-REFERENCED STANDARDS DEVELOPED USING
THE CRITERION-REFERENCED STANDARDS SET BY
THE TEACHER
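A teacher's standard amounts to a fixed lookup from score to letter grade. In this sketch the 90 and 80 cut-offs come from the example on the slide; the 70 and 60 cut-offs and the names `CUTOFFS` and `teacher_standard_grade` are assumptions added for illustration.

```python
# (minimum percent, letter), best grade first.
# 90 and 80 are from the slide example; 70 and 60 are assumed.
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def teacher_standard_grade(percent, cutoffs=CUTOFFS, lowest="F"):
    """Return the letter grade for a percent score under fixed cut-offs."""
    for minimum, letter in cutoffs:
        if percent >= minimum:
            return letter
    return lowest


for percent in (95, 89, 80, 59):
    print(percent, teacher_standard_grade(percent))
```

Because the cut-offs do not change with the class, a student's grade does not depend on the performance of other students.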
GRADING METHODS: RANK ORDER
• STRAIGHTFORWARD, NORM-REFERENCED METHOD
OF GRADING
• TEACHER DECIDES WHICH LETTER GRADES WILL BE
ASSIGNED AND WHAT PERCENTAGE OF THE CLASS
SHOULD RECEIVE EACH LETTER GRADE
• SCORES ARE ORDERED AND GRADES ARE ASSIGNED
• ADVANTAGES INCLUDE THAT IT IS QUICK AND EASY
TO USE AND ALLOWS GRADES TO BE DISTRIBUTED AS
WANTED
• DISADVANTAGES INCLUDE THAT A STUDENT’S
GRADE IS DEPENDENT ON THE GRADES OF OTHER
STUDENTS AND THAT NO ALLOWANCE IS MADE FOR
THE QUALITY OF THE CLASS WHICH RESULTS IN
GRADES VARYING FROM SEMESTER TO SEMESTER
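The rank-order steps can be sketched as below. The percentages (10% A, 20% B, 40% C, 20% D, 10% F), the student names, and the scores are hypothetical; ties are not given special handling in this sketch.

```python
def rank_order_grades(scores, distribution):
    """Assign letter grades by rank.

    scores: dict of student name -> score (higher is better).
    distribution: list of (letter, percent of class), best grade first.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    n = len(ranked)
    grades = {}
    start = 0
    cumulative_percent = 0
    for letter, percent in distribution:
        cumulative_percent += percent
        end = round(cumulative_percent * n / 100)
        for name, _ in ranked[start:end]:
            grades[name] = letter
        start = end
    # Anyone left over because of rounding receives the lowest grade.
    for name, _ in ranked[start:]:
        grades[name] = distribution[-1][0]
    return grades


# Hypothetical class of 10 and a hypothetical grade distribution.
scores = {f"S{i}": 100 - 5 * i for i in range(10)}
distribution = [("A", 10), ("B", 20), ("C", 40), ("D", 20), ("F", 10)]
grades = rank_order_grades(scores, distribution)
print(grades)
```

The lowest-ranked student receives an F here even with a score that might earn a C under a fixed standard, which illustrates the dependence on the rest of the class.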
GRADING METHODS: NORMS
• NORMS BASED ON ANALYSIS OF THE DATA, NOT ON SUBJECTIVE
STANDARDS CHOSEN BY THE TEACHER
• DEVELOPED BY GATHERING SCORES FOR A LARGE NUMBER OF
INDIVIDUALS WITH SIMILAR DEMOGRAPHICS
• DATA IS STATISTICALLY ANALYZED AND PERFORMANCE
STANDARDS ARE THEN CONSTRUCTED BASED ON THE ANALYSIS
• ADVANTAGES INCLUDE
THE STUDENT’S GRADE IS NOT BASED ON THE
PERFORMANCE OF THE GROUP OR CLASS BEING
EVALUATED
THE NORMS CAN BE USED FOR SEVERAL YEARS (THEREBY
PROVIDING CONSISTENCY FROM SEMESTER TO SEMESTER)
BEFORE THEY NEED TO BE RE-EVALUATED AND PERHAPS
REVISED
• HOWEVER, THE TEACHER STILL NEEDS TO DECIDE HOW
LETTER GRADES WILL BE ASSIGNED TO THE NORMS
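One way to picture grading with norms is a percentile rank computed against a previously collected reference group, followed by a teacher-chosen mapping from percentile bands to letters. Everything named here is an assumption for illustration: the reference scores, the band cut-offs, and the function names are hypothetical.

```python
from bisect import bisect_left


def percentile_rank(score, sorted_norm_scores):
    """Percent of the reference group scoring below the given score."""
    below = bisect_left(sorted_norm_scores, score)
    return 100.0 * below / len(sorted_norm_scores)


def grade_from_norms(score, norm_scores, bands):
    """bands: list of (minimum percentile rank, letter), best grade first."""
    rank = percentile_rank(score, sorted(norm_scores))
    for minimum_rank, letter in bands:
        if rank >= minimum_rank:
            return letter
    return bands[-1][1]


# Hypothetical reference group (e.g. sit-up counts from earlier years).
norm_scores = list(range(1, 101))
# Hypothetical teacher decision on how percentile bands map to letters.
bands = [(90, "A"), (70, "B"), (30, "C"), (10, "D"), (0, "F")]

for score in (91, 50, 11, 5):
    print(score, grade_from_norms(score, norm_scores, bands))
```

The current class never enters the calculation, so the same norms can be reused across semesters; only the `bands` choice remains a judgment call.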
GRADING METHODS: NORMS
HOW WOULD YOU ASSIGN A LETTER GRADE TO THESE
NORMS?
FINAL GRADES
• ASSIGNMENT OF A FINAL GRADE OR FINAL
CLASSIFICATION (FITNESS OR REHAB) MUST
BE BASED ON ALL AVAILABLE INFORMATION
• TEACHER SHOULD CHOOSE AND EXPLAIN
THE FINAL GRADING SYSTEM AT THE
BEGINNING OF A PROGRAM
• THREE METHODS OF ASSIGNING FINAL
GRADES
- SUM OF LETTER GRADES
- POINT SYSTEM
- SUM OF THE T-SCORES
SUM OF THE LETTER GRADES
• USED WHEN TEST SCORES REFLECT
DIFFERENT UNITS OF MEASURE THAT
CANNOT BE SUMMED
• SCORES ON TESTS ARE CONVERTED TO
LETTER GRADES
• LETTER GRADES ON EACH TEST ARE
CONVERTED TO POINTS (A+ = 14, A = 13, A- = 12,
B+ = 11, ETC. DOWN TO F = 1 AND F- = 0)
• POINTS ON ALL TESTS ARE ADDED TOGETHER
AND DIVIDED BY THE NUMBER OF TESTS TO
GET AN AVERAGE SCORE (POINT VALUE),
WHICH IS CONVERTED BACK INTO A LETTER
GRADE USING THE 14-POINT SCALE ABOVE
SUM OF THE LETTER GRADES WHEN
TESTS ARE EQUALLY WEIGHTED
• USING TABLE 5.5 AS AN EXAMPLE THAT HAD 5 TESTS
• SUM = 45; 45 / 5 TESTS = 9
• AVERAGE SCORE (POINT VALUE) OF 9 = B-
SUM OF THE LETTER GRADES WHEN
TESTS ARE EQUALLY WEIGHTED
• USING TABLE 5.6 AS AN EXAMPLE THAT HAD 5 TESTS
• SUM = 59; 59 / 5 TESTS = 11.8
• AVERAGE SCORE (POINT VALUE) OF 11.8 = B+ AS 12 IS NEEDED FOR AN “A-”
• DOES THIS SEEM FAIR LOOKING AT THE TEST SCORES?
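The arithmetic in the two examples above can be sketched in a few lines. The point scale follows the slide (A+ = 14 down to F- = 0, with the values in between filled in by the same pattern). Tables 5.5 and 5.6 are not reproduced in this text, so the individual letter grades below are hypothetical sets chosen only to give the stated sums of 45 and 59. Truncating the average down is inferred from the 11.8 = B+ example.

```python
import math

# 14-point scale from the slide: F- = 0, F = 1, ..., A = 13, A+ = 14.
LETTERS = ["F-", "F", "F+", "D-", "D", "D+", "C-", "C", "C+",
           "B-", "B", "B+", "A-", "A", "A+"]
POINTS = {letter: value for value, letter in enumerate(LETTERS)}


def final_letter_grade(test_letters):
    """Average the letter-grade points and convert back to a letter."""
    total = sum(POINTS[letter] for letter in test_letters)
    average = total / len(test_letters)
    # The full point value must be reached, so 11.8 stays at B+ (11).
    return total, average, LETTERS[math.floor(average)]


# Hypothetical grades that sum to 45 and 59, matching the slide totals.
student_1 = ["B", "B-", "C+", "B-", "B-"]
student_2 = ["A", "A-", "A-", "B+", "B+"]

print(final_letter_grade(student_1))
print(final_letter_grade(student_2))
```

The second student has two A- grades and an A yet finishes with a B+, which is the fairness question the slide raises.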
DRAWBACKS OF THE SUM OF THE
LETTER GRADES METHOD
• LOSE INFORMATION BY CONVERTING TEST SCORES
TO POINT VALUES
• 96% OR 93% ARE BOTH AN “A” OR 13 POINTS
• WASTE OF TIME TO CALCULATE THE MEAN
• NO ALLOWANCE IS MADE IN THE FINAL GRADE FOR
THE REGRESSION EFFECT AND THUS VERY FEW HIGH
OR LOW GRADES ARE GIVEN, MOST GRADES ARE IN
THE MIDDLE OF THE RANGE
• REGRESSION EFFECT: A STUDENT WHO EARNS AN
“A” OR AN “F” ON ONE TEST IS MORE LIKELY ON THE
NEXT TEST TO EARN A GRADE CLOSER TO “C” THAN TO
REPEAT THE FIRST PERFORMANCE
SUM OF THE LETTER GRADES WHEN
TESTS ARE UNEQUALLY WEIGHTED
POINT SYSTEMS
• OFTEN USED BY CLASSROOM TEACHERS SO THAT ALL
TEST SCORES ARE IN THE SAME UNIT OF MEASURE AND
CAN BE EASILY COMBINED
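A rough sketch of a point system: every test is scored in points, so the points can simply be added. The test names, point values, and the function name `point_system_total` are hypothetical; the slides do not specify how totals are converted to letter grades, so that step is left out.

```python
def point_system_total(earned, possible):
    """Return (points earned, points possible, percent of points earned)."""
    total_earned = sum(earned[test] for test in possible)
    total_possible = sum(possible.values())
    return total_earned, total_possible, 100.0 * total_earned / total_possible


# Hypothetical course worth 100 points in total.
possible = {"SKILLS TEST": 40, "WRITTEN TEST": 30, "FITNESS TEST": 30}
earned = {"SKILLS TEST": 34, "WRITTEN TEST": 27, "FITNESS TEST": 24}

print(point_system_total(earned, possible))
```

The number of points assigned to each test acts as its weight in the final total.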
SUM OF THE T-SCORES
• CHANGE TEST SCORES TO T-SCORES AND
SUM THE T-SCORES AS PREVIOUSLY
DISCUSSED
• POSSIBLE TO WEIGHT EACH TEST
DIFFERENTLY IN SUMMING THE T-SCORES
BY USING THE PROCEDURES
JUST OUTLINED FOR WEIGHTING
LETTER-GRADE POINTS
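As a sketch of summing weighted T-scores, using the standard definition T = 50 + 10(X - mean) / SD: the use of the population standard deviation, the test names, the weights, and the raw scores are assumptions made for illustration.

```python
from statistics import mean, pstdev


def t_scores(raw_scores, lower_is_better=False):
    """Convert raw scores on one test to T-scores (mean 50, SD 10)."""
    m = mean(raw_scores)
    sd = pstdev(raw_scores)  # population SD is an assumption of this sketch
    if sd == 0:
        raise ValueError("all scores are identical; T-scores are undefined")
    sign = -1 if lower_is_better else 1
    return [50 + 10 * sign * (x - m) / sd for x in raw_scores]


def weighted_t_sum(tests, weights):
    """tests: dict of test name -> raw scores, same student order on every test."""
    n_students = len(next(iter(tests.values())))
    totals = [0.0] * n_students
    for name, raw_scores in tests.items():
        for i, t in enumerate(t_scores(raw_scores)):
            totals[i] += weights[name] * t
    return totals


# Hypothetical scores for three students on two tests in different units.
tests = {"PULL-UPS": [10, 20, 30], "WRITTEN TEST": [30, 20, 10]}
weights = {"PULL-UPS": 2, "WRITTEN TEST": 1}  # hypothetical weights
totals = weighted_t_sum(tests, weights)
print([round(total, 1) for total in totals])
```

Because T-scores keep each student's distance from the mean, less information is lost than when scores are first collapsed into letter grades.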
OTHER EVALUATION TECHNIQUES
• BEST 5 PEOPLE RECEIVE A SCHOLARSHIP, JOB, PROMOTION,
ETC
– RANK-ORDER SITUATION WHERE THE 5 BEST PEOPLE ARE
REWARDED (PASS) AND THE REST GET NOTHING (FAIL)
• NUMBER OF PEOPLE AWARDED OR RECOGNIZED IS NOT
LIMITED
– CRITERION-REFERENCED SITUATION THAT IDEALLY NEEDS
A GOLD STANDARD OR A STANDARD ESTABLISHED BY
EXPERT(S)
• PHYSICAL THERAPIST OR ATHLETIC TRAINER SETS A
STANDARD FOR RELEASING PEOPLE FROM A THERAPY PROGRAM
– CRITERION-REFERENCED STANDARD, WHERE THE STANDARD
SHOULD BE BASED ON THE MINIMUM STRENGTH OR ABILITY
NEEDED TO FUNCTION IN DAILY LIFE
AUTHENTIC ASSESSMENT
“AN ATTEMPT TO EVALUATE PEOPLE IN A
REAL-LIFE OR MORE “AUTHENTIC” SETTING”
CHARACTERISTICS OF AUTHENTIC
ASSESSMENT
• AUTHENTIC ASSESSMENTS PRESENT CHALLENGES
THAT ARE REPRESENTATIVE OF REAL LIFE
• AUTHENTIC ASSESSMENTS REQUIRE STUDENTS TO
DEMONSTRATE HIGHER-LEVEL THINKING
• STUDENTS KNOW THE STANDARDS FOR ASSESSMENT
FROM THE BEGINNING ALLOWING THEM TO
CONSTANTLY RECEIVE FEEDBACK ABOUT THEIR
PROGRESS
• AUTHENTIC ASSESSMENTS BECOME PART OF THE
CURRICULUM RESULTING IN TEACHERS TEACHING
TO THE TEST
• STUDENTS OFTEN PRESENT THE CULMINATION OF
THE AUTHENTIC ASSESSMENT PUBLICLY
• THERE IS AN EMPHASIS ON PROCESS (HOW
STUDENTS ARRIVE AT THE CORRECT ANSWER) AND
NOT JUST PRODUCT (CORRECT ANSWER)
TYPES OF AUTHENTIC ASSESSMENT
• STUDENT PROJECTS
• STUDENT LOGS
• STUDENT JOURNALS
• PEER OBSERVATION
• SELF-ASSESSMENT
• GROUP PROJECTS
• PORTFOLIOS
• EVENT TASKS
• TEACHER OBSERVATION
RUBRICS
• OFTEN USED IN AUTHENTIC ASSESSMENT
• PERSON’S PERFORMANCE IS COMPARED TO
CRITERIA SPECIFIED IN THE RUBRIC USING A
SCALE THAT RANGES FROM 3 (OUTSTANDING,
ACCEPTABLE, AND DEFICIENT) TO 5
(EXCELLENT, GOOD, SATISFACTORY, FAIR,
AND POOR) LEVELS
• WHEN DESIGNING THE RUBRIC:
– DECIDE WHICH ERRORS WOULD BE MOST
JUSTIFIABLE FOR DISCRIMINATING BETWEEN
ABILITY LEVELS
– BE AS SPECIFIC AS POSSIBLE WHEN DESIGNING
RUBRICS AS THIS WILL INCREASE OBJECTIVITY
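A rubric with the three-level scale named above can be sketched as a simple lookup. The level names come from the slide; the numeric values 1 to 3, the volleyball-serve criteria, and the function name are hypothetical.

```python
# Level names are from the slide; the numeric values are an assumption.
LEVELS = {"DEFICIENT": 1, "ACCEPTABLE": 2, "OUTSTANDING": 3}


def score_with_rubric(ratings, levels=LEVELS):
    """ratings: dict of criterion -> level name. Returns (total, maximum)."""
    total = sum(levels[level] for level in ratings.values())
    maximum = max(levels.values()) * len(ratings)
    return total, maximum


# Hypothetical ratings of one student's volleyball serve.
ratings = {
    "TOSS": "OUTSTANDING",
    "CONTACT": "ACCEPTABLE",
    "FOLLOW-THROUGH": "ACCEPTABLE",
}
total, maximum = score_with_rubric(ratings)
print(total, "OF", maximum)
```

Writing a specific descriptor for each criterion at each level is what makes two raters more likely to choose the same level.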
CONCERNS WITH AUTHENTIC ASSESSMENT
• QUALITY (VALIDITY, RELIABILITY, AND OBJECTIVITY) OF
AUTHENTIC ASSESSMENT
• HOW WELL DOES THE AUTHENTIC ASSESSMENT TEST RELATE
TO OTHER MEASURES (CRITERION-RELATED VALIDITY)
- ONE MEASURE OF VOLLEYBALL SKILL SHOULD BE RELATED
TO OTHER MEASURES OF VOLLEYBALL SKILL
• ABILITY OF THE ASSESSMENT TO PREDICT FUTURE
PERFORMANCE (PREDICTIVE VALIDITY)
- CAN AUTHENTIC ASSESSMENT OF CURRENT FITNESS PREDICT
FUTURE FITNESS BEHAVIOR?
• DOES THE AUTHENTIC ASSESSMENT COVER ALL AREAS OF THE
ACTIVITY (CONTENT VALIDITY)
- IS THE AUTHENTIC ASSESSMENT OF SOME SOFTBALL
SKILLS REFLECTIVE OF ALL THE COMPONENTS OF
SOFTBALL?
• DETAILED RUBRIC AND PRACTICE SCORING WITH THE RUBRIC
CAN ENHANCE THE RELIABILITY AND OBJECTIVITY OF
AUTHENTIC ASSESSMENT
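One simple way to check objectivity after practice scoring is the percent of performances on which two raters gave the same rubric level. Percent agreement is a technique introduced here for illustration, not one named in the slides, and the ratings are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Percent of performances given the same rubric level by both raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical rubric levels (1 to 3) given by two raters to five students.
rater_a = [3, 2, 2, 1, 3]
rater_b = [3, 2, 1, 1, 3]
agreement = percent_agreement(rater_a, rater_b)
print(agreement)
```

A low value suggests the rubric descriptors need to be made more specific or the raters need more practice scoring.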
CHARACTERISTICS OF GOOD AUTHENTIC
ASSESSMENT
• MEANINGFUL FOR BOTH TEACHERS AND STUDENTS
• SERVES AS MOTIVATION FOR PERFORMANCE
• EVALUATES ATTRIBUTES THAT ARE IMPORTANT TO
BOTH TEACHERS AND STUDENTS
• REQUIRES DEMONSTRATION OF COMPLEX
COGNITION
• EXEMPLIFIES CURRENT STANDARDS OF CONTENT
QUALITY
• MINIMIZES THE EFFECTS OF IRRELEVANT SKILLS
• POSSESSES EXPLICIT STANDARDS FOR RATING OR
JUDGMENT
PROGRAM EVALUATION
• SUCCESS OF A PROGRAM DEPENDS LESS ON ITS
PHYSICAL CHARACTERISTICS (E.G., FACILITIES AND
EQUIPMENT) AND MORE ON THE MANNER IN WHICH
THEY ARE USED IN THE INSTRUCTIONAL OR
PROGRAM PROCESS
• ARE STUDENTS ACHIEVING IMPORTANT
INSTRUCTIONAL OBJECTIVES?
• ARE PARTICIPANTS BENEFITING FROM THE
PROGRAM?
• ARE PROGRAM OBJECTIVES BEING MET?
• BOTH FORMATIVE AND SUMMATIVE EVALUATION
ARE REQUIRED FOR PROGRAM EVALUATION
• REQUIRES PLANNED DATA COLLECTION FROM
TESTING AND/OR GOOD DAILY RECORD KEEPING
PROGRAM EVALUATION
• FORMATIVE EVALUATION IS THE PROCESS OF
JUDGING PERFORMANCE WITH REFERENCE TO
AN ESTABLISHED STANDARD (CRITERION)
• FORMATIVE EVALUATION REQUIRES
SELECTION OF WELL-DEFINED PROGRAM
OBJECTIVES AND ESTABLISHMENT OF
REALISTIC STANDARDS
• VALUE OF FORMATIVE EVALUATION IS
THAT IF IT SIGNALS THAT SOMETHING IS
WRONG, ACTION CAN STILL BE TAKEN TO
ADJUST AND IMPROVE THE PROGRAM
PROGRAM EVALUATION
• SUCCESS OF A PROGRAM IS REFLECTED
IN TERMS OF HOW WELL A PROGRAM
ACHIEVES ITS BROAD, OVERALL
OBJECTIVES
– SCHOOL PERFORMANCES ARE OFTEN
COMPARED TO NATIONAL, STATEWIDE, OR
LOCAL NORMS
– IN FITNESS PROGRAMS PARTICIPANT
PERFORMANCE IS OFTEN COMPARED TO
NATIONAL OR LOCAL STANDARDS OR
PERHAPS TO LONG-TERM EXERCISE
ADHERENCE PATTERNS
PROGRAM IMPROVEMENT
• EVALUATION IS A DYNAMIC DECISION-MAKING
PROCESS THAT WORKS
TOWARD PROGRAM IMPROVEMENT
• FORMATIVE EVALUATION LEADS TO
HIGHER-LEVEL ACHIEVEMENT OF
OBJECTIVES EVALUATED
SUMMATIVELY
• PRIMARY OBJECTIVE OF PROGRAM
DEVELOPERS SHOULD BE IMPROVED
PARTICIPANT PERFORMANCE OVER
TIME
COMMENTS OR QUESTIONS??
THANK YOU, THANK YOU VERY MUCH!!