Grading
Principles and Practices
James Wollack
Department of Educational Psychology
Office of Testing & Evaluation Services
UW-Madison
Teaching Academy
September 28, 2012
Walvoord & Anderson (1995)
• Which assessment method…
  • has nearly universal faculty participation
  • enjoys superb student participation
  • is never accused of violating academic freedom
  • provides detailed diagnostic assessment of student learning
  • is tightly linked to teaching
  • has a tight feedback loop into classroom learning and into teacher planning
  • is cheap to implement?
• Grading, when well done
Value of Grading to Instructors
• To what extent are the students achieving the stated
course goals?
• Are students understanding the concepts?
• How should class time be allocated for the current topic?
• Can this topic be introduced more effectively?
• What parts of this course are students finding most
valuable or most difficult?
• How should this course be altered next time it is taught?
Value of Grading to Students
• Do I know what my instructor thinks is most important?
• Am I mastering the course content?
• How can I improve the way I study in this course?
• What grade am I earning in this course?
Value of Grading
• Assessment and evaluation (i.e., grading) are essential
components of effective instruction
• Research suggests that faculty spend 30% of their
planning/instructional time on assessment activities
• Clearly, we are assessment driven and recognize the
value of proper evaluation
• Why, then, do we not evaluate
our own grading practices?
Evaluating One’s Grading
• Should assessments be learning experiences for students?
• What evidence of student learning should figure into
grades?
• How should these data be combined to form final grades?
• Ought grades reflect developmental nature of learning?
• How can my grading criteria remain reasonable and
comparable when assessments keep changing?
Components of Meaningful Grades
• Although grades serve multiple purposes, their primary
purpose should be to reflect the extent to which a
student has mastered the essential learning outcomes.
• Grades should be reliable
• Students’ demonstrated mastery should depend little on
assessment modality or specific tasks
• Students’ work should be scored similarly (using same criteria)
regardless of who is evaluating or when it is evaluated
• Grades should be valid
• Based on assessments closely linked to course objectives and
desired learning outcomes
• Speak to students’ mastery of ELOs
#1 Grading Rule
• Focus on Fairness and Consistency
• Fairness = points/grades accurately reflect the weighting of
course objectives.
• Consistency = similar scoring/grading for comparable work/depth
of understanding.
Making Grades Fair and Consistent
• Consider what you want your students to learn
• Select assignments/assessments that measure what you
value most
• Course Assessment Map: Description of how various learning
goals will be assessed
Test Blueprints
• Develop test blueprints for exams
• Identifies the objectives/skills to be measured
• Objectives should be specific, but general enough to allow for
multiple questions in that category.
• Blueprint should specify the relative weight of each category
Sample Biology Blueprint
• General Biology (45%)
  • Cellular and Molecular Biology
  • Diversity of Life Forms
  • Health
• Microbiology (35%)
  • Microorganisms
  • Infectious Diseases & Prevention
  • Microbial Ecology
  • Medical Microbiology
  • Immunity
• Human Anatomy and Physiology (20%)
  • Structure
  • Systems
Sample U.S. History Blueprint
• Domestic Affairs (25%)
  • American Political System
  • Major Social Problems
• Global Affairs (20%)
• Civil Rights/Human Rights (20%)
• Economics (25%)
  • Economic Transformation of US
  • Government involvement in economy
• Culture (10%)
More on Blueprints
• Develop blueprints before making tests
• Can build into blueprint a measure of cognitive load
• Bloom’s taxonomy
• Overlap with textbook/lecture
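A blueprint is easy to keep as a simple data structure and to check against a draft exam. The sketch below is illustrative only: the category weights reuse the sample biology blueprint above, and the 50-item draft exam is a made-up assumption.

```python
# Minimal sketch: check a draft exam against a blueprint's intended weights.
# Weights follow the sample biology blueprint; the item counts are invented.

blueprint = {                      # intended share of the exam
    "General Biology": 0.45,
    "Microbiology": 0.35,
    "Human Anatomy and Physiology": 0.20,
}

draft_exam = {                     # questions actually written so far (hypothetical)
    "General Biology": 21,
    "Microbiology": 19,
    "Human Anatomy and Physiology": 10,
}

total_items = sum(draft_exam.values())
for category, weight in blueprint.items():
    target = round(weight * total_items)
    actual = draft_exam.get(category, 0)
    flag = "" if actual == target else "  <-- adjust"
    print(f"{category}: {actual} items, target {target}{flag}")
```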
Grading Subjective Assessments
• Embrace the grading challenge
• There is no one right way to score artifacts of student learning.
• Grading is socially constructed and context-dependent
• Expert graders will likely differ—sometimes considerably—in
their evaluations.
• Overall harshness
• Importance of material
• Magnitude of student errors
Develop a Scoring Rubric
• A scoring rubric identifies the basis for awarding and
subtracting points at each phase of each item.
• Nature of scoring rubric will change depending on
whether item is computational or a narrative response.
Class Presentation Scoring Rubric (Sum-to-total)
• Knowledge/Understanding: 30% (6 points)
• Thinking/Inquiry: 30% (6 points)
• Communication: 20% (4 points)
• Use of Visual Aids: 20% (4 points)
• 20-point assignment
• For each domain, construct a scale describing characteristics of each scale point (or range of points)
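Once each domain has a point scale, the sum-to-total score is simply the sum of the domain ratings. A minimal sketch, assuming the 6/6/4/4 point allocation shown above; the sample ratings are invented.

```python
# Sketch of scoring the 20-point sum-to-total presentation rubric.
# Domain maxima follow the 30/30/20/20 weighting; the ratings are invented.

domain_max = {
    "Knowledge/Understanding": 6,
    "Thinking/Inquiry": 6,
    "Communication": 4,
    "Use of Visual Aids": 4,
}

student_scores = {                 # hypothetical ratings for one presentation
    "Knowledge/Understanding": 5,
    "Thinking/Inquiry": 4,
    "Communication": 3,
    "Use of Visual Aids": 4,
}

total = sum(student_scores.values())
possible = sum(domain_max.values())          # 20 points
print(f"Presentation score: {total}/{possible} ({100 * total / possible:.0f}%)")
```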
Class Presentation Scoring Rubric (Sum-to-total)
Knowledge/Understanding (30%)
• Needs Work (0-2): The presentation uses little relevant or accurate information, not even that which was presented in class or in the assigned texts. Little or no research is apparent.
• Competent (3-4): The presentation uses knowledge which is generally accurate with only minor inaccuracies, and which is generally relevant to the student’s thesis. Research is adequate but does not go much beyond what was presented in class or the assigned texts.
• Excellent (5-6): The presentation demonstrates a depth of historical understanding by using relevant and accurate detail to support the student’s thesis. Research is thorough and goes beyond what was presented in class or in the assigned texts.
Class Presentation Scoring Rubric (Sum-to-total)
Thinking/Inquiry (30%)
• Needs Work (0-2): The presentation shows no analytical structure and no central thesis.
• Competent (3-4): The presentation shows an analytical structure and a central thesis, but the analysis is not always fully developed and/or linked to the thesis.
• Excellent (5-6): The presentation is centered around a thesis which shows a highly developed awareness of historiographic or social issues and a high level of conceptual ability.
Class Presentation Scoring Rubric (Sum-to-total)
Communication (20%)
• Needs Work (0-1): The presentation fails to capture the interest of the audience and/or is confusing in what is communicated.
• Competent (2-3): Presentation techniques used are effective in conveying main ideas, but a bit unimaginative. Some questions from the audience remain unanswered.
• Excellent (4): The presentation is imaginative and effective in conveying ideas to the audience. The presenter responds effectively to audience reactions and questions.
Class Presentation Scoring Rubric (Sum-to-total)
Use of Visual Aids (20%)
• Needs Work (0-1): The presentation includes no visual aids or visual aids that are inappropriate and/or too small or messy to be understood. The presenter makes no mention of visual aids in the presentation.
• Competent (2-3): The presentation includes appropriate visual aids, but they are too few in number and/or in a format that makes them difficult to use or understand.
• Excellent (4): The presentation includes appropriate and easily understood visual aids which the presenter refers to and explains at appropriate moments in the presentation.
Art History News Report (Holistic)
• 14-15 points
  • Describes work concisely
  • Relates message to artist’s choice and use of various devices
  • Develops how message affects beholder
  • Considers audience in writing
  • Clearly organized and presented
  • Well-imagined
  • Legible
  • No problems with mechanics, grammar, spelling, or punctuation
• 11-13 points
  • Good description
  • Relates message to artist’s choices and use of various devices
  • Some consideration of effect on beholder
  • Perhaps could be better organized or presented
  • Adequately imagined
  • Legible
  • Few problems with mechanics, grammar, spelling, or punctuation
Art History News Report (Holistic)
• 8-10 points
  • Adequate description
  • Less thorough analysis of how artist conveys message and devices
  • Audience not necessarily kept in mind
  • Needs significant improvement in organization or presentation
  • Needs better imagination
  • Problems with legibility, mechanics
• 6-7 points
  • Lacking substantially in either description or analysis
  • Problems with audience, organization, presentation, or mechanics interfere with understanding
• 0-5 points
  • Substandard on more than two of these: description, analysis of choices and devices, effects on beholder
  • Major problems with audience, organization, presentation, or mechanics
Statistics Problem Scoring Rubric
Students are given scores for 12 students on a Physics quiz and asked to compute
the mean, median, and standard deviation of quiz scores, showing all work.
Mean
• Correct formula (1 point)
• ΣXᵢ correct (1 point)
• Formula applied correctly (2 points)
• Correct answer* (1 point)

Median
• Apparent understanding that median refers to middle score (1 point)
• Scores ranked correctly (2 points)
• Correct answer (2 points)

Standard Deviation
• Correct formula (1 point)
• ΣXᵢ² or Σ(Xᵢ − X̄)² correct (1 point)
• Formula applied correctly (2 points); correct except no square root (1 point)
• Correct answer* (1 point)

* Given (possibly incorrect) numbers from previous steps
Deductions: confusing mean for median (& vice versa) (−3 points); nonsensical answer without annotation (−2 points)
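The asterisked follow-through rule (credit for a correct answer given possibly incorrect numbers from earlier steps) can be made concrete by checking the student's final answer against their own intermediate work rather than against the key. A hedged sketch for the mean column only; the data structure and field names are invented for illustration.

```python
# Sketch: partial-credit scoring for the "mean" column, honoring the
# follow-through rule -- the final answer is judged against the student's
# own (possibly wrong) sum, not the answer key. All names are illustrative.

def score_mean(work, n=12, tol=0.01):
    pts = 0
    pts += 1 if work["formula_correct"] else 0       # correct formula (1 pt)
    pts += 1 if work["sum_correct"] else 0           # sum of X_i correct (1 pt)
    pts += 2 if work["applied_correctly"] else 0     # formula applied correctly (2 pts)
    # Correct answer (1 pt), given the student's own sum from the previous step
    expected_from_own_work = work["reported_sum"] / n
    if abs(work["reported_mean"] - expected_from_own_work) <= tol:
        pts += 1
    return pts

# Example: student miscopied one score (sum wrong) but divided correctly.
work = {"formula_correct": True, "sum_correct": False,
        "applied_correctly": True, "reported_sum": 250, "reported_mean": 20.83}
print(score_mean(work))   # 4 of 5 points: loses only the "sum correct" point
```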
Strategies for Maintaining Consistency
• Grading should be done anonymously
• Grade all students’ responses one question at a time.
• Assign each rater a unique set of problems
• Multiple raters per item
• Maintain an error log and corresponding deductions.
Assigning Final Grades
• To curve or not to curve?
• Should only be considered for large classes (N ≥ 100)
• Not appropriate if course involves scored group work
• Runs counter to philosophy that teaching affects learning.
• Easiest if all assignments are scored numerically
• Hard to average (or weight) letter grades
• Assignments’ points may not reflect their relative weighting
• Problem for norm-referenced grading schemes
Assigning Final Grades
Student    Test A (15 points)   Test B (30 points)   Comp. (45 points)   Rank
A          15                   13                   28                  1
B          13                   14                   27                  2
C          11                   15                   26                  3
D          9                    16                   25                  4
E          7                    17                   24                  5
F          5                    18                   23                  6
G          3                    19                   22                  7
H          1                    20                   21                  8
Mean       8                    16.5
St. Dev.   4.9                  2.45
Assigning Final Grades
• Unless assessment standard deviations are comparable, weighting
will not be as intended
Assigning Final Grades
Student    Test A (15)   Test B (30)   Z(A)     Z(B)     15Z(A)    30Z(B)    Comp.     Rank
A          15            13            1.43     −1.43    21.43     −42.87    −21.43    8
B          13            14            1.02     −1.02    15.31     −30.62    −15.31    7
C          11            15            0.61     −0.61    9.19      −18.37    −9.19     6
D          9             16            0.20     −0.20    3.06      −6.12     −3.06     5
E          7             17            −0.20    0.20     −3.06     6.12      3.06      4
F          5             18            −0.61    0.61     −9.19     18.37     9.19      3
G          3             19            −1.02    1.02     −15.31    30.62     15.31     2
H          1             20            −1.43    1.43     −21.43    42.87     21.43     1
Mean       8             16.5          0        0        0         0         0
St. Dev.   4.9           2.45          1        1        15        30
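The rank reversal between the two tables can be reproduced directly: summing raw points lets Test A's larger standard deviation dominate, while converting each test to z-scores before applying the intended 15/30 weights restores the intended emphasis. A minimal sketch using the eight students above and sample (N − 1) standard deviations, which match the 4.9 and 2.45 shown:

```python
# Sketch: raw-point weighting vs. z-score weighting for the eight students
# in the tables above (Test A out of 15, Test B out of 30).
from statistics import mean, stdev   # stdev divides by N-1, matching 4.9 and 2.45

students = ["A", "B", "C", "D", "E", "F", "G", "H"]
test_a = [15, 13, 11, 9, 7, 5, 3, 1]
test_b = [13, 14, 15, 16, 17, 18, 19, 20]

def z(xs):
    m, s = mean(xs), stdev(xs)
    return [(x - m) / s for x in xs]

raw_comp = [a + b for a, b in zip(test_a, test_b)]                    # points composite
z_comp = [15 * za + 30 * zb for za, zb in zip(z(test_a), z(test_b))]  # intended 15/30 weights

def ranks(comp):
    order = sorted(range(len(comp)), key=lambda i: comp[i], reverse=True)
    return {students[i]: r + 1 for r, i in enumerate(order)}

print("raw-point ranks:", ranks(raw_comp))   # A ranks 1st: Test A's spread dominates
print("z-weighted ranks:", ranks(z_comp))    # A ranks 8th: Test B now carries twice the weight
```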
Assigning Final Grades
• What should contribute?
[Diagram: relationship between Student Engagement and Students’ Mastery of ELOs]
• Should homework, participation, attendance, etc. count?
• Perhaps we could adopt a grading strategy that is reflective of this relationship
Noncompensatory Grading
• Non-assessment activities likely to improve student learning can serve as gate-keepers for grades
• Composite score based entirely on assessments
• Can have separate engagement targets that must be met in order to qualify for grades; for example:
  • A: 90–100% + passing score for at least 85% HW
  • AB: 85–89% + passing score for at least 80% HW
  • B: 80–84% + passing score for at least 70% HW
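A hedged sketch of how such a gate might be applied: the composite comes entirely from assessments, but each letter grade also requires its homework gate to be met. The cutoffs follow the slide; the fallback when a gate is missed (dropping to the next grade whose gate is met) and stopping the table at B are assumptions.

```python
# Sketch of noncompensatory grading: the assessment composite sets the ceiling,
# but each grade also requires a homework-completion gate (per the slide).
# Behavior when a gate is missed (fall to the next grade whose gate IS met)
# is an assumption, not stated in the presentation.

GRADE_RULES = [            # (grade, min composite %, min fraction of HW passed)
    ("A",  90, 0.85),
    ("AB", 85, 0.80),
    ("B",  80, 0.70),
]

def final_grade(composite_pct, hw_passed_fraction):
    for grade, min_comp, min_hw in GRADE_RULES:
        if composite_pct >= min_comp and hw_passed_fraction >= min_hw:
            return grade
    return "below B"       # remaining grade lines would be listed the same way

print(final_grade(92, 0.90))   # "A" -- meets both the score and the HW gate
print(final_grade(92, 0.75))   # "B" -- A/AB gates missed, B gate (70% HW) met
```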
What else should contribute?
• Learning is often cumulative and unscheduled
• Adopting a developmental grading scheme
• Exams count more later in semester than earlier
• Final counts more and gives students chance to demonstrate mastery
of concepts assessed earlier in semester
• Dropping lowest midterm score?
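One way to operationalize a developmental scheme is to weight exams more heavily as the semester progresses, with an option to drop the lowest midterm. A rough sketch; the specific weights are hypothetical, since the slide names the strategy but not numbers.

```python
# Sketch of a developmental weighting: later exams count more, and the
# lowest midterm can be dropped. All weights here are hypothetical.

def developmental_composite(midterms, final_exam, drop_lowest=True):
    """midterms: percent scores in chronological order; final_exam: percent score."""
    if drop_lowest and len(midterms) > 1:
        lowest = midterms.index(min(midterms))
        midterms = midterms[:lowest] + midterms[lowest + 1:]   # preserve chronology
    weights = list(range(1, len(midterms) + 1))   # later midterms count more: 1, 2, ...
    weights.append(2 * weights[-1])               # final counts twice the last midterm
    scores = midterms + [final_exam]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

# Example: a weak first exam is dropped and the strong final dominates.
print(round(developmental_composite([62, 78, 85], 91), 1))
```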
How Many Points is an A?
• No ubiquitous definition
• Clearly some conventions
• Like it or not, a B is an average grade
• Dramatic variations from conventions communicate misinformation to stakeholders (students, industry, grad schools, etc.)
• Test development is very difficult
• What if I write an exam that’s too hard?
• Consider adopting a class-referenced approach
Class-Referenced Grading
• Write exams so that the very best students should be able
to attain a perfect or nearly perfect score.
• Rescale exams (percentage correct scores) so that the top
score is treated as the maximum achievable score
• Highest earned score will always be 100%
• Students not punished because of an overly (and
inadvertently) hard exam
• Standards in syllabus are conservative
• Different from norm-referenced in that percentage of
students at each grade is not predetermined.
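A minimal sketch of the rescaling described above: divide each percent-correct score by the top score earned in the class, so the best performance maps to 100% and an inadvertently hard exam does not depress everyone's grade. The sample scores are invented.

```python
# Sketch of class-referenced rescaling: treat the top earned score as the
# maximum achievable, so the highest earned score always becomes 100%.

def class_referenced(percent_correct):
    top = max(percent_correct)
    return [round(100 * p / top, 1) for p in percent_correct]

raw = [83, 76, 71, 64, 52]          # hypothetical percent-correct on a hard exam
print(class_referenced(raw))        # [100.0, 91.6, 85.5, 77.1, 62.7]
```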
Grading Challenges
• Grading is really tricky
• Appropriate assignments that cover domain and allow students to
demonstrate their knowledge
• Consistent, fair, and appropriate scores for each
• Determining what counts and in what combination
• Obscure task of assigning composite score ranges to ill-defined
grade labels.
• Doing it right is paramount!
Thank You, Teaching Academy
• Questions?
James Wollack
Department of Educational Psychology
Office of Testing & Evaluation Services
jwollack@wisc.edu