
What I Learned
About Assessment
From the AP Program
Dan Kennedy
Baylor School
NCTM Louisville
Regional Meeting
November 8, 2013
True Confession of a Veteran
Mathematics Teacher:
For many years I never thought much about
assessment.
I graded my students in what I thought was an
appropriate variety of ways:
Tests … Quizzes … and Homework.
This model had stood the test of time.
In 1986 I was invited to
become a member of the AP
Calculus Test Development
Committee.
From 1990 to 1994 I would
serve as chair.
My experience with this group changed my views
of assessment forever.
I had already learned one important fact about
classroom assessment merely by teaching an AP
course:
It changes the entire classroom dynamic when the
teacher honestly does not know what will be on the
test.
The teacher has no other option
but to teach the students how to
think for themselves!
Why students don’t think on tests:
•Thinking takes time.
•Thinking is only necessary when you cannot
do something “without thinking.”
•If you can do something without thinking, you
can do it very well.
•Students who can do something very well have
been well-prepared.
•Therefore, if you prepare them well, your
students will proceed through your tests without
thinking!
Were AP Calculus exams
predictable?
1987 BC Exam:
1. Differential equation
2. Implicit differentiation
3. Area/volume
4. Series
5. Particle problem
6. Theory problem (stretch)
Of course, this was just one exam.
But there were others like it.
But if we tried to change
anything, teachers would
notice.
Then, in AP workshops all
over the country, teachers
would find themselves
uttering to AP consultants
the words they dreaded
most when spoken by their
students:
Will this be
on the test?
And why should teachers NOT ask that question?
It is how the game is played.
•We show the students how to do math.
•We let them practice at it for a while.
•Then we give them a test to see how well they can
mimic what we did.
The game is won and lost for BOTH of us on test day.
This was just another example of the educational
paradigm that was leading my students not to think
on tests!
But how can teachers change the game if we want our
students to succeed?
Teachers have one secret
weapon:
We define what it means
to succeed.
We control the grade!
Something I learned about assessment from
the AP program:
It is perfectly OK, perhaps even necessary, to
scale grades!
AP Grade Conversion Chart
Calculus AB

Composite Score Range*    AP Grade
75−108                    5
58−74                     4
40−57                     3
25−39                     2
0−24                      1
*The candidates' scores are weighted
according to formulas determined by the
Development Committee to yield raw
composite scores; the Chief Faculty
Consultant is responsible for converting
composite scores to the 5-point AP scale.
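The conversion chart is simply a lookup from score bands to grades, and the same idea works for any classroom scale. Here is a minimal Python sketch of that lookup. The cutoffs are copied from the chart; the function name `ap_grade` and the use of `bisect` are illustrative choices, not part of any official College Board procedure.

```python
from bisect import bisect_right

# Lowest composite score that earns each grade above 1, copied from the chart.
# Index 0 -> grade 2, index 1 -> grade 3, and so on.
CUTOFFS = [25, 40, 58, 75]
MAX_COMPOSITE = 108


def ap_grade(composite):
    """Convert a composite score (0 to 108) to the 5-point AP scale."""
    if not 0 <= composite <= MAX_COMPOSITE:
        raise ValueError("composite score must be between 0 and 108")
    # bisect_right counts how many cutoffs the score has reached or passed.
    return 1 + bisect_right(CUTOFFS, composite)


for score in (24, 25, 57, 58, 74, 75, 108):
    print(score, "->", ap_grade(score))
```

To build a classroom version, only the `CUTOFFS` list needs to change.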
75% = 5
At our school, 75% is not a good grade. In fact, 65% is
a minimal pass.
Is this reasonable? Think about it.
•The all-time NBA record for field goal percentage in a
season is 72.7%.
•The all-time record batting average for major league
baseball is .440 (44%).
•A salesperson who makes a sale on 75% of first
contacts is a genius.
So how can we expect 75% success from someone who
is just learning?
If the AP exam were constructed so that the
average student could get 75% of the possible
credit,
(a) it wouldn’t be much of a test, and
(b) the distribution would be skewed rather than
normal.
An Important Disclaimer:
Scaling grades is not about building self-esteem.
Scaling grades is about teaching mathematics.
Assessment should support your efforts to teach
your students mathematics.
It should not get in the way.
ClrHome:FnOff
PlotsOff :ClrTable:ExprOff
6→Xmin:100→Xmax
0→Ymin:124→Ymax
0→Xscl:0→Yscl
Input "RAW SCORE: ",A
Input "CURVED TO: ",B
Input "RAW SCORE: ",C
Input "CURVED TO: ",D
(B−D)/(A−C)→M
"round(MX+B−AM,0)"→Y1
IndpntAsk
DispGraph
Text(1,1,"TRACE OR USE TABLE")
Text(7,1,"TO ENTER RAW")
Text(13,1,"SCORES.")
Scaling grades on the TI-84 Plus
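For readers without the calculator at hand, the same linear scaling can be sketched in Python. You pick two raw scores and the curved scores they should become, and every other score is mapped along the line through those two points. The function name `make_linear_scale` and the sample anchor points are hypothetical, chosen only for illustration.

```python
import math


def make_linear_scale(raw1, curved1, raw2, curved2):
    """Return a function that maps raw scores to curved scores along the
    line through (raw1, curved1) and (raw2, curved2)."""
    if raw1 == raw2:
        raise ValueError("the two raw anchor scores must differ")
    slope = (curved1 - curved2) / (raw1 - raw2)  # M in the calculator program

    def scale(raw):
        # Same line as the calculator's Y1: M*X + B - A*M.
        value = slope * raw + curved1 - raw1 * slope
        # Round halves up for nonnegative scores; Python's built-in round()
        # would round halves to the nearest even integer instead.
        return math.floor(value + 0.5)

    return scale


# Hypothetical anchors: a raw 50 becomes 70, and a raw 90 becomes 95.
scale = make_linear_scale(50, 70, 90, 95)
print([scale(raw) for raw in (40, 50, 72, 90)])  # prints [64, 70, 84, 95]
```

Because the map is linear, it preserves the order of the students' scores and compresses or stretches their spread by the factor `slope`.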
Some things that ETS worried about that I
didn’t:
• r-biserial
•Content validity
•Speededness
•True score
•Grading rubrics
r-biserial (r-bis)
“A correlation coefficient relating
performance on a test question and
performance on the measure used as a
criterion. It is an index of discrimination
measuring the extent to which examinees
who score high on the measure used as the
criterion tend to get the question right and
those who score low tend to get it wrong.”
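A small Python sketch may make the definition concrete. It uses the standard textbook formula for the biserial coefficient, (M_right − M_wrong)/s · pq/y, where p is the proportion answering correctly, q = 1 − p, s is the standard deviation of the criterion scores, and y is the height of the standard normal curve at the point that cuts off proportion p. Whether ETS computed its statistic in exactly this way is an assumption, and the sample data are invented.

```python
from statistics import NormalDist, mean, pstdev


def biserial(item_correct, criterion):
    """Biserial correlation between a right/wrong item and a criterion score."""
    right = [score for ok, score in zip(item_correct, criterion) if ok]
    wrong = [score for ok, score in zip(item_correct, criterion) if not ok]
    if not right or not wrong:
        raise ValueError("need at least one right and one wrong response")
    spread = pstdev(criterion)
    if spread == 0:
        raise ValueError("criterion scores must not all be equal")
    p = len(right) / len(criterion)
    q = 1 - p
    # Height of the standard normal curve where it cuts off proportion p.
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    return (mean(right) - mean(wrong)) / spread * (p * q / y)


# Invented data: total test scores, and whether each student got the item right.
totals = [95, 90, 85, 80, 70, 60, 55, 40]
got_it = [True, True, True, False, True, False, False, False]
r = biserial(got_it, totals)
print(r > 0)  # True: stronger students tended to answer this item correctly
```

A value near zero, or a negative value, flags a question that fails to separate stronger students from weaker ones.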
1969 Multiple-choice question #26:

$\int_0^1 \sqrt{x^2 - 2x + 1}\,dx$ is
(A) $-1$
(B) $-\frac{1}{2}$
(C) $\frac{1}{2}$
(D) $1$
(E) none of the above
The answer is (C).
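A short worked derivation shows why (C) is correct, taking the integrand to be $\sqrt{x^2 - 2x + 1}$. It also shows how choice (B) arises; the explanation that students dropped the absolute value is an inference from the response statistics, not something the exam data state.

```latex
\[
\sqrt{x^2 - 2x + 1} = \sqrt{(x-1)^2} = |x - 1| = 1 - x
\quad \text{for } 0 \le x \le 1,
\]
\[
\int_0^1 \sqrt{x^2 - 2x + 1}\,dx
  = \int_0^1 (1 - x)\,dx
  = \left[ x - \frac{x^2}{2} \right]_0^1
  = \frac{1}{2}.
\]
% Dropping the absolute value gives choice (B) instead:
\[
\int_0^1 (x - 1)\,dx = \left[ \frac{x^2}{2} - x \right]_0^1 = -\frac{1}{2}.
\]
```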
AB Stats:
A 3% B 57% C 7% D 3% E 20%
BC Stats:
A 1% B 70% C 11% D 2% E 9%
Projected Chimpanzee Stats:
A 20% B 20% C 20% D 20% E 20%
Correct responses to problem #26:
AB 7%
BC 11%
Chimps 20%
Content Validity
“Validity is the extent to which a test
measures what it is supposed to measure. The
content validity of an (AP) test is the extent to
which the content of the test represents a
balanced and adequate sampling of the
universe of content in which the test is
intended to measure achievement.”
The AP Calculator Experiment (1983-84)
In 1983 the AP Calculus
Committee decided to allow
(but not require) the use of
scientific calculators on the
AP Calculus examinations.
This was not to be a very happy debut for
technology on the AP stage.
AP readers found that students were losing points on
the free-response section because of calculator misuse.
The calculators affected the scores.
But calculators were not being tested!
This compromised the content validity.
The committee had two choices:
1. Forbid calculators and test as usual;
2. Require calculators and alter the test.
They chose to forbid the calculators.
One of my Precalculus tests from 1990:
Note the
emphasis on
computation.
Note that there is
nothing here to
suggest that any
of this stuff is
worth knowing!
Here is a more
recent test on the
same material.
There are some
computations, but
also some
applications and
more function
analysis.
And they can use
graphing
calculators!
There is also some
graphical analysis, a
logistic function,
and a “stretch”
problem to see if
they can identify a
log function by its
properties.
It’s still not perfect,
but it’s a much more
interesting test!
Speededness
“The appropriateness of a test in terms of the
length of time allotted. For most purposes, a
good test will make full use of the
examination period but not be so speeded
that an examinee’s rate of work will have an
undue influence on the score he receives.”
Allowing for speededness
Exam Format for AP Calculus AB
True Score
“A score entirely free of errors of
measurement. True scores are hypothetical
values never obtained in actual testing. A true
score is sometimes defined as the average
score that would result from an infinite series
of measurements with the same or exactly
equivalent tests, assuming no practice effect
or change in the examinee during the
testings.”
Why teachers don’t need to worry about true
score:
We can assess our students all year long!
The more often the better.
Sorry, kids.
Yessss!
AP Calculus Grading Rubrics
If the AP readers can
give partial credit
fairly to 300,000
students, I ought to be
able to do it for my
own students.
In AP Calculus, I can
even use the AP
rubrics to do it.
AP® CALCULUS AB
2004 SCORING GUIDELINES
Question 1
Traffic flow is defined as the rate at which cars pass through an intersection, measured in cars
per minute. The traffic flow at a particular intersection is modeled by the function F defined by
$F(t) = 82 + 4\sin\left(\frac{t}{2}\right)$ for $0 \le t \le 30$,
where F(t) is measured in cars per minute and t is measured in minutes.
(a) To the nearest whole number, how many cars pass through the intersection over the 30-minute period?
(b) Is the traffic flow increasing or decreasing at t = 7? Give a reason for your answer.
(c) What is the average value of the traffic flow over the time interval $10 \le t \le 15$? Indicate
units of measure.
(d) What is the average rate of change of the traffic flow over the time interval $10 \le t \le 15$?
Indicate units of measure.
(a) $\int_0^{30} F(t)\,dt = 2474$ cars
Scoring: 3 points (1 : limits, 1 : integrand, 1 : answer)
(b) $F'(7) = -1.872$ or $-1.873$
Since $F'(7) < 0$, the traffic flow is decreasing at t = 7.
Scoring: 1 point (answer with reason)
(c) $\frac{1}{5}\int_{10}^{15} F(t)\,dt = 81.899$ cars/min
Scoring: 3 points (1 : limits, 1 : integrand, 1 : answer)
(d) $\frac{F(15) - F(10)}{15 - 10} = 1.517$ or $1.518$ cars/min$^2$
Scoring: 1 point (answer)
Units of cars/min in (c) and cars/min$^2$ in (d)
Scoring: 1 point (units in (c) and (d))
Copyright © 2004 by College Entrance Examination Board. All rights reserved.
Visit apcentral.com (for AP professionals) and www.collegeboard.com/apstudents (for AP students and parents).
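The four answers in the scoring guideline can be checked with a few lines of Python. This is a sketch that uses the exact antiderivative of F(t) = 82 + 4 sin(t/2) in place of a calculator's numerical integration; the helper names are illustrative only.

```python
import math


def F(t):
    """Traffic flow in cars per minute."""
    return 82 + 4 * math.sin(t / 2)


def F_prime(t):
    """Derivative of F, in cars per minute per minute."""
    return 2 * math.cos(t / 2)


def F_integral(a, b):
    """Integral of F from a to b, using the antiderivative 82t - 8cos(t/2)."""
    def antiderivative(t):
        return 82 * t - 8 * math.cos(t / 2)
    return antiderivative(b) - antiderivative(a)


total_cars = F_integral(0, 30)           # part (a)
slope_at_7 = F_prime(7)                  # part (b)
average_flow = F_integral(10, 15) / 5    # part (c)
average_rate = (F(15) - F(10)) / 5       # part (d)

print(round(total_cars))                 # 2474 cars
print(slope_at_7 < 0)                    # True, so the flow is decreasing at t = 7
print(round(average_flow, 3))            # 81.899 cars/min
print(round(average_rate, 3))            # 1.518 cars/min^2
```

The rubric awards most of its points for setting up the integrals and the difference quotient, not for the final numbers, and the code reflects this: each answer is a single line once F is defined.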
AP Calculus Exams are:
•Designed to test knowledge
•Designed to test cleverness
•Scaled reasonably
•Not made up by the teacher
•Open assessments
•Comprehensive assessments (valid)
•Honest about technology
Two Fundamental Principles:
1. Assess what you value.
2. Value what you assess.
Some problems with traditional tests:
•They assess only a fraction of what we value.
•They depend too much on luck.
•There is often no feedback (as with final exams).
•They are usually taken alone. (Is this what we value?)
•They are usually timed. (Is this a good model for quality work?)
•They are frequently taken under artificial, stressful conditions.
•They are dependent on teacher stimulus.
•They are often devoid of creativity (if students are “prepared”).
•They favor one narrow kind of student performance.
•Success is usually short-term and non-transferable.
•The emphasis in the end is what the student can NOT do.
•They can inhibit further learning.
Some assessment strategies I like:
•Assess what you value and value what you assess!
•Assess often, with different kinds of assessments.
•Give meaningful and prompt feedback.
•Give partial credit for partially correct work.
•Explain all your expectations to your students from the start.
•Test diligence, knowledge, and cleverness in focused ways.
•Encourage creativity through your assessments.
•Scale grades to control the standard deviation.
•Only fail students who are failures. Keep everyone in the game.
•Encourage collaboration in class and on homework.
•Assess diligence. Find a way to grade homework frequently.
•Try portfolios.
•Remember: This is not about self-esteem. It’s about teaching mathematics to all
your students!
E-mail me at:
dkennedy@baylorschool.org
Or visit the Baylor School
web site at
www.baylorschool.org.
Click on me under
Faculty and link to my
home page.