www.rsc.org/cerp | Chemistry Education Research and Practice
Motivations for and barriers to the implementation of diagnostic
assessment practices – a case study
Monica Turner, Katie VanderHeide and Herb Fynewever*
Calvin College, Grand Rapids, MI, USA. E-mail: herb.fynewever@calvin.edu
Received 30th October 2010, Accepted 25th February 2011
DOI: 10.1039/C1RP90019F
Given the importance of diagnostic assessment as a well-substantiated pedagogical strategy for use
at various educational levels from kindergarten to the undergraduate level, we must consider its
lack of implementation in the classroom. The implementation gap is especially severe at the
tertiary level, with chemistry and other STEM (science, technology, engineering, and math)
instruction being no exception. Yet, some tertiary instructors perform diagnostic assessment. How
does this happen? What motivates, enables, and sustains these instructors in their implementation
of diagnostic assessment? In this study we collect data on the practices of two tertiary chemistry
instructors, including classroom observation, artifact collection, and interviews. Analysis shows that
these instructors employ several techniques that are consistent with the paradigm of research-based diagnostic assessment. Perceived motives behind the use of these techniques, as well as
perceived barriers to further implementation of diagnostic techniques are reported and discussed.
Keywords: diagnostic assessment, undergraduate, case study
Introduction
What is diagnostic assessment?
Because interpretations can vary, we will briefly define
what we mean by diagnostic assessment. Black and Wiliam
(1998a) used the term, formative assessment, which they
defined as “all those activities undertaken by instructors
and their students [that] provide information to be used as
feedback to modify the teaching and learning activities in
which they are engaged” (p. 7). Harlen (1997) referred to
diagnostic assessment as the “gathering and use of
information about students’ ongoing learning by both
instructors and students to modify teaching and learning
activities.” From these definitions, and other diagnostic
assessment literature, key elements of diagnostic assessment
may be identified (Stiggins, 1992; Black and Wiliam,
1998a; Steadman, 1998; Shepard, 2000; Wiliam et al.,
2004; Marshall, 2005):
• Target: agreement by both instructors and students on
learning goals and criteria for achievement;
• Measurement: collection of data revealing level of
student understanding and progress towards learning
goals;
• Feedback: provision of effective feedback to students
(and to instructors);
• Adjustment: adjustment of teaching and learning
strategies to respond to identified learning needs and
strengths.
Diagnostic assessment serves a formative purpose, and in
this way is in contrast with summative assessment.
Diagnostic assessment takes place while student learning is
still happening – for example students make their thinking
visible to the instructor and the instructor gives feedback to
the students while the learning is still taking place. This
timing allows diagnostic assessments to be used to steer
students’ and teachers’ actions to enhance student learning
and teaching effectiveness. Summative assessment takes
place after the learning is finished – for example when an
exam at the end of a course measures what learning already
happened. With summative assessments it is too late to steer
student or instructor action to accomplish any formative
purpose. Rather, summative assessments are primarily used
to find out what learning has happened and to assign grades
or meet certain accountability demands of an external body.
From the literature it is clear that diagnostic assessment
is effective. The field is mature enough that there have been
a number of review articles and meta-analyses of empirical
studies. Several of these works quantified the impact of
formative assessment techniques by calculating the effect
size ranging from 0.25 to 0.95 (Kluger and DeNisi, 1996;
Black and Wiliam, 1998a; Nyquist, 2003; Hattie and
Timperley, 2007; Shute, 2008). In particular, Black and
Wiliam’s review (1998a) spans over 250 individual studies
to give a comprehensive description of the evidence of the
effectiveness of diagnostic assessment. Their review found
that a common thread in many successful educational
innovations was the aim to strengthen the frequent feedback
given to students. The article provides concrete examples of
controlled experiments that show how timely feedback to
students leads to substantial learning gains. Several studies
reviewed provide evidence that formative assessment is
directly linked to learning gains and that the gains are, in
fact, “significant and often substantial” (p. 3) with typical
effect sizes of 0.4 to 0.7.
A more recent book by Hattie takes on the mammoth task
of synthesizing the educational research behind over 800
meta-analyses of some 50,000 studies representing millions
of students involved in 138 distinct interventions (Hattie,
2009). Consistent with the reports above, Hattie’s work
determines that interventions that hinge on feedback to
students have an effect size of 0.73 and those that focus on
formative evaluation to teachers have an effect size of 0.90.
These effect sizes are greater than those of concept mapping
(0.57), inquiry-based teaching (0.31), class size (0.21),
problem-based learning (0.15), or teacher content
knowledge (0.09). Hattie contended that most of the
effective teaching and learning methods are “based on
heavy dollops of feedback” (p. 173). Further, some of the
largest effect sizes were found when teachers were required
to use evidence of student learning to decide their next steps
(p. 181).
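For readers less familiar with this metric, the effect sizes cited above are standardized mean differences between a treatment group and a control group. The sketch below illustrates the calculation; the numbers are purely hypothetical and are not drawn from any of the studies cited.

```latex
% Standardized mean difference (Cohen's d); all numbers here are hypothetical.
d = \frac{\bar{X}_{\mathrm{treatment}} - \bar{X}_{\mathrm{control}}}{s_{\mathrm{pooled}}},
\qquad
s_{\mathrm{pooled}} = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}
% Example: hypothetical exam means of 78 (frequent-feedback group) and 71
% (control), with a pooled standard deviation of 10, give
% d = (78 - 71)/10 = 0.7, an effect size at the upper end of the
% 0.4 to 0.7 range quoted above.
```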
In this paper we concern ourselves primarily with
chemical education and with the tertiary (i.e. college and
university) level of education. Given these two
specialization constraints, there is much less literature
concerning the effectiveness of diagnostic assessment.
Nevertheless, there is still strong evidence that diagnostic
assessment is being done and having a positive impact on
student learning in college level chemistry. A commentary
by Brooks et al. (2005) pointed out that many of the
successful teaching techniques in tertiary chemistry rely
upon frequent performance-related feedback to students (a
component of diagnostic assessment). Notably, Wilson and
Scalise (2006) have successfully adapted for college level
chemistry a diagnostic assessment system for which
empirical evidence shows significant learning gains over
control treatments (Wilson and Sloane, 2000). Although
there is certainly a need for further study, the evidence to
date suggests that diagnostic assessment is effective in
college-level chemistry instruction.
There is a problem. While we can be confident that
diagnostic assessment is effective, research also suggests
that there is an implementation gap. As is often the case in
education, just because a technique is proven to be
effective, we cannot assume that practitioners will adopt it.
Myles Boylan, a program officer for the National Science
Foundation (of the United States) has said “In almost every
discipline, I could point to a variety of really effective …
instructional practices, and say that if we could magically
click our fingers and get everybody using them, there would
be a huge improvement in undergraduate education that
would happen instantaneously. But we’re nowhere near
that” (quoted in Brainard, 2007).
For diagnostic assessment in particular, inventories of
teachers’ beliefs show that most instructors believe
assessments can push students to learn, yet inventories of
teachers’ practices show that most teachers limit their
assessment techniques largely to marking papers for the
purpose of arriving at grades (Hattie, 2009). According to
the National Research Council (of the United States) “In
many classrooms opportunities for feedback appear to
occur relatively infrequently. Most teacher feedback –
grades on tests, papers, worksheets, homework ...
represents summative assessments that are intended to
measure the results of learning. After receiving grades,
students typically move on to a new topic and work for
another set of grades. (But) feedback is most valuable when
students have the opportunity to use it to revise their
thinking as they are working.” (Bransford et al., 2000, pp.
140-141).
While the literature provides evidence for the existence
of an implementation gap, there is only limited information
on exactly how large this gap is. If a recent survey of
college-level physics instructors is any indication, many
college instructors are not even aware of the concept of
diagnostic assessment, and truly few are implementing it in
a way that is true to best practices (Henderson and Dancy,
2007). This survey included a sample of 722 physics faculty
to measure their awareness of and use of reformed teaching
practices (in general – not necessarily diagnostic assessment
techniques). This study showed that physics faculty have a
relatively high awareness of research-based instructional
techniques. In fact, 63.5% of the faculty surveyed stated
they were familiar with Peer Instruction (Mazur, 1997), a
diagnostic assessment technique implemented through
conceptual questions (often with ‘clickers’) and pair-wise
peer discussions. Awareness does not necessarily indicate
use, however, and only 29.2% of those surveyed currently
use Peer Instruction. Furthermore, only 16.9% of those who
do use Peer Instruction reported that they use the technique
“as described by the developer” while 35.9% made “minor modifications” and 41.0% made “significant modifications.” Additionally, 32.3% discontinued using
Peer Instruction after only one semester.
And finally, we are not aware of any survey or other
empirical study that quantifies the rate of implementation of
diagnostic assessment in chemistry at the college level. It is
telling, however, that at the 2008 Biennial Conference on
Chemical Education (BCCE, 2008) many sessions, posters,
and papers advocated teaching informed by assessment, but
very few framed these methods within an overarching
paradigm of diagnostic assessment. Examination of the
presentations made at two sessions, both called “Assessing
to Generate Learning Centered Undergraduate Chemistry,”
reveals the focus of nearly every talk was either summative
or program assessment (Ziebarth et al., 2009). This
suggests that in the chemistry community at the tertiary
level, the word ‘assessment’ generally has two meanings.
The first is summative assessments, such as have already
been discussed. The second is program assessment, such as
a department might do at the urging of the administration in
order to improve overall programs.
Upon considering the implementation gap at the tertiary
level, one must remember that most tertiary instructors have
no pedagogical training, and most tertiary instructors have a
high degree of autonomy with little if any requirement for
ongoing professional development. Both of these
differences could contribute to a lack of diagnostic
assessment in the tertiary context. Yet, diagnostic
assessment does happen with some tertiary instructors. How
does this happen? What motivates, enables, and sustains
these instructors in their implementation of diagnostic assessment?
And to the extent that diagnostic assessment is limited at the
tertiary level, what are the barriers perceived by instructors
preventing further implementation? These questions form
the basis of our research. The purpose, then, of this research
project was to explore the views of tertiary chemical
educators regarding diagnostic assessment in order to gain a
better understanding of why diagnostic assessment is often
poorly implemented at the tertiary level despite the strong
evidence for its effectiveness. Thus, the investigation was
guided by the following question:
Research question:
What do professors with a reputation for good teaching do
that is consistent with diagnostic assessment, why do they
do it, and what barriers to diagnostic assessment do they
face?
Theoretical framework
To determine our theoretical framework, we will begin with
the problem: Why is there an implementation gap for
tertiary instructors’ use of diagnostic assessment? As noted
previously, this question intentionally focuses on
instructors’ motives and perceptions, because
implementation at the tertiary level is ultimately sustained
only with substantial ‘buy-in’ from the faculty (Henderson,
2010). It follows that we choose a theoretical framework
that uses a second-order perspective, i.e., we will describe
the phenomenon of diagnostic assessment as it is perceived
by the research subjects, even if the researchers may not
agree (or even feel comfortable) with these descriptions. For
this work, we will employ the phenomenographic
framework (Marton, 1981, Marton and Booth, 1997).
Phenomenography is a relational (or non-dualist),
qualitative, second-order perspective that aims to describe
the key aspects of variation of the experience of a
phenomenon (Trigwell and Prosser, 2003). In other words,
through phenomenography, researchers are able to
apprehend through their subjects’ diverse perspectives the
complex, relational, lived experience of the phenomenon for
those involved. Each characteristic of phenomenography is
appropriate for our research question:
Relational (non-dualist) Our research assumes that the
instructors are not separate entities from the phenomenon of
diagnostic assessment. Rather, meaning is constituted in the
relationship between individual instructors and the
phenomenon, which in this instance includes instructor-student dynamics as well.
Qualitative Our research is methodologically qualitative.
As described in a later section, we rely primarily on the
qualitative analysis of interview data.
Second-order In a first-order approach, the researchers
describe the phenomenon as they perceive it. We primarily
use a second-order approach because instructor
implementation of diagnostic assessment relies on the
instructor’s perceptions and motivations. However, we
recognize, as is true of all qualitative research, that these
perceptions will nevertheless be reported through our
interpretation of them; we therefore take precautions to bracket
our interpretations as much as possible, as discussed below.
Phenomenography is a research tradition used especially
in the context of educational research and health care
research; it is often used to examine variation of experience
of a certain phenomenon by students and teachers (for a
review article, see Orgil (2007)). As a research tradition,
phenomenography has been successfully used by others
studying university faculty conceptions of teaching
(Kember, 1997; Trigwell and Prosser, 2003). The categories
of description that result from phenomenography can be
used as the basis for quantitative instruments, such as the
Approaches to Teaching Inventory (Trigwell and Prosser,
1993). We anticipate that our research will lead to future
research that aims to develop an instrument for approaches
to assessment. Such an instrument could be used with
research into the correlational relationships between
phenomena (e.g., how student approaches correlate with
teacher approaches) and studies on the impact of
professional development (i.e., instruments can be used pre- and post-professional development to measure change).
Sample
The two chemistry professors observed in this study are part
of a larger, ongoing study ranging across STEM disciplines.
Because we wanted to see evidence of diagnostic
assessment, it was important to have professors who were
likely to use it often. For this reason, we chose subjects with
a reputation for quality teaching. Further, of necessity we
chose subjects who were teaching during times that the
observer (HF) happened to be free. These selection criteria
led us to recruit our two subjects, one who taught general
chemistry courses, and one who co-taught an introductory
chemistry/materials science course for engineers. Names of
professors are pseudonyms to preserve their anonymity.
Both subjects taught at the same small, four-year liberal arts
college located in the mid-western United States. The
college has an average of 22 students in each class.
Professor Peterson has over 35 years of teaching
experience. He entered academia after getting his PhD and
is currently in his third teaching position. He is a past
recipient of the annual college-wide faculty teaching award,
the highest teacher honor given by the college. He usually
teaches general (i.e. first year) chemistry and second year
organic chemistry. The class observed for this study was
one of two sections of a second semester general chemistry
course he taught that semester. There were 55 students in
this particular class.
Professor Evans teaches both chemistry and engineering,
with nearly 10 years of experience teaching introductory
and upper level chemistry courses, as well as introductory
engineering courses. This is his second teaching position
after having received his PhD. During this investigation, he
co-taught an introductory level chemistry/materials science
course for engineers with an engineering professor. There
were 37 students in this particular class.
Table 1 Interview protocol examples: four elements of diagnostic assessment

Target: The instructor communicates learning goals and criteria for achievement.
Interview prompt examples: How do students become aware of student learning goals for the class? How do you communicate your expectations? Are there other ways that students learn about them?

Measurement: The instructor collects data revealing level of student progress.
Interview prompt examples: How do you know if your students are ‘tracking’ with you and the rest of the class? What evidence do you have that the students are learning? What information do you have on where they are not meeting learning goals? How do you know if this information is reliable?

Feedback: The instructor provides feedback and receives feedback.
Interview prompt examples: How do students learn about their progress in the class? What happens inside the classroom to make them aware? What happens outside of class time to make them aware? If they know they are lagging, how can they figure out the ‘next steps’? What role do you think that feedback plays in your teaching and learning? How do you receive feedback from students? How do you use feedback from students? How do students receive feedback?

Adjustment: The instructor makes changes in teaching choices. The student makes changes in learning strategies.
Interview prompt examples: How do you make decisions about teaching strategies? How do you decide what works and what doesn’t work? Think of the most recent homework assignment that you gave. Would you give it again next year? Why or why not? Think of the most recent class period. Would you make any changes before next year? What would you change and why? Why do you think students respond to feedback the way they do? What do you do to help students make changes in how they go about learning? What do they do? How would you explain the relative success (or lack of success) of your approaches?
Data collection
Field observations Over the course of one semester, one of
the authors (HF) observed one class taught by each
participating instructor once every two weeks. During each
observation, he used a semi-standardized field protocol to
take detailed notes of classroom events and dynamics, the
disposition and activities of students, and instructor
methods, especially as related to diagnostic assessment. The
protocol identified salient process dynamics, techniques,
and barriers. These observations were linked conceptually
to the interview protocols to permit maximum triangulation
of data and methods. The observations were non-obtrusive
to the extent that the researcher kept quiet throughout. The
observer took contemporaneous field notes as well as post-observation notes to highlight what to ask subjects about
during the interviews.
We note that each course had as a co-requisite a once-weekly laboratory section. Although we expect that the lab
component of the course likely did contribute to the
diagnostic assessment experienced by the students, we
chose to focus exclusively on the lecture portion of the
courses in this study.
Semi-standardized interviews The researcher conducted
semi-structured interviews of approximately 30-60 minutes
with each professor as soon after the observation
as possible. The goal of the interviews was to discuss the
teaching practices observed and to elicit the professors’
ideas about diagnostic assessment based on their actions in
the classroom and the data collected thus far (Table 1). The
interviews played a key role in this investigation, providing
an accurate picture of the professors’ opinions that could not
be gathered from observation alone and assisting in the
authors’ interpretation of the observations
made. During the interviews, the interviewer did not tell the
professors what type of teaching practices in particular he
was looking for, nor did he volunteer suggestions for
changing teaching in the future.
For this style of research, the interview unfolds as a
dynamic iterative process with the interviewer presenting
questions in a non-linear fashion in response to the
interview subject’s responses and reasoning. This
“conversation with a purpose” (Orgil, 2007) continues until
the interviewer feels confident not only that he or she has
gained the information desired, but also in his or her
interpretation of the subject’s responses and reasoning.
The depth of the interview derives in large part from the
probes used to capture the details and nuances of the
subject’s knowledge and perceptions. As part of the
reflective process, the interviewer continues to probe for
ideas until the interviewee has nothing more to add. Sample
interview questions and accompanying probes are provided
as illustrations in Table 1. We note that the interview
questions are constructed to make tight connections with
the research question and the elements of diagnostic assessment
(i.e. target, measurement, feedback, and adjustment).
Collection of artifacts Researchers collected and analyzed
artifacts related to classroom instruction, with special
emphasis focused on materials used during periods of
observation, including handouts, syllabi, student work,
graded student work, etc. A sampling was taken of all
documents passed from the instructor to the students and
vice versa. Special attention was given to any examples of
feedback, such as goal setting or grading comments. We
noted missed opportunities for feedback as
well. Review of these artifacts informed the interviews as
previously noted.
Data analysis
The analysis of the data needed to accurately interpret the
meanings behind each professor’s view of diagnostic
assessment, as seen in their interviews. Thus, three
researchers analyzed the interview transcripts in such a way
as to maintain reliability and validity. We used the
computer program called HyperRESEARCH to code each
interview.
Fig. 1 The ten parts of diagnostic assessment. Each object is scaled relative to other objects of the same shape with respect to the frequency with which it was mentioned by the professors.
In order to be faithful to the ideas presented by
the subjects, we chose to develop codes based on patterns in
the data rather than use a list of pre-set codes. There was an
exception to this, however, in that the researchers had
already studied the literature and agreed on a definition of
diagnostic assessment and the four primary elements of
diagnostic assessment (given in the introduction). These
elements served as a framework while coding, and to this
extent, our codes were not purely emergent from the data.
To begin the analysis of the transcribed interviews, we
individually read through four randomly chosen interviews
to understand the larger meaning behind each interview
(Creswell, 2003, p. 183). This allowed us to gain a general
sense of each professor and his possible conceptions of
diagnostic assessment. We individually coded the
interviews, and then collaboratively coded them, discussing
each code until consensus was reached. During this process,
we also normalized our coding procedure. After
coding this sample of interviews, we developed an initial set
of codes and code descriptions. This iterative process
involved reading through the interviews several times and
involved the consolidation of each researcher’s codes to
develop specific codes to match patterns we began to notice.
We then went back to the data and made sure that the codes
still fit where they were initially applied. At this point we
coded the rest of the interviews, with two authors coding
each interview individually, and then collaboratively
discussing each code as before, until consensus was
reached. As an additional check on validity, participants
were provided with a near-final draft of this manuscript, to
give them an opportunity to comment on their perception of
the accuracy of the findings.
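For readers interested in how the relative frequencies reported below (and in Figs. 2-4) are obtained, they amount to counts of coded statements per axial category divided by the total for that group of categories. The short sketch below is illustrative only and is not the authors' actual workflow; the file name and column names are hypothetical stand-ins for an export from the coding software.

```python
# Illustrative sketch only (not the authors' actual workflow).
# Assumes a hypothetical CSV export, "codes.csv", with one row per coded
# interview statement and columns "code" and "axial_category".
import csv
from collections import Counter

def category_frequencies(path="codes.csv"):
    """Return the percentage of coded statements falling in each axial category."""
    counts = Counter()
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            counts[row["axial_category"]] += 1  # one tally per coded statement
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}

if __name__ == "__main__":
    # Restricting the input to statements coded under the four cycle elements
    # would yield percentages comparable to those reported in Fig. 2
    # (measurement 39%, feedback 33%, adjusts teaching 19%, target 9%).
    print(category_frequencies())
```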
Limitations
As is to be expected with a case study methodology, our
main limitations stem from our very small sample size. We
present this work with the caveat that attempting to
generalize to other instructors of chemistry is not
appropriate. We do hope, however, that this work provides a
depth of understanding of the two cases and is illustrative of
what diagnostic assessment can look like.
Results and discussion
All of the codes generated for these cases can be assigned to
one of the ten axial categories represented in the composite
concept map (Fig. 1). The four main elements of the
diagnostic assessment cycle are seen on the squares placed
along the main cycle. The cycle created by these four
elements is hindered by the three types of barriers listed on
the octagons in the center of the circle. Other aspects of
teaching that drive or detract from the diagnostic
assessment cycle are listed on arrows placed around the
main cycle. We will define each of these categories below,
illustrating with examples and discussing the role each
concept plays in our overall model of diagnostic assessment
and the relative prominence of each category in our case
data. To be clear: the four main elements of the diagnostic
assessment cycle were taken from our analysis of the
literature as a priori axial categories. The remaining axial
categories and all of the codes that fit within all categories
were emergent from the data and serve as our primary
research results.
As we will see in the data, it makes sense to represent the
central circle as a cycle because each element therein
supports and leads naturally into the other elements on the
circle. Briefly, an instructor using the cycle will make it
clear to the students what the learning goals are (target).
This sets the stage for what should be measured in terms of
how the students are progressing towards the goal
(measurement). Once the measurement is complete, the
results of that measurement serve as feedback both to the
students and the instructor (feedback). Acting upon that
feedback will involve making adjustments to both what the
teacher does and what the students do (adjustment). The
adjustments should then be done in a way that reinforces
what the original target was, which brings the course back
to the target. This cycle can continue iteratively until
desired student mastery is reached on a given topic, and can
be used repetitively as new content is introduced.
Four key elements of diagnostic assessment cycle
Table 2 shows the ten most commonly mentioned codes,
ranked from most frequent at top, relating to the diagnostic
assessment cycle and where they fit into each of these
categories. Fig. 2 shows the relative frequency for how
often the codes within each category were used. In the
following section each of these four elements of diagnostic
assessment will be explained in further detail.
Table 2 Ten most commonly used diagnostic assessment techniques, ranked by frequency, with the key element(s) of the cycle in parentheses
1. Asks questions (measurement)
2. Adjusts teaching based on measurements (adjusts teaching)
3. Provides feedback students can act on to move forward (feedback)
4. Diagnostic assessment geared toward the majority of the class (measurement, adjusts teaching)
5. Provides opportunity for peer-assessment (feedback)
6. Provides written feedback beyond a grade (feedback)
7. Engages in dialogue with students (measurement, feedback)
8. Communicates process expectations to students (target)
9. Provides verbal feedback to the whole class (feedback)
10. Differentiates based on student needs (adjusts teaching, measurement)
Target The target element of the diagnostic assessment
cycle consists of the agreement by both professors and
students on learning goals and criteria for achievement.
Consistent with the literature (Ludwig et al., 2011), our
subjects made assessment and content expectations clear in
the syllabus, which they handed out at the beginning of the
semester, but they infrequently revisited target expectations
throughout the semester. When they did revisit target
expectations, it was often to communicate process
expectations regarding how students should do the homework
or assessment expectations explaining what tests would be
like. An example of how the professors used syllabi to make
assessment expectations known to students is seen in a
quote from Professor Evans:
Professor Evans: “In the syllabus, we promise the
students that when we make a test, 1/3 of it will be
problems from the homework, 1/3 of it will be problems
from previous tests which they have all access to, and 1/3
will be problems they haven’t seen before, new
problems.”
The professors also used verbal feedback on grading
choices to make process expectations clear, for example, by
describing how students should show their work when
writing solutions for a quiz or exam. Professor Peterson had
a situation where several students fortuitously got nearly the
right answer for a quiz question even though they followed
the wrong technique. As a result, he verbally explained to
the class why he took points off from several papers that
had the correct answer if they did not show their work.
Professor Peterson: “I’m trying to make the point that I
really don’t care too much, in general, about answers. I
care everything about how we get to the answer. I’ll
make that point a number of times. I’ve already made it
before in class. Because that’s something a lot of them
don’t see at all. They just think an answer, that’s all
we’re really after. And that’s, of course, not true.”
While both of the professors did discuss making their
target expectations clear to students, target was the element
of the diagnostic assessment cycle least frequently
mentioned by the interviewed professors
(9% of coded statements regarding the four elements of
diagnostic assessment).
Fig. 2 Frequency with which each category of diagnostic assessment
was coded in the interview data. These are axial categories, each with
many codes.
Measurement The measurement element of the diagnostic
assessment cycle occurs when the professor provides an
opportunity for students to make their thinking visible. This
allows the professor to measure the students’ level of
understanding with respect to the agreed upon learning
targets. Measurement can occur in a variety of ways, both
formal and informal (Brooks et al., 2005). Formal methods
of measurement used by our subjects included requesting
student evaluations of teaching, homework, quizzes, tests,
and classroom response systems (or ‘clickers’). Professor
Evans was very deliberate in his use of clickers as a
measurement tool to gauge student understanding:
Professor Evans: “I have to work to get the students to
give me stuff so that I know where they're at. But it is
important, and more and more I've been trying to use
clickers, which I love, as an instantaneous way to assess
where a class is at: building much more time for
questions and going through problems together, being
much more interactive as we go through problems,
finding out where they're getting stuck, what they're
interested in, what they disagree with.”
The professors also often used more informal means of
measuring student understanding, such as listening to
student comments in class and office hours, eavesdropping
on students in class discussions of the content, and noting
student non-verbal communication. Professor Peterson often
used student non-verbal communication as his means of
gauging student understanding:
Professor Peterson: “You can just tell when you're just
blowing everybody out, even your best students, it shows
up. So obviously, feedback may not even have to have
words with it. Just their facial expressions can tell me
that it's not going where I want it to go.”
Of all the elements of diagnostic assessment described by
the professors interviewed, measurement was the most
frequent focus (39% of coded statements regarding the four
elements of diagnostic assessment).
Feedback The feedback element of the diagnostic
assessment cycle occurs when the results of a measurement
are used to inform an individual of their level of
performance. This can occur in two distinct ways. The first
way feedback occurs is when the professor provides
effective feedback to students informing them of their level
of achievement in relation to the agreed upon target, and
provides them with enough information so that they know
what the next step is in the learning process (Hattie and
Timperley, 2007). In our data, feedback to students
sometimes occurred verbally and sometimes in written
form. Verbal feedback was sometimes given to individuals
and sometimes to the class as a whole. Written feedback
usually consisted of comments written on submitted
work such as homework or tests. The second way feedback
can occur is when students give the professor feedback on
his or her teaching (Fuchs and Fuchs, 1986). In our data,
this occurred sometimes verbally through comments in class
or in office hours or through a written evaluation distributed
to students informally by the professor or formally by the
college. We note, however, that college-administered
evaluations were optional except at the end of the semester,
which is too late to be formative (except for subsequent
semesters).
Professor Peterson would regularly use summative
assessments, such as quizzes and midterm exams in a
formative way. For example, when handing back written
assessments, he would reserve a significant portion of class
time to discuss the results in the hope of helping the
students learn from their mistakes.
Professor Peterson: “I know when you go through tests
and they didn't so well on this problem or that problem, I
hope when we take the time to go through that test and
say, okay, most of the class missed this question, now
why did we miss this question, that they're tuned in and
they wouldn't repeat that mistake if they had it again
down the pike.”
Professor Evans also made a concerted effort to give
feedback to his students, but his focus was more often on
giving feedback to students as they worked through
problems (e.g. in class, laboratory, and office hours) rather
than feedback after an assignment had been turned in.
Professor Evans: “Usually I try to be very conscious
about the feedback to the students, the timeliness of it
especially. Because I've come to realize that for the bulk
of students, who can't fully see a problem from its
beginning to its end all on their own, and be confident
that they're done, they really need, ideally, moment to
moment feedback: ‘am I on the right track here?’ ‘what's
going on?’, ‘do I have the right answers?’; because a lot
of students will spin their wheels. They'll work hard even,
but they won't be moving in the right direction at all. And
they won't figure that out for a long time.”
Feedback was the second largest focus of the professors
we interviewed with regards to diagnostic assessment (33%
of coded statements regarding the four elements of
diagnostic assessment), being mentioned just slightly less
often than measurement.
Adjusts teaching Adjusting teaching based on data is the
fourth element of the diagnostic assessment cycle. In
general, adjusting teaching occurs when a professor has
taken a measurement of student understanding and then
adjusts his or her teaching strategies in order to
accommodate the identified learning needs and strengths of
the students in the class (Steadman, 1998). Some examples
from our data of this occurring in the classroom were when
a professor altered the teaching and learning activities or the
topics addressed in a class period or spent more or less time
than originally planned on an activity or topic based on the
students’ needs. Adjusting teaching took place immediately
or with a delay. Immediate adjustment of teaching took
place when a professor took a measurement and reacted to it
in the same class period. Professor Peterson gives an
example of this in the following quote:
Professor Peterson: “I will definitely - I often, well, not
often, but if I’m doing a difficult concept I can just tell
right from their faces they’re not getting it. I’d probably
tell them. I’d just stop and say, ‘Okay, guys, I know
you’re not getting it. I can see it on most of your faces, so
let’s get this thing resolved. Let’s start asking questions.
What’s not sinking in?’[I] just deal with it.”
Another way in which adjusting teaching occurs is when
a professor took a measurement and used it to change future
classes. This sometimes meant that they revisited a subject,
changed how they taught it in the next period, or made a
note to adjust it in future semesters. Professor Evans
explained how feedback from students helps him adjust
many aspects of his teaching over short and long time scales:
Professor Evans: “Feedback from students in previous
years has helped me figure out how things can go, and I
have tweaked and optimized from workshops to lectures
to labs to other things. And I know there is less feedback
in terms of seeking clarification about confusions,
because I have ironed most of that out. When there was
confusion about a question on a lab or whatever it was, I
have gone back and changed the wording so that it was
clearer. I have even done it between one section and the
next. So a lot of that has been alleviated.”
‘Adjusting teaching based on data’ was mentioned in
19% of coded statements regarding the four elements of
diagnostic assessment, about half as often as ‘measurement’
and ‘feedback’ but twice as often as ‘target’.
Barriers
The barriers to the diagnostic assessment cycle that the
professors mentioned can be divided into three areas:
student, instructor, and situational (Fig. 3). All the barriers
can prevent the full implementation of diagnostic
assessment because the professor, the students, or classroom
circumstances cause some hindrance to one or more of the
four main elements mentioned earlier. Table 3 lists the
barriers most frequently listed in the interviews. Because
none of the top ten barriers were instructor barriers, we also
include the highest-ranking instructor barrier for
comparison.
Student barriers
The student barriers refer to those barriers caused by
students’ attitudes or their behaviors. Consistent with the
literature (Hesse, 1989) student barriers were the most
common type, constituting 52% of all coded statements
regarding barriers to diagnostic assessment. Due to these
barriers, the professors in our study sometimes felt that they
could not measure the students’ level of understanding,
could not provide effective feedback for the students, or that
students might not use the feedback. At times, the professors were still
able to implement diagnostic assessment to a certain extent,
but believed that diagnostic assessment could not occur as
well as it would if the student barriers were not there. Often
a student barrier was cited as the reason behind a decision
regarding how to implement a diagnostic assessment. Other
times, a professor stated that a student barrier stopped him
from implementing a particular diagnostic assessment
strategy at all. The following quote is an example from
Professor Evans illustrating a common barrier for him, that
of students not paying attention in class. In this quote
Professor Evans acknowledges that when he answers one
student’s question, the two-way communication might not
reach other students who are not paying attention. This
barrier is revealed if a student who was not listening asks
the same question again.
Professor Evans: “When I hear a question that comes
back again, what I’m listening for is whether this student
is basically asking the exact same question again or are
they asking it at least in a way that demonstrates they’re
aware that we’ve already been talking about this. If I
hear them ask it in a nuanced way or building way or a
way that connects with in any way what we said before,
even if it is, ‘I know you’ve gone over this before, but is
this the same as [before]?’ or whatever, then I’m much
more likely to spend more time on it because then I feel
like they’re asking for further clarification and it’s
probably representative of more in the class. But if it’s a
totally redundant question, I tend to think they just didn’t
hear it the first time. So I’m not going to fully elaborate
again because I’m guessing that’s not representative of
the rest of the class.”
Fig. 3 Frequency with which each category of barrier to diagnostic
assessment was coded in the interview data. These are axial categories,
each with many codes.
Table 3 Top ten barriers to diagnostic assessment, ranked by frequency, with the barrier type in parentheses (the highest-ranking instructor barrier, rank 23, is included for comparison)
1. Professors do not have enough time for diagnostic assessment (Situational)
2. Students don’t pay attention in class (Student)
3. Students don’t put in required effort to do their work (Student)
4. Students are apathetic about the class (Student)
5. Students ignore feedback from the professor (Student)
6. Students don’t communicate their thinking (Student)
7. Students don’t know how to learn (Student)
8. Large quantity of content limits implementation of diagnostic assessment (Situational)
9. Students don’t take initiative to get or give feedback (Student)
10. Students that are doing very poorly lack hope (Student)
23. Instructor’s feedback is too vague to provide a path forward (Instructor)
Instructor barriers
The instructor barriers are those barriers caused directly by
the professor. The literature suggests that these barriers may
come from a simple lack of training for faculty in how to
use diagnostic assessment (Stiggins, 2002; McNamee and
Chen, 2005). For our subjects, who were selected based on
their reputation for exemplary teaching, these barriers were
far less numerous (18% of coded statements regarding
barriers to diagnostic assessment) than student and
situational barriers. However, the fact that this type of
barrier to diagnostic assessment exists even for these
‘successful instructors’ should be noted. In the interviews,
the professors mentioned only two types of instructor
barriers. The first one mentioned was when, in the moment,
the instructor was not sure enough of the content to provide
students with immediate feedback. The second type of
barrier was the instructor’s feedback being vague, so the
students could not necessarily move forward as much as
they would if his feedback were more specific. This came
up when the interviewer (HF) discussed papers that
Professor Evans had marked to hand back to his students:
HF: “When you do something like this, you have a couple
of ‘minus ones’ [i.e. points taken off]. Do they know what
these ‘minus ones’ are for?”
Professor Evans: “I don’t know. Basically when I have to
make a correction to how they use a formula and I put a
one here, then I put a minus one because it wasn’t
[crystal structure] 110, it was [crystal structure] 111. So I
made the correction. That’s a minus one. Here it’s not
supposed to be A. It’s supposed to be D. There’s a minus
one.”
HF: Okay.
Professor Evans: “Do they know? I don’t know. But I try
to put the minus one right next to whatever correction I
had to make to make their calculation work.”
Situational barriers
The final type of barrier is the situational barrier (30% of all
coded statements regarding barriers to diagnostic
assessment). This area encompasses all those barriers
mentioned that are not directly caused by the students or
professors. Consistent with the literature (Black and
Wiliam, 1998b; OECD, 2005), these barriers came about
due to the inevitable circumstances of teaching, such as a
large number of students, a large quantity of content, and a
wide distribution of students’ abilities. Of the situational
codes in our study, the most frequently mentioned was ‘time
constraints’, meaning that the professor did not feel he had
enough time to do some part of the diagnostic assessment
cycle well or even at all. Professor Peterson provides the
following explanation of why time constraints make
implementing diagnostic assessment so difficult.
Professor Peterson: “There's always a tension between
what you want to cover in class. Are we willing to not
cover a lot of material in a chapter so that we can go
back and forth in a feedback thing?… So if you've got to
cover it all, then, obviously, you just don't have a lot of
time for some of these other things like discussion.”
Driving forces
Informed or uninformed techniques Besides the four
elements of diagnostic assessment and the barriers to its
implementation, there are also certain driving forces that
either propel the cycle forward or detract from it (Fig. 4).
As with any technique, the diagnostic assessment
techniques that a professor uses can either be well informed
or uninformed, depending on the professor’s grounding in
pedagogical content knowledge (Mishra and Koehler,
2006). In our data we see that uninformed techniques that
detract from the cycle (36% of coded statements regarding
driving forces for diagnostic assessment) were drawn from
the professor’s intuition, or occurred when the professor did
not have a reason for his actions. Simply relying on one’s
intuition was one of the most common forces detracting
from the cycle. This can take the form of the professor replacing
measurement and feedback from students with intuition on
what teaching adjustments to make. The following quote
illustrates how a professor might use his intuition rather
than make a measurement or seek feedback.
Professor Peterson: “I'm not one that thinks about what I
do too much which is probably not good. I don't know. I
just kind of do what feels good. Yeah. I don't sit back and
analyze things. I just don't. I just never have.”
Fig. 4 Frequency with which each category of driving force for
diagnostic assessment was coded in the interview data. These are axial
categories, each with many codes.
We saw that informed techniques (37% of coded statements
regarding driving forces for diagnostic assessment) that
drove professors forward in the diagnostic assessment cycle
were those that drew from some body of knowledge outside
of the professors’ own intuition. When using informed
techniques, each professor in our study based his decision to
do diagnostic assessment on sources, such as the professor’s
experiences with previous classes, a previous measurement
the professor took at another time, or information from
external resources, such as colleagues or literature. These
informed techniques drove the cycle forward, because the
professor was using a concrete measurement of student
learning. Thus, he could address the students’ needs more
accurately than if he was relying solely on his intuition. In
the following example, Professor Evans adjusted his
teaching based on his past experience with students, using
that experience as a form of measurement on which to base
his adjustment.
Professor Evans: “There’s a history here as we’ve had
issues in previous years. We did make previous tests
available, because some students had them from brothers
and suitemates and what-not. Other students complained
because there would be too much repetition. So we
committed to making them available, but then we also
committed to making partially new material for every test
so that it becomes a bit of a standard.”
Positive attitude towards diagnostic assessment The
final driving force found for the diagnostic assessment cycle
was a positive attitude towards the implementation of
diagnostic assessment (27% of coded statements regarding
driving forces for diagnostic assessment). Beliefs that a
diagnostic assessment strategy is important, or that it should
be implemented, will further steer professors towards the
cycle because of the increased professor ‘buy-in’ of the idea
of effective diagnostic assessment (Shepard, 2000). In our
data, we saw that this driving force most commonly
encompassed the belief that diagnostic assessment was
working, which served as a motivation for the professors to
continue formative strategies they had already been
implementing, as seen in the quote below.
HF: “So when do you think feedback is useful? You
mentioned when they're having a hard time for example.”
Professor Peterson: “Well heck, it's always useful.”
Fig. 5 Diagnostic Assessment Cycle, Motivations, and Barriers for Professor Peterson
Case studies
Both professors discussed the diagnostic assessment cycle
to varying degrees, with the aforementioned emphasis on
‘measurement’ and ‘feedback’. However, the specific
techniques that professors used in order to implement the
four key elements provide a window into better
understanding each professor’s approach to diagnostic
assessment. The cycle of the four key elements to diagnostic
assessment from Fig. 1 has been personalized in the case
studies below for both of the professors, using the codes
most frequently mentioned by each professor and the codes
most specific to the professor. The context of these codes
within the interviews reveals the professors’ various
motivations for doing diagnostic assessment techniques and
reasons for not doing them. It should be noted, though, that
the motivations are not always consistent with the idea of
diagnostic assessment because at times a professor may
implement a technique, but not with the desire to assess his
students. For example, a professor may construct a quiz that
closely mimics the homework problems not because he
wants the students to learn from feedback on the homework,
but rather because he wants to be blameless if some students
do not do well on the quiz (i.e. the students knew what was
coming and cannot accuse the professor of surprising them).
The motivations are noted by the arrow pointing to the key
element that the professor implemented. The arrow pointing
away indicates the barriers for that professor.
Case Study: Professor Peterson (Fig. 5)
Diagnostic assessment techniques used by Professor Peterson
Flexible lesson plans One formative aspect of Professor
Peterson’s teaching is his willingness to adjust his teaching
based on students’ needs. The main way he does this is
through being flexible with his daily lesson plans. He goes
into every class period with a general idea of what he wants
to cover, but he does not strictly follow a lesson plan.
Because of this flexibility, he is able to adjust the time he
spends on certain topics and the way he covers the content
based on his sense of the students’ needs. Professor
Peterson is versatile in the way he adjusts his teaching, and
tries to use a variety of techniques to help students
understand a particularly difficult topic.
Questioning A second way that Professor Peterson fosters a
formative environment is through asking questions and
getting students to ask questions. Professor Peterson asks
his students a lot of questions, and often gives them
significant wait-time (Rowe, 1972) to think about the
questions, because he wants to measure student
understanding. He also encourages his students to ask him
questions, so that he knows when they are getting stuck and
what part of the subject matter is confusing them. He then
uses the information he gains from asking questions and
from student questions to help him know how best to adjust
his teaching.
Atmosphere of engagement and peer-assessment
Professor Peterson explicitly focuses on engagement of
students during class to foster measurement and feedback.
To set the stage for engagement, Professor Peterson
intentionally creates a comfortable, non-threatening
atmosphere in his classes that encourages student questions.
He also tries to answer student questions in a formative
way, with explanations or more questions to give students
an opportunity to think about the issue instead of just giving
them short answers that don’t make them think. In addition
to questioning, Professor Peterson also gets students
involved through peer-assessment by having the students in
the class answer their peers’ questions rather than
answering the questions himself, and when he has students
work through problems in small groups.
Clear assessment expectations Professor Peterson makes
his assessment expectations abundantly clear. One of the
ways he does this is through making a big point of directly
basing his quizzes and exams on homework problems.
These problems are web-based and graded automatically.
Additionally, Professor Peterson makes complete solutions
available to the students after the assignments are due. The
assignments can then be formative in that students’ thinking
is measured and they receive feedback through the grading
and posted answer keys. Professor Peterson does expect the
students to learn from this homework and be able to solve
similar problems on quizzes and exams. On one occasion,
when returning a quiz, he explicitly showed the students
what homework problems the quiz problems came from, “I
told them the quiz would come from the problem sets, and
here, ‘I’m showing you exactly where it came from.’” This
cycle of repetitive assessment and feedback is a recurring
theme to Professor Peterson’s teaching.
What motivates Professor Peterson to use diagnostic assessment?
Professor as provider Throughout his discussions about
diagnostic assessment techniques, Professor Peterson
demonstrated an overall characterization of the professor as
the provider (Harden and Crosby, 2000). Many of his
motivations for doing diagnostic assessment revolve around
the idea that his role is to provide what the students need to
learn. This can mean providing opportunities for
involvement so he can gauge student thinking, creating a
welcoming atmosphere so students have opportunities to
make their thinking clear, or making expectations clear so
that students know a path forward to achieving learning
goals. For example, Professor Peterson was often observed
asking questions in class, but often he would pause until it
was clear that several of the students were ready to answer
rather than just taking the first hand. When asked why he does this, he said:
“…I want a dialogue, right. Maybe some of them thought
a little bit there, ‘I wonder what are we looking at here,’
… It gets at least some dialogue going, maybe not
between me and everybody, but at least some of the kids
are thinking. The fact that four or five of them are
answering questions means that just I made them do
some thinking.”
By asking questions with a prolonged wait-time,
Professor Peterson provides space for the students to think
and not just listen to a lecture (Rowe, 1972). Additionally,
he often answers students’ questions in a formative way by
not giving a quick answer but instead asking the student
more questions or turning the question over to the class.
Again, he provides for the students by giving them many
opportunities to build their understanding during class time.
Blame avoidance Professor Peterson’s motivations to be a
provider stem in part from a defensive compulsion to have
the students, rather than him, be to blame when they fail
(Hesse, 1989). If he fulfills his role of providing what the
students need, then he cannot be blamed for any student
failures. This is clearly seen in his discussion of how students should know how to prepare for summative assessments such as quizzes:
If they blew it, that’s just fine. I don’t - I’m not here to
win [the] popular [vote] - if they didn’t do well, ‘You are
to blame, don’t blame me. I didn’t pull anything unusual
that you didn’t expect here. It was laid out exactly as I
told you it was going to be laid out. And if you screwed
up I hope the blame comes back to you. Don’t make it my
fault, because I made it pretty clear, and here it is.’
That’s okay. If they’re upset that’s just fine.
Professor Peterson is a provider of tools and opportunity
to succeed, but the students must have initiative and
responsibility to take advantage of what is provided.
Barriers to diagnostic assessment perceived by Professor Peterson
Students ignore feedback One of the most significant
barriers Professor Peterson sees to the successful use of
diagnostic assessment strategies is when students do not
take advantage of the feedback that is provided to them.
This is a corollary to his motivation to provide them with
feedback (e.g. graded homework) – if they do not pay
sufficient attention to the feedback given, they will not be
able to learn from their mistakes.
Professor Peterson: “After all these years of teaching it
still amazes me how some kids do as poorly as they do.
Especially, given that they knew it was coming. Came out
of problem sets. They had complete solutions to those in
advance. We have had kids [scoring] 10, 15 points out of
33. I just have to conclude they’re not working, that’s
all”.
HF: “So, that’s what you would attribute it to?”
Professor Peterson: “I don’t know what else I can
conclude. I mean, they … yeah. I think so, I mean, I’m
not a hundred percent [sure], but certainly a high
majority of them, I think, are doing poorly because
they’re just not putting forth what I consider the amount
of effort that you need to.”
This is a barrier that Professor Peterson sees as beyond his control in that it originates from the students’ lack of a responsible attitude.
Fig. 6 Diagnostic Assessment cycle, barriers, and motivations for Professor Evans.
Coverage pressure A significant barrier perceived by
Professor Peterson is the pressure to cover a large amount
of material in a limited amount of time (OECD, 2005). He
indicates that the coverage pressure prevents him from
doing more of the diagnostic assessment techniques that he
values, and prevents him from pursuing other diagnostic
assessment techniques that he is aware of. For example, to achieve sufficient efficiency he sometimes relies on intuition rather than using a diagnostic assessment technique to measure how well students are achieving the learning outcomes.
HF: “Did you have any feeling yesterday for how many
were following that bit [a topic being discussed in
class]?”
Professor Peterson: “You see, I guess this is where - this
is probably a place where I could benefit by having some
clicker [i.e. classroom response system] stuff. … Maybe
that’s the direction I should go with some of this stuff.
Then I could say, ‘Okay, if I’m going to spend ten more
minutes on clickers during a class, okay, then what do I
chop out?’ ….
Yeah, so, did they all get it? I don’t know. Maybe… So,
how many were with me? I think most of them. But, you
can kind of tell. ”
HF: “What are the hints?”
Professor Peterson: “Oh, you can see it on their faces.”
In this dialogue we can see that Professor Peterson sees
some diagnostic assessment techniques as potentially
useful, but not worth the extra class time that he thinks they
would use. He is also confident enough in his own intuition that the added benefit of taking a measurement may not be worth the extra time he assumes it would take.
Case study: Professor Evans (Fig. 6)
Diagnostic assessment techniques used by Professor Evans
Professor Evans uses a rich and varied arsenal of diagnostic assessment techniques. In this paper we describe those he discussed most often, including clickers, peer-assessment, holistic office hour sessions, and repetitive assessment/feedback cycles.
Clickers In using clickers, Professor Evans accomplishes
all four aspects of the diagnostic assessment cycle
simultaneously (MacArthur and Jones, 2008). During class
time, Professor Evans projects a clicker question, most
often a content question, but sometimes a question about
such things as student expectations for the course or their
study habits. The questions themselves provide the students
with an idea of the target for the course, in that it is clear what the students are expected to be able to accomplish.
After the students answer the question with their clickers,
the results are posted as a histogram. This information is
valuable to Professor Evans as a measurement of whether
the students are achieving the learning outcome. This serves
as feedback for him to determine the extent to which he
should adjust his teaching to forge ahead or have the class
dwell on the topic at hand. The clicker questions are also
formative for the students in that they receive timely
feedback on whether or not their answer fits with the
answers of their peers. As the question is discussed and, in
the case of objective questions, the correct answer is
revealed, students can use that information as feedback to
adjust their subsequent learning.
Peer assessment Several times during the course of the
semester, Professor Evans spent the full class period having
the students complete what he calls ‘workshops’ in which
they work in groups of three to answer questions and solve
problems regarding an application of chemistry to materials
science (e.g. one workshop observed dealt with solid state
phase diagrams and the manufacture of Samurai swords).
As the students work in their groups, Professor Evans
circulates about the class, listening, answering questions,
and making occasional statements to the group as a whole.
These workshops serve several formative purposes. As the
students share their ideas with peers, they get feedback from
each other, which then helps them to adjust subsequent
learning (Mazur, 1997). As Professor Evans eavesdrops on
students and answers questions, he gathers a measurement
of where the students are having trouble and this serves as
feedback for what clarifying announcements to make as the
groups continue in the workshop.
Holistic office hours In the interviews, Professor Evans placed frequent emphasis on the formative role that one-on-one interactions during office hours can play. Office hours are unique in that they are an opportunity for an instructor to discuss holistic issues with the student, such as study habits, time management, prioritization, and adjusting to the freedom that comes with being a first-year college student (the engineering class that Professor Evans was teaching was more than 90% first-year students). Professor
Evans recalled more than one occasion where he and the
student were able to discuss unsatisfactory progress in the
class in terms of content expectations, process expectations,
and many aspects of the student’s situation including
method of studying, time spent studying, good habits (e.g.
sleep and exercise), and possible distractions from study,
such as paid work or unproductive activities (e.g. video
games). This sort of holistic measurement and feedback to
the students is different for each student based on their
personal situation; it is only practical in such a private
setting.
Repetitive assessment/feedback cycle Similarly to Professor Peterson, Professor Evans structured the graded aspects of the class around a repetitive assessment/feedback cycle (Pellegrino et al., 2001). He presented the immediate aim of the homework as preparation for taking homework-based quizzes and examinations. This was communicated to
the students to give them a path forward for study, so that
they knew what to expect. Keys for the homework were
posted so that the students could compare their completed
homework against the correct solutions and get feedback in
this way. This method of posting a written key is used not
only for homework, but also for all written work, including
exams.
What motivates Professor Evans to use diagnostic assessment?
Professor as guide and advisor Professor Evans’ motivation to use diagnostic assessment techniques can be understood within a characterization of the professor as a guide and advisor to his students (Harden and Crosby, 2000). He wants to do all he can to advise his students and guide them as they take his class. The clearest evidence for this comes from Professor Evans’ discussion of test results during his meetings with students in office hours.
“I ask them a lot of questions like, ‘How many were
dumb mistakes? How did you feel when you left the test?’
And even on a question-by-question basis, especially on
the ones they got wrong. Focusing like, ‘What were you
thinking on this question? Why do you think you got it
wrong? Did you think you had it right when you left the
test? Did you know you hadn’t studied the right stuff?’
Their answers provide a lot of insight into how well
prepared they were, how much test anxiety they
experienced, and just where their problem-solving skills
are at.
So usually based on that, if I assess it is dumb mistakes
then we usually talk more about getting enough sleep,
getting some exercise, being mentally prepared. What
can the student do to eliminate dumb mistakes and
practice being more careful doing the homework? If it
was lack of preparation, then I might say ‘You could really just study the notes more because these are the types of problems you got wrong.’ If it was an inability
to solve problems - if their problem-solving skills are
really lacking then I really dive into how are they doing
the homework. What adjustments they could make and
how they’re approaching the homework itself, because
that’s where they will improve their problem-solving
skills.
At that point I’ll ask them, ‘Do you have a picture in
mind when you start these types of problems or do you
have a mental picture that’s guiding you?’ And then we
differentiate between problems they’ve seen before and
what they haven’t seen before. I just keep asking them a
lot of questions to try to really get at what derailed the
student on a particular problem and on a particular type
of problem. One often sees a pattern in a test. Like all the
quantitative problem-solving ones, those are the ones
that they got most of their points off. Or all the short
answer conceptual ones. That’s where they got more of
their points off or that type of thing.
So it’s a dialogue that kind of goes back and forth. I
usually in the end giving them advice about how to adjust
their discipline in approaching their homework, how they
can try to improve some areas that they’re - in terms of
problem-solving, how they can best work on areas of
weakness, how they can better prepare. Things like that ”
Clear expectations Another way in which Professor Evans
is motivated to use diagnostic assessment techniques is
through his desire for each student to be able to achieve the
learning goals with the help of clear process and content
expectations (Ludwig et al., 2011). He uses homework and
quizzes together, where the homework is graded primarily on completion and the ability to do the homework correctly is measured through quizzes based on that homework:
“They have to hand in the homework, 10 points of it is
just for doing it, and not based on correctness and just a
few points for correctness. We're trying to steer away
from over-rewarding them for correct answers at this
point. And steer them much more towards doing the
homework in the right way. Then an additional 10 is the
quiz the next day which will just take some of those
problems and the reason for that, is I've found that I want
to steer students toward how to do homework which is
you should do homework in a way so that you can
reproduce the results by yourself in a quiz situation the
next day. If you can do that, then you've understood the
homework problem. It's not just about getting the right
answer once. We've had a huge problem in the past
where groups of five to ten students will get together on
Wednesday night, the one smart student has worked
through most of it and everyone is just copying like mad,
and learning very little from the homework, jumping
through the hoop but not getting anything out of it. So
we've really tried to defang the homework in terms of
points for correctness, so they're not as motivated to do
that. ”
This quote illustrates again how Professor Evans guides his students, this time by designing class policies that hold them individually accountable and steer them towards good study habits. He has adjusted his policies in response to what the class needs, so that students have a better chance of being guided to success.
Barriers to diagnostic assessment perceived by Professor Evans
Similarly to Professor Peterson, Professor Evans finds that the most significant barriers to his use of diagnostic assessment are that students do not take advantage of the guidance he offers and that he does not have enough time to give more guidance than he already does. Additionally, he must
sometimes cut diagnostic assessment events short when he
is running out of classroom time.
Students do not take advantage of feedback opportunities After the students in his class received a
midterm exam back, Professor Evans invited those who
struggled to come by office hours to receive help, but only
about five out of one hundred students came. When asked if
he had hoped for more he said:
“There are more students than that that need help. Do I
hope for more? That's a hard one for me to answer. If I
can help them, I would hope that they would come to my
office. …. But there's probably some that just don't
accept the invitation because they’re still intimidated,
don’t want to come talk [to] a professor about it because
they’re afraid that the professor will make them feel bad
or dumb because they didn’t do well on the test. So I
guess I would hope for more because there are more that
could use the help.”
Here Professor Evans identifies the reluctance of the
students to seek out his guidance as a barrier to his giving
them the diagnostic feedback that they could benefit from.
Time constraints On the other hand, even though Professor
Evans clearly is willing to give detailed guidance to any
student who comes to his office, he also knows that lack of
time would be a barrier to making this work (Black and Wiliam, 1998b; OECD, 2005) even if he removed the barrier
by requiring students to make one visit to his office.
“I could imagine building something into an assignment
where they were required to do it once, somehow, the
first time was sort of required, like you must come ask
some question and come in a group of three or
something. … I don't know, I haven't tried that stuff. …
I'm not sure it's a good idea, it would be a big time
commitment on my part, I don't know how the students
would perceive it. … With 111 students it's a big task. I
wouldn't want them all to have to come one at a time,
because that would take up the whole week.”
We see that part of his time constraint here originates
with the large number of students that he was teaching
during the semester he was a participant in the study.
Although each section of students was a reasonable size
(less than 40), he was teaching three sections for a total of
111 students.
Time constraints were also sometimes perceived by
Professor Evans to be a barrier to diagnostic assessment in
the classroom (Black and Wiliam, 1998b; OECD, 2005).
For example, even though Professor Evans often used clickers to good effect, he only sometimes followed a clicker question by telling the students to turn to their neighbor and discuss what answers they had and why. When asked “How do you
decide when to do it and when not to do it?” Professor
Evans responded, “Yesterday there was a bit of a time
limitation, which I knew I had almost exactly 50 minutes of
stuff [content to present].” On another occasion, Professor Evans even decreased the amount of time students had to ‘click in’ their answers:
HF: You had a clicker question on [the solubility of]
potassium iodide, lead nitrate. You put 10 seconds on.
You had 24 students click in. That wasn’t the whole class.
Right?
Professor Evans: No. Probably not.
HF: Why so short and why so few students?
Professor Evans: I was probably feeling a little pressed
for time.
Conclusions and implications for further
research
In order to conclude this paper, we invite the reader to
consider again the problem that motivates this research and
our research question. The problem: there is an
implementation gap in that many college-level chemistry
instructors do not fully take advantage of the diagnostic
assessment paradigm in their teaching. The research
question: for those professors who do use diagnostic
assessment, what do they do, what motivates them to do it,
and what barriers do they face? Our hope is that if we can
answer our research question we can shed light on how to
address the problem of the implementation gap. For
example, if we can illustrate how successful chemistry
instructors do use diagnostic assessment, this could serve as
a foothold for others to recognize what current practices
they could invigorate or new practices they could initiate to
bring their teaching more in line with the diagnostic
assessment paradigm. Below we highlight some of the
results of these two case studies, to emphasize that these
footholds are apparent.
It is clear that in these two cases, the professors
repeatedly integrated diagnostic assessment into their
courses and integrated the various elements of diagnostic
assessment together. Successful diagnostic assessment has
several elements that play into and support each other. Both
professors in this study engaged in all four elements of the
diagnostic assessment cycle in this way. Reflecting on the case studies, we can find examples of how one element of the cycle leads into another. For example, when the students
are aware of the target through the syllabus or verbal
comments made in class or even through a promise that they
will be held accountable for the homework on a test, then
the students know what to aim for and this sets up the
opportunity for measurement. In the measurement, whether
it be through using clickers, or collecting written work, or
using good wait time along with questions in class, the
professors in this study found out where their students were, and that equipped them to give feedback. For
the feedback, whether it be through written comments on
written work, or through observing and commenting on peer
interactions in class, the professors gave feedback to
students and received feedback from students so that both
parties could make adjustments. When making adjustments
by keeping lesson plans flexible, or giving more time to a
topic in response to clicker question results, the instructors
reinforce for the students what the target expectations are
by dwelling on important content until the learning
outcomes are achieved. For the professors in this study, the
cycle continued until an acceptable level of mastery was
met in the class and the cycle was revisited as new material
was presented.
It is also clear that many techniques that might otherwise be seen as common teaching techniques have been used by these two professors in a way that is consistent with the principles of diagnostic assessment. Successful
diagnostic assessment can happen with small changes to
common teaching techniques. For example, from Table 1,
the technique most discussed in the interviews was ‘asking
questions’. Research has shown that many teachers ask
questions during class, but that many also do so without
sufficient wait-time (Rowe, 1972). By making a small
change and consciously waiting after asking a question, an
instructor can turn questions into valid measurement events
where everyone in the class has enough time to engage with
the question and commit to an answer. The instructor can
collect the data from the measurement in a low tech way,
such as the verbal discussions that Professor Peterson had,
or in a high tech way, such as using clickers like Professor
Evans did. We assert that the techniques used by these
professors (Table 1) are not terribly distant from the
teaching of many other professors who may not even be
aware of the diagnostic assessment paradigm. These should
be accessible to any instructor who is committed to
introducing more diagnostic assessment into their
classroom.
Many of the barriers perceived by the professors studied
here are probably unavoidable; at the same time it is
encouraging to clearly see that diagnostic assessment can
happen despite perceived barriers. The barrier cited most
often, lack of time to perform diagnostic assessment, is of
course the barrier cited for many things left undone in life.
Even so, as these two professors demonstrated, when an instructor is motivated to make diagnostic assessment a priority, it can be done. The long list of examples of diagnostic assessment
performed by these two instructors shows the extent to
which the techniques can be incorporated despite time
constraints. Further, with a very small amount of additional
time (e.g. increasing wait-time from 10 to 20 seconds a few
times per class period), diagnostic assessment can be
increased with very little sacrifice of content coverage. The
many ‘student barriers’ listed suggest that diagnostic
assessment is not as effective as it could be for those
students who don’t pay attention, don’t put in the required
work, are apathetic, ignore feedback, etc. This of course
does not imply that quality pedagogical choices are wasted
on the remaining students. Indeed, the body of research on
the effectiveness of diagnostic assessment suggests that it
can play an important role in removing many of these
student barriers by, for example, engaging students to make
them more likely to pay attention in class and to overcome
their apathy towards the class.
It is interesting to note that some barriers perceived as
significant by one of the professors were not mentioned by
the other. This suggests that the barriers we perceive may be
overcome by the insights of our peers. This is a dialogue
that has not happened between Professors Peterson and
Evans yet. Still, as research such as this is expanded, we
anticipate that common barriers can be better understood, as
will the ways to overcome them.
It is informative, but not surprising, to note that different
professors can implement best practices in different ways.
This suggests that qualitative research, such as this, can
uncover and illustrate good practices that can inform
struggling and successful instructors alike. There is more
than one excellent way to teach even within a somewhat
constrained paradigm of teaching. And while high-quality teachers will likely come to these paradigms even without
ever having any formal training in teaching, observing and
reflecting on good practice can help us to further
conceptualize what works and why.
Acknowledgments
We would like to thank Professors Peterson and Evans for
generously giving of their time to make this research
possible.
References
Angelo T. A., (1990), Classroom assessment: improving learning
quality where it matters most, New Direct. Teach. Learn., 42, 71-82.
Atkin J. M., Black P. and Coffey J., (Eds.), (2001), Classroom
assessment and the National Science Education Standards,
Washington DC: National Academy Press.
BCCE, (2008), Biennial Conference on Chemical Education,
Bloomington, IN.
Black P. and Wiliam D., (1998a), Assessment and classroom learning,
Assess. Educ., 5, 7-74.
Black P. J. and Wiliam D., (1998b), Inside the Black Box: raising
standards through classroom assessment, Phi Delta Kappa, 80,
139-148.
Brainard J., (2007), The tough road to better science teaching, Chron.
High. Educ., 53, 1-3.
Bransford J. D., Brown A. L. and Cocking R. R., (2000), How people
learn: brain, mind, experience, and school. Washington, DC:
National Academy Press, pp. 131-154.
Brooks D. W., Schraw G. and Crippen K. J., (2005), Performance-related feedback: the hallmark of efficient instruction, J. Chem.
Educ., 82, 641-644.
Chappuis S., (2005), Is formative assessment losing its meaning?,
Educ. Week, 24, 38.
Creswell J. W., (2003), Research design: qualitative, quantitative, and
mixed method approaches, 2nd ed. Thousand Oaks, CA: Sage.
Fuchs L. S. and Fuchs D., (1986), Effects of systematic formative
evaluation: a meta-analysis. Except. Child., 53, 199-208.
Harden R. M. and Crosby J. R., (2000), The good teacher is more than
a lecturer – the twelve roles of the teacher, Med. Teach., 22, 334-347.
Harlen W., (2003), Enhancing inquiry through formative assessment,
San Francisco: Exploratorium Institute for Inquiry. Retrieved
August 2007 from
http://www.exploratorium.edu/IFI/resources/harlen_monograph.pdf
Harlen W. and James M., (1997), Assessment and learning: differences
and relationships between formative and summative assessment,
Assess. Educ., 4, 365–379.
Hattie J., (2009), Visible learning: a synthesis of over 800 meta-analyses relating to achievement, London: Taylor and Francis.
Hattie J. and Timperley H., (2007), The power of feedback, Rev. Educ.
Res., 77, 81-112.
Henderson C. and Dancy M. H., (2007), Barriers to the use of research-based instructional strategies: the influence of both individual and
situational characteristics. Phys. Rev. Spec. Top.: Phys. Educ. Res.,
3, 1-14.
Henderson C., Finkelstein N. and Beach A., (2010), Beyond
dissemination in college science teaching: an introduction to four
core change strategies, J. Coll. Sci. Teach., 39, 18-25.
Hesse J., (1989), From naive to knowledgable, Sci. Teach., 56, 55-58.
Kember D., (1997), A reconceptualisation of the research into
university academics' conceptions of teaching, Learn. Instr., 7,
255-275.
Kluger A. N. and DeNisi A., (1996), The effects of feedback
interventions on performance: a historical review, a meta-analysis,
and a preliminary feedback intervention theory, Psych. Bull., 119,
254-284.
Leahy S., Lyon C., Thompson M. and Wiliam D., (2005), Classroom
assessment minute by minute, day by day, Educ. Lead., 63, 18-24.
Ludwig M., Bentz A. and Fynewever H., (2011), Your syllabus should
set the stage for assessment for learning, J. Coll. Sci. Teach., 40,
20-23.
MacArthur J. R. and Jones L. L., (2008), A review of literature reports
of clickers applicable to college chemistry classrooms, Chem.
Educ. Res. Pract., 9, 187-195.
Marshall J. M. (2005), Formative assessment: mapping the road to
success, Princeton Rev.
http://www.scribd.com/doc/26211684/Princeton-ReviewFormative-Assessment-Pitch (accessed May, 2010)
Marton F., (1981), Phenomenography – describing conceptions of the
world around us, Instruct. Sci., 10, 177-200.
Marton F. and Booth S., (1997), Learning and awareness, Mahwah, New Jersey: Lawrence Erlbaum.
Mazur E., (1997), Peer instruction, Upper Saddle River, NJ: Prentice
Hall.
McNamee G. D. and Chen J. Q., (2005), Dissolving the line between
assessment and teaching, Educ. Lead., 63, 72-76.
Mishra P. and Koehler M. J., (2006), Technological pedagogical
content knowledge: a framework for teacher knowledge, Teach.
Coll. Rec., 108, 1017–1054.
Nyquist J. B., (2003), The benefits of reconstructing feedback as a
larger system of formative assessment: a meta-analysis.
Unpublished master’s thesis. Nashville, TN: Vanderbilt University,
as cited in Wiliam D., An integrative summary of the research
literature and implications for a new theory of formative
assessment, in Handbook of formative assessment, (2010),
Andrade H. L. and Cizek G. J. (eds) New York: Routledge, pp. 18-40.
OECD: Organisation for economic co-operation and development,
(2005), Formative assessment: improving learning in secondary
schools, Paris.
Orgil M., (2007), Phenomenography in theoretical frameworks for
research in chemistry/science education, Bodner G. and Orgil M.,
(eds.) Upper Saddle River, NJ: Pearson Prentice Hall, pp. 132-151.
Pellegrino J., Chudowsky N. and Glaser R., (2001), Knowing what
students know: the science and design of educational assessment,
Washington, DC: National Academy Press.
Rowe M. B., (1972), Wait-time and rewards as instructional variables,
their influence in language, logic, and fate control, National
Association for Research in Science Teaching, Chicago, IL, ED
061 103.
Shepard L. A., (2000), The role of assessment in a learning culture,
Educ. Res., 29, 4–14.
Shute V. J., (2008), Focus on formative feedback, Rev. Educ. Res., 78,
153-189.
Steadman M., (1998), Using classroom assessment to change both
teaching and learning, New Direct. Teach. Learn., 75, 23-35.
Stiggins R. J., (1992), High quality classroom assessment: what does it
really mean? Educ. Meas.: Iss. Pract., 11, 35–39.
Stiggins R. J., (2002). Assessment crisis: the absence of assessment for
learning. Phi Delta Kappa Intl, 83, 758-765.
Trigwell K. and Prosser M., (2003), Qualitative differences in
university teaching, in Access and Exclusion, M. Tight (ed.)
Oxford: JAI Elsevier.
Wiliam D., Lee C., Harrison C. and Black P., (2004), Teachers
developing assessment for learning: impact on student
achievement, Assess. Educ.: Principles, Policy Pract., 11, 49-65
Wilson M. and Scalise K., (2006), Assessment to improve learning in
higher education: the BEAR assessment system, High. Educ., 52,
635-663.
Wilson M. and Sloane K., (2000), From principles to practice: an
embedded assessment system, Appl. Meas. Educ., 13, 181-208.
Ziebarth S.W., Fynewever H., Akom G., Cummings K. E., Bentz A.,
Engelman J. A., Ludwig M., Noakes L. A. and Rusiecki E., (2009),
Current developments in assessment for learning in universities
and high schools in Michigan: problems and perspectives in
mathematics and science education, J. MultiDisc. Eval., 6, 1-22.