Student Ratings of Instruction at Metropolitan State College
Recommendations of the Subcommittee to the Faculty Evaluation Task Force
August 2010
Subcommittee Members:
Ellen Boswell
Juan Dempere
Erick Erickson
Clark Germann
Jeffrey Lewis
Ruth Ann Nyhus
Mark Potter (chair)
David Sullivan
Sheila Thompson
Jacob Welch
Metropolitan State College of Denver
Contents
1. Executive Summary
2. Teaching and Instruction in the Higher Educational Setting
3. Terminology and Intent
4. Review, Adaptation, and Development Processes
5. Proposed Metro State Instrument
6. Administration, Reporting, and Evaluation Recommendations
7. Proposal for a Pilot
8. Appendix
9. References
1. Executive Summary
Members of the Student Ratings of Instruction (SRI) subcommittee respectfully submit the
following recommendations to the Faculty Evaluation Task Force. We recommend that:
1. The Faculty Evaluation Task Force adopt an understanding of instructional
responsibilities, as they pertain to the overall evaluation of teaching, to include
instructional content, instructional design, instructional delivery, and instructional
assessment.
2. The College consistently adopt the terminology of “student ratings of instruction” (SRIs)
in place of what has been termed “student evaluation of teaching.”
3. Through every level of the evaluation process reviewers draw conclusions from a variety
of available sources, including, as determined by the FETF, faculty self assessment, peer
observations, department chair observations, and peer review of instructional materials.
4. The College retain items 1 and 3 from the current SRI instrument to use as the two
“global items” for summative evaluation.
5. The College adopt, after careful piloting, the SRI instrument described in Section 5 of
this report.
6. The College establish a two-pronged approach to moving toward online administration
of SRIs by a) establishing a targeted long-term goal of moving fully to online
administration of SRIs, and b) offering in the meantime, beginning in Fall 2011, the
option to individual instructors in face-to-face and hybrid classes of administering SRIs
either online or by paper.
7. Administrators and faculty alike pursue efforts, enumerated herein, to increase student
participation in online-administered SRIs.
8. The Faculty Evaluation Task Force engage faculty and administrators campus-wide on
the key question of how much SRI data is necessary for informed and meaningful
summative evaluation. This discussion should inform decisions on the desired frequency
of SRIs for faculty at different ranks.
9. For 16-week courses, SRIs be conducted during the final three weeks of instruction; for
other types of courses, a proportional timing for SRI administration be followed.
10. The original SRIs with student responses be returned to the faculty member, via Deans
Offices, along with the Office of Institutional Research (OIR)-generated report.
11. Statistical reports include a histogram presentation of scores for the two global items in
Section I of the instrument, along with the mean value and standard deviation, as are
currently reported by the OIR.
12. The Faculty Evaluation Task Force consider very carefully, in discussion with faculty and
administrators, which norms, if any, are essential for comparative purposes when
conducting summative evaluations from SRI data.
13. The Faculty Evaluation Task Force conduct scaled pilots of the proposed instrument—a
“mini-pilot” during fall semester and a large-scale pilot during spring semester, in order
to launch a new instrument by Fall 2011.
2. Teaching and Instruction in the Higher Educational Setting
Members of the subcommittee have decided neither to try to define the activity of teaching nor to specify in detail the role of the teacher. This is because of the oft-perceived “multi-dimensionality” both of the activity as a whole and of certain aspects of the role (Marsh & Roche,
1997; Theall & Arreola, 2006). We aspire to create as best we can an evaluation system that
reflects this multi-dimensionality.
Teaching is a complex and reflective human activity that, in the higher education context, is
offered in a forum that is advanced (“higher”), semi-public, and essentially critical in nature. As instructor, a teacher’s most important responsibilities to his/her students are the following, which we have adapted from Theall and Arreola (2006):
1. To possess recognized knowledge and/or relevant experience (content expertise);
2. To re-order and re-organize this knowledge/experience for student learning
(instructional design);
3. To communicate and ‘translate’ this knowledge/experience into a format accessible to
students (instructional delivery); and
4. To evaluate the mastery and other accomplishments of students (instructional
assessment).
We distinguish, thus, between teaching and instruction. Although instruction is a large part of
the teaching activity, it in no way comprises all of the activities, goals or concerns of the college
professor. Professors typically aspire to a number of other purposes in the classroom that may
include encouraging their students to long for the truth, to aspire to achievement, to emulate
heroes, to become just, or to do good, for example (Weimer, 2010). In establishing a roadmap
for evaluation, we are being careful not to reduce teaching to that which is measurable. Our
focus is thus on the four instructional responsibilities defined above. We encourage the Faculty
Evaluation Task Force to adopt a similar understanding of instructional responsibilities as they
pertain to the overall evaluation of teaching (Recommendation #1).
As complex and multi-dimensional as the nature of teaching is, we proceed according to the
following principles that are supported by research and scholarship:
1. Evaluation of teaching, including that based on Student Ratings of Instruction, must be
sufficiently comprehensive and flexible to recognize that different teaching modalities
require different sets of skills and strengths.
2. Evaluation of teaching, including that based on Student Ratings of Instruction, should
lend itself to both summative and formative decisions.
3. Student Ratings of Instruction are not uniformly appropriate as a means of gathering
data on each of the four sets of instructional responsibilities defined above. Additional
methods of compiling data for evaluation include:
a. Peer observation,
b. Peer review of instructional materials,
c. Department chair observation, and
d. Self-evaluation.
The following table provides further description of the four instructional responsibilities, weighs
the appropriateness of using SRIs to measure them, and indicates alternative appropriate data-gathering methods (Arreola, 2007; Bain, 2004; Berk, 2006; Theall & Arreola, 2006).
Content expertise. Includes the formally recognized knowledge, skills, and abilities a faculty member possesses in a chosen field by virtue of advanced training, education, and/or experience.
Appropriateness of student ratings: Instructors bring their expertise to bear on selecting the content they judge best for maximizing student learning outcomes. Students are not in an appropriate position to evaluate these choices. Bottom line: SRIs are not appropriate for evaluating this instructional responsibility.
Alternate methods of data gathering: Peer review of instructional materials.

Instructional design. Determines how students interact with content, and includes designing, sequencing, and presenting experiences intended to induce learning.
Appropriateness of student ratings: SRI items may ask about learning opportunities, assignments, exams, clarity of objectives, course materials, etc.
Alternate methods of data gathering: Self-evaluation; peer review of instructional materials; peer observation; department chair observation.

Instructional delivery. Includes those human interactions that promote or facilitate learning, as well as various forms of instructional delivery mechanisms. In face-to-face (traditional classroom) instruction, instructional delivery involves, for example: giving organized presentations; motivating students; generating enthusiasm; and communicating effectively. Online instruction may also require using various technologies and applications.
Appropriateness of student ratings: SRI items may ask about instructor clarity, enthusiasm, openness to student input, class management, use of technology, etc.
Alternate methods of data gathering: Peer observation; department chair observation; peer review of instructional materials; self-evaluation.

Instructional assessment. Includes developing and using tools and procedures for assessing student learning both to provide feedback and to assign grades.
Appropriateness of student ratings: SRI items may ask about timeliness, frequency, and usefulness of feedback. SRIs can also ask students to report their perceptions of fairness and accuracy of assessments and alignment of assessments with course learning objectives.
Alternate methods of data gathering: Department chair observation; peer observation; peer review of instructional materials; self-evaluation.
3. Terminology and Intent
Presently, the Handbook for Professional Personnel uses the terminology of “student evaluation
of teaching” to refer to student ratings. In the past, this terminology was used more or less interchangeably with “student ratings of teaching.” The lists of references in Marsh and Roche (1997) and Cashin (1995) illustrate the past currency of both sets of terms. Today, however, the terminology “student evaluation of teaching” has fallen out of use. Students
provide feedback in the form of comments and ratings, and reviewers then use this information
in a broader evaluative system to arrive at informed summative decisions (Cashin, 1995). Bain
explains: “Any good process should rely on appropriate sources of data, which are then compiled
and interpreted by an evaluator or evaluative committee. Student remarks and ratings, in other
words, are not evaluations; they are one set of data that an evaluator can take into
consideration” (2004, pp. 167-168). We want to insist on this distinction, whereby students rate
and faculty and administrators evaluate. This distinction underscores the responsibility of
evaluators at every level to consider all appropriate evidence before arriving at summative
decisions. Furthermore, Bain (2004) and Cashin (1995) agree that efforts must be made to
educate reviewers at all levels that SRI results are only one data point in an overall
comprehensive faculty evaluation system. For these reasons, we recommend that Metro State
consistently adopt the terminology of student ratings of instruction (SRIs) in place of what had
been termed student evaluation of teaching (Recommendation #2).
The SRI subcommittee has followed the lead of the Faculty Evaluation Task Force, which
identified two reasons for evaluation (February 26, 2010): We evaluate at Metro State to make
sure that tenure and promotion and other summative decisions are based on meaningful valid
data, and we evaluate to support professional growth for faculty who are interested. This dual
purpose of faculty evaluation, in support of both summative and formative decision making, is
the standard in colleges and universities today (Arreola, 2007; Seldin, 2006; Seldin & Miller,
2009). Accordingly, we have sought to ensure that SRIs lend themselves to both summative and
formative purposes.
SRIs are only one measure of instructional effectiveness. For summative decision making it is
especially important to avoid overreliance on student ratings. This warning is echoed repeatedly
in the literature. In arguing for a comprehensive portfolio-style approach to faculty evaluation,
Seldin and Miller, for example, state that “student rating numbers… do not describe one’s
professional priorities and strengths” (2009, p. 1). Berk recognizes thirteen distinct methods of
measuring teaching effectiveness and argues that “student ratings are a necessary source of
evidence of teaching effectiveness for formative, summative, and program decisions, but not a
sufficient source” (2006, p. 19, emphasis added).
We recommend that through every level of the evaluation process, reviewers draw summative
conclusions from a variety of available sources, including, as determined by the FETF, faculty
self assessment, peer observations, department chair observations, and peer review of
instructional materials. (Recommendation #3).
4. Review, Adaptation, and Development Process
The SRI subcommittee followed a thorough and deliberate approach to reviewing, adapting, and
developing a set of recommendations for an SRI instrument, for how that instrument will be
administered, and for how the student ratings and comments will be used for summative
decision-making. Our process has included a literature review, an examination of the SRI
procedures of Metro State’s peer institutions, and a review of commercial options. This process
has allowed us to identify both strengths and weaknesses of Metro State’s current approach to
SRIs.
Our literature review yielded a list of questions to guide our process (see
appendix). The lessons we took from the literature pertain to the construction of an instrument,
the administration of SRIs, and the use of student ratings for summative evaluation. We have
sought to balance these lessons with local considerations pertinent to Metro State.
In our examination of practices at peer institutions, we took care to avoid what Arreola calls the
“trap of best practices,” where “what works well at one institution may not work at all at
another” because of unique values, priorities, tradition, culture, and institutional mission (2007,
p. xvi). We thus approached our peer institutional practices through the lens of what we
determined at the outset we want from our SRIs and what we considered would “fit” with Metro
State.
From among Metro State’s thirteen peer institutions approved by the Board of Trustees, we were
able to locate and identify SRI instruments, administrative procedures, and/or evaluative
guidelines at eight: Appalachian State University, College of Charleston, CSU Chico, CSU
Fresno, CSU San Bernardino, James Madison University, Saint Cloud State University, and
University of Northern Iowa. We also reviewed the SRI process at CSU San Marcos.
There are several high-profile commercial options available as well. We took time to examine
and consider Course/Instructor Evaluation Questionnaire (CIEQ), IDEA Center Student Ratings
of Instruction, including both the short and long form, Student Instructional Report (SIR II),
and Purdue Instructor Course Evaluation System (PICES). Arreola (2007) reviews the CIEQ,
IDEA Center long form, and SIR II options. Glenn (2010) reports on faculty responses to
several commercial options, including the IDEA Center forms. CSU Fresno, as we found in our
review of peer institutions, uses the IDEA Center long form.
Through this review of peer institutional practices and commercial options, we discerned
features that were common across several instruments, and we identified features that made
certain instruments stand out as unique, for better or worse. This process enabled us to identify
features and components that we desire as part of a Metro State instrument. We found ourselves
drawn to:

• Relatively short instruments. Most peer-institution instruments contain 16 or fewer items, including scaled items, open-ended items, and student background information items. Commercial options, with the exception of the IDEA Center short (summative-only) form, tend to be longer.
• Instruments with two global questions constructed for summative evaluation. We have
remained mindful that “poorly worded or inappropriate items will not provide useful
information, whereas scores averaged across an ill-defined assortment of items offer no
basis for knowing what is to be measured” (Marsh & Roche, 1997, p. 1187). Whereas we
judged certain of the peer-institution instruments to have fallen short of this standard,
we did find that a widely-followed approach among them, as well as among the
commercial options, is to include two global items that read like the following from the
PICES form: “Overall I would rate this course as: Excellent-Good-Fair-Poor-Very Poor”
and “Overall I would rate this instructor as: Excellent-Good-Fair-Poor-Very Poor.” We
found use of these two global items, with only slight variation of wording, on instruments
at James Madison University, College of Charleston, and Appalachian State. CSU San
Marcos uses these two questions, plus a third (“I learned a great deal in this course” with
a scaled response from Strongly Agree to Strongly Disagree). The IDEA Center forms,
both the long form and the short form, include these same two global items, while SIR II
features only one global item, “Rate the quality of this course as it contributed to your
learning.” Cashin reports broad acceptance among scholars of the sufficiency of “one or a
few” global items for summative decisions (1995, p. 2).

• Items designed for formative feedback from students. The inclusion of specifically
“formative” items is common to all instruments that we examined with the exception of
the IDEA Center short form. Several instruments (James Madison University, CSU San
Bernardino, Appalachian State University, PICES, and SIR II, for example) include items
relevant to all three instructional responsibilities that we have identified as appropriate
for student feedback (instructional design, delivery, and assessment). The PICES and
CSU San Bernardino instruments are designed to allow individual instructors to follow
the “cafeteria approach” of selecting from a bank of formative items those that are most
relevant for their courses.

• Comments. We seek to balance SRI numerical scores with written student feedback.
Fuller qualitative data can provide both context for summative decision-making and
feedback for formative decision-making. Furthermore, there is a tendency to overuse
quantitative SRI data because, right or wrong, those are often perceived as being the only
data points in faculty evaluations that have undergone testing for reliability and validity
(Marsh & Roche, 1997). As a corrective against such potential overuse, we desire an
intentional approach to asking for student written responses. For example, Saint Cloud
University asks students to explain “What are the strong points of the course and the
instruction that you believe should be continued?” and “What are the weak points of the
course and the instruction that you believe should be modified?” The CSU San Marcos
instrument asks students to “List one or two specific aspects of this course that were
particularly effective in stimulating your interest in the materials presented or in
fostering your learning;” “If relevant, describe one or two specific aspects of this course
that lessened your interest in the materials presented or interfered with your learning;”
and “What suggestions, if any, do you have for improving this course?” We find these
directed questions superior to the practice of designating a box at the end of an
instrument for “Comments,” as done on the IDEA Center forms and on the current
Metro State form. The CSU San Bernardino form stood out to us for its intentional
approach to eliciting written student feedback: It asks students to provide, in written
comments, explanations for their ratings of each of the two global items, and its
formative “teaching improvement questions” are phrased as open-ended questions
rather than as scaled items.
Our review process enabled us to make the following observations with regard to the current Metro State instrument:

• All current Metro State forms (A-H) begin with the same 4 global items, indicated as items “to provide a general evaluation.”
  o One of these global items (item 2) asks students to rate course content, and we find that question inappropriate on an SRI instrument. Arreola states that “rarely does a well-designed student rating form ask students to evaluate the content expertise of the teacher” (2007, p. 20), and as mentioned in section 2 of this report, there are alternate appropriate methods for evaluating course content.
  o Two items (item 1 and item 3) ask “The course as a whole was: Very Poor-Poor-Fair-Good-Very Good-Excellent” and “The instructor’s contribution to the course was: Very Poor-Poor-Fair-Good-Very Good-Excellent.” These items align well with the 2 standard global items found on most instruments. We recommend retaining these two items as standard “global items” intended to elicit ratings that contribute to summative evaluation (Recommendation #4). By retaining these two items, we preserve longitudinal consistency across different Metro State instruments.
  o Our reading of the fourth item (item 4), “The instructor’s effectiveness in teaching the subject matter was: Very Poor-Poor-Fair-Good-Very Good-Excellent,” leads us to conclude that it is insufficiently differentiated from item 3 (instructor’s contribution).
• The Metro State instrument relies too heavily on scaled items that elicit numerical scores. There are 27 such items, divided between “general evaluation items” (section 1), “diagnostic feedback items” (section 2), “information items about the course to other students” (section 3), and “information items relative to other courses [students] have taken” (section 4). A fifth section asks students to provide general information about themselves and about the course they are rating. We question the need for this number of items, we are concerned that the instrument wrongly conveys to students how the data are used, and we fear that this large number of numerical scores might lead to overreliance on quantitative versus qualitative data. Specifically,
  o Scores in section 3 (“To provide information about the course to other students”) are not made available for use by students; instead, students have access through MetroConnect to scores from the four “global” items in Section 1. Furthermore, these items in section 3 are insufficiently differentiated from items in sections 1 and 2 of the instrument.
  o It is unclear how student responses to items in sections 4 (“To provide information relative to other courses you have taken”) and 5 (“To provide general information about yourself and this course”) are used in either summative or formative decision making.
  o While there is a space for “comments,” there are no open-ended items that ask students to comment on specific aspects of the course. We are aware that certain academic programs distribute additional pages with open-ended items to elicit student comments. We take this as acknowledgment that the Metro State instrument insufficiently elicits student written feedback.
5. Proposed Metro State Instrument
Informed by our review of the literature and of the peer institution and commercial options, and
having evaluated the current Metro State instrument in light of our findings, we recommend the
piloting and adoption of the following instrument (Recommendation #5).
Student Ratings of Instruction
Fall 2010 Pilot
Section I: Please circle a rating number and provide comments in the boxes.
1. The course as a whole was
(6) Excellent (5) Very good (4) Good (3) Fair (2) Poor (1) Very poor
Please provide reasons why you gave the above rating
2. The instructor’s contribution to the course was
(6) Excellent (5) Very good (4) Good (3) Fair (2) Poor (1) Very poor
Please provide reasons why you gave the above rating
Section II: Teaching improvement questions
1.
2.
3.
4.
5.
Section III
Student Ratings of Instruction
Supplemental Faculty Comment Form
Faculty Name:
Course:
Completing this form is optional. Use this form only in the event of an unusual circumstance or
circumstances that you believe may influence the Student Ratings of Instruction for your course.
In order to be made part of the record, this form must be received in the relevant Dean’s office
no later than the end of business on the last day of instruction in the semester that the course is
taught.
Directions: Using the space below, please describe the unusual circumstance(s) that you believe
may influence the Student Ratings of Instruction for this class.
Notes:
1) Student ratings and responses from Section I are intended for both summative and
formative decision-making. While we prefer in the abstract that comments be included
along with the OIR (Office of Institutional Research)-generated report in RTP/PTR
dossiers, we realize that this may prove impractical. Instead, individual faculty can
quote from and make reference to student comments in order to provide context for the
global item scores, and comments from this section should be made available to
department chairs and reviewers upon request.
2) Student responses in Section II are for formative decision-making and “belong” to the
individual instructor. If the faculty member desires, he/she may choose to include the
questions and some or all of the student responses from this section in his/her RTP/PTR
dossier.
3) If a faculty member completes the Supplemental Faculty Comment Form (Section III), a
copy of the completed form should follow the OIR-generated report from Section I of the
instrument.
We propose the following bank of teaching improvement questions. Ultimately, once Metro
State moves administration of SRIs entirely online, we envision individual instructors being able
to choose from online drop-down menus up to 5 teaching improvement questions that are most
appropriate for their courses. Until then, while SRIs are still administered by paper, it is most
practical for faculty from each curricular program to choose the teaching improvement
questions (up to 5) to be used for all courses across that program. Per note #2 above, however,
even though these questions will be selected by programs, it is our intention that they be about
and for teaching improvement and not program assessment. As such, the process for selecting
questions in Section II should be faculty-driven, and responses to these questions “belong” to
individual faculty for formative purposes.
Teaching Improvement Questions
Category 1: Instructional Design

• Describe how the syllabus helped/hindered your learning in this course.
• Describe what you liked best/least about “hands on” learning activities, such as research, experiments, case studies, or problem-solving activities.
• Describe what you liked best/least about the sequencing of the course content.
• Describe what you liked best/least about the scheduling of course work, such as reading, assignments, and exams.
• Describe how the overall workload helped/hindered your learning in this course.
Category 2: Instructional Delivery

• Comment on the strengths/weaknesses of group activities in this course.
• Describe how class discussions helped/hindered your learning in this course.
• Describe how online discussions helped/hindered your learning in this course.
• Describe what you liked best/least about the instructor’s interaction with the class.
• Comment on the strengths/weaknesses of the instructor’s explanations of course material.
• Describe how the classroom climate helped/hindered your learning in this course.
• Describe how the online climate helped/hindered your learning in this course.
• Comment on the strengths/weaknesses of the instructor’s use of technology.
Category 3: Instructional Assessment

• Describe how exams and assignments helped/hindered your learning in this course.
• Describe what you liked best/least about the instructor’s overall approach to grading in this course.
• Comment on the strengths/weaknesses of instructor feedback in this course.
• Describe how exams and assignments in this course challenged you intellectually.
Category 4: Student Engagement and Motivation

• Describe how this class helped/hindered your motivation to learn the subject material.
• Describe how actively you have participated in all aspects of the learning process (for example completing required readings and assignments, participating in class activities, studying for exams).
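To illustrate how the envisioned online selection of teaching improvement questions might work once SRIs are administered electronically, the following Python sketch models a categorized question bank like the one above and enforces the limit of five questions per course. It is illustrative only; the names (QUESTION_BANK, select_questions) are our own placeholders, not part of any existing Metro State system, and only a subset of the bank is shown.

```python
# Illustrative sketch only: models the categorized bank of teaching improvement
# questions and the "choose up to 5" rule described above. Names are placeholders,
# and only a few questions from the bank are included for brevity.

QUESTION_BANK = {
    "Instructional Design": [
        "Describe how the syllabus helped/hindered your learning in this course.",
        "Describe how the overall workload helped/hindered your learning in this course.",
    ],
    "Instructional Delivery": [
        "Describe how class discussions helped/hindered your learning in this course.",
        "Comment on the strengths/weaknesses of the instructor's use of technology.",
    ],
    "Instructional Assessment": [
        "Comment on the strengths/weaknesses of instructor feedback in this course.",
    ],
    "Student Engagement and Motivation": [
        "Describe how this class helped/hindered your motivation to learn the subject material.",
    ],
}

MAX_QUESTIONS = 5  # Section II of the proposed instrument has five slots.


def select_questions(choices):
    """Validate a program's (or, eventually, an instructor's) selection.

    `choices` is a list of (category, question) pairs. Returns the selected
    question texts in order, or raises ValueError if a choice is not in the
    bank or more than five questions are requested.
    """
    if len(choices) > MAX_QUESTIONS:
        raise ValueError(f"Select at most {MAX_QUESTIONS} teaching improvement questions.")
    selected = []
    for category, question in choices:
        if question not in QUESTION_BANK.get(category, []):
            raise ValueError(f"Unknown question for category {category!r}: {question!r}")
        selected.append(question)
    return selected


# Example: a curricular program picks three questions for its courses.
picked = select_questions([
    ("Instructional Design", "Describe how the syllabus helped/hindered your learning in this course."),
    ("Instructional Delivery", "Describe how class discussions helped/hindered your learning in this course."),
    ("Instructional Assessment", "Comment on the strengths/weaknesses of instructor feedback in this course."),
])
print("\n".join(picked))
```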
6. Administration, Reporting, and Evaluation Recommendations
Developing an instrument is only one small piece of the total SRI process. For the system to be
valued by faculty and administrators alike as one that supports meaningful, fair, and valid
summative decision-making while also providing robust support for professional growth, it
must adhere to research-based best practices, as referenced in the responses below to our
guiding questions.
1. Do we want to use online administration or pencil/paper administration? There
are numerous advantages of using online administration of SRIs, and we recognize that the
full potential of our proposed instrument, with its emphasis on open-ended items and with
the bank of teaching improvement questions intended eventually to be made available to
individual instructors, can be best met through online administration. Challenges with
response rates, experienced nationally as well as here at Metro State with online course
evaluations, give us pause, however. As reported in the Chronicle of Higher Education
(Miller, 2010), the IDEA Center found in a study examining responses at nearly 300
institutions between 2002 and 2008 that, while there was no change in how students rated
their instructors, response rates dropped from 78% for paper surveys to 53% for online
surveys. Thus, we recommend establishing a long-term goal of moving fully to online
administration of SRIs, but for the present we wish for individual instructors in face-to-face
and hybrid classes to have the choice of administering SRIs for their courses either online or
by paper (Recommendation #6).
2. If we pursue an online option, what criteria do we want to prioritize in selecting
a vendor or platform, if applicable? We did not directly address this question, though
we do note that Metro State already has contracts with Blackboard and Digital Measures,
both of which have the capability of administering online SRIs. We also support directing
students to their course-specific SRIs through MetroConnect rather than through email.
3. Do we want to establish a minimum response rate to ensure representativeness
of results? Whereas there is no agreed-upon standard for what is considered a minimum
acceptable response rate, scholars widely agree that summative decisions should not be
made using SRI data when response rates fall below a certain minimum, because those
results cannot be assumed to represent class opinions as a whole. For formative decision-making, on the other hand, even the potentially biased responses from an unrepresentative
sample of students can yield useful insights. We hesitate to make a specific recommendation
on this matter before more fully engaging with the campus community. Furthermore, any
policy decision in response to this question must be formulated in conjunction with
responses to question 5 regarding frequency of administration. Both questions (minimum
necessary response rate and minimum frequency of administration) raise the broader
question of how much SRI data is needed to make informed and meaningful summative
decisions.
4. What steps can we take to ensure acceptable response rates? As Metro State
moves gradually toward online administration of SRIs (see Recommendation # 6), there are
several steps that can be taken to increase response rates. We do not recommend punitive
measures against students, for example withholding grades, as a means of increasing
response rates. Our efforts toward increased participation in online administration of SRIs
should come primarily through joint effort on the part of administrators and faculty. We
recommend that Metro State administrators and faculty alike pursue the following efforts,
adapted from Berk (2006), intended to increase student participation in online-administered SRIs (Recommendation #7):
a. Make computer labs available for completion of online SRIs during class time.
b. Communicate to students assurances of anonymity.
c. Provide frequent reminders to students.
d. Communicate to students the importance of their input and how their results will be
used.
e. Ensure that the online SRI system is convenient and user-friendly.
Part of the responsibility for response rates rests upon individual instructors as well, and
there are certain contributions that they can make. As a general rule of thumb, the time
saved by not conducting SRIs in class using paper should be used by the instructor to remind
students over the course of several class meetings why it is important that they complete
their SRIs. Furthermore, students will be more likely to complete SRIs if they perceive
throughout the semester that their instructors want and care about their input (Weimer,
2010).
5. How often should we be gathering SRI data? Once again, there is no universal
standard, and the answer also depends on whether the use of SRI responses is for
summative or formative decision-making. Cashin argues that summative decision-making
in cases of full-time instructors should consider “ratings from a variety of courses, for two or
more courses from every term for at least two years, totaling at least five courses” (1995, p.
2). We acknowledge that there may be a desire to administer SRIs more frequently for
untenured assistant professors. In addition, faculty of any rank should have the option of
going beyond whatever minimum is required and administering SRIs for their own
formative purposes. We recommend that the Faculty Evaluation Task Force determine
policy regarding the minimum frequency of SRI administration only after engaging faculty
and administrators on the key question of how much SRI data is necessary for informed and
meaningful summative decision-making (Recommendation #8). Cashin’s rule of thumb
may provide a starting point (a sketch applying it follows this list), but this decision also needs to be made in concert with several
other considerations:
a. What additional sources of instructional data will be used to inform summative
decisions?
b. What will be the frequency and content of annual evaluations and RTP/PTR
dossiers?
c. What expectations will be established regarding minimum SRI response rates?
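As one illustration of how Cashin's rule of thumb quoted above could be operationalized when asking whether enough SRI data exist for summative review, the following sketch counts an instructor's rated courses by term. It is a sketch under our reading of Cashin (1995), not a policy; the data layout, course identifiers, and function name are hypothetical.

```python
# Sketch of Cashin's (1995) rule of thumb for summative use of SRI data:
# ratings from two or more courses in every term for at least two years,
# totaling at least five courses. Data layout and names are hypothetical.

from collections import Counter


def meets_cashin_rule(rated_courses, terms_per_year=2):
    """rated_courses: list of (term, course_id) pairs with SRI data on file."""
    per_term = Counter(term for term, _ in rated_courses)
    enough_terms = len(per_term) >= 2 * terms_per_year             # at least two years of terms
    two_per_term = all(count >= 2 for count in per_term.values())  # two or more courses every term
    five_total = len(rated_courses) >= 5                           # at least five courses overall
    return enough_terms and two_per_term and five_total


history = [
    ("Fall 2011", "ENG 1010"), ("Fall 2011", "ENG 2100"),
    ("Spring 2012", "ENG 1010"), ("Spring 2012", "ENG 3400"),
    ("Fall 2012", "ENG 1010"), ("Fall 2012", "ENG 2100"),
    ("Spring 2013", "ENG 1010"), ("Spring 2013", "ENG 3400"),
]
print(meets_cashin_rule(history))  # True: four terms, two courses each, eight total
```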
6. What is the best timing for administering SRIs? Providing for relatively
standardized conditions in the administration of SRIs is important to the reliability of
results. In fact, all else being equal, paper administration of SRIs tends to produce greater
standardization of conditions than does online administration (Berk, 2006). Whichever the
mode of administration, a relatively narrow time frame for completion of SRIs is preferred
for the purpose of standardization. Further, we prefer that SRIs be completed toward the
end of the 15-week instructional period of the term, so that students will have the whole of
the semester to reflect on while responding. We recommend that, for 16-week courses, SRIs
be conducted during the final three weeks of instruction; for differently scheduled courses, a
proportionate timing be followed (Recommendation #9).
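The following sketch shows one way the proportional-timing rule in Recommendation #9 could be computed for courses of any length. The three-sixteenths ratio comes directly from the 16-week case above; the rounding choice and function name are our own assumptions.

```python
# Sketch of Recommendation #9: SRIs in the final three weeks of a 16-week course,
# with a proportional window for other course lengths. Rounding up is an assumption.

import math

SRI_WINDOW_RATIO = 3 / 16  # final three weeks of a sixteen-week course


def sri_window_weeks(course_length_weeks):
    """Return how many final weeks of the course should be open for SRIs."""
    return max(1, math.ceil(course_length_weeks * SRI_WINDOW_RATIO))


for length in (16, 10, 8, 5):
    print(f"{length}-week course: administer SRIs during the final {sri_window_weeks(length)} week(s)")
```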
7. In what format will reporting take place? Because initially there will be a significant
portion of SRIs completed in pencil, and because open-ended narrative responses in Section
I of SRIs will be integral to the evaluation process, we recommend that the original SRIs
with student responses be returned to the faculty member via each School’s Deans Office
(Recommendation #10). Student written comments in response to the global items in
Section I should be made available upon request from any level of summative review, and
responses to items in Section II should stay with the faculty member. Some may argue that
returning the original SRIs with student comments included on them can create conditions
for retaliation against students. However, we find such a risk to be more theoretical than
practical, and we feel that the value of providing rich, contextualized data, both quantitative
and qualitative, for evaluation outweighs the need to guard against the remote and
theoretical risk of retaliation. For purposes both of summative and formative decision-making, there is value in keeping individual students’ comments paired with their ratings to
better understand the context and reasons for global item scores. Over time, as Metro State
makes a concerted institutional move to online administration of SRIs, even the theoretical
risk of retaliation will disappear. In the meantime, if departments or programs choose to
type comments to preserve respondent anonymity, they reserve the option to do so—as long
as individual students’ comments remain paired with their global item scores—but we urge
against any institution-wide mandate to do so. Additionally, we recommend that statistical
reports include a histogram presentation of scores for the two global items in Section I,
along with the mean value and standard deviation, as is currently reported by the Office of
Institutional Research (Recommendation #11).
Currently, OIR reports student rating results for each course with results from the following
norm groups: course prefix (upper or lower), department, school, college, and (all courses
taught by individual) faculty. We do not believe that the use of 5 different norms is
necessary, and including that many norm groups in reports creates a risk of over-reliance
upon and misuse of quantitative data. Weimer (2010) warns that SRI results tell only the
story of “what happened in one class with one group of students at one point in a career.”
These highly particular contexts become lost when evaluators turn right away to
comparisons with norm groups. The inclusion of “faculty mean” is especially troubling to us,
since it purports to capture broadly an instructor’s performance and thus invites evaluators
to make summative decisions based on a single number. An alternative to norm-referenced evaluation (using norm groups for comparison) is criterion-referenced evaluation, wherein
programs, departments, or levels of review determine a minimum standard for performance
on the scale (Berk, 2006). We believe that reviewers often intuitively look for scores to be at a certain level or above. In such cases, they are following a criterion-referenced approach to evaluation. We are not advocating for either a norm-based or a criterion-based approach to
evaluation. We recommend, however, that the Faculty Evaluation Task Force consider very
carefully, in discussion with faculty and administrators, which norms, if any, are essential
for conducting fair and meaningful summative evaluations (Recommendation #12). As a
starting point for this conversation, we note that CSU San Bernardino reports the following
norm groups: lower division courses within the same college (school); upper division courses
within the same college (school); and graduate courses within the same college (school).
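As a concrete illustration of the statistical report described in Recommendation #11, and of the difference between a norm-referenced and a criterion-referenced reading of the same scores, the following sketch computes the histogram, mean, and standard deviation for one class's ratings on a single global item. The sample ratings, the criterion of 4.0, and the norm-group mean are invented for illustration; OIR's actual procedures may differ.

```python
# Illustrative sketch of the report in Recommendation #11 for one global item
# (6 = Excellent ... 1 = Very poor), plus a norm-referenced and a
# criterion-referenced reading of the same mean. All numbers are invented.

from collections import Counter
from statistics import mean, stdev

ratings = [6, 5, 5, 4, 6, 3, 5, 4, 6, 5, 2, 5]  # hypothetical responses to "The course as a whole was"

# Histogram of the 1-6 scale, reported alongside the mean and standard deviation.
histogram = Counter(ratings)
print("Rating  Count")
for value in range(6, 0, -1):
    print(f"  {value}     {histogram.get(value, 0)}")
print(f"Mean = {mean(ratings):.2f}, SD = {stdev(ratings):.2f}, N = {len(ratings)}")

# Criterion-referenced reading: compare the mean to a standard set in advance.
criterion = 4.0  # hypothetical minimum standard a program or level of review might set
print("Meets criterion" if mean(ratings) >= criterion else "Below criterion")

# Norm-referenced reading: compare the mean to a norm group's mean.
norm_group_mean = 4.6  # hypothetical mean for, e.g., upper-division courses in the same school
print("At or above norm group" if mean(ratings) >= norm_group_mean else "Below norm group")
```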
7. Proposal for a Pilot
We recommend that the Faculty Evaluation Task Force conduct scaled pilots of the proposed
instrument—a “mini-pilot” during fall semester and a large-scale pilot during spring semester
(Recommendation #13).
Since, initially, the questions in Section II of the proposed instrument will be determined by
program, it will make most sense to select entire departments to participate in the pilots. For
the fall semester mini-pilot, we suggest 4 departments: one each from the Schools of Business
and Professional Studies, and two from the School of Letters, Arts, and Sciences. We believe
that for this first pilot, these should be departments with few or no untenured faculty.
The purpose of the pilots, both in fall and spring, will be to gather user feedback on
1. The process in general for administering SRIs, as proposed in this report, and
2. The quality and usefulness of the SRI data, qualitative as well as quantitative, for making
both summative and formative decisions.
8. Appendix
The SRI subcommittee used the following questions as a “roadmap” for its work:
General questions

• What is the purpose of SRIs? What do we want to use them for?
• What is the teaching role at Metro? How do we describe teaching so that we know what we’re evaluating, so that we are evaluating using appropriate methods, and so that all appropriate modalities are included?
• How can we ensure a system that encourages discussion and sharing about teaching and learning?
• What are our peer institutions doing?
• What is the history of the current MSCD instrument?
Instrument-related questions

• What type of questions do we want to include on the instrument(s)?
  o Global questions?
  o Questions that rate students’ perception of how well the learning environment helped them learn?
  o Low-inference questions about instructor behaviors that are related to teaching effectiveness?
  o Open-ended questions?
• Will faculty have the opportunity to pick from a menu of optional questions (e.g. the Purdue Cafeteria [PICES] approach)?
• What steps can we take in the design of the instrument to minimize bias?
Administration of SRIs

• Do we want to use online administration or pencil/paper administration?
• If we pursue an online option, what criteria do we want to prioritize in selecting a vendor or platform, if applicable?
• Do we want to establish a minimum response—perhaps tied to the size of the course—to ensure representativeness of results? (The answer here may differ depending on formative or summative purposes).
• What steps can we take to ensure acceptable response rates?
• How often should we be evaluating faculty using SRIs? Every class every semester? Something less than that? At frequencies determined by rank/tenure status? (The answer here may differ depending on formative or summative purposes).
• How much data is needed and over what spread of time in order to make summative decisions? (I.e. it is widely agreed that summative decisions should not be based only on SRIs from one semester, let alone one course).
• What is the best timing for administering SRIs? How can we balance the need for flexibility along with the desire for standardization of conditions?
Post-administration of SRIs

• Who will have access to results?
• Will we differentiate between results for summative purposes and results for formative purposes?
• What additional measures and what additional contextual information will be needed in dossiers for summative evaluation of teaching?
• What steps can we take to avoid over-reliance on or misuse of SRIs for summative decisions?
• In what format will the reporting take place?
9. References
Arreola, R. A. (2007). Developing a comprehensive faculty evaluation system: A guide to
designing, building, and operating large-scale faculty evaluation systems (3rd ed.). San
Francisco, CA: Anker Publishing Company, Inc.
Bain, K. (2004). What the best college teachers do. Cambridge, MA: Harvard University Press.
Berk, R. A. (2006). Thirteen strategies to measure college teaching: A consumer's guide to
rating scale construction, assessment, and decision making for faculty, administrators,
and clinicians. Sterling, VA: Stylus Publishing.
Cashin, W. E. (1995). Student ratings of teaching: The research revisited. IDEA Paper, (32).
Retrieved from http://www.theideacenter.org/sites/default/files/Idea_Paper_32.pdf
Glenn, D. (2010, April 25). Rating your professors: Scholars test improved course evaluations.
The Chronicle of Higher Education. Retrieved from
http://chronicle.com/article/Evaluations-That-Make-the/65226/
Marsh, H. W., & Roche, L. A. (1997). Making students' evaluations of teaching effectiveness
effective: The critical issues of validity, bias, and utility. American Psychologist, 52(11), 1187-1197.
Miller, M. H. (2010, May 6). Online evaluations show same results, lower response rate.
The Chronicle of Higher Education. Retrieved from http://chronicle.com/blogPost/Online-Evaluations-Show-Same/23772/?sid=wc&utm_source=wc&utm_medium=en
Seldin, P. (2006). Building a successful evaluation program. In P. Seldin (Ed.), Evaluating
faculty performance: A practical guide to assessing teaching, research, and service (pp. 1-19). San Francisco, CA: Anker Publishing.
Seldin, P., & Miller, J. E. (2009). The academic portfolio: A practical guide to documenting
teaching, research, and service. San Francisco, CA: Jossey-Bass.
Theall, M., & Arreola, R. (2006, June). The meta-profession of teaching. Thriving in Academe,
22(5), 5-8.
Weimer, M. (2010). Inspired college teaching: A career-long resource for professional growth. San Francisco, CA: Jossey-Bass.