Professional educators agreement on criterion for measuring teacher effectiveness
by Francis Allen Olson
A thesis submitted in partial fulfillment of the requirements for the degree of DOCTOR OF
EDUCATION
Montana State University
© Copyright by Francis Allen Olson (1978)
© 1979
FRANCIS ALLEN OLSON
ALL RIGHTS RESERVED
PROFESSIONAL EDUCATORS AGREEMENT ON CRITERION
FOR MEASURING TEACHER EFFECTIVENESS
by
FRANCIS ALLEN OLSON
A thesis submitted in partial fulfillment
of the requirements for the degree
of
DOCTOR OF EDUCATION
Approved:
Graduate Dean
MONTANA STATE UNIVERSITY
Bozeman, Montana
December 1978
ACKNOWLEDGMENTS
Special acknowledgment goes to Dr. Robert Thibeault, my advisor and committee chairman. He has graciously provided many hours of guidance and encouragement throughout this study.

Acknowledgment is made to my reading committee, Dr. Albert Suvak and Dr. Robert Van Woert, and to other members of my committee, Dr. Douglas Herbster, Dr. Del Samson, and Dr. Alvin Fiscus.

Acknowledgment is also made to Dr. Eric Strohmeyer for his help and support.

Special thanks is given to Mrs. Louise Greene, who spent many hours typing this dissertation.

Most of all I wish to thank my wife, Alyce, for the many hours that she spent helping with recording of data, running errands, and providing needed patience and encouragement.
TABLE OF CONTENTS

LIST OF TABLES

CHAPTER

I. INTRODUCTION
      STATEMENT OF THE PROBLEM
      NEED FOR THE STUDY
      PURPOSE OF THE STUDY
      QUESTIONS TO BE ANSWERED
      LIMITATIONS OF THE STUDY
      DEFINITIONS OF TERMS
      SUMMARY

II. REVIEW OF RELATED LITERATURE
      INTRODUCTION
      NEED FOR EVALUATING TEACHER EFFECTIVENESS
      FORCES WHICH CREATED THE NEED TO EVALUATE TEACHER EFFECTIVENESS
      RENEWED EMPHASIS PLACED UPON MEASURING TEACHER EFFECTIVENESS
      TEACHER PARTICIPATION IN THE EVALUATION OF THEIR SERVICES
      STUDIES OF TEACHER EFFECTIVENESS
         Product criteria
         Process criteria
         Presage criteria
      STATUS OF PRESENT METHODS OF APPRAISING TEACHER PERFORMANCE
      SUMMARY

III. PROCEDURES
      DESCRIPTION OF THE POPULATION
      SAMPLING PROCEDURE
      METHOD OF COLLECTING DATA
      METHOD OF ORGANIZING DATA
      HYPOTHESES TESTED
         HYPOTHESIS I
         HYPOTHESIS II
         HYPOTHESIS III
         HYPOTHESIS IV
         HYPOTHESIS V
         HYPOTHESIS VI
         HYPOTHESIS VII
      METHOD OF ANALYZING DATA
      PRECAUTIONS TAKEN FOR ACCURACY
      SUMMARY

IV. ANALYSIS OF DATA
      INTRODUCTION
      MEAN RATINGS OF CRITERIA
      THE TESTING OF HYPOTHESES
      SUMMARY

V. SUMMARY, CONCLUSIONS AND RECOMMENDATIONS
      SUMMARY
      CONCLUSIONS
      RECOMMENDATIONS FOR FURTHER STUDY

LITERATURE CITED
APPENDIX A
APPENDIX B
LIST OF TABLES

I. CATEGORIES OF ADMINISTRATOR POPULATION
II. TEACHER SAMPLE CHARACTERISTICS
III. CATEGORIES OF THE ADMINISTRATOR SAMPLE
IV. NUMBER AND PER CENT OF TEACHER RESPONDENTS BY DISTRICT AND SEX CATEGORIES
V. YEARS OF EXPERIENCE OF ADMINISTRATORS
VI. THE YEARS OF EXPERIENCE OF TEACHER RESPONDENTS BY DISTRICT CLASSIFICATION AND SEX
VII. PERCENTAGE OF TEACHERS BY DISTRICT CLASSIFICATION
VIII. NUMBER OF TEACHER RESPONDENTS BY GRADE LEVEL CLASSIFICATION
IX. RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER EFFECTIVENESS FOR ADMINISTRATORS OF COMBINED CLASSES OF SCHOOL DISTRICTS
X. RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER EFFECTIVENESS FOR TEACHERS OF COMBINED CLASSES OF SCHOOL DISTRICTS
XI. RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER EFFECTIVENESS FOR TEACHERS OF CLASS I SCHOOL DISTRICTS
XII. RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER EFFECTIVENESS FOR TEACHERS OF CLASS II SCHOOL DISTRICTS
XIII. RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER EFFECTIVENESS FOR TEACHERS OF CLASS III SCHOOL DISTRICTS
XIV. COMBINED MEANS OF RATINGS OF ADMINISTRATORS AND TEACHERS OF MONTANA
XV. ADMINISTRATORS VERSUS TEACHERS LEAST SQUARE MEANS AMONG THE THREE SUB-TESTS BETWEEN ADMINISTRATORS AND TEACHERS
XVI. LEAST SQUARE MEANS OF COMPARISON OF TEACHERS OF THREE CLASSES OF DISTRICTS: TEACHERS OF FIRST, SECOND, AND THIRD CLASS SCHOOL DISTRICTS
XVII. A COMPARISON OF LEAST SQUARE MEANS FOR MALE AND FEMALE TEACHERS
XVIII. LEAST SQUARES MEANS FOR ELEMENTARY VERSUS SECONDARY TEACHERS FOR THREE SUB-GROUPS OF TEACHER EFFECTIVENESS CRITERIA
XIX. LEAST SQUARE MEANS OF SIX EXPERIENCE CLASSES FOR TEACHERS
XX. MULTIPLE REGRESSIONS AMONG THE 16 INDEPENDENT VARIABLES OF TEACHER AND ADMINISTRATOR DIFFERENCES AND THE DEPENDENT VARIABLE, TEACHER RATING OF ADMINISTRATOR EFFECTIVENESS
XXI. MEAN RATINGS AND RANK ORDER OF THE 16 CRITERIA, DELAWARE STUDY
XXII. CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA TEACHERS AND ADMINISTRATORS
XXIII. CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA AND DELAWARE TEACHERS
XXIV. CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA ADMINISTRATORS AND DELAWARE TEACHERS
XXV. CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA TEACHERS OF FIRST AND SECOND CLASS SCHOOL DISTRICTS
XXVI. CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA TEACHERS OF FIRST AND THIRD CLASS SCHOOL DISTRICTS
XXVII. CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA TEACHERS OF SECOND AND THIRD CLASS SCHOOL DISTRICTS
XXVIII. T-TEST OF SIGNIFICANCE OF THE MEAN DIFFERENCE FOR TEACHER AND ADMINISTRATOR
ABSTRACT

This study investigated the criteria by which administrators and teachers, employed by Montana school districts, judged teacher effectiveness. A similar study, which this study replicated, was carried out in the State of Delaware by Jenkins and Bausell.

The data for this study were gathered by a survey questionnaire which listed criteria for judging teacher effectiveness typed under the Mitzel Scheme. Administrators were sampled first, and with their return an answer was given as to whether or not they gave permission to sample teachers on their respective staffs. Sampling of administrators (N=665) was followed by sampling of teachers (N=9,428), and these results were compared by the analysis of variance statistic at the .05 level of significance. Comparison of the results of the study with the Delaware study was done by using Spearman's Coefficient of Rank Correlation. The multiple regression model was used to determine whether or not a significant relationship existed among the differences between the administrators' and teachers' ratings of each criterion measure of effectiveness and the teachers' ratings of administrators.

Conclusions reached are: (1) The highest rated criterion by teachers and administrators in Montana for measuring effectiveness was "classroom control," followed by "knowledge of subject matter" and "rapport with students." (2) The "amount students learn," a criterion uppermost in the minds of accountability proponents, was considerably less significant in the minds of teachers. (3) Product criteria (measures of student learning and behavior) and process criteria (measures of teacher behavior) were rated significantly higher than presage criteria (measures of a teacher's personal or intellectual attributes). (4) Teachers' ratings of effectiveness criteria were not significantly related to how teachers viewed their administrators' effectiveness. (5) Montana and Delaware teachers were in agreement that process and product criteria for measuring effectiveness are considerably more important than presage criteria. (6) Both Montana and Delaware teachers do not consider "what students learn" to be as important a criterion by which to measure effectiveness as other criteria. This view differs considerably from that of accountability proponents.

One of the more important recommendations coming out of this study is to determine whether or not parents and other constituents of Montana served by administrators and teachers are in agreement among themselves and with educators on what types of criteria are most important by which to judge effectiveness. It would be important to know if discipline (effectiveness in controlling one's class) is the number one rated criterion.
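The rank-order comparisons named in the abstract rest on Spearman's coefficient of rank correlation. For reference, the standard form of that statistic, applied to two rankings of the same set of items, is:

```latex
% Spearman's coefficient of rank correlation for two rankings
% of the same n items (in this study, n = 16 criterion measures):
\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n\,(n^2 - 1)}
% d_i = difference between the paired ranks assigned to item i.
% For n = 16 the denominator is 16(16^2 - 1) = 4080.
```

A value of rho near +1 indicates close agreement between the two rank orders (e.g., Montana and Delaware teachers), while a value near zero indicates little relationship.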
CHAPTER I

INTRODUCTION

Defining the effective teacher has been a continuing process carried out by researchers for many years. What research has said about teacher effectiveness is that it is not the clearly defined trait that many would have us believe. Research has indicated that teacher performance is one of the most complex human phenomena that researchers have been privileged to study (Ellena, 1964).

Teacher effectiveness has continued to be a subject of much interest to educators and, more recently, to the public whom they serve. Interest in teacher effectiveness by the public has arisen from the current viewpoint of accountability in education, which has focused on the people's right to know what is taking place in the schools (House, 1973). Addressing the February, 1975 meeting of the American Association of School Administrators, which was held in Dallas, Texas, Frank Gary stated:

   Measuring the effectiveness of teaching . . . is still a topic of interest to the public and school administrators. There has not been too much progress in the area of measuring practitioner effectiveness because of the educational stance that it is impossible to make valid judgments about anything as complex and personal as teaching ability (Gary, 1975).

Biddle and Ellena point out in the preface to their review of research that:

   Probably no aspect of education has been discussed with greater frequency, with as much deep concern, or by more educators and citizens than has that of teacher effectiveness: how to define it, how to identify it, how to measure it, how to evaluate it, and how to detect and remove obstacles to its achievement (Biddle and Ellena, 1964).
Research has said that teacher effectiveness cannot be summarized in a few words. However, many people who have had contact with schools, whether as students, parents, or interested citizens, feel qualified to make dogmatic pronouncements about teacher effectiveness. Teacher effectiveness has been a matter of long concern in all efforts to improve education. Before the turn of the century, studies were conducted in this country which attempted to isolate the factors which contributed significantly to teaching effectiveness. One bibliography alone lists 1,006 researches made in this area from 1890 to 1949 (Domas and Tiedeman, 1950).
It has been pointed out that teaching must be defined before it can be evaluated and effectiveness predicted. From Ellena's summary of teacher effectiveness studies, it was evident that part of the difficulty associated with the prediction of teacher effectiveness had arisen from the fact that teaching was described differently by different people, and the teaching act varied from person to person and from situation to situation.

One of the most difficult problems in teacher effectiveness studies has been that researchers had to assume that effectiveness was either a statement about an attribute of the teacher, a statement about an attribute of a teacher in a particular teaching situation, or a statement about the results which come out of a teaching situation (Ellena, 1961). Gage, who searched for a scientific basis to describe teacher effectiveness, pointed out that during most of the history of education, "What knowledge, understanding, and ways of behaving should teachers possess," has been answered through raw experience, tradition, common sense, and authority (Gage, 1972). He defined research on teacher effectiveness as the relationship between teacher behaviors and characteristics and their effects on students. In this relationship teacher behavior was considered an independent variable (Gage, 1972).
Flanders has stated that, "Knowledge about teaching effectiveness consists of relationships between what a teacher does while teaching and the effect of these actions on the growth and development of his pupils." From his point of view, an effective teacher interacted skillfully with pupils in such a way that they learned more and liked learning better compared with the ineffective teacher. He described teaching effectiveness as being concerned with those aspects of teaching in which the teacher "has direct control and current options" (Flanders, 1970).
A number of researchers as well as some professional associations supported the position that the ultimate criterion by which to judge a teacher's competence was the impact that the teacher exerted upon the learner to bring about behavioral change in the learner. Reluctance in accepting pupil change as the chief criterion of teacher effectiveness has arisen both from the technical problems in assessing learner growth and from philosophical considerations (Travers, 1973). The chief concern among the technical problems in assessing learner growth has centered on the adequacy of measures for assessing a wide range of pupil attitudes and achievement at different educational levels and in diverse subject-matter areas. Philosophical differences have centered upon the selection of desirable changes to be sought in learners and value differences observed in the preferred methodologies of teacher competence researchers (Travers, 1973).
Research which has tried to determine that teaching effectiveness has something to do with what a teacher is has assumed that teacher success can be predicted in terms of individual teacher personality traits. Both laymen and the majority of professional educators have clung to the idea that ability to teach is correlated in some way with such personality factors as a sense of humor, empathy, industriousness, willingness to cooperate, physical attractiveness and health, love of knowledge, creativity, and so forth. To some extent, nearly every teacher evaluation program in existence has taken these factors into consideration. Yet numerous research studies have failed to find a significant cause-and-effect relationship between traits and teaching effectiveness. Research has shown that we tend to place the highest values on those traits we ourselves possess or think we possess (Brighton, 1965).
Barr cautioned those who pursued the traits approach to describing teacher effectiveness that personal qualities such as considerateness, cooperativeness, and ethicality, which are used to assess teacher effectiveness, are not directly observable. These qualities are inferences drawn from data. He described this concern as follows:

   These data may be of many sorts arising from the observation of behavior, interviews, questionnaires, inventories, or tests. Whatever the source of information, judgments about the qualities are inferences, and subject to all the limitations associated with inference making, including the accuracy of the original data upon which the inferences are based, and the processes of inference making (Barr, 1961).

Barr related that if one has considered the qualities of the individual in terms of characteristics of performance, he has utilized a behavioral approach to assess teacher effectiveness. He noted further that those who have interpreted personality in behavioral terms in assessing teacher effectiveness have attempted to integrate the concept of personality with that of methods. Historically, this concept has been considered an important aspect of teacher effectiveness. The problem which has been encountered in the behavioral approach to assessing teacher effectiveness is that of choosing and defining the personal qualities that have appeared to be pertinent to teacher effectiveness. The literature has given one the impression that the choice of personal qualities used to assess teacher effectiveness has been based very much upon personal preference (Barr, 1961). Barr stated:

   If judgments about teachers are based upon observations of teachers' behaviors, how do we know what to look for and what to ignore? Whether a behavior, or aspect of behavior, is pertinent to some particular quality depends on how the quality is defined. Many subtle shades of meanings will probably need to be considered (Barr, 1961).
The personality of the teacher has been a significant variable in the classroom. Many have argued that the educational impact of a teacher is not due solely to what he knows or does, but to what he is as well. After an in-depth study of this problem, Getzels and Jackson concluded that despite the critical importance of the problem and a half-century of prodigious research effort, little is known for certain about the nature and measurement of teacher personality, or about the relation between teacher personality and teacher effectiveness (Averch and others, 1971) (Lewis, 1973) (Gage, 1972).

In his summary on teacher effectiveness studies, Ellena pointed out that teachers differ widely with respect to maturity, intellectuality, personality, and other characteristics. The demands of the subjects they teach and the scope and the structure of the objectives to be achieved all contribute to diversity. In addition, Ellena noted that local control has exerted its influence toward diversity. For almost any goal one might choose, it was possible to find a continuous spectrum of values, opinions, and goals. The notion of the "good teacher," described by Ellena as basic to the study of teacher effectiveness, has turned out to be almost as vague and diffused as the range of human experiences relative to teaching (Ellena, 1961). Rabinowitz and Travers summarized this problem, repeated by Ellena, as follows:

   There is no way to discover the characteristics which distinguish effective and ineffective teachers unless one has made or is prepared to make a value judgment. The effective teacher does not exist pure and serene, available for scientific scrutiny, but is instead a fiction in the mind of men. No teacher is more effective than another except as someone so decides and designates . . . (Ellena, 1961).
Most likely, the reason that the effective teacher has not existed pure and serene has been that, under local control of schools, the teaching act is free to vary from school system to school system. The job of the teacher thus varies according to the location of the job. The particular job a teacher has been expected to perform has also varied from grade to grade. Because definitions of teacher functions have varied by grade level and school system, little headway has been made in solving the problem of successfully measuring teacher effectiveness in spite of the immense number of studies that have been conducted (Ellena, 1961).
To make sense of the diverse inquiries that have been undertaken in the name of teacher effectiveness, Travers related that it has become necessary for one to make "distinctions in purposes." He pointed out that the administrator has been looking for knowledge of teacher effectiveness in order to make a better decision in situations such as hiring or firing a teacher. The instructional supervisor or teacher wanted to know what instructional procedures are most likely to prove useful in achieving certain instructional ends with given students. Researchers' purposes, according to Travers, included:

   satisfying a desire to describe accurately what teachers do, searching for associations between theoretically or empirically derived variables and learning, and demonstrating the power of a given factor or instructional operation to make a practical difference upon the outcome sought (Travers, 1973).
Those who have been interested in teacher effectiveness have had different purposes and consequently have varied their interpretations of the problem. Some who have investigated the problem of teacher effectiveness would have been satisfied to know whether or not a teacher was getting desired results, with the results indicating effectiveness, not the process used. Others wanted to know how to increase the probability of attaining desired results. Researchers who were interested in process were searching for lawful teaching behavior, i.e., validated procedures for achieving instructional ends. Their assumption was that effective teaching would have been recognized when lawful relationships were established between instructional variables and learner outcomes; that certain procedures in teaching would have, within certain probability limits, been labeled as effective or ineffective (Travers, 1973).

To date there are no such laws, only a few leads or practices that are more likely than others to maximize the attainment of selected instructional ends. Researchers such as Gage (1968) had hoped to establish scientific laws for teaching; other researchers agreed with Dewey (1929), who held that it was an error to believe that scientific findings and conclusions from laboratory experiments could be applied directly to such activities as helping the teacher make his practice more intelligent, flexible, and better adapted to dealing with individual situations (Travers, 1973).
Bolton suggested that the purposes of teacher evaluation vary somewhat from school district to school district. Included were many of the following:

   (a) to improve teaching . . . by determining what actions can be taken to improve teaching systems, the teaching environment, or teacher behavior,
   (b) to supply information for modification of assignments,
   (c) to protect individuals and the school system from incompetence,
   (d) to reward superior performance,
   (e) to validate the selection process, and
   (f) to provide a basis for the teacher's career planning and growth and development.

Bolton summarized by stating, "All of these purposes might be expressed by saying: The purpose of teacher evaluation is to safeguard and improve the quality of instruction received by students" (Bolton, 1973).
Wilson, in presenting a paper to the annual convention of the National School Boards Association, April 17-22, 1975, described the purposes of evaluation as follows:

   Before we come to grips with the methods to be used in evaluating teachers, there must be a clear understanding of the purpose for the evaluation in the first place. As a superintendent of schools it is clear to me that teachers are evaluated for two major reasons. First, teacher evaluation takes place for the specific purpose of improving the quality of instruction. The focal point of all education is the learner . . . . The second major reason for teacher evaluation is to identify those staff members who are perpetrating such crimes against youngsters that their removal from the classroom and from the profession is the major objective. In other words, the evaluation process is used to document teacher ineffectiveness so that termination can be accomplished (Wilson, 1974).
From the standpoint of the local school official, Ellena pointed out that the extent to which any procedure has been used in teacher evaluation depended on how much and what kind of evidence was desired in making decisions about local school personnel. These concerns were for immediate and self-terminating information. There was no concern from the local standpoint about adding to the fund of knowledge about teacher effectiveness to the extent that it could be predicted and explained accurately (Ellena, 1961). Travers, in his review on teacher effectiveness, related that decisions requiring judgments about teachers have been made by many: teacher educators, school personnel officers, administrators, supervisors, and teachers. Wise choices about teachers have been made when adequate data were at hand for judging. He added:

   Complete data have typically not been available; possibly because those who have been making decisions have not given enough thought to what is required for making warranted decisions about a teacher and, accordingly, have not arranged for the collection of data.

A second reason that data are not available, according to Travers, is that researchers have not pursued their investigations with awareness of the practical decisions that must be made by those working with teachers (Travers, 1973).
11
The school official, as suggested by Ellena, has sought "to
determine how well a teacher performed his job in terms of certain speci­
fied and more often unspecified criteria."
He has not been concerned
with whether or not the job he asked the teacher to perform was' repre­
sentative of the class of such.jobs or if the teacher performed the
class of jobs well.
On the other hand, the researcher, according to
Ellena, has been concerned with how well a teacher could perform, "in
any of a class of jobs which share many common characteristics, as well
as with identifying these common characteristics" (Ellena, 1961).
The difference between a school official's concern and a .
researcher's concern, as pointed out by Ellena, has several implications.
For example, the overall or intuitive ratings may be used by a school
official to help make a general assessment of how well a teacher has
performed and the general assessments thus gained has provided relevant
and useful information for the immediate school situation.
From Ellena's
point of view, overall ratings have not generally been useful or relevant
because such ratings have low reliabilities, and have not been consistent
with the purposes of researchers who wish to predict and to describe.
Ellena has evaluated ratings as follows:
An overall rating for research purposes implicitly assumes
that when a teacher receives a rating of 80 per cent effective,
it means that a teacher is as effective in doing the same things
as every other teacher who is rated as 80 per cent effective by
other school officials. An overall scale does not show that
teachers are effective in doing the same things.
It only shows
that raters thought teachers were effective in doing whatever it
12
was they did. An overall rating is simply a means of letting
the criterion which the researcher wants to predict, vary in its
meaning, without showing that it so varies. It is impossible to
predict consistently a criterion whose meaning constantly shifts
(Ellena, 1961).
As pointed out by Ellena, officials of local school districts and researchers have had different purposes for describing teacher effectiveness. As a result of this difference in purpose, researchers experienced the problem of predicting a criterion of teacher effectiveness that was relevant to local school board needs. Ellena described this problem by stating:

   If a researcher uses any procedure that permits the definition of teacher effectiveness to vary, he will not be successful in predicting. If he does not let the definition vary, he places himself in the position of having to specify what the function of teachers should be. If he so specifies, he either usurps the function of local school districts in deciding what the functions of a local teacher should be, or else runs the risk of predicting a criterion that some local school boards considered inconsequential or irrelevant to how they define teacher performance (Ellena, 1961).
Travers has stated in his Second Handbook of Research on Teaching
that, "Professionals and laymen alike are unhappy with what is loosely
called the evaluation of teachers" (Travers, 1973).
He summarized the results of national surveys which indicated the reasons for dissatisfaction with most evaluations are: (1) lack of confidence in the school system's evaluation program, (2) infrequent observation of tenured teachers, (3) inaccurate evaluation, (4) administrative staff have little time to effectively evaluate and make judgments of staff, and (5) evaluations are poorly communicated to others (Travers, 1973).
Many considerations besides teacher effectiveness entered into decisions such as whether to hire, to grant tenure, or to fire teachers.
Travers pointed out that the practice of assessing a teacher without
having had valid data regarding his ability to effect changes in pupils
seemed wanting.
In contrast, information about the teacher's personal characteristics, relations with other adults, appearance, political attitudes, etc., has been plentiful and easily acquired. Appraisals of a teacher on the basis of factors unrelated to the progress of pupils have allowed the value preferences of individuals and local communities to operate (Travers, 1973).
Bolton expressed the view that judgments regarding teachers are
made inevitably and if the criteria were appropriate and the data were
sound, resulting judgments would be useful.
He stressed the fact that
in evaluating teachers, judgments should be made in relation to objec­
tives rather than the personal worth of people.
He pointed out that
evaluation should establish whether the teacher reached various standards,
not whether the teacher did better or worse than other teachers.
He
emphasized the idea that teachers should be helped to improve their contribution to the learning of school children (Bolton, 1973).
A review of research supported the position that the more widely
used criteria for assessing teacher competency included student ratings,
self ratings, administrator ratings, and peer ratings.
Assessments of classroom environment, personal attributes, performance tests, alternative criteria (contract plans using student gain) and systematic observations provided additional criteria for judging teacher effectiveness.
Travers reviewed the work of McNeil and Popham who have cautioned in
their assessment of teacher competence,
Any single criterion of effectiveness is confounded by a number of factors. One factor stems from who is doing the measuring; a second is the kind and quality of instrument used; a third is faithfulness in applying the instrument as its designer intended; and a fourth is the purpose for applying the criteria—how the data are used (Travers, 1973).
Research has generally supported the conclusion that effectiveness in teaching is best evidenced by criterion measures which detect pupil growth as a result of the teacher's instruction. However, as pointed out by Wolf, teachers are not fond of evaluation. He stated their concern as follows:
. . . They suspect any measure designed to assess the quality
of their teaching, and any appraisal usually arouses anxiety.
If teachers are to submit to an assessment of their perform­
ance, they would probably like reassurance that the criteria and
method of evaluation that are used would produce credible results
(House, 1973).
According to Wolf, teachers have believed that the standards for
evaluating what is effective teaching are too vague and ambiguous to be
worth anything.
They have felt that current appraisal techniques fall short of collecting information that accurately characterizes their performance. They perceived the ultimate rating as depending more on the idiosyncrasies of the rater than on their own behavior in the classroom. As a result, teachers saw nothing to be gained from evaluation.
Statement of the Problem
The emphasis placed upon "accountability" by the public during the decade of the 1960's intensified the search by school districts in the 1970's to find improved ways to evaluate teacher effectiveness. The problem inherent in the search for improved ways to evaluate teacher effectiveness was that of selecting suitable criteria, upon which both administrators and teachers agreed, which truly measured teacher effectiveness. School districts facing this problem needed to know how much agreement existed between teachers and administrators on effective criteria and to determine what kinds of criteria were appropriate for judging the effectiveness of teachers. If it was determined that administrators and teachers varied greatly in their views of the perceived importance of criteria for measuring teacher effectiveness, then continuation of the evaluation process would have resulted in increased sensitivity and mistrust on the part of teachers toward administrators who judged teaching effectiveness. It was necessary for school district administrators to include the teacher in determining the criteria used to evaluate their own effectiveness.
Need for the Study
It is evident from a review of the literature that the task of identifying effective teachers and effective teaching is crucial to teacher education, teacher selection, teacher performance, and ultimately, to the survival of society. Crucial as this need is, and in view of the enormous amount of research directed at identifying effective teaching, it is disturbing to note that there has been no general agreement upon what constitutes effective teaching, or standards of teaching effectiveness. A substantial amount of pressure has been placed upon school districts to evaluate teaching effectiveness because of the accountability impact. This practice is a sensitive issue to teachers. Research seems to bear out the fact that teachers should indeed be concerned. Very evident in the research reviewed is the need to involve more than just the administrator or supervisor in evaluation of teaching effectiveness. The literature pointed out the fact that teachers should be involved in the evaluation process.
Purpose of the Study
The purpose of this study was to determine what factors were important from the teacher's viewpoint in identifying effective teaching, to compare the findings with the administrator's viewpoint, and to determine whether or not there was agreement on the criteria for judging effective teaching. By first determining whether or not teachers and administrators agreed upon the criteria for judging effective teaching, this study provided a means whereby some conclusions could be reached by school districts concerning their efficiency in meeting the public demand of the best possible education for the tax dollar.
Questions to be Answered
The questions to be answered by this study were:
1. Is there agreement among teachers in Montana on the criteria that describe the effective teacher?
2. Is there agreement among school administrators in Montana on the criteria that describe the effective teacher?
3. What is the degree of agreement between administrators and teachers in Montana schools on the criteria that describe the effective teacher?
4. What is the degree of agreement between elementary and secondary teachers on the criteria that describe the effective teacher?
5. Is there a relationship between the criteria differences as perceived by teachers and their administrators and the rated effectiveness of the administrator in helping the teacher to improve his effectiveness?
Limitations of the Study
The limitations of this study were:
1. The study was limited to the geographic area of the State of Montana.
2. The elementary and secondary teacher population of Montana schools comprised the teacher population from which the sample was drawn.
3. The district superintendents and principals of Montana schools comprised the administrator population from which a sample was drawn.
Definition of Terms
Terms defined for the purpose of this study were:
1. Teacher. A person who is certificated to teach in Montana and who will be under contract to teach during the 1976-1977 school year in any school district in Montana.
2. Administrator. A person who is certificated by the State of Montana for the purpose of administering a school or school district and who is employed either as a principal or superintendent in any school or school district in Montana during the 1976-1977 school year.
3. Teacher Effectiveness. This term is the degree of success a teacher achieves in attaining the desired outcomes that a school district wishes to obtain in the teaching-learning environment.
4. Evaluative Criteria. Evaluative criteria are measures of teacher effectiveness.
5. Types of Criteria. Criteria are typed in accordance with Mitzel's scheme of process criteria, product criteria, and presage criteria.
Process criteria are measures of teacher effectiveness based upon classroom behavior, either the teacher's behavior, his students' behavior, or the interplay of both.
Product criteria are measures of teacher effectiveness in terms of measurable change in student behavior as a product of teaching.
Presage criteria are measures of teacher effectiveness based upon a teacher's personality or intellectual attributes, performance in training, years of experience, tenure, etc.
Summary
Defining the effective teacher has been a continuing process
carried out over many decades by researchers and investigators.
A review of the literature indicated that the purpose of most completed research has been to improve teaching performance in order to provide better education for children.
The process of identifying the effective teacher in earlier times
depended upon a subjective evaluation of the teacher's personality
traits and behavior in light of some particular authority's judgment as
to what was acceptable or unacceptable.
This process was usually
accomplished by the use of some type of rating instrument.
In more recent years the use of the rating instrument received
severe criticism by both teachers and administrators because both
believed its primary use by evaluators was for the purpose of dismiss­
ing teachers.
As a result of the criticism directed at the use of the
rating instrument, the emphasis in evaluation shifted from the subjec­
tive approach to a more objective approach which resulted in the posi­
tive practice of identifying the strengths and weaknesses of teachers.
The purpose of evaluation became that of correcting weaknesses and reinforcing strengths of the teacher. The emphasis became one of measuring teacher effectiveness centering on product measurement through previously agreed upon objectives of instruction.
To evaluate a
teacher's performance, it became necessary for school administrators
and teachers to determine the characteristics of the effective teacher
and the ingredients of effective instruction.
The problem for administrators and teachers was that of agreeing upon the criterion measures of effective teaching.
Traditionally the evaluation of a teacher's effectiveness was
conducted by the teacher's immediate supervisor, usually the principal.
Other forms of teacher evaluation which emerged more recently included
peer evaluation, self-evaluation, pupil evaluation or combinations of
these.
There was, by no means, total agreement among school districts of the nation that any or all of the newer trends in the evaluation process contained total answers to the teacher effectiveness problem. One of the biggest problems encountered, regardless of the approach a school district followed in evaluation, was that of defining the criteria by which teaching and teachers were to be assessed.
CHAPTER II
REVIEW OF RELATED LITERATURE
Introduction
This review was organized into four elements of studies and research relating to development and change that has taken place in determining teacher effectiveness. The initial section concerns the forces at work which created the need to evaluate teacher effectiveness. The second portion relates the status of present appraisal methods used in teacher evaluation. The third section reviews experimental studies of teacher effectiveness that illustrate the problems inherent in a study of this nature. Some specific trends, criterion and design models were reviewed and described. Areas of emphasis in the review of literature include:
1. Status of present methods of evaluating teacher performance.
2. Studies of teacher performance.
For the purpose of statistical comparison, this study followed
closely a study reported in 1974 by Jenkins and Bausell on how teachers
view the effective teacher (Jenkins and Bausell, 1974).
The purpose of
their study was to consult teachers and administrators regarding their
views on teacher effectiveness, in particular, on criteria they used to
evaluate their own effectiveness.
To provide some structure for such an
inquiry, Jenkins and Bausell developed a survey instrument which was
based on the category labels of product, process and presage employed by
Harold Mitzel in his contribution to the 1960 edition of the Encyclopedia of Educational Research. A more elaborate description of Mitzel's categories of teacher effectiveness criteria appears later in this chapter and a copy of the instrument used by Jenkins and Bausell appears in Appendix A.
Briefly described, product criteria in Mitzel's scheme are employed where a teacher is judged on the basis of a measurable change in what is viewed as his product, student behavior. Process criteria are used when a teacher's evaluation is judged by either his behavior in the classroom or that of his pupils or the interplay of both teacher/student behavior. Presage criteria are used if a teacher's evaluation is judged in terms of the teacher's personal or intellectual attributes, his performance in training, his knowledge or achievement or other pre-service characteristics (Mitzel, 1960).
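Mitzel's scheme amounts to a simple typing of each survey criterion as product, process, or presage. As a schematic sketch only (the category assignments below follow the definitions above, but the list is an illustrative subset, not the actual Jenkins and Bausell instrument), the typing can be represented as a mapping:

```python
# Illustrative sketch: a handful of criteria typed under Mitzel's scheme.
# Category assignments follow the definitions in the text; this is not
# the actual survey instrument.
MITZEL_TYPES = {
    "amount students learn": "product",        # measurable change in students
    "classroom control": "process",            # observable classroom behavior
    "rapport with students": "process",
    "knowledge of subject matter": "presage",  # pre-service/personal attribute
    "years of experience": "presage",
}

def criteria_of_type(criterion_type):
    """Return, alphabetically, all criteria typed under one Mitzel category."""
    return sorted(name for name, t in MITZEL_TYPES.items()
                  if t == criterion_type)

print(criteria_of_type("process"))
# prints ['classroom control', 'rapport with students']
```

Typing each criterion this way is what lets the study compare whole categories (product vs. process vs. presage) rather than only individual items.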
Jenkins and Bausell administered a survey instrument which
included an assortment of product, process and presage criteria to a
random sample of all public school teachers and administrators in the
State of Delaware.
Respondents, who numbered two hundred sixty-four (N = 264), were instructed to assume that adequate measures were available to measure each of the criteria listed. The instructions listed were replicated for this study, as well as the continuum used for responses. This information appears in the instrument which is located in Appendix A.
The criteria and the ratings given them by Delaware teachers and administrators appear in Table XXI. When the responses of elementary teachers, middle school teachers, secondary teachers and principals were compared, the results indicated that although these groups might be expected to have different biases, their ratings were remarkably similar. The average correlation between these groups was .93 (Jenkins and Bausell, 1974).
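A rank-correlation statistic of this kind (Spearman's Coefficient of Rank Correlation was used in the present study to compare results with the Delaware study) measures how closely two groups order the same criteria. A minimal sketch of the computation follows; the rating values are invented for illustration and are not data from either study:

```python
# Hedged sketch: Spearman's rank correlation between two groups' mean
# ratings of the same criteria. The numbers below are invented examples,
# not data from the Delaware or Montana studies.

def rank(values):
    """Assign ranks 1..n (ascending), averaging ranks for tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank for the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rho = Pearson correlation computed on the rank vectors."""
    rx, ry = rank(x), rank(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical mean ratings of five criteria by two respondent groups:
teachers = [4.8, 4.5, 4.4, 3.1, 2.7]
principals = [4.6, 4.7, 4.2, 2.5, 3.3]
print(round(spearman(teachers, principals), 2))
# prints 0.8
```

A rho near 1.0, such as the .93 reported above, indicates that the two groups ranked the criteria in nearly the same order even if their absolute ratings differed.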
Perhaps the most revealing aspect of the survey, according to Jenkins and Bausell, was the rating given to the criterion, Amount Students Learn. This criterion in the Delaware study was not seen as particularly important in judging teacher effectiveness relative to the other criteria rated. The implication of the rating received by Amount Students Learn for accountability proponents should be obvious. While those in the accountability movement stressed student learning as the primary basis for educational decision making, educational practitioners, at the same time, affirmed their preference for other criteria as indicated by the Delaware study. Results of their study and comparison tables with the Delaware study are listed in Chapter IV (Jenkins and Bausell, 1974).
Need for Evaluating Teacher Effectiveness
The study of the effectiveness of a teacher, and particularly how teachers themselves viewed the effective teacher, gained increased momentum in recent years with the emphasis placed upon "accountability" by the soaring cost of education in the 1960's. The cry for accountability intensified the search in the 1970's for improved ways to evaluate and to standardize these procedures. Impetus for this search arose from the needs of two groups: teachers, on the one hand, who sought the security of fair objective standards of evaluation; and the public, on the other hand, who sought assurance that its tax dollar was well spent (Oldham, 1974).
Because teacher accountability remained the center of debate,
any discussion of the topic turned sooner or later to the issue of
teacher effectiveness.
For the teacher the idea of accountability quickly translated into an assessment of the quality of his instruction and the related necessity of selecting criteria by which one would judge his effort. Because the accountability movement centered on teacher effects, it seemed imperative that teachers be consulted regarding their views on teacher effectiveness, and particularly upon the criteria they used to evaluate their own effectiveness (Jenkins and Bausell, 1974).
Forces Which Created the Need to Evaluate Teacher Effectiveness
In addressing the National Association of Secondary School Prin­
cipals' annual convention in Anaheim, California in March of 1972,
Governor Ronald Reagan who was then the Governor of California referred
to the growing need of public education to become more accountable in
the decade of the seventies.
Governor Reagan described the need to "re-establish the public's confidence in education and our school system" as yet another responsibility the public had given to its education system.
Governor Reagan described the public's eroding confidence in education although education traditionally had been America's major public priority. Of the reasons described by him as contributing to the eroding public confidence, crisis stemming from financial problems and the feeling that people had reached the limit of their ability to pay higher taxes seemed most paramount. How this mood affected education was described by Governor Reagan in his statement,
However unjustified educators feel the attitude may be, there
is a feeling among our people that our schools are not doing all
that they should, or doing it as efficiently and as economically
as they could (Reagan, May 1972).
The implications that Governor Reagan's address held for measuring teacher effectiveness as one way to meet the public's demand for accountability are summarized in part of his address as follows:
We must develop ways to evaluate objectively the performance
of teachers, to find the best, and to reward them for superior
performance.
In California last year, we passed legislation to
require evaluation of teacher performance.
You can probably guess the result. The deadline for conforming to this new law had to be postponed. Because we have promoted by seniority alone for so long, we have had to start from the basics to determine just what should be measured in evaluating teachers and how to measure it.
. . . However difficult it may be, we are determined to
develop fair, realistic and reasonably flexible methods of
measuring teacher performance (Reagan, May 1972).
Herman supported the primary reason expressed by Governor Reagan that the education institutions of this nation were besieged by internal and external forces demanding that these institutions be held accountable and show evidence of having used the taxpayers' money wisely before asking for additional money. Herman took the position that one of the most basic elements in accountability was staff evaluation. This element, according to Herman, dealt with definitions of what we were doing, who was responsible for doing it, and how we measured the effectiveness of the work assigned each individual within the program. Herman described two basic ideas that needed to be included within each district's plan of evaluation regardless of the ultimate number of personnel involved in the evaluation. These were described as: (1) a self-evaluation must be done by the employee and (2) the employee's immediate supervisor has to arrive at judgments based upon his evaluations when administrative decisions, such as whether or not to grant a teacher tenure, needed to be made (Herman, 1973).
Ornstein and Talmage have stated that the concept of accountability was borrowed from management. They described the concept of accountability applied to education as
. . . "holding some people (teachers or administrators), some agency (board of education or state department of education), or some organization (professional organization or private company) responsible for performing according to agreed-upon terms" (Ornstein and Talmage, 1974).
In the past, so stated Ornstein and Talmage, students alone were held accountable for specific objectives in terms of student changes in achievement and behavior.
According to Ornstein and Talmage most people believe that everyone, including teachers and administrators, should be held accountable for their work. What many educators objected to, and even feared, was the oversimplified idea of accountability as the sole responsibility of the teacher or principal. Accountability should have included not only teachers and administrators but also parents and community residents, school board members and taxpayers, government officials and business representatives, and most importantly the students. Ornstein and Talmage summarized their concern about the concept of accountability as an idea which was spreading throughout the country regardless of the fact that there was no evidence that it would reform the schools. One of the major difficulties which seemed to plague the accountability movement was that of measuring learning (Ornstein and Talmage, March 1974).
One process which the call for accountability in the seventies
forced upon some school districts was termed management by objectives
(MBO).
A number of school districts turned to this new concept of man­
agement, alternately referred to as management by mission, goals manage­
ment, and results management in the hope that because the concept had
been used successfully in business and industry for more than a decade,
it would likewise prove successful for school districts.
Although MBO and accountability have been frequently teamed in the literature and in school district improvement efforts, they have not been considered as generic teammates. MBO preceded the accountability-in-education movement by at least a decade. The term management by objectives was first used by Drucker in his book Practice of Management in 1954. McGregor of M.I.T. and Likert of the University of Michigan had used it to justify the application of findings in behavioral research to the business situation. Since then, results management has been widely installed throughout the United States and other countries, notably Great Britain, where business, industry, and government have found it a productive way of managing their enterprises (Read, March 1974).
Read listed several administrative practices which he felt MBO would strengthen. Read stated that successful implementation of MBO would . . . "eliminate the tendency to evaluate personnel in terms of their personality traits; substituting instead, their performance in terms of results" (Read, 1974). For the purpose of this paper the practice of determining teacher effectiveness seemed most appropriate.
As pointed out by Howard one should not be led to believe that,
the idea of accountability, the adoption of business practices in education, is new, that it has just been discovered by some of our brighter, abler, and more responsible people in education (Howard, 1974).
He noted that in the early 1900's we were blessed in having a number of educators who, in response to pressures from business, industry, and the general public, were able to devise methods for determining efficiency and educational output in the schools. "Educational efficiency experts" and "educational engineers" were names given to the responding educators of those days (Howard, 1974).
Miller described the influence of business practice on accountability by noting that developments in the field of management techniques required sharper expertise in goal setting, planning, and establishing of cost effectiveness measures. Management also increased its skill in evaluation and assessment which in turn fostered the move toward accountability, Miller concluded (Miller, 1972).
Miller perceived accountability as a means of holding an individual or group responsible for a level of performance or accomplishment for specific pupils. He emphasized that program goals would be developed for each activity, thus clarifying the purposes and goals of all programs and making it easier to assess results. He believed that educators would have to develop greater skill in goal setting, diagnosing needs, and analyzing learning problems. He also noted that increased emphasis on improved communication and involvement of pupils and parents would be a necessity and would result in better understanding and support of the school program (Miller, 1972).
Many persons were threatened by the idea of accountability and even more were disturbed by the apparent way in which the concept was being implemented. A major cry from the teachers was that standards for them and for the pupils were likely to be set by central office administrators. They feared that the required levels of performance would be unrealistic and unobtainable, thus triggering punitive actions toward pupils and teachers. Teachers did not want to become the scapegoats when the school systems did not produce what the parents, the boards, or the administrators demanded. Teachers pointed out that while they were likely to be the ones held accountable, they often did not have the resources or power to alter policies or practices which must be changed if improvement were to come about.
Many worried that implementation of accountability would cause education to focus on that which could be easily identified and measured. The area of academic achievement would most likely get the most attention at the expense of the affective domain. What was certain in the minds of many was that accountability would surely increase the educational bureaucracy which, to some, already constituted a serious impediment to improving instruction.
As described by Kibler, the use of instructional objectives was consistent with the concept of accountability, which was described as the balancing of money spent for education with the amount students learned. Accountability in education as described by some writers was rapidly gaining acceptance from both the public and the federal government. Unfortunately, some educators who had negative attitudes about accountability also had become negative about instructional objectives (Kibler, 1974). Apparently, the negative attitudes about instructional objectives were based on the misconception that using instructional objectives leads to accountability in education. Kibler felt that few comforting words could be said to those teachers who viewed accountability-based educational systems as a threat. If accountability-based educational systems did become the norm, experience in the use of instructional objectives would enable teachers to adapt to the system more easily (Kibler, 1974).
Hottleman, Director of Educational Services of the Massachusetts Teachers Association, described the negative impact that accountability had on public education in his statement that,
The accountability movement probably offers more potential for harm to public education than any other idea ever introduced, yet more and more highly placed education officials hop on the bandwagon daily. One common element among the major accountability movers is their backgrounds. They are mostly administrators, testing experts, or private businessmen. Teachers' organizations and individual teachers are notably absent (Hottleman, 1974).
Hottleman described the accountability movement in public education as first becoming visible in 1970 when President Nixon announced, "School administrators and school teachers are responsible for their performance and it is in their interest as well as in the interests of their pupils that they be held accountable." Hottleman explained that the President was probably influenced by Leon Lessinger, the Assistant Commissioner of Education, who openly stated his intention to make public education accountable (Hottleman, 1974).
In summarizing the accountability movement Hottleman noted that the accountability movement had not begun as a way to improve learning opportunities for children but in response to problems which arose out of the increasing costs of public education. The proponents, in the main, were not public educators but were those who had an accounting mentality that viewed sorting, classifying and measuring as significant per se. Hottleman viewed the overemphasis on measurement as promising greater conformity, the diminishing of humaneness, individuality, and creativeness in public education, and, if unchecked, threatening a concerted move toward educational mechanization. In the opinion of Hottleman teachers were viewed as the least important resource in seeking answers about the improvement of education. In his view what was needed was a reduction of funds spent by the measurement fanatics and an increase in funds spent in finding ways of surfacing and implementing the ideas of practicing teachers (Hottleman, January 1974).
Weiss described educational accountability as a threat to the privacy and security of educators, who worked in greater privacy than almost any other professional group. Because educators worked in relative privacy compared to other professional groups they looked upon the concept of accountability as having strong implications of distrust for their effectiveness. As Weiss stated, "Educators, like all of us, know that 'accountability' does not enter into the discussions between persons or agencies with great confidence in each other," therefore, educational accountability carried with it an obvious presumption of guilt (Weiss, April 1973).
Weiss's observation described in some degree the defensiveness that educators displayed toward the concept of accountability. While most people believed that everyone, including teachers and administrators, should be held accountable for their work, educators objected to, and feared, the oversimplified idea that accountability was the sole responsibility of the teacher or principal.
The response of teachers to a state demand for accountability and assessment was described in the research carried out by Bleecher. In the State of Michigan a demand for accountability and assessment was seen as a rational response to political pressures from taxpayers who felt heavily taxed. Because taxpayers wanted to know what they were getting for their annual two billion dollars spent on education, the legislature passed an act which ordered a program designed to assess pupil learning in the basic educational skills to take effect immediately. In order to comply with the educational assessment act, the State Department of Education advocated a six-step model which it presumed would lead to educational accountability. The response of teachers was rejection in the form of minimal compliance and by pressure from the organized teacher groups (Bleecher, December 1975).
The accountability movement which received renewed emphasis in education in the early part of this decade still continues. This movement has been summarized by Popham as a public challenge to education by his statement that:
The public is clearly subjecting educational institutions to increased scrutiny. Citizens are not elated with their perceptions of the quality of education. They want dramatic improvements in the schools, and unless they get them, there is real doubt as to whether we can expect much increased financial support for our educational endeavors. And the public is in no mood to be assuaged by promises. 'Deliver the results,' we are being told. No longer will lofty language suffice, and yesteryear's assurances that 'only we professionals know what we're doing' must seem laughable to today's informed layman.
The distressing fact is that we haven't produced very impressive results for the nation's children. There are too many future voters who can't read satisfactorily, can't reason respectably, don't care for learning in general, and are pretty well alienated from the larger adult society (Popham, May 1972).
Many educators, particularly administrators, responded to the accountability challenge which Popham described as inevitable and accepted the premise that the schools must indeed be accountable. Thus, according to Popham, "the course was set to find the most expedient way to accomplish accountability". For teachers the challenge could be described as an admission of guilt for failure of students to learn. Popham suggested that educators accept the accountability challenge by increasing classroom teachers' skills in producing evidence that their instruction yielded worthwhile results for learners. One way, suggested by Popham, of showing results was to place appropriate measures of student performance in the hands of the teacher. The measures suggested by Popham were tests of instructional objectives described in the literature as criterion-referenced measures (Popham, May 1972).
Measuring student performance by testing has been referred to in the literature as product measurement, which in turn has been one method used to measure teacher effectiveness. This method of measuring teacher effectiveness has not been popular with teachers for a number of reasons. The common problem is that attempts to evaluate teachers on the basis of pupils' test performance tend to focus teaching too narrowly on the specifics measured by the test (Rosenshine, 1970) and (Veldman and Brophy, 1974).
Grogman described the dangers inherent in accountability measures that focus on short-term goals, which are the kind measured by tests:

As teachers are threatened with accountability measures that focus on short-term measurable goals, their only recourse is to stress what is stressed in the accountability measures, frequently to the detriment of more important learnings, which may be underemphasized or overlooked. If not measured (and they generally are not in accountability systems), such skills as socialization, cooperation, and communication undoubtedly will suffer (Grogman, May 1972).
In recent years many who are charged with the responsibility of evaluating teachers have begun to consider product evaluation methods. Thus, trying to imitate industry, evaluation centered on student achievement, which in part depended upon test scores (Thomas, December 1974). Yet, Medley and others concluded from their research that only short-term goals, which were almost certainly the least important goals of education, are validly measured by tests. The validity of "teacher tests" of ability to achieve short-term outcomes as predictors of overall teacher effectiveness is by no means self-evident. Their validity as predictors of overall teacher effectiveness, according to Medley, "must be empirically demonstrated before their use is justified" (Medley, June 1975).
In view of the limitations placed upon tests of student achievement as a criterion to measure teacher effectiveness, it was not surprising to find that teachers questioned product measurement as a measure of their effectiveness. This position was summarized in the Fleischmann Report, which was made to the New York State Commission on the quality, cost, and finance of elementary and secondary education for the State of New York. The Report stated,

Because of the many circumstances that influence learning, educators have traditionally been reluctant to submit to evaluation on the basis of student performance. They have argued that learning is in too many ways beyond their control and that it is therefore unfair to judge school effectiveness by measuring student achievement alone (The Fleischmann Report, 1973).
Not only were educators reluctant to be evaluated on the basis of student performance, they questioned any measure designed to assess their teaching effectiveness. Wolf described this concern of teachers in the following statements:

Teachers are not fond of evaluation. They suspect any measure designed to assess the quality of their teaching, and any appraisal usually arouses anxiety. . . . If teachers are to submit to an assessment of their performance, they would probably like reassurance that the criteria and method of evaluation that are used would produce credible results. . . . Teachers probably believe that the standards for evaluating what is effective teaching are too vague and ambiguous to be worth anything. They feel that current appraisal techniques fall short of collecting information that accurately characterizes their performance (Wolf, 1973).
Supporting this point of view, House noted that little demand existed among teachers and administrators for evaluating their programs. In his view teachers gained little by having their work assessed. Instead, according to House, teachers risked damage to their egos by subjecting themselves to evaluation by administrators, parents, and worst of all, students, only to find that they were not doing the job as effectively as they thought they had.
As stated by House,

The culture of the school offers no rewards for examining one's behavior, only penalties. Since there are no punishments for not exposing one's behavior and many dangers in so doing, the prudent teacher gives lip service to the idea and drags both feet (House, 1973).
Teachers reacted strongly to the renewed emphasis placed upon evaluation as dictated by the accountability movement. They felt that they would be blamed for something whether or not such blame was deserved. This guilt feeling on the part of teachers was obviously not conducive to open discussion, examination, or evaluation. Administrators and teachers saw evaluation as quite important and quite threatening. Both groups saw little to be gained (House, 1973).
Renewed Emphasis Placed Upon
Measuring Teacher Effectiveness
As noted from a review of the literature, the accountability movement in education during the 1960's and the early 1970's renewed the emphasis of school districts throughout the nation on finding fitting methods to measure teacher effectiveness. Also evident from the review of the literature was the fact that teachers generally distrusted accountability measures that used student gain as a criterion for measuring teacher effectiveness. The burden of identifying effective teachers and effective teaching, and the concern that both teachers and administrators felt, was best illustrated by Thomas. He noted that,
Evaluation has always been troublesome for school administrators. It has always been troublesome for teachers. Both profess the value and necessity for evaluation, but neither believes that it can be effectively accomplished. At one extreme is the position of Robert Finley, one of the nation's finest superintendents: 'Evaluation is subjective . . . period. No other way to evaluate people exists, so that's the way to do it.' At the other extreme the National Education Association states: 'Evaluation must be objective; subjective evaluations have a deleterious effect on teachers and children' (Thomas, December 1974).
The question of whether or not to evaluate teachers, and the process of so doing, was taken away from the jurisdiction of local school districts in some states. State legislatures began to enact laws which mandated the evaluation of teachers at specified intervals and in specified ways. An example was California's Stull Act, which required the evaluation of all certificated personnel based upon "expected student progress in each area of study." The Stull Act, passed in July 1971, was not the only state law requiring evaluation of teachers and administrators. At the beginning of 1974 nine states had enacted legislation mandating some form of teacher evaluation (Oldham, 1974).
Other states considered enactment of similar laws, but the trend had not developed at the pace that the emphasis on "accountability" suggested prior to 1974. There was considerable interest in accountability laws which would place teacher evaluation beyond the influence of existing laws and regulations which had governed the certification of teachers. State governments, seemingly, had recognized that it was not possible to determine competence of teaching on the basis of university training or licensing (Oldham, 1974).
The law of at least one state mandated the evaluation of teachers and established criteria by which such evaluation was to be measured.

Kansas law establishes guidelines or criteria for evaluation policies in general terms of efficiency, personal qualities, professional deportment, results and performance, capacity to maintain control of students, etc. The law says community attitudes should be reflected. It provides for teacher participation in the development of the evaluation policies and self-evaluations. The law also provides for state board assistance in preparation of original policies of personnel evaluation (Oldham, 1974).
The inclusion of personal qualities as a criterion for measuring teacher effectiveness aptly illustrated the conclusion reached by Gage that the personality of the teacher is a significant variable in the classroom and has recently become the basis for a growing body of research (Gage, 1963). The concern of teachers who questioned the feasibility of measuring their effectiveness by assessment of personality qualities was described by the findings of Getzels and Jackson, who concluded, ". . . despite . . . a half-century of prodigious research effort, very little is known for certain about the nature and measurement of teacher personality, or about the relation between teacher personality and teacher effectiveness" (Getzels and Jackson, 1963).
Traditionally teachers and their organizations have had to fight for job security and fair standards of pay. Due process in dismissals and punitive actions, along with the single-salary schedule, helped equalize salaries between men and women and between elementary and secondary school teachers. These were well-earned victories for a vulnerable profession. Evaluation systems were often disguised means for firing militant or nonconformist teachers, for slashing budgets, and for enforcing authoritarianism in the schools. Teachers wanted no part of them. Historians have noted that at the 1915 NEA convention, one delegate denounced teacher rating as being "demeaning, artificial, arbitrary, perfunctory and superficial" (Oldham, 1974).
Early teacher ratings were primarily the outgrowth of merit pay programs, which had originated around the turn of the century. A merit pay program was a method used by school districts to determine a teacher's salary in light of a judgment made as to his competency (Brighton and Hannon, 1962).
Merit rating, which was the outgrowth of the merit pay idea, was described by Rogers as

. . . the effort to evaluate or measure more successfully the effectiveness of the performance of the teacher, with a view of rewarding excellence while avoiding over-payment to the mediocre or unsuccessful teacher (Brighton and Hannon, 1962).
The merit rating movement by 1915 had reached such proportions that it caused a decided division between proponents and opponents. One group of people, which included both laymen and professionals, concluded that it was impossible to find a safe, usable scheme of rating. This group was unable to determine exactly why they thought such delineation was impossible. The 1920's saw the peak use of formal merit pay plans in school districts throughout the United States. The Department of Classroom Teachers of the National Education Association reported to the 1925 national convention the Ohio State University study of 1922, which indicated that 99 per cent of the cities in the United States with populations of over 25,000 had some form of teacher rating in operation (Brighton and Hannon, 1962).
Most persistent merit rating problems which appeared in research between 1900 and 1930 dealt with the reliability and validity involved in measuring teacher effectiveness. The concern at that time was whether or not the measuring device was consistent in measuring what it was supposed to measure: was the instrument reliable? Secondly, was the instrument valid in that it measured what it was supposed to measure? The questions of reliability and validity led to the development of measurement devices which could be tested against such criteria (Brighton and Hannon, 1962).
Brighton and Hannon noted that rating scales which listed the personal and pedagogical attributes of a successful teacher were the main instruments used to measure teacher competence by 1930. Trait scales were developed which required agreement on the relative importance of each item. It then became necessary to measure the degree to which a particular teacher possessed or did not possess each particular attribute. Barr analyzed 209 of these rating scales in use by 1930 (Cooke, 1939). Barr concluded that ten categories could include all the attributes that were being used in this approach to rating. They were:
Instruction
Classroom management
Professional attitude
Choice of subject matter
Personal habits
Discipline
Appearance of the room
Personal appearance
Co-operation
Health
That there was little agreement among raters as to what personal and pedagogical attributes described the successful teacher was further illustrated by Brighton and Hannon:

Shelter reported that a similar study in Pennsylvania by Charters and Waples produced a list of 25 categories. Twenty of these are not found in Barr's list. Shannon queried 164 public school administrators concerning 430 of their best and 352 of their worst teachers. From the replies, he formulated ten categories important in defining teacher competence. Only four of the ten are found in Barr's list. Sheller studied five such lists and found little similarity among them (Brighton and Hannon, 1962).
The obvious inconsistency between various lists caused Barr to comment:

Excellent as these earlier check lists are, they represent in most instances merely abbreviated statements of the author's own opinion of what constitutes good teaching and do not necessarily supply valid and reliable criteria of teaching success (Barr and others, 1938).
In a report on teacher ratings in public school systems which was compiled by Boyce in 1915 and published by the National Society for the Study of Education, Boyce noted that the number of items by which teaching efficiency was judged ranged from two items to eighty items. He listed four types of analysis of rating scales as: (1) descriptive reports which dealt with specified points, (2) lists of questions which were answered yes or no, (3) lists of items which were evaluated by the classification of excellent, good, medium, unsatisfactory, etc., and (4) lists of items in which each item was assigned a numerical value.
In Boyce's summary of the qualities listed on 50 of the rating schemes evaluated, "discipline" was the most listed quality. It was found in 98 per cent of the forms. Next in frequency were "instructional skill" and "cooperation and loyalty". Each was mentioned in 60 per cent of the forms (Biddle and Ellena, 1964).
Boyce's plan of rating included a classified list of 45 items grouped under five headings: personal equipment, social and professional equipment, school management, and technique of teaching. The items ranged from "general appearance" to "moral influence." His plan required that each item be checked on a scale of five terms ranging from "very poor" to "excellent". By 1967 many evaluation forms were in use which were similar to the "efficiency record" published by Boyce in 1915 (Biddle and Ellena, 1964).
In 1924 Monroe and Clark summarized the researches of the preceding twenty years, in which they cited studies that showed the lack of reliability of existing rating devices. They also pointed out that a halo effect existed on the part of the rater's general estimate of the teacher. The halo effect influenced the estimates of particular traits held by the teacher who was rated.
As noted by Biddle and Ellena, many different people have been used as raters in competence research. This group included classroom teachers, student teachers, critic teachers, principals, supervisors, superintendents, school board members, pupils, parents, other lay persons, and college instructors. A large majority of studies reporting on teacher competence have used rating forms for one purpose or another.
Biddle and Ellena summarized as follows:

Generally, the results of research using rating forms have been poor and contradictory. This is not surprising in view of the fact that such judges as listed above are handicapped by personal bias, a lack of training for observation, and a lack of firsthand information concerning the teacher-classroom interaction. Yet each year brings a new crop of studies using rating forms. Why this perseverance? One obvious answer is the prevalence of rating forms in school programs of assessment, merit pay and promotion.

In their view ratings seemed less than useful for research on teacher effectiveness . . . (Biddle and Ellena, 1964).
Historically teachers have objected to the use of rating forms to measure their effectiveness. Over the years the proceedings of their professional organization have demonstrated their feelings on this issue. Biddle and Ellena reported the feelings of teachers as described in the 1915 Proceedings of the NEA as follows:

. . . A sense of real injustice develops among teachers when ratings arrived at in a perfunctory manner become the basis for salary increases. It was this sense of injustice that led the members of the National Education Association, as early as 1915, to adopt a resolution in opposition to 'those ratings and records which unnecessarily disturb the teacher's peace and make the rendering of the best service impossible' (NEA, Proceedings, 1915) and (Biddle and Ellena, 1964).
Again in 1961, and in more recent years, the National Education Association through its resolutions recognized that "it is a major responsibility of the teaching profession, as of other professions, to evaluate the quality of its services." The resolution opposed the use of subjective methods of evaluating teachers for the purpose of setting salaries, saying specifically, "Plans which require such subjective judgments (commonly known as merit ratings) should be avoided" (NEA, Proceedings, 1961, pp. 189-193) and (Biddle and Ellena, 1964).
Oldham related that the literature from teacher organizations has continued in opposition to the idea of merit pay. Administrators, board members and the public regard merit pay as a way to improve education and get a better return on the tax dollar.

Merit pay, according to Oldham, could not exist without evaluation, but the converse was not true; evaluation could and did exist without merit pay. Usually, when evaluation began to suggest merit pay, evaluation itself was likely to be attacked by teachers. The idea of teacher evaluation to improve instruction reached a near consensus status, but teacher evaluation for the sake of paying some teachers more than others was still very much a subject of debate among teachers (Oldham, 1974). Most school districts avoided tying teacher evaluation to merit pay, and most avoided merit pay schemes altogether. However, a few districts employed merit pay schedules. In those districts it was common practice to separate teacher evaluation for the sake of improving learning from teacher evaluation for the sake of rewarding teachers with some additional increments. Most districts followed voluntary participation in merit pay programs (Oldham, 1974).
Opponents of merit pay programs put forth the argument that teachers preferred to be paid on the basis of experience and earned credit. Some of the reasons teachers avoided merit pay programs were listed by Shaughnessy. These were: who shall appraise, what should be subject to evaluation, how will appraisals be conducted, and how will appraisals be translated into salary increments. Oldham quotes Shaughnessy as follows: "Perhaps the most striking issue is that which centers around inadequate basic and evaluative research in teacher and teaching evaluation in general and in merit pay programs in particular" (Oldham, 1974).
A summary of the more recent teacher response to merit pay and other considerations was illustrated by Oldham in the following quote:

Teacher evaluation systems are not implemented anywhere without some conflict. The reasons for controversy vary and there does not appear to be any one over-riding cause, such as teachers' intransigent opposition to all forms of evaluation. Although some very few districts replying to the Education U.S.A. survey did report teacher opposition to any evaluation after tenure, most conflict seems to stem from other matters. Evaluation, as a general process, is in itself rarely the issue. However, some tentative generalizations, based on the survey replies, are possible:

Conflict seems most likely if teacher evaluation is tied to identifying incompetent teachers for the purpose of dismissal; if it is tied to merit pay provisions; if a check-list type of evaluation instrument is used that does not reflect any teacher input. . . . Even when teachers participate in the creation of the evaluation procedure, conflict can result . . . (Oldham, 1974).
Teacher Participation in the
Evaluation of Their Services
"From staunch opposition, to a guarded receptivity, to a leadership role in planning for teacher evaluation, such has been the course of opinion of large numbers of the nation's teachers regarding evaluation," stated Oldham (Oldham, 1974). But this apparent reversal in the approach to teacher evaluation by teachers did not reflect a real change in philosophy held by teachers as much as it reflected the changes in the types of evaluation programs that were being proposed due to outside pressures on the schools.
Up to the present time, wrote Oldham, there has been little consensus among teachers in the area of teacher evaluation, although it is a subject that vitally affected and concerned them. Individual teachers and their professional organizations ranged on a continuum from firm opposition to evaluation plans to active support for a certain plan. Because of the complexity and difficulty inherent in teacher evaluation, the numerous plans proposed, and the uniqueness of local district conditions, this change in teachers' attitudes toward evaluation came as no surprise (Oldham, 1974).
In the last decade, research and experimentation in the universities and teacher training institutions of our country brought forth much re-examination of the teaching and learning process on an intellectual and scholarly basis. This re-examination and the resulting new theories intensified the move toward better teacher evaluation. New methods were provided for carrying it out. Teachers' organizations have added to this body of knowledge through studies, workshops and professional development programs (Oldham, 1974).
These activities contributed to the emerging philosophy that evaluation of teachers was for the purpose of improving instruction rather than for rating purposes. Teachers supported the idea that evaluation should "pinpoint teacher strengths and weaknesses" and help them to reinforce their strengths and overcome their weaknesses. This philosophy was supported by many teachers because it provided an approach to evaluation that seemed to meet the needs and demands of accountability proponents as well as teacher needs. Oldham described gradual teacher acceptance as follows:

Many teacher groups began to accept the precept that teacher evaluation may be one way to satisfy the public demand for tangible evidence that the schools were doing their job with properly qualified and properly educated staffs (Oldham, 1974).

The conditions as described by Oldham rapidly moved teachers' associations on all levels toward planning and recommending "acceptable" evaluation systems (Oldham, 1974).
This change in attitude was summarized by Larry E. Wicks in an article, "Teacher Evaluation," in Today's Education/NEA Journal (March, 1973):

They [a teacher's association] were aware that parents, students, elected officials, and state agencies across the country are demanding teacher accountability. They believed that if the profession doesn't deal with the problem then someone else will. Therefore, they felt that education associations must place a high priority on becoming fully involved in establishing policies for and carrying out evaluation of education programs and of teaching processes.
In a similar vein, NEA's The Early Warning Kit on the Evaluation of Teachers contained the following statements:

. . . The work of teachers is constantly being evaluated not only by supervisory personnel but by the lay public as it criticizes educational products. Teachers should not become defensive but should be prepared to respond affirmatively. Appropriate response is made by taking a hard look at programs to improve the schools. . . . Without association involvement in the selection, adoption, or development of the evaluation instrument, there is little likelihood it will be used adequately and fairly to evaluate teachers. If teachers do not take a strong position on teacher performance evaluation, they will be unable to benefit from this important and sensitive activity . . . it is a major responsibility of educators to participate in the evaluation of the quality of their services.
Oldham reported that many teachers' groups concluded that teacher evaluation was here, would stay, and would expand. To insure that evaluation would be done right, teachers chose to become involved in the process from beginning to end. Because teachers felt the need to help shape policies, set goals, design instruments and carry out the procedures of evaluation, many associations made teacher evaluation a negotiable item in contract bargaining (Oldham, 1974).
Specifically the National Education Association took the position that it was a major responsibility of educators to participate in the evaluation of the quality of their services. To enable educators to meet this responsibility more effectively, the Association called for continued research and experimentation to develop means of objective evaluation of the performance of all educators. The means included identification of (a) factors that determined professional competence; (b) factors that determined the effectiveness of competent professionals; (c) methods of evaluating effective professional service; and (d) methods of recognizing effective professional service through self-realization, personal status, and salary.
The association held the view that evaluations should be conducted for the purpose of improvement of performance and quality of instruction offered to pupils, based upon written criteria and following procedures mutually developed by and acceptable to the teacher association, the administration and the governing board (NEA, January, 1974).
Studies of Teacher Effectiveness
During most of the history of education the question of what knowledge, understanding and ways of behaving teachers should possess was based on experience, tradition, common sense and authority (Gage, 1963). Philosophers and theologians, applying their modes of truth-seeking to the problems of education, included the question of how teachers should behave. With the emergence of the behavioral sciences in the twentieth century, attempts were made to apply scientific methods to the problems of learning, teacher behavior and teacher evaluation.
From this developed a sub-discipline that is referred to as "research on teaching." Gage defines the term "research on teaching" as the study of relationships between variables, at least one of which refers to a characteristic or behavior of a teacher. He stated:

If the relationship is one between teacher behaviors or characteristics, on the one hand, and effects on students, on the other, then we have research on 'teacher effects' in which the teacher behavior is an independent variable (Gage, 1972).
The record of research accomplishment on teacher effectiveness does not firmly support the idea that science can contribute to the art of teaching. There are reasons for questioning this pessimism, according to Gage. Part of the reason lies in the fact that research in teacher effectiveness has assisted in revision of the teacher's role by deriving and evaluating the ways in which teacher behavior is changed (Gage, 1972).
One problem, as described by Barr, that must be of continued concern to those interested in the measurement and prediction of teacher effectiveness was that of an adequate criterion. Barr presented a critical overview of some 75 doctoral studies made at the University of Wisconsin that pertained in some respect to the measurement and prediction of teacher effectiveness. He concluded that "by and large, and with many exceptions two criteria have been used: (1) efficiency ratings of one sort or another, and (2) measured pupil gains." His summary on ratings was: "Over all, general ratings of teacher effectiveness have been shown to be, under current conditions, exceedingly unreliable" (Barr, 1961).
Regarding the use of measured pupil gain as a criterion of teacher effectiveness, Barr listed two difficulties that were encountered by the researcher.

. . . First of all, each teacher in the modern school, within very broad limits, chooses his own purposes, means and methods of instruction. . . . A second difficulty arises out of the fact that notwithstanding over a half century of effort, many of the outcomes of learning and of teaching are poorly or inadequately measured (Barr, 1961).
Some approaches to the development of criteria that described teacher effectiveness were listed by Barr as the "traits approach," which considered the qualities of the individual such as cooperativeness, ethicality, and considerateness, and the "behavioral approach," which considered the qualities of the individual not in terms of personality traits but in terms of characteristics of performance which integrated the concept of personality with that of methods. This concept has always been considered an important aspect of teacher effectiveness. Barr noted that, "The criterion of teaching effectiveness may also be behaviorally defined, directly and without the summarizing operations provided by personality traits." A very attractive feature of a behavioral criterion was that behaviors may be directly observed by all who cared to look (Barr, 1961).
53
In describing the criteria for measuring teacher effectiveness, Barr listed three commonly employed criteria which encompassed four approaches to evaluation. The criteria most commonly employed were:
(1) Efficiency ratings, which may be made by any number of persons, but most frequently by the superintendent of schools or members of his staff; (2) measures of pupil growth and achievement, usually adjusted for differences in intelligence and other factors thought to influence growth and achievement; and (3) a preservice graduation criterion composed of (a) measures of the foundations of efficiency, basic knowledges, skills, and attitudes and (b) the personal prerequisites to effectiveness; and professional competencies as inferred from observation of performance in practice teaching, internships, and other activities involving children (Barr, 1961).
Embodied within these criteria were four approaches to teacher evaluation which were combined in different ways by different persons, institutions, and data gathering devices. These were (a) evaluations made in terms of the qualities of the person, as in personality rating; (b) evaluations which proceed from studies of teacher behaviors, as in the rating of performance in terms of inferred personal qualities or desirable professional characteristics; (c) evaluations developed from data collected relative to presumed prerequisites to teacher effectiveness, potential or already achieved, represented by such psychological constructs as knowledges, skills, and attitudes; and (d) evaluations developed from studies of the product, for example, pupil growth and achievement (Barr, 1961).
A fourth type of criterion of teacher effectiveness described by Barr was that of pupil growth and achievement, which was usually expressed as pupil gain scores based upon achievement tests administered prior to instruction and again at some subsequent date when a particular unit of instruction or course had been completed. Barr cautioned that although many would consider this criterion a primary criterion against which all other criteria should be validated, it was subject to very definite limitations. One limitation was the fact that tests measured results but provided little information as to how these effects were produced. The teacher effect was only one of many effects that produced changes in pupil growth and achievement. One of the real difficulties was that of isolating the teacher effect (Barr, 1961).
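The gain-score criterion described above can be sketched as a simple pre-test/post-test computation; the pupil names and scores below are invented for illustration, and, as Barr cautioned, the resulting gain says nothing about how the change was produced or which part of it is attributable to the teacher.

```python
# Hypothetical sketch of the pupil gain-score criterion: an achievement
# test is given before instruction and again after a unit is completed,
# and each pupil's gain is the difference between the two scores.

def gain_scores(pretest, posttest):
    """Return each pupil's gain and the class mean gain."""
    gains = {pupil: posttest[pupil] - pretest[pupil] for pupil in pretest}
    mean_gain = sum(gains.values()) / len(gains)
    return gains, mean_gain

# Illustrative (invented) scores for a single unit of instruction.
pre = {"pupil_a": 60, "pupil_b": 72, "pupil_c": 55}
post = {"pupil_a": 74, "pupil_b": 80, "pupil_c": 63}

gains, mean_gain = gain_scores(pre, post)
print(gains)      # per-pupil gains
print(mean_gain)  # class mean gain
```

A mean gain computed this way confounds the teacher effect with intelligence, home background, and the other influences Barr listed, which is why adjusted gain measures appear in criterion (2) of his list above.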
One can assume that knowledge about teaching effectiveness consists of relationships between what a teacher does while teaching and the effect of these actions on the growth and development of his pupils. In pursuing this assumption Flanders defined research on teaching effectiveness as attempts to discover relationships between teaching behavior and measures of pupil growth. The most common research design, which in his mind left much to be desired, compared an "experimental treatment" group with a control group. Pre-tests and post-tests of pupil achievement and attitude were administered to all classes, and an analysis of the scores showed that there were, or were not, significant differences between the two groups of classes being compared. The problem with this design, according to Flanders, was the failure to collect data which helped to explain why the results turned out the way they did.
He felt that interaction analysis provided information about the verbal communication which occurred, and this often helped to explain the results (Flanders, 1970).
In an earlier research project, which was conducted at the University of Minnesota and supported by the U.S. Office of Education, ten categories were used to classify the statements of the pupils and the teacher at a rate of approximately once every three seconds. It was found that an observer could be trained to categorize at this rate with sufficient accuracy (Flanders, 1960).
The ten categories included seven assigned to teacher talk, two to student talk, and one to silence or confusion. When the teacher was talking, the observer decided if the statement was: (1) accepting student feelings; (2) giving praise; (3) accepting, clarifying, or making use of a student's ideas; (4) asking a question; (5) lecturing, giving facts or opinions; (6) giving directions; or (7) giving criticism. When a student was talking, the observer classified what was said into one of two categories: (8) student response or (9) student initiation. Silence and confusion were assigned to category (10) (Flanders, 1960).
In practice, an observer kept a record of different periods of classroom activity. At the end of an hour's observation, it was possible for an observer to sum the different kinds of statements for each of six types of classroom activity separately and combine these into a grand total for the entire hour's observation. This method of observation was called "interaction analysis" in the classroom, and it was used to quantify the qualitative aspects of verbal communication. The entire process became a measure of teacher influence because it made the assumption that most teacher influence was expressed through verbal statements and that most nonverbal influence was positively correlated with the verbal. Those who have worked with this technique were disposed to accept this assumption (Flanders, 1960).
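The summing step described above can be sketched in a few lines. The sequence of observation codes below is invented for illustration; a real observer would record one of Flanders' ten categories roughly every three seconds over a much longer period.

```python
from collections import Counter

# Hypothetical sequence of Flanders category codes (1-10) recorded by an
# observer at roughly three-second intervals during a lesson segment.
observations = [5, 5, 4, 8, 3, 4, 8, 5, 5, 6, 8, 10, 5, 4, 9, 2]

tally = Counter(observations)
total = len(observations)

# Flanders assigns categories 1-7 to teacher talk, 8-9 to student talk,
# and 10 to silence or confusion.
teacher_talk = sum(tally[c] for c in range(1, 8))
student_talk = tally[8] + tally[9]
silence = tally[10]

print("teacher talk:", teacher_talk, "of", total, "tallies")
print("student talk:", student_talk, "of", total, "tallies")
print("silence/confusion:", silence, "of", total, "tallies")
```

Summing the tallies this way yields the content-free profile of verbal communication that Flanders used; it quantifies how much of the hour was teacher talk versus student talk without recording what was said.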
Interaction analysis is a specialized research procedure that provides information about only a few of the many aspects of teaching. It is an analysis of spontaneous communication between individuals, and it is of no value if no one is talking, if one person talks continuously, or if one person reads a book or report. . . . Of the total complex called "teaching," interaction analysis applied only to the content-free characteristics of verbal communication (Flanders, 1960).
In a later publication entitled Analyzing Teaching Behavior, Flanders discussed sampling problems inherent in research on teaching effectiveness when a small group of teacher-class units were selected so that they represented a larger population. In reference to coding verbal behavior, he stated:

Teaching effectiveness is by definition concerned with what teachers do that affects educational outcomes. In order to investigate teaching behavior with techniques such as coding verbal communication, sample sizes have necessarily been small, too small to provide a logical basis for extending conclusions so as to generalize about a target population. The target populations, in turn, are difficult to identify in any meaningful way (Flanders, 1970).
One of the most extensive "teacher characteristics" studies was carried out by Ryans. Ryans' design was considered by other researchers such as Biddle to be classical in the sense that teacher "characteristics" were abstracted from the classroom context. Thus, classroom situation and teacher-pupil interaction were ignored for the most part. His work was unique in that he established relationships between ten characteristics and both formative and outcome variables (Biddle and Ellena, 1964).
Teacher characteristics, as defined by Ryans, meant both teacher properties and teacher behavior. He presented three dimensions of teacher behavior (measured by rating forms used in direct observation of behavior): warmth, organization, and stimulation. He detailed seven teacher characteristics (measured by objective instruments) including: favorable opinion of pupils, favorable opinion of classroom procedures, favorable opinion of personnel, traditional versus child-centered approach, verbal understanding, emotional stability, and validity of response (Biddle and Ellena, 1964).
The teacher characteristics study made relatively few assumptions about the roles of teachers. Instead, it followed a design that dictated going into the classroom to observe what transpired when teachers and pupils reacted and interacted in the learning environment. It attempted systematization of the observation data collected, related those observation data to other kinds of information about teachers, and discerned typical patterns of teacher characteristics in relation to various conditions of teacher status. An effort was made to investigate the interactions and interrelationships among pupil behaviors and teacher behaviors.
In the teacher-observation phase of the teacher characteristics study, the staff concluded that there were at least three major patterns of teacher classroom behavior that could be identified. These were:

TCS pattern X: warm, understanding, friendly versus aloof, egocentric, restricted teacher classroom behavior
TCS pattern Y: responsible, businesslike, systematic versus evading, unplanned, slipshod teacher classroom behavior
TCS pattern Z: stimulating, imaginative versus dull, routine teacher classroom behavior (Ryans, 1947).
Some conclusions that came out of the study were: (1) Certain characteristics of teachers may be traceable to behavior patterns that were expressed in related, but different, channels long before the individual entered teaching as a profession. (2) There appeared to be little doubt about the existence of important differences between teachers in varying age groups with respect to a number of characteristics. Generally speaking, scores of teachers fifty-five years and above showed this group to be at a disadvantage when compared with young teachers, except from the standpoint of pattern Y (systematic and businesslike classroom behavior) and characteristic B (learning-centered, traditional education viewpoints). (3) Differences between the sexes, often insignificant in the elementary school, were fairly general and pronounced among secondary school teachers. Women generally attained significantly higher scores than men on the scales measuring understanding and friendly classroom behavior, businesslike and stimulating classroom behavior, favorable attitudes toward democratic practices, permissive viewpoints, and verbal understanding. (4) Teachers in large schools (17 to 50 or more teachers) scored higher than those from small schools. (5) Good mental health, or emotional maturity, generally was assumed to be a requisite for satisfactory teaching performance. (6) Teachers are "good" if they rank very high among their colleagues with respect to such observable classroom behaviors as warmth and kindliness, systematic and businesslike manner, and stimulating and original teacher behavior. (7) Pupil behavior appears to be rather closely related to teacher behavior in the elementary school. In the secondary school it seems almost unrelated to teacher behavior in the classroom (Biddle and Ellena, 1964).
R. L. Turner utilized a strategy for effectiveness research that focused upon the assumption that teaching may be viewed as a series of problem-solving or coping behaviors. Utilizing this strategy, Turner and Fattu developed objective instruments to measure teacher potential for performing coping tasks. Turner demonstrated that teacher scores on these instruments were related to formative experiences, to teacher properties, such as intelligence, attitudes, and values, and to contextual variables, such as subject matter and age of pupils (Biddle and Ellena, 1964).
The design followed by Turner was intermediate between the classical approach of Ryans and the systematic interaction studies of Meux and Smith or Flanders. Turner assumed that the teacher reacted to the problems posed by classroom situations. With this assumption in mind he built "classroom contexts into the definition of teacher properties to be measured." By way of contrast, Ryans' ten traits were "abstracted from the classroom context" (Biddle and Ellena, 1964).
Turner, in his study which investigated the training and experience variables of teacher performance, suggested the following interpretations:

First, there was considerable evidence that treatments such as methods courses and student teaching during undergraduate teacher preparation have a distinct bearing on teaching-task performance in arithmetic and reading. Second, there was considerable evidence that variation in performance in teaching tasks . . . was associated with variation in undergraduate preparation. . . . Third, there is some evidence that variation in teaching-task performance is associated with variation in teaching situations. . . . Fourth, there is considerable evidence that the very early years of teaching experience produce the greatest rise in teaching-task performance, as evidenced by differences in performance between fully prepared but inexperienced teachers and teachers with no more than three years of experience. There was little evidence to suggest that performance changed greatly, for the average teacher, after the third year of experience (Biddle and Ellena, 1964).
Viewing the teacher effectiveness problem from a sociological context, Brim questioned whether or not there were personal and social characteristics which greatly influenced the role performance of a teacher. He summarized the research relating teacher characteristics to effectiveness in teaching by stating:

. . . even though there is a vast body of research on the relation of teacher characteristics to effectiveness in teaching, the reviews of this research show no consistent relation between any characteristics, including intelligence, and such teaching effectiveness (Brim, 1958).
Brim suggested that perhaps the effects of the teacher's personality had been looked for in the wrong place. He suggested that other aspects of the educational process, namely, the values the student learned, his feelings about himself and other persons, his attitudes toward further education, and many other social factors influenced the effects of teacher personality on the student. Brim proposed the possibility that the influence of a teacher's characteristics upon his effectiveness as an educator is contingent on characteristics of the students. He cited the example that teachers of given personal characteristics may be more effective with boys, others with girls; some with bright students, others with average students. He pointed out that teachers themselves were the first to admit that they seemed to do better with one rather than another type of student, and the preference of different students for various teachers was easily recognized as part of one's own life experience. As pointed out by Brim, this approach to measuring teacher effectiveness was not a novel observation, but somehow it had escaped attention as a critical research problem (Brim, 1958).
In research that was carried out by Biddle and Ellena on teacher effectiveness, two problems were identified which seemed to cause confusion in dealing with teacher effectiveness. They were listed as (1) during the decade of the fifties educational researchers said that they did not know how to define, prepare for, or measure teacher competence, and (2) researchers disagreed over the effects a teacher was expected to produce. For example, should the teacher's tasks be defined in terms of the ultimate goals of education or the effects he produced with the pupil? Was a teacher expected to gain the same degree of competence for all pupils, or should special competence be allowed in working with the underprivileged, the handicapped, and the exceptional pupil? They concluded as follows: ". . . until effects desired of the teacher are decided upon, no adequate definition of teaching competence is possible" (Biddle and Ellena, 1964).
Biddle and Ellena believed that it was necessary for researchers to agree upon language and the variables that words described in order to resolve the problem caused by researchers using multiple meanings for the terms of teacher effectiveness research. They pointed out that the long-term effects of a teacher are difficult to assess, the problem being that the individual teacher's contribution was hard to separate from the influence exerted by other teachers who taught the same child. They believed that teacher competency involved a complex interaction between teacher properties and contextual factors in the community, school, and classroom. In their opinion it was possible that a number of independent competencies existed which varied with the types of teaching (Biddle and Ellena, 1964).
In order to clarify the effectiveness problem in teacher effectiveness research, Biddle and Ellena suggested a variable system composed of seven classes by which one could examine short- and long-range effects of teacher-pupil interaction. Summarized, these seven variables were described as: (1) formative experiences, (2) teacher properties, (3) teacher behaviors, (4) immediate effects, (5) long-term consequences, (6) classroom situations, and (7) school and community contexts. Five variables, formative experiences, teacher properties, teacher behaviors, immediate effects, and long-term consequences, were postulated by Biddle and Ellena to form a cause and effect sequence, in which each variable class caused effects in the next variable class listed. Biddle postulated further that the last two variables listed above were contexts for the main sequence. For example: (a) The classroom situation imbeds (and interacts) with teacher properties, teacher behaviors, and immediate effects. (b) School and community contexts imbed (and interact) with formative experiences, teacher properties, teacher behaviors, immediate effects, and long-term consequences (Biddle and Ellena, 1964).
Biddle and Ellena listed the following hypotheses as examples by which to examine each variable class in the diagram shown below:

Hypothesis 1:
a. Teachers receiving "four years of college training" will "know more about the techniques of elementary education" than those receiving less education.
b. Teachers "knowing more about the techniques of elementary education" will use more "flexibility in the control of classroom discipline" than those who know less.
c. Teachers who use more "flexibility in the control of classroom discipline" will produce fewer "overt acts of deviancy" by pupils than those who use less.
d. Pupils who exhibit fewer "overt acts of deviancy" will show greater "achievement" at the end of their schooling than those who exhibit more.

Hypothesis 2:
"Deviancy control" is more related to "achievement" in classroom situations characterized by "ritual and boredom" than in classroom situations.

Hypothesis 3:
The "control of deviancy" is more related to "achievement" in "lower class" schools than in "upper class" schools.

Biddle explained that the hypotheses as listed above were not necessarily confirmable, but each could be tested experimentally (Biddle and Ellena, 1964).
Biddle noted that much of the literature up to the time of his
study was concerned with the adequacy of various measurement techniques
which were used to measure teaching effectiveness.
He assessed the
literature as confusing the measuring technique with the variable to be
measured.
The result of this confusion was evident, according to Biddle,
in that one author treated measurement as a "cause" of effectiveness,
while another author treated the same measurement as a "direct indication"
of effectiveness.
Some went so far as to consider measurement as a
criterion of effectiveness (Biddle and Ellena, 1964).
In addition to proposing a model for classifying variables involved in teacher effectiveness, Biddle and Ellena reviewed the forms of measurement in use and their application to effectiveness variables. Briefly listed, these were: (1) "observation techniques," which were classified into the four categories of participant observation, categorical check lists, specimen records, and electronic recording; (2) "objective instruments," which included achievement tests, ability inventories, questionnaires and interview schedules, and projective tests; (3) "rating forms"; (4) "self-reports"; (5) "existing records"; and (6) "a priori classification." In summarizing the methods in the measurement of effectiveness variables, Biddle suggested that measurements by a priori classification, behavioral observation, and objective instruments were to be advocated over measurements made by existing records, self-reports, and ratings (Biddle and Ellena, 1964).
In a study of teacher-pupil relationships conducted by Bush, he suggested that caution should be exercised in overgeneralizing on the subject of teacher competence. His study noted the divergence of supervisors' and administrators' ratings of teachers from those of pupils. This divergence suggested that an estimate of a teacher's effectiveness should not be based upon the opinions of one group. Each of the classrooms reported in Bush's study contained examples of both effective and ineffective relations between pupils and teachers. No one teacher was found to be effective with all of his students or ineffective with all of them. Therefore, blanket statements concerning what constitutes good teaching and the good teacher should be viewed skeptically, for they are likely to be based upon inadequate data and a failure to recognize the complexity of teaching. Bush further suggested that:

The most meaningful and accurate appraisal is probably one that is specific and limited to an estimate of the effectiveness of the relationship between a given teacher and a given pupil at a specific time in terms of the current needs of that pupil (Bush, 1954).
Sciara and Jantz reported on the work of Seller, who distinguished three aspects of teacher functioning as role, style, and technique. Teacher role was defined as behavior which was concerned with the duties, responsibilities, and functions of the teacher. Teacher style referred to personality traits and teacher attitudes which were not a planned component of the teacher role. Technique of teaching referred to specific strategies employed by the teacher to carry out her role or to accomplish her objectives. In evaluation of the teacher role, Seller concluded that future evaluation of teacher roles should never rely on one source alone but should include the perception and judgment of all groups involved in the teaching of the child, e.g., the teacher, the administrator, other professionals in the school system, parents, and pupils.
Teacher style has been found to have an effect on the techniques which the teacher used and, in some instances, on the effectiveness of his teaching. Several studies investigated relationships between teaching style and teaching effectiveness. One group of investigators examined this relationship by insuring that judgments of both teaching style and teacher effectiveness would be made by the same judges. An investigation by Kerlinger used a wide range of judges and found that two major clusters for effective teaching were evident. One of these clusters was named traditional, and it associated effective teaching with being self-controlled, trustworthy, refined, industrious, reliable, healthy, moral, religious, and conscientious. The second cluster was named progressive, and it associated effective teaching with imagination, insight, warmth, openmindedness, flexibility, sympathy, sensitivity, patience, and sincerity (Sciara and Jantz, 1972).
The greatest advance in the evaluation of teacher functioning was made in the area of teaching techniques and the outcomes termed products of teaching. These advances occurred in the refinement of measuring and evaluating an individual's techniques, the concentration on patterns of techniques, the studying of teacher-pupil interaction, and the improvement of methods of measuring and evaluating effects of teaching on pupil achievement and functioning (Sciara and Jantz, 1972).
In order to provide structure for an inquiry into teacher effectiveness which attempted to uncover discrepancies in conceptions of teacher effectiveness, Jenkins and Bausell developed a survey instrument which was based on categories employed by Harold Mitzel in his contribution to the 1960 edition of the Encyclopedia of Educational Research. This study was briefly described earlier in this chapter. Mitzel described the categories of teacher effectiveness as follows:
Product criteria. When teachers were judged by their effectiveness in changing student behavior, the judge employed, in Mitzel's scheme, product criteria. Product criteria required the judgment of the teacher on the basis of a measurable change in what was viewed as his product, student behavior. What constituted acceptable products, or changes, was never made altogether clear. Measures of growth in skills, knowledge of subject matter, and attitude which could logically or empirically be attributed to the teacher's influence constituted acceptable data in the product category. As an example, skills and behaviors which evidence changes in critical thinking, inquiry, evaluating, reading, spelling, typing, speaking, and discussing were considered to be potential entries. Gains in knowledge of subject matter as measured by standardized achievement tests, end-of-lesson or unit quizzes, and student reports were accepted as evidence of teacher influence. Student performances measured in terms of self-acceptance, attitudes toward school subjects or toward learning in general, and respect of others and their opinions qualified as affective goals and thus were accepted within the product category. Confusion about the product category probably arose not so much from the notion of using student change as a criterion but from the difficulty in gaining consensus on what products were considered appropriate within the domain of the school (Mitzel, 1960).
Process criteria. Teacher evaluation which was based upon classroom behavior, either the teacher's behavior, his students' behavior, or the interplay of both, constituted process criteria. Process behaviors were worthwhile in their own right and thus were not necessarily related to product criteria. Variables upon which teachers could be rated were their verbal behavior, methods, classroom control, and individualization of instruction. Students might also be rated on their verbal behavior, attentiveness, and conformity to classroom routine. The interactions between students and teachers were the basis by which to judge rapport and climate in the classroom (Mitzel, 1960).
Presage criteria. If teacher evaluation was based upon the teacher's personality or intellectual attributes (industry, adaptability, intelligence, character), his performance in training, his knowledge or achievement (e.g., marks in education courses, success in student teaching, national teacher examinations, knowledge of educational facts), or his in-service status characteristics (e.g., tenure, years of experience, or participation in professional organizations), then he was being judged upon presage criteria. These criteria were indirect measures of a teacher's effectiveness and were normally chosen because in some authority's view they were related to, and therefore predicted, either process or product criteria (Mitzel, 1960).
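The Mitzel scheme described above amounts to a lookup table that types each survey criterion as product, process, or presage. The sketch below illustrates that structure; the particular criterion-to-category assignments are illustrative examples drawn from the discussion above, not the actual entries of the Jenkins and Bausell instrument.

```python
# Illustrative sketch of the Mitzel scheme as a lookup table: each
# criterion is typed as "product" (measured student change), "process"
# (classroom behavior), or "presage" (teacher attributes or status).
# The assignments below are examples, not the survey's actual typing.
MITZEL_SCHEME = {
    "amount students learn": "product",
    "student attitudes toward learning": "product",
    "classroom control": "process",
    "rapport with students": "process",
    "knowledge of subject matter": "presage",
    "years of experience": "presage",
}

def classify(criterion):
    """Return the Mitzel category for a criterion, or None if untyped."""
    return MITZEL_SCHEME.get(criterion.lower())

print(classify("Classroom control"))   # process
```

Typing criteria this way is what allowed the study to compare mean ratings across the three Mitzel categories rather than only criterion by criterion.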
Status of Present Methods of
Appraising Teacher Performance
The principal's task, Lewis summarized, seemed futile when one reviewed the attempts of past research to produce acceptable criteria by which one could measure teacher effectiveness. This pessimistic point of view was illustrated by Lewis, who stated,

Although in the past we have been unable to objectively measure teacher performance effectiveness, the tradition of 'evaluating' educators continues to be a dominant feature of our schools (Lewis, 1973).
Lewis described the present method of appraising the performance of educators as "dysfunctional" and serving no useful purpose. He described present appraisal methods as falling short of adequately assessing "true" performance. The result of inadequate assessing procedures made it impossible for school districts to take corrective action for professional growth, improvement, and development of the staff. Furthermore, stated Lewis, "it has been a device which over the years has perpetuated the division between teachers and administrators" (Lewis, 1973).
In his summary of studies on teacher effectiveness, Stephens found no relationship between the academic gains of pupils and the qualities of the teacher that could be observed by principals or supervisors. Researchers, it seemed, arrived at the same findings: that regardless of the techniques or methods employed, such as rating scales, self-analysis, classroom visitation, etc., few if any "facts" seem to have been reached concerning teacher and administrator effectiveness. According to Stephens, no generally agreed-upon method of measuring the competence of educators has been accepted, and no methods of promoting growth, improvement, and development have been generally adopted (Stephens, 1967).
A Current Trends Report conducted by the National School Public Relations Association reported that a large body of facts and descriptive reports on teacher evaluation presently exists in school districts across the nation, but the approaches to teacher evaluation varied considerably from district to district. The most noteworthy trend in this report on teacher evaluation was the growing practice of involving teachers in the establishment of evaluation programs. The pattern of evaluation showed a trend away from the negative aspect of identifying poor teachers so they can be dismissed toward the positive aspect of identifying their weaknesses and strengths. The present purpose of teacher evaluation was described as correcting weaknesses and reinforcing strengths of teachers (Oldham, 1974).
A more recent publication by Educational Research Service supported the Current Trends Report of a few years ago in stating that "great diversity of thought on how to evaluate teaching performance, who should evaluate, and what criteria should be used" are still paramount considerations in evaluation programs in school districts across the nation (Robinson, 1978). The effective evaluation of teaching performance is still of particular importance in meeting the demands for accountability by the public (Robinson, 1978). The overview of the research done by Educational Research Service indicated that 97.9 percent of school systems responding to its survey on teacher evaluation carry on some type of formal teacher evaluation program. The main purposes of evaluation have not changed in school districts throughout the country. The two main purposes of teacher evaluation as noted by the report of Educational Research Service are: (1) to perform an evaluative function for management decisions; and (2) to perform a developmental function to help teachers identify areas for improvement and growth (Robinson, 1978).
SUMMARY
It was evident by the middle of the 1970's that this nation's educational institutions were besieged by internal and external forces which demanded that these institutions be held accountable and show evidence of having used the taxpayers' money wisely before asking for additional money. The demand for "accountability" in the schools gained increased momentum from the soaring cost of education in the 1960's. It was the demand for accountability placed on the schools that intensified the search in the 1970's to find improved ways to evaluate the effectiveness of teachers. Historically the cry of the public for "accountability" has not been new but one which received renewed emphasis with each succeeding period of financial stress for taxpayers.
For the teacher the topic of accountability quickly translated into an assessment of the quality of his instruction and the related necessity of selecting criteria by which one would judge his efforts. In their effort to respond to the accountability emphasis, educators, and in particular administrators, turned to business and industry to find more effective ways of measuring educational efficiency and output in the schools.
Borrowing management and evaluation techniques from business and industry has not been popular with teachers, because they feared unrealistic levels of required performance would be set from higher management levels without input from teachers. Traditionally teachers felt that any measurement of their effectiveness which was solely dependent upon product measures would be punitive toward them and their students. This feeling stemmed from the fact that teachers were held accountable for many variables which they could neither measure nor control.
In recent years teachers have asked for and taken a more active role in finding what they consider to be equitable ways of measuring their own effectiveness.

The question of whether to evaluate teachers and how to do it has been taken out of the hands of local school districts in many states, as state legislatures enacted laws which mandated the evaluation of all teachers at specified intervals and often in specified ways. California was one state which initiated this trend with passage of the Stull Act in 1971. Similar laws awaited enactment in other states.

Many teachers' groups came to the conclusion that teacher evaluation was here, would stay, and most likely would expand. Teachers felt that they should be included in shaping the policies, setting the goals, and designing the instruments of evaluation. As a result many associations made teacher evaluation a negotiable item in contract bargaining.
Historically the evaluation of teacher effectiveness has relied upon two criteria, as pointed out by the criterial overview by Barr. The two criteria used have been efficiency ratings of one sort or another and measured pupil gains. Neither criterion has been exceedingly reliable or measurable. Approaches used by researchers in the development of criteria that described teacher effectiveness included the "traits approach" which considered the qualities of the individual, and the "behavioral approach" which integrated the concept of personality with that of teaching methods (Barr, 1961).
The greatest advance with regard to teacher evaluation has been
made in the area of teaching techniques and outcomes or product of
teaching.
These advances have occurred in the refinement of measuring
and evaluating an individual's techniques, the concentration of patterns
of techniques, the study of teacher-pupil interaction and the greatly improved methods of measuring and evaluating effects of teaching on pupil achievement and functioning (Sciara and Jantz, 1972).
The most noteworthy trend in teacher evaluation, as reported by Oldham (1974), was the growing practice of involving teachers in the establishment of evaluation programs.
The primary purpose of evaluation
shows a trend away from the negative aspect of identifying poor teachers
so they can be dismissed and toward the positive aspect of identifying
weaknesses and strengths so that the former can be corrected and the
latter reinforced.
CHAPTER III
PROCEDURES
This study was directed at determining what teachers and admin­
istrators in Montana believed were the appropriate criteria for
judging the effectiveness of a teacher.
A survey instrument was
prepared which included an assortment of criteria for judging
teacher effectiveness.
This instrument, in addition to a form which
was used to collect necessary demographic data, was mailed to a ran­
domly selected sample of Montana school administrators and teachers.
The returned questionnaires provided demographic data and the ratings
given to the criteria for each respondent.
The responses of admin­
istrators and teachers were compared and the results indicated in
appropriate table form.
This chapter provides a description of the population studied,
the sampling procedure used, the method used for data collection,
the manner by which data were organized and presented, the hypotheses
tested, the means by which the data were analyzed, and the precautions
taken to insure accuracy.
The final section consists of a chapter summary.
Description of the Population
The population for this study consisted of all teachers and administrators who were under contract in school districts in Montana during the 1976-1977 school term and, as determined by the Office of Public Instruction, numbered approximately 9,428 people. Within this estimated population, the Office of Public Instruction listed in its directory for that year 665 administrative positions designated as either superintendent or principal positions (O.P.I., 1977). A count of the elementary principals, secondary principals, and superintendents listed in the directory indicated a total of 665 administrator positions divided into 319 elementary principal positions, 143 secondary principal positions, and 203 district superintendent positions.
The number of administrative positions listed in the directory did not indicate the actual population of administrators sampled for this study. The difference between total positions listed and the actual number of persons sampled was accounted for by the practice of smaller school districts' combining two or more administrative functions into one position. During the school term from which the administrator sample was taken for this study, 68 people served in two or more administrative positions concurrently in Montana school districts. Because of this practice an actual administrator population of 597 people served in a total of 665 administrative positions in the 1976-1977 school year.
The population for this study included all public school principals in first- and second-class school districts, superintendents of third-class districts, and teachers on the staff of each administrator who responded to the survey and consented to have his staff surveyed. The population from which administrator and teacher samples were drawn included all principals and third-class district superintendents and all teachers who were under contract in the State of Montana for the 1976-1977 school term.
The researcher chose the administrator population from principals in first- and second-class school districts and superintendents in the third-class school districts to insure the probability of gaining a sample of administrators who were directly responsible for judging the effectiveness of teachers who worked directly under the administrator's supervision. In most third-class districts in Montana the district superintendent of schools assumes the role of principal, who traditionally is the administrator responsible for evaluating teachers on his staff. In most of the larger first- and second-class school districts a principal is employed who directly supervises teachers.
For the purpose of this study the random sample of the school administrators was limited to a total population of approximately 474 administrators. This number was arrived at by consulting the 1976-1977 Montana Education Directory, which listed all employed principals and district superintendents, and computing the total population which met the limitation as previously described in this chapter. Table I gives the district classification and categories of the administrative population total.
TABLE I

CATEGORIES OF ADMINISTRATOR POPULATION

Districts           High School    Junior High    Elementary    Totals
Principals
  Class I                23             18            139          180
  Class II               85              7            129          221
Sub-totals              108             25            268          401
Superintendents
  Class III                                                         73

TOTAL ADMINISTRATOR POPULATION                                     474
The purpose of this study required that the administrator population be sampled first in order to arrive at the total population from which teacher samples were drawn. The administrators were sampled and the returns were tallied to determine the total number who gave permission to have their respective staffs sampled by the same instrument.
This form of sampling resulted in a proportional stratified random sampling of all teachers in the State of Montana. Sources in the Department of Public Instruction for the State of Montana listed a total teacher population of 9,428 for the 1976-1977 school year. The total teacher population on the staffs of the responding school administrators numbered 2,348. To insure a .05 confidence level for the study a sample size of 568 teachers was drawn from the total state population of 9,428 teachers and from the stratified population of 2,348 teachers.
Of the 568 questionnaires which were sent to teachers, 454 were returned. The percent of return was 79.93. The sample size and population comparisons for teachers are represented in Table II.
TABLE II

TEACHER SAMPLE CHARACTERISTICS

Teachers    State         Stratified    Per Cent of State    Per Cent of Stratified
Sampled     Population    Population    Population           Population
  568         9,428         2,348            6.0%                  24.2%
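The proportions reported in Table II follow directly from the figures in the text; a quick check of the arithmetic (Python is used here only as a calculator, and is of course not part of the 1978 study):

```python
# Figures reported in the study for the 1976-1977 Montana school term.
state_population = 9_428       # all Montana teachers under contract
stratified_population = 2_348  # teachers on staffs of consenting administrators
sample_size = 568              # questionnaires sent to teachers
returned = 454                 # usable questionnaires returned

pct_of_state = 100 * sample_size / state_population         # ~6.0%
pct_of_stratum = 100 * sample_size / stratified_population  # ~24.2%
return_rate = 100 * returned / sample_size                  # 79.93%

print(f"{pct_of_state:.1f}% of state population")        # 6.0%
print(f"{pct_of_stratum:.1f}% of stratified population") # 24.2%
print(f"{return_rate:.2f}% return")                      # 79.93%
```

Each computed value matches the percentage reported in the table.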
Sampling Procedure
In late January of 1977 total population figures were.made
available through the Montana Office of Public Instruction for
teachers and administrators who were under contract in school
districts.
Using computer facilities of the Office of Public .
81
Instruction a sample of 114 administrators which included principals
and superintendents was drawn randomly.
This sample represented
24 percent of the population members, which is considerably larger
than that needed to insure a 95 percent confidence level.
Of the 114 administrators who were surveyed for this study, 112 responded. Usable returns numbered 110 of the original 114 survey instruments used for sampling purposes. Two of the original 114 survey instruments were not returned and two were incomplete returns that were not usable for this study. The usable administrators' survey returns of 110 out of 114 sampled represent a return of 96.5 percent. Table III illustrates the categories of the administrator sample.
After the administrative sample was returned, the total number of administrators who consented to have their staff surveyed was determined. Questionnaires were sent to 568 teachers. This sample was drawn from a total stratified population of 2,348 teachers. Of the 455 teacher respondents to the questionnaire on teacher effectiveness criteria measures, 454 were usable returns. One return was not used due to non-response to the items in the survey instrument.
The total number of teacher respondents categorized by class
of districts resulted in Class I districts being represented by
TABLE III

CATEGORIES OF THE ADMINISTRATOR SAMPLE

Class of District                  Number    Per Cent of Total
Class I principals                   39            35.5
Class II principals                  49            44.6
Class III superintendents            21            19.0
County superintendent                 1             0.9
Total Administrator Sample          110           100.0

Total Males in the Administrator Sample = 103 or 93.6% of sample
Total Females in the Administrator Sample = 7 or 6.4% of sample
278 teachers, Class II districts by 135 teachers, and Class III
districts by 41 teachers.
Table IV illustrates the returned teacher sample characteristics
by district class, sex and returned administrator sample size.
TABLE IV

NUMBER AND PER CENT OF TEACHER RESPONDENTS
BY DISTRICT AND SEX CATEGORIES

Class of      No. of Adm.    No. of Teacher    Per Cent of Teacher    Total Respondents by
District      Respondents    Respondents       Respondents            Class of District
                             Male    Female    Male      Female       Total      Per Cent
Class I           39          130     148       28.7       32.7        278         61.4
Class II          49           64      71       14.1       15.7        135         29.8
Class III         22           18      23        3.9        5.0         41          9.0
Sub-totals                    212     242       46.5       53.5
Total Sample     110                                                   454        100.0
Method of Collecting Data
A survey instrument which listed sixteen (16) criteria of teacher effectiveness was sent to the members of the sample population, for each to rate according to the importance of each criterion which in his judgment determined teacher effectiveness. This instrument was constructed and used in a similar survey conducted by Jenkins and Bausell in the State of Delaware (Jenkins and Bausell, 1974). An assumption to be made and described in the instrument for the survey was that adequate measures were available to measure each of the criteria which were randomly listed in the instrument.

The continuum used for rating each of the criteria listed on the survey instrument ranged from an evaluation of "completely unimportant" to an evaluation of "extremely important" on a nine-point scale.
Below the instructions the sixteen criteria were listed in random order. Beneath each criterion there was a nine-point scale like this:

Capacity to perceive the world from the student's point of view.

Completely     ___  ___  ___  ___  ___  ___  ___  ___  ___     Extremely
Unimportant     1    2    3    4    5    6    7    8    9      Important
The survey instrument sent to the teacher sample included, in addition to the listing of the sixteen criterion measures of teaching effectiveness, the following question:

How effective is your administrator in helping you to improve your teaching effectiveness?

The survey instrument sent to administrators contained an identical listing of the sixteen criterion measures of teacher effectiveness but contained the following additional question:

How effective are you in helping your teachers to improve their teaching effectiveness?
Attached to each survey instrument was a cover sheet which listed briefly the purpose of the survey, instructions for completing the questionnaire, and seven items of demographic data to be answered by the respondent. Space was provided on the back of the instrument for a respondent's comments. The kinds of demographic information included school district classification, sex of respondent, years of experience, grade levels presently taught, and student enrollment in the district where the respondent taught. Copies of the form asking for this demographic information, along with the surveys sent to teachers and administrators, appear in Appendix C.
Method of Organizing Data
Demographic data were presented in the form of percentages and listed in tables depicting years of experience, classes of districts, grade levels taught and sex categories for the total samples of administrator and teacher populations. The experience categories for teacher respondents to the survey were the same as those which were used for the administrator respondents. The experience categories are: Category I (0-5) years, Category II (6-10) years, Category III (11-15) years, Category IV (16-20) years, Category V (21-25) years and Category VI (26 and over).
The administrative organizational patterns vary from district to district in Montana in relation to the grouping of grade levels. The groupings most commonly used by districts are: (K-3) kindergarten through grade three, primary level; (4-6) grades four through six, intermediate level; (7-9) grades seven through nine, upper grade level or Junior High School; (9-12) grades nine through twelve, secondary or Senior High School level. A variation from the (7-9) and (9-12) groupings is used in smaller populated school districts, which results in a (7-8) and (9-12) grade grouping and in some districts a (6-8) and (9-12) grouping. For the purpose of this paper grade levels designated (K-8) kindergarten through grade eight and (9-12) grade nine through twelve have been used to describe elementary and secondary levels respectively.
Table V illustrates the years of experience categories for the administrator sample. Comparisons were made by district classification and sex of the respondents. Table VI lists the years of experience of teacher respondents by district classification. Comparisons are made by sex of respondents and summed in percentages per district class.
TABLE V

YEARS OF EXPERIENCE OF ADMINISTRATORS

Years of              Total of Admin.    Per Cent
Experience            Experience         of Total
1. 0-5                     32              29.0
2. 6-10                    30              27.3
3. 11-15                   16              14.5
4. 16-20                   20              18.2
5. 21-25                    5               4.5
6. 26-over                  5               4.5
Data not Available          2               2.0
Totals                    110             100.0

Sub-totals by district and sex (*M = Male, *F = Female): Class I districts, 34 M and 5 F (39); Class II districts, 48 M and 1 F (49); Class III districts, 21 M and 1 F (22).
TABLE VI

THE YEARS OF EXPERIENCE OF TEACHER RESPONDENTS BY DISTRICT CLASSIFICATION AND SEX

Years of        1st Class         2nd Class         3rd Class
Experience      Districts         Districts         Districts         Total    Per Cent
                Male   Female     Male   Female     Male   Female
1. 0-5           25     46         24     27         11     12         145       32.0
2. 6-10          45     45         17     18          2      8         135       29.7
3. 11-15         24     26         16      9          3      2          80       17.6
4. 16-20         17     16          4      6          0      0          43        9.5
5. 21-25          7      6          1      4          2      1          21        4.6
6. 26-over       11      9          1      7          0      0          28        6.2
Info. Not
Available         1      0          1      0          0      0           2        0.4
Sub-total       130    148         64     71         18     23         454      100.0
% of Total     28.6   32.6       14.1   15.6        4.0    5.1
By Class        278 (61.2%)       135 (29.7%)       41 (9.1%)
The percentages of teacher respondents, both male and female, teaching at elementary, upper grade, and high school levels are listed in Table VII. Comparisons were made by class of district. Table VIII illustrates the number of teacher respondents found in each grade level as compared to school district classification. Comparisons were made by sex of respondents and the totals for each district class were compiled.
TABLE VII

PERCENTAGE OF TEACHERS BY DISTRICT CLASSIFICATION

Grade       District I        District II       District III
Levels      Male   Female     Male   Female     Male   Female     Total
K-6          6.8    21.8       2.6    10.4       1.3     4.4       47.3
7-9          8.8     5.5       4.4     2.2       1.1      .2       22.2
9-12        13.0     5.3       7.1     3.1       1.5      .5       30.5
Sub-total   28.6    32.6      14.1    15.7       3.9     5.1      100.0
Totals          61.2              29.8               9.0          100.0
Mean ratings of the sixteen criterion measures of teacher
effectiveness were compiled separately for all respondents in the
teacher and administrator samples.
In addition mean ratings were
compiled for teachers who were divided into three sub-groups by
class of school district.
The three classes of school districts
TABLE VIII

NUMBER OF TEACHER RESPONDENTS BY GRADE LEVEL CLASSIFICATION

Grade       First Class       Second Class      Third Class       Total Teachers by District
Levels      Dist. Teachers    Dist. Teachers    Dist. Teachers
            Male   Female     Male   Female     Male   Female     Dist. I   Dist. II   Dist. III
K-6          31     99         12     47          6     20           130        59         26
7-9          40     25         20     10          5      1            65        30          6
9-12         59     24         32     14          7      2            83        46          9
Sub-total   130    148         64     71         18     23           278       135         41
Totals          278               135                41                        454
are Class I, Class II and Class III. Tables were compiled which listed the mean ratings and rank order of the sixteen criteria for administrators, all teachers, and teachers of Class I, II and III districts respectively.
Data supplied by the respondents were examined and comparisons made in the ordering of criteria by ratings of teachers and administrators. These data were arranged in table form to show the rank order of the sixteen criteria. The types of criteria were listed in the categories of process, product and presage as described by Mitzel (Mitzel, 1960). Mean ratings were listed for each criterion, and combined means for the type categories of process, product and presage. Responses of elementary teachers, secondary teachers and administrators were compared for significant differences in ranking the criteria. These data were organized into tables which illustrated the comparisons.
Hypotheses Tested
Hypothesis I
The null hypothesis was that there is no significant difference among the subgroup measures of teacher effectiveness (process, product and presage) in the responses between teachers and administrators.
Hypothesis II
The null hypothesis was that there is no significant difference among the subgroup measures of teacher effectiveness (process, product and presage) in the responses of teachers in the three district classifications.
Hypothesis III
The null hypothesis was that there is no significant difference among the subgroup measures of teacher effectiveness (process, product and presage) in the responses between male and female teachers.
Hypothesis IV
The null hypothesis was that there is no significant difference among the subgroup measures of teacher effectiveness (process, product and presage) in the responses between elementary and secondary teachers.
Hypothesis V
The null hypothesis was that there is no significant difference among the subgroup measures of teacher effectiveness (process, product and presage) among the six classes of years of experience for teachers.
Hypothesis VI
The null hypothesis was that there is no significant relationship among the differences between the teacher rating of each of the 16 measures of teacher effectiveness and the teacher rating of the administrator.
Hypothesis VII

The null hypothesis was that there is no significant relationship in the mean rating of each of the 16 criteria of teacher effectiveness between teachers and administrators (either as combined groups or by classes of districts) and a reference group from another study. The test of significance for this study was at the .05 level. A summary table will be included which will report the distribution of the responses of the teachers and administrators in all categories, and frequency tables will be used whenever necessary.
Method of Analyzing Data

Hypotheses I through V were tested by using the analysis of variance statistic in conjunction with appropriate tests of significance. When appropriate, Duncan's test for multiple comparisons was used to test the individual means.

Sub-test comparisons were made on the criteria grouped under the Mitzel Scheme of categorizing effectiveness criteria into three groups labeled process, product and presage. Definitions of the Mitzel Scheme were provided in Chapter II, pages 69-70 of this study.
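The analysis-of-variance computation used for Hypotheses I through V can be sketched as follows. This is a minimal illustration in Python (modern tooling, not part of the 1978 study), and the rating data below are hypothetical rather than the study's raw responses:

```python
# Hedged sketch of a one-way analysis of variance on 9-point ratings.

def one_way_anova(*groups):
    """Return the F statistic and degrees of freedom for k groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group sum of squares: weighted squared deviations of group means.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations from each group's mean.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

teachers = [8, 7, 9, 6, 8, 7]        # hypothetical criterion ratings
administrators = [8, 9, 8, 9, 7, 9]  # hypothetical criterion ratings
f, df1, df2 = one_way_anova(teachers, administrators)
print(f"F({df1},{df2}) = {f:.2f}")
```

The computed F would then be compared against the critical value of F at the .05 level for the given degrees of freedom; a multiple-comparison procedure such as Duncan's test would follow only when the overall F is significant.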
Hypothesis VI was analyzed by using a multiple regression design. The multiple regression design makes use of the coefficient of multiple correlation, which is an estimate of the relationship between one variable and two or more others in combination. The correlation coefficient may be used to predict or estimate a score on an unknown variable from knowledge of a score on a known variable. The dependent variable is spoken of as the criterion. The independent variables are described as predictors. In this problem, the criterion variable represented the rated effectiveness of the administrator in improving the effectiveness of the teacher (the Y variable). The differences in the perception of the importance of criterion measures in evaluating teacher effectiveness between teachers and administrators represented the independent or predictor variables (the X variables). There was a total of 16 predictor variables. All differences between teacher and administrator perceptions were computed for the 16 criteria on the survey instrument.
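The design just described can be sketched with hypothetical data. The study used 16 predictor variables (one difference score per criterion); for brevity this illustration uses 3, and the least-squares coefficients are computed from the normal equations (a generic technique, not the study's own computer program):

```python
# Hedged sketch of a multiple-regression fit via the normal equations.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            fac = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= fac * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def multiple_regression(X, y):
    """Least-squares coefficients (intercept first) from the normal equations."""
    Z = [[1.0] + row for row in X]  # prepend an intercept column
    k = len(Z[0])
    ZtZ = [[sum(Z[r][i] * Z[r][j] for r in range(len(Z))) for j in range(k)]
           for i in range(k)]
    Zty = [sum(Z[r][i] * y[r] for r in range(len(Z))) for i in range(k)]
    return solve(ZtZ, Zty)

# Hypothetical difference scores (X) and teacher ratings of the administrator (y).
X = [[1, 0, 2], [0, 1, 1], [2, 1, 0], [1, 2, 1], [0, 0, 1], [2, 2, 2]]
y = [6, 7, 5, 4, 8, 3]
coefs = multiple_regression(X, y)
print("intercept and coefficients:", [round(c, 2) for c in coefs])
```

The coefficient of multiple correlation would then be obtained from the fitted values, and its significance tested against the critical value of F at the .05 level.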
Spearman's coefficient of rank correlation formula was used to compare the rankings of Montana teachers' and administrators' ratings of effectiveness criteria to the rankings of the Delaware teachers and administrators. Mean ratings only were available to the researcher from the Delaware study. Therefore, a non-parametric method of comparing results was chosen. Tables illustrating the comparisons were made and correlation coefficients were calculated. The significance of the calculations, as determined by the critical value of F for N degrees of freedom, was listed from appropriate tables.
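Spearman's coefficient reduces to a simple formula when there are no tied ranks: rho = 1 - 6*sum(d^2) / (n*(n^2 - 1)), where d is the difference between paired ranks. A brief sketch with hypothetical rank lists (not the study's actual Montana or Delaware rankings):

```python
# Hedged sketch of Spearman's rank correlation for untied ranks.

def spearman_rho(ranks_a, ranks_b):
    """Spearman's rho from two lists of (untied) ranks."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

montana = [1, 2, 3, 4, 5, 6, 7, 8]   # hypothetical criterion ranks
delaware = [2, 1, 3, 5, 4, 6, 8, 7]  # hypothetical criterion ranks
print(f"rho = {spearman_rho(montana, delaware):.3f}")  # rho = 0.929
```

A rho near 1 indicates that the two groups ordered the criteria almost identically; the computed value would then be checked against a table of critical values for the given N.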
Precautions Taken for Accuracy
All data compiled from questionnaires were minutely examined by two people prior to using computer processing techniques. All samples were drawn by computer programs to insure randomness of sampling. Calculations were checked by computer and by hand calculators to eliminate mathematical error.
SUMMARY
Data used to answer the questions relating to the problem of what teachers and administrators in Montana believed were the appropriate criteria for judging the effectiveness of a teacher were obtained by using a survey-rating instrument. The population for this study was composed of the total teaching and administrative population who were under contract in school districts throughout Montana during the 1976-1977 school term. The instrument used to collect data was essentially the same instrument used in a similar study conducted by Jenkins and Bausell in the State of Delaware (Jenkins and Bausell, 1974). The instrument was based upon three categories of teaching effectiveness measures as described by Harold Mitzel (Mitzel, 1960). Data supplied by respondents were examined and compared in table form to show the rank order of the 16 criteria along with combined means for the type categories of process, product and presage. The analysis of variance statistic along with appropriate tests for significance was used to test hypotheses one through five.
Hypothesis six was tested using the multiple regression equation to answer the question of whether or not the expected success of an administrator in helping individual teachers to become more effective could be determined from the teacher's rating of the administrator compared to the differences in administrator and teacher ratings of teacher effectiveness criteria.

The Spearman coefficient of rank correlation was used to test Hypothesis VII, which compared the relationship between how teachers and administrators of Montana viewed effectiveness criteria and how teachers and administrators of Delaware viewed effectiveness criteria. The null hypotheses were tested at the .05 level of significance. Summary tables were constructed to report the distribution of responses of teachers and administrators in all categories, and frequency tables were used whenever necessary. The survey responses were categorized and recorded by a committee of two people to insure accuracy, and electronic calculators as well as computer services at Montana State University were employed to minimize computational errors.
CHAPTER IV
ANALYSIS OF DATA
Introduction
As described in Chapter I, the purpose of this study was to determine whether or not administrators who were charged with evaluating teacher effectiveness agreed on the criteria that were used in judging effective teaching, and to compare that finding with the teachers' view of criterion measures of teaching effectiveness. Comparisons were made between the results of the study and those of a similar one conducted in the State of Delaware.
In an attempt to probe into how both teachers and administrators viewed the criteria upon which teachers have been evaluated, this researcher used a survey instrument that was constructed by Jenkins and Bausell for a similar study carried out in the State of Delaware (Jenkins and Bausell, 1974). The survey instrument included an assortment of sixteen criteria based on the categories of product, process and presage that were employed by Harold Mitzel (1960) and described in Chapter II.
An additional item was added to the survey instrument which asked the administrator respondent to rate his own effectiveness in helping teachers under his supervision to improve their teaching effectiveness. A comparable item was added to the survey instrument which was sent to teachers. Teacher respondents were asked to rate the effectiveness of their administrator in helping the teacher to improve his own teaching effectiveness.
Mean Ratings of Criteria
The ratings given by administrator respondents to each of the sixteen effectiveness criteria listed on the survey instrument were tallied. The mean rating for each criterion was calculated from the total of administrator responses. The means thus calculated were rank ordered so that the highest mean rating for a criterion was listed as number one. The lowest ranked criterion was listed sixteenth. The means calculated for the sixteen criteria as rated by administrators ranged from a low mean of 4.95, on a scale of 1 to 9, to a high mean of 8.32.
Each of the sixteen criterion measures of teacher effectiveness appearing on the survey instrument was labeled in accordance with the Mitzel Scheme. This Scheme described each criterion as either a process, product or presage criterion. A description of the Mitzel Scheme and definitions of the three types of effectiveness criteria appear in Chapter II of this study.

The sixteen criterion measures rated by administrators were grouped into the three types of effectiveness measures labeled product, process, and presage. Combined means were calculated for each type of criterion measure. The highest combined mean rating was that for the measures labeled product. Its mean was 7.61, rated on a scale from 1 to 9. This combined mean rating was followed by process criteria with a combined mean rating of 7.49 and presage criteria with a combined mean rating of 6.55. Presage criteria received the lowest rating by administrators as a measure of teacher effectiveness.
Table IX lists the rank order of means of the sixteen measures of teacher effectiveness as rated by Montana administrators. In addition, the combined mean ratings of product, process and presage are listed in Table IX. Attention is called to the fact that although product criteria, which are measures of student gain, were rated highest as a group, the process criterion "effectiveness in controlling his class" was rated highest by administrators. In other words, discipline was the top ranking criterion by which to measure teacher effectiveness as rated by administrators.
The same format that was used in Table IX to list the mean ratings of administrators' measures of teacher effectiveness was used in Tables X, XI, XII, and XIII. Table X lists the rank order of means of the sixteen effectiveness criteria as rated by Montana teachers. Listed at the bottom of Table X are the combined mean ratings. Special note should be made of the fact, as indicated by the results listed in Table X, that teachers rated process criteria highest. This rating differed from administrators, who rated product criteria highest. However, as noted from Table X, teachers rated "effectiveness in controlling his class" the highest of the sixteen criteria, as did administrators.
TABLE IX

RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER
EFFECTIVENESS FOR ADMINISTRATORS OF COMBINED
CLASSES OF SCHOOL DISTRICTS

Rank      Criteria                                         Type
Order     (Ordered by Rating)                              (Mitzel Scheme)    Means
 1.  Effectiveness in controlling his class.               Process             8.32
 2.  Relationship with class (good rapport).               Process             7.95
 3.  Knowledge of subject matter and related areas.        Presage             7.68
 4.  Amount his students learn.                            Product             7.66
 5.  Personal adjustment and character.                    Presage             7.57
 6.  Influence on student's behavior.                      Product             7.56
 7.  Willingness to be flexible, to be direct or
     indirect as situation demands.                        Presage             7.49
 8.  Capacity to perceive the world from the
     student's point of view.                              Process             7.39
 9.  Ability to personalize his teaching.                  Process             7.36
10.  Extent to which his verbal behavior in
     classroom is student-centered.                        Process             7.35
11.  General knowledge and understanding of
     education facts.                                      Presage             6.59
12.  Extent to which he uses inductive
     (discovery) methods.                                  Process             6.58
13.  Civic responsibility (patriotism).                    Presage             6.53
14.  Performance in student teaching.                      Presage             5.87
15.  Participation in community and
     professional activities.                              Presage             5.74
16.  Years of teaching experience.                         Presage             4.95

Type         Combined Means
Product          7.61
Process          7.49
Presage          6.55
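The combined means reported for administrators (product 7.61, process 7.49, presage 6.55) can be re-derived from the sixteen per-criterion means in Table IX. A short Python sketch (modern tooling, used here only to reproduce the table's arithmetic):

```python
# Per-criterion mean ratings by administrators, taken from Table IX and
# grouped by Mitzel type.
means_by_type = {
    "process": [8.32, 7.95, 7.39, 7.36, 7.35, 6.58],
    "product": [7.66, 7.56],
    "presage": [7.68, 7.57, 7.49, 6.59, 6.53, 5.87, 5.74, 4.95],
}

# Combined mean for each Mitzel type (average of its criterion means).
combined = {t: round(sum(m) / len(m), 2) for t, m in means_by_type.items()}
print(combined)  # {'process': 7.49, 'product': 7.61, 'presage': 6.55}

# Rank order across all sixteen criteria: the highest mean is rank 1.
all_means = sorted((m for v in means_by_type.values() for m in v), reverse=True)
print(all_means[0], all_means[-1])  # 8.32 4.95
```

The recomputed combined means match the values at the foot of Table IX, and the extremes (8.32 and 4.95) match the range of means reported in the text.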
TABLE X

RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER
EFFECTIVENESS FOR TEACHERS OF COMBINED
CLASSES OF SCHOOL DISTRICTS

Rank      Criteria                                         Type
Order     (Ordered by Rating)                              (Mitzel Scheme)    Means
 1.  Effectiveness in controlling his class.               Process             8.13
 2.  Knowledge of subject matter and related areas.        Presage             8.09
 3.  Relationship with class (good rapport).               Process             7.89
 4.  Willingness to be flexible, to be direct or
     indirect as situation demands.                        Presage             7.61
 5.  Personal adjustment and character.                    Presage             7.56
 6.  Ability to personalize his teaching.                  Process             7.44
 7.  Capacity to perceive the world from the
     student's point of view.                              Process             7.34
 8.  Extent to which his verbal behavior in
     classroom is student-centered.                        Process             7.31
 9.  Influence on student's behavior.                      Product             7.19
10.  Amount his students learn.                            Product             7.04
11.  General knowledge and understanding of
     education facts.                                      Presage             6.77
12.  Extent to which he uses inductive
     (discovery) methods.                                  Process             6.35
13.  Civic responsibility (patriotism).                    Presage             6.09
14.  Participation in community and
     professional activities.                              Presage             5.23
15.  Years of teaching experience.                         Presage             5.08
16.  Performance in student teaching.                      Presage             5.01

Type         Combined Means
Process          7.41
Product          7.12
Presage          6.43
Tables XI, XII, and XIII show the mean ratings by teachers of the sixteen measures of teacher effectiveness criteria by district classification. Because school districts differ in size as determined by population, districts are classed from I, which has the greatest population, to III, which has the smallest population. The rank order of teacher ratings of effectiveness criteria was listed by district classification to show whether or not there is an appreciable difference in how teachers rated effectiveness criteria as determined by the size of the district in which they taught.

It is again noted from Tables XI, XII, and XIII that teachers, regardless of the size of the school district in which they taught, rated process criteria highest, as shown by the combined mean ratings listed in each table. As with the combined teacher and administrator ratings, teachers, regardless of the size of the district in which they taught, rated "effectiveness in controlling his class" as number one.
A comparison of teacher and administrator ratings on the three types of teacher criteria is shown in Table XIV. This comparison is made in terms of combined means.

The Testing of Hypotheses

The first hypothesis examined whether or not there is agreement between administrators and teachers on the criteria that measure teacher effectiveness. To test the null hypothesis, the sixteen criterion measures were grouped into the three subgroups of process, presage and product
TABLE XI

RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER
EFFECTIVENESS FOR TEACHERS OF CLASS I SCHOOL DISTRICTS

Rank              Criteria                                 Type
Order        (Ordered by Rating)                      (Mitzel Scheme)   Means

 1.  Effectiveness in controlling his class.             Process         8.12
 2.  Knowledge of subject matter and related areas.      Presage         8.12
 3.  Relationship with class (good rapport).             Process         7.92
 4.  Willingness to be flexible, to be direct or
     indirect as situation demands.                      Presage         7.77
 5.  Personal adjustment and character.                  Presage         7.65
 6.  Ability to personalize his teaching.                Process         7.56
 7.  Extent to which his verbal behavior in
     classroom is student centered.                      Process         7.42
 8.  Capacity to perceive the world from the
     student's point of view.                            Process         7.35
 9.  Influence on student's behavior.                    Product         7.28
10.  Amount his students learn.                          Product         6.94
11.  General knowledge and understanding of
     education facts.                                    Presage         6.74
12.  Extent to which he uses inductive
     (discovery) methods.                                Process         6.39
13.  Civic responsibility (patriotism).                  Presage         6.12
14.  Participation in community and
     professional activities.                            Presage         5.17
15.  Years of teaching experience.                       Presage         5.14
16.  Performance in student teaching.                    Presage         5.14

Type        Combined Means
Process          7.46
Product          7.11
Presage          6.48
TABLE XII

RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER
EFFECTIVENESS FOR TEACHERS OF CLASS II SCHOOL DISTRICTS

Rank              Criteria                                 Type
Order        (Ordered by Rating)                      (Mitzel Scheme)   Means

 1.  Effectiveness in controlling his class.             Process         8.12
 2.  Knowledge of subject matter and related areas.      Presage         7.98
 3.  Relationship with class (good rapport).             Process         7.85
 4.  Personal adjustment and character.                  Presage         7.50
 5.  Willingness to be flexible, to be direct or
     indirect as situation demands.                      Presage         7.33
 6.  Capacity to perceive the world from the
     student's point of view.                            Process         7.30
 7.  Ability to personalize his teaching.                Process         7.24
 8.  Amount his students learn.                          Product         7.16
 9.  Extent to which his verbal behavior in
     classroom is student centered.                      Process         7.09
10.  Influence on student's behavior.                    Product         7.06
11.  General knowledge and understanding of
     education facts.                                    Presage         6.79
12.  Extent to which he uses inductive
     (discovery) methods.                                Process         6.23
13.  Civic responsibility (patriotism).                  Presage         5.96
14.  Participation in community and
     professional activities.                            Presage         5.28
15.  Years of teaching experience.                       Presage         5.04
16.  Performance in student teaching.                    Presage         4.86

Type        Combined Means
Process          7.30
Product          7.11
Presage          6.34
TABLE XIII

RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER
EFFECTIVENESS FOR TEACHERS OF CLASS III SCHOOL DISTRICTS

Rank              Criteria                                 Type
Order        (Ordered by Rating)                      (Mitzel Scheme)   Means

 1.  Effectiveness in controlling his class.             Process         8.32
 2.  Knowledge of subject matter and related areas.      Presage         8.27
 3.  Relationship with class (good rapport).             Process         7.73
 4.  Willingness to be flexible, to be direct or
     indirect as situation demands.                      Presage         7.49
 5.  Capacity to perceive the world from the
     student's point of view.                            Process         7.41
 6.  Ability to personalize his teaching.                Process         7.37
 7.  Amount his students learn.                          Product         7.36
 8.  Extent to which his verbal behavior in
     classroom is student centered.                      Process         7.32
 9.  Personal adjustment and character.                  Presage         7.20
10.  Influence on student's behavior.                    Product         7.05
11.  General knowledge and understanding of
     education facts.                                    Presage         6.93
12.  Extent to which he uses inductive
     (discovery) methods.                                Process         6.49
13.  Civic responsibility (patriotism).                  Presage         6.35
14.  Participation in community and
     professional activities.                            Presage         5.49
15.  Years of teaching experience.                       Presage         4.83
16.  Performance in student teaching.                    Presage         4.63

Type        Combined Means
Process          7.44
Product          7.20
Presage          6.40
TABLE XIV

COMBINED MEANS OF RATINGS OF ADMINISTRATORS
AND TEACHERS OF MONTANA

Type                           All       Class I    Class II   Class III
(Mitzel Scheme)    Admin.    Teachers   Teachers   Teachers    Teachers

Process             7.49       7.41       7.46       7.30        7.44
Product             7.61       7.12       7.11       7.11        7.20
Presage             6.55       6.43       6.48       6.34        6.40
which have been described as the Mitzel Scheme. The first hypothesis was tested at the .05 level of significance using the least squares analysis of variance.

Null hypothesis 1. There is no significant difference among the subgroup measures of teacher effectiveness (process, product and presage) in the responses between teachers and administrators.
Since the computed F value of 6.79 was greater than the critical value of 3.84, the null hypothesis that there was no significant difference between the administrators and the teachers was rejected. Administrators rated the three subgroups of criteria significantly higher than teachers.

Since the computed F value of 55.60 was greater than the critical value of 2.99, the null hypothesis that there was no significant difference among the three criteria subgroups was rejected. Duncan's test for multiple comparisons indicated that the subgroups process and product were rated significantly higher than presage criteria.
Since the computed F value of 1.73 was less than the critical value of 2.99, the null hypothesis that there was no significant difference between administrators and teachers among the three criteria was not rejected. The three criteria subgroups were presage, process and product.
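The decision rule used throughout these tests compares a computed F statistic against a tabled critical value at the .05 level. As a rough check, the critical values quoted in this chapter (3.84 for one numerator degree of freedom, about 3.00 for two) can be recovered from the F distribution; the sketch below uses scipy, and the large denominator degrees of freedom is an assumption standing in for the exact sample sizes.

```python
from scipy import stats

# Upper 5% critical values of the F distribution at the .05 level.
# Numerator df = 1 compares two rater groups; df = 2 compares the
# three criterion types.  A large denominator df approximates the
# sample sizes reported in this chapter.
crit_groups = stats.f.ppf(0.95, dfn=1, dfd=10**6)
crit_types = stats.f.ppf(0.95, dfn=2, dfd=10**6)

print(round(crit_groups, 2), round(crit_types, 2))  # 3.84 3.0
```

A computed F larger than the corresponding critical value leads to rejection of the null hypothesis, as in the comparisons above.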
Table XV presents the analysis of variance results for administrators' and teachers' ratings of the three subgroups of criteria. As noted from the table, teachers rated process criteria higher than either product or presage criteria. Administrators rated product criteria higher than either process or presage criteria. While both administrators and teachers placed relatively equal significance on process and product measures of teacher effectiveness, they gave significantly less emphasis to presage criteria.
Schools in Montana belong to one of three classes of districts, which are determined by the population within the district. The larger school districts are Class I, with a population of 6500 or more; intermediate school districts are Class II, with a population of 1000 to 6500; and smaller school districts are Class III, with a population of 1000 or less. Enrollment of students and size of teaching staffs are proportional to the class of district in which they are located. One of the questions to be answered by this study was whether or not teachers in larger districts viewed teacher effectiveness criterion
TABLE XV

ADMINISTRATORS VERSUS TEACHERS
LEAST SQUARE MEANS AMONG THE THREE SUB-TESTS
BETWEEN ADMINISTRATORS AND TEACHERS

                          Test 1     Test 2     Test 3
Sample             N     Presage    Process    Product    Total

Administrators     81      6.11       7.10       7.39      6.87
Teachers          452      5.98       6.99       6.90      6.62

Total                      6.04       7.04       7.15

F =  6.791  row means*           critical value of F = 3.84
F = 55.605  column means*        critical value of F = 3.00
F =  1.733  interaction means    critical value of F = 3.00
* .05 level of confidence
Overall mean = 6.66    S.D. = 1.42

Duncan's Test
7.04 compared to 6.04    significant .05
7.15 compared to 6.04    significant .05
7.15 compared to 7.04
measures differently than teachers in intermediate and smaller or rural districts. Hypothesis 2, relating to classes of districts, was tested at the .05 level of significance using the least squares analysis of variance.

Null hypothesis 2. There is no significant difference among the subgroup measures of teacher effectiveness (process, product and presage) in the responses of teachers in the three district classifications.

Since the computed F value of .29 was less than the critical value of 3.00, the null hypothesis that there was no significant difference in the teachers' responses by the three district classifications was not rejected.

Since the computed F value of 62.28 was greater than the critical value of 3.00, the null hypothesis that there was no significant difference among the three criteria subgroups was rejected. Duncan's test for multiple comparisons indicated that the subgroups process and product were rated significantly higher than presage.

Since the computed F value of .63 was less than the critical value of 2.37, the null hypothesis that there was no significant difference in teachers' responses from the three classes of school districts among the three criteria was not rejected. The three criteria subgroups were process, product and presage.
Table XVI presents the analysis of variance results for teachers of the three classes of school districts and their ratings of the three
TABLE XVI

LEAST SQUARE MEANS OF COMPARISON OF TEACHERS OF THREE
CLASSES OF DISTRICTS: TEACHERS OF 1st,
IInd, AND IIIrd CLASS SCHOOL DISTRICTS

                                Test 1     Test 2     Test 3
Teacher Sample           N     Presage    Process    Product    Total

1st Class Districts     305      6.03       7.05       6.92      6.67
IInd Class Districts    170      5.93       6.94       7.05      6.64
IIIrd Class Districts    58      6.07       7.00       7.11      6.73

Total                            6.01       6.99       7.03

F =   .292  row means            critical value of F = 3.00
F = 62.279  column means*        critical value of F = 3.00
F =   .635  interaction means    critical value of F = 2.37
* .05 level of confidence
Overall mean = 6.66    S.D. = 1.42

Duncan's Test
6.99 compared to 6.01    significant .05
7.03 compared to 6.01    significant .05
7.03 compared to 6.99
subgroups of criteria. The results in Table XVI indicate that no significant difference existed among teachers of the three classes of school districts in their separate ratings of teacher effectiveness criteria. A significant difference existed in the ratings given process and product criteria compared to presage criteria of effectiveness by teachers of the three classes of school districts. This difference was significant beyond the .05 level of confidence.
The third hypothesis was tested by the least squares analysis of variance at the .05 level of significance. The objective of this hypothesis was to determine whether or not there was a significant difference between male and female teachers in their rating of the teacher effectiveness criteria.

Null hypothesis 3. There is no significant difference among the subgroup measures of teacher effectiveness (process, product and presage) in the responses between male and female teachers.

Since the computed F value of 6.66 was greater than the critical value of 3.84, the null hypothesis that there was no significant difference between male and female teachers' ratings was rejected. Female teachers were higher raters of criteria.

Since the computed F value of 95.26 was greater than the critical value of 3.00, the null hypothesis that there was no significant difference among the three criteria (process, product and presage) was rejected.
Duncan's test for multiple comparisons indicated that the subgroups process and product were rated significantly higher than presage criteria.

Since the computed F value of .76 was less than the critical value of 2.99, the null hypothesis that there was no significant difference between male and female teachers among the three criteria was not rejected. Process, product and presage made up the three criteria.
Table XVII presents the analysis of variance for male and female teacher ratings of the three subgroups of criteria. Female teachers rated the three types of teacher effectiveness criteria higher than male teachers did. Both groups rated process and product criteria significantly higher than presage criteria at the .05 level of confidence.

The fourth hypothesis was tested by the least squares analysis of variance at the .05 level of significance. The purpose of this hypothesis was to determine whether or not elementary and secondary teachers differed significantly in their rating of teacher effectiveness criteria.

Null hypothesis 4. There is no significant difference among the subgroup measures of teacher effectiveness (process, product and presage) in the responses between elementary and secondary teachers.

Since the computed F value of 2.35 was less than the critical value of 3.84, the null hypothesis that there was no significant difference between elementary and secondary teacher ratings of criteria was not rejected. The difference was not significant at the .05 level of confidence.
TABLE XVII

A COMPARISON OF LEAST SQUARE MEANS AND ANALYSIS OF VARIANCE
FOR MALE AND FEMALE TEACHERS' RATINGS OF CRITERIA

                        Test 1     Test 2     Test 3    Combined
Sample           N     Presage    Process    Product      Mean

Male            286      5.90       6.89       6.95       6.58
Female          247      6.12       7.14       7.01       6.75

Combined Means           6.01       7.02       6.98

F =  6.66  row means*            critical value of F = 3.84
F = 95.26  column means*         critical value of F = 3.00
F =  .762  interaction means     critical value of F = 3.00
* .05 level of confidence
Overall mean = 6.66    S.D. = 1.42

Duncan's Test
7.02 compared to 6.01    significant .05
6.98 compared to 6.01    significant .05
6.98 compared to 7.02
Since the computed F value of 79.29 was greater than the critical value of 3.00, the null hypothesis that there was no significant difference among the three criteria was rejected. Process and product criteria were rated significantly higher than presage criteria, as indicated by Duncan's test for multiple comparisons.

Since the computed F value of .82 was less than the critical value of 3.00, the null hypothesis that there was no significant difference between elementary and secondary teachers among the three criteria was not rejected. The three criteria process, product and presage were compared.

Table XVIII presents the analysis of variance for elementary and secondary teachers for the three subgroups of teacher effectiveness criteria. For comparison, as noted in the table, the elementary teachers included all grade levels from kindergarten through grade eight (K-8), and secondary teachers included grades nine through twelve (9-12). The two levels are commonly regarded as elementary and secondary. Elementary teachers rated process criteria highest, and secondary teachers rated product criteria highest. Neither rating was high enough to be significant in comparison to the next higher subgroup rating. There was a significant difference in the rating of process and product criteria over presage criteria for the combined group (K-12).

It may be assumed that as teachers gained years of experience in teaching they might have viewed measures of teaching effectiveness
TABLE XVIII

LEAST SQUARES MEANS FOR ELEMENTARY VERSUS SECONDARY
TEACHERS FOR THREE SUBGROUPS OF
TEACHER EFFECTIVENESS CRITERIA

                               Test 1     Test 2     Test 3    Combined
Sample                  N     Presage    Process    Product      Mean

Elementary Teachers
(K-8)                  363      6.01       7.08       7.00       6.70
Secondary Teachers
(9-12)                 170      5.98       6.84       6.93       6.59

Combined Means (K-12)           5.99       6.96       6.96

F =  2.35  row means             critical value of F = 3.84
F = 79.29  column means*         critical value of F = 3.00
F =   .82  interaction means     critical value of F = 3.00
* .05 level of confidence
Overall mean = 6.66    S.D. = 1.42

Duncan's Test
6.96 compared to 5.99    significant .05
6.96 compared to 5.99    significant .05
6.96 compared to 6.96
differently than in the earlier years of their experience. For example, a teacher who was inexperienced might have been more concerned with his own "personal adjustment and character", a presage criterion, and rated it higher than "amount his students learn", a product criterion. A more experienced teacher might have reversed the emphasis between these two criteria.

The fifth hypothesis examined the assumption that no significant difference existed among the subgroup measures of teacher effectiveness (process, product and presage) as rated by teachers who were grouped into six classes of years of experience. Briefly stated, years of experience have little or no influence on how teachers rated teacher effectiveness criteria.

The six classes of experience were described as follows: (1) 0-5 years, (2) 6-10 years, (3) 11-15 years, (4) 16-20 years, (5) 21-25 years and (6) 26 years and over. The number of teachers sampled in each experience group was listed in Table VII. The number of administrators sampled in each experience group was listed in Table III.

The least squares analysis of variance statistic was used to test hypothesis V at the .05 level of significance. The means for the six classes of teacher experience under the subgroup headings of process, product and presage criteria are listed in Table XIX.
TABLE XIX

LEAST SQUARE MEANS AND ANALYSIS OF VARIANCE
OF SIX EXPERIENCE CLASSES FOR TEACHERS

Teacher Years                     Test 1     Test 2     Test 3
of Experience              N     Presage    Process    Product    Total

Class 1 (0-5 years)       172      5.70       6.93       6.85      6.49
Class 2 (6-10 years)      157      5.86       6.98       6.93      6.59
Class 3 (11-15 years)      91      6.19       6.97       6.97      6.71
Class 4 (16-20 years)      56      6.45       7.18       7.30      6.98
Class 5 (21-25 years)      25      6.17       6.97       6.85      6.66
Class 6 (26+ years)        32      6.85       7.35       7.41      7.20

Total                              6.20       7.06       7.05

Analysis of Variance
F =  6.838  row means*           critical value of F = 2.21
F = 43.465  column means*        critical value of F = 3.00
F =   .923  interaction means    critical value of F = 1.83
* .05 level of confidence
Overall mean = 6.64    S.D. = 1.42

Duncan's Test: Column Means
7.06 compared to 6.20    significant .05
7.05 compared to 6.20    significant .05
7.05 compared to 7.06

Duncan's Test: Row Means
7.20 compared to 6.66    significant .05
7.20 compared to 6.98
7.20 compared to 6.71    significant .05
7.20 compared to 6.59    significant .05
7.20 compared to 6.49    significant .05
6.98 compared to 6.71    significant .05
6.98 compared to 6.59    significant .05
6.98 compared to 6.49    significant .05
6.98 compared to 6.66    significant .05
Null hypothesis 5. There is no significant difference among the subgroup measures of teacher effectiveness (process, product, and presage) between the six classes of years of experience for teachers.

Since the computed F value of 6.84 was greater than the critical value of 2.21, the null hypothesis that there was no significant difference in the ratings of teachers among years of experience was rejected. The years of experience were comprised of six classes.

Since the computed F value of 43.46 was greater than the critical value of 3.00, the null hypothesis that there was no significant difference among the three subgroups of criteria was rejected. The subgroups process, product and presage comprised the effectiveness criteria.

Since the computed F value of .92 was less than the critical value of 1.83, the null hypothesis that there was no significant difference in teacher ratings by classes of experience among the three criteria was not rejected.

A significant difference beyond the .05 level of confidence was found in the total means of product and process criteria as compared to the presage criteria. Duncan's test for multiple comparisons was administered to the analysis of variance data to determine which experience class means were significantly different at the .05 level of confidence. The comparison of the total means of each experience class indicated that Class 4, which represented 16-20 years of experience, was significantly higher than the means for Class 1 (0-5 years), Class 2 (6-10 years), Class 3 (11-15 years) and Class 5 (21-25 years). Class 6, which represented 26 and more years of experience, was significantly higher than the remaining Classes 1, 2, 3, and 5 at the .05 level of confidence. This researcher could not determine the reason for the significant difference found in how teachers with 16-20 and 26+ years of experience rated the three groups of effectiveness criteria compared to teachers of other experience categories. About all that could be said was that teachers in the 16-20 and 26+ years of experience categories were consistently higher raters of effectiveness criteria.
The researcher observed that teachers with zero to fifteen (0-15) years of experience and teachers with twenty-one to twenty-five (21-25) years of experience rated process criteria equally as high as or higher than product criteria. The difference was not significant.

Teachers with sixteen to twenty (16-20) years of experience and those with twenty-six and more (26+) years of experience rated product criteria highest of the three subgroups of criterion measures. The difference between product and process ratings was not significant for these two groups of teachers.
To examine whether or not a significant relationship existed between the differences in teacher and administrator ratings of the 16 criterion measures of teacher effectiveness and how teachers rated the effectiveness of their administrator in helping them to become more effective teachers, this researcher first obtained an administrator rating of the criteria and, with the administrator's permission, obtained the teachers' ratings of the criteria and of the administrator's effectiveness. This sampling procedure has been described in more detail in the introduction of this chapter. Hypothesis VI, which examined this question, was tested at the .05 level of significance using a multiple regression equation computed with the teacher rating of the administrator as the dependent variable and the differences between the teacher and administrator ratings of each of the 16 criterion measures of teacher effectiveness as the independent variables. Table XX presents the multiple regression analysis of the comparison.
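The overall F statistic for such a regression is the mean square due to regression divided by the mean square about regression. The quantities printed in Table XX can be checked this way; the residual sum of squares below is inferred as the total minus the regression sum of squares, which is an assumption about how the printed table was computed.

```python
# Sums of squares and degrees of freedom as reported in Table XX.
ss_regression = 121.533
df_regression = 16
ss_total = 3252.671
df_residual = 442

ss_residual = ss_total - ss_regression          # about-regression SS
ms_regression = ss_regression / df_regression   # ~7.596
ms_residual = ss_residual / df_residual         # ~7.084
f_value = ms_regression / ms_residual

print(round(f_value, 2))  # 1.07, below the critical value of 1.67
```

Because the computed F falls below the critical value, the regression as a whole is not significant, as the text below concludes.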
Null hypothesis 6. There is no significant relationship among the differences between the teacher rating of each of the 16 measures of teacher effectiveness and the teacher rating of the administrator.

Since the computed F value of 1.07 was less than the critical value of 1.67, the null hypothesis that there was no significant relationship among the differences between the teacher ratings of the 16 criterion measures and the rating of the administrator was not rejected. Only one criterion was found significant at the .05 level of confidence.

The differences between teacher and administrator ratings of the 16 teacher effectiveness criteria and the teachers' rating of the administrators were not significant except for one criterion, which was #13 on the survey instrument. This criterion was listed as "effectiveness in
TABLE XX

MULTIPLE REGRESSIONS AMONG THE 16 INDEPENDENT VARIABLES
OF TEACHER AND ADMINISTRATOR DIFFERENCES AND
THE DEPENDENT VARIABLE, TEACHER
RATING OF ADMINISTRATOR EFFECTIVENESS

Variable      Beta      T-Test     Par R

  1.         .05032       .88      .0417
  2.         .01461       .26      .0123
  3.         .00923       .25      .0118
  4.        -.01938      -.33     -.0159
  5.        -.02561      -.31     -.0149
  6.         .02265       .37      .0174
  7.         .08600      1.08      .0511
  8.        -.09677     -1.87     -.0887
  9.        -.00946      -.14     -.0065
 10.         .06294      1.01      .0482
 11.         .01459       .30      .0144
 12.        -.04659      -.46     -.0218
 13.         .30510      2.78      .1309*
 14.        -.11406     -1.52     -.0722
 15.        -.07011      -.79     -.0376
 16.        -.04772      -.94     -.0449

Constant = .04035
R2 = .0374        Multiple R = .1933
Standard error of estimate = .

Source               D.F.        SS          MS
Due to regression     16       121.533      7.5958
About regression     442      3131.138      7.0840
TOTAL                458      3252.671

F = 1.07    Critical value of F = 1.67 at .05 level of confidence
* significant at the .05 level of confidence
controlling his class". The difference between teacher and administrator ratings of this criterion was significantly related to how teachers rated the effectiveness of the administrator in helping them to become more effective. Although this one of the 16 criteria was found significant at the .05 level of confidence, it accounted for only 1.7% of the variance.
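The 1.7% figure follows from squaring the partial correlation reported for criterion #13 in Table XX, since a squared partial correlation gives the share of variance a predictor accounts for:

```python
partial_r = 0.1309  # partial correlation for criterion #13, from Table XX

# Squared partial correlation = proportion of variance accounted for.
variance_share = partial_r ** 2
print(round(variance_share * 100, 1))  # 1.7 (percent)
```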
Criterion #13, which was found to be significantly related to how teachers rated their administrators' effectiveness, can be described as the exercising of discipline in the classroom. This aspect of a teacher's effectiveness was rated highest in comparison to the remaining 15 criterion measures of teacher effectiveness by both teachers and administrators. The difference illustrated by the multiple regression equation was most likely caused by teachers and administrators communicating on this particular criterion. For example, should a teacher and administrator disagree as to what constituted effective classroom control (discipline), then the difference most likely would have resulted in confrontation or dialogue between teacher and administrator.
How effective the teacher perceived the administrator to be in resolving this conflict determined how the teacher rated the administrator's effectiveness in helping that teacher to become more effective.
It was probable that not enough difference of opinion existed between the administrator and his staff on the remaining criteria to cause communication between the administrator and his staff to occur, which resulted in little or no evaluation of the administrator. It is probable that administrator effectiveness is measured by criteria other than those listed on the survey instrument, with the exception of criterion #13.

Another way to view the lack of significance between the teachers' ratings of the 16 criterion measures of teacher effectiveness and their rating of the effectiveness of their administrator was to assume that unless differences occurred between the administrator and the teacher on a given criterion, no discussion took place between the teacher and the administrator, and no basis existed for the teacher to rate the administrator's expertise in resolving conflict over a given criterion measure. Classroom control is a noted exception because it is easily observable, and both teacher and administrator have biases as to what constitutes effective classroom control (discipline). If they agreed on what constituted good classroom control, no conflict occurred; therefore, the teacher made no observation of how effective the administrator was in dealing with a difference.
Hypothesis VII examined whether or not there was a significant relationship in the mean rating of each of the criteria of teacher effectiveness between teachers and administrators, either as combined groups or by classes of districts, and a reference group from a similar study conducted in Delaware. Table XXI lists in rank order the mean
TABLE XXI

MEAN RATINGS AND RANK ORDER OF THE 16 CRITERIA
DELAWARE STUDY

Rank              Criteria                                 Type
Order        (Ordered by Rating)                      (Mitzel Scheme)   Means

 1.  Relationship with class (good rapport).             Process         8.31
 2.  Willingness to be flexible, to be direct
     or indirect as situation demands.                   Presage         8.17
 3.  Effectiveness in controlling his class.             Process         7.88
 4.  Capacity to perceive the world from the
     student's point of view.                            Process         7.79
 5.  Personal adjustment and character.                  Presage         7.71
 6.  Influence on student's behavior.                    Product         7.65
 7.  Knowledge of subject matter and related
     areas.                                              Presage         7.64
 8.  Ability to personalize his teaching.                Process         7.63
 9.  Extent to which his verbal behavior in
     classroom is student-centered.                      Process
10.  Extent to which he uses inductive
     (discovery) methods.                                Process         6.95
11.  Amount his students learn.                          Product         6.86
12.  General knowledge and understanding of
     educational facts.                                  Presage         6.43
13.  Civic responsibility (patriotism).                  Presage         6.25
14.  Performance in student teaching.                    Presage         5.66
15.  Participation in community and
     professional activities.                            Presage         4.88
16.  Years of teaching experience.                       Presage         3.89

Type        Combined Means
Process          7.64
Product          7.26
Presage          6.43

Source: Phi Delta Kappan, April 1974.
ratings of Delaware teachers and administrators of the 16 criteria of teacher effectiveness. Tables IX through XIII list in rank order the mean ratings for Montana teachers and administrators on the same 16 criterion measures of teacher effectiveness.

Because only mean ratings were available from the Delaware study, which was conducted by Jenkins and Bausell, Spearman's coefficient of rank correlation was used to test hypothesis VII. Ferguson described the comparison of two correlated or independent samples by using ranks as nonparametric or distribution-free tests (Ferguson, 1971, p. 304).
Spearman's coefficient of rank correlation is defined in such a way as to take a value of +1 when paired ranks are in the same order, a value of -1 when the ranks are in an inverse order, and an expected value of zero when the ranks are arranged at random with respect to each other. The formula meeting these conditions for Spearman's rho is

    rho = 1 - (6 * sum of d^2) / (N(N^2 - 1))

where d is the difference between paired ranks and N is the number of pairs (Ferguson, 1971, pp. 305-6).
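The rho of .888 reported below for Montana teachers versus administrators can be reproduced from the paired ranks of the sixteen criteria (the administrator ranks here are read from the rank orders in Tables IX and X), either by the formula above or with scipy:

```python
from scipy import stats

# Montana teachers' rank order of the 16 criteria (1..16), paired with
# the rank administrators gave the same criterion (Tables IX and X).
teacher_ranks = list(range(1, 17))
admin_ranks = [1, 3, 2, 7, 5, 9, 8, 10, 6, 4, 11, 12, 13, 15, 16, 14]

# Spearman's formula: rho = 1 - 6 * sum(d^2) / (N * (N^2 - 1))
d2 = sum((t - a) ** 2 for t, a in zip(teacher_ranks, admin_ranks))
n = len(teacher_ranks)
rho = 1 - 6 * d2 / (n * (n ** 2 - 1))

rho_check, _ = stats.spearmanr(teacher_ranks, admin_ranks)
print(round(rho, 3), round(rho_check, 3))  # 0.888 0.888
```

With no tied ranks, the formula and the general rank-correlation computation agree exactly.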
Null hypothesis 7. There is no significant relationship in the mean rating of each of the 16 criteria of teacher effectiveness between teachers and administrators (either as combined groups or by classes of districts) and a reference group from another study.

Since the computed rho value of .888 was greater than the critical value of .425, the null hypothesis that there was no significant relationship in the ratings of criteria between Montana teachers and administrators was rejected. Significance was tested at the .05 level of confidence.

Since the computed rho value of .894 was greater than the critical value of .425, the null hypothesis that there was no significant relationship in the ratings of criteria between Montana teachers and Delaware teachers was rejected. Significance was tested at the .05 level of confidence.

Since the computed rho value of .827 was greater than the critical value of .425, the null hypothesis that there was no significant relationship in the ratings of criteria between Montana administrators and Delaware teachers was rejected. Significance was tested at the .05 level of confidence.

Since the computed rho value of .977 was greater than the critical value of .425, the null hypothesis that there was no significant relationship in the ratings of criteria between Montana teachers in first and second class school districts was rejected. Significance was tested at the .05 level of confidence.

Since the computed rho value of .947 was greater than the critical value of .425, the null hypothesis that there was no significant relationship in the ratings of criteria between Montana teachers in first and third class school districts was rejected. Significance was tested at the .05 level of confidence.
1 2 7
Since the computed rho value of .956 was greater than the critical
value of .425, the null hypothesis that there was no significant
relationship in the rating of criteria between Montana teachers in
second and third class school districts was rejected. Significance
was tested at the .05 level of confidence.
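The six tests above share one decision rule, which can be stated compactly. The rho values and the critical value of .425 are those reported in the text; the variable names and group labels are illustrative:

```python
CRITICAL_RHO = 0.425  # critical value for N = 16 at the .05 level

computed_rho = {
    "Montana teachers vs. Montana administrators": 0.888,
    "Montana teachers vs. Delaware teachers": 0.894,
    "Montana administrators vs. Delaware teachers": 0.827,
    "Class I vs. Class II Montana teachers": 0.977,
    "Class I vs. Class III Montana teachers": 0.947,
    "Class II vs. Class III Montana teachers": 0.956,
}

# Reject the null hypothesis of no relationship whenever the computed
# rho exceeds the critical value.
rejected = {pair: rho > CRITICAL_RHO for pair, rho in computed_rho.items()}
```

In every one of the six comparisons the computed rho exceeds .425, so each null hypothesis is rejected.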
Tables XXII through XXVII give the Spearman's coefficient of rank
correlation calculations for comparisons of Montana teachers and adminis-
trators, Montana teachers and Delaware teachers, Montana administrators
and Delaware teachers, Montana teachers of Class I and Class II school
districts, Montana teachers of Class I and Class III school districts, and
Montana teachers of Class II and Class III school districts, respectively.
In Tables XXII through XXVII, columns one and two are arranged so that
column one contains the rank order of the sixteen effectiveness criteria
for a particular group of raters. Column two contains the rank that
another group of raters gave the same criteria. For example, in column
one of Table XXII, Montana teachers rated the criterion "effectiveness in
controlling his class" as number one in importance. Administrators in
Montana also ranked this criterion as number one. The criterion "knowl-
edge of subject matter and related areas" Montana teachers rated second
in importance, but administrators rated this same criterion as number
three in importance.
To find the rank-order ratings and descriptions of the criteria
in columns one and two of Tables XXII through XXVII, the reader is
TABLE XXII

CALCULATION OF SPEARMAN'S COEFFICIENT
OF RANK CORRELATION FOR MONTANA
TEACHERS AND ADMINISTRATORS

Teachers' Rank      Administrators'     Difference
Order of Criteria   Rank of Criteria         d        d2
        1                   1                0         0
        2                   3               -1         1
        3                   2                1         1
        4                   7               -3         9
        5                   5                0         0
        6                   9               -3         9
        7                   8               -1         1
        8                  10               -2         4
        9                   6                3         9
       10                   4                6        36
       11                  11                0         0
       12                  12                0         0
       13                  13                0         0
       14                  15               -1         1
       15                  16               -1         1
       16                  14                2         4
Totals                                       0        76

P = 1 - (6 Σd^2)/(N(N^2 - 1)) = 1 - (6 x 76)/(16(256 - 1)) = .888

Critical value of P for N of 16 = .425 at the .05 level of confidence

Note: Refer to Table X, Page 102 for the rank order of criteria
for Montana teachers.
Refer to Table IX, Page 101 for the rank order of criteria
for Montana administrators.
TABLE XXIII

CALCULATION OF SPEARMAN'S COEFFICIENT
OF RANK CORRELATION FOR MONTANA
AND DELAWARE TEACHERS

Montana Teachers'   Delaware Teachers'   Differences
Rank Order of       Rank of
Criteria            Criteria                  d        d2
        1                   3                -2         4
        2                   7                -5        25
        3                   1                 2         4
        4                   2                 2         4
        5                   5                 0         0
        6                   8                -2         4
        7                   4                 3         9
        8                   9                -1         1
        9                   6                 3         9
       10                  11                -1         1
       11                  12                -1         1
       12                  10                 2         4
       13                  13                 0         0
       14                  15                -1         1
       15                  16                -1         1
       16                  14                 2         4
Totals                                        0        72

P = 1 - (6 Σd^2)/(N(N^2 - 1)) = 1 - (6 x 72)/(16(256 - 1)) = .894

Critical value of P for N of 16 = .425 at the .05 level of confidence.

Note: Refer to Table X, Page 102 for rank order of criteria
for Montana teachers.
Refer to Table XXI, Page 125 for rank order of criteria
for Delaware teachers and administrators.
TABLE XXIV

CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK
CORRELATION FOR MONTANA ADMINISTRATORS
AND DELAWARE TEACHERS

Montana Administrators'   Delaware Teachers'   Differences
Rank Order                Rank Order of
of Criteria               Criteria                  d        d2
         1                        3                -2         4
         2                        1                 1         1
         3                        7                -4        16
         4                       11                -7        49
         5                        5                 0         0
         6                        6                 0         0
         7                        2                 5        25
         8                        4                 4        16
         9                        8                 1         1
        10                        9                 1         1
        11                       12                -1         1
        12                       10                 2         4
        13                       13                 0         0
        14                       14                 0         0
        15                       15                 0         0
        16                       16                 0         0
Totals                                              0       118

P = 1 - (6 Σd^2)/(N(N^2 - 1)) = 1 - (6 x 118)/(16(256 - 1)) = .827

Critical value of P for N of 16 = .425 at the .05 level of confidence.

Note: Refer to Table IX, Page 101 for rank order of criteria
for Montana administrators.
Refer to Table XXI, Page 125 for rank order of criteria
for Delaware teachers.
TABLE XXV

CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK
CORRELATION FOR MONTANA TEACHERS OF FIRST
AND SECOND CLASS SCHOOL DISTRICTS

Teachers

Class I             Class II            Differences
Rank Order          Rank of
of Criteria         Criteria                 d        d2
        1                   1                0         0
        2                   2                0         0
        3                   3                0         0
        4                   5               -1         1
        5                   4                1         1
        6                   7               -1         1
        7                   9               -2         4
        8                   6                2         4
        9                  10               -1         1
       10                   8                2         4
       11                  11                0         0
       12                  12                0         0
       13                  13                0         0
       14                  14                0         0
       15                  15                0         0
       16                  16                0         0
Totals                                       0        16

P = 1 - (6 Σd^2)/(N(N^2 - 1)) = 1 - (6 x 16)/(16(256 - 1)) = .977

Critical value of P for N of 16 = .425 at the .05 level of confidence.

Note: Refer to Table XI, Page 104 for rank order of criteria
for Class I teachers.
Refer to Table XII, Page 105 for rank order of criteria
for Class II teachers.
TABLE XXVI

CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK
CORRELATION FOR MONTANA TEACHERS OF
FIRST AND THIRD CLASS SCHOOL DISTRICTS

Teachers

Class I             Class III           Differences
Rank Order          Rank Order
of Criteria         of Criteria              d        d2
        1                   1                0         0
        2                   2                0         0
        3                   3                0         0
        4                   4                0         0
        5                   9               -4        16
        6                   6                0         0
        7                   8               -1         1
        8                   5                3         9
        9                  10               -1         1
       10                   7                3         9
       11                  11                0         0
       12                  12                0         0
       13                  13                0         0
       14                  14                0         0
       15                  15                0         0
       16                  16                0         0
Totals                                       0        36

P = 1 - (6 Σd^2)/(N(N^2 - 1)) = 1 - (6 x 36)/(16(256 - 1)) = .947

Critical value of P for N of 16 = .425 at the .05 level of confidence.

Note: Refer to Table XI, Page 104 for rank order of criteria
for Class I teachers.
Refer to Table XIII, Page 106 for rank order of criteria
for Class III teachers.
TABLE XXVII

CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK
CORRELATION FOR MONTANA TEACHERS OF SECOND
AND THIRD CLASS SCHOOL DISTRICTS

Teachers

Class II            Class III           Differences
Rank Order          Rank of
of Criteria         Criteria                 d        d2
        1                   1                0         0
        2                   2                0         0
        3                   3                0         0
        4                   9               -5        25
        5                   4                1         1
        6                   5                1         1
        7                   6                1         1
        8                   7                1         1
        9                   8                1         1
       10                  10                0         0
       11                  11                0         0
       12                  12                0         0
       13                  13                0         0
       14                  14                0         0
       15                  15                0         0
       16                  16                0         0
Totals                                       0        30

P = 1 - (6 Σd^2)/(N(N^2 - 1)) = 1 - (6 x 30)/(16(256 - 1)) = .956

Critical value of P for N of 16 = .425 at the .05 level of confidence.

Note: Refer to Table XII, Page 105 for rank order of criteria
for Class II teachers.
Refer to Table XIII, Page 106 for rank order of criteria
for Class III teachers.
referred to the tables which listed the rank order of means for the
sixteen effectiveness criteria. Tables IX, X, XI, XII, XIII and XXI of
this chapter list this information respectively for Montana administra-
tors, Montana teachers collectively, Class I district teachers, Class II
district teachers, Class III district teachers, and Delaware teachers
and administrators. Table references giving the above information are
noted at the bottom of each Table XXII through XXVII.
For each of the comparisons, the critical value of P was .425 at
the .05 level of confidence. The lowest correlation coefficient for the
six comparisons was P = .827, which was the comparison between Montana
administrators and Delaware teachers. P = .827 is considerably higher
than the critical value of P = .425 at the .05 level of confidence,
indicating a rather high degree of correlation between the subjects. The
highest correlation coefficient obtained was from the comparison between
Montana teachers from Class I and Class II school districts. The correla-
tion coefficient between these two groups was P = .977, which would indi-
cate a high degree of agreement on how they viewed teacher effectiveness
criteria. The correlation coefficient of P = .888 between Montana
teachers and administrators indicated a high degree of agreement on
teacher effectiveness criterion measures, well above the critical value
of P = .425 at the .05 level of confidence.
In examining hypothesis 7 by calculation of correlation coeffi-
cients, this researcher noted that Montana teachers and administrators
seemingly agreed in their rating of criteria for measuring teacher
effectiveness. Their ratings correlated significantly with those of the
teachers of Delaware who participated in a similar study.
To test whether or not the differences in how Montana teachers
and administrators rated each criterion of effectiveness were signifi-
cant, a t-test for significance of the mean differences was calculated
for each criterion measure. These t-ratios are shown in Table XXVIII.
Examination of the table indicates that the differences in rating
criteria 1, 2, 9, 10, 13, 14, and 16 between administrators and teachers
were significant at the .05 level of confidence.
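The t-ratio used here can be sketched in code. This assumes the unpooled form (each group's variance divided by its own group size), which reproduces the tabled values; the figures shown are those reported for criterion 10 in Table XXVIII, and the function name is illustrative:

```python
from math import sqrt

def t_ratio(mean_1, mean_2, var_1, var_2, n_1, n_2):
    """t for the difference between two independent group means,
    using each group's variance over its own n (unpooled)."""
    return (mean_2 - mean_1) / sqrt(var_1 / n_1 + var_2 / n_2)

# Criterion 10, "amount his students learn": teacher and administrator
# means, variances, and group sizes as reported in Table XXVIII.
t = t_ratio(7.04, 7.66, 2.77, 1.82, 443, 80)  # about 3.64, beyond 1.64
```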
The criteria and types, in rank order of teacher rating as illus-
trated in Table IX, are described as follows:

1. Effectiveness in controlling his class. (Process)
2. Knowledge of subject matter and related areas. (Presage)
9. Influence on student behavior. (Product)
10. Amount his students learn. (Product)
13. Civic responsibility (patriotism). (Presage)
14. Participation in community and professional activities. (Presage)
16. Performance in student teaching. (Presage)
Inspection of Table XXVIII indicates that administrators rated
criteria 1, 9, 10, 13, 14, and 16 higher than teachers. Teachers rated
criterion 2 higher than administrators. Other than informational pur-
poses for the reader, no conclusions were drawn by the researcher as to
the significance of these particular data.
TABLE XXVIII

T-TEST OF SIGNIFICANCE OF THE MEAN DIFFERENCE
FOR TEACHER AND ADMINISTRATOR

Criterion    Montana    Montana         Teacher   Administrator   Teacher   Administrator      t
(Teachers'   Teacher    Administrator     N1           N2           S1^2        S2^2         Ratio
Rank Order)  Mean X1    Mean X2
     1         8.13        8.32           452           82           1.24        .61         1.90*
     2         8.09        7.68           451           82           1.41       1.01         3.30*
     3         7.89        7.95           452           82           1.62       1.16          .45
     4         7.61        7.49           445           78           1.90       1.73          .75
     5         7.56        7.57           447           81           2.11       1.35          .07
     6         7.44        7.36           447           80           2.15       2.09          .44
     7         7.34        7.39           452           80           2.26       1.45          .33
     8         7.31        7.36           444           81           1.82       1.21          .35
     9         7.19        7.56           452           82           2.05       1.51         2.46*
    10         7.04        7.66           443           80           2.77       1.82         3.64*
    11         6.77        6.59           448           81           2.75       1.97         1.03
    12         6.35        6.58           452           82           2.63       1.77         1.39
    13         6.09        6.53           448           81           3.97       2.25         2.32*
    14         5.23        5.74           451           80           3.68       2.60         2.55*
    15         5.08        4.95           450           79           3.95       3.38          .57
    16         5.01        5.87           451           80           4.20       3.02         3.98*

D.f. range from N1 + N2 - 2 = 527 to N1 + N2 - 2 = 534
Critical value of t at .05 level of confidence = 1.64 for 1-tail test
*Denotes significant value of t.
SUMMARY
All subjects in both the administrator and teacher samples for
this study were administered the questionnaire which contained sixteen
criterion measures of teacher effectiveness. The assumption was made
that the sixteen criteria were measurable. The criterion measures were
grouped together under the Mitzel Scheme of product, process and presage.
The rank order of means computed for administrators and teachers
was listed illustrating how each group rated the sixteen measures of
teacher effectiveness.
Comparison tables were made to illustrate the
difference in ratings.
The least squares analysis of variance was the statistic used to
measure the significance of differences in ratings given by administra-
tor and teacher groups to effectiveness criteria. This analysis helped
the researcher in determining whether or not the null should be retained
in hypotheses one through five, which examined how: (I) teachers and
administrators rated teacher effectiveness criteria, (II) teachers of
three district classifications rated teacher effectiveness criteria,
(III) male and female teachers rated teacher effectiveness criteria,
(IV) elementary and secondary teachers rated teacher effectiveness
criteria, and (V) years of experience influenced the rating of teacher
effectiveness criteria by teachers.
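The F ratio underlying each of these analysis of variance tests can be sketched in a small function. This is an illustrative one-way form with the groups supplied as lists of scores, not a reconstruction of the least squares computation actually run for the study:

```python
def one_way_anova_f(groups):
    """F ratio for a one-way analysis of variance over a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-groups sum of squares: weighted spread of group means.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    # Within-groups sum of squares: spread of scores about group means.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                    for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))
```

A large F relative to the critical value for (k - 1, n - k) degrees of freedom leads to rejecting the null hypothesis that the group means are equal.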
The analysis of variance examination of the data, along with the
use of Duncan's test for multiple comparisons when applicable, indicated
that teachers and administrators in the State of Montana do not signi-
ficantly differ in how they rated teacher effectiveness criteria grouped
under the Mitzel Scheme of product, process and presage types of measures.
Analysis of data did indicate that both Montana administrators and teachers
rated process and product criteria measures significantly higher than
presage criteria. Although administrators rated product measures highest
and teachers rated process measures highest, the difference in measurement
was not significant.

The sampled teachers taught in small, medium and large districts
in the State of Montana. Regardless of the size of district in which
they taught, all viewed process and product criteria measures as signifi-
cantly more important than presage criteria. It was noted that teachers
in first class districts rated process criteria highest and teachers in
class two and class three districts rated product criteria highest, but
the difference was not significant.
In comparing how male and female teachers rated teacher effective-
ness criteria, the same trend was indicated by the data. Both groups
rated product and process measures of teacher effectiveness significantly
higher than presage criteria. Male teachers rated product criteria high-
est and female teachers rated process criteria highest, but the difference
was not significant.

Years of experience was not a significant factor in how Montana
teachers rated teacher effectiveness criteria measures. Teachers of all
experience categories, ranging from 0-5 years to over 26 years of ex-
perience, rated process and product criteria as significantly more impor-
tant as effectiveness measures than presage criteria.

Teachers with 16-20 years of experience and 26 and over years of
experience were higher raters of the three subgroups of teacher effec-
tiveness measures than were teachers in other experience categories.
Teachers in experience categories 16-20 years and 26 and over years of
experience rated product criteria measures highest. The teachers in
the remaining experience categories rated process criteria equally high
or higher as effectiveness measures. In neither case, however, was the
difference significant.
The multiple regression equation was used to test whether or not
a significant relationship existed among the differences between admin-
istrators' and teachers' ratings of teacher effectiveness and the
teachers' rating of the administrator's effectiveness in helping the
teacher to become a more effective teacher. The teachers' rating of the
administrators was the dependent variable, sometimes spoken of as the
criterion, and the sixteen teacher effectiveness criteria were the
independent variables, sometimes spoken of as predictors.
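The "percent of total variance accounted for" reported for a predictor is an r-squared figure. A minimal sketch for the single-predictor case follows; the function and data are illustrative, not drawn from the study:

```python
def r_squared(x, y):
    """Proportion of the variance in y accounted for by a least squares
    regression of y on a single predictor x (the squared Pearson r)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return (s_xy * s_xy) / (s_xx * s_yy)

# A value near .017 would mean the predictor accounts for about 1.7%
# of the variance in the criterion variable.
```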
The analysis of data using the multiple regression statistic indi-
cated that only one of the sixteen criteria measures of teacher effec-
tiveness was significant in predicting how teachers evaluated the effec-
tiveness of their administrator. This criterion, listed as #13 on the
survey instrument and entitled "effectiveness in controlling his class",
although significant, accounted for only 1.7% of the total variance,
which is very minimal in importance.
Apparently criteria other than the sixteen listed on the survey
instrument are more significant in the teachers' evaluation of adminis-
trators' effectiveness. The null hypothesis used to examine this ques-
tion was retained as true.

To compare the results of this study, which replicates, in some
degree, a previous study completed in recent years in Delaware, this
researcher used a nonparametric statistic to make the comparison. The
statistic chosen to examine the relationship between the ratings of
teacher effectiveness criteria by Montana teachers and administrators
and those of Delaware teachers and administrators was the Spearman's
coefficient of rank correlation, because only mean data for the Delaware
study was available to the researcher.
The Spearman rho statistic indicated that a high correlation
existed between the results of the Delaware study and this study for
combined teacher and administrator rankings of criterion measures of
teacher effectiveness. In both studies product and process criterion
measures were rated significantly higher than presage measures of
teacher effectiveness.
The difference in results of this study and the Delaware study
was observed to be that, by rank of mean comparison, Montana teachers
and administrators rated "effectiveness in controlling his class" as
number one and Delaware teachers and administrators rated "relationship
in class (good rapport)" as number one in importance. Both criteria
are process criteria. One might conclude that discipline is of prime
importance to Montana teachers. The "amount his students learn" (a
product criterion) was rated fourth in importance by Montana teachers
and administrators and eleventh by Delaware teachers and administrators.

In comparing the mean ratings of each of the sixteen criterion
measures of teacher effectiveness between teachers and administrators of
Montana and the reference group of administrators and teachers of Dela-
ware, the researcher concluded that the null hypothesis was a true des-
cription of this relationship. Teachers' and administrators' ratings in
this study compared significantly with the ratings of Delaware teachers
and administrators.
CHAPTER V
SUMMARY, CONCLUSIONS AND RECOMMENDATIONS
Summary

Past research in the area of teacher effectiveness has concen-
trated on identifying the effective and ineffective teacher.
Several
criteria have been used for placement of teachers into one of these two
groups.
However, when characteristics of each of the group members
have been studied, few distinguishing variables have been identified.
Those who have been interested in teacher effectiveness have had
different purposes and consequently have varied their interpretations of
the problem.
Some who have investigated the problem of teacher effec-
tiveness would have been satisfied to know whether or not a teacher was
getting desired results, with the results indicating effectiveness.
Others wanted to know how to increase the probability of attaining
desired results. Researchers of the latter persuasion were searching
for lawful teaching behavior, i.e., validated procedures for achieving
instructional ends. Their assumption was that effective teaching would
be recognized when lawful relationships were established between instruc-
tional variables and learner outcomes, that certain procedures in
teaching would have, within certain probability limits, been labeled as
effective or ineffective. To date there are no such laws, only a few
leads or practices that are more likely than others to maximize the
attainment of selected instructional ends.
Examination of the literature indicated that teacher effective-
ness has been such a nebulous quality that it was unlikely that factors
would ever be found which would successfully categorize teachers. In
reality, effectiveness and ineffectiveness are not mutually exclusive.
A teacher can be effective for one reason while at the same time he is
ineffective for another reason.
Purposes of defining the effective teacher have varied from
the practical need to assess teachers for retention or release at the
local school district level to the researcher's purpose of determining
how well a teacher could perform in any of a class of jobs which share
many common characteristics, as well as with identifying these common
characteristics.

The school official has sought to determine how well a teacher
performed his job in terms of certain specified, and more often unspeci-
fied, criteria. He has not been concerned with whether or not the job
he asked the teacher to perform was representative of the class of such
jobs, or whether the teacher could perform the class of jobs well.
The difference between a school official's concern and a re-
searcher's concern has several implications. For example, overall
or even intuitive ratings may be used by a school official to help make
a very general assessment of how well a teacher performs. The general
assessment thus gained has provided relevant and useful information for
the immediate situation. Overall ratings have not generally been useful
or relevant to the researcher because such ratings have low reliabilities
and have not been consistent with the purpose of the researcher: to
predict and to describe.
Much unhappiness regarding assessment of teachers has been for
curricular rather than instructional reasons. The teacher may have been
labeled ineffective not because his students failed to achieve, but be-
cause the achievement has been in directions that were not valued by
the rater.
Judgments regarding teachers have always been made, but the recent
public outcry for accountability has placed increasing emphasis on the
need to improve ways to evaluate teacher effectiveness. The problem
inherent in this emphasis was the selection of suitable criteria upon
which both administrators and teachers truly agreed measured teacher
effectiveness.
The purpose of this study was to determine whether or not teachers
and administrators in the State of Montana agreed on the criteria for
judging effective teaching. Given the need for school districts to meet
the public's demand for accountability, this determination had to be made
before a basis could be established between teachers and administrators
for effective evaluation procedures.

To gather data for this study, the researcher sent a survey instru-
ment to both administrators and teachers in the spring of 1977. The
instrument used was that which was designed for a similar study
conducted by Jenkins and Bausell in the State of Delaware.
The administrator population was sampled first, and permission was
given by each respondent to sample teachers who were members of his
staff. The usable sample returns numbered 110 out of 114 sampled for
administrators. Usable sample returns for teachers numbered 454 from
568 sent out.
Analysis of the data gathered from the survey of both adminis­
trators and teachers of Montana indicated that administrators and
teachers agreed significantly within their respective groups on what
criteria of teacher effectiveness were most important.
Both groups
agreed significantly on the same criteria measures of teacher effec­
tiveness.
This agreement would seemingly provide a basis within Montana
school districts for objectively arriving at agreed-upon teacher effec-
tiveness measures supportable by both teachers and administrators within
a given school district. Whether these same measures are acceptable to
school board members and other school patrons, such as parents and the
youngsters whom teachers serve, remains to be determined. One can assume
that agreement exists between the school district's professional staff
and school patrons if elected school board officials accurately reflect
the community's idea of effective teaching. Unless agreement is apparent,
it is probable, as supported by review of the current literature on
teacher effectiveness, that subjective measures will be the prevailing
criteria in judging the effective and ineffective teacher.
Conclusions

The following conclusions were developed upon analysis of data in
this study.

1. Teachers and administrators of Montana were in agreement in
how they ranked the criteria which measure teacher effectiveness as
described by the Mitzel types of product, process and presage. This
was attested to by the high correlation between how administrators
rated effectiveness criteria compared to teacher ratings of criteria,
regardless of the size of school district in which they taught.

2. In comparing the ratings by Montana teachers and administra-
tors of teacher effectiveness criteria grouped in sub-groups of process,
product and presage, it was found that administrators rated all three
sub-groups of criteria significantly higher than did teachers.

3. Both administrators and teachers in Montana rated process and
product criteria significantly higher than presage criteria. This rat-
ing was consistent for comparisons by years of experience, classes of
school districts, elementary versus secondary teachers, and male or
female categories.

4. There was no significant difference among the ratings given
to the three sub-groups of effectiveness criteria by Montana teachers in
the three classes of school districts.

5. There was a significant difference between male and female
teachers in Montana in their ratings of the three sub-groups of effec-
tiveness criteria. Female teachers rated all three sub-groups, process,
product and presage criteria, significantly higher than did male
teachers.

6. There was no significant difference between elementary and
secondary teachers of Montana in their ratings of the three sub-groups
of effectiveness criteria, process, product and presage.

7. Montana teachers grouped into six classes by years of experi-
ence rated effectiveness criteria significantly differently in two of
the six classes of experience. Teachers with 16-20 years of experience
and teachers with 26 or more years of experience consistently rated the
three sub-groups of effectiveness criteria higher than teachers in the
four remaining experience categories.

8. Of the sixteen effectiveness criteria, only the criterion
"effectiveness in controlling his class" had a significant relationship
to the rating of the administrator's effectiveness.

9. There was a significant correlation between this study and
a similar study conducted in the State of Delaware in teacher and admin-
istrator rankings of the sixteen effectiveness criteria.

10. There was a significantly high correlation between Montana
teachers' rankings of the sixteen effectiveness criteria by school dis-
trict classification.
149
Recommendations for Further Study
1. A study should be conducted to determine if parents and other
constituents of a school district served by teachers and administra-
tors agree among themselves and with the latter as to what criteria
are most important in measuring teacher effectiveness. This would
determine if parents in a community emphasize the same criteria that
teachers and administrators do in Montana. If expectations are simi-
lar, it is conceivable that school districts will be effective in
working out acceptable teacher evaluation programs.
2. A study should be conducted that would determine if present
school district teacher evaluation programs emphasize more than just
classroom control as the principal criterion for retaining or releas-
ing teachers. This would test the assumption that if a teacher can-
not exercise an acceptable level of student control, learning is
less likely to take place.
3. There is a need to know what criteria teachers use to evaluate
their administrator's effectiveness in helping them to become better
teachers. The indication is that this process does not take place
unless conflict in values arises between administrators and the
teachers on what criteria are considered most important in measuring
effectiveness. If a teacher and administrator agree philosophically
on a criterion measure, no need exists for the administrator to assess
a teacher on the criterion measure. They agree, so why measure?
4. As more pressure is exerted by the public for schools to give
evidence that young learners are "mastering the skills", or show
competency in attaining "agreed upon learner outcomes", it may be
assumed that the degree of student competency attained will directly
reflect upon his teachers' effectiveness. A study should be made to
determine to what degree school districts presently assess their
staff's competency by criterion-referenced test results of students,
and whether this trend is apparent in Montana. The reason for inquiry
stems from the fact that administrators in Montana emphasize the
importance of the product criterion of "what students learn" much more
than teachers presently do, as evidenced by the analysis of data in
this study. Apparently administrators are sensitive to the public's
demand for accountability much more than teachers at this time. If
administrators reflect this sensitivity by using test results to give
evidence of the school's effectiveness, parents may give more emphasis
to these results than is warranted as an indication of a teacher's
effectiveness. This approach would surely lead to conflict and
ineffective evaluation of staff if teachers do not accept product
measures as being of primary importance.
5. It is not clear why principals and teachers place relatively
greater emphasis on criteria other than "amount students learn" as
proposed by accountability proponents. A start toward lessening the
dissonant attitudes resulting from this conflict in values would be a
study to determine the reasoning behind teachers' rating of student
learning relative to other criteria.
LITERATURE CITED
Allen, Dwight W., (Ed.) and Eli Seifman. The Teacher's Handbook. Glen-
view, Ill.: Scott, Foresman and Company, 1971.
Anderson, C. C. and S. M. Hunka. "Teacher Evaluation: Some Problems
and a Proposal," Harvard Educational Review. Cambridge, Mass.:
Winter 1963.
Averch, Harvey A., Stephen J. Carroll, Theodore S. Donaldson, Herbert
J. Kiesling, and John Pincus. How Effective is Schooling? (A Criti-
cal Review and Synthesis of Research Findings.) Santa Monica,
Calif.: The Rand Corporation, December 1971.
Barr, A. S. Wisconsin Studies of the Measurement and Prediction of
Teacher Effectiveness. Madison, Wisc.: Dembar Publications, Inc.,
1961.
Barr, A. S. and others. "Wisconsin Studies of the Measurement and
Prediction of Teacher Effectiveness: A Summary of Investigations,"
Journal of Experimental Education, 30, (1961) 5-156.
Beery, John R. Professional Preparation and Effectiveness of Begin-
ning Teachers. (The Ford Foundation) Coral Gables, Fla.: Graphic
Arts Press, 1960.
Biddle, Bruce J. and William J. Ellena. (Eds.) Contemporary Research
on Teacher Effectiveness. New York: Holt, Rinehart and Winston,
1964.
Bleecher, Harvey. "The Authoritativeness of Michigan's Educational
Accountability Program," The Journal of Educational Research.
69, (November and December 1975) 135-141.
Bolton, Dale L. Selection and Evaluation of Teachers. Berkeley,
Calif.: McCutchan Publishing Corporation, 1973.
Bottenberg, Robert A., and Joe H. Ward. Applied Multiple Linear Regres-
sion. 6510th Personnel Research Laboratory, Aerospace Medical
Division, Air Force Systems Command. Lackland Air Force Base,
Texas: 1963.
Boyce, A. C . "Methods of Measuring Teachers' Efficiency," 14th Yearbook,
National Society for the Study of Education, Part 2. Chicago:
University of Chicago Press, 1915.
Briggs, Thomas H . Improving Instruction. (Supervision by Principals
of Secondary Schools). New York: The Macmillan Company, 1938.
Brighton, Stayner F . Increasing Your Accuracy in Teacher Evaluation.
Englewood Cliffs, N . J.: Prentice-Hall, Inc., 1965.
Brighton, Stayner F. and Cecil J. Hannah. Merit Pay Programs for
Teachers. (A Handbook). San Francisco, Calif.: Fearon Publishers,
1962.
Brim, Orville G., Jr. Sociology and The Field of Education. New York:
Russell Sage Foundation, 1958.
Brophy, Jere E . and Carolyn M. Evertson. Learning from Teaching: A
Developmental Perspective. Boston, Mass.: Allyn and Bacon, 1976.
Brophy, Jere E. and Carolyn M. Evertson. "Teacher Education, Teacher
Effectiveness, and Developmental Psychology." Eric ED 118 257,
Report No. 75-10, August, 1975.
Bush, Robert Nelson. The Teacher-Pupil Relationship. New York:
Prentice-Hall, Inc., 1954.
Coleman, James S. et al. Equality of Educational Opportunity. Wash-
ington, D. C.: U. S. Department of Health, Education, and Welfare,
Government Printing Office, August 1966.
Dalton, Elizabeth L . What Makes Effective Teachers for Young Adoles­
cents? George Peabody College for Teachers, Nashville, Tennessee,
1962.
Davis, Hazel. (Director) Evaluation of Teachers. (Research Report)
Washington, D. C.: National Education Association, 1964.
Domas, Simeon J. and David V. Tiedeman. "Teacher Competence: An Anno-
tated Bibliography," Journal of Experimental Education, XIX, No. 2
(December 1950) 101-218.
Ebel, Robert L. (Ed.) Encyclopedia of Educational Research. 4th
Edition. Toronto, Ontario: The Macmillan Company, 1969.
Ebel, Robert L. and Roger M. Baun (Eds.) Encyclopedia of Educational
Research. 4th Edition. Toronto, Ontario: The Macmillan Company,
Collier-Macmillan Consolidated, 1969.
Ellena, William J. (Ed.) "Who's A Good Teacher?" American Association of School Administrators, 1201 Sixteenth Street, N.W., Washington, D.C., 1961.
Eye, Glen G. "The Superintendent's Role in Teacher Evaluation, Retention, and Dismissal," The Journal of Educational Research, 68:390-395, July/August 1975.
Ferguson, George A. Statistical Analysis in Psychology and Education. New York: McGraw-Hill Book Company, 1971.
Flanders, Ned A. Analyzing Teacher Behavior. Reading, Mass.: Addison-Wesley Publishing Company, 1970.

Flanders, Ned A. Teaching With Groups. Minneapolis, Minn.: Burgess Publishing Company, 1954.
Gage, N. L. Teacher Effectiveness and Teacher Education (The Search for a Scientific Basis). Palo Alto, Calif.: Pacific Books, Publishers, 1972.

Gage, N. L. (Ed.) Handbook of Research on Teaching. Chicago: The American Educational Research Association, Rand McNally and Company, 1963.

Gage, N. L. (Ed.) Mandated Evaluation of Educators: A Conference on California's Stull Act. Palo Alto, Calif.: Center for Research and Development in Teaching, School of Education, Stanford University, 1973.
Gagne, Robert M. The Conditions of Learning. New York: Holt, Rinehart and Winston, Inc., 1965.
Gary, Frank. "How Successful is Performance Evaluation," paper presented at the Annual Convention of the American Association of School Administrators, Dallas, Texas, ERIC, February 1975.
Getzels, J. W. and P. W. Jackson. The Teacher's Personality and Characteristics. Handbook of Research on Teaching, N. L. Gage (Ed.). Chicago: Rand McNally and Company, 1963.
Glass, Gene V. "Teacher Effectiveness," Evaluating Educational Performance (A resource book of methods, instruments, and examples), Herbert J. Walberg (Ed.). Berkeley, Calif.: McCutchan Publishing Corporation, 1974.
Grabman, Hulda. "Accountability for What?" Nation's Schools, Education Digest, 38:65-68, October 1972.
Haley, Dennis Richard. "Relationship of Variables Beyond Teacher's Control and Teacher's Effectiveness Ratings by Students," an unpublished Doctoral Dissertation, Montana State University, Bozeman, Montana, 1974.
Helvig, Carl (Ed.) Teacher Evaluation: The State of the Art and Solution. Harvard Educational Review. Cambridge, Mass.: May 1972.
Herman, Jerry J. Developing an Effective School Staff Evaluation Program. West Nyack, N.Y.: Parker Publishing Company, Inc., 1973.

Herrboltd, Allen A. "The Relationship Between the Perceptions of Principals and Teachers Concerning Supervisory Practices in Selected High Schools of Montana," an unpublished Doctoral Dissertation, Montana State University, Bozeman, Montana, 1975.
Hildebrand, Milton H. and Others. Evaluating University Teaching (A Handbook). Center for Research and Development in Higher Education, University of California, Berkeley, 1971.

Hottleman, Girard D. "The Accountability Movement," The Massachusetts Teacher, LIII, January 1974.
House, Ernest R. (Ed.) School Evaluation: The Politics and Process. Berkeley, Calif.: McCutchan Publishing Corporation, 1973.

Howard, Alvin W. "Accountability at Last (and Again)," National Association of Secondary School Principals, 58:20-23, March 1974.
Jenkins, Joseph R. and R. Barker Bausell. "How Teachers View the Effective Teacher: Student Learning is Not the Top Criterion," Phi Delta Kappan, April 1974.
Kibler, Robert J., Donald J. Cegala, Larry L. Barker, and David T. Miles. Objectives for Instruction and Evaluation. Boston: Allyn and Bacon, Inc., 1974.
Lewis, James, Jr. Appraising Teacher Performance. West Nyack, N.Y.: Parker Publishing Company, Inc., 1973.
Marsh, Joseph E. and Eleanor W. Wilder. Identifying the Effective Instructor: A Review of the Quantitative Studies, 1900-1952. Chanute Air Force Base, Ill.: Air Force Personnel and Training Center, 1954.
McGowan, Francis A. II. Teacher Observation and Evaluation: A Working Paper. ERIC ED 113 309, November 1974.
McKenna, Bernard H. Staffing the Schools. New York: Bureau of Publications, Teachers College, Columbia University, 1965.
McNeil, John D. and W. James Popham. "The Assessment of Teacher Competence," taken from Robert M. W. Travers (Ed.) Second Handbook of Research on Teaching. Chicago: Rand McNally, 1973.
Medley, Donald M. and Harold E. Mitzel. Measuring Classroom Behavior by Systematic Observation. Handbook of Research on Teaching, N. L. Gage (Ed.). Chicago: Rand McNally and Company, 1963.

Medley, Donald M. and Others. Assessment and Research in Teacher Education: Focus on PBTE. ERIC, June 1975.
Miller, William C. "Accountability Demands Involvement," Educational Leadership, 29:613-617, April 1972.
Madaus, George F. and Peter W. Airasian. "Performance Evaluation," Journal of Research and Development in Education, Vol. 10, No. 3, Spring 1977.
Mohan, Madan and Ronald E. Hull. Teaching Effectiveness: Its Meaning, Assessment and Improvement. Englewood Cliffs, New Jersey: Educational Technology Publications, 1975.
NEA. "Better Than Rating" (New Approaches to Appraisal of Teaching Services), Association for Supervision and Curriculum Development, National Education Association, Washington, D.C., 1950.

NEA. "The Early Warning Kit on The Evaluation of Teachers," First Revision, National Education Association, Washington, D.C., January 1974.

NEA. "The Evaluation of Teachers," National Education Association, Washington, D.C., Winter 1973-74.
Nelson, Kenneth G., John E. Becknell, and Paul A. Hedlund. Development and Refinement of Measures of Teaching Effectiveness. The University of the State of New York, Albany, N.Y., 1956.

Ober, Richard L., Ernest L. Bentley, and Edith Miller. Systematic Observation of Teaching (An interaction analysis-instructional strategy approach). Englewood Cliffs, N.J.: Prentice-Hall, Inc.
Oldham, Nield (Ed.) Evaluating Teachers for Professional Growth. Current Trends in School Policies and Programs. National School Public Relations Association. Arlington, Va.: 1974.
Ornstein, Allan C. and Harriet Talmage. "The Promise and Politics of Accountability," National Association of Secondary School Principals, 58:20-23, March 1974.

Popham, W. James (Ed.) Evaluation in Education (Current applications). Berkeley, Calif.: McCutchan Publishing Corporation, 1974.
Popham, W. James. "The New World of Accountability: In the Classroom," The National Association of Secondary School Principals, 56:25-31, May 1972.
Read, Edwin A. "Accountability and Management by Objectives," The National Association of Secondary School Principals, 58:1-10, March 1974.

Reagan, Ronald. "Public Education: An Appraisal," The National Association of Secondary School Principals, 56:1-9, May 1972.
Remmers, H. H. (Chairman) and Others. "Report of the Committee on the Criteria of Teacher Effectiveness," Review of Educational Research, 22:238-263, June 1952.
Rogers, Virgil M. (Ed.) Do We Want "Merit" Salary Schedules? (Report of Second Annual Workshop on Merit Rating in Teachers' Salary Schedules.) Syracuse University Press, 1960.
Rosenshine, B. "Evaluation of Classroom Instruction," Review of Educational Research, 40:279-301, 1970.

Rosenshine, Barak. "New Directions for Research on Teaching," How Teachers Make a Difference. U.S. Government Printing Office, 1971.
Rosenshine, Barak. Teaching Behaviors and Student Achievement. New York: Humanities Press, 1971.

Rosenshine, B. and N. Furst. "The Use of Direct Observation to Study Teaching," taken from Robert M. W. Travers (Ed.) Second Handbook of Research on Teaching. Chicago: Rand McNally, 1973.
Ryans, D. G. Characteristics of Teachers: Their Description, Comparison, and Appraisal. Washington: American Council on Education, 1960.

Sciara, Frank J. and Richard K. Jantz. Accountability in American Education. Boston: Allyn and Bacon, Inc., 1972.
Spears, Harold. Improving the Supervision of Instruction. New York: Prentice-Hall, Inc., 1953.

Stephens, John M. The Psychology of Classroom Learning. New York: Holt, Rinehart and Winston, Inc., 1965.

Stephens, J. M. The Process of Schooling: A Psychological Examination. New York: Holt, Rinehart and Winston, Inc., 1967.
Thomas, Donald. "The Principal and Teacher Evaluation," The Education Digest, Vol. 40, March 1975.

Thomas, Donald. "The Principal and Teacher Evaluation," National Association of Secondary School Principals, 58:1-8, December 1974.

Thomas, J. Alan. The Productive School (A Systems Analysis Approach to Educational Administration). New York: John Wiley and Sons, Inc., 1971.
Travers, Robert M. W. (Ed.) Second Handbook of Research on Teaching. Chicago: Rand McNally, College Publishing Company, 1973.

Walberg, Herbert J. (Ed.) Evaluating Educational Performance (A Sourcebook of Methods, Instruments, and Examples). Berkeley, Calif.: McCutchan Publishing Corporation, 1974.
Walter, Franklin B. "Mandates for Evaluation: The National Overview," (Paper presented at the conference of the Kentucky Association of Teacher Educators). Richmond, Kentucky, ERIC ED 115 607, October 31, 1975.
Weiss, Edmond. "Educational Accountability and the Presumption of Guilt," Planning and Changing. Illinois State University, Normal, Ill., 1972; appeared in The Education Digest, April 1973.
Wicks, Larry E. "Opinions Differ: Teacher Evaluation," Today's Education, 62:42-43, March 1973.
Wilson, Laval S. "Assessing Teacher Skills: Necessary Component of Individualization," Phi Delta Kappan, Vol. LVI, November 1974.

Wilson, Laval S. How to Evaluate Teacher Performance. (Paper presented to the Annual Convention of the National School Boards Association, April 1975.)
Wolf, Robert L. How Teachers Feel Toward Evaluation, in School Evaluation: The Politics and Process, (Ed.) Ernest R. House. Berkeley, Calif.: McCutchan Publishing Corporation, 1973.
APPENDIX A
DEPARTMENT OF EDUCATIONAL SERVICES
COLLEGE OF EDUCATION
MONTANA STATE UNIVERSITY, BOZEMAN 59715
March 14, 1977
Dear
In recent months you have most likely become very much aware of the increased emphasis which has been placed upon accountability of schools and schooling by the tax-paying public. Current literature points to the fact that parents expect their youngsters to be competent in basic skills as a result of schooling. Improving teacher effectiveness is timely and basic to the increased concerns of parents and school patrons.
The problem for which this survey will provide data is that of finding the appropriate basis to evaluate teacher effectiveness. The immediate need in solving this problem is for school districts to find the degree of agreement among teachers as to what are appropriate criteria for judging the effectiveness of a teacher and to compare those findings with the administrator's determination.
Enclosed you will find a survey instrument that is designed to gather data which will provide an answer to this problem. The survey contains a list of criteria for judging teacher effectiveness and directions for its completion.
Your response is essential to the study of this problem, as the data gained will reflect the thinking of people like you who must deal with this problem each day in their professional careers.
This survey questionnaire is designed to be answered in a
maximum of five to seven minutes. A self-addressed envelope is
provided. An early response, within your busy schedule, will be
very much appreciated.
Your administrator has been contacted, and his permission
has been granted for you to receive this survey instrument. Due
to the design of this instrument, it is essential that both the
administrator and his staff respond to the questionnaire. Your
response will be kept strictly confidential. No identification
of schools or respondents will be made public in this study.
This study is being conducted under the direction of
Dr. Robert Thibeault of the College of Education, Montana State
University.
A summary statement of the questionnaire data will be made
available to you if you wish to know the results of this survey.
Sincerely yours,
Francis A. Olson
DEPARTMENT OF EDUCATIONAL SERVICES
COLLEGE OF EDUCATION
MONTANA STATE UNIVERSITY, BOZEMAN 59715
May 16, 1977
Dear
Recently I mailed you a questionnaire which will provide data on the problem of finding the appropriate basis to evaluate teacher effectiveness.
In the event that the instrument may have been misplaced or you did not receive it, I am enclosing another for your consideration. I would very much appreciate your completing the instrument and returning it to me in the self-addressed envelope provided. The input from you would be very helpful to this study and assuredly very much appreciated.
I am aware of the many deadlines that you face in the closing
days of the school year, and, believe me, I deeply appreciate your
time and kind consideration in completing and returning a survey at
this time.
In the event that you have already completed the instrument sent
to you earlier, kindly disregard this request.
Thank you for your kind consideration.
Very sincerely yours,
Francis A. Olson
INSTRUCTIONS FOR COMPLETING THIS SURVEY INSTRUMENT
The purpose of this survey is to determine what professional educators believe are the appropriate criteria for judging the effectiveness of a teacher.

On the survey instrument, which follows the demographic information, please rate each of the items on the nine-point scale provided. Assume that adequate measures exist to measure each criterion.
Try to differentiate as much as possible between items. Please rate all items and be sure not to circle more than one rank for any given item.
Use the scale to rate each of the criteria according to its impor­
tance in determining teacher effectiveness. Circle one rank for
each item. Low ranks are indicative of unimportant criteria; high
ranks, important. Five is, of course, average.
Please complete the following demographic information.

A. The class of the school district in which you teach (circle one).
   Third Class    Second Class    First Class

B. Sex: ____ male or ____ female

C. Your years of teaching experience (circle one).
   1. 0-5    2. 6-10    3. 11-15    4. 16-20    5. 21-25    6. 26 and over

D. Circle the grade level or levels that you are presently teaching.
   K 1 2 3 4 5 6 7 8 9 10 11 12

E. Please list the number of students who are presently enrolled in your district.
   ________ enrolled
Do you want a copy of the data summary sent to you when this study is completed?
____ yes    ____ no
SURVEY INSTRUMENT

Criteria (circle one rank for each item: 1 = Completely Unimportant, 9 = Extremely Important)

1. Willingness to be flexible, to be direct or indirect as situation demands.   1 2 3 4 5 6 7 8 9
2. Participation in community and professional activities.   1 2 3 4 5 6 7 8 9
3. Years of teaching experience.   1 2 3 4 5 6 7 8 9
4. Ability to personalize his teaching.   1 2 3 4 5 6 7 8 9
5. Knowledge of subject matter and related areas.   1 2 3 4 5 6 7 8 9
6. Extent to which his verbal behavior in classroom is student-centered.   1 2 3 4 5 6 7 8 9
7. Capacity to perceive the world from the student's point-of-view.   1 2 3 4 5 6 7 8 9
8. Civic responsibility (patriotism).   1 2 3 4 5 6 7 8 9
9. Personal adjustment and character.   1 2 3 4 5 6 7 8 9
10. General knowledge and understanding of education facts.   1 2 3 4 5 6 7 8 9
11. Amount his students learn.   1 2 3 4 5 6 7 8 9
12. Relationship with class (good rapport).   1 2 3 4 5 6 7 8 9
13. Effectiveness in controlling his class.   1 2 3 4 5 6 7 8 9
14. Extent to which he uses inductive (discovery) methods.   1 2 3 4 5 6 7 8 9
15. Influence on student's behavior.   1 2 3 4 5 6 7 8 9
16. Performance in student teaching.   1 2 3 4 5 6 7 8 9

continued on back
How effective is your administrator in helping you to improve your teaching effectiveness?

Circle one rank on the following scale for your rating.

Very Ineffective   1 2 3 4 5 6 7 8 9   Very Effective

Comments:
APPENDIX B
DEPARTMENT OF EDUCATIONAL SERVICES
COLLEGE OF EDUCATION
MONTANA STATE UNIVERSITY, BOZEMAN 59715
January 6, 1977
Dear
In recent months you have most likely become very much aware of the increased emphasis which has been placed upon accountability of schools and schooling by the tax-paying public. Current literature points to the fact that parents expect their youngsters to be competent in basic skills as a result of schooling. Improving teacher effectiveness is timely and basic to the increased concerns of parents and school patrons.
The problem for which this survey will provide data is that of finding the appropriate basis to evaluate teacher effectiveness. The immediate need in solving this problem is for school districts to find the degree of agreement among teachers as to what are appropriate criteria for judging the effectiveness of a teacher and to compare those findings with the administrator's determination.
Enclosed you will find a survey instrument that is designed to gather data which will provide an answer to this problem. The survey contains a list of criteria for judging teacher effectiveness and directions for its completion.
Your response is essential to the study of this problem, as the data gained will reflect the thinking of people like you who must deal with this problem each day in their professional careers.
This survey questionnaire is designed to be answered in a maximum of five to seven minutes. A self-addressed envelope is provided. An early response, within your busy schedule, will be very much appreciated.
Because your name was chosen randomly, it is necessary and essential for the purpose of this study not only to get your response, but also that of your staff members. In view of this need, I am requesting your permission to send questionnaires to your staff members. Your response and the responses of your staff will be kept strictly confidential. No identification of schools or respondents will be made public in this study.
This study is being conducted under the direction of
Dr. Robert Thibeault of the College of Education, Montana State
University.
A summary statement of the questionnaire data will be made
available to you if you wish to know the results of this survey.
Sincerely yours,
Francis A. Olson
DEPARTMENT OF EDUCATIONAL SERVICES
COLLEGE OF EDUCATION
MONTANA STATE UNIVERSITY, BOZEMAN 59715
March 8, 1977
Dear
May I take this means to remind you to complete and
return the survey instrument which you received from me
a while back.
In the event that the instrument may have been misplaced or you did not receive it, I am enclosing another for your consideration. The input from you and your staff would be very helpful to this study and assuredly very much appreciated.
I am very much aware that time is a precious item in a busy administrator's day. For this consideration I also express my sincere thanks.
If you have already returned the survey instrument
to me, then kindly disregard this request.
Very sincerely yours,
Francis A. Olson
INSTRUCTIONS FOR COMPLETING THIS SURVEY INSTRUMENT
The purpose of this survey is to determine what professional educators believe are the appropriate criteria for judging the effectiveness of a teacher.
On the survey instrument, which follows the demographic information, please rate each of the items on the nine-point scale provided. Assume that adequate measures exist to measure each criterion. Try to differentiate as much as possible between items. Please rate all items and be sure not to circle more than one rank for any given item.
Use the scale to rate each of the criteria according to its impor­
tance in determining teacher effectiveness. Circle one rank for
each item. Low ranks are indicative of unimportant criteria; high
ranks, important. Five is, of course, average.
Please complete the following demographic information.

A. The class of the school district in which you are an administrator (circle one).
   First Class    Second Class    Third Class

B. The administrator position which you presently hold in your district (circle).
   Principal    Superintendent

C. Sex: ____ male or ____ female

D. Your years of administrative experience (circle one).
   1. 0-5    2. 6-10    3. 11-15    4. 16-20    5. 21-25    6. 26 and over

E. Circle the grade level or levels for which you are presently responsible.
   K 1 2 3 4 5 6 7 8 9 10 11 12

F. Please list the number of students for whom you are presently responsible.
   ________ enrolled

G. Please list the number of staff members for whom you are directly responsible.
   ________ staff members

Do you want a copy of the data summary sent to you when this study is completed?
____ yes    ____ no
SURVEY INSTRUMENT

Criteria (circle one rank for each item: 1 = Completely Unimportant, 9 = Extremely Important)

1. Willingness to be flexible, to be direct or indirect as situation demands.   1 2 3 4 5 6 7 8 9
2. Participation in community and professional activities.   1 2 3 4 5 6 7 8 9
3. Years of teaching experience.   1 2 3 4 5 6 7 8 9
4. Ability to personalize his teaching.   1 2 3 4 5 6 7 8 9
5. Knowledge of subject matter and related areas.   1 2 3 4 5 6 7 8 9
6. Extent to which his verbal behavior in classroom is student-centered.   1 2 3 4 5 6 7 8 9
7. Capacity to perceive the world from the student's point-of-view.   1 2 3 4 5 6 7 8 9
8. Civic responsibility (patriotism).   1 2 3 4 5 6 7 8 9
9. Personal adjustment and character.   1 2 3 4 5 6 7 8 9
10. General knowledge and understanding of education facts.   1 2 3 4 5 6 7 8 9
11. Amount his students learn.   1 2 3 4 5 6 7 8 9
12. Relationship with class (good rapport).   1 2 3 4 5 6 7 8 9
13. Effectiveness in controlling his class.   1 2 3 4 5 6 7 8 9
14. Extent to which he uses inductive (discovery) methods.   1 2 3 4 5 6 7 8 9
15. Influence on student's behavior.   1 2 3 4 5 6 7 8 9
16. Performance in student teaching.   1 2 3 4 5 6 7 8 9
continued on back
How effective are you in helping your teachers to improve their teaching effectiveness?

Circle one rank on the following scale for your rating.

Very Effective   1 2 3 4 5 6 7 8 9   Very Ineffective

Comments:
Permission is granted to send this questionnaire to your staff members.
____ yes