Professional Educators Agreement on Criterion for Measuring Teacher Effectiveness

by Francis Allen Olson

A thesis submitted in partial fulfillment of the requirements for the degree of DOCTOR OF EDUCATION, Montana State University. © Copyright by Francis Allen Olson (1978)

Abstract: This study investigated the criteria by which administrators and teachers employed by Montana school districts judged teacher effectiveness. It replicated a similar study carried out in the State of Delaware by Jenkins and Bausell. The data for this study were gathered by a survey questionnaire that listed criteria for judging teacher effectiveness, classified under the Mitzel Scheme. Administrators were sampled first, and each return indicated whether or not the administrator gave permission to sample the teachers on his or her staff. The sample of administrators (N=665) was followed by a sample of teachers (N=9,428), and these results were compared by analysis of variance at the .05 level of significance. The results of the study were compared with those of the Delaware study using Spearman's Coefficient of Rank Correlation. A multiple regression model was used to determine whether or not a significant relationship existed between the differences in administrators' and teachers' ratings of each criterion measure of effectiveness and the teachers' ratings of administrators. Conclusions reached are: (1) The criterion rated highest by teachers and administrators in Montana for measuring effectiveness was "classroom control," followed by "knowledge of subject matter" and "rapport with students." (2) The "amount students learn," a criterion uppermost in the minds of accountability proponents, was considerably less significant in the minds of teachers.
(3) Product criteria (measures of student learning and behavior) and process criteria (measures of teacher behavior) were rated significantly higher than presage criteria (measures of a teacher's personal or intellectual attributes). (4) Teachers' ratings of effectiveness criteria were not significantly related to how teachers viewed their administrators' effectiveness. (5) Montana and Delaware teachers agreed that process and product criteria for measuring effectiveness are considerably more important than presage criteria. (6) Neither Montana nor Delaware teachers considered "what students learn" to be as important a criterion by which to measure effectiveness as other criteria. This view differs considerably from that of accountability proponents. One of the more important recommendations coming out of this study is to determine whether or not parents and other constituents of Montana served by administrators and teachers are in agreement among themselves, and with educators, on what types of criteria are most important by which to judge effectiveness. It would be important to know whether discipline (effectiveness in controlling one's class) would remain the number-one-rated criterion.

© 1979 FRANCIS ALLEN OLSON. ALL RIGHTS RESERVED.

PROFESSIONAL EDUCATORS AGREEMENT ON CRITERION FOR MEASURING TEACHER EFFECTIVENESS, by FRANCIS ALLEN OLSON. A thesis submitted in partial fulfillment of the requirements for the degree DOCTOR OF EDUCATION. Approved: Graduate Dean. MONTANA STATE UNIVERSITY, Bozeman, Montana. December 1978.

ACKNOWLEDGMENTS

Special acknowledgment goes to Dr. Robert Thibeault, my advisor and committee chairman. He has graciously provided many hours of guidance and encouragement throughout this study. Acknowledgment is made to my reading committee, Dr. Albert Suvak and Dr. Robert Van Woert, and to other members of my committee, Dr. Douglas Herbster, Dr. Del Samson, and Dr. Alvin Fiscus. Acknowledgment is also made to Dr.
Eric Strohmeyer for his help and support. Special thanks is given to Mrs. Louise Greene, who spent many hours typing this dissertation. Most of all I wish to thank my wife, Alyce, for the many hours that she spent helping with the recording of data, running errands, and providing needed patience and encouragement.

TABLE OF CONTENTS

LIST OF TABLES

CHAPTER I. INTRODUCTION
  STATEMENT OF THE PROBLEM
  NEED FOR THE STUDY
  PURPOSE OF THE STUDY
  QUESTIONS TO BE ANSWERED
  LIMITATIONS OF THE STUDY
  DEFINITIONS OF TERMS
  SUMMARY

CHAPTER II. REVIEW OF RELATED LITERATURE
  INTRODUCTION
  NEED FOR EVALUATING TEACHER EFFECTIVENESS
  FORCES WHICH CREATED THE NEED TO EVALUATE TEACHER EFFECTIVENESS
  RENEWED EMPHASIS PLACED UPON MEASURING TEACHER EFFECTIVENESS
  TEACHER PARTICIPATION IN THE EVALUATION OF THEIR SERVICES
  STUDIES OF TEACHER EFFECTIVENESS
    Product criteria
    Process criteria
    Presage criteria
  STATUS OF PRESENT METHODS OF APPRAISING TEACHER PERFORMANCE
  SUMMARY

CHAPTER III. PROCEDURES
  DESCRIPTION OF THE POPULATION
  SAMPLING PROCEDURE
  METHOD OF COLLECTING DATA
  METHOD OF ORGANIZING DATA
  HYPOTHESES TESTED
    HYPOTHESIS I
    HYPOTHESIS II
    HYPOTHESIS III
    HYPOTHESIS IV
    HYPOTHESIS V
    HYPOTHESIS VI
    HYPOTHESIS VII
  METHOD OF ANALYZING DATA
  PRECAUTIONS TAKEN FOR ACCURACY
  SUMMARY

CHAPTER IV. ANALYSIS OF DATA
  INTRODUCTION
  MEAN RATINGS OF CRITERIA
  THE TESTING OF HYPOTHESES
  SUMMARY

CHAPTER V. SUMMARY, CONCLUSIONS AND RECOMMENDATIONS
  SUMMARY
  CONCLUSIONS
  RECOMMENDATIONS FOR FURTHER STUDY

LITERATURE CITED
APPENDIX A
APPENDIX B

LIST OF TABLES

TABLE I. CATEGORIES OF ADMINISTRATOR POPULATION
TABLE II. TEACHER SAMPLE CHARACTERISTICS
TABLE III. CATEGORIES OF THE ADMINISTRATOR SAMPLE
TABLE IV. NUMBER AND PER CENT OF TEACHER RESPONDENTS BY DISTRICT AND SEX CATEGORIES
TABLE V. YEARS OF EXPERIENCE OF ADMINISTRATORS
TABLE VI. THE YEARS OF EXPERIENCE OF TEACHER RESPONDENTS BY DISTRICT CLASSIFICATION AND SEX
TABLE VII. PERCENTAGE OF TEACHERS BY DISTRICT CLASSIFICATION
TABLE VIII. NUMBER OF TEACHER RESPONDENTS BY GRADE LEVEL CLASSIFICATION
TABLE IX. RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER EFFECTIVENESS FOR ADMINISTRATORS OF COMBINED CLASSES OF SCHOOL DISTRICTS
TABLE X. RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER EFFECTIVENESS FOR TEACHERS OF COMBINED CLASSES OF SCHOOL DISTRICTS
TABLE XI. RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER EFFECTIVENESS FOR TEACHERS OF CLASS I SCHOOL DISTRICTS
TABLE XII. RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER EFFECTIVENESS FOR TEACHERS OF CLASS II SCHOOL DISTRICTS
TABLE XIII. RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER EFFECTIVENESS FOR TEACHERS OF CLASS III SCHOOL DISTRICTS
TABLE XIV. COMBINED MEANS OF RATINGS OF ADMINISTRATORS AND TEACHERS OF MONTANA
TABLE XV. ADMINISTRATORS VERSUS TEACHERS: LEAST SQUARE MEANS AMONG THE THREE SUB-TESTS BETWEEN ADMINISTRATORS AND TEACHERS
TABLE XVI. LEAST SQUARE MEANS OF COMPARISON OF TEACHERS OF THREE CLASSES OF DISTRICTS: TEACHERS OF FIRST, SECOND, AND THIRD CLASS SCHOOL DISTRICTS
TABLE XVII. A COMPARISON OF LEAST SQUARE MEANS FOR MALE AND FEMALE TEACHERS
TABLE XVIII. LEAST SQUARE MEANS FOR ELEMENTARY VERSUS SECONDARY TEACHERS FOR THREE SUB-GROUPS OF TEACHER EFFECTIVENESS CRITERIA
TABLE XIX. LEAST SQUARE MEANS OF SIX EXPERIENCE CLASSES FOR TEACHERS
TABLE XX. MULTIPLE REGRESSIONS AMONG THE 16 INDEPENDENT VARIABLES OF TEACHER AND ADMINISTRATOR DIFFERENCES AND THE DEPENDENT VARIABLE, TEACHER RATING OF ADMINISTRATOR EFFECTIVENESS
TABLE XXI. MEAN RATINGS AND RANK ORDER OF THE 16 CRITERIA, DELAWARE STUDY
TABLE XXII. CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA TEACHERS AND ADMINISTRATORS
TABLE XXIII. CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA AND DELAWARE TEACHERS
TABLE XXIV. CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA ADMINISTRATORS AND DELAWARE TEACHERS
TABLE XXV. CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA TEACHERS OF FIRST AND SECOND CLASS SCHOOL DISTRICTS
TABLE XXVI. CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA TEACHERS OF FIRST AND THIRD CLASS SCHOOL DISTRICTS
TABLE XXVII. CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA TEACHERS OF SECOND AND THIRD CLASS SCHOOL DISTRICTS
TABLE XXVIII. T-TEST OF SIGNIFICANCE OF THE MEAN DIFFERENCE FOR TEACHER AND ADMINISTRATOR

CHAPTER I

INTRODUCTION

Defining the effective teacher has been a continuing process carried out by researchers for many years. What research has said about teacher effectiveness is that it is not the clearly defined trait that many would have us believe. Research has indicated that teacher performance is one of the most complex human phenomena that researchers have been privileged to study (Ellena, 1964). Teacher effectiveness has continued to be a subject of much interest to educators and, more recently, to the public whom they serve.
Interest in teacher effectiveness by the public has arisen from the current viewpoint of accountability in education, which has focused on the people's right to know what is taking place in the schools (House, 1973). Addressing the February, 1975 meeting of the American Association of School Administrators, which was held in Dallas, Texas, Frank Gary stated:

Measuring the effectiveness of teaching . . . is still a topic of interest to the public and school administrators. There has not been too much progress in the area of measuring practitioner effectiveness because of the educational stance that it is impossible to make valid judgments about anything as complex and personal as teaching ability (Gary, 1975).

Biddle and Ellena point out in the preface to their review of research that:

Probably no aspect of education has been discussed with greater frequency, with as much deep concern, or by more educators and citizens than has that of teacher effectiveness—how to define it, how to identify it, how to measure it, how to evaluate it, and how to detect and remove obstacles to its achievement (Biddle and Ellena, 1964).

Research has said that teacher effectiveness cannot be summarized in a few words. However, many people who have had contact with schools—whether as students, parents, or interested citizens—feel qualified to make dogmatic pronouncements about teacher effectiveness. Teacher effectiveness has been a matter of long concern in all efforts to improve education. Before the turn of the century, studies were conducted in this country which attempted to isolate the factors that contributed significantly to teaching effectiveness. One bibliography alone lists 1,006 researches made in this area from 1890 to 1949 (Domas and Tiedeman, 1950). It has been pointed out that teaching must be defined before it can be evaluated and effectiveness predicted.
From Ellena's summary of teacher effectiveness studies, it was evident that part of the difficulty associated with the prediction of teacher effectiveness had arisen from the fact that teaching was described differently by different people, and the teaching act varied from person to person and from situation to situation. One of the most difficult problems in teacher effectiveness studies has been that researchers had to assume that effectiveness was either a statement about an attribute of the teacher, a statement about an attribute of a teacher in a particular teaching situation, or a statement about the results which come out of a teaching situation (Ellena, 1961). Gage, who searched for a scientific basis to describe teacher effectiveness, pointed out that during most of the history of education, "what knowledge, understanding, and ways of behaving should teachers possess" has been answered through raw experience, tradition, common sense, and authority (Gage, 1972). He defined research on teacher effectiveness as the study of the relationship between teacher behaviors and characteristics and their effects on students. In this relationship, teacher behavior was considered an independent variable (Gage, 1972). Flanders has stated that "knowledge about teaching effectiveness consists of relationships between what a teacher does while teaching and the effect of these actions on the growth and development of his pupils." From his point of view, an effective teacher interacted skillfully with pupils in such a way that they learned more and liked learning better, compared with pupils of the ineffective teacher. He described teaching effectiveness as being concerned with those aspects of teaching in which the teacher "has direct control and current options" (Flanders, 1970).
A number of researchers, as well as some professional associations, supported the position that the ultimate criterion by which to judge a teacher's competence was the impact that the teacher exerted upon the learner to bring about behavioral change in the learner. Reluctance in accepting pupil change as the chief criterion of teacher effectiveness has arisen both from the technical problems in assessing learner growth and from philosophical considerations (Travers, 1973). The chief concern among the technical problems in assessing learner growth has centered on the adequacy of measures for assessing a wide range of pupil attitudes and achievement at different educational levels and in diverse subject-matter areas. Philosophical differences have centered upon the selection of desirable changes to be sought in learners and value differences observed in the preferred methodologies of teacher competence researchers (Travers, 1973). Research which has tried to determine that teaching effectiveness has something to do with what a teacher is has assumed that teacher success can be predicted in terms of individual teacher personality traits. Both laymen and the majority of professional educators have clung to the idea that ability to teach is correlated in some way with such personality factors as a sense of humor, empathy, industriousness, willingness to cooperate, physical attractiveness and health, love of knowledge, creativity, and so forth. To some extent, nearly every teacher evaluation program in existence has taken these factors into consideration. Yet numerous research studies have failed to find a significant cause-and-effect relationship between traits and teaching effectiveness. Research has shown that we tend to place the highest values on those traits we ourselves possess or think we possess (Brighton, 1965).
Barr cautioned those who pursued the traits approach to describing teacher effectiveness that personal qualities such as considerateness, cooperativeness, and ethicality, which are used to assess teacher effectiveness, are not directly observable. These qualities are inferences drawn from data. He described this concern as follows:

These data may be of many sorts arising from the observation of behavior, interviews, questionnaires, inventories, or tests. Whatever the source of information, judgments about the qualities are inferences, and subject to all the limitations associated with inference making, including the accuracy of the original data upon which the inferences are based, and the processes of inference making (Barr, 1961).

Barr related that if one has considered the qualities of the individual in terms of characteristics of performance, he has utilized a behavioral approach to assess teacher effectiveness. He noted further that those who have interpreted personality in behavioral terms in assessing teacher effectiveness have attempted to integrate the concept of personality with that of methods. Historically, this concept has been considered an important aspect of teacher effectiveness. The problem which has been encountered in the behavioral approach to assessing teacher effectiveness is that of choosing and defining the personal qualities that have appeared to be pertinent to teacher effectiveness. The literature has given one the impression that the choice of personal qualities used to assess teacher effectiveness has been based very much upon personal preference (Barr, 1961). Barr stated:

If judgments about teachers are based upon observations of teachers' behaviors, how do we know what to look for and what to ignore? Whether a behavior, or aspect of behavior, is pertinent to some particular quality depends on how the quality is defined. Many subtle shades of meanings will probably need to be considered (Barr, 1961).
The personality of the teacher has been a significant variable in the classroom. Many have argued that the educational impact of a teacher is not due solely to what he knows or does, but to what he is as well. After an in-depth study of this problem, Getzels and Jackson concluded that despite the critical importance of the problem and a half-century of prodigious research effort, little is known for certain about the nature and measurement of teacher personality, or about the relation between teacher personality and teacher effectiveness (Averch and others, 1971) (Lewis, 1973) (Gage, 1972). In his summary of teacher effectiveness studies, Ellena pointed out that teachers differ widely with respect to maturity, intellectuality, personality, and other characteristics. The demands of the subjects they teach and the scope and the structure of the objectives to be achieved all contribute to diversity. In addition, Ellena noted that local control has exerted its influence toward diversity. For almost any goal one might choose, it was possible to find a continuous spectrum of values, opinions, and goals. The notion of the "good teacher," described by Ellena as basic to the study of teacher effectiveness, has turned out to be almost as vague and diffused as the range of human experiences relative to teaching (Ellena, 1961). Rabinowitz and Travers summarized this problem, as repeated by Ellena, as follows:
Most likely the reason that the effective teacher has not existed pure and serene has been due to the fact that under local control of schools, the teaching act is free to vary from school system to school system. The job of the teacher thus varies according to the location of the job. The particular job a teacher has been expected to perform has varied from grade to grade. Because, definitions of teacher func­ tions has varied by grade level and school system little headway has been made in solving the problem of successfully measuring teacher effectiveness in spite of the immense number of studies that have been conducted (Ellena, 1961). To make sense of the diverse inquiries that have been undertaken in the name of teacher effectiveness, Travers related that it has become necessary for one to make "distinctions in purposes". He pointed out that the administrator has been looking for knowledge of teacher effectiveness in order to make a better decision in situations such as hiring or firing a teacher. The instructional supervisor or teacher wanted to know what instructional procedures are most likely to prove useful in achieving certain instructional ends with given students. Researchers' purposes, according to Travers, included: 8 satisfying.a desire to describe accurately what teachers do, searching for associations between theoretically or empir­ ically derived variables and learning, and demonstrating the power of a given factor or instructional operation to make a ' practical difference upon the outcome sought (Travers, 1973). Those who have been interested in teacher effectiveness have had different purposes and consequently have varied their interpretations of the problem. Some who have investigated the problem of teacher effectiveness would have been satisfied to know whether or not a teacher was getting desired results with the results indicating effec- . tiveness,' not the process used. 
Others wanted to know how to increase the probability of attaining desired results. Researchers who were interested in process were searching for lawful teaching behavior, j^.je. , validated procedures for achieving instructional ends. Their assump­ tion was that effective teaching would have been recognized when lawful relationships were established between instructional variables and learner outcomes— that certain procedures in teaching would have, within certain probability limits, been labeled as effective or ineffective (Travers, 1973). To date there are no such laws, only a few leads or practices that are more likely than others to maximize the attainment of selected instructional ends. Researchers such as Gage (1968) had hoped to estab­ lish scientific laws for teaching; other researchers agreed with Dewey (1929), who held that it was an error to believe that scientific findings and conclusions from laboratory experiments to such activities as 9 helping the teacher make his practice more intelligent, flexible ahd better adapted to dealing with individual situations (Travers, 1973). Bolton suggested that the purposes of teacher evaluation vary somewhat from school district to school district. Included were many of the following: (a) to improve teaching . . . by determining what actions can be taken to improve teaching systems, the teaching environment, or teacher behavior, (b) to supply information for modification of assignments, (c) to protect individuals and the school system from incom­ petence , (d) to reward superior performance, (e) to validate the selection process, and (f) to provide a basis for the teacher's career planning and growth and development. Bolton summarized by stating, "All of these purposes might be expressed by saying: The purpose of teacher evaluation is to safe­ guard and improve the quality of instruction received by students" (Bolton, 1973). 
Wilson, in presenting a paper to the annual convention of the National School Boards Association, April 17-22, 1975, described the purposes of evaluation as follows:

Before we come to grips with the methods to be used in evaluating teachers, there must be a clear understanding of the purpose for the evaluation in the first place. As a superintendent of schools it is clear to me that teachers are evaluated for two major reasons. First, teacher evaluation takes place for the specific purpose of improving the quality of instruction. The focal point of all education is the learner . . . . The second major reason for teacher evaluation is to identify those staff members who are perpetrating such crimes against youngsters that their removal from the classroom and from the profession is the major objective. In other words, the evaluation process is used to document teacher ineffectiveness so that termination can be accomplished (Wilson, 1974).

From the standpoint of the local school official, Ellena pointed out that the extent to which any procedure has been used in teacher evaluation depended on how much and what kind of evidence was desired in making decisions about local school personnel. These concerns were for immediate and self-terminating information. There was no concern from the local standpoint about adding to the fund of knowledge about teacher effectiveness to the extent that it could be predicted and explained accurately (Ellena, 1961). Travers, in his review on teacher effectiveness, related that decisions requiring judgments about teachers have been made by many—teacher educators, school personnel officers, administrators, supervisors, and teachers. Wise choices about teachers have been made when adequate data were at hand for judging. He added:

Complete data have typically not been available; possibly because those who have been making decisions have not given enough thought to what is required for making warranted decisions about a teacher and, accordingly, have not arranged for the collection of data.

A second reason that data are not available, according to Travers, is that researchers have not pursued their investigations with awareness of the practical decisions that must be made by those working with teachers (Travers, 1973).
He added: Complete data have typically not been available; possibly be­ cause those who have been making decisions have not given enough thought to what is required for making warranted decisions about a teacher and, accordingly have not arranged for the collection of data. A second reason that data are not available, according to Travers, is that researchers have not pursued their investigations with awareness of the practical decisions that must be made by those working with teachers (Travers, 1973). 11 The school official, as suggested by Ellena, has sought "to determine how well a teacher performed his job in terms of certain speci­ fied and more often unspecified criteria." He has not been concerned with whether or not the job he asked the teacher to perform was' repre­ sentative of the class of such.jobs or if the teacher performed the class of jobs well. On the other hand, the researcher, according to Ellena, has been concerned with how well a teacher could perform, "in any of a class of jobs which share many common characteristics, as well as with identifying these common characteristics" (Ellena, 1961). The difference between a school official's concern and a . researcher's concern, as pointed out by Ellena, has several implications. For example, the overall or intuitive ratings may be used by a school official to help make a general assessment of how well a teacher has performed and the general assessments thus gained has provided relevant and useful information for the immediate school situation. From Ellena's point of view, overall ratings have not generally been useful or relevant because such ratings have low reliabilities, and have not been consistent with the purposes of researchers who wish to predict and to describe. 
Ellena has evaluated ratings as follows:

An overall rating for research purposes implicitly assumes that when a teacher receives a rating of 80 per cent effective, it means that a teacher is as effective in doing the same things as every other teacher who is rated as 80 per cent effective by other school officials. An overall scale does not show that teachers are effective in doing the same things. It only shows that raters thought teachers were effective in doing whatever it was they did. An overall rating is simply a means of letting the criterion which the researcher wants to predict vary in its meaning, without showing that it so varies. It is impossible to predict consistently a criterion whose meaning constantly shifts (Ellena, 1961).

As pointed out by Ellena, officials of local school districts and researchers have had different purposes for describing teacher effectiveness. As a result of this difference in purpose, researchers experienced the problem of predicting a criterion of teacher effectiveness that was relevant to local school board needs. Ellena described this problem by stating:

If a researcher uses any procedure that permits the definition of teacher effectiveness to vary, he will not be successful in predicting. If he does not let the definition vary, he places himself in the position of having to specify what the function of teachers should be. If he so specifies, he either usurps the function of local school districts in deciding what the functions of a local teacher should be, or else runs the risk of predicting a criterion that some local school boards considered inconsequential or irrelevant to how they define teacher performance (Ellena, 1961).

Travers has stated in his Second Handbook of Research on Teaching that "professionals and laymen alike are unhappy with what is loosely called the evaluation of teachers" (Travers, 1973).
He summarized the results of national surveys which indicated that the reasons for dissatisfaction with most evaluations are: (1) lack of confidence in the school system's evaluation program, (2) infrequent observation of tenured teachers, (3) inaccurate evaluation, (4) administrative staff have little time to effectively evaluate and make judgments of staff, and (5) evaluations are poorly communicated to others (Travers, 1973).

Many considerations besides teacher effectiveness entered into decisions such as whether to hire, to grant tenure, or to fire teachers. Travers pointed out that the practice of assessing a teacher without having had valid data regarding his ability to effect changes in pupils seemed wanting. In contrast, information about the teacher's personal characteristics, relations with other adults, appearance, political attitudes, etc., has been plentiful and easily acquired. Appraisals of a teacher on the basis of factors unrelated to the progress of pupils have allowed the value preferences of individuals and local communities to operate (Travers, 1973).

Bolton expressed the view that judgments regarding teachers are made inevitably, and that if the criteria were appropriate and the data were sound, the resulting judgments would be useful. He stressed the fact that in evaluating teachers, judgments should be made in relation to objectives rather than the personal worth of people. He pointed out that evaluation should establish whether the teacher reached various standards, not whether the teacher did better or worse than other teachers. He emphasized the idea that teachers should be helped to improve their contribution to the learning of school children (Bolton, 1973).

A review of research supported the position that the more widely used criteria for assessing teacher competency included student ratings, self ratings, administrator ratings, and peer ratings.
Assessments of classroom environment, personal attributes, performance tests, alternative criteria (contract plans using student gain), and systematic observations provided additional criteria for judging teacher effectiveness. Travers reviewed the work of McNeil and Popham, who have cautioned in their assessment of teacher competence: Any single criterion of effectiveness is confounded by a number of factors. One factor stems from who is doing the measuring; a second is the kind and quality of instrument used; a third is faithfulness in applying the instrument as its designer intended; and a fourth is the purpose for applying the criteria, that is, how the data are used (Travers, 1973).

Research has generally supported the conclusion that effectiveness in teaching is best evidenced by criterion measures which detect pupil growth as a result of the teacher's instruction. However, as pointed out by Wolf, teachers are not fond of evaluation. He stated their concern as follows: . . . They suspect any measure designed to assess the quality of their teaching, and any appraisal usually arouses anxiety. If teachers are to submit to an assessment of their performance, they would probably like reassurance that the criteria and method of evaluation that are used would produce credible results (House, 1973).

According to Wolf, teachers have believed that the standards for evaluating what is effective teaching are too vague and ambiguous to be worth anything. They have felt that current appraisal techniques fall short of collecting information that accurately characterizes their performance. They perceived the ultimate rating as depending more on the idiosyncrasies of the rater than on their own behavior in the classroom. As a result, teachers saw nothing to be gained from evaluation.
Statement of the Problem

The emphasis placed upon "accountability" by the public during the decade of the 1960's had intensified the search by school districts in the 1970's to find improved ways to evaluate teacher effectiveness. The problem inherent in the search for improved ways to evaluate teacher effectiveness was that of selecting suitable criteria, upon which both administrators and teachers agreed, which truly measured teacher effectiveness. School districts facing this problem needed to know how much agreement existed between teachers and administrators on effective criteria and to determine what kinds of criteria were appropriate for judging the effectiveness of teachers. If it was determined that administrators and teachers varied greatly in their views of the perceived importance of criteria for measuring teacher effectiveness, then continuation of the evaluation process would have resulted in increased sensitivity and mistrust on the part of teachers toward administrators who judged teaching effectiveness. It was necessary for school district administrators to include the teacher in determining the criteria used to evaluate their own effectiveness.

Need for the Study

It is evident from a review of the literature that the task of identifying effective teachers and effective teaching is crucial to teacher education, teacher selection, teacher performance, and ultimately, to the survival of society. Crucial as this need is, and in view of the enormous amount of research directed at identifying effective teaching, it is disturbing to note that there has been no general agreement upon what constitutes effective teaching or standards of teaching effectiveness. A substantial amount of pressure has been placed upon school districts to evaluate teaching effectiveness because of the accountability impact. This practice is a sensitive issue to teachers. Research seems to bear out the fact that teachers should indeed be concerned.
Very evident in the research reviewed is the need to involve more than just the administrator or supervisor in the evaluation of teaching effectiveness. The literature pointed out the fact that teachers should be involved in the evaluation process.

Purpose of the Study

The purpose of this study was to determine what factors were important from the teacher's viewpoint in identifying effective teaching, to compare the findings with the administrator's viewpoint, and to determine whether or not there was agreement on the criteria for judging effective teaching. By first determining whether or not teachers and administrators agreed upon the criteria for judging effective teaching, this study provided a means whereby some conclusions could be reached by school districts concerning their efficiency in meeting the public demand of the best possible education for the tax dollar.

Questions to be Answered

The questions to be answered by this study were:
1. Is there agreement among teachers in Montana on the criteria that describe the effective teacher?
2. Is there agreement among school administrators in Montana on the criteria that describe the effective teacher?
3. What is the degree of agreement between administrators and teachers in Montana schools on the criteria that describe the effective teacher?
4. What is the degree of agreement between elementary and secondary teachers on the criteria that describe the effective teacher?
5. Is there a relationship between the criteria differences as perceived by teachers and their administrators and the rated effectiveness of the administrator in helping the teacher to improve his effectiveness?

Limitations of the Study

The limitations of this study were:
1. The study was limited to the geographic area of the State of Montana.
2. The elementary and secondary teacher population of Montana schools comprised the teacher population from which the sample was drawn.
3.
The district superintendents and principals of Montana schools comprised the administrator population from which a sample was drawn.

Definition of Terms

Terms defined for the purpose of this study were:
1. Teacher. A person who is certificated to teach in Montana and who will be under contract to teach during the 1976-1977 school year in any school district in Montana.
2. Administrator. A person who is certificated by the State of Montana for the purpose of administering a school or school district and who is employed either as a principal or superintendent in any school or school district in Montana during the 1976-1977 school year.
3. Teacher Effectiveness. This term is the degree of success a teacher achieves in attaining the desired outcomes that a school district wishes to obtain in the teaching-learning environment.
4. Evaluative Criteria. Evaluative criteria are measures of teacher effectiveness.
5. Types of Criteria. Criteria are typed in accordance with Mitzel's scheme of process criteria, product criteria, and presage criteria. Process criteria are measures of teacher effectiveness based upon classroom behavior, either the teacher's behavior, his students' behavior, or the interplay of both. Product criteria are measures of teacher effectiveness in terms of measurable change in student behavior as a product of teaching. Presage criteria are measures of teacher effectiveness based upon a teacher's personality or intellectual attributes, performance in training, years of experience, tenure, etc.

Summary

Defining the effective teacher has been a continuing process carried out over many decades by researchers and investigators. A review of the literature indicated that the purpose of most completed research has been to improve teaching performance in order to provide better education for children.
The process of identifying the effective teacher in earlier times depended upon a subjective evaluation of the teacher's personality traits and behavior in light of some particular authority's judgment as to what was acceptable or unacceptable. This process was usually accomplished by the use of some type of rating instrument. In more recent years the use of the rating instrument received severe criticism by both teachers and administrators because both believed its primary use by evaluators was for the purpose of dismissing teachers. As a result of the criticism directed at the use of the rating instrument, the emphasis in evaluation shifted from the subjective approach to a more objective approach, which resulted in the positive practice of identifying the strengths and weaknesses of teachers. The purpose of evaluation became that of correcting weaknesses and reinforcing strengths of the teacher. The emphasis became one of measuring teacher effectiveness centering on product measurement through previously agreed upon objectives of instruction.

To evaluate a teacher's performance, it became necessary for school administrators and teachers to determine the characteristics of the effective teacher and the ingredients of effective instruction. The problem for administrators and teachers was that of agreeing upon the criterion measures of effective teaching. Traditionally the evaluation of a teacher's effectiveness was conducted by the teacher's immediate supervisor, usually the principal. Other forms of teacher evaluation which emerged more recently included peer evaluation, self-evaluation, pupil evaluation, or combinations of these. There was by no means total agreement among the school districts of the nation that any or all of the newer trends in the evaluation process contained total answers to the teacher effectiveness problem.
One of the biggest problems encountered, regardless of the approach a school district followed in evaluation, was that of defining the criteria by which teaching and teachers were to be assessed.

CHAPTER II

REVIEW OF RELATED LITERATURE

Introduction

This review was organized into four elements of studies and research relating to the development and change that has taken place in determining teacher effectiveness. The initial section concerns the forces at work which created the need to evaluate teacher effectiveness. The second portion relates the status of present appraisal methods used in teacher evaluation. The third section reviews experimental studies of teacher effectiveness that illustrate the problems inherent in a study of this nature. Some specific trends, criterion models, and design models were reviewed and described. Areas of emphasis in the review of literature include:
1. Status of present methods of evaluating teacher performance.
2. Studies of teacher performance.

For the purpose of statistical comparison, this study followed closely a study reported in 1974 by Jenkins and Bausell on how teachers view the effective teacher (Jenkins and Bausell, 1974). The purpose of their study was to consult teachers and administrators regarding their views on teacher effectiveness, in particular, on criteria they used to evaluate their own effectiveness. To provide some structure for such an inquiry, Jenkins and Bausell developed a survey instrument which was based on the category labels of product, process and presage employed by Harold Mitzel in his contribution to the 1960 edition of the Encyclopedia of Educational Research. A more elaborate description of Mitzel's categories of teacher effectiveness criteria appears later in this chapter, and a copy of the instrument used by Jenkins and Bausell appears in Appendix A.
Briefly described, product criteria in Mitzel's scheme are employed where a teacher is judged on the basis of a measurable change in what is viewed as his product, student behavior. Process criteria are used when a teacher's evaluation is judged by either his behavior in the classroom, that of his pupils, or the interplay of both teacher and student behavior. Presage criteria are used if a teacher's evaluation is judged in terms of the teacher's personal or intellectual attributes, his performance in training, his knowledge or achievement, or other pre-service characteristics (Mitzel, 1960).

Jenkins and Bausell administered a survey instrument which included an assortment of product, process and presage criteria to a random sample of all public school teachers and administrators in the State of Delaware. Respondents, who numbered two hundred sixty-four (N = 264), were instructed to assume that adequate measures were available to measure each of the criteria listed. The instructions listed were replicated for this study, as well as the continuum used for responses. This information appears in the instrument which is located in Appendix A.

The criteria and the ratings given them by Delaware teachers and administrators appear in Table XXI. When the responses of elementary teachers, middle school teachers, secondary teachers and principals were compared, the results indicated that although these groups might be expected to have different biases, their ratings were remarkably similar. The average correlation between these groups was .93 (Jenkins and Bausell, 1974). Perhaps the most revealing aspect of the survey, according to Jenkins and Bausell, was the rating given to the criterion, Amount Students Learn. This criterion in the Delaware study was not seen as particularly important in judging teacher effectiveness relative to the other criteria rated. The implication of the rating received by Amount Students Learn for accountability proponents should be obvious.
While those in the accountability movement stressed student learning as the primary basis for educational decision making, educational practitioners, at the same time, affirmed their preference for other criteria as indicated by the Delaware study. Results of their study and comparison tables with the Delaware study are listed in Chapter IV (Jenkins and Bausell, 1974).

Need for Evaluating Teacher Effectiveness

The study of the effectiveness of a teacher, and particularly how teachers themselves viewed the effective teacher, gained increased momentum in recent years with the emphasis placed upon "accountability" by the soaring cost of education in the 1960's. The cry for accountability intensified the search in the 1970's for improved ways to evaluate and to standardize these procedures. Impetus for this search arose from the needs of two groups: teachers, on the one hand, who sought the security of fair objective standards of evaluation; and the public, on the other hand, who sought assurance that its tax dollar was well spent (Oldham, 1974). Because teacher accountability remained the center of debate, any discussion of the topic turned sooner or later to the issue of teacher effectiveness. For the teacher, the idea of accountability quickly translated into an assessment of the quality of his instruction and the related necessity of selecting criteria by which one would judge his effort. Because the accountability movement centered on teacher effects, it seemed imperative that teachers be consulted regarding their views on teacher effectiveness, and particularly upon the criteria they used to evaluate their own effectiveness (Jenkins and Bausell, 1974).
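The agreement between groups described above was expressed as a rank correlation (the Delaware groups averaged .93), and the present study used Spearman's Coefficient of Rank Correlation for its comparisons. As a minimal illustration only, not drawn from the study's data, the coefficient can be computed from two groups' mean criterion ratings; the rating values below are hypothetical.

```python
def spearman_rho(x, y):
    """Spearman's rank correlation for two equal-length lists with no tied values.

    rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference
    between the ranks an item receives in each list.
    """
    n = len(x)

    def ranks(values):
        # order[pos] = index of the item holding ascending position pos
        order = sorted(range(n), key=lambda i: values[i])
        r = [0] * n
        for pos, i in enumerate(order):
            r[i] = pos + 1  # ranks run 1..n
        return r

    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - (6 * d2) / (n * (n ** 2 - 1))


# Hypothetical mean ratings (1-5 scale) of five criteria by two groups
teachers = [4.8, 4.5, 4.4, 3.9, 3.1]
admins = [4.6, 4.7, 4.2, 3.8, 3.0]
print(round(spearman_rho(teachers, admins), 2))  # prints 0.9
```

A coefficient near 1 indicates the two groups ranked the criteria in nearly the same order, which is the sense in which the Delaware groups' ratings were "remarkably similar."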
Forces Which Created the Need to Evaluate Teacher Effectiveness

In addressing the National Association of Secondary School Principals' annual convention in Anaheim, California, in March of 1972, Governor Ronald Reagan, who was then the Governor of California, referred to the growing need of public education to become more accountable in the decade of the seventies. To "re-establish the public's confidence in education and our school system" was described by Governor Reagan as yet another responsibility the public had given to its education system. Governor Reagan described the public's eroding confidence in education, although education traditionally had been America's major public priority. Of the reasons described by him as contributing to the eroding public confidence, the crisis stemming from financial problems and the feeling that people had reached the limit of their ability to pay higher taxes seemed most paramount. How this mood affected education was described by Governor Reagan in his statement: However unjustified educators feel the attitude may be, there is a feeling among our people that our schools are not doing all that they should, or doing it as efficiently and as economically as they could (Reagan, May 1972).

The implications that Governor Reagan's address held for measuring teacher effectiveness as one way to meet the public's demand for accountability are summarized in part of his address as follows: We must develop ways to evaluate objectively the performance of teachers, to find the best, and to reward them for superior performance. In California last year, we passed legislation to require evaluation of teacher performance. You can probably guess the result. The deadline for conforming to this new law had to be postponed. Because we have promoted by seniority alone for so long, we have had to start from the basics to determine just what should be measured in evaluating teachers and how to measure it. . . .
However difficult it may be, we are determined to develop fair, realistic and reasonably flexible methods of measuring teacher performance (Reagan, May 1972).

Herman supported the primary reason expressed by Governor Reagan: that the education institutions of this nation were besieged by internal and external forces demanding that these institutions be held accountable and show evidence of having used the taxpayers' money wisely before asking for additional money. Herman took the position that one of the most basic elements in accountability was staff evaluation. This element, according to Herman, dealt with definitions of what we were doing, who was responsible for doing it, and how we measured the effectiveness of the work assigned each individual within the program. Herman described two basic ideas that needed to be included within each district's plan of evaluation regardless of the ultimate number of personnel involved in the evaluation. These were described as: (1) a self-evaluation must be done by the employee, and (2) the employee's immediate supervisor has to arrive at judgments based upon his evaluations when administrative decisions, such as whether or not to grant a teacher tenure, needed to be made (Herman, 1973).

Ornstein and Talmage have stated that the concept of accountability was borrowed from management. They described the concept of accountability applied to education as . . . "holding some people (teachers or administrators), some agency (board of education or state department of education), or some organization (professional organization or private company) responsible for performing according to agreed-upon terms" (Ornstein and Talmage, 1974). In the past, so stated Ornstein and Talmage, students alone were held accountable for specific objectives in terms of student changes in achievement and behavior.
According to Ornstein and Talmage, most people believe that everyone, including teachers and administrators, should be held accountable for their work. What many educators objected to, and even feared, was the oversimplified idea of accountability as the sole responsibility of the teacher or principal. Accountability should have included not only teachers and administrators but also parents and community residents, school board members and taxpayers, government officials and business representatives, and, most importantly, the students. Ornstein and Talmage summarized their concern about the concept of accountability as an idea which was spreading throughout the country regardless of the fact that there was no evidence that it would reform the schools. One of the major difficulties which seemed to plague the accountability movement was that of measuring learning (Ornstein and Talmage, March 1974).

One process which the call for accountability in the seventies forced upon some school districts was termed management by objectives (MBO). A number of school districts turned to this new concept of management, alternately referred to as management by mission, goals management, and results management, in the hope that because the concept had been used successfully in business and industry for more than a decade, it would likewise prove successful for school districts. Although MBO and accountability have been frequently teamed in the literature and in school district improvement efforts, they have not been considered as generic teammates. MBO preceded the accountability-in-education movement by at least a decade. The term management by objectives was first used by Drucker in his book Practice of Management in 1954. McGregor of M.I.T. and Likert of the University of Michigan had used it to justify the application of findings in behavioral research to the business situation.
Since then, results management has been widely installed throughout the United States and other countries, notably Great Britain, where business, industry, and government have found it a productive way of managing their enterprises (Read, March 1974).

Read listed several administrative practices which he felt MBO would strengthen. Read stated that successful implementation of MBO would . . . "eliminate the tendency to evaluate personnel in terms of their personality traits; substituting instead, their performance in terms of results" (Read, 1974). For the purpose of this paper the practice of determining teacher effectiveness seemed most appropriate.

As pointed out by Howard, one should not be led to believe that the idea of accountability, the adoption of business practices in education, is new, that it has just been discovered by some of our brighter, abler, and more responsible people in education (Howard, 1974). He noted that in the early 1900's we were blessed in having a number of educators who, in response to pressures from business, industry, and the general public, were able to devise methods for determining efficiency and educational output in the schools. "Educational efficiency experts" and "educational engineers" were names given to the responding educators of those days (Howard, 1974).

Miller described the influence of business practice on accountability by noting that developments in the field of management techniques required sharper expertise in goal setting, planning, and establishing of cost effectiveness measures. Management also increased its skill in evaluation and assessment, which in turn fostered the move toward accountability, Miller concluded (Miller, 1972). Miller perceived accountability as a means of holding an individual or group responsible for a level of performance or accomplishment for specific pupils.
He emphasized that program goals would be developed for each activity, thus clarifying the purposes and goals of all programs and making it easier to assess results. He believed that educators would have to develop greater skill in goal setting, diagnosing needs, and analyzing learning problems. He also noted that increased emphasis on improved communication and involvement of pupils and parents would be a necessity and would result in better understanding and support of the school program (Miller, 1972).

Many persons were threatened by the idea of accountability, and even more were disturbed by the apparent way in which the concept was being implemented. A major cry from the teachers was that standards for them and for the pupils were likely to be set by central office administrators. They feared that the required levels of performance would be unrealistic and unobtainable, thus triggering punitive actions toward pupils and teachers. Teachers did not want to become the scapegoats when the school systems did not produce what the parents, the boards, or the administrators demanded. Teachers pointed out that while they were likely to be the ones held accountable, they often did not have the resources or power to alter policies or practices which must be changed if improvement were to come about.

Many worried that implementation of accountability would cause education to focus on that which could be easily identified and measured. The area of academic achievement would most likely get the most attention, at the expense of the affective domain. What was certain in the minds of many was that accountability would surely increase the educational bureaucracy which, to some, already constituted a serious impediment to improving instruction.
As described by Kibler, the use of instructional objectives was consistent with the concept of accountability, which was described as the balancing of money spent for education with the amount students learned. Accountability in education, as described by some writers, was rapidly gaining acceptance from both the public and the federal government. Unfortunately, some educators who had negative attitudes about accountability also had become negative about instructional objectives (Kibler, 1974). Apparently, the negative attitudes about instructional objectives were based on the misconception that using instructional objectives led to accountability in education. Kibler felt that few comforting words could be said to those teachers who viewed accountability-based educational systems as a threat. If accountability-based educational systems did become the norm, experience in the use of instructional objectives would enable teachers to adapt to the system more easily (Kibler, 1974).

Hottleman, Director of Educational Services of the Massachusetts Teachers Association, described the negative impact that accountability had on public education in his statement that: The accountability movement probably offers more potential for harm to public education than any other idea ever introduced, yet more and more highly placed education officials hop on the bandwagon daily. One common element among the major accountability movers is their backgrounds. They are mostly administrators, testing experts, or private businessmen. Teachers' organizations and individual teachers are notably absent (Hottleman, 1974).

Hottleman described the accountability movement in public education as first becoming visible in 1970 when President Nixon announced, "School administrators and school teachers are responsible for their performance and it is in their interest as well as in the interests of their pupils that they be held accountable."
Hottleman explained that the President was probably influenced by Leon Lessinger, the Assistant Commissioner of Education, who openly stated his intention to make public education accountable (Hottleman, 1974). In summarizing the accountability movement, Hottleman noted that the movement had not begun as a way to improve learning opportunities for children but in response to problems which arose out of the increasing costs of public education. The proponents, in the main, were not public educators but were those who had an accounting mentality that viewed sorting, classifying and measuring as significant per se. Hottleman viewed the overemphasis on measurement as promising greater conformity and the diminishing of humaneness, individuality, and creativeness in public education, and, if unchecked, as threatening a concerted move toward educational mechanization. In the opinion of Hottleman, teachers were viewed as the least important resource in seeking answers about the improvement of education. In his view, what was needed was a reduction of funds spent by the measurement fanatics and an increase in funds spent in finding ways of surfacing and implementing the ideas of practicing teachers (Hottleman, January 1974).

Weiss described educational accountability as a threat to the privacy and security of educators, who worked in greater privacy than almost any other professional group. Because educators worked in relative privacy compared to other professional groups, they looked upon the concept of accountability as having strong implications of distrust for their effectiveness. As Weiss stated, "Educators, like all of us, know that 'accountability' does not enter into the discussions between persons or agencies with great confidence in each other"; therefore, educational accountability carried with it an obvious presumption of guilt (Weiss, April 1973).
Weiss's observation described in some degree the defensiveness that educators displayed toward the concept of accountability. While most people believed that everyone, including teachers and administrators, should be held accountable for their work, educators objected to, and feared, the oversimplified idea that accountability was the sole responsibility of the teacher or principal.

The response of teachers to a state demand for accountability and assessment was described in the research carried out by Bleecher. In the State of Michigan a demand for accountability and assessment was seen as a rational response to political pressures from taxpayers who felt heavily taxed. Because taxpayers wanted to know what they were getting for their annual two billion dollars spent on education, the legislature passed an act which ordered a program designed to assess pupil learning in the basic educational skills to take effect immediately. In order to comply with the educational assessment act, the State Department of Education advocated a six-step model which, it was presumed, would lead to educational accountability. The response of teachers was rejection in the form of minimal compliance and by pressure from the organized teacher groups (Bleecher, December 1975).

The accountability movement which received renewed emphasis in education in the early part of this decade still continues. This movement has been summarized by Popham as a public challenge to education in his statement that: The public is clearly subjecting educational institutions to increased scrutiny. Citizens are not elated with their perceptions of the quality of education. They want dramatic improvements in the schools, and unless they get them, there is real doubt as to whether we can expect much increased financial support for our educational endeavors. And the public is in no mood to be assuaged by promises. 'Deliver the results,' we are being told.
No longer will lofty language suffice, and yesteryear's assurances that 'only we professionals know what we're doing' must seem laughable to today's informed layman. The distressing fact is that we haven't produced very impressive results for the nation's children. There are too many future voters who can't read satisfactorily, can't reason respectably, don't care for learning in general, and are pretty well alienated from the larger adult society (Popham, May 1972).

Many educators, particularly administrators, responded to the accountability challenge which Popham described as inevitable and accepted the premise that the schools must indeed be accountable. Thus, according to Popham, "the course was set to find the most expedient way to accomplish accountability". For teachers the challenge could be described as an admission of guilt for failure of students to learn. Popham suggested that educators accept the accountability challenge by increasing classroom teachers' skills in producing evidence that their instruction yielded worthwhile results for learners. One way, suggested by Popham, of showing results was to place appropriate measures of student performance in the hands of the teacher. The measures suggested by Popham were tests of instructional objectives, described in the literature as criterion-referenced measures (Popham, May 1972).

Measuring student performance by testing has been referred to in the literature as product measurement, which in turn has been one method used to measure teacher effectiveness. This method of measuring teacher effectiveness has not been popular with teachers for a number of reasons. The common problem is that attempts to evaluate teachers on the basis of pupils' test performance tend to focus teaching too narrowly on the specifics measured by the test (Rosenshine, 1970) and (Veldman and Brophy, 1974).
Grogman described the dangers inherent in accountability measures that focus on short-term goals, which are the kind measured by tests:

As teachers are threatened with accountability measures that focus on short-term measurable goals, their only recourse is to stress what is stressed in the accountability measures, frequently to the detriment of more important learnings, which may be underemphasized or overlooked. If not measured (and they generally are not in accountability systems), such skills as socialization, cooperation, and communication undoubtedly will suffer (Grogman, May 1972).

In recent years many who are charged with the responsibility of evaluating teachers have begun to consider product evaluation methods. Thus, trying to imitate industry, evaluation centered on student achievement, which in part depended upon test scores (Thomas, December 1974). Yet Medley and others concluded from their research that only short-term goals, which were almost certainly the least important goals of education, are validly measured by tests. The validity of "teacher tests" of ability to achieve short-term outcomes as predictors of overall teacher effectiveness is by no means self-evident. Their validity as predictors of overall teacher effectiveness, according to Medley, "must be empirically demonstrated before their use is justified" (Medley, June 1975).

In view of the limitations placed upon tests of student achievement as a criterion to measure teacher effectiveness, it was not surprising to find that teachers questioned product measurement as a measure of their effectiveness. This position was summarized in the Fleischmann Report, which was made to the New York State Commission on the quality, cost, and finance of elementary and secondary education for the State of New York. The Report stated,
Because of the many circumstances that influence learning, educators have traditionally been reluctant to submit to evaluation on the basis of student performance. They have argued that learning is in too many ways beyond their control and that it is therefore unfair to judge school effectiveness by measuring student achievement alone (The Fleischmann Report, 1973).

Not only were educators reluctant to be evaluated on the basis of student performance, they questioned any measure designed to assess their teaching effectiveness. Wolf described this concern of teachers in the following statements:

Teachers are not fond of evaluation. They suspect any measure designed to assess the quality of their teaching, and any appraisal usually arouses anxiety . . . . If teachers are to submit to an assessment of their performance, they would probably like reassurance that the criteria and method of evaluation that are used would produce credible results. . . . Teachers probably believe that the standards for evaluating what is effective teaching are too vague and ambiguous to be worth anything. They feel that current appraisal techniques fall short of collecting information that accurately characterizes their performance (Wolf, 1973).

Supporting this point of view, House noted that little demand existed among teachers and administrators for evaluating their programs. In his view teachers gained little by having their work assessed. Instead, according to House, teachers risked damage to their egos by subjecting themselves to evaluation by administrators, parents, and worst of all, students, only to find that they were not doing the job as effectively as they thought they had. As stated by House,

The culture of the school offers no rewards for examining one's behavior--only penalties.
Since there are no punishments for not exposing one's behavior and many dangers in so doing, the prudent teacher gives lip service to the idea and drags both feet (House, 1973).

Teachers reacted strongly to the renewed emphasis placed upon evaluation as dictated by the accountability movement. They felt that they would be blamed for something whether or not such blame was deserved. This guilt feeling on the part of teachers was obviously not conducive to open discussion, examination, or evaluation. Administrators and teachers saw evaluation as quite important and quite threatening. Both groups saw little to be gained (House, 1973).

Renewed Emphasis Placed Upon Measuring Teacher Effectiveness

As noted from a review of the literature, the accountability movement in education during the 1960's and the early 1970's renewed the emphasis of school districts throughout the nation on finding fitting methods to measure teacher effectiveness. Also evident from the review of the literature was the fact that teachers generally distrusted accountability measures that used student gain as a criterion for measuring teacher effectiveness. The burden of identifying effective teachers and effective teaching, and the concern that both teachers and administrators felt, was best illustrated by Thomas. He noted that,

Evaluation has always been troublesome for school administrators. It has always been troublesome for teachers. Both profess the value and necessity for evaluation, but neither believes that it can be effectively accomplished. At one extreme is the position of Robert Finley, one of the nation's finest superintendents: 'Evaluation is subjective . . . period. No other way to evaluate people exists--so that's the way to do it.' At the other extreme the National Education Association states: 'Evaluation must be objective; subjective evaluations have a deleterious effect on teachers and children' (Thomas, December 1974).
The question of whether or not to evaluate teachers, and the process of so doing, was taken away from the jurisdiction of local school districts in some states. State legislatures began to enact laws which mandated the evaluation of teachers at specified intervals and in specified ways. An example was California's Stull Act, which required the evaluation of all certificated personnel based upon "expected student progress in each area of study." The Stull Act, passed in July 1971, was not the only state law requiring evaluation of teachers and administrators. At the beginning of 1974 nine states had enacted legislation mandating some form of teacher evaluation (Oldham, 1974).

Other states considered enactment of similar laws, but the trend had not developed at the pace that the emphasis over "accountability" suggested previous to 1974. There was considerable interest in accountability laws which would place teacher evaluation beyond the influence of existing laws and regulations which had governed the certification of teachers. State governments, seemingly, had recognized that it was not possible to determine competence of teaching on the basis of university training or licensing (Oldham, 1974). The law of at least one state mandated the evaluation of teachers and established criteria by which such evaluation was to be measured.

Kansas law establishes guidelines or criteria for evaluation policies in general terms of efficiency, personal qualities, professional deportment, results and performance, capacity to maintain control of students, etc. The law says community attitudes should be reflected. It provides for teacher participation in the development of the evaluation policies and self-evaluations. The law also provides for state board assistance in preparation of original policies of personnel evaluation (Oldham, 1974).
The inclusion of personal qualities as a criterion for measuring teacher effectiveness aptly illustrated the conclusion reached by Gage that the personality of the teacher is a significant variable in the classroom and has recently become the basis for a growing body of research (Gage, 1963). The concern of teachers who questioned the feasibility of measuring their effectiveness by assessment of personality qualities was described by the findings of Getzels and Jackson, who concluded, ". . . despite . . . a half-century of prodigious research effort, very little is known for certain about the nature and measurement of teacher personality, or about the relation between teacher personality and teacher effectiveness" (Getzels and Jackson, 1963).

Traditionally teachers and their organizations have had to fight for job security and fair standards of pay. Due process in dismissals and punitive actions, along with the single-salary schedule, helped equalize salaries between men and women and between elementary and secondary school teachers. These were well-earned victories for a vulnerable profession. Evaluation systems were often disguised means for firing militant or nonconformist teachers, for slashing budgets, and for enforcing authoritarianism in the schools. Teachers wanted no part of them. Historians have noted that at the 1915 NEA convention, one delegate denounced teacher rating as being "demeaning, artificial, arbitrary, perfunctory and superficial" (Oldham, 1974).

Early teacher ratings were primarily the outgrowth of merit pay programs, which had originated around the turn of the century. A merit pay program was a method used by school districts to determine a teacher's salary in light of a judgment made as to his competency (Brighton and Hannon, 1962). Merit rating, which was the outgrowth of the merit pay idea, was described by Rogers as . . .
the effort to evaluate or measure more successfully the effectiveness of the performance of the teacher, with a view of rewarding excellence while avoiding over-payment to the mediocre or unsuccessful teacher (Brighton and Hannon, 1962).

The merit rating movement by 1915 had reached such proportions that it caused a decided division between proponents and opponents. One group of people, which included both laymen and professionals, concluded that it was impossible to find a safe, usable scheme of rating. This group was unable to determine exactly why they thought such delineation was impossible. The 1920's saw the peak use of formal merit pay plans in school districts throughout the United States. The Department of Classroom Teachers of the National Education Association reported to the 1925 national convention the Ohio State University study of 1922, which indicated that 99 per cent of the cities in the United States with populations of over 25,000 had some form of teacher rating in operation (Brighton and Hannon, 1962).

The most persistent merit rating problems which appeared in research between 1900 and 1930 dealt with the reliability and validity involved in measuring teacher effectiveness. The concern at that time was, first, whether or not the measuring device was consistent in its measurements; that is, was the instrument reliable? Secondly, was the instrument valid in that it measured what it was supposed to measure? The questions of reliability and validity led to the development of measurement devices which could be tested against such criteria (Brighton and Hannon, 1962).

Brighton and Hannon noted that rating scales which listed the personal and pedagogical attributes of a successful teacher were the main instruments used to measure teacher competence by 1930. Trait scales were developed which required agreement on the relative importance of each item.
It then became necessary to measure the degree to which a particular teacher possessed or did not possess each particular attribute. Barr analyzed 209 of these rating scales in use by 1930 (Cooke, 1939). Barr concluded that ten categories could include all the attributes that were being used in this approach to rating. They were:

Instruction
Classroom management
Professional attitude
Choice of subject matter
Personal habits
Discipline
Appearance of the room
Personal appearance
Co-operation
Health

That there was little agreement among raters as to what personal and pedagogical attributes described the successful teacher was further illustrated by Brighton and Hannon:

Shelter reported that a similar study in Pennsylvania by Charters and Woples produced a list of 25 categories. Twenty of these are not found in Barr's list. Shannon queried 164 public school administrators concerning 430 of their best and 352 of their worst teachers. From the replies, he formulated ten categories important in defining teacher competence. Only four of the ten are found in Barr's list. Sheller studied five such lists and found little similarity among them (Brighton and Hannon, 1962).

The obvious inconsistency between various lists caused Barr to comment:

Excellent as these earlier check lists are, they represent in most instances merely abbreviated statements of the author's own opinion of what constitutes good teaching and do not necessarily supply valid and reliable criteria of teaching success (Barr and others, 1938).

In a report on teacher ratings in public school systems, which was compiled by Boyce in 1915 and published by the National Society for the Study of Education, Boyce noted that the number of items by which teaching efficiency was judged ranged from two items to eighty items.
He listed four types of analysis of rating scales: (1) descriptive reports which dealt with specified points, (2) lists of questions which were answered yes or no, (3) lists of items which were evaluated by the classification of excellent, good, medium, unsatisfactory, etc., and (4) lists of items in which each item was assigned a numerical value. In Boyce's summary of the qualities listed on the 50 rating schemes evaluated, "discipline" was the most frequent quality; it was found in 98 per cent of the forms. Next in frequency were "instructional skill" and "cooperation and loyalty"; each was mentioned in 60 per cent of the forms (Biddle and Ellena, 1964).

Boyce's plan of rating included a classified list of 45 items grouped under five headings: personal equipment, social and professional equipment, school management, and technique of teaching. The items ranged from "general appearance" to "moral influence." His plan required that each item be checked on a scale of five terms ranging from "very poor" to "excellent". By 1967 many evaluation forms were in use which were similar to the "efficiency record" published by Boyce in 1915 (Biddle and Ellena, 1964).

In 1924 Monroe and Clark summarized the researches of the preceding twenty years, in which they cited studies that showed the lack of reliability of existing rating devices. They also pointed out that a halo effect existed on the part of the rater's general estimate of the teacher. The halo effect influenced the estimates of the particular traits held by the teacher who was rated.

As noted by Biddle and Ellena, many different people have been used as raters in competence research. This group included classroom teachers, student teachers, critic teachers, principals, supervisors, superintendents, school board members, pupils, parents, other lay persons, and college instructors. A large majority of studies reporting on teacher competence have used rating forms for one purpose or another.
Biddle and Ellena summarized as follows:

Generally, the results of research using rating forms have been poor and contradictory. This is not surprising in view of the fact that such judges as listed above are handicapped by personal bias, a lack of training for observation, and a lack of firsthand information concerning the teacher-classroom interaction. Yet each year brings a new crop of studies using rating forms. Why this perseverance? One obvious answer is the prevalence of rating forms in school programs of assessment, merit pay and promotion . . . (Biddle and Ellena, 1964).

In their view, ratings seemed less than useful for research on teacher effectiveness.

Historically teachers have objected to the use of rating forms to measure their effectiveness. Over the years the proceedings of their professional organization have demonstrated their feelings on this issue. Biddle and Ellena reported the feelings of teachers as described in the 1915 Proceedings of the NEA as follows:

. . . A sense of real injustice develops among teachers when ratings arrived at in a perfunctory manner become the basis for salary increases. It was this sense of injustice that led the members of the National Education Association, as early as 1915, to adopt a resolution in opposition to 'those ratings and records which unnecessarily disturb the teacher's peace and make the rendering of the best service impossible' (NEA, Proceedings, 1915) and (Biddle and Ellena, 1964).

Again in 1961, and in more recent years, the National Education Association through its resolutions recognized that "it is a major responsibility of the teaching profession, as of other professions, to evaluate the quality of its services." The resolution opposed the use of subjective methods of evaluating teachers for the purpose of setting salaries, saying specifically, "Plans which require such subjective judgments (commonly known as merit ratings) should be avoided" (NEA, Proceedings, 1961, pp. 189-193) and (Biddle and Ellena, 1964).
Oldham related that the literature from teacher organizations has continued in opposition to the idea of merit pay, while administrators, board members and the public regarded merit pay as a way to improve education and get a better return on the tax dollar. Merit pay, according to Oldham, could not exist without evaluation, but the converse was not true; evaluation could and did exist without merit pay. Usually, when evaluation began to suggest merit pay, evaluation itself was likely to be attacked by teachers. The idea of teacher evaluation to improve instruction reached a near-consensus status, but teacher evaluation for the sake of paying some teachers more than others was still very much a subject of debate among teachers (Oldham, 1974).

Most school districts avoided tying teacher evaluation to merit pay, and most avoided merit pay schemes altogether. However, a few districts employed merit pay schedules. In those districts it was common practice to separate teacher evaluation for the sake of improving learning from teacher evaluation for the sake of rewarding teachers with some additional increments. Most districts made participation in merit pay programs voluntary (Oldham, 1974).

Opponents of merit pay programs put forth the argument that teachers preferred to be paid on the basis of experience and earned credit. Some of the reasons teachers avoided merit pay programs were listed by Shaughnessy: who shall appraise, what should be subject to evaluation, how will appraisals be conducted, and how will appraisals be translated into salary increments. Oldham quotes Shaughnessy as follows: "Perhaps the most striking issue is that which centers around inadequate basic and evaluative research in teacher and teaching evaluation in general and in merit pay programs in particular" (Oldham, 1974).
A summary of the more recent teacher response to merit pay and other considerations was illustrated by Oldham in the following quote:

Teacher evaluation systems are not implemented anywhere without some conflict. The reasons for controversy vary and there does not appear to be any one over-riding cause, such as teachers' intransigent opposition to all forms of evaluation. Although some very few districts replying to the Education U.S.A. survey did report teacher opposition to any evaluation after tenure, most conflict seems to stem from other matters. Evaluation, as a general process, is in itself rarely the issue. However, some tentative generalizations, based on the survey replies, are possible: Conflict seems most likely if teacher evaluation is tied to identifying incompetent teachers for the purpose of dismissal; if it is tied to merit pay provisions; if a check-list type of evaluation instrument is used that does not reflect any teacher input . . . . Even when teachers participate in the creation of the evaluation procedure conflict can result . . . (Oldham, 1974).

Teacher Participation in the Evaluation of Their Services

"From staunch opposition, to a guarded receptivity, to a leadership role in planning for teacher evaluation--such has been the course of opinion of large numbers of the nation's teachers regarding evaluation," stated Oldham (Oldham, 1974). But this apparent reversal in the approach to teacher evaluation by teachers did not reflect a real change in philosophy held by teachers as much as it reflected the changes in the types of evaluation programs that were being proposed due to outside pressures on the schools.

Up to the present time, wrote Oldham, there has been little consensus among teachers in the area of teacher evaluation, although it is a subject that vitally affected and concerned them.
Individual teachers and their professional organizations ranged on a continuum from firm opposition to evaluation plans to active support for a certain plan. Because of the complexity and difficulty inherent in teacher evaluation, the numerous plans proposed, and the uniqueness of local district conditions, this change in teachers' attitudes toward evaluation came as no surprise (Oldham, 1974).

In the last decade, research and experimentation in the universities and teacher training institutions of our country brought forth much re-examination of the teaching and learning process on an intellectual and scholarly basis. This re-examination and the resulting new theories intensified the move toward better teacher evaluation. New methods were provided for carrying it out. Teachers' organizations have added to this body of knowledge through studies, workshops and professional development programs (Oldham, 1974).

These activities contributed to the emerging philosophy that evaluation of teachers was for the purpose of improving instruction rather than for rating purposes. Teachers supported the idea that evaluation should "pinpoint teacher strengths and weaknesses" and help them to reinforce their strengths and overcome their weaknesses. This philosophy was supported by many teachers because it provided an approach to evaluation that seemed to meet the needs and demands of accountability proponents as well as teacher needs. Oldham described gradual teacher acceptance as follows:

Many teacher groups began to accept the precept that teacher evaluation may be one way to satisfy the public demand for tangible evidence that the schools were doing their job with properly qualified and properly educated staffs (Oldham, 1974).

The conditions as described by Oldham rapidly moved teachers' associations on all levels toward planning and recommending "acceptable" evaluation systems (Oldham, 1974). This change in attitude was summarized by Larry E.
Wicks in an article, "Teacher Evaluation," in Today's Education/NEA Journal (March 1973):

They [a teachers' association] were aware that parents, students, elected officials, and state agencies across the country are demanding teacher accountability. They believed that if the profession doesn't deal with the problem, then someone else will. Therefore, they felt that education associations must place a high priority on becoming fully involved in establishing policies for and carrying out evaluation of education programs and of teaching processes.

In a similar vein, NEA's The Early Warning Kit on the Evaluation of Teachers contained the following statements:

. . . The work of teachers is constantly being evaluated not only by supervisory personnel but by the lay public as it criticizes educational products. Teachers should not become defensive but should be prepared to respond affirmatively. Appropriate response is made by taking a hard look at programs to improve the schools. . . . Without association involvement in the selection, adoption, or development of the evaluation instrument, there is little likelihood it will be used adequately and fairly to evaluate teachers. If teachers do not take a strong position on teacher performance evaluation, they will be unable to benefit from this important and sensitive activity . . . it is a major responsibility of educators to participate in the evaluation of the quality of their services.

Oldham reported that many teachers' groups concluded that teacher evaluation was here, would stay, and would expand. To insure that evaluation would be done right, teachers chose to become involved in the process from beginning to end. Because teachers felt the need to help shape policies, set goals, design instruments and carry out the procedures of evaluation, many associations made teacher evaluation a negotiable item in contract bargaining (Oldham, 1974).
Specifically, the National Education Association took the position that it was a major responsibility of educators to participate in the evaluation of the quality of their services. To enable educators to meet this responsibility more effectively, the Association called for continued research and experimentation to develop means of objective evaluation of the performance of all educators. The means included identification of (a) factors that determined professional competence; (b) factors that determined the effectiveness of competent professionals; (c) methods of evaluating effective professional service; and (d) methods of recognizing effective professional service through self-realization, personal status, and salary.

The Association held the view that evaluations should be conducted for the purpose of improvement of performance and quality of instruction offered to pupils, based upon written criteria and following procedures mutually developed by and acceptable to the teacher association, the administration and the governing board (NEA, January 1974).

Studies of Teacher Effectiveness

During most of the history of education the question of what knowledge, understanding and ways of behaving teachers should possess was based on experience, tradition, common sense and authority (Gage, 1963). Philosophers and theologians, applying their modes of truth-seeking to the problems of education, included the question of how teachers should behave. With the emergence of the behavioral sciences in the twentieth century, attempts were made to apply scientific methods to the problems of learning, teacher behavior and teacher evaluation. From this developed a sub-discipline that is referred to as "research on teaching." Gage defined the term "research on teaching" as the study of relationships between variables, at least one of which refers to a characteristic or behavior of a teacher.
He stated:

If the relationship is one between teacher behaviors or characteristics, on the one hand, and effects on students, on the other, then we have research on 'teacher effects' in which the teacher behavior is an independent variable (Gage, 1972).

The record of research accomplishment on teacher effectiveness does not firmly support the idea that science can contribute to the art of teaching. There are reasons, however, for questioning this pessimism, according to Gage. Part of the reason lies in the fact that research in teacher effectiveness has assisted in revision of the teacher's role by deriving and evaluating the ways in which teacher behavior is changed (Gage, 1972).

One problem, as described by Barr, that must be of continued concern to those interested in the measurement and prediction of teacher effectiveness was that of an adequate criterion. Barr presented a critical overview of some 75 doctoral studies made at the University of Wisconsin that pertained in some respect to the measurement and prediction of teacher effectiveness. He concluded that "by and large, and with many exceptions, two criteria have been used: (1) efficiency ratings of one sort or another, and (2) measured pupil gains." His summary on ratings was: ". . . Over all, general ratings of teacher effectiveness have been shown to be, under current conditions, exceedingly unreliable" (Barr, 1961).

Regarding the use of measured pupil gain as a criterion of teacher effectiveness, Barr listed two difficulties that were encountered by the researcher:

. . . First of all, each teacher in the modern school, within very broad limits, chooses his own purposes, means and methods of instruction. . . . A second difficulty arises out of the fact that notwithstanding over a half century of effort, many of the outcomes of learning and of teaching are poorly or inadequately measured (Barr, 1961).
Some approaches to the development of criteria that described teacher effectiveness were listed by Barr as the "traits approach," which considered the qualities of the individual such as cooperativeness, ethicality, and considerateness, and the "behavioral approach," which considered the qualities of the individual not in terms of personality traits but in terms of characteristics of performance, an approach which integrated the concept of personality with that of methods. This concept has always been considered an important aspect of teacher effectiveness. Barr noted that, "The criterion of teaching effectiveness may also be behaviorally defined, directly and without the summarizing operations provided by personality traits." A very attractive feature of a behavioral criterion was that behaviors may be directly observed by all who cared to look (Barr, 1961).

In describing the criteria for measuring teacher effectiveness, Barr listed three commonly employed criteria which encompassed four approaches to evaluation. The criteria most commonly employed were: (1) efficiency ratings, which may be made by any number of persons, but most frequently by the superintendent of schools or members of his staff; (2) measures of pupil growth and achievement, usually adjusted for differences in intelligence and other factors thought to influence growth and achievement; and (3) a preservice graduation criterion composed of (a) measures of the foundations of efficiency, basic knowledges, skills, and attitudes and (b) the personal prerequisites to effectiveness and professional competencies as inferred from observation of performance in practice teaching, internships, and other activities involving children (Barr, 1961).

Embodied within these criteria were four approaches to teacher evaluation, which were combined in different ways by different persons, institutions, and data gathering devices.
These were (a) evaluations made in terms of the qualities of the person, as in personality rating; (b) evaluations which proceed from studies of teacher behaviors, as in the rating of performance in terms of inferred personal qualities or desirable professional characteristics; (c) evaluations developed from data collected relative to presumed prerequisites to teacher effectiveness, potential or already achieved, represented by such psychological constructs as knowledges, skills, and attitudes; and (d) evaluations developed from studies of the product, for example, pupil growth and achievement (Barr, 1961). A fourth type of criterion of teacher effectiveness described by Barr was that of pupil growth and achievement, which was usually expressed as pupil gain scores based upon achievement tests administered prior to instruction and again at some subsequent date when a particular unit of instruction or course had been completed. Barr cautioned that although many would consider this a primary criterion against which all other criteria should be validated, it was subject to very definite limitations. One limitation was the fact that tests measured results but provided little information as to how these effects were produced. The teacher effect was only one of many effects that produced changes in pupil growth and achievement. One of the real difficulties was that of isolating the teacher effect (Barr, 1961). One can assume that knowledge about teaching effectiveness consists of relationships between what a teacher does while teaching and the effect of these actions on the growth and development of his pupils. In pursuing this assumption Flanders defined research on teaching effectiveness as attempts to discover relationships between teaching behavior and measures of pupil growth. The most common research design, which in his mind left much to be desired, compared an "experimental treatment" group with a control group.
Pre-tests and post-tests of pupil achievement and attitude were administered to all classes, and an analysis of the scores showed that there were, or were not, significant differences between the two groups of classes that were being compared. The problem with this design, according to Flanders, was the failure to collect data which helped to explain why the results turned out the way they did. He felt that interaction analysis provided information about the verbal communication which occurred, and this often helped to explain the results (Flanders, 1970). In an earlier research project, which was conducted at the University of Minnesota and supported by the U.S. Office of Education, ten categories were used to classify the statements of the pupils and the teacher at a rate of approximately once every three seconds. It was found that an observer could be trained to categorize at this rate with sufficient accuracy (Flanders, 1960). The ten categories included seven assigned to teacher talk, two to student talk, and one to silence or confusion. When the teacher was talking, the observer decided if the statement was: (1) accepting student feelings; (2) giving praise; (3) accepting, clarifying, or making use of a student's ideas; (4) asking a question; (5) lecturing, giving facts or opinions; (6) giving directions; or (7) giving criticism. When a student was talking, the observer classified what was said into one of two categories: (8) student response or (9) student initiation. Silence and confusion were assigned to category (10) (Flanders, 1960). In practice, an observer kept a record of different periods of classroom activity. At the end of an hour's observation, it was possible for an observer to sum the different kinds of statements for each of six types of classroom activity separately and combine these into a grand total for the entire hour's observation.
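Expressed in modern terms, the tallying procedure Flanders describes can be sketched as follows. This is a minimal illustration, not part of the original study: the category labels paraphrase the ten listed above, and the sample code sequence is hypothetical.

```python
from collections import Counter

# Flanders' ten categories as paraphrased above:
# codes 1-7 are teacher talk, 8-9 student talk, 10 silence or confusion.
CATEGORIES = {
    1: "accepts student feelings",
    2: "gives praise",
    3: "accepts or uses student ideas",
    4: "asks a question",
    5: "lectures, gives facts or opinions",
    6: "gives directions",
    7: "gives criticism",
    8: "student response",
    9: "student initiation",
    10: "silence or confusion",
}

def tally(codes):
    """Sum the codes recorded (roughly one every three seconds) and
    report the share of teacher talk, student talk, and silence."""
    counts = Counter(codes)
    total = len(codes)
    teacher = sum(counts[c] for c in range(1, 8))
    student = counts[8] + counts[9]
    return {
        "counts": dict(counts),
        "teacher_talk": teacher / total,
        "student_talk": student / total,
        "silence_or_confusion": counts[10] / total,
    }

# A hypothetical minute of observation (20 codes at ~3-second intervals).
sample = [5, 5, 5, 4, 8, 3, 4, 8, 8, 9, 5, 5, 6, 10, 4, 8, 2, 5, 5, 10]
result = tally(sample)
```

For this invented sequence the sketch reports roughly two-thirds teacher talk, which is the kind of summary proportion an observer would combine across activity periods into the hour's grand total.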
This method of observation was called "interaction analysis" in the classroom, and it was used to quantify the qualitative aspects of verbal communication. The entire process became a measure of teacher influence because it made the assumption that most teacher influence was expressed through verbal statements and that most nonverbal influence was positively correlated with the verbal. Those who have worked with this technique were disposed to accept this assumption (Flanders, 1960). Interaction analysis is a specialized research procedure that provides information about only a few of the many aspects of teaching. It is an analysis of spontaneous communication between individuals, and it is of no value if no one is talking, if one person talks continuously, or if one person reads a book or report. . . . Of the total complex called "teaching," interaction analysis applied only to the content-free characteristics of verbal communication (Flanders, 1960). In a later publication entitled Analyzing Teaching Behavior, Flanders discussed sampling problems inherent in research on teaching effectiveness when a small group of teacher-class units were selected so that they represented a larger population. In reference to coding verbal behavior, he stated: Teaching effectiveness is by definition concerned with what teachers do that affects educational outcomes. In order to investigate teaching behavior with techniques such as coding verbal communication, sample sizes have necessarily been small, too small to provide a logical basis for extending conclusions so as to generalize about a target population. The target populations, in turn, are difficult to identify in any meaningful way (Flanders, 1970). One of the most extensive "teacher characteristics" studies was carried out by Ryans. Ryans' design was considered by other researchers such as Biddle to be classical in the sense that teacher "characteristics" were abstracted from the classroom context. Thus, classroom situation and teacher-pupil interaction were ignored for the most part. His work was unique in that he established relationships between ten characteristics and both formative and outcome variables (Biddle and Ellena, 1964). Teacher characteristics, as defined by Ryans, meant both teacher properties and teacher behavior. He presented three dimensions of teacher behavior (measured by rating forms used in direct observation of behavior): warmth, organization, and stimulation. He detailed seven teacher characteristics (measured by objective instruments) including: favorable opinion of pupils, favorable opinion of classroom procedures, favorable opinion of personnel, traditional versus child-centered approach, verbal understanding, emotional stability, and validity of response (Biddle and Ellena, 1964). The teacher characteristics study made relatively few assumptions about the roles of teachers. Instead, it followed a design that dictated going into the classroom to observe what transpired when teachers and pupils reacted and interacted in the learning environment. It attempted systemization of the observation data collected, related those observation data to other kinds of information about teachers, and discerned typical patterns of teacher characteristics in relation to various conditions of teacher status. An effort was made to investigate the interactions and interrelationships among pupil behaviors and teacher behaviors. In the teacher-observation phase of the teacher characteristics study, the staff concluded that there were at least three major patterns of teacher classroom behavior that could be identified.
These were: TCS pattern X, warm, understanding, friendly versus aloof, egocentric, restricted teacher classroom behavior; TCS pattern Y, responsible, businesslike, systematic versus evading, unplanned, slipshod teacher classroom behavior; and TCS pattern Z, stimulating, imaginative versus dull, routine teacher classroom behavior (Ryans, 1947). Some conclusions that came out of the study were: (1) Certain characteristics of teachers may be traceable to behavior patterns that were expressed in related, but different, channels long before the individual entered teaching as a profession. (2) There appeared to be little doubt about the existence of important differences between teachers in varying age groups with respect to a number of characteristics. Generally speaking, scores of teachers fifty-five years and above showed this group to be at a disadvantage when compared with young teachers, except from the standpoint of pattern Y (systematic and businesslike classroom behavior) and characteristic B (learning-centered, traditional education viewpoints). (3) Differences between the sexes, often insignificant in the elementary school, were fairly general and pronounced among secondary school teachers. Women generally attained significantly higher scores than men on the scales measuring understanding and friendly classroom behavior, businesslike and stimulating classroom behavior, favorable attitudes toward democratic practices, permissive viewpoints, and verbal understanding. (4) Teachers in large schools (17 to 50 or more teachers) scored higher than those from small schools. (5) Good mental health, or emotional maturity, generally was assumed to be a requisite for satisfactory teaching performance. (6) Teachers are "good" if they rank very high among their colleagues with respect to such observable classroom behaviors as warmth and kindliness, systematic and businesslike manner, and stimulating and original teacher behavior.
(7) Pupil behavior appears to be rather closely related to teacher behavior in the elementary school. In the secondary school it seems almost unrelated to teacher behavior in the classroom (Biddle and Ellena, 1964). R. L. Turner utilized a strategy for effectiveness research that focused upon the assumption that teaching may be viewed as a series of problem solving or coping behaviors. Utilizing this strategy, Turner and Fattu developed objective instruments to measure teacher potential for performing coping tasks. Turner demonstrated that teacher scores on these instruments were related to formative experiences, to teacher properties, such as intelligence, attitudes, and values, and to contextual variables, such as subject matter and age of pupils (Biddle and Ellena, 1964). The design followed by Turner was intermediate between the classical approach of Ryans and the systematic interaction studies of Meux and Smith or Flanders. Turner assumed that the teacher reacted to the problems posed by classroom situations. With this assumption in mind he built "classroom contexts into the definition of teacher properties to be measured." By way of contrast, Ryans' ten traits were "abstracted from the classroom context" (Biddle and Ellena, 1964). Turner, in his study which investigated the training and experience variables of teacher performance, suggested the following interpretations: First, there was considerable evidence that treatments such as methods courses and student teaching during undergraduate teacher preparation have a distinct bearing on teaching-task performance in arithmetic and reading. Second, there was considerable evidence that variation in performance in teaching tasks . . . was associated with variation in undergraduate preparation. . . . Third, there is some evidence that variation in teaching-task performance is associated with variation in teaching situations. . . .
Fourth, there is considerable evidence that the very early years of teaching experience produce the greatest rise in teaching-task performance, as evidenced by differences in performance between fully prepared but inexperienced teachers and teachers with no more than three years of experience. There was little evidence to suggest that performance changed greatly, for the average teacher, after the third year of experience (Biddle and Ellena, 1964). Viewing the teacher effectiveness problem from a sociological context, Brim questioned whether or not there were personal and social characteristics which greatly influenced the role performance of a teacher. He summarized the research relating teacher characteristics to effectiveness in teaching by stating: . . . even though there is a vast body of research on the relation of teacher characteristics to effectiveness in teaching, the reviews of this research show no consistent relation between any characteristics, including intelligence, and such teaching effectiveness (Brim, 1958). Brim suggested that perhaps the effects of the teacher's personality had been looked for in the wrong place. He suggested that other aspects of the educational process, namely, the values the student learned, his feelings about himself and other persons, his attitudes toward further education, and many other social factors influenced the effects of teacher personality on the student. Brim proposed the possibility that the influence of a teacher's characteristics upon his effectiveness as an educator is contingent on characteristics of the students. He cited the example that teachers of given personal characteristics may be more effective with boys, others with girls; some with bright students, others with average students.
He pointed out that teachers themselves were the first to admit that they seemed to do better with one rather than another type of student, and the preference of different students for various teachers was easily recognized as part of one's own life experience. As pointed out by Brim, this approach to measuring teacher effectiveness was not a novel observation, but somehow it had escaped attention as a critical research problem (Brim, 1958). In research that was carried out by Biddle and Ellena on teacher effectiveness, two problems were identified which seemed to cause confusion in dealing with teacher effectiveness. They were listed as: (1) during the decade of the fifties educational researchers said that they did not know how to define, prepare for, or measure teacher competence, and (2) researchers disagreed over the effects a teacher was expected to produce. For example, should the teacher's tasks be defined in terms of the ultimate goals of education or the effects he produced with the pupil? Was a teacher expected to attain the same degree of competence for all pupils, or should special competence be allowed in working with the underprivileged, the handicapped, and the exceptional pupil? They concluded as follows: ". . . until effects desired of the teacher are decided upon, no adequate definition of teaching competence is possible" (Biddle and Ellena, 1964). Biddle and Ellena believed that it was necessary for researchers to agree upon language and the variables that words described in order to resolve the problem caused by researchers using multiple meanings for terms used to describe teacher effectiveness research. They pointed out that the long-term effects of a teacher are difficult to assess, the problem being that the individual teacher's contribution was hard to separate from the influence exerted by other teachers who taught the same child.
They believed that teacher competency involved a complex interaction between teacher properties and contextual factors in the community, school, and classroom. In their opinion it was possible that the number of independent competencies found varied with the types of teachers teaching (Biddle and Ellena, 1964). In order to clarify the effectiveness problem in teacher effectiveness research, Biddle and Ellena suggested a variable system composed of seven classes by which one could examine short and long range effects of teacher-pupil interaction. Summarized, these seven variable classes are: (1) formative experiences, (2) teacher properties, (3) teacher behaviors, (4) immediate effects, (5) long-term consequences, (6) classroom situations, and (7) school and community contexts. Five variables, formative experiences, teacher properties, teacher behaviors, immediate effects, and long-term consequences, were postulated by Biddle and Ellena to form a cause and effect sequence, in which each variable class caused effects in the next variable class listed. Biddle postulated further that the last two variables listed above were contexts for the main sequence. For example: (a) The classroom situation imbeds (and interacts) with teacher properties, teacher behaviors, and immediate effects. (b) School and community contexts imbed (and interact) with formative experiences, teacher properties, teacher behaviors, immediate effects, and long-term consequences (Biddle and Ellena, 1964). Biddle and Ellena listed the following hypotheses as examples by which to examine each variable class in the diagram shown below: Hypothesis 1: a. Teachers receiving "four years of college training" will "know more about the techniques of elementary education" than those receiving less education. b.
Teachers "knowing more about the techniques of elementary education" will use more "flexibility in the control of classroom discipline" than those who know less. c. Teachers who use more "flexibility in the control of classroom discipline" will produce fewer "overt acts of deviancy" by pupils than those who use less. d. Pupils who exhibit fewer "overt acts of deviancy" will show greater "achievement" at the end of their schooling than those who exhibit more. Hypothesis 2: "Deviancy control" is more related to "achievement" in classroom situations characterized by "ritual and boredom" than in other classroom situations. Hypothesis 3: The "control of deviancy" is more related to "achievement" in "lower class" schools than in "upper class" schools. Biddle explained that the hypotheses as listed above were not necessarily confirmable but each could be tested experimentally (Biddle and Ellena, 1964). Biddle noted that much of the literature up to the time of his study was concerned with the adequacy of various measurement techniques which were used to measure teaching effectiveness. He assessed the literature as confusing the measuring technique with the variable to be measured. The result of this confusion was evident, according to Biddle, in that one author treated a measurement as a "cause" of effectiveness, while another author treated the same measurement as a "direct indication" of effectiveness. Some went so far as to consider measurement as a criterion of effectiveness (Biddle and Ellena, 1964). In addition to proposing a model for classifying variables involved in teacher effectiveness, Biddle and Ellena reviewed the forms of measurement in use and their application to effectiveness variables.
Briefly listed, these were: (1) "observation techniques," which were classified into four categories of participant observation, categorical check lists, specimen record, and electronic recording; (2) "objective instruments," which included achievement tests, ability inventories, questionnaires and interview schedules, and projective tests; (3) "rating forms;" (4) "self-reports;" (5) "existing records;" and (6) "a priori classification." In summarizing the methods in the measurement of effectiveness variables, Biddle suggested that measurements by a priori classification, behavioral observation, and objective instruments were to be advocated over measurements made by existing records, self-reports, and ratings (Biddle and Ellena, 1964). In a study of teacher-pupil relationships conducted by Bush, he suggested that caution should be exercised against overgeneralizing on the subject of teacher competence. His study noted the divergence of supervisors' and administrators' ratings of teachers from those of pupils. This divergence suggested that an estimate of a teacher's effectiveness should not be based upon the opinions of one group. Each of the classrooms reported in Bush's study contained examples of both effective and ineffective relations between pupils and teachers. No one teacher was found to be effective with all of his students or ineffective with all of them. Therefore, blanket statements concerning what constitutes good teaching and the good teacher should be viewed skeptically, for they are likely to be based upon inadequate data and a failure to recognize the complexity of teaching. Bush further suggested that: The most meaningful and accurate appraisal is probably one that is specific and limited to an estimate of the effectiveness of the relationship between a given teacher and a given pupil at a specific time in terms of the current needs of that pupil (Bush, 1954).
Sciara and Jantz reported on the work of Seller, who distinguished three aspects of teacher functioning: role, style, and technique. Teacher role was defined as behavior which was concerned with the duties, responsibilities, and functions of the teacher. Teacher style referred to personality traits and teacher attitudes which were not a planned component of the teacher role. Technique of teaching referred to specific strategies employed by the teacher to carry out her role or to accomplish her objectives. In evaluation of the teacher role, Seller concluded that future evaluation of teacher roles should never rely on one source alone but should include the perception and judgment of all groups involved in the teaching of the child, e.g., the teacher, the administrator, other professionals in the school system, parents, and pupils. Teacher style has been found to have an effect on the techniques which the teacher used and, in some instances, on the effectiveness of his teaching. Several studies investigated relationships between teaching style and teaching effectiveness. One group of investigators examined this relationship by ensuring that judgments of both teaching style and teacher effectiveness would be made by the same judges. An investigation by Kerlinger used a wide range of judges and found that two major clusters for effective teaching were evident. One of these clusters was named traditional, and it associated effective teaching with being self-controlled, trustworthy, refined, industrious, reliable, healthy, moral, religious, and conscientious. The second cluster was named progressive, and associated effective teaching with imagination, insight, warmth, openmindedness, flexibility, sympathy, sensitivity, patience, and sincerity (Sciara and Jantz, 1972). The greatest advance in the evaluation of teacher functioning was made in the area of teaching techniques and outcomes, termed products of teaching.
These advances occurred in the refinement of measuring and evaluating an individual's techniques, the concentration on patterns of techniques, the studying of teacher-pupil interaction, and the improvement of methods of measuring and evaluating effects of teaching on pupil achievement and functioning (Sciara and Jantz, 1972). In order to provide structure for an inquiry into teacher effectiveness which attempted to uncover discrepancies in conceptions of teacher effectiveness, Jenkins and Bausell developed a survey instrument which was based on categories employed by Harold Mitzel in his contribution to the 1960 edition of the Encyclopedia of Educational Research. This study was briefly described earlier in this chapter. Mitzel described the categories of teacher effectiveness as follows: Product criteria. When teachers were judged by their effectiveness in changing student behavior, the judge, in Mitzel's scheme, employed product criteria. Product criteria required the judgment of the teacher on the basis of a measurable change in what was viewed as his product, student behavior. What constituted acceptable products, or changes, was never made altogether clear. Measures of growth in skills, knowledge of subject matter, and attitude which could logically or empirically be attributed to the teacher's influence constituted acceptable data in the product category. As an example, skills and behaviors which evidence changes in critical thinking, inquiry, evaluating, reading, spelling, typing, speaking, and discussing were considered to be potential entries. Gains in knowledge of subject matter as measured by standardized achievement tests, end-of-lesson or unit quizzes, and student reports were accepted as evidence of teacher influence.
Student performances measured in terms of self-acceptance, attitudes toward school subjects or toward learning in general, and respect for others and their opinions qualified as affective goals and thus were accepted within the product category. Confusion about the product category probably arose not so much from the notion of using student change as a criterion but from the difficulty in gaining consensus on what products were considered appropriate within the domain of the school (Mitzel, 1960). Process criteria. Teacher evaluation which was based upon classroom behavior, either the teacher's behavior, his students' behavior, or the interplay of both, constituted process criteria. Process behaviors were worthwhile in their own right and thus were not necessarily related to product criteria. Variables upon which teachers could be rated were their verbal behavior, methods, classroom control, and individualization of instruction. Students might also be rated on their verbal behavior, attentiveness, and conformity to classroom routine. The interaction between students and teachers was the basis by which to judge rapport and climate in the classroom (Mitzel, 1960). Presage criteria. If teacher evaluation was based upon the teacher's personality or intellectual attributes (industry, adaptability, intelligence, character), his performance in training, his knowledge or achievement (e.g., marks in education courses, success in student teaching, national teacher examinations, knowledge of educational facts), or his in-service status characteristics (e.g., tenure, years of experience, or participation in professional organizations), then he was being judged upon presage criteria. These criteria were indirect measures of a teacher's effectiveness and were normally chosen because in some authority's view they were related to, and therefore predict, either process or product criteria (Mitzel, 1960).
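Mitzel's three-way scheme amounts to a classification rule over survey items. The sketch below is purely illustrative and not from the source: the example criteria and their category assignments are hypothetical items chosen to match the definitions paraphrased above.

```python
# Hypothetical survey items mapped to Mitzel's categories: product
# (student change), process (classroom behavior), presage (teacher
# attributes or status). The item wording is invented for illustration.
MITZEL_SCHEME = {
    "gains on standardized achievement tests": "product",
    "changes in student attitudes toward learning": "product",
    "classroom control": "process",
    "individualization of instruction": "process",
    "rapport with students": "process",
    "years of teaching experience": "presage",
    "marks in education courses": "presage",
    "scores on national teacher examinations": "presage",
}

def classify(criterion):
    """Return the Mitzel category for a criterion, or None if unlisted."""
    return MITZEL_SCHEME.get(criterion.lower())
```

A survey typed under such a scheme can then compare average ratings across the three categories, which is the kind of product-versus-process-versus-presage comparison the present study reports.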
Status of Present Methods of Appraising Teacher Performance The principal's task, Lewis summarized, seemed futile when one reviewed the attempts of past research to produce an acceptable criterion by which one could measure teacher effectiveness. This pessimistic point of view was illustrated by Lewis, who stated: Although in the past we have been unable to objectively measure teacher performance effectiveness, the tradition of 'evaluating' educators continues to be a dominant feature of our schools (Lewis, 1973). Lewis described the present method of appraising the performance of educators as "dysfunctional" and serving no useful purpose. He described present appraisal methods as falling short of assessing adequately "true" performance. The result of inadequate assessing procedures made it impossible for school districts to take corrective action for the professional growth, improvement, and development of the staff. Furthermore, stated Lewis, "it has been a device which over the years has perpetuated the division between teachers and administrators" (Lewis, 1973). In his summary of studies on teacher effectiveness, Stephens found no relationship between the academic gains of pupils and the qualities of the teacher that could be observed by principals or supervisors. Researchers, it seemed, arrived at the same findings: that regardless of the techniques or methods employed, such as rating scales, self-analysis, classroom visitation, etc., few if any "facts" seem to have been reached concerning teacher and administrator effectiveness. According to Stephens, no generally agreed upon method of measuring the competence of educators has been accepted, and no methods of promoting growth, improvement, and development have been generally adopted (Stephens, 1967).
A Current Trends Report conducted by the National School Public Relations Association reported that a large body of facts and descriptive reports on teacher evaluation presently existed in school districts across the nation, but the approaches to teacher evaluation varied considerably from district to district. The most noteworthy trend in this report on teacher evaluation was the growing practice of involving teachers in the establishment of evaluation programs. The pattern of evaluation showed a trend away from the negative aspects of identifying poor teachers so they can be dismissed to the positive aspects of identifying their weaknesses and strengths. The present purpose of teacher evaluation was described as correcting weaknesses and reinforcing strengths of teachers (Oldham, 1974). A more recent publication by Educational Research Service supported the Current Trends Report of a few years ago in stating that "great diversity of thought on how to evaluate teaching performance, who should evaluate, and what criteria should be used" are still paramount considerations in evaluation programs in school districts across the nation (Robinson, 1978). The effective evaluation of teaching performance is still of particular importance in meeting the demands for accountability by the public (Robinson, 1978). The overview of the research done by Educational Research Service indicated that 97.9 per cent of school systems responding to its survey on teacher evaluation carried on some type of formal teacher evaluation program. The main purposes of evaluation have not changed in school districts throughout the country. The two main purposes of teacher evaluation as noted by the report of Educational Research Service are: (1) to perform an evaluative function for management decisions; and (2) to perform a developmental function to help teachers identify areas for improvement and growth (Robinson, 1978). SUMMARY
It was evident by the middle of the 1970's that this nation's educational institutions were besieged by internal and external forces which demanded that these institutions be held accountable and show evidence of having used the taxpayers' money wisely before asking for additional money. The demand for "accountability" in the schools gained increased momentum from the soaring cost of education in the 1960's. It was the demand for accountability placed on the schools that intensified the search in the 1970's to find improved ways to evaluate the effectiveness of teachers. Historically, the cry of the public for "accountability" has not been new but one which received renewed emphasis with each succeeding period of financial stress for taxpayers. For the teacher, the topic of accountability quickly translated into an assessment of the quality of his instruction and the related necessity of selecting criteria by which one would judge his efforts. In their effort to respond to the accountability emphasis, educators, and in particular administrators, turned to business and industry to find more effective ways of measuring educational efficiency and output in the schools. Borrowing management and evaluation techniques from business and industry has not been popular with teachers, because they feared unrealistic levels of required performance would be set from higher management levels without input from teachers. Traditionally, teachers felt that any measurement of their effectiveness which was solely dependent upon product measures would be punitive toward them and their students. This feeling stemmed from the fact that teachers were held accountable for many variables which they could neither measure nor control. In recent years teachers have asked for and taken a more active role in finding what they consider to be equitable ways of measuring their own effectiveness.
The question of whether to evaluate teachers and how to do it has been taken out of the hands of local school districts in many states, as state legislatures enacted laws which mandated the evaluation of all teachers at specified intervals and often in specified ways. California was one state which initiated this trend with passage of the Stull Act in 1971. Similar laws awaited enactment in other states. Many teachers' groups came to the conclusion that teacher evaluation was here, would stay, and most likely would expand. Teachers felt that they should be included in shaping the policies, setting the goals, and designing the instruments of evaluation. As a result, many associations made teacher evaluation a negotiable item in contract bargaining. Historically, the evaluation of teacher effectiveness has relied upon two criteria, as pointed out by the critical overview by Barr. The two criteria used have been efficiency ratings of one sort or another and measured pupil gains. Neither criterion has been exceedingly reliable or measurable. Approaches used by researchers in the development of criteria that described teacher effectiveness included the "traits approach," which considered the qualities of the individual, and the "behavioral approach," which integrated the concept of personality with that of teaching methods (Barr, 1961). The greatest advance with regard to teacher evaluation has been made in the area of teaching techniques and outcomes, or products of teaching. These advances have occurred in the refinement of measuring and evaluating an individual's techniques, the concentration on patterns of techniques, the study of teacher-pupil interaction, and the greatly improved methods of measuring and evaluating effects of teaching on pupil achievement and functioning (Sciara and Jantz, 1972).
The most noteworthy trend in teacher evaluation, as reported by Oldham (1974), was the growing practice of involving teachers in the establishment of evaluation programs. The primary purpose of evaluation shows a trend away from the negative aspect of identifying poor teachers so they can be dismissed and toward the positive aspect of identifying weaknesses and strengths so that the former can be corrected and the latter reinforced.

CHAPTER III

PROCEDURES

This study was directed at determining what teachers and administrators in Montana believed were the appropriate criteria for judging the effectiveness of a teacher. A survey instrument was prepared which included an assortment of criteria for judging teacher effectiveness. This instrument, in addition to a form which was used to collect necessary demographic data, was mailed to a randomly selected sample of Montana school administrators and teachers. The returned questionnaires provided demographic data and the ratings given to the criteria by each respondent. The responses of administrators and teachers were compared and the results indicated in appropriate table form. This chapter provides a description of the population studied, the sampling procedure used, the method used for data collection, the manner by which data were organized and presented, the hypotheses tested, the means by which the data were analyzed, and the precautions taken to insure accuracy. The final section consists of a chapter summary.

Description of the Population

This study sampled all teachers and administrators who were under contract in school districts in Montana during the 1976-1977 school term. This population, as determined by the Office of Public Instruction, numbered approximately 9,428 people. Within this estimated population, the Office of Public Instruction listed in its directory for that year 665 administrative positions designated as either superintendent or principal positions (O.P.I., 1977). A count of the elementary principals, secondary principals, and superintendents listed in the directory indicated a total of 665 administrator positions divided into 319 elementary principal positions, 143 secondary principal positions, and 203 district superintendent positions. The number of administrative positions listed in the directory did not indicate the actual population of administrators sampled for this study. The difference between the total positions listed and the actual number of persons sampled was accounted for by the practice of smaller school districts' combining two or more administrative functions into one position. During the school term from which the administrator sample was taken for this study, 68 people served in two or more administrative positions concurrently in Montana school districts. Because of this practice an actual administrator population of 597 people served in a total of 665 administrative positions in the 1976-1977 school year. The population for this study included all public school principals in first- and second-class school districts, superintendents of third-class districts, and teachers on the staff of each administrator who responded to the survey and consented to have his staff surveyed. The population from which administrator and teacher samples were drawn included all principals and third-class district superintendents and all teachers who were under contract in the State of Montana for the 1976-1977 school term. The researcher chose the administrator population from principals in first- and second-class school districts and superintendents in the third-class school districts to insure the probability of gaining a sample of administrators who were directly responsible for judging the effectiveness of teachers who worked directly under the administrator's supervision.
In most third-class districts in Montana the district superintendent of schools assumes the role of principal, who traditionally is the administrator responsible for evaluating teachers on his staff. In most of the larger first- and second-class school districts a principal is employed who directly supervises teachers. For the purpose of this study the random sample of school administrators was limited to a total population of approximately 474 administrators. This number was arrived at by consulting the 1976-1977 Montana Education Directory, which listed all employed principals and district superintendents, and computing the total population which met the limitation as previously described in this chapter. Table I gives the district classification and categories of the administrator population total.

TABLE I

CATEGORIES OF ADMINISTRATOR POPULATION

Districts         High School   Junior High   Elementary   Totals
Principals
  Class I              23            18           139         180
  Class II             85             7           129         221
  Sub-totals          108            25           268         401
Superintendents
  Class III                                                    73
TOTAL ADMINISTRATOR POPULATION                                474

The purpose of this study required that the administrator population be sampled first in order to arrive at the total population from which teacher samples were drawn. The administrators were sampled and the returns were tallied to determine the total number who gave permission to have their respective staffs sampled by the same instrument. This form of sampling resulted in a proportional stratified random sampling of all teachers in the State of Montana. Sources in the Department of Public Instruction for the State of Montana listed a total teacher population of 9,428 for the 1976-1977 school year. The total teacher population on the staffs of the responding school administrators numbered 2,348.
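The two-stage draw described above, with administrators sampled first and teachers then drawn in proportion to the consenting staffs, can be sketched as follows. This is a minimal illustration only: the stratum names follow the three district classes, the stratum sizes are chosen to sum to the 2,348-teacher stratified population, and the individual member labels are invented.

```python
import random

def proportional_stratified_sample(strata, total_n, seed=0):
    """Draw a proportional stratified random sample.

    strata:  dict mapping stratum name -> list of members
    total_n: desired overall sample size
    Each stratum contributes members in proportion to its share
    of the total population.
    """
    rng = random.Random(seed)
    pop = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(total_n * len(members) / pop)
        sample[name] = rng.sample(members, min(k, len(members)))
    return sample

# Hypothetical staffs grouped by district class; sizes sum to 2,348
strata = {
    "Class I": [f"I-{i}" for i in range(1446)],
    "Class II": [f"II-{i}" for i in range(700)],
    "Class III": [f"III-{i}" for i in range(202)],
}
sample = proportional_stratified_sample(strata, 568)
```

With these (invented) stratum sizes, the proportional allocation works out to 350, 169, and 49 teachers, totaling the 568 drawn in the study.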
To insure a .05 confidence level for the study a sample size of 568 teachers was drawn from a total state population of 9,428 teachers and from a stratified population of 2,348 teachers. Of the 568 questionnaires which were sent to teachers, 455 were returned and 454 were usable, a usable return rate of 79.93 percent. The sample size and population comparisons for teachers are represented in Table II.

TABLE II

TEACHER SAMPLE CHARACTERISTICS

Teachers Sampled                          568
State Population                        9,428
Stratified Population                   2,348
Per Cent of State Population             6.0%
Per Cent of Stratified Population       24.2%

Sampling Procedure

In late January of 1977 total population figures were made available through the Montana Office of Public Instruction for teachers and administrators who were under contract in school districts. Using the computer facilities of the Office of Public Instruction, a sample of 114 administrators which included principals and superintendents was drawn randomly. This sample represented 24 percent of the population members, which is considerably larger than that needed to insure a 95 percent confidence level. Of the 114 administrators who were surveyed for this study, 112 responded. Usable returns numbered 110 of the original 114 survey instruments used for sampling purposes. Two of the original 114 survey instruments were not returned and two were incomplete returns that were not usable for this study. The usable administrator survey returns of 110 out of 114 sampled represent a return of 96.5 percent. Table III illustrates the categories of the administrator sample. After the administrator sample was returned, the total number of administrators who consented to have their staffs surveyed was determined. Questionnaires were sent to 568 teachers. This sample was drawn from a total stratified population of 2,348 teachers. Of the 455 teacher respondents to the questionnaire on teacher effectiveness criteria measures, 454 were usable returns.
One return was not used due to non-response to the items in the survey instrument. The total number of teacher respondents categorized by class of district resulted in Class I districts being represented by 278 teachers, Class II districts by 135 teachers, and Class III districts by 41 teachers. Table IV illustrates the returned teacher sample characteristics by district class, sex, and returned administrator sample size.

TABLE III

CATEGORIES OF THE ADMINISTRATOR SAMPLE

Administrators                     Number   Per Cent of Total
Principals, Class I                   39         35.5
Principals, Class II                  49         44.6
Superintendents, Class III            21         19.0
County Superintendent                  1          0.9
Total Administrator Sample           110        100.0

Total Males in the Administrator Sample = 103, or 93.6% of the sample
Total Females in the Administrator Sample = 7, or 6.4% of the sample

TABLE IV

NUMBER AND PER CENT OF TEACHER RESPONDENTS BY DISTRICT AND SEX CATEGORIES

Class of     No. of Adm.    No. of Teacher    Per Cent of Teacher    Total by Class
District     Respondents    Respondents       Respondents            of District
                            Male    Female    Male    Female         Total   Per Cent
Class I           39         130      148     28.7     32.7           278     61.2
Class II          49          64       71     14.1     15.7           135     29.8
Class III         22          18       23      3.9      5.0            41      9.0
Sub-totals       110         212      242     46.5     53.5
Total Sample                                                          454    100.0

Method of Collecting Data

A survey instrument which listed sixteen (16) criteria of teacher effectiveness was sent to the members of the sample population for each to rate according to the importance of each criterion which, in their judgment, determined teacher effectiveness. This instrument was constructed and used in a similar survey conducted by Jenkins and Bausell in the State of Delaware (Jenkins and Bausell, 1974).
An assumption, made and described in the instrument for the survey, was that adequate measures were available to measure each of the criteria which were randomly listed in the instrument. The continuum used for rating each of the criteria listed on the survey instrument ranged from an evaluation of "completely unimportant" to an evaluation of "extremely important" on a nine-point scale. Below the instructions the sixteen criteria were listed in random order. Beneath each criterion there was a nine-point scale like this:

Capacity to perceive the world from the student's point of view.

Completely                                            Extremely
Unimportant   __ __ __ __ __ __ __ __ __              Important

The survey instrument sent to the teacher sample included, in addition to the listing of the sixteen criterion measures of teaching effectiveness, the following question: How effective is your administrator in helping you to improve your teaching effectiveness? The survey instrument sent to administrators contained an identical listing of the sixteen criterion measures of teacher effectiveness but contained the following additional question: How effective are you in helping your teachers to improve their teaching effectiveness? Attached to each survey instrument was a cover sheet which listed briefly the purpose of the survey, instructions for completing the questionnaire, and seven items of demographic data to be answered by the respondent. Space was provided on the back of the instrument for a respondent's comments. The kinds of demographic information included: school district classification, sex of respondent, years of experience, grade levels presently taught, and student enrollment in the district where the respondent taught. Copies of the form asking for this demographic information, along with the survey sent to teachers and administrators, appear in Appendix C.
Method of Organizing Data

Demographic data were presented in the form of percentages and listed in tables depicting years of experience, classes of districts, grade levels taught, and sex categories for the total samples of the administrator and teacher populations. The experience categories for teacher respondents to the survey were the same as those which were used for the administrator respondents. The experience categories are: Category I (0-5 years), Category II (6-10 years), Category III (11-15 years), Category IV (16-20 years), Category V (21-25 years), and Category VI (26 years and over). Administrative organizational patterns vary from district to district in Montana in relation to the grouping of grade levels. The groupings most commonly used by districts are: (K-3) kindergarten through grade three, the primary level; (4-6) grades four through six, the intermediate level; (7-9) grades seven through nine, the upper grade level or Junior High School; and (9-12) grades nine through twelve, the secondary or Senior High School level. A variation from the (7-9) and (9-12) groupings is used in smaller school districts, which results in a (7-8) and (9-12) grade grouping and in some districts a (6-8) and (9-12) grouping. For the purpose of this paper the grade levels designated (K-8), kindergarten through grade eight, and (9-12), grades nine through twelve, have been used to describe the elementary and secondary levels respectively. Table V illustrates the years of experience categories for the administrator sample; comparisons were made by district classification and sex of the respondents. Table VI lists the years of experience of teacher respondents by district classification; comparisons are made by sex of respondents and summed in percentages per district class.

TABLE V

YEARS OF EXPERIENCE OF ADMINISTRATORS

Years of Experience        Total    Per Cent of Total
1. 0-5                       32          29.0
2. 6-10                      30          27.3
3. 11-15                     16          14.5
4. 16-20                     20          18.2
5. 21-25                      5           4.5
6. 26-over                    5           4.5
Data not Available            2           2.0
Totals                      110         100.0

Sub-totals by district: Class I, 34 male and 5 female (39); Class II, 48 male and 1 female (49); Class III, 21 male and 1 female (22).

TABLE VI

THE YEARS OF EXPERIENCE OF TEACHER RESPONDENTS BY DISTRICT CLASSIFICATION AND SEX

Years of          1st Class        2nd Class        3rd Class
Experience       Male  Female     Male  Female     Male  Female     Total   Per Cent
1. 0-5            25     46        24     27        11     12        145      32.0
2. 6-10           45     45        17     18         2      8        135      29.7
3. 11-15          24     26        16      9         3      2         80      17.6
4. 16-20          17     16         4      6         0      0         43       9.5
5. 21-25           7      6         1      4         2      1         21       4.6
6. 26-over        11      9         1      7         0      0         28       6.2
Info. Not
Available          1      0         1      0         0      0          2        .4
Sub-totals       130    148        64     71        18     23        454     100.0
Per Cent of
Total            28.6   32.6      14.1   15.6       4.0    5.1

Totals by classification: Class I, 278 (61.2%); Class II, 135 (29.7%); Class III, 41 (9.1%).

The percentages of teacher respondents, both male and female, teaching at the elementary, upper grade, and high school levels are listed in Table VII. Comparisons were made by class of district. Table VIII illustrates the number of teacher respondents found in each grade level as compared to school district classification. Comparisons were made by sex of respondents and the totals for each district class were compiled.

TABLE VII

PERCENTAGE OF TEACHERS BY DISTRICT CLASSIFICATION

Grade        District I       District II      District III
Levels      Male  Female     Male  Female     Male  Female     Total
K-6          6.8   21.8       2.6   10.4       1.3    4.4       47.3
7-9          8.8    5.5       4.4    2.2       1.1     .2       22.2
9-12        13.0    5.3       7.1    3.1       1.5     .5       30.5
Sub-totals  28.6   32.6      14.1   15.7       3.9    5.1      100.0
Totals         61.2              29.8              9.0

Mean ratings of the sixteen criterion measures of teacher effectiveness were compiled separately for all respondents in the teacher and administrator samples. In addition, mean ratings were compiled for teachers who were divided into three sub-groups by class of school district.
The three classes of school districts are Class I, Class II, and Class III.

TABLE VIII

NUMBER OF TEACHER RESPONDENTS BY GRADE LEVEL CLASSIFICATION

Grade        First Class      Second Class     Third Class
Levels      Male  Female     Male  Female     Male  Female     Total
K-6          31     99        12     47         6     20         215
7-9          40     25        20     10         5      1         101
9-12         59     24        32     14         7      2         138
Sub-totals  130    148        64     71        18     23         454
Totals         278               135               41

Totals by grade level within each district: District I: K-6 130, 7-9 65, 9-12 83 (278); District II: K-6 59, 7-9 30, 9-12 46 (135); District III: K-6 26, 7-9 6, 9-12 9 (41).

Tables were compiled which listed the mean ratings and rank order of the sixteen criteria for administrators, all teachers, and teachers of Class I, II, and III districts respectively. Data supplied by the respondents were examined and comparisons made in the ordering of criteria by ratings of teachers and administrators. These data were arranged in table form to show the rank order of the sixteen criteria. The types of criteria were listed in the categories of process, product, and presage as described by Mitzel (Mitzel, 1960). Mean ratings were listed for each criterion, along with combined means for the type categories of process, product, and presage. Responses of elementary teachers, secondary teachers, and administrators were compared for significant differences in ranking the criteria. These data were organized into tables which illustrated the comparisons.

Hypotheses Tested

Hypothesis I

The null hypothesis was that there is no significant difference among the subgroup measures of teacher effectiveness; process, product and presage in the responses between teachers and administrators.

Hypothesis II

The null hypothesis was that there is no significant difference among the subgroup measures of teacher effectiveness; process, product and presage in the responses of teachers in the three district classifications.

Hypothesis III

The null hypothesis was that there is no significant difference among the subgroup measures of teacher effectiveness; process, product
and presage in the responses between male and female teachers.

Hypothesis IV

The null hypothesis was that there is no significant difference among the subgroup measures of teacher effectiveness; process, product and presage in the responses between elementary and secondary teachers.

Hypothesis V

The null hypothesis was that there is no significant difference among the subgroup measures of teacher effectiveness; process, product and presage between the six classes of years of experience for teachers.

Hypothesis VI

The null hypothesis was that there is no significant relationship among the differences between the teacher rating of each of the 16 measures of teacher effectiveness and the teacher rating of the administrator.

Hypothesis VII

The null hypothesis was that there is no significant relationship in the mean rating of each of the 16 criteria of teacher effectiveness between teachers and administrators (either as combined groups or by classes of districts) and a reference group from another study.

The test of significance for this study was at the .05 level. A summary table will be included which will report the distribution of the responses of the teachers and administrators in all categories, and frequency tables will be used whenever necessary.

Method of Analyzing Data

Hypotheses I through V were tested by using the analysis of variance statistic in conjunction with appropriate tests of significance. When appropriate, Duncan's test for multiple comparisons was used to test the individual means. Sub-test comparisons were made on the criteria grouped under the Mitzel Scheme of categorizing effectiveness criteria into three groups labeled process, product and presage. Definitions of the Mitzel Scheme were provided in Chapter II, pages 69-70 of this study. Hypothesis VI
was analyzed by using a multiple regression design. The multiple regression design makes use of the coefficient of multiple correlation, which is an estimate of the relationship between one variable and two or more others in combination. The correlation coefficient may be used to predict or estimate a score on an unknown variable from knowledge of a score on a known variable. The dependent variable is spoken of as the criterion; the independent variables are described as predictors. In this problem, the criterion variable represented the rated effectiveness of the administrator in improving the effectiveness of the teacher (the Y variable). The differences in the perception of the importance of criterion measures in evaluating teacher effectiveness between teachers and administrators represented the independent or predictor variables (the X variables). There was a total of 16 predictor variables. All differences between teacher and administrator perception were computed for the 16 criteria on the survey instrument. Spearman's coefficient of rank correlation formula was used to compare the rankings of Montana teachers' and administrators' ratings on effectiveness criteria to the rankings of the Delaware teachers and administrators. Mean ratings only were available to the researcher from the Delaware study; therefore, a non-parametric method of comparing results was chosen. Tables illustrating the comparisons were made and correlation coefficients were calculated. The significance of the calculations, as determined by the critical value of F for N degrees of freedom, was listed from appropriate tables.

Precautions Taken for Accuracy

All data compiled from questionnaires were minutely examined by two people prior to using computer processing techniques. All samples were drawn by computer programs to insure randomness of sampling. Calculations were checked by computer and hand calculators to eliminate mathematical error.
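The regression design just described can be sketched with ordinary least squares. This is a minimal illustration, not the study's actual computation: the data are randomly generated stand-ins for the 16 difference-score predictors (only three are used here) and for the teacher's rating of the administrator.

```python
import numpy as np

def multiple_regression(X, y):
    """Ordinary least squares fit; returns the coefficient vector b
    (intercept first) and the coefficient of multiple correlation R."""
    A = np.column_stack([np.ones(len(X)), X])  # add an intercept column
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ b
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    R = max(0.0, 1.0 - ss_res / ss_tot) ** 0.5
    return b, R

# Invented stand-in data: rows are teachers, columns are difference
# scores (teacher rating minus administrator rating) on three of the
# 16 criteria; y is each teacher's rating of the administrator.
rng = np.random.default_rng(1977)
X = rng.normal(size=(30, 3))
y = 5.0 + 0.5 * X[:, 0] + rng.normal(scale=0.3, size=30)
b, R = multiple_regression(X, y)
```

Here R plays the role of the coefficient of multiple correlation described above: the closer it is to 1, the better the predictor variables jointly account for the criterion variable.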
SUMMARY

Data used to answer the questions relating to the problem of what teachers and administrators in Montana believed were the appropriate criteria for judging the effectiveness of a teacher were obtained by using a survey-rating instrument. The population for this study was composed of the total teaching and administrative population who were under contract in school districts throughout Montana during the 1976-1977 school term. The instrument used to collect data was essentially the same instrument used in a similar study conducted by Jenkins and Bausell in the State of Delaware (Jenkins and Bausell, 1974). The instrument was based upon three categories of teaching effectiveness measures as described by Harold Mitzel (Mitzel, 1960). Data supplied by respondents were examined and compared in table form to show the rank order of the 16 criteria along with combined means for the type categories of process, product and presage. The analysis of variance statistic, along with appropriate tests for significance, was used to test hypotheses one through five. Hypothesis six was tested using the multiple regression equation to answer the question of whether or not the expected success of an administrator in helping individual teachers to become more effective could be determined from the teacher's rating of the administrator compared to the difference in administrator and teacher ratings of teacher effectiveness criteria. The Spearman coefficient of rank correlation was used to test Hypothesis VII, which compared the relationship between how teachers and administrators of Montana viewed effectiveness criteria and how teachers and administrators of Delaware viewed effectiveness criteria. The null hypotheses were tested at the .05 level of significance. Summary tables were constructed to report the distribution of responses of teachers and administrators in all categories, and frequency tables were used whenever necessary.
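The analysis-of-variance test applied to hypotheses one through five can be sketched in miniature. The ratings below are invented; the study's actual groups were the teacher and administrator samples and their sub-groupings.

```python
def one_way_anova_F(groups):
    """One-way ANOVA: F = (between-group mean square) / (within-group
    mean square), with k - 1 and N - k degrees of freedom."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / N
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum(sum((x - m) ** 2 for x in g)
                    for g, m in zip(groups, means))
    df_between, df_within = k - 1, N - k
    F = (ss_between / df_between) / (ss_within / df_within)
    return F, df_between, df_within

# Invented 1-to-9 ratings of one criterion by two respondent groups
teachers = [8, 9, 7, 8, 8, 9, 7]
administrators = [8, 8, 9, 9, 8, 9, 8]
F, df_b, df_w = one_way_anova_F([teachers, administrators])
# The null hypothesis is rejected when F exceeds the critical value
# of F(df_b, df_w) at the .05 level of significance.
```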
The survey responses were categorized and recorded by a committee of two people to insure accuracy, and an electronic calculator as well as the computer services at Montana State University were employed to minimize computational errors.

CHAPTER IV

ANALYSIS OF DATA

Introduction

As described in Chapter I, the purpose of this study was to determine whether or not administrators who were charged with evaluating teacher effectiveness agreed on the criteria that were used in judging effective teaching, and to compare that finding with the teachers' view of criteria measures of teaching effectiveness. Comparisons were made between the results of this study and a similar one conducted in the State of Delaware. In an attempt to probe into how both teachers and administrators viewed the criteria upon which teachers have been evaluated, this researcher used a survey instrument that was constructed by Jenkins and Bausell for a similar study carried out in the State of Delaware (Jenkins and Bausell, 1974). The survey instrument included an assortment of sixteen criteria based on the categories of product, process and presage that were employed by Harold Mitzel (1960) and described in Chapter II. An additional item was added to the survey instrument which asked the administrator respondent to rate his own effectiveness in helping teachers under his supervision to improve their teaching effectiveness. A comparable item was added to the survey instrument which was sent to teachers: teacher respondents were asked to rate the effectiveness of their administrator in helping the teacher to improve his own teaching effectiveness.

Mean Ratings of Criteria

The ratings by administrator respondents of each of the sixteen effectiveness criteria listed on the survey instrument were tallied. The mean rating for each criterion was calculated from the total of administrator responses.
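The tallying and rank ordering described in this chapter amount to computing a mean per criterion and sorting. A minimal sketch with invented ratings for three of the sixteen criteria:

```python
def rank_order_means(ratings):
    """Given {criterion: list of 1-9 ratings}, return (criterion, mean)
    pairs sorted so the highest mean is ranked first."""
    means = {c: sum(r) / len(r) for c, r in ratings.items()}
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)

# Invented ratings for three of the sixteen criteria
ratings = {
    "Effectiveness in controlling his class": [9, 8, 8, 9],
    "Amount his students learn": [8, 7, 8, 7],
    "Years of teaching experience": [5, 4, 6, 5],
}
ranked = rank_order_means(ratings)
```

The same grouping of criteria by the Mitzel types would then average the means within each type to produce the combined means reported in the tables that follow.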
The means thus calculated were rank ordered so that the highest mean rating for a criterion was listed as number one; the lowest ranked criterion was listed sixteenth. The means calculated for the sixteen criteria as rated by administrators ranged from a low mean of 4.95, on a scale of 1 to 9, to a high mean of 8.32. Each of the sixteen criterion measures of teacher effectiveness appearing on the survey instrument was labeled in accordance with the Mitzel Scheme. This Scheme described each criterion as either a process, product or presage criterion. A description of the Mitzel Scheme and definitions of the three types of effectiveness criteria appear in Chapter II of this study. The sixteen criterion measures rated by administrators were grouped into the three types of effectiveness measures labeled product, process, and presage, and combined means were calculated for each type of criterion measure. The highest mean rating for combined effectiveness measures was that labeled product; its mean was 7.61, rated on a scale from 1 to 9. This combined mean rating was followed by the process criteria with a combined mean rating of 7.49 and the presage criteria with a combined mean rating of 6.55. The presage criteria received the lowest rating by administrators as a measure of teacher effectiveness. Table IX lists the rank order of means of the sixteen measures of teacher effectiveness as rated by Montana administrators. In addition, the combined mean ratings of product, process and presage are listed in Table IX. Attention is called to the fact that although the product criteria, which are measures of student gain, were rated highest as a group, the process criterion "effectiveness in controlling his class" was rated highest by administrators. In other words, discipline was the top ranking criterion by which to measure teacher effectiveness as rated by administrators.
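Rank orders such as those in Table IX are what the Spearman comparison described in Chapter III operates on. A sketch of the formula follows; the second rank vector is invented for illustration rather than taken from the Delaware data.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's coefficient of rank correlation (no-ties form):
    rho = 1 - 6 * sum(d**2) / (n * (n**2 - 1)),
    where d is the difference between paired ranks."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# One group's rank order of the 16 criteria compared with a
# hypothetical second group's ranking of the same criteria
ranks_group_one = list(range(1, 17))
ranks_group_two = [1, 3, 2, 4, 5, 7, 6, 8, 10, 9, 11, 12, 13, 15, 14, 16]
rho = spearman_rho(ranks_group_one, ranks_group_two)
```

A rho near 1 indicates two groups ordering the criteria nearly identically; a rho near -1 indicates opposite orderings.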
The same format that was used in Table IX to list the mean ratings of administrators' measures of teacher effectiveness was used in Tables X, XI, XII, and XIII. Table X lists the rank order of means of the sixteen effectiveness criteria as rated by Montana teachers; listed at the bottom of Table X are the combined mean ratings. Special note should be made of the fact, as indicated by the results listed in Table X, that teachers rated the process criteria highest. This rating differed from administrators, who rated the product criteria highest. However, as noted from Table X, teachers rated "effectiveness in controlling his class" the highest of the sixteen criteria, as did administrators.

TABLE IX

RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER EFFECTIVENESS FOR ADMINISTRATORS OF COMBINED CLASSES OF SCHOOL DISTRICTS

Rank   Criteria (Ordered by Rating)                                  Type (Mitzel Scheme)   Mean
 1.    Effectiveness in controlling his class.                       Process                8.32
 2.    Relationship with class (good rapport).                       Process                7.95
 3.    Knowledge of subject matter and related areas.                Presage                7.68
 4.    Amount his students learn.                                    Product                7.66
 5.    Personal adjustment and character.                            Presage                7.57
 6.    Influence on student's behavior.                              Product                7.56
 7.    Willingness to be flexible, to be direct or indirect
       as situation demands.                                         Presage                7.49
 8.    Capacity to perceive the world from the student's
       point of view.                                                Process                7.39
 9.    Ability to personalize his teaching.                          Process                7.36
10.    Extent to which his verbal behavior in classroom is
       student-centered.                                             Process                7.35
11.    General knowledge and understanding of education facts.       Presage                6.59
12.    Extent to which he uses inductive (discovery) methods.        Process                6.58
13.    Civic responsibility (patriotism).                            Presage                6.53
14.    Performance in student teaching.                              Presage                5.87
15.    Participation in community and professional activities.       Presage                5.74
16.    Years of teaching experience.                                 Presage                4.95
Combined Means by Type (Table IX): Product 7.61, Process 7.49, Presage 6.55.

TABLE X

RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER EFFECTIVENESS FOR TEACHERS OF COMBINED CLASSES OF SCHOOL DISTRICTS

Rank   Criteria (Ordered by Rating)                                  Type (Mitzel Scheme)   Mean
 1.    Effectiveness in controlling his class.                       Process                8.13
 2.    Knowledge of subject matter and related areas.                Presage                8.09
 3.    Relationship with class (good rapport).                       Process                7.89
 4.    Willingness to be flexible, to be direct or indirect
       as situation demands.                                         Presage                7.61
 5.    Personal adjustment and character.                            Presage                7.56
 6.    Ability to personalize his teaching.                          Process                7.44
 7.    Capacity to perceive the world from the student's
       point of view.                                                Process                7.34
 8.    Extent to which his verbal behavior in classroom is
       student-centered.                                             Process                7.31
 9.    Influence on student's behavior.                              Product                7.19
10.    Amount his students learn.                                    Product                7.04
11.    General knowledge and understanding of education facts.       Presage                6.77
12.    Extent to which he uses inductive (discovery) methods.        Process                6.35
13.    Civic responsibility (patriotism).                            Presage                6.09
14.    Participation in community and professional activities.       Presage                5.23
15.    Years of teaching experience.                                 Presage                5.08
16.    Performance in student teaching.                              Presage                5.01

Combined Means by Type: Process 7.41, Product 7.12, Presage 6.43.

Tables XI, XII, and XIII show the mean ratings by teachers of the sixteen measures of teacher effectiveness criteria by district classification. Because school districts differ in size as determined by population, districts are classed from I, which has the greatest population, to III, which has the smallest population. The rank order of teacher ratings of effectiveness criteria was listed by district classification to show whether or not there is an appreciable difference in how teachers rated effectiveness criteria as determined by the size of the district in which they taught.
It is again noted from Tables XI, XII, and XIII that teachers, regardless of the size of school district in which they taught, rated process criteria highest, as noted by the combined mean ratings listed in each table. As with the combined teacher and administrator ratings, teachers, regardless of the size of the district in which they taught, rated "effectiveness in controlling his class" as number one. A comparison of teacher and administrator ratings on the three types of teacher criteria is shown in Table XIV. This comparison is made in terms of combined means.

The Testing of Hypotheses

The first hypothesis examined whether or not there is agreement between administrators and teachers on the criteria that measure teacher effectiveness. To test the null hypothesis, the sixteen criterion measures were grouped into the three subgroups of process, presage and product, which have been described as the Mitzel Scheme. The first hypothesis was tested at the .05 level of significance using the least squares analysis of variance.

TABLE XI

RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER EFFECTIVENESS FOR TEACHERS OF CLASS I SCHOOL DISTRICTS

Rank   Criteria (Ordered by Rating)                                      Type (Mitzel Scheme)   Means
 1.    Effectiveness in controlling his class.                           Process                8.12
 2.    Knowledge of subject matter and related areas.                    Presage                8.12
 3.    Relationship with class (good rapport).                           Process                7.92
 4.    Willingness to be flexible, to be direct or indirect             Presage                7.77
       as situation demands.
 5.    Personal adjustment and character.                                Presage                7.65
 6.    Ability to personalize his teaching.                              Process                7.56
 7.    Extent to which his verbal behavior in classroom is               Process                7.42
       student centered.
 8.    Capacity to perceive the world from the student's                 Process                7.35
       point of view.
 9.    Influence on student's behavior.                                  Product                7.28
10.    Amount his students learn.                                        Product                6.94
11.    General knowledge and understanding of education facts.           Presage                6.74
12.    Extent to which he uses inductive (discovery) methods.            Process                6.39
13.    Civic responsibility (patriotism).                                Presage                6.12
14.    Participation in community and professional activities.           Presage                5.17
15.    Years of teaching experience.                                     Presage                5.14
16.    Performance in student teaching.                                  Presage                5.14

Type        Combined Means
Process     7.46
Product     7.11
Presage     6.48

TABLE XII

RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER EFFECTIVENESS FOR TEACHERS OF CLASS II SCHOOL DISTRICTS

Rank   Criteria (Ordered by Rating)                                      Type (Mitzel Scheme)   Means
 1.    Effectiveness in controlling his class.                           Process                8.12
 2.    Knowledge of subject matter and related areas.                    Presage                7.98
 3.    Relationship with class (good rapport).                           Process                7.85
 4.    Personal adjustment and character.                                Presage                7.50
 5.    Willingness to be flexible, to be direct or indirect             Presage                7.33
       as situation demands.
 6.    Capacity to perceive the world from the student's                 Process                7.30
       point of view.
 7.    Ability to personalize his teaching.                              Process                7.24
 8.    Amount his students learn.                                        Product                7.16
 9.    Extent to which his verbal behavior in classroom is               Process                7.09
       student centered.
10.    Influence on student's behavior.                                  Product                7.06
11.    General knowledge and understanding of education facts.           Presage                6.79
12.    Extent to which he uses inductive (discovery) methods.            Process                6.23
13.    Civic responsibility (patriotism).                                Presage                5.96
14.    Participation in community and professional activities.           Presage                5.28
15.    Years of teaching experience.                                     Presage                5.04
16.    Performance in student teaching.                                  Presage                4.86

Type        Combined Means
Process     7.30
Product     7.11
Presage     6.34

TABLE XIII

RANK ORDER OF MEANS FOR SIXTEEN CRITERION MEASURES OF TEACHER EFFECTIVENESS FOR TEACHERS OF CLASS III SCHOOL DISTRICTS

Rank   Criteria (Ordered by Rating)                                      Type (Mitzel Scheme)   Means
 1.    Effectiveness in controlling his class.                           Process                8.32
 2.    Knowledge of subject matter and related areas.                    Presage                8.27
 3.    Relationship with class (good rapport).                           Process                7.73
 4.    Willingness to be flexible, to be direct or indirect             Presage                7.49
       as situation demands.
 5.    Capacity to perceive the world from the student's                 Process                7.41
       point of view.
 6.    Ability to personalize his teaching.                              Process                7.37
 7.    Amount his students learn.                                        Product                7.36
 8.    Extent to which his verbal behavior in classroom is               Process                7.32
       student centered.
 9.    Personal adjustment and character.                                Presage                7.20
10.    Influence on student's behavior.                                  Product                7.05
11.    General knowledge and understanding of education facts.           Presage                6.93
12.    Extent to which he uses inductive (discovery) methods.            Process                6.49
13.    Civic responsibility (patriotism).                                Presage                6.35
14.    Participation in community and professional activities.           Presage                5.49
15.    Years of teaching experience.                                     Presage                4.83
16.    Performance in student teaching.                                  Presage                4.63

Type        Combined Means
Process     7.44
Product     7.20
Presage     6.40

TABLE XIV

COMBINED MEANS OF RATINGS OF ADMINISTRATORS AND TEACHERS OF MONTANA

Type (Mitzel Scheme)   Admin.   All Teachers   Class I Teachers   Class II Teachers   Class III Teachers
Process                7.49     7.41           7.46               7.30                7.44
Product                7.61     7.12           7.11               7.11                7.20
Presage                6.55     6.43           6.48               6.34                6.40

Null hypothesis 1: There is no significant difference among the subgroup measures of teacher effectiveness (process, product and presage) in the responses between teachers and administrators.

Since the computed F value of 6.79 was greater than the critical value of 3.84, the null hypothesis that there was no significant difference between the administrators and the teachers was rejected. Administrators rated the three subgroups of criteria significantly higher than teachers.

Since the computed F value of 55.60 was greater than the critical value of 2.99, the null hypothesis that there was no significant difference among the three criteria subgroups was rejected. Duncan's test for multiple comparisons indicated that the subgroups process and product were rated significantly higher than presage criteria.
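Each hypothesis in this chapter follows the same decision rule: reject the null hypothesis whenever the computed F exceeds the tabled critical value at the .05 level. A minimal sketch of that rule, applied to the two F ratios above (Python, for illustration only):

```python
def reject_null(computed_f: float, critical_f: float) -> bool:
    """Reject the null hypothesis when the computed F exceeds the critical F."""
    return computed_f > critical_f

# Hypothesis 1: administrators vs. teachers (row means), then the
# three criteria subgroups (column means), as reported in the text.
tests = [(6.79, 3.84), (55.60, 2.99)]
decisions = [reject_null(f, crit) for f, crit in tests]
print(decisions)  # [True, True] -> both null hypotheses rejected
```

The interaction test that follows (F = 1.73 against 2.99) fails this rule, which is why that null hypothesis is not rejected.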
Since the computed F value of 1.73 was less than the critical value of 2.99, the null hypothesis that there was no significant difference between administrators and teachers among the three criteria was not rejected. The three criteria subgroups were presage, process and product.

Table XV presents the analysis of variance results for administrators' and teachers' ratings of the three subgroups of criteria. As noted from the table, teachers rated process criteria higher than either product or presage criteria. Administrators rated product criteria higher than either process or presage criteria. While both administrators and teachers placed relatively equal significance on process and product measures of teacher effectiveness, they gave significantly less emphasis to presage criteria.

Schools in Montana belong to one of three classes of districts, which are determined by the population within the district. The larger school districts are Class I, with a population of 6500 or more; intermediate school districts are Class II, with a population of 1000 to 6500; and smaller school districts are Class III, with a population of 1000 or less. Enrollments of students and sizes of teaching staffs are proportional to the class of district in which they are located. One of the questions to be answered by this study was whether or not teachers in larger districts viewed teacher effectiveness criterion measures differently than teachers in intermediate and smaller or rural districts. Hypothesis 2, relating to classes of districts, was tested at the .05 level of significance using the least squares analysis of variance.

TABLE XV

ADMINISTRATORS VERSUS TEACHERS: LEAST SQUARE MEANS AMONG THE THREE SUB-TESTS BETWEEN ADMINISTRATORS AND TEACHERS

Sample           N     Test 1     Test 2     Test 3     Total
                       Presage    Process    Product
Administrators   81    6.11       7.10       7.39       6.87
Teachers         452   5.98       6.99       6.90       6.62
Total                  6.04       7.04       7.15

F = 6.791   row means*          critical value of F = 3.84
F = 55.605  column means*       critical value of F = 3.00
F = 1.733   interaction means   critical value of F = 3.00
* .05 level of confidence       Overall mean = 6.66   S.D. = 1.42

Duncan's Test
7.04 compared to 6.04   significant .05
7.15 compared to 6.04   significant .05
7.15 compared to 7.04

Null hypothesis 2: There is no significant difference among the subgroup measures of teacher effectiveness (process, product and presage) in the responses of teachers in the three district classifications.

Since the computed F value of .29 was less than the critical value of 3.00, the null hypothesis that there was no significant difference in the teachers' responses by the three district classifications was not rejected.

Since the computed F value of 62.28 was greater than the critical value of 3.00, the null hypothesis that there was no significant difference among the three criteria subgroups was rejected. Duncan's test for multiple comparisons indicated that the subgroups process and product were rated significantly higher than presage.

Since the computed F value of .63 was less than the critical value of 2.37, the null hypothesis that there was no significant difference in teachers' responses from the three classes of school districts among the three criteria was not rejected. The three criteria subgroups were process, product and presage.
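The row and column entries in Table XV are related by simple averaging: each marginal least-square mean is the unweighted average of the cell means it spans. A sketch of that check in Python (the cell values are those shown in Table XV; this is illustrative, not the original computation):

```python
# Least-square cell means from Table XV: (presage, process, product).
administrators = [6.11, 7.10, 7.39]
teachers = [5.98, 6.99, 6.90]

# Row totals: unweighted average across the three sub-tests.
admin_total = sum(administrators) / 3
teacher_total = sum(teachers) / 3

# Column totals: unweighted average of the two samples.
columns = [(a + t) / 2 for a, t in zip(administrators, teachers)]

print(round(admin_total, 2), round(teacher_total, 2))  # ~6.87 and ~6.62
print([round(c, 2) for c in columns])                  # ~[6.04, 7.04, 7.15]
```

The same layout reappears in Tables XVI through XIX, so the check applies throughout the chapter.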
Table XVI presents the analysis of variance results for teachers of the three classes of school districts and their ratings of the three subgroups of criteria.

TABLE XVI

LEAST SQUARE MEANS OF COMPARISON OF TEACHERS OF THREE CLASSES OF DISTRICTS: TEACHERS OF 1st, IInd, AND IIIrd CLASS SCHOOL DISTRICTS

Teacher Sample          N     Test 1     Test 2     Test 3     Total
                              Presage    Process    Product
1st Class Districts     305   6.03       7.05       6.92       6.67
IInd Class Districts    170   5.93       6.94       7.05       6.64
IIIrd Class Districts   58    6.07       7.00       7.11       6.73
Total                         6.01       6.99       7.03

F = .292    row means           critical value of F = 3.00
F = 62.279  column means*       critical value of F = 3.00
F = .635    interaction means   critical value of F = 2.37
* .05 level of confidence       Overall mean = 6.66   S.D. = 1.42

Duncan's Test
6.99 compared to 6.01   significant .05
7.03 compared to 6.01   significant .05
7.03 compared to 6.99

The results in Table XVI indicate that no significant difference existed among teachers of the three classes of school districts in their separate ratings of teacher effectiveness criteria. A significant difference existed in the ratings given process and product criteria compared to presage criteria of effectiveness by teachers of the three classes of school districts. This difference exceeded the .05 level of confidence.

The third hypothesis was tested by the least squares analysis of variance at the .05 level of significance. The objective of this hypothesis was to determine whether or not there was a significant difference between male and female teachers in their rating of the teacher effectiveness criteria.

Null hypothesis 3: There is no significant difference among the subgroup measures of teacher effectiveness (process, product and presage) in the responses between male and female teachers.

Since the computed F value of 6.66 was greater than the critical value of 3.84, the null hypothesis that there was no significant difference between male and female teachers' ratings was rejected.
Female teachers were the higher raters of criteria.

Since the computed F value of 95.26 was greater than the critical value of 3.00, the null hypothesis that there was no significant difference among the three criteria process, product and presage was rejected. Duncan's test for multiple comparisons indicated that the subgroups process and product were rated significantly higher than presage criteria.

Since the computed F value of .76 was less than the critical value of 2.99, the null hypothesis that there was no significant difference between male and female teachers among the three criteria was not rejected. Process, product and presage made up the three criteria.

Table XVII presents the analysis of variance for male and female teacher ratings of the three subgroups of criteria. Female teachers rated the three types of teacher effectiveness criteria higher than male teachers did. Both groups rated process and product criteria significantly higher than presage criteria at the .05 level of confidence.

The fourth hypothesis was tested by the least squares analysis of variance at the .05 level of significance. The purpose of this hypothesis was to determine whether or not elementary and secondary teachers differed significantly in their rating of teacher effectiveness criteria.

Null hypothesis 4: There is no significant difference among the subgroup measures of teacher effectiveness (process, product and presage) in the responses between elementary and secondary teachers.

Since the computed F value of 2.35 was less than the critical value of 3.84, the null hypothesis that there was no significant difference between elementary and secondary teacher ratings of criteria was not rejected.
The difference was not significant at the .05 level of confidence.

TABLE XVII

A COMPARISON OF LEAST SQUARE MEANS AND ANALYSIS OF VARIANCE FOR MALE AND FEMALE TEACHERS' RATINGS OF CRITERIA

Sample           N     Test 1     Test 2     Test 3     Combined Mean
                       Presage    Process    Product
Male             286   5.90       6.89       6.95       6.58
Female           247   6.12       7.14       7.01       6.75
Combined Means         6.01       7.02       6.98

F = 6.66   row means*           critical value of F = 3.84
F = 95.26  column means*        critical value of F = 3.00
F = .762   interaction means    critical value of F = 3.00
* .05 level of confidence       Overall mean = 6.66   S.D. = 1.42

Duncan's Test
7.02 compared to 6.01   significant .05
6.98 compared to 6.01   significant .05
6.98 compared to 7.02

Since the computed F value of 79.29 was greater than the critical value of 3.00, the null hypothesis that there was no significant difference among the three criteria was rejected. Process and product criteria were rated significantly higher than presage criteria, as indicated by Duncan's test for multiple comparisons.

Since the computed F value of .82 was less than the critical value of 3.00, the null hypothesis that there was no significant difference between elementary and secondary teachers among the three criteria was not rejected. The three criteria process, product and presage were compared.

Table XVIII presents the analysis of variance for elementary and secondary teachers for the three subgroups of teacher effectiveness criteria. For comparison, as noted in the table, the elementary teachers included all grade levels from kindergarten through grade eight (K-8), and the secondary teachers included all grade levels from grade nine through twelve (9-12). The two levels are commonly regarded as elementary and secondary. Elementary teachers rated process criteria highest, and secondary teachers rated product criteria highest. Neither rating was high enough to be significant in comparison to the next highest subgroup rating.
There was a significant difference in the rating of process and product criteria over presage criteria for the combined group (K-12).

TABLE XVIII

LEAST SQUARES MEANS FOR ELEMENTARY VERSUS SECONDARY TEACHERS FOR THREE SUBGROUPS OF TEACHER EFFECTIVENESS CRITERIA

Sample                      N     Test 1     Test 2     Test 3     Combined Mean
                                  Presage    Process    Product
Elementary Teachers (K-8)   363   6.01       7.08       7.00       6.70
Secondary Teachers (9-12)   170   5.98       6.84       6.93       6.59
Combined Means (K-12)             5.99       6.96       6.96

F = 2.35   row means            critical value of F = 3.84
F = 79.29  column means*        critical value of F = 3.00
F = .82    interaction means    critical value of F = 3.00
* .05 level of confidence       Overall mean = 6.66   S.D. = 1.42

Duncan's Test
6.96 compared to 5.99   significant .05
6.96 compared to 5.99   significant .05
6.96 compared to 6.96

It may be assumed that as teachers gained years of experience in teaching they might have viewed measures of teaching effectiveness differently than in the earlier years of their experience. For example, a teacher who was inexperienced might have been more concerned with his own "personal adjustment and character", a presage criterion, and rated it higher than "amount his students learn", a product criterion. A more experienced teacher might have reversed the emphasis between these two criteria.

The fifth hypothesis examined the assumption that no significant difference existed among the subgroup measures of teacher effectiveness (process, product and presage) as rated by teachers who were grouped into six classes of years of experience. Briefly stated, years of experience has little or no influence on how teachers rated teacher effectiveness criteria. The six classes of experience were described as follows: (1) 0-5 years, (2) 6-10 years, (3) 11-15 years, (4) 16-20 years, (5) 21-25 years and (6) 26 years and over. The number of teachers sampled in each experience group was listed in Table VII.
The number of administrators sampled in each experience group was listed in Table III. The least squares analysis of variance statistic was used to test hypothesis 5 at the .05 level of significance. The means for the six classes of teacher experience under the subgroup headings of process, product and presage criteria are listed in Table XIX.

TABLE XIX

LEAST SQUARE MEANS AND ANALYSIS OF VARIANCE OF SIX EXPERIENCE CLASSES FOR TEACHERS

Teacher Years of Experience   N     Test 1     Test 2     Test 3     Total
                                    Presage    Process    Product
Class 1 (0-5 years)           172   5.70       6.93       6.85       6.49
Class 2 (6-10 years)          157   5.86       6.98       6.93       6.59
Class 3 (11-15 years)         91    6.19       6.97       6.97       6.71
Class 4 (16-20 years)         56    6.45       7.18       7.30       6.98
Class 5 (21-25 years)         25    6.17       6.97       6.85       6.66
Class 6 (26+ years)           32    6.85       7.35       7.41       7.20
Total                               6.20       7.06       7.05

Analysis of Variance
F = 6.838   row means*          critical value of F = 2.21
F = 43.465  column means*       critical value of F = 3.00
F = .923    interaction means   critical value of F = 1.83
* .05 level of confidence       Overall mean = 6.64   S.D. = 1.42

Duncan's Test (Column Means)
7.06 compared to 6.20   significant .05
7.05 compared to 6.20   significant .05
7.05 compared to 7.06

Duncan's Test (Row Means)
7.20 compared to 6.66   significant .05
7.20 compared to 6.98
7.20 compared to 6.71   significant .05
7.20 compared to 6.59   significant .05
7.20 compared to 6.49   significant .05
6.98 compared to 6.71   significant .05
6.98 compared to 6.59   significant .05
6.98 compared to 6.49   significant .05
6.98 compared to 6.66   significant .05

Null hypothesis 5: There is no significant difference among the subgroup measures of teacher effectiveness (process, product, and presage) between six classes of years of experience for teachers.

Since the computed F value of 6.84 was greater than the critical value of 2.21, the null hypothesis that there was no significant difference in the ratings of teachers among years of experience was rejected.
The years of experience comprised six classes.

Since the computed F value of 43.46 was greater than the critical value of 3.00, the null hypothesis that there was no significant difference among the three subgroups of criteria was rejected. The subgroups process, product and presage comprised the effectiveness criteria.

Since the computed F value of .92 was less than the critical value of 1.83, the null hypothesis that there was no significant difference in teacher ratings by classes of experience among the three criteria was not rejected.

A significant difference beyond the .05 level of confidence was found in the total means of product and process criteria as compared to the presage criteria. Duncan's test for multiple comparisons was administered to the analysis of variance data to determine which experience class means were significantly different at the .05 level of confidence. The comparison of the total means of each experience class indicated that the mean for Class 4, which represented 16-20 years of experience, was significantly higher than the means for Class 1 (0-5 years of experience), Class 2 (6-10 years of experience), Class 3 (11-15 years of experience) and Class 5 (21-25 years of experience). Class 6, which represented 26 and more years of experience, was significantly higher than the remaining Classes 1, 2, 3, and 5 at the .05 level of confidence. This researcher could not determine the reason for the significant difference found in how teachers with 16-20 and 26+ years of experience rated the three groups of effectiveness criteria compared to teachers of other experience categories. About all that could be said was that teachers in the 16-20 and 26+ years of experience categories were consistently higher raters of effectiveness criteria.
The researcher observed that teachers with zero to fifteen (0-15) years of experience and teachers with twenty-one to twenty-five (21-25) years of experience rated process criteria equally as high as or higher than product criteria. The difference was not significant. Teachers with sixteen to twenty (16-20) years of experience and those with twenty-six and more (26+) years of experience rated product criteria highest of the three subgroups of criterion measures. The difference between product and process ratings was not significant for these two groups of teachers.

To examine whether or not a significant relationship existed between the differences in teacher and administrator ratings of the 16 criterion measures of teacher effectiveness and how the teachers rated the effectiveness of their administrator in helping them to become more effective teachers, this researcher first obtained an administrator rating of the criteria and then, with the administrator's permission, obtained the teachers' ratings of the criteria concerning the administrator's effectiveness. This sampling procedure has been described in more detail in the introduction of this chapter. Hypothesis VI, which examined this question, was tested at the .05 level of significance using a multiple regression equation computed with the teacher rating of the administrator as the dependent variable and the difference between the teacher and administrator rating of each of the 16 criterion measures of teacher effectiveness as the independent variables. Table XX presents the multiple regression analysis of the comparison.

Null hypothesis 6: There is no significant relationship among the differences between the teacher rating of each of the 16 measures of teacher effectiveness and the teacher rating of the administrator.
Since the computed F value of 1.07 was less than the critical value of 1.67, the null hypothesis that there was no significant relationship among the differences between the teacher ratings of the 16 criterion measures and the rating of the administrator was not rejected. Only one criterion was found significant at the .05 level of confidence. The differences between teacher and administrator ratings of the 16 teacher effectiveness criteria and the teachers' ratings of the administrators were not significant except for one criterion, #13 on the survey instrument, listed as "effectiveness in controlling his class".

TABLE XX

MULTIPLE REGRESSIONS AMONG THE 16 INDEPENDENT VARIABLES OF TEACHER AND ADMINISTRATOR DIFFERENCES AND THE DEPENDENT VARIABLE, TEACHER RATING OF ADMINISTRATOR EFFECTIVENESS

Variable   Beta      T-Test   Par R
 1.        .05032     .88      .0417
 2.        .01461     .26      .0123
 3.        .00923     .25      .0118
 4.       -.01938    -.33     -.0159
 5.       -.02561    -.31     -.0149
 6.        .02265     .37      .0174
 7.        .08600    1.08      .0511
 8.       -.09677   -1.87     -.0887
 9.       -.00946    -.14     -.0065
10.        .06294    1.01      .0482
11.        .01459     .30      .0144
12.       -.04659    -.46     -.0218
13.        .30510    2.78      .1309*
14.       -.11406   -1.52     -.0722
15.       -.07011    -.79     -.0376
16.       -.04772    -.94     -.0449

Constant = .04035     R2 = .0374     Multiple R = .1933

Source               D.F.   SS         MS
Due to regression    16     121.533    7.5958
About regression     442    3131.138   7.0840
TOTAL                458    3252.671

F = 1.07     Critical value of F = 1.67 at the .05 level of confidence

The difference between teacher and administrator rating of this criterion was significantly related to how teachers rated the effectiveness of the administrator in helping them to become more effective. Although this one of the 16 criteria was found significant at the .05 level of confidence, it accounted for only 1.7% of the variance.
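The summary figures in Table XX hang together arithmetically: F is the ratio of the two mean squares, R-squared is the regression sum of squares over the total, and the 1.7% figure for criterion #13 is its squared partial correlation. A sketch of those checks in Python (illustrative only, using the values printed in Table XX):

```python
ss_regression, df_regression = 121.533, 16
ss_total, df_total = 3252.671, 458
ss_residual = ss_total - ss_regression   # about-regression SS
df_residual = df_total - df_regression   # 442

# F ratio: mean square regression over mean square residual.
f_ratio = (ss_regression / df_regression) / (ss_residual / df_residual)

# Proportion of variance explained by all 16 difference scores.
r_squared = ss_regression / ss_total

# Criterion #13: squared partial correlation (Par R = .1309).
var_13 = 0.1309 ** 2

print(round(f_ratio, 2), round(r_squared, 4), round(var_13 * 100, 1))
# 1.07 0.0374 1.7
```

This makes the non-rejection above concrete: even the one significant predictor explains under two percent of the variance in the teachers' ratings of their administrators.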
Criterion #13, which was found to be significantly related to how teachers rated their administrators' effectiveness, can be described as the exercising of discipline in the classroom. This aspect of a teacher's effectiveness was rated highest in comparison to the remaining 15 criterion measures of teacher effectiveness by both teachers and administrators. The difference illustrated by the multiple regression equation was most likely caused by teachers and administrators communicating on this particular criterion. For example, should a teacher and administrator disagree as to what constituted effective classroom control (discipline), then the difference most likely would have resulted in confrontation or dialogue between teacher and administrator. How effective the teacher perceived the administrator to be in resolving this conflict determined how the teacher rated the administrator's effectiveness in helping that teacher to become more effective.

It was probable that not enough difference of opinion existed between the administrator and his staff on the remaining criteria to cause communication between the administrator and his staff to occur, which resulted in little or no evaluation of the administrator. It is probable that administrator effectiveness is measured by criteria other than those listed on the survey instrument, with the exception of criterion #13. Another way to view the lack of significance between the teachers' ratings of the 16 criterion measures of teacher effectiveness and their rating of the effectiveness of their administrator was to assume that unless differences occurred between the administrator and the teacher on a given criterion, no discussion took place between the teacher and the administrator, and no basis existed for the teacher to rate the administrator's expertise in resolving conflict over a given criterion measure.
Classroom control is a noted exception because it is easily observable, and both teacher and administrator have biases as to what constitutes effective classroom control (discipline). If they agreed on what constituted good classroom control, no conflict occurred; therefore, no observation resulted by the teacher on how effective an administrator was in dealing with the difference.

Hypothesis VII examined whether or not there was a significant relationship in the mean rating of each of the criteria of teacher effectiveness between teachers and administrators, either as combined groups or by classes of districts, and a reference group from a similar study conducted in Delaware. Table XXI lists in rank order the mean ratings of Delaware teachers and administrators of the 16 criteria of teacher effectiveness. Tables IX through XIII list in rank order the mean ratings for Montana teachers and administrators on the same 16 criterion measures of teacher effectiveness.

TABLE XXI

MEAN RATINGS AND RANK ORDER OF THE 16 CRITERIA: DELAWARE STUDY

Rank   Criteria (Ordered by Rating)                                      Type (Mitzel Scheme)   Means
 1.    Relationship with class (good rapport).                           Process                8.31
 2.    Willingness to be flexible, to be direct or indirect             Presage                8.17
       as situation demands.
 3.    Effectiveness in controlling his class.                           Process                7.88
 4.    Capacity to perceive the world from the student's                 Process                7.79
       point of view.
 5.    Personal adjustment and character.                                Presage                7.71
 6.    Influence on student's behavior.                                  Product                7.65
 7.    Knowledge of subject matter and related areas.                    Presage                7.64
 8.    Ability to personalize his teaching.                              Process                7.63
 9.    Extent to which his verbal behavior in classroom is               Process
       student-centered.
10.    Extent to which he uses inductive (discovery) methods.            Process                6.95
11.    Amount his students learn.                                        Product                6.86
12.    General knowledge and understanding of educational facts.         Presage                6.43
13.    Civic responsibility (patriotism).                                Presage                6.25
14.    Performance in student teaching.                                  Presage                5.66
15.    Participation in community and professional activities.           Presage                4.88
16.    Years of teaching experience.                                     Presage                3.89

Type        Combined means
Process     7.64
Product     7.26
Presage     6.43

Source: Phi Delta Kappan, April 1974.

Because only mean ratings were available from the Delaware study, which was conducted by Jenkins and Bausell, Spearman's coefficient of rank correlation was used to test hypothesis VII. Ferguson described the comparison of two correlated or independent samples by using ranks as nonparametric or distribution-free tests (Ferguson, 1971, p. 304). Spearman's coefficient of rank correlation is defined in such a way as to take a value of +1 when paired ranks are in the same order, a value of -1 when the ranks are in an inverse order, and an expected value of zero when the ranks are arranged at random with respect to each other. The formula meeting these conditions for the Spearman rho is

    P = 1 - (6 * sum of d^2) / (N(N^2 - 1))    (Ferguson, 1971, pp. 305-6).

Null hypothesis 7: There is no significant relationship in the mean rating of each of the 16 criteria of teacher effectiveness between teachers and administrators (either as combined groups or by classes of districts) and a reference group from another study.

Since the computed rho value of .888 was greater than the critical value of .425, the null hypothesis that there was no significant relationship in the ratings of criteria between Montana teachers and administrators was rejected. Significance was tested at the .05 level of confidence.

Since the computed rho value of .894 was greater than the critical value of .425, the null hypothesis that there was no significant relationship in the ratings of criteria between Montana teachers and Delaware teachers was rejected. Significance was tested at the .05 level of confidence.
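Spearman's formula can be applied directly to the paired ranks in Table XXII to reproduce the reported coefficient of .888 for Montana teachers versus administrators. A short illustrative sketch in Python (not part of the original computation):

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's coefficient of rank correlation: 1 - 6*sum(d^2) / (N(N^2-1))."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

# Paired ranks of the 16 criteria, as tabulated in Table XXII.
teacher_ranks = list(range(1, 17))
admin_ranks = [1, 3, 2, 7, 5, 9, 8, 10, 6, 4, 11, 12, 13, 15, 16, 14]

rho = spearman_rho(teacher_ranks, admin_ranks)
print(round(rho, 3))  # 0.888, well above the critical value of .425
```

Note this simple form of the formula assumes no tied ranks, which holds for the rank orders compared here.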
Since the computed rho value of .827 was greater than the critical value of .425, the null hypothesis that there was no significant relationship in the ratings of criteria between Montana administrators and Delaware teachers was rejected. Significance was tested at the .05 level of confidence.

Since the computed rho value of .977 was greater than the critical value of .425, the null hypothesis that there was no significant relationship in the rating of criteria between Montana teachers in first and second class school districts was rejected. Significance was tested at the .05 level of confidence.

Since the computed rho value of .947 was greater than the critical value of .425, the null hypothesis that there was no significant relationship in the rating of criteria between Montana teachers in first and third class school districts was rejected. Significance was tested at the .05 level of confidence.

Since the computed rho value of .956 was greater than the critical value of .425, the null hypothesis that there was no significant relationship in the rating of criteria between Montana teachers in second and third class school districts was rejected. Significance was tested at the .05 level of confidence.

Tables XXII through XXVII give the Spearman's coefficient of rank correlation calculations for the comparisons of Montana teachers and administrators, Montana teachers and Delaware teachers, Montana administrators and Delaware teachers, Montana teachers of Class I and Class II school districts, Montana teachers of Class I and Class III school districts, and Montana teachers of Class II and Class III school districts, respectively.

In Tables XXII through XXVII, columns one and two are arranged so that column one contains the rank order of the sixteen effectiveness criteria for a particular group of raters. Column two contains the rank that another group of raters gave the same criteria.
For example, in Table XXII, column one, Montana teachers rated the criterion "effectiveness in controlling his class" as number one in importance. Administrators in Montana also ranked this criterion as number one. The criterion "knowledge of subject matter and related areas" Montana teachers rated second in importance, but administrators rated this same criterion as number three in importance.

To find the rank-order ratings and descriptions of the criteria in columns one and two of Tables XXII through XXVII, the reader is referred to the tables which list the rank order of means for the sixteen effectiveness criteria.

TABLE XXII

CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA TEACHERS AND ADMINISTRATORS

Teachers' Rank       Administrators'     Difference
Order of Criteria    Rank of Criteria    d           d2
 1                    1                   0           0
 2                    3                  -1           1
 3                    2                   1           1
 4                    7                  -3           9
 5                    5                   0           0
 6                    9                  -3           9
 7                    8                  -1           1
 8                   10                  -2           4
 9                    6                   3           9
10                    4                   6          36
11                   11                   0           0
12                   12                   0           0
13                   13                   0           0
14                   15                  -1           1
15                   16                  -1           1
16                   14                   2           4
Totals                                    0     d2 = 76

P = 1 - (6 x 76) / (16(256-1)) = .888

Critical value of P for N of 16 = .425 at the .05 level of confidence.

Note: Refer to Table X, page 102, for the rank order of criteria for Montana teachers. Refer to Table IX, page 101, for the rank order of criteria for Montana administrators.

TABLE XXIII

CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA AND DELAWARE TEACHERS

Montana Teachers'    Delaware Teachers'   Differences
Rank Order of        Rank of Criteria     d           d2
Criteria
 1                    3                   -2           4
 2                    7                   -5          25
 3                    1                    2           4
 4                    2                    2           4
 5                    5                    0           0
 6                    8                   -2           4
 7                    4                    3           9
 8                    9                   -1           1
 9                    6                    3           9
10                   11                   -1           1
11                   12                   -1           1
12                   10                    2           4
13                   13                    0           0
14                   15                   -1           1
15                   16                   -1           1
16                   14                    2           4
Totals                                     0     d2 = 72

P = 1 - (6 x 72) / (16(256-1)) = .894

Critical value of P for N of 16 = .425 at the .05 level of confidence.

Note: Refer to Table X, page 102, for the rank order of criteria for Montana teachers. Refer to Table XXI, page 125, for the rank order of criteria for Delaware teachers and administrators.
TABLE XXIV

CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA ADMINISTRATORS AND DELAWARE TEACHERS

Montana Administrators'   Delaware Teachers'   Differences
Rank Order of Criteria    Rank Order of        d           d2
                          Criteria
 1                         3                   -2           4
 2                         1                    1           1
 3                         7                   -4          16
 4                        11                   -7          49
 5                         5                    0           0
 6                         6                    0           0
 7                         2                    5          25
 8                         4                    4          16
 9                         8                    1           1
10                         9                    1           1
11                        12                   -1           1
12                        10                    2           4
13                        13                    0           0
14                        14                    0           0
15                        15                    0           0
16                        16                    0           0
Totals                                          0     d2 = 118

P = 1 - (6 x 118) / (16(256-1)) = .827

Critical value of P for N of 16 = .425 at the .05 level of confidence.

Note: Refer to Table IX, page 101, for the rank order of criteria for Montana administrators. Refer to Table XXI, page 125, for the rank order of criteria for Delaware teachers.

TABLE XXV

CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA TEACHERS OF FIRST AND SECOND CLASS SCHOOL DISTRICTS

Class I Rank          Class II Rank       Differences
Order of Criteria     of Criteria         d           d2
 1                     1                   0           0
 2                     2                   0           0
 3                     3                   0           0
 4                     5                  -1           1
 5                     4                   1           1
 6                     7                  -1           1
 7                     9                  -2           4
 8                     6                   2           4
 9                    10                  -1           1
10                     8                   2           4
11                    11                   0           0
12                    12                   0           0
13                    13                   0           0
14                    14                   0           0
15                    15                   0           0
16                    16                   0           0
Totals                                     0     d2 = 16

P = 1 - (6 x 16) / (16(256-1)) = .977

Critical value of P for N of 16 = .425 at the .05 level of confidence.

Note: Refer to Table XI, page 104, for the rank order of criteria for Class I teachers. Refer to Table XII, page 105, for the rank order of criteria for Class II teachers.

TABLE XXVI

CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA TEACHERS OF FIRST AND THIRD CLASS SCHOOL DISTRICTS

Class I Rank          Class III Rank      Differences
Order of Criteria     Order of Criteria   d           d2
 1                     1                   0           0
 2                     2                   0           0
 3                     3                   0           0
 4                     4                   0           0
 5                     9                  -4          16
 6                     6                   0           0
 7                     8                  -1           1
 8                     5                   3           9
 9                    10                  -1           1
10                     7                   3           9
11                    11                   0           0
12                    12                   0           0
13                    13                   0           0
14                    14                   0           0
15                    15                   0           0
16                    16                   0           0
Totals                                     0     d2 = 36

P = 1 - (6 x 36) / (16(256-1)) = .947

Critical value of P for N of 16 = .425 at the .05 level of confidence.
Note: Refer to Table XI, Page 104 for the rank order of criteria for Class I teachers. Refer to Table XIII, Page 106 for the rank order of criteria for Class III teachers.

TABLE XXVII

CALCULATION OF SPEARMAN'S COEFFICIENT OF RANK CORRELATION FOR MONTANA TEACHERS OF SECOND AND THIRD CLASS SCHOOL DISTRICTS

Class II Teachers'    Class III Teachers'  Difference
Rank Order of Criteria  Rank of Criteria        d           d²
        1                     1                 0            0
        2                     2                 0            0
        3                     3                 0            0
        4                     9                -5           25
        5                     4                 1            1
        6                     5                 1            1
        7                     6                 1            1
        8                     7                 1            1
        9                     8                 1            1
       10                    10                 0            0
       11                    11                 0            0
       12                    12                 0            0
       13                    13                 0            0
       14                    14                 0            0
       15                    15                 0            0
       16                    16                 0            0
Totals                                          0     Σd² = 30

P = 1 - (6 Σd²) / (N(N² - 1)) = 1 - (6 x 30) / (16(256 - 1)) = .956

Critical value of P for N of 16 = .425 at the .05 level of confidence.

Note: Refer to Table XII, Page 105 for the rank order of criteria for Class II teachers. Refer to Table XIII, Page 106 for the rank order of criteria for Class III teachers.

To find the rank-order ratings and descriptions of the criteria in columns one and two of Tables XXII through XXVII, the reader is referred to the tables which list the rank order of means for the sixteen effectiveness criteria. Tables IX, X, XI, XII, XIII, and XXI of this chapter list this information respectively for Montana administrators, Montana teachers collectively, Class I district teachers, Class II district teachers, Class III district teachers, and Delaware teachers and administrators. Table references giving the above information are noted at the bottom of each of Tables XXII through XXVII.

For each of the comparisons, the critical value of P was .425 at the .05 level of confidence. The lowest correlation coefficient for the six comparisons was P = .827, which was the comparison between Montana administrators and Delaware teachers. P = .827 is considerably higher than the critical value of P = .425 at the .05 level of confidence, indicating a rather high degree of correlation between the subjects. The highest correlation coefficient obtained was from the comparison between Montana teachers from Class I and Class II school districts.
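Each of Tables XXII through XXVII applies the same arithmetic, so the coefficients are easy to verify. The following sketch (a modern illustration added for clarity, not part of the original analysis) recomputes the Montana teacher-administrator coefficient from the rank lists in Table XXII:

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's coefficient of rank correlation:
    P = 1 - (6 * sum of d^2) / (N * (N^2 - 1))."""
    assert len(ranks_a) == len(ranks_b)
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

# Table XXII: Montana teachers' ranks (1-16) against administrators' ranks
teacher_ranks = list(range(1, 17))
admin_ranks = [1, 3, 2, 7, 5, 9, 8, 10, 6, 4, 11, 12, 13, 15, 16, 14]

rho = spearman_rho(teacher_ranks, admin_ranks)
print(round(rho, 3))  # 0.888, well above the .425 critical value for N = 16
```

Substituting the rank lists from any of the other five tables reproduces the remaining coefficients in the same way.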
The correlation coefficient between these two groups was P = .977, which would indicate a high degree of agreement on how they viewed teacher effectiveness criteria. The correlation coefficient of P = .888 between Montana teachers and administrators indicated a high degree of agreement on teacher effectiveness criterion measures, well above the P = .425 level of significance at the .05 level of confidence.

In examining hypothesis VII by calculation of correlation coefficients, this researcher noted that Montana teachers and administrators seemingly agree in their rating of criteria for measuring teacher effectiveness. Their ratings correlated significantly with those of the teachers of Delaware who participated in a similar study.

To test the significance of the differences in how Montana teachers and administrators rated each criterion of effectiveness, a t-test for significance of the mean differences was calculated for each criterion measure. These t-ratios are shown in Table XXVIII. Examination of the table indicates that the differences in rating criteria 1, 2, 9, 10, 13, 14, and 16 between administrators and teachers were significant at the .05 level of confidence. These criteria and their types, in rank order of teacher rating as illustrated in Table IX, are described as follows:

Criterion                                                      Type
1.  Effectiveness in controlling his class.                    (Process)
2.  Knowledge of subject matter and related areas.             (Presage)
9.  Influence on student behavior.                             (Product)
10. Amount his students learn.                                 (Product)
13. Civic responsibility (patriotism).                         (Presage)
14. Participation in community and professional activities.    (Presage)
16. Performance in student teaching.                           (Presage)

Inspection of Table XXVIII indicates that administrators rated criteria 1, 9, 10, 13, 14, and 16 higher than teachers, while teachers rated criterion 2 higher than administrators.
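Each t-ratio in Table XXVIII can be reproduced from the reported means, variances, and sample sizes alone. The sketch below (an illustration added here, not the researcher's original computation) recomputes the ratio for criterion 10, "amount his students learn":

```python
import math

def t_ratio(mean1, mean2, var1, var2, n1, n2):
    """t for the difference of two group means, using separate group variances."""
    return abs(mean2 - mean1) / math.sqrt(var1 / n1 + var2 / n2)

# Criterion 10 ("amount his students learn"), values as reported in Table XXVIII:
# teacher mean 7.04 (N = 443, variance 2.77), administrator mean 7.66 (N = 80, variance 1.82)
t = t_ratio(mean1=7.04, mean2=7.66, var1=2.77, var2=1.82, n1=443, n2=80)
print(round(t, 2))  # 3.64, significant against the one-tail critical value of 1.64
```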
Other than for the reader's information, no conclusions were drawn by the researcher as to the significance of these particular data.

TABLE XXVIII

T-TEST OF SIGNIFICANCE OF THE MEAN DIFFERENCE FOR TEACHER AND ADMINISTRATOR

Rank Order   Teacher   Administrator   Teacher   Administrator   Teacher   Administrator     t
of Means     Mean X1     Mean X2         N1           N2           s1²          s2²        Ratio
    1          8.13       8.32          452           82          1.24          .61        1.90*
    2          8.09       7.68          451           82          1.41         1.01        3.30*
    3          7.89       7.95          452           82          1.62         1.16         .45
    4          7.61       7.49          445           78          1.90         1.73         .75
    5          7.56       7.57          447           81          2.11         1.35         .07
    6          7.44       7.36          447           80          1.82         2.09         .44
    7          7.34       7.39          452           80          2.15         1.45         .33
    8          7.31       7.36          444           81          2.26         1.21         .35
    9          7.19       7.56          452           82          2.05         1.51        2.46*
   10          7.04       7.66          443           80          2.77         1.82        3.64*
   11          6.77       6.59          448           81          2.75         1.97        1.03
   12          6.35       6.58          452           82          2.63         1.77        1.39
   13          6.09       6.53          448           81          3.97         2.25        2.32*
   14          5.23       5.74          451           80          3.68         2.60        2.55*
   15          5.08       4.95          450           79          3.95         3.38         .57
   16          5.01       5.87          451           80          4.20         3.02        3.98*

D.f. range from N1 + N2 - 2 = 527 to N1 + N2 - 2 = 534.
Critical value of t at .05 level of confidence = 1.64 for a one-tail test.
*Denotes significant value of t.

SUMMARY

All subjects in both the administrator and teacher samples for this study were administered the questionnaire which contained sixteen criterion measures of teacher effectiveness. The assumption was made that the sixteen criteria were measurable. The criterion measures were grouped together under the Mitzel Scheme of product, process, and presage. The rank order of means computed for administrators and teachers was listed, illustrating how each group rated the sixteen measures of teacher effectiveness. Comparison tables were made to illustrate the differences in ratings.

The least squares analysis of variance was the statistic used to measure the significance of differences in the ratings given by administrator and teacher groups to effectiveness criteria.
This analysis helped the researcher in determining whether or not the null should be retained in hypotheses one through five, which examined how: (I) teachers and administrators rated teacher effectiveness criteria, (II) teachers of three district classifications rated teacher effectiveness criteria, (III) male and female teachers rated teacher effectiveness criteria, (IV) elementary and secondary teachers rated teacher effectiveness criteria, and (V) years of experience influenced the rating of teacher effectiveness criteria by teachers.

The analysis of variance examination of the data, along with the use of Duncan's test for multiple comparisons when applicable, indicated that teachers and administrators in the State of Montana do not significantly differ in how they rated teacher effectiveness criteria grouped under the Mitzel Scheme of product, process, and presage types of measures. Analysis of the data did indicate that both Montana administrators and teachers rated process and product criteria measures significantly higher than presage criteria. Although administrators rated product measures highest and teachers rated process measures highest, the difference in measurement was not significant.

The sampled teachers taught in small, medium, and large districts in the State of Montana. Regardless of the size of district in which they taught, all viewed process and product criteria measures as significantly more important than presage criteria. It was noted that teachers in first class districts rated process criteria highest and teachers in class two and class three districts rated product criteria highest, but the difference was not significant.

In comparing how male and female teachers rated teacher effectiveness criteria, the same trend was indicated by the data. Both groups rated product and process measures of teacher effectiveness significantly higher than presage criteria.
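In the one-way case, the analysis of variance described above reduces to comparing the between-group mean square with the within-group mean square. The sketch below shows that computation on invented rating data (the numbers are hypothetical and are not the study's data):

```python
def one_way_anova_f(groups):
    """One-way ANOVA F ratio: between-group mean square over within-group mean square."""
    k = len(groups)                         # number of groups
    n = sum(len(g) for g in groups)         # total observations
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical mean ratings of three criterion sub-groups (illustration only)
product = [7.5, 7.2, 7.8, 7.1]
process = [7.6, 7.9, 7.4, 7.7]
presage = [5.1, 5.6, 4.9, 5.4]

f = one_way_anova_f([product, process, presage])
print(f > 1.0)  # a large F here reflects the gap between presage and the other two
```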
Male teachers rated product criteria highest and female teachers rated process criteria highest, but the difference was not significant.

Years of experience was not a significant factor in how Montana teachers rated teacher effectiveness criteria measures. Teachers of all experience categories, ranging from 0-5 years to over 26 years of experience, rated process and product criteria as significantly more important as effectiveness measures than presage criteria.

Teachers with 16-20 years of experience and with 26 and over years of experience were higher raters of the three sub-groups of teacher effectiveness measures than were teachers in other experience categories. Teachers in the experience categories of 16-20 years and 26 and over years rated product criteria measures highest. The teachers in the remaining experience categories rated process criteria equally high or higher as effectiveness measures. In neither case, however, was the difference significant.

The multiple regression equation was used to test whether or not a significant relationship existed among the differences between administrators' and teachers' ratings of teacher effectiveness and the teachers' rating of the administrator's effectiveness in helping the teacher to become a more effective teacher. The teachers' rating of the administrators was the dependent variable, sometimes spoken of as the criterion, and the sixteen teacher effectiveness criteria were the independent variables, sometimes spoken of as predictors.

The analysis of data using the multiple regression statistic indicated that only one of the sixteen criteria measures of teacher effectiveness was significant in predicting how teachers evaluated the effectiveness of their administrator. This criterion, listed as #13 on the survey instrument and entitled "effectiveness in controlling his class", although significant, accounted for only 1.7% of the total variance, which is very minimal in importance.
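The "percent of total variance accounted for" by a predictor is its contribution to the squared correlation between predicted and observed scores. As a single-predictor illustration with invented numbers (not the study's data), the sketch below computes that proportion by least squares:

```python
def r_squared(x, y):
    """Proportion of variance in y accounted for by a least-squares fit on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy ** 2) / (sxx * syy)

# Invented ratings: x = a teacher's rating of one effectiveness criterion,
# y = that teacher's rating of the administrator's helpfulness
x = [8, 9, 7, 8, 6, 9, 7, 8]
y = [6, 7, 6, 5, 5, 7, 6, 6]
print(round(r_squared(x, y), 3))  # 0.533 for these invented data
```

A value near .017, as reported for criterion #13, would indicate that the predictor explains almost none of the variance in the dependent variable.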
Apparently criteria other than the sixteen listed on the survey instrument are more significant in the teachers' evaluation of administrators' effectiveness. The null hypothesis used to examine this question was retained as true.

To compare the results of this study, which replicates in some degree a previous study completed in recent years in Delaware, this researcher used a nonparametric statistic. The statistic chosen to examine the relationship between the criterion ratings of Montana teachers and administrators and those of Delaware teachers and administrators was Spearman's Coefficient of Rank Correlation, because only mean data for the Delaware study was available to the researcher.

The Spearman rho statistic indicated that a high correlation existed between the results of the Delaware study and this study for combined teacher and administrator rankings of criterion measures of teacher effectiveness. In both studies, product and process criterion measures were rated significantly higher than presage measures of teacher effectiveness.

The difference in the results of this study and the Delaware study was observed to be that, by rank of mean comparison, Montana teachers and administrators rated "effectiveness in controlling his class" as number one, while Delaware teachers and administrators rated "relationship in class (good rapport)" as number one in importance. Both criteria are process criteria. One might conclude that discipline is of prime importance to Montana teachers. The "amount his students learn" (a product criterion) was rated fourth in importance by Montana teachers and administrators and eleventh by Delaware teachers and administrators.
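Because only mean data were available from the Delaware study, each group's criteria first had to be placed in rank order of their means before Spearman's coefficient could be applied. A small sketch of that ranking step, using hypothetical means for illustration only:

```python
def rank_order(means):
    """Return the rank (1 = highest mean) of each criterion, in the original order."""
    order = sorted(range(len(means)), key=lambda i: means[i], reverse=True)
    ranks = [0] * len(means)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

# Hypothetical mean ratings for four criteria
print(rank_order([7.2, 8.1, 5.4, 6.9]))  # [2, 1, 4, 3]
```

Two such rank lists, one per group, are exactly the inputs the Spearman formula requires.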
In comparing the mean ratings of each of the sixteen criterion measures of teacher effectiveness between teachers and administrators of Montana and the reference group of administrators and teachers of Delaware, the researcher concluded that the null hypothesis was a true description of this relationship. Teacher and administrator ratings in this study compared significantly with the ratings of Delaware teachers and administrators.

CHAPTER V

SUMMARY, CONCLUSIONS AND RECOMMENDATIONS

Summary

Past research in the area of teacher effectiveness has concentrated on identifying the effective and ineffective teacher. Several criteria have been used for placement of teachers into one of these two groups. However, when characteristics of each of the group members have been studied, few distinguishing variables have been identified.

Those who have been interested in teacher effectiveness have had different purposes and consequently have varied their interpretations of the problem. Some who have investigated the problem of teacher effectiveness would have been satisfied to know whether or not a teacher was getting desired results, with the results indicating effectiveness. Others wanted to know how to increase the probability of attaining desired results. Researchers of the latter persuasion were searching for lawful teaching behavior, i.e., validated procedures for achieving instructional ends. Their assumption was that effective teaching would be recognized when lawful relationships were established between instructional variables and learner outcomes, so that certain procedures in teaching would have, within certain probability limits, been labeled as effective or ineffective. To date there are no such laws, only a few leads or practices that are more likely than others to maximize the attainment of selected instructional ends.
Examination of the literature indicated that teacher effectiveness has been such a nebulous quality that it was unlikely that factors would ever be found which would successfully categorize teachers. In reality, effectiveness and ineffectiveness are not mutually exclusive. A teacher can be effective for one reason while at the same time he is ineffective for another reason.

Purposes of defining the effective teacher have varied from the practical need to assess teachers for retention or release at the local school district level to the researcher's purpose of determining how well a teacher could perform in any of a class of jobs which share many common characteristics, as well as identifying those common characteristics.

The school official has sought to determine how well a teacher performed his job in terms of certain specified, and more often unspecified, criteria. He has not been concerned with whether or not the job he asked the teacher to perform was representative of the class of such jobs, or whether the teacher could perform the class of jobs well.

The difference between a school official's concern and a researcher's concern has several implications. For example, overall or even intuitive ratings may be used by a school official to help make a very general assessment of how well a teacher performs. The general assessment thus gained has provided relevant and useful information for the immediate situation. Overall ratings have not generally been useful or relevant to research because such ratings have low reliabilities and have not been consistent with the purpose of the researcher: to predict and to describe.

Much unhappiness regarding assessment of teachers has been for curricular rather than instructional reasons. The teacher may have been labeled ineffective not because his students failed to achieve, but because the achievement has been in directions that were not valued by the rater.
Judgments regarding teachers have always been made, but the recent public outcry for accountability has placed increasing emphasis on the need to improve ways to evaluate teacher effectiveness. The problem inherent in this emphasis was the selection of suitable criteria upon which both administrators and teachers truly agreed measured teacher effectiveness.

The purpose of this study was to determine whether or not teachers and administrators in the State of Montana agreed on the criteria for judging effective teaching. In the need for school districts to meet the public's demand for accountability, this determination had to be made before a basis could be established between teachers and administrators for effective evaluation procedures.

To gather data for this study, the researcher sent a survey instrument to both administrators and teachers in the spring of 1977. The instrument used was that which was designed for a similar study conducted by Jenkins and Bausell in the State of Delaware. The administrator population was sampled first, and permission was given by each respondent to sample teachers who were members of his staff. The useable sample returns numbered 110 out of 114 sampled for administrators. Useable sample returns for teachers numbered 454 from 568 sent out.

Analysis of the data gathered from the survey of both administrators and teachers of Montana indicated that administrators and teachers agreed significantly within their respective groups on what criteria of teacher effectiveness were most important. Both groups agreed significantly on the same criteria measures of teacher effectiveness. This agreement would seemingly provide a basis within Montana school districts for objectively arriving at agreed-upon teacher effectiveness measures supportable by both teachers and administrators within a given school district.
Whether these same measures are acceptable to school board members and other school patrons, such as parents and youngsters whom teachers serve, remains to be determined. One can assume that agreement exists between the school district's professional staff and school patrons if elected school board officials accurately reflect the community's idea of effective teaching. Unless agreement is apparent, it is probable, as supported by review of the current literature on teacher effectiveness, that subjective measures will be the prevailing criteria in judging the effective and ineffective teacher.

Conclusions

The following conclusions were developed upon analysis of the data in this study.

1. Teachers and administrators of Montana were in agreement in how they ranked the criteria which measure teacher effectiveness as described by the Mitzel types of product, process, and presage. This was attested to by the high correlation between how administrators rated effectiveness criteria compared to teacher ratings of criteria, regardless of the size of school district in which the teachers taught.

2. In comparing the ratings by Montana teachers and administrators of teacher effectiveness criteria grouped in sub-groups of process, product, and presage, it was found that administrators rated all three sub-groups of criteria significantly higher than did teachers.

3. Both administrators and teachers in Montana rated process and product criteria significantly higher than presage criteria. This rating was consistent for comparisons by years of experience, classes of school districts, elementary versus secondary teachers, and male or female categories.

4. There was no significant difference among the ratings given to the three sub-groups of effectiveness criteria by Montana teachers in the three classes of school districts.

5.
There was a significant difference between male and female teachers in Montana in their ratings of the three sub-groups of effectiveness criteria. Female teachers rated all three sub-groups, process, product, and presage criteria, significantly higher than did male teachers.

6. There was no significant difference between elementary and secondary teachers of Montana in their ratings of the three sub-groups of effectiveness criteria, process, product, and presage.

7. Montana teachers grouped into six classes by years of experience rated effectiveness criteria significantly differently in two of the six classes of experience. Teachers with 16-20 years of experience and teachers with 26 or more years of experience consistently rated the three sub-groups of effectiveness criteria higher than teachers in the four remaining experience categories.

8. Of the sixteen effectiveness criteria, only the criterion "effectiveness in controlling his class" had a significant relationship to the rating of the administrator's effectiveness.

9. There was a significant correlation between this study and a similar study conducted in the State of Delaware in teacher and administrator rankings of the sixteen effectiveness criteria.

10. There was a significantly high correlation among Montana teachers' rankings of the sixteen effectiveness criteria by school district classification.

Recommendations for Further Study

1. A study should be conducted to determine if parents and other constituents of a school district served by teachers and administrators agree among themselves, and with the latter, as to what criteria are most important in measuring teacher effectiveness. This would determine if parents in a community emphasize the same criteria that teachers and administrators do in Montana. If expectations are similar, it is conceivable that school districts will be effective in working out acceptable teacher evaluation programs.

2.
A study should be conducted that would determine if present school district teacher evaluation programs emphasize more than just classroom control as the principal criterion for retaining or releasing teachers. This would test the assumption that if a teacher cannot exercise an acceptable level of student control, learning is less likely to take place.

3. There is a need to know what criteria teachers use to evaluate their administrator's effectiveness in helping them to become better teachers. The indication is that this process does not take place unless conflict in values arises between administrators and teachers on what criteria are considered most important in measuring effectiveness. If a teacher and administrator agree philosophically on a criterion measure, no need exists for the administrator to assess a teacher on that criterion measure. They agree; why measure?

4. As more pressure is exerted by the public for schools to give evidence that young learners are "mastering the skills", or show competency in attaining "agreed upon learner outcomes", it may be assumed that the degree of student competency attained will directly reflect upon the teacher's effectiveness. A study should be made to determine to what degree school districts presently assess their staff's competency by criterion-referenced test results of students, and whether this trend is apparent in Montana. The reason for inquiry stems from the fact that administrators in Montana emphasize the importance of the product criterion of "what students learn" much more than teachers presently do, as evidenced by the analysis of data in this study. Apparently administrators are sensitive to the public's demand for accountability much more than teachers at this time. If administrators reflect this sensitivity by using test results to give evidence of their school's effectiveness, parents may give more emphasis to these results than is warranted as an indication of a teacher's effectiveness.
This approach would surely lead to conflict and ineffective evaluation of staff if teachers do not accept product measures as being of primary importance.

5. It is not clear why principals and teachers place relatively greater emphasis on criteria other than "amount students learn", as proposed by accountability proponents. A start toward lessening the dissonant attitudes resulting from this conflict in values would be a study to determine the reasoning behind teachers' rating of student learning relative to other criteria.

LITERATURE CITED

Allen, Dwight W. (Ed.) and Eli Seifman. The Teacher's Handbook. Glenview, Ill.: Scott, Foresman and Company, 1971.

Anderson, C. C. and S. M. Hunka. "Teacher Evaluation: Some Problems and a Proposal," Harvard Educational Review. Cambridge, Mass.: Winter 1963.

Averch, Harvey A., Stephen J. Carroll, Theodore S. Donaldson, Herbert J. Kiesling, and John Pincus. How Effective is Schooling? (A Critical Review and Synthesis of Research Findings.) Santa Monica, Calif.: The Rand Corporation, December 1971.

Barr, A. S. Wisconsin Studies of the Measurement and Prediction of Teacher Effectiveness. Madison, Wisc.: Dembar Publications, Inc., 1961.

Barr, A. S. and others. "Wisconsin Studies of the Measurement and Prediction of Teacher Effectiveness: A Summary of Investigations," Journal of Experimental Education, 30 (1961), 5-156.

Beery, John R. Professional Preparation and Effectiveness of Beginning Teachers. (The Ford Foundation) Coral Gables, Fla.: Graphic Arts Press, 1960.

Biddle, Bruce J. and William J. Ellena. (Eds.) Contemporary Research on Teacher Effectiveness. New York: Holt, Rinehart and Winston, 1964.

Bleecher, Harvey. "The Authoritativeness of Michigan's Educational Accountability Program," The Journal of Educational Research, 69 (November and December 1975), 135-141.

Bolton, Dale L. Selection and Evaluation of Teachers. Berkeley, Calif.: McCutchan Publishing Corporation, 1973.

Bottenberg, Robert A. and Joe H. Ward. Applied Multiple Linear Regression. 6570th Personnel Research Laboratory, Aerospace Medical Division, Air Force Systems Command. Lackland Air Force Base, Texas: 1963.
Berkeley, Bottenberg, Robert A., and Joe H. Ward. Applied Multiple Linear Regres­ sion. 6510th Personnel Research'Laboratory, Aerospace Medical Division, Air Force Systems Command. Lackland Air Force Base, Texas: 1963. / Boyce, A. C . "Methods of Measuring Teachers' Efficiency," 14th Yearbook, National Society for the Study of Education, Part 2. Chicago: University of Chicago Press, 1915. 154 Briggs, Thomas H . Improving Instruction. (Supervision by Principals of Secondary Schools). New York: The Macmillan Company, 1938. Brighton, Stayner F . Increasing Your Accuracy in Teacher Evaluation. Englewood Cliffs, N . J.: Prentice-Hall, Inc., 1965. Brighton, Stayner F . and Cecil J . Hannah. Merit Pay Programs for Teachers. (A Handbook). San Francisco, Calif.: Fearon Publishers, ■ 1962. Brim, Orville G., Jr. Sociology and The Field of Education. Sage Foundation, New York: 1958. Russell Brophy, Jere E . and Carolyn M. Evertson. Learning from Teaching: A Developmental Perspective. Boston, Mass.: Allyn and Bacon, 1976. Brophy, Jere E. and Carolyn M. Evertson. "Teacher Education, Teacher Effectiveness, and Developmental Psychology." Eric ED 118 257, Report No. 75-10, August, 1975. Bush, Robert Nelson. The Teacher-Pupil Relationship. Prentice-Hall, Inc., 1954. New York: Coleman, James S . et.al. Equality of Educational Opportunity. Wash­ ington, D . C.: U . S. Department of Health, Education, and Welfare, Government Printing Office, August 1966. Dalton, Elizabeth L . What Makes Effective Teachers for Young Adoles­ cents? George Peabody College for Teachers, Nashville, Tennessee, 1962. Davis, Hazel. (Director) Evaluation of Teachers. (Research Report) Washington, D . C.: National Education Association, 1964. Domas, Simeon J . and David V. Tiedeman. "Teacher Competence: An Anno­ tated Bibliography," Journal of Experimental Education, XIXv No. 2 (December 1950) 101-218. Ebel, Robert L . (Ed.) Encyclopedia of Educational Research. Edition. 
Toronto, Ontario: The MacMillan Company, 1969.

Ebel, Robert L. and Roger M. Baun (Eds.) Encyclopedia of Educational Research. 4th Edition. Toronto, Ontario: The MacMillan Company, Collier-MacMillan Consolidated, 1969.

Ellena, William J. (Ed.) "Who's A Good Teacher?" American Association of School Administrators, 1201 Sixteenth Street, N.W., Washington, D.C., 1961.

Eye, Glen G. "The Superintendent's Role in Teacher Evaluation, Retention, and Dismissal," The Journal of Educational Research, 68:390-395, July/August 1975.

Ferguson, George A. Statistical Analysis in Psychology and Education. New York: McGraw-Hill Book Company, 1971.

Flanders, Ned A. Analyzing Teacher Behavior. Reading, Mass.: Addison-Wesley Publishing Company, 1970.

Flanders, Ned A. Teaching With Groups. Minneapolis, Minn.: Burgess Publishing Company, 1954.

Gage, N. L. Teacher Effectiveness and Teacher Education. (The Search for a Scientific Basis.) Palo Alto, Calif.: Pacific Books, Publishers, 1972.

Gage, N. L. (Ed.) Handbook of Research on Teaching. Chicago: The American Educational Research Association, Rand McNally and Company, 1963.

Gage, N. L. (Ed.) Mandated Evaluation of Educators: A Conference on California's Stull Act. Palo Alto, Calif.: Center for Research and Development in Teaching, School of Education, Stanford University, 1973.

Gagne, Robert M. The Conditions of Learning. New York: Holt, Rinehart and Winston, Inc., 1965.

Gary, Frank. "How Successful is Performance Evaluation," paper presented at the Annual Convention of the American Association of School Administrators, Dallas, Texas, ERIC, February 1975.

Getzels, J. W. and P. W. Jackson. The Teacher's Personality and Characteristics. Handbook of Research on Teaching, N. L. Gage (Ed.). Chicago: Rand McNally and Company, 1963.

Glass, Gene V. "Teacher Effectiveness," Evaluating Educational Performance. (A resource book of methods, instruments, and examples.) Herbert J. Walberg (Ed.).
Berkeley, Calif.: McCutchan Publishing Company, 1974.

Grabman, Hulda. "Accountability for What?" Nation's Schools, Education Digest, 38:65-68, October 1972.

Haley, Dennis Richard. "Relationship of Variables Beyond Teacher's Control and Teacher's Effectiveness Ratings by Students," an unpublished Doctoral Dissertation, Montana State University, Bozeman, Montana, 1974.

Helvig, Carl. (Ed.) Teacher Evaluation: The State of the Art and Solution. Harvard Educational Review. Cambridge, Mass.: May 1972.

Herman, Jerry J. Developing an Effective School Staff Evaluation Program. West Nyack, N.Y.: Parker Publishing Company, Inc., 1973.

Herrboltd, Allen A. "The Relationship Between the Perceptions of Principals and Teachers Concerning Supervisory Practices in Selected High Schools of Montana," an unpublished Doctoral Dissertation, Montana State University, Bozeman, Montana, 1975.

Hildebrand, Milton H. and Others. Evaluating University Teaching. (A Handbook.) Center for Research and Development in Higher Education, University of California, Berkeley, 1971.

Hottleman, Girard D. "The Accountability Movement," The Massachusetts Teacher, LIII, January 1974.

House, Ernest R. (Ed.) School Evaluation: The Politics and Process. Berkeley, Calif.: McCutchan Publishing Corporation, 1973.

Howard, Alvin W. "Accountability at Last (and Again)," National Association of Secondary School Principals, 58:20-23, March 1974.

Jenkins, Joseph R. and R. Barker Bausell. "How Teachers View the Effective Teacher: Student Learning is Not the Top Criterion," Phi Delta Kappan, April 1974.

Kibler, Robert J., Donald J. Cegala, Larry L. Barker, and David T. Miles. Objectives for Instruction and Evaluation. Boston: Allyn and Bacon, Inc., 1974.

Lewis, James, Jr. Appraising Teacher Performance. West Nyack, N.Y.: Parker Publishing Company, Inc., 1973.

Marsh, Joseph E. and Eleanor W. Wilder. Identifying the Effective Instructor: A Review of the Quantitative Studies, 1900-1952.
Chanute Air Force Base, Ill.: Air Force Personnel and Training Center, 1954.

McGowan, Francis A. II. Teacher Observation and Evaluation: A Working Paper. ERIC ED 113 309, November 1974.

McKenna, Bernard H. Staffing the Schools. New York: Bureau of Publications, Teachers College, Columbia University, 1965.

McNeil, John D. and W. James Popham. "The Assessment of Teacher Competence," taken from Robert M. W. Travers (Ed.) Second Handbook of Research on Teaching. Chicago: Rand McNally, 1973.

Medley, Donald M. and Harold E. Mitzel. "Measuring Classroom Behavior by Systematic Observation," Handbook of Research on Teaching, N. L. Gage (Ed.). Chicago: Rand McNally and Company, 1963.

Medley, Donald W. and Others. Assessment and Research in Teacher Education. Focus on PBTE, ERIC, June 1975.

Miller, William C. "Accountability Demands Involvement," Educational Leadership, 29:613-617, April 1972.

Madaus, George F. and Peter W. Airasian. "Performance Evaluation," Journal of Research and Development in Education, Vol. 10, No. 3, Spring 1977.

Mohan, Madan and Ronald E. Hull. Teaching Effectiveness: Its Meaning, Assessment and Improvement. Englewood Cliffs, New Jersey: Educational Technology Publications, 1975.

NEA. "Better Than Rating" (New Approaches to Appraisal of Teaching Services), Association for Supervision and Curriculum Development, National Education Association, Washington, D.C., 1950.

NEA. "The Early Warning Kit on The Evaluation of Teachers," First Revision, National Education Association, Washington, D.C., January 1974.

NEA. "The Evaluation of Teachers," National Education Association, Washington, D.C., Winter 1973-74.
(An interaction analysis-instructional strategy approach.) Englewood Cliffs, N.J.: Prentice-Hall, Inc.

Oldham, Neild (Ed.). Evaluating Teachers for Professional Growth: Current Trends in School Policies and Programs. National School Public Relations Association. Arlington, Va.: 1974.

Ornstein, Allan C. and Harriet Talmage. "The Promise and Politics of Accountability," National Association of Secondary School Principals, 58:20-23, March 1974.

Popham, W. James (Ed.). Evaluation in Education (Current Applications). Berkeley, Calif.: McCutchan Publishing Corporation, 1974.

Popham, W. James. "The New World of Accountability: In the Classroom," The National Association of Secondary School Principals, 56:25-31, May 1972.

Read, Edwin A. "Accountability and Management by Objectives," The National Association of Secondary School Principals, 58:1-10, March 1974.

Reagan, Ronald. "Public Education: An Appraisal," The National Association of Secondary School Principals, 56:1-9, May 1972.

Remmers, H. H. (Chairman) and Others. "Report of the Committee on the Criteria of Teacher Effectiveness," Review of Educational Research, 22:238-263, June 1952.

Rogers, Virgil M. (Ed.) Do We Want "Merit" Salary Schedules? (Report of Second Annual Workshop on Merit Rating in Teachers' Salary Schedules.) Syracuse University Press, 1960.

Rosenshine, B. "Evaluation of Classroom Instruction," Review of Educational Research, 40:279-301, 1970.

Rosenshine, Barak. "New Directions for Research on Teaching," How Teachers Make a Difference. U.S. Government Printing Office, 1971.

Rosenshine, Barak. Teaching Behaviors and Student Achievement. New York: Humanities Press, 1971.

Rosenshine, B. and N. Furst. "The Use of Direct Observation to Study Teaching," taken from Robert M. W. Travers (Ed.) Second Handbook of Research on Teaching. Chicago: Rand McNally, 1973.

Ryans, D. G. Characteristics of Teachers: Their Description, Comparison, and Appraisal.
Washington: American Council on Education, 1960.

Sciara, Frank J. and Richard K. Jantz. Accountability in American Education. Boston: Allyn and Bacon, Inc., 1972.

Spears, Harold. Improving the Supervision of Instruction. New York: Prentice-Hall, Inc., 1953.

Stephens, John M. The Psychology of Classroom Learning. New York: Holt, Rinehart and Winston, Inc., 1965.

Stephens, J. M. The Process of Schooling: A Psychological Examination. New York: Holt, Rinehart and Winston, Inc., 1967.

Thomas, Donald. "The Principal and Teacher Evaluation," The Education Digest, Vol. 40, March 1975.

Thomas, Donald. "The Principal and Teacher Evaluation," National Association of Secondary School Principals, 58:1-8, December 1974.

Thomas, J. Alan. The Productive School (A Systems Analysis Approach to Educational Administration). New York: John Wiley and Sons, Inc., 1971.

Travers, Robert M. W. (Ed.) Second Handbook of Research on Teaching. Chicago: Rand McNally College Publishing Company, 1973.

Walberg, Herbert J. (Ed.) Evaluating Educational Performance (A Sourcebook of Methods, Instruments, and Examples). Berkeley, Calif.: McCutchan Publishing Corporation, 1974.

Walter, Franklin B. "Mandates for Evaluation: The National Overview," (Paper presented at the conference of the Kentucky Association of Teacher Educators). Richmond, Kentucky, ERIC ED 115 607, October 31, 1975.

Weiss, Edmond. "Educational Accountability and the Presumption of Guilt," Planning and Changing. Illinois State University, Normal, Ill., 1972; appeared in The Education Digest, April 1973.

Wicks, Larry E. "Opinions Differ: Teacher Evaluation," Today's Education, 62:42-43, March 1973.

Wilson, Laval S. "Assessing Teacher Skills: Necessary Component of Individualization," Phi Delta Kappan, Vol. LVI, November 1974.

Wilson, Laval S. How to Evaluate Teacher Performance. (Paper presented to the Annual Convention of the National School Boards Association, April 1975.)

Wolf, Robert L.
"How Teachers Feel Toward Evaluation," in School Evaluation: The Politics and Process, Ernest R. House (Ed.). Berkeley, Calif.: McCutchan Publishing Corporation, 1973.

APPENDIX A

DEPARTMENT OF EDUCATIONAL SERVICES
COLLEGE OF EDUCATION
MONTANA STATE UNIVERSITY
BOZEMAN 59715

March 14, 1977

Dear

In recent months you have most likely become very much aware of the increased emphasis which has been placed upon accountability of schools and schooling by the tax-paying public. Current literature points to the fact that parents expect their youngsters to be competent in basic skills as a result of schooling. Improving teacher effectiveness is timely and basic to the increased concerns of parents and school patrons.

The problem for which this survey will provide data is that of finding the appropriate basis to evaluate teacher effectiveness. The immediate need in solving this problem is for school districts to find the degree of agreement among teachers as to what are appropriate criteria for judging the effectiveness of a teacher and to compare those findings with the administrator's determination.

Enclosed you will find a survey instrument that is designed to gather data which will provide an answer to this problem. The survey contains a list of criteria for judging teacher effectiveness and directions for its completion.

Your response is essential to the study of this problem, as the data gained will reflect the thinking of people like you who must deal with this problem each day in their professional career. This survey questionnaire is designed to be answered in a maximum of five to seven minutes. A self-addressed envelope is provided. An early response, within your busy schedule, will be very much appreciated.

Your administrator has been contacted, and his permission has been granted for you to receive this survey instrument.
Due to the design of this instrument, it is essential that both the administrator and his staff respond to the questionnaire. Your response will be kept strictly confidential. No identification of schools or respondents will be made public in this study.

This study is being conducted under the direction of Dr. Robert Thibeault of the College of Education, Montana State University. A summary statement of the questionnaire data will be made available to you if you wish to know the results of this survey.

Sincerely yours,

Francis A. Olson

DEPARTMENT OF EDUCATIONAL SERVICES
COLLEGE OF EDUCATION
MONTANA STATE UNIVERSITY
BOZEMAN 59715

May 16, 1977

Dear

Recently I mailed you a questionnaire which will provide data on the problem of finding the appropriate basis to evaluate teacher effectiveness.

In the event that the instrument may have been misplaced or you did not receive it, I am enclosing another for your consideration. I would very much appreciate your completing the instrument and returning it to me in the self-addressed envelope provided. The input from you would be very helpful to this study and assuredly very much appreciated.

I am aware of the many deadlines that you face in the closing days of the school year, and, believe me, I deeply appreciate your time and kind consideration in completing and returning a survey at this time.

In the event that you have already completed the instrument sent to you earlier, kindly disregard this request. Thank you for your kind consideration.

Very sincerely yours,

Francis A. Olson

INSTRUCTIONS FOR COMPLETING THIS SURVEY INSTRUMENT

The purpose of this survey is to determine what professional educators believe are the appropriate criteria for judging the effectiveness of a teacher.

On the survey instrument, which follows the demographic information, please rate each of the items on the nine-point scale provided.
Assume that adequate measures exist to measure each criterion. Try to differentiate as much as possible between items. Please rate all items and be sure not to circle more than one rank for any given item.

Use the scale to rate each of the criteria according to its importance in determining teacher effectiveness. Circle one rank for each item. Low ranks are indicative of unimportant criteria; high ranks, important. Five is, of course, average.

Please complete the following demographic information.

A. The class of the school district in which you teach (circle one).
   Third Class   Second Class   First Class

B. Sex: ____ male or ____ female

C. Your years of teaching experience (circle one).
   1. 0-5   2. 6-10   3. 11-15   4. 16-20   5. 21-25   6. 26 and over

D. Circle the grade level or levels that you are presently teaching.
   K 1 2 3 4 5 6 7 8 9 10 11 12

E. Please list the number of students who are presently enrolled in your district. ________ enrolled

Do you want a copy of the data summary sent to you when this study is completed? ____ yes ____ no

SURVEY INSTRUMENT

Criteria (Completely Unimportant . . . Extremely Important)

1. Willingness to be flexible, to be direct or indirect as situation demands. 1 2 3 4 5 6 7 8 9

2. Participation in community and professional activities. 1 2 3 4 5 6 7 8 9

3. Years of teaching experience. 1 2 3 4 5 6 7 8 9

4. Ability to personalize his teaching. 1 2 3 4 5 6 7 8 9

5. Knowledge of subject matter and related areas. 1 2 3 4 5 6 7 8 9

6. Extent to which his verbal behavior in classroom is student-centered. 1 2 3 4 5 6 7 8 9

7. Capacity to perceive the world from the student's point-of-view. 1 2 3 4 5 6 7 8 9

8. Civic responsibility (patriotism). 1 2 3 4 5 6 7 8 9

9. Personal adjustment and character. 1 2 3 4 5 6 7 8 9

10. General knowledge and understanding of education facts. 1 2 3 4 5 6 7 8 9

11. Amount his students learn. 1 2 3 4 5 6 7 8 9

12. Relationship with class (good rapport). 1 2 3 4 5 6 7 8 9

13. Effectiveness in controlling his class. 1 2 3 4 5 6 7 8 9

14. Extent to which he uses inductive (discovery) methods. 1 2 3 4 5 6 7 8 9

15.
Influence on student's behavior. 1 2 3 4 5 6 7 8 9

16. Performance in student teaching. 1 2 3 4 5 6 7 8 9

(continued on back)

How effective is your administrator in helping you to improve your teaching effectiveness? Circle one rank on the following scale for your rating.

Very Ineffective 1 2 3 4 5 6 7 8 9 Very Effective

Comments:

APPENDIX B

DEPARTMENT OF EDUCATIONAL SERVICES
COLLEGE OF EDUCATION
MONTANA STATE UNIVERSITY
BOZEMAN 59715

January 6, 1977

Dear

In recent months you have most likely become very much aware of the increased emphasis which has been placed upon accountability of schools and schooling by the tax-paying public. Current literature points to the fact that parents expect their youngsters to be competent in basic skills as a result of schooling. Improving teacher effectiveness is timely and basic to the increased concerns of parents and school patrons.

The problem for which this survey will provide data is that of finding the appropriate basis to evaluate teacher effectiveness. The immediate need in solving this problem is for school districts to find the degree of agreement among teachers as to what are appropriate criteria for judging the effectiveness of a teacher and to compare those findings with the administrator's determination.

Enclosed you will find a survey instrument that is designed to gather data which will provide an answer to this problem. The survey contains a list of criteria for judging teacher effectiveness and directions for its completion.

Your response is essential to the study of this problem, as the data gained will reflect the thinking of people like you who must deal with this problem each day in their professional career. This survey questionnaire is designed to be answered in a maximum of five to seven minutes. A self-addressed envelope is provided. An early response, within your busy schedule, will be very much appreciated.
Because your name was chosen randomly, it is necessary and essential for the purpose of this study not only to get your response, but also that of your staff members. In view of this need, I am requesting your permission to send questionnaires to your staff members.

Your response and the responses of your staff will be kept strictly confidential. No identification of schools or respondents will be made public in this study.

This study is being conducted under the direction of Dr. Robert Thibeault of the College of Education, Montana State University. A summary statement of the questionnaire data will be made available to you if you wish to know the results of this survey.

Sincerely yours,

Francis A. Olson

DEPARTMENT OF EDUCATIONAL SERVICES
COLLEGE OF EDUCATION
MONTANA STATE UNIVERSITY
BOZEMAN 59715

March 8, 1977

Dear

May I take this means to remind you to complete and return the survey instrument which you received from me a while back. In the event that the instrument may have been misplaced or you did not receive it, I am enclosing another for your consideration.

The input from you and your staff would be very helpful to this study and assuredly very much appreciated.

I am very much aware that time is a precious item in a busy administrator's day. For this consideration I also express my sincere thanks.

If you have already returned the survey instrument to me, then kindly disregard this request.

Very sincerely yours,

Francis A. Olson

INSTRUCTIONS FOR COMPLETING THIS SURVEY INSTRUMENT

The purpose of this survey is to determine what professional educators believe are the appropriate criteria for judging the effectiveness of a teacher.

On the survey instrument, which follows the demographic information, please rate each of the items on the nine-point scale provided. Assume that adequate measures exist to measure each criterion.
Try to differentiate as much as possible between items. Please rate all items and be sure not to circle more than one rank for any given item.

Use the scale to rate each of the criteria according to its importance in determining teacher effectiveness. Circle one rank for each item. Low ranks are indicative of unimportant criteria; high ranks, important. Five is, of course, average.

Please complete the following demographic information.

A. The class of the school district in which you are an administrator (circle one).
   First Class   Second Class   Third Class

B. The administrator position which you presently hold in your district (circle).
   Principal   Superintendent

C. Sex: ____ male or ____ female

D. Your years of administrative experience (circle one).
   1. 0-5   2. 6-10   3. 11-15   4. 16-20   5. 21-25   6. 26 and over

E. Circle the grade level or levels for which you are presently responsible.
   K 1 2 3 4 5 6 7 8 9 10 11 12

F. Please list the number of students for whom you are presently responsible. ________ enrolled

G. Please list the number of staff members for whom you are directly responsible. ________ staff members

Do you want a copy of the data summary sent to you when this study is completed? ____ yes ____ no

SURVEY INSTRUMENT

Criteria (Completely Unimportant . . . Extremely Important)

1. Willingness to be flexible, to be direct or indirect as situation demands. 1 2 3 4 5 6 7 8 9

2. Participation in community and professional activities. 1 2 3 4 5 6 7 8 9

3. Years of teaching experience. 1 2 3 4 5 6 7 8 9

4. Ability to personalize his teaching. 1 2 3 4 5 6 7 8 9

5. Knowledge of subject matter and related areas. 1 2 3 4 5 6 7 8 9

6. Extent to which his verbal behavior in classroom is student-centered. 1 2 3 4 5 6 7 8 9

7. Capacity to perceive the world from the student's point-of-view. 1 2 3 4 5 6 7 8 9

8. Civic responsibility (patriotism). 1 2 3 4 5 6 7 8 9

9. Personal adjustment and character. 1 2 3 4 5 6 7 8 9

10. General knowledge and understanding of education facts. 1 2 3 4 5 6 7 8 9

11. Amount his students learn. 1 2 3 4 5 6 7 8 9

12.
Relationship with class (good rapport). 1 2 3 4 5 6 7 8 9

13. Effectiveness in controlling his class. 1 2 3 4 5 6 7 8 9

14. Extent to which he uses inductive (discovery) methods. 1 2 3 4 5 6 7 8 9

15. Influence on student's behavior. 1 2 3 4 5 6 7 8 9

16. Performance in student teaching. 1 2 3 4 5 6 7 8 9

(continued on back)

How effective are you in helping your teachers to improve their teaching effectiveness? Circle one rank on the following scale for your rating.

Very Ineffective 1 2 3 4 5 6 7 8 9 Very Effective

Comments:

Permission is granted to send this questionnaire to your staff members. ____ yes