National Association of Test Directors

Performance Assessment in the Brave New World of Continuous Quality Improvement: Perspectives from Business, Curriculum Development & Testing

Proceedings of the 1995 NATD Annual Symposium, presented at the National Council on Measurement in Education Annual Conference, April 1995, San Francisco, CA

National Association of Test Directors 1995 Symposia

Edited by: Joseph O'Reilly, Mesa (AZ) Schools
Organized by: H. Guy Glidden, Wichita (KS) Schools

This is the eleventh volume of the published symposia, papers and surveys of the National Association of Test Directors (NATD). This publication serves an essential mission of NATD: to promote discussion and debate on testing matters from both a theoretical and a practical perspective. In the spirit of that mission, the views expressed in this volume are those of the authors, not NATD. The papers and discussant comments presented in this volume were presented at the April 1995 meeting of the National Council on Measurement in Education (NCME) in San Francisco.

The authors and editors of this volume are:

Joe Hansen, Colorado Springs Public Schools, 1115 North El Paso Street, Colorado Springs, CO 80903-2599, 719.520.2077
Jonelle Adams, Boeing Company, PO Box 24346, Mail Stop 7A-37, Seattle, WA 98124-0346, 206.865.5489
Bev Merrill, Gilbert Public Schools, 150 South Gilbert Road, Gilbert, AZ 85234, 602.497.3351
Darlene Patrick & Jon S. Twing, Harcourt Brace Educational Measurement, 555 Academic Court, San Antonio, TX 78204, 800.228.0752
Joseph O'Reilly, Mesa Public Schools, 549 North Stapley Drive, Mesa, AZ 85203, 602.898.7771
H. Guy Glidden, Wichita Public Schools, 217 North Watoe, Wichita, KS 67202, 316.833.4180

Table of Contents

The Role of Assessment in a Continuous Improvement Education System
Joe Hansen ............ 4

Performance Assessment in the Brave New World of Continuous Quality Improvement: A Perspective from Business
Jonelle Adams ............ 18

Curriculum Development, Continuous Quality Improvement, and Performance Assessment in the Local Educational Agency
Bev Merrill ............ 24

Discussant Comments
Darlene Patrick & Jon S. Twing ............ 24

THE ROLE OF ASSESSMENT IN A CONTINUOUS IMPROVEMENT EDUCATION SYSTEM

Joe B. Hansen
Colorado Springs Public Schools

INTRODUCTION

Papers, articles and books too numerous to list have been written about the need for "systemic reform" in education. In other writing (Hansen, 1994a) I have described in detail how modern systems theory can be a useful tool in understanding the nature of school systems and in reforming them, based on four components: a systems framework for analyzing school system characteristics, criteria for school system effectiveness, empirical data on practical remedies for system malfunctions, and a monitoring system for maintaining the course of the system and fine-tuning it. In this paper I will focus primarily on the monitoring system, and in so doing I will attempt to define the role of assessment in a continuous progress, self-monitoring education system. At present such an education system is a mythical ideal for educators to strive toward. This ideal need not be unattainable, however, for the means of attaining it are readily available.
Attaining it, however, requires of those who would pursue it a commitment to quality, a customer orientation, a focus on data-based decision making, and a willingness to experiment. An educational CIS would embrace some of the principles of Total Quality Management (TQM) as espoused by Deming, Juran and others. I am not, however, advocating that TQM become the model for education.

CONTINUOUS IMPROVEMENT EDUCATION SYSTEMS

Continuous Improvement Systems (CIS) comprise a class of self-regulating systems that use internally or externally generated signals to monitor progress of the system and its components toward some end state or goal. A public school system can become a CIS if it is consciously redesigned to be one. In order for this to occur, the leadership of the system must take the responsibility to develop within the organization a customer orientation and a focus on quality education. A customer orientation means, among other things, that the participants and stakeholders in the system (all employees, parents, taxpayers and business people) understand who the customers are, what they want and how to deliver what they want. This customer orientation is crucial to organizations striving to adopt the principles of Total Quality Management (TQM), but it is a relatively new concept in public education. Maintaining a customer orientation requires that the system have in place a mechanism or process for monitoring customer satisfaction, detecting and analyzing dissatisfaction, and correcting faults in the educational product before the customers lose confidence in the system and demand its replacement. A focus on quality is a fundamental and pervasive element in an education CIS. A CIS must generate and make effective use of data on key indicators of quality or progress toward predetermined goals or desired conditions. A high regard for the value and use of quality indicator data must be embedded in the culture of the system.
A process of using such indicator data for self-regulation and improvement must also be clearly defined and implemented. In order for this to occur, there must be a direct link between the reporting, analysis and interpretation of data and some corrective action within the system, just as there is a link between the temperature displayed on a thermostat and the switch mechanism that activates or deactivates the heating or cooling process of a building's climate control system. The above mentioned goals or conditions might pertain to the progress or improvement of students, organizational units, employee groups, district financial well-being, or any other goals deemed important by the system or its customers. Progress toward those goals is a critical element in the definition of quality for the system, but it should not be the total definition of quality. Quality must also be defined in terms of societal values and the expectations held by the customers of the educational enterprise. Therefore, it is of crucial importance that the process of defining quality for the system include broad community involvement.

QUALITY INDICATORS

In order to monitor quality or progress, a subsystem of indicators of quality or successful performance for the system must be developed. Such indicators can take a variety of forms, from test scores to complex indices comprised of data from multiple sources. This subsystem would include the following minimum components:

- student performance indicators, based on an assessment subsystem which will be described in a later section
- system and organizational unit indicators, based on the mission and goals of the units themselves, which draw their data from all appropriate sources
- a computerized database designed to enable ad-hoc inquiries of student progress, system status and unit progress on a wide variety of indicators.
This database must be:

- interactive and easy to use by teachers, administrators and secretaries
- routinely updated on a real-time or daily basis to keep current data available for inquiry
- based on distributed computer technology, to collect input at the individual student level and at the organizational work unit level (e.g., administrative department)
- secured to prevent tampering with or altering data and to limit access to appropriate users.

Student performance indicators may take a variety of forms, including both individual and group indicators of achievement, attendance, discipline, dropouts, graduation rates, specific proficiency measures, attitudes and vocational skills. Indicators may be single scores or indices comprised of combinations of data. For example, the percent of students by ethnic and racial groupings meeting the fourth grade proficiency standard, as measured by both a performance assessment and a multiple choice test, might be one indicator. The role of specific types of assessments in this process will be discussed later in this paper under the topic of the assessment subsystem. System and organizational unit effectiveness indicators may include ratio indices such as per-pupil expenditure on instructional supplies and materials, pupil-teacher ratio, average class size by subject area, or the percentage of minorities enrolled in advanced-level courses compared to the percentage of minority students in total enrollment. More specific indicators of unit (department or school) goals should be designed based on the specific mission and goals of the unit: for example, a reduction in paycheck errors or decreased time in processing check requests for Payroll, or reduced network downtime or fewer help line calls through increased staff training for Information Services. The important issue is that each organizational unit should establish improvement goals based on the unit's mission, which directly support the overall mission and goals of the district or total system.
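The composite indicator described above (percent of students, by group, meeting a proficiency standard on both a performance assessment and a multiple choice test) can be sketched in a few lines. This is a minimal illustration, not a description of any actual district system; the field names, group labels and cut scores are hypothetical assumptions.

```python
from collections import defaultdict

PBA_CUT = 3   # assumed proficiency cut on a 1-4 performance assessment rubric
MC_CUT = 70   # assumed proficiency cut on a 100-point multiple choice test

def proficiency_indicator(records):
    """records: dicts with 'group', 'pba_score', 'mc_score'.
    Returns {group: percent meeting the standard on BOTH measures}."""
    met = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        total[r["group"]] += 1
        if r["pba_score"] >= PBA_CUT and r["mc_score"] >= MC_CUT:
            met[r["group"]] += 1
    return {g: round(100.0 * met[g] / total[g], 1) for g in total}

# Hypothetical student records for two groups.
students = [
    {"group": "A", "pba_score": 3, "mc_score": 82},
    {"group": "A", "pba_score": 2, "mc_score": 91},
    {"group": "B", "pba_score": 4, "mc_score": 75},
    {"group": "B", "pba_score": 3, "mc_score": 64},
]
print(proficiency_indicator(students))  # {'A': 50.0, 'B': 50.0}
```

Requiring the standard on both measures is what makes this a multiple-measures index rather than a single test score, in the spirit of the indicator example above.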
Indicators of quality must then be defined to quantify performance related to those goals, and a systematic data collection, analysis, interpretation and application process must be in place to enable the unit to evaluate itself. This self-evaluation process is the core of the CIS. It requires the development of a culture of evaluation within the system.

CULTURE OF EVALUATION

Developing an assessment and evaluation culture within the system means that assessment is understood and valued by the board of education, superintendent, administrators, teachers, students and parents. It also means that evaluation of performance for individuals and organizational units is expected and welcomed throughout the system as the basis for improvement. Parents must be an integral part of this culture, sharing in the results of the evaluative process and providing continuous input to the system on their observations and expectations. In some school systems it may be necessary to formalize this process of parent involvement by establishing an advisory group at each school and requiring training in the use of the assessment subsystem for all advisory group members. In Colorado this process is addressed through the State Accountability Act; the training, however, is not mandated. Training is essential to the culture of evaluation. This culture requires continuous training of the entire staff, administration, board of education and key advisory groups on the characteristics of the assessment system and its role, uses and limitations. Such training must strive to make all users of assessment data experts in the interpretation and application of the data the system produces. This means they must understand the purposes, strengths and limitations of their data and must learn to use them responsibly and appropriately.
In order to have its desired effect, the training must be required of all employees and should also be included in the curriculum for students at appropriate levels and in appropriate ways. This enables students to monitor their own progress against predetermined standards accepted by all within the system. Teachers are then free to focus more of their time and effort on coaching and supporting students and on finding alternative instructional strategies for those whose progress is lagging. Training in alternative instructional strategies must provide clear linkages to the indicator data so that teachers can understand what their specific options are for students with specific needs. In other words, diagnosis and prescription must replace the one-size-fits-all and shotgun approaches so common in today's schools. If teacher training, both in-service and pre-service, does not focus more tightly on teaching teachers how to use data to select or devise the most effective approaches for students who are not demonstrating proficiency, then continuous improvement is unlikely to occur. Principals must also be trained in classroom research methodology and in the interpretation and application of research data to school improvement.

THE ASSESSMENT SUBSYSTEM

An essential component of any such self-regulating education CIS is an assessment subsystem (AS) designed to provide accurate and timely data on student performance. As I have shown elsewhere, such a subsystem must be comprehensive in nature, with multiple forms of assessment, each designed for a different purpose (Hansen, 1992, 1994a). The AS must address the needs for student performance data at all levels within the system hierarchy, from the student to the board of education. These information needs vary by level in terms of the nature and qualitative characteristics of the data required for decision making.
Policy-level decisions, such as those made by the board of education, superintendent and superintendent's cabinet, are best served by summary data which provide an accurate but coarse-grained view of system-wide performance. Quality indicators at this policy level include district-wide mean scores on achievement measures, percentages of students meeting proficiencies at designated levels, district graduation rates, and so on. At the management level, assessment data may be more narrowly focused on specific school performance, or on grades or groups of students within schools. At the instructional level, data are even more fine-grained, providing sufficient detail for individual students, their parents and teachers to make decisions about appropriate instructional content and processes. This is illustrated in figure 1.

Figure 1. Relationship of Informational Detail, Educational Decision Making Hierarchy and Decision Type

The AS is not to be confused with the quality indicator subsystem (QIS). The major distinction between the two is that the QIS provides analyzed and summarized data, on demand, telling data users how well they are doing on their specific goals, while the AS is based on student achievement data only. The AS must have the following minimum features:

- quality standards for all forms of assessment (Hansen, 1994b)
- a comprehensive design based on multiple forms of assessment (e.g., multiple choice, open-ended, performance and portfolio), with each form serving a defined purpose in the decision process it serves
- a reporting system for presenting information to personnel who are responsible for system monitoring and management
- a training component for all assessment data users in the system.

I will discuss each of these required elements of the AS below and illustrate how they integrate to form a cohesive subsystem.
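The level-of-detail idea behind figure 1 (full detail at the instructional level, progressively coarser summaries at the management and policy levels) amounts to simple aggregation of the same underlying records. The sketch below is purely illustrative; the school names and scores are hypothetical assumptions, not district data.

```python
from statistics import mean

# Instructional level: full student-level detail (hypothetical records).
scores = [
    {"school": "North", "grade": 4, "student": "s1", "score": 410},
    {"school": "North", "grade": 4, "student": "s2", "score": 455},
    {"school": "South", "grade": 4, "student": "s3", "score": 390},
    {"school": "South", "grade": 4, "student": "s4", "score": 470},
]

# Management level: the same data summarized by school.
by_school = {}
for r in scores:
    by_school.setdefault(r["school"], []).append(r["score"])
school_means = {s: mean(v) for s, v in by_school.items()}

# Policy level: one coarse-grained district-wide figure.
district_mean = mean(r["score"] for r in scores)

print(school_means)   # {'North': 432.5, 'South': 430}
print(district_mean)  # 431.25
```

Each decision level sees the same achievement data, only at a grain size appropriate to the decisions made there.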
ASSESSMENT QUALITY STANDARDS

These standards have become increasingly important as American education moves toward standards-based education, with its emphasis on performance assessment, portfolios and other alternative assessment forms purporting to better reflect a shift toward a more "constructivist" approach to education. These alternative assessment forms, while holding forth the promise of greater authenticity in assessment, have not as yet been shown to be as reliable and valid as more traditional forms. Nevertheless, they have an important place in the AS inasmuch as they are used for specific purposes for which they are well suited. I will elaborate on the relationship between assessment purpose and form in a subsequent section. The National Association of Test Directors (NATD) has developed a set of seven guiding principles, or quality standards, to guide the development of performance assessments (Hansen, 1994b). These standards can and should apply not only to performance assessments but to all forms of assessment directed at student achievement. The NATD Guiding Principles are summarized below:

Purpose of Assessment: Different tests are designed for different purposes and should be used only for the purposes for which they are designed. Norm-referenced tests, for example, provide broad measures of performance on a generalized curriculum, whereas performance assessments are intended to provide in-depth coverage of a subject and elicit higher cognitive skills.

Use of Multiple Measures: Multiple measures should always be considered in generating information upon which any type of educational decision is to be made. No test, no matter how technically advanced, should be used as the sole criterion in making high stakes decisions.

Technical Rigor: All assessment must meet appropriate standards of technical rigor, which have been clearly defined prior to the development of the assessment.
This is even more important when an assessment is used in a high stakes decision context. If a (performance) assessment cannot meet desired technical standards, it should not be used to make high stakes decisions. At a minimum, assessments should be judged against the following quality criteria: assessment consequences, fairness, transfer and generalizability, cognitive complexity, content quality, content coverage, meaningfulness and cost efficiency, as described by Linn, Baker and Dunbar (1991).

Cost Effectiveness: The quality and utility of information collected through any assessment must be weighed against the cost of collecting, interpreting and reporting it, including the costs of the time required of teachers, administrators and others.

Protection of Students/Equitability: No harm should come to any student as a result of the administration or use of data from any assessment, and extra care should be taken with performance assessments to prevent any systematic bias against any racial, ethnic or gender group.

Educational Value: Assessments should be designed to augment the educational experience of the student.

Decision Making: Assessments should provide data that enhance the decision making ability of students, teachers, administrators, parents and community members.

These guiding principles provide the minimal criteria for the type of AS that is needed for developing a CIS in education.

COMPREHENSIVE ASSESSMENT

To adequately serve its purpose within the CIS, the AS must be comprehensive in its design. That is, it must be based on multiple forms of assessment, and each form must serve an intended purpose based on the specific characteristics of the assessment. A comprehensive AS must employ, in a proper manner, each of the three basic forms of assessment: performance based assessment, curriculum referenced measures, and norm-referenced tests.
Performance Based Assessments (PBA) provide rich, detailed information about a student's knowledge in a content area and afford the student the opportunity to integrate information from other areas in the response. PBAs are believed to be more reflective of the constructivist approach to education. PBAs, however, may not satisfy the criteria for technical rigor as indicated in the NATD Guiding Principles. They may also pose problems for summarizing data for use at the policy decision level.

Curriculum Referenced Assessment (CRA) is also an essential component of a comprehensive AS[1]. CRAs are designed specifically to measure a student's progress on the district's curriculum and relate that growth to an underlying continuum of difficulty. These measures have relatively low measurement error compared to other forms of assessment, high validity and high reliability. Since they are constructed specifically to measure progress on the curriculum, they provide data useful to students, parents, teachers, administrators and the public.

Norm-referenced testing (NRT) is needed in order to provide administrators, the board of education and the public with a sense of how groups of students, ranging from class-size groups to district-wide grade-level groupings, are doing on a highly generalized curriculum compared to other students at the same age or grade level.

In the Colorado Springs Public Schools we are in the process of building a comprehensive assessment system. We began this process five years ago and have proceeded cautiously, making sure that the components of our system meet the highest quality standards we can attain. Figure 2 illustrates that system. The purposes for which each type of assessment is used are shown below each of the three types. We began our development with the CRM, which is a multi-level system of overlapping tests in reading, mathematics and language composed of Rasch scaled items.
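Rasch scaling, mentioned above as the basis of the multi-level CRM, places student ability and item difficulty on one logit scale, which is what allows overlapping test levels to report growth on a single continuum. A minimal sketch of the standard one-parameter logistic (Rasch) model, independent of any particular district's implementation:

```python
import math

def rasch_p(theta, b):
    """Rasch (one-parameter logistic) model: probability that a student
    with ability theta answers an item of difficulty b correctly.
    theta and b are on the same logit scale, which is what lets
    overlapping multi-level tests share one difficulty continuum."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# When ability equals item difficulty, the probability is exactly 0.5.
print(rasch_p(0.0, 0.0))            # 0.5
# An ability one logit above the difficulty gives about 0.73.
print(round(rasch_p(1.0, 0.0), 2))  # 0.73
```

Because the model depends only on the difference theta - b, items calibrated on one test level can anchor the scale for adjacent, overlapping levels.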
As we have refined the system we have added alternate forms and course-level tests in algebra. We are now in the process of developing tests in science and will soon begin to explore the potential in social studies. As the CRM component has grown, we have reduced the use of the NRTs to three grade levels: third, fifth and tenth. We are planning district-wide pilot efforts in PBA for the fall of 1995 in the subject areas of writing, reading, and social studies. Currently, we have many small PBA projects scattered throughout the system, some of which will provide the basis for the district-wide pilots. After an evaluation period and at least one year of refinement, the pilots will be transformed into PBAs at grades four, eight, and ten.

THE REPORTING SYSTEM

A comprehensive reporting subsystem is needed to provide appropriate data to users at each decision level of the organization, as depicted in figure 1. This system must provide highly detailed data for students, parents and teachers, while summarizing data appropriately for higher level decision making. An example of how such a subsystem might look is shown in Exhibit 1. The reporting subsystem should, of course, make the most effective use of both tabular and graphic presentation modes, depending on the specific audience and purpose for a report. In addition to presenting assessment data, the reporting system should also be capable of displaying and presenting other indicator data from the QIS as described above. As shown in Exhibit 1, the AS reporting subsystem provides results on each type of assessment for use by a variety of audiences/users, thereby facilitating the monitoring of system progress consonant with the CIS concept.

Figure 2. Comprehensive Assessment System Model Showing the Relationship of Assessment Type to Purpose

Exhibit 1. Comprehensive Reporting System
TRAINING

Training is necessary for students, teachers, parents, building and district level administrators, and board of education members in order for the AS to be used effectively for improvement purposes. Developing the culture of evaluation described above is the goal of this training. Since this topic was discussed under the section on the Quality Indicator Subsystem, I will not belabor it here, except to point out that specific training on the interpretation and use of the AS data is essential to prevent misuse or abuse of the data and to prevent erroneous conclusions or inferences from being drawn. Well used data should generate more questions than answers. Those questions should lead to further investigations of cause and effect relationships, avoiding overly simplistic interpretations.

ISSUES AND IMPLICATIONS

Developing a CIS in education presents several major issues, each of which has implications for the CIS design. Among these are: 1) defining the customer and identifying the customer's needs; 2) the role and purpose of public education; 3) the extent to which the education system should be process or results driven; and 4) the difficulty of developing a culture of evaluation.

The question of who is the customer for education is not trivial. The answer is intertwined with how one defines the role and purpose of public education. Education has numerous customer groups it must satisfy, and those groups at times have conflicting needs. For example, Thomas Jefferson is said to have observed that the ability of a democracy to survive depends on an educated electorate. Such a perspective implies that the democratic society as a whole has a vested interest in the quality of education of its members, in order that it might remain free rather than succumb to dictatorship.
Presumably this means that education is defined in broad terms and serves the purpose of producing an electorate with the knowledge and ability to understand major issues and to act on behalf of the entire society in addressing those issues, rather than on a purely selfish or parochial basis. In this depiction every member of society is a customer of the education process, and its purpose is a liberal education for each member in order to preserve the society. An alternative depiction, which is increasingly popular, is that business is the major customer to be satisfied by the educational system. This depiction of the customer emphasizes workplace skills, which include cooperativeness, teamwork, reasoning ability, and mastery of the basic skills of reading, mathematics and communicative arts. This view of the customer base emphasizes economic factors: the success of business. The assumption is that if business is successful in generating profit, there will be more wealth to trickle down to all of society. The emphasis of such an education system is, perforce, vocational rather than liberal. While this dichotomy may be to some extent fabricated, it illustrates some of the competing values an education system must address and in some way satisfy. Other value-laden conflicts among education's customers include the role of religion in education, the breadth and scope of the curriculum, and the extent to which the education system focuses on the process versus the outcomes of education. Clearly, an education system that attempts to satisfy all of these conflicting values-based needs will be confused and without direction, for the concept of quality itself is expressed differently in these diverse views. Currently, the American education system is moving toward a "standards based" system in which both content and performance standards are expected to drive the system to a higher level of quality.
Many view this as a move toward a results-oriented system, rather than the process-oriented system that has been the tradition in education. At the same time, TQM is exerting an influence on education. Ironically, TQM is a process-oriented rather than results-oriented approach (Imai, 1990). Developing a culture of evaluation within a school system is obviously not an easy thing to do. American education has seen wave after wave of accountability-based reform rise and crash on its rocky shores, only to dissipate with the outgoing tide characteristic of all educational fads (Hansen, 1993). Yet evaluation of our own effectiveness is still not ingrained within our system. It hardly seems possible that the lofty reform goals of Goals 2000 or any other reform campaign can ever be met until educators themselves learn to value the use of high quality data for the purpose of improving education. Although mandatory training can be helpful, it cannot by itself create the culture of evaluation. For this to occur, a change in values must take place within the educational professions. We must learn not to fear negative results, but to embrace them as the source of information we so badly need to improve the system and our own performance as educators.

SUMMARY AND CONCLUSION

In this paper I have attempted to extend the work I initiated several years ago, in which I developed a general systems theory model (GSTM) for education, by reframing the GSTM in terms of a continuous improvement system (CIS) and focusing more specifically on the assessment subsystem (AS) component. The AS is depicted as a comprehensive system of assessment comprised of performance based, curriculum referenced and norm-referenced assessments, each addressing a different purpose within the AS. A graphic portrayal of the AS is shown in figure 3.
Many of the concepts discussed here have been implemented in the Colorado Springs Public Schools, where we are in the process of developing a purpose-driven comprehensive assessment system. As resources and district priorities allow, we will also develop a broad-based Quality Indicator System (QIS) as described above. It is my hope that the concepts and principles discussed in this paper will find a receptive audience among educators interested in transforming their school districts into CIS districts. Developing a culture of evaluation is a core concept in this process that must be addressed with diligence and commitment by all who call themselves educators.

Figure 3. Assessment Subsystem Flow Diagram

REFERENCES

Hansen, J. B. (1992). Matching levels of accountability with information needs. Symposium presentation for the Topical Interest Group on Evaluation Use, American Evaluation Association Annual Conference, April 1992, Chicago, IL.

Hansen, J. B. (1993). Education reform through mandated accountability: Is it an oxymoron? Measurement and Evaluation in Counseling and Development, 26(1), 110-21.

Hansen, J. B. (1994a). Applying systems theory to systemic change: A generic model for educational reform. Presentation in the symposium Systemic Change and Educational Reform, Division H, American Educational Research Association Annual Conference, April 1994, New Orleans, LA.

Hansen, J. B. (1994b). Guiding principles for performance assessment. In Proceedings of the 1994 NATD Annual Symposium, presented at the National Council on Measurement in Education Annual Conference, April 1994, New Orleans.

Imai, M. (1986). Kaizen: The key to Japan's competitive success. New York: Random House. "Kaizen, the concept" reprinted in G. Dixon and J. Swiller (Eds.), Total Quality Handbook, Lakewood Books, Minneapolis, 1990.

Linn, R., Baker, E., and Dunbar, S. (1991). Complex, performance-based assessment: Expectations and validation criteria. Educational Researcher, 20(8), 15-21.
Perspectives From Business

Jonelle Adams
The Boeing Company

INTRODUCTION

At Boeing, our serious education in Continuous Quality Improvement (CQI) began in the early '80s. We visited world-class companies, listened to the quality experts, started educating our people, and began focusing on problem-solving and process improvements. Continuous Quality Improvement means far more than the application of practices and techniques. It represents a significant cultural change and a shift in the fundamental philosophy of doing business. When we believe that perfection is possible, we will have achieved the mindset required to attain world-class quality. When we know we can go for years without defects or mistakes in any of our processes, we will have made the necessary paradigm shift, and will expect of ourselves the unprecedented levels of quality required to remain competitive in the twenty-first century. In my address to you today, I will cover three areas of performance assessment used in business:

- the Daily Management System
- Quality Measures for Full Customer Satisfaction
- Personal Behavior Assessments

DAILY MANAGEMENT SYSTEM

Daily Management is the system by which the organization performs its daily activities: establishing reliable methods, reducing variation in processes, and using facts and data to ensure that processes, products and services are continuously improved and remain predictable. Characteristics of the Daily Management System:

- Daily activities include performance to schedule and execution of operating plan commitments.
- Division-level and inter-division-level activities are included.
- Equal attention is paid to results and methods.
- The focus is on holding the gains and making small improvements (SDCA, PDCA).
- Management with Facts and Data is used.
- Statistical Process Control (SPC) is extensively used in measuring and analyzing products and processes.
- Actual facts are reviewed, at the actual place, and the goal is to understand the actual problem.
- Extensive use is made of disciplined methods and tools to achieve reliable methods and processes.
- Improvement methodologies are used, such as Work Management, Quality Improvement Story (QIS), and Situation-Target-Proposal (STP).
- Problem-solving and planning tools are employed.
- Disciplined process-improvement methodologies and process thinking are employed.

Terms used in Daily Management are defined as follows:

Daily activities: Those actions that either provide incremental improvement or maintain the current level of performance without degradation.

Daily work: All the activities performed in an organization, by all levels and all positions.

Management with Facts & Data: All goals, targets, and measures in the Management-By-Policy system, together with facts and data from Daily Management. These facts and data are based on the current understanding of the process performance and capability.

Process improvement: Those processes that are targeted to need either a standardization/stabilization or maintenance approach (SDCA), versus an improvement to a new level of performance (PDCA).

SDCA = Standardize/stabilize-Do-Check-Act
PDCA = Plan-Do-Check-Act

Reliable method: A consciously developed, explicitly established, consistently followed method, verified through performance data to be the best approach.

Control point: A description of the outcome of a process for which we want to achieve a specified (predictable and consistent) level of performance.

Checkpoint: An item within a process that impacts the achievement and level of performance of a control point.

Target: A specific value or prescribed, quantitative level of performance of a control point or checkpoint.
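For readers unfamiliar with the SPC techniques referenced above, the core idea of a control chart can be sketched in a few lines. The following is an illustrative individuals (I-MR) chart in modern Python, not Boeing's actual tooling; the data, function names, and 3-sigma convention here are assumptions for the sake of the example.

```python
# Illustrative sketch of an individuals (I-MR) control chart, the basic SPC
# tool referenced above. The data and helper names are hypothetical.

def control_limits(samples):
    """Estimate 3-sigma limits from the average moving range (I-MR method)."""
    center = sum(samples) / len(samples)
    moving_ranges = [abs(b - a) for a, b in zip(samples, samples[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    sigma = mr_bar / 1.128  # d2 constant for subgroups of size 2
    return center - 3 * sigma, center, center + 3 * sigma

def out_of_control(samples):
    """Return the points outside the control limits (special-cause signals)."""
    lcl, _, ucl = control_limits(samples)
    return [x for x in samples if x < lcl or x > ucl]

# Hypothetical daily defect counts for one process:
daily_defects = [4, 5, 3, 6, 4, 5, 4, 21, 5, 4]
print(out_of_control(daily_defects))  # -> [21]
```

Points inside the limits reflect common-cause variation and call for the SDCA (maintain) approach; points outside the limits signal special causes and trigger PDCA improvement work.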
Elements of the Daily Management System:

- Statement of the core purpose of the process.
- Specification of the outcomes of the process: control points, control items and target level.
- Documented description of the process for achieving the core purpose and specific outcomes, for example: flow diagrams, standards, QC process charts, check items (checkpoints), and operating and procedure manuals.
- Requisites for implementing procedures and reliable methods.
- Description of individual roles and responsibilities.
- Data and information requirements: types of data; method and frequency of collection; visual displays and charts used; method and frequency of reporting.
- Procedures for checking results by control items and check items.
- Procedures for corrective action when abnormalities occur.
- Procedures for daily improvement activities.
- Description of linkages to other processes and other roles.

The diagram below depicts a system focused on processes up and down the organization. The Statement of Work and the objectives and plans to accomplish Management-By-Policy are described in the operating plan. Priorities are communicated formally through the Performance Management process. Problems are discovered through continued examination of the control points of the process and through the diagnosis and review described in the Management-By-Policy system. As problems are discovered, extensive use is made of disciplined improvement methodologies and problem-solving tools throughout the organization.

Figure 1 Daily Management

To develop and accomplish the policies of the company, Daily Management must be in place to provide data and to achieve action.

Applications and Uses of Daily Management

Daily Management serves as the basis for:

- Knowing the level of performance of important areas by organizational units as well as for the whole company.
- Determining areas of breakthrough for the MBP system.
- Determining improvement priorities for all important areas throughout the organization.
These priorities are:

- Needs breakthrough (MBP).
- Needs incremental improvement (daily).
- Maintain as is--do not let deteriorate (daily).

Daily Management also serves as the means of holding the gains of breakthrough achievement from the MBP system. Daily Management will allow us to make evolutionary improvements. It will rarely be adequate to meet survival goals. New methods may be needed. Re-allocation of resources (people and money) may be required. Facts and data derived from Daily Management will help us to determine what breakthrough advances we need and how to make these advances.

Figure 2 Basis for Determining Areas of Breakthrough for the Management-By-Policy System

Quality Measures for Full Customer Satisfaction

Achieving full customer satisfaction requires the involvement and participation of everyone in the organization - in everything that is done. Achieving full customer satisfaction requires integrated cross-functional cooperation. People and departments must work with one another, not against one another. All functions and departments must work together to achieve superior quality, cost, delivery, and safety for their customers. Everyone must feel responsible for providing full customer satisfaction.

Figure 3 The Five Elements of Full Customer Satisfaction

The only way to ensure full customer satisfaction for end-use customers is to build quality into every step of the process. Any next person or team in the process is treated as a customer, and customer specifications must be met internally and externally. All employees learn what their customers - their teammates, the next people in the process - want and need. They have to make sure they deliver it exactly right and exactly on time. Achieving full customer satisfaction means that we follow these principles:

- All activities and products are seen from a customer's point of view.
- No variances are passed on.
- Each person is trained and given the authority to control quality at the source.
Thus all employees have the responsibility to check their work or process to ensure that no defects get by them. Everyone works carefully and systematically to build effective relationships with internal customers. This empowers everyone to contribute to the eventual satisfaction of the final, external customer. A chain of internal customer satisfaction--passing from one person to the next, and from one process to the next--builds the full satisfaction each customer deserves.

Quality is measured at The Boeing Company through improvement targets:

- Quality: Reduce defects by 90% by the year 2001.
- Cost: Lower unit cost by 25% by 1997.
- Delivery: Reduce cycle time by 50% by 1996.
- Safety: Reduce the Lost Workday Case Rate by 50% by 1996.
- Morale: Perform in the top 25th percentile of premier companies by the year 2004.

Figure 4 Company Improvement Targets

Personal Behavior Assessments

Boeing 360° Management Assessment Process

Purpose: For each Boeing manager to understand how others perceive him or her against the written management attributes. It is a tool managers can use to improve their performance. The assessment is planned to be used as part of performance management and as additional data for making evaluation, promotion, selection and retention decisions.

Human Resource Development: Operational Model

Evaluation: Performance (reward), Leadership (promote), Potential (develop/removal)
Development: Coaching, Assignments & Experiences, Training & Education
Management Succession Planning & Development: Performance Management Process, Replacement Tables, Succession Planning Reviews
And new: 360° Management Attributes

Benefits

For the manager:
- A more complete and accurate picture of how they are perceived
- Data-based decisions: 360° data vs.
single point (boss-only)
- Time to adjust to getting behavioral feedback

For The Boeing Company:
- Employee Opinion Survey issues of fairness and leadership are addressed
- Data-based decisions
- Senior management credibility regarding the use of the Boeing Management Attributes
- Careful implementation of the Plan-Do-Check-Act (PDCA) cycle

Benefits for both: better Boeing managers are developed.

Curriculum Development, Continuous Quality Improvement, and Performance Assessment in the Local Education Agency

Bev Merrill
Gilbert (AZ) Public Schools

I work in a small school district located in Gilbert, Arizona, a community that is part of the Phoenix metropolitan area. Until recently, Gilbert was primarily an agricultural town awash in cattle, horses, tractors, and pickup trucks. In the last seven years, however, this town has grown at a breathtaking rate - approximately 410% - making it the fastest growing city in the State of Arizona. The school district has close to 16,000 students and 18 schools - 13 elementary schools, 3 junior highs, and 2 high schools - with 4 more schools (including a high school scheduled to open in 1997) on the drawing board. Our teachers are young and energetic; our students are primarily white and middle class; the minority rate is approximately 10%. We traditionally have done very well on any and all state tests. I do many things as a district-level administrator, but primarily I am responsible for secondary curriculum and the district testing programs. Preparing for this discussion has allowed me to consider how Continuous Quality Improvement is aligned with the curriculum processes in the school district where I work. Although CQI is not formalized in the Gilbert district, it appears there is a strong relationship between the principles of Continuous Quality Improvement and what occurs in curriculum development. Curriculum is an area that of necessity requires teamwork, numerous discussions, and constant re-evaluation.
In fact, one of the very critical components of curriculum is this continual re-assessment. I believe educators are obligated to search for better ways of instructing students. Now, admittedly, sometimes we are accused of changing programs to coincide with the arrival of the hottest new traveling inservice show, but I have found that in the area of curriculum, real change is hard to produce. Teachers take pride in closing their classroom doors and doing what they please - what they think is in the best interests of their students - so I am convinced that permanent and positive curriculum change can never be accomplished by edict. I believe teachers should be involved in all aspects of the design of curriculum and assessment. I respect their knowledge and work ethic, and when they are asked to assist in solving an educational issue, they have a good sense of what will really work. When curriculum does undergo major revisions, at least in my experience, there are processes that need to occur, and those appear to be very similar to CQI processes. To illustrate this: recently my district underwent a major curriculum revision in math education. For a number of years, the Gilbert district has had a successful and traditional secondary math program - pre-algebra, algebra 1-2, geometry, trigonometry, the usual sequence. Two years ago, some of our high school math teachers met with me to begin a discussion of how we might establish a curriculum more closely aligned with the NCTM Standards in Mathematics, one centered on problem solving - a curriculum which allowed for cooperative group interaction and asked students to use writing as a means of reinforcing their learning. The traditional algebra-geometry sequence no longer seemed in the best interests of our students, because current literature suggests that when people solve real-world math problems, they use a combination of math skills and often collaborate with other people.
One of the concerns that propelled this initial discussion for change was the belief that the students - our customers - were not always receiving the best possible math instruction. We knew that some students were graduating from our high schools with minimal math - kids who were trying to beat the system by fulfilling the two math requirements with general math and consumer math. These courses had some value, but they did not sufficiently teach the math skills students need in order to succeed today. As this curriculum discussion gained momentum, other math instructors (certainly not all) at the secondary schools agreed that they, too, were interested in a new curriculum which they believed would better prepare students in the area of math, and so began the process of planning for the change and building a shared vision. A small initial curriculum group worked through the summer, and then the committee was expanded to include some of the reluctant. This larger committee developed a draft of the new curriculum and began a sales campaign to bring in some of the non-believers. We knew that our efforts would be sabotaged if we did not involve math teachers who were seen as informal leaders, those who might not agree there was a need for a change, and those who would be threatened when asked to instruct their students in a way that was foreign to them. Our planned change began by conducting a pilot, relying on teachers who were eager to try the new curriculum. We monitored the pilot's effectiveness by giving students the same end-of-course test they would have taken if they had been in a traditional class and then comparing their test scores with the scores of students who were in a traditional math class and had taken the same test. The results of the pilot showed that students in the integrated pilot classes did slightly better on the algebra 1-2 test than the students in the traditional algebra 1-2 courses.
Also supporting the pilot were survey results from students indicating that once they made the adjustment, they, too, liked the integrated math instruction. Next as part of the planned change came the infusion of staff development activities. Math instructors were encouraged, and in some cases expected, to attend a number of presentations which focused on modeling activities and techniques that would help teachers make the necessary classroom instructional adjustments. The change was also assisted by the purchase of sophisticated math software, graphing calculators and other technological support. Finally, after a year of planning and implementing the change, a shared vision emerged which is now embodied in a new curriculum, supplemented by new textbooks and new tests. It is a much improved, very gutsy new math program. Currently, this integrated approach to math is receiving accolades not only from math teachers, but also from science teachers - who find their students better prepared for science investigations - as well as from parents and students. Students can no longer graduate from our high schools without at least one full year in the integrated series. Normed math scores are higher than they were when we had a traditional math curriculum - evidence that we have an improved product - but our success doesn't mean we won't be making adjustments, getting rid of the glitches, working towards zero defects, improving constantly. This summer, for the third year in a row, we are again working together on staff development activities designed to help refine the math curriculum. The most important component in monitoring our math curriculum is a consistent series of teacher-generated, end-of-semester tests administered at the district level. These math scores are being very carefully watched by the math chairs who invested heavily in the change.
All of us who have become "true believers" also know that, since this is a 7-12 continuum, if there is one weak link in the series, it affects the entire sequence. So when our first-semester test scores were still hot off the mainframe, we noted that one junior high produced scores that were considerably below the scores of students at the other two junior highs. In cooperation with the site administrators, and in the spirit of collaboration, we have established an inservice plan which should improve the teaching skills of the instructors whose students were not performing at expected levels. Test data are enormously important in monitoring the quality of this program. The processes used in making the curriculum change seem very closely related to the CQI process of plan, do, check, act. I can become quite enthusiastic about other standards that need to be embedded in curriculum if quality instruction and, consequently, quality learning are going to take place in the classroom. I very much respect the work of the National Writing Project and its very significant contributions to the teaching of writing via the writing process and the resulting literature-based writing programs. Without fondness I remember the days of language-arts skill-and-drill grammar instruction and the resulting inability of students to express their thoughts on paper. In science, a rigorous, inquiry-based, hands-on program is fundamental to a successful science curriculum. Currently we are looking at a curriculum design that will extend these already successful programs into an integrated format, trying to break down the barriers between disciplines and departments and build greater student understanding of the connectedness among disciplines. I suggest that curriculum is a critical piece of any school discussion, and it greatly influences all else that happens in a school - either positively or negatively.
However, one cannot be involved in curriculum without keeping an eye on assessment. I acknowledge that ideally curriculum and assessment should be based on a common, articulated concept of learning - in other words, they should match. Ensuring this articulated concept of learning may sound like a relatively easy task, but I can assure you, herein lies the real test. At the district level, I am constantly working with classroom educators first to agree on an appropriate curriculum and then to match this curriculum with a reasonable test. One or the other is always changing. On a larger scale, however, tests - especially high-stakes tests - can influence and even distort curriculum. I believe that one of the reasons curriculum was so skills-driven for so long is that teachers were preparing students for assessments which focused on discrete skills. Perhaps the Arizona State Department of Education had that in mind when it decided to introduce curriculum outcomes entitled The Essential Skills. They really did believe that these curriculum outlines, which focused on reading and writing processes, would produce change. When this change was neither rapid nor noticeable, the next step was to design performance assessments which they thought would ensure that the Essential Skills curriculum was adopted. Using assessment to drive curriculum and instruction may not be a good thing. I want to be very clear when I say that the performance-based mandated state tests in Arizona have supported many worthy curriculum goals, such as writing-across-the-curriculum and increased expectations in the area of math. Philosophically, I am very supportive of performance-based assessments. However, the Arizona performance assessments were flawed. Perhaps the principles and processes of CQI could have averted some of the problems associated with these tests. The first year these tests were administered was a never-never-land adventure not only for school districts, but also for the state department.
The performance assessments were distributed based on a matrix - nine different tests in each of three subject areas, so different schools got different tests in different subject areas; that's twenty-seven tests. At the district level, sorting and delivering the tests was a performance assessment in itself. But our biggest challenge came at the high schools, where we were asked to administer about nine different tests in the three subject areas to seniors less than two months from graduation. These tests were designed to take sixty minutes (keep in mind we have fifty-five-minute class periods), and while the math and reading tests were a one-day assessment, students who had been selected to take the writing test had to have their schedules re-arranged for two days. Seniors do not take only senior-level classes, so there were no consistent courses where we could capture all the grade-twelve students. In a desperate attempt to do what we were asked to do, we posted lists throughout the halls of the high schools, asking students who had been randomly selected (by the state department) for a particular test to show up in a room (with a very disgruntled teacher) to take this new performance assessment. The real performance assessment was whether or not they could find their names on a list and then find the right room at the right time. Most students thought it was easier to find the right river on the right day. Less than half our seniors bothered to attend school on the scheduled test day. Some of the students who did take the test amused themselves by coloring the graphics, by turning the very serious booklet on the Rain Forest into a romantic adventure starring the Little Mermaid, or by recording their answers upside down and backwards. It has always impressed me how creative our students can become given the appropriate challenge. There were no consequences tied to the tests; the results would not be available until the next fall. I think you get the picture.
Needless to say, things did improve after that very wobbly beginning, but the credibility gap was almost impossible to bridge. The next set of performance assessments was thematic; that is, the tests were contained in a booklet which housed questions on math, reading, and composition, and all students throughout the state took the same test, by grade level. From the perspective of someone who had to supervise the administration of the Arizona performance assessments at the school level, I found myself in the position of always having to apologize for the lack of foresight in determining how they were to be administered, for obvious mistakes in the actual test, and for the inadequate scoring timeline. Because these tests were so costly and more complicated than expected, I think the state tried to economize on the scoring. Almost all of the performance assessments were scored only once, and it is difficult to defend such a score. The anecdotes surrounding the Arizona performance assessments could contribute to a good novel. Large groups of tests were lost by the scoring groups, never again to see the light of day. The Arizona State Department of Corrections printed the tests - sometimes upside down, or with missing pages, or in any other way that amused them - and sometimes the prisoners left little X-rated messages in the tests for our students to read. These performance assessments were administered for two years, and then, with the election of a new state superintendent this year, all testing was abruptly canceled two weeks prior to the date when this spring's assessments were to be given. It may be that the state department was too ambitious. They wanted to change teachers' instructional practices by presenting them with curriculum outcomes and then reinforcing those outcomes with a high-stakes test.
My colleagues and I often felt as if those of us who actually administered the district tests were not to be trusted; the state did not appear overly willing to accept our suggestions and act on our concerns. Perhaps the performance assessments might have had a longer shelf life if the state had first focused on developing the assessments along with a reasonable plan for reporting results and ensuring the quality of the scoring - and if they had involved more field people in improving the quality of the process. Planning is the key concept here. I have found that the road less traveled can be a very rocky road, and as an Arizonan, I know that if I venture off Route 66, I will need to carry a water jug and a snake-bite kit. I also think much of what the state department tried to do runs counter to Continuous Quality Improvement processes. I applaud our new state superintendent's decision to pause and reconsider aspects of our very ambitious performance assessment program. I am also pleased that she is taking the time to catalog the problems that have given it a shaky reputation and to systematically plan for change. Currently, she and her staff are visiting with teachers, administrators and community members. She also intends to survey teacher needs and solicit recommendations for improving the Arizona assessments. In a letter dated last week to the educators of Arizona, State Superintendent Lisa Graham said, "I believe that performance-based assessment should be our goal. But we're going to move more slowly and thoughtfully toward that goal. . ." I believe that in the very near future, this very important form of testing will return to Arizona classrooms.

Discussant Comments

Darlene Patrick, Regional Vice President
Jon S.
Twing, Ph.D., Senior Project Director
Harcourt Brace Educational Measurement

Introduction

My charge today is to discuss the issues and papers surrounding what is becoming a hot topic in educational research, evaluation and testing, namely the application of what are traditionally seen as "business management" techniques to the assessment of school and student outcome processes. Simply stated, how does the addition of performance assessment mediate our need for continuous quality improvement? During the past twenty years I have had the privilege of viewing education from a number of perspectives. I have approached the topic drawing upon my past teaching, administrative, and current educational publishing experiences. I would also like to publicly thank Dr. Jon Twing, Senior Project Director at The Psychological Corporation, for his assistance with my discussant role.

The Role of Assessment in a Continuous Improvement Education System
Joe B. Hansen, Colorado Springs Public Schools

The Hansen paper is a well-written exposé of the benefits (as well as the limitations) of implementing a system to improve education. Clearly, the emphasis on a systems approach is not casual, as all aspects of the Hansen paper require the integration of multiple data points, summary analyses, use of information and improvement of the educational product. Presumably this system is being implemented in the Colorado Springs Public Schools and will be worth our attention during the months to come. First, Hansen talks about the need for a "Continuous Improvement System" or CIS. Continuous Improvement Systems are " . . . a class of self regulating systems that use internally or externally generated signals to monitor progress of the system and its components toward some end state or goal." Often, schools feel a mandate from the state to implement change at any cost to improve test scores.
Seldom do teachers feel this is where instructional emphasis should be placed, and schools are not in agreement regarding the end outcomes or goals. The impact of a "high stakes" mandated performance assessment is really an unknown. The highly touted California and Kentucky programs have not, as yet, yielded test scores resulting in "high stakes" decisions about individual students. Issues surrounding fairness, security and costs are still unresolved. Another aspect of the Hansen system is an emphasis on "Customer Orientation": such systems " . . . have in place a mechanism or process for monitoring customer satisfaction, detecting and analyzing dissatisfaction, and correcting faults in the educational product . . . " But just who are the customers? What exactly is the educational product? Suppose, for the sake of argument, that we agree that teachers, students, parents, state educational agency personnel, taxpayers and business people alike are the customers. Because these customers have conflicting goals, it is unlikely that what one finds satisfying in an educational product will be agreed upon by the others. For example, taxpayers are tired of the endless requests for additional tax support for school facilities, teacher pay raises, tests or textbooks. In the Midwest, for example, this can be seen in the countless millage increases rejected at the voting booth. Hence, taxpayers are likely to find the cost-to-benefit ratio of a standardized achievement test very attractive. However, business people want students (the educational product?) to be able to do "real world" things like balance a checkbook, use personal computers and read and write with proficiency. Hence, they may be more interested in the greater "authenticity" typically associated with performance assessments (which, dollar for dollar, are more expensive than "off-the-shelf" achievement tests). Perhaps defining the customers is easy compared to defining the "educational product."
What is an educational product? Is it a "well informed" student? A student who has surpassed some level of mastery toward some goal? A student who "survives" until graduation? There are countless examples of people who are successful despite poor school performance or a lack of graduation. The definition of educational product is not independent of the customer, for each customer previously described is likely to have a different definition of educational product or outcome. Both the customers and the products must be clearly defined before CIS will be successful. Another aspect of the Hansen system is a "Focus on Quality," which " . . . is a fundamental and pervasive element in an educational CIS. A CIS must generate and make effective use of data on key indicators of quality or progress toward predetermined goals or desired conditions. A high regard for the value and use of quality indicator data must be embedded in the culture of the system. A process for using such indicator data for self regulation and improvement must also be clearly defined and implemented." A focus on quality is indeed important. First, it suggests that predetermined goals are established. Second, it suggests multiple data points (standardized achievement tests, performance assessments, portfolios, teacher judgments, etc.). Third, it requires the "culture" to have a high regard for the quality of the data. Fourth, a "feedback-improvement loop" process is required. While I do not want to debate the value of each of these points, there is serious doubt whether the existing school environment will allow them to be implemented. For example, regarding the first point, classroom teachers often feel that there are several agendas regarding the "predetermined" outcomes desired. Often, teachers are forced to make decisions regarding students' instruction at the direction of the principal, decisions which they may feel are not in the best interest of the student.
Clearly, the goals must be established in light of all the players, including the teachers, students, parents, and principals. Multiple data points are wonderful. I doubt anyone here would argue with the point that the more pictures we have of a child's, school's or system's performance, the better we will be at making judgments regarding areas of strength and weakness and the need for improvements. However, in the political reality of a school system, the classroom teacher is often hampered in his or her ability to use these multiple data points. For example, many states have multiple criteria for graduating from high school. Typically these criteria are: the number and type of courses completed successfully, attendance, and successful completion of a state-mandated test. On closer inspection, however, often the only binding requirement is the mandated test. The other criteria usually have "substitutions" or proxies by which they can be fulfilled (as in summer school make-ups, Saturday detention, and other special instruction). This sends a message to the schools (teachers, principals, and students) that performance on the mandated test is the only concern. Often this leads to an overemphasis on the behaviors, content and processes used or assessed by the mandated test. This overemphasis on what many teachers feel are trivial or "stand-in" measures of student achievement undermines the third point of the CIS, namely that schools have a "high regard" for the value and use of quality indicator data. Clearly, performance assessments will not help to improve this situation unless they are perceived as having clear value in fostering improvements. Finally, it is often the case that schools are slow to effect change even when everyone is in agreement that change is needed. This is due to many factors, but I speculate that it is due primarily to the fact that schools do not know how to make improvements.
Time and time again I hear that "staff improvement" is needed, but I see no plan for what the "improvement" is likely to bring about, or any mechanism for examining its effectiveness. Teachers want to teach well and schools want to facilitate success for all students, but both need help in implementing changes for the better. Hence, regardless of the "multiple data points," courses of action regarding what to do next must be developed.

The CIS outlined by Hansen requires reliance on computer systems (a "computerized" database). However, many school systems I am involved with are having a hard time coming up with the funds to meet their "basic equipment list" with such things as a triple-beam balance for science, plastic bugs, reference materials for local/state cultural or historical data, etc. Such a need to rely on a computerized system may be problematic.

Another concern with the CIS system outlined by Hansen is its multiple-level and summary reporting needs. While I agree that multiple reporting levels are required, as there is a need for different types of interpretive data at the various user levels (teacher, principal, school board), how such multiple data points are combined or summarized is an unknown. Hansen points this out as a potential problem. In addition, the psychometric quality of these different types of data (i.e., performance assessments, standardized tests, course grades, etc.) varies as a function of the assessment type. Does this mean that the combination of these pieces of information (i.e., the composite picture of the student, school or district) needs to be weighted by the type of data used? Finally, as Hansen points out, there will be a great cost associated with a CIS that requires multiple assessment forms, at least at first, both in terms of teacher/student time and real dollar expense. Perhaps, as the system expands and the assessments become more like instruction (i.e., curriculum embedded assessments), the cost will go down.
However, the expense associated with such systems in the large scale arena (e.g., Kentucky, California) has been overwhelming.

Curriculum Development, Continuous Quality Improvement and Performance Assessment in the Local Education Agency
Bev Merrill, Gilbert Arizona Public Schools

A number of thoughts come to mind while reading Bev's interesting narrative. American consumers have embraced the information revolution. More than ever before we have the knowledge to make informed choices, and the "power" we have established is recognized in the marketplace. Our expectations for quality are voiced loudly. Astute companies continually take the pulse of our pivotal attitudes and changing needs. We are moving at a rapid pace, and our "values," rightly or wrongly, reflect the demand for convenience, time/cost consciousness, simplicity and instant gratification. With the knowledge that quality in American products and services is possible, a revolution against mediocrity has arisen. The same consumer consciousness that demands high performance standards in business (i.e., on-time arrivals for the airline industry) is refusing to accept the status quo from our educational system. Welcome to Educational Reform, 1990's style.

The educational community has tried in earnest to respond to pressures to facilitate immediate change in the quality of our schools. Performance assessment has become an integral part of improvement plans. Bev's description of the history of the Arizona program is but one example of the reform movements that have been implemented by states and districts. Without full realization of the problems and solutions through input and planning by teachers, parents, students and the educational community at large, many more "improvement" programs are doomed to failure. Restructuring casualties are lining many education highways. Through misuse of their power and influence, political organizations are responsible for creating "sporadic systemic chaos."
As Bev points out, credibility and trust have been sacrificed. Educators disillusioned with reform methodology have grown skeptical of the promise of positive, lasting change.

World Class Quality: Performance Assessment from a Business Perspective
Jonelle Adams, The Boeing Company

The Adams paper brings to the discussion what is called "Full Customer Satisfaction," or FCS. And while this system clearly has its roots embedded in product delivery (as we would expect from a Fortune 500 company like Boeing), it applies very nicely to the educational product (given that this needs to be defined operationally). For example, Adams outlines five areas of FCS: Quality, Costs, Delivery, Safety, and Morale. First, quality refers to service and product design. All customers (students, parents, teachers, principals and taxpayers alike) will want changes to the product design, or aspects of product design that meet their particular needs. Likewise, service is important for everyone. Many local school districts (teachers and principals) feel as if they must serve everyone (parents, students, taxpayers and the state) without basic support. It is our job, as the people suggesting changes, to support these people in their endeavors. For example, it is not enough to design a new portfolio system without designing, and in some cases managing, the implementation of such a system. Again, the key point seems to be staff development and training as a service to school personnel. Second, any product must be cost effective, regarding not only dollar costs but also time and setup costs. This means that a balance will have to be found between the more costly but presumably richer performance assessment information and other more traditional but less rich information such as course grades and standardized achievement tests. Third, the delivery of the product is no small task.
For example, there currently exists a relatively long time lag between when performance assessments are "given" and when they are scored. This will affect when these assessments can be used and how the information will be interpreted. Fourth, as with commercial products, the safety issue is as great if not greater with educational products. We must take the time to do the research to ensure that only well informed decisions are made from such a wide range of data and that the resulting score use does not negatively impact student breakout groups (i.e., minorities and/or other desegregation groups). Finally, what benefit is there in developing a system unless everyone will use the system to their advantage? Unless everyone cares about making an improvement and "buys into the system," it will surely go unused. This then raises the question of what is driving change. Some large-scale programs maintain that unless the users are held accountable, they will not implement the changes necessary. This is clearly the wrong message to send from the classroom perspective. If we want to change the way teachers are using assessment to improve our educational product, then they should be the ones who effect the change. This system should be marketed from the bottom up in a truly "grass-roots" campaign. Teachers will then have ownership, and this translates to students, parents, principals and taxpayers.

Closing Perspective

As I have outlined during my presentation, the issues regarding quality improvement are many faceted and multidimensional. Implementation of change for improvement in the educational setting is not new and is continuously debated. This debate is never about the need to improve, but always about how to improve. Clearly, the call for more "authentic" measures of student achievement has been seen as one way to change for the better.
However, it seems that the zealots who have pushed this bandwagon along the way have failed to deal with the real world problems of: identifying the customers, defining the educational product, assessing the needs of the customers, and maintaining a "feedback/improvement" loop to facilitate change. What the educational system does not need is a plan for improvement which does not address all of the needs and concerns of the people involved. High stakes mandated testing is a reality in many state testing programs. These assessments seem wholly incompatible with the desired outcomes of a performance assessment and total quality improvement system. In fact, in states where such visionary systems were put in place (namely Kentucky and California), the cost per pupil was astronomical and delivery of student results is still pending. What we must strive for in the development of a Total Quality System of Improvement is the assessment of customer needs (once these are defined), improvements to the educational product (once this is defined, how might we make it better?), and a plan for tracking whether the changes to the system have improved the product and how this change will be assessed. Finally, a timely and accurate system for delivering the results to all audiences is as important, if not more so, than the results themselves. Such delivery will bolster confidence in all the players, allowing for inspection of the results and not of the system itself.