An Assessment Guide to Educational Effectiveness

Chapter 1: Assessment Concepts, Terms, and Purpose – provides an overview of assessment in the United States and defines the elements of effective assessment practice. The intended audience of this chapter includes college and university presidents, vice presidents, provosts, board members, and executive sponsors of campus assessment efforts.

Chapter 2: Planning for Assessment – provides the operational elements of planning for assessment, including governance, structures, and strategies. The intended audience of this chapter includes executive sponsors of campus assessment and directors of institutional effectiveness, directors of assessment, assessment coordinators, department heads, and others on committees involved in the assessment process.

Chapter 3: Conducting Assessment – provides an operational view of assessment, including decisions that need to be made and the design of processes that keep assessment useful, manageable, and sustainable. The intended audience of this chapter includes those directly involved in conducting assessment.

Chapter 4: Using Assessment Results – provides information about the important final steps of assessment: using actionable knowledge to improve educational quality. The intended audience of this chapter includes senior leadership as well as those involved in assessment.

The clickable document map below allows readers both to understand Blackboard’s Educational Effectiveness strategic consulting framework and to access specific topics by clicking on the links.

[Document map: Blackboard’s Educational Effectiveness consulting framework, arranged from strategic to tactical across Institutional Strategy, Measurement Methods, Measurement Instruments, Evaluation Criteria, and Use of Results, and spanning institutional strategy, faculty and program development, and student support and engagement. Linked topics include designing assessment governance and executive roles and decision-making; identifying program, disciplinary, and co-curricular outcomes; writing effective outcome statements; designing institutional assessment plans; aligning assessment with standards; using curriculum maps for curricular design; designing direct and indirect measurements, surveys and course evaluations, rubrics, portfolios, and embedded assessments; interpreting results; improving equity of outcomes; making results visible; designing effective reports; using results for program improvement; and designing for manageability and sustainability.]

Chapter 1: Assessment Concepts, Terms, and Purpose

Assessment as an engine of change. While it is likely that very few institutions would be conducting assessment were it not for accreditation requirements, and while many do view the process as an exercise in accountability, institutions have much to gain from outcomes assessment. If used as a diagnostic tool, assessment can serve as a powerful engine of change that keeps education a vibrant, relevant, and indispensable social institution.
Indeed, outcomes assessment is arguably the only systematic means we have to continually improve our core business – educating our students. By framing assessment within our institutional mission and goals, we can understand the impact of our work on several levels. To what extent did we achieve our goals? To what extent are our mission, goals, and processes aligned? Where are opportunities to celebrate our accomplishments? Where are opportunities to improve performance?

Of course, educators have long practiced assessment in many forms. Assessment in the classroom allows us to understand student performance. Administrative assessment allows us to understand faculty and staff performance. The work of our Institutional Research offices allows us to understand the inputs and outputs of our work: average test scores, average high school GPA, graduation rates, retention rates, average GPA at graduation, time to graduation, average admission scores, and so on. All of these processes tend to focus on the student or employee as the unit of measure.

Outcomes assessment focuses on the program as the unit of measure. More importantly, outcomes assessment provides a means for us to understand the quality of our programs. With thoughtfully designed methods, we can generate data that tell us not only where we stand in achieving our desired end states (judgment and accountability) but also what to do to improve our work (diagnosis and improvement). Before delving into the implementation of outcomes assessment (how to measure, what to measure, etc.), it is helpful to have a common understanding of terms used throughout this collection of documents.

Terms and definitions. What do we mean when we use the term “assessment,” and what is the difference between “assessment” and “evaluation”? Arguably the father (and mother) of evaluation practice, Michael Scriven, in his Evaluation Thesaurus (Fourth Edition, 1991, Sage Publications), provides four definitions for evaluation. We share the most relevant definition:

EVALUATE, EVALUATION The process of determining the merit, worth, or value of something, or the product of that process.

Scriven sees no distinction between the terms “assessment” and “evaluation” in his definition of “assessment”:

ASSESSMENT Often used as a synonym for evaluation, but sometimes the subject of valiant efforts to differentiate it, presumably to avoid some of the opprobrium associated with the term “evaluation” in the minds of people for whom the term is more important than the process. None of these efforts are worth much, either in terms of intrinsic logic or adoption.

Although we do not make distinctions between the two terms, many campuses use the terms differently – often assigning the term “evaluation” to instances where faculty or staff are the unit of measure (e.g., performance evaluation) and “assessment” to practices adopted to address accreditation standards. At Blackboard, we have adopted a definition that specifically addresses our work with clients:

ASSESSMENT In an educational context, a process focused on understanding how well a program has delivered specific knowledge, skills, and competencies and using that understanding to plan for improved program performance going forward.

Throughout this document, we will use “assessment” and “evaluation” interchangeably. However, when working with clients who use these terms for specific processes, we adjust our use of the terms according to campus definitions, as we do other elements of educational practice.
OUTCOME: An outcome is what a student knows, thinks, or is able to do as a result of a program. Outcomes are observable and therefore measurable, where “measurable” is typically a quantitative expression of quality. Outcomes should have a name and a description, where the description typically sets forth the criteria for evaluating the outcome. For example:

Outcome name: Information Literacy

Outcome description: The ability to know when there is a need for information, and to be able to identify, locate, evaluate, and effectively and responsibly use and share that information for the problem at hand. – Adopted from The National Forum on Information Literacy (http://www.infolit.org/)

Criteria for evaluation:
Recognize need: Effectively defines the scope of the research question or thesis. Effectively determines key concepts. Types of information (sources) selected directly relate to concepts or answer the research question.
Access information: Accesses information using effective, well-designed search strategies and the most appropriate information sources.
Evaluate: Thoroughly (systematically and methodically) analyzes own and others’ assumptions and carefully evaluates the relevance of contexts when presenting a position.
Use for a purpose: Communicates, organizes, and synthesizes information from sources to fully achieve a specific purpose, with clarity and depth.
Use ethically and legally: Students correctly use all of the following information use strategies (use of citations and references; choice of paraphrasing, summary, or quoting; using information in ways that are true to original context; distinguishing between common knowledge and ideas requiring attribution) and demonstrate a full understanding of the ethical and legal restrictions on the use of published, confidential, and/or proprietary information. (http://www.aacu.org/value/rubrics/pdf/InformationLiteracy.pdf)

In addition to curricular outcomes, there are co-curricular outcomes resulting from the programs and services typically organized within student affairs. Among the knowledge, skills, and competencies related to the co-curriculum are career planning, interpersonal relationships, academic planning, multicultural competence, physical health and wellbeing, and so on. Beyond curricular and co-curricular outcomes, an argument can be made that any collection of organized activities on a campus has implications for intended outcomes. The student billing function could think of its work as resulting in personal financial skills. The registrar’s office could think of its work in terms of academic planning skills. The library could think of its work in terms of information location and access skills. Campus security could think of its outcomes as knowledge of personal safety, and so on. With the principal terms “assessment” and “outcomes” defined here, additional definitions will be provided as concepts are introduced.

Accreditation and accountability. While we have argued that assessment is of value to all institutions of higher education because of its potential to serve continuous improvement, there is no question that institutions of higher education are being driven to adopt assessment by the professional and regional accreditation agencies. Over the last two decades, as higher education in the US has reached full maturity, accreditation standards have moved away from compliance models toward quality assurance and enhancement models.
No longer as focused on the capacity to deliver education, regional accreditation now has expectations about institutional and educational effectiveness. Each regional agency states these expectations in language making clear that institutions need to conduct assessment and use results to improve; they do not expect institutions to provide laundry lists of achievement levels or comparability with other institutions. To underscore the concept that “accountability” in the US means that institutions have systematic processes in place and use the resulting data to improve their work, relevant language from each of the accreditation agencies’ standards is included below.

Southern Association of Colleges and Schools
3.3.1 The institution identifies expected outcomes, assesses the extent to which it achieves these outcomes, and provides evidence of improvement based on analysis of the results in each of the following areas: (Institutional Effectiveness)
3.3.1.1 educational programs, to include student learning outcomes
3.3.1.2 administrative support services
3.3.1.3 educational support services
3.3.1.4 research within its educational mission, if appropriate
3.3.1.5 community/public service within its educational mission, if appropriate

North Central Association – Higher Learning Commission
Core Component 2c: The organization’s ongoing evaluation and assessment processes provide reliable evidence of institutional effectiveness that clearly informs strategies for continuous improvement.

Western Association of Schools and Colleges
Standard 4: Creating an Organization Committed to Learning and Improvement. The institution conducts sustained, evidence-based, and participatory discussions about how effectively it is accomplishing its purposes and achieving its educational objectives. These activities inform both institutional planning and systematic evaluations of educational effectiveness. The results of institutional inquiry, research, and data collection are used to establish priorities at different levels of the institution and to revise institutional purposes, structures, and approaches to teaching, learning, and scholarly work.

Middle States Commission on Higher Education
Standard 7: Institutional Assessment. The institution has developed and implemented an assessment process that evaluates its overall effectiveness in achieving its mission and goals and its compliance with accreditation standards.

New England Association of Schools and Colleges
Evaluation 2.4: The institution regularly and systematically evaluates the achievement of its mission and purposes, giving primary focus to the realization of its educational objectives. Its system of evaluation is designed to provide relevant and trustworthy information to support institutional improvement, with an emphasis on the academic program. The institution’s evaluation efforts are effective for addressing its unique circumstances. These efforts use both quantitative and qualitative methods.

Northwest Commission on Colleges and Universities
Standard 1.A – Mission and Goals: The institution’s mission and goals define the institution, including its educational activities, its student body, and its role within the higher education community. The evaluation proceeds from the institution’s own definition of its mission and goals. Such evaluation is to determine the extent to which the mission and goals are achieved and are consistent with the Commission’s Eligibility Requirements and standards for accreditation.
This is not to say that the accrediting organizations had fully developed underlying theories and methodologies to offer their members when the standards were initially introduced. Assessment guidelines are evolving, with workshops, written materials, rubrics, and other tools being offered by accrediting agencies to help institutions meet accreditation expectations. Indeed, the field of evaluation is itself still developing, as discussed in the following section.

Evaluation theory. Evaluation theory is new enough that it is safe to say that all or most of its theorists are alive today. The field is becoming well established as a profession and as an academic discipline. It grew out of massive government funding of social programs in the 1960s, when it was important to know the consequences – both intended and unintended – of government spending on programs such as Head Start, Upward Bound, and other War on Poverty initiatives. That era gave rise to graduate programs in evaluation across the US, which today offer many options for those seeking advanced degrees in evaluation – particularly in educational evaluation. Theories and evaluation models abound, with a range of underlying concepts. Theories such as “goal based,” “goal free,” “practical,” “empowerment,” “appreciative inquiry,” and “program” evaluation all have a stake in the literature. Some theories are more appropriate than others depending on what is being evaluated. Among the things to evaluate are product, cost, process, social impact, and outcomes. Our work at Blackboard is unique to each client’s needs, but in general the approach is grounded in higher education and is a blend of:

Program level: Understanding the impact of the design and delivery of our programs and services.
Outcomes based: Examining the program through the expected knowledge, skills, and competencies of students who go through the program.
Action oriented: Generating “actionable knowledge” by developing information that tells us where we stand and what to do to improve our programs.

Our theoretical model gives rise to specific methodologies.

Evaluation methodology. It is important to remember that evaluation research differs substantially from scientific research methodology. While evaluation uses some functional elements of scientific research methodology, the purposes are different. Where the purpose of scientific research methodology is to test a hypothesis, evaluation lives in the very real context of teaching, learning, support, and administrative activities. Where the former insists on value-free or value-neutral observations, evaluation research is entirely about our values as educators and our desire to contribute to an educated citizenry and develop each student to his or her full potential. The different purposes and uses of these two methods often present challenges in academia, where scientific methodology has dominated for generations. The idea of “scientific rigor” has led to many assessment practices that are over-engineered, over-sampled, and intrusive. Indeed, outcomes assessment requires us to put on new lenses.

New lenses for education. Because educators are comfortable thinking of their work in terms of each student’s mastery of disciplinary knowledge or content, organizing courses to transfer that knowledge from faculty to student, and issuing grades as evidence of mastery, the assessment process initially places many squarely in a zone of discomfort.
The assessment process calls for collaborating with colleagues to decide on program outcomes, looking at aggregate rather than individual student performance, examining these data in terms of program performance (rather than individual student performance), and then collectively determining a course of action to increase program performance. Naming and defining program outcomes in a way that is observable (and therefore measurable) are, as yet, unfamiliar responsibilities for faculty. In addition to new processes, the vocabulary of outcomes assessment is new, very specific, and complex. As the field evolved, attempts to explain the purpose of higher education assessment gave rise to certain concepts that have not withstood the test of merit, yet have remained in the lexicon as attractive distractors. The term “value added,” for example, has led some institutions to pursue the measurement of growth in learning – a tiresome endeavor that only leads investigators to the non-starter conclusion that students know more when they graduate than when they entered the institution. The term “lifelong learning” has led some institutions to struggle with how to measure a concept over which they will have no control once the student has graduated.

There are many hazards along the road to building manageable, sustainable assessment processes that lead to actionable knowledge and institutional effectiveness. Our work as Blackboard consultants is to clear the hazards, recommend the right tools, sharpen visibility, and organize for sustainability and manageability of assessment processes. What does effective assessment practice look like?

Effective assessment practice. Several levels of organizational capability need to come together in order for institutions to realize the power of assessment as an engine of change. These levels can be described as follows:

1. Institutional Strategy
Leadership – Top leadership visibly supports outcomes-based education and views assessment as a continuous improvement strategy.
Goals and outcomes – Goal statements describe desired end states of the program; outcome statements describe specific knowledge, skills, or competencies gained from the program.
Assessment plan – An institution-wide assessment plan has been adopted which guides assessment activities across the institution.
Governance model – A representative group meets regularly to monitor the quality of assessment methods, the information gained, and improvements in practice.
Visibility – Outcomes results are reported to all stakeholders, who use the information to collectively plan and improve program design and delivery.

2. Measurement Methods
Direct measurement – All outcomes are assessed using direct measurements such as tests or rubrics over a defined period of time.
Indirect measurement – Indirect measurements such as surveys are used to supplement and inform direct measurement of outcomes.
Quantitative expression of quality – The quality of outcomes is expressed quantitatively, with institution-wide understanding of the level of achievement desired.
Equity of outcomes – Outcomes results for at-risk populations are routinely examined and result in strategies to achieve parity among all groups.
Manageability and sustainability – Assessment processes are ongoing while producing low levels of distraction to the teaching and learning process.
3. Measurement Instruments
Surveys and course evaluations – Surveys and course evaluations are easily deployed through automated means, and reports are automatically generated.
Rubrics – Outcomes rubrics are easily deployed through automated means, and reports are automatically generated.
Curriculum maps – Curriculum map shells are easily generated through automated processes and can be accessed across all programs from a central location.
Tests – Tests for specific outcomes are easily deployed, and reports are automatically generated.
Portfolios – Portfolios are easily created, populated, and assessed with associated rubrics through automated processes.

4. Evaluation Criteria
Standards – The institution is easily able to associate accreditation standards with evidence of program performance at all times, with little manual processing.
Targets – The institution has established targets (expected levels of performance) for all outcomes.
Outcomes defined – Outcomes are defined with a name or label and an expanded definition describing the key elements of the outcome; there are no conflated outcomes.
Rubric criteria for evaluation – Each outcome for which rubrics are used in direct measurement has at least three specific, observable criteria used to evaluate the outcome.
Test items as criteria for assessment (embedded assessment) – Embedded test questions relate to specific criteria for evaluation, and scores by test item are used for evaluation purposes.

5. Use of Results
Program improvement focus – Assessment results are consistently viewed as indicators of program performance (not student performance) and are used to improve programs.
Collective interpretation – Assessment results are interpreted, discussed, and decided upon by all individuals who are involved with the design and delivery of the program.
Used for improvement – Assessment processes focus on gaining an understanding of how to improve program performance over time; improvement decisions flow directly from the assessment results.
Used for accountability – Assessment results are regularly examined by program members to evaluate program merit.
Results visible – Assessment results are accessible and visible to all interested stakeholders.
Meaningful reports – Assessment reports are presented in a form that is meaningful to most audiences.

6. Faculty and Program Development
Outcomes-based academic program design – All educational programs have established a comprehensive set of student learning outcomes.
Faculty-driven change – All full-time program faculty collectively review assessment results and reach agreement on curricular and pedagogical changes leading to improved program outcomes.
Program-level engagement – All full-time faculty in programs are knowledgeable about and involved with program assessment.
Faculty development program – Faculty development workshops on program assessment are regularly provided.

7. Student Support and Engagement
Outcomes-based program or service design – All student support programs and services have identified a comprehensive set of student outcomes that go beyond student satisfaction indicators.
Staff-driven program or service change – Program staff and leaders collectively review assessment results and reach agreement on program and service changes that will lead to improved student outcomes.
Staff professional development program – Student affairs and student services staff fully participate in outcomes assessment workshops and seminars.
Actionable knowledge. The power of assessment lies in the development of actionable knowledge – information that tells an institution where it stands and what to do to improve. There is no one assessment process that will lay this information at our doorstep, but there are many pieces of information on campuses that, when brought together into a quality dashboard, can provide it. The following dashboard example would be built from a number of data sources: course evaluations, campus climate surveys, facilities surveys, rubrics, final examinations, and other instruments that campuses use every year. This dashboard uses both direct and indirect measurements to paint a picture of where the campus stands in its desire to provide high-quality instruction, support services, and physical environments.

In interpreting the information in the illustration “Know where you are,” this campus knows that graduates are not leaving with the level of scientific reasoning skills (circled in red) that would be expected from having completed 12 units of science lab courses. An examination of the rubric used to assess scientific reasoning provides the granular information needed to improve the institution’s performance. In this case, faculty can collectively engage to plan for emphasizing and reinforcing skills that will improve students’ ability to hypothesize, analyze, and conclude. There are many ways for a campus to develop the skills and organizational capacity to use assessment information in powerful ways, and there are a number of useful resources in the field that can be drawn upon as well.

Field Resources. Although there are numerous resources available to educational assessment practitioners, a few stand out in terms of immediate use:

Maintained by the University of Kentucky, a very active listserv of US higher education assessment practitioners provides useful advice and support. Subscribe to this listserv at http://lsv.uky.edu/archives/assess.html.

With a focus on teaching and learning, developing a quick understanding of the context surrounding outcomes assessment is important for faculty and senior leaders. "From Teaching to Learning - A New Paradigm for Undergraduate Education" by Robert B. Barr and John Tagg (1995) will be very useful. In this article the authors describe the role of educators under the traditional “instruction paradigm” and under the new “learning paradigm.” See this article at http://ilte.ius.edu/pdf/BarrTagg.pdf.

To develop a conceptual understanding of the assessment of student learning, T. Dary Erwin’s Assessing Student Learning and Development: A Guide to the Principles, Goals, and Methods of Determining College Outcomes (1991) is an accessible and useful book.

At a practice level, program assessment methodology is clearly presented in Worthen, Sanders, and Fitzpatrick’s Program Evaluation: Alternative Approaches and Practical Guidelines (3rd Edition, 2003).

From a theoretical perspective, E. Jane Davidson’s Evaluation Methodology Basics (2005) is a compendium of general evaluation practices. Although not aimed at educational evaluation, it is a solid source of theory and practice considerations.

Assessment is a powerful gift. At Blackboard, the definition of institutional effectiveness is: the capacity of an organization to sustain adaptive processes to achieve its mission. Three powerful gifts result from outcomes assessment.
First, institutions have a means of collectively engaging faculty in the design and delivery of the curriculum – and of doing so on an ongoing basis. Second, the institution can now provide evidence to substantiate bold claims of the type typically found in mission statements, such as “our graduates have a deep awareness of the impact of their professional work on society.” Third, the most powerful gift is adaptivity – the ability to change course to meet a desired end state. Faculty engagement, continuous improvement, institutional identity, adaptation, and change – a powerful gift indeed.

Chapter 2: Planning for Assessment

Assessment planning involves governance, structures, and strategies. An effective governance model is key to success in assessment. Ideally, an institution’s assessment initiative has three core elements of governance: (1) an executive sponsor providing legitimacy, visibility, and resources to the initiative; (2) an assessment team with wide college representation tasked with designing and overseeing the process; and (3) a staff position responsible for organizing, executing, and presenting the resulting data. This model provides the head, heart, and hands of successful assessment processes in an institution. The specific roles and responsibilities of these core elements are to:

1. Executive sponsor
a. Introduce the concept of outcomes and the purpose of evaluating outcomes
b. Appoint and charge a working group to plan and oversee assessment processes
c. Keep the campus informed of key milestones achieved, celebrate successes, and encourage participation in the process
d. Provide resources when needed to accomplish the work of assessment

2. Assessment team or steering committee
a. Define an institution-wide structure, process, and schedule for assessment
b. Establish broad understanding of and agreement on the process
c. Identify key outcomes the institution will want to evaluate across all programs; we will refer to these as trans-disciplinary outcomes
d. Provide guidelines for programs to define disciplinary outcomes
e. Identify assessment methods and instruments to be used on the key outcomes the institution will want to evaluate across all programs
f. Regularly monitor the process and oversee the quality of process and results
g. Provide direction to assessment support staff and make decisions as they arise from support staff

3. Assessment support position
a. Staff working group meetings, identifying areas of needed attention
b. Assist programs with assessment planning, process design, and implementation
c. Provide expertise on evaluation methods
d. Create instruments such as surveys, rubrics, and curriculum maps
e. Administer assessment processes
f. Create reports on assessment results; present reports if necessary
g. Schedule meetings for collective discussion and decision-making
h. Organize and centrally maintain assessment data

Once these structures are in place, there are governance strategies to consider.

Models of governance. Will assessment be centralized (designed and overseen by an institutional assessment team), decentralized (designed and conducted at the program or discipline level), or some combination? Many institutions expect their faculty to self-organize for assessment. Some have driven assessment down to the course level.
Course-level and self-organizing assessment approaches are both recipes for inconsistent results – results that are not comparable from one course to another, unusable metrics (e.g., grades), and results that cannot be generalized to the program level, and certainly not to the institution level. Faculty are not experts in program assessment, and without a structured faculty development effort such governance decisions will expend a great deal of resources for little return. For the sake of consistency, manageability, and sustainability, four general guidelines are recommended:

[Figure: a pyramid of the four guidelines – Centralize (trans-disciplinary), Decentralize (disciplinary), Rotate and Phase, and Sample when using a rubric.]

1. Centralize as much as possible. Assess all mission-related and trans-disciplinary outcomes centrally, leaving all disciplinary outcomes to programs. Trans-disciplinary outcomes will typically be outcomes that are not “owned” by a specific discipline, and are often those introduced in the general education program with further development in the major disciplines. Cross-disciplinary assessment teams can be formed to manage and conduct assessment activities.

2. Decentralize at the highest common level. These assessment activities will typically reflect disciplinary outcomes related to disciplinary knowledge and the application of theory and practice in a given field of study. Many programs will list trans-disciplinary outcomes as well, but program-level assessment teams can be formed to manage and conduct assessment focused on disciplinary outcomes.

3. Rotate assessment activities so that over a period of time all outcomes will have been assessed at both the central and decentralized levels. Phased approaches may also be used to learn lessons of good practice for all through the experiences of a few. For example, an institution may start with Phase I being the assessment of general education outcomes, followed by Phase II being the assessment of disciplinary outcomes. The lessons learned through Phase I will enable subsequent assessment processes to run more smoothly.

4. Always sample when using process-heavy methods such as rubric assessment. Sampling allows for the development of valid results while keeping processes manageable and sustainable. The methodology should also include steps to assure that the sample is representative of the entire population (a brief sampling sketch appears below).

These guidelines are proposed to maintain high-quality results, minimize effort, and comprehensively address an institution’s expected outcomes. “High-quality results” means that outcomes assessment of a representative sample of student work results in both summative and formative performance metrics, where high-quality formative metrics means results at the criterion-for-evaluation level. “Minimizing effort” means that a knowledgeable and interested group of faculty and staff has conducted the assessment and taken the burden of assessment off the backs of faculty and staff across the institution. “Comprehensive assessment” means that with a planned, purposeful approach, the institution will have an understanding of how it is performing on all outcomes within a given time period and, with improvements made as a result of initial rounds, will observe performance increases over time.
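As a minimal illustration of guideline 4, the sketch below draws a simple random sample of student artifacts for rubric scoring. The artifact names, population size, and sample size are hypothetical, and a real process would also verify that the sample is representative (for example, by program, class level, or demographic group), possibly by sampling within each group.

```python
# A minimal sketch of guideline 4: rubric-score a representative random sample
# of artifacts rather than every submission. All names and sizes are hypothetical.
import random

# Pretend population: 300 senior capstone papers identified for assessment.
artifact_ids = [f"capstone_{n:03d}" for n in range(1, 301)]

random.seed(2024)                             # fixed seed so the draw can be reproduced
sample = random.sample(artifact_ids, k=60)    # roughly a 20% sample for rubric scoring

print(f"{len(sample)} of {len(artifact_ids)} artifacts selected for rubric scoring")
print("first five selected:", sorted(sample)[:5])
```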
The governance models described above, working hand in hand, assure that little effort is expended on duplicated, unnecessary, or unproductive assessment activity. Regardless of the model of governance used at an institution, the overall process should ensure that coordination takes place among the institution’s assessment processes. Institutional assessment, for example, will typically include assessment of the academic, student support, and administrative functions of the university. Many institutions provide a framework for organizing, storing, and reporting data resulting from all assessment processes. Typically this framework includes places for outcomes, outcomes assessment results, and actions to be taken to improve outcomes. A sample framework with sample instructions for submitting departments is provided for illustration.

[Figure: an institutional quality framework spanning academic, co-curricular, and administrative assessment.]

Approaches such as this provide a central, common framework for assessment directors, assessment steering committees, and leaders to access and monitor assessment activity across the organization. General education, academic department, student support program, and administrative assessment activities may take place concurrently and somewhat independently, but can easily be reported and tracked within such a framework. (Indeed, the content should be reviewed at a central level with an eye to reducing duplication of effort and providing feedback to programs on ways to maximize useful information.)

Institutional messaging is a key use of the framework approach. Clear language sets a foundation for expectations. Some institutions have used framework headers such as “What are you going to do?” This invites respondents to list process statements rather than outcome statements, which in turn invites people to report the results of processes. The following graphic illustrates the difference between frameworks that set expectations for change and those that will result in the status quo. The quality of information obtained depends on clear directions and language that set expectations for the quality of responses.

Is this a lot of work? It doesn’t have to be as consuming as we see on many campuses. Indeed, the term “assessment” comes from the Latin verb assidere, which means “to sit beside.” The process of assessment should sit beside teaching and learning, and serve to inform the improvement of teaching and learning. In addition to the approaches discussed above (centralization/decentralization, rotation and phasing, and sampling), we now turn to assessment at the program level and approaches that can be taken to keep assessment both manageable and sustainable.

Sustainability
The key to developing sustainable and manageable processes is careful thought and purposeful planning before launching a campus-wide assessment process. Five recommendations are made for the sake of sustainability and manageability; more detailed discussion follows.

1. Design assessment processes toward the conclusion of the students’ studies in a program or at the institution.
2. Develop granularity by developing data at the criteria-for-evaluation level.
3. Reuse graded work; avoid the creation of work (for anyone) that is solely for use in program assessment.
4. Multipurpose artifacts: use student artifacts that can be assessed by applying multiple rubrics.
5. Repurpose tests: analyze existing tests that capture program outcomes at the criteria-for-evaluation level.

These recommendations and the context that gives rise to them are discussed briefly below.

End-of-program focus: As mentioned earlier, assessment processes should capture the knowledge, skills, and competencies as students are leaving the program in order to answer the question: do graduates have the knowledge, skills, and competencies set out in our goals?
While it would be interesting to know whether students progress from one year to the next, assessing progress is a lot of work to learn that students have more knowledge and skills leaving than when they came in the door. So, recommendation 1 is: design assessment processes toward the conclusion of the students’ studies in a program or at the institution.

Granularity: The most useful assessment information lies in the granularity of the results – meaning student performance results that, when aggregated across the sample, provide data at the criterion-for-evaluation level. It is the granularity that provides the adaptive muscle for program improvement. What does this mean? If, for example, the outcome “written communication” is defined by criteria for evaluation that include key idea, presentation of evidence, transition statements, strong vocabulary usage, and error-free grammar and mechanics, then the real insight into improving program performance lies at the criteria-for-evaluation level. This suggests that recommendation 2 is: granularity is necessary for visibility; develop granularity by developing data at the criteria-for-evaluation level.

Reuse of graded work and multipurpose artifacts: There are two key aspects of sustainability regarding rubric evaluation – reusing work submitted for grades and choosing artifacts that can be assessed for multiple outcomes. Senior theses and capstone projects are an excellent way to repurpose student work that has already been submitted for individual grading. Additionally, these particular artifacts can be used to assess several competencies at once. Thinking of the senior project/thesis processes at many institutions, one can imagine using the thesis and the presentation to capture written communication, oral communication, information literacy, problem solving, critical thinking, and perhaps other outcomes. Recommendation 3 is to avoid the creation of work (for anyone) that is solely for use in program assessment. Recommendation 4 is to use student artifacts that can be used to assess more than one outcome.

Repurposing tests: Within the realm of disciplinary outcomes, a simple and elegant way to assess these is to repurpose final exam scores, by test question, for program assessment. Many programs will place embedded questions that can be used for program evaluation into course finals. This is not necessary if there is a program exit examination that all students take, or if there is a required course that program faculty agree will capture the key disciplinary outcomes. The underlying assumption in this approach is that each test question can be associated with a criterion for evaluation that will roll up to disciplinary outcomes: problem solving, analytic thinking, knowledge of disciplinary theory, application of knowledge, etc.

[Figure: “Course X – Average Score as % of Total Possible,” an item analysis showing the average percentage earned on each exam item (e.g., Class/Instance, Aggregation or Composition, Draw Use Case, Modify Class Diagram, Elaborate: Order Code), grouped by level of thinking (Comprehend, Apply, Analyze, Synthesize).]

Aggregating student responses by test item, and expressing this as a percentage of the total possible for that item, gives the program a good idea of what students are learning and the levels of higher-order thinking they are achieving. Thus, recommendation 5 is to analyze existing tests that capture program outcomes. Arguably, all test items reflect a criterion for evaluating a program outcome, making final tests a rich source of information about program performance.
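A minimal sketch of this item-level aggregation follows. The exam items, point values, and scores are hypothetical; the point is simply that the same graded final, totaled by item rather than by student, yields criterion-level program data.

```python
# A minimal sketch of recommendation 5: repurpose an existing, already graded
# final exam by aggregating scores per test item (criterion) rather than per student.
# Item names, point values, and scores below are hypothetical.

item_points = {"Q1_define_terms": 5, "Q2_apply_theory": 10, "Q3_analyze_case": 15}

scores = {  # scores[student][item] = points earned on the graded final
    "s01": {"Q1_define_terms": 5, "Q2_apply_theory": 7, "Q3_analyze_case": 9},
    "s02": {"Q1_define_terms": 4, "Q2_apply_theory": 9, "Q3_analyze_case": 12},
    "s03": {"Q1_define_terms": 3, "Q2_apply_theory": 6, "Q3_analyze_case": 13},
}

# Average earned points per item, expressed as a percentage of the points possible.
for item, possible in item_points.items():
    earned = [per_student[item] for per_student in scores.values()]
    pct = 100 * sum(earned) / (possible * len(earned))
    print(f"{item}: {pct:.0f}% of points possible")
```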
In summary, the principle of sustainability is to generate as little new work as possible, using existing artifacts that provide sufficient granularity to create actionable knowledge – knowledge that leads to action, action that increases the quality of teaching and learning, support programs and services, and administrative work. The other principle that we emphasize in planning for assessment is the idea of actively managing the process. Active management requires ownership, roles, and responsibilities.

Ownership
At the institutional, program, or in fact any level, the assessment process needs people with sufficient authority and responsibility to see that assessment activities lead to actions that will improve performance. This implies the involvement of senior leadership as well as leadership at the assessment activity level. Senior leaders who remain disengaged from the assessment process are setting their institutions up for mediocrity, if not failure, of the assessment process, the result of which is institutional status quo. At these institutions, much rich and useful information is developed, with no use beyond collecting charts, tables, and reports and organizing them for presentation to accreditors. One way to examine the roles and responsibilities needed for assessment is the template used in Blackboard Consulting projects.

Roles and responsibilities

Executive Sponsor
- Champions the initiative
- Provides high-level oversight, direction, and support
- Approves major scope changes, including additional funding
- Regularly updates senior leaders and ensures initiative progress

Initiative Lead
- Provides team leadership
- Manages initiative milestones, schedule, budget, and human resources
- Provides updates to the Executive Sponsor and key stakeholders
- Recommends scope changes as appropriate
- Brings the initiative team together as necessary
- Identifies key users of the Blackboard Outcomes System

Initiative Team Members
- Actively supports the initiative
- Participates in all team meetings
- Owns processes as assigned
- Contributes to process decisions

Even with clear roles, responsibilities, and ownership, the collection of data can still result in less-than-useful information without a clear evaluation question. The following section provides an overview of this critical element.

Developing the evaluation question
Evaluation questions serve the same purpose as research questions but differ from them in important ways. Both frame evidence-gathering activities around things that matter. While the research question captures the heart of a hypothesis and sets a framework for research methodology, the evaluation question captures the heart of a program and sets a framework for the evaluation process. Ideally, a new program starts by first identifying the desired knowledge, skills, and competencies needed to fulfill the expectations of the desired end state (goal). The next step is to build a series of activities to produce the desired results, followed by the assessment process, which leads to ongoing program improvement. It is safe to say this rarely, if ever, happens. Assessment is usually a process that is tacked onto an existing practice – sometimes without defined outcomes in place. In these instances, we often find surveys that are a collection of “satisfaction” items, or survey items that reflect the curiosity or interests of the persons designing the survey.
When working with clients we often hear comments like, “It would be interesting to know how students . . . .” It is important to remember that keeping survey questions focused on the evaluation question is essential for useful results, as those questions help inform us about the desired end state. Evaluation questions should be crafted to capture both summative (judgmental) and formative (diagnostic) information. In educational program evaluation the question would be some variant of: To what extent were the goals of the program achieved, and what opportunities are there to improve?

Course or instructor evaluations are a great example of a common and widely practiced form of assessment that typically occurs without an underlying evaluation question to guide its use. As a result, faculty get summaries (and in some instances copies of student responses) on a course-by-course basis, which encourages them to view the information on a course-by-course basis. These metrics are read with interest and likely interpreted as “the perceptions of students in a course.” Because faculty receive these reports for each administration, there is little incentive to do anything with the information because of the small sample size. This represents a lot of activity with little payoff. Of particular concern are course evaluations that ask, “Overall, how do you rate this course?” This single metric is then often used by performance review committees and deans to decide on retention, promotion, or pay increases – a questionable use of a single metric. Neither of these practices is particularly effective, useful, or reliable, because there is no underlying evaluation question. Additionally, with current technology there is no reason to rely on a single question about overall instructor rating, as overall performance scores can easily be calculated by averaging responses to the other course evaluation questions.

There is a way to make effective use of these data if we think of course evaluations within the context of a faculty development program model. Just as the instructional program leads to knowledge, skills, and competencies of students, the institution’s faculty development program leads to knowledge, skills, and competencies of faculty. The evaluation question would be: to what extent is the faculty development program resulting in the expected outcomes, and where are opportunities to improve individual and overall performance? In such a model, faculty development goals and outcomes are clearly articulated. Course evaluations, consolidated over a year or two, provide metrics on each of the expected outcomes for individual faculty and for the faculty as a whole. Individual faculty can now rely on the data because they are consolidated across all courses. Faculty and their department heads can now have a conversation about ways to improve individual performance. Provosts can now see patterns across all faculty indicating professional development programming opportunities and schedule workshops, retreats, and other opportunities to improve faculty teaching skills.

Understanding diversity
For campuses whose mission and goals address the expectation of a welcoming and inclusive learning environment, assessment is a useful means of understanding where we are and what to do to improve. The question of diversity is related to equity of outcomes, and answering it requires no more effort than displaying results disaggregated by demographic group (a minimal sketch follows).
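The sketch below, with hypothetical rubric scores and group labels, shows the entire mechanic: group the same outcome results by demographic category before reporting them.

```python
# A minimal sketch of equity-of-outcomes reporting: the same outcome results,
# summarized separately for each demographic group. Scores and labels are hypothetical.
from collections import defaultdict

results = [  # (demographic group, rubric score on a 4-point scale for one outcome)
    ("first_generation", 2.8), ("first_generation", 3.0), ("first_generation", 2.5),
    ("continuing_generation", 3.4), ("continuing_generation", 3.1), ("continuing_generation", 3.6),
]

by_group = defaultdict(list)
for group, score in results:
    by_group[group].append(score)

for group, scores in sorted(by_group.items()):
    print(f"{group}: mean {sum(scores) / len(scores):.2f} (n={len(scores)})")
```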
These data provide an understanding of gaps in knowledge, skills, and competencies between demographic groups and serve as a basis for curricular and pedagogical strategies that will improve the learning outcomes of target groups. Generally, such changes also improve the learning outcomes of all students. This type of approach may also serve to change the conversation from the “underprepared student” to the “underperforming curriculum.”

The Nature of Outcomes. We discussed earlier that an outcome is what students know, think, or are able to do as a result of their experience in an institution, program, or other purposefully designed collection of activities. Outcomes, then, can reasonably be expected to reflect the organization of a college or university. Institutional outcomes are shaped by a combination of the mission statement and the knowledge, skills, and competencies we expect of all graduates – typically the results of the general education program. An example mission statement that reads in part “achieve success in their chosen careers and promote justice and peace in a constantly changing global society” suggests that the institutional outcomes would include understanding of peace and justice, and global awareness, in addition to the common outcomes defined by general education and the co-curriculum.

Co-curricular outcomes are shaped by the programs beyond the classroom. At one time the theory of student affairs was founded on the principle of in loco parentis – in the place of a parent – which dissolved in the 1960s without the emergence of a clear model for the design and delivery of student affairs programming. The organization of student affairs, however, provides a mechanism for identifying co-curricular outcomes. If the institution supports students through offices and activities such as New Student Orientation, Career & Placement, Personal Counseling, Academic Counseling, Student Activities, and Student Clubs & Organizations, then one would expect to see outcomes such as help-seeking skills, career planning skills, realistic self-appraisal, academic goal setting, leadership, interpersonal relationship skills, membership skills, physical health and wellbeing, spiritual awareness, and so on.

Curricular outcomes result from the design and delivery of the curriculum. As in the co-curriculum, we can analyze the college catalog and identify most of these outcomes by what we see in the design and delivery mechanisms of the curriculum.

Curricular outcomes – disciplinary. Disciplinary outcomes can be found in the required courses for majors. Typically we will see a core set of courses that guide students to develop knowledge of disciplinary theory and practice, application of theory and practice, knowledge of disciplinary history and philosophy, and so on. Disciplinary outcomes are also found in the collection of general education courses representing the breadth of knowledge that institutions want their students to develop. Knowledge of history, literature, mathematics, fine arts, sciences, social sciences – institutions want all students to have a grounding in the theories and principles of these disciplines. In addition, however, these courses are intended to expose students to knowledge, skills, and competencies beyond disciplinary theory. These are the trans-disciplinary outcomes – often referred to as transferrable skills. Some institutions refer to these as “soft skills,” yet they are as critical to success as disciplinary knowledge.
Curricular outcomes – trans-disciplinary. Trans-disciplinary outcomes are less aligned with disciplinary content than with pedagogical practice. They are, however, purposeful and intentional, included by design to produce knowledge, skills, and competencies that students need. The use of intensive writing, technology, presentations, teamwork, research, and other pedagogies arguably results in outcomes such as written communication, oral communication, technological literacy, teamwork, and information literacy.

Other outcomes. Arguably, there are no departments, offices, or programs in a college or university that do not have some outcomes they expect in terms of what students know, think, or are able to do. The student billing office plays a role in developing students’ knowledge of managing personal finances, the library plays a role in developing students’ ability to access information, campus security plays a role in understanding and practicing personal safety, and so on. “Services” such as food service have a role in developing an understanding of nutrition and health. Through defining outcomes, institutions not only have a means of expressing the unique character of their institutions and identifying the value to students; they have also taken the first step in measuring the quality of these outcomes. In addition to clear outcomes, the clarity of mission and goal statements is also critical to the alignment among all of these elements, so that the institution is able to consistently achieve its mission.

Effective mission and goal statements. Chapter 1 described the importance of writing effective outcome statements. We see how the outcome name captures the heart of the outcome, and how the extended definition contains the criteria for evaluating the outcome. The key to effective outcome statements is that the criteria for evaluation are observable; and because they are observable, they are therefore measurable. Mission and goal statements are not directly measurable but serve as the critical foundation upon which the outcomes rest. Missions describe the overall purpose of the organization, and goal statements describe the desired end state as guided by the mission. Mission statements often describe something other than the purpose of the organization, as seen in three real examples whose names have been redacted:

1. The mission of Community College X is to address the higher education needs of its community. Through its diverse programs and services, CCX assists students in meeting their educational goals. We fulfill this mission as an institution of higher education by preparing students:
To participate responsibly in a culturally diverse, technological and global society.
For successful transfer to colleges and universities.
For employment and advancement within their chosen careers.

The mission statement of Community College X is clear and comprehensive in stating its purpose (meet the educational needs of the community); how it meets those needs (through diverse programs . . .); and the end result (prepare students to . . .).

2. State University Y is a diverse, student-centered, globally engaged public university committed to providing highly valued undergraduate and graduate educational opportunities through superior teaching, research, creative activity, and service for the people of California and the world.
This mission statement tells us who comes to the university; what the institution does (engage globally, provide highly valued . . .); how they do it (through superior teaching); and who they serve (the people of California and the world). It does not, however, speak to preparing students to participate in society and the workplace.

3. Ivy League University Z is one of the world’s most important centers of research and at the same time a distinctive and distinguished learning environment for undergraduates and graduate students in many scholarly and professional fields. The University recognizes the importance of its location in City N and seeks to link its research and teaching to the vast resources of a great metropolis. It seeks to attract a diverse and international faculty and student body, to support research and teaching on global issues, and to create academic relationships with many countries and regions. It expects all areas of the university to advance knowledge and learning at the highest level and to convey the products of its efforts to the world.

From Ivy League University Z’s mission statement, we can tell why the university is important (research, distinguished learning environment); where it is located and why that is a good thing; who it attracts; the relationships it thinks are important; that it expects all areas to advance knowledge and learning; and finally that it shares this knowledge with the rest of the world. Again, it does not speak to preparing students to participate in society and the workplace.

These illustrations attempt to convey that the essential purpose of all educational institutions is often overlooked for the sake of describing where they are located, who their students are, how they prepare students, and how important they are on the world stage of education. If preparing students for meaningful participation in society is not in the mission statement, the institution could lose its focus on teaching and learning. We also see lost opportunities in institutional goal statements. Goal statements in higher education, rather than describing the desired end state, often describe the process of achieving the desired end state, as in the statements listed under “The Language of Process.” We can easily see what this university will do to achieve a desired end state, but the desired end state itself is not as clear.

The Language of Process
X University’s mission is characterized by its pursuit of the following institutional goals:
• To foster a safe, civil, and healthy University community
• To provide access to academic programs at reasonable cost and in multiple settings
• To strengthen interdisciplinary collaboration and international programs
• To increase diversity within the student body, faculty, and staff through institutional practices and programs
• To recognize excellence in the teaching, research, learning, creative work, scholarship, and service contributions of students, faculty, and staff
• To conduct ongoing assessment activities and engage in continuous improvement initiatives within the University
• To establish lifelong relationships between alumni and the University
• To advance responsible environmental stewardship
• To support community and regional partnerships that elevate civic, cultural, social, and economic life

If we recast these goals into the Language of Goals, we can clearly see the desired end state.

The Language of Goals
X University’s mission is characterized by its pursuit of the following institutional goals:
1. Educational Excellence: Students throughout the region and beyond are prepared to enter society through affordable, interdisciplinary and international programs.
2. Student Access and Success: Students from diverse backgrounds participate together in the life of University X supported by inclusive faculty, staff, practices, and programs.
3. Strength of Community: University X students, alumni, and other stakeholders are deeply engaged with the university through scholarship, service, and civic, cultural, social, and economic life, in a university culture that is safe, healthy, civil, and environmentally responsible.

The point of the above illustration is to highlight the difference between “goals” that give us a list of things to “do” and goals that give us a target to work toward collectively. It is the difference between administration and leadership, and the difference between tactical and visionary. There are organizational implications for attending to this concept. Both the nature of how people view their work and the ongoing improvement of that work are at stake. Specifically, when operationalized, the list approach can lead to standoffs between one process and another because of the focus on process, whereas the target approach will encourage reconciliation of processes to achieve the desired end state. From an assessment perspective, the list approach will result in checklists of things people did to operationalize the goal. Did we do it? Yes, and here is a list of the ways we did it – a recipe for maintaining the status quo. By using target language, assessment is forced to look at the extent to which the goal was achieved and to see areas for future improvement. While creating clear goals is a fundamental piece of good assessment, they are only one piece of the structure and process for institutional effectiveness. There are many other elements that come together when we examine how well those goals have served the institution.

Planning for institutional and program review. Concepts that apply to accreditation and program review processes are framed similarly because of the common practice of aligning institutional goals to program goals. The key elements of making these review processes productive and useful lie in a methodology such as:
1. Creating clear institutional and programmatic goals that depict the desired end state
2. Identifying the key outcomes that the institution or program expects to see if the goal is achieved
3. Gathering, interpreting, and reporting metrics at a sufficiently granular level to provide information on the extent to which the goals were achieved and the opportunities to improve for the next review period
4. Stating what actions or plans will be adopted to achieve the desired improvement

Methods used for institutional and program review processes are the real challenge for those responsible for organizing them. Institutional effectiveness and assessment directors know that serving up metrics is useless without a range of other structures, people, and processes in place, including:
1. The visible support of executive sponsorship
2. The structured engagement of key stakeholders – particularly faculty regarding the assessment of academic outcomes, and student support personnel regarding the assessment of co-curricular and support outcomes
3. A critical cohort of community members skilled in developing actionable knowledge
4. Human and technological resources to support manageable and sustainable assessment activity, including experts in:
a. Rubric design
b. Survey design
c. Data analysis
d. Report design
e. Sampling methodology

We will discuss these methods in more detail in the following section, “Conducting Assessment.” Planning for Assessment is a big part of the success of assessment initiatives that lead to improvement. And planning is a lot of work, but the risk of not planning is huge. Without a planned approach to assessment, institutions run the risk of expending huge amounts of energy and resources for little payoff. With a planned approach, however, institutions are positioned to reap gains in terms of validating the design and delivery of all of their programs, adapting to new internal and external expectations, and systematically improving the performance of the institution.

Chapter 3: Conducting Assessment

Defining has many dimensions within the context of outcomes assessment. In earlier discussions we have explored the importance of defining an institutional governance structure, of defining processes, and of defining outcomes with clear criteria for evaluation. In this section, we will assume that all of those structures are in place and explore decisions of methodology and assessment practice, which leads us into tools and instruments, roles and responsibilities, and a host of other details that belie the overly simplified representation of “define, decide, assess.”

Deciding on assessment methods at first blush seems to be a relatively straightforward proposition. There are two major classifications in methodology: direct and indirect. Within these classifications there are two basic instruments or tools: direct measurement typically involves tests or rubrics, while indirect measurement typically involves surveys or counts. It is at the point of deciding which test, rubric, survey, or count to use that the decision on assessment methods becomes challenging. The chart here provides a diagrammatic representation of the categories where decisions for conducting assessment take place.

Tests are a useful source of program assessment data. They are also attractive because students are not required to do anything beyond what is required for the course, and they are motivated to perform at their best. This is an “embedded assessment” opportunity at its best. The real argument for using tests is that every question in a test addresses both disciplinary and trans-disciplinary outcomes. By this, we mean that each test question has the following characteristics:
1. Test questions are always aligned to an outcome and always represent a criterion for evaluating an outcome, whether recognized and articulated or not.
2. A test question in the context of a program always contains an element of disciplinary knowledge and a level of difficulty which can always be aligned to higher order thinking (understanding, applying, analyzing, synthesizing, evaluating, etc.).
3. Because they represent criteria for evaluating an outcome, test questions can often be aligned with trans-disciplinary outcomes such as analytic thinking, problem solving, mathematical reasoning, and so on.

This method requires a few more steps on the part of faculty to articulate the criteria for evaluation related to a test question and to calculate the test results by question rather than by student. Accordingly, in this illustration, we can see the primary function of course-level metrics: issuing a grade to the student.
Let’s assume that the General Education Committee has identified this test as one that is suitable for use in assessing how well the General Education program has performed on Quantitative Reasoning. These very same data now become an instrument of program assessment, except that rather than calculating by row (student performance), the calculation is done on the columns (program performance). What these data tell us is the extent to which the program has delivered on key disciplinary criteria for evaluation. Of course, the questions need to be linked to criteria for evaluating an outcome to tell us how the outcome is performing. Using the Association of American Colleges and Universities’ definition of Quantitative Literacy (http://www.aacu.org/value/rubrics/pdf/QuantitativeLiteracy.pdf), the faculty would link the criteria for evaluating Quantitative Literacy to the questions, for example:
Interpret: average of Q1 and Q2
Represent: average of Q3 and Q4
Calculation: Q5
Application/analysis: Q6
Communication: Q7

Math 204 Spring '10 Final
Interpret: 74%
Represent: 80%
Calculation: 83%
Analysis: 81%
Communication: 61%
Overall: 76%

With these data, the General Education Committee has data sets that contribute to understanding how well the program is developing Quantitative Reasoning for all graduates. The committee can ask itself (1) is the overall performance what we want, and (2) where are opportunities to improve the program’s performance? There is always a way to improve performance. Note that the sample size would need to be sufficient before making decisions based on such data. The important thing to note is the shift from the student as the unit of measure to the program as the unit of measure. In assessment we are looking at what students know, think, or are able to do as an indicator of the impact of the program. To summarize the use of tests in direct assessment, the following steps are necessary:
1. Aggregate average student performance by test question
2. Align test questions to criteria for evaluating the outcome
3. Report aggregate average student performance by criterion to identify areas of improvement (formative)
4. Report the average of all criteria for overall performance on the outcome to identify achievement of the outcome (summative)
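To make these steps concrete, the sketch below rolls a set of per-question test scores up to the criterion and outcome level. It is a minimal illustration only: the gradebook values, student identifiers, and the question-to-criterion mapping (drawn from the Quantitative Literacy example above) are hypothetical.

    from statistics import mean

    # Hypothetical per-student scores by question, expressed as the share of
    # possible points earned on each question.
    gradebook = {
        "student_01": {"Q1": 0.80, "Q2": 0.70, "Q3": 0.85, "Q4": 0.80, "Q5": 0.90, "Q6": 0.85, "Q7": 0.60},
        "student_02": {"Q1": 0.70, "Q2": 0.75, "Q3": 0.80, "Q4": 0.75, "Q5": 0.80, "Q6": 0.80, "Q7": 0.55},
        "student_03": {"Q1": 0.75, "Q2": 0.70, "Q3": 0.80, "Q4": 0.80, "Q5": 0.80, "Q6": 0.80, "Q7": 0.65},
    }

    # Faculty-defined alignment of test questions to criteria for evaluating the outcome.
    criteria = {
        "Interpret": ["Q1", "Q2"],
        "Represent": ["Q3", "Q4"],
        "Calculation": ["Q5"],
        "Application/analysis": ["Q6"],
        "Communication": ["Q7"],
    }

    # Step 1: aggregate average performance by question (columns, not rows).
    questions = next(iter(gradebook.values())).keys()
    question_avgs = {q: mean(s[q] for s in gradebook.values()) for q in questions}

    # Steps 2-4: report by criterion (formative) and overall (summative).
    criterion_avgs = {c: mean(question_avgs[q] for q in qs) for c, qs in criteria.items()}
    overall = mean(criterion_avgs.values())

    for criterion, value in criterion_avgs.items():
        print(f"{criterion}: {value:.0%}")
    print(f"Overall: {overall:.0%}")

The same aggregation works whether the source is a single final examination or a collection of finals across a program; only the question-to-criterion mapping changes.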
Many commercial tests are available to assess a range of outcomes. Institutions often find this attractive because of the low investment in time and effort. In deciding to use commercial products to measure outcomes, there are some compelling reasons to think twice:
1. Are the costs of commercial tests justified by the information we will get? In many cases, results are reported in terms of your institution’s performance against other institutions. Thus, the research question implied in this situation is: How do our students perform in comparison to other colleges or universities? We are less positioned to ask the more appropriate evaluation question: How does our program perform in comparison to like colleges or universities?
2. Are commercial test results reported by criterion for evaluation? Without granular information on criteria for evaluation, you will have no ability to develop improvement plans. All we will know is that we performed better or worse than other institutions, which could reflect student selectivity as much as program performance.
3. Will students put any effort into a test that does not count for grades?
4. What is the general feeling about requiring students to participate in an exercise that is of little or no educational benefit to them?

Until commercial tests provide clear criteria for evaluation and test results are reported by criterion for evaluation, they will provide only part of the information needed.

Rubrics are the other (and arguably the most impactful) instrument of direct assessment. A rubric is simply a matrix with performance level descriptors mapped against a set of criteria for evaluating an outcome. A well designed rubric is a powerful mechanism for giving students a learning target, and for giving faculty a teaching target. Faculty who collaboratively create a rubric often find it important, energizing, and useful academic work. Whether or not a rubric evaluation process is used to evaluate outcomes, it is worthwhile to construct and communicate to students and faculty rubrics for every institution-level outcome. This practice will begin to provide a common understanding of teaching and learning expectations. Institutions are strongly encouraged, however, to develop a critical core of faculty and staff who are skilled in creating rubrics.

Rubric design is critical to meaningful evaluation. A poorly designed rubric is a roadblock to both students and faculty, and certainly to improving program performance. Examples of poorly designed rubrics we have encountered are:
1. Rubrics that consist of one criterion for evaluating an outcome. This rubric will not identify opportunities for program improvement because there is insufficient granularity to identify what needs to improve.
2. Rubrics that have no performance level descriptors. This rubric will produce unreliable results because evaluators can each use their own definitions of quality.
3. Rubrics that use judgmental terms as performance descriptors. Terms like excellent, good, fair, poor, and needs improvement can distract conversations by making student performance the focus rather than program performance.
4. Holistic rubrics that lump all performance levels under criteria for evaluation. These behave like rubrics that have one criterion for evaluating an outcome: they will show how well the outcome performs but will not identify areas of improvement.

The characteristics of poorly designed rubrics suggest the elements of a well-designed rubric. Creating rubrics is an art, and campuses that have this expertise are fortunate. An example of a well designed rubric is taken from the VALUE (Valid Assessment of Learning in Undergraduate Education) project of the Association of American Colleges and Universities. Earlier we referred to the criteria for evaluating quantitative literacy. The VALUE rubric for Quantitative Literacy (http://www.aacu.org/value/rubrics/pdf/QuantitativeLiteracy.pdf), modified for this discussion, is shown here. Note the non-judgmental performance levels, the full description of each criterion and performance level, the presence of several key criteria for evaluating the outcome, and, in particular, how the criteria for evaluation provide a framework for the use of tests in program evaluation as we described above.

Deciding on assessment instruments for direct measurement, given what we have covered in this section, suggests that the decision-making process on how to assess an outcome might be:
1. Consider existing tests for disciplinary outcomes:
a. If there is an existing final test or tests; and
b. If those tests are taken by a representative sample of students; and
c. If the tests are taken near the completion of the program; and
d. If scores by question are available; and
e. If a collection of final tests in a program makes a complete picture of disciplinary outcomes (knowledge of theory, practice, history/philosophy, application, analysis, and evaluation of theory, for example); then
f. Use test question scores – reported as a percentage of the total possible item score – by criterion for evaluation.
2. If no test or series of tests is available, use rubrics applied to the completed work of students nearing the completion of their program.

Deciding on who, what, and when to assess takes us once again into the territory of manageability and sustainability. Our discussions of sampling, artifacts, and frequency are written through the lens of manageability and sustainability.

Sampling is critical in rubric assessments where large populations are involved. Many institutions try to capture all students in a given assessment activity, which is not necessary. Rubric assessment of all 2,000 students in a graduating class is neither manageable nor sustainable. Rubric assessment of a sample population of 95, however, with a team of 5 evaluators each scoring 38 papers ((95 papers x 2 reads per paper) / 5 evaluators) is sustainable and manageable, particularly when the papers are being read for specific reasons (written communication, information literacy, etc.); evaluators are not reading for content. Sampling in outcomes assessment borrows from the guidelines of social scientific research methodology. Typically this is a consideration in rubric evaluation. A quick internet search for “statistical sampling size” will identify several usable tables to incorporate into an assessment plan. Keep in mind that assessment results are not a formal research project ending in published results. Instead the aim is to produce actionable knowledge that assists in the continuous improvement of the program. As such, a margin of error of 10% is sufficient for the purposes of assessment.
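For readers who prefer a formula to a lookup table, the sketch below shows one conventional way to arrive at a sample size of this order, using Cochran’s formula with a finite-population correction. The confidence level, margin of error, population size, evaluator count, and reads-per-paper are illustrative assumptions rather than prescriptions; at a 95% confidence level and a 10% margin of error, it yields roughly the sample of 95 papers used in the example above.

    import math

    def sample_size(population, margin_of_error=0.10, z=1.96, p=0.5):
        """Cochran's formula with a finite-population correction (z = 1.96, roughly 95% confidence)."""
        n0 = (z ** 2) * p * (1 - p) / margin_of_error ** 2   # infinite-population estimate (~96)
        return math.ceil(n0 / (1 + (n0 - 1) / population))   # corrected for the actual population

    graduating_class = 2000
    n = sample_size(graduating_class)            # about 92 artifacts
    reads_per_evaluator = math.ceil(n * 2 / 5)   # 2 reads per paper shared among 5 evaluators

    print(f"Sample size: {n} artifacts")
    print(f"Workload: roughly {reads_per_evaluator} reads per evaluator")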
Artifact collection processes are a challenge for all institutions. A perfect scenario is that all students submit all their work electronically, neatly organized across the institution, with “tags” that align to outcomes, so that gathering artifacts is easily accomplished. This scenario, of course, does not exist – anywhere. In the planning stages of assessment, however, the process of artifact collection should be collaboratively planned, clearly defined, and communicated in advance to faculty and students. Bringing sampling methodology into the mix of considerations further complicates the process. When applying sampling to a set of artifacts, one approach might be:
1. Identify the outcome to be assessed and the rubric to be used
2. Identify the source of evidence (artifact) that will reflect the outcome and is closest to program completion
3. Identify the complete population of eligible students and artifacts
4. Determine which artifacts will be collected for evaluation through randomization, purposive selection, or other systematic means
5. Determine the workflow for obtaining the artifact from students. Examples: will the artifact be collected electronically? Is the artifact already being submitted via an electronic method? Have the artifacts already been collected via some other online or offline method?
6. Identify the courses from which the artifacts will be collected

Frequency of assessment is also viewed through the lens of manageability and sustainability. The goal is to avoid going through assessment processes repeatedly and getting the same results. Assessment results should show improvement over time as they lead to improvements in practice. Once the desired level of performance is achieved and assessment results stabilize, the assessment cycle can move from annual to less frequent maintenance checks. There are events and conditions that could affect results, however:
1. Initially it may be necessary to modify the rubric if evaluator feedback indicates it is vague, confusing, or difficult to score; this is best discovered and corrected during pilot phases.
2. As the process matures, assessment teams may want to change the rubric in some fundamental fashion. The institution may feel that important criteria are missing, or that the levels of performance should be widened in order to better understand learning needs.
Whatever conditions arise that suggest changing the rubric or the process, change them. This is less about preserving historical and comparable data than it is about establishing the right tools to generate program improvement.

Course evaluations are an important part of academic cycles, serving faculty by providing feedback on their performance, and frequently serving the retention and promotion process by providing information on student perspectives of faculty performance. Although volumes could be written on campus issues surrounding course evaluations, this section will address only the relationship of outcomes assessment to course evaluations. Guidelines for the administration of course evaluations have typically been worked out by campus institutional research offices, and most campuses have established processes for these. The relationship of outcomes assessment to course evaluations is discussed in the section entitled “Planning for Assessment,” in which we argued that course evaluations are about the outcomes of professional development goals in an institution. But course evaluations could also address student learning. If we believe that developing students’ ability to write and speak clearly, think critically, work collaboratively on teams, analyze issues, etc. is the responsibility of faculty across the institution, should course evaluations contain a question such as:
Please indicate the degree to which this course improved my ability to:
• Express my thoughts in writing
• Express my thoughts orally
• Analyze and critique the thoughts of others
• Work well in a team situation

Using this approach, course evaluations would provide an additional dimension to understanding student learning. With or without the addition of student learning outcomes in course evaluations, there are new ways to think about course evaluations. As in test questions, each course evaluation item always aligns to a faculty performance outcome – whether explicitly stated or not. Mostly, course evaluation results are not reported by outcome, but by the order in which the questions were asked. This practice tends to shift interpretation of results to frequency distributions, standard deviations, and other metrics that take our focus away from the meaning of the data. In order to support meaningful and useful interpretation, course evaluation results might instead be reported by the expected performance outcome.
For example, a question that asks whether the faculty member graded fairly is about “fairness,” as is a question that asks whether the objectives of the class were provided to students. A question that asks whether the faculty member was available during posted office hours is about “accessibility.” And a question that asks whether the faculty member treated students respectfully is about “respectful behavior.” When course evaluations are reported in an outcomes framework, they become a powerful instrument for improving individual and institutional performance. This approach, applied both to individuals and across the institution, provides a roadmap for individual improvement planning and professional development workshops. It should be noted that in this illustration, the category marked “overall” is an average of all questions – it is not a question that asks students to provide an “overall” rating for the course. In the illustration, this instructor would take action on these data by (1) studying the survey questions that are contributing to the low mark in “fairness,” (2) examining student comments from the course evaluations to determine why the rating is low, (3) having focus groups or conversations with students to understand the gap between student perceptions and the faculty member’s performance, and (4) establishing a plan to improve. The Provost would examine this report from an institutional performance perspective, noting that clear communication has the lowest rating across the university. The Provost would also study the survey questions and develop additional information as to why this rating is low, then design a faculty development workshop for all faculty to improve their classroom communication skills.

Surveys are widely used on campuses for a wide range of purposes. As in course evaluations, each survey question is associated with an outcome, again usually unexpressed. Campuses often describe their surveys as “satisfaction” surveys; however, one might argue that colleges and universities are in the business of educating rather than satisfying students. Survey design should start with the goals of the program. For example, the facilities and maintenance office assessment might look something like the following:
Goals: Buildings and grounds at University X are clean, safe, and contribute to the educational and social success of its members and the surrounding community.
Outcomes: Clean classrooms, public spaces, office spaces, and grounds; safe classrooms, public spaces, offices, and grounds, etc.; functional classrooms, public spaces, offices, and grounds.
Measurements: Number of campus security incidents; number of injuries due to hazards; survey results on perceptions of students, faculty, and staff.

In designing a survey instrument, we want to ask for no more and no less than the information we need to identify how well the program is meeting its goals and how the program can be improved. By “tagging” each question with the associated outcome, we can, at reporting time, group these results to show performance on the outcome overall and the components that will contribute to improved performance going forward. As with rubric design, survey design is key to developing useful, actionable knowledge. In many instances, surveys are deployed out of routine with no one reviewing and using the results. In these instances, the survey should be decommissioned to avoid survey fatigue.
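The sketch below illustrates this kind of tagging for a course evaluation or survey: each item carries the outcome (construct) it is meant to evaluate, and results are rolled up by construct rather than reported in question order. The item wording, construct names, and mean ratings are hypothetical and assume a consistent 5-point scale.

    from collections import defaultdict
    from statistics import mean

    # Hypothetical items, each tagged with the construct (outcome) it evaluates,
    # with mean ratings on a consistent 5-point scale.
    items = [
        {"text": "The instructor graded assignments fairly.",         "construct": "Fairness",      "mean": 3.4},
        {"text": "Course objectives were provided to students.",      "construct": "Fairness",      "mean": 3.9},
        {"text": "The instructor was available during office hours.", "construct": "Accessibility", "mean": 4.3},
        {"text": "The instructor treated students respectfully.",     "construct": "Respect",       "mean": 4.6},
    ]

    # Group item results by construct and report the construct-level average.
    by_construct = defaultdict(list)
    for item in items:
        by_construct[item["construct"]].append(item["mean"])

    report = {construct: mean(values) for construct, values in by_construct.items()}
    overall = mean(report.values())  # "overall" as an average of constructs, not a separate question

    for construct, value in sorted(report.items()):
        print(f"{construct}: {value:.2f} / 5")
    print(f"Overall: {overall:.2f} / 5")

The same roll-up, aggregated across sections or departments, gives the institution-level view described above: the lowest-rated construct points to where a professional development effort is likely to pay off.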
Survey design and development on campuses are often the responsibility of people with direct knowledge of, or expertise in, the area being surveyed, but who are not necessarily skilled in survey design. As with rubric design, this is an area in which a small investment in time and money can lead to large rewards in terms of useful knowledge. Qualities of a good survey in an educational setting include:
1. Questions align with criteria for evaluating an outcome. For example, a new student orientation exit survey might ask whether students understand how to access help with understanding their student bill, using the library, academic advising, adjusting to college, and so on.
2. Response categories are comparable across questions. For example, avoid mixing a series of 5-point Likert responses (“strongly agree” to “strongly disagree”) with 4-point Likert responses (“very useful” to “not useful”). There are very few questions that cannot be worded to maintain a consistent response category throughout the survey.
3. Response categories are grouped together. For example, avoid interrupting a series of 5-point Likert questions with “yes/no” questions in the middle. Regardless of reporting format, this type of organization creates problems for the reader or the report designer. Again, this issue can be avoided through rewording.
4. Questions are understood clearly by the respondents. Avoid questions that are vaguely written or that contain vocabulary or concepts inaccessible to the respondent, and reword questions that should be split into two separate questions or reduced to their essence.
5. Questions are specific about criteria for evaluation. Questions that ask for an overall rating or level of satisfaction provide no meaningful information because the criteria on which the response was based are unknown. This is a very common practice in surveys; unfortunately, the single value yielded by such questions is often used for consequential decisions. Faculty committees are known to use the results of a single question asking for an overall rating of a course to make decisions about retention, promotion, and tenure. A more valid measure would simply be an average of the other questions. Particularly where electronic course evaluations are used, there is no reason to retain this question, as the average of the other responses can easily be calculated.

Other tools used in assessment are curriculum maps and portfolios. While both of these tools are a bit removed from the assessment of student learning, both can be valuable in institutional processes. Curriculum maps are a very useful tool to understand how a program is designed to deliver on outcomes. A curriculum map is a matrix, mapping expected program outcomes as column headings against program courses as row headings. Individual cells within the matrix indicate which course delivers on which outcome. Institutions and programs complete curriculum maps in different ways, and, as might be expected, some approaches are more useful than others. This can be a very powerful mechanism for collaboration, planning, and understanding program design by all members of the program. The concept of curriculum maps can be applied to any program that has a set of activities designed to produce outcomes. The student retention program, for example, can map expected skills and competencies (outcomes) against program activities.
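A curriculum map can live in a spreadsheet, but even a small script makes coverage, gaps, and overlaps easy to surface. The sketch below assumes a hypothetical three-course program and three outcomes with a numeric intensity-of-coverage scale; the course names, outcomes, values, and thresholds are illustrative only.

    # Curriculum map: courses (rows) by program outcomes (columns), with a numeric
    # intensity of coverage (0 = not addressed, 1 = introduced, 2 = reinforced,
    # 3 = emphasized). All names and values are hypothetical.
    curriculum_map = {
        "HIST 101": {"Written communication": 2, "Quantitative literacy": 0, "Critical thinking": 1},
        "HIST 210": {"Written communication": 3, "Quantitative literacy": 0, "Critical thinking": 2},
        "HIST 330": {"Written communication": 3, "Quantitative literacy": 0, "Critical thinking": 3},
    }

    outcomes = ["Written communication", "Quantitative literacy", "Critical thinking"]

    for outcome in outcomes:
        coverage = [row[outcome] for row in curriculum_map.values()]
        if sum(coverage) == 0:
            print(f"GAP: '{outcome}' is not addressed by any course")
        elif all(level >= 2 for level in coverage):
            print(f"OVERLAP: '{outcome}' is reinforced or emphasized in every course")
        else:
            print(f"OK: '{outcome}' coverage levels across courses = {coverage}")

Color coding the same matrix in a spreadsheet serves the same purpose for stakeholders who prefer a visual view.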
In fact, any program or service can use this approach to determine: (a) that all of the expected outcomes are addressed in the planned actions of the unit, (b) where there are gaps that need filling, and (c) where there are overlaps that can be reduced. The characteristics of good maps are that visibility and understanding of the program are made clear in each of these three areas. Generally, this means that numerical values are assigned to intensity of coverage, perhaps accompanied by color coding of cells, so that coverage, gaps, and overlaps are clear to the stakeholders.

Portfolios are often thought of as a convenient collection method for artifacts organized by expected outcomes, which can then be used in the assessment of those outcomes. With the emergence of automated artifact collection technologies, there may be less focus on portfolios, because these technologies avoid the problems associated with getting students to faithfully populate the portfolios and avoid the ethical issue of requiring all students to complete a portfolio while using only a portion of the total portfolios for actual assessment. Characteristics of assessment portfolios might include:
1. A separate portfolio space for artifact(s) associated with a specific outcome
2. A clear explanation of the outcome, including name, criteria for evaluation, and a means to access the associated rubric(s)
3. Instructions that ask for the student’s best work that matches the criteria for evaluating the outcome
4. The ability for students to replace the artifact(s) as they produce higher levels of their best work
5. The student’s ability to take a copy of the portfolio and contents with them upon graduation, or to access the portfolio for a certain period of time after graduation, for use in employment or graduate school applications

Tests, rubrics, surveys, course evaluations, curriculum maps, and portfolios are the main tools of educational outcomes assessment. Throughout this discussion, we have kept a focus on the quality of tools and processes, while keeping an eye on the issues of sustainability, manageability, and scalability, which will be directly addressed in the next section.

Sustainability, Manageability and Scalability are those high-level dynamics that play out in the most minute details of assessment practice. Each of these issues has been addressed throughout the preceding discussions, but to summarize, a few guidelines are:
Sustainability:
1. Pilot and conduct direct assessment on a small scale initially, and use the initial pilots to identify opportunities to create efficiency – meaning getting the needed information with the least amount of effort
2. Cultivate assessment across the institution by demonstrating the benefits and sharing the energy generated through using results that improve programs
3. Use resources up front to organize, staff, and build expertise in critical areas of assessment design
4. Keep executive sponsorship in the information loop
Manageability:
1. Maintain active oversight and guidance by organizing people with responsibility for the assessment processes
2. Establish and communicate clear roles and responsibilities of staff and committees, and provide authority to modify, change, and improve assessment processes
Scalability:
1. Establish a plan for growing assessment practice across the institution
2. Communicate successes and lessons learned

The information in this section is designed to produce actionable knowledge – knowledge that results in the successive improvement of quality in teaching and learning, support programs, and administrative services across the institution. A significant factor in creating actionable knowledge is the use of results, which includes the interpretation of data and converting the resultant knowledge into action. Our next section, “Using Assessment Results,” will discuss this dimension.

Chapter 4: Using Assessment Results

. . . for institutions, three powerful gifts result from outcomes assessment. First, institutions have a means of engaging faculty in the design and delivery of the curriculum – and to do this on an ongoing basis. Secondly, the institution can now provide evidence to substantiate bold claims such as “our graduates have a deep awareness of the impact of their professional work on society.” Thirdly, the most powerful gift is adaptivity – the ability to change course to meet a desired end state. Faculty engagement, continuous improvement, institutional identity, adaptation and change – a powerful gift indeed.
Karen Yoshino, Blackboard Blog, January 2010

Actionable knowledge is the end game of outcomes assessment – meaning the use of data to tell us where we stand and how to improve. Getting to actionable knowledge, however, requires a process for giving meaning to data. When meaning is assigned to data, we raise the opportunity for action – action that improves the quality of teaching and learning, student support programs, and administrative services. This process includes systematic approaches to analysis, interpretation, communication, improvement planning, and follow-up actions. Each of these concepts will be discussed in this section.

[Figure: direct and indirect measures contributing to Institutional Effectiveness, Student Success, and Educational Effectiveness]

What does actionable knowledge look like? In the example here, we see that direct measurement of teaching and learning indicates our General Education program has not developed the level of knowledge and skills we have set as our standard for “scientific reasoning.” This information can be provided to our accreditors as evidence that the institution has named, evaluated, and identified underperforming areas. Most institutional metrics (graduation, retention, average GPA, etc.) provide this level of insight. They tell us where we stand, but provide little insight into how to improve performance.

[Figure: General Education Dashboard – % of total possible, reporting institution-level outcomes (Fairness, Organized, Oral Communication, Critical Thinking, Scientific Reasoning, Multicultural Competence, and others), with a companion chart of 2009–10 Scientific Reasoning results by criterion: Observe, Hypothesize, Design, Collect, Analyze, Conclude]

If we then look at the underlying results that produced the scientific reasoning indicator (in this case a rubric evaluation process), we can better understand what to do to improve the outcome. In the illustration here, we see that hypothesis, analysis, and conclusion skills need to be improved.
This is an opportunity to bring the lab science faculty together as a whole, examine the design of the curriculum, and make plans to reorder, reinforce, and emphasize these skills to generate improved student AND program performance in the future.

Analysis of outcomes assessment data requires us to think about “data” in different ways than we routinely think about data. In general, when interpreting assessment data it is useful to: maintain a focus on the program as the unit of measure; use data for the purpose of improving performance; and avoid the tempting but common questions stemming from our prior knowledge and daily practice that detract from the focus on program improvement. We will discuss many of these detractors in this section.

Program is the unit of measure. Faculty constantly evaluate student work and assign grades on a student-by-student basis. As a result, our tendency is to view aggregate student data as a reflection of aggregate student performance rather than our real focus – program performance. It is true that we must rely on student work in outcomes evaluation, but once student performance results are aggregated we need to resist the temptation to speak about areas of underperformance as student underperformance, and instead think in terms of program underperformance. Although most program assessment processes thankfully do not rely on grades, it is important to discuss the underlying rationale. Unless grades are assigned to students using a formal rubric process, a grade is simply not sufficiently granular to be useful in program assessment. For example, a B+ issued to a student after an oral presentation in a history class is based on two areas: history and oral communication. Accordingly, the use of the B+ in assessing “oral communication” for the general education program or the history program will be skewed by the faculty member’s weighting between history and oral communication skills. To further complicate this example, the variation in weighting among several history faculty means that a collection of grades used for program assessment gives us no actionable knowledge. In using grades for program assessment, we spend considerable institutional resources for little return.

Scientific and Evaluative methods. We read scholarly journals containing data gathered with the highest levels of scientific rigor and analyzed with the tools and standards of inferential statistics. As a result, we are accustomed to thinking about data as objective and value neutral, and the implications of these data as adding to a body of theoretical knowledge rather than being of practical use. Because we have become accustomed to taking in new research knowledge and filing it away in our minds, we are not inclined to take action on data. The data provided by our institutional research offices are generated with statistical tools and strict standards. These data are often used for external reporting purposes or as internal indicators of program performance. These data (retention rates, average GPA, admission scores, demographics) are used to tell us where we stand, but because they are not sufficiently granular they give us little information about where to target improvement efforts. Often, when we do take action based on these high-level data, we do so without the benefit of knowing whether the action is going to fix the problem. Another aspect of our training in scientific research methodology is a tendency to bring misplaced scientific rigor to the assessment process.
While no one is arguing against methodological rigor in program assessment, traditional methods of statistical research often create unnecessary distractions. A recent assessment listserv discussion on inter-rater reliability drew many voices contributing information on which inter-rater reliability programs were being used, what analyses were being generated, and detailed methodologies. No one raised the issue of examining the specificity of the rubrics! We have seen many rubrics with vague performance descriptors or no performance descriptors. These rubrics will surely lead to inter-rater reliability issues. However, a tightly written rubric with specific differentiators among performance levels, accompanied by a short introductory exercise, will minimize inter-rater reliability issues. In thinking about the level of rigor in conducting assessment, it is well to keep in mind that the methodology should be rigorous enough to identify systematic patterns of program performance. Program assessment is not a high-stakes proposition for individuals; no individual is failed, blamed, or held accountable as a result of the process. It is high stakes, however, because it is the only mechanism we have to manage the quality of our programs, and failure to conduct assessment puts educational quality at risk. Finally, correlations, regressions, and other methods of inferential statistics are of little use in program assessment because the unit of measure is the program, not the student.

The value of “value added.” During the evolution of educational assessment, the issue of accountability was the principal driver. As the conversation continued and the practice of assessment began to take shape, the concept of “value added” was introduced. This concept was often expressed in terms of questions such as: “Why does going to college matter?” or “What value does this college add that another college doesn’t?” or “What do students get from an education at University X?” While these are interesting research questions, the pursuit of “value added” has resulted in significant amounts of energy and resources spent on standardized and pre-post tests that only raise more questions. In the case of standardized tests, we learn that our students perform more poorly than our comparison group or the national norm on quantitative literacy. Does this mean we did a worse job of educating our students in quantitative literacy, or does it mean our comparison group draws from a population of students who had higher levels of math skills coming in the door, or does it mean that the test is not a good measure of quantitative literacy? More importantly, what do we do now that we have this information? Unless the standardized test reports provide a breakdown at the criteria-for-evaluation level, there is little more we can do with this information but say we do better or worse than comparison groups. Pre-post testing is often thought to be a means of measuring value added, but when we find that students know and are able to do more between pre- and post-testing events, what do we learn? Only that they do better at the end of a program, or they do not. However, pre-post test results can give us useful information if we examine them by criteria for evaluation. Pre-test scores reported by criteria for evaluation will tell us how to adapt the program to address the knowledge and skill needs of the incoming student population.
Post-test scores reported by criteria for evaluation will indicate specific ways to improve the design or delivery of the curriculum. The graphic below is a concept map for all of the elements we have discussed above that lead to actionable knowledge – using results for accountability and for improvement.

What do we mean by “summative” and “formative”? We have argued that the primary purpose of educational program assessment is to provide summative and formative information about program quality. Unfortunately, the terms “summative” and “formative” carry different definitions depending on your location in the world of education. In grants and other projects, the terms are time-based: formative might mean mid-program and summative might mean at the end of a program. In schools, summative is assessment of learning, and formative is assessment for learning – where the former uses grades or report card content, and the latter uses various mechanisms for feedback to students. The result is that context brings different interpretations to the assessment table. We argue that the terms formative and summative should convey more than when an assessment occurred; they should convey the purpose of the assessment. Thus, we take “summative” to mean judgment (to what extent were the outcomes of the program achieved?) and “formative” to mean diagnostic (what are the opportunities to improve the program?). In this context, it makes sense that we examine evidence from students nearing the end of a program to answer the summative question – to what extent did we achieve the program goals? If we have sufficiently granular data, we can also use that end-point information to identify improvement opportunities.

Designing assessment processes for analysis. The conceptual “set-up” for analyzing assessment information is important in generating actionable knowledge. Although rarely found in the real world, ideal assessment instruments are organized so that reporting and analysis are apparent. In this ideal situation, criteria for evaluation are grouped by outcome, outcomes are aligned to goals of the program, goals of the program are aligned to goals of the institution, and goals of the institution are aligned to the mission. In the ideal survey, questions are aligned to outcomes and answer scales are consistent throughout the survey; indeed, the institution has established a standard scale so that all surveys are designed on the same scale. Coming back to the real world, these conditions are rarely found. This makes report design extremely important.

Data + meaning = Actionable Knowledge. The key to designing assessment reports is to focus on the evaluation question, which is another way to say “what is the meaning of the data?” Across all types of assessment results to be reported, design the report for your least experienced reader by providing graphics whenever possible, and stick with the principles that: (a) fullness and completeness of data do not necessarily equate to usefulness of data, (b) interesting data do not always equate to important data, and (c) granular results reported at the criteria-for-evaluation level facilitate use of results. Detail-level data can always be produced for those who are interested, but the distillation of data to its key elements will keep the conversation focused on the meaning and not the methodology or the statistical intricacies involved.

[Figure: a not-recommended and a recommended report format]

Actionable knowledge + effective reports = Improvement Opportunities.
The following are helpful pointers for reporting assessment results to support the analysis process:

Reporting surveys and course evaluation results
1. Resist the urge to report results in the order they are found in the instrument; instead, group item results by construct logic. For example, the course evaluation questions “The instructor showed respect for students,” “The instructor was tolerant of differing opinions,” and “The instructor encouraged broad participation” might all be grouped under the construct “inclusive behavior.” Whether the three individual question results are reported or not, the combined result should be calculated for “inclusive behavior.” This will support analysis by the qualities of teaching that the university considers desirable. Indeed, all questions on a survey or course evaluation should be aligned to a construct that links to the reason the question is being asked in the first place.
2. Resist the urge to report all of the data in the instrument; report only those that respond to the evaluation question. Staying with the course evaluation example, the question “Was this course team taught?” has little to do with the quality of the course. This item responds to a research question such as “What is the status of team teaching at the university?” and as such should be reported for a separate audience. Excluding data that do little to inform the evaluation question will encourage interpretation and use of these reports. Another way to state this is: beware of reporting instances (or asking questions) in which all responses in a course are the same, e.g., “Was the course team taught?” In the instance of the team teaching question, if it is important to know how many courses are being team taught by how many teachers, the institution should consider adding this to the registrar’s database maintained by the institution.
3. Beware of survey items that cannot be associated with a criterion for evaluation – especially if they are “student satisfaction” type questions. When surveys or course evaluations contain questions like “The overall quality of teaching in this course was excellent/outstanding,” consider reporting the combined results of all other questions as well as the results of this question. The argument here is that there are no criteria for evaluating “excellent/outstanding.” Some students will respond to this question on the basis of personality, political views, or any number of factors that have little to do with effective teaching and learning. Instead, put the emphasis on the questions that relate to specific criteria (fairness, inclusive behavior, availability, effective communication, student engagement, etc.). This becomes particularly important if student perceptions are being used in evaluating the performance of faculty, as committees may rely on the responses to questions such as “overall quality” to make impactful decisions.

Reporting rubric evaluation results
1. Report average aggregate results by criterion for evaluation (rubric rows) as well as averages for the outcome as a whole. This will support analysis by providing answers to the summative question (how well is the program performing on this outcome?) and the formative question (where are there opportunities to improve performance?).
2. Where rubric performance levels (rubric columns) assign labels such as “excellent, good, fair, poor,” attempt to convert these to non-judgmental terms like “level 1, level 2, level 3, level 4.” This will support analysis by lessening the tendency to see the data as “fair” student performance, and encourage viewing the data as indicators of program performance.
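A minimal sketch of these two rubric-reporting pointers follows, assuming hypothetical scores on a four-level scale for a handful of evaluated artifacts; the criteria echo the Quantitative Literacy example used earlier, and all values are illustrative.

    from statistics import mean

    MAX_LEVEL = 4  # non-judgmental performance levels reported as Level 1 ... Level 4

    # Hypothetical rubric scores: one dict per evaluated artifact, keyed by criterion (rubric rows).
    scores = [
        {"Interpret": 3, "Represent": 2, "Calculation": 4, "Application/analysis": 3, "Communication": 2},
        {"Interpret": 2, "Represent": 3, "Calculation": 3, "Application/analysis": 3, "Communication": 2},
        {"Interpret": 3, "Represent": 3, "Calculation": 4, "Application/analysis": 2, "Communication": 3},
    ]

    criteria = scores[0].keys()
    criterion_means = {c: mean(artifact[c] for artifact in scores) for c in criteria}  # formative view
    outcome_mean = mean(criterion_means.values())                                      # summative view

    for criterion, value in criterion_means.items():
        print(f"{criterion}: mean level {value:.1f} of {MAX_LEVEL} ({value / MAX_LEVEL:.0%})")
    print(f"Outcome overall: mean level {outcome_mean:.1f} of {MAX_LEVEL} ({outcome_mean / MAX_LEVEL:.0%})")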
Once raw assessment data have been converted into meaningful reports, they are ready for the process of interpretation.

Improvement Opportunities + Decision Path = Quality Enhancement. The process of interpretation takes us directly back to the evaluation question – to what extent did program X deliver on the expected outcomes, and what are the opportunities to improve performance? This implies that those participating in the interpretation of results have a combination of: a stake in the program being evaluated, expertise in the subject matter, the skills to create plans for improvement, and the organizational capital to make recommendations that will be broadly accepted in the community of practice. Since very few individuals possess all of these characteristics, this suggests that the interpretation process be collective and collaborative.

Collective and collaborative interpretation processes. After people with the right combination of expertise and skills have been identified, and before coming together as an interpretation team, each member should have an opportunity to review the assessment reports, together with a description of the methodology and the instruments used. In most circumstances, the interpretation process will be facilitated by a brief interpretation of the data and even a recommended action statement, such as in this graphic. A team leader or chair of the group should have a thorough understanding of the data and be in agreement with the initial interpretive statements as well as the recommendation before a group meeting takes place. The group should be provided a framework within which to conduct the interpretation discussion. Elements of the framework should include expectations for the outcomes of the meeting. In general these expectations will be:
a. Summative: Is the program performing as desired?
b. Formative: In what areas do we find opportunities to improve either the design or the delivery of the program?
c. Action plan: What do we want to accomplish in terms of improved performance going forward, what steps will be taken, and who will be responsible for follow-through?

Action plans. The types of recommended changes resulting from the assessment process typically have broad implications involving many faculty. This is the point at which broad groups of faculty become engaged with each other in improving the design and delivery of the curriculum. Conversations about the evidence of a need to improve written communication, or global awareness, or critical thinking can now take place. Sufficient granularity is provided to understand which elements of critical thinking need to be more specifically infused across the curriculum, which makes everyone think about the approaches in their programs and in the courses they teach. It is also likely the point at which the Director of Assessment releases ownership of the data and the process to Vice Presidents and Provosts, and the point at which the assessment team releases ownership of the process to the broader faculty. For example, recommendations on general education outcomes may have to be processed through the General Education Committee. The academic senate may need to consider the recommendations.

A powerful gift. The mechanisms by which the data move through the governance process of the institution will vary by institution, but the result will be powerful.
Through the assessment process, institutions have a mechanism to systematize adaptive processes – to make those adjustments and change course to meet the mission and goals of the institution. A powerful gift indeed.