Using Rubrics to Collect Evidence for Decision-Making: What do Librarians Need to Learn?

Megan Oakleaf, MLS, PhD
School of Information Studies
Syracuse University
4th International Evidence Based Library & Information Practice Conference
May 2007

Overview
• Introduction
• Definition & Benefits of Rubrics
• Methodology
• Emergence of Expert Rubric User Group
• Characteristics of Expert Rubric Users
• Barriers to Expert Use of Rubrics
• The Need for Training
• Directions for Future Research

© M. Oakleaf, EBLIP4, 2007

Rubrics Defined
• describe the 1) parts, indicators, or criteria and 2) levels of performance of a particular task, product, or service
• formatted on a grid or table
• employed to judge quality
• used to translate difficult, unwieldy data into a form that can be used for decision-making

Rubrics are often used to make instructional decisions and evaluations.
http://www.southcountry.org/BROOKHAVEN/classrooms/btejeda/images/rubric%20big.JPG

Potential Rubric Uses in Libraries
To analyze and evaluate:
• Information-seeking behavior
• Employee customer service skills
• Marketing/outreach efforts
• Collection strengths
• Information commons spaces
• Student information literacy skills

Rubric for a Library Open House Event for First Year Students
Rubric created by: Katherine Thurston & Jennifer Bibbens

Indicator: Attendance
– Beginning: Attendance rates are similar to the 2006 Open House
– Developing: Attendance rates increase by 20% from 2006 Open House
– Exemplary: Attendance rates will increase by 50% from 2006 Open House
– Data Source: Staff [Committee and Volunteers] records

Indicator: Staff Participation
– Beginning: Staff participation is similar to 2006 Open House, no volunteers
– Developing: Increase in participation by library staff [librarians and paraprofessionals] and student volunteers
– Exemplary: Increase in participation with library staff [librarians and paraprofessionals], student volunteers, student workers, and academic faculty
– Data Source: Staff [Committee and Volunteers] records

Indicator: Budget
– Beginning: Budget same as 2006 Open House, $200
– Developing: Budget increases by $100 from 2006 Open House
– Exemplary: Budget increases by $300 from 2006 Open House
– Data Source: Budget, Financial Statements

Indicator: Reference Statistics
– Beginning: Reference statistics similar to 2006
– Developing: Reference statistics increase by 20% from 2006
– Exemplary: Reference statistics increase by 50% from 2006
– Data Source: Library Reference Department Statistics

Indicator: Student Attitudes
– Beginning: Students are pleased with Open House
– Developing: Students enjoy the Open House, are satisfied with information
– Exemplary: Students are excited about the Open House, volunteer to participate with the next year’s event
– Data Source: Survey

Rubric for a Virtual Reference Service

Indicator: Transactions
– Beginning: 0 – 4 reference transactions per week.
– Developing: 5 – 7 reference transactions per week.
– Exemplary: 8 + reference transactions per week.
– Data Source: Transaction Logs

Indicator: User Satisfaction
– Beginning: Students, faculty and staff report they are “dissatisfied” or “very dissatisfied” with reference transactions.
– Developing: Students, faculty and staff report they are “neutral” about reference transactions.
– Exemplary: Students, faculty and staff report they are “satisfied” or “very satisfied” with reference transactions.
– Data Source: User Surveys

Indicator: Training
– Beginning: Librarians report they are “uncomfortable” or “very uncomfortable” with providing virtual reference service.
– Developing: Librarians report they are “neutral” about providing virtual reference service.
– Exemplary: Librarians report they are “comfortable” or “very comfortable” with providing virtual reference service.
– Data Source: Post-Training Surveys

Indicator: Technology
– Beginning: Between 75% and 100% of transactions a week report dropped calls or technical difficulties.
– Developing: Between 25% and 74% of transactions a week report dropped calls or technical difficulties.
– Exemplary: Between 0% and 24% of transactions a week report dropped calls or technical difficulties.
– Data Source: System Transcripts

Indicator: Electronic Resources
– Beginning: 0 – 50 hits on electronic resources a week.
– Developing: 50 – 100 hits on electronic resources a week.
– Exemplary: 100 + hits on electronic resources a week.
– Data Source: Systems Analysis Logs

Rubric created by: Ana Guimaraes & Katie Hayduke

Study Rubric Benefits
• rubrics provide librarians the opportunity to discuss, determine, and communicate agreed upon values
• rubrics include descriptive, yet easily digestible data
• prevent inaccuracy of scoring
• prevent bias
When used in student learning contexts…
• reveal the expectations of instructors and librarians to students
• offer more meaningful feedback than letter or numerical scores alone
• support not only student learning, but also self-evaluation and metacognition

The Research Question
• To what extent can librarians use rubrics to make valid and reliable decisions?
– Library service: an information literacy tutorial
– Artifacts: student responses to questions within the tutorial
– Goal: to make decisions about the tutorial and the library instruction program

Methodology
• 75 randomly selected student responses to open-ended questions embedded in an information literacy tutorial at NCSU
• 25 raters
– 15 internal & trained (NCSU librarians, faculty, students)
– 10 external & untrained (non-NCSU librarians)
• raters code artifacts using rubrics
• raters’ experiences captured on comment sheets
• reliability statistically analyzed using Cohen’s kappa
• validity statistically analyzed using a “gold standard” approach and Cohen’s kappa

Kappa Index
Kappa Statistic | Strength of Agreement
0.81-1.00 | Almost Perfect
0.61-0.80 | Substantial
0.41-0.60 | Moderate
0.21-0.40 | Fair
0.00-0.20 | Slight
<0.00 | Poor

Average Kappa | Rank | Participant Group | Status
0.72 | 1 | NCSU Librarian | Expert
0.69 | 2 | Instructor | Expert
0.67 | 3 | Instructor | Expert
0.66 | 4 | Instructor | Expert
0.62 | 5 | NCSU Librarian | Expert
0.61 | 6 | Instructor | Non-Expert
0.59 | 7 | Instructor | Non-Expert
0.58 | 8 | Student | Non-Expert
0.56 | 9 | Student | Non-Expert
0.55 | 10 | NCSU Librarian | Non-Expert
0.55 | 11 | Student | Non-Expert
0.54 | 12 | Student | Non-Expert
0.52 | 13 | Student | Non-Expert
0.52 | 14 | NCSU Librarian | Non-Expert
0.43 | 15 | External Instruction Librarian | Non-Expert
0.32 | 16 | External Reference Librarian | Non-Expert
0.31 | 17 | External Instruction Librarian | Non-Expert
0.31 | 18 | NCSU Librarian | Non-Expert
0.30 | 19 | External Reference Librarian | Non-Expert
0.30 | 20 | External Instruction Librarian | Non-Expert
0.27 | 21 | External Reference Librarian | Non-Expert
0.21 | 22 | External Instruction Librarian | Non-Expert
0.19 | 23 | External Reference Librarian | Non-Expert
0.14 | 24 | External Instruction Librarian | Non-Expert
0.13 | 25 | External Reference Librarian | Non-Expert

Expert status does not appear to be correlated to educational background, experience, or position within the institution.

Expert Kappa Statistics
[Figure: bar chart of expert raters’ kappa for each rubric criterion (Articulates Criteria, Cites Indicators, Provides Examples, Judges Use, Grade), plotted against the strength-of-agreement bands from Poor to Almost Perfect. Kappa values plotted, highest to lowest: 0.77, 0.74, 0.6, 0.52, 0.48.]

Non-Expert Kappa Statistics
[Figure: bar chart of non-expert raters’ kappa for each rubric criterion (Articulates Criteria, Cites Indicators, Provides Examples, Judges Use, Grade), plotted against the strength-of-agreement bands from Poor to Almost Perfect. Kappa values plotted, highest to lowest: 0.47, 0.29, 0.27, 0.24, 0.17.]

Expert Characteristics
• focus on general features of artifact
• adopt values of rubrics
• revisit criteria while scoring
• experience training

Non-Expert Characteristics
• diverse outlooks or perspectives
• prior knowledge or experiences
• fatigue
• mood
• other barriers

Barrier 1
Difficulty Understanding an Outcomes-Based Approach
Many librarians are more familiar with inputs/outputs than outcomes.
Comments from raters:
– using measurable outcomes to assess student learning focuses too much on specific skills: too much “science” and not enough “art.”
– “While the rubric measures the presence of concepts…it doesn’t check to see if students understand [the] issues.”
– “This rubric tests skills, not…real learning.”

Barrier 2
Tension between Analytic & Holistic Approaches
Some librarians are unfamiliar with analytical evaluation.
Comments from raters:
– The rubric “was really simple. But I worried that I was being too simplistic…and not rating [student work] holistically.”
– “The rubric is a good and a solid way to measure knowledge of a process but it does not allow for raters to assess the response as a whole.”

Analytic vs. Holistic
Analytic
• Better for judging complex artifacts
• Allow for separate evaluations of artifacts with multiple facets
• Provide more detailed feedback
• Take more time to create and use
Bottom line: Better for providing formative feedback
Holistic
• Better for simple artifacts with few facets
• Good for getting a “snapshot” of quality
• Provide only limited feedback
• Do not offer detailed analysis of strengths/weaknesses
Bottom line: Better for giving summative scores

Barrier 3
Failure to Comprehend Rubric
Some librarians may not understand all aspects of a rubric.
Comments from raters:
– “I decided to use literally examples, indicators to mean that students needed to provide more than one.”
– “The student might cite one example…but not…enough for me to consider it exemplary.”

Barrier 4
Disagreement with Assumptions of the Rubric
Some librarians may not agree with all assumptions and values espoused by a rubric.
Comments from raters:
– The rubric “valued students’ ability to use particular words but does not measure their understanding of concepts.”

Barrier 5
Difficulties with Artifacts
Some librarians may be stymied by atypical artifacts.
Comments from raters:
– I found myself “giving the more cryptic answers the benefit of the doubt.”
– “If a student answer consists of a bulleted list of responses to the prompt, but no discussion or elaboration, does that fulfill the requirement?”
– “It’s really hard…when students are asked to describe, explain, draw conclusions, etc. and some answer with one word.”

Barrier 6
Difficulties Understanding Library Context & Culture
Librarians need campus context to use rubrics well.

Training Topics
• Value & principles of outcomes-based analysis and evaluation
• Theories that underlie rubrics
• Advantages & disadvantages of rubric models
• Structural issues that limit rubric reliability and validity (too general or specific, too long, focused on quantity not quality, etc.)
• Ways to eliminate disagreement about rubric assumptions
• Methods for handling atypical artifacts

Future Research
Investigate:
• attributes of expert raters
• effects of different types and levels of rater training
• non-instruction library artifacts
• impact of diverse settings

Conclusion
Are rubrics worth the time and energy?
This study confirmed the value of rubrics: nearly all participants stated that they could envision using rubrics to improve library instructional services. Such feedback attests to the merit of rubrics as tools for effective evidence based decision-making practice.

Questions?
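
Worked example: Cohen’s kappa
The Methodology slide names Cohen’s kappa as the statistic for both reliability and validity, with validity judged against a “gold standard” set of scores. The sketch below shows one way that comparison could be computed for a single rater and how the result maps onto the Kappa Index bands. It is an illustration under stated assumptions, not the study’s actual analysis: the function names, the level labels, and the eight sample scores are hypothetical, and only the band boundaries come from the Kappa Index slide.

```python
from collections import Counter

# Strength-of-agreement bands copied from the Kappa Index slide
# (lower bound of each band, checked from highest to lowest).
KAPPA_BANDS = [
    (0.81, "Almost Perfect"),
    (0.61, "Substantial"),
    (0.41, "Moderate"),
    (0.21, "Fair"),
    (0.00, "Slight"),
]


def cohens_kappa(scores_a, scores_b):
    """Cohen's kappa for two equal-length lists of categorical scores."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and the same length")
    n = len(scores_a)
    # Observed agreement: share of artifacts given the same score.
    observed = sum(a == b for a, b in zip(scores_a, scores_b)) / n
    # Chance agreement: from each scorer's own category frequencies.
    counts_a, counts_b = Counter(scores_a), Counter(scores_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both scorers used one identical category throughout
    return (observed - expected) / (1 - expected)


def strength_of_agreement(kappa):
    """Label a kappa value using the bands on the Kappa Index slide."""
    kappa = round(kappa, 2)  # the slide's bands are stated to two decimals
    for lower_bound, label in KAPPA_BANDS:
        if kappa >= lower_bound:
            return label
    return "Poor"


# Hypothetical scores for eight artifacts (B = Beginning, D = Developing,
# E = Exemplary); these are made-up values, not data from the study.
gold_standard = ["B", "D", "E", "E", "D", "B", "E", "D"]
rater_scores = ["B", "D", "E", "D", "D", "B", "E", "E"]

kappa = cohens_kappa(gold_standard, rater_scores)
print(round(kappa, 2), strength_of_agreement(kappa))  # prints: 0.62 Substantial
```

In this made-up case the rater matches the gold standard on 6 of 8 artifacts (0.75 observed agreement), chance agreement is 22/64, and kappa is about 0.62. Averaging such values across criteria for each rater would give figures comparable in form to the Average Kappa column, though how the study aggregated its values is not stated in the slides.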