Nationally Normed Testing via ACS Exams: Research and Practice
Thomas Holme
Iowa State University
ACS DivCHED Examinations Institute
ChemEd 2012, Adelaide

A fundamental challenge
• Teaching is, at once, inherently personal and inescapably corporate.
• At present, the corporate interests in student learning are often articulated in terms of assessment.

Exams Institute?
• How is it that chemistry in the US has an Exams Institute?

A Short History
• 1921: Division of Chemical Education founded.
• 1930: Committee on Examinations and Tests formed, to exploit developments in objective test construction and to train teachers to construct tests for their own classes.
  – "…only specialists in the field can write good tests because they are the persons who know what is significant and important, rather than the test experts who know the forms and techniques of test construction but not the subject matter" (Ted Ashford on Ralph Tyler's position on testing)
  – The Committee was subsidized by the General Education Board, the Carnegie Foundation for the Advancement of Teaching, and the Cooperative Test Service (Dr. Ben Wood).
• 1934: A group of five Committee members released the first general chemistry test, in three forms.
• 1946: Ted Ashford appointed Chair of the Committee.
• 1984: Committee on Examinations and Tests renamed the Examinations Institute; a Board of Trustees appointed to oversee operation of the Institute.
• 1987: Dwaine Eubanks appointed Director; the Examinations Institute moves to Oklahoma State University.
• 2002: Tom Holme appointed Director; the Examinations Institute moves to the University of Wisconsin–Milwaukee.
• 2008: The Institute moves to Iowa State.

Reason the Exams Institute exists?
• "There's a sucker born every 2 or 3 decades… on average." (~ P.T. Barnum)

The key constituencies
• Practitioners
  – Often motivated by practicality
• Chem Ed researchers (TUES grant recipients)
  – Often motivated by validation challenges

Exam development
• Chair is named.
• Committee is recruited.
• First meeting: sets content coverage.
• Items are written and collated.
• Second meeting: edit items, set trials.
• Trial testing in classes: provides item statistics.
• Third meeting: review the statistics and set the exam.
• Meetings are held in conjunction with ACS National Meetings (or BCCE).
  – Partial reimbursement to volunteers

Gen Chem Exams
• Full-Year Exam (2009, 2011)
• First-Term Exam (2005, 2009)
• Second-Term Exam (2006, 2010)
• 1st-Term Paired Questions (2005)
• 2nd-Term Paired Questions (2007)
• Conceptual (1st term, 2nd term, full year)
• Full-year brief exam (2002, 2006)
• All exams carry secure copyright.
  – Released, not published

Norms and reporting
• Norms are calculated from the voluntary return of student performance data.
• An interactive web site provides score reporting for exams that do not yet have enough data to report a norm.
• The norm (percentile) is often used to help students who transfer to other programs.
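To make the norming idea concrete, here is a minimal sketch in Python of how a percentile norm could be computed from returned raw scores. The function name, exam length, and score distribution are all invented for illustration; the Exams Institute's actual norming procedure is not described in this talk.

    import numpy as np

    def percentile_norm(national_scores, raw_score):
        """Percent of the national sample scoring at or below raw_score."""
        scores = np.asarray(national_scores)
        return 100.0 * np.mean(scores <= raw_score)

    # Hypothetical national sample on a 70-item exam; the numbers are
    # simulated, not actual ACS Exams return data.
    rng = np.random.default_rng(0)
    national = rng.binomial(n=70, p=0.55, size=5000)

    print(f"raw score 48 -> {percentile_norm(national, 48):.0f}th percentile")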
Exams as artifacts of teaching
• Because ACS Exams have been around for a long time, they provide an additional artifact of what the community values.
• Consider the Organic Chemistry exams:
  – Analyze cognitive demands: the vast majority of items on the organic exam qualify as conceptual understanding.
  – Analyze chemistry content: content coverage shows modest fluctuation, but few fundamental shifts in the past 20 years.

Then and now
• In addition to their historic value, the current efforts of the Exams Institute also shed light on research and practice.
• First, a bit on curricular practice.
• Then a research example.

Criterion referencing for program assessment
• Requires criteria.
• At the college level, they don't exist.
• Build a consensus content map.
• Similar to using backward design [1].
[1] Understanding by Design, Grant P. Wiggins and Jay McTighe

Anchoring concepts
• Use "big ideas," or anchoring concepts, to organize content across disciplines.
• Build levels with finer grain size, down to the point where exam items are generally written.

Levels of the criteria map
• Anchoring Concept
• Enduring Understanding
• Sub-disciplinary articulation
• Content details

Process for setting the map (so far)
• Begin from EMV conference ideas
• Focus Group (Mar 08): Level 1 + Level 2
• Workshop (Jul 08): Level 2 + Level 3 (General)
• Focus Group (Aug 08): Level 2 + Level 3 (Organic)
• Workshop (Mar 09): Level 3 + Level 4 (General)
• Focus Group (Aug 09): Level 2 + Level 3 (Organic)
• Workshop (Mar 10): Alignment (General)
• Focus Group (Mar 10): Level 2 + Level 3 (Physical)
• Focus Group (Jul 10): Level 3 (Organic)
• Focus Group (Dec 10): Level 3 + Complexity (Organic)
• Focus Groups (Mar 11): Level 3 (Analytical, Biochem, Physical)
• Focus Groups (Aug 11): Level 3 + Complexity (Organic)
• Focus Groups (Mar 12): Level 4, Alignment, Complexity (General, Organic)

Example of comparing content
[figure: content comparison]

Access?
• The Gen Chem version has been published.
  – Published under AuthorChoice, so it should be downloadable.
• Organic is expected to be publishable by fall of this year.

Item alignment
• Look at current items from ACS Exams and align them to Level 3/4.
• The process is guided by psychometric experts.
• Can include both skills and content.
• Ultimately can help define specifications for future ACS Exams.

Comparison of gen and org
[figure: comparison of general and organic maps]

Enjoy the ride…
• This project shows how discussions around testing/benchmarking could ultimately lead to curricular forcing.
• The discussions remain "grass roots."
• The Exams Institute provides the playground, and occasionally has to decide what rules to follow, but doesn't push an "ACS" agenda.

But, testing… are you sure?
• Are we really that confident in our measurements?

Research: teachable moments
• Because a sizeable fraction of the Chem Ed community uses (or at least trusts) ACS Exams, the characterization of the exams provides an avenue to educate about assessment issues.

Recently taught topics
• Role of item complexity
• Item characteristic curves
• Item order effects
• Answer order effects
• Differential Item Functioning (DIF)
• Partial credit / polytomous scoring

Content vs. construct
• Tests demand that students complete tasks.
• Each item is a task.
• Students need knowledge within the content domain (chemistry).
• Students need knowledge about how to organize their efforts (test taking).
• Cast this understanding in terms of item complexity.

The Information Processing Model
• Consider the task in terms of the Information Processing Model of Johnstone and coworkers.
  – Johnstone, CERP, 7, 49–63 (2006)

Estimating task complexity (Johnstone and coworkers)
• Count up the pieces of information needed to accomplish the task.
• Compare to student performance.

Estimating task complexity elsewhere
• Paas & Van Merriënboer's 9-point mental effort scale (1994):
  1 ~ very, very low mental effort
  2 ~ very low mental effort
  3 ~ low mental effort
  4 ~ lower than average mental effort
  5 ~ average mental effort
  6 ~ higher than average mental effort
  7 ~ high mental effort
  8 ~ very high mental effort
  9 ~ very, very high mental effort

Data for each chemistry exam item
• Performance data (difficulty index)
• Expert-rated objective complexity
• Mental effort (hypothesized to represent subjective complexity)

Three constructs of task complexity
• Complexity treated as a psychological experience.
  – Subjective complexity
• Complexity treated as a function of objective task characteristics.
  – Objective complexity
• Complexity treated as an interaction between task and person characteristics.

Principal component analysis

  Communalities
                            Initial   Extraction
  error rate                1.000     .700
  complexity (rating)       1.000     .629
  mental effort (rating)    1.000     .704

  Component Matrix (1 component extracted)
                            Component 1
  error rate                .837
  complexity (rating)       .793
  mental effort (rating)    .839

  Total Variance Explained (initial eigenvalues)
  Component   Total   % of Variance   Cumulative %
  1           2.033   67.768           67.768
  2            .537   17.910           85.678
  3            .430   14.322          100.000

  Extraction method: principal component analysis; the extraction sums of
  squared loadings retain component 1 (total 2.033, 67.768% of variance).

Factor analysis
• Factor analysis finds a single factor, with all loadings above 0.75, as in the PCA.
• Hypothesis: this factor represents the complexity of multiple-choice chemistry items.
• Principal axis factoring and maximum likelihood factoring both reveal a single factor as well.
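For readers who want to try this kind of analysis on their own item data, here is a minimal sketch that reproduces the workflow (not the numbers) above: standardize three per-item measures, run a PCA, and fit a one-factor model. The data are simulated placeholders, and scikit-learn is an assumed tool choice; the tables in the talk come from a standard statistics package.

    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA, FactorAnalysis

    # Simulated per-item measures driven by a single latent "complexity"
    # variable; placeholder values, not the actual exam item statistics.
    rng = np.random.default_rng(1)
    latent = rng.normal(size=200)
    items = np.column_stack([
        latent + rng.normal(scale=0.7, size=200),  # error rate
        latent + rng.normal(scale=0.8, size=200),  # complexity (rating)
        latent + rng.normal(scale=0.7, size=200),  # mental effort (rating)
    ])
    X = StandardScaler().fit_transform(items)

    # PCA: report % variance and component-1 loadings (eigenvector scaled
    # by the square root of its eigenvalue, as in component-matrix output).
    pca = PCA().fit(X)
    print("PCA % variance:", np.round(100 * pca.explained_variance_ratio_, 1))
    loadings = pca.components_[0] * np.sqrt(pca.explained_variance_[0])
    print("component 1 loadings:", np.round(loadings, 3))

    # One-factor model as a cross-check on the single-factor hypothesis.
    fa = FactorAnalysis(n_components=1).fit(X)
    print("one-factor loadings:", np.round(fa.components_[0], 3))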
Take home message
• Depending on the model used for factor analysis, the amount of variance explained changes (from 51% to 67%).
• Nonetheless, for Gen Chem, half of the variance in student performance can be explained by the latent variable of task complexity.
• Not arguing against complex content; arguing for awareness of complexity when making measurements.
• Current work is looking at Organic.
  – The key challenge is assigning task complexity.

What can researchers use?
• High-quality individual instruments
  – DUCK
  – Paired-question exams
  – On-line laboratory assessment (includes full-motion video and novel item constructs)
• A support system for program assessment
  – Criterion referencing

What's a DUCK?
• Diagnostic of Undergraduate Chemistry Knowledge
• Fundamentally interdisciplinary
• Scenario-based
• 15 scenarios, each with 4 items
• Taken at or near the end of the undergraduate curriculum
  – Sometimes in a capstone course, sometimes as an extra

What have we learned?
• Analysis of performance of US students on the current DUCK.

Other advantages?
• Having the Exams Institute provides a venue for examining the interaction of measurement and learning in chemistry:
  – Complexity
  – Interdisciplinarity
  – Measurement factors: item order effects, Differential Item Functioning, and (coming) human-computer interaction

Acknowledgements
Current collaborators:
• Kristen Murphy (UWM)
• Jeff Raker (ISU)
• Kim Linenberger (ISU)
• Mike Slade (ISU)
• Heather Caruthers (ISU)
• Anna Prisacara (ISU)
• Jessica Reed (ISU)
• John Balyut (ISU)
• April Zenisky (UMass)
Prior collaborators:
• Karen Knaus (UC-Denver)
• Jacob Schroeder (Clemson)
• Mary Emenike (Rutgers)
• Megan Grunert (W. Michigan)
• Chris Bauer (New Hampshire)
NSF: DUE-0618600, 0717769, 0817409, 0920266
taholme@iastate.edu
@Tom_Holme