Technical Report Supplement
Test Development 2011-2014

Table of Contents

Supplemental Information: Dance, the Language Fields, and English as a Second Language
  Part 1: Test Design
  Part 2: Establishing Advisory Committees
  Part 3: Test Objective Preparation and Validation
  Part 4: Test Item Preparation, Validation, and Pilot Testing
  Part 5: Determination of Qualifying Scores

Supplemental Information: Sheltered English Immersion
  Introduction
  Part 1: Test Design
  Part 2: Establishing Advisory Committees
  Part 3: Test Objective Preparation and Validation
  Part 4: Test Item Preparation, Validation, and Pilot Testing
  Part 5: Determination of Qualifying Scores

Preface

The following supplemental information is intended to provide additional information about test development specifics (e.g., conference dates) and to describe any test development activities for specific test fields that differed from the process described in the main body of the Technical Report. Individual chapters of each Supplemental Information section mirror the chapters of the Technical Report and describe in detail any procedures specific to the development of the test or group of tests addressed by the supplement.

Part 1: Test Design provides details about the development of the test design specific to the test or group of tests.

Part 2: Establishing Advisory Committees provides the characteristics of the review committees specific to the test or group of tests.

Part 3: Test Objective Preparation and Validation describes any processes related to the development, committee review, or content validation of the test objectives specific to the test or group of tests addressed by the supplement.

Part 4: Test Item Preparation, Validation, and Pilot Testing describes any processes related to the development, committee review, or pilot testing of the items specific to the test or group of tests addressed by the supplement.

Part 5: Determination of Qualifying Scores provides information concerning any procedures used to determine the qualifying scores that were specific to the test or group of tests addressed by the supplement.
Supplemental Information: Dance, the Language Fields, and English as a Second Language

Development of the Dance (046) test and the following language tests began in January 2012, and these tests first became operational in March 2014:

  029 Chinese (Mandarin)
  027 German
  030 Italian
  032 Portuguese

In addition, development of the Russian (031) test began in January 2012, with the test becoming operational in October 2014, and development of the English as a Second Language (ESL, 054) test began in August 2012, with the test becoming operational in March 2014.

Part 1: Test Design

The Dance and ESL tests each include 100 multiple-choice items and two content-based open-response items. Each of the language tests listed above comprises 55 multiple-choice and short-answer items and four open-response assignments: written expression, oral expression, listening comprehension, and reading comprehension.

Part 2: Establishing Advisory Committees

The process for establishing the advisory committees for Dance, ESL, and the language fields listed above is described in the main body of the Technical Report.

Part 3: Test Objective Preparation and Validation

The preparation and validation of the objectives for Dance, ESL, and the language fields listed above are described in detail in the main body of the Technical Report.

Committee Review of Test Objectives. Draft test objectives were reviewed by the Bias Review and Content Advisory Committees for ESL on February 11-12, 2012, and for Dance and the language fields listed above on April 26-27, 2012. Lists of committee members who attended the review meetings are included in Appendix III.

Content Validation Survey. The procedures followed for the Content Validation Survey for the tests listed above are described in the main body of the Technical Report.

Sampling. The sampling procedures followed for the Content Validation Survey for the tests listed above are described in the main body of the Technical Report. In addition, for the English as a Second Language and Sheltered English Immersion tests, Evaluation Systems conducted a supplemental oversample of educators on the variable of race/ethnicity, at twice the rate at which each group appeared in the eligible population.

Part 4: Test Item Preparation, Validation, and Pilot Testing

The preparation and validation of the test items for Dance, ESL, and the language fields listed above followed the processes and procedures described in the main body of the Technical Report.

Committee Review of Test Items. Draft test items that were developed based on the test objectives were reviewed by the Bias Review Committee and Content Advisory Committee as indicated in the table below.

Test Field(s) | Dates of Bias Review Committee Meeting | Dates of Content Advisory Committee Meeting
046 Dance, 029 Chinese (Mandarin), 027 German | April 22, 2013 | April 23-24, 2013
030 Italian | May 13, 2013 | May 13-14, 2013
054 English as a Second Language | May 13, 2013 and October 8, 2013 | May 13-15, 2013 and October 8-9, 2013
032 Portuguese | May 13, 2013 | May 14-15, 2013
031 Russian | October 8, 2013 | October 8-9, 2013

Lists of committee members who attended each of the review meetings are included in Appendix III.
Pilot Testing. The initial pilot testing of open-response items for ESL and for the fields of Chinese (Mandarin), German, Italian, and Portuguese occurred through open sessions and intact classroom sessions at 11 institutions throughout Massachusetts in October 2013; pilot testing continued through July 2014 for ESL and through October 2014 for Chinese (Mandarin), German, Italian, and Portuguese. Pilot testing of items for Dance began with an open session in conjunction with the July 13, 2013 operational administration and continued through October 2013. Pilot testing for Russian began with an open session in conjunction with the March 1, 2014 operational administration and continued through October 2014. Pilot testing for all fields was continued as noted above in order to reach the target numbers of responses described in the main body of the Technical Report.

Pilot Test Form Design. The design of the pilot test forms to be administered at operational administrations and/or at colleges and universities varied by field, as indicated in the table below.

Field | Pilot Test Form Design | Number of Pilot Test Forms Created (approximate)
027 German, 030 Italian, 032 Portuguese | Open-response item test forms, each including a single open-response item: listening comprehension, reading comprehension, or oral expression (Italian and Portuguese only) | 8
029 Chinese (Mandarin) | Open-response item test forms, each including a single open-response item: listening comprehension or reading comprehension; multiple-choice/short-answer test forms, each including a total of 37 items: 1 passage of 12 short-answer items, 9 stand-alone short-answer items, and 16 multiple-choice items | 8
046 Dance | Open-response item test forms, each including a single open-response item | 8
054 English as a Second Language | Open-response item test forms, each including a single open-response item | 22

Candidates were typically required to complete two pilot test forms at a given pilot test administration, and it was expected that candidates would be able to complete the pair in 1 ½ to 2 hours. For the language fields, open-response item pilot test forms were typically distributed to candidates such that each candidate received one listening comprehension assignment and one reading comprehension assignment.

For ESL, multiple-choice items were also pilot tested on operational test forms. In this case, seven sets of 15 pilot test items each were appended to operational test forms.

The targeted minimum number of pilot test participants for the pilot test forms comprised of multiple-choice items was 50. However, for fields for which the licensed population is small and test registration is low (the lower-incidence fields), it was expected that fewer than 50 responses for multiple-choice items and fewer than 5 responses for the open-response items would be achieved; numbers of responses in some cases were expected to be as few as approximately 5 per item. Target numbers and pilot testing approaches by test field are outlined in the main body of the Technical Report.

Pilot Test Administration. The various models used for pilot testing the test items for Dance, ESL, and the language fields are described in detail in the main body of the Technical Report.

Scoring/Data Analysis. Pilot test responses to the multiple-choice items were scored, and the data were analyzed, as described in the main body of the Technical Report.
For fields for which the targeted number of responses needed to permit statistical analysis could be expected, responses were scored according to the procedures used in operational administration scoring. This was the case for ESL and Chinese (Mandarin). Due to the small teacher populations for Dance, German, Italian, Portuguese, and Russian, the open-response items for those fields instead underwent a qualitative review by content experts experienced in the scoring process to determine the answerability of each item and its suitability for use on operational test forms.

Part 5: Determination of Qualifying Scores

The determination of qualifying scores for these fields is described in the main body of the Technical Report. The dates of the Qualifying Score Conferences for these fields are noted below.

Test Field(s) | Test Implementation Date | Qualifying Score Conference Date
029 Chinese (Mandarin), 027 German, 030 Italian, 032 Portuguese | March 1, 2014 | April 10, 2014
046 Dance, 054 English as a Second Language | May 10, 2014 | May 29, 2014
031 Russian | October 25, 2014 | November 18, 2014

The item-based judgments were made based on the hypothetical group of individuals described in the main body of the Technical Report. The question posed to the members of each of the panels for each of these tests was as follows (as described in the main body of the Technical Report), with field-specific considerations for the open-response items as described below.

"Imagine a hypothetical individual who is just at the level of subject matter knowledge required for entry-level teaching in this field in Massachusetts public schools. What score represents the level of response that would be achieved by this individual?"

For Dance and ESL, panelists indicated the total number of points, "2" to "16," that would be achieved across both items, based on the four-point score scale and two scorers for each item. For the language fields, for each of the open-response items (the written expression, oral expression, listening comprehension, and reading comprehension assignments), panelists indicated the score, "2" to "8," that would be achieved for each assignment, based on the four-point score scale and two scorers for each item.

Supplemental Information: Sheltered English Immersion

Introduction

The development of the Sheltered English Immersion (SEI) test followed the procedures described in the main body of the Technical Report, with the exceptions and additional activities described in this section. Below is a timeline of the activities undertaken in the development of the Sheltered English Immersion test.
Timeline of Development Activities for the Sheltered English Immersion Test

Purpose of Meeting/Activity | Participants (in addition to Evaluation Systems staff) | Date(s)
Planning meeting (at the Department) | Department staff | September 18, 2012
Planning meeting: test design (webinar) | Department staff | October 3, 2012
Planning meeting: test design (webinar) | Department staff | November 2, 2012
Planning meeting: test design (teleconference) | Department staff | November 9, 2012
Objective Review Conference | Bias Review and Content Advisory Committee members | December 11-12, 2012
Content Validation Survey | Job incumbents and Higher Education Institution faculty | April-May 2013
Item Review Conference: first subset of multiple-choice items and all open-response items | Bias Review and Content Advisory Committee members | May 13-15, 2013
Item Review Conference: remaining multiple-choice items | Bias Review and Content Advisory Committee members | October 8-9, 2013
Review of all multiple-choice and open-response items | Representatives from the Department of Justice and the Department of Elementary and Secondary Education | November 5, 2013
First review of newly developed open-response item format in response to DOJ feedback (meeting at DESE) | Representatives from the Department of Justice and the Department of Elementary and Secondary Education | November 13, 2013
Second review of revised open-response item format in response to November feedback (teleconference) | Department staff | December 20, 2013
Item Review Conference to review new open-response item (format and first subset of mentor texts) | Bias Review and Content Advisory Committees | February 18-19, 2014
Review of final open-response item format and associated mentor texts | Representatives from the Department of Justice and the Department of Elementary and Secondary Education | March 26, 2014
Test Release | - | June 9, 2014
Marker Response Selection Meeting | Content Advisory Committee members | June 12-13, 2014
Item Review Conference to review remaining mentor texts | Bias Review and Content Advisory Committees | July 22-23, 2014
Qualifying Score Conference | Qualifying Score Panel | July 24, 2014

Schedule of Pilot Testing for the Sheltered English Immersion Test

Items Included in the Pilot Test | Date(s)
First subset of multiple-choice items and all initial open-response items | October 2013
Remaining multiple-choice items | February 17, 2014
Multiple-choice items (continued from February) | March 1, 2014
Revised open-response item | April 2-14, 2014
Revised open-response item (continued from April) | July 12-18, 2014

Part 1: Test Design

The SEI test consists of 60 multiple-choice items and one open-response assignment consisting of five tasks. The five tasks are each scored separately on a four-point score scale.

Part 2: Establishing Advisory Committees

Eligible nominees for participation on the Content Advisory Committee for the SEI test included those educators who were licensed and practicing in a teaching area or teaching areas associated with the test field, faculty who were teaching undergraduate or graduate arts and sciences courses in which education candidates were enrolled, and faculty who were preparing undergraduate or graduate education candidates in the area(s) associated with the test field.
Eligible nominees also included instructors of the Department's RETELL (Rethinking Equity and Teaching for English Language Learners) SEI Endorsement course. Details about the specific committee make-up for this field are included in the main body of the Technical Report. The Content Advisory Committee for the Sheltered English Immersion test comprised a total of 32 educators, each of whom participated in one or more meetings over the course of development.

Part 3: Test Objective Preparation and Validation

Development of the Test Objectives. The test objectives for the SEI test were developed to be consistent with the Massachusetts Regulations for Educator Licensure and Preparation Program Approval (Regulations). Additional documents referenced in preparing the test objectives included the syllabus of the Massachusetts RETELL (Rethinking Equity and Teaching for English Language Learners) Sheltered English Immersion (SEI) Endorsement Course and the World-Class Instructional Design and Assessment (WIDA) English Language Development Standards.

Prior to the preparation of the test objectives, Evaluation Systems staff met with staff from the Department four times, either at the Department office or via webinar or teleconference, to gather information regarding the specific skills that are considered essential for educators to know and that would be addressed by the SEI test. These meetings were held on September 18, 2012; October 3, 2012; November 2, 2012; and November 9, 2012.

Committee Review of Test Objectives. The draft test objectives that were developed based on the results of the planning meetings with the Department were reviewed by the Bias Review and Content Advisory Committees on December 11, 2012. The procedures followed are described in the main body of the Technical Report. A list of committee members who attended the review meeting is included in Appendix III.

Content Validation Survey. The procedures followed for the Content Validation Survey for the Sheltered English Immersion test are described in the main body of the Technical Report.

Sampling. The sampling procedures followed for the Content Validation Survey for the Sheltered English Immersion test are described in the main body of the Technical Report. In addition, for the English as a Second Language and Sheltered English Immersion tests, Evaluation Systems conducted a supplemental oversample of educators on the variable of race/ethnicity, at twice the rate at which each group appeared in the eligible population.

Part 4: Test Item Preparation, Validation, and Pilot Testing

The preparation and validation of the SEI test items followed the processes and procedures described in the main body of the Technical Report, along with the additional development and review tasks described below.

Committee Review of Test Items. The draft test items that were developed based on the test objectives were reviewed by the Bias Review Committee and Content Advisory Committee members over the course of multiple meetings, on the following dates: May 13-15, 2013, and October 8-9, 2013. The procedures followed are described in the main body of the Technical Report. Lists of committee members who attended each of the review meetings are included in Appendix III.
Department of Justice Review of Test Items. Following the two item review conferences described above, and by request of the Department, representatives (content and legal) of the federal Department of Justice reviewed all items developed, both multiple-choice and open-response. This review occurred on November 5, 2013. As a result of this review and in response to feedback from the Department of Justice, the open-response component of the test was revised to consist of a single open-response item intended to require a more in-depth response than the original set. The new item format was reviewed by the Department of Justice and the Department of Elementary and Secondary Education on November 13, 2013, and, based on feedback from that meeting, was revised and reviewed by the Department of Elementary and Secondary Education on December 20, 2013.

As a result of the introduction of the new open-response item type, and with the approval of the Department of Elementary and Secondary Education, the SEI test design was updated from the initial Department-approved design of 90 multiple-choice items and two open-response assignments to the final design of 60 multiple-choice items and one extended open-response item.

Open-Response Assignment Mentor Texts. The extended open-response assignment on the SEI test includes five parts that allow candidates to demonstrate their knowledge of sheltering content for English Language Learners in Massachusetts public school classrooms. Candidates use one of ten mentor texts as the basis for their response. Each mentor text is an informational text that is representative of the kind of content and academic language features a student might encounter in texts in a given content area. Candidates are instructed to read the test directions, read the assignment, and select a mentor text as the basis for their response. After choosing a mentor text as the focus of their lesson, candidates are expected to show evidence of a detailed, working knowledge of Sheltered English Immersion strategies and of how to employ those strategies to create a well-developed SEI lesson plan.

Following the Department of Elementary and Secondary Education's approval of the new item format and a sample set of mentor texts, based on feedback from the Department of Justice, Evaluation Systems drafted a set of mentor texts to support the new open-response item type for bias and content committee review.

Committee Review of Open-Response Assignment Mentor Texts. The first set of mentor texts drafted for use with the open-response assignment was reviewed by the Bias Review Committee on February 18, 2014, and by the Content Advisory Committee on February 19, 2014. The committees were reconvened to review an additional set of mentor texts on July 22-23, 2014. The procedures followed for the first meeting mirrored those of a typical item review, as described in the main body of the Technical Report. Lists of committee members who attended the review meetings are included in Appendix III.

For the second meeting, the same procedures were followed, but using an electronic process. Each committee member was issued a paper copy of the set of mentor texts to be reviewed; however, the master copy was an active Word document projected on a screen for all committee members to view. Any revisions agreed upon by the committee were made electronically by the facilitator as edits to the Word version as it was being projected and viewed by the committee.
Upon completion of the review, the committee representative signed approval of the revisions, which were then saved as a PDF document.

Department of Justice Review of Open-Response Item Mentor Texts. Following the committee review of the mentor texts, a representative of the Department of Justice reviewed the final open-response item format and the associated set of mentor texts. This review occurred on March 26, 2014.

Pilot Testing of the SEI Items. SEI items were pilot tested through the participation of licensed educators who would be required to obtain the SEI endorsement. The initial pilot testing of SEI test items occurred through open sessions and intact classroom sessions at select institutions throughout Massachusetts in October 2013, for multiple-choice items and for the open-response items originally developed for the test. Pilot testing of the multiple-choice items also occurred at open sessions on February 17, 2014, and March 1, 2014. Pilot testing of the originally developed open-response items occurred through open sessions and intact classroom sessions at select institutions throughout Massachusetts in October 2013 but was discontinued as a result of the Department of Justice review in November 2013, as described above. Pilot testing of the new open-response item format occurred between April 2, 2014, and April 14, 2014, and again between July 12, 2014, and July 18, 2014.

Pilot Test Form Design. To pilot test the multiple-choice items, four pilot test forms consisting of 30 items each were created. In addition, 17 forms, each containing one extended open-response item or two shorter open-response items, were created. Candidates were typically required to complete two pilot test forms (either two multiple-choice forms or two open-response item forms, depending on the pilot test session attended), and it was expected that candidates would be able to complete the pair in 1 ½ to 2 hours.

Following the revision of the open-response assignment, additional piloting of the new item format was required. This pilot focused on the operational implementation of the new item format. A single test form was created, containing the open-response assignment and a sample mentor text. Examinees were expected to take 2 ½ to 3 hours to complete this pilot test. Candidates were assigned to either the 2 ½-hour or 3-hour administration, depending on the date they took the pilot test, in order for Evaluation Systems to determine the appropriate amount of time needed to complete the assignment. Candidates were given the opportunity to provide feedback about the administration mode, the clarity of the directions, and the ease of use and accessibility of the various components of the item.

Pilot Test Administration. The various models used for pilot testing the SEI test items are described in detail in the main body of the Technical Report.

Scoring/Data Analysis. Pilot test responses to the multiple-choice items were scored, and the data were analyzed, as described in the main body of the Technical Report. Responses to the open-response items were reviewed for the following characteristics:

  Candidate ability to complete the assignment
  Testing time required to complete the assignment
  Administration process and computer screen layout
  Application of the proposed score scale to the responses

A review of the times and comments indicated that candidates could produce a response within an estimated 2 ½ hours, leaving sufficient time to respond to the multiple-choice questions.
In addition, candidates were able to move through the various item components with relative ease.

Part 5: Determination of Qualifying Scores

The Qualifying Score Conference for the Sheltered English Immersion test was conducted on July 24, 2014, following the June 9, 2014, release of the test. The qualifying score process for SEI followed the process described in the main body of this Technical Report, with modifications to the description of the hypothetical group of educators. For making their item-based judgments of the multiple-choice items, members of the SEI panel were asked to envision a group of Massachusetts educators who are just at the level of knowledge required for entry-level practice in Massachusetts public schools and to provide an independent rating for each item by answering, in their professional judgment, the following question:

"Imagine a hypothetical group of individuals who are just at the level of subject matter knowledge required for entry-level teaching in this field in Massachusetts public schools. What percent of this group would answer the item correctly?"

For making their item-based judgments of the open-response item, members of the SEI panel were asked to envision an individual Massachusetts educator who is just at the level of knowledge required for entry-level practice in Massachusetts public schools and to answer the question below.

"Imagine a hypothetical individual who is just at the level of subject matter knowledge required for entry-level teaching in this field in Massachusetts public schools. What score represents the level of response that would be achieved by this individual?"

For the open-response assignment, panel members indicated the score point, "2" to "40," that would be achieved, based on the four-point score scale applied to each of the five parts of the assignment by each of two scorers.
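For reference, the upper ends of the judgment scales cited in this supplement (16 points for the Dance and ESL open-response items, 8 points per language-field assignment, and 40 points for the SEI assignment) follow from the number of separately scored parts, the two scorers per response, and the four-point score scale. The short Python sketch below works through that arithmetic; it is an illustration only, and the function name and structure are not drawn from the Technical Report.

MAX_SCALE_POINT = 4  # four-point score scale described in the Technical Report

def max_total_points(num_scored_parts: int, scorers_per_response: int = 2) -> int:
    # Highest total a response could earn: every scorer awards the top scale point
    # to every separately scored part (item, assignment, or task).
    return num_scored_parts * scorers_per_response * MAX_SCALE_POINT

# Dance and ESL: two open-response items, each double-scored.
print(max_total_points(num_scored_parts=2))  # 16

# Language fields: judgments were made per assignment (one scored part each).
print(max_total_points(num_scored_parts=1))  # 8

# SEI: one extended assignment with five separately scored tasks.
print(max_total_points(num_scored_parts=5))  # 40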