Republic of the Philippines
BATANGAS STATE UNIVERSITY ARASOF-Nasugbu
Nasugbu, Batangas
College of Teacher Education
Bachelor of Secondary Education

MODULE in Ed 104
ASSESSMENT IN LEARNING 1

Page 1 of 56

TABLE OF CONTENTS

Overview
Learning Outcomes
Study Skills
Assessment Task
Materials

Chapter 1. Nature of Educational Assessment
   Outcomes-Based Education
   Basic Concepts in Assessment
   Purposes of Classroom Assessment
   Assessment Principles
   Characteristics of Modern Educational Assessment
Chapter 2. Instructional Objectives and Learning Outcomes
   Aims, Goals and Objectives
   Instructional Objectives
   Learning Outcomes
   Taxonomy of Instructional Objectives
Chapter 3. Approaches and Trends in Educational Assessment
   Two Basic Approaches to Educational Assessment
   Recent Trends in Educational Assessment
   Assessment of Learning Outcomes in the K to 12 Program
   Policy Guidelines on Classroom Assessment in the K to 12 Program
Chapter 4. Preparing for Assessment
   General Principles of Testing
   Quality of Assessment Tools
   Steps in Developing Assessment Tools
   Table of Specifications
Chapter 5. Development of Classroom Assessment Tools
   Different Formats of Classroom Assessment Tools
   Advantages and Disadvantages of Different Test Formats
   Guidelines in Constructing Different Test Formats
Chapter 6. Administering, Analyzing and Improving Tests
   Packaging and Reproducing Test Items
   Administering the Examination
   Item Analysis
   Improving Test Items
Chapter 7. Characteristics of a Good Test
   Validity
   Reliability
Chapter 8. Measures of Central Tendency
   Preparing a Frequency Distribution Table (FDT)
   Mean, Median and Mode
   Measures of Central Tendency in Different Distributions
   Measures of Location
Chapter 9. Measures of Variability
   Range, Variance and Standard Deviation
   Measures of Variability in Different Distributions
Chapter 10. Describing Individual Performance
   Z-score
   T-score
   Stanine
   Percentile Rank
Chapter 11.
Marks and Learning Outcomes
   Purposes of Grades and Strategies in Grading
   Types of Grading System

Introduction

Assessment is an essential component of the teaching and learning process. Classroom teachers employ informal and formal assessments on an ongoing basis to make decisions about their students, evaluate the success of their instruction, and monitor classroom climate. The role of assessment in the instructional process and in student learning makes it necessary for preservice teachers to gain assessment skills and competencies that will help them become competent professional teachers.

Intended Learning Outcomes
1. Demonstrate an understanding of the basic concepts and the current principles that guide assessment in learning.
2. Identify the relationship of assessment to curriculum and instruction.
3. Demonstrate an understanding of the major principles and guidelines involved in test construction and development and their application in the grading system.
4. Expand their knowledge and skills in interpreting and using assessment data to improve learning in the classroom.
5. Apply strategies to construct valid and reliable test items.
6. Develop the ability to select and create tasks and tools for fair and effective assessment.
7. Describe the processes involved in the planning, preparation and administration of assessment tools.

Duration: 3 hours/week

I. Nature of Educational Assessment

Chapter 1: Instructional Objectives
1.1. Define basic terms in assessment.
1.2. Discriminate the different purposes and types of assessment.
1.3. Distinguish the assessment principles.
1.4. Describe the characteristics of modern educational assessment.

What is assessment?

Assessment is a system and process of collecting evidence about student learning. Assessment is a lot like research because it involves observing, recording, scoring and interpreting the information we collect. It is the process of systematically gathering information as part of an evaluation.
In education, the term assessment refers to the wide variety of methods or tools that educators use to evaluate, measure, and document the academic readiness, learning progress, skill acquisition, or educational needs of students.

A good system of assessment provides:
- feedback to students about their learning
- feedback to teachers about their instruction
- evidence to support teachers' judgments about grading

Outcomes-Based Education

Outcomes-based education "…starts with a clear specification of what students are to know, what they are to be able to do, and what attitudes or values they should be able to demonstrate at the end of the program" (Killen, 2005, p. 77). OBE is an educational theory that bases each part of an educational system around goals (outcomes). By the end of the educational experience, each student should have achieved the goal. There is no single specified style of teaching or assessment in OBE; instead, classes, opportunities, and assessments should all help students achieve the specified outcomes.

Framework of Outcomes-Based Education

Activity 1: Discuss the framework of Outcomes-Based Education.
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________

Why shift to OBE?
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________

How to adopt OBE?
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________

Four Principles of OBE

1. Clarity of Focus - This means that everything teachers do must be clearly focused on what they want students to know, understand and be able to do. Teachers should focus on helping students develop the knowledge, skills and personalities that will enable them to achieve the intended outcomes that have been clearly articulated.
2. Designing Down - The curriculum design must start with a clear definition of the intended outcomes that students are to achieve by the end of the program. Once this has been done, all instructional decisions are made to ensure that this desired end result is achieved.
3. High Expectations - Teachers should establish high, challenging standards of performance in order to encourage students to engage deeply in what they are learning. Helping students achieve high standards is linked very closely with the idea that successful learning promotes more successful learning.
4. Expanded Opportunities - Teachers must strive to provide expanded opportunities for all students. This principle is based on the idea that not all learners can learn the same thing in the same way and at the same time. However, most students can achieve high standards if they are given appropriate opportunities.

Activity 2: Discuss the four principles of OBE as applied to classroom teaching and learning.
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________

What is the purpose of assessment?

• The purpose of assessment is to determine whether expectations match the standards set by school authorities.
• The primary purpose of assessment is to improve students' learning and teachers' teaching as both respond to the information it provides.

1. It can be used to assist improvements across the education system in a cycle of continuous improvement.
2. Students and teachers can use the information gained from assessment to determine their next teaching and learning steps.
3. Parents, families and friends can be kept informed of the next plans for teaching and learning and the progress being made, so they can play an active role in their children's learning.
4. School leaders can use the information for school-wide planning, to support their teachers and to determine professional development needs.
5. Communities and stakeholders can use assessment information to assist their governance role and their decisions about staffing and resourcing.
6. School administrators can use assessment information to inform their advice for school improvement.

Basic Concepts in Assessment:
1. Test – an instrument designed to measure any characteristic, quality, ability, knowledge or skill.
2. Measurement – a process of quantifying the degree to which someone/something possesses a given trait using an appropriate measuring instrument.
3. Assessment – a process of gathering and organizing quantitative or qualitative data into an interpretable form to provide a basis for judgment or decision-making.

   Types of Assessment:
   a. Traditional Assessment – paper-and-pencil test
   b. Performance Task Assessment
   c. Portfolio Assessment

4.
Evaluation – a process of systematic interpretation, analysis, appraisal or judgment of the worth of organized data as a basis for decision-making.

Purposes of classroom assessment:

1. ASSESSMENT FOR LEARNING – pertains to placement, diagnostic and formative assessment tasks, which are used to determine learning needs, monitor the academic progress of students during a unit of instruction, and guide instruction.
2. ASSESSMENT AS LEARNING – employs tasks or activities that provide students with an opportunity to monitor and improve their own learning, including their study and learning habits. It involves metacognitive processes such as reflection and self-regulation that allow students to utilize their strengths and work on their weaknesses. It is also formative and can be given at any phase of the learning process.
3. ASSESSMENT OF LEARNING – is used to certify what students know and can do and to determine the level of their proficiency and competency. It is summative and done at the end of a unit, task, process or period.

Assessment Principles
1. Address learning targets/curricular goals
2. Provide efficient feedback on instruction
3. Use a variety of assessment procedures
4. Ensure that assessment is valid, reliable, and fair
5. Keep records of assessment
6. Interpret and communicate results of assessment meaningfully

Characteristics of a modern educational assessment
1. Objectives-based and criterion-referenced
2. Reliable
3. Multidimensional in structure
4. Value-laden

Activity 3:
1. Give three assessment principles and discuss their importance to teachers and students.
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________

2. Discuss the different characteristics of a modern educational assessment.
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________
_________________________________________________________________________

II. Instructional Objectives and Learning Outcomes

Chapter 2: Instructional Objectives
2.1. Define aims, goals, instructional objectives and learning outcomes.
2.2. Compare and contrast the taxonomies of instructional objectives.
2.3. Identify the parts of instructional objectives.

Aims, goals, instructional objectives and learning outcomes are necessary in planning for the assessment of student learning. According to Ornstein and Hunkins (1988), aims are orientations that suggest endpoints. They are intentions or aspirations; what you hope to achieve. They are not specific, quantifiable outcomes and are written in broad terms. Educational aims must address the cognitive, psychomotor and affective domains.

The 1987 Philippine Constitution, Article XIV, Sec. 3 (2) spells out the aims of education:
1. inculcate patriotism and nationalism
2. foster love of humanity
3. respect for human rights
4. appreciation of the role of national heroes in the historical development of the country
5.
teach the rights and duties of citizenship
6. strengthen ethical and spiritual values
7. develop moral character and personal discipline
8. encourage critical and creative thinking
9. broaden scientific and technological knowledge
10. promote vocational efficiency

Goals:
- derived from the aims of education
- broad statements that provide guidelines on what to accomplish as a result of a prescribed educational program

Objectives:
- more specific than goals
- they describe learning outcomes
- behaviors that must be achieved at various levels of the curriculum; these levels include lesson, subject, unit and program

Types of Objectives:
1. General Objectives
2. Specific Objectives

Instructional Objectives:

An instructional objective is a statement that describes what the learner will be able to do after completing the instruction. Instructional objectives are specific, measurable, attainable, result-oriented and time-bound (SMART). They indicate the desirable knowledge, skills, or attitudes to be gained. They contain four parts: behavior, content, conditions, and criteria.

Instructional objectives consist of two essential components: behavior and content.
- The behavior component tells what a learner is expected to perform (expressed in verb form).
- The content component specifies the topic or subject matter a student is expected to learn (expressed as a noun phrase).

An instructional objective also contains two optional components: condition and criterion level.
- The condition is the situation in which learning will take place. It may be materials, tools, places or other resources which can facilitate the learning process.
- The criterion level refers to the acceptable level of performance (standard). It tells how well a particular behavior is to be done. It could be stated as a percentage, a number of items answered correctly, completion of a task within a prescribed time limit, or completion of a task to a certain extent or degree of frequency.
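A criterion level stated as a percentage reduces to simple arithmetic: divide the number of correct responses by the total number of items and compare the result with the stated standard. The following is a minimal sketch, not part of the module; the function name and numbers are hypothetical.

```python
# Hypothetical sketch: checking a criterion level stated as a percentage.
# An objective such as "Given a world map, locate ten Asian countries with
# 90% accuracy" sets total_items = 10 and criterion_percent = 90.

def meets_criterion(correct, total_items, criterion_percent):
    """Return True when the raw score satisfies the stated criterion level."""
    score_percent = correct / total_items * 100
    return score_percent >= criterion_percent

print(meets_criterion(9, 10, 90))   # 9 of 10 correct: 90% >= 90%, prints True
print(meets_criterion(8, 10, 90))   # 8 of 10 correct: 80% < 90%, prints False
```

The same check works for criteria stated as a number of items answered correctly; only the comparison value changes.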
Example:
- Solve a system of linear equations.
- Identify the parts of a sentence.
- Draw the different parts of the human body.
- Describe the function of the digestive system.
- Given a world map, locate ten Asian countries with 90% accuracy.
- Using a scientific calculator, solve for the sine and cosine functions with 100% accuracy.
- Based on the story, briefly discuss its summary with 99% accuracy.

Characteristics of a Well-Written Objective

A well-written objective should meet the following criteria:
1) describe a learning outcome
2) be student-oriented
3) be observable (or describe an observable product)

Observable: list, select, compute, analyze, draw, predict, recite
Non-observable: know, understand, value, realize, appreciate

If an instructional objective is not observable, it leads to unclear expectations and it will be difficult to determine whether or not it has been reached. The key to writing observable objectives is to use verbs that are observable and lead to a well-defined product of the action implied by that verb. Verbs such as "to know," "to understand," "to enjoy," "to appreciate," "to realize," and "to value" are vague and not observable. Verbs such as "to identify," "to list," "to select," "to compute," "to predict," and "to analyze" are explicit and describe observable actions or actions that lead to observable products.

Well-written learning objectives include these four elements (summary):
- Condition – the condition under which the student will perform the described behavior
- Behavior – a description of a specific, observable behavior
- Degree – the desired level or degree of acceptable performance
- Content – the topics covered

(https://education.stateuniversity.com/pages/2098/Instructional-Objectives.html#ixzz6VQHvmNOV)

Kinds of Instructional Objectives

Instructional objectives are often classified according to the kind or level of learning that is required in order to reach them.
There are numerous taxonomies of instructional objectives; the most common taxonomy was developed by Benjamin Bloom and his colleagues.

The first level of the taxonomy divides objectives into three categories: cognitive, affective, and psychomotor. Simply put, cognitive objectives focus on the mind; affective objectives focus on emotions or affect; and psychomotor objectives focus on the body.

Cognitive objectives call for outcomes of mental activity such as memorizing, reading, problem solving, analyzing, synthesizing, and drawing conclusions. Bloom and others further categorize cognitive objectives into various levels, from the simplest cognitive tasks to the most complex. These categories can be helpful when trying to order objectives so they are sequentially appropriate. This helps to ensure that prerequisite outcomes are accomplished first.

Affective objectives focus on emotions. Whenever a person seeks to learn to react in an appropriate way emotionally, there is some thinking going on. What distinguishes affective objectives from cognitive objectives is that the goal of affective objectives is some kind of affective behavior or the product of an affect (e.g., an attitude). The goal of cognitive objectives, on the other hand, is some kind of cognitive response or the product of a cognitive response (e.g., a problem solved).

Psychomotor objectives focus on the body, and the goal of these objectives is the control or manipulation of the muscular-skeletal system or some part of it (e.g., dancing, writing, tumbling, passing a ball, and drawing). All skills requiring fine or gross motor coordination fall into the psychomotor category. Learning a motor skill requires some cognition; however, the ultimate goal is not the cognitive aspects of the skill, such as memorizing the steps to take, but the control of muscles or muscle groups.
(https://education.stateuniversity.com/pages/2098/Instructional-Objectives.html#ixzz6VQJ5PGaY)

TAXONOMY OF INSTRUCTIONAL OBJECTIVES

Table 1: Bloom's Taxonomy of Educational Objectives for Knowledge-Based Goals

Level of Expertise | Description of Level | Example of Measurable Student Outcome
1. Knowledge | Recall or recognition of terms, ideas, procedures, theories, etc. | When is the first day of Spring?
2. Comprehension | Translate, interpret, extrapolate, but not see full implications or transfer to other situations; closer to literal translation. | What does the summer solstice represent?
3. Application | Apply abstractions, general principles, or methods to specific concrete situations. | What would Earth's seasons be like in specific regions with a different axis tilt?
4. Analysis | Separation of a complex idea into its constituent parts and an understanding of the organization and relationship between the parts. Includes realizing the distinction between hypothesis and fact as well as between relevant and extraneous variables. | Why are seasons reversed in the southern hemisphere?
5. Synthesis | Creative, mental construction of ideas and concepts from multiple sources to form complex ideas into a new, integrated, and meaningful pattern subject to given constraints. | If the longest day of the year is in June, why is the northern hemisphere hottest in August?
6. Evaluation | To make a judgment of ideas or methods using external evidence or self-selected criteria substantiated by observations or informed rationalizations. | What would be the important variables for predicting seasons on a newly discovered planet?

Examples of learning outcomes:
- Students will be able to communicate both orally and in writing about music of all genres and styles in a clear and articulate manner (comprehension).
- Students will be able to analyze and interpret texts within a written context (analysis).
- Students will be able to demonstrate an understanding of core knowledge in biochemistry and molecular biology (application).
- Students will be able to judge the reasonableness of obtained solutions (evaluation).
- Students will be able to evaluate theory and critique research within the discipline (evaluation).
- Students will be able to work in groups and be part of an effective team (synthesis).

Revised Bloom's Taxonomy
1. Remembering: Recall information and exhibit memory of previously learned material, information or knowledge (facts, terms, basic concepts or answers to questions).
2. Understanding: Demonstrate understanding of facts and ideas by organizing, comparing, translating, interpreting, giving descriptions and stating the main ideas.
3. Applying: Use information in new or familiar situations to resolve problems by using the acquired facts, knowledge, rules and techniques.
4. Analyzing: Examine and break information into parts by identifying causes or motives; make inferences and find evidence to support generalizations.
5. Evaluating: Express and defend opinions through judgments about information, the authenticity of ideas or the quality of work, according to certain criteria.
6. Creating: Organize, integrate and utilize concepts into a plan, product or proposal that is new; compile information together in a different way.

Categories/Levels of the Cognitive Domain
1. Remembering – define, describe, identify, label, list, match, name, outline, recall, recognize, reproduce, select, state
2. Understanding – distinguish, estimate, explain, give examples, interpret, paraphrase, summarize
3. Applying – apply, change, compute, construct, demonstrate, discover, modify, prepare, produce, show, solve, use
4. Analyzing – analyze, compare, contrast, diagram, differentiate, distinguish, illustrate, outline, select
5. Evaluating – compare, conclude, criticize, critique, defend, evaluate, justify, relate, support
6.
Creating – categorize, combine, compile, compose, devise, design, generate, modify, organize, plan, revise, rearrange

Categories/Levels of the Psychomotor Domain
1. Observing – watch, detect, distinguish, differentiate, describe, relate, select
2. Imitating – begin, explain, move, display, proceed, react, show, state, volunteer
3. Practicing – bend, calibrate, construct, differentiate, dismantle, display, fasten, fix, grasp, grind, handle, measure, mix, operate, manipulate
4. Adapting – organize, relax, shorten, sketch, write, rearrange, compose, create, design, originate

Categories/Levels of the Affective Domain
1. Receiving – select, point to, sit, choose, describe, follow, hold, identify, name, reply
2. Responding – answer, assist, comply, conform, discuss, greet, help, perform, practice, read, recite, report, tell, write
3. Valuing – complete, demonstrate, differentiate, explain, follow, invite, join, justify, propose, report, share, study, perform
4. Organizing – arrange, combine, complete, adhere, alter, defend, explain, formulate, integrate, organize, relate, synthesize
5. Internalizing – act, display, influence, discriminate, modify, perform, revise, solve, verify

Activity:

A. Determine which domain and level of learning are targeted by the following learning competencies:
1. Identify the different parts of a microscope and their functions.
2. Employ analytical listening to make predictions.
3. Recognize the benefit of patterns in special products and factoring.
4. Follow written and verbal directions.
5. Compose musical pieces using a particular style of the 20th century.
6. Describe movement skills in response to sound.
7. Prove statements on triangle congruence.
8. Design an individualized exercise program to achieve personal fitness.
9. List the major parts of a computer.
10. Differentiate myth from legend.

B. List 10 observable and non-observable behaviors; one example is done as a guide.
Observable | Non-Observable
Draw | Understand

**Learning outcomes are the end results of instructional objectives. NOT all action verbs specify learning outcomes; sometimes they specify learning activities (means to an end).

C. List down 8 examples of learning outcomes and learning activities; two examples are done as a guide.

Learning Outcomes (end) | Learning Activities (means)
Recited the poem "A Tree". | Practiced reciting the poem "A Tree".
Proved trigonometric identities. | Memorized the different trigonometric identities.

III. Approaches and Trends in Educational Assessment

Chapter 3: Instructional Objectives
3.1. Compare norm-referenced measurement and criterion-referenced measurement.
3.2. Discuss recent trends in educational measurement.
3.3. Differentiate the assessment of learning in the K to 12 program.

TWO BASIC APPROACHES TO EDUCATIONAL ASSESSMENT

1. Criterion-referenced measurement – determines a student's status on a clearly defined set of related tasks (called a domain). It describes what learning tasks an individual can and cannot do.
Example:
- A student can assemble the parts of a microscope.
- Manuel was able to get the correct solution to the problem.

2. Norm-referenced measurement – determines a student's status compared with that of others on a given task. It provides a student's relative standing among other students.
Example:
- Mitch is the highest in a mathematics test in a class of 50.
- Samuel got the highest score in Chemistry among your classmates.
- Only group 1 was able to perform the experiment accurately.

Characteristics of a modern educational assessment
1. Objectives-based and criterion-referenced
2. Reliable
3. Multidimensional in structure
4. Value-laden

Recent Trends in Educational Assessment

1. Congruence of assessment and instructional objectives. This occurs when both the behavior and the content in the objective and in the test item are similar.
2. Shift from discrete-point tests to integrative assessment.
Modern assessment does not focus on specific points of the cognitive domain only but also on the other two domains. Alternative assessment procedures should be utilized.
3. Shift from paper-and-pencil to authentic assessment. Authentic skills assessment considers the individual pacing and growth of each student. It requires the utilization of different types of measuring instruments such as checklists, interview guides, diaries, journals and simulation games.
4. Focus on group assessment rather than on individual assessment. When students work in small groups, opportunities to develop communication skills, leadership, followership and decision-making skills are enhanced. When students learn as a group, cooperation and team building are fostered.
5. Use of portfolios. A portfolio is a collection of a student's learning experiences assembled over time. Its content may show only the best work, evidence of individual work, or evidence of group work.

Types of Portfolio:
1. Working Portfolio – consists of a collection of the day-to-day work of the student.
2. Documentary Portfolio – a collection of the best work of students assembled for assessment purposes; it showcases the final products of student work.
3. Show Portfolio – a purposeful collection of a limited amount of a student's work, usually finished products, to display the best he/she has accomplished in a given period.

Assessment of Learning Outcomes in the K to 12 Program

The K to 12 Program covers Kindergarten and 12 years of basic education (six years of primary education, four years of Junior High School, and two years of Senior High School [SHS]) to provide sufficient time for mastery of concepts and skills, develop lifelong learners, and prepare graduates for tertiary education. Assessment of learning outcomes in the K to 12 program is stated in DepEd Order No. 31, s. 2012.
It is standards-based, as it seeks to ensure that teachers will teach according to the standards and students will aim to meet or even exceed the standards. The Philippine educational landscape continues to evolve in line with the K to 12 curriculum, in light of DepEd's Policy Guidelines on Classroom Assessment for the K to 12 Basic Education Program (DepEd Order No. 8, s. 2015).

The guidelines state that classroom assessment:
1. recognizes Vygotsky's Zone of Proximal Development;
2. describes formative and summative assessment as the types of classroom assessment used to know what learners know and can do;
3. explains the learning standards and Cognitive Process Dimensions to be used in the classroom; and
4. discusses the processes and measures for assessing, including the new grading system.

CLASSROOM ASSESSMENT

In the policy, classroom assessment refers to an ongoing process of identifying, gathering, organizing, and interpreting qualitative and quantitative information about what learners know and can do. It also stresses the importance of teachers informing learners about lesson objectives to encourage the latter to meet or even exceed the standards.

FORMATIVE – can be given anytime during the teaching and learning process; intended to help students identify strengths and weaknesses from their assessment experience.

SUMMATIVE – usually occurs at the end of a unit or period of learning to describe the standard reached by the learner; results are recorded and used to report on the learner's achievement.

HOW ARE LEARNERS ASSESSED?

Three components of summative assessment by which students will be graded:
1. Written Work – includes quizzes and unit/long tests that strengthen learners' test-taking skills.
2. Performance Task – includes skills demonstrations, group and other oral presentations, multimedia presentations, and research projects that allow learners to demonstrate in diverse ways what they know and are able to do.
3.
Quarterly Assessment – measures student learning after each quarter; these come in the form of objective tests, performance-based tests, or a combination of the two.

LEARNING STANDARDS (Knowledge, Process, Understanding, Performance)

What is assessed in the classroom? Classroom assessment is aimed at helping students perform well according to the following learning standards:
- Content Standards – identify the essential knowledge and understanding that learners should learn: "What should the students know?"
- Performance Standards – describe the abilities and skills that learners are expected to demonstrate in relation to the content standards and the integration of 21st-century skills.
- Learning Competencies – refer to the knowledge, understanding, skills, and attitudes that students need to demonstrate in every lesson and/or activity.

Activity: Supply the correct answer on the blank after each question.
1. It determines a student's status in a clearly defined set of related tasks. ___________
2. It occurs when both the behavior and the content in the objective and in the test item are similar. ___________
3. It determines a student's status compared with that of others on a given task. ___________
4. It focuses only on specific points of the cognitive domain. ___________
5. It is a collection of a student's learning experiences assembled over time. Its content may show only the best work, evidence of individual work, or evidence of group work. ___________
6. It consists of a collection of the day-to-day work of the student. ___________
7. It requires the utilization of different types of measuring instruments such as checklists, interview guides, diaries, journals and simulation games. ___________
8. It is a purposeful collection of a limited amount of a student's work, usually finished products, to display the best he/she has accomplished in a given period. ___________
9. It refers to the cognitive operations that the student performs on facts and information for the purpose of constructing meanings and understanding. ___________
10.
It refers to the knowledge, understanding, skills, and attitudes that students need to demonstrate in every lesson and/or activity. ____________

Discuss briefly the characteristics of a modern educational assessment.
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________

IV. Preparing for Assessment
Chapter 4: Instructional Objectives
4.1. Discuss the different principles of testing.
4.2. Identify the different qualities of assessment tools.
4.3. Identify the different steps in developing test items.
4.4. Construct a table of specifications.

General principles of testing
1. Measure all instructional objectives. When a teacher constructs test items to measure the learning progress of the students, the items should match all the learning objectives posed during instruction.
2. Cover all the learning tasks. The teacher should construct a test that contains a wide range of sample items. In this way, he/she can determine the educational outcomes or abilities, and the resulting scores will be representative of the total performance in the areas measured.
3. Use appropriate test items. The test items constructed must be appropriate for measuring the intended learning outcomes.
4. Make the test valid and reliable. The test must be valid so that it measures what it is supposed to measure. The test is reliable when the scores of the students remain the same or consistent when the teacher gives the same test a second time.
5. Use to improve learning.
Test scores should be utilized by the teacher properly to improve learning, by discussing the skills or competencies in the items that have not been learned or mastered by the learners.

Qualities of assessment tools
1. Clarity of the learning target
2. Appropriateness of assessment tools
3. Balance
4. Validity
5. Reliability
6. Fairness
7. Practicality and efficiency
8. Morality in assessment

Steps in developing assessment tools
1. Examine the instructional objectives of the topics previously discussed.
2. Make a table of specifications (TOS).
3. Construct the test items.
4. Assemble the items.
5. Check the assembled items.
6. Write directions.
7. Make the answer key.
8. Analyze and improve the items.

Table of Specifications
A Table of Specifications allows the teacher to construct a test that focuses on the key areas and weights those areas based on their importance. It provides the teacher with evidence that a test has content validity, that is, that it covers what should be covered.

A Table of Specifications benefits students in two ways. First, it improves the validity of teacher-made tests. Second, it can improve student learning as well, because it helps to ensure that there is a match between what is taught and what is tested. Classroom assessment should be driven by classroom teaching, which itself is driven by course goals and objectives. The TOS also provides the link between teaching and testing.

A. One-Way Grid Table of Specifications – components:
1. Learning outcomes/instructional objectives
2. Number of recitation days
3. Number of items for each objective
4. Percentage of items
5. Item placement

B. Two-Way Grid Table of Specifications – components:
1. Instructional objectives
2. Classification according to Bloom's Taxonomy
3. Percentage of items per classification

How do you create a table of specifications?

Step 1: Determine the coverage of your exam.
The first rule in making exams, and therefore in making a table of specifications, is to make sure the coverage of your exam is something that you have satisfactorily taught in class. Select the topics that you wish to test in the exam. It may not be possible to cover all of these topics, as that might create a test that is too long and unrealistic for your students in the given time. So select only the most important topics.

Step 2: Determine your testing objectives for each topic area.
For each content area that you wish to test, you will have to determine how you will test it. Your objectives per topic area should use very specific verbs describing how you intend to test the students, based on Bloom's taxonomy. It is important that your table of specifications reflects your instructional procedures during the semester. If your coverage of a topic mostly dwelt on knowledge and comprehension of the material, then you cannot test the students by going up the hierarchy of Bloom's taxonomy. Thus it is crucial that you give a balanced set of objectives throughout the semester, depending on the nature of your students.

Step 3: Determine the duration for each content area.
The next step in making the table of specifications is to write down how long you spent teaching a particular topic. This is important because it will determine how many points you should devote to each topic. Logically, the more time you spent teaching a topic, the more questions should be devoted to that area.

Step 4: Determine the test types for each objective.
It is time to determine the test types that will accomplish your testing objectives. The important thing is that the test type should reflect your testing objective.

Step 5: Polish your table of specifications.
After your initial draft of the table of specifications, it is time to polish it. Make sure that your table of specifications covers the important topics that you wish to test.
The number of items in your test should be sufficient for the time allotted for the test. You should also approach your academic coordinator and have him/her comment on your table of specifications; he/she will be able to give good feedback on how you can improve or modify it.

[Sample Table of Specifications]

Activity: Discuss the following:
1. How can the use of a Table of Specifications benefit your students, including those with special needs?
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________
2. Discuss the general principles of testing and their importance to teaching and learning.
______________________________________________________________________________
______________________________________________________________________________
______________________________________________________________________________
3. Make/construct a table of specifications using a topic already discussed in your area of specialization.

V. Development of Classroom Assessment Tools
Chapter 5: Instructional Objectives
5.1. Discuss the different formats of assessment tools.
5.2. Determine the advantages and disadvantages of the different test item formats.
5.3. Identify the different rules in constructing multiple-choice tests, matching type tests, completion tests, true or false tests and essay tests.
5.4.
Construct multiple-choice tests, matching type tests, completion tests, true or false tests, essay tests and problem-solving tests.

Development of classroom assessment
Different Formats of Classroom Assessment Tools

A. Objective Test – requires one and only one correct answer, with no other possible answer. It requires students to select the correct response from several alternatives, or to supply a word or short phrase to answer a question or complete a statement. There are two main types of objective tests.

1. Supply/Recall Type
1.1. Simple recall type – requires the examinee to recall previously learned lessons; the answers are usually short, consisting of either a word or a phrase.
1.2. Completion test – consists of a series of items that require the examinee to fill the blank with the missing word or phrase to complete a statement (fill in the blank).

2. Selection/Recognition Type
2.1. Alternative response – consists of a series of items that admit only one correct response for each item, chosen from two or three constant options (true-false, yes-no, modified true-false type).
2.2. Multiple-choice test – made up of items consisting of three or more plausible options for each item. The choices are multiple so that the examinee may choose only the one correct or best option for each item.
2.3. Matching type – consists of two columns in which the proper pairing relationship of two things is directly observed.
2.4. Rearrangement type – consists of multiple-option items that require a chronological, logical or rank order.
2.5. Analogy – made up of items consisting of a pair of words related to each other. It is designed to measure the ability of the examinee to observe the relationship of the first word to the second word.
2.6. Identification type – requires the examinee to identify what is being defined in a statement or sentence; there are no options to choose from.

B.
Subjective Test – commonly known as the essay test. It is an assessment tool consisting of specific questions or problems to which the examinee responds in one or more sentences. It allows the student to organize and present an original answer.
1. Restricted response – presents some limitations, such as the required number of words/sentences or the cue words to be found in the student's answer.
2. Extended response – allows the students to write freely, for it does not limit them in expressing their thoughts and opinions.

For some instructional purposes, one or the other item type may prove more efficient and appropriate.

When to Use Essay or Objective Tests

Essay tests are especially appropriate when:
- the group to be tested is small and the test is not to be reused.
- you wish to encourage and reward the development of student skill in writing.
- you are more interested in exploring the student's attitudes than in measuring his/her achievement.
- you are more confident of your ability as a critical and fair reader than as an imaginative writer of good objective test items.

Objective tests are especially appropriate when:
- the group to be tested is large and the test may be reused.
- highly reliable test scores must be obtained as efficiently as possible.
- impartiality of evaluation, absolute fairness, and freedom from possible test scoring influences (e.g., fatigue, lack of anonymity) are essential.
- you are more confident of your ability to express objective test items clearly than of your ability to judge essay test answers correctly.
- there is more pressure for speedy reporting of scores than for speedy test preparation.

Either essay or objective tests can be used to:
- measure almost any important educational achievement a written test can measure.
- test understanding and ability to apply principles.
- test ability to think critically.
- test ability to solve problems.
- test ability to select relevant facts and principles and to integrate them toward the solution of complex problems.

ADVANTAGES AND DISADVANTAGES OF DIFFERENT TEST FORMATS

Advantages of the Objective Type of Test
1. Easy to correct or score
2. Eliminates subjectivity
3. Adequate sampling
4. Objectivity in scoring
5. Saves time and energy in answering questions

Disadvantages of the Objective Type of Test
1. Difficult to construct
2. Encourages cheating and guessing
3. Expensive
4. Encourages rote memorization
5. Time-consuming on the part of the teacher

Advantages of the Subjective/Essay Test
1. Easy to construct
2. Economical
3. Saves time and energy for the teacher
4. Trains students in organizing ideas
5. Minimizes guessing
6. Develops critical thinking
7. Minimizes rote memorization
8. Minimizes cheating
9. Develops good study habits
10. Develops students' ability to express their own ideas

Disadvantages of the Subjective/Essay Test
1. Low validity
2. Low reliability
3. Low practicality
4. Encourages bluffing
5. Difficult to correct or score
6. Disadvantageous to students with poor penmanship

GUIDELINES IN CONSTRUCTING DIFFERENT FORMATS

Multiple choice – it has three parts: the stem, the keyed option and the incorrect options or alternatives. The stem presents the problem or question, usually expressed in completion form or question form. The keyed option is the correct answer. The incorrect options or alternatives are also called distracters.

Guidelines in Constructing Multiple-Choice Test Items
1. Make test items that are practical or have real-world applications for the students.
2. Use a diagram or drawing when asking questions about application, analysis or evaluation.
3. When asking students to interpret or evaluate quotations, present actual quotations from secondary sources.
4. Use tables, figures, or charts when asking questions that require interpretation.
5. Use pictures, if possible, when students are required to apply concepts and principles.
6.
List the choices/options vertically, NOT horizontally.
7. Avoid trivial/unimportant questions.
8. Use only one correct answer or best answer format.
9. Use 3–5 options to discourage guessing.
10. Be sure that distracters are plausible and attractive.

Writing a good stem:
1. Present one clearly stated problem in the stem.
2. State the stem in simple, clear language.
3. Make the stem longer than the options.
4. Stress any negative word used in the stem.
5. Avoid grammatical clues to the correct answer.

Writing plausible options:
1. Make sure there is only one correct answer.
2. Make the options homogeneous.
3. Make the options grammatically consistent and parallel in form with the stem of the item.
4. Vary the length of the key to avoid giving a clue.
5. Place the correct answer in a random position.
6. Avoid using "all of the above" or "none of the above" as an alternative.
7. Avoid using two options that have similar meanings.

Matching type – a matching item consists of a series of stimuli (questions or stems) called premises and a series of options called responses, arranged in columns.

Guidelines in Constructing Matching Items:
1. Include only materials that belong to the same category.
2. Keep the premises short, place them on the left and designate them by numbers. Put the responses on the right and assign letters to them. The numbers should be arranged consecutively, while the letters are in alphabetical order.
3. Use more responses than premises, and allow the responses to be used more than once.
4. Place all the matching items on one page.
5. State the basis for matching in the directions.

True or False – requires students to identify statements as correct or incorrect. Only two responses are possible in this item format.

Guidelines in Constructing True-False Items:
1. Each statement should include only one idea.
2. Each statement should be short and simple.
3. Qualifiers such as "few", "many", "seldom" and so on should be avoided.
4.
Negative statements should be used sparingly.
5. Double negatives should be avoided.
6. Statements of opinion should be attributed to some important person or organization.
7. The number of true and false statements should be equal, if possible.

Completion Type/Fill in the Blank
Guidelines for Writing Completion Items
1. State the items clearly and precisely so that only one correct answer is acceptable.
2. Use an incomplete statement to achieve preciseness and conciseness.
3. Leave the blank at the end of the statement.
4. Focus on one important idea instead of trivial detail, and leave only one blank.
5. Avoid giving clues to the correct answer.

Essay Test
Guidelines for Constructing/Administering Essay Questions
1. State questions that require a clear, specific and narrow task to be performed.
2. Give enough time for answering each essay question.
3. Require students to answer all questions.
4. Make it clear to students whether spelling, punctuation, content, clarity, and style are to be considered in scoring the essay questions, to make the items valid.
5. Grade each essay question by the point method, using well-defined criteria.
6. Evaluate all of the students' responses to one question before going to the next question.
7. Evaluate answers to essay questions without identifying the student.
8. If possible, two or more correctors must check the essays to ensure reliable results.

TRUE OR FALSE
1. Essay tests develop good study habits.
2. The objective type of test is difficult to prepare.
3. The items in a matching type of test are heterogeneous.
4. The essay type of test is easy to score.
5. Objective tests are commonly used in PRC examinations.

MIDTERM EXAMINATION

VI. Administering, Analyzing and Improving Tests
Chapter 6 – Instructional Objectives
6.1. Elaborate the guidelines before, during and after administering an examination.
6.2. Define the basic concepts regarding item analysis.
6.3. Differentiate the two types of item analysis.
6.4.
Perform item analysis properly and correctly.
6.5. Interpret the results of item analysis.

Packaging and Reproducing Test Items
Before administering the test, the following points must first be ensured:
1. Put items with the same format together.
2. Arrange the test items from easy to difficult.
3. Give proper spacing to each item for easy reading.
4. Keep options and questions on the same page.
5. Place any illustrations near the options.
6. Check the answer key.
7. Check the directions of the test.
8. Provide space for name, date and score.
9. Proofread the test.
10. Reproduce the test.

Administering the Examination

Guidelines before Administering an Examination
1. Try to induce a positive test-taking attitude.
2. Inform the students about the purpose of the test.
3. Give oral directions as early as possible, before distributing the tests.
4. Do not give any hints about the test.
5. Inform the students about the length of time allowed for the test.
6. Tell the students how to signal or call your attention if they have a question.
7. Tell the students how the papers are to be collected.
8. Make sure the room is well lighted and has a comfortable temperature.
9. Remind the students to put their names on their papers.
10. If the test has more than one page, have each student check that all pages are there.

Guidelines during the Examination
1. Do not give instructions or talk while the examination is going on, to minimize interruptions and distractions.
2. Avoid giving hints.
3. Monitor to check student progress and discourage cheating.
4. Give time warnings if students are not pacing their work appropriately.
5. Make a note of any questions students ask during the test so that items can be revised for future use.
6. Test papers must be collected uniformly, to save time and to keep test papers from being misplaced.

Guidelines after the Examination
1. Grade the papers (and add comments if you can); do test analysis.
2.
If you are recording grades or scores, record them in pencil in your class record before returning the papers.
3. Return the papers in a timely manner.
4. Discuss the test items with the students.

ANALYZING THE TEST
Item Analysis – a process which examines student responses to individual test items (questions) in order to assess the quality of those items and of the test as a whole.

Uses of Item Analysis
1. Item analysis data provide a basis for efficient class discussion of the test results.
2. They provide a basis for remedial work.
3. They provide a basis for the general improvement of classroom instruction.
4. They are a basis for increased skill in test construction.
5. Item analysis procedures provide a basis for constructing a test bank.

Two Types of Item Analysis
1. Quantitative item analysis – provides the following:
a. the difficulty of the item;
b. the discriminating power of the item;
c. the effectiveness of each alternative (for the multiple-choice type of test).
2. Qualitative item analysis – a process in which the teacher or an expert carefully proofreads the test before it is administered to check for typographical errors, to avoid grammatical clues that may give away the correct answer, and to ensure that the reading level of the material is appropriate.

Difficulty Index (DF) – refers to the ease or difficulty of a test item. It is defined as the proportion of the number of students in the upper and lower groups who answered an item correctly.

FORMULA:
DF = (CUG + CLG) / N
Where:
DF = difficulty index
CUG = the number of students in the upper group who answered the item correctly
CLG = the number of students in the lower group who answered the item correctly
N = the total number of students involved in the item analysis.
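As a quick illustration of the formula, the difficulty index can be computed in a few lines of Python (an illustrative sketch added for this module, not part of the original policy or procedure; the numbers come from the worked example later in this chapter):

```python
def difficulty_index(cug: int, clg: int, n: int) -> float:
    """DF = (CUG + CLG) / N: the proportion of students in the
    upper and lower groups who answered the item correctly."""
    return (cug + clg) / n

# Example: 8 students in the upper group and 3 in the lower group
# answered correctly, out of 22 students included in the analysis.
print(difficulty_index(8, 3, 22))  # 0.5
```

A value of 0.5 falls in the 0.41–0.60 range, which the scale below interprets as an average, moderately difficult item.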
LEVEL OF DIFFICULTY OF AN ITEM
The computed difficulty index can be interpreted using the scale below:

Range of Difficulty Index | Level
0.00 – 0.20               | Very difficult item
0.21 – 0.40               | Difficult item
0.41 – 0.60               | Average/moderately difficult item
0.61 – 0.80               | Easy item
0.81 – 1.00               | Very easy item

*The higher the value of the difficulty index, the easier the item is. The acceptable difficulty index ranges from 0.41 to 0.60 (moderate) only.

Discrimination Index (DI) – the power of the item to discriminate between the students who scored high and those who scored low in the overall test. In other words, it is the item's ability to distinguish between those who know the lesson and those who do not.

FORMULA:
DI = (CUG − CLG) / n
Where:
DI = discrimination index value
CUG = the number of students in the upper group who answered the item correctly
CLG = the number of students in the lower group who answered the item correctly
n = the total number of students in either of the two groups.

TYPES OF DISCRIMINATION INDEX
1. Positive discrimination happens when the number of students in the upper group who answered the item correctly is GREATER THAN the number in the lower group who answered it correctly (CUG > CLG).
2. Negative discrimination happens when the number of students in the upper group who answered the item correctly is LESS THAN the number in the lower group who answered it correctly (CUG < CLG).
3. Zero discrimination happens when the number of students in the upper group who answered the item correctly is EQUAL TO the number in the lower group who answered it correctly
(CUG = CLG).

LEVEL OF DISCRIMINATION INDEX

Range of Discrimination Index | Level
−1.00 – −0.01                 | Questionable item
0.00                          | No discriminating power
0.01 – 0.19                   | Very low discriminating power
0.20 – 0.40                   | Low discriminating power
0.41 – 0.60                   | Moderate discriminating power
0.61 – 1.00                   | Very high discriminating power

Note: If the discrimination index is negative, the item is automatically rejected regardless of its level of difficulty. The only acceptable discrimination index is from +0.20 to +1.00.

Interpretation of difficulty and discrimination indices
A good (retained) item has both indices acceptable. A fair (revised) item has one unacceptable index. A poor (rejected/discarded) item has both indices unacceptable.

Difficulty Index | Discrimination Index | Remarks | Action
Acceptable       | Acceptable           | Good    | Retain
Acceptable       | Not acceptable       | Fair    | Revise/Improve
Not acceptable   | Acceptable           | Fair    | Revise/Improve
Not acceptable   | Not acceptable       | Poor    | Reject/Discard

Acceptable Difficulty Index (DF): 0.41 – 0.60 (moderately difficult/average)
Acceptable Discrimination Index (DI): 0.20 – 1.00 (low to very high)

Steps in Item Analysis (U-L Method)
1. Arrange the scores from highest to lowest.
2. Separate the scores into an upper and a lower group. If 30 students or fewer took the exam, divide them into two groups: the first half is the Upper Group (UG), while the other half is the Lower Group (LG). If there are more than 30 students, take the top 27% and the bottom 27% and name them the Upper Group (UG) and the Lower Group (LG).
3. Compute the index of difficulty for each item, describe the level of difficulty as very easy, easy, average, difficult, or very difficult, then indicate whether it is acceptable or not.
4. Compute the index of discrimination for each item, describe its power to discriminate, then indicate whether it is acceptable or not.
5. Interpret the results: is the item good, fair or poor?
6.
Indicate the necessary action: retain, revise or reject.

Improving Test Items (Multiple Choice)
To improve multiple-choice test items, consider the stem of the item, the distracters and the key answer.
1. Compute the difficulty index.
2. Compute the discrimination index.
3. Make an analysis of the level of difficulty, the discrimination index and the distracters.
4. Make a conclusion: retain, revise or reject the item.

Sample Problems:
1. Suppose a 40-item test was given to 40 students in a History class. Compute the difficulty index and index of discrimination of the following test results. Interpret your answers and determine what actions you should take.

UG = 40 x 0.27 = 10.8 or 11
LG = 40 x 0.27 = 10.8 or 11
Total number of students included = 22

Item No. | CUG | CLG | DF   | Level   | Interpretation | DI   | Level                         | Interpretation | Remarks | Decision
1        | 8   | 3   | 0.50 | Average | Acceptable     | 0.45 | Moderate discriminating power | Acceptable     | Good    | Retain
2        | 9   | 5   | 0.64 | Easy    | Not acceptable | 0.36 | Low discriminating power      | Acceptable     | Fair    | Revise

For item no. 1: DF = (8 + 3)/22 = 0.50; DI = (8 − 3)/11 = 5/11 = 0.45
For item no. 2: DF = (9 + 5)/22 = 0.64; DI = (9 − 5)/11 = 4/11 = 0.36

2. A class is composed of 30 students. Divide the class into two. Option B is the correct answer. Based on the given data in the table, what would you do as the teacher?

Options     | A | B* | C | D | E
Upper Group | 1 | 10 | 2 | 0 | 2
Lower Group | 2 | 6  | 4 | 0 | 3

UG = 15, LG = 15
DF = (10 + 6)/30 = 16/30 = 0.53 – average/moderately difficult, acceptable
DI = (10 − 6)/15 = 4/15 = 0.27 – low discriminating power, acceptable
Remarks: the item is good. Decision: retain.

3. A class is composed of 50 students. Use 27% to get the upper and the lower groups. Analyze the item given the following results. Option D is the correct answer. What will you do with the test item?

Options           | A | B | C | D* | E
Upper Group (27%) | 1 | 1 | 2 | 8  | 2
Lower Group (27%) | 5 | 0 | 4 | 4  | 1

4. A class is composed of 50 students. Use 27% to get the upper and the lower groups.
Analyze the item given the following results. Option E is the correct answer. What will you do with the test item?

Options           | A | B | C | D | E*
Upper Group (27%) | 2 | 3 | 2 | 2 | 5
Lower Group (27%) | 2 | 2 | 1 | 1 | 8

VII. Characteristics of a Good Test
Chapter 7 – Instructional Objectives
7.1. Enumerate the different ways of establishing the validity and reliability of different assessment tools.
7.2. Identify the different factors affecting the validity and reliability of a test.
7.3. Compute and interpret the validity and reliability coefficients.

A good test must first of all be valid. Validity refers to the extent to which a test measures what it purports to measure. This is related to the purpose of the test. If the purpose of the test is to determine competency in adding two-digit numbers, then the test items will be about the addition of two-digit numbers. If the objectives match the test items prepared, the test is said to be valid.

WAYS OF ESTABLISHING VALIDITY:
1. Face validity – done by examining the physical appearance of the instrument.
2. Content validity – done through a careful and critical examination of the objectives of assessment, so that the test reflects the curricular objectives.
3. Criterion-related validity – established statistically, such that a set of scores revealed by the measuring instrument is correlated with the scores obtained on another external predictor or measure. It has two types:
3.1. Concurrent validity – describes the present status of the individual by correlating sets of scores obtained from two measures given concurrently.
3.2. Predictive validity – describes the future performance of an individual by correlating sets of scores obtained from two measures given at a longer time interval.
4.
Construct-related validity – the extent to which the test measures theoretical, unobservable qualities, such as understanding, math achievement or performance anxiety, over a period of time, on the basis of gathered evidence. It is established through intensive study of the test or measurement instrument, using convergent/divergent validation and factor analysis.
4.1. Convergent validity – a type of construct validation wherein the test has a high correlation with another test that measures the same construct.
4.2. Divergent validity – a type of construct validation wherein the test has a low correlation with a test that measures a different construct. In this case, high validity occurs only when there is a low correlation coefficient between the tests that measure different traits. A correlation coefficient in this instance is called a validity coefficient.
4.3. Factor analysis – another method of assessing the construct validity of a test, using complex statistical procedures.

FACTORS AFFECTING VALIDITY
1. Poorly constructed items
2. Unclear directions
3. Ambiguous test items
4. Too difficult vocabulary
5. Complicated syntax
6. Inadequate time limit
7. Inappropriate level of difficulty
8. Unintended clues
9. Improper arrangement of test items

RELIABILITY – refers to the consistency of test scores. The reliability of test scores is usually reported by a reliability coefficient, which is also a correlation coefficient. There are different ways of establishing the reliability of a test:
1. Test-retest method – the same test is administered twice to the same group of students, with a time interval between tests. The two sets of test scores are correlated using the Pearson product-moment correlation coefficient (r) or the Spearman rho formula (rs), and this correlation provides a measure of stability.
2.
Equivalent-form method – also known as the PARALLEL or ALTERNATE forms method. Two different but equivalent forms of the test are administered to the same group of students within a close time interval. The two forms of the test must be constructed so that the content, type of test items, difficulty and instructions for administration are similar but not identical.
3. Test-retest with equivalent forms – done by giving equivalent forms of the test with an increased time interval between forms. The two sets of test scores are correlated using the Pearson product-moment correlation coefficient.
4. Split-half method – the test is administered once, and the two equivalent halves of the test are scored. The common procedure is to divide the test into odd-numbered and even-numbered items. The two halves of the test must be similar but not identical in content, number of items and difficulty. This provides two scores for each student. The scores obtained on the two halves are correlated using Pearson r, and the result is the reliability coefficient for a half test. The reliability coefficient for the whole test is then determined using the Spearman-Brown formula. This method provides a measure of internal consistency.
5. Kuder-Richardson method – the test is administered once, the total test is scored, and the proportion/percentage of students passing and not passing each item is used to estimate reliability.

FACTORS AFFECTING THE RELIABILITY OF A TEST
1. Length of the test
2. Item difficulty
3. Objectivity of scoring
4. Heterogeneity of the student group
5. Limited time

RELIABILITY COEFFICIENT
The reliability coefficient is a measure of the amount of error associated with the test scores.

DESCRIPTION OF THE RELIABILITY COEFFICIENT
1. The range of the reliability coefficient is from 0 to 1.0.
2. The acceptable value is 0.60 or higher.
3. The higher the value of the reliability coefficient, the more reliable the overall test scores are.

Interpretation of the Reliability Coefficient
1.
Group variability affects the size of the reliability coefficient. Heterogeneous groups yield higher coefficients than homogeneous groups: as group variability increases, reliability goes up.
2. Scoring reliability limits test score reliability. If tests are scored unreliably, error is introduced, which limits the reliability of the test scores.
3. Test length affects test score reliability. As the length of the test increases, its reliability tends to go up.
4. Item difficulty affects test score reliability. As test items become very easy or very difficult, the test's reliability goes down.

Level of Reliability Coefficient

Reliability Coefficient   Interpretation
0.91 – 1.00               Excellent reliability. Ideal for a classroom test.
0.81 – 0.90               Very high reliability. Very good for a classroom test.
0.71 – 0.80               High reliability. Good for a classroom test; there are probably a few items that need to be improved.
0.61 – 0.70               Moderate reliability. The test needs to be supplemented by other measures (more tests) to determine grades.
0.51 – 0.60               Low reliability. Suggests the need for revision of the test, unless it is quite short (ten or fewer items). Needs to be supplemented by other measures (more tests) to determine grades.
0.50 and below            Questionable reliability. The test should not contribute heavily to the course grade, and it needs revision.

Exercises:
1. Mrs. Dela Cruz administered a test to 10 students in her Elementary Statistics class twice, with a one-day interval. The test given after one day is exactly the same test given the first time. The scores below were gathered in the first test (x) and second test (y). Using the test-retest method, is the test reliable? Show the complete solution using the Spearman rho and Pearson r formulas.
a.
Using Pearson r

Student  First test (x)  Second test (y)     xy      x²       y²
1        36              38               1,368    1,296    1,444
2        26              34                 884      676    1,156
3        38              38               1,444    1,444    1,444
4        15              27                 405      225      729
5        17              25                 425      289      625
6        28              26                 728      784      676
7        32              35               1,120    1,024    1,225
8        35              36               1,260    1,225    1,296
9        12              19                 228      144      361
10       35              38               1,330    1,225    1,444
         ΣX = 274        ΣY = 316     ΣXY = 9,192  ΣX² = 8,332  ΣY² = 10,400

Given: N = 10

r = [10(9,192) – (274)(316)] / √{[10(8,332) – (274)²][10(10,400) – (316)²]}
  = (91,920 – 86,584) / √[(83,320 – 75,076)(104,000 – 99,856)]
  = 5,336 / √[(8,244)(4,144)]
  = 5,336 / √34,163,136
  = 5,336 / 5,844.93
  = 0.91 – excellent reliability

b. Using Spearman Rho

Student  First test (x)  Second test (y)  Rank (x)  Rank (y)  D = Rank(x) – Rank(y)   D²
1        36              38               2         2          0                      0
2        26              34               7         6          1                      1
3        38              38               1         2         -1                      1
4        15              27               9         7          2                      4
5        17              25               8         9         -1                      1
6        28              26               6         8         -2                      4
7        32              35               5         5          0                      0
8        35              36               3.5       4         -0.5                    0.25
9        12              19               10        10         0                      0
10       35              38               3.5       2          1.5                    2.25
                                                                            ΣD² = 13.5

rs = 1 – [6(13.5) / 10(10² – 1)]
   = 1 – 81/990
   = 1 – 0.082
   = 0.918 – excellent reliability

VIII. Measures of Central Tendency
Chapter 8 – Instructional Objectives
8.1. Present scores in a frequency distribution table.
8.2. Calculate the mean, median and mode of a given set of data.
8.3. Describe shapes of a distribution using measures of central tendency.
8.4. Distinguish the different measures of location.

Preparing Frequency Distribution Table (FDT)
A frequency distribution table is a method of organizing raw data in compact form by displaying a series of scores in ascending or descending order, together with their frequencies (the number of times each score occurs in the respective data set). Frequency tells you how often something happened: the frequency of an observation is the number of times the observation occurs in the data. For example, in the following list of numbers, the frequency of the number 9 is 5 (because it occurs 5 times): 1, 2, 3, 4, 6, 9, 9, 8, 5, 1, 1, 9, 9, 0, 6, 9.
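Frequency counts like this are easy to verify with a short script. The sketch below (Python, using the standard library's collections.Counter; the variable names are ours, not the module's) tallies the example list:

```python
from collections import Counter

# The example list from the text.
scores = [1, 2, 3, 4, 6, 9, 9, 8, 5, 1, 1, 9, 9, 0, 6, 9]

# Counter maps each distinct score to the number of times it occurs.
freq = Counter(scores)

print(freq[9])  # frequency of the number 9 -> 5
for score in sorted(freq):
    print(score, freq[score])
```

Printing the counts in sorted order gives the score-by-frequency listing that a frequency distribution table organizes.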
Arranged in order, the list is: 0, 1, 1, 1, 2, 3, 4, 5, 6, 6, 8, 9, 9, 9, 9, 9.
Mode: 9
Median: 5.5 (the mean of the two middle values, 5 and 6)
Mean: 82/16 = 5.125

The various components of a frequency distribution are:
1. class interval
2. types of class interval
3. class boundaries
4. midpoint or class mark
5. class width or size of class interval
6. class frequency
7. relative frequency = class frequency / total frequency
8. cumulative frequency

Steps in Making Your Frequency Distribution
Step 1: Calculate the range of the data set. The range is the difference between the largest value and the smallest value. Range = Highest Score – Lowest Score
Step 2: Divide the range by the number of groups you want, then round up.
Step 3: Use the class width to create your groups.
Step 4: Find the frequency for each group (tally).
Step 5: Indicate the class boundaries.
Step 6: Determine the midpoint or class mark.
Step 7: Indicate the relative frequency and cumulative frequency.

ACTIVITY:
1. The following scores were obtained from a 60-item test in Assessment in Learning I administered to 36 students. Construct a frequency distribution table. Determine the following: range, class size, and mean using the midpoint.

56 44 32 34 22 52 21 18 40 47 30 48
49 36 20 46 30 50 38 27 30 41 50 24
30 40 33 49 36 27 48 33 41 25 36 19

Ungrouped tally (scores from 18 to 56 not listed have a frequency of 0):

Score  Tally  Frequency
56     /      1
52     /      1
50     //     2
49     //     2
48     //     2
47     /      1
46     /      1
44     /      1
41     //     2
40     //     2
38     /      1
36     ///    3
34     /      1
33     //     2
32     /      1
30     ////   4
27     //     2
25     /      1
24     /      1
22     /      1
21     /      1
20     /      1
19     /      1
18     /      1

Highest Score = 56; Lowest Score = 18
Range = HS – LS = 56 – 18 = 38
Class interval: Ci = R/10 = 38/10 = 3.8, or 4

Frequency Distribution Table – Grouped Data (31 or more students/scores)

Class Limit       Tally    Frequency  Class         Midpoint  Cumulative    fx
(lower – upper)                       Boundaries    (x)       Frequency
16 – 19           //       2          15.5 – 19.5   17.5       2            35.0
20 – 23           ///      3          19.5 – 23.5   21.5       5            64.5
24 – 27           ////     4          23.5 – 27.5   25.5       9           102.0
28 – 31           ////     4          27.5 – 31.5   29.5      13           118.0
32 – 35           ////     4          31.5 – 35.5   33.5      17           134.0
36 – 39           ////     4          35.5 – 39.5   37.5      21           150.0
40 – 43           ////     4          39.5 – 43.5   41.5      25           166.0
44 – 47           ///      3          43.5 – 47.5   45.5      28           136.5
48 – 51           //// /   6          47.5 – 51.5   49.5      34           297.0
52 – 55           /        1          51.5 – 55.5   53.5      35            53.5
56 – 59           /        1          55.5 – 59.5   57.5      36            57.5
                           N = 36                                    Σfx = 1,314

8th decile: Dk = lb + {[k(n)/10 – cf] / f} ci
8(36)/10 = 28.8, which falls in the class 48 – 51 (cf before = 28, f = 6, lb = 47.5)
D8 = 47.5 + [(28.8 – 28)/6](4) = 47.5 + 0.53 = 48.03, or about 48

5th decile: 5(36)/10 = 18, which falls in the class 36 – 39 (cf before = 17, f = 4, lb = 35.5)
D5 = 35.5 + [(18 – 17)/4](4) = 35.5 + 1 = 36.5, or 37

75th percentile: 75(36)/100 = 27, which falls in the class 44 – 47 (cf before = 25, f = 3, lb = 43.5)
P75 = 43.5 + [(27 – 25)/3](4) = 43.5 + 2.67 = 46.17, or 46

25th percentile: 25(36)/100 = 9, which falls in the class 24 – 27 (cf before = 5, f = 4, lb = 23.5)
P25 = 23.5 + [(9 – 5)/4](4) = 23.5 + 4 = 27.5, or 28

3rd quartile: 3(36)/4 = 27
Q3 = 43.5 + [(27 – 25)/3](4) = 43.5 + 2.67 = 46.17, or 46

2nd quartile: 2(36)/4 = 18
Q2 = 35.5 + [(18 – 17)/4](4) = 35.5 + 1 = 36.5, or 37
Note: 2nd quartile = 50th percentile = 5th decile = Median

Mean = Σfx/n = 1,314/36 = 36.5

Mdn = lcb + [(n/2 – cf)/f] ci = 35.5 + [(18 – 17)/4](4) = 35.5 + 1 = 36.5, or 37

Mode = lb + {(f – f1) / [2f – f1 – f2]} ci, where the modal class is 48 – 51 (f = 6, f1 = 3, f2 = 1, lb = 47.5)
Mode = 47.5 + {(6 – 3)/[2(6) – 3 – 1]}(4) = 47.5 + 12/8 = 47.5 + 1.5 = 49

2. Using the data below, determine the cumulative frequency less than, the lower and upper class boundaries, and the median. Use a separate frequency distribution table for this item.
38 34 12 24 32 31 21 28 30 37 43 48
48 33 27 46 30 23 38 27 20 41 29 24
32 40 23 18 46 37 38 43 21 45 26 19
27 33 45 27 39 24 31 31 38 29 29 36

Measures of Central Tendency – an average or typical value in a set of scores.

Mean – the average of the numbers. It is easy to calculate: add up all the numbers, then divide by how many numbers there are.

Median of an ungrouped data set – the middle data point of an ordered data set, at the 50th percentile. If a data set has an odd number of observations, the median is the middle value; if it has an even number of observations, the median is the average of the two middle values.

Mode of ungrouped data (data not in the form of a frequency distribution) – found by picking the value that occurs the greatest number of times: Mode = most frequent observation.

Example: 16, 17, 22, 34, 35, 36, 39, 40, 42, 36, 23, 42, 38, 35, 34, 26 (sum = 515)
Mean = 515/16 = 32.1875, or 32.19
Median: ordered data 16, 17, 22, 23, 26, 34, 34, 35, 35, 36, 36, 38, 39, 40, 42, 42; Mdn = 35
Mode: 34, 35, 36, 42 – multimodal

SUMMARY
The mean is calculated by adding all of the data values together, then dividing by the total number of values. The median is calculated by listing the data values in ascending order, then finding the middle value in the list. The mode is calculated by counting how many times each value occurs; the value that occurs with the highest frequency is the mode.

EXERCISES:
1. A random sample of 10 boys had the following intelligence quotients (IQs): 70, 120, 110, 101, 88, 83, 95, 98, 105, 100. Find the mean IQ.
Mean = 970/10 = 97
Ordered data: 70, 83, 88, 95, 98, 100, 101, 105, 110, 120; Median = (98 + 100)/2 = 99
2. On an exam, two students scored 60, five students scored 90, four students scored 75, and two students scored 81. If the answer is 90, what is being asked in the question (mean, median, mode, or range)?
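The worked example above can be verified with Python's standard statistics module (a minimal sketch; the variable name is ours):

```python
from statistics import mean, median, multimode

# The 16 scores from the worked example above (sum = 515).
data = [16, 17, 22, 34, 35, 36, 39, 40, 42, 36, 23, 42, 38, 35, 34, 26]

print(mean(data))       # 515/16 = 32.1875
print(median(data))     # both middle values of the ordered data are 35, so 35
print(multimode(data))  # 34, 35, 36 and 42 each occur twice: multimodal
```

multimode (Python 3.8+) returns every value tied for the highest frequency, which is how the multimodal result in the example is detected.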
3. Here are the math quiz scores (number correct) for 16 students: 4, 1, 2, 4, 2, 4, 3, 2, 2, 0, 1, 2, 3, 2, 0, 3
Ordered data: 0, 0, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4
a) Find the mode. Mode = 2
b) What are the mean and the median for the given data?
Mean = 35/16 = 2.19, or 2
Median = 2
4. A high school teacher assigns trigonometry practice problems to be worked via the net. Students must use a password to access the problems, and the log-in and log-off times are automatically recorded for the teacher. At the end of the week, the teacher examines the amount of time each student spent working the assigned problems. The data, in minutes, are given below.
15, 28, 25, 48, 22, 43, 49, 34, 22, 33, 27, 25, 22, 20, 39
Ordered data: 15, 20, 22, 22, 22, 25, 25, 27, 28, 33, 34, 39, 43, 48, 49
Find the mean, median, and mode for the above data.
Mean = 452/15 = 30.13, or 30
Median = 27
Mode = 22
What does this information tell you about students' length of time on the computer solving trigonometry problems?
5. Compute the mean, the median, and the mode of the test scores below of 36 students in Assessment in Learning 1.

x        f
18 – 24  6
25 – 31  7
32 – 38  8
39 – 45  5
46 – 52  9
53 – 59  1

Measures of Location: Median, Decile, Percentile, Quartile
10th decile = 100th percentile = 4th quartile
5th decile = 50th percentile = 2nd quartile = median
Position formulas: median = n/2; kth decile = kn/10; kth percentile = kn/100; kth quartile = kn/4
Find: the 8th decile, the 5th decile, the 70th percentile, the 87th percentile, the 3rd quartile (= 75th percentile), and the 2nd quartile (= median = 50th percentile = 5th decile).

IX. Measures of Variability
Chapter 9 – Instructional Objectives
9.1. Compute the range, variance and standard deviation.
9.2. Interpret the results of computed values.
9.3. Describe kinds of distribution considering measures of variability.

Measures of Variation – range, variance, standard deviation, interquartile range
Statisticians use summary measures to describe the amount of variability or spread in a set of data.
Statistical measures of variation are numerical values that indicate the variability inherent in a set of data measurements. The most common measures of variability are the range, the interquartile range (IQR), the variance, and the standard deviation. Variability serves both as a descriptive measure and as an important component of most inferential statistics. As a descriptive statistic, variability measures the degree to which the scores are spread out or clustered together in a distribution.

Range
The range of a set of observations is the absolute value of the difference between the largest and smallest values in the set. It measures the size of the smallest continuous interval of real numbers that encompasses all the data values.
Example: Given the following sorted data: 1.2, 1.5, 1.9, 2.4, 2.4, 2.5, 2.6, 3.0, 3.5, 3.8, the range is 3.8 – 1.2 = 2.6.

Variance and Standard Deviation
The variance of a set of data is a cumulative measure of the squares of the differences of all the data values from the mean. The population variance is simply the arithmetic mean of the squared differences between each data value in the population and the mean. The formula for the sample variance is similar, except that the denominator is (n – 1) instead of n. Variance is therefore the average squared difference of the values from the mean; it includes every value in the calculation by comparing each value to the mean.

The interquartile range is the middle half of the data, lying between the upper and lower quartiles; in other words, it includes the 50% of data points that fall between Q1 and Q3.

Population variance
The formula for the variance of an entire population is:
σ² = Σ(x – μ)² / N

Example: Using the data below, find the variance of the scores of 10 students in a Science quiz. Interpret the result. Complete the table.
x      (x – x̄)   (x – x̄)²
19      4.4       19.36
17      2.4        5.76
16      1.4        1.96
16      1.4        1.96
15      0.4        0.16
14     -0.6        0.36
14     -0.6        0.36
13     -1.6        2.56
12     -2.6        6.76
10     -4.6       21.16
Σx = 146;  x̄ = 14.6;  Σ(x – x̄)² = 60.40

Population variance = 60.40/10 = 6.04; standard deviation = √6.04 = 2.46
Sample variance = 60.40/9 = 6.71

In the population equation, σ² is the population parameter for the variance, μ is the parameter for the population mean, and N is the number of data points, which should include the entire population.

Sample variance
To use a sample to estimate the variance of a population, use the following formula:
s² = Σ(x – M)² / (n – 1)
Using the population equation with sample data tends to underestimate the variability. Because it is usually impossible to measure an entire population, statisticians use the sample variance equation much more frequently. In the equation, s² is the sample variance and M is the sample mean; the (n – 1) in the denominator corrects for the tendency of a sample to underestimate the population variance.

Example of calculating the sample variance: for a sample of 17 observations, take each observation, subtract the sample mean to calculate the difference, and square that difference. Sum the squared differences, then divide the sum by 16, since the sample variance equation uses n – 1 = 17 – 1 = 16. For the dataset used in this example, the variance is 201.

Mesokurtic distribution – middle, close to normal. It has a moderate measure of variability; the scores tend to spread evenly around the mean. It tends to be symmetrical, close to the normal distribution.

Leptokurtic distribution – skinny or thin. It has a small measure of variability; the scores tend to be compressed toward the mean. It is said to be homogeneous.
The group tends to be of almost the same ability, and the distribution looks skinny or thin (lepto means skinny or thin).

Platykurtic distribution – flat. It has a large measure of variability; the scores tend to be expanded or widely spread away from the mean. It is said to be heterogeneous: the class shows different kinds of ability. The distribution looks flat (platy means flat).

Sample of getting the variance (from the 17-observation example above): standard deviation = √201 = 14.18

Standard Deviation
The standard deviation is the standard or typical difference between each data point and the mean. It is simply the square root of the variance.

Exercises:
1. Consider the following scores taken from a class of boys and girls in a 20-item test in English.
Boys: 5, 7, 7, 12, 13, 15, 16, 17, 18, 19
Girls: 9, 10, 10, 12, 13, 13, 15, 16, 16, 16, 17
Find the range, variance, and standard deviation of the boys, the girls, and the whole class. Which group has the better performance? Which group is more spread out?

X. Describing Individual Performance
Chapter 10 – Instructional Objectives
10.1. Differentiate a raw score from a standard score.
10.2. Convert raw scores to standard scores.
10.3. Describe individual performance using standard scores (z-scores, T-scores, standard nine, and percentile rank).

Scores obtained directly from a test are known as actual scores or raw scores. Such scores cannot by themselves be interpreted as low, average, or high. Scores must be converted or transformed so that they become meaningful and allow interpretation and direct comparison of two scores.

Z-score – used to convert a raw score to a standard score that tells how far the raw score lies from the mean in standard deviation units. It also indicates whether an individual student performed well in the examination compared to the performance of the whole class.
The z-score indicates the distance between a given raw score and the mean, in units of the standard deviation. The z-value is positive when the raw score is above the mean and negative when the raw score is below the mean.
Formula: z = (x – μ) / σ
where x is the raw score, μ is the population mean, and σ is the population standard deviation.

T-score – another type of standard score, with a mean of 50 and a standard deviation of 10. To convert a raw score to a T-score, first find the z-score equivalent of the raw score. The T-score expresses how much a result varies from the mean on this rescaled scale.
Formula: T-score = 10z + 50, where z is the z-score.

Stanine – the third type of standard score is the standard nine-point scale, known as the stanine. It is a nine-point grading scale ranging from 1 to 9, with 1 the lowest and 9 the highest. Stanine grading is easier to understand than the other standard score models. The descriptive interpretation of stanines 1, 2, and 3 is below average; 4, 5, and 6 are interpreted as average; and 7, 8, and 9 are described as above average. The stanine scale has a mean of five (5) and a standard deviation of two (2).

Percentile Rank – the percentage of scores in a frequency distribution that are equal to or lower than a given score. For example, a test score that is greater than 75% of the scores of the people taking the test is said to be at the 75th percentile, where 75 is the percentile rank. It is a measure used in statistics indicating the value below which a given percentage of observations in a group of observations falls.

Example: The data below are Grace's scores in History of Math and Number Theory. Solve for the z-score values and determine in which subject she performed better relative to the class's performance.
History of Math: x = 92, Mean = 95, s = 3
z = (92 – 95)/3 = -3/3 = -1
T = 10(-1) + 50 = -10 + 50 = 40

Number Theory: x = 88, Mean = 80, s = 4
z = (88 – 80)/4 = 8/4 = 2
T = 10(2) + 50 = 20 + 50 = 70

Since her z-score in Number Theory (z = 2) is higher than her z-score in History of Math (z = -1), Grace performed better in Number Theory relative to the class's performance.

Example: Albert's raw score in a Chemistry exam is 66, which is equal to the 90th percentile.
Interpretation: This means that 90% of Albert's classmates got a score lower than 66; Albert surpassed 90% of his classmates.

XI. Marks and Learning Outcomes
Chapter 11 – Instructional Objectives
11.1. Give the purposes of grades.
11.2. Describe different forms of grading.
11.3. Compute grades using the weighted, averaging, and cumulative grading systems.

Generally, the goal of grading is to evaluate individual students' learning and performance. Grades may also incorporate criteria – such as attendance, participation, and effort – that are not direct measures of learning, whereas the goal of assessment is to improve student learning. Grading enables teachers to communicate the achievements of students to parents and others, provide incentives to learn, and provide information that students can use for self-evaluation. Grades reflect the teacher's or professor's judgment of students' level of achievement and, ideally, provide students with information they can use to improve their performance. But grades have also been shown to have strong and lasting effects on students' attitudes, behaviors, and motivation to learn.

Purposes of Grades and Strategies in Grading
The following exemplar guidelines are offered as suggestions to schools as they implement a proficiency-based learning system:
1. The primary purpose of the grading system is to clearly, accurately, consistently, and fairly communicate learning progress and achievement to students, families, postsecondary institutions, and prospective employers.
2.
The grading system ensures that students, families, teachers, counselors, advisors, and support specialists have the detailed information they need to make important decisions about a student's education.
3. The grading system measures, reports, and documents student progress and proficiency against a set of clearly defined cross-curricular and content-area standards and learning objectives collaboratively developed by the administration, faculty, and staff.
4. The grading system measures, reports, and documents academic progress and achievement separately from work habits, character traits, and behaviors, so that educators, counselors, advisors, and support specialists can accurately determine the difference between learning needs and behavioral or work-habit needs.
5. The grading system ensures consistency and fairness in the assessment of learning, and in the assignment of scores and proficiency levels against the same learning standards, across students, teachers, assessments, learning experiences, content areas, and time.
6. The grading system is not used as a form of punishment, control, or compliance. In proficiency-based learning systems, what matters most is where students end up, not where they started out or how they behaved along the way. Meeting and exceeding challenging standards defines success, and the best grading systems motivate students to work harder, overcome failures, and excel academically.

Strategies in Grading
1. Grade on the basis of students' mastery of knowledge and skills.
2. Avoid grading systems that put students in competition with their classmates and limit the number of high grades.
3. Try not to overemphasize grades.
4. Keep students informed of their progress.

Types of Grading System
1. Weighted Grading System – each grade is multiplied by a certain weight or percentage, and the products are added to get the final rating.
Example: Grades in Mythology and Folklore

Component              Weight   Grade   W x G
Recitation             15%      85      12.75
Quizzes/Chapter Test   15%      88      13.20
Major Examination      40%      84      33.60
Projects               15%      90      13.50
Seatwork/Exercises     15%      87      13.05
Final Grade                             86.10

2. Averaging Method – the grades are added, then divided by the number of grades.
Example: (87 + 90 + 86 + 80 + 93 + 85)/6 = 86.83 – average grade

3. Cumulative Method – this method takes the tentative grade in the first grading period as the final grade for that period. In each succeeding period, the final grade is obtained by multiplying a certain percentage (usually 70%) by the current (tentative) grade and 30% by the previous final grade, then adding the products. In the fourth grading period, the final grade in each subject becomes the final rating.

Example of the Cumulative Method:
First grading: 85 (final grade for the first period)
Second grading: 88 → 88(0.70) + 85(0.30) = 61.6 + 25.5 = 87.1 (final grade for the second period)
Third grading: 78 → 78(0.70) + 87.1(0.30) = 54.6 + 26.13 = 80.73
Fourth grading: 90 → 90(0.70) + 80.73(0.30) = 63 + 24.22 = 87.22 (final rating)

Other Types of Grading Systems
Percentage grading – from 0 to 100 percent.
Letter grading and variations – from grade A to grade F.
Norm-referenced grading – comparing students to each other, usually with letter grades.
Mastery grading – grading students as "masters" or "passers" when their attainment reaches a prespecified level.
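The three grading systems described above can be sketched in Python as follows (a minimal sketch; the function names are ours, and weights are written as decimals, e.g. 0.15 for 15%):

```python
def weighted_grade(weights, grades):
    """Weighted system: multiply each grade by its weight and add the products."""
    return sum(w * g for w, g in zip(weights, grades))

def average_grade(grades):
    """Averaging method: add the grades and divide by how many there are."""
    return sum(grades) / len(grades)

def cumulative_grade(grades, current=0.70, previous=0.30):
    """Cumulative method: each period's final grade is 70% of the current
    tentative grade plus 30% of the previous period's final grade."""
    running = grades[0]  # first-period tentative grade is that period's final grade
    for g in grades[1:]:
        running = current * g + previous * running
    return running

# Mythology and Folklore example: final grade 86.10
print(weighted_grade([0.15, 0.15, 0.40, 0.15, 0.15], [85, 88, 84, 90, 87]))

# Averaging example: (87 + 90 + 86 + 80 + 93 + 85)/6
print(round(average_grade([87, 90, 86, 80, 93, 85]), 2))

# Cumulative example: 85, 88, 78, 90 across four grading periods
print(round(cumulative_grade([85, 88, 78, 90]), 2))  # final rating 87.22
```

Folding the running grade forward period by period reproduces the 87.1 → 80.73 → 87.22 sequence in the cumulative example.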