Characteristics of a Good Test
aliheydari.tums@gmail.com

Definitions
A test is an instrument or systematic procedure for observing and describing one or more characteristics of a student, using either a numerical scale or a classification scheme.
Measurement is the procedure for assigning numbers to a specified attribute or characteristic of a person.
Evaluation is the process of making a value judgment about the worth of someone or something (Nitko, 2001).

Literature Review
• 13% of student failures in class are caused by faulty test questions (World Watch - The Philadelphia Trumpet, 2005).
• It is estimated that 90% of testing items fall short of quality standards (Wilen, 1992).

Learning Objectives
A learning objective (target) specifies what you would like students to achieve at the completion of an instructional segment.

Stages in Test Construction
I. Planning the Test
   A. Determining the objectives
   B. Preparing the table of specifications
   C. Selecting the appropriate item format
   D. Writing the test items
   E. Editing the test items
II. Trying Out the Test
   A. Administering the test
   B. Item analysis
   C. Preparing the final form of the test
III. Establishing Test Validity
IV. Establishing Test Reliability
V. Interpreting the Test Scores

Table of Specifications (TOS)
A Table of Specifications is the teacher's blueprint for constructing a classroom test. A TOS ensures a balance between items that test lower-level thinking skills and items that test higher-order thinking skills.

Steps in Preparing a TOS
1. List the topics covered for inclusion in the test.
2. Determine the objectives (Bloom's Taxonomy) to be assessed by the test.
3. Determine the percentage allocation of the test items for each topic.
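Step 3 above can be sketched as a small calculation. A common convention, assumed here, is to allocate items in proportion to the instructional time spent on each topic; the topic names, hour counts, and test length below are hypothetical illustrations.

```python
# Sketch of TOS step 3: allocate test items across topics in proportion
# to the (hypothetical) number of instructional hours per topic.
hours = {"Validity": 4, "Reliability": 4, "Item Writing": 2}
total_items = 50  # planned length of the test

total_hours = sum(hours.values())
allocation = {
    topic: round(total_items * h / total_hours)
    for topic, h in hours.items()
}
print(allocation)  # each topic's share of the 50 items
```

With these numbers the 50 items split 20 / 20 / 10, mirroring the 4 : 4 : 2 ratio of hours taught.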
Characteristics of a Good Test
• Validity
• Reliability
• Practicality
• Administrability
• Comprehensiveness
• Objectivity
• Simplicity
• Scorability

Validity
A test is valid if it measures what we want it to measure and nothing else. Validity is a test-dependent concept, whereas reliability is a purely statistical parameter.

Types of Validity
• Content validity
• Criterion-related validity
• Construct validity
• Face validity

Content Validity
Does the test measure the objectives of the course? Content validity is the extent to which a test measures a representative sample of the content to be tested, at the intended level of learning.

Criterion-Related Validity
Criterion-related validity investigates the correspondence between the scores obtained from the newly developed test and the scores obtained from some independent outside criterion. Depending on the time of administration:
• Concurrent validity: correlation between the new test's scores and a recognized measure taken at the same time.
• Predictive validity: correlation of students' scores with a criterion measured at a later time.

Construct Validity
Refers to measuring certain traits or theoretical constructs. It is based on the degree to which the items in a test reflect the essential aspects of the theory on which the test is based.

Face Validity
Does the test appear to measure what it is supposed to measure?

Factors Affecting Validity
a. Directions (clear and simple)
b. Difficulty level of the test (neither too easy nor too difficult)
c. Structure of the items (poorly constructed and/or ambiguous items contribute to invalidity)
d.
Arrangement of items and correct responses (start with the easiest items and end with the most difficult; arrange the answer positions randomly, not in an identifiable pattern)

Reliability
A test is reliable if we get the same results repeatedly. With an unreliable test, on the other hand, a student's score might fluctuate from one administration to the next.

Ways of Measuring Reliability
• Internal consistency
• Test-retest reliability
• Split-half method
• Inter-rater reliability
• Parallel forms

Test-Retest
Administer a given test to a particular group twice and calculate the correlation between the two sets of scores. Since there has to be a reasonable amount of time between the two administrations, this kind of reliability is referred to as reliability, or consistency, over time.

Disadvantages of Test-Retest
• It requires two administrations.
• Preparing similar conditions under which the administrations take place adds to the complications of this method.
• The interval between the two administrations should be neither too short nor too long; to keep the balance, a period of two weeks between them is recommended.

Parallel Forms
Two similar, or parallel, forms of the same test are administered to a group of examinees just once. The two forms of the test should be equivalent, and their subtests should also be equivalent. The problem here is that constructing two parallel forms of a test is a difficult job.

Split-Half
In this method, a single test with homogeneous items is administered to a group of examinees and then split, or divided, into two equal halves. The correlation between the two halves is an estimate of the reliability of the test scores. Easy and difficult items should be equally distributed between the two halves.
Split-Half: Advantages and Disadvantages
Advantages: there is no need to administer the same test twice, nor is it necessary to develop two parallel forms of the same test.
Disadvantage: developing a test with homogeneous items.

Which Method Should We Use?
It depends on the function of the test.
• The test-retest method is appropriate when the consistency of scores over a particular time interval (the stability of test scores over time) is important.
• The parallel-forms method is desirable when the consistency of scores over different forms is important.
• When the go-togetherness of the items of a test (its internal consistency) is of significance, the split-half and KR-21 methods are the most appropriate.

Factors Influencing Reliability
To obtain a reliability estimate, one or two sets of scores must be obtained from the same group of testees. Thus, two factors contribute to test reliability: the testee and the test itself.

Validity and Reliability
A test may be neither valid nor reliable, reliable but not valid, or both valid and reliable. A test must be reliable to be valid, but reliability does not guarantee validity.

Practicality
Practicality refers to the ease of administration and scoring of a test.

Administrability
The test should be administered uniformly to all students so that the scores obtained do not vary due to factors other than differences in the students' knowledge and skills. There should be clear instructions both for the students and for the one who will check the test (clear directions and processes).

Scorability
The test should be easy to score: the directions for scoring are clear, and an answer sheet and an answer key are provided.

Comprehensiveness
A test is said to have comprehensiveness if it encompasses all aspects of a particular subject of study.
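The KR-21 estimate mentioned above needs only the number of items, the mean total score, and the variance of the total scores; its standard form is KR-21 = (k / (k - 1)) * (1 - M(k - M) / (k * s^2)). A minimal sketch, with hypothetical scores (the source gives no data):

```python
# Sketch of the KR-21 internal-consistency estimate:
#   KR-21 = (k / (k - 1)) * (1 - M * (k - M) / (k * s^2))
# where k = number of items, M = mean total score, s^2 = variance of totals.
from statistics import mean, pvariance

k = 8  # number of items on the test
total_scores = [6, 5, 3, 8, 2, 6]  # each examinee's number-correct score

M = mean(total_scores)
s2 = pvariance(total_scores)  # population variance of the total scores
kr21 = (k / (k - 1)) * (1 - M * (k - M) / (k * s2))
print(round(kr21, 3))  # → 0.607 for these hypothetical scores
```

Note that KR-21 assumes all items are of roughly equal difficulty, which is consistent with the slides' emphasis on homogeneous items for internal-consistency methods.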
Simplicity
A test is said to be simple if it, along with its instructions and other details, is easy to understand.

Objectivity
Objectivity represents the agreement of two or more raters, or of a rater and the test administrator, concerning the score of a student: scoring is not influenced by emotion or personal prejudice. Lack of objectivity reduces test validity in the same way that lack of reliability does.

Other Factors
• Test length
• Speed
• Item difficulty

References
Miller, M. D., Linn, R. L., & Gronlund, N. E. (2008). Measurement and Assessment in Teaching (10th ed.). Pearson.
Nitko, A. J. (2001). Educational Assessment of Students (3rd ed.). Merrill Prentice Hall.
http://www.ehow.com/how_4913690_steps-preparing-test.html