1. If you had a test which was mostly composed of items with an item difficulty of 0.9, you would most likely expect a/an ____________ when looking at the distribution of scores:
A. Positive skew
B. Unimodal distribution
C. Negative skew
D. Bimodal distribution

2. Which of the following statistical methods is most used in test creation, research, reliability and validity studies?
A. Analysis of Variance
B. Correlation
C. Content Validation
D. Test for Significant Differences

3. Which of the following statements is TRUE?
A. A norm-referenced approach compares a test-taker’s score to a set of characteristics predefined by the test developer
B. When you want to compare a person’s scores on two different tests, you examine his/her individual raw scores for both tests
C. A criterion-referenced approach allows test users to compare differences in performance between test takers
D. None of the above

4. Which of the following statements is TRUE?
I. We may base our current interpretations, conclusions and recommendations on test results which we acquired during the early years of our clients when no updated test results are available
II. Earlier versions of tests may be used under circumstances where newer tests are unavailable/inaccessible for use
III. Third-party interpreters may be used in relaying test results to the client
IV. Test results that need to be shared with the school, social agencies, the courts or industry should be supervised by a close confidante of the client
A. Only III is TRUE
B. Only II and IV are TRUE
C. I, II and IV are TRUE
D. Only II is TRUE

5. When two items in a test are too highly correlated (0.98-0.99), then that might indicate:
A. A highly reliable test
B. Causality between the first and the second item
C. Redundancy
D. Differences in what dimensions these items belong to

6. A weighing scale which, when used, always tells you that you are 65 kg even though in reality you are only 60 kg is an instrument that:
A. Is valid and reliable
B. Is not valid but is reliable
C. Is not valid and not reliable
D. Is valid but is not reliable

7. Dianne has an already established work satisfaction scale which was created with the intention of identifying how happy workers are with their jobs. It had been normed for Filipino workers and was very well supported empirically, with high coefficients for its psychometric properties. She intends to use this test during the next recruitment period as a screening tool in order to predict which applicants would perform the best in their work later on. Given this situation, what could you say about the tool?
A. The tool has low validity, moderate reliability and high utility
B. The tool has high validity, low reliability and moderate utility
C. The tool has high validity, high reliability and low utility
D. The tool has low validity, high reliability and low utility

8. Two common ways to increase a test’s reliability are to:
A. Lengthen the test item and/or decrease the number of test items
B. Reduce the number of people in the normative sample and/or use another criterion
C. Increase the number of perfectly correlated items and/or reduce the number of people in the normative sample
D. Remove the items with low reliabilities and/or add more items

9. According to the method proposed by C.H. Lawshe, you achieve a zero CVR for a particular item when:
A. More experts rate the item not essential
B. More experts rate the item essential
C. Exactly half the experts rate the item essential
D. There is no such thing as a zero CVR

10. Your group wanted to see how people’s scores on a work performance scale would reflect their scores on an upcoming evaluation. Your group administered the test to a group of workers and got their distribution of scores. Weeks later, you asked their supervisors how they fared on the job. Which of the following statements can be inferred about the given situation?