Psychometrics Test Development & Evaluation

Test conceptualization - test developer identifies the skills and the knowledge that the test is
intended to measure and determines the appropriate format and structure of the test
Test construction - test developer begins to create the test items
Test formats:
● Selected-response format
○ Multiple choice
○ True or false
○ Matching items
● Constructed-response format
○ Completion
○ Essay
Constructing test items - the items created revolve around the construct that the
developer aims to measure; item content is drawn from interviews with experts and a literature review
Test tryout - the test is administered to a sample of test-takers to evaluate the quality of the test
items; this allows the developer to identify any potential problems with the test
Item analysis - developer analyzes the test items to determine quality and effectiveness;
involves evaluating the difficulty, discrimination, and reliability of each item
Test revision - developer makes revisions to the test items to improve their quality and
effectiveness based on the results of item analysis
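The item analysis step above can be illustrated with a minimal sketch. It assumes dichotomously scored items (1 = correct, 0 = incorrect) and uses the upper-lower group method for discrimination; the tryout data, the function names, and the 50% group split are assumptions made for illustration only (larger samples often use the top and bottom 27%).

```python
# Hypothetical tryout data: rows are test-takers, columns are items
# (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]


def item_difficulty(responses, item):
    """Proportion of test-takers who answered the item correctly."""
    return sum(row[item] for row in responses) / len(responses)


def item_discrimination(responses, item, fraction=0.5):
    """Upper-lower index: proportion correct in the high-scoring group
    minus proportion correct in the low-scoring group."""
    # Rank test-takers by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    n = max(1, int(len(ranked) * fraction))
    upper, lower = ranked[:n], ranked[-n:]
    p_upper = sum(row[item] for row in upper) / n
    p_lower = sum(row[item] for row in lower) / n
    return p_upper - p_lower


for i in range(len(responses[0])):
    print(i, round(item_difficulty(responses, i), 2),
          round(item_discrimination(responses, i), 2))
```

A difficulty near 1 means almost everyone answered correctly (an easy item); a discrimination index near 0 or below suggests the item does not separate high scorers from low scorers and is a candidate for revision.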
What makes a good test?
- Measures what it claims to measure consistently (reliability)
- Measures what it claims to measure (validity)
- More effective decisions can be made
Psychometric Properties of a Test
- Reliability
- Validity
Factors affecting reliability
- Construction of test items or questions
- Administration
- Scoring
- Environmental factors
- Test-taker
What to look for in a test manual
- Validity
- Importance of norming – “the process of constructing norms or the typical performance
of a group of individuals on a psychological or achievement assessment”
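To show how norms are used once they exist, here is a minimal sketch that interprets one raw score against a norm group. The norm-group scores and the function names are hypothetical, and the mid-rank percentile formula is one common convention, not the only one.

```python
from statistics import mean, stdev

# Hypothetical raw scores from a norm group.
norm_scores = [42, 45, 47, 50, 50, 52, 55, 58, 60, 61]


def z_score(raw, norms):
    """Distance of a raw score from the norm-group mean, in
    standard-deviation units (sample standard deviation)."""
    return (raw - mean(norms)) / stdev(norms)


def percentile_rank(raw, norms):
    """Percentage of the norm group scoring below the raw score,
    counting half of any ties (mid-rank convention)."""
    below = sum(1 for s in norms if s < raw)
    equal = sum(1 for s in norms if s == raw)
    return 100 * (below + 0.5 * equal) / len(norms)


print(z_score(58, norm_scores))
print(percentile_rank(58, norm_scores))
```

The same raw score can mean very different things depending on the norm group, which is why the manual should describe who the norm group was.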
Types of reliability
- Test-retest reliability - a measure of reliability obtained by administering the same
test twice over a period of time to a group of individuals
- Alternate/parallel form reliability - obtained by administering different versions of
an assessment tool to the same group of individuals
- Internal consistency - a measure based on the correlations between different items
on the same test
- Inter-rater reliability - refers to the extent to which two or more individuals (raters or
scorers) agree when rating or scoring the same performance
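Each type of reliability above is usually summarized as a coefficient. The sketch below shows one common choice for each: Pearson's r for test-retest (the same calculation applies to two alternate forms), Cronbach's alpha for internal consistency, and Cohen's kappa for agreement between two raters. All data and function names are hypothetical, and other coefficients could be used instead.

```python
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two lists of paired scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cronbach_alpha(item_scores):
    """Cronbach's alpha; rows are test-takers, columns are items."""
    k = len(item_scores[0])
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def cohen_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    expected = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                   for c in categories)
    return (observed - expected) / (1 - expected)


# Test-retest: the same group tested on two occasions (hypothetical scores).
time1 = [10, 12, 15, 18, 20]
time2 = [11, 13, 14, 19, 21]
print(round(pearson_r(time1, time2), 3))

# Internal consistency: five test-takers answering three rating-scale items.
likert = [[4, 5, 4], [3, 3, 2], [5, 5, 5], [2, 3, 2], [4, 4, 3]]
print(round(cronbach_alpha(likert), 3))

# Inter-rater: two raters classifying the same six performances.
rater_a = ["pass", "pass", "fail", "pass", "fail", "fail"]
rater_b = ["pass", "fail", "fail", "pass", "fail", "pass"]
print(round(cohen_kappa(rater_a, rater_b), 3))
```

Values closer to 1 indicate more consistent measurement; kappa is lower than raw percent agreement because it subtracts the agreement expected by chance.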
Types of Validity/Validity Evidence/Validation Studies
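- Criterion-related validity - evaluates how accurately a test measures the outcome it
was designed to measure, judged by how well test scores correspond to an external criterion
(for example, a later performance measure)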
- Content validity - evaluates how well an instrument (like a test) covers all relevant parts
of the construct it aims to measure
- Construct validity - concerns the extent to which a test or measure accurately
assesses the theoretical construct it is supposed to measure
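Criterion-related validity evidence is often reported as a validity coefficient: the correlation between test scores and an external criterion. The sketch below assumes a hypothetical admissions test and a later grade point average as the criterion; the data and the function name are made up for illustration.

```python
from statistics import mean


def validity_coefficient(test_scores, criterion):
    """Pearson correlation between test scores and a criterion measure."""
    mx, my = mean(test_scores), mean(criterion)
    cov = sum((a - mx) * (b - my) for a, b in zip(test_scores, criterion))
    sx = sum((a - mx) ** 2 for a in test_scores) ** 0.5
    sy = sum((b - my) ** 2 for b in criterion) ** 0.5
    return cov / (sx * sy)


# Hypothetical data: admission test scores and later grade point averages
# for the same five people.
test_scores = [55, 60, 65, 70, 80]
later_gpa = [2.1, 2.4, 2.9, 3.0, 3.6]

print(round(validity_coefficient(test_scores, later_gpa), 3))
```

A coefficient this high would be unusual with real data; the small sample here is chosen only to keep the arithmetic easy to follow.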
Branches of statistics
● Descriptive
○ Measures of central tendency
○ Measures of dispersion
○ Statistical graphs
○ Measures of shape
● Inferential
○ Statistical tests of relationship
○ Statistical tests of difference
○ Statistical tests of prediction
Types of data and variables
● Quantitative
● Qualitative
● Dependent
● Independent
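The descriptive measures listed under Branches of statistics (central tendency, dispersion, shape) can be computed with Python's standard library. This is a minimal sketch on hypothetical scores; the skewness function uses the population (moment-based) formula, which is one of several conventions.

```python
from statistics import mean, median, mode, pstdev

# Hypothetical test scores.
scores = [2, 4, 4, 5, 7, 9, 11]

# Measures of central tendency.
print("mean:", mean(scores))
print("median:", median(scores))
print("mode:", mode(scores))

# Measures of dispersion.
print("range:", max(scores) - min(scores))
print("population standard deviation:", round(pstdev(scores), 3))


def skewness(data):
    """Moment-based skewness: the average cubed z-score.
    Positive values mean a longer tail toward high scores."""
    m, s, n = mean(data), pstdev(data), len(data)
    return sum(((x - m) / s) ** 3 for x in data) / n


# Measure of shape.
print("skewness:", round(skewness(scores), 3))
```

Here the mean is pulled above the median by the few high scores, which matches the positive skewness value.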