Measurement

The systematic application of pre-established rules (or standards) for assigning numbers (or scores) to the attributes or traits of an individual.

What is the basic purpose of a test for personnel selection?

What does a score of 75 represent on a test?

What question do you commonly ask when you get back your score on a test?

[Figure: Positively skewed and negatively skewed distributions; horizontal axis shows test scores from 40 to 100]

Normal Curve

[Figure: Normal curve, with the horizontal axis marked in standard deviations from -4 to +4 around the mean]

Central Tendency
a) Mode (most frequent score)
b) Mean (average score; ΣX/N)
c) Median (midpoint of scores)

Variability (spread in scores)
a) Range (lowest to highest score)
b) Standard Deviation
c) Variance
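As a minimal sketch (with hypothetical scores), these central tendency and variability statistics can be computed with Python's standard library:

```python
# A minimal sketch (hypothetical scores) of the central tendency
# and variability statistics above, using the standard library.
import statistics

scores = [40, 45, 55, 60, 70, 75, 75, 80, 90, 100]

# Central tendency
mode = statistics.mode(scores)      # most frequent score (75)
mean = statistics.mean(scores)      # sum of scores / N
median = statistics.median(scores)  # midpoint of the ordered scores

# Variability
score_range = max(scores) - min(scores)  # lowest to highest score (60)
sd = statistics.stdev(scores)            # sample standard deviation
variance = statistics.variance(scores)   # sample variance (SD squared)

print(mode, mean, median, score_range, sd, variance)
```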

Relationships Among Different Types of Test Scores in a Normal Distribution

[Figure: Normal curve showing the percentage of cases falling within each standard-deviation band: 0.13%, 2.14%, 13.59%, 34.13%, 34.13%, 13.59%, 2.14%, 0.13%]

Equivalent scores at each point along the curve:

Z score: -4, -3, -2, -1, 0, +1, +2, +3, +4
T score: 10, 20, 30, 40, 50, 60, 70, 80, 90
CEEB score: 200, 300, 400, 500, 600, 700, 800
Deviation IQ (SD = 15): 55, 70, 85, 100, 115, 130, 145
Stanine: 1 through 9, containing 4%, 7%, 12%, 17%, 20%, 17%, 12%, 7%, and 4% of cases
Percentile: 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 100
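Every scale in the figure is simply the z score rescaled to a new mean and standard deviation (score = mean + SD × z). A small sketch of these relationships, using the conventional means and SDs shown above:

```python
# A sketch of the linear relationships in the figure: every scale
# is the z score rescaled to a new mean and SD (score = mean + SD * z).
def t_score(z: float) -> float:
    return 50 + 10 * z        # T score: mean 50, SD 10

def ceeb_score(z: float) -> float:
    return 500 + 100 * z      # CEEB score: mean 500, SD 100

def deviation_iq(z: float) -> float:
    return 100 + 15 * z       # Deviation IQ: mean 100, SD 15

z = 1.0  # one standard deviation above the mean
print(t_score(z), ceeb_score(z), deviation_iq(z))  # 60.0 600.0 115.0
```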

Ernest O'Boyle and Herman Aguinis have argued that individual job performance often follows a power law rather than a normal distribution. Power law distributions are typified by unstable means, infinite variance, and a greater proportion of extreme events.

Standard Score Example

Z = (Raw Score − Mean) / Standard Deviation
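For example (hypothetical numbers): a raw score of 75 on a test with a mean of 60 and a standard deviation of 10 yields Z = (75 − 60) / 10 = 1.5, i.e., one and a half standard deviations above the mean.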

~ Key Terms ~

Predictor: A variable that can be manipulated or used to predict scores on the criterion (e.g., an interview score)

Criterion: The variable of interest; the one you are attempting to understand or affect (e.g., a measure of job performance)

Examples:

SAT and ACT scores (predictors) are used to predict success in college (criterion)

Interview scores (predictors) are used to predict performance on the job (criterion)

Predictors (Selection Procedures)

Many types have been used, but most fall into three broad categories:

1. Background information (e.g., biodata, work experience)

2. Interviews

3. Tests (e.g., paper & pencil, work samples, situational judgment, personality)

Criteria (Measures of Job Performance)

1. Objective Production Data (e.g., units produced)

2. Personnel Data (e.g., absenteeism)

3. Judgmental Data (e.g., supervisor ratings)

4. Job or Work Sample Data

5. Training Proficiency Data

Scales of Measurement

1) Nominal -- Indicates categories, classification (e.g., gender, race, yes/no)

Stats: N of cases (e.g., chi-square), mode

2) Ordinal -- Indicates relative position; greater than, less than (e.g., rank ordering: 1st, 2nd, 3rd; percentiles)

Stats: Median, percentiles, order statistics, non-parametric analyses

Does not indicate how much of an attribute one possesses (e.g., all may be low or all may be high)

Does not indicate how far apart the people are with respect to the attribute

3) Interval -- Indicates an absolute judgment on an attribute (equal intervals)

No absolute zero point (a score of 80 is not twice as high as a score of 40)

Stats: Mean, variance, correlation

4) Ratio -- Possesses an absolute zero point (e.g., number of units produced)

All numerical operations can be performed (add, subtract, multiply, divide)
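As an illustrative sketch, the cumulative nature of these scales (each scale permits the statistics of the scales below it) can be captured in code; the mapping below simply restates the list above:

```python
# An illustrative sketch: each scale of measurement permits the
# statistics of the scales below it plus some of its own.
PERMISSIBLE_STATS = {
    "nominal": ["frequency counts", "mode", "chi-square"],
    "ordinal": ["median", "percentiles", "non-parametric tests"],
    "interval": ["mean", "variance", "correlation"],
    "ratio": ["ratios (e.g., twice as many units produced)"],
}
SCALE_ORDER = ["nominal", "ordinal", "interval", "ratio"]

def permissible(scale: str) -> list[str]:
    """All statistics meaningful at the given scale of measurement."""
    stats = []
    for level in SCALE_ORDER[: SCALE_ORDER.index(scale) + 1]:
        stats.extend(PERMISSIBLE_STATS[level])
    return stats

print(permissible("ordinal"))   # no mean: ordinal data lack equal intervals
print(permissible("interval"))  # mean is meaningful, ratios still are not
```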

~ I-O Research ~

Measurement

• Limit collection of categorical and/or dichotomous data

Categorical items:

Age: 0 – 18 / 19 – 25 / 26 – 35 / 36 – 45 / 46 – 55 / 56 – 65 / 85 & Above

Income: 0 – 10,000 / 10,001 – 25,000 / 25,001 – 35,000 / 35,001 – 50,000 / 50,001 – 75,000 / 75,001 – 100,000 / 100,000 & Above

Open-ended (continuous) items:

Age in Years: _______

Income: ____________

~ I-O Research ~

Measurement (cont.)

• Dichotomous data versus interval data

Dichotomous item:

Yes _____ No _____

Interval item (5-point scale):

1 (Highly Disagree) -- 2 -- 3 -- 4 -- 5 (Highly Agree)

~ I-O Research ~

Measurement (cont.)

• Restrict possibility of missing data

Scale questions 1 – 50, with responses to questions #5 and #48 missing

Computed scores for the scale or any subscales containing questions #5 and #48 will also be missing
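A brief sketch of the problem, assuming hypothetical survey data and the pandas library:

```python
# A sketch of the missing-data problem above with hypothetical
# survey responses: a scale score that sums items is itself
# missing whenever any contributing item is missing.
import numpy as np
import pandas as pd

responses = pd.DataFrame({
    "q1": [4, 5], "q2": [3, 4], "q3": [5, 5], "q4": [2, 3],
    "q5": [np.nan, 4],   # respondent 1 skipped question #5
})

# skipna=False propagates the missing item into the scale score
responses["scale_score"] = responses.sum(axis=1, skipna=False)
print(responses["scale_score"])  # NaN for respondent 1
```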

~ I-O Research ~

Interesting fact: A substantial proportion of I-O studies (about 50%) are non-experimental

Overall Point:

It is best for research to be driven by theories and problem-solving approaches, not by methodology/statistics

• Much research effort in I-O focuses on rather trivial questions that can be studied with "fancy" techniques

• The bulk of this research has limited applied significance

~ I-O Research Trends ~

Some Recent Articles in the Journal of Applied Psychology

• Safety in work vehicles: A multilevel study linking safety values and individual predictors to work-related driving crashes.

• Beyond change management: A multilevel investigation of contextual and personal influences on employees' commitment to change.

• The development of collective efficacy in teams: A multilevel and longitudinal perspective.

Multi-level analysis (or hierarchical linear modeling; HLM): Allows variance in outcome variables to be assessed at multiple, hierarchical levels. Related analyses include structural equation modeling and latent class modeling.

[Figure: Nested structure of study variables: employees within work teams (or job categories) within geographic locations (regions, countries)]
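A minimal sketch of a two-level model of this kind (employees nested within work teams) using the statsmodels library; the data file and column names (performance, safety_values, team) are hypothetical:

```python
# A minimal sketch of a two-level (employees within work teams)
# model using statsmodels MixedLM; the file name and columns
# ("performance", "safety_values", "team") are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("employees.csv")  # hypothetical data set

# Fixed effect of safety values, random intercept for each team
model = smf.mixedlm("performance ~ safety_values", data=df, groups=df["team"])
result = model.fit()
print(result.summary())
```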

~ I-O Research Trends ~

Some Recent Articles in the Journal of Applied Psychology (cont.)

• Predicting workplace aggression: A meta-analysis.

• The good, the bad, and the unknown about telecommuting: Meta-analysis of psychological mediators and individual consequences.

Meta-analysis: A statistical approach that allows the combination of results from multiple independent studies on a given topic. It provides a better estimate of the true "effect size," giving more "weight" to larger studies.
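A sketch of that weighting mechanism, using hypothetical effect sizes and sampling variances (a simple fixed-effect, inverse-variance approach; real meta-analyses involve additional corrections):

```python
# A sketch of inverse-variance weighting, the mechanism by which
# a (fixed-effect) meta-analysis weights larger, more precise
# studies more heavily; effect sizes and variances are hypothetical.
effects = [0.30, 0.45, 0.10]      # effect size from each study
variances = [0.02, 0.08, 0.01]    # sampling variance (smaller = larger study)

weights = [1 / v for v in variances]
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
print(round(pooled, 3))  # ~0.188, pulled toward the largest study
```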

~ I-O Research Trends ~

Some Recent Articles in the Journal of Applied Psychology (cont.)

• Abusive supervision and workplace deviance and the moderating effects of negative reciprocity beliefs.

• Emotional exhaustion and job performance: The mediating role of motivation.

Moderating variable (or 3rd variable): A variable that affects the strength and/or direction of the relationship between two variables.

Mediating variable: A variable that accounts for (explains) the relationship between two variables.

Moderator example: Job enrichment strategies → Job satisfaction, with age as the moderator (the relationship may be stronger for older individuals)

Mediator example: Job enrichment strategies → Growth need strength → Job satisfaction (when growth need strength is considered, the relationship between job enrichment and satisfaction goes away)
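As a sketch of how these ideas are commonly tested with regression (file and column names hypothetical): a moderator is examined via an interaction term, and a mediator, following simplified Baron and Kenny logic, by checking whether the predictor's effect shrinks once the mediator is added:

```python
# A sketch of testing a moderator and a mediator with ordinary
# regression using statsmodels; the data file and column names
# (satisfaction, enrichment, age, growth_need) are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("survey.csv")  # hypothetical data set

# Moderation: a significant enrichment:age interaction means age
# changes the strength of the enrichment -> satisfaction relationship.
moderation = smf.ols("satisfaction ~ enrichment * age", data=df).fit()
print(moderation.summary())

# Mediation (simplified Baron & Kenny logic): does the enrichment
# coefficient shrink toward zero once the mediator enters the model?
direct = smf.ols("satisfaction ~ enrichment", data=df).fit()
with_mediator = smf.ols("satisfaction ~ enrichment + growth_need", data=df).fit()
print(direct.params["enrichment"], with_mediator.params["enrichment"])
```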

~ I-O Research (cont.) ~

Suggestions

1) More use of “ archival ” data (many are of high quality with large sample sizes; e.g., government statistics on unemployment rates)

2) Longitudinal studies (assessment of change over time)

3) Report confidence intervals and effect sizes in addition to significance levels (e.g., p < .01)
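A short sketch of suggestion #3, computing an effect size (Cohen's d) and a 95% confidence interval for a mean difference with hypothetical data:

```python
# A sketch of reporting an effect size and a confidence interval
# alongside the significance test; the two groups are hypothetical.
import numpy as np
from scipy import stats

group1 = np.array([72, 75, 80, 68, 77, 74])
group2 = np.array([65, 70, 66, 71, 63, 69])

t, p = stats.ttest_ind(group1, group2)

# Cohen's d using a pooled standard deviation
n1, n2 = len(group1), len(group2)
pooled_sd = np.sqrt(((n1 - 1) * group1.var(ddof=1) +
                     (n2 - 1) * group2.var(ddof=1)) / (n1 + n2 - 2))
d = (group1.mean() - group2.mean()) / pooled_sd

# 95% confidence interval for the mean difference
diff = group1.mean() - group2.mean()
se = pooled_sd * np.sqrt(1 / n1 + 1 / n2)
ci = stats.t.interval(0.95, df=n1 + n2 - 2, loc=diff, scale=se)

print(f"t = {t:.2f}, p = {p:.3f}, d = {d:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```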

Advantages of using existing measures:

• Use of existing measures usually less expensive and less time-consuming than developing new ones

• If previous research was conducted on these measures, you will have an idea of the reliability, validity and other characteristics of the measures

• Well-developed, existing measures can be superior to what is possible to develop in-house

~ Some Sources of Test Reviews ~

• Mental Measurements Yearbook (MMY): Now in its 18th edition. Provides detailed reviews of hundreds of published tests. A typical MMY review comprises the following sections: a) Test Entry, b) Description, c) Development, d) Technical, e) Commentary, and f) Summary.

• Tests in Print: Includes a detailed compilation of commercially available tests as well as information regarding key aspects of tests (e.g., purpose, cost, appropriate population, administration times).

• Tests: Contains data on thousands of tests used in the fields of psychology, business, and education --- comparable to that contained in Tests in Print.

• Test Critiques: Provides content beyond that of Tests, including psychometric data, practical relevance, and comprehensive reviews by experts.

Evaluating Tests --- Some Questions to Ask

General Information ---

• What is the purpose of the test? What does it intend to measure?

• For what population is the test designed? Is this population relevant to the people who will take your test?

• What special qualifications are needed (if any) to acquire the test?

Test Content and Scoring ---

• How many questions are on the test? What is the format of the test questions? Are the format and number of questions sufficient for your needs? Are the difficulty level and nature of the questions appropriate for your sample?

• What scores are yielded by the test?

• How are test results reported?

• What materials are available to aid in interpreting test results (e.g., narrative summaries, graphs, charts)?

[Figure: Sample Myers-Briggs report]

Evaluating Tests --- Some Questions to Ask (cont.)

Test Administration ---

• How is the test administered (e.g., individually or in groups)?

• Are the test instructions clear?

• Are the scoring procedures clear and appropriate for your use?

• What time constraints (if any) exist in the administration of the test?

• Are the test materials (e.g., test booklet, manual) appropriate for examinees and sufficient for your purposes?

• What special training is needed (if any) to administer the test?

Evaluating Tests --- Some Questions to Ask (cont.)

Norm/Validation Sample ---

• What is the composition of the sample(s) used in validating and establishing norms for the test (e.g., race, age, gender, SES)? Is the composition of the sample relevant to your test takers?

• What data is available for the normative sample (e.g., percentiles, standard scores)?

• How many people were in the validation and normative sample(s)? Is the sample size large enough to yield accurate data and interpretations (e.g., standard error of measurement)?

Evaluating Tests --- Some Questions to Ask (cont.)

Test Reliability ---

• What is the reliability of the test and any subscales? What reliability techniques were used? Were these techniques appropriately applied?

• Are reliability estimates different for any sub-groups?

• Is the reliability of the test high enough for the purpose for which the test is going to be used?
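One common reliability estimate is Cronbach's alpha (internal consistency). A minimal sketch with a hypothetical item-response matrix (rows = respondents, columns = items):

```python
# A sketch of Cronbach's alpha, a common internal-consistency
# reliability estimate; the item-response matrix is hypothetical.
import numpy as np

items = np.array([
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 5, 4],
    [2, 3, 3, 2],
    [4, 4, 5, 4],
])

k = items.shape[1]                         # number of items
item_vars = items.var(axis=0, ddof=1)      # variance of each item
total_var = items.sum(axis=1).var(ddof=1)  # variance of the total score

alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(round(alpha, 2))  # ~0.94 for these data
```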

Evaluating Tests --- Some Questions to Ask (cont.)

Test Validity: Content-Related Validity ---

• What is the domain (e.g., skills, ability, traits) that is being assessed by the test? How was the test content collected and analyzed? What data was used to determine the appropriateness of the test content?

• What was the composition of the SMEs (subject matter experts, if any) used in the selection of test content? How was the data from these experts collected and analyzed?

• Is the content of the test related to the qualifications needed by the sample to which the test is being given?

Evaluating Tests --- Some Questions to Ask (cont.)

Test Validity: Criterion-Related Validity ---

• What criterion measure (e.g., performance measure) was used to assess the concurrent or predictive validity of the test? Why was the criterion chosen?

• What was the size and nature of the sample used in the validation of the test? Does restriction of range exist in the criterion measure(s)?

• What is the criterion-related validity of the test (e.g., the correlation between test and criterion scores)? Is the level of this correlation sufficient for your purposes?
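A minimal sketch of estimating that validity coefficient as the predictor-criterion correlation, with hypothetical data:

```python
# A sketch of a criterion-related validity estimate: the Pearson
# correlation between test (predictor) scores and job performance
# (criterion) scores; all numbers are hypothetical.
from scipy import stats

test_scores = [55, 60, 70, 75, 80, 90]          # predictor
performance = [2.8, 3.0, 3.4, 3.2, 3.9, 4.1]    # criterion (e.g., ratings)

r, p = stats.pearsonr(test_scores, performance)
print(f"validity coefficient r = {r:.2f} (p = {p:.3f})")
```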

Test Validity: Construct Validity ---

• Is the conceptual framework for the test stated clearly? Is the construct, as defined and measured, relevant for your purposes?

Evaluating Tests --- Some Questions to Ask (cont.)

Test/Item Bias ---

1. Were the items analyzed statistically for possible sub-group bias? What method(s) were used? How were items selected for inclusion in the final version of the test?

2. Was the test analyzed for differential validity across groups? How was this analysis conducted?
