Reliability and Validity

Journalism 614:

Reliability and Validity

Criteria of Measurement Quality



How do we judge the relative success (or failure) in measuring various concepts?

– Reliability

• Consistency over time

– Validity

• Reflects the real meaning

Reliability and Validity



Reliability focuses on measurement



Validity is important to measurement too

– Validity also extends to:

• Internal features of the study (Internal Validity)

• Generalizations made from study (External Validity)

Key to Reliable Valid Measures



Precise conceptual and operational definitions of concepts - tight fit

– Conceptual definitions: abstract sense of the idea

– Operational definitions: measuring the concept

Reliability



Consistency of Measurement

– Reproducibility over time, over different indicators, used by different interviewers



Estimates of Reliability

– Statistical coefficients that tell use how consistently we measured something

Four Aspects of Reliability:



1. Stability



2. Reproducibility



3. Homogeneity



These three factors = precision

1. Stability



Consistency across time

– repeating a measure at a later time to examine the consistency

– Compare time 1 and time 2

2. Reproducibility



Consistency between observers



Equivalent application of measuring device

– Do observers using the same measuring tools reach the same conclusion?

– If we don ’ t get the same results, what are we measuring?

• Lack of reliability can compromise validity

3. Homogeneity



Consistency between different measures of the same concept

– Different items used to tap a given concept show similar results



Homogeneity of measures:

– 1. Cronbach ’ s Alpha coefficient

– 2. Mean Inter-item Correlation

Indicators of Reliability



Test-retest

– Make measurements more than once and see if they yield the same result



Split-half

– If you have multiple measures of a concept, split items into two scales, which should then be correlated



Cronbach ’ s Alpha or Item-total Correlation

Relationship to Validity



Reliability is a necessary condition for validity - consistency as an indicator



Reliability is not a sufficient condition for validity - consistency does not = accuracy

– E.g., Grocery Scale. Must be consistent to have any hope of being valid, but could still be off the mark (1 lb always measures 1.1 lb.

Not Reliable or Valid

Reliable but Not Valid

Reliable and Valid

Types of Validity



1. Face validity



2. Content validity



3. Pragmatic (criterion) validity

– A. Concurrent validity

– B. Predictive validity



4. Construct validity

– A. Convergent validity

– B. Discriminant validity

Face Validity



Subjective expert judgment about “ what ’ s there ”



Compare each item to conceptual definition

– If not, it should be dropped

– Is the measure valid “ on its face

”

– E.g., Asking about race prejudice by asking people ’ s affinity for ethnic cuisine

Content Validity





If current indicators are insufficient, develop more indicators - cycle of face and content validity



Subjective expert judgment of “ what ’ s not there ”

– Start with conceptual definition and see if all dimensions and traits are represented at the operational level

– Are some over or underrepresented?

Example - Civic Participation questions:

– Did you vote in the last election?

– Do you belong to any advocacy groups?

– Have you ever volunteered in your community?

Pragmatic (Criterion-Related) Validity



Uses empirical evidence to test validity



1. Concurrent validity

– Does the measure predict a pre-existing measure that has been previously deemed to be valid?

• E.g., Does a new version of an IQ test correlate with past versions?



2. Predictive validity

– Does the measure predict the future outcomes it is supposed to predict?:

• E.g., SAT scores: Do they predict college GPA?

Construct Validity



Overall validity encompassing other elements



Do measurements:

– A. Represent all dimensions of the concept

– B. Distinguish concept from other similar concepts



Tied to meaning analysis of the concept

– Specifies the dimensions and indicators to be tested



Assessing construct validity:

– A. Convergent validity

– B. Discriminant validity

Convergent Validity



Convergent validity:

– Measuring the same concept with very different methods

– If different methods yield the same results, than convergent validity is supported

– E.g., Different survey items used to measure decision-making style - closed and open-ended

• Code for decision-making style from open-ended responses

• High score on scale = more compensatory responses

Discriminant (Divergent) Validity



Discriminant validity:

– Ability of measure of a concept to discriminate that concept from other closely related concepts

– E.g., Measuring Maternalism and Altruism as distinct concepts. Might be correlated but not too highly or this is an issue.

Validity & Research Design



Internal

– Controlling for other factors in the design

• Validity of structure, sampling, measures, procedures

• Claims regarding what happened in the study



External

– Looking beyond the design to other cases

• Validity of inferences made from the conclusions

• Claims regarding what happens in the real world

Reliability and Validity

Journalism 614:

Reliability and Validity

Criteria of Measurement Quality