Reliability, Validity, Levels of Measurement, Scales

How we figure out what to measure
• Conceptualization
  – Process of taking a construct and refining it by giving it a conceptual or theoretical definition
  – Research focusing on college students
    • In Ohio? What region? Age? Major?
• Operationalization
  – Links a conceptual definition to a specific set of measurement techniques

Coming up with a measure
• Remember the conceptual definition
• Keep an open mind
• Borrow from others
• Anticipate difficulties
• Don’t forget units of analysis

Empirical Hypothesis
• The degree of association
• How well the operationalized variables are associated (or not) with the conceptual construct determines the hypothesis

Reliability
• Reliability means dependability or consistency
• Same thing occurs over and over under the same conditions
• A scale, for example
• How dependable is the study?
• Is the study consistent, or does it yield widely varying results?
• Can the study be replicated?

Reliability
• Measurement directly affects the quality of conclusions.
• Care is needed to make sure that results are not corrupted by improper measurement.
• The operational definition of a concept should have a precise meaning:
  – The terms by which you measure a concept should be explicit.

Reliability
• Problems with reliability and validity are the biggest threats to proper measurement.
• Reliability is the extent to which an experiment, test, or any measuring procedure yields the same results on repeated trials.
• Do you get the same result every time?
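One way to put a number on "do you get the same result every time?" is to correlate the scores from two trials of the same measure on the same respondents. The sketch below is only an illustration: the scores are invented, and `pearson_r` is a hand-written helper, not part of any particular statistics package.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cross / (spread_x * spread_y)


# Invented scores for five respondents measured twice with the same instrument.
first_trial = [12, 15, 9, 20, 17]
second_trial = [13, 14, 10, 19, 18]

r = pearson_r(first_trial, second_trial)
print(f"correlation between trials: {r:.2f}")
```

A correlation close to 1 suggests the measure ranks respondents consistently across trials; a value near 0 suggests the results are not dependable.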
Reliability
• Three tests of reliability:
  – Test-retest method
    • Applying the same test to the same observations after a period of time and then comparing the results of the different measurements
  – Alternative form method
    • Two different measures of the same concept administered to the same respondents at different times before the scores are compared
  – Split-halves method
    • Divide a multi-item measure into two measures with both of the new measures applied at the same time

Improving Reliability
• Clearly conceptualize constructs
• Increase level of measurement
• Use multiple indicators of a variable
  – Triangulation
• Use pretests, pilot studies, and replication

Validity
• A valid measure is one that measures what it is supposed to measure; in other words, validity is the degree of correspondence between the measure and the concept it is thought to measure.
• Four tests of validity

Validity
• Truthfulness
• Refers to the match between a construct and a measure
• Want it to be valid for a particular purpose and definition
• How good is the measure?
• Is the data measured correctly?
• Is the data analyzed correctly (statistical)?

Internal Validity
• Are there errors as a result of the internal design of the study?
• Are there errors as a result of the controls?
• Internal validity problems can occur from a flawed survey along with a multitude of other factors

External Validity
• Can your experiment’s findings be generalized?
• External validity questions are evident in every study; however, methods exist to keep external validity high and the number of external flaws low

Types of Validity
• Face validity
  – Judgment that the indicator really measures the construct
• Content validity
  – Does your measure represent the full content of a definition?
• Criterion validity
  – Use some standard or criterion to indicate a construct accurately
• Concurrent validity
  – Indicator must be associated with a preexisting indicator judged to be valid
• Predictive validity
  – Indicator predicts future events that are logically related to a construct

Types of Validity
1) If I create a new test of mathematical ability for high school students and test it by having high school math teachers look at it and tell me if it seems appropriate, I am measuring for __________ validity.
2) If I am examining an individual's ability to cope with stress and have three attributes I am particularly interested in and I am checking to see if my construct hits on all three attributes, I am measuring for __________ validity.
3) If I create a new test for cognitive recognition and students who score high on it also score high on previously existing tests for cognitive recognition, I have demonstrated __________ validity.
4) If I compare my measure for testing the potential to suffer from childhood diabetes with a previously used test, I am looking for __________ validity.
5) If I create a new test of intelligence and students who score high on it also do better in college than those who score low, I have shown __________ validity.

Validity
• Tests of validity are less conclusive than tests of reliability.
• Reliability is easy to demonstrate through some form of repeated trials.
• Validity is more difficult because we can never be sure about the true value of a concept:
  – Especially true with abstract concepts

Validity
• Whereas a valid measure is reliable (because if truly valid, it will measure the concept correctly every time), a reliable measure is not necessarily valid.
• The measure could be measuring the concept incorrectly in a consistent way.
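The idea that a measure can be consistently wrong can be illustrated with a bathroom scale. The sketch below uses invented readings and assumes the true weight is known, which is rarely the case for abstract concepts. Spread across repeated readings stands in for reliability, and distance from the true value stands in for validity.

```python
from statistics import mean, pstdev

# Assumption for illustration only: the true weight is known to be 150.0.
TRUE_WEIGHT = 150.0

# Invented repeated readings from two hypothetical scales.
well_calibrated = [149.8, 150.1, 150.0, 150.2, 149.9]
miscalibrated = [154.9, 155.0, 155.1, 155.0, 155.0]


def spread(readings):
    """Standard deviation of repeated readings; small means consistent."""
    return pstdev(readings)


def bias(readings, true_value):
    """Average distance from the true value; near zero means on target."""
    return mean(readings) - true_value


for name, readings in [("well calibrated", well_calibrated),
                       ("miscalibrated", miscalibrated)]:
    print(f"{name}: spread={spread(readings):.2f}, "
          f"bias={bias(readings, TRUE_WEIGHT):+.2f}")
```

Both scales give tightly clustered readings, so both look reliable, but the second is off by about five units every time: reliable without being valid.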
Relationship between reliability and validity

Levels of Measurement
• The level of measurement of a variable describes
  – The amount of precision associated with a variable
  – The mathematical properties of the variable
• Both precision and mathematical properties increase as you increase the level of measurement from nominal to ratio.

Levels of Measurement
• Continuous v. discrete variables
  – Continuous
    • Have an infinite number of values or attributes that flow directly along a continuum
    • Temperature, age, income, crime rate
  – Discrete
    • Relatively fixed set of separate values or attributes
    • Gender, religion, marital status

Nominal Level
• Only reports a difference
• Candidate preference, religious preference, Yes/No, etc.
• Discrete variables

Ordinal Level
• Rank ordered
• Grades, opinion
  – Strongly Agree
  – Agree
  – Disagree
  – Strongly Disagree

Levels of Measurement
• At the ordinal level, categories may be ranked in order in addition to indicating a difference between categories.
• Example: Please indicate the highest level of education you reached (elementary school, high school, college, more).
• Precision: A little more precision and can be used with more statistical tools

Interval/Ratio Level
• A specified distance between values
• Interval does not contain a true zero point (ratio does)
• Interval: IQ, SAT
• Ratio: years of school, income

Levels of Measurement
• The interval level includes all of the information of the preceding levels and adds meaningful intervals between values of the variable but does not have a meaningful zero.
• Example: What did you score on the SAT?
• Precision: More precision and can be used with most statistical tools

Levels of Measurement
• The ratio level adds a meaningful zero to the interval level.
• Example: How many years of education?
• Precision: Most precision and can be used with most statistical tools

Scales
• Some concepts can be captured with a single question.
• More complex concepts may require a multi-item measure consisting of several questions that capture different components of the concept and increase validity.

Scales
• Summation index:
  – Combines the scores on multiple questions to create one single measure of a concept
• Likert scale:
  – Uses only select questions from an index that differentiate between different respondents to create a single score for each respondent

Scales
• Guttman scale:
  – Has answer choices arranged in an ordinal manner; respondents will agree with each of the lower-ranked answers if they agree with a higher-ranked answer
• Factor analysis:
  – Allows researchers to uncover patterns across related measures to create summary variables that represent different dimensions of the same concept

Mutually Exclusive
• “One and only.”
• A case may fit the criteria of only one category
• Ex: Religion: Christian, non-Christian, Jewish, Buddhist
• NOT MUTUALLY EXCLUSIVE (a Jewish or Buddhist respondent also fits non-Christian)

Exhaustive
• Every case fits into some category
• Ex: If the election were today, would you support Sherrod Brown, the Democrat, or Mike DeWine, the Republican, or neither?
• NOT EXHAUSTIVE

Missing Data
• No survey is perfect, and certain questions will be left unanswered or completely skipped
• Remedy?
• A “catch all” category and/or a way to factor out missing data
• Yet, missing data can still mislead a study

One Last Thing
• Feeling Thermometers
• Likert Scales
• Response set problem
  – How do we fix this???