Research Method

Step 1 – Formulate research question
Step 2 – Operationalize concepts
◦ Valid and reliable indicators
Step 3 – Decide on sampling technique
◦ Draw sample
Step 4 – Select data collection technique
◦ Collect data
Step 5 – Analyze data
Step 6 – Write up the report

It is critically important to develop valid and reliable measurements/indicators

If your measurements/indicators are not valid and reliable, then you are wasting your time.
When you input your data, never forget: if you put trash into the computer, you will get trash out, no matter how sophisticated your analysis.

What is a Valid and Reliable Measurement?

Validity
◦ Refers to the relationship between a concept and its indicator
◦ Is the indicator an accurate measurement of the concept?
Reliability
◦ Refers to consistency across time and place
◦ Do you get consistent or the same results when the indicator is used in different, but comparable, times and/or places?
◦ NOTE – If you don’t get consistent results, then it could be that:
   ◦ Measurement is ambiguous or faulty
      ◦ Questions are double barreled or confusing, even in the same setting
      ◦ Example – “Do you agree with the following statement: Men and Women are Good Communicators?” is a double barreled statement (it asks about two groups in a single item)
   ◦ Situation is different and the measurement doesn’t hold across these situations
      ◦ Terms have different meanings in different subcultures
      ◦ Example – “Do you think it’s BAD to get a tattoo?” is a statement that means something entirely different to teenagers than to their parents

There are four types of validity to consider

Face Validity
◦ Does the indicator “obviously” measure the concept? Is it a “sensible” indicator?
Content Validity
◦ Does the indicator cover the entire range of meaning of the concept? If the concept is multi-dimensional, then is the indicator multi-dimensional?
Construct Validity
◦ Is the indicator related to other indicators as specified by the literature?
Criterion Related or Predictive Validity
◦ If the concept is supposed to predict a future event, then does the indicator predict that same future event accurately?

A more detailed look at face validity

Face Validity
◦ Indicator is a sensible or obvious measurement of the concept.
◦ If the concept is a type of behavior, then the indicator should measure behavior; if it is an attitude, the indicator should measure an attitude.
   ◦ Common Mistake – Using the number of workers of color hired by a company as a measure of prejudice. Hiring is a behavior. Prejudice is an attitude. This would measure discrimination, not prejudice.
◦ If the concept is value laden, then we must take social desirability into account when constructing a measure.
   ◦ Common Mistake – Measuring crime by asking people if they have committed a crime. Few people want to admit this.

A more detailed look at content validity

The indicator must cover the entire range of meaning of the concept.
Examples
◦ If you measure attitudes toward a workshop, you must ask multiple questions to cover the multiple aspects of the workshop (e.g., quality of handouts, quality of presentations, relevancy of information, etc.)
◦ If you measure social class (a multi-dimensional concept), you must measure income, occupation and education
◦ If you measure prejudice, you must either think about and measure all of the different types of prejudice (e.g., racial, religious, social class prejudice) or limit yourself to one type and indicate that when you discuss your concept

A more detailed look at construct validity

Indicator must be related to other indicators and/or concepts as determined by past research reported in the literature.

Theoretical Construct Validity
◦ Indicator is related to other concepts/indicators as specified by a theory
◦ Example – As predicted by theory, your indicator of poverty is related to whether or not respondents live in a single-parent household.
Discriminant Validity
◦ Indicator is related to other indicators, measurements or behaviors as predicted by the literature or past research.
◦ Example – As predicted in the literature, your volunteers are happier when they have some “voice” in the decisions that are made. Your measurements of happiness and decision-making power are related as they should be.

Convergent Validity
◦ Indicator is related to data collected using other data collection methods as predicted (multi-methods)
◦ Example – Children who attend your workshops and “appear” to be happier when observed also score higher on a happiness measurement.

Known Groups Validity
◦ Indicator is related to groups with known characteristics as expected.
◦ Example – KKK members score higher on a prejudice index than members of the civil rights movement.

Factor Validity
◦ Indicator is related to other items in the same subscale more strongly than to items in a different subscale
◦ Example – The CES-D scale measures 4 components of depression (negative affect, lack of positive affect, somatic symptoms and interpersonal). Each of these components is measured by several items/statements that form a subscale. To have factor validity, a single item/statement must be more strongly related to other items in that subscale than to items in another subscale. For instance, in the negative affect subscale there are items measuring feeling blue, feeling sad and feeling depressed. These items are more strongly related to each other than to items in the somatic symptoms subscale (e.g., overeating, difficulty concentrating, sleeping too much). You would use a factor analysis to determine this.
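
The factor validity idea above can be illustrated with a rough check. The slides recommend a factor analysis; the sketch below is a simpler stand-in that only compares an item's average correlation with items in its own subscale against its average correlation with items in the other subscale. The item names follow the example above, but the scores, the `SCORES` and `SUBSCALES` tables, and the helper functions are all hypothetical and made up for illustration; they are not real CES-D data.

```python
from math import sqrt

# Hypothetical item scores (0-3) for eight respondents.
# Illustrative only: these are NOT real CES-D responses.
SCORES = {
    "blue":          [0, 1, 2, 3, 0, 1, 2, 3],
    "sad":           [0, 1, 3, 3, 0, 1, 2, 2],
    "depressed":     [1, 1, 2, 3, 0, 0, 2, 3],
    "overeating":    [0, 3, 3, 0, 3, 0, 0, 3],
    "concentrating": [1, 3, 2, 0, 3, 0, 1, 3],
    "sleeping":      [0, 2, 3, 1, 3, 0, 0, 2],
}

# Assumed assignment of items to two subscales, following the example above.
SUBSCALES = {
    "negative_affect": ["blue", "sad", "depressed"],
    "somatic": ["overeating", "concentrating", "sleeping"],
}


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def mean_abs_r(item, others):
    """Average absolute correlation between one item and a list of other items."""
    return sum(abs(pearson(SCORES[item], SCORES[o])) for o in others) / len(others)


def within_and_cross(item):
    """Return (within-subscale, cross-subscale) average correlations for an item."""
    own = next(items for items in SUBSCALES.values() if item in items)
    within = [i for i in own if i != item]
    cross = [i for items in SUBSCALES.values() if items is not own for i in items]
    return mean_abs_r(item, within), mean_abs_r(item, cross)


for item in SCORES:
    within, cross = within_and_cross(item)
    print(f"{item:14s} within={within:.2f} cross={cross:.2f}")
```

An item whose within-subscale value is clearly larger than its cross-subscale value behaves the way factor validity requires. This comparison is only a quick screen; a full factor analysis, as the slides recommend, is still the appropriate test.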
A more detailed look at criterion, concurrent or predictive validity

Criterion Related Validity
◦ Scores on one indicator can be used to predict scores on another.
◦ Example – Scores on a marital happiness scale can predict scores on a personal happiness scale.
Concurrent Validity
◦ Scores on your indicator can be used to predict current behavior.
◦ Example – SAT/ACT scores are related to current performance in school (GPA)
Predictive Validity
◦ Indicator can be used to predict future events.
◦ Example – SAT/ACT scores are related to performance in college (GPA)

Reliability

Reliability refers to consistency across time and place. An indicator can be reliable (provide consistent results) but NOT valid (accurate). It can provide consistently WRONG answers.
There are different ways to measure reliability. They include:
◦ Test/retest
◦ Internal consistency
◦ Using alternative forms
◦ Inter-rater reliability
◦ Intra-rater reliability

Reliability – consistency of indicators

Test/retest
◦ Subjects provide the same answers to the same items at different times. Individuals should score the same each time.
Internal consistency
◦ Scale items are highly correlated/associated with each other
◦ Use Cronbach’s alpha to determine this.
Alternative forms
◦ Use slightly different forms – see the example below
Inter-rater reliability
◦ Two or more researchers get the same results
Intra-rater reliability
◦ The same researcher gets similar results across time

Using different ways of asking the question should yield the same answers

I liked the workshop presentation            SD 1   D 2   A 3   SA 4
I like the workshop presentation             SA 1   A 2   D 3   SD 4
I did not like the workshop presentation     SD 1   D 2   A 3   SA 4
I liked the presentation                     SD 1   D 2   U 3   A 4   SA 5

Relationship between Validity and Reliability

Definition of terms
◦ Validity – accuracy
◦ Reliability – consistency
Relationships
◦ If it is valid (accurate), then it is reliable (consistent)
◦ BUT if it is reliable, it may not be valid; it could be consistently WRONG

Examples – Bathroom Scales

Valid
◦ Scales provide an accurate measurement of weight
◦ As long as you don’t gain or lose weight, they will also provide consistent readings
Reliable
◦ Scales provide a consistent measurement of weight
◦ BUT if you have not calibrated the scales accurately, they may be consistently wrong

Questions or comments? Please contact:
Carol Albrecht
Assessment Specialist
USU Extension
979-777-2421
carol.albrecht@usu.edu