Workshop 1: Reliability, Validity & Introduction to Modeling

Introduction

Social science research is about formulating questions we have about the world and fitting the answers into a broader class of explanations (a theory) that can account for causal relationships. This process involves moving from research question to research hypothesis, then to data collection and analysis, and finally to reflection and revision.

[Figure: the research process. Brainstorming, Research Question, Specification, Research Hypothesis, Operationalization, Data Collection & Analysis, Evaluation, Reflection & Revision]

Throughout this worksheet you will find terms that are useful for considering this process in a general sense and also as you develop your own proposal for this course.

The Research Question

The first step in research is finding a topic (or topics) of interest. Luckily for us, the political world is full of interesting phenomena just waiting to be discovered. Despite the abundance of interesting trends, nailing down a good research question is actually quite difficult. Students coming into this course often make the mistake of equating an interest with a research question. These are not the same.

There are no hard and fast rules for finding a research question. Most research questions originate from our observations of the world. Why do people vote? Why do some bills pass while others fail? Why do some members of Congress get reelected while others don’t? These are all the beginnings of research questions. Past Fellows’ projects have included such topics as interest group influence in Congressional hearings, policy consequences of income inequality, and Supreme Court agenda setting and public opinion, among many others. Two such projects are appended to this handout. Visit the Center for American Politics and Public Policy’s website for more research ideas and details of the projects described here.

The Research Hypothesis

The research hypothesis extends the question we developed earlier but becomes more precise.
In this stage of the research process, the objective is model specification. Here we identify our key variables and hypothesize relationships between them.

There are several different levels of variables. Constructs exist at the level of theory and are usually precisely defined ideas that form the bedrock of research. Constructs are unseen (and therefore not directly measurable) phenomena. For example, agenda space is a construct that can take on different meanings depending on the author. Agenda space must first be defined theoretically before the researcher can begin to quantify (or qualify) this variable.

The process of operationalization links our construct to the observable world. While agenda space is an abstract object, the number of congressional hearings per year is concrete and provides us with one way to measure agenda space. Agenda space could be operationalized in any number of ways, and the variable that the researcher ultimately chooses depends upon the type of research, the theoretical orientation of the researcher, ease of measurement, and many other factors.

At this stage, it is important to distinguish between three types of measurement: nominal, ordinal and interval. Nominal data is made up of mutually exclusive categories and includes eye color, race, and gender. Ordinal data is still categorical, but it gives us information about the ordering of the data. Likert scales [1] are ordinal-level data. Unfortunately, ordinal data does not give us information about the distance between different categories. In other words, the size of the difference between self-identifying as a strong Republican and as a moderate Republican is unclear. Interval data is the preferred level in most cases because it allows us to perform more robust analyses. Interval data tells us both the order of values and the distance between them, so we can make more meaningful comparisons between different levels of the variable.
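To make the three levels of measurement concrete, here is a minimal Python sketch showing which summary statistic is defensible at each level. All of the data, the four-point agreement scale, and the function names are hypothetical and invented for illustration. Annual hearing counts are treated here as interval-level data, following the agenda space example above.

```python
from statistics import mean, median_low, mode

# Hypothetical survey and archival data, invented for illustration only.
party = ["Democrat", "Republican", "Independent", "Democrat", "Democrat"]

# An assumed four-point agreement scale, listed from lowest to highest.
SCALE = ["Strongly Disagree", "Disagree", "Agree", "Strongly Agree"]
responses = ["Agree", "Strongly Agree", "Disagree", "Agree", "Strongly Disagree"]

# Congressional hearings per year (made-up counts).
hearings_per_year = [120, 135, 150, 110, 160]


def summarize_nominal(values):
    """Nominal: categories have no order, so report the most common one."""
    return mode(values)


def summarize_ordinal(values, scale):
    """Ordinal: categories can be ranked, so the median category is meaningful.

    The ranks are used only for ordering; the distance between them is unknown.
    """
    ranks = [scale.index(v) for v in values]
    return scale[median_low(ranks)]


def summarize_interval(values):
    """Interval: distances between values are meaningful, so a mean makes sense."""
    return mean(values)


if __name__ == "__main__":
    print("Most common party:", summarize_nominal(party))
    print("Median response:", summarize_ordinal(responses, SCALE))
    print("Mean hearings per year:", summarize_interval(hearings_per_year))
```

Taking a mean of the ordinal ranks would quietly assume equal spacing between categories, which is exactly the information ordinal data lacks.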
The reliability and validity of our measures are very important topics in research design. Reliability is the extent to which the measure is consistent across space and time. Validity refers to the extent to which the measure actually measures what we think it is measuring. For example, the Scholastic Aptitude Test (SAT) may be a reliable measure of student achievement (if you score a 1200 today, tomorrow you will likely score close to 1200, all other things being equal), but it may not have validity (the test may not really measure student achievement at all, but some other variable). It is possible for a measure to be reliable and not valid, but not the other way around: valid measures are also reliable. Reliability can have important effects on statistical analysis because poor reliability makes it difficult to lift subtle effects from the data. Not all scholars agree about the importance of each of these measurement characteristics, but both play a crucial role in the research process at all levels. The Policy Agendas Project grew out of an effort to create valid, reliable measures of important American political activity.

[1] Likert scales are common throughout the behavioral and social sciences and include responses phrased as follows: Strongly Disagree – Mildly Disagree – Disagree – Mildly Agree – Strongly Agree. Likert scales can be used for questions on political ideology and affiliation and policy preference, among other things.

Data Collection & Analysis

Once we narrow and define our research question to a few specific constructs and operationalize our variables, it is time to collect the data and begin our analysis. Data collection may proceed in a number of ways. With the advent of the Internet, more researchers are releasing data on the web. However, you should always scrutinize the process by which the data was collected and coded, because not all data is created equal. Pay attention to inter-coder reliability checks and other measures of data quality.
Bad data equals bad results, always! For the purposes of this course, most of you will be using pre-existing data sources (primarily from the Policy Agendas Project). Talk with faculty and graduate students to get ideas for other data sources.

Once collected, data is analyzed according to the hypotheses developed at the beginning of this process. Here, whether in qualitative or quantitative research, the issues of extraneous and confounding variables become important. An extraneous variable is a factor that has not been controlled for in the study but does not systematically vary with the independent variable. The most common extraneous variable is the random measurement error found in most data. A confound, on the other hand, is a variable that systematically varies with the independent variable, potentially tainting the results. Both types of variables are similar in that they are external to the research project we have designed. We need to control extraneous and confounding variables as much as possible because they affect the validity of our results. When a study has controlled both, it is said to have high internal validity. Threats to internal validity include history, instrumentation, regression to the mean, and selection, among others. Because the world we study as social scientists is difficult to examine in a laboratory setting, we will never be able to obtain full control over extraneous and confounding variables, but most good research attempts to control external factors through a variety of techniques beyond the scope of this class.

Quantitative analysis, the focus of this course, will usually proceed down one of two related paths: correlation and regression (there are many other advanced techniques that researchers use, but they will not be discussed here). Correlation is a type of analysis that tells us how related two variables are.
It does not demonstrate cause-and-effect relationships but instead tells us how closely two or more variables move together. Regression [2] adds predictive capacity to the researcher’s toolbox and allows us to specify hypothesized causal relationships. Causal relationships, however, cannot be demonstrated with most regression analyses; more sophisticated tools are needed. We’ll come back to correlation and regression in Workshop 2.

[2] There are many different kinds of regression. For the purposes of your work here, regression refers to linear regression.

Reflection & Revision

So, we’ve brainstormed a question, defined and operationalized our variables, and collected and analyzed data. Now what? What follows is arguably the most important stage in research: interpretation of the results and, if necessary, revision of our theory. How conclusive are the results? What caveats, if any, are necessary? Are the results generalizable to other places, times, cultures, and people (external validity [3])? Have you learned something new about the world? These and other questions are to be asked at the conclusion of a research study. The answers to these questions lead to the formulation of new questions and hypotheses, which, ironically, leads us full circle to where we started.

[3] Threats to external validity include sampling bias, experimental arrangements and others.
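As a preview of the correlation and regression discussed under Data Collection & Analysis, the following minimal Python sketch computes a Pearson correlation coefficient and fits a simple linear regression line by ordinary least squares. The data (annual hearings and laws passed) and the function names are hypothetical and invented for illustration. This is one simple way to carry out the two analyses, not a prescribed method for the course.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: how closely x and y move together (-1 to 1)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hearings held and laws passed in five years.
hearings = [100, 110, 120, 130, 140]
laws = [20, 23, 24, 27, 31]

if __name__ == "__main__":
    r = pearson_r(hearings, laws)
    slope, intercept = fit_line(hearings, laws)
    print(f"correlation r = {r:.3f}")
    print(f"fitted line: laws = {intercept:.2f} + {slope:.2f} * hearings")
    # Regression adds prediction: the expected value of y for a new x.
    print(f"predicted laws at 150 hearings: {intercept + slope * 150:.1f}")
```

A strong correlation and a well-fitting line describe an association in the data. Neither rules out a confounding variable, so neither establishes that more hearings cause more laws.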
Glossary

Confounding Variables: two or more explanatory variables are confounded when their effects on a response variable cannot be distinguished from each other

Construct: an abstract idea or concept, theoretical in nature

Extraneous Variable: any variable other than the identified explanatory variables that affects the independent variable in a non-systematic or random way; for systematic effects see confounding variables

External Validity: the extent to which a study’s results can be generalized to other circumstances (times, places, and persons)

Internal Validity: the extent to which a study controls for variables (confounding or extraneous) that may affect the explanatory variable of interest

Measurement Validity: sometimes known as construct validity; reflects how well an operational variable measures our chosen construct

Measurement Reliability: the extent to which a measure is consistent

Model Specification: formal development of a model in a statement or equation, based on data analysis and past theoretical developments

Operationalization: the process of converting concepts into observable behaviors that a researcher can measure

Variable: an attribute that varies across observations; examples include height, personality traits, and systems of government