PS/CJ 3115 Research Methods Spring 2012 Final Exam Study Guide

The final will have two parts:
Part I will include: 1) Matching, 2) conceptual short answer, 3) SPSS-related short answer.
Part II will include: 4) calculation of probability and standard deviations, and 5) interpreting statistical output.

Note: The second section of the exam (probability/standard deviation) will require the use of a calculator, but you won't need one on Wednesday.

Matching

The matching section will include a selection of words from this list. These all appeared in both lecture and the textbook.

concept
variable
validity
reliability
unit of analysis
measure of central tendency
measure of association
measure of variation
variable
variation
mean
median
levels of measurement
nominal (level of measurement)
ordinal (level of measurement)
ratio (level of measurement)
dependent variable
independent variable
hypothesis
negative relationship (or correlation)
positive relationship (or correlation)
intervening variable
antecedent variable
spurious correlation
control variable
lab experiment
field experiment
sample bias
questioner bias
inferential statistics
descriptive statistics
normal distribution
population
sample
standard deviation
Z-score
P-value
Chi-squared score

Conceptual Short Answer

The questions in this portion of the exam will resemble the short answer questions that appeared in the quizzes. They will draw on material presented in class, principally material that appeared on the PowerPoint slides. I would start studying for this section by making sure you have the right answers for all the quiz questions. Some of them may appear on the exam verbatim; others may be paraphrased.

For example, one lecture included a slide that said: "Measures of Association tell us how strong the relationship between two variables is."
So I will likely ask a question such as this:
"A measure of association tells us _________________________________" or
"What do measures of association tell us: ___________________________"

Another example: I said "Intervening variables: Come between the independent and the dependent variables." So I might ask a question like:
"What do we call variables that come between the independent and dependent variables?" or
"My theory says that warmer weather causes sunburn, because warmer weather causes people to expose more skin, which leads to more sunburn. Identify the Dependent, Independent, and Intervening variables in my theory."

Some specific hints:
I will ask you to define statistical significance.
I will ask short answer questions about the relationship between standard deviation, normal distribution, t-score, and p-value.
I will describe a survey situation and ask you what kind of bias it describes.
I will ask about the different types of research design. Know what they are and the advantages/disadvantages of each.
I will ask about the definitions of reliability and validity.
I will give you some variables and ask you to identify what level of measurement they are at.
I will give you several sets of variables, and ask which measure of association you should use with them (lambda, Somers' D, Pearson's R).

SPSS-related Short Answer

Data Spreadsheets

                      Alabama    Alaska
Violent Crime Rate    426        634
Homicide Rate         5.5        5.6
Rape                  38         85

Be able to:
a. Circle a VARIABLE
b. Draw a box around a VALUE
c. Underline an entire OBSERVATION

You learned the following commands. Make sure you can identify what each one does and when you would want to use it. I will ask short answer questions like: "Sex is a nominal variable. What command would I use to learn the distribution of values for the sex variable in my dataset?
___________________"

Data View / Variable View
Data labels
Recode
Compute new variable
Filter Cases
Frequencies
Descriptives
Independent Samples T-Test
Chi-squared test of independence

I will not ask really detailed questions about the exact command syntax, like "What does the statistics box in the cross-tab window do?"

Calculation of probability and standard deviations

1) Know how to calculate the mean and median of a set of integers.

2) Know the standard deviation formula and how to calculate it for a data set of integers.
Sample 1: 4, 6, 2, 5, 9 (ANSWER: median=5, mean=5.2, SD=2.6)
Sample 2: 3, 8, 12, 5, 2, 10, 3

3) Know how to calculate the probability of multiple events happening.
On a six-sided die, what is the probability that I’ll roll a 5?
What is the probability of rolling ‘boxcars’ (i.e. two sixes) when throwing two dice?
If I draw 5 cards in a row from the top of a full deck, what is the probability they will make a flush (in other words, that they are all the same suit)?
If I flip a coin three times, what is the probability that I’ll get 3 heads in a row?
Before the 2008 election, the website fivethirtyeight.com did some great work on electoral probability. The table here lists estimated probabilities that Obama or McCain would win Midwestern states. What was the estimated probability that Obama would win all four Midwestern states? What is the probability that McCain would win all four?

4) Know how to use the inverse probability in calculations of multiple events.
Calculate the probability that Obama would lose all four states.
If there is a 20% chance of rain every day, what is the probability it will rain at least once this week?
If the Harris Teeter stocking crew is doing its job right, there is a 10% chance of finding a broken egg in any random carton. If I go shopping for eggs three times, and find a broken egg in my carton each time, what is the probability that could have happened if they are doing their job properly?
Is that evidence they are not doing it properly?
A new player comes to ultimate frisbee scrimmages in Boone, and says he played for Ring, a Raleigh club team. If he really was good enough to play for Ring, he should get through games without a single dropped disc 75% of the time. In the first two games he plays, he drops the disc both times. What is the probability of that happening if he really did play for Ring? Should I be suspicious?

Interpreting statistical output

Remember, when I say "interpret your results," you should do 3 things.
1) Tell me the answer. For example: (for a T-test) "On average, men study 2.5 more hours per week than women do," or (for correlation) "People with longer index fingers tend to have longer ear lobes."
2) Tell me whether the connection is statistically significant. For example: "This result is statistically significant," or "I am not confident these sample results apply to the whole population."
3) Give me the evidence. For example: (Pearson's R = .325), (P<.001), or (Lambda=.43, P=.012).

DO NOT GET LITERARY ON ME. I've seen more of these answers than you care to imagine, and I can assure you that no matter how sophisticated you think you are, if you try to creatively mix your parts 1 and 2, you will probably make a mistake and lose points. KEEP IT SIMPLE. If your results show that "poor neighborhoods have more crime than rich neighborhoods," then say that! Don't give me something like "There is a correlation between the level of poverty and the crime rate at the neighborhood level." Answers your grandmother can understand are better than ones she can't.

T-tests

In his survey this semester, Justin discovered that most (70%) of ASU students were willing to drive after drinking. I suppose that matches the reports from people who observed court proceedings for this class and said it was mostly DUIs.
Anyway, when I first read his explanation, I wondered about three possible hypotheses: a) Older people are wiser and less likely to drive after drinking, b) Younger people aren't legally drinking, so they're more cautious and less likely to drive after drinking, or c) age doesn't matter. Here's the result of a T-Test I did. Interpret the results and decide if there is evidence for any of the three hypotheses. Make sure you assess whether the results of Justin's sample can be extrapolated to the whole ASU population. [The dependent variable is "would you drive after drinking," where 1=never, 2=as a last resort, and 3=yes.] (4 points)
SPSS says:
Under-21 average: 1.62, Over-21 average: 1.45
t-score=.547
p-value=.588

Next I wondered if males or females are more likely to drink and drive. After some thought, I decided that males might have been more likely to 50 years ago, but I don't get the sense that current female students show much more self-restraint around alcohol than male students do. Here is the result of a T-Test. Use these results to make a case that: a) Men DUI more, b) Women DUI more, or c) the data doesn't provide good evidence either way. Make sure you use the statistical results to support your position. (4 points)
SPSS says:
Male average: 1.64, Female average: 1.33
t-score=1.740
p-value=0.11

Pearson's R

In class I showed you my collection of data from student course evaluations, and we checked hypotheses about what factors made students think a teacher was effective. Remember, all variables in this data set are rated from 1 (strongly disagree) to 5 (strongly agree).

One possibility is that students respond positively to an enthusiastic professor. When I correlated "Effective Teacher" with "Enthusiastic" I found a Pearson's R of 0.6509, with a p-value of .002. (4 points)
Is the correlation positive or negative?
How strong is the correlation?
Is it statistically significant?
Explain what this correlation (or lack thereof) says about teaching styles to your grandma.

Among teachers, there is a perennial debate about whether students give better evaluations for professors who go easy on them or whether ease doesn't matter. To test this, I did a correlation between "Teacher is Effective" and "Course is Difficult." I found a Pearson's R of .1043, with a p-value of .11. (4 points)
Dr. Koch says that students respond positively to difficult classes. If he is right, would Pearson's R be positive or negative?
How strong is the correlation?
Is it statistically significant?
Explain this correlation (or lack thereof) to your grandma.

One of the CJ profs who left the department in the past few years had good evaluations, but also gave "A"s to pretty much everyone in the whole class. Some people said that the evaluations didn't mean anything since they were "bought" with good grades. Based on the evidence above, do you agree? Why?

Measures of Association and Chi-squared

One student this summer did an interesting survey that asked students if they "enjoyed college." Fewer than half said yes. There's a lot one might speculate about from that result, but most of it is beyond the scope of this exam, unfortunately.

On average, girls do better in college than boys do, so I hypothesized they enjoy college more, since we usually like doing what we excel at. This is what SPSS tells me. Am I right? Interpret the results, making sure to include a description of the correlation (with the relevant number), the statistical significance (with the relevant number)--but don't forget to answer the question. (4 points)

The student also asked whether the respondents drink alcohol. Given all the drinking that students do, I hypothesized that young people who drink will enjoy college. Here is the SPSS output. Am I right?
Interpret the results, making sure to include a description of the correlation (with the relevant number), the statistical significance (with the relevant number)--but don't forget to answer the question. (4 points)
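
Checking your hand calculations

For the "Calculation of probability and standard deviations" section, a short Python sketch like the one below can be used to check practice answers worked by hand. It is not part of the course materials, and the function names are made up for illustration. It assumes the sample (n-1) version of the standard deviation formula, which is the version that reproduces the SD=2.6 answer given for Sample 1, and it assumes the events are independent (for example, that each day's chance of rain does not depend on the day before).

```python
import math
import statistics


def sample_sd(values):
    """Standard deviation using the n-1 divisor (assumed formula)."""
    mean = sum(values) / len(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / (len(values) - 1))


def prob_all(probabilities):
    """Probability that every one of several independent events happens."""
    return math.prod(probabilities)


def prob_at_least_once(p, trials):
    """Inverse-probability rule: 1 minus the chance it never happens."""
    return 1 - (1 - p) ** trials


# Sample 1 from the study guide: 4, 6, 2, 5, 9
sample_1 = [4, 6, 2, 5, 9]
sample_1_mean = statistics.mean(sample_1)
sample_1_median = statistics.median(sample_1)
sample_1_sd = sample_sd(sample_1)

# Three heads in a row with a fair coin
three_heads = prob_all([0.5, 0.5, 0.5])

# 20% chance of rain each day, at least one rainy day in 7 days
rain_this_week = prob_at_least_once(0.20, 7)

print(f"mean={sample_1_mean}, median={sample_1_median}, SD={sample_1_sd:.1f}")
print(f"three heads in a row: {three_heads}")
print(f"rain at least once this week: {rain_this_week:.3f}")
```

The same three helpers cover the other practice questions: pass each event's probability to prob_all for "all of these happen" questions, and use prob_at_least_once when the question asks about something happening at least one time. You will still need to do these by hand with a calculator on the exam.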