Chapter 16
Statistical Analysis

Chapter Outline
§ Introduction
§ Descriptive Statistics
§ Inferential Statistics
§ Other Multivariate Techniques

Descriptive Statistics
§ Statistical computations that describe the characteristics of a sample or the relationship among variables in a sample.
§ Inferential statistics, by contrast, make inferences about the larger population from which the sample observations were drawn.

Partial Raw Data Matrix

Hypothetical Raw Data on Education and Prejudice

Measures of Association
§ Descriptive statistics summarizing the relationships between variables.
§ Many measures of association are based on a proportionate reduction of error (PRE) model.

Proportionate Reduction of Error (PRE)
§ A logical model for assessing the strength of a relationship by asking how much knowing values on one variable would reduce our errors in guessing values on the other.
§ Example:
  § If we know how much education people have, we can improve our ability to estimate how much they earn, thus indicating there is a relationship between the two variables.

Proportionate Reduction of Error (PRE)
§ Based on a comparison of:
  1. The number of errors we would make in attempting to guess the attributes of a given variable for each of the cases under study if we knew nothing but the distribution of attributes on that variable.
  2. The number of errors we would make if we knew the joint distribution overall and were told for each case the attribute of one variable each time we were asked to guess the attribute of the other.

Question
§ ______________ is a logical model for assessing the strength of a relationship.
  A. MRI
  B. DVU
  C. OMG
  D. PRE

Answer: D
§ PRE is a logical model for assessing the strength of a relationship.

Nominal Variables
§ If two variables consist of nominal data (gender, religious affiliation, race), lambda (λ) would be one appropriate measure.
§ Lambda is based on your ability to guess values on one of the variables: the PRE achieved through knowledge of values on the other variable.

Ordinal Variables
§ If the variables being related are ordinal (social class, religiosity, alienation), gamma (γ) is one appropriate measure of association.
§ Lambda is based on guessing exact values, whereas gamma is based on guessing the ordinal arrangement of values.

Gamma
§ Computed from two quantities:
  1. The number of pairs having the same ranking on the two variables.
  2. The number of pairs having the opposite ranking on the two variables.
§ For pairs having the same ranking:
  § The frequency of each cell in the table is multiplied by the sum of all cells below and to the right of it, with all products summed.
§ For pairs having the opposite ranking:
  § The frequency of each cell in the table is multiplied by the sum of all cells below and to the left of it, with all products summed.
§ gamma = (same - opposite) / (same + opposite)

Question
§ If the variables being related are ordinal, _________ is an appropriate measure of association.
  A. Gamma
  B. Lambda
  C. Rho
  D. Chi square

Answer: A
§ If the variables being related are ordinal, gamma is an appropriate measure of association.

Interval or Ratio Variables
§ If interval or ratio variables (age, income, grade point average, and so forth) are being associated, one appropriate measure of association is Pearson’s product-moment correlation (r).
§ r reflects how closely you can guess the value of one variable through your knowledge of the value of another.

Regression Analysis
§ A method of data analysis in which the relationships among variables are represented in the form of an equation, called a regression equation.
§ Linear regression analysis
  § A form of statistical analysis that seeks the equation for the straight line that best describes the relationship between two ratio variables.

Question
§ If interval or ratio variables are being associated, one appropriate measure is ____________.
  A. gamma
  B. lambda
  C. Pearson’s product-moment correlation (r)
  D. none of these choices

Answer: C
§ If interval or ratio variables are being associated, one appropriate measure is Pearson’s product-moment correlation (r).
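
As a rough illustration of Pearson's r and linear regression as described above, the sketch below computes both for a small data set. The education and income figures are invented for illustration, and the function names pearson_r and regression_line are hypothetical rather than taken from the chapter. The sketch also assumes the usual least-squares criterion for the "best" straight line, which the slides do not state explicitly.

```python
from math import sqrt


def _sums_of_squares(xs, ys):
    """Return (sxx, syy, sxy, mean_x, mean_y) for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxx, syy, sxy, mean_x, mean_y


def pearson_r(xs, ys):
    """Pearson's product-moment correlation between xs and ys.

    Undefined (division by zero) if either variable has no variation.
    """
    sxx, syy, sxy, _, _ = _sums_of_squares(xs, ys)
    return sxy / sqrt(sxx * syy)


def regression_line(xs, ys):
    """Slope and intercept of the least-squares line y = intercept + slope * x."""
    sxx, _, sxy, mean_x, mean_y = _sums_of_squares(xs, ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: years of education and income in thousands of dollars.
education = [8, 10, 12, 14, 16]
income = [20, 26, 31, 37, 41]

r = pearson_r(education, income)
slope, intercept = regression_line(education, income)

print(f"r = {r:.3f}")
print(f"income = {intercept:.2f} + {slope:.2f} * education")
```

An r close to 1 or -1 means that knowing one variable lets you guess the other closely; the regression equation is what turns that knowledge into an actual estimate for a given case.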
Simple Scattergram of Values of X and Y

A Scattergram of the Values of Two Variables with Regression Line Added

Multiple Regression Analysis
§ A form of statistical analysis that seeks the equation representing the impact of two or more independent variables on a single dependent variable.

Partial Regression Analysis
§ A form of regression analysis in which the effects of one or more variables are held constant, similar to the logic of the elaboration model.

Curvilinear Regression Analysis
§ A form of regression analysis that allows relationships among variables to be expressed with curved geometric lines instead of straight ones.

Inferential Statistics
§ The body of statistical computations relevant to making inferences from findings based on sample observations to some larger population.

Nonsampling Error
§ Those imperfections of data quality that are a result of factors other than sampling error.
§ Examples: misunderstandings of questions by respondents, erroneous recordings by interviewers and coders, and keypunch errors.

Statistical Significance
§ A general term referring to the likelihood that relationships observed in a sample could be attributed to sampling error alone.
§ Tests of statistical significance
  § A class of statistical computations that indicate the likelihood that the relationship observed between variables in a sample can be attributed to sampling error only.

The Logic of Statistical Significance
1. Assumptions regarding the independence of two variables in the population under study.
2. Assumptions regarding the representativeness of samples selected through conventional probability-sampling procedures.
3. The observed joint distribution of sample elements in terms of the two variables.

Question
§ _________________ indicate the likelihood that the relationship observed between variables in a sample can be attributed to sampling error only.
  A. ex post facto hypothesizing
  B. tests of statistical significance
  C. disconfirmation
  D. all of these choices

Answer: B
§ Tests of statistical significance indicate the likelihood that the relationship observed between variables in a sample can be attributed to sampling error only.

Hypothetical Population of Men and Women Who Favor or Oppose Sexual Equality

A Representative Sample

An Unrepresentative Sample

Level of Significance
§ In the context of tests of statistical significance, the degree of likelihood that an observed, empirical relationship could be attributable to sampling error.
§ A relationship is significant at the .05 level if the likelihood of its being only a function of sampling error is no greater than 5 out of 100.

Representative Sample from a Population in Which Variables Are Related

Chi Square
§ Computed as follows. For each cell in the tables, the researcher:
  § Subtracts the expected frequency for that cell from the observed frequency.
  § Squares this quantity.
  § Divides the squared difference by the expected frequency.
§ This procedure is carried out for each cell in the tables, and the results are added together.

Path Analysis
§ A form of multivariate analysis in which the causal relationships among variables are presented in a graphical format.

Diagramming the Religious Sources of Anti-Semitism

The Larceny Rates over Time in a Hypothetical City

Factor Analysis
§ An analytical method of discovering the general dimensions represented by a collection of actual variables.
§ These factors are calculated hypothetical dimensions that are not perfectly represented by any of the empirical variables under study but are highly associated with groups of empirical variables.

Analysis of Variance (ANOVA)
§ Based on comparing variations between and within groups and determining whether between-group differences could reasonably have occurred in simple random sampling or whether they likely represent a genuine relationship between the variables involved.
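
Returning to the chi square procedure from the earlier slide (subtract the expected frequency from the observed frequency, square the difference, divide by the expected frequency), a minimal sketch follows. The slides do not say how expected frequencies are obtained; this sketch assumes the common approach of deriving them from the row and column totals under the assumption that the two variables are independent. The function name chi_square and the sample counts are hypothetical.

```python
def chi_square(observed):
    """Chi square for a table of observed frequencies (a list of rows).

    Assumption: the expected frequency of a cell is
    row total * column total / grand total, i.e. what we would
    expect if the two variables were independent.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    total = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            difference = obs - expected              # subtract expected from observed
            total += difference ** 2 / expected      # square, divide by expected, add up
    return total


# Hypothetical sample: rows are men and women, columns are favor and oppose.
sample = [
    [30, 20],
    [20, 30],
]

print(chi_square(sample))  # 4.0
```

In this made-up table every expected frequency is 25, so each of the four cells contributes (5 squared) / 25 = 1. A table in which the observed and expected frequencies match exactly would give a chi square of 0.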
Discriminant Analysis
§ Seeks to account for variation in some dependent variable by finding a hypothetical, composite dimension that separates categories of the dependent variable.
§ Results in an equation that scores people on the basis of that hypothetical dimension and allows us to predict their values on the dependent variable.

Question
§ _________________ is a causal model for understanding relationships between variables.
  A. ex post facto hypothesizing
  B. tests of statistical significance
  C. path analysis
  D. all of these choices

Answer: C
§ Path analysis is a causal model for understanding relationships between variables.

Two Distribution Patterns of the Incomes of Republicans and Democrats

Six Writers: Three Who Write by Hand and Three Who Use Computers

Plotting the Six Writers in Terms of Age and Income

Income Alone Is Sufficient to Predict Writing Method

A Slightly More Complicated Pattern

Separating the Pens from the Computers

Log-linear Models
§ Offer a method for analyzing complex relationships among several nominal variables having more than two attributes each.

Geographic Information Systems (GIS)
§ Map quantitative data that describe geographic units for a graphical display.

Quick Quiz

1. _____________ is the applied branch of mathematics especially appropriate to a variety of research analyses.
  A. calculus
  B. probability
  C. statistics
  D. none of these choices

Answer: C
§ Statistics is the applied branch of mathematics especially appropriate to a variety of research analyses.

2. Gamma is composed of:
  A. the number of pairs having the same ranking on the two variables
  B. the number of pairs having the opposite ranking on the two variables
  C. both a and b
  D. none of these choices

Answer: C
§ Gamma is composed of the number of pairs having the same ranking on the two variables and the number of pairs having the opposite ranking on the two variables.

3. A __________ permits the estimation of values on one variable from values on the other.
  A. multivariate analysis
  B. indirect analysis
  C. regression line
  D. exploratory study

Answer: C
§ A regression line permits the estimation of values on one variable from values on the other.

4. ______________ are statistical measures used for making inferences from findings based on sample observations to a larger population.
  A. descriptive statistics
  B. inferential statistics
  C. ex post facto statistics
  D. none of these choices

Answer: B
§ Inferential statistics are statistical measures used for making inferences from findings based on sample observations to a larger population.

5. A ____________ analysis represents changes in a variable over time.
  A. regression
  B. bivariate
  C. time-series
  D. all of these choices

Answer: C
§ A time-series analysis represents changes in a variable over time.

6. Which type of statistics assists researchers in drawing conclusions from their observations?
  A. descriptive statistics
  B. inferential statistics
  C. ordinal statistics
  D. interval statistics

Answer: B
§ Inferential statistics assist researchers in drawing conclusions from their observations.