Statistics

Chapter 16
Statistical Analysis
Chapter Outline
§ Introduction
§ Descriptive Statistics
§ Inferential Statistics
§ Other Multivariate Techniques
Descriptive Statistics
§ Statistical computations that describe the
characteristics of a sample or the
relationship among variables in a sample.
§ Inferential statistics make inferences
about the larger population from which the
sample observations were drawn.
Partial Raw Data Matrix
Hypothetical Raw Data on
Education and Prejudice
Measures of Association
§ Descriptive statistics summarizing the
relationships between variables.
§ Many measures of association are
based on a proportionate reduction of
error (PRE) model.
Proportionate Reduction of
Error (PRE)
§ A logical model for assessing the strength of a
relationship by asking how much knowing
values on one variable would reduce our errors
in guessing values on the other.
§ Example:
§ If we know how much education people
have, we can improve our ability to
estimate how much they earn, thus
indicating there is a relationship between
the two variables.
Proportionate Reduction of
Error (PRE)
§ Based on a comparison of:
1. The number of errors we would make in
attempting to guess the attributes of a given
variable for each of the cases under study if we knew nothing but the distribution of
attributes on that variable.
2. The number of errors we would make if we
knew the joint distribution overall and were
told for each case the attribute of one
variable each time we were asked to guess
the attribute of the other.
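The comparison above is commonly written as a ratio. In this sketch, E_1 and E_2 are labels introduced here (they do not appear in the slides) for the error counts in items 1 and 2:

```latex
\[
\mathrm{PRE} = \frac{E_1 - E_2}{E_1}
\]
```

A value of 0 means that knowing the other variable removes no errors; a value of 1 means it removes all of them.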
Question
§ ______________ is a logical model for
assessing the strength of a relationship.
A. MRI
B. DVU
C. OMG
D. PRE
Answer: D
§ PRE is a logical model for assessing the
strength of a relationship.
Nominal Variables
§ If two variables consist of nominal data
(gender, religious affiliation, race),
lambda (λ) would be one appropriate
measure.
§ Lambda is based on your ability to guess
values on one of the variables: the PRE
achieved through knowledge of values on
the other variable.
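A minimal sketch of how lambda might be computed from a cross-tabulation, assuming the usual rule of guessing the most common category. The function name `lambda_pre`, the table layout, and the counts are all made up for illustration and are not taken from the chapter.

```python
def lambda_pre(table):
    """Lambda for guessing the column variable, given the row variable.

    `table` is a list of rows of cell counts (a hypothetical layout
    chosen for this sketch).
    """
    n = sum(sum(row) for row in table)
    column_totals = [sum(column) for column in zip(*table)]
    # Errors when we always guess the most common column category overall.
    errors_without = n - max(column_totals)
    if errors_without == 0:
        raise ValueError("no errors to reduce: one column holds every case")
    # Errors when we guess the most common column category within each row.
    errors_with = sum(sum(row) - max(row) for row in table)
    return (errors_without - errors_with) / errors_without


# Made-up counts: rows are two groups, columns are two categories.
table = [
    [900, 100],
    [200, 800],
]
result = lambda_pre(table)
print(round(result, 2))  # 0.67
```

Here 900 errors are made without knowing the row variable and 300 with it, so two thirds of the errors are eliminated.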
Ordinal Variables
§ If the variables being related are ordinal
(social class, religiosity, alienation),
gamma (γ) is one appropriate measure of
association.
§ Lambda is based on guessing exact
values, gamma is based on guessing
the ordinal arrangement of values.
Gamma
§ Computed from two quantities:
1. The number of pairs having the same
ranking on the two variables.
2. The number of pairs having the opposite
ranking on the two variables.
§ For pairs having the same ranking:
§ The frequency of each cell in the table is
multiplied by the sum of all cells below and
to the right of it, with all products summed.
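The counting rule above can be sketched in code. This sketch assumes both the rows and the columns of the table are ordered from low to high, counts opposite-ranked pairs with the mirror-image rule (cells below and to the left), and combines the two counts as (same - opposite) / (same + opposite). The function name and the counts are hypothetical.

```python
def gamma(table):
    """Return (same, opposite, gamma) for a table of ordinal counts.

    Rows and columns are assumed to be ordered from low to high.
    """
    n_rows = len(table)
    n_cols = len(table[0])
    same = 0
    opposite = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Cells below and to the right rank higher on both variables.
            below_right = sum(
                table[k][m]
                for k in range(i + 1, n_rows)
                for m in range(j + 1, n_cols)
            )
            # Cells below and to the left rank higher on one variable
            # and lower on the other.
            below_left = sum(
                table[k][m]
                for k in range(i + 1, n_rows)
                for m in range(j)
            )
            same += table[i][j] * below_right
            opposite += table[i][j] * below_left
    return same, opposite, (same - opposite) / (same + opposite)


# Made-up 2 x 2 table of counts.
same, opposite, g = gamma([[20, 10], [5, 25]])
print(same, opposite, round(g, 2))  # 500 50 0.82
```

A positive result means pairs tend to be ranked the same way on both variables; a negative result means they tend to be ranked in opposite ways.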
Question
§ If the variables being related are ordinal,
_________ is an appropriate measure of
association.
A. Gamma
B. Lambda
C. Rho
D. chi square
Answer: A
§ If the variables being related are ordinal,
gamma is an appropriate measure of
association.
Interval or Ratio Variables
§ If interval or ratio variables (age, income,
grade point average, and so forth) are
being associated, one appropriate
measure of association is Pearson’s
product-moment correlation (r).
§ r reflects how closely you can guess the
value of one variable through your
knowledge of the value of another.
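A short sketch of one standard way to compute r, using deviations from the two means. The function name and the data values are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson's product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of paired deviations from the means.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)


# Made-up paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 2))  # 0.77
```

Values near 1 or -1 mean one variable can be guessed closely from the other; values near 0 mean it cannot.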
Regression Analysis
§ A method of data analysis in which the
relationships among variables are represented
in the form of an equation, called a regression
equation.
§ Linear regression analysis
§ A form of statistical analysis that seeks the
equation for the straight line that best
describes the relationship between two ratio
variables.
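As an illustration, the best-fitting straight line can be found by least squares. This is a minimal sketch with invented data; `fit_line` is a hypothetical helper name, and "best" is assumed here to mean the smallest sum of squared vertical distances.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    # The least-squares line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
intercept, slope = fit_line(xs, ys)
print(round(intercept, 2), round(slope, 2))  # 2.2 0.6

# Estimate Y for a new value of X using the regression equation.
predicted = intercept + slope * 6
```

The last line shows the practical use of the equation: estimating a value on one variable from a value on the other.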
Question
§ If interval or ratio variables are being
associated, one appropriate measure is
____________.
A. gamma
B. lambda
C. Pearson’s product-moment correlation
D. none of these choices
Answer: C
§ If interval or ratio variables are being
associated, one appropriate measure is
Pearson’s product-moment correlation (r).
Simple Scattergram of
Values of X and Y
A Scattergram of the Values of Two
Variables with Regression Line Added
Multiple Regression Analysis
§ A form of statistical analysis that seeks
the equation representing the impact of
two or more independent variables on a
single dependent variable.
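A sketch of how such an equation might be found by least squares, using only the standard library. The helper names, the normal-equations approach, and the data are assumptions made for illustration; statistical packages typically use more robust numerical methods.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution from the last row upward.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        rest = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - rest) / m[r][r]
    return x


def multiple_regression(rows, ys):
    """Return [intercept, b1, b2, ...] for y = intercept + b1*x1 + b2*x2 + ..."""
    # Prepend a constant 1 to each case so the intercept is estimated too.
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up cases generated exactly from y = 1 + 2*x1 + 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
ys = [1, 3, 4, 6, 8]
coefficients = multiple_regression(rows, ys)
print([round(c, 6) for c in coefficients])
```

Because the made-up data follow the equation exactly, the recovered coefficients should be very close to 1, 2, and 3. Each slope is the estimated effect of one independent variable with the other held constant.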
Partial Regression Analysis
§ A form of regression analysis in which the
effects of one or more variables are held
constant, similar to the logic of the
elaboration model.
Curvilinear Regression
Analysis
§ A form of regression analysis that allows
relationships among variables to be
expressed with curved geometric lines
instead of straight ones.
Inferential Statistics
§ The body of statistical computations
relevant to making inferences from
findings based on sample observations to
some larger population.
Nonsampling Error
§ Those imperfections of data quality that
are a result of factors other than sampling
error.
§ Examples: misunderstandings of
questions by respondents, erroneous
recordings by interviewers and coders,
and keypunch errors.
Statistical Significance
§ A general term referring to the likelihood that
relationships observed in a sample could be
attributed to sampling error alone.
§ Tests of Statistical Significance
§ A class of statistical computations that
indicate the likelihood that the relationship
observed between variables in a sample can
be attributed to sampling error only.
The Logic of Statistical
Significance
1. Assumptions regarding the independence of
two variables in the population study.
2. Assumptions regarding the representativeness
of samples selected through conventional
probability-sampling procedures.
3. The observed joint distribution of sample
elements in terms of the two variables.
Question
§ _________________ indicate the
likelihood that the relationship observed
between variables in a sample can be
attributed to sampling error only.
A. ex post facto hypothesizing
B. tests of statistical significance
C. disconfirmation
D. all of these choices
Answer: B
§ Tests of statistical significance indicate
the likelihood that the relationship
observed between variables in a sample
can be attributed to sampling error only.
Hypothetical Population of Men and Women
Who Favor or Oppose Sexual Equality
A Representative Sample
An Unrepresentative Sample
Level of Significance
§ In the context of tests of statistical
significance, the degree of likelihood that
an observed, empirical relationship could
be attributable to sampling error.
§ A relationship is significant at the .05 level
if the likelihood of its being only a function
of sampling error is no greater than 5 out
of 100.
Representative Sample from a Population in
Which Variables Are Related
Chi Square
§ Computed as follows.
§ For each cell in the tables, the researcher:
§ Subtracts the expected frequency for that cell
from the observed frequency.
§ Squares this quantity.
§ Divides the squared difference by the
expected frequency.
§ This procedure is carried out for each cell in the
tables, and the results are summed to give chi square.
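The per-cell procedure can be sketched as follows. The sketch assumes the expected frequency for a cell is (row total × column total) / N, which is what independence of the two variables implies; the slides do not spell this out. The function name and the counts are invented.

```python
def chi_square(observed):
    """Chi square for a table of observed counts (a list of rows)."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, observed_count in enumerate(row):
            # Frequency expected if the two variables were unrelated.
            expected = row_totals[i] * column_totals[j] / n
            # Subtract, square, divide by expected, then add to the total.
            total += (observed_count - expected) ** 2 / expected
    return total


# Made-up 2 x 2 table of observed counts.
observed = [
    [40, 60],
    [60, 40],
]
value = chi_square(observed)
print(value)  # 8.0
```

For a 2 × 2 table (one degree of freedom), the standard critical value at the .05 level is 3.841. The made-up result of 8.0 exceeds it, so this relationship would be called significant at the .05 level.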
Path Analysis
§ A form of multivariate analysis in which
the causal relationships among variables
are presented in a graphical format.
Diagramming the Religious
Sources of Anti-Semitism
The Larceny Rates over
Time in a Hypothetical City
Factor Analysis
§ An analytical method of discovering the
general dimensions represented by a
collection of actual variables.
§ These factors are calculated hypothetical
dimensions that are not perfectly
represented by any of the empirical
variables under study but are highly
associated with groups of empirical
variables.
Analysis of Variance
(ANOVA)
§ Based on comparing variations between
and within groups and determining
whether between-group differences could
reasonably have occurred in simple
random sampling or whether they likely
represent a genuine relationship between
the variables involved.
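A minimal sketch of the between-group versus within-group comparison for a one-way design, expressed as the usual F ratio. The function name and the group values are made up for illustration.

```python
def f_ratio(groups):
    """One-way ANOVA F ratio for a list of groups of numeric values."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (value - mean) ** 2
        for group, mean in zip(groups, group_means)
        for value in group
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Made-up scores for three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value = f_ratio(groups)
print(round(f_value, 2))  # 13.0
```

A large F ratio means the differences between group means are large relative to the variation inside the groups. Whether it is large enough to rule out sampling error is judged against an F distribution with the two degrees of freedom shown.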
Discriminant Analysis
§ Seeks to account for variation in some
dependent variable by finding a
hypothetical, composite dimension that
separates categories of the dependent
variable.
§ Results in an equation that scores people
on the basis of that hypothetical
dimension and allows us to predict their
values on the dependent variable.
Question
§ _________________ is a causal model
for understanding relationships between
variables.
A. ex post facto hypothesizing
B. tests of statistical significance
C. path analysis
D. all of these choices
Answer: C
§ Path analysis is a causal model for
understanding relationships between
variables.
Two Distribution Patterns of the Incomes of
Republicans and Democrats
Six Writers: Three Who Write by Hand
and Three Who Use Computers
Plotting the Six Writers in
Terms of Age and Income
Income Alone Is Sufficient to
Predict Writing Method
A Slightly More Complicated
Pattern
Separating the Pens from the
Computers
Log-linear Models
§ Offer a method for analyzing complex
relationships among several nominal
variables having more than two attributes
each.
Geographic Information
Systems (GIS)
§ Map quantitative data that describe
geographic units for a graphical display.
Quick Quiz
1. _____________ is the applied branch of
mathematics especially appropriate to a
variety of research analyses.
A. calculus
B. probability
C. statistics
D. none of these choices
Answer: C
§ Statistics is the applied branch of
mathematics especially appropriate to a
variety of research analyses.
2. Gamma is composed of:
A. the number of pairs having the same
ranking on two variables
B. the number of pairs having the
opposite ranking on the two variables
C. both a and b
D. none of these choices
Answer: C
§ Gamma is composed of the number of
pairs having the same ranking on two
variables, and the number of pairs
having the opposite ranking on the
two variables.
3. A __________permits the estimation of
values on one variable from values on
the other.
A. multivariate analysis
B. indirect analysis
C. regression line
D. exploratory study
Answer: C
§ A regression line permits the
estimation of values on one variable
from values on the other.
4. ______________ are statistical
measures used for making inferences
from findings based on sample
observations to a larger population.
A. descriptive statistics
B. inferential statistics
C. ex post facto statistics
D. none of these choices
Answer: B
§ Inferential statistics are statistical
measures used for making inferences
from findings based on sample
observations to a larger population.
5. A ____________ analysis represents
changes in a variable over time.
A. regression
B. bivariate
C. time-series
D. all of these choices
Answer: C
§ A time-series analysis represents
changes in a variable over time.
6. Which type of statistics assists
researchers in drawing conclusions from
their observations?
A. descriptive statistics
B. inferential statistics
C. ordinal statistics
D. interval statistics
Answer: B
§ Inferential statistics assist
researchers in drawing conclusions from
their observations.