Quantitative Data Analysis: Inferential Statistics

EDUC 7741
Quantitative data analysis-inferential statistics
Paris
The purpose of inferential statistics is to make inferences from a sample to a
larger group (the population).
Descriptive measures computed from the sample (mean, standard deviation) are
called statistics.
The corresponding descriptive measures of the population (mu, sigma)
are called parameters. Parametric statistical methods (those tied to the
normal distribution) are based on the assumptions of:
normality of the distribution,
homogeneity of variance in the parent population,
and at least interval-level data.
See Creswell, p. 230, Fig. 8.2.
NOTE: Nonparametric methods must be used if any of these assumptions are
likely to be violated; more about nonparametrics later.
Parameters are not computed directly; rather, they are inferred from the
information provided by the statistics of the sample.
The nature of inferential statistics involves the testing of hypotheses.
Hypotheses are statements about one or more population parameters. If the
hypothesis is consistent with the sample data, the hypothesis is accepted as a
viable value for the parameter. If the sample data are not consistent with the
hypothesis, the hypothesis is rejected, meaning that the hypothesized value
is not a viable value for the population parameter.
In order to make inferences (accept or reject a hypothesis) about a value, the
probability of occurrence of a particular value of a statistic (mean, standard
deviation) must be considered. That probability is based on understanding the
sampling distribution of the statistic.
Sampling distribution- the distribution of the values of a statistic (mean,
standard deviation) computed from all the possible samples of a given size
(a simulation sketch follows below).
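A rough illustration of this idea, assuming a hypothetical population of exam scores and Python with NumPy (all values invented):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical population of exam scores (values are illustrative only).
population = rng.normal(loc=75, scale=10, size=100_000)

# Draw many samples of the same size and record each sample mean.
sample_size = 30
sample_means = [
    rng.choice(population, size=sample_size, replace=False).mean()
    for _ in range(5_000)
]

# The collection of sample means approximates the sampling distribution of the mean.
print("Population mean (parameter, mu):", population.mean().round(2))
print("Mean of sample means (statistic):", np.mean(sample_means).round(2))
print("Spread of the sampling distribution (standard error of the mean):",
      np.std(sample_means, ddof=1).round(2))
```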
Significance level (alpha level) is a criterion used in making a decision
about a hypothesis. It is established prior to testing the hypothesis. Many
researchers today simply report the actual significance level (p-value) of
their statistical test. The p-value is the probability that the statistic
would appear by chance if the hypothesis were true. If that probability is
less than the alpha level, the statistic is judged unlikely to have occurred
by chance and the hypothesis is rejected.
Common levels are .05, .01, and occasionally .10.
In examining where a statistic falls, we use the standard normal distribution:
a mean of 0 and a standard deviation of 1.0, with the total area under the
curve equal to 100%. See the example on page 242, Fig. 8.5.
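A minimal sketch of this decision logic using SciPy's standard normal distribution; the sample values, hypothesized mean, and population standard deviation are made up for illustration:

```python
import numpy as np
from scipy import stats

alpha = 0.05                                           # significance level set before testing
sample = np.array([78, 82, 75, 90, 85, 88, 79, 83])    # hypothetical scores
mu_0 = 75                                              # hypothesized population mean
sigma = 10                                             # assumed known population standard deviation

# z statistic: how far the sample mean falls from mu_0 in standard-error units.
z = (sample.mean() - mu_0) / (sigma / np.sqrt(len(sample)))

# Two-tailed p-value from the standard normal (mean 0, standard deviation 1).
p_value = 2 * stats.norm.sf(abs(z))

print(f"z = {z:.2f}, p = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```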
Hypothesis testing
Types of statistical tests
t-test of significance- used to test the hypothesis that two groups are or are
not equal on a particular parameter (usually the mean), or that the population
correlation coefficient is zero.
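For example, with made-up group scores, SciPy's independent-samples t-test compares two group means, and pearsonr reports the test of a zero population correlation alongside r:

```python
from scipy import stats

# Hypothetical scores for two groups (values are illustrative only).
group_a = [82, 75, 91, 68, 77, 85, 80, 74]
group_b = [70, 66, 73, 81, 69, 75, 72, 78]

# Independent-samples t-test of H0: the two population means are equal.
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# A t-based test is also used for H0: the population correlation is zero;
# pearsonr returns that test's p-value alongside r.
r, p_r = stats.pearsonr(group_a, group_b)
print(f"r = {r:.2f}, p = {p_r:.4f}")
```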
F-test of significance- used to test the hypothesis that two or more
population means are equal. Most frequently used with more than two
groups.
ANOVAs:
One-way (one independent variable with multiple levels), e.g., four
experimental treatments.
Two-way (two independent variables with multiple levels). Tests for the
interaction as well as for the treatments.
Factorial ANOVA- more than two independent variables. (A sketch of one-way
and two-way ANOVA appears below.)
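A sketch of a one-way F-test with four hypothetical treatment groups (SciPy) and a two-way layout via a statsmodels formula; all data and factor names are invented for illustration:

```python
import pandas as pd
from scipy import stats
import statsmodels.api as sm
import statsmodels.formula.api as smf

# One-way ANOVA: one independent variable with four treatment levels.
t1 = [23, 25, 28, 24, 26]
t2 = [30, 32, 29, 31, 33]
t3 = [22, 21, 24, 23, 25]
t4 = [27, 29, 28, 26, 30]
f_stat, p_value = stats.f_oneway(t1, t2, t3, t4)
print(f"One-way ANOVA: F = {f_stat:.2f}, p = {p_value:.4f}")

# Two-way ANOVA: two independent variables plus their interaction.
df = pd.DataFrame({
    "score":  t1 + t2 + t3 + t4,
    "method": ["lecture"] * 10 + ["lab"] * 10,
    "level":  (["novice"] * 5 + ["expert"] * 5) * 2,
})
model = smf.ols("score ~ C(method) * C(level)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))   # main effects and interaction
```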
ANCOVA- analysis of covariance. Used when a covariate is employed to
statistically adjust group means on the dependent variable.
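A hedged sketch of ANCOVA with statsmodels, assuming a hypothetical pretest score as the covariate adjusting posttest means across two groups (all numbers invented):

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical data: posttest scores by group, with pretest as the covariate.
df = pd.DataFrame({
    "posttest": [68, 74, 71, 80, 77, 85, 82, 90, 79, 88],
    "pretest":  [60, 65, 62, 70, 66, 72, 69, 78, 67, 75],
    "group":    ["control"] * 5 + ["treatment"] * 5,
})

# ANCOVA: group effect on the dependent variable after adjusting for the covariate.
model = smf.ols("posttest ~ C(group) + pretest", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```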
CORRELATIONAL DESIGNS
Correlation- a measure of the relationship between two variables: a measure
of how the two variables "covary" with respect to one another, or how changes
in one variable compare with changes in the other variable.
Are high scores on one variable associated with high scores on the other?
Are high scores on one variable associated with low scores on the other? Etc.
Working with sets of ordered pairs from a group of individuals. Each
individual has two scores- one on each of two separate measures.
Correlation coefficient- an index of the extent of the relationship between
the two variables. It can take values from –1.00 to +1.00: –1 means a perfect
negative correlation, +1 a perfect positive correlation, and 0 means no
correlation (no linear relationship between the variables).
Reported using the notation r = .
Scattergrams illustrate the relationship graphically.
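For instance, with invented paired scores, the Pearson r and a scattergram can be produced as follows:

```python
import matplotlib.pyplot as plt
from scipy import stats

# Hypothetical ordered pairs: each individual has two scores.
hours_studied = [2, 4, 5, 7, 8, 10, 11, 13]
exam_score    = [55, 60, 62, 70, 73, 80, 82, 90]

r, p_value = stats.pearsonr(hours_studied, exam_score)
print(f"r = {r:.2f}, p = {p_value:.4f}")   # r near +1 indicates a strong positive correlation

# Scattergram illustrating the relationship graphically.
plt.scatter(hours_studied, exam_score)
plt.xlabel("Hours studied")
plt.ylabel("Exam score")
plt.title(f"Scattergram (r = {r:.2f})")
plt.show()
```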
Uses of correlation- not used to identify cause and effect; most often used
to predict. The stronger the relationship between the variables (i.e., the
stronger the correlation), the greater the accuracy of the prediction (see
the prediction sketch below).
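A brief illustration of using the regression line implied by a correlation for prediction, continuing the hypothetical hours/score data above (the numbers are made up):

```python
from scipy import stats

hours_studied = [2, 4, 5, 7, 8, 10, 11, 13]
exam_score    = [55, 60, 62, 70, 73, 80, 82, 90]

# Fit the least-squares line relating the two correlated variables.
result = stats.linregress(hours_studied, exam_score)

# The stronger the correlation, the more accurate this kind of prediction tends to be.
new_hours = 9
predicted = result.intercept + result.slope * new_hours
print(f"r = {result.rvalue:.2f}; predicted score for {new_hours} hours: {predicted:.1f}")
```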
Types
Pearson product-moment- both variables are on interval scales, like the score
example just given.
Spearman rank order- both variables are ordinal scales- performance on
midterm and final- rank from first to last.
Point biserial- one variable on interval and other dichotomous- relationship
between gender and scores on GRE verbal section.
Biserial- one variable on an interval scale, the other an artificial dichotomy
(artificial because there is an underlying continuous distribution), e.g., the
relationship between scores on a midterm exam and stress level rated as high
versus low.
Phi coefficient (coefficient of contingency)- used when both variables are on
a nominal scale, e.g., the relationship between gender and graduation from
college (coded 0, 1 for gender and 0, 1 for graduation). Examines frequencies
of occurrence: the number of males who graduated and did not graduate, and the
number of females who graduated and did not graduate. (A sketch computing
several of these coefficients appears below.)
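A sketch computing several of these coefficients with SciPy/NumPy; all data are hypothetical, and the phi coefficient is obtained here as a Pearson r on two 0/1-coded variables:

```python
import numpy as np
from scipy import stats

# Spearman rank-order: midterm and final performance ranked first to last (ordinal).
midterm_rank = [1, 2, 3, 4, 5, 6, 7, 8]
final_rank   = [2, 1, 4, 3, 6, 5, 8, 7]
rho, p_rho = stats.spearmanr(midterm_rank, final_rank)
print(f"Spearman rho = {rho:.2f}, p = {p_rho:.4f}")

# Point biserial: a true dichotomy (gender coded 0/1) and an interval score (GRE verbal).
gender     = [0, 0, 0, 0, 1, 1, 1, 1]
gre_verbal = [150, 155, 148, 160, 152, 158, 149, 161]
r_pb, p_pb = stats.pointbiserialr(gender, gre_verbal)
print(f"Point biserial r = {r_pb:.2f}, p = {p_pb:.4f}")

# Phi coefficient: two nominal dichotomies (gender and graduation, each coded 0/1).
graduated = [1, 0, 1, 1, 0, 1, 0, 1]
phi = np.corrcoef(gender, graduated)[0, 1]
print(f"Phi = {phi:.2f}")
```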