# Wrap up
S519 Statistical Sessions
Things we’ve covered
• Descriptive Statistics
• Normal Distributions
• Z-test
• Hypothesis Testing
• T-test
• ANOVA
• Correlation
• Linear regression
• Chi-square
Descriptive Statistics
• Central Tendency
– Mean
– Median
– Mode
• Variability
– Range
– Standard deviation
– Variance
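The measures listed above can be computed with Python's standard `statistics` module. This is a minimal sketch; the `scores` list is made-up illustration data, and the variance and standard deviation shown are the sample versions (dividing by n - 1).

```python
import statistics

# Hypothetical sample of scores, made up for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Central tendency
mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # middle value of the sorted data
mode = statistics.mode(scores)      # most frequent value

# Variability
value_range = max(scores) - min(scores)        # largest minus smallest value
sample_variance = statistics.variance(scores)  # sum of squared deviations / (n - 1)
sample_sd = statistics.stdev(scores)           # square root of the sample variance

print(mean, median, mode, value_range, sample_variance, sample_sd)
```

If the data are a whole population rather than a sample, `statistics.pvariance` and `statistics.pstdev` divide by n instead.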
Normal Distributions
• Skewness
• Kurtosis
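As a rough illustration of the two shape measures, the sketch below uses the simple moment-based definitions (an assumption; the slides do not say which formula is intended). Spreadsheet functions such as SKEW and KURT apply sample-size adjustments, so their results will generally differ from these.

```python
from statistics import fmean

def central_moment(data, order):
    """Average of (x - mean) ** order over the data."""
    mean = fmean(data)
    return sum((x - mean) ** order for x in data) / len(data)

def skewness(data):
    """Moment-based skewness: 0 for symmetric data, > 0 for a long right tail."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Moment-based kurtosis minus 3, so a normal distribution scores about 0."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

# Hypothetical data sets, made up for illustration.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]
print(skewness(symmetric), skewness(right_skewed), excess_kurtosis(symmetric))
```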
Z-test
Hypothesis Testing
1. State the hypothesis
– Null hypothesis
– Research hypothesis
• Directional
• Non-directional
2. Set decision criteria
3. Collect data and compute sample statistic
4. Make a decision (reject or fail to reject the null hypothesis)
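The four steps can be sketched with a one-sample z-test. This is only an illustration under stated assumptions: the function name `one_sample_z_test`, the sample values, the population mean of 100, and the population standard deviation of 15 are all hypothetical, and the directional case is fixed to "greater than".

```python
import math
from statistics import NormalDist, fmean

def one_sample_z_test(sample, pop_mean, pop_sd, alpha=0.05, directional=False):
    """Return (z, p_value, reject_null) for H0: mu == pop_mean.

    Step 1: directional=False tests the non-directional research
            hypothesis mu != pop_mean; directional=True tests mu > pop_mean.
    Step 2: alpha is the decision criterion.
    """
    # Step 3: compute the sample statistic.
    n = len(sample)
    z = (fmean(sample) - pop_mean) / (pop_sd / math.sqrt(n))
    upper_tail = 1 - NormalDist().cdf(z)
    if directional:
        p_value = upper_tail
    else:
        p_value = 2 * min(upper_tail, 1 - upper_tail)
    # Step 4: reject H0 only when p falls below alpha.
    return z, p_value, p_value < alpha

# Hypothetical sample, made up for illustration.
sample = [100, 112, 95, 118, 104, 109, 99, 111, 106]
z, p_value, reject = one_sample_z_test(sample, pop_mean=100, pop_sd=15)
print(z, p_value, reject)
```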
T-test
• Degrees of freedom = n - 1 (one-sample or paired test)
• Excel: TTEST(array1, array2, tails, type)
– array1 = the cell range holding the first set of data
– array2 = the cell range holding the second set of data
– tails: 1 = one-tailed, 2 = two-tailed
– type: 1 = paired t-test; 2 = two-sample test (independent, equal variances); 3 = two-sample test with unequal variances
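To show what the three `type` options correspond to, here is a sketch of the t statistic and degrees of freedom for each case. The function names and data are hypothetical, and the unequal-variance case is assumed to be Welch's approximation. The standard library has no t distribution, so the sketch stops at t and df; the p-value would come from a t table or from TTEST itself.

```python
import math
from statistics import fmean, variance

def paired_t(a, b):
    """type 1: paired test on the differences; df = number of pairs - 1."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = fmean(diffs) / math.sqrt(variance(diffs) / n)
    return t, n - 1

def pooled_t(a, b):
    """type 2: independent samples, equal variances; df = n1 + n2 - 2."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (fmean(a) - fmean(b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df

def welch_t(a, b):
    """type 3: unequal variances (Welch); df is usually not an integer."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a) / n1, variance(b) / n2
    t = (fmean(a) - fmean(b)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical scores, made up for illustration.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 13, 16]
print(paired_t(before, after), pooled_t(before, after), welch_t(before, after))
```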
ANOVA
• Analysis of Variance
• A hypothesis-testing procedure used to evaluate
mean differences between two or more treatments
(or populations).
– 1) Can work with more than two samples.
– 2) Can work with more than one independent variable
ANOVA
• In ANOVA, an independent or quasi-independent variable is called a factor.
• Levels = the individual values (conditions) of a factor; the number of levels is the number of values used for the independent variable.
• One factor → “single-factor design”
• More than one factor → “factorial design”
ANOVA
• df for independent ANOVA
– Between-group degrees of freedom = k - 1
• k: number of groups
– Within-group degrees of freedom = N - k
• N: total sample size
• df for dependent (repeated-measures) ANOVA
– Between-group degrees of freedom = k - 1
• k: number of groups
– Within-group degrees of freedom = N - k
• N: total sample size (total number of scores)
– Between-subject degrees of freedom = n - 1
• n: number of subjects
– Error degrees of freedom = (N - k) - (n - 1)
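The degrees of freedom above feed directly into the F ratio. The sketch below computes a single-factor independent ANOVA by hand and, separately, the df breakdown for the dependent design. The function names and data are hypothetical, and `repeated_measures_df` assumes every subject is measured once in every condition, so N = k * n.

```python
from statistics import fmean

def one_way_anova(groups):
    """Return (df_between, df_within, F) for independent groups."""
    k = len(groups)                      # number of groups
    n_total = sum(len(g) for g in groups)  # N: total sample size
    grand_mean = fmean([x for g in groups for x in g])
    group_means = [fmean(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return df_between, df_within, f_ratio

def repeated_measures_df(k, n):
    """df breakdown for k conditions measured on the same n subjects."""
    n_total = k * n
    df_between = k - 1
    df_within = n_total - k
    df_subjects = n - 1
    df_error = df_within - df_subjects
    return df_between, df_within, df_subjects, df_error

# Hypothetical scores for three groups, made up for illustration.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(one_way_anova(groups))
print(repeated_measures_df(k=3, n=5))
```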
ANOVA
• Three types of ANOVA (with the matching Excel Analysis ToolPak tool):
– Independent measures design: groups are samples of independent measurements (different people)
Excel: "Anova: Single Factor"
– Dependent measures design ("repeated measures"): groups are samples of dependent measurements (usually the same people at different times)
Excel: "Anova: Two-Factor Without Replication"
– Factorial ANOVA (more than one factor)
Excel: "Anova: Two-Factor With Replication"
Correlation
• Pearson correlation
– Excel: CORREL or PEARSON function
– Analysis ToolPak for more than two variables (correlation matrix)
• The correlation represents the association
between two or more variables
• Correlation does not imply causality (an association between two
variables does not show that one causes the other)
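For reference, the Pearson coefficient can be computed by hand as the sum of cross-products of deviations divided by the square root of the product of the two sums of squares. A minimal sketch with a hypothetical function name and made-up data:

```python
import math
from statistics import fmean

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mean_x, mean_y = fmean(x), fmean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical paired observations, made up for illustration.
hours = [1, 2, 3, 4, 5]
grade = [2, 1, 4, 3, 5]
print(pearson_r(hours, grade))
```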
Correlation

| rxy value | Interpretation |
|---|---|
| 0.8 to 1.0 | Very strong relationship (share most things in common) |
| 0.6 to 0.8 | Strong relationship (share many things in common) |
| 0.4 to 0.6 | Moderate relationship (share something in common) |
| 0.2 to 0.4 | Weak relationship (share a little in common) |
| 0.0 to 0.2 | Weak or no relationship (share very little or nothing in common) |
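The interpretation bands can be written as a small lookup. Two assumptions are made here that the table does not state: the bands are applied to the absolute value of r, so negative correlations are judged by size, and a value that falls exactly on a boundary goes to the stronger band. The function name is hypothetical.

```python
def interpret_r(r):
    """Map a correlation coefficient to the verbal label for its size."""
    size = abs(r)  # assumption: the sign is ignored when judging strength
    if size > 1:
        raise ValueError("a correlation must lie between -1 and 1")
    if size >= 0.8:
        return "very strong relationship"
    if size >= 0.6:
        return "strong relationship"
    if size >= 0.4:
        return "moderate relationship"
    if size >= 0.2:
        return "weak relationship"
    return "weak or no relationship"

print(interpret_r(0.85), interpret_r(-0.5), interpret_r(0.1))
```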
Linear regression
• Y’ = bX + a
– b = SLOPE()
– a = INTERCEPT()
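The slope and intercept follow from the usual least-squares formulas: b is the sum of cross-products of deviations divided by the sum of squared deviations of X, and a is the mean of Y minus b times the mean of X. A minimal sketch, with hypothetical function names and made-up data:

```python
from statistics import fmean

def fit_line(x, y):
    """Return (b, a) for the least-squares line Y' = bX + a."""
    mean_x, mean_y = fmean(x), fmean(y)
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = mean_y - b * mean_x
    return b, a

def predict(x_new, b, a):
    """Predicted score Y' for a new X value."""
    return b * x_new + a

# Hypothetical data, made up for illustration.
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
b, a = fit_line(x, y)
print(b, a, predict(5, b, a))
```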
Chi-square
• Non-parametric vs. parametric
• $\chi^2 = \sum \frac{(O - E)^2}{E}$
– O: the observed frequency
– E: the expected frequency
• df=r-1 (r= number of categories)
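The statistic is a direct sum over categories, so it is easy to compute by hand. The sketch below covers the one-sample (goodness-of-fit) case that df = r - 1 refers to; the function name and the frequencies are hypothetical. The standard library has no chi-square distribution, so the result would be compared against a critical value from a table (for example 5.991 for df = 2 at the 0.05 level).

```python
def chi_square(observed, expected):
    """Return (chi-square statistic, df) for observed vs. expected frequencies."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one entry per category")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # r - 1, where r is the number of categories
    return statistic, df

# Hypothetical frequencies for three categories, made up for illustration.
observed = [18, 22, 20]
expected = [20, 20, 20]
statistic, df = chi_square(observed, expected)
print(statistic, df)
```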