
Summary of Formulas/Tests (Chapters 10 & 12)
Chapter 10: Correlation and Regression
1. Linear Correlation Coefficient
$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$$
• Notation:
n: the number of pairs of data present
r: the linear correlation coefficient for a sample, -1 ≤ r ≤ +1
If r is close to 0, we conclude that there is no linear correlation between x and y,
but if r is close to -1 or +1, we conclude that there is a linear correlation
between x and y.
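The formula for r can be evaluated directly from the five sums it uses. The following is a minimal sketch, assuming paired data in two Python lists; the function name `linear_correlation` and the sample values are made up for illustration and do not come from the text.

```python
import math

def linear_correlation(xs, ys):
    """Sample linear correlation coefficient r computed from paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Made-up illustrative data (not from the text)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = linear_correlation(xs, ys)
print(round(r, 4))  # 0.7746
```

Here the sums are n = 5, ∑x = 15, ∑y = 20, ∑xy = 66, ∑x² = 55 and ∑y² = 86, so r = 30 / √1500 ≈ 0.7746.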
2. Coefficient of Determination: r²
The value of r² is the proportion of the variation in y that is explained by the linear
relationship between x and y.
3. Hypothesis Test for Correlation
H0: ρ = 0
(There is no linear correlation.)
H1: ρ ≠ 0
(There is a linear correlation.)
(ρ: Greek letter rho used to represent the linear correlation coefficient for a population.)
• Method 1: Test statistic is t
Test statistic: $t = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$
Critical values: t distribution (Table A-3 with df = n – 2)
• Method 2: Test statistic is r
Test statistic: $r = \dfrac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\sum x^2 - \left(\sum x\right)^2}\,\sqrt{n\sum y^2 - \left(\sum y\right)^2}}$
Critical values: critical values of the Pearson correlation coefficient r (Table A-6)
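As a rough illustration of Method 1, the sketch below computes the t test statistic from r and n and compares it with a critical value that the reader looks up in a t table. The function names and the input values (r = 0.6, n = 27) are hypothetical. The critical value 2.060 is the standard two-tailed t value for df = 25 at α = 0.05.

```python
import math

def correlation_t_statistic(r, n):
    """t = r / sqrt((1 - r^2) / (n - 2)), used with df = n - 2."""
    if n <= 2:
        raise ValueError("n must exceed 2 so that df = n - 2 is positive")
    if abs(r) >= 1:
        raise ValueError("|r| must be less than 1 for a finite t")
    return r / math.sqrt((1 - r ** 2) / (n - 2))

def reject_null(t, critical_t):
    """Two-tailed decision: reject H0 (rho = 0) when |t| exceeds the critical value."""
    return abs(t) > critical_t

# Hypothetical inputs: r = 0.6 from n = 27 pairs, so df = 25.
t = correlation_t_statistic(0.6, 27)
decision = reject_null(t, 2.060)
print(round(t, 2), decision)  # 3.75 True
```

With these inputs t = 0.6 / √(0.64 / 25) = 3.75, which exceeds 2.060, so the null hypothesis of no linear correlation would be rejected.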
4. Regression Equation
• Notation for the regression equation

                                      Population parameter    Sample statistic
y-intercept of regression equation    β0                      b0
Slope of regression equation          β1                      b1
Equation of the regression line       y = β0 + β1x            $\hat{y} = b_0 + b_1 x$

Slope: $b_1 = \dfrac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{n\sum x^2 - \left(\sum x\right)^2}$
y-intercept: $b_0 = \bar{y} - b_1\bar{x}$ or $b_0 = \dfrac{\left(\sum y\right)\left(\sum x^2\right) - \left(\sum x\right)\left(\sum xy\right)}{n\sum x^2 - \left(\sum x\right)^2}$
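The slope and y-intercept formulas can be checked numerically. This is a small sketch under the assumption that the data are paired lists; `regression_line` is a hypothetical helper name and the data are made up.

```python
def regression_line(xs, ys):
    """Return (b0, b1) for the sample regression line y-hat = b0 + b1*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("all x values are identical; slope is undefined")
    b1 = (n * sum_xy - sum_x * sum_y) / denominator
    b0 = sum_y / n - b1 * sum_x / n  # b0 = y-bar - b1 * x-bar
    return b0, b1

# Made-up illustrative data (not from the text)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = regression_line(xs, ys)
print(round(b0, 4), round(b1, 4))  # 2.2 0.6

# The alternative formula for b0 gives the same value.
b0_alt = (sum(ys) * sum(x * x for x in xs)
          - sum(xs) * sum(x * y for x, y in zip(xs, ys))) / (
          len(xs) * sum(x * x for x in xs) - sum(xs) ** 2)
```

For this data b1 = 30 / 50 = 0.6 and b0 = 4 − 0.6 · 3 = 2.2, and the alternative b0 formula yields (1100 − 990) / 50 = 2.2 as well.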
5. Residual = observed y – predicted y = $y - \hat{y}$
6. Total deviation = $y - \bar{y}$; total variation = $\sum (y - \bar{y})^2$
7. Explained deviation = $\hat{y} - \bar{y}$; explained variation = $\sum (\hat{y} - \bar{y})^2$
8. Unexplained deviation = $y - \hat{y}$; unexplained variation = $\sum (y - \hat{y})^2$
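The three variations can be computed side by side to see that total variation equals explained plus unexplained variation when the predictions come from the least-squares line. The sketch below assumes made-up data and a hypothetical helper `variation_breakdown`; it fits the line internally so that it is self-contained.

```python
def variation_breakdown(xs, ys):
    """Return (total, explained, unexplained) variation for the least-squares line."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = sum_y / n - b1 * sum_x / n
    y_bar = sum_y / n
    predicted = [b0 + b1 * x for x in xs]
    total = sum((y - y_bar) ** 2 for y in ys)
    explained = sum((y_hat - y_bar) ** 2 for y_hat in predicted)
    unexplained = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, predicted))
    return total, explained, unexplained

# Made-up illustrative data (not from the text)
total, explained, unexplained = variation_breakdown([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(total, 4), round(explained, 4), round(unexplained, 4))  # 6.0 3.6 2.4
```

The ratio explained / total = 3.6 / 6 = 0.6 is the proportion of the variation in y explained by the linear relationship for this data.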
9. Coefficient of Determination: $r^2 = \dfrac{\text{explained variation}}{\text{total variation}} = \dfrac{\sum (\hat{y} - \bar{y})^2}{\sum (y - \bar{y})^2}$
10. Standard Error of Estimate: $s_e = \sqrt{\dfrac{\sum (y - \hat{y})^2}{n - 2}}$ or $s_e = \sqrt{\dfrac{\sum y^2 - b_0 \sum y - b_1 \sum xy}{n - 2}}$
11. Prediction Interval for an Individual y
Margin of error: $E = t_{\alpha/2}\, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x_0 - \bar{x})^2}{n\sum x^2 - \left(\sum x\right)^2}}$
Prediction interval: $\hat{y} - E < y < \hat{y} + E$
x0: given value of x
tα/2: critical value from the t distribution (Table A-3 with df = n – 2)
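The standard error of estimate and the prediction-interval margin of error can be sketched as follows. The function name `prediction_interval`, the data, and the choice x0 = 3 are assumptions for illustration. The critical value must be supplied from a t table; 3.182 is the standard two-tailed value for df = 3 at α = 0.05.

```python
import math

def prediction_interval(xs, ys, x0, t_crit):
    """Return (y_hat, s_e, E) for an individual y at x = x0."""
    n = len(xs)
    if n < 3:
        raise ValueError("need n >= 3 so that df = n - 2 is positive")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sxx = n * sum_x2 - sum_x ** 2
    b1 = (n * sum_xy - sum_x * sum_y) / sxx
    b0 = sum_y / n - b1 * sum_x / n
    # Shortcut form of s_e; max() guards against a tiny negative rounding error.
    s_e = math.sqrt(max(sum_y2 - b0 * sum_y - b1 * sum_xy, 0.0) / (n - 2))
    x_bar = sum_x / n
    margin = t_crit * s_e * math.sqrt(1 + 1 / n + n * (x0 - x_bar) ** 2 / sxx)
    y_hat = b0 + b1 * x0
    return y_hat, s_e, margin

# Made-up illustrative data (not from the text); n = 5, so df = 3.
y_hat, s_e, margin = prediction_interval([1, 2, 3, 4, 5], [2, 4, 5, 4, 5],
                                         x0=3, t_crit=3.182)
print(round(y_hat, 3), round(s_e, 4), round(margin, 3))
print(round(y_hat - margin, 3), "< y <", round(y_hat + margin, 3))
```

For this data s_e = √(2.4 / 3) ≈ 0.894, and because x0 equals the mean of x, the last term under the square root vanishes, leaving E = 3.182 · s_e · √1.2.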
12. Multiple Regression
Multiple regression equation: $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$
Adjusted coefficient of determination: $\text{adjusted } R^2 = 1 - \dfrac{(n - 1)}{[n - (k + 1)]}\,(1 - R^2)$
n: sample size
k: number of predictor (x) variables
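The adjusted coefficient of determination is a one-line calculation. A minimal sketch, with a hypothetical function name and made-up inputs (R² = 0.8, n = 21, k = 2):

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 = 1 - (n - 1) / (n - (k + 1)) * (1 - R^2)."""
    if n <= k + 1:
        raise ValueError("sample size n must exceed k + 1")
    return 1 - (n - 1) / (n - (k + 1)) * (1 - r_squared)

# Hypothetical inputs: R^2 = 0.8 with n = 21 observations and k = 2 predictors.
adj = adjusted_r_squared(0.8, n=21, k=2)
print(round(adj, 4))  # 0.7778
```

Here the adjustment is 1 − (20 / 18)(0.2) = 14 / 18, slightly below the unadjusted 0.8, because the formula penalizes the number of predictor variables.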
Chapter 12: Analysis of Variance
1. One-Way ANOVA with Equal Sample Sizes n
H0: µ1 = µ2 = µ3 = ...
H1: At least one of the means is different from the others.
Test statistic: $F = \dfrac{\text{variance between samples}}{\text{variance within samples}} = \dfrac{n s_{\bar{x}}^2}{s_p^2} = \dfrac{n s_{\bar{x}}^2}{\left(s_1^2 + s_2^2 + \cdots + s_k^2\right)/k}$
Critical values: F distribution (Table A-5 with numerator df = k – 1 and denominator
df = k(n – 1), where k = number of samples and n = sample size)
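For equal sample sizes, the F test statistic needs only the variance of the sample means and the mean of the sample variances. The sketch below uses Python's standard `statistics` module; the function name and the three small samples are invented for illustration.

```python
from statistics import mean, variance

def anova_f_equal_sizes(samples):
    """Return (F, numerator df, denominator df) for k samples of equal size n."""
    k = len(samples)
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must have the same size")
    sample_means = [mean(s) for s in samples]
    variance_of_means = variance(sample_means)           # variance of the k sample means
    pooled_variance = mean([variance(s) for s in samples])  # mean of the k sample variances
    f_stat = n * variance_of_means / pooled_variance
    return f_stat, k - 1, k * (n - 1)

# Made-up illustrative samples (not from the text)
samples = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_num, df_den = anova_f_equal_sizes(samples)
print(round(f_stat, 4), df_num, df_den)  # 13.0 2 6
```

The sample means are 2, 3 and 6 with variance 13/3, each sample variance is 1, and n = 3, so F = 3 · (13/3) / 1 = 13 with df = (2, 6).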
2. One-Way ANOVA with Unequal Sample Sizes
H0: µ1 = µ2 = µ3 = ...
H1: At least one of the means is different from the others.
Test statistic:
$$F = \frac{\text{variance between samples}}{\text{variance within samples}} = \frac{\left[\dfrac{\sum n_i\left(\bar{x}_i - \bar{\bar{x}}\right)^2}{k - 1}\right]}{\left[\dfrac{\sum (n_i - 1)\,s_i^2}{\sum (n_i - 1)}\right]} = \frac{MS(\text{treatment})}{MS(\text{error})}$$
• Notation:
$\bar{\bar{x}}$: mean of all sample values combined
$k$: number of population means being compared
$n_i$: number of values in the ith sample
$\bar{x}_i$: mean of values in the ith sample
$s_i^2$: variance of values in the ith sample
Critical values: F distribution (Table A-5 with numerator df = k – 1 and denominator
df = N – k with k = number of samples and N = total number of values in all samples
combined)
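With unequal sample sizes, the same F ratio is built from MS(treatment) and MS(error). The following sketch assumes each sample has at least two values; the function name and the data are hypothetical.

```python
from statistics import mean, variance

def anova_f_unequal_sizes(samples):
    """Return (F, numerator df, denominator df) for samples of possibly unequal size."""
    k = len(samples)
    sizes = [len(s) for s in samples]
    total_n = sum(sizes)
    grand_mean = sum(sum(s) for s in samples) / total_n  # mean of all values combined
    ms_treatment = sum(n_i * (mean(s) - grand_mean) ** 2
                       for n_i, s in zip(sizes, samples)) / (k - 1)
    ms_error = (sum((n_i - 1) * variance(s) for n_i, s in zip(sizes, samples))
                / sum(n_i - 1 for n_i in sizes))
    return ms_treatment / ms_error, k - 1, total_n - k

# Made-up illustrative samples (not from the text)
samples = [[1.0, 2.0, 3.0], [4.0, 6.0], [7.0, 8.0, 9.0, 10.0]]
f_stat, df_num, df_den = anova_f_unequal_sizes(samples)
print(round(f_stat, 4), df_num, df_den)
```

For these samples MS(error) = 9 / 6 = 1.5 and MS(treatment) = (5931 / 81) / 2, giving F = 5931 / 243 ≈ 24.41 with df = (2, 6). Note that the denominator df, ∑(n_i − 1), equals N − k.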
3. Two-Way ANOVA
H0 (row factor): There are no effects from the row factor (that is, the row means are equal).
H0 (column factor): There are no effects from the column factor (that is, the column means are equal).
Step 1. Interaction Effect
Test for an interaction between the two factors using $F = \dfrac{MS(\text{interaction})}{MS(\text{error})}$
Step 2. Row/Column Effect
Is there an effect due to interaction between the two factors?
➢ Yes (reject H0 of no interaction effect): Stop. Don’t consider the effects
of either factor without considering the effects of the other.
➢ No (fail to reject H0 of no interaction effect): Proceed with the two tests below.
→ Test for effect from row factor using $F = \dfrac{MS(\text{row factor})}{MS(\text{error})}$
Interpretation: Compare P-value with significance level. If the P-value
is less than or equal to the significance level, reject the null
hypothesis of no effects from the row factor. If the P-value is greater
than the significance level, fail to reject the null hypothesis of no
effects from the row factor.
→ Test for effect from column factor using $F = \dfrac{MS(\text{column factor})}{MS(\text{error})}$
Interpretation: Compare P-value with significance level. If the P-value
is less than or equal to the significance level, reject the null
hypothesis of no effects from the column factor. If the P-value is
greater than the significance level, fail to reject the null hypothesis of
no effects from the column factor.
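The two-step procedure can be expressed as a small decision function. This is only a sketch: the function name is hypothetical, the P-values are assumed to come from software output, and it assumes the interaction test uses the same "P-value less than or equal to the significance level" rule stated for the row and column tests.

```python
def two_way_anova_decision(p_interaction, p_row, p_column, alpha=0.05):
    """Apply the interaction-first decision procedure to three P-values."""
    if p_interaction <= alpha:
        # Interaction effect present: stop without testing either factor alone.
        return {"interaction": "reject H0", "row": "not tested", "column": "not tested"}

    def decide(p_value):
        return "reject H0" if p_value <= alpha else "fail to reject H0"

    return {
        "interaction": "fail to reject H0",
        "row": decide(p_row),
        "column": decide(p_column),
    }

# Hypothetical P-values, for illustration only.
result = two_way_anova_decision(p_interaction=0.40, p_row=0.03, p_column=0.20)
print(result)
```

With these hypothetical P-values there is no evidence of interaction, the row factor shows an effect at the 0.05 level, and the column factor does not.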