Formulas

ES10003: Introduction to Statistics
Formulae & Similar Guide
1. Chebychev’s Inequality
For any population with mean $\mu$ and standard deviation $\sigma$, and any k > 1, the percentage of
observations that fall within the interval $[\mu \pm k\sigma]$ is at least $100\,[1 - (1/k^2)]\%$.
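As a quick illustration of the bound, the sketch below evaluates it for a few values of k. The function name `chebyshev_lower_bound` is hypothetical and chosen only for this example.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum percentage of observations within k standard deviations of the mean."""
    if k <= 1:
        # The inequality is stated for k > 1 only.
        raise ValueError("k must be greater than 1")
    return 100.0 * (1.0 - 1.0 / k**2)


for k in (1.5, 2, 3):
    bound = chebyshev_lower_bound(k)
    print(f"k = {k}: at least {bound:.1f}% of observations lie within mu +/- k*sigma")
```

The bound holds for any distribution, so it is usually much looser than what a specific distribution (for example the normal) would give.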
2. Combinations
The number of combinations of n items taken m at a time is
$$C_m^n = \frac{n!}{m!\,(n-m)!}$$
3. Permutations
The number of permutations of n items taken m at a time is
$$P_m^n = \frac{n!}{(n-m)!}$$
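The two counting formulas can be checked with Python's standard library (`math.comb` and `math.perm`, available from Python 3.8). The values of `n` and `m` below are arbitrary illustration values.

```python
import math

n, m = 5, 2  # illustrative values: choose 2 items out of 5

n_combinations = math.comb(n, m)  # n! / (m! (n-m)!), order ignored
n_permutations = math.perm(n, m)  # n! / (n-m)!, order matters

# Cross-check against the factorial formulas from the sheet.
assert n_combinations == math.factorial(n) // (math.factorial(m) * math.factorial(n - m))
assert n_permutations == math.factorial(n) // math.factorial(n - m)

print(n_combinations, n_permutations)
```

Each combination of m items can be ordered in m! ways, which is why the number of permutations is m! times the number of combinations.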
4. Multiplication rule in probability
P(A|B) P(B) = P(A ∩ B)
5. Bayes' Theorem
Let A and B be two events with P(A) > 0. Bayes' theorem states:
P(B|A) = [P(A|B) P(B)] / P(A)
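A small numerical sketch may make the multiplication rule and Bayes' theorem more concrete. All probabilities below are made-up illustration values, and P(A) is obtained with the law of total probability, a standard step that this sheet does not list.

```python
def bayes_posterior(p_a_given_b: float, p_b: float, p_a: float) -> float:
    """P(B|A) = P(A|B) P(B) / P(A)."""
    if p_a <= 0:
        raise ValueError("P(A) must be positive")
    return p_a_given_b * p_b / p_a


# Hypothetical inputs, for illustration only.
p_b = 0.01              # P(B)
p_a_given_b = 0.90      # P(A|B)
p_a_given_not_b = 0.05  # P(A|not B)

# Multiplication rule: P(A and B) = P(A|B) P(B).
p_a_and_b = p_a_given_b * p_b

# Law of total probability (assumption: not part of the sheet).
p_a = p_a_and_b + p_a_given_not_b * (1 - p_b)

posterior = bayes_posterior(p_a_given_b, p_b, p_a)
print(f"P(A and B) = {p_a_and_b:.4f}, P(A) = {p_a:.4f}, P(B|A) = {posterior:.4f}")
```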
6. Binomial Distribution
$$P(x) = \frac{N!}{x!\,(N-x)!}\,P^x (1-P)^{N-x}$$
where P(x) is the probability of x successes in N trials, P is the probability of success in a
single trial, and N is the number of trials (the sample size).
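The binomial formula translates directly into code. This is a minimal sketch; `binomial_pmf` is a hypothetical helper name, and the example values are arbitrary.

```python
import math


def binomial_pmf(x: int, n_trials: int, p: float) -> float:
    """Probability of exactly x successes in n_trials independent trials."""
    if not 0 <= x <= n_trials:
        return 0.0
    return math.comb(n_trials, x) * p**x * (1 - p) ** (n_trials - x)


# Probability of exactly 2 successes in 4 trials with success probability 0.5.
print(binomial_pmf(2, 4, 0.5))

# The probabilities over all possible x should sum to 1.
total = sum(binomial_pmf(x, 10, 0.3) for x in range(11))
print(total)
```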
7. The Poisson Distribution
The probability of x successes is:
$$P(x) = \frac{e^{-\lambda}\lambda^x}{x!}$$
where P(x) = the probability of x successes over a given time or space, given $\lambda$, and
$\lambda$ = the expected number of successes per time or space unit, $\lambda > 0$.
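The Poisson probability can be evaluated in the same way. The helper name `poisson_pmf` and the rate used below are assumptions made for this example.

```python
import math


def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x successes when lam successes are expected per unit."""
    if lam <= 0:
        raise ValueError("lambda must be positive")
    if x < 0:
        return 0.0
    return math.exp(-lam) * lam**x / math.factorial(x)


lam = 3.0  # illustrative expected number of successes per unit
for x in range(5):
    print(f"P({x}) = {poisson_pmf(x, lam):.4f}")

# Summing far enough into the tail should give a total very close to 1.
total = sum(poisson_pmf(x, lam) for x in range(60))
```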
8. Rectangular or continuous uniform distribution
$$f(x) = \begin{cases} \dfrac{1}{b-a} & \text{if } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$
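A minimal sketch of this density, with a hypothetical function name and arbitrary endpoints:

```python
def uniform_pdf(x: float, a: float, b: float) -> float:
    """Density of the continuous uniform distribution on [a, b]."""
    if b <= a:
        raise ValueError("b must be greater than a")
    return 1.0 / (b - a) if a <= x <= b else 0.0


print(uniform_pdf(3.0, 2.0, 6.0))  # inside the interval
print(uniform_pdf(7.0, 2.0, 6.0))  # outside the interval
```

The density is constant on the interval, so the probability of any sub-interval is simply its length divided by b - a.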
9. The Normal Distribution
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2 / 2\sigma^2}$$
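The normal density can be coded from the formula and compared with `statistics.NormalDist` from the Python standard library. The function name `normal_pdf` and the parameter values are illustrative assumptions.

```python
import math
from statistics import NormalDist


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal density evaluated directly from the formula."""
    variance = sigma**2
    return math.exp(-((x - mu) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)


value = normal_pdf(1.0, 0.0, 2.0)
reference = NormalDist(mu=0.0, sigma=2.0).pdf(1.0)  # library implementation
print(value, reference)
```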
10. Finite Population Correction factor
If the sample size n is not a small fraction of the population size N, then the finite population
correction factor is $\sqrt{\frac{N-n}{N-1}}$.
11. Standard error of the mean:
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
12. Standard Normal Distribution
$$Z = \frac{\bar{X} - \mu_X}{\sigma_{\bar{X}}}$$
13. Related to: Confidence Interval for sample variance
$$E(s^2) = \sigma^2$$
$$\mathrm{Var}(s^2) = \frac{2\sigma^4}{n-1}$$
$(n-1)s^2/\sigma^2$ follows a chi-squared distribution with n - 1 degrees of freedom:
$$\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$$
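The quantities in this section can be computed as sketched below. The sample and the hypothesized variance are made-up values, and the helper names are hypothetical. Chi-squared critical values are not in the Python standard library, so they would have to come from a table or another package.

```python
import statistics


def chi_square_statistic(sample: list[float], sigma2_0: float) -> float:
    """(n - 1) s^2 / sigma^2 for a hypothesized population variance sigma2_0."""
    n = len(sample)
    s2 = statistics.variance(sample)  # sample variance with n - 1 in the denominator
    return (n - 1) * s2 / sigma2_0


def variance_of_s2(sigma2: float, n: int) -> float:
    """Var(s^2) = 2 sigma^4 / (n - 1)."""
    return 2 * sigma2**2 / (n - 1)


sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # illustrative data
stat = chi_square_statistic(sample, sigma2_0=4.0)
print(stat, variance_of_s2(4.0, len(sample)))
```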
14. Efficiency
Efficiency is associated with smaller variance. If two estimators are both unbiased, the estimator
with the smaller variance is the more efficient of the two.
$$\text{Relative efficiency} = \frac{\mathrm{Var}(\hat{\theta}_2)}{\mathrm{Var}(\hat{\theta}_1)}$$
15. Covariance
Population covariance:
$$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$$
Sample covariance:
$$s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n-1}$$
16. Correlation
Population correlation:
$$\rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$$
Sample correlation:
$$r = \frac{s_{xy}}{s_x s_y}$$
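The sample covariance and sample correlation can be computed from raw data as follows. The function names and the small data set are assumptions used only for illustration.

```python
import math


def sample_covariance(xs: list[float], ys: list[float]) -> float:
    """Sum of cross-deviations divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def sample_correlation(xs: list[float], ys: list[float]) -> float:
    """Sample covariance divided by the product of the sample standard deviations."""
    s_x = math.sqrt(sample_covariance(xs, xs))  # covariance of x with itself is its variance
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
cov_xy = sample_covariance(xs, ys)
r = sample_correlation(xs, ys)
print(cov_xy, r)
```

Unlike the covariance, the correlation is unit-free and always lies between -1 and 1.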
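17. Tests of the difference between two population means, $\mu_d$, when the two samples are
dependent
Here $\bar{d}$ is the mean of the paired differences, $S_d$ is their sample standard deviation, and n is the
number of pairs. The test statistic under $H_0: \mu_d = 0$ is
$$Z = \frac{\bar{d}}{S_d/\sqrt{n}}$$
Confidence interval
$$\bar{d} \pm t_{n-1,\alpha/2}\,\frac{S_d}{\sqrt{n}}$$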
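A sketch of the paired-sample calculation follows. It assumes the standard error of the mean difference is S_d / sqrt(n), as in the confidence interval of this section. The data and function names are invented for the example, and the critical value 2.776 is the tabulated t value for 4 degrees of freedom with 0.025 in the upper tail.

```python
import math
import statistics


def paired_statistic(before: list[float], after: list[float]) -> tuple[float, float, float]:
    """Return (mean difference, std. dev. of differences, test statistic)."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # uses n - 1 in the denominator
    return d_bar, s_d, d_bar / (s_d / math.sqrt(n))


def paired_interval(d_bar: float, s_d: float, n: int, t_crit: float) -> tuple[float, float]:
    """d_bar +/- t_crit * s_d / sqrt(n); t_crit must be looked up by the user."""
    half_width = t_crit * s_d / math.sqrt(n)
    return d_bar - half_width, d_bar + half_width


before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [12.0, 14.0, 10.0, 13.0, 15.0]
d_bar, s_d, stat = paired_statistic(before, after)
low, high = paired_interval(d_bar, s_d, len(before), t_crit=2.776)
print(d_bar, s_d, stat, (low, high))
```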
18. Tests of the Difference between two Normal Population means:
Independent Samples
When $\sigma_x^2$ and $\sigma_y^2$ [the population variances] are known, the variance of $\bar{X} - \bar{Y}$ is
$\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}$ and the corresponding Z variable is defined as:
$$Z = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}}$$
Confidence intervals
$$(\bar{x} - \bar{y}) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{n_x} + \frac{\sigma_y^2}{n_y}}$$
19. Tests of the Difference Between Two Normal Population Means:
Independent Samples: $\sigma_x^2$ and $\sigma_y^2$ are unknown but assumed equal
$$t = \frac{(\bar{x} - \bar{y}) - (\mu_x - \mu_y)}{\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}}$$
$$s_p^2 = \frac{(n_x - 1)s_x^2 + (n_y - 1)s_y^2}{n_x + n_y - 2}$$
Confidence intervals
$$(\bar{x} - \bar{y}) \pm t_{n_x + n_y - 2,\,\alpha/2}\sqrt{\frac{s_p^2}{n_x} + \frac{s_p^2}{n_y}}$$
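The pooled-variance calculation can be sketched as below. Function names and inputs are illustrative assumptions. The t critical value with n_x + n_y - 2 degrees of freedom is not computed here because the standard library has no t distribution; it would come from a table.

```python
import math


def pooled_variance(s2_x: float, s2_y: float, n_x: int, n_y: int) -> float:
    """Weighted average of the two sample variances, weights n - 1."""
    return ((n_x - 1) * s2_x + (n_y - 1) * s2_y) / (n_x + n_y - 2)


def pooled_t(x_bar, y_bar, s2_x, s2_y, n_x, n_y, diff_0=0.0):
    """t statistic for mu_x - mu_y = diff_0 assuming equal population variances."""
    sp2 = pooled_variance(s2_x, s2_y, n_x, n_y)
    se = math.sqrt(sp2 / n_x + sp2 / n_y)
    return ((x_bar - y_bar) - diff_0) / se


# Illustrative summary statistics.
sp2 = pooled_variance(4.0, 6.0, 11, 11)
t_stat = pooled_t(22.0, 20.0, 4.0, 6.0, 11, 11)
degrees_of_freedom = 11 + 11 - 2
print(sp2, t_stat, degrees_of_freedom)
```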
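20. Test Hypotheses for the Difference Between Two Population Proportions
$$Z = \frac{\hat{p}_x - \hat{p}_y}{\sqrt{\frac{\hat{p}_0(1-\hat{p}_0)}{n_x} + \frac{\hat{p}_0(1-\hat{p}_0)}{n_y}}}$$
$$\hat{p}_0 = \frac{n_x\hat{p}_x + n_y\hat{p}_y}{n_x + n_y}$$
Confidence intervals
$$(\hat{p}_x - \hat{p}_y) \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}_x(1-\hat{p}_x)}{n_x} + \frac{\hat{p}_y(1-\hat{p}_y)}{n_y}}$$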
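The test for two proportions uses the pooled estimate in the standard error, while the confidence interval uses the two separate sample proportions. The sketch below shows both; names and inputs are assumptions for illustration.

```python
import math
from statistics import NormalDist


def two_proportion_z(p_x: float, p_y: float, n_x: int, n_y: int) -> tuple[float, float]:
    """Return (Z statistic, pooled proportion) for H0: equal population proportions."""
    p_0 = (n_x * p_x + n_y * p_y) / (n_x + n_y)
    se = math.sqrt(p_0 * (1 - p_0) / n_x + p_0 * (1 - p_0) / n_y)
    return (p_x - p_y) / se, p_0


def two_proportion_interval(p_x, p_y, n_x, n_y, alpha=0.05):
    """Interval for p_x - p_y using the unpooled standard error."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = math.sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y)
    diff = p_x - p_y
    return diff - z_crit * se, diff + z_crit * se


# Illustrative sample proportions and sizes.
z, p_0 = two_proportion_z(0.6, 0.5, 100, 100)
low, high = two_proportion_interval(0.6, 0.5, 100, 100)
print(z, p_0, (low, high))
```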
21. Tests of Equality of Two Variances
$$F = \frac{s_x^2/\sigma_x^2}{s_y^2/\sigma_y^2}$$
Under the null hypothesis the population variances are equal, so this reduces to
$$F = \frac{s_x^2}{s_y^2}$$
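22. Regression
$$Y_i = \beta_0 + \beta_1 X_i + e_i$$
Formula for the slope:
$$\hat{\beta}_1 = \frac{\sum xy - n\bar{X}\bar{Y}}{\sum x^2 - n\bar{X}^2}$$
Formula for the intercept:
$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{X}$$
Test statistic for the slope, where $\beta$ is the hypothesized value:
$$t \text{ or } Z = \frac{\hat{\beta}_1 - \beta}{SE(\hat{\beta}_1)}$$
Confidence intervals
$$\hat{\beta}_1 \pm t(\alpha, v)\,SE(\hat{\beta}_1)$$
This is the 95% confidence interval when the critical value $t(\alpha, v)$ is taken at the 5% level with v degrees of freedom.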
23. Coefficient of Determination $R^2$
$$R^2 = \frac{(\sum xy - n\bar{X}\bar{Y})^2}{(\sum x^2 - n\bar{X}^2)(\sum y^2 - n\bar{Y}^2)}$$
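The coefficient of determination can be computed from the same sums of products used for the regression slope. The function name and data below are assumptions for illustration.

```python
def r_squared(xs: list[float], ys: list[float]) -> float:
    """R^2 from the sums-of-products formula for simple regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    s_xx = sum(x * x for x in xs) - n * x_bar**2
    s_yy = sum(y * y for y in ys) - n * y_bar**2
    return s_xy**2 / (s_xx * s_yy)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
r2 = r_squared(xs, ys)
print(r2)
```

In simple regression with one explanatory variable, this value equals the square of the sample correlation between x and y.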