ES10003: Introduction to Statistics
Formulae & Symbols Guide
1. Chebyshev's Inequality
For any population with mean μ and standard deviation σ, and any k > 1, the percentage of observations that fall within the interval [μ ± kσ] is at least $100\left[1 - (1/k^{2})\right]\%$.
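As a quick numerical illustration (not part of the formula sheet itself), the bound can be checked on simulated data. The exponential distribution, sample size, and seed below are arbitrary choices:

```python
import random

# Empirical check of Chebyshev's inequality: for ANY distribution, at least
# 100*(1 - 1/k^2)% of observations lie within mu +/- k*sigma.
random.seed(0)
data = [random.expovariate(1.0) for _ in range(100_000)]  # mean 1, sd 1

n = len(data)
mu = sum(data) / n
sigma = (sum((x - mu) ** 2 for x in data) / n) ** 0.5  # population-style sd

k = 2
within = sum(1 for x in data if abs(x - mu) <= k * sigma) / n
bound = 1 - 1 / k**2  # Chebyshev guarantees at least this fraction

print(f"observed: {within:.3f}, Chebyshev lower bound: {bound:.3f}")
```

The observed fraction is typically well above the bound; Chebyshev only guarantees a (often loose) minimum.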
2. Combinations
The number of combinations of n items taken m at a time:
$$ {}_{n}C_{m} = \frac{n!}{m!\,(n-m)!} $$
3. Permutations
The number of permutations of n items taken m at a time:
$$ {}_{n}P_{m} = \frac{n!}{(n-m)!} $$
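A minimal worked example of both counting formulas, using factorials directly and cross-checking against Python's built-in `math.comb` and `math.perm`:

```python
import math

n, m = 10, 3

# nCm = n! / (m! (n-m)!): order does not matter
combinations = math.factorial(n) // (math.factorial(m) * math.factorial(n - m))

# nPm = n! / (n-m)!: order matters
permutations = math.factorial(n) // math.factorial(n - m)

print(combinations, permutations)
assert combinations == math.comb(n, m)   # built-in agrees
assert permutations == math.perm(n, m)
```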
4. Multiplication rule in probability
$$ P(A \mid B)\,P(B) = P(A \cap B) $$
5. Bayes' Theorem
Let A and B be two events. Bayes' theorem states:
$$ P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)} $$
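A short numerical sketch of Bayes' theorem combined with the multiplication rule. The diagnostic-test numbers are hypothetical, chosen purely for illustration:

```python
# Hypothetical numbers (not from the course notes):
# B = "has condition", A = "test is positive".
p_B = 0.01             # P(B): prevalence
p_A_given_B = 0.95     # P(A|B): sensitivity
p_A_given_notB = 0.05  # P(A|not B): false-positive rate

# Total probability: P(A) = P(A|B)P(B) + P(A|not B)P(not B)
p_A = p_A_given_B * p_B + p_A_given_notB * (1 - p_B)

# Bayes' theorem: P(B|A) = P(A|B)P(B) / P(A)
p_B_given_A = p_A_given_B * p_B / p_A

print(f"P(B|A) = {p_B_given_A:.4f}")
```

Note how a positive result from an accurate test can still leave P(B|A) low when the prior P(B) is small.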
6. Binomial Distribution
$$ P(x) = \frac{N!}{x!\,(N-x)!}\,P^{x}(1-P)^{N-x} $$
where P(x) is the probability of x successes in N trials, P is the probability of success in a single trial, and N is the sample size (number of trials).
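The binomial formula translated directly into code, with an arbitrary example (3 successes in 10 fair-coin trials) and a sanity check that the probabilities sum to 1:

```python
import math

def binomial_pmf(x, N, P):
    """P(x) = N! / (x! (N-x)!) * P^x * (1-P)^(N-x)."""
    return math.comb(N, x) * P**x * (1 - P) ** (N - x)

# Example: exactly 3 successes in 10 trials with P = 0.5
p3 = binomial_pmf(3, 10, 0.5)
print(f"{p3:.4f}")

# Sanity check: P(x) over all x = 0..N sums to 1
total = sum(binomial_pmf(x, 10, 0.5) for x in range(11))
```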
7. The Poisson Distribution
The probability of x successes is:
$$ P(x) = \frac{e^{-\lambda}\,\lambda^{x}}{x!} $$
where P(x) is the probability of x successes over a given interval of time or space, and λ is the expected number of successes per time or space unit, λ > 0.
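The Poisson pmf written out directly; λ = 2 below is an arbitrary illustrative rate:

```python
import math

def poisson_pmf(x, lam):
    """P(x) = e^(-lambda) * lambda^x / x!"""
    return math.exp(-lam) * lam**x / math.factorial(x)

# Example: lambda = 2 expected successes per unit of time or space
p0 = poisson_pmf(0, 2.0)  # = e^-2
p2 = poisson_pmf(2, 2.0)  # = e^-2 * 2^2 / 2!
print(f"P(0)={p0:.4f}, P(2)={p2:.4f}")
```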
8. Rectangular or continuous uniform distribution
$$ f(x) = \begin{cases} \dfrac{1}{b-a} & \text{if } a \le x \le b \\[4pt] 0 & \text{otherwise} \end{cases} $$
9. The Normal Distribution
$$ f(x) = \frac{1}{\sqrt{2\pi\sigma^{2}}}\;e^{-(x-\mu)^{2}/2\sigma^{2}} $$
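The normal density written out as a function. The value at x = μ is the peak, $1/\sqrt{2\pi\sigma^{2}}$, which the example below evaluates for the standard normal:

```python
import math

def normal_pdf(x, mu, sigma):
    """f(x) = 1/sqrt(2*pi*sigma^2) * exp(-(x-mu)^2 / (2*sigma^2))."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / math.sqrt(2 * math.pi * sigma**2)

# Peak of the standard normal density, at x = mu = 0 with sigma = 1
peak = normal_pdf(0.0, 0.0, 1.0)
print(f"{peak:.4f}")
```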
10. Finite Population Correction factor
If the sample size n is not a small fraction of the population size N, then a finite population correction is applied:
$$ \sqrt{\frac{N-n}{N-1}} $$
11. Standard error of the mean:
$$ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} $$
12. Standard Normal Distribution
$$ Z = \frac{\bar{x} - \mu_{x}}{\sigma_{\bar{x}}} $$
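Sections 10–12 fit together in one short calculation. All the numbers below (population mean and sd, sample and population sizes, observed sample mean) are hypothetical:

```python
import math

# Hypothetical numbers for illustration
sigma = 10.0       # population standard deviation
mu = 50.0          # population mean
n, N = 100, 1000   # sample size and population size
x_bar = 52.0       # observed sample mean

se = sigma / math.sqrt(n)           # standard error of the mean
fpc = math.sqrt((N - n) / (N - 1))  # finite population correction
se_corrected = se * fpc             # used when n is not a small fraction of N

z = (x_bar - mu) / se               # standard normal test statistic
print(f"se={se:.3f}, fpc={fpc:.4f}, z={z:.2f}")
```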
13. Results used for the Confidence Interval for the sample variance
$$ E(s^{2}) = \sigma^{2}, \qquad \mathrm{Var}(s^{2}) = \frac{2\sigma^{4}}{n-1} $$
$(n-1)s^{2}/\sigma^{2}$ follows a chi-squared distribution with $n-1$ degrees of freedom:
$$ \frac{(n-1)s^{2}}{\sigma^{2}} \sim \chi^{2}_{n-1} $$
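A simulation sketch (not part of the sheet) checking $E(s^{2}) = \sigma^{2}$ and $\mathrm{Var}(s^{2}) = 2\sigma^{4}/(n-1)$ for normal samples; μ = 0, σ = 1, and the replication count are arbitrary choices:

```python
import random

# Repeatedly draw samples of size n from N(0, 1) and compute s^2 each time.
random.seed(1)
n, reps = 10, 20_000

s2_values = []
for _ in range(reps):
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    x_bar = sum(sample) / n
    s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)  # sample variance
    s2_values.append(s2)

mean_s2 = sum(s2_values) / reps
var_s2 = sum((v - mean_s2) ** 2 for v in s2_values) / reps

print(f"mean of s^2 = {mean_s2:.3f} (theory 1.000)")
print(f"var  of s^2 = {var_s2:.3f} (theory {2 / (n - 1):.3f})")
```

With σ = 1 the theoretical values are $E(s^{2}) = 1$ and $\mathrm{Var}(s^{2}) = 2/9 \approx 0.222$.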
14. Efficiency
Efficiency is associated with smaller variance: if two estimators are both unbiased, the one with the smaller variance is the more efficient.
$$ \text{Relative Efficiency} = \frac{\mathrm{Var}(\hat{\theta}_{2})}{\mathrm{Var}(\hat{\theta}_{1})} $$
15. Covariance
Population covariance: $\mathrm{Cov}(x, y) = E\left[(X - \mu_{X})(Y - \mu_{Y})\right]$. Sample covariance:
$$ s_{xy} = \frac{\sum (x_{i} - \bar{x})(y_{i} - \bar{y})}{n-1} $$
16. Correlation
Population correlation: $\rho = \dfrac{\sigma_{xy}}{\sigma_{x}\sigma_{y}}$.
Sample correlation: $r = \dfrac{s_{xy}}{s_{x}s_{y}}$.
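The sample covariance and correlation formulas applied to a small made-up data set:

```python
# Small illustrative data set (invented for this example)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# s_xy = sum((x_i - x_bar)(y_i - y_bar)) / (n - 1)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Sample standard deviations (divisor n - 1)
s_x = (sum((xi - x_bar) ** 2 for xi in x) / (n - 1)) ** 0.5
s_y = (sum((yi - y_bar) ** 2 for yi in y) / (n - 1)) ** 0.5

# r = s_xy / (s_x * s_y): sample correlation
r = s_xy / (s_x * s_y)
print(f"s_xy={s_xy:.3f}, r={r:.3f}")
```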
17. Tests of the difference between two population means, μ_d, when the two samples are dependent
$$ t = \frac{\bar{d}}{s_{d}/\sqrt{n}} $$
Confidence interval:
$$ \bar{d} \pm t_{n-1,\,\alpha/2}\,\frac{s_{d}}{\sqrt{n}} $$
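The paired-samples t statistic on hypothetical before/after measurements (all values invented for illustration):

```python
import math

# Paired (dependent) samples: hypothetical before/after measurements
before = [12.0, 15.0, 11.0, 14.0, 13.0]
after  = [10.0, 14.0, 11.0, 12.0, 11.0]

d = [b - a for b, a in zip(before, after)]  # paired differences
n = len(d)
d_bar = sum(d) / n
s_d = math.sqrt(sum((di - d_bar) ** 2 for di in d) / (n - 1))

# t = d_bar / (s_d / sqrt(n)), with n - 1 degrees of freedom
t = d_bar / (s_d / math.sqrt(n))
print(f"d_bar={d_bar:.2f}, s_d={s_d:.3f}, t={t:.3f}")
```

The statistic would be compared against the $t_{n-1}$ critical value at the chosen significance level.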
18. Tests of the Difference between two Normal Population means: Independent Samples
When $\sigma_{x}^{2}$ and $\sigma_{y}^{2}$ (the population variances) are known, the variance of $\bar{x} - \bar{y}$ is $\dfrac{\sigma_{x}^{2}}{n_{x}} + \dfrac{\sigma_{y}^{2}}{n_{y}}$, and the corresponding Z variable is defined as:
$$ Z = \frac{(\bar{x} - \bar{y}) - (\mu_{x} - \mu_{y})}{\sqrt{\dfrac{\sigma_{x}^{2}}{n_{x}} + \dfrac{\sigma_{y}^{2}}{n_{y}}}} $$
Confidence intervals:
$$ (\bar{x} - \bar{y}) \pm z_{\alpha/2}\sqrt{\frac{\sigma_{x}^{2}}{n_{x}} + \frac{\sigma_{y}^{2}}{n_{y}}} $$
19. Tests of the Difference Between Two Normal Population Means: Independent Samples: $\sigma_{x}^{2}$ and $\sigma_{y}^{2}$ are unknown but assumed equal
$$ t = \frac{(\bar{x} - \bar{y}) - (\mu_{x} - \mu_{y})}{\sqrt{\dfrac{s_{p}^{2}}{n_{x}} + \dfrac{s_{p}^{2}}{n_{y}}}} $$
where the pooled variance is
$$ s_{p}^{2} = \frac{(n_{x} - 1)s_{x}^{2} + (n_{y} - 1)s_{y}^{2}}{n_{x} + n_{y} - 2} $$
Confidence intervals:
$$ (\bar{x} - \bar{y}) \pm t_{n_{x}+n_{y}-2,\,\alpha/2}\sqrt{\frac{s_{p}^{2}}{n_{x}} + \frac{s_{p}^{2}}{n_{y}}} $$
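A worked example of the pooled-variance t statistic on two small invented samples:

```python
import math

# Hypothetical independent samples, variances unknown but assumed equal
x = [5.0, 7.0, 6.0, 9.0, 8.0]
y = [4.0, 6.0, 5.0, 5.0, 5.0]

n_x, n_y = len(x), len(y)
x_bar, y_bar = sum(x) / n_x, sum(y) / n_y
s2_x = sum((v - x_bar) ** 2 for v in x) / (n_x - 1)
s2_y = sum((v - y_bar) ** 2 for v in y) / (n_y - 1)

# Pooled variance: s_p^2 = ((n_x-1)s_x^2 + (n_y-1)s_y^2) / (n_x + n_y - 2)
s2_p = ((n_x - 1) * s2_x + (n_y - 1) * s2_y) / (n_x + n_y - 2)

# t statistic under H0: mu_x - mu_y = 0, with n_x + n_y - 2 df
t = (x_bar - y_bar) / math.sqrt(s2_p / n_x + s2_p / n_y)
print(f"s_p^2={s2_p:.3f}, t={t:.3f}")
```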
20. Test Hypotheses for the Difference Between Two Population Proportions
$$ Z = \frac{\hat{p}_{x} - \hat{p}_{y}}{\sqrt{\dfrac{\hat{p}_{0}(1 - \hat{p}_{0})}{n_{x}} + \dfrac{\hat{p}_{0}(1 - \hat{p}_{0})}{n_{y}}}} $$
where the pooled proportion is
$$ \hat{p}_{0} = \frac{n_{x}\hat{p}_{x} + n_{y}\hat{p}_{y}}{n_{x} + n_{y}} $$
Confidence intervals:
$$ (\hat{p}_{x} - \hat{p}_{y}) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_{x}(1 - \hat{p}_{x})}{n_{x}} + \frac{\hat{p}_{y}(1 - \hat{p}_{y})}{n_{y}}} $$
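The two-proportion Z test evaluated on hypothetical counts (successes out of trials in each group, invented for illustration):

```python
import math

# Hypothetical counts: successes out of trials in each group
x_succ, n_x = 45, 100
y_succ, n_y = 30, 100

p_x = x_succ / n_x
p_y = y_succ / n_y

# Pooled proportion: p0 = (n_x p_x + n_y p_y) / (n_x + n_y)
p0 = (n_x * p_x + n_y * p_y) / (n_x + n_y)

# z = (p_x - p_y) / sqrt(p0(1-p0)/n_x + p0(1-p0)/n_y)
z = (p_x - p_y) / math.sqrt(p0 * (1 - p0) / n_x + p0 * (1 - p0) / n_y)
print(f"p0={p0:.3f}, z={z:.3f}")
```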
21. Tests of Equality of Two Variances
$$ F = \frac{s_{x}^{2}/\sigma_{x}^{2}}{s_{y}^{2}/\sigma_{y}^{2}} $$
Under the null hypothesis that the population variances are equal, this reduces to:
$$ F = \frac{s_{x}^{2}}{s_{y}^{2}} $$
22. Regression
$$ Y_{i} = \beta_{0} + \beta_{1}X_{i} + u_{i} $$
Formula for the slope:
$$ \hat{\beta}_{1} = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^{2} - n\bar{x}^{2}} $$
Formula for the intercept:
$$ \hat{\beta}_{0} = \bar{y} - \hat{\beta}_{1}\bar{x} $$
t test:
$$ t = \frac{\hat{\beta}_{1} - \beta}{SE(\hat{\beta}_{1})} $$
Confidence intervals:
$$ \hat{\beta}_{1} \pm t_{(\alpha,\,v)}\,SE(\hat{\beta}_{1}) $$
(e.g. with α = 0.05 this gives a 95% confidence interval).
23. Coefficient of Determination $R^{2}$
$$ R^{2} = \frac{\left(\sum xy - n\bar{x}\bar{y}\right)^{2}}{\left(\sum x^{2} - n\bar{x}^{2}\right)\left(\sum y^{2} - n\bar{y}^{2}\right)} $$
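The regression slope, intercept, and $R^{2}$ formulas computed directly on a small invented data set:

```python
# OLS slope, intercept, and R^2 using the formula-sheet expressions
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi**2 for xi in x)
sum_y2 = sum(yi**2 for yi in y)

# Slope: beta1 = (sum(xy) - n*x_bar*y_bar) / (sum(x^2) - n*x_bar^2)
beta1 = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar**2)

# Intercept: beta0 = y_bar - beta1 * x_bar
beta0 = y_bar - beta1 * x_bar

# R^2 = (sum(xy) - n*x_bar*y_bar)^2
#       / ((sum(x^2) - n*x_bar^2) * (sum(y^2) - n*y_bar^2))
r2 = (sum_xy - n * x_bar * y_bar) ** 2 / (
    (sum_x2 - n * x_bar**2) * (sum_y2 - n * y_bar**2)
)

print(f"beta1={beta1:.2f}, beta0={beta0:.2f}, R^2={r2:.3f}")
```

For simple regression, $R^{2}$ equals the square of the sample correlation $r$ from section 16.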