confidence intervals

SAMPLE MEAN and its distribution
[Figure: a population of values from which repeated samples are drawn; each sample gives its own sample mean (x̄1, x̄2, x̄3, x̄4)]
SAMPLE MEAN and its distribution
$$E(\bar{X}) = \mu$$
$$\sigma_{\bar{X}} = SE_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
CENTRAL LIMIT THEOREM:
If a sufficiently large sample is taken from a population with any
distribution with mean μ and standard deviation σ, then the sample
mean has approximately a normal distribution N(μ, σ²/n).
It means that:
• the sample mean is a good estimate of the population mean μ
• with increasing sample size n, the standard error SE decreases and
the estimate of the population mean becomes more reliable
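The theorem can be checked with a small simulation. The sketch below is only an illustration: the exponential population (μ = 1, σ = 1), the sample sizes and the number of repetitions are assumptions chosen for the demo, not values from the slides. It compares the observed spread of sample means with σ/√n.

```python
import math
import random
import statistics


def simulate_sample_means(sample_size, repetitions, seed=0):
    """Draw `repetitions` samples from an exponential population; return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(repetitions)
    ]


# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1 (clearly not normal).
MU, SIGMA = 1.0, 1.0

for n in (4, 25, 100):
    means = simulate_sample_means(sample_size=n, repetitions=4000)
    print(
        f"n={n:3d}  mean of sample means={statistics.fmean(means):.3f}  "
        f"SD of sample means={statistics.stdev(means):.3f}  "
        f"sigma/sqrt(n)={SIGMA / math.sqrt(n):.3f}"
    )
```

The mean of the sample means should stay close to μ, while their standard deviation should track σ/√n and shrink as n grows.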
SAMPLE MEAN and its distribution
http://onlinestatbook.com/stat_sim/sampling_dist/index.html
ESTIMATORS
• point
• interval
Properties of Point Estimators
• UNBIASEDNESS
• CONSISTENCY
• EFFICIENCY
Properties of Point Estimators
UNBIASEDNESS
An estimator is unbiased if, based on repeated
sampling from the population, the average value of
the estimator equals the population parameter. In
other words, for an unbiased estimator, the expected
value of the point estimator equals the population
parameter.
[Figure: individual sample estimates scattered around the true value of the population parameter]
BASIC PROPERTIES OF
POINT ESTIMATORS
[Figure: sample estimates y and their "average" M compared with the true value of the population parameter; the gap between M and the true value is the bias of the estimates]
Properties of Point Estimators
CONSISTENCY
An estimator is consistent if it approaches the unknown
population parameter being estimated as the sample size
grows larger
Consistency implies that we will get the inference right if we
take a large enough sample. For instance, the sample mean
collapses to the population mean (X̅ → μ) as the sample size
approaches infinity (n → ∞). An unbiased estimator is
consistent if its standard deviation, or its standard error,
collapses to zero as the sample size increases.
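A minimal sketch of consistency, assuming a uniform population on (0, 1) with μ = 0.5 (an illustrative choice, not from the slides): the absolute error of the sample mean is computed for increasingly large samples.

```python
import random
import statistics

TRUE_MEAN = 0.5  # mean of the assumed uniform(0, 1) population


def sample_mean_error(n, seed=1):
    """Return |x_bar - mu| for one sample of size n from uniform(0, 1)."""
    rng = random.Random(seed)
    sample = [rng.uniform(0.0, 1.0) for _ in range(n)]
    return abs(statistics.fmean(sample) - TRUE_MEAN)


for n in (10, 1_000, 100_000):
    print(f"n={n:>7}  |x_bar - mu| = {sample_mean_error(n):.5f}")
```

A single run is random, so the error need not fall at every step, but it typically shrinks towards zero as n increases, in line with a standard error of σ/√n.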
Properties of Point Estimators
EFFICIENCY
An unbiased estimator is efficient if its standard error is
lower than that of other unbiased estimators
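Efficiency can be illustrated by comparing two estimators of the centre of a normal population. The choice of the sample mean and the sample median is an assumption made for this sketch (the slides do not name specific estimators); both are unbiased for the centre of a symmetric population, but their standard errors differ.

```python
import random
import statistics


def estimator_spread(estimator, sample_size=25, repetitions=4000, seed=2):
    """Apply `estimator` to repeated N(0, 1) samples; return (average, SD) of the estimates."""
    rng = random.Random(seed)
    estimates = [
        estimator([rng.gauss(0.0, 1.0) for _ in range(sample_size)])
        for _ in range(repetitions)
    ]
    return statistics.fmean(estimates), statistics.stdev(estimates)


# The same seed is used for both calls, so both estimators see identical samples.
mean_avg, mean_se = estimator_spread(statistics.fmean)
median_avg, median_se = estimator_spread(statistics.median)

print(f"sample mean:   average={mean_avg:.3f}  standard error={mean_se:.3f}")
print(f"sample median: average={median_avg:.3f}  standard error={median_se:.3f}")
```

Both averages should lie near the true value 0, while the sample mean should show the smaller standard error, which makes it the more efficient of the two under these assumptions.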
[Figure: two unbiased estimators compared, one with large variability (inefficient) and one with small variability (efficient)]
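POINT ESTIMATES
Point estimate of population mean:
$$E(\bar{X}) = \mu$$
Point estimate of population variance:
$$\hat{\sigma}^2 = \frac{n}{n-1}\,S^2$$
The factor n/(n − 1) is a bias correction; S² here denotes the sample variance computed with divisor n.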
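The effect of the bias correction can be seen in a simulation. This is a sketch under assumed settings: a normal population with σ² = 4, samples of size 5 and 20000 repetitions are illustrative choices, not values from the slides.

```python
import random
import statistics


def average_variance_estimates(sample_size=5, repetitions=20000, seed=3):
    """Average the uncorrected and corrected variance estimates over many samples."""
    rng = random.Random(seed)
    uncorrected, corrected = [], []
    for _ in range(repetitions):
        sample = [rng.gauss(0.0, 2.0) for _ in range(sample_size)]  # true variance = 4
        s2 = statistics.pvariance(sample)  # sample variance with divisor n
        uncorrected.append(s2)
        corrected.append(s2 * sample_size / (sample_size - 1))  # bias correction n/(n-1)
    return statistics.fmean(uncorrected), statistics.fmean(corrected)


avg_uncorrected, avg_corrected = average_variance_estimates()
print(f"average uncorrected estimate: {avg_uncorrected:.3f}")
print(f"average corrected estimate:   {avg_corrected:.3f}")
```

With n = 5 the uncorrected estimate should average about (n − 1)/n · σ² = 3.2, while the corrected one should average close to the true value 4.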
POINT ESTIMATES
[Figure: sample mean X̄ (sample) and population mean μ (population) on a number line]
The distance between X̄ and μ is unknown: we do not know the exact value of μ, so
we cannot quantify the reliability of our estimate.
INTERVAL ESTIMATES
Confidence interval for the parameter τ with confidence
level 1 − α, α ∈ (0, 1), is bounded by the statistics T1 and T2:
P(T1 ≤ τ ≤ T2) = 1 − α
[Figure: number line with the point estimate X̄ and the interval limits T1 and T2]
• X̄: point estimate of the unknown population mean μ computed
from sample data; we do not know anything about its
distance from the real population mean
• (T1, T2): interval estimate of the unknown population mean μ;
we suppose that with probability P = 1 − α the population mean μ
lies somewhere in this interval of the number line
CONFIDENCE LEVEL 1 − α
IN INTERVAL ESTIMATES
[Figure: interval estimates computed from repeated samples (x̄1, x̄2, ...) drawn on a number line around the population mean μ]
• An interval that does not include the real value of the
population mean is "incorrect"; there will be
at most (100·α) % of these "incorrect" estimates.
• Intervals that include the real value of the population mean
are "correct"; there will be at least
(1 − α)·100 % of these "correct" estimates.
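The long-run meaning of the confidence level can be sketched by simulation. The code below assumes a normal population with known σ and uses the normal-based interval x̄ ± z·σ/√n; the values of μ, σ, n and the number of repetitions are illustrative assumptions.

```python
import math
import random
import statistics
from statistics import NormalDist


def coverage(alpha=0.05, n=30, repetitions=4000, mu=10.0, sigma=2.0, seed=4):
    """Fraction of simulated intervals x_bar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(repetitions):
        x_bar = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1  # this interval is "correct"
    return hits / repetitions


print(f"share of 'correct' intervals: {coverage():.3f}")
```

Under these assumptions the share of "correct" intervals should be close to 1 − α = 0.95, and the share of "incorrect" ones close to α.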
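TWO-SIDED INTERVAL ESTIMATES
α1 and α2 represent the statistical
risk that the real population
parameter lies outside the
interval (outside the limits
T1 and T2):
α1 = α/2
α2 = α/2
P = 1 − α = 1 − (α1 + α2)
[Figure: distribution of the statistic T with the limits T1 and T2; the tail below T1 has area α1 and the tail above T2 has area α2]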
ONE-SIDED INTERVAL ESTIMATES
LEFT-SIDED ESTIMATE
P(τ > T1 ) = 1 - α
RIGHT-SIDED ESTIMATE
P(τ < T2 ) = 1 - α
COMPARISON OF TWO- AND ONE-SIDED INTERVAL ESTIMATES
[Figure: one-sided interval estimate with limit T1, P = 1 − α, with the whole risk α in one tail; two-sided interval estimate with limits T1 and T2, P = 1 − α, with risk α/2 in each tail]
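The practical difference between the two kinds of estimate shows up in the critical value used for the limit. As a sketch, assuming limits based on the standard normal distribution (an assumption for this example), a one-sided estimate puts the whole risk α in one tail and a two-sided estimate puts α/2 in each tail.

```python
from statistics import NormalDist


def critical_values(alpha):
    """Return (one-sided, two-sided) standard normal critical values for risk alpha."""
    std_normal = NormalDist()
    one_sided = std_normal.inv_cdf(1 - alpha)      # whole risk alpha in one tail
    two_sided = std_normal.inv_cdf(1 - alpha / 2)  # risk alpha/2 in each tail
    return one_sided, two_sided


one_sided, two_sided = critical_values(0.05)
print(f"one-sided z = {one_sided:.3f}, two-sided z = {two_sided:.3f}")
```

For α = 0.05 the one-sided value is about 1.645 and the two-sided value about 1.960, so at the same confidence level the one-sided limit lies closer to the point estimate.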
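CONFIDENCE INTERVAL (CI) OF
POPULATION MEAN μ
• small sample (fewer than 30 measurements)
$$\bar{x} - t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}}$$
The left-hand side is the lower limit of the CI and the right-hand side is the upper limit of the CI.
S is the (bias-corrected) sample standard deviation.
$t_{\alpha/2,\,n-1}$: critical value of Student's t-distribution with (n − 1)
degrees of freedom, i.e. the value exceeded with probability α/2 (the (1 − α/2) quantile)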
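A minimal sketch of the small-sample interval using only the Python standard library. The ten data values are hypothetical and serve only as an illustration; because the standard library has no t-distribution, the critical value t(0.025, 9) = 2.262 is passed in as taken from standard statistical tables.

```python
import math
import statistics


def t_confidence_interval(sample, t_critical):
    """Two-sided CI for the mean: x_bar -/+ t_critical * S / sqrt(n)."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample SD with divisor n - 1
    half_width = t_critical * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical sample of n = 10 measurements (illustration only).
sample = [17, 29, 35, 16, 12, 15, 26, 28, 23, 28]
T_CRIT_95_DF9 = 2.262  # table value: alpha = 0.05, n - 1 = 9 degrees of freedom

lower, upper = t_confidence_interval(sample, T_CRIT_95_DF9)
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

Passing the critical value as a parameter keeps the function independent of any particular table or library; for a different n or α a different table value is needed.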
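CONFIDENCE INTERVAL (CI)
OF POPULATION MEAN μ
• large sample (over 30 data points)
$$\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
The left-hand side is the lower limit of the CI and the right-hand side is the upper limit of the CI.
Instead of σ (the population SD) it is possible to use the sample estimate S.
$z_{\alpha/2}$: quantile of the standardised normal distribution (the value exceeded with probability α/2)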
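The large-sample interval can be sketched with the standard library's `NormalDist`. The summary values used in the example call (x̄ = 50, S = 10, n = 100) are hypothetical, and S is used in place of σ.

```python
import math
from statistics import NormalDist


def z_confidence_interval(x_bar, s, n, alpha=0.05):
    """Two-sided large-sample CI for the mean: x_bar -/+ z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # value exceeded with probability alpha/2
    half_width = z * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical summary statistics of a large sample (illustration only).
lower, upper = z_confidence_interval(x_bar=50.0, s=10.0, n=100)
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

With these inputs the standard error is 10/√100 = 1, so the half-width of the interval equals the critical value itself, about 1.96.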
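CONFIDENCE INTERVAL (CI) OF
POPULATION STAND. DEVIATION σ
• for small samples
$$\sqrt{\frac{(n-1)\,S^2}{\chi^2_{\alpha/2,\,n-1}}} \le \sigma \le \sqrt{\frac{(n-1)\,S^2}{\chi^2_{1-\alpha/2,\,n-1}}}$$
$\chi^2_{p,\,n-1}$: critical value of the chi-square distribution with (n − 1) degrees of freedom that is exceeded with probability p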
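A sketch of the small-sample interval for σ, again limited to the standard library. The data are hypothetical, and the two chi-square critical values for 9 degrees of freedom (19.023 exceeded with probability 0.025, 2.700 exceeded with probability 0.975) are passed in as taken from standard statistical tables.

```python
import math
import statistics


def sd_confidence_interval(sample, chi2_upper, chi2_lower):
    """Two-sided CI for sigma from chi-square critical values with n - 1 df.

    chi2_upper: value exceeded with probability alpha/2 (the larger one)
    chi2_lower: value exceeded with probability 1 - alpha/2 (the smaller one)
    """
    n = len(sample)
    s2 = statistics.variance(sample)  # sample variance with divisor n - 1
    lower = math.sqrt((n - 1) * s2 / chi2_upper)
    upper = math.sqrt((n - 1) * s2 / chi2_lower)
    return lower, upper


# Hypothetical sample of n = 10 measurements (illustration only).
sample = [17, 29, 35, 16, 12, 15, 26, 28, 23, 28]
CHI2_UPPER_DF9 = 19.023  # table value: exceeded with probability 0.025
CHI2_LOWER_DF9 = 2.700   # table value: exceeded with probability 0.975

lower, upper = sd_confidence_interval(sample, CHI2_UPPER_DF9, CHI2_LOWER_DF9)
print(f"95% CI for sigma: ({lower:.2f}, {upper:.2f})")
```

The interval is not symmetric around S because the chi-square distribution is skewed: the larger critical value gives the lower limit and the smaller one gives the upper limit.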
CONFIDENCE INTERVAL (CI) OF
POPULATION STAND. DEVIATION σ
• for large samples
$$\sigma = S \pm z_{\alpha/2} \cdot \frac{S}{\sqrt{2n}}$$
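The large-sample formula translates directly into code. In this sketch the inputs S = 10 and n = 200 are hypothetical values chosen so that √(2n) = 20.

```python
import math
from statistics import NormalDist


def large_sample_sd_interval(s, n, alpha=0.05):
    """Approximate two-sided CI for sigma: S -/+ z * S / sqrt(2n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # value exceeded with probability alpha/2
    half_width = z * s / math.sqrt(2 * n)
    return s - half_width, s + half_width


# Hypothetical summary statistics of a large sample (illustration only).
lower, upper = large_sample_sd_interval(s=10.0, n=200)
print(f"95% CI for sigma: ({lower:.2f}, {upper:.2f})")
```

Unlike the chi-square interval, this approximation is symmetric around S, which is reasonable only when n is large.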