Sampling for Estimation

4/13/2015
Sampling for Estimation
Instructor: Ron S. Kenett
Email: ron@kpa.co.il
Course Website: www.kpa.co.il/biostat
Course textbook: MODERN INDUSTRIAL STATISTICS,
Kenett and Zacks, Duxbury Press, 1998
(c) 2001, Ron S. Kenett, Ph.D.
Course Syllabus
•Understanding Variability
•Variability in Several Dimensions
•Basic Models of Probability
•Sampling for Estimation of Population Quantities
•Parametric Statistical Inference
•Computer Intensive Techniques
•Multiple Linear Regression
•Statistical Process Control
•Design of Experiments
Key Terms

•Error: sampling, nonsampling
•Standard error: of the mean, of the proportion
•Standardized: individual value, sample mean
•Finite Population Correction (FPC)
•Probability sample: simple random sample, systematic sample, stratified sample, cluster sample
•Nonprobability sample: convenience sample, quota sample, purposive sample, judgment sample
Key Terms





•Unbiased estimator
•Point estimates
•Interval estimates
•Interval limits
•Confidence coefficient
•Confidence level
•Accuracy
•Degrees of freedom (df)
•Maximum likely sampling error
Types of Samples
Probability, or Scientific, Samples: Each element to be
sampled has a known (or calculable) chance of being selected.

•Simple random: Every person has an equal chance of being selected. Best when a roster of the population exists.
•Systematic: Randomly enter a stream of elements and sample every kth element. Best when elements are randomly ordered, with no cyclic variation.
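The two selection rules above can be sketched in a few lines of Python. This is only an illustration under assumptions: the roster of 100 element IDs, the seed, and the function names are hypothetical, not part of the course material.

```python
import random

def simple_random_sample(roster, n, rng):
    """Draw n elements without replacement; every element is equally likely."""
    return rng.sample(roster, n)

def systematic_sample(stream, k, rng):
    """Pick a random start among the first k elements, then take every kth one."""
    start = rng.randrange(k)
    return stream[start::k]

rng = random.Random(0)          # fixed seed so the sketch is repeatable
roster = list(range(1, 101))    # hypothetical roster of 100 element IDs

srs = simple_random_sample(roster, 10, rng)
sys_sample = systematic_sample(roster, 10, rng)
print(srs)
print(sys_sample)
```

Note that the systematic sample is only as random as the ordering of the roster, which is why the slide warns against cyclic variation.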
Types of Samples
Probability, or Scientific, Samples: Each element to be
sampled has a known (or calculable) chance of being selected.

•Stratified: Randomly sample elements from every layer, or stratum, of the population. Best when elements within strata are homogeneous.
•Cluster: Randomly sample elements within some of the clusters (groups) of the population. Best when elements within clusters are heterogeneous.
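A minimal Python sketch of the difference between the two designs: stratified sampling draws from every group, cluster sampling draws from only some of them. The four groups of 25 IDs and all names here are hypothetical.

```python
import random

def stratified_sample(strata, n_per_stratum, rng):
    """Randomly sample n_per_stratum elements from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Randomly choose some clusters, then sample within each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

rng = random.Random(1)
# Hypothetical population: 4 groups with 25 element IDs each.
groups = {f"G{g}": list(range(g * 100, g * 100 + 25)) for g in range(4)}

strat = stratified_sample(groups, 5, rng)   # all 4 groups represented
clust = cluster_sample(groups, 2, 5, rng)   # only 2 groups represented
print(strat)
print(clust)
```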
Types of Samples
Nonprobability Samples: Not every element has a chance to
be sampled. Selection process usually involves subjectivity.

•Convenience: Elements are sampled because of ease and availability.
•Quota: Elements are sampled, but not randomly, from every layer, or stratum, of the population.
Types of Samples
Nonprobability Samples: Not every element has a chance to
be sampled. Selection process usually involves subjectivity.

•Purposive: Elements are sampled because they are atypical, not representative of the population.
•Judgment: Elements are sampled because the researcher believes the members are representative of the population.
Distribution of the Mean

When the population is normally distributed

Shape: Regardless of sample size, the
distribution of sample means will be normally
distributed.

Center: The mean of the distribution of
sample means is the mean of the population.
Sample size does not affect the center of the
distribution.

Spread: The standard deviation of the distribution of sample means, or the standard error, is $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$.
The Standardized Mean

The standardized z-score measures how far the sample mean lies above or below the population mean, in units of standard error.

•"How far above or below": sample mean minus $\mu$
•"In units of standard error": divide by $\sigma/\sqrt{n}$

Standardized sample mean:
$z = \dfrac{\text{sample mean} - \mu}{\text{standard error}} = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
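As a worked illustration of the standardized mean, the sketch below computes $z$ and the matching upper-tail probability. The numbers (population mean 100, standard deviation 15, sample of 36 with mean 104) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def standardized_mean(x_bar, mu, sigma, n):
    """z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / sqrt(n))

# Hypothetical numbers: mu = 100, sigma = 15, n = 36, observed mean 104.
z = standardized_mean(104, 100, 15, 36)   # standard error = 15 / 6 = 2.5
upper_tail = 1 - NormalDist().cdf(z)      # P(sample mean > 104)
print(z, upper_tail)
```

Here the sample mean sits 1.6 standard errors above the population mean.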
Distribution of the Mean

When the population is not normally
distributed

Shape: When the sample size is sufficiently large, the distribution of sample means will be approximately normally distributed, regardless of the shape of the underlying population. According to the Central Limit Theorem, the larger the sample size, the closer the distribution of sample means comes to normal.
Distribution of the Mean

When the population is not normally
distributed

Center: The mean of the distribution of
sample means is the mean of the population, µ.
Sample size does not affect the center of the
distribution.

Spread: The standard deviation of the distribution of sample means, or the standard error, is $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$.
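The center and spread claims for a non-normal population can be checked with a small simulation. This is a sketch under assumptions: the exponential population (mean 1, standard deviation 1), the sample size of 30, the number of repetitions, and the seed are all illustrative choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(42)

def sample_means(n, reps):
    """Means of `reps` samples of size n from an exponential population."""
    return [mean(rng.expovariate(1.0) for _ in range(n)) for _ in range(reps)]

means_30 = sample_means(30, 4000)
center = mean(means_30)    # should be close to the population mean, 1
spread = stdev(means_30)   # should be close to sigma / sqrt(n) = 1 / sqrt(30)
print(center, spread, 1 / sqrt(30))
```

A histogram of `means_30` would also show the roughly bell-shaped form described under Shape, even though the exponential population is strongly skewed.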
Distribution of the Proportion

When the sample statistic is generated by a count rather than a measurement, the proportion of successes in a sample of n trials is $p = x/n$, where $x$ is the number of successes.

Shape: Whenever both $n\pi$ and $n(1-\pi)$ are greater than or equal to 5, where $\pi$ is the population proportion, the distribution of sample proportions will be approximately normally distributed.
Distribution of the Proportion

When the sample proportion of successes in a
sample of n trials is p,

Center: The center of the distribution of sample proportions is the population proportion, $\pi$.

Spread: The standard deviation of the distribution of sample proportions, or the standard error, is $\sigma_p = \sqrt{\dfrac{\pi(1-\pi)}{n}}$.
Distribution of the Proportion

The standardized z-score measures how far the sample proportion lies above or below the population proportion, in units of standard error.

•"How far above or below": sample $p$ minus $\pi$
•"In units of standard error": divide by $\sigma_p = \sqrt{\pi(1-\pi)/n}$

Standardized sample proportion:
$z = \dfrac{\text{sample proportion} - \pi}{\text{standard error}} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1-\pi)}{n}}}$
Finite Population Correction

Finite Population Correction (FPC) Factor: $\text{FPC} = \sqrt{\dfrac{N-n}{N-1}}$

•Rule of Thumb: Use FPC when $n > 5\% \cdot N$.
•Apply to: Standard errors of mean and proportion.
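A short sketch of how the correction factor and the 5% rule of thumb might be applied to the standard error of the mean. The function names and the numbers ($\sigma = 12$, $n = 100$, $N = 500$) are hypothetical.

```python
from math import sqrt

def fpc(N, n):
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))

def standard_error_mean(sigma, n, N=None):
    """sigma / sqrt(n), multiplied by the FPC when n exceeds 5% of N."""
    se = sigma / sqrt(n)
    if N is not None and n > 0.05 * N:
        se *= fpc(N, n)
    return se

# Hypothetical: sigma = 12, n = 100 drawn from N = 500 (20% of the population).
print(fpc(500, 100))
print(standard_error_mean(12, 100), standard_error_mean(12, 100, N=500))
```

Because the factor is always below 1 when $n > 1$, applying it shrinks the standard error; for small sampling fractions it is so close to 1 that it is usually ignored.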
Unbiased Point Estimates
Population Parameter    Sample Statistic    Formula
Mean, $\mu$             $\bar{x}$           $\bar{x} = \dfrac{\sum x_i}{n}$
Variance, $\sigma^2$    $s^2$               $s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n-1}$
Proportion, $\pi$       $p$                 $p = \dfrac{x \text{ successes}}{n \text{ trials}}$
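The three point estimates can be computed with Python's standard library. The five measurements and the count of 12 successes in 40 trials are made-up illustration data.

```python
from statistics import mean, variance

# Hypothetical sample of five measurements.
x = [4.0, 7.0, 6.0, 5.0, 8.0]
x_bar = mean(x)       # sum(x_i) / n
s2 = variance(x)      # sum((x_i - x_bar)^2) / (n - 1), the unbiased form

# Hypothetical count data: 12 successes in 40 trials.
successes, trials = 12, 40
p = successes / trials
print(x_bar, s2, p)
```

`statistics.variance` divides by $n - 1$, which is what makes $s^2$ an unbiased estimator of $\sigma^2$; `statistics.pvariance` divides by $n$ instead.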
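Confidence Intervals: µ, σ Known
$\bar{x} \pm z \cdot \dfrac{\sigma}{\sqrt{n}}$
where $\bar{x}$ = sample mean
$\sigma$ = population standard deviation
n = sample size
z = standard normal score for area in tail = $\alpha/2$
ASSUMPTION: infinite population
[Figure: normal curve with central area $1-\alpha$ and area $\alpha/2$ in each tail; the z axis runs from $-z$ through 0 to $+z$, and the matching $\bar{x}$ axis runs from $\bar{x} - z \cdot \sigma/\sqrt{n}$ through $\bar{x}$ to $\bar{x} + z \cdot \sigma/\sqrt{n}$.]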
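A minimal sketch of this interval in Python, using `statistics.NormalDist` to look up $z$. The function name and the numbers ($\bar{x} = 50$, $\sigma = 10$, $n = 25$) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """x_bar +/- z * sigma / sqrt(n), with z cutting off alpha/2 in each tail."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical: x_bar = 50, sigma = 10, n = 25, 95% confidence.
low, high = z_interval(50, 10, 25)
print(low, high)
```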
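Confidence Intervals: µ, σ Unknown
$\bar{x} \pm t \cdot \dfrac{s}{\sqrt{n}}$
where $\bar{x}$ = sample mean
s = sample standard deviation
n = sample size
t = t-score for area in tail = $\alpha/2$
df = n – 1
ASSUMPTION: population approximately normal and infinite
[Figure: t curve with central area $1-\alpha$ and area $\alpha/2$ in each tail; the t axis runs from $-t$ through 0 to $+t$, and the matching $\bar{x}$ axis runs from $\bar{x} - t \cdot s/\sqrt{n}$ through $\bar{x}$ to $\bar{x} + t \cdot s/\sqrt{n}$.]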
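The t-based interval can be sketched the same way. Python's standard library has no t distribution, so this example assumes the critical value is read from a t table and passed in; the ten data values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(sample, t_crit):
    """x_bar +/- t * s / sqrt(n); t_crit must match df = n - 1 and alpha."""
    n = len(sample)
    x_bar, s = mean(sample), stdev(sample)
    half_width = t_crit * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical sample of n = 10, so df = 9. The tabled t value for a 95%
# interval (area 0.025 in each tail) with 9 degrees of freedom is 2.262.
sample = [9.8, 10.2, 10.4, 9.9, 10.0, 10.1, 9.7, 10.3, 10.0, 9.6]
low, high = t_interval(sample, t_crit=2.262)
print(low, high)
```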
Confidence Intervals on π
$p \pm z \cdot \sqrt{\dfrac{p(1-p)}{n}}$
where p = sample proportion
n = sample size
z = standard normal score for area in tail = $\alpha/2$
ASSUMPTION: n•p > 5, n•(1–p) > 5, and population infinite
[Figure: normal curve with central area $1-\alpha$ and area $\alpha/2$ in each tail; the p axis runs from $p - z \cdot \sqrt{p(1-p)/n}$ through $p$ to $p + z \cdot \sqrt{p(1-p)/n}$.]
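A sketch of the proportion interval, including the slide's assumption check. The function name and the data (60 successes in 200 trials) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """p +/- z * sqrt(p * (1 - p) / n) for the sample proportion p."""
    p = successes / n
    # Assumption from the slide: n*p > 5 and n*(1 - p) > 5.
    if n * p <= 5 or n * (1 - p) <= 5:
        raise ValueError("normal approximation not appropriate")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# Hypothetical: 60 successes in 200 trials, 95% confidence.
low, high = proportion_interval(60, 200)
print(low, high)
```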
Confidence Intervals for Finite Populations

Mean:
$\bar{x} \pm z_{\alpha/2} \cdot \dfrac{\sigma}{\sqrt{n}} \cdot \sqrt{\dfrac{N-n}{N-1}}$
or
$\bar{x} \pm t_{\alpha/2} \cdot \dfrac{s}{\sqrt{n}} \cdot \sqrt{\dfrac{N-n}{N-1}}$

Proportion:
$p \pm z_{\alpha/2} \cdot \sqrt{\dfrac{p(1-p)}{n}} \cdot \sqrt{\dfrac{N-n}{N-1}}$
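The finite-population intervals differ from the infinite-population ones only by the factor $\sqrt{(N-n)/(N-1)}$. A sketch for the z-based mean and proportion cases, with hypothetical function names and inputs:

```python
from math import sqrt
from statistics import NormalDist

def _z(confidence):
    """Standard normal score with area alpha/2 in the upper tail."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def finite_mean_interval(x_bar, sigma, n, N, confidence=0.95):
    """x_bar +/- z * sigma / sqrt(n) * sqrt((N - n) / (N - 1))."""
    hw = _z(confidence) * sigma / sqrt(n) * sqrt((N - n) / (N - 1))
    return x_bar - hw, x_bar + hw

def finite_proportion_interval(p, n, N, confidence=0.95):
    """p +/- z * sqrt(p * (1 - p) / n) * sqrt((N - n) / (N - 1))."""
    hw = _z(confidence) * sqrt(p * (1 - p) / n) * sqrt((N - n) / (N - 1))
    return p - hw, p + hw

# Hypothetical: x_bar = 50, sigma = 10, n = 25 drawn from N = 100.
m_low, m_high = finite_mean_interval(50, 10, 25, 100)
# Hypothetical: p = 0.3 from n = 200 drawn from N = 1000.
p_low, p_high = finite_proportion_interval(0.3, 200, 1000)
print(m_low, m_high, p_low, p_high)
```

Both intervals come out narrower than their infinite-population counterparts, since sampling a large share of a finite population leaves less uncertainty.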
Interpretation of Confidence Intervals
•If samples of size n are taken repeatedly from the same population and an interval is computed from each, about $100(1-\alpha)\%$ of those intervals will contain the population parameter.
OR
•We can be $100(1-\alpha)\%$ confident that the population parameter falls within the stated confidence interval.
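The repeated-sampling interpretation can be illustrated by simulation: build many 95% intervals from samples of a known population and count how often they contain the true mean. The normal population ($\mu = 100$, $\sigma = 15$), sample size, repetition count, and seed are all hypothetical choices for this sketch.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

rng = random.Random(7)
mu, sigma, n, reps = 100.0, 15.0, 25, 2000   # hypothetical population and design
z = NormalDist().inv_cdf(0.975)              # 95% confidence
half_width = z * sigma / sqrt(n)

hits = 0
for _ in range(reps):
    x_bar = mean(rng.gauss(mu, sigma) for _ in range(n))
    if x_bar - half_width <= mu <= x_bar + half_width:
        hits += 1

coverage = hits / reps   # fraction of intervals that contain mu
print(coverage)
```

The printed fraction should land near 0.95, though it varies slightly with the seed.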
Sample Size Determination for
Infinite Populations

•Mean: Note $\sigma$ is known and e, the bound within which you want to estimate µ, is given.
•The interval half-width is e, also called the maximum likely error:
$e = z \cdot \dfrac{\sigma}{\sqrt{n}}$
•Solving for n, we find:
$n = \dfrac{z^2 \cdot \sigma^2}{e^2}$
Sample Size Determination for
Finite Populations

Mean: Note $\sigma$ is known and e, the bound within which you want to estimate µ, is given.
$n = \dfrac{\sigma^2}{\dfrac{e^2}{z^2} + \dfrac{\sigma^2}{N}}$
where
n = required sample size
N = population size
z = z-score for $100(1-\alpha)\%$ confidence
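Both sample-size formulas for the mean, $n = z^2\sigma^2/e^2$ for infinite populations and $n = \sigma^2 / (e^2/z^2 + \sigma^2/N)$ for finite ones, fit in one small function. Rounding up to the next whole unit is an assumption of this sketch (the usual practice, but not stated on the slides), and the inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_for_mean(sigma, e, confidence=0.95, N=None):
    """Required sample size to estimate the mean within +/- e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    if N is None:
        n = z**2 * sigma**2 / e**2                    # infinite population
    else:
        n = sigma**2 / (e**2 / z**2 + sigma**2 / N)   # finite population
    return ceil(n)                                    # round up (assumption)

# Hypothetical: sigma = 10, e = 2, 95% confidence.
n_infinite = n_for_mean(10, 2)
n_finite = n_for_mean(10, 2, N=500)
print(n_infinite, n_finite)
```

With these inputs the finite-population requirement (81) is smaller than the infinite-population one (97), as expected.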
Sample Size Determination of p for
Infinite Populations

•Proportion: Note e, the bound within which you want to estimate $\pi$, is given.
•The interval half-width is e, also called the maximum likely error:
$e = z \cdot \sqrt{\dfrac{p(1-p)}{n}}$
•Solving for n, we find:
$n = \dfrac{z^2 \, p(1-p)}{e^2}$
Sample Size Determination of p for
Finite Populations

Proportion: Note e, the bound within which you want to estimate $\pi$, is given.
$n = \dfrac{p(1-p)}{\dfrac{e^2}{z^2} + \dfrac{p(1-p)}{N}}$
where
n = required sample size
N = population size
z = z-score for $100(1-\alpha)\%$ confidence
p = sample estimator of $\pi$
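The proportion versions, $n = z^2 p(1-p)/e^2$ for infinite populations and $n = p(1-p) / (e^2/z^2 + p(1-p)/N)$ for finite ones, can be sketched the same way. Rounding up and the planning value $p = 0.5$ are assumptions of this example, not taken from the slides.

```python
from math import ceil
from statistics import NormalDist

def n_for_proportion(p, e, confidence=0.95, N=None):
    """Required sample size to estimate a proportion within +/- e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    pq = p * (1 - p)
    if N is None:
        n = z**2 * pq / e**2             # infinite population
    else:
        n = pq / (e**2 / z**2 + pq / N)  # finite population
    return ceil(n)                       # round up (assumption)

# Hypothetical: planning value p = 0.5, e = 0.05, 95% confidence.
n_infinite = n_for_proportion(0.5, 0.05)
n_finite = n_for_proportion(0.5, 0.05, N=2000)
print(n_infinite, n_finite)
```

Since $p(1-p)$ is largest at $p = 0.5$, that planning value gives the most conservative (largest) sample size when no prior estimate of the proportion is available.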