Uploaded by Deepansh Goyal

IPS-Slides-S13

Introduction to Probability and Statistics
Sayantan Banerjee
Session 13
Sampling and Sampling distributions
• Statistics is the science of inference.
• It is the science of generalization from a part (the randomly
chosen sample) to the whole (the population).
• A random sample of n elements is a sample selected from the
population in such a way that every set of n elements is
equally likely to be selected.
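The definition above can be sketched in Python. This is only an illustration: the population of 1000 labelled elements and the sample size are assumptions, not part of the slides. `random.sample` draws without replacement, so every set of n elements is equally likely.

```python
import random

# Hypothetical population: element labels 1..1000 (assumed for illustration).
population = list(range(1, 1001))
n = 10  # assumed sample size

rng = random.Random(13)  # fixed seed so the draw is reproducible

# Simple random sample without replacement: every set of n labels
# has the same chance of being selected.
sample = rng.sample(population, n)
print(sample)
```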
Sample statistics
• A population may be a large, sometimes infinite, collection of
elements.
• A numerical measure of a population is called a population
parameter, or simply a parameter.
• A numerical measure of the sample is called a sample
statistic, or simply, a statistic. Note that a statistic is free of
any unknown population parameter.
Sample statistics
• An estimator of a population parameter is a sample statistic
used to estimate the parameter. An estimate of the
parameter is a particular numerical value of the estimator
based on the random sample.
• When a single value is used as an estimate, we get a point
estimate of the parameter.
Sample statistics
Consider a population of students in the age-group 5-9 years. The
problem of interest is to obtain an estimate of the mean height of
the student population in that age-group.
• We first draw a random sample of size 100 from the
population.
• Suppose the true mean height is µ, which we wish to estimate.
• Let X1, ..., Xn be the heights of the n = 100 students selected,
with observed values x1, ..., xn.
• A sample statistic to estimate µ may be given by the sample
mean,
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i.$$
• Suppose we observe the value of the sample mean as x̄ = 4.
Then this value is our estimate of µ based on the sample.
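A minimal sketch of the estimator/estimate distinction for this example. The heights are simulated, and the values of `mu` and `sigma` are assumptions made only so the code has data to work with; in practice µ is unknown and only the observed x1, ..., xn are available.

```python
import random

def sample_mean(values):
    """Sample mean: the estimator (1/n) * sum of the observations."""
    return sum(values) / len(values)

rng = random.Random(0)

# Assumed population values, used only to simulate data for illustration.
mu, sigma, n = 4.0, 0.3, 100

# Observed heights x1, ..., xn of the n sampled students (simulated here).
heights = [rng.gauss(mu, sigma) for _ in range(n)]

# The point estimate of mu is the value of the estimator on this sample.
x_bar = sample_mean(heights)
print(round(x_bar, 3))
```

A different random sample would give a different estimate, while the estimator (the rule `sample_mean`) stays the same.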
Sample statistics
• The population proportion p is equal to the number of
elements in the population belonging to a category of interest,
divided by the total number of elements in the population.
• The sample proportion is given by
$$\hat{p} = \frac{x}{n},$$
where x is the number of elements in the sample belonging to
the particular category, and n is the sample size.
Sample statistics
A market research worker interviewed a random sample of 18 people
about their use of a certain product. The results, in terms of Y or N
(Y for a user of the product, N otherwise), are as follows: Y
N N Y Y Y N Y N Y Y Y N Y N Y Y N. Estimate the population
proportion of users of the product.
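One way to work this exercise is to count the Y responses and divide by the sample size, following the sample proportion formula. A short sketch (variable names are illustrative):

```python
# The 18 responses listed in the exercise.
responses = "Y N N Y Y Y N Y N Y Y Y N Y N Y Y N".split()

n = len(responses)        # sample size
x = responses.count("Y")  # number of users in the sample
p_hat = x / n             # sample proportion, the point estimate of p

print(x, n, round(p_hat, 3))  # prints: 11 18 0.611
```

So the point estimate of the population proportion of users is 11/18, about 0.61.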
Sampling techniques
• Simple Random Sampling (with/without replacement)
• Stratified Sampling
• Cluster Sampling
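The slide only names the techniques. As a rough sketch under the usual textbook definitions (stratified: draw a simple random sample within every group; cluster: randomly select whole groups and keep all their units), the three schemes might look as follows. The 12-unit population and its group labels are hypothetical.

```python
import random

rng = random.Random(7)

# Hypothetical population: 12 units, each tagged with a group label A, B or C.
population = [(i, "ABC"[i % 3]) for i in range(12)]

# Simple random sampling of 4 units.
srs_without = rng.sample(population, 4)   # without replacement
srs_with = rng.choices(population, k=4)   # with replacement (repeats possible)

# Collect the units belonging to each group.
groups = {}
for unit in population:
    groups.setdefault(unit[1], []).append(unit)

# Stratified sampling: a simple random sample of 2 units from every stratum.
stratified = [u for members in groups.values() for u in rng.sample(members, 2)]

# Cluster sampling: pick 2 whole groups at random and keep all their units.
chosen = rng.sample(sorted(groups), 2)
cluster = [u for label in chosen for u in groups[label]]

print(srs_without, srs_with, stratified, cluster, sep="\n")
```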
Sampling distributions
The sampling distribution of a statistic is the probability
distribution of that statistic.
For example, for a random sample X1 , . . . , Xn from some
probability distribution, the sampling distribution of the sample
mean X̄ is the probability distribution of all possible values the
random variable X̄ may take when a sample of size n is taken.
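For a small discrete case, the sampling distribution of the sample mean can be written out exactly by listing every possible sample. The sketch below assumes, purely for illustration, that each observation is a roll of a fair six-sided die and that n = 2.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)  # assumed distribution: a fair six-sided die
n = 2                # assumed sample size

# Enumerate all 6**n equally likely samples and record each sample mean.
counts = Counter(Fraction(sum(s), n) for s in product(faces, repeat=n))
total = 6 ** n

# Sampling distribution of the sample mean: value -> probability.
dist = {value: Fraction(c, total) for value, c in sorted(counts.items())}

for value, prob in dist.items():
    print(float(value), prob)
```

The resulting table is the probability distribution of X̄ for this setting: each possible value of the sample mean together with its probability.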
Sampling distribution of X̄ for Normal population
Let X1, ..., Xn be a random sample from a N(µ, σ²)
distribution. Then,
X̄ ∼ N(µ, σ²/n).
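A quick simulation can make this result plausible, though it is a check and not a proof. The parameter values, sample size and number of replications below are all assumptions chosen for illustration.

```python
import random
from statistics import fmean, variance

rng = random.Random(42)

# Assumed population parameters and sample size.
mu, sigma, n = 10.0, 2.0, 25
reps = 5000  # number of simulated samples

# Draw many samples of size n from N(mu, sigma^2) and record each sample mean.
means = [sum(rng.gauss(mu, sigma) for _ in range(n)) / n for _ in range(reps)]

# The simulated means should centre near mu, with variance near sigma^2 / n.
print("mean of sample means:    ", round(fmean(means), 3), "vs", mu)
print("variance of sample means:", round(variance(means), 4), "vs", sigma**2 / n)
```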
The Central Limit Theorem
Suppose we sample randomly from a population with mean µ and
finite variance σ². When the sample size n becomes large, the
sampling distribution of X̄ is approximately Normal with
mean µ and variance σ²/n.
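The theorem can be illustrated by simulation with a clearly non-Normal population. The sketch below assumes an Exponential population with rate 1 (so µ = 1 and σ² = 1), n = 50 and 5000 replications; these choices are illustrative only. If X̄ is approximately Normal, about 95% of the simulated sample means should fall within 1.96 standard errors of µ.

```python
import random
from math import sqrt
from statistics import fmean, variance

rng = random.Random(2024)

# Assumed skewed population: Exponential with rate 1, so mu = 1 and sigma^2 = 1.
mu, sigma2 = 1.0, 1.0
n, reps = 50, 5000

# Sample means from repeated random samples of size n.
means = [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]

# Share of sample means within 1.96 standard errors of mu.
se = sqrt(sigma2 / n)
within = sum(abs(m - mu) <= 1.96 * se for m in means) / reps

print("mean of sample means:    ", round(fmean(means), 3))
print("variance of sample means:", round(variance(means), 4), "vs", sigma2 / n)
print("share within 1.96 SE:    ", round(within, 3))
```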