Uploaded by roseelf14

3.-Sampling Distribution

A Sampling Distribution
Sample and population (ASW, 15)


A population is the collection of all the
elements of interest.
A sample is a subset of the population.
Good or bad samples.
Representative or non-representative
samples. A researcher hopes to obtain
a sample that represents the population,
at least in the variables of interest for
the issue being examined.
Sample and population (ASW, 15)

Probabilistic samples are samples
selected using the principles of
probability. This may allow a
researcher to determine the sampling
distribution of a sample statistic. If so,
the researcher can determine the
probability of any given sampling error
and make statistical inferences about
population characteristics.
Methods of sampling – probabilistic



Random sampling methods – each
member has an equal probability of being
selected.
Systematic Random – every kth case.
Equivalent to random if patterns in list are
unrelated to issues of interest.
Stratified random samples – sample from
each stratum or subgroup of a population. Eg.
region, size of firm.
Population inferences can be made by selecting
a representative sample from the population.
Methods of sampling – probability



Cluster samples – sample only certain
clusters of members of a population. Eg. city
blocks, firms.
Multistage samples – combinations of
random, systematic, stratified, and cluster
sampling.
If probability involved at each stage, then
distribution of sample statistics can be
obtained.
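The probabilistic methods listed above can be sketched in a few lines of Python. This is only an illustrative sketch: the sampling frame of 1,500 members, the `region` field, and the function names are all hypothetical, not part of the original slides.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: 1,500 member IDs, each tagged with a made-up region.
population = [{"id": i, "region": random.choice(["north", "south", "east"])}
              for i in range(1, 1501)]

def simple_random(frame, n):
    """Every member has the same chance of being selected."""
    return random.sample(frame, n)

def systematic(frame, n):
    """Random start, then every k-th case, with k = len(frame) // n."""
    k = len(frame) // n
    start = random.randrange(k)
    return frame[start::k][:n]

def stratified(frame, key, n_per_stratum):
    """Draw a simple random sample inside each stratum."""
    strata = {}
    for member in frame:
        strata.setdefault(member[key], []).append(member)
    return {name: random.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster(frame, key, n_clusters):
    """Pick whole clusters at random and keep every member of the chosen clusters."""
    names = sorted({member[key] for member in frame})
    chosen = set(random.sample(names, n_clusters))
    return [member for member in frame if member[key] in chosen]

srs = simple_random(population, 150)
sys_sample = systematic(population, 150)
strat = stratified(population, "region", 50)
clus = cluster(population, "region", 1)
```

Because chance is involved at every step, each of these designs gives every member a known selection probability, which is what makes a sampling distribution obtainable.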
Methods of sampling – nonprobability



Friends, family, neighbours,
acquaintances.
Students in a class or co-workers in a
workplace.
Convenience – how willing a person is to
interact with you as your subject counts
a lot in this nonprobability sampling
method.
Methods of sampling – nonprobability


Volunteers – the subjects themselves
volunteer to make up the sample, so there
is no need for you to carry out any
selection process.
Snowball sample – like a snowball that
grows as it rolls, the sample expands as
current subjects refer you to further subjects.
Methods of sampling – nonprobabilistic


Quota sample – you choose sample
members who possess or indicate
the characteristics of the target
population.
Sampling distribution of statistics cannot
be obtained using any of the above
methods, so statistical inference is not
possible.
Why sample?




Time of researcher and those being surveyed.
Cost to group or agency commissioning the
survey.
Confidentiality, anonymity, and other ethical
issues.
Non-interference with population. Large
sample could alter the nature of population,
eg. opinion surveys.
Why sample?



Do not destroy population, eg. crash
test only a small sample of
automobiles.
Cooperation of respondents –
individuals, firms, administrative
agencies.
Partial data is all that is available, eg.
fossils and historical records, climate
change.
Examples:

1. Have a list of all members of the
population; write each name on a card, and
choose cards through a pure-chance
selection. SIMPLE RANDOM
2. You want a sample of 150 out of a list of
1,500 students: pick a random starting point
from 1 to 10, then take every 10th name on
the list (1,500 / 150 = 10) until you complete
the total number of respondents to constitute
your sample. SYSTEMATIC RANDOM
3. Dissimilarity of sample with those in
the sampling frame. STRATIFIED RANDOM
4. Group-by-group selection of sample.
CLUSTER
5. Checking every 10th student in the
list. SYSTEMATIC RANDOM
6. Interviewing some persons you meet
on the campus. CONVENIENCE
7. Dividing 100 persons into groups.
CLUSTER
8. Choosing subjects capable of helping
you meet the aim of your study.
SNOWBALL
9. Choosing samples by chance but
through an organizational pattern.
SYSTEMATIC RANDOM
10. Letting all members in the
population join the selection process.
SIMPLE RANDOM
11. Matching people’s traits with the
population members’ traits.
STRATIFIED RANDOM
PARAMETER AND STATISTIC
Parameter – characteristics of a population
(ASW, 259). Eg. total (annual GDP or
exports), proportion p of population that
votes Liberal in federal election. Also, µ or σ
of a probability distribution are termed
parameters.
Statistic – numerical characteristics of a
sample. Eg. monthly unemployment rate,
pre-election polls.
Measure              Parameter   Statistic or point estimator
Mean                 μ           x̄
Standard deviation   σ           s
Proportion           p           p̂
No. of elements      N           n
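To make the parameter/statistic distinction concrete, the following sketch computes both on made-up data. The population of 1,000 heights is hypothetical and generated only for illustration; the variable names mirror the symbols μ, σ, x̄, and s.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population of N = 1,000 heights in cm (made-up values).
population = [random.gauss(165, 8) for _ in range(1000)]
sample = random.sample(population, 50)  # n = 50, drawn without replacement

# Parameters: computed from the entire population.
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)  # divides by N

# Statistics: computed from the sample only, used as point estimators.
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)  # divides by n - 1

print(f"mu = {mu:.2f}, x_bar = {x_bar:.2f}")
print(f"sigma = {sigma:.2f}, s = {s:.2f}")
```

A different random sample would give a different x̄ and s, while μ and σ stay fixed; that variation from sample to sample is what a sampling distribution describes.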
Assessment
1. The teacher randomly selects 20 boys and 15
girls from a batch of learners to be members of
a group that will go on a field trip.
2. A sample of 10 mice is selected at random
from a set of 40 mice to test the effect of a certain
medicine.
3. In a certain seminar, all members of two
of the five groups are asked what
they think about the president.
Assessment
4. A barangay health worker asks every
fourth house in the village for the ages of
the children living in those households.
5. A sales clerk for a brand of clothing
asks people who come up to her
whether they own an article of clothing from
her brand.
Assessment
6. A psychologist asks his patient, who suffers
from depression, whether he knows other
people with the same condition, so he can
include them in his study.
7. A brand manager of a toothpaste asks the ten
dentists whose clinics are closest to his office
whether they use a particular brand of
toothpaste.
Assessment
8. The process of using sample statistics to
draw conclusions about true population
parameters is called
a) statistical inference
b) the scientific method
c) sampling
d) descriptive statistics
Assessment
9. The universe or "totality of items or
things" under consideration is called
a) a sample
b) a population
c) a parameter
d) a statistic
Assessment
10. The portion of the universe that has
been selected for analysis is called
a) a sample
b) a frame
c) a parameter
d) a statistic
Assessment
11. A summary measure that is
computed to describe a characteristic
from only a sample of the
population is called
a) a parameter
b) a census
c) a statistic
d) the scientific method
Assessment
12. A summary measure that is
computed to describe a characteristic of
an entire population is
called
a) a parameter
b) a census
c) a statistic
d) the scientific method
Assessment
13. Which of the following is most likely
a population as opposed to a sample?
a) respondents to a newspaper survey
b) the first 5 learners completing an
assignment
c) every third person to arrive at the
bank
d) registered voters in a county
Assessment
14. Which of the following is most likely a
parameter as opposed to a statistic?
a) The average score of the first five learners
completing an assignment
b) The proportion of females registered to vote in
a county
c) The average height of people randomly
selected from a database
d) The proportion of trucks stopped yesterday
that were cited for bad brakes
Assessment
15. Which of the following is NOT a reason for
the need for sampling?
a) It is usually too costly to study the whole
population.
b) It is usually too time-consuming to look at the
whole population.
c) It is sometimes destructive to observe the
entire population.
d) It is always more informative to investigate
a sample than the entire population.
Assessment
1. The teacher randomly selects 20 boys and 15
girls from a batch of learners to be members
of a group that will go on a field trip.
ANSWER: Stratified sampling
2. A sample of 10 mice is selected at random
from a set of 40 mice to test the effect of a
certain medicine.
ANSWER: Simple random sampling
3. In a certain seminar, all members of two
of the five groups are asked what
they think about the president.
ANSWER: Cluster sampling
Assessment
4. A barangay health worker asks every fourth house in the
village for the ages of the children living in those households.
ANSWER: Systematic sampling
5. A sales clerk for a brand of clothing asks people who
come up to her whether they own an article of clothing from her
brand.
ANSWER: Volunteer sampling
6. A psychologist asks his patient, who suffers from
depression, whether he knows other people with the same
condition, so he can include them in his study.
ANSWER: Snowball sampling
7. A brand manager of a toothpaste asks the ten dentists
whose clinics are closest to his office whether they use a particular
brand of toothpaste.
ANSWER: Convenience sampling
Assessment
8. The process of using sample statistics to
draw conclusions about true population
parameters is called
a) statistical inference
b) the scientific method
c) sampling
d) descriptive statistics
ANSWER: a
Assessment
9. The universe or "totality of items or
things" under consideration is called
a) a sample
b) a population
c) a parameter
d) a statistic
ANSWER: b
Assessment
10. The portion of the universe that has
been selected for analysis is called
a) a sample
b) a frame
c) a parameter
d) a statistic
ANSWER: a
Assessment
11. A summary measure that is
computed to describe a characteristic
from only a sample of the
population is called
a) a parameter
b) a census
c) a statistic
d) the scientific method
ANSWER: c
Assessment
12. A summary measure that is
computed to describe a characteristic of
an entire population is
called
a) a parameter
b) a census
c) a statistic
d) the scientific method
ANSWER: a
Assessment
13. Which of the following is most likely
a population as opposed to a sample?
a) respondents to a newspaper survey
b) the first 5 learners completing an
assignment
c) every third person to arrive at the
bank
d) registered voters in a county
ANSWER: d
Assessment
14. Which of the following is most likely a
parameter as opposed to a statistic?
a) The average score of the first five learners
completing an assignment
b) The proportion of females registered to vote in
a county
c) The average height of people randomly
selected from a database
d) The proportion of trucks stopped yesterday
that were cited for bad brakes
ANSWER: b
Assessment
15. Which of the following is NOT a reason for
the need for sampling?
a) It is usually too costly to study the whole
population.
b) It is usually too time-consuming to look at the
whole population.
c) It is sometimes destructive to observe the
entire population.
d) It is always more informative to investigate
a sample than the entire population.
ANSWER: d
A Sampling Distribution
The way our means would be distributed
if we collected a sample, recorded the
mean and threw it back, and collected
another, recorded the mean and threw it
back, and did this again and again, ad
nauseam!
A Sampling Distribution
A theoretical frequency distribution of the
scores for or values of a statistic, such as
a mean. Any statistic that can be
computed for a sample has a sampling
distribution.
A sampling distribution is the
distribution of statistics that would be
produced in repeated random sampling
(with replacement) from the same
population.
A Sampling Distribution
It is all possible values of a statistic and
their probabilities of occurring for a
sample of a particular size.
Sampling distributions are used to
calculate the probability that sample
statistics could have occurred by chance
and thus to decide whether something
that is true of a sample statistic is also
likely to be true of a population
parameter.
A Sampling Distribution
Let’s create a sampling distribution of means…
Take a sample of size 1,500 from the US. Record the mean
income. Our census said the mean is $30K.
$30K
A Sampling Distribution
Let’s create a sampling distribution of means…
Take another sample of size 1,500 from the US. Record the mean
income. Our census said the mean is $30K.
$30K
A Sampling Distribution
Let’s create a sampling distribution of means…
Let’s repeat sampling of sizes 1,500 from the US. Record the mean
incomes. Our census said the mean is $30K.
$30K
The sample means would stack
up in a normal curve. A normal
sampling distribution.
$30K
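The repeated-sampling idea can be simulated directly. The sketch below is only an illustration under stated assumptions: the income population is hypothetical (50,000 made-up, right-skewed values with a mean near $30K), and 500 repetitions stand in for "again and again".

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 50,000 right-skewed incomes with mean near $30K.
population = [random.expovariate(1 / 30_000) for _ in range(50_000)]
pop_mean = statistics.fmean(population)

# Draw a sample of 1,500, record its mean, "throw it back", and repeat.
sample_means = []
for _ in range(500):
    sample = random.choices(population, k=1_500)  # sampling with replacement
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)
spread_of_means = statistics.stdev(sample_means)

print(f"population mean      = {pop_mean:,.0f}")
print(f"mean of sample means = {mean_of_means:,.0f}")
print(f"std. dev. of means   = {spread_of_means:,.0f}")
```

Even though the individual incomes here are strongly skewed, a histogram of `sample_means` would be close to bell-shaped and centered near the population mean.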
A Sampling Distribution
Say that the standard deviation of this distribution is $10K.
Think back to the empirical rule. What are the odds you would get
a sample mean that is more than $20K off?
[Figure: normal sampling distribution centered at $30K, horizontal axis
marked from -3z to 3z, with 2.5% shaded in each tail beyond -2z and 2z.]
A sample mean more than $20K off is more than 2 standard deviations
from $30K, so by the empirical rule the odds are about 5%
(2.5% in each tail).
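The tail areas can be checked numerically with Python's standard library. This sketch assumes the sampling distribution is exactly normal with mean $30K and standard deviation $10K, as stated on the slide; the variable names are illustrative.

```python
from statistics import NormalDist

# Sampling distribution of the mean as described: normal, mean $30K, SD $10K.
sampling_dist = NormalDist(mu=30_000, sigma=10_000)

# "More than $20K off" means below $10K or above $50K (beyond 2 standard deviations).
lower_tail = sampling_dist.cdf(10_000)
upper_tail = 1 - sampling_dist.cdf(50_000)
both_tails = lower_tail + upper_tail

print(f"lower tail: {lower_tail:.4f}")
print(f"upper tail: {upper_tail:.4f}")
print(f"both tails: {both_tails:.4f}")
```

The exact normal areas come out slightly under the empirical rule's figures (about 2.3% per tail rather than 2.5%), because the rule rounds the area within two standard deviations to 95%.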
Central Limit Theorem

As the sample size increases,
the distribution of the sample
mean (or sample proportion) tends
toward a normal distribution.
A Sampling Distribution
Some rules about the sampling distribution of
the mean…
1. The Central Limit Theorem says that for
random sampling, as the sample size n
grows, the sampling distribution of Y-bar
approaches a normal distribution.
2. As a rule of thumb, the sampling distribution
will be approximately normal no matter what the
population distribution’s shape, as long as n ≥ 30.
A Sampling Distribution
Some rules about the sampling distribution of the
mean…
3. If n < 30, the sampling distribution is likely
normal only if the underlying population’s
distribution is normal.
4. As n increases, the standard error (remember
that this word means standard deviation of the
sampling distribution) gets smaller.
5. Precision provided by any given sample
increases as sample size n increases.
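Rules 4 and 5 can be checked with a small simulation. The sketch below assumes the standard formula for the standard error under sampling with replacement, σ/√n, and compares it with the spread of simulated sample means; the skewed population and the helper `simulated_standard_error` are hypothetical.

```python
import math
import random
import statistics

random.seed(7)  # fixed seed so the sketch is reproducible

# Hypothetical skewed population (made-up values).
population = [random.expovariate(1.0) for _ in range(20_000)]
sigma = statistics.pstdev(population)

def simulated_standard_error(n, reps=2_000):
    """Standard deviation of `reps` sample means, each from a sample of size n."""
    means = [statistics.fmean(random.choices(population, k=n)) for _ in range(reps)]
    return statistics.stdev(means)

# For each n: (theoretical sigma / sqrt(n), simulated standard error).
results = {n: (sigma / math.sqrt(n), simulated_standard_error(n))
           for n in (5, 30, 100)}

for n, (theory, simulated) in results.items():
    print(f"n = {n:3d}: sigma/sqrt(n) = {theory:.3f}, simulated = {simulated:.3f}")
```

Both columns shrink as n grows, which is the sense in which a larger sample gives a more precise estimate of the mean.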
Recall:

A sampling distribution is the
distribution of a statistic that would
be produced in repeated random
sampling (with replacement) from
the same population.
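The Central Limit Theorem

Relationship between the mean and standard
deviation of the population and those of the
sampling distribution of the mean with
sample size n:
μ_x̄ = μ (the mean of the sample means equals the population mean)
σ_x̄ = σ / √n (the standard error is the population standard deviation divided by √n)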
Try this!!!
Random samples with size 4 are drawn from a
population containing the values
14, 19, 26, 31, 48, and 53.
a. Determine the number of random samples with a
sample size n=4.
b. Construct a sampling distribution of the sample
means
c. Find the mean of the sample means
d. Compute the standard error of the sample means

a. Determine the number of random
samples with a sample size n=4.
Number of samples = 6C4 = 6! / (4! × 2!) = 15

b. Construct a sampling distribution of
the sample means
Population: 14, 19, 26, 31, 48, and 53

Sample         Mean
14,19,26,31    22.5
14,19,26,48    26.75
14,19,26,53    28
14,19,31,48    28
14,19,31,53    29.25
14,19,48,53    33.5
14,26,31,48    29.75
14,26,31,53    31
14,26,48,53    35.25
14,31,48,53    36.5
19,26,31,48    31
19,26,31,53    32.25
19,26,48,53    36.5
19,31,48,53    37.75
26,31,48,53    39.5
b. Construct a sampling distribution
of the sample means
Frequency table:

Sample mean    f
22.5           1
26.75          1
28             2
29.25          1
29.75          1
31             2
32.25          1
33.5           1
35.25          1
36.5           2
37.75          1
39.5           1
N=15
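c. Find the mean of the sample
means

Sample mean    f    f × mean
22.5           1    22.5
26.75          1    26.75
28             2    56
29.25          1    29.25
29.75          1    29.75
31             2    62
32.25          1    32.25
33.5           1    33.5
35.25          1    35.25
36.5           2    73
37.75          1    37.75
39.5           1    39.5
N=15                477.5

Mean of the sample means = 477.5 / 15 ≈ 31.83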
d. Compute the standard error of the
sample means

x̄        f    μ_x̄      x̄ − μ_x̄    (x̄ − μ_x̄)²    f(x̄ − μ_x̄)²
22.5     1    31.83    -9.33      87.0489       87.0489
26.75    1    31.83    -5.08      25.8064       25.8064
28       2    31.83    -3.83      14.6689       29.3378
29.25    1    31.83    -2.58      6.6564        6.6564
29.75    1    31.83    -2.08      4.3264        4.3264
31       2    31.83    -0.83      0.6889        1.3778
32.25    1    31.83    0.42       0.1764        0.1764
33.5     1    31.83    1.67       2.7889        2.7889
35.25    1    31.83    3.42       11.6964       11.6964
36.5     2    31.83    4.67       21.8089       43.6178
37.75    1    31.83    5.92       35.0464       35.0464
39.5     1    31.83    7.67       58.8289       58.8289
N = 15                                          306.7085

Standard error = √(306.7085 / 15) ≈ 4.52
d. Compute the standard error of the
sample means

A population containing the values
14, 19, 26, 31, 48, and 53.

x     μ        x − μ     (x − μ)²
14    31.83    -17.83    317.9089
19    31.83    -12.83    164.6089
26    31.83    -5.83     33.9889
31    31.83    -0.83     0.6889
48    31.83    16.17     261.4689
53    31.83    21.17     448.1689
N=6                      1226.8334
Difference between population
and sampling distribution
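The whole "Try this" exercise can be verified by brute force. The sketch below enumerates every sample of size 4 drawn without replacement from the six population values and compares the population with the sampling distribution. The finite population correction √((N − n)/(N − 1)) used in the last step is a standard result for sampling without replacement; it is not stated in the slides and is included here as an assumption about how the two standard deviations are related.

```python
from itertools import combinations
from math import comb, sqrt
from statistics import fmean, pstdev

population = [14, 19, 26, 31, 48, 53]
n = 4

# a. Every possible sample of size 4 drawn without replacement.
samples = list(combinations(population, n))

# b. The mean of each sample; together these form the sampling distribution.
sample_means = [fmean(s) for s in samples]

# c. Mean of the sample means, and d. standard error of the sample means.
mean_of_means = fmean(sample_means)
standard_error = pstdev(sample_means)

# Population side of the comparison.
N = len(population)
mu = fmean(population)
sigma = pstdev(population)

# sigma / sqrt(n) adjusted by the finite population correction (assumed, see lead-in).
fpc_standard_error = sigma / sqrt(n) * sqrt((N - n) / (N - 1))

print(len(samples), comb(N, n))
print(round(mu, 2), round(mean_of_means, 2))
print(round(sigma, 2), round(standard_error, 2), round(fpc_standard_error, 2))
```

The mean of the sample means equals the population mean, while the standard error is much smaller than the population standard deviation.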


Problem Solving:

1. A school has 900 senior high school
students. The average height of these
students is 68 in with a standard
deviation of 6 in. Suppose you draw a
random sample of 50 students. Find the
mean, standard deviation and variance
of the distribution of all sample means
that can be derived from the samples.
Answer:



Mean = 68 in
Standard error = 0.8485 in
Variance = 0.72 in²
Problem Solving:

2. The average monthly income of
teachers working in a public school is
25,000 with a standard deviation of 800.
If a random sample of 15 teachers is
selected, what are the mean, variance, and
standard error of the corresponding
distribution of the sample means?
Answer:



Mean= 25000
Standard error = 206.56
Variance = 42666.67
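Both answers follow from the same three formulas: the mean of the sample means is μ, the standard error is σ/√n, and the variance is σ²/n. The helper below is a hypothetical sketch of that arithmetic. Like the answers above, it ignores any finite population correction (for example, for the 900 students in Problem 1).

```python
import math

def sampling_distribution_of_mean(mu, sigma, n):
    """Return (mean, standard error, variance) of the sample mean for sample size n."""
    standard_error = sigma / math.sqrt(n)
    variance = sigma ** 2 / n
    return mu, standard_error, variance

# Problem 1: heights with mu = 68 in, sigma = 6 in, n = 50.
heights = sampling_distribution_of_mean(68, 6, 50)

# Problem 2: monthly incomes with mu = 25,000, sigma = 800, n = 15.
incomes = sampling_distribution_of_mean(25_000, 800, 15)

print([round(v, 4) for v in heights])
print([round(v, 2) for v in incomes])
```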
Distribution of the sample
mean of a Normal variable

Example:

Solution:

This means that the probability that a randomly selected sample
from the population will have a mean systolic
blood pressure less than 122 is 1.22%.
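A probability of this kind comes from converting the threshold to a z-score using the standard error and then reading the normal curve. The sketch below shows the mechanics only. The population mean, standard deviation, and sample size are hypothetical values chosen for illustration; they are not the figures from the example on the slide.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical inputs for illustration only (NOT taken from the slide).
mu = 125         # assumed population mean systolic blood pressure
sigma = 8        # assumed population standard deviation
n = 36           # assumed sample size
threshold = 122  # sample-mean cutoff of interest

standard_error = sigma / sqrt(n)          # SD of the sample mean
z = (threshold - mu) / standard_error     # z-score of the cutoff
probability = NormalDist().cdf(z)         # P(sample mean < threshold)

print(f"z = {z:.2f}, P(sample mean < {threshold}) = {probability:.4f}")
```

With these assumed inputs the cutoff sits 2.25 standard errors below the mean, so only a small fraction of samples would have a mean that low.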
Example

Solution:

Solution:
