Uploaded by sovetgul.asekova

Statistics and modelling course

Statistics and Modelling
Course
2011
Topic: Confidence Intervals
Achievement Standard 90642
Calculate Confidence Intervals for
Population Parameters
3 Credits
Externally Assessed
NuLake Pages 63101
LESSON 1 – Sampling
Handout with gaps to fill in – goes with the following slides.
STARTER: Look at the following 2 examples of bad sampling technique
& discuss what’s wrong in each case.
1. Discuss how you’d obtain a representative sample from our school
roll.
2. Notes on sampling and inference.
3. Population and Samples ‘Policemen’ worksheet (from Achieving in
Statistics). Complete for HW.
Sampling
Describe some faults with each of these sampling
methods.
(a) A survey on magazine readership is conducted by phoning
households between 1 and 4pm.
People who aren’t at home during those times cannot be
surveyed.
Some people don’t have a phone.
(b) A talkback radio station asks listeners to phone in with a
quick ‘yes’ or ‘no’ answer to the question “Should NZ have
capital punishment?”
Only people who are listening at the time can participate.
Self-selected sample. Only those with a strong opinion will
ring in.
Sampling
You are asked the question:
“How tall are St. Thomas students?”
• You only have time to measure the heights of 35 students.
Q1: How would you choose which 35 students to measure?
Q2: Once you’ve measured your 35 students’ heights, how would
you use this data to answer the question: “How tall are St.
Thomas students?”
Purpose of a Sample
[Diagram: a SAMPLE is taken from the POPULATION; the sample is then used to make an inference about the POPULATION.]
Sampling terminology
POPULATION:
Target Population: All items under investigation.
We usually just call it the “Population”.
SAMPLES:
Sample: Subset selected to REPRESENT the population.
Sampling Frame:
A list/database of items from which we select
our sample. (Should include all items in the Target Population)
For a sample to be Representative of a given population:
The Sampling Frame must match the Target Population.
                     Sample statistic     Population parameter
Number of items      n: Sample size       N: Population size
Mean                 $\bar{X}$            $\mu$
Standard deviation   $s$                  $\sigma$
Proportion           $p$                  $\pi$
A representative sample should have…
• Sample-size large enough to allow the results to be meaningful
(rough guide: sample size of n > 30).
• No Bias – Sample selection is said to be “biased” if some items
are more likely to be chosen than others. Every item in the
target population should be equally-likely to be chosen. Random
selection ensures this.
• Minimal Non-response – difficult to control this.
Example:
A home security firm is hoping to sell as many burglar alarms as
possible to householders in a certain town.
Usually each house only needs one burglar alarm.
Before the firm orders the alarms from their supplier, they wish to
have an indication of how many alarms they might sell.
1.) What is the target population?
A. all the people who live in the town.
B. the head of each household.
C. the houses in the town.
Answer: C. the houses in the town.
2.) What is the sampling frame?
A. the electoral roll for the town.
B. a list of all the people who live in the town.
C. a list of all the houses in the town.
Answer: C. a list of all the houses in the town.
3.) How would you select a representative sample of the houses in
the town? (Discuss as a class.)
HW: Do the Population and Samples ‘Policemen’ worksheet. Finish by
Monday. Will mark as a class.
EXTRA ON SAMPLING
TECHNIQUES IF TIME (schol
students)
Otherwise skip to Lesson 3:
Distribution of Sample Means 1
Extension Lesson:
Other sampling techniques
Good sampling techniques:
1. Simple Random Sampling
2. Systematic Sampling
3. Stratified Sampling
4. Cluster Sampling
Bad sampling techniques (biased selection):
• Convenience sampling.
• Self-selected sampling.
Random selection
Q: What does the word “random” actually
mean?
Q: How would you select a student at
random from this school?
21.03
Simple random sampling.
Generate 20 different random
numbers between 1 and 100.
If a random number has already
occurred, generate more as needed.
Calculator formula
1 + 100×RAN#
42 67 2 12 77 49 60
20 45 15 64 7 8 21
15 64 58 14 29 68 26 90
1. Simple Random Sampling
1. Obtain a list of all N items in the target population,
numbering them 1 to N (e.g. the school roll: 1-600).
2. Decide how many you will select for your sample (n).
3. Use the random number generator on your calculator
to select numbers at random between 1 and N:
On calculator, type: 1 + Population size × RAN#
4. Keep pressing ‘equals’ until you have selected n
different items. Discard any repeats.
Advantage of SR sampling: Ensures that every item in
the population has an equal chance of being selected
– so no chance of bias.
Disadvantage:
• Does not ensure that all subgroups of the population
are represented in proportion (e.g. some racial or socioeconomic
groups could be over- or under-represented).
Task: Select a sample of 35 students from the St. Thomas school roll.
HW: Old Sigma Pg. 130 – Ex. 9.1 (all), then Pg. 134 – Ex. 9.2 – just Q1.
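The simple random sampling steps can be sketched in Python. This is only an illustration, not part of the course materials: the function name `simple_random_sample` is hypothetical, and `randint(1, N)` stands in for the calculator formula 1 + N × RAN# with the whole-number part kept. The roll size of 600 is the figure used in step 1 above.

```python
import random

def simple_random_sample(population_size, sample_size, rng=None):
    """Pick sample_size different item numbers from 1..population_size."""
    rng = rng or random.Random()
    chosen = []
    while len(chosen) < sample_size:
        # Plays the role of the calculator step 1 + N x RAN#
        number = rng.randint(1, population_size)
        if number not in chosen:  # discard any repeats
            chosen.append(number)
    return chosen

# 35 students from a roll numbered 1 to 600 (seeded so the run is repeatable)
sample = simple_random_sample(600, 35, random.Random(1))
print(sorted(sample))
```

Python's built-in `random.sample(range(1, 601), 35)` does the same job in one call; the loop is written out here only to mirror the calculator method.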
3 other good sampling techniques
Systematic sampling
1. Obtain a list of all N items in the target population (numbered 1 to N).
2. Pick a random starting point (e.g. item number 7).
3. Sample every kth item after that, where k = N/n, until you have
selected n items.
Stratified sampling
Use when the population consists of categories (strata) (e.g. racial groups).
1. Divide sampling frame into the strata (categories).
2. Select a separate random sample from each stratum in proportion
to the percentage of the population found in each. (Called Proportional
Allocation.)
Cluster sampling
Use when the population is distributed into naturally-occurring groups or
‘clusters’ (e.g. towns and cities in NZ).
Stage 1: Select the clusters: select a representative sample of the
clusters themselves.
Stage 2: Select a random sample of items within chosen clusters. Must be
in proportion to the percentage of the population found in each. (Called
Proportional Allocation.)
21.03
Comparison of samples.
[Figure: four panels comparing simple random sampling, stratified sampling, systematic sampling and cluster sampling.]
Task:
1. Select a sample of between 30 and 36 students from the school roll
using each of these 3 methods.
2. Write down at least one advantage and at least one disadvantage/risk
associated with each of these 3 techniques.
HW: Do Old Sigma (2nd edition) p137: Ex. 9.3.
21.03
Systematic sampling.
To obtain a systematic sample of
size 20 from this data.
Choose a starting point at
random between 1 and 100.
Using calculator
1 + 100×RAN# =
Suppose this gives 5.87352 5.
So start at item number 5.
Then choose every kth item,
where k = N/n.
= 100/20
= 5. So sample every 5th item.
Systematic Sampling
1. Obtain a list of all N items in the target population.
2. Decide on your sample size, n .
3. Pick a random starting point (e.g. item number 7)
4. Sample every kth item after that, where k=N/n until
you have selected n items.
Advantages:
• Ensures that sample is selected from throughout the
breadth of the sampling frame.
• Convenient and fast – easier to collect info on items that
are in a sequence (every 5th house) than from a random
sample where they are scattered all over.
Disadvantage:
Be careful that the list itself has no systematic pattern. If
every 2nd house on a street were sampled, all would be on
the same side of the street!
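A minimal Python sketch of the systematic sampling steps follows. The function name `systematic_sample` and the optional `start` argument are hypothetical. One assumption is added that the slides do not state: if the random starting point is late in the list, the count wraps around to the beginning so that n items are still selected.

```python
import random

def systematic_sample(population_size, sample_size, start=None):
    """Take every kth item number from a list numbered 1..population_size."""
    k = population_size // sample_size  # sampling interval k = N/n
    if start is None:
        start = random.randint(1, population_size)  # random starting point
    # Assumption: wrap past the end of the list back to item 1
    return [(start - 1 + i * k) % population_size + 1 for i in range(sample_size)]

# N = 100, n = 20, starting at item 5 gives every 5th item
print(systematic_sample(100, 20, start=5))
```

With N = 100, n = 20 and a start of 5, this selects items 5, 10, 15 and so on up to 100.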
21.03
Stratified sampling.
Suppose the avocados are of 3 different varieties.
Hass: 1–40 (40%)
Fuerte: 41–70 (30%)
Hopkins: 71–100 (30%)
The number in each stratum of the sample should be proportional to
the number in each group in the population.
Hass: 40% × 20 = 8
Fuerte: 30% × 20 = 6
Hopkins: 30% × 20 = 6
Thus generate random numbers as follows (repeats are discarded):
Hass: 1–40, 8 random nos.
33 17 12 25 9 9 33 16 39 8
Fuerte: 41–70, 6 random nos.
58 59 67 43 53 56
Hopkins: 71–100, 6 random nos.
98 85 96 99 90 81
Stratified sampling
Use when the population consists of categories (strata), and
you wish to represent each ‘stratum’ proportionally (e.g.
racial groups, one-story and multi-story homes within a
city).
1. Obtain a list of all N items in the target population.
2. Decide on your sample size, n .
3. Divide list into the strata (categories).
4. Select a separate random sample from each stratum
in proportion to the percentage of the population found in
each.
Proportional Allocation: Selecting from each stratum in
proportion to its percentage of the population.
E.g. If 12% of a city’s citizens are Pacific Islanders, then
12% of the sample size should be selected from among
the Pacific Island citizens.
Advantage: Guaranteed to be representative of each stratum.
Disadvantage: Time-consuming and expensive because you
must collect information about the strata-sizes in advance.
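Proportional allocation can be sketched in Python using the avocado figures from the stratified sampling example (Hass 1–40, Fuerte 41–70, Hopkins 71–100, sample size 20). The function name `stratified_sample` is hypothetical. Rounding each quota is an assumption made here; with other figures the rounded quotas may not add up exactly to n.

```python
import random

def stratified_sample(strata, sample_size, rng=None):
    """strata maps each stratum name to the list of item numbers in it."""
    rng = rng or random.Random()
    total = sum(len(items) for items in strata.values())
    sample = {}
    for name, items in strata.items():
        # Proportional allocation: stratum's share of the population times n
        quota = round(sample_size * len(items) / total)
        sample[name] = sorted(rng.sample(items, quota))
    return sample

avocados = {
    "Hass": list(range(1, 41)),
    "Fuerte": list(range(41, 71)),
    "Hopkins": list(range(71, 101)),
}
sample = stratified_sample(avocados, 20, random.Random(0))
for variety, items in sample.items():
    print(variety, len(items), items)
```

The quotas come out as 8, 6 and 6, matching the 40%, 30% and 30% split worked out by hand.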
Cluster sampling
Use when the population is distributed into naturally-occurring groups or
‘clusters’ (e.g. towns and cities in a country).
1. Select a representative sample of the clusters themselves (there are
usually too many clusters to sample from all of them).
2. Select a random sample of items from within each chosen
cluster.
3. Again, use Proportional Allocation (like with stratified
samples). Weight the number selected from each cluster
according to the cluster size.
E.g. Selecting samples of New Zealanders by selecting a
sample of towns/cities from throughout the country,
then a proportional random sample from within each.
Advantage:
• Cheaper and faster when sampling from a
geographically large area (data can be collected in groups
within chosen clusters rather than being spread out).
Disadvantages:
• Items don’t have an equal chance of selection.
– Small clusters are unlikely to be sampled from.
– Items that are not in clusters are excluded altogether.
E.g. farmers or people in small rural communities may have no chance
of being selected.
• Requires prior knowledge of cluster sizes.
HW: Memorise the 4 types of sampling techniques and the advantages &
disadvantages of each.
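The two stages of cluster sampling might look like this in Python. Everything named here is hypothetical: the function `cluster_sample`, the town labels and the town sizes are made up purely for illustration. Choosing the clusters at random in stage 1 is also an assumption, since the slides only ask for a "representative sample" of clusters.

```python
import random

def cluster_sample(clusters, clusters_to_pick, sample_size, rng=None):
    """clusters maps each cluster name to the list of items in it."""
    rng = rng or random.Random()
    # Stage 1: select some of the clusters (at random, as an assumption)
    chosen = rng.sample(sorted(clusters), clusters_to_pick)
    total = sum(len(clusters[name]) for name in chosen)
    # Stage 2: random sample within each chosen cluster, weighted by its size
    sample = {}
    for name in chosen:
        quota = round(sample_size * len(clusters[name]) / total)
        sample[name] = rng.sample(clusters[name], quota)
    return sample

# Made-up towns: each item is a (town, resident number) pair
sizes = {"Town A": 50, "Town B": 100, "Town C": 150, "Town D": 200, "Town E": 500}
towns = {name: [(name, i) for i in range(1, size + 1)] for name, size in sizes.items()}

sample = cluster_sample(towns, 3, 30, random.Random(0))
for town, residents in sample.items():
    print(town, len(residents))
```

Residents of the towns that were not chosen in stage 1 have no chance of appearing in the sample, which is the disadvantage listed above.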
21.03
Cluster sampling.
Here is one way of obtaining a
cluster sample of size 20.
Choose four clusters, each of 5
avocados, by selecting four
numbers at random from the
data, and taking them as the
middle item of a ‘cross’.
If clusters overlap or run outside
the boundaries, choose another.
Spreadsheet formula
99×RAN# + 1 =
62 22 2 68 56
Note: Depending how a cluster is
defined, it can exclude some items or make other items more likely
to be chosen than under other sampling methods
LESSON 2 – Distribution of Sample Means
The point of today: Get confident at calculating
probabilities involving the distribution of sample
means.
– Mark HW: “Achieving in Statistics”: page 30.
– Handout to fill in (goes with following slides)
• Then do Achieving in Statistics: pages 31 & 32.
The Distribution of Sample Means
STARTER ACTIVITY: Each class member has 5 dice.
Toss your 5 dice and record the number facing upward for
each. Add up to get the total for your 5.
My total value from 5 tosses = _____
My mean score for each die roll = ________
Your group of 5 dice tosses represents a sample of size n=5.
Between us, as a class, we tossed 5 dice ________ times.
We got means of: ________________________________
This illustrates the fact that sample means vary.
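The dice activity can be imitated in Python to see the same effect without a classroom. This is a sketch only: the class size of 25 and the function name `mean_of_five_dice` are assumptions for illustration.

```python
import random

rng = random.Random(2011)  # fixed seed so the run is repeatable

def mean_of_five_dice(rng):
    """Toss 5 dice and return the mean score per die."""
    tosses = [rng.randint(1, 6) for _ in range(5)]
    return sum(tosses) / 5

# One sample of size n = 5 for each of 25 (hypothetical) class members
class_means = [mean_of_five_dice(rng) for _ in range(25)]
print(class_means)
```

Each entry is the mean of a different sample of 5 tosses, and the printed values differ from one another.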
A random sample can be thought of as a collection of n items
(n=5 dice-tosses in the experiment we did last time), each of
which has a value that we measure (the number facing upward
when a die lands in this case).
When you select items at random from any population, the
value of each item, X is a random variable (e.g. height, weight,
volume of drink in soft drink bottles etc.).
Select a random sample of size n from any population:
The sample mean, $\bar{X} = \dfrac{X_1 + X_2 + X_3 + \dots + X_n}{n}$
Different samples will produce different mean values, just
like we got different mean values from tossing our dice.
The sample mean:
• is a random variable itself because it varies at random
from sample to sample.
• is approximately normally distributed about the population mean $\mu$,
even if the population from which it is drawn is not normally
distributed, provided the samples are large enough.
Rule of thumb is n > 30.
In other words the sample means will ‘average out’ towards
the population mean. This result is called the
‘Central Limit Theorem’.
i.e. $\mu_{\bar{X}} = \mu$
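Distribution of Sample Means
[Diagram: normal curve showing the distribution of sample means $\bar{X}$, centred on $\mu_{\bar{X}}$.]
Mean of sample means: $\mu_{\bar{X}} = \mu$
Std. deviation of distribution of sample means (standard error): $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$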
Since sample means are normally distributed about the
population mean, we can use the properties of a normal
distribution curve to predict the percentage of samples that
will produce means within a particular distance from the
population mean.
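The behaviour of sample means described above can be checked with a quick simulation. The sketch below is only an illustration, not part of the course handout: it assumes a fair six-sided die as the (non-normal) population, and the seed, sample size and number of samples are arbitrary choices.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable (an arbitrary choice)

SAMPLE_SIZE = 36     # n, the size of each sample
NUM_SAMPLES = 5000   # how many samples we draw (illustrative value)


def roll_die():
    """One roll of a fair six-sided die (a non-normal population)."""
    return random.randint(1, 6)


# Mean of each simulated sample of n dice rolls.
sample_means = [
    statistics.fmean(roll_die() for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

# Population parameters for a fair die.
faces = [1, 2, 3, 4, 5, 6]
mu = statistics.fmean(faces)       # population mean, 3.5
sigma = statistics.pstdev(faces)   # population standard deviation

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
standard_error = sigma / SAMPLE_SIZE ** 0.5   # sigma / sqrt(n)

print(f"mean of sample means = {mean_of_means:.3f} (population mean = {mu})")
print(f"sd of sample means   = {sd_of_means:.3f} (sigma/sqrt(n) = {standard_error:.3f})")
```

The two printed pairs should be close to each other: the sample means centre on the population mean, and their spread is close to the standard error.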
Example:
The results of a census of all 17 year-old males in NZ showed
a mean height of $\mu$ = 177cm, with $\sigma$ = 9cm.
If a random sample of 36 seventeen year-old NZ males is
taken, calculate:
a) The expected value of the sample mean.
$E(\bar{X}) = \mu_{\bar{X}}$
And, by the Central Limit Theorem, $\mu_{\bar{X}} = \mu$, the population mean.
$\therefore E(\bar{X}) = 177$cm
b) The standard deviation (standard error) of the sample
mean.
$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{9}{\sqrt{36}} = 1.5$cm
c) What percentage of such samples would have a mean
that is:
(i) Within 3cm of the population mean of 177cm?
$P(174 < \bar{X} < 180, \text{ if } \mu = 177) = P\left(\dfrac{174 - \mu}{\sigma_{\bar{X}}} < z < \dfrac{180 - \mu}{\sigma_{\bar{X}}}\right)$
$= P\left(\dfrac{174 - 177}{\sigma/\sqrt{n}} < z < \dfrac{180 - 177}{\sigma/\sqrt{n}}\right)$
$= P\left(\dfrac{174 - 177}{9/\sqrt{36}} < z < \dfrac{180 - 177}{9/\sqrt{36}}\right)$
$= P\left(\dfrac{174 - 177}{1.5} < z < \dfrac{180 - 177}{1.5}\right)$
$= P(-2 < z < 2)$
$= 2 \times 0.47724$
$= 0.9545$ (4sf)
So 95.45% of samples would have a mean within 3cm of the population mean.
(ii) More than 5cm away from the population mean?
$P(\bar{X} < 172 \text{ or } \bar{X} > 182) = 1 - P(172 < \bar{X} < 182)$
$= 1 - P\left(\dfrac{172 - 177}{\sigma_{\bar{X}}} < Z < \dfrac{182 - 177}{\sigma_{\bar{X}}}\right)$
$= 1 - P\left(\dfrac{172 - 177}{\sigma/\sqrt{n}} < Z < \dfrac{182 - 177}{\sigma/\sqrt{n}}\right)$
$= 1 - P\left(\dfrac{172 - 177}{9/\sqrt{36}} < Z < \dfrac{182 - 177}{9/\sqrt{36}}\right)$
$= 1 - P\left(\dfrac{172 - 177}{1.5} < Z < \dfrac{182 - 177}{1.5}\right)$
$= 1 - P\left(-3\tfrac{1}{3} < Z < 3\tfrac{1}{3}\right)$
$= 1 - 0.99914$
$= 0.00086$
So only about 0.09% of samples. Very rare.
Homework:
Do Achieving in Statistics: pages 31 & 32.
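The table look-ups in parts (b) and (c) can be cross-checked on a computer. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` in place of the printed normal tables; the variable names are illustrative only.

```python
from statistics import NormalDist

mu = 177.0     # population mean height (cm)
sigma = 9.0    # population standard deviation (cm)
n = 36         # sample size

# b) standard error of the sample mean: sigma / sqrt(n)
standard_error = sigma / n ** 0.5

# Distribution of the sample mean: normal, centred on mu, spread = standard error.
sample_mean_dist = NormalDist(mu, standard_error)

# c)(i) probability the sample mean is within 3cm of 177cm
p_within_3 = sample_mean_dist.cdf(180) - sample_mean_dist.cdf(174)

# c)(ii) probability the sample mean is more than 5cm away from 177cm
p_beyond_5 = 1 - (sample_mean_dist.cdf(182) - sample_mean_dist.cdf(172))

print(f"standard error = {standard_error} cm")
print(f"P(within 3cm)  = {p_within_3:.4f}")
print(f"P(beyond 5cm)  = {p_beyond_5:.5f}")
```

The results should agree with the hand calculation to the number of figures shown in the working.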
Extension
The point of today: Look at when we can draw conclusions
about the population mean based on a sample mean.
STARTER: Look at the applet that demonstrates the
distribution of sample means:
SIM - onlinestatbook.com/rvls.html.
• Work through the following examples as a class (handout to
fill in).
• Then do Sigma p184 – Ex. 11.5 (old version),
or Sigma p66 – Ex. 3.05 (new version).
Example:
The census of all NZ seventeen year-old males from yesterday’s example
was actually conducted back in 1987. It had a mean of $\mu$ = 177cm and $\sigma$ of
9cm.
A random sample of 36 seventeen year-old NZ males was selected just last
year. This sample found a mean height of 180cm.
(a) What is the probability that a random sample of 36 students selected
from a population with $\mu$ = 177cm and $\sigma$ = 9cm would give a mean height
greater than 180cm?
$P(\bar{X} > 180, \text{ if } \mu = 177) = P\left(z > \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right)$
$= P\left(z > \dfrac{180 - 177}{9/\sqrt{36}}\right)$
$= P(z > 2)$
$= 0.02275$
(b) Based on this answer, what percentage of samples would have means of
180cm or higher if the population mean was 177cm (like in 1987)?
Answer: Only 2.275%
(c) Sketch a normal distribution curve for the distribution of sample
means from a population with $\mu$ = 177cm and standard deviation of $\sigma$ = 9cm.
(d) So it is very unlikely that a randomly selected sample
taken from a population with mean 177cm and standard deviation of
9cm would have a mean as high as this one. Yet it did.
(e) What is the most likely explanation?
Do Sigma:
In old (2nd) edition: Pg. 184 – Ex. 11.4.
OR in NEW edition: Pg. 66 – Ex. 3.04.
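The tail probability in part (a) can be reproduced without tables. This is a small sketch, assuming Python's standard-library `statistics.NormalDist` stands in for the normal tables; it only repeats the arithmetic of part (a) and does not answer part (e).

```python
from statistics import NormalDist

mu = 177.0            # population mean from the 1987 census (cm)
sigma = 9.0           # population standard deviation (cm)
n = 36                # sample size
observed_mean = 180.0 # mean of last year's sample (cm)

# Standardise the observed sample mean using the standard error sigma / sqrt(n).
standard_error = sigma / n ** 0.5
z = (observed_mean - mu) / standard_error

# Upper-tail probability P(Z > z) for a standard normal variable.
p_tail = 1 - NormalDist().cdf(z)

print(f"z = {z}")
print(f"P(sample mean > 180cm, if mu = 177cm) = {p_tail:.5f}")
```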
LESSON 4 – C.I.s for Means 1
• Today’s theme: Solving problems involving
Confidence Intervals for Means.
• Students do NuLake Ch 2.5 – Calculate
Confidence Intervals of means.
http://www.youtube.com/watch?v=Ohz-PZqaMtk
Question: If the population mean height of 17 year-old NZ
males is 177cm with $\sigma$ of 9cm, within what interval would we
expect the means of 95% of samples of size 36 to lie?
[Diagram: normal curve with the middle 95% shaded, 47.5% on each side of the mean, between z = -1.96 and z = 1.96.]
Notice that the middle 95% of the area under the normal curve means half on
each side of the mean, i.e. 47.5% (or 0.475) on each side.
Looking up 0.475 on the tables gives z = 1.96.
So when we calculate the mean from a random sample we expect
that, 95% of the time, it will be within ± 1.96 standard
errors of the popn mean, $\mu$.
Now, work out the lower and upper limits of the interval
within which you’d expect 95% of sample means to lie
if each sample has 36 people in it.
Conclusion: So 95% of samples of size 36 from this population
will produce means between 174.06cm and 179.94cm.
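The table look-up and the two limits can be checked numerically. The sketch below assumes Python's standard-library `statistics.NormalDist`; its `inv_cdf` works from the cumulative area to the left of z, so the middle 95% corresponds to an area of 0.975 rather than the 0.475 read from the course tables.

```python
from statistics import NormalDist

mu = 177.0    # population mean (cm)
sigma = 9.0   # population standard deviation (cm)
n = 36        # sample size

standard_error = sigma / n ** 0.5

# Middle 95% leaves 2.5% in each tail, so the upper cut-off has 0.975 to its left.
z = NormalDist().inv_cdf(0.975)

lower = mu - z * standard_error
upper = mu + z * standard_error

print(f"z = {z:.2f}")
print(f"95% of sample means lie between {lower:.2f}cm and {upper:.2f}cm")
```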
Problem:
In real-life, we almost never know the population mean (or standard
deviation).
We only have enough resources to conduct ONE random sample and
use it to estimate (infer) the population mean.
How can our knowledge of the distribution of sample means help
us here?
Answer: We construct an interval within which we think the population
mean lies.
Estimate of $\mu = \bar{X} \pm$ margin of error.
This is known as a Confidence Interval.
A 95% Confidence Interval for the population
mean is an interval that has a 95% probability
of containing the population mean.
Diagram on board
A 95% confidence interval for $\mu$ is $\bar{X} \pm 1.96 \times \sigma_{\bar{X}} = \bar{X} \pm 1.96 \times \dfrac{\sigma}{\sqrt{n}}$
A 99% confidence interval for $\mu$ is $\bar{X} \pm 2.576 \times \dfrac{\sigma}{\sqrt{n}}$
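The two formulas above differ only in the z value, so they can be wrapped in one small helper. This is a sketch under stated assumptions: the function name `z_confidence_interval` and the numbers in the usage lines are made up for illustration, and the z value comes from Python's `statistics.NormalDist` rather than the tables.

```python
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper) limits of a confidence interval for the population mean.

    Assumes the population standard deviation sigma is known and the
    sample means are normally distributed.
    """
    # Area to the left of the upper cut-off, e.g. 0.975 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin_of_error = z * sigma / n ** 0.5
    return sample_mean - margin_of_error, sample_mean + margin_of_error


# Hypothetical values, chosen only to show the call.
lower, upper = z_confidence_interval(sample_mean=50, sigma=10, n=25, confidence=0.95)
print(f"95% CI: {lower:.2f} to {upper:.2f}")

# The z multiplier used for 99% confidence.
z99 = NormalDist().inv_cdf(0.5 + 0.99 / 2)
print(f"z for 99% = {z99:.3f}")
```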
Example 1:
A soft drink is sold in bottles. The amount of drink in each
bottle is normally distributed with a standard deviation of
40mL. The mean volume of drink in a random sample of 100 such
bottles is 300mL. Construct a 95% confidence interval for the
true mean volume of drink per bottle.
Solution:
There is a 95% probability that the interval 300mL ± 1.96 standard
errors contains the true population mean.
A 95% C.I. for the population mean $\mu$ is:
$\bar{X} \pm$ Margin of Error
$= \bar{X} \pm z \times$ Standard Error of the sample mean
$= 300 \pm 1.96 \times \dfrac{40}{\sqrt{100}}$
$= 300\text{mL} \pm 7.84\text{mL}$ (the Margin of Error, E, is 7.84mL)
ANSWER:
The 95% CI for the population mean is:
292.2mL < $\mu$ < 307.8mL
Now calculate the 99% C.I. Will it be wider or narrower?
A 99% C.I. is
$300 \pm z_{0.99/2} \times \dfrac{\sigma}{\sqrt{n}}$
$= 300 \pm 2.576 \times \dfrac{40}{\sqrt{100}}$
$= 300\text{mL} \pm 10.304\text{mL}$ (the Margin of Error, E, is 10.304mL)
ANSWER:
The 99% CI for the population mean is:
289.7mL < $\mu$ < 310.3mL
Copy examples, then do NuLake Ex. 2.5: p81–84
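Both intervals for Example 1 can be recomputed side by side to see how the width changes with the confidence level. A minimal sketch, assuming Python's `statistics.NormalDist` supplies the z values; because it uses unrounded z values, the 99% margin comes out very slightly below the 10.304mL obtained with z = 2.576.

```python
from statistics import NormalDist

sample_mean = 300.0   # mean volume of the sample (mL)
sigma = 40.0          # population standard deviation (mL)
n = 100               # number of bottles in the sample

standard_error = sigma / n ** 0.5   # 40 / 10 = 4 mL

intervals = {}
for confidence in (0.95, 0.99):
    # z cut-off leaving (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    intervals[confidence] = (sample_mean - margin, sample_mean + margin)
    print(f"{confidence:.0%} CI: {sample_mean - margin:.1f}mL < mu < "
          f"{sample_mean + margin:.1f}mL (margin of error {margin:.2f}mL)")

lower95, upper95 = intervals[0.95]
lower99, upper99 = intervals[0.99]
```

The 99% interval is the wider one: asking for more confidence costs precision when the sample size stays the same.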
When we aren’t told the population
standard deviation $\sigma$.
If we aren’t given the popn standard deviation
$\sigma$, then use the sample standard deviation s
as an estimate.
This is OK provided the sample size is large
enough (n > 30).
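When only raw sample data are available, the same interval can be built with s in place of $\sigma$. The sketch below is illustrative only: the 36 heights are made-up data generated inside the code (not from the course), and the seed and generating values are arbitrary assumptions.

```python
import random
import statistics
from statistics import NormalDist

random.seed(7)  # arbitrary seed so the made-up data are repeatable

# Hypothetical sample of 36 heights in cm (invented data, not from the course notes).
heights = [random.gauss(178, 9) for _ in range(36)]

n = len(heights)
sample_mean = statistics.fmean(heights)
s = statistics.stdev(heights)   # sample standard deviation, used to estimate sigma

# 95% confidence interval using s as the estimate of sigma (n > 30).
z = NormalDist().inv_cdf(0.975)
margin = z * s / n ** 0.5
lower, upper = sample_mean - margin, sample_mean + margin

print(f"n = {n}, sample mean = {sample_mean:.1f}cm, s = {s:.1f}cm")
print(f"95% CI: {lower:.1f}cm < mu < {upper:.1f}cm")
```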
LESSON 5 – C.I.s for Means 2
The purpose of today:
• Memorise definition of a confidence interval.
• Get confident at constructing confidence intervals for
population means.
To do today:
1. Watch youtube clip: http://www.youtube.com/watch?v=Ohz-PZqaMtk
2. Interpret C.I. from yesterday’s e.g. in context.
3. Finish NuLake 2.5.
4. Do new Sigma p75 - Ex. 4.01: To end of Q14 compulsory.
Q15–17 are extra for experts.
5. 2008 NCEA exam question.
LESSON 6 – SAMPLE SIZE
(MEANS)
• Today’s theme: Calculate the required
sample size to meet a set of specified
conditions for a Confidence Interval for the
population MEAN.
• Do Sigma (old): Ex. 14.2 – pg. 230.
(New version: Ex. 4.02 – pg. 79)
Calculating the minimum sample size - means.
The confidence interval formula for estimating the population mean,
m, is:
s
X  z
n
The margin of error, E, is the distance between the sample mean
and the upper and lower limits of this interval.
Margin of Error, E
=
z
s
n
For example, a confidence interval of 18cm < m < 22cm, can also be
expressed as 20cm + 2cm. The margin of error is 2cm. Our
estimate is “accurate to within 2cm”.
Given a particular level of confidence a,
Calculating the minimum sample size - means.
The confidence interval formula for estimating the population mean, μ, is:
$\bar{X} \pm z \times \frac{\sigma}{\sqrt{n}}$
The margin of error, E, is the distance between the sample mean and the upper and lower limits of this interval.
Margin of Error, $E = z \times \frac{\sigma}{\sqrt{n}}$
For example, a confidence interval of 18cm < μ < 22cm can also be expressed as 20cm ± 2cm. The margin of error is 2cm. Our estimate is “accurate to within 2cm”.
Given a particular level of confidence α, we can calculate how big a sample is necessary to estimate μ to give a required accuracy or margin of error, E.
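The link between an interval and its "estimate ± margin of error" form can be sketched in a few lines of Python. The helper name `centre_and_margin` is hypothetical; it assumes the interval is symmetric about the sample mean, as it is for the intervals in these notes.

```python
def centre_and_margin(lower, upper):
    """Convert an interval (lower, upper) to (estimate, margin of error)."""
    estimate = (lower + upper) / 2   # midpoint of the interval
    margin = (upper - lower) / 2     # half of the interval width
    return estimate, margin

estimate, margin = centre_and_margin(18, 22)
print(f"{estimate}cm ± {margin}cm")
```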
E.g: A survey is to be conducted to determine the mean income of a group of workers. A pilot survey gives σ ≈ $100. How large must the sample be if the mean income is to be estimated to within $20 using a 95% confidence interval?
Solution: A confidence interval for the mean income μ is:
$\bar{X} \pm 1.96 \times \frac{\sigma}{\sqrt{n}}$
For the income to be found to within $20, we need:
$1.96 \times \frac{100}{\sqrt{n}} < 20$
$\frac{196}{\sqrt{n}} < 20$
$\frac{196^2}{n} < 20^2$ (Squaring both sides)
$n > \frac{196^2}{20^2}$
$n > 96.04$
Answer: A minimum sample size of 97 is needed.
When you’ve copied down this e.g: Do Sigma (new): Ex. 4.02 – pg. 79 (or Old version: Ex. 14.2 – pg. 230). FINISH FOR H.W.
Formula for calculating minimum sample size.
$n = \left( \frac{z\sigma}{E} \right)^2$
Where E = Margin of Error, i.e. half of C.I. width.
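The minimum-sample-size formula for means can be tried out with a short Python sketch. The function name `min_sample_size_mean` is an illustrative choice, and rounding up with `ceil` reflects the fact that a sample size must be a whole number at least as large as the calculated value.

```python
from math import ceil

def min_sample_size_mean(z, sigma, margin):
    """Smallest whole n with z * sigma / sqrt(n) no larger than margin."""
    return ceil((z * sigma / margin) ** 2)

# Income example: sigma about $100, margin of error $20, 95% confidence
n_income = min_sample_size_mean(1.96, 100, 20)
print(n_income)
```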
Sample-size question from 2007
NCEA External Exam
A random sample of size n is taken from a population
having a known standard deviation σ. A 95%
confidence interval for the population mean is
calculated using the sample mean.
A second random sample of size 2n is taken from the
same population and a 95% confidence interval for
the population mean is calculated using its sample
mean.
How many times greater is the width of the first
confidence interval than the width of the second
confidence interval?
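One way to reason about this question, sketched in LaTeX under the assumption that both intervals use the same z-value and the same known σ: the width W of a confidence interval is twice its margin of error.

```latex
W_1 = 2z\frac{\sigma}{\sqrt{n}}, \qquad
W_2 = 2z\frac{\sigma}{\sqrt{2n}}, \qquad
\frac{W_1}{W_2} = \frac{\sqrt{2n}}{\sqrt{n}} = \sqrt{2} \approx 1.41
```

So doubling the sample size narrows the interval by a factor of √2, not by a factor of 2.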
LESSON 7 – Intro to Confidence
Intervals for Proportions
The points of today:
• Introduction to Distribution of Sample PROPORTIONS.
• Construct confidence intervals for Population Proportions.
• Notes on distn. of sample proportions (handout).
• Do handout on distribution of sample proportions (Achieving in Statistics page 33).
• How to construct a C.I. for a proportion.
• HW: NuLake Ex. 2.6.
The Distribution of Sample Proportions
E.g. Political Opinion Polls - National vs Labour.
2 possible outcomes where π is the proportion of successful outcomes in the population.
If a sequence of n independent trials results in x successes, then x has a Binomial distribution.
A point estimator of the popn proportion of successful trials, π, is the sample proportion $p = \frac{x}{n}$.
With a sufficient sample size (rule of thumb n>30), the distribution of sample proportions p is approximately normal and…
$E(p) = \mu_p = \pi$ (By the Central Limit Theorem)
$\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$
1. Do handout on distribution of sample proportions. (Will do Q1 table on board as a class)
Next slide:
The proofs of the formulae for mean
and standard deviation of the
distribution of sample proportions
With a sufficient sample size (rule of thumb n>30), the distribution of sample proportions p is approximately normal and…
$E(p) = \mu_p = \pi$
Proof:
$E(p) = E\left(\frac{X}{n}\right) = \frac{1}{n}E(X) = \frac{1}{n} \times n\pi = \pi$
Since, for the Binomial Distribution, $\mu = n\pi$.
$\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$
Proof:
$Var(p) = Var\left(\frac{X}{n}\right) = Var\left(\frac{1}{n}X\right) = \left(\frac{1}{n}\right)^2 Var(X) = \frac{1}{n^2} \times n\pi(1-\pi) = \frac{\pi(1-\pi)}{n}$
Since, for the Binomial Distribution, $\sigma^2 = n\pi(1-\pi)$.
So $\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$
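The two results proved here can be checked by brute force. The Python sketch below works out the mean and standard deviation of the sample proportion directly from the Binomial probabilities and compares them with the formulas. The function name `sample_proportion_moments` and the values n = 50 and π = 0.3 are arbitrary choices for illustration, not taken from the notes.

```python
from math import comb, sqrt

def sample_proportion_moments(n, pi):
    """Exact mean and SD of p = X/n when X ~ Binomial(n, pi)."""
    # Binomial probability of each possible number of successes x = 0..n
    probs = [comb(n, x) * pi**x * (1 - pi)**(n - x) for x in range(n + 1)]
    mean = sum((x / n) * pr for x, pr in enumerate(probs))
    var = sum((x / n - mean) ** 2 * pr for x, pr in enumerate(probs))
    return mean, sqrt(var)

mean_p, sd_p = sample_proportion_moments(50, 0.3)
formula_sd = sqrt(0.3 * (1 - 0.3) / 50)
print(mean_p, sd_p, formula_sd)
```

The mean should agree with π and the standard deviation with the square-root formula, up to floating-point rounding.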
Confidence Intervals for Proportions
Example: Political opinion polls.
500 New Zealanders aged 18 and over were selected at random for an opinion poll. They were asked to indicate whether Labour or National would be their preferred political party. 275 voted for National.
Find a 95% confidence interval for the true proportion of all NZers who favour National.
Solution: Our point estimate for π is $p = \frac{275}{500} = 0.55$
We can be 95% confident that the interval 0.55 ± 1.96 standard errors contains the true population proportion who would prefer National.
A 95% C.I. is
$p \pm z\sigma_p = p \pm z \times \sqrt{\frac{p(1-p)}{n}}$
$= 0.55 \pm 1.96 \times \sqrt{\frac{0.55 \times 0.45}{500}}$
$= 0.55 \pm 0.04361$ (Margin of Error E)
ANSWER: The 95% CI for the proportion in favour of National is
0.5064 < π < 0.5936
HW: Do NuLake Ex. 2.6 – CIs for proportions.
PONDER THIS: Based on this opinion poll, does National have a STATISTICALLY SIGNIFICANT majority?
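The opinion-poll interval can be reproduced with a short Python sketch. The function name `proportion_confidence_interval` is made up for illustration, and the sketch assumes the sample proportion is used in place of the unknown population proportion inside the standard error, as in the working.

```python
from math import sqrt

def proportion_confidence_interval(successes, n, z):
    """Return (lower, upper) for p ± z * sqrt(p * (1 - p) / n)."""
    p_hat = successes / n                       # sample proportion
    margin = z * sqrt(p_hat * (1 - p_hat) / n)  # margin of error E
    return p_hat - margin, p_hat + margin

# Opinion poll: 275 out of 500 preferred National, 95% confidence
lower, upper = proportion_confidence_interval(275, 500, 1.96)
print(round(lower, 4), round(upper, 4))
```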
LESSON 8 – Practice constructing
C.I.s for Proportions
The point of today:
• Do lots of practice involving confidence intervals
for Population Proportions.
Go over any homework questions – NuLake
p87,88: Ch 2.6 – C.I.s for proportions.
Then do Sigma pg. 232 – Ex. 14.3 (old version).
or in new version: pg. 88 - Ex. 5.01. Finish for HW.
LESSON 9 – SAMPLE SIZE
(PROPORTIONS)
• Today’s theme: Calculate the required sample size to
meet a set of specified conditions for a Confidence
Interval for the population PROPORTION.
• Key point – for minimum sample size, if not told π, assume π = 0.5 as this gives the greatest margin of error (prepared for the worst).
• Do Sigma: old edition – p235 – Ex. 14.4 or new edition – p91 - Ex. 5.02
Calculating the minimum sample size - proportions.
The confidence interval formula for estimating the population proportion, π, is:
$p \pm z \times \sqrt{\frac{\pi(1-\pi)}{n}}$
The margin of error, E, is the distance between the sample proportion and the upper and lower limits of this interval.
Margin of Error, $E = z \times \sqrt{\frac{\pi(1-\pi)}{n}}$
For example, a confidence interval of 0.37 < π < 0.43 can also be expressed as 0.4 ± 0.03. The margin of error is 0.03. Our estimate is “accurate to within 0.03”.
The sample size depends on three factors:
1. The level of confidence required, α.
2. The true value of π, which will often be unknown.
3. The accuracy required, i.e. the margin of error, E, we are willing to accept.
Example
An international airline is thinking of making smoking illegal on its aircraft. Before making the decision it wishes to estimate the proportion of smokers in the population of passengers on its planes by taking a random sample. How big a sample must it take to be 95% sure that the value so obtained does not differ from the true proportion by more than 0.05?
Solution: A 95% confidence interval for the proportion of smokers on all planes, π, is:
$p \pm 1.96 \times \sqrt{\frac{\pi(1-\pi)}{n}}$
For the proportion to be found to within 0.05, we need: Margin of Error ≤ 0.05
$1.96 \times \sqrt{\frac{\pi(1-\pi)}{n}} \le 0.05$
PROBLEM! We don’t have a value for π. That’s the very thing we’re trying to estimate!! To get around this problem we have 3 options:
1. Use a value of π that has held in the past (previous samples).
2. Take a small pilot survey, and use the sample proportion p from that as an estimate of π.
3. Use π = 0.5. This allows for the greatest possible error because the maximum possible value of π(1-π) occurs when both π and (1-π) are = ½, i.e. when π(1-π) = 0.5 × 0.5 = 0.25
Back to this example:
$1.96 \times \sqrt{\frac{\pi(1-\pi)}{n}} \le 0.05$
We’re given no information on the value of π, so let π = 0.5.
$1.96 \times \sqrt{\frac{0.5(1-0.5)}{n}} \le 0.05$
$1.96 \times \sqrt{\frac{0.25}{n}} \le 0.05$
$\frac{1.96^2 \times 0.25}{n} \le 0.05^2$ (Squaring both sides)
$n \ge \frac{1.96^2 \times 0.25}{0.05^2}$
$n \ge 384.16$
Answer: A sample size of 385 passengers is needed.
Do Sigma p235 – Ex. 14.4 (old version) or p91 – Ex. 5.02 (new version).
Homework: NuLake pg. 96: Q64-77.
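The airline calculation, and the reason 0.5 is the "worst case" value, can both be checked with a small Python sketch. The function name `min_sample_size_proportion` and the grid of trial values are assumptions made for illustration.

```python
from math import ceil

def min_sample_size_proportion(z, margin, pi=0.5):
    """Smallest whole n with z * sqrt(pi * (1 - pi) / n) no larger than margin.

    pi defaults to 0.5, the worst case, because pi * (1 - pi) peaks there.
    """
    return ceil(z ** 2 * pi * (1 - pi) / margin ** 2)

# Airline example: 95% confidence, margin of error 0.05, no information on pi
n_airline = min_sample_size_proportion(1.96, 0.05)
print(n_airline)

# pi * (1 - pi) on a grid from 0.0 to 1.0: the largest value occurs at pi = 0.5
grid = [i / 10 for i in range(11)]
worst_pi = max(grid, key=lambda pi: pi * (1 - pi))
print(worst_pi)
```

If a pilot survey or past data suggests a value of π other than 0.5, passing it as the third argument gives a smaller required sample size.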
LESSON 10 – Differences between
means 1
The point of today:
Construct confidence intervals for the
difference between 2 population means.
• Do NuLake 2.7: pg. 89-93.
2007 NCEA exam – C.I.s
Confidence Intervals for the Difference Between 2 Means
Involves comparison between the means of two populations (e.g. males & females). We select a random sample from each group and calculate the 2 means, subtracting to get the difference.
We then use this difference to estimate the difference between the means of the 2 populations from which the samples were drawn.
The expected difference between the 2 sample means is the true difference between the 2 population means: (Central Limit Theorem)
$E(\bar{X}_1 - \bar{X}_2) = \mu_1 - \mu_2$
i.e. Mean difference between sample means = diff. between popn means.
Sample Mean (point estimate) | Sample Size | Popn Mean | Variance of Sample Means
$\bar{X}_1$ | $n_1$ | $\mu_1$ | $\frac{\sigma_1^2}{n_1}$
$\bar{X}_2$ | $n_2$ | $\mu_2$ | $\frac{\sigma_2^2}{n_2}$
$\bar{X}_1 - \bar{X}_2$ | - | $\mu_1 - \mu_2$ | $\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$
So the Standard Error of the difference between 2 sample means is:
$\sigma_{(\bar{X}_1 - \bar{X}_2)} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
NOTE:
1. The 2 samples must be INDEPENDENT of one another.
2. When finding a confidence interval for the difference between 2 means, we use the popn parameters σ1 and σ2. If not told these, we can use the sample SD’s s1 and s2, provided the sample sizes are large enough (n>30).
3. A 95% Confidence Interval tells us that 95% of such intervals will CONTAIN the difference between the POPULATION MEANS.
Confidence Intervals for Difference
Between 2 Means
Example:
If a random sample of 49 women has a mean life of 76 years with a
standard deviation of 8 years and a random sample of 64 men has a
mean life of 72 years with a standard deviation of 9 years.
(a) Find a 95% confidence interval for the difference between the mean
lifetimes of all women and all men.
(b) What can we conclude about the mean lifespans of all men and all
women on the basis of this confidence interval? Justify your answer.
Confidence Intervals for Difference
Between 2 Means
Example:
If a random sample of 49 women has a mean life of 76 years with a
standard deviation of 8 years and a random sample of 64 men has a
mean life of 72 years with a standard deviation of 9 years.
(a) Find a 95% confidence interval for the difference between the mean
lifetimes of all women and all men.
Solution:
For the women:
Confidence Intervals for Difference
Between 2 Means
Example:
If a random sample of 49 women has a mean life of 76 years with a
standard deviation of 8 years and a random sample of 64 men has a
mean life of 72 years with a standard deviation of 9 years.
(a) Find a 95% confidence interval for the difference between the mean
lifetimes of all women and all men.
Solution:
For the women: n1 = 49
Confidence Intervals for Difference
Between 2 Means
Example:
If a random sample of 49 women has a mean life of 76 years with a
standard deviation of 8 years and a random sample of 64 men has a
mean life of 72 years with a standard deviation of 9 years.
(a) Find a 95% confidence interval for the difference between the mean
lifetimes of all women and all men.
Solution:
For the women:
n1 = 49
X1
=76
Confidence Intervals for Difference
Between 2 Means
Example:
If a random sample of 49 women has a mean life of 76 years with a
standard deviation of 8 years and a random sample of 64 men has a
mean life of 72 years with a standard deviation of 9 years.
(a) Find a 95% confidence interval for the difference between the mean
lifetimes of all women and all men.
Solution:
For the women:
n1 = 49
X1
=76
s1 = 8
Confidence Intervals for Difference
Between 2 Means
Example:
If a random sample of 49 women has a mean life of 76 years with a
standard deviation of 8 years and a random sample of 64 men has
a mean life of 72 years with a standard deviation of 9 years.
(a) Find a 95% confidence interval for the difference between the mean
lifetimes of all women and all men.
Solution:
For the women:
n1 = 49
For the men:
n2 = 64
X1
=76
s1 = 8
Confidence Intervals for Difference
Between 2 Means
Example:
If a random sample of 49 women has a mean life of 76 years with a
standard deviation of 8 years and a random sample of 64 men has
a mean life of 72 years with a standard deviation of 9 years.
(a) Find a 95% confidence interval for the difference between the mean
lifetimes of all women and all men.
Solution:
For the women:
n1 = 49
X1
=76
For the men:
n2 = 64
X2
=72
s1 = 8
Confidence Intervals for Difference
Between 2 Means
Example:
If a random sample of 49 women has a mean life of 76 years with a
standard deviation of 8 years and a random sample of 64 men has
a mean life of 72 years with a standard deviation of 9 years.
(a) Find a 95% confidence interval for the difference between the mean
lifetimes of all women and all men.
Solution:
For the women:
n1 = 49
X1
=76
s1 = 8
For the men:
n2 = 64
X2
=72
s2
=9
Confidence Intervals for Difference
Between 2 Means
Example:
If a random sample of 49 women has a mean life of 76 years with a
standard deviation of 8 years and a random sample of 64 men has a
mean life of 72 years with a standard deviation of 9 years.
(a) Find a 95% confidence interval for the difference between the mean
lifetimes of all women and all men.
Solution:
For the women:
For the men:
n1 = 49
X 1 =76
s1 = 8
n2 = 64
X 2 =72
s2
=9
Confidence Intervals for Difference Between 2 Means
Example:
If a random sample of 49 women has a mean life of 76 years with a
standard deviation of 8 years and a random sample of 64 men has a
mean life of 72 years with a standard deviation of 9 years.
(a) Find a 95% confidence interval for the difference between the mean
lifetimes of all women and all men.
Solution:
For the women:
For the men:
n1 = 49
n2 = 64
X 1  X 2 = 76 – 72
= 4 yrs
X1
=76
X2
=72
s1 = 8
s2 = 9
For the women: n1 = 49, X̄1 = 76, s1 = 8
For the men: n2 = 64, X̄2 = 72, s2 = 9
X̄1 − X̄2 = 76 – 72 = 4 yrs
A 95% Confidence Interval for μ1 − μ2, the difference between the
population mean lifetimes of women and men, is:
(X̄1 − X̄2) ± z × Standard Error of (X̄1 − X̄2)
= (X̄1 − X̄2) ± z × √(s1²/n1 + s2²/n2)
(Use the sample standard deviations – OK if sample is large enough.)
= 4 ± 1.96 × √(8²/49 + 9²/64)
= 4 ± 3.143 (Margin of Error E = 3.143)
ANSWER: The 95% CI for the difference between the population mean
lifetimes of women and men is 0.857 yrs < (μ1 − μ2) < 7.143 yrs
(b) What can we conclude about the mean lifespans of all men and all
women on the basis of this confidence interval? Justify your answer.
ANSWER: Since the interval does not contain a difference of ZERO, there
are sufficient grounds to say that there is a difference between the
mean lifespans of the populations of all men and all women.
TRY WITH A 99% C.I.
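The calculation above can be checked with a short Python sketch. It is
only an illustration, not part of the course material: the function name
diff_means_ci is made up here, and it assumes both samples are large
enough for the z-interval with sample standard deviations to be
reasonable.

```python
from math import sqrt
from statistics import NormalDist


def diff_means_ci(mean1, s1, n1, mean2, s2, n2, confidence=0.95):
    """Large-sample z-interval for mu1 - mu2, using the sample SDs."""
    # Two-sided z value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Standard error of the difference between the two sample means.
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    margin = z * standard_error
    diff = mean1 - mean2
    return diff - margin, diff + margin


# Women: n1 = 49, mean 76, SD 8.  Men: n2 = 64, mean 72, SD 9.
lower, upper = diff_means_ci(76, 8, 49, 72, 9, 64)
print(f"95% CI: {lower:.3f} yrs < mu1 - mu2 < {upper:.3f} yrs")

# The "TRY WITH A 99% C.I." prompt only needs a different confidence level.
lower99, upper99 = diff_means_ci(76, 8, 49, 72, 9, 64, confidence=0.99)
print(f"99% CI: {lower99:.3f} yrs < mu1 - mu2 < {upper99:.3f} yrs")
```

Comparing whether each interval contains zero is the same check used in
part (b).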
Difference between 2 means exercises
• Do NuLake Ch 2.7: pg. 89–93
LESSON 11 – Differences between means 2
The point of today:
Construct confidence intervals for the
difference between 2 population means.
STARTER:
Go through the problem from HW as a class.
• Do Sigma pg. 239 – Ex. 14.5 (old version),
or pg. 83 – Ex. 4.03 (new version).
LESSON 12
The distribution of the sample Total.
The point of today:
Construct confidence intervals for the combined
total of a sample of items.
• Example
• 2009 NCEA paper (AS90642): Q1b & c.
• Probabilities for sample totals: Ex. 3.03 (pg. 64) – complete
for HW.
Confidence Intervals for the Sample Total, Tn
This is where you are asked to give a confidence interval for
the combined total of a sample of n items (or of the entire
population of N items).
(E.g. total weight of a sample of eight Y13 males). You will be
told the mean value per item and the standard deviation.
These problems come in 2 types, depending on whether you’re
given the population mean, μ, or the mean from a sample, X̄.
Type 1: Based on μ, a known population mean (look at today).
Type 2: Based on X̄, the mean from a random sample (look at next
lesson).
Type 1: Based on μ, a known population mean.
This is where you are given μ, the mean value per item in the
population, and asked to construct a confidence interval for the
total value of a sample of n items.
E.g. Seventeen year-old NZ males have a known mean weight of
80 kg, with a standard deviation of 5 kg.
Construct a 99% CI for the combined total weight of a random
sample of 8 students.
Solution: The distribution of the total weight of 8 students is
the sum of 8 identically distributed random variables.
Here we know the population mean weight per seventeen
year-old male, μ, and the standard deviation, σ.
So we can simply add the means and add the variances.
Distribution of a Total of n independent items:
If X1, X2, ..., Xn are n independent sample values, then the
sample total is
Tn = X1 + X2 + ... + Xn
Expected value of total: E[Tn] = E[X1 + X2 + ... + Xn]
= E[X1] + ... + E[Xn]
= nμ
Variance of estimates of the total:
Var[Tn] = Var[X1 + X2 + ... + Xn]
= Var[X1] + ... + Var[Xn]
= nσ² (if all have equal SD)
So the std. deviation of estimates of the total is: σ_T = √n × σ
Recall: Var[Tn] = nσ² (if all have equal SD), so the std. deviation
of estimates of the total is σ_T = √n × σ.
Back to the example: Total weight of sample of 8 males:
E(T8) = 8(80) = 640 kg.
Var(T8) = 8(5²) = 200.
So σ_T = √200 = 14.14213562 kg
99% CI for T8 is E(T8) ± z × σ_T
= 640 ± 2.576 × 14.14...
= 640 kg ± 36.43 kg (4sf)
ANSWER: The 99% CI for T8, the total weight of the sample of 8 males, is:
603.6 kg < T8 < 676.4 kg (all to 4sf)
1. Do 2009 NCEA AS90642 – Q1 (b) and (c).
2. Do Sigma:
- Old (2nd edition): pg. 183 – Ex. 11.3.
- or New: pg. 64 – Ex. 3.03.
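The Type 1 calculation for the total weight of 8 males can be sketched in
Python as follows. The function name sample_total_ci is hypothetical, and
the sketch assumes the sample total is close enough to normally
distributed for a z value to apply.

```python
from math import sqrt
from statistics import NormalDist


def sample_total_ci(mu, sigma, n, confidence):
    """Interval for the total of n independent items with known mu and sigma."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    expected_total = n * mu        # E[Tn] = n * mu
    sd_total = sqrt(n) * sigma     # sqrt(Var[Tn]) = sqrt(n) * sigma
    margin = z * sd_total
    return expected_total - margin, expected_total + margin


# Known population values from the example: mean 80 kg, SD 5 kg, n = 8.
total_low, total_high = sample_total_ci(mu=80, sigma=5, n=8, confidence=0.99)
print(f"99% CI: {total_low:.1f} kg < T8 < {total_high:.1f} kg")
```

The key point is that the standard deviation of the total grows with √n,
not with n, because variances (not standard deviations) add.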
LESSON 13
Confidence Intervals for Population Totals
• STARTER: Revise the definition of a Confidence Interval.
• Notes on CI for population totals.
• Do NCEA AS90642 – 2009 paper: Q2c.
• Do NuLake p98-100 (mixed problems).
• Do NuLake practice assessment (p101).
Confidence Intervals for the Sample Total, Tn
This is where you are asked to give a confidence interval for
the combined total of a sample of n items (or of the entire
population of N items).
(E.g. total weight of a sample of eight Y13 males). You will be
told the mean value per item and the standard deviation.
These problems come in 2 types, depending on whether you’re
given the population mean, μ, or the mean from a sample, X̄.
Type 1: Based on μ, a known population mean (looked at last
lesson).
Type 2: Based on X̄, the mean from a random sample (look at
today).
Type 2: Based on X̄, the mean from a random sample:
This is where you are asked to construct a confidence interval for the
total value of N items but the population mean per item is unknown.
Instead we are told X̄, the mean from a sample.
Then an estimate of the total value of N items is: N × X̄
To construct a CI for a total based on a sample:
1. Construct a confidence interval for the population mean per item, μ.
2. Multiply the lower and upper bounds of the interval by N, the number
of items.
Example:
68 Year 13 male students are to be selected at random from
throughout NZ to win a prize of an overseas holiday after NCEA
exams. The organisers need to estimate the likely total weight of
the students, due to weight restrictions on the aircraft.
The mean and SD of the population of all Year 13 males are unknown, so
they conduct a pilot study by selecting a random sample of 30. This
sample has a mean weight of 76 kg with a standard deviation of 7 kg.
Construct a 96% CI for the expected total weight of 68
randomly selected Year 13 students.
Solution:
1. Construct a 96% confidence interval for the population mean μ:
Interval is given by: X̄ ± z × s/√n
= 76 ± 2.054 × 7/√30
= 76 ± 2.625
So 96% CI for population mean weight, μ, is: 73.375 kg < μ < 78.625 kg
2. Multiply the lower and upper bounds of the interval by N, the
number of items.
96% CI for the expected total weight of the 68 Y13s is:
(N × lower limit for μ) < TN < (N × upper limit for μ)
= (68 × 73.375) < TN < (68 × 78.625)
= 4990 kg < T68 < 5347 kg
1. C.I. for Population Totals:
Do 2009 NCEA paper (AS90642): Q2c.
2. Preparation for test:
Do NuLake p98-100 (Mixed problems).
Do NuLake practice assessment (p101).
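The two-step Type 2 method above can be sketched in Python like this. The
function name population_total_ci is made up for illustration, and the
sketch assumes a sample of 30 is large enough to use z with the sample
standard deviation. Because it uses an unrounded z value rather than
2.054, the results can differ from the slide's figures in the last digit.

```python
from math import sqrt
from statistics import NormalDist


def population_total_ci(sample_mean, sample_sd, sample_size, num_items, confidence):
    """Step 1: CI for the mean per item.  Step 2: scale both bounds by N."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sample_sd / sqrt(sample_size)
    mean_low, mean_high = sample_mean - margin, sample_mean + margin
    # Multiply the lower and upper bounds by the number of items N.
    return (mean_low, mean_high), (num_items * mean_low, num_items * mean_high)


# Pilot sample: n = 30, mean 76 kg, SD 7 kg.  Total wanted for N = 68 students.
mean_ci, total_ci = population_total_ci(76, 7, 30, 68, 0.96)
print(f"96% CI for mean:  {mean_ci[0]:.3f} kg < mu < {mean_ci[1]:.3f} kg")
print(f"96% CI for total: {total_ci[0]:.0f} kg < T68 < {total_ci[1]:.0f} kg")
```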
Sample-size question from 2007
NCEA External Exam
A random sample of size n is taken from a population
having a known standard deviation σ. A 95%
confidence interval for the population mean is
calculated using the sample mean.
A second random sample of size 2n is taken from the
same population and a 95% confidence interval for
the population mean is calculated using its sample
mean.
How many times greater is the width of the first
confidence interval than the width of the second
confidence interval?
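One way to explore this question is to note that, with σ known, the width
of the interval is 2 × z × σ/√n, so only √n changes between the two
samples. The sketch below checks the ratio numerically. The values chosen
for sigma and n are arbitrary placeholders, not from the exam, and the
function name ci_width is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(sigma, n, confidence=0.95):
    """Full width of a z-interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 2 * z * sigma / sqrt(n)


# Placeholder values: any sigma > 0 and any n give the same ratio.
sigma, n = 10.0, 25
ratio = ci_width(sigma, n) / ci_width(sigma, 2 * n)
print(f"width(n) / width(2n) = {ratio:.4f}")  # equals sqrt(2), about 1.4142
```

Doubling the sample size therefore shrinks the interval by a factor of
√2 rather than halving it.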
LESSON 14 – ASSESSMENT
What to study:
• Do NuLake mixed problems (p98) – merit level qs.
• NuLake practice assessment (p101)
• More practice (Achieved & Merit):
Do Sigma Confidence Intervals Review exercise:
- Old: p241 – Ex. 14.6
- New: p95 – Ex. 5.03
• CIs for totals (Excellence) – past papers:
2009 Q2c
2008 Q6
2006 Q7