Introduction to Inference

4/19/2013
Introduction to Inference
Estimating with Confidence
IPS Chapter 6.1
© 2009 W.H. Freeman and Company
Objectives (IPS Chapter 6.1)
Estimating with confidence

Statistical confidence

Confidence intervals

Confidence interval for a population mean

How confidence intervals behave

Choosing the sample size
Overview of Inference

Methods for drawing conclusions about a population from sample
data are called statistical inference

Methods


Confidence Intervals - estimating a value of a population parameter

Tests of significance - assess evidence for a claim about a population
Inference is appropriate when data are produced by either
• a random sample or
• a randomized experiment
How Statistical Inference Works

Statistical Inference is the process of drawing conclusions using
data that are subject to random variation.

It makes propositions about populations, using data drawn from the
population of interest via some form of random sampling.

The result of a statistical inference is a statistical proposition.
Some common forms of statistical proposition are:

an estimate, i.e. a particular value that best approximates some
population parameter of interest

a confidence interval, i.e. an interval constructed from the data in such
a way that, under repeated sampling, it would contain the true parameter
value with the probability stated in the confidence level

a test of significance, i.e. a decision to reject or accept a
hypothesis/claim about the nature or state of a population on the basis of
a statistically significant outcome/result
How Statistical Inference Works
Population under study: µ = ?, p = ?
1. Take an SRS of size n.
2. Compute the value of a sample statistic: x̄, p̂.
3. Use statistical inference to draw conclusions about the population.
Confidence Intervals

A confidence interval gives an estimated range of values which is
likely to include an unknown population parameter. It’s calculated
from a sample taken from the population and is of the form:
estimate ± margin of error

The level of confidence is the proportion of such intervals, over repeated
sampling, that include the true value of the population parameter.
Computing Confidence Intervals for µ
when the population standard deviation σ is known
Example 1

The weight of single eggs of the brown variety is normally distributed
with an unknown mean µ and a known standard deviation σ = 5g.
You buy a carton of 12 brown eggs and find out that the box weighs
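770g, for an average weight of 64.2g per egg. What can you
conclude about the true mean weight µ of all brown eggs?

µ = ?, σ = 5g; carton weight = 770g, so x̄ ≈ 64.2g

Because the weights are normally distributed, x̄ is normally distributed
with mean µ and standard deviation σ/√n, so:
P(µ − 1·σ/√n < x̄ < µ + 1·σ/√n) = 0.68 or 68%
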
Example 1 Cont.
Therefore, there is a 68% chance that the interval x̄ ± 1·σ/√n
includes the unknown value µ.

Distribution of X-bar: centered at the location of the mean µ, with
standard deviation σ/√n = 5/√12 = 1.443

68% Confidence Interval for the mean weight of a brown egg:
x̄ ± 1·σ/√n = 64.2 ± 1.443 = (62.757, 65.643)
Example 1 Cont.
P(µ − 2·σ/√n < x̄ < µ + 2·σ/√n) = 0.95 or 95%.
Therefore, there is a 95% chance that the interval x̄ ± 2·σ/√n
includes the unknown value µ.

Distribution of X-bar: centered at the location of the mean µ, with
standard deviation σ/√n = 5/√12 = 1.443

95% Confidence Interval for the mean weight of a brown egg:
x̄ ± 2·σ/√n = 64.2 ± 2 × 1.443 = (61.314, 67.086)
Example 1 Cont.
What is an 80% Confidence Interval for the mean weight of a brown egg?
We need to find a value z* so that:
P(µ − z*·σ/√n < x̄ < µ + z*·σ/√n) = 0.80 or 80%

The area in each tail is (1 − 0.80)/2 = 0.10, so
−z* = invNorm(.10, 0, 1) = −1.28

We can use z* to calculate the margin of error for the interval:
m = z*·σ/√n

Therefore an 80% CI for the mean weight of a brown egg is:
64.2 ± 1.28 × 5/√12 = 64.2 ± 1.8475, or (62.35, 66.05)
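The three intervals of Example 1 can be reproduced with a short Python sketch. It assumes the rounded sample mean of 64.2g used above and relies only on the standard library's `statistics.NormalDist`; the helper names `interval` and `central_probability` are illustrative and not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist

sigma = 5.0      # known population standard deviation (g)
n = 12           # eggs in the carton
x_bar = 64.2     # sample mean (g), rounded as in the example

std_error = sigma / sqrt(n)   # standard deviation of x-bar


def interval(z):
    """Return x_bar ± z * sigma / sqrt(n) as a (low, high) tuple."""
    margin = z * std_error
    return (x_bar - margin, x_bar + margin)


def central_probability(z):
    """Probability that a standard normal value falls between -z and z."""
    std_normal = NormalDist()
    return std_normal.cdf(z) - std_normal.cdf(-z)


# z* for 80% confidence leaves (1 - 0.80)/2 = 0.10 in each tail.
z_80 = NormalDist().inv_cdf(1 - 0.10)

for z in (1, 2, z_80):
    low, high = interval(z)
    print(f"z = {z:.3f}: central probability {central_probability(z):.4f}, "
          f"interval ({low:.3f}, {high:.3f})")
```

The multipliers 1 and 2 are the rounded values behind the 68% and 95% statements; the exact central probabilities printed by the sketch are slightly larger, and the endpoints can differ from the slide values in the last digit because the slides round σ/√n to 1.443 first.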
Confidence Interval for µ (σ given)
In general

A level C (expressed as a %)
confidence interval for µ when σ is
known is given by:
x̄ ± z*·σ/√n
Assumptions

The population from where the
sample is taken is normally
distributed, or

sample size n ≥ 30.
[Figure: standard normal curve with central area C between −z* and z*]
Example

Weights of newborn babies follow a normal distribution with a
standard deviation σ = 1 lb and an unknown mean µ. To estimate µ we
look at the next 10 babies born. We find that the sample mean x-bar
for these 10 babies is 6.35 lbs.
(a) What is a 90% CI for µ based on this sample?
(b) What is a 95% CI for µ based on this sample?
(c) What is an 85% CI for µ based on this sample?
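One way to check answers to this exercise is a small general-purpose function for the known-σ interval. The sketch below is an illustration, not the slides' own method: the function name `z_interval` is hypothetical, and it assumes n = 10, taken from "the next 10 babies born".

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence):
    """Level-`confidence` interval for the mean when sigma is known."""
    # z* leaves (1 - confidence)/2 in each tail of the standard normal.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return (x_bar - margin, x_bar + margin)


# Parts (a), (b), (c); n = 10 is an assumption read from the problem text.
for level in (0.90, 0.95, 0.85):
    low, high = z_interval(x_bar=6.35, sigma=1.0, n=10, confidence=level)
    print(f"{level:.0%} CI: ({low:.2f}, {high:.2f}) lbs")
```

Raising the confidence level widens the interval, which the three printed intervals should make visible.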
How do we find specific z* values?
Table D: The bottom row of the table lists the confidence levels C; the
row above it gives the corresponding values of z*.
Example: For a 98% confidence level, z*=2.326
We can also use software. For example, in Excel:
=NORMINV(probability, mean, standard_dev)
gives z for a given cumulative probability.
Since we want the middle C probability, the cumulative probability to enter is the lower-tail area (1 − C)/2, which returns −z*.
Example: For a 98% confidence level, =NORMINV(.01,0,1) = −2.32635 (= neg. z*)
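The same lookup can be sketched in Python with the standard library, where `NormalDist.inv_cdf` plays the role of NORMINV. The function name `z_star` is illustrative only.

```python
from statistics import NormalDist


def z_star(confidence):
    """Critical value z* with central area `confidence` under N(0, 1)."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    tail = (1 - confidence) / 2               # area in each tail
    return -NormalDist(0, 1).inv_cdf(tail)    # inv_cdf(tail) equals -z*


for c in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"C = {c:.0%}: z* = {z_star(c):.3f}")
```

For C = 98% this should agree with the table value 2.326 up to rounding.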
Computing CI using the TI-83
1. Press STAT.
2. Select TESTS → ZInterval.
   Select Inpt: Data; enter the value for σ, the list (Li) where the
   sample data is stored, and the confidence level (C-Level) in
   decimal format.
   OR
   Select Inpt: Stats; enter the value for σ, x-bar, n, and the
   confidence level (C-Level).
3. Select Calculate & press Enter.
Example 1: Calories in Apples

The following table shows the number of calories in a sample of 10 apples
of a certain variety.
49  69  39  30  54  50  65  63  64  41
Compute (a) 80%, (b) 90%, & (c) 98% confidence intervals for the true
population mean of the number of calories in apples of this variety based on
this sample. Assume caloric content in apples of this type is normally
distributed with a standard deviation σ =10 calories.
What does it all mean?
Say we compute a 95% confidence interval. “95% confidence” means
that, over many repeated samples, 95% of the intervals computed
this way capture the true value of the population mean (µ), and
5% miss it.
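The repeated-sampling reading of "95% confidence" can be illustrated with a small simulation. The sketch below is only a demonstration under assumed values: the true mean µ = 64 is hypothetical (in practice µ is unknown), σ = 5 and n = 12 are borrowed from the egg example, and the function name `coverage` is illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n and compute its mean.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials


# mu = 64.0 is a made-up "true" value used only for the simulation.
print(coverage(mu=64.0, sigma=5.0, n=12, confidence=0.95, trials=10_000))
```

The printed fraction should land close to 0.95, though not exactly on it, since the simulation itself is subject to random variation.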
Confidence intervals - Summary

The confidence interval is a range of values with an associated
probability or confidence level C.

The probability quantifies the chance that the interval contains the
true population parameter.
Sample size and experimental design
You may need a certain margin of error (e.g., drug trial, manufacturing
specs). In many cases, the population variability (σ) is fixed, but we can
choose the number of measurements (n).
So plan ahead what sample size to use to achieve that margin of error.
m = z*·σ/√n  ⇔  n = (z*·σ/m)²
Remember, though, that sample size is not always stretchable at will. There are
typically costs and constraints associated with large samples. The best
approach is to use the smallest sample size that can give you useful results.
What sample size for a given margin of error?
Density of bacteria in solution:
Measurement equipment has standard deviation
σ = 1 × 10^6 bacteria/ml fluid.
How many measurements should you make to obtain a margin of error
of at most 0.5 × 10^6 bacteria/ml with a confidence level of 90%?
For a 90% confidence interval, z* = 1.645.
n = (z*·σ/m)²  ⇒  n = (1.645 × 1 / 0.5)² = 3.29² = 10.8241
Using only 10 measurements will not be enough to ensure that m is no
more than 0.5 × 10^6. Therefore, we need at least 11 measurements.
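The rounding-up step can be made explicit in a short Python sketch. The function name `sample_size` is illustrative; it computes z* from the standard normal distribution instead of using the rounded table value 1.645, so the intermediate value differs slightly from 10.8241.

```python
from math import ceil
from statistics import NormalDist


def sample_size(sigma, margin, confidence):
    """Smallest n whose margin of error z* * sigma / sqrt(n) is <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # n = (z* * sigma / m)^2, rounded up to the next whole measurement.
    return ceil((z * sigma / margin) ** 2)


print(sample_size(sigma=1e6, margin=0.5e6, confidence=0.90))  # prints 11
```

Rounding up, never to the nearest integer, is what guarantees the margin of error does not exceed the target.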