Review 3 - Jan.ucc.nau.edu

Review 3
Chapter 8
1. A statistic is any quantity computed from values
in a sample (for example, $\bar{x}$, $s$, the sample median,
the sample interquartile range, and so on). The
distribution of a statistic is called its sampling
distribution.
2. Properties of the sampling distribution of $\bar{x}$
Let $\bar{x}$ denote the mean of the observations in a
random sample of size $n$ from a population having
mean $\mu$ and standard deviation $\sigma$. Denote the mean
value of the $\bar{x}$ distribution by $\mu_{\bar{x}}$ and the standard
deviation of the $\bar{x}$ distribution by $\sigma_{\bar{x}}$. Then the
following rules hold.
Rule 1: $\mu_{\bar{x}} = \mu$
Rule 2: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
Rule 3: When the population distribution is normal,
the sampling distribution of $\bar{x}$ is also normal for any
sample size $n$. Thus, the standardized variable
$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
has the standard normal (z) distribution.
Rule 4: (Central Limit Theorem) When $n$ is
sufficiently large ($n \ge 30$), the sampling
distribution of $\bar{x}$ is well approximated by a normal
curve, even when the population distribution is not
itself normal. So, the standardized variable
$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
has approximately the standard normal (z) distribution.
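As a rough illustration of Rules 2 through 4, the sketch below standardizes a sample mean and looks up a normal probability. The numbers (population mean 100, standard deviation 15, sample size 36) and the function name are hypothetical, chosen only to make the arithmetic easy to follow.

```python
from math import sqrt
from statistics import NormalDist


def z_for_sample_mean(x_bar, mu, sigma, n):
    """Standardize a sample mean: z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / sqrt(n))


# Hypothetical values, not taken from the review sheet.
mu, sigma, n = 100.0, 15.0, 36

# sigma_xbar = 15 / sqrt(36) = 2.5, so a sample mean of 105 gives z = 2.0
z = z_for_sample_mean(105.0, mu, sigma, n)

# P(x_bar <= 105) using the standard normal curve
prob_below = NormalDist().cdf(z)
print(z, round(prob_below, 4))
```

Because n = 36 is at least 30, Rule 4 suggests the normal curve is a reasonable approximation here even if the population itself is not normal.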
3. General properties of the sampling distribution of $p$
Let $p$ be the proportion of S's in a random sample
of size $n$ from a population whose proportion of S's
is $\pi$. Denote the mean value of $p$ by $\mu_p$ and the
standard deviation of $p$ by $\sigma_p$. Then the following
rules hold.
Rule 1: $\mu_p = \pi$
Rule 2: $\sigma_p = \sqrt{\pi(1 - \pi) / n}$
Rule 3: (Central Limit Theorem) When $n$ is large
and $\pi$ is not too near 0 or 1 ($n\pi \ge 10$ and $n(1 - \pi) \ge 10$),
the sampling distribution of $p$ is approximately
normal. Thus, the standardized variable
$$z = \frac{p - \mu_p}{\sigma_p} = \frac{p - \pi}{\sqrt{\pi(1 - \pi) / n}}$$
has approximately the standard normal (z) distribution.
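The three rules for $p$ can be sketched in a few lines. The function names and the sample values below are hypothetical; the formulas are the ones stated in the rules above.

```python
from math import sqrt


def proportion_sampling_summary(pi, n):
    """Return (mean of p, standard deviation of p, whether Rule 3 applies)."""
    mean_p = pi                                           # Rule 1
    sd_p = sqrt(pi * (1 - pi) / n)                        # Rule 2
    normal_ok = n * pi >= 10 and n * (1 - pi) >= 10       # Rule 3 condition
    return mean_p, sd_p, normal_ok


def z_for_sample_proportion(p, pi, n):
    """Standardize a sample proportion: z = (p - pi) / sqrt(pi(1 - pi)/n)."""
    return (p - pi) / sqrt(pi * (1 - pi) / n)


# Hypothetical example: pi = 0.5 and n = 100 give sd_p = 0.05
mean_p, sd_p, normal_ok = proportion_sampling_summary(0.5, 100)
print(mean_p, sd_p, normal_ok)
print(z_for_sample_proportion(0.6, 0.5, 100))
```

With a proportion close to 0, such as 0.02 and n = 100, the check fails because $n\pi = 2$, so the normal approximation should not be relied on.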
Chapter 9
4. A point estimate of a population characteristic is a
single number computed from sample data and
represents a plausible value of the characteristic. A
point estimate is obtained by (i) selecting an
appropriate statistic; (ii) computing the value of the
statistic for the given sample.
A statistic whose mean is equal to the value of the
population characteristic being estimated is said to
be an unbiased statistic. A statistic that is not
unbiased is said to be biased.
5. Criteria for choosing among competing statistics
a) First we choose an unbiased statistic if there is
one;
b) if several unbiased statistics could be used for
estimating a population characteristic, we
choose the one with the smallest standard
deviation.
6. Statistics used to estimate some important
population characteristics
Population characteristic to be estimated    Statistic to use    Unbiasedness
Population proportion, $\pi$                 $p$                 Unbiased
Population mean, $\mu$                       $\bar{x}$           Unbiased
Population variance, $\sigma^2$              $s^2$               Unbiased
Population standard deviation, $\sigma$      $s$                 Biased
Population median                            Sample median       Biased
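One way to see what "unbiased" means is to list every possible sample from a tiny population and average the statistic over all of them. The sketch below assumes a hypothetical population {1, 2, 3} and ordered samples of size 2 drawn with replacement; neither choice comes from the review sheet.

```python
from itertools import product
from statistics import mean, pstdev, pvariance, stdev, variance

# Hypothetical population, chosen only to keep the enumeration small.
population = [1.0, 2.0, 3.0]

# All 9 equally likely ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

avg_x_bar = mean(mean(s) for s in samples)   # average of the sample means
avg_s2 = mean(variance(s) for s in samples)  # average of the sample variances
avg_s = mean(stdev(s) for s in samples)      # average of the sample standard deviations

print(avg_x_bar, mean(population))     # these agree: x-bar is unbiased for mu
print(avg_s2, pvariance(population))   # these agree: s^2 is unbiased for sigma^2
print(avg_s, pstdev(population))       # avg_s is smaller: s is biased for sigma
```

In this small case the average of $s$ falls below $\sigma$, which matches the "Biased" entry in the table.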
7. A confidence interval for a population characteristic
is an interval of plausible values for the
characteristic. It is constructed so that, with a chosen
degree of confidence, the value of the characteristic
will be captured inside the interval.
The confidence level associated with a confidence
interval estimate is the success rate of the method
used to construct the interval.
The standard error of a statistic is the estimated
standard deviation of the statistic.
The bound on error of estimation based on a statistic,
B, associated with a 95% confidence interval is
(1.96)(standard deviation of the statistic).
8. The large-sample confidence interval for $\pi$
When
(1) $p$ is the sample proportion from a random
sample, and
(2) the sample size $n$ is large ($np \ge 10$ and $n(1 - p) \ge 10$)
the general formula for a confidence interval for a
population proportion $\pi$ is
$$p \pm (z \text{ critical value}) \sqrt{\frac{p(1 - p)}{n}}$$
The desired confidence level determines the z
critical value. The three most commonly used
confidence levels, 90%, 95%, and 99%, use z
critical values 1.645, 1.96, and 2.58, respectively.
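A minimal sketch of this interval follows. The function names are hypothetical, and the z critical value is computed from the standard normal distribution rather than read from a table, so it comes out as about 2.576 for 99% where the sheet rounds to 2.58.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided z critical value, e.g. about 1.96 for confidence = 0.95."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def proportion_ci(p, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("sample too small: need np >= 10 and n(1 - p) >= 10")
    margin = z_critical(confidence) * sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Values from example (3) in item 14: p = 0.65, n = 150, 90% confidence
low, high = proportion_ci(0.65, 150, confidence=0.90)
print(round(low, 3), round(high, 3))
```

Raising an error when the sample-size condition fails is a design choice of this sketch; the sheet only states the condition.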
9. The sample size required to estimate a population
proportion $\pi$ to within an amount $B$ with 95%
confidence is
$$n = \pi(1 - \pi)\left(\frac{1.96}{B}\right)^2$$
The value of $\pi$ may be estimated using prior
information. In the absence of any such
information, using $\pi = 0.5$ in this formula gives a
conservatively large value for the required sample
size.
Question: What is the formula for the sample size
required to estimate $\pi$ to within an amount $B$ with any
confidence level?
Answer: $n = \pi(1 - \pi)\left(\frac{z \text{ critical value}}{B}\right)^2$
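The sample-size formula can be sketched as a small helper. The function name and default arguments are hypothetical; rounding up to the next whole number follows the practice used in example (4) of item 14.

```python
from math import ceil


def sample_size_for_proportion(bound, z=1.96, pi=0.5):
    """n = pi(1 - pi)(z / B)^2, rounded up to a whole number of observations.

    pi = 0.5 is the conservative default used when no prior estimate exists.
    """
    return ceil(pi * (1 - pi) * (z / bound) ** 2)


print(sample_size_for_proportion(0.05))            # conservative, 95% confidence
print(sample_size_for_proportion(0.05, pi=0.2))    # with a prior estimate of pi
print(sample_size_for_proportion(0.05, z=1.645))   # 90% confidence
```

A prior estimate of $\pi$ away from 0.5 or a lower confidence level both reduce the required sample size.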
10. The one-sample z confidence interval for $\mu$
When
(1) $\bar{x}$ is the sample mean of a random sample
(2) the population distribution is normal or the
sample size $n$ is large (generally $n \ge 30$)
(3) the population standard deviation $\sigma$ is known
the formula for a confidence interval for a
population mean $\mu$ is
$$\bar{x} \pm (z \text{ critical value}) \frac{\sigma}{\sqrt{n}}$$
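A short sketch of the one-sample z interval, assuming $\sigma$ is known. The function name and the numbers (sample mean 50, $\sigma$ = 10, n = 100) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_z_ci(x_bar, sigma, n, confidence=0.95):
    """One-sample z confidence interval for a population mean (sigma known)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # z critical value
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical summary values: the margin is about 1.96 * 10 / 10 = 1.96
low, high = mean_z_ci(50.0, 10.0, 100)
print(round(low, 2), round(high, 2))
```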
11. Let $x_1, x_2, \ldots, x_n$ be a random sample from a
normal population distribution. Then the
probability distribution of the standardized
variable
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
has the t distribution with $n - 1$ df.
12. The one-sample t confidence interval for $\mu$
When
(1) $\bar{x}$ is the sample mean of a random sample
(2) the population distribution is normal or the
sample size $n$ is large (generally $n \ge 30$)
(3) the population standard deviation $\sigma$ is
unknown
the formula for a confidence interval for a
population mean $\mu$ is
$$\bar{x} \pm (t \text{ critical value}) \frac{s}{\sqrt{n}}$$
where the t critical value is based on $n - 1$ df and
can be found in Appendix Table 3 on page 732.
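The t interval can be sketched from raw data as follows. Python's standard library has no t distribution, so this sketch takes the t critical value as an argument, to be looked up in a t table. The function name and the five data values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_t_ci(sample, t_critical):
    """One-sample t confidence interval for a population mean.

    t_critical must be looked up for n - 1 degrees of freedom.
    """
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)                  # sample standard deviation (divides by n - 1)
    margin = t_critical * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical data: n = 5, so df = 4; the 95% t critical value for 4 df is 2.776
data = [2.0, 4.0, 4.0, 6.0, 4.0]
low, high = mean_t_ci(data, 2.776)
print(round(low, 3), round(high, 3))
```

With a sample this small, the interval is only appropriate if the population distribution is approximately normal, as condition (2) requires.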
13. The sample size required to estimate a population
mean $\mu$ to within an amount $B$ with 95%
confidence is
$$n = \left(\frac{1.96\,\sigma}{B}\right)^2$$
If $\sigma$ is unknown, it may be estimated based on
previous information or, for a population that is
not too skewed, by using (range)/4.
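A sketch of this calculation, including the (range)/4 rule of thumb for $\sigma$. The function names and the numbers (values expected between 20 and 100, bound B = 5) are hypothetical.

```python
from math import ceil


def sigma_from_range(low, high):
    """Rough estimate of sigma as (range) / 4 for a not-too-skewed population."""
    return (high - low) / 4


def sample_size_for_mean(bound, sigma, z=1.96):
    """n = (z * sigma / B)^2, rounded up to a whole number of observations."""
    return ceil((z * sigma / bound) ** 2)


# Hypothetical planning values: sigma is estimated as (100 - 20) / 4 = 20
sigma_guess = sigma_from_range(20, 100)
n_required = sample_size_for_mean(5, sigma_guess)
print(sigma_guess, n_required)
```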
14. Examples
(1) (Ex. 9.10)
(a) $\bar{x}_J = \frac{1}{10}(103 + 156 + 118 + 89 + 125 + 147 + 122 + 109 + 138 + 99) = 120.6$
(b) Since the population total is $\tau = 10{,}000\mu$, an estimate of $\tau$ would be
$10{,}000\,\bar{x}_J = 10{,}000(120.6) = 1{,}206{,}000$.
The statistic used is
(the size of the population) $\times\ \bar{x}_J$
(c) An estimate of $\pi$ is
$p = 8 / 10 = 0.8$
(d) We estimate the population median usage by the
sample median: $(118 + 122) / 2 = 120$ therms
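The point estimates in example (1) can be checked with a few lines of code. The data values and the population size of 10,000 come from the solution above. The threshold of 100 therms in part (c) is an assumption: the sheet gives only the count 8 out of 10, and 100 is a cutoff that reproduces it.

```python
from statistics import mean, median

# Usage values (therms) from part (a) of the example.
usage = [103, 156, 118, 89, 125, 147, 122, 109, 138, 99]

x_bar = mean(usage)                  # (a) sample mean
total_estimate = 10_000 * x_bar      # (b) estimated population total

# (c) ASSUMPTION: "success" means usage of at least 100 therms.
p = sum(u >= 100 for u in usage) / len(usage)

sample_median = median(usage)        # (d) average of the two middle values

print(x_bar, total_estimate, p, sample_median)
```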
(2) (Ex. 9.14)
The large-sample confidence interval for $\pi$ is
$$p \pm (z \text{ critical value}) \sqrt{\frac{p(1 - p)}{n}}$$
that is, the interval from
$p - (z \text{ critical value}) \sqrt{p(1 - p)/n}$ to $p + (z \text{ critical value}) \sqrt{p(1 - p)/n}$.
The width of the interval is
$$2\,(z \text{ critical value}) \sqrt{\frac{p(1 - p)}{n}}$$
a) As the confidence level increases, the z critical value
increases, thus the width of the confidence
interval for $\pi$ increases.
b) As the sample size increases, the width of the
confidence interval for $\pi$ decreases.
c) As the value of $p$ moves farther from 0.5, closer to
either 0 or 1, the width of the confidence interval
for $\pi$ decreases.
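The three statements about the width can be checked numerically. The function name and the particular values of $p$, $n$, and the z critical value below are hypothetical.

```python
from math import sqrt


def ci_width(p, n, z):
    """Width of the large-sample interval: 2 * z * sqrt(p(1 - p)/n)."""
    return 2 * z * sqrt(p * (1 - p) / n)


base = ci_width(0.5, 100, 1.96)                # reference case, about 0.196

higher_confidence = ci_width(0.5, 100, 2.58)   # a) larger z critical value
larger_sample = ci_width(0.5, 400, 1.96)       # b) four times the sample size
p_near_one = ci_width(0.9, 100, 1.96)          # c) p far from 0.5

print(base, higher_confidence, larger_sample, p_near_one)
```

Because the width is proportional to $1/\sqrt{n}$, quadrupling the sample size halves the width.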
(3) (Ex. 9.25)
A 90% confidence interval is
$$0.65 \pm (1.645) \sqrt{\frac{0.65(1 - 0.65)}{150}} = 0.65 \pm 0.064 = (0.586, 0.714)$$
Thus, we can be 90% confident that between
58.6% and 71.4% of Utah residents favor
fluoridation. This is consistent with the statement
that a clear majority of Utah residents favor
fluoridation.
(4) (Ex. 9.27) $n = 0.5(1 - 0.5)\left(\frac{1.96}{0.05}\right)^2 = 384.16$
Rounding up, 385 packages of ground beef should be tested.
(5) (Ex. 9.32)
A 95% confidence interval for $\mu$ is (7.8, 9.4).
a) The 90% confidence interval would have been
narrower, since its z critical value would have
been smaller.
b) The statement is incorrect. The 95% refers to the
percentage of all possible intervals that include $\mu$,
not to the chance that a specific interval contains
$\mu$.
c) This statement is incorrect. While we would
expect approximately 95 of the 100 intervals
constructed to contain $\mu$, we cannot be certain
that exactly 95 out of 100 of them will. The 95%
refers to the percentage of all possible intervals
that include $\mu$.
(6) (Ex. 9.39)
Since $n = 25$, df = 24, and the t critical value is 1.71.
Then the confidence interval is
$$\bar{x} \pm (t \text{ critical value}) \frac{s}{\sqrt{n}} = 2.2 \pm 1.71 \cdot \frac{1.2}{\sqrt{25}} = 2.2 \pm 0.4104 = (1.7896, 2.6104)$$