DEFINITION - Addis Ababa University (USA)

G. W. Teklewolde Math MS
Statistics Basics
Study Note
Part 5
Statistics Basics
Transforming a z-Score to an x-Value
Recall that to transform an x-value to a z-score, you can use the formula
$$z = \frac{x - \mu}{\sigma}$$
This formula gives z in terms of x. If you solve this formula for x, you get a new formula that gives x
in terms of z.
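$$z = \frac{x - \mu}{\sigma} \quad\Longrightarrow\quad z\sigma = x - \mu \quad\Longrightarrow\quad x = \mu + z\sigma$$
The formula x = µ + zσ is the transformation of a z-score to an x-value.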
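As a small illustration, the two transformations can be written as a pair of Python functions. The function names and the population values (µ = 100, σ = 15) are assumptions chosen for this sketch and do not come from the notes.

```python
def z_from_x(x, mu, sigma):
    """Standardize a data value: z = (x - mu) / sigma."""
    return (x - mu) / sigma


def x_from_z(z, mu, sigma):
    """Invert the standardization: x = mu + z * sigma."""
    return mu + z * sigma


# Hypothetical population: mean 100, standard deviation 15.
z = z_from_x(130, mu=100, sigma=15)
x = x_from_z(z, mu=100, sigma=15)   # recovers the original x-value
print(z, x)
```

Applying one function after the other returns the starting value, which is a quick check that the two formulas are inverses.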
Sampling Distributions
In previous sections, you studied the relationship between the mean of a population and values of a
random variable. In this section, you will study the relationship between a population mean and the means
of samples taken from the population.
DEFINITION
A sampling distribution is the probability distribution of a sample statistic that is formed when samples
of size n are repeatedly taken from a population. If the sample statistic is the sample mean, then the
distribution is the sampling distribution of sample means.
Properties of Sampling Distributions of Sample Means
1. The mean of the sample means, $\mu_{\bar{x}}$, is equal to the population mean µ.
$$\mu_{\bar{x}} = \mu$$
2. The standard deviation of the sample means, $\sigma_{\bar{x}}$, is equal to the population standard deviation σ
divided by the square root of n.
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
The standard deviation of the sampling distribution of the sample means is called the standard
error of the mean.
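Both properties can be checked exactly on a tiny example by listing every possible sample. The sketch below assumes a hypothetical population of four values and samples of size 2 drawn with replacement; the values and variable names are illustrative only.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [1, 3, 5, 7]   # hypothetical population values
n = 2                       # sample size

# Every ordered sample of size n, drawn with replacement (16 samples here).
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mu = mean(population)              # population mean
sigma = pstdev(population)         # population standard deviation
mu_xbar = mean(sample_means)       # mean of the sample means
sigma_xbar = pstdev(sample_means)  # standard deviation of the sample means

print(mu, mu_xbar)                  # property 1: the two means agree
print(sigma / sqrt(n), sigma_xbar)  # property 2: standard error equals sigma / sqrt(n)
```

Because all samples are enumerated, the agreement is exact up to floating-point rounding rather than approximate.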
The Central Limit Theorem
The Central Limit Theorem forms the foundation for the inferential branch of statistics. This theorem
describes the relationship between the sampling distribution of sample means and the population that the
samples are taken from. The Central Limit Theorem is an important tool that provides the information
you’ll need to use sample statistics to make inferences about a population mean.
The Central Limit Theorem
1. If samples of size n, where n ≥ 30, are drawn from any population with a mean µ and a
standard deviation σ, then the sampling distribution of sample means approximates a
normal distribution. The greater the sample size, the better the approximation.
2. If the population itself is normally distributed, the sampling distribution of sample means is
normally distributed for any sample size n.
In either case, the sampling distribution of sample means has a mean equal to the population mean.
$$\mu_{\bar{x}} = \mu \qquad \text{(mean)}$$
The sampling distribution of sample means has a variance equal to 1/n times the variance of the
population and a standard deviation equal to the population standard deviation divided by the square root
of n.
$$\sigma_{\bar{x}}^{2} = \frac{\sigma^{2}}{n} \qquad \text{(variance)}$$
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \qquad \text{(standard deviation)}$$
The standard deviation of the sampling distribution of the sample means, $\sigma_{\bar{x}}$, is also called the standard
error of the mean.
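The first part of the theorem can be explored with a short simulation. This sketch assumes a strongly skewed population (exponential with rate 1, so µ = 1 and σ = 1), a sample size of 36, and 5000 repeated samples; all of these choices, and the fixed seed, are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 36                   # sample size (n >= 30)
num_samples = 5000       # number of samples drawn

# Exponential population with rate 1: mu = 1 and sigma = 1, right-skewed.
mu, sigma = 1.0, 1.0
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

standard_error = sigma / sqrt(n)
# Share of sample means within one standard error of mu.
within_one_se = sum(abs(m - mu) <= standard_error for m in sample_means) / num_samples

print(mean(sample_means))    # should be close to mu
print(stdev(sample_means))   # should be close to sigma / sqrt(n)
print(within_one_se)         # roughly 0.68 if the means are close to normal
```

Even though the population is far from normal, the sample means cluster around µ with a spread near σ/√n, which is the behavior the theorem describes.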
Probability and the Central Limit Theorem
Previously you saw how to find the probability that a random variable x will fall in a given interval of
population values. In a similar manner, you can find the probability that a sample mean $\bar{x}$ will fall in a
given interval of the $\bar{x}$ sampling distribution. To transform $\bar{x}$ to a z-score, you can use the formula
$$z = \frac{\text{Value} - \text{Mean}}{\text{Standard deviation}} = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
Approximating a Binomial Distribution
Previously you learned how to find binomial probabilities. For instance, if a surgical procedure has an
85% chance of success and a doctor performs the procedure on 10 patients, it is easy to find the
probability of exactly two successful surgeries.
But what if the doctor performs the surgical procedure on 150 patients and you want to find the
probability of fewer than 100 successful surgeries? To do this using the techniques described in Section
4.2, you would have to use the binomial formula 100 times and find the sum of the resulting probabilities.
This approach is not practical, of course. A better approach is to use a normal distribution to approximate
the binomial distribution.
Normal Approximation to a Binomial Distribution
If np ≥ 5 and nq ≥ 5, then the binomial random variable x is approximately normally distributed, with
mean
$$\mu = np$$
and standard deviation
$$\sigma = \sqrt{npq}$$
Correction for Continuity
The binomial distribution is discrete and can be represented by a probability histogram. To calculate exact
binomial probabilities, you can use the binomial formula for each value of x and add the results.
Geometrically, this corresponds to adding the areas of bars in the probability histogram. Remember that
each bar has a width of one unit and x is the midpoint of the interval.
When you use a continuous normal distribution to approximate a binomial probability, you need to
move 0.5 unit to the left and right of the midpoint to include all possible x-values in the interval. When
you do this, you are making a correction for continuity.
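The correction can be summarized as a mapping from a statement about the whole number c to an interval on the continuous scale. The helper below is a sketch; its name and the labels used for each kind of statement are assumptions made for illustration.

```python
import math


def continuity_interval(kind, c):
    """Return (low, high) on the continuous scale for a binomial statement about c."""
    if kind == "exactly":       # x = c
        return (c - 0.5, c + 0.5)
    if kind == "at_least":      # x >= c
        return (c - 0.5, math.inf)
    if kind == "more_than":     # x > c
        return (c + 0.5, math.inf)
    if kind == "at_most":       # x <= c
        return (-math.inf, c + 0.5)
    if kind == "fewer_than":    # x < c
        return (-math.inf, c - 0.5)
    raise ValueError(f"unknown kind: {kind}")


print(continuity_interval("exactly", 2))        # the bar centered at 2
print(continuity_interval("fewer_than", 100))   # excludes the bar centered at 100
```

In each case the endpoint moves by 0.5 so that whole bars of the histogram are either fully included or fully excluded.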
Approximating Binomial Probabilities
GUIDELINES
Using the Normal Distribution to Approximate Binomial Probabilities
1. Verify that the binomial distribution applies. In symbols: specify n, p, and q.
2. Determine if you can use the normal distribution to approximate x, the binomial variable. In symbols: is np ≥ 5? Is nq ≥ 5?
3. Find the mean µ and standard deviation σ for the distribution. In symbols: µ = np and $\sigma = \sqrt{npq}$.
4. Apply the appropriate continuity correction. Shade the corresponding area under the normal curve. In symbols: add or subtract 0.5 from endpoints.
5. Find the corresponding z-score(s). In symbols: $z = \frac{x - \mu}{\sigma}$.
6. Find the probability. Use the Standard Normal Table.
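Steps 1 to 6 can be sketched in Python for the surgery question posed earlier (n = 150, p = 0.85, fewer than 100 successes). The function names are assumptions, `statistics.NormalDist` stands in for the Standard Normal Table, and the exact binomial sum is included only for comparison.

```python
from math import comb, sqrt
from statistics import NormalDist


def normal_approx_fewer_than(c, n, p):
    """Approximate P(x < c) for a binomial x, following steps 1 to 6."""
    q = 1 - p                                # step 1: n, p, q
    if n * p < 5 or n * q < 5:               # step 2: need np >= 5 and nq >= 5
        raise ValueError("normal approximation not appropriate")
    mu = n * p                               # step 3: mean
    sigma = sqrt(n * p * q)                  #         standard deviation
    boundary = c - 0.5                       # step 4: x < c becomes x < c - 0.5
    z = (boundary - mu) / sigma              # step 5: z-score
    return NormalDist().cdf(z)               # step 6: area to the left of z


def exact_fewer_than(c, n, p):
    """Exact P(x < c): sum the binomial formula for x = 0, 1, ..., c - 1."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c))


print(normal_approx_fewer_than(100, 150, 0.85))
print(exact_fewer_than(100, 150, 0.85))
```

Both results are extremely small for this example, because 100 lies more than six standard deviations below the mean of 127.5. Far out in the tail the two values can differ noticeably in relative terms even though both are close to zero.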

Estimating Population Parameters
In this chapter, you will learn an important technique of statistical inference: using sample statistics to
estimate the value of an unknown population parameter. In this section, you will learn how to use sample
statistics to make an estimate of the population parameter µ when the sample size is at least 30 or when
the population is normally distributed and the standard deviation σ is known. To make such an inference,
begin by finding a point estimate.
DEFINITION
A point estimate is a single value estimate for a population parameter. The most unbiased point
estimate of the population mean µ is the sample mean $\bar{x}$.
DEFINITION
An interval estimate is an interval, or range of values, used to estimate a population parameter.
Although you can assume that the point estimate in Example 1 is not equal to the actual population
mean, it is probably close to it. To form an interval estimate, use the point estimate as the center of the
interval, then add and subtract a margin of error. For instance, if the margin of error is 2.1, then an
interval estimate would be given by 12.4 ± 2.1 or 10.3 < µ < 14.5. The point estimate and interval
estimate are as follows.
[Figure: point estimate and interval estimate]
Before finding an interval estimate, you should first determine how confident you need to be that your
interval estimate contains the population mean µ.
DEFINITION
The level of confidence c is the probability that the interval estimate contains the population
parameter.
The difference between the point estimate and the actual parameter value is called the sampling
error. When µ is estimated, the sampling error is the difference $\bar{x} - \mu$. In most cases, of course, µ is
unknown, and $\bar{x}$ varies from sample to sample. However, you can calculate a maximum value for the error if
you know the level of confidence and the sampling distribution.
DEFINITION
Given a level of confidence c, the margin of error (sometimes also called the maximum error of estimate
or error tolerance) E is the greatest possible distance between the point estimate and the value of the
parameter it is estimating.
$$E = z_c \sigma_{\bar{x}} = z_c \frac{\sigma}{\sqrt{n}}$$
When n ≥ 30, the sample standard deviation s can be used in place of σ.
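The margin of error can be computed directly once the critical value is known. In this sketch the critical value z_c is taken as the point that leaves an area of (1 + c)/2 to its left under the standard normal curve, found with `statistics.NormalDist` from the Python standard library; the function name and the sample numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(c, sigma, n):
    """E = z_c * sigma / sqrt(n) for a level of confidence c (for example 0.95)."""
    # Critical value: the middle area c lies between -z_c and z_c.
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    return z_c * sigma / sqrt(n)


# Hypothetical values: 95% confidence, sigma = 5.0, n = 50.
E = margin_of_error(0.95, sigma=5.0, n=50)
print(round(E, 3))
```

When σ is unknown and n ≥ 30, the sample standard deviation s can be passed as `sigma`.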
Confidence Intervals for the Population Mean
Using a point estimate and a margin of error, you can construct an interval estimate of a population
parameter such as µ. This interval estimate is called a confidence interval.
DEFINITION
A c-confidence interval for the population mean µ is
$$\bar{x} - E < \mu < \bar{x} + E$$
The probability that the confidence interval contains µ is c.
GUIDELINES
Finding a Confidence Interval (CI) for a Population Mean (n ≥ 30 or σ known with a normally distributed
population)
1. Find the sample statistics n and $\bar{x}$. In symbols: $\bar{x} = \frac{\sum x}{n}$.
2. Specify σ if known. Otherwise, if n ≥ 30, find the sample standard deviation s and use it as an estimate for σ. In symbols: $s = \sqrt{\frac{\sum (x - \bar{x})^{2}}{n - 1}}$.
3. Find the critical value $z_c$ that corresponds to the given level of confidence. Use the Standard Normal Table.
4. Find the margin of error E. In symbols: $E = z_c \frac{\sigma}{\sqrt{n}}$.
5. Find the left and right endpoints and form the confidence interval. In symbols: left endpoint $\bar{x} - E$, right endpoint $\bar{x} + E$, interval $\bar{x} - E < \mu < \bar{x} + E$.
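The five steps can be sketched as one Python function. The function name, the made-up data set, and the use of `statistics.NormalDist` in place of the Standard Normal Table are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(data, c, sigma=None):
    """Return (left, right) endpoints of a c-confidence interval for the mean."""
    n = len(data)
    x_bar = mean(data)                         # step 1: n and x-bar
    if sigma is None:                          # step 2: sigma, or s when n >= 30
        if n < 30:
            raise ValueError("need n >= 30 to use s in place of sigma")
        sigma = stdev(data)                    # sample standard deviation s
    z_c = NormalDist().inv_cdf((1 + c) / 2)    # step 3: critical value
    E = z_c * sigma / sqrt(n)                  # step 4: margin of error
    return x_bar - E, x_bar + E                # step 5: endpoints


# Hypothetical sample of 35 values, made up for this sketch.
data = [10 + (i % 7) for i in range(35)]
left, right = confidence_interval(data, 0.95)
print(round(left, 2), round(right, 2))
```

Passing a known σ through the `sigma` argument skips the n ≥ 30 check; as the guideline title states, that case also requires a normally distributed population, which the code does not verify.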
Sample Size
For the same sample statistics, as the level of confidence increases, the confidence interval widens. As the
confidence interval widens, the precision of the estimate decreases. One way to improve the precision of
an estimate without decreasing the level of confidence is to increase the sample size. But how large a
sample size is needed to guarantee a certain level of confidence for a given margin of error?
Find a Minimum Sample Size to Estimate µ
Given a c-confidence level and a margin of error E, the minimum sample size n needed to estimate the
population mean µ is
$$n = \left(\frac{z_c \sigma}{E}\right)^{2}$$
If σ is unknown, you can estimate it using s, provided you have a preliminary sample with at least 30
members.
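A short sketch of the sample-size formula follows. The function name and the numbers are hypothetical. Because n must be a whole number and the formula gives a minimum, the result is rounded up, a step this sketch assumes although the notes do not state it.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size(c, sigma, E):
    """Smallest whole n with z_c * sigma / sqrt(n) <= E."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)   # critical value for confidence c
    return ceil((z_c * sigma / E) ** 2)       # round up to the next whole number


# Hypothetical values: 95% confidence, sigma = 5.0, margin of error 1.0.
print(min_sample_size(0.95, sigma=5.0, E=1.0))
```

Halving the margin of error roughly quadruples the required sample size, since E appears squared in the denominator.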