CHAPTER 8 REVIEW 8.1 THE BINOMIAL DISTRIBUTIONS

advertisement

CHAPTER 8 REVIEW

8.1 THE BINOMIAL DISTRIBUTIONS

OVERVIEW: The binomial distribution is frequently useful in situations where there are two outcomes of interest, such as success or failure. It is often used to model real-life situations, and it finds its way into many extremely useful and important statistical applications and computations.

The binomial setting:

(1) Each observation is in one of two categories: success or failure.

(2) A fixed number, N, of observations.

(3) Observations are independent .

(Knowing the result of one observation tells you nothing about the other observations.)

(4) The probability of success is the same for each observation.

If a count, X, has a binomial distribution with number of observations, N, and probability of success, p, then

Mean(X) =

 x

= np

Standard Deviation(X) =

 x

= np ( 1

 p )

The probability that one will get exactly k successes is

N

C k p k

(1-p)

N-k

, or you can use binompdf(n, p, X)

Example:

I roll a single die 60 times. If X represents the number of times I roll a "3", then

 x

= 60(1/6) = 10.

 x

= sqrt[60(1/6)(5/6)] = 2.89.

The probability that I will roll exactly ten 3's is

60

C

10

(1/6) 10 (5/6) 50 = .1370 = 13.7%.

Or binompdf (60,1/6,10) = .1370

The probability that I will roll at least ten 3’s is binomcdf(60, 1/6, 10) = .5834

Example (Small sample size from large population. Use of binomial distribution is appropriate.):

Assume that 30% of a population is Hispanic. A random sample of size 4 is chosen from this population. If X is the number of Hispanics in the sample, then

 x

= (4)(.3) = 1.2

 x

= sqrt[(4)(.3)(.7)] = 0.9165

Prob(X=0) =

4

C

0

(.3)

0

(.7)

4

= 0.2401

Prob(X=1) =

4

C

1

(.3)

1

(.7)

3

= 0.4116

Prob(X=2) =

4

C

2

(.3)

2

(.7)

2

= 0.2646

Prob(X=3) =

4

C

3

(.3)

3

(.7)

1

= 0.0756

Prob(X=4) =

4

C

4

(.3)

4

(.7)

0

= 0.0081

The probability that a sample would contain two or fewer Hispanics is Prob(X=0) +

Prob(X=1) + Prob(X=2) = 0.2401 + 0.4116 + 0.2646 = 0.9163 = 91.63%.

The TI-83 can be very useful in calculating binomial probabilities. For instance, the probability that the sample contains exactly 2 Hispanics is binompdf( 4,.3,2) = .2646.

The probability that the sample contains 2 or fewer Hispanics is binomcdf( 4,.3,2) =

.9163.

NOTE: It is important to understand when one has a binomial setting, and when one doesn't have such a setting. For instance, consider a regular shuffled deck of 52 cards.

Setting #1: I pick a card at random and note whether or not it is a heart. I put the card back in the deck, thoroughly shuffle the deck, and then randomly pick a card. Again, I note whether or not it is a heart. I repeat this process ten times. If X is the total number of hearts I obtained in the ten trials, then this is a binomial setting. Possible values for X are

0,1,2,3,4,5,6,7,8,9,10. [N = 10, p = 0.25, and each observation is independent of the previous ones.]

Setting #2: Basically the same situation, except that I do not put the randomly picked card back in the deck before each reshuffling. After ten trials, possible values for X are

0,1,2,3,4,5,6,7,8,9,10. However, this is not a binomial setting. Each observation after the first is not independent of the previous one.

Normal Approximation for Binomial Setting:

If the number of trials, n , gets too large to deal with binomial probabilities, we can approximate the distribution using the Normal approximation. To use this, be sure that np

10 and n(1 – p)

10 before you do this.

8.2 THE GEOMETRIC DISTRIBUTIONS

OVERVIEW: The geometric setting is somewhat similar to the binomial setting, the basic difference being that the geometric setting does not have a fixed number of observations and you are looking for the first time a success occurs.

The geometric setting:

(1) Each observation is in one of two categories: success or failure.

(2) The probability of success is the same for each observation.

(3) Observations are independent. (Knowing the result of one observation tells you nothing about the other observations.)

(4) The variable of interest is the number of trials required to obtain the first success.

Example:

Question: How many times would you expect to have to roll a single die in order to get a

"6"?

Here is a simulation approach (ten trials) using randInt(1,6,15) on the TI-83.

Trial 1: 3 5 2 2 3 6 ... (6 rolls)

Trail 2: 1 3 3 2 1 4 2 1 1 6 ... (10 rolls)

Trial 3: 5 1 5 6 ... (4 rolls)

Trial 4: 4 1 4 3 5 ... (5 rolls)

Trial 5: 4 5 3 4 6 ... (5 rolls)

Trial 6: 5 5 1 3 3 2 2 1 3 6 ... (10 rolls)

Trial 7: 5 3 4 1 6 ... (5 rolls)

Trial 8: 5 2 4 2 1 6 ... (6 rolls)

Trial 9: 4 5 1 4 6 ... (5 rolls)

Trial 10: 2 5 4 2 1 2 3 4 6 (9 rolls)

The mean number of rolls for the 10 trials is 6.5.

*hint: make sure you can do a simulation of a geometric distribution, including creating a histogram!

8.2 Formulas

Mean of a geometric distribution:  x

Variance of a geometric distribution:

2 x

1 p

1

 p

2 p

Standard deviation of a geometric distribution:

 x

1

 p p

2

* note: these formulas are not on your formula sheet, but will be provided on the test day

In the die-rolling example, the probability of rolling a 3 is 1/6. Hence, the expected number of rolls before the first success is 1/(1/6) = 6.

In the present California Lottery, one chooses 6 numbers from the set 1,2,3,...,49,50,51. The

State of California then randomly selects 6 numbers from the set. If you happen to match the six numbers chosen by the State, you win millions of dollars. Your probability of matching six is

1/(

51

C

6

)= 1/18,009,460.

That is, you would expect to have to play 18,009,460 times to get your first success. If you played once a week, you would expect your first success after 346,336 weeks, or 6,660 years.

Good luck!

Geometpdf will give the probability of an event first occurring after a specific number of trials [geometpdf(p, X)]

Geometcdf will give the probability of an event first occurring within a specific number of trials [geometcdf(p, X)]

Download