Using Excel to Find Normal Probabilities


MATH-1410

C. Haugen


Refer to problem 17 on page 253 of the text. Researchers were interested in the number of hours per week adults in the United States spend on their home computers. The results of the survey showed that the number of hours was normally distributed with a mean of 7 hours and a standard deviation of 1 hour. We are asked to find the following:

a) The probability that the hours spent on the home computer by a randomly selected participant are less than 5 hours per week.
b) The probability that the hours spent on the home computer by the participant are between 5.5 and 9.5 hours per week.
c) The probability that the hours spent on the home computer by the participant are more than 10 hours per week.
d) Let’s throw one more in: what number of hours represents the third quartile?

In Excel, we can use two statistical functions to find the answers to these questions.

1. Open a new Excel Workbook, click on any open cell in the worksheet, click on the Formulas tab, click on More Functions in the Function Library portion of the Formulas ribbon, click on Statistical, and then scroll down and select NORMDIST. Excel is nice enough to tell us what we need to enter when we use this particular function.

If we hold the mouse on NORMDIST for a moment, a box should open with NORMDIST(x, mean, standard_deviation, cumulative) at the top.

We are told that we need to enter four arguments or parameters before the calculation can be carried out. The first parameter represents the upper bound of a region below the normal curve associated with the variable X.

Basically, when we calculate P( X < a ) or P( X <= a ), we are finding the area to the left of X = a below the normal curve associated with the variable X. So, for example, if we want to calculate P( X < 5 ), our first parameter would be 5. The next two parameters are the mean and standard deviation of the variable. We are usually given these values ahead of time. The fourth parameter is a logical value. By typing the word ‘true’ in this space, Excel will know to find the cumulative probability we want. FYI: if we type ‘false’ here, Excel will use what is known as the probability density function and return the height of the normal curve at X = a. That height is not a probability (for a continuous variable, P( X = a ) is 0), and it is something we do not need at this time.
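As a cross-check outside Excel, the two settings of the fourth parameter can be sketched in Python. This is only an illustration and not part of the Excel steps: it assumes Python 3.8 or later, whose standard library provides statistics.NormalDist, and the variable names are made up for this example.

```python
from statistics import NormalDist

# Hours per week: mean 7, standard deviation 1 (from the problem statement)
hours = NormalDist(mu=7, sigma=1)

# Counterpart of cumulative = true: area under the curve to the left of 5
cumulative_at_5 = hours.cdf(5)

# Counterpart of cumulative = false: height of the curve at 5
# (a density value, not a probability)
density_at_5 = hours.pdf(5)

print(round(cumulative_at_5, 5))  # 0.02275
print(round(density_at_5, 5))     # 0.05399
```

Only the first value answers a question of the form P( X < a ); the second is just a point on the bell curve.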

2. Once we have selected NORMDIST, the NORMDIST dialog box should appear on the screen.

Now we are ready to enter all the necessary parameters. For the first part of the problem, we want to find the probability that the participant spends less than 5 hours per week on their home computer. Our first parameter will be 5, the second will be 7, the third 1, and then we need to type the word true in the last field.

3. If we click the OK button, our answer will appear in the open cell we selected earlier. I went ahead and added a reminder above that cell.

4. For the second part of the question, we will have to subtract two cumulative probabilities. We want the probability the selected participant spends between 5.5 and 9.5 hours per week on their home computer. Excel does not have a function that will find P( a < X < b ). Instead, we will find P( X < b ) and then subtract P( X < a ). I used a couple more cells to help keep the output organized.

To perform the subtraction, I typed =C4-C7 in cell C10.
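The same subtraction can be sketched in Python as a sanity check on the spreadsheet. The helper name prob_between is hypothetical and chosen only for this example; NormalDist comes from Python's standard statistics module (Python 3.8 or later is assumed).

```python
from statistics import NormalDist

def prob_between(dist, a, b):
    """Return P( a < X < b ) as P( X < b ) minus P( X < a )."""
    return dist.cdf(b) - dist.cdf(a)

# Hours per week: mean 7, standard deviation 1
hours = NormalDist(mu=7, sigma=1)

p_below_9_5 = hours.cdf(9.5)   # plays the role of cell C4
p_below_5_5 = hours.cdf(5.5)   # plays the role of cell C7
p_between = prob_between(hours, 5.5, 9.5)  # plays the role of =C4-C7

print(round(p_between, 4))  # 0.927
```

Keeping the two cumulative values as separate names mirrors the extra cells used in the worksheet, which makes it easier to spot a mistake in either piece.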

5. In the third part of the question, we want the probability the participant spends more than 10 hours per week on their home computer. If we use mathematical notation, this means we want P( X > 10 ). Since the cumulative probability function only calculates probabilities less than or equal to some value, we will have to make use of the Complementation Rule from class: P(E) = 1 - P(not E). For us, that means P( X > 10 ) = 1 - P( X <= 10 ). I added a new cell to keep track of the intermediate calculation. I have also widened the cells a bit.

The final result is shown below.
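For comparison, the Complementation Rule step looks like this in a short Python sketch (standard library only, Python 3.8 or later assumed; the variable names are illustrative and not taken from the worksheet).

```python
from statistics import NormalDist

# Hours per week: mean 7, standard deviation 1
hours = NormalDist(mu=7, sigma=1)

# Intermediate calculation: P( X <= 10 )
p_at_most_10 = hours.cdf(10)

# Complementation Rule: P( X > 10 ) = 1 - P( X <= 10 )
p_above_10 = 1 - p_at_most_10

print(round(p_at_most_10, 5))  # 0.99865
print(round(p_above_10, 5))    # 0.00135
```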

6. Finally, we want to find the number of hours that represents the third quartile. Excel’s NORMINV statistical function is designed specifically for this type of problem. When we used NORMDIST, Excel was calculating P( X < a ). When we asked Excel to calculate P( X < 5 ), we were given a value of about 0.02275. This told us that approximately 2.3% of all the survey participants spend less than 5 hours a week on their home computers. The NORMINV function works in the opposite direction. Instead of entering the number of hours, we enter a cumulative percentage and then Excel will return the number of hours that corresponds to that percentage.

Since we want the number of hours that represents the third quartile, our first parameter will be .75.

The final results are shown below.
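The "opposite direction" idea can also be checked with a small Python sketch. It assumes Python 3.8 or later and uses the inv_cdf method of statistics.NormalDist as a stand-in for NORMINV; the variable names are made up for this example.

```python
from statistics import NormalDist

# Hours per week: mean 7, standard deviation 1
hours = NormalDist(mu=7, sigma=1)

# Third quartile: the number of hours with 75% of the area to its left
third_quartile = hours.inv_cdf(0.75)

# Going back the other way should recover the cumulative percentage
round_trip = hours.cdf(third_quartile)

print(round(third_quartile, 4))  # 7.6745
print(round(round_trip, 4))      # 0.75
```

The round trip shows that the cumulative function and its inverse undo each other, which is the relationship between NORMDIST and NORMINV described above.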

At this point, we can copy and paste everything to a Word document and interpret our results. NOTE: If you choose to use Excel for this project, make sure to include a table like this in your Word document.

Mean = 7
Std. Dev. = 1

P( X < 5 )                0.022750132

P( 5.5 < X < 9.5 )
  P( X < 9.5 )            0.993790335
  P( X < 5.5 )            0.066807201
  P( 5.5 < X < 9.5 )      0.926983133

P( X > 10 )
  P( X < 10 )             0.998650102
  P( X > 10 )             0.001349898

3rd Quartile              7.67448975

a) The probability that the hours spent on the home computer by a randomly selected participant are less than 5 hours per week.

There is a 2.3% chance that the randomly selected participant spends less than 5 hours per week on their home computer.

b) The probability that the hours spent on the home computer by the participant are between 5.5 and 9.5 hours per week.

There is a 92.7% chance that the randomly selected participant spends between 5.5 and 9.5 hours per week on their home computer.

c) The probability that the hours spent on the home computer by the participant are more than 10 hours per week.

There is a 0.1% chance that the randomly selected participant spends more than 10 hours per week on their home computer.

d) Let’s throw one more in: what number of hours represents the third quartile?

75% of all the participants in the survey spend no more than 7.7 hours per week on their home computer.
