Highlights of Using Excel 2000 to perform selected statistical analyses

To graphically portray a frequency, relative frequency, or percent frequency distribution that already exists in the form of a table as a histogram, bar chart, or pie chart:

Enter the distribution in two adjacent columns somewhere in your spreadsheet.
Choose Chart Wizard (the icon that shows multi-colored vertical bars).
Choose the appropriate Chart Type (e.g., the default selection of Column if you want a histogram or vertical bar chart, Bar if you want a horizontal bar chart, or Pie if you want a pie chart).
Click Next.
Under the Data Range tab: (1) enter within the Data Range box the address of the distribution, exclusive of any column labels, by either (a) clicking and dragging through the distribution or (b) typing in its address (e.g., A1:B10); and (2) select Series in Columns.
Click Next.
If you are constructing a histogram or bar chart: (1) under the Titles tab, enter within the Chart title box your desired title, enter within the Category axis box an appropriate label, and enter within the Value axis box an appropriate label; and (2) under the Legend tab, deselect Show legend.
If you are constructing a pie chart: (1) under the Titles tab, enter an appropriate title; (2) under the Legend tab, select Show legend; and (3) under the Data Labels tab, select Show value if you are dealing with a frequency distribution or Show percent if you are dealing with a percent frequency distribution.
Click Finish.
If you are constructing a histogram, remove the gaps between the columns as follows: right-click on one of the bars, choose Format Data Series…, under the Options tab reduce the Gap width to 0, and click OK.
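
For readers who want to check the numbers behind such a chart, the following is a minimal Python sketch (not part of Excel) that converts a frequency distribution into the corresponding relative and percent frequency distributions. The function name and the sample class labels are hypothetical, chosen only for illustration.

```python
def relative_and_percent(freqs):
    """Convert a {class label: frequency} table into relative and
    percent frequency tables with the same class labels."""
    total = sum(freqs.values())
    if total <= 0:
        raise ValueError("frequencies must sum to a positive number")
    # Relative frequency: each class's share of the total count.
    relative = {label: count / total for label, count in freqs.items()}
    # Percent frequency: the same share expressed out of 100.
    percent = {label: 100 * count / total for label, count in freqs.items()}
    return relative, percent


# Hypothetical frequency distribution with three classes.
example = {"Class A": 5, "Class B": 15, "Class C": 30}
relative, percent = relative_and_percent(example)
for label in example:
    print(label, example[label], round(relative[label], 3), round(percent[label], 1))
```

The percent frequencies computed this way are the values a pie chart would display when Show percent is selected for a frequency distribution.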
Note: Excel’s pie chart module under Chart Wizard is flexible enough that if you enter a frequency distribution, you can get a graph of the percent frequency distribution merely by selecting Show percent under the Data Labels tab.

To change the size of a graphical object with a border around it:

Click on the border so that you see small black squares along the border.
To make the object taller or shorter, move the cursor to the top middle or bottom middle black square so that you see a vertical double-arrow. Click and drag the border up or down as desired.
To make the object wider or narrower, move the cursor to the left middle or right middle black square so that you see a horizontal double-arrow. Click and drag the border to the right or left as desired.
To stretch or compress the object along a diagonal, move the cursor to a black square on a corner of the border so that you see a diagonal double-arrow. Click and drag the corner along the diagonal as desired.
To de-select the object, click somewhere outside its border.

To change the location of a graphical object with a border around it:

Click on the border so that you see small black squares along the border.
Move the cursor so that it rests on a blank space inside the border.
Click and drag the graphical object to the desired location.
To de-select the graphical object, click somewhere outside its border.

Using Excel’s Histogram tool to construct, from an unsummarized set of quantitative data, a histogram of the frequency distribution:

Somewhere in your spreadsheet, enter all the quantitative data in question in one row, one column, or one rectangular array.
If you want to specify, for each class interval, a maximum value belonging to that interval (Excel calls each such maximum value a bin), enter the bins you desire in ascending order in one row or one column of your spreadsheet.
Choose Tools.
Choose Data Analysis… (If Data Analysis is not listed as a choice and you are working from the KSU computer lab: (1) choose Add-Ins…; (2) wait for the hourglass to disappear; (3) choose Analysis ToolPak; (4) click OK; (5) once again wait for the hourglass to disappear; (6) choose Tools again; and (7) choose Data Analysis. If you are using your own copy of Excel, the Analysis ToolPak add-in may be available on the Excel installation disk, so insert that disk before going through the steps above.)
Choose Histogram.
Click OK.
Enter within the Input Range box the address of the quantitative data by either (a) clicking and dragging through the quantitative data you already entered into your spreadsheet or (b) typing in the address (e.g., A1:B10) of the data.
If you chose to specify the bins, enter within the Bin Range box (via a click-and-drag operation or direct address entry) the address of the bins; otherwise, leave the Bin Range box blank.
Under Output options, specify where you wish your output located. (To have the output placed on the current sheet, select Output Range and enter the address of the cell where you want the upper left-hand corner of the output to be located.)
Select Chart Output.
Click OK.
Remove the gaps between the columns as follows: right-click on one of the bars, choose Format Data Series…, under the Options tab reduce the Gap width to 0, and click OK.
Enter an appropriate histogram title as follows: right-click on the current title, choose Format Chart Title…, choose OK, type in the desired title, and press the Enter key.
Enter an appropriate label for each of the horizontal and vertical axes as follows: right-click on the current label, choose Format Axis Title…, click OK, type in the desired label, and press the Enter key.

To construct a scatter diagram of a set of (x,y) data points:

If, for example, you entered the x-values in cells A1 through A10 and the corresponding y-values in cells B1 through B10, you would: (1) select the Chart Wizard; (2) select XY (Scatter) and click Next; (3) enter A1:B10 for the Data Range, select Columns, and click Next; and (4) input the desired Chart Title and labels for the X and Y axes, and click Finish.

To calculate the correlation coefficient for a set of (x,y) data points:

If, for example, you entered the x-values in cells A1 through A10 and the corresponding y-values in cells B1 through B10, you would enter in a blank cell the formula =CORREL(A1:A10,B1:B10).

Using Excel’s Descriptive Statistics tool to compute and display in a table various summary measures for an unsummarized set of quantitative data:

Enter your data set in a single column somewhere in your spreadsheet, and put a descriptive label (e.g., Annual Income if your data comprise annual incomes) at the beginning of the column.
Choose Tools.
Choose Data Analysis… (If Data Analysis is not listed as a choice and you are working from the KSU computer lab: (1) choose Add-Ins…; (2) wait for the hourglass to disappear; (3) choose Analysis ToolPak; (4) click OK; (5) once again wait for the hourglass to disappear; (6) choose Tools again; and (7) choose Data Analysis. If you are using your own copy of Excel, the Analysis ToolPak add-in may be available on the Excel installation disk, so insert that disk before going through the steps above.)
Choose Descriptive Statistics.
Click OK.
Enter within the Input Range box (via a click-and-drag operation or direct address entry) the address of your column label together with your column of data.
Select Labels in First Row.
Under Output options, specify where you wish your output located.
(To have the output placed on the current sheet, select Output Range and enter the address of the cell where you want the upper left-hand corner of the output to be located.)
Select Summary statistics.
Click OK.
Note: You can have multiple data sets summarized by putting the data sets in adjacent columns and expanding the Input Range accordingly.

Using Excel statistical functions to compute individual summary measures for an unsummarized quantitative data set:

Enter your data in a single row, a single column, or a rectangular array somewhere in your spreadsheet. The following table assumes your data resides, for example, in A1:C10, and shows what you should type in a blank cell to get each of several summary measures computed.

Summary measure                      What you type (assuming data resides in A1:C10)
mean                                 =AVERAGE(A1:C10)
median                               =MEDIAN(A1:C10)
mode                                 =MODE(A1:C10)
range                                =MAX(A1:C10)-MIN(A1:C10)
variance (population)                =VARP(A1:C10)
variance (sample)                    =VAR(A1:C10)
standard deviation (population)      =STDEVP(A1:C10)
standard deviation (sample)          =STDEV(A1:C10)

To determine the probability of a normally distributed random variable attaining particular values:

Example: Assume X has a normal distribution with mean 100 and standard deviation 10.
To determine P(X < 125) or P(X ≤ 125), enter in a cell: =NORMDIST(125,100,10,true)
To determine P(X > 125) or P(X ≥ 125), enter in a cell: =1 - NORMDIST(125,100,10,true)
To determine P(125 < X < 130) or P(125 ≤ X < 130) or P(125 < X ≤ 130) or P(125 ≤ X ≤ 130), enter in a cell: =NORMDIST(130,100,10,true) - NORMDIST(125,100,10,true)

To determine the probability of a binomial random variable attaining particular values:

If, for example, you want to know the probability of exactly 4 successes in 10 trials, where the probability of a success on each individual trial is .30, enter in one cell =BINOMDIST(4,10,.30,false)
If, for example, you want to know the probability of 4 or fewer successes in 10 trials, where the probability of a success on each individual trial is .30, enter in one cell =BINOMDIST(4,10,.30,true)
If you want Excel to construct the entire probability distribution for the binomial random variable X where n = 10 and p = .30, you could enter 0,1,2,…,10 in cells A2 through A12, enter =BINOMDIST(A2,10,.30,false) in cell B2, and then copy the formula in cell B2 to cells B3 through B12.

Using Excel’s Regression tool to assist in performing a simple linear regression analysis based on a sample of (x,y) data points:

Enter the x-values in one column, and at the top of the column put an appropriate name (label) for X.
Enter the y-values in another column, and at the top of the column put an appropriate name (label) for Y.
Choose Tools.
Choose Data Analysis… (If Data Analysis is not listed as a choice and you are working from the KSU computer lab: (1) choose Add-Ins…; (2) wait for the hourglass to disappear; (3) choose Analysis ToolPak; (4) click OK; (5) once again wait for the hourglass to disappear; (6) choose Tools again; and (7) choose Data Analysis. If you are using your own copy of Excel, the Analysis ToolPak add-in may be available on the Excel installation disk, so insert that disk before going through the steps above.)
Choose Regression.
Click OK.
Enter within the Input Y Range box (via a click-and-drag operation or direct address entry) the address of your column of y-values, including the label.
Enter within the Input X Range box (via a click-and-drag operation or direct address entry) the address of your column of x-values, including the label.
Select Labels.
Select Confidence Level.
Under Output options, specify where you wish your output located. (To have the output placed on the current sheet, select Output Range and enter the address of the cell where you want the upper left-hand corner of the output to be located.)
Select Residuals.
Select Residual Plots.
Click OK.
Change the vertical scale on the residual plot as follows: right-click on a number marked on the vertical axis; choose Format Axis…; choose the Scale tab; for the Minimum, specify the value of -3se (note: se, the standard error of the estimate, is called Standard Error in the Summary Output table that Excel provided when you clicked OK above); for the Maximum, specify the value of 3se; for each of the Major unit and Minor unit, specify the value of se; click OK.
Get horizontal gridlines to show up on the residual plot as follows: right-click on a blank area of the residual plot; choose Chart Options…; choose the Gridlines tab; under Value (Y) axis, select Major gridlines; click OK.
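
To see where the residuals and the standard error of the estimate come from, the following minimal Python sketch fits a least-squares line by hand. It assumes the usual textbook definition se = sqrt(SSE / (n - 2)); the function name and sample data are hypothetical, and the sketch is not a description of how Excel computes its output internally.

```python
from math import sqrt


def simple_linear_regression(xs, ys):
    """Fit y = intercept + slope * x by least squares.

    Returns (slope, intercept, residuals, se), where se is the standard
    error of the estimate, assumed here to be sqrt(SSE / (n - 2)).
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x-values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = observed y minus the y predicted by the fitted line.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sse = sum(r * r for r in residuals)
    se = sqrt(sse / (n - 2))
    return slope, intercept, residuals, se


# Hypothetical sample of (x, y) data points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept, residuals, se = simple_linear_regression(xs, ys)
# Vertical-axis limits for the residual plot, as in the steps above.
axis_minimum, axis_maximum = -3 * se, 3 * se
print(round(slope, 4), round(intercept, 4), round(se, 4))
```

With the vertical axis running from -3se to 3se and gridlines spaced se apart, each gridline on the residual plot marks a whole number of standard errors from the fitted line.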