CHAPTER 8: AN INTRODUCTION TO MICROSOFT EXCEL 2000 This tutorial will introduce you to data analysis and graphing in Microsoft Excel. You should work through this tutorial with Excel open on a computer so that you can work the examples as you move through the tutorial. How to enter data in Excel When you open Excel you should see a blank worksheet, which looks like this: Illustration 1. Blank excel worksheet. The worksheet consists of rows and columns of worksheet spaces, or "cells". Each row is numbered and each column has a letter. Every cell in the worksheet can therefore be referred to by a letter and number. For example, the top left hand cell is called A1, the cell below it is A2, etc. You can type either numbers or words in each cell. Imagine that over one week in July we had collected data on the maximum daily temperature in three locations (inside Woods Lab, on the lawn in front of the library and in Abbo's Alley). To enter this data into Excel, point the cursor on the cell, then click to select the cell. You can now type into the cell. In cell A1 type "Day". In cells B1, C1 and D1 type "Woods Lab", "Library Lawn" and "Abbo's Alley". Your worksheet should look like this: Illustration 2. Sample worksheet showing category titles for a new table. Now you can enter the data into each column. Type the following numbers: Illustration 3. Sample data table. You can now save your work by selecting “Save” or “Save as” from the “File” menu. You can change the appearance of your worksheet by adding lines and by putting words or numbers in bold or italics. For example, click on cell A1 and hold the mouse button down. Now move the mouse over to D1 and release the mouse button. Cells A1, B1, C1 and D1 should all be selected. You can now click on the "B" button on the Formatting toolbar to make them bold (or the "I" button for italics, or the "U" button for underline): Illustration 4. Using the 'Bold' button on the formatting toolbar. 
You will notice that by making your column headings bold you have made the words slightly wider so that they no longer fit into their columns. You can correct this by positioning the cursor so that it falls on the dividing line between the columns. The cursor should change from a fat cross to a skinny cross with two arrows. You can now hold down the mouse button and move the edge of the column. Alternatively, you can select your whole worksheet, then go to the "Format" menu and choose "Column", then "Auto Fit". This will automatically make each column just as wide as its widest cell. To add a line underneath the column headings, select the four heading cells (as you did to make them bold). On the Formatting toolbar select the Borders tool which looks like this: Illustration 5. Using the 'border' button to apply borders to a table. You can now add a line or a box around your text. Here is my data table with bold column headings, AutoFitted widths and a double line under the headings: Illustration 6. Sample data table showing data and all formats applied so far (bold column headings, autofitted widths and a double line under headings). How to do simple calculations with data in Excel In this section you will learn how to calculate averages and do simple mathematical manipulations of your data. Calculating averages: Imagine that we wanted to compare the mean temperature for the three locations. We will put this information in cells B10, C10 and D10. Click on B10 then pull down the "Insert" menu and choose "Function…". The following box should pop up: Illustration 7. 'Paste function' box showing some function categories and functions. To calculate the mean click on "AVERAGE" in the right hand box, then click "OK". [Note: to calculate the median, click on MEDIAN, to calculate the mode, click on MODE]. A new box will pop up (if the box is blocking your table you can click on the top of the box, hold the mouse button down and drag the box to one side). 
In this new box you will tell the computer which cells you wish to have averaged. The temperature data for Woods Lab are in B2 through B8, so type "B2:B8", then choose "OK". Illustration 8. Function box for calculating the average of a range of values. If everything went well, you should have "20.142857" in cell B10. This is the mean of the seven temperatures for Woods Lab. Excel makes it very easy to calculate the means for our other columns. Click on cell B10 again, then position the cursor so that it is just over the bottom right hand corner of the cell. The cursor will change from a fat cross to a skinny cross. Now hold the mouse button down and drag it across C10 and D10, then let go. You should see two new numbers in C10 and D10 - these are the averages of the other two columns. By using the skinny cross you have pasted the function "average" from cell B10 into cells C10 and D10. You can check that Excel did this correctly by clicking on C10. Above the worksheet you will see a little window that should contain the correct formula [=AVERAGE(C2:C8)]. The mean gives us a measure of the "center" of the data. We also want to summarize the variability in our data. One measure of variability is the standard deviation. Variable datasets have large standard deviations; uniform datasets have small standard deviations. To calculate the standard deviation in Excel follow the same directions as for the calculation of the means: Click on cell B11, choose "Function" from the Insert menu and ask Excel to paste "STDEV", then type "B2:B8" to tell Excel where to find the data. You can use the skinny cross again to paste your formula into the adjacent cells. If all went well you should have the following values in your table (note that I have added labels and a line to make my worksheet clear): Illustration 9. Output table showing values for calculations of mean and standard deviations for temperature classes. 
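If you would like to check Excel's arithmetic independently, the same summary statistics can be sketched in a few lines of Python. This is only an illustration: the seven temperatures below are made-up values, not the data from the tutorial's table, and the variable names are hypothetical. Python's statistics.stdev uses the sample formula (dividing by n - 1), which is the same convention as Excel's STDEV.

```python
import statistics

# Hypothetical temperatures standing in for cells B2:B8
# (these are NOT the values from the tutorial's data table).
woods_lab = [18, 20, 22, 21, 19, 20, 20]

mean_temp = statistics.mean(woods_lab)      # comparable to =AVERAGE(B2:B8)
median_temp = statistics.median(woods_lab)  # comparable to =MEDIAN(B2:B8)
sd_temp = statistics.stdev(woods_lab)       # comparable to =STDEV(B2:B8), sample SD (n - 1)

# Round for display only, as you would with Format > Cells in Excel.
print("mean:", round(mean_temp, 1))
print("median:", median_temp)
print("standard deviation:", round(sd_temp, 1))
```

Replacing the list with the values you typed into column B should reproduce the numbers Excel shows in B10 and B11.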
The means and standard deviations that Excel has given us go to eight decimal places, but our data were not measured this precisely! To change the number of decimal places, select B10, C10, D10, B11, C11 and D11 (to do this click on B10, hold the mouse button down and move over to D11) then choose "Cells" from the "Format" menu. The following box will pop up: Illustration 10. 'Format cells' option box. Click on "Number" in the left hand box, then use the scroll arrows to tell Excel to use just one decimal place, then click "OK": Illustration 11. Applying format of 1 decimal place to selected cells. Your worksheet should now look like this: Illustration 12. Sample data table showing all formatting and calculations so far. To print this table, select the area you wish to print (cells A1 through D11), then go to the File menu, choose "Print Area" and "Set print area". You can now print what you have in the selected region. Now you will use Excel to create a new column using a simple mathematical manipulation of the other columns. Suppose that we wanted to know what the temperature difference was between the inside of Woods Lab and the Library lawn. Click on cell F2, type "=C2-B2", then press Return (or Enter). Click on F2 again and move the cursor to the bottom right corner of the cell until you see the skinny cross. Now click on this corner and pull down until you reach cell F8. When you let go of the mouse Excel will have calculated the difference between column C and column B for each row. Again, you should add a column heading to make your table clear: Illustration 13. Data table with heading for new column. Using the Fill function Excel can do many other types of mathematical manipulations of your data. Here is one more example: Suppose that we were studying some fish in a lake. The initial size of the fish population is 15 and the fish population doubles each year. We will use Excel to calculate the size of the population after 10 years. 
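The doubling rule just described can also be sketched outside Excel, which gives you a number to compare against your worksheet. The Python below is a minimal illustration, assuming the initial 15 fish count as year 1 and that each later year is exactly twice the year before; the variable names are hypothetical.

```python
# Doubling rule: each year's population is twice the previous year's.
initial_population = 15  # size in year 1 (assumption: the initial count is year 1)
years = 10

population = [initial_population]
for _ in range(years - 1):
    # Same idea as filling a "previous cell times 2" formula down a column.
    population.append(population[-1] * 2)

# Print a small Year / Population size table.
for year, size in enumerate(population, start=1):
    print(year, size)
```

The last row printed is year 10 with 7680 fish, that is, 15 doubled nine times.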
In cell A13 type "Year", in cell B13 type "Population size". Now type "1" in cell A14 and "15" in cell B14. You are now ready to make Excel do some work for you -- type "=A14+1" in cell A15, then press return. The number "2" should appear in cell A15. Now position the cursor over the bottom right corner of the cell until you see the skinny cross. Now click on this corner and pull down until you reach cell A23. When you let go of the mouse Excel will have produced a series of years from 1 to 10 by adding one to the cell above every cell in the column. Now click on cell B15 and type "=B14*2", then press return. B15 should now have "30". Click on B15 and position the cursor over the bottom right corner of the cell until you see the skinny cross. Now click on this corner and pull down until you reach cell B23. What is the population size of fish in year 10? You should have 7680. How to draw graphs in Excel Excel allows you to draw many different types of graph. Unfortunately, Excel is not smart enough to know which graph is the most appropriate for the type of data you have, so beware! Think about what kind of graph you draw before you start clicking away in Excel. You will now make a line graph of the temperature in the three locations throughout the week. First, select cells A1 through D8. This will highlight all the data and the column headings. Now go to the top of the screen and click on the "Chart Wizard": Illustration 14. 'Chart wizard' button on 'Standard' toolbar. A new box will pop up and give you some options. You will choose the XY scatter option on the left and the "connected data points" option on the right. Note: even though you are drawing a linegraph do NOT use the "Line" option! Illustration 15. 'Chart type' selection box in 'Chart wizard' function of Excel. Click on "Next". The next box allows you to tell Excel where to find the data for the graphs. We have already selected our columns, so you can just click "Next" again. 
In the next box you will add axis labels, graph titles, legends, etc. Normally each axis on a graph should include a label with units. Type "Temperature (Degrees Centigrade)" for the y axis and "Day" for the x axis. Graph legends should explain exactly what the graph is showing and should include a figure number: "Figure 1. Graph of the maximum daily temperature during one week in July for three locations in Sewanee" would be appropriate for this graph. Illustration 16. 'Chart wizard' window showing options for chart formatting. Before clicking "Next", click on some of the other options to see how you can modify your graph (e.g., remove gridlines, change the location of the legend). Now press "Next" to go on to the last box. This box allows you to tell Excel where to put your new graph. You can choose "As an object" to have the graph appear alongside your data table, or "As a new sheet" to add the graph as another "sheet" in Excel. Using "sheets" allows you to keep your file tidy. Illustration 17. 'Chart wizard' chart location window. You can switch between sheets using the tabs at the bottom of the screen (click on the tabs to move between sheets): Illustration 18. 'Page tabs' allowing you to move between pages of your 'work book'. You should now have a nice line graph of your data. If you want to change your graph just double-click on the part that you wish to change. For example, double-click on the y axis label to change the label text or font, double-click on the numbers on the y axis to change the range, double-click on the points to change their size or the thickness of the line. Now you will draw a scatter plot of the temperature in Abbo's Alley plotted against the temperature outside the library. To do this, first select cells C1 through D8: Illustration 19. Selecting cells to make a scatter plot. 
Now click on the chart wizard and select XY Scatter with no line: Illustration 20. Selecting xy (scatter) in chart wizard. The rest of the process is very similar to our first graph. You should end up with a graph similar to the following: Figure 1. Graph of the temperature at Abbo’s Alley plotted against the temperature at the library lawn for seven days in July. Figure 1. Example of what your scatter graph should look like. The last type of graph that we will explore in this tutorial is the bar graph. We will plot a graph of the average temperature at the three locations. First select the cells that contain the average temperatures: Illustration 21. Selecting cells to make a bar graph. Now click on the chart wizard and select "Column": Illustration 22. Selecting 'column' chart type and first subtype in 'chart wizard'. In the next box leave the "Data range" as it appears (you selected your data, so Excel knows where to find it) and click on "Series": Illustration 23. Selecting 'series' tab in chart wizard. You will now tell Excel where to get the labels for the x axis. Click in the X axis labels box: Illustration 24. Selecting location of axis labels. Now click on cell B1 in your worksheet and hold the mouse down. Move the mouse over to cell D1. This will give Excel the labels for the x axis. Don't panic if part of your graph wizard disappears for a moment! You should see the following: Illustration 25. Location and range values for cells selected for axis labels. Click "Next" to go on and add a title and axis labels to your graph. It should look like this when you are finished: Figure 2. Graph of mean temperatures during one week in July at three locations in Sewanee. Figure 2. Example of properly formatted bar graph. You have one more step: adding symbols to your graph to indicate the standard deviations. You cannot do this in the chart wizard, only afterwards. 
Note that you can use the same sequence of clicks to put standard deviations on other graphs such as line graphs. Click on your graph so that a dot appears in each column (if this doesn't happen, try clicking somewhere else on your sheet or graph, then clicking on the columns again): Illustration 26. View of bar graph with columns selected. Now double-click one of the columns. You will get a box with several options; choose the "Y error bars" option: Illustration 27. Options available for column formatting. Now click on the "Both" box and the "Custom" box, then click in the box to the right of the "+" sign, then click on cell B11 in your worksheet and hold the mouse down. Move the mouse over to cell D11. This will give Excel the values for the standard deviations. Don't panic if part of the options box disappears for a moment! Repeat for the "-" box, then recheck to make sure that "Both" is selected at the top of the options box. You should see the following: Illustration 28. Error bar formatting window. Your completed graph should look like this: [Bar graph titled "Figure 2. Graph of means and standard errors for temperatures during one week in July at three locations in Sewanee"; x axis "Location" with columns for Woods Lab, Library Lawn and Abbo's Alley; y axis running from 0.0 to 35.0.] Figure 3. Completed bar graph with proper formatting, labeling and error bars. ASSIGNMENT: The data for this assignment are found on the Angelnet file server in an Excel 2000 workbook called "Sample data for Intro Bio". You should copy this file onto a disk and work through the following problems. To navigate to this file, choose the appropriate directions below. Mac-- Open the chooser from the Apple menu on the computer. You should see something like this: Illustration 29. View of the 'chooser' window where the AngelNet file server can be accessed. Click on AppleShare in the top left box, then Sewanee Net Servers in the bottom left box, then 00-AngelNetFileServer in the top right, then click OK. 
A box will pop up - click on the "Guest" button. The following will come up: Illustration 30. Menu of items found on the AngelNet file server. Click on "Acad_Classes" then click OK. The following icon should appear on the desktop of your computer: Illustration 31. 'Academic classes' icon that will appear on your computer's desktop. Double-click on this, then open the following folders: Sciences Math/Biology/Intro Bio/Stats. The file "Sample data for Intro Bio" is in this folder. To make a copy of this file, drag it into a folder on a disk. Now disconnect from the Academic Classes on Angelnet fileserver by dragging its icon into the trash. PC-- Open the ‘Start’ menu at the bottom left corner of your desktop with a single click, then choose ‘Run’ from the pop-up menu. In the ‘Run’ dialog box type \\angelnet_server\Acad_Classes then press ‘OK’. A window titled ‘Acad_classes on angelnet_server’ will appear on your desktop with several folders inside. Navigate through the following folders: Sciences Math/Biology/Intro Bio/Stats. The file "Sample data for Intro Bio" is in this folder. To make a copy of this file, drag it into a folder on a disk. Now disconnect from the Academic Classes on Angelnet fileserver by clicking on the ‘X’ in the upper right corner of the window or by choosing ‘Close’ from the ‘File’ pulldown menu. Open the file ‘Sample data for Intro Bio’ in Excel and answer the following questions on a separate sheet of paper: 1. The “Farms” column has data about the size of dairy cow herds on farms in Tennessee. Each number represents the number of cows on a farm. There are 30 farms in all. i. What are the mean and median numbers of cows on each farm? ii. Why do the mean and median differ? iii. Would you report the median, mode or mean, if you wanted half the farms in the sample to be above the value and half below? 2. 
The "Volume" and "Weight" columns contain data about the volume and weight of 40 hickory nuts that were collected in the fall of 1998 in a cove forest in Sewanee. Draw an appropriate graph to show whether volume and weight of these hickory nuts are correlated with each other. Make sure you fully label your graph, then print it out. 3. The next table shows data about performance scores for five pigeons. These birds were trained to perform a complex series of pecks with their beaks in order to receive a reward. Their "score" summarizes their performance in this task. Scores are shown for each pigeon on each of ten days. There are two blank columns: mean and standard deviation. You should use Excel to calculate these means and standard deviations, then plot a suitable graph to show how the scores changed through time. Include the standard deviations on your graph, then print out this graph. 4. The "Year" and "Human Population" table shows the estimated human population over the last 1000 years. Plot a suitable graph to show these data. How long does it take for the human population to double? If past trends continue, in what year will the population hit 12 billion?