Introduction to Excel

These very sketchy notes are meant to help with the terminology I use when talking about spreadsheets. They will give us a common vocabulary to use in our discussion. They also serve as a place to write your own notes. We will go over most of the procedures discussed here in class.

What is a spreadsheet?

A (computer) spreadsheet is a piece of software that allows you to organize and analyze information. For me, a spreadsheet serves three useful purposes. It acts as (1) a database, (2) a computational tool, and (3) graphing software.

How is a spreadsheet organized?

Recent versions of Excel refer to documents as workbooks. A workbook consists of a number of worksheets. A worksheet consists of columns and rows. The columns are identified by letters, and the rows are identified by numbers. The intersection of a row and column is called a cell. The spreadsheet refers to a cell by its column letter and row number. The cell in the top left-hand corner is cell A1. Last time I checked, there were 16,384 rows and 256 columns available on one worksheet in Microsoft Excel (the limit depends on the version: Excel 97 through 2003 allow 65,536 rows by 256 columns, and Excel 2007 and later allow 1,048,576 rows by 16,384 columns).

To move between cells we can use the mouse (click on the cell you want to move to), the arrow keys (they move the cursor one cell at a time in the direction of the arrow), and the tab key (TAB moves right, SHIFT-TAB moves left). After entering data or a function into a cell, you must hit ENTER for the entry to take effect. Depending on your configuration of Excel, the ENTER key will also move the cursor down one cell (and SHIFT-ENTER will move you up).

How can a spreadsheet help me?

As I mentioned, a spreadsheet can help as a database, a computational tool, or as a graphing device. It is simple to use, and it is very powerful. It is also a general-purpose program. This means that it can do a lot, but it is not as sophisticated as many special-purpose programs.
For example, a spreadsheet can be used to perform numerous statistical analyses, but it is not nearly as powerful as a special-purpose statistical package.

Using the Spreadsheet as a Computational Tool

We can use the spreadsheet to make many kinds of computations. A spreadsheet can do simple arithmetic computations and matrix manipulations, as well as engineering, financial and statistical computations. It can also be used to do optimization, simulation, and statistical analyses. Using the spreadsheet’s macro capabilities, you can get it to do just about anything you want. Most of the time we make computations in Excel by entering a function into a cell. In some cases we use some of Excel’s built-in tools to do more complex computations.

Using Functions

To enter a function in Excel we begin by typing =. To do simple arithmetic, we can enter functions into cells using symbols close to the ones we learned in beginning math classes.

Computation        Symbol   Example   Result
Add                +        =5+2      7
Subtract           -        =10-3     7
Multiply           *        =4*5      20
Divide             /        =15/3     5
Raise to a power   ^        =2^4      16

Most computations are accomplished by the use of functions. There are too many functions to list them all, but some examples are listed below. We will discuss many others as the course progresses. A function consists of the function name and the arguments. Almost always one of the arguments is a list of numbers or the cell reference(s) to a set of numbers.

Computation            Excel Function
Add                    =sum(number1, number2, …)
Average                =average(number1, number2, …)
Absolute Value         =abs(number)
Future Value           =fv(rate,nper,pmt,pv,type)
Binomial Probability   =binomdist(X,n,p,cumulative)

You can get a list of functions quickly in Excel by clicking on the toolbar button that looks like a script fx. It is called the Paste Function button. It is a very useful tool for building functions that you have not used before. Once you select the button the following dialog box will be presented.
From the column on the left, select the category that you are interested in. If you are not sure what category to use, you can select “All.” Next you will be presented with a list of the functions in that category in the “Function name” column on the right. The functions are listed alphabetically. When you click on a function, it is described at the bottom of the box. When you click OK, you are presented with a function-specific dialog box to help you enter the appropriate arguments.

Suppose that you were interested in creating a frequency distribution. This is a statistical function, so if you select the Statistical category, the function FREQUENCY appears on the right-hand side. Once you select it, you will obtain the following dialog box. When you click in the second argument box, a description of that argument is presented. Once you become more familiar with the functions, you will not have to go through all of these steps. You will be able to type in the functions directly.

Using Cell References

The power of a spreadsheet (for me) lies in its ability to do what-if analysis and to do repetitious computations. To take advantage of this power in Excel, we use cell references rather than numbers in functions. For example, suppose that we had the numbers 10 and 3 in cells A1 and A2. If we wanted to add the numbers, we could use =10+3 or we could use =A1+A2. By using the second entry (the one that uses the cell references), we have much more flexibility for doing further analysis. For example, if we decide that we did not want 10+3, but instead wanted 10+5, we don’t have to change the function. Instead, we just change the entry in cell A2 and the function automatically updates.

Using cell references also lets us repeat computations with very little effort. For example, suppose that we had 24 entries in each of 30 columns and we want to know the average of the 24 numbers in each column.
By using cell references, we can enter the function for the first column, and then copy it to the other 29 columns to repeat the computation for their entries.

For the above example, we need to refer to more than one cell in the function. In Excel we can refer to a contiguous set of cells by putting a colon between the references. For example, A1:A10 refers to cells A1 through A10 inclusive. For some functions it is possible to refer to multiple noncontiguous cells by using commas to separate the references. The reason it doesn’t work for every function is that commas are also used to separate function arguments.

Relative vs. Absolute Cell Referencing (IMPORTANT!)

When copying functions, we need to understand the concept of absolute and relative references. An absolute reference (identified by placing a $ symbol in front of the column letter or row number) means that no matter where you copy the function to, that part of the reference will not change. A relative reference, however, will change. For example, if in cell A10 we had the function =sum(A1:A9), and we copy the function to cell C11, the function will change to be =sum(C2:C10). The reason for the change is that in moving from A10 to C11, we moved over 2 columns and down one row. Since the reference is relative, it causes everything to shift by 2 columns and one row. If the original function had been =sum($A1:A$9) (so now column A in the first part and row 9 in the second part are absolute), then the copied function in cell C11 would be =sum($A2:C$9). In this case only the row changed in the first part (because column A is absolute, it does not change), and only the column changed in the second part (because row 9 is absolute).

You can use the function key F4 to quickly make a cell reference absolute. By repeatedly hitting the F4 key you can toggle through all possible combinations of absolute and relative references.
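
The copying rule above can be sketched in a few lines of Python. This is only an illustration of the rule, not how Excel implements it. The helper names (copy_formula, col_to_num, num_to_col) are made up for this example, and the pattern is deliberately simple: it assumes upper-case references and does not handle sheet names, function names that end in a digit, or references that would shift off the sheet.

```python
import re

# Matches one cell reference such as A1, $A1, A$9 or $A$9 (upper-case columns only).
REF = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")


def col_to_num(col):
    """Convert a column letter to a number: A -> 1, Z -> 26, AA -> 27."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def num_to_col(n):
    """Convert a column number back to letters: 1 -> A, 27 -> AA."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters


def copy_formula(formula, d_cols, d_rows):
    """Shift every relative column/row by the copy offset; parts marked with $ stay put."""
    def shift(match):
        col_abs, col, row_abs, row = match.groups()
        if not col_abs:  # relative column: moves with the copy
            col = num_to_col(col_to_num(col) + d_cols)
        if not row_abs:  # relative row: moves with the copy
            row = str(int(row) + d_rows)
        return f"{col_abs}{col}{row_abs}{row}"
    return REF.sub(shift, formula)


# Copying from A10 to C11 is a move of 2 columns over and 1 row down.
print(copy_formula("=sum(A1:A9)", 2, 1))    # =sum(C2:C10)
print(copy_formula("=sum($A1:A$9)", 2, 1))  # =sum($A2:C$9)
```

The two printed results match the two examples worked out in the text.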
Array Functions

In some cases the result that we are looking for is a matrix or vector. In those cases, we use what Excel calls an array function. Entering an array function differs from entering other functions in two ways. First, before entering the function, you need to highlight the cells where the entire answer will go. For example, if we were multiplying a 3x1 vector (in cells A1:A3) by a 1x5 transposed vector (in cells A5:E5), the answer will be a 3x5 matrix. We must tell Excel that the answer will go in 3 rows and 5 columns by highlighting the appropriate 15 cells. For example, we could highlight the cells A7:E9. Once the cells are highlighted, you can enter the function name and arguments in the upper left-hand cell (A7 in the example). Be sure not to click on the upper left-hand cell before entering the function, or your highlighting will be nullified. For the matrix multiplication example, we would type in the function =MMULT(A1:A3, A5:E5).

The second difference with array functions is that we don’t use the ENTER key alone to enter the function. Instead, we must hold down both the CTRL and SHIFT keys, and while holding them down hit the ENTER key. When Excel is done, it puts curly brackets around the function to identify it as an array function. In our example the cells A7:E9 would all have {=MMULT(A1:A3,A5:E5)} in them.

Excel will not allow you to change any part of an array. In the example above, if you tried to change the function in cell A8, an error message would appear. You have to change the entire array to change any part of it.

Using Built-in Computational Routines

There are also some menu items that can be used to do numerical computations. They are almost all under the TOOLS menu. For example, you can do optimization. In Excel the tool is referred to as the SOLVER. You can also do a number of statistical analyses (which will be the main thing we will be doing in our class).
Once you select these tools, you will be presented with a dialog box. We will spend a lot of time on these later in the class, so I have not provided any examples at this point.

Formatting

There are a number of formatting options available to help make the spreadsheet look nice. In Excel, there are several ways to change the formatting. The most general way is to select the menu item FORMAT and then CELLS. With this menu item you can change the font, the numeric format, the borders of the cells, the colors and the alignment. The spreadsheet also provides numerous toolbar buttons that can be used to change the formatting.

It is also possible to rename the sheets within the workbook. Double-click the current name of the sheet (on the small tabs at the bottom of the workbook), and you will then be able to change the name.

Naming Cells

Rather than using cell locations all of the time, it is often helpful to give a cell or a range of cells a name. The easiest way I have found to do this is to use the top row of entered data as labels. Then highlight the data, including the header row. In Excel, select INSERT/NAME/CREATE. Then click in the box next to TOP ROW. You can also use INSERT/NAME/DEFINE for a more general way to name the cells. Finally, near the upper left-hand corner of the spreadsheet (directly above the row headings) there is a box showing the current cell location. You can click in that box and change the cell location to a name (be sure to hit ENTER when you are done). If you want to give a name to a range of cells, highlight the cells first and then change the name in the box.

There are rules about eligible names in Excel. For example, most special characters (e.g., %, &) are not allowed. One character that is allowed is the underscore character. Similarly, you cannot put a space in cell names. A notable ineligible name is one that begins with one or two letters and ends with a number (e.g., AB10).
These names are reserved in Excel because they identify cell references. I am sure there is a limit on the length of a name, but I don’t know what the limit is.

Once you have named the cells, you can use the names rather than cell addresses in functions. For example, if I give the range of cells A1 to A10 the name income, I can find the total income by entering the function =SUM(income) or =SUM(A1:A10). Either one will give the same result.

Graphing

The spreadsheet can be used to create many types of graphs. The easiest way to create a graph is to click on the ChartWizard button (the button that looks like a small bar chart). You can highlight the data you want to graph either before or after you click on the ChartWizard button. Excel will then guide you in the construction of the graph.

Editing the Graph

If you placed the graph on its own separate sheet, then you should be able to edit it directly. If the graph was placed on a worksheet, you will need to double-click the graph before editing it. In Excel, you can do most of the editing that you need by double-clicking on the item you want to change. You can also use the INSERT and FORMAT menus, as well as many of the toolbar buttons (for text formatting).

Database Capability

The spreadsheet can also be used as database software. To use the database tools, data should be entered in columns, and each column should be given a heading. Excel automatically treats a list set up this way as a database. Each row is treated as a record, and each column as a field. Once you have entered the header row, if you prefer you can select DATA/FORM. You will be presented with a form that can then be used to enter data.

The other menu items that are quite useful are SORT and FILTER. SORT allows you to sort the data based on any of the entered fields, in either descending or ascending order. FILTER allows you to view and work with records that fit certain criteria. The easiest filter is the AUTOFILTER.
The ADVANCED FILTER lets you input more complicated search criteria. When manipulating data that has been filtered, special functions (database functions) should be used.

When working with databases (and any fairly large spreadsheet), using the WINDOW/FREEZE PANES menu item is quite helpful. It allows you to scroll through the spreadsheet while always being able to see the header row. To use this utility, place the cursor under the row(s) and to the right of the column(s) that you always want to see. Then select WINDOW/FREEZE PANES.

Pivot Tables

Excel has a utility called a Pivot Table that allows us to create and analyze tabular summaries (contingency tables) of qualitative data. It can also be used with quantitative data or combinations of quantitative and qualitative data. To use the pivot table feature, data must be entered in columns and each column must have a title or header. Before invoking the procedure, be sure that the cursor is in one of the cells containing a header or data.

To start the “wizard,” go to Data/PivotTable and PivotChart Report. In the first step, just click on Next (the default values are what we want). In the second step, verify that the data range shown contains all of the data that you want to analyze, then click on Next again. In step 3, click on the button called “Layout.” You will be presented with the dialog box on the following page (except the buttons on the right will change according to the data set you are using).

At this point, click on and drag the button corresponding to the variable that you want on the rows of your output table to the area labeled “Row,” and the variable you want in columns to the area that says “Column.” Then drag either of the two buttons that you just used to the “Data” area. I recommend always dragging one of the qualitative variables’ buttons.
The button should change to say “Count of VARIABLE,” where VARIABLE is the name of the variable that you dragged to the middle. Then click OK. To complete the procedure there are a few other options you can change if you desire, but I usually just click on Finish at this point and change options later if the output is not what I want.

If you have used a quantitative variable, you will likely want to group it. To do so, right-click on the variable name in the table. One item in the pop-up menu should say Group. Choose it, and then specify how you want the variable to be grouped.

The pivot table can display several different types of summary measures. The default or “normal” state is to display total counts. There may be times that you want to display the numbers in the table as overall percentages, as row percentages, etc. To change the display, click anywhere in the table and go again to the Data/PivotTable and PivotChart Report menu item. You should be at step 3 again. Click on Layout and then double-click what is in the middle of the table (it should say “Count of…”). Then select Options. A drop-down menu that says “Show Data As” will be in the middle of the dialog box. Use the drop-down menu to say how you want to display the data. Then exit out of all of the boxes.

By default, Excel lists the categories of a qualitative variable alphabetically. You may want them listed in some kind of logical ascending order (for example, you may want to list class standing as Freshman, Sophomore, Junior and Senior). To tell Excel how you want the labels to be ordered, go to the Tools menu, select Options, and then click on the tab called “Custom Lists.” Then you can type in the list items in the order you want them (separate them with a comma or return) in the List Entries section. Or you can import the list in the order that you want by identifying the cells where they are listed.
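
To see why a custom list matters, here is a small Python sketch of the idea (an illustration only, not Excel's own mechanism). The names custom_list and custom_sort are made up for this example, and putting labels that are missing from the custom list at the end is my own assumption about how to treat them.

```python
# The order we want, typed in the same way we would type it into "List Entries".
custom_list = ["Freshman", "Sophomore", "Junior", "Senior"]

# Position of each label in the custom list: Freshman -> 0, Sophomore -> 1, ...
rank = {label: position for position, label in enumerate(custom_list)}


def custom_sort(labels):
    """Sort labels by custom-list position; unknown labels go last, alphabetically."""
    return sorted(labels, key=lambda label: (rank.get(label, len(rank)), label))


labels = ["Junior", "Freshman", "Senior", "Sophomore"]

print(sorted(labels))       # alphabetical: ['Freshman', 'Junior', 'Senior', 'Sophomore']
print(custom_sort(labels))  # logical:      ['Freshman', 'Sophomore', 'Junior', 'Senior']
```

The alphabetical order puts Junior and Senior ahead of Sophomore, which is exactly the problem the custom list fixes.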
Below is a portion of an Excel worksheet with both qualitative and quantitative variables. It shows both a portion of the original data and the resulting pivot table. I created a custom list in Excel as “Good, Very Good, Excellent.”
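
For readers who want to see what the pivot table is computing, the counting can be sketched in Python. This is only a sketch under stated assumptions: the records below are hypothetical (they are not the data from the worksheet above), and the function names pivot_counts and row_percentages are made up for this example. It builds the “Count of” table and then the row percentages that “Show Data As” would display.

```python
from collections import Counter

# Hypothetical records: (row variable, column variable). Not the data from the notes.
records = [
    ("Freshman", "Good"), ("Freshman", "Excellent"), ("Senior", "Good"),
    ("Senior", "Very Good"), ("Senior", "Good"), ("Freshman", "Good"),
]

# Category orders, as we would set them up with a custom list.
row_order = ["Freshman", "Senior"]
col_order = ["Good", "Very Good", "Excellent"]


def pivot_counts(records, row_order, col_order):
    """Count how many records fall in each (row, column) combination."""
    counts = Counter(records)  # missing combinations count as 0
    return [[counts[(r, c)] for c in col_order] for r in row_order]


def row_percentages(table):
    """Express each cell as a percentage of its row total."""
    result = []
    for row in table:
        total = sum(row)
        result.append([100 * value / total if total else 0.0 for value in row])
    return result


table = pivot_counts(records, row_order, col_order)
print(table)  # [[2, 0, 1], [2, 1, 0]]

for label, row in zip(row_order, row_percentages(table)):
    print(label, [round(value, 1) for value in row])
```

Each inner list is one row of the table, with the columns in the order Good, Very Good, Excellent.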