STAT/SOC 221 Excel Tutorial
Sam Wang
1

Agenda
• Excel Basics
• Statistics in Excel
• Graphics
2

Excel Basics
3

Excel Cells
Excel cells are referenced by a column and a row
– Columns are indexed by letters
– Rows are indexed by numbers
– Ex: A1 is the top left cell
4

Excel Formulas
Excel formulas consist of
– “=” Equal sign
– Formula name
– “(” Open parenthesis
– Inputs (will depend on the specific formula)
– “)” Close parenthesis
5

Example (Product)
– “=” Equal sign
– product
– “(” Open parenthesis
– 2,3
– “)” Close parenthesis
– Hit Enter. What does it return?
6

Calculations
For basic math calculations
– Start with an “=”
– “+” and “-” to add or subtract
– “*” (shift + 8) to multiply
– “/” to divide
– Excel respects the order of operations (PEMDAS), but Excel can’t read your mind, so make sure you specify what you want (use parentheses)
7

Best Practices
To help you navigate through the spreadsheet
– Hold “shift” while clicking or pressing an arrow key to select multiple cells
– Press (ctrl + direction arrow) to move to the edge of a block of cells
Check your work
– Whenever Excel gives you a number, take a step back and make sure the number makes sense
– Double check your formulas to make sure you’ve entered everything correctly
Save Often!!
– (Ctrl+s) is a shortcut for saving
Selecting Cells
– You can select multiple cells by clicking, holding down the mouse button, and dragging your mouse over the cells you want to select
8

Install Analysis Add-in
Installing the Analysis add-in will help you make histograms
– Analysis ToolPak is not available on Macs, so go to pg 26 instead
– Make sure the data analysis add-in is installed
  • File > Options > Add-Ins > click Go
  • Select Analysis ToolPak and Analysis ToolPak – VBA and click OK
9

Statistics in Excel
10

Central Tendencies
The “Average” formula
– Finds the mean of a set of numbers
– Formula name: average
– Inputs: the cells you want to find the average of
The “Median” formula
– Finds the median of a set of numbers
– Formula name: median
– Inputs: the cells you want to find the median of
The “Mode” formula
– Finds the mode of a set of numbers
– Formula name: mode
– Inputs: the cells you want to find the mode of
11

Standard Deviation
The Standard Deviation formula
– Finds the sample standard deviation of a set of numbers
– Formula name: stdev.s
– Inputs: the cells you want to find the standard deviation of
12

Z-Scores
Excel can also work as a Z-table
– Norm.S.Dist(z-score, TRUE) will return the area to the left of a z-score
– Norm.S.Inv(probability) will return the z-score that has the given area to its left under the curve
Example: Find the Z-score and P-value for a sample mean of 15 from a sample of 25 if the null hypothesis is that the mean is 17 with a population sd of 4
13

Five Number Summary
Useful formulas
– Min and Max find the minimum and maximum of a set of numbers
– Large(list of #’s, k) finds the k-th largest value in the list
– Be careful about using the quartile function
  – Not quite the same as what we’ve done in class
– Can also use Excel’s built-in Sort function to order the values
14

Sample Proportions
Useful formulas
– Countif(array, condition) will count the number of cells in the array that satisfy the condition
– Conditions might be “>0”, “<0”
– Countifs(array1, condition1, array2, condition2, …) allows for multiple conditions
– Averageif(range, condition, averageArray) will find the average of the cells in averageArray that correspond to the cells in range that satisfy the condition
15

Graphics in Excel
16

Bar Graphs
Creating Bar Graphs in Excel
– From the top ribbon go to Insert > Column Chart
– Select the option in the top left
17

Bar Graphs
Creating Bar Graphs in Excel
– Right click in the box that pops up, and select “Select Data”
– Select the option in the top left
– From the pop-up box, select Add to add data to the graph
18

Bar Graphs
Creating Bar Graphs in Excel
– The box that pops up will allow you to specify what data you want to graph
– The top box is the name you want to give the data. This is the name that will show up in the legend
– In the Series Values box, click the button on the left and then highlight the data you want to graph (click and hold your mouse to select the data; see the bottom of slide 8 for how to select multiple cells to capture all of your data).
19

Bar Graphs
Creating Bar Graphs in Excel
– To set the correct X-axis, after you have entered your data, select the Edit button under Horizontal (Category) Axis Labels
20

Bar Graphs
Creating Bar Graphs in Excel
– On the resulting pop-up box, click the red arrow
– Then select the bins you have created with your mouse
21

Bar Graphs
Modifying your bar graph
– Right click on one of the bars and select “Format Data Series”
– You can change the colors or even decrease the spacing in between bars
Use Fill to change the color of the bars
Enter 0 to eliminate spacing between bars
22

Bar Graphs
Labeling your Graph
– After clicking on your graph, from the top ribbon select Layout
  • From there you can modify the Chart Title and add Axis Titles as well
23

Histograms
– In a new column, define the buckets which we will put observations in (may be useful to use min and max to make sure you capture all observations)
– From the top ribbon, select Data > Data Analysis > Histogram
24

Histograms
– Input Range: Select the data you want to create a histogram of
– Bin Range: Select the buckets which you created earlier
– Select Chart Output
– Exercise: Think about how you could also use countifs to create the same output
25

Histograms Manually
Analysis ToolPak is unavailable on Macs, so you may need to create a histogram manually
– In a new column, define the buckets which we will put observations in (may be useful to use min and max to make sure you capture all observations).
  These will be the values on the X-axis of your histogram
– Make sure the buckets are sorted smallest to largest
26

Histograms Manually
Analysis ToolPak is unavailable on Macs, so you may need to create a histogram manually
– Use the countifs formula to count the number of observations which belong in each bin (see slide 15 for a refresher on the countifs formula)
– In the formulas below, BIN.low refers to the actual number that is the low end of your bucket, and BIN.high refers to the high end of your bucket. So do not enter those words, but replace them with the actual numbers for your bucket (those numbers will be different for each bucket).
– In the cell to the right of the first bucket, type the following
  • =countifs(
  • Select your data, then enter a comma
  • “>=BIN.low” then enter a comma (include the quotation marks)
  • Select your data again (should be the same cells as before). Then enter another comma
  • “<BIN.high” (include the quotation marks)
  • Then enter the close parenthesis ) and hit enter
– The cell will now show the number of observations greater than or equal to BIN.low (you need to enter an actual number) and less than BIN.high (you need to enter an actual number)
– Do this for each bucket you have
– Create a column chart out of your data (instructions for column charts start on slide 17)
– See the next slide for an example
27

Histograms Manually
BIN.low in this case is -1
Reference to my data
BIN.high in this case is 0
28

Questions
29
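
If you want to check a manual histogram outside Excel, the counting rule from the Histograms Manually slides (at least BIN.low, strictly less than BIN.high) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data values and bucket edges below are made up, and countifs_bin is a hypothetical helper name, not an Excel or Python built-in.

```python
# Hypothetical observations and bucket edges (made up for illustration,
# not taken from the slides).
data = [-0.8, -0.2, 0.1, 0.4, 0.9, 1.3, 1.7, -0.5]
edges = [-1, 0, 1, 2]  # bucket edges, sorted smallest to largest


def countifs_bin(values, bin_low, bin_high):
    """Count values with bin_low <= value < bin_high.

    Mirrors the rule behind =countifs(data, ">=BIN.low", data, "<BIN.high").
    """
    return sum(1 for v in values if bin_low <= v < bin_high)


# Pair each edge with the next one to get the buckets [-1, 0), [0, 1), [1, 2).
buckets = list(zip(edges, edges[1:]))
counts = [countifs_bin(data, low, high) for low, high in buckets]

for (low, high), n in zip(buckets, counts):
    print(f"[{low}, {high}): {n}")
```

Because each bucket includes its low end and excludes its high end, every observation lands in exactly one bucket. An observation exactly equal to the highest edge would not be counted, so it may help to set the last edge a little above the max.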