Making Histograms and Bar Charts in Excel
Lab 1

This tutorial is based on a fictional dataset that measures the relationship between being happy and being sociable. Happiness and sociability were each measured with a five-item scale for 30 university student participants. Each item was rated on a Likert scale of 1-5, with higher numbers indicating greater happiness or sociability. This tutorial will walk you through the steps and allow you to practice making bar graphs and histograms, as well as test some of your knowledge about graphs.

Pre-exercise quiz

1. For each of the following types of data, state whether it is best represented by a bar graph, a histogram, or a scatterplot.
   a. Gender of the participants _______________________________
   b. Age of the participants __________________________________
   c. IQ scores ______________________________________________
   d. Relationship between age and height _____________________
   e. Ethnicity of participants _________________________________
   f. Time spent studying vs final grade ________________________
   g. Amount of time participants spent sleeping ________________
   h. Amount of time an individual spent sleeping for each day of the week, recorded over a one-week period _________________________________________

2. If I am trying to graph the frequencies of a nominal variable, which graph should I use?
   a) bar chart
   b) clustered bar chart
   c) histogram
   d) scatterplot

3. Which measure of central tendency is most affected by outliers?
   a) mode
   b) median
   c) mean
   d) not enough information to decide

4. You have a large and detailed dataset for your thesis. It is full of information just waiting to be analyzed and you have several hypotheses that you want to test. List at least four reasons why you might want to graph your data before you run your tests.
   1. __________________________________________________________________________
   2. __________________________________________________________________________
   3. __________________________________________________________________________
   4. __________________________________________________________________________

Making a Bar Chart Exercise

1. Open Lab 1 Data from the course website (http://aix1.uottawa.ca/~schartie/psy2106/psy2106.htm).

2. You want to find out how many males and females there are in your sample. We are going to make a bar graph in order to find out. Ask yourself, why are we making a bar chart instead of a histogram, scatterplot, or line graph?

3. Highlight the column named Gender (including the title), click on the Insert tab, and select PivotTable. A window that looks like this should appear. I prefer to select Existing Worksheet rather than New Worksheet, and place the table at the end of my data. After selecting the radio button Existing Worksheet, click on the red arrow icon and select a cell at the end of the data that is not being used, such as T1, and then click OK.

4. Scroll to your pivot table. Under PivotTable Field List it will show Choose fields to add to report. Select Gender and drag it to Σ Values. It should now say “Count of Gender”. Next, select Gender again and drag it to Row Labels. It should now say “Gender” with the option for a drop-down menu. Your page should now look like this:

5. Go to PivotTable Tools > Options > PivotChart and select the first option, Clustered Column. At first it appears that there is a large difference in the numbers of females and males; however, the numbers on the y-axis show that the two counts are very similar. To make the graph less misleading, right-click on the y-axis and select Format Axis. For Axis Minimum, select the radio button Fixed and enter “0”. For Axis Maximum, select the radio button Fixed and enter “18”. This will make the y-axis start at 0 and go up to 18. Adjust the increments to make the graph to your liking. Delete the legend and title your graph appropriately.

6. Explore what else you can do to configure the bar graph.
Try selecting the bars on the graph and changing the colour, adding or removing an outline, or creating a colour gradient under the Design tab. Experiment with Chart Layouts. Next, try to change the chart gridlines under the Layout tab. Have fun! As you adjust your graph with different fonts and colours, ask yourself when this style would be appropriate, when it would be inappropriate, and whether it makes the important information easier and faster to extract from the graph, or whether it is distracting or too cluttered.

Making a Histogram

Now we will create a simple histogram. Ask yourself, what is the difference between a bar graph and a histogram? Some of this will be repetitive, but it’s good practice!

1. Highlight the column Year of School, including the title, and select Insert > PivotTable. To switch things up (and so you can decide which way you like better), select the radio button New Worksheet under Choose where you want the PivotTable report to be placed. A new worksheet should now open. If you want to get back to your data, use the tabs at the bottom of the page to switch to Sheet 1. For now, stay on Sheet 4.

2. In the PivotTable Field List on the right-hand side of the page, drag Year of School down to both Σ Values and Row Labels. Under Σ Values, click on Sum of Year of School and select Value Field Settings.

3. In the window that pops up, under Summarize value field by, select Count and press OK. This means that the graph will count the number of times each value of Year of School appears rather than summing the values.

4. At the top of the page, under PivotTable Tools, select the Options tab, then click on PivotChart and select Clustered Column (the first chart option). Click OK.

5. You will see a bar chart appear. First, delete the Legend that reads Total. It’s obvious we are doing frequencies.
Now, we want to treat year of school like a continuous variable, so we want to remove the spaces between the bars and create a histogram. To do this, select the actual bars on the graph (make sure that the bars are highlighted separately from the rest of the graph) and right-click. Select Format Data Series. In the window that appears, go to Series Options > Gap Width and move the arrow from 50% to 0%.

6. You have now created a histogram. APA style also asks you to label your axes and to leave out the chart title. Please delete the heading titled Total. To add an axis label, go to the top of the page, select PivotChart Tools, and select the Layout tab. Go to Axis Titles > Primary Horizontal Axis Title > Title Below Axis. Now, select Axis Titles > Primary Vertical Axis Title > Rotated Title. You should now be able to create titles for your axes. Come up with something short and informative. Also, feel free to play with the various options available to you to make your chart prettier.

7. Ask yourself, what kind of shape does your chart have?
   a) mesokurtic
   b) positively skewed
   c) negatively skewed
   d) platykurtic

8. What kind of distribution is this?
   a) normal
   b) linear
   c) unimodal
   d) none of the above

9. Given your answer above, what would not be a good way to describe the year of study your participants are in?
   a) mode
   b) median
   c) mean
   d) none of the above (i.e. they are all good)

Making a Histogram 2

1. Calculate the average response on the happy items. Under Average Happy, select the first row and enter the formula =AVERAGE(D2:H2). This will calculate an average for Happy1 to Happy5. Copy and paste this formula for the other 29 participants.

2. Right-click on the column Average Happy and select Insert. This will create a new column beside Average Happy. Title this column Bins.

3. When making histograms, it is important to consider bins.
There should not be so many bins that the graph is overwhelming and displays too much information, but the bins should not be so wide that the graph displays too little. To help us figure out how big each bin should be, we must first determine the least possible score and the greatest possible score. Since we are averaging items rated on a 1-5 scale, the averages can range from 1 to 5. However, we might also want to know the actual lowest and highest scores among our participants.

4. To do this, select all 30 participant scores from Average Happy. Right-click and select Sort > Sort Smallest to Largest. This will prompt you to either sort just that column (Continue with Current Selection) or to keep each row of information together (Expand the Selection). Since we want to keep each row intact, select Expand the Selection and click Sort.

5. To double-check that each participant is still linked to their scores, check the participant ID numbers and ensure that they now appear in a random order. Now, look at the first and last scores in the column Average Happy. The smallest score should be 1.4 and the largest score should be 4.2. Now that we know our smallest and largest scores, we can determine our bin sizes.

6. There are lots of rules of thumb available for how many bins there should be. For this exercise, let’s use 9 bins, with each extreme bin being empty to help centre the graph. This leaves us with the bin ranges 0.6-1.0, 1.1-1.5, 1.6-2.0, 2.1-2.5, 2.6-3.0, 3.1-3.5, 3.6-4.0, 4.1-4.5, and 4.6-5.0.

7. Now, list the maximum value for each bin under Bins, one value per cell. The column should read: 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0.

8. Now go to the Data tab at the top of the page and select Data Analysis. Next, select Histogram.

9. For Input Range, select all 30 scores for Average Happy (you can do this manually or by typing I2:I31). For Bin Range, select your nine bin values (J2:J10 if they start in row 2), either by typing the range or by selecting the cells manually.

10. Next, check the box Chart Output.

11. You can choose to have the output appear either on the same sheet as the data or on a new sheet. You can decide.

12. You should now have a chart that you can edit. If you change the values in the table on the left, the chart on the right updates automatically. To get rid of the bin labelled More, right-click on that cell, select Delete, and then select Delete entire row. The chart should adjust automatically.

13. Feel free to edit the chart as described before: get rid of the gaps between bars, alter the number of ticks on the y-axis, change the bin labels to ranges (e.g. 1.1-1.5 instead of 1.5), put an outline on each bar, etc.

Post-exercise questions:

14. What best describes this type of distribution?
   a) unimodal
   b) bimodal
   c) normal
   d) skewed

15. Can you describe this data using the mean of participants? Why or why not? If not, what other measure of central tendency would you use?
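
Optional: checking the bin counts outside Excel

If you would like to check by hand how scores are sorted into bins in Making a Histogram 2, the short Python sketch below may help. It assumes the usual rule of Excel's Histogram tool: a score is counted in the first bin whose maximum is greater than or equal to the score, and anything above the last maximum goes into More. The function name histogram_counts and the nine example scores are made up for illustration; they are not the Lab 1 data.

```python
from bisect import bisect_left


def histogram_counts(scores, bin_maxima):
    """Count scores per bin, given the maximum value of each bin.

    Assumed rule: a score belongs to the first bin whose maximum is
    greater than or equal to the score. The final slot of the result
    plays the role of the "More" bin.
    """
    edges = sorted(bin_maxima)
    counts = [0] * (len(edges) + 1)  # one extra slot for "More"
    for score in scores:
        # bisect_left returns the index of the first edge >= score
        counts[bisect_left(edges, score)] += 1
    return counts


# Bin maxima from step 7 of Making a Histogram 2.
bins = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]

# Hypothetical average scores, for illustration only (not the lab data).
example_scores = [1.4, 2.0, 2.2, 2.6, 3.0, 3.0, 3.4, 3.8, 4.2]

for upper, count in zip(bins + ["More"], histogram_counts(example_scores, bins)):
    print(upper, count)
```

Note that a score exactly equal to a bin maximum (such as 2.0) is counted in that bin, not the next one, which is why the bin ranges in step 6 are written as 1.6-2.0, 2.1-2.5, and so on.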