Making Histograms and Bar Charts in Excel

Lab 1
This tutorial is based on a fictional dataset that examines the relationship
between being happy and being sociable. Happiness and sociability were each
measured with a five-item scale in a sample of 30 university students. Each item was
rated on a Likert scale from 1 to 5, with higher numbers indicating greater happiness or sociability.
This tutorial walks you through the steps of making bar graphs and histograms, lets you
practice making them, and tests some of your knowledge about graphs.
Pre-exercise quiz
1. For each of the following types of data, please list whether you would best represent
them in a bar graph, a histogram, or a scatterplot.
a. Gender of the participants _______________________________
b. Age of the participants __________________________________
c. IQ scores ______________________________________________
d. Relationship between age and height _____________________
e. Ethnicity of participants _________________________________
f. Time spent studying vs final grade ________________________
g. Amount of time participants spent sleeping ________________
h. Amount of time an individual spent sleeping for each day of the week, recorded over
a one week period _________________________________________
2. If I am trying to graph the frequencies of a nominal variable, which graph should I
use?
a) bar chart
b) clustered bar chart
c) histogram
d) scatterplot
3. Which measure of central tendency is most affected by outliers?
a) mode
b) median
c) mean
d) not enough information to decide
4. You have a large and detailed dataset for your thesis. It is full of information just
waiting to be analyzed and you have several hypotheses that you want to test. List at
least four reasons why you might want to graph your data before you run your tests.
1. __________________________________________________________________________
2. __________________________________________________________________________
3. __________________________________________________________________________
4. ___________________________________________________________________________
Making a Bar Chart Exercise
1. Open Lab 1 Data from the course website
(http://aix1.uottawa.ca/~schartie/psy2106/psy2106.htm).
2. You want to find out how many males and females there are in your sample. We are
going to make a bar graph in order to find out. Ask yourself, why are we making a bar
chart instead of a histogram or scatterplot or line graph?
3. Highlight the column named Gender (including the title), click on the Insert tab,
and select PivotTable. A dialog window should appear. I prefer to select
Existing Worksheet rather than New Worksheet and to place the table at the end of my
data. After selecting the radio button Existing Worksheet, click on the red arrow icon,
select an unused cell at the end of the data, such as T1, and then click
OK.
4. Scroll to your pivot table. The PivotTable Field List will show Choose fields to
add to report. Select Gender and drag it to Σ Values. It should now say “Count of
Gender”. Next, select Gender again and drag it to Row Labels. It should now say
“Gender” with the option for a drop-down menu. Your pivot table should now list each
gender alongside its count.
5. Go to PivotTable Tools → Options → PivotChart and select the first option,
Clustered Columns. At first it appears that there is a large difference between the numbers of
females and males; however, the values on the y-axis show that the two counts are very
similar. To make the graph less misleading, right click on the y-axis and select
Format Axis. For Axis Minimum, select the radio button Fixed and enter “0”. For Axis
Maximum, select the radio button Fixed and enter “18”. This will make the y-axis start
at 0 and go up to 18. Adjust the increments until the graph is to your liking.
Delete the legend and title your graph appropriately.
6. Explore what else you can do to configure the bar graph. Try selecting the bars on the
graph and changing their colour, adding or removing an outline, or creating a colour gradient
under the Design tab. Experiment with Chart Layouts. Next, try to change the chart
gridlines under the Layout tab. Have fun! As you adjust your graph with different fonts
and colours, ask yourself when this style would be appropriate, when it would be
inappropriate, and whether it makes the important information easier and faster to
extract from the graph, or whether it is distracting or too cluttered.
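If you would like to double check the logic of the pivot table outside Excel, the short Python sketch below counts how often each category appears and prints a text bar chart whose bars are measured from zero. The Gender values and the helper name text_bar_chart are made up for illustration; they are not the Lab 1 data or part of Excel.

```python
from collections import Counter

# Hypothetical Gender column; these values are made up, not the Lab 1 data.
gender = ["Female", "Male", "Female", "Female", "Male", "Male", "Female"]

# Same idea as dragging Gender to Row Labels and to Σ Values (Count of Gender).
counts = Counter(gender)


def text_bar_chart(freqs):
    """Return one line per category, with bar length measured from zero."""
    width = max(len(label) for label in freqs)
    return [
        f"{label.ljust(width)} | {'#' * n} {n}"
        for label, n in sorted(freqs.items())
    ]


for line in text_bar_chart(counts):
    print(line)
```

Because every bar starts at zero, the bar lengths are proportional to the counts, which is what fixing the axis minimum at 0 achieves in step 5.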
Making a Histogram
Now we will create a simple histogram. Ask yourself, what is the difference between a
bar graph and a histogram? Some of this will be repetitive, but it’s good practice!
1. Highlight the column Year of School, including the title, and select Insert →
PivotTable. To switch things up (and so you can decide which way you like better),
select the radio button New Worksheet under Choose where you want the PivotTable
report to be placed. A new worksheet should now open. If you want to get back to
your data, use the tabs at the bottom of the page to switch to Sheet 1. For now, stay on
Sheet 4.
2. In the PivotTable Field List on the right hand side of the page, drag Year of School
down to Σ Values and Row Labels. Under Σ Values click on Sum of Year of School and
select Value Field Settings.
3. In the window that pops up, under Summarize value field by select Count and press
OK. This means that the graph produced will be counting the number of times a
specific value of Year of School appears rather than summing the values.
4. At the top of the page, under the tab PivotTable Tools select the tab Options, then
click on PivotChart and select Clustered Columns (the first chart option). Click OK.
5. You will see a bar chart appear. First, delete the Legend that reads Total. It’s obvious
we are doing frequencies. Now, we want to treat year of school like a continuous
variable, so we want to remove the spaces between the bars and create a
histogram. To do this, select the actual bars on the graph (make sure that the bars are
highlighted separately from the rest of the graph) and right click. Select Format Data
Series. In the window that appears, go to Series Options → Gap Width and move the
slider from 50% to 0%.
6. You have now created a histogram. APA style also calls for labelled axes and no
chart title, so delete the heading titled Total. To add an axis label, go to the top of
the page and select PivotChart Tools and select the tab Layout. Go to Axis Titles →
Primary Horizontal Axis Title → Title Below Axis. Now, select Axis Titles → Primary
Vertical Axis Title → Rotated Title. You should now be able to type titles for
your axes. Come up with something short and informative. Also, feel free to play with
the various options available to you to make your chart prettier.
7. Ask yourself, what kind of shape does your chart have?
a) mesokurtic
b) positively skewed
c) negatively skewed
d) platykurtic
8. What kind of distribution is this?
a) normal
b) linear
c) unimodal
d) none of the above
9. Given your answer above, what would not be a good way to describe the year of
study your participants are in?
a) mode
b) median
c) mean
d) none of the above (i.e. they are all good)
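To see why step 3 of this exercise switches the value field from Sum to Count, the Python sketch below computes both summaries for the same column. The Year of School values are made up for illustration and are not the Lab 1 data.

```python
from collections import Counter, defaultdict

# Hypothetical Year of School values (made up for illustration).
year_of_school = [1, 2, 2, 3, 1, 2, 4, 3, 2, 1]

# "Sum of Year of School": adds up the values that share a row label.
sums = defaultdict(int)
for year in year_of_school:
    sums[year] += year

# "Count of Year of School": counts how many times each value occurs.
counts = Counter(year_of_school)

for year in sorted(counts):
    print(f"Year {year}: sum = {sums[year]}, count = {counts[year]}")
```

Only the counts are frequencies. The sums grow with the year value itself, so a chart of sums would exaggerate the higher years.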
Making a Histogram 2
1. Calculate the average response on the happy items. Under Average Happy select the
first row and put in the formula =AVERAGE(D2:H2). This will calculate an average for
Happy1 to Happy5. Copy and paste this formula for the other 29 participants.
2. Right click on the column Average Happy and select Insert. This will create a new
column beside Average Happy. Title this column Bins.
3. When making histograms, it is important to consider bins. There should not be so
many bins that the graph is overwhelming and displays too much
information, nor so few (and so wide) that it doesn’t display enough information. To help us
figure out how big each bin should be, we must first determine the least possible score
and the greatest possible score. Since we are averaging items on a 1-5 scale, the
averages can range from 1 to 5. However, we might also want to know what
the actual lowest score is from our participants and what the actual highest score is.
4. To do this, select all 30 participant scores from Average Happy. Right Click → Sort
→ Sort Smallest to Largest. This will prompt you to either sort just that column
(Continue with Current Selection) or to keep each row of information together (Expand
the Selection). Since we want to keep each row intact, select Expand the Selection and
click Sort.
5. To double check that each participant is still linked to their scores, check the
participant ID numbers and ensure that they now appear in a random order. Now, look at
the first and last scores in the column Average Happy. The smallest score should be 1.4 and the
largest score should be 4.2. Now that we know our smallest and largest scores, we can
determine our bin sizes.
6. There are lots of rules of thumb available for how many bins there should be. For
this exercise, let’s use 9 bins, with each extreme bin being empty to help centre the
graph. This will leave us with the bin ranges 0.6 – 1.0, 1.1 – 1.5, 1.6 – 2.0, 2.1 – 2.5, 2.6 –
3.0, 3.1 – 3.5, 3.6 – 4.0, 4.1 – 4.5, 4.6 – 5.0.
7. Now, list the maximum value for each bin in a list under Bins. This should look like
1.0
1.5
2.0
2.5
3.0
3.5
4.0
4.5
5.0
8. Now go to the tab at the top of the page titled Data and select Data Analysis. (If
Data Analysis does not appear, the Analysis ToolPak add-in needs to be enabled first.) Next
select Histogram.
9. For Input Range select all 30 scores for Average Happy (you can do this manually or
by writing I2:I31). For Bin Range select the nine bin values, either manually or by
writing their range (J2:J10 if they start in row 2 of column J).
10. Next, check the box Chart Output.
11. You can choose to have the output appear either on the same sheet as the data or on a new
sheet. Either option works.
12. You should now have a chart that you can edit. If you alter the columns on the left, it
will automatically alter the chart on the right. To get rid of the bin labeled More, right
click on that cell and select Delete and then select Delete entire row. The chart should
automatically adjust.
13. Feel free to edit it as described before: get rid of the gaps between bars, alter the
number of ticks on the y-axis, change the bin labels to ranges (e.g., 1.1-1.5 instead of 1.5),
put an outline around each bar, etc.
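The averaging and binning steps above can also be sketched in Python as a way to check your understanding. This sketch assumes that the Histogram tool counts a score in a bin when it is less than or equal to that bin's value and greater than the previous one, with anything above the last value going to More. The responses and the helper name bin_counts are made up for illustration; they are not the Lab 1 data.

```python
from bisect import bisect_left

# Hypothetical answers to Happy1..Happy5 for four participants (made-up values).
responses = [
    [1, 2, 1, 2, 1],
    [3, 3, 4, 3, 2],
    [4, 5, 4, 4, 4],
    [2, 3, 3, 2, 3],
]

# Same idea as =AVERAGE(D2:H2) copied down the column.
average_happy = [round(sum(row) / len(row), 1) for row in responses]

# Maximum value of each bin, as listed under Bins.
bin_limits = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]


def bin_counts(values, limits):
    """Count values per bin; a value equal to a limit falls in that bin."""
    counts = [0] * (len(limits) + 1)  # the final slot plays the role of "More"
    for value in values:
        # Index of the first limit that is greater than or equal to the value.
        counts[bisect_left(limits, value)] += 1
    return counts


frequencies = bin_counts(average_happy, bin_limits)
labels = [str(limit) for limit in bin_limits] + ["More"]
for label, frequency in zip(labels, frequencies):
    print(f"{label:>4}: {frequency}")
```

Note that an average of exactly 3.0 lands in the bin labelled 3.0 (the 2.6 – 3.0 range), not in the next bin up, which is why listing the maximum value of each bin is enough to define the ranges.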
Post-exercise questions:
14. What best describes this type of distribution?
a) unimodal
b) bimodal
c) normal
d) skewed
15. Can you describe this data using the mean of participants? Why or why not? If not,
what other measure of central tendency would you use?