Chart Deception: Part 2

Slide 1
In part two of this lecture I’ll talk about X, Y (scatter) charts and then provide many examples of poorly constructed graphics. In many cases, I’ll present alternative and better constructions of the same graphics.

Slide 2
Some of this advice is critical to designing non-deceptive scatter charts. It’s important to label the data points.

Slide 3
This graph is essentially meaningless for communicating real information because neither the data points nor the axes are labeled.

Slide 4
In contrast to the previous graph, this graph includes both labeled points and labeled axes.

Slide 5
You’ll notice, as you start looking at graphics more critically, that graphical displays like those in USA Today exaggerate differences, changes, or trends. USA Today does this by not starting its axes at the origin (the 0,0 point). The next four slides illustrate this problem.

Slide 6
Here the Y axis starts at 3 and ends at 9, so there appears to be a huge shift in average orders by month. People will remember the shape of this graph, and a quick glance suggests there were no orders in October, November, and December, but substantial orders in July, August, and September.

Slide 7
In contrast to the previous slide, this slide suggests a much more stable pattern of average orders per salesperson over the six-month period. The previous graph suggests tremendous variability, which is incorrect. The true message is that orders were a bit higher in July through September and a bit lower in October through December. A graph that suggests otherwise is deceptive. Also notice that the Y axis starts at zero and extends beyond the topmost value, which occurred in August.

Slide 8
Here’s another example of the failure to use the 0,0 origin as the starting point. A quick look at this graph suggests that the average orders per salesperson are highly variable and that September was a disastrous month.

Slide 9
In fact, just the opposite is true.
The average orders have been extraordinarily stable across this six-month period. The first graph leaves viewers with a distorted impression of the average orders over this period.

Slide 10
(No Audio)

Slide 11
In this graph, distortion is caused by treating unequal time intervals as equal. The graph on the right, B, devotes a huge area to the period between 1975 and 1980. As a result, it seems as if there was a long takeoff period before the increase in dollars. In contrast, the graph on the left, A, presents an undistorted X axis. In A, dollars seem relatively stable over the first half of the graph, but from 1985 to 2000 there was a meaningful increase in dollars. You should avoid graphs like B, in which the period from 1975 to 1980 and the period from 1995 to 2000 are distorted.

Slide 12
Even if you depict equal intervals identically in your graph, you can still change the visual impression. The top left graph shows the original scaled arrangement; you can see what happens as you start expanding and contracting the X and Y axes. In some cases, you can make changes seem much smaller; in other cases, you can make the changes seem much greater. X, Y plots are meant to indicate trends and variability, so you shouldn’t distort that message by arbitrarily expanding or contracting the horizontal and vertical scales of your graph.

Slide 13
(No Audio)

Slide 14
As I mentioned earlier, avoid broken axes. Here, the Y axis jumps from 0 to 9, so it seems as if the designer has properly started the Y axis at the zero point but in fact has distorted the message. As a result, this line graph appears to depict a huge increase from 1930 to 1970. The numbers in the graph indicate a roughly 50% increase.
Do you believe anyone viewing this graph casually will see only a 50% increase when the graph suggests, due to the broken Y axis, that there’s a 400 to 500% increase? Using discontinuous axes to depict data will only confuse and distort the information.

Slide 15
Here are three more examples of why you shouldn’t distort a chart by using broken axes and why you should start from the zero point on the X and Y axes.

Slide 16
(No Audio)

Slide 17
People follow certain conventions when they view graphs, and one of those conventions is that an upward sloping line means the quantity is increasing over time, especially if time is on the X axis and the amount in question is on the Y axis. This cumulative rainfall graph seems to indicate that rainfall increased markedly from June to December. In fact, the individual amounts for each month listed at the bottom of the slide indicate no increase. Avoid using cumulative charts unless you have a good reason for doing so, because cumulative charts tend to be inconsistent with people’s expectations of charts.

Slide 18
Although this is a bar chart instead of an X, Y chart, using cumulative bar charts presents a similar problem. Avoid using cumulative charts.

Slide 19
Occasionally, people seem compelled to put multiple graphs into a single graph. Somehow, this arrangement is meant to depict vital information about how the two graphs are related. This format can only confuse people, as you’ll see in the next three slides.

Slide 20
Someone looking at this graph is supposed to conclude that sales are dropping because inventory is dropping; however, those things could be independent. Graphing those things together suggests a relationship when none may exist. If it’s necessary to show sales and inventories, you should use two graphs, not one.

Slide 21
Here’s another example of graphing two things together in a way that suggests they’re related.
It’s doubtful that consumption triples when the outdoor temperature increases from 80 to 95 degrees, yet that’s what this slide suggests. These things may be weakly or strongly related, but this slide indicates they’re strongly related, which may or may not be the case.

Slide 22
This graph is meant to indicate newspaper readership over time. The Daily News and The Post are two New York City newspapers. A quick look at this graph suggests that readership is converging, which isn’t true, at least not as dramatically as suggested. Part of the problem is that the Y axis of this graph is discontinuous; it suddenly jumps from 800,000 readers per day to 1.5 million readers per day. In other words, there are two graphs (the Daily News graph on the top and the Post graph on the bottom) that someone stuck together. Although the readership of the Post is increasing and the readership of the Daily News is decreasing, they are not converging as rapidly as this graph suggests.

Slide 23
As I mentioned in an earlier lecture, many marketing relationships are non-linear, and one way to linearize a relationship is to transform the data. Converting data to logarithms can be useful for scientific audiences used to semi-log charts. However, such transformations will deceive nonscientists who don’t understand them. In this case, the use of a semi-log chart minimizes the appearance of a trend.

Slide 24
Here’s a logarithmic transformation. Equal intervals denote increases by a power of 10. One is 10 to the zero power, 10 is 10 to the first power, 100 is 10 to the second power, and 1000 is 10 to the third power. Seemingly, there’s not much of a trend from 1995 to 1997, and the extrapolated area suggests a mild increase over time. In fact, if this graph is correct, then the increase from 1995 to 1999 is from 10 to 1000, or a 100-fold increase. A casual look at the graph suggests only a tripling.
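As a rough check on the Slide 24 arithmetic, here’s a minimal Python sketch. It isn’t part of the original slides; the function names are my own, and it assumes the log axis starts at 1 (10 to the zero power), as described above.

```python
import math


def fold_change(start, end):
    """How many times larger the end value is than the start value."""
    return end / start


def log_axis_height_ratio(start, end, axis_floor=1.0):
    """Ratio of plotted heights above axis_floor on a log10 Y axis.

    Assumes the axis starts at axis_floor (1, i.e. 10 to the zero power).
    """
    return math.log10(end / axis_floor) / math.log10(start / axis_floor)


# Values read from the Slide 24 description: 10 in 1995, 1000 in 1999.
start_value, end_value = 10.0, 1000.0

print(f"Actual change: {fold_change(start_value, end_value):.0f}-fold")
print(f"Height on the log axis: {log_axis_height_ratio(start_value, end_value):.0f}x")
```

The second number is the tripling a casual viewer sees: the 1999 point sits three decades above the axis floor and the 1995 point sits one decade above it, even though the underlying quantity grew 100-fold.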
Slide 25
This is another example of people’s cultural norms regarding the reading of graphs. If the X axis isn’t the time axis, then connecting the dots can mistakenly suggest a trend, because that’s what people expect when they see an X, Y chart with a bunch of dots and a line that connects those dots. Don’t connect the dots unless the X axis is the time axis.

Slide 26
(No Audio)

Slide 27
Finally, regarding X, Y charts, I want to ensure you’re clear about interpolation and extrapolation. With interpolation, you’ve got two points on the graph and you’re trying to guess the midpoint between those two points. With extrapolation, you’re looking at all the points up to the end of the graph and then trying to guess subsequent points beyond the current points. Both interpolation and extrapolation are subjective assessments. Interpolation may seem safer because you’re guesstimating a midpoint. If you have a large series of points, you’ll feel comfortable that there won’t be some dramatic change right at the midpoint or somewhere between two existing points. With extrapolation, it’s impossible to know whether trends will continue; there could be a dramatic increase or decrease relative to the current trend that isn’t suggested by theory, sophisticated forecasting methods, and the like. Just remember that extrapolations are highly subjective and interpolations are somewhat subjective.

Slide 28
(No Audio)

Slide 29
Here’s what I mean by a radar chart. Although these are popular in Japan, they are problematic. In this example, the axes are not identical: acceleration goes from 0 to 15 but handling goes from 0 to 8, yet these two axes are the same length. Fuel economy goes from 0 to 10; riding and styling go from 5 to 15.
Even if you put multiple plots on the same graphic, you’ve got all the problems that I mentioned earlier about the use of profile analysis and semantic differentials: you can’t know what’s important and you can distort people’s perceptions by changing the relative size of the different and unrelated axes. I urge you to avoid radar charts. Fortunately, no current spreadsheet, graphics, or statistical package uses this approach by default for plotting data.

Slide 30
People seem to focus on point estimates (modes, medians, and means), which are the single best summary numbers. For metric (interval- or ratio-scaled) data, that point is an estimate based on the sample you drew. If you drew subsequent samples, you might find different point estimates. It’s important to give the viewers of your graphs a sense of the range of likely point estimates for repeated samples. That’s what you depict when you show point estimates and confidence intervals around those estimates. The next three slides show confidence intervals as well as the point estimates.

Slide 31 to Slide 33
(No Audio)

Slide 34
You always want to use footnotes to indicate the source of the information depicted in any table that you create. Otherwise, you can be accused of plagiarism, and that’s a serious charge. In this Internet era, many people believe that borrowing liberally from other sources is okay, as the true creativity is in the remixing of existing sources. That’s not a good mindset. If it’s obvious that you have borrowed something but haven’t indicated the source, then that’s plagiarism. Avoid plagiarism by using footnotes. You should use proportional fonts and you shouldn’t mix fonts. I use Arial for everything. If you import a JPG file, then it’s difficult to manipulate fonts because you’re starting from a picture. Don’t use all uppercase lettering; instead, mix uppercase and lowercase letters. Using uppercase only is equivalent to shouting.
If you need to emphasize, then use underline or italics. Finally, consider how you depict numbers. If the numbers represent data points, then use digits; otherwise, spell out the numbers one through nine and use digits for 10 and above.

Slide 35
(No Audio)

Slide 36
The next four slides depict bad graphics but show no fix. The text indicates what’s inappropriate about each graph, and what makes it bad suggests the improvements.

Slide 37 to Slide 38
(No Audio)

Slide 39
This graph is lousy for two reasons. First, the Y axis is logarithmic. Second, the graph depicts two different things, which suggests a relationship when none may exist.

Slide 40
This is another poor graph. Notice the effort to compare males to females over the same period, 1968 to 1976. Between the two graphs are dotted lines that imply a precipitous drop from males to females. Although there’s a drop, it’s not as large as this graph implies. Because that dotted line links 1976 data for males to 1968 data for females, the graph creates a distorted impression.

Slide 41
(No Audio)

Slide 42
Here’s an example of stacked 3-D bar charts used to reveal a trend from 1971 to 2000 across five different sources of electricity.

Slide 43
The real message from the previous chart is that there are profound increases predicted for petroleum and nuclear energy, but only modest increases for other sources. That point is obscured in the previous slide but becomes obvious when the data are shown in a scatter plot instead of a stacked 3-D bar chart.

Slide 44
(No Audio)

Slide 45
In the previous slide, the Y axis did not begin at the origin. Here’s what happens when you take that same data and plot it with the Y axis for expenditures per pupil starting at the origin. In this case, it’s clear that expenditures have been relatively stable. The previous slide implies that upward trends in expenditures have caused upward trends in SAT scores. The data, if depicted properly, suggests just the opposite.
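To put a rough number on how much a Y axis that doesn’t start at zero exaggerates a change, here’s a minimal Python sketch. It isn’t from the slides: the function name and the two order values (4 and 8) are hypothetical, and only the axis floor of 3 is borrowed from the Slide 6 example.

```python
def apparent_ratio(first, second, axis_floor=0.0):
    """Ratio of plotted heights when the Y axis starts at axis_floor."""
    if first <= axis_floor:
        raise ValueError("first value must lie above the axis floor")
    return (second - axis_floor) / (first - axis_floor)


# Hypothetical values (not from the slides): orders rise from 4 to 8.
low, high = 4.0, 8.0

honest = apparent_ratio(low, high)                     # Y axis starts at zero
truncated = apparent_ratio(low, high, axis_floor=3.0)  # Y axis starts at 3

print(f"Axis from 0: the second point looks {honest:.1f}x as tall")
print(f"Axis from 3: the second point looks {truncated:.1f}x as tall")
```

With these made-up numbers, a doubling is drawn as a fivefold difference once the axis starts at 3. The closer the axis floor gets to the smaller value, the larger the exaggeration becomes.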
Slide 46
(No Audio)

Slide 47
The problem with the first graph in this two-graph series is that the 1978 data was only partial. As a result, it suggests that there was a downward trend from 1976 to 1978. In fact, a careful look at the 1978 data indicates an upward trend in commission payments.

Slide 48
(No Audio)

Slide 49
The previous stacked bar chart version of this data obscured whether different countries were pulling their weight in this regard. With the one stacked bar chart divided into four separate graphs, it appears that U.S. stocks have increased, Japanese stocks are flat, West German stocks have fluctuated, and the stocks of all other OECD countries have declined. The first oil stocks graph obscures all of this.

Slide 50
Here’s an excellent example of why you should avoid the chart junk that appears in publications like USA Today. Notice that the graph in the background contains barrels of beer, which are supposed to artistically indicate changes in beer sales from 1970 to 1978. In 1970, the number of barrels was 120 million. This graph doesn’t start from the zero point; rather, it starts from 100 million. In 1978, total sales were 160 million barrels. The difference between 120 million and 160 million is 40 million, or a 33 1/3% increase. Because the graph starts from a non-zero point on the Y axis and amounts sold are associated with barrel sizes, the 1978 barrel appears huge relative to the 1970 barrel. A quick glance suggests that U.S. beer consumption increased tremendously in the eight-year period. The graph in the foreground indicates the millions of barrels sold by Schlitz, and the brewer’s market share seems to be in rapid decline.

Slide 51
However, as this next graph shows, that’s not the case. In fact, U.S. beer sales grew steadily throughout the 1970s.
This graph indicates what I mentioned earlier: sales are up by one-third, and Schlitz’s original market share of 15% first grew to 25% and then declined to 20%. The previous graph suggests that the market exploded and that Schlitz’s sales not only declined but its share of the market dropped far more than 5% from its peak.

Slide 52
This graphic shows someone’s artistic approach to illustrating the declining purchasing power of the dollar. In the base year of 1958, when Eisenhower was President, the dollar was worth a dollar. By Jimmy Carter’s time in 1978, the value of a dollar had dropped to $0.44.

Slide 53
As this graph shows, the purchasing power of the dollar declined from the Eisenhower Administration through at least half of the Carter Administration. The dollar on this graph is worth less than half in 1978 of what it was worth in 1958. That’s the correct interpretation. The problem with the previous graph and all the chart junk is that dollar values were depicted as areas, each shown as a picture of a dollar. If you were to take a ruler and measure the length and width of the dollars from 1958 and 1978, the ratio is roughly 1 to 0.44. However, the area taken by the 1978 dollar is nowhere near half of the area taken by the 1958 dollar. Once again, as was the case with multiple pie charts, the relative areas are supposed to indicate relative quantities.

Slide 54
(No Audio)

Slide 55
As the headings for these two revised graphs show, the message of the previous slide was that during the 1970s there was an increasing positive balance of trade with China, but the trade deficit with Taiwan worsened. Because of the mixed metaphor in the previous graph, that was difficult to discern.

Slide 56
As a final example, here’s a table that relates male versus female life expectancies in different countries. One might think this is a reasonable way to present this information because the countries are listed alphabetically.
What you’re supposed to discern from this table is that females outlive males.

Slide 57
As this alternative table shows, a cross-country comparison is easier when life expectancies are organized from highest to lowest rather than listed alphabetically by country name. Notice the even groupings. The first grouping of six countries shows life expectancies for females of 70 or more. The next grouping shows two countries with life expectancies between 60 and 69. For the last grouping, the life expectancy is between 40 and 49.

Slide 58
Something like this might be more powerful. Women are shown on the left-hand side and men on the right-hand side. These aren’t horizontal bar charts; they’re just indications. The only important thing is the age number in the middle of this display, in which the oldest is at the top and the youngest is at the bottom. This graph shows a rank order and grouping of countries by life expectancy at birth by sex. This is a powerful way to summarize the data in the previous table. The bottom line is to think about the point you’re trying to make and be certain that your graphical displays or tables make that point in a way that’s not deceptive.
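The Slide 53 point about pictures of dollars can be checked with a little arithmetic. Here’s a minimal Python sketch, not part of the original slides; the function names are my own, and the square-root rescaling is one possible fix that I’m assuming, not something the slides propose.

```python
import math


def area_ratio(linear_ratio):
    """Area ratio when both length and width are scaled by linear_ratio."""
    return linear_ratio ** 2


def side_scale_for_area(value_ratio):
    """Linear scale that would make the drawn area equal value_ratio.

    This is an assumed correction, not one shown in the slides.
    """
    return math.sqrt(value_ratio)


# From Slide 52: the 1978 dollar is worth 0.44 of the 1958 dollar.
dollar_1978 = 0.44

print(f"Area of the 1978 picture: {area_ratio(dollar_1978):.2f} of the 1958 picture")
print(f"Side scale for a 0.44 area: {side_scale_for_area(dollar_1978):.2f}")
```

Shrinking both the length and the width to 0.44 leaves a picture with about 0.19 of the original area, so the 1978 dollar looks as if it lost more than 80% of its value when it actually lost 56%.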
