Exploration

Q3. By clicking on the data icons on the graph, find which vehicles are at the low and high ends of the
distribution of city mpg.
High end – Toyota Prius (60), Honda Insight manual transmission (60), Honda Insight automatic
transmission (57), Honda Civic Hybrid (47)
Low end – GMC Savanna Passenger (15), GMC Savanna Cargo (15), Chevrolet Passenger (15), Chevrolet
Cargo (15)
Q4. The Volkswagen New Beetle with Automatic transmission is a trendy favorite for many Americans.
By clicking on the case row for this vehicle in the case table, use the graph to describe the New Beetle’s
standing in City mpg relative to the other vehicles.
The Volkswagen New Beetle with Automatic transmission sits toward the higher end of the City mpg
distribution, so it gets better city gas mileage than most of the other vehicles.
Q5. There appears to be a cluster of 4 vehicles with a City mileage above 45. Clicking and dragging a
selection box around those data icons will highlight the vehicles in the case table. Examine these 4 cases
carefully. List two or three attributes these vehicles have in common.
They are all hybrid cars and their annual fuel costs are all comparable and are the lowest out of all the
cars.
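The box selection described in Q5 amounts to filtering cases by an attribute. Here is a minimal sketch of that idea, using only the vehicles and City mpg values listed in the Q3 answer. The record layout and the names `vehicles` and `select_above` are assumptions made for illustration; they are not part of Fathom.

```python
# City mpg values are the ones listed in the Q3 answer. The dictionary layout
# and the helper name below are hypothetical, chosen only for this sketch.
vehicles = [
    {"name": "Toyota Prius", "city_mpg": 60},
    {"name": "Honda Insight manual transmission", "city_mpg": 60},
    {"name": "Honda Insight automatic transmission", "city_mpg": 57},
    {"name": "Honda Civic Hybrid", "city_mpg": 47},
    {"name": "GMC Savanna Passenger", "city_mpg": 15},
    {"name": "GMC Savanna Cargo", "city_mpg": 15},
    {"name": "Chevrolet Passenger", "city_mpg": 15},
    {"name": "Chevrolet Cargo", "city_mpg": 15},
]


def select_above(cases, attribute, threshold):
    """Return the cases whose attribute value is strictly above threshold."""
    return [case for case in cases if case[attribute] > threshold]


# Mimic dragging a selection box around the icons above 45 mpg.
cluster = select_above(vehicles, "city_mpg", 45)
for case in cluster:
    print(case["name"], case["city_mpg"])
```

Once the cases are selected, their other attributes can be compared side by side, which is what the linked case table does when the icons are highlighted.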
Q6. What are the advantages and disadvantages of having the representations dynamically linked when
working with a data set?
The advantage is that a case selected in one representation is highlighted in the others, so you can
make the connection that they show the same data, just represented differently. The disadvantage is
that a mistake made in one representation carries over to all the others.
Q7. The linking of multiple representations in software like Fathom allows one to simultaneously view
the distribution of an entire data set while focusing on individual cases. How might this feature help or
hinder students’ analysis of the data?
It would help the students get the big picture of the data without losing sight of the individual cases
when comparing and contrasting. It might hinder students’ analysis of the data by not being able to look
at the data holistically because they’re too focused on the individual cases.
Q8. Compare the representation of the City data in the three graphs in Figure 3.7. What characteristics
of the distribution are more noticeable or are hidden in each of these plots?
In the dot plot, you can see each data point and how it compares to the other data.
In the box plot, you can see how the data breaks down into quartiles, but you lose where each individual
data point lies except for the outliers.
In the histogram, you can see the shape of the distribution and the frequency of each interval. Outliers
aren’t as prominent in this representation as they are in the box plot. Since the data is grouped into
intervals, the individual values are hidden and the apparent shape depends on the interval width chosen,
so the data can be a little misrepresented.
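To see how grouping into intervals hides individual values, here is a small sketch that counts the same values with two different interval widths. The sample is an assumption: only the four 15s and the values 47, 57, 60, 60 come from the Q3 answer, and the middle values are invented for illustration. The helper `bin_counts` is also hypothetical.

```python
from collections import Counter

# Hypothetical sample: the end values match the Q3 answer, the middle values
# are made up for illustration and are NOT the real data set.
city_mpg = [15, 15, 15, 15, 17, 18, 19, 20, 21, 22, 24, 27, 31, 47, 57, 60, 60]


def bin_counts(values, start, width):
    """Count values per interval [start + k*width, start + (k+1)*width)."""
    counts = Counter((v - start) // width for v in values)
    return {start + k * width: counts[k] for k in sorted(counts)}


narrow = bin_counts(city_mpg, start=15, width=5)
wide = bin_counts(city_mpg, start=15, width=10)

print("width 5: ", narrow)
print("width 10:", wide)
```

With the wider intervals, most of the values fall into one tall bar, and it is no longer possible to tell that four of them are exactly 15. A dot plot of the same values would keep each one visible.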
Q9. By only examining the graphs, what would you characterize as a typical range of City mpg for these
automobiles?
Based on the graphs, I’d say the typical City mpg ranges from about 15 mpg to about 38 or 39 mpg,
because that’s where the biggest cluster is, no matter the representation.
Q10. How can examining a distribution using three different linked graphical representations be a help
or hindrance for students?
It could be a hindrance if they don’t understand what each graph affords them. They could be misled by
one graph and then get confused and interpret the other graphs incorrectly.
Q11. How could students use the box plot to describe the center and spread of the City mpg?
They could describe the center using the median and the spread using the quartiles, and they could note
that the data is skewed right because the majority of the data is more towards the left.
Q12. Describe how you could help students understand why the median is not located in the center of
the middle 50% of the data.
The median is found by putting the data in order from least to greatest (or greatest to least) and
crossing off values from both ends until you reach the “middle” value. This works for a data set with an
odd number of values; if the data set has an even number of values, you take the average of the middle
two values. The median splits the data by count, not by distance. The data values are not evenly
spaced, so the quarter of the data just below the median can cover a shorter interval than the quarter
just above it, and this is why the median isn’t located in the center of the middle 50% of the data.
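One way to make this concrete for students is to compute the quartiles of a small skewed data set and compare the distances on each side of the median. The sketch below uses Python's `statistics` module. The sample is an assumption: only the end values come from the Q3 answer, and the middle values are invented. Fathom may also compute quartiles by a slightly different rule than the "inclusive" method used here.

```python
import statistics

# Hypothetical right-skewed sample: the end values match the Q3 answer, the
# middle values are made up for illustration and are NOT the real data set.
city_mpg = [15, 15, 15, 15, 17, 18, 19, 20, 21, 22, 24, 27, 31, 47, 57, 60, 60]

# Cut points that split the sorted data into four equal-count groups.
q1, median, q3 = statistics.quantiles(city_mpg, n=4, method="inclusive")

lower_half_of_box = median - q1  # width of the box below the median
upper_half_of_box = q3 - median  # width of the box above the median

print("Q1 =", q1, "median =", median, "Q3 =", q3)
print("box width below median:", lower_half_of_box)
print("box width above median:", upper_half_of_box)
```

Each side of the box holds the same share of the cases, but the values above the median are more spread out, so that side of the box is wider and the median line sits off-center.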
Q13. Drag the vertical line to a location in the plot that you believe serves as a visual fulcrum to balance
the data values in the distribution. How did you decide where to place the line?
The outliers skew the data, so I placed the balancing line in the third quartile, because I feel it
gives a better representation of the spread of the data.
Without the outliers, I would put the line closer to the median, because the data would no longer be
skewed and the median would represent it better.
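The fulcrum idea in Q13 can be checked numerically: the balance point of a distribution is the mean, because the deviations on either side of it cancel out. The sketch below is an illustration only. The sample is an assumption, with the end values taken from the Q3 answer and the middle values invented.

```python
import statistics

# Hypothetical right-skewed sample: the end values match the Q3 answer, the
# middle values are made up for illustration and are NOT the real data set.
city_mpg = [15, 15, 15, 15, 17, 18, 19, 20, 21, 22, 24, 27, 31, 47, 57, 60, 60]

mean = statistics.mean(city_mpg)
median = statistics.median(city_mpg)


def net_deviation(values, point):
    """Sum of signed distances from point; zero means the values balance there."""
    return sum(v - point for v in values)


balance_at_mean = net_deviation(city_mpg, mean)
balance_at_median = net_deviation(city_mpg, median)

print("mean:", round(mean, 2), "net deviation:", round(balance_at_mean, 9))
print("median:", median, "net deviation:", balance_at_median)
```

In this sample the net deviation is essentially zero at the mean and positive at the median, so a line placed at the median would have more "weight" to its right. The high values pull the balance point above the median.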
Q14. How did your visual estimate of the distribution’s center compare with the computed measures of
mean and median? Why might your estimate have been different?
They were pretty close. My estimate differed because I was eyeballing the data, while the software
computed the measures from the actual values.