2.1C Describing Location in a Distribution
Name __________________________________
What we know when exploring quantitative data:
1. Plot the data (the graph is usually a dotplot, stemplot, or histogram).
2. Look for and describe the overall pattern (Shape, Center, Spread) and departures from that pattern (Outliers).
3. Calculate a numerical summary to briefly describe center and spread (mean, std dev, 5 number summary, etc.)
What we are going to learn today about exploring quantitative data:
4. For large data sets, the pattern can be so regular that we can describe it by a smooth curve.
Density Curves
Characteristics of density curves:
1. They are always on or above the horizontal axis.
2. The total area underneath the curve is equal to 1.
Density curves describe the overall pattern of a distribution. The area under the curve and above any interval of values
on the horizontal axis is the proportion of all observations that fall in that interval.
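To make the two characteristics and the "area = proportion" idea concrete, here is a minimal Python sketch. The curve it uses is a made-up triangle (height x/2 between 0 and 2), not one from this handout, and the helper names `density` and `area_under` are assumptions chosen only for illustration.

```python
def density(x):
    """Hypothetical density curve: a triangle rising from 0 at x=0 to 1 at x=2."""
    return x / 2 if 0 <= x <= 2 else 0.0


def area_under(f, a, b, steps=10_000):
    """Approximate the area under f between a and b with the trapezoid rule."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width


# Characteristic 1: the curve never dips below the horizontal axis.
never_negative = all(density(i / 100) >= 0 for i in range(-100, 301))

# Characteristic 2: the total area under the curve is 1.
total_area = area_under(density, 0, 2)

# Area above the interval from 1 to 2 = proportion of observations in that interval.
proportion_1_to_2 = area_under(density, 1, 2)

print(never_negative, round(total_area, 4), round(proportion_1_to_2, 4))
```

For this made-up curve, about three quarters of the observations fall between 1 and 2, because that is where most of the area sits.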
Consider a relative frequency histogram: the horizontal axis is broken into classes, and the vertical axis measures the
percentage/proportion of observations in each class. See the example below from a previous class.
This histogram, if you recall, shows the relative frequencies of average points per game scored by the 30 NBA teams in the 09-10 season.
[Figure: Relative Frequency of PTSG. Horizontal axis: PTSG, marked at 90, 95, 100, 105, 110, 115. Vertical axis: relative frequency, from 0 to 0.6.]
What proportion of teams averaged between 100 and <105 points per game?
What proportion of teams averaged between 95 and <105 points per game?
What proportion of teams scored between 100 and <103 points per game?
That last question highlights a limitation of the histogram. By getting a smooth curve, a density curve, to approximate
the data, we are able to ask and answer a more comprehensive set of questions.
Important notes about density curves:
1. Outliers are not described by the density curve.
2. Density curves are approximations. They will never mimic the actual data perfectly; they will, however, be accurate enough for practical use, and are often easier to use.
EX] BATTING AVERAGES
The first histogram below shows the distribution of batting average (proportion of hits) for the 432 Major League
Baseball players with at least 100 plate appearances in the 2009 season. The smooth curve shows the overall shape of
the distribution. In the middle graph, the more heavily shaded bars on the right represent the proportion of players who
had batting averages of at least 0.270. There are 177 such players out of a total of 432, for a proportion of 0.410. In the
third graph below, the area under the curve to the right of 0.270 is shaded more heavily. This area is 0.3974, only
0.0126 away from the actual proportion of 0.410.
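The arithmetic in this example can be checked with a few lines of Python. All of the numbers come from the paragraph above; the variable names are just labels chosen for this sketch.

```python
players_at_least_270 = 177   # players with batting average of at least 0.270
total_players = 432          # players with at least 100 plate appearances
curve_area = 0.3974          # area under the density curve to the right of 0.270

# Actual proportion from the histogram (the data themselves).
actual_proportion = players_at_least_270 / total_players

# Difference between the rounded actual proportion and the curve's approximation.
gap = round(actual_proportion, 3) - curve_area

print(round(actual_proportion, 3), round(gap, 4))
```

The gap is small, which is the point of the example: the density curve is an approximation, but a close one.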
Describing Density Curves
Median of a density curve: A median is a data point that has half the observations on either side. Since the area under
the curve now represents proportions of the total number of observations, the median will cut the total area in half: 0.5
to the left, and 0.5 to the right.
Mean of a density curve: A mean can be described as a balancing point (think about the see-saw analogies I gave you
when we talked about standard deviations the first time, and how the fulcrum of the balanced see-saw was the mean).
The mean does not cut the data in half like the median, because values far away from the fulcrum pull harder!! Density
curves can be skewed, and the mean will be pulled toward the long tail, in the direction of the skew.
Symmetric density curve: A density curve that is perfectly symmetric will have identical mean and median.
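As a rough illustration of how the median and mean of a density curve can differ, the sketch below uses a made-up left-skewed curve (height x/2 between 0 and 2), not one from this handout. It finds the median as the point with area 0.5 to its left, and the balance point as the area under x times the curve's height, which is the standard calculus definition of the mean of a density curve. The function names are assumptions for this example only.

```python
def density(x):
    """Hypothetical left-skewed density curve: f(x) = x/2 on [0, 2], 0 elsewhere."""
    return x / 2 if 0 <= x <= 2 else 0.0


def integrate(f, a, b, steps=10_000):
    """Trapezoid-rule approximation of the area under f from a to b."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width


def median_of(f, low, high, tol=1e-6):
    """Bisection search for the point that has area 0.5 to its left."""
    a, b = low, high
    while b - a > tol:
        mid = (a + b) / 2
        if integrate(f, low, mid) < 0.5:
            a = mid   # not enough area yet, so the median is further right
        else:
            b = mid   # too much area, so the median is further left
    return (a + b) / 2


median = median_of(density, 0, 2)

# Balance point: each value x is weighted by the height of the curve at x.
mean = integrate(lambda x: x * density(x), 0, 2)

print(round(median, 3), round(mean, 3))
```

Here the long tail is on the left, so the mean lands to the left of the median. For a perfectly symmetric curve the two values would coincide.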
Mean and Standard Deviation: Unlike sample data sets, which use x̄ and s_x as the mean and standard deviation, respectively, density curves use μ (mu) and σ (sigma).
Understanding Check
1. Explain why this is a legitimate density curve.
2. About what proportion of observations lie between 7 and 8?
3. Mark the approximate location of the median.
4. Mark the approximate location of the mean. Explain why the median and the mean have the relationship that they do in this case.
Pg 108: 27, 31, 39
Read pages 110-119