MAT 155 Key Concept 155S3.4o3 Measures of Relative Standing and Boxplots August 31, 2011

155S3.4o3 Measures of Relative Standing and Boxplots
MAT 155
Dr. Claude Moore
Cape Fear Community College
Chapter 3
Statistics for Describing, Exploring, and Comparing Data
3-1 Review and Preview
3-2 Measures of Center
3-3 Measures of Variation
3-4 Measures of Relative Standing and Boxplots
Z score
August 31, 2011
Key Concept
Measures of relative standing, which are numbers showing the location of data values relative to the other values within a data set, can be used to compare values from different data sets, or to compare values within the same data set. The most important concept is the z score. We will also discuss percentiles and quartiles, as well as a new statistical graph called the boxplot.
Interpreting Z Scores
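• z Score (or standardized value): the number of standard deviations that a given value x is above or below the mean.
Sample: z = (x – x̄) / s, where x̄ is the sample mean and s is the sample standard deviation
Population: z = (x – μ) / σ, where μ is the population mean and σ is the population standard deviation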
Round z scores to 2 decimal places
Whenever a value is less than the mean, its corresponding z score is negative
Ordinary values: –2 ≤ z score ≤ 2
Unusual Values: z score < –2 or z score > 2
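The z score rules above can be sketched in a few lines of Python. This is only an illustration: the function names z_score and is_unusual are made up for this sketch, and the numbers come from the tallest-man exercise later in this handout (height 92.95 in., mean 69.6 in., standard deviation 2.8 in.).

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean,
    rounded to 2 decimal places as the slide recommends."""
    return round((x - mean) / std_dev, 2)


def is_unusual(z):
    """Unusual values have z < -2 or z > 2; ordinary values satisfy -2 <= z <= 2."""
    return z < -2 or z > 2


# Height data taken from the tallest-man exercise in this handout
z = z_score(92.95, 69.6, 2.8)
print(z, is_unusual(z))  # 8.34 True
```

A value below the mean gives a negative z score, so the same two functions cover both tails.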
Percentiles are measures of location. There are 99 percentiles denoted P1, P2, . . . P99, which divide a set of data into 100 groups with about 1% of the values in each group.
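Going the other way, from a data value to its percentile, is not spelled out on the slide. A minimal sketch, assuming the common textbook convention (percentile of x = number of values less than x, divided by the total number of values, times 100, rounded to a whole number), might look like this. The function name percentile_of_value is hypothetical; the data are the 24 sorted Super Bowl point totals listed in the exercises of this handout.

```python
def percentile_of_value(values, x):
    """Percentile of x, ASSUMING the convention
    (number of values less than x) / (total number of values) * 100,
    rounded to the nearest whole number."""
    below = sum(1 for v in values if v < x)
    return round(100 * below / len(values))


# Sorted Super Bowl point totals from the exercises in this handout
super_bowl_points = [36, 37, 37, 39, 39, 41, 43, 44, 44, 47, 50, 53,
                     54, 55, 56, 56, 57, 59, 61, 61, 65, 69, 69, 75]

# 14 of the 24 values are less than 56, and 14/24 * 100 = 58.3
print(percentile_of_value(super_bowl_points, 56))  # 58
```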
Converting from the kth Percentile to the Corresponding Data Value
n = total number of values in the data set
k = percentile being used
L = locator that gives the position of a value
Pk = kth percentile
L = (k / 100) · n
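A minimal sketch of the conversion, assuming the locator L = (k / 100) · n and the usual textbook rule for using it (that rule is an assumption here, since the slide's flowchart did not survive): if L is a whole number, Pk is midway between the Lth value and the next value in the sorted list; otherwise L is rounded up and Pk is the value in that position. The function name percentile_value is hypothetical; the data are the sorted Super Bowl point totals from the exercises in this handout.

```python
def percentile_value(sorted_values, k):
    """Return Pk for a whole-number k from 1 to 99.

    ASSUMED rule: L = (k / 100) * n; if L is a whole number, average the
    Lth and (L+1)th sorted values; otherwise round L up and take that value.
    """
    n = len(sorted_values)
    whole, remainder = divmod(k * n, 100)  # integer arithmetic avoids float error
    if remainder == 0:
        # L is a whole number: midway between the Lth value and the next one
        return (sorted_values[whole - 1] + sorted_values[whole]) / 2
    # L is not whole: rounding up gives position whole + 1, which is index `whole`
    return sorted_values[whole]


super_bowl_points = [36, 37, 37, 39, 39, 41, 43, 44, 44, 47, 50, 53,
                     54, 55, 56, 56, 57, 59, 61, 61, 65, 69, 69, 75]

print(percentile_value(super_bowl_points, 25))  # L = 6, so (41 + 43) / 2 = 42.0
print(percentile_value(super_bowl_points, 90))  # L = 21.6, round up to 22, value 69
```

Software packages often use other interpolation rules, so their percentiles can differ slightly from values found this way.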
Quartiles
Quartiles are measures of location, denoted Q1, Q2, and Q3, which divide a set of data into four groups with about 25% of the values in each group.
• Q1 (First Quartile) separates the bottom 25% of sorted values from the top 75%.
• Q2 (Second Quartile) same as the median; separates the bottom 50% of sorted values from the top 50%.
• Q3 (Third Quartile) separates the bottom 75% of sorted values from the top 25%.
Quartiles
Q1, Q2, Q3 divide ranked scores into four equal parts
Some Other Statistics
Interquartile Range (or IQR): Q3 – Q1
Semi-interquartile Range: (Q3 – Q1) / 2
Midquartile: (Q3 + Q1) / 2
10-90 Percentile Range: P90 – P10
5-Number Summary
For a set of data, the 5-number summary consists of
1. the minimum value;
2. the first quartile, Q1;
3. the median (or second quartile, Q2);
4. the third quartile, Q3; and
5. the maximum value.
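The statistics above can be computed together. The sketch below is illustrative only: the helper names percentile_value and summarize are hypothetical, and the quartiles are found as P25, P50, and P75 under an assumed locator rule (L = (k / 100) · n; average two neighbors when L is whole, otherwise round L up). The data are the sorted Super Bowl point totals from the exercises in this handout.

```python
def percentile_value(sorted_values, k):
    """Pk under the ASSUMED locator rule L = (k / 100) * n."""
    whole, remainder = divmod(k * len(sorted_values), 100)
    if remainder == 0:
        return (sorted_values[whole - 1] + sorted_values[whole]) / 2
    return sorted_values[whole]


def summarize(sorted_values):
    """5-number summary plus the other range statistics listed on the slide."""
    q1, q2, q3 = (percentile_value(sorted_values, k) for k in (25, 50, 75))
    p10 = percentile_value(sorted_values, 10)
    p90 = percentile_value(sorted_values, 90)
    return {
        "five_number_summary": (sorted_values[0], q1, q2, q3, sorted_values[-1]),
        "iqr": q3 - q1,
        "semi_iqr": (q3 - q1) / 2,
        "midquartile": (q3 + q1) / 2,
        "p10_p90_range": p90 - p10,
    }


super_bowl_points = [36, 37, 37, 39, 39, 41, 43, 44, 44, 47, 50, 53,
                     54, 55, 56, 56, 57, 59, 61, 61, 65, 69, 69, 75]

summary = summarize(super_bowl_points)
print(summary["five_number_summary"])  # (36, 42.0, 53.5, 60.0, 75)
print(summary["iqr"])                  # 18.0
```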
Boxplot
A boxplot (or box-and-whisker diagram) is a graph of a data set that consists of a line extending from the minimum value to the maximum value, and a box with lines drawn at the first quartile, Q1; the median; and the third quartile, Q3.
Boxplot of Movie Budget
Boxplots - Normal Distribution: Heights from a Simple Random Sample of Women
Boxplots - Skewed Distribution: Salaries (in thousands of dollars) of NCAA Football Coaches
Outliers
An outlier is a value that lies very far away from the vast majority of the other values in a data set.
Important Principles
• An outlier can have a dramatic effect on the mean.
• An outlier can have a dramatic effect on the standard deviation.
• An outlier can have a dramatic effect on the scale of the histogram so that the true nature of the distribution is totally obscured.
Modified Boxplots
Boxplots described earlier are called skeletal (or regular) boxplots.
Some statistical packages provide modified boxplots, which represent outliers as special points.
Outliers for Modified Boxplots
For purposes of constructing modified boxplots, we can consider outliers to be data values meeting specific criteria.
In modified boxplots, a data value is an outlier if it is
• above Q3 by an amount greater than 1.5 × IQR
or
• below Q1 by an amount greater than 1.5 × IQR
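The 1.5 × IQR criterion translates directly into two cutoff values. In this sketch the function names outlier_fences and find_outliers are hypothetical, and the quartiles are passed in by the caller. The example uses the Super Bowl point totals from the exercises in this handout with Q1 = 42 and Q3 = 60, which assumes the quartiles were found with the textbook locator method.

```python
def outlier_fences(q1, q3):
    """Cutoffs beyond which a value counts as an outlier in a modified boxplot."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Values below Q1 or above Q3 by MORE than 1.5 x IQR (strict inequality)."""
    low, high = outlier_fences(q1, q3)
    return [v for v in values if v < low or v > high]


super_bowl_points = [36, 37, 37, 39, 39, 41, 43, 44, 44, 47, 50, 53,
                     54, 55, 56, 56, 57, 59, 61, 61, 65, 69, 69, 75]

# Q1 = 42 and Q3 = 60 are ASSUMED quartiles for this data set
print(outlier_fences(42, 60))                    # (15.0, 87.0)
print(find_outliers(super_bowl_points, 42, 60))  # []
```

Every point total lies between 15 and 87, so this data set has no outliers under the criterion.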
Modified Boxplot Construction
A modified boxplot is constructed with these specifications:
• A special symbol (such as an asterisk) is used to identify outliers.
• The solid horizontal line extends only as far as the minimum data value that is not an outlier and the maximum data value that is not an outlier.
Modified Boxplots - Example
Pulse rates of females listed in Data Set 1 in Appendix B.
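The construction rules for a modified boxplot can be sketched as a function that returns the pieces to draw: the box, the ends of the horizontal line, and the points to mark with a special symbol. The name modified_boxplot_parts and the dictionary keys are hypothetical, and the quartiles are supplied by the caller. The pulse-rate data are not reproduced in this handout, so the example uses the Super Bowl point totals from the exercises, with assumed quartiles Q1 = 42, median = 53.5, and Q3 = 60.

```python
def modified_boxplot_parts(values, q1, median, q3):
    """Pieces of a modified boxplot, given the data and its quartiles."""
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Values within the fences are not outliers
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "box": (q1, median, q3),
        # The line extends only to the most extreme values that are not outliers
        "line_from": min(inside),
        "line_to": max(inside),
        # Outliers are drawn as special symbols (such as asterisks)
        "outliers": [v for v in values if v < low_fence or v > high_fence],
    }


super_bowl_points = [36, 37, 37, 39, 39, 41, 43, 44, 44, 47, 50, 53,
                     54, 55, 56, 56, 57, 59, 61, 61, 65, 69, 69, 75]

parts = modified_boxplot_parts(super_bowl_points, 42, 53.5, 60)
print(parts["line_from"], parts["line_to"], parts["outliers"])  # 36 75 []
```

When a data set has no outliers, as here, the modified boxplot looks the same as the skeletal boxplot.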
3-4 Measures of Relative Standing and Boxplots
In this section we have discussed:
• z Scores
• z Scores and unusual values
• Percentiles
• Quartiles
• Converting a percentile to corresponding data values
• Other statistics
• 5-number summary
• Boxplots and modified boxplots
• Effects of outliers
3-4 Measures of Relative Standing and Boxplots
Always consider certain key factors:
• Context of the data
• Source of the data
• Sampling Method
• Measures of Center
• Measures of Variation
• Distribution
• Outliers
• Changing patterns over time
• Conclusions
• Practical Implications
132/3. Boxplots. Shown below is a STATDISK-generated boxplot of the durations (in hours) of flights of NASA’s Space Shuttle. What do the values of 0, 166, 215, 269, and 423 tell us?
TI: page 130
132/4. Boxplot Comparisons. Refer to the two STATDISK-generated boxplots shown below that are drawn on the same scale. One boxplot represents weights of randomly selected men and the other represents weights of randomly selected women. Which boxplot represents women? How do you know? Which boxplot depicts weights with more variation?
TI: page 130
133/8. z Score for World’s Tallest Man. Bao Xishun is the world’s tallest man with a height of 92.95 in. (or 7 ft, 8.95 in.). Men have heights with a mean of 69.6 in. and a standard deviation of 2.8 in.
a. What is the difference between Bao’s height and the mean height of men?
b. How many standard deviations is that (the difference found in part (a))?
c. Convert Bao’s height to a z score.
d. Does Bao’s height meet the criterion of being unusual by corresponding to a z score that does not fall between –2 and 2?
133/10. z Scores for Heights of Women Soldiers. The U.S. Army requires women’s heights to be between 58 in. and 80 in. Women have heights with a mean of 63.6 in. and a standard deviation of 2.5 in. Find the z score corresponding to the minimum height requirement and find the z score corresponding to the maximum height requirement. Determine whether the minimum and maximum heights are unusual.
Percentiles. In Exercises 15–18, use the given sorted values, which are the numbers of points scored in the Super Bowl for a recent period of 24 years. Find the percentile corresponding to the given number of points.
36 37 37 39 39 41 43 44 44 47 50 53 54 55 56 56 57 59 61 61 65 69 69 75
133/16. 65
133/18. 41
In Exercises 19–26, use the same list of 24 sorted values given for Exercises 15–18. Find the indicated percentile or quartile.
36 37 37 39 39 41 43 44 44 47 50 53 54 55 56 56 57 59 61 61 65 69 69 75
134/22. P80
134/26. P95
134/28. Boxplot for Number of English Words. A simple random sample of pages from Merriam-Webster’s Collegiate Dictionary, 11th edition, was obtained. Listed below are the numbers of defined words on those pages, and they are arranged in order. Construct a boxplot and include the values of the 5-number summary.
34 36 39 43 51 53 62 63 73 79
134/29. Boxplot for FICO Scores. A simple random sample of FICO credit rating scores was obtained, and the sorted scores are listed below. Construct a boxplot and include the values of the 5-number summary. S32B
664 693 698 714 751 753 779 789 802 818 834 836