Frequency distributions and graphs

Statistics
Summary 3: Frequency distributions and graphs
Frequency distribution table: a table in which the data values are grouped in classes and the
frequency, or number of cases, that falls in each class is recorded. Note: $\sum_{i=1}^{k} f_i = n$,
where k is the number of classes and n is the number of values. The lowest value
that can fall within a particular class is called the lower class limit and the highest value that can fall
within a particular class is called the upper class limit.
Range: the range of the distribution of values is the difference between the maximum and minimum
values of the distribution.
Relative frequency: divide each class frequency by n. Relative frequency of a class $= \dfrac{f_i}{\sum f}$
Percentage distribution: divide each class frequency by n and multiply by 100.
Class percentage $= \dfrac{f_i}{\sum f} \times 100$
Ungrouped frequency distribution: a frequency distribution in which each class consists of a single
observation.
Grouped frequency distribution: a distribution in which one or more classes contain more than one
observation. In this case the classes are called class intervals. There is no general prescription as to
the number of classes; anywhere from 6 to 20 is acceptable.
Class boundaries or true limits: between the upper limit of a class and the lower limit of the next
class there is a numerical gap. Subtracting half the gap from the lower limits of the classes gives the lower
boundaries. Adding half the gap to the upper limits of the classes gives the upper boundaries of the
classes. Equivalently, each class boundary is the midpoint between the upper limit of one class and the
lower limit of the next class.
Class mark: the midpoint value of the class interval. To find the class mark, add the class limits or
boundaries and divide by 2.
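The definitions of class boundaries and class marks can be checked with a short Python sketch. It uses the class limits from the salary problem later in this summary; the variable names are illustrative, and rounding to two decimals is an assumption made only to hide floating-point noise.

```python
# Class limits from the salary problem in this summary (thousands of dollars).
lower_limits = [40.0, 43.0, 46.0, 49.0, 52.0]
upper_limits = [42.9, 45.9, 48.9, 51.9, 54.9]

# Gap between the upper limit of one class and the lower limit of the next.
gap = round(lower_limits[1] - upper_limits[0], 2)

# Lower boundary = lower limit - gap/2; upper boundary = upper limit + gap/2.
boundaries = [(round(lo - gap / 2, 2), round(hi + gap / 2, 2))
              for lo, hi in zip(lower_limits, upper_limits)]

# Class mark = midpoint of the class limits.
class_marks = [round((lo + hi) / 2, 2)
               for lo, hi in zip(lower_limits, upper_limits)]

# Class width = upper boundary - lower boundary.
widths = [round(upper - lower, 2) for lower, upper in boundaries]

for limits, bounds, mark, width in zip(zip(lower_limits, upper_limits),
                                       boundaries, class_marks, widths):
    print(limits, bounds, mark, width)
```

Computing the class mark from the boundaries instead of the limits gives the same midpoints, since both ends move outward by the same half gap.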
Class width: the width of a class is given by the difference between the upper and lower boundaries of
the class. It can also be found by subtracting the lower limit of one class from the lower limit of the next
class.
The approximate value of the class width $= \dfrac{\text{Maximum value} - \text{Minimum value}}{\text{Number of classes}}$
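As a quick check, the formula can be applied to figures that appear in the salary problem of this summary (maximum 54.9, minimum 40.0, five classes). The result is close to the class width of 3 thousand dollars used in that problem:

```latex
\[
\text{class width} \approx \frac{54.9 - 40.0}{5} = \frac{14.9}{5} = 2.98 \approx 3
\]
```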
Statistical Graphs.
Histogram: a graphical representation of a frequency distribution. A frequency histogram consists of
adjacent rectangles in which the class boundaries are marked on the horizontal axis and the frequencies
are represented by the heights of the rectangles.
In a relative frequency histogram, the heights of the rectangles indicate the relative frequency of each
class.
In a percentage histogram, the heights of the rectangles indicate the percentage of values
which fall within each class.
Shapes of histograms:
Positively skewed distribution or skewed to the right: a distribution that tails to the right. In this
case the mean is greater than the median.
Negatively skewed distribution or skewed to the left: a distribution that tails to the left. In this case
the mean is smaller than the median.
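A minimal sketch of the mean and median relationship for skewed data follows. Both data sets are hypothetical and chosen only so that one has a long right tail and the other a long left tail; they illustrate the typical pattern rather than prove a general rule.

```python
from statistics import mean, median

# Hypothetical data with a long tail to the right (one large value).
right_skewed = [1, 2, 2, 3, 3, 4, 15]
# Hypothetical data with a long tail to the left (one small value).
left_skewed = [1, 12, 13, 13, 14, 14, 15]

for name, values in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(f"{name}: mean = {mean(values):.2f}, median = {median(values)}")
```

The extreme value pulls the mean toward the tail, while the median stays near the bulk of the data.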
Frequency polygons. A graph formed by joining the midpoints of the tops of successive bars in a
histogram with straight lines is called a frequency polygon. To draw a frequency polygon, mark a dot
above the midpoint of each class at a height equal to the frequency of that class, including classes of
zero frequency at the beginning and at the end. Connect the dots with line segments.
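The construction can be sketched in Python by listing the points to connect. The class marks, frequencies and class width below come from the salary problem later in this summary; the code only computes the points and leaves the drawing to any plotting tool.

```python
# Class marks and frequencies from the salary problem in this summary.
class_marks = [41.45, 44.45, 47.45, 50.45, 53.45]
frequencies = [2, 6, 6, 5, 11]
width = 3.0  # class width in thousands of dollars

# Add one zero-frequency class before the first class and one after the last,
# so that the polygon starts and ends on the horizontal axis.
xs = [round(class_marks[0] - width, 2)] + class_marks + [round(class_marks[-1] + width, 2)]
ys = [0] + frequencies + [0]

points = list(zip(xs, ys))
for x, y in points:
    print(x, y)
```

Joining consecutive points with line segments gives the frequency polygon.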
Less than cumulative frequency distribution.
A less than cumulative frequency distribution gives the total number of values that fall below the upper
boundary of each class.
More than cumulative frequency distribution.
A more than cumulative frequency distribution gives the total number of values that fall above the lower
boundary of each class.
Ogive. The graph of a cumulative frequency distribution is called an ogive. It is obtained by drawing the
frequency polygon of the cumulative frequency distribution.
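Both cumulative distributions can be obtained as running totals of the class frequencies. The sketch below uses the frequencies and class boundaries from the salary problem later in this summary; the variable names are illustrative.

```python
from itertools import accumulate

# Boundaries and frequencies from the salary problem in this summary.
lower_boundaries = [39.95, 42.95, 45.95, 48.95, 51.95]
upper_boundaries = [42.95, 45.95, 48.95, 51.95, 54.95]
frequencies = [2, 6, 6, 5, 11]

# "Less than": running total from the first class up to each upper boundary.
less_than = list(accumulate(frequencies))

# "More than": running total from the last class down to each lower boundary.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for boundary, total in zip(upper_boundaries, less_than):
    print(f"less than {boundary}: {total}")
for boundary, total in zip(lower_boundaries, more_than):
    print(f"more than {boundary}: {total}")
```

Plotting each total against its boundary and joining the points gives the corresponding ogive.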
Problem: The K and R Personnel Service reported that annual salaries for department store
assistant managers range from $40,000 to $55,000. Assume that the following data are a sample
of the annual salaries for department store assistant managers (data are in thousands of dollars).
52.0  43.4  53.2  50.8  48.1  47.6  54.9  52.3  44.6  51.0
45.0  51.6  53.8  48.1  52.3  46.0  54.6  46.6  43.7  45.6
45.4  47.5  52.3  54.3  40.0  41.8  52.3  52.5  50.4  49.9
a) Find the range of the distribution.
Answer: Maximum value - Minimum value = 54.9 - 40.0 = 14.9
The range is 14.9 thousand dollars.
b) The list of values is to be distributed in classes of width 3 thousand dollars where the lower
limit of the first class is 40.0 thousand dollars.
Find all the lower limits and all the upper limits.
Answers. Lower limits are 40.0, 43.0, 46.0, 49.0, 52.0
Upper limits are 42.9, 45.9, 48.9, 51.9, 54.9
c) Find n, the number of values of the list.
Answer. n = 30 values
d) Tally, to find the frequency of values that corresponds to each class.
Answer. Class frequencies are: 2, 6, 6, 5, 11.
e) Use the answers of parts a, b, c, d to complete the following table.
Class limits   Tally   Class width   Frequency   Relative frequency   Class percentage   Class mark   Class boundaries
Answer:
Class limits   Tally           Class width   Frequency f   Relative frequency   Class percentage   Class mark   Class boundaries
40.0-42.9      ||              3             2             2/30 = 0.067         6.7%               41.45        39.95-42.95
43.0-45.9      ||||| |         3             6             6/30 = 0.2           20%                44.45        42.95-45.95
46.0-48.9      ||||| |         3             6             6/30 = 0.2           20%                47.45        45.95-48.95
49.0-51.9      |||||           3             5             5/30 = 0.167         16.7%              50.45        48.95-51.95
52.0-54.9      ||||| ||||| |   3             11            11/30 = 0.367        36.7%              53.45        51.95-54.95
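Parts c), d) and the frequency columns of part e) can be reproduced with a short Python sketch. It assumes every value is at least the first lower limit (true for this sample) and assigns each value to the last class whose lower limit it reaches; the variable names are illustrative.

```python
from bisect import bisect_right

# Salary data from the problem above, in thousands of dollars.
data = [52.0, 43.4, 53.2, 50.8, 48.1, 47.6, 54.9, 52.3, 44.6, 51.0,
        45.0, 51.6, 53.8, 48.1, 52.3, 46.0, 54.6, 46.6, 43.7, 45.6,
        45.4, 47.5, 52.3, 54.3, 40.0, 41.8, 52.3, 52.5, 50.4, 49.9]
lower_limits = [40.0, 43.0, 46.0, 49.0, 52.0]

n = len(data)  # part c)

# Part d): tally each value into the last class whose lower limit it reaches.
frequencies = [0] * len(lower_limits)
for value in data:
    frequencies[bisect_right(lower_limits, value) - 1] += 1

# Part e): relative frequency f/n and class percentage 100*f/n.
relative = [round(f / n, 3) for f in frequencies]
percentage = [round(100 * f / n, 1) for f in frequencies]

print(n, frequencies)
print(relative)
print(percentage)
```

The frequencies add up to n, which is a useful check that no value was missed or counted twice.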
Stem-and-leaf displays. Each value of a quantitative data set is divided into two portions, a stem and a leaf. The leaves for
each stem are shown separately in a display. An advantage of a stem-and-leaf display over a frequency
distribution is that no information on individual observations is lost.
Example.
Construct a stem-and-leaf display for the following data. Use the first digit as the stem. Arrange the leaves
for each stem in increasing order.
14 16 21 23 18
24 32 21 19 33 38 15 18 35 27
22 25 31 14 17
36 25 20 29 34 11 18 32 14 23
Stem-and-leaf
Stem | Leaf
1    | 1 4 4 4 5 6 7 8 8 8 9
2    | 0 1 1 2 3 3 4 5 5 7 9
3    | 1 2 2 3 4 5 6 8
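The display for this example can be built with a short Python sketch. It assumes two-digit values, so the tens digit is the stem and the units digit is the leaf; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

# Data from the example above.
data = [14, 16, 21, 23, 18, 24, 32, 21, 19, 33, 38, 15, 18, 35, 27,
        22, 25, 31, 14, 17, 36, 25, 20, 29, 34, 11, 18, 32, 14, 23]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its leaves (units digits) in increasing order."""
    leaves = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        leaves[stem].append(leaf)
    return dict(leaves)

display = stem_and_leaf(data)
for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```

Because every leaf is kept, the original values can be read back from the display, which is the advantage over a grouped frequency table noted earlier.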
Example.
Construct a stem-and-leaf display for the following data. Use the first digit as the stem (single-digit values have stem 0). Arrange the leaves
for each stem in increasing order. Condense the stem-and-leaf display by grouping the stems as 0-2, 3-5,
6-9.
14 16 21 23 18 41 53 24 32 73  9 85 41 76
19 62 33 38 15 18 35 48 52 40 29 34  4 11
77 65  3 67 71 27 18 32 98 22 25  8 31 14
17 45 53 36 25 83 20 62  7 32  4 14 23  6
Stem-and-leaf (leaves not yet ordered; an asterisk separates the leaves of consecutive stems within a group)
Stem | Leaf
0-2  | 9 3 8 4 7 4 6 * 4 6 8 9 5 8 4 7 1 8 4 * 1 3 4 7 2 5 5 0 9 3
3-5  | 2 3 8 5 1 6 4 2 2 * 1 1 8 0 5 * 3 2 3
6-9  | 2 5 7 2 * 3 6 1 7 * 5 3 * 8
Stem-and-leaf (leaves in increasing order)
Stem | Leaf
0-2  | 3 4 4 6 7 8 9 * 1 4 4 4 5 6 7 8 8 8 9 * 0 1 2 3 3 4 5 5 7 9
3-5  | 1 2 2 2 3 4 5 6 8 * 0 1 1 5 8 * 2 3 3
6-9  | 2 2 5 7 * 1 3 6 7 * 3 5 * 8
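A condensed display of this kind can be sketched in Python as follows. The data are the values of this example in the order they appear in the text, the stem groups are the ones requested (0-2, 3-5, 6-9), and the function name `condensed_rows` is illustrative.

```python
# Data from the example above (order does not affect the ordered display).
data = [14, 16, 21, 23, 18, 41, 53, 24, 32, 73, 9, 85, 41, 76,
        19, 62, 33, 38, 15, 18, 35, 48, 52, 40, 29, 34, 4, 11,
        77, 65, 3, 67, 71, 27, 18, 32, 98, 22, 25, 8, 31, 14,
        17, 45, 53, 36, 25, 83, 20, 62, 7, 32, 4, 14, 23, 6]

groups = [(0, 2), (3, 5), (6, 9)]  # grouped stems

def condensed_rows(values, stem_groups):
    """Return one row per stem group; ' * ' separates consecutive stems."""
    rows = {}
    for low, high in stem_groups:
        parts = []
        for stem in range(low, high + 1):
            leaves = sorted(v % 10 for v in values if v // 10 == stem)
            parts.append(" ".join(str(leaf) for leaf in leaves))
        rows[f"{low}-{high}"] = " * ".join(parts)
    return rows

for label, leaves in condensed_rows(data, groups).items():
    print(label, "|", leaves)
```

The asterisks are what keep a condensed display lossless: without them the reader could not tell which stem a leaf belongs to.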
Problem: A small college in the state of Missouri has 600 students enrolled in its freshman
class. 360 of the students enrolled are males. Professor Clark uses his History class of 120
students in order to make some inferences about the college.
a) Describe the population of the study.
b) Describe the population of female students.
c) What is the proportion of male students in the college?
d) Is the proportion of male students in the college a parameter or a statistic?
e) Based on the given information, predict the number of male students in Professor Clark's
history class.
f) Professor Clark finds out that 40% of the students in his history class are education majors.
Does the 40% of education majors in his class represent a parameter or a statistic?
g) If Professor Clark's class is unbiased, that is, representative of the college, predict the
number of education major students in the college.
Answers:
a) Describe the population of the study.
Answer: The 600 students enrolled in a small college in the state of Missouri
b) Describe the population of female students.
Answer: The 240 female students enrolled in the small college in the state of Missouri.
c) What is the proportion of male students in the college?
Answer: $\dfrac{360}{600} = 0.6$
d) Is the proportion of male students in the college a parameter or a statistic?
Answer: A parameter, because it describes the whole population.
e) Based on the given information, predict the number of male students in Professor Clark's
history class.
Answer: (0.6)(120) = 72
f) Professor Clark finds out that 40% of the students in his history class are education majors.
Does the 40% of education majors in his class represent a parameter or a statistic?
Answer: A statistic, because it describes a sample.
g) If Professor Clark’s class is unbiased, that is, is representative of the college, predict the
number of education major students in the college?
Answer: 40% of 600 = 240
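The arithmetic in parts c), e) and g) can be checked with a short Python sketch. It uses exact fractions to avoid floating-point rounding; the variable names are illustrative, and the predictions rest on the problem's assumption that the class is representative of the college.

```python
from fractions import Fraction

college_size = 600   # population size
college_males = 360  # males in the population
class_size = 120     # Professor Clark's history class (the sample)

# Part c): population proportion of males (a parameter).
p_male = Fraction(college_males, college_size)

# Part e): predicted number of males in the class.
predicted_males_in_class = p_male * class_size

# Part g): sample proportion of education majors (a statistic),
# scaled up to the college under the unbiasedness assumption.
p_education = Fraction(40, 100)
predicted_education_in_college = p_education * college_size

print(float(p_male), predicted_males_in_class, predicted_education_in_college)
```

Part e) carries a parameter down to the sample, while part g) carries a statistic up to the population; both use the same proportional reasoning.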