[2] FREQUENCY DISTRIBUTIONS AND GRAPHS
Prepared by: CARLOS I. GIL
2.0] Describe frequency distributions… Univariate/Bivariate/Multivariate Distributions
Univariate Frequency Distributions
2.1] A group of 25 individuals were asked what make of vehicle they drive. The 25 recorded
responses were: Nissan, Toyota, Honda, Lexus, Ford, Nissan, Toyota, Nissan, Toyota, Ford,
Honda, Nissan, Toyota, Nissan, Honda, Toyota, Ford, Nissan, Honda, Toyota, Ford, Toyota,
Lexus, Honda, Toyota. Construct the frequency distribution for these data.
2.2] A group of 21 people were asked about their beverage preferences. Recorded responses: coffee,
soda, tea, water, orange juice, coffee, soda, water, coffee, soda, water, coffee, soda, orange juice,
water, tea, water, soda, water, orange juice, water. Construct the frequency distribution for these data.
2.3] 18 families were asked how many children they have. The recorded data were: 2, 1, 3, 2, 0, 1, 2,
0, 1, 3, 0, 2, 1, 0, 1, 4, 1, 2. Construct the frequency distribution for these data.
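As a way to check a hand tally for items like 2.1] to 2.3], the counting step can be sketched in Python. This is only an illustration using the responses of item 2.1]; the function name frequency_distribution is a hypothetical helper, not part of the worksheet.

```python
from collections import Counter

# Responses from item 2.1] (vehicle makes reported by 25 individuals).
responses = [
    "Nissan", "Toyota", "Honda", "Lexus", "Ford", "Nissan", "Toyota",
    "Nissan", "Toyota", "Ford", "Honda", "Nissan", "Toyota", "Nissan",
    "Honda", "Toyota", "Ford", "Nissan", "Honda", "Toyota", "Ford",
    "Toyota", "Lexus", "Honda", "Toyota",
]


def frequency_distribution(values):
    """Return {category: frequency}, keeping the order of first appearance."""
    return dict(Counter(values))


freq = frequency_distribution(responses)
for make, f in freq.items():
    print(f"{make:<8}{f:>3}")
print(f"{'Total':<8}{sum(freq.values()):>3}")
```

The frequencies should add up to the number of responses, which is a quick sanity check on any tally.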
Bivariate Joint Frequency Distributions
2.4] In a group of 250 people, 140 are women and the rest are men. Of the women, 30 enjoy baseball, 70
enjoy football, and the rest enjoy basketball. Of the men, 20 enjoy baseball, 50 enjoy football, and
the rest enjoy basketball. Construct the joint frequency distribution of gender versus sports.
2.5] A total of 400 customers showed up at a car dealership during a particular weekend. Only 320
customers made a purchase. Of these, 300 were satisfied with the service, 15 were not, and the rest
were indifferent about the kind of service they received. Of those who did not make a purchase, 70
were satisfied with the service, 9 were not, and the rest were indifferent about the kind of service
they received. Construct the contingency table of customers versus satisfaction.
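The "and the rest" cells in items 2.4] and 2.5] are found by subtracting the known cells from the row total. A minimal sketch of that step for item 2.4], assuming the counts stated in the problem; the helper complete_rows and the variable names are hypothetical.

```python
# Counts stated in item 2.4]: 250 people, 140 women, the rest men.
sports = ["Baseball", "Football", "Basketball"]
row_totals = {"Women": 140, "Men": 250 - 140}
known = {
    "Women": {"Baseball": 30, "Football": 70},
    "Men": {"Baseball": 20, "Football": 50},
}


def complete_rows(row_totals, known, last_column):
    """Fill the one missing cell of each row as (row total - known cells)."""
    table = {}
    for row, total in row_totals.items():
        cells = dict(known[row])
        cells[last_column] = total - sum(cells.values())
        table[row] = cells
    return table


table = complete_rows(row_totals, known, "Basketball")
column_totals = {s: sum(table[r][s] for r in table) for s in sports}

print(f"{'Gender':<8}" + "".join(f"{s:>12}" for s in sports) + f"{'Total':>8}")
for row, cells in table.items():
    print(f"{row:<8}" + "".join(f"{cells[s]:>12}" for s in sports)
          + f"{row_totals[row]:>8}")
print(f"{'Total':<8}" + "".join(f"{column_totals[s]:>12}" for s in sports)
      + f"{sum(row_totals.values()):>8}")
```

Adding row and column totals to the table makes it easy to confirm that the cells add up to the 250 people surveyed.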
Grouped Data Distributions
When dealing with massive amounts of data, we sometimes group the values into non-overlapping
classes (or categories), preferably of equal width. The frequency of a class is the number of data
values it contains. The recommended number of classes is 5 to 20 (the larger the data set, the larger
the number of classes). The class width is w = R/k, where R = Range = (maximum – minimum) and k =
(number of classes). When the data follow a Normal Distribution approximated by a Binomial
Distribution with probability p = 0.5 (which guarantees a symmetric bell shape), we may use Sturges'
formula to compute the number of classes: k = 1 + log2(n), where n is the number of data values. The
number of classes can also be expressed as k = 1 + 3.322 log10(n), and therefore
w = R / (1 + 3.322 log10(n)).
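The two forms of Sturges' formula can be compared numerically. The sketch below uses n = 30 and R = 46 − 4, the values of item 2.6]; rounding k up to a whole number is an assumption made here for illustration, since the text does not state a rounding rule.

```python
import math


def sturges_log2(n):
    """Sturges' formula in base 2: k = 1 + log2(n)."""
    return 1 + math.log2(n)


def sturges_log10(n):
    """Same formula in base 10: k = 1 + 3.322 log10(n)."""
    return 1 + 3.322 * math.log10(n)


n = 30                        # number of data values in item 2.6]
k_exact = sturges_log2(n)
k_approx = sturges_log10(n)
k = math.ceil(k_exact)        # assumption: round up to a whole class count

data_range = 46 - 4           # R = maximum - minimum for item 2.6]
w = data_range / k            # raw class width w = R/k, before rounding

print(f"k (log2 form)  = {k_exact:.4f}")
print(f"k (log10 form) = {k_approx:.4f}")
print(f"classes used   = {k}, raw width w = {w}")
```

The two forms agree to about three decimals because 3.322 is a rounded value of 1/log10(2).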
2.6] Consider the given sample of 30 scores: 35, 27, 42, 22, 28, 38, 32, 25, 14, 22, 9, 21, 13, 33, 46,
25, 39, 18, 24, 4, 22, 20, 25, 14, 24, 45, 29, 21, 36, 25.
Ordered Data: 4, 9, 13, 14, 14, 18, 20, 21, 21, 22, 22, 22, 24, 24, 25, 25, 25, 25, 27, 28, 29, 32,
33, 35, 36, 38, 39, 42, 45, 46
a) Compute the class width w and form the class limits (LCL, UCL), class boundaries or intervals
(LCB, UCB), class marks, and construct the frequency distribution using 5 classes; b) repeat
with 6 classes; c) repeat with 4 classes.
SOLUTION
a] w = Range / (number of classes) = (46 − 4)/5 = 8.4 (use 9)
Classes     f     Boundaries      m
4 to 12     2     3.5 to 12.5     8
13 to 21    7     12.5 to 21.5    17
22 to 30    12    21.5 to 30.5    26
31 to 39    6     30.5 to 39.5    35
40 to 48    3     39.5 to 48.5    44
b] w = Range / (number of classes) = (46 − 4)/6 = 7 (use 8, because six classes of width 7 starting
at 4 would end at 45 and miss the maximum, 46)
Classes     f     Boundaries      m
4 to 11     2     3.5 to 11.5     7.5
12 to 19    4     11.5 to 19.5    15.5
20 to 27    13    19.5 to 27.5    23.5
28 to 35    5     27.5 to 35.5    31.5
36 to 43    4     35.5 to 43.5    39.5
44 to 51    2     43.5 to 51.5    47.5
c] w = Range / (number of classes) = (46 − 4)/4 = 10.5 (use 11)
Classes     f     Boundaries      m
4 to 14     5     3.5 to 14.5     9
15 to 25    13    14.5 to 25.5    20
26 to 36    7     25.5 to 36.5    31
37 to 47    5     36.5 to 47.5    42
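The construction in item 2.6] can be sketched in Python as a check. The width rule used here, w = floor(R/k) + 1, is an assumption inferred from the three worked cases (8.4 becomes 9, 7 becomes 8, 10.5 becomes 11); it also assumes integer data and a first class that starts at the minimum. The name grouped_distribution is hypothetical.

```python
scores = [35, 27, 42, 22, 28, 38, 32, 25, 14, 22, 9, 21, 13, 33, 46,
          25, 39, 18, 24, 4, 22, 20, 25, 14, 24, 45, 29, 21, 36, 25]


def grouped_distribution(data, k):
    """Build k equal-width classes for integer data, starting at the minimum.

    Assumed width rule: w = floor(R/k) + 1, the smallest whole width for
    which k classes of integer limits still reach the maximum.
    """
    low, high = min(data), max(data)
    w = (high - low) // k + 1
    rows = []
    for i in range(k):
        lcl = low + i * w                 # lower class limit
        ucl = lcl + w - 1                 # upper class limit
        rows.append({
            "limits": (lcl, ucl),
            "f": sum(lcl <= x <= ucl for x in data),
            "boundaries": (lcl - 0.5, ucl + 0.5),
            "mark": (lcl + ucl) / 2,      # class mark (midpoint)
        })
    return w, rows


for k in (5, 6, 4):
    w, rows = grouped_distribution(scores, k)
    print(f"k = {k}, w = {w}")
    for r in rows:
        print(f"  {r['limits'][0]:>2} to {r['limits'][1]:<2}  f = {r['f']:>2}  "
              f"{r['boundaries'][0]} to {r['boundaries'][1]}  m = {r['mark']}")
```

The boundaries use a half-unit offset because the data are whole numbers; data recorded to one or two decimals would need a smaller offset.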
2.7] A sample of 40 city inspectors was selected to conduct a study on the number of miles they
drive daily. Collected data (in miles): 30, 20, 40, 65, 39, 28, 12, 25, 39, 43, 11, 37, 13, 48, 34,
50, 29, 35, 42, 23, 37, 18, 66, 33, 19, 22, 53, 33, 45, 10, 32, 28, 16, 34, 14, 27, 43, 58, 38, 28
a) Compute the class width w and form the class limits (LCL, UCL), class boundaries (LCB, UCB),
class marks, and construct the frequency distribution using 5 classes; b) repeat with 6 classes.
2.8] For the given frequency distributions, find the class boundaries and the class marks.
A) Classes         f     Boundaries     m
   12.5 to 21.4    1
   21.5 to 30.4    3
   30.5 to 39.4    6
   39.5 to 48.4    10
   48.5 to 57.4    8
B) Classes         f     Boundaries     m
   1.00 to 2.49    12
   2.50 to 3.99    16
   4.00 to 5.49    9
   5.50 to 6.99    4
   7.00 to 8.49    2
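When class limits have decimals, the boundaries sit halfway across the gap between one class's upper limit and the next class's lower limit. A minimal sketch of that rule, using Python's decimal module to avoid floating-point noise; the function name is hypothetical and the classes are assumed to be listed in increasing order with a constant gap.

```python
from decimal import Decimal


def boundaries_and_marks(limits):
    """limits: list of (LCL, UCL) given as strings, in increasing order.

    Returns a list of ((LCB, UCB), class mark) using half the gap between
    consecutive classes as the offset.
    """
    classes = [(Decimal(lo), Decimal(hi)) for lo, hi in limits]
    half_gap = (classes[1][0] - classes[0][1]) / 2
    return [((lo - half_gap, hi + half_gap), (lo + hi) / 2)
            for lo, hi in classes]


table_a = [("12.5", "21.4"), ("21.5", "30.4"), ("30.5", "39.4"),
           ("39.5", "48.4"), ("48.5", "57.4")]
table_b = [("1.00", "2.49"), ("2.50", "3.99"), ("4.00", "5.49"),
           ("5.50", "6.99"), ("7.00", "8.49")]

result_a = boundaries_and_marks(table_a)
result_b = boundaries_and_marks(table_b)
for (lcb, ucb), mark in result_a + result_b:
    print(f"{lcb} to {ucb}   m = {mark}")
```

For limits given to one decimal the offset is 0.05, and for limits given to two decimals it is 0.005.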
Other Distributions
2.9] Use the given grouped-data frequency distribution to construct the cumulative frequency (CF),
relative frequency (RF), cumulative relative frequency (CRF), percent frequency (PF), and cumulative
percent frequency (CPF) distributions. USE FOUR DECIMALS IN ALL APPLICABLE COMPUTATIONS.
Classes         F     CF    RF    CRF    PF    CPF
0.5 to 8.4      5
8.5 to 16.4     9
16.5 to 24.4    10
24.5 to 32.4    8
32.5 to 40.4    4
2.10] Construct the CF, RF, CRF, PF, and CPF distributions for the distribution in item 2.1.
Make      F    CF    RF    CRF    PF    CPF
Nissan    6
Toyota    8
Honda     5
Lexus     2
Ford      4
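The five derived columns all follow from the frequencies and their total. The sketch below is one way to compute them, shown on the car-make frequencies of item 2.1]; the function name and the dictionary layout are hypothetical. Cumulative columns are computed from the cumulative counts rather than by adding rounded values, which avoids accumulating rounding error.

```python
from itertools import accumulate


def derived_distributions(freqs, decimals=4):
    """Return CF, RF, CRF, PF and CPF for a list of class frequencies."""
    n = sum(freqs)
    cf = list(accumulate(freqs))                       # running totals
    return {
        "CF": cf,
        "RF": [round(f / n, decimals) for f in freqs],
        "CRF": [round(c / n, decimals) for c in cf],
        "PF": [round(100 * f / n, decimals - 2) for f in freqs],
        "CPF": [round(100 * c / n, decimals - 2) for c in cf],
    }


makes = ["Nissan", "Toyota", "Honda", "Lexus", "Ford"]
f = [6, 8, 5, 2, 4]
d = derived_distributions(f)

print(f"{'Make':<8}{'f':>4}{'CF':>5}{'RF':>9}{'CRF':>9}{'PF':>9}{'CPF':>9}")
for i, make in enumerate(makes):
    print(f"{make:<8}{f[i]:>4}{d['CF'][i]:>5}{d['RF'][i]:>9.4f}"
          f"{d['CRF'][i]:>9.4f}{d['PF'][i]:>8.2f}%{d['CPF'][i]:>8.2f}%")
```

The last CF entry should equal the total number of observations, and the last CRF and CPF entries should be 1 and 100%.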
2.11] Construct the CF, RF, CRF, PF, and CPF distributions for the distribution in item 2.2
2.12] Construct the CF, RF, CRF, PF, and CPF distributions for the distribution in item 2.3
2.13] Construct the CF, RF, CRF, PF, and CPF distributions for the distribution in item 2.7a
2.14] Graphical Representation of Data: Popular shapes of distributions:
[Figure panels: SYMMETRIC DISTRIBUTION; POSITIVELY-SKEWED DISTRIBUTION; NEGATIVELY-SKEWED DISTRIBUTION;
UNIFORM DISTRIBUTION; EXPONENTIAL DISTRIBUTION; SINUSOIDAL DISTRIBUTION]
2.15] POPULAR DESCRIPTIVE GRAPHS
A) DOTPLOT: 17 families were asked how many children they have. The recorded
responses were: 3, 4, 0, 2, 3, 2, 1, 5, 2, 3, 2, 1, 4, 3, 0, 2, 1. Construct the vertical dot plot
and the horizontal dot plot. Comment on the shapes of the distributions.
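A dotplot places one dot per observation next to its value. As a rough text-only sketch, the snippet below prints one row per value with the dots running to the right, which appears to correspond to the horizontal layout; the helper name dot_rows and the choice of "*" as the dot are assumptions.

```python
from collections import Counter


def dot_rows(values, dot="*"):
    """One text row per value from min to max, with one dot per observation."""
    counts = Counter(values)
    return [f"{v} | {dot * counts[v]}"
            for v in range(min(values), max(values) + 1)]


# Number of children reported by the 17 families in item 2.15 A).
children = [3, 4, 0, 2, 3, 2, 1, 5, 2, 3, 2, 1, 4, 3, 0, 2, 1]
rows = dot_rows(children)
print("\n".join(rows))
```

The length of each row of dots is the frequency of that value, so the outline of the rows shows the shape of the distribution.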
[Figures: VERTICAL DOTPLOTS and HORIZONTAL DOTPLOTS of the Frequency Distribution and the Cumulative
Frequency Distribution. Axis: Number of Children (0 to 5).]
B) SPIKE GRAPHS: Use the same data as in A) to construct the vertical and the horizontal
spike graphs for each distribution. Comment on the shapes of the distributions.
Children    f     CF    RF        CRF       PF         CPF
0           2     2     0.1176    0.1176    11.76%     11.76%
1           3     5     0.1765    0.2941    17.65%     29.41%
2           5     10    0.2941    0.5882    29.41%     58.82%
3           4     14    0.2353    0.8235    23.53%     82.35%
4           2     16    0.1176    0.9412    11.76%     94.12%
5           1     17    0.0588    1.0000    5.88%      100.00%
TOTALS      17          1.0000              100.00%
[Figures: vertical and horizontal spike graphs for each of the six distributions: FREQUENCY
DISTRIBUTION, CUMULATIVE FREQUENCY DISTRIBUTION, RELATIVE FREQUENCY DISTRIBUTION, CUMULATIVE RELATIVE
FREQUENCY DISTRIBUTION, PERCENT DISTRIBUTION, and CUMULATIVE PERCENT DISTRIBUTION. Axes: Number of
Children versus Frequency, Cumulative Frequency, Relative Frequency, Cumulative Relative Frequency,
Percent Frequency, and Cumulative Percent Frequency.]
C) Use the set of observations {2, 4, 1, 5, 3, 2, 3, 0, 4, 2, 1, 2, 0, 5, 2, 6, 1, 4, 3, 1, 3}
C1) Construct the horizontal and vertical dotplots for the frequency and cumulative frequency
distributions. Comment on the shapes of the distributions.
C2) Construct the horizontal and the vertical spike graphs for all six distributions. Comment
on the shapes of the distributions.
D) BAR GRAPHS:
D1) Qualitative Data: Use the distributions from 2.10) to construct the corresponding bar graphs
Car Make    f    CF    RF      CRF     PF     CPF
Nissan      6    6     0.24    0.24    24%    24%
Toyota      8    14    0.32    0.56    32%    56%
Honda       5    19    0.20    0.76    20%    76%
Lexus       2    21    0.08    0.84    8%     84%
Ford        4    25    0.16    1.00    16%    100%
[Figures: vertical and horizontal bar graphs, with each bar labeled by its value from the table above:
FREQUENCY BAR GRAPHS, CUMULATIVE FREQUENCY BAR GRAPHS, RELATIVE FREQUENCY BAR GRAPHS, CUMULATIVE
RELATIVE FREQUENCY BAR GRAPHS, PERCENT FREQUENCY BAR GRAPHS, and CUMULATIVE PERCENT FREQUENCY BAR
GRAPHS. Axes: Car Make versus Frequency, Cumulative Frequency, Relative Frequency, Cumulative Relative
Frequency, Percent (%), and Cumulative Percent.]
D2) Grouped Data: Use the distributions from 2.6a] to construct the corresponding bar graphs.
Classes     f     Boundaries      m     CF    RF        CRF       PF        CPF
4 to 12     2     3.5 to 12.5     8     2     0.0667    0.0667    6.67%     6.67%
13 to 21    7     12.5 to 21.5    17    9     0.2333    0.3000    23.33%    30.00%
22 to 30    12    21.5 to 30.5    26    21    0.4000    0.7000    40.00%    70.00%
31 to 39    6     30.5 to 39.5    35    27    0.2000    0.9000    20.00%    90.00%
40 to 48    3     39.5 to 48.5    44    30    0.1000    1.0000    10.00%    100.00%
[Figures: vertical and horizontal bar graphs, with each bar labeled by its value from the table above:
FREQUENCY BAR GRAPHS, CUMULATIVE FREQUENCY BAR GRAPHS, RELATIVE FREQUENCY BAR GRAPHS, CUMULATIVE
RELATIVE FREQUENCY BAR GRAPHS, PERCENT FREQUENCY BAR GRAPHS, and CUMULATIVE PERCENT FREQUENCY BAR
GRAPHS. Axes: Class Limits versus Frequency, Cumulative Frequency, Relative Frequency, Cumulative
Relative Frequency, Percents (%), and Cumulative Percents (%).]
D3) Discrete Data: Use the distributions from 2.15A) to construct the corresponding bar graphs.
Children    f    Boundaries     m    CF    RF        CRF       PF        CPF
0           2    -0.5 to 0.5    0    2     0.1176    0.1176    11.76%    11.76%
1           3    0.5 to 1.5     1    5     0.1765    0.2941    17.65%    29.41%
2           5    1.5 to 2.5     2    10    0.2941    0.5882    29.41%    58.82%
3           4    2.5 to 3.5     3    14    0.2353    0.8235    23.53%    82.35%
4           2    3.5 to 4.5     4    16    0.1176    0.9412    11.76%    94.12%
5           1    4.5 to 5.5     5    17    0.0588    1.0000    5.88%     100.00%
[Figures: vertical and horizontal bar graphs, with each bar labeled by its value from the table above:
FREQUENCY BAR GRAPHS, CUMULATIVE FREQUENCY BAR GRAPHS, RELATIVE FREQUENCY BAR GRAPHS, CUMULATIVE
RELATIVE FREQUENCY BAR GRAPHS, PERCENT FREQUENCY BAR GRAPHS, and CUMULATIVE PERCENT FREQUENCY BAR
GRAPHS. Axes: Number of Children versus Frequency, Cumulative Frequency, Relative Frequency,
Cumulative Relative Frequency, Percents (%), and Cumulative Percents (%).]
D4] Exercise: Use the distributions from 2.2] to construct all corresponding bar graphs.
D5] Exercise: Use the distributions from 2.6b] to construct all corresponding bar graphs.
D6] Exercise: Use the distributions from 2.3] to construct all corresponding bar graphs.
D7) Answer the questions based on the given bar graph, which shows the number of
students enrolled in Chemistry, Physics, Economics, Political Science, and Psychology
courses at CA College.
[Figure: bar graph "Enrollment in Introductory Courses at CA College". Vertical axis: Students
Enrolled, 0 to 350 in steps of 50; horizontal axis: Introductory Courses. Bar labels appearing in the
figure (bar assignment not shown here): 350, 250, 220, 180, 150.]
1) How many students enrolled in the course with most students?
2) Order the courses by enrollment from lowest to highest.
3) Approximately, how many times is the enrollment in Economics bigger than the enrollment in
Chemistry?
4) How many more students are in Economics than in Physics?
5) What percent of all students enrolled in Psychology?
D8) Exercise: The given bar graph shows the quarterly water charges (in U.S. Dollars) from
Miami-Dade Water and Sewer Department to a particular household during the period from
October-2009 to December-2010. Use it to answer the following:
[Figure: bar graph. Vertical axis: Water Charge ($), 0 to 16 in steps of 2; horizontal axis: Quarter
Ends. Bars: Dec-09 15.74, Mar-10 7.51, Jun-10 10.34, Sep-10 12.43, Dec-10 13.41.]
1) Which quarter showed the least charge? How much was it?
2) Arrange the quarters by water charge from highest to lowest.
3) Approximately, how many times is the charge in Mar-10 smaller than the charge in Dec-09?
4) How much lower was the charge in Dec-10 than in Dec-09?
5) What percent of the total charges is the charge in Sep-10?
D9) The given double bar graph shows the quarterly water charges for a Miami-Dade Water
and Sewer customer during the years 2009 and 2010. Use it to answer the following:
[Figure: double bar graph. Vertical axis: Water Bill ($), 0 to 80 in steps of 10; horizontal axis:
Year Quarters (First, Second, Third, Fourth); legend: 2009, 2010. Bar labels appearing in the figure
(bar assignment not shown here): 74, 69, 67, 65, 59, 54, 53, 47.]
1) In which quarters was the water bill higher in 2010 than in 2009?
2) Which quarter shows the highest difference in water bills? How much is the difference?
3) How much more was the percent of the total 2010 charges than the percent of the total 2009 charges
in the second quarter?
E) PARETO GRAPHS:
E1) Construct the Pareto graphs (vertical and horizontal) for the data in item 2.1].
[Figures: VERTICAL and HORIZONTAL Pareto graphs, with bars in decreasing order of frequency:
Toyota 8, Nissan 6, Honda 5, Ford 4, Lexus 2. Axes: Car Make, Frequency.]
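A Pareto graph is a bar graph whose bars are arranged from the highest frequency to the lowest, so the only extra step is the sort. A minimal sketch using the frequencies of item 2.1]; the function name pareto_order is hypothetical, and ties (none occur here) would simply keep their original order.

```python
def pareto_order(freqs):
    """Return (category, frequency) pairs sorted from most to least frequent."""
    return sorted(freqs.items(), key=lambda item: item[1], reverse=True)


car_makes = {"Nissan": 6, "Toyota": 8, "Honda": 5, "Lexus": 2, "Ford": 4}
ordered = pareto_order(car_makes)

# Text version of the horizontal graph: one bar of '#' per category.
for make, f in ordered:
    print(f"{make:<8}{'#' * f} {f}")
```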
E2) Exercise: Construct the Pareto graphs (vertical and horizontal) for the data in item 2.2].
E3) Exercise: Construct the Pareto graphs (vertical and horizontal) for the data in item 2.3].
F) PIE GRAPHS:
F1) Construct the pie graphs for the data in item 2.1].
FREQUENCY PIE GRAPH: Toyota, 8; Nissan, 6; Honda, 5; Ford, 4; Lexus, 2
RELATIVE FREQUENCY PIE GRAPH: Toyota 0.32; Nissan 0.24; Honda 0.20; Ford 0.16; Lexus 0.08
PERCENT PIE GRAPH: Toyota 32%; Nissan 24%; Honda 20%; Ford 16%; Lexus 8%
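To draw a pie graph by hand, each slice needs a central angle proportional to its relative frequency. The worksheet does not state this step, so the sketch below is an assumption about how the slices might be sized: angle = RF × 360 degrees. The function name pie_slices is hypothetical.

```python
def pie_slices(freqs):
    """Relative frequency, percent and central angle (degrees) per category."""
    n = sum(freqs.values())
    return {
        label: {"rf": f / n, "percent": 100 * f / n, "angle": 360 * f / n}
        for label, f in freqs.items()
    }


car_makes = {"Toyota": 8, "Nissan": 6, "Honda": 5, "Ford": 4, "Lexus": 2}
slices = pie_slices(car_makes)
for label, s in slices.items():
    print(f"{label:<8}RF = {s['rf']:.2f}   {s['percent']:.0f}%   "
          f"{s['angle']:.1f} degrees")
```

The angles should add up to 360 degrees, just as the relative frequencies add up to 1.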
F2) Exercise: Construct the pie graphs (frequency, relative frequency, percent) for the data in 2.2].
F3) Exercise: Customarily, economists examine the educational background of the employees of
a company when studying the company’s employee productivity. The table below shows the
frequency distribution of the highest degrees earned by the 200 employees at CG Corporation.
Complete the relative frequency (RF) and the percent frequency distributions, and construct
the pie graphs (frequency, relative frequency, percent frequency) for these data.
Degree         f     RF    Percent
High School    44
Bachelor’s     54
Master’s       42
Doctorate      38
Other          22
PIE GRAPHS
G] STEM-AND-LEAF DISPLAY
G1] Construct the stem-and-leaf display using the 30 scores in item 2.6: 35, 27, 42, 22, 28,
38, 32, 25, 14, 22, 9, 21, 13, 33, 46, 25, 39, 18, 24, 4, 22, 20, 25, 14, 24, 45, 29, 21, 36, 25
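A stem-and-leaf display splits each value into a stem (here the tens digit) and a leaf (the units digit). The sketch below shows one way to build it for non-negative whole numbers below 100, with stems listed in increasing order; the function name and the " | " separator are choices made for this illustration.

```python
def stem_and_leaf(values):
    """Rows of 'stem | leaves' for non-negative integers (tens | units)."""
    stems = {}
    for v in sorted(values):
        stems.setdefault(v // 10, []).append(v % 10)
    # Include every stem between the smallest and largest, even if empty.
    return [f"{s} | " + "".join(str(leaf) for leaf in stems.get(s, []))
            for s in range(min(stems), max(stems) + 1)]


scores = [35, 27, 42, 22, 28, 38, 32, 25, 14, 22, 9, 21, 13, 33, 46,
          25, 39, 18, 24, 4, 22, 20, 25, 14, 24, 45, 29, 21, 36, 25]
display = stem_and_leaf(scores)
print("\n".join(display))
```

Counting the leaves in all rows should give back the number of observations, 30 in this case.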
G2] Use the given stem-and-leaf display to answer the questions below.
STEM    LEAF
6       568
5       0112335889999
4       2225566689
3       12237
2       36
1       2
0       4
a) How many observations are there in the data set?
b) Give the values of the stem and the leaf (separately) for the first observation in the third row
from the bottom.
c) List all the observations in the original data set.
d) Which observation is the most repeated?
e) Give the value of the middlemost observation.
f) Give the value of the largest observation.
g) Name the (approximate) shape of the distribution.
G3] A sample of 23 drivers was obtained to study the number of miles they drive (rounded
to integers) during a typical day. Recorded values: 25, 50, 29, 35, 47, 11, 39, 21, 38, 5,
36, 23, 43, 33, 26, 16, 38, 23, 34, 18, 49, 27, 38. Construct the stem-and-leaf display and
comment on the shape of the distribution.
G4] Use the given stem-and-leaf display to answer the questions below.
STEM    LEAF
0       03
1       0124444458999
2       122345668
3       45667
4       236
5       57
6       4
a) How many observations are there in the data set?
b) Give the values of the stem and the leaf (separately) for the first observation in the third row
from the top.
c) List the five largest observations.
d) Which observation is the most repeated?
e) Give the value of the middlemost observation.
f) Give the value of the smallest observation.
g) Name the (approximate) shape of the distribution.
H) HISTOGRAMS
H1) Qualitative Data: Use the distributions of 2.10] to construct the corresponding histograms.
Car Make    f    CF    RF      CRF     PF     CPF
Nissan      6    6     0.24    0.24    24%    24%
Toyota      8    14    0.32    0.56    32%    56%
Honda       5    19    0.20    0.76    20%    76%
Lexus       2    21    0.08    0.84    8%     84%
Ford        4    25    0.16    1.00    16%    100%
[Figures: vertical and horizontal FREQUENCY HISTOGRAMS, CUMULATIVE FREQUENCY HISTOGRAMS, and RELATIVE
FREQUENCY HISTOGRAMS, with each bar labeled by its value from the table above. Axes: Car Make versus
Frequency, Cumulative Frequency, and Relative Frequency.]
Construct the CRF, PF, and CPF Histograms.
H2) Grouped Data: Use the distributions of 2.6a] to construct the corresponding histograms.
Classes     f     Boundaries      m     CF    RF        CRF       PF        CPF
4 to 12     2     3.5 to 12.5     8     2     0.0667    0.0667    6.67%     6.67%
13 to 21    7     12.5 to 21.5    17    9     0.2333    0.3000    23.33%    30.00%
22 to 30    12    21.5 to 30.5    26    21    0.4000    0.7000    40.00%    70.00%
31 to 39    6     30.5 to 39.5    35    27    0.2000    0.9000    20.00%    90.00%
40 to 48    3     39.5 to 48.5    44    30    0.1000    1.0000    10.00%    100.00%
[Figures: vertical and horizontal FREQUENCY HISTOGRAMS, PERCENT FREQUENCY HISTOGRAMS, and CUMULATIVE
PERCENT FREQUENCY HISTOGRAMS, with bars drawn between the class boundaries 3.5, 12.5, 21.5, 30.5, 39.5,
and 48.5 and labeled by their values from the table above. Axes: Class Boundaries versus Frequency,
Percent (%), and Cumulative Percent (%).]
Construct the CF, RF, CRF Histograms.
H3) Discrete Data: Use the distributions of 2.15a] to construct the corresponding histograms.
Children    f    Boundaries     m    CF    RF        CRF       PF        CPF
0           2    -0.5 to 0.5    0    2     0.1176    0.1176    11.76%    11.76%
1           3    0.5 to 1.5     1    5     0.1765    0.2941    17.65%    29.41%
2           5    1.5 to 2.5     2    10    0.2941    0.5882    29.41%    58.82%
3           4    2.5 to 3.5     3    14    0.2353    0.8235    23.53%    82.35%
4           2    3.5 to 4.5     4    16    0.1176    0.9412    11.76%    94.12%
5           1    4.5 to 5.5     5    17    0.0588    1.0000    5.88%     100.00%
[Figures: vertical and horizontal FREQUENCY HISTOGRAMS, CUMULATIVE FREQUENCY HISTOGRAMS, and RELATIVE
FREQUENCY HISTOGRAMS, with each bar labeled by its value from the table above. Axes: Number of Children
versus Frequency, Cumulative Frequency, and Relative Frequency.]
Construct the CRF, PF, and the CPF Histograms.
H4] a) Use the distribution of 2.6a] to construct the corresponding histograms but using the
class marks instead of the class boundaries.
b) Use the distribution of 2.6b] to construct the corresponding histograms using the class
boundaries.
c) Use the distribution of 2.6b] to construct the corresponding histograms but using the
class marks instead of the class boundaries.
I] POLYGONS
I1) Grouped Data: Use the distribution of 2.6a] to construct the corresponding polygons.
Classes     f     Boundaries      m     CF    RF        CRF       PF        CPF
4 to 12     2     3.5 to 12.5     8     2     0.0667    0.0667    6.67%     6.67%
13 to 21    7     12.5 to 21.5    17    9     0.2333    0.3000    23.33%    30.00%
22 to 30    12    21.5 to 30.5    26    21    0.4000    0.7000    40.00%    70.00%
31 to 39    6     30.5 to 39.5    35    27    0.2000    0.9000    20.00%    90.00%
40 to 48    3     39.5 to 48.5    44    30    0.1000    1.0000    10.00%    100.00%
Construct the FREQUENCY POLYGON
RELATIVE FREQUENCY POLYGON
Marks    RF
8        0.0667
17       0.2333
26       0.4000
35       0.2000
44       0.1000
[Figure: relative frequency polygon through the points above, returning to 0.0000 at both ends.
Axes: Class Marks, Relative Frequency.]
Construct the PERCENT FREQUENCY POLYGON
I2] Exercise: Use the distributions of item 2.6b] to construct the corresponding polygons.
J] OGIVES
J1] Grouped Data: Use the distribution of 2.6a] to construct the corresponding ogives.
Classes     f     Boundaries      m     CF    RF        CRF       PF        CPF
4 to 12     2     3.5 to 12.5     8     2     0.0667    0.0667    6.67%     6.67%
13 to 21    7     12.5 to 21.5    17    9     0.2333    0.3000    23.33%    30.00%
22 to 30    12    21.5 to 30.5    26    21    0.4000    0.7000    40.00%    70.00%
31 to 39    6     30.5 to 39.5    35    27    0.2000    0.9000    20.00%    90.00%
40 to 48    3     39.5 to 48.5    44    30    0.1000    1.0000    10.00%    100.00%
Construct the CUMULATIVE FREQUENCY OGIVE
Construct the CUMULATIVE RELATIVE FREQUENCY OGIVE
CUMULATIVE PERCENT FREQUENCY OGIVE
Marks    CPF
8        6.67
17       30.00
26       70.00
35       90.00
44       100.00
[Figure: cumulative percent frequency ogive through the points above, starting from 0.00.
Axes: Class Marks, Cumulative Percent.]
J2] Exercise: Use the distributions of item 2.6b] to construct the corresponding ogives.
K] SCATTERPLOTS
[Figure panels, each a scatterplot of Y versus X: Positive Linear Relation; Negative Linear Relation;
Nonlinear (Curvilinear) Relation; Exponential Relation; Sinusoidal Relation; No Discernible Relation]
K1] Example: A large corporation is planning to open a nationwide chain of sporting goods stores. A
market analysis is conducted to examine the relationship between weekly income (x) and weekly
household expenditure on recreation (y). Eight families were interviewed and the recorded data
are shown below. Construct the scatter diagram and comment on the relationship between the
two variables.
Income    Expenditure
900       90
800       72
600       54
400       50
700       69
500       60
300       30
200       25
K2] Exercise: A consumer is interested in estimating the price of a car based on how old the
car is. She takes a random sample of ten used cars of the same make and model. The
table below shows the price of the car (y, in $1000’s) and the age (x, in years). Construct
the scatterplot and comment on the relationship between the two variables.
Age (years)    Price ($1000)
1              18.5
2              16.0
3              15.2
4              12.5
5              13.1
6              10.5
7              11.0
8              9.5
9              6.5
10             6.1
L] TIME-SERIES PLOTS
L1] Example: The given data were collected to analyze changes in farm population (P, in
millions) over time (t, in years). The period selected was from 1998 to 2005.
Construct the time-series plot and comment on the relationship between the variables.
Year    Population
1998    14.3
1999    13.5
2000    11.2
2001    9.9
2002    8.7
2003    7.9
2004    5.8
2005    6.6
SOLUTION
[Figure: Time Series Plot of the table above. Vertical axis: Farm Population (in millions), 0 to 20;
horizontal axis: Year, 1996 to 2006.]
Comment: There seems to be a negative linear relationship between farm population and time in
years. As the years passed by from 1998 to 2005, the farm population appeared to be decreasing.
L2] Exercise: The yearly sales (in million dollars) from 1993 to 2003 reported by Microsoft
Corporation are shown in the given table. Construct the time-series plot and comment on
the relationship between the variables.
Year     1993  1994  1995  1999  1996  1997  1998  2000  2001  2002  2003
Sales    0.75  1.55  2.35  2.22  2.34  2.54  2.55  2.75  3.11  3.24  3.15
ANSWERS TO SELECTED ITEMS
2.2]
Beverage    f
Coffee      4
Soda        5
Tea         2
Water       7
O.J.        3
2.3]
Children    f
0           4
1           6
2           5
3           2
4           1
2.4]
Gender    Baseball    Football    Basketball
Women     30          70          40
Men       20          50          40
2.5]
Customer       Satisfied    Unsatisfied    Indifferent
Purchase       300          15             5
No Purchase    70           9              1
2.7]
a]
Classes     f     Boundaries      m
10 to 21    9     9.5 to 21.5     15.5
22 to 33    12    21.5 to 33.5    27.5
34 to 45    13    33.5 to 45.5    39.5
46 to 57    4     45.5 to 57.5    51.5
58 to 69    2     57.5 to 69.5    63.5
b]
Classes     f     Boundaries      m
10 to 19    8     9.5 to 19.5     14.5
20 to 29    9     19.5 to 29.5    24.5
30 to 39    12    29.5 to 39.5    34.5
40 to 49    6     39.5 to 49.5    44.5
50 to 59    3     49.5 to 59.5    54.5
60 to 69    2     59.5 to 69.5    64.5
2.8]
A) Classes         f     Boundaries        m
   12.5 to 21.4    1     12.45 to 21.45    16.95
   21.5 to 30.4    3     21.45 to 30.45    25.95
   30.5 to 39.4    6     30.45 to 39.45    34.95
   39.5 to 48.4    10    39.45 to 48.45    43.95
   48.5 to 57.4    8     48.45 to 57.45    52.95
B) Classes         f     Boundaries        m
   1.00 to 2.49    12    0.995 to 2.495    1.745
   2.50 to 3.99    16    2.495 to 3.995    3.245
   4.00 to 5.49    9     3.995 to 5.495    4.745
   5.50 to 6.99    4     5.495 to 6.995    6.245
   7.00 to 8.49    2     6.995 to 8.495    7.745
2.10]
Make      f    CF    RF      CRF     PF     CPF
Nissan    6    6     0.24    0.24    24%    24%
Toyota    8    14    0.32    0.56    32%    56%
Honda     5    19    0.20    0.76    20%    76%
Lexus     2    21    0.08    0.84    8%     84%
Ford      4    25    0.16    1.00    16%    100%
2.11]
Beverage    f    CF    RF        CRF       PF        CPF
Coffee      4    4     0.1905    0.1905    19.05%    19.05%
Soda        5    9     0.2381    0.4286    23.81%    42.86%
Tea         2    11    0.0952    0.5238    9.52%     52.38%
Water       7    18    0.3333    0.8571    33.33%    85.71%
O.J.        3    21    0.1429    1.0000    14.29%    100.00%
2.12]
Children    f    CF    RF        CRF       PF        CPF
0           4    4     0.2222    0.2222    22.22%    22.22%
1           6    10    0.3333    0.5556    33.33%    55.56%
2           5    15    0.2778    0.8333    27.78%    83.33%
3           2    17    0.1111    0.9444    11.11%    94.44%
4           1    18    0.0556    1.0000    5.56%     100.00%
2.13]
Classes         f     CF    RF       CRF      PF       CPF
10.0 to 21.9    9     9     0.225    0.225    22.5%    22.5%
22.0 to 33.9    12    21    0.300    0.525    30.0%    52.5%
34.0 to 45.9    13    34    0.325    0.850    32.5%    85.0%
46.0 to 57.9    4     38    0.100    0.950    10.0%    95.0%
58.0 to 69.9    2     40    0.050    1.000    5.0%     100.0%
D8) 1) The one ending in Mar-2010; $7.51; 2) Dec-09, Dec-10, Sep-10, Jun-10, Mar-10; 3) ½ ; 4) $2.33; 5) 20.9%
D9) 1) During the first three quarters; 2) Fourth; 3) 13.3%
E2] a), b) [Figures: Pareto graphs for the beverage data, with bars in decreasing order of frequency:
Water 7, Soda 5, Coffee 4, O.J. 3, Tea 2. Axes: Beverage, Frequency.]
E3] a), b) [Figures: Pareto graphs for the number-of-children data, with bars in decreasing order of
frequency: one 6, two 5, zero 4, three 2, four 1. Axes: Children, Frequency.]
F2]
FREQUENCY PIE GRAPH: Water, 7; Soda, 5; Coffee, 4; O.J., 3; Tea, 2
RELATIVE FREQUENCY PIE GRAPH: Water, 0.333; Soda, 0.238; Coffee, 0.190; O.J., 0.143; Tea, 0.095
PERCENT PIE GRAPH: Water 33.3%; Soda 23.8%; Coffee 19.0%; O.J. 14.3%; Tea 9.5%
G3]
STEM    LEAF
0       5
1       168
2       1335679
3       34568889
4       379
5       0
Approximately symmetric shape.
G4]
a) 35; b) stem = 2; leaf = 1; c) 43, 46, 55, 57, 64; d) 14; e) 22; f) 0
g) Positively Skewed.
K2] There seems to be a negative linear relationship between the price of a used car and the age of
the car (in years). Older cars seem to be associated with lower prices.
[Figure: scatterplot. Axes: Age (years), Price ($1000's).]
L2] There seems to be a positive linear relationship between the sales of Microsoft Corporation and
time (in years). As the years pass by, the sales seem to be increasing.
[Figure: time-series plot. Axes: Year (1990 to 2005), Sales (in million dollars).]