[2] FREQUENCY DISTRIBUTIONS AND GRAPHS
Prepared by: CARLOS I. GIL

2.0] Describe frequency distributions… Univariate/Bivariate/Multivariate Distributions

Univariate Frequency Distributions

2.1] A group of 25 individuals were asked what make of vehicle they drive. The 25 recorded responses were: Nissan, Toyota, Honda, Lexus, Ford, Nissan, Toyota, Nissan, Toyota, Ford, Honda, Nissan, Toyota, Nissan, Honda, Toyota, Ford, Nissan, Honda, Toyota, Ford, Toyota, Lexus, Honda, Toyota. Construct the frequency distribution for these data.

2.2] A group of 21 people were asked about their beverage preferences. Recorded responses: coffee, soda, tea, water, orange juice, coffee, soda, water, coffee, soda, water, coffee, soda, orange juice, water, tea, water, soda, water, orange juice, water. Construct the frequency distribution for these data.

2.3] 18 families were asked how many children they have. The recorded data were: 2, 1, 3, 2, 0, 1, 2, 0, 1, 3, 0, 2, 1, 0, 1, 4, 1, 2. Construct the frequency distribution for these data.

Bivariate Joint Frequency Distributions

2.4] In a group of 250 people, 140 are women and the rest are men. Of the women, 30 enjoy baseball, 70 enjoy football, and the rest enjoy basketball. Of the men, 20 enjoy baseball, 50 enjoy football, and the rest enjoy basketball. Construct the joint frequency distribution of gender versus sports.

2.5] A total of 400 customers showed up at a car dealership during a particular weekend. Only 320 customers made a purchase. Of these, 300 were satisfied with the service, 15 were not, and the rest were indifferent about the kind of service they received. Of those who did not make a purchase, 70 were satisfied with the service, 9 were not, and the rest were indifferent about the kind of service they received. Construct the contingency table of customers versus satisfaction.
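A univariate frequency distribution is a tally of how often each distinct value occurs. The sketch below shows one way to tally the 25 responses of item 2.1] in Python; the function name `frequency_distribution` is an illustrative choice, not part of the original handout.

```python
from collections import Counter

# The 25 recorded responses from item 2.1].
responses = [
    "Nissan", "Toyota", "Honda", "Lexus", "Ford",
    "Nissan", "Toyota", "Nissan", "Toyota", "Ford",
    "Honda", "Nissan", "Toyota", "Nissan", "Honda",
    "Toyota", "Ford", "Nissan", "Honda", "Toyota",
    "Ford", "Toyota", "Lexus", "Honda", "Toyota",
]


def frequency_distribution(values):
    """Return {category: count}, with categories in order of first appearance."""
    return dict(Counter(values))


freq = frequency_distribution(responses)
for make, count in freq.items():
    print(f"{make:<8}{count:>3}")
print(f"{'Total':<8}{sum(freq.values()):>3}")
```

The counts should agree with the table of makes and frequencies used in item 2.10]. The same function applies unchanged to the beverage responses of 2.2] and the counts of children in 2.3].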
Grouped Data Distributions

When dealing with massive amounts of data, we sometimes group the values into non-overlapping classes (or categories), preferably of equal widths. The frequency of a class is the number of data values it contains. The recommended number of classes is 5 to 20 (the larger the data set, the larger the number of classes). The class width is $w = R/k$, where $R$ = range = (maximum − minimum) and $k$ = number of classes.

When the data follow an approximately normal distribution (modeled by a binomial distribution with probability p = 0.5, which guarantees a symmetric bell shape), we may use Sturges' formula to compute the number of classes: $k = 1 + \log_2 n$, where $n$ is the number of data values. The number of classes can also be expressed as $k = 1 + 3.322 \log_{10} n$, and therefore $w = \dfrac{R}{1 + 3.322 \log_{10} n}$.

2.6] Consider the given sample of 30 scores: 35, 27, 42, 22, 28, 38, 32, 25, 14, 22, 9, 21, 13, 33, 46, 25, 39, 18, 24, 4, 22, 20, 25, 14, 24, 45, 29, 21, 36, 25.

Ordered Data: 4, 9, 13, 14, 14, 18, 20, 21, 21, 22, 22, 22, 24, 24, 25, 25, 25, 25, 27, 28, 29, 32, 33, 35, 36, 38, 39, 42, 45, 46

a) Compute the class width w and form the class limits (LCL, UCL), class boundaries or intervals (LCB, UCB), class marks, and construct the frequency distribution using 5 classes;
b) repeat with 6 classes;
c) repeat with 4 classes.
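The class-width rule w = R/k and Sturges' formula can be checked numerically for the 30 scores of item 2.6]. The following is a minimal Python sketch; the helper names `class_width`, `sturges_classes`, and `grouped_frequencies` are hypothetical, and the grouping helper assumes integer data with classes written as inclusive class limits.

```python
import math

# The 30 scores from item 2.6].
scores = [35, 27, 42, 22, 28, 38, 32, 25, 14, 22, 9, 21, 13, 33, 46,
          25, 39, 18, 24, 4, 22, 20, 25, 14, 24, 45, 29, 21, 36, 25]


def class_width(data, k):
    """Unrounded class width w = R / k, with R = maximum - minimum."""
    return (max(data) - min(data)) / k


def sturges_classes(n):
    """Sturges' formula k = 1 + log2(n), left unrounded."""
    return 1 + math.log2(n)


def grouped_frequencies(data, start, width, k):
    """Count the values in k integer classes [lower, lower + width - 1].

    Assumes integer data; `start` is the first lower class limit.
    """
    table = []
    for i in range(k):
        lower = start + i * width
        upper = lower + width - 1
        count = sum(lower <= x <= upper for x in data)
        table.append((lower, upper, count))
    return table


print(class_width(scores, 5))        # width before it is raised to a whole number
print(sturges_classes(len(scores)))  # number of classes suggested by Sturges' formula
for lower, upper, f in grouped_frequencies(scores, start=4, width=9, k=5):
    print(f"{lower} to {upper}: {f}")
```

Calling `grouped_frequencies` with widths 8 and 11 (and 6 and 4 classes, respectively) covers parts b) and c) in the same way.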
SOLUTION

a] w = Range / (number of classes) = (46 − 4)/5 = 8.4 (use 9)

Classes     f    Boundaries      m
4 to 12     2    3.5 to 12.5     8
13 to 21    7    12.5 to 21.5    17
22 to 30    12   21.5 to 30.5    26
31 to 39    6    30.5 to 39.5    35
40 to 48    3    39.5 to 48.5    44

b] w = Range / (number of classes) = (46 − 4)/6 = 7 (use 8; six classes of width 7 starting at 4 would end at 45 and miss the maximum, 46)

Classes     f    Boundaries      m
4 to 11     2    3.5 to 11.5     7.5
12 to 19    4    11.5 to 19.5    15.5
20 to 27    13   19.5 to 27.5    23.5
28 to 35    5    27.5 to 35.5    31.5
36 to 43    4    35.5 to 43.5    39.5
44 to 51    2    43.5 to 51.5    47.5

c] w = Range / (number of classes) = (46 − 4)/4 = 10.5 (use 11)

Classes     f    Boundaries      m
4 to 14     5    3.5 to 14.5     9
15 to 25    13   14.5 to 25.5    20
26 to 36    7    25.5 to 36.5    31
37 to 47    5    36.5 to 47.5    42

2.7] A sample of 40 city inspectors was selected to conduct a study on the number of miles they drive daily. Collected data (in miles): 30, 20, 40, 65, 39, 28, 12, 25, 39, 43, 11, 37, 13, 48, 34, 50, 29, 35, 42, 23, 37, 18, 66, 33, 19, 22, 53, 33, 45, 10, 32, 28, 16, 34, 14, 27, 43, 58, 38, 28

a) Compute the class width w and form the class limits (LCL, UCL), class boundaries (LCB, UCB), class marks, and construct the frequency distribution using 5 classes;
b) repeat with 6 classes.

2.8] For the given frequency distributions, find the class boundaries and the class marks.

A) Classes         f    Boundaries    m
   12.5 to 21.4    1
   21.5 to 30.4    3
   30.5 to 39.4    6
   39.5 to 48.4    10
   48.5 to 57.4    8

B) Classes         f    Boundaries    m
   1.00 to 2.49    12
   2.50 to 3.99    16
   4.00 to 5.49    9
   5.50 to 6.99    4
   7.00 to 8.49    2

Other Distributions

2.9] Use the given grouped-data frequency distribution to construct the CF (cumulative frequency), RF (relative frequency), CRF (cumulative relative frequency), PF (percent frequency), and CPF (cumulative percent frequency) distributions. USE FOUR DECIMALS IN ALL APPLICABLE COMPUTATIONS.

Classes         F    CF   RF   CRF   PF   CPF
0.5 to 8.4      5
8.5 to 16.4     9
16.5 to 24.4    10
24.5 to 32.4    8
32.5 to 40.4    4

2.10] Construct the CF, RF, CRF, PF, and CPF distributions for the distribution in item 2.1.

Make      F    CF   RF   CRF   PF   CPF
Nissan    6
Toyota    8
Honda     5
Lexus     2
Ford      4

2.11] Construct the CF, RF, CRF, PF, and CPF distributions for the distribution in item 2.2.

2.12] Construct the CF, RF, CRF, PF, and CPF distributions for the distribution in item 2.3.

2.13] Construct the CF, RF, CRF, PF, and CPF distributions for the distribution in item 2.7a.

2.14] Graphical Representation of Data: Popular shapes of distributions:

[Figure: six panels titled SYMMETRIC DISTRIBUTION, POSITIVELY-SKEWED DISTRIBUTION, NEGATIVELY-SKEWED DISTRIBUTION, UNIFORM DISTRIBUTION, EXPONENTIAL DISTRIBUTION, SINUSOIDAL DISTRIBUTION]

2.15] POPULAR DESCRIPTIVE GRAPHS

A) DOTPLOT: 17 families were asked how many children they have. The recorded responses were: 3, 4, 0, 2, 3, 2, 1, 5, 2, 3, 2, 1, 4, 3, 0, 2, 1. Construct the vertical dot plot and the horizontal dot plot. Comment on the shapes of the distributions.

[Figure: VERTICAL DOTPLOTS, panels Frequency Distribution and Cumulative Frequency Distribution; axis: Number of Children (0 to 5)]
[Figure: HORIZONTAL DOTPLOTS, panels Frequency Distribution and Cumulative Frequency Distribution; axis: Number of Children (0 to 5)]

B) SPIKE GRAPHS: Use the same data as in A) to construct the vertical and the horizontal spike graphs for each distribution. Comment on the shapes of the distributions.
Children   f    CF   RF       CRF      PF        CPF
0          2    2    0.1176   0.1176   11.76%    11.76%
1          3    5    0.1765   0.2941   17.65%    29.41%
2          5    10   0.2941   0.5882   29.41%    58.82%
3          4    14   0.2353   0.8235   23.53%    82.35%
4          2    16   0.1176   0.9412   11.76%    94.12%
5          1    17   0.0588   1.0000   5.88%     100.00%
TOTALS     17        1.0000            100.00%

[Figure: FREQUENCY DISTRIBUTION spike graphs, Horizontal and Vertical; axes: Number of Children, Frequency]
[Figure: CUMULATIVE FREQUENCY DISTRIBUTION spike graphs, Horizontal and Vertical; axes: Number of Children, Cumulative Frequency]
[Figure: RELATIVE FREQUENCY DISTRIBUTION spike graphs, Horizontal and Vertical; axes: Number of Children, Relative Frequency]
[Figure: CUMULATIVE RELATIVE FREQUENCY DISTRIBUTION spike graphs, Horizontal and Vertical; axes: Number of Children, Cumulative Relative Frequency]
[Figure: PERCENT DISTRIBUTION spike graphs, Horizontal and Vertical; axes: Number of Children, Percent Frequency]
[Figure: CUMULATIVE PERCENT DISTRIBUTION spike graphs, Horizontal and Vertical; axes: Number of Children, Cumulative Percent Frequency]

C) Use the set of observations {2, 4, 1, 5, 3, 2, 3, 0, 4, 2, 1, 2, 0, 5, 2, 6, 1, 4, 3, 1, 3}

C1) Construct the horizontal and vertical dotplots for the frequency and cumulative frequency distributions. Comment on the shapes of the distributions.

C2) Construct the horizontal and the vertical spike graphs for all six distributions. Comment on the shapes of the distributions.
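The CF, RF, CRF, PF, and CPF columns of the children table in B) all follow from the frequency column alone. The sketch below illustrates one way to derive them in Python; the function name `derived_distributions` and the choice to compute the cumulative ratios from cumulative counts (rather than by summing rounded values) are assumptions made for this example.

```python
from itertools import accumulate

# Frequency distribution from 2.15] A): number of children and frequency f.
children = [0, 1, 2, 3, 4, 5]
f = [2, 3, 5, 4, 2, 1]


def derived_distributions(freqs):
    """Return (CF, RF, CRF, PF, CPF) for a list of class frequencies.

    Ratios are rounded to four decimals and percents to two, and the
    cumulative columns are computed from cumulative counts so that
    rounding errors do not accumulate.
    """
    n = sum(freqs)
    cf = list(accumulate(freqs))
    rf = [round(x / n, 4) for x in freqs]
    crf = [round(c / n, 4) for c in cf]
    pf = [round(100 * x / n, 2) for x in freqs]
    cpf = [round(100 * c / n, 2) for c in cf]
    return cf, rf, crf, pf, cpf


cf, rf, crf, pf, cpf = derived_distributions(f)
print("Children  f  CF  RF      CRF     PF      CPF")
for row in zip(children, f, cf, rf, crf, pf, cpf):
    print("{:<9} {:<2} {:<3} {:<7.4f} {:<7.4f} {:<7.2f} {:<7.2f}".format(*row))
```

The same function can be applied to the frequency columns of items 2.9] through 2.13].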
D) BAR GRAPHS:

D1) Qualitative Data: Use the distributions from 2.10) to construct the corresponding bar graphs.

Car Make   f    CF   RF     CRF    PF     CPF
Nissan     6    6    0.24   0.24   24%    24%
Toyota     8    14   0.32   0.56   32%    56%
Honda      5    19   0.20   0.76   20%    76%
Lexus      2    21   0.08   0.84   8%     84%
Ford       4    25   0.16   1.00   16%    100%

[Figure: FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Car Makes, Frequency]
[Figure: CUMULATIVE FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Car Make, Cumulative Frequency]
[Figure: RELATIVE FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Car Make, Relative Frequency]
[Figure: CUMULATIVE RELATIVE FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Car Make, Cumulative Relative Frequency]
[Figure: PERCENT FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Car Make, Percent (%)]
[Figure: CUMULATIVE PERCENT FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Car Make, Cumulative Percent]

D2) Grouped Data: Use the distributions from 2.6a] to construct the corresponding bar graphs.
Classes     f    Boundaries      m    CF   RF       CRF      PF        CPF
4 to 12     2    3.5 to 12.5     8    2    0.0667   0.0667   6.67%     6.67%
13 to 21    7    12.5 to 21.5    17   9    0.2333   0.3000   23.33%    30.00%
22 to 30    12   21.5 to 30.5    26   21   0.4000   0.7000   40.00%    70.00%
31 to 39    6    30.5 to 39.5    35   27   0.2000   0.9000   20.00%    90.00%
40 to 48    3    39.5 to 48.5    44   30   0.1000   1.0000   10.00%    100.00%

[Figure: FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Class Limits, Frequency]
[Figure: CUMULATIVE FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Class Limits, Cumulative Frequency]
[Figure: RELATIVE FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Class Limits, Relative Frequency]
[Figure: CUMULATIVE RELATIVE FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Class Limits, Cumulative Relative Frequency]
[Figure: PERCENT FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Class Limits, Percents (%)]
[Figure: CUMULATIVE PERCENT FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Class Limits, Cumulative Percents (%)]

D3) Discrete Data: Use the distributions from 2.15A) to construct the corresponding bar graphs.

Children   f    Boundaries     m    CF   RF       CRF      PF        CPF
0          2    -0.5 to 0.5    0    2    0.1176   0.1176   11.76%    11.76%
1          3    0.5 to 1.5     1    5    0.1765   0.2941   17.65%    29.41%
2          5    1.5 to 2.5     2    10   0.2941   0.5882   29.41%    58.82%
3          4    2.5 to 3.5     3    14   0.2353   0.8235   23.53%    82.35%
4          2    3.5 to 4.5     4    16   0.1176   0.9412   11.76%    94.12%
5          1    4.5 to 5.5     5    17   0.0588   1.0000   5.88%     100.00%

[Figure: FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Number of Children, Frequency]
[Figure: CUMULATIVE FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Number of Children, Cumulative Frequency]
[Figure: RELATIVE FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Number of Children, Relative Frequency]
[Figure: CUMULATIVE RELATIVE FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Number of Children, Cumulative Relative Frequency]
[Figure: PERCENT FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Number of Children, Percents (%)]
[Figure: CUMULATIVE PERCENT FREQUENCY BAR GRAPHS, Vertical and Horizontal; axes: Number of Children, Cumulative Percents (%)]

D4] Exercise: Use the distributions from 2.2] to construct all corresponding bar graphs.
D5] Exercise: Use the distributions from 2.6b] to construct all corresponding bar graphs.

D6] Exercise: Use the distributions from 2.3] to construct all corresponding bar graphs.

D7) Answer the questions based on the given bar graph, which shows the number of students enrolled in Chemistry, Physics, Economics, Political Science, and Psychology courses at CA College.

[Figure: bar graph "Enrollment in Introductory Courses at CA College"; axes: Introductory Courses, Students Enrolled (0 to 350); bar labels as extracted: 350, 250, 180, 220, 150 (the assignment of bars to courses is not recoverable)]

1) How many students enrolled in the course with most students?
2) Order the courses by enrollment from lowest to highest.
3) Approximately, how many times is the enrollment in Economics bigger than the enrollment in Chemistry?
4) How many more students are in Economics than in Physics?
5) What percent of all students enrolled in Psychology?

D8) Exercise: The given bar graph shows the quarterly water charges (in U.S. Dollars) from Miami-Dade Water and Sewer Department to a particular household during the period from October-2009 to December-2010. Use it to answer the following:

[Figure: bar graph of quarterly water charges; axes: Quarter Ends, Water Charge ($)]

Quarter Ends        Dec-09   Mar-10   Jun-10   Sep-10   Dec-10
Water Charge ($)    15.74    7.51     10.34    12.43    13.41

1) Which quarter showed the least charge? How much was it?
2) Arrange the quarters by water charge from highest to lowest.
3) Approximately, how many times is the charge in Mar-10 smaller than the charge in Dec-09?
4) How much lower was the charge in Dec-10 than in Dec-09?
5) What percent of the total charges is the charge in Sep-10?

D9) The given double bar graph shows the quarterly water charges for a Miami-Dade Water and Sewer customer during the years 2009 and 2010. Use it to answer the following:

[Figure: double bar graph, series 2009 and 2010; axes: Year Quarters (First, Second, Third, Fourth), Water Bill ($); bar labels as extracted: 74, 69, 65, 67, 59, 54, 53, 47 (the assignment of bars to quarters and years is not recoverable)]

1) In which quarters was the water bill higher in 2010 than in 2009?
2) Which quarter shows the highest difference in water bills? How much is the difference?
3) How much more was the percent of the total 2010 charges than the percent of the total 2009 charges in the second quarter?

E) PARETO GRAPHS:

E1) Construct the Pareto graphs (vertical and horizontal) for the data in item 2.1].

[Figure: Pareto graphs, VERTICAL and HORIZONTAL; axes: Car Make, Frequency; bars in descending order: Toyota 8, Nissan 6, Honda 5, Ford 4, Lexus 2]

E2) Exercise: Construct the Pareto graphs (vertical and horizontal) for the data in item 2.2].

E3) Exercise: Construct the Pareto graphs (vertical and horizontal) for the data in item 2.3].

F) PIE GRAPHS:

F1) Construct the pie graphs for the data in item 2.1].

[Figure: FREQUENCY PIE GRAPH; Lexus 2, Ford 4, Honda 5, Nissan 6, Toyota 8]
[Figure: RELATIVE FREQUENCY PIE GRAPH; Lexus 0.08, Ford 0.16, Honda 0.20, Nissan 0.24, Toyota 0.32]
[Figure: PERCENT PIE GRAPH; Lexus 8%, Ford 16%, Honda 20%, Nissan 24%, Toyota 32%]

F2) Exercise: Construct the pie graphs (frequency, relative frequency, percent) for the data in 2.2].

F3) Exercise: Customarily, economists examine the educational background of the employees of a company when studying the company's employee productivity. The table below shows the frequency distribution of the highest degrees earned by the 200 employees at CG Corporation. Complete the relative frequency (RF) and the percent frequency distributions, and construct the pie graphs (frequency, relative frequency, percent frequency) for these data.

Degree         f    RF   Percent
High School    44
Bachelor's     54
Master's       42
Doctorate      38
Other          22

G] STEM-AND-LEAF DISPLAY

G1] Construct the stem-and-leaf display using the 30 scores in item 2.6: 35, 27, 42, 22, 28, 38, 32, 25, 14, 22, 9, 21, 13, 33, 46, 25, 39, 18, 24, 4, 22, 20, 25, 14, 24, 45, 29, 21, 36, 25

G2] Use the given stem-and-leaf display to answer the questions that follow it.

STEM   LEAF
6      568
5      0112335889999
4      2225566689
3      12237
2      36
1      2
0      4

a) How many observations are there in the data set?
b) Give the values of the stem and the leaf (separately) for the first observation in the third row from the bottom.
c) List all the observations in the original data set.
d) Which observation is the most repeated?
e) Give the value of the middlemost observation.
f) Give the value of the largest observation.
g) Name the (approximate) shape of the distribution.

G3] A sample of 23 drivers was obtained to study the number of miles they drive (rounded to integers) during a typical day. Recorded values: 25, 50, 29, 35, 47, 11, 39, 21, 38, 5, 36, 23, 43, 33, 26, 16, 38, 23, 34, 18, 49, 27, 38. Construct the stem-and-leaf display and comment on the shape of the distribution.

G4] Use the given stem-and-leaf display to answer the questions that follow it.

STEM   LEAF
0      03
1      0124444458999
2      122345668
3      45667
4      236
5      57
6      4

a) How many observations are there in the data set?
b) Give the values of the stem and the leaf (separately) for the first observation in the third row from the top.
c) List the five largest observations.
d) Which observation is the most repeated?
e) Give the value of the middlemost observation.
f) Give the value of the smallest observation.
g) Name the (approximate) shape of the distribution.

H) HISTOGRAMS

H1) Qualitative Data: Use the distributions of 2.10] to construct the corresponding histograms.
Car Make   f    CF   RF     CRF    PF     CPF
Nissan     6    6    0.24   0.24   24%    24%
Toyota     8    14   0.32   0.56   32%    56%
Honda      5    19   0.20   0.76   20%    76%
Lexus      2    21   0.08   0.84   8%     84%
Ford       4    25   0.16   1.00   16%    100%

[Figure: FREQUENCY HISTOGRAMS, Vertical and Horizontal; axes: Car Makes, Frequency]
[Figure: CUMULATIVE FREQUENCY HISTOGRAMS, Vertical and Horizontal; axes: Car Make, Cumulative Frequency]
[Figure: RELATIVE FREQUENCY HISTOGRAMS, Vertical and Horizontal; axes: Car Make, Relative Frequency]

Construct the CRF, PF, and CPF Histograms.

H2) Grouped Data: Use the distributions of 2.6a] to construct the corresponding histograms.

Classes     f    Boundaries      m    CF   RF       CRF      PF        CPF
4 to 12     2    3.5 to 12.5     8    2    0.0667   0.0667   6.67%     6.67%
13 to 21    7    12.5 to 21.5    17   9    0.2333   0.3000   23.33%    30.00%
22 to 30    12   21.5 to 30.5    26   21   0.4000   0.7000   40.00%    70.00%
31 to 39    6    30.5 to 39.5    35   27   0.2000   0.9000   20.00%    90.00%
40 to 48    3    39.5 to 48.5    44   30   0.1000   1.0000   10.00%    100.00%

[Figure: FREQUENCY HISTOGRAMS, Vertical and Horizontal; axes: Class Boundaries (3.5 to 48.5), Frequency]
[Figure: PERCENT FREQUENCY HISTOGRAMS, Vertical and Horizontal; axes: Class Boundaries, Percent (%)]
[Figure: CUMULATIVE PERCENT FREQUENCY HISTOGRAMS, Vertical and Horizontal; axes: Class Boundaries, Cumulative Percent (%)]

Construct the CF, RF, CRF Histograms.

H3) Discrete Data: Use the distributions of 2.15A) to construct the corresponding histograms.

Children   f    Boundaries     m    CF   RF       CRF      PF        CPF
0          2    -0.5 to 0.5    0    2    0.1176   0.1176   11.76%    11.76%
1          3    0.5 to 1.5     1    5    0.1765   0.2941   17.65%    29.41%
2          5    1.5 to 2.5     2    10   0.2941   0.5882   29.41%    58.82%
3          4    2.5 to 3.5     3    14   0.2353   0.8235   23.53%    82.35%
4          2    3.5 to 4.5     4    16   0.1176   0.9412   11.76%    94.12%
5          1    4.5 to 5.5     5    17   0.0588   1.0000   5.88%     100.00%

[Figure: FREQUENCY HISTOGRAMS, Vertical and Horizontal; axes: Number of Children, Frequency]
[Figure: CUMULATIVE FREQUENCY HISTOGRAMS, Vertical and Horizontal; axes: Number of Children, Cumulative Frequency]
[Figure: RELATIVE FREQUENCY HISTOGRAMS, Vertical and Horizontal; axes: Number of Children, Relative Frequency]

Construct the CRF, PF, and the CPF Histograms.

H4] a) Use the distribution of 2.6a] to construct the corresponding histograms but using the class marks instead of the class boundaries.
b) Use the distribution of 2.6b] to construct the corresponding histograms using the class boundaries.
c) Use the distribution of 2.6b] to construct the corresponding histograms but using the class marks instead of the class boundaries.

I] POLYGONS

I1) Grouped Data: Use the distribution of 2.6a] to construct the corresponding polygons.
Classes     f    Boundaries      m    CF   RF       CRF      PF        CPF
4 to 12     2    3.5 to 12.5     8    2    0.0667   0.0667   6.67%     6.67%
13 to 21    7    12.5 to 21.5    17   9    0.2333   0.3000   23.33%    30.00%
22 to 30    12   21.5 to 30.5    26   21   0.4000   0.7000   40.00%    70.00%
31 to 39    6    30.5 to 39.5    35   27   0.2000   0.9000   20.00%    90.00%
40 to 48    3    39.5 to 48.5    44   30   0.1000   1.0000   10.00%    100.00%

Construct the FREQUENCY POLYGON.

RELATIVE FREQUENCY POLYGON

Marks   RF
8       0.0667
17      0.2333
26      0.4000
35      0.2000
44      0.1000

[Figure: RELATIVE FREQUENCY POLYGON; axes: Class Marks, Relative Frequency; the polygon is closed with a relative frequency of 0.0000 at each end]

Construct the PERCENT FREQUENCY POLYGON.

I2] Exercise: Use the distributions of item 2.6b] to construct the corresponding polygons.

J] OGIVES

J1] Grouped Data: Use the distribution of 2.6a] to construct the corresponding ogives.

Classes     f    Boundaries      m    CF   RF       CRF      PF        CPF
4 to 12     2    3.5 to 12.5     8    2    0.0667   0.0667   6.67%     6.67%
13 to 21    7    12.5 to 21.5    17   9    0.2333   0.3000   23.33%    30.00%
22 to 30    12   21.5 to 30.5    26   21   0.4000   0.7000   40.00%    70.00%
31 to 39    6    30.5 to 39.5    35   27   0.2000   0.9000   20.00%    90.00%
40 to 48    3    39.5 to 48.5    44   30   0.1000   1.0000   10.00%    100.00%

Construct the CUMULATIVE FREQUENCY OGIVE.

Construct the CUMULATIVE RELATIVE FREQUENCY OGIVE.

CUMULATIVE PERCENT FREQUENCY OGIVE

Marks   CPF
8       6.67
17      30.00
26      70.00
35      90.00
44      100.00

[Figure: CUMULATIVE PERCENT FREQUENCY OGIVE; axes: Class Marks, Cumulative Percent; the curve starts at 0.00]

J2] Exercise: Use the distributions of item 2.6b] to construct the corresponding ogives.

K] SCATTERPLOTS

[Figure: six scatterplot panels of Y versus X, titled Positive Linear Relation, Negative Linear Relation, Nonlinear (Curvilinear) Relation, Exponential Relation, Sinusoidal Relation, No Discernible Relation]

K1] Example: A large corporation is planning to open a nationwide chain of sporting goods.
A market analysis is conducted to examine the relationship between the variable weekly income (x) and weekly household expenditure on recreation (y). Eight families were interviewed and the recorded data are shown below. Construct the scatter diagram and comment on the relationship between the two variables.

Income         900   800   600   400   700   500   300   200
Expenditure    90    72    54    50    69    60    30    25

K2] Exercise: A consumer is interested in estimating the price of a car based on how old the car is. She takes a random sample of ten used cars of the same make and model. The table below shows the price of the car (y, in $1000's) and the age (x, in years). Construct the scatterplot and comment on the relationship between the two variables.

Age (years)      1      2      3      4      5      6      7      8     9     10
Price ($1000)    18.5   16.0   15.2   12.5   13.1   10.5   11.0   9.5   6.5   6.1

L] TIME-SERIES PLOTS

L1] Example: The given data were collected to analyze changes in farm population (P, in millions) over time (t, in years). The period selected was from 1998 to 2005. Construct the time-series plot and comment on the relationship between the variables.

Year          1998   1999   2000   2001   2002   2003   2004   2005
Population    14.3   13.5   11.2   9.9    8.7    7.9    5.8    6.6

SOLUTION

[Figure: Time Series Plot; axes: Year (1996 to 2006), Farm Population (in millions) (0 to 20)]

Comment: There seems to be a negative linear relationship between farm population and time in years. As the years passed from 1998 to 2005, the farm population appeared to be decreasing.

L2] Exercise: The yearly sales (in million dollars) from 1993 to 2003 reported by Microsoft Corporation are shown in the given table. Construct the time-series plot and comment on the relationship between the variables.

Year     1993   1994   1995   1999   1996   1997   1998   2000   2001   2002   2003
Sales    0.75   1.55   2.35   2.22   2.34   2.54   2.55   2.75   3.11   3.24   3.15

ANSWERS TO SELECTED ITEMS

2.2]
Beverage   f
Coffee     4
Soda       5
Tea        2
Water      7
O.J.       3

2.3]
Children   f
0          4
1          6
2          5
3          2
4          1

2.4]
Gender   Baseball   Football   Basketball
Women    30         70         40
Men      20         50         40

2.5]
Customer       Satisfied   Unsatisfied   Indifferent
Purchase       300         15            5
No Purchase    70          9             1

2.7] a]
Classes     f    Boundaries      m
10 to 21    9    9.5 to 21.5     15.5
22 to 33    12   21.5 to 33.5    27.5
34 to 45    13   33.5 to 45.5    39.5
46 to 57    3    45.5 to 57.5    51.5
58 to 69    3    57.5 to 69.5    63.5

b]
Classes     f    Boundaries      m
10 to 19    8    9.5 to 19.5     14.5
20 to 29    9    19.5 to 29.5    24.5
30 to 39    12   29.5 to 39.5    34.5
40 to 49    6    39.5 to 49.5    44.5
50 to 59    3    49.5 to 59.5    54.5
60 to 69    2    59.5 to 69.5    64.5

2.8] A)
Classes         f    Boundaries        m
12.5 to 21.4    1    12.45 to 21.45    16.95
21.5 to 30.4    3    21.45 to 30.45    25.95
30.5 to 39.4    6    30.45 to 39.45    34.95
39.5 to 48.4    10   39.45 to 48.45    43.95
48.5 to 57.4    8    48.45 to 57.45    52.95

B)
Classes         f    Boundaries        m
1.00 to 2.49    12   0.995 to 2.495    1.745
2.50 to 3.99    16   2.495 to 3.995    3.245
4.00 to 5.49    9    3.995 to 5.495    4.745
5.50 to 6.99    4    5.495 to 6.995    6.245
7.00 to 8.49    2    6.995 to 8.495    7.745

2.10]
Make      f    CF   RF     CRF    PF     CPF
Nissan    6    6    0.24   0.24   24%    24%
Toyota    8    14   0.32   0.56   32%    56%
Honda     5    19   0.20   0.76   20%    76%
Lexus     2    21   0.08   0.84   8%     84%
Ford      4    25   0.16   1.00   16%    100%

2.11]
Beverage   f    CF   RF       CRF      PF        CPF
Coffee     4    4    0.1905   0.1905   19.05%    19.05%
Soda       5    9    0.2381   0.4286   23.81%    42.86%
Tea        2    11   0.0952   0.5238   9.52%     52.38%
Water      7    18   0.3333   0.8571   33.33%    85.71%
O.J.       3    21   0.1429   1.0000   14.29%    100.00%

2.12]
Children   f    CF   RF       CRF      PF        CPF
0          4    4    0.2222   0.2222   22.22%    22.22%
1          6    10   0.3333   0.5556   33.33%    55.56%
2          5    15   0.2778   0.8333   27.78%    83.33%
3          2    17   0.1111   0.9444   11.11%    94.44%
4          1    18   0.0556   1.0000   5.56%     100.00%

2.13]
Classes         f    CF   RF      CRF     PF       CPF
10.0 to 21.9    9    9    0.225   0.225   22.5%    22.5%
22.0 to 33.9    12   21   0.300   0.525   30.0%    52.5%
34.0 to 45.9    13   34   0.325   0.850   32.5%    85.0%
46.0 to 57.9    3    37   0.075   0.925   7.5%     92.5%
58.0 to 69.9    3    40   0.075   1.000   7.5%     100.0%

D8) 1) The one ending in Mar-2010; $7.51; 2) Dec-09, Dec-10, Sep-10, Jun-10, Mar-10; 3) ½; 4) $2.33; 5) 20.9%

D9) 1) During the first three quarters; 2) Fourth; 3) 13.3%

E2] [Figure: Pareto graphs, a) vertical and b) horizontal; axes: Beverage, Frequency; bars in descending order: Water 7, Soda 5, Coffee 4, O.J. 3, Tea 2]

E3] [Figure: Pareto graphs, a) vertical and b) horizontal; axes: Children, Frequency; bars in descending order: one 6, two 5, zero 4, three 2, four 1]

F2] [Figure: FREQUENCY PIE GRAPH; Water 7, Soda 5, Coffee 4, O.J. 3, Tea 2]
[Figure: RELATIVE FREQUENCY PIE GRAPH; Water 0.333, Soda 0.238, Coffee 0.190, O.J. 0.143, Tea 0.095]
[Figure: PERCENT PIE GRAPH; Water 33.3%, Soda 23.8%, Coffee 19.0%, O.J. 14.3%, Tea 9.5%]

G3]
STEM   LEAF
0      5
1      168
2      1335679
3      34568889
4      379
5      0

Approximately symmetric shape.

G4] a) 35; b) stem = 2; leaf = 1; c) 43, 46, 55, 57, 64; d) 14; e) 22; f) 0; g) Positively Skewed.

K2] There seems to be a negative linear relationship between the price of a used car and the age of the car (in years). Older cars seem to be associated with lower prices.

[Figure: scatterplot; axes: Age (years), Price ($1000's)]

L2] There seems to be a positive linear relationship between the sales of Microsoft Corporation and time (in years). As the years pass by, the sales seem to be increasing.

[Figure: time-series plot; axes: Year (1990 to 2005), Sales (in million dollars)]
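The stem-and-leaf display given in the answer to G3] can be reproduced programmatically. The sketch below is one possible Python version, assuming non-negative integer data with the tens digit as the stem and the units digit as the leaf; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

# The 23 recorded values (miles driven) from item G3].
miles = [25, 50, 29, 35, 47, 11, 39, 21, 38, 5, 36, 23,
         43, 33, 26, 16, 38, 23, 34, 18, 49, 27, 38]


def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digits).

    Stems with no observations are kept as empty rows so that gaps in
    the data stay visible in the display.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    return {s: rows.get(s, []) for s in range(min(rows), max(rows) + 1)}


display = stem_and_leaf(miles)
print("STEM | LEAF")
for stem, leaves in display.items():
    print(f"{stem:>4} | {''.join(str(leaf) for leaf in leaves)}")
```

The row lengths also give the frequency of each stem, which is why a stem-and-leaf display read sideways resembles a histogram of the same data.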