Frequency Distribution

Frequency Distributions: Raw Data List → Grouped into Classes → Reported in a Table
Categorical Frequency Distribution
[Blu4 Example 2-1 Page 36] Blood Types
Class    Tally    Frequency    Percent of Total
A                              [**]
B
O
AB
Total    -        [*]          [***]


There’s a list of raw data (not shown here).
The “Class” column lists the groups, the classifications we choose, one class on each line of
the table. In a “Categorical” distribution the classes are “nominal” values, probably not
numbers, like Colors or Sizes or, in this example, Blood Types.
“Tally” is where [Blu4] puts hash marks to count manually as we process the raw data.
“Frequency” is the count of how many raw data values are in the Class.
[*] Total the Frequency column. This matches the count of how many data values are in
your original raw data list.
“Percent of Total” is not in every frequency distribution, but it can be useful. Divide the
Frequency for the class in this line by the Total of all the Frequencies and express it as a
percent, rounded.
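The columns described above can be sketched in Python. This is only an illustration: the raw data list is invented here (the handout does not show the textbook's list), and the function name categorical_distribution is an assumption of this sketch.

```python
from collections import Counter

# Hypothetical raw data, invented for illustration (not the textbook's list).
raw = ["A", "B", "B", "AB", "O", "O", "O", "B", "AB", "B"]

def categorical_distribution(data):
    """Return (rows, total); each row is (class, frequency, percent of total)."""
    counts = Counter(data)                 # plays the role of the Tally column
    total = sum(counts.values())           # [*] total of the Frequency column
    rows = [(cls, freq, round(100 * freq / total))
            for cls, freq in counts.items()]
    return rows, total

rows, total = categorical_distribution(raw)
for cls, freq, pct in rows:
    print(f"{cls:<6}{freq:>4}{pct:>5}%")
print(f"{'Total':<6}{total:>4}")
```

The total of the Frequency column equals len(raw), which is the check described for [*].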
Numerical Frequency Distribution, Ungrouped
[Blu4 Example 2-3 Page 41] SUVs’ MPG ratings
Class    Class Boundaries    Tally    Frequency    Cumulative Frequency
12       11.5-12.5                                 [**]
13       12.5-13.5
14       13.5-14.5
15       14.5-15.5
16       15.5-16.5
17       16.5-17.5
18       17.5-18.5
19       18.5-19.5
Total                        -

Raw data is a list of MPG ratings; the only values that occurred were 12, 13,
14, 15, 16, 17, 18, 19.
No grouping was done because the number of classes is naturally small
enough for this particular set of data.
Class Boundaries: 11.5-12.5, for example, is for 11.5 ≤ 𝑥 < 12.5. If there
were a 12.5, it would go into the next class, 13, where 12.5 ≤ 𝑥 < 13.5.
Cumulative Frequency: total frequencies as you go along. Not every
Frequency Distribution has such a column. Sometimes it’s useful.
“Percent of Total” isn’t in this example but it could have been used.
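A minimal sketch of an ungrouped distribution with class boundaries and a running Cumulative Frequency column follows. The MPG values are invented for illustration and the function name ungrouped_distribution is an assumption; the sketch assumes whole-number data.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical MPG ratings, invented for illustration (not the textbook's data).
mpg = [12, 14, 14, 15, 15, 15, 16, 16, 17, 19]

def ungrouped_distribution(data):
    """One class per integer value from the lowest to the highest datum."""
    counts = Counter(data)
    # Keep every value in between, so empty classes in the middle still appear.
    classes = list(range(min(data), max(data) + 1))
    boundaries = [(c - 0.5, c + 0.5) for c in classes]   # lower <= x < upper
    freqs = [counts[c] for c in classes]
    cumulative = list(accumulate(freqs))                 # running total of freqs
    return classes, boundaries, freqs, cumulative

classes, boundaries, freqs, cumulative = ungrouped_distribution(mpg)
for row in zip(classes, boundaries, freqs, cumulative):
    print(row)
```

The last cumulative frequency always equals the total number of data values.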
[***]
Based on [Blu4 Section 2-2]
Document1
2/9/2016 10:02 AM - D.R.S.
Frequency Distribution, Grouped
[Blu4 Example 2-2 Page 39] Record high temperatures for each of the fifty states
Class Limits    Class Boundaries    Tally    Frequency    Cumulative Frequency
100-104         99.5-104.5
105-109         104.5-109.5
110-114         109.5-114.5
115-119         114.5-119.5
120-124         119.5-124.5
125-129         124.5-129.5
130-134         129.5-134.5
Total

Determining the Classes in this Example:
• Find the highest value and the lowest value among the raw data, 𝐻 and 𝐿.
• Compute 𝑅 = range = 𝐻 − 𝐿. This example: 𝐻 = 134, 𝐿 = 100, 𝑅 = 34.
• How many classes do you want? Usually pick from 5 to 20 classes. The example just decided to make 7 classes. So 𝐶 = 7.
• Compute class width 𝑊 = 𝑅/𝐶 and if it has a remainder, bump up. In this example, 𝑊 = 34/7 ≈ 4.9, ↑ to 5. (Not “rounding”, but “ceiling”.)
• Choose a starting point, arbitrarily, anything that’s ≤ 𝐿. They picked 100.
• Lower Class Limits: Add 𝑊, add 𝑊 again, etc.: 100, 105, … , 130.
• Upper Class Limits: Second class’s Lower Limit – 1: 105 − 1 = 104, then repeatedly add 𝑊: 109, 114, … , 134.
• Class Boundaries: Lower Class Limit – 0.5 to Upper Class Limit + 0.5.
• The Midpoint is the numeric center of the class: Class Midpoint = (Low Boundary + High Boundary) / 2.
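The class-building steps for this example can be sketched in Python. The function name make_classes and its parameters are made up for this sketch and do not come from [Blu4]; it assumes whole-number data, so that an upper limit is the next lower limit minus 1.

```python
import math

def make_classes(low, high, num_classes, start=None):
    """Return (width, classes) for integer data, following the steps above."""
    data_range = high - low                       # R = H - L
    width = math.ceil(data_range / num_classes)   # W = ceiling(R / C)
    # Caveat: if R / C has no remainder, C classes of width W stop one short
    # of H. This sketch does not handle that case.
    if start is None:
        start = low                               # any value <= L would do
    classes = []
    for i in range(num_classes):
        lower = start + i * width                 # lower limits: add W repeatedly
        upper = lower + width - 1                 # next lower limit minus 1
        low_b, high_b = lower - 0.5, upper + 0.5  # class boundaries
        classes.append({
            "limits": (lower, upper),
            "boundaries": (low_b, high_b),
            "midpoint": (low_b + high_b) / 2,
        })
    return width, classes

width, classes = make_classes(low=100, high=134, num_classes=7)
for c in classes:
    print(c["limits"], c["boundaries"], c["midpoint"])
```

With 𝐻 = 134, 𝐿 = 100 and 𝐶 = 7 this gives a width of 5 and the seven classes 100-104 through 130-134 listed in the table.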
Summary of the Rules and the Guidelines for Grouped Frequency Distributions
Procedures [Blu4 Page 42]
1) Determine the classes
a) Find the highest and the lowest values, 𝐻 and 𝐿.
b) Find the range, 𝑅 = 𝐻 − 𝐿.
c) Select the number of classes desired, 𝐶. Usually 5 ≤ 𝐶 ≤
20. Then class width, 𝑊 = 𝑐𝑒𝑖𝑙𝑖𝑛𝑔(𝑅/𝐶).
d) Lower Class Limits: Select a starting point = the lowest
value or any convenient number less than the lowest value;
add the width (repeatedly) to get the lower class limits.
e) Find the upper class limits.
f) Find the boundaries.
2) Tally the data.
3) Find the numerical frequencies from the tallies.
4) Find the cumulative frequencies.
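Steps 2 through 4 can be sketched as follows, once the starting point, width and number of classes are known. The temperature list is invented for illustration, and tally_grouped is a hypothetical helper name; the sketch assumes whole-number data and equal-width classes.

```python
from itertools import accumulate

def tally_grouped(data, start, width, num_classes):
    """Count data values per class, then build the cumulative frequencies."""
    freqs = [0] * num_classes
    for x in data:
        index = (x - start) // width          # which class x falls into
        if not 0 <= index < num_classes:
            raise ValueError(f"{x} does not fit in any class")
        freqs[index] += 1                     # one tally mark
    return freqs, list(accumulate(freqs))

# Hypothetical data values, invented for illustration.
temps = [100, 104, 105, 112, 113, 114, 118, 120, 127, 134]
freqs, cumulative = tally_grouped(temps, start=100, width=5, num_classes=7)
print(freqs)
print(cumulative)
```

Raising an error for a value outside every class is one way to notice that the classes are not exhaustive.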
Some more rules and guidelines [Blu4 Page 40]
• There should be between 5 and 20 classes. No hard and fast rule
but that’s a good rule of thumb.
• It is nice if class width = an odd number. Then you will get class
midpoints that have the same place value as the data. [But JK10
page 57 recommends an even number for class width!]
• The classes must be Mutually Exclusive. (This means the Class
Limits don’t overlap. Each data value falls into only One class.)
And the classes must be Exhaustive. Every data value fits
somewhere.
• The classes must be Continuous. No gaps, even if the class turns
out to be empty. (But you can omit a zero-frequency class at the
beginning or end of the frequency distribution, just don’t omit
any class in the middle.)
• The classes must all have the same Width.
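Several of these guidelines can be checked mechanically. The sketch below is one possible checker, not something from [Blu4]; the name check_classes is hypothetical, it assumes whole-number class limits listed in increasing order, and it does not test exhaustiveness because that needs the raw data.

```python
def check_classes(limits):
    """limits: list of (lower, upper) class limits for integer data, in order.

    Returns a list of problem descriptions; an empty list means no problem found.
    """
    problems = []
    if not 5 <= len(limits) <= 20:
        problems.append("number of classes is outside the 5-20 rule of thumb")
    # For integer data a class lower..upper holds (upper - lower + 1) values.
    widths = {upper - lower + 1 for lower, upper in limits}
    if len(widths) > 1:
        problems.append("class widths are not all equal")
    for (lo1, up1), (lo2, up2) in zip(limits, limits[1:]):
        if lo2 <= up1:
            problems.append(f"classes {lo1}-{up1} and {lo2}-{up2} overlap")
        elif lo2 > up1 + 1:
            problems.append(f"gap between {up1} and {lo2}")
    return problems

good = [(100 + 5 * i, 104 + 5 * i) for i in range(7)]
bad = [(100, 105), (105, 109), (110, 114), (115, 119), (125, 129)]
print(check_classes(good))
print(check_classes(bad))
```

The first list passes all checks; the second is reported as having unequal widths, an overlap at 105, and a gap between 119 and 125.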