Organizing data and frequency distribution

AGSC 320 Statistical Methods, 12/8/2014
Organizing & Graphing Data

Data
• Numerical representation of reality
• Raw data: data recorded in the sequence in which they are collected, before any processing
• Qualitative or quantitative
• Depending on the measurement type, different mathematical operations can be performed

Data representation
• Graphical presentation of data
• Most common graph describing data → histogram
• A histogram relates the values taken by a variable to the frequency of occurrence of those values
• Frequency: number of times a value is recorded / observed
• Frequency distribution: organization of raw data using categories/classes and frequencies

Frequency distribution
• Class: a category grouping similar values
  – e.g. 1.2, 1.3, 1.7 in one class and 2.2, 2.5, 2.9 in another
  – e.g. deer & bear, turkey & dove, bass & salmon
• Properties of a frequency distribution:
  – There should be at least 5 classes but fewer than 20
  – Classes are mutually exclusive
  – Classes are continuous: there is no gap between two adjacent classes
  – Classes are exhaustive: any value should be found in one class
  – Classes have the same width

Relative frequency distributions
• Relative frequency of a category: relates the frequency of A PARTICULAR category to the frequency of ALL categories
    Relative frequency = frequency of that category / sum of all frequencies
• Percentage of a category: the relative frequency expressed as a percentage
    Percentage = relative frequency × 100

Creating a frequency distribution
1. Find the highest and lowest value
2. Find the range
3. Select the number of classes
  – Rule of thumb: between 5 and 20 classes
  – Sturges' Rule: c = 3.3 × log10 n + 1, equivalently c = log2 n + 1 (usually rounded up)
4. Determine the class width
5. Select a starting point and the lower class limits
6. Find the upper limit of each class
7. Find the class boundaries and class midpoints
8. Tally the raw data → find the frequencies
9. Plot the data → graph

Organizing Data: example (adapted from Ex. 2-3)
Create a frequency distribution from the following data, which are the heights of 20 trees from a stand:
50, 45, 32, 48, 56, 38, 42, 48, 55, 36, 41, 51, 30, 59, 53, 47, 57, 51, 46, 44

• Step 1: find the extreme values
• Step 2: compute the range
    Range = largest value – smallest value
• Step 3: determine the number of classes
• Step 4: compute the class width
    Width = range / number of classes
    The width is commonly rounded to a meaningful value
• Step 5: select the starting point
  – Usually the minimum value
• Step 6: determine the upper class limits
• Step 7: find the class boundaries and midpoints
• Step 8: tally the raw data
    Table columns: Class limits, Class boundaries, Midpoints, Tally

Histogram
• Histogram: graph that displays the data with contiguous bars whose heights are the class frequencies, with the classes on the abscissa (x-axis)

Frequency polygon
• Frequency polygon: line connecting the points given by each class midpoint and its class frequency
• Cumulative frequency: total number of values below the upper boundary of each class
• Ogive: curve representing the cumulative frequencies of the classes

    Class   Frequency   Cumulative frequency (ogive)
    1       2           2
    2       5           7
    3       7           14
    4       10          24
    5       9           33
    6       4           37
    7       1           38

[Figure: ogive plotted over class 1 to class 7, vertical axis from 0 to 40]

Relative frequency
• Expresses the frequency distribution relative to the total number of values
• Ratio between the class frequency and the total number of values
    Relative frequency of a class = class frequency / total number of values = number of values in the class / total number of values

    Class   Frequency   Relative frequency
    1       2           0.05
    2       5           0.13
    3       7
    4       10
    5       9
    6       4
    7       1
    Total   38

[Figure: two plots over the classes, one on a frequency scale (0 to 15) and one on a relative-frequency scale (0 to 0.3)]

Shape of histograms
• Frequency polygon: empirical representation of the distribution describing the investigated process
• The shape of the histogram is important in determining the appropriate statistical methods for analyzing the data
• Shapes illustrated:
  – Bell-shaped
  – Unimodal
  – Bimodal
  – Uniform distribution
  – Reverse-J distribution (Liocourt)
  – Symmetric
  – Left-skewed
  – Right-skewed

Stem-and-leaf display
• Another method of displaying data
  – Histograms and frequency polygons are other methods
• The stem-and-leaf technique does not lose the information of any individual observation
• Break the information into two parts:
  1. Stem: ordered list of the leading significant digits
  2. Leaf: ordered list of the values sharing the same leading digits

Stem-and-leaf display: Ex. 2-49, yards gained by 14 running backs
745 1009 921 1133 1024 848 775 800 1275 857 933 1145 967 995
1. Split each value into two parts:
  – first part → stem: the leading significant digits: 7, 9, 11, etc.
  – second part → leaf: the rest of the value: 45, 21, etc.
2. List the stem values in ascending order
  – 7, 8, 9, 10, etc.
3. Write next to each stem value the corresponding observations
  – For 7 there are 45 and 75 (corresponding to 745 and 775)

• Unsorted
     7 | 45 75
     8 | 48 00 57
     9 | 21 33 67 95
    10 | 09 24
    11 | 33 45
    12 | 75
• Sorted
     7 | 45 75
     8 | 00 48 57
     9 | 21 33 67 95
    10 | 09 24
    11 | 33 45
    12 | 75
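
The stem-and-leaf procedure can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name stem_and_leaf and the parameter leaf_digits are hypothetical, and the sketch assumes non-negative integer data whose last two digits form the leaf, as in the running-back example.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_digits=2):
    """Split each value into a stem (leading digits) and a leaf (last digits)."""
    base = 10 ** leaf_digits
    display = defaultdict(list)
    # Sorting the values first yields the sorted display directly.
    for value in sorted(values):
        stem, leaf = divmod(value, base)
        display[stem].append(leaf)
    # Return the stems in ascending order.
    return dict(sorted(display.items()))


# Yards gained by the 14 running backs.
yards = [745, 1009, 921, 1133, 1024, 848, 775, 800,
         1275, 857, 933, 1145, 967, 995]

display = stem_and_leaf(yards)
for stem, leaves in display.items():
    # Zero-pad the leaves so that 800 shows as "00" and 1009 as "09".
    print(f"{stem:>2} | " + " ".join(f"{leaf:02d}" for leaf in leaves))
```

Every original observation can be rebuilt as stem × 100 + leaf, which is the sense in which the display does not lose individual values.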

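Returning to the tree-height example, steps 1 to 8 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the method used in class: the function name frequency_distribution and the dictionary keys are hypothetical, the data are assumed to be integers, the number of classes comes from Sturges' rule rounded up, the width is taken as the next whole number above range / classes, and the starting point is the minimum value.

```python
import math

# Heights of the 20 trees from the example.
heights = [50, 45, 32, 48, 56, 38, 42, 48, 55, 36,
           41, 51, 30, 59, 53, 47, 57, 51, 46, 44]


def frequency_distribution(data):
    """Build a frequency table for integer data following steps 1 to 8."""
    low, high = min(data), max(data)                 # step 1: extreme values
    value_range = high - low                         # step 2: range
    n_classes = math.ceil(1 + math.log2(len(data)))  # step 3: Sturges' rule, rounded up
    # Step 4: next whole number above range / classes, so that the last
    # class still contains the largest value (an assumption of this sketch).
    width = value_range // n_classes + 1
    rows, cumulative = [], 0
    for k in range(n_classes):
        lower = low + k * width                      # step 5: start at the minimum
        upper = lower + width - 1                    # step 6: upper limit (integer data)
        frequency = sum(lower <= x <= upper for x in data)  # step 8: tally
        cumulative += frequency
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),  # step 7: boundaries
            "midpoint": (lower + upper) / 2,           # step 7: midpoint
            "frequency": frequency,
            "relative": frequency / len(data),         # relative frequency
            "cumulative": cumulative,                  # values for the ogive
        })
    return rows


rows = frequency_distribution(heights)
for row in rows:
    lo, hi = row["limits"]
    print(f"{lo}-{hi}  mid={row['midpoint']:.1f}  f={row['frequency']}  "
          f"rel={row['relative']:.2f}  cum={row['cumulative']}")
```

Under these assumptions the 20 heights fall into six classes of width 5, from 30-34 up to 55-59, with frequencies 2, 2, 3, 5, 4 and 4. A different choice of width or starting point would give a different but equally valid table.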