Statistics: A Gentle Introduction
By Frederick L. Coolidge, Ph.D.
Sage Publications
Chapter 2
Descriptive Statistics: Understanding Distributions of Numbers

0730 Q1 Results (N = 20)
  1 | 5
  2 | 1124456679
  3 | 001124779

0900 Q1 Results (N = 32)
  1 | 249
  2 | 0335567799
  3 | 2224444445566889
  4 | 001

Overview
  Graphs and tables
    What’s the point?
    The nasty tricks of the trade
  Types of distributions
  Grouping data
  Cumulative frequency distributions
  Stem-and-leaf plot

Graphs and Tables: What’s the point?
  Document the sources of statistical data and their characteristics.
    Where did you get it?
    What is it measuring?
  Make appropriate comparisons.
    Compare similar data.
    Make the point more clearly.
    Make data more understandable.
    Eliminate doubt.

Frequency Distributions
  Frequency distribution: a table reporting the number of observations falling into each category of the variable.
  Frequency count: the number of times a data value occurs in the data set.
  Ungrouped frequency distribution: lists each data value with the frequency count with which it occurs.
  Relative frequency: for any class, the frequency for that class divided by the total number of observations.

Cumulative Frequency (CF) and Cumulative Relative Frequency (CRF)
  CF: for a specific value in a frequency table, the sum of the frequencies for all values at or below that value.
  CRF: the sum of the relative frequencies for all values at or below the given value, expressed as a proportion.
  Grouped frequency distribution: obtained by constructing intervals for the data and listing the frequency count in each interval.

Math Anxiety Scores (N = 22)
  Score   Freq   Relative Freq   Cumulative Freq   Cumulative Relative Freq
    1       1        0.05               1                  0.05
    2       2        0.09               3                  0.14
    3       3        0.14               6                  0.27
    4       4        0.18              10                  0.45
    5       5        0.23              15                  0.68
    6       0        0.00              15                  0.68
    7       2        0.09              17                  0.77
    8       3        0.14              20                  0.91
    9       1        0.05              21                  0.95
   10       1        0.05              22                  1.00
  Cumulative relative frequency is computed as CF / N. Summing the rounded relative frequencies instead overshoots (it ends at 1.02).

Math Anxiety Scores, 7:30 Class (Grouped Frequency Distribution)
  Class Interval    F    CF     RF      CRF
    0.5-2.5         3     3    0.136   0.136
    2.5-4.5         7    10    0.318   0.455
    4.5-6.5         5    15    0.227   0.682
    6.5-8.5         5    20    0.227   0.909
    8.5-10.5        2    22    0.091   1.000

[Figure: Histogram of Math Anxiety Scores. Horizontal axis: class boundaries 0.5, 2.5, 4.5, 6.5, 8.5, 10.5. Vertical axis: relative frequency, up to .30.]

“Blacks More Pessimistic Than Whites About Economic Opportunities”
Government’s role in improving the economic position of minorities
                Non-Hispanic Whites (%)   Blacks (%)   Hispanics (%)
  Major role             32                  68             67
  Minor role             51                  22             21
  No role                16                   9              8

Laws Covering Sales of Firearms: Increase Restrictions (2000)?
                Men (N = 493)   Women (N = 538)
  More               256              387
  Less                39               11
  Same               193              129
  No opinion           5               11

Men and Firearm Restrictions: Frequency Distribution (N = 493)
                 F     CF     RF     CRF
  More          256    256    .52    .52
  Less           39    295    .08    .60
  Same          193    488    .39    .99
  No opinion      5    493    .01   1.00

Women and Firearm Restrictions: Frequency Distribution (N = 538)
                 F     CF     RF     CRF
  More          387    387   .719    .719
  Less           11    398   .020    .740
  Same          129    527   .240    .980
  No opinion     11    538   .020   1.000

Graphs and Tables: What’s the point?
  Demonstrate the mechanisms of cause and effect and express the mechanisms quantitatively.
    If you vary the cause and the results change in a predictable and uniform manner, then you make a stronger case for cause and effect.
  Recognize the inherent multivariate (more than one cause) nature of the problem.
    Is there anything with just one cause?
    Temperature of boiling water:
      Altitude of the water
      What is in the water (salt)?
  Inspect and evaluate alternative hypotheses.
    Cigarette smoking is related to a lower incidence of Alzheimer’s disease.
      Is it the cigarettes?
      Is it dying at an earlier age, before Alzheimer’s is diagnosable?

Graphs and Tables: The nasty tricks of the trade
  Adjust the scale to make the point
  Show only part of the scale
  Omit the units of measure
  Change the scale along the graph
  Include too much junk
  Not enough to bother graphing

Graphs and Tables: The nasty tricks of the trade
  Is Brand One really any better than the others?
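The F, CF, RF, and CRF columns in the firearm-restriction tables above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `frequency_table` and the dictionary `men` are illustrative assumptions, and only the counts come from the table for men (N = 493).

```python
from itertools import accumulate


def frequency_table(counts):
    """Return rows of (category, F, CF, RF, CRF) for an ordered mapping of counts."""
    total = sum(counts.values())
    # Running totals give the cumulative frequency (CF) for each row.
    cumulative = list(accumulate(counts.values()))
    rows = []
    for (category, f), cf in zip(counts.items(), cumulative):
        # RF = F / N and CRF = CF / N, both expressed as proportions.
        rows.append((category, f, cf, f / total, cf / total))
    return rows


# Counts taken from the "Men and Firearm Restrictions" table.
men = {"More": 256, "Less": 39, "Same": 193, "No opinion": 5}

for category, f, cf, rf, crf in frequency_table(men):
    print(f"{category:<11}{f:>5}{cf:>5}{rf:>7.2f}{crf:>7.2f}")
```

The sketch computes CRF as CF / N rather than by adding up rounded relative frequencies, so the last row always comes out at exactly 1.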
Stem-and-leaf plot
  Presents the frequency of data points without losing important information.
  The first digit is the stem.
  The second digit is each leaf.
  Data set: 25, 27, 29
    Stem | Leaves
      2  | 579

Stem-and-leaf plot: Let’s try it
  Data set: 30, 32, 32, 34, 37, 37, 39
  Data set: 5, 9, 10, 11, 11, 23, 25, 27

Types of Distributions: Frequency Distribution
  Showing what you have
  A way to illustrate how many of each thing.

Types of Distributions: Normal Distribution
  Also known as the bell-shaped curve
  An illustration of the expectation of what most types of data will look like
    A few data points at each extreme
    Most data points in the middle area

Types of Distributions: Positively Skewed Distribution
  Not all data are created equal
  Positive skew: many data points near the origin of the graph

Types of Distributions: Negatively Skewed Distribution
  Negative skew: many data points away from the origin of the graph

Types of Distributions: Bimodal Distribution
  Bimodal: two areas under the curve with many data points

Types of Distributions: Non-normal Distributions
  Non-normal, but not abnormal
  Platykurtic: flat like a plate
  Leptokurtic: up and down (like leaping)
  Bimodal: lumpy

Bimodal Distribution: Spring 2010 Quiz Scores (N = 22)
  Interval    F    CF     RF      CRF
   10-16      5     5    .227    .227
   17-23      3     8    .136    .364
   24-30      2    10    .091    .455
   31-37      8    18    .364    .818
   38-44      4    22    .182   1.000

Grouping data
  A way of organizing data so that they are manageable.
  Which is easier to understand?
    3, 1, 7, 4, 1, 2, 3, 5, 4, 9
    or
    1, 1, 2, 3, 3, 4, 4, 5, 7, 9

Grouping data: Tips for grouping lots of data
  Choose interval widths that reduce your data to 5 to 10 intervals.
    5, 10, 15, 20, 25, 30, 35
  Choose meaningful intervals. Which is easier to understand at a glance?
    5, 10, 15, 20, 25, 30, 35
    or
    4, 7, 10, 13, 16, 19, 22
  Interval widths must be the same.
    5, 10, 15, 20, 25, 30, 35
    NOT
    5, 10, 20, 22, 30, 33, 35
  Intervals cannot overlap.
    5-10, 11-15, 16-20, 21-25, 26-30, 31-35, 36-40
    NOT
    5-10, 10-15, 14-20, 20-26, 25-30, 30-35, 35

Grouping data: An example
  The data are displayed using:
    A frequency table of individual data points
    A frequency table by intervals
    A graph of the data by intervals

Frequency Distribution Using Stated Limits
  Age Category   Freq   CF
    20-29          7     7
    30-39          7    14
    40-49         12    26
    50-59          3    29
    60-69          3    32
    70-79          6    38
    80-89          2    40
    Total         40

Problem with Stated Limits
  There is a gap of one between adjacent intervals.
  This is a problem for scores with fractional values: where would you classify a woman who is 49.25 years old?
    Her age falls in the gap between the intervals 40-49 and 50-59.
  Real limits extend the upper and lower limits by .5.

Frequency Distribution Using Real Upper and Lower Limits
  Age Category   Freq   CF
    19.5-29.5      7     7
    29.5-39.5      7    14
    39.5-49.5     12    26
    49.5-59.5      3    29
    59.5-69.5      3    32
    69.5-79.5      6    38
    79.5-89.5      2    40
    Total         40

Upper/Lower Limits and Fractional Values
  Scores falling exactly at an upper or lower real limit are rounded to the closest even number.
    Example: 59.5 is rounded to 60 and included in the interval 59.5-69.5.
  Where would you classify a respondent who is 49.25 years old? How about 59.4?
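The real-limits rule above can be checked with a short Python sketch. It is an illustration under stated assumptions rather than the author's procedure: the names `STATED_LIMITS` and `classify` are hypothetical, and the sketch relies on the fact that Python's built-in `round()` sends exact halves to the nearest even integer, which matches the tie rule on the slide (59.5 goes to 60).

```python
# Stated limits from the age table; the real limits are these values -/+ 0.5.
STATED_LIMITS = [(20, 29), (30, 39), (40, 49), (50, 59),
                 (60, 69), (70, 79), (80, 89)]


def classify(age, intervals=STATED_LIMITS):
    """Return the stated-limit interval whose real limits contain `age`."""
    # Rounding to the nearest whole number is equivalent to widening each
    # interval by 0.5 on both sides. Python's round() resolves exact .5 ties
    # toward the even integer, so 59.5 becomes 60.
    nearest = round(age)
    for lower, upper in intervals:
        if lower <= nearest <= upper:
            return (lower, upper)
    raise ValueError(f"{age} is outside every interval")


print(classify(49.25))  # 49.25 lies between the real limits 39.5 and 49.5
print(classify(59.4))   # 59.4 lies between the real limits 49.5 and 59.5
print(classify(59.5))   # tie at a real limit: rounded to the even number 60
```

With these intervals every real limit ends in 9.5, so the round-to-even rule always moves a tied score up into the next interval.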
Cumulative Frequency Distribution
  Shows how many cases (data points) have been accounted for out of the total number of cases (data points).
  Shows how many data points have been accounted for as each group is displayed.
  Cumulative frequencies can also be illustrated using percentages.
  Cumulative distributions can help give a reference point for an individual score.
    Percentile: what percentage scored above or below the score of interest
    Quartile: divides the scores into four groups of 25% each (1st, 2nd, 3rd, 4th)
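Cumulative percentages and a simple percentile rank can be sketched in Python as follows. The function names are illustrative assumptions. The frequencies come from the age table (N = 40) and the raw scores from the 0730 quiz stem-and-leaf plot (N = 20). The sketch assumes the "at or below" convention for percentile rank, and other conventions exist.

```python
from itertools import accumulate


def cumulative_percentages(frequencies):
    """Convert interval frequencies into cumulative percentages of the total."""
    total = sum(frequencies)
    return [100 * cf / total for cf in accumulate(frequencies)]


def percentile_rank(scores, score):
    """Percentage of scores at or below `score` (one common convention)."""
    at_or_below = sum(1 for s in scores if s <= score)
    return 100 * at_or_below / len(scores)


# Frequencies for the age intervals 20-29 through 80-89.
age_freqs = [7, 7, 12, 3, 3, 6, 2]
print(cumulative_percentages(age_freqs))

# Raw scores read off the 0730 quiz stem-and-leaf plot.
quiz_0730 = [15,
             21, 21, 22, 24, 24, 25, 26, 26, 27, 29,
             30, 30, 31, 31, 32, 34, 37, 37, 39]
print(percentile_rank(quiz_0730, 30))
```

Reading the first output against the age table shows, for example, that 65% of respondents fall at or below the 40-49 interval.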