Basic Practice of Statistics - 3rd Edition Question: What is the average height of students in this class? Answer: Easy, ask everyone the height, and then find the Chapter 1 average (mean). Question: What is the average height of students in Picturing Distributions with Graphs University of Utah? Answer: Seems complicated. Too many people, it will be difficult to ask everyone. Ask a sufficiently large number of people (but still small compared to the total size of students) and make an estimate using tools from STATISTICS. Chapter BPS 1 - 5th Ed. 1 8/26/09 2 Statistics 1. How do we select the people to ask? How many people should we ask? 2. How do we make an estimate ? 3. Our estimate will be probably be a little different from the true value. Can we say how close it is to the true value? 3 Chapter 1 8/26/09 Statistics is a science that involves the extraction of information from numerical data obtained during an experiment or from a sample. It involves the design of the experiment or sampling procedure, the collection and analysis of the data, and making inferences (statements) about the population based upon information in a sample. Chapter BPS 1 - 5th Ed. 4 1 Basic Practice of Statistics - 3rd Edition Individuals and Variables name major megan ryan jacob antony ali guillermo hao nicos susan bs egr ba bs egr bs bs ba ba score Individuals gpa 25 35 40 50 22 24 31 35 24 the objects described by a set of data 3 2.5 3.5 2.5 2.6 2.3 3.4 3.3 2.9 may be people, animals, or things Variable any characteristic of an individual can take different values for different individuals 8/26/09 5 Variables 6 Chapter BPS 1 - 5th Ed. Case Study The Effect of Hypnosis on the Immune System Places an individual into one of several groups or categories Takes numerical values for which arithmetic operations such as adding and averaging make sense reported in Science News, Sept. 4, 1993, p. 153 Chapter BPS 1 - 5th Ed. Chapter 1 7 Chapter BPS 1 - 5th Ed. 8 2 Basic Practice of Statistics - 3rd Edition Case Study Case Study 65 college students. The Effect of Hypnosis on the Immune System 33 easily hypnotized 32 not easily hypnotized white blood cell counts measured Objective: To determine if hypnosis strengthens the disease-fighting capacity of immune cells. Chapter BPS 1 - 5th Ed. 9 Chapter BPS 1 - 5th Ed. Case Study Case Study Students randomly assigned to one of three conditions white blood cell counts re-measured after one week subjects hypnotized, given mental exercise group control group (no treatment) Chapter BPS 1 - 5th Ed. 10 the two white blood cell counts are compared for each subjects relaxed in sensory deprivation tank Chapter 1 all students viewed a brief video about the immune system. results hypnotized group showed larger jump in white blood cells “easily hypnotized” group showed largest immune enhancement 11 Chapter BPS 1 - 5th Ed. 12 3 Basic Practice of Statistics - 3rd Edition Case Study Case Study Variables measured Weight Gain Spells Heart Risk for Women quantitative Chapter BPS 1 - 5th Ed. “Weight, weight change, and coronary heart disease in women.” W.C. Willett, et. al., vol. 273(6), Journal of the American Medical Association, Feb. 8, 1995. Pre-study white blood cell count Post-study white blood cell count (Reported in Science News, Feb. 4, 1995, p. 108) 13 Case Study Chapter BPS 1 - 5th Ed. 14 Case Study Study started in 1976 with 115,818 women aged 30 to 55 years and without a history of previous CHD. Weight Gain Spells Heart Risk for Women Objective: To recommend a range of body mass index (a function of weight and height) in terms of coronary heart disease (CHD) risk in women. Chapter BPS 1 - 5th Ed. Chapter 1 15 Each woman’s weight (body mass) was determined. Each woman was asked her weight at age 18. Chapter BPS 1 - 5th Ed. 16 4 Basic Practice of Statistics - 3rd Edition Case Study Case Study Variables measured The cohort of women were followed for 14 years. The number of CHD (fatal and nonfatal) cases were counted (1292 cases). categorical Incidence of coronary heart disease Smoker or nonsmoker Family history of heart disease BPS - 5th Ed. Chapter 1 17 Chapter BPS 1 - 5th Ed. Distribution Displaying Distributions Tells what values a variable takes and how often it takes these Categorical variables values 18 Pie charts Bar graphs Can be a table, graph, or function Quantitative variables Histograms Stemplots (stem-and-leaf plots) Chapter BPS 1 - 5th Ed. Chapter 1 19 Chapter BPS 1 - 5th Ed. 20 5 Basic Practice of Statistics - 3rd Edition Class Make-up on First Day Class Make-up on First Day Data Table Pie Chart Year Count Percent Freshman 18 41.9% Sophomore 10 23.3% Junior 6 14.0% Senior 9 20.9% Total 43 100.1% 21 Chapter BPS 1 - 5th Ed. Class Make-up on First Day 22 Chapter BPS 1 - 5th Ed. Example: U.S. Solid Waste (2000) Data Table Bar Graph Material Chapter BPS 1 - 5th Ed. Chapter 1 23 Weight (million tons) Percent of total Food scraps 25.9 11.2 % Glass 12.8 5.5 % Metals 18.0 7.8 % Paper, paperboard 86.7 37.4 % Plastics 24.7 10.7 % Rubber, leather, textiles 15.8 6.8 % Wood 12.7 5.5 % Yard trimmings 27.7 11.9 % Other 7.5 3.2 % Total 231.9 100.0 % BPS - 5th Ed. Chapter 1 24 6 Basic Practice of Statistics - 3rd Edition Example: U.S. Solid Waste (2000) Example: U.S. Solid Waste (2000) Pie Chart Bar Graph 25 Chapter BPS 1 - 5th Ed. Histograms 50 Histograms: Class Intervals For quantitative variables that take many values Divide the possible values into Chapter BPS 1 - 5th Ed. (we will only consider equal widths) How many intervals? One rule is to calculate the square root of the sample size, and round up. Count how many observations fall in each interval (may change to percents) Size of intervals? Draw picture representing distribution Divide range of data (max-min) by number of intervals desired, and round to convenient number Pick intervals so each observation can only fall in exactly one interval (no overlap) Chapter BPS 1 - 5th Ed. Chapter 1 27 Chapter BPS 1 - 5th Ed. 28 7 Basic Practice of Statistics - 3rd Edition Case Study Weight Data Weight Data Introductory Statistics class Spring, 1997 Virginia Commonwealth University 29 Chapter BPS 1 - 5th Ed. Weight Data: Frequency Table Count 100 - <120 7 120 - <140 12 140 - <160 7 160 - <180 8 180 - <200 12 200 - <220 4 220 - <240 1 240 -<260 0 260 -<280 sqrt(53) = 7.2, BPS - 5th Ed. Chapter 1 Weight Data: Histogram Number of students Weight Group 30 Chapter BPS 1 - 5th Ed. 100 120 140 1 ; range (260-100=160) / 8 = Chapter 1 160 180 200 Weight 220 240 260 280 * Left endpoint is included in the group, right endpoint is not. 31 BPS - 5th Ed. Chapter 1 32 8 Basic Practice of Statistics - 3rd Edition Examining the Distribution of Quantitative Data Shape of the Data Symmetric bell shaped Overall pattern of graph other symmetric shapes Deviations from overall pattern Asymmetric Shape of the data right skewed Center of the data left skewed Spread of the data (Variation) Unimodal, bimodal Outliers 33 Chapter BPS 1 - 5th Ed. Symmetric Bell-Shaped BPS - 5th Ed. Chapter 1 34 Chapter BPS 1 - 5th Ed. Symmetric Mound-Shaped Chapter 1 35 BPS - 5th Ed. Chapter 1 36 9 Basic Practice of Statistics - 3rd Edition Symmetric Uniform BPS - 5th Ed. Asymmetric Skewed to the Left Chapter 1 37 Asymmetric Skewed to the Right BPS - 5th Ed. Chapter 1 38 Outliers Extreme values that fall outside the overall pattern May occur naturally May occur due to error in recording May occur due to error in measuring Observational unit may be fundamentally different BPS - 5th Ed. Chapter 1 Chapter 1 39 Chapter BPS 1 - 5th Ed. 40 10 Basic Practice of Statistics - 3rd Edition Stemplots (Stem-and-Leaf Plots) For quantitative variables Separate each observation into a (first part of the number) and a (the remaining part of the number) Write the stems in a vertical column; draw a vertical line to the right of the stems Write each leaf in the row to the right of its stem; order leaves if desired 41 Chapter BPS 1 - 5th Ed. Weight Data: Stemplot (Stem & Leaf Plot) BPS - 5th Ed. Chapter 1 Chapter 1 10 11 12 13 5 14 15 2 16 17 18 19 2 20 21 22 23 24 25 26 192 152 135 43 1 2 Weight Data 42 Chapter BPS 1 - 5th Ed. Weight Data: Stemplot (Stem & Leaf Plot) BPS - 5th Ed. Chapter 1 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 0166 009 0034578 00359 08 00257 555 000255 000055567 245 3 025 0 0 44 11 Basic Practice of Statistics - 3rd Edition Extended Stem-and-Leaf Plots Extended Stem-and-Leaf Plots : if all of the data values were between 150 and 179, then we may choose to use the following stems: If there are very few stems (when the data cover only a very small range of values), then we may want to create more stems by splitting the original stems. 15 15 16 16 17 17 Chapter BPS 1 - 5th Ed. 45 Leaves 0-4 would go on each upper stem (first “15”), and leaves 5-9 would go on each lower stem (second “15”). Chapter BPS 1 - 5th Ed. 46 Class Make-up on First Day Time Plots (Fall Semesters: 1985-1993) A time plot shows behavior over time. Time is always on the horizontal axis, and the variable being measured is on the vertical axis. Look for an overall pattern (trend), and deviations from this trend. Connecting the data points by lines may emphasize this trend. Look for patterns that repeat at known regular intervals (seasonal variations). Chapter BPS 1 - 5th Ed. Chapter 1 47 Chapter BPS 1 - 5th Ed. 48 12 Basic Practice of Statistics - 3rd Edition Average Tuition (Public vs. Private) Chapter BPS 1 - 5th Ed. Chapter 1 49 13