STATS NOTES

What questions do you have concerning this chapter?
Do you have any suggestions for improving the presentation of this chapter?

Chapter 1

Objectives:
1. The student will be able to define statistics.
2. The student will be able to identify disciplines that use statistics.
3. The student will be able to classify statistics into four areas.
4. The student will be able to differentiate a population from a sample.
5. The student will be able to classify different types of data.
6. The student will be able to understand how statistics can be misused.

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting data for the purpose of assisting in making a more effective decision. Pg 4
1/09/10 “Undressing the Terror Threat” WSJ W3

Applications of statistics: accounting, economics, finance, management, marketing, sports, leisure, and households. Pg 4-5
11/3/06 “An NBA MBA” WSJ W5
3/10/10 “Are Statheads the NBA's Secret Weapon?” WSJ W10
1/03/07 “Wal-Mart Seeks New Flexibility in Worker Shifts” WSJ A1

Positive versus normative decision making.

BRANCHES OF STATISTICS
1. DESCRIPTIVE STATISTICS - TRANSFORMS DATA INTO INFORMATION
   Excel select: Tools/Data Analysis/Descriptive Statistics
   Minitab select: Stat/Display Descriptive Statistics
2. PROBABILITY - UNCERTAINTY
3. INFERENCE - DRAWS CONCLUSIONS ABOUT THE POPULATION PARAMETERS AFTER EXAMINING ONLY A SAMPLE (OR PORTION) OF THE DATA.
   Sample - A portion, or part, of the population of interest.
   Sample size and representation - random sample
   Numerical attributes of samples are called statistics and are variables.
   Population - A collection of all possible individuals, objects, or measurements of interest. Pg 7
   Numerical attributes of populations are called parameters and are constants.
4. SPECIAL TOPICS IN STATISTICS

TYPES OF DATA
Quantitative Data (units of measurement) Pg 8-9
1. CONTINUOUS
2. DISCRETE
Qualitative Data (non-numeric)

Levels of measurement Pg 10-13
Qualitative data
1.
Nominal level - categories that do not show order
2. Ordinal level - categories that show order
Quantitative data
3. Interval level - the distance between values is a constant size.
4. Ratio level - meaningful zero (absence of the characteristic), and the ratio between two numbers is meaningful.

Mutually exclusive - An individual, object, or measurement is included in only one category.
Exhaustive - Each individual, object, or measurement must appear in one category.

Pounds, minutes, # of baskets, wrong or right, shirt sizes, miles per gallon, centigrade, vanilla ice cream or other flavors, profits, net worth, counties in WI, thumbs up or down.

“It’s a Crime What Some People Do with Statistics” WSJ 8/30/00
How do you compute the percent of people falsely sentenced to death?
How do you compute the percent of people falsely executed?

Chapter 2

Objectives:
1. The student will be able to construct and interpret frequency distributions and histograms.
2. The student will be able to construct and interpret an ogive.

Frequency Distribution - Pg 22
Grouped data showing all possible outcomes and the number of each outcome. Summarizes data by forming categories of values and indicating the number of occurrences in each.

Features of a good frequency distribution
1. Class intervals are mutually exclusive (do not overlap).
2. Class intervals are of equal width (except for open-ended intervals).
3. Between 5 and 15 classes are normally used.
4. The number of data values falling in each class is indicated.
5. Try to avoid open-ended classes.

Construction of frequency distributions Pg 28-30
1. Establish the number of classes.
   n = number of observations
   k = number of classes
   $2^k \ge n$ (rule of thumb)
2. Determine the width of each class.
   $\text{width} = \frac{\text{range of data set}}{k} = \frac{\text{highest value} - \text{lowest value}}{k}$
3. Determine the class boundaries.
4. Count the number of observations in each class.
5. Present results.

Step #1 - Determine the number of classes
$2^k \ge n$: $2^7 = 128 \ge 125$, so use 7 classes.

Step #2 - Determine the class width
$\frac{\text{Maximum} - \text{Minimum}}{k} = \frac{127 - 70}{7} = 8.14$
The class width should be a whole number. I rounded up to nine. Seven classes that are nine units wide will cover 63 units, while the data span only 57 units (127 - 70). It is best if you distribute this excess of 6 units in the first and last classes. Therefore, 3 additional units should be in the first class and 3 additional units should be in the last class.

Step #3 - Determine the class boundaries
Start with the lowest value in the data set (70). Subtract the excess for the first class determined in step #2 (70 - 3 = 67).

PSI to Break    # Parts    Rel. Freq. (%)    C.F.    R.C.F. (%)
67 to 76            2           1.6             2        1.6
76 to 85           22          17.6            24       19.2
85 to 94           47          37.6            71       56.8
94 to 103          29          23.2           100       80.0
103 to 112         17          13.6           117       93.6
112 to 121          5           4.0           122       97.6
121 to 130          3           2.4           125      100.0
Totals            125         100.0

frequency distributions for discrete vs nominal data
cumulative and relative frequency distributions
Histograms - Quality Control
Excel select: Data Analysis/Histogram

Misuses of Statistics Pg 14
1. Per-item basis: per-capita, per-share, per-household, or per-transaction. Chapters 3 and 12
2. Adjustment for inflation - GNP. Chapters 17 & 18
3. Induced bias in the process of inference - Management surveys to determine place of Christmas party. Chapters 7-12, 14, & 15
4. Inappropriate comparisons of groups. Chapters 7-12, 14, & 15
   self-selection - Company asks for volunteers for an exercise program
   hidden differences - Wage discrimination; average score on basic skills test for countries. Chapters 12 and 18
5. Scale on graphs and charts. Chapters 2 & 12
6. Inaccurate interpretation of statistics. All chapters

Chapter 3

Objectives:
1. The student will be able to compute and interpret the arithmetic mean, median, mode, and weighted mean.
2.
The student will be able to explain the advantages and disadvantages of each measure of central tendency listed above.
3. The student will be able to identify the position of the arithmetic mean, median, and mode for both a symmetrical distribution and a skewed distribution.
4. The student will be able to compute and interpret measures of dispersion.
5. The student will be able to explain the advantages and disadvantages of each measure of dispersion.
6. The student will be able to compute and interpret population and sample standard deviation.
7. The student will be able to explain Chebyshev’s theorem and the Empirical (Normal) rule.

SUMMATION NOTATION
X = 1, 2, 3, 4, 5
Y = -2, -1, 0, 1, 2
$\sum x = 15$
$\sum x^2 = 55$
$(\sum x)^2 = 225$
$\sum xy = 10$
$\sum x \sum y = 0$

DESCRIPTIVE STATISTICS
Excel select: Tools/Data Analysis/Descriptive Statistics
Minitab select: Stat/Descriptive Statistics

MEASURES OF CENTRAL TENDENCY
1. Mean
$\bar{x} = \frac{\sum x}{n}$ = sample mean
DATA = 2, 2, 3, 5, 7, 7, 8, 8, 9, 10, 104
$\bar{x} = 15$ computers sold per day
The mean is the center of the data in the sense that it balances the deviations: the sum of the deviations above the mean equals the sum of the deviations below the mean.
Properties of the mean page 59.
Interpret average page 60 #8, #10
The mean balances deviations above and deviations below.
$\sum (x - \bar{x}) = 0$

SAMPLE VS POPULATION

MEAN OF A FREQUENCY DISTRIBUTION Pg 84

Which of the following is the most appropriate use of the mean (average)?
- Use an average of the last 12 months to predict next month's fuel bill in Wisconsin.
- Use an average of the last 12 months to pay your annual fuel bill.
- Use an average of the last 12 months to determine what yearly sales would be without seasonal fluctuations. Ch 19 Page 670 text

Applications
1. Wisconsin Power and Light energy averaging
2. Moving average and seasonal analysis #9 page 624 Ch 16
3. The cost of advertising in the Super Bowl is $3 million per 30-second spot, or $100,000 per second.

1A.
WEIGHTED MEAN Pg 61
$\bar{x}_w = \frac{\sum wx}{\sum w}$ = weighted mean

           EX 1   EX 2   EX 3   EX 4   HW
WEIGHTS     .2     .2     .2     .3    .1
SCORES      85     78     80     84    80
$\bar{x}_w = 81.8$

           MIDTERM   FINAL
SCORES        80       90
WEIGHTS        1        2
$\bar{x}_w = 86.7$

Baseball average versus slugging percentage
1/06/10 “In the NBA, 3 Is Cheaper Than 2” WSJ B14

1B. Geometric mean
Formula 3-4 page 69
$GM = \sqrt[n]{(1 + R_1)(1 + R_2) \cdots (1 + R_n)} - 1$
n = number of changes in the data
R = fractional change from one period to the next period. For example, if year 1 is 100 and it increases in a year to 110, the fractional change is 10/100 or .1.

You make a two-year investment of $1.
End of year 1 worth $2    100% increase
End of year 2 worth $1    50% decrease
[Determine the arithmetic mean and the geometric mean of the percentage change in your investment.]

Formula 3-5 page 70
$\text{Average percent increase over time} = \sqrt[n]{\frac{\text{Value at the end of period}}{\text{Value at the beginning of period}}} - 1$
[Invested $10,000 and after 10 years your investment grew to $30,000. Determine the average rate of return per year.]

$GM = \sqrt[n]{(1 + R_1)(1 + R_2) \cdots (1 + R_n)} - 1$ [3-4]
$R_1$ = the fractional change in period 1. For example, if your profits grew by 2 hundredths in period 1, $R_1 = .02$.
#28 text page 70

2. Median is the midpoint of the values after they have been ordered from the smallest to the largest. The center is now measured by the value that divides the data set in half: half the values are higher and half are lower than the median value. Pg 63
$\text{Position of the median value} = \frac{n + 1}{2}$
Half the data must be below the median and half must be above it. If the number of observations is odd, and 50 or less, (n + 1)/2 will indicate the position that divides the data set in half. If the data set is large, n/2 indicates the position of the median.
Median = 7 computers sold (n = 11, so the median is the 6th ordered value)
Properties of the median page 64
9/15/08 “New Evidence on Taxes and Income” WSJ
2/2/07 “Is $34.06 Per Hour ‘Underpaid’?” WSJ A19
EXCEL - Use sort function under Data menu

3. MODE - VALUE IN THE DATA COLLECTION THAT OCCURS MOST OFTEN.
1.
USE MORE OFTEN WITH DISCRETE DATA AND ESPECIALLY USEFUL FOR NOMINAL DATA.

Skewed frequency distributions Pg 114
[Would you prefer my grading to be balanced, skewed right, or skewed left?]

MEASURES OF DISPERSION Pg 73-78
1. RANGE - HIGHEST VALUE - LOWEST VALUE
2. Mean Deviation
   $MD = \frac{\sum |x - \bar{x}|}{n} = 16.18$
   Excel function key statistical AVEDEV
3. Standard Deviation of a population
   $\sqrt{\frac{\sum (x - \bar{x})^2}{n}} = 28.27$
   The average or typical distance data values are from their mean.

Sample vs Population - Excel automatically assumes the data is from a sample. If you want Excel to compute a standard deviation for a population, you must use the function key. Go to the function key, statistical, STDEVP.
$\text{Standard deviation for a sample} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

Computer sales    Mean    Mean Dev           Stan Dev
each day                  |x - mean|         (x - mean)^2
    2              15        13                169
    2              15        13                169
    3              15        12                144
    5              15        10                100
    7              15         8                 64
    7              15         8                 64
    8              15         7                 49
    8              15         7                 49
    9              15         6                 36
   10              15         5                 25
  104              15        89               7921
Totals                      178               8790

Mean Dev    16.18182
Variance    799.0909
St Dev      28.2682

Chebyshev’s Theorem Pg 81
Empirical or Normal Rule Pg 82

STANDARD DEVIATION AS A MEASURE OF RISK
VARIANCE
STEEL MILL VS APPLE ORCHARD
Quality Control
[You are deciding between two different production methods for producing ball bearings. Both will produce ball bearings with a mean of 1 cm. Production method #1 has Sx = .01 and method #2 has Sx = .001. Which production method would you choose? Explain.]
Excel select: Tools/Data Analysis/Descriptive Statistics

Chapter 4

Objectives:
1. The student will be able to construct and interpret a stem and leaf display.
2. The student will be able to compute and interpret quartiles, deciles, and percentiles.
3. The student will be able to compute and interpret skewness.

Location of median, quartiles, or percentiles Pg 107
$L_p = (n + 1)\frac{P}{100}$

Pearson's coefficient of skewness [4-2] Page 113-116
$sk = \frac{3(\bar{x} - \text{median})}{s}$
Software skewness [4-3] comes with Descriptive Statistics in Data Analysis.
1/19/06 “Is Inequality Over Wages Worsening?” WSJ A2
3/2/06 “Rich Get Richer, But Not as Fast As you Think” WSJ A2 (quintiles)

References:
“Census history and Census Politics: What We Can Expect in 1990” Research and Opinion, Urban Research Center, UW-Milwaukee, 1989
“Price of Tickets rises 8.9 percent” Wisconsin State Journal 4/6/94
“How fair are our Taxes” WSJ 1/10/96
“Unemployment duration and labor market tightness” Chicago Fed Letter, March 96
“Brewers at the bottom of the barrel in salaries” Wisconsin State Journal 11/14/96
“Life is a Gamble” WSJ
“It’s a Crime What Some People Do With Statistics” WSJ 8/30/00
“Mayday at 41,000 Feet–Watch Those Units!” American Educator Winter 2003/04
1/19/06 “Is Inequality Over Wages Worsening?” WSJ A2
3/2/06 “Rich Get Richer, But Not as Fast As you Think” WSJ A2 (quintiles)
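
Worked check for the Chapter 3 computer sales example: the sketch below is not part of the original notes. It is a minimal Python cross-check, assuming the eleven daily sales values listed under MEASURES OF CENTRAL TENDENCY, of the mean, median, mean deviation, and the population and sample standard deviations. All variable names are illustrative; only the data and the formulas come from the notes.

```python
import math
import statistics

# Daily computer sales from the Chapter 3 example (data taken from the notes).
sales = [2, 2, 3, 5, 7, 7, 8, 8, 9, 10, 104]
n = len(sales)

# Measures of central tendency.
mean = sum(sales) / n                 # sum of x divided by n
median = statistics.median(sales)     # middle value of the ordered data

# Deviations from the mean; these always sum to zero.
deviations = [x - mean for x in sales]
sum_of_deviations = sum(deviations)

# Measures of dispersion.
mean_deviation = sum(abs(d) for d in deviations) / n
sum_of_squares = sum(d ** 2 for d in deviations)
population_variance = sum_of_squares / n
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sum_of_squares / (n - 1))   # divide by n - 1 for a sample

print(f"mean                = {mean:.2f}")
print(f"median              = {median}")
print(f"sum of deviations   = {sum_of_deviations:.2f}")
print(f"mean deviation      = {mean_deviation:.2f}")
print(f"population variance = {population_variance:.4f}")
print(f"population st dev   = {population_sd:.4f}")
print(f"sample st dev       = {sample_sd:.4f}")
```

The single value of 104 pulls the mean (15) well above the median (7), and the sample standard deviation comes out larger than the population figure because it divides by n - 1 instead of n. This is the same distinction the notes draw between Excel's default sample calculation and STDEVP.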