Warm up
In a class where the mean is x̄ = 74 and the standard deviation is σ = 8, state the interval containing the following % of marks:
a) 68%
b) 95%
c) 99.7%
Answers:
a) 66 – 82
b) 58 – 90
c) 50 – 98

Applying the Normal Distribution: Z-Scores
Chapter 3.5 – Tools for Analyzing Data
Mathematics of Data Management (Nelson)
MDM 4U

AGENDA
Comparing Data
Standard Normal Distribution
Ex. 1: z-scores
Ex. 2: Percentage of data below/above
Ex. 3: Percentiles
Ex. 4: Ranges
MSIP / Homework

Comparing Data
Consider the following two students:
Student 1: MDM 4U, Mr. Norbraten, Semester 1. Mark = 84%, x̄ = 74, σ = 8
Student 2: MDM 4U, Mr. Lieff, Semester 2. Mark = 83%, x̄ = 70, σ = 9.8
Can we compare the two students fairly when the mark distributions are different?

Mark Distributions for Each Class
[Figure: mark distribution for each class. Semester 1 axis: 50, 58, 66, 74, 82, 90, 98. Semester 2 axis: 40.6, 50.4, 60.2, 70, 79.8, 89.6, 99.4.]

Comparing Distributions
It is difficult to compare two distributions when they have different characteristics.
For example, the two histograms have different means and standard deviations.
z-scores allow us to make the comparison.
[Figure: two histograms of Count from Collection 1, labelled a and b.]

Standard Normal Distribution
[Figure: standard normal curve, axis from -3 to 3. 68% of the data lies within 1 standard deviation of the mean (34% on each side), 95% within 2 (a further 13.5% on each side) and 99.7% within 3 (a further 2.35% on each side).]

The Standard Normal Distribution
A distribution with a mean of zero and a standard deviation of one: X~N(0,1²)
A z-score translates a value from any Normal distribution to the Standard Normal Distribution
A z-score is the number of standard deviations a value lies below or above the mean
A positive z-score means the value lies above the mean; a negative z-score means it lies below
z = (x – x̄) / σ

Example 1
For the distribution X~N(10,2²), determine the number of standard deviations each value lies above or below the mean:
a. x = 7
z = (7 – 10) / 2
z = -1.5
b.
x = 18.5
z = (18.5 – 10) / 2
z = 4.25
7 is 1.5 standard deviations below the mean
18.5 is 4.25 standard deviations above the mean (anything beyond 3 standard deviations is an outlier)

Example continued…
[Figure: normal curve for X~N(10,2²) with the 68%, 95% and 99.7% bands, axis marked 6, 8, 10, 12, 14, 16, and the values 7 and 18.5 located on it.]

Standard Deviation
A recent math quiz offered the following data:
mean = 68.0
standard deviation = 10.9
z-scores offer a way to compare scores among members of the class, find out what % had a mark greater than yours, indicate position (percentile) in the class, etc.
[Figure: histogram of Test 1, Count against marks, marks axis from 40 to 90 in steps of 5.]

Example 2: If your mark was 64, what % of the class scored lower?
Compare your mark to the rest of the class:
z = (64 – 68.0) / 10.9 = -0.37
Using the z-score table on page 398, we get 0.3557 or 35.6%
So 35.6% of the class has a mark less than or equal to yours
What % scored higher?
100 – 35.6 = 64.4%

Example 3: Percentiles
The kth percentile is the data value that is greater than k% of the population
If another student has a mark of 75, what percentile is this student in?
z = (75 – 68) / 10.9 = 0.64
From the table on page 398 we get 0.7389 or 73.9%, so the student is in the 74th percentile: their mark is greater than 74% of the others

Example 4: Ranges
Now find the percent of data between a mark of 60 and a mark of 80
For 60: z = (60 – 68) / 10.9 = -0.73, which gives 23.3%
For 80: z = (80 – 68) / 10.9 = 1.10, which gives 86.4%
86.4% – 23.3% = 63.1%
So 63.1% of the class is between a mark of 60 and a mark of 80

Back to the two students...
Student 1: z = (84 – 74) / 8 = 1.25
Student 2: z = (83 – 70) / 9.8 = 1.327
Student 2 has the lower mark but the higher z-score, so Student 2 did better compared to the rest of their class.

MSIP / Homework
Read through the examples on pages 180-185
Complete p. 186 #2-5, 7, 8, 10

Mathematical Indices
Chapter 3.6 – Tools for Analyzing Data
Mathematics of Data Management (Nelson)
MDM 4U

What is an Index?
An arbitrarily defined number that provides a measure of scale
Used to indicate a value so that we can make comparisons, but does not always represent an actual measurement or quantity
Interval Data (no meaningful starting point)

1) BMI – Body Mass Index
A mathematical formula created to determine whether a person’s mass puts them at risk for health problems
BMI = m / h², where m = mass in kg, h = height in m
Standard / Metric BMI Calculator: http://nhlbisupport.com/bmi/bmicalc.htm
Underweight: below 18.5
Normal: 18.5 - 24.9
Overweight: 25.0 - 29.9
Obese: 30.0 and above
NOTE: BMI is not accurate for athletes and the elderly

2) Slugging Percentage
Baseball is the most statistically analyzed sport in the world
A number of indices are used to measure the value of a player
Batting Average (AVG) measures a player’s ability to get on base with a hit (hits / at bats); it can be read as a probability
Slugging percentage (SLG) also takes into account the number of bases that a player earns (total bases / at bats)
SLG = TB / AB, where
TB = 1B + 2B×2 + 3B×3 + HR×4, OR
TB = H + 2B + 3B×2 + HR×3
and 1B = singles, 2B = doubles, 3B = triples, HR = home runs, H = hits, AB = at bats

Slugging Percentage Example
e.g. Miguel Cabrera, first baseman, Detroit Tigers
http://sports.yahoo.com/mlb/players/7163
2008 Statistics: 616 AB, 180 H, 36 2B, 2 3B, 37 HR
NOTE: H (hits) includes 1B as well as 2B, 3B and HR
So 1B = H – (2B + 3B + HR)
= 180 – (36 + 2 + 37)
= 105

Slugging Percentage Example cont’d
SLG = (1B + 2×2B + 3×3B + 4×HR) / AB
= (105 + 2×36 + 3×2 + 4×37) / 616
= 331 / 616
= 0.537 (3 decimal places)
This means Miggy attained an average of 0.537 bases per AB

3) Moving Average
Used when time-series data show a great deal of fluctuation (e.g. stocks, currency exchange)
Average of the previous n values
e.g. 5-Day Moving Average
cannot be calculated until the 5th day
value for Day 5 is the average of Days 1-5
value for Day 6 is the average of Days 2-6
e.g.
Look up a stock symbol at http://ca.finance.yahoo.com
Click Charts > Technical chart > n-Day Moving Average
Useful for showing long-term trends

Other Examples
1) Consumer Price Index (CPI)
An indicator of changes in Canadian consumer prices
Compares the cost of a fixed basket of commodities through time
Commodities are of unchanging or equivalent quantity and quality, so the index reflects only pure price change
http://www.statcan.gc.ca/cgi-bin/imdb/p2SV.pl?Function=getSurvey&SDDS=2301&lang=en&db=imdb&adm=8&dis=2

What is included in the CPI?
8 major categories:
FOOD AND BEVERAGES (breakfast cereal, milk, coffee, chicken, wine, full service meals, snacks)
HOUSING (rent of primary residence, owners' equivalent rent, fuel oil, bedroom furniture)
APPAREL (men's shirts and sweaters, women's dresses, jewelry)
TRANSPORTATION (new vehicles, airline fares, gasoline, motor vehicle insurance)
MEDICAL CARE (prescription drugs and medical supplies, physicians' services, eyeglasses and eye care, hospital services)
RECREATION (televisions, toys, pets and pet products, sports equipment, admissions)
EDUCATION AND COMMUNICATION (college tuition, postage, telephone services, computer software and accessories)
OTHER GOODS AND SERVICES (tobacco and smoking products, haircuts and other personal services, funeral expenses)

Other Examples cont’d
2) NHL Fan Cost Index (FCI)
Comprises the prices of:
four (4) average-price tickets
two (2) small draft beers
four (4) small soft drinks
four (4) regular-size hot dogs
parking for one (1) car
two (2) game programs
two (2) least-expensive, adult-size adjustable caps

Other Examples cont’d
2) NHL Fan Cost Index (FCI) Details
Average ticket price represents a weighted average of season ticket prices.
Costs were determined by telephone calls with representatives of the teams, venues and concessionaires.
Identical questions were asked in all interviews.
All prices are converted to USD at the exchange rate of $1 CAD = $0.932418 USD.

MSIP / Homework
Read pp.
189-192
Complete pp. 193-195 #1a (odd), 2-3 ac, 4 (alt: calculate SLG for 3 players on your favourite team for 2010), 8, 9, 11

References
Halls, S. (2004). Body Mass Index Calculator. Retrieved October 12, 2004, from http://www.halls.md/body-mass-index/av.htm
Wikipedia. (2004). Online Encyclopedia. Retrieved September 1, 2004, from http://en.wikipedia.org/wiki/Main_Page
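
Checking the Worked Examples in Code
The z-score, slugging percentage and moving average calculations from these slides can be reproduced in a few lines. What follows is a minimal Python sketch and not part of the original lessons. The function names are made up for illustration, statistics.NormalDist (Python 3.8+) stands in for the z-score table on page 398, and the six daily prices in the moving average example are hypothetical values. The slides round z to two decimal places before using the table, so percentages computed here may differ slightly in the last digit.

```python
from statistics import NormalDist


def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def percent_below(x, mean, sd):
    """Percent of a Normal(mean, sd) population at or below x (replaces the table)."""
    return 100 * NormalDist(mean, sd).cdf(x)


def slugging(ab, h, doubles, triples, hr):
    """SLG = TB / AB, using TB = H + 2B + 3B*2 + HR*3."""
    total_bases = h + doubles + 2 * triples + 3 * hr
    return total_bases / ab


def moving_average(values, n):
    """n-value moving averages; the first one uses values 1 to n."""
    return [sum(values[i - n:i]) / n for i in range(n, len(values) + 1)]


# Example 1: X ~ N(10, 2^2)
print(z_score(7, 10, 2))      # -1.5
print(z_score(18.5, 10, 2))   # 4.25

# The two students
print(round(z_score(84, 74, 8), 3))     # 1.25
print(round(z_score(83, 70, 9.8), 3))   # 1.327

# Examples 2 and 4: quiz with mean 68.0 and standard deviation 10.9.
# Results are close to, but not exactly, the table values 35.6% and 63.1%.
print(percent_below(64, 68.0, 10.9))
print(percent_below(80, 68.0, 10.9) - percent_below(60, 68.0, 10.9))

# Slugging percentage for the 2008 statistics on the slide
print(round(slugging(616, 180, 36, 2, 37), 3))   # 0.537

# 5-day moving average over six hypothetical daily prices
prices = [10, 12, 11, 13, 14, 15]
print(moving_average(prices, 5))   # [12.0, 13.0]
```

To match the table values exactly, round the z-score to two decimal places first and pass it to NormalDist().cdf, which uses the standard normal distribution by default.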