Warm up
In a class where x̄ = 74 and σ = 8, state the interval containing the following % of marks:
a) 68%
b) 95%
c) 99.7%
Answers:
a) 66 – 82
b) 58 – 90
c) 50 – 98

Applying the Normal Distribution: Z-Scores
Chapter 3.5 – Tools for Analyzing Data
Mathematics of Data Management (Nelson)
MDM 4U

Comparing Data
Consider the following two students:
Student 1: MDM 4U, Mr. Lieff, Semester 1, 2004-2005. Mark = 84%, x̄ = 74, σ = 8
Student 2: MDM 4U, Mr. Lieff, Semester 2, 2005-2006. Mark = 83%, x̄ = 70, σ = 9.8
Can we compare the two students fairly when the mark distributions are different?

Mark Distributions for Each Class
[Figure: mark distribution for each class. Semester 1, 2004-05: axis marked 50, 58, 66, 74, 82, 90, 98. Semester 2, 2005-06: axis marked 40.6, 50.4, 60.2, 70, 79.8, 89.6, 99.4.]

Comparing Distributions
It is difficult to compare two distributions when they have different characteristics.
For example, the two histograms have different means and standard deviations.
z-scores allow us to make the comparison.
[Figure: two histograms of Count for Collection 1, one for variable a and one for variable b.]

The Standard Normal Distribution
A distribution with a mean of zero and a standard deviation of one: X~N(0, 1²)
Each element of any normal distribution can be translated to the same place on the Standard Normal Distribution using the z-score of the element.
The z-score is the number of standard deviations the piece of data lies below or above the mean.
If the z-score is positive, the data lies above the mean; if negative, below.
z = (x – x̄) / σ

Standardizing
The process of mapping a normal distribution to the standard normal distribution N(0, 1²) is called standardizing.
The standardized normal distribution has a mean of 0 and a standard deviation of 1.

Example 1
For the distribution X~N(10, 2²), determine the number of standard deviations each value lies above or below the mean:
a. x = 7
z = (7 – 10) / 2 = -1.5
7 is 1.5 standard deviations below the mean.
b. x = 18.5
z = (18.5 – 10) / 2 = 4.25
18.5 is 4.25 standard deviations above the mean (a value more than 3 standard deviations from the mean is treated as an outlier).

Example continued…
[Figure: normal curve for X~N(10, 2²), axis marked 6, 8, 10, 12, 14, 16, with x = 7 and x = 18.5 plotted. Bands of 34%, 13.5% and 2.35% lie on each side of the mean; the 95% and 99.7% spans are labelled.]

Example 2
The class mean is 68.0 and the std. dev. is 10.9. If your mark is 64, what % of the class has a mark below yours? Above yours?
z = (64 – 68.0) / 10.9 = -0.37
Using the z-score table on page 398, we get 0.3557 or 35.6%.
So 35.6% of the class has a mark less than or equal to yours.
Therefore 100 – 35.6 = 64.4% of the class has a mark above yours.

Example 3: Percentiles
The kth percentile is the data value that is greater than k% of the population.
If another student has a mark of 75, what percentile is this student in?
z = (75 – 68) / 10.9 = 0.64
From the table on page 398 we get 0.7389 or 73.9%, so the student is in the 74th percentile: their mark is greater than about 74% of the others.

Example 3: Percentiles cont’d
NOTE: Always round UP for percentiles, i.e. if 65.2% of the data is below a particular value, then that value is in the 66th percentile.
It is analogous to your age: the day after your 15th birthday, you are in your 16th year of life, so round UP!
(The textbook uses standard rounding rules, which, by the definition of a percentile, is not correct.)

Do now
The mean of an online IQ test is 110 with a standard deviation of 8. If you scored 120:
a. What is your z-score? What does it mean?
b. What % of the population are you ‘smarter than’? (use table on p. 398)
c. What percentile are you at?
Answers:
a. z = (120 – 110) / 8 = 1.25, so you are 1.25 std. dev. above the mean
b. 0.8944 or 89.44%
c. 90th percentile

Example 4: Ranges
Find the percent of data between a mark of 60 and 80.
For 60: z = (60 – 68) / 10.9 = -0.73 gives 23.3%
For 80: z = (80 – 68) / 10.9 = 1.10 gives 86.4%
86.4% – 23.3% = 63.1%
So 63.1% of the class is between a mark of 60 and 80.
See http://www.coolschool.ca/content/display.php?file=content/pmath12/standnorm

Back to the two students...
Student 1: z = (84 – 74) / 8 = 1.25
Student 2: z = (83 – 70) / 9.8 = 1.327
Student 2 has the lower mark, but a higher z-score!

Exercises
Read through the examples on pp. 180-185
Complete p. 186 #2-5, 7, 8, 10
Tomorrow: Group Design Challenge: Build a Normal Distribution using Cube-A-Links

Chapter 3 Problem
Group Design Challenge: Build a Normal Distribution using Cube-A-Links
Split into 4 groups
Each group will be given a bucket of Cube-A-Links
Your task is to build a Normal Distribution (20 mins)
We will perform a peer review of each distribution

Mathematical Indices
Chapter 3.6 – Tools for Analyzing Data
Mathematics of Data Management (Nelson)
MDM 4U

What is an Index?
An arbitrarily defined number that provides a measure of scale
Used to indicate a value so that we can make comparisons, but does not represent an actual measurement or quantity
Interval data (no meaningful starting point)

1) BMI – Body Mass Index
A mathematical formula created to determine whether a person’s mass puts them at risk for health problems
BMI = m / h², where m = mass in kg, h = height in m
Standard / Metric BMI Calculator: http://nhlbisupport.com/bmi/bmicalc.htm
Underweight: Below 18.5
Normal: 18.5 - 24.9
Overweight: 25.0 - 29.9
Obese: 30.0 and Above
NOTE: BMI is not accurate for athletes and the elderly

2) Slugging Percentage
Baseball is the most statistically analyzed sport in the world
A number of indices are used to measure the value of a player
Batting Average (AVG) measures a player’s ability to get on base with a hit (hits / at bats); it can be read as a probability
Slugging percentage (SLG) also takes into account the number of bases that a player earns (total bases / at bats)
SLG = TB / AB, where TB = 1B + 2B×2 + 3B×3 + HR×4
and 1B = singles, 2B = doubles, 3B = triples, HR = home runs, AB = at bats

Slugging Percentage Example
e.g. Miguel Cabrera (first baseman), Detroit Tigers
http://sports.yahoo.com/mlb/players/7163
2008 Statistics: 616 AB, 180 H, 36 2B, 2 3B, 37 HR
NOTE: H (hits) includes 1B as well as 2B, 3B and HR
So 1B = H – (2B + 3B + HR) = 180 – (36 + 2 + 37) = 105

Slugging Percentage Example cont’d
SLG = (1B + 2×2B + 3×3B + 4×HR) / AB
= (105 + 2×36 + 3×2 + 4×37) / 616
= 331 / 616
= 0.537 (3 decimal places)
This means Miggy attained 0.537 bases per AB

3) Moving Average
Used when time-series data show a great deal of fluctuation (e.g. stocks, currency exchange)
Average of the most recent n values
e.g. 5-Day Moving Average
cannot calculate until the 5th day
value for Day 5 is the average of Days 1-5
value for Day 6 is the average of Days 2-6
e.g. Look up a stock symbol at http://ca.finance.yahoo.com
Click Charts, then Technical chart, then n-Day Moving Average
Useful for showing long term trends

Other Examples
1) Consumer Price Index (CPI)
An indicator of changes in Canadian consumer prices
Compares the cost of a fixed basket of commodities through time
Commodities are of unchanging or equivalent quantity and quality, so the index reflects only pure price change.
http://www.statcan.gc.ca/cgibin/imdb/p2SV.pl?Function=getSurvey&SDDS=2301&lang=en&db=imdb&adm=8&dis=2

What is included in the CPI?
8 major categories:
FOOD AND BEVERAGES (breakfast cereal, milk, coffee, chicken, wine, full service meals, snacks)
HOUSING (rent of primary residence, owners' equivalent rent, fuel oil, bedroom furniture)
APPAREL (men's shirts and sweaters, women's dresses, jewelry)
TRANSPORTATION (new vehicles, airline fares, gasoline, motor vehicle insurance)
MEDICAL CARE (prescription drugs and medical supplies, physicians' services, eyeglasses and eye care, hospital services)
RECREATION (televisions, toys, pets and pet products, sports equipment, admissions)
EDUCATION AND COMMUNICATION (college tuition, postage, telephone services, computer software and accessories)
OTHER GOODS AND SERVICES (tobacco and smoking products, haircuts and other personal services, funeral expenses)

Other Examples cont’d
2) NHL Fan Cost Index (FCI)
Comprises the prices of:
four (4) average-price tickets
two (2) small draft beers
four (4) small soft drinks
four (4) regular-size hot dogs
parking for one (1) car
two (2) game programs
two (2) least-expensive, adult-size adjustable caps

Other Examples cont’d
2) NHL Fan Cost Index (FCI) Details
Average ticket price represents a weighted average of season ticket prices.
Costs were determined by telephone calls with representatives of the teams, venues and concessionaires.
Identical questions were asked in all interviews.
All prices are converted to USD at the exchange rate of $1 CAD = $0.932418 USD.

MSIP / Homework
Read pp. 189-192
Complete pp. 193-195 #1a (odd), 2-3 ac, 4 (alt: calculate SLG for 3 players on your favourite team for 2009), 8, 9, 11
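
Checking the calculations with code (optional)
The calculations in these two lessons (z-scores, percentiles, BMI, slugging percentage, moving averages) can be sketched in Python as a way to check hand work. This is a minimal sketch, not part of the textbook: the function names are illustrative assumptions, and the z-score table on p. 398 is replaced by the standard library's error function, which should agree with the table when z is first rounded to two decimal places.

```python
import math


def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def percent_below(z):
    """Percent of a normal population below z (stand-in for the z-score table)."""
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))


def percentile(z):
    """Percentile for a z-score, rounding UP as the lesson recommends."""
    return math.ceil(percent_below(z))


def bmi(mass_kg, height_m):
    """Body Mass Index: mass in kg divided by height in m, squared."""
    return mass_kg / height_m ** 2


def slugging(hits, doubles, triples, home_runs, at_bats):
    """Slugging percentage: total bases divided by at bats."""
    singles = hits - (doubles + triples + home_runs)
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def moving_average(values, n):
    """n-value moving average; the first result belongs to the nth value."""
    return [sum(values[i - n + 1:i + 1]) / n for i in range(n - 1, len(values))]


# "Do now": IQ test with mean 110 and standard deviation 8, score of 120.
z = z_score(120, 110, 8)
print(z)                        # 1.25
print(percentile(round(z, 2)))  # 90

# The two students: Student 2 has the lower mark but the higher z-score.
print(z_score(84, 74, 8) < z_score(83, 70, 9.8))  # True

# Miguel Cabrera, 2008: 616 AB, 180 H, 36 2B, 2 3B, 37 HR.
print(round(slugging(180, 36, 2, 37, 616), 3))    # 0.537

# 5-day moving average of six made-up daily values (Days 1-5, then Days 2-6).
print(moving_average([1, 2, 3, 4, 5, 6], 5))      # [3.0, 4.0]
```

Rounding z to two decimals before calling percent_below mirrors how the table is read; without that rounding the percentages can differ from the table values in the first decimal place.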