Normal Distribution, Likelihood Ratios, and ROC Curves
Body Mass Indices for WNBA and NBA Players, 2013-2014 Seasons

Data Description
• Body Mass Index: BMI = 703*Weight(lbs)/(Height(in))^2
• WNBA (Females): 139 players w/ Mean=23.135, SD=2.105
• NBA (Males): 505 players w/ Mean=24.741, SD=1.720
• Distributions are approximately normal

[Figure: WNBA and NBA BMI Distributions. Normal densities f(y_F) and f(y_M) plotted against Body Mass Index (15 to 30). Females: \mu_F = 23.135, \sigma_F = 2.105. Males: \mu_M = 24.741, \sigma_M = 1.720.]

Probability and Quantile Calculations

\[ Y_F \sim N(\mu_F = 23.135,\ \sigma_F^2 = 2.105^2) \qquad Y_M \sim N(\mu_M = 24.741,\ \sigma_M^2 = 1.720^2) \]

\[ P(Y_F \ge 24.00) = P\left(Z_F \ge \frac{24.00 - 23.135}{2.105} = 0.41\right) = .3409 \]

\[ P(Y_M \ge 24.00) = P\left(Z_M \ge \frac{24.00 - 24.741}{1.720} = -0.43\right) = 1 - P(Z_M \le -0.43) = 1 - P(Z_M \ge 0.43) = 1 - .3336 = .6664 \]

95th-Percentile (0.95-quantile) for Females:

\[ P(Y_F \le q_{.95}) = .95 \iff P(Y_F \ge q_{.95}) = .05 \qquad P(Z_F \ge 1.645) = .0500 \ \text{(from Z-table and interpolation)} \]

\[ P(Z_F \ge 1.645) = P\left(\frac{Y_F - 23.135}{2.105} \ge 1.645\right) = P\big(Y_F \ge 23.135 + 1.645(2.105) = 26.60\big) = .05 \]

10th-Percentile (0.10-quantile) for Males:

\[ P(Y_M \le q_{.10}) = .10 \iff P(Y_M \ge q_{.10}) = .90 \qquad P(Z_M \le -1.282) = P(Z_M \ge 1.282) = .1000 \ \text{(from Z-table and interpolation)} \]

\[ P(Z_M \le -1.282) = P\left(\frac{Y_M - 24.741}{1.720} \le -1.282\right) = P\big(Y_M \le 24.741 - 1.282(1.720) = 22.54\big) = .10 \]

Normal Probabilities

BMI \ Gender    F         M
> 24            0.3409    0.6664
< 24            0.6591    0.3336
Total           1         1

Note: If we used >24 vs <24 as a classifier between Males and Females, about 2/3 of Males and 2/3 of Females would be classified correctly.

Other Choices of Cut-Off Values

Cut-Off  Z_F      Z_M      P(F<CO)  P(F>CO)  P(M<CO)  P(M>CO)  CorrectF  FalseM  FalseF  CorrectM
20       -1.4893  -2.7564  0.0682   0.9318   0.0029   0.9971   0.0682    0.9318  0.0029  0.9971
21       -1.0143  -2.1750  0.1552   0.8448   0.0148   0.9852   0.1552    0.8448  0.0148  0.9852
22       -0.5392  -1.5936  0.2949   0.7051   0.0555   0.9445   0.2949    0.7051  0.0555  0.9445
23       -0.0641  -1.0122  0.4744   0.5256   0.1557   0.8443   0.4744    0.5256  0.1557  0.8443
24        0.4109  -0.4308  0.6594   0.3406   0.3333   0.6667   0.6594    0.3406  0.3333  0.6667
25        0.8860   0.1506  0.8122   0.1878   0.5598   0.4402   0.8122    0.1878  0.5598  0.4402
26        1.3610   0.7320  0.9133   0.0867   0.7679   0.2321   0.9133    0.0867  0.7679  0.2321
27        1.8361   1.3134  0.9668   0.0332   0.9055   0.0945   0.9668    0.0332  0.9055  0.0945
28        2.3112   1.8948  0.9896   0.0104   0.9709   0.0291   0.9896    0.0104  0.9709  0.0291

In this table:

\[ Z_F = \frac{CO - 23.135}{2.105} \qquad \text{and} \qquad Z_M = \frac{CO - 24.741}{1.720} \]

• If we make the cut-off very low (say BMI=20), we get a very accurate test for Males (.9971 correct), but a very inaccurate test for Females (.0682 correct).
• Similarly, if we make the cut-off very high (say BMI=28), we get a very accurate test for Females (.9896 correct), but a very inaccurate test for Males (.0291 correct).
• This situation is very similar to diagnostic tests of patients for a disease.

Prior/Posterior Probabilities, Odds, Likelihood Ratios

In this population of professional basketball players, there are 139 Females and 505 Males (644 Total). T+ represents having a BMI above the cut-off value, that is, testing "Positive" for being Male; T- represents a BMI below the cut-off.

Prior Probabilities:

\[ P(F) = \frac{139}{644} = .2158 \qquad P(M) = \frac{505}{644} = .7842 \]

Prior Odds:

\[ odds = \frac{p}{1-p} \qquad odds(F) = \frac{.2158}{.7842} = .2752 \qquad odds(M) = \frac{.7842}{.2158} = 3.6339 \]

(Using the unrounded counts, odds(M) = 505/139 = 3.6331, which is the value carried through the Computations table.)

Likelihood Ratio of a Positive Test:

\[ LR(T^+) = \frac{P(T^+ \mid M)}{P(T^+ \mid F)} \]

Likelihood Ratio of a Negative Test:

\[ LR(T^-) = \frac{P(T^- \mid F)}{P(T^- \mid M)} \]

Posterior odds given a Positive Test (similar for a negative test):

\[ odds(M \mid T^+) = odds(M) \cdot LR(T^+) = odds(M)\,\frac{P(T^+ \mid M)}{P(T^+ \mid F)} \qquad odds(F \mid T^-) = odds(F) \cdot LR(T^-) = odds(F)\,\frac{P(T^- \mid F)}{P(T^- \mid M)} \]

Posterior Probabilities given a Positive Test (similar for a negative test):

\[ p = \frac{odds}{1 + odds} \qquad P(M \mid T^+) = \frac{odds(M \mid T^+)}{1 + odds(M \mid T^+)} \qquad P(F \mid T^-) = \frac{odds(F \mid T^-)}{1 + odds(F \mid T^-)} \]

Computations

Cut-Off  P(F)    P(M)    odds(F)  odds(M)  P(T+|F)  P(T+|M)  LR(T+)  odds(M|T+)  P(M|T+)
20       0.2158  0.7842  0.2752   3.6331   0.9318   0.9971   1.0701   3.8876     0.7954
21       0.2158  0.7842  0.2752   3.6331   0.8448   0.9852   1.1662   4.2370     0.8091
22       0.2158  0.7842  0.2752   3.6331   0.7051   0.9445   1.3395   4.8664     0.8295
23       0.2158  0.7842  0.2752   3.6331   0.5256   0.8443   1.6064   5.8363     0.8537
24       0.2158  0.7842  0.2752   3.6331   0.3406   0.6667   1.9576   7.1123     0.8767
25       0.2158  0.7842  0.2752   3.6331   0.1878   0.4402   2.3436   8.5144     0.8949
26       0.2158  0.7842  0.2752   3.6331   0.0867   0.2321   2.6754   9.7200     0.9067
27       0.2158  0.7842  0.2752   3.6331   0.0332   0.0945   2.8497  10.3533     0.9119
28       0.2158  0.7842  0.2752   3.6331   0.0104   0.0291   2.7912  10.1407     0.9102

Alternative Calculation using Law of Total Probability and Bayes' Rule (CO = 24):

\[ P(F) = .2158 \qquad P(M) = .7842 \qquad P(T^+ \mid F) = .3406 \qquad P(T^+ \mid M) = .6667 \]

\[ P(T^+) = P(F)\,P(T^+ \mid F) + P(M)\,P(T^+ \mid M) = .2158(.3406) + .7842(.6667) = .5963 \]

\[ P(M \mid T^+) = \frac{P(M)\,P(T^+ \mid M)}{P(T^+)} = \frac{.7842(.6667)}{.5963} = .8767 \]

[Figure: Receiver Operating Characteristic (ROC) Curve - BMI Classify as M/F. Vertical axis: Sensitivity = P(True +) = P(T+|M). Horizontal axis: 1-Specificity = P(False +) = P(T+|F). Series: True+ and 45-degree line.]

Performance of BMI as Test for M/F
• An excellent test would have a high arc toward the Northwest corner of the graph, allowing for a high sensitivity, P(T+|M), along with a low 1-specificity, P(T+|F)
• Clearly, this test does not perform particularly well (due to the large overlap in the Male/Female BMI densities)
• A commonly reported measure is the Area Under the ROC Curve (AUC): 0.5 ≤ AUC ≤ 1
• Rule of Thumb: 0.9-1 = Excellent, 0.8-0.9 = Good, 0.7-0.8 = Fair, 0.6-0.7 = Poor, 0.5-0.6 = Fail
• For this Test, AUC = 0.6621 (applying the trapezoidal rule):

\[ \int_a^b f(x)\,dx \approx \frac{b-a}{2n}\left[ f(x_0) + 2f(x_1) + \cdots + 2f(x_{n-1}) + f(x_n) \right] \quad \text{with } a = 0,\ b = 1,\ n = 197,\ f(x) = P(T^+ \mid M) \text{ as a function of } x = P(T^+ \mid F) \]
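The cut-off, likelihood-ratio, and posterior calculations above can be sketched in a few lines of Python. This is a minimal illustration, not the method used to build the slides: the function names (`norm_cdf`, `positive_rates`, `posterior_male`, `roc_auc`) are hypothetical, the normal CDF is computed from the error function instead of a Z-table, and the AUC uses a general trapezoid sum over a fine grid of cut-offs (assumed range 10 to 40) because the points on the ROC curve are not equally spaced in x.

```python
from math import erf, sqrt

# Parameters taken from the slides.
MU_F, SD_F, N_F = 23.135, 2.105, 139   # WNBA (females)
MU_M, SD_M, N_M = 24.741, 1.720, 505   # NBA (males)


def norm_cdf(y, mu, sd):
    """P(Y <= y) for Y ~ N(mu, sd^2), via the error function."""
    return 0.5 * (1.0 + erf((y - mu) / (sd * sqrt(2.0))))


def positive_rates(cutoff):
    """Return (P(T+|F), P(T+|M)) for the rule 'BMI > cutoff => Male'."""
    p_pos_f = 1.0 - norm_cdf(cutoff, MU_F, SD_F)   # 1 - specificity
    p_pos_m = 1.0 - norm_cdf(cutoff, MU_M, SD_M)   # sensitivity
    return p_pos_f, p_pos_m


def posterior_male(cutoff):
    """Return (LR(T+), odds(M|T+), P(M|T+)) at a given cut-off."""
    p_pos_f, p_pos_m = positive_rates(cutoff)
    lr_pos = p_pos_m / p_pos_f                 # LR(T+)
    post_odds = (N_M / N_F) * lr_pos           # odds(M) * LR(T+)
    return lr_pos, post_odds, post_odds / (1.0 + post_odds)


def roc_auc(lo=10.0, hi=40.0, steps=4000):
    """Trapezoidal area under the ROC curve, sweeping cut-offs from hi to lo."""
    area = 0.0
    x_prev, y_prev = 0.0, 0.0                  # start of the curve at (0, 0)
    for i in range(steps + 1):
        cutoff = hi - (hi - lo) * i / steps
        x, y = positive_rates(cutoff)
        area += (x - x_prev) * (y + y_prev) / 2.0   # one trapezoid
        x_prev, y_prev = x, y
    # Close the curve at (1, 1), where everyone tests positive.
    area += (1.0 - x_prev) * (1.0 + y_prev) / 2.0
    return area


if __name__ == "__main__":
    for co in range(20, 29):
        fpr, tpr = positive_rates(co)
        lr, odds, prob = posterior_male(co)
        print(f"{co:2d}  {fpr:.4f}  {tpr:.4f}  {lr:.4f}  {odds:.4f}  {prob:.4f}")
    print(f"AUC (trapezoid, fine grid) = {roc_auc():.4f}")
```

At a cut-off of 24 the sketch agrees with the Computations row (P(T+|F) ≈ 0.3406, P(T+|M) ≈ 0.6667, P(M|T+) ≈ 0.8767). Under the same normal model the fine-grid area comes out near 0.72, so it should not be expected to reproduce the reported 0.6621, whose grid of n = 197 intervals is not described here.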