Part 24 – Statistical Tests: 3
Statistics and Data Analysis
Professor William Greene
Stern School of Business
IOMS Department
Department of Economics

A Bivariate Latent Class Correlated Generalised Ordered Probit Model with an Application to Modelling Observed Obesity Levels
William Greene
Stern School of Business, New York University
With Mark Harris, Bruce Hollingsworth, Pushkar Maitra
Monash University, Melbourne
Health Econometrics Workshop, December 4-6, 2008
University of Milan - Bicocca

Introduction
The International Obesity Taskforce (http://www.iotf.org) calls obesity one of the most important medical and public health problems of our time.
Defined as a condition of excess body fat; associated with a large number of debilitating and life-threatening disorders
Health experts argue that, given an individual’s height, their weight should lie within a certain range
Most common measure = Body Mass Index (BMI): weight (kg) / height (metres)^2
WHO guidelines:
  BMI < 18.5: underweight
  18.5 < BMI < 25: normal
  25 < BMI < 30: overweight
  BMI > 30: obese
Around 300 million people worldwide are obese, a figure likely to rise

Costs of Obesity
In the US, more people are obese than smoke or use illegal drugs
Obesity is a major risk factor for noncommunicable diseases such as heart problems and cancer
Obesity is also associated with:
  lower wages and productivity, and absenteeism
  low self-esteem
And is costly to society:
  USA: costs are around 4-8% of all annual health care expenditure, about US $100 billion
  Canada, 5%; France, 1.5-2.5%; New Zealand, 2.5%
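
As a minimal sketch of the BMI definition and the WHO cut-offs listed in the Introduction, the following Python functions compute BMI and assign a weight category. The function names `bmi` and `who_category` are illustrative. The treatment of values exactly on a cut-off (18.5, 25, 30) is an assumption, since the guidelines as listed use strict inequalities only.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def who_category(bmi_value: float) -> str:
    """Map a BMI value to one of the four WHO categories.

    Assumption: a value exactly on a cut-off falls in the higher category.
    """
    if bmi_value < 18.5:
        return "underweight"
    if bmi_value < 25:
        return "normal"
    if bmi_value < 30:
        return "overweight"
    return "obese"


if __name__ == "__main__":
    value = bmi(80.0, 1.75)
    print(round(value, 1), who_category(value))
```

In the analysis that follows, the underweight group is dropped, so only the last three categories are modelled.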

Background
So, it is clearly an enormous public health issue, worldwide!
It has also been argued that obesity is, in part, an economic phenomenon
Obesity is seen as potentially avoidable:
  Behavioural adjustments, such as to diet and increased physical activity, can be made if perceived benefits exceed costs
This is a growing area of research, but to date there have been relatively few economic and econometric analyses of obesity

An Ordered Probit Approach
A Latent Regression Model for “True BMI”
  \[ \mathit{BMI}^* = \beta_1 x_1 + \beta_2 x_2 + \dots + \varepsilon \]
  “True BMI” (a proxy for weight) is unobserved
Observation Mechanism for Weight Type
  \[ WT = \begin{cases} 0 \ (\text{Normal}) & \text{if } \mathit{BMI}^* < 0 \\ 1 \ (\text{Overweight}) & \text{if } 0 < \mathit{BMI}^* < \mu \\ 2 \ (\text{Obese}) & \text{if } \mathit{BMI}^* > \mu \end{cases} \]
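
The observation mechanism above implies three outcome probabilities once a distribution for the disturbance is chosen. The sketch below assumes a standard normal disturbance (the usual probit assumption), a lower threshold normalised to 0, and a single positive upper threshold. The names `xb` (the value of the regression index for one individual) and `mu` (the upper threshold) are illustrative and are not taken from the slides.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(xb: float, mu: float) -> tuple[float, float, float]:
    """Return (P(normal), P(overweight), P(obese)) for index xb and threshold mu.

    Assumes BMI* = xb + e with e ~ N(0, 1) and cut-offs at 0 and mu > 0.
    """
    if mu <= 0:
        raise ValueError("the upper threshold mu must be positive")
    p_normal = normal_cdf(-xb)                            # P(BMI* < 0)
    p_overweight = normal_cdf(mu - xb) - normal_cdf(-xb)  # P(0 < BMI* < mu)
    p_obese = 1.0 - normal_cdf(mu - xb)                   # P(BMI* > mu)
    return p_normal, p_overweight, p_obese


if __name__ == "__main__":
    probs = ordered_probit_probs(xb=0.4, mu=1.2)
    print([round(p, 3) for p in probs])
```

A larger index shifts probability mass from the normal category towards the obese category, which is how a positive coefficient is read in this kind of model.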

Data
US National Health Interview Survey (2005); conducted by the National Center for Health Statistics
Information on self-reported height and weight levels, BMI levels
Demographic information
Remove those underweight
Split sample (30,000+) by gender

BMI Ordered Choice Model
Here we use, conditional on class membership, lifestyle factors
Marriage comfort factor only for ‘normal’ class women
Both classes negatively associated with income, education and exercise; effects similar in magnitude except for exercise
Exercise intensity only important for ‘non-normal’ class
Home ownership only important for ‘non-normal’ class, and negative: result of differing socio-economic status distributions across classes?
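
To illustrate what "conditional on class membership" means, here is a generic two-class latent class sketch. It is not the authors' bivariate correlated specification. Each class has its own ordered probit index and threshold, and the unconditional outcome probabilities are the class-probability-weighted average of the class-specific ones. The constant class probability `p_class1`, the function names, and the numbers in the usage lines are all hypothetical, and a standard normal disturbance is assumed.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def class_probs(xb: float, mu: float) -> list[float]:
    """Ordered probit probabilities (normal, overweight, obese) within one class."""
    if mu <= 0:
        raise ValueError("the upper threshold mu must be positive")
    below_zero = normal_cdf(-xb)
    below_mu = normal_cdf(mu - xb)
    return [below_zero, below_mu - below_zero, 1.0 - below_mu]


def mixture_probs(p_class1: float, xb1: float, mu1: float,
                  xb2: float, mu2: float) -> list[float]:
    """Unconditional probabilities: weighted average over the two latent classes."""
    if not 0.0 <= p_class1 <= 1.0:
        raise ValueError("p_class1 must lie in [0, 1]")
    first = class_probs(xb1, mu1)
    second = class_probs(xb2, mu2)
    return [p_class1 * a + (1.0 - p_class1) * b for a, b in zip(first, second)]


if __name__ == "__main__":
    # Hypothetical values: 60% of individuals in class 1, 40% in class 2.
    probs = mixture_probs(0.6, xb1=0.2, mu1=1.0, xb2=0.9, mu2=1.3)
    print([round(p, 3) for p in probs])
```

Because the coefficients differ by class, the same covariate can matter in one class and not in the other, which is the pattern reported in the class-specific results.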

Males
Marriage comfort factor for both classes; higher for normal class
Income positively associated with weight levels for normal class
Home ownership: unlike females, only normal class affected, and positively
Education negatively associated, and of a similar magnitude
Exercise: as with females, negatively associated with weight, but now more pronounced in non-normal class
Vigorous exercise only important for ‘non-normal’ class

Effects of Aging on Weight Class
[Figure: effects of aging on weight class]

Effect of Education
[Figure: effect of education]

Effect of Income
[Figure: effect of income]