Reduced Rank Regression – a powerful statistical method for identifying empirical dietary patterns

Gina Ambrosini PhD
Senior Research Scientist
MRC Human Nutrition Research, Cambridge
EUCCONET International Workshop, Bristol, October 2011

Why dietary patterns?
The human diet is complex – we do not eat nutrients or foods in isolation
• Single food/nutrient studies are frequently null, e.g. fat intake and obesity; these do not consider total dietary intake
• Strong co-linearity between dietary variables; difficult to separate effects, may be too small to detect
• Numerous dietary variables (foods & nutrients) lead to too many statistical tests
Studies of dietary patterns, i.e. combinations of total food intake, can overcome many of these problems

What nutrition epidemiologists want to know …
[Diagram: methods for deriving a Dietary Pattern (Reduced Rank Regression; PCA or Factor Analysis; Cluster Analysis; Dietary Indices, e.g. Healthy Eating Index) and its unknown (?) link to a Disease or Health Outcome]

Empirical Dietary Patterns
E.g. Principal Components Analysis (PCA), Factor Analysis and Cluster Analysis
• Data reduction techniques; identify latent constructs in data = patterns
• Take advantage of co-linearity
• Consider total diet; ‘real-life’ consumption and synergism
• Produce uncorrelated dietary patterns (or clusters) suitable for multivariate models
[Diagram: Food Intakes -> Dietary Patterns]
• Exploratory, data-driven, study specific: reproducibility unknown in different populations
• Explain variation in food intakes but not necessarily nutrients – the end product of diet
• Not disease-specific or hypothesis-based

Reduced Rank Regression – a novel empirical approach

Reduced Rank Regression (RRR)
• A hypothesis-based empirical method for identifying dietary patterns
• Similar to PCA and factor analysis but requires a 2nd set of data = response variables
• Response variables should be on the pathway between food intake and outcome of interest
RRR dietary patterns are linear combinations of food intake that explain the maximum variation
in a set of response variables
[Diagram: Food Intake (Predictors) -> Dietary Pattern -> Nutrients or Biomarkers (Responses) -> Disease or Outcome of Interest]

Example - ALSPAC
• Measured dietary intake using a 3-day food diary at 7, 10 and 13 years of age
• We hypothesised that a dietary pattern that could explain the variation in dietary energy density, % energy from fat, and fibre at 7, 10 and 13 y would be prospectively associated with body fatness measured at 9, 11, 13 and 15 y

Example RRR - ALSPAC
[Diagram: Predictors (food group intakes from the 3-day food diary: Fruit, Veg, F3, F4, F5, F6, F7, F8…) -> Dietary Pattern 1, Dietary Pattern 2, Dietary Pattern 3 -> Responses (nutrient intakes: Fat, Fibre, Energy Density) -> OBESITY (fat mass). 1st Dietary Pattern: energy-dense, high in fat, low in fibre]
Each dietary pattern is a linear combination of weighted food intakes that explains the maximum variation in ALL response variables
- The 1st pattern often explains the most
For each dietary pattern, a z-score is calculated as:
z-score = W1(Food1 Intake) + W2(Food2 Intake) + W3(Food3 Intake) + …

ALSPAC energy-dense, high fat, low fibre dietary pattern

ALSPAC – change in Fat Mass Index (z-score) with a SD increase in energy-dense, high fat, low fibre dietary pattern z-score
Cells show the change in FMI z-score (95% CI); p-value

Girls
Age at dietary pattern | FMI @ 9 y (n=2868) | FMI @ 11 y (n=2274) | FMI @ 13 y (n=2007) | FMI @ 15 y (n=1556)
7 y | 0.08 (0.05 - 0.12); <.0001 | 0.08 (0.03 - 0.12); <0.001 | 0.07 (0.03 - 0.11); <0.001 | 0.07 (0.02 - 0.12); <0.001
10 y | | 0.05 (0.01 - 0.08); 0.01 | 0.04 (0.01 - 0.08); 0.04 | 0.05 (0.01 - 0.10); 0.02
13 y | | | | -0.01 (-0.04 - 0.03); 0.68

Boys
Age at dietary pattern | FMI @ 9 y (n=2854) | FMI @ 11 y (n=2118) | FMI @ 13 y (n=1863) | FMI @ 15 y (n=1345)
7 y | 0.09 (0.05 - 0.12); <.0001 | 0.09 (0.05 - 0.13); <.0001 | 0.06 (0.01 - 0.10); 0.012 | 0.07 (0.02 - 0.12); 0.006
10 y | | 0.01 (-0.03 - 0.04); 0.65 | 0.04 (0.01 - 0.08); 0.04 | 0.01 (-0.03 - 0.06); 0.64
13 y | | | | -0.01 (-0.05 - 0.02); 0.45

Adjusted for age at fat mass assessment, dietary misreporting, physical activity (cpm)

Cross-cohort comparisons: ALSPAC v Raine Study
PhD
project – Geeta Appannah, University of Cambridge and MRC Human Nutrition Research:
• An almost identical energy-dense, high fat, low fibre dietary pattern seen at 14 and 17 y in The Western Australian Pregnancy Cohort (Raine) Study, a contemporaneous birth cohort
• Similar factor loadings for an energy-dense, high fat, low fibre dietary pattern in a FFQ and a food diary at 14 y of age in the Raine Study

Comparisons of RRR and PCA patterns
Study | RRR response variables | Outcome
Multi-Ethnic Study of Atherosclerosis (US) | CRP, IL-6, Fibrinogen, Homocysteine | Sub-clinical atherosclerosis
EPIC Potsdam (Germany) | Fibre, Magnesium, alcohol | Type 2 Diabetes
EPIC Potsdam (Germany) | % Energy from saturated fat, PUFA, MUFA, protein and carbohydrate | All cause mortality
EPIC Potsdam (Germany) | SFA, MUFA, n-3 PUFA, n-6 PUFA | Breast cancer incidence
Tehran Lipids and Glucose Study | Total fat, PUFA/sat fat, cholesterol, fibre, calcium | Obesity
• Although the PCA and RRR patterns in these studies had similar nutrient profiles, these studies reported stronger associations between RRR-based dietary patterns and outcomes
• RRR patterns explain more variation in the response variables

Caution - using biomarkers as response variables
Biomarkers as response variables should be chosen carefully:
• So they are true intermediates and not a proxy for the outcome of interest
• Should be on pathway
• Therefore must be susceptible to dietary intake – relevant to more novel biomarkers
[Diagram: Food Intake (Predictors) -> Dietary Pattern -> Blood Glucose, Insulin Resist. (Responses) -> Diabetes]

Generalisability of RRR patterns
• Imamura et al (2010) applied RRR dietary patterns that were associated with type 2 diabetes in three different cohorts to the Framingham Offspring Study
• All patterns were characterised by high intakes of meat products, refined grains and soft drinks
Dietary Pattern | RRR response variables | Risk of T2D in Framingham Offspring Study
EPIC Potsdam (Germany) | Fibre, Magnesium, alcohol | 1.14 (0.99 – 1.32)
Nurses Health Study (US) | Inflammatory markers | 1.44 (1.25 – 1.66)
Whitehall II (UK) | Insulin resistance * | 1.16 (1.00 – 1.35)
Imamura F et al. Generalizability of dietary patterns associated with type 2 diabetes mellitus. AJCN 2010; 90(4):1075-83

Limitations
RRR appears to be a robust and powerful method, however:
• Reproducibility, generalisability of patterns – only 1 published study
• RRR depends on existing knowledge in order to choose response variables
• Response variables must be chosen very carefully to avoid circular analysis
• Biomarkers as response variables: must be an intermediate and not a proxy for the outcome/disease

Acknowledgements
Dr Pauline Emmett, Dr Kate Northstone, & the ALSPAC Study Team
Ms Geeta Appannah, PhD scholar, MRC Human Nutrition Research
Mr David Johns, PhD scholar, MRC Human Nutrition Research
Dr Anna Karin Lindroos, Swedish Food Authority, Uppsala (prev.
HNR)
Funding from: MRC Human Nutrition Research, Cambridge, UK
Gina.Ambrosini@mrc-hnr.cam.ac.uk

Reported Associations with Other RRR Dietary Patterns
Study | RRR response variables | Outcome
Multi-Ethnic Study of Atherosclerosis (US) | CRP, IL-6, Fibrinogen, Homocysteine | Sub-clinical atherosclerosis
Insulin Resistance Atherosclerosis Study (US multi-ethnic cohort) | Plasminogen activator inhibitor 1, Fibrinogen | Carotid artery atherosclerosis (IMT, CAC)
Coronary Risk Factors for Atherosclerosis in Women (CORA), Germany | LDL and HDL cholesterol, lipoprotein (a), CRP, C-peptide (insulin resist) | Coronary artery disease
Nurses Health Study (US) | Inflammatory markers | Type 2 Diabetes
Framingham Offspring Study (US) | BMI, fasting HDL-C, TG, glucose, hypertension (BP residuals) | Type 2 Diabetes
EPIC Potsdam (Germany) | Fibre, Magnesium, alcohol | Type 2 Diabetes
EPIC Potsdam (Germany) | % Energy from saturated fat, PUFA, MUFA, protein and carbohydrate | All cause mortality
EPIC Potsdam (Germany) | SFA, MUFA, n-3 PUFA, n-6 PUFA | Breast cancer incidence
Tehran Lipids and Glucose Study | Total fat, PUFA/sat fat, cholesterol, fibre, calcium | Obesity
ALSPAC | Energy density, % energy from fat, Fibre density | Child obesity at 7, 9, 11, 13, 15 y
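
Illustrative sketch (not part of the original slides)
The slides describe each RRR dietary pattern as a linear combination of weighted food intakes that explains the maximum variation in the response variables, with a z-score calculated per pattern. The Python sketch below shows one common formulation of that idea: regress the standardised responses on the standardised food intakes, then take principal components of the fitted responses. It is a minimal sketch under stated assumptions, not the analysis code used for ALSPAC. The function name rrr_dietary_patterns, the simulated data and all variable names are hypothetical.

```python
import numpy as np


def rrr_dietary_patterns(foods, responses):
    """Sketch of reduced rank regression for dietary patterns.

    foods:     (n_people, n_foods) array of food group intakes (predictors)
    responses: (n_people, n_responses) array, e.g. nutrient intakes
    Returns food weights, pattern z-scores and the share of total
    response variation explained by each pattern.
    """
    # Standardise predictors and responses (mean 0, SD 1).
    X = (foods - foods.mean(axis=0)) / foods.std(axis=0, ddof=1)
    Y = (responses - responses.mean(axis=0)) / responses.std(axis=0, ddof=1)

    # Ordinary least squares fit of every response on all food intakes.
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    Y_hat = X @ B

    # Principal directions of the fitted responses, ordered by variation.
    _, s, Vt = np.linalg.svd(Y_hat, full_matrices=False)

    # One column of food weights per pattern (W1, W2, W3, ... in the slides).
    weights = B @ Vt.T

    # Pattern score = weighted sum of standardised food intakes,
    # then rescaled to a z-score.
    scores = X @ weights
    z_scores = (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)

    # Share of total (standardised) response variation explained per pattern.
    explained = s**2 / np.sum(Y**2)
    return weights, z_scores, explained


# Simulated data only (assumption): 500 people, 6 food groups, 3 responses.
rng = np.random.default_rng(0)
foods = rng.normal(size=(500, 6))
true_coef = rng.normal(size=(6, 3))
responses = foods @ true_coef + rng.normal(size=(500, 3))

weights, z_scores, explained = rrr_dietary_patterns(foods, responses)
print("Food weights for pattern 1:", np.round(weights[:, 0], 2))
print("Share of response variation explained:", np.round(explained, 3))
```

In this formulation the number of patterns equals the number of response variables, the patterns are mutually uncorrelated, and they are ordered so that the 1st pattern explains the most response variation, matching the description in the slides. The z-scores could then be entered into a regression model for the outcome of interest.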