Measuring Dietary Intake

_________________________________________________________
Raymond J. Carroll
Department of Statistics
Faculty of Nutrition and Faculty of
Toxicology
Texas A&M University
http://stat.tamu.edu/~carroll
I Still Cook
_________________________________________________________
Me in the kitchen, Yokohama (my birthplace), 1953
Advertisement
Is this not cool?
I took Hotelling’s
position at UNC,
then Fan took mine
My photo was taken
at the Wichita
Mountains, December
1999 (by me)
[Map and photo labels: Palo Duro Canyon, the Grand Canyon of Texas; West Texas; East Texas; Wichita Falls, that’s my hometown; Guadalupe Mountains National Park; College Station, home of Texas A&M University; Big Bend National Park; I-35; I-45]
Palo Duro Canyon of the Red River
What I am Not
_________________________________________________________
I know that potato chips are not a
basic healthy food group. However,
if you ask me a detailed question
about nutrition, then I will ask
Joanne Lupton
Nancy Turner
Meeyoung Hong
You are what you eat, but do you know who you are?
_________________________________________________________
• This talk is concerned with a simple question.
• Will lowering her intake of fat decrease a
woman’s chance of developing breast
cancer?
Basic Outline
_________________________________________________________
• Diet affects health. Many (not all!) studies, though, are not
statistically significant.
• Focus: quality of the instruments used to
measure diet
• Conclusion #1: The usual instruments are
largely to blame.
• Conclusion #2: Expect studies to disagree
Evidence in Favor of the Fat-Breast Cancer Hypothesis
_________________________________________________________
• Animal studies
• Ecological comparisons
• Case-control studies
International Comparisons
_____________________________________________________________
Evidence against the Fat-Breast Cancer Hypothesis
_________________________________________________________
• Prospective studies
• These studies try to assess a woman’s diet, then
follow her health progress to see if she develops
breast cancer
• The diets of those who developed breast cancer
are compared to the diets of those who did not
• Prior to 2007, only 1 prospective study had
found evidence suggesting a link between fat and
breast cancer, and 1 had found a negative
link
Prospective Studies
_________________________________________________________
• NHANES (National Health and Nutrition Examination Survey): n = 3,145 women aged 25-50
• Nurses Health Study: n = 60,000+
• Pooled Project: n = 300,000+
• Norfolk (UK) study: n = 15,000+
The Nurses Health Study, Fat and Breast Cancer
_________________________________________________________
60,000 women,
followed for 10 years
Prospective study
Note that the women who
developed breast cancer
reported eating less fat
Donna Spiegelman,
the NHS statistician
Clinical Trials
_________________________________________________________
• The lack of consistent (even positive) findings
led to the Women’s Health Initiative
• Approximately 40,000 women randomized to
two groups: healthy eating and typical eating
WHI Diet Study Objectives
_________________________________________________________
Prior Objections to WHI
_________________________________________________________
• Cost ($415,000,000)
• Whether North Americans can really lower %
Calories from Fat to 20%, from the current 38%
• Even if the study was successful, difficulties in
measuring diet mean that we will not know what
components led to the decrease in risk.
Change in Fat Calories Over Time
_________________________________________________________
Result from the WHI Diet Clinical Trial: women reported a decrease in fat calories, but not to 20%
[Figure: % calories from fat (axis 0 to 40) at Y-0, Y-1, Y-3, and Y-6 for the Control and Intervention groups, with the Goal marked]
How do we measure diet in humans?
_________________________________________________________
• 24 hour recalls
• Diaries
• Food Frequency
Questionnaires (FFQ)
Walt Willett has a
popular book and a
popular FFQ
Food diaries
_________________________________________________________
• Hot topic at NCI
• Only measures a few days’ diet, not typical diet
• A single 3-day diary finding a diet-cancer link is
not universally scientifically acceptable
• Need for repeated applications
• Induces behavioral change??
[Figure: Typical (Median) Values of Reported Caloric Intake Over 6 Diary Days (Diary 1 to Diary 6) and for the FFQ, WISH Study; axis 1350 to 1800]
The Food Frequency Questionnaire
_________________________________________________________
The Pizza Question
_________________________________________________________
The Norfolk Study with Diaries and FFQ
_________________________________________________________
15,000 women, aged
45-74, followed for 8
years
163 breast cancer
cases
Diary: p = 0.005
FFQ: p = 0.229
Summary
_________________________________________________________
• FFQ does not find a fat and breast cancer link
• 24 hour recalls and diaries are expensive
• They have found links, but in opposite directions
• Diaries may modify behavior
• Question: do any of these things actually
measure dietary intake?
• How well or how badly?
• These are statistical questions!

Do We Know Who We Are?
_________________________________________________________
• Karl Pearson was
arguably the 1st great
modern statistician
• Pearson chi-squared
test
• Pearson correlation
coefficient
Karl Pearson at age 30
Do We Know Who We Are?
_________________________________________________________
• Pearson was deeply
interested in self-reporting errors
• In 1896, Pearson ran
the following
experiment.
• For each of 3 people,
he set up 500 lines on
a set of paper, and
had them bisected by
hand
A gaggle of lines
Pearson’s Experiment
_________________________________________________________
• He then had a
postdoc measure the
error made by each
person on each line,
and averaged the errors
• “Dr. Lee spent
several months in
the summer of
1896 in the
reduction of the
observations”
A gaggle of lines, with my
bisections
Pearson’s Personal Equations
_________________________________________________________
• Pearson computed the
mean error committed
by each individual: the
“personal equations”
• He found: the errors
were individual. His
errors were to the right,
Dr. Lee’s to the left
Karl Pearson in later life
What Do Personal Equations Mean?
_________________________________________________________
• Given the same set of
data, when we are
asked to report
something, we all
make errors, and our
errors are personal
• In the context of
reporting diet, we call
this “person-specific
bias”
Laurence Freedman of NCI,
with whom I did the work
Model Details for Statisticians
_________________________________________________________
• The model in symbols
$Q_{ij} = \beta_0 + \beta_1 X_i + r_i + \epsilon_{ij}$;
$X_i$ = true intake;
$r_i$ = personal equation = Normal$(0, \sigma_r^2)$;
$\epsilon_{ij}$ = random error = Normal$(0, \sigma_\epsilon^2)$
• The existence of person-specific bias
means that variance of true intake is less
than one would have thought
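A minimal simulation sketch of why this happens, assuming the model in symbols above with two FFQs per person. The function name, the parameter values, and the choice of a normal distribution for true intake are hypothetical, chosen only for illustration. Under the model, the covariance between a person's two FFQs is $\beta_1^2 \sigma_X^2 + \sigma_r^2$, so reading it as variation in true intake overstates that variation whenever $\sigma_r^2 > 0$.

```python
import numpy as np

def simulate_ffq(n_people, beta0, beta1, sd_x, sd_r, sd_eps, seed=0):
    """Draw two FFQ reports per person from Q_ij = beta0 + beta1*X_i + r_i + eps_ij."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sd_x, n_people)            # true intake X_i (assumed normal, centered)
    r = rng.normal(0.0, sd_r, n_people)            # personal equation r_i, shared by both FFQs
    eps = rng.normal(0.0, sd_eps, (n_people, 2))   # random error eps_ij, fresh for each FFQ
    q = beta0 + beta1 * x[:, None] + r[:, None] + eps
    return x, q

# Hypothetical parameter values, not estimates from any study.
beta1 = 0.5
x, q = simulate_ffq(200_000, beta0=0.0, beta1=beta1, sd_x=1.0, sd_r=1.0, sd_eps=1.0)

# Covariance between the two FFQs: what repeat questionnaires alone can see.
repeat_ffq_cov = np.cov(q[:, 0], q[:, 1])[0, 1]

# The part of that covariance that is really due to true intake.
true_intake_part = beta1**2 * np.var(x)

print(f"covariance of repeat FFQs:   {repeat_ffq_cov:.3f}")
print(f"part due to true intake:     {true_intake_part:.3f}")
print(f"part due to person-specific bias: {repeat_ffq_cov - true_intake_part:.3f}")
```

With these made-up values most of the agreement between repeat FFQs comes from the personal equation, not from true intake, which is the sense in which the variance of true intake is less than one would have thought.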
Model Details for Statisticians
_________________________________________________________
• We fit a linear mixed model
• The OPEN Study had the following
measurements
• Two FFQ
• Two Protein biomarkers
• Two Energy biomarkers
Our Hypothesis
_________________________________________________________
• We hypothesized that, when measuring Fat
intake, the personal equation (the person-specific
bias unique to each individual) is large and
debilitating.
• The problem: the actual variability in
American diets is much smaller than
suspected.
Can We Test Our Hypothesis?
_________________________________________________________
• We need biomarker data
that are not much subject
to the personal equation
• There is no biomarker for
Fat 
• There are biomarkers for
energy (calories) and
Protein
• We expect that studies are
too small by orders of
magnitude
Biomarker Data
_________________________________________________________

• Calories and Protein:
• Available from NCI’s OPEN study
• Results are surprising
Victor Kipnis was the
driving force behind OPEN
Sample Size Inflation
_________________________________________________________


• There are formulae for how large a study needs to
be to detect a true doubling of risk from low and
high Fat/Energy Diets
• These formulae try to account for measurement
error
• These formulae ignore the personal equation
• We recalculated the formulae
Biomarker Data: Sample Size Inflation
_________________________________________________________
If you are interested in the effect of calories on health, multiply
the sample size you thought you needed by 11. For protein, by
4.5
[Bar chart: sample size inflation factor (axis 0 to 12) for Calories, Protein, and %Protein]
Relative Risk
_________________________________________________________
If high caloric intake in fact increases the risk of breast cancer by 100%,
and you change your intake dramatically, the FFQ thinks
doing so increases the risk by only 4%
Result: It is not possible to tell if changing your absolute caloric
intake, or your fat intake, or your protein intake will have any health
effects
[Bar chart: Relative Risk For Changing Your Food Intake (axis 1 to 2). True: 2.00; Observed Protein: 1.09; Observed Calories: 1.04]
Relative Risk, Food Composition
_________________________________________________________
If high protein (fat) intake increases the risk of breast cancer by 100%,
and you dramatically change your protein (fat) intake while your calories
remain the same, then the FFQ thinks your risk changes by only 20%-30%
Result: It is pretty difficult to tell if changing your food
composition while maintaining your caloric intake will
have any health effects
[Bar chart: Relative Risk for Food Composition (axis 1 to 2). True: 2.00; Observed Protein Density: 1.31]
New Results
_________________________________________________________


• The AARP Study: 250,000+ women, by far the greatest
number in any single study
• Results:
• Huge size → statistical significance
• FFQ → small measured increase in risk for dramatic
behavioral change (1.32 after correction)
• Statistician’s dream: use Pearson’s idea to get at the true
increase in risk
A happy statistician
dreaming about AARP
New Results
_________________________________________________________




• The WHI Controls Study: 30,000+ women
• All with > 32% Calories from Fat via FFQ
• Diaries in a nested case-control study
• Highly significant fat effect in the diaries (Observed
RR in quantiles = 1.6)
A happy statistician doing
field biology in the
Kimberley
Summary
_________________________________________________________

• WHI, 2006, clinical trial
• My best case conjecture in 2005: probably no statistically significant effects
• The p-value was 0.07, relative risk about 1.2
• My best case conjecture in 2008, after further follow-up: statistically significant, modest effects
You are what you eat, but do you
know who you are?
_________________________________________________________



• Diet is incredibly hard to measure
• Even 100% increases in risk cannot be seen
in large cohort studies with an FFQ
• If you read about a diet intervention,
measured by an FFQ, and it achieves
statistical significance multiple times: wow!
You are what you eat, but do you
know who you are?
_________________________________________________________


• Much work at NCI and WHI and EPIC on new
ways of measuring diet
• EPIC (a multi-country study) may be a
model, because of the wide distribution of
intakes
What Was Done
_________________________________________________________
• The OPEN analysis actually fit Protein and
Energy together.
• We call this the Seemingly Unrelated
Measurement Error Model
• Can get major gains in efficiency
SUMEM
_________________________________________________________
$Q_{ijP} = \beta_{0QP} + \beta_{1QP} X_{iP} + r_{iQP} + \epsilon_{ijQP}$;
$M_{ijP} = X_{iP} + U_{ijQP}$;
$Q_{ijE} = \beta_{0QE} + \beta_{1QE} X_{iE} + r_{iQE} + \epsilon_{ijQE}$;
$M_{ijE} = X_{iE} + U_{ijQE}$;
• Gains in efficiency come from the correlations
of the random effects
Model Details for Statisticians
_________________________________________________________
• The model in symbols
$Q_{ij} = \beta_{0Q} + \beta_{1Q} X_i + r_{iQ} + \epsilon_{ijQ}$;
$M_{ij} = X_i + U_{ijF}$;
• Linear mixed model, fit by PROC MIXED
Attenuation
_________________________________________________________
• The attenuation is the slope in the linear
regression of X on Q
$Q_{ij} = \beta_{0Q} + \beta_{1Q} X_i + r_{iQ} + \epsilon_{ijQ}$;
$M_{ij} = X_i + \epsilon_{ijF}$;
$\lambda_Q = \mathrm{cov}(X, Q) / \mathrm{var}(Q)$
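A hedged sketch of how the attenuation can be estimated when true intake is never observed. It assumes one FFQ and one biomarker per person, biomarker error independent of everything in the FFQ, and normal distributions throughout. All names and parameter values are hypothetical. Under those assumptions cov(M, Q) = cov(X, Q), so the slope from regressing the biomarker M on Q estimates the same attenuation as regressing X on Q. This is a simple moment calculation, not the linear mixed model fit described in these slides.

```python
import numpy as np

def attenuation(beta1, var_x, var_r, var_eps):
    """lambda_Q = cov(X, Q) / var(Q) when Q = beta0 + beta1*X + r + eps."""
    return beta1 * var_x / (beta1**2 * var_x + var_r + var_eps)

# Hypothetical parameter values, not estimates from OPEN or any other study.
beta1, var_x, var_r, var_eps, var_u = 0.5, 1.0, 1.0, 1.0, 0.5
n = 200_000
rng = np.random.default_rng(1)

x = rng.normal(0.0, np.sqrt(var_x), n)        # true intake (unobservable in practice)
r = rng.normal(0.0, np.sqrt(var_r), n)        # person-specific bias
eps = rng.normal(0.0, np.sqrt(var_eps), n)    # random error in the FFQ
q = beta1 * x + r + eps                       # FFQ report (intercept set to 0)
m = x + rng.normal(0.0, np.sqrt(var_u), n)    # biomarker: unbiased for X, independent error

lam_theory = attenuation(beta1, var_x, var_r, var_eps)
lam_from_x = np.cov(x, q)[0, 1] / np.var(q, ddof=1)   # needs X, so not available in practice
lam_from_m = np.cov(m, q)[0, 1] / np.var(q, ddof=1)   # uses only observable quantities

print(f"theoretical attenuation:    {lam_theory:.3f}")
print(f"slope of X on Q:            {lam_from_x:.3f}")
print(f"slope of biomarker M on Q:  {lam_from_m:.3f}")
```

The design point is that the biomarker does not need to be precise, only unbiased with errors unrelated to the FFQ: its noise inflates var(M) but does not enter cov(M, Q).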
Relative Risk and Attenuation
_________________________________________________________
• Start with a logistic model
$\mathrm{pr}(D=1) = H(\beta_0 + \beta_1 X)$
• True relative risk
$R = \exp(\beta_1)$
• Observed relative risk (regression calibration)
$R^{\lambda_Q} < R$ since $\lambda_Q < 1$
Relative Risk and Attenuation
_________________________________________________________
Attenuation                   Relative Risk
1.0 (no measurement error)    2.0
0.8                           1.74
0.5                           1.41
0.25                          1.19
0.10                          1.07
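The table follows from the regression calibration relation on the previous slide, observed relative risk = R raised to the power of the attenuation, with true R = 2. The short sketch below reproduces it and also inverts the relation. The function names are hypothetical, and applying the inverse to the observed values quoted earlier in the talk (1.04 for calories, 1.09 for protein, 1.31 for protein density) assumes those values come from the same relation with a true relative risk of 2.

```python
import math

def observed_relative_risk(true_rr, attenuation):
    """Regression calibration approximation: observed RR = true RR ** attenuation."""
    return true_rr ** attenuation

def implied_attenuation(true_rr, observed_rr):
    """Invert the relation: attenuation = log(observed RR) / log(true RR)."""
    return math.log(observed_rr) / math.log(true_rr)

# Reproduce the table for a true relative risk of 2.
for lam in (1.0, 0.8, 0.5, 0.25, 0.10):
    print(f"attenuation {lam:4.2f} -> observed RR {observed_relative_risk(2.0, lam):.2f}")

# Attenuation implied by the observed relative risks quoted in the talk,
# assuming a true relative risk of 2 in each case.
for label, rr in (("Calories", 1.04), ("Protein", 1.09), ("Protein density", 1.31)):
    print(f"{label}: implied attenuation {implied_attenuation(2.0, rr):.3f}")
```

Read this way, an observed relative risk of 1.04 corresponds to an attenuation of roughly 0.06, well below the smallest value in the table.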