Measuring Dietary Intake - Department of Statistics

Measuring Dietary Intake
_________________________________________________________
Raymond J. Carroll
Department of Statistics
Faculty of Nutrition and Toxicology
Texas A&M University
http://stat.tamu.edu/~carroll
Where I am From
_________________________________________________________
Wichita Falls
(Ranked #10 in
the worst jobs
in Texas by
Texas Monthly,
1980)
Best pecans in
the world
What I am Not
_________________________________________________________
I know that potato chips are not a
basic healthy food group. However,
if you ask me a detailed question
about nutrition, then I will ask
Joanne Lupton
Nancy Turner
Meeyoung Hong
You are what you eat, but do you
know who you are?
_________________________________________________________
• This talk is concerned with a simple question.
• Will lowering her intake of fat decrease a
woman’s chance of developing breast
cancer?
• This is a hugely controversial question
• The debate has a huge statistical component
• It is also relevant to questions such as: if I
lower my caloric intake, will I live longer?
Evidence in Favor of the Fat-Breast
Cancer
Hypothesis
_________________________________________________________
• Animal studies
• Ecological comparisons
• Case-control studies
International Comparisons
_____________________________________________________________
• There are major
differences in fat and
saturated fat intake
across countries
• Are these related to
breast cancer?
Case-control studies
_________________________________________________________
• Find women who have breast cancer, and
women who do not.
• Compare their current fat intakes
• A problem on its face. We want past intake,
not after-the-fact intake
• Not much found in single studies, but
pooling them over many diverse studies
suggests a fat-breast cancer link
Evidence against the Fat-Breast
Cancer
Hypothesis
_________________________________________________________
• Prospective studies
• These studies try to assess a woman’s diet, then
follow her health progress to see if she develops
breast cancer
• The diets of those who developed breast cancer
are compared to those who do not
• Only (?) 1 prospective study has found
firm evidence suggesting a fat and breast
cancer link, and 1 has a negative link
Prospective Studies
_________________________________________________________
• NHANES (National Health and Nutrition
Examination Survey):
n = 3,145
women aged 25-50
• Nurses Health Study:
n = 60,000+
• Pooled Project:
n = 300,000+
• Norfolk (UK) study:
n = 15,000+
• AARP:
n = 250,000+
• WHI Controls
n = 30,000+
• AARP and WHI available soon
The Nurses Health Study, Fat and Breast
Cancer
_________________________________________________________
60,000 women,
followed for 10 years
Prospective study
Note that the breast
cancer cases were
eating less fat
Donna Spiegelman,
the NHS statistician
Clinical Trials
_________________________________________________________
• The lack of consistent (even positive) findings
led to the Women’s Health Initiative
• Approximately 60,000 women randomized to
two groups: healthy eating and typical eating
WHI Diet Study Objectives
_________________________________________________________
Objections to WHI
_________________________________________________________
• Cost ($100,000,000+)
• Whether Americans can really
lower % Calories from Fat to
20%, from the current 35%
• Even if the study is successful,
difficulties in measuring diet mean
that we will not know what
components led to the decrease in
risk.
Ross Prentice of
the WHI
How do we measure diet in humans?
_________________________________________________________
• 24 hour recalls
• Diaries
• Food Frequency
Questionnaires (FFQ)
Walt Willett has a
popular book and a
popular FFQ
Objections to the 24 hour recall
_________________________________________________________
• Only measures yesterday’s diet, not typical diet
• A single 24 hour recall finding a diet-cancer link
is not universally scientifically acceptable
• Need for repeated applications
• Expensive
• Personal interview
• Phone interview
NHANES: Fat is Protective (?)
_________________________________________________________
Typical %
Calories from
Fat
Cases: 35%
Controls: 37%
NHANES: Calories are Protective (?)
_________________________________________________________
Typical Calories
Cases: 1,300
Controls: 1,500
Food diaries
_________________________________________________________
• Hot topic at NCI
• Only measures a few days’ diet, not typical diet
• A single 3-day diary finding a diet-cancer link is
not universally scientifically acceptable
• Need for repeated applications
• Induces behavioral change??
[Figure: Typical (median) values of reported caloric intake over 6 diary days, WISH Study. Values shown for the FFQ and for Diary 1 through Diary 6; calorie axis from 1350 to 1800.]
The Food Frequency Questionnaire
_________________________________________________________
• Do you remember
the SAT?
The Pizza Question
_________________________________________________________
The Norfolk Study with Diaries and FFQ
_________________________________________________________
15,000 women, aged
45-74, followed for 8
years
163 breast cancer
cases
Diary: p = 0.005
FFQ: p = 0.229
Directly contradicts
NHANES (women aged
25-50).
Summary
_________________________________________________________
• FFQ does not find a fat and breast cancer link
• 24 hour recalls and diaries are expensive
• They have found links, but in opposite directions
• Diaries also appear to modify behavior
• Question: do any of these things actually
measure dietary intake?
• How well or how badly?
• These are statistical questions!

Do We Know Who We Are?
_________________________________________________________
• Karl Pearson was
arguably the 1st great
modern statistician
• Pearson chi-squared
test
• Pearson correlation
coefficient
Karl Pearson at age 30
Do We Know Who We Are?
_________________________________________________________
• Pearson was deeply
interested in self-reporting errors
• In 1896, Pearson ran
the following
experiment.
• For each of 3 people,
he set up 500 lines on
a set of paper, and
had them bisected by
hand
A gaggle of lines
Pearson’s Experiment
_________________________________________________________
• He then had a
postdoc measure the
error made by each
person on each line,
and averaged
• “Dr. Lee spent
several months in
the summer of
1896 in the
reduction of the
observations”
A gaggle of lines, with my
bisections
Pearson’s Personal Equations
_________________________________________________________
• Pearson computed the
mean error committed
by each individual: the
“personal equations”
• He found: the errors
were individual. His
errors were to the right,
Dr. Lee’s to the left
Karl Pearson in later life
What Do Personal Equations Mean?
_________________________________________________________
• Given the same set of
data, when we are
asked to report
something, we all
make errors, and our
errors are personal
• In the context of
reporting diet, we call
this “person-specific
bias”
Laurence Freedman of NCI,
with whom I did the work
What errors do FFQ Make?
_________________________________________________________
• Pretend you and I eat
the same amount of
fat on average.
• We each fill out a FFQ
twice, take the mean
fat intake from the
FFQ, and get different
answers. Why?
• Random Error: I will
give different answers
each time
• No one reports all the
ice cream he/she eats
(fixed bias due to
societal factors)
• Personal Equation:
we all report differently
Model Details for Statisticians
_________________________________________________________
• The model in symbols
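$$Q_{ij} = \beta_0 + \beta_1 X_i + r_i + \varepsilon_{ij}$$
$X_i$ = true intake;
$r_i$ = personal equation $\sim \mathrm{Normal}(0, \sigma_r^2)$;
$\varepsilon_{ij}$ = random error $\sim \mathrm{Normal}(0, \sigma_\varepsilon^2)$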
• Note how existence of person-specific bias
means that variance of true intake is less
than one would have thought
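A minimal simulation sketch of this model may help make the note concrete. It assumes each person fills out the FFQ twice (j = 1, 2), that true intake is Normal and centered at zero, and that all three standard deviations equal 1; these values and the function names are illustrative assumptions, not estimates from any study.

```python
import random

def simulate_ffq_pairs(n_people, beta0, beta1, sigma_x, sigma_r, sigma_eps, seed=0):
    """Two FFQ reports per person from Q_ij = beta0 + beta1*X_i + r_i + eps_ij."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_people):
        x = rng.gauss(0.0, sigma_x)   # true intake X_i (centered at 0: an assumption)
        r = rng.gauss(0.0, sigma_r)   # personal equation r_i, shared by both reports
        q1 = beta0 + beta1 * x + r + rng.gauss(0.0, sigma_eps)
        q2 = beta0 + beta1 * x + r + rng.gauss(0.0, sigma_eps)
        pairs.append((q1, q2))
    return pairs

def between_person_covariance(pairs):
    """Sample covariance of the first and second report across people."""
    n = len(pairs)
    mean1 = sum(q1 for q1, _ in pairs) / n
    mean2 = sum(q2 for _, q2 in pairs) / n
    return sum((q1 - mean1) * (q2 - mean2) for q1, q2 in pairs) / (n - 1)

def within_person_variance(pairs):
    """Estimate of the random-error variance from differences of repeat reports."""
    n = len(pairs)
    return sum((q1 - q2) ** 2 for q1, q2 in pairs) / (2 * n)

# Illustrative parameter values only (all standard deviations set to 1).
pairs = simulate_ffq_pairs(20000, beta0=0.0, beta1=1.0,
                           sigma_x=1.0, sigma_r=1.0, sigma_eps=1.0, seed=1)
print("between-person covariance:", round(between_person_covariance(pairs), 2))
print("within-person variance:   ", round(within_person_variance(pairs), 2))
```

Under these assumptions the between-person covariance estimates β1²σ_X² + σ_r², not β1²σ_X² alone. Repeat FFQs separate out the random error but cannot separate the personal equation from true intake, so an analysis that ignores r_i credits too much of the spread to true intake.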
Our Hypothesis
_________________________________________________________
• We hypothesized that when measuring Fat
intake
• The personal equation, or person-specific bias,
unique to each individual, is large and debilitating.
• The problem: the actual variability in American diets
is much smaller than suspected.
• If true, the hypothesis says that one cannot
really do an epidemiologic study for total energy
or total fat, with any degree of success for
cancer
Can We Test Our Hypothesis?
_________________________________________________________
• We need biomarker data
that are not much subject
to the personal equation
• There is no biomarker for
Fat 
• There are biomarkers for
energy (calories) and
Protein
• We expect that studies are
too small by orders of
magnitude
Biomarker Data
_________________________________________________________

• Protein: available from a number of European studies
• Calories and Protein: available from NCI’s OPEN study
• Results are surprising
Victor Kipnis was the
driving force behind OPEN
Sample Size Inflation
_________________________________________________________

• There are formulae for how large a study needs to be to detect a doubling of risk between low and high Fat/Energy diets
• These formulae ignore the personal equation
• We recalculated the formulae accounting for
   • Random error in repeated FFQ
   • Societal factors causing underreporting in general
   • Pearson’s personal equation: we report differently
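One way to see where an inflation factor can come from is the sketch below. It rests on a common approximation, stated here as an assumption: the required sample size is inversely proportional to the squared correlation between the reported intake (the mean of k FFQs) and true intake. The function names and the variance values are hypothetical, and this is not the calculation used for the results in this talk.

```python
def squared_correlation(beta1, var_x, var_r, var_eps, k=1):
    """Squared correlation between the mean of k FFQs and true intake
    under Q_ij = beta0 + beta1*X_i + r_i + eps_ij."""
    signal = beta1 ** 2 * var_x
    return signal / (signal + var_r + var_eps / k)

def sample_size_inflation(beta1, var_x, var_r, var_eps, k=1):
    """Ratio of the sample size needed once the personal equation is
    recognized to the size computed when it is ignored.
    Assumption: required sample size is proportional to 1 / squared correlation."""
    # Ignoring r_i: its variance is mistaken for variation in true intake.
    naive = squared_correlation(1.0, beta1 ** 2 * var_x + var_r, 0.0, var_eps, k)
    actual = squared_correlation(beta1, var_x, var_r, var_eps, k)
    return naive / actual

# Hypothetical variances, chosen only to show the mechanics.
print(sample_size_inflation(beta1=1.0, var_x=1.0, var_r=3.0, var_eps=2.0, k=2))
```

In this simplified setting the ratio reduces to 1 + σ_r² / (β1²σ_X²): the larger the personal equation is relative to real differences in diet, the more the naive sample size must be multiplied.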
Biomarker Data: Sample Size Inflation
_________________________________________________________
If you are interested in the effect of calories on health, multiply
the sample size you thought you needed by 11. For protein, by 4.5.
[Figure: sample size inflation factors for Calories, Protein, and %Protein; axis from 0 to 12.]
Relative Odds
_________________________________________________________



• Suppose high fat/energy/? diets lead to twice the risk of breast cancer compared to low fat/energy
• This is called the Relative Risk
• What is the risk we would observe with the FFQ?
Relative Risk
_________________________________________________________
If a high-calorie diet in fact increases the risk of breast cancer by 100%,
and you change your intake dramatically, the FFQ thinks doing so
increases the risk by only 4%
Result: It is not possible to tell if changing your absolute caloric intake, or your fat intake, or your protein intake will have any health effects
[Figure: Relative Risk For Changing Your Food Intake; axis from 1 to 2. True: 2.00. Observed, Protein: 1.09. Observed, Calories: 1.04.]
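The gap between the true and observed relative risks can be summarized by an attenuation factor. The sketch below assumes the standard approximation that the observed log relative risk equals the true log relative risk multiplied by a factor λ between 0 and 1; the function names are illustrative, and the inputs are the values shown on this slide (true 2.00, observed 1.04 for calories and 1.09 for protein).

```python
import math

def attenuation_factor(true_rr, observed_rr):
    """Implied lambda, assuming log(observed RR) = lambda * log(true RR)."""
    return math.log(observed_rr) / math.log(true_rr)

def observed_relative_risk(true_rr, lam):
    """Relative risk the instrument would show for a given attenuation factor."""
    return true_rr ** lam

lam_calories = attenuation_factor(2.00, 1.04)
lam_protein = attenuation_factor(2.00, 1.09)
print("implied attenuation, calories:", round(lam_calories, 3))
print("implied attenuation, protein: ", round(lam_protein, 3))
```

Under this assumption the FFQ retains only about 6% of the true log relative risk for calories and about 12% for protein, which is why a true doubling of risk looks like almost nothing.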
Relative Risk, Food Composition
_________________________________________________________
If high protein (fat) intake increases the risk of breast cancer by 100%,
and you dramatically lower your protein (fat) intake while your calories
remain the same, the FFQ sees a risk difference of only 20%-30%
Result: It is very difficult to tell if changing your food composition while maintaining your caloric intake will have any health effects
[Figure: Relative Risk for Food Composition; axis from 1 to 2. True: 2.00. Observed, Protein Density, no energy effect: 1.31. Observed, Protein Density, energy effect: 1.20.]
Summary
_________________________________________________________





• Trying to establish a Fat and Breast Cancer link has proved difficult
• Standard instruments hide effects
• 24 hour recalls have found effects, but are very expensive
• Diaries may(?) change behavior: difficult to believe what they say
• There is hope to analyze food composition, not absolute intakes
Summary
_________________________________________________________


• The AARP Study: 250,000+ women, by far the greatest number in any study
• My best case conjecture:
   • Huge size → statistical significance
   • FFQ → small measured increase in risk for dramatic behavioral change
   • Statistician’s dream: use Pearson’s idea to get at the true increase in risk
A happy statistician dreaming about AARP
Summary
_________________________________________________________





• The WHI Controls Study: 30,000+ women
• All with > 32% Calories from Fat via FFQ
• Also includes diaries
• Will be able to compare diaries and FFQ
• How many studies with 30,000+ diaries can we afford?
A happy statistician doing field biology in Northwest Australia (the Kimberley)
Summary
_________________________________________________________

• WHI, 2005, clinical trial
• My best case conjecture:
   • Probably no statistical effects (?)
   • Even if so, the FFQ is so bad that we will not know what to do:
      • Decrease Fat?
      • Decrease saturated Fat?
      • Eat more grain?
      • Eat more veggies (yuck)?
You are what you eat, but do you
know who you are?
_________________________________________________________



• Diet is incredibly hard to measure
• Even 100% increases in risk cannot be seen in large studies
• If you read about a diet intervention, measured by a FFQ, and it achieves statistical significance multiple times: wow!
You are what you eat, but do you
know who you are?
_________________________________________________________


• Much work at NCI and WHI and EPIC on new ways of measuring diet
• EPIC may be a model, because of the wide distribution of intakes
Reporting Biases
_________________________________________________________


• FFQ are not very good for measuring caloric intake
• We do not want to admit our pizza, ice cream, etc.
Reporting Biases
_________________________________________________________


• 24 hour recalls are not very good for measuring caloric intake
• They are better than FFQ (less bias, for example), but they still are not very good
Reporting Biases
_________________________________________________________



• FFQ are better for % Calories from Protein
• Our food composition is better known to us than the amounts
• Inflation of sample size only 2.3, not 4.5 as for actual protein