Dietary assessment and estimation of intake densities Michael J. Daniels Alicia Carriquiry

Dietary assessment and estimation of intake
densities
Michael J. Daniels1 2
Alicia Carriquiry3 4
1 Michael Daniels is corresponding author: 102G Snedecor Hall, Department of Statistics, Iowa State
University, Ames, IA 50011-1210, E-mail: mdaniels@iastate.edu.
2 Michael Daniels is Assistant Professor, Department of Statistics, Iowa State University.
3 Alicia Carriquiry is Associate Professor, Department of Statistics and Center for Agricultural and
Rural Development, Iowa State University.
4 This work was partially funded through contracts number 0009730470 and 0009830322 between the
National Center for Health Statistics, Center for Disease Control and Prevention, and the Department
of Statistics, Iowa State University, and by Research Grant No. 1960915 from FONDECYT, Chile.
Summary
The U.S. government has conducted nationwide food consumption surveys since 1936.
Information obtained from these surveys is used to design food assistance programs,
guide food and nutrition policy, and monitor the dietary status of the population. The
distribution of usual intakes of a nutrient in the population is of interest to policy makers.
Here, usual intake is defined as the long-run average intake of a nutrient by an individual.
Usual intakes are not observable in practice. Instead, we observe daily intakes for
a sample of individuals and a small number of days, and assume that observed intakes
measure usual intakes with error. The distributions of observed intakes, however, are
typically very skewed, and the day-to-day variability in intakes tends to be large relative
to the between-individual variance, and can be heterogeneous across individuals (Nusser,
Carriquiry, Dodd, and Fuller, 1996).
In this paper, we present a Bayesian approach to estimating the distribution of usual
intakes of a nutrient in a population. Starting with a sample of dietary intakes, we model
a function to map the intakes into the normal scale. This function combines a power
transformation and a cubic spline (constrained to be monotonic) with unknown number
and location of knots, and is estimated using reversible jump Markov chain Monte Carlo
methods (Green, 1995). From each "draw" of the transformation function we obtain
a transformed set of intakes which are approximately normally distributed. We then
remove the day-to-day variability in daily intakes by fitting a measurement error model
to each set of transformed observations. Each set of estimated individual usual intakes is
then mapped back to the original scale using the inverse of the transformation function.
Posterior distributions of percentiles and other attributes of the density for each nutrient
are estimated accounting for all major sources of uncertainty.
We apply these methods to a subset of the 1994-1996 Continuing Survey of Food
Intakes by Individuals (CSFII) collected by the USDA (USDA, 1997).
Key words: Dietary data; CSFII; Measurement error models; Splines; Reversible jump Markov chain Monte Carlo
1 Introduction
The United States government has collected dietary intake data since the 1930s. Nationwide food consumption surveys are conducted approximately once a year, in which a large
sample of individuals is asked to report their food consumption during the previous 24
hours. Thus, the survey instruments used to collect this information are called 24-hour
recalls. Most nationwide food consumption surveys collect replicate 24-hour recalls for
at least some of the individuals in the sample. Often, these repeated observations are
not collected on consecutive days, so that multiple observations within an individual can
be considered to be independent.
Other survey instruments, for example food frequency questionnaires (FFQs), can
also be used to collect dietary intake data and to estimate usual nutrient intake distributions (e.g., Carroll, Freedman, and Hartman, 1996). In this paper, we consider only
the analyses of intake data collected via 24-hour recalls.
The information obtained from these dietary surveys is used by policy makers to
design, implement, monitor and evaluate food assistance programs and other nutrition-related policies. For example, policy makers might be interested in comparing the nutritional status of children from low-income households who are enrolled in the School
Lunch program versus that of children who are not. The effectiveness of the Food
Stamps program might be evaluated by, for instance, monitoring the proportion of low-income elderly who are consuming enough of some essential nutrient, or the proportion
of teenaged girls who consume adequate amounts of folate or calcium. Since food assistance programs managed by the U.S. Department of Agriculture (USDA) alone cost
approximately 26 billion dollars a year, it is important that information obtained from
dietary data be as accurate as possible, and that measures of uncertainty be available
for all estimates.
How do we obtain information about the intake of a nutrient from a dietary intake
survey? Individuals participating in food consumption surveys are asked to recall their
food (including beverages, snacks, and meals) consumption for the previous day. A
database managed by the USDA is then used to "map" foods into their nutrient components. This USDA database contains approximately 6,600 entries, and is updated
periodically. For example, we can obtain the content of about 26 different nutrients of a
lunch composed of a slice of pepperoni pizza, an 8-ounce can of Diet Coke, and an apple.
It is well known (e.g., Schubert, Holden, and Wolf, 1987; Haytowitz, Pehrsson, Smith,
Gebhardt, Mathews and Anderson, 1996) that this food database is not error free. We
do not, however, address the issue in this work.
The data we obtain for analysis, then, are replicate observations (for at least a
subsample of individuals) of daily intakes of a large set of nutrients for individuals in
the sample. We use $Y_{ij}$ to denote the observed intake of a nutrient for individual $i$ on
day $j$. Because these data are costly to collect, the number of replicate observations is
typically no more than two or three for a subsample of the individuals in the survey. We
use $d_i$ to denote the number of days of intake information available for each individual
in the sample.
The relationship between diet and health underlies much of the government's goal of
providing the population with the means to consume an adequate diet. Often, the effect
of nutrient consumption on health-related outcomes is chronic, so that researchers are
interested in the long-run average intake of a nutrient by an individual. This long-run
average intake is known as the usual intake of a nutrient by an individual, and is denoted
by $y_i$, $i = 1, \ldots, n$, where $n$ is the number of individuals in the sample. Formally, $y_i = E\{Y_{ij} \mid i\}$.
Furthermore, population-level assessments such as those described earlier, require that
we estimate the distribution of usual intakes $F(y)$ in the group of interest. This usual
intake distribution concept was set forth in a report by the National Research Council
(NRC, 1986).
The problem of estimating usual nutrient intake distributions from dietary survey
data is a challenging one. Usual intakes are not observable in practice, and observed daily
intakes measure usual intakes with error. Furthermore, various characteristics of dietary
intake data (described in the next section) prevent the use of standard normal-theory
methods for analysis. Nusser, Carriquiry, Dodd, and Fuller (1996), Eckert, Carroll,
and Wang (1997), Chen (1998) and Carriquiry (1999), among others, have recently
proposed approaches for analyzing dietary intake data. In particular, Nusser et al.
(1996) propose a measurement error model approach on transformed intake data that
results in estimators of usual intake distributions that perform well in simulation studies.
The Nusser et al. methodology, however, is developed from a frequentist viewpoint,
and consists of several steps. Thus, it is not possible to obtain expressions for standard
errors of various estimates that properly incorporate all uncertainties accumulated along
the way. In fact, the estimators of standard errors for percentiles of the usual intake
distribution given in Nusser et al. (1996) are obtained under the assumption that the
function used to transform the data into the normal scale, and the variance components
in the measurement error model, are fixed and known.
We revisit the Nusser et al. (1996) approach to estimating usual intake distributions
from dietary intake data, and reformulate it within a Bayesian framework. Our objective
is to derive marginal posterior distributions for parameters of the usual intake distribution of a nutrient that are of interest to policy makers and researchers in nutrition. We
focus on the marginal posterior distributions of percentiles of the usual intake distribution, and argue that the posterior variances we obtain reflect all uncertainties accrued in
the various steps of the procedure. We use Markov chain Monte Carlo methods (MCMC)
(e.g., Smith and Roberts, 1993) throughout, to perform all computations. As will be
described in Section 3, the transformation step involves solving a varying-dimensional
problem; thus, we proceed as in Green (1995) and Denison, Mallick, and Smith (1998)
and use a reversible-jump MCMC algorithm to obtain the transformation function.
The paper is organized as follows. In Section 2 we briefly discuss the characteristics
of dietary intake data, and describe a subset of the Continuing Survey of Food Intakes
by Individuals (CSFII, USDA, 1996) that was used for illustration of the procedure.
The model and proposed estimation strategy are given in Section 3. We apply the
methodology to a subset of CSFII and present results in Section 4. Finally, Section 5
gives a discussion of the approach we propose and of related problems in nutrition that
merit further investigation.
2 Characteristics of dietary intake data
We consider dietary intake data obtained via 24-hour recalls, by the CSFII carried out in
1994-1996. The CSFII is a nationwide food consumption survey designed as a multistage
stratified area probability sample of the 50 states and the District of Columbia, and is
intended to be self-weighting. We consider the subset consisting of males and females
aged 14 to 18 years, who were interviewed between 1994 and 1996. Two observations
were collected for each individual in the sample. Both observations were obtained by
personal interview if possible; otherwise, the second day interview was done over the
phone. Within an individual, intakes were collected at least a week apart from each
other; thus, we assume that observations within an individual are independent. Because
of non-negligible attrition rates, regression weights (e.g., Huang and Fuller, 1978) were
constructed to adjust for nonresponse. The analyses we present in Section 4 are performed on weighted data, where weights, once computed, are assumed to be fixed and
known.
Observed intake data are affected not only by individual, but also by nuisance effects
such as day of the week, month of the year, interview sequence (first or later days)
and interview method (in person or by phone). Prior to analysis, we adjust the data
to remove these nuisance effects. We proceed as in Nusser et al. (1996) and use a
ratio adjustment based on a regression model to partially remove the effects of day of
week and interview method from observed intake data. To avoid carrying the survey
weights throughout our analyses, we linearly transform the intake data to obtain a set
of "equal weight" observations, as described in Dodd (1997). The unweighted analyses
of the equal weight observations are essentially equivalent to the analyses that would be
conducted on the original observations and their weights. In the remainder, $Y_{ij}$ denotes
the adjusted, equal-weight intake for individual $i$ on day $j$.
Dietary intake data have attributes that make their analysis challenging. Observed
daily intakes have skewed distributions, and exhibit both between- and within-individual
variability. In fact, the within-individual variance in observed intakes of most nutrients
is sometimes larger than (or of the same order of magnitude as) between-individual
variation, and is heterogeneous across individuals. Typically, as the mean intake of
a nutrient increases, so does the variance of those intakes. Since our objective is to
estimate $F(y)$, the distribution of the usual intakes, we must remove the day-to-day
variability from the observed intakes.
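The effect of day-to-day variability on the distribution of individual means can be illustrated with a small sketch. It assumes a simplified additive model with a common within-individual variance (the text notes that real intake data are heterogeneous), so that the variance of a $d$-day mean is the between-individual variance plus the within-individual variance divided by $d$. The function names and the numbers used are illustrative only.

```python
def variance_of_day_means(var_between, var_within, n_days):
    """Variance of an individual's n_days-day mean under an additive model
    with a common within-individual variance (a simplifying assumption)."""
    if n_days < 1:
        raise ValueError("n_days must be at least 1")
    return var_between + var_within / n_days


def inflation_factor(var_between, var_within, n_days):
    """How much wider (in variance) the distribution of day means is
    than the distribution of usual intakes."""
    return variance_of_day_means(var_between, var_within, n_days) / var_between


# Made-up variance components with a within/between ratio of one:
# two-day means then have 1.5 times the variance of usual intakes.
print(inflation_factor(1.0, 1.0, 2))
```

Under these assumptions, tail percentiles of the distribution of two-day means overstate the spread of usual intakes, which is why the within-individual component has to be removed before estimating $F(y)$.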
An additive relationship between observed intake and usual intake in the normal scale
is often adopted to model (transformed) observed intakes. A linear measurement error
model approach that allows for the incorporation of heterogeneous within-individual
measurement error variances is then appropriate for dietary intake data. How to transform observed intakes into the normal scale so that transformed intakes are normally
distributed and the additive relationship holds is a matter of ongoing discussion (Nusser
et al., 1996; Stefanski and Bay, 1996; Chen, 1998). Here, we adopt the Nusser et al.
(1996) approach, and assume that in the normal scale, a linear measurement error model
is a reasonable choice to describe the relationship between observed and usual intakes.
3 Model and estimation strategy
We implement a fully Bayesian approach to the problem of estimating the marginal posterior distributions of percentiles of the usual intake distribution of dietary components.
The basic approach uses three nested sampling algorithms to properly account for all
uncertainties: 1) Transformation of observed dietary intake data to normality; 2) Removal of measurement error in the normal scale; 3) Back-transformation to the original
scale. We now describe each of these steps in detail.
3.1 Transformation to normality
As discussed in Nusser et al. (1996), standard power transformations fail to properly
transform intake data to normality for most nutrients. We use cubic splines to improve
the transformation. Our data consist of pairs $(Y_{ij}, z_{ij})$, where $Y_{ij}$ is the observed intake
for the $i$th individual on the $j$th day raised to the power which provides the best (in
terms of minimizing mean squared error) transformation to normality, and the $z_{ij}$ are
the corresponding normal scores. We use Blom's (1958) formula to compute the $z_{ij}$.
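As a minimal sketch of the normal-score step, Blom's formula assigns to the observation with rank $r$ among $N$ values the score $\Phi^{-1}\{(r - 3/8)/(N + 1/4)\}$. The function name `blom_scores` is hypothetical, and ties are broken by input order, which is an assumption not discussed in the text.

```python
from statistics import NormalDist


def blom_scores(values):
    """Blom normal scores: Phi^{-1}((rank - 3/8) / (n + 1/4)).

    Ties are broken by position in the input (an assumption of this sketch).
    Scores are returned in the same order as the input values.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()  # standard normal, mean 0 and sd 1
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = std_normal.inv_cdf((rank - 0.375) / (n + 0.25))
    return scores


# Three made-up power-transformed intakes; the middle value gets score 0.
print(blom_scores([3.0, 1.0, 2.0]))
```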
Our goal is to compute a function $g^{-1}(z)$ such that $g(Y_{ij}) = X_{ij}$, where $X_{ij}$ is
approximately normal. We postulate a cubic spline for $g^{-1}(z)$, indexed by a vector of
unknown parameters and contaminated by normal noise. We use maximum likelihood
(ML) to estimate the parameters in the model. The model is:
$$Y_{ij} = g^{-1}(z_{ij}) + \epsilon_{ij} = \beta_0 + \sum_{p=1}^{3} \beta_p z_{ij}^{p} + \sum_{p=1}^{k} \beta_{p+4}\, t_p + \epsilon_{ij}, \qquad (1)$$
where $t_p = (z_{ij} - r_p)^3 I\{z_{ij} > r_p\}$ and the $\epsilon_{ij}$, $j = 1, \ldots, n_i$, $i = 1, \ldots, n$, are normal random
variables with mean 0. The number of knots in (1) is given by $k$, and their locations are
denoted $r_1, r_2, \ldots, r_k$. We define $r^{(k)} = (r_1, r_2, \ldots, r_k)'$ and $\theta = (k, r^{(k)}, \beta)$.
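The spline in (1) is a cubic polynomial plus one truncated cubic term per knot, so each observation contributes a row of $k + 4$ basis values. The sketch below builds such rows; the names `spline_basis_row` and `design_matrix` are hypothetical and the example values are made up.

```python
def spline_basis_row(z, knots):
    """Basis values for one normal score z: 1, z, z^2, z^3, then
    (z - r)^3 for each knot r when z exceeds r, and 0 otherwise."""
    row = [1.0, z, z ** 2, z ** 3]
    row.extend(max(z - r, 0.0) ** 3 for r in knots)
    return row


def design_matrix(z_values, knots):
    """One row per observation; each row has len(knots) + 4 columns."""
    return [spline_basis_row(z, knots) for z in z_values]


# With knots at 1 and 3, the score z = 2 activates only the first knot term.
print(spline_basis_row(2.0, [1.0, 3.0]))
```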
The $Y_{ij}$ are the sample quantiles of the power-transformed data, and thus cannot be
considered to be iid random variables. As a result, the covariance matrix of $\epsilon$ might be
modelled as a scale factor $\sigma^2$ times a weight matrix ($W$), which is proportional to the
asymptotic variance of the sample quantiles (see, e.g., Schervish, 1996, pp. 404-410).
The variance of $\epsilon_{ij}$ will take the form $\sigma^2 p_{ij}(1 - p_{ij})/f^2(y_{p_{ij}})$, where $y_{p_{ij}}$ is the true sample
quantile and $p_{ij}$ corresponds to the $p_{ij}$th percentage point ($0 < p_{ij} < 1$); the covariance
between $\epsilon_{ij}$ and $\epsilon_{kl}$ is given by $\mathrm{Cov}(\epsilon_{ij}, \epsilon_{kl}) = \sigma^2 (\min\{p_{ij}, p_{kl}\} - p_{ij} p_{kl})/(f(y_{p_{ij}}) f(y_{p_{kl}}))$.
To approximate these terms, we use kernel density estimation.
Given the number of knots $k$ and their location $r^{(k)}$, the ML estimate of $\beta$ is obtained
via the generalized least squares equations:
$$Z' W^{-1} Z \hat{\beta} = Z' W^{-1} Y,$$
where $Z$ is an $N \times (k + 4)$ design matrix (with $N = \sum_{i=1}^{n} d_i$) and $Y$ is the vector
of power-transformed observations. In the remainder, and to keep notation simple, we
assume that $d_i = d$ for all individuals, so that $N = nd$.
In most applications, the weight matrix $W$ is very large ($N \times N$, where $N$ is the number of
observations) and therefore computation of its inverse is impractical. To investigate
whether estimates of the parameters in (1) are sensitive to a simplified formulation of the
model, we considered an alternative representation for $W$ in our application: a diagonal
matrix obtained by setting all off-diagonal elements of $W$ to 0.
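The diagonal form of $W$ can be sketched as follows. Each diagonal entry is $p(1 - p)/\hat{f}(y_p)^2$, with the density estimated by a kernel method. Several choices here are assumptions of the sketch and are not taken from the text: a Gaussian kernel, a rule-of-thumb bandwidth, Blom-type plotting positions for $p$, and omission of the common scale factor $\sigma^2$.

```python
import math
from statistics import stdev


def gaussian_kde(sample, bandwidth):
    """Return a Gaussian kernel density estimate as a callable."""
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(y):
        return norm * sum(math.exp(-0.5 * ((y - s) / bandwidth) ** 2)
                          for s in sample)

    return density


def diagonal_quantile_weights(sample):
    """Diagonal of W: p(1 - p) / f_hat(y_p)^2 at each ordered observation."""
    n = len(sample)
    bandwidth = 1.06 * stdev(sample) * n ** (-0.2)  # rule of thumb (assumption)
    density = gaussian_kde(sample, bandwidth)
    ordered = sorted(sample)
    weights = []
    for rank, y in enumerate(ordered, start=1):
        p = (rank - 0.375) / (n + 0.25)  # plotting position (assumption)
        weights.append(p * (1.0 - p) / density(y) ** 2)
    return ordered, weights


# Made-up skewed sample: the isolated upper-tail point has the largest variance.
ordered, weights = diagonal_quantile_weights([0.0, 1.0, 2.0, 3.0, 4.0, 10.0])
print(weights)
```

Because the estimated density is small in the tails, tail quantiles receive large variances and therefore little weight in the fit.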
We proceed as in Denison et al. (1998) and specify prior distributions for the number
of knots, $k$, and the location of the knots, $r^{(k)}$. We chose a discrete uniform prior
distribution for the knot location, conditional on $k$, so that $r \mid k \sim$ discrete $U(z_{11}, \ldots, z_{nd})$
(with additional constraints), and a Poisson distribution with rate $\lambda$ for the number of
knots $k$, so that $k \sim \mathrm{Poisson}(\lambda)$. In the example given in Section 4, we fix $\lambda$ at some
"known" value.
3.1.1 Details of algorithm for transformation
The dimension of the parameter vector changes with k, the number of knots in model
(1). As a result, we use reversible jump MCMC as discussed in Green (1995) and
Denison et al. (1998) to simulate from the posterior distribution of $\theta$, which specifies the
appropriate transformation.
The idea is simple. At each iteration $l = 1, \ldots, M_1$, a new knot can be introduced, an
old knot can be deleted, or an old knot can be moved to a new location. Consequently,
each iteration consists of three steps:
1. Choose type of move:
   Birth of a new knot, with probability $b_k$.
   Death of an existing knot, with probability $d_k$.
   New location for a knot, with probability $\eta_k$.
2. Compute the MLE $\hat{\beta}_k^{(l)}$ and check monotonicity of $g_k^{-1(l)}(\cdot)$.
3. Accept move, with probability $\alpha^{(l)}$ (defined below).
For $M_1$ large enough, the algorithm "converges". We monitor the behavior of the iterations using a mean squared error criterion computed as
$$\mathrm{MSE}^{(l)} = (nd)^{-1} (Y - g^{-1(l)}(\cdot))' W^{-1} (Y - g^{-1(l)}(\cdot)). \qquad (2)$$
Once the algorithm has "converged", we invert draws $l = 1, \ldots, m_1$, with $m_1 < M_1$, of
functions $g^{-1(l)}(\cdot)$, and evaluate each draw at the set of $nd$ values of $Y_{ij}$ to obtain a
sample of $\{X_{ij}\}^{(l)}$ that are approximately standard normal. That is,
$$\{X_{ij}\}^{(l)} = g^{(l)}(Y_{ij}) \mathrel{\dot{\sim}} N(0, 1).$$
To compute the MLE of $\beta_k^{(l)}$ we use generalized least squares as described in Section
3.1. Because $g^{-1}(\cdot)$ must be monotonic, at each step we check that the $l$th draw satisfies
the condition by evaluating the derivative of $g^{-1(l)}(\cdot)$ on a grid of values of $z$ given by
the knots and midpoints between the knots. If these function evaluations are not all
positive, we obtain an estimate of $\beta$ via linear programming, as the objective function
and all constraints are linear in $\beta$. Non-monotonicity may occur between midpoints and
knots, and thus our approach does not guarantee that $g^{-1(l)}(\cdot)$ is monotonic. However,
we are reasonably confident that non-monotonicity will usually be uncovered by focusing
on the grid.
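The grid check described above can be sketched as follows, assuming the coefficient vector is ordered as intercept, three polynomial coefficients, then one coefficient per knot. The function names are hypothetical, and the linear programming fallback is not shown.

```python
def spline_derivative(z, beta, knots):
    """Derivative in z of the cubic spline: polynomial part plus
    3 * coef * (z - r)^2 for every knot r lying below z."""
    slope = beta[1] + 2.0 * beta[2] * z + 3.0 * beta[3] * z ** 2
    # Knot coefficients follow the four polynomial coefficients.
    for coef, r in zip(beta[4:], knots):
        slope += 3.0 * coef * max(z - r, 0.0) ** 2
    return slope


def monotone_on_grid(beta, knots):
    """True if the derivative is positive at every knot and at every
    midpoint between consecutive knots. With no knots the grid is empty
    and the check passes trivially."""
    ordered = sorted(knots)
    midpoints = [(a + b) / 2.0 for a, b in zip(ordered, ordered[1:])]
    grid = ordered + midpoints
    return all(spline_derivative(z, beta, knots) > 0.0 for z in grid)


# Made-up coefficients: the negative first knot term makes the slope
# negative at the midpoint z = 1, so the check fails.
print(monotone_on_grid([0.0, 1.0, 0.0, 0.0, -1.0, 0.0], [0.0, 2.0]))
```

As the text notes, a positive derivative on this grid does not rule out non-monotonicity between grid points.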
Given $k$, $p(k)$, and $c \le 0.5$, we follow Denison et al. (1998), and define
$b_k = c \min\{1, p(k + 1)/p(k)\}$, $d_k = c \min\{1, p(k - 1)/p(k)\}$, and $\eta_k = 1 - b_k - d_k$, where $p(k)$
is the prior density for the number of knots. Note that for $k = 0$, $b_k = 1$, and for
$k = k_{\max}$, $b_k = 0$. With this formulation, the probability of accepting the proposed
move has a very simple form:
$$\alpha = \min\{1, (\text{likelihood ratio}) \times (\text{prior ratio}) \times (\text{proposal ratio})\},$$
where
$$\alpha(\text{birth}) = \min\{1, (\text{likelihood ratio}) \times \omega(k)\},$$
$$\alpha(\text{death}) = \min\{1, (\text{likelihood ratio}) \times \omega(k)^{-1}\},$$
$$\alpha(\text{move}) = \min\{1, (\text{likelihood ratio})\},$$
and
$$\omega(k) = \frac{nd - 8 - 7k}{nd}.$$
The quantity $\omega(k)$ is the ratio of the number of locations at which a knot may be
placed to the number of data points. (The above result is specific to a cubic spline; for
additional details, see Denison et al. (1998), p. 338.)
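A minimal sketch of the move probabilities and acceptance rules follows. It assumes the Poisson prior on the number of knots, for which $p(k + 1)/p(k) = \lambda/(k + 1)$ and $p(k - 1)/p(k) = k/\lambda$. The default values `c=0.4` and `k_max=30` are illustrative assumptions, the boundary cases follow the statement in the text, and which value of $k$ enters $\omega(k)$ for a death move is left to the caller because the text does not spell it out.

```python
def move_probabilities(k, lam, c=0.4, k_max=30):
    """Return (birth, death, move) probabilities for k knots under a
    Poisson(lam) prior. c and k_max are illustrative assumptions."""
    if k == 0:
        return 1.0, 0.0, 0.0  # only a birth is possible with no knots
    birth = 0.0 if k >= k_max else c * min(1.0, lam / (k + 1))
    death = c * min(1.0, k / lam)
    return birth, death, 1.0 - birth - death


def omega(k, n_obs):
    """Ratio of admissible knot locations to data points for a cubic spline."""
    return (n_obs - 8 - 7 * k) / n_obs


def acceptance_probability(move, likelihood_ratio, k, n_obs):
    """Acceptance probability for a proposed birth, death, or move."""
    if move == "birth":
        ratio = likelihood_ratio * omega(k, n_obs)
    elif move == "death":
        ratio = likelihood_ratio / omega(k, n_obs)
    elif move == "move":
        ratio = likelihood_ratio
    else:
        raise ValueError("move must be 'birth', 'death', or 'move'")
    return min(1.0, ratio)


# Example with lam = 6 and nd = 606 observations (303 individuals, 2 days).
print(move_probabilities(6, 6.0))
print(acceptance_probability("birth", 0.5, 2, 606))
```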
3.2 Measurement error model
We make the assumption that the measurement error is additive in the normal scale.
Using $m_1 < M_1$ sets of transformed values, $\{X_{ij}\}^{(l)}$, $l = 1, \ldots, m_1$, we fit an additive
measurement error model (MEM) as proposed by Nusser et al. (1996):
$$X_{ij}^{(l)} = x_i^{(l)} + u_{ij}^{(l)}, \qquad (3)$$
where $x_i^{(l)}$ is the usual intake of the nutrient for the $i$th individual for the $l$th draw, and
$u_{ij}^{(l)}$ is the measurement error for the $i$th individual on the $j$th day, in the normal scale
for the $l$th draw.
There may be considerable heterogeneity of the measurement error variances across
individuals (see e.g., Nusser et al., 1996), so we formulate our MEM as a hierarchical
model with three levels. We omit the superscript that denotes draw to keep the notation
simple, but it is important to remember that the hierarchical model is formulated for
each draw $\{X_{ij}\}^{(l)}$, $l = 1, \ldots, m_1$.
In level 1, the individual's daily intake is modelled as a normally distributed random
variable with mean equal to the individual's usual intake and with a subject-specific
measurement error variance:
$$X_{ij} \mid x_i, \sigma^2_{u_i} \sim N(x_i, \sigma^2_{u_i}).$$
In level 2, we model the heterogeneity in the usual intakes and in the measurement
error variances across individuals:
$$x_i \mid \mu_x, \sigma^2_x \sim N(\mu_x, \sigma^2_x),$$
$$\log(\sigma^2_{u_i}) \mid \mu_A, \sigma^2_A \sim N(\log(\mu_A), \sigma^2_A).$$
Finally, in level 3 we place flat priors on the remaining hyper-parameters:
$$\mu_x, \; \sigma^2_x, \; \log(\mu_A), \; \sigma^2_A \sim \text{Uniform}.$$
We use the Gibbs sampler to draw values from the posterior distribution of the
parameters in the hierarchical MEM. All full conditionals are of standard form,
with the exception of the full conditional distribution of $\sigma^2_{u_i}$, which is proportional
to
$$p(\log(\sigma^2_{u_i}) \mid x_i, \mu_A, \sigma^2_A, X_i) \propto \Big[\prod_{j} (\sigma^2_{u_i})^{-1/2}\Big] \exp\Big\{-\frac{1}{2\sigma^2_{u_i}} \sum_{j} (X_{ij} - x_i)^2\Big\} \exp\Big\{-\frac{1}{2\sigma^2_A} \big(\log(\sigma^2_{u_i}) - \log(\mu_A)\big)^2\Big\},$$
where $X_i = (X_{i1}, \ldots, X_{id})'$. To draw values from $p(\log(\sigma^2_{u_i}) \mid x_i, \mu_A, \sigma^2_A, X_i)$, we use a
Metropolis-Hastings algorithm (e.g., Smith and Roberts, 1993) with a normal approximation to the full conditional distribution of $\log(\sigma^2_{u_i})$ as a candidate density.
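One way such a Metropolis-Hastings update could look is sketched below. Writing $\theta = \log(\sigma^2_{u_i})$ and $S = \sum_j (X_{ij} - x_i)^2$, the log full conditional is $-d\theta/2 - S e^{-\theta}/2 - (\theta - \log \mu_A)^2/(2\sigma^2_A)$ up to a constant. The text does not say how the normal approximation is built; this sketch assumes it is centred at the mode (found by Newton's method) with variance given by the inverse negative curvature there. All function names are hypothetical.

```python
import math
import random


def log_target(theta, resid_ss, n_days, log_mu_a, var_a):
    """Log full conditional of theta = log(sigma^2_u), up to a constant."""
    return (-0.5 * n_days * theta
            - 0.5 * resid_ss * math.exp(-theta)
            - 0.5 * (theta - log_mu_a) ** 2 / var_a)


def normal_approximation(resid_ss, n_days, log_mu_a, var_a, max_iter=100):
    """Mode of the log target and the inverse negative curvature at the mode."""
    theta = log_mu_a
    for _ in range(max_iter):
        grad = (-0.5 * n_days + 0.5 * resid_ss * math.exp(-theta)
                - (theta - log_mu_a) / var_a)
        hess = -0.5 * resid_ss * math.exp(-theta) - 1.0 / var_a
        step = grad / hess
        theta -= step
        if abs(step) < 1e-12:
            break
    hess = -0.5 * resid_ss * math.exp(-theta) - 1.0 / var_a
    return theta, -1.0 / hess


def mh_step(theta, resid_ss, n_days, log_mu_a, var_a, rng):
    """One independence Metropolis-Hastings update with a normal candidate."""
    mode, var = normal_approximation(resid_ss, n_days, log_mu_a, var_a)
    proposal = rng.gauss(mode, math.sqrt(var))

    def log_q(t):
        # Log candidate density, up to a constant that cancels in the ratio.
        return -0.5 * (t - mode) ** 2 / var

    log_ratio = (log_target(proposal, resid_ss, n_days, log_mu_a, var_a)
                 - log_target(theta, resid_ss, n_days, log_mu_a, var_a)
                 + log_q(theta) - log_q(proposal))
    accept_prob = math.exp(min(0.0, log_ratio))
    return proposal if rng.random() < accept_prob else theta


# Made-up inputs: S = 8 over d = 2 days, log(mu_A) = 0, sigma^2_A = 1.
rng = random.Random(0)
print(mh_step(0.3, 8.0, 2, 0.0, 1.0, rng))
```

Because the log target is concave in $\theta$, the Newton iteration is well behaved here, but other ways of constructing the candidate density would serve equally well.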
For each transformed sample $\{X_{ij}\}^{(l)}$, $l = 1, \ldots, m_1$, we obtained $M_2$ draws from the
joint posterior distribution of $\{\mu_x, \sigma^2_x, \mu_A, \sigma^2_A\}$. For $m_2 < M_2$ of these, we simulated sets
of $(x_i^{(s)}, \sigma^{2(s)}_{u_i})$, $i = 1, \ldots, n$, $s = 1, \ldots, m_2$, from $x_i \mid \mu_x^{(s)}, \sigma^{2(s)}_x$ and $\log(\sigma^2_{u_i}) \mid \mu_A^{(s)}, \sigma^{2(s)}_A$, respectively, to transform back to the original scale. Note that by sampling from the population
as opposed to transforming back the original subjects, we are accounting for the additional variability of only having a finite (incomplete) sample of individuals from the
population.
3.3 Transformation back to original scale
As we described earlier, for each of the $m_2$ draws, we obtained a sample of $n$ usual
intakes and $n$ measurement error variances $(x_i^{(s)}, \sigma^{2(s)}_{u_i})$ in the normal scale. To make
inferences about the quantiles of the intake distribution, we now need to transform the
usual intake draws back to the original scale. By definition,
$$y = E\{Y \mid x = x\} = E\{g^{-1}(x + u) \mid x = x\}.$$
To estimate this expectation, for each $(x_i, \sigma^2_{u_i})$ draw, we generate a large number $q$ of
$u_{ij}$ from $u_{ij} \sim N(0, \sigma^2_{u_i})$ and approximate the expectation using a Monte Carlo mean:
$y_i \approx q^{-1} \sum_{j=1}^{q} g^{-1}(x_i + u_{ij})$. The number of Monte Carlo replicates $q$ is chosen so as to
obtain the required precision for $y_i$.
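The Monte Carlo mean can be sketched as below. The back-transformation `g_inverse` is passed in as any callable; the exponential function used in the example is a stand-in chosen only because its expectation under normal noise is known in closed form, and is not the spline-based transformation of Section 3.1.

```python
import math
import random


def usual_intake_original_scale(x, sigma2_u, g_inverse, q, rng):
    """Approximate E{g_inverse(x + u)} with u ~ N(0, sigma2_u)
    by the mean of q Monte Carlo draws."""
    sd = math.sqrt(sigma2_u)
    total = 0.0
    for _ in range(q):
        total += g_inverse(x + rng.gauss(0.0, sd))
    return total / q


# Stand-in transformation: with g_inverse = exp, x = 0 and sigma2_u = 1,
# the exact expectation is exp(0.5), which differs from exp(0) = 1.
rng = random.Random(1)
print(usual_intake_original_scale(0.0, 1.0, math.exp, 20000, rng))
```

The example also shows why averaging over the measurement error matters: for a nonlinear $g^{-1}$, simply evaluating $g^{-1}(x_i)$ would not give the mean intake in the original scale.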
For $m_1$ transformations and $m_2$ samples of usual intakes from the measurement error
model, we get $m_1 \times m_2$ samples of size $n$: $\{y_i\}^{(t)}$, $t = 1, \ldots, m_1 \times m_2$, from which we
can approximate marginal posterior distributions of interest. For example, we derive the
marginal posterior distribution of percentiles of the usual intake distribution of interest,
$\Pr\{y^{(t)} \le a\} = \gamma$, for $\gamma = 0.01, 0.05, \ldots, 0.99$. We discuss this further in Section 4.
3.4 Summary of Complete Algorithm
The three stages described in Section 3 can be summarized as follows:
1. Draw transformations $g^{(l)}(Y) = X$, $l = 1, \ldots, M_1$.
2. Obtain transformed intakes $X_{11}^{(l)}, \ldots, X_{nd}^{(l)}$ for $l = 1, \ldots, m_1$ out of $M_1$ draws.
3. Using transformed sample $\{X_{ij}\}^{(l)}$, fit MEM
$$X_{ij}^{(l)} = x_i^{(l)} + u_{ij}^{(l)}$$
via Gibbs, and obtain $m_1 \times m_2$ samples ($m_2$ out of $M_2$ draws for MEM)
$$(x_1^{(ls)}, \sigma^{2(ls)}_{u_1}), \ldots, (x_n^{(ls)}, \sigma^{2(ls)}_{u_n}), \quad l = 1, \ldots, m_1, \; s = 1, \ldots, m_2.$$
4. Backtransform:
$$y_i^{(ls)} = E_q^{*}\{g^{-1(l)}(x + u) \mid x = x_i^{(ls)}\},$$
where $E_q^{*}(\cdot)$ is the MC average over draws $u_v^{(ls)} \sim N(0, \sigma^{2(ls)}_{u_i})$, $v = 1, \ldots, q$.
5. Obtain marginal posterior distributions of percentiles of $f^{(ls)}(y)$ and other relevant
quantities.
4 Example
As stated in Section 2, we now illustrate the methodology using a cohort of females and
males, ages 14-18 from CSFII 1994-1996. The female cohort consisted of 303 individuals
each of which had dietary data collected on two non-contiguous days. The male cohort
consisted of 332 individuals also with two non-consecutive days of dietary intake data
each. We focus on six dietary components: calcium, cholesterol, iron, protein, vitamin
A, and vitamin C. In the case of calcium, iron, protein, vitamin A, and vitamin C, we
are interested in estimating the proportion of teen-agers whose usual intakes do not meet
recommendations. In the case of cholesterol, we are concerned with excessive intakes,
and thus focus on the right tail of the distribution.
4.1 Performance of algorithm
The reversible jump MCMC algorithm worked well. Figures 1 and 2 show two realizations ($l = 2$ and $l = 2{,}000$) from the posterior distribution of $g(\cdot)$, and the pairs $(Y_{ij}, z_{ij})$
for females and males respectively, for each dietary component. We see from these figures
that the weighted least squares (WLS) procedure places more weight on the center of the distribution and less
weight on the tails (where there is considerably more variability). The transformation
draws shown in the figures correspond to the case where the weight matrix $W$ was taken
to be diagonal. The reversible jump MCMC algorithm converges quickly as monitored
by the MSE (2) and to the same value based on multiple starting points (not shown in
figures).
For the prior distribution on the number of knots, we set $\lambda = 6$. We chose a small
value for the mean number of knots as the data had already been power-transformed,
and just a few additional knots are likely to be needed to complete the transformation
to normality. Results were not sensitive to changes in the value of $\lambda$ in the range
3 to 8. The number of knots drawn from the posterior distribution for the various dietary
components ranged from about two to fourteen.
We monitored the convergence of the Markov chain of the parameters of the measurement error model using Gelman and Rubin-type statistics (Gelman and Rubin, 1992)
and autocorrelation plots (as suggested in Cowles and Carlin, 1995). The convergence
again was rather quick (within about 100 iterations).
For posterior inference, we sampled $m_1 = 25$ transformations (every 40th iteration
after a burn-in of 1000, $M_1 = 2000$) and, for each transformation, sampled $m_2 = 20$
iterations (every 10th iteration after a burn-in of 100, $M_2 = 300$) from the measurement
error model. This gave a total of 500 back-transformed samples of size 303 for females (332
for males) of the usual intakes, from which we computed posterior medians and 95% credible intervals (using the 2.5th and 97.5th quantiles of the posterior distribution) of the
quantiles and produced density plots.
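The sample counts quoted above follow from the burn-in and thinning settings, as the short check below shows. The function name is hypothetical, and the convention that iterations are numbered from 1 with the first retained draw one thinning interval after the burn-in is an assumption.

```python
def retained_iterations(total, burn_in, thin):
    """Iteration numbers kept after discarding the burn-in and keeping
    every thin-th draw (1-based numbering is an assumption)."""
    return list(range(burn_in + thin, total + 1, thin))


transformations = retained_iterations(2000, 1000, 40)
mem_draws = retained_iterations(300, 100, 10)
# 25 transformations times 20 MEM draws gives 500 back-transformed samples.
print(len(transformations), len(mem_draws), len(transformations) * len(mem_draws))
```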
4.2 Choice of weight matrix
As mentioned earlier, the weight matrix $W$ has dimensions $N \times N$. In our example,
$N = 303 \times 2$ for females, and $N = 332 \times 2$ for males. As $N$ can be quite large, the
inversion of $W$ can be impractical and very time consuming. Thus, we investigated
whether results would be sensitive to using a simplified (diagonal) version of $W$ for
computation.
We chose to use the diagonal weight matrix for model-fitting as a compromise, since
we can account for the extra variability of the quantiles in the tails and yet keep computations manageable. Because a kernel density estimator is used to estimate the density
at the quantiles, use of the full weight matrix W may result in a procedure that is
not only inconvenient from a computational point of view, but also unstable, as density
estimates at the tails get very small.
To decide whether results are sensitive to the choice of a diagonal version of $W$ vis-à-vis
the "complete" version, we repeated the analyses using both forms of the weight
matrix, for several of the dietary components under consideration. We only show results
obtained for vitamin C (females), which appear in Table 1 and Figure 3.
Ignoring the off-diagonal elements of $W$ in the computations had very
little effect on final results. Estimates of quantities of interest, such as the mean, the
standard deviation, and the quantiles of usual intakes are very similar, regardless of
the weight matrix chosen. For example, every 95% credible interval obtained using the
diagonal weight matrix covers the corresponding point estimate obtained using the full
weight matrix, and in fact, most point estimates are within a standard deviation of each
other.
4.3 CSFII 1994-1996
We applied the method we propose to dietary intake data collected in the CSFII during
the period 1994-1996, for the two cohorts described in Section 2. Figs. 4 and 5 display
two estimates of the usual intake distribution of each dietary component for females
and males, respectively. The density estimates drawn in dotted lines correspond to
the distribution of individual two-day means. These "observed mean" distributions are
skewed for all dietary components except protein, whose empirical mean distribution is
almost symmetric but leptokurtic. As a result, it would not be appropriate to fit a normal
measurement error model to intake data to remove the within-individual variance. Thus,
a different parametric form must be chosen for the distribution of observed intake means,
or dietary intake data should be transformed into normality prior to variance estimation.
Following the approach described in Section 3, we obtained the usual intake density
estimates shown in solid lines in Figs. 4 and 5. The figures show, as expected, that after
removal of measurement error, the estimated distributions of usual intakes have smaller
variability than the distributions of two-day means.
Tables 2 and 3 show the mean, standard deviation, and selected percentiles of the
distribution of observed individual means for each dietary component, for females and
males, respectively. In addition, the tables show the ratio of within- to between-individual variances for each dietary component. These variance ratios are all close
to one, indicating that the measurement error variances are of about the same order
of magnitude as the between-individual variances. Therefore, these within-individual
variance components cannot be ignored.
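For two days of data per individual, a within- to between-individual variance ratio can be obtained with a simple method-of-moments calculation, sketched below. This is an illustrative estimator under an additive model with a common within-individual variance; the text does not state how the ratios in Tables 2 and 3 were computed, and the data in the example are made up.

```python
from statistics import variance


def variance_components(day1, day2):
    """Method-of-moments variance components from two days per individual.

    within:  sum of squared day differences divided by 2n, since
             E[(Y_i1 - Y_i2)^2] = 2 * within.
    between: sample variance of two-day means minus within / 2.
    The between estimate can be negative in small samples.
    """
    n = len(day1)
    within = sum((a - b) ** 2 for a, b in zip(day1, day2)) / (2.0 * n)
    means = [(a + b) / 2.0 for a, b in zip(day1, day2)]
    between = variance(means) - within / 2.0
    return within, between


# Made-up intakes for four individuals on two days.
within, between = variance_components([1.0, 2.0, 5.0, 6.0], [3.0, 2.0, 7.0, 4.0])
print(within / between)
```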
Tables 4 and 5 show the mean and the 2.5th and 97.5th percentiles of the posterior distribution of the mean, standard deviation, and selected percentiles of the usual
intake distribution for each dietary component, for females and males, respectively. A
comparison of the entries in Tables 4 and 5 to those in Tables 2 and 3 confirmed what
Figs. 4 and 5 show: usual intake distributions have less variability and lighter tails, which result
from the model's removal of the measurement error in the observed daily intakes. The
differences between the two estimated densities can be large: 95% credible intervals in
Tables 4 and 5 often do not contain the corresponding quantile of the observed individual
mean distributions. This is particularly noticeable in the upper tail of the distributions.
Table 6 shows the mean and the 2.5th and 97.5th percentiles of the posterior distribution of the prevalence of nutrient inadequacy or, in the case of cholesterol, the prevalence
of excessive intake, for females and males. Here, we estimate the prevalence of nutrient
inadequacy as the proportion of individuals whose usual intake of the dietary component
is less than 83% of the Recommended Dietary Allowance (RDA e.g., NRC, 1989, page
285) for the nutrient (see, e.g., Carriquiry, 1998 IOM, 1999). For calcium, iron, protein,
vitamin A, and vitamin C, table 6 shows selected attributes of the posterior distribution
of Pr(y 0:83 RDA) for females and males, respectively. In the case of cholesterol, we
show the mean, 5th and 95th percentiles of the posterior distribution of Pr(y > 300mg).
The interpretation of the entries in the table is the usual one. For example, for females,
the point estimate of the prevalence of nutrient inadequacy for calcium is :84, and a
posteriori, the probability that prevalence is between :77 and :91 is 95%.
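The posterior summaries of prevalence described above could be computed along the following lines. This is a sketch under stated assumptions, not the code used for Table 6: it supposes that each posterior draw yields a sample of usual intakes from the corresponding usual intake distribution, stored as one row of a matrix. The function name `prevalence_summary` and the toy numbers are hypothetical.

```python
import numpy as np

def prevalence_summary(usual_intake_draws, rda, fraction=0.83, level=0.95):
    """Posterior mean and equal-tailed credible interval of prevalence.

    usual_intake_draws: (n_posterior_draws, n_individuals) array; row t holds
    usual intakes simulated under the t-th posterior draw (an assumption of
    this sketch).
    """
    draws = np.asarray(usual_intake_draws, dtype=float)
    cutoff = fraction * rda
    # Prevalence under each posterior draw: share of intakes below the cutoff.
    prevalence = (draws < cutoff).mean(axis=1)
    alpha = 1.0 - level
    lower, upper = np.quantile(prevalence, [alpha / 2, 1.0 - alpha / 2])
    return prevalence.mean(), lower, upper

# Toy illustration: two posterior draws, four individuals, RDA of 1200 mg.
draws = np.array([[500.0, 900.0, 1100.0, 1300.0],
                  [800.0, 950.0, 990.0, 1200.0]])
mean_prev, lower, upper = prevalence_summary(draws, rda=1200.0)
```

Because the interval is built from posterior quantiles, it need not be symmetric around the posterior mean. For an excess-intake measure such as Pr(y > 300 mg), the comparison would be reversed.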
5 Discussion
The analysis of dietary intake data is challenging, even if we do not take into account
the various sources of biases and errors that are often present in this type of data. It is
recognized (see, e.g., IOM, 1999) that individuals tend to under-report the amount of
food they consume. The extent of the under-reporting is known to vary by nutrient, and
by gender-age-ethnic group, but little additional information about the direction and
size of the biases is available. Attempts have been made to calibrate reported intake
using various biochemical markers (see, e.g., IOM, 1999). These methods, however, are
still in the experimental stage, are very costly, and are useful to adjust energy intakes
at best. Nothing is known about the under-reporting of, for example, trace minerals.
It is also known that the USDA databases used to map foods into nutrients are not
always error-free (Schubert et al., 1987; Haytowitz et al., 1996). For example, the USDA
databases lack precise information on the folate content of foods, as a national fortification
effort that adds folate to various food items was implemented only in 1998 (IOM, 1998a).
In this work, we do not take into account these potential sources of biases in dietary
intake data. Rather, we focus on the problem of developing appropriate methods to
analyze the data.
Estimating usual intake distributions of nutrients from dietary intake data can be
difficult, as was argued in Section 2. The approach we have chosen consists in transforming
the observed intakes into the normal scale, removing the measurement error in
the normal scale, and then transforming individual estimated usual intakes back into
the original scale. An alternative approach consists in using a parametric model other
than the normal to represent the relationship between observed and usual intakes. For
example, a Weibull or a Gamma distribution might be an appropriate representation
for the distribution of intakes in the population. This approach has the drawback that
each new dietary component would require the identification of the most suitable model,
thereby limiting the usefulness of the method for researchers in nutrition and areas other
than statistics.
The normal-scale measurement error model we propose in Section 3.2 makes an assumption
that is not necessarily satisfied: that once observed individual intake means
are transformed into normality, both the usual intake and the measurement error components
are also normally distributed. Although this need not hold, informal tests
suggest that for all the dietary components we investigated, the assumptions of model (3)
appear to hold. A deconvolution approach that guarantees that both the usual intakes
and the measurement errors are normally distributed has also been proposed (Stefanski
and Carroll, 1990; Stefanski and Carroll, 1991; Chen, 1999). For the specific case
of dietary intake data, Chen (1999) argues that results obtained using a deconvolution
approach are not noticeably different from those obtained by Nusser et al. (1996) using
a frequentist version of the method we discuss in this manuscript.
We argue in Section 2 that a Bayesian framework is the most appropriate in this
estimation problem, as the method for estimating usual nutrient intake distributions
consists of several steps. Because the estimated transformation into normality and the
estimated variance components in the measurement error model are used as if they were
true values, the standard errors for estimators of the parameters of the usual intake distribution in the Nusser et al. (1996) approach underestimate the true uncertainty about
the value of those parameters. An advantage of the Bayesian paradigm is that it permits
proper accounting of all uncertainties, so that the posterior variance of, for example, the
prevalence of nutrient inadequacy, reflects the uncertainty about all parameters in the
model. Thus, we expect that the 95% credible intervals obtained from the marginal
posterior distributions will be wider than the 95% confidence intervals obtained from a
frequentist analysis such as that presented by Nusser et al. (1996). Direct comparison
of the Bayesian and frequentist approaches is not possible as the model used for the
transformation function in this paper is different from the one used in the Nusser et
al. (1996) manuscript. Nonetheless, we carried out the analysis using the frequentist
version of the method. Computations were done using C-SIDE (Iowa State University,
1997), software developed to implement the Nusser et al. (1996) method. Results
obtained from a frequentist viewpoint are presented in Tables 7 and 8, for females and
males, respectively. Point estimates of percentiles are somewhat similar when comparing
both approaches. The 95% credible sets, however, tend to be wider, and need not be
symmetric around the posterior means of the percentiles.
In our example, we estimated the prevalence of nutrient inadequacy in the population
as the proportion of individuals with usual intakes below 83% of the RDA (NRC,
1989) for the nutrient. It has been argued (e.g., Beaton, 1994; Carriquiry, 1999) that the
appropriate cut-off is the median of the distribution of requirements in the population,
rather than the RDA. The National Academy of Sciences, however, has not yet published the value of the median requirement for any gender-age group. The exception is
calcium, for which the Academy of Sciences has concluded that the median requirement
for any group cannot be determined with the information that is currently available
about calcium intakes and requirements (IOM, 1998b). Under simple assumptions, 83%
of the RDA is approximately equal to the median requirement of the nutrient.
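One set of simple assumptions under which the 83% figure arises can be written out as follows. This is an illustrative sketch rather than a derivation given in the text: it assumes that requirements are symmetrically distributed, so that the median requirement m equals the mean, that the coefficient of variation of requirements is 10%, and that the RDA is set two standard deviations above the mean requirement. The symbols m and \sigma are introduced here for illustration only.

```latex
\begin{align*}
\mathrm{RDA} &= m + 2\sigma, \qquad \sigma = 0.10\,m, \\
\mathrm{RDA} &= 1.2\,m, \\
m &= \mathrm{RDA}/1.2 \approx 0.83\,\mathrm{RDA}.
\end{align*}
```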
In Section 3.1, we used a generalized least squares approach to estimate the parameters
of the function that transforms daily intakes into normality. An alternative approach
is as follows: define $g$ to be the function with inverse $g^{-1}$ such that
$P(Y_{ij} \le y) = P(Z \le g^{-1}(y))$,
where $Z$ is distributed as a standard normal random variable. Again consider a cubic
spline form for the function $g$. In this case, an iterative procedure is needed to obtain
maximum likelihood estimates of the parameters in the model. If we let the parameter
vector be $(k, r(k))$ as before, the likelihood for this model is
$\prod_{i=1}^{n} \prod_{j=1}^{d} f(g^{-1}(y_{ij}))$, where $f$ denotes a
standard normal density. To estimate the parameters in this model, we obtain an initial
value for the parameters using GLS, and then carry out a single Newton-Raphson step
to approach the MLE using analytic derivatives.
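The one-step idea (start from a reasonable initial estimate, then take a single Newton-Raphson step with analytic derivatives) can be sketched generically. The example below is not the spline likelihood above: a toy Poisson log-likelihood stands in for it, and the starting value 3.5 plays the role of the GLS estimate. The function name `newton_step` and the data are hypothetical.

```python
import numpy as np

def newton_step(theta, gradient, hessian):
    """One Newton-Raphson update: theta - H(theta)^{-1} g(theta)."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    g = np.atleast_1d(gradient(theta))
    H = np.atleast_2d(hessian(theta))
    return theta - np.linalg.solve(H, g)

# Toy stand-in model: Poisson counts with rate lam, whose MLE is the sample mean.
y = np.array([2.0, 4.0, 6.0, 4.0])

def grad(theta):
    # Analytic first derivative of sum(y) * log(lam) - n * lam.
    return np.array([y.sum() / theta[0] - y.size])

def hess(theta):
    # Analytic second derivative of the same log-likelihood.
    return np.array([[-y.sum() / theta[0] ** 2]])

theta0 = np.array([3.5])                 # initial value (plays the role of GLS)
theta1 = newton_step(theta0, grad, hess)  # single step toward the MLE (4.0)
```

When the initial value is already close to the maximum, a single step typically recovers most of the remaining distance, which is what makes the one-step procedure attractive when each likelihood evaluation is costly.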
We have discussed the specific problem of estimating usual nutrient intake distributions,
and presented an application consisting of estimating the prevalence of nutrient
inadequacy among teen-agers using dietary intake data collected between 1994 and 1996.
Several related problems still require investigation. An extension of the methods presented
here to the case where the usual intake distribution of foods is of interest
is not straightforward. Difficulties arise because in the case of foods, it is important
to consider not only the amount of a food consumed, but also the probability that the
individual would have consumed the food on the day when the interview was conducted.
For many food items, the probability of consumption is not independent of the amount
consumed, so estimating the marginal distribution of usual intake of foods can be challenging. Yet, the problem is an important one, as the distribution of usual food intakes
is required to assess exposure rates to toxicants found in the food supply in a group.
Ratios of dietary components are also of importance. For example, researchers may
be interested in assessing the proportion of individuals in a group who consume, on the
average, more than 30% of calories from fat, or more than 10% of calories from saturated
fat. The methods presented in this paper for estimating the usual intake distribution for
a nutrient cannot be directly applied to ratios of dietary components such as those described
above. Typically, both the numerator and the denominator in the ratio are observed
subject to measurement error, and cannot be assumed to be independent.
References
Beaton, G.H. (1994) Criteria of an adequate diet. In: Shils, R.E., Olson, J.A., Shike,
M. eds. Modern Nutrition in Health and Disease. Lea and Febiger, Philadelphia.
Blom, G. (1958) Statistical Estimates and Transformed Beta Variables. Wiley, New
York.
Carriquiry, A.L. (1999). Assessing the prevalence of nutrient inadequacy. Public Health
Nutrition. In press.
Carroll, R.J., Freedman, L.S., and Hartman, A.M. (1996) Use of semiquantitative food
frequency questionnaires to estimate the distribution of usual intake. American
Journal of Epidemiology, 143:392-404.
Chen, C. (1999) Spline estimators of the distribution function of a variable measured
with error. Doctoral Thesis, Department of Statistics, Iowa State University.
Cowles, K., and Carlin, B.S. (1996) Markov chain Monte Carlo convergence diagnostics:
A comparative review, Journal of the American Statistical Association, 81:86-98.
Denison, D.G.T., Mallik, B.K., and Smith, A.F.M. (1998) Automatic Bayesian curve
fitting. Applied Statistics, 60:333-350.
Dodd, K. (1997) A Technical Guide to C-SIDE. Technical Report 96-TR 32, Dietary
Assessment Research Series Report 9, Department of Statistics and Center for Agricultural and Rural Development (CARD), Iowa State University, Ames.
Eckert, R.S., Carroll, R.J., and Wang, N. (1997) Transformations to additivity in
measurement error models. Biometrics, 53:262-272.
Gelman, A., and Rubin, D.B. (1992) Inference from iterative simulation using multiple
sequences. Statistical Science, 7:457-472.
Green, P.J. (1995) Reversible jump Markov chain Monte Carlo computation and Bayesian
model determination. Biometrika, 82:711-732.
Haytowitz, D.B., Pehrsson, P.R., Smith, J., Gebhardt, S.E., Mathews, R.H., and Anderson, B.A. (1996) Key foods: setting priorities for nutrient analysis. Journal of Food
Composition and Analysis, 9:331-364.
Huang, E.T., Fuller, W.A. (1978) Nonnegative regression estimation for sample survey
data. ASA Proceedings of the Social Statistics Section, 300-305.
Institute of Medicine (1998a) Dietary Reference Intakes: Thiamin, Riboflavin, Niacin,
Vitamin B6, Folate, Vitamin B12, Pantothenic Acid, Biotin, and Choline. Preprint,
National Academy Press, Washington, DC.
Institute of Medicine (1998b) Dietary Reference Intakes: Calcium, Phosphorus, Magnesium, Vitamin D, and Fluoride. Preprint, National Academy Press, Washington,
DC.
Department of Statistics and Center for Agricultural and Rural Development, Iowa
State University. (1996) A User's Guide to C-SIDE: Software for Intake Distribution, Version 1.0. Technical Report 96-TR 31. Center for Agricultural and Rural
Development, Iowa State University, Ames.
National Research Council (1986) Nutrient Adequacy. National Academy Press, Washington, DC.
National Research Council (1989) Recommended Dietary Allowances. 10th ed. National Academy Press, Washington, DC.
Nusser, S.M., Carriquiry, A.L., Dodd, K.W., and Fuller, W.A. (1996) A semiparametric
transformation approach to estimating usual daily intake distributions. Journal of
the American Statistical Association, 91:1440-1449.
Schubert, A., Holden, J.M., and Wolf, W.R. (1987) Selenium content of a core group
of foods based on a critical evaluation of published analytical data. Journal of the
American Dietetics Association, 87:285-299.
Smith, A.F.M., Roberts, G.O. (1993) Bayesian computation via the Gibbs sampler and
related Markov chain Monte Carlo methods. Journal of Royal Statistical Society B.
55:3-23.
Schervish, M. (1996) Theory of Statistics. Springer-Verlag, New York.
Spiegelhalter, D.J, Best, N.G, Gilks, W.R, and Inskip, H. (1996). Hepatitis B: a case
study in MCMC methods. in Markov Chain Monte Carlo in Practice, eds. Gilks
WR, Richardson S, Spiegelhalter DJ, Chapman and Hall, pp. 339-358.
Stefanski, L.A., and Bay, J.M. (1996) Simulation extrapolation deconvolution of finite
population cumulative distribution function estimators. Biometrika 83:407-417.
Stefanski, L.A., and Carroll, R.J. (1990) Deconvoluting kernel density estimators.
Statistics, 21:169-184.
Stefanski, L.A., and Carroll, R.J. (1991) Deconvolution-based score tests in measurement error models. The Annals of Statistics, 19:249-259.
U.S. Department of Agriculture, Agricultural Research Service (1997). Continuing
Survey of Food Intakes by Individuals, 1994-1996. CSFII Report, Washington, DC:
U.S. Government Printing Office.
Figure 1: Power transformed intake data Yij (points), and two draws of the transformation function, at the 2nd (solid line) and 2,000th iterations (dashed line). Data
correspond to females.
Figure 2: Power transformed intake data Yij (points), and two draws of the transformation function, at the 2nd (solid line) and 2,000th iterations (dashed line). Data
correspond to males.
Figure 3: Densities of the usual intake of vitamin C for females aged 14-18, estimated
using the diagonal and non-diagonal forms of the weight matrix W in the transformation
into normality.
Figure 4: Estimated densities of the usual intake of dietary components in females aged
14-18. The dotted curves correspond to the distribution of two-day means. The solid
curves correspond to the Bayesian estimator described in Section 3.
Figure 5: Estimated densities of the usual intake of dietary components in males aged
14-18. The dotted curves correspond to the distribution of two-day means. The solid
curves correspond to the Bayesian estimator described in Section 3.
                   Diagonal W              Full W
Mean               88.2 (77.4, 99.7)       86.1 (76.6, 96.9)
Std. Dev.          52.9 (40.8, 66.4)       52.0 (40.6, 66.6)
1st percentile     14.4 (7.8, 22.5)        15.3 (9.1, 23.2)
5th percentile     23.7 (17.2, 32.5)       24.3 (16.6, 31.8)
10th percentile    31.1 (23.7, 41.3)       31.3 (23.3, 40.6)
50th percentile    77.0 (64.9, 89.8)       74.5 (63.3, 85.0)
90th percentile    158.9 (134.6, 187.4)    156.0 (134.3, 185.8)
95th percentile    187.7 (154.5, 228.6)    186.3 (157.4, 228.1)
99th percentile    247.1 (194.9, 324.9)    246.5 (195.9, 317.7)

Table 1: Mean, standard deviation, and quantiles of the usual intake distribution of
vitamin C for females 14-18. Values in parentheses are the 95% credible intervals. Estimates
were obtained using a diagonal and a full weight matrix for estimation of the
transformation into normality.
           Calcium   Cholesterol   Iron    Protein   Vit A    Vit C
           (mg)      (mg)          (mg)    (g)       (μg)     (mg)
Mean       750       208           13.4    64.8      790      88.0
Std Dev    348       125           7.0     23.0      680      83.1
Ratio      .9        1.6           1.0     1.3       1.1      1.1
1st        210       36            3.9     23.8      112      7.3
5th        273       59            5.9     27.6      176      11.1
10th       343       76            7.1     36.3      231      15.6
50th       685       180           12.0    62.4      577      61.1
90th       1255      369           20.1    95.5      1536     191.4
95th       1405      434           25.2    106.5     2005     251.0
99th       1765      668           42.5    125.4     3419     405.2

Table 2: Mean, standard deviation, and selected percentiles of the distribution of two-day
individual means, and ratio of within- to between-individual variance in intakes for
females aged 14-18.

           Calcium   Cholesterol   Iron    Protein   Vit A    Vit C
           (mg)      (mg)          (mg)    (g)       (μg)     (mg)
Mean       1259      315           21.0    101.2     1203     112.6
Std Dev    661       180           11.1    38.6      881      96.5
Ratio      .8        1.5           1.0     1.0       1.1      1.1
1st        305       59            6.6     38.0      152      6.7
5th        478       104           8.7     51.1      262      15.2
10th       600       132           10.2    58.4      359      22.5
50th       1112      281           18.4    95.1      965      84.8
90th       2108      556           33.4    150.8     2322     249.0
95th       2631      627           42.6    179.5     3041     309.2
99th       3553      961           63.3    206.3     4055     393.1

Table 3: Mean, standard deviation, and percentiles of the distribution of two-day individual
means, and ratio of within- to between-individual variances in intakes for males
aged 14-18.
        Calcium         Cholesterol     Iron            Protein         Vit A           Vit C
        (mg)            (mg)            (mg)            (g)             (μg)            (mg)
Mean    748             207             13.3            64.8            779             88.2
        (702, 800)      (192, 222)      (12.5, 14.3)    (61.7, 67.7)    (697, 868)      (77.4, 99.7)
Std.    248             62              4.2             14.4            406             52.9
        (202, 296)      (35, 84)        (3.3, 5.3)      (11.1, 18.3)    (317, 522)      (40.8, 66.4)
1st     298             95              6.2             35.7            208             14.4
        (233, 369)      (66, 137)       (4.9, 7.7)      (27.9, 42.5)    (144, 289)      (7.8, 22.5)
5th     389             117             7.7             42.7            295             23.7
        (324, 451)      (93, 156)       (6.6, 8.8)      (36.6, 48.2)    (229, 368)      (17.2, 32.5)
10th    447             133             8.6             47.0            355             31.1
        (390, 510)      (110, 165)      (7.7, 9.6)      (41.6, 51.9)    (292, 428)      (23.7, 41.3)
50th    724             200             12.7            64.0            687             77.0
        (672, 781)      (184, 217)      (11.9, 13.6)    (60.4, 67.1)    (610, 774)      (64.9, 89.8)
90th    1077            290             18.8            83.5            1323            158.9
        (988, 1183)     (248, 324)      (17.0, 21.0)    (77.3, 90.5)    (1147, 1548)    (134.6, 187.4)
95th    1188            317             21.1            89.4            1551            187.7
        (1072, 1321)    (264, 369)      (18.8, 24.1)    (81.7, 97.8)    (1322, 1893)    (154.5, 228.6)
99th    1389            372             25.6            100.6           2032            247.1
        (1216, 1585)    (287, 458)      (21.8, 31.1)    (89.5, 115.0)   (1630, 2678)    (194.9, 324.9)

Table 4: Mean, standard deviation, and selected percentiles of the usual intake distribution
of dietary components for females aged 14-18. Values in parentheses are the lower
and upper bounds of 95% credible intervals.
        Calcium         Cholesterol     Iron            Protein         Vit A           Vit C
        (mg)            (mg)            (mg)            (g)             (μg)            (mg)
Mean    1252            311             20.9            64.8            1180            110.8
        (1158, 1332)    (244, 335)      (19.6, 22.4)    (61.7, 67.7)    (1082, 1290)    (100.1, 123.6)
Std.    461             101             7.0             14.4            563             61.3
        (374, 563)      (63, 129)       (5.5, 8.7)      (11.1, 18.3)    (446, 685)      (49.0, 73.2)
1st     478             133             9.5             35.7            313             21.5
        (362, 589)      (99, 174)       (7.9, 11.4)     (27.9, 42.5)    (221, 428)      (13.8, 32.0)
5th     627             168             11.7            42.7            450             33.7
        (530, 735)      (140, 206)      (10.0, 13.3)    (36.6, 48.2)    (361, 561)      (25.1, 44.6)
10th    726             192             13.1            47.0            554             43.0
        (633, 828)      (163, 226)      (11.7, 14.8)    (41.6, 51.9)    (460, 664)      (34.5, 55.4)
50th    1185            299             19.9            64.0            1086            98.6
        (1100, 1275)    (239, 323)      (18.3, 21.5)    (60.4, 67.1)    (976, 1203)     (86.9, 112.1)
90th    1869            445             30.1            83.5            1939            192.9
        (1666, 2062)    (327, 504)      (27.5, 33.7)    (77.3, 90.5)    (1702, 2215)    (167.7, 221.1)
95th    2102            493             33.9            89.4            2235            226.7
        (1855, 2385)    (352, 567)      (30.2, 38.6)    (81.7, 97.8)    (1927, 2580)    (193.8, 262.5)
99th    2569            588             41.3            100.6           2816            293.2
        (2205, 3035)    (422, 728)      (34.8, 49.5)    (89.5, 115.0)   (2320, 3374)    (240.4, 352.0)

Table 5: Mean, standard deviation, and selected percentiles of the usual intake distribution
of dietary components for males aged 14-18. Values in parentheses are the lower
and upper bounds of 95% credible intervals.
                             Females                         Males
                             Value      Prevalence           Value      Prevalence
Calcium       RDA            1,200 mg   .84 (.77, .91)       1,200 mg   .33 (.24, .41)
Iron          RDA            15 mg      .48 (.38, .56)       12 mg      .02 (.00, .05)
Protein       RDA            44 g       .01 (.00, .05)       59 g       .00 (.00, .02)
Vitamin A     RDA            800 μg     .48 (.37, .56)       1,000 μg   .30 (.22, .38)
Vitamin C     RDA            60 mg      .26 (.17, .35)       60 mg      .14 (.07, .21)
Cholesterol   Cut-point      300 mg     .08 (.00, .16)       300 mg     .50 (.18, .59)

Table 6: Mean of the posterior distribution of prevalence of nutrient inadequacy among
females and males aged 14-18, and 5th and 95th posterior percentiles. Here, prevalence
is defined as Pr(y < 0.83 RDA), where the RDA for each nutrient is the value published
in the 1989 NRC report. For cholesterol, we report the mean, 5th and 95th percentiles
of the posterior distribution of the prevalence of excessive intakes, Pr(y > 300 mg).
        Calcium         Cholesterol     Iron            Protein         Vit A           Vit C
        (mg)            (mg)            (mg)            (g)             (μg)            (mg)
1st     293             105             6.1             35.9            210             16
        (227, 359)      (78, 132)       (4.9, 7.3)      (29.7, 42.1)    (140, 280)      (9, 23)
5th     393             130             7.8             43.4            315             26
        (332, 454)      (107, 153)      (6.7, 8.9)      (38.2, 48.6)    (243, 387)      (19, 34)
10th    455             144             8.8             47.6            386             34
        (399, 511)      (123, 165)      (7.8, 9.8)      (43.1, 52.1)    (315, 456)      (26, 42)
50th    727             204             13.0            64.5            729             77
        (682, 772)      (188, 220)      (12.3, 13.7)    (61.3, 67.7)    (660, 798)      (68, 86)
90th    1092            284             19.3            84.2            1362            158
        (990, 1194)     (247, 321)      (17.1, 21.5)    (77.9, 90.5)    (1127, 1597)    (131, 185)
95th    1214            311             21.9            90.4            1639            191
        (1083, 1345)    (263, 359)      (18.8, 25.0)    (82.6, 98.2)    (1300, 1978)    (153, 229)
99th    1469            369             28.0            102.6           2328            266
        (1271, 1667)    (297, 441)      (22.7, 33.3)    (91.6, 113.6)   (1705, 2951)    (200, 332)

Table 7: Selected percentiles of the usual intake distribution of dietary components
for females aged 14-18, estimated using the Nusser et al. (1996) frequentist approach.
Values in parentheses are the lower and upper bounds of the approximate 95% confidence
intervals computed using a balanced repeated replication method.
        Calcium         Cholesterol     Iron            Protein         Vit A           Vit C
        (mg)            (mg)            (mg)            (g)             (μg)            (mg)
1st     464             142             9.0             55              360             22
        (372, 556)      (110, 174)      (7.5, 10.5)     (47, 63)        (261, 459)      (14, 30)
5th     620             179             11.4            66              508             35
        (532, 708)      (150, 208)      (10.0, 12.8)    (59, 73)        (407, 609)      (26, 44)
10th    720             202             12.8            73              607             45
        (636, 804)      (175, 229)      (11.5, 14.1)    (67, 80)        (508, 706)      (36, 54)
50th    1186            303             19.8            99              1102            100
        (1112, 1260)    (281, 325)      (18.6, 21.0)    (94, 103)       (1004, 1200)    (90, 110)
90th    1885            442             30.4            133             1919            199
        (1697, 2073)    (391, 493)      (27.2, 33.6)    (123, 143)      (1649, 2189)    (168, 230)
95th    2136            489             34.4            144             2230            238
        (1885, 2387)    (424, 554)      (30.1, 38.7)    (131, 157)      (1862, 2598)    (195, 281)
99th    2687            589             43.3            166             2938            329
        (2283, 3091)    (489, 689)      (36.1, 50.5)    (147, 185)      (2319, 3557)    (255, 402)

Table 8: Selected percentiles of the usual intake distribution of dietary components
for males aged 14-18, estimated using the Nusser et al. (1996) frequentist approach.
Values in parentheses are the lower and upper bounds of the approximate 95% confidence
intervals computed using a balanced repeated replication method.