Simple conditions to theoretically limit potential confounding of least-squares estimates Brian Knaeble

Simple conditions to theoretically limit potential
confounding of least-squares estimates
Brian Knaeble
University of Utah
May 2013
Abstract
Results of multiple regression analysis are often reported as if model
uncertainty is not an issue. If, however, omitted-variable bias is a valid
concern, then the results of this paper may apply. The main result is a
simple theorem, roughly asserting that weak correlates cannot reverse existing estimates. The contrapositive, in a special case, produces necessary
conditions for the Yule–Simpson effect. Other applications are discussed,
and a few counterexamples are presented to demonstrate how confounding can occur when least expected.
Keywords: least-squares, correlation, confounding, reversal
1 Introduction
Consider this scenario. A sufficient number of high-dimensional observations
have been made to fit a linear model, but the set of explanatory variables to be
used in the model has yet to be determined. Inference regarding the qualitative
nature of the unique effect of Xi on Y is desired. The dimension is large enough
so that it is not computationally feasible to fit every possible model. Thus it
seems that any conclusions reached regarding the unique effect of Xi on Y are
potentially spurious.
Specifically, suppose that a linear model has been selected, with explanatory variables indexed by I. Denote by ${}_{I}\hat\beta_i$ the ith fitted coefficient within
this model, obtained using the principle of least squares. As long as there
is some error in the model, it remains possible that, through consideration of
the data associated with additional explanatory variables, indexed by J,
\[ \operatorname{sign}({}_{J,I}\hat\beta_i) \neq \operatorname{sign}({}_{I}\hat\beta_i). \]
For a particular J, the probability of such a reversal occurring may be low,
but when working with a large set of explanatory variables, there are many such
Js to consider. The situation becomes more serious if one attempts to account
for confounding that arises due to unobserved lurking variables. In the presence
of model uncertainty, is sound inference even possible? For more reading along
these lines consult [1] or [2].
To clarify our understanding of confounding, we distinguish between latent
confounding of ${}_{I}\hat\beta_i$, due to the data associated with variables indexed by I,
and potential confounding of ${}_{I}\hat\beta_i$, due to the data associated with variables
indexed by J. Note that even the data associated with a single $X_j$, despite not
being correlated with the data for $X_i$ or the data for Y, holds the potential to
drastically confound ${}_{I}\hat\beta_i$, just by revealing latent confounding.† This curious
phenomenon is demonstrated in Table 1.1. Latent confounding is not an issue
for data that has been transformed so that each of the vectors of observations
indexed by I become centered and then orthogonal to one another (see Lemma
4.4 of Section 4.3). Orthogonality can be obtained by applying a Gram–Schmidt
procedure in such a way so that the specific vector xi remains unchanged (except
for centering) and the span of the design matrix remains unchanged as well. If
we assume full rank for the original design matrix then such a transformation
is always possible.
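The transformation just described can be sketched numerically. The following is a minimal illustration, not the author's implementation: the function name `orthogonalize_keeping`, the random data, and the choice of NumPy are all assumptions made for the example. It centers every column and then runs a Gram–Schmidt pass that processes column i first, so that column is altered only by centering.

```python
import numpy as np

def orthogonalize_keeping(X, i):
    """Center the columns of X, then orthogonalize them by Gram-Schmidt,
    processing column i first so that it is changed only by centering."""
    Xc = X - X.mean(axis=0)                      # center every column
    order = [i] + [k for k in range(X.shape[1]) if k != i]
    Z = np.empty_like(Xc, dtype=float)
    done = []                                    # columns already orthogonalized
    for k in order:
        v = Xc[:, k].copy()
        for d in done:                           # remove components along earlier columns
            v -= (v @ Z[:, d]) / (Z[:, d] @ Z[:, d]) * Z[:, d]
        Z[:, k] = v
        done.append(k)
    return Z

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))                     # hypothetical full-rank data
Z = orthogonalize_keeping(X, i=2)
```

Under the full-rank assumption, the columns of `Z` together with the vector of ones span the same space as the columns of `X` together with the vector of ones, which is what the paragraph above requires.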
However, despite having arranged for an orthogonal design matrix, potential
confounding, due to vectors of data indexed by J, remains a possibility. Some
of the concern can be addressed with the following theorem. It is perhaps best
understood as a pure mathematical statement regarding the least-squares fitting
procedure.
Let r denote the bivariate correlation coefficient, and let R denote the
positive square root of the coefficient of determination. Let I index centered,
orthogonal columns of data for a subset of explanatory variables, and let J index
disjoint (from I), not necessarily centered or orthogonal (to themselves or the
vectors indexed by I), additional columns of data.
Theorem 1.1. For any i ∈ I,
\[ {}_{J}R < |r(x_i, y)| \implies \operatorname{sign}({}_{J,I}\hat\beta_i) = \operatorname{sign}({}_{I}\hat\beta_i). \]
The conclusion is modest, but similar reasoning has been used in the past
† An educational article in Occupational & Environmental Medicine speaks of a trend toward increasing use of (multiple) regression, as opposed to stratification, within epidemiology,
as a means to control for confounding and arrive at an adjusted estimate ${}_{I}\hat\beta_i$ for the unique
effect of $X_i$ on Y [5]. However, there appears to be a belief that in order to confound the
estimate ${}_{I}\hat\beta_i$, a lurking variable $X_j$ must correlate with both $X_i$ and Y [6].
y
√
√2
− 2
0
0
x1
√
√2
−√ 2
2 √2
−2 2
x2
1
√1
−5√ 2 − 1
5 2−1
x3
√
5 + √2
5√− 2
√2 − 5
− 2−5
Table 1.1: A contrived dataset where x3 is uncorrelated with both x1 and y,
yet $\operatorname{sign}({}_{1,2,3}\hat\beta_1) \neq \operatorname{sign}({}_{1,2}\hat\beta_1)$.
(see [3] for an early precedent).‡ Also, the main premise is simple, and when true
it can be readily verified. This is especially apparent when J = {j} because
${}_{j}R = |r(x_j, y)|$ (Proposition 4.6) and correlations are well understood. Conceivably, specialists
could verify the truth of this simpler premise, even in the absence of data, with
an argument based on existing subject matter knowledge. However, caution is
advised in general as the combined effect of a set of multiple, weak correlates
can be greater than expected (see Table 1.2).
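Theorem 1.1 can also be probed by simulation. The sketch below is illustrative only: the helper names (`coefs`, `R`, `trial`), the sample size, and the data-generating model are assumptions chosen for the example, not part of the paper. Each trial builds centered, orthogonal columns for I, arbitrary columns for J, and records whether the premise holds and whether the sign of the ith coefficient is preserved.

```python
import numpy as np

def coefs(cols, y):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    A = np.column_stack([np.ones(len(y)), cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def R(cols, y):
    """Positive square root of the coefficient of determination."""
    A = np.column_stack([np.ones(len(y)), cols])
    fitted = A @ coefs(cols, y)
    return np.sqrt(np.sum((fitted - y.mean())**2) / np.sum((y - y.mean())**2))

def trial(seed, n=40, i=0):
    rng = np.random.default_rng(seed)
    raw = rng.normal(size=(n, 3))
    Q, _ = np.linalg.qr(raw - raw.mean(axis=0))   # centered, orthonormal columns
    XI = np.sqrt(n) * Q                           # columns indexed by I
    XJ = rng.normal(size=(n, 2))                  # columns indexed by J (arbitrary)
    y = XI @ rng.normal(size=3) + 0.3 * XJ[:, 0] + 0.2 * rng.normal(size=n)
    premise = R(XJ, y) < abs(np.corrcoef(XI[:, i], y)[0, 1])
    b_I = coefs(XI, y)[1 + i]                     # coefficient of x_i using I only
    b_JI = coefs(np.column_stack([XJ, XI]), y)[1 + XJ.shape[1] + i]  # using J and I
    return premise, np.sign(b_I) == np.sign(b_JI)

results = [trial(seed) for seed in range(200)]
held = [same for premise, same in results if premise]   # trials where the premise holds
```

In every trial collected in `held` the theorem predicts an unchanged sign; trials where the premise fails are unconstrained and may or may not show a reversal.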
2 The Yule–Simpson effect
In this section it is shown how the content of Theorem 1.1 is related to the
Yule–Simpson effect. A few descriptive examples of the Yule–Simpson effect
are stated in order to provide context. The terms Yule–Simpson effect and
Simpson’s paradox are used interchangeably.
Clifford H. Wagner, writing in The American Statistician, introduces Simpson’s paradox as
the designation for a surprising situation that may occur when two
populations are compared with respect to the incidence of some attribute: If the populations are separated in parallel into a set of
descriptive categories, the population with higher overall incidence
may yet exhibit a lower incidence within each such category [4].
In a medical context differing terminology may be used. The following example is taken from the British Medical Journal.
Open surgery (1972-80) had a success rate of 78% (273/350) while
percutaneous nephrolithotomy (1980-5) had a success rate of 83%
(289/350), an improvement over the use of open surgery. However,
the success rates looked rather different when stone diameter was
taken into account. This showed that, for stones of less than 2 cm,
‡ For papers containing similar mathematical results see any of [7, 8, 9]. For scientific
papers where Theorem 1.1 could apply see any of [10, 11, 12, 13, 14].
y
√
√
2
+
√
√3
− 2√
+ 3
−√ 3
− 3
x1
√
2 √2
−2√ 2
2 √2
−2 2
x2
√ √
√2 √3 + − √
2 3 + 2 √2 − −2 2 − x3
√ √
√2 √3 + − 2√ 3 + −2√ 2 − 2 2−
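Table 1.2: A contrived data set illustrating how the confounding potential of
x2 and x3 combined can be greater than expected. For all $\varepsilon \neq 0$, ${}_{1}\hat\beta_1 = 0.5$ and
${}_{1,2,3}\hat\beta_1 = -1.0$, yet as $\varepsilon \downarrow 0$ both ${}_{2}R \downarrow 0$ and ${}_{3}R \downarrow 0$, while ${}_{2,3}R \equiv 0.75 > 0.5 = {}_{1}R$.
Incidentally, ${}_{1}\hat\beta_1 = {}_{1,2,3}\hat\beta_1 = 0.5$ when $\varepsilon = 0$.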
93% (81/87) of cases of open surgery were successful compared with
just 83% (234/270) of cases of percutaneous nephrolithotomy. Likewise, for stones of more than 2 cm, success rates of 73% (192/263)
and 69% (55/80) were observed for open surgery and percutaneous
nephrolithotomy respectively [15].
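The arithmetic in the quoted example can be checked directly. The short sketch below uses only the counts quoted from [15]; the variable and function names are illustrative choices. Exact fractions are used so that the comparisons involve no rounding.

```python
from fractions import Fraction

# (successes, cases) for each stone-size category, as quoted from [15]
open_surgery = {"< 2 cm": (81, 87), "> 2 cm": (192, 263)}
percutaneous = {"< 2 cm": (234, 270), "> 2 cm": (55, 80)}

def pooled(groups):
    """Overall success rate: total successes over total cases."""
    successes = sum(a for a, _ in groups.values())
    cases = sum(b for _, b in groups.values())
    return Fraction(successes, cases)

def by_category(groups):
    """Success rate within each category."""
    return {k: Fraction(a, b) for k, (a, b) in groups.items()}

pooled_open, pooled_pn = pooled(open_surgery), pooled(percutaneous)
rates_open, rates_pn = by_category(open_surgery), by_category(percutaneous)

# Yule-Simpson effect: the pooled comparison favours one treatment,
# while every category favours the other.
reversal = pooled_pn > pooled_open and all(
    rates_pn[k] < rates_open[k] for k in rates_open
)
print(float(pooled_open), float(pooled_pn), reversal)
```

The pooled rates reproduce the quoted 273/350 and 289/350, while both within-category comparisons favour open surgery.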
A mathematical definition for the Yule–Simpson effect is proposed in Table 2.1. Such a definition, however, fails to apply to a well-known historical
example of the Yule–Simpson effect, as described in Science:
Examination of aggregate data on graduate admissions to the University of California, Berkeley, for fall 1973 shows a clear but misleading pattern of bias against female applicants. . . . If the data is
properly pooled, taking into account the autonomy of departmental
decision making, thus correcting for the tendency of women to apply to graduate departments that are more difficult for applicants of
either sex to enter, there is a small but statistically significant bias
in favor of women [16].
The details show that not every department had a higher acceptance rate
for females. Nonetheless, the authors still choose to describe the reversal as “a
paradox, sometimes referred to as Simpson’s” [16]. To accommodate their use
of the term, and other similar usage (see [17]), an alternative definition of the
Yule–Simpson effect should be considered. The terminology of linear modeling
can apply.
Let Y indicate the presence or absence of an attribute, taking the values
one or zero. Let Xi indicate membership within one population or another,
taking the values zero or one. Let the s indicator variables Xj1 , Xj2 , ..., Xjs
together indicate category. With I = {i} and J = {j1 , j2 , ..., js } the Yule–
Simpson effect can be understood as a special instance of the following more
general phenomenon.
Definition 2.1. Let I index a subset of explanatory variables and let J index
a disjoint subset of explanatory variables. We say that J has induced a reversal
of $\hat\beta_i$ if
\[ \operatorname{sign}({}_{J,I}\hat\beta_i) \neq \operatorname{sign}({}_{I}\hat\beta_i). \]
In light of Definition 2.1 we can interpret Theorem 1.1 as providing conditions that preclude the possibility of a reversal.
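As an illustration of Definition 2.1, the kidney-stone counts quoted from [15] can be expanded into one row per patient and fitted by least squares. This is a sketch under stated assumptions: the coding of the indicators (x_i = 1 for percutaneous nephrolithotomy, x_j = 1 for stones larger than 2 cm, y = 1 for success) and the helper `slopes` are choices made for the example.

```python
import numpy as np

# (x_i, x_j, successes, cases) taken from the counts quoted from [15]
counts = [
    (0, 0, 81, 87), (0, 1, 192, 263),    # open surgery: small, large stones
    (1, 0, 234, 270), (1, 1, 55, 80),    # percutaneous: small, large stones
]
rows = []
for xi, xj, succ, n in counts:
    rows += [(xi, xj, 1)] * succ + [(xi, xj, 0)] * (n - succ)
data = np.array(rows, dtype=float)
x_i, x_j, y = data[:, 0], data[:, 1], data[:, 2]

def slopes(cols, y):
    """Least-squares slopes of y on an intercept plus the given columns."""
    A = np.column_stack([np.ones(len(y))] + cols)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[1:]

b_I = slopes([x_i], y)[0]          # coefficient of x_i with I = {i}
b_JI = slopes([x_i, x_j], y)[0]    # coefficient of x_i after adding J = {j}

r_iy = np.corrcoef(x_i, y)[0, 1]
R_J = abs(np.corrcoef(x_j, y)[0, 1])   # for a single column, R equals |r|
```

Here `b_I` is positive and `b_JI` is negative, so J has induced a reversal. Consistent with the contrapositive of Theorem 1.1, `R_J` exceeds `abs(r_iy)` for these data.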
               category 1   category 2   ···   category s
population 1   a1/b1        a2/b2        ···   as/bs
population 2   c1/d1        c2/d2        ···   cs/ds

Table 2.1: The Yule–Simpson effect occurs when
\[ \frac{\sum_j a_j}{\sum_j b_j} > \frac{\sum_j c_j}{\sum_j d_j}, \quad \text{yet} \quad \forall j: \ \frac{a_j}{b_j} < \frac{c_j}{d_j}. \]
3 The strongest correlate
To further demonstrate the practical nature of Theorem 1.1, we state here a
corollary, that asserts the dominance, or importance, of the strongest observed
correlate. An explicit application to the population-level study of the effects of
nutrition on disease is then discussed.
Corollary 3.1. For i ∉ J we have
\[ {}_{J}R < |r(x_i, y)| \implies \operatorname{sign}({}_{J,i}\hat\beta_i) = \operatorname{sign}(r(x_i, y)). \]
Knowledge of Corollary 3.1 can influence the decision making process of a
descriptive statistician who is charged with the task of summarizing an existing
data set. As an example consider the large data set associated with an ecological
study of mortality, biochemistry, diet and lifestyle that was carried out in rural
China in the 1980s and early 1990s [18]. County-level consumption rates for
various dietary variables were obtained along with county-level rates for heart
disease. Unadjusted, bivariate, sample correlations were computed, and the
results are displayed in Table 3.1.
One may be tempted to dismiss the strong, observed association between
wheat and heart disease as spurious, because it is conceivable that it would
vanish or even reverse after adjustment for confounding factors. However, in
light of Corollary 3.1, even in the presence of model uncertainty, one may quickly
surmise that the data likely do not indicate a protective effect of wheat on heart
disease.
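Written out with the values reported in Table 3.1, and assuming i denotes wheat and J indexes the remaining dietary variables, the application of Corollary 3.1 reads:

```latex
\[
{}_{J}R = 0.55 \;<\; 0.64 = |r(x_{\mathrm{wheat}}, y)|
\;\implies\;
\operatorname{sign}\bigl({}_{J,\mathrm{wheat}}\hat\beta_{\mathrm{wheat}}\bigr)
= \operatorname{sign}(0.64) = +1 .
\]
```

The conclusion concerns only adjustment for the variables indexed by J; variables outside the data set are not covered by this computation.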
4 Mathematics
In this section the mathematics that allows for a proof of Theorem 1.1 is presented. Notation is as close to “standard” as possible, while detailed enough for
clear development of the theory. There is a geometric flavor to the definitions
that is best embraced before moving on to the lemmas and propositions. A
solid understanding of the propositions leads to a thorough understanding of
the proof for the theorem.

Dietary variable    Observed correlation with heart disease
Cholesterol         -.15
Saturated Fat       -.18
Fish                -.21
Nuts                 .01
Salt                 .00
Spices               .33
Wheat                .64
Beans               -.33
Fruits              -.03
Vegetables          -.13

Table 3.1: Unadjusted, bivariate, sample correlation values for select variables;
with J indexing every explanatory variable except wheat, ${}_{J}R = 0.55$.
4.1 Notation
The existence of a general dataset as depicted in Table 4.1 is assumed. There
are n observations, each m-dimensional. Let I index a subset of {1, 2, ..., m}, J index
a disjoint subset, and K index a generic subset. Let i stand for a generic element
of I, j stand for a generic element of J, and k stand for a generic element of K.
Bold symbols indicate observed vectors of data within $\mathbb{R}^n$. Also, $\langle\cdot,\cdot\rangle$ is used
for the standard inner product, $|\cdot|$ for the associated Euclidean norm, and $\perp$
to indicate orthogonality.
With e denoting a vector of n ones, the vectors {e, x1, x2, ..., xm} are assumed to be linearly independent. The span of e and a subset of vectors indexed
by K is a vector subspace denoted by ${}_{K}V$. For every K, both $y \notin {}_{K}V$ and
$y \not\perp {}_{K}V$ are assumed.
In general, V stands for a vector subspace. Also, pre-subscripts indicate
a subset of explanatory variables, and a post-subscript typically indicates a
variable of interest.
4.2 Definitions
In this subsection K = {k1 , k2 , ..., kp }.
Definition 4.1. Denote the projection of y onto V with
\[ p_V(y) = \operatorname*{argmin}_{v \in V} |y - v|. \]
Definition 4.2. The vector of fitted coefficients, $({}_{K}\hat\beta_0, {}_{K}\hat\beta_{k_1}, {}_{K}\hat\beta_{k_2}, \ldots, {}_{K}\hat\beta_{k_p})$,
is the unique solution of
\[ p_{{}_{K}V}(y) = {}_{K}\hat\beta_0\, e + {}_{K}\hat\beta_{k_1} x_{k_1} + {}_{K}\hat\beta_{k_2} x_{k_2} + \cdots + {}_{K}\hat\beta_{k_p} x_{k_p}. \]

y     x1      x2      ···   xm
y1    x1,1    x2,1    ···   xm,1
y2    x1,2    x2,2    ···   xm,2
y3    x1,3    x2,3    ···   xm,3
⋮     ⋮       ⋮             ⋮
yn    x1,n    x2,n    ···   xm,n

Table 4.1: A sufficiently general data set that illustrates the notation
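Definition 4.3. ${}_{K}y$ is the function ${}_{K}y : \mathbb{R}^p \to \mathbb{R}$ given by
\[ {}_{K}y : (\alpha_{k_1}, \alpha_{k_2}, \ldots, \alpha_{k_p}) \mapsto {}_{K}\hat\beta_0 + {}_{K}\hat\beta_{k_1}\alpha_{k_1} + {}_{K}\hat\beta_{k_2}\alpha_{k_2} + \cdots + {}_{K}\hat\beta_{k_p}\alpha_{k_p}. \]
Definition 4.4. The qth fitted value is
\[ {}_{K}\hat y_q = {}_{K}y(x_{k_1,q}, x_{k_2,q}, \ldots, x_{k_p,q}). \]
Definition 4.5. The vector of fitted values is
\[ {}_{K}\hat y = ({}_{K}\hat y_1, {}_{K}\hat y_2, \ldots, {}_{K}\hat y_n). \]
Remark 4.1. Within $\mathbb{R}^n$,
\[ {}_{K}\hat y = p_{{}_{K}V}(y). \]
Definition 4.6. Define ${}_{K}R$ as the positive square root of the coefficient of
determination:
\[ {}_{K}R = +\sqrt{{}_{K}R^2} = +\sqrt{\frac{\sum_{q=1}^{n} ({}_{K}\hat y_q - \bar y)^2}{\sum_{q=1}^{n} (y_q - \bar y)^2}}. \]
Definition 4.7. For generic vectors x = (x1, x2, ..., xn) and y = (y1, y2, ..., yn),
and with s denoting the sample standard deviation, define the Pearson correlation coefficient r as
\[ r(x, y) = \frac{1}{n-1} \sum_{q=1}^{n} \left(\frac{x_q - \bar x}{s_x}\right) \left(\frac{y_q - \bar y}{s_y}\right). \]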
4.3 Geometry
The following lemmas are stated without proof, as they can be surmised to
be true or derived from the material in books on mathematical analysis (e.g.
Cheney’s text [19]). See the appendix for a proof of Proposition 4.1.
Lemma 4.1. For any y and for any V
\[ (y - p_V(y)) \perp V. \]
Lemma 4.2. For any y and for any V
\[ |p_V(y)|^2 + |y - p_V(y)|^2 = |y|^2. \]
Lemma 4.3. For any vectors x, y
\[ x \perp y \implies |x|^2 + |y|^2 = |x + y|^2. \]
Lemma 4.4. For $V_1 \perp V_2$ and $V = \operatorname{span}\{V_1, V_2\}$
\[ p_V(y) = p_{V_1}(y) + p_{V_2}(y). \]
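Although the lemmas are stated without proof, they are easy to confirm numerically on an example. The sketch below is only a sanity check on randomly generated data; the helper `project` and the subspaces chosen are assumptions for illustration.

```python
import numpy as np

def project(B, y):
    """Orthogonal projection of y onto the column span of B."""
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return B @ coef

rng = np.random.default_rng(1)
n = 12
y = rng.normal(size=n)
Q, _ = np.linalg.qr(rng.normal(size=(n, 5)))    # five orthonormal columns
V1, V2 = Q[:, :2], Q[:, 2:]                      # two mutually orthogonal subspaces
p = project(Q, y)                                # projection onto V = span{V1, V2}

lemma_4_1 = np.allclose(Q.T @ (y - p), 0)                      # residual is orthogonal to V
lemma_4_2 = np.isclose(p @ p + (y - p) @ (y - p), y @ y)       # Pythagorean identity
lemma_4_4 = np.allclose(p, project(V1, y) + project(V2, y))    # projections add
```

All three flags evaluate to true up to floating-point tolerance; Lemma 4.3 is the special case of the Pythagorean identity used inside the second check.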
Definition 4.8. For nonzero vectors $y \in \mathbb{R}^n$ and $v \in V$, define θ(y, v), with
0 ≤ θ ≤ π, via
\[ \cos(\theta) = \frac{\langle y, v \rangle}{|y||v|}. \]
Proposition 4.1. Let V be a vector subspace of $\mathbb{R}^n$. For a fixed vector $y \notin V$,
with $y \not\perp V$, and for a fixed, nonzero vector w ∈ V:
(i) If w has a component orthogonal to $p_V(y)$, then $\theta(y, p_V(y) + tw)$ is a
strictly increasing function of t > 0.
(ii) Otherwise, $\theta(y, p_V(y) + tw)$ is nondecreasing on $\{t : t > 0,\ p_V(y) + tw \neq 0\}$.
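Part (i) of Proposition 4.1 can be visualized numerically by moving away from the projection along a direction in V and tracking the angle. The following sketch makes several assumptions for illustration: a random three-dimensional subspace of $\mathbb{R}^{10}$, a random y, and a random w in V (which, being generic, has a component orthogonal to the projection).

```python
import numpy as np

def angle(y, v):
    """theta(y, v) in [0, pi], as in Definition 4.8."""
    c = (y @ v) / (np.linalg.norm(y) * np.linalg.norm(v))
    return np.arccos(np.clip(c, -1.0, 1.0))

rng = np.random.default_rng(2)
n = 10
B = rng.normal(size=(n, 3))                   # columns span a hypothetical subspace V
y = rng.normal(size=n)                        # generic y: not in V, not orthogonal to V
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
p = B @ coef                                  # p_V(y)
w = B @ rng.normal(size=3)                    # generic nonzero w in V

ts = np.linspace(0.05, 20.0, 400)
thetas = np.array([angle(y, p + t * w) for t in ts])
increasing = bool(np.all(np.diff(thetas) > 0))
```

The angle at every sampled t exceeds the angle at the projection itself, reflecting that the projection is the point of V making the smallest angle with y.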
4.4 Simplifications
Proofs of the propositions in this section are left to the reader.
Definition 4.9. A vector of data x is centered if x̄ = 0.
Definition 4.10. A vector of data x is geometrically standardized if x̄ = 0 and
|x| = 1.
Definition 4.11. Given a vector of data x we use the term standardization to
describe the process
\[ x \mapsto \frac{x - \bar x e}{|x - \bar x e|}. \]
Remark 4.2. Standardization results in geometrically standardized data.
Proposition 4.2. Standardization preserves the orthogonality of a set of centered vectors.
Proposition 4.3. For any K, standardization preserves the signs of $\{{}_{K}\hat\beta_k\}_{k \in K}$
and the value of ${}_{K}R$.
Proposition 4.4. For any K, if the data is geometrically standardized, then
${}_{K}\hat\beta_0 = 0$.
Proposition 4.5. For any K, if the data is geometrically standardized, then
\[ {}_{K}R = \cos(\theta(y, p_{{}_{K}V}(y))) = |p_{{}_{K}V}(y)|. \]
Proposition 4.6. For k = 1, 2, ..., m,
\[ {}_{k}R = |r(x_k, y)|. \]
Proposition 4.7. For any disjoint indexing sets I and J, and for any i ∈ I,
if I indexes orthogonal vectors of data, then $\operatorname{sign}({}_{J,I}\hat\beta_i) = \operatorname{sign}({}_{J,i}\hat\beta_i)$.
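Since the proofs of these propositions are left to the reader, a numerical check may help build intuition. The sketch below tests Propositions 4.3 through 4.6 on one randomly generated data set; the helper names `standardize` and `fit` and the simulated data are assumptions made for the example.

```python
import numpy as np

def standardize(x):
    """Definition 4.11: center x and scale it to unit Euclidean norm."""
    c = x - x.mean()
    return c / np.linalg.norm(c)

def fit(cols, y):
    """Return (coefficients, fitted values, R) for y on an intercept plus cols."""
    A = np.column_stack([np.ones(len(y)), cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    fitted = A @ beta
    R = np.sqrt(np.sum((fitted - y.mean())**2) / np.sum((y - y.mean())**2))
    return beta, fitted, R

rng = np.random.default_rng(3)
n = 25
X = rng.normal(size=(n, 3))                          # hypothetical explanatory data
y = X @ np.array([1.0, -0.5, 0.2]) + rng.normal(size=n)

Xs = np.column_stack([standardize(X[:, k]) for k in range(3)])
ys = standardize(y)

beta, _, R_raw = fit(X, y)
beta_s, p, R_std = fit(Xs, ys)                       # p is the projection of ys

prop_4_3 = bool(np.isclose(R_raw, R_std)
                and np.array_equal(np.sign(beta[1:]), np.sign(beta_s[1:])))
prop_4_4 = bool(np.isclose(beta_s[0], 0.0))
cos_theta = (ys @ p) / (np.linalg.norm(ys) * np.linalg.norm(p))
prop_4_5 = bool(np.isclose(R_std, cos_theta) and np.isclose(R_std, np.linalg.norm(p)))
prop_4_6 = all(
    np.isclose(fit(X[:, k], y)[2], abs(np.corrcoef(X[:, k], y)[0, 1]))
    for k in range(3)
)
```

Each flag is a check on a single example rather than a proof, but a failure would indicate a misreading of the corresponding statement.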
4.5 Proof of Theorem 1.1
By Proposition 4.3, geometrically standardized data can be assumed, and by
Proposition 4.2, orthogonality of the vectors indexed by I is retained.
Proposition 4.6 allows us to replace $|r(x_i, y)|$ with ${}_{i}R$, so the contrapositive of the implication from
Theorem 1.1 follows from the slightly stronger statement
\[ \operatorname{sign}({}_{J,I}\hat\beta_i) \neq \operatorname{sign}({}_{I}\hat\beta_i) \implies {}_{J}R > {}_{i}R. \]
By Proposition 4.7 it suffices to demonstrate
\[ \operatorname{sign}({}_{J,i}\hat\beta_i) \neq \operatorname{sign}({}_{i}\hat\beta_i) \implies {}_{J}R > {}_{i}R. \]
The hypothesis, $\operatorname{sign}({}_{J,i}\hat\beta_i) \neq \operatorname{sign}({}_{i}\hat\beta_i)$, implies that within ${}_{J,i}V$ the subspace ${}_{J}V$
separates $p_{{}_{J,i}V}(y)$ from $p_{{}_{i}V}(y)$.
Thus the straight line from $p_{{}_{J,i}V}(y)$ to $p_{{}_{i}V}(y)$ intersects ${}_{J}V$ at a point q.
Consider the two-stage path: from $p_{{}_{J}V}(y)$ to q within ${}_{J}V$, and then from
q to $p_{{}_{i}V}(y)$ within ${}_{J,i}V$, along two straight line segments. Using Proposition
4.1 we can conclude∗ that
\[ \theta(y, p_{{}_{J}V}(y)) \leq \theta(y, q) < \theta(y, p_{{}_{i}V}(y)). \]
Application of the cos function reverses the ordering, resulting in
\[ \cos(\theta(y, p_{{}_{J}V}(y))) \geq \cos(\theta(y, q)) > \cos(\theta(y, p_{{}_{i}V}(y))). \]
Proposition 4.5 allows us to substitute ${}_{J}R$ for $\cos(\theta(y, p_{{}_{J}V}(y)))$ and ${}_{i}R$ for
$\cos(\theta(y, p_{{}_{i}V}(y)))$, and the result is
\[ {}_{J}R > {}_{i}R. \]
∗ We have assumed in Section 4.1 that for any K, $y \not\perp {}_{K}V$, which implies, even for
geometrically standardized data, and again for any K, that ${}_{K}\hat\beta_i \neq 0$. Also, if $p_{{}_{i}V}(y) - p_{{}_{J,i}V}(y)$
is a scalar multiple of $p_{{}_{J,i}V}(y)$ then q = 0 and Proposition 4.4 ensures that $p_{{}_{J,i}V}(y)$
is a scalar multiple of $x_i$. This contradicts either ${}_{J,i}\hat\beta_i \neq 0$ or ${}_{J,i}\hat\beta_j \neq 0$ for j ∈ J. Thus
we conclude that $p_{{}_{i}V}(y) - p_{{}_{J,i}V}(y)$ is not a scalar multiple of $p_{{}_{J,i}V}(y)$, and we are justified
in using part (i) of Proposition 4.1 along the first segment. Finally, note that Proposition
4.1 applies along the second segment because the segment lies along a ray emanating from
$p_{{}_{J,i}V}(y)$.
Appendix: Proof of Proposition 4.1
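For part (ii), write $w = \alpha p_V(y)$ with $\alpha \neq 0$. It suffices to show that
\[ \cos(\theta) = \frac{\langle y, p_V(y) + t(\alpha p_V(y)) \rangle}{|y|\,|p_V(y) + t(\alpha p_V(y))|} \]
is nonincreasing on $\{t : t > 0,\ t \neq -1/\alpha\}$. For $\alpha > 0$,
\[ \frac{\langle y, p_V(y) + t(\alpha p_V(y)) \rangle}{|y|\,|p_V(y) + t(\alpha p_V(y))|} = \frac{(1+t\alpha)\,\langle y, p_V(y) \rangle}{(1+t\alpha)\,|y|\,|p_V(y)|} = \frac{\langle y, p_V(y) \rangle}{|y|\,|p_V(y)|}, \]
which is constant. For $\alpha < 0$, when $t < -1/\alpha$,
\[ \frac{\langle y, p_V(y) + t(\alpha p_V(y)) \rangle}{|y|\,|p_V(y) + t(\alpha p_V(y))|} = \frac{\langle y, (1+t\alpha) p_V(y) \rangle}{|y|\,|(1+t\alpha) p_V(y)|} = \frac{(1+t\alpha)\,\langle y, p_V(y) \rangle}{(1+t\alpha)\,|y|\,|p_V(y)|} = \frac{\langle y, p_V(y) \rangle}{|y|\,|p_V(y)|}, \]
which is constant, and when $t > -1/\alpha$,
\[ \frac{\langle y, p_V(y) + t(\alpha p_V(y)) \rangle}{|y|\,|p_V(y) + t(\alpha p_V(y))|} = \frac{\langle y, (1+t\alpha) p_V(y) \rangle}{|y|\,|(1+t\alpha) p_V(y)|} = \frac{(1+t\alpha)\,\langle y, p_V(y) \rangle}{-(1+t\alpha)\,|y|\,|p_V(y)|} = -\frac{\langle y, p_V(y) \rangle}{|y|\,|p_V(y)|}, \]
which is also constant. Furthermore,
\[ \frac{\langle y, p_V(y) \rangle}{|y|\,|p_V(y)|} \geq -\frac{\langle y, p_V(y) \rangle}{|y|\,|p_V(y)|} \]
because Lemma 4.2 states $|p_V(y)|^2 + |y - p_V(y)|^2 = |y|^2$, which expands to give
\[ \langle p_V(y), p_V(y) \rangle + \langle y, y \rangle - 2\langle y, p_V(y) \rangle + \langle p_V(y), p_V(y) \rangle = \langle y, y \rangle, \]
which implies $\langle y, p_V(y) \rangle \geq 0$.
For part (i), with $\alpha \in \mathbb{R}$, write $w = \alpha p_V(y) + u$, where $u \perp p_V(y)$. Then $\cos(\theta)$
becomes
\[ \frac{\langle y, p_V(y) + t(\alpha p_V(y) + u) \rangle}{|y|\,|p_V(y) + t(\alpha p_V(y) + u)|} = \frac{\langle y, (1+t\alpha) p_V(y) + tu \rangle}{|y|\,|(1+t\alpha) p_V(y) + tu|} = \frac{\langle y, (1+t\alpha) p_V(y) \rangle + \langle y, tu \rangle}{|y|\,|(1+t\alpha) p_V(y) + tu|}. \]
The $\langle y, tu \rangle$ term can be dropped since
\[ \langle y, tu \rangle = t\langle y, u \rangle = t\langle p_V(y) + (y - p_V(y)), u \rangle = t\langle p_V(y), u \rangle + t\langle y - p_V(y), u \rangle = 0 + 0, \]
where the final zero is due to Lemma 4.1. Thus, it suffices to show that
\[ \frac{\langle y, (1+t\alpha) p_V(y) \rangle}{|y|\,|(1+t\alpha) p_V(y) + tu|} \]
is decreasing for t > 0.
Lemma .5. For $(1 + t\alpha) \neq 0$, $t/(1 + t\alpha)$ is a strictly increasing function of t.
Proof.
\[ \frac{d}{dt}\,\frac{t}{1+t\alpha} = \frac{1\,(1+t\alpha) - \alpha t}{(1+t\alpha)^2} = \frac{1}{(1+t\alpha)^2} > 0. \]
For t such that $(1+t\alpha) > 0$,
\[ \frac{\langle y, (1+t\alpha) p_V(y) \rangle}{|y|\,|(1+t\alpha) p_V(y) + tu|} = \frac{\frac{1}{1+t\alpha}\,\langle y, (1+t\alpha) p_V(y) \rangle}{\frac{1}{1+t\alpha}\,|y|\,|(1+t\alpha) p_V(y) + tu|} = \frac{\langle y, p_V(y) \rangle}{|y|\,|p_V(y) + tu/(1+t\alpha)|}. \]
Note that $t/(1+t\alpha)$ is positive because t > 0 and $(1+t\alpha) > 0$, and note also that $t/(1+t\alpha)$ is increasing by Lemma .5. Thus, as a consequence of Lemma 4.3, $|p_V(y) + tu/(1+t\alpha)|$ is increasing in t, which implies that
\[ \frac{\langle y, p_V(y) \rangle}{|y|\,|p_V(y) + tu/(1+t\alpha)|} \]
is decreasing in t as desired.
For t such that $(1+t\alpha) < 0$,
\[ \frac{\langle y, (1+t\alpha) p_V(y) \rangle}{|y|\,|(1+t\alpha) p_V(y) + tu|} = \frac{\frac{1}{1+t\alpha}\,\langle y, (1+t\alpha) p_V(y) \rangle}{\frac{1}{1+t\alpha}\,|y|\,|(1+t\alpha) p_V(y) + tu|} = \frac{\langle y, p_V(y) \rangle}{-|y|\,|p_V(y) + tu/(1+t\alpha)|}. \]
Note that $t/(1+t\alpha)$ is negative because t > 0 and $(1+t\alpha) < 0$, and note also that $t/(1+t\alpha)$ is increasing by Lemma .5. Thus, as
a consequence of Lemma 4.3, $|p_V(y) + tu/(1+t\alpha)|$ is decreasing in t, so that
$-|y|\,|p_V(y) + tu/(1+t\alpha)|$ is increasing in t, which implies that
\[ \frac{\langle y, p_V(y) \rangle}{-|y|\,|p_V(y) + tu/(1+t\alpha)|} \]
is decreasing in t as desired.
For t such that $(1 + t\alpha) = 0$, note that $\alpha < 0$ so that $0 < t < -1/\alpha \iff (1+t\alpha) > 0$, $t = -1/\alpha \iff (1+t\alpha) = 0$, and $t > -1/\alpha \iff (1+t\alpha) < 0$. Note
also that since $y \notin V$ and $y \not\perp V$, Lemma 4.2 implies not only $\langle y, p_V(y) \rangle \geq 0$
as derived previously, but also the strict inequality $\langle y, p_V(y) \rangle > 0$. Thus
for $\{(t_1, t_2, t_3) : 0 < t_1 < t_2 = -1/\alpha < t_3 < \infty\}$,
\[ \frac{\langle y, (1+t_1\alpha) p_V(y) \rangle}{|y|\,|(1+t_1\alpha) p_V(y) + t_1 u|} = \frac{\langle y, p_V(y) \rangle}{|y|\,|p_V(y) + t_1 u/(1+t_1\alpha)|} > 0, \]
\[ \frac{\langle y, (1+t_2\alpha) p_V(y) \rangle}{|y|\,|(1+t_2\alpha) p_V(y) + t_2 u|} = 0, \]
and
\[ \frac{\langle y, (1+t_3\alpha) p_V(y) \rangle}{|y|\,|(1+t_3\alpha) p_V(y) + t_3 u|} = \frac{\langle y, p_V(y) \rangle}{-|y|\,|p_V(y) + t_3 u/(1+t_3\alpha)|} < 0. \]
This shows that
\[ \frac{\langle y, (1+t\alpha) p_V(y) \rangle}{|y|\,|(1+t\alpha) p_V(y) + tu|} \]
must be decreasing at any positive t satisfying $(1 + t\alpha) = 0$ as desired.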
References
[1] Chatfield C. Model Uncertainty, Data Mining and Statistical Inference.
Journal of the Royal Statistical Society: Series A 1995; A 158, Part 3, pp.
419–466.
[2] Kurth D, Sonis J. Assessment and Control of Confounding in Trauma Research. Journal of Traumatic Stress October 2007; Vol. 20, No. 5, pp.
807–820.
[3] Cornfield et al. Smoking and lung cancer: Recent evidence and a discussion
of some questions. Journal of the National Cancer Institute 1959; 22, 173–203.
[4] Wagner CH. Simpson’s Paradox in Real Life. The American Statistician
February 1982; 36 (1): 46–48. doi:10.2307/2684093.
[5] McNamee R. Regression modeling and other methods to control confounding. Occupational & Environmental Medicine 2004; 62:500-506
doi:10.1136/oem.2002.001115.
[6] Lu CY. Observational studies: a review of study designs, challenges and
strategies to reduce confounding. The International Journal of Clinical
Practice May 2009; Blackwell Publishing Ltd. 63 5 691-697.
[7] Rosenbaum PR, Rubin DB. Assessing sensitivity to an unobserved binary
covariate in an observational study with binary outcome. Journal of the
Royal Statistical Society, Series B 1983; 11, 212-218.
[8] Lin DY, Psaty BM, Kronmal RA. Assessing the Sensitivity of Regression
Results to Unmeasured Confounders in Observational Studies. Biometrics
September 1998; 54, 948-963.
[9] Hosman CA, Hansen BB, Holland PW. The sensitivity of linear regression
coefficients’ confidence limits to the omission of a confounder. The Annals
of Applied Statistics 2010; Vol. 4, No. 2, 849-870.
[10] Davis et al. Rice Consumption and Urinary Arsenic Concentrations in U.S.
Children. Environmental Health Perspectives. October 2012; Vol.120, Issue
10, p1418-1424.
[11] Jungert et al. Serum 25-hydroxyvitamin D3 and body composition in an elderly cohort from Germany:
a cross-sectional study. Nutrition & Metabolism 2012; 9:42.
http://www.nutritionandmetabolism.com/content/9/1/42
[12] Nelson et al. Daily physical activity predicts degree of insulin resistance:
a cross-sectional observational study using the 2003–2004
National Health and Nutrition Examination Survey. International
Journal of Behavioral Nutrition and Physical Activity 2013; 10:10
http://www.ijbnpa.org/content/10/1/10.
[13] Lignell et al. Prenatal exposure to polychlorinated biphenyls and polybrominated diphenyl ethers may influence birth weight among infants in a
Swedish cohort with background exposure: a cross-sectional study. Environmental Health 2013; 12:44 http://www.ehjournal.net/content/12/1/44.
[14] Cervellati et al. Bone mass density selectively correlates with serum markers
of oxidative damage in post-menopausal women. Clinical Chemistry and
Laboratory Medicine 2012. Volume 51, Issue 2, Pages 333-338.
[15] Julious SA, Mullee MA. Confounding and Simpson’s paradox. British Medical Journal 12/03/1994; 309 (6967): 1480–1481.
[16] Bickel PJ, Hammel EA, O’Connell JW. Sex Bias in Graduate Admissions:
Data From Berkeley. Science 1975; 187 (4175): 398–404.
[17] Appleton DR, French JM, and Vanderpump M. Ignoring a Covariate: an
Example of Simpson’s Paradox. The American Statistician 1996; Volume
50, Issue 4, 340-341.
[18] Chen et al. Geographic study of mortality, biochemistry, diet and lifestyle in
rural China. 1975; Retrieved August 21, 2003, from University of Oxford’s
Epidemiological Studies Unit website; http://www.ctsu.ox.ac.uk/ china/monograph/.
[19] Cheney W. Analysis for Applied Mathematics. Springer: New York, Berlin,
Heidelberg, Barcelona, Hong Kong, London, Milan, Paris, Singapore,
Tokyo, 2001.