ATLA 31, Supplement 1, 65–75, 2003
The Role of Control Groups in Mutagenicity Studies:
Matching Biological and Statistical Relevance¹
Dieter Hauschke,² Torsten Hothorn³ and Juliane Schäfer⁴
²Department of Biometry, ALTANA Pharma, 78467 Konstanz, Germany; ³Department of Medical
Informatics, Biometry and Epidemiology, University of Erlangen-Nuremberg, 91054 Erlangen, Germany;
⁴Department of Statistics, University of Munich, 80539 Munich, Germany
Summary — The statistical test of the conventional hypothesis of “no treatment effect” is commonly used
in the evaluation of mutagenicity experiments. Failing to reject the hypothesis often leads to the conclusion in favour of safety. The major drawback of this indirect approach is that what is controlled by a prespecified level α is the probability of erroneously concluding hazard (producer risk). However, the primary
concern of safety assessment is the control of the consumer risk, i.e. limiting the probability of erroneously
concluding that a product is safe. In order to restrict this risk, safety has to be formulated as the alternative, and hazard, i.e. the opposite, has to be formulated as the hypothesis. The direct safety approach is
examined for the case when the corresponding threshold value is expressed either as a fraction of the population mean for the negative control, or as a fraction of the difference between the positive and negative
controls.
Key words: biological relevance, Fieller confidence intervals, mutagenicity studies, test on equivalence.
Introduction
Before the administration of the first dose of a new compound to a human subject, a safety assessment has to be performed in mutagenicity studies. Statistical analysis plays a fundamental part in the interpretation of the data from the corresponding experiments. Usually, the conventional null hypothesis of “no difference in the effect” between the treatment and negative control group is tested. Failing to reject the null hypothesis often leads to the conclusion that the compound has no deleterious effect in the biological model concerned. The major drawback of this indirect procedure is that what is controlled by the pre-specified significance level is the probability of erroneously concluding hazard (producer risk). However, the primary concern of safety assessment is the control of consumer risk, i.e. limiting the probability of erroneously concluding that a product is safe. Thus, the adequate test problem should be formulated by reversing the null hypothesis and the alternative, and incorporating a threshold value defined a priori. A solution is derived for this problem for the case of normally distributed random variables, when the threshold is expressed either as a fraction of the population mean for the negative control, or as a fraction of the difference between the positive and negative controls.
¹In this paper, the terms “proof of hazard/safety” should be interpreted in a statistical sense for the underlying experimental conditions.
Statistical Proof of Hazard
The following one-way layout represents a typical experimental design as used in genotoxicity assessment:
{Negative control, Dose1, Dose2, ..., Dosek, Positive control}.
One objective of the analysis is to identify the no-observed adverse effect dose (NOAED), that is the highest experimental dose with no statistically significant increase in the safety endpoint relative to the negative control. The inclusion of a positive control with known mutagenic potential allows a check to be made on the sensitivity of the test system. Let Xij denote the observation of the primary endpoint for the jth experimental unit in the ith dose group Di (i = 0 denotes the negative control and i = k + 1 the positive control, respectively). It is assumed that these random variables are mutually independent and normally distributed with location parameters µi and an unknown but common variance σ². Without loss of generality, it is assumed that the population means are positive, and that it is known a priori that, if there is a critical response to the substance, it will increase in magnitude, that is µi > µ0. Assuming that the mean response is a non-decreasing function of the dose level, i.e. µ0 ≤ µ1 ≤ µ2 ... ≤ µk, the conventional approach (proof of hazard) can be performed by the following sequential procedure (1), starting with an assessment of assay sensitivity:
H0: µk+1 – µ0 ≤ 0 (no assay sensitivity)
H1: µk+1 – µ0 > 0 (assay sensitivity).
A comparison between the doses and the negative control is only performed if H0 was rejected at level α in favour of H1 (assay sensitivity with respect to the negative control) according to Student’s t test. Starting with all doses in the next step, the hypothesis
H0k: µ0 = µ1 = µ2 = ... = µk
H1k: µ0 ≤ µ1 ≤ µ2 ≤ ... ≤ µk and µ0 < µk
is tested by applying a trend test, e.g. Bartholomew’s (2). For a more-detailed discussion of other trend tests for the statistical analysis of monotone dose–response relationships in mutagenicity assays, see also Hothorn et al. (3). If H0k is rejected at level α, the test is repeated without the highest dose:
H0k–1: µ0 = µ1 = µ2 = ... = µk–1
H1k–1: µ0 ≤ µ1 ≤ µ2 ≤ ... ≤ µk–1 and µ0 < µk–1.
In the case of a non-significant result (p value > α), the procedure stops. In general, H0i, i = k, ..., 1, is tested at level α if, and only if, all H0l, l = i + 1, ..., k, have been rejected at level α. Hence, the NOAED is the highest dose Di for which H0i was not rejected. Based on the closed testing procedure, Maurer et al. (4) have shown that this a priori ordered test hierarchy controls the family-wise error, i.e. the error over all tested hypotheses.
Obviously, the NOAED represents a statistical no-effect dose that depends on the power of the study. Hence, a less-sensitive mutagenicity experiment with a small sample size results in higher safe doses than the corresponding study with a larger sample size and lower variability, which is exactly the opposite of what is desired. On the other hand, a significant statistical result could provide evidence for the conclusion that there is a mutagenic effect of the treatment. However, even good laboratory practice with a large sample size and little experimental variation may lead to the problem that an unimportant difference will be statistically significant (5).
The classical approach therefore often leads to the problem that statistical significance does not necessarily mean biological relevance, and that statistical non-significance does not necessarily correspond to biological irrelevance (6). The major reason for these difficulties involves the choice of the null hypothesis and the alternative. In statistical hypothesis testing, the null and alternative hypotheses are not treated equally, and this results in an inherent imbalance. The likelihood of the alternative is demonstrated by measuring the strength of evidence against the null hypothesis. A way of directly concluding that a substance has no harmful effect is the proof of sufficient safety. This requires that the test problem should be formulated by reversing the null hypothesis and the alternative, and incorporating a threshold that quantifies the maximum tolerable increase of risk relative to the control.
In the next section, the test procedure for the direct approach is derived, by assuming that the threshold is expressed either as a fraction of the population mean for the negative control, or as a fraction of the difference between positive and negative controls. Recently, these two definitions of a threshold value were also used in the validation of an internal standard in comet assay analysis (7). The first definition is also implicitly applied in the assessment of a potential mutagenic effect of a substance by the Ames assay. In that assay, one decides in favour of mutagenicity if at least two doses produce a result more than two-fold the spontaneous background.
Statistical Proof of Safety
Regulatory requirements for new drug development
allow the sponsor to proceed along the lines indicated by the fundamental assumptions that:
a) drugs are considered non-efficacious until proven
otherwise; and b) drugs are considered sufficiently
safe until proven otherwise. Therefore, classical
statistical testing directly controls the consumer
risk for demonstrating efficacy, but only the producer risk for demonstrating sufficient safety.
However, it is intuitively clear that the consumer risk should always be of primary concern.
Therefore, the adequate test problem for mutagenic studies is formulated for the two-sample
design as follows, providing consistency of the consumer risk for approval based on efficacy as well as
on safety:
H0i : µi – µ0 ≥ δ (dose Di is hazardous under test
conditions)
H1i : µi – µ0 < δ (dose Di is safe under test conditions),
where (–∞, δ), δ > 0, denotes the safety range.
Inherently, it is necessary to define a priori a minimally relevant safety threshold δ. This means that
an increase of the safety endpoint up to δ is still
acceptable.
Hothorn & Hauschke (8) applied this concept for
the one-way layout with k increasing doses. Instead
of using the term NOAED, the authors introduced
the definition of maximum safe dose (MAXSD) as
follows:
MAXSD = Di, where i = max(i: µj – µ0 < δ, j = 1,...,i).
It should be noted that this definition assumes only
that all doses lower than MAXSD must also be safe.
Hothorn & Hauschke (8) described the following
sequentially rejecting procedure, controlling the
family-wise error for the determination of the highest safe dose. Starting with the lowest dose, the
shifted hypothesis
H01 : µ1 – µ0 ≥ δ (dose D1 is hazardous under test
conditions)
H11 : µ1 – µ0 < δ (dose D1 is safe under test conditions)
is tested by the two-sample t test. The procedure
stops if H01 is not rejected and hence D1 could not be
proven to be safe. If H01 is rejected at level α, the
problem
H02 : µ2 – µ0 ≥ δ (dose D2 is hazardous under test
conditions)
H12 : µ2 – µ0 < δ (dose D2 is safe under test conditions)
is tested. Again, in the case of a non-significant
result, the procedure stops. In general, H0i is tested
at level α, if, and only if, all H0l have been rejected
at level α, l < i, i = 1, ..., k. The MAXSD is the highest dose Di, i = 1, ..., k, for which the shifted null
hypothesis H0i : µi – µ0 ≥ δ was rejected in favour of
H1i : µi – µ0 < δ (safety), that is:
$$ t_i = \frac{\bar{X}_i - \bar{X}_0 - \delta}{S \sqrt{\frac{1}{n_0} + \frac{1}{n_i}}} \le -t_{\alpha,\, n_0+n_i-2}, $$
where tα,ν is the (1 – α) percentile of the central t-distribution with ν degrees of freedom, X̄i and X̄0 denote the sample means of dose Di and the negative control, n0 and ni are the corresponding sample sizes and S² the pooled estimator of σ²:
$$ S^2 = \frac{\sum_{j=1}^{n_0} (X_{0j} - \bar{X}_0)^2 + \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2}{n_0 + n_i - 2}. $$
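The shifted two-sample t test above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the values of δ are made up for the demonstration, and the critical value t(0.05, 10) = 1.8125 is taken from a standard t table rather than computed. The data are the negative control and the 30mg/kg group from Table 1.

```python
import math

def pooled_variance(control, dose):
    """Pooled estimator S^2 of the common variance (two-sample case)."""
    n0, ni = len(control), len(dose)
    m0 = sum(control) / n0
    mi = sum(dose) / ni
    ss = sum((x - m0) ** 2 for x in control) + sum((x - mi) ** 2 for x in dose)
    return ss / (n0 + ni - 2)

def shifted_t_statistic(control, dose, delta):
    """t_i for H0i: mu_i - mu_0 >= delta versus H1i: mu_i - mu_0 < delta."""
    n0, ni = len(control), len(dose)
    m0 = sum(control) / n0
    mi = sum(dose) / ni
    s = math.sqrt(pooled_variance(control, dose))
    return (mi - m0 - delta) / (s * math.sqrt(1.0 / n0 + 1.0 / ni))

def is_proven_safe(control, dose, delta, t_crit):
    """Reject H0i (conclude safety) if t_i <= -t_crit."""
    return shifted_t_statistic(control, dose, delta) <= -t_crit

# Data from Table 1; delta values are hypothetical, chosen only for illustration.
control = [3, 2, 2, 3, 2, 5, 1]   # negative control
dose_30 = [5, 4, 4, 4, 2]         # 30mg/kg hydroquinone
T_CRIT_10 = 1.8125                # tabulated t quantile, alpha = 0.05, 10 df

for delta in (2.0, 3.0):
    t = shifted_t_statistic(control, dose_30, delta)
    print(delta, round(t, 3), is_proven_safe(control, dose_30, delta, T_CRIT_10))
```

With these illustrative thresholds, safety is concluded for δ = 3 but not for δ = 2, which shows how directly the decision depends on the a priori choice of δ.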
In practice, there is often a reluctance to define δ a
priori. If δ can only be specified a posteriori, the
above stepwise procedure should be based on the
classical confidence intervals, i.e. concluding safety
of dose Di if the one-sided 100(1 – α)% confidence
interval for µi – µ0 is included in the safety range:
$$ \left(-\infty,\; \bar{X}_i - \bar{X}_0 + t_{\alpha,\, n_0+n_i-2}\, S \sqrt{\frac{1}{n_0} + \frac{1}{n_i}}\right) \subset (-\infty, \delta). $$
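The stepwise determination of the MAXSD from upper confidence limits might be sketched as follows. The function name and the numerical limits are hypothetical; the sketch assumes only what is stated above, namely that doses are examined in increasing order and that the procedure stops at the first dose whose one-sided interval is not contained in the safety range.

```python
def max_safe_dose(upper_limits, threshold):
    """Return the 1-based index of the MAXSD, or 0 if no dose is proven safe.

    upper_limits[i - 1] is the one-sided upper confidence limit for dose D_i,
    with doses ordered from lowest to highest. The interval (-inf, U) lies
    inside the safety range (-inf, threshold) exactly when U <= threshold.
    """
    maxsd = 0
    for i, upper in enumerate(upper_limits, start=1):
        if upper > threshold:
            break          # D_i cannot be proven safe: stop, test no higher dose
        maxsd = i
    return maxsd

# Hypothetical upper limits for mu_i - mu_0 at four increasing doses.
hypothetical_limits = [1.0, 2.5, 4.0, 1.5]
print(max_safe_dose(hypothetical_limits, threshold=3.0))
```

In this made-up example the fourth limit is below the threshold, but the fourth dose is never tested because the procedure has already stopped at the third dose; this ordering is what keeps the family-wise error at level α.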
Specification of δ requires the statisticians and genetic toxicologists to think about what constitutes a minimally relevant difference; ideally, this should happen at the planning stage of the experiment, and at the latest after the statistical analysis, when point estimates and confidence intervals have been calculated and the results are to be discussed.
A more common situation in practice is that the
value δ is expressed as a proportion of the unknown
population mean µ0 of the negative control.
Suppose that δ = ƒµ0, ƒ > 0, then the foregoing test
problem can be formulated as:
H0i: µi – µ0 ≥ ƒµ0
H1i: µi – µ0 < ƒµ0
which can be restated as:
H0i: µi/µ0 ≥ 1 + ƒ
H1i: µi/µ0 < 1 + ƒ
where (–∞, 1 + ƒ) is the corresponding safety interval for the ratio of µi and µ0. By analogy, the maximum safe dose is defined as:
MAXSD = Di, where i = max(i: µj/µ0 < 1 + ƒ, j = 1,...,i).
Sasabuchi (9) demonstrated that the size-α likelihood ratio test rejects the null hypothesis
H0i, i = 1, ..., k, concerning the ratio of the two
means, if:
$$ t_i = \frac{\bar{X}_i - (1 + f)\,\bar{X}_0}{S \sqrt{\frac{1}{n_i} + \frac{(1 + f)^2}{n_0}}} \le -t_{\alpha,\, n_0+n_i-2}. $$
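As a rough numerical illustration of the ratio test, the following Python sketch evaluates the statistic for the negative control and the 30mg/kg group of Table 1. The function name is hypothetical, the two values of ƒ are chosen only for illustration (ƒ = 1 corresponds to a two-fold threshold), and the critical value is again the tabulated t(0.05, 10) = 1.8125.

```python
import math

def ratio_t_statistic(control, dose, f):
    """t_i for H0i: mu_i/mu_0 >= 1 + f versus H1i: mu_i/mu_0 < 1 + f."""
    n0, ni = len(control), len(dose)
    m0 = sum(control) / n0
    mi = sum(dose) / ni
    ss = sum((x - m0) ** 2 for x in control) + sum((x - mi) ** 2 for x in dose)
    s = math.sqrt(ss / (n0 + ni - 2))            # pooled standard deviation
    se = s * math.sqrt(1.0 / ni + (1.0 + f) ** 2 / n0)
    return (mi - (1.0 + f) * m0) / se

control = [3, 2, 2, 3, 2, 5, 1]   # negative control, Table 1
dose_30 = [5, 4, 4, 4, 2]         # 30mg/kg, Table 1
T_CRIT_10 = 1.8125                # tabulated t quantile, alpha = 0.05, 10 df

for f in (1.0, 1.5):              # hypothetical thresholds 1 + f = 2 and 2.5
    t = ratio_t_statistic(control, dose_30, f)
    print(f, round(t, 3), t <= -T_CRIT_10)
```

For these data the hypothesis of hazard is rejected at the illustrative threshold 1 + ƒ = 2.5 but not at the two-fold threshold 1 + ƒ = 2.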
Hauschke et al. (10) have shown that the condition ti ≤ –tα,n0+ni–2 is equivalent to: θui ≤ 1 + ƒ and X̄0² > a0, where
$$ \theta_{ui} = \frac{\bar{X}_0 \bar{X}_i + \sqrt{a_0 \bar{X}_i^2 + a_i \bar{X}_0^2 - a_0 a_i}}{\bar{X}_0^2 - a_0}, \qquad a_0 = \frac{S^2}{n_0}\, t^2_{\alpha,\, n_0+n_i-2}, \qquad a_i = \frac{S^2}{n_i}\, t^2_{\alpha,\, n_0+n_i-2}. $$
It should be noted that the one-sided 100(1 – α)% confidence interval (–∞, θui) for µi/µ0 is a special case of the more-general confidence interval according to Fieller (11). This is because the estimators X̄i and X̄0 are uncorrelated.
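The upper Fieller limit θui can be evaluated directly from its definition. The Python sketch below is a minimal illustration, not the authors' S function: the function name is hypothetical and the t quantile is supplied by the caller (here the tabulated value 1.8125 for α = 0.05 and 10 degrees of freedom). The data are again the negative control and the 30mg/kg group of Table 1.

```python
import math

def fieller_upper_limit(control, dose, t_crit):
    """Upper limit of the one-sided Fieller interval for mu_i / mu_0."""
    n0, ni = len(control), len(dose)
    m0 = sum(control) / n0
    mi = sum(dose) / ni
    ss = sum((x - m0) ** 2 for x in control) + sum((x - mi) ** 2 for x in dose)
    s2 = ss / (n0 + ni - 2)                  # pooled variance S^2
    a0 = s2 / n0 * t_crit ** 2
    ai = s2 / ni * t_crit ** 2
    if m0 ** 2 <= a0:
        # The control mean is not significantly different from zero,
        # so the interval for the ratio is not bounded.
        raise ValueError("control mean squared must exceed a0")
    root = math.sqrt(a0 * mi ** 2 + ai * m0 ** 2 - a0 * ai)
    return (m0 * mi + root) / (m0 ** 2 - a0)

control = [3, 2, 2, 3, 2, 5, 1]   # negative control, Table 1
dose_30 = [5, 4, 4, 4, 2]         # 30mg/kg, Table 1
upper = fieller_upper_limit(control, dose_30, t_crit=1.8125)
print(round(upper, 2))            # approximately 2.31
```

The limit of about 2.31 lies above 2 and below 2.5, so the dose would be declared safe for a threshold 1 + ƒ = 2.5 but not for a two-fold threshold; both thresholds are used here only as examples.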
Therefore, the corresponding sequentially rejecting procedure based on either corresponding tests or confidence intervals can be easily applied to the situation where the parameter of interest is expressed as a ratio of location parameters. The corresponding threshold value for the difference µi – µ0 is δ = ƒµ0, which is equivalent to the condition that a dose Di is considered safe if µi < µ0 + δ = (1 + ƒ)µ0. This is illustrated in Figure 1.
Obviously, this critical threshold value should be based on biological relevance rather than on statistical reasoning, thus taking into account corresponding safety considerations. Using the definition δ = ƒµ0 is one possible way of relating statistical and biological relevance. Another approach is to incorporate the difference between the positive and the negative controls, i.e. ∆ = µk+1 – µ0. A dose Di is considered safe if the corresponding mean is not greater than the mean of the control plus a fraction of the difference between positive and negative control, that is µi < µ0 + δ = µ0 + ƒ(µk+1 – µ0), ƒ > 0 (see Figure 1). This formulation is equivalent to the condition that the threshold value for the difference µi – µ0 is δ = ƒ∆ = ƒ(µk+1 – µ0). The corresponding formulation of the test problem leads to the following one-sided shifted null hypotheses:
H0i: µi – µ0 ≥ ƒ(µk+1 – µ0)
H1i: µi – µ0 < ƒ(µk+1 – µ0)
which can be restated as:
H0i: (µi – µ0)/(µk+1 – µ0) ≥ ƒ
H1i: (µi – µ0)/(µk+1 – µ0) < ƒ.
Schäfer (12) has shown that H0i can be rejected, if:
$$ t_i = \frac{\bar{X}_i - f \bar{X}_{k+1} - (1 - f)\,\bar{X}_0}{S \sqrt{\frac{1}{n_i} + \frac{f^2}{n_{k+1}} + \frac{(1 - f)^2}{n_0}}} \le -t_{\alpha,\, n_0+n_{k+1}+n_i-3}, $$
which is equivalent to: θui ≤ ƒ and 0 ≤ Gi < 1, where:
$$ \theta_{ui} = \frac{1}{1 - G_i} \left[ R_i - G_i \frac{c_0}{c_{k+1}} + \frac{t_{\alpha,\, n_0+n_{k+1}+n_i-3}\, S}{Z} \sqrt{c_i (1 - G_i) - 2 c_0 R_i + c_{k+1} R_i^2 + \frac{c_0^2}{c_{k+1}} G_i} \right], $$
$$ Z = \bar{X}_{k+1} - \bar{X}_0, \qquad R_i = \frac{\bar{X}_i - \bar{X}_0}{\bar{X}_{k+1} - \bar{X}_0}, \qquad G_i = \frac{t^2_{\alpha,\, n_0+n_{k+1}+n_i-3}\, S^2 c_{k+1}}{Z^2}, $$
$$ c_0 = \frac{1}{n_0}, \qquad c_i = \frac{1}{n_i} + \frac{1}{n_0}, \qquad c_{k+1} = \frac{1}{n_{k+1}} + \frac{1}{n_0}. $$
In this situation, the estimators X̄i – X̄0 and X̄k+1 – X̄0 are correlated, so the calculation of the one-sided 100(1 – α)% confidence interval (–∞, θui) for (µi – µ0)/(µk+1 – µ0) must be based on Fieller’s method (11).
Analogously, the corresponding sequential rejecting procedure based on either corresponding tests or confidence intervals can be used. However, because the threshold value is defined as a fraction of the difference µk+1 – µ0, the selection of a suitable dose of the positive control is of outstanding importance. Increasing the difference by using a high dose of the positive control implies that the derived threshold might not be considered as a minimum acceptable increase in the safety endpoint. Thus, doses should not be so high that excessive responses are observed (13).
Figure 1: Graphical interpretation of the threshold
[Figure not reproduced. Labels: µ0, µ0 + δ and µk+1 on the response axis; ∆ = µk+1 – µ0; regions H1i and H0i on either side of the threshold µ0 + δ.]
An Example
Adler & Kliesch (14) published raw data from a micronucleus mutagenicity assay on hydroquinone. The results for male mice at the 24-hour sampling time are given in Table 1.
Table 1: Number of micronuclei per animal and 2000 scored cells for the negative control, four doses of hydroquinone and the positive control cyclophosphamide
Treatment group              Number of micronuclei/2000 cells
Negative control             3, 2, 2, 3, 2, 5, 1
30mg/kg                      5, 4, 4, 4, 2
50mg/kg                      7, 4, 6, 8, 6
75mg/kg                      9, 18, 13, 12, 18
100mg/kg                     22, 13, 23, 22, 20
25mg/kg cyclophosphamide     33, 15, 32, 20
Table 2 provides the summary statistics for the negative control, the four dose levels of hydroquinone, and for the positive control, cyclophosphamide. Additionally, the corresponding upper 95% confidence limits for µi/µ0 and (µi – µ0)/(µk+1 – µ0), i = 1, ..., 4, are given, where the latter can be interpreted as the percentage of the mutagenic potency of positive minus negative control.
Table 2: Sample means, sizes for the number of micronuclei and upper 95% confidence limits for µi/µ0 and (µi – µ0)/(µk+1 – µ0), i = 1, ..., 4
Treatment group                             Sample mean   Sample size   Upper limit for µi/µ0   Upper limit for (µi – µ0)/(µk+1 – µ0)
i = 0: Negative control                     2.57          7             -                       -
i = 1: 30mg/kg                              3.80          5             2.31                    0.24
i = 2: 50mg/kg                              6.20          5             3.88                    0.35
i = 3: 75mg/kg                              14.0          5             19.09                   0.74
i = k = 4: 100mg/kg                         20.0          5             29.20                   1.04
i = k + 1 = 5: 25mg/kg cyclophosphamide     25.0          4             -                       -
Obviously, safety cannot be concluded for the doses 50, 75 and 100mg/kg because they show an unacceptable increase relative to both the negative control and the difference between the positive and negative control. The low dose 30mg/kg shows only a slight increase, which might be regarded as biologically unimportant, and could therefore be considered as the MAXSD.
S functions for both standard Fieller confidence intervals (two-sample design) and correlated Fieller confidence intervals (many-to-one design) are given in Appendices 1 and 2. The output for the above example is in Appendix 3. All three files can be downloaded from http://www.bioinf.uni-hannover.de/INVITROSTAT.
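The upper limits for (µi – µ0)/(µk+1 – µ0) in Table 2 can be approximately reproduced with a short Python sketch of the correlated Fieller formula. This is an illustration only: the function and variable names are hypothetical, the t quantile 1.7709 (α = 0.05, 13 degrees of freedom) is a tabulated value supplied by hand, and the threshold ƒ = 0.25 used at the end is an arbitrary example, not a value proposed in this paper.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def sum_sq(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

def correlated_fieller_upper(neg, pos, dose, t_crit):
    """Upper limit for (mu_i - mu_0) / (mu_{k+1} - mu_0), three-group pooling."""
    n0, nk1, ni = len(neg), len(pos), len(dose)
    s2 = (sum_sq(neg) + sum_sq(pos) + sum_sq(dose)) / (n0 + nk1 + ni - 3)
    z = mean(pos) - mean(neg)
    r = (mean(dose) - mean(neg)) / z
    c0, ci, ck1 = 1 / n0, 1 / ni + 1 / n0, 1 / nk1 + 1 / n0
    g = t_crit ** 2 * s2 * ck1 / z ** 2
    if not 0 <= g < 1:
        raise ValueError("G must lie in [0, 1) for a bounded interval")
    root = math.sqrt(ci * (1 - g) - 2 * c0 * r + ck1 * r ** 2 + c0 ** 2 / ck1 * g)
    return (r - g * c0 / ck1 + t_crit * math.sqrt(s2) / z * root) / (1 - g)

neg = [3, 2, 2, 3, 2, 5, 1]                 # negative control (Table 1)
pos = [33, 15, 32, 20]                      # 25mg/kg cyclophosphamide
doses = [[5, 4, 4, 4, 2], [7, 4, 6, 8, 6],
         [9, 18, 13, 12, 18], [22, 13, 23, 22, 20]]
T_CRIT_13 = 1.7709                          # tabulated t quantile, 13 df

uppers = [correlated_fieller_upper(neg, pos, d, T_CRIT_13) for d in doses]
print([round(u, 2) for u in uppers])

# Step-up decision with a hypothetical threshold f = 0.25.
f_hypothetical = 0.25
maxsd = 0
for i, u in enumerate(uppers, start=1):
    if u > f_hypothetical:
        break
    maxsd = i
print(maxsd)
```

Under the illustrative threshold ƒ = 0.25, only the 30mg/kg dose passes, which agrees with the qualitative reading of Table 2 given above.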
Conclusions
The consumer risk, i.e. the probability of
erroneously concluding safety, is not controlled by
the classical testing approach (proof of hazard).
Furthermore, it often leads to the problem of
inequivalence between statistical significance and
biological relevance. One major reason for this logical difficulty is clearly described by Fisher (15): “ . . .
the null hypothesis is never proved or established,
but is possibly disproved in the course of experimentation. Every experiment may be said to exist
only in order to give the facts a chance of disproving
the null hypothesis.”
Thus, the adequate test problem should be formulated as a proof of safety by reversing the role of
the null hypothesis and the alternative, and incorporating a threshold value. This report is concerned
with the safety approach when the threshold value
is expressed either relative to the negative control
mean, or as a fraction of the difference between positive and negative controls. It should be noted that
the statistical methodology was developed for a normally distributed endpoint with common variance.
Further research has to be done for the issue of violation of these assumptions, e.g. assuming variance
heterogeneity and/or non-normal distribution. Of
course, this approach is suitable, not only for mutagenicity studies, but also for every toxicological
problem related to safety.
Acknowledgement
This paper was partly sponsored by ECVAM (via
EC contract number 17159-2000-11F1ED ISP
DE).
References
1. Hothorn, L.A. (1995). Biostatistical analysis of the control vs k treatments design including a positive control group. In Biometrie in der chemisch-pharmazeutischen Industrie (ed. J. Vollmar), pp. 19–26. Stuttgart, Germany: Gustav Fischer Verlag.
2. Bartholomew, D.J. (1961). Ordered tests in the analysis of variance. Biometrika 48, 325–332.
3. Hothorn, L.A., Hayashi, M. & Seidel, D. (2000). Dose–response relationship in mutagenicity assays including an appropriate positive control group: a multiple testing approach. Environmental and Ecological Statistics 7, 27–42.
4. Maurer, W., Hothorn, L.A. & Lehmacher, W. (1995). Multiple comparisons in drug clinical and preclinical assays: a priori ordered hypotheses. In Biometrie in der chemisch-pharmazeutischen Industrie (ed. J. Vollmar), pp. 3–18. Stuttgart, Germany: Gustav Fischer Verlag.
5. Hauschke, D., Hayashi, M., Lin, K.K., Lovell, D.P., Robinson, W.D. & Yoshimura, I. (1997). Recommendations for biostatistics of mutagenicity studies. Drug Information Journal 31, 323–326.
6. Hauschke, D. & Hothorn, L.A. (1998). Safety assessment in toxicological studies: proof of safety versus proof of hazard. In Design and Analysis of Animal Studies in Pharmaceutical Development (ed. S-C. Chow & J-P. Liu), pp. 197–225. New York, NY, USA: Marcel Dekker.
7. De Boeck, M., Touil, N., De Visscher, G., Vande, P.A. & Kirsch-Volders, M. (2000). Validation and implementation of an internal standard in comet assay analysis. Mutation Research 469, 181–197.
8. Hothorn, L.A. & Hauschke, D. (2000). Identifying the maximum safe dose: a multiple testing approach. Journal of Biopharmaceutical Statistics 10, 15–30.
9. Sasabuchi, S. (1988). A multivariate one-sided test with composite hypotheses determined by linear inequalities when the covariance matrix has an unknown scale factor. Memoirs of the Faculty of Science, Kyushu University, Series A, Mathematics 42, 9–19.
10. Hauschke, D., Kieser, M. & Hothorn, L.A. (1999). Proof of safety in toxicology based on the ratio of two means for normally distributed data. Biometrical Journal 41, 295–304.
11. Fieller, E. (1954). Some problems in interval estimation. Journal of the Royal Statistical Society B 16, 175–185.
12. Schäfer, J. (2001). Kriterien zur Entscheidung über therapeutische Äquivalenz (Criteria for the Decision on Therapeutic Equivalence; in German). Masters Thesis, University of Munich.
13. Anon. (1991). Guidance Note: The Practical Interpretation of Annex V: Test Method B10, the In Vitro Mammalian Cell Cytogenetics Test. XI/574/91 Rev. 2. Brussels, Belgium: Commission of the EC, Directorate General Environment.
14. Adler, I.D. & Kliesch, U. (1990). Comparison of single and multiple treatment regimens in the mouse bone marrow micronucleus assay for hydroquinone and cyclophosphamide. Mutation Research 234, 115–123.
15. Fisher, R.A. (1935). The Design of Experiments. London, UK: Oliver & Boyd.
Appendix 1: An S function for standard Fieller confidence intervals for the
two-sample problem
fieller <- function(treat, group, alternative=c("two.sided", "greater", "less"), conf.level = 0.95) {
  ##
  ## Computes parametric confidence intervals for the ratio of
  ## mean(dosis)/mean(control) for a two-sample design
  ##
  ## Input:
  ##   treat: numeric vector of measurements
  ##   group: a factor at levels "dosis" and "control"
  ##   alternative: side of the confidence sets to be computed
  ##   conf.level: the confidence level
  ##
  ## Output:
  ##   a numeric vector c(lower, upper) with attribute "conf.level"
  ##
  ## Example:
  ##   treat <- c(rnorm(10,3), rnorm(10,1))
  ##   group <- factor(c(rep("dosis", 10), rep("control",10)))
  ##   fieller(treat, group)
  ##
  if (!is.vector(treat) || is.null(treat)) stop("treat is no vector")
  if (is.null(group)) stop("no groups given")
  if (length(treat) != length(group)) stop("length differ")
  alternative <- match.arg(alternative)
  alpha <- 1 - conf.level
  if (!any(levels(group) == "dosis")) stop("No treatment group defined")
  if (!any(levels(group) == "control")) stop("No control defined")
  x <- treat[group == "control"]
  y <- treat[group == "dosis"]
  m <- length(x)
  n <- length(y)
  ## S is the pooled estimator of the common variance
  S <- (1/(m+n -2))*(sum((x - mean(x))^2) + sum((y - mean(y))^2))
  cint <- switch(alternative, two.sided={
    tquant <- qt(alpha/2, m + n - 2)
    ax <- S/m*tquant^2
    ay <- S/n*tquant^2
    sqrtt <- sqrt(ax*mean(y)^2 + ay*mean(x)^2 - ax*ay)
    c((mean(x)*mean(y) - sqrtt)/(mean(x)^2 - ax),
      (mean(x)*mean(y) + sqrtt)/(mean(x)^2 - ax))
  }, greater={
    tquant <- qt(alpha, m + n - 2)
    ax <- S/m*tquant^2
    ay <- S/n*tquant^2
    sqrtt <- sqrt(ax*mean(y)^2 + ay*mean(x)^2 - ax*ay)
    c((mean(x)*mean(y) - sqrtt)/(mean(x)^2 - ax), Inf)
  }, less={
    tquant <- qt(alpha, m + n - 2)
    ax <- S/m*tquant^2
    ay <- S/n*tquant^2
    sqrtt <- sqrt(ax*mean(y)^2 + ay*mean(x)^2 - ax*ay)
    c(0, (mean(x)*mean(y) + sqrtt)/(mean(x)^2 - ax))
  })
  ## the interval is bounded only if mean(x)^2 exceeds ax
  if (ax > mean(x)^2) stop("mean(x) is not significantly unequal zero")
  attr(cint, "conf.level") <- conf.level
  return(cint)
}
Appendix 2: An S function for correlated Fieller confidence intervals
according to Schäfer (12)
fiellermuta <- function(treat, group, alternative=c("two.sided", "greater", "less"), conf.level = 0.95) {
  ##
  ## Computes parametric confidence intervals for the ratio
  ## (mean(dosis) - mean(ncontrol))/(mean(pcontrol) - mean(ncontrol))
  ## for a many-to-one design
  ##
  ## Input:
  ##   treat: numeric vector of measurements
  ##   group: a factor at levels "dosis", "pcontrol" and "ncontrol"
  ##          indicating the group
  ##   alternative: side of the confidence sets to be computed
  ##   conf.level: the confidence level
  ##
  ## Output:
  ##   a numeric vector c(lower, upper) with attribute "conf.level"
  ##
  ## Example:
  ##   treat <- c(rnorm(10,3), rnorm(10,5), rnorm(10))
  ##   group <- factor(c(rep("dosis", 10), rep("pcontrol", 10),
  ##                     rep("ncontrol",10)))
  ##   fiellermuta(treat, group)
  ##
  if (!is.vector(treat) || is.null(treat)) stop("treat is no vector")
  if (is.null(group)) stop("no groups given")
  if (length(treat) != length(group)) stop("length differ")
  alternative <- match.arg(alternative)
  if (!any(levels(group) == "dosis")) stop("No dosis group defined")
  if (!any(levels(group) == "pcontrol")) stop("No positive control defined")
  if (!any(levels(group) == "ncontrol")) stop("No negative control defined")
  ndosis <- sum(group == "dosis")
  npcontrol <- sum(group == "pcontrol")
  nncontrol <- sum(group == "ncontrol")
  df <- ndosis + npcontrol + nncontrol - 3
  cdosis <- 1/ndosis + 1/nncontrol
  cnpcontrol <- 1/npcontrol + 1/nncontrol
  cnncontrol <- 1/nncontrol
  ## pooled variance estimate over all three groups
  pooledvar <- ((ndosis - 1) * var(treat[group=="dosis"]) +
                (npcontrol - 1) * var(treat[group=="pcontrol"]) +
                (nncontrol - 1) * var(treat[group=="ncontrol"]))/df
  z <- mean(treat[group=="pcontrol"]) - mean(treat[group=="ncontrol"])
  rdosis <- (mean(treat[group=="dosis"]) - mean(treat[group=="ncontrol"]))/z
  cint <- switch(alternative, "two.sided"={
    alpha <- (1 - conf.level)/2
    gdosis <- (qt(1 - alpha, df)^2 * pooledvar * cnpcontrol)/(z^2)
    lower <- 1/(1 - gdosis) * (rdosis - (gdosis * cnncontrol)/cnpcontrol
      - (qt(1 - alpha, df) * sqrt(pooledvar))/z
      * sqrt(cdosis * (1 - gdosis) - 2 * cnncontrol * rdosis +
             cnpcontrol * rdosis^2 + (cnncontrol^2/cnpcontrol) * gdosis))
    upper <- 1/(1 - gdosis) * (rdosis - (gdosis * cnncontrol)/cnpcontrol
      + (qt(1 - alpha, df) * sqrt(pooledvar))/z
      * sqrt(cdosis * (1 - gdosis) - 2 * cnncontrol * rdosis +
             cnpcontrol * rdosis^2 + (cnncontrol^2/cnpcontrol) * gdosis))
    c(lower, upper)
  }, "less"={
    alpha <- 1 - conf.level
    gdosis <- (qt(1 - alpha, df)^2 * pooledvar * cnpcontrol)/(z^2)
    upper <- 1/(1 - gdosis) * (rdosis - (gdosis * cnncontrol)/cnpcontrol
      + (qt(1 - alpha, df) * sqrt(pooledvar))/z
      * sqrt(cdosis * (1 - gdosis) - 2 * cnncontrol * rdosis +
             cnpcontrol * rdosis^2 + (cnncontrol^2/cnpcontrol) * gdosis))
    c(0, upper)
  }, "greater"={
    alpha <- 1 - conf.level
    gdosis <- (qt(1 - alpha, df)^2 * pooledvar * cnpcontrol)/(z^2)
    lower <- 1/(1 - gdosis) * (rdosis - (gdosis * cnncontrol)/cnpcontrol
      - (qt(1 - alpha, df) * sqrt(pooledvar))/z
      * sqrt(cdosis * (1 - gdosis) - 2 * cnncontrol * rdosis +
             cnpcontrol * rdosis^2 + (cnncontrol^2/cnpcontrol) * gdosis))
    c(lower, Inf)
  })
  attr(cint, "conf.level") <- conf.level
  return(cint)
}
The S-functions ‘fieller’ and ‘fiellermuta’ implement the methods described in this paper. Both programs
can be executed by using the commercial program “S-Plus” (http://www.insightful.com/), as well as the
freely-available system “R” (http://www.r-project.org).
Appendix 3: The output for the example data
R : Copyright 2002, The R Development Core Team
Version 1.4.1 (2002-01-30)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for a HTML browser interface to help.
Type 'q()' to quit R.
> source("fiellermuta.s")
> source("fieller.s")
>
> # data from the micronuclei example
>
> Cminus <- c(3,2,2,3,2,5,1)
> Cplus <- c(33, 15, 32, 20)
> D1 <- c(5,4,4,4,2)
> group <- factor(c(rep("control",7), rep("dosis",5)))
>
> # Standard Fieller confidence intervals for the two-sample problem
>
> print(fieller(c(Cminus, D1), group, alternative="less"))
[1] 0.000000 2.311095
attr(,"conf.level")
[1] 0.95
>
> D2 <- c(7,4,6,8,6)
> print(fieller(c(Cminus, D2), group, alternative="less"))
[1] 0.000000 3.882335
attr(,"conf.level")
[1] 0.95
>
> D3 <- c(9, 18, 13, 12, 18)
> print(fieller(c(Cminus, D3), group, alternative="less"))
[1] 0.00000 19.08949
attr(,"conf.level")
[1] 0.95
>
> D4 <- c(22,13,23,22,20)
> print(fieller(c(Cminus, D4), group, alternative="less"))
[1] 0.00000 29.20161
attr(,"conf.level")
[1] 0.95
>
> # Correlated Fieller confidence intervals with positive and negative control
>
> group <- factor(c(rep("dosis", 5), rep("ncontrol", 7), rep("pcontrol", 4)))
>
> print(fiellermuta(c(D1, Cminus, Cplus), group, conf.level=0.95, alternative="less"))
[1] 0.0000000 0.2442684
attr(,"conf.level")
[1] 0.95
>
> print(fiellermuta(c(D2, Cminus, Cplus), group, conf.level=0.95, alternative="less"))
[1] 0.0000000 0.3509787
attr(,"conf.level")
[1] 0.95
>
> print(fiellermuta(c(D3, Cminus, Cplus), group, conf.level=0.95, alternative="less"))
[1] 0.0000000 0.7360543
attr(,"conf.level")
[1] 0.95
>
> print(fiellermuta(c(D4, Cminus, Cplus), group, conf.level=0.95, alternative="less"))
[1] 0.000000 1.043725
attr(,"conf.level")
[1] 0.95
>