Dual controls in rodent carcinogenicity studies

Issues and Experience in Analyzing
Transgenic Mouse Carcinogenicity Studies:
An Industry Perspective
Ronald Menton
Wyeth Research
2005 FDA/Industry Statistics Workshop
Washington, DC, 14-16 Sep 2005
Outline
• Some statistical questions for 2-year studies
• Transgenic models
• Some thoughts on the questions for transgenic models
• Final Comments
Study Design Questions?
• Are two control groups needed?
• How many animals per group?
• What groups are needed?
• Statistical methods?
Some In-Life Questions?
• When should we terminate group x?
• When should we terminate the study?
• Do we have a valid study?
Questions at End of Study?
• DO WE HAVE A VALID STUDY?
• ARE ANY FINDINGS STATISTICALLY
SIGNIFICANT?
Transgenic Mouse Models
• Mouse model more susceptible to drug-induced tumors
due to
– Knocking out gene associated with tumor suppression
(e.g., p53+/-, XPA)
– Insertion of multiple copies of human gene associated
with tumor promotion
(e.g., TgrasH2, Tg.AC)
• The increased signal permits shorter study duration and
smaller group sizes
Transgenic Models
• Current Regulations (ICH S1B) permit sponsors to conduct
the traditional 2-year rat study plus a short- or medium-term
rodent study in lieu of 2-year studies in both rats and
mice
• The Committee for Medicinal Products for Human Use
stated that the TgrasH2 and p53+/- mouse models are
acceptable alternatives to the 2-year mouse study.
CPMP (2004)
Why Conduct Transgenic Study?
• Faster
– In-life: 6-months vs 2 years
– Study completion: 1 year vs > 3 years
• Fewer Resources
– Fewer animals
– People
– Space
• Increased Flexibility for Drug Development
Typical Study Design for
2-Year Rodent Study
                     Number of Animals
Group                Males     Females
Control Group 1      50-75     50-75
Control Group 2      50-75     50-75
Low Dosage           50-75     50-75
Mid Dosage           50-75     50-75
High Dosage          50-75     50-75
Are Two Control Groups Needed?
• Many companies routinely use two vehicle control groups
for 2-year carcinogenicity studies.
• Why?
– Permits an assessment of variation in tumor rates
between groups
– Poor survival in control group is problematic
• See Haseman (1990) for discussion
Multiple Control Groups in 2-Year Studies
Eight of 14 companies indicated that multiple control groups are
employed for at least 75% of their studies.
[Bar chart: number of companies by percentage of studies with multiple control groups (0%, < 25%, 25-75%, > 75%)]
What type of multiple control group
designs are routinely used?
9 Two vehicle control groups
2 Vehicle control and water control
2 Vehicle control and untreated control
Survey of 14 PhRMA Companies on Statistical Methods Used for 2-year Rodent Carcinogenicity Studies.
Menton R (2003)
Are Two Control Groups Needed?
• Are Two Vehicle Control Groups Needed in Short-term
Carcinogenicity Studies?
• Not for most models
– Low spontaneous rate of tumors
– Survival rate usually high for at least 6 months
Survival for P53 Mouse from 6 NTP Studies
NTP Web Site
Mortality in TgrasH2 Mice
                      Male                            Female
              VC      MNU(1)   MNU(2)        VC      MNU(1)   MNU(2)
N Studies     12      4        7             12      4        7
N animals     180     60       104           179     60       105
Mortality
  Range       0-13%   0-33%    0-100%        0-13%   13-27%   13-100%
  Mean        2.8%    13.3%    57.7%         3.9%    20%      55.2%
1. 13-week studies
2. 26-week studies
Adapted from Table 4 in Takaoka (2003)
Spontaneous Tumors in P53 Mice
                                        Tumor Incidence
Neoplasm                            Male              Female
All Organs
  Leukemia: Granulocytic            1/108 (0.93%)     0/109
  Malignant Lymphoma                2/108 (1.85%)     2/109 (1.83%)
  Osteosarcoma or Osteoma           2/108 (1.85%)     0/109
Bone
  Osteosarcoma                      2/108 (1.85%)     0/109
Lung
  Alveolar/Bronchiolar Adenoma      0/108             3/109 (2.75%)
Skin
  Sarcoma                           2/108 (1.85%)     1/109 (0.92%)
Adapted from NTP Website
Spontaneous Tumors in TgrasH2 Mice
• Usui (2001) summarized tumor incidence and time of first
tumor for common spontaneous tumors (incidence > 1%)
in 12 ILSI ACT studies.
• 180 male and 178 female mice (15 per study/sex)
• Male tumor incidence: 0 – 1.8%
• Female tumor incidence: 0 – 2.3%
• In most cases, the incidence of these common tumors was
only marginally greater than 1.0%
How Many Animals Per Group?
• 2-year mouse studies typically use between 50 and 65
animals per group.
• Study duration was typically 24 months for both rat and
mouse studies. The number of animals per group per
sex was evenly divided between 50, 60, and 65.
How Many Animals Per Group?
• Original ILSI protocols recommended 15 animals per
group for transgenic studies
• Recent papers and presentations have recommended 20-25
per group
– Morton (2002)
– Lin (2004)
– CPMP (2004)
Sample Size
• Recommend 20 to 25 mice/sex/group for carcinogenicity
assessment studies in TgrasH2 mice. (Morton 2002)
• Group size of 15 animals in the original transgenic mouse
study protocol is too small. To have a level of power
between 80 and 90% in detecting a true 15% difference,
20-25 animals per group are needed. (Lin 2004)
• The number of animals per group in the ILSI/HESI studies
is too small. An increase in group size to 20-25 animals
per group is recommended. (CPMP 2004)
Power to Detect Selected Increases in Tumor Rate
Assuming Background Tumor Rate Near 0
            n=15              n=20              n=25              n=30
P2      α=0.01  α=0.05    α=0.01  α=0.05    α=0.01  α=0.05    α=0.01  α=0.05
0.10    0.29    0.55      0.39    0.65      0.46    0.73      0.57    0.80
0.15    0.44    0.71      0.57    0.81      0.68    0.88      0.77    0.92
0.20    0.59    0.81      0.73    0.90      0.82    0.95      0.90    0.97
0.25    0.70    0.88      0.84    0.95      0.91    0.98      0.95    0.99
0.30    0.80    0.93      0.91    0.98      0.96    0.99      0.98    0.99
0.35    0.80    0.95      0.93    0.98      0.97    0.99      0.99    0.99
Adapted from Lin (2004)
Power to Detect 15% Increase in Tumor Rate
for Sample Sizes of 15, 20, and 25 (a)
Number of            Historical prevalence of spontaneous neoplasms
mice/sex/group       0%        3.75%     7.5%
Sexes analyzed separately. Test will detect change in one sex or both
15                   0.60      0.46      0.53
20                   0.78      0.66      0.60
25                   0.86      0.67      0.74
Both sexes analyzed together with blocking.
15                   0.77      0.52      0.62
20                   0.90      0.74      0.66
25                   0.96      0.72      0.81
(a) Assumptions for these sample power simulations include:
1. A trend test is performed.
2. Three treatment groups and a negative control group are analyzed.
3. Prevalence of treatment-related neoplasm increases proportionally to the dosage.
4. There are no sex differences in neoplastic responses.
5. p < 0.05 is statistically significant.
Adapted from Table 2 in Morton, 2002
What Groups to Include?
• Typical 2-year carcinogenicity study includes 5 groups:
C1, C2, L, M, H
• All but one respondent indicated that a typical study
includes three dose groups, with one stating that they
usually employ four dose groups.
Study Design for TgrasH2 Study
                          NO. OF MICE (Toxicity)
GROUP                     M       F
CB6F1-TgHras2
  Vehicle Control         25      25
  Positive Control        25      25
  Low-Dose                25      25
  Mid-Dose                25      25
  High-Dose               25      25
CB6F1-nonTgrasH2
  Vehicle Control         25      25
  High-Dose               25      25
Adapted from www.rash2.com
What Groups to Include?
• Original ILSI Protocol recommended 7 Groups
• C, L, M, H, Positive Control, WT-C, WT-H
• WT groups are now considered optional
• Two questions:
– Is the PC group needed?
– If PC group included, then how many animals are
needed in this group?
Positive Controls in Short-term Studies
• Storer (2001) summarized results for 19 ILSI ACT studies
that used p-cresidine as the positive control group
• N=15 per sex
• Males
– P-cresidine was considered positive for 18 of 19 studies
– Bladder tumor incidence ranged from 0 to 86.7%
• Females
– P-cresidine was considered positive for 15 of 19 studies
– Bladder tumor incidence ranged from 0 to 60%
Positive Controls in Short-term Studies
Incidence of Select Neoplasms
in TgrasH2 Mice Treated with MNU
                                       Male (7 Studies)     Female (7 Studies)
Organ/Diagnosis                        Range      Mean      Range      Mean
Forestomach/
Squamous cell papilloma/carcinoma      87-100%    96%       93-100%    98%
Multisystemic/
Malignant lymphomas                    53-87%     76%       53-100%    76%
Adapted from Table 8 in Takaoka (2003)
Power for Comparing Tumor Incidence
Between Positive Control and Vehicle Control Group
                     Background Incidence = 5%     Background Incidence = 10%
Positive Control     Number in PC Group            Number in PC Group
Group Tumor
Incidence            n=15       n=25               n=15       n=25
50%                  89.5%      94.9%              74.8%      83.7%
60%                  97.0%      99.1%              91.1%      95.3%
70%                  99.6%      >99.9%             98.0%      99.5%
80%                  99.9%      >99.9%             99.8%      >99.9%
90%                  >99.9%     >99.9%             >99.9%     >99.9%
Calculations assume that tumor incidence is compared between the
two groups using a Fisher Exact test at the 5% significance level.
Power was computed via simulation (5000 runs per simulation).
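The power comparison described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the calculation behind the slide: it enumerates all outcomes exactly instead of simulating, it uses a one-sided Fisher exact test, and it assumes a vehicle control group of 25 animals (the slide gives neither the sidedness nor the control group size). The function names are hypothetical, so the results need not match the table.

```python
from math import comb


def fisher_one_sided_p(x_pc, n_pc, x_vc, n_vc):
    """One-sided Fisher exact p-value: P(PC tumors >= x_pc | fixed margins)."""
    total = x_pc + x_vc
    upper = min(n_pc, total)
    tail = sum(comb(n_pc, k) * comb(n_vc, total - k) for k in range(x_pc, upper + 1))
    return tail / comb(n_pc + n_vc, total)


def binom_pmf(k, n, p):
    """Binomial probability of k tumor-bearing animals out of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def fisher_power(p_pc, n_pc, p_vc, n_vc=25, alpha=0.05):
    """Exact power: sum the probability of every outcome the test rejects.

    n_vc=25 and the one-sided test are assumptions made for this sketch.
    """
    power = 0.0
    for x_pc in range(n_pc + 1):
        for x_vc in range(n_vc + 1):
            if fisher_one_sided_p(x_pc, n_pc, x_vc, n_vc) <= alpha:
                power += binom_pmf(x_pc, n_pc, p_pc) * binom_pmf(x_vc, n_vc, p_vc)
    return power


if __name__ == "__main__":
    for n_pc in (15, 25):
        print(n_pc, round(fisher_power(p_pc=0.50, n_pc=n_pc, p_vc=0.05), 3))
```

Exact enumeration is cheap here because there are only (n_pc + 1) x (n_vc + 1) possible outcomes, and it avoids simulation noise.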
Possible Design for 6-Month
P53+/- or TgrasH2 Study
                                Number of Animals
Group                           Males     Females
Control Group                   25        25
Low Dosage                      25        25
Mid Dosage (1)                  25        25
High Dosage                     25        25
Positive Control Group (2,3)    15-20     15-20
1. Do we need three dosage groups?
2. After demonstrating model assay validity, do we need the
positive control group?
3. 20 animals if tumor incidence in target organs is 50-60%.
15 animals if tumor incidence in target organs is ≥ 70%
Statistical Methods?
• Peto’s test is commonly used for the statistical analysis
of tumor data for 2-year carcinogenicity studies
• Eleven of 13 respondents were familiar with the procedures
detailed in the draft FDA guidance document, “Statistical
Aspects of Design, Analysis, and Interpretation of Animal
Carcinogenicity Studies”.
• Twelve companies stated that they are using Peto type tests
for the analysis of tumor data.
Options for Statistical Methodology for
P53 and TgrasH2 Studies
• Cochran-Armitage Trend test and Fisher’s Exact test
– Exclude animals that die with short survival times.
– Definition of sufficient survival based on time of tumor
observation in sponsor’s historical data and literature
• Peto Methods
• Poly-K methods
Cochran-Armitage and Fisher Exact Tests
Advantages
• Simple, well known test
• Exact tests available
• Easy to block or stratify for other covariates
• Appropriate if there are few fatal tumors and intercurrent
mortality is similar among groups
Disadvantages
• Requires specification of survival time for excluding animals
• Does not account for time of tumor onset or cause of death
Peto Methods
Advantages
• FDA may use Peto’s method
• Accounts for time of tumor onset and cause of death
• Software available
• Exact tests available
• Scientists familiar with methods
Disadvantages
• Requires specification of incidental intervals
• Specification of incidental intervals is complicated due to
small number of deaths in vehicle control groups
• Complexity makes stratification/blocking more difficult
Poly-K Methods
Advantages
• Adjusts for mortality
• Does not require cause of death determination
• Do not have to specify time intervals
• Easy to block or stratify for the two studies
• Fairly simple method
Disadvantages
• Not much experience for 6-month studies
• Biologists not familiar with method
• Application of exact tests for poly-k method is a
research topic
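A minimal sketch of the poly-k survival adjustment, assuming its usual form: an animal that dies tumor-free before the end of the study counts as a fraction (t/T)^k of an animal, while tumor-bearing animals and animals surviving to termination count fully. The choice k = 3, the function name, and the example data are illustrative assumptions, not values from the studies discussed here.

```python
def poly_k_adjusted_rate(animals, study_weeks, k=3):
    """Return (tumor count, adjusted group size, adjusted tumor rate).

    animals: list of (weeks_on_study, has_tumor) tuples for one dose group.
    k=3 is a conventional default, assumed here for illustration.
    """
    tumors = sum(1 for _, has_tumor in animals if has_tumor)
    adjusted_n = sum(
        # Full weight for tumor bearers and animals that reach termination,
        # fractional weight for tumor-free early deaths.
        1.0 if has_tumor or weeks >= study_weeks else (weeks / study_weeks) ** k
        for weeks, has_tumor in animals
    )
    return tumors, adjusted_n, tumors / adjusted_n


if __name__ == "__main__":
    # Hypothetical 26-week group of 25 mice: 20 tumor-free survivors,
    # 2 tumor bearers, and 3 tumor-free deaths at week 13.
    group = [(26, False)] * 20 + [(26, True)] * 2 + [(13, False)] * 3
    print(poly_k_adjusted_rate(group, study_weeks=26))
```

The adjusted group sizes replace the raw group sizes in the trend or pairwise test, which is why neither cause of death nor time intervals have to be specified.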
Statistical Methods?
Incidence of mortality, neoplasms/select non-neoplasms will
be compared among dosage groups using the Cochran-Armitage
trend test and Fisher's exact test between each
dosage group and the vehicle-control group. If excessive
intercurrent mortality is observed then the trend and pairwise
tests of tumor data will be conducted using Peto's method.
What constitutes excessive mortality?
Number of early deaths: > 5? > 10?
Employ Poly-k Method?
Questions During In Life
• Mortality and/or differential intercurrent mortality raises
statistical questions during conduct of 2-year studies
– Should the high dose be lowered?
– Should one or more groups be terminated early?
– Should the study be terminated early?
• Ten of 13 companies indicated that at least one dose group was
terminated early or the top dose lowered for at least one study in the
past five years.
Mortality Guidelines for 2-year Studies
• 20-30 animals per group should be alive during
weeks 80-90
– FDA Draft Guidance (May 2001)
• High-Dose group could be terminated early when the
survival of the group is reduced to 10-12 animals
– Fairweather et al (1998). Drug Information Journal
• A study could be terminated if survival of the control
group goes below 20-30 after weeks 80-90
– FDA Draft Guidance (May 2001)
Mortality Issues for Short-term Studies
• Survival is usually very high in short-term studies
• However, what do we do if it isn’t?
• What are the criteria for evaluating if study is acceptable,
terminating a study, or terminating a dosage group?
Mortality Issues for Short-term Studies
• We (scientific community) do not currently know how
many animals are needed at the end of a 26-week
carcinogenicity study
• We also do not know how many weeks represents
sufficient exposure
• We do know that the more animals per group the more
sensitive the statistical tests will be for detecting
compound related tumor increases of a specified
magnitude
Power for Reduced Survival
Background    Tumor Rate Increase    Sample Size at High Dose
Rate          at High Dose           15           10
0.1%          15%                    55-67%       44-47%
              20%                    72-84%       62-69%
              25%                    85-92%       75-79%
              30%                    93-96%       85-88%
              35%                    96-99%       90-96%
3%            15%                    44-48%       32-40%
              20%                    59-66%       44-54%
              25%                    70-79%       56-67%
              30%                    83-90%       66-79%
              35%                    89-94%       77-89%
Description of Power Calculations
• Simulations were conducted to estimate the probability of
detecting differences of 15 - 35% in tumor rates between the
treated groups and control group
– Power calculations assume that tumor incidence is compared among
4 dosage groups using a one-sided Cochran-Armitage trend test
conducted at the 5% significance level
– Background tumor incidence ranged from 0.1% to 3%
– Tumor incidence in L and M dosage groups ranged from
background rates to 2/3 of that in H dosage group
– Power was computed via simulation (1000 runs per simulation)
– Calculations performed for two sets of sample sizes:
25, 24, 22, and 15 in the C, L, M, and H dosage groups;
25, 24, 22, and 10 in the C, L, M, and H dosage groups
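The simulation described above can be sketched as follows. This is a rough illustration, not the original program: it assumes equally spaced dose scores (0, 1, 2, 3) and the normal approximation to the Cochran-Armitage statistic, neither of which is stated on the slide, and the function names are hypothetical. Estimates from it need not reproduce the tabulated ranges.

```python
import math
import random


def cochran_armitage_z(tumors, sizes, scores):
    """Asymptotic Cochran-Armitage trend statistic (positive = increasing trend)."""
    n_total = sum(sizes)
    p_hat = sum(tumors) / n_total
    if p_hat in (0.0, 1.0):
        return 0.0  # no variation in response, so no evidence of trend
    d_bar = sum(n * d for n, d in zip(sizes, scores)) / n_total
    numerator = sum(x * (d - d_bar) for x, d in zip(tumors, scores))
    variance = p_hat * (1 - p_hat) * sum(
        n * (d - d_bar) ** 2 for n, d in zip(sizes, scores)
    )
    return numerator / math.sqrt(variance)


def simulate_power(rates, sizes, scores=(0, 1, 2, 3), runs=1000, seed=1):
    """Share of simulated studies with a significant one-sided trend at 5%."""
    rng = random.Random(seed)
    z_crit = 1.6449  # one-sided 5% critical value of the standard normal
    hits = 0
    for _ in range(runs):
        tumors = [sum(rng.random() < p for _ in range(n)) for p, n in zip(rates, sizes)]
        if cochran_armitage_z(tumors, sizes, scores) > z_crit:
            hits += 1
    return hits / runs


if __name__ == "__main__":
    sizes = (25, 24, 22, 15)  # C, L, M, H group sizes from the slide
    # Background 0.1% in C, L, and M; a 25% increase at the high dose.
    print(simulate_power((0.001, 0.001, 0.001, 0.251), sizes))
```

With tumor counts this small the normal approximation can be crude, so an exact permutation version of the trend test may be preferable in practice.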
Some Thoughts On
Mortality Guidelines for Short-term Studies
• xx-yy animals per group should be alive during
weeks ww-zz
– xx - yy = 15 – 20?
– ww-zz likely species dependent
• High-dose group could be terminated early when the
survival of the group is reduced to 10-15 (?) animals
before weeks ww-zz.
• A study could be terminated if survival of the control
group goes below 20 (assuming n = 25)
before weeks ww - zz
Are Any Findings Statistically Significant?
• What is Considered Statistically Significant?
• Different approaches are utilized to adjust for the multiple
statistical tests performed in 2-year carcinogenicity studies.
• Six of 13 companies employ the decision rule in FDA’s draft guidance
document of 0.025 for rare tumors and 0.005 for common tumors.
What significance levels are used for the evaluation of
rare/common tumors?
Companies   Rare/Common
4           0.05/0.05 with no adjustments for multiple tumors
1           0.05/0.05 with an adjustment for multiple tumors
2           0.05/0.01 i.e., Haseman Rule
6           0.025/0.005 i.e., FDA Decision Rule
Decision Rule in FDA’s Draft Guidance
Significance levels for making statistical decisions to
accommodate the multiple tests
                                Tests for                Control-High Pairwise
                                Positive Trend           Comparisons
Standard 2-Year Studies         Common tumors = 0.005    Common tumors = 0.01
in Rat & Mouse                  Rare tumors = 0.025      Rare tumors = 0.05
Alternative ICH Studies         Common tumors = 0.01     Under development
(e.g., 2-year rat study +       Rare tumors = 0.05       and not yet available.
6-month mouse study)
Adapted from US FDA (May 2001)
What is Considered Statistically Significant?
• Is a multiplicity adjustment needed for short-term studies?
• No
– Only a handful of tumor types observed in a study
– Probability of a false positive is low due to low
spontaneous rate
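The argument that false positives are unlikely at low spontaneous rates can be checked numerically. The sketch below is illustrative only: it assumes a one-sided Fisher exact test at the 5% level, 25 animals in each of two groups, a common background rate of 2%, five tumor types, and independence across tumor types. None of these settings come from the slides, and the function names are hypothetical.

```python
from math import comb


def fisher_one_sided_p(x_trt, x_ctl, n):
    """One-sided Fisher exact p-value for two groups of n animals each."""
    total = x_trt + x_ctl
    upper = min(n, total)
    tail = sum(comb(n, k) * comb(n, total - k) for k in range(x_trt, upper + 1))
    return tail / comb(2 * n, total)


def binom_pmf(k, n, p):
    """Binomial probability of k tumor-bearing animals out of n."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def false_positive_rate(p_background, n=25, alpha=0.05):
    """Exact chance of a significant result when both groups share one rate."""
    return sum(
        binom_pmf(a, n, p_background) * binom_pmf(b, n, p_background)
        for a in range(n + 1)
        for b in range(n + 1)
        if fisher_one_sided_p(a, b, n) <= alpha
    )


def familywise_rate(per_test_rate, n_tumor_types):
    """Chance of at least one false positive, assuming independent tumor types."""
    return 1 - (1 - per_test_rate) ** n_tumor_types


if __name__ == "__main__":
    per_test = false_positive_rate(0.02)
    print(per_test, familywise_rate(per_test, n_tumor_types=5))
```

Because the test is discrete, the attained false positive rate at a low background rate sits well below the nominal 5%, which is the point made in the bullets above.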
Final Comments
• Alternative mouse models provide additional flexibility in
drug development
• While 25 animals per sex/group is reasonable for the
control and treated transgenic groups, smaller sample sizes
make sense for the positive control group
• Simple statistical methods work well when survival is
high
• More research and/or guidance is needed on defining
adequate survival
Some References
• CPMP Safety Working Party. CHMP SWP conclusions and recommendations on the use of
genetically modified animal models for carcinogenicity assessment. London, 23 June 2004.
• Haseman JK, Hajian G, Crump KS, Selwyn MR, and Peace KE. Dual controls in rodent
carcinogenicity studies. In: Statistical issues in drug research and development, Ed by KE Peace.
Marcel Dekker, New York. 1990.
• Lin K. Statistical Issues in Review of Carcinogenicity Studies of Pharmaceuticals. Drug Information
Association 40th Annual Meeting, June 16, 2004, Washington, DC.
• MacDonald J, et al. The utility of genetically modified mouse assays for identifying human
carcinogens: a basic understanding and path forward. Toxicol Sci. 2004:188-94.
• Menton R and Perry R. Statistical Methods for 2-Year Rodent Carcinogenicity Studies. Midwest
Biopharmaceutical Workshop, Muncie, IN, 2003.
• Morton D. The Tg rasH2 Mouse in Cancer Hazard Identification. Toxicol Pathol, 2002:139-146.
• NTP web pages on Historical Controls for P53 Mice. http://ntp.niehs.nih.gov/
Study Results & Research Projects >> Study Data Searches >> Historical Controls >> NTP
Historical Control for Genetically-Modified Models
• Storer R, et al. p53+/- Hemizygous Knockout Mouse: Overview of Available Data. Toxicol
Pathol, 2001, 29 Suppl:30-50.
• Takaoka M, et al. Interlaboratory comparison of short-term carcinogenicity studies using
CB6F1-rasH2 transgenic mice. Toxicol Pathol, 2003:191-9.
• US Food and Drug Administration. Statistical Aspects of Design, Analysis, and Interpretation of
Animal Carcinogenicity Studies, Draft Guidance for Industry, May 2001.
• Usui T, et al. CB6F1-rasH2 mouse: Overview of Available Data. Toxicol Pathol, 2001, 29 Suppl:90-108.