Statistical Design, Standardized Procedures, and Data Quality
Don Young
CCS Associates
Advance Research Associates
Mountain View, California USA
dyoung@araonline.net
ICTW Punta del Este, Uruguay

Data Variability
• Variability is inherent in all biological systems
• Without variability…
  – there would be no evolution
    • nothing upon which Natural Selection could act
  – there would be no need for statistics

Without variability, we would all look like him:
[image]

With variability comes the need for Statistics
• to help us distinguish “noise” from “signal”
• to help us evaluate how representative our Sample is relative to the Population
• to help us generalize from our Sample to the Population

Underlying tenet of statistics
Population: universe of all patients with ovarian cancer
Sample: all patients in our study of ovarian cancer
Sampling vs Inference

[Diagram: Sampling leads from the Population to the Sample; Inference leads from the Sample back to the Population]

Measures of Variability
Ex: pre-natal care
Variance = “avg squared deviation” = s²
Std Deviation = sqrt of variance = s
Std Error of Mean (SEM, SE) = s/sqrt(n) = s/√n

Std Dev in a normal distribution
± 1 SD of mean: 34% x 2 = 68%
± 2 SD of mean: 47.5% x 2 = 95% (approximately)
SD = “natural variability of data”
SE = “accuracy of mean” = s/√n

Hypothesis Testing
Null hypothesis = Ho: no difference between treatment (trt) groups
Reject Ho, in favor of H1: trt groups differ
Fail to Reject Ho (or Accept Ho)
vs.
Reality (the Population)

Hypothesis Testing
                                         Reality: no diff    Reality: diff
Reject Ho [claim to see a difference]    Type I error        OK
Accept Ho [see no difference]            OK                  Type II error

Type I error
The more egregious error: impact on reputation and scientific progress
False positive error; “seeing too much in data”
P(Type I) = α = 0.05 (by convention)
Source: random chance, biased sampling error

Type II error
False negative error; “not seeing enough in the data”
P(Type II) = β = 0.2 max (i.e., power = 1 − β of at least 80%)
Source: too small a sample size

95% Confidence Interval
If we were to draw 100 samples of size n from the Population and calculate the mean & 95% CI for each sample, then we would expect about 95 of those 100 CIs to encompass the true Population mean
Rule of thumb (RoT): 95% CI ≈ pt estimate ± 2 SE

Power & Sample Size
As much an art as it is a science
“With a sample size of n per group, we have an 80% chance of detecting an intergroup difference of size δ at a two-tailed alpha of 0.05.”
5 components: sample size, power, std dev, clinically meaningful difference, alpha
Any 2 can vary; the other 3 must be held constant

Power & Sample Size
| delta |
1. (-------|-----(---)----|-------)
2. (-----|-----) (-----|-----)
3. (-----|---(--)---|-----)
4. (----|----)(----|----)

Phases 1, 2, 3, & 4 Clinical Trials
• Ph 1: Dose-finding
  – DLT = Dose-Limiting Toxicity
  – MTD = Maximum Tolerated Dose
• Ph 2: Anti-tumor activity
  – Targeted pt population
  – Single arm or randomized multi-arm
• Ph 3: Comparative efficacy
  – Hypothesis testing, confirmatory
  – Randomized, double-blinded (if possible)
• Ph 4: Post-approval
  – Expanded populations, tumors

Phases 1, 2, 3, & 4 Clinical Trials
• Friday 14.30h: Interactive Break-out Session
• Details of trial design for each phase
• DESIGN CAVEAT: A study that attempts to answer too many questions is likely to end up answering none.

Ensuring Data Quality: Standardizing Procedures
• Single protocol across all sites and countries
  – To ensure data are poolable
• Central (vs Local) Clinical Labs
  – Consistent reference ranges
  – Consistent Toxicity Grading Scales
• Standardizing CT scans across sites
  – specific slice thickness or contiguous reconstructive algorithm
• Single reader or fixed panel of readers
• Clinical compliance audits, Routine monitoring visits
  – To ensure protocol is being executed consistently

Patient enrollment rates
• Studies rarely enroll at the rate projected
  – Relax inclusion/exclusion criteria?
    • Early pts different from later pts
    • Increased variability
  – Increase # of study sites?
    • Increased variability → reduced power, in spite of increased sample size
• Tension between increasing quantifiable patient counts and maintaining abstract statistical concepts

Ensuring Data Quality: “Process Triangle”
[Diagram labels: OpenClinica; Clinical Database; Medical Records = source documents]

“Process Triangle”
[Diagram labels: Medical Records = source documents; Source Data Verification; CRFs; Data Management; Clinical Database]

Source Data Verification
• To ensure all data on CRFs are found in the Source Documents
• Critical to the validity of the Clinical Database
• First step in assuring A = B

Data Management
• To ensure that contents of CRFs are properly entered into Clinical Database
• To issue queries to study site for data clarification and completeness
• To code data into an analyzable form
  – Adverse Events (e.g., MedDRA)
  – Concomitant Therapies (e.g., WHO-Drug, MedDRA)
• Second step in assuring that B = C

Example: Value of SDV
• Andrew Wakefield et al. published in Lancet (Feb 1998) a paper (n=12) linking MMR (measles-mumps-rubella) vaccine with
  – Inflammatory bowel disease
  – Sudden onset regressive autism (within 14 days)
• Lancet retracted paper in Feb 2010
• Brian Deer of The Sunday Times (London) source verified the data in the publication for the 12 children

Brian Deer findings
• 5 of 12 had pre-existing developmental issues prior to vaccine exposure
• 9 children were reported with regressive autism
  – However, only 6 of 9 had any degree of autism
  – 3 of those 9 did not have autism at all
• 9 patients had unremarkable bowel biopsies
  – Only 1 had bowel disease
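
Worked example: statistical formulas from the earlier slides
The formulas on the Measures of Variability, normal distribution, 95% Confidence Interval, and Power & Sample Size slides can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's own method. The function names and the data values are hypothetical. The per-group sample size formula, n = 2·((z for α/2 + z for power)·s/δ)², is the common normal approximation for comparing two means and is an assumption here; the slides do not state which formula was used.

```python
import math
import statistics
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution (mean 0, SD 1)


def coverage(k):
    """Fraction of a normal distribution lying within +/- k SD of the mean."""
    return Z.cdf(k) - Z.cdf(-k)


def summarize(sample):
    """Return mean, SD (s), SEM (s / sqrt(n)), and the rule-of-thumb 95% CI."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)            # sample SD, uses the n - 1 denominator
    sem = s / math.sqrt(n)                  # standard error of the mean
    ci95 = (mean - 2 * sem, mean + 2 * sem) # RoT: point estimate +/- 2 SE
    return {"n": n, "mean": mean, "sd": s, "sem": sem, "ci95": ci95}


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate n per group to detect a difference `delta` between two means.

    Assumed formula (normal approximation, two-tailed alpha, equal group sizes):
        n = 2 * ((z_{1 - alpha/2} + z_{power}) * sd / delta) ** 2
    """
    z_alpha = Z.inv_cdf(1 - alpha / 2)
    z_power = Z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)


if __name__ == "__main__":
    values = [4.0, 6.0, 5.0, 7.0, 3.0, 5.0, 6.0, 4.0, 5.0]  # hypothetical data
    print("within 1 SD:", round(coverage(1), 4))
    print("within 2 SD:", round(coverage(2), 4))
    print(summarize(values))
    # Smaller delta or larger SD drives n up; the other inputs are held constant.
    print("n per group:", n_per_group(delta=5.0, sd=10.0))
```

Varying one input to `n_per_group` while holding the rest fixed shows the trade-off among the five components: halving `delta` roughly quadruples the required n per group.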