Biases in Studies of Screening Programs
Thomas B. Newman, MD, MPH
June 10, 2011

Overview
Introduction
– TN Biases
– Definitions
Problems with observational studies
– Volunteer bias
– Lead time bias
– Length bias
– Stage migration bias
– Pseudodisease

Screening tests: TN Biases
“When your only tool is a hammer, you tend to see every problem as a nail.”
Clinical care accounts for 95% of spending but only 20% of determinants of health*
Biggest threats are public health threats
Interventions aimed at individuals are overemphasized because they are more profitable and we know how to do/sell them
*Teutsch SM, Fielding JE. Comparative effectiveness: looking under the lamppost. JAMA 2011;305:2225-6

Cultural characteristics
"We live in a wasteful, technology driven, individualistic and death-denying culture."
--George Annas, New Engl J Med, 1995

What is screening?
Common definition: testing to detect asymptomatic disease
Better definition*: application of a test to detect a potential disease or condition in people with no known signs or symptoms of that disease or condition.
– Disease vs. condition
– Asymptomatic vs. no known signs or symptoms
*Common screening tests. David M. Eddy, editor.
Philadelphia, PA: American College of Physicians, 1991
Screening tests may be history questions

Screening Spectrum
Risk factor, presymptomatic disease, unrecognized symptomatic disease, recognized symptomatic disease
From risk factor to recognized symptomatic disease:
– Decreasing numbers labeled and treated
– Decreasing difficulty demonstrating benefit

Examples and overlap
Unrecognized symptomatic disease: vision and hearing problems in young children; iron deficiency anemia; depression
Presymptomatic disease: neonatal hypothyroidism, syphilis, HIV
Risk factor: hypercholesterolemia, hypertension
Somewhere between: prostate cancer, ductal carcinoma in situ of the breast, more severe hypertension

Evaluating Studies of Screening
Ideal Study:
– Randomize patients to be screened or not
– Compare outcomes in ENTIRE screened group to ENTIRE unscreened group
[Figure: randomization (R) to Screened or Not screened; each arm contains D+ and D- patients; the outcome in each arm is mortality after randomization]
Observational studies:
Patients are not randomized
Compare outcomes in screened vs. unscreened patients
Or among patients with disease:
– Compare outcomes in those diagnosed by screening vs. those diagnosed by symptoms
– Compare stage-specific survival with and without screening

KEY DIFFERENCE: Mortality vs. Survival
Mortality: denominator is a population, most of whom never get the disease
Survival: denominator is patients with the disease
Beware of any studies evaluating screening tests using survival

Possible Biases in Observational Studies of Screening Tests
Volunteer bias
Lead time bias
Length bias
Stage migration bias
Pseudodisease

Volunteer Bias
People who volunteer for screening differ from those who do not
Examples
– HIP Mammography study:
• Women who volunteered for mammography had lower heart disease death rates
– Multicenter Aneurysm Screening Study (MASS; Problem 6.3)
• Men aged 65-74 were randomized to either receive an invitation for an abdominal ultrasound scan or not.
MASS Within Groups Result in Invited Group
MASS -- Invited Group Only
              N        AAA deaths   %       Total deaths   %
Scanned       27,147   43           0.16%   2,590          9.54%
Not scanned   6,692    22           0.33%   1,160          17.33%
Total         33,839   65                   3,750

Avoiding Volunteer Bias
Randomize patients to screened and unscreened
Otherwise, try to control for factors (confounders) associated with both screening and outcome
– Examples: family history, level of health concern, other health behaviors, baseline health/illnesses

Lead Time Bias (zero-time bias)
Screening identifies disease during a latent period before it becomes symptomatic
If survival is measured from time of diagnosis, screening will always improve survival even if treatment is ineffective

Lead time bias
[Figure: lead time bias]
Source: EDITORIAL: Finding and Redefining Disease. Effective Clinical Practice, March/April 1999. Available at: ACP-Online http://www.acponline.org/journals/ecp/marapr99/primer.htm accessed 8/30/02

Avoiding Lead Time Bias
Only occurs when survival from diagnosis is compared between diseased persons
– Screened vs. not screened
– Diagnosed by screening vs. by symptoms
Avoiding lead time bias
– Measure mortality, not survival
– Count from date of randomization
– Follow patients for a long time (20 years?) and use total, not e.g. 5-year survival

Length Bias (Different natural history bias)
Screening picks up prevalent disease
Prevalence = incidence x duration
Slowly growing tumors have greater duration in presymptomatic phase, therefore greater prevalence
Therefore, cases picked up by screening will be disproportionately those that are slow growing

Length bias
[Figure: length bias]
Source: EDITORIAL: Finding and Redefining Disease. Effective Clinical Practice, March/April 1999. Available at: ACP-Online http://www.acponline.org/journals/ecp/marapr99/primer.htm

Length Bias
Slower growing tumor with better prognosis
Early detection ?
Higher cure rate

Avoiding Length Bias
Only present when
– survival from diagnosis is compared
– AND disease is heterogeneous
Lead time bias usually present as well
Avoiding length bias:
– Compare mortality in the ENTIRE screened group to the ENTIRE unscreened group
– Study disease subgroups with a uniform natural history

Stage migration bias
[Figure: the same patients classified into Stage 0 through Stage 4 by old tests and by new tests]

Stage migration bias
Also called the "Will Rogers Phenomenon"
– "When the Okies left Oklahoma and moved to California, they raised the average intelligence level in both states." -- Will Rogers
Documented with colon cancer at Yale
Other examples abound
– the more you look for disease, the higher the prevalence and the better the prognosis
Best reference on this topic: Black WC and Welch HG. Advances in diagnostic imaging and overestimation of disease prevalence and the benefits of therapy. NEJM 1993;328:1237-43.

A more general example of Stage Migration Bias
VLBW (< 1500 g), LBW (1500-2499 g) and NBW (≥ 2500 g) newborns exposed to Factor X in utero have decreased mortality compared with those not exposed
Is factor X good?
Maybe not! Factor X could be cigarette smoking!
– Smoking moves babies to lower birthweight strata
– Compared with other causes of LBW (e.g., prematurity) it is not as bad

Stage Migration Bias
[Figure: distribution of newborns across NBW, LBW and VLBW strata, unexposed to smoke vs. exposed to smoke]

Avoiding Stage Migration Bias
The harder you look for disease, and the more advanced the technology
– the higher the prevalence, the higher the stage, and the better the (apparent) outcome for the stage
Beware of stage migration in any stratified analysis
– Check OVERALL survival in screened vs. unscreened group
More generally, do not stratify on factors distal in a causal pathway to the factor you wish to evaluate!
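
The Will Rogers phenomenon can be checked with a small numeric sketch. The survival times below are hypothetical and chosen only for illustration (they are not taken from the Yale data or any other study), and the two-stage split is a simplifying assumption. One patient is restaged by a more sensitive test; no one's actual survival changes.

```python
from statistics import mean

# Hypothetical survival times in years (illustrative only, not study data).
early_old = [10, 9, 8, 4]  # staged as early disease by the old test
late_old = [3, 2, 1]       # staged as late disease by the old test

# Assumption: a more sensitive new test finds occult spread in the patient
# who survives 4 years, so that patient migrates from early to late stage.
migrated = [4]
early_new = [t for t in early_old if t not in migrated]
late_new = late_old + migrated

# Overall survival pools everyone, so restaging cannot change it.
overall_old = mean(early_old + late_old)
overall_new = mean(early_new + late_new)

print("early stage mean survival:", mean(early_old), "->", mean(early_new))
print("late stage mean survival: ", mean(late_old), "->", mean(late_new))
print("overall mean survival:    ", overall_old, "->", overall_new)
```

In this sketch the mean survival rises in both stages (7.75 to 9 years in the early stage, 2 to 2.5 years in the late stage) while the overall mean is unchanged, which is why the overall comparison is the one to check.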
Pseudodisease
A condition that looks just like the disease, but never would have bothered the patient
– Type I: Disease which would never cause symptoms
– Type II: Preclinical disease in people who will die from another cause before disease presents
In an individual treated patient it is impossible to distinguish pseudodisease from successfully treated asymptomatic disease
The Problem:
– Treating pseudodisease will always look successful
– Treating pseudodisease will always be harmful

Example: Mayo Lung Project
RCT of lung cancer screening
Enrollment 1971-76
9,211 male smokers randomized to two study arms
– Intervention: chest x-ray and sputum cytology every 4 months for 6 years (75% compliance)
– Control: Tests at trial entry, then a recommendation to receive the same tests annually
*Marcus et al., JNCI 2000;92:1308-16

Mayo Lung Project Extended Follow-up Results*
Among those with lung cancer, intervention group had more cancers diagnosed at early stage and better survival
*Marcus et al., JNCI 2000;92:1308-16

MLP Extended Follow-up Results*
Intervention group: slight increase in lung cancer mortality (P=0.09 by 1996)
*Marcus et al., JNCI 2000;92:1308-16

What happened?
After 20 years of follow-up, there was a significant increase (29%) in the total number of lung cancers in the screened group
– Excess of tumors in early stage
– No decrease in late stage tumors
Overdiagnosis (pseudodisease)
Black W. Overdiagnosis: an underrecognized cause of confusion and harm in cancer screening. JNCI 2000;92:1280-2

Looking for Pseudodisease
Appreciate the varying natural history of disease, and limits of diagnosis
Impossible to distinguish from successful cure of (asymptomatic) disease in individual patient
Few compelling stories of pseudodisease…
Clues to pseudodisease:
– Higher cumulative incidence of disease in screened group
– No difference in overall mortality between screened and unscreened groups

Each year, 182,000 women are diagnosed with breast cancer and 43,300 die.
One woman in eight either has or will develop breast cancer in her lifetime... If detected early, the five-year survival rate exceeds 95%. Mammograms are among the best early detection methods, yet 13 million women in the U.S. are 40 years old or older and have never had a mammogram.
39,800 Clicks per mammogram (Sept, ’04)

Why is this misleading?
43,300 deaths and 182,000 new cases each year suggest mortality is ~24%
5-year survival > 95% with early detection suggests < 5% mortality, suggesting about 80% of these deaths are preventable
Actual efficacy is closer to 20% or less for breast cancer mortality (lower for total mortality)

Questions?
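
The arithmetic a reader is invited to do from the breast cancer message can be sketched as follows. The three input numbers are the ones quoted in the message; the variable names are illustrative, and treating annual deaths divided by annual new cases as a case fatality is exactly the naive step being criticized, not a valid estimate.

```python
# Figures quoted in the message.
new_cases_per_year = 182_000
deaths_per_year = 43_300
five_year_survival_early = 0.95

# Naive reading: deaths / new cases looks like a ~24% chance of dying.
naive_case_fatality = deaths_per_year / new_cases_per_year

# Naive reading: >95% five-year survival looks like <5% mortality.
implied_fatality_early = 1 - five_year_survival_early

# Implied share of deaths that early detection would seem to prevent.
implied_preventable = (
    naive_case_fatality - implied_fatality_early
) / naive_case_fatality

print(f"naive case fatality: {naive_case_fatality:.0%}")  # 24%
print(f"implied preventable: {implied_preventable:.0%}")  # 79%
```

The implied figure of roughly 80% comes from comparing a survival statistic with a mortality statistic, so it inherits lead time bias, length bias and pseudodisease; it is several times larger than the reduction of 20% or less cited for breast cancer mortality.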