USPSTF Breast Cancer Screening: Science, Policy & Politics
The Good, The Bad and The Ugly

J. Sanford Schwartz, MD
Leon Hess Professor of Medicine and Health Management & Economics
Perelman School of Medicine & The Wharton School
University of Pennsylvania
Penn

USPSTF Screening Mammography Recommendation for Women Ages 40-50
• What is the USPSTF? (mandate, membership)
• The USPSTF decision process
• Why the recommendation was made (and timing)
• Data from which the recommendation was generated
• The recommendation and why it changed from the previous USPSTF recommendation
• Why the recommendation generated controversy
  – Importance
  – How the recommendation was communicated
  – Political context
• My subjective assessment of how things went, why, and why the topic will remain controversial

“The USPSTF recommends against routine screening mammography in women aged 40-49. The decision to start…should be an individualized one and take patient context into account, including the patient’s values regarding specific benefits and harms.” (C recommendation)
Moderate certainty that the net benefit is small
Issued October 2009

United States Preventive Services Task Force
Government-appointed, independent advisory group established 1984 by Congressional mandate
• Recommend which preventive services should be incorporated routinely into primary medical care, and for which populations (age, gender, risk factors)
• Identify research agenda for preventive care
• 16 PCPs (IM, FP, Peds, Ob–Gyn)
• Rotating 4–6 year terms
• Review scientific evidence for clinical preventive services (members with COI excluded)

United States Preventive Services Task Force
Staffed by AHRQ (staff, fellows, medical officers)
Partner organizations:
Federal
• CDC
• NIH
• VA
Professional Societies
• ACP
• APA
• ACOG
• ACFM
Public Advocacy Groups
• AARP

USPSTF Methodology: A (Very) Short Primer
• Select topic (largely subjective process)
• Identify interventions and outcomes of interest
• Examine key questions via chain of evidence within specified analytic framework
• Systematic review of evidence (AHRQ EPC)
• Assess evidence, estimate magnitude and certainty of benefits and harms, assign consensus recommendation grade
• Peer review of evidence report & recommendation
• Draft recommendation posted on website*
• Final recommendation issued
• US government and Ann Intern Med review
http://www.ahrq.gov/clinic/uspstf08/methods/procmanualap7.htm

USPSTF Recommendation Grade

Certainty of       Magnitude of Net Benefit
Net Benefit        Substantial   Moderate   Small   Zero/Neg
High               A             B          C       D
Moderate           B             B          C       D
Low                Insufficient

Evidence: Convincing, Adequate, Inadequate

USPSTF Recommendation
(scale runs from “Provide Routinely” to “Do Not Provide”)
A   Strongly recommends
B   Recommends
C   Recommends against routinely providing (individual risk/benefit)
D   Recommends against
I   Insufficient evidence
• Highlights Clinical/Other Considerations
• Discussion & Recommendation of Others

United States Preventive Services Task Force
Does not:
• Advise insurers
• Make health care coverage decisions
However, the Affordable Care Act of 2010 mandates that all preventive services that receive an ‘A’ or ‘B’ recommendation by the USPSTF must be covered by insurers at no cost to the beneficiary

Screening Mammography: Primary MD and Patient Questions
• Should I get mammograms?
• If so, starting at what age, and how often?
• When, if ever, should I stop?
  – What is the benefit?
  – What are the harms?
  – How do my personal risk factors for breast cancer affect the decision?
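
The recommendation grade matrix shown earlier (certainty of net benefit against magnitude of net benefit) can be read as a simple lookup. The sketch below is illustrative only: the function name `uspstf_grade` and the lowercase string labels are assumptions made for this example, not part of any USPSTF tool, and it encodes only the cells that appear on the slide.

```python
# Illustrative lookup of the grade matrix from the "USPSTF Recommendation Grade" slide.
# The names below are hypothetical and exist only for this sketch.

GRADE_MATRIX = {
    "high":     {"substantial": "A", "moderate": "B", "small": "C", "zero/negative": "D"},
    "moderate": {"substantial": "B", "moderate": "B", "small": "C", "zero/negative": "D"},
}


def uspstf_grade(certainty: str, net_benefit: str) -> str:
    """Return the letter grade for a certainty level and a magnitude of net benefit."""
    certainty = certainty.strip().lower()
    net_benefit = net_benefit.strip().lower()

    # Low certainty maps to an "I" (insufficient evidence) statement,
    # whatever the estimated magnitude of benefit.
    if certainty == "low":
        return "I"

    try:
        return GRADE_MATRIX[certainty][net_benefit]
    except KeyError:
        raise ValueError(
            f"unknown combination: certainty={certainty!r}, net_benefit={net_benefit!r}"
        ) from None


if __name__ == "__main__":
    # The 2009 statement for ages 40-49: moderate certainty, small net benefit.
    print(uspstf_grade("moderate", "small"))
```

With the inputs quoted on the recommendation slide (moderate certainty, small net benefit) the lookup returns C, which matches the grade the Task Force assigned.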

“I have yet to see any problem, however complicated, which … looked at it in the right way, did not become still more complicated”
– Poul Anderson

USPSTF Breast Ca Screening: Methods of Analysis
• Meta-analysis of RCTs of screening effectiveness
• Trials rated “fair-quality” or better from 2002 review and any new trials or updates since then
• Rates and proportions calculated using primary data from Breast Cancer Surveillance Consortium
• Outcomes Table constructed to estimate magnitude of screening benefits & harms (by age)
• Natural history modeling (CISNET)

Diagnostic Test Performance

                 Disease State
Test Result      Disease Present        Disease Absent
Test Positive    True Positive (TP)     False Positive (FP)
Test Negative    False Negative (FN)    True Negative (TN)

Sensitivity (Se) = TP / (TP + FN)
Specificity (Sp) = TN / (TN + FP)
Predictive Value (PV) + = TP / (TP + FP)
Predictive Value (PV) – = TN / (TN + FN)

Lead Time
Time between disease detection by screening and time of usual symptomatic diagnosis
• Rate of biological progression of disease
• Screening test sensitivity
Lead time bias
Artifactual survival prolongation resulting from earlier disease detection in the absence of increased effectiveness of earlier intervention

Lead Time Bias

Length/Time Bias
Artifactual increased measured survival from selectively increased detection of less aggressive disease with better prognosis

Overestimation of Screening Test Benefit: Prevalence Bias
Unrepresentative impact of detection of prevalent cases in early screening cycles
Impact of incident cases increases with the number of subsequent cycles

Length Time Bias
[Figure: two disease timelines, each marked Disease Begins, Screen detection, Clinical symptoms, Death]
Courtesy of Emily Conant, MD.
University of Pennsylvania

Screen Detection Capability Based on Tumor Biology and Growth Rates

Overdiagnosis Bias
Overdiagnosis of a condition (pseudodisease) that would not become clinically significant in a patient’s lifetime
Such disease has no effect on mortality; its overdiagnosis is the major harm of screening

New Evidence: Age 40-49 Yrs
Age Trial (UK 1991–1997)
Study Design
• RCT annual mammography to age 48 yrs (n=53,884) vs. “usual care” (n=106,956)
• F/U through National Health Service register
  – 81% attended at least 1 screen
  – 4.5 mean rounds
  – 10.7 yrs follow-up
Results
• Breast cancer mortality: RR 0.83 (0.66-1.04); NNI 2,512 (1,149-13,544)
• All-cause mortality: RR 0.97 (0.89-1.04)

New Evidence: Age 40-49 Yrs
Age Trial (UK 1991–1997)
Strengths
• Designed to determine effectiveness at ages 40-49
• Largest trial, community population
• Most recently conducted RCT
• Consistent with results of meta-analysis of previous RCTs
Limitations
• Applicability to US not clear (recall rate 3%–5%)
• Mortality lower than expected in control group
• Only 10 yrs follow-up
• 30% attrition, contamination not reported

New Evidence: Age 40-49 Yrs: Additional F/U
Gothenburg Trial
RCT, ages 39–59 yrs, Gothenburg, Sweden 1982
• Mammography q18 mo (n = 20,724) vs. “usual care” (n = 29,200)
• All offered screening at end of trial (year 5)
• 85% attended first screen; 5 mean rounds; 14 yrs follow-up
• 25-40% attrition, 20% contamination
• Results: age 40-49 yrs: Breast cancer mortality RR 0.69 (0.45-1.05)

Meta-analysis of Screening RCTs: Women Ages 39 to 49 Years
Screening every 1-3 years, all “fair quality”

10–Year Risk of Death from Breast Cancer: Beginning Routine Screening Age 40 vs. Age 50

                           Ages 40-49   Ages 50-59
Without screening          0.33%        0.89%
With screening             0.28%        0.69%
Absolute risk reduction    0.05%        0.20%

Source: Steve Woloshin, Veterans Affairs Outcomes Group

Breast Cancer Surveillance Consortium Data
• Women in BCSC who had at least one prior screening mammogram within 2 years (“routine screening”)
• Screened between 2000-2005 at all 7 sites
• Data provided by age in decades beginning at 40 years (also collapsed for women 70+)

Breast Cancer Surveillance Consortium: Registry Advantages
• Represents current U.S. practice
• National multi-site sample of 8M mammograms
• Reflects real world rather than study population (especially useful when evaluating harms)

Breast Cancer Surveillance Consortium Data
• Cancer rates increase and false positive mammogram rates decrease with age
• Number of women undergoing additional imaging and biopsy per BCa diagnosed decreases with age
• Biopsy rates are lower in younger than in older women
• Cancer detection rates similar in US, UK, Europe
• Rates of false positives and recall rates in the US at least twice rates in Canada, UK, Europe

Outcomes Table

Incremental Benefit of Extending Screening Age 50–69 to Age 40–69: CISNET Models

Meta-analysis of Screening Trials and CISNET Modeling: Limitations
• Subgroup analysis by age excludes data
• Trials use intention-to-treat analysis and report “number needed to invite for screening,” not those actually screened
• Trials are only “fair-quality” due to attrition (>30%) and contamination (>20%)
• Applicability questionable: only one U.S. study, >20 yrs ago, prior to current diagnostic and treatment practices
• Harms and CISNET data are for those actually screened

Screening Mammography: Benefits
• Eight RCTs enrolling more than 600,000 women:
  – Screening mammography reduces breast cancer mortality.
  – Observed mortality reduction is ~15% (0 to 30%, with better designed trials – i.e., less biased mortality ascertainment and randomization)
• Effects on all-cause mortality are unknown.

Screening Mammography: Harms
• Overdiagnosis (screen detection and subsequent treatment of breast cancer that never would have surfaced clinically)
  – Extent of overdiagnosis difficult to estimate, requiring life-long f/u of screened and unscreened cohorts
  – Best estimate 2%-10%, with higher estimates in more rigorous studies (i.e., up to 18% of screen detected breast cancers would never surface clinically)
• False positive mammograms resulting in unnecessary biopsies, anxiety and expense

USPSTF Screening Mammography: Benefits vs. Harms
Beginning Age 40 vs. Age 50

             Benefit           Harms
Magnitude    Very large        Very small – moderate
Frequency    Rare < 1:1,000    Very common 10–50/1,000
Timing       Late              Early

Summary of Evidence
• Primary evidence is not changing (and likely will not change, given no active prospective trials)
• Interpretation of evidence is changing, but slowly (as usual)
  – Benefits are modest
  – Consensus that benefits of mammography outweigh harms in women ages 50–69
  – Disagreement RE: frequency (annual vs. biennial)
  – Disagreement RE: screening ages 40–49
  – Disagreement RE: when to discontinue screening (age 74; age 79; never)

Evidence Limitations: Why there is so much disagreement
Data limitations
• RCTs comparing start ages 40 vs. 50 have inadequate power and f/u duration
• No RCTs RE: screening frequency (and unlikely to be conducted)
Cultural limitations
• Harms difficult for many people to grasp
• Bias toward inherent belief in earlier detection, regardless of impact on outcome
• Misinformation (incidence, prevalence, benefits, harms)

Evidence Limitations: Why there is so much disagreement
Evidence based medicine is not value free:
• Harms and benefits involve comparison of dissimilar outcomes
• Subjective expertise – just locus of control shifted from physician to methodologist

“What we've got here is a failure to communicate”
Paul Newman, Cool Hand Luke, 1967

“The decision to start…should be an individualized one and take patient context into account, including the patient’s values regarding specific benefits and harms.” (C recommendation)
Moderate certainty that the net benefit is small

“In the midst of every challenge lies opportunity”
- Albert Einstein
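
Worked arithmetic for the 10-year risk table shown earlier (0.33% vs. 0.28% at ages 40-49; 0.89% vs. 0.69% at ages 50-59). The sketch below only redoes the subtraction behind the table's absolute risk reduction row and adds two derived quantities, the relative risk reduction and the reciprocal of the absolute reduction. The helper name `risk_summary` is an assumption for this example, and the derived figures are arithmetic consequences of the slide's numbers, not results reported in the talk; in particular they differ from the trial-based "number needed to invite".

```python
from fractions import Fraction
import math

# 10-year risk of death from breast cancer, in percent, copied from the slide's table.
RISK_PERCENT = {
    "40-49": {"without": Fraction("0.33"), "with": Fraction("0.28")},
    "50-59": {"without": Fraction("0.89"), "with": Fraction("0.69")},
}


def risk_summary(without_pct: Fraction, with_pct: Fraction):
    """Return (absolute risk reduction in %, relative risk reduction, women screened per death averted)."""
    arr_pct = without_pct - with_pct      # absolute risk reduction, percentage points
    rrr = arr_pct / without_pct           # relative risk reduction, as a fraction of baseline risk
    per_death = math.ceil(100 / arr_pct)  # reciprocal of the ARR expressed as a proportion
    return arr_pct, rrr, per_death


for age, risk in RISK_PERCENT.items():
    arr, rrr, per_death = risk_summary(risk["without"], risk["with"])
    print(
        f"Ages {age}: ARR {float(arr):.2f}%, RRR {float(rrr):.0%}, "
        f"about {per_death} women screened for 10 years per death averted"
    )
```

Using exact fractions avoids floating-point noise in the subtraction. On these inputs the reciprocal works out to about 2,000 women at ages 40-49 and about 500 at ages 50-59, which is the "rare benefit" side of the benefits vs. harms comparison.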