GRADE detailed series - JCE
GRADE Guidelines: Diagnosis II
Version 20130912
The GRADE approach for tests and strategies: from test accuracy to patient important
outcomes and recommendations
Contributors so far (based on this and the prior version of the single article and in no
particular order at this point – others welcome): Holger J Schünemann, Reem Mustafa,
Nancy Santesso, Jan Brozek, Patrick Bossuyt, Miranda Langendam, Andrew D Oxman, Karen
R Steingart, Tommaso Trenti, Paul Glasziou, Roman Jaeschke, Julia Kreis, Mark Helfand, Rob
Scholten, Anne Rutjes, Gordon H Guyatt for the GRADE Working Group
!Attention!: heavy self citation needs to be addressed
Word count:
Tables: 3
Figures: 4
Key points:
GRADE has developed and applied a comprehensive framework for rating the confidence in
estimates from a body of evidence obtained from diagnostic test studies and linking this
evidence to health outcomes.
Preferably, developers of recommendations will evaluate and rate a body of evidence for
each of the pieces of evidence that is required for decision-making. Ideally, they will base
the rating on a systematic review of the required evidence.
Further research on linking diagnostic test accuracy evidence to the other evidence that is needed
for judgments about the impact on health outcomes should focus on combining the ratings of
the confidence in estimates from the various bodies of evidence.
Abstract:
In the present article we will focus on GRADE’s framework of moving from diagnostic test
accuracy to health related outcomes when direct studies evaluating the impact of diagnostic
tests or strategies are not providing the best available evidence. We will also describe how
guideline developers can use information from diagnostic test accuracy to develop a
recommendation.
1. Introduction
The previous article in this series describes how systematic review authors and guideline
developers assess their confidence in the estimates of a body of evidence evaluating tests
and testing strategies, i.e. the quality of evidence. In that article we focused on applying
GRADE to test accuracy (TA) studies. In the present article we will focus on GRADE’s
framework of moving from TA to important health outcomes when direct studies evaluating
the impact of diagnostic tests or strategies are not providing the best available evidence.
We will also describe how guideline developers can use information from diagnostic test
accuracy to develop a recommendation. Thus, the first part of this article will describe the
judgments about directness involved in assessing the link between TA and important health
outcomes. In particular, we will describe why guideline panels should be cautious when
they use evidence of TA as the basis for recommendations: doing so requires a review of, and
judgments about, the evidence that links test accuracy to patient- or
population-important outcomes.
The second part will focus on the steps and criteria that are involved in moving from
evidence to a recommendation or decision using examples from guidelines that have
applied this approach to diagnostic tests and strategies. We will conclude by summarizing
work done, challenges and suggestions for future work.
2. What evidence is needed to make assumptions about patient outcomes?
Guideline developers will have to develop a clear idea of what consequences they anticipate
from applying a test or strategy. In fact, applying GRADE requires
understanding that a recommendation about the use of a test should
result from balancing the desirable and the undesirable consequences (including non-health
related consequences such as resource utilization).1-3 Applying GRADE, in the context of
making recommendations or decisions about tests, means that a division between testing
and therapy, treatment or observation is artificial, but sometimes practical or pragmatic, e.g.
when comparing the diagnostic test accuracy of two competing tests, one of which is
already established. Figure 1 emphasizes that testing, therefore, has consequences that
become part of an intervention, including observation when no further action is required or
possible, and that these consequences should be considered.4 Developers of recommendations should develop a
pathway that follows from applying a test which allows for the consideration of such
consequences. Figure 2 describes a pathway developed for a World Health Organization
guideline on screening and treatment of cervical intraepithelial neoplasia (CIN), a precursor
for cervical cancer. The guideline panel considered different screening options, human
papilloma virus (HPV) testing and visual inspection with acetic acid (VIA), and the subsequent treatment
that patients may or may not be able to receive. For instance, only some patients,
depending on the type of lesion, would be able to receive cryotherapy. If the lesion was
not deemed eligible for cryotherapy, other therapeutic interventions, in this case cold
knife conization (CKC) or loop electrosurgical excision procedure (LEEP), might be the best
alternatives. The panel then considered the possible consequences that can result from
each of the possible screen and treat pathways in terms of health outcomes, considering
both benefits and harms. Figure 3 describes an alternative generic analytic framework
developed by the US Preventive Services Task Force (USPSTF) to describe these
considerations and relations.12
Figure 4 describes the sequential steps and considerations that are important when
linking evidence about the consequences of testing, treating, observing and missing diagnoses.
Returning to the example of the cervical screening guidelines, step 1 included a systematic
review of the body of evidence describing the diagnostic test accuracy (DTA) of the two different screening tests
against a reference standard. The authors of the systematic review conducted a meta-analysis to obtain summary estimates of the sensitivity and specificity of the two screening
tests. The meta-analysis revealed a pooled sensitivity of 95% (95% CI: 84 to 98) and
a pooled specificity of 84% (95% CI: 72 to 91) for HPV and a pooled sensitivity of 69% (95% CI: 54
to 81) and a pooled specificity of 87% (95% CI: 79 to 92) for VIA, based on five
studies that compared both tests against a reference standard. Applying these summary
statistics to the pretest probability of the target population (assumed here to be 5%)
allowed determining the number of test positives (true positives and false positives) and
test negatives (true negatives and false negatives). For example, 4.8% (48 per 1000 women,
or 95% of 5%) in the HPV group and 3.5% (35 per 1000 women, or 69% of 5%) in the VIA group
would be true positives. Step 2 involved linking these test outcomes to the
anticipated important health outcomes. Together with a literature review about which
outcomes women may experience, the multidisciplinary panel provided information about
such outcomes. Women with a positive test, i.e. indicating presence of CIN, would undergo
further management with one of the possible therapies to reduce the risk of cervical cancer.
Treatment comes with a certain chance of cure and a certain risk of side effects. However, those with a
false positive test result would also undergo treatment and experience the adverse
consequences without experiencing the benefits. Women with a negative test result would
not be treated and would be further observed. However, this group will include women with a false
negative test result who will have a certain risk of CIN developing into cervical cancer,
following the natural history of disease. This model ignores the possibility of repeating
the screening test; however, in some settings, such as low and middle-income countries, women may
undergo only a single test, so the model is realistic.
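The calculation in step 1 can be sketched in a few lines of Python. This is an illustration only: the class and function names are hypothetical, the 5% pretest probability is the assumption stated above, and the unrounded results differ slightly from the rounded counts reported in the text and in Table 1.

```python
from dataclasses import dataclass


@dataclass
class Outcomes2x2:
    """Expected test results per 1000 people tested."""
    tp: float  # true positives
    fn: float  # false negatives
    tn: float  # true negatives
    fp: float  # false positives


def outcomes_per_1000(sensitivity: float, specificity: float,
                      prevalence: float) -> Outcomes2x2:
    """Apply summary accuracy estimates to an assumed pretest probability."""
    diseased = 1000 * prevalence
    healthy = 1000 - diseased
    tp = diseased * sensitivity
    tn = healthy * specificity
    return Outcomes2x2(tp=tp, fn=diseased - tp, tn=tn, fp=healthy - tn)


# Pooled estimates quoted in the text; the 5% prevalence is an assumption.
hpv = outcomes_per_1000(sensitivity=0.95, specificity=0.84, prevalence=0.05)
via = outcomes_per_1000(sensitivity=0.69, specificity=0.87, prevalence=0.05)

for name, res in (("HPV", hpv), ("VIA", via)):
    print(f"{name}: TP={res.tp:.1f} FN={res.fn:.1f} "
          f"TN={res.tn:.1f} FP={res.fp:.1f} per 1000")
```

For HPV the sketch gives 47.5 true positives per 1000 (reported above as 48), and for VIA 34.5 (reported as 35).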
The corresponding estimates of treatment efficacy, side effects and natural history, should,
ideally, be derived from systematic reviews of the relevant evidence. For example, the
efficacy of cryotherapy should be evaluated with a systematic review as should the risk of
developing cervical cancer in untreated CIN (the natural history of the disease). In fact, the
systematic review determining the efficacy of cryotherapy revealed a 61% relative risk
reduction based on observational data5 and the search for evidence about the natural
history revealed an approximate 2% progression to cervical cancer over 30 years
(http://globocan.iarc.fr/factsheets/cancers/cervix.asp - April 18, 2012).
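To illustrate how such linked estimates combine with the test results, the following Python sketch applies the 61% relative risk reduction and the approximate 2% progression risk to the true positives and false negatives per 1000 women screened. It is a deliberately simplified illustration, not the guideline panel's model: it assumes that only women with CIN progress, that every test-positive woman receives cryotherapy, and that false negatives follow the natural history. The function name is hypothetical, and the results are not expected to reproduce Table 2.

```python
PROGRESSION_30Y = 0.02  # approximate progression of untreated CIN over 30 years
RRR_CRYO = 0.61         # relative risk reduction with cryotherapy


def cancers_per_1000(tp: float, fn: float,
                     progression_risk: float = PROGRESSION_30Y,
                     relative_risk_reduction: float = RRR_CRYO) -> float:
    """Expected cervical cancers per 1000 women screened (simplified)."""
    # Treated true positives keep only the residual risk after treatment.
    in_treated = tp * progression_risk * (1 - relative_risk_reduction)
    # Missed cases (false negatives) keep the full natural-history risk.
    in_missed = fn * progression_risk
    return in_treated + in_missed


# Unrounded TP and FN per 1000 at 5% pretest probability.
hpv_cancers = cancers_per_1000(tp=47.5, fn=2.5)
via_cancers = cancers_per_1000(tp=34.5, fn=15.5)
no_screen_cancers = cancers_per_1000(tp=0.0, fn=50.0)

print(f"HPV: {hpv_cancers:.2f}, VIA: {via_cancers:.2f}, "
      f"no screening: {no_screen_cancers:.2f} per 1000")
```

Under these assumptions the difference between the two tests is driven almost entirely by the number of false negatives, which is why the confidence in the natural history estimate matters for the overall judgment.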
For a guideline evaluating the use of testing for cows’ milk allergy (CMA), a condition
affecting between 2 and 5% of children, the guideline panel was asked to evaluate the
possible benefits and downsides of the various test outcomes on the basis of case examples
using semi-quantitative information.6,7 For instance, in order to understand the
consequences associated with the 264 per 1000 false negative skin prick tests in a
population with a high risk (pretest probability) for CMA, guideline panel members were
provided with typical case scenarios: the child suspected of CMA will be allowed to return
home and will have an allergic reaction (possibly anaphylactic) to cows’ milk at home; there will be high
parental anxiety and reluctance to introduce future foods, which may lead to a multiple exclusion
diet; and the real cause of symptoms (i.e. CMA) will be missed, leading to unnecessary
investigations and treatments. These case scenarios, the baseline risk and the possible
consequences, were based on a review of the literature and information obtained from
allergists with experience in caring for affected patients.
In another guideline, a WHO guideline panel considered the consequences of applying
serological tests in a population with a 10% risk of pulmonary tuberculosis where a
sensitivity of 59% and specificity of 95% leads to a risk of 81 per 1000 false positives and 36
per 1000 false negatives.8 Guideline panel members applied evidence synthesized in
tuberculosis treatment guidelines to link the treatment efficacy and possible detrimental
effects from delayed diagnosis, confusing other respiratory diseases (such as pneumonia)
with pulmonary TB and consequential death from other disease, adverse drug reactions and
unnecessary consumption of health care and patient resources.
[here other examples]
3. How can the confidence in the estimates be graded?
Preferably, developers of recommendations will evaluate and rate a body of evidence for
each of the pieces of evidence that is required for decision-making. Ideally, they will base
the rating on a systematic review of the required evidence. This rating will inform how directly
diagnostic test accuracy, which GRADE considers a surrogate marker that requires further
evaluation of the related consequences, relates to health outcomes.
3.1 Rating the diagnostic test accuracy
As described in the prior article, when direct evidence about important health outcomes is
not available or is associated with low confidence, GRADE begins by assessing the confidence
in the estimates of the DTA related to the test. The systematic review of HPV and VIA
revealed that there was important inconsistency in the specificity estimates across the 5
included studies, yielding a confidence rating of moderate for the specificity estimates, while the confidence
rating for the sensitivity estimates remained high (Table 1). Layer 1 summary of findings (SoF) tables do not
consider the directness of the relation between DTA and health outcomes.
[other examples here, e.g. CMA]
3.2 Rating the linked evidence – directness of the health outcomes
To complete an assessment of the confidence in the estimates, in an ideal situation a rating
of the confidence in the estimates should be undertaken for the body of evidence informing
all key input variables. In other words, assessing the linked evidence completes the
assessment of the directness of the outcomes in GRADE’s directness domain.
For example, estimates of the baseline risk used to calculate the test results in Table 1 may
influence the overall rating of the confidence. Application of GRADE for prognostic studies
or prevalence studies will inform the rating of this confidence.9 [ref Falavigna] Similarly, the
confidence in the estimates of the treatment effects of cryotherapy and other treatments
should influence the overall confidence in the body of evidence supporting a
recommendation. For example, applying GRADE for interventions, the confidence in the
estimates for the effects of cryotherapy was very low because they came from observational studies
with high risk of bias. Persistence of CIN in false negatives was estimated as approximately
70% based on moderate quality evidence from longitudinal prognostic observational
studies. [check] Thus, step 2 in Figure 4 involves a rating of the confidence in the estimates
when going from the test results to important health outcomes for a population. The
authors of the cervical cancer guideline had very low confidence in the linked bodies of
evidence when they derived the estimates for the various patient important outcomes
based on the considerations above. Table 2 describes a layer 3 SoF Table for tests based on
the best available research evidence and additional information the guideline panel
obtained. The explanations in the related text provide the sources of evidence, assumptions
made and explanations.
[other examples here, including layer three table where rating down for directness is part of
the overall quality rating]
4. How does the confidence in the estimates of the linked evidence influence the overall
rating of the confidence in estimates?
Because the linked evidence often lowers the overall confidence one has in the
evidence required to formulate recommendations and make decisions about tests, there are
several options for rating the overall confidence in the estimates.
Option 1. Evaluate which bodies of linked evidence are critical for decision-making and base
the overall rating of the confidence for population important outcomes on the lowest
confidence of these bodies of evidence. For example, despite high confidence in the
estimates of diagnostic test accuracy for TP and FN and moderate confidence in TN and FP,
the recommendation would be associated with a rating of very low confidence resulting
from the uncertainty about several of the linked bodies of evidence (e.g. natural history of
the disease, efficacy of cryotherapy). Whether or not linked evidence is critical to decision-making will be influenced by the frequency with which an outcome occurs and by its importance.
This is the approach the guideline panel on cervical cancer screening took by rating the
overall confidence as very low.
Option 1b. Base the overall rating on any of the linked evidence without considering what
might lead to critical outcomes.
Option 2. Present the evidence from diagnostic test accuracy and linked evidence
separately without assigning an overall rating of the confidence.
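The difference between Option 1 and Option 1b can be made concrete with a small Python sketch. The ratings below follow the cervical screening example in this article; the flags marking a body of evidence as critical are illustrative assumptions rather than judgments reported by the panel, and all names are hypothetical.

```python
from enum import IntEnum


class Confidence(IntEnum):
    VERY_LOW = 1
    LOW = 2
    MODERATE = 3
    HIGH = 4


# (body of evidence, rating, judged critical for the decision?)
cervical_screening = [
    ("DTA: true positives and false negatives", Confidence.HIGH, True),
    ("DTA: true negatives and false positives", Confidence.MODERATE, True),
    ("Efficacy of cryotherapy", Confidence.VERY_LOW, True),
    ("Persistence of CIN in false negatives", Confidence.MODERATE, False),
]


def overall_option_1(evidence):
    """Option 1: lowest rating among the bodies of evidence judged critical."""
    return min(rating for _, rating, critical in evidence if critical)


def overall_option_1b(evidence):
    """Option 1b: lowest rating among all linked bodies of evidence."""
    return min(rating for _, rating, _ in evidence)


print(overall_option_1(cervical_screening).name)
print(overall_option_1b(cervical_screening).name)
```

Both options give a very low overall rating here. They differ only when the lowest rated body of evidence is not judged critical, for example high confidence in a critical body of evidence combined with low confidence in a non-critical one.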
For further discussion
When can linked evidence from other scenarios be applied without completing a full
assessment of the evidence review for all linked evidence?
5. How can decisions and recommendations be made about tests?
A recommendation associated with a diagnostic question follows from an evaluation of the
balance between the desirable and undesirable consequences of the test and subsequent
therapy, treatment, management or observation after applying the test (Figure 1).
When the consequences of false positive, false negative and inconclusive results, and the
complication rates of the alternative diagnostic strategies, are reasonably certain, and those
outcomes are important, we can make strong inferences concerning the relative impact of a
test on important health outcomes.
The guideline panel that developed recommendations regarding serological testing in
patients suspected of cows’ milk allergy (see example 2 in box 1) determined that, for patients with a
relatively low probability of the disease (approximately 10%), skin prick testing results in a
large number of false positives, leading to unnecessary anxiety and further testing. It also
leads to missing about 3% of tested patients (33 per 1000 are false negatives) who
suffer from cows’ milk allergy, with the risk of severe allergic reaction and death.
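As a cross-check on the figures in this example, the sensitivity implied by 33 false negatives per 1000 at a pretest probability of 10% can be back-calculated. The Python sketch below is arithmetic only: the helper names are hypothetical, and the implied values are inferences from the numbers quoted in this article, not estimates reported by the guideline panel.

```python
from fractions import Fraction


def implied_sensitivity(fn_per_1000: int, pretest_probability: Fraction) -> Fraction:
    """Sensitivity implied by a false negative count per 1000 tested."""
    diseased_per_1000 = 1000 * pretest_probability
    return 1 - Fraction(fn_per_1000) / diseased_per_1000


def implied_pretest_probability(fn_per_1000: int, sensitivity: Fraction) -> Fraction:
    """Pretest probability implied by a false negative count and a sensitivity."""
    return Fraction(fn_per_1000, 1000) / (1 - sensitivity)


# 33 false negatives per 1000 at an approximate pretest probability of 10%.
sensitivity = implied_sensitivity(33, Fraction(10, 100))

# Assumption: the same sensitivity applies in the high-risk population for
# which 264 false negatives per 1000 were quoted earlier in this article.
high_risk = implied_pretest_probability(264, sensitivity)

print(f"implied sensitivity: {float(sensitivity):.2f}")
print(f"implied high-risk pretest probability: {float(high_risk):.2f}")
```

Exact fractions are used so that the back-calculation is not affected by floating point rounding.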
Uncertainty regarding the consequences of false positive and false negative results will
weaken inferences about the balance between desirable and undesirable consequences.
Consider the consequences of false positive and false negative results of diagnostic imaging
for patients suspected of acute sinusitis. Since the primary benefit of treatment is
shortening of illness duration and symptoms, the balance of the patient important
consequences is less clear between a) patients with false negative results, who are deprived
of antibiotics and will have a longer duration of symptoms and an increased risk of
complications from the infection, but suffer no side effects from antibiotic use, and b)
patients with false positive results, who receive antibiotics when they should not but may feel
relieved that they have received care and treatment. Furthermore, guideline panels will
have to consider the societal consequences (e.g. antibiotic resistance) of administering
antibiotics to false positives.9
GRADE has used decision tables that increase transparency of the decision making process
to document such considerations by a panel.3 Extensive work has informed the selection of
criteria that influence the development of health care recommendations about tests. [GIF report,
Reem thesis, JCE papers]
Formats of these tables, labeled evidence to recommendation frameworks,
have been further developed as part of the DECIDE project and are included in GRADE’s
guideline development tool.10
The purpose of the frameworks is to help guideline panels developing recommendations
about the use of tests to move from evidence to recommendations. They are intended to inform
decision makers’ judgments about the desirable and undesirable consequences of the considered options
(these may be diagnostic tests used for diagnosis, monitoring or other purposes, which may
sometimes be combined with management options). The frameworks also ensure that
important factors that determine a recommendation (criteria) are considered by providing a
concise summary of the best available evidence. They allow for a structured discussion and
identify reasons for agreement and disagreement.
One or more of the three layers of a SoF table for tests should be included in the framework
with a link to the full GRADE evidence profile. At a minimum, a description of the
expected health outcomes, ideally in a layer 3 SoF table (Table 2) or as a narrative summary,
should be included in the evidence to recommendation framework. Modeling, that is
calculating the anticipated benefits and harms as well as other desirable and undesirable
consequences, is often required. The assumptions for these models should be described in
the framework or in background information (Table 2). Other information listed in Table 3
can be included when guideline panels intend to achieve complete transparency about the
recommendations they make (supplement – evidence to recommendation framework).
The cervical cancer guideline panel made a … recommendation for the use of the following
tests based on the considerations described in the evidence to recommendation framework
(supplement).
6. Conclusions
GRADE has developed and applied a comprehensive framework for rating the confidence in
estimates from a body of evidence obtained from DTA studies and linking this evidence to
health outcomes when studies directly evaluating the impact of testing on health outcomes
are not available or not trustworthy. The framework focuses on explicitly and transparently
laying out the bodies of evidence required to make the link. While the framework has
facilitated the development of recommendations about diagnostic tests for several
guidelines6-8,11 [add other references] and can be readily applied, further examples and future
research in several areas addressing the assessment of the confidence and the degree of
modeling required will move this field forward. Further testing of evidence to
recommendation frameworks will facilitate the development of recommendations about
tests.
Disclosure Statement
The authors are members of the GRADE Working Group.
Acknowledgment
This work was partially funded by a European Community's Sixth Framework Programme
(FP6/2001-2006) “The human factor, mobility and Marie Curie Actions Scientist
Reintegration” IGR 42192 (“GRADE”, to Dr. Schünemann), the European Community's
Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 258583 (DECIDE
project), the German Insurance Fund and the Cochrane Collaboration (Methods Innovation
Fund). We would like to thank the many individuals and organizations who have contributed
to the progress of the GRADE approach through funding of meetings and feedback on the
work described in this article.
References
1. Schunemann HJ, Oxman AD, Brozek J, Glasziou P, Bossuyt P, Chang S, et al. GRADE:
assessing the quality of evidence for diagnostic recommendations. ACP J Club
2008;149(6):2.
2. Andrews J, Guyatt G, Oxman AD, Alderson P, Dahm P, Falck-Ytter Y, et al. GRADE
guidelines: 14. Going from evidence to recommendations: the significance and
presentation of recommendations. Journal of clinical epidemiology 2013;66(7):719-25.
3. Andrews JC, Schunemann HJ, Oxman AD, Pottie K, Meerpohl JJ, Coello PA, et al.
GRADE guidelines: 15. Going from evidence to recommendation-determinants of a
recommendation's direction and strength. Journal of clinical epidemiology
2013;66(7):726-35.
4. Schunemann HJ, Mustafa R, Brozek J. [Diagnostic accuracy and linked evidence--testing
the chain]. Zeitschrift fur Evidenz, Fortbildung und Qualitat im Gesundheitswesen
2012;106(3):153-60.
5. Santesso N, Schunemann H, Blumenthal P, De Vuyst H, Gage J, Garcia F, et al. World
Health Organization Guidelines: Use of cryotherapy for cervical intraepithelial
neoplasia. International journal of gynaecology and obstetrics: the official organ of
the International Federation of Gynaecology and Obstetrics 2012;118(2):97-102.
6. Hsu J, Brozek JL, Terracciano L, Kreis J, Compalati E, Stein AT, et al. Application of
GRADE: Making evidence-based recommendations about diagnostic tests in clinical
practice guidelines. Implementation science : IS 2011;6:62.
7. Fiocchi A, Brozek J, Schunemann H, Bahna SL, von Berg A, Beyer K, et al. World
Allergy Organization (WAO) Diagnosis and Rationale for Action against Cow's Milk
Allergy (DRACMA) Guidelines. Pediatr Allergy Immunol 2010;21 Suppl 21:1-125.
8. WHO. Commercial Serodiagnostic Tests for Diagnosis of Tuberculosis 2011;ISBN 978 92
4 150205 4
9. Spencer FA, Iorio A, You J, Murad MH, Schunemann HJ, Vandvik PO, et al. Uncertainties
in baseline risk estimates and confidence in treatment effects. BMJ 2012;345:e7401.
10. Treweek S, Oxman AD, Alderson P, Bossuyt PM, Brandt L, Brozek J, et al. Developing
and Evaluating Communication Strategies to Support Informed Decisions and
Practice Based on Evidence (DECIDE): protocol and preliminary results.
Implementation science : IS 2013;8:6.
11. Bates SM, Jaeschke R, Stevens SM, Goodacre S, Wells PS, Stevenson MD, et al.
Diagnosis of DVT: Antithrombotic Therapy and Prevention of Thrombosis, 9th ed:
American College of Chest Physicians Evidence-Based Clinical Practice Guidelines.
Chest 2012;141(2 Suppl):e351S-418S.
12. Harris RP, Helfand M, Woolf SH, Lohr KN, Mulrow CD, Teutsch SM, et al. Current
methods of the US Preventive Services Task Force: a review of the process. American
journal of preventive medicine 2001;20(3 Suppl):21-35.
Figure 1. Linkage of testing, interventions and outcomes
[Figure: three linked elements, "Testing/diagnosis", "Therapy, treatment, observation, management" and "Outcome", together with the label "Intervention". Annotations in the figure: uncertainty due to baseline risk or pretest probability as a result of prognostic studies and imperfect diagnostic accuracy studies; symptoms, prognostic factors, tests, other diagnostic tests or strategies; either evaluated directly or indirectly as linked evidence; possibly other actions, monitoring, directly investigated or based on assumptions from indirect evidence.]
Figure 2. Clinical pathway for cervical cancer screen and treat approach.
HPV = human papilloma virus
VIA = visual inspection with acetic acid
Test + = True and false positive tests (not known when test is performed)
Test - = True and false negatives (not known when test is performed)
CKC = Cold knife conization
LEEP = Loop electrosurgical excision procedure
Cryo = cryotherapy
Mortality from cervical cancer*
Cervical Cancer Incidence*
CIN2-3 recurrence*
Undetected CIN2-3 (FN)*
Major bleeding*
Premature delivery*
Infertility*
Major infections*
Minor infections*
Unnecessarily treated (FP)*
Cancers detected at screening*
Figure 3. Generic analytic framework for a test (from reference 12)
Figure 4. Linking diagnostic test accuracy to patient important outcomes
Table 1. Layer 1 SoF Table: HPV compared to VIA for detection of cervical intraepithelial
neoplasia in women at risk for cervical cancer
Patients or population: women at risk of cervical cancer
Settings: screening clinics across the world
New Test: HPV | Cut-off value: –
Comparison Test: VIA | Cut-off value: –
Reference Test: conization and biopsy
Number of participants (studies): 8921 (5)
Pooled sensitivity HPV: 95% (95% CI: 84 to 98) | Pooled specificity HPV: 84% (95% CI: 72 to 91)
Pooled sensitivity VIA: 69% (95% CI: 54 to 81) | Pooled specificity VIA: 87% (95% CI: 79 to 92)
Number of results per 1000 patients tested (baseline risk 5%1)
Test result | HPV | VIA | Absolute difference | Quality of the evidence (GRADE)
True positives (TP) | 48 (42 to 49) | 35 (27 to 41) | 13 more | ⊕⊕⊕⊕ high
False negatives (FN) | 2 (1 to 8) | 15 (10 to 23) | 13 fewer | ⊕⊕⊕⊕ high
True negatives (TN) | 798 (684 to 865) | 827 (751 to 874) | 29 fewer | ⊕⊕⊕⊝ moderate2,3 due to inconsistency
False positives (FP) | 152 (86 to 266) | 123 (76 to 200) | 29 more | ⊕⊕⊕⊝ moderate2,3 due to inconsistency
Reference: Mustafa, Santesso, Schünemann ….
Footnotes:
1 Prevalence of 5% was assumed to be the average prevalence in a representative population.
2 Estimates of HPV and VIA sensitivity and specificity were variable despite similar cut-off values; inconsistency could not be
explained by quality of studies. This was a borderline judgment. We downgraded TN and FP. This decision is considered in the
context of other factors, in particular, imprecision.
3 Wide CI for TN and FP that may lead to different decisions depending on which of the confidence limits is assumed.
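The ranges shown in parentheses in Table 1 can be reproduced by applying the confidence limits of the pooled sensitivity and specificity to the 50 women with CIN and the 950 women without CIN per 1000 tested. The Python sketch below illustrates this; it uses exact integer arithmetic with half-up rounding, which is an assumption about how the table was rounded, and the names are hypothetical. It reproduces the ranges only, not the point estimates.

```python
WITH_CIN = 50       # per 1000 tested at the assumed 5% baseline risk
WITHOUT_CIN = 950   # per 1000 tested

# 95% confidence limits in percent: (lower, upper)
LIMITS = {
    "HPV": {"sensitivity": (84, 98), "specificity": (72, 91)},
    "VIA": {"sensitivity": (54, 81), "specificity": (79, 92)},
}


def scaled_count(group_size: int, percent: int) -> int:
    """group_size * percent / 100, rounded half up, using integers only."""
    return (2 * group_size * percent + 100) // 200


def result_ranges(sensitivity, specificity):
    """Ranges for TP, FN, TN and FP per 1000 from the confidence limits."""
    sens_lo, sens_hi = sensitivity
    spec_lo, spec_hi = specificity
    return {
        "TP": (scaled_count(WITH_CIN, sens_lo), scaled_count(WITH_CIN, sens_hi)),
        # FN and FP use the complement, so the limits swap places.
        "FN": (scaled_count(WITH_CIN, 100 - sens_hi),
               scaled_count(WITH_CIN, 100 - sens_lo)),
        "TN": (scaled_count(WITHOUT_CIN, spec_lo),
               scaled_count(WITHOUT_CIN, spec_hi)),
        "FP": (scaled_count(WITHOUT_CIN, 100 - spec_hi),
               scaled_count(WITHOUT_CIN, 100 - spec_lo)),
    }


ranges = {test: result_ranges(**ci) for test, ci in LIMITS.items()}
for test, values in ranges.items():
    print(test, values)
```

The wide ranges for true negatives and false positives are what footnote 3 refers to: the decision may differ depending on which confidence limit is assumed.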
Table 2. Layer 3 Summary of Findings Table describing population important outcomes
Events in the screen-treat strategies for patient important outcomes
(numbers presented per 1,000,000 patients)*
Outcomes | HPV +/CKC | HPV +/LEEP | HPV +/Cryo | VIA +/CKC | VIA +/LEEP | VIA +/Cryo | No screen
Mortality from cervical cancer1 | 18 | 7 | 7 | 18 | 10 | 10 | 333
Cervical cancer incidence2 | 33 | 15 | 15 | 34 | 21 | 21 | 369
CIN2-3 recurrence3 | 125 | 190 | 166 | 565 | 612 | 595 | 35000
Undetected CIN2-3 (FN) | HPV strategies: 2000 | VIA strategies: 15000
Major bleeding4 | 16546 | 0 | 117 | 13071 | 0 | 740 | 0
Premature delivery5 | 741 | 646 | 625 | 691 | 615 | 599 | 500
Infertility6 | - | - | - | - | - | - | -
Major infections7 | 1351 | 0 | 104 | 1068 | 0 | 82 | 0
Minor infections8 | 18487 | 0 | 1826 | 14605 | 0 | 1442 | 0
Unnecessarily treated (FP) | HPV strategies: 152000 | VIA strategies: 123000
Cancers detected at screening | 2259
1 We assume mortality will decrease in true positives due to treatment. It will increase in false negatives due to late diagnosis. No mortality from cervical cancer in true negatives and false positives. Our calculations in the model are based on 61% RRR for cryotherapy and that it is 2.8 times more in CKC group and 1.06 times more for LEEP based on Kalliala 2007 mortality data. Mortality data was indirect as they evaluated all cause mortality in this study. Baseline risk of mortality from cervical cancer is 1% per 30 years based on WHO data for low and middle-income countries (http://globocan.iarc.fr/factsheets/cancers/cervix.asp - April 18, 2012)
2 We assume cervical cancer incidence will decrease in true positives due to treatment. It will increase in false negatives due to late diagnosis. No cervical cancer in true negatives and false positives. Our calculations in the model are based on 61% RRR for cryotherapy and that it is 2.1 times more in CKC and similar in LEEP based on Kalliala 2007 cervical cancer data. Baseline risk of cervical cancer is 2% per 30 years based on WHO data for low and middle-income countries (http://globocan.iarc.fr/factsheets/cancers/cervix.asp - April 18, 2012)
3 We assume CIN2/3 recurrence incidence will decrease in true positives due to treatment. It will be high in false negatives due to no diagnosis and natural persistence numbers. No CIN2/3 in true negatives and false positives. Our calculations in the model are based on 70% natural persistence with no treatment. Recurrence rates of 4% in cryotherapy, 2.3% in CKC and 5% in LEEP.
4 We assumed major bleed would be 0 in TN and FN as they were not treated. We assumed 0.000585 of the population treated with cryotherapy, 0.082728 of the population treated with CKC and 0 of the population treated with LEEP will have major bleed based on reported proportions in single arm studies.
5 We assumed premature delivery would be at baseline risk as in the general population in TN and FN as they were not treated. We assumed 5% risk of premature delivery in the 1% of women becoming pregnant. We assumed 0.001125 of the population treated with cryotherapy, 0.001706 of the population treated with CKC and 0.00123 of the population treated with LEEP will have premature delivery based on reported proportions in single arm studies.
6 We did not identify any data about the risk of infertility after treatment for CIN2+.
7 We assumed major infection would be 0 in TN and FN as they were not treated. We assumed 0.000518 of the population treated with cryotherapy, 0.006757 of the population treated with CKC and 0 of the population treated with LEEP will have major infection based on reported proportions in single arm studies.
8 We assumed minor infection would be 0 in TN and FN as they were not treated. We assumed 0.009131 of the population treated with cryotherapy, 0.092437 of the population treated with CKC and 0 of the population treated with LEEP will have minor infection based on reported proportions in single arm studies.
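The way footnotes 5 and 7 translate into events per 1,000,000 women can be sketched in Python. The sketch assumes that the women treated in each strategy are the test positives (true positives plus false positives) from Table 1, scaled to 1,000,000; the footnotes do not state this explicitly, and the names are hypothetical. Under that assumption the results agree with the major infection and premature delivery rows of Table 2, which suggests, but does not confirm, that the panel's model worked this way.

```python
PER_MILLION = 1_000_000

# Assumption: women treated = test positives (TP + FP) per 1,000,000 screened,
# taken from the per-1000 counts in Table 1 multiplied by 1000.
TREATED = {"HPV": 48_000 + 152_000, "VIA": 35_000 + 123_000}

# Proportions of treated women with each event (footnotes 7 and 5).
MAJOR_INFECTION = {"CKC": 0.006757, "LEEP": 0.0, "Cryo": 0.000518}
PREMATURE_DELIVERY = {"CKC": 0.001706, "LEEP": 0.00123, "Cryo": 0.001125}

# Footnote 5: 5% risk of premature delivery in the 1% of women becoming pregnant.
BASELINE_PREMATURE = 0.05 * 0.01


def major_infections(test: str, treatment: str) -> int:
    """Events occur only in treated women (zero in TN and FN)."""
    return round(TREATED[test] * MAJOR_INFECTION[treatment])


def premature_deliveries(test: str, treatment: str) -> int:
    """Untreated women stay at baseline risk; treated women carry the treatment risk."""
    treated = TREATED[test]
    untreated = PER_MILLION - treated
    return round(untreated * BASELINE_PREMATURE
                 + treated * PREMATURE_DELIVERY[treatment])


for test in TREATED:
    for treatment in ("CKC", "LEEP", "Cryo"):
        print(test, treatment,
              major_infections(test, treatment),
              premature_deliveries(test, treatment))
```

Because false positives make up most of the treated women at a 5% baseline risk, the adverse event counts are driven mainly by the specificity of the screening test.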
Table 3. Evidence to recommendation considerations for guideline panels making recommendations about tests
Criteria | Explanation
How common is the problem? | Describe if the health problem is common (i.e. prevalence) and consider this in the context of other problems the panel is considering
Is the problem severe? | Is the problem so severe that it is a priority when making health care decisions with patients or the population?
What is the diagnostic test accuracy? | Describe the diagnostic test accuracy (DTA) and make a judgment if it appears worth considering (compared to the alternative). That is, if the DTA is inferior and there are no other apparent benefits from using the index test this judgment supports the upcoming deliberations or makes them unnecessary
What is the confidence in the diagnostic test accuracy information? | Describe the confidence in the estimates of the DTA based on the GRADE criteria
Overall, compared to the alternative, are the anticipated benefits large? | Make a judgment about the magnitude of the considered benefits
Overall, compared to the alternative, are the anticipated harms small? | Make a judgment about the magnitude of the considered harms. Include information about side effects of tests.
Overall, is there certainty about the link between the diagnostic test accuracy information and the linked benefits and harms? | Describe how confident one can be in the evidence linking the DTA information and the ensuing (linked) benefits and harms, i.e. how certain are you in the information informing about the management and therapy and other consequences
What is the overall confidence in the estimates of effect for benefits and harms? | Describe how confident you are in the overall benefits and harms after considering the DTA information and the information about the linked evidence.
What is the confidence in the values that patients place on the benefits and harms? | Describe the source and confidence related to the values and preferences and how confident you are in the evidence
What would be the impact on health inequities? | Describe any impact that is expected on health inequities
Is the cost small relative to the net benefits of the favored option? | Make a judgment about the cost of the index test relative to its net benefits.
Supplemental information