Towards Evidence-Based Discovery
Informatics Tools for Synthesis
Guest Speaker: Tim Cary
Catherine Blake
School of Information and Library Science
University of North Carolina at Chapel Hill
http://www.ils.unc.edu/~cablake
cablake@email.unc.edu
Systematic Review Process
– Formulate the problem
– Locate and select studies
– Assess quality of studies
– Collect data
– Analyze and present results
– Interpret results
– Improve and update review
28 months from initial idea to publication
Increased demand due to evidence-based medicine
Guesswork guided by scientifically trained intuition
Manual Synthesis
Rescher (1978)
[Figure: manual synthesis workflow. Labels: MEDLINE; Embase; Hypothesis Projection; Retrieval (Select); Context Information; Corpus; Extraction (Extract); Facts; Collaboration; Iteration; Verification (Verify); Analysis (Analyze)]
Cochrane - RevMan
• Review Manager (RevMan) is the software used
for preparing and maintaining Cochrane reviews.
• You can use RevMan for protocols and full reviews.
It is most useful when you have formulated the
question for the review, and allows you to prepare
the text, build the tables showing the characteristics
of studies and the comparisons in the review, and
add study data. It can perform meta-analyses and
present the results graphically.
• Source: http://www.cc-ims.net/RevMan
Cochrane - GRADEpro
• GRADEpro (GRADEprofiler) is the software used to create
Summary of Findings (SoF) tables in Cochrane systematic
reviews. It can retrieve data of the systematic review and
meta-analyses from a Review Manager 5 file, combine these
data with user-entered data, and then export a Summary of
Findings table ready for import into Review Manager 5. It
performs many of the calculations necessary to present the
key results of systematic reviews in a table format and guides
users through the process of grading the quality of the
evidence using the GRADE approach.
• Source: http://www.cc-ims.net/gradepro
Reporting Guidelines
• CONSORT - reporting of RCTs
• PRISMA (formerly QUOROM) - preferred reporting items for
systematic reviews and meta-analyses
• STROBE - reporting of observational studies in
epidemiology
• EQUATOR Network - collection of reporting
guidelines
• Source: http://www.cochrane.org/index_authors_researchers.htm
Selection Step
• Typical information retrieval framing
– Input: MEDLINE
– Output: Articles included in previous studies
– Goal: find weighting schemes that retrieve only the
articles included in a traditional analysis
• Examples
– Cohen AM, Hersh WR, Peterson K, Yen PY. Reducing Workload in Systematic
Review Preparation Using Automated Citation Classification. JAMIA
2006;13(2):206-219.
– Demner-Fushman D, Seckman C, Fisher C, Hauser S, Clayton J, Thoma G.
Prototype System To Support Evidence-based Practice. AMIA Annu Symp Proc.
November 2008:151-5.
Context Information
• Study Information
– e.g. date, location, ...
• Population Information
– e.g. gender, age, ...
(Loosely coupled to review focus)
• Risk Factor or Intervention
– e.g. duration of exposure, confounders
• Disease
– e.g. stage, confounders
(Tightly coupled to review focus)
Collaborative Information Synthesis
[Figure: collaborative information synthesis workflow. Labels: MEDLINE; Embase; External Data; Hypothesis Projection; Retrieval; Context Information; Corpus; Extraction; Facts; Collaboration; Iteration; Verification; Analysis]
Key: Estimate Missing Information
1. What are people with Breast Cancer exposed to?
Studies with Breast Cancer patients
Facts for each study
• number of patients
• age of patients
• geographic location
• risk-factor exposure …
2. What are people in a similar population exposed to?
Database of risk factors: BRFSS Codebook
• question asked
• age, gender
• % responses
3. Are these rates significantly different?
T. Tengs & N. D. Osgood (2001) “The link between smoking and Impotence: Two Decades of Evidence”, Preventive Medicine, 32:447-52
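Step 3 asks whether the exposure rate in the study population differs from the rate in a comparable external population. The slides do not say which test is used. As a minimal sketch, assuming a pooled two-proportion z-test is acceptable, the comparison could look like this. The function name and all counts are hypothetical and chosen only for illustration.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Return (z, two-sided p-value) for H0: p1 == p2, using the pooled estimate.

    x1/n1: exposed count and sample size in the study population.
    x2/n2: exposed count and sample size in the external population.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, not taken from any study or from BRFSS.
z, p = two_proportion_z_test(x1=60, n1=200, x2=250, n2=1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value would suggest the two exposure rates differ. With survey data such as BRFSS, sampling weights would also need to be considered, which this sketch ignores.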
More than Automated Meta-Analysis
• Traditional analysis
– same study design
– medicine = RCT
– epidemiology = cohort
• Information Synthesis
– any study that includes required information
– augment missing information
[Figure: two panels, Systematic Review and Information Synthesis. Key: External database; Entire study; Main topic; Secondary Information]
[Figure: research overview. Labels: Core; Natural Language Processing; Genomics; News; Chemistry; DocSouth; Human-assisted Discovery and Synthesis; Education; Discovery Science; Evidence-based Practice; Human Discovery and Synthesis; Breast Cancer; Heterogeneous Literature; Synthesis and Discovery Work Practices]
METIS Information Extractor
• Semantic Grammar
• Features: words, numbers, and semantic types in the
Unified Medical Language System (UMLS)
{term: 'age'} {term: 'of'}
{number: 10<n1<110} {term: 'to'} {number: 10<n2<110}
Example: The age of breast cancer subjects ranged between 20 to 64 years old.
{semantic type: neoplastic process, or disease}
• Information extracted:
– risk factor exposure (tobacco and alcohol)
– age (min, max, mean)
– number of subjects with medical condition
– gender
– start and end dates
– geographical location
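The age pattern in the semantic grammar above can be approximated with a regular expression. This is a minimal sketch and not the METIS implementation: it matches the terms 'age' and 'of' followed by two numbers joined by 'to', and applies the 10 < n < 110 constraint. It does not check the UMLS semantic type, which would need a UMLS lookup. The names AGE_RANGE and extract_age_range are hypothetical.

```python
import re

# Hypothetical regex approximation of the grammar rule:
# 'age' 'of' ... number 'to' number
AGE_RANGE = re.compile(
    r"\bage\b\s+of\b.*?\b(\d{1,3})\s+to\s+(\d{1,3})\b",
    re.IGNORECASE,
)


def extract_age_range(sentence):
    """Return (min_age, max_age) if the sentence matches the pattern, else None."""
    match = AGE_RANGE.search(sentence)
    if match is None:
        return None
    low, high = int(match.group(1)), int(match.group(2))
    # Mirror the grammar's numeric constraint: 10 < n < 110.
    if 10 < low < 110 and 10 < high < 110:
        return low, high
    return None


sentence = "The age of breast cancer subjects ranged between 20 to 64 years old."
print(extract_age_range(sentence))
```

The numeric bounds filter out matches such as dose ranges or years that happen to follow the same word pattern.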
METIS Info Extractor – Evaluation
• Diverse text corpus
– epidemiology, surgery, biology, ...
– cohort studies, case-control trials, ...
• Evaluation
– Metrics (precision, recall)
– Annotators (developer, domain expert, expert
annotator, novice)
– Primary topic (breast cancer, impotence)
– Secondary information (tobacco and alcohol
consumption)
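Precision and recall compare what the extractor produced against what an annotator marked as correct. The sketch below shows the standard set-based definitions. The (article, field, value) triple representation and all values are assumptions made for illustration, not the format used in the METIS evaluation.

```python
def precision_recall(extracted, gold):
    """Compute (precision, recall) for extracted facts against annotated facts.

    precision = correct extracted facts / all extracted facts
    recall    = correct extracted facts / all annotated facts
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical (article_id, field, value) triples, for illustration only.
gold = {
    ("a1", "min_age", 20),
    ("a1", "max_age", 64),
    ("a1", "gender", "female"),
    ("a2", "min_age", 35),
}
extracted = {
    ("a1", "min_age", 20),
    ("a1", "max_age", 64),
    ("a2", "min_age", 30),  # wrong value, counts against precision
}
precision, recall = precision_recall(extracted, gold)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

Running the same computation once per annotator type would give one precision and recall pair for each set of annotations.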
METIS Info Extractor – Recall
[Figure: Recall (y-axis, 0.0 to 1.0) by Rank (x-axis, 1 to 5) for four annotator types: Development, Domain Expert, Expert Annotator, Novice Annotator]
METIS Info Extractor – Precision
[Figure: Precision (y-axis, 0.0 to 1.0) by Rank (x-axis, 1 to 5) for four annotator types: Development, Domain Expert, Expert Annotator, Novice Annotator]
METIS Verifier
[Figure: METIS Verifier. Labels: Electronic version of article; Converted Article; Verify information extracted]
METIS Analyzer
• Meta-Analysis
– Developed for agricultural application
– Requires empirical studies with a quantitative outcome
– Unit of study is an article, not a person
– Result: a unitless metric called an effect size
• Two common meta-analysis techniques
– Fixed-effects model
– Random-effects model
Evaluation: Compared generated effect size with examples in
textbooks and published articles
Result: Same effect size
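The two techniques differ in how they weight each study. The slides do not say which estimators the METIS Analyzer uses. As a minimal sketch, assuming inverse-variance weighting for the fixed-effects model and the DerSimonian-Laird estimator for the random-effects model, pooling could be written as follows. Function names and input values are hypothetical.

```python
import math


def fixed_effect(effects, variances):
    """Inverse-variance pooled effect size and its standard error."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, se


def random_effects(effects, variances):
    """DerSimonian-Laird pooled effect, standard error, and tau^2."""
    k = len(effects)
    weights = [1.0 / v for v in variances]
    fixed, _ = fixed_effect(effects, variances)
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return pooled, se, tau2


# Hypothetical per-article effect sizes and variances, for illustration only.
effects = [0.2, 0.5, 0.8]
variances = [0.04, 0.04, 0.04]
print(fixed_effect(effects, variances))
print(random_effects(effects, variances))
```

When the studies disagree more than sampling error alone would explain, tau^2 is positive and the random-effects standard error is larger than the fixed-effects one.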
Synthetic Estimate Evaluation
[Figure: Actual vs. Estimated Control Rate (y-axis, 0 to 1) by Article Identifier (x-axis, individual articles plus Average), in two panels: Tobacco Consumption and Alcohol Consumption]