BDK12-2 - Oregon Health & Science University

BDK12-2
Information Retrieval
William Hersh, MD
Department of Medical Informatics & Clinical Epidemiology
Oregon Health & Science University
A classification of knowledge-based content
• Bibliographic
– By definition rich in metadata
• Full-text
– Everything on-line
• Annotated
– Non-text or structured text annotated with text
• Aggregations
– Bringing together all of the above
• These categories are admittedly fuzzy, and a growing number of
resources fall into more than one of them
Bibliographic content
• Bibliographic databases
– The old (e.g., MEDLINE) have been revitalized with new
features
– New ones (e.g., National Guideline Clearinghouse) have
emerged
• Web catalogs
– Share many characteristics of traditional bibliographic
databases
• Really Simple Syndication/Rich Site Summary (RSS)
– “Feeds” provide information about new content
Bibliographic databases
• Contain metadata about (mostly) journal articles
and other resources typically found in libraries
• Produced by
– U.S. government – most produced by National Library
of Medicine (NLM, www.nlm.nih.gov)
• e.g., MEDLINE, genomics information, etc.
– Commercial publishers, e.g.,
• EMBASE – part of larger SciVal
• CINAHL – Cumulative Index to Nursing and Allied Health
Literature
• ACM Guide to Computing Literature – computer science and
related areas
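To make "metadata about journal articles" concrete, here is a minimal sketch of a bibliographic record and a search over its fields. The class name, field names, and sample records are all hypothetical and invented for illustration; they are not the actual record format of MEDLINE or any other database named above.

```python
from dataclasses import dataclass, field


@dataclass
class BibRecord:
    """One bibliographic record: metadata only, no article text."""
    title: str
    authors: list
    journal: str
    year: int
    subject_headings: list = field(default_factory=list)


def search(records, heading=None, journal=None):
    """Return records matching every criterion that was supplied."""
    results = []
    for rec in records:
        if heading is not None and heading not in rec.subject_headings:
            continue
        if journal is not None and rec.journal != journal:
            continue
        results.append(rec)
    return results


# Hypothetical records, invented for illustration only.
records = [
    BibRecord("Drug A for hypertension", ["Smith J"], "Example Journal",
              2012, ["Hypertension", "Drug Therapy"]),
    BibRecord("Nursing care after surgery", ["Lee K"], "Sample Nursing Review",
              2013, ["Postoperative Care"]),
    BibRecord("Salt intake and blood pressure", ["Garcia M"],
              "Sample Nursing Review", 2011, ["Hypertension", "Diet"]),
]

hypertension_hits = search(records, heading="Hypertension")
nursing_hypertension = search(records, heading="Hypertension",
                              journal="Sample Nursing Review")
```

The point of the sketch is that retrieval runs entirely against the metadata fields; the full text of the article is never consulted.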
MEDLINE
• References to biomedical journal literature
– Original medical IR application – system for searching
MEDLINE launched in 1971 with literature maintained in
MEDLARS system dating back to 1966
• Name derives from MEDLARS On-Line – MEDLINE
– Free to world since 1997 via PubMed – http://pubmed.gov
• Now with links to full text of articles and other resources
• Statistics
– http://www.nlm.nih.gov/bsd/bsd_key.html
– Over 22 million references to peer-reviewed literature
– Over 5,000 journals, mostly English language
– About 750,000 new references added yearly
National Guideline Clearinghouse
• Produced by Agency for Healthcare Research and
Quality (AHRQ)
– www.guideline.gov
• Contains detailed information about guidelines
– Including degree to which they are evidence-based
– Interface allows comparison of elements in database
for multiple guidelines
• Links to guidelines that are free on Web, and to their
producers when guidelines are proprietary
Web catalogs
• Generally aim to provide quality-filtered Web
sites for specific audiences
– Distinction between catalogs and sites blurry
• Some are aimed towards clinicians
– HON Select – http://www.hon.ch/HONselect/
– Turning Research Into Practice (TRIP) –
www.tripdatabase.com
• Others are aimed towards patients/consumers
– Healthfinder – www.healthfinder.gov
RSS
• RSS “feeds” provide short summaries, typically of news,
journal articles, or other recent postings on Web sites
• Users receive RSS feeds through an RSS aggregator, which can
typically be configured for the desired site(s) and to filter
based on content
– Work as standalone, in Web browsers, in email clients, etc.
• Two versions (1.0, 2.0), but both basically provide
– Title – name of item
– Link – URL of full page
– Description – brief description of page
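The three fields above, and the content filtering an aggregator performs, can be sketched as follows. This is a minimal illustration that assumes an RSS 2.0 feed (RSS 1.0 uses RDF namespaces and would need different element lookups); the feed contents, URLs, and function names are hypothetical and invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, invented for illustration only.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Journal Alerts</title>
    <link>http://example.org/</link>
    <description>New content from a made-up journal</description>
    <item>
      <title>Trial of drug A in hypertension</title>
      <link>http://example.org/articles/1</link>
      <description>Randomized trial results.</description>
    </item>
    <item>
      <title>Editorial on health policy</title>
      <link>http://example.org/articles/2</link>
      <description>Opinion piece.</description>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return a list of dicts holding title, link, and description per item."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


def filter_items(items, keyword):
    """Keep items whose title or description mentions the keyword."""
    kw = keyword.lower()
    return [i for i in items
            if kw in i["title"].lower() or kw in i["description"].lower()]


items = parse_items(SAMPLE_FEED)
matches = filter_items(items, "trial")
```

A real aggregator would fetch the feed over HTTP on a schedule and remember which items it has already shown; the sketch covers only the parsing and filtering steps.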
Full-text content
• Contains complete text as well as tables,
figures, images, etc.
• If there is corresponding print version, the two
are usually identical
• Includes
– Periodicals
– Books
– Web sites – may include either of above
Full-text primary literature
• Almost all biomedical journals available electronically
– Many published by HighWire Press (www.highwire.org),
which adds value to content of original publisher, including
British Medical Journal, Journal of the American Medical
Association, New England Journal of Medicine, etc.
– Also published by leading commercial scientific publishers,
e.g., Elsevier, Kluwer, Springer, etc.
– Growing number available via open-access model, e.g.,
BioMed Central (BMC), Public Library of Science (PLoS)
– Another source of full-text papers is PubMed Central
(PMC; http://pubmedcentral.gov)
Books
• Textbooks
– Most well-known clinical textbooks are now available
electronically
• e.g., Harrison’s Principles of Internal Medicine
– Most are bundled into large collections by publishers
• e.g., Access Medicine (McGraw-Hill), Elsevier, Kluwer
– NLM has developed books site as part of Entrez
• http://www.ncbi.nlm.nih.gov/books
• Compendia of drugs, diseases, evidence, etc.
• Handbooks – very popular with clinicians
• Increasingly published on mobile devices
Value added for electronic books
• Multimedia, e.g., skin lesions, shuffling gait of Parkinson’s Disease, etc.
• Bundling of multiple books
• Can be updated in between “editions”
• Linkage to other information, e.g., to references, self-assessments,
updates, other resources, etc.
Web sites
• Defined more narrowly here to refer to
coherent collections of information on Web
• Usually take advantage of Web features, such
as linking, multimedia
• Increasingly integrated with other resources
and available on different platforms (e.g.,
integrated into electronic health records
[EHRs], on smartphones, etc.)
Some notable full-text content on Web sites
• Government agencies
– National Cancer Institute
• www.cancer.gov
– Centers for Disease Control and Prevention (CDC) – travel and
infection information
• http://www.cdc.gov/DiseasesConditions
• http://www.cdc.gov/travel/
– Other NIH institutes, e.g., National Heart, Lung,
and Blood Institute (NHLBI)
• www.nhlbi.nih.gov
Full-text Web sites (cont.)
• Physician-oriented medical news and overviews, e.g.,
– Medscape – www.medscape.com
– PEPID – www.pepid.com
– Many professional societies provide to members, e.g.,
http://www.acponline.org/clinical_information/
• Patient/consumer-oriented, e.g.,
– Intelihealth – www.intelihealth.com
– NetWellness – www.netwellness.com
– WebMD – www.webmd.com
• Many mobile apps provide health information, e.g.,
– iTriage – www.itriagehealth.com
Other interesting types of Web content
• Wikipedia – www.wikipedia.org
– Encyclopedia with free access and distributed authorship
– Some concerns about manipulation (McHenry, 2004) but
• Comparable to Encyclopedia Britannica? (Giles, 2005 – rebuttal:
Anonymous, 2006)
• Health information quality is reasonably good (Nicholson, 2006)
• Content retrieved prominently in most Web searches (Laurent, 2009)
• Making attempt to improve quality of medical content (Heilman,
2013)
• Body of knowledge
– Software Engineering Body of Knowledge (SWEBOK,
www.swebok.org) organizes knowledge of field
• Social media/Web 2.0 and beyond (Lee, 2011)
Annotated
• Non-text or structured text annotated with
text
• Includes
– Image collections
– Citation databases
– Evidence-based medicine databases
– Clinical decision support
– Genomics databases
– Other databases
Image collections
• Most prominent in the “visual” medical specialties, such as
radiology, pathology, and dermatology
• Well-known collections include
– Visible Human –
http://www.nlm.nih.gov/research/visible/visible_human.html
– Lieberman’s eRadiology – http://eradiology.bidmc.harvard.edu
– WebPath – http://library.med.utah.edu/WebPath/webpath.html
– More pathology – PEIR, www.peir.net
– DermIS – www.dermis.net
– More dermatology, also a decision-support system –
www.visualdx.com
• Many have associated text, which assists with indexing and
retrieval
Citation databases
• Science Citation Index and Social Sciences Citation
Index
– Database of journal articles that have been cited by
other journal articles
– Now part of a package called Web of Science, which
itself is part of a larger product, Web of Knowledge
(Thomson Reuters)
• http://wokinfo.com
• Scopus – http://www.elsevier.com/onlinetools/scopus
• Google Scholar – http://scholar.google.com
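The core idea of a citation database, looking up which articles cite a given article, amounts to inverting each article's reference list. The sketch below illustrates that inversion on a toy data set; the article IDs and function name are hypothetical, and it does not represent how any of the products above are implemented.

```python
from collections import defaultdict

# Hypothetical reference lists: each article ID maps to the IDs it cites.
references = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": [],
    "D": ["C", "B"],
}


def build_cited_by(references):
    """Invert reference lists into a cited-article -> citing-articles map."""
    cited_by = defaultdict(set)
    for citing, cited_list in references.items():
        for cited in cited_list:
            cited_by[cited].add(citing)
    return cited_by


cited_by = build_cited_by(references)

# Citation count per article: how many distinct articles cite it.
citation_counts = {article: len(cited_by[article]) for article in references}
```

Here `cited_by["C"]` holds the articles that cite C, and `citation_counts` gives the per-article tallies that citation-based metrics start from.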
Evidence-based medicine databases
• Cochrane Database of Systematic Reviews –
http://www.cochrane.org
– Collection of systematic reviews, kept updated
• Evidence “formularies”
– Clinical Evidence (BMJ) – http://clinicalevidence.bmj.com/x/index.html
– JAMAevidence – http://jamaevidence.com
• UpToDate – www.uptodate.com
– Clinically oriented overviews of medicine
• Essential Evidence Plus (formerly InfoPOEMS, “Patient-oriented
evidence that matters”) – www.essentialevidenceplus.com
• PubMed Health – https://www.ncbi.nlm.nih.gov/pubmedhealth/
– Systematic reviews and summaries of systematic reviews
Clinical decision support (CDS)
• Content used in CDS systems, usually part of EHRs
– Order sets (usually “evidence-based”)
– CDS rules
– Health/disease management templates
• Growing and evolving commercial market for
such tools, especially as EHR adoption increases;
leaders include
– Zynx – www.zynxhealth.com
– Thomson Reuters Cortellis –
http://cortellis.thomsonreuters.com
– EHR vendors themselves and partners
Genomics databases
• National Center for Biotechnology Information (NCBI,
www.ncbi.nlm.nih.gov; NCBI, 2015) collection links
– Literature references – MEDLINE
– Textbook of genetic diseases – Online Mendelian
Inheritance in Man (OMIM)
– Sequence databases – GenBank
– Structure databases – Molecular Modeling Database
– Genomes – Catalog of genes
– Maps – Locations of genes on chromosomes
• More in bioinformatics unit…
Other databases
• Cases (BMC, from Journal of Medical Case Reports and
others)
– www.casesdatabase.com
• ClinicalTrials.gov
– www.clinicaltrials.gov
– Originally database of clinical trials funded by NIH
– Now used as register for clinical trials, with results reporting for
some (DeAngelis, 2005; Laine, 2007; Zarin, 2013; Zarin, 2015)
• NIH RePORTER
– http://projectreporter.nih.gov/reporter.cfm
– Database of all research grants funded by NIH
– Replaced the CRISP database
Aggregations – integrating many resources
• Clinical – growing tendency of publishers to
aggregate resources into comprehensive products
– Merck Medicus – www.merckmedicus.com
• Collection of many resources available to any licensed US
physician
– ACP Smart Medicine –
http://smartmedicine.acponline.org
• Bundle of resources
– Evidence compendium – formerly called Physician’s Information
and Education Resource (PIER)
– Journals – Annals of Internal Medicine, ACP Journal Club
– Clinical guidelines
Other aggregations
• Biomedical research: Model organism
databases, e.g., Mouse Genome Informatics
– www.informatics.jax.org
– Combines genomics and related data,
bibliographic database, gene references, etc.
• Consumer: MedlinePlus
– http://medlineplus.gov
– Integrates a variety of licensed resources and
public Web sites