MGREP Concept Mapping Engine

Integrating Literature and Experimental Data
Fan Meng, Ph.D.
Microarray Laboratory
Psychiatry Department and Molecular & Behavioral Neuroscience Institute
University of Michigan
High Throughput Data Analysis Overview
Integrative Exploration → Hypothesis
freewheeling
glamorous
System → Pathway/Network/Gene Set
Molecular → Gene/Transcript/SNP/Genome
rigid
dull
Raw Data: Expression/Genotype/Sequence
MGREP Concept Mapping Engine
Concepts → Remove Common Words → Single Word Variation → Combine with Word Order Permutation → Radix-tree Match
Figure 1. Overview of our free text-to-ontology mapping method
Key Idea:
Classical concept-matching algorithms take the time-consuming approach of generating concept variations during the match itself. mgrep instead pre-generates the concept variations and then uses highly efficient string-matching algorithms, achieving a two-orders-of-magnitude increase in speed over MetaMap.
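The pre-generation idea can be illustrated with a minimal sketch. This is not the mgrep implementation: the stop-word list, the naive singular/plural rule, the concept IDs, and all function names below are assumptions made for illustration, and a plain token-level trie stands in for the radix tree named in Figure 1. The point it shows is that every variation is generated once, when the dictionary is loaded, so matching a text is a single pass of trie lookups.

```python
from itertools import permutations

# Assumed stop-word list; the actual list used by mgrep is not given in the slides.
STOP_WORDS = {"of", "the", "and", "in", "with"}

def word_variants(word):
    """Naive single-word variation: singular/plural only (illustrative)."""
    variants = {word}
    if word.endswith("s") and len(word) > 3:
        variants.add(word[:-1])
    else:
        variants.add(word + "s")
    return variants

def concept_variations(name):
    """Pre-generate all variations of one concept name as token tuples."""
    words = [w for w in name.lower().split() if w not in STOP_WORDS]
    results = set()
    for order in permutations(words):          # word order permutation
        expanded = [()]
        for w in order:                        # single word variation
            expanded = [p + (v,) for p in expanded for v in word_variants(w)]
        results.update(expanded)
    return results

def build_trie(dictionary):
    """Store every variation in a token-level trie; key None marks a concept end."""
    trie = {}
    for concept_id, name in dictionary.items():
        for variation in concept_variations(name):
            node = trie
            for token in variation:
                node = node.setdefault(token, {})
            node[None] = concept_id
    return trie

def annotate(text, trie):
    """Single pass over the text, taking the longest match at each position."""
    tokens = [t.strip(".,;:()").lower() for t in text.split()]
    tokens = [t for t in tokens if t and t not in STOP_WORDS]
    hits, i = [], 0
    while i < len(tokens):
        node, j, best = trie, i, None
        while j < len(tokens) and tokens[j] in node:
            node = node[tokens[j]]
            j += 1
            if None in node:
                best = (node[None], j)
        if best:
            hits.append((best[0], " ".join(tokens[i:best[1]])))
            i = best[1]
        else:
            i += 1
    return hits

# Hypothetical two-entry dictionary; the IDs are made up for this example.
dictionary = {"D001": "carcinoma of the lung", "D002": "breast cancer"}
trie = build_trie(dictionary)
print(annotate("Patients with lung carcinomas and breast cancer.", trie))
# [('D001', 'lung carcinomas'), ('D002', 'breast cancer')]
```

The trade-off in this sketch is memory for speed: the number of stored variations grows factorially with the number of words in a concept name, but the per-document cost is only the trie walk.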
Evaluation of MGREP by NCBO
Precision of Mgrep and MetaMap using the 'diseases' dictionary

Data Source        Mgrep    MetaMap
Clinical Trials    0.87     0.71
Gold Miner         0.73     0.548
GEO                0.88     0.755
MedLine            0.23     0.091

Shah NH, Bhatia N, Jonquet C, Rubin D, Chiang AP, Musen MA (2009) Comparison of concept recognizers for building the Open Biomedical Annotator. BMC Bioinformatics 10(Suppl 9):S14.
MGREP in NCBO Annotator Web Service
PubAnatomy
• Integrate Medline literature with
external data
• Enable efficient visual query
• Open architecture
Linking Literature and Experimental Data
• Mapping Medline to brain structures
• Integrating multiple data sets
– Gene expression from the Allen Brain Atlas
– Brain structure relationship from NeuroName
– Protein-protein interaction from MiMI
• Graphic presentation of data
– Allen Brain Atlas
– Protein-protein interaction network
– Gene Co-expression network
PubAnatomy Architecture
• Visualization components: Flex
• Server-side web services: algorithms and graphics
• Backend database: Oracle
[Figure: PubAnatomy architecture diagram. Labeled elements: PubAnatomy UI; Integration; Literature (BioNLP); user selection; open API; databases; Internal services (algorithm I1, service I1, algorithm I2, service I2, dataset I1, dataset I2, …); User plug-ins (algorithm U1, plugin U1, algorithm U2, plugin U2, dataset U1, dataset U2, …). Layers: Visualization Components, Server-Side Web Services, Backend Database.]
PubAnatomy Interface
PubAnatomy
Download