Large-scale knowledge aggregation for infectious diseases
ASEAN-China International Bioinformatics Workshop
Singapore, 17th April 2008
Olivo Miotto
Institute of Systems Science and Yong Loo Lin School of Medicine,
National University of Singapore
Large-scale Research Questions
What can we learn from large-scale studies of pathogens?

Does H5N1 Avian influenza have pandemic
potential?

What makes Human flu different from Avian flu?

What are stable potential immune epitopes to use
as vaccine candidates for influenza?

How does each serotype of dengue differ from all
others?
Page 2
Large-scale Research Questions
What can we learn from large-scale studies of pathogens?
[Overlay on the four questions above: Large scale; Statistical evidence; Historical data; Systematic analysis]
Page 3
We need Metadata!
Metadata = Descriptive data about sequences

If you want to compare avian vs human, you need
host organism info

If you want conservation analysis, you need to
have serotype and host information

If you want to study a period of virus evolution,
you need date information

If you want a balanced dataset, you may need to
filter according to country, date, subtype
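
The metadata requirements above can be illustrated with a minimal sketch. The `SequenceRecord` type, its field names and the `select` helper are hypothetical and are not taken from the tools presented in the talk; they only show how host, subtype, country and year metadata make dataset selection possible.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SequenceRecord:
    # Hypothetical record layout: one sequence plus its descriptive metadata.
    accession: str
    host: Optional[str] = None
    subtype: Optional[str] = None
    country: Optional[str] = None
    year: Optional[int] = None


def select(records, host=None, subtype=None, country=None, years=None):
    """Keep the records whose metadata match every criterion that is given.

    A record that lacks a field needed by a criterion is excluded, which is
    why missing metadata shrinks the usable dataset.
    """
    selected = []
    for rec in records:
        if host is not None and rec.host != host:
            continue
        if subtype is not None and rec.subtype != subtype:
            continue
        if country is not None and rec.country != country:
            continue
        if years is not None and (rec.year is None or rec.year not in years):
            continue
        selected.append(rec)
    return selected
```

For example, `select(records, host="Human", subtype="H5N1", years=range(2003, 2007))` would return only the human H5N1 records dated 2003 to 2006, and would silently drop any record whose year is unknown.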
Page 4
Knowledge Mining
[Diagram: knowledge mining workflow. Labels recovered from the figure, grouped by task:]
Knowledge Aggregation: Extract Desired Source Knowledge from Public Databases
Public Database Records; User-defined Queries; User-defined Dictionaries; Viral Protein References; User-defined Extraction Rules and Priorities; Cross-reference Identifiers; Viral Sequence and Metadata
Characteristic Mutations Analysis: Identify mutations in H5N1 that characterize transmissibility amongst humans
Viral Sequence and Metadata; H5N1 mutation map; Evidence of strain cocirculation
Conservation Analysis: Identify Evolutionarily Stable Region across subgroups
Viral Sequence and Metadata; Epitope Vaccine Candidates
Active Text Mining: Identify Biomedical literature with Cross-reactivity information
Biomedical Text; User-defined Patterns; User-defined Dictionaries; Curator's Knowledge; Previous Annotations; Documents with Cross-reactivity information
Page 5
Scalability in
Bioinformatics Knowledge Mining
Integrative scalability

We need to integrate heterogeneous information from
multiple data repositories with multiple purposes
Quantitative scalability

We need methods that can leverage and effectively explore
large-scale data sets
Hierarchical scalability

We need to cascade analysis tasks, flowing knowledge
from one task to the next
Page 6
Obstacles to Scalability
Heterogeneity of Biological Databases
Systemic: access to data in different databases
Syntactic: data formats, use of free text
Structural: different table structures in different databases
Semantic: data with different meaning and intent
Semantic Heterogeneity is particularly insidious
Data is rarely used in the way it was originally intended
Low level of end-user technical expertise
Biologists, not computer scientists
Excel spreadsheets, Web page “scraping”
Does not scale up
Page 7
Semantic Heterogeneity in GenBank
[Figure: example GenBank records labelled Good, Not so Good, and Pretty Bad]
Page 8
Semantic Heterogeneity in GenBank
Fields (e.g. country/date) are inconsistently
encoded
Inconsistent level of detail between databases
Inconsistent field location within different records of the
same database
Implicit encoding of the data (e.g. within the title of a
publication)
Multiple usage of the same field
Usage of isolation_source field in different GenPept records:
BAC77216: /isolation_source="Samoa"
AAN74539: /isolation_source="isolated in 1993"
AAT85667: /isolation_source="Homo sapiens"
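
A minimal sketch of how such a multi-purpose field might be interpreted, assuming small lookup dictionaries for hosts and countries. The `HOSTS` and `COUNTRIES` tables and the `interpret_isolation_source` function are hypothetical illustrations, not part of the system described in the talk; a real dictionary would be user-defined and far larger.

```python
import re

# Illustrative dictionaries only; real ones would be supplied by the user.
HOSTS = {"homo sapiens": "Human"}
COUNTRIES = {"samoa": "Samoa"}

# A four-digit year starting with 19 or 20, standing alone as a token.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def interpret_isolation_source(value):
    """Guess which metadata property an isolation_source value encodes.

    Returns a dict with one of the keys 'host', 'country' or 'year',
    or an empty dict when the value is not recognized.
    """
    text = value.strip()
    key = text.lower()
    if key in HOSTS:
        return {"host": HOSTS[key]}
    if key in COUNTRIES:
        return {"country": COUNTRIES[key]}
    match = YEAR.search(text)
    if match:
        return {"year": int(match.group(0))}
    return {}
```

Applied to the three values shown above, this sketch assigns "Samoa" to the country, "isolated in 1993" to the year, and "Homo sapiens" to the host, so the same field feeds three different properties.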
Page 9
Influenza Large-Scale Studies
Analyze all influenza protein sequences available
GenBank + GenPept = 92,343 documents
Final dataset comprises 40,169 unique sequences
Various types of analysis, e.g.
Identify amino acid mutation sites that characterize
human-transmissible strains
Compare the diversity of viral sequences over different
periods of time and geographical areas
Several Metadata fields required
Protein name
Host
Subtype
Country
Isolate
Year
Manual Curation is not an Option!
Page 10
The Aggregator of Biological Knowledge
An end-user environment for data retrieval, extraction and analysis
Uses XML technology and structural rules to allow biologists to extract and reconcile the data needed
Wrapper framework provides access to multiple sources
Manages extracted results
Offers plug-in architecture for analysis tools
[Diagram labels: Public Repositories; Researcher (query, manage, control); ABK System: Data Collection, Data Management, Data Analysis (input, augment, filter); KDD]
Page 11
ABK Structural Rules
Hierarchical value reconciliation
Automatic formation of XML Structural Rule
Concise visualization of XML as name/value tree
Familiar presentation of metadata for biologists
Point-and-click selection of location and constraints
Tabulated visualization and manual curation
RDF storage and output
Page 12
Data Extraction and Cleaning
DENV-1 sequences
Different rules (or different documents) produced conflicting values
Values produced by user-defined rules
User can fill in or override values
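
The behaviour described on this slide (prioritized rules, conflict detection, manual override) can be sketched as follows. This is an illustrative reconstruction only, not the ABK implementation: the record is modelled as a flat dictionary, and the field names `collection_date` and `strain`, the two rule functions and `extract` are all hypothetical.

```python
import re


def year_from_collection_date(record):
    # Hypothetical rule 1: take a four-digit year from a date field.
    match = re.search(r"\b(?:19|20)\d{2}\b", record.get("collection_date", ""))
    return match.group(0) if match else None


def year_from_strain_name(record):
    # Hypothetical rule 2: take the year that ends an isolate name,
    # e.g. A/Viet Nam/1194/2004.
    match = re.search(r"/((?:19|20)\d{2})$", record.get("strain", ""))
    return match.group(1) if match else None


def extract(record, rules, override=None):
    """Apply rules in priority order and report conflicts.

    Returns (value, conflict). The value comes from the highest-priority
    rule that produced one; conflict is True when rules disagree. A value
    supplied by the curator (override) always wins and clears the conflict.
    """
    if override is not None:
        return override, False
    values = [v for v in (rule(record) for rule in rules) if v is not None]
    if not values:
        return None, False
    return values[0], len(set(values)) > 1
```

A record flagged with `conflict=True`, or one that returns `None`, is the kind of entry a curator would then fill in or override by hand.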
Page 13
Rule performance
Multiple rules often needed
Some properties are very fragmented
[Charts: per-rule percentages (0% to 100%) for rule1 to rule6, shown separately for genbank and genpept, one panel per property: Year, Origin, Host, Isolate Name, Subtype]
Page 14
Can H5N1 viruses spread amongst humans?
Page 15
The Antigenic Variability Analyzer (AVANA)
Page 16
Using MI to detect Characteristic Sites
At a characteristic site, the residue observed is
strongly associated with a set of sequences
E.g.:
Arg -> Avian
Thr -> Human
This association is explored by measuring the
mutual information (MI) between
The residue observed at a site
The label of the set in which it is observed
MI is in the range 0 to 1.0
MI = 0.0 -> no statistical association between the residues
observed and the two sets
MI = 1.0 -> Residues observed in one set are never
observed in the other, and vice versa
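
A minimal sketch of this measure for a single alignment column, assuming the standard definition of mutual information in bits. The function name and the equal weighting of the two sets (P(label) = 0.5) are assumptions made here so that the result stays within 0 to 1.0 and reaches 1.0 for disjoint residue sets regardless of set sizes; the slides do not state how the original tool normalizes.

```python
from collections import Counter
from math import log2


def site_mutual_information(set_a, set_b):
    """MI (in bits) between the residue at a site and the set label.

    set_a and set_b hold the residues seen at one alignment column in each
    set of sequences. Both labels are weighted equally, an assumption of
    this sketch rather than a documented property of the original method.
    """
    if not set_a or not set_b:
        raise ValueError("both sets must contain at least one residue")
    freq_a = {r: n / len(set_a) for r, n in Counter(set_a).items()}
    freq_b = {r: n / len(set_b) for r, n in Counter(set_b).items()}
    mi = 0.0
    for residue in set(freq_a) | set(freq_b):
        joint_a = 0.5 * freq_a.get(residue, 0.0)  # P(residue, label = A)
        joint_b = 0.5 * freq_b.get(residue, 0.0)  # P(residue, label = B)
        p_residue = joint_a + joint_b             # marginal P(residue)
        for joint in (joint_a, joint_b):
            if joint > 0:
                mi += joint * log2(joint / (p_residue * 0.5))
    return mi
```

With this weighting, a column where every avian sequence has Arg and every human sequence has Thr scores 1.0, and a column with the same residue frequencies in both sets scores 0.0.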
Page 17
PB2 Protein
[Figure: MI and entropy plotted along the PB2 protein for A2A (719 sequences) and H2H (1650 sequences)]
Spikes indicate characteristic sites
Page 18
RNP proteins: PB2
[Figure: map of PB2 (759 aa) showing the PB1 binding, NP binding, RNA cap binding and Nuclear Localization Signal regions, with A2A and H2H residues at each characteristic site]
Characteristic sites (17 sites): 9, 44, 64, 81, 105, 199, 271, 292, 368, 475, 567, 588, 613, 627, 661, 674, 702
http://www-micro.msb.le.ac.uk/3035/Orthomyxoviruses.html
Page 19
H2H characteristic mutations in H5N1
[Table: for each H5N1 isolate below, the residue observed at the characteristic sites of the M1, M2, NP, NS1, NS2, PA, PB1 and PB2 proteins, shown between the A2A and H2H reference rows]
Isolates (year, country, isolate name):
1997,HONG KONG,A/Hong Kong/156/97
1997,HONG KONG,A/Hong Kong/481/97
1997,HONG KONG,A/Hong Kong/482/97
1997,HONG KONG,A/Hong Kong/483/97
1997,HONG KONG,A/Hong Kong/486/97
1997,HONG KONG,A/Hong Kong/532/97
1997,HONG KONG,A/Hong Kong/538/97
1997,HONG KONG,A/Hong Kong/542/97
1998,HONG KONG,A/HongKong/97/98
2003,HONG KONG,A/HK/212/03
2003,HONG KONG,A/HK/213/03
2004,THAILAND,A/THAILAND/5(KK-494)/2004
2004,VIETNAM,A/Viet Nam/1194/2004
2004,VIETNAM,A/Viet Nam/1203/2004
2004,VIETNAM,A/Viet Nam/3046/2004
2004,VIETNAM,A/Viet Nam/3062/2004
2004,VIETNAM,A/Vietnam/CL01/2004
2004,VIETNAM,A/Vietnam/CL26/2004
2005,INDONESIA,A/Indonesia/5/2005
2005,INDONESIA,A/Indonesia/CDC184/2005
2005,INDONESIA,A/Indonesia/CDC287E/2005
2005,INDONESIA,A/Indonesia/CDC292T/2005
2005,INDONESIA,A/Indonesia/CDC7/2005
2005,THAILAND,A/Thailand/676/2005
2005,VIETNAM,A/Vietnam/CL105/2005
2005,VIETNAM,A/Vietnam/CL115/2005
2005,VIETNAM,A/Vietnam/CL2009/2005
2006,CHINA,A/human/Zhejiang/16/2006
2006,INDONESIA,A/Indonesia/CDC326/2006
2006,INDONESIA,A/Indonesia/CDC329/2006
2006,INDONESIA,A/Indonesia/CDC357/2006
2006,INDONESIA,A/Indonesia/CDC390/2006
2006,INDONESIA,A/Indonesia/CDC523/2006
2006,INDONESIA,A/Indonesia/CDC582/2006
2006,INDONESIA,A/Indonesia/CDC594/2006
2006,INDONESIA,A/Indonesia/CDC595/2006
2006,INDONESIA,A/Indonesia/CDC623/2006
2006,INDONESIA,A/Indonesia/CDC624E/2006
2006,INDONESIA,A/Indonesia/CDC625/2006
2006,INDONESIA,A/Indonesia/CDC634/2006
2006,INDONESIA,A/Indonesia/CDC699/2006
2006,INDONESIA,A/Indonesia/CDC742/2006
Page 20
Ongoing Projects at ISS

InViDiA - Integrated Virus Diversity Analysis
Web-based tool for metadata-enabled diversity analysis

WADE - Web-based Aggregation and Display of Epitopes
Web-based tool for aggregating epitope predictions from multiple prediction systems
Page 21
Thanks to
Johns Hopkins University
Prof. J Thomas August
Dana-Farber Cancer Institute, Harvard
Dr. Vladimir Brusic
Dept. of Biochemistry, NUS
Prof. Tan Tin Wee
AT Heiny, Asif M Khan, Hu Yong Li
Institut Pasteur
Dr. Hervé Bourhy
Partial Grant Support:
National Institute of Allergy and Infectious Diseases, NIH
Grant No. 5 U19 AI56541, Contract No. HHSN2662-00400085C
Page 22