Gene Expression and Cell Identity

Alexander Diehl
ImmPort Science Talk
3/20/14
Understanding the Nature of Entities
in Reality
Depends on What Parts We See
Reality is Often More Complex Than at
First Glance
And Perspective is Important
What are Cells?
• Cells are physical entities that exist in reality.
• We understand aspects of cells based on
results of experimental assays.
• Our knowledge of cell types is necessarily
incomplete even as we attempt to understand
their nature.
How We Represent Cells in the Cell
Ontology
• Morphology
• Surface marker expression, singly or in
combination
• Transcription factor expression or expression
of other internal proteins
• By lineage
• By function or capability
Types of Evidence Behind the
Representation of Cells in CL
• Microscopy, with or without staining (histology).
• Immunofluorescence in situ or in vitro.
• Flow cytometry or CyTOF.
• Colony formation assays.
• In vivo/in vitro lineage tracking.
• Direct assays of cellular function, typically in vitro.
• Indirect assays of cellular function in vivo.
• And rarely, assays of gene expression.
Experimental Data from Multiple Sources Is
Synthesized into a Single Definition in CL
Challenges in Ontology Building
• We want to represent both general cell types and
specific cell types.
• Many cell types are considered equivalent across
species in their general characteristics such as
surface marker expression or functions.
• Hematopoietic cell types in different species, such
as mouse and human, are sometimes given the
same name but are defined by different sets of
surface markers.
Challenges in Ontology Building
• We need to provide unique
representations via logical definitions for
each cell type.
• We need to recognize that in some cases,
different combinations of markers may
identify the same cell type.
The HIPC Lyoplate Project
• Standardization of human PBMC
immunophenotyping to enhance reproducibility
across different facilities.
• Use of eight-color flow cytometry with
standardized antibody panels
• Use of standardized sample preparation
• Use of standardized instrument settings
• Use of standardized data analysis
HIPC-Defined Cell Types in CL
“effector CD8+ T cell”
A request to CL…
TEMRA = a memory
T cell without
CD45RO expression.
Are These the Same Cell Type?
But… “effector CD4+ cell”?
The Gene Expression Part
Questions:
Can we use the structure of the Cell Ontology
as a framework for comparing gene
expression data tied to specific cell types?
Can we use the CL framework to identify genes
that distinguish one cell type from closely
related cell types?
The Immunological Genome Project
Linking IGP data to CL cell types.
• IGP provides gene expression data based on
sorted mouse immune cell types developed
according to standardized methods.
• We mapped IGP cell types to cell type terms in
the Cell Ontology.
• The CL structure was used to guide comparisons
between gene expression profiles of different cell
types.
Linking IGP data to CL cell types.
• 88 cell types were compared in a pairwise fashion.
• Separate gene sets were created for genes whose
expression was more than 1.5-fold higher or more than
1.5-fold lower, respectively, in each pairwise comparison.
• 7656 gene sets resulted.
• An ontological framework was created to map these
gene sets to Cell Ontology classes.
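The counts on this slide are consistent with one up-regulated and one down-regulated gene set per unordered pair of cell types: 88 × 87 / 2 = 3828 pairs, and 2 × 3828 = 7656 gene sets. A minimal Python sketch of that arithmetic follows, together with a hypothetical helper for the 1.5-fold split. The function names, the toy gene values, and the use of a linear-scale expression ratio are assumptions made for illustration; this is not the project's own code.

```python
from math import comb

FOLD_THRESHOLD = 1.5  # threshold stated on the slide


def count_gene_sets(n_cell_types):
    """Assume two gene sets (up and down) per unordered pair of cell types."""
    n_pairs = comb(n_cell_types, 2)
    return 2 * n_pairs


def split_by_fold_change(fold_changes, threshold=FOLD_THRESHOLD):
    """Split genes into up and down sets.

    fold_changes maps a gene name to a linear-scale ratio (cell type A over
    cell type B); treating the threshold this way is an assumption.
    """
    up = {gene for gene, fc in fold_changes.items() if fc > threshold}
    down = {gene for gene, fc in fold_changes.items() if fc < 1 / threshold}
    return up, down


# Hypothetical ratios for three made-up genes.
toy_up, toy_down = split_by_fold_change({"geneA": 2.0, "geneB": 0.5, "geneC": 1.1})

print(count_gene_sets(88))
print(sorted(toy_up), sorted(toy_down))
```

The pair count uses unordered pairs, so comparing A with B and B with A is counted once, with the direction carried by the up and down gene sets.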
Workflow of CL-IGP Project (flowchart)
• Map IGP samples to CL.
• Generate pairwise comparisons (PWC) from IGP dataset.
• Map PWCs to the CL.
• Choose a CL term.
• Decision: Does CL term or a descendant have mapped PWC? (branches: yes, no)
• Decision: How many PWCs are shared with a neighbor or one of its descendants? (branches: 0, 1, >1)
• Find intersection (shared genes) of PWC matrix (genes from PWC #1, genes from PWC #2).
• Outcomes: Associate genes with CL term, or DO NOT associate genes with CL term.
Flowchart labels: Figure 2, Figure 3.
Searching for Pairwise Comparisons Mapped to CL

A. Find pairwise comparisons:
“that have the name ‘germinal center B cell’ in ‘Parent’ that can be reached by a ‘has_up_regulated_genes’”
and
“that have the name ‘marginal zone B cell’ in ‘Parent’ that can be reached by a ‘has_down_regulated_genes’”
OUTPUT = 1 gene set

B. Find pairwise comparisons:
“that have the name ‘germinal center B cell’ in ‘Parent’ that can be reached by a ‘has_up_regulated_genes’”
and
“that have the name ‘mature B cell’ in ‘Ancestor’ that can be reached by a ‘has_down_regulated_genes’”
OUTPUT = 7 gene sets

C. Find pairwise comparisons:
“that have the name ‘mature B cell’ in ‘Ancestor’ that can be reached by a ‘has_up_regulated_genes’”
and
“that have the name ‘mature lymphocyte’ in ‘Ancestor’ that can be reached by a ‘has_down_regulated_genes’”
and
“that do NOT have the name ‘mature B cell’ in ‘Ancestor’ that can be reached by a ‘has_down_regulated_genes’”
OUTPUT = 256 gene sets
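The three queries can be read as filters over pairwise comparisons, where "Parent" matches the cell type a comparison is attached to and "Ancestor" also matches any term above it. The Python sketch below illustrates that reading on a toy hierarchy. The hierarchy, the comparison records, the "toy:" identifiers, and all function names are hypothetical; the actual query tool and the real CL structure are not shown on the slide.

```python
from dataclasses import dataclass

# Toy is_a hierarchy (child -> parent). Illustrative only, not copied from CL.
IS_A = {
    "germinal center B cell": "mature B cell",
    "marginal zone B cell": "mature B cell",
    "follicular B cell": "mature B cell",
    "mature B cell": "mature lymphocyte",
    "mature NK cell": "mature lymphocyte",
    "mature T cell": "mature lymphocyte",
}


def lineage(term):
    """Return the term itself plus all of its ancestors in the toy hierarchy."""
    found = {term}
    while term in IS_A:
        term = IS_A[term]
        found.add(term)
    return found


@dataclass(frozen=True)
class PairwiseComparison:
    gene_set_id: str
    up_in: str    # cell type linked by 'has_up_regulated_genes'
    down_in: str  # cell type linked by 'has_down_regulated_genes'


def matches(cell_type, name, mode):
    """'Parent' requires an exact match; 'Ancestor' accepts any term in the lineage."""
    if mode == "Parent":
        return cell_type == name
    return name in lineage(cell_type)


def find_comparisons(comparisons, up, down, exclude_down_ancestor=None):
    """up and down are (name, mode) tuples; mode is 'Parent' or 'Ancestor'."""
    hits = []
    for pwc in comparisons:
        if not matches(pwc.up_in, *up) or not matches(pwc.down_in, *down):
            continue
        if exclude_down_ancestor and exclude_down_ancestor in lineage(pwc.down_in):
            continue
        hits.append(pwc)
    return hits


COMPARISONS = [
    PairwiseComparison("toy:1", "germinal center B cell", "marginal zone B cell"),
    PairwiseComparison("toy:2", "germinal center B cell", "follicular B cell"),
    PairwiseComparison("toy:3", "marginal zone B cell", "mature NK cell"),
    PairwiseComparison("toy:4", "germinal center B cell", "mature T cell"),
    PairwiseComparison("toy:5", "mature NK cell", "mature T cell"),
]

query_a = find_comparisons(COMPARISONS, ("germinal center B cell", "Parent"),
                           ("marginal zone B cell", "Parent"))
query_b = find_comparisons(COMPARISONS, ("germinal center B cell", "Parent"),
                           ("mature B cell", "Ancestor"))
query_c = find_comparisons(COMPARISONS, ("mature B cell", "Ancestor"),
                           ("mature lymphocyte", "Ancestor"),
                           exclude_down_ancestor="mature B cell")

for label, hits in (("A", query_a), ("B", query_b), ("C", query_c)):
    print(label, [pwc.gene_set_id for pwc in hits])
```

In this toy setup, query C keeps only comparisons of a mature B cell type against a lymphocyte that is not itself a mature B cell, which mirrors the intent of finding genes that distinguish mature B cells from other lymphocytes.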
To find genes that distinguish mature B cells from other lymphocytes…

B cell genes ↑ compared to mature NK cells
B cell genes ↑ compared to mature T cells

…an R function takes all relevant pairwise comparisons (Fig. 2c) as input…

ID          Name  Definition
GS:9100825  825d  CD4+ NKT cell of liver vs marginal zone B cell
GS:9100826  826d  CD4+ NKT cell of spleen vs marginal zone B cell
GS:9100831  831d  Vg2- γδ T cell of spleen vs marginal zone B cell
…etc. for 253 more pairwise comparisons

…finds the genes present in all comparisons…

825d
ID        logFC
10361292  -8.49
10405216  -5.01
10467508  -5.99
10567863  -6.57
10420758  -5.56
10467258  -5.35
10562132  -5.86
10495781  -5.09

826d
ID        logFC
10405216  -5.13
10361292  -8.56
10567863  -6.21
10467508  -5.64
10420758  -5.50
10492983  -5.59
10368675  -6.67
10562132  -5.70

831d
ID        logFC
10405216  -4.82
10361292  -8.01
10467508  -5.66
10420758  -5.49
10467258  -5.26
10567863  -6.11
10492983  -5.40
10384974  -4.82

…and then outputs genes ranked by mean fold change.
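The intersect-and-rank step can be sketched in a few lines. The slide describes an R function that is not shown; the Python below is an illustrative stand-in, not the authors' code. It uses only the eight rows excerpted per comparison on the slide, so the result describes the excerpt rather than the full 256-comparison analysis. Ranking by the magnitude of the mean logFC is an assumption standing in for "mean fold change", and the function name is hypothetical.

```python
from statistics import mean

# Rows transcribed from the excerpted tables on the slide (probe ID -> logFC).
COMPARISONS = {
    "825d": {"10361292": -8.49, "10405216": -5.01, "10467508": -5.99,
             "10567863": -6.57, "10420758": -5.56, "10467258": -5.35,
             "10562132": -5.86, "10495781": -5.09},
    "826d": {"10405216": -5.13, "10361292": -8.56, "10567863": -6.21,
             "10467508": -5.64, "10420758": -5.50, "10492983": -5.59,
             "10368675": -6.67, "10562132": -5.70},
    "831d": {"10405216": -4.82, "10361292": -8.01, "10467508": -5.66,
             "10420758": -5.49, "10467258": -5.26, "10567863": -6.11,
             "10492983": -5.40, "10384974": -4.82},
}


def shared_genes_ranked(comparisons):
    """Keep probes present in every comparison, ranked by |mean logFC|."""
    tables = list(comparisons.values())
    shared = set(tables[0]).intersection(*tables[1:])
    means = {probe: mean(table[probe] for table in tables) for probe in shared}
    # Largest magnitude first; ties are not expected in this small excerpt.
    return sorted(means.items(), key=lambda item: abs(item[1]), reverse=True)


ranked = shared_genes_ranked(COMPARISONS)
for probe, mean_logfc in ranked:
    print(probe, round(mean_logfc, 2))
```

Five of the excerpted probes appear in all three tables; probes such as 10467258 and 10562132 drop out because each is missing from one of the three excerpts.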
Summary of Upregulated and
Downregulated Genes by Cell Type
Novel Genes Identified for Specific Cell
Types and Confirmed by IGP
• Scd1 is an enzyme involved in the biosynthesis of monounsaturated fatty acids. Its expression is restricted to mature B cell types in comparison to other immune cell types.
• I830077J02Rik, a single-pass transmembrane protein that is otherwise uncharacterized, is widely expressed among myeloid cells. In lymphocytes, expression of this protein is restricted to marginal zone B cells.
(Plots for Scd1 and I830077J02Rik; axis: normalized gene expression)
Candidate Genes Involved in the Unique Functions of Germinal Center B cells

Panels A and B (figure): GC B cell ↑ vs activated NK cell; GC B cell ↑ vs MZ B cells; GC B cell ↑ vs activated CD8 T cell. 550 genes > 2-fold up.

Panel C:
Gene symbol  Fold change  Stdv  Affy ID   MGI ID
Rgs13        102.5        6.3   10358399  MGI:2180585
Igj          86.2         22.9  10531126  MGI:96493
Aicda        74.3         13.2  10541507  MGI:1342279
Mybl1        62.2         50.9  10353010  MGI:99925
Ighv1-43     53.5         46.4  10403021  MGI:3704124
Gm600        51.8         5.4   10416006  MGI:2685446
Rasgrp3      48.6         43.6  10446965  MGI:3028579
Gcet2        47.3         2.0   10436024  MGI:102969
Rassf6       44.1         1.5   10531261  MGI:1920496
IgKv         30.0         25.1  10545190  *
Conclusions
• Gene expression comparisons placed in an
ontology framework can provide details about
genes uniquely expressed in particular
immune cell subtypes.
• Results of our approach have been validated
against non-ontologically based analyses of
IGP data, for instance for NK cells, and similar
results are seen.
Acknowledgements
• Barry Smith
• Alan Ruttenberg
• Ryan Brinkman
• Raphael Gottardo
• Richard Scheuermann
• David Dougall
• Holden Maeckler
• Philip McCoy
• Terry Meehan
• Nicole Vasilevsky
• Chris Mungall
• Melissa Haendel
• Judy Blake
• And many other contributors to
the CL