Affymetrix case study VM-story - Center for Biological Sequence

Affymetrix case study
Jesper Jørgensen
NsGene A/S
jrj@nsgene.dk
Overview
• Affymetrix GeneChip technology
• Data processing
– Expression level
– Normalisation
– Fold change
– Statistics
• Parkinson disease
• Ventral versus dorsal midbrain (case study)
• Verification of array data
– Q-PCR
– In situ hybridization
– Immunohistochemistry
Expression profiling
• Expression profiling
– Investigate mRNA expression profile.
– Compare gene expression between two or more situations.
– Case versus control.
• Profiling methods
– Differential display.
– SAGE (Serial Analysis of Gene Expression)
– Micro array (Custom spotted arrays / Affymetrix GeneChip).
Affymetrix GeneChip technology
Figure: a gene (5’ to 3’) covered by multiple oligo probes, each present as a PM (perfect match) and MM (mismatch) pair.
Figure adapted from: David Givol, Weizmann Institute of Science,
http://www.weizmann.ac.il/home/ligivol/research_interests.html
Probe synthesis on the array
Probe set design
A probe set = 11-20 PM/MM pairs
(Probe design is not optimized)
Preparation of samples for GeneChip
U133A
U133B
Amplification
(T7 RNA polymerase)
Figure modified from: Knudsen (2002),
"A Biologist's Guide to Analysis of DNA Microarray Data", Wiley.
The hardware
Expression level (probe signal)
Li-Wong model
n: scaling factor obtained by fitting
Several other models exist. Irizarry et al. (2002) use log-transformed PM values after carrying out a global background adjustment and across-array normalisation.
Irizarry et al. (2002) Biostatistics
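The model formula on this slide did not survive extraction, so the following is only a minimal sketch of the idea, assuming the commonly published form of the Li-Wong model: PM - MM for array i and probe pair j is approximated by theta_i * phi_j, with the probe sensitivities constrained so that the sum of phi_j squared equals the number of probe pairs. The names `fit_li_wong`, `theta` and `phi` are illustrative and are not taken from the slide, and the outlier handling of the full method is left out.

```python
import math

def fit_li_wong(diff, n_iter=50):
    """Fit diff[i][j] ~ theta[i] * phi[j] by alternating least squares.

    diff[i][j] is PM - MM for array i and probe pair j (hypothetical input).
    Returns (theta, phi) with sum(phi_j ** 2) equal to the number of probe pairs.
    """
    n_arrays, n_probes = len(diff), len(diff[0])
    phi = [1.0] * n_probes  # start from equal probe sensitivities

    def update_theta(phi):
        # Least-squares expression index per array for fixed phi.
        phi_sq = sum(p * p for p in phi)
        return [sum(y * p for y, p in zip(row, phi)) / phi_sq for row in diff]

    for _ in range(n_iter):
        theta = update_theta(phi)
        # Least-squares probe sensitivity per probe pair for fixed theta.
        theta_sq = sum(t * t for t in theta)
        phi = [sum(diff[i][j] * theta[i] for i in range(n_arrays)) / theta_sq
               for j in range(n_probes)]
        # Rescale so that sum(phi^2) equals the number of probe pairs.
        scale = math.sqrt(n_probes / sum(p * p for p in phi))
        phi = [p * scale for p in phi]
    # Final update so theta is consistent with the rescaled phi.
    return update_theta(phi), phi
```

With noise-free data the fit recovers the expression indices exactly; on real chips the residuals are normally inspected to flag outlier probes, which this sketch does not attempt.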
qspline normalisation (M/A plot)
Figure panels: Before, After
Workman et al. (2002) Genome Biology, vol. 3, no. 9.
Assumption: Most genes are unchanged.
M/A plot: Raw chip data are used to plot, for each probe, the logarithm of the ratio between two chips versus the logarithm of the mean expression for the two chips.
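The M/A coordinates described above can be sketched as follows. This follows the wording of the slide (logarithm of the mean of the two signals); many implementations instead use the mean of the two log signals for A. The function name `ma_values` and the base-2 logarithm are assumptions, and the qspline fit itself is not shown.

```python
import math

def ma_values(chip1, chip2):
    """Return (M, A) for two equally long lists of positive probe signals.

    M is the log2 ratio between the chips, A the log2 of their mean signal.
    """
    m = [math.log2(x / y) for x, y in zip(chip1, chip2)]
    a = [math.log2((x + y) / 2) for x, y in zip(chip1, chip2)]
    return m, a

# Hypothetical signals for two probes on two chips.
m, a = ma_values([100.0, 400.0], [100.0, 100.0])
```

If most genes are unchanged, the cloud of points should be centred on M = 0 across the whole range of A; normalisation removes any intensity-dependent drift away from that line.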
Variation
A/A
B/B
Two different amplifications of the same RNA applied to GeneChips
Fold change (Log fold)
• Fold change = sample/control
• Log transformation makes scale symmetric around 0
• All data log2 transformed
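A small sketch of why the log transformation is used: a doubling and a halving are equally far from zero on the log2 scale, whereas the raw fold changes (2 and 0.5) are not symmetric around 1. The function name `log2_fold` is illustrative.

```python
import math

def log2_fold(sample, control):
    """Log2 of the fold change sample/control (both must be positive)."""
    return math.log2(sample / control)

up = log2_fold(200.0, 100.0)    # two-fold up-regulation
down = log2_fold(50.0, 100.0)   # two-fold down-regulation
```

Here `up` and `down` have the same magnitude and opposite signs.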
Figure: log2 fold (y-axis, -4 to 4) plotted against fold change (x-axis, 0 to 12).
Statistical testing
Is the regulation significant?
• Student's and Welch's t-tests
• ANOVA
• SAM
• Wilcoxon
• Kruskal-Wallis
• Westfall-Young
• ………..
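As an illustration of the first test in the list, here is a minimal sketch of the Welch (unequal variance) t statistic and its Welch-Satterthwaite degrees of freedom for one gene measured in two groups. The function name and the example values are hypothetical; turning t into a P-value requires the t distribution, for example via `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for Welch's t-test on two samples of log2 expression."""
    va = variance(a) / len(a)  # squared standard error of group a
    vb = variance(b) / len(b)  # squared standard error of group b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical log2 signals for one probe set, three chips per group.
t, df = welch_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

With only three chips per group the degrees of freedom are very low, which is one reason the choice of test and the multiple-testing correction matter so much for array data.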
Bonferroni correction
At a P-value cutoff of 0.05 you expect, if no gene is truly regulated:
• 5 false positives if you look at 100 genes
• 1,200 false positives if you look at 24,000 genes
The more genes you test, the greater the likelihood of getting a significant result by chance alone.
If you accept at most a 25% chance of having even one false positive in the list of
regulated genes, you should only consider P-values more significant
than the Bonferroni-corrected cutoff:
• 2.5 x 10^-3 (0.25/100) if you look at 100 genes
• 1.0 x 10^-5 (0.25/24,000) if you look at 24,000 genes
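The arithmetic behind these numbers can be checked with a short sketch. The function names are illustrative; the expected count assumes that none of the tested genes is truly regulated.

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of chance hits when every null hypothesis is true."""
    return alpha * n_tests

def bonferroni_cutoff(family_alpha, n_tests):
    """Per-gene P-value cutoff that bounds the chance of any false positive."""
    return family_alpha / n_tests

for n_genes in (100, 24000):
    print(n_genes,
          expected_false_positives(0.05, n_genes),
          bonferroni_cutoff(0.25, n_genes))
```

The Bonferroni cutoff is conservative: it guards against even a single false positive in the whole list, so truly regulated genes with modest P-values will be missed.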
Parkinson’s Disease (PD)
• A fairly common neurodegenerative disorder (approx. 2 million in USA/Europe)
• Due to loss of the dopamine-producing neurons in the Substantia Nigra
• Cardinal motor symptoms: tremor, rigidity and bradykinesia
• Conventional treatment does not halt the progression of nerve cell loss
Fetal Transplantation for PD
• Cells from the developing midbrain (A)
– are collected and dissociated (B)
– and transplanted into the striatum (C)
• The cells will integrate with the host brain and produce dopamine.
Stem cells in Parkinson disease
Langston JW., J Clin Invest. 2005 Jan;115(1):23-5.
Aim
• In the human fetus, DA neurons can be found in the ventral part of the tegmentum (VT) from approximately 6 weeks.
• In contrast, no DA neurons can be found in the neighboring dorsal part (DT).
• We aim to find genes associated with DA differentiation by using GeneChips to compare the expression profiles of VT and DT.
* TH IHC
High quality RNA from 8w GA human
ventral midbrain
8wVT (A)
8wVT (B)
8wDT (A)
8wDT (B)
Experimental setup
• Compare VT against DT (3x3)
• Affymetrix Human Genome U133 Chip Set
– HG-U133A: Well substantiated genes
– HG-U133B: Mostly ESTs
– Total: 45,000 probes (genome)
Samples: A VENTRAL, B VENTRAL, C VENTRAL
Samples: A DORSAL, B DORSAL, C DORSAL
U133A data permutations and filter
• Red: VM versus DM:
VM (A1 VENTRAL, A2 VENTRAL, B VENTRAL)
DM (A1 DORSAL, A2 DORSAL, B DORSAL)
• Other colors: Permutations
• Low-stringency filter (shown as dotted line):
– Average expression > 50
– P-value < 0.04
– SLR (signal log ratio) > 0.5 (about 41% up-regulation in VM)
• Arrange by descending fold change.
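The filter above can be sketched as a few lines of code. The record layout (`id`, `mean_expr`, `p_value`, `slr`) and the example values are hypothetical, not data from the study. An SLR of 0.5 corresponds to a fold change of 2**0.5, roughly 1.41.

```python
def low_stringency_filter(probes, min_expr=50, max_p=0.04, min_slr=0.5):
    """Keep probes passing all three thresholds, sorted by descending SLR."""
    kept = [p for p in probes
            if p["mean_expr"] > min_expr
            and p["p_value"] < max_p
            and p["slr"] > min_slr]
    return sorted(kept, key=lambda p: p["slr"], reverse=True)

# Hypothetical probe records for illustration only.
EXAMPLE_PROBES = [
    {"id": "probe_a", "mean_expr": 120.0, "p_value": 0.010, "slr": 1.8},
    {"id": "probe_b", "mean_expr": 30.0, "p_value": 0.001, "slr": 2.5},   # expression too low
    {"id": "probe_c", "mean_expr": 300.0, "p_value": 0.030, "slr": 0.7},
    {"id": "probe_d", "mean_expr": 90.0, "p_value": 0.200, "slr": 1.1},   # P-value too high
    {"id": "probe_e", "mean_expr": 75.0, "p_value": 0.020, "slr": 2.1},
]

selected = low_stringency_filter(EXAMPLE_PROBES)
```

Running the same filter on permuted sample labels, as in the plot, gives a feel for how many probes pass by chance alone.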
SLR
Genes up-regulated in VM on U133A
Low-stringency filter: Average expression > 50, P-value<0.04, SLR>0.5
arranged with descending fold change. Total list 107 probes. Only SLR>1
displayed.
Literature verification
• ALDH1A
• DAT1
• VMAT2
• TH
• Calbindin, 28kDa
• HNF3a
• 3x Nurr1
• 2x IGF
• 4x SNCA
• 4x DRD2
• KCNJ6 (Girk2)
• Ret
• PITX3
• BDNF
• DLK1 (FA1)
• SLC17A6 (VGLUT2)
• EPHA5
• ERBB4
Verification of array data
Array Data (100 candidate genes)
Validation on array material (confirmation)
Validation on new samples (universality)
Desk work: Statistics, Literature, Bioinformatics
RNA: Q-PCR, ISH, Northerns
Protein: IHC, ELISA, Westerns
ALDH1A1 RT-PCR
Figure: 299 bp product at 30x and 35x for cDNA#253 (VM), cDNA#254 (DM), cDNA#244 (VM), cDNA#245 (DM), cDNA#256 (VM) and cDNA#257 (DM); plot of fluorescence (0 to 0.14) against cycle.
Q-PCR verification of genes regulated
on U133A
TH Q-PCR on a developmental series of
subdissected human embryonic and
fetal brain material
OD260/280 ratios were measured at 1.88 +/- 0.05 for all RNA samples
Q-PCR analysis and clustering
Fold change in a mixed population
1.5 fold up-regulation
from no expression
1.5 fold up-regulation
from some expression
Organization of ISH procedure
GeneChip verification with ISH
ISH from: Vernay et al., J Neurosci. 2005 May 11;25(19):4856-67.
GeneChip verification with IHC
Courtesy of Josephine Jensen
Conclusions
• Using arrays, one gets a snapshot of the expression profile under the conditions investigated.
– Careful experimental design
– RNA quantity and quality are important
• Since a single array experiment generates thousands of data points, the primary challenge of the technique is to make sense of the data.
– Calculations/Statistics (back and forth)
– Literature mining
• Independent methods are needed for verification
– Q-PCR
– In situ hybridization (ISH)
– Immunohistochemistry (IHC)
Acknowledgements
NsGene, Ballerup, Denmark (http://www.nsgene.com/)
• Lars Wahlberg
• Bengt Juliusson
• Teit Johansen
Neurotech, Huddinge University Hospital, Sweden
• Åke Seiger
Department of Medical Genetics, IMBG, Panum Institute, Denmark
• Claus Hansen
• Karen Friis
Wallenberg Neuroscience Center, Sweden
• Anders Björklund
• Josephine Jensen
• Elin Andersson
CBS, DTU, Denmark
• Søren Brunak
• Steen Knudsen
• Nikolaj Blom
• Thomas Nordahl Petersen