Functional Genomics (1)

Shirley©
Yow-Ling Shiue 薛佑玲
Institute of Biomedical Science
National Sun Yat-sen University
Steps of Genome Analysis
8 Genome sequence assembled, markers
8 Gene location/gene map (mapping)
8 Gene prediction – train a model for each genome (including EST & cDNA sequences)
8 Genome annotation
8 Functional genomics
8 Identify repetitive sequences – mask out, filter
8 Comparative genomics & integrative genomics
http://www.ornl.gov/sci/techresources/Human_Genome/research/function.shtml
Functional Genomics Technology Goals
8 Generate sets of full-length cDNA clones and sequences that represent human genes and model organisms
8 Support research on methods for studying functions of nonprotein-coding sequences
8 Develop technology for comprehensive analysis of gene expression
8 Improve methods for genome-wide mutagenesis
8 Develop technology for large-scale protein analyses
http://www.ornl.gov/sci/techresources/Human_Genome/research/function.shtml
Definition (1) – Hieter & Boguski 1997
8 The development & application of global (genome-wide or system-wide) experimental approaches to assess gene function by making use of the information & reagents provided by structural genomics
8 It is characterized by high-throughput or large-scale experimental methodologies
8 Combined with statistical or computational analysis of the results
Definition (2) – UC Davis Genome Center
8 A means of assessing phenotype that differs from more classical approaches primarily with respect to the scale & automation of biological investigations
8 A classical investigation of gene expression might examine how the expression of a single gene varies with the development of an organism in vivo
8 Modern functional genomics approaches, however, would examine how 1,000-10,000 genes are expressed as a function of development
http://genomics.ucdavis.edu/index_html.html
Definition (3) – Hunt & Livesey (ed.)
8 Subtracted cDNA libraries
8 Differential display (DD)
8 Representational difference analysis
8 Suppression subtractive hybridization
8 cDNA microarrays
8 2-D gel electrophoresis
http://www.oup.co.uk/isbn/0-19-963774-1
Functional Genomics
8 How to do
8 What to know
8 Gene expression
8 Gene regulation
8 Genome-wide mutagenesis
8 Data-mining [SAGE]
8 Microarray analysis
8 Subtractive cDNA libraries
8 Yeast two-hybrids
8 Transgenics
8 Transposon targeting
8 RNAi & miRNA
http://www.ncbi.nlm.nih.gov/Tools/
Expression Arrays - Microarray
8 Cell growth in different environments, treatments etc.
8 Isolate RNA → cDNAs
8 Measure expression using array technology
8 Create database of expression information
8 Data analysis
8 Display information in an easy-to-use format
8 Show ratio of expression under different conditions
Affymetrix® food chip
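The "ratio of expression under different conditions" is commonly reported on a log2 scale, so that a 4-fold increase and a 4-fold decrease are symmetric around zero. The sketch below is a minimal illustration, not the method used on the slide: the intensity values are hypothetical, and the pseudocount of 1 (added to avoid dividing by zero) is an assumption.

```python
import math


def log2_ratio(treated, control, pseudocount=1.0):
    """Return log2((treated + pseudocount) / (control + pseudocount)).

    The pseudocount is an illustrative assumption to keep the ratio
    defined when a spot has zero signal.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Hypothetical background-corrected intensities: (treated, control).
intensities = {
    "geneA": (799.0, 199.0),  # higher in the treated sample
    "geneB": (99.0, 399.0),   # lower in the treated sample
    "geneC": (249.0, 249.0),  # unchanged
}

ratios = {gene: log2_ratio(t, c) for gene, (t, c) in intensities.items()}

for gene, value in ratios.items():
    print(f"{gene}: log2 ratio = {value:+.2f}")
```

Positive values mean higher expression in the treated sample, negative values mean lower expression, and zero means no change.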
Historical Perspective
8 DNA hybridization (1960s)
8 Detection of hybrids
8 Hydroxyapatite Ca5(PO4)3OH
8 Radioactive labeling
8 Enzyme-linked detection
8 Fluorescent labeling
8 Fixing sample on solid support
8 Southern blots (1970s)
8 Northern blots
8 Dot blots
Basic Principles
8 Main novelty is one of scale
8 Hundreds or thousands of probes rather than tens
8 Probes are attached to solid supports
8 Robotics are used extensively
8 Informatics is a central component at all stages
Gene Expression Analysis (Whole Genome)
8 Quantitative Analysis of Gene Activities
8 Transcription Profiles
Yang et al. BMC Genomics 2005 6:90 doi:10.1186/1471-2164-6-90
Major Technologies
8 cDNA probes (> 200 nt), usually produced by PCR, attached to either nylon or glass supports
8 Oligonucleotides (25-80 nt) attached to glass support
8 Oligonucleotides (25-30 nt) synthesized in situ on silica wafers (Affymetrix)
8 Probes attached to tagged beads
4187 genes; 91 samples
Principal Uses of Chips (1)
8 Genome-scale gene expression analysis
8 Differentiation
8 Responses to environmental factors
8 Disease processes
8 Effects of drugs
8 Genome-scale profiling of gene expression in hepatocellular carcinoma: classification and survival prediction
8 CCR Frontiers in Science (2006); Lee et al. Hepatology 40:667-76 (2004)
Principal Uses of Chips (2)
8 Detection of sequence variation
8 Genotyping
8 Detection of somatic mutations (e.g. in oncogenes)
8 Direct sequencing
Allele-specific hybridization (ASH)
Chee et al. 1996; Wang et al. 1998; Lindblad-Toh et al. 2000
40 different 25-bp oligos
Toshiba's hepatitis C SNP typing chip
SNP Strategy - "GeneChip Mapping Assay"
cDNA Chips
8 Probes are cDNA fragments, usually amplified by PCR
8 Probes are deposited on a solid support, either positively charged nylon or glass slide
8 Samples (normally polyA+ RNA) are labeled using fluorescent dyes
8 At least two samples are hybridized to chip
8 Fluorescence at different wavelengths measured by a scanner
Molecular Cell Biology, Lodish 5th Ed.
cDNA Chip Design
8 Probe selection
8 Non-redundant set of probes
8 Includes genes of interest to project
8 Corresponds to physically available clones
8 Chip layout
8 Grouping of probes by function
8 Correspondence between wells in microtiter plates and
spots on the chip
Probe Selection
8 Make sure that database entries are cDNA
8 Preference for RefSeq entries
8 Criteria for non-redundancy
8 >98% identity over >100 nt
8 Accession number is unique
8 Mapping of sequence to clone
8 Use Unigene clusters
8 Directly use data from sequence-verified collection (e.g. Research Genetics)
8 Independently verify sequence
Agilent Technology: 60mer probe selection; GeneBin
cDNA Arrays on Nylon and Glass
8 Nylon arrays
8 Up to about 1,000 probes per filter
8 Use radiolabeled cDNA target
8 Can use phosphorimager or X-ray film
8 Glass arrays
8 Up to about 40,000 probes per slide, or 10,000 per 2 cm² area (limited by arrayer's capabilities)
8 Use fluorescent targets
8 Require specialized scanner
RZPD Nylon array
Overview of the Production of a Pair of Cheap, Low-density Nylon Arrays of PCR Products
StemCellDB: library ID
http://stemcell.princeton.edu/v1/sbs_screen.html
Actual image of two duplicate arrays of 332 clones each, probed with Sca+ (-) AA4- (top) or AA4- (-) Sca+ (bottom) subtracted probe populations
http://stemcell.princeton.edu/v1/sbs_screen.html
Northern Blotting Confirmation
http://stemcell.princeton.edu/v1/sbs_screen.html
Array Type & Spot Density
Array Type              Spot Density (per cm2)   Probe    Target   Labeling
Nylon Macroarrays       < 100                    cDNA     RNA      Radioactive
Nylon Microarrays       < 5,000                  cDNA     mRNA     Radioactive/Fluorescent
Glass Microarrays       < 10,000                 cDNA     mRNA     Fluorescent
Oligonucleotide Chips   < 250,000                oligos   mRNA     Fluorescent
Glass Chip Manufacturing
8 Choice of coupling method
8 Physical (charge), non-specific chemical, specific chemical (modified PCR primer)
8 Choice of printing method
8 Mechanical pins: flat tip, split tip, pin & ring
8 Piezoelectric deposition ("ink-jet")
8 Robot design
8 Precision of movement in 3 axes
8 Speed and throughput
8 Number of pins, number of spots per pin load
CHIP 1000, Shimadzu Biotech
Physical Spotting
Typical Ink Jet Spot Deposition Results
Volume per spot: 250 nl
Spot size: 1,100 µm
Spot density: 70/cm2
Volume per spot: 0.5 nl
Spot size: 115 µm
Spot density: 4,800/cm2
Labelled BSA (Cy5)
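The quoted spot densities can be related to the centre-to-centre spacing (pitch) of the spots with simple arithmetic. The sketch below assumes a regular square grid, which the slide does not state. Under that assumption, 70 spots/cm2 corresponds to a pitch of roughly 1,195 µm and 4,800 spots/cm2 to roughly 144 µm, each slightly larger than the quoted spot size.

```python
import math

UM_PER_CM = 10_000  # micrometres per centimetre


def density_from_pitch(pitch_um):
    """Spots per cm2 for a square grid with the given pitch in µm."""
    return (UM_PER_CM / pitch_um) ** 2


def pitch_from_density(spots_per_cm2):
    """Pitch in µm of a square grid that gives the stated density."""
    return UM_PER_CM / math.sqrt(spots_per_cm2)


for density in (70, 4800):
    print(f"{density} spots/cm2 -> pitch of about {pitch_from_density(density):.0f} µm")
```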
Typical Pin Spot Deposition Microarray Results
7x11 microarray consisting of identical Cy5-BSA spots (pitch 500 µm)
Typical CV: ≤ 5%
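The CV (coefficient of variation) quoted here is the standard deviation of the spot signals divided by their mean, expressed as a percentage. A minimal sketch follows; the intensity values are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the slide does not say which form was used.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean * 100


# Hypothetical fluorescence signals from five replicate spots.
spot_signals = [100.0, 102.0, 98.0, 101.0, 99.0]
cv_percent = coefficient_of_variation(spot_signals)
print(f"CV = {cv_percent:.2f}%")
```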
Protocol
Labeling and Hybridization
8 Targets are normally prepared by oligo(dT) primed cDNA synthesis
8 Probes should contain 3' end of mRNA
8 Need Cot-1 DNA as competitor (esp. LINE)
8 Alternative protocol is to make ds cDNA containing bacterial promoter, then cRNA
8 Can work with smaller amount of RNA
8 Less quantitative
8 Hybridization usually under coverslips
Scanning the Arrays
8 Laser scanners
8 Excellent spatial resolution
8 Good sensitivity, but can bleach fluorochromes
8 Still rather slow
8 CCD (Charge-Coupled Device) scanners
8 Spatial resolution can be a problem
8 Sensitivity easily adjustable (exposure time)
8 Faster and cheaper than lasers
8 In all cases, raw data are images showing fluorescence on surface of chip
BioRad: VersArray ChipReader™ laser confocal scanners
Example: Zeptosens Planar Waveguide Principle –
for High Sensitivity Fluorescence Microarray Detection
Figure labels: free label; microarray on chip; excitation of bound label; imaging of surface-confined fluorescence; CCD camera
Glass Microarray –
326 Rat Heart Genes, 2X spotting
Coffee Break
8 What did one math book say to the other?
8 I have a lot of problems!
The Affymetrix Approach
8 Probes are oligos synthesized in situ using a
photolithographic approach
8 There are at least 13-16 oligos per gene (PM), plus an equal number of negative controls (MM)
8 The apparatus requires a fluidics station for hybridization
and a special scanner
8 Only a single fluorochrome is used per hybridization
8 It is very expensive!
Affymetrix GeneChip®
Affymetrix Chip Production - GeneChip® (Photolithography)
Production of an Affymetrix GeneChip: through the use of photolithography & combinatorial chemistry, specific DNA probes are constructed on the chip surface (Coe & Antler 2004)
The use of oligonucleotide arrays: mRNA is extracted from cells and amplified through a process that labels the RNA for analysis. The sample is then applied to an array and bound RNA is stained (Coe & Antler 2004).
Probe Design
R = Discrimination Score = (PM - MM)/(PM + MM)
http://www.affymetrix.com/support/technical/technotes/statistical_reference_guide.pdf
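The discrimination score compares each perfect-match (PM) probe with its mismatch (MM) partner and ranges from -1 to +1. The sketch below simply evaluates the formula above for a few probe pairs. The intensity values are hypothetical, and the function name is illustrative rather than taken from Affymetrix software.

```python
def discrimination_score(pm, mm):
    """Return R = (PM - MM) / (PM + MM) for one probe pair."""
    total = pm + mm
    if total == 0:
        raise ValueError("R is undefined when PM + MM is zero")
    return (pm - mm) / total


# Hypothetical (PM, MM) intensities for three probe pairs of one gene.
probe_pairs = [(1200.0, 300.0), (800.0, 800.0), (150.0, 450.0)]
scores = [discrimination_score(pm, mm) for pm, mm in probe_pairs]

for (pm, mm), r in zip(probe_pairs, scores):
    print(f"PM={pm:.0f} MM={mm:.0f} R={r:+.2f}")
```

R close to +1 means the PM signal dominates, R near 0 means the two probes are indistinguishable, and negative R means the mismatch probe is brighter than the perfect match.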
Commercial Chips
8 Clontech, Incyte, Research Genetics
8 Filter-based arrays with up to about 8,000 clones
8 Incyte/Synteni
8 10,000 probe chips, not distributed (have to send them target RNA)
Incyte microarray
8 Affymetrix
8 Oligo-based chips with 12,000 genes of known function (13-16 oligos/gene) and 4x10,000 from ESTs
8 http://www.affymetrix.com/products/arrays/index.affx
Affymetrix Designs
Alternative Technologies
8 Synthesis of probes on microbeads
8 Hybridization in solution
8 Identification of beads by fluorescent bar coding or by embedding transponders
8 Readout using micro-flow cells or optic fibers
535 Multipurpose Cell
8 Production of "universal" arrays
8 Array uses a unique combination of oligos, and probes containing the proper complements
Two-color Assay: DASL Hybridization of Labeled Amplicons to Bead-based Address Code Sequences on Sentrix Universal Arrays
http://www.illumina.com/products/arraysreagents/universal_arrays.ilmn
Illumina© Universal array
A. 100 beads with different probe DNA are arrayed in a capillary in the intended order
B. Microscopic image
C. A bead-array system
Sample, buffer & waste reservoir
•Sample solution from the sample reservoir moves back & forth inside the bead-array during hybridization & buffer solution from the buffer reservoir is introduced during washing
Fiber Optics Technology
To learn more: Illumina's Web site
Arrays for Genetic Analysis
8 Mutation detection
8 Molecular Inversion Probe Technology for SNP Genotyping (next slide)
8 20,000 SNPs in a single array
8 PCR followed by primer extension, with detection of alleles by MALDI-TOF mass spectrometry (MS) (Sequenom)
8 Gene loss & amplification
8 Measure gene dosage in genomic DNA by hybridization to genomic probes
Genome Research 2005 15, 269-75.
http://www.affymetrix.com/technology/mip_technology.affx#snp
8 Four-color single array technology; up to 12,000 SNPs per reaction
8 Amplification with universal PCR primer pair
8 Each amplified probe contains a unique tag sequence that is complementary to a sequence on the universal tag array
8 Tags have been selected to have a similar Tm & base composition & to be maximally orthogonal in sequence complementarity
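To illustrate what "similar Tm & base composition" means for a set of tags, the sketch below estimates Tm with the Wallace rule, Tm = 2(A+T) + 4(G+C) in °C, a rough approximation intended for short oligos. The slide does not say how Tm was calculated for the real tag set, so the rule, the tag sequences and the 2 °C tolerance are all assumptions. Sequence orthogonality is not modelled here.

```python
def wallace_tm(seq):
    """Rough melting temperature in °C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def tags_are_matched(tags, max_tm_spread=2):
    """True if all tags fall within max_tm_spread °C of each other."""
    tms = [wallace_tm(tag) for tag in tags]
    return max(tms) - min(tms) <= max_tm_spread


# Hypothetical 20-nt tags, not taken from any real tag array.
tag_1 = "ACGTACGTACGTACGTACGT"
tag_2 = "GATCGATCGATCGATCGATC"
tag_gc_rich = "GGGGCCCCGGGGCCCCGGGG"

print(wallace_tm(tag_1), gc_fraction(tag_1))
print(tags_are_matched([tag_1, tag_2]))
print(tags_are_matched([tag_1, tag_2, tag_gc_rich]))
```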
Bioinformatics of Microarrays
8 Array design: choice of sequences to be used as probes
8 Analysis of scanned images
8 Spot detection, normalization, quantitation
8 Primary analysis of hybridization data
8 Basic statistics, reproducibility, data scattering, etc.
8 Comparison of multiple samples
8 Clustering, SOMs, k-means classification …
8 SOMs = Self-Organizing Maps (a subtype of artificial neural network; low-dimensional views of high-dimensional data)
8 Unsupervised learning
8 Sample tracking and databasing of results
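As an illustration of the clustering step, the sketch below runs a plain k-means on a handful of expression profiles using only the standard library. It is a teaching sketch under stated assumptions: the log-ratio profiles are hypothetical, the initial centroids are chosen by hand so the result is deterministic, and real analyses would normally use an established statistics package.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_profile(profiles):
    """Column-wise mean of a list of equal-length profiles."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def k_means(profiles, initial_centroids, max_iter=100):
    """Assign each profile to its nearest centroid until labels stop changing."""
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # converged: assignments are stable
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = mean_profile(members)
    return labels, centroids


# Hypothetical log2 ratios for four genes across three samples.
profiles = [
    [2.0, 2.1, 1.9],
    [1.8, 2.2, 2.0],
    [-2.0, -1.9, -2.1],
    [-2.2, -1.8, -2.0],
]
labels, centroids = k_means(profiles, [profiles[0], profiles[2]])
print(labels)
```

Genes with the same label share a similar expression pattern across the samples; here the induced genes and the repressed genes fall into separate clusters.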
Microarray Data Pipeline
Microarray Data on the Web
8 Many groups have made their raw data available, but in
many formats
8 Some groups have created searchable databases
8 There are several initiatives to create “unified”
databases
8 EBI: ArrayExpress
8 NCBI: Gene Expression Omnibus
8 Companies are beginning to sell microarray expression
data (e.g. Incyte)
Other Web Links
8 Leming Shi's Gene-Chips.com page
8 Very rich source of basic information and
commercial and academic links
8 DNA chips for dummies animation
8 The Big Leagues: Pat Brown and NHGRI
microarray projects
http://www.coactivepr.com/assets/pdf/writing_samples/sequenom/Genotyping%20Brochure_v8.pdf
2004
Protons
e-
matrix-assisted laser desorption/ionization
RNase-A: U and C
RNase-T1: G-specific, digestion of the dC-transcript of the opposite strand
Single Nucleotide Polymorphisms
RNase-A: U and C
8 A sequence change can have multiple effects on the mass spectra
8 It can result in a mass shift, introduction of a cleavage site or removal of a cleavage site
8 The forward reactions indicate the presence of a SNP through mass shift
8 The reverse reactions pinpoint the location of the SNP in the amplicon
reference sequence
Only One Final Word of Wisdom...
8 "...although the computer is a wonderful helpmate for the sequence searcher and comparer, biochemists and molecular biologists must guard against the blind acceptance of any algorithmic output; given the choice, think like a biologist and not a statistician"
8 Russell F. Doolittle, 1990
Suppressive Subtractive Hybridization
cDNA libraries
Tester cDNA with Adaptor 1
Tester cDNA with Adaptor 2
Driver cDNA (in excess)
First hybridization: all components denatured, to remove the most common sequences
Products: a, b, c, d
Second hybridization: mix, add freshly denatured driver; anneal
Products: a, b, c, d + e
Fill in the ends
Add primers; PCR amplify
a: no amplification
b: no amplification
c: linear amplification
d: no amplification
e: exponential amplification
(Diatchenko et al., 1996. Proc. Natl. Acad. Sci. USA. 93:6025)
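The PCR outcomes for molecule types a to e can be summarised as a small decision rule based on which adaptors sit at the two ends of each molecule. The slide gives only the outcomes; the molecular identity assigned to each letter below (single-stranded tester, same-adaptor tester hybrid, tester/driver hybrid, driver, two-adaptor tester hybrid) is an assumption based on the usual description of the Diatchenko et al. scheme, and the function name is illustrative.

```python
def ssh_pcr_outcome(end_1, end_2, single_stranded=False):
    """Return (molecule type, PCR outcome) for one hybridization product.

    end_1 and end_2 are the adaptors at the two ends: "1", "2" or None.
    """
    if single_stranded:
        # a: unhybridized tester strand, no complete template for both primers
        return "a", "no amplification"
    ends = {end_1, end_2}
    if ends == {None}:
        # d: driver molecules carry no adaptor at all
        return "d", "no amplification"
    if None in ends:
        # c: tester/driver hybrid, primer site at one end only
        return "c", "linear amplification"
    if end_1 == end_2:
        # b: same adaptor at both ends, amplification is suppressed
        return "b", "no amplification"
    # e: adaptor 1 at one end and adaptor 2 at the other
    return "e", "exponential amplification"


for ends in [("1", None), ("1", "1"), ("1", "2"), (None, None)]:
    print(ends, ssh_pcr_outcome(*ends))
```

Only type e, which forms when differentially expressed tester strands from the two adaptor pools anneal to each other, is amplified exponentially, which is what enriches the subtracted library.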
Efficacy of SSH
Ji et al. 2002 BMC Genomics 3:12
8 Diatchenko et al. 1996 (PNAS 93:6025)
8 Could detect as little as 0.001% target
8 Critical factor is relative concentration of target in tester and driver populations
8 Effective enrichment when
8 Target present at >= 0.01%
8 Concentration ratio >= 5-fold
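The two enrichment criteria can be written as a simple check. This is only a restatement of the thresholds listed above as code; the function name, parameter names and example values are hypothetical.

```python
def enrichment_expected(target_percent, tester_to_driver_ratio,
                        min_percent=0.01, min_ratio=5.0):
    """True if both thresholds for effective SSH enrichment are met.

    target_percent: abundance of the target in the tester, in percent.
    tester_to_driver_ratio: fold difference in target concentration
    between the tester and driver populations.
    """
    return target_percent >= min_percent and tester_to_driver_ratio >= min_ratio


# Hypothetical transcripts: (abundance in %, tester/driver fold difference).
examples = {
    "abundant, 10-fold": (0.1, 10.0),
    "abundant, 2-fold": (0.1, 2.0),
    "rare, 10-fold": (0.001, 10.0),
}
for name, (percent, ratio) in examples.items():
    print(name, enrichment_expected(percent, ratio))
```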
SSH Advantages & Drawbacks
8 Advantages
8 Normalization of transcript levels
8 Detects small (2-fold) differences in transcript levels
8 Identify previously uncharacterized genes (novel genes)
8 Generates subtracted libraries rapidly
8 Drawbacks
8 Isolating & sequencing transcripts is slow & laborious
8 Many clones may contain the same sequences
8 All transcripts must be verified by Northern or quantitative RT-PCR
Yeast Two-Hybrid System (1)
8 Protein-protein interaction
8 A yeast vector for expressing a DNA-binding domain
8 Flexible linker region without the associated activation domain, e.g., the deleted GAL4 containing amino acids 1-692
8 A cDNA sequence encoding a protein or protein domain of interest (= bait domain) is fused in frame to the flexible linker region so that the vector will express a hybrid protein composed of the DNA-binding domain, linker region, and bait domain
Molecular Cell Biology, Lodish 5th Ed.
Yeast Two-Hybrid System (2)
8 A cDNA library is cloned into multiple copies of a second yeast vector that encodes a strong activation domain & flexible linker, to produce a vector library expressing multiple hybrid proteins, each containing a different fish domain
8 The bait vector & library of fish vectors are then transfected into engineered yeast cells in which the only copy of a gene required for histidine synthesis (HIS) is under control of a UAS with binding sites for the DNA-binding domain of the hybrid bait protein
8 Transformed cells that express the bait hybrid & interacting fish hybrid will be able to activate transcription of the HIS gene
8 The flexibility in the spacing between the DNA-binding & activation domains of eukaryotic activators makes this system work
Yeast Two-Hybrid System (3)
8 A two-step selection process is used
8 The bait vector also expresses a wild-type TRP gene, and the hybrid vector expresses a wild-type LEU gene
8 Transfected cells are first grown in a medium that lacks tryptophan & leucine but contains histidine
8 Only cells that have taken up the bait vector & one of the
fish plasmids will survive in this medium
8 The cells that survive then are plated on a medium that
lacks histidine
Yeast Two-Hybrid System (4)
8 Those cells expressing a fish hybrid that does not bind to the bait hybrid cannot transcribe the HIS gene & consequently will not form a colony on medium lacking histidine
8 The few cells that express a bait-binding fish hybrid will grow & form colonies in the absence of histidine
8 Recovery of the fish vectors from these colonies yields cDNA encoding protein domains that interact with the bait domain
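The two-step selection can be summarised as two growth tests applied in sequence. The sketch below models only the logic described on these slides: growth without tryptophan & leucine requires both vectors, and growth without histidine additionally requires a fish hybrid that binds the bait. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class YeastCell:
    has_bait_vector: bool   # bait vector carries the wild-type TRP gene
    has_fish_vector: bool   # second (fish) vector carries the wild-type LEU gene
    fish_binds_bait: bool   # whether the fish hybrid interacts with the bait


def grows_without_trp_leu(cell):
    """Step 1: medium lacking tryptophan & leucine, histidine present."""
    return cell.has_bait_vector and cell.has_fish_vector


def grows_without_his(cell):
    """Step 2: medium lacking histidine; HIS is transcribed only on binding."""
    return grows_without_trp_leu(cell) and cell.fish_binds_bait


def two_step_selection(cells):
    """Return the cells that survive both selection steps, in order."""
    survivors = [cell for cell in cells if grows_without_trp_leu(cell)]
    return [cell for cell in survivors if grows_without_his(cell)]


cells = [
    YeastCell(True, True, True),    # both vectors, interacting fish
    YeastCell(True, True, False),   # both vectors, non-interacting fish
    YeastCell(True, False, True),   # bait vector only
]
positives = two_step_selection(cells)
print(len(positives))
```

Only the first cell forms a colony after both steps, and its fish vector would be recovered to identify the interacting cDNA.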
Coffee Break
8 What do boxers and astronomers have in
common?
8 They both see stars!!!