ESR1_HUMAN: D538G
http://www.pantherdb.org/tools/csnpScoreForm.jsp?
EVOLUTIONARY ANALYSIS OF CODING SNPS
subPSEC (substitution position-specific evolutionary conservation) estimates the likelihood of a functional effect. Values range from 0 (neutral) to about -10 (most likely to be deleterious); -3 is the previously identified cutoff point for functional significance.
subPSEC score for D538G: -3.968431
Dr. Metsada Pasmanik-Chor,
Bioinformatics Unit, Tel Aviv University
Pdeleterious (anything above 0.5 is considered deleterious) for the substitution D538G: 0.72481
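The link between the two PANTHER scores can be sketched in a few lines. This is a minimal sketch that assumes the logistic mapping Pdeleterious = 1 - 1/(1 + exp(-subPSEC - 3)), which places the -3 cutoff at a probability of exactly 0.5; the function name is illustrative and not part of any PANTHER tool. Under that assumption, a subPSEC of -3.968431 gives approximately 0.7248, in line with the Pdeleterious value reported for D538G.

```python
import math


def p_deleterious(sub_psec: float) -> float:
    """Probability that a substitution is deleterious, given its subPSEC score.

    Assumed mapping (not stated on the slide): a logistic curve centred on the
    -3 cutoff, so that subPSEC = -3 corresponds to a probability of 0.5.
    """
    return 1.0 - 1.0 / (1.0 + math.exp(-sub_psec - 3.0))


# The cutoff itself, then the score reported for ESR1 D538G.
print(round(p_deleterious(-3.0), 5))
print(round(p_deleterious(-3.968431), 5))
```

More negative subPSEC scores push the probability towards 1, and scores near 0 push it towards 0.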
http://mutationassessor.org/
SNPs in miRNA Binding Sites
• 11 possible candidate SNPs were selected for their potential relevance to breast cancer.
• rs2747648, which resides in a predicted binding site for 3 miRNAs in the estrogen receptor α (ESR1) gene, was associated with a 27% reduction in breast cancer risk in premenopausal women.
• When the C allele is present, miR-453 binds with greater affinity to ESR1, leading to decreased levels of ERα protein. Postmenopausal women already have reduced levels of endogenous estrogen, which may explain why this SNP is relevant only in premenopausal women.
• Would carriers of the ancestral T allele respond better to endocrine therapy, given that they naturally express higher levels of the receptor?
References:
Tchatchou, S. et al. A variant affecting a putative miRNA target site in estrogen receptor (ESR) 1 is associated with breast cancer risk in
premenopausal women. Carcinogenesis 30, 59–64 (2009).
Adams, B. D., Furneaux, H. & White, B. A. The micro-ribonucleic acid (miRNA) miR-206 targets the human estrogen receptor-α (ERα) and
represses ERα messenger RNA and protein expression in breast cancer cell lines. Mol. Endocrinol. 21, 1132–1147 (2007).
http://www.genemania.org/
Before you design your own primers: don't reinvent the wheel!
Essential Bioinformatics Resources for Designing PCR Primers for Various Applications:
http://www.humgen.nl/primer_design.html
Basic considerations before designing primers
1. Use NCBI Gene or the UCSC Genome Browser to find gene variants:
• Transcript variants
• Alternative isoforms
• Exon-intron boundaries
• Pseudogenes
2. Gene conservation considerations
3. SNPs: there are approximately 56 million SNPs in the human genome; 16 million are in gene introns and exons, and most are silent mutations. Are we aiming at these locations?
jPCR: http://primerdigital.com/tools/soft.html
Primer design and primer characteristics
Primer length determines the specificity and affects annealing to the template:
• Short primer => low specificity, non-specific amplification
• Long primer => decreased binding efficiency at normal annealing temperature (due to the high probability of forming secondary structures such as hairpins)
• Primer length: 18-24 bp, with complete sequence identity to the template
• G/C content: 40-60%
• Avoid mismatches at the 3' end
• The presence of G or C bases within the last five bases from the 3' end of primers (GC clamp) helps promote specific binding at the 3' end. Avoid 3 or more G or C at the 3' end, because this raises the probability of primer-dimers
• Avoid a 3' end T
• Always run a reference gene (GAPDH, actin, RPLP0 (large ribosomal protein)) alongside your query genes
• Optimal amplicon size: 100-1000 bp
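The rules of thumb above can be turned into a quick screen for candidate primers. The following is a minimal sketch, not a replacement for a design tool such as Primer3: the function names and the example sequences are hypothetical, and the thresholds are simply the ones listed on this slide.

```python
def gc_content(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    p = primer.upper()
    return 100.0 * sum(p.count(base) for base in "GC") / len(p)


def check_primer(primer: str) -> list:
    """Return the rules of thumb from the slide that the primer violates."""
    p = primer.upper()
    problems = []
    if not 18 <= len(p) <= 24:
        problems.append("length outside 18-24 bp")
    if not 40.0 <= gc_content(p) <= 60.0:
        problems.append("GC content outside 40-60%")
    # GC clamp: at least one G or C among the last five bases.
    if not any(base in "GC" for base in p[-5:]):
        problems.append("no GC clamp in the last five bases")
    # Three G/C in a row at the very 3' end is discouraged.
    if len(p) >= 3 and all(base in "GC" for base in p[-3:]):
        problems.append("three or more G/C at the 3' end")
    if p.endswith("T"):
        problems.append("3' end is T")
    return problems


# Hypothetical sequences, for illustration only.
print(check_primer("AGCTGACCTGAAGCTGATCA"))
print(check_primer("ATATATATGGGT"))
```

An empty list means the primer passes these simple checks; specificity, Tm and secondary structure still need to be assessed separately.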
http://www.sciencedirect.com/science/article/pii/S0888754311001066#
Primer design: Melting temperature (Tm)
• Tm is the temperature at which 50% of the DNA duplex dissociates to become single stranded
• Determined by primer length, base composition and concentration
• Affected by the salt concentration of the PCR reaction mix
• Optimal melting temperature: 52°C - 60°C
• Tm above 65°C may cause secondary annealing; a higher Tm (75°C - 80°C) is recommended for amplifying high GC content targets
• Primer pair Tm mismatch: a significant Tm mismatch between the two primers can lead to poor amplification (desirable Tm difference < 5°C between the primers of a pair)
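As a rough illustration of how base composition drives Tm, the sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C). This rule is an assumption brought in for the example and is not the method used by the tools in this lecture; it ignores salt and primer concentration, so treat its output as a first approximation only. The primer sequences are hypothetical.

```python
def tm_wallace(primer: str) -> int:
    """Approximate Tm in °C using the Wallace rule: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def tm_compatible(forward: str, reverse: str, max_diff: float = 5.0) -> bool:
    """True if the two primers differ in Tm by less than max_diff °C."""
    return abs(tm_wallace(forward) - tm_wallace(reverse)) < max_diff


forward = "AGCTGACCTGAAGCTGATCA"   # hypothetical forward primer
reverse = "ATATATATATATATATAT"     # hypothetical, AT-rich reverse primer
print(tm_wallace(forward), tm_wallace(reverse))
print(tm_compatible(forward, reverse))
```

Design tools typically use nearest-neighbour thermodynamics with salt correction, which gives different absolute values; the pairwise comparison against the 5°C limit works the same way whichever estimate is used.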
Primer design: Annealing temperature
Ta (annealing temperature) vs. Tm
• Ta is determined by the Tm of both the primers and the amplicon: optimal Ta = 0.3 x Tm(primer) + 0.7 x Tm(product) - 25
• General rule: Ta is 5°C lower than Tm
• Higher Ta enhances specific amplification but may lower yields
• Crucial in detecting polymorphisms
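The two ways of choosing Ta can be compared numerically. In this minimal sketch the constant subtracted in the weighted formula is a parameter that defaults to the value given on the slide (25); the commonly cited published form of this formula uses 14.9 instead, so check which variant your Tm estimates were calibrated for. The function names and the example Tm values are hypothetical.

```python
def optimal_ta(tm_primer: float, tm_product: float, offset: float = 25.0) -> float:
    """Weighted formula: Ta = 0.3 * Tm(primer) + 0.7 * Tm(product) - offset."""
    return 0.3 * tm_primer + 0.7 * tm_product - offset


def rule_of_thumb_ta(tm_primer: float) -> float:
    """General rule: anneal 5 °C below the primer Tm."""
    return tm_primer - 5.0


# Hypothetical values: primer Tm of 60 °C, amplicon Tm of 85 °C.
print(optimal_ta(60.0, 85.0))                 # slide's constant
print(optimal_ta(60.0, 85.0, offset=14.9))    # alternative constant
print(rule_of_thumb_ta(60.0))
```

Whichever estimate is used as a starting point, the annealing temperature is usually refined empirically, for example with a temperature gradient.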
Primer design: Specificity and cross homology
• Specificity: determined primarily by primer length and sequence
• Cross homology: may become a problem when the PCR template is DNA with highly repetitive sequences
• Avoid non-specific amplification: BLAST PCR primers against the NCBI non-redundant sequence database
Primer design: Avoid secondary structures
• Hairpins are formed via intra-molecular interactions and negatively affect primer-template binding, leading to poor or no amplification
• Self-dimer (homodimer): formed by inter-molecular interactions between two copies of the same primer
• Cross-dimer (heterodimer): formed by inter-molecular interactions between the sense and antisense primers
• Avoid template secondary structure
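A crude screen for self-dimers and cross-dimers is to ask whether the 3' end of one primer can base-pair with any stretch of the other. The sketch below is only an illustration of that idea: it looks for perfect complementarity of the last few bases and ignores thermodynamics, mismatches and loop geometry, all of which dedicated oligo analysis tools take into account. The function names, the window size of 4 and the sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def three_prime_dimer(primer_a: str, primer_b: str, n: int = 4) -> bool:
    """True if the last n bases of primer_a can pair perfectly with primer_b.

    Pass the same primer twice to screen for a self-dimer.
    """
    tail = primer_a.upper()[-n:]
    return reverse_complement(tail) in primer_b.upper()


primer = "ACGTTGCAAGGATCC"  # hypothetical primer with a palindromic 3' end
print(three_prime_dimer(primer, primer))                   # self-dimer screen
print(three_prime_dimer("AAAAAAAACCCC", "TTTTGGGGTTTT"))   # cross-dimer screen
print(three_prime_dimer("AAAAAAAACCCC", "AAAAAAAACCCC"))
```

A positive result only flags a primer for closer inspection; it does not by itself mean the primer will fail.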
Web Site: http://bioinfo.ut.ee/primer3-0.4.0/primer3/input.htm
Web Site: http://primer3plus.com/cgi-bin/dev/primer3plus.cgi
Design specific primers for each transcript; SNP primers:
Web Site: http://genepipe.ngc.sinica.edu.tw/primerz/beginDesign.do
SNPs, copy number variation and InDels:
http://www4a.biotec.or.th/rexprimer2/Genotyping
http://www4a.biotec.or.th/rexprimer2/OligoChecking
Primer Design Tools for Degenerate PCR: CODEHOP
Name: CODEHOP (COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCR primer design
Type: Web-based software
Key Functions: Design degenerate PCR primers based on multiple protein sequence alignments
Publication Info: Nucleic Acids Research 2003
Times Cited: 37
Pros: Widely cited with many successful applications; settings for genetic code and codon usage
Cons: Requires a local multiple alignment as input, which must be in Blocks Database format
Note: In OBRC
YiBu's Rating: 4 out of 5
Web Site: http://blocks.fhcrc.org/codehop.html
More Info: http://www.hsls.pitt.edu/guides/genetics/obrc/dna/pcr_oligos/URL1118954832/info
Cross hybridization and specificity of primers
http://www.ncbi.nlm.nih.gov/tools/primer-blast/
Resources for PCR Primer Specificity Analysis: NCBI BLAST
Primer specificity and Mapping: The UCSC In-Silico PCR
http://genome.csdb.cn/cgi-bin/hgPcr
PCR reaction setup calculators
http://primerdigital.com/tools/ReactionMixture.html
Public PCR Primers/Oligo Probes Repository: The NCBI Probe Database
ESR1 human
http://www.ncbi.nlm.nih.gov/probe
Resources for real time PCR: RTPrimerDB
Shows pre-calculated primers on all gene transcripts!
Web Site: http://www.rtprimerdb.org/
Web Site:
http://pga.mgh.harvard.edu/primerbank/index.html
More Info:
http://www.ncbi.nlm.nih.gov/sites/entrez?Db=pubmed&Cmd=ShowDetailView&TermToSearch=14654707
http://primerdepot.nci.nih.gov/
http://eu.idtdna.com/pages/scitools
http://eu.idtdna.com/calc/dilution/
Dilution Calculator
Takes an oligo stock solution of higher concentration and determines what volume of it to dilute to reach the final (desired) lower concentration.
Entering the volumes of the stock solution (Start Volume) and the diluted solution (End Volume) is not required, but recommended.
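The arithmetic behind such a calculator is the standard dilution relation C1 x V1 = C2 x V2. The following minimal sketch shows that relation only; it is not the IDT tool's implementation, and the function name and example numbers are hypothetical.

```python
def dilution(stock_conc: float, final_conc: float, final_volume: float):
    """Return (stock volume, diluent volume) needed for the dilution.

    Uses C1 * V1 = C2 * V2. Concentrations must share one unit (e.g. µM)
    and the returned volumes are in the unit of final_volume (e.g. µL).
    """
    if stock_conc <= 0 or final_conc <= 0 or final_volume <= 0:
        raise ValueError("concentrations and volume must be positive")
    if final_conc > stock_conc:
        raise ValueError("cannot dilute to a concentration above the stock")
    stock_volume = final_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume


# Hypothetical example: 100 µM stock diluted to 10 µM in a final 50 µL.
stock_ul, diluent_ul = dilution(100.0, 10.0, 50.0)
print(stock_ul, diluent_ul)
```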
Exome Analysis
Identify genetic causes of disease:
Sequence the coding regions of the human genome (1-2% of the genome, ~30 Mb) in patients and healthy individuals to find the genomic cause of a disease.
http://www.frontiersin.org/Journal/10.3389/fendo.2011.00008/full
http://gtbinf.wordpress.com/2012/11/29/exome-sequence-analysis-group-1/
Protein sequence => http://www.ebi.ac.uk/Tools/st/emboss_backtranseq/ => back-translated DNA sequence:
>A8KAF4_HUMAN A8KAF4 Estrogen receptor OS=Homo sapiens PE=2 SV=1
ATGACCATGACCCTGCACACCAAGGCCAGCGGCATGGCCCTGCTGCACCAGATCCAGGGC
AACGAGCTGGAGCCCCTGAACAGGCCCCAGCTGAAGATCCCCCTGGAGAGGCCCCTGGGC
GAGGTGTACCTGGACAGCAGCAAGCCCGCCGTGTACAACTACCCCGAGGGCGCCGCCTAC
GAGTTCAACGCCGCCGCCGCCGCCAACGCCCAGGTGTACGGCCAGACCGGCCTGCCCTAC
GGCCCCGGCAGCGAGGCCGCCGCCTTCGGCAGCAACGGCCTGGGCGGCTTCCCCCCCCTG
AACAGCGTGAGCCCCAGCCCCCTGATGCTGCTGCACCCCCCCCCCCAGCTGAGCCCCTTC
CTGCAGCCCCACGGCCAGCAGGTGCCCTACTACCTGGAGAACGAGCCCAGCGGCTACACC
GTGAGGGAGGCCGGCCCCCCCGCCTTCTACAGGCCCAACAGCGACAACAGGAGGCAGGGC
GGCAGGGAGAGGCTGGCCAGCACCAACGACAAGGGCAGCATGGCCATGGAGAGCGCCAAG
GAGACCAGGTACTGCGCCGTGTGCAACGACTACGCCAGCGGCTACCACTACGGCGTGTGG
AGCTGCGAGGGCTGCAAGGCCTTCTTCAAGAGGAGCATCCAGGGCCACAACGACTACATG
TGCCCCGCCACCAACCAGTGCACCATCGACAAGAACAGGAGGAAGAGCTGCCAGGCCTGC
AGGCTGAGGAAGTGCTACGAGGTGGGCATGATGAAGGGCATCAGGAAGGACAGGAGGGGC
GGCAGGATGCTGAAGCACAAGAGGCAGAGGGACGACGGCGAGGGCAGGGGCGAGGTGGGC
AGCGCCGGCGACATGAGGGCCGCCAACCTGTGGCCCAGCCCCCTGATGATCAAGAGGAGC
AAGAAGAACAGCCTGGCCCTGAGCCTGACCGCCGACCAGATGGTGAGCGCCCTGCTGGAC
GCCGAGCCCCCCATCCTGTACCCCGAGTACGACCCCACCAGGCCCTTCAGCGAGGCCAGC
ATGATGGGCCTGCTGACCAACCTGGCCGACAGGGAGCTGGTGCACATGATCAACTGGGCC
AAGAGGGTGCCCGGCTTCGTGGACCTGACCCTGCACGACCAGGTGCACCTGCTGGAGTGC
GCCTGGCTGGAGATCCTGATGATCGGCCTGGTGTGGAGGAGCATGGAGCACCCCGGCAAG
CTGCTGTTCGCCCCCAACCTGCTGCTGGACAGGAACCAGGGCAAGTGCGTGGAGGGCATG
GTGGAGATCTTCGACATGCTGCTGGCCACCAGCAGCAGGTTCAGGATGATGAACCTGCAG
GGCGAGGAGTTCGTGTGCCTGAAGAGCATCATCCTGCTGAACAGCGGCGTGTACACCTTC
CTGAGCAGCACCCTGAAGAGCCTGGAGGAGAAGGACCACATCCACAGGGTGCTGGACAAG
ATCACCGACACCCTGATCCACCTGATGGCCAAGGCCGGCCTGACCCTGCAGCAGCAGCAC
CAGAGGCTGGCCCAGCTGCTGCTGATCCTGAGCCACATCAGGCACATGAGCAACAAGGGC
ATGGAGCACCTGTACAGCATGAAGTGCAAGAACGTGGTGCCCCTGTACGACCTGCTGCTG
GAGATGCTGGACGCCCACAGGCTGCACGCCCCCACCAGCAGGGGCGGCGCCAGCGTGGAG
GAGACCGACCAGAGCCACCTGGCCACCGCCGGCAGCACCAGCAGCCACAGCCTGCAGAAG
TACTACATCACCGGCGAGGCCGAGGGCTTCCCCGCCACCGTG
6 frames translation
http://www.ebi.ac.uk/Tools/st/emboss_transeq/
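To make the idea of a six-frame translation concrete, here is a minimal sketch in plain Python. It is not the EMBOSS Transeq implementation: it uses the standard genetic code only, writes stop codons as "*" and unknown codons as "X", and the function names are hypothetical. The example input is the first 12 bases of the back-translated estrogen receptor sequence shown above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq: str) -> str:
    """Translate a DNA sequence in frame 1; '*' marks stops, 'X' unknown codons."""
    seq = seq.upper()
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frames(seq: str) -> list:
    """Translations of frames +1, +2, +3 followed by -1, -2, -3."""
    forward = seq.upper()
    reverse = forward.translate(COMPLEMENT)[::-1]
    return [translate(strand[offset:]) for strand in (forward, reverse) for offset in range(3)]


for frame in six_frames("ATGACCATGACC"):
    print(frame)
```

Only one of the six frames usually corresponds to the real protein; the others typically contain frequent stop codons.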
Resources for PCR Primer Mapping/Amplicon Size
Format Conversion tools:
• Reverse and/or Complement of DNA sequences (http://www.bioinformatics.org/sms2/rev_comp.html)
• Split FASTA: divides FASTA sequence records into smaller FASTA sequences of the size you specify (http://www.bioinformatics.org/sms2/split_fasta.html)
Sequence Analysis:
• DNA Pattern Find: accepts one or more sequences along with a search pattern and returns the number and positions of sites that match the pattern (http://www.bioinformatics.org/sms2/dna_pattern.html)
• PCR Primer Stats: accepts a list of PCR primer sequences and returns a report describing the properties of each primer, including melting temperature, percent GC content, and PCR suitability (http://www.bioinformatics.org/sms2/pcr_primer_stats.html)
• PCR Products: accepts one or more DNA sequence templates and two primer sequences. The program searches for perfectly matching primer annealing sites that can generate a PCR product. Any resulting products are sorted by size, and they are given a title specifying their length, their position in the original sequence, and the primers that produced them (http://www.bioinformatics.org/sms2/pcr_products.html)
• Reverse Translate (http://www.bioinformatics.org/sms2/rev_trans.html)
• Translate (http://www.bioinformatics.org/sms2/translate.html)
• Primer Map: accepts a DNA sequence and returns a textual map showing the annealing positions of PCR primers (http://www.bioinformatics.org/sms2/primer_map.html)
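The search described for PCR Products (perfectly matching annealing sites that can generate a product) can be sketched as a simple string search. This is an illustration under simplifying assumptions, not the Sequence Manipulation Suite code: it considers only the orientation in which the forward primer matches the given strand and the reverse primer matches the opposite strand downstream of it, and the template and primers are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def pcr_products(template: str, forward: str, reverse: str) -> list:
    """Return predicted products for perfect primer matches, shortest first."""
    t = template.upper()
    fwd = forward.upper()
    rev_site = reverse_complement(reverse)  # how the reverse primer appears on this strand
    products = []
    start = t.find(fwd)
    while start != -1:
        end = t.find(rev_site, start + len(fwd))
        while end != -1:
            products.append(t[start:end + len(rev_site)])
            end = t.find(rev_site, end + 1)
        start = t.find(fwd, start + 1)
    return sorted(products, key=len)


# Hypothetical template and primer pair.
template = "TTGACCATGCAAAAAAAAAAGGCTTAGCTT"
for product_seq in pcr_products(template, "GACCATGC", "GCTAAGCC"):
    print(len(product_seq), product_seq)
```

Genome-scale tools such as In-Silico PCR also tolerate mismatches and search both strands, which this sketch does not attempt.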
http://www.bioinformatics.org/sms2/index.html
Comparing gene-lists
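BioVenn reports, for gene lists x, y and z: the total size of each list, the list-specific counts (x only, y only, z only) and the pairwise overlaps (x-y, x-z, y-z). Counts shown in the example: 127, 62, 628, 566, 0, 0.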
http://www.cmbi.ru.nl/cdd/biovenn/
Venny
http://bioinfogp.cnb.csic.es/tools/venny/
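For two lists, the numbers these Venn tools report can be reproduced with set operations. The sketch below is a minimal illustration; the function name is hypothetical and the gene lists are made-up examples, not the lists behind the figures on this slide.

```python
def compare_gene_lists(x, y) -> dict:
    """Counts reported by a two-list Venn comparison (duplicates are ignored)."""
    x_set, y_set = set(x), set(y)
    return {
        "x total": len(x_set),
        "y total": len(y_set),
        "x only": len(x_set - y_set),
        "y only": len(y_set - x_set),
        "x-y total overlap": len(x_set & y_set),
    }


# Made-up example lists.
list_x = ["ESR1", "GAPDH", "ACTB"]
list_y = ["ESR1", "TP53"]
for label, count in compare_gene_lists(list_x, list_y).items():
    print(label, count)
```

Gene identifiers should be normalised (same naming scheme, same letter case) before comparison, otherwise identical genes are counted as different.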
Microarray and Next Generation Sequencing Technologies
Microarray Experiments
Probes for genes are located on the chip. Hybridization of mRNA to the probes on the chip is performed and the results are recorded.
Next Generation Sequencing
• Anchor single DNA molecules to a solid surface
• Amplify the template by in situ PCR
• Add 4 color-labeled reversible terminators, polymerase and universal primer
• Remove un-incorporated nucleotides
• Detect with laser
• Reverse the termination and repeat 1…100 times; the number of cycles determines the length of the sequence
Next generation sequencing bypasses the rate-limiting step of conventional DNA sequencing (separating randomly terminated DNA polymers by gel electrophoresis) by physically arraying DNA molecules on solid surfaces and determining the DNA sequence in situ, without the need for gel separation.
In both technologies, the great advantage is achieved by novel biotechnologies for producing high-throughput data! However, both have pros and cons…
Various platforms!
http://molonc.bccrc.ca/?page_id=191
Arrays
pros:
• relatively cheap
• mature biotechnology and analysis tools (since the late 90's)
• fixed probes, no heterogeneity of coverage
• highly reproducible
cons:
• detection of only known transcripts
• limited to sequenced organisms, no de-novo
• higher background
• low expressed genes are less accurately detected
NGS
pros:
• very sensitive if sufficient sequence depth
• direct read-out of all transcripts
• paired-end reads, better accuracy
• de-novo sequencing, new genomes
• highly reproducible
• new and exciting
cons:
• still expensive
• technical bias in mRNA library preparation and in transcripts of different length
• immature bioinformatics tools
• de-novo analysis is tricky, ambiguity in mapping reads to the genome
• very high coverage is needed for low expressed genes
• variable sequence coverage for different genomic regions
In both, consistent biological interpretation!
Consistent Biological Interpretation?
Marioni J C et al. Genome Res. 2008;18:1509-1517
http://cage.unl.edu/RNASEQ_Transcriptomics.pdf
Copyright © 2008, Cold Spring Harbor Laboratory Press
NGS is becoming the technology of choice for a wide range of applications, but the transition away from microarrays is still a long one. Different applications have different requirements, so researchers need to weigh their options carefully when choosing a platform.
http://www.genengnews.com/gen-articles/next-generation-sequencing-vs-microarrays/4689/
TAU Bioinformatics Unit: who are we and what do we do?
http://www.tau.ac.il/lifesci/bioinformatics.html
metsada@post.tau.ac.il
Tel: 03-6406992