MB 206 Microbial Biotechnology3

Project
MB206-JAN09
Samples :
Plant (A)
Objective:
Isolate 100 ESTs from Plant (A)
RNA Extraction → cDNA Library Construction → ESTs Generation
RNA Extraction
• Extract RNA from sample (A) – the method
depends on the sample. Check the previous notes.
• Check the quality and quantity of the RNA.
• Isolate mRNA from the RNA (using kits)
• Check the quality and quantity of the
mRNA.
Then?
Types of Libraries
Genomic Library
• whole genes w/ promoters & introns (Euk.), operons
(bacteria), DNA regulatory elements…
cDNA Library
• mRNA transcript only w/ 5’ & 3’ untranslated regions
(UTRs), no introns, tissue specific.
(5’UTR)
(3’UTR)
Angelia 09
Lambda Library
Lodish, et al. Fig 7-12
Plasmid Library
Lodish, et al. Fig 7-1
Plasmid !!!
mRNA isolation, purification
Check the RNA integrity
Synthesis of cDNA
Treatment of cDNA ends
Ligation to vector
Screening
Screening
The process of identifying one particular
clone containing the gene of interest from
among the very large number of others in the
gene library.
Plate the cDNA library on LB
agar plates
- Plating requires a host organism.
- For details, refer to the manual of any cDNA library construction kit.
Sequencing
DNA sequencing by the Sanger method
The standard DNA sequencing technique is the Sanger method,
named for its developer, Frederick Sanger, who shared the 1980
Nobel Prize in Chemistry. This method begins with the use of
special enzymes to synthesize fragments of DNA that terminate
when a selected base appears in the stretch of DNA being
sequenced. These fragments are then sorted according to size
by placing them in a slab of polymeric gel and applying an
electric field -- a technique called electrophoresis. Because of
DNA's negative charge, the fragments move across the gel toward
the positive electrode. The shorter the fragment, the faster it
moves. Typically, each of the terminating bases within the
collection of fragments is tagged with a radioactive probe for
identification.
DNA sequencing example
Problem Statement: Consider the following DNA
sequence (from firefly luciferase). Draw the sequencing
gel pattern that forms as a result of sequencing the
following template DNA with ddNTPs as the chain terminators.
atgaccatgattacg...
Solution:
Given DNA template: 5'-atgaccatgattacg...-3'
DNA synthesized:    3'-tactggtactaatgc...-5'
Gel pattern (bands for the 15 bases shown):
length      15 14 13 12 11 10  9  8  7  6  5  4  3  2  1
ddATP  |W|      |                 |        |  |
ddTTP  |W|   |        |        |        |        |
ddCTP  |W|         |                 |                 |
ddGTP  |W|               |  |                       |
Electric field: the positive electrode (+) is at the right.
Fragment size decreases from left to right.
"W" indicates the well position, and "|" denotes a DNA band
on the sequencing gel.
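The band positions for this example can be worked out programmatically. The sketch below is only an illustration: it assumes the 15 bases shown are the whole template and ignores the primer, and the function names are made up for this example.

```python
# Complement rules for DNA bases (lowercase, as in the problem statement).
COMPLEMENT = {"a": "t", "t": "a", "c": "g", "g": "c"}

def synthesized_strand(template):
    """Return the newly synthesized strand written 5'->3' (reverse complement)."""
    return "".join(COMPLEMENT[base] for base in reversed(template.lower()))

def gel_lanes(template):
    """Map each ddNTP lane to the fragment lengths (band positions) it holds."""
    lanes = {"A": [], "T": [], "C": [], "G": []}
    for length, base in enumerate(synthesized_strand(template), start=1):
        # A fragment of this length ends in `base`, so it runs in that lane.
        lanes[base.upper()].append(length)
    return lanes

lanes = gel_lanes("atgaccatgattacg")
for base in "ATCG":
    print(f"lane dd{base}TP: fragment lengths {lanes[base]}")
```

Each list gives the fragment lengths that appear as bands in that lane; the shortest fragments run farthest toward the positive electrode.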
A sequencing gel
This picture is an autoradiograph. The darkness of each band is
proportional to the radioactivity from 32P-labeled adenosine
in the synthesized DNA sample.
Reading a sequencing gel
You begin at the right, where the smallest DNA fragments are.
The sequence that you read will be in the 5'-3' direction.
This sequence will be the same as the RNA that would be
generated to encode a protein, except that the T bases in
DNA will be replaced by U residues. As an example,
in the problem given, the smallest DNA fragment on the sequencing
gel is in the C lane, so the first base is a C. The next largest band
is in the G lane, so the DNA fragment of length 2 ends in G.
Therefore the sequence of the first two bases is CG.
The sequence of the first 30 or so bases of the DNA is:
CGTAATCATGGTCATATGAAGCTGGGCCGGGCCGTGC....
When this is made as RNA, its sequence would be:
CGUAAUCAUGGUCAUAUGAAGCUGGGCCGGGCCGUGC....
Note that the information content is the same; only the T's have
been replaced by U's.
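Reading the gel from the smallest fragment upward can be sketched in a few lines. The band positions below are an assumption derived from the first 15 bases of the sequence given above, and the helper names are illustrative only.

```python
def read_gel(lanes):
    """Read the sequence 5'->3' by taking bands from smallest to largest."""
    # Invert the lane table: fragment length -> base that terminates it.
    base_at_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_at_length[n] for n in sorted(base_at_length))

def dna_to_rna(dna):
    """Write the same sequence as RNA: every T becomes U."""
    return dna.upper().replace("T", "U")

# Band positions (fragment lengths) per lane for the first 15 bases.
lanes = {
    "A": [4, 5, 8, 14],
    "T": [3, 6, 9, 12, 15],
    "C": [1, 7, 13],
    "G": [2, 10, 11],
}
dna = read_gel(lanes)
print(dna)
print(dna_to_rna(dna))
```

The smallest fragment (length 1) is in the C lane and the next (length 2) is in the G lane, so the read starts with CG, as in the text.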
The codon table
5’-Base     Middle Base                        3’-Base
            U(=T)    C        A        G
U(=T)       Phe      Ser      Tyr      Cys     U(=T)
            Phe      Ser      Tyr      Cys     C
            Leu      Ser      Term     Term    A
            Leu      Ser      Term     Trp     G
C           Leu      Pro      His      Arg     U(=T)
            Leu      Pro      His      Arg     C
            Leu      Pro      Gln      Arg     A
            Leu      Pro      Gln      Arg     G
A           Ile      Thr      Asn      Ser     U(=T)
            Ile      Thr      Asn      Ser     C
            Ile      Thr      Lys      Arg     A
            Met      Thr      Lys      Arg     G
G           Val      Ala      Asp      Gly     U(=T)
            Val      Ala      Asp      Gly     C
            Val      Ala      Glu      Gly     A
            Val      Ala      Glu      Gly     G
Translating the DNA sequence
The order of amino acids in any protein is specified by the
order of nucleotide bases in the DNA.
Each amino acid is coded by a particular sequence of three bases.
To convert a DNA sequence into an amino acid sequence:
First, find the starting codon. The starting codon is always
the codon for the amino acid methionine. This codon is
AUG in the RNA (or ATG in the DNA):
GCGCGGGUCCGGGCAUGAAGCUGGGCCGGGCCGUGC....
              Met
In this particular example the next codon is AAG. The first base
(5'end) is A, so that selects the 3rd major row of the table. The
second base (middle base) is A, so that selects the 3rd column of
the table. The last base of the codon is G, selecting the last line in
the block of four.
This entry AAG in the table is Lysine (Lys).
Therefore the second amino acid is Lysine.
The first few residues, and their DNA sequence, are as follows
(color coded to indicate the correct location in the
codon table):
Met Lys Leu Gly Arg … ...
AUG AAG CUG GGC CGG GCC GUG C..
This procedure is exactly what cells do when they synthesize
proteins based on the mRNA sequence. The process of translation
in cells occurs in a large complex called the ribosome.
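The lookup procedure described above can be sketched as a small program. This is a minimal illustration, not a full translation tool: it uses the standard genetic code with one-letter amino acid codes (M = Met, K = Lys, L = Leu, G = Gly, R = Arg, A = Ala, V = Val, * = Term), starts at the first AUG it finds, and the names are chosen for this example.

```python
BASES = "UCAG"  # same order as the rows and columns of the codon table
# Amino acids ordered by 5' base, then middle base, then 3' base.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    first + middle + last: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, middle in enumerate(BASES)
    for k, last in enumerate(BASES)
}

def translate(sequence):
    """Translate from the first AUG until a stop codon or the end of the sequence."""
    rna = sequence.upper().replace("T", "U")
    start = rna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    for i in range(start, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":  # termination codon
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("GCGCGGGUCCGGGCAUGAAGCUGGGCCGGGCCGUGC"))
```

For the example sequence the codons read are AUG AAG CUG GGC CGG GCC GUG; the trailing C is an incomplete codon and is ignored.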
Automated procedure for DNA
sequencing
A computer read-out of the gel generates a “false color” image
where each color corresponds to a base. Then the intensities are
translated into peaks that represent the sequence.
High-throughput sequencing:
Capillary electrophoresis
The Human Genome Project has spurred an effort to develop
faster, higher-throughput, and less expensive technologies
for DNA sequencing. Capillary electrophoresis (CE) separation
has many advantages over slab gel separations. CE separations
are faster and are capable of producing greater resolution.
CE instruments can use tens and even hundreds of capillaries
simultaneously. The figure shows a simple CE setup where the
fluorescently-labeled DNA is detected as it exits the capillary.
[Figure: CE setup with laser, focusing lens, sheath flow cuvette,
sheath flow, beam block, collection lens, filter, and PMT]
DNA sequencing.
• Dideoxy analogs of normal nucleotide triphosphates
(ddNTP) cause premature termination of a growing chain of
nucleotides.
ACAGTCGATTG
ACAddG
ACAGTCddG
ACAGTCGATTddG
• Fragments are separated according to their sizes in gel
electrophoresis. The lengths show the positions of “G” in the
original DNA sequence.
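The termination products listed above can be generated with a short sketch. The function names are illustrative, and the model is idealized: it assumes termination can occur at every position of the chosen base.

```python
def terminated_fragments(strand, dd_base):
    """List every chain that ends with the dideoxy analog of dd_base."""
    return [
        strand[:i] + "dd" + dd_base
        for i, base in enumerate(strand)
        if base == dd_base
    ]

def fragment_lengths(strand, dd_base):
    """Fragment lengths, which equal the positions of dd_base in the strand."""
    return [i + 1 for i, base in enumerate(strand) if base == dd_base]

print(terminated_fragments("ACAGTCGATTG", "G"))
print(fragment_lengths("ACAGTCGATTG", "G"))
```

Running the same function with A, C, and T gives the contents of the other three lanes.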
Nucleotides and phosphodiester bond.
[Figure: nucleotides linked by a phosphodiester bond]
Genomic sequencing.
• Individual chromosomes are broken into
random 100 kb fragments.
• This library of fragments is screened to find
overlapping fragments – contigs.
• Unique overlapping clones are chosen for
sequencing.
• Put together overlapping sequenced clones
using computer programs.
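The last step, joining overlapping sequences by computer, can be illustrated with a toy greedy merge. This is a simplified sketch under strong assumptions (error-free reads, same strand, no repeats); real assembly programs are far more elaborate. The reads and function names are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # no remaining overlaps: the contigs stay separate
        merged = reads[best_i] + reads[best_j][best_n:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

# Hypothetical overlapping reads, for illustration only.
print(greedy_assemble(["ACAGTCGA", "TCGATTGC", "TTGCCATG"]))
```

Reads that share no overlap of at least min_len bases are left as separate contigs.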
Sequencing cDNA libraries.
• mRNA is pooled from the tissues which express genes.
• cDNA libraries are prepared by copying mRNA with
reverse transcriptase.
• Expressed Sequence Tags (ESTs) – partial sequences of
expressed genes.
• Comparing translated ESTs to annotated proteins –
annotation of genes.
Gene prediction.
Gene – DNA sequence encoding protein, rRNA,
tRNA …
Gene concept is complicated:
- Introns/exons
- Alternative splicing
- Genes-in-genes
- Multisubunit proteins
Gene structure.
[Figure: promoter sequences (-35 and -10) upstream of the gene;
the gene runs from ATG to TER]
ATG – start codon; TER (TAA, TAG,TGA) – termination codons
Codon usage tables.
- Each amino acid can be encoded by several codons.
- Each organism has a characteristic pattern of codon usage.
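A codon usage table is built by counting codons in known coding sequences. The sketch below shows the idea on a single short sequence; the sequence is hypothetical, the function name is illustrative, and a real table would pool many genes from one organism.

```python
from collections import Counter

def codon_usage(coding_sequence):
    """Return the fraction of each codon in an in-frame coding sequence."""
    seq = coding_sequence.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # drop any incomplete trailing codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: count / total for codon, count in counts.items()}

# Hypothetical coding sequence: Met, Leu, Leu, Leu, stop.
usage = codon_usage("ATGCTGCTGCTCTAA")
for codon, fraction in sorted(usage.items()):
    print(codon, round(fraction, 2))
```

Here leucine is encoded twice by CUG and once by CUC, which is the kind of preference a codon usage table records.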
Problems arising in gene
prediction.
• Distinguishing pseudogenes (former genes that no longer
function) from genes.
• Exon/intron structure in eukaryotes: exon
flanking regions are not very well conserved.
• Exons can be joined in alternative combinations
(alternative splicing).
• Genes can overlap each other and occur on
different strands of DNA.
Gene identification
• Homology-based gene prediction
  - Similarity searches (e.g. BLAST, BLAT)
  - ESTs
• Ab initio gene prediction
  - Prokaryotes
    - ORF identification
  - Eukaryotes
    - Promoter prediction
    - PolyA-signal prediction
    - Splice site, start/stop-codon predictions
Prokaryotic genes – searching
for ORFs.
- Small genomes have high gene density
  Haemophilus influenzae – 85% genic
- No introns
- Operons
  One transcript, many genes
- Open reading frame (ORF) – a contiguous set of codons that
  starts with a Met codon and ends with a stop codon.
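An ORF search following this definition can be sketched as below. It is a minimal illustration that scans only the forward strand in its three reading frames; a real gene finder would also scan the reverse complement and apply a realistic minimum length. The test sequence and names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for each ORF on the forward strand."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first Met codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return orfs

# Hypothetical sequence with one ORF in the third reading frame.
print(find_orfs("CCATGAAGCTGTAAGG"))
```

Positions are 0-based, with the end index exclusive, so the reported ORF includes its stop codon.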
What is Sequencing
A lab technique used to find out the
sequence of nucleotide bases in a DNA
molecule or fragment.
It deciphers the exact order of bases
in a nucleotide sequence.
Examples are dideoxy sequencing and
Maxam-Gilbert sequencing.
Methods of Sequencing
There are two methods of sequencing:
the Maxam and Gilbert method (manual or
chemical sequencing)
and
the Sanger method, using dideoxynucleotides
(modern sequencing).
Because the Sanger method is more efficient and
uses fewer toxic chemicals and lower
amounts of radioactivity than the
method of Maxam and Gilbert, it
rapidly became the method of choice.