CHEM 114A | Key Concepts Lectures 6-7
Key Concepts from Lecture 6
I. Restriction Enzymes
a. What are they? What do they recognize? What can a cleavage pattern tell you?
- Restriction enzymes recognize a specific sequence in double-stranded DNA and
cleave both strands.
- Cleavage sites possess twofold rotational symmetry.
- Recognized sequence is palindromic.
- Hundreds of restriction enzymes.
- Name = three-letter abbreviation of host organism, followed by strain and Roman
numeral (e.g., EcoRI from Escherichia coli).
- These enzymes cleave large DNA molecules into smaller sizes that are easier to
analyze.
- The cleavage pattern, the pattern of fragments produced by a restriction enzyme,
can serve as a “fingerprint” of that specific DNA molecule
b. Palindromic
- Recognized sequence is palindromic: read 5’ to 3’, the sequence is the same on
both strands (e.g., GAATTC for EcoRI).
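A minimal Python sketch of the two ideas above (palindromic sites and cleavage patterns), assuming a linear DNA written as a 5’ to 3’ string. The function names and the short test sequence are made up for illustration; the EcoRI site GAATTC and its cut position (G^AATTC) are the only real values used.

```python
# Complement of each DNA base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindromic(site):
    """A site is palindromic if both strands read the same 5' to 3'."""
    return site == reverse_complement(site)


def fragment_lengths(dna, site, cut_offset):
    """Lengths of the fragments made by cutting a linear DNA at every site.

    cut_offset is the number of bases of the site that lie left of the cut
    (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = dna.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = dna.find(site, start + 1)
    edges = [0] + cuts + [len(dna)]
    return [right - left for left, right in zip(edges, edges[1:])]


# Hypothetical 20-base sequence containing two EcoRI sites.
print(is_palindromic("GAATTC"))
print(fragment_lengths("TTGAATTCAAAAGAATTCCC", "GAATTC", 1))
```

The list of fragment lengths is the "fingerprint": a different DNA molecule would usually give a different set of lengths with the same enzyme.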
II. Blotting
a. Northern Blotting = Identifying RNA
i. Radiolabeled complementary DNA sequence (single strand) binds to the
target RNA. Exact same process as Southern blotting, but detecting
an RNA sequence instead of a DNA sequence.
b. Southern Blotting = Identifying DNA
i. Radioactively labeled complementary DNA sequence (single
strand) binds to the DNA on the nitrocellulose sheet if the target
DNA sequence is present, and the band appears on the autoradiogram.
- Used to detect specific DNA molecules amongst many other DNA
molecules.
c. Western Blotting = Identifying Protein
i. Radiolabeled antibody binds to a protein of interest that has been
separated on an SDS-PAGE gel.
- Allows the detection and identification of specific proteins separated by gel
electrophoresis. SDS-polyacrylamide gel → Transfer proteins to polymer
sheet → Add radiolabeled specific antibody → Overlay film and develop.
III. DNA Sequencing
a. The importance of Sanger sequencing.
i. What is the role of the 2’,3’-dideoxy analog?
- The dideoxy analog lacks the 3’-hydroxyl group needed to form a phosphodiester
bond with the next nucleotide, so its incorporation terminates replication in a
controlled way.
ii. How does Sanger sequencing work?
- Invented by Frederick Sanger.
- Based on the generation of DNA fragments whose length is determined by the
identity of the last base in the sequence.
- Such a collection of fragments can be obtained through the controlled
termination of replication.
- This is done using dideoxy analogs of each dNTP (ddNTPs).
- Four separate reaction mixtures are used, each containing a small amount of only
one dideoxy analog. Each mixture also has all four radioactively
labeled dNTPs. DNA polymerase extends a primer to make the complement of the
denatured single strand. This leads to DNA strands of varying length depending
on where the dideoxy analog was incorporated.
- Each reaction is run on a denaturing polyacrylamide gel.
iii. *Reading a Sanger sequencing gel to determine the original DNA
sequence.
- Sequence is read from the pattern of chain termination. Read from bottom to top
(short strands to long strands).
- Fluorescence is more commonly used nowadays than radioactive labelling.
- Modern DNA sequencing instruments can sequence more than 10^6 bases per day.
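The gel-reading rule above (bottom to top, short to long) can be sketched in Python. This is a simplified model, not a simulation of real gel data: it assumes every possible termination product is present, and the function names are hypothetical. Note that the sequence read from the gel is that of the newly synthesized strand, which is complementary to the template.

```python
def sanger_fragments(new_strand):
    """Simulate the four dideoxy reactions.

    new_strand is the newly synthesized strand, written 5' to 3'.
    Returns {base: [lengths of fragments ending in that dideoxy base]}.
    """
    lanes = {"A": [], "C": [], "G": [], "T": []}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read bands from shortest (bottom of gel) to longest (top of gel)."""
    bands = [(length, base)
             for base, lengths in lanes.items()
             for length in lengths]
    return "".join(base for _, base in sorted(bands))


lanes = sanger_fragments("GATTACA")
print(lanes)
print(read_gel(lanes))
```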
IV. DNA Synthesis
a. How does DNA synthesis work? Is there a size limitation?
- DNA strands can be chemically synthesized by the sequential addition of
activated monomers.
- Synthesis occurs in the 3’ to 5’ direction.
- Size limitation of about 100 bases.
- Allows the generation of short DNAs that can be used as primers to amplify genes
using PCR.
- Can be used to make DNA probes for the aforementioned blotting techniques.
- Recent development: multiple DNAs corresponding to a much
larger sequence are synthesized and joined to form new tailor-made genes.
V. Polymerase Chain Reaction (PCR)
a. Why is PCR so important?
- Developed by Kary Mullis in 1984 – Nobel Prize in 1993
- Method of greatly amplifying small quantities of specific DNA.
- Millions of copies can be made from a starting DNA molecule.
- Flanking sequences of the DNA target must be known.
- Short DNAs that are complementary to these flanking
sequences must be synthesized: these are called primers.
b. Know the components that are needed for PCR:
i. Pair of primers (Know the purpose of a primer!)
- Primers hybridize to the flanking sequences, to which they are complementary,
and give DNA polymerase a free 3’-OH end to extend.
ii. All four dNTPs: dATP, dCTP, dGTP, and dTTP
iii. Heat-stable DNA polymerase (why would it need to be heat stable?)
- It must survive the 95 °C strand-separation step of every cycle.
- From a thermophilic bacterium, such as Taq polymerase from Thermus aquaticus.
iv. Thermal cycler
- Machine that cycles between different temperatures.
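To make the purpose of the primer pair concrete, here is a rough Python sketch that derives both primers from a known target. It assumes the target is given as its top strand, 5’ to 3’, and that the flanking sequences are simply its first and last few bases. The function name, the primer length and the example sequence are hypothetical, and real primer design also considers melting temperature and specificity.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def design_primers(target, length):
    """Return (forward, reverse) primers, each written 5' to 3'.

    The forward primer copies the first `length` bases of the top strand,
    so it anneals to the 3' end of the bottom strand. The reverse primer is
    the reverse complement of the last `length` bases, so it anneals to the
    3' end of the top strand.
    """
    if 2 * length > len(target):
        raise ValueError("target is too short for two primers of this length")
    forward = target[:length]
    reverse = reverse_complement(target[-length:])
    return forward, reverse


print(design_primers("ATGCCGTTAGGCTAA", 4))
```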
c. Know the three steps of PCR
- 1) Strand Separation: the two strands of the target DNA molecule are separated
by heating at 95 °C.
- 2) Hybridization of Primers: the solution is quickly cooled to 54 °C to allow the
primers to hybridize to the flanking sequences at the 3’ end of each target strand.
- 3) DNA Synthesis: the solution is heated to 72 °C, which is the optimal
temperature for DNA synthesis by Taq DNA polymerase. Taq is a heat-stable
polymerase from the thermophilic bacterium Thermus aquaticus.
- This cycle is repeated about 20-35 times.
- Ideally, after n cycles, the sequence should be amplified 2^n-fold: about a
millionfold after 20 cycles and a billionfold after 30 cycles.
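The 2^n rule is easy to check numerically. The short Python sketch below assumes ideal doubling in every cycle, which real reactions only approximate; the function name is made up for illustration.

```python
def pcr_fold_amplification(cycles):
    """Ideal fold amplification after a number of PCR cycles (2^n)."""
    return 2 ** cycles


for n in (20, 30):
    print(n, "cycles:", pcr_fold_amplification(n), "fold")
```

Twenty cycles give 1,048,576 copies per starting molecule (about a million) and thirty cycles give 1,073,741,824 (about a billion).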
PCR Real-Life Uses:
- Medical diagnostics: very small amounts of bacterial and viral DNA can be
detected using PCR; HIV can be detected at an early stage.
- Forensics: DNA from a crime scene can be greatly amplified for further
identification.
- Molecular Archaeology: DNA from extinct organisms can be amplified for
evolutionary studies. Ex) Neanderthal genome.
VI. DNA Ligase
a. What is the purpose of DNA Ligase? In what situation would you use DNA
Ligase?
- DNA ligase catalyzes the ligation (joining) of two DNA duplexes having
compatible overhangs, such as those produced by restriction enzymes.
- DNA ligase requires a free 3’-hydroxyl group and a 5’-phosphoryl group.
- Both DNAs must be double helical.
- An energy source such as ATP is required for the joining of DNAs.
b. Overhangs
- Restriction enzymes can produce either 5’ or 3’ overhangs. Ligase joins two ends
whose overhangs are complementary to each other.
c. Both DNAs must be double helical
d. ATP dependent
VII. Cloning
- DNA ligase can be used to insert novel DNA sequences into a DNA vector.
- A DNA vector is inserted into the host, where it replicates autonomously.
- Two commonly used vectors are plasmids and phages.
- Vectors can be prepared for cloning by cutting with a suitable restriction enzyme
followed by ligation with target DNA.
- Both vector and DNA target must have compatible ends.
- Vectors allow the production of large quantities of a DNA of interest.
a. Vectors
i. Plasmid
1. Circular dsDNA molecules that occur naturally in some bacteria.
- Plasmids are circular, double-stranded DNA molecules that occur naturally in
some bacteria.
- Range in size from 2 to several hundred kilobases (kb).
2. Antibiotic resistance
- Plasmids carry genes for a selectable marker, such as antibiotic resistance.
3. Site that tolerates insertion of a new DNA sequence.
- Plasmids contain a site that tolerates the insertion of a new DNA sequence.
4. pUC18 plasmid example
a. Origin of replication for propagation in the host organism
- Plasmids have an origin of replication that is required for propagation in the host
organism.
b. Amp resistance and beta-galactosidase as selectable
markers.
- pUC18 has ampicillin resistance as a selectable marker.
- The beta-galactosidase gene encodes a protein that breaks down a sugar
analog (X-Gal) to produce a blue color. This gene contains a polylinker sequence
containing many restriction sites. An insertion into the polylinker disrupts
the beta-galactosidase gene, so colonies carrying an insert are white.
- X-Gal is converted into galactose and a dark blue compound when exposed to
beta-galactosidase.
- Thus, pUC18 has two selectable markers: ampicillin resistance selects for
bacterial cells containing the plasmid, and the beta-galactosidase gene allows
blue/white color selection to determine which bacterial cells contain the
DNA insert.
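The two-marker logic of pUC18 can be written as a small decision function. This Python sketch assumes cells are plated on medium containing both ampicillin and X-Gal; the function name and the outcome labels are hypothetical.

```python
def colony_phenotype(has_plasmid, has_insert):
    """Predict what is seen on an ampicillin + X-Gal plate."""
    if not has_plasmid:
        return "no growth"   # no ampicillin resistance gene
    if has_insert:
        return "white"       # insert disrupts the beta-galactosidase gene
    return "blue"            # intact beta-galactosidase cleaves X-Gal


for plasmid, insert in [(False, False), (True, False), (True, True)]:
    print(plasmid, insert, colony_phenotype(plasmid, insert))
```

Only the white colonies are worth picking, because they carry the plasmid and the DNA insert.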
ii. Phage (bacteriophage)
1. What are phages?
- Phages (bacteriophages) are viruses that infect bacterial cells and replicate
in them.
- They inject DNA into a bacterial cell, resulting in the production of more viral
particles.
2. What are the two modes of infection for lambda phage?
a. Lytic: viral functions are fully expressed, leads to
destruction of the host cell and release of hundreds of
virus particles.
b. Lysogenic: the phage DNA is integrated into the host
genome and can be replicated together with the host DNA.
3. Lambda Phage as cloning vector
a. What are the benefits of phage cloning?
- Large segments of the 48 kilobase genome of lambda phage can be deleted and
replaced with foreign DNA.
- Mutant phages have been made containing extra restriction sites into which a
new DNA can be inserted.
- The two remaining pieces of lambda DNA after digestion with EcoRI are equal to
72% of a normal genome length.
- Only DNA measuring from 75 to 105% of a normal lambda genome will be
packaged into a viral particle, so the two pieces alone are too short and only
phages carrying an insert are packaged.
- Phages can tolerate larger DNA insertions than plasmids (>10 kb).
- These modified viruses enter bacteria much more easily than plasmid vectors.
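The packaging constraint can be turned into a quick calculation. This Python sketch uses only the numbers quoted in these notes (48 kb genome, 72% remaining after EcoRI digestion, 75 to 105% packaging window); the function and constant names are hypothetical.

```python
GENOME_KB = 48.0        # lambda genome size from the notes
ARMS_FRACTION = 0.72    # two remaining pieces after EcoRI digestion
MIN_FRACTION = 0.75     # lower packaging limit from the notes
MAX_FRACTION = 1.05     # upper packaging limit from the notes


def is_packaged(insert_kb):
    """True if the vector arms plus the insert fit the packaging window."""
    total_kb = GENOME_KB * ARMS_FRACTION + insert_kb
    return MIN_FRACTION * GENOME_KB <= total_kb <= MAX_FRACTION * GENOME_KB


for insert_kb in (0, 10, 20):
    print(insert_kb, "kb insert packaged:", is_packaged(insert_kb))
```

With these numbers the arms alone (about 34.6 kb) fall below the 36 kb minimum, a 10 kb insert fits, and a 20 kb insert exceeds the 50.4 kb maximum.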
b. Genomic library
- Specific genes can be cloned from very large genomes.
- Genomic DNA is first digested into large fragments.
- Fragments that are about 15 kb long are isolated using gel electrophoresis.
- These fragments are ligated to lambda DNA using compatible ends.
- E. coli bacteria are infected with these recombinant phages.
- Phages replicate and lyse (kill) their bacterial hosts.
- The resulting lysate contains a large number of phage particles containing
fragments from the entire genome. This is known as a genomic library.
Key Concepts from Lecture 7
I. Mutagenesis of DNA: Proteins with new functions can be created through
directed changes (mutations) in DNA.
a. Deletions
- A specific sequence within a larger DNA can be excised using restriction
enzymes.
- The remaining ends are joined together by DNA ligase.
- Can use PCR to make targeted deletions of any size (Professor’s preferred
method) → overlap extension PCR: primers for the first round of PCR have
single-stranded extensions that are complementary to each other.
b. Substitutions
i. Site-directed mutagenesis:
- Developed by Michael Smith – Nobel Prize in 1993
1. How does it work?
- Mutant proteins can be made containing a single amino acid substitution using
oligonucleotides (primers) with the desired mutation.
- Need to know the sequence of the gene to be altered.
- Mutant primer is annealed to the DNA template and is elongated using DNA
polymerase.
- Original parental DNA can be degraded using DpnI, which only cleaves
methylated DNA.
- Only mutant DNA is left, which will produce mutant protein.
2. What is the purpose of DpnI?
- DpnI removes the original (non-mutant) template: the parental DNA, isolated
from bacteria, is methylated and is cleaved, whereas the newly synthesized
mutant DNA is unmethylated and survives.
c. Insertions
i. Cassette Mutagenesis
- Involves cutting plasmid DNA with two different restriction enzymes to remove
a specific region, then purifying the large fragment.
- A newly synthesized DNA fragment containing compatible ends is then ligated
into the plasmid.
- Allows the swapping of one gene for another.
Gene Synthesis:
- Completely new proteins with novel functions can be designed and synthesized.
- No starting DNA template needed.
- Many oligonucleotides are synthesized which correspond to the desired
sequence.
- These 40-100 base oligonucleotides are annealed and joined together to form the
final DNA sequence for the protein.
- This final sequence is cloned into a plasmid for the final protein.
- Design amino acid sequence → Design and synthesize gene → Produce and
characterize protein.
Genome Sequencing:
- The complete genomes of many organisms have been sequenced.
- This includes bacteria, fungi, plants, insects, worms, humans, etc.
- This has been made possible with the advent of automated DNA sequencers and
high speed computers for data analysis and sequence assembly.
The First Complete Genome Sequence:
- Diagram in lecture 7, slide 9 depicts the genome sequence of the bacterium
Haemophilus influenzae.
- First genome sequence of a free-living organism.
- This sequence was determined using a “shotgun” approach in which the genomic
DNA is shattered into many smaller pieces followed by sequencing of these
fragments.
- These random fragments are analyzed by a computer for overlapping regions,
which determines how they come together to form the full genome sequence.
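The overlap step of shotgun sequencing can be illustrated with a toy greedy assembler in Python. This is a teaching sketch under strong assumptions (error-free reads, all on the same strand, no repeats), not the method used for the actual genome project; the function names, the minimum overlap and the three short reads are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    reads = list(fragments)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            break  # no overlaps left; remaining contigs stay separate
        merged = reads[i] + reads[j][size:]
        reads = [r for k, r in enumerate(reads) if k not in (i, j)]
        reads.append(merged)
    return reads


print(greedy_assemble(["ATGGCGT", "GCGTACG", "TACGTTA"]))
```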
Human Genome:
- Human genome contains 3 billion base pairs of DNA distributed among 23 pairs
of chromosomes. (total of 46)
- Originally thought that humans would have 100,000 genes, however this was
incorrect.
- Humans have only ~25,000 genes.
- The proteome is more complex due to alternative splicing and post-translational
modification.
- Human genome contains large amounts of non-coding DNA composed of introns
and mobile genetic elements.
- Only 1.5% of the human genome codes for proteins.
- >90% of the genome is transcribed into RNA at high levels!!
- Noncoding RNAs are most likely playing an important role in eukaryotes.
- Area of intense study.
Comparative Genomics:
- Comparison of the genomes of different organisms can lead to the following
insights…
o Allows the identification of novel genes: comparison of the human and
pufferfish genomes led to the identification of 1000 previously unknown
genes.
o Evolutionary relationships can be determined: comparisons between the
human, chimpanzee, and Neanderthal genomes give insights into our
own evolutionary history.
II. Gene Expression Analysis
- Most genes are present as one copy per genome; however, the expression of most
genes into mRNA varies widely.
- Gene expression varies from cell type to cell type and also at different points in
time or stages of development.
- The complete genome sequence allows us to systematically look at the expression
levels of all the individual genes in an organism.
- Based on the assumption that the levels of mRNA indicate the level of protein
being produced in the cell.
- High-density arrays of oligonucleotides can be constructed which are
complementary to the mRNAs produced by the various genes.
- Binding of an mRNA extract to this DNA microarray or “gene chip” results in
fluorescence. This fluorescence can be quantitated to determine gene expression.
- Red corresponds to gene induction and green corresponds to gene repression.
- E.g., monitoring yeast for changes in gene expression under different environmental
conditions.
- Allows the determination of gene function and reveals networks of genes.
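As a rough illustration of how the fluorescence is quantitated, the Python sketch below calls a gene induced or repressed from the ratio of red to green signal at one spot. It assumes a two-color array in which red is the experimental sample and green the reference, and it uses a log2 ratio, which is a common convention rather than something stated in the lecture. The function name and intensity values are hypothetical.

```python
import math


def expression_change(red, green):
    """Return (log2 red/green ratio, call) for one microarray spot."""
    log_ratio = math.log2(red / green)
    if log_ratio > 0:
        call = "induced"
    elif log_ratio < 0:
        call = "repressed"
    else:
        call = "unchanged"
    return log_ratio, call


print(expression_change(800, 200))
print(expression_change(100, 400))
```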
III. Recombinant Protein Expression
- Natural levels of a specific protein are usually quite low.
- Eukaryotic genes can be introduced into bacteria.
- Bacteria can be used as factories to produce proteins from eukaryotes.
- New genes can be introduced into plants to introduce new properties such as pest
resistance.
- E.g., production of Bacillus thuringiensis (bacterium) toxin in peanut plants protects
them from damage caused by European corn borer larvae.
a. Why do we need to start with an mRNA sequence and not the genomic
sequence?
- To isolate the gene for protein expression, one must start with the mRNA
sequence and not the genomic sequence.
- This is because the genomic DNA has intron sequences that are
removed only when the transcript is processed into mRNA, and bacteria cannot
splice them out.
b. Know the process: mRNA → Converted into cDNA using reverse
transcriptase → cDNA ligated into a protein expression plasmid → E. coli
bacterium transformed with this plasmid and the protein is produced.
c. Know the steps of production of cDNA
- Synthetic oligo(dT) primer is annealed to the poly(A) of the mRNA.
- Reverse transcriptase uses the free 3’-OH end to initiate cDNA synthesis.
- Treatment with alkali (NaOH) at high pH is used to degrade the RNA strand.
- Terminal transferase is used to add a string of dGs to the 3’ end of the newly
synthesized cDNA to create another primer site of known sequence.
- PCR is then used to amplify the cDNA using the oligo(dT) and oligo(dC)
primers.
- In nature, cDNA is generated only by viruses (retroviruses), which use it to
integrate their RNA genome into the host’s genomic DNA. In the laboratory it is
synthesized through in vitro reverse transcription.
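The first step of cDNA production, copying mRNA into a complementary DNA strand, can be sketched in Python. This sketch covers only first-strand synthesis by reverse transcriptase and ignores the later dG-tailing and PCR steps; the function name and the short mRNA are made up for illustration.

```python
# DNA base that pairs with each RNA base.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """cDNA strand made by reverse transcriptase, written 5' to 3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna))


mrna = "AUGGCC" + "A" * 8   # short coding region plus a poly(A) tail
cdna = first_strand_cdna(mrna)
print(cdna)
```

The cDNA begins with a run of T residues at its 5’ end, which corresponds to the oligo(dT) primer annealed to the poly(A) tail.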
Protein Expression Vectors:
- cDNA is inserted into a plasmid directly after a plasmid-encoded transcription
promoter.
- A ribosome binding site (Shine-Dalgarno sequence) is located just before the start
codon of the gene to be expressed (cDNA).
- The resulting cDNA clones can be screened for expression of the protein of
interest.
d. Bacteria lack enzymes required for the post-translational modification
of eukaryotic proteins.
i. Eukaryotic cells add carbohydrate groups to the surface of
proteins as a result of post-translational modification.
ii. Eukaryotic cells have chaperone proteins that assist in the
proper folding of newly synthesized proteins.
iii. Since bacterial cells do not have these, a eukaryotic host must be
used for expression of a target gene in some cases.
e. Introduction of Recombinant genes into eukaryotes
- Recombinant DNA can be introduced into eukaryotes using several methods…
i. Microinjection
- DNA is directly injected into the nucleus of a cell using a micropipette.
ii. Electroporation
- Using a high-voltage pulse to make the cell membrane permeable to DNA
molecules.
iii. Viral vectors
- Retroviruses are the most efficient vectors for delivery of foreign DNA into
eukaryotic cells.
- Retroviruses have the capability to integrate the DNA version of their RNA
genomes into the host’s chromosomal DNA.
- This integrated DNA can be expressed and replicated by the host cell machinery.
- Retroviruses can accept DNA inserts of up to 6 kb.
- Baculovirus is used for the expression of proteins in insect cells.
IV. Gene disruption (knockout)
a. Why would we need to knock out a gene?
- The function of a gene can be determined by inactivating the gene and looking
for the effect upon the organism. This is called a gene “knockout”.
b. How would we knock out a gene?
- Can be done in diverse organisms such as bacteria, yeast, and mice.
- Gene knockouts are made using homologous recombination with a mutant
version of the gene.
Gene Disruption by Homologous Recombination:
o A mutant version of the target gene is designed. This mutant gene
maintains some similarity with the wild type (WT) gene, especially at the
5’ and 3’ ends.
o When this gene is introduced into embryonic cells, recombination occurs
between similar regions leading to the replacement of the WT gene with
the inactive mutant version.
o Then look for phenotypic (visible) effects upon the organism.
V. RNA Interference (RNAi)
- 1998: discovered by Andy Fire and Craig Mello – Nobel Prize 2006
a. Know the process of the RNAi pathway
- C. elegans: free living, transparent nematode, about 1 mm in length which lives
in temperate soil environments.
- dsRNA can be easily introduced into C. elegans worms by directly feeding them
E. coli bacteria that produce dsRNA.
- Large-scale screens can be done in which many genes are sequentially knocked
down one by one to determine the gene function.
- PROCEDURE: Introduction of a specific dsRNA into a cell disrupts the mRNA
from genes that contain sequence corresponding to the dsRNA molecule.
dsRNA is cut into 21-nucleotide fragments (siRNA) by an enzyme called Dicer.
These fragments consist of 19 base pairs with 2 unpaired nucleotides at each 3’
end. The two strands of each fragment are separated, and a single strand is
incorporated into the RISC complex. This single-stranded 21 nt RNA guides RISC to a
complementary mRNA, which is then degraded.
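The targeting logic of the pathway (a guide strand finds a complementary mRNA) can be sketched in Python. This is a deliberately simplified model: it chops the target region into consecutive 21-nucleotide windows, ignores the 2-nucleotide overhangs, and treats any perfectly complementary site as a hit. The function names and sequences are hypothetical.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def rna_reverse_complement(seq):
    """Return the complementary RNA strand, read 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq))


def guide_strands(mrna_region, size=21):
    """Antisense guide strands for consecutive windows of an mRNA region."""
    return [rna_reverse_complement(mrna_region[i:i + size])
            for i in range(0, len(mrna_region) - size + 1, size)]


def is_targeted(mrna, guide):
    """RISC degrades an mRNA that contains the complement of the guide."""
    return rna_reverse_complement(guide) in mrna


target = "AUGGCUAGCUAGGAUCCGAUC" * 2   # hypothetical 42-nt mRNA region
guides = guide_strands(target)
print(len(guides), len(guides[0]))
print(is_targeted(target, guides[0]), is_targeted("G" * 42, guides[0]))
```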
VI. Recombinant DNA and Plants
- Introduction of recombinant genes into plants can be done using the following
methods…
a. Tumor-inducing plasmids (Ti plasmids)
- Integrate into the genome and can express foreign DNA.
- The common soil bacterium Agrobacterium tumefaciens infects plants and
introduces foreign DNA.
- A tumor, known as the crown gall, grows at the site of infection.
- Crown galls synthesize opines, which are metabolized by the bacteria.
- The metabolism of the plant cell is diverted to produce food for the Agrobacterium.
- Ti plasmids carried by the Agrobacterium are responsible for the shift to the tumor
state and synthesis of opines.
- A small portion of the Ti plasmid, called T-DNA, is integrated into the plant cell
genome.
- Foreign DNA can be inserted into the T-DNA region and expressed upon
infection of a plant.
- Works with dicots (broad-leaved plants such as grapes) and some monocots.
b. Electroporation
- Use of a high-voltage electrical pulse that makes the cell membrane permeable to DNA.
- Foreign DNA can be inserted into a larger variety of plants, including cereal
grains, using electroporation.
- The cellulose wall is first degraded by treatment with cellulase to form
protoplasts.
- A mixture of plasmid DNA and protoplasts is subjected to high-voltage electrical
pulses.
- DNA enters the cells, which then express the foreign DNA.
c. Gene gun
- DNA is coated onto tungsten pellets and then fired into plant cells at high
velocity.
Benefits of Recombinant DNA Technology:
- Human gene therapy
- Drought resistant plants
- Genetically engineered microbes for bioremediation
- Production of drugs
GMO (Genetically Modified Organism) Food:
- Myth: Eating foreign DNA will turn you into a mutant!!!
- Fact: You eat grams of foreign DNA every day.
- The following desired properties can be engineered: pest resistance, drought
tolerance, salt tolerance, cold/heat tolerance, nutrition, and quantity/crop yield.
- With a growing world population, this is a much needed technique.
- Genetically modified plants contain engineered or transplanted proteins that can
be broken down by your stomach into amino acids.
- Organic and regular produce both contain pesticides that can cause human health
problems – relatively low concentration.
- GMO food can greatly increase crop yields to feed a growing global population.
- Greater food production per acre of land resulting in less environmental damage.
Molecular Evolution:
- Evolution is the foundation for all biology.
- Molecular evolution is the study of how proteins, nucleic acids, and other
molecules have changed through time.
- Two molecules are said to be homologous if they are derived from a common
ancestor and later diverged from this ancestral sequence.
a. Homology
i. Paralogs
- Homologs that are present within one species.
ii. Orthologs
- Homologs that are present within different species and have very similar
functions.
iii. What’s the difference between paralogs and orthologs?
- Paralogs are found within one species; orthologs are the corresponding proteins
in different species.
- The 3D structures of bovine ribonuclease, human ribonuclease, and human
angiogenin are very similar.
- Homology can be detected by significant sequence similarity resulting in a
common 3D structure.
- Bovine and human ribonuclease are orthologs.
- Human ribonuclease and angiogenin are paralogs.
iv. Homology can be used to infer function
- Large-scale sequencing has resulted in the discovery of many new genes.
- These can be compared with genes of known function.
- Sequence similarity most likely indicates a similar function in different organisms.
- Sequence alignments are performed.
b. Sequence alignments
Sequences are aligned in regions of similarity either in the specific amino acid
sequence or in the physical character of amino acids.
Hemoglobin: oxygen-carrying protein in blood
Myoglobin: binds oxygen in muscle
At first glance, it seems like not much sequence similarity.
Sequences are slid past each other to find windows of greatest similarity.
There are two good hits: one at the N-terminus, and the other on the C-terminus
Both hits can be combined into one by introducing a gap in one of the sequences.
Addition of a gap allows all regions of similarity to be included in the alignment.
Gap is needed because one protein has evolved to either gain or lose amino
acids.
c. Scoring Alignments
How do you test for the possibility that a grouping of sequence identities has
occurred by chance alone?
For example it is possible to insert many gaps into sequences to come up with
any alignment.
Use a scoring system to guard against this possibility.
E.g., each sequence identity is given +10 points, whereas each gap is assessed a
penalty of -25 points. Therefore, for the hemoglobin-myoglobin alignment, 38
identities and 1 gap result in a score of 355 (38 × 10 − 25 = 355).
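The scoring rule above is simple enough to express in a few lines of Python. The sketch uses the point values quoted in the notes (+10 per identity, -25 per gap) and assumes that a run of consecutive gap characters counts as one gap. The function names and the short aligned strings are hypothetical.

```python
def alignment_score(identities, gaps, identity_points=10, gap_penalty=25):
    """Score = points for identities minus penalties for gaps."""
    return identities * identity_points - gaps * gap_penalty


def count_identities_and_gaps(aligned_a, aligned_b):
    """Count identical positions and gap runs in two aligned strings.

    Gaps are written as '-'; a run of consecutive gap columns counts once.
    """
    identities = 0
    gaps = 0
    in_gap = False
    for x, y in zip(aligned_a, aligned_b):
        is_gap = x == "-" or y == "-"
        if is_gap and not in_gap:
            gaps += 1
        if not is_gap and x == y:
            identities += 1
        in_gap = is_gap
    return identities, gaps


print(alignment_score(38, 1))   # hemoglobin-myoglobin numbers from the notes
print(count_identities_and_gaps("ACD--EF", "ACDGHEF"))
```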
d. Shuffling
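- How do you know if the similarity is statistically significant? Is it better than a
random hit?
- The amino acid sequence of one of the proteins from the alignment is shuffled,
and the alignment is rescored.
- Repeating this many times gives the distribution of scores expected by chance.
- In the lecture plot, the red bar marks the score of the original alignment
(e.g., hemoglobin-myoglobin), which lies to the far right of the shuffled scores.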
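The shuffling test can be sketched in Python as follows. To keep it short, this version scores ungapped alignments only (one sequence is slid past the other and identities are counted), which is a simplification of the gapped alignment used in lecture. The function names, the number of trials, the seed and the example sequence are all hypothetical.

```python
import random


def best_identity_score(a, b, points=10):
    """Slide b past a (no gaps) and return the best identity score."""
    best = 0
    for offset in range(-len(b) + 1, len(a)):
        matches = sum(
            1 for i, residue in enumerate(b)
            if 0 <= i + offset < len(a) and a[i + offset] == residue
        )
        best = max(best, matches * points)
    return best


def shuffle_test(a, b, trials=200, seed=0):
    """Compare the real score with scores obtained after shuffling b."""
    rng = random.Random(seed)
    original = best_identity_score(a, b)
    residues = list(b)
    shuffled_scores = []
    for _ in range(trials):
        rng.shuffle(residues)
        shuffled_scores.append(best_identity_score(a, "".join(residues)))
    as_good = sum(score >= original for score in shuffled_scores)
    return original, shuffled_scores, as_good / trials


seq = "MKTAYIAKQRQISFVKSHFSRQ"   # made-up sequence aligned against itself
original, scores, fraction = shuffle_test(seq, seq)
print(original, max(scores), fraction)
```

If almost none of the shuffled scores reach the original score, the original alignment is unlikely to have arisen by chance.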
e. Conservative substitutions
i. Know what a conservative substitution of amino acids is.
- Some substitutions replace an amino acid with one that has
similar physical properties: the amino acids are similar in size and/or chemical
properties. This is called a conservative substitution.
- Conservative substitutions must be accounted for in scoring alignments, and
result in a positive score.
ii. Substitution matrices
- Conservative and single-nucleotide substitutions are more likely than substitutions with
more radical changes.
- We can examine substitutions that have already taken place in existing protein
sequences.
- Substitution matrices can be deduced from this analysis.
- A large positive score means the substitution occurs frequently, while a large negative score
indicates a rare substitution.
- Reading the matrix figure: starting amino acid at top; a shaded amino acid requires only a
single base mutation for the change. Conservative substitutions have high scores.
- Amino acids such as cysteine and tryptophan are more conserved than amino acids like
serine and alanine.
- Structurally conservative substitutions such as lysine for arginine or isoleucine for valine
have relatively high scores.
- This type of scoring system is much better at detecting homology between sequences
than an exact-identity approach.
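A small Python sketch of scoring with a substitution matrix instead of exact identity. The handful of scores below are entries as commonly tabulated for the BLOSUM-62 matrix and cover only the amino acids named above; check them against a published matrix before relying on them. The function names are hypothetical, and gaps are not handled.

```python
# A few BLOSUM-62 entries (subset only, for illustration).
SCORES = {
    ("W", "W"): 11, ("C", "C"): 9,   # highly conserved residues
    ("S", "S"): 4, ("A", "A"): 4,    # less conserved residues
    ("K", "K"): 5, ("R", "R"): 5,
    ("I", "I"): 4, ("V", "V"): 4,
    ("K", "R"): 2, ("I", "V"): 3,    # conservative substitutions
}


def pair_score(x, y):
    """Look up the score for a pair of residues in either order."""
    if (x, y) in SCORES:
        return SCORES[(x, y)]
    if (y, x) in SCORES:
        return SCORES[(y, x)]
    raise KeyError(f"no score stored for {x}/{y}")


def score_ungapped(a, b):
    """Sum the substitution scores over two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(pair_score(x, y) for x, y in zip(a, b))


print(score_ungapped("KIW", "RVW"))
print(score_ungapped("KIW", "KIW"))
```

With exact identity only the final W would count in the first comparison; the matrix also gives credit for the lysine/arginine and isoleucine/valine substitutions.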
BLAST (Basic Local Alignment Search Tool): online tool for searching sequence
databases for sequences similar to a query.
- Expect value (E) should be less than 10^-5 for a hit to be considered statistically
significant.
- Expect value is the number of sequences with this level of similarity expected to be in
the database by chance.
- This should be much less than 1 in order to be statistically significant.
- http://www.nci.nih.gov
f. Structural homology
3D structure is much more closely associated with function than the primary
sequence.
α-Hemoglobin, myoglobin, and leghemoglobin have very similar structures even
though the sequence similarity between human myoglobin and lupin
leghemoglobin is only 15.6% and is not statistically significant.
These proteins were expected to be related based on their similar biochemical
function of binding oxygen.
Structural homology can also be found for proteins having unrelated biochemical
function.
Actin is a major component of the cytoskeleton.
Heat shock protein (Hsp) assists in the folding of proteins inside cells.
Suggests that they are paralogs.
Descended from a common ancestor and adopted different roles.
Knowledge of 3D structures can aid in the proper alignment of sequences.
In a given family of proteins, residues that are critical for function are highly
conserved.
i. Can proteins with similar 3D structures be considered homologs if the
sequence alignment is not statistically significant?
This conservation can be used as a signal in the detection of similar proteins
even though the complete sequence alignment may not be statistically
significant.
g. Convergent Evolution
When two unrelated proteins, not descended from a common ancestor, evolve to
form the same structure and/or function.
Eg. Chymotrypsin and subtilisin – cleave peptide bonds through hydrolysis.
Their active sites are almost identical, however, the overall 3D structures are
very different making it unlikely that they are evolutionarily related.
Convergent evolution can also happen at the macroscopic scale, such as in
animals like bats and birds, or sharks and dolphins.
i. What’s the difference between convergent evolution and
paralog/ortholog homology?
Convergent evolution is how proteins that are unrelated converge to
have similar function/structure, while paralog/ortholog homology
details how proteins have diverged.
Convergent evolution sees similar function/structure from two unrelated proteins
Paralog/ortholog homology takes a common ancestral protein to help determine
a similar structure/function.
h. What is a motif?
- More than 10% of all proteins contain motifs that are repeated within the protein.
- Repeats can be detected by aligning the protein with itself.
- They are the result of a gene duplication event.
i. RNA structural homology
- RNA folds back on itself to form elaborate secondary structures containing both
double- and single-stranded regions.
- Comparison of a conserved RNA sequence from multiple organisms can allow
one to determine the secondary structure for that RNA.
i. Compensatory mutations
- Compensatory mutations: these are mutations that alter the sequence but
maintain base pairing within the secondary structure.
- MFOLD: a web server for predicting RNA secondary structures. Input the RNA
sequence of interest and MFOLD will output multiple RNA secondary structures
arranged in order from most likely to least likely.
- http://mfold.rna.albany.edu/?q=mfold/RNA-Folding-Form
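The idea of a compensatory mutation can be checked with a short Python function. The sketch assumes Watson-Crick pairs only (G-U wobble pairs are ignored) and that the two positions being tested are known to pair in the secondary structure. The function names and the small hairpin sequences are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def is_paired(x, y):
    """True if two RNA bases form a Watson-Crick pair."""
    return (x, y) in WATSON_CRICK


def is_compensatory(seq_a, seq_b, i, j):
    """True if positions i and j both changed, yet still pair in each sequence."""
    both_changed = seq_a[i] != seq_b[i] and seq_a[j] != seq_b[j]
    return (both_changed
            and is_paired(seq_a[i], seq_a[j])
            and is_paired(seq_b[i], seq_b[j]))


species_1 = "GGCAAAAGCC"   # stem pairs: 0-9, 1-8, 2-7
species_2 = "GACAAAAGUC"   # G-C at positions 1 and 8 replaced by A-U
print(is_compensatory(species_1, species_2, 1, 8))
print(is_compensatory(species_1, species_2, 0, 9))
```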
Carl Woese (1928-2012):
- 1977 – discovered that ribosomal RNA was highly conserved and could be
aligned from multiple species. Protein synthesis by the ribosome is an ancient
reaction carried out by all living things; therefore it is an excellent barometer of
evolution.
- Found a new class of life → Archaea
- First to prove that all life on earth was related.
j. Phylogenetic trees
- Sequence alignments of given proteins or nucleic acids can be used to construct
evolutionary trees; this is called phylogenetic analysis.
- Length of the branch connecting each pair of proteins is proportional to the
number of amino acid differences between the sequences.
- Analysis of ancient DNA from Neanderthals allows us to determine their
position on the evolutionary tree.
- Approximately 1-4% of human DNA is the result of interbreeding with
Neanderthals.
i. Know how to read the tree.
The closer the branches, the more related the two species/proteins are.
Distance between branches is proportional to the value of difference in sequence.
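The quantity behind the branch lengths, the number of differences between aligned sequences, can be computed with a few lines of Python. This sketch only builds the pairwise difference table and picks the closest pair; it does not construct a full tree. The function names and the three short aligned sequences are made up for illustration.

```python
from itertools import combinations


def differences(a, b):
    """Number of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def distance_table(aligned):
    """Pairwise difference counts for a dict of {name: aligned sequence}."""
    return {(n1, n2): differences(aligned[n1], aligned[n2])
            for n1, n2 in combinations(sorted(aligned), 2)}


def closest_pair(aligned):
    """The two sequences with the fewest differences (most closely related)."""
    table = distance_table(aligned)
    return min(table, key=table.get)


aligned = {
    "species_1": "MKVLAAGT",
    "species_2": "MKVLASGT",
    "species_3": "MRILTSGA",
}
print(distance_table(aligned))
print(closest_pair(aligned))
```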