Natural Selection as the Mechanism for Biological Evolution

Biology 21
Macromolecular Structure
Discovering the Connection between DNA Sequence and Protein Structure
A gene is a sequence of DNA nucleotides that provides the information for
producing a specific protein. The RNA copy of the gene contains the code
words, or codons, that determine the order of the amino acids in the protein.
Each codon is a series of three nucleotides that identifies a specific amino acid to
be placed at a given position in the protein. The chart below shows the
“codon dictionary” used by nearly all living organisms to produce proteins
from gene-derived RNA sequences.
In this exercise, you will
1. Use a computer simulation to show the production of a protein from a
short DNA sequence.
2. Determine the amino acid sequence for an unknown human RNA
sequence provided by your instructor.
3. Identify the gene for this unknown human RNA sequence using the online
Human Genome database at the National Center for Biotechnology
Information.
Codon Dictionary

UUU = phenylalanine PHE       UCU = serine SER     UAU = tyrosine TYR       UGU = cysteine CYS
UUC = phenylalanine PHE       UCC = serine SER     UAC = tyrosine TYR       UGC = cysteine CYS
UUA = leucine LEU             UCA = serine SER     UAA = stop               UGA = stop
UUG = leucine LEU             UCG = serine SER     UAG = stop               UGG = tryptophan TRP

CUU = leucine LEU             CCU = proline PRO    CAU = histidine HIS      CGU = arginine ARG
CUC = leucine LEU             CCC = proline PRO    CAC = histidine HIS      CGC = arginine ARG
CUA = leucine LEU             CCA = proline PRO    CAA = glutamine GLN      CGA = arginine ARG
CUG = leucine LEU             CCG = proline PRO    CAG = glutamine GLN      CGG = arginine ARG

AUU = isoleucine ILE          ACU = threonine THR  AAU = asparagine ASN     AGU = serine SER
AUC = isoleucine ILE          ACC = threonine THR  AAC = asparagine ASN     AGC = serine SER
AUA = isoleucine ILE          ACA = threonine THR  AAA = lysine LYS         AGA = arginine ARG
AUG = methionine (start) MET  ACG = threonine THR  AAG = lysine LYS         AGG = arginine ARG

GUU = valine VAL              GCU = alanine ALA    GAU = aspartic acid ASP  GGU = glycine GLY
GUC = valine VAL              GCC = alanine ALA    GAC = aspartic acid ASP  GGC = glycine GLY
GUA = valine VAL              GCA = alanine ALA    GAA = glutamic acid GLU  GGA = glycine GLY
GUG = valine VAL              GCG = alanine ALA    GAG = glutamic acid GLU  GGG = glycine GLY
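
For those who want to check their hand translations, the codon dictionary can be written as a lookup table. The following is only a minimal sketch in Python, not part of the assigned exercise. The names CODON_TABLE and translate are made up for illustration, and the sketch assumes the sequence contains only the letters A, U, G and C and that reading starts at the first AUG.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for all 64 codons, listed in UCAG order
# (first base changes slowest, third base fastest); "*" marks a stop codon.
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Three-letter abbreviations as used in the codon dictionary.
THREE_LETTER = {
    "F": "PHE", "L": "LEU", "S": "SER", "Y": "TYR", "C": "CYS", "W": "TRP",
    "P": "PRO", "H": "HIS", "Q": "GLN", "R": "ARG", "I": "ILE", "M": "MET",
    "T": "THR", "N": "ASN", "K": "LYS", "V": "VAL", "A": "ALA", "D": "ASP",
    "E": "GLU", "G": "GLY", "*": "STOP",
}
CODON_TABLE = {
    "".join(codon): THREE_LETTER[letter]
    for codon, letter in zip(product(BASES, repeat=3), ONE_LETTER)
}


def translate(rna):
    """Read codons from the first AUG until a stop codon or the end of the sequence."""
    rna = rna.upper()
    start = rna.find("AUG")
    if start == -1:
        return []  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


# Made-up example sequence: AUG UUU GGC UAA
print(translate("AUGUUUGGCUAA"))  # ['MET', 'PHE', 'GLY']
```

A letter other than A, U, G or C raises a KeyError in this sketch, which is a useful signal that the sequence was mistyped.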
Glossary of some terms you may encounter for online search (Part 3)

Intron     Region that interrupts a gene, does not contribute to the protein
           sequence
Exon       Part of the gene specifying the amino acid sequence, separated
           from other exons by introns
Missense   Mutation that causes a substitution of one amino acid for another in
           a protein.
Nonsense   Mutation that causes the codon for an amino acid to be changed to
           a stop codon, leading to a shortened protein.
p          Short arm of a chromosome
q          Long arm of a chromosome
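
The missense and nonsense definitions above can be illustrated by comparing a codon before and after a mutation. This is a rough sketch under stated assumptions: the function name classify_point_mutation is hypothetical, the example codons are made up, and the label "silent" (a change that leaves the amino acid the same) is a standard term that does not appear in the glossary.

```python
from itertools import product

# Standard genetic code as one-letter amino acid codes in UCAG codon order;
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): letter
    for codon, letter in zip(product("UCAG", repeat=3), AMINO_ACIDS)
}


def classify_point_mutation(original_codon, mutant_codon):
    """Label the effect of changing one RNA codon into another."""
    before = CODON_TABLE[original_codon.upper()]
    after = CODON_TABLE[mutant_codon.upper()]
    if before == after:
        return "silent"    # same amino acid (term not in the glossary)
    if before == "*":
        return "other"     # original was a stop codon; not covered by the glossary
    if after == "*":
        return "nonsense"  # amino acid codon changed to a stop codon
    return "missense"      # one amino acid substituted for another


print(classify_point_mutation("CGU", "UGU"))  # arginine to cysteine: missense
print(classify_point_mutation("UGG", "UGA"))  # tryptophan to stop: nonsense
```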
Directions
1. Using a computer simulation to show the production of a protein from a
short DNA sequence
a. Go to http://learn.genetics.utah.edu/content/begin/dna/transcribe/.
(Or access from Otherlinks at
homepage.smc.edu/colavito_mary/biology21.htm)
b. Follow the onscreen directions to produce the protein-coding RNA
strand (called messenger RNA). The following chart shows the
correspondence between nucleotides in the gene sequence and
nucleotides placed in the RNA copy.
Nucleotide in DNA    Nucleotide Placed in RNA
A                    U
T                    A
G                    C
C                    G
c. Follow the onscreen directions to produce the amino acid sequence
from the RNA. [Hint: Every protein-coding sequence begins with the
“Start” codon, AUG, so methionine is positioned as the first amino
acid in the chain.]
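
The DNA-to-RNA pairing chart in step 1b can also be expressed as a small lookup. The sketch below is an illustration only: the name transcribe and the example DNA sequence are made up, and the bases are paired position by position exactly as in the chart, without regard to strand direction.

```python
# Pairing rules from the chart: nucleotide in DNA -> nucleotide placed in RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(dna):
    """Build the RNA copy of a DNA strand one nucleotide at a time."""
    return "".join(DNA_TO_RNA[base] for base in dna.upper())


# Made-up DNA strand; its RNA copy begins with the start codon AUG.
print(transcribe("TACAAACCG"))  # AUGUUUGGC
```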
2. Determining the amino acid sequence for an unknown human RNA
sequence provided by your instructor
a. Obtain a worksheet showing an unknown human RNA sequence from
your instructor.
b. Produce the DNA strand that would be complementary to this RNA
sequence.
c. Using the codon dictionary, determine the amino acid sequence for the
protein encoded by this unknown human RNA sequence.
d. Record your results on the worksheet.
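
Steps 2b and 2c can be checked with a short script once the work has been done by hand. This is a sketch under assumptions: the names complementary_dna and split_into_codons are hypothetical, the example sequence is made up and is not one of the worksheet sequences, and the first codon is assumed to start at the first letter of the sequence.

```python
# Reverse of the DNA-to-RNA chart: each RNA base pairs with one DNA base.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def complementary_dna(rna):
    """Return the DNA strand that pairs base by base with the RNA sequence."""
    return "".join(RNA_TO_DNA[base] for base in rna.upper())


def split_into_codons(rna):
    """Break the RNA into three-letter codons, dropping an incomplete codon at the end."""
    rna = rna.upper()
    return [rna[i:i + 3] for i in range(0, len(rna) - 2, 3)]


example_rna = "AUGUUUGGC"  # made-up sequence for illustration
print(complementary_dna(example_rna))  # TACAAACCG
print(split_into_codons(example_rna))  # ['AUG', 'UUU', 'GGC']
```

Each codon in the printed list can then be looked up in the codon dictionary to fill in the worksheet.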
3. Identifying the gene for this unknown human RNA sequence using the
online Human Genome database at the National Center for Biotechnology
Information
a. Go to http://blast.ncbi.nlm.nih.gov/Blast.cgi (Or access from Otherlinks
at homepage.smc.edu/colavito_mary/biology21.htm)
b. Under “Basic Blast”, choose Nucleotide Blast.
c. Type your unknown human RNA sequence into the “Enter Query
Sequence” box at the top, make sure that the Human Genomic +
Transcript Database is selected and then select the “BLAST” button at
the bottom of the screen.
d. When the results are displayed, scroll down to the heading:
“Sequences producing significant alignments”. Choose one of the
sequences at the top of the list, and click on its Accession number. It is
best to use a sequence showing a gene, transcript (mRNA), protein or
disease name rather than one listed simply as “clone”, “predicted” or
“human sequence”.
e. When the next screen is displayed, check for information on the gene
name and chromosomal location.
f. If you need additional information, scroll down to the section labeled
“Features”. This section has items indicated in the table below:
Source        Shows chromosome number and map location
Protein       Shows the sequence of the protein derived from this gene
Gene          Gene Name
              Synonyms
              Gene ID: links to general information and PubMed
              database articles for this gene
              MIM: summary of data on inheritance and molecular
              biology of the gene; useful for identifying mutations
              as described below
              Click on the number next to MIM*
              Chromosomal Location will be reported as Gene Map
              Locus
              Click on the Chromosomal Location to see a table
              giving information on diseases caused by mutations in
              the gene.
              To find mutations in the gene:
              1. Choose Allelic Variants from the OMIM menu on
              the left side of the screen
              2. Choose any one variant that involves a single
              amino acid change. Ex. ARG142CYS means that
              cysteine replaces arginine at position 142.
MIM or OMIM   Online Mendelian Inheritance in Man
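
The variant notation in the example above (ARG142CYS) follows a simple pattern: original amino acid, position, replacement amino acid. The sketch below shows one way to split such a label into its parts. The name parse_variant is hypothetical, and the pattern assumes three-letter abbreviations on both sides of the number, so labels written in other styles will be rejected.

```python
import re

# Three letters, a position number, then three letters (for example ARG142CYS).
VARIANT_PATTERN = re.compile(r"^([A-Z]{3})(\d+)([A-Z]{3})$")


def parse_variant(label):
    """Split a label such as ARG142CYS into (original, position, replacement)."""
    match = VARIANT_PATTERN.match(label.strip().upper())
    if match is None:
        raise ValueError(f"Not a single amino acid change: {label!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


original, position, replacement = parse_variant("ARG142CYS")
print(f"{replacement} replaces {original} at position {position}")
# CYS replaces ARG at position 142
```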