ANALYSIS OF DNA AND GENOMES
1. Describe the properties of restriction nucleases and understand how they can be used
to join DNA fragments.
2. Describe how gel electrophoresis is used to separate DNA molecules by size.
3. Understand how nucleic acid hybridization techniques are used to detect the amount
of a particular DNA or RNA molecule present in a complex mixture. Understand
how the reaction conditions can be adjusted to allow either hybridization of only
identical sequences or also of related ones.
4. Describe how Southern blotting is used to detect a particular DNA molecule and
Northern blotting is used to detect a particular RNA molecule present in a complex
sample. Understand how Southern blotting is used to detect a mutation to a particular
gene.
5. Understand how DNA cloning with plasmid cloning vectors that are introduced into
bacteria is used to generate many copies of a specific DNA sequence.
6. Describe the sequence composition of a genomic DNA library and how this type of
library is generated.
7. Describe the sequence composition of a cDNA library and how this type of library is
generated.
8. Distinguish between a genomic DNA and a cDNA clone and describe different uses
for each one.
9. Understand how the polymerase chain reaction is used to amplify a segment of DNA.
10. Describe how a cDNA corresponding to one particular mRNA present in a complex
mixture can be amplified by the polymerase chain reaction.
11. Understand how PCR analysis of variable number of tandem repeat loci is useful in
forensic investigation and paternity testing.
12. Understand how hybridization with allele-specific oligonucleotides can be used in
genetic diagnosis.
13. Describe how microarrays can be used to simultaneously analyze the expression of
many genes or to detect genetic variations.
14. Describe the dideoxy method and highly parallel methods for sequencing DNA.
15. Describe how a protein sequence is deduced from a cDNA sequence. Describe the
different strategies used to locate protein coding sequences within genomes.
16. Describe how antibodies can be used to detect specific proteins, including their use in
Western blotting.
17. Describe how introducing expression vectors into cells is used to produce large
amounts of a desired protein.
18. Understand how the approximate location of a gene that causes a disease can be
determined by linkage analysis with physical markers. Describe the types of markers
that are used.
19. Describe the inheritance patterns for autosomal dominant, autosomal recessive, and
X-linked recessive diseases. Understand how diseases can have complex patterns of
inheritance.
20. Understand how genome-wide association studies can be used to identify genetic
variation associated with increased risk of developing a complex genetic disease.
21. Describe how gene targeting in mice is used to analyze the function of particular
genes.
22. Understand how RNA interference can be used to experimentally turn off expression
of a particular gene and how CRISPR-Cas9 can be used for targeted genome
manipulation.
ANALYSIS OF DNA AND GENOMES
1. Basic methods for isolating, manipulating, and analyzing DNA (recombinant DNA
technology)
a. Restriction nucleases
 Reproducibly cut DNA at specific sites that are recognized by a 4-8 bp
sequence
 Many of these enzymes produce staggered cuts that leave single-stranded
cohesive ends
 Cohesive ends of two different fragments that were produced by the same
enzyme are complementary and can be readily joined to create recombinant
DNA
b. Gel electrophoresis
 DNA is negatively charged and migrates toward the positive electrode in an electric field
 Gel matrix serves as molecular sieve that separates DNA molecules by size;
larger molecules move slower through gel
 DNA is visualized by either staining with ethidium bromide which fluoresces
under ultraviolet light or by prior incorporation of a radioisotope (usually 32P)
into DNA which is then detected by autoradiography
c. Nucleic acid hybridization
 Double stranded DNA can be denatured into two single strands by heating or
pH extremes; complementary DNA or RNA strands can renature (hybridize)
under appropriate conditions
 Many hybridization procedures use a labeled single stranded DNA probe to
detect presence of a particular DNA or RNA species from a complex mixture
 Common method for generating probe in vitro
- Purified DNA fragment denatured and annealed to pool of short random
primers;
- DNA polymerase then incorporates labeled nucleotides to create labeled
DNA molecules
 Stringency of hybridization
- Hybridization temperature determines whether only identical or also
related sequences can hybridize
- Temperatures slightly below melting temperature only permit
hybridization of perfectly matched sequences (high stringency); detect one
gene from entire genome
- Lower temperatures allow related sequences with some mismatches to
also hybridize (low stringency); detect related genes or homologous genes
from another organism
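The restriction digest and gel separation described in 1a and 1b can be sketched in a few lines of Python. This is a minimal illustration, not a laboratory tool: it assumes a linear top-strand sequence, uses EcoRI (recognition site GAATTC, cut after the G) as the example enzyme, and the names digest, gel_order, and the sample sequence are hypothetical.

```python
# Assumption: EcoRI recognizes GAATTC and cuts between G and AATTC on each
# strand, leaving a 5' AATT single-stranded cohesive end.
SITE = "GAATTC"
CUT_OFFSET = 1  # position of the cut within the site, on the top strand


def digest(seq, site=SITE, cut_offset=CUT_OFFSET):
    """Return the top-strand fragments of a linear DNA cut at every site."""
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def gel_order(fragments):
    """Order fragments as on a gel: largest (slowest) first, smallest last."""
    return sorted(fragments, key=len, reverse=True)


dna = "TTAGGAATTCCGATCGATGGCAGAATTCAT"  # hypothetical sequence with two sites
fragments = digest(dna)
for fragment in gel_order(fragments):
    print(len(fragment), fragment)
```

Every fragment downstream of a cut begins with AATTC, which is why two fragments produced by the same enzyme carry complementary ends that can be joined.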
2. Southern and Northern blotting
a. Complex mixture of DNA (for Southern blotting) generated by restriction
nuclease or RNA (for Northern blotting) separated by gel electrophoresis prior to
hybridization
b. Following electrophoresis, nucleic acids transferred to and immobilized on
membrane; for Southern blotting, molecules in gel are denatured prior to transfer
c. Membrane exposed to solution containing labeled probe; molecules that hybridize
to probe are identified as discrete bands
d. Southern blotting can detect a mutation in a particular gene caused by insertion or
deletion of a segment of DNA
e. Northern blotting can detect if expression of a particular gene is altered under a
given condition, for example in a mutant organism
3. DNA cloning
a. Describes making of many identical copies of a DNA molecule; also describes
isolation of one segment of DNA such as a gene from the cell’s total DNA
b. To clone using bacteria, segment of DNA is inserted into cloning vector, which is
then propagated in host cells
c. Cloning vectors are commonly bacterial plasmids (small circular DNA
molecules); plasmid is cut with restriction nuclease and joined with DNA
fragment to be cloned
d. Recombinant plasmid DNA is introduced (transformed) into bacteria; propagation
of bacteria in culture results in repeated replication of plasmid, which can be
easily purified from bacteria
4. DNA library
a. Collection of cloned DNA fragments, which are commonly contained in plasmids
in bacterial host
b. Set of recombinant plasmids is generated and transformed into bacteria; bacteria
are grown on plates to form isolated colonies; bacteria in each colony contain
plasmid with one particular cloned DNA fragment
c. Genomic DNA library
 Contains the entire genome of a particular individual
 Total DNA is isolated from cells, cut with restriction nuclease, inserted into
plasmids, and propagated in bacteria
 Fragments in library contain all genes from the individual; also includes
abundance of noncoding DNA
d. cDNA (complementary DNA) library
 Contains only DNA sequences that are transcribed into mRNA; DNA is
generated that is complementary to mRNA
 mRNA isolated from cells is reverse transcribed into single stranded DNA
with reverse transcriptase; DNA polymerase is used to make double stranded
cDNA which is inserted into plasmids and propagated in bacteria
 Clones in library represent all mRNAs expressed in a particular cell type;
different libraries will be generated using different cell types from same
organism
e. Uses of genomic and cDNA libraries
 cDNA clones useful for deducing amino acid sequence of protein or for bulk
production of protein in bacteria or other cells
 Genomic clones useful for obtaining gene regulatory sequences or for
sequencing genomes
 Libraries can be screened to select clones with particular properties, such as
based on sequence (by hybridization) or function of gene product
5. Polymerase chain reaction (PCR)
a. Amplification of a segment of DNA from a complex mixture in vitro without
using bacterial host
b. Need sequence information of short segments at each end of sequence to be
amplified
c. Use two chemically synthesized oligonucleotides each complementary to one of
the strands at opposite ends of sequence to be amplified; serve as primers for
DNA polymerization reactions
d. Each cycle consists of three different temperatures for denaturing DNA strands,
annealing primers to template DNA, and DNA synthesis from primers; requires
heat stable DNA polymerase
e. Perform repeated cycles; DNA that is synthesized in one cycle serves as template
in subsequent cycles which results in exponential amplification of region between
primers
f. Can obtain genomic DNA and cDNA clones by PCR
 For genomic clones, DNA isolated from cells, and PCR used to obtain DNA
located between primers
 For cDNA clones, mRNA isolated from cells and reverse transcribed with
reverse transcriptase; PCR used to obtain cDNA located between primers
g. Useful in forensic analysis and paternity testing
 Examine segments whose length is highly variable between individuals; often
use microsatellites (runs of short repeated sequences) whose lengths vary,
known as variable number of tandem repeats (VNTR)
 PCR using primers to non-variable segments that flank a particular VNTR;
reactions analyzed by gel electrophoresis; each individual will usually have
different products from paternal and maternal inherited alleles
 By examining five to ten different VNTRs, a very precise genetic fingerprint
is established for an individual
h. Useful for detecting presence of low levels of viral DNA indicative of viral
infection; useful for detection of many genetic diseases, particularly those involving
insertions and deletions
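The primer logic in 5c and the VNTR analysis in 5g can be illustrated with a small in-silico PCR sketch. It assumes exact primer matches on a single top-strand template, and the primer sequences, the CA repeat, and the function names are hypothetical. The copy number formula is the ideal case of perfect doubling each cycle; real reactions fall short of it.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon(template, forward, reverse):
    """Return the top-strand product bounded by the two primers, or None.

    The forward primer matches the top strand directly; the reverse primer
    anneals to the top strand, so its reverse complement appears in it.
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


def ideal_copies(cycles, starting_copies=1):
    """Upper bound on copy number if every cycle doubles the product."""
    return starting_copies * 2 ** cycles


def vntr_allele(repeats):
    """Hypothetical locus: constant flanks around a variable CA repeat."""
    return "TTGACCTGAA" + "CA" * repeats + "GGTCATTCAGAA"


FORWARD, REVERSE = "GACCTGAA", "GAATGACC"
for repeats in (6, 9):  # e.g. maternal and paternal alleles
    product = amplicon(vntr_allele(repeats), FORWARD, REVERSE)
    print(repeats, "repeats ->", len(product), "bp product")
print(ideal_copies(30), "copies after 30 ideal cycles")
```

Because the primers sit in the non-variable flanks, the product length differs between alleles only by the number of repeats, which is what the gel resolves.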
6. Genetic analysis using allele-specific oligonucleotides
a. Can detect small differences between different alleles, even single base pair
differences
b. Isolate DNA from cells and PCR amplify segment of gene that includes region to
be analyzed
c. Separate reaction by gel electrophoresis and transfer to membrane
d. Hybridize with labeled oligonucleotide probe (chemically synthesized single
stranded DNA approximately 20 nucleotides in length) that recognizes a specific
allele; use conditions that allow only perfect matches to hybridize
e. Repeat procedure using oligonucleotide probe specific for another allele of same
gene, for example to distinguish between normal and disease-causing alleles
7. Hybridization using microarrays
a. Hybridization technique for simultaneously monitoring expression of thousands
of genes or for detecting genetic variation
b. Slide generated containing large, dense array of DNA probes each of known
sequence and position on slide
c. Isolate from cells genomic DNA sample or mRNA sample that is reverse
transcribed to cDNA
d. DNA sample is labeled with fluorescent dye and hybridized to slide
e. Intensity of fluorescence at each spot indicates abundance of particular segment
of DNA within sample
f. Applications
 Analyze genetic variation including single nucleotide polymorphisms (SNPs)
 Determine gene expression patterns that underlie many cellular processes
 Distinguish among different types of cancer cells based upon characteristic
gene expression patterns
8. Sequencing DNA
a. Dideoxy (Sanger) method
 Synthesis of purified DNA in vitro using DNA polymerase, primer, and four
deoxynucleotides
 Four reactions, each of which includes a small amount of one dideoxynucleoside
triphosphate (ddNTP); rare incorporation of a ddNTP blocks further chain
growth
 Reaction with a particular ddNTP generates set of fragments whose lengths
indicate positions where the nucleotide is present
 Each reaction separated by gel electrophoresis in parallel lanes on gel;
examining all fragments in order of size reveals sequence
b. Automated sequencing
 One reaction includes small amounts of all four ddNTPs, each ddNTP labeled
with a different color fluorescent dye
 Detector located at bottom of gel reads the color of the label in each fragment
c. Sequencing genomes by shotgun sequencing method
 Generate several genomic libraries with different insert sizes
 Perform sequencing reactions on millions of different genomic DNA clones;
will have segments of sequence overlap between different clones
 Complex algorithms for assembly of sequence data in appropriate order on
chromosomes based on sequence overlaps
d. Highly parallel sequencing
 Sample preparation
 DNA randomly sheared and ligated to adaptors; DNA molecules attached
to solid surface
 Clonal PCR-based amplification from single DNA molecules on a solid
surface; adaptor sequences serve as priming sites for PCR amplification
 Form high density array of clonal DNA clusters (up to millions of clusters)
 Sequencing by synthesis- repeated cycles using sequencing instrument
 Polymerase-catalyzed addition of nucleotide that is complementary to
template strand
 Fluorescence or chemiluminescence imaging to detect nucleotide
incorporation at each DNA cluster
 One technology uses four different types of fluorescent-labeled reversible
terminator nucleotides; cycles involve addition of nucleotide, imaging
of fluorescence at each DNA cluster to identify which nucleotide was
incorporated, removal of blocked terminus and fluorophore to allow
subsequent cycle
 Other technology couples pyrophosphate release during nucleotide
incorporation to reaction that produces chemiluminescence signal; cycles
involve addition of one of four nucleotides and chemiluminescence imaging at
each DNA cluster to identify if nucleotide was incorporated, each of four
nucleotides added sequentially
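The way the dideoxy method in 8a turns fragment lengths into a sequence can be shown with a short sketch. It is a simplification under stated assumptions: fragment length is counted from the first newly added nucleotide (primer length ignored), each reaction is assumed to terminate at every possible position, and the function names are hypothetical.

```python
def dideoxy_fragments(new_strand):
    """Map each ddNTP reaction to the fragment lengths it would produce.

    A fragment of length n appears in the reaction for base X when the
    n-th nucleotide of the newly synthesized strand is X.
    """
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_gel(lanes):
    """Read the sequence by visiting all fragments from shortest to longest."""
    by_length = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in by_length)


lanes = dideoxy_fragments("GATTACA")
print(lanes)            # fragment lengths in each of the four lanes
print(read_gel(lanes))  # sequence recovered from the lanes
```

Automated sequencing in 8b follows the same logic, except that the four lanes are merged into one and the dye color, rather than the lane, identifies the terminating base.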
9. Finding DNA sequences that encode proteins
a. From cDNA sequence- six potential reading frames; usually only one is
recognized as correct due to protein coding region that begins with ATG and ends
with stop codon (open reading frame) and that is reasonably long; other frames
usually contain frequent stop codons
b. From genome sequence- more difficult because vast majority of sequence is
noncoding
 Search for open reading frames; must account for long introns within open
reading frames by searching for sequences that signal intron/exon boundaries;
also searching for upstream regulatory sequences helps to find genes
 Sequence cDNAs; large collection of cDNA sequences (database) can be
compared to genomic sequence to locate exons and introns in genes
 Compare sequences between species, for example human and mouse;
conserved sequences usually indicative of exons that encode proteins
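The six-frame search described in 9a can be sketched as follows. This is a minimal version that assumes an intron-free cDNA sequence, treats an open reading frame as ATG through the first in-frame stop codon, and simply reports the longest one; the function names and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame):
    """Yield each ATG-to-stop open reading frame in one forward frame."""
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            yield seq[start:i + 3]
            start = None


def longest_orf(seq):
    """Search three frames on each strand and return the longest ORF found."""
    candidates = []
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            candidates.extend(orfs_in_frame(strand, frame))
    return max(candidates, key=len, default="")


cdna = "GCATGGCTAAACCGTAAGG"  # hypothetical cDNA fragment
print(longest_orf(cdna))
```

For genomic sequence this simple scan is not enough, since introns interrupt the reading frame, which is why the additional strategies in 9b are needed.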
10. Use of antibodies to detect specific proteins
a. Antibodies are proteins produced by immune system; produced in billions of
different forms that bind to different targets known as antigens
b. Can be generated by injecting animal with antigen and collecting specific
antibodies produced by immune cells
c. In typical application, primary antibody recognizes antigen of interest; detection
occurs from secondary antibodies coupled to a marker (enzyme, fluorescence)
that bind to primary antibody
d. Western blotting- use antibody to detect specific protein from complex mixture
that has been separated by polyacrylamide-gel electrophoresis; proteins from gel
transferred to and immobilized on membrane prior to antibody detection
11. Producing proteins in large amounts
a. For medically useful proteins, such as insulin, growth hormone, interferon, or for
research use
b. Expression vectors contain strong promoter to drive transcription of adjacent
protein coding gene
c. Protein coding sequence, usually from cDNA, inserted into expression vector;
resultant vector introduced into cells in culture
d. Different expression vectors designed for bacteria, yeast, insect, or mammalian
cells
e. High abundance of protein facilitates purification; can engineer gene to produce
protein with molecular tag that facilitates affinity purification
12. Genetic approach for determining gene function and identifying disease-related genes
a. Determine genotype of organism with particular phenotype
 Find genes responsible for genetic diseases in humans
 Find mutated genes in model organisms subjected to random mutagenesis that
have interesting phenotypes
 Linkage analysis
- The closer two loci are on the same chromosome, the greater chance that
they will be passed on to offspring together
- Useful physical markers have known locations in genome and have at
least two different forms (polymorphic); VNTRs can be used as markers;
also many SNPs identified in humans which can be detected by
hybridization techniques
- Examine relationship in families between many markers and disease (or
phenotype)
- If a particular marker is almost always inherited with disease (or
phenotype), mutated gene is located near that marker
- If genome is sequenced, can examine candidate genes located at that
region
b. Inheritance patterns
 Most diseases have genetic component; those with simple Mendelian
inheritance easier to determine responsible gene because defect in single gene
has an overwhelming effect
- Autosomal dominant- one copy of defective gene causes disease; if one
parent has disease, 50% chance of offspring being affected
- Autosomal recessive- both copies of gene must be defective; if both
parents are carriers, 25% chance of offspring being affected, 50% chance
of offspring being carriers
- X-linked recessive- if mother is carrier, 50% chance of son being affected,
50% chance of daughter being carrier; if father has disease, all daughters
will be carriers
 Complex genetic diseases- no simple inheritance pattern; can inherit an
increased risk; dependent upon multiple genes and environment; many
common diseases such as hypertension, heart disease, diabetes, bipolar
disorder
c. Genome-wide association studies to study complex genetic diseases
 Compare frequency of each SNP in disease and control groups
 Most SNPs show no significant difference in frequency of each allele between
two groups
 Any SNP in which one allele occurs at a greater frequency in disease group
indicates SNP allele is marker for genetic risk factor; these SNPs mark general
location of the genetic alteration contributing to disease risk
 By analyzing ~1000-2000 individuals for disease and control groups using
DNA microarrays that type ~500,000 SNPs, have been able to identify
specific genetic variations that increase risk by as little as ~1.2 times
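The comparison of allele frequencies in 12c can be made concrete for a single SNP. The sketch below uses invented allele counts for 2000 cases and 2000 controls (4000 alleles per group) and computes the allelic odds ratio, which is one common way to express an effect of roughly 1.2 times; it is an assumption that this is the measure intended here, and a real study would also need a significance test corrected for the ~500,000 SNPs examined.

```python
def allele_frequency(risk_count, other_count):
    """Fraction of alleles in a group that are the risk allele."""
    return risk_count / (risk_count + other_count)


def allelic_odds_ratio(case_risk, case_other, control_risk, control_other):
    """Odds of carrying the risk allele in cases relative to controls."""
    return (case_risk / case_other) / (control_risk / control_other)


# Hypothetical allele counts for one SNP (not real data)
case_risk, case_other = 1200, 2800
control_risk, control_other = 1050, 2950

print("case frequency:   ", allele_frequency(case_risk, case_other))
print("control frequency:", allele_frequency(control_risk, control_other))
odds_ratio = allelic_odds_ratio(case_risk, case_other, control_risk, control_other)
print("odds ratio:", round(odds_ratio, 3))
```

A frequency difference this small (0.30 versus about 0.26) is only detectable because thousands of individuals are typed, which is why the sample sizes given above matter.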
13. Gene targeting in mice to determine gene function (reverse genetics)
a. Introduce mutated, often inactivated, gene into embryonic stem (ES) cells in
culture; in rare instances introduced gene replaces one copy of cellular
counterpart by homologous recombination
b. Identify colonies in which cellular gene has been replaced using Southern blotting
or PCR
c. Inject mutant ES cells into early embryo; implant into mouse; breed offspring that
have mutated gene in germ cells
14. RNA interference
a. Turn off expression of specific gene to study its function; potential therapeutic use
b. Mechanism
 Small fragments (21-23 bp) of double-stranded RNA known as small
interfering RNA (siRNA) that are complementary to particular mRNA
sequence introduced into cells; alternatively, larger double stranded RNA can
be converted to siRNA
 Multi-protein RNA-Induced Silencing Complex (RISC) acts with siRNA to
specifically cleave appropriate mRNA
15. Genome Engineering using CRISPR (clustered regularly interspaced short
palindromic repeat)/Cas9 (CRISPR-associated nuclease 9)
a. Targeted editing of genome; can be used to study gene function; potential
therapeutic use in correcting disease-causing mutations
b. Mechanism
 Introduce into cells Cas9 and guide RNA (gRNA); particular gRNA contains
~20 nt sequence that matches desired genomic target
 gRNA complexes with Cas9 and guides it to genomic target, where Cas9 makes double-strand break
 genomic target (called protospacer) requires adjacent protospacer adjacent
motif (PAM) of NGG for recognition by Cas9
 Endogenous machinery repairs double-strand break; nonhomologous
end-joining results in mutation (usually deletion); usually leads to inactive gene by
disrupting downstream reading frame
 Homologous repair template with desired changes can be introduced to
facilitate precise gene editing by homologous recombination
c. Technique can be used for efficient generation of mutant organisms by
microinjecting Cas9, gRNA, and repair template into one-cell embryos
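The target requirement in 15b (a ~20 nt protospacer immediately followed by an NGG PAM) can be expressed as a simple sequence search. This is a sketch under stated assumptions: it scans only the given strand, ignores off-target matches elsewhere in the genome, and the function name and example sequence are hypothetical.

```python
import re


def find_cas9_targets(genome, protospacer_len=20):
    """Return (start, protospacer, pam) for every site followed by NGG.

    A lookahead is used so that overlapping candidate sites are all found.
    """
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(genome)]


# Hypothetical sequence: 2 nt, a 20 nt protospacer, an AGG PAM, then 4 nt
genome = "TT" + "ACGTACGTACGTACGTACGT" + "AGG" + "TTTT"
for start, protospacer, pam in find_cas9_targets(genome):
    print(start, protospacer, pam)
```

The gRNA would carry the 20 nt protospacer sequence; the PAM itself is read on the genomic DNA by Cas9 and is not part of the guide.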