STUDYING GENOMES

Studying DNA
Enzymes for DNA manipulation
• Before the 1970s, the only way in which individual genes
could be studied was by classical genetics.
• In the early 1970s, biochemical research provided
molecular biologists with enzymes that could be used to
manipulate DNA molecules in the test tube.
• Molecular biologists adopted these enzymes as tools for
manipulating DNA molecules in pre-determined ways,
using them to make copies of DNA molecules, to cut DNA
molecules into shorter fragments, and to join them
together again in combinations that do not exist in nature.
• These manipulations form the basis of recombinant DNA
technology.
Recombinant DNA technology
• The enzymes available to the molecular biologist fall into
four broad categories:
1. DNA polymerases – synthesize new polynucleotides
complementary to an existing DNA or RNA template
2. Nucleases – degrade DNA molecules by breaking the
phosphodiester bonds
• restriction endonucleases (restriction enzymes) – cleave DNA
molecules only when a specific DNA sequence is encountered
3. Ligases – join DNA molecules together
4. End-modification enzymes – make changes to the ends of
DNA molecules
source: Brown T. A. , Genomes. 2nd ed. http://www.ncbi.nlm.nih.gov/books/NBK21129/
DNA cloning
• DNA cloning (i.e. copying) – logical extension of the ability
to manipulate DNA molecules with restriction
endonucleases and ligases
• vector
• DNA sequence that naturally replicates inside bacteria.
• It consists of an insert (transgene) and a larger sequence serving as
the backbone of the vector.
• Used to introduce a specific gene into a target cell. Once the
expression vector is inside the cell, the protein encoded by
the gene is produced by the cell's transcription and translation
machinery.
Vectors
• plasmid
• DNA molecule that is separated from, and can replicate
independently of, the chromosomal DNA.
• Double stranded, usually circular, occurs naturally in bacteria.
• Serves as an important tool in genetics and biotechnology labs,
where it is commonly used to multiply (clone) or express particular
genes.
• length of insert: 1-10 kbp
source: wikipedia
DNA cloning
[Figure: DNA cloning; labelled steps: restriction endonuclease, ligase]
source: Brown T. A. , Genomes. 2nd ed. http://www.ncbi.nlm.nih.gov/books/NBK21129/
Vectors
• BAC (bacterial artificial chromosome)
• Based on a particular plasmid (the F plasmid) found in E. coli. A typical
BAC can carry about 250 kbp (range 100-350 kbp).
• cosmid
• 40-45 kbp
• YAC (yeast artificial chromosome)
• 1.5-3.0 Mbp
PCR – Polymerase chain reaction
• DNA cloning results in the purification of a single fragment
of DNA from a complex mixture of DNA molecules.
• Major disadvantage: it is time-consuming (several days to
produce recombinants) and, in parts, a difficult procedure.
• The next major technical breakthrough after gene
cloning was PCR (1983).
• It amplifies a short fragment of a DNA
molecule in a much shorter time, just a few hours.
• PCR is complementary to, not a replacement for, cloning
because it has its own limitations: we need to know the
sequence of at least part of the fragment.
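The need to know part of the sequence can be illustrated with a toy "in silico PCR": the amplified region is whatever lies between the two primer sites. This is only a sketch. The template, the primers, and the function names `reverse_complement` and `in_silico_pcr` are made up for illustration, and exact string matching stands in for primer annealing.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq))


def in_silico_pcr(template: str, forward: str, reverse: str):
    """Return the region bounded by the two primers, or None if a primer has no site.

    `forward` is identical to part of the given strand. `reverse` anneals to
    the given strand, so its reverse complement is what we search for.
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical template and primers, for illustration only.
template = "TTGACCATGGCTAGCTAGGATCCGATCGATTACGGCATTT"
forward_primer = "ATGGCTAG"
reverse_primer = "ATGCCGTA"

product = in_silico_pcr(template, forward_primer, reverse_primer)
print(product, len(product))
```

Without primers that match the template, the function returns None: nothing is amplified, which mirrors the limitation stated above.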
Mapping genomes
What is it about?
• Assigning each gene to a particular region of a
chromosome and determining the locations of, and
relative distances between, genes on the
chromosome.
• There are two types of maps:
• genetic linkage map – shows the arrangement of genes (or other
markers) along the chromosomes as calculated by the frequency
with which they are inherited together
• physical map – representation of the chromosomes, providing the
physical distance between landmarks on the chromosome, ideally
measured in nucleotide bases
• The ultimate physical map is the complete sequence itself.
Genetic mapping
• use of genetic techniques to construct maps showing the
positions of genes
• genetic techniques:
• cross-breeding experiments
• humans: the examination of family histories (pedigrees)
Genetic linkage map
• Constructed by observing how frequently
two markers (e.g. genes) are inherited together.
• Two markers located on the same chromosome can be
separated only through the process of recombination.
• If they are separated, a child will inherit just one marker
from the pair.
• However, the closer the markers are, the more tightly
linked they are, and the less likely recombination will
separate them. They will tend to be passed together from
parent to child.
• Recombination frequency provides an estimate of the
distance between two markers.
Genomes 2, T. A. Brown
• Look at the results of meiosis in a hundred
identical cells.
• If crossovers never occur, then the resulting
gametes will have the following genotypes:
200 AB, 200 ab
• This is called complete linkage.
• But if (as is more likely) crossovers occur
between A and B in some of the nuclei, then the
allele pairs will not be inherited as single units.
Let us say that crossovers occur during 40 of the
100 meioses. The following gametes will result:
• 160 AB, 160 ab, 40 Ab, 40 aB
• This is called partial linkage.
• The recombination frequency depends on the
distance between two genes.
Genomes 2, T. A. Brown
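The gamete counts in this example can be reproduced with a few lines of arithmetic. This is a sketch under two assumptions: each meiosis yields four gametes, and a single crossover between A and B gives one each of AB, ab, Ab and aB. The function names are illustrative.

```python
def gamete_counts(meioses: int, crossover_meioses: int) -> dict:
    """Count gamete genotypes produced by AB/ab cells.

    A meiosis without a crossover gives 2 AB + 2 ab.
    A meiosis with one crossover gives 1 AB + 1 ab + 1 Ab + 1 aB.
    """
    non_crossover = meioses - crossover_meioses
    parental_each = 2 * non_crossover + crossover_meioses
    return {
        "AB": parental_each,
        "ab": parental_each,
        "Ab": crossover_meioses,
        "aB": crossover_meioses,
    }


def recombination_frequency(counts: dict) -> float:
    """Fraction of gametes that carry a recombinant genotype (Ab or aB)."""
    recombinant = counts["Ab"] + counts["aB"]
    return recombinant / sum(counts.values())


complete = gamete_counts(100, 0)    # complete linkage
partial = gamete_counts(100, 40)    # partial linkage

print(complete)
print(partial, recombination_frequency(partial))
```

With crossovers in 40 of 100 meioses, 80 of the 400 gametes are recombinant, a recombination frequency of 20%.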
Genetic markers
• A genetic map must show the positions of distinctive
features – markers.
• Any inherited characteristic that differs among individuals
and is easily detectable in the laboratory is a potential
genetic marker.
• Markers can be
• genes
• DNA segments that have no known coding function but whose
inheritance pattern can be followed.
Genetic linkage map
• On the genetic maps distances between markers are
measured in terms of centimorgans (cM).
• Two markers 1 cM apart are separated by recombination 1% of
the time.
• 1 cM is ROUGHLY equal to a physical distance of 1 Mbp in humans.
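As a worked example, the conversion from observed recombinants to map distance might look like this. It is a sketch only: the offspring counts are made up, the function names are illustrative, and the direct proportion is reasonable only for closely linked markers, because multiple crossovers between distant markers go undetected. The default of 1 Mbp per cM is the rough human figure quoted above.

```python
def map_distance_cm(recombinants: int, total_offspring: int) -> float:
    """Map distance in centimorgans: percentage of offspring that are recombinant."""
    return 100.0 * recombinants / total_offspring


def rough_physical_distance_mbp(distance_cm: float, mbp_per_cm: float = 1.0) -> float:
    """Very rough physical distance, assuming a fixed Mbp-per-cM ratio."""
    return distance_cm * mbp_per_cm


# Hypothetical family data: 5 recombinant offspring out of 200 scored.
distance = map_distance_cm(5, 200)
print(distance, "cM, roughly", rough_physical_distance_mbp(distance), "Mbp")
```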
Genes as markers
• useful but not ideal
• For larger genomes (e.g. vertebrates), gene maps are not
very detailed (genes are widely spaced out with large
gaps between them).
• Variations within genes lead to observable changes (e.g.
eye color). However, only a fraction of the total number of
genes exist in allelic forms that can be distinguished
conveniently.
• Gene maps are therefore not very comprehensive. We
need other types of marker.
• Mapped features that are not genes are called DNA
markers.
DNA markers
• Must be polymorphic, i.e. alternative forms (alleles) must
exist among individuals so that they are detectable among
different members in family studies.
• Most variations occur within introns, have little or no effect
on an organism, yet they are detectable at the DNA level
and can be used as markers.
1. restriction fragment length polymorphisms (RFLPs)
2. simple sequence length polymorphisms (SSLPs)
3. single nucleotide polymorphisms (SNPs, pronounced “snips”)
RFLPs (restriction fragment length)
• Recall that restriction enzymes cut DNA molecules at specific
recognition sequences.
• This sequence specificity means that treatment of a DNA
molecule with a restriction enzyme should always produce the
same set of fragments.
• This is not always the case with genomic DNA molecules
because some restriction sites exist as two alleles, one allele
displaying the correct sequence for the restriction site and
therefore being cut, and the second allele having a sequence
alteration so the restriction site is no longer recognized.
source: Brown T. A. , Genomes. 2nd ed. http://www.ncbi.nlm.nih.gov/books/NBK21129/
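The effect of a polymorphic restriction site on fragment lengths can be sketched as follows. The two allele sequences are made up for illustration and `digest` is an illustrative name. EcoRI (recognition site GAATTC, cut after the first G) is used as the example enzyme, and only one strand is modelled.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list:
    """Cut `sequence` at every occurrence of `site` and return fragment lengths.

    `cut_offset` is the cut position within the recognition site
    (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    previous_cut = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(cut - previous_cut)
        previous_cut = cut
        position = sequence.find(site, position + 1)
    fragments.append(len(sequence) - previous_cut)
    return fragments


# Hypothetical alleles: the second restriction site is altered in allele 2.
left, middle, right = "ACGTACGTAC", "TTGCATGCAA", "CCGGTTAACC"
allele_1 = left + "GAATTC" + middle + "GAATTC" + right
allele_2 = left + "GAATTC" + middle + "GAGTTC" + right  # A->G, site lost

print(digest(allele_1, "GAATTC", 1))
print(digest(allele_2, "GAATTC", 1))
```

Allele 1 gives three fragments and allele 2 only two, one of them longer; this length difference is what is scored as the RFLP.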
SSLPs (simple sequence length)
• Repeat sequences that display length variations; different alleles
contain different numbers of repeat units (i.e. SSLPs are multiallelic).
• variable number of tandem repeat sequences (VNTRs,
minisatellites)
• repeat unit up to 25 bp in length
• simple tandem repeats (STRs, microsatellites)
• repeats are shorter, usually di- or tetranucleotide
source: Brown T. A. , Genomes. 2nd ed. http://www.ncbi.nlm.nih.gov/books/NBK21129/
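A short sketch of how SSLP alleles differ: the same flanking sequence with a different number of repeat units, and therefore a different length. The alleles, flanks and the name `longest_repeat_run` are assumptions made for illustration, with a CA dinucleotide as the repeat unit.

```python
def longest_repeat_run(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    for start in range(len(sequence)):
        count = 0
        # Count consecutive copies of the unit beginning at `start`.
        while sequence.startswith(unit, start + count * len(unit)):
            count += 1
        best = max(best, count)
    return best


# Hypothetical microsatellite alleles with 5, 8 and 11 CA repeats.
alleles = {
    "allele_a": "GGTC" + "CA" * 5 + "TTAG",
    "allele_b": "GGTC" + "CA" * 8 + "TTAG",
    "allele_c": "GGTC" + "CA" * 11 + "TTAG",
}

for name, seq in alleles.items():
    print(name, len(seq), "bp,", longest_repeat_run(seq, "CA"), "repeats")
```

Three alleles at one locus already go beyond what a two-allele marker can offer, which is the sense in which SSLPs are multiallelic.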
SNPs (single nucleotide)
• Positions in a genome where some individuals have one
nucleotide and others have a different nucleotide.
• Vast number of SNPs in every genome.
• Each SNP could potentially have four alleles, but most exist in
just two forms.
• The value of a two-allele marker (SNP, RFLP) is limited by
the high probability that the marker shows no variability
among the members of a family.
• The advantages of SNPs over RFLPs:
• they are abundant (human genome: 1.5 million SNPs, 100,000
RFLPs)
• easier to type (i.e. easier to detect)
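Finding SNP positions can be sketched as a column-by-column comparison. The example assumes the sequences are already aligned and of equal length; the three sequences and the name `find_snps` are made up for illustration, and positions are 0-based.

```python
def find_snps(sequences: dict) -> dict:
    """Map each variable position (0-based) to the set of nucleotides seen there.

    Assumes all sequences are aligned and have the same length.
    """
    lengths = {len(seq) for seq in sequences.values()}
    if len(lengths) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    length = lengths.pop()

    snps = {}
    for position in range(length):
        alleles = {seq[position] for seq in sequences.values()}
        if len(alleles) > 1:
            snps[position] = alleles
    return snps


# Hypothetical aligned sequences from three individuals.
individuals = {
    "individual_1": "ATGCCGTAAGCT",
    "individual_2": "ATGCTGTAAGCT",
    "individual_3": "ATGCCGTAGGCT",
}

snps = find_snps(individuals)
print(snps)
```

Both variable positions in this toy data carry exactly two alleles, the usual situation described above.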
Marker analysis
• Value of genetic map – marker analysis
• Inherited disease can be located on the map by following
the inheritance of a DNA marker present in affected
individuals (but absent in unaffected individuals), even
though the molecular basis of the disease may not yet be
understood nor the responsible gene identified.
• This represents a cornerstone of testing for genetic
diseases.
Physical maps
• A map generated by genetic techniques is rarely sufficient
for directing the sequencing phase of a genome project.
1. The resolution of a genetic map depends on the number of
crossovers that have been scored.
2. Genetic maps have limited accuracy.
• Before large-scale sequencing begins (see next lecture),
a genetic map must be checked and supplemented by
alternative mapping procedures.
Physical mapping
• A plethora of physical mapping techniques has been
developed to address this problem, the most important
being
1. Restriction mapping
2. Fluorescent in situ hybridization (FISH)
3. Sequence tagged site (STS) mapping
see more at Genomes2, Brown, http://www.ncbi.nlm.nih.gov/books/NBK21116/
Genetic and physical map - compared
• Saccharomyces cerevisiae
chromosome III
• physical map obtained by DNA
sequencing
• the order of the upper two
markers (glk1 and cha1) is
incorrect on the genetic map
• there are also differences in
the relative positioning of other
pairs of markers
Oliver SG et al., Nature, 357, 38–46
more at http://www.informatics.jax.org/silver/chapters/7-1.shtml
Genome maps
• genetic map: relative locations of markers are
established by following inheritance patterns
• cytogenetic map: visual appearance of a chromosome
when stained and examined under a microscope
• physical map: the order and spacing of the
markers, measured in base pairs
• sequence map
source: Talking glossary of genetic terms, http://www.genome.gov/glossary/
NCBI human Genome Resources
• http://www.ncbi.nlm.nih.gov/genome/guide/human/